UCD 5/5 Evaluation

We review the current state of the interface under development and compare it with previous and alternative designs. Evaluation means determining and validating the usability requirements, and it can be done early, qualitatively and informally.

Who, What, Where, When and How

Who

Test persons need to be non-experts because of the gap between user and designer. Developers are usually not representative: they have a technical perspective and are mostly not familiar with the users' tasks and work. A lack of user research gets filled with the developers' intuition and therefore distorts the evaluation. Observations, interviews and usability tests are conducted and evaluated by experts.

What

Subjective data are about the users' assumptions, wishes and expectations. They are gathered through interviews, questionnaires and freely formulated answers.

Objective data are collected during the tests and represent, for instance, the number and frequency of errors, the tasks fulfilled in a given period of time, and mouse movements.

Test person data cover demography, physical condition and background: age, gender, education, right- or left-handedness and prior experience with the test object.

Experimental condition data such as time, weekday, temperature and noise are collected only if they are relevant for the context of use.
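
The four data categories above translate naturally into a record structure. The following is a minimal sketch in Python; all class and field names are my own hypothetical choices, not part of any standard.

    # Minimal sketch of the four data categories described above.
    # All names are hypothetical illustrations, not a standard schema.
    from dataclasses import dataclass, field

    @dataclass
    class SubjectiveData:          # interviews, questionnaires, free answers
        expectations: list = field(default_factory=list)
        wishes: list = field(default_factory=list)
        free_answers: list = field(default_factory=list)

    @dataclass
    class ObjectiveData:           # measured during the test itself
        error_count: int = 0
        tasks_completed: int = 0
        tasks_given: int = 0
        total_time_s: float = 0.0  # time spent on the tasks, in seconds
        mouse_path_px: float = 0.0 # total mouse travel, if tracked

    @dataclass
    class TestPersonData:          # demography, condition, background
        age: int = 0
        gender: str = ""
        education: str = ""
        handedness: str = "right"  # "right" or "left"
        experience: str = "none"   # prior experience with the test object

    @dataclass
    class ConditionData:           # record only if relevant for the context
        weekday: str = ""
        temperature_c: float = 0.0
        noise: str = ""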

Where

A system's point of use is the best place to comprehend contextual information such as communication and behaviour, work equipment and environmental conditions, tasks and processes. Field studies are time-consuming and combine qualitative and quantitative methods. One hour in the field means one to four hours of evaluation.

Interviews and usability tests can take place in laboratories such as test rooms or simulators, or with mobile equipment wherever the test persons are located.

When

Formative evaluation takes place before and during development. It is the basis for a new design or an overhaul and delivers hints about flaws as well as suggestions for improvement. During development it compares concepts and designs and detects the reasons why they aren't working.

Summative evaluation is needed for the final assessment, certification and verification of the requirements and follows after development. It also compares the final system with others.

How

Empirical methods are carried out with users and potential users. They can consist of interviews, questionnaires and usability tests. Usability, as a complex interplay between design, user knowledge, task details and random influences, is easier to grasp with empirical than with analytical methods.

Analytical methods don't include the "real" user and are conducted on the basis of guidelines and industry standards, expert evaluations like the Cognitive Walkthrough, and formal analytical procedures like GOMS.
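
To make the analytical side a bit more tangible, here is a small sketch of the Keystroke-Level Model (KLM), the simplest member of the GOMS family. The operator durations are the commonly cited averages from Card, Moran and Newell; the task encoding at the end is a hypothetical example of mine.

    # Keystroke-Level Model (KLM): a task is encoded as a sequence of
    # elementary operators, each with an empirically derived average
    # duration in seconds (values as commonly cited in the literature).
    KLM_TIMES = {
        "K": 0.28,  # keystroke or mouse button press (average typist)
        "P": 1.10,  # point with the mouse to a target on screen
        "H": 0.40,  # home hands between keyboard and mouse
        "M": 1.35,  # mental preparation before an action
    }

    def klm_estimate(operators):
        """Sum the operator times for a task encoded as e.g. 'MPK'."""
        return sum(KLM_TIMES[op] for op in operators)

    # Hypothetical encoding of "add item to cart": think (M), point to
    # the item (P), click (K), point to the cart button (P), click (K).
    print(klm_estimate("MPKPK"))  # about 4.11 seconds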

Usability Test

Is a task-oriented test to observe and evaluate the behaviour of a user interacting with a system. The goal is to find usability problems and to understand the user.

Example questions that need to be answered:

  • Why is the user doing what he is doing?
  • What goes wrong?
  • What confuses and annoys?
  • What could go easier?

Planning

The test team is put together and can include a test leader, protocol writer, moderator and designer. After that, the test persons are selected; they should represent the typical user, and members of the development team or external experts can't be part of the group. The test exercises have to simulate real use cases and must suit the work tasks, the test persons and the test objects. Decisions regarding the method and the usability laboratory are also made at this stage.

Conducting

Explaining the purpose of the test to the user is an important part:

  • We want to find problems!
  • We test the system and not you!
  • There are no wrong questions!
  • Don’t concentrate on the look of the prototype.
  • Not every interaction is shown.

The next step is to clarify the task if necessary and to ask the users to explain what they are doing and why. The test team is busy observing, asking questions and taking notes.

Evaluating

It depends on the test, but the evaluation can take a long time, especially with video and eye-tracking material.

Main parameters to evaluate:

  • effectiveness
  • efficiency
  • satisfaction

The colloquial term user-friendliness should be avoided because it is not measurable in that sense.
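
To show how these three parameters become measurable, here is a minimal sketch of common operationalisations: effectiveness as task completion rate, efficiency as time per completed task, and satisfaction as an averaged questionnaire score (the System Usability Scale is a popular choice). The example numbers are made up.

    # Common operationalisations of the three parameters; the exact
    # definitions vary from study to study, so treat this as a sketch.

    def effectiveness(completed, given):
        """Share of tasks completed successfully (0.0 to 1.0)."""
        return completed / given

    def efficiency(total_time_s, completed):
        """Seconds spent per successfully completed task."""
        return total_time_s / completed if completed else float("inf")

    def satisfaction(scores):
        """Average questionnaire score, e.g. SUS on a 0-100 scale."""
        return sum(scores) / len(scores)

    # Hypothetical session: 4 of 5 tasks done in 600 s, SUS from 3 users.
    print(effectiveness(4, 5))               # 0.8
    print(efficiency(600.0, 4))              # 150.0 seconds per task
    print(satisfaction([72.5, 80.0, 65.0]))  # 72.5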

Methods

Dialog

The moderator formulates the tasks as simple and realistic scenarios. Phrases like "show me" remind the user to fulfil a task. Words that "give away" the task need to be avoided.

  • Bad: What would you do if you wanted to add that item to the cart?
  • Good: Show me what you would do if you decided to add that item to the cart.

Example responses if an interaction does not go any further:

  • We didn't integrate that into the prototype but what do you expect would happen?
  • If you use that button and something unexpected happens, what would you do next?

Thinking Aloud

Is a more flexible method that asks the test users to verbalise their thoughts. The users formulate what they are doing, and the moderator speaks only in exceptional situations. With retrospective thinking aloud it is also possible to do this after a test: while watching the recordings, users describe what they did and why.

These methods work well for learning what users think, but we have to be aware that the statements are filtered. Users tend to formulate their thoughts only after they have solved a problem, or they phrase them to appear smart. Another aspect is that this is an unusual situation for the users, and they can feel uncomfortable.

Laboratory

A usability laboratory can be a stationary test room or a simulator. Mobile "laboratories" are usable as well; guerrilla usability testing is an example. See the software solution Silverbackapp for more information.

Expert Evaluation

Is an inspection procedure used to find usability problems and to give specific suggestions for improvement. It is mainly a formative evaluation, conducted during the development process. I will only give a brief introduction to two variations and compare them, otherwise this article would be twice as long.

Cognitive Walkthrough CW

The goal is to examine the learnability of a system, to identify cognitive hurdles and to show the development team if, where and why the design will affect the interaction. The expert takes the position of a hypothetical user for that. The basic assumption: users take the path of least cognitive effort.

Heuristic Evaluation HE

Is an assessment against design principles rather than of task fulfilment. Heuristics are used to determine whether the system meets the optimal interaction characteristics. The expert looks for usability problems on the basis of knowledge, experience and heuristic guidelines.
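
As an illustration, HE findings are often recorded as a simple list of problems, each mapped to a heuristic and given a severity rating. The sketch below uses a subset of Nielsen's well-known heuristics; the record structure is my own hypothetical choice, and the 0-4 severity scale is a common convention, not a fixed standard.

    # Minimal sketch for recording heuristic evaluation findings.
    # The heuristics are a subset of Nielsen's well-known set; the
    # record structure is a hypothetical illustration.
    from dataclasses import dataclass

    HEURISTICS = [
        "Visibility of system status",
        "Match between system and the real world",
        "User control and freedom",
        "Consistency and standards",
        "Error prevention",
    ]

    @dataclass
    class Finding:
        heuristic: str  # which principle is violated
        location: str   # where in the interface
        severity: int   # 0 = no problem ... 4 = usability catastrophe
        note: str

    findings = [
        Finding(HEURISTICS[0], "checkout page", 3,
                "No feedback after pressing the pay button."),
    ]

    # Report the most severe problems first.
    for f in sorted(findings, key=lambda f: -f.severity):
        print(f"[{f.severity}] {f.heuristic} @ {f.location}: {f.note}")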

CW vs. HE

The procedures complement each other.

  • HE: task-independent, general analysis; CW: evaluation of central key tasks
  • HE: more problems identifiable overall; CW: some specific problems easier to identify
  • HE: wider range, all aspects of the user interface are considered; CW: only a few tasks can be considered

CW vs. Usability Test

Both are task-oriented:

  • CW is an analytical method, usability tests are empirical
  • CW is faster, cheaper and usable even without a prototype, but finds fewer problems
