Testing and Evaluation
F27ID Introduction to Interactive Design
2021-2022
## Overview

* Goals and benefits of **Testing and Evaluating** Designs
* Different types of **user-based** testing
* **Qualitative and Quantitative** Data
* Details on experimental research in HCI
* Questions and Discussion
## Revision Question

Usability and User Experience (UX) are the same:

* a) True
* b) False
## Answer

Answer: b) False
## Revision Question

User Experience (UX) focuses on:

* a) Satisfaction, Enjoyment, Pleasure, Fun, Value
* b) Effectiveness, Efficiency, Learnability, Error Prevention, Memorability
## Answer

Answer: a) Satisfaction, Enjoyment, Pleasure, Fun, Value
## **Why** is Testing and Evaluation **Important**?
> Take a moment to think about what testing and evaluation is!
### Why is Testing and Evaluation **Important**?

* Testing a prototype / developed design is **very important**
* Testing and evaluation simply **confirms** that the product **works** as it is **supposed to**, or shows that it needs refinement
* Testing assesses the viability of a **design or idea**
* Discovers **defects/bugs/problems** before delivery to the client
* Helps assure the **quality** of the design/product
## The Goals of **Testing**

#### Discover errors and areas of improvement in:

* **Performance** -- How much time, and how many steps, are required for people to complete basic tasks? (For example, find something to buy, create a new account, and order the item.)
* **Accuracy** -- How many mistakes did people make? (And were they fatal, or recoverable with the right information?)
* **Recall** -- How much does the person remember afterwards, or after periods of non-use?
* **Emotional response** -- How does the person feel about the tasks completed? Is the person confident, stressed? Would the user recommend this system to a friend?
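As a concrete illustration, here is a minimal Python sketch of how performance and accuracy measures could be logged during a test session. The `TaskLog` class and its fields are hypothetical, for illustration only, not part of any particular testing tool.

```python
import time

class TaskLog:
    """Hypothetical helper for logging performance and accuracy
    measures while a participant works through one task."""

    def __init__(self, task_name):
        self.task_name = task_name
        self.steps = 0
        self.errors = []          # (description, fatal) pairs
        self.start = time.monotonic()

    def step(self):
        self.steps += 1

    def error(self, description, fatal=False):
        self.errors.append((description, fatal))

    def finish(self):
        elapsed = time.monotonic() - self.start
        return {
            "task": self.task_name,
            "seconds": round(elapsed, 1),
            "steps": self.steps,
            "errors": len(self.errors),
            "fatal": any(fatal for _, fatal in self.errors),
        }

# Example: one participant ordering an item
log = TaskLog("order item")
log.step(); log.step()
log.error("clicked wrong menu", fatal=False)
print(log.finish())
```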
## Goals of **Evaluation**

* Assess **extent** of system functionality
* Assess **effect** of interface on user
* Identify specific **problems**
* Designers need to **check** whether their ideas are really what users **need/want**, and whether the final product works as expected.
* To do that, we need some form of methods, or more specifically, **empirical methods** for HCI
#### Remember

## What is Usability?
## Usability

* Usability refers to the **extent** to which a product **can be used** by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use (ISO 9241-11)
* Usability **measures** the quality of a **user's experience** when interacting with a product or system
  - Ease of learning
  - Efficiency of use
  - Memorability
  - Error frequency and severity
  - Subjective satisfaction
## User-based **evaluation**

* Considered to yield the most **reliable** and valid estimate of an **application's usability**
* In a typical user-based evaluation, test subjects are asked to perform a set of tasks with the technology.
* Depending on the primary focus of the evaluator, the users' success at completing the tasks and their speed of performance may be recorded.
* A large sample of users would be good, but 3 (**Lewis**) or 5 (**Nielsen**) are often enough to uncover the majority of problems.
* **Nielsen**: "once it is found that two or three people are totally confused by the home page, little is gained by watching more people suffer through the same flawed design"
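Nielsen's 3-5 user guideline comes from the Nielsen and Landauer (1993) model: the proportion of problems found by n users is 1 - (1 - L)^n, where L is the probability that a single user exposes a given problem (about 0.31 on average across their projects). A quick sketch of the curve:

```python
# Proportion of usability problems found by n test users,
# following Nielsen & Landauer's model: found(n) = 1 - (1 - L)**n.
# L ~= 0.31 was the average across the projects they studied.
L = 0.31
for n in (1, 3, 5, 10, 15):
    found = 1 - (1 - L) ** n
    print(f"{n:2d} users -> {found:.0%} of problems found")
```

With L = 0.31, three users already expose roughly two thirds of the problems and five users around 85%, which is why additional participants mostly "suffer through the same flawed design".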
## Certain **Categories** of evaluation

#### **User-based** evaluations: where a sample of the intended users try to use the application

* **Lab studies** (experimental research, usability testing), field studies
* **Observation Methods** (think aloud, video analysis)
* **Questionnaires**, Interviews, Focus Groups
* Some are Quantitative, others are Qualitative
### Method **Selection**

The evaluation **approach influences** the methods used, and in turn, how data is collected, analyzed and presented

* E.g., **Field studies typically**:
  - Involve observation and interviews.
  - Do not involve controlled tests in a laboratory.
  - Produce qualitative data.
* E.g., **Usability testing typically**:
  - Involves users performing set tasks, plus interviews and questionnaires.
  - Is conducted in the laboratory or in a natural setting.
  - Has the primary goal of testing how usable the interface is with the intended populations.
## Revision Question

For user-based evaluations, what is a good enough number of test users to uncover the majority of problems?

* a) 1-2
* b) 3-5
* c) 50-99
* d) 100+
## Answer

Answer: b) 3-5

---

* A large sample of users would be good, but 3 (**Lewis**) or 5 (**Nielsen**) are often **enough** to uncover the majority of problems.
* **Nielsen**: "once it is found that two or three people are totally confused by the home page, little is gained by watching more people suffer through the same flawed design"
### Should you use **Quantitative** or **Qualitative** approaches?
### **Quantitative** or **Qualitative**?

* Data is either Qualitative or Quantitative
  * e.g. interview-based data is Qualitative
  * Questionnaires can be Quantitative (ratings-based) or Qualitative (short answer questions)
  * Other forms of Quantitative data include time taken to do a task, and performance
* Selection **depends** on:
  * Overall research or evaluation goal
  * Ethical and practical issues
  * Resources, cost, logistics
  * Availability of participants and sample size
  * Skills, background and philosophy of the team
* Quantitative approaches tend to be more rigid (structured) than qualitative ones
* Hence an evaluation proposal is imperative
## Reliability, Validity and Scope

* Can the study be replicated: **reliability**
* Are you measuring what you expect: **validity**
* Are there any unexpected effects: **bias**
* Can the findings be generalised: **scope**
## Questionnaire Design

* Interview vs Questionnaire?
* Should the session be videotaped?
## New, adapt or reuse?

* **Established questionnaires** will give more reliable and repeatable results than ad-hoc questionnaires.
* Some questionnaires for assessing the perceived usability of an interactive system:
  * Questionnaire for **User Interface Satisfaction** (QUIS) (1988)
  * **Computer System Usability** Questionnaire (CSUQ) (1995)
  * **System Usability Scale** (SUS) (1996)
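SUS is a good example of why reuse pays off: its scoring procedure is fixed and well documented (Brooke, 1996). A minimal Python implementation of the standard scoring:

```python
def sus_score(responses):
    """Standard SUS scoring (Brooke, 1996).

    responses: list of 10 ratings, each 1-5, in questionnaire order.
    Odd-numbered items are positively worded (contribution = rating - 1),
    even-numbered items are negatively worded (contribution = 5 - rating).
    Returns a score on the 0-100 scale.
    """
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Example participant who is fairly positive about the system
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 2]))  # -> 82.5
```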
## Observations / Video Coding

* E.g. Child-Robot Interaction Evaluation Setup
* Video Analysis
  * E.g., measuring engagement through calculating the duration of
    * child eye-gaze facing the robot,
    * facial and verbal expressions,
    * gestures
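A minimal sketch of how coded video could be turned into a duration-based engagement measure. The `(start, end)` interval format is an assumption for illustration, not the output of any specific video-coding tool:

```python
# Coded intervals (in seconds) during which the child's eye-gaze
# faced the robot, as marked by a human coder watching the video.
gaze_at_robot = [(2.0, 7.5), (12.0, 19.0), (25.5, 31.0)]
session_length = 60.0  # total length of the coded session, seconds

# Engagement measure: total and proportional gaze duration
total_gaze = sum(end - start for start, end in gaze_at_robot)
print(f"Gaze at robot: {total_gaze:.1f}s "
      f"({total_gaze / session_length:.0%} of session)")
```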
## Interviews

* Analyst questions the user on a one-to-one basis, usually based on **prepared questions**
* Informal, subjective and relatively cheap

---

* Advantages
  * can be varied to suit context
  * issues can be explored more fully
  * can elicit user views and identify unanticipated problems
* Disadvantages
  * very subjective
  * time consuming
### Interview **Example**

* **Goal:** To understand behaviours and roles for social robots in Education
* **Users:** School Teachers
* **Method:** Interview
* E.g. interview questions include:
  * How do you think a robot can contribute towards efficient language learning?
  * How do you want a robot to show different gestures during a one-to-one interaction?
  * How do you want a robot to display a personality suited to a child?
  * How do you want a robot to react to children's emotions?
  * What kind of role should a robot play to improve learning?
  * How do you want a robot to store a child's memory?
## Field Trials

Useful for **interacting with users** or **getting ideas for later versions**

---

* **Advantages**
  - natural environments (the evaluator can view the system as part of the total environment)
  - context retained (though observation may alter it)
  - longitudinal studies possible
* **Disadvantages**
  - distractions/noise
* **Appropriate**
  - where **context is crucial**, and for longitudinal studies
### Field Trial **Example**

* To check the applicability of **Interactive Robots** as Social Partners and Peer Tutors for **Children** in a school
* **Steps**
  * Deploy a robot in one of the rooms inside the school, say a science laboratory
  * The robot can gesture and speak English using a vocabulary of 300 sentences, and recognize 50 words
  * The idea is to analyse the applicability of robots by observing overall child-robot interaction
## Focus Groups

* **Discussion-based group interview**
  - Originated during the Second World War, where it was used to test the effectiveness of propaganda
  - Often used today for market research
* Features
  - Comprises people with a particular set of characteristics
  - Moderated (often by the researcher)
  - Tends to be relatively informal
  - Centres **around open questions designed to generate dynamic discussion**
  - (participants may also be required to complete a questionnaire)
## Focus Group **Example**
* Learning about **children's views** on social robots in education
* Children discuss themes in groups of three or four on the use of social robots in education
* Themes include:
  * Robots as peers
  * Robots as tutors
### Another Focus Group Example
## Analyzing **Qualitative Data** from the **Focus Group**

* Content Analysis
  * In content analysis, **qualitative remarks** from the participants are **coded** into predefined **categories** via inferences made by independent human coders
* Set up a coding scheme
  * Operationalize the variables
  * Clearly define all categories
  * Identify the unit for coding
* Normally: 2 independent coders, and a check is made for inter-coder agreement (i.e. a check for bias), as in the sketch below
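A common inter-coder agreement statistic is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A small self-contained Python sketch; the category labels are made up to match the focus-group themes above:

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same units.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the agreement expected by chance from
    each coder's marginal category frequencies.
    """
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Example: two coders assigning focus-group remarks to categories
a = ["peer", "tutor", "peer", "peer", "tutor", "other"]
b = ["peer", "tutor", "tutor", "peer", "tutor", "other"]
print(round(cohens_kappa(a, b), 2))  # -> 0.74
```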
## Think Aloud

* User is observed performing a task
* User is asked to describe what they are doing and why, what they think is happening, etc.

---

* **Advantages**
  - simplicity
  - requires little expertise
  - can provide useful insights
  - can show how the system is actually used
* **Disadvantages**
  - subjective
  - selective
  - the act of describing may alter task performance
### Experimental Research

* A test of the effect of a **single variable** by changing it while keeping all other variables the same
* A controlled experiment generally **compares the results** obtained from an experimental sample against a control sample
* General terms
  * Participant (subject)
  * Independent variables (test conditions)
  * Dependent variables (what you measure)
## IVs and DVs

* **Independent Variables (IV)**: what you as the researcher vary or manipulate
  * Type of interface/app
  * Type of feedback
  * Type of menu
* **Dependent Variables (DV)**: what you measure
  * Time
  * Performance (accuracy, errors)
  * Subjective ratings
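To make the IV/DV distinction concrete, here is a small sketch comparing a DV (task completion time) across two levels of an IV (interface type) with an independent-samples t-test, as would suit a between-groups design. The numbers are invented for illustration:

```python
# IV: interface type (A vs B); DV: task completion time in seconds.
# Each list holds the times of one between-groups condition.
from scipy import stats

times_interface_a = [41.2, 38.5, 45.0, 40.1, 43.3]
times_interface_b = [35.0, 33.8, 37.2, 36.5, 32.9]

# Independent-samples t-test: is the difference in mean times
# between the two interfaces larger than chance would explain?
t, p = stats.ttest_ind(times_interface_a, times_interface_b)
print(f"t = {t:.2f}, p = {p:.3f}")
```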
## **Example**: between subjects and within subjects

* A **sports drink company** wants to test two types of drinks (D1 and D2) and their effect on racers
* If all participants try both drinks: **within**
* If half of the participants try one drink only: **between**
* We can also have a control group (D0: water or no drink)
* We can also have a mixed design (the length of the race could be a second condition and be between subjects, while type of drink is within)
## Experimental design

* **Within**-group design
  - each subject performs the experiment under each condition
  - transfer of learning is possible
  - less costly and less likely to suffer from user variation
* **Between**-groups design
  - each subject performs under only one condition
  - no transfer of learning
  - more users required
  - variation can bias results
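A minimal sketch of how participants might be assigned under each design, using the drinks example above. In the within-subjects case the condition order is counterbalanced to spread out transfer-of-learning effects; the participant IDs are made up for illustration:

```python
import itertools
import random

participants = [f"P{i}" for i in range(1, 7)]
drinks = ["D1", "D2"]

# Within-subjects: everyone tries both drinks; alternate the order
# across participants (counterbalancing) so learning/fatigue effects
# do not systematically favour one drink.
orders = itertools.cycle(itertools.permutations(drinks))
within = {p: list(next(orders)) for p in participants}

# Between-groups: randomly assign each participant to one drink only.
shuffled = random.sample(participants, len(participants))
between = {p: drinks[i % 2] for i, p in enumerate(shuffled)}

print("within: ", within)
print("between:", between)
```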
## **Confound** Variables

* A variable that provides an alternative explanation for the results that we see
* Can cloud the effect of the experimental conditions
* So in the drinks example, a confound variable could be:
  * temperature at the time of the race
  * previous experience
  * age of racers
#### Evaluation Method (what method is used; summarize the **setup**)

* **Procedure:** the sequence of steps from welcoming the participant to the participant leaving the experiment room; in other words, you provide enough detail on the process of data collection to allow another person to repeat your research
* **Materials:** here you describe the equipment/instruments used for data collection (computers, microphones, screens, tangible objects, etc.)
* **Setup:** where was the participant seated, how far from the screen, etc.
* **Measurements:** details on the data collection instruments (what is being measured and how)
* **Participants:** here you need to identify the units you studied (what types of users). When the research units are humans, they are most often referred to as "participants."
## Summary

* Understand the core concepts for **Testing** and **Evaluation**
* **Different** evaluation and test methods
* Examples
## Recommended Reading
Interaction Design: Beyond Human-Computer Interaction
Chapters 14-16
## To do this week ...

* Read over the lectures
* **Review** the revision questions
* Work through labs/tutorial practicals
* Experiment (get into good habits)