This is to certify that the dissertation entitled "Usability Feedback in Education Software Prototypes: A Contrast of Users and Experts," presented by Pericles Varella Gomes, has been accepted towards fulfillment of the requirements for the Ph.D. degree.

Major professor: Dr. Patrick Dickson
Date: April 12, 1996

USABILITY FEEDBACK IN EDUCATION SOFTWARE PROTOTYPES: A CONTRAST OF USERS AND EXPERTS

By

Pericles Varella Gomes

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology, and Special Education

1996

ABSTRACT

USABILITY FEEDBACK IN EDUCATION SOFTWARE PROTOTYPES: A CONTRAST OF USERS AND EXPERTS

By Pericles Varella Gomes

This study compares usability feedback from users and hypermedia designers when evaluating computer-based instruction prototypes. It provides information for defining cost-effective evaluation strategies and methods, and for specifying valid instruments and tools. Usability instruments such as QUIS 5.5b (University of Maryland) were combined with think-aloud evaluation techniques to collect feedback from 16 target users (engineering students from the U.S., China, Korea, India and Pakistan) and 5 educational hypermedia designers. The designers also evaluated the data collected from users, which included quantitative reports, qualitative reports and multimedia files.

On the quantitative side, descriptive statistics, non-parametric techniques and cluster techniques were applied to the answers. User groups and designers were compared, as were more general trends when all subjects were combined. Gender comparisons were also studied. Critical-incident multimedia files were produced for each subject, with screen and audio grabs of the problems encountered; navigational maps were generated for each subject; written comments about the prototype were collected; and a descriptive list of errors was generated, comparing the types of errors encountered.

Designers reported that qualitative instruments in general were more useful. Designers were more critical about both the interface aspects and the pedagogical dimensions, and found significantly more errors. American users were more efficient in finding errors. In terms of rating the software, Indian users were more forgiving, and the American group was the most critical. Females were systematically more positive about the prototype. Designers were more efficient than users when executing the usability evaluation, but could not completely replace users (some errors were found only by users). Designers were better at the double task of trying to learn and critique a new interface while learning about the content at the same time. The variability of feedback within users and within designers was found to be high.
Methodological considerations for further work include the relative usefulness of combining quantitative and qualitative methods, the issue of when to use designers as opposed to target users, and the importance of gathering information from different ethnic user groups when developing software for an international audience. One major conclusion regarding the experts' ratings of the instruments was that the best instruments were the ones that produced contextualized data, in both the quantitative and the qualitative aspects (such as the multimedia files, the list of problems, and the demographic data).

This research was supported by a grant from the Conselho Nacional de Pesquisa (CNPq), Brazil.

Acknowledgments

This research was highly dependent on the volunteer help and involvement of many people. Thanks go to the students and the hypermedia designers at Michigan State University, who were willing to spend time on my research. I also thank Mr. Alaciel Franklin de Almeida and his colleagues at TELEBRAS, in Brasilia, Brazil, for providing their software and their support and for giving me the opportunity to collaborate with them in this study. I thank Dr. Patrick Dickson and Dr. Carrie Heeter, members of my committee, for allowing and helping me to choose such a rewarding topic of research and for their help and support throughout this project. My parents in Brazil and my family in Michigan all deserve thanks for their support, encouragement, and prayers. I especially thank my wife Luciana for her support and patience throughout this research. I would also like to thank Dr. Leighton Price and Dr. Cindy Nichols for their help and suggestions on the quantitative side of this study. Cindy, I am looking forward to playing Mozart with you again!

Table of Contents

List of Tables
List of Figures
Chapter 1  Introduction
  1.1 Prototypes
  1.2 Instruments
  1.3 Research Questions
Chapter 2  Related Research
  2.1 Prototyping
  2.2 Usability Questionnaires
  2.3 Critical Incident
  2.4 Classification of Usability Factors
  2.5 User Interface Evaluation by Experts
  2.6 Number of Subjects
  2.7 Gender and Ethnic Differences
Chapter 3  Evaluation Setting
  3.1 The Prototype Tested
  3.2 Instructional Objectives
  3.3 The Interface
  3.4 Description of the Prototype: Sequential Structure
  3.5 The Computer Environment
  3.6 Description of the Physical Space
Chapter 4  Description of the Evaluation
  4.1 Evaluation Overview
  4.2 Subjects
    4.2.1 Users
    4.2.2 Experts
  4.3 Procedures
    4.3.1 Orientation
    4.3.2 The Evaluation Session
    4.3.3 Data Recording
    4.3.4 Multimedia Files
    4.3.5 Questionnaires
    4.3.6 Meta Analysis
  4.4 Evaluation Chronology
Chapter 5  Analysis
  5.1 Choices for Analysis
    5.1.1 Different Approaches on the Same Problem
    5.1.2 Additional Considerations
  5.2 Statistical Choices
    5.2.1 Use of Non-parametric Methods
    5.2.2 Use of Cluster Analysis
  5.3 Results
    5.3.1 QUIS: Comparison of Users & Experts
    5.3.2 QUIS: Comparison Between Genders
    5.3.3 Reeves & Harmon: Comparison of Users & Experts
    5.3.4 Reeves & Harmon: Comparisons Between Genders
    5.3.5 List of Problems
    5.3.6 Results of Cluster Analysis
    5.3.7 Results of the Meta Evaluation by Experts
  5.4 Qualitative Analysis - Multimedia Files
  5.5 Chapter Summary
Chapter 6  Discussion
  6.1 Cultural Identity of Participants and Observers
  6.2 Differences Among Ethnic Groups and Experts
  6.3 Multimedia Files
  6.4 Analysis of Content, Pedagogy & Interface
  6.5 Use of Questionnaires in Interface Evaluations
  6.6 Problems Verbalized Versus Errors Observed
  6.7 Qualitative and Quantitative Instruments
  6.8 Statistical Tools in the Evaluation Methodology
  6.9 Number and Nature of Problems Encountered
Chapter 7  Conclusions
  7.1 Differences in Usability: Users and Experts
  7.2 Differences in Usability: Ethnic Groups
  7.3 Differences in Usability: Gender
  7.4 Value of Qualitative Tools
  7.5 Evaluation of Methodology
    7.5.1 Videotaping
    7.5.2 The Interaction of Observer and Subjects
    7.5.3 Number of Subjects
    7.5.4 Questionnaires
    7.5.5 The List of Problems
    7.5.6 Navigational Maps
    7.5.7 Use of Statistical Tools
  7.7 Future Research
    7.7.1 Enhance the Methodology
    7.7.2 Qualitative Emphasis
    7.7.3 Quantitative Emphasis
    7.7.4 The Inclusion of Personality
    7.7.5 Comparison of Different Kinds of Observers
    7.7.6 Use of Navigational Maps
    7.7.7 Use of Questionnaires
Bibliography
Appendix A: Consent Form
Appendix B: Preliminary Questionnaires
Appendix C: Description of Evaluation
Appendix D: Pre-Requisites
Appendix E: Questionnaire Reeves and Harmon
Appendix F: Questionnaire QUIS
Appendix G: Questionnaire for Meta-Evaluation
Appendix H: QUIS Comments
Appendix I: Navigational Maps
Appendix J: List of Problems
Appendix K: Report of Usage
Appendix L: Variables Included in Minitab

List of Tables

Table 4.1   Ethnic Groups and their components
Table 4.2   Description of Experts: Qualifications and Jobs
Table 5.1   QUIS Comparisons between Ethnic Groups & Experts
Table 5.8   QUIS answers - Gender Comparison
Table 5.15  Reeves & Harmon: Comparisons of Users and Experts
Table 5.19  Reeves & Harmon: Gender Comparison
Table 5.21  List of Problems (in order of coding by researcher)
Table 5.23  Mean number of problems found by users and experts - Categorization of types of problems
Table 5.26  Cluster Analysis by Participants and Groups

List of Figures

Fig. 3.1   Instructional Objectives of the Prototype
Fig. 3.2   Typical Screen of the First Part of the Prototype
Fig. 3.3   Typical Screen of the Second Part of the Prototype
Fig. 3.4   Example of an Exercise
Fig. 3.5   Example of a Summary Screen
Fig. 3.6   Look and Feel of the Interface
Fig. 3.7   Overview of the Prototype
Fig. 3.8   Layout of the Observation Room
Fig. 4.1   Instruments and Procedures Used in the Study
Fig. 5.2   QUIS: Comparison of Users & Experts
Fig. 5.3   QUIS: Overall items - Ethnic Group Users & Experts
Fig. 5.4   QUIS: Screen items - Ethnic Group Users & Experts
Fig. 5.5   QUIS: Terminology items - Ethnic Group Users & Experts
Fig. 5.6   QUIS: Learning items - Ethnic Group Users & Experts
Fig. 5.7   QUIS: System items - Ethnic Group Users & Experts
Fig. 5.9   QUIS: Gender Comparison
Fig. 5.10  QUIS: Gender - Overall Aspects
Fig. 5.11  QUIS: Gender - Screen Aspects
Fig. 5.12  QUIS: Gender - Terminology Aspects
Fig. 5.13  QUIS: Gender - Learning Aspects
Fig. 5.14  QUIS: Gender - System Aspects
Fig. 5.16  Reeves & Harmon: Comparison of Users & Experts
Fig. 5.17  Reeves & Harmon: Users & Experts - Learning Dimensions
Fig. 5.18  Reeves & Harmon: Users & Experts - Interface Dimensions
Fig. 5.20  Reeves & Harmon: Gender Comparison
Fig. 5.22  Mean number of problems found by subject groups
Fig. 5.24  Mean number of problems found by users and experts - Categorization of types of problems
Fig. 5.25  Clustering of Subjects by Hierarchical Tree
Fig. 5.27  Instrument Ratings by Experts
Fig. 6.1   Ratings of Instruments by Experts - The context factor

Chapter 1
Introduction

What is wrong with interfaces? One problem with design is that it tends to be done by people who have off-the-top-of-their-heads ideas and beliefs about imaginary beasts they call "the users."
Donald Norman

Evaluation may occur at many points in the development of a software application. Within the instructional development context, different kinds of evaluation are available depending on the aspect being examined, such as the effectiveness of the program, its impact, its use of resources, and ways of improving it. Each of these aspects entails a different facet of the evaluation process. The question of how to improve a program involves a facet of evaluation called formative evaluation. The overall purpose of formative evaluation is to provide information to guide decisions about enhancing an interactive multimedia program at various stages of its development. This dissertation focuses on usability evaluation during the early stages of interface design of instructional software.

The importance of practical human-computer interface testing in educational multimedia is clear, yet many of the available models appear to be inappropriate or not useful to designers, particularly when dealing with early prototyping. Part of the problem is due to the fact that requirements and subsequent specifications evolve throughout the development period [Briggs and Briggs, 1990]. Another reason why interface usability testing is not widely adopted is a lack of understanding of its importance or meaning on the part of project managers. One of the critical factors in improving the acceptance of usability testing is to provide concrete data that would convince managers that usability evaluation is worth executing [Nielsen, 1993]. Usability testing can be costly and time-consuming if sophisticated experimental methods are used, such as those prescribed in the "usability engineering" approach [Whiteside, Bennett and Holtzblatt, 1988]. Such sophisticated methods require the skills of a human factors specialist and access to usability laboratories.
They provide a large quantity of high-quality data, but many designers see them as intimidating in their complexity [Bellotti, 1988]. They may also distance the designer from the user rather than bring the user and designer closer together. Nielsen has argued that it is possible to get a reasonable level of feedback without using such costly methods. His approach of "discount usability engineering" aims to strike a balance between the quality of feedback obtained and the cost of obtaining it [Nielsen, 1989].

Developing interactive multimedia is a creative, demanding, multifaceted task. One of the most important components of any interactive multimedia project is the interface, but there are no established rules available, perhaps because it is an art that is not easily learned or described [Laurel, 1990]. One of the few rules that appears to be accepted is the one that tells the designer to know the target user [Shneiderman, 1987]. Yet interfaces are typically created by professionals who have far more contact with computers than the intended target users [Nielsen, 1993]. This dissertation seeks to provide educational multimedia designers with the information needed to improve interface evaluations in the early stages of design.

1.1 Prototypes

A prototype, by definition, is a working model of the conceptual design. Multimedia designers use prototyping techniques to try out ideas about interface design, among other things. Working within constraints of time and budget, prototyping involves producing early working versions of a future application system and experimenting with them. Early prototyping provides a communication basis for discussions among all the groups involved in the development process, especially between users and designers [Benimoff and Whitten, 1989; Diaper, 1990]. It also provides an approach to software development based on experiment and experience. The adoption of prototyping has grown out of the realization that 1) requirements frequently do not become apparent until a system is in use; 2) specifications cannot be completed until the construction process begins; and 3) developers need to understand the cognitive processes of target users in the early stages of design.

1.2 Instruments

There are a variety of methods that can be used to collect data to determine the usability of a system. The most accepted ones are verbal reports from users (think-aloud), objective measures of users' performance (either by observation or logging), users' responses to questionnaires, expert reviews, and critical incident techniques [Miller and Jeffries, 1992].

Verbal reports from subjects, or think-aloud techniques, consist of asking the participant to reflect audibly on what he or she is doing or wants to do while using the software. Occasionally, the evaluator may intervene to ask for clarification of user comments or to provide help if the program is an early prototype. Most often, a videotape of the monologue is accompanied by unstructured observation of user activity [Roske-Hofstrand, 1989]. The purpose of the think-aloud technique is to obtain real-time information from users about their processing of the program while they are using it.

Objective measures of users' performance can be collected by observation or by automatic computer logging. This technique has the main purpose of collecting measurements in order to compare them with similar measures from users of a different system, or to evaluate them against usability goals. This kind of test usually involves a specific task to be executed by the user.
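As an illustration of how such automatic logging might be implemented, the following minimal Python sketch records time on task and errors for one session. It is not taken from the study described in this dissertation; the class and method names (UsabilityLog, start_task, record_error, end_task, summary) are hypothetical, and a real logger would normally also capture keystrokes or navigation events.

    import time

    class UsabilityLog:
        """Minimal sketch of automatic logging of objective usability measures
        (time on task and error counts) for a single evaluation session."""

        def __init__(self):
            self.tasks = {}  # task name -> {"start": ..., "end": ..., "errors": [...]}

        def start_task(self, task):
            self.tasks[task] = {"start": time.time(), "end": None, "errors": []}

        def record_error(self, task, description):
            # Each error is timestamped so its context can be reviewed later.
            self.tasks[task]["errors"].append((time.time(), description))

        def end_task(self, task):
            self.tasks[task]["end"] = time.time()

        def summary(self):
            # Produces the kind of measures discussed above: time to complete
            # a task and the number of errors made while performing it.
            return {
                task: {
                    "time_on_task_sec": data["end"] - data["start"],
                    "error_count": len(data["errors"]),
                }
                for task, data in self.tasks.items()
                if data["end"] is not None
            }

    # Example for one subject and one benchmark task:
    log = UsabilityLog()
    log.start_task("build equivalent diagram")
    log.record_error("build equivalent diagram", "dragged box to wrong slot")
    log.end_task("build equivalent diagram")
    print(log.summary())

Summaries of this kind can then be compared across systems or checked against predefined usability goals, as described above.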
Questionnaires are most often constructed so that users either choose from a list of multiple-choice answers or mark a number indicating the strength of their agreement or disagreement with a statement. Questionnaires can be used repeatedly in different usability tests, thus allowing for cross-product comparisons. Respondents usually answer questions after completing the program, but occasionally questionnaires are presented during the use of the application [Flagg, 1990].

Expert reviews, often called "face validity" reviews, consist of showing the prototype to a group of interface design specialists and asking them to evaluate the interface for usability. The designers should then conduct a detailed analysis of the interface. Often one component of this kind of evaluation involves proceeding step by step through task scenarios. Different types of experts can provide different perspectives on the critical aspects of the program [Reeves, 1993].

Critical incident techniques consist of collecting information on interface problems when they occur and use open-ended questions to obtain information on missing or non-functional features of the software interface. This technique gives the user the chance to react to the software by explaining the problem at the time of its occurrence. It also allows designers to collect information on satisfactory features of the interface. Critical incidents are occasions when the system is particularly poor or surprisingly good, and knowing the detailed circumstances of such incidents can often help to avoid worst-case incidents in the final product [Nielsen, 1993].

1.3 Research Questions

This dissertation attempts to create a model for usability testing of educational software prototypes by evaluating an existing prototype of computer-based instruction and applying existing usability testing tools combined with additional complementary methods of data collection. The study includes both target users and educational hypermedia design specialists. Four main research questions are defined below:

#1) What are the differences in usability feedback between users and hypermedia designers?

This is the main question to be answered in this study. Is it enough to have designers' feedback in order to measure and improve the usability of interfaces? Is user feedback, instead, sufficient to verify the usability of interfaces? What are the differences and similarities between users and designers when evaluating an interface?

#2) What instruments do hypermedia designers value the most when evaluating both quantitative and qualitative data from the users' feedback?

If usability evaluation is performed to help hypermedia designers improve the interface of a prototype, which feedback categories are most valuable to facilitate this process? This question should help designers and managers when faced with planning and executing educational software evaluations. Do designers place greater attention on quantitative data (like questionnaires and demographic information) or on qualitatively oriented data (like multimedia files and users' written comments)?

#3) What are the differences in usability feedback across different ethnic user groups?

This question relates to the issue of cultural and ethnic differences among target users. The prototype in question was developed for an international audience of engineering students and professionals.
Do different ethnic groups present significant differences in terms of usability evaluation feedback? Three main ethnic groups are compared in this study: Chinese/Korean, Indian/Pakistani and North American engineering students.

#4) What are the differences in usability feedback between males and females?

This question addresses gender differences among target users. Are there significant differences in responses, errors detected and attitudes when gender is taken into consideration? This question could generate exploratory answers regarding this important aspect of software evaluation.

Chapter 2
Related Research

Over the past 10 years, research in usability evaluation of computer interfaces has been carried out in three main areas: human factors, computer science and cognitive psychology. This research has produced an understanding of how to evaluate interfaces in several different contexts. This chapter surveys previous work on usability evaluation that focuses mainly on early prototypes.

2.1 Prototyping

This section surveys past research that focused on the use of prototyping in software design. One primary reason for prototyping the user interface is to use the prototype to collect feedback from prospective users (Benimoff and Whitten, 1989; Diaper, 1990). A user interface prototype can be demonstrated to users to elicit their feedback about the functionality of the system and about the interface design. User interface prototypes can be created so that end users can actually use the prototype as they would the final system. Data on the usability of the design (time to complete a task, number and type of errors made, and so on) can be collected before the actual system has been built. Prototypes that are incomplete or that don't match the final specifications can still be used for the collection of user feedback and expert evaluation.

Another reason for prototyping the user interface is that it gives the designer an opportunity to try out various alternative designs. Competing designs can be prototyped and then either tested with prospective users or evaluated by experts (Benimoff and Whitten, 1989). User interface prototyping also helps to ensure consistency in user interface design. When the user interface for a new computer system works in a way that is already familiar to users from their experiences with other computer systems, users find it much easier to learn to use the new system (Polson, 1988). Through careful evaluation of prototypes, designers are able to catch inconsistencies before they become a part of the system code.

Prototyping reduces cycle time and project costs. Tavolato and Vincena (1984) report findings that 76% of development effort is directed toward late-stage activities such as correcting errors that exist in code and adapting the software to meet new requirements. In addition, they found that about half of the errors that are discovered late in development can be traced to failures in the requirements phase. Users' requirements are better understood and are communicated to the software developers more efficiently through rapid prototyping. This, in turn, reduces the number of errors in the code and the number of new requirements introduced later in the product cycle. Thus, the relatively small cost of investing in prototyping at the beginning of the development process can result in large savings at the end of the development process.
Prototyping encourages iteration, expansion of ideas, and the risk analysis that is characteristic of newer software development models (Boehm, 1988). In a study conducted by Boehm (1984), it was determined that the use of a prototyping approach resulted in 45% less development time than an approach that relied on specifying the design only through requirements and specification documents. These data point to significant cycle-time improvements when prototyping is used.

2.2 Usability Questionnaires

This section surveys research that focused on the use of questionnaires for usability issues and research done with QUIS, the Questionnaire for User Interface Satisfaction (developed by Chin, Diehl, and Norman, 1987, 1988). Specific questionnaires for evaluating computer systems and interface designs have been developed. LaLomia and Sidowski (1990) reviewed some of these questionnaires under two general classes: user satisfaction with computer systems, and computer literacy and aptitude. They report five questionnaires which have been developed to address user satisfaction. Each questionnaire addresses slightly different aspects of usability and different kinds of equipment. Because of these different aspects, it is difficult to compare the questionnaires directly. The most frequently used questionnaire for usability testing is QUIS, the Questionnaire for User Interface Satisfaction [Chin, Diehl, and Norman, 1988]. QUIS has 27 items using a 9-point Likert scale. According to LaLomia, the items test overall reactions to the software (6 items), evaluation of characters on the screen (4 items), use of terms and information throughout the system (6 items), learning to operate the system (6 items), and system capabilities such as speed (5 items). The reliability of the test was found to be 0.94. Validity was tested by how well the items discriminated between PC systems that were liked and disliked. In all cases, the means were higher for the liked systems than for the disliked systems, thereby providing evidence for the validity of the questionnaire.

2.3 Critical Incident

This section surveys research that focused on the use of critical incidents for usability purposes. First used by Fitts and Jones (1947) to analyze pilot error, and more recently by Cooper (1982) to investigate errors made by anesthetists, this technique consists of the study of "critical incidents" to identify common features or elements in order to classify those incidents. In both studies (Fitts and Jones, 1947; Cooper, 1982), critical incidents are defined as human errors or equipment failures that did have, or could have had, unsatisfactory results. Cooper (1982) first categorized incidents by their outcome: favorable, unfavorable, neutral, or other. The incidents in each of the outcome categories were then classified further by their causal relationships. Dzida (1978) argued that the critical incident technique is a feasible method for obtaining user evaluations of human-computer interfaces and for translating those evaluations into design requirements.

Galdo, Williges, Williges and Wixon (1987) conducted a study of a critical incident evaluation tool for software documentation. In this evaluation, subjects were asked to perform a benchmark task consisting of 19 subtasks and to use the associated software documentation. Both hard-copy and on-line documentation were available.
After subjects completed each subtask, they were asked to use an on-line questionnaire to report critical incidents encountered in using the hard-copy and on-line documentation. The critical incidents were sorted into four categories: on-line documentation failure incidents, on-line documentation success incidents, hard-copy documentation failure incidents and hard-copy documentation success incidents. The incidents in each failure category were reviewed to identify common documentation features or elements that caused problems. The same process was repeated for incidents categorized as successful to determine satisfactory features of the documentation. The problems were arranged in descending order from most critical to least critical by the frequency of critical incidents associated with each problem. Ties in frequency were broken by an average severity index. Average severity was calculated by averaging the incident severity ratings supplied by users at the time each incident was reported. A list of documentation problems and satisfactory features was presented to the software design team to guide the redesign process. This evaluation helped validate the critical incident technique as a method for providing software designers with end-user data for the revision of software and documentation.

2.4 Classification of Usability Factors

This section surveys research that focused on the classification of usability factors. Usability factors have been divided into five main attributes [Nielsen, 1993]:

- Learnability: The program should be easy to learn (the user can rapidly start getting some work done).
- Efficiency: The program should be efficient to use (once learning is completed, a high level of productivity is possible).
- Memorability: The program should be easy to remember (the user is able to return to the system without having to be trained again).
- Errors: The program should have a low error rate (users make few errors when using the system, and errors are easy to undo).
- Satisfaction: The program should be pleasant to use (users are satisfied when using it).

Reeves and Harmon (1994) describe two complementary multi-dimensional approaches to evaluating interactive multimedia programs for education and training. The first approach is based upon a set of fourteen pedagogical dimensions such as "experiential value" and "learner control". The second approach is based upon a set of ten user interface dimensions such as "easy to use" and "screen design". They have applied the pedagogical and user-interface dimensions to the evaluation of two interactive multimedia programs: the Jasper Woodbury Problem Solving Series, developed by the Cognition and Technology Group at Vanderbilt University, and the Columbus: Encounter, Discovery and Beyond "Ultimedia" program, developed by the IBM Corporation. Their recommendation, in light of their admittedly preliminary investigations into the value of these dimensions, is to subject the dimensions to rigorous expert review by leaders in the design and application of interactive multimedia in both education and training. They also suggest that, since there is evidence of the qualitative validity of the dimensions, quantitative scales should be integrated into each dimension, e.g., a ten-point rating system. They had hesitated to add this quantitative aspect to the dimensions for fear that reviewers might become too distracted by the numerical values to concentrate on qualitative ratings of the dimensions themselves.
They also recommended that the validated dimensions be applied within a wide variety of education and training contexts to provide evidence of their utility. A final recommendation was that research should be initiated into the relationships between ratings on the pedagogical and user interface dimensions of applications and actual data regarding the instructional effectiveness and impact of those programs.

2.5 Evaluation by Experts

This section surveys research that focused on issues related to usability evaluation by expert review. Usability evaluation methods involving experts have been a focus of research during the past five years. The objective of this kind of usability evaluation is to contribute to the design of usable software for end users. These evaluations provide a way of quickly inspecting and finding problems in software prototypes without having to include target users, at least in the early stages of development. There are differences in how expert inspections are conducted, depending on the characteristics of the experts and on the objective of the evaluation itself.

Pollier (1992) studied the activities of human factors specialists charged with evaluating a human-computer interface. Subjects were four experienced ergonomists specializing in information systems. Subjects were asked to think aloud and to consult with the experimenter while evaluating the human-computer interface of a multimedia communication system. The resulting verbalizations, videotapes of subjects' activities, and subjects' written and graphic productions were analyzed to determine the number and type of ergonomic issues taken into consideration and the strategies used in performing the evaluation. Individual differences in these variables were also analyzed.

Reeves and Harmon (1993) reported on the application of their user interface and pedagogical dimensions for evaluation purposes by experienced developers. They conducted preliminary analyses with faculty and graduate students in an instructional technology graduate program. They suggest further research on these dimensions involving experienced personnel in other education and training contexts.

Usability inspection methods, based on informed intuitions about interface design quality, hold the promise of providing faster, more cost-effective ways to generate usability evaluations, compared to empirical user evaluation methods. Examples of inspection methods include heuristic evaluation [Nielsen and Molich, 1990], usability walkthroughs [Bias, 1991; Karat and Bennett, 1991a, 1991b], cognitive walkthroughs [Lewis, Polson, Wharton and Rieman, 1990], and applications of guidelines in walkthroughs [Jeffries, Miller, Wharton, and Uyeda, 1991]. These methods have been used in development for some time in one form or another. Desurvire and Bradford described the use of multiple methods in development projects to assess real-world applicability, to compare the effectiveness of methods, and to explore how different methods might complement each other [Desurvire, Kondziela and Atwood, 1992]. Desurvire's study illustrated that inspections can be used across a wide variety of software interfaces.

In relation to methods for making more and better-quality predictions and streamlining usability evaluations, Monk noted that sensitivity to potential problems is probably driven largely by experts' own problems, the observation of others having similar problems, and experts' skill at reflecting on and generalizing these personal experiences.
Nielsen noted the similarity between performing a usability inspection and encountering problems as a participant in an empirical user test. Research might be directed at finding ways to help experts acquire this experience [Wright and Monk, 1991] and generalize it for the purpose of making predictions and judgments. Another area of research to be explored is the improvement of the way data from expert evaluations are analyzed, and how results may be used more effectively in the larger development cycle [Mack and Nielsen, 1993]. Design and evaluation should be tightly linked, and this relationship needs to be understood and supported. It would be worthwhile to explore the possibility of developing on-line tools for accumulating and organizing information based on inspection data, and for applying it to new design problems. Research has confirmed that expert evaluation is an efficient usability inspection method [Jeffries, 1991]. However, expert evaluation methods were developed to be used in circumstances where user testing is impractical, and if they are used to the exclusion of user testing, this could mean the loss of one of the most valuable tools for interface evaluation [Jeffries and Desurvire, 1992].

2.6 Number of Subjects

This section surveys research that focused on the number of subjects needed for usability evaluation of interfaces. Results from several studies indicated that any single expert evaluator would miss most of the usability problems of an interface. Several studies [Molich and Nielsen, 1990; Nielsen and Molich, 1990; Nielsen, 1992; Nielsen, 1994] indicated that single evaluators found on average only 35% of usability problems. The results also indicated that, since different evaluators tend to find different problems, it is possible to achieve better performance by aggregating the evaluations from several experts. The exact number of evaluators to include should depend on a context-specific cost-benefit analysis.

Nielsen [1990] reported an experiment that was designed to measure the percentage of usability problems computer scientists would find using the think-aloud technique. In this study, 20 groups of minimally trained experimenters independently conducted usability tests of a paint program. Their task was to find as many of the usability problems Nielsen had defined a priori as "major usability problems" as they could. Each evaluator ran an average of 2.8 subjects per evaluation. The results showed that computer scientists were able to apply the think-aloud method effectively to evaluate user interfaces with a minimum of training and that even methodologically primitive experiments could succeed in finding many usability problems. This experiment was replicated [Nielsen, 1992] with similar results.

Virzi [1992] performed a series of three experiments to extend the exploratory work done by Nielsen. In these experiments he examined the rate at which usability problems were identified as a function of the number of users run in a single usability evaluation when the evaluation was conducted by experts. In all three studies, approximately 80% of the usability problems identified would have been found after only five subjects. He concluded that important usability problems are more likely to be found with fewer subjects than are less important problems, and that a practitioner who chooses to run a small number of users will identify most of the major usability problems and some proportion of the less important problems. Experts were able to reach consensus regarding the relative severity of problems without the benefit of frequency data. Virzi concludes that usability experts can assess the severity of a problem without explicit knowledge of how frequent the error is likely to be.
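The diminishing returns reported in these studies can be illustrated with a simple model. The sketch below assumes, purely for illustration, that each evaluator independently finds a fixed proportion of the existing problems (the 35% average cited above); under that assumption the expected proportion found by a panel of n evaluators is 1 - (1 - p)^n. This is a back-of-the-envelope illustration of the cost-benefit trade-off, not a procedure taken from the studies themselves.

    def expected_proportion_found(p_single: float, n_evaluators: int) -> float:
        """Expected fraction of usability problems found by n evaluators, assuming
        each one independently detects any given problem with probability p_single.
        The independence assumption is a simplification made for illustration."""
        return 1.0 - (1.0 - p_single) ** n_evaluators

    # Using the 35% single-evaluator average reported above:
    for n in range(1, 8):
        print(f"{n} evaluator(s): {expected_proportion_found(0.35, n):.0%}")
    # 1 -> 35%, 2 -> 58%, 3 -> 73%, 4 -> 82%, 5 -> 88%, 6 -> 92%, 7 -> 95%

Under this simplified model the gain from each additional evaluator shrinks quickly, which is consistent with Virzi's observation that roughly 80% of the problems surfaced within about five subjects; the number actually used should still follow the context-specific cost-benefit analysis mentioned above.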
Jeffries and Desurvire (1992) point out that different methods have strengths and weaknesses and that the best evaluation of a user interface comes from applying multiple evaluation techniques in combination. The various techniques have differing constraints on their applicability and on the resources required to apply them effectively. User testing and expert evaluation require access to expert evaluators and users; in the case of expert evaluation, a group of experts is required. They suggest that when access to multiple experts is available, doing both expert evaluations and usability testing with users is the best strategy.

2.7 Gender Differences

Many researchers in education and psychology have found that gender accounts for differences in both attitudes toward computers and performance [Premkumar, Ramamurthy and King, 1993; Anderson, 1987]. While Igbaria found that gender had a significant effect on attitudes toward computers [Igbaria, 1990], Parasuraman and Igbaria found no dissimilarity in attitudes between males and females [Parasuraman and Igbaria, 1989]. Cronan et al. found that gender was a major factor influencing the performance of students taking an introductory computer information systems course [Cronan, Embry and White, 1989]. According to Dambrot et al. [1985], among first-year undergraduates, male students were more likely to have attended and completed computer-related courses and to have knowledge of a computer language. Similarly, in their study of first-year university science students, Clarke and Chambers [1989] found that men were significantly more likely to report previous computer experience over a range of applications. Discussion at the Gender and Science and Technology Conference [1990] revealed that declining female participation in computer studies appeared to be an international trend. Past research also suggests that, relative to traditional teaching, use of Computer Assisted Instruction (CAI) can give rise to gender inequities in student achievement [Siann et al., 1990; Sutton, 1991].

Cross-cultural studies also show that there are gender differences in attitudes toward computers. In a study of Canadian and Chinese high school students' attitudes toward computers, Collis and Williams [1987] found that in both cultures boys in general were significantly more positive than girls in their attitudes toward computers and showed higher self-confidence about working with computers. However, Chinese students displayed fewer gender or age differences, the one exception being the opinions of students concerning the competence of women with regard to science and technology. Females from both countries endorsed the idea that women have as much ability as men with respect to science and technology, whereas males were significantly more skeptical. While there seems to be agreement among most researchers on the presence of significant gender differences in using and learning about computers, there is less agreement on the causes of this gender differentiation [Shashaani, 1992]. In conclusion, little research has been done that focuses specifically on interface usability and gender. Currently, there is no evidence that one gender is more effective than the other when conducting evaluations of educational software.
Chapter 3
Evaluation Setting

This chapter describes the prototype tested, its instructional objectives, its interface, the prototype sequencing, and the physical space and computer environment in which the prototype was evaluated. The educational prototype tested dealt with teletraffic concepts.

3.1 The Prototype Tested

The prototype tested in this study is a computer-based training (CBT) unit in its early stages of development. The prototype was developed in 1993 by TELEBRAS, the Brazilian telecommunications company, at their central training facility in Brasilia. It was developed with Asymetrix Multimedia ToolBook by a team of instructional designers, content experts and programmers. At that time, TELEBRAS was conducting negotiations with the International Telecommunication Union (ITU), the International Teletraffic Congress (ITC) and the European Economic Community to develop a series of telecommunication training modules to be used as professional training materials by telephone companies of the participating countries of the ITU, which include telecommunications professionals from Asia, the Americas, Africa and Europe. The courseware to be developed, using this prototype as the interface model, would be based on a combination of pedagogical strategies: tutorials, simulations, problem-solving, hypertext information retrieval structures at appropriate points, the learner-control approach, and student performance records, all with a multimedia interface (audio and graphics). The course is intended to be self-contained, in the sense that all information necessary for instruction is included in the software program.

3.2 Instructional Objectives

The prototype consists of a lesson about teletraffic routing. The instructional objectives of this lesson, which cover routing and equivalent graphics, are displayed in Figure 3.1.

[Figure 3.1: Instructional Objectives of the Prototype. The screen lists the lesson objectives (identify the concepts of direct and alternate routing and of traffic overflowing; use equivalent graphics) and states that two exercises must be completed correctly.]

The lesson is subdivided into two main pedagogical sections. The first section covers the definition and calculation of alternative routing in telephone traffic. In this section, the student learns
In this section, the student is presented with examples and drills, as well as definitions of equivalent diagrams. It is composed of 14 screens. Figure 3.3 represents a typical screen of this section. The prototype contains a practice section that includes all the exercises from both of the main sections. In this practice section, the student can verify if he or she already knows the material presented in the tutorials. This practice is provided to the student as 25 a way of quickly testing the material covered in the tutorial without having to run the entire application. Figure 3.4 presents an example of the kinds of exercises included in this prototype. ToolBook - LIC IST02.IBK file flelp 2 .2 Teletraflic Equivalent graphic of the final route" TA: ' A‘ . Figure 3. 3: Typical Screen of the Second Part of the Prototype The program also contains a summary section which presents the content of the lesson in a compact form (total of 5 screens). This summary is provided to the student as a way of quickly reviewing the material covered in the tutorial without having to run the entire application. Figure 3.5 exemplifies the screens contained in the summary. The instructional objectives of this prototype follow strict recommendations made by the International Telecommunication Union, which provided the content expertise to TELEBRAS. In terms 26 0 Elle Llelp , 2 . 2 Teletrafllc ' , W NumberotTrunhe “cm 8° ......... 1::jn - Variance (V) and the Mean-fl in the route of 14 trunks wi a'pois’son tram «9.00511; C:.m - . Output VarianceMI:I f 1...... mm, [:3 [:2 Figure 3.4: Example of an Exercise Ioollloolr — LICT8102.TBK Eile flclp 2 .2 Teletmtfic The traffic carried in the c... trunk {Lam be expressed as 1th 'dill'erencebetween the traffic carried in C trunks-minus the Mc . carried in (C -1) trunks. IL°= A [ E(c.1,A) . E(C,A) ]I When the traffic A- is offered, in first choice, to a groUp of c trunks the Mom —' to routing to another group of S trunks. fla— Second choice 8 trunks low loss probability final route First choice C trunks high loss probability a ...._) Offered traffic Summary 215 Ian- oqeeou Tutorial Sun-u; Proofin- nes—o « » Figure 3.5: Example of a Summary Screen 27 of the pedagogical format, the lesson prototype was developed using traditional instructional design methodology. 3.3 The Interface The prototype lesson was developed using Assymetrix Multimedia Toolbook, which runs on Windows 3.1 as its operational system. In terms of its graphic interface style, the prototype presents a look and feel that is shown in Figure 3.6. Elle flelp 2.. 2 Teletraffic The traffic routing from a especified central, as well the trunks associated to a—AT, may be represented by 8 equivalent diagram. 0 Mean oflerred traffic to A8 route Number of trunks available at A8 route 2 “.I SE Mean offerred traffic to AC route I%I Number of trunks avaliable at AC route a. ’ I 6 mm Mean offerred traffic to Tandem % Number of trunks available at AT route Tutorial 13127 Menu (hie-elite Tutorial Block- Btooko Sun-nu Practice Figure 3.6: look and Feel of the Prototype In general, the screens have a navigational menu bar on the bottom, with the following options: Tutorial, Practice, Summary, “Block +”, “Block -”, left arrow, right arrow. The screens also contain a 28 top menu bar with “File” and “Help” options. In the lower left corner of every screen there is a location indicator (“tutorial 4/ 7”, for example). In the upper left corner, the name of the lesson is present on all screens. 
The central portion of the screen is dedicated to content, and is different every screen. The left and right arrows are for moving backward and forward in the program, although at times they perform other functions (such as playing audio narration). The block buttons are intended for jumping to the next or previous sections in the lesson. The practice and summary buttons serve as pointers to these sections. At times, a button called "resume" appears on the navigational bar. This button is context-specific, but serves as a way to return to the original location of the hyperlink. . The prototype makes use of multiple windows, such as tables, calculators, and calculation programs. These windows are accessed through buttons located on the central part of the screen, according to the instructional flow. Most of the simulations are accessed by opening additional windows. The lesson makes extensive use of simple animations. Most of the animations represent basic traffic flow by means of color cycling or blinking graphics and letters. The use of direct manipulation activities is more intense in the second half of the lesson, when students can build equivalent diagrams by clicking and dragging boxes with letters and names. The program utilizes several different input alternatives of this kind of manipulation. There are multiple choice exercises in the prototype, in which the student clicks on boxes to answer the quiz questions. The calculation exercises require the students to scroll tables, make use of the Microsoft calculator, and use paper and pencil. 29 In terms of audio usage, the use of voice is restricted to one long narration at screen. 3 of the tutorial and as audio feedback for exercises, such as " incorrect", "try again" or "correct". The beep of the computer is configured as a piano chord, which is a sound bite that comes with Microsoft Windows 3.1. The use of color throughout the prototype varies. Blue, yellow and red are used depending on the context. Hyperlinks are indicated by a black transparent rectangle around the text to be clicked. 3.4 Description of the Prototype: Sequential Structure A description of the sequential structure of the prototype is presented here. The sequence of the program, if visited in a linear fashion, consists of 40 screens, beginning with the objective (2 screens), and followed by the first half of the tutorial, screens 1 through 9. At screen 9, the student is faced with a calculation exercise. If the student successfully answers this exercise, he or she can progress to the second half of the program, screens 13 through 22. If the students fails this exercise, the program shows the procedure and answer of the problem and then presents a new exercise to the student, which is essentially the same problem with different values. If the student fails again, the application sends him or her back to the beginning of the tutorial. In the second part of the program, the student is presented with the concept of an equivalent diagram. At screen 15 and 16, an equivalent diagram is to be constructed by the student. Here, if the student fails, the computer allows him or her to progress. At the end of this second half, the student is asked to answer a multiple choice Tutorial 1942? Tutorial 2/27 Tutorial 3/27 voice Tutorial 4/27 Tutorial 5/27 Tutorial 6/27 Tutorial 7/27 4%.— Tutorial 8/27 «____, Exercise Tutorial 9/27 review Tutorial 10/27 4? W review Tutorial T 11/27 —— I 4r 4 Tutorial , Menu 1 2/27 Introduction Tutorial Objective Objective 1 3/27 I /2 2/ 2 wt 11 . 
3.5 The Computer Environment

The computer hardware utilized in this study was a PC-compatible "Pro Star" laptop. The laptop was configured with an Intel 486 DX4 microprocessor running at 100 megahertz. It had 12 megabytes of random access memory (RAM) and 800 megabytes of disk space. Although the laptop had a trackball, the researcher preferred to use a 3-button Dexxa mouse as the means of input for the participants. The screen was a 10.5-inch passive-matrix liquid crystal display (LCD). The laptop had built-in 8-bit sound capability with an internal speaker, so no external speakers were necessary. In terms of software configuration, the computer ran Microsoft Windows 3.1. The prototype, which needed Asymetrix ToolBook to run, could be launched from the Windows Program Manager.

3.6 Description of the Physical Space

The research was conducted in an office of 10 by 15 feet. This office contained a wide window to the exterior and a door to the corridor. The office had its furniture layout prepared for the study. A small computer table with chairs was located near the entrance of the room; the laptop and an auxiliary monitor sat on it, and the subjects were seated at it. A second table (a desk) was available for the observer. The office lights were kept dim to avoid glare on the computer screens. A tripod with an 8mm camera was set up for video and audio recording of the monitor and the subject, via a remote microphone attached to the lapel of the subject. Figure 3.8 represents the layout of the room.

[Figure 3.8: Layout of the Observation Room, showing the positions of the camera, the auxiliary monitor, the laptop computer and mouse, the subject, the observer's desk, and the door.]

Chapter 4
Description of the Evaluation

This usability study was composed of two parts: 1) the evaluation of the prototype described in the previous chapter, which was tested with both users and experts, and 2) a meta-evaluation of the instruments and results of the first part, made by the experts. For this meta-analysis, the data collected from users were presented to the experts, who then rated the different instruments and tools. This chapter describes the evaluation design, the subjects who participated in the study (target users and educational multimedia experts), the instruments and procedures of data collection, and the summarization process.

4.1 Evaluation Overview

This evaluation made use of a combination of qualitative and quantitative data gathering techniques. Videotaping and "think-aloud" techniques were used during the sessions.
Two questionnaires were utilized to collect the quantitative data: QUIS (Questionnaire for User Interface Satisfaction) version 5.5, developed at the University of Maryland's Human-Computer Interface Laboratory, and a new questionnaire developed specifically for this evaluation, which was based on Reeves' and Harmon's guidelines [Reeves and Harmon, 1994] for evaluating interactive multimedia for education and training. Figure 4.1 displays the instruments and procedures utilized in the first part of the study.

Figure 4.1: Instruments and Procedures used in the study. (For both the 16 target users, engineering students: 5 Indians/Pakistanis, 5 Chinese/Koreans, 5 Americans, and 1 Venezuelan, and the 5 expert hypermedia designers, the figure lists the consent form, the demographics/background questionnaire, the description of the experiment, and the pre-requisite information (Appendices A through D); the videotaping, think-aloud, and critical-incident procedures; the QUIS questionnaire and its written comments (Appendix H); the Reeves & Harmon questionnaire (Appendix E); the data tabulated in Minitab; the multimedia files; the navigational maps (Appendix I); the report of usage (Appendix K); and the list of problems (Appendix J). Data collection and compilation for the users' evaluation was completed prior to the experts' evaluation; the experts then performed the meta-analysis using the instruments questionnaire.)

In the second part of the study, the meta-analysis, the instruments, procedures, and data collected during the users' evaluations were examined and rated by the experts; this took place immediately after the experts finished evaluating the prototype.

4.2 Subjects

The subjects included in this study fit into two groups: 16 engineering students, defined as potential users of the prototype being tested, and 5 educational multimedia designers, defined as experts. They are described in more detail below.

4.2.1 Users

Table 4.1 lists the potential user categories. Potential users were chosen to form three main groups with similar cultural backgrounds. The Latin American group was excluded for lack of enough available subjects. All users included in this study were current students recruited from the School of Engineering at Michigan State University. There was a mix of undergraduate and graduate students.

Ethnic Group                Components
1) Indians/Pakistanis       3 Indians, 2 Pakistanis
2) Chinese/Koreans          3 Chinese, 2 Koreans
3) Americans                5 Americans
4) Latin Americans          1 Venezuelan

Table 4.1: Ethnic Groups and their components

The students were recruited by electronic mail and by flyers posted around the Engineering Building. A compensation of ten dollars was offered in exchange for their participation in the study. The students recruited for the evaluation had the following characteristics: ages ranging from 19 to 30 years, with a mean of 21 years; 87.5% of the users were familiar with Windows; 75% were male; 59% were from the field of Electrical Engineering; 59% knew some kind of programming; 50% owned their own microcomputer.
Their self-perception of knowledge in telecommunications had a mean of 3.3 (Likert scale ranging from 1, minimum, to 9, maximum); their enthusiasm for using CBT had a mean of 7.5 (same scale); their previous use of CBT had a mean of 3.7. The international students (nine total) had a mean score of 592 on the TOEFL (Test of English as a Foreign Language), and their time in the USA had a mean of 2 years, with a range of 0.1 to 4 years.

Potential users were scheduled in advance to participate in the study. Their participation in the evaluation took approximately two hours. Subjects participated one at a time.

4.2.2 Experts

Table 4.2 lists the experts who participated in this study. They were recruited from the Michigan State University community of multimedia designers. They were invited to participate in the evaluation via letter, and no compensation was paid. All five experts were people with whom this researcher had worked before, and the researcher felt comfortable that they would be willing and capable of handling the task of performing the evaluation and the meta-analysis.

The experts who participated in the meta-analysis had the following characteristics: ages ranging from 24 to 46, with a mean of 35.6 years; 80% were male; all were Americans; previous participation in evaluations similar to this study had a mean of 4, with a range of 0 to 9 times.

1) Ph.D. in Educational Technology; Instructional Designer and Professor
2) BS and MS in Physics, MS and Ph.D. in Computer Science; Hypermedia Designer and Project Manager
3) BA in Telecommunications; MA in Educational Systems Development; Interface Designer and Programmer
4) BA in English; MA in Telecommunications; Interface Designer; Project Manager; Hypermedia Designer
5) BS in Astrophysics; MS in Aerospace Engineering; Ph.D. in Educational Technology; Hypermedia Designer and Programmer

Table 4.2: Description of Experts: Qualifications and Job Titles

Experts were scheduled in advance to participate in the study. Their participation in the evaluation took approximately three hours, including the meta-analysis. Experts participated one at a time.

4.3 Procedures

The procedures and instruments utilized in this study are described below. Detailed information is provided for each component of the study.

4.3.1 Orientation

Before the actual evaluation began, each subject was given a brief orientation to the session. During this time, subjects could ask questions as long as they would not interfere with the evaluation itself. The orientation contained the following topics:

1) Informed Consent
Informed consent was required by the Committee for the Protection of Human Subjects at Michigan State University. This form briefly described the purpose of the research, stated that subjects would be videotaped, and emphasized that subjects were not to feel any coercion to participate in the study. This form, which is included in Appendix A, was signed by all subjects, including the experts.

2) Completion of Preliminary Questionnaire
Subjects were given a preliminary questionnaire covering demographics, prior experience with computers, and some attitudinal questions. This questionnaire is included in Appendix B. Once they finished this questionnaire, they were paid in cash for their participation in the study (students only).

3) Background of Evaluation
Subjects were then given a brief description of the purpose of the evaluation and the work they would be doing.
This description can be found in Appendix C. Most relevant was the information that the subjects were not being tested, but that the software was the focus of the study.

4) Pre-requisites List
A printed list of teletraffic pre-requisites was presented and explained to subjects prior to the beginning of the evaluation. This list contained relevant information for subjects who were not familiar with the terminology, facts, and concepts related to teletraffic engineering. A version of this list can be found in Appendix D.

5) Introduction to the Computer System
Subjects were then given a brief introduction to the computer and the prototype they were to use. They were asked whether they were left- or right-handed and told to use the left button of the mouse. They were also asked to attach the lapel microphone and given an explanation that the second monitor was facing the opposite direction, for videotaping. They were shown how to wiggle the cursor on the screen whenever they wanted to explain something to the observer.

Following the completion of the orientation, the evaluation began immediately, and a time limit of one hour was set.

4.3.2 The Evaluation Session

Once the user got started, the observer would maintain verbal contact with the participant, in order to get them to "think aloud" during the session. This process of reminding the subject to speak would vary, depending on the personality of the participant. At critical points, such as when trying to learn to navigate around the program, the observer would ask what specific problems the participant was facing. This critical-incident technique allowed the verification of specific problems by most of the subjects.

In the specific case of the exercise on screen 9, the observer tried to see how far the subjects could get. If a subject was spending too much time without making progress, the observer would ask the participant to verbalize the solution strategy being attempted and to continue with the evaluation.

The time to explore the prototype was limited to one hour. It would be unrealistic to compile more than one hour of videotape for each subject, and it would be difficult for each subject to allocate more than three hours for this evaluation. This time limit was obtained during pilot testing and proved to be adequate for completing the task by most of the participants.

4.3.3 Data Recording

The observer recorded (on paper) key timing and success information during the evaluation. Further notes were made indicating comments made by the subjects, specific observations by the observer, and details of problems or particularly interesting occurrences.

Video recordings were made of each subject, beginning with the introduction of the prototype in the orientation section and continuing through the follow-up questionnaires. The videotape included the subject's voice, the observer's voice, the computer screen, and the computer sounds (narrations, beeps, mouse clicks, and keyboard strokes). The videotapes served as verification of problems encountered and allowed further analysis of subject and experimenter behavior. The videotapes were also the raw material for the production of the multimedia files, which are described in detail below.

4.3.4 Production of Multimedia Files

The creation of the multimedia files was dictated by the clear need for a tool or instrument that could give random and fast access to the problems and critical incidents detected. The multimedia files were created by following a pre-determined sequence.
After each interview, and for every user, the observer would process the videotape by executing a series of steps, described below. The video camera was connected to a multimedia-capable computer that had audio digitizing software installed. This setup was used to grab the critical incidents and relevant comments for each user. The audio files were grabbed at 8 bits and 11 kHz sampling rate, in order to keep the files small. This process typically required the observer to watch the tape in small segments, rewind the tape, and start grabbing the audio portion when relevant information was encountered. Each audio grab was saved with a name that indicated which user and which screen the comment or critical incident was from, as well as a sequence number, so that the audio segments could be assembled in chronological order later. At the end of each digitizing session, a Macromedia Director file was created, in which screen shots and audio segments were combined in chronological order and basic interaction was incorporated, so that the files could be manipulated quickly and easily by the researcher. The processing time required to produce a multimedia file for each user varied, depending on the amount of think-aloud talking and the number of problems encountered, but typically it took around 2 to 3 hours per user.

The criterion for including audio segments was based on the severity of the problem and the richness of the comment. This, of course, would vary if another observer processed the data, but the basic criterion was to include as much data as possible, keeping in mind the usefulness of the comments. This procedure should be improved, especially if more than one observer were involved in the process of producing the multimedia files.

It is important to mention that during the same time a list of problems was being generated, which included not only problems verbalized by the subjects but also problems perceived by the researcher. This was accomplished by filling out a simple form for each problem; the forms were compiled later into a comprehensive list of problems, which is described later. Problems were classified among four main categories: Interface, Instructional, English-Grammatical, and Programming. At the same time, with the use of another computer, the navigational maps were generated, which allowed a significant reduction of the time spent watching videotapes, always a potential problem in this kind of evaluation.
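The bookkeeping behind this procedure is simple, even though the original workflow was tied to mid-1990s tools (8-bit, 11 kHz audio grabs assembled in Macromedia Director). The sketch below is a hypothetical modern equivalent, not the procedure actually used in the study: it assumes each audio grab and screen shot has been saved with a filename that encodes the user, the screen, and the sequence number (for example, user07_screen09_03.wav), and it builds the chronological, per-participant index that the Director files provided.

import re
from pathlib import Path
from collections import defaultdict

# Hypothetical naming protocol: user07_screen09_03.wav (or .png)
#   user07   = participant identifier
#   screen09 = prototype screen where the incident occurred
#   03       = sequence number, giving chronological order
NAME = re.compile(r"user(?P<user>\d+)_screen(?P<screen>\d+)_(?P<seq>\d+)\.(wav|png)$")

def build_index(media_dir: str) -> dict[str, list[tuple[int, int, str]]]:
    """Group captured segments by participant and sort them chronologically."""
    index = defaultdict(list)
    for path in Path(media_dir).iterdir():
        match = NAME.match(path.name)
        if match:
            index[match["user"]].append(
                (int(match["seq"]), int(match["screen"]), path.name)
            )
    for segments in index.values():
        segments.sort()  # sequence number first, so the order is chronological
    return index

if __name__ == "__main__":
    for user, segments in sorted(build_index("captures").items()):
        print(f"Participant {user}: {len(segments)} critical-incident segments")
        for seq, screen, name in segments:
            print(f"  {seq:02d}  screen {screen:02d}  {name}")

A listing of this kind is only the random-access skeleton; the audio clips and screen shots attached to each entry carry the actual critical-incident feedback.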
4.3.5 Questionnaires

Once the time limit of one hour expired (immediately after the session), the subjects were asked to answer the questionnaires. The first questionnaire to be filled out was the Reeves & Harmon questionnaire (see Appendix E). This instrument was presented to the participants in printed form and took a few minutes to fill out. Participants were then asked to answer the QUIS questionnaire, which was presented on-line on the same computer utilized for the evaluation (see Appendix F). This instrument took around 20 to 30 minutes to complete. The videotaping was not interrupted at any time during the session. Once the participants completed the questionnaires, they were asked if they had any comments, suggestions, or questions about the whole process. Many users indicated interest in the study and gave interesting comments.

4.3.6 Meta-Analysis

The experts, once they had finished answering the QUIS questionnaire, were given a verbal explanation of the meta-analysis to be conducted. In this short explanation, they were told about the need for a classification, evaluation, and prioritization of instruments and procedures, and were asked to become, for the meta-analysis' sake, "managers" of the development project for the prototype they had just finished evaluating. The experts were then asked to examine carefully all ten instruments, the procedures, and the data collected during the potential users' evaluations (described below). Most of this information was presented in printed form in a packet, with the exception of the multimedia files and the statistical data tabulated in Minitab, which had to be presented on the computer.

All 10 instruments and procedures evaluated by the experts in the meta-analysis are described below:

1) Preliminary Questionnaire
The preliminary questionnaire consisted of background information and demographic questions about the participants. This document is presented in Appendix G. This instrument was presented to the participants in printed form and consisted of a one-sided page. The questionnaire included attitudinal items, computer usage background, and general information about the participants.

2) Reeves and Harmon Questionnaire
This instrument was included in the evaluation with the objective of incorporating instructional and pedagogical dimensions in addition to interface dimensions. It was composed of two printed pages, containing 10 items on each page. Two main dimensions were defined: Pedagogical and Interface (see Appendix E).

3) QUIS Questionnaire
The QUIS Questionnaire (Questionnaire for User Interface Satisfaction) consists of a total of 69 items, subdivided into 5 categories (overall, screen, terminology, learning, and system). This instrument was included in the study with the objective of incorporating prior research on a well-established usability instrument (see Appendix F).

4) Users' Comments from the QUIS Questionnaire
The information presented in this item is a subset of QUIS. All written comments were combined and presented as a separate category. Appendix H presents the comments generated during the users' evaluations. These comments were typewritten at the end of each of the five categories of QUIS (overall, screen, terminology, learning, and system).

5) Multimedia Files of Users
Multimedia files are qualitative computer documents that were created for each subject. They contained screen shots and audio clips of the problems detected by each user. The problems were extracted from the videotapes and arranged in chronological order. These multimedia files were created with Macromedia Director 4.0, which has the capability of integrating graphics and audio in an interactive way. These files allowed experts to quickly examine the incidents, taking advantage of the random-access nature of digital media.

6) Navigational Maps
Navigational-visualization maps were included in this study with the objective of providing a way of verifying the navigation and frequency of screen visits for each participant. In this study, the navigational maps were generated during the video compilations. An example of a navigational map is available in Appendix I. These maps indicate, in numerical order, the screens visited, visualized in a sequential diagram of screens. The observer's comments were included on these maps, in order to contextualize the navigational strategy of each user.
7) Ethnic Groups Results
The objective of including ethnic groups in this study was to provide a diverse range of usability perspectives from international users. Descriptive statistics were presented showing differences among the three ethnic groups. The multimedia files and the QUIS comments files also indicated the national origin of the participants.

8) List of Problems
A list of problems was generated with the objective of providing an efficient way of reporting and summarizing the problems detected by all subjects, including experts. With this list, which can be seen in Appendix J, experts could quickly grasp the incidence, the location, and the description of each problem. This list was generated by the researcher in order of coding, while watching the videotapes. The list categorizes the problems encountered into four groups: Interface, Instructional, English, and Programming problems.

9) Report of Usage
This item was created with the objective of providing the experts with a report of usage, which included the amount of time spent on the prototype, the number of screens visited, and the level of response to the exercises and quizzes for each participant tested. Descriptive statistics were presented to the experts (Appendix K).

10) Statistical Data Tabulated in Minitab
All the data generated in the evaluation were tabulated and presented to the experts using the statistical software Minitab. A total of 114 variables were generated, including data from all the instruments described above. A printout of this information was given to the experts for reference, consultation, and manipulation (Appendix L).

During the meta-analysis, the observer was available for questions and clarifications of any kind. Once the experts finished the analysis of the instruments, they were asked to answer a ten-item Likert scale questionnaire about the instruments and procedures. This questionnaire is available in Appendix G. The meta-analysis was also recorded on videotape for further clarification.

4.4 Evaluation Chronology

Initial work on this evaluation began in October of 1994. Related research literature was studied, and potential prototypes and instruments were considered. Originally, this evaluation was intended to include 4 or 5 ethnic groups. Because of the difficulty of obtaining subjects, however, the study was narrowed to three main ethnic groups. Two subjects participated in a pilot study by December of 1994. The pilot study led to considerable changes in the study, particularly for the sake of the users. The Reeves & Harmon Questionnaire was shortened, as was the QUIS Questionnaire. A list of pre-requisites was incorporated into the information given prior to each session, in order to allow non-telecommunication engineering students to participate in the evaluation. The first subject was observed in March 1995, after which no more changes were made to the evaluation design, the instruments, or the prototype. All users had participated by June 1995. During the month of July, the data collected from the users' evaluations were compiled, summarized, and prepared to be shown to the experts. Experts participated in the evaluation in August and September of 1995.

Chapter 5
Analysis

This chapter presents and analyzes the quantitative and qualitative data collected during this study. It is organized as follows: first, the analysis methodologies are surveyed and a rationale is given for the methodology used. Second, the information collected is described and analyzed.
Last, a summary of the analysis is presented.

5.1 Choices for Analysis

A primary decision that must be made in any observational research is how to analyze the data that are collected. This decision is largely determined by the general knowledge of the field of study, the specific knowledge of the problem domain, and the perspective of the researcher.

5.1.1 Different Approaches to the Same Problem

The current stage of human-computer interface research depends on the perspective of the researcher, since it involves a cross-section of the fields of computer science, education, cognitive science, and statistics. From the computer science standpoint, the discipline of usability engineering has the objective of designing better interfaces. Professionals who take this approach are very practical in terms of finding out how to make applications more usable. Research of this nature studies the interaction of users and computers with the objective of finding statistically significant results.

From the cognitive and educational standpoint, human-computer interaction is located between a descriptive and an explanatory standpoint. Human thinking is not understood well enough to predict it entirely. Therefore, research of this nature tries to expand the knowledge of the interaction between humans and machines. Case studies of a small number of users are ideal in this scenario [Bell, 1992]. Statistics can improve the researcher's capacity to generalize the results. Since differences are observed and quantified, statistical methods can be used to verify whether these results are significant. When questions can be asked in advance with enough detail, a study can include these questions.

This study attempts to incorporate a combination of the above approaches. The goal was to gain a better understanding of the methodologies for testing educational interfaces, combined with problem detection and attitudes. In order to accomplish this, it combines the detailed observation of subjects found in a case study with the specific comparison of different types of users and specialists, as in human factors studies.

5.1.2 Additional Considerations

In addition to the concerns mentioned above, other factors influenced the statistical approach taken in the study, the most important one being the limited availability of participants. Since a significant amount of the data collected were categorical, and other data were not normally distributed, parametric analyses could not be utilized. This was a critical decision, because this researcher wanted to follow a trend observed in previous studies [Nielsen and Landauer, 1993]. This researcher wanted to include several categories of users because of the issue of international use of interfaces, and because previous research indicates that particular types of interfaces are better suited for particular types of users [Nielsen, 1990]. This researcher also wanted to use several different instruments in this evaluation, because this would allow for the comparison of results and of opinions about their relative usefulness. Consequently, the study was designed to include qualitative and quantitative instruments, which should help to better define future evaluations of this kind.
5.2 Statistical Choices

5.2.1 Use of Non-Parametric Methods

The Kruskal-Wallis and Mann-Whitney U tests were chosen because of the small number of participants in this research, the non-normal distribution of its variables, the unequal variances of the sample groups, and the scale inconsistencies over the range of measurements. Another reason was the great variability of responses obtained in the questionnaires. A more liberal alpha level was adopted for testing the hypotheses, in order to improve the probability of detecting differences between the groups. The researcher considered that the risk of committing a Type I error in this case was not serious. The computer software utilized for this statistical procedure was StatSoft [1991].

5.2.2 Use of Cluster Analysis

The researcher decided to use cluster analysis for exploring the data generated by the questionnaires and for grouping users. In human-computer interaction research, subjects are analyzed in small numbers, and measures usually are characterized by fairly large amounts of error variance, as well as being of indeterminate underlying distribution. This limits the use of factor analytic techniques [Kirakowski and Corbett 1990]. Much more useful are cluster analysis techniques. By definition, cluster analysis is based on one or more similarity coefficients, or distance measures [Aldenderfer & Blashfield, 1984; Morris, Blashfield, & Satz, 1981]. The cluster analysis in the present study employed the k-means clustering method. The computer application used for this procedure was StatSoft [1991]. Computationally, the k-means clustering technique minimizes variability within clusters and maximizes variability between clusters. The program tries to move cases in and out of groups (clusters) to get the most significant results. In this kind of cluster technique, the researcher specifies, in advance, the number of clusters, and the computer clusters the cases accordingly. In the present study, the researcher tried different numbers of clusters, to see if the participants would cluster into more distinct groups. Two- and four-cluster analyses were tried, using a maximum of 10 iterations. The best solution was found between two and five iterations for the majority of the analyses. Missing data values were substituted by means. In a sense, cluster analysis finds the most practical and efficient solution possible.
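As a concrete illustration of these statistical choices, the sketch below shows how the same procedures could be run today with SciPy and scikit-learn rather than StatSoft. It is a minimal sketch under assumed data: the group layout, the example ratings, and the variable names are hypothetical placeholders, not the study's data, and the alpha of 0.2 is only an example of a liberal level.

import numpy as np
from scipy.stats import kruskal, mannwhitneyu
from sklearn.cluster import KMeans

# Hypothetical QUIS-style ratings on a 1-9 scale, one row per participant.
# Groups: 1 = Indian/Pakistani, 2 = Chinese/Korean, 3 = American, 4 = expert.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 10, size=(20, 5)).astype(float)   # 20 subjects x 5 items
group = np.array([1] * 5 + [2] * 5 + [3] * 5 + [4] * 5)
overall = ratings.mean(axis=1)                               # per-subject mean rating

# Kruskal-Wallis test across all four groups (non-parametric one-way comparison).
h, p_kw = kruskal(*(overall[group == g] for g in (1, 2, 3, 4)))

# Mann-Whitney U test: all users combined versus the experts.
users, experts = overall[group != 4], overall[group == 4]
u, p_mw = mannwhitneyu(users, experts, alternative="two-sided")

# A liberal alpha raises the chance of detecting group differences at the cost
# of more Type I error, which is the trade-off discussed in the text.
alpha = 0.2
print(f"Kruskal-Wallis H={h:.2f}, p={p_kw:.3f}, significant={p_kw < alpha}")
print(f"Mann-Whitney  U={u:.2f}, p={p_mw:.3f}, significant={p_mw < alpha}")

# k-means clustering of the full response profiles, with the number of clusters
# fixed in advance (two clusters in one run, four in another).
for k in (2, 4):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(ratings)
    print(f"k={k}: cluster assignments {labels.tolist()}")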
5.3 Results

5.3.1 QUIS Comparisons between Ethnic Groups, Users and Experts

Table 5.1 presents the means and ranges for users, experts, and ethnic groups, as well as the results of the non-parametric tests among the ethnic user groups and between all users and the experts. Results that obtained statistical significance are indicated accordingly. The Kruskal-Wallis test was executed among all four groups; the Mann-Whitney test was performed between the users as a whole and the experts. For each main category of variables, means were computed for each group, as well as overall means for each group.

Table 5.1: QUIS Comparisons between Ethnic Groups and Experts

An examination of Table 5.1 indicates the presence of differences between ethnic groups, and of differences between experts and users. A pattern was detected when the values of each sub-category were compared. Figures 5.2 through 5.7 show these differences graphically.

Figure 5.2 compares users and experts and displays at the same time all questions of QUIS. This graphic clearly shows a trend: experts were, in general, more critical than users when evaluating this prototype. It also shows a wider range among the experts' answers.

Figure 5.2: QUIS answers - Comparison of users and experts

Figure 5.3 focuses on the overall aspects of the interface. This graphic shows all three ethnic groups, their combination, and the experts' responses. The item "Adequate Power" was the only item that did not obtain any statistical significance when comparing users and experts.

Figure 5.3: QUIS answers - Overall items - Ethnic group users and experts

Figure 5.4 focuses on the screen aspects of the interface. This graphic shows all three ethnic groups, their combination, and the experts' responses. One observation is the lack of responses on two items by the experts group, "Reverse Video" and "Blinking", which indicates a more careful interpretation of the questions by them: the use of reverse video and blinking was not available in the prototype tested. "Going back to previous screen" and "Beginning, Middle and End of Tasks" both presented differences that were statistically significant. This could indicate that experts were more efficient than users at finding navigational problems in the prototype.

Figure 5.4: QUIS answers - Screen items - Ethnic group users and experts

Figure 5.5 focuses on the terminology aspects of the interface. This graphic shows all three ethnic groups, their combinations, and the experts' responses. Several items presented significant differences; "Terms on Screen" and "User Control Feedback" obtained the higher significance levels. It is interesting to note that experts were more critical than users regarding terminology. This could be due to the fact that none of the experts were familiar with the content.

Figure 5.5: QUIS answers - Terminology items - Ethnic group users and experts

Figure 5.6 focuses on the learning aspects of the interface. This graphic shows all three ethnic groups, their combinations, and the experts' responses. "Accessing help messages" was statistically significant. Figure 5.7 focuses on the system aspects of the interface. This graphic shows all three ethnic groups, their combination, and the experts' responses. In this category, experts were more positive than in the other categories. One item that obtained high statistical significance was "Experts can use features easily": users rated this item high (8) and experts low (3). This could be explained by the possibility that experts had lower expectations of the efficiency of the prototype than users did.
Figure 5.6: QUIS answers - Learning items - Ethnic group users and experts

Figure 5.7: QUIS answers - System items - Ethnic group users and experts

An important trend detected in this section was that, among the ethnic groups, the Indians were more positive about the prototype in all categories. The Chinese were more critical than the American users, with the exception of the overall aspects category.

5.3.2 QUIS Comparisons between Genders

Table 5.8 presents the means and ranges for males and females, as well as the non-parametric (Mann-Whitney) test results. Results that obtained statistical significance are indicated accordingly. For each main category of variables, means were computed for each gender, as well as overall means for each gender. An examination of Table 5.8 indicates the presence of differences between the genders. A pattern was detected when the values for each category were compared. Figures 5.9 through 5.14 show these differences graphically.

Table 5.8: QUIS answers - Gender comparison

Figure 5.9 compares the genders and displays at the same time all questions of QUIS. This graphic shows a trend detected in this section: males were slightly more critical than females when evaluating this prototype.

Figure 5.9: QUIS answers - Gender comparison

Figure 5.10 focuses on the overall aspects of the prototype. With the exception of the item "Flexible", all items were graded lower by males. In terms of statistical significance, "Wonderful" was the only significant item (although at a level of .2).

Figure 5.10: QUIS answers - Gender - Overall aspects

Figure 5.11 focuses on the screen aspects of the prototype. The item "Screen layouts" was statistically significant at p<.05. The sequence of screens was significant at p<.2. In this category, the differences in ratings between males and females were less evident, although present.

Figure 5.11: QUIS answers - Gender - Screen aspects

Figure 5.12 focuses on the terminology aspects of the prototype. "Predictable Results" was the only statistically significant item (p<.05). In this category, females were more positive when rating the prototype, with the exception of "Computer Terms".

Figure 5.12: QUIS answers - Gender - Terminology aspects

Figure 5.13 focuses on the learning aspects of the prototype. This section had balanced ratings, and significance was obtained in only two items (p<.2): "Remember Rules" and "Steps of a Sequence".

Figure 5.13: QUIS answers - Gender - Learning aspects

Figure 5.14 focuses on the system aspects of the prototype. This section had a relative balance in ratings, and significance was obtained in two items: "Failures Occur Seldom" (p<.1) and "System tends to be quiet" (p<.2).

Figure 5.14: QUIS answers - Gender - System aspects
5.3.3 Reeves & Harmon Questionnaire: Comparisons between Ethnic Groups, Users and Experts

Table 5.15 presents the means and ranges for users, experts, and ethnic groups, as well as the non-parametric test results among the ethnic user groups and between all users and the experts. Results that obtained statistical significance are indicated. The Kruskal-Wallis test was executed among all groups; the Mann-Whitney test was performed between users and experts. For both categories of variables, means were computed for each group. An examination of Table 5.15 indicates, again, the presence of differences between ethnic groups, and of differences between experts and users.

Table 5.15: Reeves & Harmon answers - Comparison of Users and Experts

A pattern was detected when the values of each dimension were compared. Figures 5.16 through 5.18 show these differences graphically.

Figure 5.16 compares users and experts and displays all items of the questionnaire. This graphic confirms the trend found with QUIS: experts were, in general, more critical than users when evaluating this prototype. It also confirms the trend of more variability in range among the experts. Two items presented significant differences: Individual Differences and Media Integration. The first item matched the results from QUIS (experts graded the individual differences lower). The second item presents a new perspective: experts considered the media integration poorer than users did. A plausible explanation would be their previous knowledge of other instructional multimedia applications.

Figure 5.16: Reeves & Harmon answers - Comparison of Users and Experts

Figure 5.17 focuses on the learning aspects of the prototype. This graphic shows all three ethnic groups, their combinations, and the experts' responses. The item "Accommodation of Individual Differences" was significant at p<.05; "Experiential Value" and "Motivation" were significant at p<.1; "Learner Control" and "Cognitive Psychology" were significant at p<.2.

Figure 5.17: Reeves & Harmon answers - Comparison of Users and Experts - Learning Dimensions

Figure 5.18 focuses on the interface aspects of the prototype. This graphic shows all three ethnic groups, their combination, and the experts' responses. The items "Mapping" and "Media Integration" were significant at p<.05; "Aesthetics" was significant at p<.1; and "Navigation", "Screen Design", and "Overall Functionality" were significant at p<.2.

Figure 5.18: Reeves & Harmon answers - Comparison of Users and Experts - Interface Dimensions
The trend observed with QUIS, indicating that the Indian group was more positive about the prototype, was confirmed here. However, the Americans were the most critical users here, contradicting the trend found with QUIS.

5.3.4 Reeves & Harmon: Comparisons between Genders

Table 5.19 presents the means and ranges for males and females, as well as the non-parametric test results (the Mann-Whitney test was performed). Results that obtained statistical significance are indicated. For both dimensions of variables (learning and interface), means were computed for each gender.

Table 5.19: Reeves & Harmon answers - Gender Comparison

An examination of Table 5.19 confirms the presence of a trend in the differences between males and females. This pattern was detected when the values of each dimension were compared, and the trend was more evident on the learning dimensions. On the interface side, the mean differences between males and females were smaller. "Pedagogy of Objectives" was significant at p<.05; "Experiential Value" and "Motivation" were significant at p<.1; "Cognitive Psychology" and "Overall Functionality" were significant at p<.2. Figure 5.20 represents these results graphically.

Figure 5.20: Reeves & Harmon answers - Gender Comparison

5.3.5 List of Problems: Comparisons between Ethnic Groups, Users and Experts, and Types of Problems

The analysis of the list of problems encountered by the participants (Table 5.21) allowed a quantification and classification of problems. An examination of this list indicates four types of problems: Interface problems, Instructional problems, English problems, and Programming problems. This categorization was helpful when trying to identify patterns in terms of which group detected which kinds of problems more frequently. Experts (5 participants) encountered 92 problems, out of a total of 114. In other words, experts found 51 problems that users did not find, as opposed to 22 problems found only by the users (a total of 15 users). The mean number of problems encountered by each group was: experts 28.4; Americans 12.8; males 10.8; all users combined 9.98; Indians 9.4; females 8.25; and Chinese 7.75. Figure 5.22 shows these results graphically. The difference between these groups was statistically significant at p<.01.
Figure 5.22: Mean number of problems found by subject groups

The list of problems has the following components: the location of each problem, defined by the screen where it was found (for example, "Tutorial 2/27" means that a problem was found at the second screen of the tutorial); the type of problem, one of the four kinds used in this study (Instructional, Interface, English, or Programming); a brief description of each problem encountered; and one column per subject (users 1 to 16 and the experts), in which an "X" indicates the incidence of that problem for that subject. Counted horizontally, the marks give the total incidence of one specific problem among all the subjects in the study; counted vertically, they give the total number of problems a subject found. Totals are reported for each row and column.

Table 5.21: Explanation of the List of Problems. The actual list is split into two pages for microfilm purposes and is available in Appendix J; the original list was one page. Problems were included in order of coding by the researcher.

Clearly, the experts were much more efficient in detecting problems. An examination of Table 5.23 and Figure 5.24 allows a closer verification of the kinds of problems detected by users and experts. In this case, it is important to notice that the number of users was three times larger than the number of experts. Among the types of problems, it seems that experts were particularly efficient in encountering interface problems.

Type of problem    Users    Experts    Total by type
Instructional        19       28          32
Interface            31       50          59
English              10       11          18
Programming           3        3           5
Total by group       63       92         114

Obs.: 16 users and 5 experts participated in this study.

Table 5.23: Number of problems found by users and experts - Categorization of types of problems

Figure 5.24: Number of problems found by users and experts - Categorization of types of problems

5.3.6 Results of Cluster Analyses

The process of clustering the participants was performed using two methods. In the first method (joining), a hierarchical tree was generated, which is represented in Figure 5.25. Missing data values were substituted by the means of the variables.

Figure 5.25: Clustering of subjects by Hierarchical Tree (the 21 cases, labeled by nationality or expert status and gender, plotted against linkage distance rescaled as (Dlink/Dmax) x 100)

An examination of Figure 5.25 indicates two main clusters. Participants 1, 19, 20, 9 and 12 (one Korean and two experts) form the first cluster. At the bottom of the tree, participants 8, 11, 9 and 12 (all Americans) could be interpreted as a second cluster. A clear distinction between users and experts, or between ethnic groups, with the exception of the American group, could not be detected.
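The joining method described above is an agglomerative hierarchical clustering, and the (Dlink/Dmax) x 100 axis of Figure 5.25 is simply the linkage distance rescaled as a percentage of the largest merge distance. The sketch below shows the same idea with SciPy; the data, the case labels, and the choice of complete linkage with Euclidean distance are assumptions for illustration, since the study does not report the exact linkage rule used by StatSoft.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical questionnaire profiles (rows = participants, columns = items),
# with missing values already replaced by means, as in the study.
rng = np.random.default_rng(1)
profiles = rng.normal(5.0, 2.0, size=(21, 10))
labels = [f"subj{i:02d}" for i in range(1, 22)]   # placeholder case labels

# Agglomerative ("joining") clustering; assumed Euclidean distance, complete linkage.
tree = linkage(profiles, method="complete", metric="euclidean")

# Rescale merge heights to (Dlink / Dmax) * 100, as on the axis of the tree plot.
dmax = tree[:, 2].max()
print("Merge heights as % of Dmax:", np.round(tree[:, 2] / dmax * 100, 1))

# Cut the tree into two clusters and list the membership of each.
membership = fcluster(tree, t=2, criterion="maxclust")
for cluster_id in (1, 2):
    members = [lab for lab, c in zip(labels, membership) if c == cluster_id]
    print(f"Cluster {cluster_id}: {members}")

# scipy.cluster.hierarchy.dendrogram(tree, labels=labels) would draw the tree itself.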
In the second method (k-means), the number of clusters given to the statistical software was pre-determined to be two and four groups. The researcher wanted to test whether the participants would cluster into users and experts, or into ethnic groups and experts. Table 5.26 summarizes the results of this clustering process, showing participants by groups. Since two questionnaires were utilized in this study (QUIS and Reeves & Harmon), the researcher ran separate and combined cluster analyses for each possible combination. An examination of the different results for each combination shows an equivalence of results between Reeves & Harmon, QUIS, and QUIS and Reeves & Harmon combined when two clusters were requested. None of the clustering combinations revealed a clear agglutination of the ethnic groups, the genders, or the experts group. But a trend can be detected in the way the Indian/Pakistani users cluster together across the different combinations.

Table 5.26: Cluster Analysis by Participants and Groups (cluster memberships for the two-cluster and four-cluster solutions, run separately for QUIS, Reeves & Harmon, and both questionnaires combined; 1 = Indians/Pakistanis, 2 = Chinese/Koreans, 3 = Americans, 4 = Experts, * = Venezuelan)

5.3.7 Results of the Meta-Evaluation by Experts

Figure 5.27 represents the results of the questionnaire for evaluating the instruments and procedures utilized in this study. This questionnaire was answered only by experts, after they finished the evaluation of the prototype itself. They were asked to evaluate the instruments and the data generated by the users with these instruments. This meta-evaluation was intended to verify which instruments experts would rate higher, in the context of developing educational multimedia. An examination of this figure indicates that the multimedia segments were rated the highest, with all five experts rating them nine (the maximum value of the scale). The list of problems came in second, with a mean value of 8.2; the Demographics Questionnaire mean was 7.8; the QUIS Written Comments mean was 7.0; Ethnic Groups, 6.6; Navigational Maps, 6.4; Statistical Data in Minitab, 5.8; Report of Usage, 5.6; QUIS Questionnaire, 4.2; and Reeves & Harmon Questionnaire, 4.0.

Figure 5.27: Instruments Ratings by Experts

5.4 Qualitative Analysis of Comments from Multimedia Files

A careful examination of the comments collected in the multimedia files generated some preliminary and exploratory results, in terms of the kinds of differences in feedback that could exist between males and females, as well as among the three ethnic groups included in the study. In terms of gender differences, the comments showed a trend that females were more focused and detailed when going over the prototype. In general, their comments showed an interest in the instructional aspects of the prototype, but with sufficient understanding of the interface to suggest and indicate valid problems and modifications. This could mean that females were more attentive and more willing to really try to learn from the prototype. Their comments were generally more frequent and longer.
As an example, one of the female subjects was trying so hard to execute the calculation on screen 9 that she ended up finding a bug in the Microsoft Windows Calculator.

In terms of differences between ethnic groups, the comments collected in the multimedia files indicated that the Indian group was more focused on the interface aspects and generated longer and richer comments. The Chinese subjects were the most quiet and seemed to be focused mostly on the learning, generating only a few really useful comments. American users seemed to generate a more balanced set of comments, although not as rich as the Indian comments. For the ethnic comparisons, both male and female subjects were included.

The validity of these findings is of relative merit, though, considering that the sample was small and too many confounding variables could be present, such as language difficulty, socioeconomic status, age, the gender of the observer, and field of engineering. It is important to note, also, that the fact that one group is less talkative than the others does not necessarily mean that they are less useful. The role of the observer is not only to detect the problems verbalized by the subjects, but to detect the problems encountered by the subjects. This discrepancy is apparent if one compares the number of verbalized problems of each subject (present in the multimedia files) with the number of detected problems present in the list of problems.

5.5 Chapter Summary

This chapter contained a description of the analysis performed on the data collected during this study and of the supplementary meta-evaluation of the instruments and procedures by the experts. Experts were more efficient in evaluating the prototype. They detected more problems than any other user group. Experts found significantly more interface problems than users. There were interface problems detected only by users, and there were interface problems detected only by experts. This result suggests that a combination of both kinds of participants would be the ideal solution for testing prototypes.

The Multimedia Segments and the List of Problems were the instruments most preferred by the experts. Questionnaires were the least appreciated of the instruments. Ethnic groups reacted differently to the prototype. The Indian/Pakistani group was the most positive about the prototype. Americans were more critical according to the Reeves & Harmon Questionnaire; the Chinese were more critical according to QUIS. Females were more positive than males when rating the prototype. Males found more problems in the interface.

The process of clustering users according to their responses did not indicate a clear existence of ethnic groups or a clear distinction between users and experts. The high variability of responses in the questionnaires seems to be the cause of this result. The graphical representations of the questionnaire answers given by experts were more meaningful for interpretation than those of the answers given by the users, since there was more contrast between positive and negative aspects of the prototype. The next chapter contains more details on particular issues discovered during this analysis, as well as a discussion of the methodology utilized in this study.

Chapter 6
Discussion

This chapter discusses particular issues about the data (both quantitative and qualitative), the instruments, the procedures, and the overall methodology used in this study.
The objectives of this study were threefold: a) to examine the differences between experts and users when evaluating educational prototypes; b) to compare ethnic groups and genders in the process of evaluating educational prototypes; and c) to implement a methodology for evaluating prototypes. The cognitive diversity of the participants involved in this study challenged the researcher and could have generated more questions than answers. This does not mean that evaluating is unnecessary. Instead, it shows the need for better methodologies and more precise, yet flexible, instruments and procedures. The verification of differences between groups of users and experts magnifies the need for collecting more than one perspective when developing and evaluating educational multimedia.

6.1 Cultural Identity between Participants and Observer

One important observation of this study was the realization of the importance of the observer as a crucial element in evaluations of this kind. It is not enough to bring subjects to the laboratory, ask them to try a piece of software, start videotaping, and then watch a monitor in a remote room. It is necessary to have the observer present and interacting with the participants, in order to obtain the maximum amount of quality information. The quality of the evaluation feedback depends directly on the quality of the relation between the participants and the observer at the time of the evaluation.

One limitation of the present study was the lack of good communication skills among some of the participants, especially in the Chinese/Korean group. The observer had a good grasp of English (although it was not his native language) and presented good communication skills, being able to establish a cordial and relaxed rapport with most of the subjects. In some instances, however, not knowing more about the culture and the language of the users proved to be a real barrier. Also, the task of thinking aloud in a language other than their native one was a barrier for some participants. One possible solution would be to let observers ask questions in English, and allow the comments and answers to be given in the subject's native language, which could be translated later during the compilation of the videotapes. Ideally, having an observer familiar with the culture and language of the subjects would be the preferred solution.

6.2 Differences among Ethnic Groups, and Experts

Besides the quantitative results reported in the previous chapter, it is relevant to comment on the quality of the verbal comments generated by each cultural group. Some users, independent of their cultural background, were shy. This was true even for experts. Differences in subjects' personalities could become an important issue to be included in studies of this nature. The researcher was not capable of incorporating this dimension in the present study, but obtaining some variables in this direction (perhaps by way of the demographics questionnaire) is recommended, in terms of exploratory research.

Experts were clearly more comfortable with the evaluation sessions than most of the users. This observation has limited value, though, when one takes into consideration the fact that all experts were colleagues of the researcher, and they all had English as their native language. The comparison of the depth and quality of comments between the different groups indicates that the experts' comments were more complete and useful.
Both Indians and Americans, depending on the personality of the user, had very interesting and useful comments. The Chinese/Korean group was the least useful in terms of the quality of their comments.

6.3 Multimedia Files

The multimedia files were incorporated in this study with the intent of providing the experts and designers with a qualitative tool that could allow fast and efficient access to problem feedback, in the context of its occurrence. The high acceptance of this instrument among the experts indicates its relevance to future evaluations. Although technologically possible, digital video technology was not cost-effective when incorporated in studies of this nature. Audio and screen shots were used as an alternative to digital video, and proved to be adequate for this specific prototype evaluation. One of the experts suggested the use of recordable videodisc instead of multimedia files, but it was this researcher's intent to see if the available digital multimedia technology would be able to provide reasonable quality at an affordable cost.

This instrument extends the usefulness of videotaping as a technique for usability testing [Brun-Cottan and Wall 1995]. Videotaping captures and demonstrates to designers user-relevant methods of finding, addressing, and resolving interface problems. Multimedia files take this process one step further by providing random access to the information.

6.4 Simultaneous Analysis of Content, Pedagogy & Interface

The detailed observation of users and experts revealed a critical issue in evaluating educational prototypes. The ideal evaluation participant should be able to handle three different tasks at the same time: a) evaluate the interface, b) learn the content, and c) evaluate the pedagogy. It turns out that very few participants could in fact handle this complex task smoothly. Even for some of the experts, this task was overwhelming. The problem is more severe when there is a time limit imposed on participants, which is often the case.

This issue was not very evident to the researcher during the recording of the evaluation sessions, most likely due to the intensity of the interaction between the observer and the participants. Being able to watch the videotapes carefully later on allowed these aspects to become more apparent. Typically, the participant would start the evaluation, and after a few minutes he or she would focus on one aspect and ignore the others. This effect was less frequent among experts, who seemed more prepared for multitasking, although at different levels depending on their backgrounds. Some of the experts were very comfortable, due to their educational background in math or science.

The vast majority of target users tried to concentrate on the content aspect of the program, and those who did not get lost in navigational aspects would progress in exploring the prototype. Some users, however, preferred to start by exploring or trying to understand the interface aspects and navigation tools, which caused some cognitive overload in the evaluation process. Experts seemed to be better prepared and seemed to have brought with them some previous cognitive strategies for dealing with these situations.

6.5 Use of Questionnaires in Evaluations

The utilization of two questionnaires (QUIS and Reeves & Harmon) in this study generated some thoughts about the use of this kind of instrument in evaluations of educational prototypes.
The use of questionnaires in usability evaluations is a widespread procedure in the field of Human-Computer Interaction. QUIS is a commercial tool, available from the University of Maryland for a fee (200 dollars for universities and 1,000 dollars for industry). QUIS can be answered either in paper form or in electronic form. There is evidence that using questionnaires on-line presents advantages [Slaughter, Harper and Norman 1994]. On the other hand, there is evidence that short printed questionnaires (up to two pages) can cover the main aspects of evaluations more efficiently [Lewis 1992]. QUIS was developed with the scope of testing software applications in general; it is not a specific tool for evaluating educational prototypes. The Reeves & Harmon questionnaire was created for this study, based on recommendations that educational software presents peculiarities that cannot be detected precisely with more general tools. In this sense, this instrument has the potential to be improved and to fill a gap in terms of instruments available to educational designers. Both questionnaires were rated low by the experts when compared with the other instruments in this study. This result suggests that questionnaires are reasonable tools for providing overall feedback about a prototype, but lack the necessary context for the problems detected. It seems that this kind of instrument would be more adequate for a large number of participants, when quantitative summarization is necessary and more in-depth analysis is not feasible. In terms of length, QUIS seemed too long and, at times, redundant. The advantage of filling out long questionnaires on-line, such as QUIS, is that users do not know how long they are. The electronic format utilized did not allow participants to verify the number of "pages", or how much was left to be filled out, which can sometimes be a problem. The Reeves & Harmon questionnaire was composed of two printed pages, face to face, with a total of 20 items divided into two categories, learning and interface. Participants did not seem intimidated by this questionnaire in terms of length. However, some of the items presented were not meaningful to users and had to be explained verbally in more detail.

6.6 Problems Verbalized Versus Errors Observed

This issue has important implications for the replication of the present study. The issue here is the method utilized for detecting problems during the evaluation sessions. The problems listed in this study were not only the ones verbally expressed by the participants, but also the ones perceived by the observer. This means that different observers would most likely generate different lists, according to their background, prior usability experience, and other subjective aspects. The decision to include all problems was based on the fact that this is the way usability evaluation happens. It is unreasonable to ask an observer, during any evaluation of this kind, to include only problems verbalized by the users. This procedure would be considered counterproductive, to say the least. The ability of the observer to detect problems when observing users is part of the process and needs to be included in the evaluation. One solution for this problem would be to use more than one observer, to balance or attenuate bias. This point explains why there are more problems in the List of Problems than in the Multimedia Files for each participant.
This discrepancy is due to the fact that, in many instances, the observer was able to detect a problem while the subject was too cognitively busy to verbalize the occurrence. In other instances, the subject was in fact unaware that a problem had occurred.

6.7 Combination of Qualitative and Quantitative Tools

The use of several qualitative and quantitative instruments combined in this study had the objective of developing a methodology that could triangulate useful information about the prototype. The need for techniques and methods that integrate qualitative and quantitative tools is imperative within the field of interface design research, where the substantive issues necessitate the integration of both kinds of methods in order to understand complex research problems and applications. Some instruments were, by nature, qualitative, such as the multimedia files; other instruments were clearly quantitative, such as the questionnaires and the demographic data collection. Some instruments could be categorized as both qualitative and quantitative, such as the List of Problems. This observation is important when analyzing the results of the meta-analysis. Why did experts value qualitative instruments most highly in this study (see Figure 6.1)? It seems that these instruments were more capable of providing context for the problems. The lack of context in quantitative tools is due to the very nature of these instruments. The process of summarizing the information collected in this study quantitatively discarded the context of the problems' occurrence in exchange for an estimator that could represent the entire population of the study. As an example of this process, one can look at the item "Going back" from QUIS. This item showed a low mean, which indicates that, in general, the participants had difficulty going back in the prototype. But what if someone asks the following question: "Where precisely in the prototype did users encounter difficulties going back?" This person would have to examine the List of Problems or the multimedia files to obtain an answer. This example serves to demonstrate the usefulness of having both types of data available when interpreting evaluations of this kind. These results also serve to indicate that, in situations of limited budgets and time, experts would rather have access to qualitative data when evaluating educational prototypes.

[Figure 6.1: Ratings of Instruments by Experts: The Contextual Aspect. Instruments rated: Multimedia Files, List of Problems, QUIS Comments, Ethnic Groups, Navigational Maps, Statistical Data, Report of Usage, QUIS Questions, Reeves & Harmon Questions, Demographics.]

6.8 Use of Statistical Tools in the Evaluation Methodology

Three types of statistics were used in this study: a) descriptive statistics; b) non-parametric statistics; and c) cluster analysis. The use of descriptive statistics was straightforward. It was simple to tabulate the data and obtain descriptive statistics. It was useful to be able to summarize the data into more manageable results, and this process was relatively easy for the experts to execute and interpret. The use of non-parametric statistics to detect differences between groups was a little more demanding in terms of presenting the results in a usable format. Interpreting the non-parametric results required additional research and a moderate degree of statistical knowledge.
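As a concrete illustration of the kind of group comparison described above, the sketch below shows the descriptive and non-parametric steps expressed in a modern open-source statistics library. It is only a sketch: the rating values are entirely hypothetical, and this is not the analysis actually run in this study.

    import numpy as np
    from scipy import stats

    # Hypothetical overall ratings (1-9 scale) for the two groups of participants.
    user_ratings = np.array([6, 7, 5, 8, 7, 6, 7, 8, 5, 6, 7, 6, 8, 7, 6, 7])  # 16 users
    expert_ratings = np.array([4, 5, 3, 4, 5])                                  # 5 experts

    # Descriptive statistics: simple to tabulate and to interpret.
    for name, sample in (("users", user_ratings), ("experts", expert_ratings)):
        print(name, "median =", np.median(sample), "mean =", round(float(sample.mean()), 2))

    # Non-parametric comparison of the two groups: no normality assumption,
    # which suits small samples of ordinal questionnaire ratings.
    u_stat, p_value = stats.mannwhitneyu(user_ratings, expert_ratings, alternative="two-sided")
    print("Mann-Whitney U =", u_stat, "p =", round(p_value, 4))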
An advantage of using both descriptive and non-parametric statistical analyses is that they are available in most popular statistical software applications. The third statistical application, cluster analysis, was more demanding. There are several methods of cluster analysis to choose from; the literature is divergent on some aspects of its use; and interpreting the results of the cluster analysis was challenging and laborious. Although it seemed promising to use this kind of statistical analysis, the results were somewhat inconclusive. The development of specific tools and techniques for this context would simplify and broaden its use in human-computer interface design.

6.9 Number and Nature of Problems Encountered

With reference to the number and nature of the problems encountered in this study, the results were more conclusive. The problems were clustered into four main categories: Interface, Instructional, English, and Programming. This taxonomy was generated as an attempt to give more depth to the problem counts in relation to each group of participants. This categorization was complex to execute, and it is of preliminary value, because in some cases problems could be classified in more than one category. Ratings of the severity of the problems encountered in this study were not implemented, due to the complexity and subjective character of defining this concept [Nielsen, 1994]. One way of measuring severity would be to use the incidence of each problem among the participants, available in the List of Problems, as an indication of its severity. For example, the lack of control over the audio narration was detected by 60% of the participants. However, if another example is taken (the typo "rigth", which was also detected by 60% of the participants), we can see that this criterion does not provide a consistent method.

Chapter 7

Conclusions

This chapter presents the conclusions that were drawn from this study. In the first section, the conclusions are presented in terms of the specific hypotheses studied. The second section contains a discussion of the methodology used. Finally, directions for future research are described. This study has attempted to answer a broad range of questions. The results are limited by the lack of enough subjects to produce statistically significant results. Twenty-one subjects took part in this evaluation; roughly twice as many might have yielded statistical significance. The results, however, do lead to several conclusions.

7.1 Differences in usability feedback among users and experts

It comes as no surprise that experts detected significantly more problems than users. In terms of ratings, experts were more critical in both questionnaires and presented more variability in their answers. This wider variability could simplify the task of interpreting the results of evaluations of this nature. Feedback from experts was, in general, more usable than the users' comments. The differences found between users and experts can be explained by many factors, including language (English), personality, and background experience with educational software. However, more important than exploring the reasons for these differences is the fact that, despite the differences found in this study, the two groups complement each other in terms of problem detection.
This becomes apparent if one considers the number of problems detected only by users, a total of 22, in relation to the 51 problems encountered only by experts. Experts demonstrated better strategies for handling the triple task of evaluating interface, pedagogical, and content aspects simultaneously.

7.2 Differences in usability feedback across ethnic groups

The comparison of three distinct ethnic user groups indicated that differences in usability feedback and attitudes exist. The Indian/Pakistani group of users was consistently more positive and less critical about the prototype tested. The American group presented the highest number of problems detected among all three groups and was the most critical group. The Chinese/Korean group presented an intermediate result when the number of problems and the answers given in the questionnaires are considered. In terms of qualitative answers, the Chinese/Korean group presented the least usable results; the American group was the most usable group in this regard. The lack of good interaction between the observer and the Chinese/Korean group in this study limits the validity of the above results. The amount and quality of feedback generated in evaluations of this nature depends on the interaction between observers and participants. Overall, having three ethnic user groups in the study allowed a wider range of issues and perspectives to be considered that would not have become apparent if only one user group had been targeted. This issue is particularly important when developing educational software for international audiences.

7.3 Gender differences in usability feedback

Comparisons of usability feedback between males and females indicated a more positive attitude among women when testing the prototype. The number of problems detected by males was slightly higher than the number of problems encountered by females. On the qualitative side of the feedback, no apparent differences were detected. This research question took only users into consideration. It was decided not to include experts in the comparison, in order to prevent a bias towards males, and the small number of experts in the study did not allow a gender comparison among experts (16 users and 5 experts). One aspect of this research question that could limit the external validity of these results is the fact that the observer was male. This might have introduced a bias in the interaction between the observer and the participants, in either direction, depending on the personality of the observer. This aspect is similar to the issue of cultural identity presented earlier.

7.4 Value of Multimedia Files as Qualitative Tools

There was a consensus among all experts in this research that the multimedia files were the best of all instruments utilized in this evaluation. This result was surprising, considering that these files consisted of audio and screen shots, since digital video was not viable in this study. The "multimedia file" instrument used in this study was an experimental idea of this researcher that proved to be highly effective and yet relatively simple and cheap to implement. Its simplicity relies on the use of commercial multimedia software coupled with low-cost video equipment, which, combined, give the social scientist and the interface designer a powerful way of collecting and examining critical incidents and problems when evaluating instructional technology.
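Although the multimedia files in this study were assembled with commercial authoring software, their logical structure is simple. The sketch below shows one hypothetical way such a per-subject index of critical incidents could be represented (all identifiers, categories, and file names are invented for illustration); it is not how the files were actually built.

    from dataclasses import dataclass, field

    @dataclass
    class CriticalIncident:
        # One observed problem, tied to the captures that show it in context.
        location: str        # screen or module where the problem occurred
        description: str     # short statement of the problem
        category: str        # e.g. Interface, Instructional, English, Programming
        screen_capture: str  # screen grab file (hypothetical name)
        audio_clip: str      # audio grab file (hypothetical name)

    @dataclass
    class MultimediaFile:
        # The per-subject collection that designers browse with random access.
        subject_id: str
        incidents: list = field(default_factory=list)

        def by_category(self, category):
            # Jump straight to, say, all interface problems for this subject.
            return [i for i in self.incidents if i.category == category]

    # Hypothetical example of one entry:
    mm = MultimediaFile("USA 05")
    mm.incidents.append(CriticalIncident(
        "Erlang practice screen",
        "Calculator result lost when scrolling the data table",
        "Interface", "usa05_screen_07.bmp", "usa05_audio_07.wav"))
    print(len(mm.by_category("Interface")))  # -> 1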
The biggest advantage this instrument offers to designers is the ability to see users struggling with their software's problems without having to spend a great amount of time watching videotapes, or having to deal with real users, which many designers dislike. It is the intent of this researcher to develop this instrument and some of these issues further in future research. Another instrument well received among the experts was the List of Problems. This document, which was rated second best, presented a contextual list of problems, with descriptions and locations of the problems as well as their incidences. In contrast, both the QUIS and the Reeves & Harmon questionnaires were rated low by the experts in relation to the other instruments. A plausible explanation for the weak ratings could be the lack of context these instruments presented; their usefulness seemed limited, according to the opinions of the experts. The other instruments and procedures were rated in between these two poles. It seems that the higher the contextualization of the instrument, the higher the rating it received. This trend indicates a preference by the experts for more qualitative instruments.

7.5 Evaluation Methodology of Educational Prototypes

The development of cost-effective methodologies for evaluating educational prototypes was a central question to be studied in this dissertation. Some conclusions are presented below in this regard.

7.5.1 Videotaping

The importance of videotaping usability evaluations was confirmed: the generation of the multimedia files was dependent on the availability of the videotapes; the generation of the List of Problems was dependent on the videotapes; a detailed quantification of problems detected was also generated from the videotapes; and the process of videotaping allows replication of the results, as well as an efficient form of archiving user interaction for future reference. Videotapes also serve as an essential communication medium in situations where it may be difficult to persuade developers and managers that a certain usability problem is in fact a problem. Seeing a real user struggling with the problem convinces managers and developers [Pausch 1991].

7.5.2 The interaction of observer and subjects

The importance of the observer being physically present, and of the quality of the interaction between observer and participants, was evident in this study. For most of the participants, the think-aloud process was not easy, and having someone willing to help, prompting them through the evaluation, or simply someone to direct their speech to, was very important. The critical-incident and think-aloud techniques proved to be efficient ways of obtaining feedback from evaluation participants, in particular from experts. One disadvantage of the think-aloud method was that it did not lend itself very well to most types of performance measurement. On the other hand, its strength was the wealth of qualitative data that could be collected from a fairly small number of users. Also, the subjects' comments often contained vivid and explicit quotes that could be used to make the results more readable and memorable.

7.5.3 Number of Subjects

The more subjects included in usability evaluations, the more generalizable the results become. However, problem discovery showed diminishing returns as a function of sample size.
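This diminishing-returns pattern is commonly described with a simple discovery model in which each evaluator independently detects a given problem with probability p, so that n evaluators are expected to find a proportion 1 - (1 - p)^n of the problems [Virzi, 1992; Nielsen and Landauer, 1993c]. The short sketch below illustrates the shape of that curve; the value of p is purely illustrative and was not estimated from the present data.

    # Problem-discovery model: expected proportion found = 1 - (1 - p)**n
    p = 0.31  # illustrative probability that one evaluator detects a given problem

    for n in range(1, 11):
        found = 1 - (1 - p) ** n
        print(f"{n:2d} evaluators -> {found:5.1%} of problems expected to be found")

    # With p = 0.31, five evaluators already find roughly 84% of the problems,
    # while doubling the group to ten adds only about 13 more percentage points.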
Observing four to five participants uncovered between 75 and 85% of the usability problems, a trend also found in previous studies [Lewis, 1994]. In most situations, it is not cost-efficient or viable to evaluate a large number of people and still preserve the richness, depth, and thickness of feedback like that obtained in this study. There is a trade-off between the statistical significance and the level of context of the results that needs to be taken into consideration. A recommendation would be to have a minimum of one group of four to six users and one group of four to six experts. The inclusion of more participants should be dictated by the relative availability of people, time, and equipment.

7.5.4 Questionnaires

Questionnaires are recommended if they can be kept short and if the terminology used is familiar to the subjects. The use of existing generic questionnaires for usability evaluations seemed to be of limited value, if one takes the experts' feedback from this study into consideration. Questionnaires, however, if designed appropriately and combined with other qualitative instruments, can be useful in indicating general strengths and weaknesses when testing educational prototypes. From the usability perspective, however, questionnaires are indirect methods, since they do not study the user interface itself, but only users' opinions about the user interface.

7.5.5 List of Problems

The generation of a detailed list of usability problems is recommended, as this instrument was rated one of the most valuable tools by the experts in this study. The availability of a list of this nature can simplify the work of instructional designers when developing educational prototypes. Such a list should indicate the location of each problem, its incidence, and a clear description. This list could be connected via hyperlinks to the multimedia files for each observer, giving designers fast and efficient random access to the problems detected [Nielsen, 1994]. A careful analysis of the List of Problems indicated the existence of four main categories of problems: a) interface problems; b) instructional problems; c) language problems; and d) programming problems. This preliminary taxonomy of problems could be developed further.

7.5.6 Navigational Maps

Experts showed a relative lack of interest in the maps of navigational patterns generated in this study. Experts did not demonstrate interest in analysing the paths taken by the users, and graded this instrument lower than the other qualitative tools. This might suggest that there is a need to find different and better instruments and methods for studying navigational issues. It could also mean that navigational mapping of users in instructional software is not as important as many might think. The fact that one user browses or jumps around more than another might be of only relative importance to efficiency.

7.5.7 Use of Statistical Tools

The utilization of statistical tools is of relative usefulness. Descriptive and non-parametric statistics are used more frequently in usability studies, due to the small number of subjects and the relative ease of execution and interpretation of the results. The reliability of usability studies can be a problem because of the huge differences between subjects' responses. It is not uncommon to find that the best user is 10 to 15 times faster than the slowest user [Egan 1988].
Usability testing fosters situations where designers have to make decisions on the basis of fairly unreliable data, which is still better than making decisions with no data at all. The use of cluster analysis is more complex, and its results are more difficult to convert into practical solutions for developers. Very often, clustering methods are not standardized and may be implemented differently. Also, a problem with cluster testing is the difficulty of specifying what the null hypothesis should be. Perhaps a better way of determining clusters would be to examine the validity of various solutions to the data, or to carry out replication studies [Kirakowski and Corbett 1990].

7.6 Recommendations

In terms of instruments and procedures to include in evaluations, it became apparent in this study that the use of video recording is critical. The importance of using and implementing tools such as the multimedia files created in this study also became apparent. These tools are particularly recommended for projects with many designers involved, or in situations where designers are not part of the evaluation team. The inclusion of someone not directly involved in the development of the prototype as part of the evaluation team is also recommended, unless an experienced usability specialist is part of the design team. The use of several kinds of participants and several types of instruments is often not viable because of cost and time limitations. What to do in such circumstances will depend on the context of the evaluation; however, some testing is better than no testing at all. A general recommendation would be the inclusion of a minimum of five users and five experts, whenever possible. Having at least two groups of participants would give the designers the chance to conduct some preliminary comparisons and would avoid premature generalizations regarding the efficiency of the prototype. The use of statistical procedures coupled with qualitative instruments is recommended when dealing with methodologies for the evaluation of educational interfaces. The use of graphical ways of representing statistical results is also recommended, as a way to speed up and facilitate the interpretation of those results. In most instances, being able to visualize the results quickly could be the decisive factor in determining whether the data collected are used. The generation of a list of problems is also recommended, including the percentage incidence of each problem and its location in the prototype. The use of a pre-questionnaire with demographic information about the participants completes this minimum configuration. A suggestion for future studies would be to combine the qualities of both questionnaires used here into one instrument, making it on-line, relatively short, and free of complicated terminology and redundant items. Of lower priority is the use of questionnaires, as well as navigational maps. These items require the acquisition or creation of specific tools for the task (such as graphics programs, statistical packages, and on-line questionnaires such as QUIS). A possible solution is the use of a simplified printed questionnaire.

7.7 Future Research

The most important application of this study is its use as the basis for future research. This study raised many questions that future studies should attempt to answer.

7.7.1 Enhance the Methodology

One important avenue for future research is to find better ways to perform studies similar to this one.
The results of this study were based on the use of only one prototype. Other prototypes, in different fields of knowledge, could lead to distinct results, and the use of more than one prototype could also lead to more complete answers. Therefore, variations of the approach utilized in this study should be explored, or new alternatives attempted, in order to obtain a more complete method. Another important aspect of the methodology used here that could be studied is the possible connections between quantitative and qualitative instruments.

7.7.2 Qualitative Emphasis

For future research, the qualitative instruments used in this study should be analyzed and developed in more depth. The results of the meta-analysis indicated a preference for qualitative kinds of instruments. Some of the research questions to be studied are: a) why these instruments were so popular among the experts; b) what could be done to improve these instruments; and c) how these instruments could become widely available. Future research should be directed towards a better understanding of the level of contextualization of the instruments used in the present study. This line of research could generate some promising indications for future methodologies in evaluations of educational multimedia. For example, a comparison of different kinds of multimedia files for each subject could measure the contextualization aspect. The generation and comparison of different versions of these multimedia files, such as videodisks, digital videos, or the audio-screen files used in this study, are technological possibilities that need to be understood.

7.7.3 Quantitative Emphasis

The need for future research with a quantitative emphasis is clear. Specifically, research should be performed in order to achieve statistically significant results. One viable approach in this regard is to narrow the scope of the study. Studies could be performed for each area of specific interest. For example, one study could explore the number of problems detected, without having to worry about the generation of context-specific results. Another study could explore attitudinal differences between ethnic groups or genders. By narrowing the scope of the study, and using a moderately larger and more homogeneous number of subjects, the results could produce much greater confidence levels. More research and development on the use of cluster analysis in evaluations of this nature is necessary. For example, this kind of study could help in the classification of usability problems, as well as in the grouping of users.

7.7.4 The Inclusion of Personality

Studies of usability in educational software should take into consideration the participants' personalities. Future studies should try to include personality variables as part of the body of information collected for each participant. There is some preliminary evidence in this study that personality plays a major role in the quality and amount of feedback generated.

7.7.5 Comparison of different types of observers

Future studies should try to compare different types of observers and usability specialists. For example, a comparison of observers who took part in the development of the prototype against independent observers could generate important results. The cultural background and gender of the observers are also topics to be studied in future work.
Studies of the use of more than one observer simultaneously, as a way of avoiding bias or of verifying results, could also generate important findings.

7.7.6 Use of Navigation Maps

More research and development of navigational maps as tools for visualizing the participants' feedback is necessary. The present study addressed this issue only superficially, and there are many aspects that need to be analyzed in more depth. The navigational instrument generated in this study was primitive, but it should serve as a starting point for future exploration. The use of spatial modeling for the visualization of navigational aspects is recommended.

7.7.7 Use of Questionnaires

The low ratings obtained by the questionnaires in the meta-analysis portion of this study indicate the need for more in-depth research in this area. The development of more adequate or context-specific questionnaires, as well as a more detailed comparison of existing questionnaires, could help answer some of the questions raised here. The issue of printed versus on-line questionnaires is also a topic that needs to be explored. Interesting findings have been made, and they will serve as the basis for more studies to come. Perhaps this study's biggest contribution is to point the way and declare the need for future research that compares different instruments for the usability evaluation of instructional software.

BIBLIOGRAPHY

Anderson, R. E. (1987). Females Surpass Males in Computer Problem Solving: Findings from Minnesota Computer Literacy Assessment. Journal of Educational Computing Research, 3, 39-51.

Bell, J. E. (1990). A Case Study of Ad Hoc Query Interfaces. Doctoral Dissertation, University of California, Berkeley.

Benimoff, N. I., & Whitten, W. B. II (1989). Human Factors Approaches to Prototyping and Evaluating User Interfaces. AT&T Technical Journal, 5(68), 44-45.

Brun-Cottan, F., & Wall, P. (1995). Using Video to Represent the User. Communications of the ACM, 38(5), 61-71.

Chapanis, A. (1991). Evaluating Usability. In B. Shackel & S. J. Richardson (Eds.), Human Factors for Informatics Usability (pp. 359-395). Cambridge: Cambridge University Press.

Chin, J. P., Diehl, V., & Norman, K. (1988). Development of an Instrument Measuring User Satisfaction of the Human-Computer Interface. In CHI '88: Human Factors in Computing Systems (pp. 213-218). New York: Association for Computing Machinery.

Chin, J. P., Norman, K., & Shneiderman, B. (1987). Subjective Evaluation of CF Pascal Programming Tools. Unpublished manuscript.

Clarke, V. A., & Chambers, S. M. (1989). Gender-based Factors in Computing Enrollments and Achievement: Evidence from a Study of Tertiary Students. Journal of Educational Computing Research, 5(4), 409-429.

Collins, B. A., & Williams, R. L. (1987). Differences in Adolescents' Attitudes toward Computers and Selected School Subjects. Journal of Educational Research, 8, 17-27.

Cronan, T. P., Embry, P. R., & White, S. D. (1989). Identifying Factors that Influence Performance of Non-computing Majors in the Business Computer Information Systems Course. Journal of Research on Computing in Education, Summer, 431-443.

Dambrot, F. H., & Watkins-Malek, M. A. (1985). Correlates of Sex Differences in Attitudes toward and Involvement with Computers. Journal of Vocational Behavior, 27, 71-86.

Day, M. C., & Boyce, S. J. (1993). Human Factors in Human-Computer System Design. In M. Yovits (Ed.), Advances in Computers (pp. 381-430). San Diego: Academic Press.

Diaper, D. (1990). Simulation: A Stepping-Stone Between Requirements and Design. In M. A. Life, C. S. Narborough-Hall, & W. I. Hamilton (Eds.), Simulation and the User Interface (pp. 59-71). London: Taylor and Francis.

Eberts, R. E. (1994). User Interface Design. Englewood Cliffs, New Jersey: Prentice-Hall.

Egan, D. E. (1988). Individual Differences in Human-Computer Interaction. In M. Helander (Ed.), Handbook of Human-Computer Interaction (pp. 543-568). Amsterdam: North-Holland.

Flagg, B. N. (1990). Formative Evaluation for Educational Technologies. Hillsdale: Lawrence Erlbaum Associates.

Galdo, E. M. del, Williges, R., Williges, B. H., & Wixon, D. R. (1987). A Critical Incident Evaluation Tool for Software Documentation. In L. Mark, J. Warm, & R. Huston (Eds.), Ergonomics and Human Factors (pp. 253-258). New York: Springer-Verlag.

Gardner, D. G., Discenza, R., & Dukes, R. L. (1993). The Measurement of Computer Attitudes: An Empirical Comparison of Available Scales. Journal of Educational Computing Research, 9(4), 487-607.

Gould, J. D., Boies, S. J., & Lewis, C. (1991). Making Usable, Useful, Productivity-Enhancing Computer Applications. Communications of the ACM, 34(1), 74-85.

Gould, J. D., & Lewis, C. (1985). Designing for Usability: Key Principles and What Designers Think. Communications of the ACM, 28(3), 300-311.

Granstam, I. (1990). Contributions GASAT. Jonkoping, Sweden: Jonkoping University.

Gray, D. E., & Black, T. R. (1993). Prototyping of Computer-Based Training Materials. Computers & Education, 22(3), 251-256.

Green, A. J., & Gilhooly, K. (1990). Individual Differences and Effective Learning Procedures: The Case of Statistical Computing. International Journal of Man-Machine Studies, 33, 97-119.

Harper, B., & Norman, K. (1993). QUIS: The Questionnaire for User Interaction Satisfaction. College Park, MD: University of Maryland at College Park.

Hazari, S., & Reaves, R. R. (1994). Student Preferences Toward Microcomputer User Interfaces. Computers & Education, 22, 225-229.

Igbaria, M. (1990). End-user Computing Effectiveness: A Structural Equation Model. Omega, 18(6), 637-652.

Jeffries, R., & Desurvire, H. (1992). Usability Testing versus Heuristic Evaluation: Was There a Contest? SIGCHI Bulletin, 24(4), 39-41.

Karat, C.-M., Campbell, R., & Fiegel, T. (1992). Comparison of Empirical Testing and Walkthrough Methods in User Interface Evaluation. In CHI '92 (pp. 397-404). Monterey, CA: Association for Computing Machinery.

Kirakowski, J., & Corbett, M. (1990). Effective Methodology for the Study of HCI. Stuttgart: North-Holland.

Laurel, B., & Mountford, J. (1990). The Art of Human-Computer Interface Design. San Francisco: Addison-Wesley Publishing.

Lehner, P. E. (1987). Cognitive Factors in User/Expert-System Interaction. Human Factors, 29(1), 97-109.

Lewis, C., & Polson, P. G. (1991). Cognitive Walkthroughs: A Method for Theory-Based Evaluation of User Interfaces (Tutorial). SIGCHI, ACM.

Lewis, J. R. (1992). Psychometric Evaluation of the Post-Study System Usability Questionnaire: The PSSUQ. In Proceedings of the Human Factors Society 36th Annual Meeting (pp. 1259-1263).

Lewis, J. R. (1994). Sample Sizes for Usability Studies: Additional Considerations. Human Factors, 36(2), 368-379.

Lewis, S. (1991). Cluster Analysis as a Technique to Guide Interface Design. International Journal of Man-Machine Studies, 35, 251-265.

Mack, R., & Nielsen, J. (1993). Usability Inspection Methods: Report on a Workshop Held at CHI'92. SIGCHI Bulletin, 25(1), 28-33.

McGraw, K. (1993). Conducting User Interface Evaluation (Chapter 11). In Designing and Evaluating User Interfaces for Knowledge-Based Systems (pp. 171-186). New York: Ellis Horwood.
McGraw, K. L. (1994, November). Knowledge Acquisition and Interface Design. IEEE Software, 90-92.

Miller, J. R., & Jeffries, R. (1992, September). Usability Evaluation: Science of Trade-Offs. IEEE Software, 97-102.

Nadin, M. (1988). Interface Design and Evaluation: Semiotic Implications. In R. Hartson & D. Hix (Eds.), Advances in Human-Computer Interaction (pp. 45-100). Norwood, New Jersey: Ablex Publishing Corporation.

Nielsen, J. (1992). Finding Usability Problems through Heuristic Evaluation. In CHI '92 (pp. 373-380). Monterey, CA: Association for Computing Machinery.

Nielsen, J. (1993a, November). Is Usability Engineering Really Worth It? IEEE Software, 90-93.

Nielsen, J. (1993b, November). Is Usability Engineering Really Worth It? IEEE Software, 90-92.

Nielsen, J., & Landauer, T. K. (1993c). A Mathematical Model of the Finding of Usability Problems. In Proceedings ACM INTERCHI'93 Conference (pp. 206-213).

Nielsen, J. (1993d). Usability Engineering. Cambridge, MA: AP Professional.

Nielsen, J., & Levy, J. (1994). Measuring Usability: Preference versus Performance. Communications of the ACM, 37(4), 67-75.

Norman, K. (1994). Navigating the Educational Space with Hypercourseware. Hypermedia, 6(1), 35-60.

Parasuraman, S., & Igbaria, M. (1989). An Examination of Gender Differences in the Determinants of Computer Anxiety and Attitudes Towards Microcomputers among Managers. International Journal of Man-Machine Studies, 32, 327-340.

Parasuraman, S., & Igbaria, M. (1990). An Examination of Gender Differences in the Determinants of Computer Anxiety and Attitudes Towards Microcomputers among Managers. International Journal of Man-Machine Studies, 32, 327-340.

Pausch, R. (1991). Virtual Reality on Five Dollars a Day. In ACM CHI'91 (pp. 265-270). New Orleans.

Pollier, A. (1992). Evaluation d'une Interface par des Ergonomes: Diagnostics et Strategies. Le Travail Humain, March, 71-95.

Premkumar, G., Ramamurthy, K., & King, W. R. (1993). Computer Supported Instruction and Student Characteristics: An Experimental Study. Journal of Educational Computing Research, 9(3), 373-396.

Rauterberg, M. (1993). AMME: An Automatic Mental Model Evaluation to Analyse User Behavior Traced in a Finite, Discrete State Space. Ergonomics, 36(11), 1369-1380.

Reeves, T. (1993). Evaluating Technology-Based Learning. In G. M. Piskurich (Ed.), The ASTD Handbook of Instructional Technology (pp. 15.1-15.31). New York: McGraw-Hill.

Reeves, T. C. (1991). Ten Commandments for the Evaluation of Interactive Multimedia in Higher Education. Journal of Computing in Higher Education, 2(2), 84-113.

Reeves, T. C., & Harmon, S. W. (1994). Systematic Evaluation Procedures for Interactive Multimedia for Education and Training. In S. Reisman (Ed.), Multimedia Computing: Preparing for the 21st Century (pp. 472-505). Harrisburg, PA: Idea Group Publishing.

Rettig, M. (1992). Interface Design When You Don't Know How. Communications of the ACM, 35(1), 29-34.

Roske-Hofstrand, R. J. (1989). Video in Applied Cognitive Research for Human-Centered Design. SIGCHI Bulletin, 21(2), 75-77.

Rowley, D. E., & Rhoades, D. G. (1992). The Cognitive Jogthrough: A Fast-Paced User Interface Evaluation Procedure. In CHI '92 (pp. 389-395). Monterey, CA: Association for Computing Machinery.

Salasoo, A. (1991). Initiating Usability Methods with a New Engineering Design Tool. SIGCHI Bulletin, 23(1), 68-70.

Shashaani, L. (1993). Gender-based Differences in Attitudes Toward Computers. Computers & Education, 20(2), 169-181.

Shneiderman, B. (1987). Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison-Wesley Publishing.
Siann, G., Macleod, H., Glissov, P., & Durndell, A. (1990). The Effect of Computer Use on Gender Differences in Attitudes to Computers. Computers & Education, 14(2), 183-191.

Slaughter, L., Harper, B., & Norman, K. (1994). Assessing the Equivalence of the Paper and On-line Formats of the QUIS 5.5. College Park, MD: Laboratory for Automation Psychology, University of Maryland.

Sutton, R. E. (1991). Equity and Computers in the Schools: A Decade of Research. Review of Educational Research, 61, 475-503.

Svendsen, G. B. (1991). The Influence of Interface Style on Problem Solving. International Journal of Man-Machine Studies, 35, 379-397.

Virzi, R. A. (1992). Refining the Test Phase of Usability Evaluation: How Many Subjects Is Enough? Human Factors, 34(4), 457-468.

Wallace, D. P., Norman, K. L., & Plaisant, C. (1988). The American Voice and Robotics "Guardian" System: A Case Study in User Interface Usability Evaluation. College Park, MD: Human/Computer Interaction Laboratory, University of Maryland.

Wharton, C., Bradford, J., Jeffries, R., & Franzke, M. (1992). Applying Cognitive Walkthroughs to More Complex User Interfaces: Experiences, Issues, and Recommendations. In CHI '92. Monterey, CA: Association for Computing Machinery.

Wright, P. C., & Monk, A. F. (1991). A Cost-effective Evaluation Method for Use by Designers. International Journal of Man-Machine Studies, 35, 891-912.

APPENDICES

APPENDIX A

Consent Form

As part of a research project in the Department of Counseling, Educational Psychology and Special Education at Michigan State University, this experiment is being performed during the Summer of 1995. For this experiment, multimedia designers are being sought who will voluntarily serve as subjects. The purpose of this experiment is to determine the differences between target users and multimedia designers when evaluating a Computer Assisted Instruction prototype about telecommunications. As a subject for this study, you will be expected to spend around three hours learning to use, and evaluating the usability of, an early teletraffic prototype. You will be videotaped; your voice and screen choices will be the only focus of the recording. These tapes will be used to record your comments, as well as the interaction between you and the computer. The amount of time needed to perform various tasks will also be recorded. Names of subjects will not be released in any way. There is no coercion or demand that you take part in the study; it is solely your personal choice. Any questions can be directed to the investigator, Pericles Gomes (email 22591mgr@msu.edu or phone 353 5497).

"I understand the above, and I voluntarily choose to serve as a subject for this experiment."

Name                                        Date

APPENDIX B

Preliminary Questionnaire

Name:                    Phone:
General computer use: DOS    Windows    Macintosh    other:
Applications:
Do you know how to program computers?
Do you own a computer?
How old were you when you first used a computer?
Nationality:    Native Language:    Age:    Gender:
Area of Engineering you are mostly interested in:
What was your last TOEFL result (if applicable):
How long have you been in the USA (if applicable):
How would you place yourself in terms of Telecommunication Engineering:
Not Knowledgeable 1 2 3 4 5 6 7 8 9 Very Knowledgeable
How excited are you about using a computer as a learning tool:
Not Excited 1 2 3 4 5 6 7 8 9 Very Excited
Have you used Computer Based Instruction before in your academic life?
Never 1 2 3 4 5 6 7 8 9 or more Times

Preliminary Questionnaire

Name:                    Phone:                    Email:
Systems you are familiar with: Windows    Macintosh    Unix
Age:    Gender:
Your Background Education:
Area of Multimedia you are most proficient in:
Which of the following describes best your activity (circle one or fill in the blank):
Instructional Designer    Software Engineer    Media Designer    Interface Designer    Educational Researcher    Hypermedia Designer    Programmer    Professor    Instructional Researcher    Instructional Technologist
Have you participated in usability evaluations before?
Never 1 2 3 4 5 6 7 8 9 or more Times

APPENDIX C

Description of Experiment for Participants

In this study, your expertise in using this application will help us determine current problems of this prototype interface. This is a very early prototype. It is not a finished product, by any means. That means your suggestions will be taken into consideration by the creators of this application. Rather than having your performance evaluated, you will serve as a means of evaluating the program you use. Hence, we would like to use a technique called "think aloud", which simply means that you should try to say everything you are thinking and tell us any and all impressions you have of the system: they are VERY valuable for this experiment, and for the designers of this software. Please express your opinions, point out whatever seems confusing to you, and try to explore as much as possible. (Do not be afraid to err, because there is no right or wrong here, for you.) Your task is to explore the application as much as possible. The content area of this prototype is about teletraffic concepts one would need to get a job in a telephone company.

APPENDIX D

Some Helpful Definitions:

- Central office: telephone switch
- Trunks: direct lines between two telephone switches
- Route: direct routing (without tandem)
- Tandem: alternate routes
- Full availability: all trunks available
- Sequential hunting: the process of searching for a free trunk
- Erlang: unit of telephone traffic (universal) (1 Erlang = 1 hour of telephone line usage)

APPENDIX E

REEVES & HARMON Questionnaire

Pedagogical Dimensions (NA = Not applicable):

I) Goal Orientation: NA 1 2 3 4    Weak / Strong
II) Experiential Value: NA 1 2 3 4    Weak / Strong
III) Use of Error as a Learning Tool: NA 1 2 3 4    Weak / Strong
IV) Motivation (Intrinsic and Extrinsic): NA 1 2 3 4    Weak / Strong
V) Structure: NA 1 2 3 4    Weak / Strong
VI) Accommodation of Individual Differences: NA 1 2 3 4    Weak / Strong
VII) Learner Control: NA 1 2 3 4    Weak / Strong
VIII) User Activity: NA 1 2 3 4    Weak / Strong
IX) Pedagogy of Objectives: NA 1 2 3 4    Weak / Strong
X) Cognitive Psychology: NA 1 2 3 4    Weak / Strong

User Interface Dimensions:

I) Ease of Use: NA 1 2 3 4 5    Weak / Strong
II) Navigation: NA 1 2 3 4 5    Weak / Strong
III) Cognitive Load: NA 1 2 3 4 5    Weak / Strong
IV) Mapping: NA 1 2 3 4 5    Weak / Strong
V) Screen Design: NA 1 2 3 4 5    Weak / Strong
VI) User Control: NA 1 2 3 4 5    Weak / Strong
VII) Information Presentation: NA 1 2 3 4 5    Weak / Strong
VIII) Media Integration: NA 1 2 3 4 5    Weak / Strong
IX) Aesthetics: NA 1 2 3 4 5    Weak / Strong
X) Overall Functionality: NA 1 2 3 4 5    Weak / Strong

APPENDIX F

Identification number:

PART 1: Type of System to be Rated
1.1 Name of hardware:
1.2 Name of software:
1.3 How long have you worked on this system?
__ less than 1 hour            __ 6 months to less than 1 year
__ 1 hour to less than 1 day        __ 1 year to less than 2 years
__ 1 day to less than 1 week        __ 2 years to less than 3 years
__ 1 week to less than 1 month        __ 3 years or more
__ 1 month to less than 6 months

1.4 On the average, how much time do you spend per week on this system?
__ less than one hour            __ 4 to less than 10 hours
__ one to less than 4 hours        __ over 10 hours

PART 2: Past Experience

2.1 How many different types of computer systems (e.g., main frames and personal computers) have you worked with?
__ none        __ 3-4
__ 1        __ 5-6
__ 2        __ more than 6

2.2 Of the following devices, software, and systems, check those that you have personally used and are familiar with:
__ keyboard        __ text editor            __ color monitor
__ numeric key pad    __ word processor        __ time-share system
__ mouse        __ file manager            __ workstation
__ light pen        __ electronic spreadsheet    __ personal computer
__ touch screen        __ electronic mail        __ floppy drive
__ track ball        __ graphics software        __ hard drive
__ joy stick        __ computer games        __ compact disk drive

PART 3: Overall User Reactions

Please circle the numbers which most appropriately reflect your impressions about using this computer system. Not Applicable = NA.

Overall reactions to the system:
3.1 terrible / wonderful        1 2 3 4 5 6 7 8 9 NA
3.2 frustrating / satisfying        1 2 3 4 5 6 7 8 9 NA
3.3 dull / stimulating            1 2 3 4 5 6 7 8 9 NA
3.4 difficult / easy            1 2 3 4 5 6 7 8 9 NA
3.5 inadequate power / adequate power    1 2 3 4 5 6 7 8 9 NA
3.6 rigid / flexible            1 2 3 4 5 6 7 8 9 NA

Please write down any comments that you have:

APPENDIX G

Meta-Evaluation Questionnaire

1) How useful was the Preliminary Questionnaire?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
2) How useful was the Reeves/Harmon Questionnaire?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
3) How useful was the QUIS Questionnaire?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
4) How useful were the Users' Comments from QUIS?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
5) How useful were the Audio/Screen compilations?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
6) How useful were the Navigation Maps?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
7) How useful were the different ethnic groups' results?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
8) How useful was the overall list of errors (by screen location)?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
9) How useful was the Report of Usage?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful
10) How useful was it to have the quantitative data available for statistical analysis?
Very Useful 1 2 3 4 5 6 7 8 9 Not Useful

Thank you so much for taking the time to participate in this study!

APPENDIX H

Korea 01

Characters on the screen: I hope that program has more helpful loption like initial letter. dont use initail letter, it is hard to memorize at once and more table to use calculate for some formular, and more example. like when I calculate some variable number, computer explains why get solution number and explain what solution number means to us and our society. also on diagam, to better make understandable diagram to use this program, don't use ike A, B, C , and T. use like favorite place and favorite car to get more interesting. Just put undo option to make useful computer. when I calculate some number, then retype that value to the other space that screen is gone so I have to memorize result even it is big number. And the option screen is like boring. when I look at it first time, I jsut feel man, it is going to be boring. hence if I put atractive stuff on screen then it is more neat program. just don't make them boring and tried.
Sometimes I lose position 1 mean what should 1 suppose to do for next step. so put previous back option to get back not go back to beginning screen. Terminology appropriateness : put it more option to more usfull. like more interesting option even it is program for study Keeps you informed : when I calculate some information , it does not say what I am doing and what is correct result and what that result for. System Speed: I hope this program has more step by step Sounds and noises : say information with easy words Korea 02 Characters on the screen : If the color of the characters were black or some difi‘ernt color than blue, it might be better for the users to understand. Highlighting on screen : The highlight was good. However, if the path gas highlighted with the color something darker than yellow, it might be tter. Screen layouts : The layouts were pretty good. But if the small windows of the tables, calculator, and the program stays while the user write the answer, that would be great. Sequence of screens : It was okay. No specific comments. Use of terms : The terms were sequencially comming out , so it was very ' good. Reminds the users in every few monent whether they creally understands the concepts or not. 120 Terminology appropriateness: If the user can pop up the useful screens (i.e. calculators, programs) anytime they want, and also the definition of the terms, it would be better. Messages on screen : it was crear and easy to read. Messages to users : it was clear, but if the screen shows all the equations for the answers was shown in the examples, it might be better. Keeps you informed : not really. I wanted to go back to the specific screen, but I couldn't. Ifi can find with the keyward, it would be great. Learning to operate system: Not that hard, but confusing sometimes. no specific comments. Exploration of features : Ifthere were some kind of the help screen, it might be easier to explore the features. Remembering names and commands : I never entered any commands Tasks performed in a straight-forward manner : no equations for the tasks. there was just numeriacl answers. I want some straight forward equations for the tasks and the answers. Help messages : I couldn‘t find where the hep screen was. I never used. System Speed : N 0 comments. It was fast enough. System Reliability : There was no warning while i was using the program. However, the program was easy and reliable to use. Sounds and noises : Was good. But kind of boring. who cares!!! this is not a computer game. Error correction : N O UNDOs. I tried, but I couldn't. Experienced and novice users : Good enough!! easy to use. India 03 Characters on the screen : it’s a well processed tool but is frustating at times and sometimes a bit confusing too. direct access to pages is not available unless scrolling through the whole lot. doesn't intend to give instantaneous solutions to all errors.has a limited menu. Highlighting on screen : it's very usefiil in making out the priorities. Screen layouts : screen layouts have been excellent. 121 Sequence of screens : sequence has been organised well. but a over-view after certain portions would be very helpful making things look more sequential. Messages to users : sometimes it gets very confusing as to what is being done wrongly. Keeps you informed : it doesn't specify exactly which part has to be reviewed more but just brings lot many pages to be reviewed again. 
Learning to operate system : initially is slightly confusing as both the arrows keep pointingafter going for a couple of minutes we get the hold of the instructions directing us. Remembering names and commands : might be easy after going through the whole for once but definitely not at the first attempt Pakistan 04 Characters on the screen : I would prefer if there was more material in hard copy ( paper ). Also, a way to bypass the sound part so the user can have more flexibility in choosing the parts of the program he/she accede to review. Use of terms : a few grammatical errors / typos were found and reported Terminology appropriateness : as i said in one of the previous comments, more information in hard copy. would help. also, if more hchane1 ces to review just the defim’tions on the screen are provided, it would p. Messages on screen : a few more messages will help. e.g, in the first screen: click here to continue USA 05 Characters on the screen : There is a lot of user interactive parts to the program, but many times I formd myself just clicking the arrow key to move on. I liked the screens where you had to move the cursor over the diagrams to get the explanation or definition to appear on the screen. That made you more active , made you move around and see the program and also directly gave an explanation for what you were specifically looking at. ' Instead of just looking at the whole screen and clicking the arrow key to get the definitions up and then moving on. Highlighting on screen : Highlighting, blinking, and changing the color of certain parts of a diagram or sentence are very useful as long as the color is 122 pleasant to the eye. Otherwise it will still grab your attention but you will want to move on to avoid looking at the screen instead of studying the screen and learning the main point of consentration. The reverse video screen is a nice touch also but should not be over done because it could confuse the viewer into thinking the wrong thing. Screen layouts : The screen layouts were easy to follow, and easy to read. They were about the right size, big enough to be the point of emphasis, yet there was enough room for them to add definitions at the side. I think that maybe other explanations could of been offered. Like more definiton boxes, or maybe specific example boxes could be brought up on screen. Sequence of screens : The sequence of screens was not necessarily what I expected. It was easy to move forward but not to go back. Many times I found myself lost or back at the beginning when I tried to go back one or two blocks. Then instead of being able to continue where I left off I had to go all the way through the program again. That was not very convenient. Use of terms : The terms I thought were very consistant and easy to follow. Iunderstood the message that was being ofl‘ered and it remained b consistant throughout the use of the program. I did not understand some of the notation but there was a definition offered or a box would come up and explain it to me in those situations. That made it easy to continue with the program with a knowledge of the terms without having to stop using the program to research the topic or look up the definition of the word. Terminology appropriateness : The terms I thought for the most part were appropriate. I did not think that some of them necessarily were specific to telecommunications, but they were convienent for the use of the public in general. 
There were not too many computer terms and the ones there were basic enough that if you were able to use the program you would be able to understand what they meant. Messages on screen : The messages that appeared on the screen were good and helpful, but they were not always consistant in where they were going to be, the length they were going to be, the number of them there were. i.e. sometiones I thought maybe an extra diagram or message might help convey the idea a little better but that option was not there. The position of the messages (boxes) is best when it appeared directly over or under the word or directly next to the diagram that it related to. moving along at diagonal angles is not convienent. Messages to users. Directions for correcting errors were not good at all. I couldn‘t ever figure it out. It would tell me to go review or to try again and my old choices would still be present, yet there was no clear option to erase my previous answers. I tried typing over them, and I tried pressing the arrow keys, but they did not work. Usually I just ended up moving on and not correcting my mistakes. 123 Keeps you informed : I didn't feel like I could control the amount of feedback and I didn't feel that the computer was letting me know what was up next. I just followed the path of the program and hit the arrow keys. It was easy to maneuver through the program but I didn't always know what to expect next. ' to operate system : I felt the progam to be easy and quick to learn. The advanced features were well introduced and well explained. The arrow keys made it quick and easy to move fi'om screen to screen and the menu allowed you to pick up in different parts of the program. Exploration of features : It was easy to see all the options the program had to ofi‘er, the menu was simple and if you did happen to ”mess" up you could easiy find your way back to where you left off. It was encouraged by the program to try problems or to practice some of the concepts you were going over. And if you wanted out of the practice you could continue on and go to the menu where it was possible to continue where you left off. Sometimes you would have to go through a lot of screens but it really was not all that time consuming or dificut. Remembering names and commands : It was very easy to remember names and how to enter numbers or check boxes as to choices you might want to make. Often times they would remind you numerous times the name of what you were working with. And entering numbers was no problem. One time though I thought that by clicking on the plus button I could put more numbers on the screen and this was not the case. It wasn‘t hard to figure out I just didn't know that you couldn't do that. Tasks performed in a straight-forward manner : The steps were well outlined and clear as to what the message was. They followed a logical order and were easy to understand. I think that some could be emphasized better or be holder to give a stronger message, but for the most part they were very adequate. Help messages : The amount of help was probably just right but sometimes it left you wondering what to do. then you would continue on and it would make sense. Maybe the messages should inform you that if you keep going on with the program it will make sense. It was easy to get the messages and to get help when you needed it and it was easy to understand the message once it had been brought up on the screen. System Speed : I am very impressed with the speed of the system. 
There is nearly no waiting in between screens or messages being brought up, and when a calculation is done it is immediately displayed in the appropriate spots. The response to the mouse is quick and continuous.
System reliability: The system appears very reliable and stable. It was not disrupted when I tried clicking numerous options on the menu, and it immediately responded to what I did move to. The system did not fail, so I do not know how it reacts in that situation.
Sounds and noises: The system is quiet. In the program some of the tones are loud and surprising. I did not expect to hear chimes or beeps, especially at the loudness that I did. I don't think that there is anything wrong with them; they are just kind of loud.
Error correction: I did not have an easy time correcting my mistakes, and I did not have a very good time getting back to where I had left off once I made a mistake. It wasn't that it was hard or all that time consuming; I just didn't like going through the entire menu process again. Errors I had would usually end up skipped instead of showing an example of how to properly do it.
Experienced and novice users: I think that it is meant to be run by someone with more of a background than a new user. It is not difficult, so a new user could get on the program and, with a fair amount of ease, learn the system and eventually, with some quickness, master it. But I think that if someone was already familiar with the system, then they would be able to accomplish a lot more and be more successful with it.

Bangladesh 06
Characters on the screen: It was interesting!
Highlighting on screen: Very helpful, but if there is a meaning that we could find by clicking on it, that should be mentioned.
Sequence of screens: The sequence of the screens was not correct. I could not go back to the previous item that I had just seen, or forward to the item I wanted to see that should be after the one I am on. It did not happen always, but most of the time, so it seems to be a bit inconsistent.
Messages to users: You could make it a little clearer in some cases, I think, but otherwise they were fine. But sometimes when I was doing the practice exercise and wanted a formula, I could not get it; instead I had to go to the tutorial.
Learning to operate system: It was interesting and with time could be learned quite easily.
Remembering names and commands: It wasn't that tough.
Tasks performed in a straight-forward manner: Yes.
Experienced and novice users: Needs more information.

China 07
Characters on the screen: Based on my evaluation, it's principally a nice system. I figured out, I believe, the main structure of this system even though I did not have the user manual. The interface is pretty friendly, with some minor things that need to be improved. Also, a suggestion is that if the menu structure can be made more friendly, a user who has some experience in using some current popular software would feel much more comfortable.

Venezuela 08
Characters on the screen: Very clear and with good contrasting backgrounds.
Sequence of screens: Main menu and navigation is VERY unclear and confusing.
Sounds and noises: The beep when you reach the end of a scroll screen is very annoying.
Error correction: It is hard to return to a prior screen sometimes.

USA 09
Highlighting on screen: The highlighting on the screen helps to explain individual terms. It is a good help method.
Screen layouts: The way the screens were set up was OK, but the amount of information was not easy to use.
If you wanted to see more of the data, it was hard to scroll through the data. If you did, you would lose your calculations on the calculator, because you would click on something else.
Sequence of screens: The sequence of screens was fine. It was logical, but it was hard to go back and look at examples, because you would lose your calculations.
Messages on screen: The messages that appeared on the screen were helpful in the step-by-step routing sequences. But when doing calculations, you never received any positive feedback. You only knew when your calculations were wrong.
Messages to users: There were no messages to the user for correcting errors. It told you if you were wrong. I assumed if it said nothing, then the calculations were correct. After one try, it seemed that it didn't check your answers anymore.
Keeps you informed: On the problems it had you work out, it only told you when you were wrong. It made it difficult to know if you corrected your errors and fixed the problems.
Learning to operate system: It was overwhelming at first; I did not know exactly what I was supposed to be doing. It would be easier to break the examples up and then combine them into one problem.
Exploration of features: It was hard to go back and look at previous problems; when you did, your calculations were lost.
Tasks performed in a straight-forward manner: When doing calculations on the calculator, you could not scroll through the data. It made it hard to do the complete calculations, and it took at least one or two tries to figure that out.
Error correction: I don't think that you could correct your mistakes; it would do it for you and show you what you should have had, but you could not compare them. If you tried to correct them on your own, you never knew if you got them right or not. The computer only told you when your answers were wrong.

Pakistan 10
Characters on the screen: For the first time I am encountered with this sort of experience; it was a very interesting and fascinating one. It was a good experience. This software is a good introduction for the new entry in telecommunication services, especially in telephony. It could be channelized for better services in the telephone department.

USA 11
Characters on the screen: Was not easy to navigate; using the calculator was frustrating as the answers kept disappearing.
Help messages: There was some difficulty in obtaining help with subject matter.

USA 12
Screen layouts: The windows sometimes covered each other up, which made reading multiple windows difficult. I shouldn't have to rearrange windows.
Messages on screen: Arrangement was poor because it blocked room for other windows such as the calculator or the program.
Characters on the screen: They are fine and I don't think it needs any adjustment.

India 13
Highlighting on screen: I have a comment about the blinking of characters on the screen. The blinking speed is an important factor and should be adjusted based on the importance and length of the statement. It could be a good idea to just change the colors, which will bring the effect of blinking.
Messages on screen: The positioning sometimes caused hindrance to reading the other relevant info on the screen.
Help messages: I did not use help at any time, so I really don't know anything about help.

USA 14
Screen layouts: At one point, I was unsure which diagram the comments were referring to.
Sequence of screens: OK, but sometimes difficult to see the purpose in the structure.
Use of terms: More definitions would have been helpful, or easier access to these at any point in the program.
Messages on screen: Instructions could sometimes be spaced better in relation to the diagram.
Messages to users: Messages were clear and understandable.
Keeps you informed: Feedback was not always predictable.
Learning to operate system: Fairly easy.
Remembering names and commands: I found it difficult to remember new terminology later in the program and would have liked to have the definitions easy to access at any time.
Tasks performed in a straight-forward manner: Examples of how to perform certain calculations would have been helpful.
Error correction: Able to rework calculations easily.
Experienced and novice users: Instructions usually, but not always, easy to understand for the novice.

China 15
Characters on the screen: It is OK.
Remembering names and commands: Good.
Sounds and noises: Excellent.

China 16
Remembering names and commands: Not very easy.

Expert 1
Sequence of screens: Not sure how to get back to an earlier screen or exercise; not sure where I was within the whole program and how to jump around; and sometimes the forward arrow kept me on a screen with more information and sometimes the same arrow went forward.
Messages to users: Sometimes not clear on how to get more information or how to continue a demonstration.
Keeps you informed: In one exercise I didn't get feedback on my choice until I selected to move forward; I would like feedback on my choice immediately. Audio feedback was somewhat intimidating and feels more judgemental, also somewhat unexpected. I would like a choice of audio or printed feedback.
Help messages: Help or explanations of content came where I needed them, but I didn't bring them up by choice; they came when I selected the right arrow to continue, i.e., when I had already cognitively moved on to the next task. I would like to get extra help by choice (i.e., further explanation or diagrams) when I was wrong or confused, but I didn't know they were there until I continued.

Expert 2
Highlighting on screen: Bad highlight color.
Screen layouts: Some pop-up windows were unnecessary.
Terminology appropriateness: Some inconsistency with regard to which terms need to be defined; would like to have easy access to definitions of any term at any time.
Keeps you informed: There was inconsistency in how to interact (e.g., move the mouse over a capital letter vs. click on the capital letter). Didn't always tell you if you were correct.
Exploration of features: Set of "features" was very limited (perhaps due to the nature of the tutorial).
Help messages: Not much help offered.
System reliability: Not enough evidence regarding possible system failures.

Expert 3
Screen layouts: Pop-up media appeared covering the media it referred to. I found I had to make mental notes in order to understand the pop-up graphics.
Terminology appropriateness: Terms were used but not defined.
Keeps you informed: With problems of navigation, you didn't know what the computer was going to show next. Also, I went through a few review questions that shot me back to the beginning of the block, supposedly to make me study the whole block again. It didn't tell me that, though.
Learning to operate system: There were no instructions on how to use the system.

Expert 4
Characters on the screen: There was little or no anti-aliasing used around the characters; at times they looked "jaggy". Also, that font is boring and dull from a design standpoint. Different fonts could have been used to display different TYPES of information.
Screen layouts: Poor use of space at times; too rigid and also very dull from a design standpoint. Boring, and not used to display information in an intelligent way.
Sequence of screens: The arrows meant something different on every screen, very confusing. Sometimes when I made an incorrect answer I was asked or forced to review the same information I had seen already, and in other sections if my answer was incorrect I was allowed to proceed. Why?
Use of terms: Very little implementation of hypertext to define terms; some, but not enough.
Terminology appropriateness: The formulas and some of their terms were explained, but there was no way to go back and review what you had seen.
Messages on screen: Often appeared in different places; when those light blue rectangles appeared at the bottom of the screen, they were graphically ugly.
Messages to users: Inconsistent.
Keeps you informed: I was often lost, thus un-informed.
Learning to operate system: Feedback was inconsistent, resulting in my confusion and inability to predict what would happen next.
Exploration of features: No navigation or "help me" features are used. Everything is trial and error, and often when you make an error your result is inconsistent with what happened before.
Remembering names and commands: Block + and Block - were confusing to me.
Tasks performed in a straight-forward manner: No, because of disorientation; you never knew where you were, so you never knew what you had to complete to move on.
Help messages: Needed a "show me how to do this by example" button, or a "help" feature, or at least a voice-over explaining what to do.
Error correction: When I made a mistake I was removed from figuring out that problem and made to force-learn more operations.

Expert 5
Characters on the screen: A bit boring; could have used some color coding at times.
Highlighting on screen: Good use with diagrams to correspond with text or buttons.

APPENDIX I

Navigational Maps

For each subject, a navigational map diagram recorded which screens were visited (Introduction, Objectives 1/2 and 2/2, Tutorial 1/27 through 27/27, Summary and Practice 1/5 through 5/5, Exercise, review, and Advice screens) and the order of each visit. The diagrams themselves are not reproduced; the tester's notes accompanying each map follow.

User 1 (Korea). At screen #14 the user was clearly lost. At #17 the tester helped with "Tandem" by telling the user to look at the paper. At #17 the user double-clicked on squares. At #62 the user tried to calculate. He tried to do the practice part before doing the tutorial, and kept asking the tester for help. In general, this user had trouble figuring out his way around the program. He pointed out some defects in the interface, like the calculator erasing numbers. He was tired after one hour. He was very afraid of doing "wrong" things. His English was not strong enough to communicate his ideas clearly; the tester had to keep asking to get responses. A final comment: the user was asking what to do most of the time.
User 2 (Korea). Navigational map only; no tester notes.

User 3 (India). The user tried the help menu from the beginning. She started to use the tutorial from Block II backwards, which means a lot of the instruction was not accessible. She visited most of the screens. She was evidently more interested in the interface than in trying to learn the content; most of her observations were interface focused.

User 4 (Pakistan). Navigational map only; no tester notes.

User 5 (USA). This user was very technology oriented. He was good at pointing out good and bad points of the program. He was more interested in the interface than in trying to learn, and didn't try to answer the proposed exercises. While trying to switch the audio off, the user ended up leaving the system. By the end he was proficient at using the block buttons and arrows.

User 6 (Bangladesh). The user was very busy trying to make sense of how to move around the program. She had a strong willingness to learn the content as well as the interface and was willing to try all of it.

User 7 (China). The user kept going from the tutorial to the practice, evidently because he thought the practice was related to that particular screen. This user was trying so hard to make sense of the navigation buttons that after a while he simply withdrew from them. That explains in part why he didn't explore the second half of the program.

User 8 (Venezuela). The user was very good at thinking aloud and got a good understanding of the content of Block I. Good comments about the "menu" screens.

User 9 (USA). The user right from the beginning did NOT get distracted by the menu buttons and was interested mainly in learning the subject. She worked carefully through the first half of the program.

User 10 (Pakistan). The user was able to progress smoothly in the tutorial and found his way to the second half with no problem.

User 11 (USA). The user was focused on calculating exercise 9/27. He was calculating the exercise correctly but got off track when the computer lost his data. He tried "help" to use the calculator more efficiently, and really understood how to calculate after looking at screen 10/27. He couldn't figure out the block buttons, which held him back from exploring the second half.

User 12 (USA). The navigation buttons weren't used in the first block. The user got upset when he lost his calculations and when the navigation confused him; otherwise he explored smoothly. Almost at the end, when he checked the summary, he couldn't get back to the tutorial point where he was.

User 13 (India). He was fast in most of the study. The researcher didn't want to force-feed more time.

User 14 (USA). The user felt a little intimidated by the program at the beginning and apologized for not having a great deal of experience with computers. He explicitly pointed out layout problems. A great user.

User 15 (China). This user got confused by the navigation buttons in the beginning but caught on and progressed through the tutorial. Note that he found Block 2 by repeating Block 1 enough times that the program takes you to Block 2 (he didn't get there through Block - or +). The user got very interested in trying to answer the multiple choice of 22/27, but he wasted time by not noticing "sentences". He put most of his effort in Block 2 into "playing" with the multiple choice, without really trying to learn in order to answer; in other words, trial and error.

User 16 (China). The user was very afraid of erring at the start and didn't get into the tutorial right away. She was very attentive, though, and once she found her way into the program she worked hard to learn the content. She struggled a little in the middle of Block 1 because she was not understanding the content and the researcher couldn't answer her content questions. In the second block the user calmed down and felt good, even saying that it was easy.

Expert 1. The expert used the think-aloud technique and explored the first block with a good balance of being a learner and an interface evaluator. In the second half, as the content got overwhelming, she ended up focusing more on the interface (since she thought the content was above her anyway).

Expert 2. Navigational map only; no tester notes.

Expert 3. The expert repeatedly expressed his confusion at feeling he had no clue where he was.

Expert 4. The expert got a little irritated by the interface. He had difficulties finding the second half of the program.

Expert 5. This expert acted as interface evaluator and instructional designer at the same time. He was able to go into depth on the program. He got confused about the equivalent diagram in the second block.
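The numbers attached to each screen node in the original diagrams are visit-order indices: each map is a tally, over the time-ordered log of screens a subject visited, of when each screen was reached and which transitions were taken. The sketch below illustrates that tally; the log format, file of data, and function name are illustrative assumptions, not the tool actually used to draw the maps in this study.

```python
from collections import defaultdict

def build_navigation_map(visit_log):
    """Tally visit order per screen and transition counts between consecutive screens.

    visit_log: time-ordered list of screen labels for one subject, e.g.
    ["Introduction", "Tutorial 1/27", ...] (assumed format, for illustration only).
    """
    visits = defaultdict(list)       # screen label -> list of visit-order numbers
    transitions = defaultdict(int)   # (from_screen, to_screen) -> how often that move was made
    for step, screen in enumerate(visit_log, start=1):
        visits[screen].append(step)
        if step > 1:
            transitions[(visit_log[step - 2], screen)] += 1
    return visits, transitions

# Illustrative fragment of a session (made-up data):
log = ["Introduction", "Tutorial 1/27", "Tutorial 2/27", "Tutorial 1/27", "Practice 1/5"]
visits, transitions = build_navigation_map(log)
print(dict(visits))        # {"Introduction": [1], "Tutorial 1/27": [2, 4], ...}
print(dict(transitions))   # {("Introduction", "Tutorial 1/27"): 1, ...}
```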
APPENDIX J

List of Problems Found by Users
List of Problems Found by Experts

APPENDIX K

Observation checklist (completed for each subject):

USER ID: ______
Time spent on the application: ______
Number of screens visited: ______
Explored Block I?  Yes / No
Found Block II without help?  Yes / No
Explored Block II?  Yes / No
Exercise 9/27 in Block I:  Resolved / Understood / Tried partially / Not tried
Equivalent graphic 16/27:  Resolved / Understood / Tried partially / Not tried
Equivalent graphic 17/27:  Resolved / Understood / Tried partially / Not tried
Multiple choice at end of Block II (22/27):  Yes / No

APPENDIX L

Variable list for the quantitative data set (each variable has a count of 16 subjects; no constants were used):

C1 DOS, C2 Windows, C3 Macintos, C4 Unix, C5 Geograph, C6 KnowPrgm, C7 Owner, C8 FirstTim, C9 Age, C10 Gender, C11 Bng.Area, C12 TOEFL, C13 IN USA, C14 Telcoan, C15 Excited, C16 CBTbefor, C17 Goal, C18 Experien, C19 ErrLearn, C20 Motivati, C21 Structur, C22 IndDiffe, C23 LrnCtrl, C24 UserActi, C25 Pedagogy, C26 CognPsyc, C27 EasyUse, C28 Navigati, C29 CognLoad, C30 Mapping, C31 ScrnDsgn, C32 UsrCntrl, C33 InfoPres, C34 MediaInt, C35 Aesthets, C36 OverFunc, C37 Minutes, C38 ScreenVi, C39 Scrn/Min, C40 BlockI, C41 Find BII, C42 BlockII, C43 9/27Calc, C44 16/27Gra, C45 17/27Gra, C46 MultChoi, C47 Problems, C48 NumMachi, C49 Wonderfu, C50 Satisfac, C51 Stimulat, C52 Easy~, C53 Power, C54 Flexible, C55 EasyRead, C56 Sharp, C57 Fonts, C58 Hilites, C59 RevVideo, C60 Blinking, C61 Layouts, C62 AmntInfo, C63 ArrjInfo, C64 Sequence, C65 NextScrn, C66 Goinback, C67 TaskBMEs, C68 TrmOverA, C69 TASKStrm, C70 COMPRtrm, C71 YourWork, C72 COMPRter, C73 trmsScrn, C74 Messages, C75 PosInstr, C76 MsgsCfsg, C77 Commesg, C78 CorrtErr, C79 CptrIan, C80 PrdctRes, C81 deckCtr, C82 LrnSystm, C83 LrnStart, C84 LrnAdvan, C85 TimeLrng, C86 T&BEncou, C87 ExplFeat, C88 Dscheat, C89 RembrN&C, C90 Rmerule, C91 TskManer, C92 TskSteps, C93 TskLogic, C94 StepsSeg, C95 HelpScrn, C96 HelpAces, C97 HelpCont, C98 HelpAmou, C99 SysSpeed, C100 RespOper, C101 RateInfo, C102 Reliable, C103 Dependab, C104 SystemPa, C105 WarnsPro, C106 SysNoisy, C107 MechNois, C108 Beep,ton, C109 CorrMist, C110 CorrTypo, C111 UndoOper, C112 NeedExNo, C113 Novices, C114 Experts.
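Appendix L is, in effect, the codebook for the quantitative data set: one row per variable, with the number of subjects recorded and the number of missing responses for each variable. A minimal sketch of how that Count / Missing summary could be reproduced is given below, assuming the 16 subjects' responses were exported to a CSV file with one column per variable; the file name is hypothetical and is not part of the original study materials.

```python
import pandas as pd

# Hypothetical export of the questionnaire and session data:
# one row per subject (16 rows), one column per variable (DOS, Windows, ..., Experts).
df = pd.read_csv("usability_data.csv")

summary = pd.DataFrame({
    "Count": [len(df)] * df.shape[1],    # number of subjects recorded for each variable
    "Missing": df.isna().sum().values,   # unanswered items per variable
}, index=df.columns)

print(summary)
```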