VIDEO MEDIATED LISTENING PASSAGES: THEIR EFFECTS ON INTEGRATED WRITING TASK PERFORMANCE AND NOTE-TAKING PRACTICES By Justin Cubilo A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of MASTER OF ARTS Teaching English to Speakers of Other Languages 2011

ABSTRACT

VIDEO MEDIATED LISTENING PASSAGES: THEIR EFFECTS ON INTEGRATED WRITING TASK PERFORMANCE AND NOTE-TAKING PRACTICES By Justin Cubilo

The surge in international students studying abroad in English-speaking countries in recent years has made it increasingly important to develop adequate assessments of their language abilities. Prior research has led to a debate regarding listening assessment tasks and whether visual support should be provided to test takers in these tasks, and this previous research has predominantly investigated the effects of visual support on multiple-choice listening comprehension tasks. The present study seeks to expand on this research by investigating the effects of video listening passages on integrated writing task performance and note-taking strategies. Forty international students at Michigan State University participated in the current study. Each participant wrote essays for two integrated writing tasks: one task was presented with video listening material and the other with audio-only listening material. Participants also completed an exit survey concerning their perceptions of the video and audio tasks. Results indicated that there was no significant difference in performance on the integrated writing task between the audio and video conditions; however, there was a significant difference in word count in participants' notes. Qualitative data suggested that there were mixed perceptions of the usefulness of video among test takers. However, the majority of participants preferred the video condition and believed that video-mediated listening passages aided in comprehension of information presented in the listening passages.

ACKNOWLEDGEMENTS

I would like to take this opportunity to thank all the people who have helped me in the process of writing this thesis. First, I would like to thank Dr. Paula Winke, my advisor, for the support and guidance she has offered throughout this project and for the time she has spent recording the listening material and video lectures and teaching me some of the intricacies of SPSS. Dr. Charlene Polio, my second reader, has also been helpful throughout this process. I would also like to give thanks to the MA TESOL program for their generous financial support. In addition, I would like to thank Mike Kramizeh, the head of the language laboratory at Michigan State University, for the use of the language laboratory in my data collection and for taking time out of his busy day to teach me how to use video editing software to create my lecture videos. I have also received help and support from my peers in the academic community. Special thanks go to Erin Sutton and Erika Lessien, both of whom have given up much of their time for this project and have provided me with ample feedback. Their help has been invaluable. To my fellow thesis writers Ann Desiderio and Hyesun Lee, who were my mutual encouragers, thank you. Thanks also to my friends Xiaopeng, Xiang, and Crystal, who offered their help in the earliest stages of my project. Finally, thanks to my Mom and Dad, who have always supported me and encouraged me throughout my years of education.
 iii TABLE OF CONTENTS LIST OF TABLES ....................................................................................................................... vi CHAPTER 1 INTRODUCTION ........................................................................................................................ 1 CHAPTER 2 LITERATURE REVIEW ............................................................................................................. 2 Theories and Models of Listening Comprehension ................................................................ 2 The Role and Effects of Visuals in Listening Comprehension ............................................... 4 Difficulties in the Definition of the Listening Construct ........................................................ 7 Gaps in the Literature and Research Questions ...................................................................... 9 CHAPTER 3 METHODS ................................................................................................................................. 12 Participants ............................................................................................................................ 12 Materials ............................................................................................................................... 13 Procedure .............................................................................................................................. 15 Study Approval ............................................................................................................... 15 Setting ............................................................................................................................. 15 Computer Set-up ............................................................................................................. 16 Data Collection .............................................................................................................. 16 Essay Rating.................................................................................................................... 18 Analysis................................................................................................................................. 19 RQ 1 ................................................................................................................................ 20 RQ 2 ................................................................................................................................ 20 RQ 3 & 4 ......................................................................................................................... 20 CHAPTER 4 RESULTS ................................................................................................................................... 21 Scale Reliability .................................................................................................................... 21 RQ 1 ...................................................................................................................................... 22 RQ 2 ...................................................................................................................................... 24 RQ 3 ...................................................................................................................................... 27 RQ 4 ...................................................................................................................................... 
29 CHAPTER 5 DISCUSSION AND CONCLUSION ........................................................................................ 32 Listening Comprehension and Construct Validity................................................................ 32 The Effects of Video and Audio Listening Tasks on Note Taking....................................... 34 Limitations ............................................................................................................................ 37 Directions for Future Research ............................................................................................. 38  iv APPENDIX A YOUTUBE LINKS TO LISTENING MATERIALS ................................................................ 42 APPENDIX B BACKGROUND QUESTIONNAIRE ....................................................................................... 43 APPENDIX C EXIT QUESTIONNAIRE .......................................................................................................... 45 APPENDIX D DIRECTIONS AND NOTETAKING SHEET........................................................................... 46 APPENDIX E ANALYTIC RUBRIC ................................................................................................................ 47 APPENDIX F WRITING PROMPT .................................................................................................................. 50 APPENDIX G INTERVIEW 1 TRANSCRIPT .................................................................................................. 51 APPENDIX H INTERVIEW 2 TRANSCRIPT .................................................................................................. 55 REFERENCES ........................................................................................................................... 58  v LIST OF TABLES TABLE 1 Participants' backgrounds .......................................................................................... 13 TABLE 2 Listening material order ............................................................................................. 17 TABLE 3 Inter-rater reliability ................................................................................................... 21 TABLE 4 Descriptive statistics for type of visual input ............................................................. 22 TABLE 5 Paired-samples t-test comparison of scores between video and audio conditions ................................................................................................................... 23 TABLE 6 Descriptive statistics of note word counts.................................................................. 24 TABLE 7 Paired-samples t-test of note word count compared to input method ........................ 24 TABLE 8 Participants' comments on note-taking and memory ................................................. 25 TABLE 9 Participants' focus and use of video content .............................................................. 28 TABLE 10 Participants' preference and reasons ........................................................................ 30 TABLE 11 Participants' overall impression of video input .......................................................
31   vi

CHAPTER 1: INTRODUCTION

With recent increases in the numbers of international students studying abroad in English-speaking countries, it has become increasingly important to develop adequate assessments of their English abilities that will exhibit their potential to perform in an English-speaking classroom. As a result, tests such as the Test of English as a Foreign Language (TOEFL—http://www.ets.org) and the International English Language Testing System (IELTS-http://www.ielts.org) have had a great amount of importance attached to them as indicators of student English ability. Due to this importance in the United States, the TOEFL has been continuously adapted. Out of this adaptation arose the TOEFL iBT, which aims to provide a more integrated method for testing English ability in a variety of areas related to the academic context. One such ability is the skill of incorporating information from both readings and lectures in students' writing. In fact, TOEFL writers have placed enough importance on this ability so as to necessitate the inclusion of an integrated writing task on the TOEFL exam. In the current study I am interested in the integrated writing task as it appears on the TOEFL iBT and in the role that visual support in the form of video lectures plays not only in an individual's ability to write an essay that adequately incorporates the listening and reading material, but also in the individual's ability to retain information from the lecture and in note-taking strategies. I focus on the role of two types of input used as a basis for testing L2 learners' academic listening skills: (a) audio-only (AO) input with just a photograph present, and (b) audio-visual (AV) input in which the test taker watches an actual video of a lecture. While several studies have examined the effect of visuals on listening comprehension tasks (E. Wagner, 2007, 2010; Suvorov, 2008), few, if any, have actually examined the effects of visuals on writing task performance or note-taking strategies.

CHAPTER 2: LITERATURE REVIEW

The purpose of this chapter is to discuss theories and studies that have previously been developed or conducted regarding L2 listening comprehension and listening skill assessment. This chapter is broken down into four sections. The first section reviews theories and models of listening comprehension. The second section consists of a discussion of the role of visuals in listening comprehension and the effects they have on the assessment of listening comprehension. In the third section I discuss difficulties with the definition of the listening construct. In the fourth section I conclude by reviewing some of the gaps in the literature and presenting the research questions investigated in the current study.

Theories and Models of Listening Comprehension

Listening is an essential component for communication in any language and is a necessary part of acquiring a new, second language (Suvorov, 2009). With listening comprehension being so important, many researchers have attempted to define it. However, since the first definitions of listening comprehension arose, there have been many conflicting ideas about what should be included in the definition. The definition of listening comprehension has gone through different stages of development over time.
Some of the earlier definitions, such as that put forward by Lado (1961), treated the transference of sound and the information it carried as the main component of listening comprehension. As time passed, definitions started to move away from being strictly concerned with linguistic information and started to also be concerned with the nonverbal cues that one is able to see when listening to a speaker. Rubin (1995) defined the skill when he wrote that listening comprehension should be considered as "an active process in which listeners select and interpret information which comes from auditory and visual cues in order to define what is going on and what the speakers are trying to express" (p. 7). Chung (1994) further developed this definition by stating that messages that listeners hear have three types of information associated with them: oral (verbally transmitted information from speaker to listener), paralinguistic (body language, gestures, posture, facial expressions, voice pitch, and rate of speech), and the visual context (items present in the environment of the conversation). Because listening can be defined in such a way, it is possible to further define it as a communication activity (Suvorov, 2008). In the process of listening, the listener takes all the aspects of the situation, both verbal and non-verbal, into account and acquires some sort of meaning. Many researchers have found that this process can be influenced by a variety of factors. Ockey (2007) cited a number of studies in which it was found that such factors as prosody, rate of speech, background knowledge, and rhetorical cues have an impact on an individual's ability to listen. The use of non-verbal cues has also been found to have an effect on listening comprehension. Sueyoshi and Hardison (2005) found that both lip movements and gestures are able to aid in the comprehension of a listening task. Ockey (2007) and Rubin (1995) found similar results indicating that body movements, gestures, and facial expressions have an effect on listening comprehension. Such findings show the importance of taking both the visual and audio components of communication into account when trying to develop an assessment task. Based on previous research on and definitions of listening comprehension, researchers have developed several models of listening comprehension. Gruba (1999), referencing Kintsch (1998), wrote that a connectionist cognitive processing model is the most defensible. In this model, the mind processes multiple incoming stimuli at the same time and revises its understanding of the stimuli continuously as more information about them becomes available. Bejar, Douglas, Jamieson, Nissan, and Turner (2000) took this connectionist approach further and modeled listening comprehension by splitting it into two stages: the listening stage and the response stage. In this model, three types of knowledge need to be accessed during the listening stage in real time: situational knowledge (SK), linguistic knowledge (LK), and background knowledge (BK). When the incoming acoustic signal and visual cues are received, each of these types of knowledge is accessed as the signal is processed. This stage culminates in a set of propositions (PR) being produced from the incoming auditory and visual signals.
Once the propositions are created, the individual switches into the response stage, in which they use the propositions as a means for formulating a response, which can manifest itself as a selection among a set of choices, a spoken response, or, as is the case in the present study, a written response. Bejar et al.'s model and the model put forth by Gruba exhibit the complexity behind an individual's listening comprehension and serve to show that each person's ability to comprehend what they hear differs based on the knowledge they have at their disposal. The present study looks to these models as a method of explaining the way in which learners use the information provided in order to formulate their written responses.

The Role and Effects of Visuals in Listening Comprehension

With the ever-increasing role of technology in the area of assessment, it has become more and more important to investigate the effects of such technology on test performance. Many studies looking at the effects of different visuals on individuals' performances on listening comprehension tasks have been conducted. Visuals can range from a single still picture, which is currently used on the listening and speaking portions of the TOEFL iBT, to multiple still pictures as used by Ockey (2007), to video images used by several researchers in their studies (E. Wagner, 2007, 2010; Ockey, 2007; Suvorov, 2008). The use of such differing visuals has produced mixed results as well. Several researchers investigating the effects of differing visual input have found that the inclusion of visual input in the form of video may not be as beneficial for comprehension as some researchers have tended to suggest. For example, Gruba (1993) found no statistically significant difference between scores on video and audio tests. Brett (1997) also found conflicting results in his study, which showed that, while a video group scored higher on certain task types, an audio group scored higher on other tasks. Coniam (2001) demonstrated that there was no significant difference between an audio and a video group and that the audio group actually scored slightly higher on a test of listening comprehension. The results of Coniam's study also illustrated that participants in the video test-taking group felt that they had gained nothing from taking the test in this format and that they would have done better had they not been distracted by the video. Similarly, Suvorov (2008, 2009) found that, while scores between an audio-only and photo-mediated listening task were not significantly different, performance on the video-mediated task was significantly lower. The results of these studies seem to add credence to previous claims that nonverbal cues are not necessary for, and may be detrimental to, testing listening. While there have been a number of results supporting the exclusion of video from listening tests, results supporting the inclusion of video have been just as numerous. An earlier study conducted by Baltova (1994) with French foreign language learners found that not only did videos help learners in their development of listening comprehension, but the videos also contributed to the learners' confidence in understanding when their comprehension of the message was low. It was likewise found that visuals in the form of pictures were also somewhat helpful in aiding comprehension, although Chung (1994) and Ockey (2007) found that multiple pictures caused distraction among test takers.
E. Wagner (2010) continued in this line of research by investigating the effects of video-enhanced tasks on listening comprehension scores. In his study, he found that scores on the video-enhanced tasks were significantly higher, which he attributed to test takers' use of non-verbal cues that were exhibited by the speaker in the video. Sueyoshi and Hardison (2005) further examined the effects of visuals on listening comprehension by looking at the way in which lip movement and gestures affected the comprehension process. They found that the inclusion of the visual channel led to increased test scores. They found that test takers at higher proficiency levels did not attend to gestures as much as they attended to lip and facial movement. Sueyoshi and Hardison concluded not only that visuals are able to aid in listening comprehension, but also that the use of visual cues differs based on each learner's level of language proficiency. Wagner (2006, 2008) reported similar findings in his study investigating the effects of audio-only texts compared to video texts. He found that individuals used videos in different ways, once again suggesting that the ability to use non-verbal cues differs based on several factors such as proficiency. Finally, it has been found that not only do visuals have an effect on the listening comprehension of non-native speakers of a language, but they also have an effect on the listening comprehension of native speakers. Morrel-Samuels and Krauss (1992), in a study looking at the interplay of gesture and speech in interaction, found that gestures can actually serve to facilitate speech production and can even be an aid to listeners who are listening to a speaker speaking their native language. Thus it would appear that native speakers of a language rely on the presence of gestures to aid comprehension. Likewise, Hadar, Wenkert-Olenik, Krauss, and Soroket (1998) found that gestures are able to help native speakers negotiate the meaning that the speaker is attempting to convey in instances where there may be misunderstanding and that they actually aid native speakers of a language by helping them recall lexical items more quickly. Despite the fact that the L2 literature shows that the usefulness of non-verbal cues can be rather contradictory in assessing L2 listening comprehension, it appears that there is at least some effect of visuals on listening comprehension for native and non-native speakers alike.

Difficulties in the Definition of the Listening Construct

With the research investigating the effects of visuals on listening comprehension has come a discussion of what exactly the listening construct is and what exactly should be tested in listening comprehension. There has recently been much discussion of what one's listening ability is actually comprised of and whether tests of listening comprehension should include video in their listening tasks. While it has been suggested that video should be used in tasks of listening comprehension which are based on audio that originated with video (Buck, 2001), test developers have frequently rejected the use of video in their listening tests (as reported in E. Wagner, 2008). Coniam (2001) mentioned that the test developers for the Hong Kong English Language Benchmark Test purposely rejected the use of video in their tests, even when some audio was taken from video-based materials.
While part of the reason for splicing video away from the audio may be the result of a lack of adequate technology in certain areas where listening comprehension tests are administered, it raises a serious question of the construct validity of these tests (i.e., are these tests measuring the full range of listening ability?). Researchers have expressed opinions on both sides of this issue. Buck (2001) expressed concern that research has shown that people differ in their ability to use visual cues. He suggested that the use of visuals may create an unfair advantage for those particularly adept at using nonverbal cues and that it is therefore better to focus on comprehension of strictly auditory information. In addition, Gruba (1993) was concerned with whether the use of visual information would affect the overall construct validity of listening tasks and seemed to agree with Buck that the verbal aspects of communication are more important than the non-verbal aspects of communication. On the other hand, more recently a number of researchers have taken the opposite viewpoint and have stated that video that naturally accompanies audio helps with the construct validity of a listening test. Construct validity, as defined by Bachman and Palmer (1996), is "the extent to which one can interpret a given test score as an indicator of the ability(ies), or construct(s), we want to measure" (p. 21). In this view, without the presence of video, the construct validity of a listening test would be endangered. For instance, von Raffler-Engel (1980, p. 235) suggested that, by taking away the natural visual cues of communication, "an unnatural condition which strains the auditory receptors to capacity" would be present. In other words, by removing the visual channel, the test taker is being put under unnecessary strain and the test does not reflect the natural environment in which test takers would generally use their listening ability; therefore, the construct validity of this sort of test is lacking. This would seem to be especially true given the findings of Morrel-Samuels and Krauss (1992) and Hadar et al. (1998), reviewed above. They showed that native speakers use the visual channel input in order to facilitate L1 communication. If indeed this is the case, then by removing the visual channel, language testers are undermining the validity of the task, especially given that the test taker is no longer required to utilize all natural aspects of communication. The issue of construct validity is of great importance in language testing, and many researchers have devoted a great deal of time to discussing the concept itself. Messick (1989, 1996) discussed this idea when he suggested that in order for a test task to be considered construct relevant, the task must reflect the context of listening. Bachman and Palmer (1996) continued with this suggestion by proposing the idea of the target language use domain. Target language use is simply defined as any set of language tasks that the test taker may encounter outside of the test. If the task is to listen to a lecture where an individual would normally have access to visual cues through the lecturer's gestures, then the listening task should include the visual channel. Likewise, M. Wagner (2006) argued that if the visual channel is not included in such contexts, task validity is threatened due to construct underrepresentation.
In addition to construct validity, Bachman (1990) and Bachman and Palmer (1996) also provided an extensive discussion of authenticity in language assessment tasks. Authenticity, the extent to which the test task corresponds to a target language use (TLU) task, is described as having an important role that works together with that of construct validity in determining how the construct definition and the domain of generalization will affect the way in which a test score will be interpreted. This model and the discussion of it in the literature make it clear that test developers need to consider what exactly should be tested in a listening task. Using video will lead to the assessment of a test taker's ability to use visual cues for understanding auditory information much as they would do in a real-life situation outside of a test task. However, as some (Gruba, 1997; Ockey, 2007) have suggested, in order to include video in test tasks, the definition of the listening construct must first be expanded to incorporate visual cues. Given the mixed results of previous studies, whether to do so remains a matter of considerable debate.

Gaps in the Literature and Research Questions

Researchers have conducted many studies in the area of listening comprehension. These studies have investigated the effects of visuals on student performance on listening comprehension tests. However, these studies have all looked at the impact visuals have on the test taker's ability to answer multiple-choice comprehension questions. They have not gone further to examine the effects video may have in other skill areas. For this reason, the current study examines the effects that video has on essays written as part of an integrated writing task and on the test takers' note-taking strategies. Therefore, the following research questions guided the present study:

1. Do test takers score higher on their written responses when they are given an audio-visual lecture rather than an audio-only lecture? Although there are some results to the contrary (cf. Sueyoshi and Hardison, 2005), I hypothesize that test takers will score higher on their written responses when writing an essay following an audio-visual lecture than following the audio-only lecture accompanied by a still picture. While writing is different from the multiple-choice assessments given in previous research, I believe that results of writing tasks will align with previous research conducted by E. Wagner (2010) and Baltova (1994) by showing that participants are able to more easily comprehend audio-visual listening passages and, as a result, will be able to better use information from the video listening passage in their essays.

2. Do note-taking strategies change based on the way the listening material is presented? Based on a study done by English (1982), in which she stated that participants were unable to attend adequately to a video stimulus due to a note-taking task which was assigned, it is hypothesized that test takers will take fewer notes when presented with a video listening task due to the greater attention they will need to place on the gestures used by the lecturer. It is also believed that the gestures will provide a better means for promoting recall and, therefore, the test takers will not need to focus as much attention on taking notes.

3. Do test takers notice non-verbal information when watching and listening to a video lecture? How do they use the non-verbal information?
Based on previous research (Sueyoshi & Hardison, 2005; Morrel-Samuels & Krauss, 1992; E. Wagner, 2008), it is hypothesized that test takers will take notice of the non-verbal information. In these studies, test takers explicitly stated how they used different non-verbal cues to aid certain aspects of their listening comprehension.

4. Do test takers find the video cues helpful or distracting and which method of delivering the listening material do they prefer? Based on research performed by E. Wagner (2007) and Ockey (2007), it is hypothesized that for an integrated writing task, test takers will make extensive use of non-verbal information as a means to enhance their ability to determine the meaning of previously unknown words and for faster recall. Therefore, it is hypothesized that test takers will view the video lecture as being helpful. As Ockey found, there may be a range of opinions concerning how helpful the video actually is. It is also hypothesized that the video-based prompt will be preferred to the still-picture-based prompt, given the results of surveys taken by test takers in Sueyoshi and Hardison's (2005) study.

CHAPTER 3: METHODS

In this chapter I discuss the participants, the materials used, and the data collection procedure. I then conclude with a discussion of the way in which the data were analyzed.

Participants

The participants for the study were non-native speakers of English enrolled in one of several programs at Michigan State University. Participants at the time of data collection were either enrolled in the two highest levels of the Intensive English Program (IEP) or attending MSU part-time while taking English for Academic Purposes (EAP) classes; other participants were full-time undergraduate students or full-time graduate students. IEP classes consisted of both provisionally admitted students who had not earned sufficient TOEFL scores in order to be admitted to the university and individuals enrolling for language development. Those with insufficient TOEFL scores are enrolled in IEP classes in order to offset any deficiencies they may have before attending classes in the university. EAP courses are courses that students are required to take after finishing their IEP requirements. They are a series of four courses that give students additional training in using English in an academic context while at the same time allowing them to attend university courses part-time. To recruit participants, I went to the nine IEP level four classrooms and an IEP level three content lecture (which every IEP level three student is expected to attend twice a week), and I contacted EAP, undergraduate, and graduate students at MSU via email. Flyers were also posted on campus to inform other international students of the research opportunity. The study was presented as an opportunity to write TOEFL test preparation essays. Students wishing to participate gave me their contact information or emailed me at a later time. I subsequently assigned them to a specific testing date. A total of 40 students participated in the study. Descriptions of the participants are in Table 1. As indicated, most of the participants were Chinese and most participants were between the ages of 18 and 21. Table 1.
Participants' Backgrounds

                                          Male    Female    Total
Number of Participants                    17      23        40
Average Age                               22      20        21
Average Years of English Instruction      11      10        10
Level of Study
  IEP                                     9       19        28
  EAP                                     1       1         2
  Undergraduate                           6       0         6
  Graduate                                1       3         4
Native Language
  Chinese                                 14      16        30
  Korean                                  1       4         5
  Japanese                                0       2         2
  Arabic                                  1       1         2
  Vietnamese                              1       0         1

Materials

I used two practice integrated TOEFL iBT writing tasks specifically designed by the Educational Testing Service (ETS) as prompts for this study. I took one of these tasks from the ETS website (http://www.ets.org/toefl/ibt/prepare/sample_questions) and the other from the TOEFL iBT test preparation guide (ETS, 2009). I used these materials because ETS has extensively investigated the reliability and validity of the materials through pilot testing and found that the prompts fell within ETS's own fairness guidelines (ETS, 2010). In this study I had the participants follow the instructions and format of the TOEFL iBT integrated writing task. In addition, for one of the tasks, a video of the lecturer was provided, which differs from the still picture that is the only source of visual information present on the actual TOEFL iBT writing exam. All listening materials I used were between two and three minutes in length. The two sets were completely unrelated in the topics they covered. One set discussed the concept of altruism while the other discussed the advantages and disadvantages of computerized voting. A single professor was given the scripts for both listening passages. I recorded the audio and video using a digital video recorder while the professor delivered the lectures for both listening passages. During the recordings, a group of students sat in the classroom because previous research (Alibali et al., 2001) has found that speakers are more likely to produce natural gestures when there is an audience present. Once the videos were recorded, two versions of each listening prompt were created: one with video and audio together and one with the audio and a snapshot from the video as a still picture. I used Windows Live Movie Maker to edit the videos for clarity and to extract the sound for the audio/still-picture listening prompts. I uploaded the materials to a private YouTube account for later use (Appendix A). I typed and printed the reading materials for use during the task. Materials also included a background questionnaire (Appendix B) that I used to ask participants about their age, length of English study, number of times they had taken the TOEFL, and whether they planned to take the TOEFL again. I also developed an exit questionnaire (Appendix C), which consisted of four questions concerning opinions and thoughts about the two forms of listening passages and a question about their note-taking behavior. Participants used a personal note-taking sheet for each integrated writing task (Appendix D), which also contained the directions for the task. The directions on the sheet were the same as those offered on the TOEFL iBT writing task. Finally, I had two raters use an analytic rubric (Appendix E) originally developed by Polio and Hughes (2002) to evaluate ESL writing development. I chose an analytic rubric for a number of reasons. While it may be more time-consuming than a holistic rubric to use, the analytic rubric is able to reflect the different aspects of a test taker's writing ability (Weir, 2005) and, therefore, it gives a much fuller picture of how the different variables affect writing ability.
In addition, since some students wanted to receive scores back in order to know what they should improve on, it was important to provide them with a measure that would actually display specific areas where they needed to improve (Hamp-Lyons, 1991). Furthermore, I chose this style of rubric since I believed it would be easier to train the raters using an analytic scale, as suggested by McNamara (1996). In addition, I made this decision based on theories on rubric use outlined by Weigle (2002) and Bachman and Palmer (1990): they noted that the use of an analytic scale is more reliable than a holistic scale and lends itself to a greater range of difference in scores.

Procedure

Study Approval

Prior to starting data collection, the university IRB approved the study. Policy dictated that every participant must sign a consent form discussing the purpose of the study and the risks, benefits, means of ensuring privacy, and the procedures associated with it. Therefore, at the start of each testing session, I passed out and went through the consent form with participants.

Setting

Testing took place in a university computer lab at Michigan State University. The lab was equipped with 36 iMac computers running Mac OS X Snow Leopard, and all were connected to the Internet. Computers were arranged in six rows, each with six computers in it. Seats were arranged so that participants would be facing the front of the room when seated at their computers. Each computer had a large monitor and headphones so that individuals could hear the listening passage and see the video on their computers clearly. There was also a teacher's station connected to a projector which was used to show participants how much time was remaining in the test.

Computer Set-up

One computer was assigned to each student who had signed up for that day's test. Each computer had two open Microsoft Word documents on it that participants used for the two writing tasks. While spell check and grammar check were available, participants were instructed not to use them. The documents were blank except for the essay prompt typed at the top of the page (Appendix F). In addition to the Word documents, the Firefox Internet browser was also open with two separate YouTube pages loaded. One page contained the video listening passage while the other page contained the audio listening passage with a still picture, and these were ordered according to the condition the participant was in. The first listening passage that the participants listened to was already on the screen in full-screen mode when the students were seated at their assigned computers.

Data Collection

Participants came to the computer lab on the day they had signed up for and were placed randomly into one of four experimental groups based on the order in which they signed up to participate in the exams. The experimental groups were arranged based on four conditions:

1. Altruism Listening Material with Video.
2. Altruism Listening Material with Audio/Still Picture.
3. Computerized Voting Listening Material with Video.
4. Computerized Voting Listening Material with Audio/Still Picture.

In order to ensure that the listening content or order of presentation did not have an effect on the participants' performance, a repeated-measures design was used. Table 2 shows the order in which the listening material was played for all groups.
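For the interested reader, the counterbalancing shown in Table 2 can also be expressed compactly in code. The short Python sketch below is only an illustration of the design, with hypothetical participant IDs; the cyclical assignment function is an assumption about how assignment by sign-up order might be implemented, not a record of the actual procedure used in the study.

# Sketch of the four counterbalanced conditions (see Table 2); hypothetical IDs only.
# Condition 1 = Altruism/Video            Condition 2 = Altruism/Audio+Still Picture
# Condition 3 = Voting/Video              Condition 4 = Voting/Audio+Still Picture
GROUP_ORDERS = {
    1: (1, 4),   # Group 1: Altruism video first, then Voting audio/still picture
    2: (2, 3),   # Group 2: Altruism audio/still picture first, then Voting video
    3: (3, 2),   # Group 3: Voting video first, then Altruism audio/still picture
    4: (4, 1),   # Group 4: Voting audio/still picture first, then Altruism video
}

def assign_groups(participant_ids):
    # Cycle sign-up order through the four groups (an assumed implementation).
    return {pid: (i % 4) + 1 for i, pid in enumerate(participant_ids)}

groups = assign_groups(range(1, 41))       # 40 hypothetical participants
print(groups[1], GROUP_ORDERS[groups[1]])  # -> 1 (1, 4)

Each group thus receives both topics and both delivery formats, but never the same topic twice, which is what allows topic and presentation order to be separated from the video/audio comparison.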
Table 2. Listening Material Order

Group    Listening One    Listening Two
1        Condition 1      Condition 4
2        Condition 2      Condition 3
3        Condition 3      Condition 2
4        Condition 4      Condition 1

Upon arrival, participants were seated at their assigned computers. Once all participants were present, I explained the consent form and instructed students to sign it if they still wanted to participate. Once I collected the consent forms, I instructed the participants to complete the background questionnaire. They did so before the start of the actual writing task. After collecting all background questionnaires, the researcher passed out the first directions and note sheet to the participants. The researcher then read the directions aloud to the participants and stressed that they should use this sheet to take notes on the listening material. Participants then had the opportunity to ask any questions they had, the researcher answered them, and then the actual task began. The participants then began their first integrated writing task. The researcher handed out the reading material face down to each participant. When each participant had a copy of the reading, they were instructed to flip over the sheet of paper and to take three minutes to do the reading. At the end of the three minutes, the test takers flipped the reading over again. At this time, they put on the headphones that belonged to their individual computer station and, at the same time, everyone started the first listening passage that was on the screen and took notes on the listening material. At the end of the listening passage, participants closed the Internet browser and opened the Microsoft Word document prepared for them on the computer. They read the essay prompt and had 20 minutes to write an essay comparing the reading and listening material. While writing, they were able to look at the reading material again. After time was up, participants saved the document to the desktop of the computer so that the researcher could later collect all essays. The second writing task was conducted in the same manner as the first. Following the completion of the second writing task, participants received the exit questionnaire and the researcher explained the questions on it. Participants then filled out the questionnaire and the researcher walked around answering any questions that may have arisen due to misunderstanding of what the questions were asking. When the participants finished filling out the exit questionnaires, the researcher asked for volunteers who would be willing to participate in a brief recorded interview. The interview was completely voluntary and participants knew that it was not necessary for them to participate in the interview. Of the 40 participants, four agreed to the interview, and they answered questions concerning their opinions of the task and what their preferences were. Due to the participants' time constraints, the researcher conducted the interviews with two participants present at the same time.

Essay Rating

Two graduate students studying Teaching English to Speakers of Other Languages (TESOL) volunteered to help me rate the essays. One rater had no experience in rating essays and one had some experience with rating practice ACT essays. The raters used the analytic scale previously discussed in order to assign scores to the different essays. I trained the raters according to suggestions put forth by Weigle (2002).
Raters received three sets of benchmark essays from the integrated writing tasks of the TOEFL iBT made available by the Educational Testing Service, and each rater examined one set of benchmark essays at a time. The researcher gave the raters the first set with the essays in order from lowest score to highest. Each essay had an appropriate score according to the scoring scheme of the analytic rubric written on it. The researcher went through each essay with the raters and described the specific points of the essay that corresponded with the rubric's criteria. Once raters felt comfortable with the rubric, they received a second set of essays representing each score range of the analytic rubric. Raters individually assigned scores to each essay and compared with each other. Finally, raters examined a third set of essays which contained essays at each score level, with some score levels having multiple essays and with some more problematic essays. Raters once again scored essays individually and then compared with each other. At this point, the researcher decided that the scores given to the practice essays were sufficiently close to each other, and the raters began individually rating the 80 essays collected from the participants.

Analysis

In this study I used mixed methodology in order to answer the four research questions I presented earlier. The combination of quantitative and qualitative methodology has become increasingly common and has been viewed as "complementary rather than fundamentally incompatible" (Duff, 2002, p. 14). Quantitative data consisted of the raw scores that were collected from scoring the essays with the analytic rubric and the word counts from the note pages that were collected from the participants. Qualitative data consisted of transcriptions from the four interviews that were conducted as well as answers to the questions on the exit questionnaire. IBM SPSS 19 software was used to perform statistical analyses of the quantitative data. Qualitative data were investigated by looking for themes that arose from interview transcripts (Appendices G and H) and questionnaires.

RQ 1: I addressed the first research question by using a paired samples t-test in order to compare the average of the raw scores assigned to the essays across the two types of tasks—the video task as compared to the audio/still-picture task.

RQ 2: I addressed the second research question through a combination of quantitative and qualitative analysis. I compared the word counts calculated for each student's note pages between the video and audio tasks. In addition, I examined answers given in the exit questionnaire about note-taking in relation to the results of the quantitative analysis.

RQ 3 and 4: I addressed the final two research questions through the analysis of the qualitative data obtained from the exit questionnaire and the interview transcriptions. By looking at responses, I found common themes among the participants' replies and coded them.

CHAPTER 4: RESULTS

The purpose of chapter four is to present the results of the data analysis as they relate to each of the four research questions.

Scale Reliability

In order to ensure that the scores assigned to the essays by the raters were reliable, I calculated inter-rater reliability and percent agreement values by examining the relationship between scores given by the two raters for each topic overall as well as for each of the subscales found on the rubric.
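Although the study's statistics were computed in SPSS, the reliability indices reported below can be illustrated with a minimal Python sketch. The rater-score vectors here are hypothetical, and the within-one-point agreement criterion is an assumption made for illustration only; the thesis does not state the exact agreement criterion used.

import numpy as np
from scipy.stats import pearsonr

def interrater_reliability(rater1, rater2, tolerance=1):
    # Pearson product-moment correlation between the two raters' scores,
    # plus percent agreement (here: scores within `tolerance` points of each other).
    r1, r2 = np.asarray(rater1, dtype=float), np.asarray(rater2, dtype=float)
    r, p = pearsonr(r1, r2)
    agreement = np.mean(np.abs(r1 - r2) <= tolerance) * 100
    return r, p, agreement

# Hypothetical content-subscale scores for eight essays, one list per rater.
rater_a = [8, 10, 7, 9, 12, 6, 11, 9]
rater_b = [9, 10, 8, 9, 11, 6, 12, 8]
print(interrater_reliability(rater_a, rater_b))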
The results of this analysis can be seen in Table 3.

Table 3. Inter-rater Reliability

                    Topic
Sub-Category        Altruism            Voting
Content             0.87** (92.25)      0.64** (87.63)
Organization        0.81** (92.38)      0.77** (90.75)
Vocabulary          0.63** (90.13)      0.48*  (90.63)
Language Use        0.57** (90.00)      0.23   (85.88)
Mechanics           0.55** (93.44)      0.17   (91.38)
Overall             0.86** (94.24)      0.62** (91.42)

Note: Values are Pearson product-moment correlation coefficients (r) between the two raters on the given rating category for the given topic; ** = p < .001, * = p < .05; the percentage of agreement between the two raters is given in parentheses.

As can be seen in Table 3, the inter-rater reliabilities are highly significant across most of the subscales and in terms of the overall scores. The only exception to this is in the areas of language use and mechanics for the topic of computerized voting. This, however, may be expected given the vague nature of these categories (i.e., not everyone agrees on what to count as a syntactic error or punctuation error). The overall inter-rater reliability values are also high and closely match those reported by ETS (2011) for the writing portion of the TOEFL iBT, which has obtained average inter-rater reliability results of .78 in its own calculations. While some of the correlation coefficient values are rather low, the percent agreements across all categories are very high, with the lowest percentage at 85.88% for language use in the computerized voting essays. This signals that the raters were generally in close agreement on scores and that the scores used for the rest of the statistical analyses in this study were reliable.

RQ 1

The first research question investigated whether there was a significant difference in essay scores received after being presented with video listening material as compared with audio/still-picture listening material. Table 4 contains the descriptive statistics for each of the input types for the 40 participants.

Table 4. Descriptive Statistics for Type of Visual Input

                    Audio/Still Picture        Video
Category            Mean       SD              Mean       SD
Overall             40.44      10.08           43.28      12.08
Content             8.94       3.35            8.78       3.93
Organization        7.83       2.88            8.64       3.27
Vocabulary          9.31       1.96            10.04      2.47
Language Use        8.59       2.20            9.83       2.70
Mechanics           5.87       1.24            5.94       1.44

Note: n = 40

In Table 4 it can be seen that, for overall score, the video condition had a slightly higher mean than the audio/still-picture condition. However, when looking at the subcategories found on the rubric, the results are not so straightforward. While the means for the subcategories of organization, vocabulary, language use, and mechanics all seem to increase in the video condition, the content subcategory actually received a lower mean in the video condition as compared to the audio condition. In order to investigate the significance of these mean differences, paired samples t-tests were performed. Table 5 displays the results of these tests for both overall scores and the scores for each of the subcategories found on the rubric.
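For readers who want to reproduce this kind of comparison, the sketch below shows how a paired-samples t-test and an effect size r derived from t and df (one common way of converting a t value into an effect size for interpretation against Cohen's benchmarks) could be computed in Python. The analyses in this thesis were run in SPSS; the score vectors below are hypothetical and serve only to illustrate the calculations behind Tables 5 and 7, and the standard error shown is the standard error of the mean difference, which is an assumption about the SE column reported in those tables.

import numpy as np
from scipy.stats import ttest_rel

def paired_comparison(condition_a, condition_b):
    # Paired-samples t-test plus an effect size r computed from t and df.
    a = np.asarray(condition_a, dtype=float)
    b = np.asarray(condition_b, dtype=float)
    t, p = ttest_rel(a, b)
    df = len(a) - 1
    r = np.sqrt(t**2 / (t**2 + df))               # effect size from t
    se = np.std(a - b, ddof=1) / np.sqrt(len(a))  # SE of the mean difference
    return t, df, p, se, r

# Hypothetical overall essay scores for the same eight test takers in each condition.
video_scores = [43, 38, 51, 40, 47, 36, 49, 44]
audio_scores = [40, 39, 45, 38, 44, 37, 46, 41]
print(paired_comparison(video_scores, audio_scores))

The same calculation applies to the note word counts compared in Table 7; only the paired vectors change.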
Table 5. Paired-Samples t-test Comparison of Scores Between Video and Audio Conditions

Category          t-value    df    p        SE      Effect Size (r)
Content           -0.21      39    0.84     0.78    0.03
Organization      1.71       39    0.10     0.47    0.26
Vocabulary        1.71       39    0.10     0.42    0.26
Language Use      2.78       39    0.01*    0.45    0.41
Mechanics         0.28       39    0.78     0.25    0.04
Overall           1.62       39    0.11     1.74    0.25

Note: * = significance at the .05 level

As Table 5 shows, on average, the participants did not receive significantly different scores on their essays between the video (M = 43.28, SE = 1.93) and audio/still-picture (M = 40.44, SE = 1.61) conditions, t(39) = 1.62, p > .05, r = .25. The subcategories of content, organization, vocabulary, and mechanics likewise produced non-significant results, showing that the presence of video or still picture did not have a great effect on these areas. However, on average, it was found that participants did receive a significantly higher score in the area of language use in the video condition (M = 9.83, SE = 0.43) as opposed to language use in the audio/still-picture condition (M = 8.59, SE = 0.35), t(39) = 2.78, p < .05, r = .41. According to Cohen (1988, 1992), an r value of .41 indicates that there was a moderate effect size of the video versus audio condition on this category of the rubric, meaning that the video listening materials were providing some significant aid in this category.

RQ 2

The second research question investigated whether there was a difference in note-taking practices between the video and audio/still-picture conditions. This was evaluated by doing a word count on participants' notes. Table 6 displays the descriptive statistics for the word counts between the two conditions.

Table 6. Descriptive Statistics of Note Word Counts

Input Method    N     Mean     SD
Audio           40    34.00    19.90
Video           40    28.93    17.55

Results in Table 6 show that the audio condition had the higher mean (M = 34.00) and the video condition the lower mean (M = 28.93). In order to investigate whether these means were significantly different, I performed a paired samples t-test. The results of this analysis are in Table 7.

Table 7. Paired-Samples t-test of Note Word Count Compared to Input Method

t-value    df    p       SE      Effect Size (r)
-2.39      39    0.02    2.13    0.36

The results of this analysis demonstrate that the number of words in the participants' notes significantly differed between the video condition (M = 28.93, SE = 2.78) and the audio condition (M = 34.00, SE = 3.15), t(39) = -2.39, p < .05, r = .36. Thus, it appears that the method of delivering the listening material had a moderate effect on note-taking practices—those who received the video-based input wrote significantly fewer notes. In addition to performing a paired samples t-test on word counts, qualitative data were also collected from the participants via the exit questionnaires. I analyzed their responses for themes related to note-taking that may have explained why fewer notes were taken on average during the video-based listening tasks. Table 8 displays these themes and the number of tokens for the occurrence of each theme.

Table 8. Participants' Comments on Note-Taking and Memory

Major Category    Subtheme                                           Tokens
Note-Taking       a. Multitasking difficult                          11
                  b. Body language was distracting                   3
                  c. Did not watch video                             2
                  d. Did not take notes                              5
Memory            a. Easier comprehension led to easier recall       4
                  b. Gestures facilitate longer storage in memory    15
                  c. Easier to memorize                              2
                  d. Too much information to recall                  4

Note: Numbers may not match the total number of participants because some comments were not related to any of the major themes.

Table 8 has two major categories related to comments on note-taking and comments on memory and the subthemes associated with these categories. The most common theme that appeared in regard to note-taking was the difficulty of listening, watching, and taking notes at the same time. Participants generally wrote that they found it extremely difficult to take notes during the video. They gave several reasons for this on their exit questionnaires. Many of these reasons were that it was simply too difficult to pay attention to everything at the same time or that the movement was distracting. The following are some representative examples of the common themes that arose in relation to the note-taking category:

Example 1: Yes, it is hard for me. Because when I was taking notes, I couldn't pay attention to the video. I can only focus on one thing carefully. (Participant 25, in IEP program, from China)

Example 2: Well Audio with picture was attentive? I couldn't concentrate on it any more because there is no moving or anything like that. Audio with video, the lecturer was moving and hand motion and everything. I was just following her action, looking at her. So, I couldn't concentrate as much as the audio with picture. (Participant 3, in IEP program, from Korea)

Example 3: I think both ok. I don't have time to look at screen. (Participant 7, in IEP program, from China)

Example 4: I don't prefer to take a note while the video was playing because the length of lecture is only 2 or 3 minutes. (Participant 16, in IEP program, from China)

The researcher also selected comments concerning memory given that it was hypothesized that non-verbal cues would facilitate storage of information in memory. The most common theme to arise with respect to this aspect of the question, which had 15 tokens, was that the gestures in the video facilitated longer storage of information, so participants felt that they could take fewer notes.

Example 5: Yes. Sometimes the presenter will show some picture or gester to me, that is the point. And make me remember longer than reading. (Participant 5, in IEP program, from Taiwan)

Other common themes in regard to memory were rather equal in their distribution and much lower than those related to gestures aiding longer storage in memory. These comments focused more on recall of information rather than the storage of it.

Example 6: Yes. Some behaviors of instructor help me to recall some parts of listening. (Participant 39, in EAP program, from China)

Example 7: No, because the information is too much and I cannot understand all the information, just can remember the things which I can understand include a lot of extra information. (Participant 11, in IEP program, from Vietnam)

RQ 3

The third research question addressed whether participants actually took note of nonverbal information in the video listening passage and how they used non-verbal information that they did notice. In order to investigate this, I evaluated the responses from the exit questionnaires and interviews. I developed a list of themes as they related to the question. The themes and the number of tokens for each theme are in Table 9.
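Token counts like those in Tables 8 through 10 are simple tallies of hand-coded responses. The following Python sketch shows one way such a tally could be produced once responses have been coded; the category labels and entries below are hypothetical examples, not the study data.

from collections import Counter

# Each entry is a hand-assigned (major category, subtheme) code for one comment.
coded_responses = [
    ("Focus", "Gestures/body language"),
    ("Focus", "Listening to information"),
    ("Use", "Aid comprehension"),
    ("Focus", "Teacher"),
    ("Use", "No indication of use"),
    ("Focus", "Listening to information"),
]

tokens = Counter(coded_responses)   # (category, subtheme) -> number of tokens
for (category, subtheme), count in sorted(tokens.items()):
    print(f"{category:6s}  {subtheme:26s}  {count}")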
Table 9. Participants' Focus and Use of Video Content

Major Category   Subtheme                               Tokens
Focus            a. Teacher                              4
                 b. Gestures/Body Language              11
                 c. Stress/Intonation                     3
                 d. Listening to information             15
                 e. Other                                 8
Use              a. Aid Comprehension                    11
                 b. Cues for important information        6
                 c. No indication of use                  9
Note: Numbers may not match the total number of participants because some comments were not related to any of the major themes.

The information in Table 9 shows that, in terms of focus, the majority of responses on the questionnaire and in the interviews indicated that participants focused either on the gestures and body language of the lecturer or on the content of the lecture. A smaller portion of the comments focused on the teacher herself, on the intonation and stress in the teacher's voice, or on some other aspect that was mentioned by only one participant and could not be considered a major subtheme. Examples 8 through 11 are representative of these comments. When asked what they were focusing on while the video was playing, some of the participants said:

Example 8: On the people who talking. (Participant 21, in IEP program, from China)

Example 9: Interviewer: so you're saying that when you...when you were watching the video you were paying more attention to her... Speaker 2: her gestures...yeah. (Participant 1, in IEP program, from Saudi Arabia)

Example 10: The stressed words and the words appear in the reading. (Participant 31, Freshman, from China)

Example 11: I pay the most attention towards the listening. (Participant 30, in IEP program, from Korea)

The major category of use concerns only participants' actual use of non-verbal cues. Within this category, tokens were spread relatively evenly across the themes. Eleven participants stated that they used gestures to aid in their comprehension of individual words or of the listening passage overall. Nine participants, even though they noticed the non-verbal cues, indicated that they did not use the information or find it helpful. Finally, six participants stated that they used the gestures and other visual information as clues to what was important to write down in their notes. The following are representative examples of these themes:

Example 12: Speaker 2: uh..i think it's the same as she said, but at first i didn't like it. because i am...i am the kind of per..people who like to look around everything. So..but...actually, I understand more because her gestures help me like to find out word like "burrow." I didn't know it was the hole in the ground unless she did with her gesture that (hand motion). And also, like she said, the eye contact and some gestures just show you the important things she's going to do. So...her hand(?) just maybe talks as important as her words. (Participant 5, in IEP program, from Taiwan)

Example 13: Audio with video. I can find which one is important and which is the first or second from the person's body language. (Participant 31, Freshman, from China)

RQ 4

The final research question asked whether participants preferred the video format or the audio-and-still-picture format for the listening passage. With this question I also investigated whether participants found the video lecture distracting or helpful. To determine participants' preferences, responses to question one on the exit questionnaire were tallied, and the reasons for each preference were coded as well. The results of this analysis can be seen in Table 10.
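The tallying step itself is simple: each response receives a preference code and a reason code, and the codes are counted. Below is a minimal, illustrative sketch of such a tally; the coded tuples are hypothetical examples, not the actual questionnaire data.

```python
# Illustrative sketch of tallying hand-coded questionnaire responses (hypothetical
# codes, not the study's data): count preferences and (preference, reason) tokens.
from collections import Counter

coded_responses = [
    ("video", "easier to comprehend"),
    ("video", "more realistic"),
    ("audio", "easier to focus on content"),
    ("both", "did not look at screen"),
    ("video", "easier to comprehend"),
]

preference_totals = Counter(pref for pref, _ in coded_responses)
theme_totals = Counter(coded_responses)

print(dict(preference_totals))
for (pref, theme), count in sorted(theme_totals.items()):
    print(f"{pref}: {theme} = {count}")
```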
Table 10. Participants' Preference and Reasons

Preference   Themes                                   Tokens
Video        a. More Realistic                          6
             b. More Active/Higher Concentration        3
             c. Easier to Comprehend                   15
             d. Other                                    4
             Total                                      28
Audio        a. Easier to Focus on Content               9
             b. Other                                    1
             Total                                      10
Both         a. Did not look at screen                   2

The data in Table 10 illustrate that the majority of participants preferred the video input, with most of the comments revealing that clues from the visual channel made it easier to comprehend the information presented in the lecture. Participants also seemed to prefer the video condition because it was more realistic or authentic. Ten participants said they preferred the audio/still-picture input, with all but one saying the reason was that it was easier to focus on the audio content of the lecture. Finally, two other participants stated that both forms of input were acceptable and noted that they often or mostly did not look at the screen during the video. Examples 14 and 15 provide contrasting responses from participants who had a definite preference for one form of input over the other.

Example 14: I prefer the audio with video because it is more realistic. I just feel like in the class. I could see our teacher's action. (Participant 2, in IEP program, from Korea)

Example 15: [I prefer] audio with picture because audio with video made me looking the computer and follow her moves, therefore I couldn't concentrate on listening as much as I did on audio with picture. (Participant 26, in EAP program, from China)

In addition to examining these themes, I also examined the survey data to determine whether participants found the video helpful or distracting. Each participant mentioned this at least once in their survey. The counts of those who found the videos helpful or distracting are in Table 11.

Table 11. Participants' Overall Impression of Video Input

Perception    Count
Helpful        25
Distracting    11
Neutral         4
Total          40

Overall, Table 11 shows that participants generally believed, for reasons that will be covered in the next chapter, that the videos were a helpful aid, with 25 saying the video was helpful. Eleven participants stated that they felt the videos distracted them, and four other participants were neutral concerning the use of video in the listening task. It should be noted that not everyone who stated a preference for video also stated that the video was helpful. Two participants who said they preferred the video condition were neutral on whether it was helpful or distracting, and one participant actually said that the video was distracting.

CHAPTER 5: DISCUSSION AND CONCLUSION

In this chapter I examine the results from the previous chapter and discuss them in further detail for each research question. I then indicate the general and pedagogical implications that arise. Finally, I conclude with a discussion of the limitations of this study and some suggestions for future research.

Listening Comprehension and Construct Validity

The findings of the current study seem to demonstrate that the type of visual an individual is presented with will generally not affect their use of information in an integrated writing examination. The results of this study concur with those of Coniam (2001) and Londe (2009), who found that audio and video conditions on a listening comprehension task did not produce significant test score differences.
The results of the current study conflict to a degree with those found by other previous research. Many researchers (Baltova, 1994; Gruba, 1999; Sueyoshi & Hardison, 2005; Suvorov, 2008, 2009; E. Wagner, 2010) have found significant differences on listening comprehension tasks based on the types of visuals that are provided. However, it should be noted here that the assessments they were using came in the form of multiple-choice questions related specifically to comprehension, whereas the assessment task in this study was focused on having the participants not only comprehend the information they had heard, but also use it in a meaningful way in the essay. Therefore, it could possibly be argued that, while visuals may have significant impacts on an individual’s ability to comprehend the information received from a lecture, this effect does not extend into the way the information is used when writing an essay for an integrated writing task. By investigating the qualitative data regarding the way the participants used and focused on the video listening material, a much fuller picture of issues at hand can be determined. Given  32 that the majority of the participants preferred the presence of video and noted the video’s helpful non-verbal cues, it would seem that the presence of video input, while not necessarily helpful during the writing phase of the task, is actually relatively helpful during the comprehension phase. This was seen in many of the comments provided on the exit questionnaires and in the interviews. The preferences found in this study aligned with the results of previous studies conducted by Progrosh (1996) and Baltova (1994) who also found that participants generally preferred video input over audio-only input. Evidence for this preference for video-based input was revealed in the participants’ comments regarding what they focused on when presented with the videos and how participants used the videos’ non-verbal information. It was apparent from comments that participants were not only paying attention to the video, but that they were also using non-verbal information in the form of gestures as a means for aiding comprehension of difficult vocabulary items and to determine which aspects of the lecture were the most important for them to attend to. These results support the previous findings of M. Wagner (2006) who found that participants tended, on average, to watch video listening input 69% of the time. It also lends support to findings by Sueyoshi and Hardison (2005), who found that learners used gestures and lip movement to aid in comprehension, and Ginther (2002), who found that content visuals were more helpful in aiding comprehension than context visuals. However, the results of this study should be interpreted with some caution. Coniam (2001) found that 80% of his study’s participants felt they were not aided by video and actually preferred audio-only input, contrary to the results found here. Results from the surveys collected in this study may shed some light on the conflicting findings from this study and Coniam’s. Some participants openly stated that audio-only input was what they were accustomed to when presented with listening material in classrooms in their home countries and, therefore, that the  33 audio-with-still-picture condition was preferred. Other participants simply did not see the purpose of including non-verbal cues. For example, one participant said in an interview, “I don't care about the lecture is with a video or a picture. 
The point is, do I have the ability to understand what they said?” (Participant 35, Freshman, from China). Comments such as this one indicate that some of the participants came from backgrounds in which they were not taught how to use non-verbal information during listening tasks, or perhaps that they were not used to watching video in the context of a listening test.

This study has pedagogical implications in the area of strategy instruction. Participants who preferred audio-only input generally stated that the reason was that the audio-visual input was distracting and made it difficult to focus on content. This may indicate that students need to be instructed in strategies for using these non-verbal cues and that video should, perhaps, be included in listening tasks presented in class and/or in preparation for listening tests. The tests themselves should also have more video-based content. If students are being assessed on their ability to receive information from a lecture, then for the test to be a valid measure of this ability, non-verbal information should be present. If students are expected to succeed in a real lecture, it would seem important not only that they know how to utilize non-verbal information but also that they know how not to be distracted by it. Buck (2001) stated that by including non-verbal information, students who are adept at using such information have an unfair advantage. However, as E. Wagner (2008) stated, by administering an audio-only listening task, components of the Target Language Use domain are missing, and those students who are adept at using these cues are unfairly disadvantaged and their ability is underrepresented.

The Effects of Video and Audio Listening Tasks on Note Taking

Being able to take notes can compensate for memory constraints and therefore increase the face validity of a test (Vandergrift, 2009). In addition, listeners are not able to go back and review the information that has been presented to them in the way that readers can (Thompson, 1995), making a task such as the integrated writing task extremely difficult without some form of notes. Previous research has done little to investigate the note-taking behavior of test takers on listening comprehension tests. Several studies have found that the presence of notes and the ability to take notes during a listening task can actually facilitate comprehension and aid in the recognition of specific information (Liu, 2001; Carrell, Dunkel, & Mollaun, 2004). Carrell (2007) investigated the effect of a note-taking strategy intervention and found that, while the intervention had little impact on note taking itself, subsequent performance on a listening and writing task, similar to the one used in this study, related consistently to the number of content words in test takers' notes. Since the task in this study was based on the integrated writing task found on the TOEFL iBT, students were provided with paper to take notes, as they are on that test. The current study took advantage of this condition by investigating the effects of the different listening conditions on note taking. Results indicated that there was a significant difference in word counts between the two conditions, with the video condition producing notes with lower word counts.
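As a check on the reported statistic, the moderate effect size for this comparison follows from the standard conversion of a paired-samples t value to r (Cohen, 1988, 1992), applied to the values in Table 7:

```latex
r = \sqrt{\frac{t^{2}}{t^{2} + df}}
  = \sqrt{\frac{(-2.39)^{2}}{(-2.39)^{2} + 39}}
  \approx \sqrt{\frac{5.71}{44.71}}
  \approx 0.36
```

which matches the moderate effect reported above for the difference in note length; the same conversion reproduces the effect sizes reported in Table 5.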
This supports previous research in which it was found that test takers tend to focus their attention on the screen for longer periods of time when video is present than when there are still-pictures only (E. Wagner, 2007) and that changing images or movement can be distracting (Chung, 1994; E. Wagner, 2008). This may also be the result of a greater cognitive load being placed on the test taker who has to focus on watching the lecturer, listening to the information, and writing relevant information down in notes.  35 Qualitative data helped clarify this conclusion by providing some additional information that may support this evidence of split-attention effect. Many participants stated that the video created a need to multitask and that this made it much more difficult to take notes. In fact, as was seen in Table 8, 11 out of 21 tokens in reference to note-taking mentioned multitasking difficulty. Further comments demonstrated that participants seemed to believe the video listening was somehow “faster” than the audio/still-picture listening material and that the video content made getting information from the lecture more difficult. This comment was found in relation to both video lecture topics, which may suggest that participants perceived the video lectures as being “faster” because of an extra cognitive load the video condition presented. In essence, it could be that the “distracting” effect of video presentation reported by Chung (1994) and E. Wagner (2008) is one and the same as the split-attention effect also described by E. Wagner (2008)—the participants may just be describing their inability to attend to all modalities (audio, visual) and task components (comprehension, note-taking) at the same time and the participants may, in laymen’s terms, describe their perception of this cognitive overload as being “distracted by the video” or by “the movement in the video.” In the current study I also investigated whether test takers felt it was easier to remember information in the video lecture than in the audio lectures. As was seen in Table 8, 19 of the 25 comments concerning memory centered on how the video was helpful in terms of storage of information in memory and recall. Participants’ comments suggested that specific gestures actually facilitated the recall of information when they needed it. This could possibly be because the gestures aided in comprehension and made the information in the lectures more salient to the listeners. Furthermore, some participants stated that they felt taking notes actually negatively impacted the effect of the videos. One of the interview participants stated:  36 Example 16: If I do not take notes, I think it is better for video, because I mean...body language is helpful. So, I can understand more, but if we need to take notes, it's not a good idea. (Participant 9, in IEP program, from China) Comments such as this one suggest that language learners realize that body language and gestures are helpful and that they should use them as an aid when listening. However, this comment also causes more questions to arise. Namely, if students are taking notes, how attentive can they be to body language and gestures? Obviously in the real world, students in lecture halls have to negotiate both tasks. Perhaps practice in doing so is needed. 
There is some evidence in this study that a test requiring both skills (note taking and visual attention to the gestures, facial features, and body language that augment comprehension) is more advanced than the curriculum leading up to the test; if listening tasks in the classroom do not require such task and skill negotiation, a test with these components may feel unfair or be too taxing.

Limitations

While the data obtained from the current study are rather informative, there are still several limitations related to the study and to the listening test itself. First, the study lacked authentic materials. The listening materials used for this study were scripted materials originally designed by the Educational Testing Service and therefore lacked authenticity. Gruba (1997) stated that authentic materials are preferable in listening tests, so this would be a consideration for future studies.

A second limitation of the study is found in the interviews and questionnaires. The information from the interviews and questionnaires should be interpreted cautiously because participants were informed of what the study was investigating. A Hawthorne effect may therefore have been present, with participants tailoring their answers to survey and interview questions to fit the study's main investigation. In addition, due to participants' time constraints, it was necessary to conduct interviews in pairs. This may have allowed one participant to influence the other's opinions.

A third limitation is the way the notes were analyzed. A word count does not necessarily provide a good indication of how video and audio/still-picture input sources affect note-taking strategies. It may have been better to count content units, since that may have provided a more accurate measure of how video affected note-taking behavior.

Another limitation of the present study can be found in the participants themselves. Not only were most of the participants from the same cultural background, but the proficiency levels represented were also not very diverse. Most participants were approximately at the intermediate level of proficiency on the ACTFL scale. Therefore, the results of this study are difficult to generalize.

Directions for Future Research

The results of the present study leave several questions to be investigated in future research. The first would be to investigate the effects of strategy instruction on participants' ability to utilize non-verbal cues. Given that several individuals reported finding the video listening passages "distracting," it would be interesting to instruct test takers on stress patterns and gestures and then see whether that instruction has an effect on subsequent test scores.

Second, several participants stated that they felt more relaxed when given the video listening input and that, because of this, they were able to comprehend better. Researchers such as Arnold (2000) and In'nami (2006) have found conflicting results concerning the effects of anxiety on listening comprehension. The comments provided in this study may indicate that anxiety does have some effect on listening comprehension, and it may therefore be important to further investigate the affective dimensions of including video input in listening comprehension exams.

A third possibility for future research concerns note-taking strategies. In the current study, the content of the notes was not considered.
It may be interesting to examine how video and audio listening passages affect the actual content found in notes and what kinds of content may contribute to higher scores on an integrated writing task.

A fourth direction for future research could be to investigate the actual focus of test takers when given a video listening task. While E. Wagner (2007) found that participants looked at the screen 69% of the time when presented with a video listening task, it would be interesting to investigate what test takers actually focus on in the video using eye-tracking technology.

Another direction for possible future research would be to investigate specific kinds of gestures and how each type aids comprehension. For instance, investigating whether metaphoric gestures aid retention and comprehension more than beat gestures would lead to a further understanding of which types of gestures attract the most attention and facilitate the process of listening comprehension. Furthermore, investigating the effect of these types of gestures on the content of test takers' notes would help reveal what is most salient to them.

A further possibility for future research would be to investigate the connection between the comprehension phase and the writing phase. It would be interesting to examine whether the results of this study are due to a stronger link between listening and comprehension than between listening and writing, or whether some other factor is at play regarding performance on the essay task.

Finally, given the number of participants stating that they found it easier to store information in their memory and later recall it with the video, it may be worth investigating the ways in which body language and gestures ease an individual's cognitive load and the amount of information one is able to recall as a result of video listening tasks. The findings of such a study would give a fuller view of the factors that affect an individual's ability to comprehend the aural stream.

APPENDICES

APPENDIX A – YOUTUBE LINKS TO LISTENING MATERIALS

Computerized Voting with Still Picture and Audio: http://youtu.be/FBW_STr6tEA?hd=1
Computerized Voting with Video: http://youtu.be/3IWSQ8PT4CY?hd=1
Altruism with Still Picture and Audio: http://youtu.be/kAWvQmKZI-U?hd=1
Altruism with Video: http://youtu.be/hRgsmMRpeDY?hd=1

APPENDIX B – BACKGROUND QUESTIONNAIRE

Participant ID # _________________ (To be filled in by the researcher)

BACKGROUND QUESTIONNAIRE
TOEFL Integrative Writing Task Project

PLEASE FILL OUT THE FOLLOWING BACKGROUND INFORMATION. PLEASE PRINT CLEARLY.

1. Name:
   a. First name: ____________________________
   b. Last name: ____________________________
   c. Middle initial: _______
2. Age: _____
3. Gender:   Male   Female
4. Phone number: ( __________ ) __________ - __________________
5. Email address: _________________________________________
6. Native language (first fluent language, also known as your “mother tongue”): __________________________
   a. How did you learn English? _______________________________________________________________________
   b. How old were you when you started learning English? ________________________
7. How many times have you taken the TOEFL in the past? ________________________
8. Do you currently have plans to retake the TOEFL? If so, when? Have you taken any classes designed to improve your test score? ________________________________________________________________________
9.
Have you taken any classes or used any study aids (books, flashcards, etc…) designed to improve your TOEFL score? If so, what did you use and how recently did you use it? _______________________________________________________________________

APPENDIX C – EXIT QUESTIONNAIRE

Exit Questionnaire: Please answer the following questions to the best of your ability based on your test-taking experience.

Name: ____________________

1. Which of the lectures did you prefer, the audio with picture or the audio with video? Why?
2. Do you think that the presence of the video aided in your comprehension of the information being delivered? Why or why not?
3. What did you find yourself paying the most attention towards in the video lecture?
4. Did you find it difficult to take notes while the video was playing? Please explain.
5. Did you think it was easier to remember information received from the video lecture? Please explain.

APPENDIX D – DIRECTIONS AND NOTE-TAKING SHEET

Directions: For this task you will read a passage about an academic topic and you will listen to a lecture about the same topic. You may take notes while you read and listen. Then you will write a response to a question that asks you about the relationship between the lecture you heard and the reading passage. Try to answer the question as completely as possible using information from the reading passage and the lecture. The question does not ask you to express your personal opinion. You may refer to the reading passage again when you write. You may use your notes to help you answer the question. Typically, an effective response will be 150 to 225 words. Your response will be judged on the quality of your writing and on the completeness and accuracy of the content. You will be given 3 minutes to read the passage. Then you will listen to the lecture. Then you will be allowed 20 minutes to plan and write your response.
 46 APPENDIX E – ANALYTIC RUBRIC Essay ID #:____________________  Content Organization Vocabulary Language Use Score Mechanics /2 20 Thorough and 20 Excellent overall 20 Very sophisticated 20 No major errors in 20 logical development organization vocabulary word order or of thesis Clear thesis Excellent choice of complex structures Substantive and statement words with no errors No errors that detailed Substantive Excellent range of interfere with No irrelevant introduction and vocabulary comprehension 16 information 16 conclusion 16 Idiomatic and near 16 Only occasional 16 Interesting Excellent use of native-like errors in A substantial transition word vocabulary morphology number of words for Excellent Academic register Frequent use of amount of time connections complex sentences given between paragraphs Excellent sentence Unity within every variety paragraph Appropriate layout with indented paragraphs No spelling errors No punctuation errors 15 Good and logical development of thesis Fairly substantive and detailed Almost no irrelevant information 11 Somewhat interesting Appropriate layout with indented paragraphs No more than a few spelling errors in less frequent vocabulary No more than a few punctuation errors  15 Good overall organization Clear thesis statement Good introduction and conclusion Good use of transition 11 wordsGood connections 15 Somewhat 15 Occasional errors sophisticated in awkward order vocabulary or complex Attempts, even if not structures completely Almost no errors successful, at that interfere with sophisticated comprehension vocabulary Attempts, even if 11 Good choice of 11 not completely words with some successful, at a 47 15 11 An adequate number of words for the amount of time given between paragraphs Unity within most paragraphs errors that don’t obscure meaning Adequate range of vocabulary but some repetition Approaching academic register variety of complex structures Some errors in morphology Frequent use of complex sentences Good sentence variety 10 Some development 10 Some general 10 Unsophisticated 10 of thesis coherent vocabulary Limited Not much substance organization word choice with or detail Minimal thesis some errors Some irrelevant statement or main obscuring meaning information idea Repetitive choice of Somewhat Minimal words uninteresting introduction and No resemblance to 6 Limited number of 6 conclusion 6 academic register 6 words for the Occasional use of amount of time transitions words given Some disjointed connections between paragraphs Some paragraphs may lack unity Errors in word 10 order or complex structures Some errors that interfere with comprehension Frequent errors in morphology Minimal use of 6 complex sentences Little sentence variety Appropriate layout with most paragraphs indented Some spelling errors in less frequent and more frequent vocabulary Several punctuation errors 5 Serious errors in 5 word order or complex structures Frequent errors that interfere with comprehension Many error in 0 No attempt to arrange essay into paragraphs Several spelling errors even in frequent vocabulary Many punctuation 0  No development of thesis No substance or details Substantial amount of irrelevant information 5 0 No coherent 5 Very simple organization vocabulary No thesis statement Severe errors in or main idea word choice that No introduction and often obscure conclusion meaning No use of transition 0 No variety in word 48 5 0 Completely uninteresting Very few words for the amount of time given words Disjointed connections between 
paragraphs Paragraphs lack unity choice No resemblance to academic register   49 morphology Almost no attempt at complex sentences No sentence variety errors APPENDIX F– WRITING PROMPT  Summarize the points made in the lecture you just heard, explaining how they cast doubt on points made in the reading.  50 APPENDIX G – INTERVIEW 1 TRANSCRIPT [00:00:01.23] Interviewer: ok, um..what..which of the lectures did you end up preferring? ..the video or the picture [00:00:17.00] Speaker 1: I think the video one is better. actually..umm..except the problem of the topics, actually they are different topics...i think videos and i can focus on the lecturer's eyes or her hand gestures or facial features, especially mouth and i can understand when does she when she stopped and when some kind of her attitude and uh i can have more understanding about that. And i think it is better...it's kind of distraction, but i think it's good because i took tons of times TOEFL listening test and when every time i listened to lectures i was very easily to distracted because..because only audio input and you stay there and you feel sleepy or you feel um you want to do anything el...something else so you cannot focus on the lecture very well, but when you have someone you can have some eye contact with the lecturer or you have...some...other clues to focus on and you will not...will not so easily to uh...you know to disconcentrate. [00:01:41.14] Speaker 2: uh..i think it's the same as she said, but at first i didn't like it. because i am...i am the kind of per..people who like to look around everything. So..but...actually, I understand more because her gestures help me like to find out word like "burrow." I didn't know it was the hole in the ground unless she did with her gesture that (hand motion). And also, like she said, the eye contact and some gestures just show you the important things she's going to do. So...her head?hand (?) just maybe talks as important as her words. [00:02:31.20] Interviewer: so you're saying that when you...when you were watching the video you were paying more attention to her... [00:02:38.01] Speaker 2:her gestures...yeah  51 [00:02:36.26] Interviewer: ok. were you paying attention to her lips at all?Did you see her lips? [00:02:42.29] Both: No... [00:02:46.09] Interviewer: Um... [00:02:48.21] Speaker 1: but, but, but i don't think that uh...i think that, i think underst...seeing, looking at her lips clearly is kind of important because actually uh international students' listening skills is not as good as...yeah...so...if they have some...they have they can look more clearly.They can have more understanding about what was really said. For example, when I watch the TV shows or I watch CNN I always depend on their lips to understand more about what they said. [00:03:23.06] Speaker 2: I never looked at their lips. *laughs* [00:03:32.08] Speaker 1: it's important [00:03:32.08] Interviewer: Um...do you think that gestures are more important than lip reading maybe? [00:03:41.18] Speaker 2: yeah, to me it is. [00:03:42.23] Speaker 1: yes, yes, to me too. umm...uh, uh, uh, like _________ said, at first i'm not really used to the video one, because i'm , i when i practice my TOEFL test i usually use the audio one, but after when we take the second task, i feel a little bit weird when only listen to the audio. yeah..it's not...i cannot [00:04:06.25] Interviewer: so the difference between the two is a little jarring? 
[00:04:14.21] Speaker 1:yeah...yup  52 [00:04:18.24] Speaker 2: the first is much better, the video than that picture. I will not get...like i'm not gonna get anything out of the picture that's there. no gestures, no eye contact...you don't, you just don't know. [00:04:33.00] Speaker 1:actually, pictures there is no use, totally no use. so i don't know why TOEFL listening test always give me some pictures there. [00:04:46.10] Interviewer: what types of things do you pay attention to when you're listening? what do you think a listening test should include in it's listening? [00:05:16.19] Speaker 2:i think anything that's mainly related to that article. If they're going to put a picture, i don't need to see the lecturer, i wanna see that, i think they were talking about the meerkat? On the first one. It was very helpful. I'd like to see the picture ofthe meerkat, or even the meerkat guard. [00:05:42.03] Speaker 1:it helps out [00:05:45.12] Speaker 2:yeah, i think pictures are important if there's no video. [00:05:54.24] Speaker 1: i'm sorry, i cannot geet your question. [00:06:03.25] Interviewer::so when you're given a listening, and say you're in the real world. so compare what you have in the test to the real world. what do you think the listening in your test should include from the outside? so like when you're listening to people talk, what are you paying attention to? Do you think those should be included in the test? [00:06:34.04] Speaker 2:yes [00:06:35.29] Speaker 1: let me think. i don't have a lot of opinions about this questions. maybe...pictures as an add, not the main one. and i need some...i don't, i...i don't know how to explain it well. the frequency of the voice? yeah, because most of tests i would listen are not  53 too...you know...they don't have lots of emotions when they explain something. yeah...i think i need more emotions. [00:07:15.27] Interviewer: ok. umm...i have one more question. When you were taking notes, did you find it more difficult for the video listening than for the the audio listening...to take notes? [00:07:35.29] Speaker 2:Video was more difficult [00:07:41.00] Interviewer: more difficult?what made it more difficult? [00:07:39.15] Speaker 2:um because, to me i want to concentrate on what she's saying and what she's doing and her gestures, and also to write what i think. uh...it's just i'm not good with multitasking so...that's why. [00:07:54.25] Speaker 1:yeah, to me, it's kind of, but not affect much because i think, uh...in my opinion, i think the video lecture um the gestures and the facial expressions such other clues are useful, but not uh...not necessary. Like the lecturer might say, "please feel free to look at my face." but you can look and you cannot, you don't have to, you don't have to look at it. So if I want to focus on my, take notes I won't look at her. Yeah. I don't think to me it is kind of multi...mult...multitasking? because I will divide it into two sides. If I know, I am sure that this part is an important part I need to take notes, I won't...I don't have to look at her anymore. Yeah. [00:08:55.16] Interviewer: What types of gestures did you find most helpful? [00:08:59.16] Speaker 2:Hand gestures. [00:08:56.07] Speaker 1:Hand gestures. Yes. Really Helpful. Like ________ said this (makes burrow gesture) [00:09:07.04] Interviewer: Yeah. The burrow? [00:09:09.23] Speaker 1:Yeah that was very helpful.  54 APPENDIX H – INTERVIEW 2 TRANSCRIPT [00:00:00.16] Interviewer: um so which, which of the lectures did you end up preferring? 
[00:00:11.07] Speaker 1:the second [00:00:09.08] Interviewer: the second. which one is the second one? [00:00:11.01] Speaker 1:The vote. Vote [00:00:18.05] Interviewer: Was it video or picture? [00:00:20.04] Speaker 1:I Picture. I think picture is more...i can understand picture uh...more clear. Clear...you know...I think so, but I don't know. [00:00:38.23] Interviewer: Why do you think you understand it more clearly? [00:00:43.28] Speaker 1:You know, your computer play a video, I am easy to interrupt...by a video...by a motion I mean. So I cannot take notes very well. And that is the reason. [00:00:59.20] Interviewer: Ok. What about you? [00:01:04.17] Speaker 2:Well Audio with picture was attentive? I couldn't concentrate on it any more because there is no moving or anything like that. Audio with video, the lecturer was moving and hand motion and everything. I was just following her action, looking at her. So, I couldn't concentrate as much as the audio with picture. [00:01:31.18] Interviewer: Ok. So you both like the audio with picture... [00:01:35.00] Speaker 1: I have another... [00:01:37.17] Interviewer: Ok [00:01:37.17] Speaker 1: If I do not take notes, I think it is better for video, because I mean...body language is helpful. So, I can understand more, but if we need to take notes, it's not a good idea.  55 [00:01:52.22] Interviewer: Did you think, if...did you think it was necessary to take notes as much with the video? [00:02:01.27] Speaker 1:....I like to take notes. But the video is actually interrupt me...to take notes. I didn't try it...you know, I just...but I think TOEFL test is necessary to take notes when the lecture is playing. [00:02:20.01] Interviewer: Ok. Um...Do you think that...do you think it was...like notes aside...do you think it was easier to understand using the video....instead of just the still picture? [00:02:45.05] Speaker 1:Without taking notes? I think it's video, because I can understand more by body language and movement and...you know...and, I can also understand the emotion of the professor. But if I taking notes, that is really interrupt me. So, it depends. [00:03:15.25] Interviewer: Ok. What about you? What's your feeling? [00:03:26.00] Speaker 2:What was your question? [00:03:25.14] Interviewer: Do you think that the video helped you to understand more? Say you didn't have to take notes and you could just watch the video. Do you think it would've helped you understand more? [00:03:37.07] Speaker 2:Um...like, I understand more...but when I have to write it, I forget the story. So if I don't write it I cannot like...I have to compare both the reading and lecture, but I cannot remember the lecture...lecturer's saying. So I have to take notes. But if I don't take notes I understand better. [00:04:01.05] Interviewer: Ok. So....so the use, so looking at the gestures it didn't help you to remember it? Did it help you? [00:04:21.18] Speaker 1:It's a really hard question. It depends. But...help me to remember? no. Help me to understand? yes  56 [00:04:41.26] Interviewer: Um...Did you...so...it's more difficult to take notes...um...do you think it's important to include that video since you 're being tested on your academic skills in a lecture? Even though it's more difficult to take notes, do you think it's more important to include that video because....of more realistic situations... [00:06:26.16] Speaker 1:in TOEFL test? No. Because, first of all, it's a...you know...I think it interrupt people to understand, when, during the test. 
Second, the network may be no good. So...I think the most important point is they interrupt. [00:06:52.07] Interviewer: What about you? Do you feel the same way? [00:07:04.18] Speaker 2:I don't feel any realistic...from...from the video since it is, I cannot...(?????)...test TOEFL, I always feel like "oh yeah, it's not like i'm in there..." so... [00:07:23.23] Interviewer: One final question...was the audio in the videos clear in both of the passages? [00:07:47.17] Speaker 1:Yeah...I understand the second one, the American Votes. But the first one is a little bit not very clear. Maybe I interrupt by something, because I did multitasking, you know I look at watching the video and taking the notes, listen to the audio, so maybe it's not very distinct. [00:08:07.20] Interviewer: So you're saying the first one is not as....the quality of the audio is not as good? [00:08:14.06] Speaker 1:It's not as good as the second one, but it's still okay, but it...i think the second one is better. [00:08:28.20] Speaker 2:Same as him.  57 REFERENCES  58 REFERENCES Alibali, M. W., Heath, D. C., & Myers, H. J. (2001). Effects of visibility between speaker and listener on gesture production: Some gestures are meant to be seen. Journal of Memory and Language, 44, 169-188. Arnold, J. (2000). Seeing through listening comprehension exam anxiety. TESOL Quarterly 34, 777–786. Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press. Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests. Oxford: Oxford University Press. Baltova, I. (1994). The impact of video on the comprehension skills of core French students. Canadian Modern Language Review, 50(3), 507-531. Bejar, I., Douglas, D., Jamieson, J., Nissan, S., & Turner, J. (2000). TOEFL 2000 listening framework: A working paper . Princeton, NJ: Educational Testing Service. Brett, P. (1997). A comparative study of the effects of the use of multimedia on listening comprehension. System, 25(1), 39-53. Buck, G. (2001). Assessing listening. Cambridge: Cambridge University Press. Carrell, P. (2007). Notetaking strategies and their relationship to performance on listening comprehension and communicative assessment tasks (TOEFL Monograph Series No. MS-25). Princeton, NJ: ETS. Carrell, P. L., Dunkel, P. A., & Mollaun, P. (2004). The effects of notetaking, lecture length and topic on a computer-based test of ESL listening comprehension. Applied Language Learning, 14, 83-105. Chung, U. K. (1994). The effect of audio, a single picture, multiple pictures, or video on secondlanguage listening comprehension. Unpublished PhD dissertation, University of Illinois at Urbana-Champaign. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New York: Academic Press. Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159. Coniam, D. (2001). The use of audio or video comprehension as an assessment instrument in the certification of English language teachers: A case study. System, 29, 1-14.  59 Duff, P. (2002). Research approaches in applied linguistics. In R. A. Kaplan (Ed.), The Oxford handbook of applied linguistics (pp. 13-23). Oxford, UK: Oxford University Press. English, S. L. (1982, May). Kinesics in academic listening. Paper presented at the 16th annual conventionof Teachers of English to Speakers of Other Languages, Honolulu, HI. (ERIC Document Reproduction Service No. ED 218 976) Educational Testing Service. (2009). 
The official guide to the TOEFL® test (3rd ed.). New York, NY: McGraw Hill. Educational Testing Service. (2010). TOEFL iBT™ test framework and tst development. Retrieved February 24, 2011 from http://www.ets.org/toefl/research/ibt_insight_series Education Testing Service. (2011). Reliability and comparability of TOEFL iBT™ scores. Retrieved February 24, 2011 from http://www.ets.org/toefl/research/ibt_insight_series Ginther, A. (2002). Context and content visuals and performance on listening comprehension stimuli. Language Testing, 19(2), 133-167. Gruba, P. (1993). A comparison study of audio and video in language testing. JALT Journal, 15(1), 85-88. Gruba, P. (1997). The role of video media in listening assessment. System, 25 (3), 335-345. Gruba, P. (1999). The role of digital video media in second language listening comprehension. Unpublished PhD dissertation, Department of Linguistics and Applied Linguistics, University of Melbourne. Retrieved May 5, 2008, from http://eprints.unimelb.edu.au/archive/00000244/ Hadar, U., Wenkert-Olenik, D., Krauss, R., & Soroket, N. (1998). Gesture and the processing of speech: Neuropsychological evidence. Brain and Language, 62, 107-126. Hamp-Lyons, L. (1991). Scoring procedures for ESL contexts. In: L. Hamp-Lyons (Ed.), Assessing second language writing in academic contexts (pp. 241–276). Norwood, NJ: Ablex. In’nami, Y. (2006). The effects of test anxiety on listening test performance. System 34, 317–340. Kintsch, W. (1998). Comprehension. Cambridge: Cambridge University Press. Lado, R. (1961). Language testing: The construction and use of foreign language tests. London: Longman. Liu, Y. (2001). A cognitive study on the functions of note-taking and the content of notes taken in a context of Chinese EFL learners. (Unpublished master’s thesis). Guangdong University of Foreign Studies, Guangdong, People’s Republic of China.  60 Londe, Z. C. (2009). The effects of video media in English as a second language listening comprehension tests. Issues in Applied Linguistics, 17(1), 41-50. McNamara, T. (1996). Measuring second language performance. London: Longman. Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (pp. 13-103). New York: Macmillan. Messick, S. (1996). Validity and washback in language testing. Language Testing, 13, 242-256. Morrel-Samuels, P., & Krauss, R. M. (1992). Word familiarity predicts temporal asynchrony of hand gestures and speech. Journal of Experimental Psychology: Learning, Memory and Cognition, 18, 615-662. Ockey, G. J. (2007). Construct implications of including still image or video in computer- based listening tests. Language Testing, 24(4), 517-537. Polio, C., & Hughes, A. Writing development in an ESL program: What can we expect in 15 weeks? TESOL Annual Convention, Salt Lake City, April 2002. Progrosh, D. (1996). Using video for listening assessment: Opinions of test-takers. TESL Canada Journal, 14, 34-44. von Raffler-Engel, W. (1980). Kinesics and paralinguistic: A neglected factor in secondlanguage research and teaching. Canadian Modern Language Review, 36(2), 225-237. Rubin, J. (1995). The contribution of video to the development of competence in listening. In D. Mendelsohn, & J. Rubin (Eds.), A guide for the teaching of second language listening (pp. 151-165). San Diego, CA: Dominie Press. Sueyoshi, A., & Hardison, D. M. (2005). The role of gestures and facial cues in second language listening comprehension. Language Learning, 55, 661-699. Suvorov, R. (2008). 
Context visuals in L2 listening tests: The effectiveness of photographs and video vs. audio-only format (Unpublished master’s thesis). Iowa State University, Ames, IA. Suvorov, R. (2009). Context visuals in L2 listening tests: The effects of photographs and video vs. audio-only format. In C. A. Chapelle, H. G. Jun, & I. Katz (Eds.) Developing and evaluating language learning materials (pp 53-68). Ames, IA: Iowa State University. Thompson, I. (1995). Assessment of second/foreign language listening comprehension. In D. Mendelsohn, & J. Rubin (Eds.), A guide for the teaching of second language listening (pp. 31-58). San Diego, CA: Dominie.  61 Vandergrift, L. (2007). Recent developments in second and foreign language listening comprehension research. Language Teaching, 40, 191-201. Wagner, E. (2007). Are they watching? Test-taker viewing behavior during an L2 video listening test. Language Learning and Technology , 11(1), 67-86. Wagner, E. (2008). Video listening tests: What are they measuring? Language Assessment Quarterly, 5(3), 218-243. Wagner, E. (2010). The effect of the use of video texts on ESL listening test-taker performance. Language Testing, 27, 493-513. Wagner, M. (2006). Utilizing the visual channel: An investigation of the use of video texts on tests of second language listening ability. Unpublished doctoral dissertation, Teachers College, Columbia University, New York. Weigle, S. C. (2002). Assessing writing. Cambridge: Cambridge University Press. Weir, C. (2005). Language testing and validation: An evidence-based approach. Basingstoke: Palgrave Macmillan.   62