THE ROLES OF CONTEXT AND REPETITION IN INCIDENTAL VOCABULARY ACQUISITION FROM L2 READING: AN EYE MOVEMENT STUDY

By

Ayman Ahmed Abdelsamie Mohamed

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Second Language Studies – Doctor of Philosophy

2015

ABSTRACT

Research on extensive reading has provided ample evidence for the role of repetition in lexical learning and has called for further research on the role of context in vocabulary acquisition from L2 reading (e.g., Chen & Truscott, 2010; Horst, 2005; Waring & Nation, 2004; Webb, 2007, 2008). Eye movement studies of reading behavior, for their part, have documented cognitive effects of repetition and context quality on lexical processing and have associated vocabulary learning with processing patterns in light of the eye-mind link hypothesis (Rayner, 1998, 2009). The present study brings together methods from both strands to investigate incidental vocabulary acquisition and to track the cognitive roles of repetition and context predictability in the development of different aspects of vocabulary knowledge. Forty-two upper-intermediate and advanced second language learners of English read a stage 1 graded reader, ‘Goodbye Mr. Hollywood’, on the screen of a desk-mounted eye tracker and then completed comprehension questions and vocabulary posttests. Target vocabulary consisted of 20 pseudo words and 20 known words whose number of occurrences in the text ranged from 1 to 30. Eye-movement data showed that readers spent more time on pseudo words than on familiar words and that fixation times decreased across encounters, with more attention given to target words on early encounters. Context predictability decreased the total time spent on target words, particularly on late encounters.
Readers scored highest in form recognition, followed by meaning recognition and finally meaning recall. Repeated exposure supported form recognition, while context predictability supported meaning recognition and recall. Moment-by-moment lexical processing showed that first fixations predicted form recognition while gaze durations predicted meaning recall. Total time spent on each encounter was positively associated with learning success on all vocabulary measures. When fixation times were aggregated by vocabulary item, the amount of attention, as reflected in total reading times on each pseudo word across all its encounters, positively predicted learning outcomes above and beyond total exposure and item predictability. This finding highlights the important role of readers’ individual attention and their optimal use of input to infer and retain meaning from context. The results add a cognitive dimension to the concept of engagement in lexical learning and carry implications for the process of incidental learning from extensive reading and for classroom teaching tasks.

Copyright by
AYMAN AHMED ABDELSAMIE MOHAMED
2015

To my beloved family
And to my beloved country, Egypt

ACKNOWLEDGMENTS

This dissertation is the culmination of long years of study and research in applied linguistics. Since I started college majoring in English as a foreign language, my life has revolved around linguistics, and my ultimate goal has been to become a university professor. The journey to the PhD degree has been quite long and full of academic, social and cultural experiences of many kinds. From the moment I stepped onto the MSU campus, a whole new world opened up for me and my family. We have gone through a lot, establishing a new life in a diverse community. We have enjoyed it so fully that we can hardly believe the journey is coming to an end and that we should close this chapter and move on to the next step.
Many hands have supported my steps towards achieving my dream. The first kind hand was my father’s. Although he was reluctant to let me go away from him, he did not hesitate to pay for my first travel tickets. His priority was to see me become a professor, regardless of the suffering he had to go through all these years without me and away from his little grandsons. My mother shared similar feelings but remained patient over the years. I owe my progress in life and work to them for their care and support all the way.

At the academic level, I cannot express in words how grateful I am to my advisor, Dr. Aline Godfroid, who brilliantly shaped my thoughts and developed my research abilities over the years. When Dr. Godfroid first saw me, I was still struggling in my second year, trying to find my way. With constructive and intensive feedback, she challenged my abilities and managed to have me produce the best I could. I truly appreciate all the time and effort she spent on all my papers and messy drafts. She fostered my random thoughts about this project until it became something real. It was only through her continuous feedback and follow-up that I was able to get my first paper in press for publication. I owe Dr. Godfroid the quality and the value of this dissertation and of any of my upcoming publication projects. I will always be proud that I was one of her students.

I express my utmost appreciation to Dr. Susan Gass, who believed in my abilities and brought me from the far Middle East to be a member of the precious SLS family. She was always there for us, giving full attention to everyone in the program amid all her responsibilities. I owe Dr. Gass my very early beginnings with research. It was through her review of incidental learning in volume 21 of the journal SSLA that my MA thesis started to develop, and I have been hooked on incidental learning ever since. I also owe Dr.
Gass my current expertise in Arabic teaching, as she was the one who provided this opportunity for me to support my studies. I am truly honored to have her on my dissertation committee. Everyone knows Susan Gass as a prominent scholar in the field, but we as the SLS family consider ourselves the most privileged to have known her personally.

I express my gratitude to all my professors who shaped my identity in the field in the early stages of my degree. I will always remember Dr. Debra Hardison, Dr. Shawn Loewen, and Dr. Patricia Spinner because every one of them had an impact on me in a different way. Special thanks go to Dr. Paula Winke and Dr. Charlene Polio for their support as readers and their guidance through my years of study. Dr. Winke is an example of a dedicated professor who gives outstanding care to her students. I loved her smooth style in class and her solid research practice. Dr. Polio is very approachable and flexible. Although I had little opportunity to work with her, she taught me a great deal about L2 writing and opened a new avenue of research that I anticipate developing further in my future academic career. I am fortunate to have them on my dissertation committee for their dedication and constructive feedback.

I also express my best wishes to all the SLS peers who made life easier when we came together and shared our thoughts and challenges. I appreciate the good times, conversations and conference journeys I shared with Scott, Dominik, Jens, Roman and all the old and new members of the SLS family. I particularly extend my gratitude to Sehoon Jung, with whom I shared many days of so-called ‘study union’, although I frequently broke my promises to him, yielding to my family responsibilities. I will always cherish our memories together and truly appreciate his friendship.

In the background of all this, some people stayed in the shade, patiently adapting to my hectic work style. These were my wife Samaa and my two kids Ahmed and Omar.
At times we went through a lot of tension that threatened our relationship, but Samaa was so patient and accepted the challenge. She suffered through my absence from home and my absent mind at home. As I write these lines, it happens to be her birthday. I wish her wonderful days ahead, and I am sure she will be the happiest person when she realizes that her effort and patience were not in vain. I extend my best wishes to my son Ahmed, who is finishing 3rd grade, and to little Omar, three and a half years old, almost the time I have spent on my PhD research papers. I can simply say that I missed my kids so much.

I am proud to be the first and only Egyptian Spartan in the SLS department to date. Egypt is a part of my personality and a part of who I am. The start of my PhD journey coincided with the first spark of extraordinary events and sudden changes in Egypt, which I only followed at a distance. As I am wrapping up my degree, I hold high hopes for my country to rise up and gain the fruits of its revolution.

A final message that I have for everyone who knew me: please forgive my shortcomings; I am not a perfect person in any way. I believe that we will meet again, for surely it is a small world.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
CHAPTER 1: REVIEW OF THE LITERATURE
  1.1 Incidental vocabulary learning
  1.2 Empirical research on incidental learning
  1.3 Vocabulary learning from L2 reading
    1.3.1 Extensive reading and L2 vocabulary
    1.3.2 Contextual richness and vocabulary learning
  1.4 Interim summary
  1.5 Cognitive perspectives on lexical learning
    1.5.1 Eye tracking
    1.5.2 Eye movement models
    1.5.3 Eye movement research in reading
  1.6 General summary
  1.7 Goals of the study
CHAPTER 2: CURRENT STUDY
  2.1 Research questions
  2.2 Participants
  2.3 Material
    2.3.1 Background questionnaire
    2.3.2 Vocabulary size test
    2.3.3 Reading material
    2.3.4 Target words
    2.3.5 Comprehension packet
    2.3.6 Reading perception questionnaire
    2.3.7 Vocabulary tests
  2.4 Procedure
    2.4.1 Apparatus
    2.4.2 The reading session
    2.4.3 The testing session
    2.4.4 Modified cloze procedure
  2.5 Analyses
    2.5.1 Definition of variables
    2.5.2 Data structure
    2.5.3 Statistical tests
    2.5.4 Reporting results
CHAPTER 3: RESULTS
  3.1 Encounter and predictability data
  3.2 Online reading patterns
    3.2.1 First fixation durations
    3.2.2 Gaze durations
    3.2.3 Total reading times
    3.2.4 Reading behavior
    3.2.5 Summed reading times
    3.2.6 Interim summary
  3.3 Vocabulary knowledge gains from reading
    3.3.1 Descriptive statistics
    3.3.2 Text-based characteristics and vocabulary learning
    3.3.3 Real time processing and vocabulary learning
    3.3.4 The role of cumulative online processing in vocabulary learning
  3.4 Individual differences in learning from reading
  3.5 General summary of results
CHAPTER 4: DISCUSSION
  4.1 Lexical processing in repeated encounters
  4.2 Text-based effects on vocabulary learning
  4.3 Early indicators of vocabulary intake
  4.4 Combined measures of attention and exposure
  4.5 Overview
CHAPTER 5: CONCLUSION
  5.1 Summary of the findings
  5.2 Practical and pedagogical implications
  5.3 Limitations and further research
APPENDICES
  Appendix A: Participant Information
  Appendix B: Background questionnaire
  Appendix C: Sample of reading material ‘Goodbye Mr. Hollywood’
  Appendix D: Sample page from the comprehension packet
  Appendix E: Reading perception questionnaire
  Appendix F: Form recognition test
  Appendix G: Meaning recall test
  Appendix H: Meaning recognition test
  Appendix I: Modified cloze task
  Appendix J: Token predictability data
REFERENCES

LIST OF TABLES

Table 1 The role of exposure and predictability in vocabulary learning
Table 2 The role of exposure and predictability in eye movement studies
Table 3 Lexical profile of 'Goodbye Mr. Hollywood'
Table 4 Pseudo forms and their frequency in the reading text
Table 5 Definitions of variables and terminology in the study
Table 6 Encounter and predictability data for target vocabulary
Table 7 Effects of text-based factors on first fixation durations (FFD)
Table 8 Effects of text-based factors on gaze durations (GD)
Table 9 Effects of text-based factors on total reading times (TFD)
Table 10 Mean percentages of skipping and regressions on target and control words
Table 11 Effects of text-based factors on skipping and regression rates
Table 12 Mean summed fixation measures by exposure bands
Table 13 Effects of text-based factors on summed processing times
Table 14 Average word gains for the vocabulary post tests
Table 15 Effects of exposure and predictability on form recognition
Table 16 Effects of exposure and predictability on meaning recognition
Table 17 Effects of exposure and predictability on meaning recall
Table 18 Token-based predictors of form recognition
Table 19 Token-based predictors of meaning recognition
Table 20 Token-based predictors of meaning recall
Table 21 Regression output of the online vs. text-based predictors of form recognition
Table 22 Regression output of the online vs. text-based predictors of meaning recognition
Table 23 Regression output of the online vs. text-based predictors of meaning recall
Table 24 Mean responses on the reading perception questionnaire
Table 25 Participants' proficiency and vocabulary size chart
Table 26 Estimated predictability for target tokens

LIST OF FIGURES

Figure 1. Data structure for participants and target words
Figure 2. Mean fixation times (in milliseconds) on target and control words by encounter
Figure 3. The interaction of condition and predictability in first fixation durations (FFD)
Figure 4. Scatter plot for mean first fixation durations by encounter and condition
Figure 5. Mean gaze durations (in milliseconds) on target and control words by encounter
Figure 6. Scatter plot for gaze durations by encounter and condition
Figure 7. The interaction of condition and predictability in gaze durations (GD)
Figure 8. Mean total reading times for target and control words by encounter
Figure 9. Scatter plot for mean total durations by encounter and condition
Figure 10. The interaction of condition and predictability in total fixation durations (TFD)
Figure 11. Mean vocabulary gains in the vocabulary posttests by exposure bands
Figure 12. Mean percentages of word gains by context type
Figure 13. The interaction between exposure and predictability in form recognition
Figure 14. The interaction of exposure and context in meaning recognition
Figure 15. The interaction of exposure and context in meaning recall

INTRODUCTION

Second language research has shown that reading plays an important role in the development of learners’ vocabulary knowledge beyond what language classes and textbooks can offer (Coady and Huckin, 1997; Grabe and Stoller, 1997, 2002; Hill and Laufer, 2003; Horst, 2005; Huckin, Haynes and Coady, 1993; Huckin and Coady, 1999; Kweon & Kim, 2008; Matsuoka & Hirsh, 2010; Luppescu and Day, 1993; Nagy, Anderson and Herman, 1987; Nation, 2001, 2006; Schmitt, 2008, 2010; Zimmerman, 1997). This type of learning has mainly been characterized as incidental because it occurs in the context of a meaning-oriented task with no intentional emphasis on vocabulary (e.g., Fraser, 1999; Paribakht and Wesche, 1999; Pulido, 2007; Waring and Nation, 2004; Watanabe, 1997). Various factors have been posited to facilitate incidental acquisition from written input, such as type of task (Brown, Waring and Donkaewbua, 2008; Cho and Krashen, 1994; Hulstijn, 1992; Hulstijn, Hollander, and Greidanus, 1996; Hulstijn and Trompetter, 1998; Knight, 1994), repeated exposure (Horst, Cobb and Meara, 1998; Pigada and Schmitt, 2006; Rott, 1999; Webb, 2007) or context properties (Haastrup, 1989; Joe, 2010; Nagy, 1987; Nassaji, 2003; Webb, 2008; Zahar, Cobb and Spada, 2001). Although incidental learning has been challenged as slow and inefficient in terms of acquisition and retention (e.g., Laufer, 2003, 2005; Macaro, 2003; Read, 2004), many researchers and teachers believe it is an essential supplement for learners to expand their vocabulary independently (see Schmitt, 2008, 2010).
The argument is that learners face a lexical coverage challenge, given that knowledge of 8,000-9,000 word families is required to achieve adequate comprehension of an authentic English text (Hirsh and Nation, 1992; Hu and Nation, 2000; Nation, 2001, 2006; Nation and Wang, 1999; Waring and Nation, 2004; Webb, 2010). Therefore, ESL programs usually incorporate an extensive reading component in their curricula, taking advantage of graded readers, which are linguistically and lexically adjusted to learners’ levels of competence and can support a smooth transition to unsimplified reading material and bridge the lexical coverage gap (Horst, 2005; Uden, Schmitt & Schmitt, 2014). In fact, extensive reading has been valued for its role in developing reading speed and fluency, reinforcing existing lexical knowledge and providing incidental learning opportunities for less frequent vocabulary (e.g., Cho and Krashen, 1994; Day and Bamford, 1998; Elley, 1991; Grabe and Stoller, 2002; Parry, 1991).

The potential of extensive reading to enhance vocabulary knowledge has been widely investigated (e.g., Cho and Krashen, 1994; Day, Omura and Hiramatsu, 1991; Horst, 2005; Hulstijn, Hollander and Greidanus, 1996; Pitts, White and Krashen, 1989; Saragi, Nation and Meister, 1978). A significant role was shown for repeated exposure (Horst, Cobb & Meara, 1998; Pellicer-Sanchez and Schmitt, 2010; Waring and Takaki, 2003; Webb, 2005), while less conclusive results were reported for the role of context quality and lexical inference in vocabulary learning outcomes (Fraser, 1999; Haastrup, 2008; Hu, 2013; Joe, 2010; Nassaji, 2003; Webb, 2008; Zahar, Cobb, and Spada, 2001). Most studies on incidental learning from reading were paper-and-pencil based, with outcomes measured through posttests or self-report.
In an attempt to explain the trend of results in vocabulary studies, Schmitt (2008, 2010) emphasized the role of engagement with target vocabulary, which, in his view, can be triggered by different factors including repeated exposure, increased noticing and increased time spent on target words. Expanding the concept of engagement, Hulstijn and Laufer (2001) relied on insights from Schmidt’s (1990) noticing hypothesis and Craik and Lockhart’s (1972) depth of processing hypothesis to introduce the involvement load hypothesis as a motivational cognitive construct for interpreting and predicting the findings of vocabulary learning studies from a cognitive perspective. However, the hypothesis and its experimental replications (Laufer & Hulstijn, 2001; Kim, 2008; Keating, 2008; Mohamed, in press; Yaqubi, Rayati, & Gorgi, 2010), while informative, could not account for all facets of vocabulary learning because their findings applied only to controlled vocabulary-focused tasks and not to a natural reading setting. The concepts of engagement and noticing, as cognitive processes underlying lexical learning, have been retrospectively discussed in studies that used think-aloud protocols or interviews (e.g., Fraser, 1999; Haastrup, 1991; Rott, 2005) as well as within the involvement load framework, but they were not empirically measured.

Because it was difficult to measure these cognitive processes offline, vocabulary researchers took advantage of eye tracking, an advanced psycholinguistic method that has been posited to capture real-time processing. This method can be adopted in L2 reading studies to track moment-by-moment processing of input, based on the assumption that eye movements accurately reflect ongoing cognitive processes in a learner’s mind. This assumption, termed ‘the eye-mind link’, proposes a connection between overt and covert attention (see Rayner, 1998, 2009 for a review).
Reading behavior studies investigated the processing of short sentences and paragraphs in terms of repeated exposure (e.g., Hyönä & Niemi, 1990; Raney & Rayner, 1995; Rayner, Raney, & Pollatsek, 1995) and context predictability (Altarriba, Kroll, Scholl, & Rayner, 1996; Ashby, Rayner, & Clifton, 2005; Balota, Pollatsek, & Rayner, 1985; Clifton, Staub & Rayner, 2007; Ehrlich & Rayner, 1982; Juhasz & Pollatsek, 2011; Kliegl, Grabner, Rolfs & Engbert, 2004; Liversedge & Rayner, 2011; Rayner & Well, 1996; Rayner & Clifton, 2005; Wochna & Juhasz, 2013). However, none of these studies specifically looked at learning opportunities as related to lexical processing. In this regard, Williams and Morris (2004) and Brusnighan and Folk (2012) found a systematic relationship between online processing patterns and retention of novel word meanings in reading comprehension. Godfroid, Boers, and Housen (2013) explained this association in terms of attention to novel words, maintaining that fixation times reflected the amount of attention to lexical items and predicted their subsequent recognition.

The current picture of incidental learning from reading thus points to two distinct strands of research. Mainstream vocabulary studies have shown strong evidence for the positive role of repetition yet mixed results on context effects, which may be due to the inconsistency of the context rating methods adopted in these studies. Eye movement studies, on the other hand, have widely examined context predictability effects on reading times using standardized norming procedures, but they have investigated repeated exposure far less, and very few attempts have been made to link processing with acquisition. The interaction between repetition and context quality was not directly investigated in these studies either. Finally, eye movement research in this area was based on sentence or paragraph reading, which makes its results less generalizable to longer texts or to an extensive reading setting.
In the current study, I attempt to bring together the two strands of research by borrowing methods from both extensive reading research and eye movement studies to provide a picture of lexical processing in the natural reading of novels and to obtain a real-time record of incidental learning of vocabulary from L2 reading. The initial hypothesis that motivates the current study is that exposure to novel lexical items during leisure reading invites some attention to form and meaning, which may be reflected in processing time and provide opportunities for incidental intake and retention. These opportunities are likely to be mediated by a hypothesized interaction between exposure frequency and context predictability.

To address the research questions, I implement eye-tracking methodology to investigate the online aspects of incidental vocabulary learning from an English graded reader, Goodbye Mr. Hollywood, which is a stage 1 short novel made available through Oxford University Press. The main goal of the study is to track the cognitive effects of repeated exposure and context predictability on English learners’ reading patterns, and to examine whether eye-movement reading measures can predict the development of different components of vocabulary knowledge, including form and meaning recognition and meaning recall.

This dissertation is organized into five chapters. In chapter 1, I review areas of the literature on vocabulary acquisition from reading and relevant eye movement studies. Chapter 2 describes the design, procedures, materials and research questions of the current study, and chapter 3 reports the results for these research questions. In chapter 4, I discuss the findings in light of the research questions. Finally, in chapter 5, I summarize the findings of the study, discuss pedagogical implications, address limitations, and make recommendations for future research.
CHAPTER 1: REVIEW OF THE LITERATURE
1.1 Incidental vocabulary learning
Incidental vocabulary acquisition is defined as the process of learning new words from meaningful input or meaning-based activities such as reading, listening, or interaction that has no particular focus on lexical items (Paribakht & Wesche, 1999; Richards & Schmidt, 2002). Earlier conceptualizations of incidental learning varied in how they distinguished it from intentional learning. Ellis (1994, 1999) distinguished them under two types of attention, arguing that in incidental learning the learner’s primary attention is placed on meaning while allowing a secondary attention to be directed to form. Similarly, Hulstijn (2001, 2003) maintained that both types of learning must involve attention to varying degrees but the difference is that in incidental learning one does not intend to commit input to memory. Gass (1999) took a more conservative view, stating that incidental learning is more likely to be subconscious and less likely to involve deliberate attention or an active role from the learner. Bruton, Garcia Lopez and Esquiliche Mesa (2011) argued that what is characterized as incidental can be in some fundamental sense ‘intentional’ at least from the learner’s perspective. Because paper-and-pencil studies could not track the existence or the amount of attention, they adopted a methodological distinction, derived from psychology, that learning outcomes were deemed incidental when learners were not expecting to be tested on the input they received (Hulstijn, 2001, 2003). Several factors were hypothesized to encourage incidental learning including input factors such as word properties, salience and repetition, or individual factors such as proficiency, vocabulary size, increased attention and time devoted to target input, learner’s first language, learning strategies and background knowledge (see Schmitt, 2008 for a review).
Although the incidental learning rate was described as lower than that of intentional learning, it is now widely acknowledged in language pedagogy that both modes of learning complement each other in the process of learners’ incremental vocabulary development.
1.2 Empirical research on incidental learning
Early studies on incidental vocabulary acquisition were inspired by the interaction hypothesis (Long, 1985, 1996), which stated that communication and negotiation of meaning serve as a vehicle for language development (see Gass, Behney, & Plonsky, 2013 for a review). Numerous studies found support for this hypothesis in language development in general, particularly question formation (e.g., Gass & Varonis, 1994; Polio & Gass, 1998; Swain & Lapkin, 1998; Mackey & Philp, 1998). Following similar designs, it was also found that incidental vocabulary acquisition occurred as a byproduct of negotiation and output within interaction and speaking tasks (Ellis, Tanaka & Yamazaki, 1994; Ellis & He, 1999; de la Fuente, 2002; Brown, Sagers & LaPorte, 1999). Listening tasks were found to be conducive to vocabulary learning yet with lower rates than interaction tasks (Brown, Waring, & Donkaewbua, 2008; Elley, 1989; Smidt & Hegelheimer, 2004; Vidal, 2010). Some classroom research reported learning outcomes from spontaneous class interaction and teaching activities (e.g., Dobinson, 2001; Mohamed, 2012). Horst (2010) contributed to this line of research with a corpus-based study that indicated many opportunities for incidental intake from teacher-talk and classroom communication. Text-based tasks were more frequently investigated in vocabulary studies. Research in this area promoted engagement in reading tasks, either by manipulating word presentation and saliency in text or administering different tasks with varying degrees of complexity.
For example, learners who inferred the meanings of certain words by having to choose from options provided retained words better than another group who were only provided the meanings of target words in a gloss (Hulstijn, 1992). Looking up meanings in a dictionary was a more effective task than encountering meaning in marginal glosses (Hulstijn, Hollander, & Greidanus, 1996). Reading followed by vocabulary-focused exercises yielded better retention than reading with inferring meaning from context (Paribakht & Wesche, 1997). Reading combined with dictionary usage was more beneficial than reading only (Cho & Krashen, 1994; Knight, 1994; Luppescu & Day, 1993). Using words in a composition was more effective than only encountering words in reading comprehension (Hulstijn & Trompetter, 1998). To find a general interpretation of the common findings in vocabulary studies, Schmitt (2008) referred to engagement with lexical items as a key factor in vocabulary learning. Engagement, in his view, can be fostered by many factors, including, but not limited to, frequency of exposure, increased attention to target words, and increased time spent on the target items. In line with this claim, Watanabe (1997) and Peters, Hulstijn, Sercu and Lutjeharms (2009) found that the text input which affords increased processing due to contextual, lexical or semantic enhancement is more likely to yield more vocabulary gains (see Rott, Williams & Cameron, 2002; Rott & Williams, 2003). Beyond paper-and-pencil results, some researchers have presented cognitive interpretations for vocabulary learning outcomes. Studies that used think-aloud protocols or interviews might have been the first to probe into the cognitive processes underlying lexical acquisition (e.g. Fraser, 1999; Haastrup, 1991; Paribakht & Wesche, 1999; Rott, 2005).
In an attempt to drive the theory-building process, Laufer and Hulstijn (2001) introduced the involvement load hypothesis to account for the pattern of results observed in previous literature. The hypothesis was based on an analysis of the cognitive and motivational involvement imposed by any given L2-vocabulary task. Involvement, a cognitive-motivational construct, was defined as the combined effects of need, search and evaluation. Tasks that induce higher involvement were hypothesized to produce higher vocabulary gains. The hypothesis received empirical support from several studies (e.g. Huang, Willson & Eslami, 2012; Hulstijn & Laufer, 2001; Keating, 2008; Kim, 2008). It also generated further research questions; for example, Jing and Jianbin (2009) validated it in listening comprehension tasks while Eckerth and Tavakoli (2012) investigated a combination of involvement and repetition factors on vocabulary learning. Some counterevidence was reported regarding the re-evaluation of the components of the hypothesis in input vs. output-based tasks (Folse, 2006; Yaqubi, Rayati & Gorgi, 2010) and the role of individuals’ accuracy in task performance on learning outcomes (Mohamed, in press). In general, the hypothesis can explain a good amount of the variance in incidental learning studies, yet it is not directly applicable to natural reading settings or leisure reading, which is proclaimed to have a significant role in learners’ vocabulary development.
1.3 Vocabulary learning from L2 reading
Teachers and researchers generally agree that leisure reading beyond class material is a recommended path for lexical development above and beyond the most frequent vocabulary bands. However, when learners are directed to extensive reading of authentic text, they usually face a lexical coverage challenge.
Nation (2001, 2006) calculated that the percentage of known words in a text should range between 95% and 98% in order for learners to obtain a sufficient comprehension level. It was thus calculated that authentic novels require at least a vocabulary size of 8000 to 9000 word families for adequate comprehension and new vocabulary intake (Hu & Nation, 2000; Nation & Wang, 1999; Waring & Nation, 2004). Because it can take several years for L2 learners to reach higher levels of vocabulary size, extensive reading programs have taken advantage of simplified graded readers that are systematically adjusted to different levels. One important advantage of these readers is that they can provide spaced repeated exposures to new and low-frequency vocabulary and reinforce partially known words, which is an ideal setting for incremental vocabulary development.
1.3.1 Extensive reading and L2 vocabulary
Grabe and Stoller (2002) defined extensive reading as reading that exposes learners to “large quantities of material within their linguistic competence” (p. 259). Proponents of extensive reading reported its value in increasing reading fluency, reading comprehension, and speed of access to frequent words as well as providing opportunities to meet new words, infer new meanings and build larger mental lexicons (Day & Bamford, 1998; Elley, 1991; Horst, 2005; Lai, 1993; Parry, 1991). One important benefit of extensive reading was reported by Uden, Schmitt, and Schmitt (2014), who found evidence that graded readers can support a smooth transition to authentic novel reading. Several studies investigated the potential of lexical gains from graded readers and authentic novels. The classic study of Saragi, Nation and Meister (1978) used the novel A Clockwork Orange (1962) by Anthony Burgess. It was of particular interest because it included Russian slang words, referred to as Nadsat, which were targeted in reading experiments.
They found that native English speakers were able to learn an average of 76% of 90 Russian slang words used in the novel. Pitts, White and Krashen (1989) used one chapter of the same novel with second language readers and found modest rates of learning, about 6.4% to 8.1% of 30 target Russian words. Day, Omura and Hiramatsu (1991) reported that Japanese EFL learners learned an average of 3 words out of 17 target words encountered in a simplified short story, The Mystery of the African Mask. Horst, Cobb and Meara (1998) had learners read a simplified version of The Mayor of Casterbridge, and reported that learners could pick up an average of 5 words out of the 45 target words. Horst (2005) showed that readers picked up around 51% of the target words from selected extracts of graded readers. A common factor among all these studies was frequency of exposure in that learning chances increased as learners encountered target words more times in the text. In addition to word meaning, the acquisition of other aspects of lexical knowledge was also investigated in extensive reading. Waring and Takaki (2003) used the 400-headword graded reader A Little Princess. They found that learners scored higher in meaning recognition of the target words than productive translation and that scores in both tests dropped sharply after three months. Pigada and Schmitt (2006) found that a French learner showed considerable improvement in word spelling but a lesser command of meaning and grammatical knowledge after one month of extensive reading, especially as exposures to target words increased. Webb (2005, 2007) reported that vocabulary encounters in reading or writing positively reinforced spelling, associations, syntax, grammatical functions, and form-meaning mapping. He found that the group that encountered the target words more than 10 times showed a better grasp of different aspects of word knowledge than other groups who received fewer exposures.
Pellicer-Sánchez and Schmitt (2010) investigated vocabulary learning outcomes from an authentic novel, Things Fall Apart, and found that meaning recognition reached 84% after ten exposures while meaning recall was still around 55%. Taken together, all previous studies suggest that reading yields different outcomes for different aspects of word knowledge, with more substantial gains in meaning recognition compared to other lexical aspects. What is also common is that all these studies point to the effect of repeated exposure; specifically, an average of 8 to 10 repetitions was shown to be appropriate for the development of receptive knowledge of vocabulary with relatively low gains in productive knowledge (Schmitt, 2010). Finally, the amount and quality of learning demonstrated in previous research indicate that incidental learning from reading is possible but retention is not durable unless a learner receives further exposure within a reasonable time span. Schmitt (2008) suggests supplementing extensive reading with an explicit teaching component or activities to enhance engagement and maximize the benefit of exposures. Table 1 summarizes the findings of extensive reading studies and highlights the roles of exposure and context in vocabulary learning.
1.3.2 Contextual richness and vocabulary learning
A basic assumption in learning vocabulary from reading is that learners will use their linguistic resources and lexical inferencing to derive meanings from context and thus be able to retain some knowledge of words if they get repeated over time (Fraser, 1999; Paribakht & Wesche, 1999). Some research indicates that guessing from context is unreliable in learning vocabulary (Laufer, 2005; Nassaji, 2003). In fact, two opposing views were presented in this regard. Schouten-van Parreren (1989) argued that informative contexts support guessing ability, which in turn may transfer to learning.
On the other hand, Mondria and Wit-de Boer (1991) argued that rich context can aid comprehension but it diverts attention from the lexical level and that even correct guessing does not guarantee retention. Mondria (2003) found that meaning inference was time consuming and less efficient than other explicit methods of retention. Along the same lines, Hu and Nassaji (2012) found that ease of guessing affected word retention negatively. Empirical research on context effects reported inconclusive results. Schwanenflugel, Stahl and McFalls (1997) found no evidence for the role of contextual support in vocabulary development of elementary school children. Zahar, Cobb and Spada (2001) found no clear association between the learning outcome and the quality of contexts in which lexical items occurred. Instead, they suggested that variable contexts are favorable for effective inferencing and retention and that unclear contexts can be ideal for triggering more attention at the lexical level, which sets the scene for meaning retention. Similarly, Haastrup (1989) argued that meeting words in less informative contexts invites more cognitive engagement and thus increases chances of meaning recall in subsequent contexts. Webb (2008) investigated context quality and the effect of repeated exposure in a controlled reading study. He found that while repetition supported form recognition, the quality of context was associated more with meaning recognition. This may indicate that a rich context aided guessing and retention to a certain degree. Joe (2010) found that encountering target words repeatedly in a wide range of tasks is more conducive to vocabulary retention than contextual richness. Hu (2013) reached a similar conclusion in that repeated exposure affected knowledge of form while contextual richness was more beneficial to form-meaning connections and grammatical functions.
One possible reason for the somewhat mixed results regarding context effects may be related to the way context predictability has been operationalized. Many studies adopted the classification of contexts provided by Beck, McKeown and McCaslin (1983), which categorizes contexts into misdirective, nondirective, general and directive (Zahar et al., 2001; Hu, 2013). Schwanenflugel, Stahl and McFalls (1997) rated contexts from 1 (low transparency) to 4 (high transparency). Webb (2008) had two native speakers rate the contexts from 1 (misleading) to 4 (high chance of lexical inference). An alternative method of measuring predictability, derived from psycholinguistics, is through a modified cloze procedure where native speakers’ percentage of agreement in predicting the missing word determines the degree of predictability. Schwanenflugel and LaCount (1988), based on previous literature, defined a high constraint cutoff at 78% or above and low constraint at 68% and below. Table 1 summarizes the roles of exposure and context in vocabulary learning from reading studies.
1.4 Interim summary
Up to this point, I have reviewed how early research defined and operationalized incidental learning and tested it in different modalities: speaking, listening, reading, interaction and text-based tasks. I then discussed how extensive reading programs made use of graded readers to provide incidental learning opportunities for ESL students. As Table 1 indicates, extensive reading research showed significant gains of vocabulary in different aspects of form recognition, meaning recognition and recall. Generally, more than 10 exposures was the recommended threshold for substantial word knowledge gains. Similarly, research on text-based tasks showed a significant role of reading in vocabulary learning with established strong effects of frequency of exposure. Fewer studies looked at the role of context in extensive reading settings.
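The modified cloze procedure described in section 1.3.2 can be illustrated with a minimal Python sketch. It assumes that each rater supplies one guess for the missing word and that predictability is the proportion of guesses matching the target. The function names and the "intermediate" label for scores between the two cutoffs are illustrative assumptions; only the 78% and 68% cutoffs come from the text above.

```python
from collections import Counter

HIGH_CUTOFF = 0.78  # high constraint: 78% agreement or above
LOW_CUTOFF = 0.68   # low constraint: 68% agreement or below


def cloze_predictability(responses, target):
    """Return the proportion of rater guesses that match the target word."""
    if not responses:
        raise ValueError("at least one response is required")
    normalized = [r.strip().lower() for r in responses]
    return Counter(normalized)[target.strip().lower()] / len(normalized)


def constraint_level(score):
    """Classify a predictability score using the two cutoffs.

    The 'intermediate' label is a hypothetical name for scores that fall
    between the cutoffs; the source text does not name this band.
    """
    if score >= HIGH_CUTOFF:
        return "high"
    if score <= LOW_CUTOFF:
        return "low"
    return "intermediate"


# Hypothetical norming data: ten raters guess the word missing from a context.
guesses = ["door"] * 8 + ["gate", "window"]
score = cloze_predictability(guesses, "door")
level = constraint_level(score)
```

With these hypothetical guesses, eight of ten raters agree on the target, so the context would fall in the high-constraint band.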
The effect of context was generally unclear or correlated more with meaning recognition than with the knowledge of word form. Adequate measures for context predictability can shed more light on the pattern of reported learning outcomes from extensive reading research. In the following section, I argue that insights from online processing can add to our understanding of the lexical factors that determine vocabulary acquisition from context. I outline how psycholinguistic approaches investigated the same factors from a cognitive perspective, particularly through eye tracking.

Table 1 The role of exposure and predictability in vocabulary learning
Study | Population | Reading material | Vocabulary gains | Effect of exposure | Effect of predictability
Saragi, Nation & Meister (1978) | 20 native English speakers | The authentic novel A Clockwork Orange | 76% of 90 Russian slang words | Minimum 10 exposures for learning | Exposure mitigated by context
Pitts, White & Krashen (1989) | 51 ESL learners | A Clockwork Orange | 6.4%-8% | Not tested | Not tested
Day, Omura & Hiramatsu (1991) | 292 Japanese EFL learners | The Mystery of the African Mask | 17% of 17 target words | Not tested | Not tested
Horst, Cobb, & Meara (1998) | 34 EFL learners | The Mayor of Casterbridge | 20% of 23 target words | Strong effect | Not tested
Zahar, Cobb, & Spada (2001) | 144 ESL students | The Golden Fleece | 2.3 of 30 words | Strong effect | Subordinate to exposure
Waring & Takaki (2003) | 15 Japanese EFL | A Little Princess (25 target words) | 15 word forms; 20 word meanings | +18 exposures | Not tested
Horst (2005) | 21 ESL students | Several graded readers | 17 words | Not tested | Not tested
Pigada & Schmitt (2006) | One French learner | Four graded readers | 8-23% | 20+ exposures | Not tested
Webb (2007) | 121 Japanese EFL | 10 paragraphs | Multiple aspects | 10+ exposures | Not tested
Webb (2008) | 50 Japanese EFL | 30 sentences | Multiple aspects | Effective for form | Effective for meaning
Pellicer-Sánchez & Schmitt (2010) | 20 Spanish EFL | Things Fall Apart | 84% meaning | +10 exposures | Not tested
Joe (2010) | One Turkish ESL | Class material | 77% of 20 words | Strong effect | Not significant
Hu & Nassaji (2012) | 11 ESL learners | Academic text | Significant gains | Not tested | Negative effect
Hu (2013) | One ESL learner | Graded readers | Significant gains | Effective for form | Effective for meaning

1.5 Cognitive perspectives on lexical learning
The previous review points to a possible interaction between exposure frequency and contextual richness that may be responsible for different attention patterns from readers and thus variable learning outcomes. Based on Schmidt’s (1990) noticing hypothesis, vocabulary researchers assume that readers need to notice novel words in context based on text properties or lexical features, and that this pattern of noticing would determine the nature of learning outcomes. However, it is difficult to test this assumption offline because retrospective measures that have been used to track noticing such as note taking, underlining or think-aloud protocols can be less sensitive in capturing moment-by-moment processing of context. Godfroid et al. (2013) reviewed these measures, concluding that a more precise and complete account of cognitive processing during reading can be fulfilled by the eye tracking technique, which can provide a more sensitive measure of the amount and locus of attention during processing.
1.5.1 Eye tracking
Eye tracking is defined as the online recording of learners’ eye movement behavior, which is described in terms of fixation times (how long readers look at interest areas) and saccades (the movement of the eyes from one point to the next) (Godfroid, 2012). Reviews of eye tracking research show that eye movements provide an accurate representation of the cognitive processes in the reader’s mind.
This assumption was coined the ‘eye-mind’ link, which proposes a connection between overt and covert attention (Rayner, 1998, 2009). In reading research, many variables have been tested, including word properties such as frequency, predictability and familiarity, as well as other context variables, in order to examine their effects on reading behavior as measured by eye tracking.
1.5.2 Eye movement models
A large amount of research used recordings of eye movements to explore the psychological processes that control the reading behavior of adult skilled readers (see Rayner, 1998, 2009 for a review). Several computational models were developed to explain the characteristics of reading behavior based on the assumption that there is a strong relationship between lexical encoding and eye fixation measures (Liversedge, Gilchrist, & Everling, 2011; Van Gompel, Fischer, Murray, & Hill, 2007). These models were categorized into serial-attention and parallel-attention models. Serial attention models assume that attention is allocated sequentially to support lexical processing of one word at a time and that lexical processing causes the eye to move from one word to the next (e.g. Reader model: Just & Carpenter, 1980; EMMA model: Salvucci, 2001; E-Z Reader model: Reichle, Rayner, & Pollatsek, 2003). In parallel attention models, processing is shared among neighboring words due to their specific characteristics (e.g. SWIFT model: Engbert et al., 2002). Although no model was claimed to account for the whole picture, the E-Z Reader model was found to be the most comprehensive in linking the lexical recognition process to eye fixations because it provided the assumptions necessary to account for sophisticated observations in reading behavior (Liversedge, Gilchrist, & Everling, 2011).
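The eye fixation measures that these models aim to explain can be made concrete with a small sketch. The Python code below is an illustration under stated assumptions, not the procedure used in this study. It assumes that fixations arrive as (word index, duration in ms) pairs in temporal order, and it uses the conventional definitions of first fixation duration (the first fixation on a word during first-pass reading), gaze duration (the sum of consecutive first-pass fixations before the eyes leave the word) and total time (the sum of all fixations on the word). The function and type names are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class WordMeasures(NamedTuple):
    first_fixation: Optional[int]  # None if the word was skipped in first pass
    gaze_duration: Optional[int]   # None if the word was skipped in first pass
    total_time: int                # all fixations on the word, rereading included


def word_measures(fixations: Sequence[Tuple[int, int]], target: int) -> WordMeasures:
    """Compute three fixation-time measures for one target word.

    fixations: (word_index, duration_ms) pairs in temporal order (assumed format).
    target: index of the word of interest.
    """
    total_time = sum(d for w, d in fixations if w == target)
    first_fixation = None
    gaze_duration = None
    for i, (word, duration) in enumerate(fixations):
        if word > target:
            # The eyes moved past the target before landing on it,
            # so there are no first-pass measures for this word.
            break
        if word == target:
            first_fixation = duration
            gaze_duration = 0
            # Sum the uninterrupted run of fixations on the target.
            for w, d in fixations[i:]:
                if w != target:
                    break
                gaze_duration += d
            break
    return WordMeasures(first_fixation, gaze_duration, total_time)


# Hypothetical record: the reader fixates word 2 twice, moves on, then rereads it.
record = [(0, 210), (1, 180), (2, 250), (2, 190), (3, 200), (2, 230), (4, 220)]
measures = word_measures(record, target=2)
```

In this hypothetical record the first fixation on word 2 lasts 250 ms, the gaze duration is 440 ms, and the total time is 670 ms because the later refixation is counted only in total time.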
Simulations of eye movements in reading studies showed that the E-Z Reader assumptions and the serial attention hypothesis are sufficient to account for reading behavior in alphabetic and non-alphabetic languages (see Pollatsek, Reichle & Rayner, 2006; Rayner, Ashby, Pollatsek, & Reichle, 2004 for a full review). A key assumption of this model is that lexical factors influence when the eyes move in that an early stage called familiarity check triggers the eyes to move to the next word, while a later stage of full lexical access causes covert attention to shift to the next word. The mean time spent on lexical items is the time required for familiarity check, which is influenced by item frequency of occurrence and within-sentence predictability. If the next word is highly frequent or predictable, it will most probably be skipped, being processed entirely parafoveally, in which case a familiarity check stage is initiated for the following word to proceed with reading. In the light of this model, it was found that specific early eye movement measures like gaze duration can exclusively reflect a familiarity check stage in lexical processing (Juhasz & Pollatsek, 2011). An important assumption of this model is that the durations of both familiarity check and lexical access are highly sensitive to lexical factors such as word frequency, word familiarity, repeated exposure, lexical ambiguity, age of acquisition, context predictability, morphology and plausibility (Clifton, Staub & Rayner, 2007).
1.5.3 Eye movement research in reading
Many eye movement studies have looked at native and nonnative speakers’ processing of written input and their responses to different lexical and contextual features. Hyönä and Niemi (1990) used the repeated reading paradigm with Finnish readers. The readers’ fixation times decreased consistently from first to third encounter with target sentences, and the number of their progressive fixations and regressions also decreased.
Similarly, Raney and Rayner (1995) investigated the effects of repeated exposure on native English speakers’ second-reading performance. They found that individuals had shorter reading times, made fewer fixations, and had longer saccades during the second reading of the same text. Moreover, shorter fixation durations were associated with high frequency words, suggesting independent effects of word frequency and repetition on reading times. Rayner, Raney, and Pollatsek (1995) found similar results regarding the effect of three repetitions of lexical items in a given text, and they also found frequency effects after the first two repetitions, but no further differences occurred after that, which indicated that word frequency was mitigated by repetition. Recently, Joseph, Wonnacott, and Nation (2014) found significant decreases in reading times as a function of repeated exposures and shorter reading times for novel words that were presented earlier in the text than later items. Early presented words were remembered more accurately in an offline posttest. Based on these results, they argued for an important effect of age of acquisition on lexical processing and learning. Regarding lexical processing of context, eye movement studies have consistently shown that high context predictability is associated with shorter fixations and more skipping than low predictability contexts. Ehrlich and Rayner (1981) presented passages with target words the meaning of which was constrained (i.e., predicted) by the preceding context. They found that readers fixated more on target words that occurred in low-constraint contexts in the given paragraphs. Moreover, the readers tended to be less sensitive to misspellings of the target words in high-constraint contexts. Similarly, Rayner and Well (1996) found that readers fixated more on target words when they occurred in low-constraint contexts than in medium- or high-constraint contexts.
The probability of skipping was higher in high-constraint contexts compared to other context conditions. Kliegl, Grabner, Rolfs and Engbert (2004) also reported that high predictability increased skipping rates and was associated more with second pass reading. Rayner, Ashby, Pollatsek and Reichle (2004) found that skipping was affected by predictability more significantly in high frequency target words. In contrast, Ashby, Rayner and Clifton (2005) found that lexical frequency and predictability independently affected reading times and patterns of processing. They also found qualitative differences between groups of readers as skilled readers were more sensitive to predictability and more consistent in word recognition patterns than average readers.

Few studies have investigated a potential association between online processing patterns and learning new words. Chaffin, Morris, and Seely (2001) found that the familiarity of target words and context quality (informative or neutral) determined the amount of time readers spent on the target words in that learners fixated the most on novel words encountered in neutral contexts. Williams and Morris (2004) examined the effect of word familiarity in reading comprehension and word recognition. They found that readers spent more processing time on novel words than familiar words, and that there was a systematic relationship between online processing patterns (i.e. reading times) and retention of new word meanings. Brusnighan and Folk (2012) conducted a self-paced reading study on incidental vocabulary learning. They found that readers spent more time processing sentences that contained novel compound words, and that they were able to retain new word meanings from a single exposure. They made a case for a strong relationship between increased processing times and accuracy in vocabulary retention measures.
They stated that skilled readers spend extra time on difficult items to establish form-meaning connections, which results in memory traces that are available for later recall. On the level of context, they found that opaque contexts triggered higher rereading times and slower processing than transparent contexts, which was considered an ideal situation for meaning inference and retention of target words. One recent study that specifically targeted vocabulary in second language reading was conducted by Godfroid et al. (2013). They operationalized attention to novel pseudo words as a quantitative variable reflected in the participants’ eye fixation times during reading. Twenty-eight advanced EFL learners read 12 paragraphs in English with target areas that consisted of known words, pseudo words or a combination of both. Results showed that readers fixated longer on pseudo words than on known words, regardless of whether these pseudo words were combined with appositive cues. There was a significant association between the total fixation time on pseudo words and subsequent recognition of these words in a surprise posttest. Taken together, the eye tracking studies widely investigated both exposure and predictability from a processing perspective yet focused less on learning opportunities as a function of processing patterns. Table 2 summarizes the findings of these studies and their implications for the effects of exposure and context.
1.6 General summary
The previous review hints at how the recent trends in applied linguistics can explain findings from earlier studies of vocabulary acquisition from reading. Tables 1 and 2 summarize the current picture of vocabulary learning from the extensive reading and eye tracking research traditions. Several studies have validated the potential of extensive reading to foster vocabulary learning in general and support the development of different components of word knowledge.
Within this research tradition, many studies has found evidence for the importance of repeated exposure in vocabulary development but less attention was given to the role of context in the process of incidental learning. On the other hand, Table 2 shows that eye movement research in reading has given a considerable focus on context and repetition from a processing perspective with only few attempts to associate online processing with incidental vocabulary acquisition. While extensive reading research primarily investigated second language reading in authentic or simplified lengthy texts, eye movement studies relied more on customized sentences or short paragraphs read by native speakers. 22 Table 2 The roles of exposure and predictability in eye tracking studies Study Population Reading material Effect of exposure Effect of Vocabulary predictability Hyönä and Niemi (1990) 11 Finnish Text of 371 words Decreased fixation times speakers (read 3 times) and longer saccades 28 English 16 short passages Decreased fixation times speakers Read 2 times and longer saccades Ehrlich and Rayner 24 English 48 paragraphs Not tested (1981) speakers Joseph, Wonnacott, & 37 college 16 sentences read Nation (2014) students 5 times Rayner and Well (1996) 18 English 36 sentences Rayner, & Raney (1995) Not tested Not tested Not tested Not tested Shorter fixations and Not tested more skipping Decreased reading times Not tested Retention of earlier Not tested speakers 23 Shorter fixations Not tested Table 2 (cont’d) Study Population Reading material Effect of Predictability Vocabulary More skipping and longer Not tested exposure Kliegel, Grabner, 50 native German Rolfs and Engbert 144 German Not tested sentences second reading times (2004) Rayner, Ashby, 44 Native English Pollatsek and Reichle 32 English Not tested sentences More skipping in high Not tested frequency words (2004) Ashby, Rayner and 44 Native English Clifton (2005) Williams & Morris Not tested sentences 24 Native English 
(2004) Brusnighan and Folk 24 English Shorter reading times for Not tested skilled readers 48 English Not tested Not tested Tested sentences 56 Native English English sentences Not tested Better retention Tested 21 ESL learners 12 paragraphs Not tested Not tested Tested (2012) Godfroid et al. (2013)

1.7 Goals of the study

The goal of the present study is to bring together methods from the extensive reading tradition and eye movement research to investigate the online aspects of incidental vocabulary learning from L2 reading, with a focus on the effects of repeated exposure and context predictability on processing and vocabulary intake. Looking at reading patterns of novel words in context can help us interpret the concepts of engagement and noticing more precisely and associate them with learning outcomes. Tracking moment-by-moment interaction with the text can provide a cognitive picture of the factors that increase or decrease attention to target words, as reflected in online fixation measures. I aim to investigate the role of different fixation measures in predicting vocabulary intake; in other words, how repeated exposure and context cues provide opportunities for readers to combine bits of information about novel words over successive encounters. To investigate the holistic effects of attention and exposure, I also aim to test the role of summed online measures (the total times readers spent on individual target words) in predicting the variance in vocabulary outcomes, and whether these processing aspects override or support the roles of total exposure and predictability of novel words in an L2 reading environment.

CHAPTER 2: CURRENT STUDY

2.1 Research questions

The current study is guided by the following research questions:

(1) How do learners of English in the study process novel lexical items in silent reading relative to known control items?
And how do repeated encounters and predictability influence lexical processing of pseudo words compared to control words in the text?

(2) What are the text-based effects of repeated exposure and predictability of novel target words in an L2 English text on the acquisition of receptive and productive knowledge of form and meaning of target words in vocabulary posttests?

(3) To what extent do moment-by-moment eye fixation times on successive encounters with target words predict the learning gains of L2 readers in the vocabulary knowledge posttests?

(4) To what extent do summed reading times of target words predict successful form and meaning gains in vocabulary posttests? And how do online predictors compare to text-based effects of exposure and predictability on vocabulary learning from reading?

2.2 Participants

The participants in this study were 42 advanced second language learners of English (22 females and 20 males) ranging in age from 19 to 35 (M = 22, SD = 4.2). Thirty participants were undergraduate international students with diverse majors who were also enrolled in advanced ESL reading and writing classes, and 12 participants were graduate students mostly majoring in scientific and engineering fields. Participants represented different language backgrounds including Chinese (N = 13), Arabic (N = 4), Spanish (N = 5), Portuguese (N = 5), Japanese (N = 5), African languages (N = 5) and Hindi (N = 2), in addition to single representations for Korean, Polish and Russian. Proficiency levels were determined based on self-reports of recent TOEFL IBT scores that ranged from 79 to 100 (M = 89, SD = 7.3). The minimum of 79 in TOEFL is the cutoff required for undergraduate studies at MSU. Their vocabulary sizes, measured at the 5k level using Meara’s (1992) vocabulary size test, yielded an average of 3908 (SD = 659). Detailed information about participants’ backgrounds and levels is provided in Appendix A.
2.3 Material

2.3.1 Background questionnaire

A one-page language background questionnaire was prepared to collect basic information about participants’ native languages, majors of study, years of English learning and other languages spoken or used. Participants were also asked to provide the most recent TOEFL IBT or any other proficiency test score they had received, in addition to self-rated proficiency on a scale from 1 to 9 in the areas of reading, writing, vocabulary and overall proficiency. A sample of the questionnaire is shown in Appendix B.

2.3.2 Vocabulary size test

To confirm that students’ vocabulary levels matched the selected reading material, a yes/no vocabulary size measure, adapted from Meara (1992), was administered to each participant prior to the experimental session. The test comprised 5 levels targeting the first 5,000 most frequent words according to Nation (2001). Each level contained 60 words (40 real words and 20 non-words). The score on each level was calculated by weighing hits (real words checked as known) against false alarms (non-words checked as known). A participant’s vocabulary size at 5k was estimated as the sum of scores across the five levels multiplied by 10. This particular test was selected for its quick administration and because the experimental reading material targeted the 5k level in lexical coverage. Examples of this test can be found online at (http://www.lextutor.ca/tests/).

2.3.3 Reading material

In selecting appropriate reading material for the study, it was essential to have a text that was appropriate to learners’ level of English in terms of language structure and lexical coverage. It was equally important to provide an amount of lexical richness with a good spread of vocabulary items showing variable repetition patterns.
A third criterion for the text was to maintain a reasonable length while taking into consideration the practical issues of implementing the eye tracking methodology in reading. After screening several collections of modern short novels, it was found that graded readers would be more relevant for the purposes of the study because they were no less authentic and they would easily satisfy the requirements for length and controlled lexical features. The search for a graded reader involved consultations with ESL teachers and browsing the library resources of the English Language Center. Several short novels were inspected for content and length and then run through the Range software (Heatley, Nation and Coxhead, 2002), which lists the words in a given text according to their frequency and word families. The final selection was the short novel Goodbye Mr. Hollywood by John Escott, which is a stage 1 (400 headwords) graded reader made available through the Bookworms Library, Oxford University Press. It is available in print with a word count of 5400 (642 types and 372 word families) and is classified under thriller and adventure stories. The text was cut down to 4649 words (595 types and 394 word families) by adjusting encounters of target words and taking out unnecessary details. Accordingly, the lexical density (types/tokens ratio) was not high (12.9%). Range output confirmed that the lexical coverage of the story is at the 5,000 word level. Table 3 outlines the lexical distribution of the text across frequency levels.

Table 3
Lexical profile of “Goodbye Mr. Hollywood”

Word List          Tokens (percentage)   Types (percentage)   Families
1000               4074 (87.6%)          479 (80.5%)          328
2000               309 (6.65%)           61 (10.25%)          49
3000               38 (0.82%)            12 (2.02%)           9
4000               13 (0.28%)            6 (1.01%)            4
5000               22 (0.47%)            6 (1.01%)            4
Not in the lists   193 (4.15%)           31 (5.21%)
Total              4649                  595                  394

The lexical profile for the novel shows that the text is densely populated with high-frequency vocabulary (the first 1000 words) and less populated with words at the 2000 level, while very few tokens appear from the rest of the levels. The tokens not in the lists consisted only of proper nouns and names of people and places. Two versions of the original text were created, which differed only in the target words assigned to each one. To determine the list of target words, the Range lexical analysis was inspected for the frequency (number of occurrences) of certain vocabulary items. A sample chapter of the original story is provided in Appendix C.

2.3.4 Target words

The final list of the target words consisted of 40 items with occurrences ranging from 1 to 30. These words were equally split into two lists (20 items each), of which a given participant saw one list as experimental items (i.e., pseudo words) and the other list as familiar English controls. Each vocabulary item in the first list matched another item in the second list in part of speech and in number of letters and syllables. Because the graded reader consisted entirely of familiar words that were estimated to be part of participants’ lexical repertoire, the experimental items in each version were replaced by matching pseudo words retrieved from online resources, especially the ARC Nonword Database (http://www.cogsci.mq.edu.au/~nwdb/), and from previous vocabulary research (Godfroid et al., 2013; Webb, 2007, 2008). The pseudo words in one version of the story appeared in their familiar forms in the other version and vice versa.
With this procedure, the two versions were counterbalanced and every pseudo word in a given context had a familiar counterpart in the other text version. To minimize item effects, a single pseudo form was made to substitute for two different words: one in each story version. A large list of pseudo words was passed to two native speakers, who intuitively edited or excluded some of them. Additionally, the same list of pseudo forms was passed around in an ESL class with 9 international students, who were asked to judge whether the listed items could be actual English words. These students did not participate in the actual study. Taking all feedback into consideration, a finalized list of 20 pseudo items was created to substitute for the 40 target items in the experiment. Each pseudo item matched the real word in number of letters and syllables to minimize visual effects on eye movements. Table 4 outlines the target words in the two versions, the number of times they appeared in the text and their substitute pseudo words. The total number of pseudo tokens in each version was 121, which accounted for 2.6% of the total tokens in the text. This guaranteed that the reading material provided approximately 97.4% lexical coverage, which falls within the recommended lexical coverage range (95%-98%) to ensure reading comprehension and the ability to guess novel words from context (Nation, 2006). Based on these criteria, pseudo words were inserted and the text was divided into shorter parts (seven chapters) and shorter paragraphs in preparation for programming.
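The coverage figures reported above follow from simple arithmetic on the token counts. A minimal sketch in Python, assuming (as the text implies for a stage 1 graded reader) that every token other than the pseudo words counts as known to the reader; the function and variable names are illustrative only:

```python
# Counts reported in the text: 121 pseudo-word tokens in a 4,649-token story.
TOTAL_TOKENS = 4649
PSEUDO_TOKENS = 121


def coverage(total_tokens: int, unknown_tokens: int) -> float:
    """Percentage of running words assumed known (assumption: all
    tokens except the pseudo words are familiar to the reader)."""
    return 100 * (total_tokens - unknown_tokens) / total_tokens


unknown_share = 100 * PSEUDO_TOKENS / TOTAL_TOKENS
known_share = coverage(TOTAL_TOKENS, PSEUDO_TOKENS)

print(f"pseudo-word tokens: {unknown_share:.1f}%")
print(f"lexical coverage:   {known_share:.1f}%")
# Check against the 95%-98% range cited from Nation (2006).
print("within recommended range:", 95 <= known_share <= 98)
```

The same function can be reused to see how many additional pseudo tokens a text of this length could absorb before coverage dropped below 95%.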
Table 4
Pseudo forms and their frequency in the reading text

Version A targets   Version B targets   Pseudo words   Number of encounters
hotel               table               fozle          30
café                room                gube           18
face                desk                mave           10
stop                meet                tund           9
tall                busy                leam           7
kill                push                blef           6
party               money               toker          6
pocket              window              bannow         5
bag                 gun                 mot            5
picture             airport             fonteen        4
quiet               happy               dangy          4
garden              letter              windle         4
shirt               dress               neech          3
accident            hospital            redaster       3
rich                cold                dook           2
sleep               drink               tance          1
cinema              camera              pamery         1
famous              hungry              tantic         1
plane               noise               dorch          1
chair               shoes               smick          1

2.3.5 Comprehension packet

A 50-item comprehension test (5-8 items per chapter) was created to monitor readers’ understanding of the main content of the story. The items included a combination of true/false statements and multiple choice questions depending on the content of each chapter. The test was printed on seven pages (one page per chapter) along with illustrations of the characters copied from the story book to foster reader engagement and help readers visualize the content. A sample page of the packet is provided in Appendix D.

2.3.6 Reading perception questionnaire

To gauge readers’ interest and enjoyment during reading, a short 10-item questionnaire, adapted from Uden, Schmitt and Schmitt (2014), was used as a post-reading task. The items were in the form of short statements with a six-point Likert scale where 1 indicates ‘strongly disagree’ and 6 indicates ‘strongly agree’. The statements mainly revolved around readers’ enjoyment, ease of reading and their overall comfort during the experiment. See Appendix E for a copy of the questionnaire.

2.3.7 Vocabulary tests

To obtain a multi-faceted picture of incidental lexical development, it was important to include multiple measures of vocabulary knowledge (Nation, 2001; Schmitt, 2008, 2010). Three vocabulary tests were prepared to measure form recognition, meaning recognition and meaning recall of the target pseudo words.
Only the target words were identical across the tests; distracter items differed. Because the target words carried two different meanings according to the story versions, all the tests were adjusted to accommodate this factor by including the two meaning options in the meaning recognition test and accepting two different responses in the scoring procedures of the meaning recall test. All tests were scored in a binary fashion, where 0 means no response or an incorrect answer and 1 means a correct response.

Form recognition test. This test comprised 100 vocabulary items including the 20 target pseudo words, familiar words from the text and other sources, and pseudo words that did not appear in the text. The instruction for the task was to circle only the words that were seen in the reading material. A copy of the test is found in Appendix F.

Meaning recall test. This test included the 20 target pseudo words in addition to 10 distracter items that represented off-text pseudo words, familiar words from the list and other low frequency English words. The task was to recall meanings, synonyms, related words or semantic fields for the given items. A sample of the test is provided in Appendix G.

Meaning recognition test. This is a multiple choice test with 30 items covering the target words along with additional pseudo words, familiar words and low frequency words. Each item had five meaning options in addition to an ‘I don’t know’ option to minimize guessing. A sample of the test is shown in Appendix H.

2.4 Procedure

2.4.1 Apparatus

Before any participants were invited into the lab, the reading material was programmed into the desk-mounted EyeLink 1000, an eye-tracker manufactured by SR Research (http://www.sr-research.com/). The story was copied into the Experiment Builder and set up in two versions so that a participant could be assigned to one experiment file at the time of participation.
The text was typed in Courier New font, size 18, on a 19-inch computer monitor set up 55 cm from the participants’ eyes. The font color was black on a light grey background. The full experiment file consisted of 87 screens including introductory pages, instructions and break transitions. The main story content was thus provided in 70 screens, each containing 60-70 words in double-spaced text. Minor editing was performed on the displayed text to ensure that target words did not appear at the beginning of screens or at the beginning or end of sentences. Each chapter occupied 7 to 11 screens. Breaks were offered at the end of each chapter in the story. Eye calibration was performed at the beginning of the experiment and after the return from breaks. Participants moved across screens using a button on the right side of a hand-held controller. Drift correction was set up at the beginning of each page. Participants placed their heads on a chin and forehead rest during reading time to minimize head movements.

2.4.2 The reading session

The participants for the study arranged an individual meeting with me in the eye tracking lab run by the Second Language Studies Program at Michigan State University. After signing the consent form and filling in the background questionnaire, they took the vocabulary size test and prepared for the eye tracking session. I started with an introduction to the story and main characters and asked each participant if he/she had read it before and what his/her expectations were about the incidents in the story based on the title and the illustrations of the main characters. I told participants that they would be tested on the content of the story so that they would pay attention during reading, but I did not explicitly forewarn them about any vocabulary testing. Once the participant was warmed up for reading, I gave directions regarding the use of the eye tracking equipment and then performed the initial calibration.
Two participants did not pass the calibration stage so they could not continue with the experiment. I randomly assigned participants to either version A or B in the Experiment Builder. Once a participant finished a chapter, a break prompt appeared on the screen. At the beginning of the break, he/she would take the comprehension packet from the side of the desk and respond only to the questions on the chapter he/she had just finished. Whenever he/she was ready, the participant would return to the chin rest and perform calibration for the following chapter. The same procedure continued with the rest of the chapters. I gave a longer break around the middle of the story (around chapter 4 or 5) and provided snacks for the reader. When the participant reached the end of the story, he/she would complete the last page in the comprehension packet and then move away from the eye tracker to another desk in the lab. The reading session for each participant, including calibration, breaks and comprehension check, typically took 45 to 70 minutes.

2.4.3 The testing session

The testing session started with the reading perception questionnaire to gauge participants’ attitudes and feelings about the eye tracking experience and the story. They then took the vocabulary tests in the following order to avoid transfer effects: form recognition, meaning recall and meaning recognition. The testing session typically took 10 to 15 minutes and it was the final task required from participants.

2.4.4 Modified cloze procedure

To retrieve predictability information for the target words, it was necessary to look at the text from a native speaker perspective. A norming study was designed in which the two original versions of the story, with target words deleted from context, were circulated online to native speakers of English, who were asked to intuitively fill in the gaps with appropriate words.
This procedure was termed in previous research the modified cloze procedure (Schwanenflugel and LaCount, 1988; Rayner and Well, 1996). A high percentage of agreement on a specific item in a given context would be interpreted as strong predictability for the vocabulary item, and a lack of agreement would mean low or zero predictability. To create a user-friendly cloze task, I worked with a doctoral candidate in computer science engineering to build a web-based interactive survey that could be easily administered and analyzed. The story versions were provided in the same format in which L2 readers saw them but with 121 gaps representing the target tokens in each version. Participants were asked to log in to one version only. To maximize responses, every chapter was presented in a single web page with a submit button at the bottom so that once a respondent submitted a chapter, the answers were recorded on the server. An incentive of a $50 drawing was announced to encourage more respondents. The survey was open for three weeks and then closed for data analysis. A sample snapshot of the survey is shown in Appendix I.

A total of 136 entries were recorded on the server for the whole survey. All respondents were undergraduate and graduate native speakers of English at Michigan State University. After cleaning procedures and exclusion of blank entries and non-native respondents in both versions of the story, a sample of 108 valid entries was considered (56 in version A and 52 in version B). The output was organized around target words, with each column recording the entries for a specific gap in the text. Predictability was calculated as the proportion of correct answers out of the total number of responses for an item. In previous literature, an agreement percentage of 78%-100% was considered high predictability, 55%-77% medium and 0%-54% low predictability.
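The predictability score described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the scoring script used in the study: the function names and the sample responses are hypothetical, matching is assumed to be exact after trimming spaces and lowercasing, blank entries are dropped as in the cleaning step, and scores falling between two published bands (for example 77.5) are assigned to the lower band.

```python
def predictability(responses, target):
    """Percentage of non-blank cloze responses that match the target word.

    Assumption: a response counts as correct only if it equals the
    target after stripping spaces and lowercasing.
    """
    cleaned = [r.strip().lower() for r in responses if r.strip()]
    if not cleaned:
        return 0.0
    matches = sum(r == target.lower() for r in cleaned)
    return 100 * matches / len(cleaned)


def band(score):
    """Bands cited from previous literature: 78-100 high, 55-77 medium,
    0-54 low. Scores between bands fall into the lower one (assumption)."""
    if score >= 78:
        return "high"
    if score >= 55:
        return "medium"
    return "low"


# Hypothetical responses for a single gap whose original word was "hotel".
responses = ["hotel", "Hotel", "house", "hotel ", "restaurant", ""]
score = predictability(responses, "hotel")
print(score, band(score))
```

In this hypothetical gap, three of the five non-blank responses match the target, so the token would fall in the medium band.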
For the purpose of the current study, the predictability values were entered as a numerical variable to measure the role of context in online processing and vocabulary learning.

2.5 Analyses

2.5.1 Definition of variables

I distinguish between online and offline effects on vocabulary outcomes. Online variables refer to the information in eye movement records, which includes early measures of processing (e.g., first fixation and first pass time) and late processing measures (e.g., gaze duration and total time). First fixation duration captures the time of the first look at the target area (for example, a novel vocabulary word) when encountered for the first time during forward reading. Gaze duration combines first fixation duration with any other fixation made on the target area at the initial visit before the eyes move forward or backward to the next target area. Total reading time is the sum of all fixation durations on the target area (see Winke, Godfroid, & Gass, 2013). I also report skipping rates, regressions-in and regressions-out of the interest areas. Regressions-in refer to instances when readers returned to a target word after first pass. Regressions-out refer to times when readers went back to a previous part of the sentence on first pass. These processing measures are reported for each of the target tokens as well as summed over vocabulary items to test if eye movement behavior predicts learning outcomes in token-based and item-based analyses.

Offline variables refer to the textual factors of total exposure and predictability. Total exposure is an item-based factor that represents the number of times a vocabulary item was seen in the text. Based on exposure, each item contributed a different number of tokens. The instance of meeting a single token was labeled as an ‘encounter’. In a similar manner, I distinguish between token-based predictability and item-based predictability.
Token-based predictability is the specific predictability score of a given encounter with a word. Item-based predictability is the maximum predictability reported across all tokens of a word. For example, the pseudo word ‘gube’ received a range of predictability scores between 0 and 38.3 over its 18 tokens in version A of the story, and between 5 and 84 in version B. For token-based analysis, all predictability scores of ‘gube’ were used in the model to predict learning success. In item-based analysis, the item ‘gube’ was assigned its maximum reported predictability (38.3 in version A and 84 in version B). In this way, it was possible to test the effect of predictability at each encounter and also test if readers exhibited different learning patterns for vocabulary items based on their predictability levels. To further elucidate the role of context in word learning, I categorized item maximum predictability into two levels: predictable and less predictable. Previous literature has defined a range around 78% as a cutoff for high predictability (Ehrlich, & Rayner, 1981; Rayner, & Well, 1996; Schwanenflugel, & LaCount, 1988). Based on the distribution of predictability data in the study, I set a cutoff point of 77%, yielding an equal number of items in the predictable and less predictable categories.

Because there are two versions of the reading material where target and control words were counterbalanced, each participant contributed reading times to two conditions: experimental and control. The factor of condition was used to describe differences in processing measures and reading behavior between the target words (pseudo words) and control words (familiar English words). Individual factors included learners’ vocabulary size, L2 proficiency, reading speed and reading comprehension scores.
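The distinction drawn above between token-based and item-based predictability can be illustrated with a minimal Python sketch. The token scores below are hypothetical apart from the range endpoints reported for ‘gube’ (0 and 38.3 in version A, 5 and 84 in version B), the names are illustrative only, and treating a score exactly at the cutoff as predictable is an assumption, since the text does not state how ties were handled.

```python
CUTOFF = 77.0  # cutoff reported in the text


def item_predictability(token_scores):
    """Item-based predictability: the maximum score over all tokens."""
    return max(token_scores)


def category(item_score, cutoff=CUTOFF):
    """Two-level split; '>=' at the cutoff is an assumption."""
    return "predictable" if item_score >= cutoff else "less predictable"


# Hypothetical token scores; only the minimum and maximum of each list
# come from the text (not all 18 tokens are shown).
gube_version_a = [0.0, 12.5, 38.3, 20.0]
gube_version_b = [5.0, 84.0, 40.0]

for label, scores in (("A", gube_version_a), ("B", gube_version_b)):
    top = item_predictability(scores)
    print(f"gube, version {label}: item score {top} -> {category(top)}")
```

Token-based analyses would keep every score in each list, whereas item-based analyses would keep only the single maximum per item and version.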
Vocabulary outcomes, scored as 0 or 1 for each item, represented three categorical dependent variables in the statistical models: one for form recognition, one for meaning recognition and one for meaning recall. Table 5 briefly outlines and defines the variables and terminology used in presenting the results.

Table 5
Definitions of variables and terminology in the study

Term: Definition
Condition: Whether the word appeared as a pseudo or familiar token
Encounter: Each target or control token in the reading text
First fixation duration (FFD): The time of the first look at the target word
Gaze duration (GD): The sum of fixations made on the target word at the initial visit
Interest area (target area): A word for which eye movement measures were recorded and analyzed
Item predictability: The maximum predictability score for a vocabulary item across all tokens
Online processing measures: Recorded eye movement times on target and control words
Regression-in: When readers went back to the target word after the first pass
Regression-out: When readers returned to an earlier part of the sentence on first pass
Skipping: The absence of a fixation on a target word at first pass
Summed processing times: The eye movement measures summed for each vocabulary item across all encounters; the cumulative attention measure over specific target words regardless of encounters
Token predictability: The reported predictability for each token in the text
Total fixation duration (TFD): The sum of all fixation durations on the target word
Total exposure: The number of times each vocabulary item was seen in the text

2.5.2 Data structure

A total of 57 data files were screened and reviewed for errors. Several data files were excluded from the analysis because they showed blank or poor captures of eye movements; that is, irregular and/or incomplete recordings.
Therefore, the offline vocabulary tests and other information associated with these recordings were excluded from the data. Forty-two valid samples were considered for analysis. The data sheet was organized by subjects and item information. Each subject contributed 242 observations, representing the total number of experimental and control tokens. In this fashion, the layout of the data showed items nested within subjects, and encounters nested within items. Figure 1 shows an example of this structure for a given reader in the experiment.

Figure 1. Data structure for participants and target words

Based on this hierarchical structure, I adopted a Generalized Linear Mixed Model (GLMM) to fit the appropriate regression that can accommodate multiple levels (Heck, Thomas, & Tabata, 2012). In light of Figure 1, the GLMM has two levels when testing by vocabulary item, so that the model includes only subject variables and item variables as repeated measures. The model expands to three levels when testing at the level of encounter, which includes information about all the tokens of all items.

2.5.3 Statistical tests

Online reading patterns. Reading patterns were averaged from first encounter to last encounter over all items to investigate how processing times changed from early to repeated exposure to target words. I then examined the role of condition, encounter and predictability in the online reading patterns and reading behavior exhibited by second language readers in the study. Three sets of GLMM were conducted with online processing measures as continuous dependent variables; condition, encounter and predictability as fixed factors; word length as a control variable; and subjects and items as random factors. Upon inspection, online reading data (first fixations, gaze durations and total times) were not normally distributed and were strongly positively skewed, which made the use of a linear regression model inappropriate.
One alternative test in this case is Gamma regression, which uses a log link function to fit non-normal positive dependent variables (McCullagh, & Nelder, 1989). Interaction terms were estimated for encounter, predictability and condition under each model. Similar tests were performed to predict the patterns of skipping, regressions and summed reading times on target and control words.

Item-based effects. I present descriptive statistics for the average vocabulary gains in the three posttests and investigate the role of total exposure and maximum item predictability in learning outcomes through three sets of two-level GLMM that fit a binary logistic regression for each of the dependent variables: form recognition, meaning recognition and meaning recall. Using similar models, the effects of summed processing measures on learning outcomes were also investigated. Total exposure and maximum predictability were entered as covariates in the same model to estimate how offline and online predictors compare in explaining the variance in vocabulary learning.

Token-based effects. Three-level GLMM were used to fit binary logistic regressions to estimate the effect of online reading times for every token in the text on the probability of learning novel words. By combining token-based predictability and online fixation times in a single model, it was possible to examine which of the two was the more important factor and whether they interacted in predicting learning outcomes.

Individual differences. The final part of the results shows descriptive statistics for the reading questionnaire followed by logistic regressions to investigate the role of L2 proficiency, vocabulary size, comprehension and reading speed in incidental learning from reading.
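The two families of models named in this section can be written compactly as follows. This is a simplified sketch in illustrative notation, not the exact specification that was fitted: x stands for any single predictor (for example encounter or predictability), u_i and v_j are assumed random intercepts for subject i and item j, and the symbols β and γ are labels chosen here for the fixed-effect coefficients.

```latex
% Gamma regression with a log link, for a positive fixation-time outcome y (ms):
\log \mathbb{E}\left[ y_{ij} \mid u_i, v_j \right] = \beta_0 + \beta_1 x_{ij} + u_i + v_j
% A one-unit increase in x multiplies the expected fixation time by e^{\beta_1}.

% Binary logistic regression, for a vocabulary outcome y (1 = correct response):
\log \frac{P(y_{ij} = 1 \mid u_i, v_j)}{1 - P(y_{ij} = 1 \mid u_i, v_j)} = \gamma_0 + \gamma_1 x_{ij} + u_i + v_j
% A one-unit increase in x multiplies the odds of a correct response by e^{\gamma_1}.
```

In both cases the exponentiated coefficient is the quantity that is read as a multiplicative change, which is why the coefficients are reported on the exponentiated scale rather than as raw estimates.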
2.5.4 Reporting results

The GLMM output calculates the probability of the incidence of a dependent variable in terms of an odds ratio (OR), quantifying the predicted change in the dependent measure as a function of a one-unit increase in a given predictor (Ferguson, 2009). An OR larger than 1 indicates a positive relationship and an OR less than 1 indicates a negative relationship. The interpretation of OR varies according to the type of the dependent variable. For example, if encounter predicted fixation times (continuous variable) with an OR of 0.25, this would indicate that one additional encounter predicts a decrease in fixation times of 75% ((1 - 0.25) × 100%). On the other hand, if repeated exposure predicted form recognition (binary categorical variable) with an OR of 1.75, this would indicate that one extra exposure was associated with an increase in the odds of correct responses in form recognition of 75% ((1.75 - 1) × 100%). In addition to the odds ratio (OR), I report the 95% confidence interval of the effect size of the predictor variable. The predictor was considered significant at the .05 level while the strength of the relationship was interpreted through OR. A strong relationship starts at OR < 0.33 or OR > 3 (Ferguson, 2009; Menard, 2010; Powers, & Xie, 2008).

CHAPTER 3: RESULTS

The results presented in this chapter are organized by research questions. I first compare token-based and item-based online reading measures in both the control and experimental conditions and investigate how encounter and predictability increase or decrease reading times. In the second part, I provide descriptive statistics for vocabulary gains and estimate the effects of token-based online processing and token-based predictability on learning outcomes. In the third part, I explain the role of summed processing measures and compare their effects with those of offline textual factors; i.e., item predictability and total exposure.
Finally, I present findings regarding the role of individual differences in incidental learning from L2 reading.

3.1 Encounter and predictability data

To retrieve the predictability data, I calculated respondents’ percentage of agreement for each token in the cloze procedure of the norming study (see section 2.4.4). These percentages were entered as numerical values between 0 and 100 to represent token predictability in the model. For item-based analyses, the highest predictability score for each target word was assigned as that word’s predictability score. Table 6 shows item-based information about the number of exposures and the highest predictability for target words in the two versions of the text. Detailed information about token predictability in the two versions of the story is provided in Appendix J.

Table 6
Encounter and predictability data for target vocabulary

Number of    Pseudo     Meaning in   Maximum item     Meaning in   Maximum item
encounters   word       version A    predictability   version B    predictability
30           fozle      hotel        90               table        96
18           gube       café         38.3             room         84
10           mave       face         85               desk         75
9            tund       stop         90               meet         94.5
7            leam       tall         60               busy         56.3
6            blef       kill         77.5             push         88
6            toker      party        27.5             money        82
5            bannow     pocket       77.5             window       72.7
5            mot        bag          85               gun          77.1
4            fonteen    picture      90               airport      77.3
4            dangy      quiet        62.5             happy        72.9
4            windle     garden       5                letter       81.3
3            neech      shirt        60               dress        44
3            redaster   accident     97.5             hospital     54.5
2            dook       rich         72.5             cold         72.7
1            tance      sleep        65               drink        59
1            pamery     cinema       71.7             camera       77
1            tantic     famous       10               hungry       37.5
1            dorch      plane        57.7             noise        45.8
1            smick      chair        72.5             shoes        50

3.2 Online reading patterns

Before considering the relationship between online processing and vocabulary learning, I needed to investigate input and contextual factors that influenced reading patterns and lexical processing.
To test these effects, online reading measures were entered as continuous dependent variables in a GLMM to run a Gamma regression analysis with condition, encounter and token predictability as predictors and word length as a control variable. The Gamma regression is an alternative to linear regression which uses a log link function to fit non-normal positive dependent variables (Heck, Thomas, & Tabata, 2012; McCullagh, & Nelder, 1989). The beta coefficient for gamma regression does not provide a meaningful interpretation unless odds ratios are calculated. The odds ratio in a significant relationship is interpreted as a positive or negative percent change in the continuous outcome. For each processing measure, I created a line graph and a scatter plot to identify decreases and major cutoffs, if any, in fixation times. When cutoff points were observed, follow-up analyses were conducted to explain any discrepancies between early encounters and late encounters with target words.

3.2.1 First fixation durations

Figure 2 shows that first fixation durations on target words started at an average of 264 ms (SD = 124) and ended with 215 ms (SD = 88) while first fixations on control words started at 227 ms (SD = 86) and ended at 218 ms (SD = 88). Visually, there were no major changes in fixation times from first to last encounter or major differences between conditions. Table 7 summarizes the regression output for the effect of text-based factors on first fixation durations.
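The percent-change reading of the exponentiated coefficients used throughout Section 3.2 can be sketched as follows. This is only a minimal illustration, not the analysis code used in the study: the function names are hypothetical, and the sketch assumes that what the text calls an odds ratio is the exponentiated coefficient of a log-link model, so that it acts as a multiplicative ratio on the outcome. The compounding helper additionally assumes no interaction terms, which is a simplification relative to the reported models.

```python
import math


def ratio_from_beta(beta: float) -> float:
    """Exponentiate a log-link coefficient to obtain a multiplicative ratio."""
    return math.exp(beta)


def percent_change(ratio: float) -> float:
    """Percent change in the outcome implied by a ratio for a one-unit increase."""
    return (ratio - 1.0) * 100.0


def compounded_ratio(ratio: float, units: int) -> float:
    """Ratio implied by several one-unit increases, ignoring interactions (assumption)."""
    return ratio ** units


if __name__ == "__main__":
    # Worked examples from Section 2.5.4: a ratio of 0.25 and a ratio of 1.75.
    print(round(percent_change(0.25), 1))  # -75.0
    print(round(percent_change(1.75), 1))  # 75.0
    # A ratio of 0.95 per encounter corresponds to a decrease of about 5 %.
    print(round(percent_change(0.95), 1))  # -5.0
```

Under these assumptions, a per-encounter ratio below 1 compounds multiplicatively, so two additional encounters with a ratio of 0.95 would imply a ratio of 0.95 * 0.95 rather than a simple sum of percentages.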
Table 7 Effects of text-based factors on first fixation durations (FFD)

Predictor                    Odds / OR   95% CI           p
Intercept                    5.022       [4.82, 5.22]     < .001 ***
Condition                    1.08        [1.04, 1.12]     < .001 ***
Encounter                    0.98        [0.96, 0.99]     < .001 ***
Predictability               0.89        [0.82, 0.91]     .001 **
Encounter * Predictability   0.995       [0.991, 0.999]   .008 **
Condition * Encounter        1.01        [1.005, 1.03]    < .001 ***
Condition * Predictability   0.89        [0.84, 0.95]     .001 **
Note: The (*) marks signify the level of significance of the p value

Regression output showed that condition significantly predicted first fixation durations (OR = 1.08, 95% CI = [1.04, 1.12], p < .001). Comparing the odds ratio against the odds of the intercept, fixation times in the experimental condition were significantly longer than in the control condition, and the probability of fixating longer on a target word increased by about 2 % when the word was unfamiliar. Encounter was slightly associated with a decrease in FFD (OR = 0.98, 95% CI = [0.967, 0.993], p < .001), implying that each additional encounter was associated with a decrease in first fixation durations of about 1 %. There was a significant interaction between encounter and condition (OR = 1.01, 95% CI = [1.005, 1.03], p < .001), which implied that target and control words started to behave similarly as encounters increased. Token-based predictability was negatively associated with FFD (OR = 0.89, 95% CI = [0.82, 0.91], p = .001), suggesting that an increase in token predictability resulted in a decrease in first fixation durations of 11 %. The negative interaction between encounter and predictability (OR = 0.995, 95% CI = [0.991, 0.999], p = .008) suggested that the effect of predictability was slightly more evident by later encounters.

Figure 2. Mean fixation times (in milliseconds) on target and control words by encounter [line graph; x-axis: encounter (1-30); y-axis: mean FFD (ms); series: target, control]

There was also a small negative interaction between condition and predictability (OR = 0.89, 95% CI = [0.84, 0.95], p = .001), which suggested that the effect of predictability may have been less pronounced for the pseudo words in the text. To visualize the interaction between predictability and condition, I categorized token predictability into predictable and less predictable bands based on a cutoff point of 77 %. I then created a graph for mean first fixation durations in both target and control conditions with separate lines for predictability bands to investigate how predictability influenced fixation times. Figure 3 shows that less predictable tokens received longer fixations and that the effect of predictability in reducing the amount of processing time was slightly more pronounced for familiar words than for pseudo words. The scatter plot for first fixation durations across encounters did not show major cutoff points. Accordingly, no follow-up analyses were done for specific encounters.

Figure 3. The interaction of condition and predictability in first fixation durations (FFD)

Figure 4. Scatter plot for mean first fixation durations by encounter and condition

3.2.2 Gaze durations

Figure 5 shows that gaze durations (GD) on target words started at 393 ms (SD = 282) and ended at 237 ms (SD = 118). A scatter plot was created to identify cutoff points.

Figure 5. Mean gaze durations (in milliseconds) on target and control words by encounter [line graph; x-axis: encounter (1-30); y-axis: mean GD (ms); series: target, control]

Figure 6. Scatter plot for gaze durations by encounter and condition

The scatter plot in Figure 6 shows a larger decrease until encounter 12, after which there was a slow but steady decrease until the last exposure.
Table 8 summarizes the effects of these textual factors on gaze durations.

Table 8 Effects of text-based factors on gaze durations (GD)

Predictor                    Odds / OR   95% CI           p
Intercept                    4.18        [3.84, 4.53]     < .001 ***
Condition                    1.24        [1.17, 1.32]     < .001 ***
Encounter                    0.95        [0.94, 0.97]     < .001 ***
Predictability               0.91        [0.85, 0.96]     .003 **
Encounter * Predictability   0.99        [0.97, 1.01]     .172
Condition * Encounter        1.04        [1.01, 1.10]     < .001 ***
Condition * Predictability   0.93        [0.89, 0.95]     .045 *
Note: The (*) marks signify the level of significance of the p value

As Table 8 shows, condition was a significant predictor of the variance in gaze durations (OR = 1.24, 95% CI = [1.17, 1.32], p < .001). Comparing the odds of the intercept with the odds ratio, the probability of longer gaze durations increased by around 16 % when target words were unfamiliar. Each additional encounter predicted a decrease in gaze duration of about 5 % (OR = 0.95, 95% CI = [0.94, 0.97], p < .001). There was a small interaction between encounter and condition (OR = 1.04, 95% CI = [1.01, 1.10], p < .001). The increase in token predictability was associated with a decrease in gaze durations (OR = 0.91, 95% CI = [0.85, 0.96], p = .003). The interaction between condition and predictability was significant (OR = 0.93, 95% CI = [0.89, 0.95], p = .045). Figure 7 illustrates this interaction, showing that predictability had almost equal effects on both target and control words.

Figure 7. The interaction of condition and predictability in gaze durations (GD)

Based on the scatter plot in Figure 6, follow-up analyses were conducted for early encounters (1-12) and late encounters (13-30). Results indicated that the effect of condition was larger in early encounters (OR = 1.41, 95% CI = [1.32, 1.65], p < .001) than in later encounters (OR = 1.09, 95% CI = [1.02, 1.12], p = .035), suggesting that both target and control words were generally read quickly after 12 encounters.
Encounter was only significant in the first 12 exposures (OR = 0.84, 95% CI = [0.81, 0.89], p < .001), confirming the visual observation that the decrease was larger in early encounters than in later ones. On the other hand, token predictability was significant only in later exposures (OR = 0.87, 95% CI = [0.82, 0.94], p < .001) and not in early encounters (OR = 0.97, 95% CI = [0.96, 1.002], p = .152).

3.2.3 Total reading times

Figure 8 shows that total reading times were highest on the first encounter of target words (M = 702 ms, SD = 512) and lowest by the final encounter (M = 265 ms, SD = 130).

Figure 8. Mean total reading times (in milliseconds) for target and control words by encounter [line graph; x-axis: encounter (1-30); y-axis: mean TFD (ms); series: target, control]

Figure 9. Scatter plot for mean total durations by encounter and condition

The scatter plot in Figure 9 shows that the decreasing pattern was more evident until encounter 11. Later encounters from 12 to 23 showed a slower decline in reading times, after which the line dropped steadily until the last exposure. Table 9 summarizes main textual effects on total reading times.
Table 9 Effects of text-based factors on total reading times (TFD)

Predictor                    Odds / OR   95% CI           p
Intercept                    3.30        [2.88, 3.73]     < .001 ***
Condition                    1.68        [1.42, 1.77]     < .001 ***
Encounter                    0.88        [0.87, 0.91]     < .001 ***
Predictability               0.93        [0.88, 0.95]     .001 **
Encounter * Predictability   0.988       [0.981, 0.995]   < .001 ***
Condition * Encounter        1.020       [1.016, 1.023]   < .001 ***
Condition * Predictability   0.85        [0.77, 0.93]     .001 **
Note: The (*) marks signify the level of significance of the p value

Table 9 shows that condition was a significant predictor of the variance in total reading times (OR = 1.68, 95% CI = [1.42, 1.77], p < .001), indicating that readers spent more time on target words than on known words. Each additional encounter was associated with a decrease in total times (OR = 0.88, 95% CI = [0.87, 0.91], p < .001), which was modulated by a small interaction between encounter and condition (OR = 1.020, 95% CI = [1.016, 1.023], p < .001). There was a significant association between token predictability and the decrease in total times (OR = 0.93, 95% CI = [0.88, 0.95], p = .001), which was modulated by a small interaction between encounter and predictability (OR = 0.988, 95% CI = [0.981, 0.995], p < .001) and between condition and predictability (OR = 0.85, 95% CI = [0.77, 0.93], p = .001). The latter interaction is illustrated in Figure 10, which shows that the effect of predictability was slightly more pronounced on familiar words than on pseudo words.

Figure 10. The interaction of condition and predictability in total fixation durations (TFD)

Based on the scatter plot for total times in Figure 9, follow-up analyses were conducted for early encounters (1-11) and late encounters (12-30) separately. Results showed that the effect of condition was larger in early encounters (OR = 1.85, 95% CI = [1.75, 1.96], p = .001) than in late encounters (OR = 1.44, 95% CI = [1.21, 1.74], p = .013).
The decrease in total times was also greater in early encounters (OR = 0.75, 95% CI = [0.71, 0.80], p = .001) than in later encounters (OR = 0.94, 95% CI = [0.90, 0.98], p = .005). The effect of token predictability was significant in late encounters (OR = 0.87, 95% CI = [0.83, 0.92], p = .001) but not in early encounters (OR = 0.96, 95% CI = [0.91, 1.01], p = .189).

3.2.4 Reading behavior

Other aspects of reading behavior such as skipping and regressions contributed to the patterns of total times. Skipping rates refer to the instances when interest areas were skipped on first pass. Skipping was more frequent in the control condition than in the experimental condition (around 21% of target occurrences and almost 26% of control occurrences). Regressions-in refer to the instances when readers regressed to the target or control word after the first pass. Readers returned more to target items (almost 25%) than to control items (almost 15%). Regressions-out refer to the instances when readers launched a regression from the target or control word after first pass. Regressions-out occurred on almost 27 % of target observations and on 22 % of control observations. Table 10 summarizes these reading behavior patterns in both target and control conditions. Because skips and regressions are binary data, logistic regression analyses were performed as the next step to identify what factors predicted their occurrence.

Table 10 Mean percentages of skipping and regressions (%) on target and control words

                Skipping rate   Regression in   Regression out
Target words    21.3 (40.9)     24.6 (43.1)     26.7 (44.2)
Control words   25.8 (43.7)     14.07 (34.8)    22.4 (41.7)
Total           23.5 (42.4)     19.04 (39.5)    24.6 (43.05)

Skipping. Condition was a significant predictor of skipping (OR = 0.82, 95% CI = [0.71, 0.95], p = .007), meaning that the odds of skipping decreased by around 18 % when the target words were unfamiliar.
The effect of encounter was not significant (OR = 0.96, 95% CI = [0.90, 1.03], p = .273). Token predictability proved to be a significant predictor of skipping (OR = 1.19, 95% CI = [1.03, 1.39], p = .001), suggesting that higher predictability triggered more skipping. No interaction was found between encounter and condition (OR = 0.97, 95% CI = [0.95, 1.02], p = .582) or between condition and predictability (OR = 1.002, 95% CI = [0.98, 1.003], p = .174).

Regressions-in. Condition was a strong predictor of regression-in rates (OR = 2.79, 95% CI = [2.42, 3.22], p < .001), which indicated that the odds of regressing-in were 2.79 times higher when the target was unfamiliar. Each additional encounter decreased the odds of regressing to the target word by about 28 % (OR = 0.72, 95% CI = [0.67, 0.77], p < .001), implying that regressions-in were more frequent in initial encounters. Token predictability did not have a significant effect on regressions-in (OR = 0.88, 95% CI = [0.72, 1.09], p = .251). There was an interaction between encounter and condition (OR = 0.90, 95% CI = [0.86, 0.94], p < .001), suggesting that the rates of regressions-in became similar for target and control words by later encounters. There was no interaction between encounter and predictability (OR = 1.007, 95% CI = [0.98, 1.03], p = .622) or condition and predictability (OR = 0.98, 95% CI = [0.96, 1.09], p = .351).

Regressions-out. Condition was a significant predictor of regression-out rates (OR = 1.21, 95% CI = [1.09, 1.34], p < .001), which shows that the odds of regressing-out increased by about 21% when the target was unfamiliar. Each additional encounter decreased the odds of regressing out of the interest area by 2 % (OR = 0.98, 95% CI = [0.97, 0.99], p = .010). Token predictability was not a significant predictor of regressions-out (OR = 0.99, 95% CI = [0.84, 1.17], p = .971). No further significant interactions were found between predictability, condition and encounter.
Table 11 summarizes the roles of textual factors on skipping and regressions.

Table 11 Effects of text-based factors on skipping and regression rates

                             Skipping            Regression-in        Regression-out
Predictor                    OR      p           OR      p            OR      p
Condition                    0.82    .007 **     2.79    < .001 ***   1.21    < .001 ***
Encounter                    0.96    .273        0.72    < .001 ***   0.98    .010 *
Predictability               1.19    .001 **     0.88    .251         0.99    .971
Encounter * Predictability   0.99    .362        1.007   .622         1.001   .521
Condition * Encounter        0.98    .582        0.90    < .001 ***   0.96    .187
Condition * Predictability   1.002   .174        0.98    .351         0.99    .231
Note: The (*) marks signify the level of significance of the p value

3.2.5 Summed reading times

Summed processing measures are the sum of all times spent on a given item over all its encounters. Because repetition generally invites more reading, I arranged mean summed fixation measures by exposure bands (the number of times a word was seen). Items with a higher frequency of occurrence in the text generated higher summed times, and pseudo words received more attention than control words. Table 12 outlines the average summed fixation measures on all pseudo and control words by exposure bands.

Summed first fixations were significantly different by condition (OR = 1.14, 95% CI = [1.11, 1.18], p < .001) and were significantly predicted by exposure band (OR = 1.09, 95% CI = [1.03, 1.12], p < .001). Controlling for exposure, maximum item predictability was associated with a decrease in the summed FFD (OR = 0.88, 95% CI = [0.81, 0.91], p < .001). No significant interactions were found between exposure and condition or exposure and predictability.
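The aggregation behind the summed measures defined in Section 3.2.5 can be sketched as follows. This is a hedged illustration rather than the procedure actually used to build Table 12: the record layout, the function names and the sample durations are all hypothetical, and the sketch assumes that each record is one reader's time on one encounter of one item, with the exposure band taken to be the item's number of occurrences in the text.

```python
from collections import defaultdict


def summed_times(records):
    """Sum durations over all encounters of each item, per participant.

    records: iterable of (participant, item, encounter, duration_ms) tuples.
    Returns a dict mapping (participant, item) to the summed duration in ms.
    """
    totals = defaultdict(float)
    for participant, item, _encounter, duration_ms in records:
        totals[(participant, item)] += duration_ms
    return dict(totals)


def mean_by_exposure_band(totals, exposures):
    """Average the summed durations within each exposure band.

    exposures: dict mapping each item to its number of encounters in the text.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (_participant, item), total in totals.items():
        band = exposures[item]
        sums[band] += total
        counts[band] += 1
    return {band: sums[band] / counts[band] for band in sums}


if __name__ == "__main__":
    # Hypothetical durations for two readers; not data from the study.
    sample = [
        ("p1", "dook", 1, 400.0), ("p1", "dook", 2, 300.0),
        ("p2", "dook", 1, 500.0), ("p2", "dook", 2, 200.0),
        ("p1", "tance", 1, 600.0),
    ]
    totals = summed_times(sample)
    print(mean_by_exposure_band(totals, {"dook": 2, "tance": 1}))
```

Summing before averaging keeps the reader-by-item total as the unit of analysis, which is what allows summed times to be compared with total exposure as a predictor of learning for each item.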
Table 12 Mean summed fixation measures (in milliseconds), with SD in parentheses, by exposure bands

Exposure   Summed FFD                  Summed GD                   Summed TFD
band       target        control       target        control       target         control
1          347 (952)     331 (622)     460 (769)     410 (780)     748 (695)      543 (607)
2          419 (623)     357 (126)     491 (198)     457 (292)     891 (480)      762 (705)
3          734 (179)     580 (252)     1344 (846)    844 (518)     2282 (1411)    1092 (752)
4          934 (251)     814 (206)     1477 (651)    1037 (428)    2329 (1120)    1255 (534)
5          1174 (243)    946 (271)     1544 (546)    1172 (354)    2249 (997)     1396 (448)
6          1443 (459)    1191 (281)    1974 (720)    1415 (373)    3179 (993)     1848 (711)
7          1593 (367)    1362 (237)    2060 (556)    1507 (334)    2974 (933)     1985 (776)
9          2063 (500)    1808 (416)    2795 (804)    2212 (671)    4873 (1834)    2932 (1037)
10         2281 (392)    1993 (448)    3002 (717)    2443 (608)    4798 (952)     3123 (854)
18         3793 (866)    3098 (877)    4550 (1310)   3527 (1057)   6121 (843)     4289 (1433)
30         6109 (991)    5388 (987)    7676 (2551)   6291 (1815)   10220 (634)    7766 (2854)
Mean       1375 (1325)   1222 (156)    1852 (3073)   1473 (2470)   2801 (3152)    1864 (3287)

Summed gaze durations were significantly predicted by condition (OR = 1.27, 95% CI = [1.19, 1.34], p < .001) and exposure band (OR = 1.07, 95% CI = [1.01, 1.11], p < .001). Item predictability was associated with a decrease in summed gaze durations (OR = 0.95, 95% CI = [0.87, 0.99], p = .019). No significant interactions were found between exposure and condition or exposure and predictability.

Summed reading times were significantly different by condition (OR = 1.49, 95% CI = [1.41, 1.57], p < .001) and exposure band (OR = 1.12, 95% CI = [1.09, 1.17], p < .001). This confirms the fact that pseudo words took more processing times than control words regardless of the number of exposures. Item predictability was not a significant predictor of summed reading times (OR = 0.99, 95% CI = [0.95, 1.02], p = .581).
A significant interaction was found between total exposure and condition with a small effect (OR = 0.99, 95% CI = [0.98, 0.99], p = .001). Table 13 summarizes the role of text-based factors in summed processing measures.

Table 13 Effects of text-based factors on summed processing times

                             Summed FFD          Summed GD            Summed TFD
Predictor                    OR     p            OR     p             OR     p
Condition                    1.14   < .001 **    1.27   < .001 ***    1.49   < .001 ***
Encounter                    1.09   < .001 **    1.07   < .001 ***    1.12   < .001 ***
Predictability               0.88   < .001 **    0.95   .019 *        0.99   .581
Encounter * Predictability   0.99   .362         0.99   .522          0.99   .001 **
Condition * Encounter        0.98   .182         0.91   .145          0.96   .187
Condition * Predictability   1.01   .274         0.98   .131          0.99   .231
Note: The (*) marks signify the level of significance of the p value

3.2.6 Interim summary

Online reading patterns showed a clear distinction between the processing of pseudo words and known words in context. Condition was significant on all reading measures, showing that readers looked longer and spent more time processing unfamiliar words. It was also shown that pseudo words invited less skipping and more regressions than known words. There was evidence of a growing sense of familiarity with target words, as additional encounters were associated with shorter reading times and fewer regressions. Scatter plots of gaze durations and total reading times (Figures 6 and 9) demonstrated that readers dwelled more on early encounters until around exposures 11-13, after which the decrease in fixation times became slower. The difference between conditions was larger in early encounters, and the data suggest that target and control words started to behave similarly after encounters 12-13. Token predictability was generally associated with shorter reading times and more skipping of target items. This effect became more important in late encounters than in early encounters.
This may imply that predictability started to play a role later, when pseudo words became better integrated in the sentence structure. The interaction between condition and predictability pointed to the fact that context effects, though significant, were less pronounced on pseudo words than on control words, implying that word familiarity may interfere with predictability in real-time processing. Repeated exposure generated higher summed processing times with a significant effect of condition in terms of total times spent on vocabulary items. Predictability was associated with reduced first fixation and gaze duration times.

3.3 Vocabulary knowledge gains from reading

3.3.1 Descriptive statistics

Across the vocabulary measures, participants showed the highest gains in form recognition, followed by meaning recognition and finally meaning recall. Table 14 indicates that participants were able to retain the forms of an average of 42 % of target words while they recognized the meanings of 30 % of the words and recalled the meanings of only 13 % of the same target items.

Table 14 Average word gains, with standard deviations in parentheses, for the vocabulary post tests

Test                  M (SD)        Percentage (%)   Minimum   Maximum
Form recognition      8.36 (3.16)   41.8             1 (5 %)   16 (80 %)
Meaning recognition   6.06 (3.27)   30.3             1 (5 %)   13 (65 %)
Meaning recall        2.59 (2.32)   12.9             0 (0 %)   8 (40 %)

To investigate the effect of amount of exposure on vocabulary learning, I analyzed participants’ responses by exposure bands (refer to Table 6) to estimate how many hits (correct responses) each item received from participants in each test. Figure 11 reveals a wide difference between highest and lowest exposure bands, but variable patterns were noted for middle bands, particularly in meaning recognition and recall.
To elucidate the role of context in word learning, I categorized maximum item predictability into two levels, predictable and less predictable, based on a cutoff point of 77 %. Figure 12 shows the average percentages of vocabulary gains by context type, indicating that context richness increased chances of learning words in all vocabulary tests in a similar manner.

Figure 11. Mean percentages of vocabulary gains in the vocabulary posttests by exposure bands [x-axis: exposure band (1, 2, 3, 4, 5, 6, 7, 9, 10, 18, 30); y-axis: mean word gain (%); series: form recognition, meaning recognition, meaning recall]

Figure 12. Mean percentages of word gains by context type [x-axis: vocabulary test (form recognition, meaning recognition, meaning recall); y-axis: mean word gain (%); series: predictable, less predictable]

3.3.2 Text-based characteristics and vocabulary learning

To explain the variance in learning outcomes based on text-based factors, I looked at how item exposure and maximum item predictability contributed to vocabulary learning. Because the vocabulary variables are binary, I fitted a logistic regression using a two-level GLMM for every vocabulary test. Controlling for item effects and word length, logistic regression output showed that total exposure was a significant predictor for all the vocabulary outcomes but to somewhat different degrees: form recognition (OR = 1.21, 95% CI = [1.05, 1.40], p = .010), meaning recognition (OR = 1.29, 95% CI = [1.15, 1.44], p < .001), and meaning recall (OR = 1.42, 95% CI = [1.27, 1.61], p < .001). By comparing the odds ratios with the odds of the intercept in the three models, we calculate the difference between the probability of learning outcomes and the baseline probability of the intercept [OR * odds / (odds + 1)]. Regression output (Tables 15-17) indicated that each additional exposure increased the probability of form recognition by around 2 %, meaning recognition by around 3 % and meaning recall by 2 %.
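The step from odds ratios to probability changes described above can be illustrated with the standard odds-to-probability conversion. This is a sketch under stated assumptions, not a reproduction of the computation behind the reported percentages: the function names are hypothetical, the intercept odds are assumed to represent the baseline with the other predictors at their reference values, and the result need not match every figure reported in the text. The example values (intercept odds of 0.13 and an OR of 1.21) are the form recognition estimates from Table 15.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


def probability_change(baseline_odds: float, odds_ratio: float, units: int = 1) -> float:
    """Change in probability after `units` one-unit increases in a predictor.

    The odds ratio multiplies the baseline odds once per unit; both the new and
    the baseline odds are then converted to probabilities and differenced.
    """
    new_odds = baseline_odds * odds_ratio ** units
    return odds_to_probability(new_odds) - odds_to_probability(baseline_odds)


if __name__ == "__main__":
    # Form recognition (Table 15): intercept odds 0.13, OR for total exposure 1.21.
    change = probability_change(0.13, 1.21)
    print(round(change, 3))  # 0.021, i.e. about 2 percentage points
```

Because the conversion is nonlinear, the same odds ratio implies a different probability change at a different baseline, so a percentage obtained this way describes the change from the intercept only.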
Item predictability was most strongly associated with meaning recall (OR = 1.63, 95% CI = [1.36, 1.95], p < .001) followed by meaning recognition (OR = 1.24, 95% CI = [1.08, 1.42], p = .002) yet it did not have a significant relationship with form recognition. Tables 15 through 17 summarize these effects showing positive effects for both exposure and predictability. However, the interaction between exposure and predictability yielded odds ratios < 1, implying a negative impact on meaning recognition and meaning recall although ratios were very close to 1 as shown in the tables. Figures 13 through 15 illustrate the interacting effects of context and repetition on vocabulary gains.

Table 15 Regression output for the effects of exposure and predictability on form recognition

Predictor                   Odds / OR   95% CI          p
Intercept                   0.13        [0.054, 0.30]   < .001 ***
Total Exposure              1.21        [1.05, 1.40]    .010 **
Item predictability         1.11        [0.99, 1.24]    .691
Exposure * predictability   0.99        [0.98, 1.02]    .874
Note: The (*) marks signify the level of significance of the p value

Table 16 Regression output for the effects of exposure and predictability on meaning recognition

Predictor                   Odds / OR   95% CI          p
Intercept                   0.058       [.016, 0.20]    < .001 ***
Total Exposure              1.29        [1.15, 1.44]    < .001 ***
Item predictability         1.24        [1.08, 1.42]    .002 **
Exposure * predictability   0.98        [0.97, 0.99]    .012 *
Note: The (*) marks signify the level of significance of the p value

Table 17 Regression output for the effects of exposure and predictability on meaning recall

Predictor                   Odds / OR   95% CI          p
Intercept                   .012        [.002, .068]    < .001 ***
Total Exposure              1.43        [1.27, 1.61]    < .001 ***
Item predictability         1.63        [1.36, 1.95]    < .001 ***
Exposure * predictability   0.97        [0.96, 0.99]    .002 **
Note: The (*) marks signify the level of significance of the p value
Figure 13. The interaction between exposure and predictability in form recognition

With contextual constraint categorized into predictable and less predictable bands, I divided exposure bands into four categories: single exposure, low exposure (2-5), medium exposure (6-9), and high exposure (10 and more). Figure 13 illustrates that highly predictable items yielded relatively better gains except for single exposure words. On the other hand, Figure 14 on meaning recognition indicates that predictable context makes the largest difference in the medium-exposure band. In meaning recall, this variance becomes clearer, as high context words are more likely to be recalled in all exposure bands while less predictable words with single, low and medium exposures were not recalled as much (see Figure 15). Overall, repetition was effective in all vocabulary gains but context predictability enhanced these gains, especially in the medium-exposure band.

Figure 14. The interaction of exposure and context in meaning recognition

Figure 15. The interaction of exposure and context in meaning recall

3.3.3 Real time processing and vocabulary learning

In this section, I investigate real-time processing of the target words in text and how moment-by-moment eye movement measures and token predictability can predict that certain vocabulary items will be acquired from reading. Because the data included items nested within subjects and encounters nested within items, I fitted a binary logistic regression in a three-level GLMM for each vocabulary test. Online reading measures and token predictability were entered as fixed factors with subjects and items as random factors and word length and total exposure as control variables.
Results yielded significant positive relationships of form recognition with first fixation durations (OR = 1.21, 95% CI = [1.13, 1.32], p = .035) and total reading times (OR = 1.42, 95% CI = [1.12, 1.80], p = .004), indicating that a one-second increase in first fixations and total times spent on a target occurrence increased the probability of form recognition success by 4 % and 7 % respectively. Token predictability was not a significant predictor for form recognition (OR = 1.45, 95% CI = [0.91, 2.32], p = .084). Table 18 outlines the regression output for form recognition.

Table 18 Token-based predictors of form recognition

Predictor                 Odds / OR   95% CI         p
Intercept                 0.26        [.075, 0.93]   .038 *
First fixation duration   1.21        [1.13, 1.32]   .035 *
Gaze duration             1.16        [0.73, 1.82]   .533
Total time                1.42        [1.12, 1.80]   .004 **
Token predictability      1.45        [0.91, 2.32]   .116
Note: The (*) marks signify the level of significance of the p value

Meaning recognition results pointed to a positive effect of total reading times on vocabulary outcomes (OR = 1.33, 95% CI = [1.03, 1.72], p = .029). A one-second increase in reading times of each token increased the probability of meaning recognition by 3 %. In addition, token predictability was highly associated with meaning recognition success (OR = 2.81, 95% CI = [1.81, 4.34], p < .001), implying that a one-unit increase in the predictability of individual encounters increased the chance of meaning recognition by 22 %. Table 19 summarizes token-based predictors of meaning recognition.

Table 19 Token-based predictors of meaning recognition

Predictor                 Odds / OR   95% CI         p
Intercept                 0.10        [.034, 0.32]   < .001 ***
First fixation duration   2.65        [0.81, 8.67]   .106
Gaze duration             1.54        [0.71, 1.88]   .560
Total time                1.33        [1.03, 1.72]   .029 *
Token predictability      2.81        [1.81, 4.34]   < .001 ***
Note: The (*) marks signify the level of significance of the p value.
Meaning recall was significantly predicted by gaze durations (OR = 2.19, 95% CI = [1.22, 3.77], p = .005) and total reading times (OR = 1.73, 95% CI = [1.14, 2.63], p = .010). One additional second spent on target tokens increased the probability of meaning recall by 3 %. Token predictability showed a strong positive effect on meaning recall (OR = 5.68, 95% CI = [3.19, 10.25], p < .001), implying that an increase in predictability increased the probability of meaning recall by almost 17 %. Table 20 outlines the regression output for meaning recall.

Table 20 Token-based predictors of meaning recall

Predictor                 Odds / OR   95% CI          p
Intercept                 .051        [.008, 0.30]    .002 **
First fixation duration   0.77        [0.26, 2.31]    .650
Gaze duration             2.19        [1.22, 3.77]    .005 **
Total time                1.73        [1.14, 2.63]    .010 *
Token predictability      5.68        [3.19, 10.25]   < .001 ***
Note: The (*) marks signify the level of significance of the p value

3.3.4 The role of cumulative online processing in vocabulary learning

Because summed fixation times reflected the cumulative processing effort devoted to target words, it was interesting to test how these measures would compare with text-based factors (exposure and item predictability) in explaining the variance in vocabulary outcomes. I fitted binary logistic regressions using a two-level GLMM because holistic effects at the item level rather than the encounter level are of interest.

Table 21 Regression output of the online vs. text-based predictors of form recognition

Predictor             Odds / OR   95% CI          p
Intercept             0.18        [0.094, 0.35]   < .001 ***
Summed FFD            0.53        [0.34, 0.83]    .006 **
Summed GD             1.17        [0.85, 1.61]    .332
Summed TFD            2.16        [1.21, 3.38]    < .001 ***
Total exposure        1.29        [1.18, 1.41]    < .001 ***
Item predictability   1.01        [0.98, 1.09]    .704
Note: The (*) marks signify the level of significance of the p value

Table 21 points to a negative relationship between summed first fixation durations and form recognition in that a one-second increase in total first fixation durations decreased the probability of successfully recognizing word form by almost 6 % (OR = 0.53, 95% CI = [0.34, 0.83], p = .006). On the other hand, total reading times spent on target words increased the chances of learning form (OR = 2.16, 95% CI = [1.21, 3.38], p < .001), indicating that looking for one extra second at target words increased the probability of form recognition success by 13 %. At the level of text-based features, total exposure positively influenced form recognition although the effect was somewhat smaller than that of online processing times (OR = 1.29, 95% CI = [1.18, 1.41], p < .001).

Meaning recognition was significantly predicted by summed reading times (OR = 1.47, 95% CI = [1.25, 1.72], p < .001) and total exposure (OR = 1.38, 95% CI = [1.21, 1.58], p < .001). Meaning recall followed the same pattern with total reading times (OR = 3.27, 95% CI = [1.28, 5.33], p < .001) and total exposure (OR = 1.27, 95% CI = [1.13, 1.41], p < .001). In both models, item predictability was significant, although with a modest association strength. Tables 22 and 23 summarize the predictors of meaning recognition and recall.

Table 22 Regression output of the online vs. text-based predictors of meaning recognition

Predictor             Odds / OR   95% CI          p
Intercept             0.15        [.060, 0.38]    < .001 ***
Summed FFD            0.81        [0.48, 1.39]    0.45
Summed GD             0.78        [0.51, 0.21]    .284
Summed TFD            1.47        [1.25, 1.72]    < .001 ***
Total exposure        1.38        [1.21, 1.58]    < .001 ***
Item predictability   1.10        [1.01, 1.21]    .047 *
Note: The (*) marks signify the level of significance of the p value

Table 23 Regression output of the online vs. text-based predictors of meaning recall

Predictor             Odds / OR   95% CI          p
Intercept             0.033       [0.007, 0.15]   < .001 ***
Summed FFD            1.07        [0.61, 1.88]    .825
Summed GD             0.80        [0.49, 1.28]    .352
Summed TFD            3.27        [1.28, 5.33]    < .001 ***
Total exposure        1.27        [1.13, 1.41]    < .001 ***
Item predictability   1.16        [1.03, 1.30]    .016 *
Note: The (*) marks signify the level of significance of the p value

A general overview of Tables 21-23 indicates that, holding the effects of total exposure and item predictability constant, summed reading times strongly predicted learning success in all vocabulary measures, particularly in form recognition and meaning recall. This might suggest that individual attention on the part of the reader can be more important in explaining vocabulary learning above and beyond repeated exposure.

3.4 Individual differences in learning from reading

EyeLink trial reports showed that readers spent an average of 19.2 minutes (SD = 4.59) on the actual text with a mean reading speed of 258 words per minute. The range of their TOEFL iBT scores (79-100) and vocabulary sizes (2950-4200 out of 5000) suggested that most participants were at upper-intermediate to advanced levels in English proficiency and that the reading material, with pseudo tokens, met the lexical coverage threshold necessary to ensure adequate reading comprehension and the possibility of learning from reading. Average comprehension scores ranged from 60 % to 100 % (M = 87, SD = 8.6), indicating a generally good understanding of story content.

The post-reading questionnaire included items on a 6-point scale ranging from 1 (strongly disagree) to 6 (strongly agree).
Average responses indicated that readers enjoyed the story (M = 4.9, SD = .86) while they did not express much discomfort with reading on the eye tracker (M = 3.6, SD = 1.6). They also expressed that the text was easy to read and there was no need to use a dictionary. Table 24 summarizes descriptive statistics for the items in the reading questionnaire.

Table 24
Mean responses on the reading perception questionnaire

Question                   Mean response (SD)
Ease of reading            4.6 (.94)
Reading enjoyment          4.9 (.86)
Need of dictionary         2.9 (1.5)
Reading comfort            4.3 (1.3)
Eye tracking discomfort    3.6 (1.6)

Individual differences in reading speed, reading comprehension scores and vocabulary size did not yield significant effects on any of the vocabulary measures. However, a small effect of proficiency scores was found on form recognition (OR = 1.014, 95% CI = [1.001, 1.025], p = .039) and meaning recall (OR = 1.023, 95% CI = [1.00, 1.067], p = .049), indicating that more proficient readers may have shown slightly better retention of form and meaning of pseudo words encountered in the text.

3.5 General summary of results

Participants in this study read the graded reader ‘Goodbye Mr. Hollywood’, a stage 1 story of 4649 words, which was found to be well within their current English proficiency levels. The percentage of pseudo tokens in the story was below 3%, and the lexical coverage of the story was satisfactory (Nation, 2001, 2006). Readers met pseudo words and familiar control words in equal numbers of exposures ranging from 1 to 30, yielding 121 tokens in each condition. The predictability of tokens ranged between 0 and 96 based on English native speaker cloze agreement percentages. Eye movements were recorded during reading to compare online and text-based effects on incidental vocabulary learning. Online reading patterns pointed to significant differences between attention to target items and familiar items.
First fixations, gaze durations and total times decreased as a result of additional encounters with target words, pointing to a gradual increase in familiarity with pseudo words in the text as they were repeated. The decrease in reading times was more pronounced in early encounters (1-12) than in later encounters. After about 12 encounters, both conditions started to elicit similar processing patterns. Conversely, the role of token predictability in reducing processing load was more important in later encounters than in early encounters. The interaction between condition and predictability suggested that the role of predictability might have been slightly more pronounced in processing familiar control words than pseudo words. Analyses of regressions and skips confirmed, as was to be expected, the extra attention devoted to pseudo words in early encounters. The overall summed fixation times were significantly influenced by condition and exposure band, suggesting that more repetition normally invited more attention and that pseudo words elicited longer summed processing times than known words over repeated exposures.

Readers displayed the largest learning gains in form recognition, followed by meaning recognition and finally meaning recall. Total exposure predicted all vocabulary outcomes while maximum item predictability supported meaning recognition and recall. The interaction between total exposure and predictability in text-based effects suggested that context richness may have moderated the positive effect of repetition in the process of retaining word meanings from reading. Overall, repetition was effective in all vocabulary gains while context predictability enhanced these gains, especially in the low-exposure and medium-exposure bands.
Token-based online processing measures demonstrated that total time was a positive indicator of learning success in all vocabulary tests while first fixations only predicted form recognition and gaze durations only predicted meaning recall. Token-based predictability was an indicator of meaning recognition and recall but not of form recognition. When aggregating processing measures on all encounters, it was shown that only summed total time was positively associated with learning outcomes. After accounting for total exposure and item predictability, a one-second increase in summed total times was a significant indicator of vocabulary learning, particularly form recognition and meaning recall. This suggested that word-based attention and utilization of context on the part of the reader can represent independent additive effects in the process of incidental learning from L2 reading.

Overall, participants represented a relatively homogeneous group in terms of proficiency and vocabulary size. They expressed that the text was an easy and enjoyable piece of reading and reported a good comprehension of the details of the story. A slight effect of proficiency scores was observed in form recognition and meaning recall but no other effects of comprehension scores, reading speed or vocabulary size were found.

CHAPTER 4: DISCUSSION

Research on extensive reading has provided ample evidence on the role of repetition in lexical learning and called for further research on the role of contextual richness in vocabulary acquisition from L2 reading (e.g. Horst, 2005; Waring, & Nation, 2004; Webb, 2007, 2008). On the other hand, eye movement studies on reading behavior documented the cognitive effects of repetition and context quality on lexical processing and associated lexical retention in terms of online processing patterns and the eye-mind link hypothesis (Godfroid et al., 2013; Juhasz, & Pollatsek, 2011; Rayner, & Well, 1996).
The present study aimed to bring together methods from both strands to investigate incidental vocabulary acquisition from L2 reading and track the cognitive effects of repetition and context predictability on the development of different aspects of vocabulary knowledge. In this chapter, I discuss the findings of the study in the light of the research questions and draw implications from extensive reading and eye movement research.

4.1 Lexical processing in repeated encounters

The first research question sought to investigate how second language readers processed unknown words in the graded reader ‘Goodbye Mr. Hollywood’, and what textual factors influenced their reading patterns in real time. It was shown that readers gave relatively more attention to pseudo words as compared to familiar words, particularly in early encounters. Gaze durations and total times were inflated between encounters 1 and 12, after which target and control words started to exhibit similar processing patterns. Steady decreases were more pronounced in early encounters than in later encounters. The cutoff point around the 11th-12th encounter was not clear for first fixation durations, which showed significant yet small decreases across encounters and few differences by condition. A possible explanation can be provided in the light of the E-Z Reader model that postulates different stages of lexical processing (e.g. Pollatsek, Reichle, & Rayner, 2006). In this model, first fixations reflect an early stage of familiarity check, and do not capture later events of reanalysis, word recognition or form-meaning mapping. The unfamiliarity of target words triggered subsequent fixations that fed into gaze durations, and the reported frequent regressions to target words ultimately fed into total times. This scenario may have caused the notable rise of attention exhibited in early encounters.
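The way refixations feed into gaze duration and regressions feed into total time can be made concrete with a minimal sketch. It assumes the conventional definitions of the three measures and a hypothetical record format (a time-ordered list of fixations, each tagged with the index of the fixated word); it is an illustration only and not the procedure used to extract the measures reported in this study.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    word_index: int    # position of the fixated word in the text (hypothetical field)
    duration_ms: int   # fixation duration in milliseconds

def word_measures(fixations, target):
    """Return (first fixation duration, gaze duration, total time) for one word.

    Assumes fixations are in temporal order. First-pass measures are 0 when
    the word is skipped, i.e. a later word is fixated before the target.
    """
    ffd = gd = total = 0
    state = "before"  # "before" -> "first_pass" -> "after"
    for fix in fixations:
        on_target = fix.word_index == target
        if on_target:
            total += fix.duration_ms          # every fixation counts toward total time
        if state == "before":
            if on_target:
                state = "first_pass"
                ffd = gd = fix.duration_ms    # first fixation opens the first pass
            elif fix.word_index > target:
                state = "after"               # word was skipped on first pass
        elif state == "first_pass":
            if on_target:
                gd += fix.duration_ms         # refixation before leaving the word
            else:
                state = "after"               # eyes left the word: first pass ends
    return ffd, gd, total

# Illustrative scanpath: two first-pass fixations on word 4, then a regression back to it.
scanpath = [Fixation(3, 200), Fixation(4, 250), Fixation(4, 180),
            Fixation(5, 210), Fixation(4, 300), Fixation(6, 190)]
print(word_measures(scanpath, target=4))
```

In this illustrative scanpath the refixation raises gaze duration above the first fixation duration, and the later regression raises only total time, which mirrors the pattern described for pseudo words in early encounters.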
The fact that readers did pay more attention to pseudo words, particularly in early encounters, was also confirmed by other evidence from reading behavior. In particular, skipping was less frequent on pseudo words while regressions occurred more frequently, particularly in early encounters. The skipping instances that did occur at pseudo words do not contradict the fact that readers attended more to novel items. Parafoveal processing may have occurred for new words, making them less likely to be skipped at first pass. Less skipping and more regressions indicated increased processing and reanalysis of target words, which may have supported the form-meaning mapping process.

Repeated encounters were associated with shorter fixation times, which is consistent with previous eye movement research (e.g. Joseph et al., 2014), suggesting a gradual increase in familiarity with target forms over time. The interaction between encounter and condition pointed to a possible exposure threshold after which pseudo words are read as fluently as familiar words. In extensive reading studies, 10 or more repetitions supported word learning (Pellicer-Sánchez, & Schmitt, 2010). This may imply that a full knowledge of meaning might have been established after sufficient exposures, triggering more fluent reading. The observed cutoff point in gaze durations and total times after 12 or 13 encounters may point to this stage of meaning acquisition although it does not exclude the possibility that some readers internalized word meanings at earlier encounters or at least accumulated partial knowledge over successive exposures. Further support for this assumption comes from the finding that regressions became less frequent in later encounters, indicating that readers might have formed plausible hypotheses about target word meanings in later encounters, which increased their fluency and allowed them to proceed with reading with less hesitation.
The role of predictability in the present study was consistent with previous research that associated high context predictability with reduced reading times and higher skipping rates (Kliegl et al., 2004; Rayner, & Well, 1996). One further finding in the light of online processing results was that the role of predictability became more important in later encounters than in early encounters with target words. A possible explanation for this observation is that pseudo words were better integrated in the sentence structure by later encounters because form retrieval became more fluent as a function of repetition. Due to the novelty of word forms, readers needed more repeated encounters to recognize them before they could rely on context to guess their meanings. This explanation is also consistent with assumptions from the E-Z Reader model presented by Reichle, Warren, and McConnell (2009), who postulated a post-lexical integration stage that begins immediately after word identification. In this stage, readers may require additional time to construct higher-level representations such as linking the word to its syntactic structure, creating a context-based semantic representation or incorporating the word meaning at the discourse level. This may explain the additional time shown for pseudo words in the present study, and the regression rates reported in early encounters.

The interaction between condition and predictability (as shown in Figures 3, 7 and 10) confirmed that highly predictable tokens required less processing in both the target and control conditions although it can be noted that this effect was slightly more pronounced with control words. This finding is consistent with the perceived effect of form unfamiliarity, which interfered with the role of predictability in early encounters.
It can also highlight the effect of lexical frequency on processing based on previous research reviews which maintained that low-frequency vocabulary attracts longer processing times (Clifton, Rayner, & Staub, 2007; Rayner, Raney, & Pollatsek, 1995; Rayner, 2007, 2009; Williams, & Morris, 2004). From a lexical perspective, the pseudo words integrated in the text can be claimed to share features with low-frequency vocabulary in English. Previous eye movement studies found that the level of frequency and predictability independently affected reading times and interacted with the number of exposures (e.g. Ashby, Rayner, & Clifton; Rayner, Raney, & Pollatsek, 1995). Further research can shed more light on the hypothesized interaction between word frequency and context predictability.

Taking a broader perspective, the role of context predictability can in fact extend beyond lexical tokens because pseudo words can incrementally acquire higher predictability over later encounters at the discourse level. The readers’ engagement with the content of the story at the discourse level can eventually feed into the predictability of individual words, particularly those that were repeated more often. This may explain the finding that estimated item predictability reduced the summed first fixations and summed gaze durations on individual target words.

4.2 Text-based effects on vocabulary learning

In line with previous literature on vocabulary acquisition (Nation, 2001; Schmitt, 2008, 2010), knowledge of form seemed to be the first component to develop, followed by meaning recognition and finally meaning recall. These differential learning rates can be explained in terms of a progression from the lowest to the highest cognitive demands on the learner’s memory.
In form recognition, the learner only needs to access the orthographic form of the target word from memory traces. In meaning recall, the learner needs sufficient informative encounters to guess meanings correctly and subsequently decontextualize words and retain them in memory. This is an even more demanding task than meaning recognition, where learners are given several options that trigger access to memory of contextual information encountered in the text. The overall picture of learning outcomes shows plausible and predictable patterns in line with the incremental nature of vocabulary knowledge development (Schmitt, 2008, 2010).

Form recognition was mainly influenced by total exposure (see Table 15) while gains in meaning recognition and recall were determined by an interaction between total exposure and item predictability (Tables 16 and 17). Repeated exposure to items that were categorized as highly predictable yielded increases in the chances of meaning recognition and recall while low-context items did not show that linear trend, implying that the ambiguity of certain items attenuated the effects of repeated exposure in the acquisition of word meanings. A further finding is that the effect size of predictability was strongest in meaning recall, implying that the small gains reported in the meaning recall test were associated with the most predictable items in the text. These findings are in line with previous vocabulary research (Webb, 2008) showing that repetition supports knowledge of form while context quality supports knowledge of meaning. Readers were able to retain traces of word forms due to repetition regardless of context while acquiring meaning required further contextual support, which was not available to the same degree in all exposures.
When a vocabulary item was highly predictable, high exposure was an ideal setting for accurate guessing and retention of word meanings while a combination of low context and high exposure was more conducive to form recognition and inconsistently associated with meaning gains. Overall, repetition was effective in all vocabulary gains while context predictability enhanced these gains, especially in the low-exposure and medium-exposure bands (Figures 13, 14 and 15).

4.3 Early indicators of vocabulary intake

Token-based analyses were conducted to explore whether lexical processing patterns and the predictability of individual encounters provided early predictions of the probability of retaining new vocabulary items in the three types of posttest. Controlling for total exposure, it was found that total time spent on individual tokens was associated with successful intake in all three vocabulary measures. Additionally, first fixations predicted form recognition while gaze durations predicted meaning recall. The kind of associations found between online processing and different types of vocabulary gain aligns with the claim that different eye movement measures tap into different cognitive processes. Within the framework of the E-Z Reader model (Reichle, Rayner, & Pollatsek, 2003; Pollatsek, Reichle, & Rayner, 2006), lexical processing has been posited to proceed in two stages: an early stage called ‘familiarity check’, and a later stage referred to as the completion of lexical access. The fact that first fixation durations predicted form recognition conforms to this hypothesis in that early lexical processing is largely form-focused (Reichle, Warren, & McConnell, 2009). Gaze duration, as the total duration of early processing, predicted meaning recall, which may indicate that form-meaning mapping and encoding into memory become more important with subsequent fixations on the target word.
The same principle would explain why total time, as a late measure, predicted all types of vocabulary learning. Because total time marks the completion of lexical access and sentence integration, it was indicative of the total attention devoted to each individual token in the text. As the total time spent on every encounter of a target word increased, there was more chance that the reader would retain that word in all vocabulary measures.

Token predictability strongly supported meaning recognition and recall. This suggests that, after controlling for reading times, the contextual properties of individual encounters offered crucial support for retaining knowledge of meaning. Similar to the item-level analyses, token predictability was not significantly associated with form recognition. The input characteristics required for form retention seemed to be largely dependent on repetition regardless of context levels. It can be generalized that total reading times and token predictability were the major early indicators of vocabulary intake. As readers paid more attention to individual tokens, they were more likely to acquire knowledge of form and meaning about vocabulary items. The chance of meaning retention was boosted further when readers spent more time on highly predictable tokens.

4.4 Combined measures of attention and exposure

Another perspective in investigating attention to target words was to combine fixation times over individual tokens of each item. The goal of such analysis was to compare overall attention, as reflected in summed fixation times, with text-based effects of total exposure and item predictability. It was clearly shown that summed total reading times positively predicted learning outcomes in all vocabulary measures. This confirmed an association between online processing and lexical retention as documented by previous research (Godfroid et al., 2013).
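To give a sense of scale for the summed total time effects, the odds ratios in Tables 21-23 can be re-expressed as changes in predicted probability. The sketch below is only an illustration under stated assumptions: it treats the intercept odds as the baseline with all other predictors at their reference values, ignores random effects, and uses hypothetical function names. For form recognition it gives roughly 13 percentage points, which is consistent with the figure reported in the results.

```python
def probability(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)

def probability_change(intercept_odds, odds_ratio):
    """Change in predicted probability for a one-unit increase in a predictor.

    Assumption: baseline odds equal the model intercept, i.e. every other
    predictor sits at its reference value.
    """
    baseline = probability(intercept_odds)
    shifted = probability(intercept_odds * odds_ratio)
    return shifted - baseline

# (intercept odds, OR for summed total fixation duration), taken from Tables 21-23
summed_total_time = {
    "form recognition": (0.18, 2.16),
    "meaning recognition": (0.15, 1.47),
    "meaning recall": (0.033, 3.27),
}

for measure, (intercept, odds_ratio) in summed_total_time.items():
    change = probability_change(intercept, odds_ratio)
    print(f"{measure}: {100 * change:+.1f} percentage points")
```

Because the conversion depends on the baseline, the same odds ratio corresponds to a smaller probability change when the baseline probability is very low, as it is for meaning recall.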
The fact that summed total times were a strong predictor of learning outcomes after controlling for total exposure (Tables 21, 22 and 23) might indicate that individual attention to target words can explain the variance in vocabulary learning above and beyond mere repeated exposures. This finding aligns with lexical processing data which showed that readers invested more time in initial encounters checking for familiarity and reanalyzing context. From a reader’s perspective, exposures were not equal in the amount of context and information they provided about target words. Thus, when we compare online times with total exposure, we are actually comparing two dimensions of exposure that I distinguish as dynamic versus static exposure. Dynamic exposure involves the sum of all the information that readers have accrued from all encounters with a given word while static exposure mainly represents an offline scale variable; that is, a number. In the present study, dynamic exposure captured readers’ interaction with target words and all the stages of lexical integration (Reichle et al., 2009) that have contributed to the incremental development of word knowledge as a byproduct of exposure. From this perspective, it was plausible to find that the way readers utilized their repeated encounters with target words strongly predicted learning outcomes beyond encounters per se. It was interesting to note that the effect of item predictability was somewhat attenuated by summed reading times in meaning recognition and recall. This may suggest that, while context information remains important for meaning acquisition, dynamic exposure to specific target words may moderate this effect to some extent based on individual reading behavior.

4.5 Overview

Some general statements can be presented in the light of the above discussion.
First, vocabulary learning from reading is not a byproduct of a single factor but is rather influenced by multiple variables with varying effect sizes. What makes this type of statistical analysis complicated is that the model controls for the effects of other variables in the equation before assessing the effect of the variable of interest. Therefore, it may moderate or reduce other effects. For example, text-based variables were found to be good predictors of learning. However, their roles were moderated, showing that there were factors beyond the text that had more important roles, particularly readers’ processing behavior.

Another general statement concerns the differences between token-based and item-based processing times. In real-time processing of individual words (i.e., tokens), it was found that first fixation durations predicted form recognition while gaze durations predicted meaning recall. This was explained in the light of the E-Z Reader model. However, when these measures were summed by item, their effects did not transfer for the most part (Tables 21 and 23). A possible explanation for this is that the two analyses provided two different pictures because the summed measures of first fixations or gaze durations only combine partial events of processing and do not reflect all the stages of lexical access. On the other hand, total time consistently predicted learning gains in both token-based and item-based analyses. This may well be because total times and summed total times reflected a more inclusive inventory of lexical processing events.

Finally, the role of predictability has been consistent for meaning recognition and recall in both text-based and token-based analyses. It may be reasonable to assume that predictability complemented the role of exposure and boosted the effects of reading times in the development of word meanings from context.
However, an interesting aspect of predictability that was shown in previous literature as well as in the present study is that high predictability induces shorter reading times and more frequent skipping (e.g. Kliegl et al., 2004; Rayner, & Well, 1996). Does this imply that highly predictable tokens or words received less attention? One explanation for this apparent tension is that readers might have looked at highly predictable tokens relatively less than other items because not as much processing was necessary for successful form-meaning mapping given that the context already provided part of the solution.

The overall vocabulary gains were good, relative to the amount of reading material and the limited time spent on task (around 41% in form recognition, 30% in meaning recognition and 13% in meaning recall). Although individual differences had minor effects on learning outcomes, the observed effect of attention as a holistic measure highlights the role of differences in reading behavior regarding incidental vocabulary learning. Because total times predicted vocabulary learning above and beyond total exposure, it can be concluded that it is the reader’s use of exposure opportunities and context information that determines the amount and quality of learning from reading, in addition to the static exposure and context properties of the written input.

CHAPTER 5: CONCLUSION

This final chapter is divided into three parts. First, I summarize major results and contributions of the study. Next, I present the practical and pedagogical implications informed by the results. Finally, I conclude with a brief discussion of limitations of the study, in addition to suggestions for future research.

5.1 Summary of the findings

The present study investigated incidental vocabulary acquisition from L2 reading using methods from extensive reading and eye movement research to highlight the role of cognitive processing in incidental vocabulary learning.
An important contribution of the study was to introduce a natural reading task of reasonable length in an eye movement setting, which can represent a further step in understanding real-time processes involved in incidental learning from reading. The experiment provided considerable ecological validity regarding the reading task, making use of available authentic graded readers in a close approximation of the leisure reading input that learners occasionally encounter in their ESL reading resources.

Readers exhibited signs of increased familiarity and reading fluency on target words over encounters whereas they paid more attention to new words in early exposures. Most learning was shown in form recognition, followed by meaning recognition and finally meaning recall. All learning outcomes were significantly predicted by total number of exposures while predictability only aided meaning recognition and recall. Total times on individual encounters provided early indicators of vocabulary learning success in all measures. Additionally, first fixations predicted form recognition whereas gaze durations predicted meaning recall. When aggregating processing measures, it was found that summed reading times predicted learning outcomes above and beyond text-based characteristics, which highlights the important role of readers’ individual attention and their optimal use of input to infer and retain meaning from context.

Results of the study emphasize the significant role of leisure reading in the incremental development of vocabulary knowledge, starting with the word form and gradually building connections with meanings and retaining them for immediate recall. Lexical properties influenced how readers interacted with text and identified new word forms and meanings. Repeated exposure to new vocabulary seems to be a key factor that guarantees sufficient processing opportunities for successful intake.
However, learning opportunities were further enriched when highly predictable items were repeatedly encountered in text. The probability of retaining word meanings was most closely associated with the most predictable vocabulary items in the text. On the cognitive level, the consistent relationship between total processing times and learning outcomes indicated that readers were more successful in gaining lexical knowledge from reading when they paid more attention to new vocabulary, taking advantage of exposure opportunities and context features to gather information about lexical items across several encounters.

The present study sheds more light on the cognitive aspects of engagement (Schmitt, 2008) and involvement (Laufer, & Hulstijn, 2001), which were emphasized in vocabulary acquisition research and particularly within the incidental learning framework. Reader engagement with lexical items is reflected in online measures which capture ongoing processing of new vocabulary in different contexts. This adds another dimension to extensive reading as a source of vocabulary development, distinguishing between learning opportunities offered by the text and the expected learning outcomes based on textual features and readers’ engagement.

5.2 Practical and pedagogical implications

The results of the study are mostly relevant to second language vocabulary learning and teaching. Maximizing exposure to vocabulary in rich contexts is a recommended strategy to ensure the best conditions for internalizing partially known words or acquiring new vocabulary. Exposure is not confined to reading, but can also be extended to task-based learning where different input modalities (speaking, listening, reading and writing) can integrate vocabulary learning goals in variable contexts (Brown, Waring, & Donkaewbua, 2008).
Task-based learning can extend beyond the classroom to include online courses that can be adapted to enhance the opportunities for incidental exposure to vocabulary in self-study modules. To increase the chances of vocabulary acquisition, it is recommended to recycle new vocabulary repeatedly through different types of teaching tasks over several class sessions to provide different contexts for targeted words.

The present study corroborates previous research on the role of extensive reading mainly in developing reading fluency along with creating possible opportunities for learning new vocabulary. Increasing reading fluency can be an early stage that sets the scene for acquiring new vocabulary. One relevant implication for extensive reading is that it can afford more familiarity with new lexical items but this does not guarantee successful internalization of new word meanings in a limited time frame. This is particularly true in light of the fact that the effects of extensive reading are longitudinal in nature. Reading programs should be evaluated over longer periods of time, considering all factors of input, textual features and individual reading behavior as well as learners’ motivation.

Vocabulary reviews have shown that word knowledge is multifaceted and being able to retrieve the meaning of a given L2 word is just one aspect of this knowledge (Nation, 2001; Schmitt, 2008). Lexical gain results in the present study corroborate this principle and relate it to multiple exposures and levels of predictability in the text. Teachers should consider this fact in their testing material so as to accurately gauge different levels of their students’ lexical knowledge and set up plans for their vocabulary building strategies.

5.3 Limitations and further research

The current study provides additional insights into SLA vocabulary research and extends further understanding of the cognitive aspects of incidental vocabulary acquisition.
As a newly integrated technology in second language vocabulary research, the eye-tracking technique can answer specific questions about learners' interaction with L2 material with considerable temporal and spatial accuracy. Implementing eye-tracking methodology in SLA is likely to open new avenues of investigation to uncover detailed cognitive processes in language acquisition in general and vocabulary development in particular.

Some methodological issues need to be discussed regarding the nature of tasks and participants in the present study. First, using a head mount and a chin rest during the reading task might have interfered with readers' natural reading behavior to some extent. Further eye-tracking research can make use of more advanced techniques to maximize the ecological validity of task performance without jeopardizing the accuracy of eye movement measures. The second point concerns the use of pseudo words for the study. As learners were expected to know the real words for the target items, they may have concluded that the novel words they encountered in reading were less frequent synonyms of the words they already knew, an impression that may have reduced their motivation or cognitive effort to incorporate the new lexical items. Moreover, the lab-controlled experiment condensed the number of exposures into one experimental session, which may not exactly match the typical incremental route that learners go through in incidental learning, where repeated exposures are spaced over longer periods of time. For practical reasons, delayed posttests were not conducted. Further research should consider the role of repetition and context in vocabulary retention over time.

The roles of repeated exposure and context quality can be investigated in different modalities and different teaching tasks.
Findings from task-based vocabulary learning research have associated vocabulary acquisition with the concepts of engagement (Schmitt, 2008) and involvement load (Laufer & Hulstijn, 2001). Looking at the cognitive perspectives of task performance through eye-tracking techniques can shed more light on how learners respond to tasks with different levels of difficulty, how engagement is reflected in their online processing and how their attentional resources are divided between task completion and lexical processing of unknown words.

Vocabulary acquisition from L2 reading is usually characterized as incidental when learners are not forewarned of a vocabulary test after receiving input. In the current study, the amount of attention measured through eye movements seemed to be learner-driven because there was no external motivation that manipulated the existence or amount of attention on target vocabulary. Future research can examine how drawing readers' attention to novel words in L2 input can yield different processing patterns and, in turn, affect the amount of vocabulary gains. However, this kind of methodological manipulation should interpret vocabulary gains in terms of a clear distinction between incidental and intentional learning settings.

Individual differences in proficiency and vocabulary size did not yield significant effects on how much learners were able to utilize context to learn new words. However, it may be interesting for further inquiry to investigate how diverse L1 backgrounds may play a role in facilitating or hampering lexical acquisition from reading, particularly in form recognition. Future research can address the effects of script differences on incidental learning and how they interact with learners' proficiency in vocabulary development.

Finally, the ideal extensive reading study would be longitudinal in nature and would evaluate learning outcomes from several readings over longer periods of time (Horst, 2005).
The present study provides a model for further large-scale research that can consider a wider variety of reading material and more authentic texts with different populations of second language learners. Although eye movement research can provide a precise quantitative account of lexical processing, it would be an additional asset in future studies to apply stimulated recalls or think-aloud protocols to explore qualitative aspects of attention to target words and reading fluency and their relationship to vocabulary acquisition (Rott, 2005; Rott & Williams, 2003). Generally speaking, combining quantitative and qualitative methods to explore lexical learning from reading would add to our understanding of attention and engagement in reading comprehension and provide further implications for the process of incidental vocabulary learning from L2 reading.

APPENDICES

Appendix A: Participant Information

Table 25 Participants' proficiency and vocabulary size chart

Participant ID: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
Age: 38 21 29 35 19 23 30 19 19 29 20 21 40 18 20 19 24 19 19 20 20 31 18 30 30 21 19 26 24 27 21 20 28 23 20 22 21 22 25 27 28 22
Gender: M M M F F M F F M F F M F M F M M F F M M M M F F F F M M F F F M F F F M F M F F F
L1: Arabic Portuguese Twi Spanish Portuguese Chinese Arabic Japanese Chinese Spanish Russian Chinese Portuguese Ndebele Korean Chinese Shona Chinese Japanese Japanese Chinese Chinese Spanish Chinese Amharic Spanish Hindi Chinese Chinese Chinese Yoruba Portuguese Swahili Hindi Chinese Portuguese Polish Arabic Arabic Hindi Japanese Japanese
Proficiency: 79 80 86 100 88 80 79 86 80 100 90 86 96 88 83 92 97 92 82 90 79 84 93 87 100 94 97 88 79 80 100 86 86 100 79 90 100 86 79 90 88 79 92
Vocabulary size: 3300 3800 2870 4490 3190 4730 4570 3140 4420 4930 3250 4260 4630 3880 4130 4290 2830 2980 3800 4630 3900 3900 2790 2730 4680 4660 4230 4070 3890 4360 3300 3510 4520 3550 2820 4470 4430 3290 3920 4570 3580 4630

Appendix B: Background questionnaire

PLEASE FILL OUT THE FOLLOWING BACKGROUND INFORMATION:

1. First name: ________________ Last name: ________________
2. Age: _____
3. Gender: [ ] Male [ ] Female
4. Year of study: [ ] Freshman [ ] Sophomore [ ] Junior [ ] Senior [ ] Graduate Student
5. Major: ____________________________
6. In which section of English are you enrolled? ________________
7. How many years have you been studying English? _______
8. How old were you when you started learning English? ________________
9. Your native language: ________________
10. Other languages you studied: ________________
11. Language(s) spoken at home: ________________
12. Your recent TOEFL/IELTS score: ________________
13. On a scale from 1 to 9, how do you rate your skills in the following areas of English?
Reading 1 2 3 4 5 6 7 8 9
Writing 1 2 3 4 5 6 7 8 9
Vocabulary size 1 2 3 4 5 6 7 8 9
Overall Proficiency 1 2 3 4 5 6 7 8 9

Thank you very much for your time

Appendix C: Sample of reading material

'Goodbye Mr. Hollywood'

Chapter 1: Mystery Girl

It all began on a beautiful spring morning in a village called Whistler, in Canada - a pretty little village in the mountains of British Columbia. There was a café in the village, with tables outside, and at one of these tables sat a young man. He finished his breakfast, drank his coffee, looked up into the blue sky, and felt the warm sun on his face. Nick Lortz was a happy man. The waiter came up to his table. 'More coffee?' he asked. 'Yeah. Great,' said Nick. He gave the waiter his coffee cup. The waiter looked at the camera on the table. 'On vacation?' he said. 'Where are you from?' 'San Francisco,' Nick said. He laughed. 'But I'm not on vacation - I'm working. I'm a travel writer, and I'm doing a book on mountains in North America. I've got some great pictures of your mountain.' The two men looked up at Whistler Mountain behind the village. It looked very beautiful in the morning sun. 'Do you travel a lot, then?' asked the waiter.
'All the time,' Nick said. 'I write books, and I write for travel magazines. I write about everything – different countries, towns, villages, rivers, mountains, people ...' The waiter looked over Nick's head. 'There's a girl across the street,' he said. 'Do you know her?' Nick turned his head and looked. 'No, I don't.' 'Well, she knows you, I think,' the waiter said. 'She's watching you very carefully.' He gave Nick a smile. 'Have a nice day!' He went away, back into the cafe. Nick looked at the girl across the street. She was about twenty-five, and she was very pretty. 'She is watching me,' Nick thought. Then the girl turned and looked in one of the shop windows. After a second or two, she looked back at Nick again. Nick watched her. 'She looks worried,' he thought. 'What's she doing? Is she waiting for somebody?' Suddenly, the girl smiled. Then she walked across the street, came up to Nick's table, and sat down. She put her bag down on the table. The bag was half open. 'Hi! I'm Jan,' she said. 'Do you remember me? We met at a party in Toronto.' 'Hi, Jan,' said Nick. He smiled. 'I'm Nick. But we didn't meet at a party in Toronto. I don't go to parties very often, and never in Toronto.' 'Oh,' the girl said. But she didn't get up or move away. 'Have some coffee,' said Nick. The story about the party in Toronto wasn't true, but it was a beautiful morning, and she was a pretty girl. 'Maybe it was a party in Montreal. Or New York.' The girl laughed. 'OK. Maybe it was. And yes, I'd love some coffee.' When she had her coffee, Nick asked, 'What are you doing in Whistler? Or do you live here?' 'Oh no,' she said. 'I'm just, err, just travelling through. And what are you doing here?' 'I'm a travel writer,' Nick said, 'and I'm writing a book about famous mountains.' 'That's interesting,' she said. But her face was worried, not interested, and she looked across the road again. A man with very short, white hair walked across the road.
He was about sixty years old, and he was tall and thin. The girl watched him. 'Are you waiting for someone?' asked Nick. 'No,' she said quickly. Then she asked, 'Where are you going next, Nick?' 'To Vancouver, for three or four days,' he said. 'When are you going?' she asked. 'Later this morning,' he said. There was a letter in the top of the girl's half-open bag. Nick could see some of the writing, and he read it because he saw the word 'Vancouver' - '... and we can meet at the Empress Hotel, Victoria, Vancouver Island, on Friday afternoon ...' 'So she's going to Vancouver too,' he thought. Suddenly the girl said, 'Do you like movies?' 'Movies? Yes, I love movies,' he said. 'Why?' 'I know a man, and he - he loves movies, and going to the cinema,' she said slowly. 'People call him "Mr Hollywood".' She smiled at Nick. 'Can I call you "Mr Hollywood" too?' Nick laughed. 'OK,' he said. 'And what can I call you?' She smiled again. 'Call me Mystery Girl,' she said. 'That's a good name for you,' said Nick. Just then, the man with white hair came into the cafe. He did not look at Nick or the girl, but he sat at a table near them. He asked the waiter for some breakfast. Then he began to read a magazine. The girl looked at the man, then quickly looked away again. 'Do you know him?' Nick asked her. 'No,' she said. She finished her coffee quickly and got up. 'I must go now,' she said. Nick stood up, too. 'Nice to –' he began. But the girl suddenly took his face between her hands, and kissed him on the mouth. 'Drive carefully, Mr. Hollywood. Goodbye,' she said, with a big, beautiful smile. Then she turned and walked quickly away. Nick sat down again and watched her. She walked down the road and into a big hotel. 'Now what,' thought Nick, 'was that all about?' The man with white hair watched Nick and waited. After four or five minutes, Nick finished his coffee, took his books and his camera, and left the cafe.
His car was just outside the girl's hotel, and he walked slowly along the street to it. The man with white hair waited a second, then quickly followed Nick. From a window high up in the hotel, the girl looked down into the road. She saw Nick, and the man with white hair about fifty yards behind him. Nick got into his car, and the man with white hair walked quickly to a red car across the street. Five seconds later Nick drove away in his blue car, and the red car began to follow him. When the girl saw this, she smiled, then went to put some things in her travel bag.

Appendix D: Sample page from the comprehension packet

Chapter Two: Hand in the back

Check True or False
1- Nick asked the girl to see her again in Vancouver.
2- The girl was not telling the truth about her name.
3- The weather was nice in Vancouver when Nick arrived.
4- A car hit Nick in the middle of the street.

Answer briefly
5- What do you think happened to Nick in the street? Who did that to him?
………………………………………………………………………………………………………
6- How did Nick know more information on the mystery girl?
a) She called him and told him everything
b) From a TV show about her family
c) From a magazine he was reading
d) The policeman told him about her
7- Nick learned things about the Mystery girl (check all that apply):
a) She changed her name
b) She is from Toronto
c) She is a daughter of a millionaire
d) She knows the man with white hair

Appendix E: Reading perception questionnaire

Read the following statements and say how much you agree or disagree with them by simply circling a number from 1 to 6.
Strongly disagree (1), Disagree (2), Slightly disagree (3), Slightly agree (4), Agree (5), Strongly agree (6)

I enjoyed reading this book 1 2 3 4 5 6
I would like to read it again 1 2 3 4 5 6
This was comfortable to read 1 2 3 4 5 6
I needed a dictionary to read it 1 2 3 4 5 6
It was easy to read 1 2 3 4 5 6
The story was complicated and not clear 1 2 3 4 5 6
I often lost track of the story 1 2 3 4 5 6
It took longer to read than I expected 1 2 3 4 5 6
In general, I enjoy reading such stories 1 2 3 4 5 6
The eye tracking experience was disturbing 1 2 3 4 5 6

Appendix F: Form recognition test

Circle only the words you have encountered in the story

1 ship sense joker tame table
2 bannop hospital bus rude pag
3 fozle mystery tance mave lame
4 fonteen gell chortan stoll tund
5 shame window nase gun camera
6 rim sind fake mork letter
7 blef red rimple cube pamery
8 kerp crasty lead mot shoes
9 bannow havoc barn money pennem
10 hungry subid room speat smick
11 happy bandle neech doom prink
12 jurgs busy hair manage desk
13 levider tidge yelt noise airport
14 commute dress drink system bing
15 bannifet meet similar dillet tantic
16 windle mand push redaster vack
17 leam tantic popkum nook toker
18 fungi dook dangy megole smile
19 cheem borch gotty tickeny palk
20 dorch gube plampy cold dern

Appendix G: Meaning recall test

Column headings: No, Meaning?, clue, Similar to, Related to ?

1 ferry
2 mot
3 blef
4 fonteen
5 mystery
6 windle
7 rude
8 leam
9 redaster
10 dorch

Appendix H: Meaning recognition test

Meaning recognition

Circle the best meaning for each of the words below as far as you know.
word 1 2 3 4 5
poor cheap I don't know move very fast stop a car get off the ground suddenly I don't know annoyed strong famous hungry I don't know joke false statement kind of humor way of speaking way of thinking I don't know blef To hide To push To kill to move I don't know windle club garden letter file I don't know upset Tired unhappy jump lie on top of water tantic neech Shirt hair dress drawer I don't know dangy dirty happy quiet bright I don't know working desk person's face big couch I don't know door I don't know busy I don't know past time I don't know mave bannow eye glasses Cap roof leam Slow fast century length measure hundred years window tall record

Appendix I: Modified cloze task

Appendix J: Token predictability data

Table 26 Estimated predictability for target tokens

Pseudo word fozle gube mave Predictability (version A) 56.7 67.5 75 90 57.5 5 38.3 5 8.3 6.7 37.5 80 47.5 87.5 80 20 77 82.5 52.5 7.5 45 10 55 35 52.5 15 25 31.7 27.5 30 28.3 15 35 12.5 30 neech 85 17.5 75 32.5 51.7 47.5 42.5 67.5 25 25 47.5 57.5 70 85 82.5 90 27.5 17.5 0 5 5 redaster dook tance pamery tantic dorch smick 87.5 72 65 71.7 10 57.7 72.5 tund leam blef toker bannow mot fonteen dangy windle Predictability (version B) 62.6 90 32.5 52.5 60 37.5 28.3 70.9 96.4 48 77.1 77.1 15.9 28 74.5 90.9 80 90.9 92.7 45.4 46 14.6 56.3 68.8 2.1 0 72.9 77.1 72.9 75 50 60 79.2 43.3 83.3 84 37.5 32.5 37.5 4.2 58.3 83.3 29.2 31.3 43.8 32.5 27.5 25 33.3 35.4 39.6 75 81.8 79.5 66.7 12.5 45 12.5 70 77.5 80 82.5 50 17.5 90 10 32.5 62.5 10 15 60 47.5 27.5 42.5 70 70 70 77.5 25 21.7 25 27.5 77.5 55 57.5 48.3 80 85 80 85 62.5 12.5 5 0 60 60 0 12.5 92.7 37.5 50 50 76 78 82 81.8 5.5 0 22.9 56.3 0 75 25.5 45.8 1.8 16.7 44 97.5 65 54.5 13.6 59 77 37.5 45.8 50 70 36.4 85.5 83.3 79.2 64.6 52.5 72.9 50 50 75 72.9 14.6 16.7 64.6 66.7 94.5 40 54.2 12.5 63.6 20.5 68.2 30 56.3 43.8 37.5 43.2 82 88 79 64.6 56.8 22.7 63.6 61.4 56.4 49.1 72.7 68.8 72.9 77.1 72.7 77.3 72.9 8.3 81.3 6.8 40 42 50 18.2 72.7
REFERENCES

Altarriba, J., Kroll, J. F., Scholl, A., & Rayner, K. (1996). The influence of lexical and conceptual constraints on reading mixed language sentences: Evidence from eye fixation and naming times. Memory and Cognition, 24, 477–492.
Ashby, J., Rayner, K., & Clifton, C. J. (2005). Eye movements of highly skilled and average readers: Differential effects of frequency and predictability. Quarterly Journal of Experimental Psychology, 58A, 1065-1086.
Balota, D. A., Pollatsek, A., & Rayner, K. (1985). The interaction of contextual constraints and parafoveal visual information in reading. Cognitive Psychology, 17, 364–388.
Beck, I. L., McKeown, M. G., & McCaslin, E. S. (1983). All contexts are not created equal. Elementary School Journal, 83, 177–181.
Brown, R., Waring, R., & Donkaewbua, S. (2008). Vocabulary acquisition from reading, reading-while-listening, and listening to stories. Reading in a Foreign Language, 20(2), 136–163.
Brusnighan, S. M., & Folk, J. R. (2012). Combining contextual and morphemic cues is beneficial during incidental vocabulary acquisition: Semantic transparency in novel compound word processing. Reading Research Quarterly, 47(2), 172-190.
Bruton, A., García López, M., & Esquiliche Mesa, R. (2011). Incidental L2 vocabulary learning: An impracticable term? TESOL Quarterly, 45, 759–768.
Chen, C., & Truscott, J. (2010). The effects of repetition and L1 lexicalization on incidental vocabulary acquisition. Applied Linguistics, amq031.
Clifton, C., Jr., Staub, A., & Rayner, K. (2007). Eye movements in reading words and sentences. In R. P. G. van Gompel, M. H. Fischer, W. S. Murray, & R. L. Hill (Eds.), Eye movements: A window on mind and brain (pp. 341-371). Amsterdam: Elsevier, North Holland.
Day, R. R., & Bamford, J. (1998). Extensive reading in the second language classroom. Cambridge: Cambridge University Press.
Coady, J., & Huckin, T. N. (1997). Second language vocabulary acquisition: A rationale for pedagogy.
Cambridge, U.K.; New York: Cambridge University Press.
Day, R., Omura, C., & Hiramatsu, M. (1991). Incidental EFL vocabulary learning and reading. Reading in a Foreign Language, 7, 541–551.
Dobinson, T. (2001). Do learners learn from classroom interaction and does the teacher have a role to play? Language Teaching Research, 5(3), 189-211.
Eckerth, J., & Tavakoli, P. (2012). The effects of word exposure frequency and elaboration of word processing on incidental L2 vocabulary acquisition through reading. Language Teaching Research, 16(2), 227-252.
Ehrlich, S. F., & Rayner, K. (1981). Contextual effects on word perception and eye movements during reading. Journal of Verbal Learning and Verbal Behavior, 20(6), 641-655.
Elley, W. B. (1989). Vocabulary acquisition from listening to stories. Reading Research Quarterly, 24(2), 174–187.
Elley, W. B. (1991). Acquiring literacy in a second language: The effect of book-based programs. Language Learning, 41, 375–411.
Ellis, R. (1999). Learning a second language through interaction. Amsterdam: John Benjamins.
Ellis, R., & He, X. (1999). The roles of modified input and output in the incidental acquisition of word meanings. Studies in Second Language Acquisition, 21(2), 285-301.
Ellis, R., Tanaka, Y., & Yamazaki, A. (1994). Classroom interaction, comprehension, and the acquisition of L2 word meanings. Language Learning, 44(3), 449-491.
Ferguson, C. J. (2009). An effect size primer: A guide for clinicians and researchers. Professional Psychology: Research and Practice, 40(5), 532-538.
Folse, K. (2006). The Effect of Type of Written Exercise on L2 Vocabulary Retention. TESOL Quarterly, 40(2), 273-293.
Fraser, C. A. (1999). Lexical processing strategy use and vocabulary learning through reading. Studies in Second Language Acquisition, 21(2), 225-241.
Fraser, C. (2007). Reading rate in L1 Mandarin Chinese and L2 English across five reading tasks. The Modern Language Journal, 91, 372–394.
doi: 10.1111/j.1540-4781.2007.00587.x
Gass, S. (1999). Incidental vocabulary learning. Studies in Second Language Acquisition, 21(2), 319-333.
Gass, S. M., Behney, J., & Plonsky, L. (2013). Second language acquisition: An introductory course. New York: Routledge.
Grabe, W., & Stoller, F. (1997). Reading and vocabulary development in a second language: A case study. In J. Coady & T. Huckin (Eds.), Second language vocabulary acquisition (pp. 98–123). Cambridge, England: Cambridge University Press.
Grabe, W., & Stoller, F. (2011). Teaching and researching reading (2nd ed.). Harlow, Essex: Pearson Education.
Godfroid, A., Housen, A., & Boers, F. (2010). A procedure for testing the Noticing Hypothesis in the context of vocabulary acquisition. In M. Pütz & L. Sicola (Eds.), Inside the learner's mind: Cognitive processing and second language acquisition (pp. 169-197). Amsterdam/Philadelphia: John Benjamins.
Godfroid, A. (2012). Eye tracking. In P. Robinson (Ed.), The Routledge encyclopedia of second language acquisition (pp. 234-236). New York/London: Routledge.
Godfroid, A., Boers, F., & Housen, A. (2013). An eye for words: Gauging the role of attention in incidental L2 vocabulary acquisition by means of eye-tracking. Studies in Second Language Acquisition, 35(3), 483-517. doi: 10.1017/S0272263113000119
Haastrup, K. (2008). Lexical inferencing procedures in two languages. In D. Albrechtsen, K. Haastrup, & B. Henriksen, Vocabulary and writing in a first and second language: Process and development (pp. 67–111). Basingstoke: Palgrave Macmillan.
Heatley, A., Nation, I. S. P., & Coxhead, A. (2002). Range [Computer software]. Retrieved from http://www.victoria.ac.nz/lals/staff/paul-nation/nation.aspx
Heck, R. H., Thomas, S., & Tabata, L. N. (2012). Multilevel modeling of categorical outcomes using IBM SPSS. New York: Routledge.
Hill, M., & Laufer, B. (2003). Type of task, time-on-task and electronic dictionaries in incidental vocabulary acquisition.
International Review of Applied Linguistics in Language Teaching, 41(2), 87–106.
Hirsh, D., & Nation, I. S. P. (1992). What vocabulary size is needed to read unsimplified texts for pleasure? Reading in a Foreign Language, 8, 689–696.
Horst, M. (2005). Learning L2 vocabulary through extensive reading: A measurement study. The Canadian Modern Language Review, 61(3), 355-382. doi: 10.3138/cmlr.61.3.355
Horst, M., Cobb, T., & Meara, P. (1998). Beyond a clockwork orange: Acquiring second language vocabulary through reading. Reading in a Foreign Language, 11(2), 207-223.
Hu, M., & Nation, I. S. P. (2000). Vocabulary density and reading comprehension. Reading in a Foreign Language, 13(1), 403–430.
Hu, H. C., & Nassaji, H. (2012). Ease of inferencing, learner inferential strategies, and their relationship with the retention of word meanings inferred from context. Canadian Modern Language Review, 68(1), 54-77.
Hu, H. C. M. (2013). The Effects of Word Frequency and Contextual Types on Vocabulary Acquisition from Extensive Reading: A Case Study. Journal of Language Teaching and Research, 4(3), 487-495.
Huang, S., Willson, V., & Eslami, Z. (2012). The Effects of Task Involvement Load on L2 Incidental Vocabulary Learning: A Meta-Analytic Study. Modern Language Journal, 96(4), 544-557.
Huckin, T., & Coady, J. (1999). Incidental vocabulary acquisition in a second language: A review. Studies in Second Language Acquisition, 21(2), 181-193.
Huckin, T. N., Haynes, M., & Coady, J. (1993). Second language reading and vocabulary learning. Norwood, NJ: Ablex Publishing Corporation.
Hulstijn, J. (1992). Retention of inferred and given word meanings: Experiments in incidental vocabulary learning. In P. Arnaud & H. Bejoint (Eds.), Vocabulary and applied linguistics (pp. 113-125). London: Macmillan Academic and Professional Limited.
Hulstijn, J., Hollander, M., & Greidanus, T. (1996).
Incidental vocabulary learning by advanced foreign language students: The influence of marginal glosses, dictionary use, and reoccurrence of unknown words. The Modern Language Journal, 80, 327–339.
Hulstijn, J. H., & Trompetter, P. (1998). Incidental learning of second language vocabulary in computer-assisted reading and writing tasks. In D. Albrechtsen, B. Hendricksen, M. Mees, & E. Poulsen (Eds.), Perspectives on foreign and second language pedagogy (pp. 191–200). Odense, Denmark: Odense University Press.
Hulstijn, J., & Laufer, B. (2001). Some Empirical Evidence for the Involvement Load Hypothesis in Vocabulary Acquisition. Language Learning, 51(3), 539-558.
Hulstijn, J. H. (2001). Intentional and incidental second language vocabulary learning: A reappraisal of elaboration, rehearsal and automaticity. In P. Robinson (Ed.), Cognition and second language instruction (pp. 258-286). Cambridge: Cambridge University Press.
Hulstijn, J. (2003). Incidental and intentional learning. In C. Doughty & M. Long (Eds.), The handbook of second language acquisition. Malden, MA; Oxford: Blackwell Publishing.
Hyönä, J., & Niemi, P. (1990). Eye movements during repeated reading of a text. Acta Psychologica, 73, 259-280.
Jing, L., & Jianbin, H. (2009). An empirical study of the involvement load hypothesis in incidental vocabulary acquisition in EFL listening. Polyglossia, 16, 1-11.
Joe, A. (2010). The Quality and Frequency of Encounters with Vocabulary in an English for Academic Purposes Programme. Reading in a Foreign Language, 22(1), 117-138.
Joseph, H. S., Wonnacott, E., Forbes, P., & Nation, K. (2014). Becoming a written word: Eye movements reveal order of acquisition effects following incidental exposure to new words during silent reading. Cognition, 133(1), 238-248.
Juhasz, B. J., & Pollatsek, A. P. (2011). Lexical influences on eye movements in reading. In S. P. Liversedge, I. D. Gilchrist, & S. Everling (Eds.), The Oxford handbook on eye movements (pp. 873–893).
Oxford: Oxford University Press.
Kliegl, R., Grabner, E., Rolfs, M., & Engbert, R. (2004). Length, frequency, and predictability effects of words on eye movements in reading. European Journal of Cognitive Psychology, 16(1-2), 262-284.
Knight, S. (1994). Dictionary Use While Reading: The Effects on Comprehension and Vocabulary Acquisition for Students of Different Verbal Abilities. The Modern Language Journal, 78(3), 285-299.
Kweon, S., & Kim, H. (2008). Beyond raw frequency: Incidental vocabulary acquisition in extensive reading. Reading in a Foreign Language, 20(2), 191-215.
Lai, F. K. (1993). The effect of a summer reading course on reading and writing skills. System, 21, 87–100.
Laufer, B. (2003). Vocabulary acquisition in a second language: Do learners really acquire most vocabulary by reading? Some empirical evidence. Canadian Modern Language Review, 59, 567–587.
Laufer, B. (2005). Focus on Form in Second Language Vocabulary Learning. EUROSLA Yearbook, 5, 223-250. doi: 10.1075/eurosla.5.11lau
Laufer, B., & Hulstijn, J. (2001). Incidental Vocabulary Acquisition in a Second Language: The Construct of Task-Induced Involvement. Applied Linguistics, 22(1), 1-26.
Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage, learners' vocabulary size and reading comprehension. Reading in a Foreign Language, 22, 15–30.
Liversedge, S. P., & Rayner, K. (2011). Linguistic and cognitive influences on eye movements during reading. In I. Gilchrist & S. Everling (Eds.), The Oxford handbook of eye movements (pp. 751-766). Oxford: Oxford University Press.
Liversedge, S., Gilchrist, I., & Everling, S. (Eds.). (2011). The Oxford handbook of eye movements. Oxford University Press.
Lupescu, S., & Day, R. (1993). Reading, dictionaries, and vocabulary learning. Language Learning, 43(2), 263-287.
Macaro, E. (2003). Teaching and learning a second language: A review of recent research. London; New York: Continuum.
Mackey, A.
(1999). Input, interaction, and second language development: An empirical study of question formation in ESL. Studies in Second Language Acquisition, 21(4), 557-587.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 433-464). Oxford University Press.
Mackey, A., Gass, S., & McDonough, K. (2000). How do learners perceive interactional feedback? Studies in Second Language Acquisition, 22(4), 471-497.
Mackey, A., & Philp, J. (1998). Conversational interaction and second language development: Recasts, responses, and red herrings? Modern Language Journal, 82(3), 338-356.
Matsuoka, W., & Hirsh, D. (2010). Vocabulary learning through reading: Does an ELT course book provide good opportunities? Reading in a Foreign Language, 22(1), 56-70.
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (Vol. 2). London: Chapman and Hall.
Meara, P. (1992). EFL vocabulary test. Swansea, UK: University College, Centre for Applied Language Studies.
Menard, S. (2010). Logistic regression: From introductory to advanced concepts and applications. Thousand Oaks, CA: SAGE Publications, Inc.
Mohamed, A. (in press). Task-based incidental vocabulary learning in L2 Arabic: The role of proficiency and task performance. Accepted, JNCOLCTL.
Mohamed, A. (2012). Investigating incidental vocabulary acquisition in conversation classes: A qualitative and quantitative analysis. MSU Working Papers in SLS, 3(1), 30-48.
Mondria, J. A. (2003). The effects of inferring, verifying, and memorizing on the retention of L2 word meanings. Studies in Second Language Acquisition, 25(4), 473-499.
Mondria, J. A., & Wit-de Boer, M. (1991). The Effects of Contextual Richness on the Guessability and the Retention of Words in a Foreign Language. Applied Linguistics, 12(3), 249-267.
Nagy, W. (1997). On the role of context in first- and second-language vocabulary learning.
In N. Schmitt & M. McCarthy (Eds.), Vocabulary description, acquisition, and pedagogy (pp. 64–83). Cambridge: Cambridge University Press.
Nagy, W. E., Anderson, R. C., & Herman, P. A. (1987). Learning word meanings from context during normal reading. American Educational Research Journal, 24(2), 237–270.
Nassaji, H. (2003). L2 vocabulary learning from context: Strategies, knowledge sources, and their relationship with success in L2 lexical inferencing. TESOL Quarterly, 37(4), 645–670.
Nation, P., & Wang, K. (1999). Graded readers and vocabulary. Reading in a Foreign Language, 12(2), 355-380.
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge; New York: Cambridge University Press. doi: 10.1017/CBO9781139524759
Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? Canadian Modern Language Review, 63, 59–82.
Paribakht, T. S., & Wesche, M. (1999). Reading and "incidental" L2 vocabulary acquisition: An introspective study of lexical inferencing. Studies in Second Language Acquisition, 21(2), 195-224. doi: 10.1017/S027226319900203X
Parry, K. (1991). Building a vocabulary through academic reading. TESOL Quarterly, 25, 629–653.
Pellicer-Sanchez, A., & Schmitt, N. (2010). Incidental vocabulary acquisition from an authentic novel: Do things fall apart? Reading in a Foreign Language, 22(1), 31-55.
Peters, E., Hulstijn, J., Sercu, L., & Lutjeharms, M. (2009). Learning L2 German vocabulary through reading: The effect of three enhancement techniques compared. Language Learning, 59, 113–151.
Pigada, M., & Schmitt, N. (2006). Vocabulary acquisition from extensive reading: A case study. Reading in a Foreign Language, 18(1), 1-28.
Pitts, M., White, H., & Krashen, S. (1989). Acquiring second language vocabulary through reading: A replication of the clockwork orange study using second language acquirers. Reading in a Foreign Language, 5, 271–275.
Pollatsek, A., Reichle, E. D., & Rayner, K. (2006).
Tests of the E-Z Reader model: Exploring the interface between cognition and eye-movement control. Cognitive Psychology, 52, 1–56.
Powers, D. A., & Xie, Y. (2008). Statistical methods for categorical data analysis. Bingley, UK: Emerald.
Pulido, D. (2007). The relationship between text comprehension and second language incidental vocabulary acquisition: A matter of topic familiarity? Language Learning, 57(1), 155–199. doi: 10.1111/j.1467-9922.2007.00415.x
Rayner, K., Ashby, J., Pollatsek, A., & Reichle, E. D. (2004). The effects of frequency and predictability on eye fixations in reading: Implications for the E-Z Reader model. Journal of Experimental Psychology: Human Perception and Performance, 30, 720–732.
Raney, G., & Rayner, K. (1995). Word frequency effects and eye movements during two readings of a text. Canadian Journal of Experimental Psychology, 49(2), 151–172. doi: 10.1037/1196-1961.49.2.151
Rayner, K., Raney, G. E., & Pollatsek, A. (1995). Eye movements and discourse processing. In R. F. Lorch & E. J. O’Brien (Eds.), Sources of coherence in reading (pp. 9–36). Hillsdale, NJ: Lawrence Erlbaum Associates.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422. doi: 10.1037/0033-2909.124.3.372
Rayner, K. (2009). Eye movements and attention in reading, scene perception, and visual search. The Quarterly Journal of Experimental Psychology, 62(8), 1457–1506.
Rayner, K., & Well, A. D. (1996). Effects of contextual constraint on eye movements in reading: A further examination. Psychonomic Bulletin & Review, 3, 504–509.
Read, J. (2004). Research in teaching vocabulary. Annual Review of Applied Linguistics, 24, 146–161. doi: 10.1017/S0267190504000078
Reichle, E. D., Rayner, K., & Pollatsek, A. (2003). The E-Z Reader model of eye movement control in reading: Comparison to other models. Behavioral and Brain Sciences, 26, 445–476.
Reichle, E. D., Warren, T., & McConnell, K. (2009).
Using E-Z Reader to model the effects of higher level language processing on eye movements during reading. Psychonomic Bulletin & Review, 16(1), 1–21.
Richards, J., & Schmidt, R. (2002). Longman dictionary of language teaching and applied linguistics. Malaysia: Pearson Education.
Robb, T. N., & Susser, B. (1989). Extensive reading vs. skills building in an EFL context. Reading in a Foreign Language, 5, 239–251.
Rott, S. (1999). The effect of exposure frequency on intermediate language learners’ incidental vocabulary acquisition and retention through reading. Studies in Second Language Acquisition, 21(3), 589–619.
Rott, S. (2005). Processing glosses: A qualitative exploration of how form-meaning connections are established and strengthened. Reading in a Foreign Language, 17(2), 95–124.
Rott, S., & Williams, J. (2003). Making form-meaning connections while reading: A qualitative analysis of word processing. Reading in a Foreign Language, 15(1), 45–75.
Rott, S., Williams, J., & Cameron, R. (2002). The effect of multiple-choice L1 glosses and input-output cycles on lexical acquisition and retention. Language Teaching Research, 6, 183–222.
Saragi, T., Nation, I. S. P., & Meister, F. (1978). Vocabulary learning and reading. System, 6, 72–78.
Schaffin, R., Morris, R. K., & Seely, R. E. (2001). Learning new words in context: A study of eye movements. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 225–235. doi: 10.1037/0278-7393.27.1.225
Schmitt, N. (2008). Review article: Instructed second language vocabulary learning. Language Teaching Research, 12(3), 329–363. doi: 10.1177/1362168808089921
Schmitt, N. (2010). Researching vocabulary: A vocabulary research manual. Palgrave Macmillan.
Schmitt, N., Jiang, X., & Grabe, W. (2011). The percentage of words known in a text and reading comprehension. The Modern Language Journal, 95, 26–43.
Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129–158.
doi: 10.1093/applin/11.2.129
Schouten-van Parreren, C. (1989). Vocabulary learning through reading: Which conditions should be met when presenting words in texts? AILA Review, 6(1), 75–85.
Schwanenflugel, P. J., Stahl, S. A., & McFalls, E. L. (1997). Partial word knowledge and vocabulary growth during reading comprehension. Journal of Literacy Research, 29(4), 531–553.
Schwanenflugel, P. J., & LaCount, K. L. (1988). Semantic relatedness and the scope of facilitation for upcoming words in sentences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14(2), 344.
Uden, J., Schmitt, D., & Schmitt, N. (2014). Can learners make the jump from the highest graded readers to ungraded novels?: Four case studies. Reading in a Foreign Language, 26(1), 1–28.
Van Gompel, R. P. G., Fischer, M. H., Murray, W. S., & Hill, R. L. (Eds.). (2007). Eye movements: A window on mind and brain. Oxford: Elsevier.
Vidal, K. (2010). A comparison of the effects of reading and listening on incidental vocabulary acquisition. Language Learning, 61(1), 219–258.
Waring, R., & Nation, I. S. P. (2004). Second language reading and incidental vocabulary learning. Angles on the English-Speaking World, 4, 97–110.
Waring, R., & Takaki, M. (2003). At what rate do learners learn and retain new vocabulary from a graded reader? Reading in a Foreign Language, 15, 130–163.
Watanabe, Y. (1997). Input, intake and retention: Effects of increased processing on incidental learning of foreign language vocabulary. Studies in Second Language Acquisition, 19, 287–307. doi: 10.1017/S027226319700301X
Webb, S. (2005). Receptive and productive vocabulary learning: The effects of reading and writing on word knowledge. Studies in Second Language Acquisition, 27(1), 33–52.
Webb, S. (2007). The effects of repetition on vocabulary knowledge. Applied Linguistics, 28, 46–65. doi: 10.1093/applin/aml048
Webb, S. (2008). The effects of context on incidental vocabulary learning.
Reading in a Foreign Language, 20(2), 232–245.
Webb, S. (2010). Using glossaries to increase the lexical coverage of television programs. Reading in a Foreign Language, 22, 201–221.
Williams, R. S., & Morris, R. K. (2004). Eye movements, word familiarity, and vocabulary acquisition. European Journal of Cognitive Psychology, 16, 312–339.
Winke, P. M., Godfroid, A., & Gass, S. M. (2013). Introduction to the special issue: Eye-movement recordings in second language research. Studies in Second Language Acquisition, 35(2), 205–212. doi: 10.1017/S027226311200085X
Wochna, K. L., & Juhasz, B. J. (2013). Context length and reading novel words: An eye movement investigation. British Journal of Psychology, 104(3), 347–363.
Yaqubi, B., Rayati, A., & Gorgi, N. (2010). The involvement load hypothesis and vocabulary learning: The effect of task types and involvement index on L2 vocabulary acquisition. The Journal of Teaching Language Skills, 2(1), 59/4.
Zahar, R., Cobb, T., & Spada, N. (2001). Acquiring vocabulary through reading: Effects of frequency and contextual richness. Canadian Modern Language Review, 57(4), 541–572.
Zimmerman, C. B. (1997). Historical trends in second language vocabulary instruction. In J. Coady & T. Huckin (Eds.), Second language vocabulary acquisition: A rationale for pedagogy (pp. 5–19). Cambridge, England: Cambridge University Press.