CONSEQUENCES OF BILINGUALISM FOR SPEECH UNDERSTANDING IN NOISE
By Jens Schmidtke
A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
Second Language Studies – Doctor of Philosophy
2015

ABSTRACT
 CONSEQUENCES OF BILINGUALISM FOR SPEECH UNDERSTANDING IN NOISE
 By Jens Schmidtke
The present study sought to identify factors associated with speech understanding in noise (SUN) ability in monolingual and bilingual listeners. The Ease of Language Understanding (ELU) model predicts that mismatches between the speech signal and phonological representations stored in long-term memory (LTM) will result in greater explicit processing effort and, as a consequence, decreased comprehension. Such mismatches can be the result of signal degradations or imprecise lexical representations in LTM. Based on the lexical quality hypothesis (Perfetti & Hart, 2002; Perfetti, 2007), it was hypothesized that the quality of lexical representations would differ within speakers as a function of word frequency and between speakers as a function of overall language experience, operationalized here as vocabulary knowledge. From these assumptions it followed that bilingual speakers would have less precise lexical representations than monolinguals because of their reduced language experience as a result of speaking two languages. A second hypothesis was that the same relationship between vocabulary knowledge and SUN exists in monolingual and bilingual speakers.

The present study tested these predictions in a sample of 53 English monolingual and 48 early Spanish-English bilingual speakers with a mean age of 20.7 years (SD = 2.6, range = 18-31). All participants completed two subtests of verbal ability (picture vocabulary and verbal analogies) from the Woodcock-Muñoz Language Survey (WMLS), a standardized test of English. In addition, participants completed tests that were believed to be associated with SUN: a verbal WM test, a nonlinguistic test of auditory attention, and a consonant perception in noise test. SUN was tested using sentences from a previously published test, the Speech Perception in Noise (SPIN) test (Bilger, Nuetzel, Rabinowitz, & Rzeczkowski, 1984), at two signal-to-noise ratios (SNR; 3 dB and -2 dB), using multi-talker background babble as the noise masker. The participants' task was to type the last word of the sentence, which was either predictable from context (e.g., The ship sailed along the coast) or nonpredictable (e.g., Mrs. Brown did not consider the coast).

When looking at group differences, the results replicated previous studies, showing that bilinguals recognized target words with lower accuracy relative to monolinguals. In addition, monolinguals benefitted more from a predictive context than bilinguals. The results from the WMLS showed that bilinguals scored significantly lower than monolinguals. When English proficiency was used as a covariate, higher proficiency was associated with higher SUN accuracy in both groups. In addition, an analysis of word frequency showed that group differences were largest for low frequency words. However, the frequency effect was modulated by English proficiency in the bilingual group. Assuming that both the frequency effect and language proficiency are closely related to exposure to English, the present results suggest that the bilingual disadvantage in SUN results from reduced exposure to English, which is a consequence of being exposed to two languages. In conclusion, the results confirmed the predictions of the ELU model, showing that both signal degradations and receiver limitations (less precise phonological representations of words in LTM) resulted in less accurate SUN.
ACKNOWLEDGMENTS

These past five years at Michigan State University have been an enormous learning experience for me, both inside and outside of the university, and I would like to thank those who have invested their time and efforts in me and the friends I have made on the way. First of all, I would like to thank my advisor, Aline Godfroid, for always encouraging me to pursue my ideas while giving me direction on the way. I am also grateful for her excitement about my research and her faith in me throughout these years, which often exceeded my own. My special thanks also go to the professors on my dissertation committee, Debra Hardison, Laura Dilley, and Paula Winke, who contributed their expertise to this dissertation and from whom I also learned a lot by taking their classes.

My gratitude also goes to Susan Gass for her leadership in the Second Language Studies program, which always created a pleasant and productive work environment, and I am thankful for the different teaching and research assistantships I received during my tenure in the program. Lastly, I would like to thank those with whom I have taken classes in these past five years, Diogo Almeida, Dr. Altman, Dr. Enbody, Debra Friedman, Shawn Loewen, Mr. McCullen, Charlene Polio, and Patti Spinner. This dissertation would not have been possible without the knowledge that they passed on to me. My thanks also go to professors Elizabeth Howard, Letti Naigles, and Manuela Wagner from the University of Connecticut, without whom I would not have pursued this Ph.D.

Besides these amazing professors, I also need to thank my colleagues in the Ph.D. program, without whose encouragement I would not have endured, especially those in my cohort, Roman, Se Hoon, Ayman, Dominik, Le Anne, and Scott.

I am also especially grateful for my good friends Tim, Justin, Garrett, and Jessie. Although their areas of expertise are not in linguistics, their support and prayers were indispensable over the years and I don't know if I would have been able to keep my sanity without them. The same is true for my friends from graduate intervarsity, especially Dan and Danielle, Camille, Laura and Chris and Priscilla.

Finally, I would like to thank my family for all their support; I love you all. Danke, Oma und Opa, für eure Gebete. And lastly I thank God, without whom everything would be in vain.

This dissertation was financially supported by the College of Arts and Letters with a Dissertation Completion Fellowship and by the NSF with a Doctoral Dissertation Improvement Grant (NSF-DDIG 1349125).

TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
KEY TO ABBREVIATIONS
INTRODUCTION
CHAPTER 1: REVIEW OF THE LITERATURE
1.1 Speech perception
1.2 Speech perception under adverse listening conditions
1.3 Factors affecting speech perception in noise
1.3.1 Language background
1.3.2 Language proficiency
1.4 How does language proficiency influence SUN?
1.4.1 Less precise phonological representations
1.4.2 The Lexical Quality Hypothesis
1.4.3 Frequency effects
1.4.4 Activation, inhibition, and lexical knowledge
1.4.5 Word predictability
1.5 Speech perception in noise and cognition
1.5.1 Working memory
1.5.2 Working memory and speech perception in noise
1.5.3 The Ease of Language Understanding model
1.6 Phonological quality hypothesis
CHAPTER 2: EXPERIMENT 1
2.1 Research questions and predictions
2.2 Methods
2.2.1 Participants
2.2.2 Materials
2.2.2.1 Background questionnaire
2.2.2.2 Speech perception in noise test
2.2.3 Procedure
2.3 Analysis
2.4 Results
2.4.1 The effects of noise and predictable context
2.4.2 The influence of lexical and sublexical variables on word recognition
2.5 Discussion
CHAPTER 3: EXPERIMENT 2
3.1 Methods
3.1.1 Participants
3.1.2 Materials
3.1.2.1 Woodcock-Muñoz Language Survey - Revised
3.1.2.2 Test of attention in listening
3.1.2.3 Working memory
3.1.2.4 Consonant perception in noise
3.1.3 Relationship between predictor variables
3.2 Results
3.3 Discussion
CHAPTER 4: GENERAL DISCUSSION
CHAPTER 5: ANALYSIS OF INDIVIDUAL TESTS
5.1 Words in Noise
5.1.1 Methods
5.1.1.1 Participants
5.1.1.2 Materials
5.1.2 Results
5.1.2.1 English Words in Noise Test
5.1.2.2 Spanish Words in Noise Test
5.1.2.3 Individual differences analysis
5.1.3 Discussion
5.2 Verbal ability
5.2.1 Materials
5.2.2 Procedure
5.2.3 Results
5.2.4 Discussion
5.3 Working memory
5.3.1 Materials and procedure
5.3.2 Results
5.3.3 Discussion
5.4 Consonant perception in noise
5.4.1 Materials and procedure
5.4.2 Results
5.4.3 Discussion
5.5 Test of Attention in Listening
5.5.1 The bilingual advantage
5.5.2 Methods
5.5.2.1 Materials
5.5.2.2 Procedure
5.5.3 Analysis
5.5.4 Results
5.5.5 Discussion
CHAPTER 6: CONCLUSION
APPENDIX
REFERENCES

LIST OF TABLES
Table 1. Participant characteristics divided by language status.
Table 2. Summary of mixed-effects regression results for variables predicting accuracy on the Speech Perception in Noise test.
Table 3. Summary of the mixed-effects regression results of SPIN accuracy for monolinguals and bilinguals.
Table 4. Means and standard deviations of the predictor variables used in Experiment 2.
Table 5. Correlations and bivariate correlations between predictor variables and the four conditions of the Speech Perception in Noise test.
Table 6. Results from the mixed-effects regression analysis of SPIN accuracy.
Table 7. Results of the mixed-effects regression analysis of the SPIN test for the monolingual and bilingual group.
Table 8. Word frequency of high, mid, and low frequency words on the SPIN test.
Table 9. Mean language proficiency of the upper and lower half of the monolingual and bilingual group.
Table 10. Mean accuracy on the Words in Noise test.
Table 11. Mean number of Spanish speakers and percent exposure to Spanish.
Table 12. Results of the regression analysis predicting picture vocabulary scores.
Table 13. Differences in background variables between balanced and unbalanced bilingual participants.
Table 14. Confusion matrix - bilingual participants.
Table 15. Confusion matrix - monolingual participants.
Table 16. Typical consonant confusions by monolingual and bilingual participants.
Table 17. Results of the regression analysis of TAIL response times.
Table 18. Results of the regression analysis of response times on the TAIL.
Table 19. Mean item accuracy on the WIN.
Table 20. Item discrimination index for E-WIN words.

LIST OF FIGURES
Figure 1. Three different sources of adverse listening conditions. Based on Mattys et al. (2012); also see Mattys, Brooks, and Cooke (2009).
Figure 2. Effect of word frequency on lexical decision times for Dutch (DLP), English (ELP), and French (FLP). From Keuleers, Diependaele, and Brysbaert (2010). Used with permission under the Creative Commons license.
Figure 3. Results of the Speech Perception in Noise test divided by noise level and group. Error bars show the 95% confidence interval.
Figure 4. Results of the Speech Perception in Noise test. Results are divided by condition and language group. Error bars show the 95% confidence interval.
Figure 5. Effect of biphone probability on Speech Perception in Noise accuracy divided by group. Grey-shaded area shows the 95% confidence interval of the slope of the regression line. Each point represents the mean accuracy of a certain word.
Figure 6. Effect of log10 word frequency on Speech Perception in Noise accuracy divided by group. Grey-shaded area shows the 95% confidence interval of the slope of the regression line. Each point represents the mean accuracy of a certain word.
Figure 7. Relationship between oral language ability and accuracy on the SPIN test, depending on condition. HNHP = high noise-high predictability. HNLP = high noise-low predictability. LNHP = low noise-high predictability. LNLP = low noise-low predictability.
Figure 8. The effect of frequency shown for each of four groups. The monolingual and bilingual groups were each divided into a high and low group based on a median split of their proficiency score. Whiskers show the 95% confidence interval.
Figure 9. Mean accuracy on the SPIN test for each group (bilingual/monolingual) separated by noise level (high/low) and target word frequency (low/mid/high). The figure shows that in the bilingual group, the effect of noise was largest when frequency was low.
Figure 10. Results of the English WIN test. Solid lines show the predicted values based on coefficients of the regression model described in the text. Dashed lines show the fitted values of this model. Whiskers show the 95% confidence interval.
Figure 11. Results of the English and Spanish versions of the WIN test (bilingual participants only). Solid lines show the predicted values based on coefficients of the regression model described in the text. Dashed lines show the fitted values of this model. Whiskers show the 95% confidence interval.
Figure 12. Effect of Baseline RT on WIN accuracy at each SNR. SNR = signal-to-noise ratio. Baseline RT is the mean response time on the Test of Attention in Listening (see text for further explanation).
Figure 13. Effect of oral language ability on WIN accuracy at each SNR. SNR = signal-to-noise ratio. W-scores are arbitrary units with equal interval spacing.
Figure 14. Relationship between language dominance and proficiency in English and Spanish. Language dominance was calculated by subtracting Spanish scores from English scores. Thus a positive score means English dominance and a negative score means Spanish dominance.
Figure 15. Relationship between percent of exposure to Spanish and age in the bilingual sample. Participants were divided into a balanced and an unbalanced group based on the difference between their Spanish and English score on the WMLS (see text).
Figure 16. Relationship between the picture vocabulary and the verbal analogies subtests of the WMLS. Compared to the monolingual participants, bilinguals performed lower on the picture vocabulary test than would be expected from the verbal analogies score.
Figure 17. Relationship between working memory capacity and picture vocabulary scores. Grey-shaded area shows the 95% confidence interval of the regression line.
Figure 18. Distribution of working memory scores when the effect of picture vocabulary was partialled out (residual variance).
Figure 19. Mean accuracy on the consonant perception test divided by babble segment and speaker. Whiskers show the 95% confidence interval. Note the limited range of the y-axis to highlight the effects.
Figure 20. Mean accuracy for each consonant on the consonant perception test. Whiskers show the 95% confidence interval.
Figure 21. Relationship between accuracy on the consonant perception test and oral language ability. The regression line included one knot at 99.5.
Figure 22. Accuracy on the consonant perception test as a function of group. The monolingual and bilingual groups were each divided into a high and low proficiency group based on a median split of their verbal ability score. Whiskers show the 95% confidence interval.
Figure 23. Mean accuracy for each consonant on the consonant perception test. The monolingual and bilingual groups were each divided into a high and low proficiency group based on a median split of their verbal ability score. Whiskers show the 95% confidence interval.
Figure 24. Relationship between mean accuracy on the consonant perception test and oral language ability. Consonants were divided into high and low phonotactic probability based on a median split. The interaction between phonotactic probability and language ability was significant.
Figure 25. Mean accuracy on the TAIL in each of four conditions. Whiskers show the 95% confidence interval. Note the limited range of the y-axis to highlight the effect.
Figure 26. Mean accuracy on the TAIL for monolinguals and bilinguals. The difference between same frequency and different frequency trials was larger for monolinguals than for bilinguals. Whiskers show the 95% confidence interval. Note the limited range of the y-axis to highlight the effect.
Figure 27. Mean response time (RT) on the TAIL in each of four conditions. Whiskers show the 95% confidence interval.
Figure 28. Mean response time (RT) in msec. on same and different frequency trials. Whiskers show the 95% confidence interval. Note the limited range of the y-axis.
Figure 29. Mean response times (RT) in msec. in each of the four conditions of the TAIL. DF/SF = different/same frequency, DL/SL = different/same location. The difference between Version 1 and 2 was the location of the response keys (see Methods section in text).
Figure 30. Effect of frequency difference between the first and second tone on response times (RT) in msec. The regression line shows the best fit with a polynomial function with three terms.
Figure 31. Effect of language dominance on response times (RT, in msec.) and the location effect. Language dominance was calculated by subtracting Spanish proficiency scores from English proficiency scores so that scores above 0 indicate English dominance.
Figure 32. Schematic representation of the results in this study. Arrows indicate significant relationships between variables. The two-way arrow indicates that more exposure to one language is associated with less exposure to the other language. SUN = speech understanding in noise. WM = working memory. CP = consonant perception.
Figure 33. Mean accuracy on the English Words in Noise test for List 1 and 2. Whiskers show the 95% confidence interval.
KEY TO ABBREVIATIONS

AoA age of acquisition
CP consonant perception
dB decibel
DFDL different frequency-different location
DFSL different frequency-same location
ELU Ease of Language Understanding
HNHP high noise-high predictability
HNLP high noise-low predictability
LNHP low noise-high predictability
LNLP low noise-low predictability
LQH lexical quality hypothesis
LRM lexical restructuring hypothesis
LTM long-term memory
M mean
ms milliseconds
OL-SS/OL-W oral language (standard/W-score)
PQH phonological quality hypothesis
PV picture vocabulary
RT response time
SD standard deviation
SFDL same frequency-different location
SFSL same frequency-same location
SNR signal-to-noise ratio
SPIN speech perception in noise
SRT speech reception threshold
STM short-term memory
SUN speech understanding in noise
TAIL test of attention in listening
VA verbal ability
VC vowel consonant
WIN words in noise
WM working memory
WMC working memory capacity
WMLS-R Woodcock-Muñoz Language Survey-Revised

INTRODUCTION
Many can attest to the difficulty of following a conversation in a noisy environment. Yet, while everyone is affected by noise, some people seem to be better able to cope with adverse listening situations than others. The aim of the research described in this dissertation is to find factors that can explain some of these individual differences. Of special interest is the variable language experience, given that many prior studies have found that listening in noise in a second language is more difficult than in one's first language (e.g., Mayo, Florentine, & Buus, 1997). Although this seems to be a robust finding, it is not yet clear what factors are responsible for these differences.

The foundations for the hypotheses generated and tested in the present investigations are the Ease of Language Understanding (ELU) model (Rönnberg et al., 2013) and the lexical quality hypothesis (LQH; Perfetti & Hart, 2002; Perfetti, 2007). The LQH was developed by Perfetti and colleagues to explain differences between skilled and less skilled readers but, according to Perfetti, it also applies to "spoken language with a focus on phonological representations and meaning" (2007, p. 361). The assumption of the hypothesis is that lexical representations will be more or less precise, whereby precision of phonological representations is defined as stronger connections between levels of representation (phonology, semantics, and orthography) and more distinguishing features of words that make similar sounding words less confusable. More experience with a word will strengthen its representations so that a high frequency word has more robust representations than a low frequency word. The LQH has direct consequences for bilingual speakers because they often have less experience with words in either of their languages compared to a speaker of only one language (cf. Gollan, Montoya, Cera, & Sandoval, 2008).

Whereas the LQH predicts differences in phonological processing because of differences in lexical representations, the ELU model provides a framework for investigating the influence of individual differences in executive functions on word recognition in noise. The model assumes that lexical access is effortless when there is a match between the speech signal and phonological representations in long-term memory. However, under sub-optimal listening conditions, when the speech signal is distorted, the resulting mismatch has to be resolved through explicit processing, which depends on working memory resources, to fill in information missing from the input. Thus the prediction is that individual differences in working memory are correlated with scores of word recognition in noise. At the same time, the model also predicts a greater mismatch between the signal and long-term memory representations when these representations are less precise, that is, when fewer phonological attributes match the speech signal. Thus the ELU model complements the LQH and both make similar predictions regarding bilingual speakers.

The aim of this dissertation is to test the predictions generated by the two models to better understand how individual differences in language experience and executive functions affect language processing in noise. Results will help refine models and hopefully also inform interventions that aim at improving listening in noise ability in mono- and bilingual speakers.

CHAPTER 1: REVIEW OF THE LITERATURE
1.1 Speech perception
Speech perception is a complex process that, simply speaking, comprises the mapping of an acoustic signal (mechanical vibrations at different frequencies) to internal abstract representations in the brain (Giraud & Poeppel, 2012b, p. 225). What is remarkable about this process is that the recognition of words in the signal seems to be effortless despite the fact that, unlike in written language, no clear markers of word boundaries are present in the acoustic signal. What is more, the speech signal is surprisingly variable, that is, there is considerable variance in the production of single phonemes and words between and even within speakers (e.g., Ernestus & Warner, 2011; Pitt, Dilley, & Tat, 2011). Thus one important goal of research on speech perception, and word recognition, is to find the mechanism by which the brain decodes and recomposes the signal into the message that was intended by a speaker. The field of speech perception has thus been concerned with these two problems: How are words recognized (e.g., Dahan & Magnuson, 2006; McQueen, 2007) and how is invariant perception achieved despite a variable speech signal (e.g., Diehl, Lotto, & Holt, 2004; Liberman & Mattingly, 1985). In this review I will mostly focus on the word level.

Most models of spoken word recognition assume a process by which the acoustic-phonetic signal is mapped to phonological representations, or phonemes, that in turn activate matching words. Furthermore, because possible words are often embedded within words and can also cross word boundaries, most models agree that word recognition is a competitive process (Luce & Pisoni, 1998; Marslen-Wilson, 1987; McClelland & Elman, 1986; Norris, 1994). For example, the sentence The catalogue in a library contains the embedded words cat, cattle, login, lie, and eye (Norris & McQueen, 2008, p. 361). These words are assumed to receive activation, a metaphor often used in psycholinguistics. Evidence for activation of multiple words comes from cross-modal priming studies, among others. For example, hearing /lai/ extracted from library facilitates recognition of both lie and library in a lexical decision task (i.e., deciding whether an orthographically presented stimulus is a word or nonword). Hearing /laib/, on the other hand, impedes recognition of lie (see Cutler, 2012). Thus it is assumed that words are only considered as possible candidates for word recognition as long as they match the speech signal. In the sentence above, catalogue receives more activation as the signal unfolds and in turn inhibits cattle. Through this process of activation and inhibition of possible words, competition between lexical candidates is resolved and those lexical candidates that exceed a certain activation threshold are selected.

The examples above show the interactive nature of word recognition. Words receive activation as the speech signal unfolds, that is, as soon as the signal partially matches a phonological representation, and activation also cascades down to the semantic level. This is well documented by studies using the visual-world paradigm (Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). In this paradigm, participants are typically presented with four pictures on a computer screen and hear instructions to manipulate one of the pictures (e.g., by clicking on it or moving it on the screen), with the assumption being that fixation probabilities on pictures reflect lexical activation. A seminal study by Allopenna, Magnuson, and Tanenhaus (1998) showed that eye movements are closely linked to the unfolding speech signal. In this study, participants saw, for example, a display with a beaker, a beetle, a speaker, and a baby carriage. When participants heard "Pick up the beaker" they were initially equally likely to look at any of the pictures. However, after the onset of the target word, in this case beaker, they were more likely to look at the beaker and the beetle until the two words were disambiguated. A few hundred milliseconds into the target word participants also looked more at the speaker than the unrelated object, suggesting that speaker had received activation despite the initial mismatch.
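
The idea that words remain candidates only as long as they match the unfolding signal can be illustrated with a minimal sketch. This is a toy illustration, not one of the models cited above: the lexicon, the rough phoneme codes, and the function names are all assumptions made for the example. It implements strict onset matching only, so it does not capture the later activation of the rhyme competitor speaker that Allopenna et al. (1998) observed, nor graded activation or inhibition.

```python
# Toy lexicon: words mapped to rough, made-up phoneme codes (illustrative only).
LEXICON = {
    "beaker": ["b", "i", "k", "er"],
    "beetle": ["b", "i", "t", "el"],
    "speaker": ["s", "p", "i", "k", "er"],
    "carriage": ["k", "a", "r", "i", "j"],
}


def onset_candidates(heard, lexicon=LEXICON):
    """Return (sorted) the words whose initial phonemes match everything heard so far."""
    n = len(heard)
    return sorted(w for w, ph in lexicon.items() if ph[:n] == list(heard))


def unfold(target, lexicon=LEXICON):
    """Yield (phonemes heard so far, remaining candidates) as the target word unfolds."""
    phonemes = lexicon[target]
    for i in range(1, len(phonemes) + 1):
        yield phonemes[:i], onset_candidates(phonemes[:i], lexicon)


if __name__ == "__main__":
    # Print how the candidate set shrinks phoneme by phoneme.
    for heard, candidates in unfold("beaker"):
        print("-".join(heard), candidates)
```

In this sketch, beaker and beetle both remain candidates until the third phoneme arrives, which mirrors the point in the visual-world data at which looks to the two pictures diverge.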

Of course, speech perception is a much more complex process than outlined here and models of spoken word recognition have to make many simplifying assumptions about the input. For example, most models take a pre-processed signal as input and omit the stage during which the signal is presumably decoded into phonemes. As a consequence, one can easily forget that speech outside the laboratory hardly ever consists of a stream of discrete phonemes in citation form. For example, competing noise, sloppy pronunciation, coarticulation, and an unfamiliar accent all make the signal that arrives at the ear less than optimal. Yet people usually succeed in decoding and understanding the message. In fact, research has shown that speech perception, and subsequently language comprehension, is still possible when the signal is deeply impoverished. For example, speech with reduced spectral information (e.g., noise-vocoded speech) that preserves the temporal structure of the speech signal can still be understood (Shannon, Zeng, Kamath, Wygonski, & Ekelid, 1995). At the same time, when temporal detail is removed from the signal through low-pass filtering, comprehension is still possible with detailed spectral information (Obleser, Eisner, & Kotz, 2008). This shows that the speech signal carries information that is seemingly redundant under optimal listening conditions.

What is evident from studies using acoustically degraded stimuli is that word recognition cannot simply be a process by which phonemes are mapped to lexical entries stored in long-term memory. Rather, it must be a probabilistic process with the parser settling on the most likely intended message given the evidence from the bottom-up signal but also top-down information such as the topic (Norris & McQueen, 2008; also see Obleser & Eisner, 2009).

1.2 Speech perception under adverse listening conditions
Mattys, Davis, Bradlow, and Scott (2012) identify three sources of adverse listening conditions. Source degradation refers to situations where the speech signal diverges from speech carefully produced by a member of the same speech community as the listener. Reasons for source degradation can be casual speech, a speech disorder, or an unfamiliar accent. Transmission degradation, on the other hand, occurs during the transmission of the signal from the sender to the receiver. It can result from energetic or non-energetic masking. Energetic masking refers to the masking of the speech signal by a competing signal. When the competing signal is another talker, the listener also has to selectively attend to one speaker and ignore the other, which results in additional cognitive load. Non-energetic masking occurs through signal distortions such as reverberation but also in telephone conversations. In the latter, frequencies below 400 Hz and above 3400 Hz are cut out, which results in a smaller range than the one covered by typical speech (100–5000 Hz).
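
The strength of energetic masking is commonly expressed as the signal-to-noise ratio (SNR) in decibels, the measure used for the two noise conditions in this study (3 dB and -2 dB). As a hedged illustration, the sketch below computes the SNR of two waveforms from their root-mean-square amplitudes, SNR = 20 log10(rms_signal / rms_noise), and rescales a masker to reach a target SNR. The function names and the synthetic waveforms are assumptions made for this example; this is not the procedure used to prepare the SPIN stimuli.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, computed from RMS amplitudes."""
    return 20.0 * math.log10(rms(signal) / rms(noise))


def scale_noise_to_snr(signal, noise, target_db):
    """Return a rescaled copy of `noise` so that snr_db(signal, result) equals target_db."""
    gain = rms(signal) / (rms(noise) * 10 ** (target_db / 20.0))
    return [gain * v for v in noise]


def mix_at_snr(signal, noise, target_db):
    """Add the rescaled masker to the signal, sample by sample."""
    scaled = scale_noise_to_snr(signal, noise, target_db)
    return [s + n for s, n in zip(signal, scaled)]


if __name__ == "__main__":
    # Synthetic stand-ins for a speech signal and a babble masker (hypothetical data).
    speech = [math.sin(0.10 * i) for i in range(1000)]
    babble = [math.cos(0.37 * i) for i in range(1000)]
    for target in (3.0, -2.0):
        scaled = scale_noise_to_snr(speech, babble, target)
        print(target, round(snr_db(speech, scaled), 6))
```

A negative SNR, as in the -2 dB condition, means that the masker has more energy than the target speech.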

[Figure 1: diagram with three branches. Source degradation: conversational speech, accented speech, speech disorders. Transmission degradation: energetic masking (competing signal), non-energetic masking (reverberation, telephone). Receiver limitations: hearing impairment, language proficiency, language impairment, cognitive load.]
Figure 1. Three different sources of adverse listening conditions. Based on Mattys et al. (2012); also see Mattys, Brooks, and Cooke (2009).

Lastly, receiver limitations can also result in suboptimal listening situations. The cause can be a hearing impairment, insufficient proficiency in a language, a language impairment, for example as a result of brain injury, and cognitive resource limitations (e.g., Mattys & Wiget, 2011; Mayo et al., 1997; Wilson, McArdle, & Smith, 2007).

The research described in this dissertation investigates the effects of one type of transmission degradation, energetic masking, and two potential receiver limitations, namely language experience/proficiency and individual differences in cognitive resources (executive functions).

1.3 Factors affecting speech perception in noise
Two broad factors that influence speech understanding in noise (SUN) and that are the focus of this dissertation are reviewed here. The first factor is verbal ability in relation to the language status of the tested language (first language vs. second language) and language experience (growing up with one language vs. two languages). The second factor is cognitive ability or executive functions, which are, broadly defined, "a set of general-purpose control mechanisms […] that regulate the dynamics of human cognition and action" (Miyake & Friedman, 2012, p. 8). These two factors have been associated with SUN but have typically been studied in isolation. However, there may be interactions between verbal and cognitive abilities, which remain hidden if these factors are studied separately. Not included in this review are studies on SUN in clinical populations and the elderly. For example, deficits in SUN have been shown to be associated with dyslexia (e.g., Ziegler, Pech-Georgel, George, & Lorenzi, 2009) and language learning impairment (e.g., Ziegler, Pech-Georgel, George, Alario, & Lorenzi, 2005).

1.3.1 Language background
Many studies have investigated differences in speech perception in native and nonnative speakers. The usual finding is that speech perception in quiet is not different between first language (L1) and second language (L2) speakers, but in noise L2 speakers typically perform significantly worse (Bradlow & Alexander, 2007; Crandell & Smaldino, 1996; Mayo et al., 1997; Meador, Flege, & Mackay, 2000; Rogers, Lister, Febo, Besing, & Abrams, 2006; Schneider, Avivi-Reich, & Daneman, 2014; Shi & Sánchez, 2010, 2011; Shi, 2009, 2010; Van Engen, 2010). A few studies have also tested the same speakers in their L1 and L2 and found that L2 SUN is usually worse (Kilman, Zekveld, Hällgren, & Rönnberg, 2014; Rosenhouse, Haik, & Kishon-Rabin, 2006; Weiss & Dempsey, 2008). What is not always consistent across studies is whether noise has an additive or multiplicative effect on L2 listeners. Whereas Mayo et al. (1997) found an interaction between group and noise level (also see Shi, 2010; Tabri, Smith Abou Chacra, & Pring, 2011), other studies failed to find this interaction (Rogers et al., 2006). This may be due to differences in the tested participant population and noise conditions. For example, Rogers et al. (2006) used three fixed signal-to-noise ratios (SNRs) whereas Mayo et al. (1997) used an adaptive staircase procedure
¹.

¹ In this procedure, the SNR is adjusted up or down depending on whether a participant correctly repeated a target word. This is done until the SNR is found at which a participant is able to repeat the target word 50% of the time (for a detailed explanation of this procedure see Mayo et al., 1997, p. 687).

In studies on bilingual SUN, samples are often divided into early and late bilinguals to test if age of acquisition has an effect on hearing in noise ability. An early onset of L2 acquisition is often defined as age 6 or younger whereas late commonly refers to 11 or older. These cutoff points are based on research on the critical period hypothesis that suggests a critical period for language acquisition roughly between age 6 and puberty (e.g., Flege, Yeni-Komshian, & Liu, 1999; Johnson & Newport, 1989). For example, Meador et al. (2000) tested a group of early bilinguals (L2 onset ~7 years), a “mid” group (age of arrival ~14 years) and a “late” group with an age of arrival of ~19 years. They found a linear negative relationship between age of arrival and SUN performance, with all groups being worse than monolingual native speakers. In a subsequent
regression analysis the authors found that age of arrival could explain 41.5% of the variance in SUN test scores. This same pattern was confirmed in other studies (e.g., Rogers et al., 2006; Shi & Sánchez, 2010), in which language background variables such as age of acquisition (AoA) and self-rated proficiency explained up to 80% of the variance in SUN (Shi, 2012). Thus the claim that AoA and other linguistic variables influence SUN is firmly established in the literature.
What is still an open question is whether bilinguals who learned both languages from infancy, often called simultaneous bilinguals, will perform like monolinguals. Shi (2009) tested 12 simultaneous bilinguals who learned English between the ages of 1 and 3 and found no difference in performance compared to a group of 24 monolingual English speakers at an SNR of 0 dB in four different noise conditions (speech-weighted noise, multi-talker babble, and instrumental music played forward and reversed). Calandruccio and Zhou (2013) tested bilinguals growing up in a Greek-English bilingual environment in New York and found no difference compared to a group of monolingual English speakers when tested with three-talker background babble at an SNR of -5 dB. Interestingly, the bilingual group was also tested in Greek and no significant difference between the English and Greek test was found. However, the Greek version was not tested against a monolingual sample of Greek speakers and may therefore not be comparable to the English version. Nonetheless, the results show that the bilingual participants were proficient in both languages.
Shi (2010) also tested a group of eight simultaneous bilinguals and found no difference from a monolingual control group at SNRs of +6 and 0 dB and reverberation times of 1.2 and 3.6 seconds. The test Shi used included sentences with high and low predictability taken from the Speech Perception in Noise (SPIN) test (Bilger et al., 1984). In a high predictability sentence, the final word, which participants have to recognize, can be inferred from the preceding context (e.g., “The ship sailed along the coast” vs. “Ms. Brown thought about the coast”). Shi (2010) found a significant difference between the monolinguals and the simultaneous bilinguals in the most unfavorable listening condition (high noise and high reverberation) in the predictable context condition with a large effect size (Cohen’s d = 2.58). This suggests that bilinguals did not benefit as much from predictive context (cf. Mayo et al., 1997), but such differences may only emerge in the most unfavorable listening conditions.
A similar conclusion can be drawn from a study by Crandell and Smaldino (1996), who tested 20 monolingual and 20 early bilingual children matched in age (age range = 8–10 years). The bilingual participants had started to learn English before the age of 2, as reported by their parents, and were exposed to each language roughly 50% of the time. In quiet and at an SNR of +6 dB the authors found no significant differences between the groups, but at more unfavorable SNRs (-6, -3, 0, and +3 dB) the bilingual group performed significantly worse than the monolingual group, with the slope of the decline under increasing noise levels appearing to be steeper for bilinguals (though the authors did not state whether the group-by-noise-level interaction was significant).
However, AoA may not be the only linguistic variable influencing SUN. Shi and Sánchez (2011) tested Spanish-English bilingual speakers using SUN tests in English and Spanish. All participants had learned Spanish from birth but one group learned English early (~4 years) and became dominant in English whereas the other group learned English later in life (~13 years) and was dominant in Spanish. The authors found that both groups performed better on the test in their dominant language. This suggests that more exposure to a language has a positive effect on SUN, as already mentioned above, but that reduced language exposure over a lifetime may also have a negative effect on word recognition in noise.
An improvement of more recent studies over earlier studies including bilingual populations is that more background variables are usually reported, following a realization that bilinguals differ in many respects (Grosjean, 2001, 2008; von Hapsburg & Peña, 2002) as well as the publication of more standardized assessment instruments (Marian, Blumenfeld, & Kaushanskaya, 2007). This makes comparisons across studies easier and may help explain why studies sometimes seem to find conflicting results. However, even more detailed information about the participants’ background may be necessary. For example, even simultaneous bilinguals who were exposed to both languages from birth may differ in the relative exposure to each language. Parents may be monolingual or bilingual speakers and participants may spend more or less time in monolingual environments; for example, they may go to an English-only daycare from an early age on. These variables may determine whether simultaneous bilinguals differ from monolinguals on tests of SUN or not, as it has been shown that amount of early language exposure influences processing efficiency in monolingual and bilingual children (e.g., Gollan, Starr, & Ferreira, 2014; Hurtado, Grüter, Marchman, & Fernald, 2013; Weisleder & Fernald, 2013).

1.3.2 Language proficiency
Language proficiency is often included as a variable in research on SUN in bilingual speakers. It is often measured through self-assessment on a Likert scale (e.g., Shi, 2012) or a proficiency test (e.g., Kilman et al., 2014). Language proficiency is often correlated with other language background variables such as AoA and length of residence in the country where the target language is spoken. Nevertheless, proficiency can sometimes explain additional variance above and beyond AoA (Shi, 2012). This may be because AoA and length of residence do not take into account how much a participant was exposed to each language. Two participants may have come to the US at the same age but one may have been completely immersed in English whereas the other may have had more contact with other speakers of their native language (see Meador et al., 2000). Bilinguals with such different profiles will likely also differ in their proficiency in their two languages and so language proficiency may be a proxy variable for language exposure over a lifetime.
While self-rated proficiency can often explain substantial variance in a diverse sample of second language speakers (e.g., Shi & Farooq, 2012), it may be less sensitive to more nuanced differences in a sample of highly proficient or native speakers. For example, Shi and Sánchez (2011) tested English-Spanish bilingual speakers who were either dominant in English or Spanish. Participants were tested in both languages and the authors found that self-rated proficiency was only correlated with SUN performance in the non-dominant language. This may have been because of greater variance between subjects in the non-dominant language. In comparison, participants may tend to overestimate their proficiency in the dominant language, resulting in ceiling effects. It may also be that participants rate their ability to successfully communicate in everyday situations, which would be a more holistic measure and may be different from more fine-grained measures of verbal ability. In a more homogeneous sample, self-rated proficiency may therefore not be as good a predictor as in very diverse samples.
A few studies have used standardized tests to measure proficiency. Rimikis, Smiljanic, and Calandruccio (2013) tested a diverse group of 102 nonnative speakers of English enrolled at two US universities. Participants took the Versant English test, which is a test designed for nonnative speakers. In addition, they took a test of SUN that was specifically created for nonnative English speakers with limited proficiency. As in the studies cited above, the authors found a correlation between SUN and both age of immigration and length of residence. In addition, they found a high correlation (r = .73) between SUN performance and the Versant. Combined, Versant and age of immigration could explain 63% of the variance in SUN performance.
Similar results were obtained by Kilman et al. (2014). The authors tested native speakers of Swedish who had learned English as a foreign language in school. All participants completed a standardized test of English and an adaptive SUN test in English and Swedish in four different noise conditions to determine the SNR at which they perceived sentences with 50% accuracy (the Speech Reception Threshold, SRT). The noise conditions were stationary and fluctuating speech-shaped noise, English babble, and Swedish babble. The correlations between the SRT and English proficiency were r = -.48, -.60, -.51, and -.65, respectively, in the four conditions when the target language was English.² When the target language was Swedish, no correlations were found.
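The logic of an adaptive SUN test of this kind can be illustrated with a minimal sketch of a one-up/one-down staircase, which homes in on the SNR at which a listener responds correctly about 50% of the time. The simulated listener, its logistic psychometric function, the step size, the number of trials, and all function names are assumptions made purely for illustration; they are not taken from Kilman et al. (2014) or Mayo et al. (1997).

```python
import math
import random


def simulated_listener(snr_db, true_srt_db, slope=1.0, rng=random):
    """Hypothetical listener: the probability of a correct response follows
    a logistic function of SNR that equals 0.5 at true_srt_db."""
    p_correct = 1.0 / (1.0 + math.exp(-slope * (snr_db - true_srt_db)))
    return rng.random() < p_correct


def run_staircase(true_srt_db, start_snr_db=10.0, step_db=2.0,
                  n_trials=400, seed=1):
    """One-up/one-down staircase: lower the SNR (harder) after a correct
    response and raise it (easier) after an error. The track oscillates
    around the 50%-correct point; the SRT estimate is the mean SNR at the
    reversal points, skipping the first few."""
    rng = random.Random(seed)
    snr = start_snr_db
    last_direction = None
    reversals = []
    for _ in range(n_trials):
        correct = simulated_listener(snr, true_srt_db, rng=rng)
        direction = -1 if correct else +1
        if last_direction is not None and direction != last_direction:
            reversals.append(snr)  # the track changed direction here
        last_direction = direction
        snr += direction * step_db
    usable = reversals[4:]  # discard early reversals from the approach phase
    return sum(usable) / len(usable)


estimated_srt = run_staircase(true_srt_db=-3.0)
print(f"Estimated SRT: {estimated_srt:.1f} dB SNR")
```

A lower estimated SRT means that the listener tolerates more noise, which is why better proficiency goes together with negative correlations in the results reported above.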
An interesting question is whether vocabulary knowledge or overall verbal ability is also predictive of word recognition in noise performance in monolingual native speakers of a language. The answer to this question would show whether differences between first and second language listening are of a qualitative or quantitative nature. Monolingual speakers growing up in typical circumstances do not differ in age of first exposure to the language but there are differences in the amount and quality of input infants receive from their caregivers, which lead to great variability in vocabulary knowledge even at a very young age (Hart & Risley, 1995). This variability is likely to influence SUN given a diverse sample. Some evidence that vocabulary knowledge may be associated with SUN comes from a study by Tamati, Gilbert, and Pisoni (2013). The authors first tested a large sample of 121 healthy young-adult listeners on a SUN test and then asked those performing in the lower and upper quartile to come back for additional tests. One of those tests was a word familiarity test on which participants rated how familiar they were with 150 words that were categorized as high, mid, and low familiarity based on a previous norming study. The good SUN listeners were significantly more familiar with low and mid familiarity words and marginally more familiar with high familiarity words. One limitation is that these results were based on self-ratings instead of a standardized test but they suggest nonetheless that the better SUN listeners had a larger vocabulary.

² Note that these correlations are negative because a lower SRT means better SUN performance.
1.4 How does language proficiency influence SUN?

In the previous section it was shown that language proficiency may be an important variable that predicts SUN ability. In the following sections, I will review studies that may explain why language proficiency is correlated with SUN.

1.4.1 Less precise phonological representations
Language proficiency as measured by vocabulary size is correlated with many variables related to language processing. One model of word recognition and lexical development in children that has been very influential is the Lexical Restructuring Model (LRM) by Metsala, Walley and colleagues (Metsala & Walley, 1998; Walley, Metsala, & Garlock, 2003; Walley, 2008). This model proposes that infants’ phonological representations of words in memory start out as crude, whole-word representations that lack phonemic detail. In contrast to theories that assume that infants have the same phonemic representations as adults (e.g., Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992), the LRM assumes that phoneme categories only develop, or emerge, over time as a result of vocabulary growth. As children add more words to their mental lexicons, there is a need for those words to be represented with finer, segmental detail to ensure efficient processing (also see Charles-Luce & Luce, 1990). The model proposes that lexical restructuring, from crude representations to fine-grained segmental representations, occurs on an item-by-item basis and is determined by lexical frequency (how often a word is encountered) and phonological neighborhood size. Thus lexical representations of high frequency words will be more precise or detailed than those of low frequency words. In addition, a word with many neighbors³ will be represented with more detail because there are more words that it sounds similar to. For example, cap is a neighbor of cat, as are bat, cut, and mat. If cat were only crudely represented in memory, it would be easily confusable with its neighbors. A word with no neighbors such as idol, on the other hand, would not have to be represented with as much detail because there are no words competing with it during recognition. Thus, according to the model, high frequency, high density words have the most precise phonological representations whereas low frequency, low density words have the least precise representations.
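The notion of a phonological neighbor (a word that can be formed from another word by adding, deleting, or substituting a single phoneme) can be made concrete with a short sketch. The toy lexicon and its simplified phoneme symbols are hypothetical and serve only to illustrate how neighborhood density might be counted; they are not drawn from any of the cited studies.

```python
def is_neighbor(a, b):
    """True if phoneme sequences a and b differ by exactly one
    substitution, addition, or deletion."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one phoneme must differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: removing one phoneme from the longer
    # sequence must yield the shorter one (addition or deletion).
    shorter, longer = sorted((a, b), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


# Hypothetical mini-lexicon; each word is a tuple of rough phoneme symbols.
LEXICON = {
    "cat": ("k", "ae", "t"),
    "cap": ("k", "ae", "p"),
    "bat": ("b", "ae", "t"),
    "mat": ("m", "ae", "t"),
    "cut": ("k", "ah", "t"),
    "at": ("ae", "t"),
    "idol": ("ay", "d", "ah", "l"),
}


def neighborhood(word):
    """All words in the toy lexicon that are neighbors of the given word."""
    target = LEXICON[word]
    return sorted(w for w, phones in LEXICON.items()
                  if is_neighbor(target, phones))


print(neighborhood("cat"))   # dense neighborhood in this toy lexicon
print(neighborhood("idol"))  # no neighbors in this toy lexicon
```

In a real analysis the density count depends entirely on the lexicon and the transcription system used, so the numbers from a sketch like this are only meaningful relative to one another.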
³ A phonological neighbor is typically defined as a word that can be formed from another word by adding, deleting, or substituting a single phoneme. A word with many neighbors is said to come from a dense neighborhood whereas a word with no or few neighbors is said to come from a sparse neighborhood.

While there is evidence for the LRM (Metsala & Walley, 1998), subsequent studies have shown that infants may be more sensitive to phonetic detail than previously thought. Using eye-tracking, Swingley and colleagues have shown that mispronunciations affect word recognition in infants as young as 18 months (Swingley & Aslin, 2000, 2002, 2007; Swingley, 2003). The idea here is that a mispronunciation would be harder to detect if a heard word is matched to a stored representation in memory only on overall similarity than if it is matched to a word that is stored with segmental information. In Swingley and Aslin (2000) infants and toddlers saw two pictures and heard a sentence like “Where is the baby” (correct pronunciation condition) or “Where is the vaby” (mispronunciation condition). In both conditions children looked more to the target picture than the distractor but they also looked more to the target when hearing the correct pronunciation compared to the mispronunciation. These results show that children were thrown off by the mispronunciation and therefore must have been sensitive to the b/v distinction.
The ability to distinguish between /b/ and /v/ was unrelated to vocabulary knowledge or age, contrary to what would have been predicted by the LRM, which assumes that phonemic representations emerge as a result of vocabulary growth. However, Swingley and Aslin note (2000, p. 161) that the results do not necessarily show whether infants have segmental or more holistic representations of words. It could be that non-phonemic representations are still quite detailed phonetically. Furthermore, the words used in studies like this one are usually words with which children are familiar. Even though children can detect mispronunciations in familiar words, the LRM predicts that less familiar words will be represented less precisely in the mental lexicon. Thus frequency of encounter with a word may be more important than neighborhood density for lexical restructuring, especially since studies have shown that while words in children’s lexicons have fewer neighbors than words in adult lexicons, there are still many words in children’s vocabularies that have neighbors (Coady & Aslin, 2003). To recap, the LRM posits that vocabulary acquisition drives a restructuring of phonological representations. As children add more words to their lexicons there is a need for more precise representations to be able to distinguish similar sounding words. To come back to the question of how vocabulary size is related to spoken word recognition, one could hypothesize that speakers with larger vocabularies have more precise representations of these words, which, in turn, results in more efficient retrieval from amidst a more densely populated neighborhood.
Several objections have been raised against the LRM. Instead of positing that representations of words are qualitatively different in young children and adults, observed differences in experiments could result from the fact that children are just less familiar with words because they have not heard them as many times as an older person. The more experience someone has with a word, the more phonological detail may be stored for this word. Frequency effects are well documented in the literature (e.g., Grosjean, 1980; Monsell, 1991; Murray & Forster, 2004; Oldfield & Wingfield, 1965; Rubenstein & Pollack, 1963), and word frequency is a powerful predictor of the speed and accuracy of word recognition. In addition, frequency effects appear early, before the offset of a word, suggesting that less phonetic information is needed for successful recognition of high frequency words compared to low frequency words (Dahan, Magnuson, & Tanenhaus, 2001; Grosjean, 1980).

A recent
study tested the hypothesis that frequency of encounter with a word determines the precision of a phonological representation of that word. White, Yee, Blumstein, and Morgan (2013) used an artificial lexicon paradigm (see Magnuson, Tanenhaus, Aslin, & Dahan, 2003) in which participants learned mappings between artificial words and geometric figures. The authors manipulated frequency by presenting word-object pairings one, five, or eight times during the learning phase. In the testing phase, participants saw a familiar and a novel shape and heard a familiar word, a mispronounced familiar word (e.g., gav instead of bav), or a novel word while their eye movements were tracked. The eye-movement results showed that participants were less sensitive to mispronunciations after one exposure than after five or eight exposures, as evidenced by looks to the familiar object. The authors assumed that the strength of a lexical representation could explain these results. Because low frequency words require more acoustic input to be recognized, competitor words will receive more activation and may be less efficiently inhibited. Whatever the underlying mechanism may be, the main point of the White et al. (2013) study is that the results from adults are very similar to those obtained from children, that is, both look more at the familiar than the unfamiliar object when presented with a mispronunciation of the label for the familiar object. In other words, they do not take the mispronounced word as a label for the unfamiliar object, which suggests that they did not notice the mispronunciation. Therefore we may assume that child and adult word recognition is not qualitatively different. The fact that young children behave differently when tested with familiar words (see studies by Swingley and colleagues cited above) may just reflect the fact that they have less experience with these words compared to adults or older children and thus weaker phonological representations.
To come back to the relationship between vocabulary size and word recognition, if the quality of lexical representations is dependent on frequency of encounter, then we may assume that people with a larger vocabulary also have more language experience in general. In this case, the relationship between vocabulary size and word recognition would not be causal but mediated by language experience. For example, we may assume that people with a larger vocabulary hear and read words in a greater variety of contexts. This may be better illustrated for reading than for listening but is certainly true for both modalities. Someone who regularly reads newspapers, novels, and scientific journals will learn many words by reading but they will also encounter all words, and especially low frequency words, much more often than someone who seldom reads (Kuperman & Van Dyke, 2013). This view is expressed in the Lexical Quality Hypothesis (LQH; Perfetti & Hart, 2002), which I discuss in more detail in the next section.

1.4.2 The Lexical Quality Hypothesis
Perfetti and colleagues (Perfetti & Hart, 2002; Perfetti, 2007) developed the LQH to explain individual differences between low-skill and high-skill readers. The assumption is that entries in the mental lexicon of a given reader differ in the quality of their representations, from words that are well-known to the individual to others that are only rarely encountered and of which the individual only has rudimentary knowledge. Quality then refers to the precision in the representation of a word’s form and meaning. Perfetti (2007) identifies five features that may distinguish high from low quality representations: orthography, phonology, grammar, meaning, and constituent binding. For example, high-quality phonological representations differ from low-quality representations in the amount of phonological redundancy that is stored and the stability of the phonological representation; a less stable representation may not always be retrieved successfully. In the meaning dimension, high-quality lexical representations are less dependent on context and can be readily distinguished from related words. For example, one individual may know that barley, wheat, oat, and rye are grains but they may not know any attributes that distinguish among them. That person would have low-quality meaning representations of these words. Also important for the present study is the feature Perfetti calls constituent binding, which is “the degree to which the first four features [orthography, phonology, morpho-syntax, and meaning] are bound together” (2007, p. 360). High-quality constituent bindings are characterized by stronger connections between the different features, especially meaning and orthographic and phonological form. A stronger connection between phonology and meaning will make the meaning accessible faster upon hearing the phonological form of a word. Less tightly bound constituents, on the other hand, may lead to slow retrieval or retrieval failures. For example, someone might recognize the phonological form of a word but not remember its meaning.
While the LQH was developed to explain individual differences in reading in monolingual speakers, the model can easily be extended to second language speakers. According to the LQH, word knowledge is essential to reading skill. More skilled readers have better knowledge of all constituents of a word. Thus reading skill develops with experience. For example, the LQH states that high frequency words, words that are encountered often, have more precise representations compared to low frequency words. Someone who reads a lot will encounter all words more frequently, thus all words will be of higher absolute frequency for this individual compared to someone who reads seldom.

1.4.3 Frequency effects
Word frequency is usually determined by tallying up the number of occurrences of words in large corpora of language. For example, Brysbaert and New (2009) based their word frequency database on a corpus of subtitles from American movies; the British National Corpus is based on 100 million words extracted from different written and spoken (transcribed) texts. A word that occurs once in the corpus may be encountered more or less frequently by someone who reads a lot but never by someone who does not read. Thus the objective word frequency would not be accurate for these two individuals; the subjective frequency for them would be higher and lower, respectively.
To understand how subjective, or actual, word frequency influences word recognition, it is important to understand frequency effects in general. As stated earlier, word frequency is the most robust variable known to predict lexical access (Murray & Forster, 2004). High frequency words are named faster, read faster, and recognized faster in spoken word recognition. However, the relationship between frequency and lexical access is not linear. Differences in frequency in the low frequency range have a much bigger impact on response times (RTs), for example in lexical decision, than changes in the high frequency range. However, when the log10 frequency is used as a predictor instead of frequency per million, the relationship becomes linear up to the very high frequency range where RTs reach asymptote. This is shown in Figure 2, which is taken from Keuleers, Diependaele, and Brysbaert (2010) and shows RTs in lexical decision across a wide range of word frequencies in three languages, Dutch, English, and French. Because of the logarithmic relationship between frequency and RTs, a change of one order of magnitude at the low end of the scale, for example from 1 to 10 occurrences per million, will have the same effect as a change of one order of magnitude at the high end, say, from 100 to 1000 occurrences per million. This suggests that in terms of individual differences, differences in reading experience, or language experience in general, will only have small effects on words from the high frequency range. However, we can expect large differences at the low end, especially for the least frequent words that may almost never be encountered by some people.
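The consequence of a logarithmic frequency scale can be checked with a small worked sketch. It assumes a purely illustrative linear relationship between RT and log10 frequency per million; the intercept and slope are invented for the example and are not estimates from Keuleers et al. (2010).

```python
import math

# Hypothetical parameters, chosen only for illustration.
INTERCEPT_MS = 700.0  # assumed RT for a word occurring once per million
SLOPE_MS = -40.0      # assumed RT change per tenfold increase in frequency


def predicted_rt(freq_per_million):
    """RT predicted from log10 frequency under the assumed linear model."""
    return INTERCEPT_MS + SLOPE_MS * math.log10(freq_per_million)


# A tenfold increase at the low end adds only 9 occurrences per million,
# whereas a tenfold increase at the high end adds 900.
low_gain = predicted_rt(1) - predicted_rt(10)
high_gain = predicted_rt(100) - predicted_rt(1000)

print(f"1 -> 10 per million:     {low_gain:.1f} ms faster")
print(f"100 -> 1000 per million: {high_gain:.1f} ms faster")
```

In raw frequency terms the second step is 100 times larger than the first, yet under a logarithmic relationship both produce the same predicted speed-up. This is why a few additional encounters matter most for rare words.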
Figure 2. Effect of word frequency on lexical decision times for Dutch (DLP), English (ELP), and French (FLP). From Keuleers, Diependaele, and Brysbaert (2010). Used with permission under the Creative Commons license.
There is evidence for the hypothesis that individual differences in print exposure and vocabulary knowledge are associated with the size of the frequency effect. Chateau and Jared (2000) estimated reading exposure with a test called the famous author recognition test. In this test, participants are presented with a list of famous authors and foils and they check all the authors they recognize. Chateau and Jared found that on a lexical decision test, the frequency effect was larger for participants who recognized fewer authors compared to those who recognized more. This finding stands in contrast to a study by Lewellen, Goldinger, Pisoni, and Greene (1993). They divided participants into two groups (high verbal vs. low verbal) based on their familiarity ratings of words, a vocabulary test, and a language experience questionnaire. The authors found that high verbal participants were consistently faster on three different tests: visual naming, lexical decision, and semantic classification. However, they did not find the critical interaction between lexical variables (frequency and neighborhood density) and group (high verbal/low verbal) on any of the tests. To reconcile these conflicting findings, Sears, Siakaluk, Chow, and Buchanan (2008) replicated both studies with a sample of university students that they divided into two groups based on their performance on the author recognition test. In two experiments, they used the same target words for a lexical decision test but manipulated the types of nonwords: regular nonwords (Exp. 1) and pseudohomophones (Exp. 2). Pseudohomophones are nonwords that sound like real words, for example, brane, and are therefore harder to reject as nonwords. When pseudohomophones were used, the frequency by group interaction found by Chateau and Jared (2000) was replicated and when regular nonwords were used, no interaction was found, as in Lewellen et al. (1993). Sears et al. (2008) suggest that the low print-exposure group relied more on phonological processes to compensate for less efficient orthographic processing. And so when the nonwords used in Exp. 2 sounded like real words, the task became more difficult for them.
Yap, Balota, Sibley, and Ratcliff (2012) reanalyzed data from a large-scale project, the English Lexicon Project (Balota et al., 2007), in which the authors collected RTs for a wide range of words from a large sample of participants on two tasks, speeded naming (reading of single words) and lexical decision. Participants also completed a vocabulary test, and the authors analyzed correlations between an individual’s vocabulary score and the frequency effect in his or her RTs, as estimated by the regression coefficient. For speeded naming, they found a correlation such that higher vocabulary knowledge was associated with a smaller frequency effect, but this relationship was not found for lexical decision. The authors explained this discrepancy in findings in terms of task demands. They speculated that lexical decision involves two stages, a lexical access stage and a decision stage. Vocabulary knowledge more likely affects lexical access, but if frequency effects mostly occur at the decision stage, then individual differences in vocabulary knowledge would be unrelated to the frequency effect.
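The individual-differences measure described in this paragraph can be sketched in a few lines of code. The sketch is only an illustration under stated assumptions: the helper names (ols_slope, pearson_r), the toy RTs, and the vocabulary scores are hypothetical and are not taken from Yap et al. (2012), whose actual analysis may have differed in its details. Each participant's frequency effect is estimated as the slope of RT regressed on log frequency, and the size of that effect is then correlated with the vocabulary score.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical toy data: log frequency of four words and the naming RTs (ms)
# of three participants. None of these numbers come from the original study.
log_freq = [0.0, 1.0, 2.0, 3.0]
rts = [
    [700.0, 660.0, 620.0, 580.0],  # participant with the lowest vocabulary score
    [650.0, 620.0, 590.0, 560.0],
    [600.0, 580.0, 560.0, 540.0],  # participant with the highest vocabulary score
]
vocab = [20.0, 30.0, 40.0]

# A negative slope means that RTs get faster as frequency increases.
slopes = [ols_slope(log_freq, rt) for rt in rts]
# Size of the frequency effect: ms gained per log unit of frequency.
effect_sizes = [-s for s in slopes]
r = pearson_r(vocab, effect_sizes)
print(slopes, r)
```

With these made-up numbers, the slopes shrink in magnitude as vocabulary increases, so the correlation between vocabulary and the size of the frequency effect is negative, which is the direction of the pattern reported for speeded naming.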
Especially interesting with regard to individual differences in the frequency effect is a recent study by Diependaele, Lemhöfer, and Brysbaert (2013). These authors used a wide range of words of different frequencies and investigated the shape of the frequency curve. The task they used was a gated word identification test, in which participants saw words alternating with a visual mask on a computer screen. The visual form of the word appeared incrementally on the screen, and participants hit a key and typed the word as soon as they recognized it. Participants were drawn from four groups: monolingual English speakers and native speakers of Dutch, French, and German who had learned English as a second language. All participants also completed a vocabulary test in English, in which they had to decide whether a presented word was a real word in English. Because the authors were interested in the shape of the frequency curve, they used frequency-per-million counts from the Brysbaert and New (2009) subtitle corpus and fitted those to the RTs using a natural spline with 2 knots⁴ to account for the nonlinear relationship between RTs and frequency. Diependaele and colleagues found that for frequencies below ~100 per million the regression line was steep, whereas it reached asymptote for frequencies above 100. Differences between groups only emerged in the lower frequency range, with the slope being steeper for the nonnative speakers compared to the native speakers. Importantly, a proficiency-by-frequency interaction fitted the data better than a group-by-frequency interaction, suggesting that the differences between participants can be better explained in terms of English language proficiency rather than language status (L1 vs. L2). Moreover, the proficiency-by-frequency interaction was significant for the native English speakers, which confirms the results of the above cited studies and shows that even within a small restricted sample of college students sharing the same language experience (monolingual speakers) there is enough variance in proficiency scores to explain individual differences in lexical access.
⁴ A natural spline function allows the regression line to break at certain points to allow for nonlinear relationships between the predictor variable and the outcome variable.
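To make the idea of a natural spline more concrete, the following sketch builds a natural cubic spline basis in its truncated-power form. It is an illustration only: the function name, the knot values (on a hypothetical log10 frequency scale), and the choice of two interior knots plus two boundary knots are assumptions, not the specification used by Diependaele et al. (2013). Regressing RTs on these basis columns would yield a curve that can bend at the knots but is constrained to be linear beyond the boundary knots.

```python
def natural_spline_basis(x, knots):
    """Natural cubic spline basis evaluated at x (truncated-power form).

    knots must be sorted; the first and last entries are the boundary knots.
    Returns len(knots) basis values: an intercept, a linear term, and one
    curved term per interior knot.
    """
    def pos_cube(v):
        # Cubed positive part: (v)+^3
        return v ** 3 if v > 0 else 0.0

    last = knots[-1]

    def d(k):
        return (pos_cube(x - knots[k]) - pos_cube(x - last)) / (last - knots[k])

    d_penultimate = d(len(knots) - 2)
    curved = [d(k) - d_penultimate for k in range(len(knots) - 2)]
    return [1.0, float(x)] + curved


# Hypothetical knots on a log10 frequency-per-million scale.
knots = [0.0, 1.0, 2.0, 3.0]

# Three equally spaced points beyond the upper boundary knot.
rows = [natural_spline_basis(x, knots) for x in (4.0, 5.0, 6.0)]
# Second differences are zero when a function is linear over these points.
second_diffs = [rows[0][j] - 2 * rows[1][j] + rows[2][j] for j in range(len(knots))]
print(second_diffs)
```

The second differences confirm the defining property of the natural spline: every basis function, and therefore any fitted curve built from them, is a straight line outside the range of the knots.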
All the studies cited in this section so far dealt with visual word recognition and found that print exposure or vocabulary knowledge, which is also related to print exposure (Lewellen et al., 1993), is related to the size of the frequency effect. A ready explanation for this finding is that for someone who reads a lot, all words will be of higher subjective frequency compared to someone who reads little. The weaker-links hypothesis developed by Gollan and colleagues (Gollan & Acenas, 2004; Gollan et al., 2008; Gollan, Montoya, & Werner, 2002; Gollan & Silverberg, 2001) applies a similar logic to explain differences in language production between monolingual and bilingual speakers. The hypothesis was originally put forth to explain why bilingual speakers experienced more tip-of-the-tongue states, which are situations in which the speaker knows a word but is unable to produce it (Gollan & Silverberg, 2001). The assumption underlying the weaker-links hypothesis is that while monolingual and bilingual speakers have the same overall language experience, the experience of bilinguals with either of their languages will be reduced. For example, a Spanish-English bilingual student may speak only Spanish at home and only English at school. Because of this reduced language experience, the hypothesis assumes that the links between semantic and phonological representations are weaker than those of an age-matched monolingual speaker. This is similar to the constituent binding feature of Perfetti’s (2007) LQH. Evidence for the weaker-links hypothesis comes from a picture naming study that showed that the frequency effect was larger for bilingual speakers compared to monolingual speakers, even when bilinguals were tested in their dominant language (Gollan et al., 2008). In addition, the frequency effect in their nondominant (but first acquired) language was even larger. The same pattern was found by Ivanova and Costa (2008), who tested early bilingual speakers of Catalan and Spanish in their first acquired and currently dominant language. They found that compared to Spanish monolingual speakers, bilingual speakers were slower to name pictures with low frequency names. This is in line with a frequency account of the bilingual disadvantage because, due to the nonlinearity of the frequency effect, reduced exposure will affect low frequency words more than high frequency words.
1.4.4 Activation, inhibition, and lexical knowledge
What is common to the LQH and the frequency account is that words differ in their phonological representations as a result of language experience. Lexical representations of high frequency words may consist of more redundant (phonetic) information (Perfetti & Hart, 2002, p. 190). Thus, during spoken word recognition, more precise phonological representations may receive more activation from the acoustic signal because of a better match, so that their memory location is found faster, resulting in faster retrieval. When the speech signal is distorted, more redundancy in phonological representations will make lexical retrieval more robust.
Consequently, words with high quality lexical representations should be recognized more accurately and more efficiently under suboptimal listening conditions than words with low quality representations. In line with this prediction is the finding that high frequency words are recognized more accurately under adverse listening conditions (Howes, 1957). Further evidence for the link between quality of phonological representations and word recognition in noise comes from a recent study by Sommers and Barcroft (2011). In this study, native English speakers learned 24 novel words in Spanish, a language they had had no prior exposure to. All participants heard six repetitions of each word, but half of the participants heard the words spoken by the same speaker whereas the other half heard each word spoken by six different speakers. In a subsequent testing phase, participants heard the Spanish words presented in white noise at four different SNRs, spoken by two speakers unfamiliar to either group, and had to provide the English translation. The results showed that the multiple-talker group performed this test more accurately and also faster. The authors concluded that by hearing a word from multiple speakers, listeners may form more robust lexical representations. This study also shows that it may not only be the frequency of encounter with a word that matters but also the contextual diversity of encounters that influences lexical representations.
To recap, more precise lexical representations may receive more activation from a distorted speech signal. Higher activation of the target word may then result in more efficient inhibition of competitor words. In many models of word recognition, segmentation of the speech signal is achieved by boosting activation of words matching the speech signal. For example, in TRACE (McClelland & Elman, 1986) the speech signal activates sublexical units that then send activation to those lexical units they are connected to. Because many words will temporarily match the speech signal, many words receive activation in parallel, with the ones matching the speech signal best receiving the most activation. However, speech segmentation is achieved not only by activation but also by inhibition. The more activation a word receives, the more it inhibits competitor words. Thus, having less precise phonological representations may also result in less efficient inhibition, which in turn would make it more difficult for the parser to settle on the correct segmentation of the signal. For example, going back to the sentence given in 1.1, a listener who hears The catalogue in a library may initially parse the signal as the cattle. However, as the speech signal unfolds further, catalogue would receive more activation than cattle and thus send inhibition to cattle. If inhibition is less efficient, a listener will be led down a garden path longer and will take longer to recover from it.
Experimental studies have provided some evidence for the link between less precise phonological representations and less efficient inhibition of competitors. In a study using an artificial lexicon similar in design to the one described in section 1.4.1, Magnuson et al. (2003) had participants learn mappings between novel words and arbitrary shapes. Because the researchers used an artificial language, they could tightly control phonological similarity between words. The study took place over two days and consisted of a learning phase and a testing phase using the visual world paradigm. Each word in the artificial lexicon had an onset competitor and a rhyme competitor. For example, the word pibo had the onset competitor pibu and the rhyme competitor dibo. In the testing phase on the first day, both rhyme and onset competitor effects were present. However, rhyme effects were larger on the first day compared to the second day. This suggests that the inhibition of competitor items had become more efficient with increased training.
An alternative way of thinking about the development of competition between words comes from Norris and McQueen’s (2008) model of spoken word recognition called Shortlist B, which is built on Bayesian principles. Simply put, as the speech signal unfolds, the model evaluates the probability of a specific word in the lexicon given the evidence from the perceptual input and the prior probability of that word occurring in the language (based on subjective frequency of words but also more local factors such as context). In this model, words do not inhibit each other directly, but if the probability of a certain word increases, the probability of competitor words necessarily decreases. With newly learned or very infrequent words, there may be less certainty that the perceptual input refers to this word. Going back to the Magnuson et al. study, after only a few exposures to pibo and dibo, the prior probability of either word will be low given the perceptual evidence. Thus there will be more competition from similar sounding words.
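The core idea can be illustrated with Bayes' rule. The sketch below is not an implementation of Shortlist B; the function name, the two-word candidate set, and all probabilities are hypothetical values chosen for illustration. It only shows that when posterior probabilities are normalized over the candidate words, stronger evidence for one word lowers the probability of its competitor without any direct inhibition between the two.

```python
def posterior(likelihoods, priors):
    """Posterior P(word | input) via Bayes' rule, normalized over the candidates."""
    joint = {w: likelihoods[w] * priors[w] for w in priors}
    total = sum(joint.values())
    return {w: p / total for w, p in joint.items()}


# Hypothetical equal priors for the two artificial words.
priors = {"pibo": 0.5, "dibo": 0.5}

# Assumed likelihoods P(input | word) for an input that is really "pibo".
# Weakly learned words: the input is only slightly more consistent with pibo.
weakly_learned = posterior({"pibo": 0.6, "dibo": 0.4}, priors)
# Well-learned words: the input is much more consistent with pibo.
well_learned = posterior({"pibo": 0.9, "dibo": 0.1}, priors)

print(weakly_learned, well_learned)
```

In this toy example, the competitor dibo keeps a sizeable share of the probability when the words are weakly learned and loses most of it once the evidence clearly favors pibo.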
Regardless of whether we think of word recognition in terms of interactive activation models or Bayesian models, more precise lexical representations will result in a more efficient parsing of the speech signal. In addition, speech perception studies have shown that listeners rely to a great extent on lexical cues (top-down word knowledge) and only to a lesser degree on sublexical cues (bottom-up cues from the signal) when segmenting the speech signal. Examples of sublexical cues to word boundaries are stress, biphone probability, and coarticulation. In a series of experiments, Mattys, White, and Melhorn (2005) tested those cues against each other and found that lexical cues (i.e., lexicality of the preceding segment) were the cues most strongly used by listeners to segment speech. Sublexical cues only received greater weight when lexical cues were not informative or when the speech signal was severely distorted.
One possible hypothesis that emerges from these results is that better lexical knowledge, that is, more precise lexical representations, will result in better segmentation of the signal. This hypothesis received some support from another study conducted by Mattys and colleagues. Mattys, Carroll, Li, and Chan (2010) compared native English speakers to native Cantonese speakers who had learned English as a second language and had attained advanced proficiency. The authors found that the L2 speakers relied more on sublexical cues for speech segmentation compared to the native English speakers. This suggests that when language proficiency is relatively low, top-down information, that is, lexical knowledge, will play a smaller role in speech segmentation.
This section considered three mechanisms through which the quality of lexical representations may influence word recognition. High quality memory representations, defined as representations that contain more redundant phonetic information about a word, may receive more activation, exert more inhibition on competitor words, and provide better cues to speech segmentation compared to low quality representations. It is important to note that the quality of lexical representations will differ within speakers, as a function of frequency of encounter of individual words, and between speakers, as a function of overall language experience.
1.4.5 Word predictability
When listening to speech in noise, listeners make use of sentence context to compensate for a degraded speech signal. As a result, target words embedded in sentences with predictable contexts are recognized with greater accuracy compared to the same words in unpredictable contexts. The Speech in Noise test (SPIN; Bilger et al., 1984), for example, uses sentences in which the target word, the last word in the sentence, is either predictable or not from the preceding context (compare low predictability: Ms. Brown might consider the coast; with high predictability: The boat sailed along the coast). When the target is predictable, listeners can use top-down information (their knowledge of the world) and are therefore less reliant on the perceptual information from the speech signal.
With regard to bilingual SUN, research suggests that bilingual speakers may not make as much use of a predictive context as monolinguals (Bradlow & Alexander, 2007; Mayo et al., 1997), with the effect being modulated by age of acquisition of the second language (Shi, 2010). It may be that second language speakers in general form weaker expectations during language processing in the second language (Martin et al., 2013; but see Gollan et al., 2011) or that listening in noise consumes more attentional resources so that fewer resources can be devoted to exploiting a predictive context. This question will be revisited in section 1.5.1 on working memory.
1.5 Speech perception in noise and cognition
Section 1.4 dealt with the influence of linguistic factors on SUN. This section will consider how other cognitive variables contribute to individual differences in SUN. Early perception studies were often conducted and interpreted under the assumption that speech is special (e.g., Liberman & Mattingly, 1989), that is, that the speech perception system is separate from other cognitive functions in the brain. More recently, some researchers have adopted the view that speech perception and word recognition may also depend on domain-general cognitive, nonlinguistic resources (see Arlinger, Lunner, Lyxell, & Pichora-Fuller, 2009; Holt & Lotto, 2008; Mattys et al., 2012). For example, Mattys and Wiget (2011) tested the effect of cognitive load, operationalized as a visual search task, on phoneme identification. Participants heard an ambiguous phoneme on a /g-k/ continuum in contexts such as /?ift/ or /?iss/ that favored either a /g/ or /k/ response. The typical finding in this paradigm is that listeners give more /g/ responses in a /?ift/ context and more /k/ responses in a /?iss/ context. When participants concurrently performed the visual search task, this effect increased, suggesting that participants relied more on lexical knowledge than fine phonetic detail as a result of the greater task demands. This finding shows that perception is not impervious to cognitive load.
Other researchers have adopted a correlational approach and compared performance on cognitive tests, for example working memory, with performance on SUN tests (see, e.g., Akeroyd, 2008). Because I also adopted a correlational approach for the research described in this dissertation, in the following sections I will concentrate on working memory and attentional control, which are thought to be components of executive functions (e.g., Miyake et al., 2000).
1.5.1 Working memory
The most influential model of working memory (WM) is that of Baddeley (Baddeley, 1992, 2012). Baddeley and colleagues proposed that WM is a multi-component construct, originally consisting of a central executive, a phonological loop, and a visuo-spatial sketchpad. It was thought that phonological processing and visual processing were done in different systems and that attentional resources were of limited capacity, which could lead to attentional overload when the capacity was exhausted. Later Baddeley added another component, the episodic buffer, and gave greater importance to the interaction of WM with long-term memory. The research in the field of WM has been very fruitful, with thousands of articles appearing since the first model was proposed by Baddeley and Hitch (1974). The focus in this review is not on a specific model of WM but on individual differences in WM and how these relate to outcomes on a range of different tests. Research on individual differences in WM started with a seminal paper by Daneman and Carpenter (1980). They developed the concept of reading span, which they determined by having participants read sentences with set sizes of 2 to 6 sentences. After each set of sentences, participants were asked to recall the last word of each sentence. The number of words that a participant could successfully recall was their reading span. The test was based on the logic that WM consists of storage and processing capacity. Reading sentences required participants to process them for meaning, while remembering the last word tapped into storage. Surprisingly, an individual’s reading span was highly correlated with their verbal SAT score (r = .59) and passage comprehension (fact retrieval, r = .72, and pronoun reference, r = .9).
WM is not independent of long-term memory (LTM); rather, the two memory systems interact during processing (Cowan, 1993). Evidence for this hypothesis comes from studies showing that short-term memory (STM) for words is influenced by lexical and semantic variables such as frequency, familiarity, phonotactic probability, and imageability (e.g., Hulme, Maughan, & Brown, 1991; Roodenrys, Hulme, Alban, Ellis, & Brown, 1994). For example, high frequency words are remembered better than low frequency words. The fact that these variables influence recall of words suggests that LTM representations must become active. To account for these data, Hulme and colleagues hypothesized that “word frequency influences the redintegration of partially decayed traces retrieved from a short-term store” (Hulme et al., 1997, p. 1227). The same authors assume that the effect of word frequency manifests itself via more accessible or better-specified phonological representations of high frequency words compared to low frequency words. Baddeley (2012) states that “the phonological loop, the simplest component of WM, is likely to depend on phonological and lexical representations within LTM as well as procedurally based language habits for rehearsal” (p. 18). Cowan’s (1999) model of WM differs from Baddeley’s in that Cowan does not assume a separate STM storage system. Rather, information in STM differs from LTM in the state of activation. Capacity limits in Cowan’s (1999) model arise from attention limits rather than storage limits since LTM is believed to be of unlimited capacity.
Researchers differ in their view of WM as being either domain general or depending on domain-specific storage capacity. Conway and his collaborators, for example, view WM as a general capacity store that can hold information of any kind (see Conway et al., 2005). Conway and Engle showed that not only is reading span correlated with measures of verbal aptitude but so is operation span, a measure derived in a similar way to reading span but requiring mathematical operations as the processing component (Conway & Engle, 1996). Kane et al. (2004) also subscribe to a general capacity theory, hypothesizing that WM primarily consists of a domain-general executive attention component and only secondarily of a domain-specific storage component. Taking a psychometric approach, the authors tested a large number of participants on a wide range of tests thought to tap into different WM components and other constructs such as fluid intelligence. Based on model comparisons, they concluded that a one-component model of WM fit their data best, that is, different WM tests such as verbal and spatial WM all shared a common variance. Kane et al. (2004) took this as evidence for the hypothesis that WMC is domain general.
Others have posited that individual differences in verbal working memory, as measured, for example, by the reading-span test, depend on language experience plus differences in biological factors (MacDonald & Christiansen, 2002). In this latter view, a reading-span test and a text comprehension test correlate because both rely on language skill, which develops with language experience. The difference between MacDonald and Christiansen (2002) on one side and Conway, Kane, and colleagues (1996; 2004) on the other side may just be one of focus. Whereas the former focus on the storage component, the latter focus on the executive attention component. However, it seems that the differences are more fundamental, with MacDonald and Christiansen (2002) stating that “capacity is an intrinsic part of the language comprehension system, not a separately modulated resource” (p. 50). Assuming that WM tests measure domain-general attention limits and domain-specific storage limits, one question arises regarding individual differences research in SUN: Which component are we looking at when we observe correlations between WMC and SUN tests? It is beyond the scope of this dissertation to give a definitive answer to this question; however, we need to keep this issue in mind, and I will come back to it in the discussion.
One test that is often used to assess phonological STM, which corresponds to the phonological loop in Baddeley’s model (Baddeley, 1992), is the nonword repetition test. In this test, participants are asked to repeat nonwords of varying lengths. Nonword repetition has been shown to be a good predictor of vocabulary growth in children (S. E. Gathercole & Baddeley, 1989) and adults (Baddeley, Gathercole, & Papagno, 1998). However, nonword repetition ability is not independent of language experience. For instance, nonword repetition accuracy is related to the phonotactic probability of the nonword, that is, how probable its phoneme sequence is in the participant’s L1 (Majerus, Linden, Mulder, Meulemans, & Peters, 2004). In one study (Majerus et al., 2004), participants were exposed to an artificial language with certain phonotactic rules. In a subsequent nonword repetition test following the brief exposure, participants were better able to remember nonwords that were in agreement with the phonotactic pattern of the artificial language compared to those that violated the phonotactic pattern. This suggests that phonotactic sensitivity emerges through language experience.
To investigate this hypothesis further, Edwards, Beckman, and Munson (2004) tested children between 3 and 8 years and adults with a nonword repetition test in which they manipulated the phonotactic probability. They found that high-frequency sequences were repeated with greater accuracy than low-frequency sequences. In addition, there was an effect of age such that accuracy increased with age and an interaction between age and frequency, showing that the frequency effect, the difference between high and low frequency sequences, decreased as a function of age. Importantly, expressive vocabulary size explained 29% of the variance in accuracy scores after accounting for age effects. Although these results do not allow causal inferences, they imply that nonword repetition ability is strongly related to language experience and specifically vocabulary knowledge. In the view of Edwards et al., listeners induce more generalized, abstract representations of sequences from the phonological patterns of words that they have encountered and learned. This will help the listener to quickly access similar patterns in other words, and the fine-grained phonological knowledge becomes more precise as more instances of a pattern are accumulated (Edwards et al., 2004, p. 434). More evidence for the view that nonword repetition ability improves as a result of an individual speaker’s experience with a language comes from a recent study by Parra, Hoff, and Core (2011). The authors tested a sample of English-Spanish bilingual 22-month-old children who had been exposed to both languages from birth. Phonological STM in each language, as measured by a nonword repetition test, was related to the relative exposure of children to English and Spanish. Children’s exposure to English was positively correlated with their English nonword repetition score and negatively with their score for Spanish-like nonwords. Together these results suggest that both vocabulary and phonological STM develop as a function of language experience. Thus, verbal WM, of which phonological STM is a component, is not independent of language experience but is dependent on the quality, or precision, of phonological representations in LTM (Acheson, Hamidi, Binder, & Postle, 2011). The relationship between phonological STM and vocabulary acquisition therefore seems to be interactive rather than unidirectional (Thorn & Gathercole, 1999). A larger vocabulary is associated with better phonological STM and a better STM is associated with more efficient vocabulary acquisition (Gupta & Tisdale, 2009).
In section 1.3.4, I argued that phonological representations are, on average, less precise in bilinguals as a result of their reduced language experience. Given the relationship between phonological representations and verbal WM, it comes as no surprise that bilinguals often score below monolinguals on tests of verbal WM. Studies have shown that verbal WM is usually better in an L1 than an L2 (e.g., Service, Simola, Metsänheimo, & Maury, 2002) and even highly proficient bilinguals may have poorer verbal WM compared to monolingual speakers, while visual WM is not affected (Luo, Craik, Moreno, & Bialystok, 2013). Thus, bilinguals do not seem to be impaired in general WMC, but memory processes that rely on LTM representations will be less efficient. As described above, decaying phonological representations in STM are thought to be restored (redintegrated) from their LTM representations (Hulme et al., 1997). The bilingual disadvantage on verbal WM tests may therefore have the same underlying cause as the word frequency effect on tests of serial recall. Low frequency words are recalled less accurately than high frequency words, so if all words are of lower experienced frequency in bilinguals, performance on verbal memory tests should be commensurate with a bilingual’s language experience in each language.
1.5.2 Working memory and speech perception in noise
Recent studies have highlighted the role of WMC in SUN (for a review see Akeroyd, 2008). Several studies have investigated the role of WM in the encoding and recall of acoustically degraded speech. Brännström, Zunic, Borovac, and Ibertsson (2012) found a positive correlation between listeners’ working memory span and the acceptable background noise level in their study. Pichora-Fuller, Schneider, and Daneman (1995) administered a verbal working memory test in which participants listened to sentences of variable length of which they had to remember the last word. The authors found that an SNR of +8 dB did not affect recall compared to the quiet condition; however, for SNRs of +5 and 0 dB, they found an interaction between set size and noise level showing that participants were able to recall fewer words as the SNR decreased. Piquado, Cousins, Wingfield, and Miller (2010) presented participants aurally with word lists and found that when a word was masked so that it was just above the perceptual threshold, recall of that word and the preceding words was impeded. Ljung, Israelsson, and Hygge (2012) administered a WM test and in addition presented participants with word lists under different SNRs that they repeated immediately and also recalled later. The authors found that whereas individual differences in WM did not predict speech intelligibility in noise (i.e., word recognition), there was an interaction between WMC and SNR for the delayed recall, showing that recall was affected by SNR in low- but not in high-span individuals. Tamati, Gilbert, and Pisoni (2013) tested a large sample with a SUN test and then asked those participants who fell into the upper or lower quartile to come back for additional testing. The authors found that backward and forward digit spans of participants in the upper quartile were significantly longer than the digit spans of participants in the lower quartile group.
Lastly, Obleser, Wöstmann, Hellbernd, Wilsch, and Maess (2012) investigated the effects of memory load and acoustic degradation on WM by looking at behavioral and neuroimaging data. They presented participants with 2, 4, or 6 digits in one of three levels of degradation (voice-vocoded speech with 4, 8, or 16 frequency bands). After a brief pause, participants were then presented with one digit and had to decide whether it was in the list or not. The authors found that while accuracy was high (above 90%), there was an effect on RTs. Both larger set sizes and higher levels of degradation resulted in longer RTs. In addition, the authors investigated alpha oscillations, a frequency band associated with WM load, during the retention interval (the interval between encoding and recall) and found that alpha power increased not only as a function of set size but also as a function of acoustic degradation. This finding suggests that the rehearsal of degraded verbal stimuli in WM is associated with extra effort and fits well with the ELU model that I will describe in the next section.
Underlying these accounts is a limited capacity view of attentional resources. When
attentional resources are taken up by decoding and processing degraded speech, fewer resources
are available for other mental processes such as rehearsal. However, as outlined in Section 1.5.1,
verbal WM tests do not only measure the attention capacity available to a subject but also
correlate with lexical knowledge. It is therefore not clear whether the storage component, the
attention component, or both components of verbal WM tests share variance with speech
processing in noise when lexical knowledge is not controlled for. Two studies are suggestive of
this hypothesis. Kilman et al. (2014) administered an English and a Swedish SUN test, an English
and a Swedish WM test (reading span), and an English proficiency test to Swedish native speakers.
The English SRT correlated more strongly with English proficiency than with English reading span,
and Swedish reading span was even more weakly associated with the English SRT.
Sörqvist, Hurtig, Ljung, and Rönnberg (2014) tested Swedish native speakers who had learned
English as an L2. Participants performed an English reading proficiency test, a WM test in L1 and
L2, and an English listening proficiency test with three different reverberation times. L1 and L2
WMC and L2 proficiency correlated highly with the listening test results. For further analysis,
the authors ran a regression model in which they used the listening test results with the longest
reverberation time as the dependent variable and the results from the shortest reverberation time
as a control variable. When L2 WMC was entered in a next step, it was significant, but when L2
reading proficiency was entered, WM was no longer significant.

Results from these two studies show that WM and language proficiency may not
independently contribute to SUN. The fact that WM was no longer predictive in Sörqvist et al.
(2014) does not necessarily mean that individual differences in general WMC did not contribute
to listening under reverberation. However, if language proficiency and verbal WM predict SUN
because both are indicative of the quality of phonological representations, then it may be
difficult to disentangle the contributions of verbal WM and language proficiency. Using a
nonverbal WM test or a composite score based on more than one WM test may be necessary to
gauge the unique contribution of WMC during SUN. It may also be that individual differences in
WMC become more important in older listeners.
The degree to which individual differences in WMC are predictive of SUN may also
depend on the type of SUN test used. For example, the Words-in-Noise (WIN) test (Wilson,
Carnell, & Cleghorn, 2007) places few attentional demands on listeners, as they only have to
repeat single words that are preceded by the carrier phrase "Say the word". In the SPIN, on the
other hand, target words are embedded in sentences with predictive and unpredictive contexts and
so the onset of the target word is less predictable. In addition, if participants want to use the
semantic context to predict the last word, they need to maintain representations of the preceding
words in STM. Thus the test places greater attentional demands on the listener. I mentioned above
that second language speakers and early bilinguals may not benefit as much from a predictive
context as monolingual speakers (Bradlow & Alexander, 2007; Mayo et al., 1997; Shi, 2010). This
may be because processing a sentence takes up more attentional resources when phonological
representations are less precise. Thus listening in noise may take up more attentional resources
in bilinguals and L2 speakers so that they have fewer resources left to predict the target word.
In the next section I will describe a model that brings together WMC, quality of lexical
representations, and SUN.

1.5.3 The Ease of Language Understanding model
The ELU model (Rönnberg et al., 2013; see Introduction) was developed to describe the
interplay of bottom-up (the perceptual input) and top-down (lexical knowledge, WM) processes
during language processing. The broader context of the model is that of Cognitive Hearing
Science (e.g., Arlinger et al., 2009), which developed out of a realization that domain-general
higher-order cognitive processes interact with perceptual processes and that speech perception
therefore cannot be studied separately from the rest of the cognitive sciences.
The model assumes that sublexical information at the level of the syllable is buffered in a
temporary storage system called RAMBPHO (rapid, automatic, multi-modally bound phonological
representations). These syllabic units are then compared to phonological representations in LTM.
The model assumes that phonological representations consist of multiple attributes and that, for
successful lexical access, the speech signal has to activate a minimum number of attributes. If
the threshold for lexical retrieval is not reached, similar sounding words may be retrieved
instead. However, contextual information may often be sufficient for a lexical item to be
retrieved even when the speech signal is too distorted. In cases where information in RAMBPHO
cannot be matched with an LTM representation, explicit processing that involves WM is needed to
resolve the mismatch, causing a delay in lexical access. Otherwise lexical access occurs
automatically.
Mismatches between the speech signal and LTM representations can occur for speaker-external
reasons (e.g., distorted speech or an unfamiliar accent) or speaker-internal reasons (imprecise
phonological representations; Rönnberg et al., 2013, p. 3). According to this model, listening in
noise will take up more attentional resources than listening in quiet because the perceptual
information will often be too distorted to be effortlessly matched to LTM representations.
Individual differences in WMC relate to SUN because individuals with greater WMC are
thought to resolve mismatches with greater ease (cf. Pichora-Fuller et al., 1995). At the same
time, the model predicts greater processing effort for individuals with less precise phonological
representations in LTM, for example, second language learners.
The role of attention in language processing has recently been highlighted in a brain
imaging study. Wild et al. (2012) tested participants with a complex task that required them to
attend to one of three simultaneously presented stimuli, namely aurally presented sentences,
auditory distracters, or visual stimuli. The intelligibility of the sentences was manipulated by
reducing the spectral information present in the signal. The results showed that when participants
heard undistorted sentences while attending to the auditory or visual information, frontal regions
associated with speech comprehension showed activation and participants were later able to
recall information from these unattended sentences. However, when unattended sentences were
distorted, activation of frontal regions was not greater than in the control condition
(unintelligible sentences) when participants were instructed to attend to the distracters. In
addition, the level of activation of frontal regions correlated with the degree of acoustic
distortion when participants attended to the sentences. Activation of auditory cortex, on the other
hand, was not modulated by attention. The results from this study fit well with the ELU model in
that processing of clear speech seemed to be effortless and not dependent on top-down attention;
distorted speech, on the other hand, required attention for it to be processed and remembered. In
addition, the finding that speech intelligibility was correlated with activation of frontal
regions fits well with the ELU model's emphasis on processing effort: the greater the distortion,
the greater the need for explicit processing.

1.6 Phonological quality hypothesis
The phonological quality hypothesis (PQH) will be the overarching hypothesis of this
dissertation. It is derived from the literature review presented here and related to the LQH put
forward by Perfetti (2007), the weaker links hypothesis (Gollan et al., 2008; Gollan, Montoya,
Fennema-Notestine, & Morris, 2005; Gollan et al., 2002), the phonological mismatch hypothesis
(Imai, Walley, & Flege, 2005), the representation quality hypothesis (Sommers & Barcroft,
2011), and the ELU (Rönnberg et al., 2013). The PQH makes further assumptions regarding the
nature of phonological representations. The
LQH was developed to explain individual differences in reading and is therefore not directly
translatable to spoken word recognition. I will make the same general assumption as the LQH,
namely that words differ in the quality of their representation within a single speaker and
between speakers as a function of frequency of encounter. With each encounter, connections
between phonological and semantic representations will be strengthened. High frequency words are
encountered more often and in more diverse contexts than low frequency words. Therefore, the
lexical representations of high frequency words are assumed to be more precise and semantic
information can more easily be integrated to extract the meaning/gist of an utterance. The ELU
makes the assumption that phonological representations differ in the number of attributes with
which they are stored in LTM. Thus more precise representations will consist of a higher number of
attributes compared to less precise representations. More attributes will result in a better match
between the acoustic signal and phonological representations and thus more efficient and robust
lexical retrieval. In the PQH, I further assume that representations in LTM do not only consist of
abstract phonetic information but that each encounter with a word leaves a memory trace
(cf. Goldinger, 1996). The hypothesis
thus builds on exemplar theories of word recognition (Goldinger, 1996, 1998; Hintzman, 1986;
Pierrehumbert, 2001). Exemplar-based models of the mental lexicon differ from models that
assume that only abstract representations of words are stored and that the speech signal is
normalized and stripped of all indexical information (e.g., speaker voice, gender, etc.) prior to
lexical access (e.g., K. Green, Kuhl, Meltzoff, & Stevens, 1991). The exact nature of the mental
lexicon is still an active area of research but exemplar theories are especially useful in the
present context because they make specific predictions about the frequency of encounter with words.
Lexical items that are encountered more frequently will be associated with more episodic
memory traces and a match between the signal and a memory representation will be more likely.
The same logic that can explain frequency effects within speakers can also be extended to
explain individual differences in word recognition between speakers. Individuals who overall
have more language experience will encounter all words, and especially low frequency words
(Kuperman & Van Dyke, 2013), more often compared to other individuals. For example, some
people may interact with a greater variety of people and in a greater variety of contexts. Such
differences in experience are especially relevant for speakers of two or more languages, who, on
average, will spend less time listening to and speaking each language (for an application of
exemplar theory to nonnative speakers see Hardison, 2003).

Whereas bilinguals may
be able to estimate quite accurately what percentage of the time they speak and hear each
language on average, it is certainly much harder for monolingual speakers to reliably estimate the
number of hours they listen to language, how many speakers they interact with regularly, and the
type of contexts in which they encounter language. Therefore the assumption is made in the present
study that language experience is closely related to verbal ability or language proficiency (these
terms will be used interchangeably throughout this dissertation). Language proficiency can easily
be measured by a standardized test. Such tests were developed by testing a large sample
representative of the general population and have high reliability. Standardized tests are thus
preferable to data based on self-report. Proficiency is related to experience because individuals
who are generally exposed to language more and interact with more people are more likely to hear
words more often, especially less frequent ones, than individuals who have fewer interactions.

CHAPTER 2: EXPERIMENT 1
2.1 Research questions and predictions
This experiment was designed to test the effect of noise, predictability and language
status (bilingual/monolingual) on word recognition in noise. Secondly, I tested the influence of
lexical and sublexical variables on word recognition in noise and whether the effect of these
variables is different for monolinguals and bilinguals. Results could provide evidence for or
against the hypothesis that the previously reported bilingual disadvantage in SUN is related to a
bilingual's generally reduced exposure to each of their languages compared to someone who
speaks only one language. As discussed in Section 1.4.3, the word frequency effect is a result of
language exposure. Words that are more frequent in the language are encountered more often
and are therefore recognized faster and with greater accuracy. As I described in Section 1.4.1,
phonological representations of high frequency words are assumed to be more precise, including
more redundant information, compared to low frequency words and thus recognition of high
frequency words is more robust to the effect of background noise. If reduced exposure to each
language due to bilingualism is one factor underlying the bilingual disadvantage in SUN, then
differences between groups are expected to be disproportionately larger in the low frequency
range because of the logarithmic nature of frequency effects (see Section 1.4.3) [5]. Thus, we
would expect a frequency by group interaction. Originally I also intended to investigate the effect
of neighborhood frequency on word recognition as previous studies have found this variable to be
a good predictor of word recognition in noise (Luce & Pisoni, 1998). However, many words in the
present study behaved differently than would have been expected based on their neighborhood
frequency. This made the results difficult to interpret and so only the results for lexical
frequency are reported here.

[5] As described in Section 1.4.3, an individual with less exposure to the language that he or she
speaks will encounter low frequency words disproportionately less often compared to someone
with more language exposure.

2.2 Methods
2.2.1 Participants
The study includes 53 monolingual and 48 bilingual participants. The inclusion criterion
for monolinguals was that they had not learned a second language before the age of 10. Some
monolinguals had learned a second language in foreign language classes in school but they were not
fluent in their second language and had not spent more than a short vacation in a non-English
speaking country. Bilinguals had to have learned Spanish from birth and English before the age of
8. Four bilinguals reported having learned English later than 8 but they were included in the study
because they were born in the US and attended school in the US from kindergarten. They
reported that they attended a Spanish-English bilingual program but that little English was
taught. However, they likely had some exposure to English. Thirty-seven (77%) bilinguals were
born in the US. Of the remaining bilinguals, all but five arrived in the US before the age of 6.
Four of those immigrated at the age of 7 and one at the age of 13. The latter participant was
included because her mother was a native speaker of English and she had learned both English
and Spanish from birth and attended a bilingual school. In addition, participants had to be
between 18 and 35 years old. I tested six additional monolinguals and five additional bilinguals
but they were not included in the final sample because they did not meet the definition of
monolingual (5) or early bilingual (4), or were too old (1) or too young (1) to be included in the
study. Detailed participant information can be seen in Table 1.
The bilingual participants were mostly second-generation immigrants from Mexico. As
might be expected in this population (Capps et al., 2005), the parental education level was lower
than that of the monolingual group. Participant groups also differed on other variables, most
notably English proficiency. I will come back to this point in the results section. Importantly,
participants were matched on age, years of formal education, and self-rated hearing ability.
Table 1. Participant characteristics divided by language status.

                                                  Monolingual      Bilingual
Age in years                                      20.6 (2.4)       20.8 (2.8)
Number of males                                   18 (34%)         16 (33%)
Years of formal education                         14.9 (1.6)       14.4 (1.4)
Primary caregiver's education level:
  - Less than high school                         0%               40%
  - High school                                   11%              46%
  - Some college                                  30%              8%
  - College                                       32%              4%
  - Some graduate school                          4%               0%
  - Graduate school                               23%              2%
Self-rated hearing ability (out of 10)            8.6 (1.0)        8.6 (1.1)
Years of musical experience                       4.7              1.0
Oral language W-score                             533.2 (8.9)      515.6 (11.4)
Oral language Standard score                      105 (7.7)        90 (8.8)
Picture Vocabulary W-score                        537.1 (11.0)     516.1 (13.5)
Picture Vocabulary Standard score                 101 (7.6)        86 (8.4)
Verbal Analogies W-score                          529.5 (9.2)      515.3 (11.8)
Verbal Analogies Standard score                   109 (7.3)        98 (9.0)
Oral language W-score - Spanish                   -                503.0 (11.9)
Oral language Standard score - Spanish            -                81 (9.3)
Picture Vocabulary W-score - Spanish              -                500.8 (11.8)
Picture Vocabulary Standard score - Spanish       -                77 (7.9)
Verbal Analogies W-score - Spanish                -                505.3 (14.2)
Verbal Analogies Standard score - Spanish         -                90 (10.8)
Age of Acquisition: English                       -                4.4 (2.5)
Age of Acquisition: Spanish                       -                0
Age of Arrival in USA                             -                1.3 (2.8)
Listening to English                              -                64.6% (18.4)
Speaking English                                  -                65.5% (17.4)
Reading English                                   -                81.3% (16.7)

Participants were recruited through flyers. Monolinguals and bilinguals were tested at a
large rural university in Michigan and additional bilinguals were tested at a large urban 
university in Illinois. Most bilinguals tested in Michigan 
were originally from Texas whereas 
those tested in Illinois were mostly from Chicago.
 2.2.2 Materials
 2.2.2.1 Background questionnaire
Participants' background information was collected with a questionnaire created for this
study, administered by the experimenter. The instrument was loosely based on Marian et al.
(2007) but included additional information about parental education and use of English and
bilingual participants' use of English and Spanish during their childhood and adolescence. It took
about 6 to 10 minutes to administer. The questionnaire can be seen in the Appendix.
 2.2.2.2 Speech perception in noise test
The revised Speech Perception in Noise test (SPIN; Bilger et al., 1984) was used in a
modified form. The test consists of 200 target words and each word is recorded in a predictive
and an unpredictive context. For example, the word "coast" could be preceded by "Ms. Brown might
consider the coast" (low predictability) or by "The boat sailed along the coast" (high
predictability). The original SPIN recordings were obtained on CD and were cut in Audacity so that
each sentence could be saved in a separate sound file. For the background babble, a short sequence
from the original track was chosen and mixed with each sentence in Praat (Boersma & Weenink,
2014) at two different SNRs (-2 dB and 3 dB). The sound intensity of the sentence was held
constant and so the intensity of the babble differed for the two SNRs.
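The mixing logic described above can be illustrated with a minimal sketch. It is not the Praat procedure used to create the stimuli; it only shows, assuming that level is measured as RMS amplitude, how a babble segment could be rescaled so that a sentence of fixed intensity ends up at a target SNR. All function names and the toy signals are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def babble_gain_for_snr(speech, babble, target_snr_db):
    """Gain for the babble so that the speech-to-babble RMS ratio equals target_snr_db.

    The speech level is left untouched, so only the babble level differs
    between SNR conditions.
    """
    target_babble_rms = rms(speech) / (10 ** (target_snr_db / 20))
    return target_babble_rms / rms(babble)

def mix_at_snr(speech, babble, target_snr_db):
    """Add the rescaled babble to the unchanged speech signal."""
    gain = babble_gain_for_snr(speech, babble, target_snr_db)
    return [s + gain * b for s, b in zip(speech, babble)]

def measured_snr_db(speech, noise):
    """SNR in dB computed from the RMS amplitudes of speech and noise."""
    return 20 * math.log10(rms(speech) / rms(noise))

# Toy signals standing in for a sentence and a babble segment (assumed data).
speech = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
babble = [0.3 * math.sin(2 * math.pi * 13 * t / 200 + 1.0) for t in range(200)]

for target in (-2.0, 3.0):
    gain = babble_gain_for_snr(speech, babble, target)
    scaled_babble = [gain * b for b in babble]
    print(target, round(measured_snr_db(speech, scaled_babble), 2))
```

Because the speech is never rescaled, the babble gain is necessarily larger in the -2 dB condition than in the 3 dB condition.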
In the present study, 128 sentences from the test were chosen and divided into four lists
of 32 words. Words in each list were matched on lexical frequency, based on subtitle frequencies
from Brysbaert and New (2009), and on neighborhood density. Each participant heard the first
half at 3 dB SNR and the second half at -2 dB. Within each SNR, half of all words were played
in a predictable context and the other half in an unpredictable context in a randomized order.
Across all participants, each word was administered in all four conditions in a Latin-square
design. After each sentence, the participant was prompted to type the last word of the sentence.
The next trial started when a participant pressed Enter. Before the actual experiment, 10
sentences were administered at an SNR of 8 dB to ensure that participants had understood the
task. Participants were also told to check the word they typed on the screen for any spelling
errors before going to the next trial. This test was administered in E-Prime 2.0 (Psychology
Software Tools, Sharpsburg, PA).
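The Latin-square counterbalancing can be sketched as follows. The exact rotation scheme is not spelled out in the text, so the code assumes the simplest version: four experiment versions, with each of the four word lists shifted by one condition from one version to the next. The list labels and the function name are hypothetical.

```python
# The four conditions: two SNRs crossed with two levels of predictability.
CONDITIONS = [
    ("3 dB", "high predictability"),
    ("3 dB", "low predictability"),
    ("-2 dB", "high predictability"),
    ("-2 dB", "low predictability"),
]

# Hypothetical labels for the four lists of 32 words.
LISTS = ["List A", "List B", "List C", "List D"]

def latin_square_assignment(version):
    """Map each word list to one condition for a given experiment version (0-3)."""
    n = len(CONDITIONS)
    return {LISTS[i]: CONDITIONS[(i + version) % n] for i in range(n)}

for version in range(4):
    print(version, latin_square_assignment(version))
```

Under this assumed rotation, every list occurs in each condition exactly once across the four versions, and within a version each condition is served by a different list.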
Information about lexical variables was taken from different sources. Information about
lexical frequency was taken from Brysbaert and New (2009). These norms are based on a large
corpus created from subtitles of American movies and TV shows. The mean log10 word
frequency of the stimuli used in the present study was 2.71 (SD = 0.45) and the mean FpM was
17.63 (SD = 25.40). Information about phonotactic probability came from Vitevitch and Luce
(2004). This database provides the summed probabilities of each phoneme in a word and the
summed probability of each biphone. The correlation between biphone probability and
log-frequency was r = .14.

2.2.3 Procedure
In this experiment, participants heard recordings of spoken sentences. After each
sentence, the participant was prompted to type the last word of the sentence. The next trial
started when a participant pressed Enter. Before the actual experiment, 10 sentences were
administered at an SNR of 8 dB to ensure that participants had understood the task. Participants
were also told to check the word they typed on the screen for any spelling errors before going to
the next trial. This test was administered in E-Prime 2.0 (Psychology Software Tools, Sharpsburg,
PA).
 2.3 Analysis
For the analysis, mixed-effects regression modeling was used (Baayen, Davidson, &
Bates, 2008; Gelman & Hill, 2007). Models were run in R (R Core Team, 2014) using the
package lme4 (Bates, Maechler, Bolker, & Walker, 2014). Mixed-effects models have the
advantage over ANOVA that they allow for crossed random effects of subjects and items. That
eliminates the necessity to run separate analyses for subjects and items. At the same time, models
can be run with continuous predictors as in linear regression while controlling for multiple
observations from the same subject. This is an advantage over ANOVA because naturally
continuous variables such as word frequency can be used as a continuous predictor and do not
have to be factorized into low and high frequency items. In addition, mixed-effects models allow
testing for interactions between subject-level and item-level variables. Lastly, a great advantage
over ANOVA is that mixed-effects models can handle continuous and dichotomous outcome
variables. In ANOVA, when variables are dichotomous, such as accuracy data, researchers
usually average across subjects and use percentage correct as the outcome variable, using a
transformation to correct for the typically non-normal distribution. This traditional method has
certain shortcomings as described in Jaeger (2008), which can be circumvented by using a
generalized mixed-effects model with a binomial error distribution. This also obviates the need
to average data.
Interpreting the output from a mixed-effects model is somewhat different from interpreting
the output of an ANOVA. The significance of main effects and interactions will be assessed with
likelihood-ratio tests. These tests compare the log-likelihood of a model excluding a variable
with that of a model including the variable. Twice the difference in log-likelihood approximately
follows a chi-square distribution with the degrees of freedom corresponding to the difference in
the number of parameters between the models (i.e., one). In addition, I will also report the model
estimates for the effect sizes of each variable. For logistic regression, these are on the logit
scale and thus not easily interpretable but the logit values can be transformed into odds ratios by
taking the exponent. An odds ratio describes the odds of an event (here, a correct response) in one
condition relative to the odds of that event in another condition.
In the present analysis there are three categorical variables, Noise (high/low), Predictability
(high/low), and Group (monolingual/bilingual). For each variable, one level will be the reference
category, for example, Low Noise. In the model output, the regression coefficient of the variable
Noise then shows the change on the logit scale when noise is high. Because the logit scale is
nonlinear, the actual values cannot be read directly as changes in probability. However, the sign
of the coefficient will tell us whether the probability of recognizing a word increases or
decreases relative to the reference category. So if the sign of the Noise coefficient is negative
and Low Noise is the reference category, we know that the probability of recognizing a word is
lower when noise is high compared to when it is low. It is important to know what the intercept in
a regression model represents because the coefficient of a continuous variable will show the effect
size for the baseline condition, that is, the combination of reference categories represented by
the intercept. For example, if the reference categories for our three categorical variables are
High Noise, Low Predictability, Monolingual, then the intercept will give us the log-odds of
recognizing a word in this condition, which can be transformed into a probability. If we include
an interaction between Group and Frequency, the main effect of Frequency will give us the effect
size for the monolingual group and the coefficient of the interaction between Group and Frequency
will indicate the change in the effect size for the bilingual group. For example, if the main
effect of Frequency was 5% (that is, a one-unit change in Frequency is associated with a 5%
increase in recognition), and the coefficient of the interaction between Group and Frequency is
4%, then the effect size for the bilinguals is 9%. If the interaction is significant, it means that
the slopes of the frequency effect for monolinguals and bilinguals are statistically different.
Because the actual values will be on the logit scale, the sign of the coefficients can again help
us determine whether the effect is smaller or larger for bilinguals compared to monolinguals.
Instead of adding up the coefficients, we could also change the reference category for Group to
bilingual. The main effect of Frequency would then show us the effect size for the bilinguals. Note
that this is different from doing multiple comparisons because we run the same model and just
change the reference category.
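The transformations described in this section can be made concrete with a short sketch. It takes a few logit-scale estimates as they are printed in Table 2 and converts them to odds ratios and probabilities; because the printed estimates are rounded to two decimals, the recomputed odds ratios may differ slightly from the tabled ones. The function and variable names are illustrative only, and "reference group" and "other group" are used as neutral labels.

```python
import math

def odds_ratio(logit):
    """Convert a logit-scale coefficient into an odds ratio."""
    return math.exp(logit)

def probability(logit):
    """Convert log-odds into a probability (inverse logit)."""
    return 1.0 / (1.0 + math.exp(-logit))

# Logit-scale estimates as printed in Table 2 (rounded to two decimals).
intercept = 0.03
noise = 1.25
frequency = 0.35            # slope for the reference group
frequency_by_group = -0.15  # change in slope for the other group

# Slope for the non-reference group: main effect plus interaction coefficient.
other_group_frequency = frequency + frequency_by_group

print("odds ratio, Noise:", round(odds_ratio(noise), 2))
print("probability at intercept:", round(probability(intercept), 3))
print("frequency slope, other group (logit):", round(other_group_frequency, 2))
print("frequency odds ratio, other group:", round(odds_ratio(other_group_frequency), 2))
```

Adding the interaction coefficient to the main effect gives the same slope that would appear as the main effect if the model were refit with the other group as the reference category.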
The test was scored automatically by E-Prime and an answer was counted as correct when
it matched the target word. All answers that were coded as incorrect were manually checked for
any spelling mistakes. A misspelled word was counted as correct in the following cases: letter
transposition (e.g., "theif" for "thief"), a wrong letter when the correct letter was adjacent to
it on the keyboard and the resulting word was not a word in English (e.g., "ahore" for "shore"),
a missing letter when the resulting word was not a word in English, or when the answer was a
homophone of the target word, regardless of whether the typed word was a real English word
(e.g., "gyn" or "jin" for "gin"). In total, 286 (2.2%) instances were corrected this way, which is
comparable to 2.5% in Luce and Pisoni (1998), who used a similar procedure. In three instances,
participants started typing before the prompt. In this case, the letters that were typed before the
prompt were not recorded. This typically resulted in very short RTs (measured from the start of
the prompt to the point where a participant hit Enter). For example, one participant seemingly
only typed "t" for "pet" with an RT of only 435 ms (a typical RT would be between 1500 and 3000
ms). These trials were coded as missing data.
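The spelling corrections described above were made by hand. Purely as an illustration, some of the rules could be expressed as simple checks like the ones below. This is a sketch under simplifying assumptions: keyboard adjacency is limited to neighbors on the same QWERTY row, the lexicon passed to the functions is a hypothetical set of English words, and the homophone rule is left out because it would require a pronunciation dictionary.

```python
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

def is_transposition(response, target):
    """True if the response is the target with two adjacent letters swapped."""
    if len(response) != len(target) or response == target:
        return False
    diffs = [i for i in range(len(target)) if response[i] != target[i]]
    return (len(diffs) == 2 and diffs[1] == diffs[0] + 1
            and response[diffs[0]] == target[diffs[1]]
            and response[diffs[1]] == target[diffs[0]])

def same_row_neighbors(a, b):
    """True if two keys sit next to each other on the same keyboard row."""
    for row in QWERTY_ROWS:
        if a in row and b in row:
            return abs(row.index(a) - row.index(b)) == 1
    return False

def is_adjacent_key_slip(response, target, lexicon):
    """One wrong letter next to the correct key; the response is not a real word."""
    if len(response) != len(target) or response in lexicon:
        return False
    diffs = [i for i in range(len(target)) if response[i] != target[i]]
    return len(diffs) == 1 and same_row_neighbors(response[diffs[0]], target[diffs[0]])

def is_missing_letter(response, target, lexicon):
    """One letter of the target is missing; the response is not a real word."""
    if len(response) != len(target) - 1 or response in lexicon:
        return False
    return any(target[:i] + target[i + 1:] == response for i in range(len(target)))

lexicon = {"thief", "shore", "gin", "pet", "sore"}  # hypothetical mini-lexicon
print(is_transposition("theif", "thief"))
print(is_adjacent_key_slip("ahore", "shore", lexicon))
print(is_missing_letter("sore", "shore", lexicon))
```

The last call shows why the real-word condition matters: "sore" is one letter short of "shore" but is itself an English word, so it would not be accepted as a typing error.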
2.4 Results
For the analysis, I ran one regression model but I will report the results separately for
each research question: first, whether the effects of noise and predictable context differ
between groups, and second, what effect frequency and phonotactic probability have on word
recognition in monolingual and bilingual speakers.
 2.4.1 The effects of noise and predictable context
Words in low noise (M = 85.5%, SD = 35.2) were recognized with higher accuracy than
words in high noise (M = 67.6%, SD = 46.8; χ²(1) = 712.4, p < .001), and words in a predictable
context (M = 88.7%, SD = 31.6) were recognized better than words in an unpredictable context
(M = 64.4%, SD = 47.9; χ²(1) = 1059.3, p < .001). The effect of predictability was 28.2% when
noise was high and 20.5% when noise was low, and this interaction was significant
(χ²(1) = 30.7, p < .001). Monolinguals recognized words more accurately (M = 80.8%, SD = 39.4)
than bilinguals (M = 71.8%, SD = 45.0; χ²(1) = 76.7, p < .001). The effect of noise was smaller
for monolinguals (16.1%) than bilinguals (19.9%), but this interaction was not significant
(χ²(1) = 3.3, p = .068, see Figure 3). The effect of predictability was larger for monolinguals
(24.8%) than bilinguals (23.8%; χ²(1) = 46.7, p < .001). The effect of predictability can best be
seen in Figure 4. When noise was low, the effect of predictability was larger for bilinguals
(22.7%) than monolinguals (18.6%), likely because monolinguals were at ceiling in the low
noise-high predictability condition (M = 98.2%, SD = 13.4%). When noise was high, on the other
hand, the effect was larger for monolinguals (31.1%) than bilinguals (24.9%), but the three-way
interaction was not significant (χ²(1) = 0.1, p = .809). Expressed as Cohen's d, the effect sizes
of group differences were as follows in the four conditions (HN/LN = high/low noise, HP/LP =
high/low predictability): HNLP: 0.16, HNHP: 0.37, LNHP: 0.25, LNLP: 0.21. The model estimates for
the SPIN test are summarized in Table 2.
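The relation between the likelihood-ratio statistics and the p-values reported above can be checked with a small sketch. It assumes one degree of freedom, in which case the upper-tail probability of the chi-square distribution can be written with the complementary error function. The function names are illustrative; because the reported statistics are rounded to one decimal, recomputed p-values only approximate the reported ones.

```python
import math

def lrt_statistic(loglik_reduced, loglik_full):
    """Likelihood-ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square distribution with 1 df.

    For 1 df, chi-square is the square of a standard normal variable,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2.0))

# Statistics taken from the text above (rounded to one decimal).
for label, stat in [("Noise x Group", 3.3), ("Group", 76.7)]:
    print(label, stat, p_value_df1(stat))
```

Any statistic above the familiar critical value of about 3.84 yields p < .05 for one degree of freedom.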
Table 2. Summary of mixed-effects regression results for variables predicting accuracy on the
Speech Perception in Noise test.

                                                                Estimate (β)
                                                          Odds ratio   Logit scale   SE
Intercept (High noise, low predictability, bilingual)     1.03          0.03         0.15
Noise (high vs. low)                                      3.50          1.25         0.09
Predictability (low vs. high)                             4.25          1.45         0.09
Group (bilingual vs. monolingual)                         1.54          0.43         0.11
Noise*Predictability                                      2.00          0.69         0.16
Predictability*Group                                      2.20          0.79         0.14
Noise*Group                                               1.20          0.18         0.13
Noise*Predictability*Group                                1.07          0.07         0.27
Frequency (scaled)                                        1.41          0.35         0.13
Frequency*Group                                           0.86         -0.15         0.05
Biphone probability (scaled)                              1.22          0.20         0.13
Biphone Prob.*Group                                       1.09          0.09         0.06

Figure 3. Results of the Speech Perception in Noise test divided by noise level and group. Error
bars show the 95% confidence interval.

Figure 4. Results of the Speech Perception in Noise test. Results are divided by condition and
language group. Error bars show the 95% confidence interval.

2.4.2 The influence of lexical and sublexical variables on word recognition
Phonotactic probability: The effect of phonotactic probability can be seen in Figure 5.
There appears to be an effect of phonotactic probability such that words with higher probability
were recognized with higher accuracy than low probability words, but when the variable was
entered into the model along with frequency, the effect was only marginally significant
(χ²(1) = 3.3, p = .068). From Figure 5 it appears that the effect was the same for both groups
and model estimates confirmed this, showing that the interaction with group was not significant
(χ²(1) = 2.3, p = .127).

Figure 5. Effect of biphone probability on Speech Perception in Noise accuracy divided by
group. Grey-shaded area shows the 95% confidence interval of the slope of the regression line.
Each point represents the mean accuracy of a certain word.
 Lexical frequency: The word frequency effect is shown in Figure 6. Recognition of high frequency words was better than recognition of low frequency words (χ²(1) = 4.6, p = .033) and the interaction between frequency and group was significant (χ²(1) = 7.9, p = .005). Table 2 shows a negative sign for the interaction between group and frequency, which indicates that the effect of frequency was smaller for monolinguals than bilinguals.

Figure 6. Effect of log10 word frequency on Speech Perception in Noise accuracy divided by group. Grey-shaded area shows the 95% confidence interval of the slope of the regression line. Each point represents the mean accuracy of a certain word.
 In order to better understand the interactions reported here, I also ran separate models for each group. As can be seen in Table 3, the effect of predictability was larger for monolinguals than bilinguals. The effect of frequency was slightly larger in the bilingual group than in the monolingual group and the effect only reached significance in the bilingual group. The opposite pattern was found for biphone probability, which was only significant in the monolingual group. Although frequency and biphone probability were not highly correlated (r = .14), the effects are likely not independent and so the fact that each variable was significant in one group but not the other may be the result of this correlation.

Table 3. Summary of the mixed-effects regression results of SPIN accuracy for monolinguals and bilinguals.

                                              Monolinguals               Bilinguals
                                              OR    LS         SE        OR    LS         SE
Intercept (High noise, low predictability)    1.6   0.48 **    0.15      1.0   0.01       0.14
Noise (high vs. low)                          4.3   1.46 ***   0.09      3.4   1.24 ***   0.09
Predictability (low vs. high)                 9.7   2.23 ***   0.11      4.2   1.43 ***   0.09
Noise*Predictability                          2.1   0.74 ***   0.22      2.1   0.73 ***   0.16
Frequency (scaled)                            1.2   0.20       0.14      1.4   0.34 **    0.13
Biphone probability (scaled)                  2.1   0.74 *     0.22      1.2   0.19       0.13
***p < .001; **p < .01; *p < .05. OR = odds ratio. LS = logit scale. SE = standard error.
 2.5 Discussion
 The first part of the results replicated previous studies that showed that the effect of a predictable context is smaller for bilinguals compared to monolinguals (Bradlow & Alexander, 2007; Mayo et al., 1997). However, contrary to some previous studies (Mayo et al., 1997), the interaction between Noise level and Group was only marginally significant. This suggests that the effect of noise did not differ reliably between the two groups.
 The second research question asked whether differences between groups observed on the SPIN could be explained by reduced language exposure of the bilinguals. The assumption was that because bilinguals speak two languages, they will hear and speak each language less frequently. As a result, phonological representations of words will be weaker or less precise, and a bilingual person, on average, will know fewer words in each language compared to an age-matched monolingual person. The prediction was that if reduced language experience is the cause of less accurate word recognition, it would affect low frequency words more than high frequency words because low frequency words will be heard disproportionately less as a result of reduced exposure (see section 1.4.3). This prediction was borne out by the results. Frequency effects were larger in bilinguals compared to monolinguals, with bilinguals performing close to monolinguals for high frequency words but less accurately for low frequency words.
 Phonotactic probability has a facilitative effect on word recognition such that words with high phonotactic probability are recognized with higher accuracy than those with low phonotactic probability and this was confirmed in the present study, although the effect was only marginally significant. An additional prediction for phonotactic probability was that bilinguals would be more negatively affected by low-phonotactic-probability words because they would be less sensitive to phonotactic patterns that are less common in the language. The prediction for the present study was based on the fact that sensitivity to phonotactic probability is a result of language experience and vocabulary knowledge (Edwards et al., 2004). Speakers extract the probabilities for phoneme sequences by generalizing across all lexical items in their mental lexicons. Because bilinguals may know fewer words, they have a smaller basis to abstract the sound patterns from and so they may be less sensitive to those sublexical units. This prediction was not confirmed in the present results. When separate models were run for each group, phonotactic probability was only significant in the monolingual group and frequency was only significant in the bilingual group. This may suggest that the monolinguals relied more on sublexical information whereas the bilinguals relied more on lexical information. However, when comparing the coefficients of these effects between groups, the effects appear to be very similar and the different significance levels may be a result of the fact that the two effects are not completely independent, that is, high frequency words also tend to have higher phonotactic probability. In addition, the interaction between phonotactic probability and group was not significant, whereas the interaction between group and frequency was highly significant. This suggests that differences between groups were mostly present in the frequency effect, and that frequency of exposure to English may be one factor affecting group differences on the SPIN.

CHAPTER 3: EXPERIMENT 2
 Results of Experiment 1 replicated previous studies and provided some evidence that bilinguals are more affected by noise than monolinguals and that they benefit less from a predictive context under adverse listening conditions (i.e., high noise). The experiment also provided some evidence for the hypothesis that the cause of differences between monolinguals and bilinguals is the quality of phonological representations. According to the phonological quality hypothesis, reduced language experience is the reason for less precise phonological representations and a generally smaller vocabulary. If this hypothesis is true, we would not only expect group differences between monolinguals and bilinguals but also individual differences within each group as a result of language experience and vocabulary knowledge. Thus, the purpose of Experiment 2 was to investigate factors that could explain individual variation in the sample. Besides language experience, the influence of individual differences in aspects of cognition was investigated, namely WM and auditory attentional control.
 3.1 Methods
 3.1.1 Participants
 The results analyzed in Experiment 2 come from the same participants as those reported in Experiment 1 (see section 2.2.1).
 3.1.2 Materials
 In Experiment 2, the results from Experiment 1 will be reanalyzed using an individual differences design instead of a group design. To assess individual differences on different dimensions, participants completed several tests that will be described in the following sections. In Chapter 5, these tests will be analyzed in more detail; here, they just serve as predictor variables for the SPIN test.
 3.1.2.1 Woodcock-Muñoz Language Survey - Revised
 The Woodcock-Muñoz Language Survey - Revised (WMLS-R; Woodcock, Muñoz Sandoval, Ruef, & Alvarado, 2005) is a norm-referenced, standardized test of English and Spanish. Both versions were normed on a large sample of speakers in the US and, in the case of the Spanish version, Latin America. The raw score on the test can be transformed into a standard score with a population mean of 100 and a standard deviation of 15 through software that is provided with the test (Schrank & Woodcock, 2005). In addition, scores can be expressed as W-scores, which are based on an equal interval scale and are therefore suitable for statistical analyses and group comparisons. Unlike standard scores, W-scores are not corrected for participant age at testing.
 The WMLS-R consists of seven tests, two of which were administered in the present study. The first one is called the Picture-Vocabulary (PV) test. Participants are shown pictures in sets of six and are asked to name them one by one as the experimenter asks them "What is this?" and points at the picture. The second administered test is called Verbal Analogies (VA). Participants are asked to solve "riddles" such as In is to out as down is to …? Scores from both tests can be combined into a single score with the provided software, which the test developers call Oral Language Ability (OL). This score correlates highly with the cluster score that is based on all tests of the WMLS-R (r = .9). The standard error of the mean for all tests is between 5.55 and 5.93 and the internal consistency reliability coefficients were around r11 = .9 (Alvarado & Woodcock, 2005).
 3.1.2.2 Test of attention in listening
 The Test of Attention in Listening (TAIL) was adapted from Zhang, Barry, Moore, and Amitay (2012). In this test, participants have to decide whether two tones were played to the same ear or different ears. What makes this test challenging is that the frequency of the two tones is sometimes the same and sometimes different. Because participants are only supposed to respond based on the location of the tones, response conflict arises on trials on which the location is different but the frequency the same or the location the same and the frequency different. The manipulation of frequency and location results in four conditions: same-frequency same-location (SFSL), same-frequency different-location (SFDL), different-frequency same-location (DFSL), and different-frequency different-location (DFDL). The original test also has a second version of the task in which frequency is the task-relevant dimension and location is the irrelevant dimension that has to be ignored. However, only the first version was used in the present study to reduce the time needed to administer the test.
 Three different measures can be derived from the TAIL: baseline RT, involuntary orientation, and conflict resolution. Baseline RT is the mean RT in the SFSL condition. In Zhang et al. (2012), baseline RT correlated with the RTs in a separate test that did not involve response conflict and therefore the authors suggested that this measure reflects information processing speed. Involuntary orientation (reported as the involuntary attention effect in Table 4) can be calculated by subtracting RTs on trials with the same frequency from those of different frequency ([DFDL+DFSL] − [SFSL+SFDL]). Conflict resolution can be calculated by subtracting the mean RTs on trials where location and frequency were both different or both the same (no response conflict) from those where only one of the two dimensions differed ([SFDL+DFSL] − [SFSL+DFDL]).
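
The three TAIL measures can be sketched in a few lines of Python. This is an illustration only, not the scoring procedure used in the study: the function name, the data layout, and the example RTs are hypothetical, and pooling trials across two conditions before averaging is an assumption. Conflict resolution is computed here as conflict trials minus no-conflict trials, following the verbal description above, so that slower responses under conflict give a positive value.

```python
from statistics import mean


def tail_measures(rts):
    """Compute TAIL measures from RTs (in ms) keyed by condition label.

    Assumption: `rts` maps "SFSL", "SFDL", "DFSL", "DFDL" to lists of RTs.
    """
    baseline = mean(rts["SFSL"])
    same_freq = mean(rts["SFSL"] + rts["SFDL"])
    diff_freq = mean(rts["DFSL"] + rts["DFDL"])
    no_conflict = mean(rts["SFSL"] + rts["DFDL"])
    conflict = mean(rts["SFDL"] + rts["DFSL"])
    return {
        "baseline_rt": baseline,
        # Cost of an irrelevant frequency change.
        "involuntary_attention": diff_freq - same_freq,
        # Cost of frequency and location giving conflicting cues.
        "conflict_resolution": conflict - no_conflict,
    }


# Hypothetical RTs for one participant (two trials per condition).
example = {
    "SFSL": [600, 620],
    "SFDL": [650, 670],
    "DFSL": [660, 680],
    "DFDL": [640, 660],
}
print(tail_measures(example))
```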
 The tones were created in Praat (Boersma & Weenink, 2014) as pure tones with a length of 100 ms. The frequency ranged between 500 and 1400 Hz in 100 Hz intervals, which resulted in ten different sound files. There were a total of 96 experimental trials, 24 trials in each condition. The experiment was programmed in E-Prime.
 3.1.2.3 Working memory
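
The stimulus and trial counts given above can be checked with a short Python sketch. It is not the E-Prime script used in the study: how frequencies and ears were assigned to individual trials is not described, so the random sampling, the ear labels, and the shuffling below are assumptions made purely for illustration.

```python
import random

TONE_FREQS_HZ = list(range(500, 1401, 100))  # 500, 600, ..., 1400 Hz
CONDITIONS = ["SFSL", "SFDL", "DFSL", "DFDL"]
TRIALS_PER_CONDITION = 24


def make_trial(condition, rng):
    """Build one two-tone trial for the given condition (hypothetical layout)."""
    f1 = rng.choice(TONE_FREQS_HZ)
    if condition.startswith("SF"):  # same frequency
        f2 = f1
    else:  # different frequency
        f2 = rng.choice([f for f in TONE_FREQS_HZ if f != f1])
    ear1 = rng.choice(["left", "right"])
    if condition.endswith("SL"):  # same location
        ear2 = ear1
    else:  # different location
        ear2 = "right" if ear1 == "left" else "left"
    return {"condition": condition, "freqs": (f1, f2), "ears": (ear1, ear2)}


def make_trial_list(seed=0):
    """Return a shuffled list with an equal number of trials per condition."""
    rng = random.Random(seed)
    trials = [make_trial(c, rng) for c in CONDITIONS
              for _ in range(TRIALS_PER_CONDITION)]
    rng.shuffle(trials)
    return trials


trials = make_trial_list()
print(len(TONE_FREQS_HZ), len(trials))
```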
 The WM test used for this study comes from the NIH Toolbox. The NIH Toolbox is a collection of different tests in the areas of cognition, emotion, motor function, and sensation. All tests are available freely and are administered online. In the WM test, participants see pictures and their labels and hear their names. The set size ranges from two to seven pictures. Pictures are either animals or food items. After each set of pictures, participants are asked to repeat what they just saw in size order from smallest to biggest. For example, if they saw a bear, a duck, and an elephant, they would say duck, bear, elephant. To establish the size order, participants have to pay attention to the size of the object on the screen but in most cases, the relative proportions on the screen corresponded to real life. The test has two parts. In the first part, sets consist only of animals or only of food items. In the second part, sets consist of animals and food and participants are asked to repeat the food first from smallest to biggest and then the animals from smallest to biggest. Both parts start with two practice sets to ensure that participants understood the directions. If they made a mistake in either practice set, the instructions were repeated and the set was administered again. After the practice items, the test starts with a set size of two. If a participant correctly repeats all pictures, the set size of the next trial increases by one. If the participant makes an error, another set of the same size but different items is administered. Testing stopped when a participant could not correctly repeat two sets in a row or when the last set was administered. Responses were recorded on a paper sheet and a score for each participant was calculated by counting the total number of items of all correctly repeated sets. Thus the maximum score for each part is 27 (2+3+4+5+6+7) and the total possible score is 54. This test was only administered in English.
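
The scoring rule can be made concrete with a small Python sketch. The function names and the example response record are hypothetical; the sketch only assumes what is stated above, namely that each correctly repeated set contributes its number of items and that set sizes run from two to seven.

```python
SET_SIZES = range(2, 8)  # set sizes 2 through 7


def max_score_per_part():
    """Maximum score for one part: one correct set at every set size."""
    return sum(SET_SIZES)


def part_score(results):
    """Score one part of the test.

    `results` is a list of (set_size, correct) tuples in administration order.
    Each correctly repeated set adds its number of items to the score.
    """
    return sum(size for size, correct in results if correct)


# Hypothetical participant: passes sizes 2 and 3, fails the first size-4 set,
# passes the second size-4 set, then fails two size-5 sets in a row (test stops).
record = [(2, True), (3, True), (4, False), (4, True), (5, False), (5, False)]

print(max_score_per_part())      # maximum for one part
print(2 * max_score_per_part())  # maximum across both parts
print(part_score(record))        # score of the hypothetical participant
```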
 Recently, the reliability of the test was established (Tulsky et al., 2014). The test-retest intraclass correlation coefficient was .77. The test also correlated with other established WM tests (r = .57) and tests of executive function (r = .43 to .58). The correlation with a test of receptive vocabulary, on the other hand, was low (r = .24). Also interesting with respect to the present study was the finding that Hispanic participants scored, on average, .41 SDs below Caucasian participants.
 3.1.2.4 Consonant perception in noise
 In the consonant perception test (CP), participants heard 16 different consonants in a /VCV/ cluster and were asked to identify them by clicking on one of 16 options on the computer screen. The consonant recordings came from Shannon, Jensvold, Padilla, Robert, and Wang (1999). The original recordings done by Shannon and colleagues included 25 consonants in three different vowel contexts /u/, /a/, and /i/ in medial /VCV/ and initial position /CV/. Following Garcia Lecumberri and Cooke (2006), stimuli were reduced to 16 consonants (/p b t d k g t! b f v s z ! b m n l r/) in only one vowel context (aCa) and one consonant position. Two male speakers (M2 and M3) were chosen from the original set of 5 male and 5 female speakers and each token was repeated four times for a total of 128 items. The experimental items were mixed with background noise (multi-talker babble) taken from the original SPIN recording. Three different sections from the babble noise track were cut and mixed at an SNR of -4 dB in Praat (Boersma & Weenink, 2014). One of those babble segments was used twice and the other two were used once each. The SNR was chosen based on a pilot study, in which participants performed at about 85% accuracy at an SNR of -2 dB. To avoid ceiling effects, the SNR was lowered to -4 dB in the present study. Participants also heard each token in silence at the beginning of the experiment so they could adapt to the pronunciation of each speaker. These trials were only used as practice trials and were not scored. When a participant made a mistake on those practice trials, the same token was repeated until the participant made a correct response.
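
Mixing speech and babble at a fixed SNR amounts to scaling the noise so that the ratio of the two RMS amplitudes matches the target value in dB. The sketch below shows this computation in plain Python rather than Praat; it is not the script used in the study. It assumes that SNR is defined over the RMS of the whole speech token and noise segment, and the function names and toy signals are hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain for the noise so that 20*log10(rms(speech)/rms(noise)) == snr_db."""
    return rms(speech) / (rms(noise) * 10 ** (snr_db / 20))


def mix_at_snr(speech, noise, snr_db):
    """Add the scaled noise to the speech, sample by sample."""
    gain = noise_gain_for_snr(speech, noise, snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy signals standing in for a /VCV/ token and a babble segment.
rng = random.Random(1)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0, 1) for _ in range(16000)]

gain = noise_gain_for_snr(speech, noise, snr_db=-4)
achieved = 20 * math.log10(rms(speech) / rms([gain * n for n in noise]))
print(round(achieved, 2))

# Item count described in the text: consonants x speakers x repetitions.
print(16 * 2 * 4)
```

A negative SNR such as -4 dB means that the babble has a higher RMS level than the speech.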
 3.1.3 Relationship between predictor variables
 The predictor variables used were oral language ability, WMC, consonant perception (henceforth CP) in noise (mean accuracy), and attention. The attention test provided different variables such as baseline RT, involuntary orientation, and conflict resolution (see 3.1.2.2). There was no clear hypothesis for which of those measures, if any, would predict accuracy on the SPIN so the analysis was exploratory and results need to be interpreted with some caution. Accuracy on the TAIL was not considered as a variable because there was very little variation in accuracy rates. The results of these tests can be seen in Table 4.

Table 4. Means and standard deviations of the predictor variables used in Experiment 2.

                                          Monolingual    Bilingual      Total sample
Oral language ability (W)                 533.2 (8.9)    515.6 (11.4)   524.8 (13.4)
Working memory                            37.6 (8.0)     32.4 (7.9)     35.2 (8.3)
Consonant perception (%)                  76.9 (5.4)     66.9 (9.1)     72.2 (8.9)
TAIL measures
  Attention baseline RT (in ms)           680 (125)      702 (139)      690 (132)
  Involuntary attention effect (in ms)    31 (40)        19 (49)        25 (44)
  Conflict resolution effect (in ms)      46 (49)        29 (43)        38 (47)
Note. Oral language ability is reported in W scores, which are on an equal interval scale with an arbitrary unit. The maximum score on the working memory test was 54. TAIL = test of attention in listening. See text for an explanation of TAIL measures.
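
To give a sense of the size of the group differences in Table 4, a standardized mean difference (Cohen's d with a pooled standard deviation) can be computed from the summary statistics. The sketch below does this for oral language ability. It assumes group sizes of 53 monolinguals and 48 bilinguals, as stated in the abstract, and works from the rounded values in the table, so it gives a value of roughly 1.7 and need not match effect sizes computed from the raw data.

```python
import math


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


# Oral language ability (W scores) from Table 4; group sizes are an assumption.
d_oral_language = cohens_d(533.2, 8.9, 53, 515.6, 11.4, 48)
print(round(d_oral_language, 2))
```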
 The influence of each variable was first investigated through simple and bi-partial correlations with accuracy on the SPIN test in each condition. Because some variables were intercorrelated (e.g., WM and verbal ability), the unique contribution of working memory was investigated by partialling out the covariance shared with oral language ability. The results are reported in Table 5. The correlation between WM and verbal ability was r(53) = .43, p = .001, in the monolingual group and r(48) = .47, p < .001, in the bilingual group. Consonant perception and verbal ability were correlated in the bilingual group only, r(48) = .55, p < .001 (monolinguals:
r(53) = .20, p = .154).

Table 5. Correlations and bivariate correlations between predictor variables and the four conditions of the Speech Perception in Noise test.

                                   LNHP                            LNLP                            HNHP                            HNLP
                                   Monolingual    Bilingual        Monolingual    Bilingual        Monolingual    Bilingual        Monolingual    Bilingual
Picture vocabulary                 .37 (<.01**)   .60 (<.01**)     .10 (.46)      .34 (.02*)       .47 (<.01**)   .62 (<.01**)     .21 (.13)      .40 (<.01**)
Verbal analogies                   .29a (.04*)    .52 (<.01**)     .18a (.20)     .37 (<.01**)     .36a (<.01**)  .48 (<.01**)     .25a (.08+)    .55 (<.01**)
Working memory                     .18 (.19)      .19 (.20)        -.14 (.33)     .22 (.12)        .33 (.02*)     .36 (.01*)       .17 (.23)      .34 (.02)
Working memory (-verbal ability)   .03 (.86)      -.14 (.33)       -.23 (.09+)    .05 (.73)        .15 (.27)      .11 (.47)        .07 (.64)      .12 (.41)
Consonant Perception (CP)          .19 (.18)      .49 (<.01**)     -.01 (.94)     .37 (.01*)       .04 (.75)      .37 (.01*)       .12 (.39)      .41 (<.01**)
CP (-verbal ability)               .12 (.38)      .22 (.13)        -.04 (.76)     .20 (.18)        -.06 (.68)     .04 (.81)        .07 (.61)      .17 (.25)
Attention baseline                 -.07 (.62)     -.20 (.17)       .02 (.87)      -.13 (.36)       -.08 (.56)     -.32 (.03*)      -.02 (.88)     -.28 (.05+)
Distraction effect                 -.14 (.30)     .01 (.93)        .03 (.85)      .18 (.22)        -.37 (<.01**)  -.12 (.40)       -.24 (.09+)    .12 (.43)
Incongruency effect                -.24 (.08+)    -.11 (.46)       -.01 (.94)     -.03 (.84)       -.14 (.33)     -.03 (.84)       .10 (.47)      -.00 (.98)
a n = 52.
Note. LNHP = low noise, high predictability; LNLP = low noise, low predictability; HNHP = high noise, high predictability; HNLP = high noise, low predictability. For each cell, the first value shows the correlation coefficient (r-value) and the value in parentheses shows the p-value. (-verbal ability) indicates that the effect of verbal ability was partialled out of the correlation.
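
Partialling a third variable out of a correlation can be illustrated with the standard first-order partial correlation formula, which needs only the three pairwise correlations. The sketch below is a generic illustration and may differ from the exact variant computed for Table 5. The WM-SPIN and WM-verbal ability values are taken from the bilingual results reported above, whereas the correlation between the verbal ability composite and HNHP accuracy (.60) is a hypothetical placeholder, because that value is not reported.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_wm_spin = 0.36      # WM and HNHP accuracy, bilingual group (Table 5)
r_wm_verbal = 0.47    # WM and verbal ability, bilingual group (text)
r_verbal_spin = 0.60  # hypothetical value, for illustration only

print(round(partial_r(r_wm_spin, r_wm_verbal, r_verbal_spin), 2))
```

The example shows how a moderate simple correlation can shrink considerably once the variance shared with a correlated predictor is removed.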
 From these correlation analyses it appears that WMC was only correlated with SUN in the HNHP condition but this effect disappeared when verbal ability was partialled out. Consonant perception was correlated with SUN in the bilingual group only and the effect disappeared when verbal ability was controlled for. Because of these high correlations between predictor variables, only verbal ability will be used as a predictor.⁶ For the different attention measures, the results are somewhat difficult to interpret. For the bilinguals, a lower baseline RT was associated with higher accuracy in the HNHP condition. For the monolinguals, on the other hand, it was a smaller distraction effect that was associated with higher accuracy in the HNHP condition. The analysis of the TAIL test showed some differences between the two groups on the test and this might be why different measures correlated with the SPIN test for the two groups (see section 5.5 for a detailed analysis of the TAIL). When the whole sample was considered, there was a small but significant correlation between baseline RT and HNHP mean accuracy (r(101) = -.21, p = .032) and so this measure will be used for further analyses. Following these preliminary analyses, a mixed-effects logistic regression model was run with verbal ability and baseline RT as additional predictor variables besides those entered in Experiment 1.
 3.2 Results
 The results of this analysis are reported in Table 6. As can be seen, those variables that interacted with Group in Experiment 1 also interacted with verbal ability measured as a continuous variable.⁷ Specifically, there was a significant main effect of verbal ability and significant interactions between verbal ability and predictability on the one hand and verbal ability and frequency on the other hand. However, when these interactions were entered into the model, group was still a significant factor as well, suggesting that not all variance could be explained by verbal ability. Because many of the predictors are on continuous scales, the results can be best interpreted by displaying them graphically. The main effect of verbal ability is shown in Figure 7.

⁶ Working memory did not predict SPIN accuracy when entered together with verbal ability and consonant perception was only significant in the main analysis but not when groups were analyzed separately.
⁷ Because biphone probability did not interact with group in Experiment 1, the interaction between this variable and verbal ability was not entered into the model.

Table 6. Results from the mixed-effects regression analysis of SPIN accuracy.

                                                       Odds-ratio   Logit scale   SE     p
Baseline (HNLP, bilingual, ND=high, PhonemePr=high)    1.10          0.09         0.14
Noise (high vs. low)                                   3.82          1.34         0.06   < .001
Predictability (low vs. high)                          6.41          1.86         0.07   < .001
Noise*Predictability                                   2.23          0.80         0.14   < .001
Group (bilingual vs. monolingual)                      1.36          0.31         0.09   < .001
Frequency (z-score)                                    1.31          0.27         0.13   .032
Phoneme Probability (z-score)                          1.26          0.23         0.13   .072
Verbal ability (z-score)                               1.20          0.17         0.05   < .001
Attention baseline RT (z-score)                        0.92         -0.08         0.03   .012
Predictability*Verbal ability                          1.46          0.38         0.07   < .001
Frequency*Verbal ability                               0.94         -0.06         0.03   .024
Noise*Predictability*Language ability                  1.21          0.19         0.13   .144
Note. Odds ratios are shown here because they are easier to interpret. For example, the odds of recognizing a word were 3.8 times as high in the low noise condition as in the high noise condition. For continuous variables, the coefficient shows the change in SPIN accuracy associated with a 1 SD increase in the predictor variable. Standard errors are on the logit scale. P-values were calculated using Type II likelihood ratio tests.
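
Because odds ratios describe odds rather than probabilities, it can help to convert the logit-scale estimates of Table 6 into predicted probabilities. The sketch below does this for the baseline condition, for low noise, and for a 1 SD increase in verbal ability. It is a simplified illustration: it uses only the fixed effects as tabled, ignores random effects and interactions, and assumes that the continuous predictors are at zero in the baseline.

```python
import math


def inv_logit(x):
    """Map a value on the logit scale to a probability."""
    return 1 / (1 + math.exp(-x))


def odds(p):
    """Odds corresponding to a probability."""
    return p / (1 - p)


# Logit-scale estimates copied from Table 6.
BASELINE = 0.09   # HNLP, bilingual
NOISE_LOW = 1.34  # noise (high vs. low)
VERBAL = 0.17     # verbal ability, per 1 SD

p_base = inv_logit(BASELINE)
p_low_noise = inv_logit(BASELINE + NOISE_LOW)
p_verbal_plus_1sd = inv_logit(BASELINE + VERBAL)

print(f"baseline: {p_base:.3f}")
print(f"low noise: {p_low_noise:.3f}")
print(f"verbal ability +1 SD: {p_verbal_plus_1sd:.3f}")
print(f"odds ratio (noise): {odds(p_low_noise) / odds(p_base):.2f}")
print(f"probability ratio (noise): {p_low_noise / p_base:.2f}")
```

The last two lines show that an odds ratio of about 3.8 corresponds to a much smaller ratio of predicted probabilities at this baseline.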

Figure 7. Relationship between oral language ability and accuracy on the SPIN test, depending on condition. HNHP = high noise-high predictability. HNLP = high noise-low predictability. LNHP = low noise-high predictability. LNLP = low noise-low predictability.
 Because one of the crucial questions was whether the relationship between verbal ability and SPIN accuracy would be present in both groups, separate models were run for bilinguals and monolinguals. The results for each group are reported in Table 7. What is striking about these results is that many of the effect sizes are very similar when language proficiency is taken into account. For example, a 1 SD increase in language ability was associated with a similar increase in accuracy for monolinguals and bilinguals. Likewise, the benefit of having a predictive context increased by roughly the same amount for a 1 SD change in language ability.
 The effect of attention was no longer significant for either group, although the effect size was the same for each group as in the previous analysis. This may indicate that the sample size was no longer large enough to detect the effect (i.e., there was too much uncertainty in the estimate).

Table 7. Results of the mixed-effect regression analysis of the SPIN test for the monolingual and bilingual group.

                                        Monolingual                  Bilingual
                                        OR     logit      SE         OR     logit      SE
Baseline (HNLP)                         1.62   0.48       0.15       1.02   0.02       0.13
Noise (high vs. low)                    4.28   1.45***    0.09       3.41   1.23***    0.09
Predictability (low vs. high)           9.92   2.29***    0.11       4.16   1.43***    0.09
Noise*Predictability                    2.32   0.84***    0.24       2.25   0.81***    0.17
Frequency (z-score)                     1.22   0.20       0.14       1.40   0.34**     0.13
Biphone Probability (z-score)           1.39   0.33*      0.14       1.21   0.19       0.13
Verbal ability (z-score)                1.15   0.14***    0.07       1.27   0.24***    0.07
Attention baseline RT (z-score)         0.93   -0.07      0.05       0.92   -0.09+     0.05
Predictability*Verbal ability           1.25   0.22**     0.10       1.14   0.13**     0.10
Frequency*Verbal ability                0.99   -0.01      0.03       0.97   -0.03      0.04
Noise*Predictability*Verbal ability     1.26   0.23       0.23       1.31   0.27       0.17
Note. See note in Table 5. ***p < .001; **p < .01; *p < .05; +p = .059. p-values based on likelihood ratio tests.
 One prediction that was not borne out by the data was that the frequency effect would be modulated by verbal ability. To further investigate this relationship, I divided the continuous frequency variable into three levels (low, mid, and high frequency), with each level containing an equal number of words. The reasoning behind this post-hoc analysis was that frequency effects may not be linear, that is, they may only be present at the low end of the scale. By dividing frequency into three levels, I can compare low frequency words to high frequency words, which may give more power to find effects. Using frequency as a factor is common in psycholinguistic studies, mainly because traditional ANOVAs do not allow continuous variables, and so the results will also be more easily comparable to other studies. The raw frequency of each level is shown in Table 8.⁸ In addition to factorizing frequency, I also split both groups into a high and low proficiency group based on a median split of their oral language ability score. As a result, the bilingual high proficiency group was not significantly different from the monolingual low group (p = .196) and so the results will show if subgroups of monolinguals and bilinguals matched on proficiency will still perform significantly differently.⁹ The mean proficiency level of each group is shown in Table 9.

Table 8. Word frequency of high, mid, and low frequency words on the SPIN test.

                         High frequency   Mid frequency   Low frequency
Log10 frequency          3.22 (0.18)      2.72 (0.13)     2.22 (0.17)
Frequency per million    35.7 (16.6)      10.83 (3.33)    3.49 (1.31)
Note. Frequencies are based on Brysbaert and New's (2009) subtitle frequencies.

Table 9. Mean language proficiency of the upper and lower half of the monolingual and bilingual group.

               High group               Low group
               OL-W        OL-SS        OL-W        OL-SS
Monolingual    541 (5.1)   112 (4.7)    527 (5.7)   100 (4.6)
Bilingual      525 (7.0)   98 (5.3)     507 (6.0)   83 (4.6)
Note. OL-W = oral language ability W-score. OL-SS = oral language ability standard score. See section 3.1.2.1 for further explanation. Standard deviations are shown in parentheses.

⁸ The results of ANOVA showed that the three frequency groups did not significantly differ in the number of neighbors (F(2, 124) < 1), biphone probability (F(2, 124) = 2.1, p = .131), or frequency-weighted neighborhood density (F(1, 124) < 1).
⁹ There are still some caveats, even when comparing participant groups matched on proficiency, because these participants still differed on other dimensions such as parental education level. Nevertheless, results from this follow-up analysis can still be suggestive.
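
The two splits used in this follow-up analysis can be sketched as follows. This is an illustrative Python sketch, not the procedure actually used: the function names and the example values are hypothetical, and assigning scores exactly at the median to the low group is an assumption, since the handling of ties is not described.

```python
from statistics import median


def tertile_labels(values):
    """Label each value 'low', 'mid', or 'high' so that the bins are equal in size."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        labels[i] = ("low", "mid", "high")[min(rank * 3 // n, 2)]
    return labels


def median_split(scores):
    """Label each score 'high' if above the median, otherwise 'low'."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical log10 word frequencies and oral language W-scores.
log_freqs = [2.1, 3.3, 2.7, 2.2, 3.1, 2.8]
w_scores = [541, 527, 533, 520]

print(tertile_labels(log_freqs))
print(median_split(w_scores))
```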
 The results of this follow-up analysis can be seen in Figure 8. The decline from high to mid frequency is smaller than the decline from mid to low frequency, and this is the pattern for all groups. Of special interest was whether the monolingual low (ML) group would be significantly different from either the monolingual high (MH) or the bilingual high (BH) group. A mixed-effects logistic regression model with frequency (low/mid/high) and group (MH/ML/BH/BL) as predictor variables (all other variables were ignored for this analysis) showed a main effect of frequency (χ²(2) = 7.7, p = .022) and a main effect of group (χ²(3) = 132.2, p < .001) but the interaction was not significant (χ²(6) = 8.0, p = .241). To further investigate these group differences, follow-up analyses were run for each frequency level with the ML group as the reference category. When frequency was high, the BH group was marginally less accurate than the ML group (b = -0.21, SE = 0.12, p = .084) and the MH group was not significantly different from the ML group (b = 0.19, SE = 0.13, p = .148). At the mid frequency level, the BH group was not significantly different from the ML group (b = -0.11, SE = 0.13, p = .343) but the MH group was more accurate than the ML group (b = 0.43, SE = 0.13, p = .001). At the lowest frequency level, the BH group was less accurate than the ML group (b = -0.42, SE = 0.10, p < .001) and there was a trend for the MH group to be more accurate than the ML group (b = 0.18, SE = 0.11, p = .095).

Figure 8. The effect of frequency shown for each of four groups. The monolingual and bilingual group were each divided into a high and low group based on a median split of their proficiency score. Whiskers show the 95% confidence interval.
 These results suggest that proficiency also had an effect in the monolingual group but effects were only statistically significant at the mid frequency level. More importantly, the subgroups of monolinguals and bilinguals that were matched on proficiency were not significantly different from each other at the mid and high frequency levels.
 3.3 Discussion
 Experiment 2 showed a large influence of individual differences on SUN. Both the main
effect of group and the group by predictability interaction reported in Exp. 1 were modulated by
language ability, as measured by the WMLS, in Experiment 2. This shows that differences
between monolinguals and bilinguals previously reported in the literature may to a large part be
attributable to differences in language experience. As described in the introduction, bilinguals
often know fewer words in each of their languages compared to someone who only speaks one
language (Bialystok & Luk, 2012; Gasquoine & Dayanira Gonzales, 2012; Portocarrero,
Burright, & Donovick, 2007). For example, Gasquoine and Dayanira Gonzales (2012) tested a
sample of 56 Mexican-American participants residing in the Rio Grande Valley region of South
Texas using the same proficiency test that was used in the current study. These participants were
more diverse than the current sample in terms of age (they ranged from 18 to 65 years) but were
similar in terms of years of formal education (mean = 13.9 years). The age-adjusted standard
score for the English picture vocabulary test was 86 in the Gasquoine and Dayanira Gonzales
study, which is coincidentally the exact same figure as in the present study.
The WMLS was also used in a study by Delgado, Guerrero, Goggin, and Ellis (1999). These authors
tested 80 Spanish-English bilingual students in Texas. The sample differed somewhat from the
present sample in that only half of the participants had received all of their formal education
in the US. In that study, the mean W-score for picture vocabulary was 508.4 compared to 516.1 in
the present study. The similarity in figures suggests that the bilingual participants in the
present study were likely representative of the larger Spanish-English bilingual population in
the US with a similar educational background. Given these differences in language proficiency
compared to monolinguals, it is not surprising that bilinguals often perform less well on verbal
tests. The monolinguals in the present study performed 1 SD higher (see footnote 10; d = 1.78),
which is a large effect. The present analysis showed that when group differences in language
ability were controlled for, the difference in performance on the SPIN became much smaller. In
addition, the
relationship between accuracy on the SPIN and language ability was similar in both groups. As
suggested by a previous study (Tamati et al., 2013), greater word knowledge is positively
associated with better listening in noise ability in monolingual speakers. The present study
confirms this result by using a standardized test of proficiency rather than self-ratings as in
the Tamati et al. study.
 10 That is, 1 standard deviation in the population sample, which is 15 for the WMLS.
 In addition to the main effect of language ability, an interaction was also found between
this variable and predictability. Participants with greater verbal ability were better able to
make use of a predictive context (see Figure 7). Again, this was true for bilinguals and
monolinguals, showing that the previously reported monolingual advantage (Bradlow & Alexander,
2007; Shi, 2010) may be better described as a general advantage associated with verbal ability.
 So far, it seems that differences between groups can be best explained by differences in
verbal ability. However, verbal ability by itself is not an explanatory factor but rather an
observational factor. To test the hypothesis that individual differences in SUN are attributable
to differences in language exposure, frequency effects were investigated. In the main analysis,
frequency interacted with verbal ability as was predicted. However, in follow-up analyses of
each group, the interaction was not found. This may be because of the more restricted range in
language proficiency in each group but it may also suggest that the interaction was only
significant because of group differences in verbal ability and so group status may be the actual
cause of this interaction. In addition, the main effect of frequency was not significant in the
monolingual group as in the previous group analysis reported in Experiment 1. However, in a
follow-up analysis with frequency as a factor with three levels, frequency effects also became
apparent in the monolingual group. A wider range of word frequencies may be necessary,
though, to find a more robust effect. The follow-up analysis may also provide some insight into
the finding that group differences were still significant in the main analysis after controlling
for proficiency and frequency effects.
For the two subgroups of monolinguals and bilinguals that were matched on proficiency,
differences in accuracy were small or nonexistent when frequency was in the medium to high range
but became apparent when frequency was low. This finding suggests that even when monolinguals
and bilinguals are matched on proficiency, a bilingual person may still have encountered low
frequency words disproportionately less often compared to a monolingual person of the same
overall language proficiency.
Interesting in this respect is also the observation that the marginally significant
noise-by-group interaction found in Experiment 1 seemed to have mostly been driven by the low
frequency words as the following figure suggests (Figure 9), although the three-way interaction
between noise, group, and frequency was not significant. This again shows the nonlinear nature
of frequency effects and it suggests that the weakest lexical representations are those that are
the most affected by noise.
 Figure 9. Mean accuracy on the SPIN test for each group (bilingual/monolingual) separated by
noise level (high/low) and target word frequency (low/mid/high). The figure shows that in the
bilingual group, the effect of noise was largest when frequency was low.
 CHAPTER 4: GENERAL DISCUSSION
 To summarize the main results again, Experiment 1 replicated earlier findings showing
that the bilingual group performed below the monolingual group in all four conditions. The
differences were especially large when noise and predictability were both high, replicating
previous studies that found that bilinguals did not benefit as much from a predictive context as
monolinguals (e.g., Bradlow & Alexander, 2007; Mayo et al., 1997; Shi, 2010). The two-way
interaction between frequency and group suggests that bilinguals' word recognition in noise may
be especially affected when to-be-recognized words are of low frequency. These results differ
from those of Imai et al. (2005), who also tested monolingual and bilingual participants on word
recognition with low-level background noise (SNR = 12 dB) but did not find this interaction.
Instead, they showed that the effect of neighborhood density was larger for the bilingual group
compared to the monolingual group. The lack of an interaction in their study may have resulted
from the corpus (Kucera & Francis, 1967) that their frequency counts were based on. As
Brysbaert and New (2009) show, subtitle frequencies better reflect actual word frequencies,
especially since the Kucera and Francis corpus is quite old and based on text word frequencies,
something also noted by Imai et al. (2005).
Bradlow and Pisoni (1999) also investigated the effects of lexical variables on native and
nonnative word recognition in noise. These authors divided words into easy (high frequency, low
neighborhood density) and hard (low frequency, high neighborhood density) words. They found
that native and nonnative speakers of English recognized easy words better than hard words when
there was single or multi-talker babble in the background. However, the effect was much larger
for the nonnative speakers compared to the native speakers. Although the lexical variables
investigated in the present study and the other two studies were not the same, the present
results are nevertheless in line with their results. Both studies suggest that nonnative
speakers were more affected by these lexical variables than native speakers.
 Previous SUN studies differ in whether the group-by-noise interaction is significant or
not. Shi (2010) found a significant interaction whereas Rogers et al. (2006) did not. Shi (2010)
compared a group of eight native bilingual speakers who learned English and another language
before the age of two and another group of eight bilingual speakers who had learned English
between the ages of five and seven to a group of eight monolingual speakers (Shi also included
two groups of late learners of English). At SNR +6 dB, the simultaneous bilingual group was not
significantly different from the monolinguals but at SNR 0 dB the groups were different,
suggesting an interaction between group and noise. The early bilingual group was different from
the monolinguals at both SNRs (although this difference was not significant when correcting for
multiple comparisons, of which there were twenty). The results reported in Shi (2010) suggest
that AoA is an important factor that predicts SUN. However, these results do not allow any
conclusion about the origin of AoA effects.
The present results shed some light on factors influencing the interaction between group
and noise level as well as main effects of group. Because the bilingual participants in the
present study were quite homogeneous in terms of AoA, the present results suggest that it is the
amount of exposure to the tested language that determines SUN. Amount of exposure is closely
related to AoA, of course, but what is striking about the present results is that the same
relationship between proficiency and SUN was also found in the monolingual group who had
acquired English from birth. As would be expected when amount of exposure is the determining
factor, differences between groups were largest for low frequency words.
 Reanalyzing the data from Experiment 1 with an individual differences design rather than
a group design further confirmed the hypothesis that amount of exposure to English is the main
contributing factor to SUN. In Experiment 2, the effect of individual differences in different
domains (linguistic vs. nonlinguistic) was explored. The largest difference between groups was
in language proficiency and this variable also emerged as the strongest mediating variable
between groups. In other words, differences between groups became smaller once language
proficiency was taken into account. Language proficiency not only mediated overall accuracy but
also interacted with predictability. Both findings can be explained with the ELU.
To reiterate, the basic assumption of the ELU (Rönnberg et al., 2013) is that speech information
is bound into a phonological representation in an episodic buffer. This information, referred to
as RAMBPHO (rapidly, automatically, and multimodally bound phonological representation), is
assumed to operate at the syllable level and is matched to semantic representations in LTM (cf.
Giraud & Poeppel, 2012a). Listeners are assumed to constantly form predictions about upcoming
acoustic information based on preceding suprasegmental, segmental, and semantic information
(Pickering & Garrod, 2007). For example, when the context of an utterance is predictive of a
certain word, this word may receive activation even before it is mentioned (Altmann & Kamide,
1999) and lexical access may happen even when the acoustic information is heavily degraded.
Also, listeners have been shown to use distal prosodic information and context speech rate to
make predictions about upcoming word boundaries to segment the speech stream (Brown,
Salverda, Dilley, & Tanenhaus, 2011; Dilley & McAuley, 2008; Dilley & Pitt, 2010). When the
speech signal is optimal, this process is effortless and proceeds rapidly. However, when
mismatches between RAMBPHO and phonological LTM representations occur, lexical access is
delayed and the feed-forward cycle is interrupted (Rönnberg et al., 2013, p. 3). Such mismatches
can occur because of a poor speech signal or poorly specified phonological representations. In
such cases, the assumption of the ELU model is that those mismatches have to be resolved
through explicit processes that operate on a larger time scale. According to the ELU, this is
when individual differences in WMC will become visible. Those individuals with a larger WMC are
assumed to have more processing resources available to make, for example, predictions based on
the preceding context.
 What is obvious from this discussion of the ELU is the great emphasis on WMC to explain
individual differences in speech understanding in noise. In light of the present results, these
assumptions may have to be modified to some extent. As predicted by the ELU, verbal WM was
associated with better HIN. However, WM was correlated with verbal ability and when both
variables were entered into a regression model to predict SPIN accuracy, only verbal ability was
significant. The strong correlation between WM and verbal ability may be surprising because the
WM test used very common objects, namely animals and food items that participants can be
expected to be very familiar with. As laid out in the introduction, verbal WM is not independent
of LTM representations of words (Baddeley, 2012, p. 20; also see MacDonald & Christiansen,
2002). For example, studies have shown that high frequency words can be better remembered
than low frequency words (e.g., Hulme et al., 1991), suggesting that stronger LTM
representations may facilitate encoding and rehearsal of those words. The correlation between
verbal ability and WM may have the same explanation as the frequency effect. For individuals
with overall less precise lexical representations, all words may behave like low frequency
words, that is, their representations are underspecified. Another aspect of the WM test used in
the present study is that participants heard semantically related words (i.e., animals and
foods) and thus had to inhibit previously activated words that were not relevant in the current
set. For example, a participant may replace elephant with bear because a bear was the largest
animal in the previous set. Individuals with a larger vocabulary may be better able to inhibit
previously activated words and thus prevent interference.
 Given these interactions between verbal WM and verbal ability, these two variables may
not be easily separated. The fact that WM was no longer a significant predictor when entered
together with verbal ability does not necessarily mean that WM does not play a role in speech
understanding in noise. However, individual differences in WMC may play a smaller role than
individual differences in verbal ability. Such individual differences in verbal ability may
influence speech understanding in noise in different ways. For an individual with more language
experience, words in the mental lexicon may be better integrated semantically because they will
have experienced words in more diverse contexts (Bolger, Balass, Landen, & Perfetti, 2008). For
example, word collocations will be better entrenched because they are experienced more often
and thus co-occurrences of words may be better predicted. In the sentence The ship sailed along
the coast (taken from the SPIN test), both ship and sail may trigger associations with coast but
for someone who has not experienced those words together much, the association with coast may
only weakly exist and thus they would not predict coast and would have to rely more on the
acoustic signal (i.e., bottom-up information). ERP studies of the N400 effect, an
electrophysiological response that indicates semantic integration of words into the preceding
context, have shown that the effect is modulated by vocabulary knowledge in monolingual and
bilingual speakers (Moreno & Kutas, 2005; Newman, Tremblay, Nichols, Neville, & Ullman,
2012). These studies suggest that individuals with a larger vocabulary are better able to form
predictions during listening.
 Besides these semantic contributions to speech understanding, higher verbal ability may
also help listeners to segment the speech stream into word units. Mattys et al. (2005) found
that listeners rely largely on lexical information for word segmentation and only revert to
sublexical information such as word stress when the signal is heavily degraded (at SNRs of
-5 dB). Thus stronger lexical knowledge may help listeners segment the speech stream more
accurately and they may recover from false segmentations more rapidly.
 Coming back to the ELU, the results of the present study suggest that individual
differences in phonological representations in LTM may be more indicative of SUN difficulties
than individual differences in WMC, at least in a sample of healthy young adults. It may be that
in older people, individual differences in WMC become more important. Especially since
vocabulary knowledge typically increases with age (and then decreases in old age; Kavé, Knafo,
& Gilboa, 2010), it cannot be responsible for the common observation that SUN ability decreases
as a function of age. The fact that the present study investigated a sample of healthy young
adults may also explain why attentional control, measured by the TAIL, only had a small effect
on recognition accuracy. A tentative interpretation of the TAIL effect is that individuals with
better attentional control are better able to attend to the relevant speaker and ignore the
background babble. Thus the temporal separation of the target and distractor signal may rely on
attentional processes. Testing a wider range of age groups may reveal whether attentional
control will correlate more highly with SUN in a younger or older sample.
Future studies should also administer more than one test of attention and executive function to
determine whether overall processing speed is more indicative of SUN or rather a specific
component of executive function, that is, inhibition, updating, or shifting (Miyake & Friedman,
2012). Another way forward may be to manipulate attention load during SUN instead of employing
a correlational design to be better able to establish causal relationships (cf. Mattys & Wiget,
2011). In any case, researchers need to make sure that the concurrent task or the task to be
used as a predictor variable is not dependent on verbal ability to avoid confounds. For example,
Sommers and Danielson (1999) used a linguistic Stroop test to predict SUN performance. In this
test, participants heard the words mother, father, and person spoken by a man or a woman.
Inhibition was necessary when there was incongruence between the sex of the speaker and the
gender of the spoken word, for example, when the word mother was spoken by a male speaker. An
inhibition index was calculated by subtracting RTs in the incongruent condition from RTs in the
neutral condition and this measure correlated with SUN for hard words (hard words were defined
by the authors as low frequency and high neighborhood density words). Because a Stroop test
using linguistic stimuli may not be independent of verbal ability, using a nonlinguistic
auditory test of attention may be better.
 CHAPTER 5: ANALYSIS OF INDIVIDUAL TESTS
 In the previous section, the results from some of the administered tests were used as
predictor variables in a regression analysis. The purpose of this section is to describe those tests 
plus one additional test, the Words in Noise (WIN) test, in more detail.
 5.1 Words in Noise
 Experiments 1 and 2 were designed to answer the question of why monolinguals and early
bilinguals are differentially affected by noise. However, the SUN test used was not administered
in the way it is commonly administered to assess a hearing deficit (e.g., the noise levels were
different than on the original test as described in Bilger et al., 1984). Therefore, a
standardized hearing in noise test, the WIN, was also administered to all participants. This was
done to investigate whether bilinguals may be wrongly diagnosed with a hearing deficit based on
their bilingual status. The first research question I will answer is whether monolinguals and
bilinguals are differentially affected by noise. We may expect results on the WIN to be
different than those obtained with the SPIN because of the different make-ups of the two tests.
On the SPIN, the onset of target words is unpredictable because the preceding context is
different for each sentence. On the WIN, on the other hand, target words are always preceded by
the same carrier phrase, which is "Say the word". In addition to making the target word onset
predictable, the WIN places a lower processing load on participants compared to SPIN sentences
for which the context is predictive of the target word. As was shown in Experiment 2,
recognition of words in a predictable context is especially dependent on individual differences
in verbal ability. When testing bilingual speakers for a hearing deficit, it may therefore be
advisable to use a test that is not strongly correlated with verbal ability.
 5.1.1 Methods
 5.1.1.1 Participants
 The same participants were tested as described above. One participant from the bilingual
group was excluded from this analysis because the test could not be administered due to
technical difficulties, which reduced the bilingual sample to 47.
 5.1.1.2 Materials
 The WIN was developed by Wilson and colleagues (Wilson, Abrams, & Pillion, 2003) and was also
administered through the NIH Toolbox. The NIH Toolbox is a collection of different tests in the
areas of cognition, emotion, motor function, and sensation. All tests are available freely and
are administered online. The test consists of two lists of 35 words each. Each list is divided
into groups of five words that are played back with background babble (multiple speakers) at the
same SNR. Participants hear a woman asking them to repeat words, for example, "Say the word
dog". The sound intensity of the background babble is fixed and the woman's voice becomes
increasingly softer, starting at an SNR of 24 dB and decreasing to 0 dB in 4 dB decrements.
Administration stops when none of the five items at a particular SNR can be correctly repeated
by a participant or when the end of the list is reached. Each list is administered monaurally to
one ear only, with ear of testing being counterbalanced. For example, one participant will hear
List 1 presented to the right ear and List 2 to the left ear and another participant will hear
List 2 presented to the right ear and List 1 to the left. The score for the test is derived from
the inflection point of the psychometric function (which describes the relationship between
accuracy and SNR) to determine at which SNR a participant recognized 50% of the words. This test
was administered in English and Spanish. The Spanish version was administered at the end of the
session after all English tests were done.
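 The administration rule for a single list can be made concrete with a minimal sketch. It only
restates the procedure described in this section (seven SNR levels from 24 dB to 0 dB in 4 dB
steps, five words per level, and a stop when no word at a level is repeated correctly). It is not
the NIH Toolbox implementation; the function name, the callback interface, and the simulated
listener are hypothetical.

```python
# SNR levels of one WIN list, from easiest (24 dB) to hardest (0 dB).
SNR_LEVELS_DB = list(range(24, -1, -4))
WORDS_PER_LEVEL = 5


def administer_list(is_correct):
    """Run one list and return {snr_db: number of words repeated correctly}.

    `is_correct(snr_db, item_index)` is a hypothetical callback that reports
    whether the listener repeated the given item correctly. Presentation stops
    after the first SNR level at which none of the five words was correct.
    """
    results = {}
    for snr_db in SNR_LEVELS_DB:
        n_correct = sum(
            1 for item in range(WORDS_PER_LEVEL) if is_correct(snr_db, item)
        )
        results[snr_db] = n_correct
        if n_correct == 0:
            break  # stopping rule: no item correct at this SNR
    return results


# Simulated listener who repeats every word down to 8 dB and none below that.
simulated = administer_list(lambda snr_db, item: snr_db >= 8)
# The 0 dB level is never presented because the list stops after 4 dB.
```

 Seven levels of five words each account for the 35 words per list mentioned above.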
 5.1.2 Results
 5.1.2.1 English Words in Noise Test
 A logistic mixed-effects regression model with a probit link function and with Accuracy
as the outcome variable and the main effects of Group and SNR and their interaction was run
including random intercepts for words and subjects and random slopes for SNR within subjects.
The descriptive statistics are shown in Table 10 (see footnote 11) and mean accuracy of all
items can be found in the Appendix. Because participants were at ceiling at SNR 24 dB to 12 dB,
the regression model was only fit to the data between 12 dB and 0 dB. The results showed a main
effect of SNR (χ²(1) = 90.1, p < .001). The effect of group was not significant (χ²(1) = 2.8,
p = .093), nor was the interaction between SNR and group (χ²(1) = 2.4, p = .121; see footnote
12).
 Table 10. Mean accuracy on the Words in Noise test.
 Group        24 dB         20 dB         16 dB         12 dB         8 dB          4 dB          0 dB
 Monolingual  100% (0.0)    100% (0.0)    99.2% (9.1)   100% (0.0)    74.3% (43.7)  53.0% (50.0)  20.6% (40.5)
 Bilingual    99.4% (8.0)   100% (0.0)    96.9% (17.3)  98.9% (10.3)  71.6% (45.2)  49.0% (50.0)  21.7% (41.3)
 Note. Column headings give the SNR; cells show M (SD). SNR = signal-to-noise ratio.
 The results of the regression model are shown in Figure 10 along with the observed
values. The predicted values that are derived from the model estimates overestimate accuracy at
SNR 8 dB and underestimate accuracy at SNR 4 dB. However, the actual fitted values that take
into consideration subject and item variance are quite close to the observed values, suggesting
that the model describes the data well.
 11 An examination of individual items showed two outliers and these were excluded from the
descriptive statistics: the word time at SNR 16 and shawl at SNR 20. See Table 19 for mean
accuracy of all items.
 12 When the model was fit to the whole data set, the effect of group was significant (χ²(1) =
5.4, p = .020) as was the interaction between group and SNR (χ²(1) = 13.8, p < .001). However,
it seems that this interaction was attributable to the lower performance of the bilingual group
at SNR 16. At SNR 12, group differences were not significant and so the differences at SNR 16
are likely attributable to specific items that caused difficulty (also see Appendix 2).
 Figure 10. Results of the English WIN test. Solid lines show the predicted values based on
coefficients of the regression model described in the text. Dashed lines show the fitted values
of this model. Whiskers show the 95% confidence interval.
 Another way to look at the data is to extract the SNR at which a participant achieved 50%
accuracy. This can be done by running a logistic regression model for each participant. Using the
estimated intercept and slope, we can calculate the SNR50. The formula for this is
 \[ \log\frac{y}{1 - y} = \beta_0 + \beta_1 x \]
where x is SNR and y is accuracy (as a proportion) at this particular SNR. Solving the equation
for y = .5 gives
 \[ x = -\frac{\beta_0}{\beta_1} \]
The regression coefficients can also be used to calculate the inclination of the slope at the SNR
needed to achieve 50% accuracy. The slope of a logistic regression model is nonlinear because
the model tends to 0 and 1 at the extreme ends. The slope is the steepest at the central point
where Pr(x) = .5, which is SNR50. The formula for this function is
 \[ \text{slope} = \frac{\beta_1 e^{\beta_0 + \beta_1 x}}{\left(1 + e^{\beta_0 + \beta_1 x}\right)^2} \]
We already established that $\beta_0 + \beta_1 x$ equals 0 at the point of 50% accuracy and so
the equation becomes
 \[ \text{slope} = \frac{\beta_1 e^{0}}{\left(1 + e^{0}\right)^2} = \frac{\beta_1}{4} \]
Thus we can simply divide the coefficient of the slope of the logistic regression by 4 to
obtain the inclination of the slope at SNR50, that is, the % change in accuracy for a change in 1
dB (for further explanation see Gelman & Hill, 2007, p. 82).
 Using these formulae, the SNR50 for monolinguals is 3.66 dB and for bilinguals it is
3.93 dB. The slope at the inflection point is 8.00%/dB for monolinguals and 7.39%/dB for
bilinguals.
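 The per-participant calculation can be sketched as follows. This is an illustration under
stated assumptions rather than the analysis code of the present study: the Newton-Raphson fitting
routine, the function names, and the response counts of the example participant are hypothetical,
and only the two closing formulas (SNR50 as the negative intercept divided by the slope
coefficient, and the slope coefficient divided by 4) come from the text.

```python
import math


def fit_logistic(snrs, n_correct, n_trials, max_iter=50, tol=1e-10):
    """Fit logit(p) = b0 + b1 * snr by maximum likelihood (Newton-Raphson)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(snrs, n_correct, n_trials):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)      # binomial variance weight
            g0 += k - n * p            # gradient with respect to b0
            g1 += (k - n * p) * x      # gradient with respect to b1
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step for b0
        d1 = (h00 * g1 - h01 * g0) / det   # Newton step for b1
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def snr50(b0, b1):
    """SNR (dB) at which predicted accuracy is 50%."""
    return -b0 / b1


def slope_at_snr50(b1):
    """Change in proportion correct per dB at the 50% point."""
    return b1 / 4.0


# Hypothetical participant: number of words correct out of 10 at each SNR.
snrs = [12, 8, 4, 0]
correct = [10, 7, 5, 2]
trials = [10, 10, 10, 10]
b0, b1 = fit_logistic(snrs, correct, trials)
print(f"SNR50 = {snr50(b0, b1):.2f} dB, slope = {100 * slope_at_snr50(b1):.2f} %/dB")
```

 As a consistency check on the formulas, a slope coefficient of 0.32 corresponds to 8.00%/dB, and
together with an SNR50 of 3.66 dB it implies an intercept of about -1.17. These coefficients are
back-calculated here from the reported monolingual values and are not themselves reported.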
 5.1.2.2 Spanish Words in Noise Test
 The bilingual participants also completed the Spanish version of the WIN, the S-WIN. Here I
will compare performance on one versus the other test. As in the analysis of the E-WIN, a
logistic mixed-effects regression model with a probit link function was fit to the data. The
model included random intercepts for subjects and items, and the main effects of SNR and
language (English/Spanish) and the interaction between the two variables were entered as fixed
effects.
 The main effect of SNR was significant (χ²(1) = 220.8, p < .001). Neither the main effect of
language (χ²(1) = 1.4, p = .235) nor the interaction between language and SNR (χ²(1) = 0.1,
p = .727) was significant. The SNR50 on the S-WIN was 4.9 dB and the slope was 7.8%/dB. Given
the nonsignificant results for language and the language-by-SNR interaction, the SNRs50 and
slopes did not differ significantly between the two languages.
 Figure 11. Results of the English and Spanish versions of the WIN test (bilingual participants
only). Solid lines show the predicted values based on coefficients of the regression model
described in the text. Dashed lines show the fitted values of this model. Whiskers show the 95%
confidence interval.
 5.1.2.3 Individual differences analysis
 As in Experiment 2, the effect of individual differences was investigated. For this
purpose, the variables oral language ability and Baseline RT (cf. 3.1.2.2) were entered as
continuous predictor variables into a regression model with the E-WIN as the outcome variable.
Group was not entered as a predictor since it was not significant in the previous analysis. The
results showed a main effect of language ability (χ²(1) = 5.8, p = .016) and a marginally
significant effect of Baseline RT (χ²(1) = 3.3, p = .068). The interaction between Baseline RT
and SNR was also marginally significant (χ²(1) = 2.8, p = .094). To interpret these results, we
can again calculate the SNR50 and the slope based on the model coefficients. For an individual 1
SD below the mean on Baseline RT, the predicted SNR50 is 3.58 dB and for an individual 1 SD
above the mean, the predicted SNR50 is 3.99 dB. Thus, faster processing speed (i.e., a lower
baseline RT) was associated with a lower SNR50. In addition, Figure 12 suggests that the effect
of Baseline RT was largest at the lowest SNR. For language ability, the predicted SNRs50 for
individuals 1 SD below and above the mean were 4.02 dB and 3.58 dB, respectively. Figure 13
suggests that this effect was most apparent at SNR 4 dB.
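 How a predicted SNR50 can be obtained for an individual 1 SD below or above the mean of a
predictor is sketched below. The sketch assumes a linear predictor of the form b0 + b_snr * SNR +
b_z * z + b_int * SNR * z, where z is the standardized predictor. Predicted accuracy is 50% where
this linear predictor equals zero, which holds for both a logit and a probit link. The function
names and all coefficient values are hypothetical and are not estimates from the present study.

```python
import math


def snr50_at(z, b0, b_snr, b_z, b_int=0.0):
    """SNR (dB) at which the linear predictor is zero for standardized score z."""
    return -(b0 + b_z * z) / (b_snr + b_int * z)


def slope_at_snr50(z, b_snr, b_int=0.0, link="logit"):
    """Change in proportion correct per dB at the 50% point.

    The density of the link distribution at zero is 1/4 for the logistic
    and 1/sqrt(2*pi) for the standard normal (probit).
    """
    if link == "logit":
        scale = 0.25
    elif link == "probit":
        scale = 1.0 / math.sqrt(2.0 * math.pi)
    else:
        raise ValueError("link must be 'logit' or 'probit'")
    return (b_snr + b_int * z) * scale


# Hypothetical coefficients in which a higher score raises accuracy (b_z > 0).
coef = dict(b0=-1.2, b_snr=0.3, b_z=0.1)
low = snr50_at(-1.0, **coef)    # individual 1 SD below the mean
high = snr50_at(+1.0, **coef)   # individual 1 SD above the mean
print(f"SNR50 at -1 SD: {low:.2f} dB, at +1 SD: {high:.2f} dB")
```

 With a positive coefficient for the predictor, the SNR50 is lower for the individual above the
mean, which is the pattern reported above for language ability; for Baseline RT the sign is
reversed because a higher RT means slower processing.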
 Figure 12. Effect of Baseline RT on WIN accuracy at each SNR. SNR = signal-to-noise ratio.
Baseline RT is the mean response time on the Test of Attention in Listening (see text for further
explanation).
 Figure 13. Effect of oral language ability on WIN accuracy at each SNR. SNR = signal-to-noise
ratio. W-scores are arbitrary units with equal interval spacing.
 Next, the effects of Spanish language ability and Baseline RT were investigated for the
Spanish version. Results showed a significant main effect of language ability (χ²(1) = 5.4,
p = .021) and SNR (χ²(1) = 84.9, p < .001). All other main effects and interactions were not
significant (ps > .150). As in the English version, higher language ability was associated with
a lower SNR50. The predicted SNRs50 were 4.98 dB and 4.78 dB for individuals 1 SD below and
above the mean on the language test, respectively.
 5.1.3 Discussion
 Are monolinguals and bilinguals differentially affected by noise?
 Returning to the research question of whether background babble at different SNRs
differentially affected monolinguals and bilinguals, the data suggest that both groups performed
very similarly. The descriptive statistics showed that bilinguals were slightly less accurate
but the psychometric functions fit to the data showed that the SNR50 and the slope at the
inflection point were very similar for both groups.
 One concern with this test when interpreting the results is that test administration
happened in a quiet but not sound-insulated room. Also, the test was administered via the
internet (following the NIH Toolbox protocol) and no audiometer was used to adjust the sound
pressure level (SPL). However, when comparing the present results to published results, they
seem quite similar. Wilson, McArdle, and Smith (2007) compared normal hearing (NH) listeners
and listeners with hearing loss (HL) on the WIN and other tests. The authors calculated the
SNR50 and the slope of the psychometric function. In their study, the SNR50 for NH was 4.1 dB
compared to 3.66 dB in the present study (monolinguals), which is quite similar. Performance of
both groups using the 50% accuracy criterion was also within one standard deviation of the NIH
Toolbox norming study (NIH Toolbox Technical Manual, p. 25; available through NIHtoolbox.org),
for which the values were M = 4.79, SD = 4.07. The slopes for the monolinguals appear to be
steeper in the present study (8%/dB) compared to 6.3%/dB in Wilson, McArdle, et al. (2007) but
are similar to the 8.4%/dB reported in Wilson, Carnell, and Cleghorn (2007). The NIH manual for
the WIN does not report mean values of the slopes of the psychometric function for the norming
population.
 Using the criterion of the NIH manual, 91% of participants scored within the range for
NH (SNR50 <= 6 dB) and 9% within the range of mild hearing loss (SNR50 < 8 dB). In Wilson,
McArdle, et al. (2007), the WIN was the best out of four SUN tests to distinguish listeners with
HL from normal hearing listeners. Only 1% of the listeners with HL performed within the 95%
CI of the normal hearing listeners and there were marked differences between groups at each
SNR of the WIN. In the present study, pure-tone thresholds to measure hearing loss were not
obtained from participants but participants rated their hearing as good (8.6 out of 10 on
average). Two participants rated their hearing as 6 but those two participants did not perform
outside the range of the remaining participants. Furthermore, all participants were young adults
and performance was similar to the study by Wilson, McArdle, et al. (2007; see above).
 How does performance on the WIN compare to the results reported in Experiment 1?
 In the HNLP condition (SNR = -2 dB), bilinguals achieved around 50% accuracy, whereas the SNR 
on the WIN for 50% accuracy was around 4 dB. This could be because of differences in the 
speaker voice, differences in the babble noise used, and differences in the target words. At the 
same time, differences between groups seem to be much more pronounced on the SPIN. On the 
WIN, both groups performed very similarly at each SNR but on the SPIN, group differences were 
significant in each condition, although effect sizes of group differences were small. The different 
performance of the two groups relative to each other may be explained by different task demands. 
On the WIN, words are always presented with the same carrier phrase. This makes the onset of 
the target word predictable and thus puts low demands on word segmentation ability. On the 
SPIN, on the other hand, target word onset is not predictable and so listeners may be more 
affected by missegmentations. In addition, listeners also have to pay attention to sentence context 
if they want to exploit it to predict the target word and this places higher attentional demands on 
the listener. Because listening may be generally more demanding for bilingual speakers 
(Schmidtke, 2014), noise may disproportionately increase attentional demands. This may also 
explain why bilinguals did not benefit as much from a predictive context as monolinguals. 
 How does performance in one language relate to performance in the other language?
 The results for the SNR50 for the Spanish version (M = 4.9 dB) are similar to those obtained from 
the norming sample (M = 5.53 dB, SD = 1.36; NIH toolbox Technical Manual, p. 25). The mean 
SNR50 of the bilinguals reported in Carlo (2008) is somewhat higher at 6.2 dB (SD = 1.3). In 
Carlo (2008), the mean slopes of the psychometric functions were steeper in the Spanish version 
than the English version. In the present study, performance on the E-WIN was not significantly 
different from performance on the S-WIN, either in the SNR50 or in the slope of the 
psychometric function. This suggests that, at the group level, test language did not have an effect on 
hearing in noise ability. However, for the individual it may have an effect depending on the 
proficiency in English and Spanish, as I will discuss in the next section.
 Individual differences predicting WIN accuracy: As in the analysis of the SPIN in section 
0, I also investigated whether individual differences in verbal ability and processing speed 
(Baseline RT on the TAIL test) would be associated with accuracy on the WIN test. 
The WIN test is supposed to place minimal attentional and memory demands on the listener in 
order to measure hearing ability and not some other skill. As was shown in the analysis of the SPIN, 
individuals with greater verbal ability can potentially compensate for their hearing loss by being 
less dependent on the bottom-up signal. Other tests such as the QuickSIN (Killion, Niquette, 
Gudmundsen, Revit, & Banerjee, 2004) have participants repeat a whole sentence and they 
receive one point for each of five keywords that they repeat per sentence. Thus participants with 
better STM may score higher because they are better able to remember the keywords. 
The present results suggest that even though the WIN test reduces the possibility to compensate for 
hearing loss by employing higher order cognitive skills, the test may still be sensitive to these 
individual differences. However, it should be noted that these effect sizes were small and they 
may have greater theoretical than practical implications. For example, the fact that processing 
speed predicted WIN accuracy suggests that this may be one reason for greater hearing difficulty 
in older people. Further confirming the conclusion that individual differences in linguistic 
abilities did not play a big role on the WIN was the finding that monolinguals and bilinguals 
performed very similarly. This suggests that the WIN may be a good test to use with nonnative 
speakers of English. When testing Spanish-English bilingual speakers, it may be best to test them 
in their stronger language because both English proficiency and Spanish proficiency were 
associated with higher accuracy on each respective test.
 5.2 Verbal ability
 The results from Experiment 2 in section 3.2 and the results reported in the previous 
section have shown that verbal ability is associated with higher accuracy on SUN tests. In this 
section, I investigate which biographical variables predict verbal ability in bilingual 
speakers.
 Many studies have found that vocabulary knowledge in bilinguals is lower than in age-matched 
monolinguals (Bialystok, Luk, Peets, & Yang, 2009; Bialystok & Luk, 2012; 
Portocarrero et al., 2007). The purpose of the present study was to find variables that would 
predict individual differences in verbal ability between monolingual and bilingual speakers. 
Previous studies found that exposure to each language is a good predictor of language 
development in children (Hammer et al., 2012; Hurtado et al., 2013; Place & Hoff, 2011). 
However, few studies have systematically investigated vocabulary knowledge in young adult 
bilinguals. Because the participants did all their schooling in the US, it was possible that by the 
time they entered college, they had caught up with their monolingual peers. 
The present study shows that this was not the case and therefore it may be beneficial to identify 
variables that predict proficiency in the dominant language. In addition, the present study 
contributes to the literature on heritage language maintenance (Peyton, Ranard, & McGinnis, 2001) 
by testing participants not only in their dominant language but also in their home language to 
investigate how different variables differentially affect proficiency in English and Spanish.
 The predictor variables for verbal ability came from the background questionnaire that 
was administered to all participants. Participants were asked to estimate what percentage of the 
time they were exposed to English and Spanish growing up and the number of people who 
interacted with them during childhood and adolescence in each language on a regular basis 
(regular was defined as at least once in two weeks). Participants were given five different age 
periods (age 0-2; 3-5; elementary school; middle school; high school). 
The variable number of speakers was included based on a recent study that suggested that the 
number of speakers an individual interacted with predicted language proficiency above and beyond 
frequency of use (Gollan et al., 2014). Gollan et al. asked participants to estimate the number of 
speakers and percentage of use of the heritage language from birth through high school. In the 
present study, participants were asked to give more nuanced answers according to the five 
age-related categories mentioned above to see how the relative use of English and Spanish changed 
from birth to high school. In addition, participants were asked to estimate their current use of 
English and Spanish in three areas: speaking, listening, and reading. 
 A second purpose was to investigate the relationship between vocabulary knowledge and 
verbal reasoning. Most studies on bilingualism only include a test of vocabulary knowledge. 
However, bilinguals often know a word in one language but not the other because they do not 
use each language in the same contexts. For example, many of the participants in the present 
study reported speaking Spanish at home but English in most other situations. Therefore, 
vocabulary knowledge in one language likely underestimates the total number of words that a 
bilingual speaker knows and thus vocabulary knowledge may not be a good indicator of general 
verbal ability. For example, a W-score of 500 is the average score that a 10-year-old is expected 
to achieve. In the present study, some bilingual participants performed below 500, yet they were 
studying at a major US university. This suggests that the true verbal ability of an individual 
scoring around 500 is most likely higher. 
Using the same tests of verbal ability as in the present study, I found in a previous study that 
monolinguals and early bilinguals did not perform significantly differently on the verbal 
analogies test but bilinguals gave significantly fewer correct responses on the vocabulary test 
(Schmidtke, 2014). Thus the prediction follows that the bilinguals' score on the PV test is 
significantly lower, relative to the monolingual group, than would be expected based on their VA 
score.
 5.2.1 Materials
 See section 3.1.2.1 for a description of the Woodcock-Muñoz Language Survey-Revised 
(WMLS-R).
 5.2.2 Procedure
 Following the standard procedures (Alvarado & Woodcock, 2005), both tests started 
from an age-appropriate page. If a participant did not give six correct answers, the test was 
administered in backward order until the participant could correctly answer all six items from a 
set or until the first item was administered. Once the basal score was established, testing resumed 
from the first administered page. Testing stopped when a participant could not correctly name 
any item from a set of six. 
 5.2.3 Results
 Woodcock-Muñoz Language Survey-Revised English: For group comparisons, it is most 
appropriate to use the age-corrected standard scores, which are normed on a large sample with a 
population mean of 100 and a standard deviation of 15. However, for subsequent statistical 
analyses it will be more appropriate to use the W-scores, which are not age-corrected, because 
for many research questions absolute vocabulary knowledge is of greater importance than 
relative vocabulary knowledge in comparison to peers of the same age. The mean 
Picture Vocabulary (PV) standard score for monolinguals was 101 (SD = 7.6), which is right at the 
population mean, and that for bilinguals was 86 (SD = 8.4), which is almost 1 
standard deviation below the population mean. This difference was significant, 
t(99) = 9.05, p < .001, d = 1.80. The Verbal Analogies (VA) scores were also significantly 
different between groups, t(98) = 6.90, p < .001, d = 1.38. Monolinguals scored above the 
population mean (M = 109, SD = 7.3) and bilinguals just below the mean (M = 98, SD = 9.0). 
The difference in the composite score was also significant, t(98) = 8.85, p < .001, d = 1.77, 
with monolinguals scoring higher (M = 105, SD = 7.7) than bilinguals (M = 90, SD = 8.8).
 WMLS-R Spanish: The standard scores (SS) on the Spanish version 
(bilinguals only) were 77 (SD = 7.9), 90 (SD = 10.8), and 81 (SD = 9.3) for PV, VA, and oral 
language (OL), respectively. For all three measures, participants performed on average better on 
the English version than the Spanish version (ts > 4.88, ps < .001, ds > 0.77), 
showing that as a group, they were dominant in English.
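 To make the distance from the norm explicit, the standard scores above can be expressed in 
standard deviation units of the norming population (mean 100, SD 15). This is a minimal sketch; 
the helper name is hypothetical, and the scores are the group means reported in this section.

```python
NORM_MEAN = 100.0  # population mean of the standard-score scale
NORM_SD = 15.0     # population standard deviation of the scale


def standard_score_to_z(standard_score):
    """Distance of a standard score from the population mean, in SD units."""
    return (standard_score - NORM_MEAN) / NORM_SD


# Group means for Picture Vocabulary as reported in the text.
group_means = {
    "PV English, monolinguals": 101,
    "PV English, bilinguals": 86,
    "PV Spanish, bilinguals": 77,
}

for label, score in group_means.items():
    print(f"{label}: z = {standard_score_to_z(score):+.2f}")
```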
 What is the effect of socio-economic status on verbal ability?
 Monolinguals and bilinguals differed in terms of socio-economic status (SES), measured by 
mother's education. Therefore, it was of interest to determine the influence of SES on oral 
language ability. For this purpose, the education levels college, some grad school, and grad 
school were combined into one category, college+. The other categories were less than high 
school, high school, and some college. When both groups were considered in one analysis, 
including group as a factor, mother's education was not a significant factor. However, when each 
group was considered on its own, mother's education was a significant predictor of verbal ability 
for monolinguals (b = 3.6, SE = 1.5, p = .017, R² = .11) but not bilinguals. When examining the 
distribution of mother's education level for the bilinguals, 86% of participants reported that 
their mother's education level was less than high school or high school. Therefore, there may not 
have been enough variance in the bilingual speakers' SES distribution to find a significant 
effect. It would likely be necessary to test participants from a wider range of SES levels or to 
employ a more fine-grained measure of SES to determine how much of the variance can be 
attributed to language group and how much to SES. Spanish verbal ability was not associated 
with SES, either (r(48) = -.05, p = .716).
 Factors explaining Spanish and English proficiency in bilinguals: Based on previous 
research, the differences in English proficiency between monolinguals and bilinguals were 
expected. A more interesting question is therefore what factors may predict proficiency in 
English and Spanish in the bilingual group.
 The predictors for Spanish proficiency were the number of people who spoke Spanish 
with the participants and the percentage of Spanish exposure at the five life stages described 
above (age 0-2; 3-5; elementary school; middle school; high school). Participants also estimated 
their parents' use of English and Spanish (in %). These estimates were significantly correlated 
with percentage of Spanish exposure at age 0-2 and 3-5. The means and standard deviations are 
shown in Table 11.

 Table 11. Mean number of Spanish speakers and percent exposure to Spanish

 Stage               Number of speakers    Percent exposure
                     of Spanish, M (SD)    to Spanish, M (SD)
 0-2 years           5.8 (3.9)             91.4% (18.3)
 3-5 years           6.6 (5.1)             76.3% (22.2)
 Elementary school   9.2 (6.7)             45.5% (14.9)
 Middle school       9.3 (6.7)             35.0% (13.9)
 High school         9.8 (6.8)             33.9% (16.9)
 Mean                8.1 (4.4)             56.6% (11.5)

 Note. Participants were asked to estimate how many people they interacted with in Spanish 
regularly and the percentage of time they were exposed to Spanish at each of the five stages in 
life listed in the Stage column.
 Initial correlation analyses with the outcome variable Oral Language Ability Spanish 
(standard score) and the predictor variables showed that for percentage exposure to Spanish, the 
correlation was only significant at age 3-5 (r(48) = .48, p < .001) and for the mean percentage 
exposure (r(48) = .36, p = .011). For the number-of-Spanish-speakers variable, the correlation 
was only significant for age 0-2 (r(48) = .30, p = .040) and 3-5 (r(48) = .32, p = .028). The 
mother's use of English in the home while growing up was negatively but not significantly 
related to the participant's Spanish proficiency (r(48) = -.18, p = .231). The correlation with the 
father's use of English, on the other hand, was significant (r(46) = -.34, p = .021; two 
participants did not provide an estimate for their father's use of English). A regression 
model using the number of speakers and percent exposure to Spanish at age 3-5 and the father's 
use of English as predictor variables explained 29% of the variance (adjusted R²) in Spanish 
proficiency. In order to show how much additional variance was explained by each variable after 
accounting for variance explained by the other two variables, stepwise regressions were carried 
out. Number of Spanish speakers explained 4% additional variance, and percent exposure to 
Spanish and father's use of English explained 10% each. Adding age of acquisition of English to 
the model did not increase the explained variance. Age of arrival in the US, on the other hand, 
had a positive effect and the adjusted R² increased to 40%. However, only 11 out of 48 bilingual 
participants were not born in the US, which may make this variable unreliable.
 Next, I look at the influence of different variables on English and Spanish proficiency 
simultaneously. For example, more use of Spanish may be associated with greater Spanish 
proficiency and lower English proficiency. For this analysis, the data were arranged in the long 
format with language (English vs. Spanish) as a predictor variable so that each participant 
contributed two observations. In addition to the variables reported above, participants were also 
asked to estimate the current relative time (in percent) spent listening, speaking, and reading in 
English and Spanish, respectively. An average was taken for this variable, referred to here as 
current English use (current English and Spanish use always added up to 100% for each 
participant as they were not exposed to other languages). The outcome variable in this analysis 
was the picture vocabulary standard score instead of the oral language score. The reason for this 
is that oral language is a composite score of VA and PV but PV is more strongly associated with 
language exposure. VA scores in English and Spanish were correlated, which suggests that verbal 
reasoning skills transfer from one language to the other and so VA may be less associated with 
exposure to each language (see below). All model coefficients show the change in the outcome 
variable (PV on the standard score scale) associated with a 1 SD increase in the predictor variable.
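 The data arrangement described above can be sketched as follows. The records, field names, 
and helper functions are hypothetical and serve only to illustrate the long format (two 
observations per participant) and the scaling of predictors to SD units; they are not the 
author's actual data or analysis script.

```python
# Hypothetical wide-format records: one row per participant.
wide = [
    {"id": 1, "pv_english": 92, "pv_spanish": 78, "current_english_use": 80.0},
    {"id": 2, "pv_english": 84, "pv_spanish": 85, "current_english_use": 55.0},
    {"id": 3, "pv_english": 88, "pv_spanish": 70, "current_english_use": 90.0},
]


def to_long(rows):
    """Return two observations per participant, one per test language."""
    long_rows = []
    for row in rows:
        for language in ("English", "Spanish"):
            long_rows.append({
                "id": row["id"],
                "language": language,
                "pv": row[f"pv_{language.lower()}"],
                "current_english_use": row["current_english_use"],
            })
    return long_rows


def standardize(values):
    """z-score a predictor so that a coefficient refers to a 1 SD change."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


long_data = to_long(wide)
z_use = standardize([row["current_english_use"] for row in wide])
print(len(long_data), "observations from", len(wide), "participants")
```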
 Language of test alone explained 25.5% of the variance in the scores on the English and 
Spanish versions, with participants scoring on average 9.6 points lower on the Spanish version 
compared to the English version. More current English use was marginally associated with a 
higher English score (b = 2.1, SE = 1.1, p = .061) and significantly associated with a lower 
Spanish score (b = -3.7, SE = 1.5, p = .020). More Spanish exposure from birth through high 
school was associated with lower English proficiency (b = -3.3, SE = 1.1, p = .003) and higher 
Spanish proficiency (b = 5.7, SE = 1.5, p < .001). Together, these variables explained 
37.6% of the variance. 
 The next question was what the relationship was between language dominance and PV 
scores. Language dominance was calculated by subtracting the Spanish picture vocabulary score 
from the English score. Are more balanced bilinguals less proficient in each of their languages 
compared to the stronger language of less balanced bilinguals? The mean language dominance 
score was 9.6 (SD = 11.7), showing that, on average, bilingual participants were English dominant. 
A regression model of picture vocabulary standard scores with Language dominance and Language 
as predictors explained 63% of the variance (adjusted R²; see Table 12).

 Table 12. Results of the regression analysis predicting picture vocabulary scores

 Variable name                          Beta    SE     p
 Intercept                              81.1    1.1
 Language dominance (LD)                 0.5    0.1    < .001
 Test Language (baseline = English)      0.0    1.5    1.000
 LD*Test Language                       -1.0    0.1    < .001

 Note. Language dominance was calculated by subtracting Spanish scores from English scores. 
Thus a positive score means English dominance. Test language was a factor with two levels, 
English and Spanish.
 Because 0 is the score for a perfectly balanced bilingual (an individual who obtained the 
same score on the English and Spanish version of the test), the intercept of the model shows that 
the mean PV score in both languages for a balanced bilingual was 81. Every one-point increase in 
English dominance was associated with a half-point increase on the English version and a 
half-point decrease on the Spanish version of the test (the interaction coefficient of -1.0 is the 
difference between the two slopes). In other words, participants with higher English 
scores tended to have lower Spanish scores (see Figure 14). As might be expected when testing 
bilingual participants who live in a predominantly English environment, there were no 
participants with very strong dominance in Spanish; 8 out of 48 participants were dominant in 
Spanish and 3 participants were balanced.
 Figure 14. Relationship between language dominance and proficiency in English and Spanish. 
Language dominance was calculated by subtracting Spanish scores from English scores. Thus a 
positive score means English dominance and a negative score means Spanish dominance.
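 The regression in Table 12 can be read as two lines, one per test language. The following 
sketch simply plugs the tabled (rounded) coefficients into the model equation; the function and 
constant names are hypothetical. Under these coefficients the implied slope for the Spanish 
version is 0.5 + (-1.0) = -0.5.

```python
# Coefficients as reported in Table 12 (rounded to one decimal in the source).
INTERCEPT = 81.1       # predicted PV score for a perfectly balanced bilingual
B_LD = 0.5             # slope of language dominance (LD) on the English test
B_SPANISH = 0.0        # shift for the Spanish test when LD = 0
B_LD_X_SPANISH = -1.0  # change in the LD slope on the Spanish test


def predicted_pv(language_dominance, test_language):
    """Predicted PV standard score from the Table 12 coefficients."""
    spanish = 1 if test_language == "Spanish" else 0
    return (INTERCEPT
            + B_LD * language_dominance
            + B_SPANISH * spanish
            + B_LD_X_SPANISH * language_dominance * spanish)


for ld in (0, 10, 20):
    english = predicted_pv(ld, "English")
    spanish = predicted_pv(ld, "Spanish")
    print(f"LD = {ld:>2}: English = {english:.1f}, Spanish = {spanish:.1f}")
```

 Because language dominance is defined as the English score minus the Spanish score, the 
difference between the two predicted scores equals the dominance value that was entered.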
 As is evident from Figure 14, there was great variance in the data with some participants 
being fairly balanced and others being clearly dominant in English. To see if some of this 
variance could be explained by biographical variables related to exposure to English and 
Spanish, further analyses were run. For these analyses, the bilingual sample was split into 
balanced bilinguals and English-dominant bilinguals. This split was done on the median, which 
was 10 (i.e., the English standard score was higher than the Spanish score by ten points). This 
resulted in 23 balanced and 25 unbalanced bilinguals. 
 First, it was investigated whether the two groups differed in their use of Spanish from 
birth through high school. Because all participants started out with more or less the same amount 
of exposure to Spanish, the question was whether the decline in Spanish exposure was faster for 
individuals who were later to become English dominant compared to those who remained more 
balanced. For this, a regression analysis was run with Percent exposure to Spanish as outcome 
variable and Age and Language dominance (balanced/unbalanced) as predictor variables. 
Because of the nonlinear decline in Spanish exposure as a function of age, age squared and 
age cubed were also entered. For this analysis, age was treated as a continuous variable although 
it technically was a factor with five levels. The results showed that age, age squared, and age 
cubed were significant predictors (linear term: b = -322.0, SE = 24.3, p < .001; 
quadratic term: b = 78.9, SE = 24.3, p = .001; cubic term: b = 60.5, SE = 24.3, p = .013). 
English-dominant bilinguals' exposure to Spanish was, on average, about 10 percentage points 
lower than that of balanced bilinguals (b = -10.6, SE = 2.2, p < .001), but Language dominance 
did not interact with any of the polynomial terms (|ts| < 1.2, ps > .243), suggesting that the 
difference between groups remained constant. However, Figure 15 suggests a trend for a steeper 
decline in Spanish exposure in the English-dominant group. Whereas the two groups did not 
differ significantly until age 5, they started to differ from elementary school onwards (see 
Table 13). Table 13 also shows that the effect size increases as a function of age from a small 
effect in infancy to a large effect in middle school and high school. Other biographical 
variables shown in Table 13 confirm the same trend, although few of the other variables reach 
statistical significance. The table shows that parents of English-dominant participants tended to 
use more English and were more proficient in English when participants were growing up compared 
to the parents of balanced bilinguals. Furthermore, English-dominant participants interacted with 
more English speakers during childhood. Balanced bilinguals tended to have participated more in 
transitional or bilingual programs when entering school compared to English-dominant bilinguals, 
suggesting that these programs aided Spanish language maintenance. A correlation analysis showed 
that hours (square-rooted to account for outliers) in Spanish immersion programs were positively 
correlated with oral language ability in Spanish (r(48) = .30, p = .038) but not with English 
oral language ability (r(48) = -.14, p = .326).

 Figure 15. Relationship between percent of exposure to Spanish and age in the bilingual sample. 
Participants were divided into a balanced and an unbalanced group based on the difference 
between their Spanish and English score on the WMLS (see text). 
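 The text does not say how the linear, quadratic, and cubic age terms were coded. The 
identical standard errors of the three terms would be consistent with orthogonal polynomial 
coding, so the following is a sketch under that assumption only. It builds orthonormal 
polynomial columns for five equally spaced age stages by Gram-Schmidt orthogonalization; the 
function name and the stage coding 1 to 5 are hypothetical.

```python
import math


def orthogonal_polynomials(levels, degree):
    """Orthonormal polynomial columns (degree 1..degree) via Gram-Schmidt."""
    n = len(levels)
    mean = sum(levels) / n
    centered = [x - mean for x in levels]
    basis = [[1.0 / math.sqrt(n)] * n]  # normalized constant column
    columns = []
    for d in range(1, degree + 1):
        v = [c ** d for c in centered]
        # Remove the parts of v that lie along the lower-order columns.
        for b in basis:
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        v = [vi / norm for vi in v]
        basis.append(v)
        columns.append(v)
    return columns


# Five age stages coded 1..5 (0-2, 3-5, elementary, middle, high school).
stages = [1, 2, 3, 4, 5]
linear, quadratic, cubic = orthogonal_polynomials(stages, 3)
for name, col in (("linear", linear), ("quadratic", quadratic), ("cubic", cubic)):
    print(name, [round(x, 3) for x in col])
```

 With columns of this kind, each term is uncorrelated with the others, which makes it possible 
to judge the linear, quadratic, and cubic trends separately.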
 Importantly, the two groups did not differ in Age of Acquisition and mother's education 
level, suggesting that these variables did not determine language dominance. Interestingly, 
though, groups differed in years of musical training. This may suggest greater integration of the 
English-dominant bilinguals' families into the dominant culture, but participants were not asked 
in what language they had received musical training so this explanation is only speculative. The 
difference may also be indicative of differences between parent characteristics. For example, one 
study found that length of musical training in 7-9 year olds was correlated with parental income 
(Corrigall & Schellenberg, 2015). If this was true in the present study, it may also indicate 
greater integration into the dominant culture.
 Table 13. Differences in background variables between balanced and unbalanced bilingual 
participants.

 Variable name                            Balanced        English dominant   t-value    d
 Percent exposure Spanish
   0-2 years                              94.1% (15.0)    88.8% (20.9)        1.0        0.29
   3-5 years                              82.2% (20.9)    70.9% (22.3)        1.8+       0.52
   Elementary school                      51.7% (13.0)    41.6% (15.2)        2.5*       0.71
   Middle school                          41.7% (11.3)    28.8% (13.3)        3.6***     1.05
   High school                            40.9% (13.4)    27.4% (17.4)        3.0**      0.86
 Mother's use English                      6.7% (21.4)     8.2% (16.3)        0.3        0.08
 Mother's proficiency English (1-10)      2.5 (2.4)       3.5 (2.5)           1.4        0.41
 Father's use English                      4.6% (12.0)    14.6% (24.7)        1.7        0.51
 Father's proficiency English (1-10)      3.6 (2.9)       5.1 (3.1)           1.6        0.51
 Number English speakers
   0-2 years                              0.4 (1.2)       1.2 (2.1)           1.5        0.43
   3-5 years                              1.0 (1.4)       4.5 (5.8)           2.9**      0.82
 Number Spanish speakers
   0-2 years                              6.6 (4.8)       5.1 (2.8)          -1.3       -0.37
   3-5 years                              7.5 (6.8)       5.8 (2.7)          -1.2       -0.34
 Spanish immersion program (sqrt hours)   28.2 (26.3)     15.0 (23.4)        -1.8+      -0.53
 Mother's education level                 1.8 (0.9)       1.8 (0.8)           0.2        0.07
 Years of musical training                0.3 (0.8)       1.6 (2.8)           2.0*       0.58
 Age of English acquisition               4.7 (2.5)       4.1 (2.5)          -0.8       -0.23

 Note. ***p < .001; **p < .01; *p < .05; +p < .1. See text for an explanation of variables. 
Spanish immersion program: Participants were asked how many hours per week of Spanish instruction 
they had received in bilingual and transitional programs. These were added up to the total 
number of hours, which were subsequently square-rooted to achieve a normal distribution.
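 The effect sizes in Table 13 can be approximately recovered from the tabled means and 
standard deviations. The sketch below assumes a pooled-SD Cohen's d, a t-test with equal 
variances, and group sizes of 23 (balanced) and 25 (English dominant) as reported in the text. 
The source does not state which formulas were used, and small deviations are expected because 
the tabled values are rounded.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def t_from_d(d, n1, n2):
    """Equal-variance two-sample t statistic implied by Cohen's d."""
    return d / math.sqrt(1.0 / n1 + 1.0 / n2)


# Percent exposure to Spanish in middle school (Table 13):
# balanced = 41.7 (13 .. see table), English dominant = 28.8.
d = cohens_d(41.7, 11.3, 23, 28.8, 13.3, 25)
t = t_from_d(d, 23, 25)
print(f"d = {d:.2f}, t = {t:.1f}")
```

 For the middle-school row, this gives values close to the tabled d = 1.05 and t = 3.6.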
 What is the relationship between picture vocabulary and verbal reasoning?
 The above analyses showed that picture vocabulary in English and Spanish was associated with 
relative exposure to English and Spanish. On the other hand, verbal reasoning, measured by the 
verbal analogies subtest of the WMLS, involves higher order thinking skills, which may develop 
independently of relative language exposure. Several observations support this assumption. 
Scores on the English VA version were correlated with scores on the Spanish VA version (r(48) 
= .42, p = .003). The PV scores, on the other hand, were not correlated between the two languages 
(r(48) = -.03, p = .822). Also, current relative exposure and past relative exposure only explained 
15.5% of the variance on the VA test compared to 37.6% on the PV test (see previous analysis). 
Therefore, verbal reasoning may provide a better indication of a bilingual participant's actual 
verbal ability than PV. To test this hypothesis, the relationship between VA and PV was 
compared between the monolingual and bilingual participants. 
 The results of the regression analysis show that a bilingual matched with a monolingual 
participant on their verbal analogies score would, on average, perform 7.8 points lower on the 
picture vocabulary test compared to the monolingual participant (b = 7.83, SE = 1.59, p < .001). 
This relationship can best be seen in Figure 16.

 Figure 16. Relationship between the picture vocabulary and the verbal analogies subtests of the 
WMLS. Compared to the monolingual participants, bilinguals performed lower on the picture 
vocabulary test than would be expected from the verbal analogies score.
 5.2.4 Discussion
 The results reported here showed that monolinguals scored higher on both measures of 
the WMLS-R, PV and VA, compared to the bilinguals. The effect sizes were large, which may 
be surprising given that all participants were enrolled at a university and were matched on level 
of education. However, there were significant between-group differences in mother's education 
level, which is a commonly used indicator of SES. The SES of the bilinguals was significantly 
lower than that of the monolinguals. SES has been shown to be associated with vocabulary 
knowledge (e.g., Farkas & Beron, 2004) and the link between SES and vocabulary knowledge is 
believed to be reflected in the way mothers from different SES backgrounds interact with their 
children (Hoff, 2003). SES cannot explain all differences between groups, though, because 
participants spoke mostly Spanish at home and learned English at school or kindergarten. This may 
explain why SES was not a significant predictor of English language ability in the bilinguals. 
However, because SES was not associated with Spanish language ability, either, a more likely 
explanation is that the variance in the data did not permit finding an association, with only 7 
mothers having received any schooling beyond high school. A more nuanced measurement of SES may 
be necessary to find the association that is usually very robust. For example, information about 
the parents' occupation and annual income may be collected in addition to education level. 
Finding a greater range of SES, however, will likely remain difficult in the current population 
because many Spanish-English bilingual speakers come from immigrant backgrounds and their parents 
are more likely to have received limited education. For example, Capps et al. (2005) report that 
in the year 2000 in the US, 32% of children of immigrants had parents with no high school degree 
compared to 9% of children of natives (parents born in the US). This shows that the distribution 
of SES in the present study is not uncommon in this population.
 Despite the differences in mother's education level between the two groups, SES 
is unlikely to be the only explanation for the observed differences. The regression analyses showed 
that proficiency in English and Spanish was closely related to the amount of language exposure 
in each language. Language exposure, or amount of parental verbal input directed to the child, is 
a significant predictor of vocabulary growth in children who grow up monolingual (e.g., 
Huttenlocher & Haight, 1991; Weisleder & Fernald, 2013) and so it is reasonable to assume that 
the same holds true for bilingual children (Hoff et al., 2012; Hurtado et al., 2013). And because 
bilingual children are exposed to two languages, they hear each language less often compared to 
a monolingual child with the same overall amount of language input. Recent evidence suggests 
that it is not only the amount of language exposure but also the number of speakers a child 
interacts with that predicts language proficiency (Gollan et al., 2014). In the present study there 
was also some evidence for this relationship. The number of speakers a participant regularly 
interacted with at age 3-5 explained variance above and beyond his or her relative exposure to 
Spanish. The variance explained by this variable was 4%, which is less than in Gollan et al., who 
reported that frequency of exposure explained 26% and number of speakers an additional 10%. 
A difference between the Gollan et al. study and the present study is that in the present study, 
the mean number of people a participant interacted with from birth through high school was not 
a significant predictor but only the number of speakers in childhood (Gollan et al. only asked 
participants to estimate the number of speakers they regularly spoke to from birth through high 
school). One reason for this difference in findings may be that some participants in the present 
study overestimated the number of people they regularly spoke to. Several participants indicated 
20 or more once they entered school, which may not be realistic. Another possibility is that the 
number of speakers a person interacts with in childhood is more important than later in life. But 
because of the retrospective nature of the data, more evidence would be needed to confirm this 
hypothesis. The present results also do not preclude the conclusion that more speakers just 
equaled more input. For example, a child that grows up with two parents and older siblings 
may receive more input than a child growing up with a single parent. However, Gollan et al. 
conducted a more controlled experiment with children in which they carefully counted the 
number of hours of exposure in the heritage language (Hebrew) and the number of speakers 
through parental report. In their experiment, the number of speakers was still a significant 
variable (also see Place & Hoff, 2011), suggesting independent contributions from amount of 
input and the number of interactions with different speakers.
 The effect of number of speakers fits well with the broader hypothesis of this dissertation
that differences between monolinguals and bilinguals on verbal tasks result from differences in
the precision of phonological representations. Frequency of exposure strengthens phonological
representations. This is why pictures with high frequency labels are named with greater accuracy
than those with low frequency labels (Gollan et al., 2008). Hearing input from more diverse
speakers may help children learning a language to form more exact representations of phoneme
categories. For example, Maye, Werker, and Gerken (2002) found that infants are sensitive to the
statistical distribution of phoneme exemplars. Hearing input from a greater variety of speakers
will provide more evidence about what the mean and the allowable variance of a phoneme
category are (Rost & McMurray, 2009). A different view holds that listeners store exemplars of
words every time they encounter a word; phoneme categories emerge from the accumulated
evidence of stored exemplars (Pierrehumbert, 2003). A finding from the infant literature is that
infants at 14 months of age confuse similar-sounding words such as bih and dih on a word
learning task (Stager & Werker, 1997; Werker, Fennell, Corcoran, & Stager, 2002; but see
Yoshida, Fennell, Swingley, & Werker, 2009). Rost and McMurray (2009) replicated the finding
of Werker et al. (2002) in their Experiment 1 with the words /buk/ and /puk/, showing that
14-month-olds failed to discriminate between the two words. However, in Experiment 2 they used
the same task with the same words but recorded tokens from 18 different speakers. This time
infants were able to distinguish the two words. When measuring VOT of /b/ and /p/ across all
exemplars, the authors found considerable variation among speakers and this may have provided
infants with information about the category boundary. In contrast, when infants receive input
from only one speaker, they may be less confident that /b/ and /p/ are two different phonemes as
opposed to two exemplars of the same category. Thus receiving input from multiple speakers may
lead to more precise phonological representations of words.
 In addition to the infant literature, there is evidence from adult vocabulary acquisition
studies suggesting that speaker variability aids in learning new words. Sommers and Barcroft
(2011) present evidence for the representation quality hypothesis. This hypothesis states that
acoustic variability is beneficial for learning new words because it leads to a more distributed
mental representation of the new word. As in a previous study (Barcroft & Sommers, 2005),
words were learned with greater accuracy when they were presented by six speakers as opposed
to one speaker. In addition, Sommers and Barcroft (2011) found that recognition of words
learned from multiple speakers was more robust under adverse listening conditions. These
findings suggest that phonological representations of newly learned words became more precise
through greater talker variability.
 The results also showed that oral language ability was associated with frequency of
exposure to each language. Frequency of exposure may act in two ways. Because many words
are tied to specific circumstances, bilingual participants may encounter those words in only one
of their languages. For example, many bilingual participants were not able to name a picture of a
high chair in English. Because most participants only spoke Spanish at home, they may have
never heard the word in English. Consistent with this explanation is the finding that while
bilingual children know fewer words in each of their languages compared to monolingual
children, the total number of words they know is equal to monolingual children (Hoff et al.,
2012). Another explanation may be that participants had heard the word for high chair before but
they had not encountered the word a sufficient number of times to be able to recall it. This
explanation is consistent with the observed bilingual disadvantage in tip-of-the-tongue (TOT)
states (Gollan & Acenas, 2004; Gollan & Silverberg, 2001). Gollan and colleagues have shown
that bilinguals suffer more TOTs compared to monolinguals. Because TOTs are more common
for low frequency words than high frequency words, Gollan and colleagues suggest that the
reason for the bilingual disadvantage in lexical retrieval is a frequency effect; that is, all words in
each language are less frequent because they are encountered less frequently by someone who
speaks two languages (see section 1.4.3). Also consistent with this explanation is the finding that
the gap between receptive and productive vocabulary is larger in bilinguals compared to
monolinguals (Gibson, Oller, Jarmulowicz, & Ethington, 2012; Gibson, Peña, & Bedore, 2014).
Knowledge of a word may be sufficiently precise to recognize a word and match it with a picture
but not precise enough to produce it when presented with a picture.
 With regard to language dominance, an interesting picture emerged. Language dominance
was correlated with language proficiency so that more English dominant participants were more
proficient in English and less proficient in Spanish compared to less English dominant
participants (see Figure 14). In fact, only four participants scored within 1 SD of the mean of the
normative sample of both the English and Spanish versions of the test. This suggests a trade-off
between English and Spanish proficiency. Because proficiency in English and Spanish was
closely related to exposure to each language, it may be difficult for bilinguals to achieve and
maintain high proficiency in two languages. The results also suggest that language dominance in
young adulthood can be predicted relatively early in life. Already in elementary school, balanced
and English dominant participants differed in English exposure by 10 percentage points. With
the caveat that all biographical data were based on retrospective self-report, the results suggest
that increased exposure to the heritage language through immersion programs may be effective
for heritage language maintenance but children may also need increased support in the L2 to not
fall behind in their language development. At the same time, it may be unrealistic to expect
bilinguals to perform equivalently to monolinguals on language tests when language
maintenance is the goal of a bilingual speaker.
 Lastly, one interesting finding was that verbal reasoning in English and Spanish was
correlated while picture vocabulary was not. In addition, picture vocabulary was more strongly
associated with language exposure. This suggests that verbal reasoning skills transfer from one
language to the other. Furthermore, when compared to monolingual speakers, bilinguals
performed lower on the picture vocabulary test than would be expected based on their verbal
analogies score. These findings have important practical implications for bilingual language
assessment in schools. Because bilingual children usually have less language exposure to each of
their languages and thus perform less well on verbal tests, they are more likely to be diagnosed
with having a language disorder (Paradis, Genesee, & Crago, 2011). Testing them with a verbal
analogies test may therefore be a better indicator of actual language development that is
independent of amount of exposure in each language (although the total amount of language
exposure and the quality of interactions remain important, of course).
 5.3 Working memory
 Previous studies found that verbal working memory (VWM) may be reduced in bilinguals as a
function of language proficiency (Delcenserie & Genesee, 2013; Gutiérrez-Clellen, Calderón, &
Ellis Weismer, 2004; Luo et al., 2013; Ratiu & Azuma, 2015). As discussed in Chapter 2, the
connection between VWM and language proficiency may be the quality of phonological
representations in LTM. For example, high frequency words are remembered better on STM tests
than low frequency words (e.g., Hulme et al., 1991). In the same way, more proficient speakers
may have overall stronger phonological representations. As a result, they may have to devote
fewer attentional resources to retrieving and maintaining those representations on a WM test and
can thus devote more resources to the processing part of the WM task.
 5.3.1 Materials and procedure
 The Working Memory test used for this study comes from the NIH Toolbox. 
Like the WIN, it was administered over the internet. In the WM test, participants see pictures and
their labels and hear their names (in English). The set size ranges from two to seven pictures.
Pictures are either animals or food items. After each set of pictures, participants are asked to
repeat what they just saw in size order from smallest to biggest. For example, if they saw a bear,
a duck, and an elephant, they would say duck, bear, elephant. To establish the size order,
participants have to pay attention to the size of the object on the screen but in most cases, the
relative proportions on the screen correspond to real life. The test has two parts. In the first part,
sets consist of only animals or only food items. In the second part, sets consist of animals and
food and participants are asked to repeat the food first, from smallest to biggest, and then the
animals, from smallest to biggest. Both parts start with two practice sets to ensure that
participants understand the directions. If they make a mistake in either practice set, the
instructions are repeated and the set is administered again. After the practice items, the test starts
with a set size of two. If a participant correctly repeats all picture labels, the set size of the next
trial increases by one. If the participant makes an error, another set of the same size but with
different items is administered. Testing stops when a participant cannot correctly repeat two sets
in a row or when the last set has been administered. Responses were recorded on a paper sheet
and a score for each participant was calculated by counting the total number of items of all
correctly repeated sets. Thus the maximum score for each part is 27 (2+3+4+5+6+7) and the total
possible score is 54. This test was only administered in English.
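The stopping and scoring rules just described can be sketched in a few lines of Python. This is an illustration only, not the NIH Toolbox implementation: the function name `score_part` and the input format (a mapping from set size to the outcomes of up to two administered sets) are hypothetical.

```python
def score_part(attempts_by_size, min_size=2, max_size=7):
    """Score one part of the working memory test as described in the text.

    attempts_by_size maps a set size to a list of booleans, one per
    administered set of that size (True = all labels repeated correctly).
    This input format is an assumption made for illustration.
    """
    score = 0
    for size in range(min_size, max_size + 1):
        attempts = attempts_by_size.get(size, [])[:2]  # at most two sets per size
        if any(attempts):
            score += size  # credit the items of the correctly repeated set
        else:
            break  # two failed sets in a row (or no data): testing stops
    return score


max_per_part = sum(range(2, 8))  # 2+3+4+5+6+7

# Hypothetical response patterns
perfect = {size: [True] for size in range(2, 8)}
stopped_at_five = {2: [True], 3: [False, True], 4: [True], 5: [False, False]}
```

With these rules, a participant who fails both sets of size five after passing sizes two to four receives 2+3+4 points for that part.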
 Recently, the reliability of the test was established (Tulsky et al., 2014). The test-retest
intraclass correlation coefficient was .77. The test also correlated with other established WM
tests (r = .57) and tests of executive function (r = .43 - .58) from a standardized cognition
battery (see Tulsky et al., 2014). The correlation with a test of receptive vocabulary, on the other
hand, was low (r = .24). Also interesting with respect to the present study was the finding that
Hispanic participants scored, on average, .41 SDs below Caucasian participants.
 5.3.2 Results
 The monolingual group (M = 37.6, SD = 8.0) scored higher than the bilingual group
(M = 32.4, SD = 7.9) and this difference was significant (t(99) = 3.29, p < .001, d = 0.66). The
next question was whether this difference would still be significant when the picture vocabulary
score was included as a covariate. A regression analysis showed that PV was a significant
predictor (b = 0.40, SE = 0.13, p = .002), showing that a 1-point increase on the PV standard
score scale was associated with an increase in WMC of 0.4 points. The factor Group was no
longer significant (b = 5.75, SE = 17.70, p = .746) and neither was the interaction between Group
and vocabulary (b = -0.06, SE = 0.19, p = .738), suggesting that vocabulary knowledge fully
accounted for the differences between groups. This is further illustrated in Figure 17. The model
explained 22% of the variance and was significant (F(3, 97) = 9.19, p < .001).
 Figure 17. Relationship between working memory capacity and picture vocabulary scores.
Grey-shaded area shows the 95% confidence interval of the regression line.
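As a rough check, the group comparison on the working memory test can be approximately reproduced from the summary statistics alone. The sketch below assumes a pooled-variance independent-samples t-test and group sizes of 53 monolinguals and 48 bilinguals (the sample sizes given in the abstract, which are consistent with the 99 degrees of freedom). Because the reported means and SDs are rounded, the results agree with the reported t and d only to about two digits. The function name is hypothetical.

```python
import math


def pooled_t_and_d(m1, sd1, n1, m2, sd2, n2):
    """Return (t, df, Cohen's d) for two independent groups.

    Assumes equal variances (pooled SD); this is an assumption about the
    analysis, not something stated in the text.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    d = (m1 - m2) / pooled_sd
    t = (m1 - m2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, df, d


# Monolinguals: M = 37.6, SD = 8.0; bilinguals: M = 32.4, SD = 7.9
t, df, d = pooled_t_and_d(37.6, 8.0, 53, 32.4, 7.9, 48)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```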
 5.3.3 Discussion
 The results confirmed the hypothesis that VWM is related to vocabulary knowledge. While
the differences between groups were significant, vocabulary knowledge could fully account for
these differences. This suggests that bilinguals did not have generally lower WMC but performed
more poorly on the WM test as a group because of their lower vocabulary knowledge in English.
This relationship could go in either direction. For one, a lower WMC may lead to a smaller
vocabulary because WMC may be involved in vocabulary acquisition (Baddeley et al., 1998).
Conversely, a larger vocabulary may subserve WM via more precise phonological representations
in LTM. A third possible explanation is that the relationship may be bidirectional. The first
explanation is unlikely because it would suggest that bilinguals had a smaller general WMC than
monolinguals. However, general WMC has been shown to be constrained by neural limitations
(Vogel & Machizawa, 2004) and is therefore unlikely to be influenced by bilingualism. Indeed,
when WMC was regressed on vocabulary knowledge, the residual variance was exactly the same
for monolinguals and bilinguals (see Figure 18). This finding contrasts with Luo et al. (2013) who
found that monolinguals still scored higher than bilinguals on a VWM test after accounting for
differences in vocabulary knowledge. The different results in that study and the present one may
be due to the type of vocabulary knowledge tested. Luo et al. tested receptive vocabulary whereas
bilinguals in the present study completed a test of productive vocabulary (as mentioned in the
Materials section, Tulsky et al., 2014, also found only a low correlation between receptive
vocabulary and WM scores in the norming sample). Productive vocabulary may be more
indicative of the quality of phonological representations because less precise representations can
suffice for recognition.
 The present results lend further support to the hypothesis that the quality of phonological
representations is the main reason for differential performance of monolinguals and bilinguals on
verbal tasks. Importantly, the same relationship between vocabulary knowledge and WMC was
seen in bilingual and monolingual participants. These findings have implications for studies
employing VWM tests to predict performance on other cognitive or perceptual tests. If
vocabulary knowledge is not controlled for, it is not clear whether an observed effect is truly
caused by WMC or verbal ability. One solution to this problem would be to use more than one
test of WMC measuring different modalities (e.g., visual WM, VWM, spatial WM) and calculate
a composite score based on the shared variance between the tests (Conway et al., 2005; Kane et
al., 2004). The results also have important implications for teaching second language speakers.
Teachers have to bear in mind that English Language Learners with a more limited vocabulary
will have greater difficulty following lectures because of a more limited capacity to maintain
verbal information in memory.
 Figure 18. Distribution of working memory scores when the effect of picture vocabulary was
partialled out (residual variance).
 5.4 Consonant perception in noise
 The next test in this test battery was a test of consonant perception. There were two
research questions associated with this test: first, do monolinguals and bilinguals differ in the
accuracy of consonant perception, and, second, what factors can explain these differences?
Consonant perception in a second language may be influenced by the phoneme inventory of the
first language (Cutler, Garcia Lecumberri, & Cooke, 2008; Cutler, Weber, Smits, & Cooper,
2004; Garcia Lecumberri & Cooke, 2006). Plosives in Spanish and English differ in VOT so that
an English /b/ can sound more like a Spanish /p/. Also, Spanish does not have the consonants
/ʃ/ and /"/. <v> and <b> represent one phoneme in Spanish with two allophonic realizations, /β/
and /b/. Likewise, /s/ and /z/ are allophones. It was therefore hypothesized that the bilingual
participants may experience interference from Spanish, especially since they heard consonants
decontextualized, that is, without language cues. For example, English /aba/ may be heard as
/apa/. In addition, it was hypothesized that accuracy would be correlated with vocabulary
knowledge in English. Exemplar theory (Pierrehumbert, 2003) proposes that phonetic categories
are refined by type statistics in the lexicon, that is, top-down information can influence
perception. Thus individuals with a larger lexicon may possess more refined phonetic categories
that guide them in perception. For example, /d/ and /b/ differ on many different dimensions such
as formant transition, burst amplitude, spectrum, and the ratio of the closure to the voice onset
time (Pierrehumbert, 2003, p. 120). Because there is redundant information, representations can
be relatively coarse without affecting perception. However, more refined representations may be
beneficial under adverse listening conditions, when some of the information such as formant
transitions is overshadowed by a competing acoustic signal.
 5.4.1 Materials and Procedure
 In the consonant perception test (CP), participants heard 16 different consonants in a
/VCV/ sequence and were asked to identify them by clicking on one of 16 options on the
computer screen. The consonant recordings came from Shannon, Jensvold, Padilla, Robert, and
Wang (1999). The original recordings done by Shannon and colleagues included 25 consonants in
three different vowel contexts /u/, /a/, and /i/ in medial /VCV/ position and initial /CV/ position.
Following Garcia Lecumberri and Cooke (2006), stimuli were reduced to 16 consonants (/p b t d
k g tʃ f v s z ʃ m n l r/) in only one vowel context (aCa) and one consonant position. Two male
speakers of standard American English (M2 and M3 from Shannon et al., 1999) were chosen
from the original set of 5 male and 5 female speakers and each token was repeated four times for
a total of 128 items. The experimental items were mixed with background noise (multi-talker
babble) taken from the original SPIN recording. Three different sections from the babble noise
track were cut and mixed at a SNR of -4 dB in Praat (Boersma & Weenink, 2014). One of those
babble segments was used twice and the other two were used once. The SNR was chosen based
on a pilot study. Participants in the pilot study performed at about 85% accuracy at an SNR of
-2 dB. To avoid ceiling effects, the SNR was lowered to -4 dB in the present study. Participants
also heard each token in silence at the beginning of the experiment so they could adapt to the
pronunciation of each speaker. These trials were only used as practice trials and were not scored.
When a participant made a mistake on those practice trials, the same token was repeated until the
participant made a correct response.
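The mixing step can be illustrated as follows. The stimuli in the study were mixed in Praat; the Python sketch below is not that script. It only shows the underlying arithmetic under the common assumption that SNR is defined on RMS amplitudes, so that the babble is rescaled until 20 * log10(RMS of speech / RMS of babble) equals the target value. All function names and the toy signals are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB, based on RMS amplitudes."""
    return 20 * math.log10(rms(speech) / rms(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale the noise to the target SNR and add it to the speech.

    Both inputs are equal-length lists of samples. Returns the mixture
    and the rescaled noise.
    """
    gain = rms(speech) / (rms(noise) * 10 ** (target_snr_db / 20))
    scaled_noise = [gain * x for x in noise]
    mixed = [s + n for s, n in zip(speech, scaled_noise)]
    return mixed, scaled_noise


# Toy signals standing in for a /aCa/ token and a babble segment
speech = [math.sin(2 * math.pi * 5 * i / 200) for i in range(200)]
noise = [0.3 * math.sin(2 * math.pi * 13 * i / 200 + 1.0) for i in range(200)]

mixed, scaled_noise = mix_at_snr(speech, noise, -4.0)
```

At a negative SNR such as -4 dB the rescaled babble has a higher RMS amplitude than the speech token, which is what makes the task difficult enough to avoid ceiling effects.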
 5.4.2 Results
 Mean accuracy for monolinguals was 76.9% (SD = 5.4) and for bilinguals 66.9% (SD = 9.1).
A logistic mixed-effects regression model with subjects and items as random effects showed that
this difference was significant, indicating that monolinguals were overall more accurate than
bilinguals (b = 0.65, SE = 0.12, p < .001). Additional factors were added to the model to
establish whether the two different speakers and the three different babble segments had an effect
on recognition accuracy and whether the effect was the same or different for mono- and bilingual
participants. Speaker 1 was easier to identify than speaker 2 (b = -0.78, SE = 0.08, p < .001).
Speaker interacted with Babble segment (b = 0.31, SE = 0.11, p = .005), showing that the benefit
for Speaker 1 was smaller when paired with babble segment 3 (see Figure 19). Importantly,
Speaker and Babble segment did not interact with Group, suggesting that the effects were the
same for both groups.
 Figure 19. Mean accuracy on the consonant perception test divided by babble segment and
speaker. Whiskers show the 95% confidence interval. Note the limited range of the y-axis to
highlight the effects.
 The next question was whether the monolingual benefit extended over all consonants or
was specific to certain consonants only. Figure 20 suggests that performance differed depending
on the consonant. First, those consonants that are the same in both languages were recognized
with the same accuracy (/tʃ/, /m/, and /n/). In addition, the voiceless plosives /k/, /p/, and /t/
were recognized with the same accuracy by both groups. The largest differences existed for those
consonants for which VOTs in English and Spanish overlap (/b/, /d/, and /g/) and those that are
allophonic in Spanish (/s/ and /z/, and /v/). Lastly, /f/ was misidentified more often by bilinguals
compared to monolinguals, which was not predicted based on native language influence.
 Figure 20. Mean accuracy for each consonant on the consonant perception test. Whiskers show
the 95% confidence interval.
 The matrices in Table 14 and Table 15 show the average percentage of correct responses
(figures on the diagonal) and which consonant was most often heard when participants did not
identify the correct one. If the first language interfered with correct recognition of the English
phonemes, then bilinguals should have chosen the consonant that would be predicted based on
Spanish phonology more often than monolinguals. For example, the VOT of English /b/ is more
similar to a Spanish /p/ so there should be more apa responses in the bilingual group compared
to the monolingual group. To test whether groups differed in their responses when the target
consonant was not correctly identified, a χ² analysis was performed. A significant result shows
that the groups differ, by more than would be expected by chance, in the ratio of responses to a
certain consonant to the total number of incorrect responses.
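For a single confusion, this comparison amounts to a 2 x 2 test: group (bilingual vs. monolingual) by response (the confusion of interest vs. any other incorrect response). The sketch below assumes a Pearson χ² without continuity correction and φ = sqrt(χ²/N) as the effect size. Under those assumptions, the counts reported in Table 16 for /b/ heard as /p/ (103 of 318 incorrect responses for bilinguals, 12 of 288 for monolinguals) give values that match the table. The function name is hypothetical.

```python
import math


def chi2_phi_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square (no continuity correction) and phi for a 2 x 2 table.

    hits_*  = number of incorrect responses that went to the consonant of interest
    total_* = total number of incorrect responses in that group
    """
    a, b = hits_a, total_a - hits_a
    c, d = hits_b, total_b - hits_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)
    return chi2, phi


# /b/ heard as /p/: bilinguals 103/318, monolinguals 12/288 (Table 16)
chi2, phi = chi2_phi_2x2(103, 318, 12, 288)
print(f"chi2 = {chi2:.1f}, phi = {phi:.2f}")
```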
The results for select consonants, those for which we would expect a native-language influence,
are shown in Table 16.
 Table 14. Confusion matrix - bilingual participants.

                              Consonant stimulus
Response    b  tʃ   d   f   g   k   l   m   n   p   r   s   ʃ   t   v   z
missing     1   1   1   .   2   1   1   .   1   1   1   1   1   1   1   1
b          17   .   1   8   .   .   7   5   .   1   1   .   .   .  17   .
tʃ          0  95   2   .   2   6   .   .   .   .   .   .  11  10   .   .
d          10   .  62   .   1   .   5   1   2   1   .   .   .   1   6  11
f           5   .   1  32   .   .   2   1   .   .   .   .   .   .   1   .
g           1   1  12   .  65   .   2   1   1   2   1   .   .   1   1   3
k           .   .   4   .  29  91   .   1   .  10   .   .   .   1   .   .
l          16   .   1   1   .   .  32   3   1   .   .   1   .   .   5   7
m           2   .   .   5   .   .  13  66   1   .   .   .   .   .   1   .
n           3   .   2   .   1   .   1   7  92   .   .   .   .   1   1  10
p          27   .   1  40   1   1  12   6   .  73   2   .   .   .  12   .
r           1   .   .   .   1   .   1   .   .   .  92   .   .   .   .   .
s           1   .   .   1   .   .   1   .   .   .   .  68   1   .   2   2
ʃ           .   2   1   .   .   .   .   .   .   .   .   5  86   1   .   2
t           1   .  12   .   .   .   .   .   .  11   .   .   .  85   .   .
v          16   .   1  11   .   .  24   9   .   .   2   .   .   .  50   1
z           1   .   1   .   .   .   .   .   .   .   1  23   .   .   2  62

 Note. Columns indicate the consonant that was played and rows indicate the response that
participants gave. All values are shown as percentages. Values below 1% are not shown (marked
with a dot), which is why not all columns add up to 100%. Missing = missing response.
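The percentages in Table 14 can be related to the raw counts in Table 16. The sketch below assumes that each of the 48 bilingual participants (the group size given in the abstract) contributed 8 responses per consonant (2 speakers x 4 repetitions), that is, 384 presentations per consonant. That denominator is an inference from the design, not a figure stated in the text. Under this assumption, the counts for /b/ and /f/ reproduce the corresponding cells of Table 14.

```python
# Assumed design: 48 bilingual participants x 2 speakers x 4 repetitions
presentations = 48 * 2 * 4


def percent(count, total=presentations):
    """Percentage of presentations, rounded to a whole number as in the table."""
    return round(100 * count / total)


# Counts taken from Table 16 (bilingual group)
b_wrong, b_as_p, b_as_v = 318, 103, 62
f_wrong, f_as_p = 260, 154

b_correct_pct = percent(presentations - b_wrong)  # diagonal cell for /b/
b_as_p_pct = percent(b_as_p)                      # row p, column b
b_as_v_pct = percent(b_as_v)                      # row v, column b
f_correct_pct = percent(presentations - f_wrong)  # diagonal cell for /f/
f_as_p_pct = percent(f_as_p)                      # row p, column f
```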
 Table 15. Confusion matrix - monolingual participants.
  Consonant
 stimulus
 b t! d f g k l m n p r s  !  t v z response
 missing         
 " " ! " " " # " " ! ! " " # " " b !" ! ! "$ " ! $ % ! & " ! ! ! "# ! t! ! #$ " ! ! $ ! ! ! ! ! ! $ $ ! ! d # ! %& ! ! ! ! ! " ! ! ! ! $ ! $ f % ! " $" ! ! " " ! " ! ! ! ! " ! g ! ' "' ! #" ! ! ! ! ! ! ! ! ! ! # k ! ! ! ! & #& ! ! ! ( ! ! ! ! ! ! l #' ! # " ! ! $' $ ! " ! " ! ! ) "' m " ! ! * ! ! ) (% " ! ! ! ! ! " ! n ' ! # ! ! ! " ) #$ ! ! ! ! " ! & p ' ! ! ) " ! " # ! %% ! ! ! ! " ! r ! ! ! ! ! ! " + ! ! #% ! ! ! ! ! s ! ! ! ! ! ! ! ! ! ! ! )! " ! ! " ! ! " ! ! ! ! ! ! ! ! ! # #" ! ! ! t ! ! # ! ! ! ! ! ! $ ! ! ! )) ! ! v #* ! ! "& ! ! '+ "" ! # ! ! ! ! () " z " ! # ! ! ! ! ! ! ! ! "" ! ! $ %& Note. 
Columns indicate the consonant that was played and rows indicate the response that
participants gave. All values are shown as percentages. Values below 1% are not shown, which is
why not all columns add up to 100%. Missing = missing response.
 Table 16. Typical consonant confusions by monolingual and bilingual participants.

Target      Misidentified   Misidentified/Total wrong       χ²         φ
consonant   consonant       Bilinguals    Monolinguals
/b/         /p/             103/318       12/288            78.3***    0.36
/b/         /v/             62/318        113/288           28.7***    0.22
/d/         /t/             46/146        7/110             24.2***    0.31
/g/         /k/             112/136       21/36             9.4**      0.23
/s/         /z/             90/122        45/72             2.7+       0.12
/v/         /b/             67/192        52/134            0.5        0.04
/v/         /p/             47/192        6/134             23.2***    0.27
/ʃ/         /tʃ/            43/52         19/32             5.6*       0.26
/f/         /p/             154/260       34/205            86.6***    0.43
/f/         /v/             43/260        63/205            13.1***    0.17
/f/         /b/             32/260        60/205            20.8***    0.21
/l/         /v/             91/262        126/208           31.2***    0.26
/l/         /p/             47/262        6/208             26.3***    0.24

***p < .001, **p < .01, *p < .05, +p < .1.
 Note. The table shows how many times a target consonant was misidentified as another
consonant compared to the total number of misidentifications. The χ²-test tested whether the
ratio was significantly different between groups and φ shows the effect size of the difference.
 The results suggest that native language influence can explain some of the confusions.
For the voiced consonants /b/, /d/, and /g/, bilinguals were more likely to choose the voiceless
counterparts than monolinguals. The influence of the merging of /b/ and /v/ in Spanish can also
be observed. Both /v/ and /b/ were confused with /p/. However, bilinguals were less likely to
confuse /v/ with /b/ than monolinguals. Also, /s/ and /z/ were not more confusable for bilinguals
than monolinguals, contrary to what may be expected based on Spanish phonology.
Monolinguals were more likely to confuse /f/ with /v/ or /b/ and bilinguals were more likely to
confuse it with /p/. Because /f/ is produced very similarly in English and Spanish, these results
suggest that monolinguals and bilinguals may have attended to different cues in the signal, rather
than reflecting L1 influence. The pattern of these results is strikingly similar to that reported in
Garcia Lecumberri and Cooke (2006), who also tested native speakers of Spanish (albeit
European Spanish). However, differences between the present study and theirs were observed for
the consonant /ʃ/. The L2 speakers in Garcia Lecumberri and Cooke attained high accuracy for
/ʃ/ in noise (92%) and did not typically confuse it with /tʃ/ (2% of responses). This may be
because many of their participants also spoke Basque, a language that has the /ʃ/ sound. For
other sounds, both monolingual and bilingual speakers were less accurate in the present study.
This was true for /l/ and /z/. For example, in Garcia Lecumberri and Cooke, native English
speakers reached 97% accuracy for /z/, compared to 74% in the present study. It may be that
these differences are attributable to the different noise maskers used in the present study and the
fact that Garcia Lecumberri and Cooke used all five male speakers from Shannon et al. (1999)
with two repetitions per speaker whereas the present study only used two speakers with four
repetitions of each consonant.
 The next question was whether English proficiency would be associated with CP test
performance. One possibility is that knowing two languages interferes with consonant perception
when consonants share overlapping spaces such as Spanish /p/ and English /b/, which may lead
to intermediate category boundaries that are unlike those of monolingual speakers of either
language. In this case, English proficiency may not correlate with performance. However, some
studies have shown that bilinguals are able to shift their category boundaries depending on
language mode (Antoniou, Tyler, & Best, 2012; Elman, Diehl, & Buchwald, 1977; Garcia-Sierra,
Diehl, & Champlin, 2009). For example, in Elman, Diehl, and Buchwald (1977) bilinguals
listened to five tokens on a /b/-/p/ continuum, with VOT ranging from -69 to +66 msec. The
authors created an English and a Spanish version with the same test syllables but filler words and
prompts were either in English or Spanish to put participants in the respective language mode.
The results showed that the same stimulus was identified more often as /b/ in the English context
than in the Spanish context for strong (balanced) bilinguals but weak (unbalanced) bilinguals did
not show this shift as a result of language mode. Nonetheless, even the performance of strong
bilinguals was different from monolinguals in either language, suggesting that bilinguals may be
unable to completely turn one language off, as it were, when listening in the other language.
Elman and colleagues also assessed proficiency in each language through an oral interview and
degree of bilingualism (L1 proficiency/L2 proficiency) was correlated with the size of the
category shift (r = .52). This suggests that proficiency is related to perception accuracy. The
prediction was therefore that higher English proficiency would be associated with more
native-like (monolingual) consonant perception.
 To address the role of proficiency in consonant perception, mean accuracy was calculated
for each participant and the result was used as the outcome variable in a linear regression
analysis. Group and English proficiency (oral language ability) were the predictor variables. A
visual inspection of the data suggested that the relationship between proficiency and accuracy
was not linear (see Figure 21). Rather, the effect on CP was stronger in the lower proficiency
range. Therefore, proficiency was entered as a cubic spline with 2 degrees of freedom. Results
showed that both terms of the spline function (first term: b = 0.31, SE = 0.06, p < .001; second
term: b = 0.11, SE = 0.04, p = .010) and Group (b = 0.04, SE = 0.02, p = .017) were significant
predictors. The effect of group shows that after proficiency in English was taken into account,
the difference between groups was 4 percentage points. This was smaller than the 10 percentage
point difference between groups that was found above. The model explained 46.3% of the
variance (adj. R² = .447). Group by itself explained 32.0% and proficiency by itself 43.0% of the
variance. This suggests that proficiency was a better predictor of performance on the test than
Group. Because a spline function was used, a breaking point was imposed by the function. This
point was at the median of 99.5, suggesting that the steepness of the slope differed for individuals
below and above this point. This can be seen in Figure 21. Because most participants above the
break point were monolinguals, this may suggest that the relationship between CP and
proficiency was stronger in bilinguals than monolinguals. Thus separate analyses for each group
were run. For monolinguals, the model was not significant (F(2, 49) = 1.6, p = .216, R² = .061)
but for bilinguals it was (F(2, 45) = 10.1, p < .001, R² = .309). However, some of the variance is
lost when aggregating data and so a logistic mixed-effects model was also run on the raw data.
The disadvantage is that these models do not provide an R² statistic that would allow for model
comparisons but the estimates are likely more accurate because error attributable to subject and
item variance is taken into account.
 Subjects, items, and items nested within subjects were entered as random effects to account for
the fact that each subject heard each item four times spoken by two speakers and contributed 128
data points. As in the previous analysis, the first and second term of the spline function for
language proficiency were significant (first term: b = 2.32, SE = 0.44, p < .001; second term:
b = 1.14, SE = 0.30, p < .001), as was Group (b = 0.25, SE = 0.12, p = .041). When the model was
run for each group separately, proficiency was a significant predictor in both groups
(monolinguals only: first term: b = 0.90, SE = 0.44, p = .039; second term: b = 0.74, SE = 0.39,
p = .011; bilinguals only: first term: b = 2.17, SE = 0.58, p < .001; second term: b = 1.26,
SE = 0.44, p = .004).
 Figure 21. Relationship between accuracy on the consonant perception test and oral language
ability. The regression line included one knot at 99.5.
 To illustrate the role of proficiency further, each group was divided into high and low
proficiency based on a median split of oral language proficiency. A t-test showed that the
monolingual low and the bilingual high group were not significantly different in language
proficiency (M_monolingual low = 527 W, M_bilingual high = 525 W, t(51) = 0.97, p = .339,
d = 0.27). Therefore, any differences between those groups are likely not attributable to
differences in English proficiency but to other factors such as L1 influence. After establishing
these four groups, another mixed-effects regression analysis was run with group as a predictor
variable with four levels (monolingual high/low, bilingual high/low). The results indicated that
the bilingual high group was significantly different from the bilingual low group (b = -0.54,
SE = 0.14, p < .001) and from both the monolingual low (b = 0.34, SE = 0.14, p = .011) and the
monolingual high group (b = 0.57, SE = 0.15, p < .001). When the monolingual low group was used
as the reference category, they were not significantly different from the monolingual high group
(b = 0.22, SE = 0.14, p = .106). This suggests that differences in consonant perception still
persist even when groups are matched on proficiency (i.e., monolingual low and bilingual high)
but those differences become smaller (see Figure 22).
 Figure 22. Accuracy on the consonant perception test as a function of group. The monolingual
and bilingual groups were each divided into a high and low proficiency group based on a median
split of their verbal ability score. Whiskers show the 95% confidence interval.
 The results for each consonant are shown in Figure 23. The figure shows that whereas the
bilingual low group and the monolingual high group perform significantly differently for most
consonants, the bilingual high and the monolingual low group perform more similarly. Differences
still exist for some consonants (/g/ and /l/), which may suggest a native language influence for
those consonants.
 Figure 23. Mean accuracy for each consonant on the consonant perception test. The monolingual
and bilingual groups were each divided into a high and low proficiency group based on a median
split of their verbal ability score. Whiskers show the 95% confidence interval.
 The results so far suggest a relationship between language proficiency and consonant
perception in noise. The hypothesis of this dissertation is that a larger vocabulary results in
more precise phonological representations in long-term memory. Likewise, assuming that phonetic
categories are extracted from the phonetic information stored in the entire mental lexicon, a
larger vocabulary should result in more precise phonetic categories, which would be more robust
to the effect of noise.
 To test this hypothesis, the phonotactic probability of each of the 16 consonants (only the
probability of the consonant in the VCV cluster was considered) was calculated using the
phonotactic probability calculator (Vitevitch & Luce, 2004). The results showed that phonotactic
probability was not normally distributed (M = 0.021, SD = 0.025, Median = 0.009). To account for
this skew, phonotactic probability was divided into high and low probability based on a median
split. The prediction was that consonants with higher phonotactic probability would be
recognized with greater accuracy. In addition, we may expect that individuals with a larger
vocabulary would be more sensitive to phonotactic probability and thus be more accurate on VCV
clusters with low phonotactic probability. The reason is that the probabilities based on a
corpus analysis will only roughly correspond to experienced probabilities. For subjects with
less language experience, low-probability clusters will be of even lower experienced frequency.
As in the case of the frequency effect described in section 1.4.3, we may therefore expect an
interaction between phonotactic probability and English proficiency.
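 The median split described above can be illustrated with a minimal sketch. The probability
values below are hypothetical and are not the study's data, and the handling of values that fall
exactly on the median (assigned to the low group here) is an assumption, since the text does not
specify it.

```python
from statistics import median

def median_split(values):
    """Label each value 'high' if it lies above the median, else 'low'.

    Assumption: values equal to the median are labeled 'low'.
    """
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]

# Hypothetical phonotactic probabilities for six consonants (illustration only).
probs = [0.002, 0.005, 0.008, 0.010, 0.030, 0.070]
labels = median_split(probs)
```

 With skewed values such as these, the split yields two equally sized groups even though the
mean lies well above the median.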
 A mixed-effects regression model was run with subjects, items, and items nested within subjects
as random effects. As before, the results showed a main effect of oral language ability
(b = 0.23, SE = 0.07, p = .002) and Group (b = 0.31, SE = 0.13, p = .014). Importantly, the
interaction between language ability and phonotactic probability was significant (b = 0.16,
SE = 0.07, p = .015). Because language ability was centered, the main effect of phonotactic
probability shows the estimated effect for a participant with mean language ability, which was
not significant (b = -0.80, SE = 0.67, p = .231). These effects can be best interpreted by
looking at Figure 24.
 Figure 24. Relationship between mean accuracy on the consonant perception test and oral
language ability. Consonants were divided into high and low phonotactic probability based on a
median split. The interaction between phonotactic probability and language ability was
significant.
 5.4.3 Discussion
 The results from the consonant perception test showed that bilinguals performed significantly
differently from monolinguals with an effect size of about 10 percentage points. The pattern of
consonant confusions resembles those reported in Garcia Lecumberri and Cooke (2006) for Spanish
native speakers who had learned English as a foreign language. As in their study, bilingual
participants in the present study often misperceived the voiced consonants /b/, /g/, /d/, and
/v/ as voiceless /p/, /k/, and /t/. This suggests a native language influence on L2 perception
even for early bilinguals (see footnote 14). However, the present study extends the results of
Garcia Lecumberri and Cooke by showing that the effect of L1 influence becomes smaller as
proficiency in the tested language increases (Figure 23). Importantly, the relationship between
proficiency and accuracy was also found for the monolingual speakers to a certain extent. This
suggests that differences between monolinguals and bilinguals cannot solely be attributed to L1
influence.
 Footnote 14. Garcia Lecumberri and Cooke did not report detailed information about their
participants' age of L2 acquisition and L2 proficiency, but the participants lived in Spain,
which suggests more limited exposure to English.
 Two possible explanations for the effect of proficiency come to mind. Higher language
proficiency may be associated with more precise phonetic categories and/or individuals with
higher language proficiency may be better at attending to those acoustic cues that penetrate the
background noise. Both explanations are consistent with a usage-based view of phonetic
categories (Pierrehumbert, 2001, 2003). According to this view, mental representations are
"gradually built up through experience with speech" (Pierrehumbert, 2001, p. 137). As
individuals gain more experience with a language and hear more words in a wider range of
contexts, their phonetic categories of those sounds that distinguish meaning in the language
become more refined (also see Hardison, 2012). At the same time, individuals may learn to attend
to those cues in the speech signal that are most informative, especially when the speech signal
is not optimal. For example, aspiration is a good cue in English to distinguish voiced from
voiceless plosives (although the main cue is VOT; Flege & Eefting, 1987). However, Spanish does
not have aspiration, so native speakers of Spanish need to learn to attend to this cue. Not
attending to aspiration as a cue may explain why bilinguals often chose /p/ where monolinguals
were more likely to hear /v/ or /b/ (see Table 16). At a general level, the effect of language
proficiency is also in line with Flege's speech learning model (Flege, 1995), which states that
new, nonnative phonetic categories can be established with increased language experience.
 The results provided some evidence that language ability, specifically vocabulary knowledge, is
directly related to consonant perception in noise. Individuals with a larger vocabulary were
less influenced by phonotactic probability. This effect is best interpreted through an
entrenchment account of phonetic categories (Pierrehumbert, 2001). More frequent phonemes are
better entrenched than less frequent phonemes. The effect of proficiency is small for
high-probability phonemes because these are well-entrenched for all speakers. However, the
low-probability phonemes are less entrenched in speakers with a smaller vocabulary, leading to
an interaction between vocabulary size and CP.
 Bilinguals showed signs of L1 influence when listening to English
consonants although they learned English early in life and were mostly immersed in an
English-speaking environment (all participants attended school in the US from first grade). This
resembles findings from Sebastián-Gallés and colleagues, who found that early Spanish-Catalan
bilinguals had difficulty distinguishing between a Catalan vowel contrast nonexistent in Spanish
(Sebastián-Gallés, Echeverría, & Bosch, 2005; Sebastián-Gallés & Soto-Faraco, 1999). In the
present study, the differences between bilinguals and monolinguals were attenuated when English
language proficiency was considered, but even a subset of monolinguals and bilinguals matched on
proficiency still performed significantly differently from each other. Results from other
studies, though, have shown that bilinguals are able to shift phonemic categories depending on
the language mode they are in. Antoniou et al. (2012) tested early Greek-English bilinguals on
stimuli involving voiced and voiceless consonants as those have a shorter VOT category boundary
in Greek. The results showed that the bilinguals were able to shift their category boundaries
depending on language context. For example, when in Greek mode they perceived a Greek /p/ most
often as /p/ but when in English mode, they were more likely to hear it either as /b/ or as /p/.
However, Antoniou et al.'s study employed ideal listening conditions. The results from the
present study differ insofar as stimuli were presented in noise. This may reveal more subtle
differences in perception, especially in cases where bilinguals and monolinguals rely on
different phonetic cues. It should be noted, though, that although only English was used in the
experiment (Spanish was not used until all English tests were completed), the task gave no
language context cues. Putting bilinguals into a stronger monolingual mode by providing a
context cue such as a carrier sentence for each token might have changed the results. In
contexts without strong language cues it may even be beneficial to have more intermediate
phonetic boundaries to accommodate language switches. Another reason may be frequent exposure to
accented English. One eye-tracking study (Ju & Luce, 2004) found that when listening to Spanish,
Spanish-English bilinguals only exhibited cross-language activation (as measured by eye
movements to English competitor pictures) when VOT of Spanish words was manipulated to be
consistent with English. For example, when VOT was Spanish-like, participants did not look to a
picture of pliers more than to a control picture when hearing playa. When VOT was English-like,
on the other hand, participants looked more to the pliers than to the control picture. Thus Ju
and Luce's (2004) study showed that lexical access in bilinguals is constrained by
language-specific cues such as VOT. Bilinguals who are frequently exposed to accented English
may thus treat /b/ and /p/ or /d/ and /t/ as allophonic variants for the purposes of lexical
access (cf. Samuel & Larraza, 2015, p. 67). For example, bilinguals might frequently hear /t/ as
in /ten/ with a VOT acceptable for Spanish /t/ but more akin to English /d/. The boundary from
/t/ to /d/ is around 85 ms in English but around 19 ms in Spanish (Flege & Eefting, 1986).
Consequently, the category boundary from /t/ to /d/ may shift to allow shorter VOTs as
acceptable for English /t/. Or speakers frequently exposed to Spanish-accented English may
ignore VOT as a cue altogether because of its unreliability and may rely more on context. For
example, in some r-less New York City dialects, the vowels in the words source and sauce have
nearly merged. Speakers who produce two different vowels in these two words are nevertheless not
able to reliably indicate which one they heard, presumably because of the great variability of
this vowel distinction in the speech community (Pierrehumbert, 2003, p. 138). The same may be
true for Spanish-English bilingual speakers regarding those consonants whose category boundaries
overlap in English and Spanish, but further research is necessary to corroborate this
hypothesis.
 Despite the L1 influence on L2 speech perception, the results showed clearly that differences
do not only exist between monolinguals and bilinguals but also within monolinguals. The
relationship between vocabulary knowledge and speech perception could be bidirectional given
that previous studies have found a relationship between speech discrimination ability and
vocabulary development in infants (Tsao, Liu, & Kuhl, 2004). Nevertheless, the present results
suggest that differences in speech perception between monolinguals and bilinguals may be less
categorical than previously thought. One striking result of the present study is the large
difference in vocabulary knowledge between groups, which amounted to 1 SD (see Table 1). Given
such differences, monolingual college students may not be a good comparison group. Figure 21 in
particular suggests that individual differences in speech perception get smaller as language
proficiency increases. Thus differences between monolinguals and bilinguals in speech
perception might wrongfully be attributed to bilingual status when in fact they are attributable
to differences in language experience in general.
 5.5 Test of Attention in Listening
 The main purpose for including the TAIL in this test battery was that previous research has
indicated that attentional control, or executive functions, may be recruited when listening
under adverse conditions. As outlined in the ELU model (Rönnberg et al., 2013), word recognition
is effortless when the speech signal is optimal. However, when the signal is distorted in some
way, listening becomes effortful and requires additional attentional resources. A secondary
purpose of the study was to test the hypothesis that bilingualism improves attentional control,
often referred to as the bilingual advantage (e.g., Bialystok, Craik, & Luk, 2012; Hilchey &
Klein, 2011). This second hypothesis will be explored in this section. At first, it seems
unrelated to the topic of this dissertation; however, it has been proposed that there is a
relationship between attentional control and language processing in bilinguals (Abutalebi et
al., 2013; D. W. Green, 1998; Mercier, Pivneva, & Titone, 2013; Pivneva, Palmer, & Titone,
2012). Because of this literature, it was hypothesized that individual differences in language
experience may be associated with individual differences in attentional control.
 5.5.1 The bilingual advantage
 One of the first studies to report a bilingual advantage was Bialystok, Craik, Klein, and
Viswanathan (2004). These authors administered the Simon test, a test that is designed to
measure inhibitory control. Inhibitory control is the ability to suppress a prepotent response
in the presence of response conflict. In one version of the test, participants press a right or
left arrow depending on the direction of an arrow they see on a computer screen. Response
conflict arises when a left-pointing arrow appears on the right side of the screen and the other
way round. Compared to trials without response conflict, that is, a right-pointing arrow on the
right side of the screen, RTs in conflict trials are usually longer, a difference referred to as
the Simon effect. Bialystok et al. (2004) found that the Simon effect was much smaller for
bilinguals than monolinguals, suggesting that bilingualism may be associated with better
inhibitory control.
 One explanation for this advantage is the bilingual's need to control access to both languages
when speaking in one language and that the constant recruitment of these domain-general
attentional networks improves performance on nonlinguistic tests of executive function. Costa,
Hernández, and Sebastián-Gallés (2008) pointed out that all theories of bilingual lexical access
involve some type of control mechanism. In Green's (1998) inhibitory control model, for example,
translation equivalents become active when a bilingual person accesses a word in one language.
For instance, when accessing the concept of DOG, the word forms dog and perro receive activation
in a Spanish-English bilingual speaker. In Green's model, these word forms have language tags
attached to them and the form with the wrong tag is inhibited. This inhibition mechanism may be
the same as the one recruited during tasks used to measure attentional control (Bialystok,
Craik, & Luk, 2008).
 However, this hypothesis has
recently come under criticism. In a review of the bilingual advantage literature, Hilchey and
Klein (2011) came to the conclusion that there is no consistent evidence for a bilingual
advantage in inhibitory control, but there is a bilingual advantage in general attentional
control with bilinguals often being faster on conflict and nonconflict trials. Therefore,
Hilchey and Klein (2011) concluded that inhibition of the irrelevant language during bilingual
speech production may not be an adequate explanation of the bilingual advantage. Since this
review, several studies have been published that did not find any evidence for a bilingual
advantage (Antón et al., 2014; Duñabeitia et al., 2014; V. C. M. Gathercole et al., 2014; Paap &
Greenberg, 2013), which has led some researchers to question the reliability of the effect
(e.g., de Bruin, Treccani, & Della Sala, 2015; Klein, 2015; Paap, 2015). For example, it has
been suggested that differences in SES (Morton & Harper, 2007) and immigrant status (Kousaie &
Phillips, 2011) can explain purported bilingual advantages (these studies come from Canada where
immigrants often have a high SES). One way forward to resolve these conflicting results may be
to relate performance on tests of executive function to bilingual experience in a correlational
design with a more homogeneous group of bilingual participants in terms of SES and other
background variables. For example, one study found that the degree of bilingualism (dominant
language proficiency divided by the nondominant language proficiency) was positively associated
with the age of diagnosis of Alzheimer's disease in a sample of low-educated Spanish-English
bilinguals (Gollan, Salmon, Montoya, & Galasko, 2011).
 5.5.2 Methods
 5.5.2.1 Materials
 The Test of Attention in Listening (TAIL) was adapted from Zhang, Barry, Moore, and Amitay
(2012). In this test, participants have to decide whether two tones were played to the same ear
or different ears. What makes this test challenging is that the frequency of the two tones is
sometimes the same and sometimes different. Because participants are only supposed to respond
based on the location of the tones, response conflict arises on trials on which the location is
different but the frequency the same or the location the same but the frequency different. The
manipulation of frequency and location results in four conditions: same-frequency same-location
(SFSL), same-frequency different-location (SFDL), different-frequency same-location (DFSL), and
different-frequency different-location (DFDL). The original test also has a second condition
where frequency is the task-relevant dimension and location is the irrelevant dimension that has
to be ignored. However, only the first condition was used in the present study to reduce the
time needed to administer the test.
 Three different measures can be derived from the TAIL: baseline RT, involuntary orientation,
and conflict resolution. Baseline RT is the mean RT in the SFSL condition. In Zhang et al.
(2012), baseline RT correlated with the RTs in a separate test that did not involve response
conflict and therefore the authors suggested that this measure reflects information processing
speed. Involuntary orientation can be calculated by subtracting RTs on trials with the same
frequency from those with a different frequency ([DFDL+DFSL] - [SFSL+SFDL]). Conflict resolution
can be calculated by subtracting the mean RTs on trials where location and frequency were both
different or both the same (no response conflict) from those where only one of the two differed
([SFDL+DFSL] - [SFSL+DFDL]).
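 As an illustration, the three TAIL measures could be computed as in the following minimal
sketch. It follows the prose definitions (different-frequency minus same-frequency trials, and
conflict minus no-conflict trials). Averaging the two condition means on each side of the
subtraction is an assumption made here for readability; the function name, the input format, and
the RT values are hypothetical and not taken from the study.

```python
from statistics import mean

CONDITIONS = ("SFSL", "SFDL", "DFSL", "DFDL")

def tail_measures(trials):
    """Compute TAIL measures from (condition, rt_ms) pairs of correct trials."""
    by_cond = {c: [] for c in CONDITIONS}
    for cond, rt in trials:
        by_cond[cond].append(rt)
    m = {c: mean(rts) for c, rts in by_cond.items()}
    return {
        # Baseline RT: mean RT in the same-frequency same-location condition.
        "baseline": m["SFSL"],
        # Involuntary orientation: different-frequency minus same-frequency trials.
        "involuntary": (m["DFDL"] + m["DFSL"]) / 2 - (m["SFSL"] + m["SFDL"]) / 2,
        # Conflict resolution: conflict trials minus no-conflict trials.
        "conflict": (m["SFDL"] + m["DFSL"]) / 2 - (m["SFSL"] + m["DFDL"]) / 2,
    }

# Hypothetical RTs in ms for one participant (illustration only).
example = [("SFSL", 600), ("SFSL", 620), ("SFDL", 660), ("DFSL", 690), ("DFDL", 650)]
result = tail_measures(example)
```

 Positive values on the two difference scores indicate slower responses on different-frequency
trials and on conflict trials, respectively.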
 The tones were created in Praat (Boersma & Weenink, 2014) as pure tones with a length of 100
ms. The frequency ranged between 500 and 1400 Hz in 100 Hz intervals, which resulted in ten
different sound files. There were a total of 96 experimental trials, 24 trials in each
condition. The experiment was programmed in E-Prime.
 5.5.2.2 Procedure
 Participants were seated in front of a computer and were given written and oral instructions
for the experiment. They were told that they would hear two tones and then would decide whether
the two tones were played to the same ear or different ears. They were also told to ignore the
frequency of the two tones and just pay attention to location. For their responses, participants
used the keys Q and P on the keyboard and they were encouraged to respond as fast and as
accurately as possible. The experiment started with 16 practice trials for which participants
received automated feedback from the computer. If a participant did not reach 85% accuracy on
these practice trials, the instructions were repeated and the participant did another round of
16 practice trials. Most participants reached the accuracy criterion in the first round and
everyone else in the second. On each trial, a sound file was randomly chosen. For same-frequency
trials, the same sound file was played twice and for the different-frequency condition, the
second sound file was randomly chosen so that the difference in frequency was at least 100 Hz.
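 The tone selection just described can be sketched as follows. This is not the E-Prime
implementation used in the study; the function and variable names are hypothetical. Because the
ten tones lie on a 100 Hz grid, any two different tones differ by at least 100 Hz, so drawing a
second tone different from the first is sufficient to satisfy the constraint.

```python
import random

# Ten pure tones from 500 to 1400 Hz in 100 Hz steps, as described in the Materials.
FREQS_HZ = list(range(500, 1500, 100))

def pick_tone_pair(same_frequency, rng=random):
    """Return the (first, second) tone frequencies in Hz for one trial."""
    first = rng.choice(FREQS_HZ)
    if same_frequency:
        return first, first
    # Any other tone on the grid differs from the first by at least 100 Hz.
    second = rng.choice([f for f in FREQS_HZ if f != first])
    return first, second
```

 Passing a seeded random.Random instance as rng makes the sequence of trials reproducible.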
 5.5.3 Analysis
 For the accuracy data, a logistic mixed-effects model (Bates et al., 2014) was run with
subjects as a random effect and Group (monolingual/bilingual), Frequency (same/different), and
Location (same/different) as fixed effects. For the RT data, only correct trials were used and
the model included the same random and fixed effects as the previous one. Of particular interest
was the interaction between Group and Frequency and Group and Location. One participant from the
bilingual group was excluded because of low accuracy (60%).
 5.5.4 Results
 Is there a bilingual advantage?
 Accuracy on the test was high (M = 96.3%, SD = 18.8, range = 87.5% - 100%). The result of the
regression model showed that compared to the SFSL condition, participants were less accurate
when Frequency was different (b = -0.95, SE = 0.19, p < .001) but this effect was attenuated
when both Frequency and Location were different (b = 0.66, SE = 0.22, p = .003; see Figure 25).
The Frequency by Group interaction was also significant, showing that bilinguals were less
distracted by a different frequency (b = 0.45, SE = 0.19, p = .045). All other main effects and
interactions were not significant (|z| < 1). The Frequency by Group interaction is shown in
Figure 26. The figure suggests that the interaction arose from the fact that the difference
between same and different trials was larger for monolinguals than bilinguals.
 Next, RTs were investigated. Compared to the SFSL condition, responses were slower when
Frequency was different (b = 72.0, SE = 7.5, p < .001) and when Location was different (b =
51.7, SE = 7.5, p < .001). These effects were attenuated when both Frequency and Location were
different (b = -74.2, SE = 8.8, p < .001; see Figure 27). Group interacted with Frequency,
showing that the effect of Frequency was smaller in bilinguals (b = -19.7, SE = 8.9, p = .026),
as illustrated in Figure 28.
 Figure 25. Mean accuracy on the TAIL in each of four conditions. Whiskers show the 95%
confidence interval. Note the limited range of the y-axis to highlight the effect.
 Figure 26. Mean accuracy on the TAIL for monolinguals and bilinguals. The difference between
same frequency and different frequency trials was larger for monolinguals than for bilinguals.
Whiskers show the 95% confidence interval. Note the limited range of the y-axis to highlight the
effect.
 Figure 27. Mean response time (RT) on the TAIL in each of four conditions. Whiskers show the
95% confidence interval.
 So far the results seem to show that monolinguals and bilinguals performed differently on some
aspect of the test. Monolinguals were faster than bilinguals when the frequency was the same for
both tones and slower when the frequency was different. This gave rise to an interaction between
Frequency and Group. As described in the Methods section, there were two versions of the
experiment, with one half of each group using Q for same responses and P for different
responses, and the other way around for the other half of each group. To test whether experiment
version had an effect on the results, RTs were plotted separately for version 1 and 2. Figure 29
shows that monolinguals were faster in the DFDL and SFDL conditions on version 1 than on version
2. All other 95% CIs overlap, which suggests that performance was similar in both versions. When
the previous model was rerun including an interaction with test version, the effects changed.
Because of the complexity of the results, they are reported in table format. Table 17 shows that
the Location effect was larger for bilinguals compared to monolinguals on version 1 but smaller
on version 2. The Frequency by Group interaction, on the other hand, was only present on version
2, with bilinguals showing a reduced effect.
 Figure 28. Mean response time (RT) in msec. on same and different frequency trials. Whiskers
show the 95% confidence interval. Note the limited range of the y-axis.
 Figure 29. Mean response times (RT) in msec. in each of the four conditions of the TAIL. DF/SF
= different/same frequency, DL/SL = different/same location. The difference between Version 1
and 2 was the location of the response keys (see Methods section in text).
 Table 17. Results of the regression analysis of TAIL response times.
 Effect                                     Beta     SE        p
 Intercept (baseline = SFSL condition)     680.1   26.2   < .001
 Frequency (baseline = same)                65.1   10.5   < .001
 Location (baseline = same)                 21.6   10.4     .039
 Frequency*Location                        -70.4   12.2   < .001
 Group (baseline = monolingual)             10.0   37.5     .791
 Test version (baseline = version 1)         5.6   37.4     .882
 Frequency*Group                            -1.5   12.2     .903
 Location*Group                             27.3   12.2     .026
 Frequency*Test version                     13.9   15.0     .357
 Location*Test version                      61.5   15.0   < .001
 Group*Test version                         12.2   54.2     .821
 Frequency*Location*Test version            -7.8   17.7     .661
 Frequency*Group*Test version              -38.5   17.7     .030
 Location*Group*Test version               -53.0   17.7     .003
 Note. SFSL = same frequency, same location condition. Frequency and Location were variables
with two levels, same and different. Group had two levels, monolingual and bilingual. Test
version had two levels, version 1 and version 2.
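 To make the interactions in Table 17 easier to read, the following sketch adds up the relevant
coefficients to obtain the estimated Location effect (in ms) for each group and test version. It
assumes that all predictors are dummy coded with the stated baselines as 0, and it gives the
Location effect for same-frequency trials only. The function name is hypothetical.

```python
# Coefficients copied from Table 17 (response times in ms).
COEF = {
    "Location": 21.6,
    "Location*Group": 27.3,
    "Location*Test version": 61.5,
    "Location*Group*Test version": -53.0,
}

def location_effect(bilingual, version2):
    """Estimated Location effect at Frequency = same, assuming 0/1 dummy coding."""
    g = 1 if bilingual else 0
    v = 1 if version2 else 0
    return (COEF["Location"]
            + g * COEF["Location*Group"]
            + v * COEF["Location*Test version"]
            + g * v * COEF["Location*Group*Test version"])

effects = {
    (group, version): round(location_effect(group == "bilingual", version == 2), 1)
    for group in ("monolingual", "bilingual")
    for version in (1, 2)
}
```

 Under these assumptions the estimated Location effect is 21.6 ms for monolinguals and 48.9 ms
for bilinguals on version 1, and 83.1 ms for monolinguals and 57.4 ms for bilinguals on version
2, which matches the pattern described in the text.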
 A question that arises regarding the result is whether the frequency manipulation was
successful. When the first and the second tone had a different frequency, the difference could
vary between 100 Hz and 900 Hz. If this manipulation was successful, then a greater difference
should have led to a larger frequency effect. To test this, the difference in frequency between
the first and second tone was entered as a continuous predictor into a mixed-effects regression
model. For this, only the trials for which the frequency was different were included. A visual
inspection of the data suggested that a third-order polynomial would best fit the data since the
relationship between RTs and the difference in frequency was not linear (see Figure 30). The
results showed that the difference in frequency was a significant predictor (linear term: b =
645.2, SE = 216.6, p = .003; quadratic term: b = -396.4, SE = 217.1, p = .068; cubic term: b =
-458.8, SE = 216.8, p = .034), but the standard errors show that there was considerable
uncertainty in these estimates. The interactions with Location, Test version, and Group were not
significant.
 Figure 30. Effect of frequency difference between the first and second tone on response times
(RT) in msec. The regression line shows the best fit with a polynomial function with three
terms.
 Is bilingual experience related to variance in attentional control?
 Language dominance was used as a continuous variable in lieu of bilingual experience,
with the assumption that more balanced bilinguals would have greater language experience in
each of their languages. Dominance was calculated by subtracting the Spanish oral language
score from the English oral language score. The resulting variable was normally distributed with
a mean of 12.6 and a standard deviation of 14.5. Zero indicates that an individual was equally
proficient in English and Spanish, negative values indicate greater proficiency in Spanish, and
positive values indicate greater proficiency in English. Thus most participants were dominant in
English, as is typical for bilinguals who live in a mostly English monolingual environment.
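 As a small illustration of how such a dominance score and its z-transformed version (used in
Table 18) can be computed, the following sketch uses invented scores. The function names and the
example values are assumptions for illustration only; the sign convention follows the Figure 31
caption, where scores above 0 indicate English dominance.

```python
from statistics import mean, stdev


def language_dominance(english_scores, spanish_scores):
    """English minus Spanish score; positive values indicate English dominance."""
    return [e - s for e, s in zip(english_scores, spanish_scores)]


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


# Hypothetical oral language scores for four participants.
english = [105, 98, 112, 90]
spanish = [92, 98, 85, 96]

dominance = language_dominance(english, spanish)
dominance_z = z_scores(dominance)
```

 After standardization, a regression coefficient for dominance expresses the change in the outcome
associated with a difference of one standard deviation in dominance.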
 For the analysis, a mixed-effects regression model on the RTs was run with Frequency,
Location, and Language dominance as main effects, together with their interactions. The results
are summarized in Table 18 and are graphically displayed in Figure 31. There was a main effect of
Language dominance, with more English-dominant participants being faster overall. In addition,
Language dominance interacted with Location: the less balanced a bilingual was, the larger the
Location effect. For the accuracy data, there was a negative effect of Language dominance, with
more English-dominant participants being less accurate. However, the effect was small: a change
of one SD in Language dominance was associated with a 2.5% decrease in accuracy. Nevertheless,
there may have been a trade-off between speed and accuracy. To test for a speed-accuracy
trade-off, mean accuracy was correlated with mean RTs in each condition. The correlations were
small (rs(47) < .24, ps > .100), so it is unlikely that participants as a group traded accuracy
for speed.
 Because Language dominance was correlated with proficiency in English and Spanish
(see section 5.2.3), it is not clear whether dominance itself or proficiency in English or Spanish
is responsible for the results. To answer this question, two separate analyses were run, replacing
Language dominance with English and Spanish proficiency, respectively. When English scores
were entered into the model, the main effect and the interactions with Frequency and Location
were not significant. With Spanish scores, on the other hand, the pattern of results did not change
compared to that in Table 18. Simple correlations between the mean RTs in the baseline
condition (SFSL) and either Language dominance or the Spanish scores were
r(47) = .39, 95% CI = [.12, .61] and r(47) = .36, 95% CI = [.09, .59], respectively. The large
overlap of the CIs suggests that these correlations were not significantly different.
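 One common way to obtain a confidence interval for a correlation is the Fisher z
transformation, sketched below. The text does not state which method was used for the intervals
above, so this is an assumption, as is the sample size of 48 (the number of bilingual
participants) passed to the hypothetical function fisher_ci.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via Fisher's z transformation."""
    z = math.atanh(r)              # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# Assumed sample size of 48 bilingual participants.
lo_dom, hi_dom = fisher_ci(0.39, 48)
lo_spa, hi_spa = fisher_ci(0.36, 48)
```

 Under these assumptions the interval for r = .39 comes out close to the reported [.12, .61]. Note
that overlapping intervals are only an informal indication; a formal comparison of two
correlations measured on the same participants would require a test for dependent correlations.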
 Table 18. Results of the regression analysis of response times on the TAIL.
 Variable Name                                  RTs                       Accuracy
                                                Beta     SE      p        Beta     SE      p
 Intercept (baseline = SFSL)                    702.2    19.6             3.66     0.20
 Frequency (different vs. same)                  45.0     9.3    .000    -0.53     0.23    .020
 Location (different vs. same)                   45.6     9.3    .000    -0.06     0.25    .802
 Location*Frequency                             -59.7    13.2    .000     0.57     0.33    .087
 Language dominance (continuous variable)       -58.2    19.6    .004    -0.49     0.16    .003
 Language dominance*Location                     20.3     6.6    .004     0.28     0.17    .096
 Language dominance*Frequency                     4.0     6.6    .541     0.33     0.16    .044
 Note. Only the data from bilingual participants were analyzed. Language dominance was
transformed into a z-score so that the estimate shows the change associated with a 1 SD change
in language dominance.
 Figure 31. Effect of language dominance on response times (RT, in msec.) and the location
effect. Language dominance was calculated by subtracting Spanish proficiency scores from
English proficiency scores so that scores above 0 indicate English dominance.
 5.5.5 Discussion
 In light of the differences between test versions, the results are somewhat difficult to
interpret. Bilinguals showed a reduced Frequency effect; that is, whether the frequency of the
two tones was the same or different had a smaller effect on them than on monolinguals.
However, when both versions were considered separately, this was only true for version 2 of the
test. On version 1, on the other hand, bilinguals showed a larger Location effect; that is,
compared to monolinguals, they were slower to respond to trials on which the location of the two
tones was different. One possible explanation is that one version was easier than the other for
monolinguals, whereas for bilinguals both versions were equally difficult. More research would be
needed to determine whether these results can be replicated or are idiosyncratic to this study.
 A further investigation of the frequency effect showed that the manipulation was
successful. A larger difference generally resulted in longer RTs, suggesting that participants were
more distracted by this irrelevant dimension when the difference was larger. However, the
relationship was not linear. When the difference was large (900 Hz), participants were as fast to
respond as when the difference was small (100 Hz). It may be that very large differences were
easier to ignore because they were more obvious. It is interesting to note that the effect did not
interact with Location, suggesting that frequency was distracting even when there was response
congruency (i.e., both Frequency and Location required a different response).
 A further question that was investigated was whether specific variables relating to
bilingualism would be associated with attentional control. The reasoning was that if the bilingual
advantage is related to bilingual language use, then more balanced bilinguals may be expected to
perform better than less balanced bilinguals. The results were surprising in that bilinguals who
were more dominant in English were faster overall. In addition, they displayed a larger Location
effect, which was mainly caused by faster responses to same trials compared to different trials.
This suggests that the larger Location effect was due to an advantage for same-location trials
rather than a disadvantage attributable to greater distraction. The direction of the main effect of
Language dominance was unexpected since it was hypothesized that more balanced bilinguals
would be faster. Further analyses showed that the Spanish scores were also associated with
overall faster RTs on the TAIL test. One possible interpretation of these results is that
participants who were more dominant in English were more integrated into the dominant culture.
As discussed in section 5.2.4, English-dominant participants had more exposure to English,
which may be equated with greater influence of the dominant (American) culture. It is well
established that sociocultural differences can influence performance on tasks of executive
function (Chasiotis, Kiessling, Hofer, & Campos, 2006; Oh & Lewis, 2008; Sabbagh, Xu,
Carlson, Moses, & Lee, 2006). For example, Chasiotis et al. (2006) suggest that cultures that
differ in interpersonal distance (separateness–relatedness) and agency (autonomy–heteronomy;
see Kagitcibasi, 1996) may differ on tasks of executive function. The researchers found some
evidence for this hypothesis by testing children from three cultures that differ on these two
dimensions: Germany, Cameroon, and Costa Rica. Differences emerged for the Cameroonian
sample compared to the other two samples. Cameroonian children performed less well on
conflict-inhibition tasks but better on a delay-inhibition task (on this task, the child is told
not to take a snack that is in view until the experimenter rings a bell). Chasiotis et al. (2006)
suggest that this may have to do with parenting style. Parents in Cameroon may favor obedient and
inhibited behavior but may disregard impulse behavior (p. 258). Likewise, immigrant families of
Mexican descent (the majority of the bilingual sample) may differ in their parenting style from
Caucasian non-Hispanic American families (the majority of the monolingual sample; Varela et al.,
2004). One tentative explanation of the present results may thus be that greater English
dominance was associated with greater adaptation to values of the dominant US culture such as
independence and autonomy.
It should be noted, though, that this is only speculative and that there is no direct evidence
for this hypothesis. One way to investigate this hypothesis would be to survey bilingual
children's parents about parenting style and cultural values and relate this to their children's
executive function development (see, e.g., Bernier, Carlson, & Whipple, 2010). In addition,
greater English dominance may be associated not only with parents' cultural values but also with
the participants' own adaptation to the dominant culture and its values such as autonomy. While
Spanish contact is often determined by external forces in the early years (e.g., parents may
choose to put their child into a bilingual program), later in life bilinguals may choose to build
social networks with members of the dominant culture and abandon some of their traditional
values.
 Studies investigating the bilingual advantage have often relied on group comparisons.
However, as has been recently pointed out (Valian, 2014), individuals can differ on many
dimensions that have been linked to advantages in executive function, such as musical training,
video gaming, and exercise. Thus we can never be sure whether differences between groups are
attributable to bilingualism or to some other unobserved variable, especially when sample sizes
are small. In the present study, one possible confounding factor is SES. SES as measured by
mother's education level was significantly lower in the bilingual than in the monolingual sample.
Because there was almost complete separation, it is impossible to statistically control for this
variable. Thus there may have been a bilingual advantage, but it may have been obscured by the
lower SES of the bilinguals. In light of these difficulties, other researchers have suggested
employing individual differences designs to directly relate aspects of bilingualism to advantages
in executive function (Titone, Pivneva, Sheikh, Webb, & Whitford, 2015). In the present study it
was hypothesized that more balanced bilinguals would have greater attentional control compared
to less balanced bilinguals. The opposite effect was found, with English-dominant bilinguals
being faster overall than balanced bilinguals. To explain this unexpected result, a literature
search was conducted, which showed that executive functions may be related to cultural values and
parenting style. This adds even more variables to the task of singling out bilingualism as a
factor in benefits in executive function. What seems clear from the present results, though, is
that while bilingualism influences verbal variables, general cognitive function is not negatively
affected.
 CHAPTER 6: CONCLUSION
 Interim discussions of each test can be found after the presentation of the results of each
test. In this conclusion I will summarize the main results and relate the different findings to each
other. Figure 32 summarizes the results schematically. English proficiency, as measured by the
WMLS-R subtests picture vocabulary and verbal analogies, turned out to be the strongest
predictor of individual differences in SUN, the topic of this dissertation. The main finding from
Experiment 2 (section 0) was that differences between monolinguals and bilinguals may be less
categorical than may be thought when looking at group comparisons only. While bilingual
participants were overall less accurate on the SPIN than monolinguals (71.8% vs. 80.8%), English
proficiency was a mediating factor in both groups. English proficiency, in turn, was
related to exposure to English. Bilingual participants with more exposure to English were also
more proficient in English. On the other hand, more English exposure was necessarily associated
with less Spanish exposure. This was further expressed in the finding that higher English
proficiency was associated with lower Spanish proficiency. Another predictor of Spanish
proficiency was the number of speakers a participant regularly interacted with during childhood.
This variable was not negatively correlated with English proficiency, which suggests that a
trade-off between a bilingual's languages may be attenuated by certain variables. Another factor
that seemed to have positively influenced Spanish proficiency without negatively affecting English
proficiency was participation in Spanish immersion programs (not shown in Figure 32). Because
these results were based on retrospective reports only, they have to be interpreted with caution,
but the results again suggest that a trade-off between languages may not be inevitable. English
proficiency furthermore predicted WMC and consonant perception in noise, and a weak
association was found between Spanish proficiency and the Spanish WIN test.
 Figure 32. Schematic representation of the results in this study. Arrows indicate significant
relationships between variables. The two-way arrow indicates that more exposure to one
language is associated with less exposure to the other language. SUN = speech understanding in
noise. WM = working memory. CP = consonant perception.
 One limitation of this study, as of any correlational study, is that causation cannot be
established. The arrows in Figure 32 merely show the hypothesized direction of the relationship
between variables. While it is established in the literature that more exposure to the language,
for example in the form of mother-child interactions, will lead to vocabulary growth (e.g., Hoff,
2006; Weisleder & Fernald, 2013), a remaining question is whether vocabulary size is causally
related to SUN or whether exposure causally predicts both SUN and vocabulary size. The lexical
restructuring model (Metsala & Walley, 1998; Walley, 2008), for example, assumes a causal
relationship. Furthermore, Pierrehumbert (2001, 2003) suggests that phonetic categories in
listeners are fine-tuned by the type statistics computed over the entire lexicon. That is, the
mental lexicon provides listeners with feedback about which phonetically dissimilar sounds
nevertheless belong to the same phonetic category. Listeners with more refined phonetic
categories may be attending to more detailed phonetic information and may thus be less affected
when some of this information is overshadowed by a competing signal. Some evidence for the
relationship between vocabulary size and the precision of phonetic categories came from the
consonant perception test. But again, these results cannot establish causation, and it may be that
English exposure is the mediating factor, leading to a larger vocabulary and more precise
phonetic categories.
 As I already mentioned, the results of the SPIN, WIN, CP, and WM tests suggest that
differences between monolinguals and bilinguals may be more gradual than previously thought.
In particular, the fact that language proficiency predicted performance on these tests for both
monolinguals and bilinguals suggests that the less accurate performance of the bilinguals may be
a natural consequence of being exposed to two languages and, as a consequence, spending less
time in each language. The results of the WMLS-R showed that even highly proficient bilinguals
who have received all their schooling in English and are studying at a major US university may
still have a smaller vocabulary than the general population. Whereas the monolinguals performed
above the population mean, the mean standard score of the bilinguals was 2/3 of a standard
deviation below the population mean. The context of the present study may be different from
studies on bilingualism in other regions of the world such as Catalonia or Montreal, where
bilingualism is not associated with SES. However, given the relationship between language
proficiency and language exposure found across many studies, it seems that balanced bilinguals
as a group will always be less proficient in each language compared to monolingual speakers.
Therefore, if a goal of a study is to make inferences about bilingualism (i.e., the consequences of
speaking more than one language) and not about language proficiency in general, one must test
two groups of participants who are matched on language proficiency. Otherwise language
proficiency will be a confounding factor and it will not be clear whether differences are
attributable to bilingualism or to lower proficiency. Matching bilinguals with monolinguals is not
straightforward, however. For example, if we tested a large group of Spanish-English bilingual
speakers in the US, the current sample would probably perform above the population mean, given
that these participants were college students; that is, they would be drawn from the right tail
of the bilingual population distribution. If we wanted to find a monolingual sample of the same
mean proficiency, they would be drawn from the left tail of the monolingual population
distribution. Thus the two samples may differ in other ways and may not be easily comparable.
 While differences between groups could be attributed to differences in language
proficiency to a large extent, some differences may also be attributable to cross-language
influence. Evidence for this assumption comes from the CP test. A comparison of the confusion
matrices (Table 14 and Table 15) suggested that bilingual speakers tended to misperceive those
consonants that have overlapping category boundaries in English and Spanish (i.e., /b/, /d/, and
/g/). This result may have been a consequence of the decontextualized nature of the test, which
would suggest that bilinguals cannot simply switch their languages on and off and function like a
monolingual of the presently relevant language (cf. Grosjean, 2001). Lastly, the results from the
SPIN and the WIN tests also suggest that individual differences in domain-general cognitive
abilities play a role in SUN. A lower baseline RT on the TAIL was associated with higher accuracy
on the SPIN and WIN. Baseline RT may reflect processing speed (Zhang et al., 2012). A decrease in
processing speed has been proposed as a major contributor to the age-related decline in cognition
(Salthouse, 1996). If processing speed is associated with SUN, then the age-related decline in
processing speed may also explain why SUN gets harder as a function of age (cf. Wingfield, 1996).
In the present study, the effect of processing speed was small, which may be due to the fact that
participants were young-adult college students. Larger effects may be found if a more diverse
sample were tested.
 APPENDIX
 Table 19. Mean item accuracy on the WIN.
 SNR   Word     Monolingual   Bilingual
   0   back        39.6%        40.4%
   0   bath        38.5%        38.3%
   0   calm         5.7%         0.0%
   0   dab          1.9%         0.0%
   0   gaze        23.1%        19.1%
   0   get         25.0%        36.2%
   0   kill        11.3%         4.3%
   0   life        30.8%        25.5%
   0   nice        22.6%        23.4%
   0   read         7.7%        29.8%
   4   beg         17.0%        14.9%
   4   far         50.9%        29.8%
   4   learn       88.7%        89.4%
   4   long        24.5%        17.0%
   4   mess        58.5%        72.3%
   4   mood        22.6%         4.3%
   4   mouse       60.4%        59.6%
   4   note        39.6%        40.4%
   4   sheep       96.2%        83.0%
   4   talk        71.7%        78.7%
   8   bite        77.4%        66.0%
   8   deep        79.2%        70.2%
   8   doll        37.7%        61.7%
   8   half        49.1%        42.6%
   8   make        98.1%        95.7%
   8   pick        96.2%       100.0%
   8   soap        67.9%        46.8%
   8   sour        86.8%        78.7%
   8   turn        92.5%        73.9%
   8   young       58.5%        80.4%
  12   chief      100.0%        95.7%
  12   good       100.0%       100.0%
  12   hate       100.0%       100.0%
  12   pass       100.0%       100.0%
  12   rush       100.0%        95.7%
  12   search     100.0%       100.0%
  12   shack      100.0%       100.0%
  12   tool       100.0%        97.9%
  12   voice      100.0%       100.0%
  12   witch      100.0%       100.0%
  16   base       100.0%       100.0%
  16   date       100.0%       100.0%
  16   dog        100.0%       100.0%
  16   gas         98.1%        83.0%
  16   have        94.3%        93.6%
  16   judge      100.0%       100.0%
  16   live       100.0%        95.7%
  16   red        100.0%       100.0%
  16   time        84.9%        70.2%
  16   wire       100.0%       100.0%
  20   chair      100.0%       100.0%
  20   ditch      100.0%       100.0%
  20   gun        100.0%       100.0%
  20   haze       100.0%       100.0%
  20   kick       100.0%       100.0%
  20   luck       100.0%       100.0%
  20   ring       100.0%       100.0%
  20   shawl       96.2%        76.6%
  20   such       100.0%       100.0%
  20   tire       100.0%       100.0%
  24   cool       100.0%        97.9%
  24   dodge      100.0%       100.0%
  24   food       100.0%       100.0%
  24   hire       100.0%        97.9%
  24   juice      100.0%       100.0%
  24   late       100.0%       100.0%
  24   pain       100.0%       100.0%
  24   road       100.0%       100.0%
  24   wheat      100.0%        97.9%
  24   youth      100.0%       100.0%
 Note. SNR = signal-to-noise ratio.
 Table 20. Item discrimination index for E-WIN words.
 Word     SNR   Accuracy   Accuracy     Accuracy   Item             Correlation
                           Bottom 27%   Top 27%    discrimination
 far        4     41.0%      25.9%        59.3%       0.33            0.34
 gaze       0     21.2%       7.4%        40.7%       0.33            0.32
 soap       8     58.0%      33.3%        70.4%       0.37            0.30
 turn       8     83.8%      70.4%        92.6%       0.22            0.30
 talk       4     75.0%      55.6%        88.9%       0.33            0.29
 half       8     46.0%      18.5%        66.7%       0.48            0.29
 life       0     28.3%      14.8%        44.4%       0.30            0.29
 kill       0      8.0%       0.0%        14.8%       0.15            0.28
 get        0     30.3%      22.2%        51.9%       0.30            0.28
 shawl     20     87.0%      74.1%        92.6%       0.19            0.28
 mood       4     14.0%       3.7%        25.9%       0.22            0.27
 live      16     98.0%      92.6%       100.0%       0.07            0.27
 long       4     21.0%       3.7%        33.3%       0.30            0.25
 calm       0      3.0%       0.0%        11.1%       0.11            0.25
 learn      4     89.0%      88.9%       100.0%       0.11            0.24
 bite       8     72.0%      55.6%        81.5%       0.26            0.24
 mess       4     65.0%      51.9%        81.5%       0.30            0.23
 note       4     40.0%      37.0%        55.6%       0.19            0.23
 back       0     40.0%      33.3%        59.3%       0.26            0.23
 dab        0      1.0%       0.0%         3.7%       0.04            0.21
 young      8     68.7%      55.6%        77.8%       0.22            0.21
 sheep      4     90.0%      81.5%        92.6%       0.11            0.20
 deep       8     75.0%      59.3%        92.6%       0.33            0.20
 chief     12     98.0%      92.6%       100.0%       0.07            0.20
 read       0     18.2%       7.4%        25.9%       0.19            0.19
 time      16     78.0%      66.7%        92.6%       0.26            0.19
 have      16     94.0%      92.6%        96.3%       0.04            0.17
 make       8     97.0%      92.6%       100.0%       0.07            0.16
 bath       0     38.4%      29.6%        48.1%       0.19            0.16
 gas       16     91.0%      81.5%        92.6%       0.11            0.14
 doll       8     49.0%      48.1%        48.1%       0.00            0.12
 rush      12     98.0%      92.6%       100.0%       0.07            0.12
 beg        4     16.0%       7.4%        22.2%       0.15            0.09
 sour       8     83.0%      77.8%        85.2%       0.07            0.09
 tool      12     99.0%      96.3%       100.0%       0.04            0.07
 hire      24     99.0%      96.3%       100.0%       0.04            0.07
 mouse      4     60.0%      63.0%        66.7%       0.04            0.06
 cool      24     99.0%     100.0%       100.0%       0.00            0.00
 wheat     24     99.0%     100.0%       100.0%       0.00            0.00
 nice       0     23.0%      18.5%        22.2%       0.04           -0.01
 pick       8     98.0%     100.0%       100.0%       0.00           -0.05
 good      12    100.0%     100.0%       100.0%       0.00
 hate      12    100.0%     100.0%       100.0%       0.00
 pass      12    100.0%     100.0%       100.0%       0.00
 search    12    100.0%     100.0%       100.0%       0.00
 shack     12    100.0%     100.0%       100.0%       0.00
 voice     12    100.0%     100.0%       100.0%       0.00
 witch     12    100.0%     100.0%       100.0%       0.00
 base      16    100.0%     100.0%       100.0%       0.00
 date      16    100.0%     100.0%       100.0%       0.00
 dog       16    100.0%     100.0%       100.0%       0.00
 judge     16    100.0%     100.0%       100.0%       0.00
 red       16    100.0%     100.0%       100.0%       0.00
 wire      16    100.0%     100.0%       100.0%       0.00
 chair     20    100.0%     100.0%       100.0%       0.00
 ditch     20    100.0%     100.0%       100.0%       0.00
 gun       20    100.0%     100.0%       100.0%       0.00
 haze      20    100.0%     100.0%       100.0%       0.00
 kick      20    100.0%     100.0%       100.0%       0.00
 luck      20    100.0%     100.0%       100.0%       0.00
 ring      20    100.0%     100.0%       100.0%       0.00
 such      20    100.0%     100.0%       100.0%       0.00
 tire      20    100.0%     100.0%       100.0%       0.00
 dodge     24    100.0%     100.0%       100.0%       0.00
 food      24    100.0%     100.0%       100.0%       0.00
 juice     24    100.0%     100.0%       100.0%       0.00
 late      24    100.0%     100.0%       100.0%       0.00
 pain      24    100.0%     100.0%       100.0%       0.00
 road      24    100.0%     100.0%       100.0%       0.00
 youth     24    100.0%     100.0%       100.0%       0.00
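 The item discrimination values in Table 20 are consistent with the difference between the
proportion correct in the top 27% and in the bottom 27% of participants (e.g., for "far":
0.593 - 0.259 = 0.33). The sketch below illustrates that upper-lower index with invented data.
The function name, the way ties in total score are handled, and the rounding of the group size
are assumptions and may differ from the procedure actually used.

```python
def item_discrimination(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index: p(correct | top group) - p(correct | bottom group).

    item_correct: 0/1 responses to one item, one entry per participant.
    total_scores: overall test score per participant, in the same order.
    """
    n = len(total_scores)
    k = max(1, round(fraction * n))  # size of the top and bottom groups
    # Rank participants by total score (ties keep their original order).
    order = sorted(range(n), key=lambda i: total_scores[i])
    bottom, top = order[:k], order[-k:]
    p_top = sum(item_correct[i] for i in top) / k
    p_bottom = sum(item_correct[i] for i in bottom) / k
    return p_top - p_bottom


# Hypothetical data for ten participants, ordered by total score.
totals = [10, 12, 15, 18, 20, 22, 25, 27, 30, 33]
hard_item = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]   # only high scorers succeed
easy_item = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]   # everyone succeeds

d_hard = item_discrimination(hard_item, totals)
d_easy = item_discrimination(easy_item, totals)
```

 An item that every participant answers correctly has an index of 0, which matches the rows of
Table 20 with 100% accuracy.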
 Figure 33. Mean accuracy on the English Words in Noise test for List 1 and 2. Whiskers show
the 95% confidence interval.
 REFERENCES
 Abutalebi, J., Della Rosa, P. A., Ding, G., Weekes, B., Costa, A., & Green, D. W. (2013).
Language proficiency modulates the engagement of cognitive control areas in multilinguals.
Cortex; a Journal Devoted to the Study of the Nervous System and Behavior, 49(3), 905–11.
doi:10.1016/j.cortex.2012.08.018
 Acheson, D., Hamidi, M., Binder, J., & Postle, B. (2011). A common neural substrate for
language production and verbal working memory. Journal of Cognitive Neuroscience, 23(6),
1358–1367.
 Akeroyd, M. A. (2008). Are individual differences in speech reception related to individual
differences in cognitive ability? A survey of twenty experimental studies with normal and
hearing-impaired adults. International Journal of Audiology, 47(Suppl. 2), S53–S71.
doi:10.1080/14992020802301142
 Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the Time Course of
Spoken Word Recognition Using Eye Movements: Evidence for Continuous Mapping
Models. Journal of Memory and Language, 38(4), 419–439. doi:10.1006/jmla.1997.2558
 Altmann, G. T. M., & Kamide, Y. (1999). Incremental interpretation at verbs: restricting the
domain of subsequent reference. Cognition, 73(3), 247–64.
 Alvarado, C. G., & Woodcock, R. W. (2005). Comprehensive manual. Rolling Meadows, IL:
Riverside Publishing.
 Antón, E., Duñabeitia, J. A., Estévez, A., Hernández, J. A., Castillo, A., Fuentes, L. J., …
Carreiras, M. (2014). Is there a bilingual advantage in the ANT task? Evidence from
children. Frontiers in Psychology, 5(May), 1–12. doi:10.3389/fpsyg.2014.00398
 Antoniou, M., Tyler, M. D., & Best, C. T. (2012). Two ways to listen: Do L2-dominant
bilinguals perceive stop voicing according to language mode? Journal of Phonetics, 40(4),
582–594. doi:10.1016/j.wocn.2012.05.005
 Arlinger, S., Lunner, T., Lyxell, B., & Pichora-Fuller, M. K. (2009). The emergence of cognitive
hearing science. Scandinavian Journal of Psychology, 50(5), 371–84.
doi:10.1111/j.1467-9450.2009.00753.x
 Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed
random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412.
doi:10.1016/j.jml.2007.12.005
 Baddeley, A. D. (1992). Working memory. Science, 255(5044), 556–559.
 Baddeley, A. D. (2012). Working memory: theories, models, and controversies. Annual Review
of Psychology, 63, 1–29. doi:10.1146/annurev-psych-120710-100422
 Baddeley, A. D., Gathercole, S. E., & Papagno, C. (1998). The phonological loop as a language
learning device. Psychological Review, 105(1), 158–173. doi:10.1037//0033-295X.105.1.158
 Baddeley, A. D., & Hitch, G. J. (1974). Working memory. The Psychology of Learning and
Motivation, 8, 47–89.
 Balota, D., Yap, M. J., Cortese, M. J., Hutchison, K. A., Kessler, B., Loftis, B., … Treiman, R.
(2007). The English Lexicon Project. Behavior Research Methods, 39(3), 445–59.
 Barcroft, J., & Sommers, M. S. (2005). Effects of Acoustic Variability on Second Language
Vocabulary Learning. Studies in Second Language Acquisition, 27, 387–414.
doi:10.1017/S0272263105050175
 Bates, D. M., Maechler, M., Bolker, B., & Walker, S. (2014). lme4: Linear mixed-effects models
using Eigen and S4.
 Bernier, A., Carlson, S. M., & Whipple, N. (2010). From external regulation to self-regulation:
Early parenting precursors of young children's executive functioning. Child Development,
81(1), 326–339. doi:10.1111/j.1467-8624.2009.01397.x
 Bialystok, E., Craik, F. I. M., Klein, R. M., & Viswanathan, M. (2004). Bilingualism, aging, and
cognitive control: evidence from the Simon task. Psychology and Aging, 19(2), 290–303.
doi:10.1037/0882-7974.19.2.290
 Bialystok, E., Craik, F. I. M., & Luk, G. (2012). Bilingualism: consequences for mind and brain.
Trends in Cognitive Sciences, 16(4), 240–250. doi:10.1016/j.tics.2012.03.001
 Bialystok, E., Craik, F., & Luk, G. (2008). Cognitive control and lexical access in younger and
older bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition,
34(4), 859–73. doi:10.1037/0278-7393.34.4.859
 Bialystok, E., & Luk, G. (2012). Receptive vocabulary differences in monolingual and bilingual
adults. Bilingualism: Language and Cognition, 15(2), 397–401.
doi:10.1017/S136672891100040X
 Bialystok, E., Luk, G., Peets, K. F., & Yang, S. (2009). Receptive vocabulary differences in
monolingual and bilingual children. Bilingualism: Language and Cognition, 13(04), 525–531.
doi:10.1017/S1366728909990423
 Bilger, R., Nuetzel, J., Rabinowitz, W., & Rzeczkowski, C. (1984). Standardization of a test of
speech perception in noise. Journal of Speech and Hearing Research, 27, 32–48.
 Boersma, P., & Weenink, D. (2014). Praat: doing phonetics by computer.
 Bolger, D. J., Balass, M., Landen, E., & Perfetti, C. A. (2008). Context Variation and Definitions
in Learning the Meanings of Words: An Instance-Based Learning Approach. Discourse
Processes, 45(2), 122–159. doi:10.1080/01638530701792826
 Bradlow, A. R., & Alexander, J. A. (2007). Semantic and phonetic enhancements for
speech-in-noise recognition by native and non-native listeners. The Journal of the Acoustical
Society of America, 121(4), 2339–2349. doi:10.1121/1.2642103
 Bradlow, A. R., & Pisoni, D. B. (1999). Recognition of spoken words by native and non-native
listeners: Talker-, listener-, and item-related factors. Journal of the Acoustical Society of
America, 106(4), 2074–2085. doi:10.1121/1.427952
 Brännström, K. J., Zunic, E., Borovac, A., & Ibertsson, T. (2012). Acceptance of background
noise, working memory capacity, and auditory evoked potentials in subjects with normal
hearing. Journal of the American Academy of Audiology, 23(7), 542–52.
doi:10.3766/jaaa.23.7.6
 Brown, M., Salverda, A. P., Dilley, L. C., & Tanenhaus, M. K. (2011). Expectations from
preceding prosody influence segmentation in online sentence processing. Psychonomic
Bulletin & Review, 18(6), 1189–1196. doi:10.3758/s13423-011-0167-9
 Brysbaert, M., & New, B. (2009). Moving beyond Kucera and Francis: a critical evaluation of
current word frequency norms and the introduction of a new and improved word frequency
measure for American English. Behavior Research Methods, 41(4), 977–90.
doi:10.3758/BRM.41.4.977
 Calandruccio, L., & Zhou, H. (2013). Increase in speech recognition due to linguistic mismatch
between target and masker speech: Monolingual and simultaneous bilingual performance.
Journal of Speech, Language, and Hearing Research, 57, 1089–1097.
 Capps, R., Fix, M., Murray, J., Ost, J., Passel, J., & Herwantoro, S. (2005). The new demography
of America's schools: Immigration and the No Child Left Behind Act. Urban Institute.
Washington, D.C.
 Charles-Luce, J., & Luce, P. A. (1990). Similarity neighbourhoods of words in young children's
lexicons. Journal of Child Language, 17(01), 205–215. doi:10.1017/S0305000900013180
 Chasiotis, A., Kiessling, F., Hofer, J., & Campos, D. (2006). Theory of mind and inhibitory
control in three cultures: Conflict inhibition predicts false belief understanding in Germany,
Costa Rica and Cameroon. International Journal of Behavioral Development, 30(3), 249–260.
doi:10.1177/0165025406066759
 Chateau, D., & Jared, D. (2000). Exposure to print and word recognition processes. Memory &
Cognition, 28(1), 143–53. doi:10.3758/BF03211582
 168  Coady, J. a., & Aslin, R. N. (2003). Phonological neighbourhoods in the developing lexicon. 
Journal of Child Language
, 30(2), 441Ð469. doi:10.1017/
S0305000903005579
Conway, A. R. A., & Engle, R. W. (1996). Individual Differences in Working Memory Capacity: More Evidence for a General Capacity Theory. Memory, 4(6), 577–590.
Conway, A. R. A., Kane, M. J., Bunting, M. F., Hambrick, D. Z., Wilhelm, O., & Engle, R. W. (2005). Working memory span tasks: A methodological review and user’s guide. Psychonomic Bulletin & Review, 12(5), 769–786.
Corrigall, K. A., & Schellenberg, E. G. (2015). Predicting who takes music lessons: parent and child characteristics. Frontiers in Psychology, 6(282), 1–8. doi:10.3389/fpsyg.2015.00282
Costa, A., Hernández, M., & Sebastián-Gallés, N. (2008). Bilingualism aids conflict resolution: evidence from the ANT task. Cognition, 106(1), 59–86. doi:10.1016/j.cognition.2006.12.013
Cowan, N. (1993). Activation, attention, and short-term memory. Memory & Cognition, 21(2), 162–7.
Cowan, N. (1999). An embedded-processes model of working memory. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 62–101). Cambridge University Press.
Crandell, C., & Smaldino, J. (1996). Speech perception in noise by children for whom English is a second language. American Journal of Audiology, 5, 47–51.
Cutler, A. (2012). Native listening: language experience and the recognition of spoken words. Cambridge, MA: The MIT Press.
Cutler, A., Garcia Lecumberri, M. L., & Cooke, M. (2008). Consonant identification in noise by native and non-native listeners: effects of local context. The Journal of the Acoustical Society of America, 124(2), 1264–8. doi:10.1121/1.2946707
Cutler, A., Weber, A., Smits, R., & Cooper, N. (2004). Patterns of English phoneme confusions by native and non-native listeners. The Journal of the Acoustical Society of America, 116(6), 3668. doi:10.1121/1.1810292
Dahan, D., & Magnuson, J. S. (2006). Spoken Word Recognition. In M. Traxler & M. Gernsbacher (Eds.), Handbook of psycholinguistics (2nd ed., pp. 249–283). Amsterdam, NL: Academic Press.
Dahan, D., Magnuson, J. S., & Tanenhaus, M. K. (2001). Time course of frequency effects in spoken-word recognition: evidence from eye movements. Cognitive Psychology, 42(4), 317–67. doi:10.1006/cogp.2001.0750
Daneman, M., & Carpenter, P. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19(4), 450–466.
De Bruin, A., Treccani, B., & Della Sala, S. (2015). Cognitive Advantage in Bilingualism: An Example of Publication Bias? Psychological Science, 26(1), 99–107. doi:10.1177/0956797614557866
Delcenserie, A., & Genesee, F. (2013). Language and memory abilities of internationally adopted children from China: evidence for early age effects. Journal of Child Language, 41(6), 1195–1223. doi:10.1017/S030500091300041X
Delgado, P., Guerrero, G., Goggin, J. P., & Ellis, B. B. (1999). Self-Assessment of Linguistic Skills by Bilingual Hispanics. Hispanic Journal of Behavioral Sciences, 21(1), 31–46. doi:10.1177/0739986399211003
Diehl, R. L., Lotto, A. J., & Holt, L. L. (2004). Speech perception. Annual Review of Psychology, 55, 149–79. doi:10.1146/annurev.psych.55.090902.142028
Diependaele, K., Lemhöfer, K., & Brysbaert, M. (2013). The word frequency effect in first- and second-language word recognition: A lexical entrenchment account. Quarterly Journal of Experimental Psychology, 66(5), 843–863. doi:10.1080/17470218.2012.720994
Dilley, L. C., & McAuley, J. D. (2008). Distal prosodic context affects word segmentation and lexical processing. Journal of Memory and Language, 59(3), 294–311. doi:10.1016/j.jml.2008.06.006
Dilley, L. C., & Pitt, M. A. (2010). Altering context speech rate can cause words to appear or disappear. Psychological Science, 21(11), 1664–1670. doi:10.1177/0956797610384743
Duñabeitia, J. A., Hernández, J. A., Antón, E., Macizo, P., Estévez, A., Fuentes, L. J., & Carreiras, M. (2014). The inhibitory advantage in bilingual children revisited: Myth or reality? Experimental Psychology, 61, 234–251. doi:10.1027/1618-3169/a000243
Edwards, J. R., Beckman, M. E., & Munson, B. (2004). The Interaction between Vocabulary Size and Phonotactic Probability Effects on Children’s Production Accuracy and Fluency in Nonword Repetition. Journal of Speech, Language, and Hearing Research, 47, 421–436.
Elman, J. L., Diehl, R. L., & Buchwald, S. (1977). Perceptual switching in bilinguals. The Journal of the Acoustical Society of America, 62(4), 971–974. doi:10.1121/1.381591
Ernestus, M., & Warner, N. (2011). An introduction to reduced pronunciation variants. Journal of Phonetics, 39(3), 253–260. doi:10.1016/S0095-4470(11)00055-6
Farkas, G., & Beron, K. (2004). The detailed age trajectory of oral vocabulary knowledge: differences by class and race. Social Science Research, 33(3), 464–497. doi:10.1016/j.ssresearch.2003.08.001
Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (Vol. 92, pp. 233–277). Timonium, MD: York Press. doi:10.1111/j.1600-0404.1995.tb01710.x
Flege, J. E., & Eefting, W. (1986). Linguistic and developmental effects on the production and perception of stop consonants. Phonetica, 43, 155–171.
Flege, J. E., & Eefting, W. (1987). Production and perception of English stops by native Spanish speakers. Journal of Phonetics, 15, 67–83.
Flege, J. E., Yeni-Komshian, G. H., & Liu, S. (1999). Age Constraints on Second-Language Acquisition. Journal of Memory and Language, 41(1), 78–104. doi:10.1006/jmla.1999.2638
Garcia Lecumberri, M. L., & Cooke, M. (2006). Effect of masker type on native and non-native consonant perception in noise. The Journal of the Acoustical Society of America, 119(4), 2445–2454. doi:10.1121/1.2180210
Garcia-Sierra, A., Diehl, R. L., & Champlin, C. (2009). Testing the double phonemic boundary in bilinguals. Speech Communication, 51(4), 369–378. doi:10.1016/j.specom.2008.11.005
Gasquoine, P. G., & Dayanira Gonzales, C. (2012). Using Monolingual Neuropsychological Test Norms with Bilingual Hispanic Americans: Application of an Individual Comparison Standard. Archives of Clinical Neuropsychology, 27(3), 268–276. doi:10.1093/arclin/acs004
Gathercole, S. E., & Baddeley, A. D. (1989). Evaluation of the role of phonological STM in the development of vocabulary in children: A longitudinal study. Journal of Memory and Language, 28(2), 200–213. doi:10.1016/0749-596X(89)90044-2
Gathercole, V. C. M., Thomas, E. M., Kennedy, I., Prys, C., Young, N., Viñas Guasch, N., … Jones, L. (2014). Does language dominance affect cognitive performance in bilinguals? Lifespan evidence from preschoolers through older adults on card sorting, Simon, and metalinguistic tasks. Frontiers in Psychology, 5(11), 1–14. doi:10.3389/fpsyg.2014.00011
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. New York, NY: Cambridge University Press.
Gibson, T. A., Oller, D. K., Jarmulowicz, L., & Ethington, C. A. (2012). The receptive-expressive gap in the vocabulary of young second-language learners: Robustness and possible mechanisms. Bilingualism: Language and Cognition, 15(1), 102–116. doi:10.1017/S1366728910000490
Gibson, T. A., Peña, E. D., & Bedore, L. M. (2014). The relation between language experience and receptive-expressive semantic gaps in bilingual children. International Journal of Bilingual Education and Bilingualism, 17(1), 90–110. doi:10.1080/13670050.2012.743960
Giraud, A.-L., & Poeppel, D. (2012a). Cortical oscillations and speech processing: emerging computational principles and operations. Nature Neuroscience, 15(4), 511–7. doi:10.1038/nn.3063
Giraud, A.-L., & Poeppel, D. (2012b). Introduction: Terminology and concepts. In D. Poeppel, T. Overath, A. N. Popper, & R. R. Fay (Eds.), The Human Auditory Cortex (Vol. 43, pp. 225–260). New York, NY: Springer New York. doi:10.1007/978-1-4614-2314-0
Goldinger, S. D. (1996). Words and voices: episodic traces in spoken word identification and recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(5), 1166–83.
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105(2), 251–279. doi:10.1037/0033-295X.105.2.251
Gollan, T. H., & Acenas, L.-A. R. (2004). What is a TOT? Cognate and translation effects on tip-of-the-tongue states in Spanish-English and Tagalog-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(1), 246–69. doi:10.1037/0278-7393.30.1.246
Gollan, T. H., Montoya, R. I., Cera, C., & Sandoval, T. C. (2008). More use almost always means a smaller frequency effect: Aging, bilingualism, and the weaker links hypothesis. Journal of Memory and Language, 58(3), 787–814. doi:10.1016/j.jml.2007.07.001
Gollan, T. H., Montoya, R. I., Fennema-Notestine, C., & Morris, S. K. (2005). Bilingualism affects picture naming but not picture classification. Memory & Cognition, 33(7), 1220–34.
Gollan, T. H., Montoya, R. I., & Werner, G. A. (2002). Semantic and letter fluency in Spanish-English bilinguals. Neuropsychology, 16(4), 562–576. doi:10.1037//0894-4105.16.4.562
Gollan, T. H., Salmon, D. P., Montoya, R. I., & Galasko, D. R. (2011). Degree of bilingualism predicts age of diagnosis of Alzheimer’s disease in low-education but not in highly educated Hispanics. Neuropsychologia, 49(14), 3826–30. doi:10.1016/j.neuropsychologia.2011.09.041
Gollan, T. H., & Silverberg, N. B. (2001). Tip-of-the-tongue states in Hebrew-English bilinguals. Bilingualism: Language and Cognition, 4(1), 63–83.
Gollan, T. H., Slattery, T. J., Goldenberg, D., Van Assche, E., Duyck, W., & Rayner, K. (2011). Frequency drives lexical access in reading but not in speaking: the frequency-lag hypothesis. Journal of Experimental Psychology: General, 140(2), 186–209. doi:10.1037/a0022256
Gollan, T. H., Starr, J., & Ferreira, V. S. (2014). More than use it or lose it: The number-of-speakers effect on heritage language proficiency. Psychonomic Bulletin & Review. doi:10.3758/s13423-014-0649-7
Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1(02), 67–81.
Green, K., Kuhl, P. K., Meltzoff, A. N., & Stevens, E. (1991). Integrating speech information across talkers, gender, and sensory modality: female faces and male voices in the McGurk effect. Perception & Psychophysics, 50(6), 524–536. doi:10.3758/BF03207536
Grosjean, F. (1980). Spoken word recognition processes and the gating paradigm. Perception & Psychophysics, 28(4), 267–283.
Grosjean, F. (2001). A bilingual’s language modes. In J. L. Nicol (Ed.), One Mind, Two Languages: Bilingual Language Processing (pp. 1–22). Malden, MA: Blackwell Publishers.
Grosjean, F. (2008). Studying bilinguals. Oxford University Press.
Gupta, P., & Tisdale, J. (2009). Does phonological short-term memory causally determine vocabulary learning? Toward a computational resolution of the debate. Journal of Memory and Language, 61(4), 481–502. doi:10.1016/j.jml.2009.08.001
Gutiérrez-Clellen, V. F., Calderón, J., & Ellis Weismer, S. (2004). Verbal working memory in bilingual children. Journal of Speech, Language, and Hearing Research, 47(4), 863–76. doi:10.1044/1092-4388(2004/064)
Hammer, C. S., Komaroff, E., Rodriguez, B. L., Lopez, L. M., Scarpino, S. E., & Goldstein, B. (2012). Predicting Spanish–English Bilingual Children’s Language Abilities. Journal of Speech, Language, and Hearing Research, 55, 1251–1264. doi:10.1044/1092-4388(2012/11-0016)
Hardison, D. M. (2003). Acquisition of second-language speech: Effects of visual cues, context, and talker variability. Applied Psycholinguistics, 24, 495–522. doi:10.1017/S0142716403000250
Hardison, D. M. (2012). Second language speech perception: A cross-disciplinary perspective on challenges and accomplishments. In The Routledge handbook of second language acquisition (pp. 349–363).
Hart, B., & Risley, T. R. (1995). Meaningful differences in the everyday experience of young American children. Paul H Brookes Publishing.
Hilchey, M. D., & Klein, R. M. (2011). Are there bilingual advantages on nonlinguistic interference tasks? Implications for the plasticity of executive control processes. Psychonomic Bulletin & Review, 18(4), 625–58. doi:10.3758/s13423-011-0116-7
Hintzman, D. L. (1986). “Schema abstraction” in a multiple-trace memory model. Psychological Review, 93(4), 411–428. doi:10.1037/0033-295X.93.4.411
Hoff, E. (2003). The specificity of environmental influence: Socioeconomic status affects early vocabulary development via maternal speech. Child Development, 74(5), 1368–1378.
Hoff, E. (2006). How social contexts support and shape language development. Developmental Review, 26, 55–88. doi:10.1016/j.dr.2005.11.002
Hoff, E., Core, C., Place, S., Rumiche, R., Señor, M., & Parra, M. (2012). Dual language exposure and early bilingual development. Journal of Child Language, 39(1), 1–27. doi:10.1017/S0305000910000759
Holt, L. L., & Lotto, A. J. (2008). Speech Perception Within an Auditory Cognitive Science Framework. Current Directions in Psychological Science, 17(1), 42–46.
Howes, D. (1957). On the Relation between the Intelligibility and Frequency of Occurrence of English Words. The Journal of the Acoustical Society of America, 29(2), 296–305.
Hulme, C., Maughan, S., & Brown, G. D. A. (1991). Memory for familiar and unfamiliar words: Evidence for a long-term memory contribution to short-term memory span. Journal of Memory and Language, 30(6), 685–701. doi:10.1016/0749-596X(91)90032-F
Hulme, C., Roodenrys, S., Schweickert, R., Brown, G. D. A., Martin, S., & Stuart, G. (1997). Word-Frequency Effects on Short-Term Memory Tasks: Evidence for a Redintegration Process in Immediate Serial Recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(5), 1217–1232.
Hurtado, N., Grüter, T., Marchman, V. A., & Fernald, A. (2013). Relative language exposure, processing efficiency and vocabulary in Spanish–English bilingual toddlers. Bilingualism: Language and Cognition, 1–14. doi:10.1017/S136672891300014X
Huttenlocher, J., & Haight, W. (1991). Early vocabulary growth: Relation to language input and gender. Developmental Psychology, 27(2), 236–248.
Imai, S., Walley, A. C., & Flege, J. E. (2005). Lexical frequency and neighborhood density effects on the recognition of native and Spanish-accented words by native English and Spanish listeners. The Journal of the Acoustical Society of America, 117(2), 896. doi:10.1121/1.1823291
Ivanova, I., & Costa, A. (2008). Does bilingualism hamper lexical access in speech production? Acta Psychologica, 127(2), 277–88. doi:10.1016/j.actpsy.2007.06.003
Jaeger, T. F. (2008). Categorical Data Analysis: Away from ANOVAs (transformation or not) and towards Logit Mixed Models. Journal of Memory and Language, 59(4), 434–446. doi:10.1016/j.jml.2007.11.007
Johnson, J. S., & Newport, E. L. (1989). Critical period effects in second language learning: the influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21(1), 60–99.
Ju, M., & Luce, P. A. (2004). Falling on sensitive ears: Constraints on Bilingual Lexical Activation. Psychological Science, 15(5), 314–318.
Kağitçibasi, Ç. (1996). The Autonomous-Relational Self: A New Synthesis. European Psychologist, 1(3), 180–186.
Kane, M. J., Hambrick, D. Z., Tuholski, S. W., Wilhelm, O., Payne, T. W., & Engle, R. W. (2004). The generality of working memory capacity: a latent-variable approach to verbal and visuospatial memory span and reasoning. Journal of Experimental Psychology: General, 133(2), 189–217. doi:10.1037/0096-3445.133.2.189
Kavé, G., Knafo, A., & Gilboa, A. (2010). The rise and fall of word retrieval across the lifespan. Psychology and Aging, 25(3), 719–724. doi:10.1037/a0018927
Keuleers, E., Diependaele, K., & Brysbaert, M. (2010). Practice effects in large-scale visual word recognition studies: a lexical decision study on 14,000 Dutch mono- and disyllabic words and nonwords. Frontiers in Psychology, 1(174), 1–15. doi:10.3389/fpsyg.2010.00174
Killion, M., Niquette, P., Gudmundsen, G., Revit, L., & Banerjee, S. (2004). Development of a quick speech-in-noise test for measuring signal-to-noise ratio loss in normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of America, 116(4), 2395–2405.
Kilman, L., Zekveld, A., Hällgren, M., & Rönnberg, J. (2014). The influence of non-native language proficiency on speech perception performance. Frontiers in Psychology, 5(July), 651. doi:10.3389/fpsyg.2014.00651
Klein, R. M. (2015). Is there a benefit of bilingualism for executive functioning? Bilingualism: Language and Cognition, 18(1), 29–31. doi:10.1017/S1366728914000613
Kousaie, S., & Phillips, N. A. (2011). Ageing and bilingualism: Absence of a “bilingual advantage” in Stroop interference in a nonimmigrant sample. The Quarterly Journal of Experimental Psychology, 65(2), 356–369.
Kucera, H., & Francis, N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
Kuhl, P., Williams, K., Lacerda, F., Stevens, K., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 255, 606–608.
Kuperman, V., & Van Dyke, J. A. (2013). Reassessing Word Frequency as a Determinant of Word Recognition for Skilled and Unskilled Readers. Journal of Experimental Psychology: Human Perception and Performance. Advance online publication. doi:10.1037/a0030859
Lewellen, M., Goldinger, S. D., Pisoni, D. B., & Greene, B. (1993). Lexical familiarity and processing efficiency: Individual differences in naming, lexical decision, and semantic categorization. Journal of Experimental Psychology: General, 122(3), 316–330.
Liberman, A., & Mattingly, I. (1985). The motor theory of speech perception revised. Cognition, 21(1), 1–36.
Liberman, A., & Mattingly, I. (1989). A specialization for speech perception. Science, 243, 489–494.
Ljung, R., Israelsson, K., & Hygge, S. (2012). Speech Intelligibility and Recall of Spoken Material Heard at Different Signal-to-noise Ratios and the Role Played by Working Memory Capacity. Applied Cognitive Psychology, 27, 198–203.
Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19(1), 1–36.
Luo, L., Craik, F. I. M., Moreno, S., & Bialystok, E. (2013). Bilingualism interacts with domain in a working memory task: evidence from aging. Psychology and Aging, 28(1), 28–34. doi:10.1037/a0030875
MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working memory: Comment on Just and Carpenter (1992) and Waters and Caplan (1996). Psychological Review, 109(1), 35–54. doi:10.1037//0033-295X.109.1.35
Magnuson, J. S., Tanenhaus, M. K., Aslin, R. N., & Dahan, D. (2003). The time course of spoken word learning and recognition: Studies with artificial lexicons. Journal of Experimental Psychology: General, 132(2), 202–227. doi:10.1037/0096-3445.132.2.202
Majerus, S., Van der Linden, M., Mulder, L., Meulemans, T., & Peters, F. (2004). Verbal short-term memory reflects the sublexical organization of the phonological language network: Evidence from an incidental phonotactic learning paradigm. Journal of Memory and Language, 51(2), 297–306. doi:10.1016/j.jml.2004.05.002
Marian, V., Blumenfeld, H. K., & Kaushanskaya, M. (2007). The Language Experience and Proficiency Questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals. Journal of Speech, Language, and Hearing Research, 50, 940–967.
Marslen-Wilson, W. (1987). Functional parallelism in spoken word-recognition. Cognition, 25, 71–102.
Martin, C. D., Thierry, G., Kuipers, J.-R., Boutonnet, B., Foucart, A., & Costa, A. (2013). Bilinguals reading in their second language do not predict upcoming words as native readers do. Journal of Memory and Language, 69(4), 574–588. doi:10.1016/j.jml.2013.08.001
Mattys, S. L., Brooks, J., & Cooke, M. (2009). Recognizing speech under a processing load: Dissociating energetic from informational factors. Cognitive Psychology, 59(3), 203–243. doi:10.1016/j.cogpsych.2009.04.001
Mattys, S. L., Carroll, L. M., Li, C. K. W., & Chan, S. L. Y. (2010). Effects of energetic and informational masking on speech segmentation by native and non-native speakers. Speech Communication, 52(11-12), 887–899. doi:10.1016/j.specom.2010.01.005
Mattys, S. L., Davis, M. H., Bradlow, A. R., & Scott, S. K. (2012). Speech recognition in adverse conditions: A review. Language and Cognitive Processes, 27(7-8), 953–978.
Mattys, S. L., White, L., & Melhorn, J. F. (2005). Integration of multiple speech segmentation cues: a hierarchical framework. Journal of Experimental Psychology: General, 134(4), 477–500. doi:10.1037/0096-3445.134.4.477
Mattys, S. L., & Wiget, L. (2011). Effects of cognitive load on speech recognition. Journal of Memory and Language, 65(2), 145–160. doi:10.1016/j.jml.2011.04.004
Maye, J., Werker, J. F., & Gerken, L. (2002). Infant sensitivity to distributional information can affect phonetic discrimination. Cognition, 82(3), B101–11.
Mayo, L. H., Florentine, M., & Buus, S. (1997). Age of second-language acquisition and perception of speech in noise. Journal of Speech, Language, and Hearing Research, 40(3), 686–93.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18(1), 1–86.
McQueen, J. M. (2007). Eight questions about spoken-word recognition. In M. G. Gaskell (Ed.), The Oxford handbook of psycholinguistics (pp. 37–53). Oxford: Oxford University Press.
Meador, D., Flege, J. E., & Mackay, R. (2000). Factors affecting the recognition of words in a second language. Bilingualism: Language and Cognition, 3, 55–67.
Mercier, J., Pivneva, I., & Titone, D. (2013). Individual differences in inhibitory control relate to bilingual spoken word processing. Bilingualism: Language and Cognition, 1–29. doi:10.1017/S1366728913000084
Metsala, J. L., & Walley, A. (1998). Spoken vocabulary growth and the segmental restructuring of lexical representations: Precursors to phonemic awareness and early reading ability. In J. L. Metsala & L. C. Ehri (Eds.), Word recognition in beginning literacy (pp. 89–120). Mahwah, NJ: Lawrence Erlbaum Associates.
Miyake, A., & Friedman, N. P. (2012). The Nature and Organization of Individual Differences in Executive Functions: Four General Conclusions. Current Directions in Psychological Science, 21(1), 8–14. doi:10.1177/0963721411429458
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The Unity and Diversity of Executive Functions and Their Contributions to Complex “Frontal Lobe” Tasks: A Latent Variable Analysis. Cognitive Psychology, 41, 49–100. doi:10.1006/cogp.1999.0734
Monsell, S. (1991). The nature and locus of word frequency effects. In Basic processes in reading: Visual word recognition (pp. 148–197).
Moreno, E. M., & Kutas, M. (2005). Processing semantic anomalies in two languages: An electrophysiological exploration in both languages of Spanish-English bilinguals. Cognitive Brain Research, 22(2), 205–220. doi:10.1016/j.cogbrainres.2004.08.010
Morton, J. B., & Harper, S. N. (2007). What did Simon say? Revisiting the bilingual advantage. Developmental Science, 10(6), 719–26. doi:10.1111/j.1467-7687.2007.00623.x
Murray, W. S., & Forster, K. I. (2004). Serial mechanisms in lexical access: the rank hypothesis. Psychological Review, 111(3), 721–56. doi:10.1037/0033-295X.111.3.721
Newman, A. J., Tremblay, A., Nichols, E. S., Neville, H. J., & Ullman, M. T. (2012). The Influence of Language Proficiency on Lexical Semantic Processing in Native and Late Learners of English. Journal of Cognitive Neuroscience, 24(5), 1205–1223. doi:10.1162/jocn_a_00143
Norris, D. (1994). Shortlist: a connectionist model of continuous speech recognition. Cognition, 52, 189–234.
Norris, D., & McQueen, J. M. (2008). Shortlist B: a Bayesian model of continuous speech recognition. Psychological Review, 115(2), 357–95. doi:10.1037/0033-295X.115.2.357
Obleser, J., & Eisner, F. (2009). Pre-lexical abstraction of speech in the auditory cortex. Trends in Cognitive Sciences, 13(1), 14–9. doi:10.1016/j.tics.2008.09.005
Obleser, J., Eisner, F., & Kotz, S. A. (2008). Bilateral speech comprehension reflects differential sensitivity to spectral and temporal features. The Journal of Neuroscience, 28(32), 8116–23. doi:10.1523/JNEUROSCI.1290-08.2008
Obleser, J., Wöstmann, M., Hellbernd, N., Wilsch, A., & Maess, B. (2012). Adverse listening conditions and memory load drive a common α oscillatory network. The Journal of Neuroscience, 32(36), 12376–83. doi:10.1523/JNEUROSCI.4908-11.2012
Oh, S., & Lewis, C. (2008). Korean preschoolers’ advanced inhibitory control and its relation to other executive skills and mental state understanding. Child Development, 79(1), 80–99. doi:10.1111/j.1467-8624.2007.01112.x
Oldfield, R., & Wingfield, A. (1965). Response latencies in naming objects. Quarterly Journal of Experimental Psychology, 17(4), 273–281.
Paap, K. R. (2015). Do many hones dull the bilingual whetstone? Bilingualism: Language and Cognition, 18(1), 41–42. doi:10.1017/S1366728914000431
Paap, K. R., & Greenberg, Z. I. (2013). There is no coherent evidence for a bilingual advantage in executive processing. Cognitive Psychology, 66(2), 232–58. doi:10.1016/j.cogpsych.2012.12.002
Paradis, J., Genesee, F., & Crago, M. B. (2011). Dual language development and disorders: a handbook on bilingualism and second language learning. Baltimore, MD: Paul H. Brookes Pub. Co.
Parra, M., Hoff, E., & Core, C. (2011). Relations among language exposure, phonological memory, and language development in Spanish-English bilingually developing 2-year-olds. Journal of Experimental Child Psychology, 108(1), 113–25. doi:10.1016/j.jecp.2010.07.011
Perfetti, C. A. (2007). Reading Ability: Lexical quality to comprehension. Scientific Studies of Reading, 11(4), 357–383.
Perfetti, C. A., & Hart, L. (2002). The Lexical quality hypothesis. In L. Verhoeven, C. Elbro, & P. Reitsma (Eds.), Precursors of functional literacy (pp. 189–213). Amsterdam/Philadelphia: John Benjamins Publishing Company.
Peyton, J. K., Ranard, D. A., & McGinnis, S. (2001). Heritage Languages in America: Preserving a National Resource. McHenry, IL: Delta Systems and Center for Applied Linguistics.
Pichora-Fuller, M. K., Schneider, B., & Daneman, M. (1995). How young and old adults listen to and remember speech in noise. The Journal of the Acoustical Society of America, 97(1), 593–608.
Pickering, M. J., & Garrod, S. (2007). Do people use language production to make predictions during comprehension? Trends in Cognitive Sciences, 11(3), 105–110. doi:10.1016/j.tics.2006.12.002
Pierrehumbert, J. B. (2001). Exemplar dynamics: Word frequency, lenition and contrast. In J. Bybee & P. Hopper (Eds.), Frequency and the emergence of linguistic structure (pp. 137–158). Amsterdam, NL: John Benjamins.
Pierrehumbert, J. B. (2003). Phonetic Diversity, Statistical Learning, and Acquisition of Phonology. Language and Speech (Vol. 46). doi:10.1177/00238309030460020501
Piquado, T., Cousins, K. A. Q., Wingfield, A., & Miller, P. (2010). Effects of degraded sensory input on memory for speech: behavioral data and a test of biologically constrained computational models. Brain Research, 1365, 48–65. doi:10.1016/j.brainres.2010.09.070
Pitt, M. A., Dilley, L., & Tat, M. (2011). Exploring the role of exposure frequency in recognizing pronunciation variants. Journal of Phonetics, 39(3), 304–311. doi:10.1016/j.wocn.2010.07.004
Pivneva, I., Palmer, C., & Titone, D. (2012). Inhibitory control and L2 proficiency modulate bilingual language production: evidence from spontaneous monologue and dialogue speech. Frontiers in Psychology, 3(1-18), 57. doi:10.3389/fpsyg.2012.00057
Place, S., & Hoff, E. (2011). Properties of Dual Language Exposure That Influence 2-Year-Olds’ Bilingual Proficiency. Child Development, 82(6), 1834–1849. doi:10.1111/j.1467-8624.2011.01660.x
Portocarrero, J. S., Burright, R. G., & Donovick, P. J. (2007). Vocabulary and verbal fluency of bilingual and monolingual college students. Archives of Clinical Neuropsychology, 22(3), 415–22. doi:10.1016/j.acn.2007.01.015
R Core Team. (2014). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
Ratiu, I., & Azuma, T. (2015). Working memory capacity: Is there a bilingual advantage? Journal of Cognitive Psychology, 27(1), 1–11. doi:10.1080/20445911.2014.976226
Rimikis, S., Smiljanic, R., & Calandruccio, L. (2013). Nonnative English Speaker Performance on the Basic English Lexicon (BEL) Sentences. Journal of Speech, Language, and Hearing Research, 56, 792–805. doi:10.1044/1092-4388(2012/12-0178)
Rogers, C. L., Lister, J. J., Febo, D. M., Besing, J. M., & Abrams, H. B. (2006). Effects of bilingualism, noise, and reverberation on speech perception by listeners with normal hearing. Applied Psycholinguistics, 27(03), 465–485. doi:10.1017/S014271640606036X
Rönnberg, J., Lunner, T., Zekveld, A. A., Sörqvist, P., Danielsson, H., Lyxell, B., … Rudner, M. (2013). The Ease of Language Understanding (ELU) model: theoretical, empirical, and clinical advances. Frontiers in Systems Neuroscience, 7, 1–31. doi:10.3389/fnsys.2013.00031
Roodenrys, S., Hulme, C., Alban, J., Ellis, A. W., & Brown, G. D. A. (1994). Effects of word frequency and age of acquisition on short-term memory span. Memory & Cognition, 22(6), 695–701. doi:10.3758/BF03209254
Rosenhouse, J., Haik, L., & Kishon-Rabin, L. (2006). Speech perception in adverse listening conditions in Arabic-Hebrew bilinguals. International Journal of Bilingualism, 10(2), 119–135. doi:10.1177/13670069060100020101
Rost, G. C., & McMurray, B. (2009). Speaker variability augments phonological processing in early word learning. Developmental Science, 12, 339–349. doi:10.1111/j.1467-7687.2008.00786.x
Rubenstein, H., & Pollack, I. (1963). Word predictability and intelligibility. Journal of Verbal Learning and Verbal Behavior, 2(2), 147–158. doi:10.1016/S0022-5371(63)80079-1
Sabbagh, M. A., Xu, F., Carlson, S. M., Moses, L. J., & Lee, K. (2006). The Development of Executive Functioning and Theory of Mind. Psychological Science, 17(1), 74–81.
Salthouse, T. A. (1996). The processing-speed theory of adult age differences in cognition. Psychological Review, 103(3), 403–28.
Samuel, A. G., & Larraza, S. (2015). Does listening to non-native speech impair speech perception? Journal of Memory and Language, 81, 51–71. doi:10.1016/j.jml.2015.01.003
Schmidtke, J. (2014). Second language experience modulates word retrieval effort in bilinguals: Evidence from pupillometry. Frontiers in Psychology, 5(137). doi:10.3389/fpsyg.2014.00137
Schneider, B. A., Avivi-Reich, M., & Daneman, M. (2014). How age and linguistic competence alter the interplay of perceptual and cognitive factors when listening to conversations in a noisy environment. Frontiers in Systems Neuroscience, 8(February), 1–17. doi:10.3389/fnsys.2014.00021
Schrank, F., & Woodcock, R. W. (2005). WMLS-R scoring and reporting program [Computer software]. In Woodcock-Muñoz Language Survey-Revised. Rolling Meadows, IL: Riverside Publishing.
Sears, C. R., Siakaluk, P. D., Chow, V. C., & Buchanan, L. (2008). Is there an effect of print exposure on the word frequency effect and the neighborhood size effect? Journal of Psycholinguistic Research, 37(4), 269–91. doi:10.1007/s10936-008-9071-5
Sebastián-Gallés, N., Echeverría, S., & Bosch, L. (2005). The influence of initial exposure on lexical representation: Comparing early and simultaneous bilinguals. Journal of Memory and Language, 52(2), 240–255. doi:10.1016/j.jml.2004.11.001
Sebastián-Gallés, N., & Soto-Faraco, S. (1999). Online processing of native and non-native phonemic contrasts in early bilinguals. Cognition, 72(2), 111–23.
Service, E., Simola, M., Metsänheimo, O., & Maury, S. (2002). Bilingual working memory span is affected by language skill. European Journal of Cognitive Psychology, 14(3), 383–408.
Shannon, R. V., Jensvold, A., Padilla, M., Robert, M. E., & Wang, X. (1999). Consonant recordings for speech testing. The Journal of the Acoustical Society of America, 106(6), L71–4.
Shannon, R. V., Zeng, F. G., Kamath, V., Wygonski, J., & Ekelid, M. (1995). Speech recognition with primarily temporal cues. Science, 270(5234), 303–4.
Shi, L.-F. (2009). Normal-hearing English-as-a-second-language listeners’ recognition of English words in competing signals. International Journal of Audiology, 48, 260–270. doi:10.1080/14992020802607431
Shi, L.-F. (2010). Perception of acoustically degraded sentences in bilingual listeners who differ in age of English acquisition. Journal of Speech, Language, and Hearing Research, 53(4), 821–35. doi:10.1044/1092-4388(2010/09-0081)
Shi, L.-F. (2012). Contribution of Linguistic Variables to Bilingual Listeners’ Perception of Degraded English Sentences. Journal of Speech, Language, and Hearing Research, 55, 219–234. doi:10.1044/1092-4388(2011/10-0240)
Shi, L.-F., & Farooq, N. (2012). Linguistic and Attitudinal Factors in Normal-Hearing Bilingual Listeners’ Perception of Degraded English Passages. American Journal of Audiology, 21, 127–140. doi:10.1044/1059-0889(2012/11-0022)
Shi, L.-F., & Sánchez, D. (2010). Spanish/English bilingual listeners on clinical word recognition tests: what to expect and how to predict. Journal of Speech, Language, and Hearing Research, 53(5), 1096–1110. doi:10.1044/1092-4388(2010/09-0199)
Shi, L.-F., & Sánchez, D. (2011). The role of word familiarity in Spanish/English bilingual word recognition. International Journal of Audiology, 50(2), 66–76. doi:10.3109/14992027.2010.527862
Sommers, M. S., & Barcroft, J. (2011). Indexical information, encoding difficulty, and second language vocabulary learning. Applied Psycholinguistics, 32, 417–434. doi:10.1017/S0142716410000469
Sommers, M. S., & Danielson, S. M. (1999). Inhibitory processes and spoken word recognition in young and older adults: the interaction of lexical competition and semantic context. Psychology and Aging, 14(3), 458–72.
Sörqvist, P., Hurtig, A., Ljung, R., & Rönnberg, J. (2014). High second-language proficiency protects against the effects of reverberation on listening comprehension. Scandinavian Journal of Psychology, 55(2), 91–6. doi:10.1111/sjop.12115
Stager, C. L., & Werker, J. F. (1997). Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature, 388, 381–382. doi:10.1038/41102
Swingley, D. (2003). Phonetic Detail in the Developing Lexicon. Language and Speech, 46(2–3), 265–294. doi:10.1177/00238309030460021001
Swingley, D., & Aslin, R. (2002). Lexical neighborhoods and the word-form representations of 14-month-olds. Psychological Science, 13(5), 480–484.
Swingley, D., & Aslin, R. N. (2000). Spoken word recognition and lexical representation in very young children. Cognition, 76(2), 147–166. doi:10.1016/S0010-0277(00)00081-0
Swingley, D., & Aslin, R. N. (2007). Lexical competition in young children’s word learning. Cognitive Psychology, 54(2), 99–132. doi:10.1016/j.cogpsych.2006.05.001
Tabri, D., Smith Abou Chacra, M. K., & Pring, T. (2011). Speech perception in noise by monolingual, bilingual and trilingual listeners. International Journal of Language Communication Disorders, 46(4), 411–422. doi:10.3109/13682822.2010.519372
Tamati, T. N., Gilbert, J. L., & Pisoni, D. B. (2013). Some factors underlying individual differences in speech recognition on PRESTO: a first report. Journal of the American Academy of Audiology, 24(7), 616–634. doi:10.3766/jaaa.24.7.10
Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268(5217), 1632–1634.
Thorn, A. S. C., & Gathercole, S. E. (1999). Language-specific knowledge and short-term memory in bilingual and non-bilingual children. The Quarterly Journal of Experimental Psychology. A, Human Experimental Psychology, 52(2), 303–24. doi:10.1080/713755823
Titone, D., Pivneva, I., Sheikh, N. A., Webb, N., & Whitford, V. M. (2015). Doubling down on multifactorial approaches to the study of bilingualism & executive control. Bilingualism: Language and Cognition, 18(1), 43–44. doi:10.1017/S1366728914000595
Tsao, F.-M., Liu, H.-M., & Kuhl, P. K. (2004). Speech Perception in Infancy Predicts Language Development in the Second Year of Life: A Longitudinal Study. Child Development, 75(4), 1067–1084. doi:10.1111/j.1467-8624.2004.00726.x
Tulsky, D. S., Carlozzi, N., Chiaravalloti, N. D., Beaumont, J. L., Kisala, P. A., Mungas, D., … Gershon, R. (2014). NIH Toolbox Cognition Battery (NIHTB-CB): list sorting test to measure working memory. Journal of the International Neuropsychological Society, 20(6), 599–610. doi:10.1017/S135561771400040X
Vaden, K. I., Halpin, H., & Hickok, G. S. (2009). Irvine Phonotactic Online Dictionary, Version 2.0 [Data file]. Retrieved from http://www.iphod.com
Valian, V. (2014). Bilingualism and cognition. Bilingualism: Language and Cognition, 18(01), 3–24. doi:10.1017/S1366728914000522
Van Engen, K. J. (2010). Similarity and familiarity: Second language sentence recognition in first- and second-language multi-talker babble. Speech Communication, 52(11–12), 943–953. doi:10.1016/j.specom.2010.05.002
Varela, R. E., Vernberg, E. M., Sanchez-Sosa, J. J., Riveros, A., Mitchell, M., & Mashunkashey, J. (2004). Parenting style of Mexican, Mexican American, and Caucasian-non-Hispanic families: social context and cultural influences. Journal of Family Psychology: JFP: Journal of the Division of Family Psychology of the American Psychological Association (Division 43), 18(4), 651–657. doi:10.1037/0893-3200.18.4.651
Vitevitch, M. S., & Luce, P. A. (2004). A Web-based interface to calculate phonotactic probability for words and nonwords in English. Behavior Research Methods, Instruments, & Computers, 36(3), 481–487. doi:10.3758/BF03195594
Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751.
Von Hapsburg, D., & Peña, E. D. (2002). Understanding bilingualism and its impact on speech audiometry. Journal of Speech, Language, and Hearing Research, 45(1), 202–13. doi:10.1044/1092-4388(2002/015)
Walley, A. (2008). Speech Perception in Childhood. In D. B. Pisoni & R. Remez (Eds.), The handbook of speech perception (pp. 449–468). Blackwell Publishing Ltd.
Walley, A., Metsala, J. L., & Garlock, V. (2003). Spoken vocabulary growth: Its role in the development of phoneme awareness and early reading ability. Reading and Writing, 16, 5–20.
Weisleder, A., & Fernald, A. (2013). Talking to children matters: early language experience strengthens processing and builds vocabulary. Psychological Science, 24(11), 2143–52. doi:10.1177/0956797613488145
Weiss, D., & Dempsey, J. J. (2008). Performance of Bilingual Speakers on the English and Spanish Versions of the Hearing in Noise Test (HINT). Journal of the American Academy of Audiology, 19(1), 5–17. doi:10.3766/jaaa.19.1.2
Werker, J. F., Fennell, C. T., Corcoran, K. M., & Stager, C. L. (2002). Infants’ Ability to Learn Phonetically Similar Words: Effects of Age and Vocabulary Size. Infancy, 3(1), 1–30. doi:10.1207/15250000252828226
White, K. S., Yee, E., Blumstein, S. E., & Morgan, J. L. (2013). Adults show less sensitivity to phonetic detail in unfamiliar words, too. Journal of Memory and Language, 68(4), 362–378. doi:10.1016/j.jml.2013.01.003
Wild, C. J., Yusuf, A., Wilson, D. E., Peelle, J. E., Davis, M. H., & Johnsrude, I. S. (2012). Effortful listening: the processing of degraded speech depends critically on attention. The Journal of Neuroscience, 32(40), 14010–21. doi:10.1523/JNEUROSCI.1528-12.2012
Wilson, R. H., Abrams, H., & Pillion, A. (2003). A word-recognition task in multitalker babble using a descending presentation mode from 24 dB to 0 dB signal to babble. Journal of Rehabilitation Research and Development, 40(4), 321–328.
Wilson, R. H., Carnell, C. S., & Cleghorn, A. L. (2007). The Words-in-Noise (WIN) test with multitalker babble and speech-spectrum noise maskers. Journal of the American Academy of Audiology, 18(6), 522–529.
Wilson, R. H., McArdle, R., & Smith, S. L. (2007). An Evaluation of the BKB-SIN, HINT, QuickSIN, and WIN Materials on Listeners With Normal Hearing and Listeners With Hearing Loss. Journal of Speech, Language, and Hearing Research, 50(4), 844–56. doi:10.1044/1092-4388(2007/059)
Wingfield, A. (1996). Cognitive factors in auditory performance: context, speed of processing, and constraints of memory. Journal of the American Academy of Audiology, 7(3), 175–182.
Woodcock, R. W., Muñoz Sandoval, A. F., Ruef, M. L., & Alvarado, C. G. (2005). Woodcock-Muñoz Language Survey-Revised. Itasca, IL: Riverside Publishing.
Yap, M. J., Balota, D., Sibley, D., & Ratcliff, R. (2012). Individual Differences in Visual Word Recognition: Insights From the English Lexicon Project. Journal of Experimental Psychology: Human Perception and Performance, 38(1), 53–79. doi:10.1037/a0024177
Yoshida, K. A., Fennell, C. T., Swingley, D., & Werker, J. F. (2009). Fourteen-month-old infants learn similar-sounding words. Developmental Science, 12(3), 412–8. doi:10.1111/j.1467-7687.2008.00789.x
Zhang, Y.-X., Barry, J. G., Moore, D. R., & Amitay, S. (2012). A new test of attention in listening (TAIL) predicts auditory performance. PloS One, 7(12), e53502. doi:10.1371/journal.pone.0053502
Ziegler, J. C., Pech-Georgel, C., George, F., Alario, F.-X., & Lorenzi, C. (2005). Deficits in speech perception predict language learning impairment. Proceedings of the National Academy of Sciences of the United States of America, 102(39), 14110–5. doi:10.1073/pnas.0504446102
Ziegler, J. C., Pech-Georgel, C., George, F., & Lorenzi, C. (2009). Speech-perception-in-noise deficits in dyslexia. Developmental Science, 12(5), 732–45. doi:10.1111/j.1467-7687.2009.00817.x