EXPLORING THE PRODUCTION AND PERCEPTION OF SECOND LANGUAGE
FLUENCY: UTTERANCE, COGNITIVE, AND PERCEIVED FLUENCY
By
Ji Min Kahng

A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
Second Language Studies ‒ Doctor of Philosophy
2014

ABSTRACT
EXPLORING THE PRODUCTION AND PERCEPTION OF SECOND LANGUAGE
FLUENCY: UTTERANCE, COGNITIVE, AND PERCEIVED FLUENCY
By
Ji Min Kahng
Fluency is one of the most noticeable differences between native and nonnative speech
and constitutes an essential component of second language proficiency; however, the concept has
not been well understood by researchers. In order to deepen understanding of the
multidimensional construct of fluency, the current dissertation investigated the production and
perception of second language fluency from all three aspects—utterance, cognitive, and
perceived fluency.
Study 1 investigated utterance fluency and cognitive fluency of English speakers and
Korean learners of English by comparing temporal measures and stimulated recall responses.
The first language (L1) and second language (L2) speech were different in speed, length of run,
repairs, and silent pauses. In particular, a striking group difference was found in the frequency of
silent pauses within a clause, which is consistent with the claim that pauses within clauses reflect
processing difficulties in speech production such as lexical retrieval. Stimulated recall responses
showed that lower proficiency learners remembered more issues regarding L2 declarative
knowledge on grammar and vocabulary than higher proficiency learners, which was compatible
with the declarative/procedural model and studies on automaticity.
Study 2 examined the relationship between utterance fluency and perceived fluency using
two experiments. Experiment 1 investigated the relative contributions of frequency, length, and
distribution of silent pauses to perceived fluency of L2 speech. Experiment 2 tested causal
effects of pause location on perceived fluency of L1 and L2 speech. Findings of both Experiment
1 and Experiment 2 suggest a significant role of pause location in L2 perceived fluency. In
Experiment 1, pause distribution demonstrated the strongest correlation with fluency ratings, and
in Experiment 2, perceived fluency of L2 speech was influenced by pause location more than
that of L1 speech. The findings are consistent with L1 literature on pause phenomena which has
shown that silent pauses are one of the acoustic cues to clausal units, and silent pauses between
clauses can facilitate speech perception and recall, whereas pauses within clauses can interfere
with them in cognitively demanding contexts.
One of the most novel and important findings of the current dissertation is the close
relationship between L2 fluency and pauses within clauses. L1 and L2 speech exhibited a
striking difference in the frequency of pauses within clauses, which is considered to reflect
difficulties in speech production processing. Pauses within clauses also had a crucial impact on
perceived fluency of L2 speech.

To Jun

ACKNOWLEDGEMENTS

My deepest gratitude goes to my advisor, Dr. Debra Hardison. I am indebted to her
utmost trust in my potential and her constant encouragement throughout my Ph.D. study. Her
dedication to students will always be my inspiration.
I would like to thank my committee members. I am immensely grateful to Dr. Susan Gass
for both financial support and insightful feedback through various channels, such as practice
talks for conference presentations, annual report letters, and defenses. I am deeply thankful to Dr.
Patti Spinner for constructive comments on my second qualifying paper and the current
dissertation. Her comments allowed me to engage more critically with my research. Dr. Paula
Winke provided me with concrete advice, not only on research, but also on life. I truly admire
her genuine care for students.
Dr. Aline Godfroid also offered me thought-provoking comments and suggestions. I am
grateful for her support and contributions. I appreciate Dr. Karthik Durvasula’s inspirational
enthusiasm for research and teaching. In addition to the support of my mentors and colleagues,
my family and friends have always been there for me through this process, and their unwavering
encouragement helped me pull through challenging times. Most importantly, I thank my husband,
Jun, for believing in me, inspiring me to pursue my passion with courage, and taking this journey
with me.
I greatly appreciate the financial support I received for this dissertation, including
funding from the Second Language Studies program, a Dissertation Completion Fellowship from
the Graduate School of Michigan State University, and a Dissertation Grant from Language
Learning. Finally, I thank all the individuals whose participation made this work possible.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: INTRODUCTION AND REVIEW OF THE LITERATURE
    Introduction
    Utterance Fluency and Perceived Fluency
        Speed and Repair Fluency
        Breakdown Fluency
    Cognitive Fluency and Utterance Fluency
    Overview of Research Design
CHAPTER 2: STUDY 1
    Introduction
    Method
        Participants
        Tasks and Procedures
            Spontaneous Speech
            Stimulated Recall
    Utterance Fluency: Quantitative Study
        Analysis
            Transcribing and Marking Pauses and Repairs
            Calculating Temporal Variables
            Statistical Analysis
        Results
            Overall Utterance Fluency
            Pause Phenomena in Different Locations
            Correlation Between the Temporal Variables and Speaking Scores
    Cognitive Fluency: Qualitative Study
        Analysis
        Results
            Content of the Message
            Vocabulary
            Grammar
            Phonology and Pragmatics
            Other Issues
    Discussion
        Utterance Fluency
            Speed
            Length of Run
            Repairs
            Pause Phenomena
        Cognitive Fluency
    General Discussion: Study 1
CHAPTER 3: STUDY 2
    Introduction
    Experiment 1
        Method
            Raters
            Stimulus Description
            Procedure
            Acoustic Analysis of Speech Excerpts
            Statistical Analysis
        Results
        Discussion
    Experiment 2
        Method
            Raters
            Stimulus Description
            Procedure
            Analysis
        Results
        Discussion
    General Discussion: Study 2
CHAPTER 4: CONCLUSION
APPENDICES
    Appendix A: Questions for spontaneous speech
    Appendix B: English language learning background questionnaire in Study 1
    Appendix C: Instructions for the experiment
    Appendix D: Rater background questionnaire in Study 2
    Appendix E: An example of addition of pauses to a speech sample
    Appendix F: Example waveforms of the speech manipulations
REFERENCES

LIST OF TABLES

Table 1: Temporal measures used in the study
Table 2: Overall utterance fluency: Descriptive statistics and group differences
Table 3: Mean length of silent and filled pause in different locations (ms)
Table 4: Pearson correlations between utterance fluency measures and speaking scores
Table 5: Distribution of stimulated recall responses
Table 6: Summary of statistical analyses
Table 7: Number of silent and filled pauses per minute in different locations
Table 8: Descriptive statistics of pause phenomena and fluency ratings of L2 speech
Table 9: Correlations between the measures of pause phenomena and fluency ratings
Table 10: Results of a hierarchical multiple regression
Table 11: Results of a stepwise multiple regression
Table 12: A schematic representation of the 3 x 3 Latin Square design. No, B, and W represent
the No Pause, Pauses Between Clauses, and Pauses Within Clauses conditions, respectively.

LIST OF FIGURES

Figure 1: Segalowitz’s (2010) model of the L2 speaker. Adapted from Levelt’s (1999) “blueprint”
of the monolingual speaker. The {f} symbols refer to fluency vulnerability points. (Figure used
with permission of Taylor and Francis Group LLC Books)
Figure 2: Distribution of speaking and pause time
Figure 3: Pause rates for silent and filled pauses in different locations (means and standard errors)
Figure 4: Mean and standard error z-scores of fluency ratings of L1 and L2 speech
Figure 5: Speech in the No Pause condition
Figure 6: Speech in the Pauses Between Clauses condition
Figure 7: Speech in the Pauses Within Clauses condition

CHAPTER 1: INTRODUCTION AND REVIEW OF THE LITERATURE

Introduction
One of the most noticeable differences between speech in first language (L1) and second
language (L2) is found in fluency (Gut, 2009; Kormos, 2006). Compared with their L1, people are
typically not only weaker in L2 knowledge but also considerably less fluent in using what L2
knowledge they have (Segalowitz, 2010). Although fluency is considered important by L2
learners and teachers (Schmidt, 2000) and constitutes an essential criterion in assessing L2
performance and proficiency (Bosker, Pinget, Quené, Sanders, & De Jong, 2013; Cucchiarini,
Strik, & Boves, 2002; Housen, Kuiken, & Vedder, 2012; Iwashita, Brown, McNamara, &
O’Hagan, 2008; Skehan, 1998), the concept is difficult to define and has not been well
understood (Kormos & Dénes, 2004; Schmidt, 2000; Segalowitz, 2010). Lennon (1990)
distinguished between fluency in the broad and in the narrow sense. Fluency in the broad sense
refers to global speaking proficiency, whereas fluency in the narrow sense relates to how easily
and smoothly speech is delivered and it constitutes a component of oral proficiency. The present
study concerns the narrow sense of fluency.
Segalowitz (2010) pointed out that even this narrow sense of fluency is a
multidimensional construct which reflects the efficiency of using linguistic knowledge and
executing the neurological and muscular mechanisms that an L2 speaker has developed during
the course of L2 learning, and a distinction should be made among the three notions of fluency:
cognitive, utterance, and perceived fluency. Cognitive fluency is defined as “the efficiency of
operation of the underlying processes responsible for the production of utterances” (Segalowitz,
2010, p. 165). Utterance fluency is “the features of utterances that reflect the speaker’s cognitive
fluency” (p. 165). Utterance fluency can be objectively measured by temporal variables in
speech samples and it has a few different aspects such as speed fluency, breakdown fluency
(pause and hesitation phenomena), and repair fluency (Skehan, 2003, 2009; Tavakoli & Skehan,
2005). The third notion of fluency is perceived fluency, “the inferences listeners make about
speakers’ cognitive fluency based on their perceptions of their utterance fluency” (p. 165).

Utterance Fluency and Perceived Fluency
Speed and Repair Fluency
To identify reliable oral production features of L2 fluency, previous studies compared
speech from fluent and non-fluent speakers (Ejzenberg, 2000; Riazantseva, 2001; Riggenbach,
1991; Tavakoli, 2011), investigated the longitudinal development of fluency (Derwing, Munro,
& Thomson, 2007; Freed, 1995, 2000; Lennon, 1990; Mora & Valls-Ferrer, 2012; Towell,
Hawkins, & Bazergui, 1996; Wood, 2010), and related utterance fluency to perceived fluency by
correlating fluency ratings with temporal variables (Bosker et al., 2013; Cucchiarini, Strik, &
Boves, 2000, 2002; Derwing, Rossiter, Munro, & Thomson, 2004; Fulcher, 1996; Kormos &
Dénes, 2004; Rossiter, 2009).
The main findings were that Speech rate (i.e., the number of syllables per minute,
including pause time) and Mean length of run (i.e., the mean number of syllables between two
silent pauses) were consistently strongly associated with L2 oral fluency development and
perceived fluency (e.g., Kormos & Dénes, 2004; Lennon, 1990; O’Brien, Segalowitz, Freed, &
Collentine, 2007; Segalowitz & Freed, 2004; Towell et al., 1996), whereas Articulation rate (i.e.,
the number of syllables per minute, excluding pause time) and repair measures often did not. For
instance, Kormos and Dénes (2004) compared temporal features of speech produced by
intermediate and advanced learners of English and found that they were significantly different in
Speech rate and Mean length of run but not in Articulation rate or Number of disfluencies
(repetitions, restarts and repairs) per minute. Correlation analysis between the temporal measures
and fluency scores rated by native English speakers showed that Speech rate and Mean length of
run very strongly correlated with fluency scores, while Articulation rate and Number of
disfluencies per minute did not. Cucchiarini et al. (2002) also found that fluency ratings of
speech samples produced by beginning and intermediate learners of Dutch were moderately to
strongly correlated with Speech rate and Mean length of run; on the other hand, fluency ratings
were not correlated with Articulation rate or Number of disfluencies per minute.
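The three temporal measures defined above can be illustrated with a minimal sketch. It assumes that a speech sample has already been segmented into runs bounded by silent pauses; the function names, the input format, and the example values are hypothetical and are not taken from any of the studies cited.

```python
def speech_rate(run_syllables, total_time_s):
    """Syllables per minute, with pause time included in the denominator."""
    return 60 * sum(run_syllables) / total_time_s


def articulation_rate(run_syllables, total_time_s, pause_durations_s):
    """Syllables per minute, with silent pause time excluded from the denominator."""
    speaking_time_s = total_time_s - sum(pause_durations_s)
    return 60 * sum(run_syllables) / speaking_time_s


def mean_length_of_run(run_syllables):
    """Mean number of syllables per run (a run is the stretch between two silent pauses)."""
    return sum(run_syllables) / len(run_syllables)


# Hypothetical 12-second sample: three runs separated by two silent pauses.
runs = [6, 4, 10]        # syllables in each run (assumed values)
pauses = [0.5, 1.5]      # silent pause durations in seconds (assumed values)

print(speech_rate(runs, 12.0))
print(articulation_rate(runs, 12.0, pauses))
print(mean_length_of_run(runs))
```

For this invented sample, Speech rate is 100 syllables per minute and Articulation rate is 120 syllables per minute. The two measures differ only in whether the 2 seconds of pause time count toward the denominator, which is why Speech rate reflects pausing as well as speed.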
However, in Towell et al. (1996), advanced learners of French improved their Articulation rate
after a year abroad, although the improvement in Articulation rate was smaller than that in Speech
rate. Ginther, Dimova and Yang (2010) investigated relationships between oral English
proficiency and temporal measures of fluency and found that Articulation rate strongly
correlated with speaking scores, although less strongly than Speech rate did. Considering that L2
fluency can be affected by L1 fluency and speaking style, De Jong, Groenhout, Schoonen and
Hulstijn (2013) examined whether correcting measures of L2 fluency for L1 performance can
better predict L2 proficiency. They found that L2 Mean syllable duration (i.e., the inverse of
Articulation rate) explained 30% of the variance in L2 proficiency, and correcting the
measure by partialing out L1 behavior increased the explained variance to 41%. Regarding repair
measures, Bosker et al. (2013) recently investigated relative contributions of speed, pauses, and
repairs to perceived fluency by examining perceptual sensitivity to the three fluency aspects, and
found repairs did contribute a small but significant amount to perceived fluency. Therefore,
whether Articulation rate and repair measures are indicators of L2 fluency or not is not yet
conclusive.

Breakdown Fluency
Previous findings on pause phenomena show an even more complicated picture. In
Ginther et al. (2010) and Bosker et al. (2013), both pause frequency and pause length were
negatively correlated with proficiency scores and fluency ratings, respectively. On the other
hand, in Kormos and Dénes (2004), fluency ratings did not correlate with pause frequency but did
correlate with pause length. Furthermore, in Cucchiarini et al. (2002), the opposite pattern was
found: fluency ratings correlated with pause frequency but not with pause length.
Pause distribution has been investigated by even fewer studies, in which fluent speech
tended to have pauses at grammatical junctures (Lennon, 1990; Towell et al., 1996), whereas
non-fluent L2 speech often had pauses within clauses (Davies, 2003; Deschamps, 1980; Freed,
1995; Riggenbach, 1991; Tavakoli, 2011). It has been argued that in fluent speech, language is
encoded a clause at a time (Pawley & Syder, 2000) and pausing within clauses seems to reflect
difficulties in planning or encoding speech (Cenoz, 1998; Lennon, 1984; Wood, 2010). However,
the results are not yet conclusive because, as Kormos (2006) points out, many earlier studies
suffer from very small sample sizes (often with a few to several participants) and did not use
computer technology to obtain more precise temporal measures in milliseconds or statistical
analyses, which makes the results difficult to generalize. Moreover, in Riazantseva (2001),
which had a larger number of participants (20 L1 speakers; 30 L2 speakers in total, 15 per
group), no difference was found in the number of within-constituent pauses between L1 and L2
speakers.
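The three pause properties discussed in this section (frequency, length, and distribution) can be made concrete with a minimal sketch. It assumes that each silent pause has already been hand-labeled as falling between or within clauses. The function name, the input format, and the 0.25-second cut-off are hypothetical illustrations only; cut-off points differ across studies.

```python
def pause_profile(pauses, total_time_s, min_pause_s=0.25):
    """Summarize silent pauses by location.

    pauses: list of (duration_s, location) pairs, where location is
            "between" (at a clause boundary) or "within" (inside a clause).
    min_pause_s: assumed cut-off below which a silence is not counted as a pause.
    """
    kept = [(d, loc) for d, loc in pauses if d >= min_pause_s]
    profile = {}
    for location in ("between", "within"):
        durations = [d for d, loc in kept if loc == location]
        profile[location] = {
            # Frequency: pauses per minute of total sample time.
            "per_minute": 60 * len(durations) / total_time_s,
            # Length: mean pause duration in seconds (0.0 if there are none).
            "mean_duration_s": sum(durations) / len(durations) if durations else 0.0,
        }
    return profile


# Hypothetical 30-second sample with four labeled silences.
sample = [(0.5, "between"), (0.1, "within"), (0.3, "within"), (0.7, "between")]
print(pause_profile(sample, 30.0))
```

In this invented sample, the 0.1-second silence falls below the assumed cut-off and is ignored. That leaves two pauses between clauses and one pause within a clause, which shows how the choice of cut-off can change the counts that studies report.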

L1 literature has a longer history on pause phenomena (e.g., pausology, a specialized
field in psycholinguistics devoted to the study of temporal variables in speech, pioneered by
Goldman-Eisler in the 1950s) and can provide insights for L2 research (Griffiths, 1991). Schnadt (2009)
points out that one of the major issues for the study of silent pauses has been distinguishing a
“hesitant” pause from a pause based on a speaker’s natural prosody. Hesitant pauses (or
performance-based pauses, Ferreira, 1993, 2007) are related to delays in planning and production
processes, whereas prosodic pauses (Ferreira, 1993, 2007) separate utterances into intonational
phrases (i.e., a speech segment which occurs with a single prosodic contour), and thus are part of
the rhythmic structure of speech. Indeed, in L1 speech, most pauses tend to occur at clause
boundaries (Boomer, 1965; Hawkins, 1971; Holmes, 1988; MacGregor, 2008). Prosodic pauses
typically occur at intonational phrase or clause boundaries; however, performance-based pauses
can occur at any point where a speaker needs to plan upcoming speech or encounters difficulty.
L1 research on pause phenomena has also shown their important role in speech perception
and comprehension. Silent pauses at grammatical boundaries have been claimed to help listeners’
comprehension: they enable listeners to understand and keep pace with the utterance by indicating
the boundaries of speech to be analyzed and by providing cognitive processing time (Arons, 1993;
Griffiths, 1991; Reich, 1980; Sugito, 1990). Pauses at grammatical junctures are important for
comprehension and eliminating them can interfere with comprehension (Lass & Leeper, 1977).
However, as Arons (1993) maintains, only pauses between clauses or structural pauses (i.e.,
pauses between items of information in lists of meaningful trigrams such as IBM [pause] KGB
[pause] PHD) are useful; pauses within clauses or nonstructural pauses (e.g., DIB [pause] MKG
[pause] BPH) can interfere with speech perception processing (Bower & Springston, 1970;
Griffiths, 1991; Reich, 1980; Sugito, 1990). Silent pauses are one of the acoustic cues to clausal
units, along with pitch and vowel duration (Seidl & Cristià, 2008), and by 6 months of age
infants prefer sentences containing pauses between clauses over sentences containing pauses
within clauses (Hollich & Houston, 2007). In
Reich (1980), words were categorized faster and propositions were recalled more accurately in
sentences containing pauses between clauses than in sentences containing pauses within clauses.
It has also been reported that silent pauses benefit listeners under conditions of
cognitive complexity in auditory speech processing but show no apparent
benefit when the speech or task is easy enough (Aaronson, 1968; Reich, 1980).

Cognitive Fluency and Utterance Fluency
In understanding the underlying mechanisms responsible for L1 and L2 oral fluency, the
differences are often explained by the degree of automaticity. Automaticity refers to the absence
of attentional control in executing a cognitive activity (Kahneman, 1973) and has several
characteristics, such as rapidity, effortlessness, and an unconscious and ballistic nature (Segalowitz &
Hulstijn, 2005). Kormos (2006) points out that whereas L1 speech production requires attention
only to speech planning and monitoring, in L2 speech, syntactic and phonological encoding may
not be fully automatized, slowing speech down.
For systematic understanding of L2 cognitive fluency, as shown in Figure 1, Segalowitz
(2010) adopted Levelt’s (1999) L1 speech production model and identified possible fluency
vulnerability points (i.e., critical points where underlying processing difficulties could result in
L2 speech disfluencies, {f} symbols) by incorporating De Bot’s (1992) proposals on bilingual
speakers (see Segalowitz, 2010, pp. 7-17 for details).

Figure 1: Segalowitz’s (2010) model of the L2 speaker. Adapted from Levelt’s (1999)
“blueprint” of the monolingual speaker. The {f} symbols refer to fluency vulnerability points.
(Figure used with permission of Taylor and Francis Group LLC Books)

[Figure 1 diagram not reproduced. Labels in the diagram: Rhetorical/semantic/syntactic system;
Knowledge of external and internal world; Model of the interlocutor, discourse model, etc.;
Conceptual preparation (Macroplanning, {f1} Microplanning); Preverbal message; {f2} Grammatical
encoding; Surface structure; Mental lexicon (Lemmas, Morpho-phonological codes); {f3};
{f4} Morpho-phonological encoding; Phonological score; {f5} Phonetic encoding; Syllabary;
Gestural scores; Articulatory score (phonetic plan/internal speech); {f6} Articulation;
Phonological/phonetic system; Overt speech; {f7} Self-perception; Parsed speech.]


Seven fluency vulnerability points were proposed: microplanning ({f1}), grammatical
encoding ({f2}), lemma retrieval ({f3}), morpho-phonological encoding ({f4}), phonetic
encoding ({f5}), articulation ({f6}), and self-perception ({f7}). During macroplanning (i.e.,
planning what to say next), no L2-specific fluency issues are expected, as world knowledge used
in this stage is not assumed to be organized in language-specific terms (De Bot, 1992; Levelt,
1989, 1999).
In microplanning, language-specific information (e.g., argument structure, mood, and
tense-aspect) is included in a preverbal message and L2 speakers sometimes strategically
formulate a preverbal message to avoid L2 difficulties (De Bot, 1992). Taking one’s limitations
into consideration may slow down the process of formulating the preverbal message ({f1}).
Lexical retrieval is one of the most salient problems for L2 speakers (De Bot, 1992), and it
seems difficult for them to retrieve ({f3}) and utilize the appropriate linguistic resources to create
a grammatical surface structure ({f2}). Whereas the process is highly automatized in L1 (Levelt,
1989), in morpho-phonological encoding (i.e., specifying morphological, segmental, and
suprasegmental structure of the word), L2 learners may not have automatic access to syllable
programs ({f4}). L1 and L2 are likely to utilize different repertoires of articulatory gestures and
L2 fluency can be compromised when speakers cannot automatically access appropriate gestural
scores ({f5}), or execute the score ({f6}).
Additionally, Kormos (2006) included an L2 declarative knowledge store in the bilingual
speech production model based on the declarative/procedural model (Ullman, 2001, 2004, 2005,
2013). According to the model, in L1, lexicon and grammar are learned, represented, and
processed in two memory systems: declarative memory and procedural memory. Lexical
aspects depend on declarative memory, which is implicated in the learning/use of facts, and
declarative memory can be explicitly recollected (Ullman, 2001, 2004, 2005, 2013). On the other
hand, grammatical aspects depend upon procedural memory, which is implicated in the
learning/use of motor/cognitive skills. Procedural memory is often referred to as an “implicit
memory system” because neither the knowledge itself nor the procedure by which it is learned is
accessible to conscious memory (Ullman, 2004). Learning in the procedural system results in
rapid and automatic processing of skills and knowledge (Ullman, 2013). Ullman claims that in
late L2 learning after puberty, however, grammatical computation as well as the mental lexicon
largely relies on declarative memory, and L2 experience (practice) and age of exposure to L2
affect the relative reliance on declarative versus procedural memory. The claim has been
supported by neuroimaging studies (e.g., Dehaene et al., 1997; Perani et al., 1998), which found
a greater activation in the brain regions responsible for declarative memory in L2 (the
hippocampus and medial temporal lobe structures) for the processing of forms which are mainly
computed by the procedural memory in L1. In addition, the activation pattern from early
bilinguals was similar to L1 speakers, whereas low-proficiency L2 speakers heavily relied on the
declarative memory system (Perani et al., 1998). Opitz and Friederici (2003) also found that
when adults were learning an artificial language, during syntactic processing the hippocampus
and the temporal lobe were activated; the areas are involved in declarative memory. The findings
suggest that declarative knowledge of grammar is stored in a different area from where
procedural knowledge is stored, and as learners are exposed to L2 at an earlier age and as they
practice more, they become more dependent upon the procedural memory system (Ullman, 2001,
2004). Therefore, in L2, many of the syntactic, lexical and phonological rules which are not
automatized are considered to be stored as declarative knowledge (Kormos, 2006; Ullman, 2001).

Until now, only a few studies have investigated cognitive fluency in relation to utterance
fluency by measuring subprocesses of speech production. Segalowitz and Freed (2004) measured
cognitive fluency using a semantic classification task and an attention control test, and related
the results to gains in utterance fluency. They found correlations between mean length of run and
lexical access speed and efficiency. De Jong, Steinel, Florijn, Schoonen, and Hulstijn (2013) also
aimed to identify utterance fluency measures indicative of cognitive fluency. They measured
linguistic knowledge (e.g., vocabulary, grammar, pronunciation) and processing skills (e.g.,
speed of morphosyntactic processing, lexical selection, articulation). Results showed that
linguistic knowledge and skills were most strongly related to average syllable duration,
explaining 50% of the variance, whereas they had the weakest correlation with pause duration.

Overview of Research Design
Taken together, previous studies on L2 fluency mainly investigated utterance fluency and
perceived fluency in second language. Only a few studies looked at cognitive fluency in relation
to oral fluency. Results on utterance and perceived fluency seem to suggest that speed and pause
phenomena are related to the perception of fluency; however, when closely examined, results are
mixed depending on how each aspect of fluency is measured (e.g., speech rate vs. articulation
rate), and the relative contributions of frequency, duration, and distribution of pauses to fluency
perception have not been fully investigated. Furthermore, there are gaps in the literature on how
L1 and L2 speakers' fluency differs (e.g., use of filled pauses, distribution of pauses).
Mixed results in the previous studies may be due to methodological issues such as
variability in speech elicitation tasks and in cut-off points for pauses, and small sample sizes. But
more importantly, the lack of a comprehensive theoretical framework for understanding fluency
(Segalowitz, 2010) may have been a barrier to systematic investigation of L2 fluency. Therefore,
within Segalowitz's (2010) fluency framework, the current dissertation aims to investigate 1)
utterance fluency and cognitive fluency, by examining in Study 1 in what respects and why native
speakers' and L2 speakers' oral fluency differ, and 2) perceived fluency, by testing in Study 2 the
effects of frequency, duration, and location of pauses on the perception of fluency.

CHAPTER 2: STUDY 1

Introduction
Study 1 investigated (1) utterance fluency, by comparing speech samples from L1
English speakers and L1 Korean L2 English speakers to show in what respects they are different,
and (2) cognitive fluency, by examining stimulated recall responses from L2 speakers to gain an
insight regarding the underlying processes and problems related to L2 disfluencies.
Following Skehan's (2003) taxonomy, utterance fluency was investigated in terms of
speed, repairs and breakdown fluency. As mixed results had been found in the literature on
breakdown fluency, pause phenomena were examined rigorously by investigating frequency,
duration, and distribution of both silent and filled pauses. In particular, as pause distribution has
rarely been the main focus of L2 fluency research, frequency and duration of silent and filled
pauses in different locations (e.g., within clauses, at clause boundaries) were analyzed in depth.
In addition, length of run was also included in the analysis of utterance fluency in
consideration of its strong association with L2 oral fluency development (e.g., Ginther et al.,
2010; Kormos & Dénes, 2004; Towell et al., 1996) and its conceptual connection with automatic
speech production processing. As discussed earlier, length of run has been consistently found to
be correlated with L2 oral fluency development and perceived fluency in previous studies.
Towell et al. (1996) claimed that increase in length of run reflects proceduralization of
declarative knowledge based on Anderson's (1983) ACT* (Adaptive Control of Thought) model
of skill acquisition. Following Towell et al. (1996) and Towell and Dewaele (2005), Skehan
(2009) suggested that length of run can be measured as an indicator of the degree of
automatization in speech performance.¹ Length of run seems to be related to automaticity and
automatization as automatic speech production processes are likely to result in long fluent runs.
Furthermore, length of run also seems to be closely related to the use of prefabricated language
units and formulaic language, which have been claimed to facilitate L2 oral fluency (Boers,
Eyckman, Kappel et al., 2006; Bybee, 2002; Kuiper, 1996; Skehan, 1998).
The study used stimulated recall to investigate cognitive processes of L2 speakers
regarding disfluencies. As mentioned above, only a couple of studies (De Jong, Steinel, et al.,
2013; Segalowitz & Freed, 2004) have investigated cognitive fluency in relation to utterance
fluency and they tried to measure cognitive processes involved in cognitive fluency, for instance,
by measuring attention control and speed of morphosyntactic processing. None of them utilized
stimulated recall, whereas it was used in a few studies on problem-solving mechanisms and
self-monitoring (e.g., Dörnyei & Kormos, 1998; Kormos, 2000a, 2000b). Stimulated recall is
different from other cognitive measures used in the studies on cognitive fluency in that it is a
more global and indirect measure of cognitive processes. Stimulated recall can reflect cognitive
events and reveal the information attended to during task performance (Gass & Mackey, 2000).
Stimulated recall has a limitation in capturing subconscious cognitive processes; however, it has
the potential to extend our understanding of the complex phenomenon of L2 fluency by tapping
issues that L2 speakers attend to during speaking, which cannot be easily addressed by other
methods. Following are the three research questions in Study 1, addressing utterance fluency and
cognitive fluency.

1. Are there differences in utterance fluency (i.e., speed, length of run, repairs, and
frequency, length, and distribution of pauses) between L1 and L2 speakers?
2. Which of the utterance fluency measures (i.e., speed, length of run, repairs, and
frequency, length, and distribution of pauses) are correlated with L2 oral proficiency?
3. Are there differences in cognitive fluency reflected in the stimulated recall responses
between lower and higher proficiency learners?

¹ Given that automaticity has been operationalized in various ways in the literature based on its
complex characteristics including rapidity, effortlessness, and unconscious and ballistic nature
(Segalowitz & Hulstijn, 2005), it is unlikely that automatization can be measured by a single
measure such as length of run. The direct connection between automatization and length of run is
also still not clear. For instance, it is unclear whether a long fluent run reflects automaticity in
lemma retrieval, grammatical encoding, morpho-phonological encoding, phonetic encoding,
articulation, or a combination of some of these stages. Moreover, it is unlikely that other measures
of utterance fluency such as Mean syllable duration do not relate to automatization in speech
production. Therefore, the measure of length of run is included in the analysis as its own
category (see Table 1) and not under the category of automatization (cf. Koponen &
Riggenbach, 2000; Skehan, 2003).

Although Study 1 is primarily exploratory in nature, some predictions can be made
based on previous studies on fluency. In utterance fluency, if pauses within clauses reflect
processing difficulties in speech production as proposed by Lennon (1984), Pawley and Syder
(2000) and Wood (2010), L2 learners are expected to pause within clauses more often than L1
speakers. In addition, as L2 proficiency increases, frequency of pauses within clauses is expected
to decrease.
Moreover, according to the declarative/procedural model (Ullman, 2001, 2004, 2005,
2013) and previous studies on automaticity (DeKeyser, 2001, 2007; Segalowitz, 2000, 2003), L2
learners tend to rely on declarative memory/knowledge and become more dependent upon
procedural memory/knowledge as they practice and gain more experience in L2. Therefore,
lower proficiency learners are expected to use L2 declarative knowledge more than higher
proficiency learners. Considering that only declarative knowledge can be explicitly recollected
and procedural knowledge cannot (Ullman, 2001, 2004, 2013), lower proficiency learners are
expected to remember more about their thoughts at the time of speaking in stimulated recall than
higher proficiency learners. Furthermore, in terms of the content of the stimulated recall
responses, if higher proficiency learners' production processes are more automatized than lower
proficiency learners', higher proficiency learners are predicted to report on macroplanning and
monitoring that required their attention, as in L1 speech (Kormos, 2006), whereas lower
proficiency learners are expected to report on more varied issues, including syntactic and
morpho-phonological encoding, that are not fully automatized and are controlled in the declarative
memory system.

Method
Participants
Thirty-one Korean learners of English (10 males; 21 females) and 15 English native
speakers (4 males; 11 females) participated in the study. The mean age of the Korean speakers
was 31 (SD = 6.3) and they started to learn English around the age of 12 (SD = 0.1). None of
them had any language other than Korean spoken at home as a child, and none had learned English
before the age of 8. Their length of residence ranged from 1 month to 7.5 years (M = 1.9, SD =
1.7).
Korean participants' English oral proficiency was measured by evaluating their speech
samples holistically using a rubric of the SPEAK test (Indiana University, n.d.), which had
criteria such as clarity and effectiveness of communication, effective use of linguistic features,
severity of linguistic errors, organization, and appropriateness (scale range: 20–60); the rubric
did not entail specific aspects of fluency. Two raters evaluated the speech samples, and their
interrater reliability, measured by the Pearson correlation coefficient, was r = .910 (p < .001). The
average score was used for each participant (M = 47, SD = 7.5, Min. = 34, Max. = 60). The mean
age of English speakers was 29 (SD = 5.7) and they were undergraduate or graduate students at a
university in the United States.
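The scoring step described above (two raters, a Pearson correlation as the reliability index, and the average of the two ratings as each participant's score) can be sketched as follows. This is only an illustration: the rater scores below are hypothetical values on the 20–60 scale, not data from the study, and the function name `pearson_r` is introduced here for convenience.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical holistic scores from two raters (NOT the study's data).
rater_a = [35, 40, 45, 50, 55, 60]
rater_b = [35, 45, 45, 50, 50, 60]

reliability = pearson_r(rater_a, rater_b)
# Each participant's final score is the average of the two ratings.
final_scores = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]

print(round(reliability, 3))
print(final_scores)
```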
Tasks and Procedures
All 46 participants completed a spontaneous speech task (Appendix A) and a paper-based
survey questionnaire on their L2 learning background (Appendix B), and 17 Korean learners of
English within this group voluntarily participated in a stimulated recall session. The study was
conducted individually in a quiet room with a Korean-English bilingual researcher present. The
tasks were presented in PowerPoint on a laptop. The participants' speech was recorded through a
low-noise headset using the digital audio editing software GoldWave and the recordings were
saved at 22 kHz (16-bit resolution; 1-channel).

Spontaneous Speech
Two questions were used to elicit spontaneous speech, in which the participants spoke
freely on a given topic. The questions were on daily life so that all the participants were familiar
with the topic and were able to talk naturally without much difficulty. The first question was
about their major field of study, what it was about, whether they liked it or not and why. The
second one was about their free-time activities (Appendix A).
When each question appeared on the screen, the participants were able to start answering
the question whenever they were ready. However, none of them spent more than 10 seconds
before they answered. They were asked to respond to each question for one minute, but they were
not interrupted if they continued speaking after the requested minute was over. A stopwatch
was placed next to the laptop computer so that the participants were able to check the time.
When they finished answering each question, they clicked and moved on to the next question at
their own pace.

Stimulated Recall
The 17 Korean speakers (7 males; 10 females) voluntarily participated in the stimulated
recall session. The participants were asked to describe what they were thinking while pausing or
hesitating during their speech. Stimulated recall was conducted in their L1, Korean. The session
was conducted following Gass and Mackey's (2007) recommendations for stimulated recall
research. The audio-recorded spontaneous speech was played for each learner immediately after
the spontaneous speech task was over in order to utilize recent memory and reduce recall
interference. During the session the participants were allowed to pause the audio file whenever
they wanted to describe their thoughts at the time of speech production to make their recalls less
susceptible to researcher interference. The researcher was also able to pause the audio file after
silent or filled pauses in the speech recording to ask participants to recall their thoughts so as not
to let the session become completely unstructured and lose useful data.

Utterance Fluency: Quantitative Study
Analysis
Transcribing and Marking Pauses and Repairs
All speech recordings were transcribed in detail including information regarding pauses
and repairs. The length of silent and filled pauses was measured in milliseconds (ms) by listening
to each speech sample and examining the waveform and spectrogram using Praat (Boersma &
Weenink, 2012), and the duration was added to the transcript. In previous studies the lower bound
of silent pauses varied considerably (100 ms, Kang, Rubin, & Pickering, 2010; Riazantseva,
2001; Trofimovich & Baker, 2006; 200 ms, Cucchiarini et al., 2002; 250 ms, De Jong,
Groenhout, et al., 2013; De Jong, Steinel, et al., 2013; Ginther et al., 2010; Raupach, 1987; 280
ms, Towell et al., 1996; 300 ms, Wood, 2010; 400 ms, Derwing et al., 2004; Freed, Segalowitz,
& Dewey, 2004; O'Brien et al., 2007; Tavakoli, 2011).
In the present study any silence equal to or longer than 250 ms was identified as a silent
pause and included in the analysis following De Jong, Groenhout, et al. (2013) and
Goldman-Eisler (1968).² Further support for 250 ms over 400 ms, which is another popular choice in L2
fluency studies, came from recent studies by De Jong and Bosker (2013) and Kahng (2012). In
De Jong and Bosker (2013), a lower cut-off point for silent pauses of 250-300 ms led to the
highest correlation between the number of silent pauses and L2 proficiency scores. Kahng (2012)
compared the results of the analysis based on the two cut-off points for silent pauses and found
that 400 ms missed 12% of the pauses identified by 250 ms. More importantly, 77% of the
pauses which 400 ms missed were pauses within clauses. As pause distribution is one of the
main foci of the present study, 250 ms was selected so as not to lose potentially important
information. Filled pauses were defined as nonlexical fillers such as um and uh (Freed, 1995;
Kang et al., 2010; Riggenbach, 1991).

² Any silent portions preceding or following filled pauses were also counted as silent pauses as
long as they were equal to or longer than 250 ms.

Pause distribution was operationalized by categorizing pauses into pauses within clauses,
at clause boundaries, or at AS-unit boundaries. A clause was required to consist minimally of a
finite or non-finite verb with at least one other clause element such as a subject, object, or
complement (see Foster, Tonkyn, & Wigglesworth, 2000, pp. 365-368). An AS-unit (ASU) is a
single speaker's utterance which consists of either an independent clause or sub-clausal unit,
with any subordinate clause (Foster et al., 2000). The ASU was devised based on the T-unit (i.e.,
"a main clause plus any other clauses which are dependent upon it"; Foster et al., 2000, p. 360)
but elaborated to deal with the features of spoken data. For instance, unlike the T-unit, the ASU
includes independent sub-clausal units and minor utterances (e.g., Oh poor woman, Thank you
very much, Yes) which are common in speech. Along with clauses, ASUs, which are larger than
single clauses, were used because speakers may plan multi-clause units (Beattie, 1980) and being
able to plan multiple clauses seems to be related to L2 proficiency (Foster et al., 2000). In the
transcript, an ASU boundary was marked by a double slash …//…, a clause boundary was
marked by a double colon ::, and repairs such as repetitions and self-corrections were put inside
brackets {…}. Silent pauses were marked in parentheses for duration (in milliseconds) and the
duration of filled pauses was marked next to each filled pause without parentheses. Following is
an example of a transcript with information about clause and ASU boundaries, repairs, and the
duration of silent and filled pauses.

When I was in high school :: I joined {this club} drama club (425) //
and I performed in ah410 several plays //
and I believe :: I have some talent in (484) acting //
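
Because the annotation scheme is regular, the markers in a transcript like the one above can be tallied mechanically. The following is a minimal sketch, assuming the markers appear exactly as in the example (// for ASU boundaries, :: for clause boundaries, {…} for repairs, parenthesized integers for silent pauses, and a filler such as ah or um immediately followed by its duration). The function name, dictionary keys, and regular expressions are illustrative assumptions and are not part of the procedure used in the study.

```python
import re

TRANSCRIPT = """
When I was in high school :: I joined {this club} drama club (425) //
and I performed in ah410 several plays //
and I believe :: I have some talent in (484) acting //
"""

def tally_markers(text):
    """Count boundary, repair, and pause markers in an annotated transcript."""
    # Silent pauses: an integer duration in parentheses, e.g. "(425)".
    silent = [int(ms) for ms in re.findall(r"\((\d+)\)", text)]
    # Assumption: filled pauses are written as ah/uh/um plus a duration, e.g. "ah410".
    filled = [int(ms) for ms in re.findall(r"\b(?:ah|uh|um)(\d+)\b", text)]
    return {
        "asu_marks": text.count("//"),        # "//" closes each ASU
        "clause_marks": text.count("::"),     # "::" marks clause boundaries inside an ASU
        "repairs": len(re.findall(r"\{[^}]*\}", text)),
        "silent_pauses_ms": silent,
        "filled_pauses_ms": filled,
    }

counts = tally_markers(TRANSCRIPT)
print(counts)
```

For the three-line example, this tally finds three ASU marks, two clause marks, one repair, two silent pauses (425 and 484 ms), and one filled pause (410 ms).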

Calculating Temporal Variables
Table 1 lists the temporal measures and their operational definitions. The choice of
measures was made so that the measures clearly represent each aspect of fluency (i.e., speed,
length of run, repairs, and pause phenomena) and they are not mathematically dependent or
strongly interrelated. For speed fluency, Mean syllable duration was computed, which is the
inverse of the Articulation rate measure. As De Jong, Steinel, et al. (2013) pointed out, Mean
syllable duration is a pure measure of speed in that it excludes pause time unlike the traditional
measure of Speech rate which includes pause time. For length of run, Mean syllables per run
was used. For repair fluency, Number of corrections per minute and Number of repetitions per
minute were calculated.
To examine pause phenomena (i.e., breakdown fluency), frequency of pauses was
measured by Number of pauses per minute and duration of pauses by Mean length of pauses. To
measure distribution of pauses, this study devised Pause rate in different locations to capture a
more accurate picture of pause distribution than a measure used in a previous study such as
Number of pauses per minute within a clause or at a clause boundary (Tavakoli, 2011). Pause
rate takes into account the fact that speech samples do not have the same number of clauses or
ASUs across speakers, and there is always an equal or a greater number of clause boundaries
than ASU boundaries by definition. For instance, when there are 10 ASU boundaries and 16
clause boundaries in a one-minute speech sample and the speaker paused at all 10 ASU
boundaries and all 16 clause boundaries, Number of pauses per minute can incorrectly suggest
that the speaker paused more often at clause boundaries (16) than at ASU boundaries (10),
although the speaker paused at every clause and at every ASU boundary. Pause rate computes
how often a speaker pauses within each clause, at each clause boundary, or at each ASU
boundary. Therefore, in the above example Pause rate would be 1 for both clause (16/16 = 1)
and ASU boundaries (10/10 = 1). Moreover, as Pause rate in different locations (i.e., within a
clause, at a clause boundary, and at an ASU boundary) captures how likely a speaker is to pause in
each location,³ in calculating Pause rate, when more than one pause occurred in one location,
they were counted as one. For instance, in "I like chocolate :: ah260 (400) um100 because it is
sweet," even if the speaker used three individual pauses at the clause boundary they were
counted as one because all of them occurred in the same location, between "chocolate" and
"because."
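
The Pause rate logic can be made concrete with a short sketch. It reproduces the worked example from the text (10 ASU boundaries and 16 clause boundaries, with a pause at every one) and the rule that adjacent pauses in one location count once. The token patterns, the function names, and the decision to skip boundary marks when detecting adjacency are assumptions made for illustration, not a description of how the counts were actually obtained.

```python
import re

# Assumed token shapes: "(400)" is a silent pause; "ah260" or "um100" is a filled pause.
PAUSE = re.compile(r"^(?:\(\d+\)|(?:ah|uh|um)\d+)$")
BOUNDARY_MARKS = {"::", "//"}

def pause_rate(n_paused_locations, n_locations):
    """Proportion of locations of one type at which the speaker paused."""
    return n_paused_locations / n_locations

def count_paused_locations(utterance):
    """Count runs of adjacent pause tokens; each run is one paused location."""
    locations, in_run = 0, False
    for token in utterance.split():
        if token in BOUNDARY_MARKS:
            continue  # a boundary mark does not separate adjacent pauses
        is_pause = bool(PAUSE.match(token))
        if is_pause and not in_run:
            locations += 1
        in_run = is_pause
    return locations

# Worked example from the text: the speaker pauses at every boundary.
asu_rate = pause_rate(10, 10)
clause_rate = pause_rate(16, 16)

# Three adjacent pauses at one clause boundary count as one paused location.
boundary_example = count_paused_locations(
    "I like chocolate :: ah260 (400) um100 because it is sweet")
# Two separate locations within a clause count as two.
within_example = count_paused_locations("I um200 (350) don't (300) know")

print(asu_rate, clause_rate, boundary_example, within_example)
```

Raw counts per minute would give 16 versus 10 for the same speaker, whereas both rates equal 1, which is the distortion the measure is meant to avoid.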

Table 1: Temporal measures used in the study

Measure       | Submeasure                       | Definition and formula
Speed         | Mean syllable duration (ms)      | Speech time excluding pause time / total number of syllables
Repair        | Number of corrections per minute | Total number of corrections / spoken time
              | Number of repetitions per minute | Total number of repetitions / spoken time
Length of run | Mean syllables per run           | Average number of syllables produced between two silent pauses
Pause         | Number of pauses per minute      | Total number of pauses / spoken time
              | Mean length of pauses (ms)       | Total length of pause time / number of pauses
              | Pause rate within a clause       | Total number of pauses within clauses / number of clauses
              | Pause rate at a clause boundary  | Total number of pauses at clause boundaries / number of clause boundaries
              | Pause rate at an ASU boundary    | Total number of pauses at ASU boundaries / number of ASU boundaries
              | Ratio of filled to silent pause  | Total length of filled pause time / total length of silent pause time
              | Pause concurrence rate           | Number of pauses occurring concurrently / total number of pauses

Note. Spoken time = duration of speech fragment excluding silences of ≥ 250 ms.
For the pause measures, silent and filled pauses were measured separately.
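
Several of the formulas in Table 1 are simple ratios, and a small sketch may help show how they fit together. All numbers below are hypothetical and chosen for easy arithmetic; the function names are introduced here for illustration only. The sketch also shows how the choice of cut-off (250 ms versus 400 ms) changes which silences count as pauses.

```python
SILENT_PAUSE_MIN_MS = 250  # lower bound for a silent pause adopted in the study

def silent_pauses(silences_ms, threshold_ms=SILENT_PAUSE_MIN_MS):
    """Keep only silences at or above the cut-off."""
    return [d for d in silences_ms if d >= threshold_ms]

def mean_syllable_duration_ms(speech_time_ms, n_syllables):
    """Speech time (already excluding pause time) divided by syllable count."""
    return speech_time_ms / n_syllables

def per_minute(count, spoken_time_s):
    """Convert a raw count into a per-minute frequency."""
    return count * 60 / spoken_time_s

def mean(values):
    return sum(values) / len(values)

def filled_to_silent_ratio(filled_ms, silent_ms):
    """Total filled pause time divided by total silent pause time."""
    return sum(filled_ms) / sum(silent_ms)

# Hypothetical measurements for one short sample (NOT data from the study).
silences = [120, 260, 425, 900, 180]   # every silence found in the recording (ms)
filled = [410, 260]                    # filled pause durations (ms)
runs = [8, 5, 12, 7]                   # syllables between successive silent pauses

pauses_250 = silent_pauses(silences)        # cut-off used in the study
pauses_400 = silent_pauses(silences, 400)   # stricter cut-off for comparison

print(pauses_250, pauses_400)
print(mean_syllable_duration_ms(30_000, 100))
print(per_minute(len(pauses_250), 40))
print(mean(pauses_250), mean(runs))
print(filled_to_silent_ratio(filled, pauses_250))
```

In this toy sample the 400 ms cut-off drops the 260 ms silence that the 250 ms cut-off retains.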

³ By definition, it is not possible to pause more than once at a clause boundary or at an ASU
boundary, whereas it is possible within a clause. For instance, in "I um200 (350) don't (300)
know," the speaker paused two times within a clause.

In addition, two more measures, Ratio of filled to silent pause and Pause concurrence
rate, were devised in the current study to explore relationships between silent and filled pauses,
which have rarely been addressed in L2 fluency studies. Ratio of filled to silent pause was the ratio
of filled to silent pause time to investigate whether there is a difference in the ratio between L1
and L2 speakers. Pause concurrence rate was devised based on the observation that filled pauses
are often preceded or followed by silent pauses (Beattie, 1977). When a filled pause is used, it
usually interrupts the silence and breaks the silence into smaller pieces, suggesting a possibility
that L1 speakers can use filled pauses strategically, for instance, to keep the floor (Beattie, 1977;
Taboada, 2006) and to make their speech sound less disfluent. Pause concurrence rate computes
the probability for pauses to occur right next to each other by dividing number of pauses
occurring concurrently by total number of pauses.
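
One way to read the Pause concurrence rate is sketched below. It assumes that a pause "occurs concurrently" when the token immediately before or after it is also a pause, and that boundary marks are ignored when checking adjacency; both are interpretive assumptions, as are the token patterns and the function name. The example utterance extends the chocolate sentence from the text with one additional, isolated pause that is hypothetical.

```python
import re

# Assumed token shapes: "(400)" is a silent pause; "ah260" or "um100" is a filled pause.
PAUSE = re.compile(r"^(?:\(\d+\)|(?:ah|uh|um)\d+)$")
BOUNDARY_MARKS = {"::", "//"}

def pause_concurrence_rate(utterance):
    """Share of pauses that sit directly next to another pause."""
    flags = [bool(PAUSE.match(t)) for t in utterance.split()
             if t not in BOUNDARY_MARKS]
    total = sum(flags)
    if total == 0:
        return 0.0
    concurrent = sum(
        1 for i, is_pause in enumerate(flags)
        if is_pause and (
            (i > 0 and flags[i - 1]) or (i + 1 < len(flags) and flags[i + 1])
        )
    )
    return concurrent / total

# Three adjacent pauses plus one isolated (hypothetical) pause: 3 of 4 are concurrent.
rate = pause_concurrence_rate(
    "I like chocolate :: ah260 (400) um100 because it is (300) sweet")
print(rate)
```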
The numbers of words and syllables were counted using an online tool, Syllable Counter
(SyllableCount.com, n.d.), which was developed based on an English syllable dictionary. Once
transcripts of speech recordings were entered into the window of the software, it produced a
result table containing the number of words and syllables for the given text. Result tables also
showed the words that the software did not recognize (e.g., ah, um, TESOL) and those words and
syllables were counted manually by the researcher.

Statistical Analysis
For statistical analysis, multivariate analyses of variance (MANOVAs) were run using
SPSS Statistics 17.0 (SPSS Inc., 2008). In reporting the results of MANOVAs, Pillai's trace was
used as it is robust to small and unequal sample sizes (Hair, Black, Babin, & Anderson, 2009).
The variables which violated the assumptions of parametric tests were transformed using square
root transformations (Larson-Hall, 2010). All the transformed data improved in terms of
normality and homogeneity of variances after the transformation.⁴ Three MANOVAs were run
in order to investigate group differences in terms of 1) overall temporal features, 2) pause rate in
different locations, and 3) pause length in different locations. As three separate MANOVAs were
run, the alpha level was set at 0.016 after Bonferroni adjustment (0.05/3) for the omnibus tests to
control for Type I error. For the follow-up analysis of variance (ANOVA) tests, the alpha level
was also adjusted using Bonferroni correction by dividing 0.05 by the number of dependent
variables included in each MANOVA test.
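
The Bonferroni adjustment described above is a single division, shown here for the three families of tests. The labels and the function name are illustrative; the test counts (3 omnibus MANOVAs, 8 and 6 dependent variables in the follow-up ANOVAs) come from the text.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test alpha level after Bonferroni correction."""
    return family_alpha / n_tests

families = [
    ("omnibus MANOVAs", 3),
    ("follow-up ANOVAs, overall utterance fluency", 8),
    ("follow-up ANOVAs, pause rate by location", 6),
]

adjusted = {label: bonferroni_alpha(0.05, k) for label, k in families}
for label, alpha in adjusted.items():
    print(f"{label}: alpha = {alpha:.5f}")
```

The unrounded thresholds are approximately 0.0167, 0.00625, and 0.0083; the text reports them as 0.016, .006, and .008.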

Results
To summarize the mean utterances produced by the L1 English speakers and L1 Korean
L2 English speakers, the L1 speakers produced 203 words (SD = 82), 287 syllables (SD = 112),
31 clauses (SD = 13), 18 ASUs (SD = 7.0) and the L2 speakers produced 168 words (SD = 62),
237 syllables (SD = 86), 24 clauses (SD = 9.1), and 16 ASUs (SD = 4.7) in the two speech
samples. In the following, the descriptive statistics and group differences in the measures of
overall utterance fluency, pause phenomena in different locations, and the results of correlation
analysis among the variables are reported in turn.

Overall Utterance Fluency
Table 2 presents descriptive statistics and group differences in the measures of
overall utterance fluency. Results of a one-way MANOVA showed that, using Pillai's trace, the
two groups were significantly different regarding overall utterance fluency, V = .71, F(8, 37) =
11.05, p < .001. Separate univariate ANOVAs on the 8 dependent variables revealed significant
group differences in Mean syllable duration, Mean syllables per run, Number of corrections per
minute, and Number of silent pauses per minute with large effect sizes (.01 = small, .06 =
medium, .14 = large; Cohen, 1988), and differences approaching significance in Mean length of
silent pauses, Number of repetitions per minute, and Number of filled pauses per minute. Thus, the L1
speakers spoke faster, produced more syllables per run, and used fewer corrections and silent
pauses per minute compared to the L2 speakers. However, the two groups were not statistically
different in terms of Mean length of filled pauses. It is also interesting to note that the L1
speakers used more filled pauses per minute than the L2 speakers, although the difference did not
reach significance at the .006 level.

⁴ The transformed variables were Number of corrections per minute, Number of repetitions per
minute, Number of filled pauses per minute, Mean length of silent pauses at a clause boundary,
and Mean length of silent pauses at an ASU boundary. As they were moderately positively
skewed, square root transformations were used (Larson-Hall, 2010).

Table 2: Overall utterance fluency: Descriptive statistics and group differences

                                    | L1 Speakers (N = 15) | L2 Speakers (N = 31) |
Measure                             | M     | SD           | M     | SD           | F     | df | p       | ηp²
Mean syllable duration (ms)         | 249   | 37           | 321   | 45           | 28.60 | 1  | < .001* | .40
Mean syllables per run              | 14.3  | 5.44         | 6.25  | 2.03         | 53.58 | 1  | < .001* | .55
Number of corrections per minuteₜ   | 0.67  | 1.17         | 2.21  | 1.57         | 16.10 | 1  | < .001* | .27
Number of repetitions per minuteₜ   | 0.82  | 0.90         | 2.06  | 1.78         | 6.42  | 1  | .015    | .13
Number of silent pauses per minute  | 15.1  | 2.85         | 21.8  | 4.24         | 30.43 | 1  | < .001* | .41
Number of filled pauses per minuteₜ | 9.37  | 3.21         | 6.51  | 4.39         | 6.26  | 1  | .016    | .12
Mean length of silent pauses (ms)   | 685   | 170          | 893   | 269          | 7.50  | 1  | .009    | .15
Mean length of filled pauses (ms)   | 499   | 95           | 563   | 127          | 3.02  | 1  | .089    | .06

Note. Silent pause ≥ 250 ms. The subscript t next to a variable indicates that the variable was
square-root-transformed for inferential statistics. An asterisk indicates significant difference at
the 0.006 level after Bonferroni correction (0.05/8).

Pause Phenomena in Different Locations
Figure 2 illustrates the overall distribution of speaking and pause time. As pauses at ASU
boundaries are also at clause boundaries, they are included in pauses at clause boundaries and are
not shown separately in the figure. It shows that the L1 speakers used 25% of their response time
on pausing, whereas the L2 speakers used 38% of their time on pausing. The most striking group
difference was found in silent pauses. Silent pause time in L2 speech was almost double
that in L1 speech (32% vs. 17%); within clauses in particular, the L2 learners spent
more than twice as much time in silent pauses as the L1 speakers did (18% vs. 7%). The figure
also illustrates that filled pauses were used much less often than silent pauses by both groups.
However, the gap between the two types of pause seems greater for the L2 learners than the L1
speakers. The mean (and standard deviation) of Ratio of filled to silent pause time was 0.51 (0.32)
for the L1 speakers and 0.22 (0.18) for the L2 learners. Although the relatively large
standard deviations seem to reflect some individual differences in the use of filled pauses, the L2
learners still used silent pauses much more often than filled pauses compared to the L1 speakers.

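Figure 2: Distribution of speaking and pause time
[Stacked bar chart, percentage of response time (0% to 100%). L2: speaking 62, silent pauses within
clauses 18, silent pauses at clause boundaries 14, filled pauses within clauses 4, filled pauses at
clause boundaries 2. L1: speaking 75, silent pauses within clauses 7, silent pauses at clause
boundaries 10, filled pauses within clauses 3, filled pauses at clause boundaries 4.]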

In addition, as filled pauses were often preceded or followed by silent pauses, Pause
concurrence rate was calculated (see Table 1) to find out how often they co-occurred in the
speech of the two speaker groups. In L1 speech more than half of the pauses (53%) were
adjacent to other pauses, whereas in L2 speech only 35% were.

Figure 3 illustrates the L1 and L2 speakers' Pause rate within a clause, at a clause
boundary, and at an ASU boundary for silent and filled pauses.

Figure 3: Pause rates for silent and filled pauses in different locations (means and standard
errors)
[Bar chart comparing L1 speakers and L2 speakers on six measures: Silent pause rate within a
clause, at a clause boundary, and at an ASU boundary; Filled pause rate within a clause, at a
clause boundary, and at an ASU boundary. Vertical axis: Pause rate, 0 to 1.2.]

Pause rate in different locations indicates how often a speaker paused within each
clause, at each clause boundary, and at each ASU boundary (see Table 1); a value of 1 means the
speaker paused in a given location every time. The L1 and L2 speakers demonstrated different patterns
in Pause rate across locations. The L1 speakers' Pause rate was the lowest within a clause,
increased a bit at a clause boundary, and was the highest at an ASU boundary for both silent
(0.31, 0.37, and 0.49) and filled pauses (0.16, 0.26, and 0.34). On the other hand, the L2 speakers'
Pause rate was the highest within a clause for both silent (1.11) and filled pauses (0.31). Silent
pause rate within a clause yielded the most striking group difference: the rate
for the L2 speakers was almost four times that for the L1 speakers (1.11 vs.
0.31). Moreover, the L2 speakers' Silent pause rate within a clause was over 1, suggesting that
on average they paused more than once within each clause.
A one-way MANOVA showed that, using Pillai's trace, the two groups were significantly
different in overall Pause rate, V = .60, F(6, 39) = 9.53, p < .001. Univariate ANOVAs on the six
dependent variables revealed significant group differences on Pause rate for silent pauses in all
three locations (within a clause, F(1, 44) = 28.37, p < .001, ηp² = .39; at a clause boundary, F(1,
44) = 16.14, p < .001, ηp² = .27; at an ASU boundary, F(1, 44) = 18.42, p < .001, ηp² = .30) but
not on Pause rate for filled pauses in any locations (within a clause, F(1, 44) = 4.04, p = .05, ηp²
= .08; at a clause boundary, F(1, 44) = 4.99, p = .03, ηp² = .10; at an ASU boundary, F(1, 44) =
4.20, p = .05, ηp² = .09) at the .008 level after Bonferroni correction.
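
For a single effect, partial eta squared can be recovered from the F statistic and its degrees of freedom as ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this standard identity to the six F(1, 44) values reported above; the function and variable names are illustrative, and the original values were presumably taken from SPSS output rather than computed this way.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F(1, 44) values reported in the text for Pause rate by pause type and location.
reported_f = {
    "silent, within a clause": 28.37,
    "silent, at a clause boundary": 16.14,
    "silent, at an ASU boundary": 18.42,
    "filled, within a clause": 4.04,
    "filled, at a clause boundary": 4.99,
    "filled, at an ASU boundary": 4.20,
}

effect_sizes = {
    label: round(partial_eta_squared(f, 1, 44), 2)
    for label, f in reported_f.items()
}
for label, eta in effect_sizes.items():
    print(f"{label}: partial eta squared = {eta:.2f}")
```

Rounded to two decimals, the results agree with the reported effect sizes (.39, .27, .30, .08, .10, and .09).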
Table 3 shows the two speaker groups' mean length of silent and filled pauses in different
locations.

Table 3: Mean length of silent and filled pause in different locations (ms)

                                                    | L1 Speakers (N = 15) | L2 Speakers (N = 31)
Measure                      | Location             | M     | SD           | M     | SD
Mean length of silent pauses | within a clause      | 621   | 179          | 811   | 233
                             | at a clause boundary | 722   | 198          | 1018  | 434
                             | at an ASU boundary   | 734   | 218          | 1052  | 475
Mean length of filled pauses | within a clause      | 498   | 119          | 488   | 192
                             | at a clause boundary | 503   | 126          | 588   | 247
                             | at an ASU boundary   | 518   | 125          | 606   | 257

Note. Silent pause ≥ 250 ms.


The L1 and L2 speakers demonstrated a similar pattern in the mean length of pauses in
terms of locations. For both groups the length of pause was shortest within a clause and longest
at an ASU boundary. A one-way MANOVA showed that, using Pillai's trace, there was no
significant difference between the L1 and L2 groups, V = .22, F(6, 39) = 9.53, p = .13.

Correlation Between the Temporal Variables and Speaking Scores
In order to investigate relationships between temporal variables and speaking scores, a
Pearson correlation analysis was run with the L2 learners' data on speed, length of run, repairs,
pause phenomena, and speaking scores (Table 4). The length of silent and filled pauses in
different locations was not included in the analysis, as there was no group difference in these
measures according to the MANOVA test in the previous section.
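
As a rough check on the significance markers used in the correlation analysis, the sketch below converts a Pearson r into a t statistic with n − 2 degrees of freedom and derives the smallest |r| that reaches significance. It is not part of the original analysis. It assumes that every correlation is based on all 31 L2 learners, and it uses two-tailed critical t values for df = 29 taken from a standard t table (2.045 for p < .05, 2.756 for p < .01). The function names are hypothetical.

```python
import math

N_LEARNERS = 31      # assumption: every correlation uses all 31 L2 learners
T_CRIT_05 = 2.045    # two-tailed critical t for df = 29, p < .05 (standard t table)
T_CRIT_01 = 2.756    # two-tailed critical t for df = 29, p < .01 (standard t table)


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a Pearson r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def critical_r(t_crit, n):
    """Smallest |r| that reaches the critical t value for a sample of size n."""
    df = n - 2
    return t_crit / math.sqrt(t_crit * t_crit + df)


def significance_marker(r, n=N_LEARNERS):
    """Return '**', '*', or '' following the p < .01 / p < .05 convention."""
    t = abs(t_from_r(r, n))
    if t >= T_CRIT_01:
        return "**"
    if t >= T_CRIT_05:
        return "*"
    return ""


r_05 = critical_r(T_CRIT_05, N_LEARNERS)  # roughly .355
r_01 = critical_r(T_CRIT_01, N_LEARNERS)  # roughly .456

# Three correlations with speaking scores taken from Table 4.
for label, r in [("MSR", 0.549), ("SPRc", -0.358), ("LngSP", -0.341)]:
    print(f"Speak x {label}: r = {r:+.3f}{significance_marker(r)}")
```

With n = 31, a coefficient therefore needs an absolute value of about .36 to be flagged at the .05 level and about .46 at the .01 level, which is why values such as -.341 and -.311 are described as approaching significance.
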
First, regarding the relationships between the fluency measures, Mean syllable duration
was negatively correlated with Mean syllables per run (r = -.582**) and positively correlated
with Mean length of filled pauses (r = .425*) and Silent pause rate within a clause (r = .580**).
Mean syllables per run was negatively correlated with Number of repetitions per minute
(r = -.461**) and with some silent pause
measures. Considering that the calculation of Mean syllables per run involves number of silent
pauses, it is not surprising to see the correlations between them; however, it is noteworthy that
Mean syllables per run demonstrated the highest correlation with Silent pause rate within a
clause (r = -.855**) and a much lower correlation with Silent pause rate at an ASU boundary (r
= -.385*). Number of corrections per minute was not correlated with Number of repetitions per
minute or the rest of the measures. On the other hand, Number of repetitions per minute was
correlated with Number of silent pauses per minute (r = .615**), Mean length of filled pauses (r
= -.412*), and Silent pause rate within a clause (r = .581**), although they were not
mathematically related. Number of filled pauses per minute was negatively correlated with Mean
length of silent pauses (r = -.416*), which is consistent with the observation that filled pauses
tend to interrupt the silence and break the silence into smaller pieces.
It is interesting to note that Silent pause rate within a clause had a moderately strong
correlation with Mean syllable duration (r = .580**) and Number of repetitions (r = .581**)
even though they were not mathematically related. It is also notable that Pause rate at an ASU
boundary was very strongly correlated with that at a clause boundary (r = .798**), whereas it
was not correlated with that within a clause (r = .134), suggesting pauses within a clause and at
an ASU boundary have a different pattern.
Finally, speaking scores had a moderately strong correlation with Mean syllable duration
(r = -.541**) and Mean syllables per run (r = .549**). They were weakly correlated with
Number of repetitions per minute, showing an approaching significance (r = -.311, p = .089) but
not correlated with corrections (r = -.048, p = .797). Speaking scores were also not significantly
correlated with Number of silent pauses per minute (r = -.283, p = .123) but weakly correlated
with Mean length of silent pauses, showing an approaching significance (r = -.341, p = .061).
They demonstrated a moderately strong correlation with Silent pause rate within a clause
(r = -.535**) and a weak correlation with Silent pause rate at a clause boundary (r = -.358*), but no
correlation with Silent pause rate at an ASU boundary (r = -.017). None of the measures on
filled pause was correlated with speaking scores. It is noteworthy that the results of MANOVAs
and ANOVAs in the previous section showed group differences in Number of corrections per
minute, Number of silent pauses per minute, and Silent pause rate at an ASU boundary; however,
these variables were not significantly correlated with speaking scores.

Table 4: Pearson correlations between utterance fluency measures and speaking scores

       MSD     MSR     Cor     Rep     SPmin   FPmin   LngSP   LngFP   SPRw    SPRc    SPRas   FPRw    FPRc    FPRas   Speak
MSD    1                                                                                                               -.541**
MSR    -.582** 1                                                                                                       .549**
Cor    .115    .043    1                                                                                               -.048
Rep    .209    -.461** .177    1                                                                                       -.311
SPmin  .231    -.782** .046    .615**  1                                                                               -.283
FPmin  .050    .212    .111    .099    -.023   1                                                                       .230
LngSP  .117    -.300   -.212   -.275   -.242   -.416*  1                                                               -.341
LngFP  .425*   .168    .005    -.412*  -.345   .151    .009    1                                                       .062
SPRw   .580**  -.855** .227    .581**  .776**  -.159   .049    -.256   1                                               -.535**
SPRc   .276    -.618** -.027   .152    .424*   -.255   .437*   .014    .433*   1                                       -.358*
SPRas  .072    -.385*  -.118   .023    .256    -.163   .328    .011    .134    .798**  1                               -.017
FPRw   .187    .002    .088    .310    .130    .829**  -.287   -.015   .117    -.076   -.081   1                       .050
FPRc   .281    -.228   -.129   -.020   .221    .503**  -.110   .237    .124    .092    .192    .196    1               .015
FPRas  .156    -.118   -.151   -.156   .117    .427*   -.096   .238    -.037   .071    .247    .052    .946**  1       .107
Note. * = p < .05; ** = p < .01. MSD = mean syllable duration; MSR = mean syllables per run; Cor = number of corrections per
minute; Rep = number of repetitions per minute; SPmin = number of silent pauses per minute; FPmin = number of filled pauses per
minute; LngSP = mean length of silent pauses; LngFP = mean length of filled pauses; SPRw = silent pause rate within a clause;
SPRc = silent pause rate at a clause boundary; SPRas = silent pause rate at an ASU boundary; FPRw = filled pause rate within a
clause; FPRc = filled pause rate at a clause boundary; FPRas = filled pause rate at an ASU boundary; Speak = speaking scores.

To summarize the results of utterance fluency, Mean syllable duration, Mean syllables
per run, and Silent pause rate within a clause were most strongly associated with L2 oral fluency.
The three measures not only distinguished between the L1 and L2 speakers with large effect
sizes but also strongly correlated with L2 speaking scores. On the other hand, Mean length of
filled pauses and Mean length of pauses in different locations (i.e., within a clause, at a clause
boundary, and at an ASU boundary) did not distinguish between the L1 and L2 speakers.

Cognitive Fluency: Qualitative Study
Analysis
Seventeen L2 speakers participated in the stimulated recall session right after the spontaneous
speech task. During the session, they were asked to talk about what they were thinking while
they were pausing or hesitating as far as they could remember. Stimulated recall was conducted
in their L1, Korean, and their responses were translated into English by the Korean-English
bilingual researcher. In order to investigate stimulated recall in terms of proficiency levels, the
responses were analyzed and reported by dividing the participants into two groups based on
their speaking scores: 9 participants in the lower proficiency group (M = 40, SD = 5 on a scale
of 20–60) and the other 8 in the higher proficiency group (M = 54, SD = 5.4 on a scale of 20–60). All 8
participants in the higher proficiency group were undergraduate or graduate students at a
university in the United States and had a minimum score of 90 on the internet-based TOEFL. On
the other hand, the 9 participants in the lower proficiency group were either students in English
as a second language (ESL) courses or university students who were still taking ESL courses.
The mean length of residence in English speaking countries for the lower proficiency group was
0.7 years and for the higher proficiency group was 2.2 years.

The responses were categorized based on their common themes. Five main categories
emerged: content of the message, vocabulary, grammar, phonology, and pragmatics. Grammar
included responses about morphological and syntactic issues and about the use of
function words such as articles and prepositions. The examples of each category are presented
and discussed in the results section. The data should be interpreted with caution in that
stimulated recall has a limitation in capturing cognitive processes accurately, and it is not always
straightforward to match a response with a specific pause or aspect of speech.

Results
Overall, the lower proficiency (LP) group reported 122 issues and the higher proficiency
(HP) group reported only 75 issues. On average, the LP group also responded longer (M = 13.2,
SD = 3.9 minutes) than the HP group did (M = 9.2, SD = 1.8 minutes). Furthermore, only 20% of
the responses were marked with overt repairs in their speech samples, suggesting that 80% of the
issues would have been missed through speech analysis without the stimulated recall data.
Table 5 shows that the comments on the content of the message comprised the largest
proportion of comments by both groups (LP learners: 33.6%, HP learners: 50.7%). The LP
learners mentioned grammar and vocabulary much more often than the HP learners, and the
comments on grammar comprised one third of their responses (31.1%). In the following section,
stimulated recall responses are analyzed qualitatively with examples for each category. The
examples are to demonstrate some of the most common types of responses and to compare
responses from the LP with the HP learners. For each example, information is provided about the
quartile in which the L2 participant's speaking score falls.

Table 5: Distribution of stimulated recall responses

                                              Lower Proficiency Learners     Higher Proficiency Learners
                                              (N = 9)                        (N = 8)
                                              Number of     Percentage       Number of     Percentage
                                              issues        of issues        issues        of issues
Content of the message (L2 related issues)    41 (15)       33.6             38 (2)        50.7
Vocabulary                                    23            18.9             9             12.0
Grammar                                       38            31.1             14            18.7
Phonology                                     3             2.5              2             2.7
Pragmatics                                    2             1.6              5             6.7
Others (Individual differences, L1 use, etc.) 15            12.3             7             9.3
Total                                         122           100              75            100
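
The percentages in Table 5 follow directly from the issue counts. The Python sketch below is a minimal illustration and is not part of the original analysis. The counts are those reported in Table 5, while the dictionary and function names are introduced here for illustration only. It also computes the share of content comments that were L2 related (the counts in parentheses).

```python
# Issue counts as reported in Table 5.
lp_counts = {"Content": 41, "Vocabulary": 23, "Grammar": 38,
             "Phonology": 3, "Pragmatics": 2, "Others": 15}
hp_counts = {"Content": 38, "Vocabulary": 9, "Grammar": 14,
             "Phonology": 2, "Pragmatics": 5, "Others": 7}

# Content comments that were L2 related (the parenthesized counts in Table 5).
lp_l2_related_content = 15
hp_l2_related_content = 2


def percentages(counts):
    """Share of each category in the group's total number of issues, in percent."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


lp_pct = percentages(lp_counts)
hp_pct = percentages(hp_counts)

# Share of content comments that concerned L2 proficiency, in whole percent.
lp_l2_share = round(100 * lp_l2_related_content / lp_counts["Content"])
hp_l2_share = round(100 * hp_l2_related_content / hp_counts["Content"])

print(sum(lp_counts.values()), lp_pct)
print(sum(hp_counts.values()), hp_pct)
print(lp_l2_share, hp_l2_share)
```
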

Content of the Message
Comments on content were mainly about what to say next; however, interestingly, 37% of LP
learners' comments on content were related to how their L2 proficiency affected planning their
message, whereas only 5% of HP learners' comments were. Example 1 illustrates an LP learner
(second quartile) who selected what to say considering her L2 competence.
1. among the interests (453) my interest is juvenile delinquency // (1198) um512 (812) I like
it // because (1075) I think :: in juvenile delinquency is (378) very important //
Retrospection: “My major is important” is not the reason that I like my major. But I just
said that because I cannot explain other reasons in English.
LP learners also mentioned that they often dropped or modified their original message
because of their limited L2 competence as shown below (first quartile).
2. when I was (320) athlete :: {I was} um300 I (366) liked :: (260) for watching TV or
movie //
Retrospection: My original message was “I liked TV but I was afraid of becoming
addicted to it. So I tried not to watch TV a lot.” But this was too difficult for me. So I just
used an easy sentence and said “I liked watching TV.”
In Example 2, the speaker abandoned his original message and said something opposite
to his original message because the original message was too complex for him to convey in
English. Moreover, according to the LP learners, abandoning the original message created an
additional challenge.
3. uh498 (1243) when I have a free time :: I usually (817) do make something (462) like
uh155 food // (887) and (883) {I} (498) I know :: I am very (250) awkward (643) cook //
Retrospection: After I decided to drop a large part of my original message such as
painting my apartment and furniture, and to talk about cooking only, I realized that I had
to think of what to say all over again.
The LP learner in Example 3 (second quartile) dropped part of his original message due
to his limited L2 competence; however, this resulted in another challenge requiring him to plan a
new message. It shows how an issue at one stage can interact with processing at another stage.

Vocabulary
LP learners commented on looking for words or deciding on expressions more often than the HP
learners. Examples 4 and 5 are comments from an LP (first quartile) and an HP learner (fourth
quartile), respectively.
4. I um594 (529) accept my mom {advise} advice //
Retrospection: I was asking myself "what is 'padadeulida' ('accept' in Korean) in
English?"
5. if we (809) ah174 create very valid (538) uh421 (500) assessment tools :: then I think ::
the more students can benefit (658) ah272 from that (737) ah414 assessment system //
Retrospection: I was thinking whether I would say “valid test” or “valid assessment tool.”
As exemplified above, although both comments concerned vocabulary, there seem to be
qualitative differences between them. In 4, the LP learner was trying to look for an L2 word by
translating it from her L1, whereas the HP learner in 5 already had two multi-word expressions
retrieved and was trying to decide which one to use to better fit the context. Furthermore, LP
learners sometimes tried to decide which vocabulary to use by accessing L2 declarative
knowledge regarding word choice as below (first quartile).
6. I enjoy :: (1504) um1090 {(in a whisper) watching seeing looking (288) reading} (333)
reading books //
Retrospection: I got confused among watching, seeing, looking, and reading. I learned
that with “books”, I should say “reading.”

Grammar
Comments on grammar by the two groups also differed not only quantitatively but also
qualitatively. The LP learners remembered their problems regarding sentence
construction and choice of tense-aspect, articles, and prepositions. They reported they were often
thinking about specific rules (i.e., L2 declarative knowledge they had learned), for instance, how
to make a sentence using comparatives or whether to use a gerund or an infinitive. On the other
hand, the majority of HP learners' comments on grammar involved monitoring their sentence structure,
especially with complex clauses. Examples 7 and 8 illustrate LP learners (first quartile) having
difficulty deciding on tense-aspect and making a comparative sentence using their L2 declarative
knowledge.
7. uh468 I learned (450) uh699 textile (1312) // and (490) I don't (267) like my major (283)
// because it's not match for (357) me //
Retrospection: I was wondering whether I should use the present tense or present perfect
because I didn't like it in the past and I still don't like it.
8. In Korea th (1853) {golf} (1335) golf cost is very expensive // (1814) but (463) I
surprised :: that (856) uh268 in USA (440) very cheap than {in my} in my country //
Retrospection: Instead of just saying that it is expensive to play golf in Korea, I wanted to
make a sentence which compares the golf cost between Korea and the US and say it is
more expensive to play golf in Korea than in the US but it wasn't easy to put the sentence
together.

Phonology and Pragmatics
Only a few comments were made on phonology and pragmatics by both groups, although
the HP group commented on pragmatics more frequently than the LP group. Most comments on
pragmatics concerned repetition of words or expressions. Example 9 illustrates an LP learner
(second quartile) monitoring her own pronunciation.
9. ts (476) {is criminal justice} in criminal justice there are (329) lots of academic interests
// (1176) for example (784) murder rapes (862) and thief (521) perjury // (1058) uh711
(358) any kinds of crime // (873) Ts (526) and {in addish} in addition (358){ju} (358)
juvenile delinquency (633)
Retrospection: When I said “in addition,” I thought my intonation was wrong, so I tried
to repair stress and intonation. As “juvenile” is such a difficult word for me to pronounce,
I always monitor whether I pronounce it correctly.

Other Issues
Responses also showed that individual differences might play a role in deciding whether
to repair one's errors and influence oral fluency. Following are the comments from two LP
learners (second quartile) on their decision regarding self-repair.
10. …some people just disregard their mistakes even after they notice them, but I don't like to
ignore them. So when I make mistakes, I try to repair them, even if this often gets my
sentences all mixed up…
11. …before I came to the US, I thought about how to express things in English carefully
before speaking but these days I just try to say things without thinking too much. Because
after I came here I realized that overthinking didn't seem to really improve my utterance
and when I make errors in English, some English speakers like my teacher also help me
to repair them.
In 10 the learner seems to attribute her tendency to repair her mistakes to her personality
and the learner in 11 explains how his L2 learning experience changed his self-repair behavior.
Other issues that the L2 learners mentioned included difficulty connecting speech
messages due to constant monitoring and access to L2 declarative knowledge, use of L1 in
conceptual preparation, and translating messages from L1 to L2. A few higher proficiency
learners reported difficulty remembering what they were thinking while pausing, saying "I don't
remember."

Discussion
The current study aimed to demonstrate the respects in which L1 and L2 speakers' fluency
differs. Within Segalowitz's (2010) fluency framework, the study investigated (1) utterance
fluency by comparing speech samples from L1 and L2 speakers, and (2) cognitive fluency by
examining the stimulated recall responses from L2 speakers.

Utterance Fluency
Utterance fluency was investigated in terms of speed, length of run, repairs and pause
phenomena. Table 6 summarizes the results of the MANOVAs and correlation analysis. Based
on the results, Mean syllable duration, Mean syllables per run, and Silent pause rate within a
clause were most strongly associated with L2 oral fluency. The three measures not only
distinguished between the L1 and L2 speakers with large effect sizes but also significantly
correlated with L2 speaking scores. On the other hand, Mean length of filled pauses and Mean
length of pauses in different locations (i.e., within a clause, at a clause boundary, and at an ASU
boundary) did not distinguish between the L1 and L2 speakers. In the following, the findings on
speed, length of run, repairs, and pause phenomena are discussed in turn in relation to previous
studies.

Speed
In the current study, Mean syllable duration, which excludes pause time and is thus a
pure measure of speed (De Jong, Steinel, et al., 2013), distinguished the L1 and the L2 speakers
and exhibited a significant negative correlation with L2 speaking scores (r = -.541).

Table 6: Summary of statistical analyses

Measures        Submeasures                                          L1-L2          Correlation with
                                                                     difference     speaking scores
Speed           Mean syllable duration (ms)                          ✓              -.541**
Length of run   Mean syllables per run                               ✓              .549**
Repair          Number of corrections per minute                     ✓              -.048
                Number of repetitions per minute                     approaching    -.311
Pause           Number of silent pauses per minute                   ✓              -.283
                Number of filled pauses per minute                   approaching    .230
                Mean length of silent pauses (ms)                    approaching    -.341
                Mean length of filled pauses (ms)                                   .062
                Silent pause rate within a clause                    ✓              -.535**
                Silent pause rate at a clause boundary               ✓              -.358*
                Silent pause rate at an ASU boundary                 ✓              -.017
                Filled pause rate within a clause                                   .050
                Filled pause rate at a clause boundary                              .015
                Filled pause rate at an ASU boundary                                .107
                Mean length of silent pauses within a clause
                Mean length of silent pauses at a clause boundary
                Mean length of silent pauses at an ASU boundary
                Mean length of filled pauses within a clause
                Mean length of filled pauses at a clause boundary
                Mean length of filled pauses at an ASU boundary
Note. ✓ = statistical difference after Bonferroni corrections (for details, see the analysis section).
*p < .05; **p < .01. As results of MANOVA showed that there was no group difference in mean
length of pauses in different locations (the bottom 6 rows), the variables were not included in the
correlation analysis.

Previous studies on perceived fluency often did not find a correlation between
Articulation rate (comparable to Mean syllable duration) and fluency ratings. In Kormos and
Dénes (2004) and Cucchiarini et al. (2002), Articulation rate was not correlated with fluency
ratings. On the other hand, results of the present study are consistent with recent studies which
investigated relationships between acoustic fluency measures and oral proficiency or cognitive
fluency. In Ginther et al. (2010), Articulation rate was significantly correlated with oral
proficiency scores (r = .61). In De Jong, Steinel, et al. (2013), Mean syllable duration explained
50% of the variance of linguistic knowledge and skills (cognitive fluency). De Jong, Groenhout,
et al. (2013) also found that L2 Mean syllable duration explained 30% and the corrected measure
for L1 behavior explained 41% of variance of L2 proficiency. Therefore, as suggested by De
Jong, Steinel, et al. (2013), pure measures of speed such as Mean syllable duration and
Articulation rate can be claimed to reflect L2 cognitive fluency and L2 proficiency.

Length of Run
As in many previous studies (e.g., Kormos & Dénes, 2004; Lennon, 1990; O'Brien et al.,
2007; Segalowitz & Freed, 2004; Towell et al., 1996), Mean syllables per run was strongly
associated with L2 fluency by demonstrating a large difference between the L1 and L2 speakers
and a significant correlation with speaking scores. Length of run has a conceptual connection
with automatic speech production processing. Towell et al. (1996) claimed that increase in length
of run reflects proceduralization of declarative knowledge based on Anderson's (1983) ACT*
model of skill acquisition. Furthermore, length of run also seems to be closely related to the use
of prefabricated language units and formulaic language, which have been claimed to facilitate L2
oral fluency (Boers et al., 2006; Bybee, 2002; Kuiper, 1996; Skehan, 1998). However, the direct
connection between automaticity and length of run is still not clear. For instance, it is unclear
whether a long fluent run reflects automaticity in lemma retrieval, grammatical encoding,
morpho-phonological encoding, phonetic encoding, articulation, or a combination of some of these stages.
Segalowitz and Freed (2004) found correlations between mean length of run without fillers and
lexical access speed and efficiency, suggesting its potential connection with lemma retrieval;
however, whether length of run is also related to other processing stages such as grammatical
encoding or morpho-phonological encoding requires further research.

Repairs
Repair measures have often not been indicative of perceived fluency in previous studies
(e.g., Cucchiarini et al., 2002; Kormos & Dénes, 2004). However, Bosker et al. (2013) found that
repairs did contribute a small but significant amount to perceived fluency. In the present study,
the L2 speakers used more corrections and repetitions than the L1 speakers, and there was a
weak negative correlation between repetitions and speaking scores. The data also suggest they
are affected by individual differences, as demonstrated by the large standard deviations of the
frequency of repairs in L1 as well as L2 (Table 2), and also by the stimulated recall data
(Examples 10 and 11), in which personality and L2 learning experience played a role in deciding
whether or not to repair one's errors.
Another issue regarding repairs is that self-corrections and repetitions exhibited different
relations with L2 fluency. In the correlation analysis, number of corrections and repetitions did
not correlate with each other. Moreover, only repetitions were correlated with speaking scores
(showing an approaching significance) and corrections were not. Number of repetitions was also
correlated with number of silent pauses, length of filled pauses, and silent pause rate within a
clause but number of corrections was not. The differential findings between the number of
corrections and repetitions are in line with De Jong, Steinel, et al. (2013), in which number of
corrections explained 25% of the variance but number of repetitions explained 12% of the
variance of linguistic knowledge and skills (cognitive fluency). Therefore, it would be premature
to claim that repairs do not reflect L2 fluency and future research on repairs will need to address
the issues regarding individual differences and potentially differential roles of self-corrections
and repetitions in L2 fluency.

Pause Phenomena
As mixed results had been found in the literature on breakdown fluency, the present study
especially analyzed pause phenomena rigorously by investigating frequency, duration, and
distribution of both silent and filled pauses. In this section, results on pause phenomena are
discussed in depth in relation to previous studies.
Results on pause phenomena showed that the L1 and L2 groups differed more in the
frequency than length of pauses and they were also more different in the use of silent pauses than
filled pauses. The results are consistent with the findings in the literature that pause duration was
often not strongly associated with L2 fluency. It was not correlated with fluency ratings (e.g.,
Cucchiarini et al., 2002) and only very weakly related to cognitive fluency (De Jong, Steinel, et
al., 2013). De Jong, Groenhout, et al. (2013) further showed that length of silent pauses barely
explained the variance of L2 proficiency (0 − 3%). In her comparative research on L1 and L2
pausing, Riazantseva (2001) concluded that pause duration is a language-specific feature.
In terms of filled pauses, it is interesting to note that the L2 speakers used filled pauses
less often than the L1 speakers, showing an approaching significance, and L2 speakers'
filled-to-silent-pauses ratio was more than two times lower than L1 speakers'. Filled pauses can be
viewed either as a non-linguistic symptom of trouble in the speech production process (Goffman,
1981; Levelt, 1989), or as "fillers" with linguistic elements such as the discourse markers "well" and
"you know." Clark and Fox Tree (2002) proposed that "uh" and "um" are words with functions.
From the perspective of their "filler-as-word hypothesis," appropriate use of filled pauses can be
considered something that L2 learners need to acquire and that is related to L2
proficiency. In fact, the correlation analysis showed that speaking scores had a weak but positive
correlation with Number of filled pauses per minute (r = .230), while they had a weak negative
correlation with Number of silent pauses per minute (r = -.283). Given that the L1 speakers also
had a higher concurrence rate of silent and filled pauses compared to the L2 speakers, L1
speakers may use filled pauses strategically to signal a delay in speaking and splice a long silent
pause into smaller pauses, making their speech sound less disfluent. In the current study, Number
of filled pauses per minute did negatively correlate with Mean length of silent pauses (r = -.416),
which is consistent with the observation that filled pauses tend to interrupt the silence and break
the silence into smaller pieces. The relationship between filled pauses and L2 proficiency also
needs further investigation.
Although the MANOVA results showed some group differences in the frequency and
length of pauses, it is noteworthy that none of the overall pause measures was significantly
correlated with speaking scores until pause location was taken into account. Silent pause rate
within a clause had the most striking group difference (Figure 3) and a strong negative
correlation with speaking scores.⁵ The findings are corroborated by previous studies (e.g.,
Deschamps, 1980; Tavakoli, 2011), in which L2 speech often had pauses in the middle of clauses,
whereas L1 speech had pauses at syntactic boundaries. The significant correlation between the
speaking scores and Silent pause rate within a clause is compatible with the findings on L2
fluency development (Lennon, 1990; Towell et al., 1996) and the argument that pauses within
clauses reflect processing difficulties in speech production (e.g., Lennon, 1984; Pawley & Syder,
2000; Wood, 2010).

⁵ These findings can be interpreted with even more emphasis because when speakers make
longer clauses (e.g., L1 speakers, higher proficiency learners), they have a higher chance of
pausing within a clause by definition.

Research on L1 speech production has shown that pauses within clauses typically occur
before unpredictable and infrequent words and are related to lexical retrieval (Levelt, 1983;
Maclay & Osgood, 1959), whereas pauses at clause boundaries are associated with a more
general "long-term" planning of the following clause such as word ordering and syntactic
encoding (Kircher, Brammer, Levelt, Bartel, & McGuire, 2004). Kircher et al. (2004) used
functional Magnetic Resonance Imaging (fMRI) to examine neural correlates of pauses within
clauses and found that pauses within clauses are associated with left temporal activation. They
claimed that their findings suggest that pauses within clauses are a correlate of speech planning
and in particular lexical retrieval. Based on their claim, L2 speakers' high Silent pause rate
within a clause in the current data can be interpreted to reflect difficulty in speech planning and
lexical retrieval. The interpretation is also consistent with the observation that lexical retrieval is
one of the most salient problems with L2 speakers (De Bot, 1992). Furthermore, as shown in the
stimulated recall data and their speech in the previous section, unlike L1 speakers, L2 speakers
may pause in the middle of a clause even for more general planning such as deciding content of
message (Examples 1 and 2) and syntactic encoding (Example 8) because their speech planning
processes are not automatized and often go through a process of trial and error during speech
production. These speech planning processes are likely to result in high pause rate within clauses
by the L2 speakers.
Moreover, the measure for the distribution of pauses, Pause rate, is worth mentioning.
Pause rate was devised in the current study to measure how likely a speaker is to pause within
each clause, at each clause boundary, and at each ASU boundary by dividing the number of pauses
by the number of corresponding units (i.e., number of clauses, clause boundaries, ASU boundaries). Pause rate is
able to capture the frequency of pauses in different locations more accurately than measures used
previously such as Number of pauses per minute (e.g., Tavakoli, 2011). In Table 7, Number of
pauses per minute is calculated. It shows that the L1 speakers had more pauses at clause
boundaries than at ASU boundaries; however, the results are not very informative and make it
hard to compare pause frequency across the locations because there were more clauses (M = 31)
than ASUs (M = 18). In fact, when number of clauses and ASUs were controlled, results of Silent
pause rate (Figure 3) showed that the L1 speakers paused at an ASU boundary (0.49) more often
than within a clause (0.31) or at a clause boundary (0.37).
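
The contrast between the two ways of counting pauses can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the class and function names are hypothetical, the counts are invented for a single 1.5-minute sample, and the number of units at each location is assumed to come from prior annotation of the transcript.

```python
from dataclasses import dataclass


@dataclass
class LocationCounts:
    """Counts for one pause location in one speech sample (hypothetical structure)."""
    n_pauses: int  # pauses observed at this location
    n_units: int   # opportunities: clauses, clause boundaries, or ASU boundaries


def pause_rate(counts):
    """Pauses per unit, e.g. silent pauses within a clause divided by number of clauses."""
    return counts.n_pauses / counts.n_units


def pauses_per_minute(counts, duration_min):
    """Time-normalized frequency, which ignores how many units were available."""
    return counts.n_pauses / duration_min


# Invented counts for a single 1.5-minute sample (not data from the study).
sample = {
    "within a clause": LocationCounts(n_pauses=9, n_units=30),
    "at a clause boundary": LocationCounts(n_pauses=6, n_units=12),
    "at an ASU boundary": LocationCounts(n_pauses=9, n_units=18),
}
duration_min = 1.5

rates = {loc: pause_rate(c) for loc, c in sample.items()}
per_minute = {loc: pauses_per_minute(c, duration_min) for loc, c in sample.items()}

for loc in sample:
    print(f"{loc}: {per_minute[loc]:.1f} per minute, pause rate {rates[loc]:.2f}")
```

In this invented sample the per-minute measure puts pauses within a clause level with pauses at an ASU boundary, whereas the rate, which divides by the number of opportunities, shows that the speaker pauses less readily within a clause than at either boundary.
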

Table 7: Number of silent and filled pauses per minute in different locations

                                       L1 Speakers (N = 15)     L2 Speakers (N = 31)
                                       M          SD            M          SD
Number of silent pauses per minute
  Within clauses                       6.6        2.2           13.8       4.5
  At clause boundaries                 8.5        2.1           8.0        2.3
  At ASU boundaries                    6.8        1.5           7.0        2.1
Number of filled pauses per minute
  Within clauses                       3.9        2.0           4.2        3.7
  At clause boundaries                 5.5        2.5           2.3        1.8
  At ASU boundaries                    4.5        2.0           2.2        1.8

Pause rate also enables accurate comparison of pause distribution across speakers.
According to Table 7, the L1 and L2 speakers were similar in Number of silent pauses at clause
boundaries or at ASU boundaries; however, again L1 speech had more clauses and ASUs (31
clauses; 18 ASUs) than L2 speech (24 clauses; 16 ASUs). After controlling for the number of
clauses and ASUs across the speakers, results of Pause rate show that the L2 speakers also had a
higher pause rate than the L1 speakers at a clause boundary and at an ASU boundary (Figure 3).
Based on L2 speakers' high Silent pause rate within a clause, one might have predicted that,
unlike within a clause, at clause or ASU boundaries L2 speakers would pause less often than L1
speakers. Pause rate was also able to precisely address this question and shows that, contrary to
the prediction, the L2 speakers, in fact, paused more often than the L1 speakers not only within a
clause but also at both boundaries even if the group difference was much more striking within a
clause. Given that L1 speakers' pauses between clauses are associated with general speech
planning such as word ordering and syntactic encoding (Kircher et al., 2004), it seems logical for
L2 speakers to pause between clauses as well as within clauses more often than L1 speakers.

Cognitive Fluency
In investigating L2 cognitive processes and issues regarding disfluencies, the current
study used stimulated recall, which can reflect cognitive events and reveal the information
attended to during task performance (Gass & Mackey, 2000). This technique has rarely been
used in studies of L2 fluency. The data showed that only 20% of the responses were marked with
overt repairs in speech samples, suggesting that 80% of the issues would have been missed through
speech analysis without stimulated recall.
According to the declarative/procedural model (Ullman, 2001, 2004, 2005) and the
studies on automaticity (DeKeyser, 2001, 2007; Segalowitz, 2000, 2003), practice and more
exposure to L2 lead to dependence on procedural memory/knowledge. Based on their claims,
lower proficiency learners were predicted to rely on L2 declarative knowledge more than higher
proficiency learners. Considering that only declarative knowledge can be explicitly recollected
(Ullman, 2001, 2004, 2013), lower proficiency learners were expected to remember more about
their thoughts at the time of speaking than higher proficiency learners. Results followed the
predictions and on average the lower proficiency learners reported over 1.5 times more issues
than the higher proficiency learners did and also responded 44% longer than the higher
proficiency learners. It may not be very surprising that the lower proficiency learners commented
on more cases than the higher proficiency learners as they reported what they were thinking
while pausing or hesitating at the time of speaking. As the lower proficiency learners paused or
hesitated more often than the higher proficiency learners, they had more pauses to talk about.
However, it is still interesting to see that whereas the lower proficiency learners remembered and
reported a number of their thoughts quite easily, the higher proficiency learners often mentioned
having trouble remembering them, saying that “I don’t remember.” The findings are also in line
with the argument that lower proficiency learners consciously think about more issues during the
speech production process because the process has not yet been automatized and requires a lot of
attentional effort (Kormos, 2006).
Furthermore, as higher proficiency learners’ production processes are assumed to be
more automatized than lower proficiency learners’, the higher proficiency learners were
predicted to report mainly on macroplanning and monitoring that required their attention as in L1
speech (Kormos, 2006), whereas the lower proficiency learners were expected to report on more
varied issues including syntactic and morpho-phonological encoding that are not fully
automatized and are controlled in the declarative memory system. The predictions were met: the
content of the message comprised a larger proportion of the comments by the higher
proficiency learners (51%) than by the lower proficiency learners (34%), and the lower proficiency
learners commented on L2 declarative knowledge concerning grammar and vocabulary much
more often than the higher proficiency learners.
Some part of the stimulated recall responses can be discussed in association with the L2
speech production model and the fluency vulnerability points in Figure 1. In both L2 groups’
responses, the largest proportion consisted of the content of the message. Considering that even
L1 speakers’ fluency declines during macroplanning (Roberts & Kirsner, 2000), it seems
reasonable that L2 speakers pause for a considerable amount of time to think about what to say.
However, more importantly, 37% of lower proficiency learners’ comments on content were
related to their limited L2 competence, whereas only 5% of higher proficiency learners’ were.
The lower proficiency learners often dropped or modified their original message due to L2
difficulties (Example 2), as reported in the studies on monitoring and problem-solving
mechanisms (Dörnyei & Kormos, 1998; Kormos, 2000a, 2006). It is also interesting to note that
the lower proficiency learners often abandoned their original message to avoid problematic
situations; this eventually imposed another cognitive burden, as they had to plan a new message
(Example 3).
Furthermore, the responses showed that the L2 speakers were affected by their L2
proficiency not only when monitoring their message after articulation, but also at an early stage
of conceptual preparation as in Example 1, in which an L2 learner chose what to say considering
her L2 competence. Following is a comment from a higher proficiency learner, which precisely
addresses the issue focusing on the proficiency level.
12. …when I choose what to say, I think my English proficiency influences the decision. I
think that “I can say this much, so I will say this.” When I was a beginner in English,
there were so many cases in which I failed to convey what I originally planned to say. So
now I decide what to say based on how much I can express in English…
The comment suggests that as proficiency levels increase, L2 learners not only develop
L2 competence and skills, but may also become more strategic about deciding what to say
possibly based on their understanding of their own oral proficiency through L2 experience. It is
possible that lower proficiency learners do not have a good understanding of their own oral
proficiency, choose a relatively difficult or complex topic for them to talk about, face difficulties
in conveying the original message, decide to drop the message, and face another challenge to
pick a better topic for them to talk about. This trial and error is likely to be associated with
more pauses and repairs, interrupting speech fluency. This is compatible with Segalowitz’s (2010)
prediction that “the more macroplanning a communicative situation requires, the more
vulnerable the L2 speech will be to disfluencies because of the diversion of processing resources”
(p. 11). De Bot (1992) also suggested that non-balanced bilinguals sometimes do not have the
lexical items required to express a concept and they often seem to be aware of this problem in
advance and take it into account in conceptual preparation. In their studies on problem-solving
mechanisms and self-monitoring in L2 speech, Dörnyei and Kormos (1998) and Kormos (2000a)
also exemplify a number of cases when L2 speakers abandon or change their original intended
message (macro-plan) due to L2 difficulties. As the L2 proficiency level and learners’ level of
understanding about their own oral proficiency can play a role in deciding what to say, the
current study proposes that macroplanning be another candidate for a fluency vulnerability point
in the L2 speech production model (Figure 1; Segalowitz, 2010).
Another point to discuss in relation to the model of L2 speakers concerns the
interpretation of the finding that the lower proficiency learners remembered attending to L2
declarative knowledge regarding specific grammatical features and vocabulary much more often
than the higher proficiency learners. This might be considered to reflect processing difficulties in
grammatical encoding and lemma retrieval in the model. However, there are a number of issues
to be resolved to pinpoint at which speech production stage these difficulties occurred. For
instance, a number of responses on grammar by the lower proficiency learners concerned the
choice of tense-aspect. They often mentioned that they tried to apply what they had learned in
class. However, the information about tense-aspect and mood is claimed to be added during
microplanning in the L1 speaking model through an automatic process without requiring
attentional effort (Levelt, 1989). Here we face a number of L2 specific issues which the L1
speech production model does not seem to address. For instance, in L2 it is not clear whether the
information about tense-aspect is still added to a preverbal message during microplanning. If the
microplanning stage itself is highly automatic in nature, it is not easy to explain why so many L2
learners were able to remember thinking about tense-aspect.
In fact, all the boxes in the model (Figure 1) represent processing components with
procedural knowledge, thus being largely automatic and subconscious in nature. By contrast,
based on the responses, the lower proficiency learners almost always seemed to think about L2
declarative knowledge or rules while speaking in L2 (e.g., deciding on the words, tense-aspect,
function words such as articles, prepositions, and putting words together to construct a sentence).
This seems to support Kormos’ (2006) proposal to add an L2 declarative knowledge storage in
the L2 speech production model.[6] However, it is still not clear which processing stages of L2
speech production have access to L2 declarative knowledge or how the L2 declarative storage
and other speech production stages interact. There are a number of issues to be investigated in
order to understand the process of L2 speech production.

[6] It is true that the stimulated recall data had a number of comments on L2 declarative knowledge
because stimulated recall taps conscious, and thus mainly declarative, processes. However, what
this study tried to show is not that subconscious processes are less involved or less important
in L2 speech production than in L1 speech production, but rather that conscious
processes seem to be frequently involved in L2 production, which is not easy to explain based on
the L1 speech production model.

General Discussion: Study 1

To demonstrate in what respects L1 and L2 speakers’ fluency are different, Study 1
investigated utterance fluency and cognitive fluency of the L1 English speakers and the L1
Korean L2 English speakers by analyzing speed, length of run, repairs, and pause phenomena of
their speech samples, and stimulated recall responses.
The results of the MANOVAs and correlation analysis showed that Mean syllable
duration, Mean syllables per run, and Silent pause rate within a clause were most strongly
associated with L2 oral fluency. The three measures not only distinguished between the L1 and
L2 speakers with large effect sizes but also strongly correlated with L2 speaking scores.
Although pure speed measures (i.e., Articulation rate, Mean syllable duration) have not always
been found to be strongly associated with perceived fluency in the literature, the current finding is
consistent with recent studies which showed a strong correlation of speed fluency with oral
proficiency and cognitive fluency (Ginther et al., 2010; De Jong, Groenhout, et al., 2013; De
Jong, Steinel, et al., 2013). In particular, Study 1 conducted an in-depth analysis of pause
phenomena by examining frequency, duration, and distribution of both silent and filled pauses.
The results showed that Silent pause rate within a clause clearly distinguished the two groups
and had a strong correlation with speaking scores. The findings are consistent with the claims in
previous studies that pauses within clauses reflect processing difficulties in speech production
(e.g., Kircher et al., 2004; Pawley & Syder, 2000).
Stimulated recall responses showed that the lower proficiency learners remembered more
issues regarding L2 declarative knowledge on grammar and vocabulary than the higher
proficiency learners, which was compatible with the declarative/procedural model and studies on
automaticity. In addition, stimulated recall responses suggested the possibility that macroplanning
is another candidate for a fluency vulnerability point, considering that L2 proficiency seems to
affect L2 speakers’ initial decision on the message. Lower proficiency learners’ frequent
comments on specific grammatical rules and vocabulary seem to lend support for including an
L2 declarative knowledge store in the L2 speech production model.
The present study tried to fill gaps and extend the body of research on L2 fluency by
investigating utterance and cognitive fluency within Segalowitz’s (2010) framework. The study
provided empirical evidence to demonstrate in what respects L1 and L2 speakers’ fluency are
different by examining temporal measures of speed, repairs, and pause phenomena. It analyzed
utterance fluency in a comprehensive and rigorous way to address less studied aspects of fluency
(e.g., distribution of pauses, frequency and duration of filled pauses, relationship between silent
and filled pauses) and inconclusive findings in previous studies (e.g., pure measure of speed
fluency, repairs).
Study 1 also has implications for research methodology. The study discussed strengths
and weaknesses of different measures used in previous studies and further proposed a measure
(i.e., Pause rate) that can depict pause distribution more accurately. It also utilized qualitative
analysis in exploring cognitive fluency to provide additional insight to the field of L2 fluency
research where quantitative analysis is dominant.
Furthermore, results of the study have potential implications for L2 education and
assessment. One of the most novel and important findings of the current study is the close
relationship between L2 utterance fluency and pauses within clauses. L1 and L2 speech exhibited
a striking difference in the frequency of pauses within clauses, which is considered to reflect
difficulties in speech production processing such as lexical retrieval. Based on the findings, one
way teachers can help L2 learners enhance L2 fluency in the classroom is to provide ample
opportunities to practice collocations and formulaic language, which will enable learners to
produce longer fluent runs and will decrease pauses within clauses in their speech. In terms of L2
assessment, including a measure that addresses the frequency of silent pauses within clauses
in automatic fluency assessment can help evaluate L2 learners’ oral fluency more accurately.

CHAPTER 3: STUDY 2

Introduction
In order to identify speech features which affect the perception of fluency, a number of
previous studies investigated the relationship between utterance fluency and perceived fluency
by relating fluency ratings to acoustic characteristics of L2 speech (e.g., Bosker et al., 2013;
Cucchiarini et al., 2000, 2002; Derwing et al., 2004; Freed, 2000; Ginther et al., 2010; Kormos
& Dénes, 2004; Rossiter, 2009). However, results are still mixed regarding pause phenomena. For
instance, pause frequency was correlated with fluency ratings in Rossiter (2009) but not in
Kormos and Dénes (2004), whereas pause duration was correlated with fluency ratings in Kormos
and Dénes (2004) but not in Cucchiarini et al. (2002). Although pause phenomena seem to play an
important role in the perception of fluency, the relative contributions of frequency, duration, and
distribution of pauses have rarely been investigated. In particular, it has been argued that pauses
within constituents, which recent studies have identified as major characteristics of non-fluent L2
speech (Kahng, 2012; Tavakoli, 2011), reflect difficulties in speech processing and planning.
However, the effects of pause location on perceived fluency of L2 speech have not yet been
examined.
As discussed in the literature review section, L1 research on pause phenomena suggests
that pause location affects speech perception. Silent pauses are one of the acoustic cues to clausal
units along with pitch and vowel duration (Seidl & Cristià, 2008). Therefore, silent pauses at
grammatical boundaries have been claimed to help listener comprehension by indicating the
boundaries of speech to be analyzed, and by providing cognitive processing time (e.g., Arons,
1993; Griffiths, 1991; Reich, 1980; Sugito, 1990), whereas pauses within clauses can be
disruptive. It has also been reported that the beneficial effects of silent pauses between clauses on
listeners were apparent only under conditions of cognitive complexity in auditory speech
processing and did not appear when the speech or tasks were easy
enough (Aaronson, 1968; Reich, 1980).
Study 2 aims to address these gaps in the literature on L2 perceived fluency by examining (1)
the relative contributions of frequency, duration, and distribution of pauses to the perception of
L2 fluency (Experiment 1), and (2) the effects of pause location on the perception of L1 and L2
fluency (Experiment 2).

Experiment 1
Experiment 1 investigated the relative contributions of frequency, duration, and
distribution of silent pauses to fluency ratings. The research questions for Experiment 1 are:

1. Which acoustic measures of pause phenomena (frequency, duration and/or
distribution of silent pauses) are significantly related to fluency ratings?
2. Does the distribution of pauses explain significantly additional variance of fluency
ratings which is not explained by frequency and duration of silent pauses?

Based on previous studies on perceived fluency, frequency and duration of silent pauses
are predicted to be correlated with fluency ratings. On the other hand, the relationship between
the distribution of silent pauses and L2 perceived fluency has not been investigated; therefore, it
is the main focus of Experiment 1. Research on L1 pause phenomena has shown that silent
pauses are one of the cues to clausal units, and pauses between clauses can be useful, whereas
pauses within clauses can interfere with speech perception processing (Bower & Springston,
1970; Griffiths, 1991; Reich, 1980; Sugito, 1990). On the basis of the L1 research findings,
fluency ratings are predicted to correlate not only with frequency and duration of silent pauses
but also with distribution of silent pauses. In addition, if the regression model with the variable of
pause distribution explains significantly more variance in fluency ratings than the model without
it, the result can be interpreted as reflecting the critical role of pause
distribution in perceived fluency.
In Experiment 1, English native listeners rated L2 speech samples on fluency level. The
speech samples were also acoustically analyzed in terms of frequency, duration, and distribution
of silent pauses. The relative contributions of the three aspects of pause phenomena to fluency
ratings were examined through multiple regression analysis.

Method
Raters
Forty-six native English speakers (16 male; 30 female) participated in the experiment as
raters. They were undergraduate students at a large university in the United States with a mean
age of 21 (SD = 2.3), and they reported having normal hearing. Their mean familiarity with Korean
accented English was 3.4 (SD = 1.7) on a scale of 1 (not familiar at all) to 9 (extremely familiar).

Stimulus Description
Seventy-four L2 speech samples from 37 Korean speakers (10 male; 27 female) and six L1 speech
samples from three English speakers (1 male; 2 female) were used. The mean age of the Korean
speakers was 31.5 (SD = 6.5). Their length of residence in English-speaking countries ranged
from 1 month to 8 years (M = 2.1, SD = 2.1). The Korean speakers also had a wide range of
English proficiency levels, ranging from students in ESL beginner classes to graduate students in
the United States who earned close to perfect scores on the internet-based TOEFL.[7] The six L1
speech samples served as reference points to which the listeners could compare the L2 speech.
The three English speakers were undergraduate or graduate students at a large university in the
United States. The speech samples were responses to the same two questions as in Study 1, one
about their major field and the other about their free time activities (Appendix A). For
presentation to the raters, 20-second excerpts were taken from approximately the middle of the
original recordings (Bosker et al., 2013; Derwing et al., 2007). Each excerpt started and ended at
a clause boundary. All the speech samples were normalized in Praat (Boersma & Weenink, 2012)
to have a mean intensity of 70dB.

Procedure
The raters heard 80 speech samples in random order over headphones and rated their
level of fluency using a nine-point scale with labeled extremes (1 = extremely disfluent, 9 =
extremely fluent). The speech excerpts and the scale were presented to raters using Praat
(Boersma & Weenink, 2012). The scale appeared on the screen after each sample excerpt had
been played; therefore, raters could rate each excerpt only after they heard the whole excerpt.
The raters were asked to rate how easily and smoothly speech is delivered, focusing on features
of fluency such as speed, pause and repair phenomena, rather than in terms of overall proficiency
(see Appendix C for the instructions). Before the actual experiment, each rater completed a
practice session to ensure familiarity with the task. In the experiment, speech samples were
completely randomized for each rater. The experiment was conducted in a quiet room with a
group of at most 4 raters per session. The rating experiment took about 40 minutes and the raters
were able to take a short break after rating half of the speech excerpts. After completing the
rating experiment, the raters filled out a short questionnaire on their background information,
familiarity with Korean accented English, and L2 learning and teaching experience (see
Appendix D).

[7] The speech samples collected for Study 1 were also included among the stimuli in Study 2.

Acoustic Analysis of Speech Excerpts
In order to investigate relationships between fluency ratings and pause phenomena in
speech, the L2 speech materials were analyzed acoustically. First, all speech excerpts were
transcribed in detail including information regarding silent pauses (≥ 250 ms; De Jong & Bosker,
2013; Kahng, 2012). The length of silent pauses was measured in milliseconds (ms) by listening
to each speech excerpt and examining the waveform and spectrogram using Praat (Boersma &
Weenink, 2012), and the duration was added to the transcript. Pauses were also categorized,
depending on their locations, as either within clauses or between clauses (Foster et al., 2013).
Next, the frequency and duration of silent pauses were measured by Number of silent pauses per
minute and Mean length of silent pauses, respectively. The distribution of silent pauses was
operationalized by Silent pause rate within a clause based on the findings in Study 1 which
indicated that, unlike silent pauses at grammatical junctures (e.g., at a clause boundary), the
measure not only clearly distinguished between the L1 and L2 speakers but also had a strong
negative correlation with L2 oral proficiency. Silent pause rate within a clause captures how
often a speaker pauses within each clause on average and was computed by dividing the total
number of silent pauses that occurred within clauses by the number of clauses in each speech excerpt.
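
The three pause measures defined above can be illustrated with a minimal sketch. It assumes that each excerpt has already been annotated with pause durations and locations; the names Pause and pause_measures and the example values are hypothetical and are not taken from the study's materials. Only the 250 ms threshold and the three definitions follow the text.

```python
from dataclasses import dataclass
from typing import List

MIN_PAUSE_MS = 250  # silent-pause threshold stated in the text (>= 250 ms)


@dataclass
class Pause:
    duration_ms: float
    within_clause: bool  # True = within a clause, False = between clauses


def pause_measures(pauses: List[Pause], n_clauses: int, excerpt_seconds: float) -> dict:
    """Return the three silent-pause measures for one speech excerpt."""
    if n_clauses <= 0 or excerpt_seconds <= 0:
        raise ValueError("n_clauses and excerpt_seconds must be positive")
    # Keep only silences long enough to count as silent pauses.
    silent = [p for p in pauses if p.duration_ms >= MIN_PAUSE_MS]
    n_within = sum(1 for p in silent if p.within_clause)
    mean_length = sum(p.duration_ms for p in silent) / len(silent) if silent else 0.0
    return {
        # Frequency: Number of silent pauses per minute
        "pauses_per_minute": len(silent) * 60.0 / excerpt_seconds,
        # Duration: Mean length of silent pauses (ms)
        "mean_pause_length_ms": mean_length,
        # Distribution: Silent pause rate within a clause
        "pause_rate_within_clause": n_within / n_clauses,
    }


# Made-up annotation of one 20-second excerpt containing four clauses.
example = [Pause(300, True), Pause(700, False), Pause(200, True), Pause(500, True)]
result = pause_measures(example, n_clauses=4, excerpt_seconds=20.0)
print(result)
```

In this made-up excerpt the 200 ms silence falls below the threshold and is ignored, so three pauses enter the frequency and duration measures and two of them count toward the within-clause rate.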


Statistical Analysis
To analyze the relative contributions of frequency, length, and distribution of silent
pauses to fluency ratings, multiple regression analyses were conducted with fluency ratings as a
dependent variable and the three measures on pause phenomena (i.e., Number of silent pauses
per minute, Mean length of silent pauses, and Silent pause rate within a clause) as predictor
variables. A log transformation was performed on Mean length of silent pauses and Silent pause
rate within a clause so that the data could closely approximate the normal distribution.
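
The logic of entering predictors in successive blocks can be sketched as follows. This is an illustration under stated assumptions, not the analysis reported here: the data are synthetic placeholders generated with a fixed seed, the helper names solve and r_squared are hypothetical, and the natural logarithm is assumed. The text does not state the base of the logarithm or how a rate of zero was handled (Table 8 reports a minimum of 0.00), so the synthetic rates are kept strictly positive.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic placeholder data (NOT the study's data).
random.seed(0)
n = 74
freq = [random.gauss(23, 6) for _ in range(n)]                      # pauses per minute
log_len = [math.log(random.uniform(400, 1300)) for _ in range(n)]   # log mean pause length (ms)
log_rate = [math.log(random.uniform(0.2, 3.0)) for _ in range(n)]   # log within-clause pause rate
ratings = [9 - 0.1 * f - 1.0 * (l - 6.0) - 1.0 * r + random.gauss(0, 0.8)
           for f, l, r in zip(freq, log_len, log_rate)]

# Enter the predictors in blocks: frequency, then length, then distribution.
steps = [[freq], [freq, log_len], [freq, log_len, log_rate]]
r2 = [r_squared(cols, ratings) for cols in steps]
r2_change = [r2[0]] + [later - earlier for earlier, later in zip(r2, r2[1:])]
print([round(v, 3) for v in r2], [round(v, 3) for v in r2_change])
```

Because each model contains all the predictors of the previous one, R² cannot decrease from step to step; the quantity of interest is how much it increases when the distribution measure is added last.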

Results
The 46 raters evaluated 80 speech excerpts in terms of their level of fluency, and the
interrater reliability and interrater agreement were high. Cronbach’s alpha coefficient was 0.98
and the intraclass correlation coefficient (absolute agreement) was 0.93. I report both Cronbach’s
alpha coefficient and the intraclass correlation coefficient because the former measures internal
consistency and reliability of the measure (treating the raters as items; Carr, 2011) and the latter
measures the extent to which the individual raters agree with one another in their ratings (Field,
2005). For the intraclass correlation, I used a two-way random model as both the speakers and
the raters were random effects (Larson-Hall, 2010). Table 8 shows the descriptive statistics of
Number of silent pauses per minute, Mean length of silent pauses, and Silent pause rate within a
clause and fluency ratings of L2 speech excerpts. As expected from a wide range of L2
proficiency, the L2 speakers demonstrated a range of performance in terms of frequency,
duration, and distribution of silent pauses in Table 8.

Table 8: Descriptive statistics of pause phenomena and fluency ratings of L2 speech

Measure                                N     M       SD     Min.    Max.
Number of silent pauses per minute     74    23.45   6.01   12.52   37.89
Mean length of silent pauses (ms)      74    721     192    390     1360
Silent pause rate within a clause      74    0.93    0.82   0.00    5.50
Fluency ratings                        74    5.56    1.53   2.35    8.35

Table 9 shows Pearson correlations between the measures and fluency ratings. The
correlation analysis shows that the frequency and length measures are not correlated with each
other but the frequency and distribution measures are correlated, which seems natural
considering that Silent pause rate within a clause is related to the number of pauses within
clauses. Correlations between pause phenomena and fluency ratings demonstrated that all three
measures are negatively correlated with ratings. In particular, Silent pause rate within a clause
exhibited the highest correlation with fluency ratings and Number of silent pauses per minute had
a moderately strong correlation with ratings.

Table 9: Correlations between the measures of pause phenomena and fluency ratings

Measure                                       SPmin     LngSP   SPRwc   Ratings
Number of silent pauses per minute (SPmin)    1                         -.555**
Mean length of silent pauses (LngSP)          -.019     1               -.339**
Silent pause rate within a clause (SPRwc)     .692**    .226    1       -.673**
Note. ** = p < .01.

A multiple linear regression analysis was performed in order to investigate to what extent
each aspect of pause phenomena can explain the variance of the fluency ratings. First, based on
previous findings that pause frequency and duration are related to perceived fluency, the two
variables were entered first and the measure of pause distribution was entered last so as to
examine whether pause distribution can explain additional variance of ratings. Table 10 shows
the results of the hierarchical multiple regression analysis.

Table 10: Results of a hierarchical multiple regression

Model   Predictors                          R²     R² change   F change   df      p
1       Frequency                           .291   .291        28.740     1, 70   < .001
2       Frequency + Length                  .422   .131        15.615     1, 69   < .001
3       Frequency + Length + Distribution   .519   .097        13.670     1, 68   < .001

Results of the hierarchical multiple regression show that pause frequency explained 29%
of the variance of the fluency ratings and when pause length was added, it explained an
additional 13% of the variance. Finally, when pause distribution was added, it was able to
explain an additional 10% of the variance of the fluency ratings, which was significantly more (p
< .001) than what the model without the measure of pause distribution could explain. The three
silent pause measures altogether were able to explain about 52% of the variance of the fluency
ratings. In addition, to see the effects of the order in which predictors are entered into the model,
a stepwise multiple regression was performed to compare the results based on a mathematical
criterion with the results of the hierarchical multiple regression.
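
The F change statistic for each step of a hierarchical regression follows from the R² values of the nested models: F change = (R² change / number of added predictors) / ((1 − R² of the larger model) / residual df). The sketch below applies this standard formula to the rounded values in Table 10; the function name f_change is hypothetical. Small differences from the reported F values are to be expected because the R² figures are rounded to three decimals.

```python
def f_change(r2_full, r2_reduced, n_added, df_residual):
    """F statistic for the increase in R^2 when n_added predictors are added."""
    return ((r2_full - r2_reduced) / n_added) / ((1.0 - r2_full) / df_residual)


# (R^2 of smaller model, R^2 of larger model, residual df, reported F change),
# taken from Table 10; each step adds one predictor.
table10 = [
    (0.000, 0.291, 70, 28.740),
    (0.291, 0.422, 69, 15.615),
    (0.422, 0.519, 68, 13.670),
]

recomputed = [f_change(full, reduced, 1, df) for reduced, full, df, _ in table10]
for (_, _, df, reported), value in zip(table10, recomputed):
    print(f"df = 1, {df}: recomputed F change = {value:.2f}, reported = {reported}")
```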
Table 11 shows that with the stepwise multiple regression, as pause distribution had the
highest correlation with fluency ratings (Table 9), it was entered into the model first and was
able to explain over 45% of the variance of the fluency ratings by itself. Next, pause length was
entered and it explained an additional 4% of the variance; however, pause frequency was not
included in the model as it did not explain a significant amount of additional variance of the fluency ratings.

Table 11: Results of a stepwise multiple regression

Model   Predictors              R²     R² change   F change   df      p
1       Distribution            .452   .452        57.662     1, 70   < .001
2       Distribution + Length   .493   .041        5.751      1, 69   .021

Discussion
Experiment 1 examined the relative contributions of frequency, length and distribution of
silent pauses to perceived fluency. The first research question of Experiment 1 was which
acoustic measures of pause phenomena (frequency, duration and/or distribution of silent pauses)
are significantly related to fluency ratings. The results showed that fluency ratings were
significantly correlated with all three measures: frequency (r = -.555), duration (r = -.339), and
distribution (r = -.673) of silent pauses. It is especially noteworthy that pause distribution
exhibited the strongest correlation with fluency ratings among the three pause variables. The
second research question of Experiment 1 was whether pause distribution explains significantly
additional variance of fluency ratings which is not explained by frequency and duration of silent
pauses. The hierarchical multiple regression analysis showed that the regression model with
pause frequency and length explained 42% of the variance of fluency ratings and when pause
distribution was added to the model it was able to explain about 10% of additional variance of
fluency ratings.
Although it has been suggested that pause phenomena play an important role in the
perception of fluency, previous studies have been equivocal on the relationship between
perceived fluency and the frequency and length of silent pauses. Pause frequency was correlated
with fluency ratings in Rossiter (2009) but not in Kormos and Dénes (2004), whereas pause
duration was correlated with fluency ratings in Kormos and Dénes (2004) but not in Cucchiarini et
al. (2002). Moreover, neither the relationship between perceived fluency and pause distribution,
nor the relative contributions of frequency, duration, and distribution of pauses had been
investigated.
Experiment 1 addressed these gaps in the literature and predicted that not only
pause frequency and length but also pause distribution would be significantly correlated with
fluency ratings, based on the L1 research findings that pause location influences speech
perception (e.g., Arons, 1993; Griffiths, 1991; Reich, 1980; Seidl & Cristià, 2008; Sugito, 1990).
Silent pauses are one of the acoustic cues to clausal units (Seidl & Cristià, 2008) and sentences
containing silent pauses between clauses were processed faster and recalled better than sentences
containing silent pauses within clauses (Reich, 1980). In Experiment 1 pause distribution was
operationalized by Silent pause rate within a clause (i.e., number of within-clause pauses per
clause), which demonstrated a striking difference between L1 and L2 speech in Study 1. The
results followed the prediction by demonstrating correlations between fluency ratings and pause
frequency, length, and distribution. Pause distribution was also able to explain 10% of additional
variance that was not explained by pause frequency and length. Given that pause distribution and
pause frequency were strongly correlated (r = .692; Table 9), the additional 10% of explanatory power
that pause distribution had suggests its crucial role in perceived fluency. In fact, pause
distribution exhibited the strongest correlation with fluency ratings and was able to explain 45%
of the variance of fluency ratings. Moreover, when pause distribution was entered into the
regression model first, pause frequency was not able to explain additional variance of fluency
ratings.


Experiment 2
Experiment 2 tested a causal relationship between pause location and perceived fluency
through speech manipulations. The results of Study 1 have shown that one of the major
differences between L1 and L2 speakers’ speech lies in the frequency of pauses within clauses.
Although frequency and length of silent pauses have been reported to be correlated with fluency
ratings (e.g., Kormos & Dénes, 2004; Rossiter, 2009), effects of pause location on perceived
fluency have not been investigated. Therefore, the experiment specifically aims to answer
whether pauses within clauses decrease fluency ratings compared to pauses between clauses.
Furthermore, although L1 speakers also produce disfluencies (e.g., pauses and repairs),
L1 speakers tend to be perceived as fluent by default (Davies, 2003; Riggenbach, 1991), and
studies investigating the relationship between utterance fluency and perceived fluency of L1
speakers are rare. Bosker (2013) recently compared the way raters evaluate fluency of L1 and L2
speech. He manipulated L1 and L2 speech in terms of pauses, by constructing no pause, short
pause, and long pause conditions, and speed, by speeding up L2 speech and slowing down L1
speech. The results showed that the ratings of manipulated L1 and L2 speech were affected in a
similar fashion, suggesting that listeners evaluate fluency characteristics of L1 and L2 speakers
in a similar way. Bosker (2013) also has methodological implications. Many previous studies
used correlational analyses to explore the relationship between utterance fluency and perceived
fluency (e.g., Bosker et al., 2013; Cucchiarini et al., 2002; Derwing et al., 2004; Kormos &
Dénes, 2004; Rossiter, 2009). However, Bosker (2013) points out that the correlational approach
would be unsuitable to compare the perception of L1 and L2 speech because they differ in many
respects. Hypothetically, if pause frequency is found to be more strongly correlated with ratings
of L2 speech than with ratings of L1 speech, it could be due to the fact that L2 speech had more
pauses as compared to L1 speech, and not due to a difference in relative weight of pausing.
Therefore, he used phonetic manipulations to ascertain that the effects on fluency ratings could
be directly attributed to the fluency characteristics they manipulated, and that he could compare
how the same modification in L1 and L2 speech affects perceived fluency.
Building upon the previous studies, Experiment 2 aims to fill gaps and extend the body of
research on perceived fluency. The role of pause location in L1 and L2 perceived fluency has not
yet been investigated. Therefore, Experiment 2 examined a causal relationship between pause
location and perceived fluency by constructing three conditions—No Pause, Pauses Between
Clauses, and Pauses Within Clauses conditions—and compared fluency ratings of L1 and L2
speech in the three conditions. These conditions were created through phonetic manipulations of
L1 and L2 speech in order to directly test for a causal effect of pause location on perceived
fluency of L1 and L2 speech. The research questions are as follows.

1. Is there a difference in fluency ratings of L1 speech when the speech has 1) no pause,
2) pauses between clauses, and 3) pauses within clauses?
2. Is there a difference in fluency ratings of L2 speech when the speech has 1) no pause,
2) pauses between clauses, and 3) pauses within clauses?

In Bosker (2013), both L1 and L2 speech in the no pause condition were rated as more
fluent than the short and long pause conditions. Therefore, in Experiment 2 of the current study,
both L1 and L2 speech in the No Pause condition are also predicted to be rated as more fluent
than the Pauses Between Clauses and Pauses Within Clauses conditions. Regarding the
difference in ratings between the Pauses Between Clauses and Pauses Within Clauses conditions,
as pauses are one of the acoustic cues to clausal units (Seidl & Cristià, 2008), raters are likely to
prefer pauses between clauses to pauses within clauses. Perception of L1 and L2 speech might be
influenced by pause location in the same way as it was by speed, and pause frequency and length
in Bosker (2013). However, based on previous research on L1 pause phenomena, there is still a
possibility that pause location affects perception of L1 and L2 speech to a different degree. L1
research has shown that silent pauses between clauses have apparent beneficial effects on speech
processing only under cognitively complex conditions, whereas silent pauses
within clauses interfere with speech processing and recall (Arons, 1993; Bower &
Springston, 1970; Griffiths, 1991; Reich, 1980; Sugito, 1990). Moreover, a number of studies on
pause detection in L1 speech show that listeners are not good at detecting disfluencies such as
pauses and repairs (Bailey & Ferreira, 2003). For instance, in transcription tasks, listeners tend to
relocate within-clause pauses to clause boundaries (e.g., Duez, 1985; Martin & Strange, 1968),
and Duez (1985) concluded that listeners tend not to hear pauses which are not expected, such as
within-constituent pauses. Given that raters may find perception of L2
speech cognitively more demanding than that of L1 speech, and that they may not expect to
hear pauses within clauses in L1 speech but may expect them in L2 speech, Experiment 2
predicted that fluency ratings of L2 speech would be influenced by pause location more than those
of L1 speech.

Method
Raters
Ninety-two native English speakers (20 male; 72 female) participated in the study as
raters. They were undergraduate students at a large university in the United States with a mean
age of 21 (SD = 2.0), and all reported having normal hearing. Their mean familiarity with
Korean-accented English was 3.7 (SD = 2.0) on a scale of 1 (not familiar at all) to 9 (extremely familiar).

Stimulus Description
Twenty-four L1 and 24 L2 spontaneous speech samples recorded by 12 English speakers
and 12 Korean learners of English were used, which had been collected for Study 1 and
Experiment 1. The speech samples were responses to two questions, one about the speaker's
major field and the other about their free time activities (Appendix A). The samples were
selected so that the L1 and L2 speech samples were comparable in terms of speed fluency; there
was no significant group difference in Mean syllable duration (M_L1 = 246, SD_L1 = 23; M_L2 =
263, SD_L2 = 22). Fragments of approximately 20 seconds were excerpted from the middle of the
original recordings (Bosker et al., 2013; Derwing et al., 2007). Each excerpt started and ended at
a clause boundary.
Three conditions ('No Pause,' 'Pauses Between Clauses,' and 'Pauses Within
Clauses') were created to test whether pauses within clauses lower fluency ratings compared to
pauses between clauses or no pause. To test for a causal relationship between pause location and
perceived fluency, the speech samples in the two conditions with pauses (i.e., Pauses Between
Clauses and Pauses Within Clauses) should be different only in terms of pause location but
should have the same number of pauses with the same length. Therefore, first, to create the No
Pause condition, all the silent pauses in the speech samples were shortened to the length of
around 150 milliseconds (Bosker, 2013). Next, stimuli for the Pauses Between Clauses and
Pauses Within Clauses conditions were constructed by adding the same number of pauses with
the same length either between clauses or within clauses depending on the condition, to the
speech samples in the No Pause condition. After examining all the speech samples, it was
decided to add five pauses to them. Five was the optimal number of pauses in that all the speech
samples could have five pauses within and between clauses naturally, without interrupting
coarticulation. The length of pauses added was around 600 milliseconds, which was about the
average length of English speakers' silent pauses in Study 1.
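
The two-step manipulation described above (first shortening every silent pause to about 150 ms, then adding five pauses of about 600 ms at chosen locations) can be sketched on a raw sample array. This is a simplified illustration under stated assumptions, not the procedure actually carried out in Praat: the function names are hypothetical, pause intervals and insertion points are assumed to be known in seconds, and inserted pauses are modeled as digital silence.

```python
import numpy as np

def shorten_pauses(signal, sr, pauses, target=0.150):
    """Trim each silent interval (start, end in seconds) to at most `target` seconds."""
    pieces, cursor = [], 0
    for start, end in sorted(pauses):
        s, e = int(round(start * sr)), int(round(end * sr))
        keep = min(e - s, int(round(target * sr)))
        pieces.append(signal[cursor:s + keep])  # speech plus the retained silence
        cursor = e                              # skip the rest of the pause
    pieces.append(signal[cursor:])
    return np.concatenate(pieces)

def insert_pauses(signal, sr, positions, length=0.600):
    """Insert `length` seconds of silence at each position (seconds in the input timeline)."""
    gap = np.zeros(int(round(length * sr)), dtype=signal.dtype)
    pieces, cursor = [], 0
    for pos in sorted(positions):
        p = int(round(pos * sr))
        pieces.extend([signal[cursor:p], gap])
        cursor = p
    pieces.append(signal[cursor:])
    return np.concatenate(pieces)

# Toy example with a 3-second dummy signal sampled at 1000 Hz (assumed values).
sr = 1000
dummy = np.ones(3 * sr)
no_pause = shorten_pauses(dummy, sr, pauses=[(1.0, 1.5)])
between = insert_pauses(no_pause, sr, positions=[0.4, 0.8, 1.2, 1.6, 2.0])
within = insert_pauses(no_pause, sr, positions=[0.2, 0.6, 1.0, 1.4, 1.8])
```

Because both pause conditions are derived from the same No Pause baseline with the same number and length of added pauses, the two versions differ only in where the silence falls, which is the property the design relies on.
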
A clause was required to consist minimally of a finite or non-finite verb with at least one
other clause element such as a subject, object, or complement (see Foster et al., 2000, pp. 365-368).
Examples of pauses between clauses are: I performed in several plays [pause] I believe
[pause] I have some talent in acting. Examples of pauses within a clause are: learn new [pause]
things; so [pause] hard; to my [pause] place (see Appendices E and F for more examples).
The speech samples were normalized in Praat (Boersma & Weenink, 2012) to have a
mean intensity of 70 dB. In addition, subtle white noise (33 dB) was added to the speech
samples using the RandomGauss function in Praat (M = 0, SD = 0.001). This was done in order
to normalize the background noise throughout and across the speech samples in an attempt to
mask any possible trace of pause manipulations. The level of noise was very low and sounded
like part of the original recordings; therefore, none of the raters noticed that a noise had been
added to the speech samples. All the manipulated speech samples were evaluated for naturalness
by two native English speakers and two advanced learners of English and corrections were made,
if necessary (e.g., changing pause locations). All the locations where a pause was added
originally had a silence; therefore, none of the added pauses interrupted coarticulation. The
stimuli were arranged according to a Latin Square design, in which raters were presented with
each item in only one condition, with three groups of raters for counterbalancing. A Latin Square
design was used because when raters hear the same speech excerpts more than once, the
familiarity with the content of the speech excerpts is likely to affect their ratings. Table 12
shows how speech samples were organized according to a 3 x 3 Latin Square design. The 24
speakers were randomly assigned to one of the three speaker groups (i.e., S1, S2, S3) and each
speaker group consisted of 4 L1 speakers and 4 L2 speakers. Ninety-two raters were also
randomly assigned to one of the three rater groups (i.e., R1, R2, R3). For instance, raters in R1
heard speech samples of S1 in the No Pause condition, speech samples of S2 in the Pauses
Between Clauses condition, and speech samples of S3 in the Pauses Within Clauses condition.
By doing so, raters listened to each speech sample in only one condition.

Table 12: A schematic representation of the 3 x 3 Latin Square design. No, B, and W represent
the No Pause, Pauses Between Clauses, and Pauses Within Clauses conditions, respectively.

                    Speakers
Raters        S1        S2        S3
R1            No        B         W
R2            W         No        B
R3            B         W         No
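
The rotation in Table 12 can be expressed compactly. The sketch below is illustrative only; the function name and the modular-arithmetic formulation are assumptions introduced here, chosen so that the output matches the table.

```python
CONDITIONS = ["No", "B", "W"]  # No Pause, Pauses Between Clauses, Pauses Within Clauses

def condition_for(rater_group, speaker_group):
    """Condition heard by rater group R1-R3 (1..3) for speaker group S1-S3 (1..3)."""
    return CONDITIONS[(speaker_group - rater_group) % 3]

# Rebuild the 3 x 3 square: one row per rater group, one column per speaker group.
square = {
    f"R{r}": {f"S{s}": condition_for(r, s) for s in (1, 2, 3)}
    for r in (1, 2, 3)
}
for rater_group, row in square.items():
    print(rater_group, row)
```

Each row and each column contains every condition exactly once, so every rater group hears all three conditions and every speaker group is heard in all three conditions, while no rater hears the same excerpt twice.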

Procedure
As detailed above, 92 raters were randomly assigned to one of the three rater groups for
counterbalancing. Each rater heard 48 manipulated speech samples produced by 24 speakers in
random order over headphones and rated the level of fluency of the speaker using a nine-point
scale with labeled extremes (1 = extremely disfluent, 9 = extremely fluent). As in Experiment 1,
the speech excerpts and the scale were presented to raters using Praat (Boersma & Weenink,
2012). The scale appeared on the screen after each sample excerpt had been played; therefore,
raters could rate each excerpt only after they heard it completely. The raters were asked to rate
how easily and smoothly speech is delivered, focusing on features of fluency such as speed,
pause and repair phenomena, rather than in terms of overall proficiency (see Appendix C for the
instructions). Before the actual experiment, there were three practice items so that raters could
familiarize themselves with the procedure. In the experiment speech samples were randomized
for each rater. The procedure was conducted in a quiet room with a group of at most 4 raters per
session. The rating experiment took about 35 minutes and the raters were able to take a short
break after rating half of the speech excerpts. After they finished rating, they filled out a short
questionnaire on their background information, familiarity with Korean accented English and L2
learning and teaching experience (Appendix D). Lastly, they were also asked whether they had
noticed anything particular or interesting about the speech excerpts and none of them mentioned
that the speech samples sounded unnatural or manipulated.

Analysis
The interrater agreement within the three rater groups was high (Cronbach's alpha
coefficients: 0.94, 0.93, 0.95; intraclass correlation coefficients in terms of absolute agreement:
0.89, 0.84, 0.91).⁸ In order to test whether the three pause conditions affected fluency ratings of
L1 and L2 speakers' speech, mixed effects ANOVAs were performed with fluency ratings of L1
and L2 speakers' speech as dependent variables using SPSS Statistics 17.0 (SPSS Inc., 2008).
Mixed effects ANOVAs were used in order to test the effects of the fixed variable (i.e., Condition)
more accurately while taking into account the effects of random variables such as Speaker group and
Rater group. The mean ratings differed across raters, and rating responses were
relative rather than absolute; for instance, one rater's 7 on the 9-point scale is not likely to mean the
same as another rater's 7. Therefore, fluency ratings were
standardized by calculating z-scores using each rater's mean and standard deviation, which allows a more
accurate analysis by addressing individual differences in ratings. The transformed data closely
approximated normal distributions.

⁸ The intraclass correlation coefficients (ICCs) are somewhat lower than the Cronbach's alpha
coefficients because the ICCs measure the extent of absolute agreement across raters. The intraclass
correlation can be considered a conservative estimate of interrater reliability (Stemler &
Tsai, 2007).
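
The per-rater standardization described above can be sketched as follows. The function name and the toy ratings are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which formula was used.

```python
from statistics import mean, stdev

def standardize_by_rater(ratings_by_rater):
    """Convert each rater's raw ratings to z-scores using that rater's own mean and SD."""
    z_scores = {}
    for rater, ratings in ratings_by_rater.items():
        m, sd = mean(ratings), stdev(ratings)  # stdev: sample SD (assumption)
        z_scores[rater] = [(x - m) / sd for x in ratings]
    return z_scores

# Toy data (not from the study): two raters who use the 9-point scale differently.
raw = {"lenient_rater": [5, 7, 9], "strict_rater": [3, 5, 7]}
z = standardize_by_rater(raw)
print(z)
```

In this toy case a raw rating of 7 is average for the lenient rater but the highest rating given by the strict rater, which is the kind of individual difference the standardization is meant to remove.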

Results
Figure 4 illustrates the means and standard errors of fluency ratings of L1 and L2 speech
in the three conditions. First, the figure shows that the L1 speech excerpts were rated higher than
the L2 speech excerpts. It also shows that for both L1 and L2 speech, ratings of the Pauses
Between Clauses and Pauses Within Clauses conditions are lower than ratings of the No Pause
condition. Ratings of the Pauses Within Clauses condition seem lower than those of the Pauses
Between Clauses condition for both L1 and L2 speech; however, the difference between the two
conditions seems larger for L2 speech. In order to examine statistical differences between the
three conditions for L1 and L2 speech, mixed effects ANOVAs were conducted with fluency
ratings of L1 and L2 speakers' speech as dependent variables.

Figure 4: Mean and standard error z-scores of fluency ratings of L1 and L2 speech
[Figure not reproduced. Y-axis: z-scored fluency rating (-1.0 to 1.0); x-axis: No Pause, Pauses Between Clauses, Pauses Within Clauses; series: L1 speech, L2 speech.]

The first mixed effects ANOVA was run with ratings of L1 speech excerpts as a
dependent variable, Condition as a fixed variable, and Speaker group, Rater group, and Raters
within rater groups as random variables. The results showed that there was a main effect of
Condition (F(2, 1008) = 42.790, p < .001, ηp² = .078), Speaker group (F(2, 1008) = 20.746, p
< .001, ηp² = .040), Raters within rater groups (F(89, 1008) = 1.441, p = .006, ηp² = .113), and
Rater group (F(2, 89) = 3.178, p = .046, ηp² = .067). In order to compare ratings between the
three conditions, post hoc tests were performed using Tukey HSD. The results showed that the
L1 speech excerpts in the Pauses Between Clauses condition (p < .001) and in the Pauses Within
Clauses condition (p < .001) were rated significantly lower than the samples in the No Pause
condition. However, there was no significant difference in ratings between the Pauses Between
Clauses and Pauses Within Clauses conditions (p = .247). Therefore, L1 speech samples were
rated lower when they had pauses either within or between clauses than when they had no pauses;
however, L1 speech samples containing pauses within clauses and between clauses were not
significantly different in fluency ratings.
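
As a consistency check on the effect sizes reported above, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the reported values; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F ratios and degrees of freedom as reported for the L1 speech model.
reported = {
    "Condition": (42.790, 2, 1008),
    "Speaker group": (20.746, 2, 1008),
    "Raters within rater groups": (1.441, 89, 1008),
    "Rater group": (3.178, 2, 89),
}
for effect, (f_value, df1, df2) in reported.items():
    print(f"{effect}: {partial_eta_squared(f_value, df1, df2):.3f}")
```

The values obtained this way agree with the reported effect sizes to three decimal places.
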
Next, another mixed effects ANOVA was run with ratings of L2 speech excerpts as a
dependent variable, Condition as a fixed variable, and Speaker group, Rater group, and Raters
within rater groups as random variables. There was a main effect of Condition (F(2, 1008) =
56.728, p < .001), Speaker group (F(2, 1008) = 48.870, p < .001), and Rater group (F(2, 89) =
3.178, p = .046); however, there was no significant effect of Raters within rater groups (F(89, 1008) =
0.801, p > .05). Therefore, in order to build a model that explains the data better, Condition,
Speaker group, and Rater group were included and Raters within rater groups was excluded. The
new model showed that there was a main effect of Condition (F(2, 1097) = 57.660, p < .001) and
Speaker group (F(2, 1097) = 49.673, p < .001); however, there was no significant effect of Rater group
(F(2, 1097) = 2.587, p > .05). Finally, the model which further excluded Rater group showed that
there was a main effect of Condition (F(2, 1099) = 57.494, p < .001, ηp² = .095) and Speaker
group (F(2, 1099) = 49.530, p < .001, ηp² = .083). To examine significant differences between
the three conditions, Tukey HSD post hoc tests were performed. The results showed that as in L1
speech ratings, L2 speech samples in the Pauses Between Clauses (p < .001) and Pauses Within
Clauses conditions (p < .001) were rated significantly lower than L2 speech samples in the No
Pause condition. However, unlike L1 speech samples, L2 speech samples in the Pauses Within
Clauses condition were rated significantly lower than those in the Pauses Between Clauses
condition (p = .011).
In addition, in order to further explore possible acoustic factors of this differential effect
of pause location on perceived fluency of L1 and L2 speech, the speech excerpts were analyzed
and compared to investigate whether there were differences between them in terms of fluency
features. The research design of Experiment 2 ensured that the L1 and L2 speech excerpts were
comparable in terms of the two major oral correlates of perceived fluency—speed and silent
pauses. On the other hand, other minor oral correlates of perceived fluency, such as frequency of
filled pauses and repairs, were not matched between the L1 and L2 speech samples.
Therefore, the number of filled pauses and the number of repairs in the L1 and L2 speech
excerpts were calculated and compared. The results showed that the L1 and L2 speech were
comparable in the number of filled pauses (M_L1 = 2.04, SD_L1 = 1.18; M_L2 = 1.96, SD_L2 = 1.67;
t(22) = 0.141, p = .889), but the L2 speech had more repairs than the L1 speech (M_L1 = 0.17,
SD_L1 = 0.33; M_L2 = 0.67, SD_L2 = 0.69; t(15.727) = -2.283, p = .037, r = .50).
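
The fractional degrees of freedom in the repairs comparison indicate a test that does not assume equal variances (Welch's t-test), and the effect size r follows from r = sqrt(t² / (t² + df)). The sketch below recomputes both from the summary statistics above. The function names are hypothetical, and the group size of 12 per group is an assumption inferred from the 12 speakers per group and the df of 22 in the filled-pause comparison. Because the inputs are rounded means and standard deviations, the recomputed t and df only approximate the reported values.

```python
import math

def welch_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite df from group summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def effect_size_r(t, df):
    """Effect size r implied by a t statistic and its degrees of freedom."""
    return math.sqrt(t ** 2 / (t ** 2 + df))

# Repairs: L1 vs. L2 summary statistics as reported; n = 12 per group is assumed.
t_approx, df_approx = welch_t_from_summary(0.17, 0.33, 12, 0.67, 0.69, 12)
r_reported = effect_size_r(-2.283, 15.727)
print(f"t = {t_approx:.3f}, df = {df_approx:.3f}, r = {r_reported:.2f}")
```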

Discussion
Experiment 2 examined whether pause location influences perceived fluency of L1 and
L2 speech. In order to test a causal effect of pause location on fluency ratings of L1 and L2
speech, three conditions were constructed—No Pause, Pauses Between Clauses, and Pauses
Within Clauses conditions. The conditions were created through phonetic manipulations. The
baseline No Pause condition was created by shortening all the silent pauses in the speech
samples to the length of around 150 milliseconds (Bosker, 2013). To examine effects of pause
location on perceived fluency directly, the speech samples in the Pauses Between Clauses and
Pauses Within Clauses conditions were prepared by adding the same number of pauses with the
same length, either within clauses or at clause boundaries depending on the condition, to the
speech samples in the No Pause condition. The research question was whether there is a
difference in fluency ratings when the L1 and L2 speech have 1) no pause, 2) pauses between
clauses, and 3) pauses within clauses.
Based on the findings in Bosker (2013) that both L1 and L2 speech in the no pause
condition were rated as more fluent than the short and long pause conditions, in Experiment 2,
both L1 and L2 speech in the baseline No Pause condition were also predicted to be rated as
more fluent than the conditions with pauses—Pauses Between Clauses and Pauses Within
Clauses conditions. The results followed the prediction and showed that both L1 and L2 speech
in the No Pause condition were rated to be more fluent than those in the Pauses Between Clauses
and Pauses Within Clauses conditions.
Regarding the main focus of Experiment 2, the effect of pause location on fluency ratings,
raters were predicted to prefer pauses between clauses to pauses within clauses in general, as
pauses are one of the acoustic cues to clausal units (Seidl & Cristià, 2008). The perception of L1
and L2 speech might be influenced by pause location to a similar degree, as it was by speed, and
pause frequency and length in Bosker (2013). However, it was predicted that pause location
would affect the perception of L2 speech more than that of L1 speech, based on previous
research on L1 pause phenomena (e.g., Arons, 1993; Bower & Springston, 1970; Griffiths, 1991;
Reich, 1980; Sugito, 1990), which suggested that effects of pause location on speech perception
are apparent under cognitively demanding conditions, and listeners tend not to hear pauses which
are not expected, such as within-constituent pauses (Duez, 1985).
The results of the effects of pause location on the fluency ratings also followed the
predictions. Although both L1 and L2 speech had lower fluency ratings in the Pauses Within
Clauses condition than in the Pauses Between Clauses condition, only the ratings of L2 speech
were significantly affected by pause location. The difference in the ratings of L1 speech in the
two conditions did not reach significance. The current findings suggest that overall, listeners are
sensitive to pause location. The results are compatible with the fact that a silent pause is an
acoustic cue to clausal units (Seidl & Cristià, 2008). L1 infant studies have also shown that
6-month-old infants prefer sentences containing pauses between clauses to those containing
pauses within clauses (Hollich & Houston, 2007). More interestingly, the results showed that
perceived fluency of L2 speech was influenced by pause location more than that of L1 speech.
This differential effect of pause location on L1 and L2 perceived fluency is consistent with the
L1 literature which showed that silent pauses have beneficial effects on listeners only under
conditions of cognitive complexity in auditory speech processing and they did not demonstrate
apparent beneficial effects when the speech or tasks were easy enough (Aaronson, 1968; Reich,
1980). It is possible that raters considered perception of L2 speech more cognitively demanding
than that of L1 speech. The finding is also in agreement with L1 pause detection studies (e.g.,
Duez, 1985; Martin & Strange, 1968) and Duez's (1985) claim that listeners tend not to hear
pauses which are not expected, such as within-constituent pauses. Raters may not have expected
to hear pauses within clauses in L1 speech; however, they may have expected to hear pauses
within clauses in L2 speech and thus, may have been more ready and sensitive to detect them in
L2 speech than in L1 speech.
In addition, in an attempt to further explore possible acoustic factors of this differential
effect of pause location on perceived fluency of L1 and L2 speech, the excerpts were analyzed
and compared to investigate whether there were differences between them in terms of fluency
characteristics. The research design of Experiment 2 ensured that the L1 and L2 speech excerpts
were comparable in terms of the two major oral correlates of perceived fluency—speed and
silent pauses. As discussed in the method section, the L1 and L2 speech samples were selected so
that the two groups were comparable in terms of speed (i.e., Mean syllable duration). Through
the phonetic manipulations, both L1 and L2 speech excerpts had no silent pauses (≥ 250 ms) in
the No Pause condition, and they had the same number of pauses with the same length in both the
Pauses Between Clauses and Pauses Within Clauses conditions. On the other hand, other minor
oral correlates of perceived fluency, such as frequency of filled pauses and repairs, were not
matched between the L1 and L2 speech samples. Acoustic analyses showed that the L1 and L2
speech were comparable in the number of filled pauses but the L2 speech had more repairs than
the L1 speech. The findings suggest that pause location may have affected fluency ratings more when
speech had more repairs. It is possible that repairs in speech made speech perception more
cognitively demanding. Another possibility is that repairs in speech led listeners to expect
more pauses within clauses and to be sensitive and ready to detect them, affecting their ratings.
Further research is needed to confirm whether and how repairs in speech interact with pause
location and affect perceived fluency. In addition, it should be noted that the L1 and L2 speech
differed in respects other than fluency characteristics, such as accent, linguistic
accuracy, and complexity; whether these differences interact with pause location and
affect perceived fluency also requires further investigation.

General Discussion: Study 2
In order to find the speech features that influence L2 perceived fluency, a number of
studies have investigated the relationship between utterance fluency and perceived fluency (e.g.,
Bosker et al., 2013; Cucchiarini, Strik, & Boves, 2000, 2002; Derwing, Rossiter, Munro, &
Thomson, 2004; Fulcher, 1996; Kormos & Dénes, 2004; Rossiter, 2009) and suggested the
importance of silent pauses for perceived fluency; however, both the role of pause location in
perceived fluency, and the relative contributions of the frequency, length, and distribution of
silent pauses to perceived fluency have rarely been examined. The current study aimed to fill
these gaps and extend the body of research on L2 perceived fluency using two experiments.
Experiment 1 investigated the relative contributions of the frequency, length, and
distribution of silent pauses to L2 perceived fluency and showed that pause distribution, in
particular, exhibited the strongest correlation with fluency ratings and explained 45% of the
variance of fluency ratings. Pause distribution was also able to explain 10% of additional
variance that was not explained by pause frequency and length, suggesting its crucial role in
perceived fluency. Experiment 2 tested whether pause location affected perceived fluency of L1
and L2 speech by comparing fluency ratings of L1 and L2 speech in the three conditions—No
Pause, Pauses Between Clauses, and Pauses Within Clauses. The results showed that both L1
and L2 speech were rated higher when they had no pause than when they had pauses. More
importantly, L1 and L2 speech in the Pauses Within Clauses condition were rated lower than
those in the Pauses Between Clauses condition; however, only the difference in ratings of L2
speech reached significance.
Findings of both Experiment 1 and Experiment 2 suggest a significant role of pause
location in L2 perceived fluency. They are consistent with L1 research on pause phenomena
which suggests that pause location affects speech perception. Silent pauses are one of the
acoustic cues to clausal units along with pitch and vowel duration (Seidl & Cristià, 2008).
Therefore, silent pauses at grammatical boundaries help listener comprehension by indicating the
boundaries of speech to be analyzed, and by providing cognitive processing time (e.g., Arons,
1993; Griffiths, 1991; Reich, 1980; Sugito, 1990), whereas pauses within clauses can be
disruptive. It has also been reported that silent pauses between clauses have beneficial effects on
listeners only under conditions of cognitive complexity in auditory speech processing and they
did not demonstrate apparent beneficial effects when the speech or tasks were easy enough
(Aaronson, 1968; Reich, 1980). It is possible that the raters in Experiment 2 considered
perception of L2 speech more cognitively demanding than that of L1 speech. The finding is also
compatible with L1 pause detection studies (e.g., Duez, 1985; Martin & Strange, 1968) and
Duez's (1985) claim that listeners tend not to hear pauses which are not expected, such as
within-constituent pauses. The raters in Experiment 2 may not have expected to hear pauses
within clauses in L1 speech; however, they may have expected to hear pauses within clauses in
L2 speech and thus, may have been more ready and sensitive to detect them in L2 speech than in
L1 speech. The findings of Study 2 can be viewed as an initial attempt to fill the gaps in the
literature on the relationship between L2 perceived fluency and pause location and further
research is needed in particular on which aspects of L2 speech make perceived fluency more
susceptible to pause location than those of L1 speech.
Study 2 also has methodological implications. Experiment 2 used phonetic manipulations
to test a causal relationship between pause location and perceived fluency. The correlational
approach would be unsuitable to compare the perception of L1 and L2 speech because they differ
in many respects. Phonetic manipulations ensured that effects in fluency ratings could be directly
attributed to fluency characteristics manipulated in both L1 and L2 speech.

CHAPTER 4: CONCLUSION

Fluency is one of the most noticeable differences between native and nonnative speech
and constitutes a critical component of second language (L2) proficiency; however, the concept
has not been well understood by researchers. In order to deepen understanding of the
multidimensional construct of fluency, the current dissertation took a novel approach and
investigated the production and perception of second language fluency from all three aspects of
fluency—utterance, cognitive, and perceived fluency.
Study 1 investigated utterance fluency and cognitive fluency of English speakers and
Korean learners of English by comparing temporal measures and stimulated recall responses.
The L1 and L2 speakers were different in speed, length of run, repairs, and silent pauses. In
particular, a striking group difference in Silent pause rate within a clause is consistent with the
claim that pauses within clauses reflect processing difficulties in speech production. Stimulated
recall responses showed that lower proficiency learners remembered more issues regarding L2
declarative knowledge on grammar and vocabulary than higher proficiency learners, which was
compatible with the declarative/procedural model and studies on automaticity.
Study 2 examined the relationship between utterance fluency and perceived fluency using
two experiments. Experiment 1 investigated the relative contributions of frequency, length, and
distribution of pauses to perceived fluency of L2 speech. Experiment 2 tested causal effects of
pause location on perceived fluency of L1 and L2 speech. Findings of both Experiment 1 and
Experiment 2 suggest a significant role of pause location in L2 perceived fluency. In Experiment
1, pause distribution demonstrated the strongest correlation with fluency ratings and in
Experiment 2, perceived fluency of L2 speech was influenced by pause location more than that
of L1 speech. The findings are in agreement with the L1 literature on pause phenomena, which
indicates that silent pauses are one of the acoustic cues to clausal units and that silent pauses
between clauses can facilitate speech perception and recall, whereas pauses within clauses can
interfere with them in cognitively demanding contexts.
The present study has theoretical, methodological, and practical implications in the fields
of L2 acquisition research, education, and testing. In terms of theoretical contributions, the study
investigated three notions of fluency—utterance, cognitive, and perceived fluency—and
examined their relationships in a comprehensive and systematic way within a theoretical
framework (Segalowitz, 2010). It also critically identified gaps and issues in previous studies on
L2 fluency. An almost exclusive focus on L2 speech in most studies provided little evidence to
show how L1 and L2 utterance fluency are different. The relative contributions of frequency,
duration, and distribution of silent pauses to perceived fluency have not yet been understood. In
particular, effects of pause location on the perception of fluency have rarely been researched.
The present study precisely addressed these gaps and aimed to extend the body of research on L2
fluency. Moreover, it took an interdisciplinary approach to capture the multidimensionality of
fluency by integrating findings from different fields such as second language acquisition,
psycholinguistics, cognitive science, speech science, and pausology.
The two studies also have implications for research methodology. Study 1 discussed
strengths and weaknesses of different measures used in the previous studies on utterance fluency,
and further proposed measures which could depict pause distribution more accurately. Study 1
also used qualitative analysis in exploring cognitive fluency to provide additional insight into
a field where quantitative analysis is dominant. Furthermore, in examining effects of temporal
features on perceived fluency, Study 2 tried to overcome the limitation of correlation analysis
used in the literature and tested a causal relationship between utterance and perceived fluency
using phonetic manipulations.
Results of the studies also have potential implications for L2 education and assessment.
Finding reliable oral correlates of fluency can help to improve learners' oral fluency and to
develop a more valid assessment tool to measure oral fluency and proficiency in L2 speech. One
of the most novel and important findings of the current dissertation is the close relationship
between L2 fluency and pauses within clauses. L1 and L2 speech exhibited a striking difference
in the frequency of pauses within clauses, which is considered to reflect difficulties in speech
production processing such as lexical retrieval. Pauses within clauses also had a crucial impact
on perceived fluency of L2 speech. Based on these findings, one way teachers can help L2
learners enhance their fluency in the classroom is to provide ample opportunities to practice
collocations and formulaic language, which can enable learners to produce longer fluent runs and
decrease pauses within clauses in their speech. In terms of L2 assessment, including a measure
of the frequency of silent pauses within clauses in automatic fluency assessment may allow L2
learners' oral fluency to be evaluated more accurately.
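
As a rough illustration of such a measure, the sketch below computes the number of silent pauses within clauses per minute of speech from hand-annotated pause data. The Pause record, the function name, the 0.25-second minimum pause duration, and the per-minute normalization are all assumptions made for this example; they are not the exact procedure used in the studies reported here.

```python
from dataclasses import dataclass


@dataclass
class Pause:
    duration: float       # length of the silence in seconds
    within_clause: bool   # True = pause within a clause, False = between clauses


def within_clause_pauses_per_minute(pauses, speech_seconds, min_duration=0.25):
    """Return the number of within-clause silent pauses per minute of speech.

    min_duration is an assumed cut-off below which a silence is not counted
    as a pause; adjust it to match the annotation scheme in use.
    """
    if speech_seconds <= 0:
        raise ValueError("speech_seconds must be positive")
    count = sum(
        1 for p in pauses if p.within_clause and p.duration >= min_duration
    )
    return count * 60.0 / speech_seconds


# Hypothetical annotations for a 20-second speech sample.
sample = [
    Pause(0.40, True),
    Pause(0.80, False),
    Pause(0.10, True),   # shorter than the cut-off, so not counted
    Pause(0.30, True),
]
rate = within_clause_pauses_per_minute(sample, speech_seconds=20.0)
print(rate)
```

Normalizing by speaking time makes samples of different lengths comparable; the same function could be applied to between-clause pauses by inverting the within_clause test.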

APPENDICES

Appendix A: Questions for spontaneous speech

1. What is your major? What is it about? Do you like it? And why or why not?
2. What do you like to do in your free time?

Appendix B: English language learning background questionnaire in Study 1

1. Age: _________
2. Gender:    Male    Female
3. Mother tongue (First language): _________________________________________
4. Other languages spoken at home as a child: ____________________
5. Age at first exposure to English
a. through instruction: ____________________
b. through immersion-type environment (living in an English-speaking country): ___________
6. Years of total instruction (i.e., language courses and content-based coursework) in English up
to present day: ____________________
7. Years of total immersion/exposure to English (i.e., speaking it at home with native speaker(s)
and/or living in an English-speaking country): ____________________
8. First two years of English learning
a. How many hours did you have oral English input from native English speakers
(i.e., listening to audio materials, speaking with native English speakers) per week?
____________________
b. How many hours did you have oral interaction with native English speakers per week?
____________________
c. What was the ratio of oral to written input (e.g., 10:90, 15:85, 75:25)?
_____________________
9. English proficiency (Please recall as best as you can.)
Test: __________    Total score: _________    Speaking score: __________

Appendix C: Instructions for the experiment

Your task is to listen to native and nonnative speech samples and rate them in terms of their
fluency using a 9-point scale.
1: extremely disfluent        9: extremely fluent

In this study fluency refers to how easily and smoothly speech is delivered, not overall
proficiency. Please make your judgments based on factors such as
- speech rate
- silent and filled pauses (e.g., um, uh)
- hesitations and/or corrections
- overall flow of speech
- NOT grammar or vocabulary

Following are the two questions the speakers answered.
1. What is your major? What is it about? Do you like it? Why or why not?
2. What do you like to do in your free time?

Each stimulus is about 20 seconds long and was excerpted from approximately the middle of the
original recordings.

Appendix D: Rater background questionnaire in Study 2

Name:
1. Age: _________
2. Gender:    Male    Female
3. State you are from: ____________________
4. Mother tongue (First language): _________________________________________
5. Other languages spoken at home as a child: ____________________
6. Please list any foreign language that you have previously studied:
Language            Length of study     Level
______________      ______________      Basic    Intermediate    Advanced
______________      ______________      Basic    Intermediate    Advanced
______________      ______________      Basic    Intermediate    Advanced

7. Circle one of the numbers below to show how familiar you are with Korean accented English.
1               2     3     4     5     6     7     8     9
Not familiar                                              Extremely
at all                                                    familiar

8. If you are familiar with Korean accent, describe how you became familiar with it (i.e., having
Korean friends, teaching or tutoring Korean students, etc.).

9. If you are familiar with any other foreign accent, describe how familiar you are and how you
became familiar with it (or them).

10. Have you taught or tutored nonnative English speakers? If so, briefly describe your teaching
experience (i.e., taught what, to whom, for how long etc.).

11. Which factors do you think particularly influenced your fluency rating? (e.g., speed,
silent/filled pauses, repetitions, corrections, accent, grammar, vocabulary etc.) Do you have any
other comments about the experiment?

Appendix E: An example of addition of pauses to a speech sample

In my free time which is [PWC] very limited now that I'm a graduate student [PBC]
I [PWC] like to do yoga [PBC] ahm [PBC] or go running or biking [PBC]
Um I also really like to [PWC] cook [PBC]
Ahm which I do [PWC] almost every day but not [PWC] too much

Note. [PWC] represents a pause within a clause and [PBC] represents a pause between clauses.
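
The annotation scheme above can be processed mechanically. The following sketch assumes the transcript is stored as plain text with the [PWC] and [PBC] markers written exactly as shown; it counts each pause type and recovers the unmarked text. The function name summarize_pause_markers is hypothetical and is not part of the original study materials.

```python
import re
from collections import Counter

# Matches the two pause markers used in the annotated transcript.
MARKER = re.compile(r"\[(PWC|PBC)\]")


def summarize_pause_markers(transcript):
    """Count [PWC]/[PBC] markers and return the text with markers removed."""
    counts = Counter(MARKER.findall(transcript))
    # Replace each marker with a space, then collapse runs of whitespace.
    plain = re.sub(r"\s+", " ", MARKER.sub(" ", transcript)).strip()
    return {"PWC": counts["PWC"], "PBC": counts["PBC"]}, plain


example = "Um I also really like to [PWC] cook [PBC]"
counts, plain = summarize_pause_markers(example)
print(counts)
print(plain)
```

Keeping the counts and the cleaned text separate allows the same annotated file to serve both as a record of where pauses were inserted and as a plain transcript.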

Appendix F: Example waveforms of the speech manipulations

Figure 5: Speech in the No Pause condition
[Waveform; labeled segments: um | I also | really | like | to cook | ahm | which I | do]

Figure 6: Speech in the Pauses Between Clauses condition
[Waveform; labeled segments: um | I also | really | like | to cook | [pause] | ahm | which I | do]

Figure 7: Speech in the Pauses Within Clauses condition
[Waveform; labeled segments: um | I also | really | like | to | [pause] | cook | ahm | which I | do]

REFERENCES


Aaronson, D. (1968). Temporal course of perception in an immediate recall task. Journal of
Experimental Psychology, 76, 129-140.
Anderson, J. R. (1983). The architecture of cognition. Mahwah, NJ: Erlbaum.
Arons, B. (1993). SpeechSkimmer: Interactively skimming recorded speech. Proceedings of the
6th Annual ACM Symposium on User Interface Software and Technology, USA, 6, 187-196.
Beattie, G. (1977). The dynamics of interruption and the filled pause. The British Journal of
Social and Clinical Psychology, 16, 283-284.
Beattie, G. (1980). The role of language production processes in the organization of behavior in
face-to-face interaction. In B. Butterworth (Ed.), Language production: Vol. 1. Speech
and talk (pp. 69-109). London: Academic Press.
Boers, F., Eyckmans, J., Kappel, J., Stengers, H., & Demecheleer, H. (2006). Formulaic
sequences and perceived oral proficiency: Putting a lexical approach to the test.
Language Teaching Research, 10, 245–261.
Boersma, P., & Weenink, D. (2012). PRAAT. Retrieved from http://www.praat.org
Boomer, D. S. (1965). Hesitation and grammatical encoding. Language and Speech, 8, 148-158.
Bosker, H. R. (2013, May). Native and non-native fluency. Paper presented at the New Sounds
2013 Conference, Montreal.
Bosker, H. R., Pinget, A.-F., Quené, H., Sanders, T., & de Jong, N. H. (2013). What makes speech
sound fluent? The contributions of pauses, speed and repairs. Language Testing, 30, 159-175.
Bower, G. H., & Springston, F. (1970). Pauses as recoding points in letter series. Journal of
Experimental Psychology, 83, 421-430.
Bybee, J. (2002). Phonological evidence of exemplar storage of multiword sequences. Studies in
Second Language Acquisition, 24, 215–221.
Carr, N. T. (2011). Designing and analyzing language tests. Oxford: Oxford University Press.
Cenoz, J. (1998). Pauses and communication strategies in second language speech. (ERIC
Document ED 426630). Rockville, MD: Educational Resources Information Center.

Clark, H. H., & Fox Tree, J. E. (2002). Using uh and um in spontaneous speaking. Cognition, 84,
73-111.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:
Erlbaum.
Cucchiarini, C., Strik, H., & Boves, L. (2000). Quantitative assessment of second language
learners' fluency by means of automatic speech recognition technology. Journal of the
Acoustical Society of America, 107, 989–999.
Cucchiarini, C., Strik, H., & Boves, L. (2002). Quantitative assessment of second language
learners' fluency: Comparisons between read and spontaneous speech. Journal of the
Acoustical Society of America, 111, 2862–2873.
Davies, A. (2003). The native speaker: Myth and reality. (2nd ed.). Tonawanda, NY:
Multilingual Matters.
De Bot, K. (1992). A bilingual production model: Levelt's speaking model adapted. Applied
Linguistics, 13, 1–24.
De Jong, N. H. & Bosker, H. R. (2013). Choosing a threshold for silent pauses to measure
second language fluency. In Proceedings of the 6th Workshop on Disfluency in
Spontaneous Speech (DiSS), Stockholm.
De Jong, N. H., Groenhout, R., Schoonen, R., & Hulstijn, J. H. (2013). Second language fluency:
Speaking style or proficiency? Correcting measures of second language fluency for first
language behavior. Applied Psycholinguistics, Advance online publication.
doi:10.1017/S0142716413000210
De Jong, N. H., Steinel, M. P., Florijn, A., Schoonen, R., & Hulstijn, J. H. (2013). Linguistic
skills and speaking fluency in a second language. Applied Psycholinguistics, 34, 893-916.
Dehaene, S., Dupoux, E., Mehler, J., Cohen, L., Paulesu, E., Perani, D., van de Moortele, P. F.,
Lehericy, S. & Le Bihan, D. (1997). Anatomical variability in the cortical representation
of first and second language. Neuroreport, 8, 3809-3815.
DeKeyser, R. (2001). Automaticity and automatization. In P. Robinson (Ed.), Cognition and
second language instruction (pp. 125–151). New York, NY: Cambridge University Press.
DeKeyser, R. (2007). Practice in a second language: Perspectives from applied linguistics and
cognitive psychology. New York, NY: Cambridge University Press.
Derwing, T. M., Munro, M. J., & Thomson, R. I. (2007). A longitudinal study of ESL learners'
fluency and comprehensibility development. Applied Linguistics, 29, 359-380.
Derwing, T., Rossiter, M., Munro, M., & Thomson, R. (2004). Second language fluency:
Judgments on different tasks. Language Learning, 54, 655-679.

Deschamps, A. (1980). The syntactic distribution of pauses in English spoken as a second
language by French students. In H. W. Dechert & M. Raupach (Eds.), Temporal
variables in speech (pp. 271-285). The Hague, Netherlands: Mouton.
Dörnyei, Z., & Kormos, J. (1998). Problem-solving mechanisms in L2 communication: A
psycholinguistic perspective. Studies in Second Language Acquisition, 20, 349-385.
Duez, D. (1985). Perception of silent pauses in continuous speech. Language and Speech, 28,
377-389.
Ejzenberg, R. (2000). The juggling act of oral fluency: A psycho-sociolinguistic metaphor. In H.
Riggenbach (Ed.), Perspectives on fluency (pp. 287–314). Ann Arbor: University of
Michigan Press.
Ferreira, F. (1993). Creation of prosody during sentence production. Psychological Review, 100,
233-253.
Ferreira, F. (2007). Prosody and performance in language production. Language and Cognitive
Processes, 22, 1151-1177.
Field, A. P. (2005). Intraclass Correlation. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia
of Statistics in Behavioral Science (Volume 2, pp. 948–954). Chichester: Wiley.
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all
reasons. Applied Linguistics, 21, 354-375.
Freed, B. F. (1995). Do students who study abroad become fluent? In B. F. Freed (Ed.), Second
language acquisition in a study abroad context (pp. 123–148). Amsterdam: John
Benjamins.
Freed, B. F. (2000). Is fluency, like beauty, in the eyes (and ears) of the beholder? In H.
Riggenbach (Ed.), Perspectives on fluency (pp. 243-265). Ann Arbor: University of
Michigan Press.
Freed, B. F., Segalowitz, N., & Dewey, D. P. (2004). Context of learning and second language
fluency in French: Comparing regular classroom, study abroad, and intensive domestic
immersion programs. Studies in Second Language Acquisition, 26, 275-301.
Fulcher, G. (1996). Does thick description lead to smart tests? A data-based approach to rating
scale construction. Language Testing, 13, 208-238.
Gass, S. M., & Mackey, A. (2000). Stimulated recall methodology in second language research.
Mahwah, NJ: Lawrence Erlbaum Associates.
Gass, S. M., & Mackey, A. (2007). Data elicitation for second and foreign language research.
New York, NY: Routledge.

Ginther, A., Dimova, S., & Yang, R. (2010). Conceptual and empirical relationships between
temporal measures of fluency and oral English proficiency with implications for
automated scoring. Language Testing, 27, 379-399.
Goffman, E. (1981). Radio talk. In E. Goffman (Ed.), Forms of talk (pp. 197-327). Philadelphia,
PA: University of Pennsylvania Press.
Goldman-Eisler, F. (1968). Psycholinguistics: Experiments in spontaneous speech. New York:
Academic Press.
Griffiths, R. (1991). Pausological research in an L2 context: A rationale, and review of selected
studies. Applied Linguistics, 12, 345-364.
Gut, U. (2009). Non-native speech: A corpus-based analysis of phonological and phonetic
properties of L2 English and German. Frankfurt: Peter Lang.
Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2009). Multivariate data analysis (7th
ed.). Upper Saddle River, NJ: Prentice Hall.
Hawkins, R. R. (1971). The syntactic location of hesitation pauses. Language and Speech, 14,
277-288.
Hollich, G., & Houston, D. (2007). Language Development: From speech perception to first
words. In A. Slater & M. Lewis (Eds.), Introduction to Infant Development (pp. 170-188).
New York, NY: Oxford University Press.
Holmes, V. M. (1988). Hesitations and sentence planning. Language and Cognitive Processes, 3,
323-361.
Housen, A., Kuiken, F., & Vedder, I. (2012). Complexity, accuracy and fluency: Definitions,
measurement and research. In A. Housen, F. Kuiken & I. Vedder (Eds.), Dimensions of
L2 performance and proficiency. Investigating complexity, accuracy and fluency in SLA
(pp. 1-20). Amsterdam: John Benjamins Publishing Company.
Indiana University. (n.d.). SPEAK test rating scale. Retrieved from
http://liberalarts.iupui.edu/english/index.php/academics/eap/eap_contact#rubric
Iwashita, N., Brown, A., McNamara, T., & O'Hagan, S. (2008). Assessed levels of second
language speaking proficiency: How distinct? Applied Linguistics, 29, 24-49.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall.
Kahng, J. (2012). How long should a pause be? Effects of cut-off points of pause length on
analyzing L2 utterance fluency. Poster presented at Fluent Speech Workshop, Utrecht,
The Netherlands.

Kang, O., Rubin, D., & Pickering, L. (2010). Suprasegmental measures of accentedness and
judgments of language learner proficiency in oral English. Modern Language Journal, 94,
554-566.
Kircher, T. T. J., Brammer, M. J., Levelt, W., Bartels, M., & McGuire, P. K. (2004). Pausing for
thought: Engagement of left temporal cortex during pauses in speech. NeuroImage, 21,
84-90.
Koponen, M., & Riggenbach, H. (2000). Overview: Varying perspectives on fluency. In H.
Riggenbach (Ed.), Perspectives on fluency (pp. 5–24). Ann Arbor: University of
Michigan Press.
Kormos, J. (2000a). The role of attention in monitoring second language speech production.
Language Learning, 50, 343-384.
Kormos, J. (2000b). The timing of self-repairs in second language speech production. Studies in
Second Language Acquisition, 22, 145-169.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Erlbaum.
Kormos, J., & Dénes, M. (2004). Exploring measures and perceptions of fluency in the speech of
second language learners. System, 32, 145-164.
Kuiper, K. (1996). Smooth talkers: The linguistic performance of auctioneers and sportscasters.
Englewood Cliffs, NJ: Erlbaum.
Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. New
York, NY: Routledge.
Lass, N. J., & Leeper, H. A. (1977). Listening rate preference: Comparison of two time
alternation techniques. Perceptual and Motor Skills, 44, 1163-1168.
Lennon, P. (1984). Retelling a story in English. In H. W. Dechert, D. Möhle, & M. Raupach
(Eds.), Second language productions (pp. 50-68). Tübingen: Gunter Narr Verlag.
Lennon, P. (1990). Investigating fluency in EFL: A quantitative approach. Language Learning,
40, 387–417.
Levelt, W. J. (1983). Monitoring and self-repair in speech. Cognition, 14, 41–104.
Levelt, W. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. (1999). Producing spoken language: A blueprint of the speaker. In C. Brown & P.
Hagoort (Eds.), The neurocognition of language (pp. 83-122). Oxford, UK: Oxford
University Press.

MacGregor, L. J. (2008). Disfluencies affect language comprehension: Evidence from
event-related potentials and recognition memory (Doctoral dissertation). Retrieved from
Edinburgh Research Archive. (http://hdl.handle.net/1842/3311)
Maclay, H., & Osgood, C. E. (1959). Hesitation phenomena in spontaneous English speech.
Word, 15, 19–44.
Martin, J. G., & Strange, W. (1968). The perception of hesitation in spontaneous speech.
Perception and Psychophysics, 3, 427-438.
Mora, J. C., & Valls-Ferrer, M. (2012). Oral fluency, accuracy, and complexity in formal
instruction and study abroad learning contexts. TESOL Quarterly, 46, 610-641.
O'Brien, I., Segalowitz, N., Freed, B., & Collentine, J. (2007). Phonological memory predicts
second language oral fluency gains in adults. Studies in Second Language Acquisition, 29,
557–582.
Opitz, B., & Friederici, A. D. (2003). Interactions of the hippocampal system and the prefrontal
cortex in learning language-like rules. NeuroImage, 19, 1730-1737.
Pawley, A., & Syder, F. (2000). The one clause at a time hypothesis. In H. Riggenbach (Ed.),
Perspectives on fluency (pp. 163–191). Ann Arbor: University of Michigan Press.
Perani, D., Paulesu, E., Galles, N. S., Dupoux, E., Dehaene, S., Bettinardi, V., Cappa, S. F.,
Fazio, F. & Mehler, J. (1998). The bilingual brain: proficiency and age of acquisition of
the second language. Brain, 121, 1841-1852.
Raupach, M. (1987). Procedural learning in advanced learners of a foreign language. In J. A
Coleman & R. Towell (Eds.), The advanced language learner (pp. 123-155). London:
CILT.
Reich, S. S. (1980). Significance of pauses for speech perception. Journal of Psycholinguistic
Research, 9, 379-389.
Riazantseva, A. (2001). Second language proficiency and pausing. Studies in Second Language
Acquisition, 23, 497-526.
Riggenbach, H. (1991). Towards an understanding of fluency: A microanalysis of nonnative
speaker conversation. Discourse Processes, 14, 423-441.
Roberts, B., & Kirsner, K. (2000). Temporal cycles in speech production. Language and
Cognitive Processes, 15, 129-157.
Rossiter, M. J. (2009). Perceptions of L2 fluency by native and non-native speakers of English.
Canadian Modern Language Review, 65, 395-412.

Schmidt, R. (2000). Foreword. In H. Riggenbach (Ed.), Perspectives on fluency (pp. v-viii). Ann
Arbor: University of Michigan Press.
Schnadt, M. J. (2009). Lexical influences on disfluency production (Doctoral dissertation).
Retrieved from Edinburgh Research Archive.
(http://hdl.handle.net/1842/4424)
Segalowitz, N. (2000). Automaticity and attentional skill in fluent performance. In H.
Riggenbach (Ed.), Perspectives on Fluency (pp. 200-219). Ann Arbor, MI: University of
Michigan Press.
Segalowitz, N. (2003). Automaticity and second languages. In C. Doughty & M. Long (Eds.),
The handbook of second language acquisition (pp. 382-408). Oxford, UK: Blackwell.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York: Routledge.
Segalowitz, N., & Freed, B. F. (2004). Context, contact, and cognition in oral fluency acquisition:
Learning Spanish in at home and study abroad contexts. Studies in Second Language
Acquisition, 26, 173–200.
Segalowitz, N., & Hulstijn, J. (2005). Automaticity in bilingualism and second language learning.
In J. F. Kroll & A. M. B. De Groot (Eds.), Handbook of bilingualism: Psycholinguistic
approaches (pp. 371-388). Oxford, UK: Oxford University Press.
Seidl, A., & Cristià, A. (2008). Developmental changes in the weighting of prosodic cues.
Developmental Science, 11, 596-606.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Skehan, P. (2003). Task based instruction. Language Teaching, 36, 1–14.
Skehan, P. (2009). Modelling second language performance: Integrating complexity, accuracy,
fluency, and lexis. Applied Linguistics, 30, 510-532.
SPSS Inc. (2008). SPSS Statistics for Windows, Version 17.0. Chicago: SPSS Inc.
Stemler, S. E., & Tsai, J. (2007). Best practices in interrater reliability: Three common
approaches. In J. W. Osborne (Ed.), Best practices in quantitative methods (pp. 29–49).
Thousand Oaks, CA: Sage Publications.
Sugito, M. (1990). On the role of pauses in production and perception of discourse. Proceedings
of the 1st International Conference on Spoken Language Processing, Japan, 1, 513-516.
SyllableCount.com. (n.d.). Syllable counter [online software]. Available from
http://www.syllablecount.com.
Taboada, M. (2006). Spontaneous and non-spontaneous turn-taking. Pragmatics, 16, 329-360.

Tavakoli, P. (2011). Pausing patterns: Differences between L2 learners and native speakers. ELT
Journal, 65, 71-79.
Tavakoli, P., & Skehan, P. (2005). Strategic planning, task structure, and performance testing. In
R. Ellis (Ed.), Planning and task performance in a second language (pp. 239–276).
Amsterdam: John Benjamins.
Towell, R. & Dewaele, J.-M. (2005). The role of psycholinguistic factors in the development of
fluency amongst advanced learners of French. In J.-M. Dewaele (Ed.), Focus on French
as a foreign language. Tonawanda, NY: Multilingual Matters.
Towell, R., Hawkins, R., & Bazergui, N. (1996). The development of fluency in advanced
learners of French. Applied Linguistics, 17, 84-119.
Trofimovich, P., & Baker, W. (2006). Learning second language suprasegmentals: Effects of L2
experience on prosody and fluency characteristics of L2 speech. Studies in Second
Language Acquisition, 28, 1-30.
Ullman, M. T. (2001). The neural bases of lexicon and grammar in first and second language:
The declarative/procedural model. Bilingualism: Language and Cognition, 4, 105-112.
Ullman, M. T. (2004). Contributions of memory circuits to language: the declarative/procedural
model. Cognition, 92, 231-270.
Ullman, M. T. (2005). A cognitive neuroscience perspective on second language acquisition: The
declarative/procedural model. In C. Sanz (Ed.), Mind and context in adult second
language acquisition: Methods, theory, and practice (pp. 141-178). Washington, DC:
Georgetown University Press.
Ullman, M. T. (2013). The declarative/procedural model of language. In H. Pashler (Ed.),
Encyclopedia of the Mind (pp. 224-226). Los Angeles: Sage Publications.
Wood, D. (2010). Formulaic language and second language speech fluency: Background,
evidence and classroom applications. London: Continuum.
