2

2.1

The English article system poses a unique challenge to learners because of cross-linguistic differences of the article system (Section 2.1.1) and the difficulty in teaching and learning the use of the intricate English article system (e.g., Dulay, Burt, & Krashen, 1982; Master, 1994; Section 2.1.2). Morpheme order studies have quantitatively informed us of the relative difficulty of article acquisition in comparison with other morphemes (Section 1.1.3). In what follows, I discuss each of these aspects.

2.3.2

The notion of definiteness is central to the use of English articles. A common scheme for classifying NPs by definiteness, subsequently revised by Huebner (1983, 1985), is still widely used in corpus-based learner language research (e.g., Butler, 2002; Crosthwaite, 2016; Diez-Bedmar & Papp, 2008; Diez-Bedmar, 2015; Leroux & Kendall, 2018).
This scheme categorizes NPs into four semantic contexts based on the presence or absence of Hearer Knowledge (HK±) and Specific Referent (SR±). Figure 2.1 shows the graphical representation of this categorization scheme, which distinguishes the following four types:

1. [-SR, +HK] generics
2. [+SR, +HK] referential definites
3. [+SR, -HK] referential indefinites
4. [-SR, -HK] nonreferentials

The first two categories are both [+HK], meaning that the entity is assumed known to the hearer. The former (Category 1) is a known entity without a specific referent (e.g., A cat likes mice), and the latter (Category 2) is a known entity with a specific referent (e.g., Pass me the pen). The other two categories are both [-HK], meaning that the entity is assumed unknown to the hearer. The former (Category 3) is an unknown entity with a specific referent, such as a first mention (e.g., I saw a strange man), and the latter (Category 4) is an unknown entity without a specific referent (e.g., He used to be a lawyer). All these examples were adopted from Butler (2002, pp. 478-479); for a more complete set of examples, see Butler (2002).

However, even after idiomatic expressions and conventional uses were added as a fifth category by Thomas (1989), this 2 × 2 (+ 1) categorization scheme is not adequate, as it does not capture the full variability of the notion of definiteness. That is to say, differences within each of the five types of definiteness should also be taken into account; for example, in Figure 2.1, within the single category 2 [+SR, +HK], four types of examples are given. Lumping all these four types within category 2 would be problematic if learners have varying degrees of problems among these four types.

This issue seems to be addressed in a more fine-grained coding scheme for communicative functions of definiteness (CFD; Bhatia, Simons et al., 2014; subsequently modified in Bhatia, Lin et al., 2014a). This coding scheme takes a hierarchical structure, as shown in Figure 3.1 (Section 3.1.3).

CFD categorizes various types of definiteness based on this tree structure (examples of each category are given in Appendix A). The highest-order distinction categorizes all noun phrases (NPs) into the following three intermediate nodes: Nonanaphora, Anaphora, and Miscellaneous. Nonanaphora refers to entities that are discourse-new, and it further ramifies into Unique, Nonunique, and Generic. Unique refers to uniquely identifiable entities such as Barack Obama, whereas Nonunique refers to unidentifiable entities. Generic refers to the entire kind rather than an individual case. Anaphora refers to entities that are previously mentioned or evoked in the discourse, and it further ramifies into Basic and Extended Anaphora. The former refers to entities that have been mentioned in the discourse, whereas the latter refers to entities that have not been directly mentioned but are evoked by indirect allusion. Miscellaneous refers to all other kinds of entities that do not fit into either Nonanaphora or Anaphora, such as a part of an idiomatic expression (e.g., in fact).

Even though, to my knowledge, this coding scheme has never been used in SLA research, it was deemed more informative and useful than the traditional semantic wheel because CFD overcomes the abovementioned problem that the semantic wheel lumps together different kinds of definiteness into one category.
For example, in the semantic wheel, NPs that are assumed known to the hearer for quite different reasons are all treated as the same kind of hearer knowledge, such as Nonunique_Physical_Copresence and Extended_Anaphora. The former is when the referred entity is known to the hearer because it is physically present at the moment of speech, and the latter is when the referred entity is known to the hearer because it has been indirectly evoked in the previous discourse.

In addition to the variables central to the use of English articles that have been discussed in this section thus far, other factors, such as syntactic modification, are also reported to affect the use of English articles (e.g., Lee, 1999). I will now turn to those factors.

3

3.1

3.1.1 The Two Corpora Used in the Study

3.1.2

Target tokens (i.e., articles) were extracted from the two corpora mentioned above (EFCAMDAT 1 and LOCNESS 2) in the following ways. For learner language, relevant essay topics, L1s, and proficiency levels were selected from EFCAMDAT and downloaded as an XML file. Initially, five L1s (English, Japanese, Chinese, Russian, and Korean) were to be included in the analyses; however, to ensure a sufficient number of occurrences in each L1 group for the subsequent statistical analyses, Russian and Korean were excluded from data annotation.

1 https://corpus.mml.cam.ac.uk/efcamdat2/public_html/
2 https://uclouvain.be/en/research-institutes/ilc/cecl/locness.html

For essay topics, EFCAMDAT has a total of 126 essay topics across 16 proficiency levels. As essay prompts are reported to affect the accuracy of certain article forms (Crosthwaite, 2016), care was taken to ensure the comparability between the two corpora. Because the essays written by English native speakers in LOCNESS are argumentative essays, personal topics were excluded from EFCAMDAT. More specifically, once the list of 126 essay prompts across 16 levels and 6 proficiency groups (A1-C2) was extracted, each of them was examined carefully based on (a) word count requirement, (b) writing format (e.g., email, letter, list, etc.), and (c) nature of the prompt. For (a), because essays in LOCNESS are 500 words or more, essays with lower word counts were removed; however, because the longest word count requirement in EFCAMDAT was 150-180 words, an arbitrary cutoff of at least 100 words was adopted. For (b), writing formats that affect the discourse-level variables were removed; for example, prompts that elicited bullet points were excluded. For (c), topics that are either non-argumentative or personal were excluded. A letter to a friend, an email to a teacher, a formal apology, and an apartment lease are among the examples of the topics excluded from this study. For the final list of topics, see Appendix A.

As a result, levels A1-B1 and C2 were excluded because these levels did not have enough NPs for each L1 group once the exclusion criteria (a)-(c) were applied. For example, A1 and A2 did not meet the criteria because every single essay prompt in these levels was too short (either 20-40 or 50- words).

For NP extraction, due to the difference in file format, different steps were required for EFCAMDAT and LOCNESS. For EFCAMDAT, the most straightforward way would have been to extract all determiners and the NPs they precede; however, this approach cannot extract NPs with zero articles. Hence, this study took a backward approach: all NPs and the preceding determiners were extracted, and irrelevant ones were identified and subsequently removed. This was done with a Python script.
Concretely, nouns that are (a) preceded by quantifiers (e.g., some people, any reason), (b) preceded by demonstratives (e.g., this man, these people), (c) preceded by possessives (e.g., his car), or (d) functioning as a noun modifier (e.g., credit card, bank account) were all removed. Nouns that are irrelevant to the choice of determiners (e.g., something, anything) were also removed. These processes removed approximately a third of the NPs. After the extraction and removal processes, all the NPs were exported into an Excel sheet for the subsequent annotation. Each column in the Excel sheet corresponded to one of the variables described in the following section.

For LOCNESS, due to its file format (i.e., .txt), TagAnt (Anthony, 2016) was employed for automatic part-of-speech (POS) tagging. After the POS-tagged text file was generated, all NPs tagged as either NN, NNS, NNP, or NNPS were extracted and exported into an Excel sheet for further annotation. Because the automatic removal processes described above were not applicable to the native speaker data due to its file format, irrelevant NPs were manually removed one by one. The exclusion criteria were the same as the ones used for the learner data. The descriptive statistics for the extracted tokens are presented in Table 3.2, with a breakdown of how many definite (DA), indefinite (IA), and zero articles (ZA) were used in each of the first language groups.

Table 3.2. Descriptive Statistics for the Annotated Tokens of Articles
L1         # Essays   Word Count (per essay)   # Tokens   DA (%)      ZA (%)       IA (%)
English    15         4992 (332.80)            833        245 (29%)   479 (58%)    109 (13%)
Chinese    25         4822 (192.88)            795        221 (28%)   480 (60%)    94 (12%)
Japanese   24         4603 (191.79)            833        290 (35%)   471 (57%)    72 (9%)
Total      64         14417 (225.27)           2461       756 (31%)   1430 (58%)   275 (11%)

3.1.3

Table 3.3. Overview of the Variables Used in Annotation
Types of variables   Variables       Number of Levels
Semantics            NOUN COUNT      2
                     NOUN ANIMACY    12
                     NOUN TYPE       2
                     DEFINITENESS    8
                     VERB TYPE       5
Morphological        FORM            3
                     NUMBER          2
Syntactic            MODIFICATION    4
                     CASE            4
Data                 L1              3
                     PROFICIENCY     4
                     ID              32

3 I was the sole annotator of this annotation process, and I acknowledge that the untested interrater reliability is one of the limitations of this study.

In Table 3.3, the column Types of variables indicates the broader category to which each variable belongs; for example, variables labeled as Semantics pertain to the meaning of the target NP. The column Number of Levels indicates how many levels each of the variables has. In what follows, I present a detailed description of each of the 12 variables in the order given in Table 3.3. The first variable, NOUN COUNT, is presented in Table 3.4.

Table 3.4. The NOUN COUNT Variable and its Levels
Types of variables   Variable Name   Levels
Semantics            NOUN COUNT      countable, uncountable

The variable NOUN COUNT was annotated in the following way. First, all NPs tagged as plural nouns were annotated as countable, because only countable nouns can be pluralized. Because NPs tagged as singular can be either singular countable nouns or mass nouns, this distinction was manually annotated. In this manual annotation process, whenever the countability was unclear, an online English dictionary 4 was used as a reference. Because the same noun can be countable or uncountable depending on the meaning it conveys in a particular context, the closest meaning was identified in the dictionary, and the corresponding countability was annotated in the data. The examples below show that the NP life in (1) is countable because it refers to a particular course of life, whereas the one in (2) is uncountable because it means general human existence.
(1) Today, having knowledge of how the computer operates is considered a necessary component of leading a successful life (ICLE-US-MICH-002.1).
(2) Cars, telephones, and nuclear energy are just three examples of inventions and discoveries that have had profound effects on modern day life (ICLE-US-MICH-0035.1).

4 Longman Dictionary of Contemporary English Online (https://www.ldoceonline.com/) was used.

The variable NOUN ANIMACY is presented in Table 3.5.

Table 3.5. The NOUN ANIMACY Variable and its Levels
Types of variables   Variable Name   Levels
Semantics            NOUN ANIMACY    non-human, human, natnl/group/socrole, other abstract, dynamic, ling, eff/state, mental/emotional, natural entity, place/time, social-conv

This variable NOUN ANIMACY was adopted from Deshors (2016). The original variable had 23 levels; however, it was eventually conflated into 12 levels. For examples of each of these levels, see Appendix A. The conflation process is presented in Table 3.6.

Table 3.6. The Conflation Process of the Variable NOUN ANIMACY

As shown in Table 3.6, the original variable with 23 levels was conflated into 12 levels (Deshors, 2016, p. 143). Examples of each of the NOUN ANIMACY types are presented in Appendix A. For more details on the statistical and conceptual validity of this conflation, see Deshors (2016).

The variable NOUN TYPE is presented in Table 3.7.

Table 3.7. The NOUN TYPE Variable and its Levels
Types of variables   Variable Name   Levels
Semantics            NOUN TYPE       common, proper

This variable NOUN TYPE was annotated automatically, based on the part-of-speech tags provided by EFCAMDAT. Because proper nouns are tagged as either NNP (singular) or NNPS (plural), and common nouns as NN (singular) or NNS (plural), the first two were automatically annotated as proper nouns, and the other two as common nouns. For example, the NP America in (3) was annotated as proper, and humanness as common.

(3) In America, this growing individualistic society, one no longer sees the realitive humanness between people (ICLE-US-MICH-0005.1).

The variable DEFINITENESS is presented in Table 3.8.

Table 3.8. The DEFINITENESS Variable and its Levels
Types of variables   Variable Name   Levels
Semantics            DEFINITENESS    Unique Hearer Old (uniq_hear_old), Unique Hearer New (uniq_hear_new), Non-Unique Hearer Old (nonuni_hear_old), Non-Unique Hearer New (nonuni_hear_new), Non-Unique Non-Specific (nonuni_nonspe), Generic (generic), Basic Anaphora (bas_anaph), Extended Anaphora (ext_anaph), Miscellaneous (misc)

This variable DEFINITENESS originally had 24 levels, but it was conflated into 9 levels for the ease of annotation. Examples for each of the 9 levels would require more than a single sentence, as this discourse-level variable is suprasententially defined; for a simpler list of examples for each of the DEFINITENESS levels, see Appendix A. Table 3.9 summarizes the conflation process of the variable DEFINITENESS.
Table 3.9. The Conflation Process of the Variable DEFINITENESS
Original Levels                      Conflated Levels
Unique_Physical_Copresence           Unique Hearer Old (uniq_hear_old)
Unique_Larger_Situation
Unique_Predicative_Identity
Unique_Hearer_New                    Unique Hearer New (uniq_hear_new)
NonUnique_Physical_Copresence        Non-Unique Hearer Old (nonuni_hear_old)
NonUnique_Larger_Situation
NonUnique_Predicative_Identity
NonUnique_Hearer_New_Spec            Non-Unique Hearer New (nonuni_hear_new)
NonUnique_NonSpec                    Non-Unique Non-Specific (nonuni_nonspe)
Generic_Kind_Level                   Generic (generic)
Generic_Individual_Level
Same_Head                            Basic Anaphora (bas_anaph)
Different_Head
Bridging_Nominal                     Extended Anaphora (ext_anaph)
Bridging_Event
Bridging_Restrictive_Modifier
Bridging_Subtype_Instance
Bridging_Other_Context
Pleonastic                           Miscellaneous (misc)
Quantified
Predicative_Equative_Role
Part_Of_Noncompositional_MWE
Measure_Nonreferential
Other_Nonreferential

Originally, the variable DEFINITENESS had 24 levels, and they were conflated into nine levels as shown in Table 3.9, for the ease and accuracy of annotation. The conflation was mainly based on the original hierarchical structure proposed in Bhatia, Simons et al. (2014), which is shown in Figure 3.1; the nine retained levels are underlined and boldfaced in the figure.

In addition to the hierarchical structure shown in Figure 3.1, I also took into account the distinctions of Hearer Knowledge [HK±] and Specific Referent [SR±] during the conflation process. For example, because the [HK±] and [SR±] distinctions were present within Nonanaphora, this distinction was retained in the conflation process. Consequently, from the node Nonanaphora in Figure 3.1, the following six levels were retained: Unique_Hearer_Old, Unique_Hearer_New, Nonunique_Hearer_Old, Nonunique_Hearer_New, Nonunique_Nonspecific, and Generic. For Anaphora, because it is important to distinguish anaphoric NPs whose referents have actually been mentioned from those that have only been evoked by entities mentioned before, Basic_Anaphora and Extended_Anaphora were retained. Lastly, the other types that fall under Miscellaneous were conflated into one level, Miscellaneous, because they are not explicable by anaphoricity, [±HK], or [±SR].

The variable VERB TYPE is presented in Table 3.10.

Table 3.10. The VERB TYPE Variable and its Levels
Types of variables   Variable Name   Levels
Semantics            VERB TYPE       stative, activity, achievement, accomplishment, n/a

This variable VERB TYPE is based on the taxonomy developed by Vendler (1957). For the nouns that were either the subject (nominative case) or the object (accusative case) of a verb, the lexical aspect of that verb was annotated. For nouns that did not receive any syntactic case, VERB TYPE was annotated as n/a.

The variable MODIFICATION is presented in Table 3.11.

Table 3.11. The MODIFICATION Variable and its Levels
Types of variables   Variable Name   Levels
Syntactic            MODIFICATION    Pre-modification with adjective (premod_a), Pre-modification with noun (premod_n), Post-modification with prepositional phrase (postmod_p), Post-modification with relative clause (postmod_rc), Post-modification with infinitival clause (postmod_ic), Post-modification with complement clause (postmod_cc)

Originally, these six levels constituted a single variable MODIFICATION. However, because noun phrases can have multiple modifications (e.g., a big house in the city), it was not ideal to annotate this variable as a six-level multinomial (i.e., single-label) variable.
Because MODIFICATION is in fact a multi-label variable, it was separated into six variables, each of which was then treated as a two-level binary variable. The underlined NP in example (4) was annotated as premod_a and premod_n, (5) as postmod_p and postmod_ic, (6) as postmod_cc, and (7) as postmod_rc.

(4) As individuals we are constantly surrounded by racist and discriminative media language (ICLE-US-MICH-0004.1).
(5) This sudden burst of useful compounds not only improved the chances of a patient's survival in a hospital but also caused a great need for medical chemists to study and classify each new drug as it was discovered (ICLE-US-MICH-0015.1).
(6) We Chinese have a saying that men at their birth are naturally good (EFCAMDAT-writing-id-556256).
(7) An invention of the 20th century which I feel has significantly changed people's lives is the introduction of Bank-cash machines or Automatic teller machines (ICLE-US-MICH-0044.1).

The variable FORM is presented in Table 3.12.

Table 3.12. The FORM Variable and its Levels
Types of variables   Variable Name   Levels
Morphological        FORM            DA, IA, ZA

The variable FORM is the choice of article made in each case: DA stands for the definite article, IA for the indefinite article, and ZA for the zero article. For example, the NP illustration in (8) was annotated as IA, work as DA, and computer as ZA.

(8) A vivid illustration of this can be found by examining the work. Recently, an auto-pasts [sic] company put all of their inventory on computer (ICLE-US-MICH-0002.1).

The variable NUMBER is presented in Table 3.13.

Table 3.13. The NUMBER Variable and its Levels
Types of variables   Variable Name   Levels
Syntactic            NUMBER          singular, plural

This variable NUMBER was annotated automatically based on the POS tags: NN and NNP were annotated as singular, and NNS and NNPS as plural. For example, in (9), the NP saying was annotated as singular, and generations as plural.

(9) Money is the root of all evil is an ancient saying -- but its truth applies to all generations (ICLE-US-IND-0015.1).

The variable CASE is presented in Table 3.14.

Table 3.14. The CASE Variable and its Levels
Types of variables   Variable Name   Levels
Syntactic            CASE            Accusative with preposition (acc_p), Accusative with verb (acc_v), Nominative (nom), neither

The variable L1 is presented in Table 3.15.

Table 3.15. The L1 Variable and its Levels
Types of variables   Variable Name   Levels
Data                 L1              English, Japanese, Chinese

As has already been presented as descriptive statistics in Section 3.1.2, 833 occurrences of articles from LOCNESS were annotated as English, 833 from EFCAMDAT as Japanese, and the remaining 795 from EFCAMDAT as Chinese.

The variable PROFICIENCY is presented in Table 3.16.

Table 3.16. The PROFICIENCY Variable and its Levels
Types of variables   Variable Name   Levels
Data                 PROFICIENCY     B2, C1

The variable PROFICIENCY was only applicable to the NNS data from EFCAMDAT. Based on the conversion chart between the proficiency level measures in EFCAMDAT and those of the CEFR, the retained essays fell into the B2 and C1 levels.

3.2

3.2.1

As has been briefly mentioned earlier in this paper, MuPDAR (Gries & Deshors, 2014) is a regression-based methodological protocol for investigating the (non-)nativelikeness of the use of a certain linguistic structure. Conceptually, it predicts what an NS would do in the given linguistic context that an NNS is in, and this linguistic context is operationalized through a set of relevant linguistic features.
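For concreteness, each article token in such an analysis can be thought of as one row of categorical features. The minimal R sketch below uses the variable names from the annotation scheme in Section 3.1.3; the object name and all values are invented for illustration only, and only two of the six binary MODIFICATION indicators are shown.

# One annotated article token represented as a row of categorical features.
# Column names follow Section 3.1.3; values are invented for illustration.
one_token <- data.frame(
  FORM         = "DA",           # article actually used (DA / IA / ZA)
  NOUN_COUNT   = "countable",
  NOUN_ANIMACY = "human",
  NOUN_TYPE    = "common",
  DEFINITENESS = "bas_anaph",
  VERB_TYPE    = "stative",
  NUMBER       = "singular",
  premod_a     = "yes",          # two of the six binary MODIFICATION indicators
  postmod_p    = "no",
  CASE         = "nom",
  L1           = "Japanese",
  PROFICIENCY  = "B2",
  ID           = "learner_017"   # hypothetical learner identifier
)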
Methodologically, MuPDAR consists of roughly four steps: (1) train a logistic regression model (R1) on the NS data; (2) if the fit of R1 is good, apply R1 to the NNS data to make predictions and obtain the probability distribution of the target linguistic form (i.e., what an NS would do, and at what probability, in the given situation that an NNS is in); (3) calculate the deviation of the NNS productions based on the difference between the prediction made in (2) and the actual NNS data (i.e., what an NNS actually did); and (4) create a regression model (R2) to predict the deviation of the NNS calculated in (3).

Gries and Deshors (2014), in their analysis of NNS usage of the modals may and can, explain that (3) can be done in two different ways. The first approach is to calculate the deviation categorically; that is to say, whenever the predicted NS choice and the actual NNS choice do not match, the case is marked as deviant ("foreign"), whereas it is marked as nativelike ("native") when they match. The second approach is to calculate the deviation quantitatively. In this approach, a vector Dev (as in deviation) is created, and a numeric value is attached to each case of the target linguistic form. Whenever the actual NNS choice and the predicted NS choice match, the numeric value is set to 0 (no deviation). Whenever the choices do not match, the numeric value is set to p - 0.5, where p stands for the predicted probability of the NS choice made by R1 (for the complete explanation of the original MuPDAR approach, see Gries & Deshors, 2014). The second, quantitative approach is more commonly used because of the level of granularity it allows for (e.g., Lester, 2019).

When applying MuPDAR to a multinomial classification, a crucial difference between binomial and multinomial classification has to be noted. In binomial classification, one deviation vector suffices because the probability of one class automatically determines the probability of the other class; for example, when the probability of may is 40%, then the probability of can is automatically 60%. However, this does not hold true in a multinomial classification like the one in the present study, because the probability of one class does not determine the probability of each of the remaining classes. For example, when the probability of DA is 40%, it only tells us that the sum of the probabilities of IA and ZA is 60%; it does not tell us what the probabilities of IA and ZA are, respectively. Therefore, a modification has to be made to accommodate the number of classes of the response variable. Two possible alternative approaches and their pros and cons were considered.

3.2.1.1

The first approach is similar to the categorical approach to MuPDAR explained above. It consists of four (almost) identical steps: (1) train a multinomial logistic regression model (R1) on the NS data; (2) if the fit of R1 is good, apply R1 to the NNS data to make predictions and obtain the probability distribution of the article choice (i.e., what an NS would do, and at what probability, in the given situation that an NNS is in); (3) create a vector that categorically represents whether or not the actual NNS choice matches the NS prediction made in (2); and (4) create a binary logistic regression model (R2) to predict the deviation of the NNS calculated in (3). The biggest advantage of this approach is that the final step (4) can be taken in exactly the same way as in the original MuPDAR, because of the categorical nature of the deviation vector.
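A minimal sketch of this first approach in R is given below. It assumes data frames ns_data and nns_data that contain the article choice (FORM) and the annotation variables; all object names are hypothetical, and only a few predictors are spelled out.

library(nnet)

# Step (1): multinomial model trained on the NS data (further predictors omitted)
r1 <- multinom(FORM ~ DEFINITENESS + NOUN_COUNT + NUMBER + NOUN_TYPE, data = ns_data)

# Step (2): predicted NS choice for every NNS token
ns_choice <- predict(r1, newdata = nns_data, type = "class")

# Step (3): categorical deviation vector ("native" = match, "foreign" = mismatch)
nns_data$DEVIATION <- factor(ifelse(as.character(ns_choice) == as.character(nns_data$FORM),
                                    "native", "foreign"))

# Step (4): binary logistic regression predicting the (mis)matches
r2 <- glm(DEVIATION ~ DEFINITENESS + NOUN_COUNT + NUMBER + NOUN_TYPE,
          data = nns_data, family = binomial)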
On the other hand, this approach has (at least) two shortcomings: it cannot quantify the degree of deviation, and it cannot distinguish between different kinds of deviation. The former is inherent to the categorical approach, whereas the latter stems from the nature of multinomial classification. That is to say, a dichotomous categorization of deviation (i.e., match vs. mismatch) will not tell us what an NNS chose and what an NS would choose in the same situation, because it only tells us whether the responses were the same or not. For example, an NNS choosing the zero article when an NS would choose the definite article and an NNS choosing the definite article when an NS would choose the indefinite article are two very different scenarios (with probably two very different reasons for the deviation).

3.2.1.2

Approach 2 addresses the first shortcoming, namely the inability to quantify the deviation. In Approach 2, steps (1) and (2) follow Approach 1, and steps (3) and (4) are different: (1) train a multinomial logistic regression model (R1) on the NS data; (2) if the fit of R1 is good, apply R1 to the NNS data to make predictions and obtain the probability distribution of the article choice (i.e., what an NS would do, and at what probability, in the given situation that an NNS is in); (3) calculate a deviation score for each token based on the difference between the actual NNS choice and the NS prediction made in (2); and (4) create a multiple regression model (R2) to predict the deviation score calculated in (3).

In step (3), instead of calculating the deviation dichotomously, a numeric value is assigned to each instance of article use. If the NS and NNS choices are in alignment, a numeric value of 0 is assigned (i.e., no deviation). If the NS and NNS choices diverge, the deviation is quantified in terms of how small the probability of the NS making the same choice as the NNS was. For example, suppose that in a given linguistic context X, the NNS chose the indefinite article, whereas the NS probability distribution (as predicted by R1) was (DA, IA, ZA) = (0.7, 0.1, 0.2). In this case, the probability of an NS making the same article choice as the NNS (i.e., the indefinite article) is 0.1. However, this value is counterintuitive because it has to be interpreted as "the smaller the value, the larger the deviation." To make this value more intuitively interpretable, the deviation is operationalized as 0.5 - p. Originally, the deviation was defined as p - 0.5 (Gries & Deshors, 2014); however, Lester (2019) flipped the equation to make the numeric value more intuitively interpretable. In this example, the deviation is 0.5 - 0.1 = 0.4. The reasoning behind this operationalization is that, when the NS and NNS choices do not match, the theoretical minimum of the predicted probability of an NS making the same choice as the NNS is 0 (i.e., maximum deviation), whereas the theoretical maximum is just below 0.5 (i.e., minimum deviation), because the choices must have matched if the predicted probability had exceeded 0.5, regardless of the probability distribution of the other two articles. Therefore, by subtracting this probability from 0.5, the deviation value falls within the range 0 <= dev <= 0.5. Given the capacity to quantify learner deviation, Approach 2 was adopted. In what follows, I give a more detailed description of each step of Approach 2, with a particular focus on the software and packages used.

3.2.1.3 Software and Packages

The statistical software RStudio was used to run each of the four steps in Approach 2. Summarized below are the specifics of Approach 2 and the packages used in each step.
(1) A multinomial logistic regression model was built using the function multinom() in the R package nnet, with the choice of article as a three-level categorical response variable and all other variables as categorical predictor variables. Following Lester (2019), cross-validation was conducted to ensure the generalizability of the classification accuracy. A commonly used five-fold cross-validation was employed in this study; in other words, 20% of the NS data were labeled as the test set and the other 80% as the training set, and this data splitting took place five times, with a unique 20% assigned to the test set in each round. Due to the lack of an available package, I implemented the five-fold cross-validation code myself.

(2) The predictions of R1 on the NNS data were obtained through the predict() function in the R package nnet.

(3) For the calculation of the deviation score, I wrote code that followed the calculation method described above.

(4) The mixed-effects multiple regression model R2 was built with the lmer() function in the R package lme4.

3.2.2

The whole idea of MuPDAR, and its power, rests on the assumption that R1 (the regression model trained on the NS data) will make a nativelike judgment when applied to the NNS data. This is precisely what enables us to compare what an NNS actually did in a given linguistic context with what an NS would do in the exact same linguistic context (as represented by a vector of variables). This assumption is reasonable as long as R1 fits the NS data well (as measured by goodness of fit and classification accuracy); however, to my knowledge, this assumption has never been tested empirically, perhaps due to the lack of data that would enable the empirical validation of such an assumption. That is to say, this validation necessitates data in which an NS choice and an NNS choice of English articles can be directly compared in the exact same linguistic context, and one type of data that meets this requirement is an essay written by an NNS and corrected by an NS.

In EFCAMDAT, one of the corpora used in the present study, learner essays are provided with professional feedback on grammatical and lexical errors by language teachers. According to recruiting information by Education First, 5 the teachers are all English native speakers with a minimum of 40 hours of training in TEFL. Therefore, the validation of the assumption is possible by comparing the prediction made by R1 with the actual error correction (or non-correction) made by an NS.

Of the 1628 tokens of articles in the 49 learner essays extracted from EFCAMDAT, 34 tokens had error corrections, across 13 essays. However, because EFCAMDAT only claims that error corrections have been provided to "a substantial portion of scripts" (Huang et al., 2017, p. 7, emphasis added), not to all of them, it is dangerous to assume that the other 35 essays, which had no error correction on articles, were reviewed by NSs and judged to be completely error-free. Therefore, the only way to distinguish articles that were judged to be correct by an NS from the ones that were simply not reviewed by an NS is to restrict the scope of this analysis to the essays that have at least one error correction. It is reasonable to assume that all the articles with no error correction in an essay that has at least one error correction elsewhere within it were judged by an NS to be correct; the comparison then amounts to applying R1 to the same tokens before and after error correction and comparing the two classification accuracies, as sketched below.
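A minimal sketch of this comparison in R, assuming the fitted NS model r1 from the earlier sketch and hypothetical data frames nns_original and nns_corrected that hold the same tokens and predictors, with FORM containing the article before and after correction respectively:

# Classification accuracy of R1 on the original and on the error-corrected tokens
acc_original  <- mean(as.character(predict(r1, newdata = nns_original)) ==
                      as.character(nns_original$FORM))
acc_corrected <- mean(as.character(predict(r1, newdata = nns_corrected)) ==
                      as.character(nns_corrected$FORM))

# One way to compare the two accuracy rates is a two-sample test of proportions
n <- nrow(nns_original)
prop.test(x = c(round(acc_corrected * n), round(acc_original * n)), n = c(n, n))

prop.test() is only one of several ways to carry out such a comparison of proportions.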
Consequently, 13 essays with at least one error correction were deemed appropriate for the assumption validation. These 13 essays included 34 error-corrected tokens and 428 error-free tokens of articles, and the following two versions of the data were created out of these 462 tokens:

1. The original NNS production of the 462 tokens (428 error-free tokens + 34 tokens before error correction)
2. The error-corrected NNS production of the 462 tokens (428 error-free tokens + 34 tokens after error correction)

5 http://www.englishtown.com/teachonline/

If the assumption of MuPDAR is correct, then the predicted NS choice (by R1) should align more closely with the error-corrected NNS production than with the original NNS production. Following this procedure, R1 was applied to both versions 1 and 2 of the 462 tokens, and the classification accuracy for version 2 (73%) was higher than for version 1 (70%), although the difference was not statistically significant (z = 0.87, p = .38). Given that the model R1 predicts the Japanese data and the Chinese data at a good accuracy of 70% and 71%, respectively, this is not cogent evidence that R1 makes a nativelike judgment on the NNS data. This will be further discussed in the limitations section.

4

4.1

4.1.1

Overall, the model fit of the multinomial logistic regression model R1 was excellent 6 (C = .96, well beyond the threshold of C = .8 proposed by Gries & Deshors, 2014, and above the C = .93 reported in Lester, 2019; McFadden's pseudo-R-squared = .68). The overall classification accuracy, which is the number of correct predictions divided by the total number of predictions, was 89%. This means that the model trained on the NS data was able to predict with 89% accuracy which one of the three article options (i.e., ZA, IA, DA) a native speaker would use in a given linguistic context represented by a vector of variables. The five-fold cross-validation showed a mean accuracy of 84% (SD = 2%). Given this small decrease in accuracy, the model is reasonably generalizable to different datasets.

6 The C statistic was defined as the area under the receiver operating characteristic (ROC) curve.

Initially, learner ID was to be included as a random effect in R1; however, because, to my knowledge, no R package allows the inclusion of random effects in a multinomial logistic regression model, it was first included as a fixed effect. The classification accuracy was higher with learner ID (90%), but the generalizability decreased substantially, as can be seen in the mean accuracy score of the five-fold cross-validation (77%). Therefore, for the first regression model R1, a decision was made not to include the variable learner ID. Also, at this stage of MuPDAR, AIC was not considered, because the purpose of R1 is not to construct a parsimonious model; rather, it solely aims at making the most accurate prediction on the NNS data that approximates what an NS would do.

Because the fit of R1 was shown to be good, the NS model R1 was applied to the NNS data. As has already been explained in the methodology section, an NS prediction based on R1 was made for each of the cases in the NNS data, and Tables 4.1 and 4.2 show the confusion matrices for the actual article choices by Chinese and Japanese learners of English and the NS predictions by R1, respectively.
Table 4.1. Confusion Matrix of the R1 Predictions on the NNS Data (L1 = Chinese)
                                       NS choice predicted by R1
Actual NNS choice (L1 = Chinese)       DA     IA     ZA     Total
DA                                     153    32     36     221
IA                                     23     64     7      94
ZA                                     76     54     350    480
Total                                  252    150    393    795

Table 4.2. Confusion Matrix of the R1 Predictions on the NNS Data (L1 = Japanese)
                                       NS choice predicted by R1
Actual NNS choice (L1 = Japanese)      DA     IA     ZA     Total
DA                                     213    29     48     290
IA                                     12     56     4      72
ZA                                     91     66     314    471
Total                                  316    151    366    833

In Tables 4.1 and 4.2, each row corresponds to the actual NNS choices, whereas each column corresponds to the predicted NS choices. The boldfaced, underlined figures on the diagonal indicate the match between the two, namely the number of occurrences of articles for which the actual NNS choice and the predicted NS choice were the same. Overall, the proportion of Chinese learners' choices that matched the predicted NS choice (71%) was not significantly higher than the proportion of Japanese learners' choices that matched the predicted NS choice (70%; z = 0.59, p = .56).

4.1.2

Approach 2 was adopted in the calculation of the deviation score and the construction of the final regression model (R2) because it allows for a more fine-grained quantitative analysis of the deviation. Following the procedure described in the methodology section, the deviation score was calculated for each token in the NNS data. A generalized linear mixed-effects model was then built with the function glmer() in the R package lme4. All the independent variables included in R1 were entered as predictors, and the variable FORM was also included in R2, as we would like to see how the three types of articles uniformly or differentially affect the learner deviation and how they interact with the other predictors. Also, all of them were allowed to interact with the variable L1, as their effects on the deviation are expected to differ based on the learner's L1. Consequently, the model included main effects, two-way interactions with FORM (FORM : everything), two-way interactions with L1 (L1 : everything), and three-way interactions with both FORM and L1 (FORM : L1 : everything). To avoid overparameterization, AIC and BIC scores were calculated for the models with (1) only main effects, (2) main effects and two-way FORM interactions, (3) main effects and two-way L1 interactions, and (4) main effects, two-way FORM interactions, two-way L1 interactions, and three-way FORM : L1 interactions. This model selection process is summarized in Table 4.3.

Table 4.3. Model Selection of R2

Note. In Table 4.3, AIC and BIC represent numerical measures of model fit and model parsimony, which penalize the inclusion of additional terms. AIC is more useful for detecting type II errors (false negatives), whereas BIC is more sensitive to type I errors (false positives). Based on the table, Model 2 has the smallest AIC and BIC, and Model 4 has a slightly higher conditional R2 (R2C). The contrast between the models with smaller AIC and BIC (Model 2 and Model 4) and the ones with larger AIC and BIC (Model 1 and Model 3) is most likely due to the inclusion of the interaction terms with the variable FORM. The slightly higher R2C of Model 4 is not surprising, given that Model 4 is Model 2 plus the three-way interactions (FORM : L1 : everything). The AIC and BIC for Model 4 are higher than those of Model 2; however, given the conceptual importance of investigating how speakers of different L1s are differently influenced by the other variables, Model 4 was selected as the initial model for R2. The model was highly significant (F(192, 1435) = 35.46, p < .001) without any sign of multicollinearity (all VIFs < 2). 7
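A minimal sketch of how the deviation score and the four candidate models could be constructed in R is given below, reusing the fitted NS model r1 and the NNS data frame nns_data from the earlier sketch; only a few predictors are spelled out, and lmer() is used as in Section 3.2.1.3. The models are fit with maximum likelihood (REML = FALSE) so that their AIC and BIC values are comparable across different fixed-effects structures.

library(nnet)
library(lme4)

# Deviation score: 0 when the predicted NS choice matches the actual NNS choice,
# otherwise 0.5 minus the predicted probability of the NS making the NNS's choice
probs   <- predict(r1, newdata = nns_data, type = "probs")
p_same  <- probs[cbind(seq_len(nrow(probs)),
                       match(as.character(nns_data$FORM), colnames(probs)))]
matched <- as.character(predict(r1, newdata = nns_data)) == as.character(nns_data$FORM)
nns_data$dev <- ifelse(matched, 0, 0.5 - p_same)

# Four candidate models compared on AIC and BIC (remaining predictors omitted)
m1 <- lmer(dev ~ FORM + L1 + DEFINITENESS + NOUN_COUNT + (1 | ID),
           data = nns_data, REML = FALSE)
m2 <- lmer(dev ~ FORM * (L1 + DEFINITENESS + NOUN_COUNT) + (1 | ID),
           data = nns_data, REML = FALSE)
m3 <- lmer(dev ~ L1 * (FORM + DEFINITENESS + NOUN_COUNT) + (1 | ID),
           data = nns_data, REML = FALSE)
m4 <- lmer(dev ~ FORM * L1 * (DEFINITENESS + NOUN_COUNT) + (1 | ID),
           data = nns_data, REML = FALSE)
AIC(m1, m2, m3, m4)
BIC(m1, m2, m3, m4)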
Because the inclusion of all of these categorical variables involves the generation of dummy variables, a reference level had to be set for each variable. A reference level is interpreted as the baseline level to which all other levels are compared. The reference levels of the 12 variables are presented in Table 4.4.

7 Although an interaction term and its constituent variables are often highly correlated, this multicollinearity does not pose problems for the present analysis.

Table 4.4. Reference Level for Each of the Categorical Independent Variables

In Table 4.4, for the variables for which linguistic markedness could be established, the least marked level was set as the reference level (e.g., no prenominal adjectival modification is less marked than a modified NP). For the variables for which linguistic markedness was difficult to define, the most frequent level was set as the reference level, following Gries and Deshors (2014). In addition to these fixed-effect independent variables, the variable ID was also included in this model as a random effect. The significant predictors of the regression model are summarized in Table 4.5.

Table 4.5. Significant Predictors of the Deviation Score

In Table 4.5, only the statistically significant effects with no significant higher-order interaction effects are included. For example, a significant main effect was not included in the table when it was overridden by a significant interaction term involving that variable, and a significant two-way interaction was likewise not included when it was overridden by the highly significant highest-order interaction term. In this sense, it is this highest-order (three-way) interaction that is particularly noteworthy in Table 4.5. That is to say, the three highest-order interactions at the bottom of Table 4.5 indicate that each of the three independent variables involved differentially affected the accuracy of article use, and that such differential effects further varied for Chinese and Japanese learners of English. This will be presented graphically later in this section.

Each of the F statistics in Table 4.5 can be interpreted as the amount of change in the model fit when the full model is compared against another model without the term of interest. For example, the row for an interaction with L1 represents how much improvement the inclusion of that interaction term makes in terms of model fit when all other terms are already included in the model. Considering that all the independent variables are categorical, this is in principle identical to a factorial ANOVA with type III sums of squares. In the following sections, the significant predictors are further investigated graphically and statistically.

For each graph, the dots (or other shapes, such as triangles and squares) represent the means of the deviation score at a given level of the variable of interest. Because all other variables not shown on the graph are held constant, the differences between such means represent the marginal effects of the variables of interest. Error bars represent 95% confidence intervals. The value on the y-axis is always the deviation score, whereas the x-axis and the legend show the predictor variables that constitute the interaction term of interest (as no main effects are analyzed here, every graph has at least two predictor variables).
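As an illustration of the plotting logic, the sketch below computes raw cell means of the deviation score and 95% confidence intervals for one example grouping (FORM by NOUN COUNT, faceted by L1) and plots them with error bars. The dplyr and ggplot2 packages and all object and column names are assumptions here, and the actual figures are based on model-derived marginal effects rather than raw means, so this is only an approximation.

library(dplyr)
library(ggplot2)

# Raw cell means and 95% CIs of the deviation score (dev) by FORM x NOUN_COUNT x L1
cell_means <- nns_data %>%
  group_by(L1, FORM, NOUN_COUNT) %>%
  summarise(mean_dev = mean(dev),
            ci = qt(.975, n() - 1) * sd(dev) / sqrt(n()),
            .groups = "drop")

ggplot(cell_means, aes(x = FORM, y = mean_dev, colour = NOUN_COUNT)) +
  geom_point(position = position_dodge(width = .4)) +
  geom_errorbar(aes(ymin = mean_dev - ci, ymax = mean_dev + ci),
                width = .15, position = position_dodge(width = .4)) +
  facet_wrap(~ L1) +
  labs(y = "Deviation score")

Model-adjusted marginal means that hold the other predictors constant could instead be obtained from R2 with a package such as emmeans.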
For the statistical analyses, because an everything-to-everything comparison would lead to an overly severe alpha-level adjustment, every level of each predictor variable was compared only against the reference level, which has already been discussed above. This means that for each predictor variable with k levels, (k - 1) tests were conducted. The alpha level was adjusted accordingly based on the Bonferroni correction, a simple and conservative type of adjustment, by dividing it by the number of tests (k - 1). This calculation is a simplification of a more complex formula, but it can be used satisfactorily in most cases (Walters, 2016). For two-way interaction terms, comparisons were conducted on the second predictor within a given level of the first predictor. For example, when analyzing the interactions in Figure 4.1, comparisons were conducted on the second predictor (i.e., NOUN COUNT: countable vs. uncountable) within a given level of the first predictor (i.e., FORM), which is, say, DA. Because the second predictor has only two levels and hence only one comparison, the alpha level was not adjusted for this particular example.

4.1.2.1 Marginal Effect of the Interaction Term

Figure 4.2 shows that, regardless of noun countability, both Chinese and Japanese learners had problems using DA accurately, and this inaccuracy of DA was more obvious with uncountable nouns. This difference becomes much more pronounced for the use of IA; learners had great difficulty using IA with uncountable nouns, whereas their use of IA with countable nouns was almost nativelike. Most strikingly, the relative ease of countable nouns does not hold true for ZA; that is to say, the use of ZA with uncountable nouns was more nativelike, whereas the use of ZA with countable nouns was more deviant. Figure 4.3 shows the marginal effect of the interaction term

4.1.2.2

5

5.1

5.2

Throughout the post-hoc graphical analyses of each significant effect, it appeared that the use of IA was, in general, associated with lower deviation scores (i.e., more nativelikeness). However, it is important to note that this does not mean that Chinese and Japanese learners of English have a good understanding of the distribution of IA; rather, the low deviation scores for IA seem to be attributable to its overly restricted use by NNS. A closer look at the confusion matrices of the actual NNS article choices and the predicted NS article choices presented earlier makes it clear that the NS (as predicted by R1) is far more likely to use IA (19%) than the Chinese learners are (12%). This pattern also held for the Japanese data, in which the NS prediction of IA (18%) was much more frequent than the Japanese learners' use of IA (9%). It then follows naturally that the seeming nativelikeness of IA use by NNS was due to the high precision of IA use (Chinese: 68%, Japanese: 78%), while the overall infrequency of IA use resulted in a low recall of IA use (Chinese: 43%, Japanese: 37%). This difference between the precision score and the recall score of IA use is particularly large for Japanese. In other words, the problem lies in the fact that the marginal effect of FORM (and its interactions with the other independent variables) on the deviation score only considers the non-nativelikeness of the IA tokens that were actually used by NNS, and does not consider the non-nativelikeness due to the non-use of IA in an obligatory context (as predicted by R1). One possible fix for this problem is to define the variable FORM as what the NS prediction by R1 is, instead of what the actual NNS choice is.
By doing so, we could observe how the deviation score is differentially affected by the other variables in each of the three obligatory contexts for DA, IA, and ZA. However, this is merely an ad hoc fix for the problem described above, and it is not effective when learners overuse a certain target form (for the exact same reason for which TLU was proposed in place of SOC). This problem of defining the variable FORM is unique to a multinomial MuPDAR such as the present study.

Another aspect of this study that warrants further investigation is the calculation of the deviation score. Conceptually, it is reasonable to operationalize the deviation score as p - 0.5, where p stands for the probability of an NS not choosing the article chosen by an NNS. However, as has already been pointed out in Section 3, the deviation score defined in this way does not tell us whether the deviation is due to an overproduction or an underproduction of one level of the target structure. The deviation score for binary linguistic structures in the original MuPDAR (Gries & Deshors, 2014) contains more information, because it ranges from -0.5 to 0.5, with the absolute value and the ± sign indicating the magnitude and the direction of the deviation (underproduction vs. overproduction), respectively. Therefore, a way to operationalize the deviation score in a multinomial setting such that the direction of the deviation is also included in the score would advance the instrumental convenience of multinomial MuPDAR.

As to the validation of the assumption that R1 in fact makes a nativelike judgment on NNS data, the results remained inconclusive. This is mainly due to the small number of corrections of article errors available in the data used in this study. As has already been mentioned in Section 3, the number of occurrences of articles used for this validation was 461, of which 34 were error-corrected and 418 were error-free. Because the validation relies on the difference between R1's classification accuracy on the 461 occurrences before error correction and on the same 461 occurrences after error correction, the number of occurrences of articles that were actually error-corrected has to be large enough for the two classification accuracies to differ enough to validate the assumption. One way to address this problem would be to selectively extract essays with a large number of error corrections on articles from EFCAMDAT.

6

The present study is the first to (i) apply MuPDAR to a multinomial target structure and (ii) take a multifactorial approach to the investigation of article use by NNS. Conceptually, the results showed the relative importance of each of the relevant semantic, syntactic, and morphological factors governing the use of English articles. Methodologically, the first attempt to extend MuPDAR to a multinomial linguistic structure was not without problems, but it potentially opens up MuPDAR to a wider range of linguistic phenomena, as it is no longer restricted to those that involve binary choices.
Table A.1. List of Topics Extracted from EFCAMDAT
Level | Unit | Title | Topic
10 | 1 | Extreme activities | Helping a friend find a job
10 | 2 | Gender differences | Doing a survey about discrimination
10 | 3 | The cost of living | Requesting a bank loan
10 | 4 | Health and fitness | Applying to be a fitness trainer
10 | 5 | Lifestyles | Finding a home for a wealthy client
10 | 6 | Telling stories | Describing a terrifying experience
10 | 7 | Presenting information | Presenting trends
10 | 8 | Competition and cooperation | Giving feedback about a colleague
11 | 1 | Talking about films | Writing a movie review
11 | 2 | Fears and phobias | Helping a coworker deal with a phobia
11 | 3 | Technology | Writing an advertising blurb
11 | 4 | Beliefs and convictions | Writing up survey findings
11 | 5 | Career paths | Reviewing a self-help book
11 | 6 | Computers and the Internet | Setting rules for social networking
11 | 7 | Law and order | Dealing with a breach of contract
11 | 8 | Listening skills | Improving your study skills
12 | 1 | Manners and etiquette | Turning down an invitation
12 | 2 | Books and stories | Entering a writing competition
12 | 3 | Mysterious phenomena | Buying a painting for a friend
12 | 4 | Corporate culture | Writing a report on staff satisfaction
12 | 5 | World English | Proofreading an article
12 | 6 | Leadership qualities | Attending a leadership course
12 | 7 | Soft skills | Conducting a performance appraisal
12 | 8 | Awkward situations | Writing an apology note
13 | 1 | Politics | Writing a campaign speech
13 | 2 | Home design | Renting out a room
13 | 3 | Market research | Comparing two demographic groups
13 | 4 | Fair trade | Giving advice about budgeting
13 | 5 | Contributing to society | Writing about a disaster relief effort
13 | 6 | Art and design | Writing a brochure for a museum
13 | 7 | Mother nature | Making an educational product for kids
13 | 8 | Reaching your potential | Reaching your potential
14 | 1 | Advertising | Writing advertising copy
14 | 2 | The environment | Choosing a renewable energy source
14 | 3 | Good and bad news | Writing a rejection letter
14 | 4 | Health and well-being | Attending a seminar on stress reduction
14 | 5 | Taking a risk | Talking a friend out of a risky action
14 | 6 | Education and training | Applying for sponsorship
14 | 7 | Making a speech | Writing a wedding toast
14 | 8 | Jokes and humor | Delivering a punch line
15 | 1 | In the news | Covering a news story
15 | 2 | Communication | Hosting a group of foreign buyers
15 | 3 | The power of the mind | Writing an article about NLP techniques
15 | 4 | The entertainment industry | Making a movie
15 | 5 | E-commerce | Comparing two online retailers
15 | 6 | Urban issues | Writing an article about a superstore
15 | 7 | Quality of life | Writing about future lifestyles
15 | 8 | Meaning and symbols | Interpreting a prophecy
16 | 1 | Science and technology | Attending a robotics conference
16 | 2 | National identity | Writing about a symbol of your country
16 | 3 | Tough choices | Following a code of ethics
16 | 4 | Fame and fortune | Criticizing a celebrity
16 | 5 | Creative thinking | Using creative writing techniques
16 | 6 | Financial planning | Applying for a home loan
16 | 7 | Dealing with stress | Writing a visualization script
16 | 8 | Doing research | Researching a legendary creature

DEFINITENESS 9
Tag levels | Examples | Conflated Tag Levels
Unique_Physical_Copresence | John here is an investment banker. | Unique Hearer Old (uniq_hear_old)
Unique_Larger_Situation | In the days since Hillary Clinton unburdened herself in an | Unique Hearer Old (uniq_hear_old)
Unique_Predicative_Identity | Clark Kent is Superman. | Unique Hearer Old (uniq_hear_old)
Unique_Hearer_New | A restaurant chain named | Unique Hearer New (uniq_hear_new)
NonUnique_Physical_Copresence | The podium is too high. | Non-Unique Hearer Old (nonuni_hear_old)
NonUnique_Larger_Situation | The chair (at a conference) / today | Non-Unique Hearer Old (nonuni_hear_old)
NonUnique_Predicative_Identity | He is the manager. | Non-Unique Hearer Old (nonuni_hear_old)
NonUnique_Hearer_New_Spec | I am looking for a nurse. Her name is Sara. | Non-Unique Hearer New (nonuni_hear_new)
NonUnique_NonSpec | I am looking for a nurse [any nurse would do]. | Non-Unique Non-Specific (nonuni_nonspe)
Generic_Kind_Level | Dinosaurs are extinct. | Generic (generic)
Generic_Individual_Level | Cats have fur. | Generic (generic)
Same_Head | a true story. | Basic Anaphora (bas_anaph)
Different_Head | I adopted a cat this weekend. The animal is so cute. | Basic Anaphora (bas_anaph)
Bridging_Nominal | I looked at an apartment yesterday. The kitchen was really large. | Extended Anaphora (ext_anaph)
Bridging_Event | got married this weekend. The bride looked beautiful. | Extended Anaphora (ext_anaph)
Bridging_Restrictive_Modifier | the house next door / daughter | Extended Anaphora (ext_anaph)
Bridging_Subtype_Instance | I collect coins. I have a 1943 steel penny. | Extended Anaphora (ext_anaph)
Bridging_Other_Context | I want to focus on what many of you have said you would like me to elaborate on. What can you do about the climate crisis? | Extended Anaphora (ext_anaph)
Pleonastic | It is raining. | Miscellaneous (misc)
Quantified | All the people / no motorcade | Miscellaneous (misc)
Predicative_Equative_Role | a teacher. / This is an opportunity. | Miscellaneous (misc)
Part_Of_Noncompositional_MWE | the bucket today. | Miscellaneous (misc)
Measure_Nonreferential | Hours later / miles away | Miscellaneous (misc)
Other_Nonreferential | Global warming / concern / the topic of energy | Miscellaneous (misc)

8 The excerpts were re-organized into bullet points, and quotation marks were not used for readability.
9 The excerpts were re-organized into bullet points, and quotation marks were not used for readability.