DEFAULTS AND THE THEORY OF GRAMMAR

By

Kali Elizabeth Morris

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

Linguistics – Doctor of Philosophy

2018

ABSTRACT

DEFAULTS AND THE THEORY OF GRAMMAR

By

Kali Elizabeth Morris

This thesis is concerned with the nature of syntactic defaults and what their investigation can tell us about the theory of grammar. Since Chomsky (1995) first introduced feature-checking, we've understood the need to value features to be central to how the grammar regulates grammaticality. The failure of an uninterpretable feature to receive a value and be deleted before reaching the interface induces the derivation crash that differentiates grammatical sentences from ungrammatical ones. Defaults offer us an interesting domain of inquiry because we would expect them to be impossible to generate in this type of system; nonetheless, they surface in a number of core syntactic domains.

The existence of syntactic defaults raises three central questions: first, how is it that defaults are produced in a system where the failure to value features causes ungrammaticality? Second, how is it that the production of defaults is constrained such that whatever mechanism accounts for their production doesn't overapply to instances where they aren't licit? Finally, given that syntactic defaults appear to involve underspecification, what can an understanding of the default mechanism tell us about the role of underspecification in the syntactic domain? In this thesis, I focus on two arenas: defaults in the domains of case and ϕ-agreement.

A number of proposals have been made in recent years that address these issues, at least in part. They share a similar logic: the way to account for how defaults surface in the system is to abandon the notion that failing to value a feature is fatal to the derivation. I will argue in this thesis that by making small modifications to the generally accepted framework, we can account for the production of defaults without having to abandon that notion.

One such departure is dependent case theory – a configurational approach to case whereby case features are valued not through their relationships with case-assigning functional heads, but rather by their relative positions to other nominals. Built into this system is a default case, assigned as a last resort to nominals that have failed to receive a more specific value. While a desire to understand defaults is not what originally guided the proposal of dependent case theory, its ability to easily account for their production has certainly contributed to its widespread adoption. In the domain of ϕ-agreement, another departure called obligatory operations addresses the default issue more directly and proposes a new understanding of what drives derivations. It is not the need to value features that explains why ϕ-agreement is obligatory, but rather that the operations responsible for establishing those dependencies are obligatory themselves. By shifting the explanatory burden to the triggering of operations, rather than their outcomes, obligatory operations claims that syntactic operations can fail without inducing ungrammaticality, thus providing a solution to the default production problem. While the departures in both arenas have directly addressed the issue of how defaults are produced, neither has been particularly successful in explaining how that production is constrained.
Furthermore, in order to solve the production issue each has to abandon the central tenet of feature valuation. I argue that in light of a host of deep conceptual and empirical issues regarding these two departures, we are better served to handle the default problem by making modest modifications to the standard syntactic framework that the field has adopted since Chomsky (2000, 2001). I extend a decomposition of agree that is sensitive to inherent hierarchical relationships between features to both produce and – more importantly – constrain the distribution of syntactic defaults (Béjar, 2003). This decomposition produces three outcomes of agreement – rather than the standard two – and it is in this third outcome where we find syntactic defaults and other interesting types of underspecification and repairs. This proposal gives us an understanding of both how defaults are produced and how that production is constrained, while simultaneously allowing us to maintain standard assumptions about the role of feature valuation in regulating grammaticality. Through this system, we can also gain further insight into the nature of underspecification and its role in the syntactic component.

Copyright by
KALI ELIZABETH MORRIS
2018

To all those who have believed in me more than I've believed in myself. I hope to have made you proud.

ACKNOWLEDGEMENTS

this is why we're called advisors, not founts of information
— Alan Munn

I have dreaded writing this acknowledgments page for years because it is sure to fall short in expressing the immense gratitude I have for everyone who has helped me get to where I am today.

Alan has simply been the best advisor I could have dreamed of; he is the true embodiment of the word. He is an admirable role model and I owe him a lifetime of gratitude for how much I've grown under his guidance, both intellectually and personally. He introduced me to syntax, taught me to love theory, how to teach, how to create an argument (and how not to). He was immensely encouraging in the honest kind of way that you trusted he wasn't just pulling you along. He was unwavering in his support and I've never once doubted I was a priority or that he cared. He has been insanely generous with his time and always knows exactly what to say to calm me down, or get me moving faster on something. No matter how stressful things were, meeting with Alan instantly removed all worry. He was incredibly patient and kind while simultaneously pushing me to do better. My life has been enriched in numerous ways by knowing him and while I am excited for whatever chapter next awaits me, I am so terribly sad to give up our weekly meetings. I hope that my time learning from him is far from over. Thank you so very, very much.

I am so grateful for the amazing guidance Suzanne has given me, especially during my first few years in the department. We wrangled twitter data together, taught in the trenches of IAH together, and she was such a huge support in helping me transition to living so far from home. She always encouraged me to find a good balance between the different parts of my life and I always appreciated that. Working for and with her gave me an appreciation for sociolinguistics and rigorous quantitative work and was such a pleasure. Of course I showed that appreciation by writing an entirely theoretical syntax dissertation, but I hope she'll forgive me – haha. Thank you for everything.
I don't think Cristina realizes how many times her reassurance has pulled me out of a self-deprecating spiral. Her insights are always so perfect and exactly what you need to hear to draw the connection you were missing. I learned so much from observing how she presented arguments and how she was able to cut immediately to the big untouchable, messy issue that no one wanted to talk about. Having her support through this project has been such a source of strength and I have such fond memories of syntax reading group in the lab. Thank you, thank you.

Watching Marcin get excited about a good idea is probably one of my favorite things about the work half of my life. His passion for good argumentation, clever ideas no matter how crazy, and his ability to see straight through all the bullshit is something that I truly enjoy watching and deeply admire. He also always asks the tough questions right out of the gate, which is so helpful when you're trying to be better. He has been so fun to talk to throughout this process and his questions always get me to think about the broader – not just this tiny part of syntax – picture. Thank you for all you've done for me.

Thank you to Dr. Brown for teaching me to write, Dr. Kendall for introducing me to linguistics for the very first time, and Dr. Kallendorf for pushing me to not defer so much.

To my grad school bestie, Jessica. I don't know what I would have done without you. I am so glad that out of all the things I picked up while in grad school, a lifelong friend is one of them. Here's to many more years of friendship – and hopefully we can convince Steve to move y'all down to Texas one day ;)

My syntax big brothers: Greg, Joe, Daniel, Matt H. and Matt K. Y'all welcomed me so kindly when I was new and I feel so lucky to have had so many people looking out for me and guiding me. Plus talking about syntax is so much more fun over beers. I miss y'all so much. To Greg – I can't tell you how much I appreciate our friendship and how it has grown over the years. You took me under your wing from the very first day and many years later, you're one of my closest friends. Thank you isn't nearly enough. Now that I'm done, let's take over the south!

Hannah, Curt, Ai, and Karthik – Hannah, I don't think I've ever met such a genuinely kind person. You made my life in Michigan so much better and whether you knew it or not, you helped encourage me to be a more patient, understanding person. Thank you. Curt, thanks for putting up with my monologues for such a long time and for always being down to go work or just hang out at the bar/coffee shop when I needed company. Grad school was way more enjoyable with you there, thank you. Ai, you are that magical mix of approachable while yet so good at what you do it's intimidating. I learned so much from you about teaching and I always enjoyed every one of our long chats. Karthik, I have learned so much about so many different things from you I can't even start to list them. Those lessons have greatly adjusted my perspective on the world around me and I'm a better person for it. Thank you. You are always so much fun to be around and to talk with. I miss our weekly trivia nights and our discussions about How I Met Your Mother.

To the lovely staff at the Starbucks Reserve on Rayford and Kuykendahl.

My mom and dad are the loveliest and most supportive people on the planet. There are simply no words to express how grateful I am for everything they've done for me.
Thank you for putting such a high value on education, for sending me to the best schools, and for all the sacrifices you guys have made along the way to provide me with the best opportunities – and of course a fantastic place to write the final chapters of this dissertation. I love y'all so, so much.

Nana and Granddad – my second set of parents. Thank you for all the unconditional support through the years. I hope I've made y'all proud. Thank you for the lifetime of love, long boozy lunches, and chats on the porch. I love you both.

I am surrounded by a family full of incredibly supportive people who are always looking out for me: Pete, Devin, David, Kevin, Kathy, Katie, Mike, Tim, and Lisa. Thank you for being interested in my work, for being invested in my future, and for being so much fun to be around.

To my inner circle – Bonnie, Brook, Candice, Danny, Jerod, Katelyn, McKinney, Mike, and Sarah – y'all have kept me sane and grounded by always giving me a place to keep a foot in the "real world". And I am forever grateful for the years of amazing friendship (decades in some cases – when did we get old?). Thank you for listening to 800 elevator pitches even though no one still really has a clue what I do – your unconditional support otherwise has made a world of difference. You are the best cheerleaders a girl could hope for!

I have been incredibly lucky to marry into a wonderful family full of the most kind and supportive people. Thank you Mark, Cindy, Lisa, Dustin, Witten, Madison (yay!), Gramps, Grandmom, and Angie for being so wonderful to me over the years. I appreciate everything you've done for me. To Joe, Jane, and Brutus – thank you for bringing me such joy.

And finally, there will never be enough words to convey my appreciation for my husband Andrew. He has been a pillar of strength and support throughout the whole process: from 12 hour Skype sessions when I was homesick, to the near constant reassurance I needed towards the end, the hugs when I was having an anxious day, and all the million other small things that made such a huge difference. There is no world in which I could have imagined getting through this without you by my side. I'm just happy to have the rest of my life to express my gratitude. I promise I'll stop going to school now (maybe) – hahaha.

This dissertation is dedicated to all those who've believed in me more than I believe in myself. Thank you and I hope I've made you proud.

TABLE OF CONTENTS

LIST OF TABLES . . . xiii

KEY TO ABBREVIATIONS . . . xiv

CHAPTER 1 INTRODUCTION . . . 1
  1.1 Introducing Defaults . . . 1
    1.1.1 Some Data and What it Means . . . 2
    1.1.2 The Crux of the Default Problem . . . 7
  1.2 Domain-Specific Defaults . . . 8
    1.2.1 ϕ-agreement and Case . . . 9
    1.2.2 Defaults in the ϕ-agreement Domain . . . 13
    1.2.3 Defaults in the Case Domain . . . 15
  1.3 Jumping Ship lands us in Bizarre Boats . . . 16
    1.3.1 Dependent Case Theory . . . 17
    1.3.2 Separation of Case from Licensing . . . 18
    1.3.3 Obligatory Operations . . . 20
  1.4 Goals of this Thesis . . . 20

CHAPTER 2 DEPENDENT CASE THEORY . . . 23
  2.1 The Default Case Issue . . . 23
  2.2 Dependent Case Theory . . . 25
    2.2.1 Overview of model . . . 26
      2.2.1.1 Early Versions . . . 26
      2.2.1.2 Modern Versions . . . 32
      2.2.1.3 Interim Walkthrough . . . 46
    2.2.2 Implications . . . 50
      2.2.2.1 Abandonment of Government . . . 51
      2.2.2.2 Parameterization . . . 62
      2.2.2.3 Dependency Establishment . . . 65
  2.3 Separation of Case from Licensing . . . 66
    2.3.1 Motivations . . . 67
    2.3.2 Implications . . . 68
  2.4 Conclusions . . . 75

CHAPTER 3 OBLIGATORY OPERATIONS . . . 76
  3.1 Introduction . . . 76
  3.2 Obligatory Operations . . . 76
  3.3 An alternative . . . 90
    3.3.1 An overview of match/value . . . 91
    3.3.2 Accounting for Kichean . . . 98
  3.4 Failed Agreement isn't Always Default Agreement . . . 106
    3.4.1 Person Hierarchy Effects . . . 106
      3.4.1.1 Morphological Effects . . . 108
      3.4.1.2 Syntactic Effects . . . 122
      3.4.1.3 Probe Modification . . . 128
    3.4.2 Dative Intervention . . . 133
    3.4.3 Conjunct Agreement . . . 138
    3.4.4 Interim Summary . . . 144
  3.5 The Premature Overapplication of Defaults . . . 145
  3.6 Some Conceptual Issues . . . 150
    3.6.1 Framework-wide adoption . . . 150
    3.6.2 What are probes? . . . 154
  3.7 Conclusions . . . 156

CHAPTER 4 AGREE-BASED CASE . . . 158
  4.1 Introduction . . . 158
    4.1.1 Revisiting the Problem of Default Case . . . 158
    4.1.2 Previous Agree-based Approaches . . . 160
  4.2 A New Approach . . . 164
    4.2.1 Case Feature Systems . . . 164
      4.2.1.1 Preliminary Concerns . . . 165
      4.2.1.2 The Hierarchical Nature of Case Features . . . 171
    4.2.2 A Proposed Feature System . . . 177
      4.2.2.1 Morphological Functions . . . 177
      4.2.2.2 Syntactic Function . . . 183
      4.2.2.3 A Summary . . . 188
    4.2.3 Accounting for Default Case . . . 190
      4.2.3.1 Canonical Case Valuation . . . 193
      4.2.3.2 Quirky Case . . . 198
      4.2.3.3 Hanging Topic/Left-Dislocation . . . 200
      4.2.3.4 Coordination . . . 203
      4.2.3.5 Gapping . . . 205
      4.2.3.6 acc-ing gerunds . . . 207
      4.2.3.7 Modified Pronouns . . . 210
  4.3 Evaluating Our Options . . . 214
    4.3.1 Some Problems for Dependent Case Models . . . 215
      4.3.1.1 Sole Accusative Arguments . . . 215
      4.3.1.2 Dependent Case Theory and Default Case . . . 217
    4.3.2 Final Remarks . . . 222

CHAPTER 5 CONCLUSIONS . . . 225
  5.1 Introduction . . . 225
  5.2 The Lifeboats are Headed in the Wrong Direction . . . 226
  5.3 Putting out the Fire Allows us to Maintain the Course . . . 228
  5.4 What Have We Learned . . . 228

REFERENCES . . . 231

LIST OF TABLES

Table 3.1: Kichean Agreement Markers . . . 80
Table 3.2: match outcomes . . . 93
Table 3.3: value outcomes . . . 93
Table 3.4: match and value interactions . . . 94
Table 3.5: Georgian Agreement Morphemes . . . 109
Table 3.6: Karok Agreement Morphemes . . . 115
Table 3.7: Mordvinian Agreement Morphemes . . . 117
Table 3.8: Nishnaabemwin Agreement Morphemes . . . 132
Table 4.1: Accidental Homophony in Russian . . . 172
Table 4.2: Greek Adjective 'wise' . . . 173
Table 4.3: Finnish Syncretism Core/Non-core . . . 175
Table 4.4: Erzja Mordvin Syncretism Non-core Cases . . . 176
Table 4.5: Cross-Classification of Features . . . 178
Table 4.6: Decomposition of Cases . . . 193
KEY TO ABBREVIATIONS

1     first person
2     second person
3     third person
a     agent argument
abl   ablative
abs   absolutive
acc   accusative
adj   adjective
adv   adverb
agr   agreement
all   allative
appl  applicative
aux   auxiliary
com   comitative
comp  complementizer
dat   dative
det   determiner
erg   ergative
f     feminine
foc   focus
gen   genitive
imp   imperative
inf   infinitive
intr  intransitive
loc   locative
m     masculine
n     neuter
neg   negative
nom   nominative
obj   object
obl   oblique
pfv   perfective
poss  possessive
prf   perfect
prs   present
prog  progressive
pst   past
subj  subject
top   topic

CHAPTER 1
INTRODUCTION

1.1 Introducing Defaults

This thesis is concerned with the existence of syntactic defaults and the issues their existence raises for the set of theoretical assumptions we consider standard. Most important is the assumption that what drives derivations is the need to value any unvalued features; failure to do so causes derivation crashes when those unvalued features reach the interfaces. Defaults, at first blush, appear to constitute significant difficulties for maintaining a framework that centers on the failure-to-value assumption because it is exactly this failure that is assumed to be responsible for their production. The apparent incompatibility of default data with this framework has encouraged a number of proposals that involve quite radical departures from these traditional assumptions. In the domain of case, default data has in part triggered a move towards the separation of case from DP licensing and the adoption of an alternative model of case valuation called dependent case theory (Baker, 2015; Levin & Preminger, 2015; Marantz, 1991; McFadden, 2004). In the domain of agreement, default data has triggered the adoption of an alternative model of syntactic operations, Preminger's (2014) obligatory operations model. Each of these proposals has received much deserved praise; however, I suggest that the optimism surrounding their adoption is overstated and that we should refocus our attention towards addressing the default issues without completely recasting these basic assumptions. At its core, this thesis is a call for a more conservative approach, arguing that what we lose by 'jumping ship' doesn't outweigh what we gain by maintaining the course if we can reconcile how defaults exist in a framework that categorically appears to rule them out.

1.1.1 Some Data and What it Means

We must first make a distinction between two related questions: (i) what are defaults in an empirical sense and (ii) how do we understand those defaults to be formally represented? To address the first, consider the following data from Hindi-Urdu (Bhatt, 2005). In Hindi-Urdu, the ϕ-features of finite T must agree with the closest argument that is not morphologically case-marked. This is the pattern we see in (1a) and (1b). In (1a) the closest non-case-marked argument is the subject; agreement is successful and the subject's ϕ-features surface on the verb. In (1b) agreement with the subject is blocked by the ergative case marking on the subject and so agreement proceeds with the next closest argument, the object.
What is relevant to the default discussion is that when both arguments are morphologically case marked, and therefore both unavailable for ϕ-agreement, the derivation does not crash, but instead produces (1c), with masculine singular features appearing on the verb; Bhatt (2005), among others who work with similar data, classifies this as default agreement.

(1) a. Mona    amruud   khaa-tii   thii
       Mona.f  guava.f  eat.hab.f  be.prf.f.sg
       'Mona used to eat guava'                              subject agreement

    b. Ram-ne     imlii       khaa-yii   thii
       Ram.m.erg  tamarind.f  eat.pfv.f  be.pst.f.sg
       'Ram had eaten tamarind'                              object agreement

    c. Mona-ne     is    kitaab-ko   parh-aa        thaa
       Mona.f.erg  this  book.f.acc  read.pfv.m.sg  be.pst.m.sg
       'Mona had read this book'                             default agreement

We observe a similar phenomenon in the domain of case assignment as well. In the English sentence in (2) we see an example of what Schütze (2001) among others calls default case, where the DP surfaces with accusative case despite the absence of an accusative case assigner.

(2) What?! Him wear a tuxedo?! No way!

The data in (1c) and (2) both show the appearance of features, masculine singular and accusative respectively, despite no source for them in their respective derivations. Furthermore, the features that surface in each example are consistent within each language. In every example of default agreement in Hindi-Urdu, it is the masculine singular features that surface; in every instance of English default case, accusative case features are the ones we observe. While the set of features that surface in these examples is consistent within a language, it can vary cross-linguistically. Example (3) lists the environments where we observe default accusative case in English, while examples (4) and (5) show that these same environments produce the nominative forms in German and Spanish, respectively (Schütze, 2001).

(3) Default Case in English:
    a. Hanging Topic/Left-Dislocation
       What?! Him wear a tuxedo?!
    b. Gapping
       She will eat cake, him brownies.
    c. Coordination
       Me and him will go to the store.
    d. Modified Pronouns
       Lucky me has to clean all the toilets.

(4) Default Case in German: Hanging Topic/Left-Dislocation
    a. Der/*Dem      Hans, mit   dem      spreche  ich  nicht  mehr.
       the.nom/*dat  Hans  with  him.dat  speak    I    not    anymore
       (Schütze, 2001)

(5) Default Case in Spanish: Coordination
    a. para  tú       y    yo
       for   you.nom  and  I.nom
    b. *para  ti       y    mí
        for   you.acc  and  me.acc
    c. para  ti/*tú
       for   you.acc/*you.nom
    d. para  mí/*yo
       for   me.acc/*I.nom
       (Schütze, 2001)

The intra-linguistic consistency we see in (3) coupled with the cross-linguistic variation shown in (4) and (5) tells us that the task at hand is not to account for unexpected accusative valuation. Rather, we need to explain how accusative forms appear in English while nominative forms appear in German and Spanish, despite appearing in the same positions and despite the lack of either an accusative or nominative source of features. This data therefore suggests that the grammar has a robust and powerful default mechanism that supplies default features in certain instances where the derivation fails to do so for one reason or another (Legate, 2008; McFadden, 2004; Schütze, 2001). One broad goal of this thesis is to better understand this powerful mechanism and how it is integrated into our theoretical framework across a number of different domains. With some clarification of how we identify defaults empirically, we are in a position to address the second question – how we understand defaults to be formally represented.
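Before turning to the formal question, it may help to state the empirical generalization behind (1) in procedural terms. The sketch below is purely expository and not part of any analysis defended here; the Nominal class and the DEFAULT_PHI value are my own illustrative assumptions about how the data could be encoded.

```python
# Illustrative sketch of the descriptive generalization behind (1): finite T
# agrees with the closest argument that bears no overt case marker; if every
# argument is overtly case-marked, default masculine singular features surface.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Nominal:
    name: str
    phi: str                            # e.g. "f.sg", "m.sg"
    overt_case: Optional[str] = None    # e.g. "erg", "acc"; None = unmarked

DEFAULT_PHI = "m.sg"                    # the bundle Hindi-Urdu spells out by default

def t_agreement(arguments: List[Nominal]) -> str:
    """Return the phi-features spelled out on finite T.

    `arguments` is ordered by closeness to T (subject before object).
    """
    for arg in arguments:
        if arg.overt_case is None:      # only caseless arguments are accessible
            return arg.phi
    return DEFAULT_PHI                  # no accessible controller: default agreement

# (1a): caseless subject controls agreement            -> f.sg
print(t_agreement([Nominal("Mona", "f.sg"), Nominal("amruud", "f.sg")]))
# (1b): ergative subject skipped; caseless object wins -> f.sg
print(t_agreement([Nominal("Ram", "m.sg", "erg"), Nominal("imlii", "f.sg")]))
# (1c): both arguments case-marked; default surfaces   -> m.sg
print(t_agreement([Nominal("Mona", "f.sg", "erg"), Nominal("kitaab", "f.sg", "acc")]))
```

The point of the toy is simply that the data call for a fallback step: when every potential controller is inaccessible, a fixed feature bundle surfaces rather than ungrammaticality.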
Of course the way we understand the data in (1c) and (2) and, by extension, how we understand defaults formally, depends in large part on the set of theoretical assumptions we choose to hold. I'll begin with a broad overview, but will clarify the details further for specific domains in section 1.2. Adopting the general framework set up in Chomsky (1995, 2000, 2001), we assume that the syntactic system is a feature-based derivational one and that features are what drive the two primary syntactic operations: merge and agree. Each terminal syntactic object is a bundle of morphosyntactic features. These features come in two flavors: valued features and unvalued ones. Valued features are, unsurprisingly, features for which a value is inherently specified. Unvalued features by contrast are features whose value is not inherently specified and therefore must come from somewhere else in the derivation, usually through establishing a relationship with a valued-feature-bearing object via agree. Unvalued features cannot survive once the derivation is sent to spell out and in this way, it is through the valuation of features that the requirements of various syntactic pieces are satisfied. A successful derivation is one in which all unvalued features have received a value before the derivation is sent in phases to the interfaces at spell out. Successful phases are sent to the morpho-phonological component where vocabulary items are then inserted into each syntactic node (Halle & Marantz, 1993). This insertion is governed by a subset principle which essentially requires that the features on an eligible vocabulary item must constitute a subset of the features specified on the node into which they are to be inserted. Of the eligible vocabulary items, the one that is most specified will be inserted.

This system also makes use of underspecification, both within the syntactic nodes themselves and in the vocabulary items the morphological component makes available for insertion. We are probably most familiar with underspecified vocabulary items, called elsewhere forms, which are essentially just vocabulary items with a completely reduced set of morpho-phonological features. To illustrate, we can imagine that the vocabulary items in (6) are available for the present tense of the verb to be in English to account for the paradigm in (7). The elsewhere form in this example – are – is inserted into every syntactic node that is unable to insert either one of the more specified vocabulary items am and is because its feature set does not constitute a superset of those respective vocabulary items.

(6) [+singular, +author]       → am
    [+singular, −participant]  → is
    elsewhere                  → are

(7) I am          we are
    you are       y'all are
    he is         they are

Elsewhere forms are distinct from the type of underspecification we observe within the syntactic nodes themselves, despite both making use of underspecification. The syntactic type of underspecification is typically due to one of the following: either a syntactic node is generated underspecified, it winds up agreeing with something that was generated underspecified, or a node's specification was modified in some way to become underspecified under a defined set of circumstances. A common way for this modification to happen is through an operation like Impoverishment (Halle & Marantz, 1993), which I'll illustrate here. Take the feature specifications for the English pronoun paradigm in (8).
The unvalued ϕ-features on verb nodes in English will be valued with the set of ϕ-features that corresponds with the pronoun that controls the agreement.

(8) I          [+singular, +participant, +author]
    you        [+singular, +participant, −author]
    he/she/it  [+singular, −participant, −author]
    we         [−singular, +participant, +author]
    y'all      [−singular, +participant, −author]
    they       [−singular, −participant, −author]

After agreement, the verb node itself now has a featural specification identical to one of the items in (8) and the grammar will insert a corresponding vocabulary item, obeying the subset principle. For the past tense form of the verb to be, imagine English has the two vocabulary items in (9):

(9) [+singular]  → was
    elsewhere    → were

The grammar would therefore insert was for every syntactic node that contained a [+singular] feature and insert were for every syntactic node that did not contain that feature. Note however that this system makes the wrong prediction for verbal nodes that agree with the second person singular pronoun you. Since the featural specification for the syntactic node you includes a [+singular] feature, the grammar should be directed to insert the most specified eligible vocabulary item, which in this case is [+singular] → was, not the were that we expect. However, imagine English has access to an additional rule that manipulates the featural specification of verbal nodes that agree with the second person singular pronoun, shown in (10).

(10) [+singular] → [ø] / [+participant, −author]

What this rule says is that in the presence of both [+participant] and [−author] features – the specification of a second person node – delete the [+singular] feature. Deleting this feature removes the [+singular] → was vocabulary item from the set of eligible competitors because it no longer contains a subset of the features on the modified verbal node. This deletion process is called Impoverishment and through this operation syntactic nodes become underspecified, despite originally having more featural information. Impoverishment is one way in which we can observe underspecification of the syntactic node itself, rather than the more straightforward underspecification of the vocabulary items. Because vocabulary insertion is governed by the subset principle, it's often the case that underspecified syntactic nodes are inserted with underspecified vocabulary items and in this way, they often co-occur. It's crucial though to understand going forward that the two are distinct: they occupy different components of the grammar and as we'll later see, this has consequences for how we're comfortable modeling them.

Defaults are quite intuitively similar to the underspecified elsewhere forms. Defaults and these forms both share a last-resort quality whereby they each only appear in the absence of something more specific. However, in order for default vocabulary items to ever "win" the insertion competition, the syntactic nodes into which they are inserted must also be underspecified. We can now summarize how we intend to define defaults formally through the theoretical lens we've just established. We can view default agreement as the failure of the verb to receive enough ϕ-features through agreement to be spelled out with a more specified vocabulary item. Likewise, we can understand default case as the failure of the DP to receive enough case feature information through the mechanism responsible for case valuation to dictate which morphological form to take.
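To see how the subset principle and Impoverishment interact to yield you were, here is a minimal sketch of the insertion competition just described. It is a toy under my own assumptions – feature bundles are encoded as frozensets and the vocabulary items and rule mirror (8)-(10) – and nothing about the implementation is being claimed as part of the theory itself.

```python
# Toy vocabulary insertion under the subset principle, plus the Impoverishment
# rule in (10). Feature bundles are modelled as frozensets of signed features.
VOCABULARY = [
    (frozenset({"+singular"}), "was"),   # [+singular] -> was
    (frozenset(),              "were"),  # elsewhere   -> were
]

def impoverish(node):
    """(10): delete [+singular] in the context [+participant, -author] (2nd person)."""
    if {"+participant", "-author"} <= node:
        return node - {"+singular"}
    return node

def insert(node):
    """Subset principle: an item is eligible if its features are a subset of the
    node's features; the most highly specified eligible item is inserted."""
    node = impoverish(node)
    eligible = [(feats, form) for feats, form in VOCABULARY if feats <= node]
    _, form = max(eligible, key=lambda pair: len(pair[0]))
    return form

PRONOUN_PHI = {                          # part of the paradigm in (8)
    "I":    frozenset({"+singular", "+participant", "+author"}),
    "you":  frozenset({"+singular", "+participant", "-author"}),
    "he":   frozenset({"+singular", "-participant", "-author"}),
    "we":   frozenset({"-singular", "+participant", "+author"}),
    "they": frozenset({"-singular", "-participant", "-author"}),
}

for pronoun, phi in PRONOUN_PHI.items():
    print(pronoun, insert(phi))          # I was, you were, he was, we were, they were
```

Running the toy inserts was for I and he but were for you, because Impoverishment removes the [+singular] feature from the second person node before the competition is evaluated.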
1.1.2 The Crux of the Default Problem

The fact that underspecification can occupy distinct components of the grammar also means that the range and nature of the problems defaults can pose is quite different. Defaults on the morphological level are generally unproblematic. Given that one of the primary functions of this component is to deliver pronounceable strings of language, it's fairly intuitive to assume that there are default forms available that can be inserted when the instructions for pronunciation are unspecified. As an interface component it is also reasonable to assume that communication between the two components it connects is imperfect and defaults can serve to bridge that gap. Defaults in the syntactic domain, however, are far more problematic given the theoretical framework we've established. The crux of the default problem is this: if defaults involve the kind of underspecification that results from the failure to value some set of features, how is it that the grammar allows this, given that one of its central tenets is the ungrammaticality that results from unvalued features surviving to the interfaces? Additionally, once we have an answer, how do we ensure that whatever mechanism allows for the production of defaults does not apply in instances where we'd predict ungrammaticality? In other words, once we allow defaults to 'save' derivations, how do we prevent them from saving the wrong ones? Because defaults call into question whether or not the grammar can tolerate the failure to value features, their proper analysis has deep theoretical implications. We can understand these issues as guiding three research questions to keep in mind as we explore the role of defaults in various syntactic domains.

(i) How does the grammar produce defaults?

(ii) How does the grammar constrain the production of defaults?

(iii) What can an understanding of syntactic defaults tell us about how the syntax can encode underspecification?

Syntactic defaults exist across a number of different syntactic domains. This thesis will focus on their existence in two: syntactic defaults in the domain of ϕ-agreement and syntactic defaults in the domain of case. The next sections will outline in a bit more detail some of the domain-specific assumptions we consider standard and how the default problem is explicitly expressed in each domain.

1.2 Domain-Specific Defaults

In this section, I'd like to provide a more thorough discussion of the default problem in the context of the two syntactic domains that will be the focus of this thesis. This involves again outlining what we consider standard assumptions, but this time with more of the domain-relevant details. We'll also address how the default issue specifically arises in both the ϕ-agreement and case domains, when considering those assumptions. Because the goal is to identify which assumptions are considered fairly standard, this discussion is sure to gloss over many important details and popular disagreements. What remains is a stripped down version of the relevant theories, but one that allows us to clearly see the problem at hand.

1.2.1 ϕ-agreement and Case

We begin with ϕ-agreement and the operation agree. Chomsky (2000, 2001) proposes that the way to distinguish between the two types of syntactic features is to ground the difference in whether a feature is interpretable by the semantic interface.1 There are some features that the interface can interpret (interpretable features) and others that the interface cannot (uninterpretable features).
Because the semantic interface is unable to 'read' those uninterpretable features, Chomsky argues that in order for a structure to be grammatical, these uninterpretable features must be removed before that structure reaches the semantic interface. agree is the operation responsible for ensuring this removal. The basic logic is that if a syntactic object that bears an uninterpretable feature can find an interpretable instance of the same feature category on another syntactic object, it can establish a relationship that will license the uninterpretable feature's removal. The assumption is that it is only through establishing these kinds of relationships that uninterpretable features can be removed. Chomsky assumes a fairly limited set of these uninterpretable features, largely because by definition their existence is an imperfection of the system, violating the Interpretability Condition which assumes all features are properties of sound and meaning and are thus interpretable by the two interfaces.

1 Note that this is a slightly different distinction than the one made in the previous section. We'll clarify the effect that these different distinctions have on the framework in chapter 4. Also note that interpretability at the semantic interface does not mean that interpretable features have semantic content. The classic case is gender features. Grammatical gender is not at all semantic, but can exist as interpretable features on nominals.

The ϕ-agreement and case domains are two domains in which these uninterpretable features have an especially significant role. In the ϕ-agreement domain, there exist interpretable ϕ-features on nominals and uninterpretable ϕ-features on some of the core functional categories, like C, T and v. To remove the uninterpretable instances of ϕ-features, the grammar must establish a feature-removing relationship between the relevant functional heads and any nominals that bear the interpretable ϕ-feature counterparts. Chomsky defines the uninterpretable ϕ-features on functional heads as probes that search for nominal goals with which to agree. A probe can search for a matching interpretable feature category within its c-command domain (11) and if it successfully finds one, the feature-removing relationship is established, the value of the interpretable feature is transferred to the uninterpretable-feature-bearing syntactic object, and the uninterpretable feature itself is removed (12).

(11) [TP ... [T′ T[uϕ] [vP DP[iϕ] [v′ v [VP V DP ]]]]]        (T's uϕ probes its c-command domain and finds DP[iϕ])

(12) [TP ... [T′ T[uϕ] [vP DP[iϕ] [v′ v [VP V DP ]]]]]        (agree: uϕ on T is valued by the DP's iϕ and deleted)

It's also fairly standard to assume some version of relativized minimality (Rizzi, 1990) which further refines the degree of specificity to which probes are sensitive, essentially allowing probes to be more specific in what they consider a match. For example, if a probe is relativized to search for a [participant] feature, only nominals that encode interpretable first or second person are capable of establishing the relationship that would remove the uninterpretable ϕ-feature from the probe. This also means that probes can 'skip over' nominals in their search domain that do not bear these features – like the external argument in (13) – in favor of lower nominals that do.
(13) [TP ... [T′ T[uParticipant] [vP DP[iϕ] [v′ v [VP V DP[iParticipant] ]]]]]
     (T skips the ϕ-bearing external argument and agrees with the lower [participant]-bearing DP)

Identifying a standard set of assumptions in the case domain is arguably a bit more difficult, as the assumptions outlined in Chomsky (2000, 2001) aren't as widely adopted as those for ϕ-agreement. Chomsky argues that case features, unlike ϕ-features, are uninterpretable both on functional heads and nominals. It therefore follows that it is impossible for case features to find a matching interpretable instance – since none exist – and thus case features are not considered probes. Because they cannot probe and subsequently agree on their own, they can only be deleted if they exist on syntactic objects that have participated in another agreement relation, namely ϕ-agreement. In this way Chomsky explicitly connects up the domain of case and the domain of agreement by framing case assignment as the reflex of a successful ϕ-agreement relation. If an uninterpretable ϕ-probe finds a matching interpretable ϕ-feature on a nominal, a relationship will be established and agree will delete not only the uninterpretable ϕ-feature on the functional head, but also any uninterpretable case features that exist on both the functional head and the nominal. If a nominal agrees with the functional head T, the grammar will spell out nominative features; if a nominal agrees with the functional head v, the grammar will spell out accusative features (14). It's important to note that these operations are assumed to be syntactic and thus may or may not be morphologically marked in a particular language. For example, if a language does not overtly mark object agreement, this model would still assume that abstract object agreement obtains, but is simply not morphologically expressed.

(14) [TP DP1[iϕ, uCase] [T′ T[uϕ, uCase] [vP DP1 [v′ v[uϕ, uCase] [VP V DP[iϕ, uCase] ]]]]]
     (T agrees with DP1 and nominative is spelled out; v agrees with the object DP and accusative is spelled out)

The details outlined above aren't universally adopted, but the inability of nominals to surface without having received case is a fairly uncontroversial standard assumption. In this way the Case Filter (Chomsky, 1981; Vergnaud, 2008), which ruled out nominals that failed to receive case, is maintained, although the violation is no longer restricted just to case features, but rather extends to all uninterpretable features. Likewise, it is fairly standard to assume that functional heads are the things responsible for assigning the various cases and the set of functional heads that serve as case assigners is largely agreed upon: finite T is responsible for assigning nominative case, v is responsible for assigning accusative case, etc. Also fairly uncontroversial is the assumption that case assignment – at least in part – occurs syntactically, even for languages that do not morphologically differentiate the different case categories. Both case and ϕ-agreement have a morphological function beyond their respective syntactic ones, dictating the morphological forms of nominals in the domain of case and agreement morphology on functional heads in the domain of ϕ-agreement. Another set of assumptions surrounds the relationship between the functions in these independent components. Generally, morphological case and agreement are assumed to be a reflection of syntactic abstract case and ϕ-features. Languages can vary in the degree to which they overtly reflect these abstract syntactic relationships, but they are assumed to be a universal phenomenon.
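The probing logic in (11)-(13) can also be stated procedurally. The following is only an expository sketch: the Goal class and the flat list standing in for a c-command domain are simplifying assumptions of mine, not a claim about how agree is actually implemented in any of the frameworks discussed here.

```python
# Schematic relativized probing: a probe searches the nominals in its c-command
# domain, closest first, for the feature it is relativized to, skipping goals
# that lack it. A failed search returns None rather than raising an error.
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class Goal:
    label: str
    features: Set[str]      # interpretable features on the nominal

def probe(relativized_to: str, domain: List[Goal]) -> Optional[Goal]:
    """Return the closest goal bearing the feature the probe is relativized to."""
    for goal in domain:
        if relativized_to in goal.features:
            return goal     # agree: value transfer and deletion of uF happen here
    return None             # no matching goal: the situation behind defaults

# As in (13): a [participant]-relativized probe skips a 3rd person external
# argument in favor of a lower 2nd person object.
external = Goal("DP-ext", {"phi", "3"})
internal = Goal("DP-obj", {"phi", "participant", "2"})
print(probe("participant", [external, internal]).label)   # DP-obj

# When no goal bears the relevant feature, the search simply comes up empty.
print(probe("participant", [Goal("DP1", {"phi", "3"}),
                            Goal("DP2", {"phi", "3"})]))   # None
```

The interesting question for what follows is precisely what the grammar does with that empty-handed outcome.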
While many have modified the details of these systems to account for various things in different ways, the basic scaffold of the system remains widely adopted. Uninterpretable features must be removed in order for the derivation to be interpreted by the interfaces, and both case assignment and ϕ-agreement are largely the result of a similar set of operations. Later in this thesis I will follow suit and likewise propose some modifications to how the system accounts for ϕ-agreement and case, but will remain as committed to the crucial tenets of the system as the nature of the default problem allows.

1.2.2 Defaults in the ϕ-agreement Domain

As we saw in the previous section, the framework established in Chomsky (2000, 2001) requires that uninterpretable features must be removed from probes via agree if they are to produce a grammatical sentence. What this should disallow therefore is the failure of a probe to find an agreeing goal with which to establish this relationship. Default agreement appears to pose some problems for these assumptions if we assume defaults are the result of a probe's uninterpretable feature failing to establish an agreement relation and the subsequent removal of that feature before spell out. The apparently tolerated failure of ϕ-feature agreement is a widespread phenomenon. We've already seen data from Hindi-Urdu, repeated below in (15), that illustrates this tolerated failure. The uninterpretable ϕ-features on the finite T probe search in their c-command domain for an interpretable ϕ-feature-bearing goal with which to agree. Being overtly case-marked in Hindi-Urdu independently prevents certain nominals from being eligible goals for the probe. This means that in (15c) the finite T cannot agree with the subject, nor can it agree with the object, which results in the failure of the grammar to remove the uninterpretable features on the finite T probe. This should cause the derivation to crash, but as we have seen, the sentence is perfectly acceptable.

(15) a. Mona    amruud   khaa-tii   thii
        Mona.f  guava.f  eat.hab.f  be.prf.f.sg
        'Mona used to eat guava'                             subject agreement

     b. Ram-ne     imlii       khaa-yii   thii
        Ram.m.erg  tamarind.f  eat.pfv.f  be.pst.f.sg
        'Ram had eaten tamarind'                             object agreement

     c. Mona-ne     is    kitaab-ko   parh-aa        thaa
        Mona.f.erg  this  book.f.acc  read.pfv.m.sg  be.pst.m.sg
        'Mona had read this book'                            default agreement

We see a similar pattern in Kichean, a member of the Mayan language family. Preminger (2014) shows evidence that probes in Kichean are relativized to search for [participant]-bearing arguments. This is seen in (16) where first and second person arguments control agreement over third person arguments, regardless of their syntactic position. This preference is explained if one assumes that finite T bears an uninterpretable [participant] feature that can only be removed through agreeing with an interpretable [participant]-bearing nominal. Essentially, this increased specificity allows the probe to ignore nominals in its search domain that do not bear this feature, namely third person nominals.

(16) a. ja   rat      x-at/*-ø-ax-an                  ri   achin
        foc  you(sg)  com-2sg.abs/*3sg.abs-hear-AF    the  man
        'it was you that heard the man'

     b. ja   ri   achin  x-at/*ø-ax-an                 rat
        foc  the  man    com-2sg.abs/*3sg.abs-hear-AF  you(sg)
        'it was the man that heard you(sg)'

Kichean exhibits behavior similar to what we've seen for Hindi-Urdu when both arguments are third person (17) and therefore lack an interpretable [participant] feature. Since neither argument bears the interpretable version of what the probe is searching for, neither argument is available to delete the uninterpretable [participant] feature on finite T. Once again, we'd expect this sort of data to cause a derivation crash since the derivation appears to involve the survival of an uninterpretable feature, but yet again the sentence is perfectly acceptable.

(17) a. ja   ri   tz'i'  x-ø-etzel-an          ri   sian
        foc  the  dog    com-3sg.abs-hate-AF   the  cat
        'it was the dog that hated the cat.'
Since neither argument bears the interpretable version of what the probe is searching for, neither argument is available to delete the uninterpretable [participant] feature on finite T. Once again, we’d expect this sort of data to cause a derivation crash since the derivation appears to involve the survival of an uninterpretable feature, but yet again the sentence is perfectly acceptable. (17) a. ri the tz’i’ dog x-ø-etzel-an ja com-3sg.abs-hate-AF foc ‘it was the dog that hated the cat.’ ri the sian cat 14 b. ri the xoq woman x-ø-tz’et-o ja com-3sg.abs-see-AF foc ‘it was the woman who saw the man.’ ri the achin man As I mentioned in the last section, this is the primary issue that defaults raise: how is it that these derivations produce acceptable sentences when they appear to involve the failure to remove a crash-inducing uninterpretable feature? 1.2.3 Defaults in the Case Domain Likewise, in the domain of case, what rules out a sentence like (18) is that the DP her is unable to value its unvalued case feature because non-finite T does not have case features to assign and there is no other source of case available. (18) *It is likely her to leave the party early. In addition to the data we saw in section 1.1.1, examples (19)-(22) show that default case is a widespread phenomenon. Languages can differ in which case they select as the default; in most languages nominative case is the default case, however, English – along with Danish, Norwegian, and Irish – do appear to be unique in that they use accusative case, rather than nominative case to mark these default case environments (see Schütze, 2001, for a cross-linguistic survey). (19) Default Nominative Case in German: Der/*Dem the.nom/*dat Hans, mit with dem him.dat spreche speak ich I nicht not mehr. anymore. (20) Default Nominative Case in Greek: O the.nom ‘The strange person, we didn’t see him.’ paraksenos strange.nom anthropos, person.nom dhen not ton him.acc idhame saw 15 (21) Default Accusative Case in Danish Hende her.acc ‘She/Her with the blue eyes is a Swede.’ svensker a.Swede blå blue med with de the øjne eyes er is (22) Default Accusative Case in Irish é Rinne him.acc did ‘Owen himself did it.’ Eoghan Owen féin emph é it (Schütze, 2001) As was true in the ϕ-agreement domain, the failure of the derivation to value the case features on nominals should predict that the derivations produce ungrammatical structures. Instead, what we see are perfectly acceptable sentences. Similar questions are raised: how does the grammar produce defaults despite this failure, and how does the grammar constrain the production of those defaults so that they don’t erroneously appear in places like (18) where the failure to value case does induce ungrammaticality? 1.3 Jumping Ship lands us in Bizarre Boats The deep nature of this default problem has in part prompted researchers to propose a number of solutions that require quite radical departures from the standard set of theoretical assumptions. At their core, these departures share the same logic: if we remove the theoretical power of failed feature valuation, we remove at least the primary issue that defaults raise for the grammar – how defaults are produced. To address issues of grammatical failed ϕ-agreement, Preminger proposes an entirely new model for derivations: his (2014) obligatory operations model. Under the assumptions he proposes, the failure to value features does not trigger ungrammaticality at all; rather it is the failure to trigger a set of obligatory operations that is responsible. 
In addressing default case, among other issues surrounding the assignment of morphological case features, researchers have found promise both in adopting an alternative model of case valuation called dependent case theory and in abandoning the long-held view that case features play a role in regulating nominal distribution. It is important to say that while both types of proposals do constitute radical departures from the traditional set of theoretical assumptions, neither could be considered a fringe proposal, as both have seen recent mainstream adoption, especially the dependent case model. The specifics of these departures will be addressed in greater detail in future chapters, but I'd like to provide a quick preview of the approaches here.

1.3.1 Dependent Case Theory

Dependent case theory is an alternative model of case valuation that assigns case configurationally by examining not the relationships between functional heads and nominals, but rather the relationships between the nominals themselves. Case is assigned to a nominal if that nominal exists in a certain configuration with respect to other nominals in the same domain. A simplified example is shown below in (23). The algorithm in (23) says that accusative case features are assigned to a nominal only if that nominal is c-commanded by another nominal in the same TP spell out domain. DP2 in (24) therefore would be assigned accusative case features because it is c-commanded by another nominal, DP1, that shares its TP spell out domain. Nominative case features are instead assigned by default2 to nominals that do not exist in that configuration.

(23) Dependent Case
     If there are two distinct NPs in the same spell-out domain such that NP1 c-commands NP2, then value the case feature of NP2 as acc unless NP1 has already been marked for case.

2 Dependent case theory does outline a distinction between unmarked case and default case, the details of which we'll explore in chapter 2.
This requires suspending the long-held assumption that receiving case is a grammatical requirement for 18 nominal licensing and thus requires alternative explanation for the varied set of data shown below in (25). The need for case was what primarily drove movement (25a), what prevented superfluous movement (25b), what explained the inability of non-finite clauses to host overt subjects (25c), and what explained the distribution and form of nominals in passives (25d) and unaccusatives (25e), among other things.3 (25) Johni is likely ti to win the race. a. b. *Johni is likely that ti will win the race. c. *It is likely him to win the race. d. e. Johni was invited ti. Johni arrived ti. With the adoption of the EPP feature, movement of the theme argument in passives (25d) and unaccusatives (25e) to the subject position no longer needed to be tied to the need for case on the theme argument. Likewise, the adoption of phase heads and spell out domains further reduced the role of case in regulating superraising (25a)-(25b). While we have theoretical tools to provide alternative explanations for some of the data in (25), data like that in (26) is arguably much more difficult to explain without reference to the failure to receive case. (26) a. *John hoped him to win the lottery. b. *It is likely her to leave the party early. We’ll see in chapter 2 that those who wish to eliminate case’s role in regulating nominal licensing have proposed an extension of the Empty Category Principle – the idea that overt complementizers cannot precede empty categories – to account for the type of data in (26). Their view is essentially that this data is the last frontier for the classical case theory and that if we can propose a reasonable alternative, we can make an argument that the assumption that case must be valued can be eliminated. 3This is not intended to be an exhaustive list, just a summary of some of the big facts. 19 1.3.3 Obligatory Operations Preminger (2014) uses a similar logic with respect to modeling what look like grammatical agree- ment failures. He takes the position that what default ϕ-agreement data shows us is that the grammar does tolerate the failure to receive a feature value and therefore we must recast our assumptions about what enforces grammatical requirements. He proposes that what is required of a derivation is not the successful removal of uninterpretable features, but instead that all operations that are obligatory must be initiated. In simple terms, it’s not the outcomes that the grammar cares about, but rather than all obligatory processes have been attempted. To replace agree, he proposes find (27) which requires that a probe search for an accessible goal with which to agree. However, its failure is completely tolerated so long as it was initiated in every context it could. When it does fail, the grammar is able to insert default features on the relevant functional heads. (27) find(f ) Given an unvalued feature f on a head H0, look for an XP bearing a valued instance of f and assign that value to H0. What all of these theoretical departures share is the claim that the failure to value and subsequently remove uninterpretable features is not a crash-inducing circumstance. Of course, since the assump- tion that failure-to-value is fatal has been one of the central tenets of frameworks standardly adopted in the Minimalist era, abandoning it constitutes a dramatic departure which will predictably have great effect across a number of syntactic domains. 
1.4 Goals of this Thesis

This thesis will broadly argue that by jumping ship, so to speak, we've landed in some problematic lifeboats that are headed in the wrong direction. If we can instead put out the fire that caused us to jump ship in the first place, we can maintain the course. There are two main claims that I will advance:

(i) There are some serious conceptual and empirical problems that arise from dependent case theory and obligatory operations that warrant their rejection.
(ii) There is a solution available that allows us to maintain the basic scaffold of the standard framework and solves both the production and the constraint problems that defaults introduce.

Most of the discussion surrounding claim (i) will focus on conceptual arguments against these proposals and will also center on the claim that while these departures provide a mechanism for the production of defaults, the mechanisms available don't constrain that production very well. The discussion surrounding claim (ii) will involve proposing a solution to the default problem that fits within the basic tenets of the standard framework and advancing the argument that its adoption should be favored over the departures.

Chapter 2 and chapter 3 will advance the first main claim of this dissertation: that the radical departures that the existence of defaults has triggered are more problematic than they first appear. The goal here is to argue that there is reason to revisit the standard approach, despite recent calls for its abandonment. Chapter 2 will outline the radical departures involving the role of case in the regulation of DP licensing and the mechanisms that the grammar has available to assign it. I will show that while the dependent case theory of morphological case assignment covers a wide range of empirical data, it suffers from serious conceptual issues that make its adoption unattractive. First, I'll show how modern versions of the model induce an Inclusiveness Condition violation with respect to the assignment of case feature values. This violation is serious not because of strict obeisance to Minimalism, but rather because it makes our understanding of which syntactic objects actually house case features unclear. I'll also argue that under the dependent case model, case does not reflect a consistent relationship between syntactic objects, undermining its classification as a system. Here, I also discuss issues that the model raises for the structure of case features themselves and how we want to understand the limits of parameterization. I then present some empirical facts that are also difficult to model under the dependent case approach (although the bulk of that discussion happens in chapter 4). The chapter concludes with an argument that the explanations that are proposed to pivot away from assuming a licensing role for case are insufficiently fleshed out, leaving room for a proposal that models default case within the boundaries of a system that treats unvalued features as fatal to derivations.

Chapter 3 will take a similar approach, instead investigating the notion of obligatory operations, as proposed in Preminger (2014). As was true of chapter 2, the goal of this chapter will be to argue that there are some serious issues – both empirical and conceptual – with the adoption of obligatory operations, enough that modifications to the standard system are warranted.
I show that the obligatory operations approach to ϕ-agreement is not necessitated by default agreement data by outlining an alternative proposal made by Béjar (2003) that operates within a more standard framework. I then present arguments that show that when we examine data more complicated than what’s used to motivate the obligatory operations model, we are unable to fully account for the varied outcomes of failed agreement by simply allowing operations to fail. A major claim of this chapter is that agreement does not have a binary set of outcomes and thus needs to be accounted for with a model that can capture this fact. The obligatory operations approach is unable to do so, once again making room for a proposal of defaults that is more in-line with standard assumptions. Chapter 4 will advance the second main claim of this dissertation: that in light of these issues, we should instead pursue a more modest approach, one that largely maintains the standard set of theoretical assumptions surrounding case and agreement that have provided many insights over many decades of research in these areas. This proposal will focus on accounting for the production of default case, how we constrain that production, and will provide clarification on some of the issues surrounding how case features are modeled. I argue for a novel understanding of case features and show how an approach similar to Béjar (2003) can operate over these features to produce the three-way set of outcomes observed in the data: canonical case, default case, and ungrammaticality. Chapter 4 will also show that this solution should be preferred over the radical departures discussed in chapters 2 and 3. The main claim is that the theoretical concessions that dependent case theory, separation of case from licensing, and obligatory operations require us to make are severe enough to warrant their rejection, despite the empirical coverage benefits they offer. Chapter 5 will conclude. 22 CHAPTER 2 DEPENDENT CASE THEORY This chapter explores the theoretical implications that follow if one adopts either the separation of case from licensing or the dependent case model of case valuation. The hope is to provide arguments that validate a reinvestigation of the more standard approaches that these radical departures reject, saving proposals of an alternative for chapter 4. While these next two chapters may seem trivially negative, it is important to motivate that there is reason to return to more standard approaches, especially since these radical departures are largely motivated on claims that the more standard approaches aren’t tenable. We begin with a discussion of the first set of departures intended to address the problematic issues that are illustrated by default case. Recall that the crux of the issue is that a system that enforces grammatical requirements in part through the valuation of case features should be unable to handle grammatical instances where nominals survive with their case features unvalued. This type of data is in part addressed by the adoption of the dependent case model of case valuation, the topic of section 2.2, and the separation of case from licensing, which will be the focus of section 2.3. The conclusion reached in this chapter is that these departures require adopting theoretical systems that are further from Minimalist ideals than their more standard counterparts and thus validate an attempt to modify the more standard approaches in ways that address the problems that defaults introduce. 
2.1 The Default Case Issue

Case has had a central role in standard syntactic frameworks since Vergnaud's famous letter to Chomsky in 1977, suggesting that we can use case to provide an explanation for nominal distribution (Vergnaud, 2008). This revolutionary idea birthed the Case Filter, shown in (1), which stated that the only licit NPs were ones that received case from somewhere else in the structure (Chomsky, 1981).

(1) Case Filter: *NP that does not have Case

Case is assumed to have two primary functions: (i) it regulates nominal distribution via licensing and (ii) it provides the morphological marking of NPs. These two functions, called Abstract Case[1] and morphological case respectively, are standardly assumed to be related; morphological case is the physical realization of Abstract Case features. Languages vary in how richly they express this relationship, with some languages having rich morphological case systems and others failing to make any case distinctions at all. One of the most revolutionary aspects of Vergnaud's original proposal was that he argued that Abstract Case was a part of UG; that even languages without a rich morphological case system still obeyed the Case Filter, requiring that all nominals must be licensed by receiving Abstract Case.

In modern versions of case theory, case is modeled as the reflex of successful ϕ-agreement (Chomsky, 2000, 2001). When a nominal values the uninterpretable ϕ-features on a particular functional head like finite T, it has its own uninterpretable case features valued as a result. The functional head with which the nominal agrees determines which morphological case category the nominal will express – nominative if agreement is with a finite T, accusative if agreement is with a v. However, if a nominal instead exists in a position where it is unable to establish a successful ϕ-agreement relationship with a full set of ϕ-features, case assignment – as a reflex of this relationship – will also fail. The result of that failure is ungrammaticality unless that nominal is able to establish an alternative ϕ-agreement relationship, perhaps with an ECM embedding verb, as shown in (2a). What rules out a sentence like (2b), therefore, is the failure of the DP her to have received case, since it can't receive case from non-finite T and there is no other source of case available. In this way, Case has been crucial for understanding how derivations deliver grammatical sentences.

[1] A quick note on terminological conventions: the standard convention for distinguishing grammatical licensing from morphological case is to use capital "C" Case to refer to the former and lowercase "c" case to refer to the latter. In this particular thesis, however, it is often necessary to refer to a more general understanding of case as either the combination of the two or to be agnostic about their relationship. To avoid copious use of the hard-to-read "C/case", the following conventions will be used: when specifically referring to grammatical licensing I will use capital "C" Case and when specifically referring to morphological form I will use the term "morphological case". When a distinction is either not needed or is not assumed I will use lowercase "c" case to refer to case in its general form.

(2) a. I expect her to leave the party early.
b. *It is likely her to leave the party early.
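For illustration, the following sketch schematizes the standard picture just described in procedural terms: case is valued as a reflex of ϕ-agreement with a functional head, and any nominal whose case feature remains unvalued at spell out runs afoul of the Case Filter, as in (2b). The representations here (the DP class and the agree and case_filter helpers) are toy simplifications of my own, not an implementation of the Chomsky (2000, 2001) system.

```python
class DP:
    def __init__(self, name, phi):
        self.name, self.phi, self.case = name, phi, None

def agree(head_case_value, dp):
    """phi-agreement between a functional head and a DP values the DP's case."""
    if dp.phi is not None:
        dp.case = head_case_value
        return True
    return False

def case_filter(dps):
    """Toy Case Filter: a caseless nominal at spell out is ungrammatical."""
    for dp in dps:
        if dp.case is None:
            raise RuntimeError(f"*{dp.name}: Case Filter violation")

# (2a) 'I expect her to leave': the ECM verb's v agrees with 'her' -> accusative.
her = DP("her", "3sg")
agree("acc", her)
case_filter([her])            # passes silently

# (2b) '*It is likely her to leave': non-finite T cannot value case and no
# alternative source is available, so the filter rules the derivation out.
her2 = DP("her", "3sg")
try:
    case_filter([her2])
except RuntimeError as e:
    print(e)                  # *her: Case Filter violation
```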
However, as we've seen in chapter 1, there are a number of structures which, like (2b), also contain a DP that has failed to receive a case value, but which, unlike (2b), produce a perfectly grammatical sentence. These are the instances where what is called default case surfaces, shown in (3):

(3) Default Case in English:
a. Hanging Topic/Left-Dislocation: What?! Him wear a tuxedo?!
b. Gapping: She will eat cake, him brownies.
c. Coordination: Me and him will go to the store.
d. Modified Pronouns: Lucky me has to clean all the toilets.
(Schütze, 2001)

What makes default case theoretically interesting is that it appears to constitute a counterexample to the Case Filter by virtue of being an instance where required case valuation has failed. At least two questions are raised: (i) how is default case produced in the first place, given that the failure to value case features should result in ungrammaticality, and (ii) how is the production of default case constrained such that its overapplication couldn't incorrectly produce a grammatical version of (2b)?

2.2 Dependent Case Theory

We begin with a departure that involves adopting a model of case valuation that builds defaults directly into the system – an approach called dependent case (Baker, 2015; Levin & Preminger, 2015; Marantz, 1991; McFadden, 2004). Dependent case theory assigns case using the configurational relationship between two nominals in a case-assigning domain. If a nominal is not assigned one of the dependent cases, it may then be assigned either unmarked case or default case. The negative characterization of the environments in which unmarked and default cases are assigned should be familiar, as it is similar in logic to how we intuitively define the environments in which defaults appear, and it is thus considered an attractive way to handle the problems raised by the existence of defaults. This section outlines the details of dependent case theory and explores some of the empirical and conceptual implications that arise from its adoption. In section 2.2.1 we begin with a brief history followed by an overview of modern versions of dependent case theory. In section 2.2.2, I discuss some important implications that adopting this system requires of the grammar and then use them to argue that there is still reason to pursue a more conservative standard approach. The details of that approach will be discussed in chapter 4.

2.2.1 Overview of model

2.2.1.1 Early Versions

The origins of dependent case predate both Minimalism itself (Chomsky, 1995) and standard Minimalist assumptions regarding case and agreement (Chomsky, 2000, 2001). Understanding these initial motivations and how this novel system compared to its contemporaries is important for understanding how the modern versions of dependent case fit into the larger theoretical picture. We begin therefore with one of the original versions of dependent case, Marantz (1991).[2] Marantz has two primary goals: (i) to rid the syntax of abstract Case and (ii) to contribute significant knowledge about the theory of morphological case.

[2] There are two other works that could also be considered pioneers of dependent case: (Bittner & Hale, 1996; Yip, Maling, & Jackendoff, 1987). I focus on (Marantz, 1991) here because the modern versions of dependent case are primarily based on his version. However, it is important to note that many of the insights we gain from Marantz are echoed in these other works as well.
We start with data that illustrates what is called the ergative generalization, defined in (4) (Marantz, 1991). The data in (5) from Hindi (Marantz, 1991) shows that ergative case is only possible on subjects that originate in a thematic subject position – either transitive subjects (5d) or subjects of unergatives (5b)-(5c). Ergative case is disallowed on subjects that originate in a non-thematic subject position, as they would in an unaccusative (5a).

(4) Ergative Generalization
No ergative case on a non-thematic subject (i.e., on an argument moved into a non-thematic subject position)

(5) a. siita (*ne) aayii
    Sita.fem (*erg) arrived/came.fem
    'Sita arrived.'
b. kutte bhoNke
    dogs.masc.pl barked.masc.pl
    'Dogs barked.'
c. kuttoN ne bhoNkaa
    dogs.pl erg barked.masc.sg
    'Dogs barked.'
d. raam ne roTii khaayii thii
    Ram.masc erg bread.fem eat.fem be.past.fem
    'Ram was eating bread.'

Data that reflects the ergative generalization mirrors in part what Burzio's Generalization (6) hoped to capture: the inability of the object of an unaccusative to receive accusative case (7) (Burzio, 1986). At the time, Burzio's generalization was understood to be about the abstract syntactic Case that a nominal received, while the ergative generalization covered morphological case only. In order to draw a connection between the two, Marantz reframed Burzio's generalization to be about morphological case, allowing for both generalizations to be subsumed under the same morphological mechanism, thus contributing strongly to our understanding of morphological case theory.

(6) Burzio's Generalization:
If a verb's subject position is non-thematic, the verb will not assign accusative structural Case.

(7) a. Hei arrived ti.
b. *Himi arrived ti.

Of course, to do this, Marantz needed to make it attractive to assume that data like (8) could be captured without reference to abstract Case. Under the standard Burzio explanation, what rules out the examples in (8) (couched in modern terms) is the inability of both unaccusative and passive v to assign abstract accusative Case to the nominals the man and the porcupine, respectively. Since the v is unable to assign the necessary case, the derivations arrive at spell out having failed to value their abstract Case features, causing ungrammaticality.

(8) a. *It arrived the man.
b. *It was purchased the porcupine.

If, however, one assumes that we couple a requirement that subject positions must be filled – the Extended Projection Principle (Chomsky, 1982) – with a preference for move over merge as the strategy to obey the EPP, we can capture the data in (8) without referring to abstract case at all. What rules out the sentences in (8) under this view is the failure of the derivation to obey the move over merge preference, since both derivations involve the merger of an expletive instead of the movement of the theme argument to subject position. Part of being able to reject the notion of abstract Case altogether requires some alternative understanding of Burzio's generalization, since it directly categorizes when and where abstract Case is assigned. Marantz proposes that instead of being about abstract Case, Burzio's generalization is actually about the assignment of morphological accusative case in particular – aligning it much more closely with the ergative generalization (9).
The comparison drawn between these two generalizations gave birth to dependent case theory as we know it today, an influential contribution that still lies at the center of much debate in the modern case literature. Like modern versions of dependent case theory, Marantz's original version is crucially only intended to explain how morphological case assignment works, as he (and those who've followed him since) rejects the idea that abstract Case (i) exists and (ii) has anything at all to do with the regulation of nominal distribution.

(9) a. Ergative Generalization: no ergative case on a non-thematic subject (i.e., on an argument moved into a non-thematic subject position)
b. Burzio's Generalization (reframed): no accusative case on an object in a sentence with a non-thematic subject position

The primary question that Marantz – and others who work on morphological case – need to address is this: what is it that determines which particular case features show up on which nominals and why? It is common to assume that the morphological component, responsible for assigning morphological case, interprets the syntactic structure delivered to it. So although Marantz assumes that case assignment happens entirely in the morphological component, it is still the structural relations between relevant pieces that dictate the choice between the various cases. What is different under dependent case theory are the mechanisms by which these relationships are established. Marantz proposes a hierarchy that dictates the order in which different cases take precedence over others, shown below in (10):

(10) a. lexically governed case
b. dependent case (accusative and ergative)
c. unmarked case (environment-sensitive)
d. default case

First, lexical cases are assigned to nominal chains by virtue of being governed by a verb that has quirky or lexical case to assign. The fact that it is the nominal chain, rather than just the highest position the nominal occupies, that is relevant for government is how we capture Icelandic data where quirky case is preserved despite movement to a subject position. To illustrate, see the example below in (11) (Harley, 1995). Since V will still always govern part of the chain of the nominal, even after the nominal moves, V is still capable of assigning case to that nominal. In this way, a quirky case-assigning verb is able to assign quirky case to a subject because it governs a position that the subject once occupied. This accounts for the preservation of quirky case under movement.

(11) Morgum studentum liki/*lika verkið
    many student.pl.dat like.3sg/*3pl job.the.nom
    'many students like the job'

If a lexical case-assigning verb is not present in a given structure, then the V+I complex is able to assign case features to nominals that it governs. Case assignment is determined by looking at not only the particular nominal in question, but also other nominals that the same V+I complex governs. This case, called dependent case because its assignment is dependent on the presence of other nominals, is either accusative or ergative and is assigned according to the following algorithm, shown in (12):

(12) Dependent case is assigned by V+I to a position governed by V+I when a distinct position governed by V+I is:
a. not "marked" (not part of a chain governed by a lexical case determiner)
b. distinct from the chain being assigned dependent case
Dependent case assigned up to subject: ergative
Dependent case assigned down to object: accusative
(Marantz, 1991)

In plain terms, the V+I complex assigns dependent case to a nominal if there is another distinct nominal that either c-commands it (if the language exhibits nominative-accusative alignment) or that it c-commands (if the language exhibits ergative-absolutive alignment). In the English example in (13a) we see that the nominal her is c-commanded by another nominal, he. In GB frameworks, these two nominals represent distinct chains, but are both governed by the same V+I complex. Since he is not lexically case marked, the dependent case algorithm in (12) would direct the V+I complex to assign dependent accusative case down to the object, her. The same mechanism applied in the opposite direction is shown for data like (13b) from Burashaski (Willson, 1996). Because Burashaski is an ergative language, the dependent case mechanism would direct the V+I complex to assign dependent case upward, to the subject. As in (13a), there are two nominals in (13b) that represent distinct chains, both governed by the same V+I complex. This time, the dependent case algorithm would direct V+I to assign dependent case upward to the subject hilés-e, marking it with the dependent case for ergative-absolutive languages – ergative case. In this way, the case that a nominal receives is dependent on whether or not there is another nominal with particular properties (not already case-marked) in a given local domain (the governing domain of V+I). In many ways, we can view dependent case as the logical successor of Burzio's generalization: at their core, both essentially describe the assignment of accusative case as dependent on the presence of a higher, distinct argument.

(13) a. He saw her
    nom saw acc
b. Hilés-e dasin mu-ye'ets-imi
    boy-erg girl.abs 3.f-see-past.3msg
    'The boy saw the girl'
    (Willson, 1996)

Next, if there exists a nominal to which the dependent case algorithm does not apply, the nominal is eligible to receive an unmarked case. Unmarked case is assigned to nominals after both the mechanisms behind lexical case and dependent case have applied and is context-sensitive in that different environments trigger different cases. Nominative and absolutive are assumed to be the unmarked cases assigned to nominals that don't receive dependent case in the TP environment, while genitive case is the unmarked case assigned to nominals that are within another NP environment. After the dependent case algorithm has applied to all the nominals it can in (13a), the nominal he remains. Since this nominal is in a TP environment, it receives the unmarked nominative. Likewise, in (13b), after the dependent case mechanism assigned ergative case to hilés-e, the nominal dasin remains unmarked. Since dasin is in the TP environment, it would receive unmarked absolutive case. Lastly, if there exists a nominal such that none of the above case-assigning rules in (12) are able to apply, a more general language-wide default that is not contextually sensitive is assigned – the default case introduced in chapter 1. Case assignment in the original proposal of this system is assumed to occur post-syntactically in the morphological component, but see Levin (2015) for work that places dependent case in the syntax itself.
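The disjunctive logic of (10) and (12) can also be schematized procedurally, as in the sketch below. This is only an illustration under heavy simplifying assumptions of my own – linear order of the nominals stands in for government and c-command, and unmarked case is reduced to whether a nominal sits in a recognized environment – so the function names and encodings are illustrative, not part of Marantz's proposal.

```python
DEP = {"nom-acc": ("acc", "down"), "erg-abs": ("erg", "up")}

def assign_case(nominals, alignment="nom-acc"):
    """nominals: dicts ordered highest-to-lowest (a crude proxy for the governing
    domain of V+I), each optionally carrying a 'lexical' case and an 'env' label."""
    cases = [n.get("lexical") for n in nominals]               # (10a) lexical case
    bare = [i for i, c in enumerate(cases) if c is None]
    if len(bare) >= 2:                                         # (10b) dependent case
        dep_case, direction = DEP[alignment]
        cases[bare[-1] if direction == "down" else bare[0]] = dep_case
    unmarked = {"TP": "nom" if alignment == "nom-acc" else "abs", "NP": "gen"}
    for i, n in enumerate(nominals):                           # (10c) unmarked case
        if cases[i] is None and n.get("env") in unmarked:
            cases[i] = unmarked[n["env"]]
    return [c if c is not None else "default" for c in cases]  # (10d) default case

# 'He saw her' (13a): dependent acc down to the object, unmarked nom on the subject.
print(assign_case([{"env": "TP"}, {"env": "TP"}]))                    # ['nom', 'acc']
# Burashaski-style (13b): dependent erg up to the subject, unmarked abs on the object.
print(assign_case([{"env": "TP"}, {"env": "TP"}], "erg-abs"))         # ['erg', 'abs']
# Icelandic quirky subject (11): lexical dative bleeds dependent case; the object
# surfaces with unmarked nominative.
print(assign_case([{"lexical": "dat", "env": "TP"}, {"env": "TP"}]))  # ['dat', 'nom']
# A lone nominal outside any unmarked-case environment falls through to the default.
print(assign_case([{"env": None}]))                                   # ['default']
```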
The existence of a default case that applies as a last resort to nominals that have not received either lexical, dependent, or unmarked case is why dependent case is categorically incompatible with a framework that affords case the ability to act as a grammatical filter in the syntax – it would render that filter meaningless. 2.2.1.2 Modern Versions The impact of Marantz (1991) is still strongly felt today in modern versions of dependent case theory. This next section will describe what current versions of this model look like and outline the relevant updates that have been made to Marantz’s original proposal to bring it in-line with modern theoretical assumptions. From there, we can discuss the theoretical and conceptual implications that adopting this model appears to require. The eventual goal is to provide arguments against adopting this method of case valuation (despite its impressive empirical coverage), saving an illustration of what a reasonable alternative could look like for chapter 4. Since 1991, the standard framework in which we build our theories has changed significantly and with it the standard assumptions we hold about the nature and relationship of case and licensing. Unsurprisingly, the ways in which the field has changed require commensurate changes in how dependent case is implemented today. Most significant is the abandonment of government as a syntactic relation. Modern versions of dependent case theory have therefore had to redefine the conditions upon which the dependent case rules apply because Marantz defined the assignment of 32 dependent case in exactly those terms. Today, proponents of dependent case theory largely adopt the same set of assumptions and details regarding the assignment of dependent case, most of which come out of Baker (2015). I will follow suit and thereby focus our discussion on this version, pointing out relevant departures when needed. Baker (2015) provides the most comprehensive analysis of how dependent case could operate within a modern Minimalist framework and in doing so, constitutes the first system with enough detail that it is now a true modern theoretical competitor to the more standard agreement-based approach first outlined in Chomsky (2000, 2001). Baker’s model is a hybrid one, combining elements from both agreement-based case proposals and dependent case ones and is focused on morphological case issues only, leaving explanations of nominal licensing to others – an important distinction we’ll see is relevant later in this chapter. The defining feature of his approach is the high degree of parameterization, something that allows him to capture an impressive amount of empirical coverage across a wide range of language families and case system types. This high degree of parameterization is reflected both on a narrow scale in the specific details of dependent case assignment and also on a more broad scale, being reflected in the choice of assignment mechanisms themselves.3 Since the highest level of parameterization is in the choice of assignment mechanisms them- selves, we’ll begin our discussion there. Baker’s main thesis is that the variety in the morphological case patterns found in the world’s diverse languages supports a model of case assignment that provides the grammar with two4 mechanisms by which case features can be valued/assigned: one that is agreement-based5 and one that is configurational, called dependent case. The motivation 3This last point is the singular source of major contention between most modern researchers. 
While Baker argues that we still need remnants of the standard agreement-based mechanism, there are those who would rather eliminate any need for reference to agreement in assigning case (Levin & Preminger, 2015). A more thorough discussion of the theoretical implications of this disagreement will follow in the next section. 4I say “two” here, putting aside for now issues regarding lexical case. We will come back to that point later. 5A quick note for clarification: the sources discussed in this chapter are arguing for and against a very particular model of standard case assignment. The relevant approach is the one proposed in 33 for this hybrid system is that where one model has been weak, the other is strong. Baker argues that agreement provides a solid theory of morphological case for “some cases in some languages”, but not for “all cases in all languages” (Baker, 2015, p.47). For those cases or languages where agreement does not account well for the various morphological patterns, the grammar needs to have an additional method of case assignment available in order to capture that data. This choice in case assigning mechanism is not a parameter that is always set on a language-level; it can actually be set case-by-case. In other words, there can exist languages like Sakha in which some cases – nominative and genitive – are assigned via agreement, while others – accusative and dative – are assigned via dependent case. A strong argument for maintaining agreement-based case is the long observed strict co- occurrence between nominative case and ϕ-agreement. Baker (2015) and Baker and Vinokurova (2010) use the language Sakha to illustrate this point. In Sakha, the nominative subject typically controls verb agreement; this is what we see in (14).6 (14) Masha Masha ‘Masha’s father bought the book.’ aqa-ta father-3sg.nomposs kinige-ni book-acc atyylas-ta buy-past.3sgsubj Subject agreement is also present in relative clauses in Sakha. Relative clauses consist of a participle that precedes a head noun along, of course, with a subject (15). In relative clauses, the standard Chomsky (2000, 2001) where case assignment occurs as a reflex of a nominal having established a successful ϕ-agreement relationship with a probe. This is distinct from later models that use the agree operation to establish case assignment but divorce it from ϕ-agreement itself (Adger, 2003; Carstens, 2016, among others) Since the topic of this chapter is dependent case and its proponents, I intentionally use the phrase agreement-based case as opposed to agree-based case in order to distinguish the two. 6A quick comment on the notation used here: for the data from Sakha, I’ve included some annotation Baker used to mark what certain elements have agreed with. This is especially helpful since the morphological facts aren’t especially obvious in this language. These additional annota- tions are subscripted on the element that has been agreed with and are either subj, obj, or poss. To illustrate, if a verb is subscripted with a subj as it is in (14), this shows agreement has successfully been established with the subject, Masha’s father’. Since in (14) both the subject and object are third person, this helps to clarify what is going on in the data. Likewise, if something is subscripted with poss, as ‘father’ is in (14), this tells us that it has agreed with a possessor. 
34 agreement between the subject and the verb that we saw in (14) is not allowed, as we can see in (15b) where ih-er cannot agree with the subject of the relative clause ‘Masha’. What is allowed however, is agreement between the subject and the head noun of the relative clause; in (15a), caakky-ta agrees with the subject of the relative clause Masha and the relative clause is grammatical. What is important about this data with respect to case assignment is the fact that Masha in (15a) is argued to be genitive, not nominative. (15) caakky-ta a. Masha Masha cup-3sgposs ‘a cup that Masha drinks tea from’ ih-er drink-aor cej tea b. *Masha caakky Masha cup ‘a cup that Masha drinks tea from’ ih-er-e drink-aor-3sgsubj cej tea Of course, this observation is not immediately obvious as both nominative and genitive are typically expressed as null morphemes in Sakha. However, Baker provides a comparison that helps buttress this claim. Compare (14) and (16). In (16), Masha agrees with the head noun at-a, but instead of the subject Masha aqa-ty-n receiving nominative case as it does in (14) when it agreed with the verb, it receives genitive case. (16) Masha atyylas-pyt Masha buy-ptpl ‘the horse that Masha’s father bought’ aqa-ty-n father-3sgposs-gen at-a horse-3sgposs Baker uses this data to show that when subject agreement is with a verbal element we get nominative case and when subject agreement is with a nominal element, we get genitive case. Furthermore, when agreement is totally absent (17), we either get a null subject (17a) or the clause is ungram- matical (17b). (17) a. ih-er drink-aor cej tea ‘a cup that one drinks from’ caakky cup 35 b. *Masha Masha ‘ a cup Masha drinks from’ ih-er drink.aor cej tea caaky cup This data signals a strong correlation between overt subject-verb agreement and nominative case. When ϕ-agreement between the subject and the verb is present, nominative case surfaces (14) and when ϕ-agreement between the subject and the verb is not present, either because it has agreed with something else (15a) or has failed to agree at all (17), nominative case is disallowed. Additionally, example (18) further illustrates the close relationship between successful subject- verb ϕ-agreement and the expression of nominative case. In Sakha, the theme of passive verbs can be either accusative or nominative. When it is nominative, as it is assumed to be in (18a), we see ϕ-agreement between subject and verb. However, when the theme argument is accusative, as it is in (18b), ϕ-agreement does not surface; we can see that the verb in (18b) is marked with singular features, rather than the plural features that would surface had agreement been successful. (18) a. b. Sonun-nar news.pl ‘the news was read’ Sonun-nar-y news.pl.acc ‘the news was read’ aaq-lyln-ny-lar read.pass.past.3plsubj aaq-ylyn-na read.pass.past.3sgsubj These examples (among others shown in both Baker and Vinokurova (2010) and Baker (2015)) illustrate the following descriptive principle for how nominative case is distributed in Sakha: (19) Overt NP X has nominative case if, and only if, exactly one verbal form in the clause containing X agrees with it. (19) reflects a dependency between ϕ-agreement and the appearance of nominative case. 
Since the standard traditional agreement approach models case as the reflex of having established a successful agreement relationship, it accounts for (19) quite directly and it is therefore easy to understand why Baker concludes that there are cases in some languages that should still be modeled with an 36 agreement-based case mechanism.7 There are also cases and languages for which Baker argues an agreement-based approach is the wrong approach. The arguments against agreement-based case are quite similar in logic in that it’s hard to adopt an agreement-based case account if there is either (i) doubt that the agreement operation exists, (ii) serious mismatches between the agreement system and the case system, or (iii) the presence of case marking in the absence of the functional head assumed to be responsible. To address this first point, there is a large number of languages which exhibit overt case marking, but do not exhibit any sort of verbal agreement, making it difficult to make a connection between the two. For example, Japanese has a fairly robust case-marking system, but does not appear to have any sort of verbal agreement for any of the typical ϕ-feature categories like person, number, or gender. Baker couples this fact with the existence of other proposals that use a complete lack of agreement to account for other phenomena (Kuroda, 1988) to argue that an agreement-based account seems very unattractive for languages that don’t appear to have it as an active operation. Similarly, there are also a number of languages that may have ϕ-agreement more generally, but do not exhibit any object agreement specifically. To the extent that these languages do ex- hibit accusative case, it becomes difficult to attribute that accusative case to non-existent object agreement. Addressing the second argument, Baker illustrates some mismatches between case and agreement that appear to make it difficult to adopt an agreement-based account of case assignment. A quick example is shown in (20). What (20) shows is that in Amharic, we observe successful object agreement with nominals that are differently case-marked (Baker, 2012b; Leslau, 1995). In (20a), there is object agreement with the dative argument Almaz. In (20b), a nominative argument controls the object agreement and in (20c) an instrumental argument controls object agreement. One might then conclude that we should not ascribe the particular case-marking on each nominal 7Baker acknowledges that a dependent case alternative to nominative case in Sakha has been proposed (see Levin and Preminger (2015) for details), but he maintains a preference for a more tra- ditional agreement largely because the dependent case alternative requires a number of stipulations that are not necessary if one follows Baker’s proposal and, as Baker points out, are not universally true. However, it is important to acknowledge that there are dependent case alternatives that are able to capture the data to some degree 37 to the relationship formed by object agreement, since this relationship is not consistently reflected by the same case.8 (20) a. l-Almaz dat-Almaz.f L@mma Lemma.m ‘Lemma gave the book to Almaz’ m@ts’@haf-u-n book.m-def-acc s@t’t’-at give-3msubj-3fobj (Baker, 2012a) b. Aster Aster.f ‘Aster has a dog.’ w1SSa dog.m all-at exist-3msubj-3fobj b@-m@ t’r@ giya-w inst-broom.m-def c. 
Aster Aster.f ‘Aster swept a doorway with the broom.’ d@dZdZ doorway t’@rr@g-@tStS-1bb-@t sweep-3fsubj-with-3mobj (Leslau, 1995) To address the third piece of the argument, Baker asks: what happens when the heads that are assumed to be responsible for case assignment go “missing”? If we examined English for an answer, we might conclude that an agreement-based approach is more successful after all. (21) shows that when finite T is present, nominative case is grammatical. However, when finite T is absent, nominative case is impossible, but other case-markings are grammatical. This kind of data has famously been used to support the idea that finite T is responsible for the assignment of nominative case (Chomsky, 1981; Vergnaud, 2008). (21) a. He will find some money in the park. b. c. [PRO/for him/*he to find some money] would be a lucky break. [PRO/Him/His/*He finding some money in the park] was a big help to his budget. (Baker, 2015) However, as Baker illustrates, this fact is not universal. In (22) from Tamil, we see an ability for subjects to appear with nominative case, despite a lack of finite T. This shows that whatever is responsible for nominative case assignment in languages like Tamil, it is not finite T. (See McFadden 8See Baker (2015) and McFadden (2004) for more examples that illustrate case/agreement mismatches. Baker (2015) extends this point by showing how ergative languages are especially robust in their agreement/case mismatches. 38 and Sundaresan (2010) for a thorough discussion of this data). (22) Champa-vukku Champa-dat ‘Champa wants Sudha to eat a samosa’ [Sudha Sudha.nom oru a samosa-vai samosa-acc saappiã-a] eat-inf veïã-um want-3nsubj Taking these arguments together, Baker advances the proposal that case marking is – at least some of the time – not dependent on the establishment of an agreement relationship. He does admit that it’s possible to propose accounts that rely on notions of abstract agreement that morphologically mark syntactic objects in ways that belie any syntactic relationship; however he pursues a different route, asking if we shouldn’t trust more seriously what the surface morphology is telling us. The model of dependent case he proposes is an exploration of what an alternative system for languages that aren’t as conducive to an agreement-based approach could look like. Before moving forward with the details of how dependent case can be assigned in the grammar, it’s important to clarify where lexical/quirky/inherent case fits into this system since I’ve been describing the case assignment mechanism parameter as having just the two options. Baker follows fairly standard assumptions about lexical case, arguing that it applies immediately, via the relationship an argument forms by being merged into a projection with a quirky case assigning verbal head. Any assignment of either dependent case or agreement-based case comes after. More discussion about the timing of this system will follow. In its most general terms, dependent case assignment proceeds according to the following abstract schema in (23) where each variable represents an area where there is some degree of parameterization. I’ll discuss each of these variables in turn, outlining the range of parameters the grammar is able to set. For reasons of space, I direct the reader to Chapters 3, 4, and 5 in Baker (2015) for detailed argumentation and motivation for each of these parameters. (23) If a category XP bears c-command relationship R to another category ZP in domain W, then assign case C to XP. 
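Before walking through each parameter, it may help to schematize (23) itself: a language's dependent case rules can be thought of as a small set of (relation, domain, case) triples evaluated against the nominals of a structure. The sketch below is my own toy rendering of that idea – the Rule encoding, the 'height' proxy for c-command, and the sample rules (an English-style accusative rule and a negative c-command rule of the kind discussed just below) are illustrative assumptions, not Baker's implementation.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    relation: str   # "c-commands", "c-commanded-by", or "not-c-commanded" (negative)
    domain: str     # e.g. "CP-TP", "vP-VP", "DP-NP"
    case: str       # the case value assigned to XP when the description is met

def applies(rule, xp, others):
    """xp, others: dicts with a 'height' proxy for c-command and a 'domain' label."""
    rivals = [zp for zp in others
              if zp["domain"] == rule.domain and xp["domain"] == rule.domain]
    if rule.relation == "c-commands":
        return any(xp["height"] > zp["height"] for zp in rivals)
    if rule.relation == "c-commanded-by":
        return any(zp["height"] > xp["height"] for zp in rivals)
    return not any(zp["height"] > xp["height"] for zp in rivals)  # negative c-command

english_acc = Rule("c-commanded-by", "CP-TP", "acc")
marked_nom  = Rule("not-c-commanded", "CP-TP", "marked-nom")

subj = {"height": 2, "domain": "CP-TP"}
obj  = {"height": 1, "domain": "CP-TP"}
print(applies(english_acc, obj, [subj]))   # True -> object receives accusative
print(applies(marked_nom, subj, [obj]))    # True -> highest nominal receives marked nom
print(applies(marked_nom, subj, []))       # True -> also holds for a lone intransitive subject
```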
Baker outlines three relationships that the relationship R parameter can take: (i) c-command, (ii) is c-commanded by, and (iii) negative c-command. The first two are understood in the familiar way. The novel negative c-command relation is introduced to account for case patterns called marked nominative and marked absolutive. In marked nominative languages, the subject of both transitive and intransitive verbs is overtly marked with a nominative affix and the object of a transitive verb is not marked with any case affix at all. An example is shown from Oromo in (24) (Owens, 1985). While similar to nominative-accusative languages, marked nominative languages differ in that the object of a transitive does not bear any case marking. To account for languages that exhibit this type of pattern, Baker proposes a relationship called negative c-command, which says that there is no NP2 such that NP2 c-commands NP1; essentially ensuring that the nominal in question is the highest nominal in a case-assigning domain. This negative c-command parameter can manifest in the dependent case schema in the way shown in (25a):

(24) a. Sárée-n adii-n nî iyyi-f-i (unergative)
    dog.mnom white.mnom foc bark-f-impf
    'the dog is barking.'
b. D'axáa-n maná duubá: b-bu'e (unaccusative)
    rock.mnom house behind loc-fell
    'the rock fell behind the house.'
c. Húrrée-n arká d'olki-t-i (transitive)
    fog.mnom sight.abs prevent-f-impf
    'fog reduces visibility.'
    (Owens, 1985)

(25) a. Assign NP1 marked nominative if there is no other NP, NP2, in the same domain WP as NP1 such that NP2 c-commands NP1.
b. Assign NP1 marked absolutive if there is no other NP, NP2, in the same domain WP as NP1 such that NP2 is c-commanded by NP1.

The algorithm in (25b) is intended to take care of marked absolutive languages, but Baker remarks that we've only found evidence that one language exhibits this pattern: Nias (Donohue & Brown, 1999). While the transitive object and the intransitive subject share a case marker in Nias, this marker doesn't take a typical affixal form, but is rather expressed via a feature change in the initial consonant (26). With respect to the negative c-command relationship more generally, Baker argues in favor of proposing it not just to align marked nominative within the bounds of dependent case, but more importantly because he believes a dependent case account is actually superior for this type of data. Since agreement appears independent of case marking in these languages, he argues that an agreement account would require a mismatch between syntactic markedness and morphological markedness – nominative in marked nominative languages would be syntactically unmarked, but morphologically marked.

(26) a. Manavuli sui [n-ama-da Tohönavanaetu] ba Maenamölö
    return again mabs.father.1plposs Tohönavanaetu loc Maenamölö
    'Ama Tohonavanaetu came back again to Maenamölö.'
b. I-a [m-bavai] [ama Gumi]
    3sgsubj.real-eat abs.pig father.erg Gumi
    'Ama Gumi eats pigs.'
    (Donohue & Brown, 1999)

The next variable to explore is the domain variable W. We can broadly characterize this variable as the set of spell out domains, but "to what degree [the grammar] assign[s] different cases in different domains is another one of its parameters" (Baker, 2015, p.182). These domains include CP-TP, which is the typical clausal domain. We can see these effects quite easily by comparing (27b) and (27a).
In (27a) we see that the presence of the matrix subject, which c-commands the embedded subject, does not trigger accusative case marking on the embedded subject, plausibly because the two nominals in question are in distinct domains. Conversely, in (27b) the presence of the matrix subject does trigger accusative case on the embedded subject because this time the two nominals do occupy the same phase.

(27) a. Jane hopes that he will win.
b. Jane expects him to win.

Baker argues that if we assume the smaller vP-VP domain is one that case assignment is sensitive to, we can provide explanations both for difficult differential object marking (DOM) patterns like those from Sakha in (28) and for "special" dependent cases assigned within the VP like dative, oblique, and partitive cases. A standard approach to DOM is to suggest that there are two structures available: one where the object remains inside the VP and one where the object moves outside the VP into the vP. The effect this has for dependent case assignment is that this movement can feed the application of dependent case. If the nominal stays inside its original VP, then it and the subject are in different spell out domains (29) and the dependent case schema will not trigger the assignment of any dependent case. However, if the object moves outside that VP (30), then it will be spelled out along with the contents of the TP when the phase head C is merged into the structure. In this instance, there will be two nominals in the same spell out domain and therefore the dependent case mechanism will apply, assigning accusative to the object.

(28) a. Masha salamaat-y sie-te
    Masha porridge.acc eat.past.3sgsubj
    'Masha ate the porridge'
b. Masha salamaat sie-te
    Masha porridge eat.past.3sgsubj
    'Masha ate porridge'
    (Baker & Vinokurova, 2010)

(29) [TP Subject T [V Object]].

(30) [TP Subject T Objecti [V ti]].

The ability for vP-VP to act as a domain to account for DOM requires an additional theoretical assumption, one that Baker admits will be quite controversial. The problem is that in accounting for DOM in this way, we introduce the question of how the dependent case mechanism can apply properly in languages that don't exhibit DOM. The problem is this: if we assume that vP-VP is a spell out domain that triggers dependent case assignment, we must assume that all objects move out of VP in non-DOM languages in order for them to be able to exist in the same spell out domain as the subject to receive accusative case. Baker instead suggests that the solution is yet another parameter: languages can select whether their vP is a soft phase or a hard phase. Soft phases are those for which the grammar can continue to see into the vP-VP spell out domain after spell out, and hard phases are those for which the contents of the vP-VP are invisible once spelled out. Baker explains that while superficially similar to Chomsky's weak/strong phase heads (Chomsky, 2000, 2001), it differs in how the choice is made. Chomsky tried to make the distinction universal and derivable from head type; passives and unaccusatives would require weak phase heads, for example, while other verb types would require strong. The distinction between Baker's soft/hard phases is instead a parameter that a language can just choose to set one way or the other; hard phase languages would be predicted to allow DOM, while soft phase languages would not.
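The interaction just described can be summarized schematically: whether dependent accusative is assigned depends only on whether subject and object end up mutually visible in the same spell out domain, and that in turn depends on object shift and on the soft/hard setting of the vP phase. The sketch below is a toy illustration of that interaction under my own simplifying assumptions; it is not an implementation of Baker's system, and the function names are invented for this illustration.

```python
def visible_at_tp(object_shifted, v_is_soft_phase):
    """Return the set of nominals visible together in the CP-TP spell-out domain."""
    visible = {"subject"}
    if object_shifted or v_is_soft_phase:
        visible.add("object")   # shifted objects, or VP-internal objects in a
    return visible              # soft-phase language, are visible at the TP level

def object_case(object_shifted, v_is_soft_phase):
    domain = visible_at_tp(object_shifted, v_is_soft_phase)
    # dependent case rule: acc on a nominal c-commanded by another in its domain
    return "acc" if {"subject", "object"} <= domain else "unmarked"

# Sakha-style DOM (hard vP phase): case marking tracks whether the object has shifted.
print(object_case(object_shifted=True,  v_is_soft_phase=False))   # acc      (cf. 28a)
print(object_case(object_shifted=False, v_is_soft_phase=False))   # unmarked (cf. 28b)
# A non-DOM accusative language can instead set its vP as a soft phase:
print(object_case(object_shifted=False, v_is_soft_phase=True))    # acc
```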
The other benefit of the grammar being able to take advantage of the smaller vP-VP domain is that we can explore the assignment of dative, oblique, and partitive cases as an additional kind of dependent case. Baker proposes the rules in (31) to account for these “special” cases and pairs them with the rules in (32) to show symmetry between dependent case operating in the CP-TP spell out domain and dependent case operating in the vP-VP spell out domain. (31) (32) a. b. c. a. b. c. If XP c-commands ZP in VP, then assign Case U (dative) to XP If XP is c-commanded by ZP in VP, then assign Case V (oblique) to XP. Elsewhere NP in VP is assigned case W (partitive). If XP c-commands ZP in TP, then assign Case X (ergative) to XP. If XP is c-commanded by ZP in TP, then assign Case Y (accusative) to XP. Elsewhere NP in TP is assigned Case Z (nominative/absolutive) Baker does propose one additional phase head that can condition dependent case assignment: the aspect head. Like vP, the ability of an aspect head to serve as a phase is a parameter which languages can set. The motivation for this part of the proposal is the potential to account for split ergativity data where case can alternate and is conditioned in part by aspect. The contrast between examples (33) and (34) from Coast Tsimshian shows that the choice of tense-aspect marker affects case alignment patterns (Dunn, 1995). 43 (33) (34) a. Yágwa pres ‘the skunk is sniffing around.’ húumsg-a sniff.abs geen. skunk b. Yagwa-t ’yuua-(a) pres.3sgerg man.abs ‘the man is pushing the woman.’ t’uus-da push.erg hana’k woman a. Nah past ‘the woman was sick.’ siipg-a be-sick.abs hana’a woman b. Nah past ‘the man pushed the woman.’ ’yuuta-(a) man.(abs) t’uus-a push.abs hana’k woman (Dunn, 1995) (Dunn, 1995) The final domain capable of conditioning dependent case assignment is the DP-NP spell out domain, a proposal present in Marantz’s (1991) version. The main function this domain serves is to allow for the assignment of genitive case as a type of unmarked case, distinct from nominative, that is assigned in the DP-NP domain rather than the CP-TP domain. At last we come to the final variable – the categories XP and ZP that can participate in the assignment of dependent case. It is arguably within this variable that we see the greatest degree of parameterization. Most broadly, the things that can participate in dependent case assignment are syntactic objects that contain “referential indices” (Baker, 2015, p.183). The two variables XP and ZP can be of the same category type, the difference in labels X and Z is intended to differentiate the case receiver from any case competitors. Baker suggests that what counts as a case competitor ZP is something that is parameterized along a scale that is dependent on what sorts of features each nominal-like thing has. This scale is shown in (35). The idea is that each category type on that scale has a different degree of nominal features. The ones with a full set will always be case competitors, then ones with none will never be and the rest are ranked along a scale such that languages can choose where the cut off point is for them. This scale captures an incredible degree of interesting data, the details of which are too large to outline here (see Baker (2015) chapter 5 for a full walkthrough). It is important to note that he does admit that the details of exactly what features are relevant for each different category along the scale are still to be worked out. 
(35) overt NP & clitics > null referential pronouns (pro) > controlled PRO > uncontrolled PRO > weak implicit arguments > PPs, VP, etc.
(from a full set of nominal features at the left end to no nominal features at the right end)

With the dependent case schema in place and an understanding of the range of values that each parameter can take, I quickly review Baker's assumptions regarding the timing of these different modes of case assignment. Lexical/quirky/inherent case is assumed to be assigned first, via the immediate relationship formed by merging into a predicate projection that assigns quirky case. From there, Baker argues that the dependent case mechanism assigns dependent case according to the schema outlined in (36), with the relevant parameters set for the particular language and/or case. After this mechanism applies, the grammar is then able to assign agreement-based case to anything that fits the relevant structural description. Finally, the grammar will then assign unmarked case to any nominals that, for whatever reason, were unable to receive case via one of the earlier methods. It is also worth mentioning here that like most modern applications of dependent case (Levin, 2015; Levin & Preminger, 2015), Baker assumes that this kind of case assignment happens in the narrow syntax, at or before spell out, not in a post-syntactic component as assumed by Marantz (1991), McFadden (2004), and Bobaljik (2008).

(36) If a category XP bears c-command relationship R to another category ZP in domain W, then assign case C to XP.

Before moving on to an exploration of the theoretical implications of this model, it is worth mentioning one interesting benefit of adopting case assigned via the dependent case model. Let's fill in the schema in (36) with the following values to produce (37a) and (37b). Baker explains that if we allow these two dependent case rules to apply independently, then four outcomes are logically possible: (i) (37a) only applies, producing nominative/accusative languages, (ii) (37b) only applies, producing ergative/absolutive languages, (iii) both (37a) and (37b) apply, producing tri-partite languages,[9] and (iv) neither (37a) nor (37b) apply, producing languages like Bantu which seem to make no use of case morphology at all (Diercks, 2012).

(37) a. If an NP1 is c-commanded by another NP2 in domain TP, then assign accusative to NP1.
b. If an NP1 c-commands another NP2 in domain TP, then assign ergative to NP1.

[9] Tripartite languages, such as Nez Perce, exhibit case morphology patterns that are viewed as having shared properties between nominative/accusative languages and ergative/absolutive languages. (See Deal (2016) for more discussion on these patterns.)

This review, while lengthy, of course does not capture the breadth of the entire proposal, but my hope is that it provides enough information to discuss some of its broader theoretical implications. The next section will focus on exploring what those are, with the intention of arriving at the conclusion that there are some troubling results we are forced into accepting by adopting such a model, despite how well it captures this wide range of data.

2.2.1.3 Interim Walkthrough

Before discussing the implications of adopting the dependent case approach, I want to provide a quick summary of how the two basic models of case assignment would account for some familiar case patterns so that we may enter into that conversation having seen the two systems comparatively illustrated. In a simple transitive clause like (38) (whose derivation is shown in (39)), v is specified with uninterpretable ϕ-features and an uninterpretable case feature. The uninterpretable ϕ-feature probes, looking for a valued instance with which to agree.
It finds a viable goal in the DP object him, which has valued third person singular features and an uninterpretable case feature of its own; the probing of ϕ-features is indicated by the Agree relations noted in (39). By reflex of agree, the interpretable third person singular features on the object will value the uninterpretable ϕ-features on v. As a result, v will assign accusative case to the DP object. A similar relationship is formed between the finite T and the DP subject. Finite T also has uninterpretable ϕ-features that need a value and they find a suitable goal with valued ϕ-features in the DP subject she, which has third person singular features. As a result of having agreed with finite T, the ϕ-features on T are valued and the DP subject receives nominative case.

(38) She loves him.

(39) [TP DP1 she[ϕ:3sg, uCase:nom] [T′ T[uϕ:3sg, uCase] [vP t1 [v′ v[uϕ:3sg, uCase] [VP V loves DP2 him[ϕ:3sg, uCase:acc]]]]]]
(Agree relations: T probes and agrees with DP1; v probes and agrees with DP2)

Dependent case theory would assign case in this example differently. First, we'd assume that the following parameters are set for English such that accusative case is calculated according to the following schema in (40). Note that since we're following Baker's approach, we'd also need to assume that the v phase head is a soft phase head, thus allowing the contents of the VP spell out domain to be visible to things outside the domain. Along with this algorithm dictating how dependent accusative case is assigned is another assumption that nominative case is the unmarked case that is assigned in the negative environments that this algorithm does not identify.

(40) If DP2 is c-commanded by another DP1 in the spell out domain TP, assign accusative case to DP2, provided DP1 does not already have case.

This algorithm would assign dependent accusative case to the object because the object constitutes a DP that is c-commanded by another DP that shares the spell out domain TP. After accusative case is assigned by this mechanism, the grammar then investigates the subject nominal, sees that it is caseless and, as a result of being in the domain defined by the unmarked case (TP), assigns it unmarked nominative case.

For a simple unaccusative clause like that in (41) and (42), under the agreement-based approach, finite T's uninterpretable ϕ-features would probe – again indicated in (42) – searching for a valued instance of ϕ-features, finding success with the theme argument. An EPP feature triggers the argument's movement to the specifier of TP position, the uninterpretable ϕ-features on finite T are valued by the ϕ-features on the theme argument, and the theme's uninterpretable case feature is valued nominative as a result. Dependent case would instead use the same algorithm shown in the example above to examine the spell out domain TP, see that there is no DP for which it is c-commanded by another DP, and therefore fails to assign accusative case. The grammar would then assign unmarked case to any nominals which did not receive accusative case, valuing the caseless subject with nominative case.

(41) He arrived.
(42) [tree: the theme argument DP1 he (ϕ:3sg, uCase) has raised to Spec,TP; finite T (uϕ:3sg) agrees with it and assigns it nominative; within the vP, V arrived takes the trace of DP1 as its complement]

Both agreement-based case mechanisms and dependent case ones would handle Icelandic transitive data as they did with English transitives like the example shown in (39), but an Icelandic quirky case example would be assigned case differently. Here, the subject is assigned a quirky dative case and the object receives nominative (43). Both agreement-based case models and dependent case ones assume that quirky dative is assigned by the verbal projection, but the two models differ in how they assume the object receives nominative. Agreement-based case models argue that finite T agrees with the object, assigning it nominative as a result (44). The idea is that the subject is able to be bypassed due to its already having received case – it is not visible to the ϕ-agreement probe. This is supported by data like that in (43b), which shows that when the subject is assigned quirky case, it is the object that controls number agreement. Dependent case would operate in the same way it has for the previous two examples: once TP is spelled out, the dependent case algorithm is not able to assign accusative case since there is no DP for which there is another caseless c-commanding DP, so the unmarked nominative is assigned to the object instead.

(43) a. Morgum studentum liki/*lika verkið
        many student.pl.dat like.3sg/*3pl job.the.nom
        'many students like the job'
     b. Henni leiddust þeir
        she.dat was-bored-by.3pl they.nom
        'She was bored with them.' (Harley, 1995)

(44) [tree: the quirky dative subject DP1 many students (ϕ:3pl, uCase:dat) has raised to Spec,TP; the verbal projection assigns it quirky dative case; finite T (uϕ:3sg) bypasses the dative subject, agrees with the object DP2 the job (ϕ:3sg, uCase) inside the VP headed by V like, and assigns it nominative]

2.2.2 Implications

Most of those who adopt dependent case have directed their focus toward empirical coverage, Baker in particular. While we've learned much from the impressive empirical coverage this model has achieved, there are many conceptual and theoretical implications that still need exploration and subsequent evaluation. The rest of this chapter is an attempt toward that aim. I will argue here that this evaluation should lead us to be more skeptical of the tenability of a dependent case approach and should motivate us to make other modifications to case theory that align it more closely with the standard approach.

Each of the works that employs dependent case as either the singular or primary case assignment mechanism argues well against the standard agreement-based case approach proposed by Chomsky (2000, 2001). It is clear that this model, as proposed there, will not be able to account for the widely varied morphological case patterns; some modifications will surely need to be made. Default case data provides an especially clear argument. I will offer a proposal of what I suggest those modifications should be in chapter 4. First, I think it prudent to provide motivation for why we should do so, when there is a model of case valuation available (dependent case) that appears to fill in these empirical gaps quite well. The arguments laid out here are therefore relatively modest ones in the face of the impressive empirical coverage achieved by Baker (2015), the impact of which cannot be overstated. However, it is also incredibly important for us as theoreticians to understand what concessions we make about theoretical concerns through its adoption.
I will argue here, and in chapter 3, that adopting these systems requires a very rich and detailed UG, one that we should be cautious of if we entertain seriously the biolinguistic perspective that calls for a minimally lean UG (Chomsky, 2005). There are two dimensions along which we need to frame our arguments: those against depen- dent case more generally and those against the type of hybrid case approach proposed by Baker. Now that we have worked out enough details to understand how dependent case needs to work under Minimalist assumptions, it is time to investigate the model’s conceptual and theoretical im- plications and evaluate whether the data captured is of enough benefit to concede any theoretical 50 unattractiveness. There are three central topics to discuss: (i) the theoretical impact of abandoning the GB central notion of government, (ii) the high degree of parameterization, and (iii) the nature of dependency establishment. The following sections will outline the theoretical concerns of each. 2.2.2.1 Abandonment of Government In this section I discuss a few conceptual issues that have not yet been thoroughly addressed that arise from the replacement of government as the domain-defining relationship in favor of phase heads and spell out domains. Because the government relation was the defining relationship that dictated the assignment of case features in the original 1991 version, it is not surprising that its abandonment has great effect. When comparing modern dependent case theory to its contemporary agreement-based alter- native or to case theory as it existed under the GB framework (Chomsky, 1981), it is easy to come away with the impression that they are radically different approaches that share very little in common. Although dependent case was indeed a radical new proposal, it didn’t completely upend the system in ways that required us to reconfigure the framework. It employed the same relation – government – to assign case to nominals, an assumption also held in the standard GB case frame- work. What distinguished Marantz (1991) from its contemporaries was that Marantz assumed the case-assigning heads used a different kind of information – configurational information – to make the decisions about which nominals received which specific case features. While the information used to calculate case was different, the central notion of government and the role functional heads played remained primarily the same. As we’ve discussed, with the abandonment of government came the requirement to redefine the conditions under which dependent case applies. Recent updates to these conditions replace the government domain specification with one that defines the domain with reference to phases: case is assigned configurationally to nominals that occur within the same phase. This shift is successful both empirically and theoretically in that it both captures the relevant case patterns and uses the quintessential Minimalist domain that we assume to be at the center of derivation construction. It 51 thus constitutes an intuitive and reasonable way to update what it is that defines case assigning domains. 
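To make the shift concrete, the following minimal sketch, written in Python purely for illustration, shows one way a phase-based dependent case calculation could be stated. The class names, the single English-style accusative rule, and the treatment of unmarked case as a leftover step are simplifying assumptions of my own, not a faithful rendering of Baker's or Levin's formulations; the point is only that the spell out domain, rather than a governing head, is what delimits the calculation.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DP:
    form: str
    case: Optional[str] = None   # to be filled in by lexical, dependent, or unmarked case

@dataclass
class SpellOutDomain:
    label: str       # e.g. "TP" -- the phase-based stand-in for the old government domain
    dps: List[DP]    # DPs contained in the domain, ordered by c-command (highest first)

def assign_dependent_case(domain: SpellOutDomain) -> None:
    # Within a single spell-out domain, a caseless DP that is c-commanded by another
    # caseless DP receives the dependent case (accusative, in the English-style setting).
    for i, lower in enumerate(domain.dps):
        if lower.case is None and any(higher.case is None for higher in domain.dps[:i]):
            lower.case = "ACC"

def assign_unmarked_case(domain: SpellOutDomain, unmarked: str = "NOM") -> None:
    # Anything still caseless when the domain is spelled out receives the unmarked case.
    for dp in domain.dps:
        if dp.case is None:
            dp.case = unmarked

# 'She loves him': both DPs share the TP spell-out domain (v being a soft phase head).
tp = SpellOutDomain("TP", [DP("she"), DP("him")])
assign_dependent_case(tp)    # 'him' is c-commanded by the still-caseless 'she' -> ACC
assign_unmarked_case(tp)     # 'she' is left over -> NOM
print([(d.form, d.case) for d in tp.dps])   # [('she', 'NOM'), ('him', 'ACC')]

Feeding the same routine a domain whose subject already carries lexically assigned dative (the Icelandic pattern in (43)) leaves the object with no caseless c-commander, so it surfaces with the unmarked nominative – exactly the behavior walked through above.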
While it is easy to treat this update as a trivial one, made solely to bring dependent case in-line with modern theoretical assumptions, I argue that this update introduces three non-trivial difficulties: (i) it removes the source of case features in a way that violates the central Minimalist assumption of Inclusiveness, (ii) it removes the ability to make the empirical distinction between unmarked and default cases, and (iii) it results in an inconsistent syntactic conceptualization of case that is both problematic for acquisition and theoretically unattractive, especially if one follows Baker (2015); Levin (2015); Levin and Preminger (2015); Preminger (2014) in assuming case is assigned in the syntax. If the dependent case model applies in the narrow syntax, it is not a trivial detail to figure out how exactly the assignment of case feature values works. So while it is intuitive to say that “case features are assigned”, it isn’t a trivial question to ask “by what”, especially when working in a feature-driven model of the grammar. At minimum, standard frameworks with Minimalist assumptions require that, like all syntactic objects, features must come from somewhere (the numeration) and they must originate on something as they are defined as properties of syntactic objects. The Inclusiveness Condition, one of the most central Minimalist constraints, further refines these requirements (Chomsky, 1995, 2000). The idea is that new syntactic features cannot be added throughout the course of the derivation. They must be generated on some lexical item and cannot enter after this initial selection into the array is made. This achieves two aims: first, it greatly constrains the power of the generative ability of the grammar. If features could be inserted at any stage, we would have to propose a number of additional principles that would further constrain the range of possibilities that unfettered insertion would produce. Assuming the derivation has everything it needs at the beginning is the most minimal assumption in that sense. Furthermore, we can make a connection between this Inclusiveness Condition and the Chomsky-Borer Conjecture (Borer, 1984; Chomsky, 2001). This conjecture is not only widely adopted, but it is also likely true and offers the most principled understanding of how humans could actually acquire these rules and constraints. The strong version of this idea is that all syntactic variation must be visible at 52 the level of the lexical items themselves, since they are the only acquirable/tangible piece to which the language learner has access. What this means for the acquisition of case features and any mechanisms available for case assignment is that we have to clearly understand which lexical items house the individual case features and how those features are assigned from one syntactic object to another. Without the relation of government conditioning which nominals are in the case assigning domain, the actual locus and assignment of case features is not defined. I see three imaginable options: (i) case features are located on V+T or some other relevant case-assigning functional head, (ii) case features are located on the nominals themselves, or (iii) case features are supplied “by the grammar” or some dependent case mechanism. Let’s first consider option (i), that the case features originate on functional heads like V+T, as they do in early versions of this model. 
This option is first problematic in that no modern dependent case proponent could reasonably adopt it without seriously undermining the defining principle of the model. The hallmark advantage of modern dependent case is its ability to remove any dependence on the presence of some case-assigning functional head. Not only is this the primary motivator, it is also the model’s defining characteristic. By locating the derivational origin of case features on V+T as the original version does, we once again revert to a dependence on the presence of those functional heads for case-assignment, negating the primary benefit over the agreement model. It’s reasonable to ask about the original version, which did locate those features on functional heads; this is where the abandonment of the relation government becomes relevant. Say modern versions of dependent case mimicked Marantz (1991) in locating the source of the case features on V+T, putting aside the reluctance to do so because of a hesitance to rely on functional heads. Without access to the government relation, we lose the connection between the origin of the case features and their eventual derivational destination. We thereby create a system where case is conceptualized as the reflection of a relationship between two syntactic objects (the nominals), but the features that signal this relationship come from an uninvolved third-party, obfuscating the very relationship case is supposed to reflect. The third party status of those functional heads is what is at issue here. The GB-era version was able to avoid this 53 conceptual issue because through reference to government, the third party V+T complex wasn’t an uninvolved one. Case-assignment in that early version, although calculated using information about the existence of other nominals, still operated under the assumption that nominals were only under consideration for case assignment if they were governed by the functional head that was the source of those features. Case was simultaneously the reflection of a relationship between a nominal and a governing functional head and a nominal and its government domain-mates. Without the notion of government and with the decision to dissociate case assignment from agreement, it seems untenable to locate case features on any functional head like V+T without introducing some serious conceptual issues. Option (ii) appears similarly untenable. Case is calculated on information about the relationship between two nominals, so it’s not unreasonable to wonder if the origin of case features is the nominals themselves. It should be fairly obvious why this is a nonstarter, but for thoroughness, let’s briefly examine why. If we pursued this option, nominals would be generated with both a valued and an unvalued case feature. It is easy to see how the unvalued instance of the case feature would depend on configurational information, but it’s important to notice that the valued instance of the feature would also require this information, making it difficult for each nominal to ‘know’ what feature to be generated with. In other words, not only do nominals depend on configurations to receive a feature value, they also would depend on those configurations to assign one. Since this configurational information is inaccessible at the point of generation, it is untenable to assume nominals are transferring feature values to one another. 
Finally, we are left with the option that modern dependent case proponents actually adopt: that the dependent case mechanism itself is what assigns case features to nominals. This option is quite attractive upon first look because it mimics other sorts of pre-Minimalist grammatical rules. It follows an operation-type logic where, in the context of a particular structural description, the grammar performs a feature-assigning operation. What is more problematic upon a closer look is that, while familiar, this sort of process constitutes a clear violation of the Inclusiveness Condition – the assumption that new syntactic features may not be introduced into the derivation after their initial selection from the lexicon into the specific lexical array for the derivation under discussion. As properties of lexical items, case features by definition must originate on some lexical item. If they don't, their addition later in the derivation, regardless of the mechanism, introduces exactly the type of dangerous theoretical power Inclusiveness tries to constrain. Violating Inclusiveness is not a trivial matter: the condition greatly restricts the power of the derivation by disallowing unnecessary diacritics, traces, or other convenient theoretical tools that reduce explanatory value, and it forces the derivation to adhere as strongly as possible to interface constraints. It is also important to note that the need to adhere to this condition goes beyond dogmatic obeisance to Minimalism as a program. Violations of the Inclusiveness Condition constitute real issues for understanding how it is that individual humans acquire the syntactic system, as they minimize the tangible linguistic evidence that a language learner can observe. They also raise serious issues for understanding how humans as a species acquired or evolved the capacity for language, as they greatly expand the set of things that must be part of UG. It's possible for violations of the Inclusiveness Condition to be tolerated if we could consider those violations perfect solutions to interface constraints; it is not obvious, however, that the dependent case mechanism could be framed in this way. Since, under a dependent case model, none of the three available explanations for the origins of case features appears tenable, the abandonment of government raises some serious concerns about the tenability of the approach itself, at least as a mechanism that is active in the syntax.

The abandonment of government creates another problem: it removes the ability to draw an important distinction between unmarked case and default case. The inability to draw the theoretical distinction between the two doesn't cause any empirical issues for most languages, whose unmarked case happens to be synonymous with its default case. There are, however, a number of languages like English, Dutch, and Norwegian whose unmarked case is nominative, but whose default case is accusative. Therefore an inability to make a distinction between the two raises an empirical issue for this subset of languages. Since government was the domain-defining condition under Marantz's original proposal, it was possible to draw a distinction between unmarked case (case assigned by the governing head when it didn't assign the dependent case) and default case (case assigned by the grammar when a nominal wasn't governed by a case assigning head).
With the replacement of this relation with the concept of phase domains, we are no longer able to maintain this simple distinction because it is impossible for a nominal to not be in some spell out domain. One might get around this worry by suggesting that the label of the phase head X is what allows us to distinguish between the unmarked and the default cases. This could work quite nicely for examples where the default nominal is outside a domain like TP, presumably in some sort of focus or topic phrase. We could argue that the unmarked case is restricted to the domain defined explicitly by TP spell out. In this way, the unmarked nominative case nominal I that is in the TP domain could be distinguished from the default accusative case nominal Me that is not in the TP domain and is in the topP domain instead.

(45) Me, I love honey.

(46) [tree: the DP Me sits in Spec,topP; the head top0 takes the TP I love honey as its complement]

Where this sort of solution becomes problematic is with examples where default case nominals do appear within the main clausal structure. This makes it quite difficult to argue that we could make a distinction using phase domain labels. Take the gapping example shown in (47). For this discussion, I will assume the proposal suggested in Johnson (2009), shown in (48). There are three parts to his approach: (i) low coordination of the vPs, (ii) heavy NP shift of their objects, and (iii) across-the-board movement of the verb phrases. Johnson assumes that when two vPs are coordinated, that coordination can trigger two separate processes: the rightward shift of the objects outside of their respective VPs and the subsequent across-the-board movement of those VPs to the specifier of a predicate phrase. The subject of the first vP, as the highest DP in the structure, will be the one targeted by the EPP feature of T and will raise to subject position. The subject of the second vP will remain in its original position. It is in this position that it receives default case, as there isn't a canonical accusative case assigner available to assign its case features. In order to distinguish unmarked from default case in the clausal domain, we would need to define a set of domains whereby each of the unmarked cases was assigned. In modern dependent case approaches, these domains are TP for unmarked nominative case and DP for unmarked genitive case. Only outside of these unmarked domains would the default case be allowed to surface. As we can see in the structure below, the default nominal him exists within a TP domain and as such constitutes a clear difficulty in distinguishing it from the nominative unmarked case nominals.

(47) She will eat beans and him rice.

(48) [tree, following Johnson (2009): DP1 she raises to Spec,TP below T will; the across-the-board-moved VP2 eat t3 occupies Spec,PredP; Pred takes the coordination (BP, headed by and) of the two vPs, the first containing the trace of she and its rightward-shifted object DP3 beans, the second containing the in-situ default case subject DP him and its rightward-shifted object DP3 rice]

Another way to address the distinction problem would be to instead propose that we reverse the way the dependent case mechanism operates in English and these other languages such that the default accusative case is aligned with an accusative unmarked case instead of an unmarked nominative case. This would frame nominative case as the assigned dependent case instead. To be clear, this type of solution is entirely within the confines of dependent case theory and so no issues can be raised there. Instead of adopting the canonical dependent case algorithm for accusative case (49a), we could change the values of the parameters to reflect this reversal (49b).
(49) a. If DP2 c-commands another DP1 in the spell out domain TP, assign accusative case to DP1, provided DP2 does not already have case.
     b. If DP2 c-commands another DP1 in the spell out domain TP, assign nominative case to DP2, provided DP1 does not already have case.

Doing so would successfully align the unmarked and the default cases while obeying the rules of the dependent case model, but would require a theoretical departure in how we understand how case features and categories relate to one another. While it is unproblematic to assume case feature inventories vary cross-linguistically, it is far more problematic to assume that the inherent relationships between these features differ, which is what this reversal would require. For languages where nominative is both the unmarked and default case, we would need to assume that the features responsible for nominative case are less specified than those for accusative case. Adopting the proposal that features have inherent hierarchical structure (Harley & Ritter, 2002) means that whatever features are responsible for nominative would dominate those responsible for accusative. For languages where accusative is both the unmarked and default case, we would be forced to assume the opposite. Reconciling how both are true would require that we assume case features do not have any inherent feature structure, setting them apart from how we assume all other syntactic features are organized. This would seriously undermine one of the hallmark benefits of adopting dependent case: its ability to uniformly account for varied cross-linguistic morphological patterns using the same mechanism. Once again, the abandonment of government as a domain-defining relation has made the modern extension of dependent case more problematic than it first appeared.

As an aside that we'll discuss later in chapter 4, it is also worth exploring what the distinction between unmarked and default case means for the relationship between the individual features that make up the various case categories. This is an arena where we haven't connected up the insights made by those whose primary focus is on morphological case and those whose focus is on syntactic case. Decomposing case categories into individual case features has its roots in the literature on the various case syncretic patterns observed cross-linguistically (McFadden, 2007; Müller, 2004a, 2004b, 2005). The main idea is that if two cases are syncretic in a language, they must share some set of case features with one another, while maintaining enough of a distinction in case features from the other cases that are not syncretic. This decomposition is incredibly standard in the morphological case literature, despite the lack of a general consensus on exactly what those features are. Conversely, in the syntactic literature, it is uncommon to discuss case features as singular units that comprise case categories, despite an implicit acknowledgment that this is of course true (Pesetsky, 2013). Instead, we talk about assigning nominative case as a whole, failing to connect up the individual parts that make this happen. Under a dependent case approach, we need to better understand how those individual features that are responsible for nominative case or accusative case are understood to operate. McFadden (2007) addresses this point explicitly and we'll return to this discussion in chapter 4 where I introduce an alternative proposal that addresses these issues.
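To fix ideas in the meantime, a toy illustration may help. The sketch below is mine alone: the feature labels A, B, C, D and the overlap-based notion of syncretism are illustrative assumptions, not claims drawn from McFadden or Müller. It simply shows how treating case categories as bundles of features makes syncretism fall out from feature overlap, and why one then has to ask which of those features a given assignment mechanism actually transfers.

# Hypothetical decomposed case features; the labels A, B, C, D are placeholders for
# whatever the real features turn out to be, not a claim about their identity.
CASE_CATEGORIES = {
    "NOM": {"A", "B"},
    "ACC": {"B", "C"},
    "DAT": {"C", "D"},
}

def syncretic(case1: str, case2: str) -> bool:
    # A deliberately simplified view of syncretism: two case categories can share an
    # exponent if their feature sets overlap.
    return bool(CASE_CATEGORIES[case1] & CASE_CATEGORIES[case2])

print(syncretic("NOM", "ACC"))   # True  -- they share feature B
print(syncretic("NOM", "DAT"))   # False -- no shared feature

# The question flagged above, taken up again in chapter 4: if accusative is assigned by
# the dependent case mechanism and nominative by agreement or unmarked case, which of
# these individual features does each mechanism transfer?  On the simplest answer -- the
# whole bundle -- the grammar ends up with two independent routes to the same feature B.
print(CASE_CATEGORIES["NOM"] & CASE_CATEGORIES["ACC"])   # {'B'}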
Finally, modern versions of dependent case, without access to this government relation, result in a great deal of conceptual inconsistency when it comes to understanding what it is in the syntax that case morphologically reflects. Under a more standard agreement-based approach, case can be conceptualized as the reflection of a relationship between a nominal and a functional head that is formed when agree establishes a dependency between those two syntactic objects. It’s interesting to note that this is also true of the original dependent case proposal. Even though dependent case uses configurational information to calculate which nominal receives which case, nominals in the original proposal did receive case from V+I under government. Case was therefore a reflection of the same functional head/nominal relationship – the V+I complex cannot assign case to nominals 59 that it does not govern. Under either a hybrid model or a strict modern dependent case model, the syntactic relationship that case reflects is much less clear; case does not seem to have a consistent conceptualization beyond a way for the grammar to distinguish nominals. When case is assigned lexically, as it is in (50) where the subject is marked with quirky dative, it is understood to be the reflection of a ‘special’ relationship between a nominal and a ‘special’ verbal head. In the dependent case model, this is unexpected because one of the central principles is the avoidance of a reliance on functional heads for case assignment. Case is modeled as the reflection of a relationship formed between two nominals, not between a nominal and functional head. To address concerns about whether this would constitute the kind of dependence on functional heads that proponents of dependent case passionately avoid, Levin and Preminger (2015) argue that the sisterhood relationship formed by merge is local enough to obviate the worry. (50) studentum student.pl.dat Morgum many ‘many students like the job’ liki/*lika like.3sg/*3pl verkið job.the.nom (Harley, 1995) This doesn’t however, address the conceptual inconsistency between lexical and dependent case that exists even in the strict dependent case model. Sisterhood, while perhaps a more palatable reliance on functional heads that dependent case proponents might tolerate, is still a relationship between a nominal and a functional head. The issue here isn’t about how the relationship becomes established and whether or not that constitutes a reasonable exception, but rather that the sisterhood exception exists in the first place. Under a dependent case model, quirky case is quirky because it reflects an unexpected case relationship, one between functional heads and nominals that explicitly doesn’t exist elsewhere in the system. Under a more standard approach, quirky case reflects an expected relationship – since all case reflects a relationship between nominals and functional heads – but with an unexpected functional head. It appears that the existence of quirky case is more unexpected under the dependent case model than it is under an agreement-based model. With respect to accusative or ergative case assigned via the dependent case algorithm, the 60 relationship case reflects is even more unclear, given our discussion above about where to locate case features. 
If the source is the functional complex V+T, then we could maintain consistency at least between lexical and dependent case, but with the existence of the issues raised above and given that modern researchers are reluctant to depend on the presence of a functional head, this doesn’t seem likely to be the case. If the source is either the nominals themselves or somewhere undefined in the grammar via an operation, we can reasonably conclude that case is the reflection of a relationship between two nominals. This signals a system where sometimes case is the reflection of a relationship between a functional head and a nominal (for lexical case and/or dependent case, dependent on the feature source) and sometimes the reflection of a relationship between two nominals. Additionally, unmarked case is the reflection of no relationship at all. So even within a strict dependent case model where there is no agreement-based case, case is not a conceptually consistent entity. This status is even more true of a hybrid model. At this stage, it’s reasonable for one to ask: why would this conceptual inconsistency be a problem? Perhaps all case is is the reflection of the need for the grammar to distinguish nominals in some arbitrary way. One could propose a function-based explanation where case distinctions help aid communication in some meaningful way. If true, then case doesn’t need to consistently reflect the same sort of syntactic relationship; it simply needs to reinforce the fact that the nominals in question are syntactically distinct from one another. Examples shown in (51a) and (51b) however illustrate that if case differentiates nominals to aid in communication, the grammar doesn’t appear to reinforce those functional distinctions in a consistent way, undermining that there is a system of case at all. Furthermore, this type of conceptual inconsistency makes the acquisition of case incredibly difficult to understand because it undermines that there’s a system to acquire in the first place. (51) a. b. She expected him to hug them. She hoped he would hug them. The removal of government as the domain-defining relationship therefore introduces three main 61 issues: (i) it creates a system where case assignment constitutes a violation of a crucial Inclusiveness Condition, (ii) it creates a system where we cannot maintain an empirically needed distinction between unmarked and default case, and (iii) its conceptual inconsistency undermines the existence of a case system itself. While these issues may turn out to have solutions, there is benefit to being explicit about what theoretical concessions we must adopt by adopting the dependent case system. 2.2.2.2 Parameterization Modern dependent case captures a wide range of data through a high degree of parameterization. The implications that we must adopt by pursuing this approach will be the focus of this next section. I certainly don’t want the reader to infer that the high degree of parameterization is a deficiency of the model on its own. It is a clear fact that the high degree of cross-linguistic variation in case patterns is daunting and Baker’s model, with its high degree of parameterization, allows us to account for a widely disparate number of patterns, while constraining what those options are. This is empirically attractive and gives us lots of insight into how case patterns can be accurately modeled and predicted. What follows is an exploration of some of the questions that such a high degree of variability raises. 
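Before turning to those questions, it may be useful to see the parameterization itself laid out schematically. The sketch below is purely illustrative (the function and parameter names are mine): it shows how independently switching the two dependent case rules in (37) on or off yields the four logically possible alignment types described above.

def transitive_case_frame(rule_37a_active: bool, rule_37b_active: bool):
    # Given the two parameters -- whether the 'mark the c-commanded NP' rule (37a) and
    # the 'mark the c-commanding NP' rule (37b) are active -- return the case frame of
    # a simple transitive clause as (subject, object).  Whatever receives neither
    # dependent case surfaces with the unmarked case.
    subject = "ERG" if rule_37b_active else "unmarked"
    obj = "ACC" if rule_37a_active else "unmarked"
    return subject, obj

settings = {
    "nominative/accusative": (True, False),
    "ergative/absolutive":   (False, True),
    "tripartite":            (True, True),
    "no case morphology":    (False, False),
}

for alignment, (acc_on, erg_on) in settings.items():
    print(alignment, "->", transitive_case_frame(acc_on, erg_on))
# nominative/accusative -> ('unmarked', 'ACC')
# ergative/absolutive   -> ('ERG', 'unmarked')
# tripartite            -> ('ERG', 'ACC')
# no case morphology    -> ('unmarked', 'unmarked')

Note that these two switches are exactly the kind of external, rule-level parameters discussed in what follows; nothing about the schema itself forces these to be the available choices.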
Parameters on their own are of course not problematic. It’s not controversial to assume something like a head parameter, for example, where merge can choose on which “side” to locate a head. Notice that with a parameter of this type, the parametrical choice is more or less internal to how it works. By this I intend to mean that the head parameter does not have to exist as an external rule guiding how merge can apply; it follows from the logical set of possibilities available. Not all hypothetical parameters necessarily share this property, however, and we must therefore be careful to consider the types of parameters we allow and how much power we grant them. This is especially hard to do in the face of such empirically varied phenomena like case marking, as the high degree of variation alone invites the proposal of a highly varied set of external parameters. The types of parameters proposed in modern versions of dependent case are of a type that should be at least a little concerning if we intend to pursue a Minimalist-aligned theory because they, unlike 62 a head parameter, don’t appear to follow naturally from how the system works in quite the same way. The most obvious one of these would be the high-level parameter of which case assigning mechanism is chosen. While it’s certainly possible that languages can decide to assign nominative case either via agreement or via the unmarked case part of the dependent case mechanism, it’s not in any way obvious why these are the two choices available. Parameters modeled like this are thus conceptualized in a way that makes them an external sort of parameter – that operates over a particular operation – rather than a parameter that is derived from an independent, logical set of possibilities. Additionally, in a theory with either a large number of parameters (or rules, as we’ll see in the next chapter), we also have to understand how those parameters are organized in a way that makes their acquisition as a consistent set plausible. What the language learner ends up acquiring is an entire set of parameters that make up the system as a whole, despite the fact that they exist as independent ‘rules’. In order for language learners to consistently acquire all of the parameters in the correct way, and not for example, miss one or set one in a way that it conflicts with a previously set parameter, there must be some sort of relationship between them that guides this acquisition. Furthermore, a high number of external parameters also causes some problems in understanding solutions to what’s termed Darwin’s problem, the question of how language arose so quickly in humans (see Hornstein (2018) for a review). The solution that guides the Minimalist program is that what cannot be explained by either the environment or non-language specific cognitive systems must be innate to humans and in order to understand how these innate features evolved so quickly in humans, we must assume they are quite minimal in number. Systems that propose a long list of external parameters or other rules deeply violate this assumption by making UG quite rich. There are also implications that follow from how we want to understand the particular features that are involved in this case system. Since Baker allows assignment-based case and configurational case to co-exist not only in UG more broadly, but within in a single language, we have to ask what this variation means for the status of case categories in the grammar. 
It’s fairly standard to conceptualize case categories as a middle-man type of label that we give to groups of features that appear to behave 63 the same way (Pesetsky, 2013). Once we couple this sort of variation with a decomposition of case features, it becomes incredibly difficult to see how this variation would actually be implemented. There are two problems here: one is that by doing so, we elevate the status of case category to a type of syntactic object that the grammar is aware of, which conflicts with more standard understandings of what case categories are, conceptually. The second is that when coupled with a decomposition of case features, it’s not entirely clear how the grammar would be able to implement this sort of variation. Say, for example a language assigned nominative case via agreement, but assigned accusative case via the dependent case mechanism. Also assume that nominative case is made up of a set of individual case features, some of which are shared by the set of individual case features that make up accusative case and others that are not shared. For exposition, let’s assume nominative is comprised of case features A and B, while accusative is comprised of case features B and C. The shared feature B is what allows the two cases to be syncretic and those that aren’t shared, A and C, are what allows us to maintain a distinction. When we try to plug this decomposed case feature system into the mechanics of how case assignment is intended to work, we run into an important question: which of the independent case features that make up accusative case is the dependent case mechanism actually assigning? The most straightforward answer might be features B and C, the entire feature set that makes up accusative case. Likewise, it would follow that the agreement operation responsible for nominative case would be capable of assigning features A and B. However, if true, this means that the language in question has two independent mechanisms of assigning the same case feature B. If instead, the dependent case mechanism assigns only feature C, then we of course have to wonder how feature B appears on the nominal. We’ll discuss this point in more detail without a toy sort of example in chapter 4, but for now the point is simply that the treatment of case categories in the modern version of dependent case becomes problematic when one tries to reconcile it with conclusions about feature composition that come from morphological research on syncretism. It is also worth asking what it means that some of the parameter settings appear to be forced into a particular way. I have two of these in mind. The first is the assignment of ergative case. Baker 64 argues that ergative case is never assigned via agreement, only via the dependent case mechanism (he also rejects the idea that it’s a lexical case). Not only does this require that the grammar is sensitive to the case categories themselves, but it also raises an important question: why for this case is the parameterization forced in one direction? In order to answer this question, we’d likely need to understand what is present in ergative data that forces the language learner to consistently set the ergative case assignment mechanism parameter. Furthermore, whatever this observable-to- the-learner data ends up being must be distinguished from the other cases for which this parameter is not fixed. 
The other fixed parameter setting comes from a potential solution to the unmarked vs. default problem: perhaps for languages where the default is morphologically distinct from the unmarked, the parameter for unmarked nominative case assignment must be set to the agreement setting. More generally, these questions become: what does it mean for a system to exist with a set of parameters, only to have some range of them inaccessible for particular cases? Many of these are largely conceptual issues, but I argue that they are ones that are important to explicitly consider when entertaining the adoption of such a radical proposal. Despite the impressive empirical coverage that dependent case admittedly offers, it's important to understand what theoretical concessions we're making in its adoption.

2.2.2.3 Dependency Establishment

Finally, we come to issues regarding the status of case assignment under the umbrella of syntactic dependencies. Like ϕ-agreement, case assignment can be viewed as a syntactic dependency in that it involves one form being dependent on the characteristics of another syntactic object, and this dependency is based on structural relations. Relevant to this domain are the set of operations that we assume to be capable of establishing various dependencies in the grammar. Because discussions of dependent case are often (and reasonably, I might note) restricted to the domain of case, it is often treated as an alternative model of case valuation. Within the boundaries of the case literature it absolutely is, and all the work cited in this section has pitted agreement-based models against dependent case ones and let the data battle out the strengths and weaknesses. Through this exercise, though, it's easy to think that these two models are more or less equal when it comes to framework complexity; they differ only in their predictive empirical coverage and/or theoretical implications. However, when we expand our purview to a larger domain of phenomena, namely the establishment of dependencies more generally, the complexity of the two models we are forced into adopting is no longer equal. In both ϕ-agreement and case, we assume that the grammar, through some method, establishes dependencies between different syntactic objects. However, it's important to note that those who adopt dependent case explanations for the morphological forms of nominals still adopt agreement-based models for establishing ϕ-agreement dependencies. In this way, dependent case is not purely an alternative model of case valuation, as it's often advertised, but an additional method of dependency establishment. Agreement-based models of case valuation do not require the assumption of an additional strategy for establishing dependencies between different syntactic objects because both are reflexes of the same operation. To this end, we can level a reductionist argument against dependent case to the extent that, under a Minimalist program, we should seek to minimize the number of operations and strategies that the grammar has available. Adopting dependent case requires the addition of a separate method of dependency establishment – a function the agree operation already readily performs.

2.3 Separation of Case from Licensing

As we saw in the previous section, dependent case theory allows for default case forms to surface when the algorithm has failed to assign them either lexical, dependent, or unmarked case.
Clearly, this cannot be maintained in a framework where the failure to get case is fatal to the derivation, otherwise there would be no mechanism to prevent default case forms from being erro- neously inserted into derivations where case has historically borne the theoretical explanation for ungrammaticality. A configurational approach to morphological case valuation therefore requires that case play no role in regulating the requirements that govern nominal licensing. This could be implemented in a few different ways: (i) it could mean that we eliminate nominal licensing requirements entirely (McFadden, 2004; Preminger, 2014) or (ii) it could mean maintaining those 66 requirements, but proposing that they’re handled by something other than case (Levin, 2015). Both of these options are quite radical in that they upend central assumptions that mainstream theoretical syntax has held for over 40 years. 2.3.1 Motivations In addition to the existence of default case, there have actually been a number of other things that have motivated researchers over the years to completely recast basic tenets of case theory. On the whole, they can be categorized as the recognition of the increasingly diminished role case is assumed to play in nominal distribution. In the early days of GB (Chomsky, 1981), case was assumed to be responsible for a host of disparate distribution facts and was argued to be one of the primary drivers of sentence construction.10 The need for case was what primarily drove movement (52a), what prevented superfluous movement (52b), what explained the inability of non-finite clauses to host overt subjects (52c), and what explained the distribution and form of nominals in passives (52d) and unaccusatives (52e), among other things.11 (52) Johni is likely ti to win the race. a. b. *Johni is likely that ti will win the race. c. *It is likely him to win the race. d. e. Johni was invited ti. Johni arrived ti. Modern syntactic theory has since added a few theoretical tools that have greatly reduced the theoretical load that case carries (see (Levin, 2015; McFadden, 2004) for a detailed summary of these issues). With the adoption of the EPP feature, movement to the subject position in passives and unaccusatives no longer needed to be tied to the need for case on the moved argument. All clauses seem to require a subject and the feature responsible for encoding this need is what drives 10See Culicover (1997) for a summary. Also see Baker, Johnson, and Roberts (1989); Chomsky (1973, 1980). 11This is not intended to be an exhaustive list, just a summary of some of the big facts. 67 the movement of the highest argument in the clause to the specifier of TP position. In this way, the theoretical work that case performed in this arena could be greatly reduced as it overlapped significantly with the EPP feature’s role. Likewise, the adoption of phase heads and spell out domains further reduced the role of case in regulating superraising. Compare (53a) with (53b). The nominal John is able to move out of the embedded clause to the matrix subject position in (53b), but is unable to do so in (53a). (53) a. *Johni is likely (that) ti will be sick. b. Johni is likely ti to be sick. The case-dependent explanation for this was that because the nominal John receives nominative case from the embedded T in (53a), it is in effect “frozen” and therefore unavailable for further movement to another case position. 
Modern syntactic theory can rule out examples like (53a) through reference to the phase impenetrability condition (Chomsky, 2000) which disallows the movement of syntactic objects across spell out domains. The idea here is that the merger of a phase head triggers the spell out of its complement, rendering it inaccessible to further syntactic operations, with the exception of syntactic objects in its left edge position. So what disallows the movement of John in (53a) is that doing so would involve a movement across phase domains, one that is disallowed by the grammar. Because the embedded clause in (53b) is assumed to be a TP, rather than a CP – and thus not a spell out domain – the PIC does not prevent this movement. 2.3.2 Implications If this varied set of distribution facts is no longer solely captured through a need for case, then it’s reasonable to wonder if there is any role for abstract case to play at all, especially when we additionally consider the default case data discussed earlier. Many researchers have taken on this question, especially those who are inclined to prefer a dependent case model of case valuation (Levin & Preminger, 2015; McFadden, 2004; Preminger, 2014). Because dependent case models have been able to achieve an impressive scope of empirical coverage and their adoption depends 68 on being able to remove case’s role in regulating nominal distribution, the separation of case from licensing has recently seen an increased focus. Arguably, where we see the biggest impact of case today with respect to nominal distribution is where nominals can’t appear, rather than where they can. The primary focus here is on the distribution of nominals in non-finite clauses. While this might be considered the “last bastion” for case theory, it’s an arena where case still plays an important role and one in which we’ve not yet proposed an appropriate replacement. Among those who adopt a dependent case model, Levin (2015) is alone in maintaining that nominals still have a licensing requirement and that failing to meet that requirement is fatal to a derivation. He even maintains that case plays a role in this, albeit indirectly. He replaces the standard case filter shown in (54a) with the proposed alternative in (54b). (54) a. Standard Case Filter A nominal is licensed if and only if its unvalued case feature has received a value at spell out. Proposed Case Filter b. Noun Phrases must be KPs. Levin argues that what dictates the ability of a nominal to appear in a particular position is its size: all nominals must be of size KP; they must include a K projection in order to be licit in the structure. He ties this to a grammatical requirement that all phrases include their maximal projections, arguing that the maximal projection for nominals is KP. He ties this idea to the original notion that case plays a role by arguing that this K position is the position that houses case features. In this way nominals need case not because they need case directly, but because they are required to be a big enough size where they include the projection that houses those features. He includes two additional “escape hatches” where nominals are licit, despite not being generated size KP: (i) the ability of nominals to late adjoin a K head and (ii) the ability of some nominals to adjoin to other nominal elements that include that maximal projection KP. The details of these escape hatches are 69 tangential to what I consider a minor criticism, so I direct the reader to Levin (2015) for a thorough discussion. 
What is theoretically unattractive about this style of approach is that it doesn’t appear explanatory in the same way that a more standard case account might be. It more or less argues that the reason nominals are licensed is that they are simply generated licensed. It does however pave the way for us to adopt dependent case, which we’ve discussed at length in section 2.2. A more direct criticism would be that this sort of proposal does not address the data in (55)-(56) which remains squarely in case’s domain and Levin must follow others who offer independent reasons unrelated to nominal size or case to explain these distribution facts. For this discussion, I’d like to focus on two of these examples that do in fact have reasonable alternative explanations for their ungrammaticality. Under the traditional case story, what rules out both (55) and (56) is that the non-finite subject DPs him and her have failed to receive a case value, since non-finite T is not a case assigner and there is no other available source. Both McFadden (2004) and Levin (2015) have proposed similar alternative explanations for the ungrammaticality we see here. (55) *John hoped him to win the lottery. (56) *It is likely her to leave the party early. The alternative explanation offered for (55) is that the ungrammaticality is not due to the inability of him to appear as a non-finite subject, but rather due to the inability of the complementizer for to be unpronounced. This solution is intended to be an extension of the Empty Category Principle and draws its inspiration from a similarity with the that-trace effect (Chomsky, 1981; Perlmutter, 1971; Stowell, 1981), shown in (57). The that-trace effect describes a generalization that the complementizer that is unable to appear overtly when it is followed by a trace. This was extended more broadly to be a generalization that the complementizer that must be dropped when followed by phonetically null subjects. (57) a. Whoi do you think ti kissed Mary? b. *Whoi do you think that ti kissed Mary? 70 Citing a similar distribution to that McFadden and Levin suggest that a similar treatment can be extended to the complementizer for. (58) Complementizer optionality a. b. I would like (for) him to buy the book. I believe (that) he bought the book. (59) Obligatoriness in CP subjects a. b. [*(For) him to buy the book] would be preferable. [*(That) he bought the book] would be preferable. (60) C0-trace effects a. Whoi do you think (*that) ti bought the book? b. Whati do you think that he bought ti? c. Whoi would you like (*for) ti to buy the book? d. Whati would you like for him to buy ti? (Levin, 2015) The idea is that like that, for is also banned from appearing overtly when it is followed by a phonetically null subject. So while the standard case-based account would rule out (61a) on the grounds that PRO is unable to receive null case from an empty complementizer, the ECP version would argue that what explains the ungrammaticality of (61a) is the failure of the complementizer for to be dropped when preceding a phonetically null nominal, PRO as required. (61) a. *John hopes for PRO to leave. b. John hopes for him to leave. c. *John hopes him to leave. d. John hopes PRO to leave. While this particular explanation can directly account for the distribution of for in examples like 71 (61a), where one needs to understand why the null version of for is required, it says nothing about examples where we need to understand why the overt version of for is required, as shown in (61b)-(61c). 
(61c) cannot therefore be ruled out due to an ECP violation, but instead must be ruled out through other means. What one would have to argue here is that [him to leave] in (61c) is a TP and that the verb hope is the kind of verb unable to take TP complements. Note that this also requires assuming that [PRO to leave] in (61d) is a CP. This of course is possible, but given that the tenability of this account depends on assuming a particular structure for the ungrammatical sentence in (61c) (and also for the grammatical sentence in (61d)), it wouldn't be unreasonable to lay an ad-hoc criticism against this approach. For one, it's not clear that we couldn't instead assume, as is more standard, that [him to leave] in (61c) is a CP with a null complementizer, especially when one assumes exactly that structure for (61d). This sort of ECP extension also requires a less modern understanding of complementizer-trace effects, and it is not clear that it could even be extended to Minimalist frameworks in the way McFadden and Levin intend. As Pesetsky (2017) notes, modern understandings of the mechanisms behind complementizer-trace effects do not focus on whether the complementizer is overt or not, as GB-era versions did (and as the explanation above requires), but instead are about whether T-to-C movement is possible and/or obligatory. Pesetsky and Torrego (2001) offer an account of complementizer-trace effects that argues that syntactic objects like that and for actually originate in T and eventually move to C if they are attracted by tense feature probes on C. They assume C can have two probing features: one that attracts tense features and another that attracts wh-features. In a sentence where the wh-phrase is not in subject position, these probes will find matching goals on different syntactic objects – the tense head and the wh-phrase respectively. The tense feature probe will find a goal in the T head, thus triggering the movement of T to C, as it does in (62a). If that or for is what occupies that position, then that or for will be able to move to C and thus occupy a position to the left of the subject. The wh-feature probe will agree with and trigger the movement of the wh-phrase the question targets.

(62) a. [CP Whoi do you think [C thatj [TP Sue tj met ti ] ] ]?
     b. [CP Whoi do you think [C [TP ti met Sue ] ] ]?
     c. *[CP Whoi do you think [C thatj [TP ti tj met Sue ] ] ]?

However, Pesetsky and Torrego (2001) argue that if the wh-phrase occupies the subject position, it is capable of valuing both the tense feature of C and the wh-feature of C, given its position in spec TP. Because the subject position constitutes a more local goal than T itself, this has the result of blocking the T head from being attracted by the probe and thus prevents T-to-C movement (62c). Since they assume that that and for originally occupy T, this will capture the inability of that or for to precede a subject trace. (See Pesetsky & Torrego, 2001, for the details of their approach.) Both McFadden (2004) and Levin (2015) admit that the rules and principles that govern when complementizers can be overt or must be null are still very poorly understood, and they don't offer many details beyond the comparison with that for how the complementizer effect might work. What's important about the proposal in Pesetsky and Torrego (2001) and the overview presented in Pesetsky (2017) is that they show that the ECP is not maintained in Minimalism as it was originally formulated.
The basic patterns it intended to capture are argued to be better captured via the relationships between probes and goals and the sorts of constraints we assume there to be on movement, rather than via a type of filter that dictates whether or not complementizers should be overt. This makes an ECP-dependent explanation of the non-finite data discussed in this section quite untenable, at least under Minimalist assumptions. Furthermore, the data in (63) also seems to suggest that the comparison between that and for on which McFadden and Levin center their hopes for an ECP extension isn't as strong as one would need if one wants to place the locus of explanation on similar behavioral patterns. What (63b) and (63d) show is that the null versions of that and for appear to have different behavior. There are two possibilities here: either both (63b) and (63d) involve null complementizers and we need to understand how and why they behave differently, or the embedded clause in (63b) is a CP, while the one in (63d) is a TP, and we need to understand why the two have different structures. The first possibility is unattractive in part because it is not at all obvious why two null complementizers should have such different requirements. Furthermore, if one depends on those differences to be able to abandon a long-held central tenet of the grammar like case, then we should have a much better understanding of what those differences are before committing to doing so. The second possibility is also unattractive, as we've discussed earlier in this section, because it is essentially a stipulation – one that appears odd given the similar behavior and structures in (63a)-(63c). Notice that a case-based explanation here can single out (63d) as the unique member quite easily; it is the only example in (63) where the embedded subject is unable to receive case.

(63) a. it's possible [CP that he left ].
     b. it's possible [CP ø he left ].
     c. it's possible [CP for him to leave ].
     d. *it's possible [CP ø him to leave ].

For these reasons, I argue that an ECP account isn't a tenable replacement for case-theoretic explanations of nominal distribution in non-finite clauses. The ECP isn't really extendable in the intended way to Minimalist frameworks because we've since reframed how to conceptualize complementizer-trace data, it doesn't explain the obligatory presence of for in structures like (63c)-(63d), and it leaves questions about either differences in null complementizers or stipulated structures unanswered. I think this validates some real skepticism about whether abandoning case's role in regulating nominal distribution is a reasonable departure.

Moving to (56), repeated below as (64), the standard approach rules out this example on the grounds that the DP her is unable to receive case from the embedded non-finite T, also signaling a preference for move over merge (Shima, 2000). The alternative that McFadden (2004) and Levin (2015) propose is that (64) is instead a violation of requirements on what sorts of things are qualified to serve as an associate of the expletive it. Levin (2015) explicitly proposes that the embedded clause in (64) must be a TP and argues that TPs are unable to serve as associates of the expletive. Since the only potential associate for it in (64) is the TP [her to leave the party], as shown below in (65), the resulting sentence is ungrammatical.

(64) *It is likely her to leave the party early.

(65) a. It is likely [for her to leave the party early].
     c. *[TP her to leave the party early] is likely.
     d. [CP for her to leave the party early] is likely.

Like the ECP argument outlined above, an appeal to expletive association requires assuming a particular structure for the ungrammatical sentence shown in (65b), one that is not obviously correct. There is an equally plausible alternative structure for this sentence that includes a null complementizer in the C position, a structure that is inconsistent with the proposal offered. Without clear motivation for selecting the TP approach, this explanation reduces to stipulation. When coupled with the ECP arguments above, I argue that we have not yet been offered a truly viable alternative to the broad theoretical insights classical case theory has offered, and thus its abandonment is premature.

2.4 Conclusions

With this discussion in our rear-view, it's important to take time to summarize where we are. In this chapter, I hope I have provided enough reasons for the reader to be more skeptical of adopting a dependent case model and, by extension, its required precursor – the separation of case from licensing. My aim has been a modest one. With a modern and incredibly detailed account of how dependent case could operate in a grammar that is consistent with a traditional Minimalist framework, it is time to ask what the conceptual and theoretical implications of adopting this proposal are. I have provided some arguments, both empirical and conceptual, that suggest that the exploration of these issues gives us reason to pause and either go back and adjust the dependent case model to address the issues raised or to reject the system altogether and attempt to address those issues with a more standard approach. In this thesis, I intend to pursue the latter option, but hope the former is also taken up by those more inclined.

CHAPTER 3
OBLIGATORY OPERATIONS

3.1 Introduction

This chapter is similar to chapter 2 in that it also attempts to explore the empirical and conceptual implications that adopting an alternative approach to failed valuation has on our understanding of the grammar. This chapter focuses on this issue in the domain of ϕ-agreement. As discussed in chapter 1, the existence of default agreement raises some interesting problems related to the valuation of ϕ-features and the relationship between agreement and grammaticality. As with case, the crux of the issue is that the existence of default agreement raises questions about how the functional heads that fail to establish a ϕ-agreement relationship survive to be spelled out without causing the derivation to crash. A perfectly reasonable way to address this issue is to modify our assumptions in a way that allows for the failure of agreement, meaning we must completely recast the grammatical conditions on ϕ-agreement. Preminger (2014) does exactly this by providing an alternative model of the grammar which encodes grammaticality requirements not in terms of the success of operations, but in terms of their initiation. If an operation is triggered in the derivation, the grammaticality requirements are met. The grammar treats failed agreement as a reasonable outcome of the operation, so long as it was initiated upon the creation of its structural description. This model is quite radical in that it upends a large set of standard theoretical assumptions largely held by mainstream syntactic theory since Chomsky (2000, 2001) about what drives derivations.
This chapter will provide an overview of what the grammar might look like if one adopts an obligatory operations model and will evaluate the implications of the proposal.

3.2 Obligatory Operations

The data used to frame the obligatory operations approach is from Hebrew and is shown below in (1), what Preminger (2014) calls gratuitous nonagreement. What (1), taken together, shows is that ϕ-agreement is the kind of operation that must apply if it can. A general explanation for the ungrammaticality in (1b) is that it is a reflex of the failure of the operation behind ϕ-agreement to have applied, reinforcing the characterization of ϕ-agreement as an obligatory operation. Had it applied in (1b), it would have caused the 'correct' ϕ-features to surface, as they do in (1a).

(1) a. ha-necig-im dibr-u
       the-representative-pl spoke-3pl
       'The representatives spoke.'
    b. *ha-necig-im diber
       the-representative-pl spoke(3sg.masc)
    (Preminger, 2014)

This explanation is consistent with a number of frameworks. The question becomes: what, specifically, bears the primary theoretical burden of enforcing the obligatory nature of ϕ-agreement? In a framework that uses the inability of uninterpretable features to survive to the semantic interface to drive derivations, the failure of the agreement operation would cause those features to remain unvalued and therefore a derivation crash would be expected. Preminger calls these unvalued features derivational time bombs and this is how he characterizes the modern standard approach that came out of the work first advanced in Chomsky (2000, 2001). It's important to clarify here that it is entirely possible, and is quite common actually, to assume that even within a derivational time-bombs approach, probes immediately begin their search upon merge into the derivation. The point Preminger makes is that it is not the time-bomb nature of unvalued features at the interfaces that is driving the distinction in (1); it is the immediate and automatic probing.

Preminger argues that the best way to model the obligatory nature of ϕ-agreement, and potentially by extension other syntactic phenomena, is to instead propose that there are syntactic operations that are automatically, obligatorily, and immediately triggered upon the creation of the respective operation's structural description. We can view the obligatory operations proposal as an attempt to reduce the theoretical complexity of the grammar by attempting to place the entire burden on the immediate and obligatory triggering of the operation behind ϕ-agreement, without reference to the time bombs themselves. In a standard model where both the immediate probing and the presence of uninterpretable features are used to explain grammaticality distinctions, it would be useful, if possible, to reduce the theoretical burden so that it only depends on one of those. Data like the default agreement data from Hindi-Urdu discussed in the introduction to this thesis (2) and failed agreement data that we will discuss shortly are used to push forward the alternative. What will unify these examples is that each involves the failure of the agreement operation to successfully transfer ϕ-features from goal to probe and the subsequent insertion of default feature values. What this data shows is that despite the failure to value the relevant uninterpretable features, the resulting sentence is perfectly grammatical.

(2) a. Mona amruud khaa-tii thii                              subject agreement
       Mona.f guava.f eat.hab.f be.pst.f.sg
       'Mona used to eat guava.'
    b. Ram-ne imlii khaa-yii thii                             object agreement
       Ram.m.erg tamarind.f eat.pfv.f be.pst.f.sg
       'Ram had eaten tamarind.'
    c. Mona-ne is kitaab-ko parh-aa thaa                      default agreement
       Mona.f.erg this book.f.acc read.pfv.m.sg be.pst.m.sg
       'Mona had read this book.'
    (Bhatt, 2005)

Preminger concludes that this data shows that it is not the inability of unvalued features to survive that drives the obligatory nature of ϕ-agreement. Instead he proposes that what's behind both the grammaticality of (1a) and the ungrammaticality of (1b) is the existence of obligatory operations that do the actual transferring of features. It is the failure of an operation to be triggered when its structural description is met that causes ungrammaticality. The grammar completely tolerates an operation that is triggered but subsequently fails to culminate successfully.

First we'll survey the data used to advance the obligatory operations approach, with detailed explanations to follow. The primary data that Preminger uses to argue for obligatory operations comes from Kichean, a member of the Mayan language family that exhibits ergative-absolutive agreement alignment. Intransitive subjects show agreement with the verb and use an absolutive marker to do so (3). Transitive subjects agree with verbs and use an ergative marker (italicized), while transitive objects use the same (bolded) absolutive marker used in intransitive clauses (4).1

(3) a. x-ø-uk'lun ri achin
       com-3sg.abs-arrive the man
       'The man arrived.'
    b. x-at-uk'lun rat
       com-2sg.abs-arrive you(sg)
       'You(sg) arrived.'

(4) a. x-ø-aw-ax-aj rat ri achin
       com-3sg.abs-2sg.erg-hear-act you(sg) the man
       'You(sg) heard the man.'
    b. x-a-r-ax-aj ri achin rat
       com-2sg.abs-3sg.erg-hear-act the man you(sg)
       'The man heard you(sg).'

1 Unless otherwise noted, all Kichean data comes from Preminger (2014).

Of special concern for the obligatory operations proposal is a construction called Agent Focus, an example of which is shown in (5). Agent Focus clauses are similar to transitive ones in that they have two arguments, but are similar to intransitives in that they only have one agreement slot. The two arguments therefore compete for agreement, obeying a ϕ-feature hierarchy (6), with 1st and 2nd person arguments being preferred over 3rd person arguments, and 3rd person plural arguments being preferred over 3rd person singular ones.

(5) a. ja rat x-at/*ø-ax-an ri achin
       foc you(sg) com-2sg.abs/*3sg.abs-hear-AF the man
       'It was you that heard the man.'
    b. ja ri achin x-at/*ø-ax-an rat
       foc the man com-2sg.abs/*3sg.abs-hear-AF you(sg)
       'It was the man that heard you(sg).'

(6) 1st/2nd person > 3rd person plural > 3rd person singular

Further constraining the appearance of person features is a language-family-wide constraint (7) that bars two [participant]-bearing arguments from co-occurring in this construction (8).

(7) The AF person restriction
    In the Kichean AF construction, at most one of the two core arguments can be 1st/2nd person.

(8) a. *ja rat x-in/at/ø-ax-an yïn
       foc you(sg) com-1sg.abs/2sg.abs/3sg.abs-hear-AF me
       Intended: 'It was you(sg) that heard me.'
    b. *ja yïn x-in/at/ø-ax-an rat
       foc me com-1sg.abs/2sg.abs/3sg.abs-hear-AF you(sg)
       Intended: 'It was me that heard you(sg).'

Another piece that will be relevant to accounting for the agreement patterns exhibited in Kichean AF constructions is the idea that the first and second person absolutive agreement markers aren't true agreement markers in Kichean, but are instead clitics: reduced, determiner-less versions of the strong pronouns.
Table 3.1 below shows the similarities between the agreement markers on the left and the strong pronouns in the middle used to argue in part for this conclusion. Notice that these similarities do not exist in the ergative agreement marker paradigm, shown on the right. The table also shows that these similarities disappear for the 3rd person agreement markers, leading Preminger to conclude that in Kichean, only 1st and 2nd person absolutive markers are clitics; 3rd person agreement markers are instead the full expression of person and number.

Table 3.1: Kichean Agreement Markers

          abs agreement marker   strong pronoun   erg agreement marker
   1sg    i(n)-                  yin              n/w-
   1pl    oj-                    roj              q(a)-
   2sg    a(t)-                  rat              a(w)-
   2pl    ix-                    rix              i(w)-
   3sg    ø-                     rja'             r(u)/u-
   3pl    e-                     rje'             k(i)-

One final note about the language is that Preminger suggests we treat the [e-] morpheme as a plural morpheme. If true, we can understand the 1st and 2nd person markers/clitics as expressing person and number suppletively, while 3rd person markers do not share this property. This is one more piece of evidence that could motivate treating the [participant]-bearing morphemes differently from the non-[participant]-bearing morphemes, which will later be used to derive the AF person restriction, which only involves the 1st and 2nd person arguments.

There are two parts to proposing an obligatory operations approach to ϕ-agreement in Kichean AF constructions: deriving the agreement paradigm itself and deriving the Agent Focus person restriction that bars two [participant]-bearing arguments from co-occurring in AF constructions. I'll discuss each in turn. First, there's an operation responsible for ϕ-agreement called find, shown below in (9). This operation is triggered automatically, obligatorily, and immediately upon the creation of find's structural description. In this particular case, this essentially means that as soon as an unvalued feature f probe is merged into the structure, find will be triggered. One defining feature of this approach is that if the operation find fails, the grammatical requirements are considered met and there is no ungrammatical consequence, because what is required is the attempt at agreement, not a particular result. We'll see how this failure helps to capture the default agreement data mentioned in the previous section (and throughout this thesis).

(9) find(f)
    Given an unvalued feature f on a head H0, look for an XP bearing a valued instance of f and assign that value to H0.

With respect to ϕ-features, Preminger assumes that ϕ-probes are separated onto two functional heads, a person head and a number head, that are each relativized and probe independently (10). In Kichean, the person head is located lower than the number head and therefore will probe first. The person head is relativized to probe for the featural specification [participant], encoding the language's preference for agreement with 1st/2nd person arguments, while the number head probes for [plural], encoding the preference for plural over singular arguments. According to find, the person probe will search for an argument bearing a [participant] feature and will ignore arguments that do not bear this feature. Likewise, the number probe will search for arguments bearing [plural] and will ignore arguments that do not bear this feature.2 In the tree below in (10), the dashed lines show that each probe is 'looking' for valued instances of each head's relativized ϕ-features in their respective c-command domains.
Because the external argument is specified with a participant feature, the probe will 'see' it and establish an agreement relationship via find. Notice, however, that the external argument in our example is not specified with a [plural] feature. This means that while visible to the person probe, the external argument is essentially invisible to the number probe. The number probe can therefore bypass the external argument and find the plural feature on the internal argument instead.

(10) Kichean AF Probes
     [tree diagram: a number head # [u.plural] above a person head π [u.participant], each probing (via find) into a vP containing an external argument DP [participant, singular] and an internal argument DP [π, plural]]

2 Two quick notes on notation: I'm using the form u.feature to represent that the feature on the probe is unvalued. However, under Preminger's analysis, this feature is not of the same status as the typical uninterpretable features that cause derivation crashes. I'll represent this distinction using the traditional form ufeature when illustrating models that assume standard feature assumptions, and the unitalicized u.FEATURE, with the added period, for models where features are simply unvalued but aren't assumed to cause the derivation to crash.

Being successfully probed by the person head triggers clitic doubling of the probed argument. Because the probe is relativized to look for [participant] features, this circumstance will arise when an argument bears a [participant] feature. If an argument is successfully probed by person, its entire ϕ-feature set is copied, not just the feature value for the person category, so only when the person head fails to probe an argument will the number morphology on the verb be an exponent of agreement. This is illustrated in the trees below in (11) and (12) via the CLϕ attached to the person probe head. This signals that all the ϕ-features for clausal agreement have been satisfied by the probing of person. If the subject is specified for [participant], the person probe, bearing a [participant] feature, will successfully probe the subject, triggering clitic doubling of the subject (11). If, however, the subject does not bear this [participant] feature, the person probe bearing [participant] will skip over this argument and continue to probe (12).

(11) Clitic Doubling Triggered by [part] Probe
     [tree diagram: π-CLϕ [u.part] probes the [part]-bearing subject DP in spec-vP, triggering clitic doubling]

If instead the object of the clause bears the relevant [participant] feature that the person probe is looking for, clitic doubling with the object will be triggered, as it was with the subject in the previous example.

(12) Clitic Doubling Triggered by [part] Probe
     [tree diagram: π-CLϕ [u.part] skips a featureless subject DP [ø] and probes the [part]-bearing object DP, triggering clitic doubling]

The clitic doubling assumption is crucial for the analysis of how agreement patterns are produced in Kichean AF because the effects of clitic doubling impact the environment in which the number head probes. This is because the person probe probes first – its outcome conditioning the environment in which the number probe will probe – and the result of clitic doubling is the valuation of the entire ϕ-feature set of the agreed-with argument. In this way, successful clitic doubling bleeds number probing because the [participant]-bearing argument has already valued the number features. A schematic sketch of this probing-and-doubling logic is given below.
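To make these mechanics concrete, the sketch below models relativized probing and clitic doubling in a few lines of Python. This is purely illustrative: the set-based encoding of ϕ-features and the function names (find, agree_kichean_af) are my own hypothetical conveniences, not Preminger's formalism, and the sketch only covers the pieces introduced so far (relativized person and number probes, whole-set copying under clitic doubling, and tolerated failure).

def find(relativized_to, goals):
    # Return the first (highest) goal bearing the feature the probe is
    # relativized to, or None if the search fails.
    for goal in goals:
        if relativized_to in goal:
            return goal
    return None  # a failed search is tolerated; no crash

def agree_kichean_af(subject_phi, object_phi):
    # The person probe searches first; success triggers clitic doubling,
    # which copies the goal's entire phi-set and so bleeds number probing.
    goals = [subject_phi, object_phi]          # order of c-command

    person_goal = find("participant", goals)   # person probe, relativized to [participant]
    if person_goal is not None:
        return {"exponent": "clitic (person + number)", "phi": set(person_goal)}

    number_goal = find("plural", goals)        # number probe, relativized to [plural]
    if number_goal is not None:
        return {"exponent": "number agreement", "phi": {"plural"}}

    # Both probes failed: tolerated, spelled out with default (null) morphology.
    return {"exponent": "default (null)", "phi": set()}

# (5a): 2sg subject, 3sg object -> clitic doubling with the subject
print(agree_kichean_af({"participant", "singular"}, {"singular"}))
# (15a): two 3sg arguments -> both probes fail; default null morphology
print(agree_kichean_af({"singular"}, {"singular"}))

The design point the sketch tries to mirror is that failure is just another return value: nothing in the model inspects whether the probes succeeded in order to decide grammaticality, only whether they were run.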
To account for the Kichean AF person restriction, Preminger extends Béjar & Rezac's Person Licensing Condition (Béjar & Rezac, 2003), which dictates that 1st and 2nd person features, namely the [participant] feature, must enter into an agree relation in order to be licensed (13).3

(13) Person Licensing Condition (Béjar & Rezac, 2003)
     Interpretable 1st/2nd person features must be licensed by entering into an Agree relation with an appropriate functional category.

3 In one of the following sections in this chapter, we'll address the observation that the PLC appears to employ a derivational logic similar to that of derivational time-bombs and how Preminger handles this apparent similarity.

If both the subject and the object are specified with a [participant] feature, as they are in (14), the person probe probes the higher argument, the subject, triggering clitic doubling as it did in (11). However, the derivation would be ruled ungrammatical when it is spelled out due to a violation of the Person Licensing Condition, because the [participant] feature of the object (bolded in (14)) would remain un-agreed with since only one agreement slot is available. In this way, Preminger accounts for both the preference for 1st/2nd person arguments and also the restriction that bars two of them from appearing in the same clause.

(14) PLC Violation
     [tree diagram: π-CLϕ [u.part] clitic-doubles the [part]-bearing subject; the [part]-bearing object remains un-agreed with, yielding a PLC violation]

Two assumptions stand out as bearing the theoretical weight of explanation: the relativization of probes and the Person Licensing Condition. The relativization of probes is what derives the preference for 1st/2nd person arguments over third and (as we'll see below) the preference for plural arguments over singular ones. The PLC is what accounts for why two arguments in the same structure bearing [participant] are illicit.

If neither the subject nor the object is specified with a [participant] feature – namely, both arguments are third person – then the person probe is unable to successfully probe either argument and clitic doubling isn't triggered at all. Importantly, these sentences, shown in (15), are perfectly acceptable. Preminger's analysis of this data requires a model of the grammar that allows the agreement operation to fail without causing the derivation to crash. In (16) we observe that the person probe that is searching for an argument bearing [participant] will be unsuccessful in a sentence where both arguments are 3rd person. What's central to this approach is that despite the failure of the [participant] probe to find an argument to value its uninterpretable feature, the derivation does not crash and the resultant sentence is grammatical. It is important to note here that under this find approach, the result of the derivation in (16) is not successful ϕ-agreement with either of the 3rd person arguments, but rather complete agreement failure and the assumed insertion of a default morphological form, here a null morpheme.

(15) a. ja ri tz'i' x-ø-etzel-an ri sian
        foc the dog com-3sg.abs-hate-AF the cat
        'It was the dog that hated the cat.'
     b. ja ri xoq x-ø-tz'et-o ri achin
        foc the woman com-3sg.abs-see-AF the man
        'It was the woman who saw the man.'

(16) Failure of [π] Probe
     [tree diagram: π [u.part] finds no [participant]-bearing argument; both the subject and object DPs lack person features, so the probe fails]

With respect to the number probe, we see a similar type of process. The ϕ-feature on the number head will only act as a probe when person agreement is unsuccessful.
This is because successful person agreement triggers clitic doubling, which values the entire ϕ-feature set, both person and number. Only when number remains unvalued will it trigger find. Since the number probe is similarly relativized, in this case for [plural], it is able to ignore or skip arguments that do not come specified for this feature. If the subject is specified with the [plural] feature, the number probe will successfully agree with it, as in (17). If instead the subject is not specified with [plural], the number probe will skip it and continue its search, eventually reaching the object. If this object has a [plural] feature, then the number probe will agree with the object instead (18).

(17) Number Agreement with Subject
     [tree diagram: # [u.plural] finds the [plural]-bearing subject DP in spec-vP]

(18) Number Agreement with Object
     [tree diagram: # [u.plural] skips a non-plural subject DP and finds the [plural]-bearing object DP]

However, if neither the subject nor the object has a [plural] feature, as in (15), then the probe will have failed to agree with either argument (19). Under the assumptions provided by obligatory operations, this failure is tolerated by the grammar since the operation find was at least triggered and attempted to value its unvalued number feature. As with the person features, the failure of the number probe triggers the insertion of a default singular morpheme – once again the null morpheme – and is not reflective of a successful agreement relationship. It's also worth noting here that because there is not a parallel Number Licensing Condition, nothing would bar two [plural]-bearing arguments from existing in the same clause.

(19) Failure of [#] Probe
     [tree diagram: # [u.plural] finds no [plural]-bearing argument; both DPs lack [plural], so the probe fails]

To summarize how obligatory operations accounts for agreement patterns, including failed agreement, in Kichean AF constructions, here's a quick review of the main assumptions and the theoretical work each does. The [participant] specification on the person probe is what derives the preference for agreement with first and second person arguments over third. The [plural] specification on the number probe is what derives the preference for plural arguments over singular ones. The assumption that clitic doubling triggers the exponence of the whole ϕ-feature set is what explains why number agreement is only truly distinguished in the third person. The requirement that [participant] features be agreed with in order to be licensed, coupled with the availability of only one agreement slot, is what accounts for the barring of more than one [participant]-bearing argument in an AF clause. Finally, the grammar's tolerance of failed operations is what produces default failed agreement when neither of the two arguments is able to satisfy a particular probe's relativized featural specification.

With an overview behind us, we can discuss what this failed agreement proposal is intended to mean for our broader understanding of the grammar. Preminger correctly recognizes that the grammar needs some way to address the fact that there are some features whose failure to receive a value in a canonical way does not cause the derivation to crash. In Kichean AF constructions, these are sentences where neither argument bears a [participant] feature or where neither argument bears a [plural] feature. Approaches that rely on the failure to value features to determine grammaticality appear to be incompatible with this failed agreement data, as we've discussed in chapter 1 and chapter 2.
To reconcile this issue, the obligatory operations approach removes the source of the "explosion" caused by derivational time-bombs surviving to the interfaces. What is crucial to the grammaticality of a given derivation under the obligatory operations framework is not whether or not an unvalued feature successfully finds a valued counterpart, but rather that the operation applies whenever it can. The central point here is that this particular analysis of Kichean AF data means that whatever operation is responsible for valuing features is allowed to remain unsuccessful without causing the derivation to crash. It therefore cannot be the case that the grammar uses the need to value features alone to enforce grammaticality requirements.

3.3 An alternative

The system proposed in Béjar (2003) similarly recognizes the need to account for data that involves the failure of the agreement operation in some regard. She argues that a solution can be found within a standard framework by decomposing the monolithic agree operation into two independent, but related, operations: match and value. Both operations are sensitive to the intrinsic hierarchical relationships held between features. This essentially expands the number of agreement outcomes from two to three. With a monolithic agree operation, there are only two possible outcomes: agree is either successful or it is unsuccessful (this is how find is assumed to operate). Once we separate the operation into two suboperations, the application of one dependent on the successful culmination of the other, we expand the number of outcomes to three: match can fail (which prevents value from applying, since successful match is a precondition on the application of value), match can be successful and value successful, or match can be successful and value unsuccessful. Béjar argues that this third outcome, made available only by the decomposition of agree, is exactly what produces the set of data that appears confounding to derivational time-bombs approaches. It is in this arena that unusual agreement patterns that appear to involve failure can surface. The existence of an alternative proposal that largely retains the standard assumptions about grammatical requirements being enforced by feature valuation means that we no longer must adopt a find approach on empirical grounds. This reframes the obligatory operations discussion as a comparison between two approaches, rather than an empirically required necessity.

3.3.1 An overview of match/value

Béjar's system similarly employs privative features that are structured hierarchically. These hierarchical relationships are derived from inherent entailment relations that the features' semantics require. For example, the feature [speaker] semantically encodes that the bearer of the feature has the semantics of being a speaker in the event represented by the clause/verb; we traditionally call this first person. There's another feature, [participant], which encodes that the bearer of that feature is a participant in some event. Since being a speaker in an event semantically entails that one is also a participant in the event, we can derive a hierarchical relationship (20) based on entailment between the two features. Since speaker semantically entails participant, we assume that the feature [participant] dominates (is hierarchically "higher" than) the [speaker] feature. (See Harley and Ritter (2002) for a detailed proposal of this feature system.)
(20) [π [participant [speaker/addressee]]]

What's particularly important about these hierarchical relationships in Béjar's system is that she argues that the syntactic operations responsible for ϕ-agreement are actually sensitive to (and depend on) these relationships. Agreement is understood to be composed of two suboperations, independently applying, each with its own set of conditions upon which it succeeds. The operation called match is responsible for identifying which among the NPs in a c-command domain is available or visible to the target for agreement. It does not, importantly, decide which argument will actually control agreement; rather it circumscribes the set of arguments which are viable possibilities. A successful match is one in which the features of the goal match the features of the probe. This is successful when the features on the probe are a subset of the features on the goal (21). The operation match is only evaluated with respect to the root feature. A more colloquial way of explaining this system so far is to say that probes are first looking for a certain type of feature category, rather than a particular value, and want to first identify which syntactic elements could be potential goals by minimally having the right feature category. By evaluating this operation at the root, we are encoding that at this point the grammar cares less about the particular value that an argument has and more about whether or not that argument is visible. The subset relationship essentially says that it's okay if the goal has more featural information than the probe (is more highly specified), but it doesn't count as a match if the probe has features that aren't specified on the goal. In table 3.2, we see that the probe will match a goal so long as the goal shares the same root feature as the probe. Whether the goal has more (line 1) or less (line 3) structure than the probe doesn't affect the success of match at this stage. What would cause match to fail would be a goal that did not share that root feature, either by being specified with a completely different type of feature (line 4) or by being specified with no feature at all (line 5).

(21) match
     A probe matches a goal if the root feature of the probe is either a subset of or identical to the root feature of a goal.

Table 3.2: match outcomes

   Probe        Goal                   match Outcome
   [π [part]]   [π [part [speaker]]]   success
   [π [part]]   [π [part]]             success
   [π [part]]   [π]                    success
   [π [part]]   [# [plural]]           failure
   [π [part]]   [ø]                    failure

The other half of ϕ-agreement is due to the operation value. value is the operation responsible for actually establishing the relationship between a probe and a goal: an agreement target and its controller. It has stronger conditions upon which its success is evaluated (22). The subset relationship that defined the condition on match is shared by value, but the value conditions are stronger in that this subset condition is evaluated not just at the root, but at the level of the entire feature set. So where the success of match wasn't concerned with whether a potential goal has more or less feature structure than a probe, value is only successful when a goal has more of the same featural structure than the probe (line 1) or an identical one (line 2), not less (line 3). value also depends on the successful culmination of match; lines 4 and 5 of table 3.3 show that the conditions of value aren't even evaluated in the instances where match has failed.
(22) value
     A probe is valued by a goal if the features of the probe are either a subset of or identical to the features of a goal.

Table 3.3: value outcomes

   Probe        Goal                   value Outcome
   [π [part]]   [π [part [speaker]]]   success
   [π [part]]   [π [part]]             success
   [π [part]]   [π]                    failure
   [π [part]]   [# [plural]]           N/A
   [π [part]]   [ø]                    N/A

The separation of ϕ-agreement into two independently applying operations creates a three-way set of outcomes: match can fail, both match and value can succeed, or match can succeed but value can fail. These outcomes are summarized in table 3.4 below. This of course does not solve the original default problem – that failure to successfully value unvalued features should produce a derivation crash. To obviate this worry, what she proposes is that the failure to either match or value a potential goal in some given domain marks the offending features for deletion. This is called partial default agreement. In a sense, this formally encodes that what the grammar cares about is the attempt, as is done in the obligatory operations approach.

Table 3.4: match and value interactions

   Probe        Goal                   match Outcome   value Outcome   Result
   [π [part]]   [π [part [speaker]]]   success         success         ϕ-agreement
   [π [part]]   [π [part]]             success         success         ϕ-agreement
   [π [part]]   [π]                    success         failure         probe is stripped; 2nd cycle is triggered

When a goal matches, but fails to value a probe, as in (23), the grammar strips the probe of its featural content (minus the root feature) and a second cycle of agree is triggered, where the probe is able to continue its search in the expanded domain created by the merge of the 'next' syntactic object (24). Because the probe has been modified upon the second cycle, the properties a goal must have to successfully value it are also modified. While a third person argument was not considered a viable agreement controller upon the first cycle of agree (23), it is considered a viable controller upon the second cycle of agree (24). If, upon the second cycle of agreement, a probe is still unable to find an agreement controller, even with the reduced featural specification on the projection of the probe, the agreement operation is allowed to fail without consequence since partial default agreement already marked the feature for deletion. Total default agreement occurs at the point when this second attempt at agreement is unsuccessful. Essentially, the distinction between the two is that partial default agreement is the result of agreement with an impoverished feature set, while total default agreement is the result of a complete and total failure to agree. This is importantly not a distinction argued for in Preminger (2014) and we'll return to this point in the next section.

(23) 1st Cycle value Failure
     [tree diagram: a v probe [π [part]] matches a third person object DP [π] but fails to value it]

(24) 2nd Cycle value Success
     [tree diagram: the probe's unvalued features project to vP and, on the second cycle, match and value succeed with the external argument]

To see how this system works more explicitly, let's look at an agreement pattern similar to Kichean – ϕ-agreement in Georgian (Aronson, 1989; Hewitt, 1995). Georgian ϕ-agreement reflects a general preference for agreement with objects, if those objects are either first or second person. If the object is third person, then agreement shows a preference for a first or second person subject. In a clause with no first or second person arguments, default or third person agreement is observed. Before walking through the Georgian derivations in detail, the cycle logic just outlined can be sketched schematically, as shown below.
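This is a toy sketch of the match/value logic and the second cycle it gives rise to, written against my own encoding assumptions: a feature bundle is a list ordered from the root outward (e.g. ["pi", "participant", "speaker"]), a third person goal is represented as a bare [pi] (itself a point of cross-linguistic variation, as discussed below), and the function names are hypothetical rather than Béjar's notation.

def match(probe, goal):
    # match is evaluated at the root only: the goal must share the probe's root feature.
    return bool(goal) and goal[0] == probe[0]

def value(probe, goal):
    # value requires the probe's entire feature set to be contained in the goal's
    # (here: the probe bundle is an initial segment of the goal bundle).
    return match(probe, goal) and goal[:len(probe)] == probe

def agree_low_probe(probe, internal_arg, external_arg):
    # First cycle targets the internal argument; a match without value strips the
    # probe to its root and triggers a second cycle over the expanded domain.
    if value(probe, internal_arg):
        return ("phi-agreement", internal_arg, "cycle 1")
    if match(probe, internal_arg):
        probe = probe[:1]                          # probe stripped to its root
    for goal in (external_arg, internal_arg):      # second cycle
        if value(probe, goal):
            return ("phi-agreement", goal, "cycle 2")
    return ("total default agreement", None, "cycle 2")

probe  = ["pi", "participant"]                     # Georgian-style low person probe
first  = ["pi", "participant", "speaker"]
second = ["pi", "participant"]
third  = ["pi"]

print(agree_low_probe(probe, first, third))    # object agreement on the first cycle
print(agree_low_probe(probe, third, second))   # agreement displaces to the subject on cycle 2
print(agree_low_probe(probe, third, third))    # the reduced probe values a third person goal on cycle 2

The three calls correspond to the three Georgian configurations just described; the last one is the 'default' third person outcome, reached only after the more specified first-cycle attempt has failed.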
Georgian person probes are assumed to be specified for [participant], like Kichean probes, to capture the preference for first and second person arguments, and are assumed to be located low, on v, to capture the preference for agreement with the syntactic object (25).

(25) Georgian [π] Probe
     [tree diagram: a low v probe [π [part]] whose initial search domain (VP) contains only the object DP]

What this means is that the syntactic object will be the first argument a probe will encounter upon its search. If that object bears a [participant] feature, then match will succeed, since both the probe and the goal contain the root [π] feature. value will also succeed since the features on the probe will constitute a subset of the features specified on the goal. With value successful, features on the probe will be valued and ϕ-agreement will be the outcome, either first person agreement (26) or second person agreement (27).

(26) 1st Person Object Agreement
     [tree diagram: the v probe [π [part]] matches and values a first person object DP [π [part [speaker]]]]

(27) 2nd Person Object Agreement
     [tree diagram: the v probe [π [part]] matches and values a second person object DP [π [part]]]

If instead the Georgian object is third person, and therefore not [participant]-bearing, match will still be successful, since both probe and goal contain the root [π] feature. Unlike the previous example, however, value will not be successful, as the probe's featural content is not a subset of the goal's (28). This also illustrates the context where partial default agreement is triggered, reducing the probe's feature set so that it bears only the root feature.

(28) First Cycle value Failure
     [tree diagram: the v probe [π [part]] matches a third person object DP [π] but fails to value it]

A second cycle of agreement is also triggered upon the projection of the probe's unvalued features. Upon this second cycle, which now has the external argument in its search domain, the probe searches for a match and a value, as it did on the first. What's different on the second cycle is that, due to the modified featural content of the probe, the set of goals that could successfully value the probe is larger. If on this second cycle the probe finds an external argument with [participant] features, both match and value will be successful (29), as they were on the first cycle. If, however, the external argument is third person, we'll also see successful valuation. Since the probe's featural content is now just [π], match will be successful, as it had been upon the first cycle, and value this time will also be successful, as the probe's feature specification is now identical to that of the goal (30). What this system allows, then, is an attempt to agree with a first or second person argument, but if that agreement attempt fails, agreement with a third person argument is possible. The last resort nature of this third person agreement is captured by its only being possible in situations where the attempt at agreement with a more specified set of features was unsuccessful. Both the regular outcomes of agreement and the default flavor of failed agreement are captured in this system.

(29) 2nd Cycle 1st Person Agreement
     [tree diagram: on the second cycle, the projected probe matches and values a first person external argument DP [π [part [speaker]]]]

(30) 2nd Cycle 3rd Person Agreement
     [tree diagram: on the second cycle, the reduced [π] probe matches and values a third person external argument DP [π]]

3.3.2 Accounting for Kichean

This model can easily account for the Kichean failed agreement data that we talked about in the last section.
It's worth noting here that the details of how the match/value approach accounts for the Kichean agreement patterns depend on various assumptions one makes about the featural specification of third person arguments and about which functional heads are assumed to house which probes. While these details vary, however, the conclusion that the match/value approach is capable of accounting for failed agreement in Kichean through second cycle agreement effects remains the same. I'll do my best to outline how the details may vary, but the bigger takeaway here is that regardless of how one sets these various parameters, the match/value approach is extendable to the Kichean data.

In many ways, the Kichean AF agreement patterns are actually a bit simpler than the types of person hierarchy patterns found in Béjar (2003), making the task of applying the match/value approach to Kichean very straightforward. Kichean has a relatively simple hierarchy, preferring [participant]-bearing arguments and preferring plural arguments over singular ones when number is an exponent of ϕ-agreement and not clitic doubling. This compares to a more complicated person hierarchy found in Nishnaabemwin, which not only prefers second over first person, but also first person over third. This allows the Kichean probe to be less specified than the Nishnaabemwin one. The details of how Nishnaabemwin agreement works will be discussed in section 3.4.

Preminger assumes that probes for both person and number are at least higher than the external argument. For him, this results in only one agreement cycle, which is either successful or not. The way that Béjar's system would work depends on where one assumes those probes are located. First, let's see how the system would handle the data if the probes are both located higher than the external argument, as Preminger does. If a [participant]-bearing argument is in subject position, then agreement is straightforward (31). Both match and value are successful and we can carry over the assumption that agreeing with a person head triggers clitic doubling and the subsequent copying of the entire ϕ-feature set, rather than the person features alone.

(31) Kichean [π] Agreement with Subject
     [tree diagram: a high π-CLϕ probe [π [part]] matches and values the [participant]-bearing subject DP in spec-vP]

If the [participant]-bearing argument is in the object position, rather than the subject position, the probe is able to skip over the non-[participant]-bearing subject and agree with the object, just as it did in Preminger's account. How the probe skips over a non-[participant]-bearing subject depends in part on how one assumes third person features are specified in Kichean. We could assume, as Preminger does, that third person features are not specified at all; if true, the probe would quite simply ignore a third person subject (32), as it does in the find approach, subsequently agreeing with the [participant]-bearing object.
The ability to match, but not value should trigger partial default agreement, as it always does and the probe could then presumably continue to search, discovering the [participant]-bearing object where both match and value would then be successful (33b).4 4Béjar doesn’t have any examples of languages with a high person probe, so it’s not immediately clear what she assumes happens when a probe find a match but no value if there’s another potential agreement controller lower in the search domain. It could be the case that the higher intervener needs to move in order for the probe to continue to search. If true, then perhaps the movement of the external argument to subject position would move the external argument “out of the way” of the probe and it could keep searching. Alternatively – and this will be what I assume here – we could just say that the probe continues to search if it finds a match, but no value, as it does when probes are low. 100 (33) 3rd Subject has Features πP match no value πP a. b. π(cid:34) (cid:35) π part. π(cid:34) (cid:35) π part. . . . . . . vP DP [π] . . . . . . vP DP [π] match value v v v’ V v’ V VP DP(cid:34) (cid:35) π part. VP DP(cid:34) (cid:35) π part. Finally, this approach could adopt the same assumptions Preminger does regarding the AF person restriction which bans two [participant]-bearing argument from co-occurring by adopting the PLC, repeated below in (34). Since the PLC for Kichean was adopted from Béjar and Rezac (2003) who also assume Béjar’s match/value model, it’s easy to see that it can exist in a match/value system as well. If both the subject and the object are specified with a [participant] feature, the person probe would probe into the higher argument, the subject, triggering clitic doubling as it did in (31). However, the derivation would be ruled ungrammatical when it reached spell out due to a violation 101 of the Person Licensing Constraint because the [participant] feature of the object, bolded in (35), would remain un-agreed with since only one agreement slot is available. (34) Person Licensing Condition (Béjar & Rezac, 2003) Interpretable 1st/2nd person features must be licensed be entering into an Agree relation with an appropriate functional category. (35) PLC Violation (cid:34) (cid:35) π-CLϕ π part πP match value vP . . . . . . DP(cid:34) (cid:35) π part v’ v VP V DP(cid:34) (cid:35) π part ⇒ PLC violation Béjar takes the featural specification of third person arguments to be a point of cross-linguistic variation. For some languages, like Nishnaabemwin, third person arguments have no featural content with respect to person. Others, like Georgian, third person arguments are specified with a minimal featural specification for person. Béjar suggests what decides this is whether or not the third person in the particular language ever exhibits any sort of intervention effects. If so, then the third person needs featural specification to be visible; if not, it seems the null hypothesis is that there’s no reason to think that third person comes with any features if it doesn’t use them for something. Preminger implicitly assumes that third person features are unspecified, but doesn’t explicitly rule out the possibility that they have featural content. 
This question is a continual point of discussion in much of the ϕ-agreement literature (see Nevins, 2007, for an overview) and, while it doesn't affect the ability of either approach to account for the failed agreement data in Kichean, it does affect how the derivation proceeds, at least in the match/value approach.

Finally, I'd like to end with the characterization that Béjar herself would likely adopt, and that's an approach that has at least the person probe lower, on v. Preminger himself offers this as a hypothetical possibility, but winds up rejecting it because doing so under his system would render the external argument unable to be reached by the probe – a signal that he intends his find operation not to trigger a second cycle of agreement upon failure (see the next section for a discussion of the implications of this move). A low person probe would encounter the object upon its first cycle and the subject upon its second cycle, only if agreement with the object was unsuccessful due to a lack of [participant] features. If the object is [participant]-bearing, both match and value would be successful and ϕ-agreement would result (36)-(37).

(36) 1st Person Agreement
     [tree diagram: a low v-CLϕ probe [π [part]] matches and values a first person object DP [π [part [speaker]]]]

(37) 2nd Person Agreement
     [tree diagram: a low v-CLϕ probe [π [part]] matches and values a second person object DP [π [part]]]

If the object was instead third person, then match would succeed, but value would fail, triggering a second cycle of agreement and again stripping the probe of features down to its root. If upon this second cycle the probe encountered a [participant]-bearing subject (38), match and value would be successful and ϕ-agreement would result with the subject. If the subject was instead third person (39), then the probe would actually agree with the subject as well, because the probe's featural specification would have been reduced as the result of the failure to value on that first cycle. If we look back to the discussion of how Georgian is accounted for, the Kichean data would work very much the same if we assumed Kichean person probes are low, which is not surprising since the languages obey largely the same person hierarchy preferences.

(38) Second Cycle Agreement with 1st Person
     [tree diagram: after a first cycle match without value on the third person object, the projected probe matches and values a first person subject DP [π [part [speaker]]]]

(39) Second Cycle Agreement with 3rd Person
     [tree diagram: after a first cycle match without value on the third person object, the reduced [π] probe matches and values the third person subject DP [π]]

One difference between the two approaches is that a find approach treats third person exponence as a complete failure to agree, while a match/value approach treats third person exponence as ϕ-agreement with a third person argument, at least if one assumes third person has true featural content. If one instead assumes that third person is featurally empty, then both models characterize third person exponence as the result of failure. There isn't really a way to decide between these options, at least empirically, but it's important to note there is a derivational distinction. What the proposal illustrated here shows is that it is possible to model failed agreement within a framework that uses uninterpretable features to enforce grammaticality. This is achieved by exploiting a third outcome of agreement where these default-like patterns appear. To be clear, the existence of this approach does not show that uninterpretable features must be behind the obligatory nature of these operations.
Recall that while Preminger hints that we might be able to do away with derivational time-bombs entirely and does mention failed agreement being incompatible with the standard derivational time-bombs approach, it's only the obligatoriness of ϕ-agreement that his argument references. His central proposal is more modest: uninterpretable features are not necessarily to be eliminated from the system; they merely do not bear the theoretical burden of explaining why operations like ϕ-agreement are obligatory. Within a Béjar-style system, one could certainly argue, as Preminger presumably would, that the probes immediately begin their search upon merge into the derivation and that this immediate probing is what enforces the obligatoriness of ϕ-agreement. However, with the existence of a possible alternative comes a shift in how the discussion should be framed. No longer does failed agreement necessitate obligatory operations; it merely provides a context through which we can judge the fitness of multiple alternatives. This will be the focus of the next three sections, where we examine data that similarly involves failed agreement, but provides a lens through which the two specific proposals may be better differentiated.

The main criticism leveled against the approach advanced in Béjar (2003) is that one can view the partial default agreement that she proposes as a type of diacritic, and thus an understandably unattractive feature of the framework that should be avoided (Preminger, 2014). Preminger argues that since that type of approach would need to track both the attempt and the outcome of agreement, assuming that uninterpretable features enforce grammaticality is redundant. Under obligatory operations, nothing arguably needs to be tracked, as the attempt is triggered obligatorily and automatically and the outcome is inconsequential for grammaticality. The next section will serve to illustrate that when we view agreement patterns that are more complicated than the Kichean AF patterns, partial default agreement becomes a necessary feature of the model. There is reason to think that partial default agreement is not only not a diacritic in the traditional sense, but also is empirically needed. The resulting conclusion therefore provides further credence to continuing to explore the tenability of a match/value approach to failed agreement.

3.4 Failed Agreement isn't Always Default Agreement

Looking solely at the find operation, we might be tempted to think that ϕ-agreement outcomes are quite simple: ϕ-agreement either succeeds or it fails, and the grammar has the means to handle each. In this section, we'll see that the outcomes of failed agreement are more diverse than this binary distinction and we'll explore what this varied outcome set means for each of the proposals under discussion. In sections 3.4.1.1-3.4.1.2, I discuss data that shows that agreement must be able to continue to try to find a feature value after its initial failure and explore the implications for find. In section 3.4.1.3, I illustrate how a find approach runs into trouble with data that exhibits a higher degree of sensitivity in the hierarchy that governs its person feature preferences. In section 3.4.2, we discuss implications for capturing dative intervention effects, and in section 3.4.3, we investigate failed agreement as a tool to resolve conjunct agreement conflicts.
The data discussed in this section shows two things: (i) that there is an empirical difference between the kind of failed agreement that results in partial default agreement and the kind of agreement failure that results in total default agreement, and (ii) that the outcomes of failed agreement aren't uniform. I use these conclusions to argue against an obligatory operations approach on the basis that it does not predict a distinction between the two. I suggest that unvalued features might serve to mediate these more complicated outcomes and thus shouldn't be removed from the system.

3.4.1 Person Hierarchy Effects

We begin our discussion of the outcomes of failed agreement by looking at some data from languages that exhibit person hierarchy effects. What defines these languages, a group to which Kichean belongs, is a preference for ϕ-agreement with arguments that bear certain person features, rather than agreement that solely considers their structural position. In Kichean, we saw a prioritization of ϕ-agreement with [participant]-bearing arguments over third person arguments that lack that feature, regardless of whether those arguments appeared as subjects or objects (40).

(40) a. ja rat x-at/*ø-ax-an ri achin
        foc you(sg) com-2sg.abs/*3sg.abs-hear-AF the man
        'It was you that heard the man.'
     b. ja ri achin x-at/*ø-ax-an rat
        foc the man com-2sg.abs/*3sg.abs-hear-AF you(sg)
        'It was the man that heard you(sg).'

Person hierarchy effects often trigger what are called second cycle agreement effects, and these are especially visible in languages with what Béjar calls "low-ϕ" agreement probes. In low-ϕ languages where there appears to be agreement competition between the external and internal arguments, the language exhibits a preference for agreement with the internal argument, unless that internal argument lacks a particular set of ϕ-features. The preference then displaces to the external argument, again assuming that argument bears the right ϕ-features. The shifting of preference dependent on ϕ-features is called agreement displacement (Béjar & Rezac, 2009). Low-ϕ languages differ from what we would call high-ϕ languages – languages where agreement preference is with the external argument unless it fails to bear the relevant features.5 Languages with person hierarchy and second cycle effects are especially relevant for discussing instances of failed agreement, as failed agreement is assumed to be the mechanism behind agreement displacement (Béjar & Rezac, 2009).

5 Béjar (2003) recognizes the logical possibility of high-ϕ languages, but also notes that her survey turned up no languages that illustrated this type. She suggests the reason might be that agreement features need to enter the derivation as soon as possible, and given the existence of a functional head v that is capable of bearing those features, the grammar has an intrinsic preference for locating them on v. Kichean, under Preminger's description, locates its ϕ-features on heads higher than the external argument, suggesting that Kichean might be of this high-ϕ language type. Interestingly, if we assumed the match/value system, one could actually assume a low-ϕ characterization of Kichean, at least based on the agreement hierarchy preferences. This has no obvious consequence for the points I'm making here and so I won't discuss this further, but see Béjar (2003) for further discussion of the cross-linguistic variation in locating unvalued probes.
The general idea is that in languages where the agreement preference is initially with the internal argument unless it bears a certain set of ϕ-features, the displacement of agreement to favor the external argument is captured through the ability of the agreement probe to reattempt agreement upon a second cycle of ϕ-agreement. We'll discuss three types of data in this subsection: (i) languages with a morphological sensitivity to agreement cycles, (ii) languages with a syntactic sensitivity to agreement cycles, and (iii) languages with a complicated hierarchy preference that necessitates probe modification upon the second cycle of agreement. Given that Kichean was characterized as having high ϕ-agreement probes in Preminger (2014), it's worth understanding what the find proposal would mean for languages with lower ϕ-agreement probes that often trigger additional cycles of the operation behind ϕ-agreement. The focus will of course be on how the find proposal and the match/value proposal handle these sorts of data and what conclusions we can draw from those results. What we will conclude from this discussion is that the find operation – without a reasonable mechanism for reapplication or the ability to modify the featural specification of probes – is largely unable to handle this more complicated data.

3.4.1.1 Morphological Effects

In a number of languages, the morphological affix that is inserted for a certain person value depends on whether the probe received that person value upon the first attempt at agreement or upon the second attempt triggered by failed agreement. Georgian provides an illustration. Georgian person probes are located low, on v. They are specified for [π [part]], reflecting a preference for arguments bearing a participant feature, either 1st or 2nd person arguments. Relevant to second cycle morphological effects, Georgian has a morphological alternation called the m-/v- alternation (Béjar, 2003), where there are two distinct sets of morphemes: an m-set and a v-set. These are shown in table 3.5. The m-set of morphemes is inserted for the given person value when agreement is successful on the first try (41a) and the v-set is inserted when the probe's value came from agreement on the second attempt (41b).

Table 3.5: Georgian Agreement Morphemes

   Person            m-set morpheme   v-set morpheme
   1st               m-               v-
   1st (inclusive)   gv-              N/A
   2nd               g-               x-/ø
   3rd               N/A              ø

(41) a. First Cycle (m-set morphemes)
        [tree diagram: the low v probe [π [part]] matches and values a [participant]-bearing object on the first cycle]
     b. Second Cycle (v-set morphemes)
        [tree diagram: after a first cycle match without value on a third person object, the projected probe matches and values a [participant]-bearing subject on the second cycle]

The morphological distinction between the two sets shows a sensitivity not only to the difference between partial and true default agreement (exhibited by a difference between first/second and third person morphemes within each set), but also to the difference between more canonical successful agreement and partial default agreement (exhibited by the existence of two morpheme sets). The m-/v- alternation shows that the grammar is sensitive to a three-way distinction – a more complicated set of outcomes than the simple success/failure binary tolerated by failed agreement models.

The match/value approach quite obviously captures this data, as the triggering of a second cycle of agreement is explicitly built into the system as a result of failed first cycle agreement. Along with the triggering of a second cycle, the match/value system also assumes that the probe is modified in such a way that it becomes less specific about the person features that would qualify to establish a successful agreement relation. This probe modification is not a crucial feature for Georgian agreement patterns, as the Georgian person hierarchy, like Kichean's, is relatively simple. So while not particularly relevant here, it will become so for other languages (see section 3.4.1.3). What is crucially important is the fact that failed agreement in many of these languages does not immediately trigger default agreement, suggesting that the outcomes of failed agreement are more complicated than the Kichean AF facts imply. Instead, in languages like Georgian, failed agreement triggers a second cycle of agreement that could result either in true person agreement with an argument in the expanded search domain or in default agreement.6

6 This is a good time to mention that the featural specification of third person has a varied history and there are still many disagreements on how we should assume third person is modeled in the grammar. Some are proponents of third person having true featural content, something like a [π] feature, reflecting that third person is a default, but one that is relatively as opposed to totally underspecified (Ackema & Neeleman, 2017; Nevins, 2007). Others assume that third person is totally underspecified, lacking person features at all, the reflection of total underspecification (Adger & Harbour, 2007). Still others treat each language individually and assume that the third person specification is something that is cross-linguistically variable (Béjar & Rezac, 2003). Why this is relevant here is that to maintain a distinction between second cycle agreement (partial default agreement) and total default agreement, one has to assume that third person is featurally empty. Otherwise, the exponence of third person is not a default as much as the reflection of third person features.

It's not impossible for us to modify the find operation to capture this data, but doing so does require us to adopt some non-obvious (and perhaps unattractive) assumptions. Given the need in Georgian for a second cycle of agreement, Preminger's proposed find operation would need to reapply if it failed to find a value for the probe upon its initial search. Since unvalued features are assumed to be able to project higher in the structure, this does not seem like an unreasonable extension. In this view, find would be triggered automatically upon the merge of an unvalued probe on v and attempt to find a [participant] feature with which to agree. If it is unsuccessful in this search, as it is in (42a) when the object is third person, we could assume find is re-triggered when the unvalued feature projects higher in the structure and has the external argument in the larger search domain (42b). In this way, find would be capable of accounting for the kind of second cycle morphological effects seen in Georgian. If we did not allow the probe to re-trigger find, we wouldn't predict ungrammaticality, as such a failure is tolerated by the grammar under the obligatory operations model, but we would incorrectly predict the appearance of default third person morphology when find fails in (42a).

(42) a. find fails
        [tree diagram: the low v probe [u.part] searches VP and finds no [participant]-bearing goal in the third person object]
     b. find reapplies
        [tree diagram: the unvalued feature projects; find reapplies in the expanded domain and reaches the [participant]-bearing external argument]
Along with the triggering of a second cycle, the match/value system also assumes that the probe is modified in a way that makes it less specific about the person features that would qualify to establish a successful agreement relation. This probe modification is not a crucial feature for Georgian agreement patterns, as the Georgian person hierarchy, like Kichean’s, is relatively simple. So while not particularly relevant here, it will become so for other languages (see section 3.4.1.3). What is crucially important is the fact that failed agreement in many of these languages does not immediately trigger default agreement, suggesting that the outcomes of failed agreement are more complicated than the Kichean AF facts imply. Instead, in languages like Georgian, failed agreement triggers a second cycle of agreement that could result either in true person agreement with an argument in the expanded search domain or in default agreement.6

6This is a good time to mention that the featural specification of third person has a varied history and there are still many disagreements about how we should assume third person is modeled in the grammar. Some are proponents of third person having true featural content, something like a bare [π] feature, reflecting that third person is a default, but one that is relatively as opposed to totally underspecified (Ackema & Neeleman, 2017; Nevins, 2007). Others assume that third person is totally underspecified, lacking person features entirely, the reflection of total underspecification (Adger & Harbour, 2007). Still others treat each language individually and assume that the third person specification is cross-linguistically variable (Béjar & Rezac, 2003). This is relevant here because, to maintain a distinction between second cycle agreement (partial default agreement) and total default agreement, one has to assume that third person is featurally empty. Otherwise, the exponence of third person is not a default so much as the reflection of third person features.

It’s not impossible for us to modify the find operation to capture this data, but doing so does require us to adopt some non-obvious (and perhaps unattractive) assumptions. Given the need in Georgian for a second cycle of agreement, Preminger’s proposed find operation would need to reapply if it failed to find a value for the probe upon its initial search. Since we are assuming unvalued features can project higher in the structure, this does not seem like an unreasonable extension. In this view, find would be triggered automatically upon the merge of an unvalued probe on v and would attempt to find a [participant] feature with which to agree. If it is unsuccessful in this search, as it is in (42a) when the object is third person, we could assume find is re-triggered when the unvalued feature projects higher in the structure and has the external argument in the larger search domain (42b). In this way, find would be capable of accounting for the kind of second cycle morphological effects seen in Georgian. If we did not allow the probe to re-trigger find, we wouldn’t predict ungrammaticality, as such a failure is tolerated by the grammar under the obligatory operations model, but we would incorrectly predict the appearance of default third person morphology when find fails in (42a).

(42) a. find fails: the [u.part] probe on v searches its VP complement and finds only a bare [π] object DP.
     b. find reapplies: the unvalued feature projects with v; the re-triggered probe now has the [participant]-bearing external argument in Spec,vP within its search domain and is valued.
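For concreteness, here is one way the reapplication extension just entertained could be schematized. This is a hypothetical sketch, not part of Preminger’s proposal: the loop over successively larger search domains is precisely the stipulation under discussion, and the representations are invented for illustration.

```python
# Hypothetical sketch of find re-triggered over expanding search domains.
# The reapplication loop is the stipulated extension discussed above; Preminger's
# find, as actually proposed, performs a single search per triggering.

def find(relativization, domain):
    """Return the first goal bearing the feature the probe is relativized to,
    or None when the search fails (a failure the grammar tolerates)."""
    for goal in domain:
        if relativization in goal["features"]:
            return goal
    return None

def find_with_reprojection(relativization, domains):
    """domains: successively larger search spaces, e.g. the VP complement of v,
    then vP once the unvalued feature has projected. Re-trigger find for each
    one and stop at the first success; exhausting them all yields a default."""
    for domain in domains:
        controller = find(relativization, domain)
        if controller is not None:
            return controller
    return None

third_obj = {"person": "3", "features": set()}
first_subj = {"person": "1", "features": {"pi", "part"}}

# (42a): the VP-internal search fails; (42b): reprojection brings in the subject.
result = find_with_reprojection("part", [[third_obj], [third_obj, first_subj]])
print(result["person"])   # '1' -- second cycle agreement, rather than a default
```

The outer loop is doing the work that, in the match/value system, falls out from the unvalued feature itself continuing to drive the derivation.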
While it might be reasonable to assume that find is able to continue its search until spell out so long as the structural description is still met, it’s important to point out that adopting this sort of assumption under an obligatory operations approach does require us to explicitly stipulate it. It’s also worth noting that Preminger does not consider this a possibility. When extending the find operation to non-AF data in Kichean, he laments find being unable to access the external argument if we located it on v, implying that he does not take a failed find to be able to continue searching. In a derivational time-bomb approach like the match/value approach, this additional assumption comes for free, since the need to value an unvalued feature is already what drives the operation to apply in the first place. It’s therefore quite obvious why second cycle agreement effects are possible.

It is not just the stipulative nature of find’s reapplication that is problematic; more worrisome is that making this assumption goes against the spirit of the approach itself. This is a question of how we encode grammatical requirements and what we assume those grammatical requirements to be. The obligatory operations approach shifts those requirements to the application of a set of rules or operations. It is the attempt – or the triggering of the operation – that is required, not a successful (or any specific) outcome. Once an operation has been triggered, the grammatical requirement that it is intended to encode has been met. It is therefore not obvious why operations in this framework should need to reapply. Conversely, in a more standard derivational time-bombs approach, the grammatical requirements enforced by the grammar do depend on outcomes. It’s therefore less radical to assume that the grammar dispatches everything at its disposal to satisfy them. The reapplication of operations – with an understanding of how that application is constrained – isn’t as unexpected.

It is also interesting to note that there is one derivational distinction between the models, although we are unable to use it to empirically decide between the two. In a Georgian clause with two third person arguments, both models will be able to capture that the first cycle of agree is allowed to fail without causing the derivation to crash. The outcome of the second cycle, however – while morphologically identical – is derivationally distinct. Under the match/value approach, upon the failure of the first attempt at agreement, the probe’s featural specification is reduced to the root feature. This means that on the second cycle, a modified version of the initial probe is doing the searching, which in turn means that a different set of arguments now qualifies as a successful controller. To the extent that one assumes third person in Georgian has featural content, the newly reduced [π] probe will in fact find a successful agreement controller in a third person argument, but only upon the second cycle of agreement (43). This means that the default [ø] form under this framework would be the exponent of successful agreement with a third person argument. The picture looks a bit different from the obligatory operations view. Under this view, there is no means for probe modification. Once the find operation has been triggered and it fails to find a successful controller, the grammatical requirements of the operation have been met.
While we can assume that find might be able to reapply as the result of such a failure, there’s no mechanism that strips the probe of featural content upon the second cycle, a feature of the system that Preminger explicitly argues against. What this means is that upon the second cycle of agreement, the [participant] probe would not consider [π] a successful controller and agreement would once again fail (44). Unlike the match/value approach, the find approach considers the default [ø] form the exponent of complete failed agreement. At this point, there’s no clear way to distinguish these two alternatives empirically, but it’s important to point out that they are theoretically distinct. Also of note is the observation that if one understood third person features differently in Georgian, the distinction is removed. If third person is instead assumed to have zero featural content, then the match/value approach would fail in the same way the find approach does on its second cycle of agreement.

(43) Second Cycle 3rd Person Agreement: the probe fails to value against the [π] object on the first cycle, is reduced to its root [π], projects, and then matches and values against the [π]-bearing external argument in Spec,vP.

(44) Hypothetical Reapplication of find: the probe remains specified [u.part] when it reapplies, so it finds no controller in either the [π] object or the [π] external argument, and agreement fails on both cycles.

Georgian is not unique in morphologically expressing a distinction between agreement cycles. Another illustration comes from Karok, a language spoken in California (Bright, 1957). Karok, like Georgian, is a low-ϕ language, meaning that its person and number probes are both located on the lower agreement functional head v. Karok similarly exhibits separate morphological affixes that depend on the cycle of agreement at which the probe was successfully valued. The paradigm is shown below in table 3.6. The Karok facts we’ll discuss here are slightly more complicated because these affixes, unlike those reported above, involve both person and number. The series A morphemes are inserted in singular contexts, while the series B morphemes are inserted in plural contexts. What’s important for the point at hand is that there is a distinction between morphemes inserted on the first cycle of agreement and morphemes inserted on the second cycle of agreement. This, as it did for Georgian, signals that the operation behind ϕ-agreement must be able to reapply upon failure. Again, both models can account for these morphological patterns in similar ways by assuming that operations are able to continue to apply if they fail to find an agreement controller upon their first search.

Table 3.6: Karok Agreement Morphemes. The paradigm pairs two first cycle sets (A1, B1) with two second cycle sets (A2, B2) for 1st, 2nd, and 1st plural; the affixes involved are na-, nu-/ʔi-, -ap, ni-, kin-, ʔi-, ki-, ʔu-, ka-, nu-, and kun-.

Another example comes from Erza Mordvinian, a language spoken in Mordovia (Abondolo, 1982). Erza Mordvinian exhibits a slightly different system than the ones we’ve seen in Georgian and Karok. Erza Mordvinian is what’s called a split-ϕ language, a language whose person probe and number probe are located on different heads. For Mordvinian, the person probe is the lower probe, located on v, and the number probe is located higher, on T (45). Like the other languages described in this chapter, Mordvinian has person hierarchy effects that are encoded through the use of relativized probes. Mordvinian’s person probe is specified for [participant], while its number probe is specified for [plural], (45). Like Georgian and Karok, Mordvinian also exhibits second cycle agreement effects that are morphologically realized.
(45) Split ϕ-Probes in Mordvinian: the number probe, specified [#[plural]], sits on T, and the person probe, specified [π[part]], sits on v, each probing its own complement domain.

The Mordvinian affix paradigms are shown below in table 3.7. There are two things to clarify in the first cycle affix paradigm. First, as we’ve seen above, there are no 3rd person first cycle affixes: given the relativized person probe, ϕ-agreement will never successfully provide a value upon the first cycle if the internal argument is a non-[participant]-bearing third person argument. Second, the alternation in the plural affixes is due to phonological alternations (Béjar, 2003). What’s especially interesting about the Mordvinian paradigms is that not only are there morpheme sets for each cycle of agreement, but the morpheme structure is distinct across cycles as well. Mordvinian’s second cycle affixes are suppletive for person and number, but its first cycle affixes are not. The second cycle affixes in the first column, -a, -ak, -y, are the morphological exponence of second cycle agreement when both arguments are singular. The second cycle affixes in the second column, -n, -t, -nze, are what surface on the second cycle when the direct object is plural, and the second cycle affixes in the third column, -nek, -~k, ø, are the morphological exponence of second cycle agreement when the subject is plural.

Table 3.7: Mordvinian Agreement Morphemes

        First Cycle   Second Cycle (both sg.)   Second Cycle (obj. pl.)   Second Cycle (subj. pl.)
1st     -am           -a                        -n                        -nek
2nd     -ad           -ak                       -t                        -~k
3rd     N/A           -y                        -nze                      ø
pl.     -yz/-iz/-y

Béjar (2003)’s match/value approach shows how the triggering of second cycle agreement derives this distinction. Central to this explanation is the assumption that upon a failure to agree, the unvalued feature projects to a higher position in the structure to allow both the expansion of the search domain and the continued search in this new domain. Independent of this assumption, suppletive morphology is predicted to occur when the two heads that house the person and number features respectively end up in positions adjacent enough to encourage the morphology to insert a suppletive form. To see how these two assumptions produce suppletive morphology when paired with the results of failed agreement, we’ll work through each of the four possible outcomes of the interaction between the two ϕ-agreement probes: (i) both succeed, (ii) person succeeds and number fails, (iii) both fail, and (iv) person fails and number succeeds.

The simplest possibility is that no second cycle effects are triggered because ϕ-agreement succeeds on the first attempt. The success of ϕ-agreement means that the now valued probes are each located in their original positions, far enough away from each other to prevent the insertion of any suppletive morphology (46).

(46) Both Probes Successful on 1st Cycle: the number probe on T values against a [plural]-bearing subject and the person probe on v values against a [participant]-bearing object; the two valued probes remain in their original, non-adjacent positions.

If instead value with person features is successful, but value with number features is not – thus triggering a second cycle for the number probe – the two heads that house the now valued probes are still far enough from each other to prevent suppletion in the morphological component. This is what we see in (47). When the number probe fails to value on the first attempt, its feature specification is stripped to the root and it projects to reapply.
(47) Person Successful on 1st Cycle, Number Fails: the person probe on v values against the [participant]-bearing object, while the number probe on T fails to value against a non-plural subject, is stripped to its root [#] feature, and projects; the two valued probes still end up non-adjacent.

A similar outcome is found when both person and number fail to agree on the first attempt (48). The person probe on v will have its features stripped and will project. It is in this position that the probe’s features are valued, either through second cycle agreement with the external argument, through agreement with the internal argument, or through default agreement.7 Likewise, the number probe on T fails to value and is also stripped and projected. Both features project higher upon the second cycle and are thus once again too far from each other to induce suppletion.

7Since the focus of the trees in this section is to show the final positions of the probes, I’ve remained agnostic here about which DP winds up valuing the person features.

(48) Both Probes Fail 1st Cycle: the person probe on v and the number probe on T each match but fail to value against bare [π] and bare [#] arguments respectively; both are stripped to their root features and project, leaving the two probes non-adjacent once again.

Notice, however, what happens when person agreement fails on the first attempt, triggering a second cycle, but number agreement is successful on its first (49). The two valued probes are located adjacent to each other because person projected while number did not. Their adjacent positions can thus trigger the insertion of a suppletive form.

(49) Person Fails 1st Cycle, Number Succeeds: the person probe fails against a bare [π] object, is stripped, and projects until it sits adjacent to T, whose number probe valued against a [plural] subject on the first cycle; the adjacency of the two valued probes licenses suppletion.

What allows the circumstance exhibited in (49), the one that produces the suppletive morphology, is the availability of the higher projection of features indicative of a second cycle of agreement, triggered by the failure of the operation to succeed on the first attempt. In this way, second cycle agreement effects not only result in distinct morpheme sets, but they also provide an explanation for the kind of morphological type distinction we observe in Erza Mordvinian.

What the data in this section has shown is that the grammar makes use of a distinction between agreement on the first attempt and agreement on a later attempt. How the morphology uses this information to supply the correct morphemes is an independent question, but there are a few options that have been proposed. The first is to assume that the mechanism behind morphological insertion is sensitive to the features that inherently exist on the probe and the additional featural structure added by the operation value (Béjar, 2003). A solution like this one is tied quite heavily to the assumption made in Béjar (2003) that the probe’s feature structure is reduced to the root feature upon the exhaustion of the first cycle. This reduction allows for a distinction in the probe’s starting feature set between the first and second cycles of agreement that could then be extended to the vocabulary items themselves. The vocabulary items for first person – first and second cycle respectively – could be differentiated by what features were on the probe.
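As a rough illustration of this first option, the sketch below keys hypothetical vocabulary items to how much structure the valued probe retains, borrowing the Mordvinian first person affixes -am and -a from table 3.7; the feature labels and the lookup mechanism are invented for exposition, not a claim about the actual vocabulary of the language.

```python
# Rough sketch of vocabulary insertion keyed to the probe's residual structure:
# a probe valued on the first cycle still carries its full [pi[part]] specification,
# while one valued on the second cycle has been stripped to bare [pi]. The
# mechanism and feature labels are illustrative, not a claim about Mordvinian.

VOCAB = [
    # (person of the controller, structure on the valued probe, exponent)
    ("1", frozenset({"pi", "part"}), "-am"),   # first-cycle 1st person (table 3.7)
    ("1", frozenset({"pi"}), "-a"),            # second-cycle 1st person (table 3.7)
]

def insert(person, probe_structure):
    for p, structure, exponent in VOCAB:
        if p == person and structure == frozenset(probe_structure):
            return exponent
    return "ø"                                 # elsewhere/default item

print(insert("1", {"pi", "part"}))   # valued on cycle 1 -> '-am'
print(insert("1", {"pi"}))           # valued on cycle 2 -> '-a'
```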
The second proposal, and the one more amenable to the find operation, is to assume that the insertion of each of the second cycle affixes is conditioned by some syntactic context specified on the vocabulary item itself (Béjar, 2003; Béjar & Rezac, 2009). For example, we could differentiate the first person first cycle affix from the first person second cycle affix by including a structural description on the latter, something like “is inserted when in the domain of T”. Since the T head will not yet have merged into the structure at the time the first cycle affix is inserted, we can use this context to differentiate between the structure that exists at the two cycles. There are a number of other languages in which we observe similar morphological sensitivities (see Béjar and Rezac (2009) for a thorough survey). To recap, what these morphological sensitivities to cyclic agreement show is that the grammar has a way to recognize and express whether an unvalued probe received a value by establishing a successful agreement relation on its first attempt (first cycle agreement) or on its second attempt (second cycle agreement). At minimum, this data empirically requires that, whatever model we adopt, the operations responsible for ϕ-agreement must be able to reapply until the relevant unvalued feature has received a value. The match/value approach accounts for this data quite clearly, while the find approach would require a stipulation that is at odds with the spirit of the proposal.

3.4.1.2 Syntactic Effects

Not only is there evidence that the morphology is sensitive to the distinction between first and second cycle agreement, but the syntax itself seems sensitive to it as well. This section illustrates two places where we see evidence of this: (i) the presence of additional morphology – and thus additional syntactic material – upon agreement on the first cycle in inverse agreement contexts and (ii) the presence of a special case upon agreement on the first cycle. Both of these are described as repair strategies that address deficiencies arising when agreement is successful with the internal argument upon the first cycle. As with the morphological sensitivities, these syntactic sensitivities are discussed here only to show once again that the operation behind ϕ-agreement has more complicated outcomes than simple success and failure, and by extension requires a model capable of predicting these varied outcomes. We first begin with a recognition of the morphological facts; then we’ll proceed to a discussion of what people have assumed those facts to reflect syntactically.

In Mohawk, there is an additional prefix that appears only when agreement has successfully been achieved with the internal argument (Beatty, 1974; Béjar & Rezac, 2009; Postal, 1979). This prefix appears in addition to the traditional agreement marker that reflects the ϕ-features of the agreement controller. In the paradigm shown below in (50), the canonical agreement marker is shown in small capitals, while the additional first cycle agreement prefix is shown underlined.

(50) a. ku-see             1/2-see         ‘I see you’       1 → 2, external
     b. k-see              1-see           ‘I see him.’      1 → 3, external
     c. hs-see             2-see           ‘You see him.’    2 → 3, external
     d. (h)s-k-see         2-1-see         ‘You see me.’     2 → 1, internal
     e. wa-k-see > hra-o-see   3.inv-1-see  ‘He sees me.’    3 → 1, internal
     f. (h)s-(w)a-see      2-3.inv-see     ‘He sees you.’    3 → 2, internal
     g. hra-wa-see         3.m.dflt-see    ‘It sees him.’    3 → 3, internal
What’s relevant for the discussion at hand is that this additional morpheme is only present in instances where the internal argument was successfully agreed with – on the first cycle of agreement. If the external argument instead controls agreement – the result of a second cycle of agreement – this additional morpheme is absent. In this way, Mohawk marks a distinction between the first and second cycles of agreement.

Another morphological fact that illustrates the same concept is the appearance of a special kind of case that Béjar and Rezac (2009) call R-case. An example comes from Kashmiri, a language spoken in India (Wali & Koul, 1997). In Kashmiri, there is a special case, morphologically identical to the dative case, that only appears when first cycle agreement is successfully established with the internal argument (51) (Mahajan, 1989; Nash, 1996; Woolford, 1997, 2006). This case, while morphologically identical to the dative (53), differs from the canonical dative in that it – unlike the canonical dative – disappears under passivization (52b). Once again, what’s relevant for the current point is that there exists a case whose distribution depends on which cycle of agreement successfully established a relationship with an argument.

(51) a. b1   chu-s-ath                  ts1    par1na:va:n
        I.n  be.m.sg-1.sg.n-2.sg.e/a   you.n  teaching
        ‘I am teaching you’                              1 → 2, direct
     b. me    chu-kh           ts1    par1na:va:n
        me.d  be.m.sg-2.sg.n   you.n  teaching
        ‘You are teaching me.’                           2 → 1, inverse

(52) a. tse    hava:l1    kari-y          me    su
        you.d  handover   do.fut-2.sg.d   me.d  he.n
        ‘He will hand you over to me.’
     b. ts1    yi-kh             hava:l1    me    karn1         t@m’s1ndi  d@s’
        you.n  come.fut-2.sg.n   handover   me.d  do.inf.abl    he.gen     by
        ‘You will be handed over to me by him.’

(53) aslamni   mohnas       a:yi       k@mi:z   z@riyi  din1
     Aslam.m   Mohan.m-d    pass.f.sg  shirt.f  by      give
     ‘Mohan was given the shirt by Aslam.’

What groups the Mohawk and Kashmiri examples together, but distinguishes them from the morphological effects we discussed in the last section, is the fact that these morphological effects are proposed to be the exponent of an additional ϕ-probe added to the derivation as the result of successful agreement with the internal argument. They therefore constitute a syntactic, rather than a morphological, sensitivity to second cycle agreement effects (although of course they are still reflected in the morphology as well). Béjar and Rezac (2009) propose that once the probe finds a successful controller in the internal argument, it establishes an agreement relation (54). This agree relation is what allows the grammar to generate an additional probe upon the projection of the v head (55), one that can access the external argument.

(54) The [π[part]] probe on v matches and values against the [participant]-bearing internal argument.

(55) Because that agree relation was established, an additional [π] probe is generated on the higher projection of v, from where it can access the external argument in Spec,vP.

If the original probe instead fails to find a controller in the internal argument (56), the lack of an agree relation blocks the addition of this probe to the projection of v and the original probe continues to search in its new search domain, finding a controller in the external argument (57).8 Béjar and Rezac (2009) view this mechanism as a repair strategy intended to resolve PLC violations that arise from the failure of the external argument to establish an agreement relation.

8Béjar and Rezac (2009) adopt the match/value approach, but suggest that each individual feature can agree independently. For an example like (56), they assume the [π] feature on the probe checks the one on the internal argument and the residual [part] feature is what probes on the second cycle. These details aren’t relevant for the points I’m making here, so I’ve left them off the trees (see Béjar & Rezac, 2009, for details).
If each 8Béjar and Rezac (2009) adopt the match/value approach, but suggest that each individual feature can agree independently. For an example like (56), they assume the [π] feature on the probe checks the one on the internal argument and then the [part] feature is residual and is what probes on the second cycle. These details aren’t relevant for the points I’m making here, so I’ve left them off the trees. (see Béjar & Rezac, 2009, for details). 125 nominal must enter into some agreement relation to be licensed, then the success of the internal argument to establish such a relation bleeds the ability for the external argument to do. The added probe mechanism provides the external argument an opportunity to be licensed by agreement only in situations where it would otherwise be unable to do so. Mohawk and Kashmiri are understood to be beholden to the same principles related to the added probes, but are assumed to spell out those added probes differently. In Mohawk, the added probe is spelled out quite obviously as an additional morpheme, but in Kashmiri, the case assigning properties of v are modified by the presence of the additional probe on v. (56) v(cid:34) (cid:35) π part. v’ V match no value VP DP(cid:102) (cid:103) π (57) v(cid:34) (cid:35) π part vP DP(cid:102) (cid:103) π match value v’ vP v(cid:34) (cid:35) π part VP DP(cid:102) (cid:103) π V One could imagine that Preminger too could account for these types of second cycle effects by proposing an additional operation, one whose application was triggered by the successful culmination of the operation find. For this to be possible, the operation would need to be able to do two things. First, it would of course need the ability to add the additional probe to the higher projection of v. I see no issue here, at least none which doesn’t also plague the match/value style approach. The operation would also however need to be able to be formulated in such a way where the outcome of agreement could be accessed by the structural description. Preminger has proposed one such operation that we might use for guidance: a movement to subject position rule for non-quirky languages, shown in (58). He argues that movement to subject position is an operation independent of find, but one that depends on find successfully finding an agreement controller to be triggered. To encode this, the operation includes the outcome of an independent operation in its structural description. 126 (58) In a non-quirky-subject language: MtoCSPNQSL = Move (XP successfully targeted by find) What makes this problematic however, is that without derivational tracking or a model driven by the incessant need to value features – which as we’ve discussed Preminger is loathe to do – it seems that this reference must be constrained to rules that modify in some perceptible way the syntactic object they are targeting. Otherwise, it’s not obvious that the grammar can determine that an independent operation has or has not either been triggered or been successful. With respect to the movement to subject rule, we can see that the problem is that the find operation wouldn’t modify in any way the argument being targeted for ϕ-agreement, and subsequent movement. What is modified are the features of the probe, not the goal. It’s not clear how the grammar ‘knows’ whether or not a particular XP goal had been successfully targeted by find. 
With respect to our hypothetical rule, we must be able to propose some operation that can trigger the insertion of an additional probe upon the success of the find operation in valuing a probe’s features. Here, the situation actually appears a bit more optimistic than it does for the movement operation. Upon the success of find, the probe is modified in such a way that its unvalued features receive a value. If one proposed an operation that was triggered by the existence of valued features on v, we might be able to account for the second cycle effects in Mohawk and Kashmiri.

(59) Hypothetical added probe rule
     Inspect the ϕ-features of v. If valued, insert an additional unvalued probe on the projection of v.

This operation could add the same added probe that Béjar and Rezac (2009)’s system does in the case that find was successful. A tangential concern, and one that may affect both proposals equally, is the question of what motivates or explains why the grammar has the added probe mechanism at all. Under the Béjar and Rezac (2009) approach, the addition of the added probe is one instance of a broader last resort mechanism, in this case employed to alleviate PLC violations. The find operation and its hypothetical added probe counterpart could likely invoke reference to the PLC violation as well, especially given its central role in accounting for the Kichean AF data. However, while not especially problematic, the PLC does seem more amenable to being modeled in a system driven by feature valuation than in an obligatory operations approach. We’ll discuss this further in section 3.6. In brief, the need for interpretable [participant] features to establish some sort of agreement relation in order to be licit appears more in sync with a model that enforces its principles through generally similar mechanisms than with one that enforces its principles through the triggering of operations without concern for their outcomes.

At a minimum, what these syntactic effects do is reinforce the idea that the grammar is sensitive to whether the ϕ-agreement probes are successful on the first or the second attempt at agreement. This in turn reinforces the idea that the outcomes of agreement are not simply success or failure, but rather success, failure with interesting effects, or failure that leads to defaults. Whatever model of ϕ-agreement we adopt, it must be able to account for second cycle effects – and, to the extent that we consider Béjar and Rezac (2009)’s treatment of the Mohawk and Kashmiri data reasonable, it must be able to trigger the insertion of an additional ϕ-probe to obviate violations of the PLC. Once again, these empirical requirements suggest that the outcomes of the failure to agree aren’t a simple binary set: success (agreement) and failure (default). Instead, the failure to agree results in more complicated outcomes for both the syntax and the morphology, outcomes that reduce the tenability of the find approach.

3.4.1.3 Probe Modification

Finally, we come to the most problematic person hierarchy data for an operation like find: the agreement pattern found in Nishnaabemwin. Nishnaabemwin exhibits both person and number agreement (Valentine, 2001). Both the person and the number probe are assumed to be located on v, and thus we can characterize Nishnaabemwin as a low-ϕ language (Béjar, 2003). There are two facts about Nishnaabemwin that differ from the other languages we’ve seen so far.
First, the feature specification [π[part]] maps to first person in Nishnaabemwin, as second person is more highly specified in the language, adding an addressee feature to its feature set: [π[part[add]]]. Second, third person in Nishnaabemwin is assumed to not be specified at all, [ø].

Like all languages with person hierarchy effects, the choice of agreement controller depends not only on the syntactic characteristics of one argument, but on those of multiple arguments and on their relative positions. Nishnaabemwin prioritizes agreement with 2nd person arguments, then 1st person, then 3rd person. In a clause with a second person object and a non-2nd person subject (either 1st or 3rd person), the probe targets the object first and agrees, as in (60).

(60) Agreement with 2nd Person Object: the [π[part[add]]] probe on v matches and values against the [π[part[add]]]-bearing internal argument.

Relevant to our comparison of the find operation and Béjar’s match/value approach is what happens when the object is not 2nd person, and thus not a viable target for agreement. As (61) shows, the [π[part[add]]] probe searches its domain and does not find a DP that can fully value its person feature. If the probe is unable to agree with the object because it is a non-second person argument, the agreement controller displaces to the external argument – if that argument is first person. If not, default third person morphology surfaces. What this tells us is that when agreement fails on the first attempt, the probe still cares about finding an agreement controller that respects Nishnaabemwin’s person hierarchy of 2 > 1 > 3; default morphology is not an immediate nor the sole result of failed agreement. This is viewed as evidence that there’s a distinction between failed agreement that results in agreement with something else upon a second cycle and failed agreement that results in the insertion of default morphology. As we already saw in the last section, Béjar (2003) accounts for this pattern through the stripping of the probe’s features upon its failure to be valued by the internal argument – the outcome she terms partial default agreement. The stripping of the probe modifies the set of person features that would qualify as a successful agreement controller, thus making way for first person second cycle agreement (61).

(61) 1st Person Agreement on Second Cycle: the probe fails against the featureless [ø] object, is stripped and projects, and then matches and values against the [π[part]]-bearing first person external argument in Spec,vP.

What’s crucial for accounting for the second cycle agreement patterns found in Nishnaabemwin is the ability to modify the feature specification on probes, an ability that the find obligatory operations approach does not share. If we assumed the obligatory operation find was triggered immediately upon the merge of the probe, it would attempt to agree with the internal argument and it would fail to do so (62). An integral feature of this proposal is that this failure to agree would not cause a derivation crash, as the operation is allowed to fail without consequences for grammaticality. From here, there are two potential next steps: either (i) the find operation is exhausted and can’t reapply, and the default third person form is wrongly inserted at spell out, or (ii) find applies again upon a projection of probe features. In section 3.4.1, I presented reasons to be concerned about allowing find to reapply, but for the sake of pushing the account, let’s assume that it can.

(62) Reapplication of find: the [u.add] probe on v first searches VP and fails against the featureless [ø] object; upon reapplication it has the [π[part]]-bearing first person external argument in its expanded search domain.
The problem is that without access to a means for probe modification, the probe will still be relativized for second person upon any subsequent agreement cycles. Thus the first person external argument that is in the newly created second cycle agreement domain would still be unavailable for agreement, because its feature set doesn’t qualify it as a viable agreement controller. Agreement would then fail on this second cycle and, at the spell out of v, the default third person features would again be wrongly inserted. Notice that under either assumption about find – whether the probe halts its search after the first attempt or is allowed to continue its search through multiple attempts – the outcome is the failure of the probe to agree with an argument, resulting in the insertion of default third person features at spell out. So while second cycle agreement effects of course need the ability to probe on a second cycle (which find can perhaps provide), they also depend on the grammar’s ability to modify the probe in a way that broadens the type of argument the probe can successfully agree with (which find cannot provide). So although the probe is initially relativized to only consider second person arguments as potential controllers, it is able to consider first person arguments as well, but only upon a second attempt. The find operation does not come with the ability to modify the probe upon failure and as a result predicts third person default features instead of the first person morphology we observe.

It’s important to be clear that the pattern shown here reflects true agreement with a first person external argument and is not the result of a more general version of default features. In Nishnaabemwin, there is an empirical difference between first person agreement affixes and third person default agreement affixes, shown in table 3.8. The distinction between the first person affix and the third person default affix, both possible as a result of failed agreement, tells us that the outcome of failed agreement is not simple tolerance, as the find operation models it. The probe must be able to change what it is looking for when it doesn’t find an agreement controller on the first try. The result is either agreement on the second attempt (61) or no agreement at all.

Table 3.8: Nishnaabemwin Agreement Morphemes

Person          Morpheme
1st             n-
2nd             g-
3rd (default)   w-

Preminger is understandably critical of including special diacritics in the agreement system, especially those whose function appears redundant or whose sole purpose is to ensure that the last resort default mechanism does in fact wait until the last resort. The problem here is that Béjar’s partial default agreement appears to have been mischaracterized as such a diacritic. The fact that partial default agreement has an empirical consequence that differs from total default agreement challenges that characterization. Partial default agreement does more than just mark a time-bomb ‘safe’, preventing it from crashing the derivation, and its role goes beyond ensuring that last resort defaults truly wait until all other options have been exhausted. True to its name, partial default agreement allows for a third outcome – an empirically necessary one – between canonical successful agreement and default agreement. Furthermore, the ‘diacritic’ itself doesn’t share typical diacritic behavior in that it actually modifies the probe’s featural specification.
In this way, partial default agreement behaves less like a diacritic and more like an additional operation triggered upon the result of a previous one. Given its crucial role in accounting for more complicated agreement patterns that appear impossible to capture without it, I would challenge the idea that partial default agreement is a redundant diacritic, if it is a diacritic at all. What the person hierarchy data has shown us is that the outcomes of failed agreement are more varied than the find operation itself can model. While it’s tempting to think that partial default agreement is an unnecessary or redundant mechanism intended to ensure that a last resort mechanism does in fact only apply as a last resort, it importantly provides a third outcome of agreement – one that the grammar uses in varied ways.

3.4.2 Dative Intervention

An exciting extension of the obligatory operations approach to failed agreement is its potential to serve as an account of dative intervention. Dative intervention describes the puzzling fact that dative arguments are unable to transfer their own ϕ-features to a probe (63), but can serve to intervene and thus block agreement with a lower argument (64). What makes this phenomenon puzzling is the question of how to reconcile those two behaviors. On the one hand, in order to block agreement with a lower argument, the dative argument must be visible in some way to the probe. Traditionally, this means it needs to have some set of ϕ-features. Without these features, it’s unclear how the probe would be able to “see” and subsequently be halted by the dative argument. On the other hand, if the dative argument does in fact have the ϕ-features that make it visible to the ϕ-probe, it’s unclear why the dative argument is unable to transfer those feature values to the probe via agreement.

(63) a. Strákunum       leiddist/*leiddust.
        boy.the.pl.dat  were.bored.3.sg/*3pl
        ‘The boys were bored.’
     b. Strákarnir      leiddust/*leiddist.
        boy.the.pl.nom  walked.hand.in.hand.3pl/*3sg
        ‘The boys walked hand in hand.’
        (Sigurðsson, 1996)

(64) Það   finnst/*finnast   einhverjum stúdent   tölvurnar
     expl  find.sg/*pl       some student.sg.dat  computer.the.pl.nom  ugly
     ‘Some student finds the computers ugly.’
     (Holmberg & Hróarsdóttir, 2003)

Once we concede the ability of the agreement operation to fail, we can account for this two-faced nature quite easily (Preminger, 2014). To do so, Preminger first cites the assumption that ϕ-agreement is sensitive to the assignment of morphological case (Bobaljik, 2008), with each language adhering to the Moravcsik hierarchy, shown in (65). The way to interpret this hierarchy is to say that if a language permits agreement with dependent case marked arguments, it will also permit agreement with unmarked case arguments, but not agreement with lexically marked or oblique arguments. Each language is able to set its own relevant boundary, which accounts for some of the cross-linguistic variation we see in which types of arguments are viable agreement controllers in different languages. The extension of obligatory operations to dative intervention also depends on this assumption in that the find operation responsible for ϕ-agreement is sensitive to case distinctions. Preminger modifies the operation’s description to be sensitive to this case discrimination, (66).

(65) Moravcsik Hierarchy
     unmarked case > dependent case > lexical/oblique case

(66) find(f)
     Given an unvalued feature f on a head H0, look for an XP bearing a valued instance of f. Upon finding such an XP, check whether its case is acceptable with respect to case discrimination:
     a. yes → assign the value of f found on XP to H0
     b. no → abort find
Dative intervention, under this approach, is the result of failed agreement, which explains why we observe default third person in exactly these instances. Observe the rough sketch in (67) for the Icelandic sentence in (64). Icelandic person probes aren’t relativized for a particular set of person features, so Preminger assumes the probe is a flat [π] probe on T. When merged into the structure, find is immediately and obligatorily triggered and begins its search for an argument with which to agree. According to the description outlined in (66), it finds such an argument in the dative argument einhverjum stúdent. The probe then evaluates the acceptability of the dative argument’s case, according to where on the hierarchy Icelandic sets the parameter for case discrimination – between dependent and unmarked case. According to Icelandic’s case discrimination settings, then, dative case is not acceptable for agreement and the find operation aborts, thus ending the probe’s search. Because operations are allowed to fail without grammatical consequence in the obligatory operations model, the derivation does not crash, and when this phase is spelled out, the third person default features are correctly inserted.

(67) The [u.π] probe on T finds the dative argument einhverjum stúdent in Spec,vP, the case discrimination check fails, and find aborts before ever reaching the lower nominative argument tölvurnar.

The question then turns to how this account of dative intervention fares when coupled with the fact that not all languages with dative intervention are assumed to have flat [π] probes, as Icelandic and French do. In other words, does this account of dative intervention still work in languages whose agreement probes are relativized to search for a particular set of person features, like those sensitive to the person hierarchy effects discussed in the previous section? To see why this is an important question, let’s observe a prediction that this dative intervention model makes. There are two important features of this account. The first is that the dative argument must be visible to the probe in order to trigger the evaluation for case discrimination. The second is that determining that an argument does not meet the requirements dictated by case discrimination must immediately halt the probe. Otherwise, the probe would be able to continue its search and incorrectly target a lower nondative argument. With relativized probes, a smaller set of feature specifications makes an argument visible. This in turn raises the question: what happens when the dative argument is less specified than the probe (and than a lower nondative argument), as shown in (68)? According to the modified find operation, the probe will begin its search, but, being relativized for [participant], for example, it would simply ignore a third person dative argument, as it does in all of the canonical failed agreement data discussed for the Kichean AF constructions outlined in previous sections. Because the dative argument is ignored, it is never evaluated for case discrimination and therefore never causes the find operation to abort. If a lower nondative argument existed, nothing would prevent the probe from finding this argument and establishing a successful agreement relation.

(68) The [u.part] probe on T skips a bare [π] third person dative argument entirely and finds the lower [π[part]]-bearing nominative argument, with which nothing prevents it from agreeing.
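To see the prediction concretely, the sketch below runs a toy version of (66) twice: once with a flat [π] probe of the Icelandic type and once with a [participant]-relativized probe. The representations, the accessibility set, and the function name are simplifications assumed only for illustration.

```python
# Toy version of (66), find with case discrimination; representations simplified.
ACCESSIBLE_CASES = {"unmarked"}          # the boundary Icelandic sets on the hierarchy

def find_with_case_discrimination(relativization, domain):
    for goal in domain:
        if relativization not in goal["features"]:
            continue                              # invisible to a relativized probe
        if goal["case"] in ACCESSIBLE_CASES:
            return goal                           # (66a): assign the value
        return None                               # (66b): visible but discriminated: abort

dative_3rd = {"person": "3", "case": "dative",   "features": {"pi"}}
nom_2nd    = {"person": "2", "case": "unmarked", "features": {"pi", "part"}}

# Flat [pi] probe (Icelandic-style): the dative is visible, fails the check, aborts.
print(find_with_case_discrimination("pi", [dative_3rd, nom_2nd]))              # None

# [participant]-relativized probe: the 3rd person dative is skipped entirely and
# the lower nominative is wrongly targeted.
print(find_with_case_discrimination("part", [dative_3rd, nom_2nd])["person"])  # '2'
```

With the flat probe the dative aborts the search, deriving intervention; with the relativized probe nothing halts the search before the lower nominative, which is the configuration the Georgian data below instantiates.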
There is data from Georgian that appears to be exactly of the kind described above and thus constitutes a data set that the find approach to dative intervention cannot capture (Harris, 1981). In (69a), there is both a third person dative argument and a third person nominative argument. Agreement morphology is third person and is consistent with an approach like the one we’ve described above, where failed agreement would trigger the insertion of third person default features. The find proposal does not run into any issues here because even if the probe is unable to ‘see’ the dative argument, it would likewise be unable to ‘see’ the third person nominative, resulting in failed agreement and predicting the appearance of third person features. However, data like (69b) cause a problem. Since the probe in Georgian is relativized to search for a [participant] feature, it won’t ever be able to investigate whether or not the dative argument’s case features respect case discrimination for the language, because third person arguments do not bear this feature. Without being able to halt the probe, nothing would prevent the probe from successfully agreeing with the [participant]-bearing nominative argument, which, as (69b) shows, is ungrammatical. What is missing here is the ability for a third person dative argument to be visible to a more specified relativized probe within a system that otherwise depends on the relativized probe ignoring less specified arguments.

(69) a. vanom     anzori      šeadara                 givis
        Vano-erg  Anzor-nom   he-compares-him-him     Givi-dat
        ‘Vano compared Anzor to Givi.’
     b. *vanom    (šen)       šegadara                givis
        Vano-erg  you-nom     he-compared-him-you     Givi-dat
        ‘Vano compared you to Givi.’
        (Harris, 1981)

An approach like Béjar’s match/value approach does not share this problem, because the probe evaluates match at the root feature level, ignoring any further featural structure that may exist. This means that third person arguments aren’t ignored, but instead mark the probe for partial default agreement, where relevant. The relativized [participant] probe is able to search as it always does and considers the dative third person argument a successful match. From here, if we assumed that a match/value approach to dative intervention uses the same sensitivity to case discrimination that the find approach does, and that case discrimination would similarly halt the operation, we could account for dative intervention in the same way that Preminger (2014) does. At its core, the dative intervention puzzle is one of visibility. When a probe is specified quite minimally, as it is in Icelandic and French, it’s hard to see how any problems could arise, because any dative argument will be visible enough to the probe to instigate case discrimination inspection. Of course, as is the theme of this section, failed agreement patterns aren’t always this simple, and the additional complexity that person hierarchy sensitive languages add raises some issues with respect to the interaction between relativized probes and dative intervention. The find operation has difficulty capturing this additional complexity, while the match/value approach has the mechanisms available to handle it.

3.4.3 Conjunct Agreement

A final example of an instance where failure to value does not immediately trigger a default comes from analyses of conjunct agreement.
Here we’ll see that for some languages, a failure to value can result in different mechanisms for calculating conjunct agreement patterns, either resolved agreement (RA) or closest conjunct agreement (CCA). What’s important to take away here is the existence of an outcome of the failure to value that is distinct from the insertion of a default form. This signals that the grammar must have some way of distinguishing when to use default mechanisms as the result of failed valuation and when to use other mechanisms (like CCA). The answer to this question still eludes us, but it’s reasonable to attempt to propose a syntactic distinction that the morphology or PF can read in a way that would help condition which outcome we observe. To the extent that we can find such a syntactic distinction, the match/value approach seems able to provide it in a way that a find approach that only encodes triggering cannot. Derivational time-bombs care about outcomes; obligatory operations care about beginning states. Failed agreement is a non-uniform set of outcomes, and it needs a model that cares about outcomes more than it cares about beginnings.

Bhatt and Walkow (2013) outline data from Hindi-Urdu that shows a distinction in how conjoined arguments interact with agreement depending on whether those conjoined arguments are in subject position or in object position. When a conjoined argument is in subject position, we get something called resolved agreement, or agreement with the entire conjoined phrase. This is shown below in (70), where number agreement is easiest to illustrate. In the examples in (70), number agreement tracks the entire phrase, not one of the conjuncts. So even when neither argument is plural, we still observe plural agreement when the conjoined argument is in subject position, because the agreement probe is accessing the features of the entire ConjP. This of course implies that the grammar has some mechanism available for calculating the resolved features of a conjoined phrase. The details of how this calculation occurs are not relevant here, so I direct you to Bhatt and Walkow (2013) and Marušič, Nevins, and Badecker (2015) for more discussion.

(70) a. Ram    aur   Ramesh     gaa   [rahe       hã˜i        /  *rahaa       hai]
        Ram.m  and   Ramesh.m   sing  prog.m.pl   be.prs.pl   /  *prog.m.sg   be.prs.sg
        ‘Ram and Ramesh are singing.’                         m.sg + m.sg: agreement = m.pl
     b. Sita   aur   Ramesh     gaa   [rahe       hã˜i        /  *rahaa       hai]
        Sita.f and   Ramesh.m   sing  prog.m.pl   be.prs.pl   /  *prog.m.sg   be.prs.sg
        ‘Sita and Ramesh are singing.’                        f.sg + m.sg: agreement = m.pl
     c. Ram    aur   Sita       gaa   [rahe       hã˜i        /  *rahii       hai]
        Ram.m  and   Sita.f     sing  prog.m.pl   be.prs.pl   /  *prog.f      be.prs.sg
        ‘Ram and Sita are singing.’                           m.sg + f.sg: agreement = m.pl
     d. Mona   aur   Sita       gaa   [[rahii / rahe]    hã˜i       /  *rahaa      hai]
        Mona.f and   Sita.f     sing  [prog.f / prog.m.pl] be.prs.pl /  *prog.m.sg  be.prs.sg
        ‘Mona and Sita are singing.’                          f.sg + f.sg: agreement = f.pl/m.pl
     (Bhatt & Walkow, 2013)

The agreement pattern differs, however, when conjoined phrases are in object position instead. When a conjoined phrase is in object position, resolved agreement is completely unavailable (73) and instead the probe agrees with the closest conjunct (71)-(72).
(71) a. Ram-ne   ek  thailii  aur  ek  badsaa  (aaj)    uthaa[-yaa       / *-yii    / ??-ye]
        Ram.erg  a   bag.f    and  a   box.m   (today)  lift[-pfv.m.sg   / *-pfv.f  / ??-pfv.m.pl]
        ‘Ram lifted a small bag and a box (today).’          [f.sg + m.sg] … V.part.m.sg
     b. Ram-ne   kai   thailiyã:  aur  ek  badsaa  (aaj)    uthaa[-yaa      / *-yii   / ??-ye]
        Ram.erg  many  bag.f      and  a   box.m   (today)  lift[-pfv.m.sg  / *-pfv.f / ??-pfv.m.pl]
        ‘Ram lifted many small bags and a box (today).’      [f.pl + m.sg] … V.part.m.sg
     c. Ram-ne   ek  thailaa  aur  ek  baksaa  (aaj)    uthaa[-yaa      / ??-ye]
        Ram.erg  a   bag.m    and  a   box.m   (today)  lift[-pfv.m.sg  / ??-pfv.m.pl]
        ‘Ram lifted a bag and a box (today).’                [m.sg + m.sg] … V.part.m.sg
     d. Ram-ne   kai   thaile  aur  ek  baksaa  (aaj)    uthaa[-yaa      / ??-ye]
        Ram.erg  many  bags.m  and  a   box.m   (today)  lift[-pfv.m.sg  / ??-pfv.m.pl]
        ‘Ram lifted many bags and a box (today).’            [m.pl + m.sg] … V.part.m.sg

(72) a. Ram-ne   ek  thailii  aur  ek  petii  (aaj)    [uthaa-yii  thii         / ??uthaa-yii  th˜i:         / ??uthaa-ye     the]
        Ram.erg  a   bag.f    and  a   box.f  (today)  [lift-pfv.f be.pst.f.sg  / ??lift-pfv.f be.pst.f.pl   / ??lift-pfv.m.pl be.pst.m.pl]
        ‘Ram had lifted a small bag and a box (today).’      [f.sg + f.sg] … V.part.f Aux[f.sg]
     b. Ram-ne   kai   thailiyã:  aur  ek  petii  (aaj)    [uthaa-yii  thii        / ??uthaa-yii  thi˜i:       / ??uthaa-ye     the]
        Ram.erg  many  bags.f     and  a   box.f  (today)  [lift-pfv.f be.pst.f.sg / ??lift-pfv.f be.pst.f.pl  / ??lift-pfv.m.pl be.pst.m.pl]
        ‘Ram had lifted many bags and a box (today).’        [f.pl + f.sg] … V.part.f Aux[f.sg]
     c. Ram-ne   ek  thailaa  aur  ek  petii  (aaj)    [uthaa-yii  thii         / ??uthaa-yii  th˜i:         / ??uthaa-ye     the]
        Ram.erg  a   bag.m    and  a   box.f  (today)  [lift-pfv.f be.pst.f.sg  / ??lift-pfv.f be.pst.f.pl   / ??lift-pfv.m.pl be.pst.m.pl]
        ‘Ram had lifted a bag and a box (today).’            [m.sg + f.sg] … V.part.f Aux[f.sg]
     d. Ram-ne   kai   thaile  aur  ek  petii  (aaj)    [uthaa-yii  thii        / ??uthaa-yii  th˜i:        / ??uthaa-ye     the]
        Ram.erg  many  bags.m  and  a   box.f  (today)  [lift-pfv.f be.pst.f.sg / ??lift-pfv.f be.pst.f.pl  / ??lift-pfv.m.pl be.pst.m.pl]
        ‘Ram had lifted many bags and a box (today).’        [m.pl + f.sg] … V.part.f Aux[f.sg]

(73) Ram-ne   ek  phaalvaalii    aur  ek  duudhvaalii      [dekhii    thii         / ??dekhii   th˜i:        / *dekha        tha          / *dekhe        the]
     Ram.erg  a   fruit.seller.f and  a   milk.seller.f.sg [see-pfv.f be.pst.f.sg  / see-pfv.f  be.pst.f.pl  / *see-pfv.m.sg be.pst.m.sg  / *see.pfv.m.pl be.pst.m.pl]
     ‘Ram had seen a fruit seller and a milk seller.’
     (Bhatt & Walkow, 2013)

To account for the distinction in behavior between conjoined subjects and objects, Bhatt and Walkow (2013) propose an analysis of conjunct agreement that derives the distinction through accessibility. Conjoined nominals are assumed to have the structure shown in (74), where each individual conjunct has its own ϕ-features and the &P is the locus of resolved features, the ϕ-features that represent the entire conjoined nominal. Since the ϕ-features of the &P are higher – and therefore closer to the probe – than the ϕ-features of the individual conjuncts, the probe will interact with them first. This results in resolved agreement being the ‘typical’ outcome, failing to obtain only when blocked in some way. Since Hindi-Urdu has one ϕ-agreement probe, we can view the situation as another instance of agreement competition, this time between the ϕ-features on the &P and the ϕ-features of the individual conjuncts.

(74) [&P[ϕ&] DP1[ϕ1] [ & DP2[ϕ2] ] ]

When the conjoined argument is in subject position, the ϕ-agreement probe on T will first encounter the ϕ-features of the &P (75). Since there is no evidence that Hindi-Urdu obeys the kind of person hierarchy effects observed in previous sections, we can safely assume the probe on T is a flat probe, specified as [uϕ].
Therefore, there is nothing to block agreement with the ϕ-features on the &P, resulting in resolved agreement.

(75) The [uϕ] probe on T agrees with the resolved [ϕ&] features of the conjoined subject &P, while v’s [uϕ] probe agrees with the DP object.

When the conjoined argument is instead in object position, the situation is a bit different. Bhatt and Walkow (2013) assume that when v merges into the structure, it assigns case to and agrees with the coordinated object (76a), rendering its ϕ-features inaccessible for ϕ-agreement via the activity condition (Chomsky, 2001). This has the effect of making the ϕ-features on &P unavailable to value any future probes, but importantly still accessible to matching. When the probe on T reaches the &P object, it cannot agree with the resolved features on the &P itself (76b), explaining why resolved agreement is impossible when the conjoined argument is in object position, (73). Bhatt and Walkow (2013) then propose that a match with the ϕ-features on &P, coupled with a failure to value the features on the probe, triggers a morphosyntactic algorithm that decides which of the two conjuncts will value the probe: this results in either first conjunct agreement or last conjunct agreement.

(76) a. Step 1: v agrees with and assigns case to the coordinated object &P[ϕ&], deactivating its resolved features.
     b. Step 2: the [uϕ] probe on T, probing past the ergative subject, matches the resolved [ϕ&] features on the deactivated &P but fails to be valued by them.

Once again, we see an instance where the outcomes of failed agreement are more complicated than the insertion of a default. In the Hindi-Urdu conjunct agreement data, we see that the failure to value can result not in a default, but rather in the calculation of closest conjunct agreement. While neither of the two agreement models we’ve discussed is obviously extendable to this data without modification, the match/value approach certainly appears more amenable, since it already has a third outcome of agreement built in via the partial default mechanism. We can imagine modifying this approach to include the morphosyntactic algorithm as an additional outcome of a probe succeeding in matching with a goal, but failing to be valued by it. One of the research questions that Bhatt and Walkow (2013) leave for future research is how we distinguish between default agreement as a result of failure to value and closest conjunct agreement as a result of failure to value. In the match/value approach, we can imagine the answer. Closest conjunct agreement is the result of a successful match with a failure to value, while default agreement is the result of a total failure to agree. Modifying the find approach to make this distinction between failed valuation outcomes is more difficult. Inherent to its conceptual basis is the idea that the grammar’s processes are driven by the need to trigger operations, not by the need to ensure any particular result. In this system, there is no division between matching and valuing, because find is a single operation; it has a binary set of outcomes: it either succeeds or it fails. It is hard to imagine how to distinguish between when failure to agree triggers a default and when failure to agree triggers a different result.

3.4.4 Interim Summary

The common thread that ties the data in this section together is that the failure to value a set of ϕ-features leads to a more complex set of outcomes beyond the insertion of a default. The solution that we propose therefore needs to be able to encode this nonbinary set of outcomes of failed agreement, as the sketch following this paragraph schematizes.
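One way to picture the required outcome space, assuming a match/value-style decomposition, is as a three-way routing rather than a success/failure toggle. The sketch below is purely schematic; the routing targets are placeholders for whatever mechanisms (second cycle agreement, the CCA algorithm, default insertion) a full analysis would supply.

```python
# Schematic routing of a nonbinary outcome space, assuming a match/value-style
# decomposition; the routing targets are placeholders, not analyses.

def agree_outcome(relativization, goal):
    if goal is None or "pi" not in goal.get("features", set()):
        return "no match"                        # total failure
    if relativization in goal["features"] and goal.get("active", True):
        return "match and value"                 # canonical agreement
    return "match without value"                 # the third outcome

ROUTING = {
    "match and value":     "value the probe; spell out agreement with the goal",
    "match without value": "trigger a repair: second cycle, CCA algorithm, partial default",
    "no match":            "insert default morphology (e.g. 3sg) at spell-out",
}

deactivated_conj_phrase = {"features": {"pi", "plural"}, "active": False}  # case-marked &P
bare_third_person       = {"features": {"pi"}}                             # person hierarchy case

print(ROUTING[agree_outcome("pi", deactivated_conj_phrase)])   # match without value
print(ROUTING[agree_outcome("part", bare_third_person)])       # match without value
print(ROUTING[agree_outcome("pi", None)])                      # no match -> default
```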
The person hierarchy data showed both the need for a second cycle of agreement and the need for probe modification. The dative intervention data illustrated a need to be able to distinguish ϕ-features on two levels: their value and the category to which they belong. Finally, conjunct agreement data showed that when valuation fails, the outcome is not the immediate insertion of a default, but rather that the grammar can use the failed valuation to trigger a host of other outcomes. We saw how find was unable to capture some of these more complicated agreement phenomena and how the availability of a match/value approach provided a solution, despite cited criticisms of partial default agreement. At this juncture, one may concede that the details of the find operation are unable to capture the more complicated types of failed agreement illustrated in this section, but still be concerned that the crux of Preminger’s argument has not been rebutted. One can reasonably ask whether we could take the details of the Béjar match/value approach that do account for these patterns and combine them with the obligatory operations impetus that drives derivations. I will spend the next two sections providing arguments against the conceptual impetus behind obligatory operations to argue that a model that encodes grammatical requirements in the standard way, via the need to value features, should be preferred. The relevant implication for the broader discussion in this chapter is that obligatory operations, which care only about whether or not an operation has been triggered, and do not encode any sort of grammatical requirement in the outcome of the operation are ill-suited to handle this non-binary set of failure outcomes. 144 3.5 The Premature Overapplication of Defaults Fallible operations move the impetus of various derivation operations away from their outcomes towards the contexts in which they are triggered. However, their ability to fail introduces a few timing issues because if operations are allowed to fail without grammatical consequence, then we need to ensure that they do not fail too early. To begin, we ask: what does it really mean to say that operations are obligatory? If the goal here is to better account for why ϕ-agreement as a phenomenon is obligatory, then of course we should hope to discover a better answer than ‘ϕ-agreement is obligatory because the operation responsible for it is obligatory’. What is missing from this hypothetical response is an explanation for what exactly is responsible for the operation’s inherent obligatory nature; why must the operation apply? Without this understanding, we reduce inherent obligatoriness to stipulation. Preminger of course sympathizes with this need for explanation and he provides some insight towards more satisfying answers. To encode obligatoriness in a substantive, non-stipulative way, we need two properties: au- tomation and the immediacy it implies. The basic find operation provides a nice illustration of why these properties are necessary. Take once again, Kichean AF agreement; all that’s needed to account for why ϕ-agreement must happen is to say that once the operation’s structural description is met, the operation immediately and automatically is triggered. For find, this means that upon the merger of an unvalued feature f on a head H, the operation proceeds. If that unvalued feature f finds a successful match, then the operation’s result will be the transfer of ϕ-features from goal to probe. 
If it does not find a successful match, then the operation can fail without consequence for grammaticality and third person singular default features will surface. Importantly, what allows us to adopt this explanation for the operation's obligatory nature without stipulation is that it automatically applies once its structural description has been met. Inherent to this account are two concepts that are at odds with one another: the need for immediacy and the need for delay until the creation of the relevant structural description. While there's a tension between the two, they are certainly not incompatible; the basic find operation is a great illustration of this. However, because their natures are in constant tension, there at least exists the possibility that they can interact in problematic ways. Where we don't see problematic interaction is where the structural description for a particular operation is quite simply the presence of a particular syntactic object. In these instances, of which basic find – repeated below in (77) – is an example, merging a single syntactic object creates the structural description and in this way the two conflicting natures are easily reconciled in one step. Upon merge of an unvalued feature f on a functional head, the probe may begin probing.9 Here we have no real tension between the automatic triggering of the operation and any need for delay.

9 Whether or not the probe can continue probing was discussed in a prior section. While possible, it would be done via stipulation and would be at odds with the spirit of what drives operations.

(77) find(f)
     Given an unvalued feature f on a head H0, look for an XP bearing a valued instance of f and assign that value to H0.

Where we might see problematic interaction between the need for immediacy and the need for delay is when either the structural descriptions for operations are more complicated or when the rule that is triggered relies on the application of an independent operation. Preminger's extension of the find operation as an account of dative intervention provides an illustration of this point. Preminger follows Bobaljik (2008) in arguing that, due to ϕ-agreement's sensitivity to case discrimination, the application of ϕ-agreement must follow the valuation of case features. This sort of delay is exactly the kind of situation that proves problematic for encoding obligatoriness in the automatic triggering of operations. Essentially, what case discrimination means for the timing of ϕ-agreement is that find must wait not only until its structural description has been met, but also until the case features of the relevant arguments are assigned before it can proceed. This gap between the creation of the structural description and the point at which the operation needs to be triggered is probably widest if one assumes, as Preminger does, a dependent case model of case assignment. Relevant to our current discussion is the fact that dependent case is a configurational model of case valuation, which means that in order to assign case features, all relevant competitors must be present in the derivation before their case features can receive their respective values. If ϕ-agreement is dependent on the valuation of these case features, then find needs to wait not only until the structural description below has been met upon the merger of an unvalued feature f, but it also needs to wait until all arguments are merged into the structure and have been assigned case. Otherwise, it could be triggered too early and fail without consequence.
The result would be an overapplication of default morphology in instances where we should observe true agreement. We can see a similar tension in the operations proposed to handle object shift. Object shift describes a phenomenon that involves the optional movement of a DP object out of the VP it initially occupies (Diesing & Jelinek, 1993; Fox & Pesetsky, 2005; Holmberg, 1986, 1999). The following data is from Icelandic (Thráinsson, 2007). What is obligatory is that if object shift occurs, then a specific interpretation is required (78a) and if object shift does not occur, a specific interpretation is impossible (78b). If, however, the reason that the object has not shifted outside the VP is because object shift is blocked or otherwise unavailable, then a specific interpretation for the object DP is still licit (79).

(78) a. Ég las1 [þrjár bækur]2 aldrei [VP t1 t2].
        I read(past) three books never
        'There are three books that I never read.'
        (✓specific reading of 'three books', ✗nonspecific reading)

     b. Ég las1 aldrei [VP t1 þrjár bækur].
        I read(past) never three books
        'I never read three books.'
        (✓nonspecific reading of 'three books', ?specific reading)

(79) a. *þau hafa [viðtöl við Blair]2 alltaf [VP sýnt t2] klukkan ellefu
         they have interviews with Blair always shown clock eleven

     b. þau hafa alltaf [VP sýnt [viðtöl við Blair]] klukkan ellefu.
        they have always shown interviews with Blair clock eleven
        'They have always shown interviews with Blair at 11 o'clock.'
                                                                  (Thráinsson, 2007)

To account for this, Preminger proposes the operation shown in (80). Notice that this rule is sensitive to language-specific structural conditions. For example, in Icelandic object shift, this condition is that the verb must have moved out of the VP. Notice that the structural description for this rule is the existence of an X that is [+specific], but the point at which the operation needs to apply to avoid premature failure is dependent on other syntactic concerns. To account for Icelandic object shift, the operation must be sensitive to whether or not a V has been moved out of its VP. In order to encode obligatoriness in the immediate and automatic triggering of operations, the operation actually needs to be sensitive to much more than the structural description in order to prevent overapplication of defaults (or failure). In this way, it seems that we cannot enforce the grammatical requirements solely through obligatory operations. Once we admit this, it's not clear to what degree obligatory operations as a framework is more attractive than the derivational time-bombs approach.

(80) An obligatory operations model of OS
     X[+specific] → Shift[X]
     where Shift is the operation that causes a noun phrase to vacate the VP, and is subject to language-particular structural conditions on its successful culmination.

In pursuit of fairness, my intention here is not to pick on the details of the proposed object shift operation, especially because Preminger does not offer it as a significant proposal, but rather as a mere illustration that other phenomena share the same logic as ϕ-agreement. Their shared logic is that if a rule can apply, it must, but the conditions are such that if a rule is unable to apply, the requirement is lifted.
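The timing worry running through this section can be made concrete with a small sketch (again a toy model of my own, not Preminger's formalism): if a case-discriminating find fires the instant its structural description is met, it fails and inserts a default before dependent case has been computed over the co-arguments; getting true agreement requires delaying the trigger, which means the operation must be sensitive to more than its own structural description.

```python
# Toy illustration only: a case-discriminating probe that is triggered
# automatically upon merger of its unvalued feature can "fail" prematurely,
# because the case values it discriminates on are only computed later.

def find(arguments):
    """Only an argument already bearing unmarked case can value the probe;
    otherwise the probe falls back on a 3sg default."""
    for arg in arguments:
        if arg.get("case") == "unmarked":
            return arg["phi"]
    return "3sg default"

obj = {"case": None, "phi": "3pl"}     # internal argument merged; case not yet valued

early = find([obj])                    # triggered at merger of the probe: premature failure
obj["case"] = "unmarked"               # dependent case computed once all competitors are in
late = find([obj])                     # triggered only now: true agreement

print(early, late)                     # -> 3sg default 3pl
```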
What I do intend to show however, is that while other phenomena might be amenable to an obligatory operations approach in logic, we must be especially careful in how we formulate the rules responsible and that some phenomena may not actually be as amenable as it may appear at first blush due to a dependence on other operations successfully applying first. If all of these operations essentially need to wait until much more structure has been built, then it is worth wondering at what point they are actually triggered. A principled answer to this question could be that they are triggered upon spell out, at the phase level. But we know from our discussion of second cycle effects that – at least for languages that show preference towards 148 internal arguments (low-ϕlanguages) – the probe must begin its search before the addition of the external argument. What we seem to have to conclude is that in order to derive the obligatoriness of operations beyond stipulative notions, we have to rely on either automation or immediacy, but relying on these introduces a number of timing issues and/or rule formation issues. On the other hand, a derivational time-bombs approach – at least when coupled with an intermediate ability to fail like what’s encoded in the match/value approach – can avoid some of these problems because it doesn’t rely so heavily on notions of automation or immediacy. The derivational time-bombs approach derives its obligatory nature from a response to interface conditions, making the exact timing of a varied set of operations less crucial to capturing syntactic phenomena. If we take the match/value approach to dative intervention, for example, there’s no inherent problem with ϕ-agreement waiting until case valuation because the operation behind ϕ-agreement isn’t driven by notions of obligatoriness. What drives the operation in that framework is the need for features to be valued. Waiting until case is assigned is not at odds with the operation’s motivation as it is in the obligatory operations framework. As we discussed in the introduction to this thesis, the overapplication of defaults is just as problematic to the framework as getting the forms to surface in the first place (and in my view is actually the more theoretically interesting piece.) The overapplication puzzle really centers on the sorts of timing issues discussed here: how do we prevent the default mechanism from applying too early? What derivational time-bombs seem to get us is a way to slow down, or otherwise constrain, the unfettered application of syntactic operations. Preminger frames this as redundant, but I think both the timing issues illustrated in this section and the non-binary outcomes of failed agreement in the previous section show that unfettered application causes real problems and that derivational time-bombs are not redundant, but rather perform important moderating functions with respect to derivational timing. 149 3.6 Some Conceptual Issues Now that we’ve discussed the empirical implications of adopting an obligatory operations ap- proach, it’s worth considering its conceptual implications. There are two issues in particular that are especially worth discussing: (i) the extendability of the approach and (ii) how it models grammatical requirements. 
I argue that if we cannot adopt the obligatory operations approach framework-wide, then there isn't much benefit to adopting it in such a small domain of the syntax, especially with the existence of a derivational time-bomb compatible alternative, like the match/value approach.

3.6.1 Framework-wide adoption

There are a few extensions of this framework beyond the narrow scope of Agree in domains that share a similar logic to ϕ-agreement. Preminger (2014) characterizes these phenomena as being similar to ϕ-agreement in that each is obligatory if the phenomenon is possible, but that if the conditions aren't such that the operation can apply, there's no consequence for grammaticality. I'll briefly review both the phenomena and their respective obligatory operations logic below. First is the suggestion that we model object shift with an obligatory operation that is triggered immediately when the structural conditions for it are met. See the previous section for details. What Preminger wants to capture is the idea that if the conditions for object shift are impossible, the typically obligatory covariation between specificity and movement is lifted. In other words, the specific reading of shifted objects is obligatory, as is the nonspecific reading of an unshifted object. However, if the conditions for object shift are not present and the reason the object stayed in its original position was because it was prevented from doing so, then both interpretations are possible. This mimics the logic of ϕ-agreement, as characterized by Preminger: if ϕ-agreement is possible, it's obligatory, but if there's a situation where ϕ-agreement is impossible, the obligatory requirement is lifted. Similarly, Preminger offers an obligatory operation to handle the definiteness effect. The definiteness effect is a phenomenon that typically bars definite arguments from staying in situ. This asymmetrically affects definite, rather than indefinite, arguments (81). Like ϕ-agreement and object shift, if the conditions are such that the definite argument cannot move to subject position, it is allowed to stay in situ (82). Once again, if the context is unavailable, the requirement is lifted. Preminger proposes an operation that is triggered obligatorily to account for this data (83).

(81) a. The boy/A boy seems to be playing in the garden.
     b. There seems to be a boy/*the boy playing in the garden.

(82) a. The boy/A boy seems to the girls to be playing in the garden.
     b. There seems to the girls to be a boy playing in the garden.

(83) An obligatory operations model of the DE
     a. X[+definite] → MtoCSP[x]   (universal)
     b. X[Ext. Arg.] → MtoCSP[x]   (parameterized)

Finally, Preminger suggests one last extension, long-distance wh-movement. The issue Preminger raises for long-distance wh-movement is that if we model it as being the result of an unvalued [wh] feature attracting and triggering movement of a valued [wh]-bearing XP, we are forced into proposing two versions of the non-interrogative complementizer: a [wh]-bearing one that would attract the wh-phrase in (84a) and one that does not bear this feature, to handle (84b). The need for two versions is due in part to the derivational time-bomb nature of unvalued features. If instead there was an unvalued [wh] in (84b), it would remain unvalued throughout the derivation as there is no [wh]-bearing XP for it to attract.

(84) a. What did Mary say [t [C that] John wanted what]?
     b. Mary said [[C that] John wanted an armadillo].
Preminger instead proposes an obligatory operation displace wh that is shown in (85). Upon the merge of any complementizer, the operation is triggered to displace a c-commanding wh-bearing XP. If the clause contains such XP, as in (84a), the wh-phrase will move. If the clause however does not contain a wh-bearing XP, the operation will fail without grammatical consequence. 151 (85) An obligatory operations approach to wh-movement C → Displace(wh) Preminger tempers his argument with these extensions, offering them merely as suggestions that show an extension to other domains is at least possible, rather arguing they constitute a more attractive alternative. As I briefly mentioned in section 3.5, we do have to be careful about what sorts of rules we are comfortable with proposing. In order to maintain that obligatoriness is derived and not stipulated, the operations proposed in the system must be able to immediately apply upon the creation of the structural description. Some rules, like the definiteness operation or long-distance wh-movement, arguably seem better suited to this goal. Others however, like the object shift rule or the revised find operation to handle dative intervention, are far more difficult and need to be reframed in a way that their obligatoriness truly is enforceable by automatic and immediate triggering. As discussed in section 3.5, these latter rules suffer from a timing issue that is a result of a gap between the creation of its structural description and independent constraints on its application. Furthermore, as we saw with the second cycle agreement data, we must also be careful to consider the potential outcome of operation failure, as it is not often the case that the grammar simply tolerates failure in a simple way. At a minimum, given these concerns, extension is much more problematic than we are led to believe. If the existence of tolerated failure necessitated an obligatory operations approach, we might be more willing to tolerate these concerns; but with the availability of a match/value approach that also handles grammatical failures, I think the concerns become more serious. To the extent that these considerations constitute a counterargument, I offer them here. Where we might find more convincing counterarguments are in extensions to phenomena that more canonically cause ungrammaticality through failure to value features. Three in particular are potentially problematic for a framework-wide extension of obligatory operations: the EPP, case licensing, and the PLC discussed in the Kichean AF data overview. The EPP is standardly accounted for through the proposal of a strong unvalued D feature that triggers the movement of the highest argument to the subject position or triggers the insertion of an expletive. The need 152 for the subject position to be filled is a strict requirement and is one that is modeled well under a derivational time-bomb style approach; a derivation crashes if this D feature remains unvalued either through movement of a DP or through the insertion of a DP expletive. Another grammatical requirement that is especially amenable to filter-like systems and therefore something that would be quite difficult to handle in a framework where failure is widely tolerated is case theory. Standardly, nominals are assumed to need syntactic licensing and the assignment of case values to nominals is what can perform this licensing function. 
If a nominal fails to receive case throughout the course of the derivation, as it does in (86), the derivation crashes.

(86) *It is likely her to leave the party early.

Finally, the PLC, repeated below in (87), offers another filter-like phenomenon that would be especially difficult to capture with an operation that was allowed to fail.

(87) Person Licensing Condition (Béjar & Rezac, 2003)
     Interpretable 1st/2nd person features must be licensed by entering into an Agree relation with an appropriate functional category.

Preminger speculates that we might move what's responsible for these filter-like phenomena to a more amenable grammatical component where derivation crashing isn't relevant – removing them from the syntax (see Preminger, 2014, for more details). I raise the issue here as a reminder of the scope of the grammar and the models we have under consideration. The fact that problematic phenomena are addressed by moving the requirement they enforce out of the syntax and into a different component of the grammar is quite telling. It reinforces that each truly provides great difficulty for an obligatory operations approach. Any evidence that shows that these phenomena belong rightly in the syntax proper introduces a huge problem for framework-wide adoption. We've seen that there are some operations whose obligatoriness is difficult to enforce, like object shift and case-discriminating find, and other phenomena that appear categorically incompatible with obligatory operations themselves; we therefore should be quite pessimistic about its extension, especially in light of an alternative and more attractive way to handle failed agreement data. A final related point, and one that I won't be able to fully address here, is a consideration of other syntactic phenomena that do not fit under the ϕ-agreement umbrella as narrowly defined in Preminger (2014), but nonetheless are standardly treated as being accounted for through the agree operation. In order to motivate more strongly the existence of agreement failures, Preminger restricts what he considers agreement to a very narrow understanding of morphological co-occurrence of features. He recognizes, however, that in modern frameworks what's considered an outcome of a more general agree operation is significantly less narrow. agree as an operation has been used to account for negative concord (Zeijlstra, 2004), noun-modifier concord (Baker, 2008b), modal concord (Zeijlstra, 2008), and the binding theory (Reuland, 2011), among many other things. The availability for us to extend find to these other agreement-like phenomena isn't addressed explicitly, but it is an important question to consider. Either we must be able to extend find to those other phenomena – which means we should find evidence that they are allowed to fail – or we cannot extend find. Being unable to extend find has an unattractive outcome in that we lose the ability both to treat these phenomena as a set and to reduce the number of mechanics we propose to account for them. With the existence of a match/value approach that can also account for failed agreement, it becomes less clear what the advantages are of adopting the obligatory operations approach and more clear what we risk losing. Furthermore, the timing issues raised in the previous section signal that it is possible that uninterpretable features may serve a timing-regulating function.
If uninterpretable features are needed in the grammar more generally, once again we ask what we gain and what we lose by removing their role in a very narrow set of circumstances. 3.6.2 What are probes? One conceptual result of adopting an obligatory operations framework is that it makes it harder to understand why syntactic objects with unvalued features are the things that probe. It’s quite standard to assume that the defining characteristic of probes is their unvalued nature (Carstens, 2016). The motivation for the probe’s search is an intrinsic need to get a value. The reason that 154 probes are, by definition, things with unvalued features is because the state of having an unvalued feature is generally (with some limitations) not tolerated by the grammar. This is not necessarily true in a framework that encodes requirements through obligatory triggering. The concern is that if we’re not careful, what defines a probe under obligatory operations is more stipulative than it is under an approach that enforces grammatical requirements through the valuation of unvalued features. The need to receive a value is not only explicitly denied, but is also largely dependent on proposing a rule that would require it. Features are what signal that a syntactic object is eligible to establish a relationship, they are the mechanism through which that relationship is established, and the transfer of their values is what signals that relationship to other components of the grammar. What does it mean for us to assume that despite this central role, they aren’t what enforce the requirements of the grammar? What the data in this chapter has shown is that the grammar cares very much about unvalued syntactic features receiving feature values. The grammar appears to have at least two different ways to receive a feature value: either through establishing a syntactic dependency with a valued feature bearing object or through failing to do so and receiving one via some default mechanism that supplies features as a last resort. The ability for ϕ-agreement operations to keep continuing to apply cyclically until a value is found encourages a characterization that this need for valuation is quite strong. This is more or less expected under an approach that frames grammatical requirements as being largely imposed by interface conditions, as is true of the standard derivational time-bombs approach. It is less obviously expected under an approach that says grammatical requirements are encoded component-internal, through an obligatory triggering of operations responsible for valuing features. The grammatical requirement find encodes is importantly not that an agreement probe receive a value, it’s that the find operation is triggered when its structural description is met. The grammatical requirement – what the grammar cares about – is the application of the rule, not the valuing of the feature. One of two things is likely true under such an approach: either (i) there are a number of obligatory operations, some of which happen to value features as an outcome or (ii) all operations 155 involve this need to value features and they all apply obligatorily. If the first is true, then it would mean that the fact that there’s a strong need for feature valuation is merely a consequence of the particular rules that have been proposed. Feature valuation only appears to have such a central role by coincidence; we would not have a deeper understanding of why feature valuation is so central to producing grammatical sentences. 
Our generalizations would also be more vulnerable to the particular rules proposed by various people who work on these phenomena. If the second alternative were true, that all operations happen to involve an unvalued feature, then we would similarly miss a generalization, namely that all operations share the same motivation, and we would lack an explanation for what is behind that motivation. If we instead maintain the standard approach, that operations are driven by the need to value features, we directly encode the 'correct' grammatical requirement, the one that seems empirically motivated, and we provide an explanation of sorts for why it is that the syntax provides these great many avenues for feature valuation – it's required by an interface condition, a very minimalist assumption.

3.7 Conclusions

In this chapter I showed that there is an alternative to obligatory operations that allows us to capture how defaults are produced in the grammar while maintaining the assumption that uninterpretable features can still induce derivation crashes. This shows that the obligatory operations model is not necessitated by the existence of failed agreement, but is rather one option. We looked at a wide range of failed agreement data that showed the outcomes of agree are far from the simple binary distinction between success and failure. The find approach is ill-equipped to handle this more complicated set of outcomes as it predicts only two. I also introduced arguments that claimed that obligatory operations raise a number of timing issues that, if we're not careful, will overgenerate the distribution of defaults. Finally, we looked at what it means to model grammatical requirements on the basis of operation triggering rather than feature valuation. I suggest that what failed agreement shows is not that feature valuation is not a requirement of the grammar, but rather the opposite: feature valuation is such a strong requirement of the grammar that the grammar has multiple ways of providing those features to unvalued syntactic objects. The existence of a default mechanism and of second cycle agreement effects strengthens this characterization. The way forward, then, is not to eliminate the role of derivational time-bombs in enforcing obligatoriness of grammatical requirements, but rather to seek to better understand this default mechanism, importantly while maintaining the general framework assumptions. The match/value approach does exactly this and therefore should be the framework upon which we build an understanding of the default mechanism that supplies unvalued features in a narrow set of circumstances.

CHAPTER 4
AGREE-BASED CASE

4.1 Introduction

With an understanding of why the dependent case and obligatory operations approaches to defaults are problematic, I'd like to turn our focus now towards a solution to the defaults problem that is more in line with standard assumptions. This chapter has two primary goals. The first is to show that default case can in fact be accounted for without rejecting case's role in regulating nominal licensing and without requiring a dependent case theory of case valuation. This will be achieved by extending the match/value approach to ϕ-agreement to the domain of case assignment and will also involve understanding case features in a novel way.
The second goal of this chapter will be to argue that in light of the serious theoretical concerns about the nature of the framework that adopting the proposals in chapter 2 and chapter 3 require, this new proposal offers a more attractive way to model how defaults interact with the basic tenets of the syntactic framework and should thus be preferred over the alternatives. 4.1.1 Revisiting the Problem of Default Case As a reminder, a syntactic default in the arena of case is especially interesting because Case has historically had a central role in regulating the distribution of DPs. We do not expect something which can rule out derivations to have access to a default because that access by definition under- mines any requirements that are encoded. According to traditional assumptions, the failure to value a Case feature will cause a derivation to crash. What rules out a sentence like (1) therefore is that the DP her is unable to value its unvalued Case feature because non-finite T does not have Case features to assign. (1) *It is likely her to leave the party early. 158 However, we’ve seen that there are a number of structures which like (1) also contain a DP that has failed to receive a Case value, but unlike (1) produce a perfectly grammatical sentence. A few of these are repeated below in (2): (2) Default Case in English: a. Hanging Topic/Left-Dislocation What?! Him wear a tuxedo?! b. Gapping She will eat cake, him brownies. c. Coordination Me and him will go to the store. d. Modified Pronouns Lucky me has to clean all the toilets. The two crucial properties that the environments in (2) share are: (i) they each lack a case assigner that could be the source of the accusative features on the bolded DPs and (ii) the morphological case that surfaces in every one of these instances is consistent within a language, but it varies cross-linguistically. Essentially, cross-linguistic default case data is indicative of a default case mechanism that is able to explain how different morphological cases systematically appear in the same positions cross-linguistically. This data raises three important questions about both the nature and the role of case in the grammar and also about the nature of defaults and how they can be included in a system that rules derivations out when requirements are not met. (i) If we understand default case to be the failure to receive a case value, how can these forms surface in a system that rules such failures impossible? (ii) How does such a system distinguish between when it is acceptable to not get a case value and when it is unacceptable? 159 (iii) What can an understanding of default case tell us about the features responsible for the distribution and pronunciation of DPs? Essentially, (i) and (ii) are intended to explain the distinction we see in (1) and (2): how are the examples in (2) grammatical in the first place and once we have a solution, how do we prevent that solution from in turn applying to instances like (1). Question (iii) addresses the hope that once we have a better understanding of the default mechanism itself, we might be able to improve our understanding of how both abstract Case and morphological case features should be modeled. We saw in chapter 2 that what some researchers have done to address these issues is remove the licensing offense itself, adopting an approach that removes Case’s role in ruling out derivations and proposing a model of case valuation that builds the appearance of defaults directly into the system. 
These proposals address issue (i) by arguing that failing to value a case feature is not fatal to the derivation. They address issue (ii) by framing default and unmarked case as the last resort type of case features that are assigned by the grammar when the grammar is unable to assign one of the dependent cases. Through the discussion in that chapter, I raised some concerns with adopting both a configurational case system and eliminating case’s licensing role. In this section, I will outline what others, while trying to maintain more closely the larger set of standard assumptions regarding case and licensing, have proposed for default case. 4.1.2 Previous Agree-based Approaches As we’ve discussed at length, one of the first problems that the existence of a default case poses for our model of the grammar is how default case is able to surface at all, given that the failure to value a Case feature is presumed to be fatal to the derivation. If one wants to maintain the standard function of case, we must figure out a way to reconcile how that can happen, despite the grammar appearing to disallow it. Because the locus of the crash is the remaining unvalued Case feature itself, one way to solve this problem is to remove the feature entirely. By removing the feature, one removes the source of the crash. The logic of this approach is essentially: you can’t fail to value something that isn’t there. This approach centers on differentiating two types of DPs – DPs that 160 surface with default case forms and those that don’t – and encoding this difference in the featural specification of each DP type. DPs that can surface with the default case form will be generated without any Case features and therefore without the ability to cause a derivation crash, at least due to Case. DPs that do not surface with default case are generated with the traditional Case feature set and must therefore receive a Case value or the derivation will be ruled ungrammatical. Even among the analyses that propose a solution in this vein, we find some variety in how this removal is implemented. We could try to identify some property that unifies the set of either type of DP as Legate (2008) does. She couches the identifying property in a DP’s merge position. On her account, DPs that are going to merge in an argument position are subject to a licensing requirement and are thus generated with the expected unvalued Case feature that enforces this requirement. DPs that won’t be merged into an argument position are instead not generated with an unvalued Case feature and can thus survive to PF regardless of whether or not they agreed with any prototypical Case assigner. This approach is successful in a few nice ways: (i) it maintains the standard set of assumptions regarding both the roles and the relationship between morphological case and licensing and therefore inherits the benefits of doing so, and (ii) it appeals to the elsewhere condition discussed in chapter 1 to insert a default form, essentially aligning default case with other instances of morphological defaults more generally. Despite these successes, the way Legate implements this approach makes a few wrong empirical predictions. Since Legate identifies that the property that distinguishes whether or not a DP is generated with Case features is dependent on whether or not it merges in argument position, we predict default DPs to not appear in argument positions. 
The data in (3) shows two DPs (in bold) that are arguments of the gapped verb drink and the tenseless verb wear, respectively. While it’s true that neither of these verbs has the ability to canonically assign accusative, it is extremely unlikely that the DPs here are not arguments of their verbs. This data therefore constitutes a counterargument to Legate’s proposal. 161 (3) a. We can’t drink champagne and him dollar store wine. b. What?! Him wear a tuxedo!? Never! Imagine a derivation like that in (4). This problem is largely due to the identification of some property that the generation of Case features is grounded in. If we were to remove that property and more or less randomly generate unvalued morphological case features1 on DPs, as Schütze (2001) does, we could avoid the two issues outlined above. We do however run into a different problem if we pursue that option – overgeneration. If DPs are generated randomly in the numeration either with morphological case features [DP[ucase]] or without [DP], then without an explicit understanding of what governs their selection, it appears that the grammar is equally able to select either DP type. If the derivation selects the cased version, (5a), all goes as expected: the DP is unable to receive any morphological case value from non-finite T in the embedded clause, but is able to receive nominative features when it moves to the spec TP of the matrix clause. At spell-out, this produces the sentence in (5b). (4) (5) [ ]i is likely [ ]i to win the race. a. b. [DP[ucase]]i is likely ti to win the race. Shei is likely ti to win the race. (6) [DP]i is likely ti to win the race. a. b. *Heri is likely ti to win the race. If instead the derivation in (4) selects the caseless version, as in (6a), the DP once again is unable to receive a morphological case value from non-finite T. When this DP type moves to the spec TP position of the matrix clause, the presence of nominative features is irrelevant because there is nothing to “receive” the feature values. Without an unvalued morphological case feature, the DP surfaces exactly as it was generated. Because the resulting sentence (6b) is ungrammatical, this proposal overgenerates default case forms in positions where they are unattested. It’s important to take a minute to be clear that Schütze’s account does not overgenerate the actual distribution 1It’s important to clarify that unlike Legate, Schütze does not address licensing which is why I’ve switched to morphological features here. 162 of DPs more generally because Schütze assumes that a separate set of features are responsible for DP licensing. For him, what would rule out (7a) would presumably be that whatever licensing feature that is responsible for DP distribution and is generated on all DPs was not satisfied. By maintaining a strong separation between the features responsible for licensing and those responsible for morphological case, he models default case a purely morphological phenomenon. (7) Jane hopes [DP] to eat all the honey. a. b. *Jane hopes she/her to eat all the honey. It is entirely possible that we might be able to suggest modifications to either of these proposals that will avoid the issues that they are confronted with. However, I argue that despite these hypothetical modifications, we actually have a bigger conceptual issue here that supports abandoning this type of approach altogether, regardless of whether or not we could get the details to work. Case isn’t an individual property; it’s the reflection of a relationship. 
Whether or not a DP is a default case DP is about how it is integrated into a particular structure, not about individual properties of the DP itself. By modeling the distinction between DPs that end up with default case and those that end up with traditional case through a variation in feature specification on DPs, we put the locus of that distinction on the DP itself, rather than on the relationship that that DP and a particular functional head share. That distinction, I think, should be located on differences in the environments that produce defaults, rather than on the DPs themselves. These arguments, coupled with the arguments against dependent case and the separation of case from licensing, support the proposal of another way to address the default case issues. What we’re looking for then, is an agree-based system (contra dependent case theory) that is capable of both producing and constraining default forms that models the distinction between default forms and others as the reflection of environmental distinctions, rather than DP focused ones (contra previous agree-based approaches). In the next section, I’m going to argue that we can do just that by extending the decomposition of agree that is sensitive to the hierarchical relationships between the features relevant for case and licensing. 163 4.2 A New Approach Before moving on to the details of the alternative approach I would like to propose here, I offer a quick overview of its basic components. I argue that we: (i) adopt a decomposition of agree into two independent operations: match and value. (Béjar, 2003; Béjar & Rezac, 2009; Rezac, 2011) (ii) propose that all DPs enter the derivation with identical morphological case and licensing requirements (iii) propose that the features responsible for morphological case and those responsible for licens- ing are independent, but related via entailment (iv) use the featural specifications allowed by this new relationship to expand the number of possible outcomes of agree, producing exactly the right circumstances to allow defaults to surface where (and only where) they are attested. This section will detail both a novel understanding of case features and how those features behave in a system that assumes the kind of hierarchical sensitivity that Béjar’s match/value approach requires. 4.2.1 Case Feature Systems The match/value approach to defaults works well in the ϕ-agreement domain in large part because of the inherent feature structure that ϕ-features exhibit. It therefore bears asking: what other feature systems contain these hierarchical relationships and if Béjar’s approach to agree is correct, how would those operations act upon them? In this section, I’m going to follow others who’ve come before me to argue that case features are similarly organized, with hierarchical internal structure. We’ll examine conclusions from the morphological literature that supports this claim and I’ll propose a novel system of case features intended to reflect the intuitions that have come from 164 literature on case syncretism patterns. The expectation is that with a hierarchically organized system of case features, the match/value theory of ϕ-agreement can be extended and can account for default case while being able to maintain case’s role in nominal licensing. 4.2.1.1 Preliminary Concerns First, we need to address some preliminary concerns about our understanding of the features responsible for licensing and morphological case. 
Although case is one of the most discussed domains in the syntactic literature, the features behind the relevant phenomena are some of the least understood. The field has yet to arrive at a consensus regarding an accepted system of case features. This section will outline both what a theory of case features should look like and the issues that make it quite difficult to propose one. In a very general way, any proposal of any syntactic phenomenon within a framework that depends on the valuation of features to enforce the syntactic requirements that one assumes are active or relevant for a phenomenon really needs to take seriously how those features are motivated, structured, and organized. This is not about simply outlining assumptions in a way that makes a match/value-style extension possible, but rather about a more general goal of understanding the features that play such a crucial role in this type of syntactic framework. Our current syntactic model uses features as the actual mechanism by which all of these syntactic processes operate and all grammatical requirements are enforced. In this way, our framework isn't just a feature-centric one, it's a feature-driven one. Because of this, it's crucial that our features are well motivated and well grounded. A flaw in our understanding of these features could have a drastic effect on the success of whatever proposal it is that one is making. For some types of features such a thorough discussion at this stage is unnecessary because some features, like number, are semantically intuitive and their distribution and any internal structure is well understood. Case features do not share this status and, because of this, a discussion of both the nature of case features and the relationships between them, and of the issues that case feature proposals face, is warranted. The complexity of some of these issues is often overlooked, at least in the more syntax-focused literature, and so an explicit exploration is worthwhile. Our understanding of the features relevant for case valuation should include the following: a well grounded motivation for both feature existence and distribution, a well grounded understanding of any hierarchical relationships that may hold between those features, and an understanding of how underspecification is modeled. With respect to the first of these needs, an understanding of how features are motivated and distributed, we can turn to McFadden (2007) for guidance on what proper grounding should arguably look like. McFadden focuses on how poorly grounded features can have detrimental consequences for any proposal that employs them and provides some guidelines for how these features should be grounded in a way that avoids such repercussions. He frames this discussion through an examination of the decomposed features that make up the standard case categories. Decomposing these categories into individual features is an oft-employed strategy for accounting for case syncretic patterns and is widely accepted throughout the morphological case literature (see McFadden, 2007; Müller, 2004b, 2005, for a few examples). While this strategy successfully models how syncretic patterns arise in the various languages in which they are observed, McFadden cautions that without a set of principles to constrain them, there is nothing to rule out potential patterns that turn out to be unobserved. In this way, McFadden argues that properly grounding case features is essential to proposing a system that has explanatory value, rather than simple descriptive adequacy. He proposes a Morphological Feature Constraint, shown in (8), that attempts to provide these necessary constraints.

(8) Morphological Feature Constraint:
    The positing of a particular feature to handle patterns of morphological form must be accompanied by an explicit theory of its distribution in syntactic/semantic terms.
                                                                  (McFadden, 2007)

This constraint essentially requires that features need to be grounded in a way that is independent of the primary function they are to perform in the grammar. Once we have a properly grounded and
In this way, McFadden argues that properly grounding case features is essential to proposing a system that has explanatory value, rather than simple descriptive adequacy. He proposes a Morphological Feature Constraint, shown in (8), that attempts to provide these necessary constraints. (8) Morphological Feature Constraint: The positing of a particular feature to handle patterns of morphological form must be accompanied by an explicit theory of its distribution in syntactic/semantic terms. This constraint essentially requires that features need to be grounded in a way that is independent of the primary function they are to perform in the grammar. Once we have a properly grounded and (McFadden, 2007) 166 well understood set of case features, we expect any intrinsic relationships that might hold between them to become quite obvious. In addition to understanding how the features we’ve proposed are distributed among the syntactic objects in a particular derivation, we need to also make sure to encode those relationships in our model of the case feature system. We face a number of issues when proposing case feature systems, most of which are actually unique to case features specifically: (i) case is inherently difficult to ground, (ii) case is also fairly difficult to model, and perhaps unsurprisingly, (iii) the vast differences in basic theoretical assumptions about the nature and role of case introduce further difficulties in proposing a case feature system that is widely accepted. Throughout this discussion, it’s important to note that I’m not suggesting that the issues raised here are insurmountable (or even unresolved in some instances), but rather that the task is more complicated than one typically assumes and that there is benefit to being explicit about these difficulties. Case features themselves are inherently more difficult to ground given their lack of semantic content. It is well known that the case a DP receives does not correspond to a consistent semantic role, at least for the structural cases. On its own, this does not necessarily make grounding case features any more difficult than any other morphosyntactic feature; however, when coupled with standards for independent feature grounding and a dual role in multiple components of the grammar, this lack of semantic content creates a seemingly impossible task: we must ground case features independent of their function, while simultaneously needing to ground them in at least one of them. To illustrate how a lack of semantic content can cause issues for feature grounding, McFadden (2007) contrasts case features with a feature with full semantic meaning: person. He points out that while we can certainly debate the specifics of how first person is represented featurally, there is a limit to the possibilities we can pursue because it is quite easy to determine whether a nominal is first person or not. In this way, the existence of semantic content constrains the nature of the potential features involved, thus limiting the range of possibilities which provides us with greater explanatory value. Semantic content can be viewed as a sort of scaffold onto which one can frame a particular feature’s grounding and distribution. 167 The lack of semantic content with case removes these important constraints and therefore makes the task of independently grounding features that much more crucial. 
When you pair this fact with the assumption that case serves functions in multiple parts of the grammar, the task of proposing a well-defined set of case features that will provide explanatory value is made much more difficult. As discussed above, McFadden also calls us to ground features independent of their primary functions. For McFadden, this is less of a problem because he assumes case does not play a role in the syntax and can independently ground these features there. However, if we maintain the traditional assumptions regarding the dual role of case in the grammar, this means that we must ground case independent of both its morphological function and its syntactic one in order to satisfy grounding best practices. The semantic component is therefore the most independent place to ground these features and, unfortunately, with case it is also one that is unavailable to us. I suggest that this difficulty is why case feature grounding has never felt truly satisfying and it explains why stipulation critiques have gone largely unaddressed – to some extent they are unavoidable. I suggest that the stipulative nature of case feature proposals is not the result of a failure to capture case accurately, but is rather a natural artifact of case's intrinsic nature. It is unsurprising to recall that in Chomsky (2001) case features are the only features assumed to be uninterpretable on both DPs and functional heads. In this way, they can be viewed as the only set of purely formal features, with no interpretable component. In the proposal that follows, I provide motivations for both the existence and distribution of the features I assume to be responsible for case's dual functions, but it's important to keep in mind why they don't feel as well motivated as we might expect for other features. In addition to being difficult to ground, case features are also actually quite difficult to model. Most of the syntactic literature on case discusses case categories, like nominative or accusative, but rarely discusses the details of the features that make up these categories. While understandable given the focus of that research, it is important to understand why something as simple as "T assigns nom to a DP with which it agrees" isn't actually a simple operation at all. In some sense, the concept of nominative doesn't even quite exist – it's a label we've given to more easily discuss a set of features. As Pesetsky (2013) points out, case categories are a sort of 'middleman' between the actual features and the morphological forms. We've just outlined how a lack of independent semantics makes case features more difficult to motivate, and now we turn to the large number of choices that we face when modeling those features and how each of these choices presents its own set of challenges for the researcher. It is not my intention to make the claim that case features are impossible to model or even that previous attempts to do so are unattractive, but rather to be very clear about why choosing a particular set of assumptions to hold is not a simple task and why case-valuing operations are not as straightforward as we often assume. One aspect of case features that makes them particularly tricky to model is their status as valued/unvalued or interpretable/uninterpretable. Recall Chomsky's 2001 proposal that the features responsible for case both on functional heads and on DPs are to be understood as uninterpretable.
The motivation for this is that uninterpretable features are defined as being unable to be interpreted by the semantic component. As we’ve seen, the features responsible for case do seem to lack an inherent semantic meaning and by this definition it is reasonable to assume that case is uniquely uninterpretable on both types of syntactic objects. This assumption does not come without a cost, however. By modeling case features as uninterpretable on functional heads, we are implicitly encoding a grammatical requirement that functional heads with case features must assign case. While this alone is not reason to reject such a characterization, it is at odds with how we traditionally understand case, a requirement the grammar imposes on DPs solely. It also introduces a further question: why would these traditional case assigning heads need to assign case in the first place? Unlike with DPs, there is no evidence that case performs any additional function for functional heads that selection can’t account for. Despite this consequence, we clearly cannot maintain the alternative – that case features are interpretable on either functional heads or DPs given case’s complete lack of semantic content, at least not without drastically modifying what it means to be interpretable. More recent research has moved towards differentiating the partner versions of features along a different dimension: whether they are inherently specified with a value or 169 whether they must receive a value through establishing relationships in the derivation (Adger, 2003; Pesetsky & Torrego, 2007). By this definition, it is no longer unreasonable to assume that case features come in two flavors. Syntactic objects with needs, such as DPs that seem to need licensing beyond selection and instructions for determining form, should be specified with unvalued versions of the relevant feature. Syntactic objects that supply or fulfill these needs, are specified with the valued version of the feature. This move also provides an additional simplicity of model benefit. For verbs that optionally take an internal argument, assuming case on functional heads is uninterpretable required us to propose two flavors of every verb that shares this property: one with an uninterpretable case feature, and one without. Instead, if we propose that case is valued on functional heads, we can maintain one flavor for these verbs. If that verb fails to find a DP to license, this failure is of no consequence to the derivation: no unvalued features remain at the interfaces. I will expand on the assumptions regarding the nature of case features more explicitly when outlining the exact details about the case feature proposal I offer later in this section. Case feature modeling also runs into issues with respect to the degree of specificity case features should encode and the number of the features themselves. The syntactic component only requires that the features responsible for case represent a binary distinction, to reflect the binary nature of licensing – DPs are either licensed or they are not. Any degree of specificity greater than this is superfluous for this particular function. The morphological component, however, requires a larger degree of specificity – an X-way distinction where X is the number of morphological cases observed in a language. The morphological component requires that case features reflect a greater number of distinctions than the syntactic component does. 
Clearly, we must prioritize the needs of the morphological component, as it would be impossible to model an X-way case system without an appropriately large inventory of case features, but it's important to remember that the syntactic component needs to simultaneously be able to interpret this X-way distinction in a binary way. This is certainly not impossible and is largely achieved through some of the standard models of case features in the syntactic literature, where [case] is a feature that has X possible values, one for each of the morphological cases present in the language in question. The syntactic component is able to identify a binary distinction in whether or not the feature [case] has received a value. The morphological component is then able to look at the actual values that the feature has, which produces the required X-way distinction. What we need to be cautious of, however, is that while this sort of model works well for understanding how the needs of both grammatical components are met, it is unclear how this can be easily extended to a system that does not treat the traditional case categories as monolithic units. If we decompose the case categories into independent features, it's unclear how we reconcile that choice with the syntactic component's need to be able to identify a binary distinction. Again, this is not an insurmountable issue, but is one that needs to be remembered when proposing a workable system of case features.

4.2.1.2 The Hierarchical Nature of Case Features

With these issues in mind, we can begin to work through what we know about case features and use this information to propose a system that can capture the right empirical facts. Because there is a division of labor between the morphological and syntactic functions of case, there winds up being an understandable lack of consensus around a unified approach between researchers focused on each of these two components. Morphological case research tends to focus on patterns of case syncretism, whether case is expressed affixally, and capturing the varied morphological case patterns we observe across the world's languages. Syntactic case research has seen a more recent uptick in debate, with old disagreements reemerging and long-held assumptions being reexamined, as we've seen in earlier chapters of this thesis. Rarely do these aims converge and as a result, discoveries made in one arena rarely inform discoveries made in the other. One of the goals of this chapter is to bring together conclusions drawn from each of the various fields and propose a system of case features that can address the independent concerns of each. There are two important observations about case that will guide our proposal: (i) the observation that case categories are implicationally hierarchical and (ii) the observation that case categories are not atomic units, but are instead composed of a number of smaller individual case features.

Case categories appear to involve implicational relationships between one another. Blake (2001) proposes that there is a universal implicational hierarchy involving the inventory of case categories a particular language has. This hierarchy is shown in (9). The way to interpret this hierarchy is to say that if a language has a case category X, it will also have the case categories Y, Z, etc. that appear to the left of X in the hierarchy below. If a language has dative case, then it is also true that the language will have nominative and accusative case as well.
(9) nominative > accusative > genitive > dative > instrumental > comitative

This implicational hierarchy is also supported by acquisition data involving case categories. Austin (2012) provides evidence that children learning Basque acquire absolutive verbal agreement before they acquire ergative agreement, which is then followed by the acquisition of dative agreement. This relationship mimics the pattern shown in Blake's hierarchy above. In addition to evidence that suggests case categories bear implicational hierarchical relationships to one another, there is also evidence that case categories themselves are not made up of atomic features, but are instead compositional categories composed of a number of individual case features. The primary evidence for this conclusion comes from a large body of work on case syncretism (see Baerman, Brown, & Corbett, 2005, for an overview). Syncretism is a specific type of homophony that we find in inflectional paradigms. Formally, it is understood as the grammar failing to make some sort of morphosyntactic distinction that under normal circumstances is made. It is a systematic phenomenon and in this way is different from the kind of accidental homophony that might result from the application of independent phonological rules. Take the following example from Russian:

Table 4.1: Accidental Homophony in Russian

              a. stem-stress 'place'         b. end-stress 'wine'
              orthographic   phonetic        orthographic   phonetic
  nom/acc sg  mesto          ˈmʲe.stə        vino           vʲi.ˈno
  gen sg      mesta          ˈmʲe.stə        vina           vʲi.ˈna

(Baerman et al., 2005)

Here, while the genitive and nominative/accusative singular forms are homophonous for the lexical item 'mesto', they are not for the lexical item 'vino'. Russian has an independent phonological rule whereby the distinction between /a/ and /o/ is only observed when in a stressed syllable. Since the homophony in table 4.1 can be explained in phonological terms and is not systematic to the case paradigm, it would not be considered an example of syncretism of case. Table 4.2 shows syncretism in the case system. We can see below that Greek differentiates between 4 cases (nom, acc, gen, and dat) and also makes a distinction between masculine and neuter gender. In each cell is the morphological form for the adjective wise, showing how it varies with respect to how Greek expresses case and gender features. An adjective that is nominative and masculine will be expressed as soph-os, while the same adjective will take a different form if it is specified with nominative and neuter features. What the neuter nominative and accusative cells show is that in the neuter paradigm, the distinction between nominative case and accusative that is normally marked in the masculine is neutralized – the nominative and accusative cases both share the same form soph-on. When two normally distinct categories are expressed via the same form, as they are here, we call them syncretic. What distinguishes the syncretic type of homophony in table 4.2 from the homophony in table 4.1 is that the homophony in Greek is systematic for the entire neuter paradigm and isn't reducible to any sort of independent phonological processes. This means that for all adjectives, not just wise, the grammar does not make a morphological distinction between nominative forms and accusative forms, even though it does make those distinctions in other gender paradigms, like masculine.
Table 4.2: Greek Adjective 'wise'

            masculine   neuter
  nom sg    soph-os     soph-on
  acc sg    soph-on     soph-on
  gen sg    soph-ou     soph-ou
  dat sg    soph-oi     soph-oi

Syncretism as a process involves the neutralization of distinctions and is viewed as evidence that broader morphosyntactic categories are comprised of more individual features because the syncretic patterns are indications that the grammar forms natural classes from the categories in question. In order for the grammar to do this, there must simultaneously be a set of features that the natural class can reference and a set of features that maintains the distinction between disparate categories. Analyses of syncretic patterns hope to propose formal ways of encoding those natural classes, defined by morphological behavior. So for two case categories, X and Y, syncretism between the two would involve neutralizing their differences in such a way that the properties they share are the only ones remaining. Typical syncretism accounts involve proposing a number of post-syntactic operations that modify the feature specifications in ways that eliminate the features that distinguish the relevant categories. I've illustrated a brief example of how this works below in (10). Categories X and Y share the feature set {+a, +c}, but are distinguished through having different values for feature {b}. If we wanted to neutralize the distinction between the two categories, i.e., make them syncretic, we could eliminate or otherwise delete the feature that encodes the distinction between them, {b}. The result of doing so is that both categories would then be identically specified with only the features they share and could invite the insertion of the same vocabulary item.

(10) a. X: {+a, -b, +c}
     b. Y: {+a, +b, +c}

Case syncretism makes the task of proposing a unified system of case features even more difficult than we hinted at in the previous section because of the additional constraints it imposes on hypothetical feature specifications. Not only do we need to propose features in a way that is grounded independent from primary function, but we also must make sure to propose a system that allows us to model the correct potential natural classes that their morphological behavior suggests.2

2 It's important here to understand that while the original featural specifications are important for correctly predicting the observed syncretic patterns, not all languages exhibit identical patterns and it is not considered a problem. The bulk of the theoretical work is borne by the post-syntactic operations that modify the original featural specifications. So while the proposed feature system must be decomposed enough to be able to capture natural classes between the cases, it doesn't (and shouldn't) need to be modeled after particular examples of syncretism.

Since we see syncretism between the nominative and accusative cases in Greek, we need to decompose the case categories in a way that we can identify at least one feature that they share and at least one feature that distinguishes them. This is how syncretism provides evidence that case categories aren't monolithic units, but are rather made up of a set of more individual features. Informally, we can capture some intuitive natural class behaviors that group various case categories. First, we make a distinction between structural and nonstructural cases. Nominative, accusative, genitive, and dative are considered the structural cases, dependent on syntax rather than semantics.
The case categories like instrumental, ablative, and locative, to name a few, are considered nonstructural and are assumed to have a semantic component to their interpretation and assignment. Among the structural cases, we can further distinguish between the core and noncore cases. Core cases include nominative and accusative and the noncore cases include dative and genitive cases. Baerman et al. (2005) outlines some general tendencies for syncretic patterns that use these distinctions that I'd like to outline here quickly. Languages typically exhibit one of three possible syncretic patterns. First, there can be syncretism between the two core cases, like the syncretism between nominative and accusative case that we saw in the neuter paradigm in table 4.2. Second, there can be syncretism between a core case (accusative) and one of the noncore, but structural cases like genitive or dative. This is what we see in table 4.3 where the accusative and genitive cases are syncretic with nouns, but not pronouns. Interestingly, syncretism of this type is almost always restricted to the 'marked' core case (accusative or ergative). Finally, there can be syncretism within the noncore cases, which is the pattern we see in table 4.4 where the dative and the illative are syncretic in singular definite nouns, but not in indefinite ones.

Table 4.3: Finnish Syncretism, Core/Non-core

         noun 'lock'   pronoun 'I'
  nom    lukko         minä
  acc    luko-n        minu-t
  gen    luko-n        minu-n

Table 4.4: Erzja Mordvin Syncretism, Non-core Cases

         'the house'   '(a) house'
  nom    kudos'        kudo
  gen    kudont'       kudon'
  dat    kudonten'     kudonen'
  ill    kudonten'     kudos
  abl    kudodont'     kudodo

The observation that case syncretism suggests a decomposition of case categories coupled with evidence that case categories themselves have a particular invariant order motivates Caha (2009) to propose a theory of syncretism that depends on the notion of contiguity. His proposal is shown below in (11):

(11) Universal Case Contiguity
     a. Non-accidental case syncretism targets contiguous regions in a sequence invariant across languages.
     b. The case sequence: nominative – accusative – genitive – dative – instrumental – comitative

He argues that the syncretic patterns informally outlined above can be explained by proposing that syncretism can only target categories that are contiguous on the case category hierarchy. This would allow syncretism between nominative and accusative cases, for example, but would bar syncretism between nominative and genitive cases, unless the accusative is also involved. More important for our purposes is his conclusion that universal case contiguity is only possible if there is a universal system of case features with invariant hierarchical organization. Syncretism offers a way for us to ground case features independent of their primary function in that its behavior, while morphological in nature, is independent from the actual assignment of the case features themselves in the syntax or their more general distribution.

4.2.2 A Proposed Feature System

To summarize before moving forward with the proposal: we have evidence that case categories have implicational hierarchical structure. We also have evidence from syncretism that case categories are comprised of individual case features. Furthermore, case syncretism patterns are constrained to only involve those categories that are contiguous along the hierarchy. These facts all together suggest that case features, like ϕ-features, have some internal hierarchical structure.
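Caha's contiguity condition in (11) lends itself to a simple procedural statement. The following is an illustrative sketch only – it is not part of the thesis's formal proposal, and the sequence constant and function name are my own labels – showing how one could check whether a candidate syncretism respects (11):

```python
# Illustrative sketch only: a check that a candidate case syncretism targets a
# contiguous region of the case sequence in (11b). Names are my own labels.
CASE_SEQUENCE = ["nom", "acc", "gen", "dat", "ins", "com"]

def is_possible_syncretism(cases):
    """True iff the case categories sharing a form occupy a contiguous span of the sequence."""
    positions = sorted(CASE_SEQUENCE.index(c) for c in cases)
    return positions == list(range(positions[0], positions[-1] + 1))

print(is_possible_syncretism({"nom", "acc"}))   # True: the Greek neuter pattern in table 4.2
print(is_possible_syncretism({"acc", "gen"}))   # True: the Finnish pattern in table 4.3
print(is_possible_syncretism({"nom", "gen"}))   # False: would skip acc, barred by (11)
```

Under these assumptions, the three tendencies from Baerman et al. (2005) described above all come out as possible patterns, while nominative–genitive syncretism that skips the accusative does not.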
It will be the focus of this next section to propose what those features are, ground them appropriately independent of their primary functions, and argue for a novel system of case features that researchers working in both morphological and syntactic domains can employ. This proposal has two main parts that reflect the dual function of case: its role in regulating nominal licensing and its role in morphologically distinguishing various case categories. We will therefore discuss the features responsible for the various case category distinctions we observe and the features responsible for nominal licensing and how those two disparate sets are related to one another.

4.2.2.1 Morphological Functions

While novel in its details and its mechanics, this proposal draws heavily from a number of intuitions made by others about the nature of case categories and their features. We first begin with Caha (2009) as this will be the scaffold upon which the proposal is built. Caha (2009) concludes that the only plausible way to account for a universal case contiguity would be if the individual features that make up case categories are sub-classified rather than cross-classified. The primary reason for this is that the two types of relations make different predictions about adjacency/contiguity. He shows that a set of features that is cross-classified, shown in table 4.5, creates a system whereby a larger number of adjacent relationships are formed because there are both vertical and horizontal relationships available. Sub-classification, by contrast, requires a more linear set of adjacency relationships because the horizontal relationship is unavailable (12). This outlines the first characteristic of our proposed case feature system: case features are sub-classified. Perhaps unsurprisingly, the sub-classified nature of case features mimics one that is familiar to us throughout this thesis: ϕ-features are also assumed to be sub-classified/hierarchical.

Table 4.5: Cross-Classification of Features

        +Y     −Y
  +X    nom    acc
  −X    gen    dat

(12)  nom    X
               acc    Y
                        gen    Z
                                 dat

The logic of sub-classification would mean that as we proceed down the structure, we eliminate a case category with each level. Translating what this means for the features themselves, we can imagine that progressing down the structure could mean either removing – if the lowest member is the least specified – or adding – if the highest member is the least specified – a feature unique to the eliminated category. Following intuitions that nominative is the least specified case category (Baker, 2015; Levin & Preminger, 2015; McFadden, 2004), we'll want to build a system where as you move 'down' the structure, each lower level will involve the addition of a feature that groups the remaining categories to the exclusion of the removed category. For an abstract example, consider the representation shown below in (13). We begin at the top of the structure, the set that includes all case categories. If we remove what we assume to be the least specified category, nominative, we are left with the set {accusative, genitive, dative}. To remove nominative of course, we must identify a feature A for which nominative is the only case category that does not bear that feature. Moving to the next level of the universal hierarchy, we want to remove the case category accusative. We must then identify a feature B that genitive and dative include, but not accusative. Finally, we are left with a paired set.
We must therefore identify some feature C that distinguishes genitive from dative.

(13)  {nom acc gen dat}
        nom    {acc gen dat}
                 acc    {gen dat}
                          gen    {dat}
                                   dat

Note that we can compare this to the already familiar internal structure of ϕ-features, shown in (14). All ϕ-features contain the feature [π] that represents the whole set {3rd, 2nd, 1st} and as we move down the hierarchy towards more specified person categories, we add features that distinguish the remaining set from the removed category (15). We remove third person from the set and only second and first person remain. We therefore can identify a feature that categorizes the smaller set to the exclusion of the removed category, [participant]. Likewise, as we're left with the paired set {2nd, 1st}, we identify a feature that distinguishes them, [speaker].3

(14)  {3rd 2nd 1st}
        3rd    {2nd 1st}
                 2nd    {1st}
                          1st

(15)  [π]
        3rd    [participant]
                 2nd    [speaker]
                          1st

3 Of course, we could also identify a feature [addressee] that would similarly distinguish between the members of the same set. This would reverse the markedness relationship between them, making the spell out of [participant] first person rather than second. Some languages do in fact choose to express second person as the more specified member of the set (Béjar, 2003).

While we take the intuitions here about the need for sub-classification from Caha (2009), we are unable to adopt his system for our purposes because he does not ground the features themselves and continues to use 'placeholders' A, B, etc. throughout his work. This is not a weakness as he's operating in an entirely different framework, proposing that those features A, B, C are functional heads rather than independent features in the way we've been considering them throughout this thesis. So where there's room for contribution is in how we ground the features that are responsible for distinguishing each of the eliminated case categories. From here, we can turn to those who have focused their work on accounting for the specifics of various case syncretism patterns. No singular work can be adopted entirely, but I argue that we can combine insights4 from a number of different works, as we did with Caha (2009), to produce a system that both morphologists and syntacticians can be happy with. For clarity, here's a reminder of the features we need to define:

(16) a. a feature that encompasses all categories
     b. a feature that is common to {acc, gen, dat} to the exclusion of nom
     c. a feature that is common to {gen, dat} to the exclusion of acc
     d. a feature that distinguishes gen from dat

I see no reason not to propose a feature [case] that is common to all case categories. Its ϕ-feature counterpart [π] similarly signals category type in that it intuitively indicates all further specifications belong to the same common set. Next we need a feature common to {acc, gen, dat} to the exclusion of nom. I propose we adopt the intuition that we can group this set to the exclusion of nominative through reference to their ability to be expressed on objects of a verb (Calabrese, 1998; Müller, 2004b, 2005).5 I propose a feature [verbal] that is grounded in its ability to be assigned by verbs to their objects.

4 I use the words 'insights' and 'intuitions' intentionally here. I will adopt none of the features actually proposed in the works I cite here. However, I recognize that what we call the feature and its motivation are superficially distinct and that it's the motivation and the intuition behind the proposal that is actually important.

5 Each of these sources does this in slightly different ways; most notably different is Calabrese (1998), who instead makes the negative characterization that the relevant set is 'not the subject of predication'. While the basic generalization is similar, I wish to note this here to not misrepresent his work.
The next feature should be common to {gen, dat} to the exclusion of acc and I suggest we follow the fairly standard intuition that what divides these case categories is their status as oblique. This justifies the feature [oblique]. Finally, we must propose a feature that distinguishes genitive from dative. While superficially simpler due to the set only containing two members, this is arguably a more difficult task whose difficulty stems from the greater range of options available. Among the intuitions already offered in the literature, two appear most promising, but I'll refrain from adopting them nonetheless for reasons I'll discuss in a minute. McFadden (2007) disagrees that dative is the more specified member of the set and therefore his proposal of a [+genitive] feature to distinguish them reflects this assumption. Müller (2005) does not maintain that assumption, but still singles out genitive as the category with a defining property that a feature can reference. He proposes a feature [+n] that is assigned by nominals and uniquely identifies the genitive case to the exclusion of all others. Because my intention is to capture the universal case contiguity argued for in Caha (2009), I'd prefer to propose a feature uniquely identifying the dative case category, rather than the genitive. We also cannot do this through negative reference to the property proposed by Müller (2005) because it does not uniquely distinguish genitive from dative; arguably accusative and nominative are also not assigned by nouns or in nominal environments. What I will propose is a feature [dative] that singles out the dative case. This makes the genitive case the default spell out of [oblique]. A summary of all proposed features and their internal structure is shown in (17) and (18).

(17)  {nom acc gen dat}
        nom    {acc gen dat}
                 acc    {gen dat}
                          gen    {dat}
                                   dat

(18)  [case]
        nom    [verbal]
                 acc    [oblique]
                          gen    [dative]
                                   dat

However, I suggest that perhaps this disagreement might be informative. As we've just seen, one concern is the status of the relationship between dative and genitive cases. The worry is that for some languages, dative seems less specified than genitive and for others, the opposite is true. I believe this is a place where one of the apparent weaknesses of my proposal might turn out to be a strength. The relationship between first and second person is similarly variable. Béjar (2003) notes that while the feature [participant] will always dominate the feature that distinguishes between the members of that set, grammars have a choice whether to adopt the feature [speaker] to be the distinguishing feature or to adopt the feature [addressee] instead. This allows the grammar to exercise some optionality in markedness. If the grammar singles out first person as more marked, it would use the feature [speaker] to mark first person. Under those assumptions second person would be the spell out of the less specified set [participant]. If instead the grammar singles out second person as the more marked category, it would use the feature [addressee].
Under those assumptions the first person would be the spell out of the less specified set [participant] and the second person would be marked with the [addressee] feature in addition to the rest of the hierarchy. If the relationship between genitive and dative is similar to the relationship between first and second person, languages might be able to select how they differentiate between genitive and dative cases by exercising a choice in which sister-feature they select to differentiate between the members of that paired set, either [genitive] or [dative].

What we've done is adopt the argument that constraints on possible syncretism patterns support the adoption of a universal case contiguity that is captured through a case feature system that is made up of hierarchically organized, independently grounded case features that comprise the various case categories. The adoption of the universal case contiguity hypothesis and some of the intuitions about what properties distinguish the various case categories are taken from those who've come before me, but their combination is novel. Before moving to the syntactic function of case features, it is worthwhile to show how the morphological case system proposed here can account for some of the types of syncretic patterns introduced at the beginning of this section. To capture the syncretism in Greek between the nominative and the accusative, the grammar needs to neutralize the distinction between the two. Since nominative and accusative case in this proposed system share the feature [case], but do not share the feature [verbal], the morphological component could delete the [verbal] feature in the contexts where syncretism shows up (19). After deletion, both case categories correspond only to the feature [case], allowing for the same vocabulary item to be inserted into either instance.

(19) a. nom = [case]
     b. acc = [case [verbal]]

Likewise, we saw syncretism in Finnish between the accusative case and the genitive case. In order to get this syncretism to surface, the grammar needs to neutralize their distinction in a way that allows for one form to be inserted into either category. The logic of the approach is the same as it was for the Greek syncretism. The [oblique] feature is the feature that distinguishes them, so if the morphology deleted that feature (20), their featural specification would then be identical ([case [verbal]]), paving the way for the insertion of the same vocabulary item. I've included the nominative here to show that by deleting only the [oblique] feature, accusative and genitive will become syncretic, but they will still be distinct from nominative unless additional features are removed.

(20) a. nom = [case]
     b. acc = [case [verbal]]
     c. gen = [case [verbal [oblique]]]

4.2.2.2 Syntactic Function

The proposed system is capable of making the 4-way distinction needed to differentiate between the structural case categories. We've yet to address how we understand nominal licensing to be handled in this system. This is the focus of the current section. The general problem that previous Agree-based approaches to default case ran into was that they proposed a distinction between default case DPs and DPs that received a typical case. These approaches handled this data by arguing that each type of DP was generated with a different set of case features.
The differentiation between the two groups was grounded in different ways: Legate (2008) grounded the difference through the distinction between argument and nonargument DPs, while Schütze (2001) made the distinction through random generation of case features. I argued in section 4.2.1.1 that regardless of the specifics surrounding how one grounds the distinction, modeling default case through a difference in DP type was both conceptually and empirically unsatisfying. Instead I propose that in order to derive the intuition that default case is a product of its environment rather than its DP type, we should assume that all DPs are generated with the same featural specification, at least with respect to case. We've discussed that DPs have two needs: (i) they need confirmation that they appear in a position that is licit (licensing) and (ii) they, like other syntactic objects, need to receive instructions for how they are to be pronounced (morphological case). I propose that these two needs actually allow us to set up an entailment relationship that we could then use to motivate hierarchical relationships between the features that are responsible for each of these functions. If a DP has received instructions for pronunciation (i.e., has a morphological form), then we know that that DP was licensed (i.e., licit in the position it occupies). The opposite, however, is not necessarily true. Default DPs are exactly the example that shows us that DPs can be licit in a particular position without having received instructions (from the syntax, via agree) for how they should be pronounced. So in other words, DPs that have morphological case must have been licensed, but DPs that are licensed do not necessarily have to have received morphological case. Following the logic set up in Béjar (2003), we can use this to motivate the proposal that the features responsible for licensing dominate those responsible for morphological case, discussed in the last section, as shown in (21).

(21)  [L]
        [case]
          [verbal]
            [oblique]
              [dative]

Like morphological case features, it's important that we also ground the licensing feature [L]. Unfortunately, the simple concept of licensing itself is not something that is very well understood. Licensing seems to be a concept that essentially says "yes, you can appear here". This is hardly a theory of anything and makes its grounding incredibly difficult. It winds up meaning that we know something is a licensor if it allows something else to appear in some determined location or distance from it. I'm going to suggest that there are three types of functional heads with respect to satisfying the needs of DPs:

1. there are functional heads whose primary job is to fully integrate DPs.
2. there are functional heads that can integrate DPs, but whose job is not primarily this.
3. there are functional heads that simply cannot integrate DPs on their own.

Functional heads of the first type are going to be the canonical case assigners: functional heads like finite T, v, P, etc. These functional heads all primarily serve to help integrate DPs into the derivation they are a part of. Because their focus is on the DPs primarily, I suggest that it's not unreasonable to assume that these functional heads are capable of fulfilling both needs of the DP: licensing and pronunciation.
How this observation is encoded in the featural specification proposed in this thesis is that these types of functional heads are specified with a licensing feature [L] and the relevant amount of morphological case feature structure unique to that particular head. Essentially, these functional heads come with the entire licensing/morphological case feature bundle proposed in (21).

Functional heads of the second type are those that can certainly integrate DPs into a structure, but aren't focused primarily on doing so. These functional heads are capable of integrating a wide range of category types and because of this, I argue that it is not unreasonable to assume that they do not fulfill all of the requirements of the DP, just the distributional one. In more traditional terms, what I mean by this is that these types of functional heads are not morphological case assigners, but they can play a role in licensing DPs. More detailed examples of what functional heads I have in mind will be further explained in the next section, but a quick example would be a coordinating head, like and. Coordinating heads surely can coordinate (and therefore integrate) DPs, but they can also coordinate a range of other category types:

(22) a. [DP Jim] and [DP John] will go to the store.
     b. Jim will [vP go to the store] and [vP rent a movie].
     c. The store is [PP around the corner] and [PP down the street].

Because this function is not restricted to just DPs, we can assume that they do not come supplied with any DP-specific features, like morphological case. The featural specification I will assume for functional heads of this type is:

(23) [L]

Finally, we turn to the third type of functional head: those that cannot integrate DPs on their own. Non-finite T is traditionally assumed to lack case features and this is what explains why DPs are typically unable to appear in the subject position of a non-finite clause. My assumptions for the featural specification for non-finite T essentially capture the same thing: I assume that non-finite T is unable to license DPs, and therefore is also unable to supply them with morphological case. The featural specification for non-finite T would therefore be:

(24) [ø]

With focus on the case domain, we're tempted to assume that non-finite T is unique in this specification. However, the way we've grounded the three-way distinction between functional heads is not case-specific. There are plenty of functional heads that do not license DPs. Functional heads within the DP and aspectual heads in the clause offer two quick examples, but any functional head that cannot license a DP in a local position would be assumed to have this (null) specification. We have to be very careful about how we talk about the role of non-finite T and what it means to integrate DPs into a derivation. An intuitive understanding is relating this integration to something like a selectional feature, or an EPP feature. We clearly don't want to assume that non-finite T does not have an EPP feature; ECM clauses illustrate that non-finite T is still capable of triggering the movement of the external argument to its specifier. However, the DP cannot stay there unless it is licensed by something else, namely the v of the embedding verb. In this way I intend to say that non-finite T cannot integrate DPs on its own, and therefore is not specified with any of the features related to the requirements of DPs.
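To keep the three head types and their featural specifications straight, here is a minimal sketch under my own encoding assumptions (a bundle is written as a list with the outermost feature first; the head labels are mnemonic only and not part of the thesis's notation):

```python
# Illustrative sketch only: the three functional head types from this section,
# encoded as how much of the licensing/morphological case bundle in (21) they carry.
HEAD_SPECS = {
    "finite_T":    ["L", "case"],               # type 1: licenses and supplies nominative
    "v":           ["L", "case", "verbal"],     # type 1: licenses and supplies accusative
    "and":         ["L"],                       # type 2: licenses, assigns no morphological case
    "top":         ["L"],                       # type 2: same specification as coordinators
    "nonfinite_T": [],                          # type 3: cannot integrate a DP on its own
}

def head_type(spec):
    # type 3 heads carry nothing; type 2 heads carry only [L];
    # type 1 heads carry [L] plus morphological case structure
    if not spec:
        return 3
    return 2 if spec == ["L"] else 1

for head, spec in HEAD_SPECS.items():
    print(f"{head}: {spec or '[ø]'} -> type {head_type(spec)}")
```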
A quick comment about the feature [case] and its comparison to [L]: in the following section we'll see that I propose that default case is the morphological spell out of the [L] feature, not the underspecified [case] feature. Conceptually, this is intended to capture the observation that a default case DP is one that has been licensed, but has not received any morphological case. However, it would be quite intuitive to instead assume that default case is the physical representation of the "root" [case] feature in the morphological case feature geometry – the least specified feature in the paradigm. As far as I can tell, we have two ways to model what this underspecification would look like. First, we could assume that for a given language, the least specified morphological case feature in a particular paradigm is the exact feature that spells out whichever case category happens to be the selected default case category in that language. The benefit to modeling case underspecification in this way is that the default case category for a particular language is more or less derived, rather than stipulated. Furthermore, as McFadden (2007) argues, we could also see a reduction in the complexity of the system as we would now have a way to align canonical nominative case valuation with default case, treating the former as a subset of the latter. The problem here, however, is that because languages do not universally select the same case category as their default, we are forced into proposing an entirely separate case feature system for each language with a different default. While I think it's entirely reasonable – and traditional, assuming the Borer-Chomsky Conjecture (Baker, 2008a; Borer, 1984; Chomsky, 1995) – to assume that languages can vary in the feature inventory they select for case, I think it's much less reasonable to assume that any hierarchical relationships between the various case features should be different – and that's exactly what this system type would require.

Instead, we could propose an underspecified abstract type of case feature that dominated all other morphological case features. This case feature could then be sort of syncretic with whatever the default case category for a particular language would be. There are two primary benefits to this. First, unlike the first option, this type of model would allow us to maintain the same case feature system cross-linguistically. We lose the ability to derive which case category is the default for a given language, but I would question whether or not this should be a goal in the first place. Given that the default case category can in fact vary cross-linguistically, I would suggest that deriving which case category was the default in a language isn't the goal. The goal should instead be to derive not the specific case category, but rather the set of possible categories from which languages are allowed to select a default. Proposing a [case] feature helps to preserve a universal system of case features and also captures the intuition that what a DP probes for is not a particular case, but rather a more or less abstract version of that requirement and also ties that requirement to its need for licensing.

4.2.2.3 A Summary

Before moving on to the discussion about how this feature system interacts with the match/value system, I'd like to refocus our attention back to some of the preliminary concerns introduced at the beginning of this section.
As we mentioned in section 4.2.1.1, one of the biggest issues in grounding case features was that it appears to demand the impossible: ground the feature independent of any primary functions, while needing to be grounded in one of them. One could argue that this new system doesn't make any radical progress here, and to some extent I would agree, at least with respect to some of the particular properties we've proposed like 'assigned by verbs'. However, I do think that we've been able to get around some of the issues with respect to grounding the features independent of their purpose. While it is impossible to ground case features outside the component in which they operate, grounding their internal structure in their syncretic behavior is at least independent of both their nominal licensing function and their primary morphological marking function. There is still room of course to debate about whether or not we've identified the correct properties that single out each of the various excluded categories as we move down the feature tree, but the hierarchical order itself has allowed us to make progress in proposing at least the right scaffold for the system.

This system, when paired with a decomposition of agree, also allows us to reconcile the issue raised in section 4.2.1.1 about the different grammatical components requiring different degrees of specificity in how case features are modeled. If match is only sensitive to the root feature, we can capture the binary need of the syntactic component, while value being sensitive to the entire feature set captures the need for an X-way distinction that the morphological component requires.

We also, I think, gain some insight into one of the oldest case puzzles: the relationship between abstract and morphological case. Data like default case, quirky/inherent case, and other instances of unexpected case led researchers to conclude that these must be the result of mismatches between abstract case features that operate primarily in the syntax and morphological ones that primarily operate in the morphology (McFadden, 2004; Schütze, 1997). Once multiple kinds of case features were proposed, the question became what relationship holds between them: are morphological case features the direct spell out of abstract ones with some leeway that allows for the mismatches or are morphological and abstract case features completely independent from one another? What made this debate a difficult one is that if one assumes they are completely independent, we lose the generalization that the two align in the overwhelming majority of cases. If one instead assumes the morphological features are the spell out of the abstract features, we lose the ability to account for the instances where they do not align. I think this proposal provides some novel understanding into the relationship. Morphological case features, the ones responsible for dictating morphological form, are dominated by abstract case features, the ones responsible for regulating nominal licensing. This hierarchical relationship between the two allows us to maintain the intuition that these two functions are independent from one another while at the same time, they are related.
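The binary-versus-X-way point can be made concrete with a small sketch. This is illustrative only and rests on my own encoding assumptions (bundles as lists with the root feature first; the u- diacritic on the DP's unvalued bundle is omitted): match inspects only the root, while value compares full specifications.

```python
# Illustrative sketch only: match cares about the root feature (the binary,
# syntactic question), value about the whole bundle (the X-way, morphological question).
def match(probe, goal):
    return bool(goal) and probe[0] == goal[0]

def value(probe, goal):
    # value succeeds only if the goal is at least as specified as the probe,
    # i.e. the probe's bundle is an initial subsequence of the goal's
    return match(probe, goal) and goal[:len(probe)] == probe

dp_probe = ["L", "case"]              # every DP's licensing/morphological case bundle
finite_T = ["L", "case"]              # canonical nominative assigner
v_head   = ["L", "case", "verbal"]    # canonical accusative assigner
topic    = ["L"]                      # licenses only
nonfin_T = []                         # cannot license

for goal in (finite_T, v_head, topic, nonfin_T):
    print(goal, match(dp_probe, goal), value(dp_probe, goal))
# match succeeds everywhere except with non-finite T; value additionally fails
# with the topic head, which is where default case is predicted below.
```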
4.2.3 Accounting for Default Case

Now that we've argued that case features are hierarchically organized, as ϕ-features are, an obvious question emerges: what would the application of match/value mean for the system of case assignment, and would this allow us to capture default case in a way that allows us to maintain case's role in regulating nominal licensing while also avoiding the adoption of a problematic configurational case approach? I borrow three well-argued-for assumptions from Carstens (2016): (i) unvalued case/licensing features are probes, (ii) if a probe's feature remains unvalued after probing its c-command domain, it may continue to probe in the search space until it is sent to transfer at spell out, and (iii) unvalued features on heads may project to the phrasal level and continue their search from there if they are unable to find a value in their original c-command domain. While a few of these assumptions are unfamiliar, they fit in quite well with the standard assumptions about the basic architecture and so I feel comfortable adopting them as minor modifications to the framework. I'll discuss their motivations briefly here, but direct the reader to Carstens (2016) for further explication.

One feature of Chomsky (2000, 2001) that was arguably stipulative was that only ϕ-features probed. Despite case features also entering the derivation unvalued, they weren't assumed to probe on their own; in this way they were considered purely 'goal features'. Carstens (2016) argues that all unvalued features have the capability to probe, since this capability is grounded in the need to fulfill those requirements encoded by the presence of those unvalued features. I follow that assumption and assume that all DPs enter the derivation specified with an identical feature specification, [uL[case]], and that this [uL[case]] feature is a probe. This captures the intuition that all nominals need both confirmation that they end up in a licit position (nominal licensing) and instructions for pronunciation (morphological spell out). The second need, the need for instructions for pronunciation, is only relevant if this first need is met and is handled by the features that the [L] feature dominates.

A quick comment on the exact location of these features: for shorthand throughout this explanation you'll see that I've located the case features on the category DP. It's quite standard to assume that case features actually exist on the D head itself, or in some independent K layer. With Carstens's delayed valuation proposal, features on a head X can project to category XP if they fail to match in their initial search domain. The motivation for this assumption comes from adjectival concord. Adjectival concord describes the existence of agreement morphology on adjectives that matches that of an agreeing noun. The data from Swahili in (25) shows that the adjectives 'good' and 'heavy' agree with the nouns they modify, 'book' and 'load', in nominal class. With respect to adjectival concord Carstens assumes two things: that concord is handled by agreement and that adjectives head the AP adjuncts that participate in concord. The idea is that without assuming the agreement features on A can project up to AP, there is no way for them to participate at the phrasal level. The features on A would probe their c-command domain and never find anything with which to value; concord would be impossible, contrary to fact, as shown in (25).

(25) a. kitabu  [AP kizuri  sana]
        7book       7good   very
        'a very good book'
     b. mzigo   [AP mzito   mno]
        3load       3heavy  too
        'too heavy a load'
     (Carstens, 2016)
I follow that assumption and since it will always be true that an unvalued case feature on a D head (or K head) will fail to find a case-assigning functional head within its own projection, it will always project to the DP level before interacting with the rest of the structure. Since this interaction is what's relevant for the question at hand, I see no need to illustrate this in the derivations.

With respect to the directionality of probing, Carstens argues that its apparent downward-only nature is actually a simple consequence of the way the structures are built – bottom up – and is not an inherent characteristic of agreement. She argues that because the only available search domain for a probe upon merger into a structure is its c-command domain, the probe has no other choice but to search 'down', at least at first. In many languages, complementizers can participate in agreement. What is interesting about complementizer agreement is that it's not universally downward; some languages appear to exhibit upwards complementizer agreement where the complementizer agrees with a nominal in the higher clause, rather than in the embedded clause. Data from the African language Lubukusu illustrates this upwards direction of agreement (Diercks, 2013). What we see in (26) is the complementizer agreeing with the subject of a higher clause, rather than the one in its own clause. (The sa in the gloss indicates subject agreement.)

(26) Khw-aulile    [CP khu-li/*ba-li    ba-limi    ba-funa        ka-ma-indi].
     1.pl.sa-heard     1pl-that/2-that  2-farmers  2sa-harvested  6-6-maize
     'We heard that the farmers harvested the maize.'

Carstens assumes the upwards direction of agreement is constrained in the following two ways: it is only possible when a particular probe has exhausted its c-command domain without finding a value and it must obey locality constraints enforced through transfer to spell out. What this means is that a probe is allowed to probe upwards, but only after probing first into its c-command domain and only until it is transferred to spell out. At that point, any remaining unvalued features are assumed to cause a derivation crash. Adopting this assumption allows us to abandon an agreement directionality parameter (Baker, 2008b; Diercks, 2011) and since adopting her assumptions about directionality fits in with the general architecture and allows us to simplify the system, I suggest they are reasonable modifications to make.

With those assumptions in place, we can begin to explore how the match/value system interacts with the novel case feature system proposed in the previous section. For reference, table 4.6 outlines the possible outcomes for various featural specifications with respect to the licensing/morphological case bundle I've proposed in this chapter. The 'probe' column is specified with the same featural specification, since we propose that all DPs enter the derivation with the same needs. The 'goal' column lists the possible featural specifications we've laid out for various functional heads. The match and value columns each list the outcome of that particular individual operation, given the featural content of the probe and goal. Finally, the 'morphological outcome' column gives us the morphological result of the two operations: a case category, default case, or ungrammaticality.
Table 4.6: Decomposition of Cases

  Probe          Goal                                      match   value   Outcome
  [uL [case]]    [L [case [verbal [oblique [dative]]]]]    yes     yes     dat
  [uL [case]]    [L [case [verbal [oblique]]]]             yes     yes     gen
  [uL [case]]    [L [case [verbal]]]                       yes     yes     acc
  [uL [case]]    [L [case]]                                yes     yes     nom
  [uL [case]]    [L]                                       yes     no      default form
  [uL [case]]    ø                                         no      no      ungrammatical

This section is intended to walk through how case valuation proceeds for a number of different circumstances, assuming the proposal I've set up so far. I first start with some canonical case valuation examples, just to illustrate that I've not made any drastic modifications to the traditional story of how case valuation works. I then show how each of the default case environments produces default case on the relevant DPs.

4.2.3.1 Canonical Case Valuation

I'll begin this walkthrough with a simple example that illustrates both canonical nominative case valuation and also accusative case valuation. Take the simple transitive clause in (27a) and its derivation in (27b):

(27) a. She loves him.
     b. [TP DP1-she[uL [case]] [T′ T[L [case]] [vP t1 [v′ v[L [case [verbal]]] [VP loves DP-him[uL [case]]]]]]]
        (match and value hold between T and she, and between v and him)

First, we'll start with canonical accusative valuation. Since the [uL[case]] feature bundle on the object DP him serves as its own probe, it will begin to search its c-command domain. Upon not finding anything with which to agree, it is allowed to continue the search upwards (Carstens, 2016) until the phase it occupies is spelled out, at which point it could cause ungrammaticality if left unvalued. The most local matching feature is the [L] feature on the v and the probe finds both a successful match and a successful value. Since value is successful, the features on v are then copied over to the DP. The DP will then be spelled out as the accusative pronoun him. The subject DP she also has a [uL[case]] feature bundle which will serve as its own individual probe, separate from the one on the object DP him. After moving to spec TP, this feature will probe, finding both a successful match and a successful value with the finite T that is in its c-command domain. The spell out of this DP will be the nominative pronoun she.

A reasonable question to raise at this point is why the external argument's probe isn't able to agree with the features on the v in (27b), since v is more local to the DP than finite T. I'm going to argue that since the case features on the v functional head have already established a dependency with the internal argument, the functional head is unavailable to establish future case-related dependencies with other DPs – it is not obvious, however, why this should be true. It is an important concern because with the modification that case and licensing features on functional heads are now valued, there needs to be some mechanism that prevents them from valuing multiple nominals in a clause. Under the standard Chomskyan approach, this issue is avoided with respect to case because once ϕ-agreement values case on the nominal, it deletes the uninterpretable case features from both the nominal and the functional head. The deletion of the case feature on the functional head prevents case from being reassigned to another nominal by the same head.
For now I will be forced to say that once a functional head licenses a nominal it can no longer license another nominal, largely by stipulation.6

6 There is a similar problem in the ϕ-feature domain under the standard approach. The ϕ-features on nominals are interpretable and are not deleted when the nominal agrees with a functional head, so in theory they should be able to continue to value other ϕ-probes on functional heads. What stops them is a constraint on agree that says that nominals can only participate in agreement if they have another uninterpretable feature, namely an uninterpretable case feature. There might be a way to extend this sort of proposal in reverse to the realm of case assignment by proposing that there needs to be an unvalued feature on the functional head for it to be able to participate in agree with respect to case. An obvious contender would be ϕ-features which would help account for the DP fulfilling functional heads, but wouldn't work for functional heads like top, for example. Another approach would be to somehow model that functional heads that receive something in return from the nominals they license are frozen in a way that prevents them from participating in future agree dependencies. This could circumvent the issue of proposing some unvalued feature that the disparate set of licensing heads share. The intuition would be that two-way dependencies are somehow stronger than one-way dependencies and the strength of that dependency freezes it from future participation in other dependency establishing operations.

Next, a canonical 'ungrammatical due to a Case failure' type of example:

(28) a. *John hopes her to win the game.
     b. [TP DP2-John[uL [case]]
          [T′ T[L [case]]
           [vP t2 [v′ v[L [case [verbal]]]
            [VP hopes
             [CP C
              [TP DP1-her[uL [case]]
               [T′ to[ø]
                [vP t1 [v′ v[L [case [verbal]]]
                 [VP win DP-the game[uL [case]] ]]]]]]]]]]]
        (the embedded TP is the spell out domain; match and value hold between the embedded v and the game)

Accusative case valuation for the DP the game works exactly as it did for the object DP him in example (27b). Next, the embedded subject DP her probes down its c-command domain and does not find a match at all in the non-finite T functional head because this functional head is exactly the type that is not specified with any of the case features proposed in section 4.2.2.2. The DP fails to match and is actually allowed to continue probing since it has not yet been sent to spell out via the merger of a phase head. It reaches the v, but although this functional head was originally generated with relevant licensing/morphological case features, it has already agree-d with the object DP the game and therefore these features are no longer available to participate in further agree. The embedded subject DP, having exhausted its c-command domain, is therefore allowed to continue to probe upwards, still searching for something with which to match. It is allowed to do so until a phase head causes the spell out domain to be transferred. The probe attempts to search upwards, but upon the merger of the C phase head, the TP spell out domain is spelled out and, having failed to value its case feature, the subject DP causes the derivation to crash.

To show how embedded DPs in ECM clauses receive a case value, in contrast to 'regular' non-finite clauses like the one we just saw, let's look at (29b):

(29) a. John expects her to win the game.
     b. [TP DP2-John[uL [case]]
          [T′ T[L [case]]
           [vP t2 [v′ v[L [case [verbal]]]
            [VP expects
             [TP DP1-her[uL [case]]
              [T′ to[ø]
               [vP t1 [v′ v[L [case [verbal]]]
                [VP win DP-the game[uL [case]] ]]]]]]]]]]
        (match and value hold between matrix T and John, between the matrix v and her, and between the embedded v and the game)

Here, the derivation proceeds exactly as it did in (28b), but because no C has merged into the structure – triggering spell out of TP – the embedded subject may continue its search past the TP into the matrix clause. It then finds a match, and subsequent successful value, in the v of the matrix clause. The embedded subject is therefore spelled out with the [acc] features it receives from the matrix v.

4.2.3.2 Quirky Case

Next, we need to sketch how Icelandic quirky case might operate under this analysis. Icelandic famously has non-nominative subjects, often called quirky subjects (Andrews, 1982; Sigurðsson, 1989; Thráinsson, 1979; Zaenen, Maling, & Thráinsson, 1985). When these non-nominative subjects surface, the object surfaces with nominative case, instead of the usual accusative case. This is shown in (30):

(30) Henni    líkuðu   hestarnir
     her.dat  liked    horses.the.nom
     'she liked the horses'

The structure of the vP for the example in (30) is shown below in (31):

(31) [vP DP-henni[uL [case]] [v′ v[L [case [verbal [oblique [dative]]]]] [VP líkuðu DP-hestarnir[uL [case]]]]]   (Harley, 1995)
     (match and value hold between henni and v)

I assume the v that merges with quirky verbs is lexically specified with whatever case features correspond to the quirky case required by that verb (Schütze, 1993); in (31), these are the dative features required by the verb líkuðu. What we have to assume is that by virtue of being a quirky assigning verb, it is unavailable to the internal argument, otherwise we might expect quirky case on the internal argument as it merges into the structure first. Immediately upon merge, external arguments of quirky verbs automatically probe looking for a match and value for their licensing/morphological case feature bundle inside the v′. The bundle finds both a successful match and value with the features on the v that merged with the quirky verb. This successful match and value also has the effect of eliminating the quirky verb from being able to provide licensing/morphological case features to any other DP, leaving the object DP's licensing/morphological case bundle still unvalued. The derivation then continues as usual, with the merging of T along with the typical movement of the external argument to the spec TP position:

(32) [TP DP1-henni[L [case [verbal [oblique [dative]]]]] [T′ T[L [case]] [vP t1 [v′ v [VP líkuðu DP-hestarnir[uL [case]]]]]]]
     (match and value hold between T and hestarnir)

The object DP's probe has still yet to successfully match and value its unvalued licensing/morphological case bundle. It is therefore allowed to continue to probe upward until it finds something with which to agree, since the quirky case-assigning verb was unavailable. Recalling example (28b), one might be concerned about agreement between T and a VP-internal object, given that a phase boundary vP appears between the two that should make its constituents unavailable for syntactic operations. If we adopt the weaker version of the PIC from Chomsky (2001), however, this is not a problem.
Chomsky (2001) allows agreement between T and a VP-internal object provided that the next phase head, C, has not yet been introduced. In other words, the spell-out of vP is not triggered until the introduction of C. What this means here is that the object DP is able to probe into the TP, provided this happens before C is merged into the structure. Following this assumption, the object DP is able to continue to probe past the vP and finds a successful match and value with the finite T that usually assigns nominative case to the subject. The object in quirky verb constructions therefore surfaces with nominative case. While some of the details of the mechanics of the canonical and quirky examples look a bit different than the traditional story, I'd like to point out that the underlying idea is exactly the same. Because these examples involve either a successful match and value or an unsuccessful match, they aren't really different from the traditional account, where something either successfully agrees or doesn't. Where this proposal really differs from the traditional Case story is in the cases where match is successful but value is not: the places where I predict default case to surface.

4.2.3.3 Hanging Topic/Left-Dislocation

Now that we understand how the approach proposed in the last section would account for canonical case valuation and failure, I'd like to begin our discussion of how the match/value based proposal would account for the distribution of default case forms in the default case environments by looking at what is possibly the least complicated environment: the left periphery, including left-dislocation and hanging topics.

(33) Hanging Topic/Left-Dislocation
     a. Me, I love honey.
     b. What?! Him wear a tuxedo?!

Both examples in (33) are considered left-dislocation, where some element is merged into a topic position at the left periphery.7 That there is syntactic material in the left periphery is obvious in (33a), but less so in (33b). Lambrecht (1990) proposes that sentences like (33b) actually involve the left-dislocation of both the subject and the predicate into two independent topic positions. For ease of explication, I'm going to provide a walkthrough of (33a), but follow Lambrecht (1990) in assuming that (33b) also involves left-dislocation and therefore we should expect it to work similarly to (33a). I assume that the hanging topic elements occupy a 'topic' position – the specifier of a topic head.8 In order to be grammatical, they must have a reasonable associate. For the sentence in (33a), I assume the structure in (34):

(34) [Tree diagram: the DP Me [uL[case]] sits in the specifier of topP; the topic head top is specified with [L] and takes the TP I love honey as its complement; a 'match, no value' annotation links the DP and the topic head.]

7 I don't think that assuming a base-generated story for left-dislocation or a movement-based account of left-dislocation makes a difference for my proposal, so I leave these details for another time.

8 Likewise, I would assume that focused elements occupy the specifier of a focus head. This focus head would have the identical featural specification with respect to the licensing/case bundle as the one I'm proposing for the topic head.

With respect to the task at hand – accounting for how the left-dislocated DP ends up in the accusative form – I propose that, like all DPs, it should be specified with the licensing feature I proposed in section 4.2.2.2. I also propose that the topic head should come with an [L] feature, but should not be specified with any more of the licensing/case bundle than that.
The motivation for this is that left-dislocation can clearly involve a number of different category types:

(35) a. [VP Love honey], it's what I like to do.
     b. [PP In the car], it's where I like to eat honey.

Because the topic head can clearly license a number of category types, it should be specified with a feature to do so, but we would not expect it to be specified with any morphological case features, as topic and focus heads do not function primarily to fulfill the needs of DPs specifically. With these specifications in mind, we can now work through how the left-dislocated DP satisfies its requirements. First, the unvalued [uL[case]] on the DP probes into its c-command domain. Here, it finds a match with the topic head, which is specified with an [L] feature. This match is successful because match is only evaluated at the root. This successful operation identifies the topic head as a potential valuer of the probe. Since match was successful, value then proceeds: the unvalued licensing/morphological case bundle on the DP evaluates whether or not its potential valuing goal is able to transfer its features to value the probe. Since the features on the probe are more specified than those on the potential goal, value evaluates this as a failure. The failure to value triggers the stripping of the features on the probe down to its root feature, [L]. Because match was successful, the newly stripped licensing/morphological case bundle on the DP is able to try again to find a value. This time, because the newly stripped probe is no longer more specified than the goal, value is able to proceed successfully and the [L] feature on the topic head goal is transferred to value the unvalued corresponding feature on the DP. Once this DP is spelled out, it is directed to insert the default case form – for English, accusative. The structure in (34) therefore produces the sentence in (33a).
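Because the same match/value logic recurs in each of the remaining default environments, it may help to restate the walkthrough of (34) schematically before moving on. The sketch below is a toy encoding under my own labels: the feature sets stand in for the [uL[case]] bundle and the [L]-specified topic head, and the 'nom' label on finite T is just a placeholder for whatever fuller specification finite T carries; none of this is an implemented system.

```python
# Illustrative toy encoding of the three outcomes of agree described in the
# text: no match, match + value, and match without value (the default case
# outcome). All names and data structures are invented for exposition.

def agree(probe: set, goal: set):
    """probe: the DP's licensing/morphological case bundle, e.g. {"L", "case"}.
    goal: the features on the functional head it has reached."""
    # match is evaluated only at the root feature [L]
    if "L" not in goal:
        return ("no match", probe)                 # keep probing / crash at spell out

    # value asks whether the goal can transfer enough structure to the probe;
    # it fails when the probe is more specified than the goal
    if probe <= goal:
        return ("valued", set(goal))               # canonical case form inserted

    # a failed value strips the probe down to its root and value tries again
    stripped = {"L"}
    assert stripped <= goal                        # a bare [L] goal can now value it
    return ("default", stripped)                   # spell out inserts the default form

# The hanging topic in (34): the DP probe [uL[case]] against the topic head's bare [L]
print(agree({"L", "case"}, {"L"}))                 # ('default', {'L'})
# A canonical configuration: the same probe against a fully specified finite T
print(agree({"L", "case"}, {"L", "case", "nom"}))  # ('valued', ...)
# A non-finite T specified as [ø]: no match at all
print(agree({"L", "case"}, set()))                 # ('no match', {'L', 'case'})
```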
4.2.3.4 Coordination

Coordination is another place where we find accusative DPs in English, despite no obvious source for the [acc] features in the derivation. This fact, coupled with the observation that coordinated DPs in other languages are not accusative, but rather often9 align with whatever the default case for that particular language is, leads us to treat the examples in (36) as default environments:

(36) Coordination
     a. Me and her will go to the store.

I first want to make clear that I agree with Schütze (2001), among others I'm sure, that we should not propose that coordinators assign morphological case. This is for two reasons: (i) arguing for an analysis like that would in fact be quite ad hoc, and (ii) doing so would require us to argue that coordinators, both cross-linguistically and within a particular language, aren't consistent in which morphological case they assign their conjuncts. Instead, what I will argue here is that coordinators are not morphological case assigners at all, as discussed above – they do not come specified with any specific case features. What they are able to do, though, is license DPs. Importantly for the type of feature grounding I've suggested in this thesis, this ability is not restricted to just DPs, but is rather quite cross-categorial:

(37) a. [DP Jim] and [DP John] will go to the store.
     b. Jim will [vP go to the store] and [vP rent a movie].
     c. The store is [PP around the corner] and [PP down the street].

Since coordinators are able to license a number of categories rather than just DPs, I argue that these are the kind of functional heads that will come specified with an [L] feature, but no other DP-specific featural structure beyond that. With this featural specification put in place, we now look at how the coordinated DPs in the examples in (36) come to receive accusative case by default.

9 Unsurprisingly, the coordination data is a bit more involved. Here I provide an account for the coordinated structures that do involve default case, but admit that the claims here should be tempered in some ways.

Below is an example of the sort of structure I will assume for coordinated DPs, following Munn's (1993) BP adjunction analysis:

(38) [Tree diagram for (36a): the coordinate DP1 in Spec,TP consists of DP me [uL[case]] and an adjoined BP whose head B and bears [L] and whose complement is DP her [uL[case]]; both conjunct DPs carry a 'match, no value' annotation with respect to the B head; T is will; the vP contains the trace of DP1 and go to the store.]

As DPs, both of the coordinated DPs come with the full licensing/morphological case bundle I've proposed: [uL[case]]. The larger coordinated DP, as a DP itself, comes with this specification as well. When each of the coordinated DPs probes, the first functional head each encounters is the coordinator and. As a reminder, the lower coordinated DP first probes its c-command domain, but following Carstens (2016) is able to continue to probe upward if nothing is found there, provided the derivation has not yet been spelled out. Since the functional head that each of these DPs finds first has an [L] feature that is capable of successfully match-ing the unvalued licensing/morphological case bundle on the DP, match is successful and value is then attempted. As we saw in the examples with hanging topics/left-dislocation, value between the more specified licensing/morphological case probes and the less specified [L] feature goal is unsuccessful, which triggers the deletion of any additional feature structure on the DPs, minus the root. These newly stripped [uL] features on the probes are then able to value when they encounter the [L] feature on the coordinating head and. The resulting DPs are now valued with just their [L] feature, which will trigger the insertion of default case forms at spell out.10

10 Of course you may be wondering about the fact that in English, nominative forms are acceptable as well (ia). Schütze (2001) suggests that these might be instances of hypercorrection, given that they're not consistently used across the paradigm (ib). If this is true, then we might be able to account for these situations where a prescriptive rule overrides the grammatical default (see Sobin, 1997, for a proposal).
(i) a. between you and I
    b. *between we and they

One specific thing to note with these coordination examples is that I'm of course assuming that both DPs have access to the licensing feature on the coordinating head. I suggest that this isn't really problematic because, once again, this licensing feature is independent from morphological case. It seems that the job of coordinating heads in the structure is to license and connect two syntactic objects. It's not unreasonable, therefore, to assume that both conjuncts have access to this licensing. I'm uncommitted at this point to whether this joint access is due to there actually being two independent licensing features on the coordinating head or whether the two conjuncts are simply both able to access it due to its being a valued, rather than unvalued, type of feature.

4.2.3.5 Gapping

Gapping is considered another syntactic environment where default case surfaces (Schütze, 2001).

(39) Gapping
     a. She will eat beans, him rice.
     b. For Mary to be the winner and us the losers is unfair!

For this discussion, I'm going to adopt the analysis of gapping proposed in Johnson (2009), shown in (40) below. There are three parts to his approach: (i) low coordination of the vPs, (ii) heavy NP shift of their objects, and (iii) across-the-board movement of the VPs.
Johnson assumes that when two vPs are coordinated, that coordination can trigger two separate processes: the rightward shift of the objects outside of their respective VPs and the subsequent across-the-board movement of those VPs to the specifier of a predicate phrase. The subject of the first vP, as the highest DP in the structure, will be the one targeted by the EPP feature of T and will raise to subject position. The subject of the second vP will remain in its original position. It is in this position that it receives default case, as there isn't a canonical accusative case assigner available to assign its case features. To see how we can account for this using the case assignment system proposed in this chapter, let's focus on the boxed part of the tree in (40), the BP and him rice, to show how the DP him is spelled out as the accusative pronoun.

(40) [Tree diagram for (39a), following Johnson (2009): she raises to Spec,TP below T will; the VP eat undergoes across-the-board movement to Spec,PredP; two vPs are coordinated by B and [L]; in the boxed second conjunct, and him rice, the subject DP him [uL[case]] carries a 'match, no value' annotation with respect to the B head, while the shifted object DP3 rice matches and values against the conjunct-internal v [L[case: verbal]]; the first conjunct contains the trace of she and the shifted object DP3 beans.]

Here, the object DP rice probes and finds a successful match and value with the accusative v, as it has done in all of our canonical examples. This successful agree makes the features on the v unavailable to other DP probes. The external argument of the vP probes its c-command search domain, but because the features on the v are unavailable, it continues its search upwards. It quickly finds a match with the B head and and attempts to value. Since the features on the goal are less specified than those on the probe, value fails and the features on the DP probe are stripped down to the root feature, [L]. value is then once again attempted and is this time successful. The DP is therefore transferred to spell out with an [L] specification, and the default case form for English, the accusative, is inserted.11

11 One concern is how the features of the B head are available, despite a general understanding that once a head agrees, its features are no longer available to other probes. One thing we could say here is that the B head, due to its nature of licensing multiple conjuncts, is special in that its features do remain available even after successful agree. Another worry we might have is that by agreeing with the B head, in a way we are saying that the B head is licensing not only its conjuncts, but also DPs inside them. One way to avoid some of this discomfort is to say that coordinators cannot license something if that something is incomplete or ungrammatical in any way. Since the verb in the coordinated vP is a transitive verb, it must have an external argument. We could maybe say that the licensing of the external argument comes as part of the licensing of the vP. Finally, if the idea floated in an earlier footnote that two-way dependencies are the only ones that can't participate in further dependencies winds up working, we could say that since the nominal isn't giving the coordinator anything in return, it's the type of dependency that doesn't freeze the licensing head.
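As a quick check that the pieces of this walkthrough fit together, the standalone toy sketch below traces the second-conjunct subject him through the search just described: its conjunct-internal v has already been used up by rice, so the first available goal is the bare [L] on the B head. As with the earlier sketches, the encoding and names are mine and purely illustrative.

```python
# Toy walkthrough of the boxed part of (40), "and him rice". Invented names;
# illustrative only, not an implemented system.

def agree(probe, goal):
    if goal is None or "L" not in goal:
        return "no match", probe
    if probe <= goal:
        return "valued", set(goal)
    return "default", {"L"}           # match without value: strip to the root

him = {"L", "case"}                    # the DP's licensing/morphological case bundle

# Search path for him: first the conjunct-internal v (already frozen by its
# agree with "rice", hence unavailable, encoded here as None), then upward to
# the B head "and", which bears only [L].
search_path = [None, {"L"}]

for goal in search_path:
    status, features = agree(him, goal)
    if status != "no match":
        break

print(status, features)                # default {'L'}: spelled out as accusative "him"
```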
4.2.3.6 acc-ing gerunds

A potential extension that Schütze (2001) didn't detail (although he hinted at its possibility in Schütze, 1997) is to treat acc-ing gerunds as yet another place where nominals receive default case (Abney, 1987; Horn, 1975; Milsark, 1988; Reuland, 1983).

(41) a. Her revising the book is really helpful.
     b. Sue prefers him swimming. (Pires, 2007)

I adopt the analysis offered in Pires (2007); see that work for further details. While both types of gerunds are clausal, there are many places where acc-ing gerunds and poss-ing gerunds exhibit a difference in behavior: acc-ing gerunds can appear with certain adverbs that are ungrammatical with poss-ing gerunds (42a)-(42b), acc-ing gerunds are capable of licensing long-distance wh-extraction (42c)-(42f), and acc-ing gerunds can include an expletive while that option is ungrammatical for poss-ing gerunds (42g)-(42h). Given that acc-ing gerunds do not seem to behave like poss-ing gerunds, Pires suggests that acc-ing gerunds must have a different structure, proposing that acc-ing gerunds are TPs while poss-ing gerunds acquire some sort of DP layer through the derivation. Of course, the T that heads the gerund must be somehow distinct from finite T as well, since we get accusative case assigned inside the gerund rather than nominative. A structure for acc-ing gerunds is shown in (43).

(42) a. Mary probably being responsible for the accident, the attorney did not want to defend her.
     b. *Mary's probably being responsible for the accident, the attorney did not want to defend her.
     c. What did everyone imagine Fred singing?
     d. *What did everyone imagine Fred's singing?
     e. Who did you defend Bill inviting?
     f. *Who did you defend Bill's inviting?
     g. You may count on there being a lot of trouble tonight.
     h. *You may count on there's being a lot of trouble tonight.

(43) [Tree diagram of an acc-ing gerund under Pires (2007): a matrix clause with Sue in Spec,TP and V prefers taking a bare TP complement; the gerund subject DP1 [uL[case]] occupies the specifier of the gerundive T, which bears [L] and carries a match/value annotation with that subject; the gerundive T′ contains the vP swimming.]

What I propose is that the gerundive T is specified as [L], but the motivations for its specification are a bit different than what we've seen so far. I argue that there are three 'flavors' of T: the normal finite T that comes with [L[case]], the non-finite T that comes with [ø], and the gerundive T that comes with [L]. We clearly can't ground this specification in the same way we have for the other functional heads with [L], because it's quite impossible to argue that this gerundive T licenses many types of categories. However, I think what might be interesting is if we tied the gerundive T's specification to finite T's specification. Viewing them together, we can characterize gerundive T as a sort of underspecified T and, by extension, clausal gerunds as a type of underspecified clause. They behave like clauses in many ways, and so gerundive T has some of the normal capabilities, but not all of them – namely, it can't assign nominative case to its specifier.
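To keep the three flavors of T apart, the toy summary below lists the specification assumed for each and the outcome it predicts for an overt subject whose probe is the full [uL[case]] bundle. The feature labels are again my own shorthand for the specifications proposed in the text, not an implemented system.

```python
# Toy summary of the three 'flavors' of T and what each predicts for an overt
# subject DP whose probe is the [uL[case]] bundle. Labels are expository only.

T_FLAVORS = {
    "finite T":     {"L", "case"},   # [L[case]]: full match and value -> nominative
    "non-finite T": set(),           # [ø]: no match at all
    "gerundive T":  {"L"},           # [L]: match, but value fails -> default
}

def outcome(probe, goal):
    if "L" not in goal:
        return "no match (crash unless another licenser is found)"
    if probe <= goal:
        return "match + value (canonical case)"
    return "match without value -> strip to [L] -> default case"

for flavor, features in T_FLAVORS.items():
    print(flavor, "->", outcome({"L", "case"}, features))
```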
While clausal, acc-ing gerunds do exhibit some nominal-like behavior. There are three positions associated with nominals where acc-ing gerunds can appear: complement to V, complement to P, and subject position; the examples in (44) show this distribution. acc-ing gerunds also pattern with nominals with respect to case positions, being ungrammatical in passives and in raising structures (45). Pires proposes that what accounts for the nominal-like distribution of acc-ing gerunds is that, unlike finite T, the gerundive T has its own unvalued case feature, which needs to be valued. With respect to grounding the difference between gerundive T and finite T, it could be that the underspecification of gerundive T is somehow tied to this property.

(44) a. Mary favored [him taking care of her land].
     b. Sylvia wants to find a new house without [her helping her].
     c. [Her showing up at the game] was a surprise to everybody.

(45) a. *Him was preferred [reading a book].
     b. [Him reading a book] was preferred.
     c. *It appears [him liking Mary].

With respect to the proposal itself, if gerundive T is specified as [L], then we can show that the subject of the gerund receives default case via having match-ed, but not value-d, its case/licensing bundle. A nice benefit to this approach is that we do not have to propose that T assigns accusative case. All that's needed here is to say that the morphological case assigning function of T is lost, but its licensing function is not. In this circumstance, the default accusative case arises naturally from the system.

4.2.3.7 Modified Pronouns

Finally, we come to modified pronouns, the final default case environment from Schütze (2001) that I will discuss here. The examples in (46) show that accusative pronouns can often appear as smaller constituents of a larger DP, once again despite there not being any obvious source for these features.

(46) Modified Pronouns
     a. Lucky me/*I has to clean the toilets all day.
     b. The real me/*I is finally emerging.

At face value, these examples seem to constitute the most difficult case for the account I'm advancing in this thesis. If the pronoun can't find a licensor within the DP it's a constituent of, it is not clear how we would prevent that pronoun from searching outside the DP and finding the same [nom] that the larger DP would presumably receive from finite T, shown in (47). What this seems to mean is that my account requires there to be a source for licensing internal to the larger DP to avoid nominative case on the smaller constituent pronoun, me. Given a structure like (48), it's clear that there are not many (or any) real options.

(47) [Tree diagram: the larger DP the real me [uL[case]] sits in Spec,TP of a clause whose finite T bears [L[case]] (the real me likes tomatoes); the pronoun me is itself a DP bearing [uL[case]] inside the larger DP, below the and the AP real.]

(48) [DP [D the] [DP [AP real] [DP [D me]]]]

One way to avoid this problem is to argue that the nominal me does not in fact have a licensing/morphological case feature bundle at all and that its distribution is only governed by selection. How could this be possible? If the me in (46) were a noun, instead of a pronoun, then we could make the argument that me doesn't come into the derivation with the same set of requirements that DPs do. Me is a regular noun, akin to a proper name, and is simply syncretic with the accusative pronouns. Instead of (48), we would model these modified pronouns as (49):

(49) [DP [D the] [NP [AP real] [NP [N me]]]]

One reason to suggest that the nominals in (46) are not the same as the canonical pronouns is that they trigger third person agreement, rather than the first person agreement that bare pronouns would trigger.

(50) 3rd person agreement (Schütze, 2001)
     a. The real me is/*am emerging.
     b. I *is/am emerging.
This distinction indicates that these two types of pronouns are different in some way, and I argue that the difference is that they are not of the same category type. Schütze (2001) provides evidence from Italian that supports a me-as-noun type of analysis as well:

(51) a. il   vero  me  stesso
        the  real  my  self
     b. ??il  vero  me
        the   real  me
     c. **me  (stesso)  vero
        me    (self)    real

(52) a. il   vero  Paolo
        the  real  Paul
     b. **Paolo  vero
        Paul     real

The data in (51) and (52) show two things: (i) a parallel between modified pronouns and proper names, lending support to a me-as-noun type of analysis, and (ii) that in the proper name subset of nominals, the typical N-to-D movement common in Italian is actually unavailable. If modified pronouns are in fact nouns, then we have to have an explanation for how they do not covertly move to D and receive nominative anyway. The fact that modified pronouns do not move in a language that has overt N-to-D movement supports the idea that these modified pronouns do in fact stay in their merged N position. The agreement data in (50), paired with the analogous proper name data in (51) and (52), provide reason to draw a distinction between typical bare pronouns and these modified pronouns. What's convenient about assuming that the modified pronouns are nouns is that we have a built-in explanation for how they avoid receiving nominative case – they are not DPs, so they do not need case at all. Each of these nominals' forms is not an actual example of accusative case; rather, they are regular nouns that have been reanalyzed to some extent and are simply syncretic with the accusative forms in English. As nouns, they do not come specified with the licensing/morphological case bundle that I've argued for in this thesis, and in this way the examples in (46) would actually not be considered default case environments under my analysis at all.

(53) [Tree diagram: as in (47), the larger DP the real me [uL[case]] sits in Spec,TP with finite T [L[case]] (the real me likes tomatoes), but me is now an N inside an NP and carries no case/licensing bundle of its own.]

What (53) shows is that the nominal me, as a noun, does not come with any of the licensing/morphological case bundle. It therefore has no requirements beyond selection and doesn't probe at all. The larger DP that it's a constituent of, the real me, however, does come specified with a licensing/morphological case bundle, like all DPs. It does have a set of requirements it needs to meet and does so by canonical probing into its c-command domain. As usual, it will find a match and a value from the featural specification of the finite T and it therefore receives nominative case. In English, nominative case isn't spelled out on this larger DP.

4.3 Evaluating Our Options

We've now argued that default case can be accounted for through the adoption of a match/value type of agreement system proposed by Béjar (2003), but we've not yet considered whether or not we should account for default case this way. This section outlines a few empirical reasons why this type of approach has enough promise to be seriously considered. The goal here is to show that default case does not necessitate the rejection of the role of case in nominal licensing, nor the abandonment of an agree-based approach to case valuation. While this may appear far too modest a claim, the existence of defaults and the problems they've introduced has been framed as so serious a problem that its solution is taken to require drastic reconfigurations of basic, long-held assumptions.
Showing that smaller modifications to the system are available removes the severity of the call to adopt the departures. Whether or not one wants to adopt the system proposed here is of course a separate issue, and one that I'll try to address here. However, given the breadth of morphological facts that a theory of case must account for, I simply will not be able to convince the reader that dependent case theory is dead, or should be dead. What I do hope to do in this section is to bolster the attractiveness of this other available option, one that is more in line with the decades-long program of classical case theory.

4.3.1 Some Problems for Dependent Case Models

This section outlines a few types of data that raise serious issues for dependent case models. We can group these examples into two groups by the issue they pose for the dependent case model. In section 4.3.1.1 we'll see data that make it difficult to argue that the assignment of accusative case is dependent on the existence of other nominals, and in section 4.3.1.2 we'll see examples from the default case set of environments for which the model makes the wrong predictions about case assignment.

4.3.1.1 Sole Accusative Arguments

Because dependent case theory frames the assignment of accusative case as dependent on the presence of another nominal in a given domain, it makes the prediction that sole arguments of predicates should not be able to receive it, at least without proposing an 'invisible' sort of nominal in the structure. Kučerová (2012) makes exactly this point and provides data from Polish and Ukrainian of just these instances. There is a construction in these languages called the -no/-to construction, shown below in (54)-(55).

(54) Polish
     a. Pies          był/został       zabity       przez  samochód
        dog.m.sg.nom  was/stayed.m.sg  killed.m.sg  by     car
        'A dog was killed by a car.'                        (canonical passive)
     b. Psa           zabito
        dog.m.sg.acc  killed.n.sg
        'A dog was killed.'                                 (nt)

(55) Ukrainian
     a. Žinky           buly      vbyty
        woman.nom.f.pl  was.f.pl  killed.f.pl
        '(The) women were killed.'                          (canonical passive)
     b. Žinok           bulo       vbyto
        woman.acc.f.pl  were.n.sg  killed.n.sg
        '(The) women were killed.'                          (nt)

This construction involves the internal argument being marked as accusative, rather than with the more expected nominative case, (56).

(56) a. Psa           zabito
        dog.m.sg.acc  killed.n.sg
        'A/The dog was killed.'
     b. *Pies         zabito
        dog.nom.m.sg  killed.n.sg

What makes this sort of data difficult for dependent case theory to account for is that there is no obvious way for the dependent case algorithm to assign accusative case to the nominal in question, because there is simply no other nominal that c-commands it. She argues that since the -no/-to construction can be formed with unaccusatives, raising verbs, and modal verbs, shown in (57), there really is an absence of an external argument, even one that is not overt. Since accusative case is dependent on the presence of that additional nominal by definition, this data constitutes a problematic example for this theory to capture.

(57) a. Balon        rozerwano
        balloon.acc  pierced.n.sg.ppp
        'The balloon was pierced.'
     b. Zdawano   się   nas  nie  zauważać
        seem.imp  refl  us   not  notice.inf
        'They seemed not to be noticing us.'
     c. Musiano  to    wykonać,  bo       zbliżał     się   termin
        must.nt  this  do.inf    because  approached  refl  deadline
        '(They) had to do this, because the deadline was approaching.'

All is not easy in the Agree-based system either, but the Agree-based system does provide a bit more leeway for it to work.
Traditionally, the issue is that a v that does not have an external argument is unable to assign accusative case. This is the familiar Burzio's Generalization. What Kučerová argues is that we've misunderstood what is responsible for 'bestowing' the ability of v to assign case. It's not the presence or absence of an external argument that allows v to assign case, but rather whether or not the v structure is extended that is the relevant property. Of course, one way this extension can happen is if an external argument is merged in the structure, and in this way Kučerová captures the initial Burzio intuitions. What this proposal does, however, is provide an additional set of instances where v can assign accusative case. For her, the -no/-to construction involves the have-perfect, which she argues extends the projection in a way that allows for the assignment of accusative case by v. The proposal for the -no/-to construction is of course not without issue, but the set of possible explanations for how the accusative case ends up on the singular argument conflicts less with the version of the system she adopts than it would with dependent case theory. I suggest that while one may have reservations about adopting this particular story, all one needs to do if adopting an Agree-based approach is to propose an understanding of how v, which normally has case assignment abilities anyway, assigns accusative case, despite the expectation that there's some property blocking this ability. To accomplish something similar while adopting dependent case theory requires one either to suspend the definition of accusative case altogether and claim that its assignment is not always dependent on there being another nominal present, or to propose that, despite evidence to the contrary, there is a null nominal in the structure that the dependent case mechanism can be sensitive to.

4.3.1.2 Dependent Case Theory and Default Case

One thing that makes the default environments especially interesting with respect to evaluating dependent case theory is that they all test the distinction between unmarked and default environments, a distinction that we saw in chapter 2 is not easily maintained in modern versions of the theory. Unmarked cases in a dependent case system are like default cases in that they are cases that can be assigned without being dependent on the existence of another nominal in a given domain. What distinguishes them informally from defaults is that unmarked cases are context-sensitive while defaults are not. As a reminder, the two main unmarked cases are nominative, which is assigned by the grammar when a nominal does not otherwise receive case and is typically in the domain of TP spell out, and genitive, which is assigned by the grammar when a nominal does not otherwise receive case and is typically in the domain of DP/NP spell out. By comparison, default case is assumed to be the case that is assigned when a nominal is unable to receive either a lexical case, a dependent case via the dependent case algorithm, or the unmarked case. Because the conditions under which unmarked case is assigned are already quite default-like in nature, this essentially means that default case is predicted to show up only where the unmarked case cannot apply – outside the spell out domains of TP and DP/NP.
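To keep the logic of this disjunctive hierarchy explicit in what follows, here is a schematic sketch of the case realization procedure as it is described above (in the spirit of Marantz, 1991). The function, field names, and string labels are invented for exposition; this is not anyone's implemented algorithm.

```python
# Schematic sketch of the case realization hierarchy as described in the text
# (in the spirit of Marantz 1991 / Baker 2015). Names are invented for
# exposition only.

from dataclasses import dataclass

@dataclass
class Nominal:
    name: str
    lexical_case: str = ""            # e.g. a quirky dative required by the verb
    caseless_ccommander: bool = False  # another caseless DP above it in the domain
    domain: str = ""                   # "TP", "DP/NP", or "" (outside both)

def assign_case(n: Nominal, default: str = "ACC") -> str:
    if n.lexical_case:                 # (i) lexically governed case
        return n.lexical_case
    if n.caseless_ccommander:          # (ii) dependent case (accusative downward)
        return "ACC"
    if n.domain == "TP":               # (iii) unmarked case by spell-out domain
        return "NOM"
    if n.domain == "DP/NP":
        return "GEN"
    return default                     # (iv) default, only outside TP and DP/NP

# A hanging topic sits outside the TP domain, so it can reach the default:
print(assign_case(Nominal("Me (hanging topic)")))            # ACC (default)
# An ordinary unvalued DP inside TP is caught by the unmarked case first:
print(assign_case(Nominal("subject DP", domain="TP")))       # NOM (unmarked)
```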
Where modeling defaults in dependent case theory sees difficulty is that it's certainly much easier to argue that there is a set of nominals that are not governed, and are thus outside any sort of governing domain, as was true in earlier versions (Marantz, 1991). It's much harder, however, to make an argument that the nominals in question are not contained or otherwise present in a TP or DP/NP context. To see the difficulty, let's first look at an example of a default case that does not challenge this idea: the default environment of hanging topics and left-dislocation, the structure repeated below in (58). Here, the nominal in question does not receive accusative case because it is not c-commanded by another nominal that also does not have case. It also doesn't appear to qualify for the unmarked case, as it is in a position that is not within the relevant TP domain. It therefore remains completely caseless at the end of the derivation. We can therefore assume that at spell out, the grammar assigns default case to that nominal: accusative for languages like English, and nominative for others.

(58) [topP [DP Me] [top′ top [TP I love honey]]]

What allowed the nominal in (58) to avoid getting the unmarked case was that it existed in a position outside the domain that defines the assignment of unmarked case. Where dependent case would run into trouble is if we found nominals that received default case, rather than unmarked case, in positions that were not outside these unmarked-assigning domains. I'll show here that two of the default case environments involve exactly these types of positions: coordinated DPs and gapping environments. As a quick reminder, the motivation for classifying these environments as default case environments rather than examples of unmarked case is the cross-linguistic variation we see in them. While some languages mark these nominals with nominative case – and would thus be indistinguishable from unmarked case – other languages mark the nominals in those same positions with accusative. This indicates a last resort style default mechanism. Our discussion regarding coordinated nominals is interesting because there are two nominals to discuss, rather than one. We'll discuss them in turn by examining the structure in (59). DP3 is in a position where it is c-commanded by another nominal. It could avoid being assigned accusative case by the dependent case mechanism if we assume that the DP1 domain that contains it serves as a barrier of sorts from the larger TP domain where the dependent case mechanism applies. If it avoids the assignment of dependent case, it is then evaluated for assignment of unmarked case: nominative if in a TP domain, genitive if in a DP/NP domain. It seems that it would be difficult to make an argument that the DP3 nominal is in neither of these domains. Default case then would not be able to be assigned, because the unmarked case would have taken precedence. DP2 avoids getting dependent case because it, unlike DP3, is not c-commanded by another DP in the TP domain. Like DP3, however, it has trouble avoiding being assigned unmarked case, because it too is in a position that is difficult to argue is not in the unmarked domain.

(59) [Tree diagram for Me and her will go to the store: the coordinate DP1 in Spec,TP contains DP2 me and a BP headed by B and whose complement is DP3 her; T is will; the vP contains the trace of DP1 and go to the store.]
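A quick application of the schematic hierarchy from the previous sketch to the two conjuncts in (59) makes the mismatch concrete; the function is redefined minimally here so the snippet stands alone, and the domain settings simply restate the point just made in the prose.

```python
# Applying the schematic case hierarchy to the conjuncts in (59).
# Invented names; illustrative restatement of the argument in the text.

def assign_case(lexical="", caseless_ccommander=False, domain="", default="ACC"):
    if lexical:
        return lexical
    if caseless_ccommander:
        return "ACC"                   # dependent case
    if domain == "TP":
        return "NOM"                   # unmarked case in the TP domain
    if domain == "DP/NP":
        return "GEN"                   # unmarked case in the DP/NP domain
    return default

# DP2 "me": not c-commanded by another caseless DP, but plainly inside the TP domain.
print(assign_case(domain="TP"))                             # NOM, yet English shows default ACC
# DP3 "her": even if the containing DP blocks dependent case, it still sits
# inside a TP (or DP/NP) domain, so unmarked case should again preempt the default.
print(assign_case(caseless_ccommander=False, domain="TP"))  # NOM, not the observed default
```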
Gapping environments provide a similar argument: there are default-case-receiving nominals in positions that would be difficult to claim aren't in either of the unmarked case assigning domains. The assumed structure for gapping environments is shown below in (60), with the relevant nominal, him, highlighted. As with the coordinated nominals, it is difficult to make the claim that this nominal does not exist in the TP spell out domain. Now, one could argue here, for the English examples, that maybe gapping environments aren't actually examples of default case, but rather are more canonical accusative case examples. Notice that him is c-commanded by another nominal in the spell out domain. We'd therefore expect that nominal to receive the dependent accusative case. For English, there isn't an obvious reason to reject that proposal, since the default case is accusative and a distinction cannot be made. However, to the extent that one considers this a default environment where other languages would instead mark the nominal with nominative case, this becomes an issue.

(60) [Tree diagram for She will eat beans, him rice, following the Johnson-style structure adopted above: she raises to Spec,TP below T will; the VP eat moves across the board to Spec,PredP; the two vPs are coordinated by B and; the second-conjunct subject him remains inside the coordinated vP, with the shifted objects beans and rice at the edges of their respective VPs.]

Finally, to the extent that one adopts the proposal for acc-ing gerunds argued for in Pires (2007), this environment offers difficulty for dependent case as well. The relevant DP appears in the accusative case, but there is no other DP that c-commands it within its own spell out domain. Once the matrix v merges into the structure, it should trigger the spell out of the embedded TP spell out domain. Because within this domain there is no other nominal besides DP1, the dependent case model would predict the relevant DP to surface in the unmarked nominative form.

(61) [Tree diagram for Sue prefers him swimming: a matrix TP with Sue in Spec,TP and V prefers; the embedded gerundive TP contains DP1 him in its specifier and the vP swimming.]

To summarize, what causes problems for modern dependent case theory is that there exist a number of default nominals in positions where we'd predict the unmarked case to apply rather than the observed default. agree-based models more easily account for default case because the burden for escaping the case assignment process is easier to meet. Defaults in a system like the one proposed in section 4.2 simply have to exist in a position where they cannot form a relationship with a case assigning functional head. Defaults in a configurational case system must instead exist in positions that are outside clause building domains like TP (and DP), something that is much harder to argue.

4.3.2 Final Remarks

There are a host of other questions that come out of the discussion in this chapter. Some of those will be speculatively explored in the next chapter. What I hope this chapter has accomplished is the following: we've provided an alternative approach to modeling default case in the grammar, largely maintaining the traditional assumptions about the role of Case in regulating DP distribution.
We followed the traditional understanding that licensing and morphological case are independent; but by modeling the features responsible for each as having a hierarchical relationship, we've been able to formally encode the notion that while licensing and case are separate, they are related to one another. This is an interesting finding because data like the default case data appeared to require the abandonment of these basic theoretical assumptions. The alternative proposed in this chapter shows that we can in fact reconcile the problematic default case data in a framework where the failure to receive case can still rule derivations ungrammatical. This chapter has also contributed a novel system of case features that relied heavily on intuitions others have had about case syncretism patterns and the hierarchical structure they suggest. Arguing for this hierarchical feature system allowed us to extend the match/value approach to agreement that captured so many varied patterns in the ϕ-agreement domain, most importantly the ability for the operation to fail, solving a similar issue in a different domain. Being able to account for similar types of syntactic failure while employing the same set of operations is likely a benefit of the proposal offered here. One part of the licensing/morphological case specifications on functional heads that I've left aside is the question of to what extent other syntactic categories (like PPs, vPs, etc.) also come with a licensing feature. What motivated DPs having this sort of feature bundle was that they seemed to have requirements beyond a simple selection feature. Selection alone isn't enough to license DPs because they appear to need some sort of confirmation beyond this selection to appear in the positions where they are allowed. It does not seem to be the case, however, that other category types have a requirement like this. As far as we know, selection alone is enough to account for the distribution of PPs, vPs, etc. Since selection alone appears to be enough, proposing that these categories come with an additional head that does the work of licensing them seems unnecessary. Note that selection alone can't be enough to handle DP distribution. If we could reduce the [L] feature proposed here to something like a D feature or a DP feature, then we'd be unable to account for the non-finite subject position examples that are ungrammatical, as there isn't a convincing way to differentiate finite T from non-finite T based on some version of an EPP/DP feature. I'm certainly not attempting to push these issues aside, as they are quite central to the problems surrounding the needs of DPs. This should be a familiar discussion, as it's essentially the Case versus EPP debate that so many researchers have devoted time to (see McFadden (2004) for a review of the relevant literature). What I'd like the big take-away to be from this chapter is that the existence of defaults in the case domain does not require us to abandon ship, as has been previously argued. By nailing down what DPs are looking for and by being specific in how those needs are encoded into a feature structure, we've been able to make some progress on accounting for how defaults could surface in the realm of case, while maintaining its role in determining grammaticality. I've paired this argument with some comments on instances where dependent case theory might have some trouble. I leave speculation about what these alternatives mean for the bigger picture to the next chapter.

CHAPTER 5
CONCLUSIONS

5.1 Introduction

With the arguments and proposal laid out, I offer a quick summary of what we've done before some final comments about what this all might mean. We saw how the existence of defaults raises some interesting and deep problems for our understanding of how we want to model the syntactic framework. The crux of the issue is twofold: how is it that defaults can surface in a system that should categorically rule them out, and how can we constrain the mechanism responsible for their production? Three main questions guided our discussion:

(i) How does the grammar produce defaults?
(ii) How does the grammar constrain the production of defaults?

(iii) What can an understanding of syntactic defaults tell us about how the syntax can encode underspecification?

To address these issues, researchers have proposed the abandonment of one of the central tenets of the syntactic system, proposing in various ways that the failure to value features is not fatal to the derivation. This quite intuitively solves the problem raised by question (i). If features are allowed to survive to the interfaces without having been valued, then the production of defaults is easily accounted for. We saw a solution of this type in the case domain, where the dependent case proposal built defaults directly into the system by claiming that case features can remain unvalued. This subsequently requires the adoption of a separation between case and licensing, given case's central syntactic role in regulating nominal distribution. We also discussed a solution in this vein proposed in the ϕ-agreement domain with Preminger's (2014) obligatory operations model, whereby operations are triggered by their structural descriptions and an inherent need to apply, rather than by a need to fulfill the requirements of various syntactic objects.

5.2 The Lifeboats are Headed in the Wrong Direction

In chapters 2 and 3 I presented a number of arguments for why we should be hesitant to jump ship, so to speak, and pursue this general approach. With respect to dependent case theory, I showed that its modern instantiation violates the Inclusiveness Condition in a way that makes the origins of case features – and by extension, their assignment – unclear. This raises questions about whether or not dependent case can truly constitute a real system, as opposed to a clever way to describe morphological patterns. Related to that point is the conceptual inconsistency that it indicates. We saw that even under strict dependent case assumptions, case was modeled as the reflection of a number of different syntactic dependencies, undermining the status of case as a system. This likely has negative repercussions for acquisition as well. Empirically, we saw that the difficulty in distinguishing between unmarked and default cases raises issues for languages whose default is accusative, rather than nominative. Likewise, intransitive clauses whose sole argument is marked with accusative case appear to constitute serious problems for the model without any obvious solution. When we expand our purview to how dependent case fits into the system of dependency establishment more broadly, we realize that dependent case isn't just an alternative mechanism; it is an additional one. To the extent that we wish to reduce the number of operations UG has access to, an agree based model that can capture both ϕ-agreement dependencies and case dependencies through the same mechanism is more parsimonious. And finally, the adoption of dependent case theory requires that we abandon the long held assumption that case plays a role in regulating the distribution of nominals. This separation of case from licensing requires that we propose an alternative understanding of the data case has been understood to account for over the past 40 years. The discussion in chapter 2 showed that the alternatives proposed have not yet met the high standard expected for such a central syntactic component.
In the ϕ-agreement domain, we discovered that when the obligatory operations model operates over more complicated data, serious issues with respect to the overapplication of defaults arise. Languages for which there are second cycle effects, for example, show evidence for a third outcome of agreement, in addition to success and failure. This third outcome is distinct from defaults, and the obligatory operations model is unable to model that distinction because it only differentiates between the success of an operation and the failure of one. This has the result of the grammar being satisfied with failure too early and subsequently inserting defaults where we should be getting second cycle morphology instead. Languages that have more specified probes will also cause issues for dative intervention, as the obligatory operations model doesn't have a way to ensure the visibility of underspecified nominals. On the conceptual side, it requires simultaneous immediacy and delay, which, while not impossible to reconcile, are particularly difficult in this domain. Furthermore, I made the argument that obligatory operations only makes sense if we can adopt it framework wide. Otherwise, it is unclear what benefit we gain in adopting it for such a narrow range of phenomena, especially when a framework-wide alternative is available. Because the logic of using both dependent case and obligatory operations to provide answers to the questions at the start of this thesis is the same, it is perhaps not surprising that their implications are quite similar as well. Both proposals produce defaults by relaxing the rules that encode ungrammaticality. If having an unvalued feature at spell out is problematic, then removing the offense is a way to get defaults to surface. However, while each of the approaches presented was able to get the system to produce defaults, neither was able to constrain that production in ways that prevented the overapplication of defaults – the more theoretically interesting puzzle. Because they are unable to solve the second, and more far reaching, half of the default problem – while also completely upending the system we've built over the past 25 years of work in the Minimalist program – I argue that these lifeboats, intended to save us from the default 'fire on board', veer us off course.

5.3 Putting out the Fire Allows us to Maintain the Course

Of course, jumping into the lifeboats is unavoidable if there's no way to put out the fire in the first place. I've argued in this thesis for modest modifications to the standard set of assumptions that allow us to stay in our boat and maintain the course. In chapter 4 I argued that we can account for the production of defaults, while maintaining that the failure to value features induces ungrammaticality, by extending the theory of agreement proposed in Béjar (2003). This model is largely standard with one major modification: it argues that the two operations behind agree are sensitive to the inherent hierarchical relationships between the individual features that participate. The separation of the feature-sensitive operations introduces a third outcome of agree, one that we can exploit to account for the existence of defaults. The third outcome itself is how defaults are produced, but the presence of that third outcome alongside the failure-induces-ungrammaticality outcome allows us to constrain that production in ways that the previous approaches couldn't.
I also proposed a novel system of case features, based on a number of previously noted intuitions, that interacts in interesting ways with the two agreement operations. The result is a model of case valuation that maintains case's standard role in regulating nominal licensing while also allowing for the appearance of default case forms in restricted environments. What is really attractive about this proposal is that it addresses the default issues with surprisingly few modifications to the framework. Given the far reaching problematic implications that the drastic theoretical departures discussed in chapter 2 and chapter 3 invited, this is a welcome result.

5.4 What Have We Learned

Perhaps surprisingly, we learned that syntactic defaults actually can be captured in a framework that encodes grammatical requirements through the failure to value features, if we allow the agree operations to be sensitive to the inherent hierarchical relationships that hold between features. Importantly, adopting the separation of agree into two operations reveals a third outcome of agreement, allowing us to both produce and constrain the appearance of defaults. Less surprising is the conclusion that by simply allowing features to remain unvalued, we've opened a pandora's box with respect to the framework at large. With respect to the three questions posed at the start of the thesis, we can now provide some answers:

(i) Defaults are produced when match succeeds, but value fails.

(ii) The production of defaults is constrained by the existence of this third outcome alongside the assumption that failure to agree completely still induces ungrammaticality.

(iii) Inherent hierarchical structure in feature systems allows for underspecification in the syntax with respect to which features are specified on which objects. Underspecification can also be represented through the probe modification that results from the failure to value, much like the Impoverishment operation introduced in chapter 1.

We learned that failure in the syntax does not lead to a singular set of outcomes. Rather, the grammar can use failure to trigger second applications of operations, defaults, strategies for reconciling conjunct features, and of course ungrammaticality. This wide range of outcomes suggests that unvalued features – which can serve as a sort of derivation pacer – do real theoretical work and cannot be removed as easily as one might hope. We learned that case features also have hierarchical structure, and we were able to understand the relationship between case and licensing in a new way. We've shown that the two concepts are independent, but related via entailment. This allows us to encode both their correlations and their mismatches. The extension of match/value into the domain of case showed us that the separation of agree into two operations only makes different predictions when the probes are not flat. Not only is this a welcome result, but it makes some predictions about which feature categories have access to defaults. Since defaults are the reflex of a successful match and a failed value – which can only happen when a probe is highly specified – we predict only feature categories with hierarchical organization to be capable of producing default forms. Finally, the proposal contributes to the discussion on the relationship between case and ϕ-agreement. Through this exercise, I've proposed that the two are separate dependencies, established via separate probes.
However, they are both established via the same set of operations and in a large number of instances, involve the same participants. I suggest that this is why we often see such a strong correlation between the two. Their independence, however, is what can explain why they don’t always match up. Like the relationship between abstract Case and morphological case, the relationship between case and ϕ-agreement shows they are separate, but related. As I’m sure is often true, this thesis has raised far more questions than it has answered. I hope to have shone a little light on a small piece of an important issue and feel grateful to have gotten the chance to be part of the conversation. 230 REFERENCES 231 REFERENCES Abney, S. (1987). The English noun phrase in its sentential aspect (Unpublished doctoral disser- tation). MIT, Cambridge, MA. Abondolo, D. (1982). Verb paradigm in Erza Mordvinian. Folia Slavica. Ackema, P., & Neeleman, A. In M. Reeve, L. Franco, & M. Moreno (Eds.), Non-local dependencies in the nominal and verbal domain. Oxford University Press. (2017). Default person versus default number in agreement. Adger, D. (2003). Core syntax. Oxford University Press. Adger, D., & Harbour, D. (2007). Syntax and syncretisms of the person case constraint. Syntax, 10(1), 2-37. Andrews, A. (1982). The representation of Case in modern Icelandic. In J. Bresnan (Ed.), The mental representation of grammatical relations (p. 427-503). MIT Press. Aronson, H. (1989). Georgian: A reading grammar. Columbus, OH: Slavica. Austin, J. (2012). The case-agreement hierarchy in acquisition: Evidence from children learning Basque. Lingua, 122(3), 289-302. Baerman, M., Brown, D., & Corbett, G. G. (2005). The syntax-morphology interface (Vol. 109). Cambridge University Press. Baker, M. C. (2008a). The macroparameter in a microparametric world. In T. Biberauer (Ed.), The limits of syntactic variation (p. 351-374). John Benjamins Publishing. Baker, M. C. (2008b). The syntax of agreement and concord. Cambridge University Press. Baker, M. C. (2012a). “obliqueness” as a component of argument structure in Amharic. M. C. Cuervo & Y. Roberge (Eds.), The end of argument structure? (p. 43-74). Emerald. In Baker, M. C. (2012b). On the relationship of object agreement and accusative case: Evidence from Amharic. Linguistic Inquiry, 43(2), 255-274. Baker, M. C. (2015). Case: its principles and its parameters. Cambridge University Press. Baker, M. C., Johnson, K., & Roberts, I. (1989). Passive arguments raised. Linguistic Inquiry, 20(2), 219-251. 232 Baker, M. C., & Vinokurova, N. (2010). Two modalities of case assignment: Case in Sakha. Natural Language and Linguistic Theory, 28, 593-642. Beatty, J. (1974). Mohawk morphology. University of Northern Colorado, Museum of Anthropol- ogy. Béjar, S. (2003). Phi-syntax: A theory of agreement (Unpublished doctoral dissertation). University of Toronto. Béjar, S., & Rezac, M. (2003). Person licensing and the derivation of PCC effects. In A. T. Pérez- Leroux & Y. Roberge (Eds.), Romance linguistics: Theory and acquisition (p. 49-62). John Benjamins Publishing. Béjar, S., & Rezac, M. (2009). Cyclic agree. Linguistic Inquiry, 40(1), 35-73. Bhatt, R. (2005). Long distance agreement in Hindi-Urdu. Natural Language and Linguistic Theory, 23, 757-807. Bhatt, R., & Walkow, M. (2013). Locating agreement in grammar: an argument from agreement in conjunctions. Natural Language and Linguistic Theory, 31(4), 951-1013. Bittner, M., & Hale, K. (1996). The structural determination of case and agreement. 