THREE ESSAYS IN APPLIED MICROECONOMICS
                          By
                      Zijian Qi
               A DISSERTATION
                   Submitted to
           Michigan State University
    in partial fulfillment of the requirements
                 for the degree of
       Economics – Doctor of Philosophy
                        2022


                                        ABSTRACT
This dissertation has three chapters, each concentrating on a distinct aspect of information
asymmetry. Each chapter approaches information asymmetry from a unique perspective:
the first chapter explores a scenario with both hidden information and hidden action. The
second chapter discusses another type of information asymmetry related to population un-
certainty. Finally, the third chapter focuses on a natural result of information asymmetry
— discrimination.
    Chapter one explores a situation in which managers rely on their subordinates for local
information that aids decision-making but cannot commit to a decision rule. When the firm
and the workers have conflicting interests on how such information gets used, incentives for
effort and information elicitation become intertwined. We explore how one may solve this
incentive problem through job design—the choice between “individual assignment” where all
tasks in a given job are assigned to the same worker, and “team assignment” where the tasks
are split among a group. Team assignment facilitates information elicitation but suffers from
“diseconomies of scope” in incentive provision. This trade-off drives the optimal job design,
and it is shaped by two key parameters — the workers’ ex-ante likelihood of being informed
and the noise in the performance measure that is used to reward the worker. Individual
assignment is optimal when the performance measure is well-aligned, but team assignment is
optimal when the measure is noisy and the workers are highly likely to be informed about the
local conditions.


    In chapter two, I study a contest with population uncertainty in which the value of the
prize depends on the number of participants. There is friction between a contestant’s per-
spective and an outsider’s perspective regarding the number of contestants. This discrepancy
drives the main result: under the assumption that the expected value of the prize is the same
across all environments, if the value of the prize increases in the number of players, the play-
ers exert more effort; whereas, if the value of the prize declines in the number of players, the
players exert less effort.
    In the third chapter, I focus on discrimination as a consequence of information asymmetry.
I construct a two-stage assimilation model to analyze the discrimination level in groups with
different discount factors. I have three main results: First, there always exists an equilibrium
for any discount factors and minority group size; the equilibrium will have an on-path action
profile with a cutoff rule; second, as group size increases, both discrimination level and the
ability cutoff will increase; third, when discount factors vary across different regimes, the
effect is not monotonic.


To MY PARENTS


                                 ACKNOWLEDGMENTS
I am overcome with appreciation and humility to express my gratitude to everyone who has
helped me finish this dissertation. First, I would like to thank Dr. Jon Eguia, my graduate
adviser, for his assistance throughout my studies. I would not be the same man I am today
without his guidance, advice, and support. In addition, I thank Dr. Arijit Mukherjee for
his patience with my adaptation, his innovative learning-by-doing practice, and his service
on my committee. I would also like to express my gratitude to Dr. Jay Pil Choi and Dr.
Julian Guo, my committee members, for their guidance and inspiration. Their constructive
feedback and comments significantly improved the quality of this dissertation.
    I am grateful to Dr. Luís Vasconcelos, who inspired me with his perseverance and determination
during our collaboration. My gratitude also extends to the whole academic faculty
and administrative staff of the Department of Economics at Michigan State University. I
want to thank all of my friends at Michigan State University for their unwavering support
and friendship. Thank you so much for the wonderful time we shared.
    I acknowledge Dr. Yiran Fan. We have known each other for 25 years and have witnessed
the development of each other. Yiran was always devoted and tenacious in making the world
a better place, and his true and pure nature never ceases to inspire me. His philosophical
insights and words of wisdom will enlighten me for the rest of my life.
    My family has always been my greatest source of encouragement and motivation. My
parents have done so much for me, from instilling reasoning and skepticism to caring for me
when I was injured. They have supplied me with all the capabilities and resources I need to
succeed for many years, and I will be eternally thankful for their love and support.


                                   TABLE OF CONTENTS
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      1
CHAPTER 1 OPTIMAL JOB DESIGN AND INFORMATION ELICITATION . . . . .                                                        7
   1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   7
   1.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
   1.3 A Public Information Benchmark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                22
   1.4 Optimal Contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      25
       1.4.1 Optimal Contract under Individual Assignment . . . . . . . . . . . . . . . . . .                            26
       1.4.2 Optimal Contract under Team Assignment . . . . . . . . . . . . . . . . . . . . .                            31
   1.5 Optimal Job Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .        37
   1.6 Discussion and Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .           43
       1.6.1 Imprecise Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .         44
       1.6.2 Exclusive Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .         46
       1.6.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .        48
CHAPTER 2 CONTESTS WITH VALUATION ASSOCIATED WITH POPULATION
             UNCERTAINTY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .             50
   2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  50
   2.2 Model Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     56
   2.3 Analysis and Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       59
       2.3.1 Value of Prize Is Monotonic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .               59
       2.3.2 Value of Prize Is Linear . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .            61
   2.4 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .         62
   2.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  65
       2.5.1 Promoting Effort Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .               65
       2.5.2 Design Competition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .            67
   2.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  69
CHAPTER 3 ASSIMILATION WITH DIFFERENT WORKING SKILL
             ACQUISITION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .           71
   3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  71
   3.2 Model Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     75
       3.2.1 Players . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     75
       3.2.2 Lifetime of the Agent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .             75
       3.2.3 Choice of Social Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .              76
       3.2.4 The Cost of Assimilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .              77
       3.2.5 Network Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .          77
       3.2.6 Timing of the Game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .              77
       3.2.7 Utility Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .        78
   3.3 Solution to the Game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       79
       3.3.1 Choice of Assimilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .           79
       3.3.2 Choice of Discrimination Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                 81
       3.3.3 Choice of Working Skill . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .            82
   3.4 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .        82
   3.5 Comparative Statics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       84
       3.5.1 Group Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       85
       3.5.2 Different Discount Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .             87
   3.6 Testable Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   89
       3.6.1 Group Size Change . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .            89
       3.6.2 Discount Factor Change . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .               90
   3.7 Further Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     90
   3.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
APPENDICES . . . . . . . . . . . . . . . . .  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
   Appendix A: Proofs for Chapter 1           . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
   Appendix B: Proofs for Chapter 2           . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .124
   Appendix C: Proofs for Chapter 3           . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .131
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .142


                                  LIST OF FIGURES
Figure 1.1 Optimal job design as a function of α and µ . . . . . . . . . . . . . . . . . . . . . . . . 39
Figure 1.2 Optimal job design with imprecise signals . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Figure 1.3 Optimal job design under mutually exclusive signals . . . . . . . . . . . . . . . . . . 48
Figure 3.1 The range of ability cutoff in different group sizes (m). . . . . . . . . . . . . . . . . 84
Figure 3.2 The range of ability cutoff in different discount factors (βI ). . . . . . . . . . . . . 85
Figure 3.3 The range of ability cutoff in different discount factors (βA ). . . . . . . . . . . . . 86
Figure 3.4 The range of discrimination in different group sizes (m). . . . . . . . . . . . . . . . 87
Figure 3.5 The range of discrimination in different discount factors (βI ). . . . . . . . . . . . 88
Figure 3.6 The range of discrimination in different discount factors (βA ). . . . . . . . . . . . 89


                                     INTRODUCTION
Information asymmetry shatters the hall of welfare economics erected by neoclassical eco-
nomic models. In a world with complete information (together with other assumptions),
the first and second fundamental theorems of welfare economics ensure a well-behaved economy,
and Adam Smith’s “invisible hand” property is satisfied in all markets. In such a utopian
society, government regulation is straightforward since there is no way to make Pareto im-
provements, leaving transfers as the only option for the government. However, reality no
longer resembles such a beautiful world when information asymmetry occurs. Information
asymmetry has been studied extensively over the last century, and it is still an active topic
in economics.
    Two traditional topics in asymmetric information are well studied: adverse selection and
moral hazard. Starting from the seminal work by Akerlof (1970) and Spence (1973), a large
amount of literature has explored the topic of adverse selection. Adverse selection occurs
when one party has hidden information. In other words, a player’s type is private information.
In some cases, the social planner could restore market efficiency through signaling
or screening, but it cannot be guaranteed. In many scenarios, the social planner can only
achieve the second-best. On the other hand, a moral hazard problem occurs when there is
hidden action. Grossman and Hart (1983) devised a principal-agent model, which can be
used to explain the moral hazard problem. In some scenarios, a carefully crafted contract
could alleviate the moral hazard issue, although the first best cannot always be achieved. In
the chapter “Optimal Job Design and Information Elicitation”, we look at a situation where
there is both hidden information and hidden action.
    Levitt and Snyder (1997) are among the first to analyze the interaction between hidden
information (information elicitation) and hidden action (effort provision). The incentive
problem for hidden information or hidden action is straightforward, but things become more
complicated when the incentives become entwined. As a result, when the agents may be
privately informed about projects’ viability, the optimal contract is driven by the trade-off
between efficiency in decision making and effort incentives. In order to induce both effort and
truthful reporting of “bad news” (i.e., information that lowers the likelihood of the project’s
success), the optimal contract calls for an inefficiently “lenient” continuation policy where
some projects with negative expected value are allowed to continue.
    Our model examines a similar scenario in which incentives for effort provision and infor-
mation elicitation are interwoven, but it focuses on a fundamental aspect of the organization
structure — job design. Job design refers to how to group different tasks into jobs that
may be assigned to workers. “Individual assignment” and “team assignment” are the two
most natural forms of job design. Under “individual assignment”, all tasks associated with
a specific production process are assigned to the same worker who is exclusively responsible
for his job output. Alternatively, under “team assignment”, different tasks in the production
process are assigned to different workers who are jointly responsible for their work perfor-
mance.
    Our analysis highlights that team assignment facilitates information elicitation, whereas
individual assignment facilitates the provision of effort incentives. Under individual assign-
ment, an agent can fully control the project’s outcome, as he can control the effort levels and
the information of the project. Team assignment helps mitigate such incentive problems as
the agent does not fully control the outcome of a project. For example, his attempt to con-
ceal information may falter if his teammate happens to provide the same information to the
principal, and he influences the effort only in a part of the project. Thus, team assignment
facilitates information elicitation. However, team assignment suffers from “diseconomies of
scope” because the principal must reward agents separately for distinct tasks to induce effort
on a single project. Such diseconomies of scope might stifle motivation. On the other hand,
individual assignment simply requires the principal to pay one reward for a single project.
Thus, individual assignment facilitates effort provision.
    Finally, we show that the optimal job design is driven by two salient informational fric-
tions: the “availability” of agents’ information and the “noise” in the agents’ performance
measure. Team assignment is strictly optimal when the agents are highly likely to observe
the state, but there is a significant misalignment between the performance measure and the
project output. In contrast, when the extent of misalignment is relatively small, individual
assignment is strictly optimal regardless of the agents’ likelihood of being informed about
the state.
    The scenario in which both hidden action and hidden information are present is the
subject of the preceding analysis. In the chapter “Contests with Valuation Associated with
Population Uncertainty”, I explore information asymmetry from another perspective. In a
standard contest, a fixed number of players exert effort to compete for a prize. If the players
have different potential types and the type is private knowledge, the contest becomes one
with hidden information. I investigate a different variant in which the number of players is
random.
    A contest with population uncertainty appears to be very similar to a contest with different
types. To be more specific, a player in the former contest is uncertain about how
many other players there are, and a player in the latter contest is uncertain about who the other
players are. A naïve intuition would lead one to believe that these two contests share the
same feature of incomplete information in a general framework and that the solutions are
fairly similar. This is incorrect since there is a significant distinction between “how many”
and “who” in the contest setup. Once a player has entered the contest, the player’s belief is
updated, as “I am in the contest” already contains some information. As a result, a player’s
belief differs from an outsider’s (game theorist’s) without any further information. In the
contest with different types, a player’s belief of other players is a non-degenerate distribu-
tion and thus represents the uncertainty of “who”; whereas in the contest with population
uncertainty, the uncertainty of “how many” contains not only a non-degenerate distribution
as a belief but also the discrepancy between a player’s belief and an outsider’s belief.
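    The gap between the two perspectives can be illustrated with a small numerical sketch. It rests on an assumption that is not spelled out in this introduction: a given individual is n times as likely to find himself in the contest when there are n players, so a participant's belief is proportional to n times the outsider's probability of n. The function names and the two-point distribution below are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def insider_belief(outsider):
    """Participant's belief about the total number of players.

    Assumption (illustrative, not taken from the text): the belief is
    proportional to n * p(n), where p is the outsider's distribution.
    """
    total = sum(n * p for n, p in outsider.items())
    return {n: n * p / total for n, p in outsider.items()}


def mean(dist):
    """Expected number of players under a distribution {n: probability}."""
    return sum(n * p for n, p in dist.items())


# Hypothetical outsider's view: 1 or 3 players, each with probability 1/2.
outsider = {1: Fraction(1, 2), 3: Fraction(1, 2)}
insider = insider_belief(outsider)

print(mean(outsider))  # 2
print(insider[1], insider[3])  # 1/4 3/4
print(mean(insider))  # 5/2
```

    Under this assumption, a participant places more weight on large contests than an outsider does, even though neither has received any further information.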
    Contests with population uncertainty were first studied by Myerson and Wärneryd (2006).
In the paper, they set up a contest where the number of players is stochastic, and the value
of the prize is fixed. They show that the total equilibrium expenditure is strictly lower in a
contest with population uncertainty than in a contest without population uncertainty, even
though the expected number of players is the same in both contests.
    I extend the model such that the value of the prize could be dependent on the number
of players. I then consider the following three scenarios:
  (a) The value of the prize is constant;
  (b) The value of the prize is increasing in the number of players;
  (c) The value of the prize decreases in the number of players.
Under the assumption that the expected value of the prize is the same in (a), (b), and (c),
I find that, relative to the constant-prize case (a), the effort level is higher if the value of
the prize is increasing in the number of players and lower if the value of the prize is
decreasing in the number of players. I also consider the following scenarios:
  (d) the number of players is a constant, and the value of the prize is also a constant;
  (e) the number of players is random, and the value of the prize is linear in the number of
      players with zero intercept.
When the expected number of players and the expected value of the prize are the same in (d)
and (e), I find the effort level is the same under (d) and (e).
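    The direction of these comparisons can be checked in a minimal sketch. The sketch makes several assumptions that are not stated in this introduction and may differ from the model in the chapter: a lottery contest in which a player wins with probability equal to his share of total effort, a linear effort cost, participant beliefs proportional to n times the outsider's probability of n, and expected prize values computed from the outsider's perspective. Under these assumptions, the first-order condition at a symmetric profile gives per-player effort equal to (1/E[n]) times the sum over n of p(n) V(n) (n-1)/n. All names and numbers below are hypothetical.

```python
from fractions import Fraction


def equilibrium_effort(outsider, prize):
    """Symmetric per-player effort under the assumptions in the lead-in.

    outsider: {n: probability} from the outsider's perspective.
    prize:    function n -> value of the prize when n players compete.
    Formula (assumed lottery contest, size-biased participant beliefs):
        effort = (1 / E[n]) * sum_n p(n) * V(n) * (n - 1) / n
    """
    mean_n = sum(n * p for n, p in outsider.items())
    weighted = sum(p * prize(n) * Fraction(n - 1, n) for n, p in outsider.items())
    return weighted / mean_n


def expected_prize(outsider, prize):
    """Expected prize value from the outsider's perspective."""
    return sum(p * prize(n) for n, p in outsider.items())


random_n = {2: Fraction(1, 2), 4: Fraction(1, 2)}  # E[n] = 3
fixed_n = {3: Fraction(1)}                         # n = 3 for sure


def constant(n):      # scenario (a), and the prize in (d)
    return Fraction(3)


def increasing(n):    # scenarios (b) and (e): linear with zero intercept
    return Fraction(n)


def decreasing(n):    # scenario (c)
    return Fraction(6 - n)


# All three prize schedules have expected value 3 under random_n.
print(equilibrium_effort(random_n, decreasing))  # 7/12
print(equilibrium_effort(random_n, constant))    # 5/8
print(equilibrium_effort(random_n, increasing))  # 2/3
print(equilibrium_effort(fixed_n, constant))     # 2/3
```

    In this example the ordering of effort across (a), (b), and (c) matches the stated result, and the effort under the linear prize with random participation coincides with the effort in the fixed contest (d).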
    The preceding chapters look at several types of information asymmetry. Then, in the
chapter “Assimilation with Different Working Skill Acquisition”, I study the potential out-
come of information asymmetry in a real-world context. People are divided into separate
groups, as they have diverse cultures. In addition, various groups may exhibit different traits
due to information asymmetry. These traits could include how they discount the future, how
they expose themselves to knowledge, and how they collaborate. The cultural barrier creates
a chasm, and communication comes at a cost in the form of discrimination. I examine how
the level of discrimination varies in different circumstances.
    Eguia (2017) is one of the pioneering works on this topic. The paper presents a two-
stage assimilation model: in the first stage, the majority group sets a level of discrimination,
which is the barrier that people from the minority group must overcome if they want to
assimilate; in the second stage, people from both the majority and minority groups choose
their skill level, and people from the minority group can decide whether or not to assimilate.
It concludes that the majority group utilizes discrimination as a screening technique and
that only highly skilled minority members will assimilate. This screening equilibrium is
optimal for the majority group because the persons who assimilated into the majority group
are highly skilled and will generate positive peer effects. This paper restricts attention to
circumstances where a minority group is at a disadvantage relative to the majority.
    I investigate a situation in which a minority group has an advantage over the majority
group. The minority has a higher discount factor and places a higher value on the future. I
provide another two-stage assimilation model: in the first stage, people spend time learning
working skills; in the second stage, the majority group establishes a level of discrimination,
and minority group members can choose whether or not to assimilate. I show that an
equilibrium exists for all discount factors and minority group sizes, and the equilibrium will
have an on-path action profile with a cutoff rule. Also, when group size increases, both the
discrimination level and the ability cutoff increase, but the effect is not monotonic when
discount factors vary across regimes.
    This dissertation first investigates a scenario where there is hidden information and hid-
den action, then analyses the gap between beliefs in contests with population uncertainty,
and lastly uses an assimilation model to study the potential outcome of information asym-
metry.


                                            CHAPTER 1
          OPTIMAL JOB DESIGN AND INFORMATION ELICITATION∗
 1.1 Introduction
Managerial decision-making in a hierarchical organization often relies on local information
that cannot be directly accessed by headquarters but may be available to its lower-ranked
employees. A host of key business decisions, such as launching new product lines,
undertaking new business ventures, and investing in new R&D initiatives, all require detailed
information on customer preferences, profitability prospects, and technological capabilities
that is more likely to be available to the junior workers who are more familiar with the local
market conditions and the firm’s production process. Effective decision-making, therefore,
calls for timely provision of information that may be dispersed within an organization.
     However, the firm and the workers may have conflicting interests on how information may
be used, and when relaying local information to their manager, the workers may manipulate
information to steer the firm’s decision towards their own interests. A worker may deem
his information “unfavorable” if the firm’s expected action under such information could
reduce the worker’s future rents. Consequently, he may attempt to filter or conceal such
information, particularly when the firm cannot commit to how the information may be used
in its decision process. Such conflict of interest creates a complex incentive problem as the
   ∗ Disclaimer: This chapter was co-authored with Arijit Mukherjee (arijit@msu.edu) and Luís Vasconcelos
(Luis.Vasconcelos@uts.edu.au). Both authors have approved that this work be included as a chapter in my
dissertation.
incentives for effort and information elicitation get intricately entwined (Athey and Roberts,
2001).
    Starting from the seminal work by Marschak (1955) and Marschak and Radner (1972)
on team theory, a large literature has explored the limits on information provision in an or-
ganization and how these limits are influenced by the organization’s structure (Aoki, 1986).
However, this literature typically abstracts away from the problem of incentives as the em-
ployees’ objective is assumed to be perfectly aligned with that of the employer. The goal of
our paper is to explore how the problem of intertwined incentives for effort and information
elicitation shapes a critical part of the organizational structure, namely, job design.
    An essential problem in organizational design is how to group different tasks into jobs that
may be assigned to the workers. An organization may typically choose between two natural
designs: it may opt for “individual assignment” where all tasks associated with a specific
production process are assigned to the same worker who remains solely accountable for
his job output. Alternatively, it may choose “team assignment” where different tasks of the
production process are assigned to different workers who are held jointly accountable for their
job performance. When decision-relevant information is accessible only to the workers who
are directly involved in the production process, the two job designs have distinct implications
on how information may be dispersed within the organization. Under individual assignment,
all information pertaining to a production process can be observed only by the worker who
has been assigned to it, whereas multiple workers may access this information when they are
working as a team.
    The broad prevalence of individual and team assignments in project management struc-
tures has been well-documented in the management literature (Galbraith, 1971; Larson and
Gobeli, 1989; Hobday, 2000; Lechler and Dvir, 2010). Firms often adopt a “project-based”
structure where a manager is assigned to oversee all aspects of a project, or a “functional”
structure where projects are divided into segments and different segments are overseen by
different managers. In exploring the relative merits of the two structures, this literature
mostly focuses on the gains from task specialization vis-a-vis task coordination. We high-
light that when the workers need to be incentivized for both effort provision and information
elicitation, the choice between these two designs is shaped by a novel trade-off: Team assign-
ment may facilitate information elicitation as no worker can fully control the information or
project outcome, but it suffers from “diseconomies of scope” in incentive provision and may
undermine workers’ effort.
    We explore this trade-off in a stylized model of job design in a principal-agent environ-
ment. In our setup a principal hires two agents to work on two projects. Each project has two
tasks, and a project can either succeed or fail. The likelihood of success depends on the level
of effort exerted in its tasks and the underlying “state of the world” that may be observable
only to the agent(s) who are assigned to that project. At the beginning of the game, the
principal chooses a job design: under individual assignment, each agent is responsible for a
given project and is expected to exert effort in both tasks that are associated with it. Under
team assignment, an agent is assigned exactly one task from each of the two projects. While performing
a task, an agent may observe the state of the world (pertaining to its associated project)
with some probability and reports it to the principal. While the agent cannot misrepresent
the state (i.e., observation of the state is “hard” information), he may conceal it by feigning
ignorance. Upon receiving the agents’ report on the state, the principal decides whether to
continue or cancel a project. The project output is not verifiable, but the agents’ effort in a
project is reflected by a contractible but noisy performance measure.
    Incentives are provided through a wage contract that ties an agent’s pay to the princi-
pal’s cancellation decision and the realization of the performance measures (if the project is
continued). The misalignment between the performance measure and the project outcome
gives rise to a conflict of interest between the principal and the agents. If the observed state
does not bode well for the project’s success but is unlikely to affect the performance measure
(if the project is implemented), the agent may conceal his information to let the project
proceed whereas the principal would have been better off by canceling it.
    We show that the optimal job design is driven by two salient informational frictions:
the “availability” of agents’ information (i.e., the likelihood that the agent gets to observe
the state while performing his assigned tasks) and the “noise” in the agents’ performance
measure (i.e., the extent of misalignment between the measure and the output). Team
assignment is strictly optimal when the agents are highly likely to observe the state but
there is significant misalignment between the performance measure and the project output.
In contrast, when the extent of misalignment is relatively small, individual assignment is
strictly optimal regardless of the agents’ likelihood of being informed about the state.
    The intuition for this result can be gleaned from the aforementioned trade-off between
information elicitation and “diseconomies of scope” in incentive provision. Since the principal
relies on the agents’ report but cannot commit to how this information may be used, the
agent may attempt to control the projects’ outcome by manipulating his information and
effort levels. Team assignment helps in mitigating such incentive problem as the agent does
not fully control the outcome of a project. His attempt to conceal information may falter if
his teammate happens to provide the same information to the principal, and he influences
the effort only in a part of the project.
    But incentive provision under team assignment suffers from “diseconomies of scope”:
the principal needs to reward the two agents separately to induce effort on the two tasks
that are associated with the project. And such diseconomies of scope can blunt incentives.
As the principal cannot commit to how she may use the agents’ reports, in equilibrium her
continuation policy must be sequentially rational. If the principal proceeds with the project
given certain information, it must be that her expected payoff from proceeding with the
project (conditional on the agents’ reports) is larger than what she might get from canceling
it. These requirements put an upper bound on the amount of reward the principal pays to
the agents when the project is successful (as per the performance measure). Under team assignment,
the total reward payout is larger and such bounds are harder to meet as the principal needs
to pay the reward for success twice (paying each of the two agents separately) to elicit effort
in both tasks. Consequently, strong incentives may be infeasible.
    In contrast, under individual assignment, a single reward payment induces
effort in both tasks, and such economies of scope in incentive provision make it easier to
provide strong incentives without violating the bounds on reward payments. However, under
individual assignment, information elicitation becomes harder as the agent fully controls the
outcome of a project through his report and effort.
    Thus, between the two forms of job design, team assignment facilitates information elic-
itation whereas individual assignment facilitates the provision of effort incentives.
    When the performance measure is considerably misaligned and can indicate success even
when the project fails, the agents have strong incentives to conceal unfavorable information
to let the project continue. This is when the team’s advantage in information elicitation is
most useful: an agent’s attempt to conceal information could be undone by his teammate,
particularly when his teammate is very likely to have the same information. Due to such
misalignment, effort is also more sensitive to rewards, and strong incentives may be feasible
despite the scope diseconomies that arise under team assignment. As a result, team assignment becomes
optimal. In contrast, if the performance measure is relatively well-aligned with the project
output, information elicitation is relatively easy as the agent has little to gain from concealing
information from the principal. Thus, individual assignment becomes optimal—it allows the
principal to exploit the economies of scope in incentive provision and offer strong incentives
for effort without distorting the agent’s reporting incentives.
    Related literature: Our paper contributes to a growing literature on the interplay be-
tween incentives and communication of dispersed information within an organization. As
mentioned earlier, the literature on team theory that followed from Marschak and Radner
(1972) explores managerial decision-making when there are physical constraints on the
flow of information (and the headquarters’ ability to process it) but typically assumes that
the workers are non-strategic in their communication (see, e.g., Cremer, 1980; Aoki, 1986;
Geanakoplos and Milgrom, 1991; Bolton and Dewatripont, 1994). Several authors have
subsequently analyzed strategic communication by privately informed workers and how it
shapes the allocation of decision rights within organizations (Dessein, 2002; Alonso, Dessein,
and Matouschek, 2008; Rantakari, 2008). These papers focus on the tradeoff between the
production efficiencies from coordination of actions and adaptation to local information but
abstract away from the incentive problems in effort provision.
    Levitt and Snyder (1997) is one of the first papers to analyze the interaction between the
incentives for effort and truthful communication. They highlight a tradeoff between efficiency
in decision making and effort incentives when the agents may be privately informed about
their projects’ viability. In order to induce both effort and truthful reporting of “bad news”
(i.e., information that lowers the likelihood of the project’s success), the optimal contract
calls for an inefficiently “lenient” continuation policy where some projects with negative
expected value are allowed to continue. But in their model the organizational structure
is exogenously given; in contrast, we analyze how the interaction between the effort and
reporting incentives drives the allocation of tasks within the organization.
    Our paper complements the works by Athey and Roberts (2001), Friebel and Raith
(2010), and Dessein, Garicano, and Gertner (2010), who explore organizational forms in the
presence of the tradeoff between incentives for effort, communication, and efficient decision
making. Athey and Roberts show that the tradeoff between effort incentives and efficient
decision making can be mitigated by creating an organizational hierarchy by hiring a top-
level manager who can obtain all information at a cost and coordinates the actions of her
subordinates. However, they assume exogenous task allocation and do not allow for com-
munication between agents. Strategic communication across organizational hierarchy plays
a key role in Friebel and Raith (2010). They analyze the optimal firm structure for allo-
cation of resources across its different divisions where the divisional managers are privately
informed about the best use of such resources. The firm can integrate the units under a CEO
with authority on resource allocation for more efficient allocation of resources but must elicit
truthful reporting from the divisional managers. The optimality of such integration decision
is driven by a tradeoff between the benefit of more efficient resource allocation and the cost
of a distortion in the effort incentives that may be necessary for information elicitation. A
similar integration issue is studied by Dessein, Garicano, and Gertner (2010) where a firm
decides on whether to organize into business units (i.e., divisions with considerable autonomy)
or create functional units that centralize certain tasks for all divisions. The functional
unit manager can implement standardization to capture synergy benefits but inflicts a cost
on business unit managers by impeding adaptation to local information. The organization
responds to this tradeoff by creating an incentive conflict between the business and func-
tional unit managers and it drives the optimal allocation of authority and tasks within the
organization. However, none of these papers explore the role of job design in incentivizing
truthful communication within organizations, which is a key focus of our analysis.
    This article also relates to a few other strands in the organizational economics literature.
There is a vast literature on incentives in teams (Groves, 1973; Holmström, 1982; Mookherjee,
1984; McAfee and McMillan, 1991; Che and Yoo, 2001; Marino and Zábojnı́k, 2004; Kvaløy
and Olsen, 2006; Rayo, 2007; Blanes i Vidal and Möller, 2016; Friebel, Heinz, Krueger, and
Zubanov, 2017) that takes the team structure as given and analyzes how the underlying
production and information environment drive the optimal provision of effort incentives. A
notable exception is Gromb and Martimort (2007) who consider a setup where the decision-
maker relies on experts to gather and report multiple signals on a risky project’s profitability.
They analyze a case where the decision-maker can either ask a single expert to acquire all
signals or employ multiple experts where each one is responsible for acquiring exactly one
signal. While this setup bears some resemblance to our job design problem, Gromb and
Martimort’s model differs from ours along various key dimensions. In particular, in their
setup the agents’ effort is useful for information acquisition but not for the project’s value,
experts have “soft information” (hence, can lie in their report), and the focus of their analysis
is on the optimal incentives for such delegated expertise when the contracting parties may
collude among themselves.
    Job design has also been explored by several scholars, primarily as a possible remedy for
the multitasking problem (Holmström and Milgrom, 1991; Dewatripont, Jewitt, and Tirole,
2000; Besanko, Régibeau, and Rockett, 2005; Corts, 2007; Schöttner, 2008; Mukherjee and
Vasconcelos, 2011; Ishihara, 2017, 2020). In contrast, we abstract away from the multitasking
problem; in our setting the conflict of incentives for effort and information elicitation is the
key driver of the optimal job design. Finally, our work is reminiscent of the literature on
authority and delegation where the contracting parties may have misaligned preferences over
the managerial actions (Aghion and Tirole, 1997; Dessein, 2002; Alonso and Matouschek,
2008; Alonso, Dessein, and Matouschek, 2008; Deimen and Szalay, 2019). In this literature,
the misalignment is assumed to stem from exogenous bias in the agents’ preferences that may
distort the communication within the organization. However, in our setup the agents’ possible
gains from information manipulation arise endogenously due to the moral hazard problem
in the agent’s effort provision and the firm’s lack of commitment power over its continuation
policy.
    This paper is structured as follows. Section 1.2 presents our model. A benchmark case
with a public signal is analyzed in Section 1.3. The optimal contracts under individual and
team assignment are characterized in Section 1.4. In Section 1.5 we present our main result
on the optimal job design and explore its comparative statics. A final section, Section 1.6,
discusses a few extensions of our model and presents a conclusion. All proofs are given in
the Appendix.
 1.2 Model
Players: A principal P (she) hires two agents (he), A1 and A2, to work on two risky
projects, A and B, and concurrently gather information on the projects’ financial viability.
Below we index the agents by i ∈ {1, 2} and the projects by j ∈ {A, B}.
Technology: The production technology is reminiscent of the canonical setup of Dewa-
tripont et al. (2000). Each project j ∈ {A, B} consists of two tasks: Tj1 and Tj2 . To fix ideas,
one may consider a firm exploring the launch of a new product, and a successful launch
requires effort on product development and marketing. For notational clarity, we may refer
to task Tjk simply as task k, k ∈ {1, 2}.
    Each agent can perform at most two tasks. At the beginning of the game, the principal
commits to a task allocation or “job design”. The principal can choose one of two options: (i)
“individual assignment”, where each worker is assigned to a different project, and he works
on the two tasks that are associated with his project, and (ii) “team assignment”, where each
worker performs exactly one task from each of the two projects. Without loss of generality,
we assume that under individual assignment, agent A1 works on project A (and performs
tasks {TA1 , TA2 }), and agent A2 works on project B (and performs tasks {TB1 , TB2 }); whereas
under team assignment, A1 performs the first task in both projects, {TA1 , TB1 }, and A2 performs
the second task, {TA2 , TB2 }.
    Let ejk ∈ [0, 1/2] denote the effort exerted in task Tjk (i.e., task k ∈ {1, 2} of project
j ∈ {A, B}). Effort is private, and it costs the agent (who has been assigned to this task)
$c(e_{jk}) = e_{jk}^2/2$.
    The outcome of project j, Yj ∈ {0, y}, can be either a “success” (Yj = y) or a “failure”
(Yj = 0). The project’s outcome depends on the effort exerted in each of its two tasks and
on its underlying “state of the world”, ωj ∈ {G, B} that can either be “good” (ωj = G) or
“bad” (ωj = B). The production function is given as (denote ej := (ej1 , ej2 )):
\[
\Pr(Y_j = y \mid e_j; \omega_j) =
\begin{cases}
e_{j1} + e_{j2} & \text{if } \omega_j = G, \\
0 & \text{if } \omega_j = B.
\end{cases}
\]
In a “bad” state, the project always fails regardless of the agents’ effort, and yields Yj = 0.
In a “good” state failure can be averted as Yj ∈ {0, y}, and effort is productive as it increases
the chance of obtaining a high output of Yj = y.
    The project outcome is not verifiable, but the agent’s performance is reflected by a metric
Mj ∈ {0, 1} that can be verified. However, the metric Mj is a noisy measure of the project
outcome as:
\[
\Pr(M_j = 1 \mid e_j; \omega_j) =
\begin{cases}
e_{j1} + e_{j2} & \text{if } \omega_j = G, \\
\mu\,(e_{j1} + e_{j2}) & \text{if } \omega_j = B,
\end{cases}
\]
and µ ∈ [0, 1). In the context of the product launch example, one may consider Yj to be the
product’s long-term value to the firm whereas Mj is a measure of the product’s profitability
in the short run. The extent of misalignment between the metric and the project output is
reflected by the parameter µ; for µ = 0 the distributions of Yj and Mj are identical, but for
µ > 0, the metric may reflect a “success” (Mj = 1) even in a bad state when the project
fails with certainty. And at the extreme, when µ → 1, the metric no longer depends on the
underlying state.
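As an illustration of the two technologies above, the following Python sketch evaluates the
success probabilities of the output Yj and the metric Mj for a given effort pair. The function
names and the example parameter values are hypothetical and are not part of the model; the
sketch only restates the two displayed formulas under the common prior Pr(ωj = G) = 1/2.

```python
def prob_output_success(e1: float, e2: float, state: str) -> float:
    """Pr(Y_j = y | e_j; omega_j): effort is productive only in the good state."""
    return e1 + e2 if state == "G" else 0.0


def prob_metric_success(e1: float, e2: float, state: str, mu: float) -> float:
    """Pr(M_j = 1 | e_j; omega_j): in the bad state the metric still reports
    success with probability mu * (e1 + e2)."""
    total_effort = e1 + e2
    return total_effort if state == "G" else mu * total_effort


def ex_ante_gap(e1: float, e2: float, mu: float, prior_good: float = 0.5) -> float:
    """Pr(M_j = 1) - Pr(Y_j = y) under the prior: the extra mass of
    'successes' that the metric reports relative to the true output."""
    p_metric = (prior_good * prob_metric_success(e1, e2, "G", mu)
                + (1 - prior_good) * prob_metric_success(e1, e2, "B", mu))
    p_output = (prior_good * prob_output_success(e1, e2, "G")
                + (1 - prior_good) * prob_output_success(e1, e2, "B"))
    return p_metric - p_output


# Hypothetical values: e_j1 = e_j2 = 0.25 and mu = 0.4.
print(prob_metric_success(0.25, 0.25, "B", 0.4))
print(ex_ante_gap(0.25, 0.25, 0.4))
```

The gap is zero when µ = 0 and grows linearly in µ, which is one way to read µ as the extent
of misalignment between the metric and the output.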
Information structure: At the beginning of the production process, the underlying
state of a project, ωj , is unknown to all players but players hold a common prior belief
given as $\Pr(\omega_j = G) = \tfrac{1}{2}$, where ωA and ωB are statistically independent. But an agent,
upon completing an assigned task Tjk , privately observes the state ωj with probability
α ∈ [0, 1). Thus, under individual assignment, the agent assigned to project j ∈ {A, B}
learns the underlying state ωj with probability $1 - (1 - \alpha)^2$. And, under team assignment,
the probability that at least one of the two agents assigned to project j learns the state ωj is also
$1 - (1 - \alpha)^2$. Denote Ai ’s observation on the state ωj as xji ∈ {G, B, ∅}, where xji = ∅ if Ai
does not observe ωj .
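A quick numerical check of the two probabilities above may be useful. The sketch assumes, as
the expression 1 − (1 − α)^2 implies, that observations are drawn independently across the two
tasks of a project; the helper names and the value of α are illustrative.

```python
def prob_state_learned(alpha: float, n_tasks: int = 2) -> float:
    """Probability that the state of project j is observed at least once
    across n_tasks independent draws, each succeeding with probability alpha."""
    return 1.0 - (1.0 - alpha) ** n_tasks


def prob_both_teammates_informed(alpha: float) -> float:
    """Under team assignment, probability that both agents observe omega_j
    (one independent draw per agent)."""
    return alpha ** 2


alpha = 0.5  # hypothetical value
print(prob_state_learned(alpha))            # same under individual and team assignment
print(prob_both_teammates_informed(alpha))  # relevant only under team assignment
```

The overall chance that the state is learned is the same under both designs; what differs is
whether the information sits with one agent or may be held by two.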
Reporting: The agents simultaneously report their information on the underlying states to
the principal. The observation on the state is “hard information”: an agent cannot misreport
the state but can hide his observation by feigning ignorance. Under individual assignment,
denote Ai ’s report as ri ∈ {G, B, ∅}, where ri = ∅ when the agent claims to have failed
to observe the state associated with his project. And under team assignment, Ai reports
$r_i = (r_i^A, r_i^B)$, where $r_i^j \in \{G, B, \emptyset\}$ is the report on state ωj , j ∈ {A, B}. With a slight abuse
of notation, we denote the collective report of the two agents on state ωj as rj ∈ {G, B, ∅}
(i.e., the information on ωj that the principal obtains from the two reports).
    Given the agents’ reports, the principal decides whether to implement a project or to
cancel it. The project outcome Yj and the associated performance measure Mj are realized
only if the project is implemented. If a project is canceled, the principal earns her outside
option, as described later in this section. The agents’ reports, like the project outcomes Yj ,
are not verifiable.
Contract: As mentioned above, the principal commits to a job design d ∈ {I, T } that
specifies either individual assignment (d = I) or team assignment (d = T ). As neither the
projects’ outcomes nor the agents’ reports are verifiable, the principal cannot commit to a
cancellation policy, and can only commit to a wage schedule that depends on (i) whether
the project has been implemented, and (ii) in the event the project is implemented, on
the realization of the associated performance measure Mj ∈ {0, 1}. To streamline nota-
tions, we set Mj = ∅ if project j gets canceled. Thus, under individual assignment, agent
A1’s contract is given by the wage schedule $w_1^I(M_A)$, $M_A \in \{0, 1, \emptyset\}$, as he is only
responsible for project A (similarly, $w_2^I(M_B)$ for agent A2), and under team assignment, by the
pair of schedules $(w_{1A}^T(M_A); w_{1B}^T(M_B))$, as he works on parts of both projects (similarly,
$(w_{2A}^T(M_A); w_{2B}^T(M_B))$ for agent A2). Denote the wage schedule for Ai under the job design
$d \in \{I, T\}$ as $W_i^d$.
    We denote a contract as $\phi := (d, W_1^d, W_2^d)$, and let Φ be the set of all such contracts.
Time line: The time line of the game is summarized below:
                                                                               
    • P chooses a job design $d \in \{I, T\}$, and publicly offers a wage schedule $(W_1^d, W_2^d)$.
    • A1 and A2 (simultaneously) accept or reject the contract $\phi = (d, W_1^d, W_2^d)$. The game
       proceeds only if both accept.
    • Ai exerts effort in the two tasks that have been assigned to him.
    • Ai may observe the state(s) ωj from his assigned tasks and reports to P.
    • P decides which project, if any, to cancel.
   • The project outcomes, performance measures, and payoffs are realized; and the game
      ends.
Payoffs: With a slight abuse of notation, we set Yj = π if project j gets canceled. (Recall
that in this case we also set the performance metric Mj = ∅.) Under individual assignment
the agents’ ex-post payoffs are:
\[
u_1^I := w_1^I(M_A) - c(e_{A1}) - c(e_{A2}), \qquad
u_2^I := w_2^I(M_B) - c(e_{B1}) - c(e_{B2});
\]
and the principal’s ex-post payoff is $\pi^I := \pi_A^I + \pi_B^I$ where
\[
\pi_A^I := Y_A - w_1^I(M_A), \quad \text{and} \quad \pi_B^I := Y_B - w_2^I(M_B).
\]
Analogously, the payoffs under team assignment are given as
\[
u_1^T := w_{1A}^T(M_A) + w_{1B}^T(M_B) - c(e_{A1}) - c(e_{B1}), \qquad
u_2^T := w_{2A}^T(M_A) + w_{2B}^T(M_B) - c(e_{A2}) - c(e_{B2}),
\]
and $\pi^T := \pi_A^T + \pi_B^T$ where
\[
\pi_j^T = Y_j - \left( w_{1j}^T(M_j) + w_{2j}^T(M_j) \right).
\]
    All players are risk neutral. If the agents accept the contract offered by the principal, the
ex-ante payoff of an agent Ai is given by his expected wage net of his cost of effort. And the
ex-ante payoff of the principal is given by the expected output from the two projects (when
implemented) net of the expected wage payment. If a project is canceled, the principal can
undertake an “outside option” that yields a payoff of π (> 0). Note that the expectations
over project outcome and performance metric must account for the agents’ reporting strategy
and the principal’s cancellation strategy (as we will elaborate below). But in our discussion
below we do not explicitly mention this dependence to economize on notations.
    We assume that a priori the principal is indifferent between canceling a project and
implementing it without seeking any information from the agents, which implies the following
restriction on the parameters.
Assumption 1. $\pi = \max_{e_{j1}, e_{j2}} \left[ \tfrac{1}{2}(e_{j1} + e_{j2})\,y - c(e_{j1}) - c(e_{j2}) \right] = \tfrac{1}{4} y^2$.
    We also assume that the outside option of both agents is 0.
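    The value of π in Assumption 1 can be verified directly. The derivation below is a sketch
that assumes an interior optimum, i.e., y ≤ 1 so that the maximizer y/2 lies in the feasible
effort range [0, 1/2]; this bound on y is an assumption made here for the illustration.

```latex
\begin{align*}
&\max_{e_{j1}, e_{j2}} \; \tfrac{1}{2}(e_{j1} + e_{j2})\,y - \tfrac{1}{2}e_{j1}^2 - \tfrac{1}{2}e_{j2}^2 \\
&\text{FOC: } \tfrac{1}{2}y - e_{jk} = 0 \;\Rightarrow\; e_{jk} = \tfrac{y}{2}, \quad k \in \{1, 2\}, \\
&\pi = \tfrac{1}{2}\,y \cdot y - 2 \cdot \tfrac{1}{2}\left(\tfrac{y}{2}\right)^2
     = \tfrac{1}{2}y^2 - \tfrac{1}{4}y^2 = \tfrac{1}{4}y^2 .
\end{align*}
```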
Strategies and Equilibrium concept: The strategy of the principal, σP , has two com-
ponents: (i) A contract φ ∈ Φ offered at the beginning of the game that stipulates the job
design d ∈ {I, T }, and the agents’ wage schedules given the chosen design, $W_1^d$ and $W_2^d$. (ii)
A continuation policy, Cj , that stipulates the principal’s continuation decision on project j,
j ∈ {A, B}, as a function of the agents’ reports r1 and r2 . The strategy of the agent Ai , σAi ,
has three components: (i) accept or reject the contract offered by the principal, (ii) an effort
policy Ei that stipulates effort levels on the assigned tasks, and (iii) a reporting policy ρi
that maps the agent’s observed signals to his report ri . We use perfect Bayesian equilibrium
(PBE) in pure strategies as the solution concept.
     As the projects are independent and the players’ payoffs are additively separable across
projects, without loss of generality, we limit attention to the class of equilibria where players
use symmetric strategies (i.e., $C_A = C_B$, $\rho_1 = \rho_2$, $w_1^I(M_A) = w_2^I(M_B)$, and
$w_{iA}^T(M_A) = w_{iB}^T(M_B)$, $i = 1, 2$). We look for the PBE that yields the highest payoff to the principal in
each of the two continuation games that follows from a given job design d ∈ {I, T }. The
optimal job design d is the one that yields the highest payoff to the principal.
 1.3 A Public Information Benchmark
We begin our analysis by considering a benchmark case where the agents’ observations on
the state(s) are publicly verifiable information. Thus, the principal does not need to elicit
any information from the agents on the projects’ viability, and she can also commit at the
outset to a cancellation policy that depends on the observed state. This case serves as a
useful benchmark for the exploration of the optimal job design in our model: it highlights
how the principal’s need for information elicitation and her lack of commitment power on
continuation decisions drive the key trade-off between individual and team assignment.¹
     As in our main model, denote xj ∈ {G, B, ∅} as the information on the state ωj observed
by the agent(s) assigned to project j (xj = ∅ if neither of the two agents observes ωj ), but
now assume that xj is publicly observed. Suppose that the principal opts for individual
assignment (d = I), commits to proceed with project j if and only if $x_j \in X_P^j \subseteq \{G, B, \emptyset\}$,
and offers the agents a wage schedule $(W_1^I, W_2^I)$.
   ¹ The class of wage contracts in this benchmark case is assumed to be the same as the one defined in
the main model. Even though the wage payments could be tied to the agents’ observed state (when the
observations are publicly verifiable), as we will explain below, the principal does not benefit from doing so.
     In the continuation game that follows, the agent Ai’s expected payoff from exerting effort
$e'_j := (e'_{j1}, e'_{j2})$ is:
\[
U_i^I(e'_j, X_P^j) := \Pr(x_j \in X_P^j) \left[ \sum_{M_j \in \{0,1\}} w_i^I(M_j) \sum_{\omega_j \in \{G,B\}} \Pr(M_j \mid e'_j, \omega_j) \Pr(\omega_j \mid x_j \in X_P^j) \right]
+ \Pr(x_j \notin X_P^j)\, w_i^I(\emptyset) - \sum_{k \in \{1,2\}} c(e'_{jk}). \tag{1.1}
\]
That is, with probability $\Pr(x_j \in X_P^j)$ the project continues, and agent Ai earns his expected
wage conditional on the event that the observation on the underlying state is in $X_P^j$.
Otherwise, the project is canceled, and the agent earns his “cancellation wage” $w_i^I(\emptyset)$. Notice
that the agent incurs the cost of his effort regardless of the principal’s decision on the
project’s implementation.
     If the effort profile ej is supported in equilibrium, it must satisfy Ai ’s incentive compat-
ibility constraint:
\[
e_j = \arg\max_{e'_{j1}, e'_{j2}} U_i^I(e'_j, X_P^j) \quad \forall j, \tag{$IC_I$}
\]
and his participation constraint:
\[
U_i^I(e_j, X_P^j) \geq 0. \tag{$IR_I$}
\]
Also, the principal’s expected payoff under the effort profiles {eA , eB } is (recall that we set
Yj = π if project j gets canceled):
\[
\Pi^I := E\left[ Y_A - w_1^I(M_A) \mid e_A, X_P^A \right] + E\left[ Y_B - w_2^I(M_B) \mid e_B, X_P^B \right].
\]
    The optimal contract stipulates the wage schedule and continuation policy (given by the
sets $X_P^j$) that maximize $\Pi^I$ subject to ($IR_I$) and ($IC_I$).
    Next, consider the case where the principal opts for team assignment (d = T ) and offers
a wage schedule $(W_1^T, W_2^T)$. In the continuation game that follows, the agents’ subsequent
effort choices constitute a Nash equilibrium. Thus, if the contract induces the agent Ai to
exert an effort profile $e_i := (e_{Ai}, e_{Bi})$, it must be a best response to the other agent A−i’s
effort level $e_{-i}$.
    Analogous to $U_i^I(e'_j, X_P^j)$, denote Ai’s expected payoff under team assignment as
$U_i^T(e'_i, e_{-i}, X_P^A, X_P^B)$. The agent’s incentive compatibility constraint parallels its counterpart
under individual assignment, and can be written as:
\[
e_i = \arg\max_{e'_i} U_i^T(e'_i, e_{-i}, X_P^A, X_P^B) \quad \forall i. \tag{$IC_T$}
\]
Also, Ai ’s participation constraint requires:
\[
U_i^T(e_i, e_{-i}, X_P^A, X_P^B) \geq 0 \quad \forall i. \tag{$IR_T$}
\]
    Thus, the optimal contract stipulates the wage scheme and continuation policy (given by
the sets $X_P^j$) that maximize the principal’s expected payoff
\[
\Pi^T := \sum_{j \in \{A,B\}} E\left[ Y_j - \left( w_1^T(M_j) + w_2^T(M_j) \right) \mid e_1, e_2, X_P^A, X_P^B \right],
\]
subject to ($IR_T$) and ($IC_T$).
Proposition 1. Under both individual and team assignment, in the optimal contract the
principal proceeds with project j if and only if the bad state is not observed (i.e., xj ∈ {G, ∅})
and obtains a payoff
\[
S^* := \left( 1 + \alpha - \tfrac{1}{2}\alpha^2 \right) \pi.
\]
That is, in the benchmark case, job design does not affect the principal’s payoff under the
optimal contract.
    The above finding shows that the choice of job design is irrelevant when the agents’ infor-
mation is public. Regardless of job design, the principal can always commit to the optimal
continuation policy, and use the wage contract to induce first-best effort while extracting
all surplus from the agent. Thus, the issue of job design becomes relevant only when the
agents’ observations on the projects’ underlying state remain private (as the agents’ reports
are non-contractible, the principal can no longer commit to her continuation policy).
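    The expression for S* can be reproduced numerically. The sketch below reads S* as the
surplus from a single project, assumes the interior first-best effort e = y/2 on each task
(which requires y ≤ 1), and uses π = y^2/4 from Assumption 1. The function names and
parameter values are hypothetical, and the per-project reading is an interpretation made
here rather than a statement from the proof.

```python
def benchmark_surplus(alpha: float, y: float) -> float:
    """Per-project surplus when the principal commits to proceed
    if and only if x_j is in {G, no observation}."""
    pi = y ** 2 / 4                      # outside option (Assumption 1)
    p_observed = 1 - (1 - alpha) ** 2    # state is observed at least once
    p_cancel = 0.5 * p_observed          # the bad state is observed
    e = y / 2                            # assumed first-best effort on each task
    # Effort is sunk before the state is observed; output accrues only if the
    # state is good (prior 1/2), in which case the project always proceeds.
    production = 0.5 * (2 * e) * y - 2 * (e ** 2 / 2)
    return production + p_cancel * pi


def closed_form(alpha: float, y: float) -> float:
    """S* = (1 + alpha - alpha^2 / 2) * pi, as stated in Proposition 1."""
    return (1 + alpha - 0.5 * alpha ** 2) * (y ** 2 / 4)


for a in (0.0, 0.3, 0.9):  # hypothetical values of alpha
    print(a, benchmark_surplus(a, 0.8), closed_form(a, 0.8))
```

The two columns coincide because the production surplus equals π (by Assumption 1) and the
outside option is collected exactly when the bad state is observed, an event of probability
α − α^2/2.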
 1.4 Optimal Contract
In this section we explore how the principal’s need for information elicitation while being
unable to commit to her continuation policy shapes the choice between team and individual
assignment. In contrast to the benchmark case, when the agents are privately informed, the
wage contract not only affects the agents’ effort but it also interferes with their incentives
to reveal information as well as the principal’s incentive to continue with the project. The
analysis below highlights how the optimal job design is driven by such intertwined incentives.
 1.4.1 Optimal Contract under Individual Assignment
We begin our analysis with the case of individual assignment. That is, we assume that the
principal chooses d = I, and in the continuation game we solve for the PBE that yields the
highest payoff to the principal. But before we present the formal analysis, it is instructive to
describe our solution method. Since we are looking for symmetric equilibria, we only focus
on agent A1 who performs all tasks that are associated with project A. Also, to streamline
notations, we drop the agent and project indices.
     Our goal is to find the PBE with the largest ex-ante payoff for the principal, and we
proceed in two steps: First, we fix a reporting and continuation policy pair (ρ, C), i.e., a
“communication protocol,” and search for the optimal wage contract W and effort policy E
such that the tuple (W, E, ρ, C) can be supported in a PBE. Next, we compare the payoffs
of the principal obtained in the first step across all possible communication protocols.
Lemma 1. Without loss of generality, we can restrict attention to the following two commu-
nication protocols: (i) if the state is observed to be G, report G, otherwise report ∅; proceed
with the project if and only if r = G, and (ii) if the state is observed to be B, report B,
otherwise report ∅; proceed with the project if and only if r ≠ B.
     Lemma 1 implies that we only have to consider two classes of PBE: one where the
project proceeds if and only if there is “good news”, i.e., the agent’s observation is x ∈
XP = {G}, and another where the project proceeds if and only if there is “no bad news”,
i.e., the agent’s observation is x ∈ XP = {G, ∅}. Thus, without loss of generality, the
communication protocols that are relevant for our analysis can be summarized by the set
XP ∈ {{G}, {G, ∅}}. Also, for brevity of notation, we denote w_1^I(0) =: wF (the wage
when the performance metric indicates “failure”), w_1^I(∅) − w_1^I(0) =: ∆C (the wage premium for
cancellation), and w_1^I(1) − w_1^I(0) =: ∆S (the wage premium for success).
    Given a wage contract {wF , ∆C , ∆S }, effort levels e1 and e2 , and XP (i.e., the set of
agent’s observation under which the project proceeds), the firm’s ex-ante payoff is:
\[
\Pi^I := \Pr(x \in X_P)\,\bigl[\Pr(\omega = G \mid x \in X_P)\,(y - \Delta_S) + \Pr(\omega = B \mid x \in X_P)\,(-\mu \Delta_S)\bigr] \sum_k e_k
 + \Pr(x \notin X_P)\,[\pi - \Delta_C] - w_F .
\]
If the project proceeds, it yields a revenue (y = Y ) only when the state is good, but the wage
premium for success may be paid even if the state is bad (as the performance measure is not
perfectly aligned with the project’s outcome). And if the project is canceled, the principal
gets her outside option and pays the wage premium for cancellation. The agent’s ex-ante
payoff can be written analogously as:
\[
U^I := \Pr(x \in X_P)\,\bigl[\Pr(\omega = G \mid x \in X_P) + \mu \Pr(\omega = B \mid x \in X_P)\bigr]\,\Delta_S \sum_k e_k
 + \Pr(x \notin X_P)\,\Delta_C + w_F - \frac{1}{2} \sum_k e_k^2 .
\]
    Now, if the tuple (wF , ∆C , ∆S ; e1 , e2 ; XP ) is supported as a PBE, the following constraints
must be met. First, for each of the two communication protocols given in Lemma 1, the
principal’s decision must be sequentially rational. In other words, if the principal believes
that the agent’s signal x is in XP (given the agent’s report), it must be more profitable for
her to proceed with the project than to cancel it. Similarly, if the principal believes that the
agent’s signal is not in XP , it must be more profitable for her to cancel the project than to
proceed with it. Therefore, the principal’s incentive compatibility constraints require:
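\[
\bigl[\Pr(\omega = G \mid x \in X_P)\,(y - \Delta_S) - \mu \Pr(\omega = B \mid x \in X_P)\,\Delta_S\bigr] \sum_k e_k \ge \pi - \Delta_C, \tag{$IC_P^I$-1}
\]
and
\[
\bigl[\Pr(\omega = G \mid x \notin X_P)\,(y - \Delta_S) - \mu \Pr(\omega = B \mid x \notin X_P)\,\Delta_S\bigr] \sum_k e_k \le \pi - \Delta_C. \tag{$IC_P^I$-2}
\]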
    Next, we have the agent’s participation constraint:
\[
U^I \ge 0. \tag{$IR^I$}
\]
    Finally, consider the agent’s incentive compatibility constraint. Let U(e_1′, e_2′; ρ′) be the
agent’s payoff given his efforts e_1′, e_2′, and reporting policy ρ′ (fixing the wage contract and the
principal’s continuation policy). The agent’s on-path payoff U^I must be the largest payoff
attainable for any feasible choice of effort profile and reporting policy. So, we require:
\[
U^I = \max_{e_1', e_2', \rho'} U(e_1', e_2'; \rho'). \tag{1.2}
\]
    Stipulating (1.2) is equivalent to imposing the following two constraints: First, a standard
incentive compatibility constraint that requires the effort levels to be optimal for the agent
given his equilibrium reporting strategy (as per the communication protocol (ρ, C)); i.e.,
\[
(e_1, e_2) = \arg\max_{e_1', e_2'} U(e_1', e_2'; \rho). \tag{1.2a}
\]
Second, the agent must not gain from a “double deviation” in which he simultaneously
deviates on both his effort levels and his reporting strategy. Now, given a communication protocol
(ρ, C), if the agent can profitably deviate to some other reporting policy ρ′, it must be that
his report changes the principal’s decision on whether to proceed with the project (under the
continuation policy C). Consider the two communication protocols mentioned in Lemma 1.
In the first one the associated reporting policy is to report x = G truthfully and report ∅ if
x ∈ {∅, B}; in the second one the agent reports x = B truthfully and reports ∅ if x ∈ {G, ∅}.
So, in the first case the only relevant deviation for the agent is to conceal information when
x = G, and in the second case it is to conceal the information when x = B. Thus, in both
of these cases, it is sufficient to consider only one type of deviation: the agent reports ∅
regardless of his observation. We denote this reporting policy as ρ∅ . Hence, we must have:
\[
U^I \ge \max_{e_1', e_2'} U(e_1', e_2'; \rho_\emptyset). \tag{1.2b}
\]
    It is instructive to elaborate on the conditions (1.2a) and (1.2b) as they, along with
the principal’s incentive constraints, illustrate the key trade-offs associated with information
elicitation.
    Consider a communication protocol from those specified in Lemma 1, and suppose that
the project proceeds if x ∈ XP , (XP ∈ {{G} , {G, ∅}}). Regarding condition (1.2a), it
is routine to check that U is concave in effort for any wage contract and communication
protocol, and hence, the condition can be replaced by its associated first-order condition:
\[
e_i = \Pr(x \in X_P)\,\bigl[\Pr(\omega = G \mid x \in X_P) + \mu \Pr(\omega = B \mid x \in X_P)\bigr]\,\Delta_S . \tag{$IC_A^I$-1}
\]
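As a minimal sketch of where this first-order condition comes from, one can differentiate the agent's payoff with the reporting policy and the wage contract held fixed. The shorthands p^I and P^I below are introduced only to keep the expressions compact, and the sketch assumes an interior solution (any upper bound on effort is ignored).

```latex
% Shorthand used in this sketch:
%   p^I = \Pr(x \in X_P)
%   P^I = \Pr(\omega = G \mid x \in X_P) + \mu \Pr(\omega = B \mid x \in X_P)
\begin{align*}
U(e_1, e_2; \rho) &= p^I P^I \Delta_S\,(e_1 + e_2) + (1 - p^I)\,\Delta_C + w_F
                     - \tfrac{1}{2}\left(e_1^2 + e_2^2\right), \\
\frac{\partial U}{\partial e_i} &= p^I P^I \Delta_S - e_i = 0
  \quad\Longrightarrow\quad e_i = p^I P^I \Delta_S, \qquad i = 1, 2, \\
\frac{\partial^2 U}{\partial e_i^2} &= -1 < 0, \qquad
\frac{\partial^2 U}{\partial e_1\,\partial e_2} = 0 .
\end{align*}
```

Since the payoff is additively separable and strictly concave in each effort, the stationary point is the unique maximizer, and the two tasks receive the same effort.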
    The condition (1.2b), however, is slightly more intricate. In order to simplify this con-
dition one needs to account for the fact that when the agent deviates from his equilibrium
reporting policy ρ to ρ∅ (i.e., reports ∅ regardless of his observation), it affects the project’s
continuation probability. And in case the project continues, the likelihood of a state ω con-
ditional on the project being continued is the same as its prior probability as the project
would continue regardless of the agent’s observed signal x.
    Let p_∅^I be the probability that the project continues when the agent deviates to the
reporting policy ρ∅ given the equilibrium communication protocol, i.e., p_∅^I = 1 if XP = {G, ∅}
and p_∅^I = 0 if XP = {G}. Also, for brevity of notation, denote p^I := Pr (x ∈ XP ), and let
\[
P^I := \Pr(\omega = G \mid x \in X_P) + \mu \Pr(\omega = B \mid x \in X_P), \qquad
P_\emptyset^I := \Pr(\omega = G) + \mu \Pr(\omega = B).
\]
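Now, off-path, the agent’s payoff can be derived as:
\[
\begin{aligned}
\max_{e_1', e_2'} U(e_1', e_2'; \rho_\emptyset)
 &= \max_{e_1', e_2'} \; p_\emptyset^I \bigl[\Pr(\omega = G) + \mu \Pr(\omega = B)\bigr] \Delta_S \sum_k e_k' - \frac{1}{2} \sum_k e_k'^2 + \bigl(1 - p_\emptyset^I\bigr) \Delta_C + w_F \\
 &= \bigl(p_\emptyset^I P_\emptyset^I \Delta_S\bigr)^2 + \bigl(1 - p_\emptyset^I\bigr) \Delta_C + w_F .
\end{aligned}
\]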
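The agent’s on-path payoff can be computed analogously, and (1.2b) simplifies to:
\[
\Bigl[\bigl(p^I P^I\bigr)^2 - \bigl(p_\emptyset^I P_\emptyset^I\bigr)^2\Bigr] \Delta_S^2 \ge \bigl(p^I - p_\emptyset^I\bigr) \Delta_C . \tag{$IC_A^I$-2}
\]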
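The simplification of (1.2b) can be retraced in a few lines. The sketch below uses only the payoff expressions stated in the text; the auxiliary symbol a (the marginal return to effort) is introduced here for illustration, and efforts are again assumed to be interior.

```latex
% Auxiliary fact: for any constant a >= 0,
%   max_{e_1, e_2} { a (e_1 + e_2) - (1/2)(e_1^2 + e_2^2) } = a^2,  attained at e_1 = e_2 = a.
\begin{align*}
\text{On path } (a = p^I P^I \Delta_S): \quad
  U^I &= \bigl(p^I P^I \Delta_S\bigr)^2 + (1 - p^I)\,\Delta_C + w_F, \\
\text{Off path } (a = p^I_\emptyset P^I_\emptyset \Delta_S): \quad
  \max_{e_1', e_2'} U(e_1', e_2'; \rho_\emptyset)
      &= \bigl(p^I_\emptyset P^I_\emptyset \Delta_S\bigr)^2 + (1 - p^I_\emptyset)\,\Delta_C + w_F, \\
U^I \ge \max_{e_1', e_2'} U(e_1', e_2'; \rho_\emptyset)
  &\iff \Bigl[\bigl(p^I P^I\bigr)^2 - \bigl(p^I_\emptyset P^I_\emptyset\bigr)^2\Bigr] \Delta_S^2
        \ge \bigl(p^I - p^I_\emptyset\bigr)\,\Delta_C .
\end{align*}
% Substituting the two protocols:
%   X_P = {G}            (p^I_\emptyset = 0):  (p^I P^I)^2 \Delta_S^2 \ge p^I \Delta_C
%   X_P = {G, \emptyset} (p^I_\emptyset = 1):  (1 - p^I) \Delta_C \ge [ (P^I_\emptyset)^2 - (p^I P^I)^2 ] \Delta_S^2
```

Read this way, under XP = {G} the constraint caps the cancellation premium relative to the success premium, while under XP = {G, ∅} it places a floor on the cancellation premium.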
    Thus, the optimal wage contract that supports a communication protocol given by XP ∈
{{G} , {G, ∅}} solves the following program:
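\[
\mathcal{P}^I: \quad \max_{w_F,\, \Delta_C,\, \Delta_S,\, e_1,\, e_2} \; \Pi^I \quad \text{s.t. } (IR^I),\ (IC_P^I\text{-}1),\ (IC_P^I\text{-}2),\ (IC_A^I\text{-}1), \text{ and } (IC_A^I\text{-}2).
\]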
Lemma 2. The program P I always admits a solution for XP = {G, ∅}, and admits a solution
for XP = {G} if and only if α is sufficiently large.
    The PBE that yields the highest payoff to the principal (under individual assignment)
induces the communication protocol (given by XP ∈ {{G} , {G, ∅}}) for which the value of
the program P I is the largest.
 1.4.2 Optimal Contract under Team Assignment
The analysis of team assignment resembles our above discussion on individual assignment,
but the two forms of job design differ in two key aspects: First, under team assignment
each agent gets exactly one signal from each job. In particular, both agents may observe
the underlying state associated with a job. Thus, an agent cannot fully control the flow
of information about a given project as his attempt to hide information would fail if the
other agent happens to reveal it. Second, for each of the two projects, both agents must
be (individually) incentivized for information elicitation and effort provision. (In contrast,
under individual assignment the principal has to incentivize only one agent for each project;
the agent is responsible for both tasks associated with the project and observes both signals
on the project’s underlying state). As we will explain later, these two distinctions give rise to
the key trade-off between ease of information elicitation and economies of scope in incentive
provision that drives the optimal job design.
    Now, consider the principal’s optimal contracting problem. As mentioned in the previous
section, since the production environment and the wage schemes are both additively
separable across projects, without loss of generality, we may require w_{iA}^T(M_A) = w_{iB}^T(M_B).
Consequently, we can formulate the principal’s optimal contracting problem as one where
there is only one project (with two tasks) and the principal hires two agents: each agent
performs exactly one of the two tasks and observes exactly one of the two signals on the
project’s state.
    Analogous to the case of individual assignment, we seek to characterize the PBE of this
continuation game with the largest ex-ante payoff for the principal. The analysis follows the
same two-step process that we have described above: first, we fix a communication protocol
and derive the optimal wage contract that supports this protocol in equilibrium; and next,
we compare the principal’s payoff across all possible communication protocols that could be
sustained in equilibrium.
    With a slight abuse of notation, we continue to denote the strategies of the players in this
game by the tuple (Wi , Ei , ρi , C), i = 1, 2. To streamline notation, we drop the project index
and relabel w_{ij}^T(0) =: w_{iF}, w_{ij}^T(∅) − w_{ij}^T(0) =: ∆iC (wage premium for cancellation), and
w_{ij}^T(1) − w_{ij}^T(0) =: ∆iS (wage premium for success). Also, we denote the team’s collective
observation on the state as xT , where
\[
x^T :=
\begin{cases}
G & \text{if } x_i = G \text{ for some } i, \\
B & \text{if } x_i = B \text{ for some } i, \\
\emptyset & \text{if } x_1 = x_2 = \emptyset .
\end{cases}
\]
As in the case of individual assignment, we can again limit attention to only two commu-
nication protocols as stated in the lemma below. (We omit the proof of this lemma as it
follows the same argument as that of Lemma 1.)
Lemma 3. Without loss of generality, we can restrict attention to the following two commu-
nication protocols: (i) reporting policy for agent Ai (i = 1, 2): if the state is observed to be
G, report G, otherwise report ∅; the principal proceeds with the project only if ri = G for some
i, and (ii) reporting policy for agent Ai (i = 1, 2): if the state is observed to be B, report B,
otherwise report ∅; the principal proceeds only if ri ≠ B for all i.
    Thus, without loss of generality, as before, the communication protocols that are relevant
for our analysis of team assignment can be summarized by the set XP ∈ {{G} , {G, ∅}}.
Given the wage contracts {wiF , ∆iC , ∆iS } , i = 1, 2, effort levels e1 and e2 , and XP , it is
routine to check that the firm’s ex-ante payoff is:
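\[
\begin{aligned}
\Pi^T := {} & \Pr\bigl(x^T \in X_P\bigr) \times
 \Bigl[\Pr\bigl(\omega = G \mid x^T \in X_P\bigr) \Bigl(y - \sum_i \Delta_{iS}\Bigr) + \Pr\bigl(\omega = B \mid x^T \in X_P\bigr) \Bigl(-\mu \sum_i \Delta_{iS}\Bigr)\Bigr] \sum_k e_k \\
 & + \Pr\bigl(x^T \notin X_P\bigr) \Bigl(\pi - \sum_i \Delta_{iC}\Bigr) - \sum_i w_{iF} .
\end{aligned}
\]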
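Agent i’s participation constraint requires:
\[
\begin{aligned}
U_i^T := {} & \Pr\bigl(x^T \in X_P\bigr) \times \bigl[\Pr\bigl(\omega = G \mid x^T \in X_P\bigr) + \mu \Pr\bigl(\omega = B \mid x^T \in X_P\bigr)\bigr] \Delta_{iS} \sum_k e_k \\
 & + \Pr\bigl(x^T \notin X_P\bigr) \Delta_{iC} + w_{iF} - \frac{1}{2} e_i^2 \ge 0 .
\end{aligned} \tag{$IR_i^T$}
\]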
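   The principal’s incentive compatibility constraints ensure that it is optimal for the prin-
cipal to proceed with the project if xT is in XP and to cancel it otherwise:
\[
\Bigl[\Pr\bigl(\omega = G \mid x^T \in X_P\bigr) \Bigl(y - \sum_i \Delta_{iS}\Bigr) + \Pr\bigl(\omega = B \mid x^T \in X_P\bigr) \Bigl(-\mu \sum_i \Delta_{iS}\Bigr)\Bigr] \sum_k e_k \ge \pi - \sum_i \Delta_{iC}, \tag{$IC_P^T$-1}
\]
and
\[
\Bigl[\Pr\bigl(\omega = G \mid x^T \notin X_P\bigr) \Bigl(y - \sum_i \Delta_{iS}\Bigr) + \Pr\bigl(\omega = B \mid x^T \notin X_P\bigr) \Bigl(-\mu \sum_i \Delta_{iS}\Bigr)\Bigr] \sum_k e_k \le \pi - \sum_i \Delta_{iC}. \tag{$IC_P^T$-2}
\]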
Notice that, in contrast to their counterparts under individual assignment, these (ICP ) constraints
reflect that both the success and the cancellation of the project require the principal to pay
the corresponding wage premium to each of the two agents. As we will see later, the need
for such “double payment” captures diseconomies of scope in incentive provision under team
assignment.
    Finally, consider the agents’ incentive compatibility constraints. As before, the con-
straint would require that neither of the two agents can gain by unilaterally deviating to a
different effort choice and reporting policy. However, there is a salient distinction between
the constraints under team and their counterpart under individual assignment. Under team
assignment, an agent chooses the effort in only one of the two tasks, and reports only one
of the two signals on the project’s underlying state. Thus, an agent cannot fully influence
the project’s output and the associated performance measure, nor can he fully control the
information on the underlying state that may be communicated to the principal.
    Let Ui (ei , ρi ; ej , ρj ) be the agent Ai ’s payoff given the two agents’ efforts and reporting
policies (fixing the wage contracts and the principal’s continuation policy). The agent’s on-
path payoff UiT must be the largest payoff attainable for any feasible choice of effort profile
and reporting policy (given the other agent’s equilibrium effort and reporting policy). So,
the constraint requires:
\[
U_i^T = \max_{e_i', \rho_i'} U_i(e_i', \rho_i'; e_j, \rho_j). \tag{1.3}
\]
    As before, it is sufficient to consider only two types of deviation: (i) the agent follows
his equilibrium reporting policy ρi but deviates on his effort level, (ii) the agent reports ∅
regardless of his observation, and chooses his effort level accordingly. Again, with a slight
abuse of notation, we denote the latter reporting policy (given in (ii)) as ρ∅ . Thus, the
incentive compatibility constraint (1.3) for agent Ai (i = 1, 2) is equivalent to the following
two conditions:
\[
e_i = \arg\max_{e_i'} U_i(e_i', \rho_i; e_j, \rho_j) \tag{1.3a}
\]
and
\[
U_i^T \ge \max_{e_i'} U_i(e_i', \rho_\emptyset; e_j, \rho_j). \tag{1.3b}
\]
   Now, (1.3a) implies that ei satisfies the following first-order condition (i = 1, 2):
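\[
e_i = \Pr\bigl(x^T \in X_P\bigr) \bigl[\Pr\bigl(\omega = G \mid x^T \in X_P\bigr) + \mu \Pr\bigl(\omega = B \mid x^T \in X_P\bigr)\bigr] \Delta_{iS} . \tag{$IC_{A_i}^T$-1}
\]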
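A short sketch of the underlying calculation may help here. It takes agent i's payoff from the participation constraint, holds the teammate's effort e_j and both reporting policies fixed, and uses the shorthands p^T and P^T purely for compactness; an interior effort choice is assumed.

```latex
% Shorthand used in this sketch:
%   p^T = \Pr(x^T \in X_P)
%   P^T = \Pr(\omega = G \mid x^T \in X_P) + \mu \Pr(\omega = B \mid x^T \in X_P)
\begin{align*}
U_i(e_i, \rho_i; e_j, \rho_j) &= p^T P^T \Delta_{iS}\,(e_i + e_j) + (1 - p^T)\,\Delta_{iC} + w_{iF} - \tfrac{1}{2} e_i^2, \\
\frac{\partial U_i}{\partial e_i} &= p^T P^T \Delta_{iS} - e_i = 0
  \quad\Longrightarrow\quad e_i = p^T P^T \Delta_{iS} .
\end{align*}
```

Because e_j enters additively, the teammate's effort shifts the level of agent i's payoff but not his best response: each agent's effort is pinned down by his own success premium alone.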
Also, (1.3b) can be simplified in the same fashion in which we streamlined its counterpart
under individual assignment. However, one needs to account for the fact that under team,
an agent’s attempt to conceal information may be undermined by the report of the other
agent. In parallel to our analysis of individual assignment, let p_∅^T be the probability that
the project continues when agent i deviates to the reporting policy ρ∅ , given the equilibrium
communication protocol. Also, denote p^T := Pr (xT ∈ XP ), and
\[
\begin{aligned}
P^T &:= \Pr\bigl(\omega = G \mid x^T \in X_P\bigr) + \mu \Pr\bigl(\omega = B \mid x^T \in X_P\bigr), \\
P_\emptyset^T &:= \Pr(\omega = G \mid \rho_\emptyset, \rho_j, C) + \mu \Pr(\omega = B \mid \rho_\emptyset, \rho_j, C),
\end{aligned}
\]
where Pr (ω | ρ∅ , ρj , C) denotes the probability of the state ω conditional on the event that
the project proceeds under the communication protocol {ρ∅ , ρj , C}. Now, plugging in the
agent’s on- and off-path payoffs, condition (1.3b) can be stated as:
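\[
\frac{1}{2} \Bigl[\bigl(p^T P^T\bigr)^2 - \bigl(p_\emptyset^T P_\emptyset^T\bigr)^2\Bigr] \Delta_{iS}^2
 + \Bigl[\bigl(p^T P^T\bigr)^2 - p_\emptyset^T P_\emptyset^T \, p^T P^T\Bigr] \Delta_{iS} \Delta_{jS}
 \ge \bigl(p^T - p_\emptyset^T\bigr) \Delta_{iC} . \tag{$IC_{A_i}^T$-2}
\]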
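The two terms on the left-hand side of this condition can be traced back to agent i's own effort and to his teammate's effort, respectively. The sketch below assumes, as in the text, that agent j stays on path (so his effort is e_j = p^T P^T ∆jS in both scenarios) and that agent i re-optimizes his own effort after deviating to ρ∅; interior efforts are assumed.

```latex
\begin{align*}
\text{On path } (e_i = p^T P^T \Delta_{iS}): \quad
  U_i^T &= \tfrac{1}{2}\bigl(p^T P^T\bigr)^2 \Delta_{iS}^2
          + \bigl(p^T P^T\bigr)^2 \Delta_{iS}\Delta_{jS}
          + (1 - p^T)\,\Delta_{iC} + w_{iF}, \\
\text{Off path } (e_i' = p^T_\emptyset P^T_\emptyset \Delta_{iS}): \quad
  \max_{e_i'} U_i(e_i', \rho_\emptyset; e_j, \rho_j)
        &= \tfrac{1}{2}\bigl(p^T_\emptyset P^T_\emptyset\bigr)^2 \Delta_{iS}^2
          + p^T_\emptyset P^T_\emptyset \, p^T P^T \,\Delta_{iS}\Delta_{jS}
          + (1 - p^T_\emptyset)\,\Delta_{iC} + w_{iF} .
\end{align*}
% Subtracting the second line from the first and rearranging gives
%   (1/2) [ (p^T P^T)^2 - (p^T_\emptyset P^T_\emptyset)^2 ] \Delta_{iS}^2
%     + [ (p^T P^T)^2 - p^T_\emptyset P^T_\emptyset p^T P^T ] \Delta_{iS} \Delta_{jS}
%     \ge (p^T - p^T_\emptyset) \Delta_{iC} .
```

The cross term in ∆iS ∆jS has no counterpart under individual assignment: it arises because the deviating agent cannot adjust the effort on his teammate's task.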
    Thus, the optimal wage contract under team assignment that supports a communication
protocol given by XP ∈ {{G} , {G, ∅}} solves the following program:
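\[
\mathcal{P}^T: \quad \max_{\{w_{iF},\, \Delta_{iC},\, \Delta_{iS}\}_{i=1,2},\; e_1,\, e_2} \; \Pi^T \quad \text{s.t. } (IR_i^T),\ (IC_P^T\text{-}1),\ (IC_P^T\text{-}2),\ (IC_{A_i}^T\text{-}1), \text{ and } (IC_{A_i}^T\text{-}2).
\]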
Lemma 4. (i) The program P T always admits a solution for XP = {G, ∅} and admits a
solution for XP = {G} if and only if both α and µ are sufficiently large.
(ii) If P T admits a solution, it also admits a symmetric solution where w1F = w2F = wF ,
∆1S = ∆2S = ∆S and ∆1C = ∆2C = ∆C .
    The PBE that yields the highest payoff to the principal (under team assignment) induces
the communication protocol (given by XP ∈ {{G} , {G, ∅}}) for which the value of the
program P T is the largest.
 1.5 Optimal Job Design
By comparing the principal’s payoffs associated with the optimal contracts under team and
individual assignment, we can now characterize the optimal job design.
Proposition 2. (Optimal job design) There exist two thresholds µ0 and µ1 (given α),
µ0 < µ1 , such that:
(i) if µ < µ0 , it is optimal to choose individual assignment where the agent reports B only if
he observes the state to be B, and reports ∅ otherwise; the principal proceeds with the project
only if the report is not B. The associated optimal contract is efficient and the principal’s
payoff is S ∗ (as defined in Proposition 1).
(ii) If µ > µ1 , it is optimal to choose team assignment where the agent reports B only if he
observes the state to be B, and reports ∅ otherwise; the principal proceeds with the project
only if no agent reports B. The associated optimal contract is efficient and the principal’s
payoff is S ∗ .
(iii) Otherwise (i.e., if µ0 ≤ µ ≤ µ1 ), the principal is indifferent between team and individual
assignments: both designs, along with the corresponding communication protocol as stated in
parts (i) and (ii) above, yield the same payoff of S ∗ for the principal.
    Moreover, the parameter thresholds µ0 and µ1 vary with α in the following manner.
Proposition 3. (Comparative statics) The threshold µ0 is increasing in α. Also, there
exists a cutoff α∗ such that µ1 = 1 for α ≤ α∗ and µ1 is decreasing in α for α ≥ α∗ .
    Propositions 2 and 3 (illustrated in Figure 1.1) show how the optimal job design is driven
by the “availability” of the agents’ signal (as captured by α) and the “alignment” of the
performance measure with the project’s output (as captured by µ). For low α (i.e., α ≤ α∗ ),
individual assignment is always optimal; for low µ (i.e., µ < µ0 ) it strictly dominates team
assignment but otherwise (i.e., µ ≥ µ0 ) both designs yield the same (optimal) payoff. In
contrast, when α is large, team assignment is strictly optimal provided µ is large as well (i.e.,
µ > µ1 ). However, as before, for moderate µ the two designs yield the same payoff, and for
small µ individual assignment remains strictly optimal.
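    [Figure 1.1: Optimal job design as a function of α and µ. Horizontal axis: α, from 0 to 1, with the cutoff α∗ marked; vertical axis: µ, from 0 to 1. The curves µ0 (α) and µ1 (α) delimit three regions labeled “Individual assignment”, “TA=IA”, and “Team assignment”.]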
    To see the intuition behind the above result, recall that our setup highlights two key
frictions. First, the principal lacks information on the project’s viability and must elicit
it from the agents. Second, even though the principal’s continuation decision depends on
the agents’ information, she cannot commit to any continuation policy ex-ante. These two
frictions give rise to a trade-off that drives the optimal job design: relative to individual
assignment, team facilitates information elicitation but suffers from diseconomies of scope in
incentive provision.
    Team assignment helps in information elicitation as an agent cannot fully control the
outcome of the project (and the performance measure). Even if the agent attempts to
suppress information and adjust his effort (in his assigned task) accordingly, his gains from
such deviations are muted by the fact that his teammate may still reveal the information
to the principal. Also, the agent cannot control the level of effort on the task that is
performed by his teammate. But, under individual assignment such a “double deviation”,
i.e., concurrent manipulation of reporting and effort, may be more profitable for the agent:
he fully controls what the principal gets to learn about the project’s underlying state and
how much effort is exerted on both tasks that are associated with the project. In fact, he
stands to profit from it when both α and µ are large.
     When α is large, the agent’s control over the project’s continuation is more valuable as
he is now more likely to observe the state and, under individual assignment, he can hide any
unfavorable information. In particular, the agent would have a strong incentive to conceal the
bad state (and let the project continue) if he expects to earn a large payoff even if the project
fails. This is indeed the case when µ is large, i.e., the performance measure is significantly
misaligned with the project’s outcome: in a bad state, the measure is more likely to indicate
success (given the effort levels) even though the project is sure to fail. Moreover, should
the agent deviate on his reporting policy and hide the bad state, he may also exert more
effort (vis-a-vis the on-path effort levels) so as to further increase his gains from deviation.
Thus, when α and µ are both large, deterring the agent from double-deviation gets harder
under individual assignment, and team’s advantage over individual assignment in information
elicitation becomes stronger. This is why team assignment dominates individual assignment
when α and µ are high.
     However, team assignment lacks economies of scope in incentive provision: in order to
induce effort on both tasks associated with the project, the principal needs to incentivize
the two agents separately. Notice that under individual assignment a single wage payment
(wF , wS , or wC based on the project’s outcome) incentivizes the agent to exert efforts on
all tasks. In contrast, in a team, each of the two agents is assigned to exactly one of the
two tasks. Hence, if the principal were to induce the same level of effort in both tasks of the
project her wage bill doubles (2wF , 2wS , or 2wC ).
    Such diseconomies of scope may be costly to the principal. As the principal lacks com-
mitment power over the continuation policy, her (ICP ) constraints must hold. That is, for
any given job design with communication protocol given by XP , (i) the principal’s expected
payoff from proceeding when the agents’ observation is in XP must be larger than her pay-
off from canceling the project, and (ii) the payoff from canceling must be larger than her
expected payoff from proceeding with the project if the agents’ observation is not in XP .
Thus, any feasible contract must ensure that the principal earns more from proceeding when
the signal is in XP than when it is not. For example, (ICPI -1) and (ICPI -2) imply:
\[
\bigl[\Pr(\omega = G \mid x \in X_P)\,(Y - \Delta_S) + \Pr(\omega = B \mid x \in X_P)\,(-\mu \Delta_S)\bigr] \sum_k e_k
 \ge \bigl[\Pr(\omega = G \mid x \notin X_P)\,(Y - \Delta_S) + \Pr(\omega = B \mid x \notin X_P)\,(-\mu \Delta_S)\bigr] \sum_k e_k .
\]
This difference in earnings equals the difference in the expected output of the project,
\[
\bigl[\Pr(\omega = G \mid x \in X_P) - \Pr(\omega = G \mid x \notin X_P)\bigr] \sum_k e_k \, Y,
\]
net of the difference in the expected wage payout,
\[
(1 - \mu) \bigl[\Pr(\omega = G \mid x \in X_P) - \Pr(\omega = G \mid x \notin X_P)\bigr] \sum_k e_k \, \Delta_S .
\]
    Now, for any XP ∈ {{G} , {G, ∅}} the probabilities that the project and the performance
measure indicate success (i.e., y = Y and M = 1) are both larger (given the effort levels in
the two tasks) when the agents’ signal is in XP than when it is not. Therefore, when the
principal needs to pay the wage premium for success (∆S ) twice in order to elicit the same
amount of effort in both tasks—as is the case under team assignment—the difference in her
expected wage payouts is larger. Consequently, the aforementioned feasibility constraint is
harder to satisfy under team, and individual assignment becomes more favorable.
    Also note that team’s relative disadvantage (due to diseconomies of scope) becomes more
acute when µ is small (i.e., the measure is well-aligned with the project’s output). As the
agent is unlikely to earn a reward for success when the state is bad, the wage premium for
success needs to be sufficiently large so as to incentivize him to exert effort. And when the
principal needs to pay such large premiums twice—as is the case under team assignment—
her continuation policy is less likely to remain credible: proceeding with the project when
the signal is in XP may be less profitable than proceeding when it is not (i.e., (ICP ) gets
violated). This explains why individual accountability dominates team when µ is low.
    The above discussion may be summarized as follows: For low µ, provision of incentives
under team assignment gets compromised due to acute diseconomies of scope, but incen-
tives under individual assignment remain sharp as information elicitation is relatively easy
(“double deviation” is less profitable as a successful performance is unlikely to arise when
the state is bad). Thus, individual assignment strictly dominates team. However, for large
µ, diseconomies of scope do not distort incentive provision under team: as the required
success premium is smaller, it may be feasible for the principal to pay it to both agents
separately. Thus, both designs yield the same payoff as long as information elicitation does
not distort incentives under individual assignment. But information elicitation gets harder
under individual assignment if α is also large (along with µ), and team assignment becomes
strictly optimal.
    Notice that at the optimal job design diseconomies of scope do not distort incentives,
and neither does the need for information elicitation. Therefore, the associated contract
yields the efficient level of surplus as obtained in the public information benchmark (in
Section 1.3). However, this observation critically hinges on our modeling assumption that
the agents’ observation on the state does not contain any noise (conditional on observing it
in the first place). As we discuss in the next section, when the agent’s signal is noisy, the
optimal job design may entail inefficiencies both in the principal’s continuation policy and
in the agents’ effort levels.
 1.6 Discussion and Conclusion
While our model adopts a stylized information setup for analytical tractability, the key trade-
off that we highlight here (between information elicitation and diseconomies of scope) may
continue to shape the firm’s job design decision in some related and more general settings.
We consider two such extensions of our model. First, we relax the assumption that an
informed agent observes the state without any noise, and assume that an agent’s signal may
be imprecise. Next, we relax the assumption that the observability of the underlying state
of a project in each of its two tasks is statistically independent, and explore the case where
they are mutually exclusive.
 1.6.1 Imprecise Signals
In our model, the agent, conditional on observing the state, always observes it without
any noise. While this assumption improves the analytical tractability of the model, it is
conceivable that the agents may not be able to directly observe the state but only acquire
an imprecise signal on the same. How would our characterization of the optimal job design
change if the agents’ information were noisy?
    In order to explore this issue, we consider the following modification to our model: Sup-
pose that the state ωj ∈ {G, B} associated with the project j (j ∈ {A, B}) is never directly
observed, but the agents may observe a signal σj ∈ {G, B} that is informative of ωj . Let
                       Pr (ωj = G | σj = G) = Pr (ωj = B | σj = B) = θ,
where θ ∈ (1/2, 1) reflects the precision of the signal. In parallel with the information
structure of our model, we assume that the agent assigned to task Tjk privately observes
σj with probability α. And with a slight abuse of notation, we denote the agent Ai ’s
observation on the signal σj as xji ∈ {G, B, ∅}, where xji = ∅ if Ai does not observe σj in
any of his assigned tasks. We keep all other aspects of our model unaltered. Notice that our
main model corresponds to the case where θ = 1.
    Though a complete characterization of the optimal job design for this case is analytically
intractable, the following proposition suggests that our main result is robust to a small noise
in the agents’ signal.
Proposition 4. There exists a threshold θ∗ < 1 such that for θ > θ∗ , the qualitative char-
acterization of the optimal job design is the same as its counterpart in our main model (as
given in Proposition 2), and the optimal contract is always efficient.
    However, if the agents’ signal becomes sufficiently noisy (i.e., when θ is sufficiently low)
our main result may no longer hold. Recall that under the optimal contract (in our main
model), the project proceeds even when the agents fail to reveal their signal, i.e., the project
continues unless the agent(s) report(s) a bad state. But when the agents’ signal is sufficiently
noisy, information elicitation gets harder. An agent now has a stronger incentive to hide a
bad signal and let the project pass, since with some probability, a bad signal may still be
associated with a good state.
    This effect may introduce two sources of inefficiencies. First, the principal may reduce
the effort incentives so as to mitigate the agent’s incentive to hide a bad signal. (Recall that
as the agent’s effort increases, the performance measure is more likely to indicate success.
Thus when the efforts are high, the agent has stronger incentive to continue the project
under a bad signal.) Second, if such distortions to the effort level are too costly, the principal
may also distort her continuation policy: the project may proceed only if the signal is good.
And at the extreme, i.e., when θ is low enough, it is optimal for the principal to proceed with
all projects without soliciting any information from the agents (or, equivalently, to settle for
the outside option). These inefficiencies are illustrated in Figure 1.2 below that presents a
numerical solution for the optimal job design problem.
    [Figure 1.2: Optimal job design with imprecise signal (θ = 0.77). Horizontal axis: α ∈ [0, 1]; vertical axis: µ ∈ [0, 1]. Labeled regions: Team assignment, TA=IA, Individual assignment, and region I. In region I, individual assignment is optimal but the continuation decision is inefficient; the project continues only if the report is good.]
 1.6.2 Exclusive Signals
So far, we have assumed that the observability of the underlying state of a project in each
of its two tasks is statistically independent. Such a setup may reflect a scenario where each
task Tjk (of project j) gives access to a different (and independent) source of information,
each of which may reveal the state ωj with probability α. But it is conceivable that the
informativeness of these sources may not be independent. In this subsection, we focus on
one such scenario: sources being mutually exclusive in terms of their informativeness. An
exploration of this case further illustrates how the agents’ ability to control the outcome of
a project through their efforts may affect the optimal job design.
    To formalize this idea, we make the following modification to our model. We assume that
exactly one of the two tasks associated with a given project may yield information about
the project’s underlying state. In particular, with probability 1/2, only task Tj1 can yield
information: the agent performing task Tj1 observes the state with probability α, whereas the
agent performing Tj2 never observes it. And with probability 1/2, only Tj2 is informative:
the agent performing task Tj1 never observes the state whereas the agent performing Tj2
observes it with probability α. We keep all other aspects of the model unchanged.
    Notice that in this setup, under individual assignment, the probability that an agent
observes the state of his assigned project is α. And this is also the probability that under
team accountability at least one of the two agents observes the state. However, in this
setting team assignment appears to lose its advantage in information elicitation: as the
observability of the state is mutually exclusive between tasks, should an agent observe
unfavorable information he can completely suppress it as his teammate would necessarily be
uninformed.
    One may anticipate that such complete control over the information on the state may
make team inferior to individual assignment as team still continues to suffer from diseconomies
of scope in incentive provision. However, this intuition is incomplete. Notice that
an agent controls the outcome of a project in two ways: through his reporting on the state
that affects the project’s continuation probability, and also through his effort(s) that affect(s)
the project’s output and the performance metric (should the project proceed). When the
signals are mutually exclusive, the advantage of team in muting the former channel is indeed
diminished. However, team assignment may still help information elicitation as the agent
cannot control the effort in all tasks that are associated with the project. Numerical results
suggest (see Figure 1.3) that team’s advantage in information elicitation remains sufficiently
strong even under mutually exclusive signals and, as in our main model, it may still dominate
individual assignment when both α and µ are sufficiently large.
    [Figure 1.3: Optimal job design under mutually exclusive signals across tasks (within a project). Horizontal axis: α ∈ [0, 1]; vertical axis: µ ∈ [0, 1]. Labeled regions: Team assignment, TA=IA, and Individual assignment.]
 1.6.3 Conclusion
When effective decision-making requires local information, the incentive structure in an
organization must meet two goals at once: induce the workers to exert costly effort and
truthfully report their information even if the information may be detrimental to their own
interest. This article explores how job design—allocation of tasks among workers—interacts
with such intertwined incentives. We argue that the optimal job design is shaped by a novel
tradeoff between the ease of information elicitation and diseconomies of scope in incentive
provision. And this tradeoff, in turn, is driven by the interplay between the “availability” of
the workers’ information and the “alignment” of their performance measure with the firm’s
objective. In particular, team assignment may be optimal when the performance measure
is considerably misaligned, but the workers are highly likely to be informed about the local
condition. Our findings suggest a novel explanation of why teams can offer better incentives
even when measures of individual performance remain available.


                                       CHAPTER 2
    CONTESTS WITH VALUATION ASSOCIATED WITH POPULATION
                                     UNCERTAINTY
 2.1 Introduction
The term “contest” refers to a range of circumstances in which players exert efforts to
surpass their opponents. Such circumstances include rent-seeking for rents allocated by
policymakers, firms’ advertising to compete for market shares, sports tournaments, patent
races, and even military confrontation. Starting from the seminal work by Tullock (1980)
on contest theory, a substantial literature has investigated a range of applications using the
Tullock contest success function. However, this literature typically assumes that both the
number of players and the value of the prize are fixed.
    The fact that these assumptions are overly strong is a significant problem. Players do not
always know who their competitors are. In a rent-seeking situation, a firm typically lacks
sufficient information about who and how powerful its competitors are; in a patent race, a
firm lacks information about how many other firms are applying for the same patent when
deciding how much to invest in R&D. The players may have a list of potential competitors in
a contest, but it can be tough to discern who is truly competing when they exert efforts.
Games with population uncertainty can aid in the analysis of these situations.
    A mathematical foundation for general games with population uncertainty is provided
by Myerson (1998) and Milchtaich (2004), and a contest game suits it very well. Skaperdas
(1996) axiomatizes the Tullock contest success function from several assumptions. One of
the assumptions establishes sub-contests in which some players are excluded from the game.
In an environment with a fixed number of participants, these sub-contests are manually
constructed, in which only a subset of players participate in a hypothetical contest, and a sub-
contest success function determines their chances of winning. In contrast, in an environment
where the number of participants is random, a class of contest success functions for any
number of players is well-defined. There is no need to manually construct hypothetical sub-
contests. As a result, Skaperdas’s assumption is more natural in contests with population
uncertainty than in contests without.
    Another common assumption in contest models is that the value of the prize remains
constant regardless of the environment, i.e., it is unaffected by the number of contestants.
This assumption is implicitly rooted in the classical contest models since the number of
players is fixed, and it is explicitly stated in contest models with population uncertainty.
This assumption becomes too strong in a variety of real-world scenarios.
    For example, consider a scenario where firms compete in an R&D race for a new product,
with the winner getting to launch the product first. If other firms can easily mimic the
product, they will launch similar products after the winner debuts the new product. In the
real world, after the debut of the iPad from Apple, Samsung launched the Galaxy Tab, and
Microsoft launched the Surface. In this scenario, the profit of successfully designing the new
product decreases in the number of firms that participate in the R&D race. As a result,
if the number of firms participating in the R&D race is big, the profit from launching the
new product will be small because many firms will divide the market, and the market share
of the winning firm will be small. On the other hand, if a firm is the only developer of a
new product, it will become a market monopoly after launching it, and the profit will be
significant. In this case, the value of the prize (profit of winning the R&D race) decreases in
the number of contestants.
    Cryptocurrency mining is another example. In most cases, the miner will invest resources
to obtain a cryptocurrency. A large number of miners usually leads to a competitive contest,
and the probability of a single miner getting a cryptocurrency is low. However, the large
population of miners suggests that this cryptocurrency is popular among the general public,
and its value will be substantial. In this case, the value of the prize (successfully mining for
one unit of cryptocurrency) increases in the number of contestants.
    As the assumption that the value of the prize is constant is too strong, I relax it and
assume that the value of the prize is associated with the number of players. The value of the
prize could be increasing/decreasing in the number of players, or it could be non-monotonic.
The purpose of this study is to examine contests where the value of the prize is associated
with population uncertainty and compare the effort put in under various situations.
    In this paper, I first construct a contest with population uncertainty in which the value of the
prize depends on the realization of the number of players. Then, I prove that an equilibrium
exists and is unique. I then consider the following three scenarios:
  (a) The value of the prize is constant;
  (b) The value of the prize increases as the number of players increases;
  (c) The value of the prize decreases as the number of players increases.
I assume that the expected value of the prize is the same in these three scenarios. I show that,
relative to the constant-prize benchmark, the effort level is higher when the value of the
prize increases in the number of players and lower when the value of the prize decreases
in the number of players.
     The driving force behind this result is the friction between the belief in the number of
players from a player’s perspective and the belief in the number of players from an outsider’s
perspective. From an outsider’s perspective, the distribution of the number of players is
just the prior distribution. However, from a player’s perspective, “I am in the game” is
informative, and the belief needs to be updated. As a result, when compared to the prior,
the player’s belief is skewed to the right. When the value of the prize increases in the number
of players, as the player puts more weight on the event where the number of players is large,
a player receives more rewards under the updated belief than under the prior, thus having
more incentive to exert effort. Conversely, when the value of the prize decreases in the
number of players, the same reweighting lowers the expected reward, and the player has
less incentive to exert effort.
     I then extend my analysis to the following scenarios: (i) the number of players is fixed,
and the value of the prize is constant, (ii) the number of players is random, and the value
of the prize is constant, and (iii) the number of players is random, and the value of the prize
is linear in the number of players. I also assume that these three contests have the same
expected number of players and expected value of the prize. Myerson and Wärneryd (2006)
show that expenditure under (ii) is smaller than under (i). As mentioned above, the player’s belief
is skewed to the right compared with the prior. So, the competition is more “intense” in (ii)
than in (i) from a player’s perspective; thus, the effort level is lower in (ii). Further, I show
that when the value of the prize is proportional to the number of players (linear with zero
intercept), then the effort level in (iii) is the same as in (i).
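    The comparison of (i), (ii), and (iii) can be illustrated with a worked special case. The sketch below is not the general argument: it assumes the lottery success function f(x) = x, an interior symmetric equilibrium, and (for case (i)) that the expected number of players µ is an integer. Here π is the prior over the number of players, π̃(n) = π(n)n/µ is the player's updated belief, and η is the expected value of the prize.

```latex
% Assumptions (for illustration only): f(x) = x, interior symmetric equilibrium,
% \mu integer in case (i); \tilde{\pi}(n) = \pi(n)\, n / \mu.
\begin{align*}
\text{First-order condition:}\quad
  & \sum_{n} \tilde{\pi}(n)\, v(n)\, \frac{n-1}{n^{2} x^{*}} = 1
  \;\Longrightarrow\;
  x^{*} = \frac{1}{\mu} \sum_{n} \pi(n)\, v(n)\, \frac{n-1}{n}, \\
\text{(i) } N \equiv \mu,\ v \equiv \eta: \quad
  & x^{*}_{(i)} = \frac{\eta\,(\mu - 1)}{\mu^{2}}, \\
\text{(ii) } N \sim \pi,\ v \equiv \eta: \quad
  & x^{*}_{(ii)} = \frac{\eta}{\mu}\left(1 - E_{\pi}\!\left[\frac{1}{N}\right]\right)
    \le \frac{\eta}{\mu}\left(1 - \frac{1}{\mu}\right) = x^{*}_{(i)}, \\
\text{(iii) } N \sim \pi,\ v(n) = \frac{\eta}{\mu}\, n: \quad
  & x^{*}_{(iii)} = \frac{1}{\mu} \cdot \frac{\eta}{\mu}\,(\mu - 1) = x^{*}_{(i)}.
\end{align*}
```

The inequality in (ii) is Jensen's inequality applied to the convex function 1/N, so in this special case the ranking follows directly from the reweighting of beliefs toward larger contests.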
    Related Literature: This paper contributes to the literature on games with population
uncertainty. Population uncertainty arises when the assumption that the players’ identities
are common knowledge is relaxed, and the number of players is uncertain (or stochastic).
McAfee and McMillan (1987) are the first to investigate auction models with a stochastic
number of players. They show that when bidders are risk-averse, an auction with a stochastic
number of players Pareto-dominates an auction that announces the number of players.
Harstad, Kagel, and Levin (1990) and Levin and Ozdenoren (2004) concentrate on the revenue
equivalence result when the number of bidders is uncertain in an auction. They show
that the general results of revenue equivalence extend to this setting when the bidders are
risk-neutral, but break down when the bidders are ambiguity-averse. Following that,
several scholars examine bidder preference for auction forms (Matthews, 1987), endogenize
entry decisions (Levin and Smith, 1994), and characterize information aggregation (Harstad,
Pekeč, and Tsetlin, 2008) for auctions with population uncertainty. These publications focus
on population uncertainty in auctions, with no mention of other types of games.
    Myerson (1998) provides formal definitions of games with population uncertainty, and
Milchtaich (2004) proposes a more general mathematical framework. Myerson (1998) also
points out that one particular game, the Poisson game, has the following property: a player’s
environment (the number/type of players other than herself) is the same as an external
game theorist’s perception of the whole game. Poisson games are widely used in voting theory
(Campbell, 1999; Myerson, 2000; Piketty, 2000; Myerson, 2002; Krishna and Morgan,
2012; Bouton and Castanheira, 2012; Bouton, 2013; Ekmekci and Lauermann, 2022), and
games with population uncertainty are also studied in dynamic games (Satterthwaite and
Shneyerov, 2007) and Bertrand competition (Ritzberger, 2009). This work differs from the
previous studies in that it explores games with population uncertainty in a contest
environment.
    Tullock (1980) looks at the issue of competing rent-seekers who spend resources to sway
policy outcomes. The Tullock contest success function has wide application: it may be used to
describe the relationship between advertising expenditure and market shares (Schmalensee,
1976), to describe R&D contests (Fullerton and McAfee, 1999), and to describe the outcome
of sports tournaments (Szymanski, 2003). Skaperdas (1996) axiomatizes the Tullock contest
success function from several reasonable assumptions, thus giving strong support for its use
in actual applications. Tullock contest variants have been investigated in subsequent research
(Azmat and Möller, 2009; Münster, 2007, 2009; Wasser, 2013). These articles assume a fixed
number of players, but I focus on a game with population uncertainty.
    This paper complements the works by Myerson and Wärneryd (2006), Münster (2006),
and Lim and Matros (2009). Myerson and Wärneryd (2006) is one of the first papers to
analyze contests with population uncertainty. They set up a model where the number
of players is stochastic and then show that total equilibrium expenditure is strictly lower
in a contest with population uncertainty than in a contest without population uncertainty,
even though the expected number of players is the same in both contests. Münster (2006)
considers a rent-seeking model in which a group of potential players might be active or
inactive. When the expected fraction of active players is low, a rise in the number of potential
players boosts individual rent-seeking expenditure, which is the opposite of what happens in
contests without population uncertainty. Lim and Matros (2009) investigate
a similar model where n potential players try to participate in a contest, and each player
participates with probability p. They characterize the game’s equilibrium and show that
individual spending is single-peaked in p and the total spending is monotonically increasing
in p and n. All three articles assume the prize has a fixed value, but I expand the analysis
to a scenario in which the prize’s value is contingent on the realization of the number of
players.
    This paper is structured as follows. The model is built up in Section 2.2. Section 2.3
examines different scenarios and summarizes the key findings. Section 2.4 calculates the
magnitude of this effect through numerical examples. Section 2.5 provides two applications
of the model. Section 2.6 concludes the paper. The Appendix contains all of the proofs.
 2.2 Model Setup
Consider a contest with $N$ identical risk-neutral players, where $N$ is a random variable
over $\mathbb{N} = \{1, 2, \ldots\}$. Let $\pi : \mathbb{N} \to [0, 1]$ be the prior probability distribution of $N$, so
$\sum_{n=1}^{\infty} \pi(n) = 1$. Also, define $\mu$ as the expected number of players and assume it is finite,
i.e., $\mu = \sum_{n=1}^{\infty} \pi(n)\, n < \infty$. If the support of $\pi$ contains two or more elements, population
uncertainty arises; otherwise the contest degenerates into one with a fixed number of players.
Players do not observe the realization of $N$, but the prior $\pi$ is common knowledge.
    Players compete for a single reward of value $V = v(N)$, where $v : \mathbb{N} \to \mathbb{R}_+$. All
players are identical and share the same valuation. $V$ is also a random variable since $N$ is a
random variable. Also define $\eta$ as the expected value of the reward and assume it is finite,
i.e., $\eta = \sum_{n=1}^{\infty} \pi(n)\, v(n) < \infty$.
    For any realization $n$, let $x^n = (x_1, x_2, \ldots, x_n)$ denote the efforts of the $n$ players. The
effort levels must be non-negative ($x_i \in [0, \infty)$ for all $i$). The cost of effort for player $i$ is $x_i$ itself.
Player $i$’s winning probability is determined by a contest success function $p_i^n(x^n)$. Similar to
Skaperdas (1996) and Myerson and Wärneryd (2006), I assume $\{p_i^n\}$ satisfies the following
assumptions:
(A1) $\forall n \in \mathbb{N}, \forall i \in \{1, \ldots, n\}$, $p_i^n \ge 0$; $\forall n \in \mathbb{N}$, $\sum_{i=1}^{n} p_i^n = 1$; if $x_i > 0$, $p_i^n(x^n) > 0$.
(A2) $\forall n, \forall i$, $p_i^n$ is increasing in $x_i$ and decreasing in $x_j$ for $j \ne i$.
(A3) Anonymity: for any $n \in \mathbb{N}$, for any permutation $\varphi$ of $\{1, 2, \ldots, n\}$, we have
\[
p_i^n(x_1, x_2, \ldots, x_n) = p_{\varphi(i)}^n(x_{\varphi(1)}, x_{\varphi(2)}, \ldots, x_{\varphi(n)})
\]
(A4) Consistency: for any $i \le m \le n$, for any effort levels $(x_1, x_2, \ldots, x_n)$, we have:
\[
p_i^m(x_1, x_2, \ldots, x_m) = \frac{p_i^n(x_1, x_2, \ldots, x_n)}{\sum_{j=1}^{m} p_j^n(x_1, x_2, \ldots, x_n)}
\]
The following Lemma describes a contest success function.
Lemma 5. A system of contest success functions $\{p_i^n\}$ that satisfies (A1)-(A4) must have
the following form:
\[
p_i^n(x_1, x_2, \ldots, x_n) = \frac{f(x_i)}{\sum_{j=1}^{n} f(x_j)}
\]
where $f(\cdot)$ is a positive increasing function.
    I further assume that
(A5) f (·) is twice differentiable and concave.
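    To make the functional form in Lemma 5 concrete, the sketch below evaluates the winning probabilities for an illustrative impact function f(x) = x^r with 0 < r ≤ 1. This choice of f is an assumption made here for illustration (the text only requires f to be positive, increasing, twice differentiable, and concave), and the function names are hypothetical. The sketch also checks the consistency property (A4) numerically for strictly positive efforts.

```python
import math

def success_probs(efforts, r=1.0):
    """Winning probabilities f(x_i) / sum_j f(x_j) under the assumed f(x) = x**r."""
    weights = [x ** r for x in efforts]
    total = sum(weights)
    return [w / total for w in weights]

def consistency_gap(efforts, m, r=1.0):
    """Largest violation of (A4) when the contest is restricted to the first m players."""
    full = success_probs(efforts, r)      # p_i^n for all n players
    sub = success_probs(efforts[:m], r)   # p_i^m for the sub-contest
    mass = sum(full[:m])                  # sum_{j <= m} p_j^n
    return max(abs(sub[i] - full[i] / mass) for i in range(m))

efforts = [1.0, 2.0, 3.0, 4.0]            # strictly positive, so f(x) > 0
probs = success_probs(efforts, r=0.5)
gap = consistency_gap(efforts, m=2, r=0.5)
print(probs, gap)
```

Because the ratio form cancels the common denominator, the sub-contest probabilities coincide with the renormalized full-contest probabilities up to floating-point error.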
    Time line: The time line of the contest is summarized below:
    • N = n is realized according to distribution π.
    • Without knowing the realization of N , each player i chooses an effort level xi .
    • The winner is chosen by the contest success function pni , and payoffs accrue.
    Equilibrium concept: As Myerson (1998) and Myerson and Wärneryd (2006) pointed
out, the traditional concept of Nash equilibrium and its refinements do not apply to games
with population uncertainty. In such games, a player can only be identified by her type,
instead of her name. As a result, all players of the same type must share the same strategy.
I further restrict attention to pure strategies. Thus, in the game described above, a strategy
$x$ is an equilibrium if it satisfies the following:
    • Belief $\tilde{\pi}$ satisfies Bayes’ rule:
\[
\tilde{\pi}(n) = \frac{\pi(n)\, n}{\sum_{n'=1}^{\infty} \pi(n')\, n'}
\]
    • The effort level maximizes the player’s payoff, given other players play the equilibrium
      strategy:
\[
E[u_i(x_i, x_{-i} \mid \tilde{\pi})] \ge E[u_i(x_i', x_{-i} \mid \tilde{\pi})] \quad \forall x_i'
\]
    The second condition is standard, and the first condition pins down a player’s belief
about the game she is in. In this game, players have only one type, so the belief is only
about the number of players.
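    The two equilibrium conditions can be checked numerically in a simple special case. The sketch below assumes the lottery success function f(x) = x and a two-point prior with a constant prize of 1; these choices, the closed-form expression (obtained from the first-order condition under that assumption), and all function names are illustrative rather than taken from the text. It computes the updated belief, the candidate symmetric effort, and then confirms on a grid that no unilateral deviation does better.

```python
def size_biased(prior):
    """Player's belief: pi_tilde(n) = pi(n) * n / sum_n' pi(n') * n'."""
    mu = sum(n * p for n, p in prior.items())
    return {n: n * p / mu for n, p in prior.items()}

def candidate_effort(prior, value):
    """Symmetric effort from the first-order condition, assuming f(x) = x."""
    belief = size_biased(prior)
    return sum(q * value(n) * (n - 1) / n ** 2 for n, q in belief.items())

def expected_payoff(own, others, prior, value):
    """Expected payoff of effort `own` when every rival exerts `others`."""
    belief = size_biased(prior)
    total = 0.0
    for n, q in belief.items():
        denom = own + (n - 1) * others
        win = own / denom if denom > 0 else 1.0 / n  # tie-break if nobody exerts effort
        total += q * value(n) * win
    return total - own  # cost of effort is the effort itself

prior = {2: 0.5, 4: 0.5}          # hypothetical prior over the number of players
value = lambda n: 1.0             # constant prize
x_star = candidate_effort(prior, value)

# Best response to x_star on a fine grid of own efforts in [0, 1].
grid = [k / 10000 for k in range(10001)]
best = max(grid, key=lambda x: expected_payoff(x, x_star, prior, value))
print(x_star, best)
```

In this example the candidate effort equals 5/24, and the grid search returns the nearest grid point, which is consistent with the second equilibrium condition.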
    The equilibrium concept applied in this study expands upon the Bayesian Nash Equi-
librium in the following sense: a) the player only takes other players’ strategies, and her
own belief into consideration, and b) higher-order beliefs will not affect the strategies. The
equilibrium concept is equivalent to symmetric Bayesian Nash Equilibrium if π degenerates
to a one-point distribution.
Proposition 5. An equilibrium exists and is unique.
    Proposition 5 provides a foundation for the following analysis.
 2.3 Analysis and Result
 2.3.1 Value of Prize Is Monotonic
In this subsection, I focus on scenarios where the common-valued reward v(n) is monotonic.
To be more specific, consider the following three contests:
    • Under contest C1 , v(n) = v1 (n) where v1 (n) is increasing in n, with Eπ [v1 (n)] = v.
    • Under contest C2 , v(n) = v2 (n) where v2 (n) is decreasing in n, with Eπ [v2 (n)] = v.
    • Under contest C3 , v(n) = v where the value of the reward is independent of the number
       of players.
Proposition 6. Let x∗1 , x∗2 , x∗3 be the equilibrium effort levels of the players in the three contests,
respectively. Then x∗2 ≤ x∗3 ≤ x∗1 .
    The contest C3 is a benchmark where the value of the prize is constant, and x∗3 is the
effort level of each player. When the value of the prize increases in the number of players, the
effort level is higher than the benchmark. In contrast, when the value of the prize decreases
in the number of players, the effort level is lower than the benchmark.
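    A small numerical illustration of this ranking follows. It is a sketch under assumptions not made in the text: the lottery success function f(x) = x (so that the symmetric effort has the closed form used below), a two-point prior, and three hypothetical prize schedules that share the same prior expected value of 1.

```python
def equilibrium_effort(prior, value):
    """x* = (1/mu) * sum_n pi(n) * v(n) * (n - 1) / n, assuming f(x) = x."""
    mu = sum(n * p for n, p in prior.items())
    return sum(p * value[n] * (n - 1) / n for n, p in prior.items()) / mu

prior = {2: 0.5, 4: 0.5}            # hypothetical prior over the number of players

# Three prize schedules with the same prior expected value (equal to 1).
v_increasing = {2: 0.5, 4: 1.5}     # contest C1
v_decreasing = {2: 1.5, 4: 0.5}     # contest C2
v_constant = {2: 1.0, 4: 1.0}       # contest C3 (benchmark)

x1 = equilibrium_effort(prior, v_increasing)
x2 = equilibrium_effort(prior, v_decreasing)
x3 = equilibrium_effort(prior, v_constant)
print(x2, x3, x1)
```

In this example the efforts are ordered as in Proposition 6: the decreasing schedule yields the lowest effort and the increasing schedule the highest.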
    To understand the intuition behind this effect, one needs to focus on the marginal benefit
of exerting effort. When comparing contest C1 and C3 , suppose the realization of the number
of players is high. In that case, although the probability of getting the reward is low due to
a large number of competitors, the marginal benefit of effort becomes high since the value
of the prize is high. Similarly, if the realization of the number of players is low, the marginal
benefit of effort becomes low. The updated belief π̃ puts more weight on the events where
the number of players is high. Thus, the former effect dominates the latter, so the marginal
benefit of exerting effort is higher in contest C1 than in the benchmark.
    To understand why the belief π̃ puts more weight on the events where the number of
players is high, consider a simple example. Assume that the number of players is equally
likely to be 1 or 3. The distribution would be π(1) = π(3) = 0.5 from the standpoint of
nature (an outside observer). However, from the player’s perspective, the belief would
be π̃(1) = 0.25 and π̃(3) = 0.75. The player incorporates the information “I am in the game”
into the belief. A given player is more likely to be selected into the game when the realized
number of players is high, so the updated belief shifts weight toward larger numbers of players.
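    The update in this example can be reproduced numerically. The sketch below assumes
that the player’s belief is the size-biased version of the prior, π̃(n) = nπ(n)/Σ mπ(m),
which matches the numbers above. The function name player_belief is illustrative and is
not part of the model’s notation.

```python
from fractions import Fraction


def player_belief(prior):
    """Size-biased update of a prior over the number of players.

    Assumption: a participant weights each realization n by n, i.e.
    belief(n) = n * prior(n) / sum_m m * prior(m).
    """
    total = sum(n * p for n, p in prior.items())
    return {n: n * p / total for n, p in prior.items()}


# Prior from the example: one or three players, equally likely.
prior = {1: Fraction(1, 2), 3: Fraction(1, 2)}
belief = player_belief(prior)

for n in sorted(belief):
    print(n, belief[n])
```

Exact fractions are used so that the result can be compared directly with the values
0.25 and 0.75 in the example.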
    The friction between the player’s belief and the prior has been noted but not well studied
in the economics literature. It is referred to as a “classroom size” problem by McAfee and
McMillan (1987), and Myerson (1998) shows that a player’s belief about the number of other
players coincides with an outsider’s belief about the total number of players if and only if
the distribution is Poisson; in that case, the expected number of players from a player’s
perspective is one more than from an outsider’s perspective. However, neither study focuses
on this friction.
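    The size of the gap between the two perspectives can be made explicit. The derivation
below is a sketch that assumes the player’s belief is the size-biased prior, π̃(n) = nπ(n)/µ
with µ = Eπ [n]; the symbol σ² for the variance of n under π is notation introduced here.

```latex
\begin{align*}
E_{\tilde{\pi}}[n]
  = \sum_{n} n\,\tilde{\pi}(n)
  = \sum_{n} n \cdot \frac{n\,\pi(n)}{\mu}
  = \frac{E_{\pi}[n^{2}]}{\mu}
  = \frac{\sigma^{2} + \mu^{2}}{\mu}
  = \mu + \frac{\sigma^{2}}{\mu}.
\end{align*}
```

Under this assumption, the player’s expected number of players exceeds the outsider’s by
σ²/µ, which is zero only when n is deterministic. For a Poisson distribution σ² = µ, so the
gap is exactly one.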
    The difference between a player’s belief and the prior may lead to other results. For
example, if a player is competing in a market, population uncertainty drives the expected
number of players from the player’s perspective to be higher than the expected number under
the prior. As a result, the player is always under the impression that the market is more
competitive than it actually is. Population uncertainty may thus lead to excess competition
in the market.
    This friction also indicates that the number of players is special in the setup of a game.
Suppose there is a game G1 with population uncertainty and a game G2 with uncertainty
about the underlying state of the world. In the current game theory paradigm, if the player
receives no information, the player’s belief is the same as the prior in G2, but it differs from
the prior in G1. This further suggests that the way information is processed differs depending
on whether the uncertainty is about the population or the underlying state of the world.
Why uncertainty about the population and uncertainty about the underlying state of the world
should be treated as different categories of information remains an open question.
Alternatively, if one wants to treat these two uncertainties similarly, a more general
framework for setting up a game may be needed.
 2.3.2 Value of Prize Is Linear
In this subsection, I will focus on the scenarios where the value is linear in the number of
players and compare it with contests with no population uncertainty. Consider the following
three contests:
    • Under contest C4 , the number of players is fixed at µ, and v4 (n) = v where v is a
      constant.
    • Under contest C5 , the number of players is a random number with density function π
       where Eπ [n] = µ, and v5 (n) = v where v is a constant.
    • Under contest C6 , the number of players is a random number with density function π
       where Eπ [n] = µ, and v6 (n) = a + bn where Eπ [v6 (n)] = v.
Proposition 7. Let x∗4 , x∗5 , x∗6 be the equilibrium effort levels of the players in the three
contests, respectively, and assume π is non-degenerate. Then:
    • When a > 0, b < 0, x∗4 > x∗5 > x∗6 .
    • When a > 0, b = 0, x∗4 > x∗5 = x∗6 .
    • When a > 0, b > 0, x∗4 > x∗6 > x∗5 .
    • When a = 0, b > 0, x∗4 = x∗6 > x∗5 .
    • When a < 0, b > 0, x∗6 > x∗4 > x∗5 .
Proposition 7 only focuses on contests where v > 0, since x∗ = 0 when v ≤ 0 (or players want
to quit if there is an outside option). Myerson and Wärneryd (2006) show that x∗4 > x∗5 ,
which means the effort level is lower in the contests with population uncertainty than in
contests without population uncertainty. My findings imply that when the reward value is
of the form v(n) = bn with b > 0, contests with population uncertainty will have the same
amount of effort as contests without population uncertainty.
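    Proposition 7 can be checked numerically in a special case. The sketch below assumes
the lottery contest success function with f (x) = x and the size-biased belief
π̃(n) = nπ(n)/µ; under these assumptions the symmetric first-order condition gives
µx∗ = Eπ [v(n)(n − 1)/n]. This expression is a reconstruction rather than a formula stated
in the text, although it agrees with the closed form used in the numerical example of
Section 2.4. The two-point prior and the function names are illustrative.

```python
from fractions import Fraction


def total_effort(prior, value):
    """Expected total effort mu * x* in the symmetric equilibrium.

    Assumption: lottery contest success function (f(x) = x) and
    size-biased beliefs, so that mu * x* = E_prior[v(n) * (n - 1) / n].
    """
    return sum(p * value(n) * Fraction(n - 1, n) for n, p in prior.items())


# Illustrative prior: 2 or 4 players with equal probability, so mu = 3.
prior = {2: Fraction(1, 2), 4: Fraction(1, 2)}
mu = sum(n * p for n, p in prior.items())
v = Fraction(1)  # common expected prize value


def linear_prize(b):
    """Prize a + b * n with the intercept chosen so that E[v(n)] = v."""
    a = v - b * mu
    return a, (lambda n: a + b * n)


# C4: fixed number of players mu, constant prize v.
effort_c4 = v * (mu - 1) / mu
# C5: random number of players, constant prize v.
effort_c5 = total_effort(prior, lambda n: v)

# One slope b for each of the five cases in Proposition 7.
for b in (Fraction(-1, 10), Fraction(0), Fraction(1, 5), Fraction(1, 3), Fraction(1, 2)):
    a, prize = linear_prize(b)
    effort_c6 = total_effort(prior, prize)
    print(f"a={a}, b={b}: C4={effort_c4}, C5={effort_c5}, C6={effort_c6}")
```

Because µ is the same in all three contests, ranking total effort µx∗ is equivalent to
ranking individual effort x∗. The slope b = 1/3 gives a zero intercept, which is the
proportional case v(n) = bn.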
 2.4 Numerical Example
In this section, I provide some numerically solved examples to measure the magnitude of this
effect. First assume the contest success function f (x) = x. Then fix the expected number
of players to µ, and let the distribution π be: π(µ − τ ) = α/2, π(µ) = 1 − α, π(µ + τ ) = α/2,
and π(n) = 0 for n ∉ {µ − τ, µ, µ + τ }. Both τ ∈ [0, µ − 1] and α ∈ [0, 1] measure how
dispersed the distribution is, and the variance of the distribution π is ατ². If α = 0 or
τ = 0, the distribution degenerates to a single point.
    Now fix the expected value of the prize to 1 since it will not affect the relative magnitude
of the effect. Assume the value of the prize is: v(µ−τ ) = 1−ε, v(µ) = 1, and v(µ+τ ) = 1+ε.
The parameter ε ∈ [−1, 1] measures how the value of the prize changes with the number of players.
If ε = 0, the value of the prize is constant; if ε > 0, the value of the prize is increasing in n;
and if ε < 0, the value of the prize is decreasing in n.
    I calculate the expected total efforts exerted by all players µx∗ , as it is comparable across
different µ.
\[
\mu x^{*} = \frac{\mu - 1}{\mu}
  - \frac{\tau^{2}}{(\mu - \tau)\,\mu\,(\mu + \tau)}\,\alpha
  + \frac{\tau}{(\mu - \tau)(\mu + \tau)}\,\varepsilon\alpha
\]
    The first part, (µ − 1)/µ, is the solution to the classical contest problem where the
number of players is fixed at µ. The second part, −τ²α/((µ − τ )µ(µ + τ )), represents the
effect of introducing population uncertainty with a fixed value of the prize, which was found
by Myerson and Wärneryd (2006). The third part, τ εα/((µ − τ )(µ + τ )), illustrates the effect
of introducing the assumption that the value of the prize is associated with the number of
players.
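    The formula is easy to evaluate directly. The sketch below computes its three parts
for the parameter values discussed in this section; the function name
total_effort_closed_form and the variable names are illustrative.

```python
def total_effort_closed_form(mu, tau, alpha, eps):
    """Closed-form expected total effort mu * x* for the three-point example."""
    classical = (mu - 1) / mu                                    # fixed number of players
    uncertainty = -tau**2 / ((mu - tau) * mu * (mu + tau)) * alpha  # population uncertainty
    prize_link = tau / ((mu - tau) * (mu + tau)) * eps * alpha   # prize depends on n
    return classical + uncertainty + prize_link


for mu, tau in [(3, 1), (30, 1), (30, 10)]:
    row = [total_effort_closed_form(mu, tau, alpha=1, eps=e) for e in (0, 1, -1)]
    print(mu, tau, [round(x, 6) for x in row])
```

Each printed row lists µx∗ for ε = 0, ε = 1, and ε = −1, in that order.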
    Here are several findings that can be derived from the formula:
    • The effect is significant when µ is small.
       For example, with µ = 3, τ = 1, and α = 1, associating the value of the prize with the
       number of players would increase (or decrease) the effort level by 20%.
\[
\mu x^{*} =
\begin{cases}
0.625 & \text{if } \varepsilon = 0 \\
0.75  & \text{if } \varepsilon = 1 \\
0.5   & \text{if } \varepsilon = -1
\end{cases}
\]
    • The effect is small when µ is large.
       I give two examples here. For µ = 30, τ = 1, and α = 1, the effort level changes by
       about 0.1% if the value of the prize is associated with the number of players.
\[
\mu x^{*} \approx
\begin{cases}
0.966630 & \text{if } \varepsilon = 0 \\
0.967742 & \text{if } \varepsilon = 1 \\
0.965517 & \text{if } \varepsilon = -1
\end{cases}
\]
       One may wonder how the effect changes if τ/µ stays the same. The effect is still small
       when µ is large. With µ = 30, τ = 10, and α = 1, the effect accounts for about 1.3% of
       the effort:
\[
\mu x^{*} \approx
\begin{cases}
0.9625 & \text{if } \varepsilon = 0 \\
0.975  & \text{if } \varepsilon = 1 \\
0.95   & \text{if } \varepsilon = -1
\end{cases}
\]
    When τ is fixed, τ εα/((µ − τ )(µ + τ )) goes to 0 at the rate 1/µ². When τ/µ is held
constant, τ εα/((µ − τ )(µ + τ )) goes to 0 at the rate 1/µ. In either case, the effect becomes
small as µ increases.
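    The two rates follow from rewriting the coefficient of εα. In the sketch below, the
constant c = τ/µ is notation introduced here for the case in which τ grows in proportion
to µ.

```latex
\begin{align*}
\text{fixed } \tau:\quad
  & \frac{\tau}{(\mu-\tau)(\mu+\tau)}
    = \frac{\tau}{\mu^{2}-\tau^{2}}
    = \frac{\tau}{\mu^{2}} \cdot \frac{1}{1-\tau^{2}/\mu^{2}}
    \sim \frac{\tau}{\mu^{2}}, \\
\tau = c\mu,\ c \in (0,1):\quad
  & \frac{c\mu}{\mu^{2}-c^{2}\mu^{2}}
    = \frac{c}{1-c^{2}} \cdot \frac{1}{\mu}.
\end{align*}
```

For µ = 30 and τ = 10, c = 1/3 and the second expression equals (3/8)/30 = 0.0125, which is
the gap between the ε = 0 and ε = 1 values reported above.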
    The intuition behind this is the same as for the main result. The player’s belief is
skewed to the right compared with the prior, but the relative magnitude of the difference
between the player’s belief and the prior becomes smaller as µ becomes larger. For µ = 3,
τ = 1, and α = 1, the expected number of players is 3 from an outsider’s perspective and 3.33
from a player’s perspective. This relative difference is quite large. If µ = 30, τ = 1, and
α = 1, the expected number of players would be 30 and 30.033, respectively, and the relative
difference becomes small. Thus, the effect of associating the value of the prize with the
number of players is large if µ is small, and vice versa.
 2.5 Applications
 2.5.1 Promoting Effort Levels
Consider the following scenario: an organization wants to elicit effort from a group of
potential participants who may or may not be interested in the activity. Because the
organization does not know in advance who is interested, the number of participants is
uncertain. The effort either does not directly benefit the player or generates little benefit
compared to its cost, but the organization benefits from it. As a result, the organization
must incentivize the players to exert effort. I also assume that the organization is unable
to provide individual incentives to the players. The reason could be that the effort level is
difficult to contract on or that the nature of the organization does not lend itself to issuing
such incentives. I give some examples to illustrate this scenario.
    Example 1: A firm wishes to promote effective communication skills among its depart-
ments. The ability to communicate will improve the department’s overall efficiency but will
not improve the efficiency of an individual worker. There are many workers, and each worker
may be interested or uninterested in learning the communication skill. Hence, the number
of workers willing to put forth the effort to learn is uncertain. Furthermore, it is difficult to
provide individual incentives for workers to learn such skills: a worker’s effort level is
difficult to measure and contract on, and it is unreasonable to link a worker’s wage to it. The
company wants to encourage its workers to learn as much as possible.
     Example 2: An environmental organization seeks to encourage future farmers to adopt
new environmentally friendly technology, and the potential farmers may be obstinate or
open-minded. The proper application of the technology will benefit both the environment and
human welfare. However, farmers must learn how to use the technology and
apply it to their own farms to reap the greatest environmental benefits. The goal of the
organization is to improve the environment; thus, the more farmers learn, the greater the
benefit to the organization. Because learning is difficult to quantify, it is hard to provide
direct subsidies, so the organization needs to find another way to incentivize the farmers.
     One possible way for the organization to address the issue described above would
be to host a tournament or competition based on the amount of effort each player puts in. The
winner receives a monetary prize, so the tournament functions similarly to an all-pay auction.
The details of the tournament are not important in this paper, but I assume that
the process of selecting the winner satisfies (A1)–(A4). As a result, the organization
can avoid the cost of measuring each player’s effort level, because selecting a winner requires
less information than knowing each player’s effort level.
     Now compare two scenarios in which the award is fixed versus variable. For simplicity,
I assume that the variable prize equals the number of players multiplied by a constant (that is,
it is proportional to the number of players). According to the main model’s analysis, players
facing a variable award will exert more effort for the same expected monetary amount. Thus,
the organization can elicit more effort under the same expected payment.
    Myerson and Wärneryd (2006) show that, given the same expected number of players,
the effort level in the scenario with population uncertainty is less than the effort level in
the scenario without population uncertainty for a fixed monetary award. In this paper, I
demonstrate that the effort level can be restored, at least partially, by making the monetary
award an increasing function of the number of players while keeping the expected value of the
prize the same.
 2.5.2 Design Competition
Consider a design competition in which companies compete to design new products by in-
vesting in R&D. There are two kinds of products: those that are difficult to imitate and
those that are easy to imitate. Because companies may have hidden development initiatives
that are only revealed after success, the number of companies that participate in the R&D
of a certain product is random.
    Consider a product that is hard to imitate, and assume that the new market has a unit
demand of p = 1 − q. The firm that first completes the product design
will have monopoly power in the market. The solution to the monopoly problem would
be p = q = 1/2, and the company’s profit would be 1/4. Because the product is difficult to
imitate, profit will be zero for firms that are not the first to design it. As a result, the net
payoff of designing a successful product is π^d(n) = 1/4, and it is constant regardless of the
number of competitors.
    Now consider a product that is easy to imitate and has the same unit demand p = 1 − q.
Because the product is easy to imitate, the firm that initially designs it will benefit from
designing it for a short time, but not for a long time. Other businesses will follow suit
after the debut of the new product. To simplify things, I assume that the company that
successfully designed the product will be a Stackelberg leader for this product. In the
n-player Stackelberg game, the Stackelberg leader will receive S(n) = 1/(4n), while the
followers will each receive F (n) = 1/(4n²). As a result, the net payoff of designing a
successful product is π^s(n) = S(n) − F (n) = (n − 1)/(4n²). The net payoff depends on the
realized number of competitors, and for n ≥ 2 it decreases as the number of competitors
increases.
     Since π^s(n) = (n − 1)/(4n²) < 1/4 = π^d(n) for all n > 1 and lim_{n→∞} π^s(n) = 0, it can
be shown that E[π^s(n)] < E[π^d(n)] for every non-degenerate distribution of n. That is,
assuming the market has the same demand, the expected net payoff of designing an
easy-to-imitate product is always less than the expected net payoff of designing a
difficult-to-imitate product. As a result, the company will invest more resources in the
research and development of difficult-to-imitate products.
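    The Stackelberg payoffs used above can be checked with a short computation. The
sketch below rests on assumptions that the text does not spell out: inverse demand
p = 1 − Q in total quantity Q, zero marginal cost, one leader, and n − 1 followers who
play Cournot after observing the leader. The function name stackelberg_profits and the
intercept parameter are illustrative.

```python
from fractions import Fraction


def stackelberg_profits(n, intercept=Fraction(1)):
    """Leader and follower profits with one leader and n - 1 followers.

    Assumptions (not stated in the text): inverse demand p = intercept - Q,
    zero marginal cost, and followers playing Cournot after the leader moves.
    """
    q_leader = intercept / 2                  # leader's optimal quantity in this setup
    q_follower = (intercept - q_leader) / n   # each follower's equilibrium quantity
    price = intercept - q_leader - (n - 1) * q_follower
    return q_leader * price, q_follower * price


for n in range(2, 7):
    leader, follower = stackelberg_profits(n)
    net = leader - follower
    assert leader == Fraction(1, 4 * n)         # S(n) = 1/(4n)
    assert follower == Fraction(1, 4 * n**2)    # F(n) = 1/(4n^2)
    assert net == Fraction(n - 1, 4 * n**2)     # net payoff of the first designer
    print(n, leader, follower, net)
```

Under these assumptions the computed profits coincide with S(n), F (n), and the net payoff
(n − 1)/(4n²) for every n in the loop.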
     Now, suppose that the product that is easy to imitate has a larger market. Assume the
product that is difficult to imitate still has unit demand p = 1 − q, but the product that
is easy to imitate has a demand of p = b − q where b > 1. To make things comparable, I
assume that E[π^s(n)] = E[π^d(n)]. That is, the expected net payoffs of the two products are
equal. A naive intuition would suggest that the incentives to invest in both items are the
same. However, based on the main model’s analysis, the firm will continue to invest more
in the product that is hard to imitate. This result explains why firms invest more in
hard-to-imitate products, even when the expected net payoff is the same: the net payoff of
an easy-to-imitate product is decreasing in the realized number of competitors, which leads
to lower R&D investment in easy-to-imitate products.
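    A small numerical illustration of this comparison follows. It is a sketch under several
assumptions that are not taken from the text: a two-point prior over the number of firms,
the lottery contest success function with f (x) = x (so that total effort equals
E[v(n)(n − 1)/n]), and Stackelberg payoffs that scale with b² when demand is p = b − q. All
names are illustrative.

```python
from fractions import Fraction

# Illustrative prior over the number of competing firms (an assumption).
PRIOR = {2: Fraction(1, 2), 4: Fraction(1, 2)}


def expected(payoff):
    """Expectation of payoff(n) under the prior."""
    return sum(p * payoff(n) for n, p in PRIOR.items())


def total_effort(prize):
    """Assumed lottery contest: mu * x* = E[v(n) * (n - 1) / n]."""
    return expected(lambda n: prize(n) * Fraction(n - 1, n))


def hard(n):
    """Net payoff of the hard-to-imitate product, constant at 1/4."""
    return Fraction(1, 4)


def easy_unit(n):
    """Net payoff of the easy-to-imitate product with demand p = 1 - q."""
    return Fraction(n - 1, 4 * n**2)


# Rescale the easy-to-imitate payoff so that both expected payoffs coincide.
# Assumption: with demand p = b - q the Stackelberg payoffs scale with b**2.
scale = expected(hard) / expected(easy_unit)


def easy(n):
    """Net payoff of the easy-to-imitate product in the larger market."""
    return scale * easy_unit(n)


effort_hard = total_effort(hard)
effort_easy = total_effort(easy)
print("b**2 =", scale)
print("hard to imitate:", effort_hard)
print("easy to imitate:", effort_easy)
```

In this illustration the two products have the same expected net payoff, yet total R&D
effort is lower for the easy-to-imitate product, in line with the argument above.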
 2.6 Conclusion
In this paper, I explore contests with population uncertainty, in which the value of the prize
depends on the number of players. When population uncertainty arises, the player’s belief
about the number of players is skewed to the right compared with the prior. This friction drives
my results.
     Assuming the expected number of players and the expected value of the prize stay the
same, I compare the following three scenarios:
  (a) the value of the prize is constant,
  (b) the value of the prize increases as the number of players increases, and
  (c) the value of the prize decreases as the number of players increases.
I find that the effort level is highest under (b) and lowest under (c).
     I also compare the following three environments:
   (i) the number of players is fixed, and the value of the prize is constant,
  (ii) the number of players is random, the value of the prize is constant, and
 (iii) the number of players is random, and the value of the prize is linear in the number of
        players.
I find that if the value of the prize is proportional to the number of players (linear with zero
intercept), the effort level is the same under (i) and (iii).
     These analyses have many applications. One possible situation is that exerting effort has
certain positive externalities, but it is impossible to provide an incentive for each potential
player individually. Then, a contest whose prize escalates as the number of players
grows could be a viable option. Another situation would be a design competition. There
are two new products to develop: H is hard to imitate, and E is easy to imitate. Even if
E has a larger market, so that the expected profit of the two products is equal, the firm
will nevertheless invest more in H because the net value of E declines as the
number of competitors increases.
                                        CHAPTER 3
   ASSIMILATION WITH DIFFERENT WORKING SKILL ACQUISITION
 3.1 Introduction
Discrimination between different groups is a widespread phenomenon around the world. For
example, before WWII, the German government discriminated against Jews to the extreme
and sought to eliminate them. In contrast, during the first
half of the 20th century, many European immigrants went to the US and experienced little
discrimination. Therefore, the driving force behind discrimination in different countries is
an important topic.
    Recent literature focuses on the discrimination against people with low average working
skill levels but seldom studies the discrimination against people with high average working
skills. For example, the “Acting White” phenomenon is well studied by Eguia (2017) and
Advani and Reich (2015). Both papers propose a two-stage game. In the first paper,
an agent of the advantaged group chooses a discrimination level in the first stage, and in
the second stage, all agents choose a skill level, and agents from the disadvantaged group
choose their self-identity. In the second paper, agents from the minority
group choose their identity in the first stage, and all agents select their skill
level in the second stage. In both models, individuals from minority groups face a trade-off
between cultural and economic incentives: assimilation yields an economic benefit, while
non-assimilation avoids the cost of assimilating. These studies explain the discrimination
against minorities with lower average working skill levels.
    However, discrimination against minorities with high working skill levels does exist. For
example, in the US, Asian Americans have a higher secondary school completion rate than
white people (Espinosa, Turk, Taylor, and Chessman, 2019). The positive and negative
dichotomy of Asian American stereotypes has been well documented (Fiske, Cuddy, Glick,
and Xu, 2002; Gilbert, 1951; Ho and Jackson, 2001; Jackson, Hodge, Gerard, Ingram, Ervin,
and Sheppard, 1996; Karlins, Coffman, and Walters, 1969; Katz and Braly, 1933). They are
stereotyped as intelligent, diligent, hard-working, self-disciplined, good at math and sciences
(implying competence), but quiet, shy, unpopular, reserved, traditional, and placing less
value on a leisurely life. Lai and Babcock (2013) studied how White male and
female evaluators perceive an Asian American versus White job candidate on the dimensions
of competence and social skills and how these perceptions affect evaluators’ decisions in hiring
and promotion. They found that female evaluators were less likely to select Asian than White
candidates for positions involving social skills and were less likely to promote Asian than
White candidates into these positions. These studies give us an example of discrimination
against minorities with high working skill levels.
    To understand the discrimination between different groups, “self-identity” is an
essential concept. Akerlof and Kranton (2000) pointed out the important relationship between
self-identity and economic outcomes. The choice of self-identity affects the agent’s utility
function, so the choice changes the payoff of the agent herself and the payoff of other agents.
Furthermore, the collective choice of self-identity may change the social norms, affecting
identity-based preferences. This matters because discrimination against people is both
discrimination against race and discrimination against group choice. A person born in a
family of a minority group can still choose the majority as “self-identity” and thus share the
same culture with the majority group.
    Empirical results suggest that “self-identity” plays an essential role in the utility
function. Benjamin, Choi, and Strickland (2010) conducted experiments to show that the social
identity of an agent can affect her preferences. The discount factor and propensity to save
are affected by the choice of the majority group. Asian American subjects exhibit more
patient preferences when their ethnicity is made salient. Similarly, black subjects with
long-standing roots in the United States become more patient when their race becomes salient.
There is also suggestive evidence that native blacks become more risk-averse and whites
become more patient when their racial identity is salient.
    To understand the discrimination among groups, one needs to figure out why there are
differences between different groups’ working skill levels and how “self-identity” affects utility.
I model the difference in working skill levels as arising from a difference in discount
factors among groups. “Self-identity” affects the utility function because there is a network
effect within groups.
    The discount factor is interpreted broadly here. There are many estimates of the discount
factor, and the results vary widely. The estimates in Hausman
(1979), Moore and Viscusi (1990), Dreyfus and Viscusi (1995), Pender (1996), Coller and
Williams (1999), and Harrison, Lau, and Williams (2002) range from 0.53 to 0.99. For example,
Pender (1996) estimates the discount factor in India, and Harrison, Lau, and Williams (2002)
estimate the discount factor in Denmark. The first study obtains a result between 0.59 and
0.79, and the second obtains 0.78. This suggests that different groups have different
discount factors.
    The discount factor in my model can be interpreted in several ways. A high discount factor
can be thought of as placing a large weight on the future. It can also reflect a
longer expected life, since the longer the life is, the more weight an agent puts on
future utility. It can also reflect cultural norms. For example, as noted above, Asian
Americans are stereotyped as placing less value on leisure, which corresponds to a larger
discount factor.
    There is evidence that networks play an essential role in the choice of assimilation. Verdier
and Zenou (2017) studied the relationship between social networks and cultural assimilation.
They show that agents at the center of the network have more incentive to assimilate
than agents at the margin of the network. They also show that more people choose to
assimilate in a denser network (where interaction between agents is stronger).
    The utility of an agent depends on her group’s average working skill and on
how large the group is. The larger the group is, the more benefit an agent can derive
from being a member. Currarini, Jackson, and Pin (2009) find three significant results: first,
larger groups (measured as a fraction of the population of their respective schools) form a
greater fraction of their friendships with people of their same type; second, larger groups
form significantly more friendships per capita, that is, members of a group that comprises a
small minority in a school form roughly six friendships per capita, while members of groups
that comprise large majorities (close to 100 percent of a school) form on average more than
eight friendships; third, groups tend to form same-type friendships at rates that exceed the
relative fractions in the population. These results support the assumption that the utility
function depends on the group’s size.
     The analysis proceeds as follows. Section 3.2 sets up the model with a network effect and
two groups that have different discount factors. Section 3.3 solves the game. Section 3.4 solves
the game for an explicit functional form. Section 3.5 presents the comparative statics, which
explain the main result. Section 3.6 proposes some testable results. Section 3.7 discusses
some further extensions. Section 3.8 concludes.
 3.2 Model Setup
 3.2.1 Players
Consider a society with a continuum of agents. Each agent is identified by her background
and her ability. Assume that the set of possible backgrounds is {A, I}, where A represents
the majority group, and I the minority group. Assume that the set of possible abilities is
[0, 1]. Let N = {A, I} × [0, 1] denote the set of players. For each background J ∈ {A, I},
let NJ = J × [0, 1] denote the subset of agents with background J .
     Assume the measure of N , denoted m(N ), is equal to 1, and that the distribution of
agents is uniform over N . Also assume that m(NI ) = m, where m ∈ (0, 1/2), and m(NA ) = 1 − m.
That is, the majority group is larger than the minority group. The distribution
of ability conditional on background J is uniform over [0, 1].
     For any i ∈ N , let θi ∈ [0, 1] denote the ability of agent i. Individual ability is private
information.
 3.2.2 Lifetime of the Agent
Each agent lives for two stages. In stage 1, the agent is young, and she
spends her time learning working skills si and enjoying leisure. In stage 2, the agent
becomes an adult and chooses whether to assimilate. The agent then works, and
payoff accrues.
Skill Level
Agents acquire working skills when they are young. I normalize the time agents have to 1
when they are young. Each agent i ∈ N chooses her leisure time li and spends the
rest of her time 1 − li learning. Given her ability θi , the working skill level agent i
acquires is si = θi (1 − li ).
Discount Factor
Agents with different backgrounds have different time preferences. Assume that agents with
background A have discount factor βA and agents with background I have discount factor
βI . I assume that βA < βI < 1 in this paper; that is, the minority group puts more weight
on future utility.
 3.2.3 Choice of Social Group
Assume that there are two self-identity groups, A and I, characterized by two sets of social
norms and actions expected from their members. In-group networks are strong, and the
networks across groups are very small.
    In stage 2, assume that agents with background A identify themselves as A, so that NA ⊆ A.
Assume that any agent with background I can choose to belong to social group I at no cost,
or she can embrace the cultural norms of group A to then join A. Let ai ∈ {0, 1} be the
choice of agent i ∈ NI . Let ai = 0 denote that i ∈ NI chooses to be part of group I and
not to assimilate, and let ai = 1 denote that agent i ∈ NI chooses to adopt the majority
cultural norms and to become a member of the majority group A. If ai = 1, I say that
i ∈ NI “assimilates.”
 3.2.4 The Cost of Assimilation
The cost of assimilation is d for agent i, where d ∈ R+ is the difficulty of assimilation
to become a member of A. This difficulty of assimilation d is an endogenous, strategic
variable. It can be interpreted as the level of discrimination: if agents with background A
are welcoming to those who assimilate, d is small; if agents with background A are hostile,
or if they give the cold shoulder to those who are trying to assimilate, then d is high.
    The level of d is chosen endogenously in the model by an agent with background A. In
the setup, I assume that agents with background A collectively choose an agent h ∈ NA as a
representative. The agent h will then choose the discrimination level d. As shown in Section
3.3, all agents with background A will share the same optimal choice, so the mechanism of
choosing the representative h will not affect the equilibrium.
 3.2.5 Network Effect
Agents will benefit from the social group network effect. Agents in the same social group
share the same behavior and culture, so they are closely connected. With a larger
group size, each member of the group benefits more.
    Mathematically, I use a function f (mJ ) of the group size mJ to model the network effect.
I assume that f (0) = 0, f ′(·) ≥ 0, f ′(0) ≤ 1, and f ″(·) ≤ 0. That is, the larger the group
is, the greater the network effect. Also, the marginal benefit of the network effect is
decreasing.
 3.2.6 Timing of the Game
The timing of the game is as follows:
    1. For any agent i ∈ N , i chooses the leisure time li when they are young and acquires
working skill si accordingly. All agents will act simultaneously.
    2.1 All agents become adults and observe the working skill si of other agents. Agents
with background A choose a representative h ∈ NA and h chooses the discrimination level d.
    2.2 All agents observe d. Agents with background I make an assimilation choice ai based
on the information they have. All agents with background I will act simultaneously. Payoffs
accrue.
 3.2.7 Utility Functions
In stage 1, agent i ∈ N derives utility from leisure time li , given by U_i^1(l_i) = Log(l_i).
    In stage 2, agents become adults and start working. For each social group J ∈ {A, I},
let sJ be the average working skill of agents in J and mJ be the size of the group. Assume
that an agent i with skill si in social group J ∈ {A, I} with average skill sJ and size mJ
derives a utility f (mJ )sJ si . In addition, agent i may experience costs of assimilation.
    Let Ui2 (d, ai ) denote the utility function of agent i in stage 2 as a function of the discrim-
ination level d and the assimilation decisions ai . I can fix ai = 0 exogenously for any i ∈ NA ,
then the utility in stage 2 of an agent i in social group J ∈ {A, I} can be written as:
\[ U_i^2(d, a_i) = \log\left[ f(m_J)\, s_J s_i - a_i d \right] \]
    The agents are impatient and agents with different backgrounds have different discount
factors. Let βA denote the discount factor for agents with background A and βI denote
the discount factor for agents with background I. I assume that βA < βI < 1.
    Combining the two stages, the utility of an agent i with background J ∈ {A, I} in social
group J ∈ {A, I} can be written as:
\[ U_i(l_i, d, a_i) = \log(l_i) + \beta_J \log\left[ f(m_J)\, s_J s_i - a_i d \right] \]
     This completes the definition of game Γm,βA ,βI = (N, S, U ).
 3.3 Solution to the Game
I will solve the game by backward induction. In stage 1, every agent i needs to choose her
leisure time li . In stage 2, the representative agent h with background A will choose the
discrimination level d and every agent with background I will choose the assimilation action
ai .
     Using backward induction, I will first characterize how agents make the assimilation
decision in stage 2. I will then find the best choice of discrimination level d. After solving
these, I will characterize the choice of leisure time for all agents.
 3.3.1 Choice of Assimilation
Agents with background I make the assimilation choice simultaneously. The proposition
below identifies the structure of equilibria.
Proposition 8. For any bounded measurable function s over N , for any discrimination level
d ∈ R+ , there exists c ∈ (0, 1] and p ∈ [0, 1] such that
\[
a_i(d, s_i) =
\begin{cases}
1 & \text{if } s_i > c \\
0 & \text{if } s_i < c \\
1 \text{ with probability } p, \text{ and } 0 \text{ with probability } 1 - p & \text{if } s_i = c
\end{cases}
\]
constitutes an equilibrium.
    The proposition guarantees the existence of equilibrium but not uniqueness. In general,
the uniqueness in stage 2 cannot be guaranteed since the distribution of working skills s over
all agents N can be any function.
    I focus on one specific functional form of the distribution of working skills s, which will
turn out to be the on-path equilibrium outcome. The working skill distribution takes the form
\[
s_i =
\begin{cases}
\alpha_A \theta_i & \text{if } i \in N_A; \\
\alpha_I \theta_i + s_0 & \text{if } \theta_i \ge \theta_0 \text{ and } i \in N_I; \\
\alpha_I \theta_i & \text{if } \theta_i < \theta_0 \text{ and } i \in N_I
\end{cases}
\]
for some αA , αI ∈ (0, 1), s0 ∈ [0, 1], θ0 ∈ [0, 1]. Let S denote the set of all working skill
distributions of this functional form.
Corollary 1. For any working skill distribution s ∈ S, for any discrimination level d ∈ R+ ,
there exists c ∈ (0, 1] such that
\[
a_i(d, s_i) =
\begin{cases}
1 & \text{if } s_i \ge c; \\
0 & \text{if } s_i < c
\end{cases}
\]
constitutes an equilibrium.
    The uniqueness still cannot be guaranteed for this specific functional form since it would
depend on the functional form of the network effect f . However, since there is no point mass
in the working skill distribution, the equilibrium structure can be pinned down to the form
above. Furthermore, the cutoff strategy simplifies my analysis because the representative of
NA can indirectly choose the cutoff θc by directly choosing the discrimination level d.
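    A brief sketch of why a cutoff arises, under the assumption (made here for illustration) that
each agent is negligible and takes the group sizes and average skills as given: an agent with
background I compares her stage-2 payoff from assimilating with that from staying in I, where
both logarithm arguments are taken to be positive. The symbol c below is the indifference
threshold implied by this comparison.

```latex
\log\left[ f(m_A)\, s_A s_i - d \right] \ge \log\left[ f(m_I)\, s_I s_i \right]
\iff \left[ f(m_A)\, s_A - f(m_I)\, s_I \right] s_i \ge d
\iff s_i \ge c := \frac{d}{f(m_A)\, s_A - f(m_I)\, s_I}
\quad \text{when } f(m_A)\, s_A > f(m_I)\, s_I .
```

Since mA , sA , mI and sI themselves depend on who assimilates, c is pinned down only by a
fixed-point condition, which is consistent with uniqueness not being guaranteed.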
    Let C(s, d) denote the correspondence such that, for every element c ∈ C(s, d), the action
profile ai (d, si ) with cutoff point c constitutes an equilibrium for working skill distribution
s ∈ S and discrimination level d ∈ R+ .
 3.3.2 Choice of Discrimination Level
The choice of discrimination level d is determined by the representative agent h with
background A. The representative agent faces the problem:
\[ \max_{d \in [0, \infty)} f(m_A)\, s_A s_h \]
    For agent h, sh is fixed, so the maximization problem is equivalent to:
\[ \max_{d \in [0, \infty)} f(m_A)\, s_A \]
    For any agent i ∈ NA , the utility maximization problem will be the same. That is, all
agents with background A share one preference profile. I assume that the representative h is
randomly chosen from all agents with background A. The choice of d will not be affected by
choice of representative h. Thus, the mechanism of choosing h will not affect the equilibrium,
as I discussed before.
    The following proposition establishes the existence of d∗ for any working skill distribution
s ∈ S.
Proposition 9. For any working skill distribution s ∈ S, there is a discrimination level d∗
along with a cutoff c∗ ∈ C(s, d∗ ) such that the profile in which the representative h chooses
discrimination level d∗ and agents with background I choose the action profile
\[
a_i(d^*, s_i) =
\begin{cases}
1 & \text{if } s_i \ge c^*; \\
0 & \text{if } s_i < c^*
\end{cases}
\]
constitutes an equilibrium.
 3.3.3 Choice of Working Skill
When choosing working skills, I assume that agents are sequentially rational and update
their beliefs according to Bayes’ rule. I can characterize the equilibrium as follows:
Proposition 10. At any equilibrium,
\[ s_i = \frac{\beta_J}{1+\beta_J}\,\theta_i \]
for any agent i in group J with ai = 0, and
\[ s_i = \frac{\beta_I}{1+\beta_I}\,\theta_i + s^* \]
for any agent with ai = 1, where
\[ s^* = \frac{d^*}{(1+\beta_I)\, f(m_A)\, s_A} \]
is a constant.
    By the proposition, the on-path choice of working skill will be in the set S at any
equilibrium, which validates my definition of S. On the other hand, I cannot expand S to the
set of all functions on N , since I need measurable functions to calculate the average working
skill level.
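    The closed form in Proposition 10 can be checked numerically. The sketch below is only
illustrative: it assumes a time budget li + si = θi (an assumption that is consistent with the
closed form but is not restated in this section), treats K = f (mA )sA as a given constant, and
uses hypothetical function names. It compares a brute-force maximizer of
Log(li ) + β Log[K si − d] with the formula β/(1 + β) θi + d/((1 + β)K).

```python
import math


def lifetime_utility(s, theta, beta, K, d):
    """Log(l) + beta * Log(K*s - d), with l = theta - s (ASSUMED time budget)."""
    leisure = theta - s
    payoff = K * s - d
    if leisure <= 0 or payoff <= 0:
        return -math.inf  # outside the domain of the logarithms
    return math.log(leisure) + beta * math.log(payoff)


def closed_form_skill(theta, beta, K, d):
    """Skill level from Proposition 10; d = 0 gives the non-assimilating case."""
    return beta / (1 + beta) * theta + d / ((1 + beta) * K)


def grid_argmax_skill(theta, beta, K, d, n=20000):
    """Brute-force maximizer of lifetime_utility over a grid on (0, theta)."""
    candidates = (theta * k / n for k in range(1, n))
    return max(candidates, key=lambda s: lifetime_utility(s, theta, beta, K, d))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the text.
    theta, beta, K, d = 0.8, 0.6, 0.05, 0.004
    print(grid_argmax_skill(theta, beta, K, d), closed_form_skill(theta, beta, K, d))
```

Because the objective is a sum of logarithms of affine functions of si , it is strictly concave on
its domain, so the grid maximizer should agree with the closed form up to the grid spacing.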
Proposition 11. There exists an equilibrium for the game Γm,βA ,βI .
    The uniqueness of equilibrium cannot be guaranteed, and it will highly depend on the
functional form of f . Therefore, in the next section, I will solve the game with a specific f .
 3.4 Numerical Example
I assume the explicit functional form f (m) = m − ½ m² for the network effect. It is easy to
check that this function satisfies the assumptions above. I will solve for the equilibrium in
this specific case.
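    For completeness, the check can be written out as follows, assuming group sizes lie in
[0, 1] (that is, total population normalized to one, which is an assumption of this sketch):

```latex
f(m) = m - \tfrac{1}{2} m^2 : \qquad
f(0) = 0, \qquad
f'(m) = 1 - m \ge 0 \ \text{ for } m \in [0, 1], \qquad
f'(0) = 1 \le 1, \qquad
f''(m) = -1 \le 0 .
```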
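Proposition 12. For f (m) = m − ½ m² and any βA , βI ∈ (0, 1), m ∈ (0, ½), equilibrium
exists. An equilibrium can be characterized by a pair (θ∗ , s∗ ) where the on-path action profiles
would be:
\[
s_i =
\begin{cases}
\frac{\beta_A}{1+\beta_A}\,\theta_i & \text{if } i \in N_A; \\
\frac{\beta_I}{1+\beta_I}\,\theta_i + s^* & \text{if } \theta_i \ge \theta^* \text{ and } i \in N_I; \\
\frac{\beta_I}{1+\beta_I}\,\theta_i & \text{if } \theta_i < \theta^* \text{ and } i \in N_I.
\end{cases}
\]
\[
d^* = \frac{1}{4}(1 + m\theta^*)\left[ \frac{\beta_A}{1+\beta_A}(1-m) + \frac{\beta_I}{1+\beta_I}\, m\left(1 - (\theta^*)^2\right) + 2 m s^* (1 - \theta^*) \right](1+\beta_I)\, s^* .
\]
\[
a_i(s_i) =
\begin{cases}
1 & \text{if } s_i \ge \frac{\beta_I}{1+\beta_I}\,\theta^* + s^*; \\
0 & \text{if } s_i < \frac{\beta_I}{1+\beta_I}\,\theta^* .
\end{cases}
\]
    (θ∗ , s∗ ) can be calculated by the following inequalities:
\[
\left\{
\begin{aligned}
2P &\le (1 + m\theta^*)^2 (\gamma_I \theta^* + s^*) \\
2P &\ge (1 + m\theta^*)^2\, \gamma_I \theta^* \\
(1+\beta_I)\, s^* P &\le (P - Q)(\gamma_I \theta^* + s^*) \\
(1+\beta_I)\, s^* P &\ge (P - Q)\, \gamma_I \theta^*
\end{aligned}
\right.
\]
where
\[
\begin{aligned}
P &= \frac{1}{4}(1 + m\theta^*)\left[ \frac{\beta_A}{1+\beta_A}(1-m) + \frac{\beta_I}{1+\beta_I}\, m\left(1 - (\theta^*)^2\right) + 2 m s^* (1 - \theta^*) \right] \\
Q &= \frac{\beta_I}{4(1+\beta_I)}\,(2 - m\theta^*)\, m\, (\theta^*)^2
\end{aligned}
\]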
    The proposition characterizes the equilibrium and provides a way to calculate it.
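    One way the conditions of Proposition 12 might be evaluated numerically is sketched below.
This is not necessarily the procedure used to produce the figures. The function and class names
are hypothetical, the grid bounds for (θ∗ , s∗ ) are an illustrative choice, and the code assumes
γJ = βJ /(1 + βJ ), since γI is used in the inequalities without being restated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateCheck:
    P: float
    Q: float
    d_star: float
    satisfies_inequalities: bool


def check_candidate(theta_star, s_star, beta_A, beta_I, m):
    """Evaluate P, Q, d* and the four inequalities for a candidate (theta*, s*)."""
    # ASSUMPTION: gamma_J = beta_J / (1 + beta_J).
    gamma_A = beta_A / (1 + beta_A)
    gamma_I = beta_I / (1 + beta_I)

    bracket = (gamma_A * (1 - m)
               + gamma_I * m * (1 - theta_star ** 2)
               + 2 * m * s_star * (1 - theta_star))
    P = 0.25 * (1 + m * theta_star) * bracket
    Q = 0.25 * gamma_I * (2 - m * theta_star) * m * theta_star ** 2
    d_star = (1 + beta_I) * s_star * P

    skill_below = gamma_I * theta_star            # skill just below the cutoff
    skill_above = gamma_I * theta_star + s_star   # skill at the cutoff
    scale = (1 + m * theta_star) ** 2

    ok = (scale * skill_below <= 2 * P <= scale * skill_above
          and (P - Q) * skill_below <= d_star <= (P - Q) * skill_above)
    return CandidateCheck(P, Q, d_star, ok)


def scan(beta_A, beta_I, m, steps=100):
    """Hypothetical grid search over (theta*, s*) in [0, 1] x [0, 1]."""
    hits = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            theta_star, s_star = i / steps, j / steps
            result = check_candidate(theta_star, s_star, beta_A, beta_I, m)
            if result.satisfies_inequalities:
                hits.append((theta_star, s_star, result.d_star))
    return hits


if __name__ == "__main__":
    hits = scan(beta_A=0.1, beta_I=0.8, m=0.3)
    if hits:
        print("theta range:", min(h[0] for h in hits), max(h[0] for h in hits))
        print("d range:", min(h[2] for h in hits), max(h[2] for h in hits))
```

The ranges reported by such a scan depend on the grid resolution, so they should be read as
approximations of (dmin , dmax , θmin , θmax ) rather than exact values.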
       (a) Ability Cutoff (βA = 0.1, βI = 0.2)      (b) Ability Cutoff (βA = 0.1, βI = 0.8)
       (c) Ability Cutoff (βA = 0.5, βI = 0.6)      (d) Ability Cutoff (βA = 0.5, βI = 0.8)
               Figure 3.1: The range of ability cutoff in different group sizes (m).
 3.5 Comparative Statics
Based on the equilibrium calculated in Section 3.4, I now discuss some comparative statics
of the equilibrium. The equilibrium is not unique even with an explicit functional form of
the network effect. For every triple of parameters (βA , βI , m), I calculated (dmin , dmax , θmin , θmax ).
Note that under some conditions there may exist an equilibrium in which the two groups are
separated and no assimilation happens. This equilibrium is not the main focus of this paper,
so the discussion below does not consider it.
    I calculated the minimum and maximum of both the ability cutoff (θ∗ ) and the discrimination
level (d∗ ) for some parameters. For every triple of parameters (βA , βI , m), every discrimination
level d ∈ [dmin , dmax ] can be achieved by some equilibrium. Similarly, every ability cutoff
θ ∈ [θmin , θmax ] can be achieved by some equilibrium, but (dmin , θmax ) or (dmax , θmin ) may
not be achieved by any equilibrium. That is, the region in R2 of pairs (d, θ) that can be
achieved need not be a rectangle.
      (a) Ability Cutoff (βA = 0.1, m = 0.05)        (b) Ability Cutoff (βA = 0.1, m = 0.3)
      (c) Ability Cutoff (βA = 0.5, m = 0.05)        (d) Ability Cutoff (βA = 0.5, m = 0.3)
           Figure 3.2: The range of ability cutoff in different discount factors (βI ).
 3.5.1 Group Size
First, I discuss the effect of group size. Comparing Figures 3.1a through 3.1d, in general
the ability cutoff θ∗ increases as the group size becomes larger. When βA is small, the
increase in the ability cutoff is substantial for both θmin and θmax ; when βA is large, θmin
and θmax still increase, but the slope is much smaller. When βA is small, according to
Proposition 12, the average working skill level of agents with background A is low. If the
minority group is small, the majority group would like almost all minority agents to
assimilate, since the minority have a higher working skill level, so the ability cutoff is very
small. However, if the minority group becomes larger, the assimilated minority will increase
the average working skill level of group A, so the ability cutoff becomes larger. The increase
in the ability cutoff is substantial when βA is small, since the increase in the average working
skill of group A due to assimilation is very large. The increase in the average working skill
of group A is small when βA is large, and in that case both θmin and θmax increase slowly
when the minority group size (m) increases.
      (a) Ability Cutoff (βI = 0.5, m = 0.05)       (b) Ability Cutoff (βI = 0.5, m = 0.3)
      (c) Ability Cutoff (βI = 0.8, m = 0.05)       (d) Ability Cutoff (βI = 0.8, m = 0.3)
           Figure 3.3: The range of ability cutoff in different discount factors (βA ).
     I then turn to the discrimination level. Comparing Figures 3.4a through 3.4d, in
general both dmin and dmax increase as the group size (m) increases. When βA is small,
dmax increases sharply, and when βA is large, dmax increases slowly. The reason may be
similar to the one explained above. When the ability cutoff is small (in general), the
discrimination level will be small; when the ability cutoff is large, the discrimination level
will be large accordingly. The rate of increase of dmax is similar to the rate of increase of
θ∗ . Another result is that the increase in dmin is substantial only in Figure 3.4b, where
the difference between the discount factors is very large.
              (a) Discrimination Level                      (b) Discrimination Level
                (βA = 0.1, βI = 0.2)                           (βA = 0.1, βI = 0.8)
              (c) Discrimination Level                      (d) Discrimination Level
                (βA = 0.5, βI = 0.6)                           (βA = 0.5, βI = 0.8)
             Figure 3.4: The range of discrimination in different group sizes (m).
 3.5.2 Different Discount Factors
Different discount factors affect the ability cutoff and the discrimination level differently.
Comparing Figures 3.2a through 3.3d yields several results. When βI is fixed, an increase in
βA results in an increase in both θmin and θmax . On the other hand, when βA is fixed, an
increase in βI leads θmax to decrease and θmin to increase. This result is very interesting.
Intuitively, the ability cutoff will become smaller when the difference between discount
factors becomes larger. In Proposition 12, I can see that the on-path action profile is
\[
s_i =
\begin{cases}
\frac{\beta_A}{1+\beta_A}\,\theta_i & \text{if } i \in N_A; \\
\frac{\beta_I}{1+\beta_I}\,\theta_i + s^* & \text{if } \theta_i \ge \theta^* \text{ and } i \in N_I; \\
\frac{\beta_I}{1+\beta_I}\,\theta_i & \text{if } \theta_i < \theta^* \text{ and } i \in N_I
\end{cases}
\]
             (a) Discrimination Level                        (b) Discrimination Level
                (βA = 0.1, m = 0.05)                           (βA = 0.1, m = 0.3)
              (c) Discrimination Level                       (d) Discrimination Level
                (βA = 0.5, m = 0.05)                           (βA = 0.5, m = 0.3)
          Figure 3.5: The range of discrimination in different discount factors (βI ).
    There is a discontinuity at θ = θ∗ . Agents with ability above the cutoff exert extra
effort to acquire extra working skill s∗ . In this way, even if the difference between the
discount factors is very small, the discontinuity s∗ provides some extra working skill,
so the cutoff can be small. When βA is fixed and βI increases, the effect of s∗
dominates, so θmin increases with βI . When βI is fixed and βA increases, the effect of
the difference between discount factors dominates, so θmin increases with βA .
    Now I focus on the discrimination level. Comparing Figures 3.5a through 3.6d yields
results similar to those in the last paragraph. When βI is fixed, both dmin and dmax increase
as βA increases. This is when the effect of the difference between discount factors dominates.
When βA is fixed, in general, as βI increases, dmin increases and dmax decreases. The
increase in dmin occurs because the effect of s∗ (the discontinuity of the working skill level)
dominates. There is another interesting result: when βA = 0.1 and m = 0.3, dmax first
increases and then decreases. This may be due to the combination of the two effects.
             (a) Discrimination Level                      (b) Discrimination Level
               (βI = 0.5, m = 0.05)                            (βI = 0.5, m = 0.3)
             (c) Discrimination Level
                                                     (d) Ability Cutoff (βI = 0.8, m = 0.3)
                (βI = 0.8, m = 0.05)
          Figure 3.6: The range of discrimination in different discount factors (βA ).
 3.6 Testable Results
 3.6.1 Group Size Change
As explained in the last section, when the group size becomes large, the discrimination
level will be higher than when the group size is small. The migration process may serve as
empirical data of this change. When migration just started, the population of the minority
group in a community was very small. Therefore, the discrimination against them should be
small. As more and more minority people migrate to the community, the discrimination level
should be larger than before. The discrimination level could be captured by the number of
conflicts between majority and minority or the number of marriages between majority and
minority groups.
 3.6.2 Discount Factor Change
People who have different discount factors may experience different discrimination levels. For
example, Jewish people are a minority group in many countries, and they share the same
culture. Assume they are the minority group and they share the same βI around the world.
By comparing the discrimination level of Jews worldwide, I should see high discrimination
levels in countries with high discount factors (a culture that puts more weight on future
utility).
 3.7 Further Discussion
As discussed before, there may exist an equilibrium in which the two groups remain separated.
It is easy to check that when
\[ f(1-m)\,\frac{\beta_A}{1+\beta_A} \le f(m)\,\frac{\beta_I}{1+\beta_I}, \]
the separation equilibrium exists. The existence of this separation equilibrium will provide
more interesting results for the model.
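    As an illustration, the separation condition can be evaluated for given parameters. The
sketch below assumes the network effect f (m) = m − ½ m² from Section 3.4 and reads m as
the minority group size, so that the majority has size 1 − m; the function names are
hypothetical.

```python
def network_effect(m):
    """Network effect f(m) = m - m^2 / 2 from the numerical example."""
    return m - 0.5 * m ** 2


def separation_equilibrium_exists(beta_A, beta_I, m, f=network_effect):
    """Check f(1 - m) * beta_A / (1 + beta_A) <= f(m) * beta_I / (1 + beta_I)."""
    lhs = f(1 - m) * beta_A / (1 + beta_A)
    rhs = f(m) * beta_I / (1 + beta_I)
    return lhs <= rhs


if __name__ == "__main__":
    # Parameter values chosen to match some of the figure panels.
    print(separation_equilibrium_exists(beta_A=0.1, beta_I=0.8, m=0.3))
    print(separation_equilibrium_exists(beta_A=0.5, beta_I=0.6, m=0.05))
```

Passing a different callable as f allows the same check under another functional form of the
network effect.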
    Another extension would be the study of the evolution of the group size. With assimilation,
group size will change over time. Different groups will have different growth
rates, and the speed of assimilation will depend on the difference between discount factors.
    A third extension would use a more general functional form of the network effect. Again,
the equilibrium will generally exist, but the change in discrimination level and ability cutoff
will vary across different functional forms.
 3.8 Conclusion
In this paper, I construct a 2-stage game model to explain the difference in discrimination
levels across different scenarios. There are several main results. First, there exists an equi-
librium for any discount factors and minority group size, and the equilibrium will have an
on-path action profile with a cutoff rule. Second, as group size increases, both the discrim-
ination level and the ability cutoff will increase. Third, when discount factors vary across
different regimes, there are two effects that drive the discrimination level and ability cutoff
in opposite directions. When βI is fixed, the larger the difference between discount factors,
the larger the discrimination level and ability cutoff. When βA is fixed, the two effects are
mixed, and there are no general results for the discrimination level and ability cutoff.
APPENDICES
Appendix A: Proofs for Chapter 1
This appendix contains the proofs omitted in the text.
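Proof of Proposition 1. Under individual assignment, the optimal contracting program
is given as:
\[
\begin{aligned}
\max\quad & \Pi^I := E\left[ Y_A - w_1^I(M_A) \mid e_A, X_P^A \right] + E\left[ Y_B - w_2^I(M_B) \mid e_B, X_P^B \right] \\
\text{s.t.}\quad & e_j = \arg\max_{e'_{j1},\, e'_{j2}} U_i^I\left( e'_j, X_P^j \right) \quad \forall j && (IC_I) \\
& U_i^I\left( e_j, X_P^j \right) \ge 0 && (IR_I)
\end{aligned}
\]
By a standard argument, (IR_I) must bind, and any effort profile can be implemented (i.e.,
made to satisfy (IC_I)) by choosing the wage schedules w_1^I(M_A) and w_2^I(M_B)
appropriately. Thus, the program boils down to:
\[
\max_{e_A,\, e_B} \sum_{j \in \{A, B\}} \left( E\left[ Y_j \mid e_j, X_P^j \right] - \frac{1}{2} e_{j1}^2 - \frac{1}{2} e_{j2}^2 \right).
\]
    Denote
\[ \pi(X_P^j) := \max_{e_j} E\left[ Y_j \mid e_j, X_P^j \right] - \frac{1}{2} e_{j1}^2 - \frac{1}{2} e_{j2}^2 , \]
and it is routine to check:
\[
\pi(X_P^j) =
\begin{cases}
\left( 1 + (2\alpha - \alpha^2)\left( 2\alpha - \alpha^2 - \frac{1}{2} \right) \right) \pi & \text{if } X_P^j = \{g\} \\
\left( 1 + \alpha - \frac{1}{2}\alpha^2 \right) \pi & \text{if } X_P^j = \{g, \emptyset\} \\
\pi & \text{if } X_P^j = \{g, \emptyset, b\}
\end{cases}
\]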
    Comparing the values, we obtain that the optimal X_P^j = {g, ∅} for all j. That is, under
individual assignment, in the optimal contract the principal proceeds with project j if and
only if the bad state is not observed, and obtains payoff S∗ = (1 + α − ½ α²) π.
    Similarly, under team assignment, the optimal contracting program is given as:
\[
\begin{aligned}
\max\quad & \Pi^T := \sum_{j \in \{A, B\}} E\left[ Y_j - \left( w_1^T(M_j) + w_2^T(M_j) \right) \mid e_1, e_2, X_P^A, X_P^B \right] \\
\text{s.t.}\quad & e_i = \arg\max_{e'_i} U_i^T\left( e'_i, e_{-i}, X_P^A, X_P^B \right) \quad \forall i && (IC_T) \\
& U_i^T\left( e_i, e_{-i}, X_P^A, X_P^B \right) \ge 0 \quad \forall i && (IR_T)
\end{aligned}
\]
As in the case of individual assignment, we can plug (IR_T) into the objective function and
ignore (IC_T); the program boils down to:
\[
\max_{e_{A1}, e_{A2};\; e_{B1}, e_{B2}} \sum_{j \in \{A, B\}} E\left[ Y_j - \frac{1}{2} e_{j1}^2 - \frac{1}{2} e_{j2}^2 \mid e_{j1}, e_{j2}, X_P^j \right].
\]
Thus, given X_P^A and X_P^B, the principal’s payoff is exactly the same as that in the case of
individual assignment, and so the claim follows.
                                                                                                Q.E.D.
Proof of Lemma 1. Note that there are four possible reporting policies: for each x ∈
{G, B}, r = ρ (x) = x or ∅ (and ρ (∅) = ∅); and eight possible continuation policies: for each
r ∈ {G, ∅, B}, C (r) = cancel or proceed.
    Step 1. Without loss of generality we can consider only two continuation policies. Triv-
ially, C (r) = cancel ∀r yields a payoff of π (principal’s outside option), and C (r) = proceed
∀r also yields π (by Assumption 1). Also, as under any reporting policy,
               Pr (ω = G | r = G) ≥ Pr (ω = G | r = ∅) ≥ Pr (ω = G | r = B) ,
in equilibrium, if C (B) = proceed then C (r) = proceed for all r, and if C (∅) = proceed then
C (G) = proceed. Thus, without loss of generality, we can focus on equilibria that support
only one of the following two continuation policies: (i) C (r) = proceed only if r = G, and
(ii) C (r) = proceed only if r ∈ {G, ∅}.
    Step 2. For each of the two continuation policies stated in Step 1, only one reporting
policy may be played in equilibrium.
    Step 2a. Suppose, in the optimal contract, the principal’s continuation policy is (i), i.e.,
C (r) = proceed if and only if r = G. The two reporting policies of the agent where ρ (G) = ∅
(and ρ (B) = B or ∅) yield the same payoff as the project gets cancelled under both policies.
Also, the two reporting policies, ρ (G) = G and ρ (B) = B or ∅, yield the same payoff. But
the policy ρ (G) = G and ρ (x) = ∅ if x ∈ {∅, B} relaxes the principal’s incentive constraints
relative to the policy ρ (x) = x for all x ∈ {G, B} as
            Pr (ω = G | x = ∅) ≥ Pr (ω = G | x ∈ {∅, B}) ≥ Pr (ω = G | x = B) .
Thus, if in the optimal contract continuation policy (i) is played, then without loss of gener-
ality, we assume that the associated reporting policy is ρ (G) = G and ρ (x) = ∅ if x ∈ {∅, B}.
    Step 2b. Now suppose in the optimal contract continuation policy (ii), i.e., C (r) =
proceed if and only if r ∈ {G, ∅}, is played. The two reporting policies of the agent where
ρ (B) = ∅ (and ρ (G) = G or ∅) yield the same payoff as the project always proceeds. Also,
the two reporting policies, ρ (B) = B and ρ (G) = G or ∅, yield the same payoff. But the
policy ρ (B) = B and ρ (x) = ∅ if x ∈ {G, ∅} relaxes the incentive constraints relative to the
policy ρ (x) = x for all x ∈ {G, B} . Thus, if in the optimal contract continuation policy (ii)
is played, then without loss of generality, we assume that the associated reporting strategy
is ρ (B) = B and ρ (x) = ∅ if x ∈ {G, ∅}.
    Together, the observations in Steps 1 and 2 imply that, without loss of generality, we
can limit attention to two communication protocols: (i) If the state is observed to be G,
report G, otherwise report ∅; proceed with the project if and only if r = G. (ii) If the state
is observed to be B, report B, otherwise report ∅; proceed with the project if and only if
r ≠ B.
                                                                                           Q.E.D.
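The counting in Step 1 can be checked mechanically. The sketch below is illustrative only: the names `REPORTS`, `monotone`, and `proceed_sets` are introduced here, and the string `"empty"` stands in for the report ∅. It enumerates all eight continuation policies, keeps those that respect the ordering of posteriors stated in Step 1, and drops the two constant policies.

```python
from itertools import product

REPORTS = ("G", "empty", "B")  # reports ordered from best to worst news

def monotone(policy):
    """policy maps each report to True (proceed) or False (cancel)."""
    # Proceeding after B implies proceeding after empty,
    # and proceeding after empty implies proceeding after G.
    return policy["G"] >= policy["empty"] >= policy["B"]

# All 2^3 = 8 continuation policies.
all_policies = [dict(zip(REPORTS, choice)) for choice in product([True, False], repeat=3)]

# Keep monotone policies and drop the two constant ones (always cancel, always proceed).
candidates = [p for p in all_policies if monotone(p) and len(set(p.values())) == 2]

proceed_sets = sorted(sorted(r for r in REPORTS if p[r]) for p in candidates)
print(proceed_sets)  # prints [['G'], ['G', 'empty']]
```

The two surviving proceed sets correspond to continuation policies (i) and (ii) in the proof.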
Proof of Lemma 2. For brevity, we rewrite the objective function and all constraints by
using the notations pI , pI∅ , P I and P∅I (as defined in Section 1.4.1), and the program P I boils
down to:
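\begin{align*}
\max_{w_F,\,\Delta_C,\,\Delta_S,\,e_1,\,e_2}\ \Pi^I := {} & p^I\Big[\Pr(\omega = G \mid x \in X_P)\,y - P^I\Delta_S\Big]\sum_k e_k + (1-p^I)\big[\pi - \Delta_C\big] - w_F \\
\text{s.t.}\quad & p^I P^I \Delta_S \sum_k e_k + (1-p^I)\Delta_C + w_F - \tfrac12\sum_k e_k^2 \ge 0 \tag{$IR^I$} \\
& \Big[\Pr(\omega = G \mid x \in X_P)\,y - P^I\Delta_S\Big]\sum_k e_k \ge \pi - \Delta_C \tag{$IC_P^I$-1} \\
& \Big[\Pr(\omega = G \mid x \notin X_P)\,y - P_C^I\Delta_S\Big]\sum_k e_k \le \pi - \Delta_C \tag{$IC_P^I$-2} \\
& e_k = p^I P^I \Delta_S, \quad k = 1, 2 \tag{$IC_A^I$-1} \\
& \Big[\big(p^I P^I\big)^2 - \big(p^I_\emptyset P^I_\emptyset\big)^2\Big]\Delta_S^2 \ge \big(p^I - p^I_\emptyset\big)\Delta_C. \tag{$IC_A^I$-2}
\end{align*}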
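    By a standard argument, $(IR^I)$ binds. Using $(IR^I)$ and $(IC_A^I\text{-}1)$, we can eliminate $w_F$ and the $e_k$'s, and the program further simplifies to:
\begin{align*}
\max_{\Delta_C,\,\Delta_S}\quad & 2\,(p^I)^2 P^I \Pr(\omega = G \mid x \in X_P)\,y\,\Delta_S + (1-p^I)\pi - \big(p^I P^I \Delta_S\big)^2 \\
\text{s.t.}\quad & \Big[\Pr(\omega = G \mid x \in X_P)\,y - P^I\Delta_S\Big]\,2p^I P^I \Delta_S \ge \pi - \Delta_C \tag{$IC_P^I$-1} \\
& \Big[\Pr(\omega = G \mid x \notin X_P)\,y - P_C^I\Delta_S\Big]\,2p^I P^I \Delta_S \le \pi - \Delta_C \tag{$IC_P^I$-2} \\
& \Big[\big(p^I P^I\big)^2 - \big(p^I_\emptyset P^I_\emptyset\big)^2\Big]\Delta_S^2 \ge \big(p^I - p^I_\emptyset\big)\Delta_C \tag{$IC_A^I$-2}
\end{align*}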
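The elimination step can be spelled out as follows. This is only a sketch of the substitution, using the binding participation constraint and the effort constraint $e_k = p^I P^I \Delta_S$ as stated in the program above; no additional assumptions are introduced.

```latex
\begin{align*}
w_F &= \tfrac12\sum_k e_k^2 - p^I P^I \Delta_S \sum_k e_k - (1-p^I)\Delta_C
  && \text{(binding $IR^I$)} \\
\Pi^I &= p^I \Pr(\omega = G \mid x \in X_P)\,y \sum_k e_k + (1-p^I)\pi - \tfrac12\sum_k e_k^2
  && \text{(substituting $w_F$)} \\
&= 2\,(p^I)^2 P^I \Pr(\omega = G \mid x \in X_P)\,y\,\Delta_S + (1-p^I)\pi - \big(p^I P^I \Delta_S\big)^2
  && \text{(using $e_1 = e_2 = p^I P^I \Delta_S$)}
\end{align*}
```

The terms in $\Delta_C$ and the bonus payments cancel once $w_F$ is substituted, which is why $\Delta_C$ appears only in the constraints of the simplified program.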
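    Case 1: $X_P = \{G, \emptyset\}$. Here $p^I_\emptyset = 1$, and the program becomes:
\[
\mathcal{P}^I_{\{g,\emptyset\}}:\quad
\left\{
\begin{aligned}
\max_{\Delta_C,\,\Delta_S}\quad & 2\,(p^I)^2 P^I \Pr(\omega = G \mid x \in X_P)\,y\,\Delta_S + (1-p^I)\pi - \big(p^I P^I \Delta_S\big)^2 \\
\text{s.t.}\quad & \Delta_C \ge l_P := \pi - \Big[\Pr(\omega = G \mid x \in X_P)\,y - P^I\Delta_S\Big]\,2p^I P^I \Delta_S && (IC_P^I\text{-}1) \\
& \Delta_C \le u_P := \pi - \Big[\Pr(\omega = G \mid x \notin X_P)\,y - P_C^I\Delta_S\Big]\,2p^I P^I \Delta_S && (IC_P^I\text{-}2) \\
& \Delta_C \ge l_A := \Big[\big(P^I_\emptyset\big)^2 - \big(p^I P^I\big)^2\Big]\frac{\Delta_S^2}{1-p^I} && (IC_A^I\text{-}2)
\end{aligned}
\right.
\]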
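As $\Delta_C$ does not appear in the objective function, we can rewrite the program as:
\[
\left\{
\begin{aligned}
\max_{\Delta_S}\quad & 2\,(p^I)^2 P^I \Pr(\omega = G \mid x \in X_P)\,y\,\Delta_S + (1-p^I)\pi - \big(p^I P^I \Delta_S\big)^2 \\
\text{s.t.}\quad & u_P \ge l_A \iff \pi \ge 2p^I \Pr(\omega = G \mid x \notin X_P)\,P^I y\,\Delta_S - \left[2p^I P^I P_C^I - \frac{\big(P^I_\emptyset\big)^2 - \big(p^I P^I\big)^2}{1-p^I}\right]\Delta_S^2 \\
& u_P \ge l_P \iff \Delta_S \le \frac{y}{1-\mu}
\end{aligned}
\right.
\]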
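By routine calculation one obtains $\Pr(\omega = G \mid x \in X_P) = 1/(2 - \alpha')$, $\Pr(\omega = G \mid x \notin X_P) = 0$, and
\[
p^I = 1 - \tfrac12\alpha', \qquad P^I = \frac{1}{2-\alpha'} + \mu\left(1 - \frac{1}{2-\alpha'}\right), \qquad P_C^I = \mu,
\]
where $\alpha' := 1 - (1-\alpha)^2$. Also, to streamline notation, without loss of generality, we set $y = 1$ (thus, by Assumption 1, $\pi = \tfrac14$). So, the program $\mathcal{P}^I_{\{g,\emptyset\}}$ boils down to:
\[
\left\{
\begin{aligned}
\max_{\Delta_S}\quad & -\Big[\tfrac12\big(1 + \mu(1-\alpha')\big)\Delta_S - \tfrac12\Big]^2 + \tfrac14 + \tfrac18\alpha' \\
\text{s.t.}\quad & \tfrac14 \ge \tfrac12\,\alpha'\big(\mu\Delta_S\big)^2 \quad\text{and}\quad \frac{1}{1-\mu} \ge \Delta_S
\end{aligned}
\right. .
\]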
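    Notice that the objective function is strictly concave with peak at $\frac{1}{1+\mu(1-\alpha')}$ and the feasible set is always non-empty. Thus the solution always exists and is given as:
\[
\Delta_S^* =
\begin{cases}
\dfrac{1}{1+\mu(1-\alpha')} & \text{if } \dfrac{\alpha'\mu^2}{\big(1+\mu(1-\alpha')\big)^2} \le \dfrac12 \\[2ex]
\dfrac{1}{\mu\sqrt{2\alpha'}} & \text{otherwise}
\end{cases}.
\]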
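The associated value is:
\[
V^I_{\{g,\emptyset\}} =
\begin{cases}
\tfrac14 + \tfrac18\alpha' & \text{if } \dfrac{\alpha'\mu^2}{\big(1+\mu(1-\alpha')\big)^2} \le \dfrac12 \\[2ex]
\tfrac14 + \tfrac18\alpha' - \left[\dfrac12\left(\dfrac{1+\mu(1-\alpha')}{\mu\sqrt{2\alpha'}} - 1\right)\right]^2 & \text{otherwise}
\end{cases}.
\tag{1}
\]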
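As a numerical sanity check on Case 1, the closed form can be compared with a brute-force search. The sketch below assumes $y = 1$, $\pi = 1/4$, $0 < \mu < 1$, and that the reduced program is: maximise $-[\tfrac12(1+\mu(1-\alpha'))\Delta_S - \tfrac12]^2 + \tfrac14 + \tfrac18\alpha'$ subject to $\tfrac12\alpha'(\mu\Delta_S)^2 \le \tfrac14$ and $\Delta_S \le 1/(1-\mu)$, which is the form implied by the stated solution. The function names `case1_solution` and `case1_grid` are hypothetical.

```python
import math

def case1_solution(alpha, mu):
    """Closed-form bonus and value for X_P = {G, empty}, with y = 1 and pi = 1/4."""
    a = 1 - (1 - alpha) ** 2           # alpha' in the text
    c = 1 + mu * (1 - a)               # the objective peaks at 1 / c
    if a * mu ** 2 / c ** 2 <= 0.5:    # the unconstrained peak is feasible
        delta = 1 / c
    else:                              # the constraint 1/4 >= (1/2) a (mu * delta)^2 binds
        delta = 1 / (mu * math.sqrt(2 * a))
    value = 0.25 + a / 8 - (0.5 * (c * delta - 1)) ** 2
    return delta, value

def case1_grid(alpha, mu, n=20_000):
    """Brute-force check: maximise the same objective over a grid of feasible bonuses."""
    a = 1 - (1 - alpha) ** 2
    c = 1 + mu * (1 - a)
    upper = 1 / (1 - mu)               # second constraint: delta <= 1 / (1 - mu)
    best = -math.inf
    for k in range(n + 1):
        d = upper * k / n
        if 0.5 * a * (mu * d) ** 2 <= 0.25:
            best = max(best, 0.25 + a / 8 - (0.5 * (c * d - 1)) ** 2)
    return best
```

For example, `case1_solution(0.5, 0.5)` falls in the first branch of (1), while `case1_solution(1.0, 0.9)` falls in the second; in both cases `case1_grid` should agree with the closed-form value up to the grid resolution.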
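Case 2: $X_P = \{G\}$. Here $p^I_\emptyset = 0$, and the program becomes:
\[
\mathcal{P}^I_{\{g\}}:\quad
\left\{
\begin{aligned}
\max_{\Delta_C,\,\Delta_S}\quad & 2\,(p^I)^2 P^I \Pr(\omega = G \mid x \in X_P)\,y\,\Delta_S + (1-p^I)\pi - \big(p^I P^I \Delta_S\big)^2 \\
\text{s.t.}\quad & \Delta_C \ge l_P := \pi - \Big[\Pr(\omega = G \mid x \in X_P)\,y - P^I\Delta_S\Big]\,2p^I P^I \Delta_S && (IC_P^I\text{-}1) \\
& \Delta_C \le u_P := \pi - \Big[\Pr(\omega = G \mid x \notin X_P)\,y - P_C^I\Delta_S\Big]\,2p^I P^I \Delta_S && (IC_P^I\text{-}2) \\
& \Delta_C \le u_A := p^I \big(P^I\big)^2 \Delta_S^2 && (IC_A^I\text{-}2)
\end{aligned}
\right. .
\]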
As in Case 1, $\Delta_C$ does not enter the objective function, and we can further simplify the program as:
\[
\left\{
\begin{aligned}
\max_{\Delta_S}\quad & 2\,(p^I)^2 P^I \Pr(\omega = G \mid x \in X_P)\,y\,\Delta_S + (1-p^I)\pi - \big(p^I P^I \Delta_S\big)^2 \\
\text{s.t.}\quad & l_P \le u_A \iff \pi \le 2p^I \Pr(\omega = G \mid x \in X_P)\,P^I y\,\Delta_S - p^I\big[P^I\big]^2\Delta_S^2 \\
& l_P \le u_P \iff \Delta_S \le \frac{y}{1-\mu}
\end{aligned}
\right. ,
\]
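and plugging in the values for the probabilities and setting $y = 1$ we obtain:
\[
\left\{
\begin{aligned}
\max_{\Delta_S}\quad & \tfrac12\,\alpha'^2\,\Delta_S\big(1 - \tfrac12\Delta_S\big) + \tfrac14\big(1 - \tfrac12\alpha'\big) \\
\text{s.t.}\quad & \alpha'\,\Delta_S\big(1 - \tfrac12\Delta_S\big) \ge \tfrac14 \quad\text{and}\quad \Delta_S \le \frac{1}{1-\mu}
\end{aligned}
\right. .
\]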
The feasible set is non-empty if and only if $\alpha' \ge 1/2$ (equivalently, $\alpha \ge 1 - 1/\sqrt{2}$), and the objective function is concave with peak at 1. Thus, the solution of the program and the value would be:
\[
\Delta_S^* = 1 \quad\text{and}\quad V^I_{\{g\}} = \frac14 + \frac14\,\alpha'\left(\alpha' - \frac12\right) \quad\text{if } \alpha' \ge \frac12 \tag{2}
\]
and no solution otherwise.
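The value in (2) and the feasibility threshold can be verified by direct substitution. The sketch below assumes the reduced Case 2 objective is $\tfrac12\alpha'^2\Delta_S(1-\tfrac12\Delta_S) + \tfrac14(1-\tfrac12\alpha')$ and that the binding feasibility requirement is $\alpha'\Delta_S(1-\tfrac12\Delta_S) \ge \tfrac14$, which is the form consistent with a peak at $\Delta_S = 1$.

```latex
\begin{align*}
\text{Value at } \Delta_S = 1:\quad
& \tfrac12\,\alpha'^2\cdot\tfrac12 + \tfrac14 - \tfrac18\alpha'
  = \tfrac14 + \tfrac14\,\alpha'\big(\alpha' - \tfrac12\big), \\
\text{Feasibility:}\quad
& \max_{\Delta_S}\ \alpha'\,\Delta_S\big(1 - \tfrac12\Delta_S\big) = \tfrac12\alpha' \ \ (\text{attained at } \Delta_S = 1),
  \quad\text{so}\quad \tfrac12\alpha' \ge \tfrac14 \iff \alpha' \ge \tfrac12, \\
\text{Threshold in } \alpha:\quad
& 1 - (1-\alpha)^2 \ge \tfrac12 \iff (1-\alpha)^2 \le \tfrac12 \iff \alpha \ge 1 - \tfrac{1}{\sqrt2}.
\end{align*}
```

Since $1 \le 1/(1-\mu)$ for any $\mu \in [0,1)$, the second constraint does not bind at $\Delta_S^* = 1$.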
                                                                                            Q.E.D.
Proof of Lemma 4. The proof is similar to that of Lemma 2. For brevity, we rewrite the
objective function and all constraints by using the notations pT , pT∅ , P T and P∅T (as defined
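in Section 1.4.2), and the program $\mathcal{P}^T$ boils down to:
\begin{align*}
\max_{\Delta_{iC},\,\Delta_{iS},\,w_{iF},\,e_i}\ \Pi^T = {} & p^T\Big[\Pr(\omega = G \mid x^T \in X_P)\,y - P^T\sum_i \Delta_{iS}\Big]\sum_k e_k + (1-p^T)\Big[\pi - \sum_i \Delta_{iC}\Big] - \sum_i w_{iF} \\
\text{s.t. } \forall i \in \{1,2\}:\quad & p^T P^T \Delta_{iS}\sum_k e_k + (1-p^T)\Delta_{iC} + w_{iF} - \tfrac12 e_i^2 \ge 0 \tag{$IR_i^T$} \\
& \Big[\Pr(\omega = G \mid x^T \in X_P)\,y - P^T\sum_i \Delta_{iS}\Big]\sum_k e_k \ge \pi - \sum_i \Delta_{iC} \tag{$IC_P^T$-1} \\
& \Big[\Pr(\omega = G \mid x^T \notin X_P)\,y - P_C^T\sum_i \Delta_{iS}\Big]\sum_k e_k \le \pi - \sum_i \Delta_{iC} \tag{$IC_P^T$-2} \\
& e_i = p^T P^T \Delta_{iS} \tag{$IC_{A_i}^T$-1} \\
& \tfrac12\Big[\big(p^T P^T\big)^2 - \big(p^T_\emptyset P^T_\emptyset\big)^2\Big]\Delta_{iS}^2 + \Big[\big(p^T P^T\big)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T\Big]\Delta_{iS}\Delta_{jS} \ge \big(p^T - p^T_\emptyset\big)\Delta_{iC}, \quad j \ne i \tag{$IC_{A_i}^T$-2}
\end{align*}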
    We can eliminate wiF and ei s using (IRiT ) (that must bind) and (ICAT i -1), and the program
further simplifies to:
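\begin{align*}
\max_{\Delta_{iC},\,\Delta_{iS}}\quad & (p^T)^2 P^T \Pr(\omega = G \mid x^T \in X_P)\,y \sum_i \Delta_{iS} + (1-p^T)\pi - \tfrac12\big(p^T P^T\big)^2 \sum_i \Delta_{iS}^2 \\
\text{s.t.}\quad & \Big[\Pr(\omega = G \mid x^T \in X_P)\,y - P^T\sum_i \Delta_{iS}\Big]\,p^T P^T \sum_i \Delta_{iS} \ge \pi - \sum_i \Delta_{iC} \tag{$IC_P^T$-1} \\
& \Big[\Pr(\omega = G \mid x^T \notin X_P)\,y - P_C^T\sum_i \Delta_{iS}\Big]\,p^T P^T \sum_i \Delta_{iS} \le \pi - \sum_i \Delta_{iC} \tag{$IC_P^T$-2} \\
& \tfrac12\Big[\big(p^T P^T\big)^2 - \big(p^T_\emptyset P^T_\emptyset\big)^2\Big](\Delta_{1S})^2 + \Big[\big(p^T P^T\big)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T\Big]\Delta_{1S}\Delta_{2S} \ge \big(p^T - p^T_\emptyset\big)\Delta_{1C} \tag{$IC_{A_1}^T$-2} \\
& \tfrac12\Big[\big(p^T P^T\big)^2 - \big(p^T_\emptyset P^T_\emptyset\big)^2\Big](\Delta_{2S})^2 + \Big[\big(p^T P^T\big)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T\Big]\Delta_{1S}\Delta_{2S} \ge \big(p^T - p^T_\emptyset\big)\Delta_{2C} \tag{$IC_{A_2}^T$-2}
\end{align*}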
Part (i). We now prove that if P T admits a solution, it also admits a symmetric solution
where ∆1S = ∆2S = ∆S and ∆1C = ∆2C = ∆C . The proof is given in the following five
steps.
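    Step 1: Suppose $\Delta^* := (\Delta^*_{1S}, \Delta^*_{2S}, \Delta^*_{1C}, \Delta^*_{2C})$ is a solution to $\mathcal{P}^T$. If $\Delta^*_{1S} = \Delta^*_{2S} = \Delta^*_S$ (say), we argue that there also exists a symmetric solution $(\Delta^*_S, \Delta^*_S, \Delta^*_C, \Delta^*_C)$ where
\[
\Delta^*_C = \frac12\sum_i \Delta^*_{iC}.
\]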
To see this, notice that under ∆∗ , (ICAT i -2)s imply:
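\begin{align*}
\Big[\tfrac32\big(p^T P^T\big)^2 - \tfrac12\big(p^T_\emptyset P^T_\emptyset\big)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T\Big](\Delta^*_S)^2
&\ge \max\big\{(p^T - p^T_\emptyset)\Delta^*_{1C},\ (p^T - p^T_\emptyset)\Delta^*_{2C}\big\} \\
&\ge \tfrac12\,(p^T - p^T_\emptyset)\big(\Delta^*_{1C} + \Delta^*_{2C}\big) \\
&= (p^T - p^T_\emptyset)\,\Delta^*_C .
\end{align*}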
Thus, (∆∗S , ∆∗S , ∆∗C , ∆∗C ) is also a solution as it satisfies (ICAT i -2) and does not affect (ICPT -1)
and (ICPT -2).
    Step 2: Denote
\[
\Pi^T(\Delta_{1S}, \Delta_{2S}) := (p^T)^2 P^T \Pr(\omega = G \mid x^T \in X_P)\,y \sum_i \Delta_{iS} + (1-p^T)\pi - \tfrac12\big(p^T P^T\big)^2 \sum_i \Delta_{iS}^2 .
\]
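Suppose $\Delta^* := (\Delta^*_{1S}, \Delta^*_{2S}, \Delta^*_{1C}, \Delta^*_{2C})$ is a solution to $\mathcal{P}^T$ but $\Delta^*_{1S} \ne \Delta^*_{2S}$. Without loss of generality, assume $\Delta^*_{1S} > \Delta^*_{2S}$. We argue that $\Delta^*$ then cannot be a solution. In particular, there exist $\varepsilon > 0$ and cancellation premiums $\Delta'_{iC}$ such that $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon, \Delta'_{1C}, \Delta'_{2C})$ is feasible and
\[
\Pi^T(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon) > \Pi^T(\Delta^*_{1S}, \Delta^*_{2S}).
\]
Observe that $\Pi^T(\Delta_{1S}, \Delta_{2S})$ is symmetric and concave in $(\Delta_{1S}, \Delta_{2S})$ with peak at
\[
\Delta_{1S} = \Delta_{2S} = \frac{y}{P^T}\Pr\big(\omega = G \mid x^T \in X_P\big).
\]
Also, the following holds: take any $(\Delta_{1S}, \Delta_{2S})$ such that $\Delta_{1S} \ne \Delta_{2S}$, say $\Delta_{1S} > \Delta_{2S}$. Then there exists $\varepsilon > 0$ such that
\[
\Pi^T(\Delta_{1S} - \varepsilon, \Delta_{2S} + \varepsilon) > \Pi^T(\Delta_{1S}, \Delta_{2S}).
\]
So, we only need to show that there exist an $\varepsilon > 0$ and values $\Delta'_{1C}, \Delta'_{2C}$ such that $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon, \Delta'_{1C}, \Delta'_{2C})$ is feasible. In order to prove this claim, it is worthwhile to first establish a few properties of the $(IC_{A_i}^T\text{-}2)$ constraints, as given in the next step.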
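The strict improvement from equalizing the two success bonuses can be computed explicitly. The following is a sketch using only the definition of $\Pi^T$ in Step 2: the linear term depends on $\Delta_{1S} + \Delta_{2S}$, which the perturbation leaves unchanged, so only the quadratic term moves.

```latex
\begin{align*}
\Pi^T(\Delta_{1S} - \varepsilon, \Delta_{2S} + \varepsilon) - \Pi^T(\Delta_{1S}, \Delta_{2S})
&= \tfrac12\big(p^T P^T\big)^2\Big[\Delta_{1S}^2 + \Delta_{2S}^2 - (\Delta_{1S} - \varepsilon)^2 - (\Delta_{2S} + \varepsilon)^2\Big] \\
&= \big(p^T P^T\big)^2\,\varepsilon\,\big(\Delta_{1S} - \Delta_{2S} - \varepsilon\big) \\
&> 0 \quad\text{for } 0 < \varepsilon < \Delta_{1S} - \Delta_{2S},
\end{align*}
```

provided $p^T P^T \ne 0$.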
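    Step 3: Denote
\[
L_i(\Delta_{1S}, \Delta_{2S}) := A\,(\Delta_{iS})^2 + B\,\Delta_{1S}\Delta_{2S},
\]
where
\[
A := \frac{\big(p^T P^T\big)^2 - \big(p^T_\emptyset P^T_\emptyset\big)^2}{2\big(p^T - p^T_\emptyset\big)}
\quad\text{and}\quad
B := \frac{\big(p^T P^T - p^T_\emptyset P^T_\emptyset\big)\,p^T P^T}{p^T - p^T_\emptyset}.
\]
Note that the $(IC_{A_i}^T\text{-}2)$ constraints can be written as:
\[
L_i(\Delta_{1S}, \Delta_{2S}) \ge \Delta_{iC} \ \text{ if } p^T - p^T_\emptyset > 0, \quad\text{and}\quad L_i(\Delta_{1S}, \Delta_{2S}) \le \Delta_{iC} \ \text{ otherwise.}
\]
Also,
\[
B - A = \frac{\big(p^T P^T - p^T_\emptyset P^T_\emptyset\big)^2}{2\big(p^T - p^T_\emptyset\big)},
\]
and hence,
\[
\operatorname{sign}(B - A) = \operatorname{sign}\big(p^T - p^T_\emptyset\big).
\]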
It is routine to check that for XP = {g}, pT − pT∅ > 0 and pT P T − pT∅ P∅T > 0, whereas for
XP = {g, ∅}, pT − pT∅ < 0 and pT P T − pT∅ P∅T < 0. Thus,
                                             A > 0, B > 0.
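For completeness, the expression for $B - A$ and the signs of $A$ and $B$ follow from a short factorization. This is a sketch that uses only the definitions of $A$ and $B$ in this step.

```latex
\begin{align*}
B - A
&= \frac{2\,p^T P^T\big(p^T P^T - p^T_\emptyset P^T_\emptyset\big) - \big(p^T P^T - p^T_\emptyset P^T_\emptyset\big)\big(p^T P^T + p^T_\emptyset P^T_\emptyset\big)}{2\big(p^T - p^T_\emptyset\big)} \\
&= \frac{\big(p^T P^T - p^T_\emptyset P^T_\emptyset\big)\big(p^T P^T - p^T_\emptyset P^T_\emptyset\big)}{2\big(p^T - p^T_\emptyset\big)}
 = \frac{\big(p^T P^T - p^T_\emptyset P^T_\emptyset\big)^2}{2\big(p^T - p^T_\emptyset\big)}, \\[1ex]
A &= \frac{\big(p^T P^T - p^T_\emptyset P^T_\emptyset\big)\big(p^T P^T + p^T_\emptyset P^T_\emptyset\big)}{2\big(p^T - p^T_\emptyset\big)} .
\end{align*}
```

Because $p^T P^T - p^T_\emptyset P^T_\emptyset$ and $p^T - p^T_\emptyset$ have the same sign in both cases, the ratio of the two is positive, which gives $A > 0$ and $B > 0$.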
In the next two steps, we consider the two cases pT − pT∅ > 0 and < 0, and show that the
claim in Step 2 above holds in both cases.
    Step 4: Suppose pT − pT∅ > 0. So, (ICAT i -2)s are given as:
                                         Li (∆1S , ∆2S ) ≥ ∆iC .
There are three possibilities:
    Case 1: Both (ICAT i -2)s are slack at (∆∗1S , ∆∗2S , ∆∗1C , ∆∗2C ). Consider the solution
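\[
(\Delta^*_{1S} - \varepsilon,\ \Delta^*_{2S} + \varepsilon,\ \Delta^*_{1C},\ \Delta^*_{2C})
\]
where $\varepsilon > 0$. This solution leaves the $(IC_P^T)$s unaffected, keeps both $(IC_{A_i}^T\text{-}2)$s slack for sufficiently small $\varepsilon$, and yields a higher value of $\Pi^T$ (from Step 2).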
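    Case 2: Exactly one of the two $(IC_{A_i}^T\text{-}2)$s is slack at $(\Delta^*_{1S}, \Delta^*_{2S}, \Delta^*_{1C}, \Delta^*_{2C})$. Suppose, say, that only $(IC_{A_1}^T\text{-}2)$ is slack (hence, $(IC_{A_2}^T\text{-}2)$ is binding). Set
\[
\Delta'_{1C} = \Delta^*_{1C} + \delta, \qquad \Delta'_{2C} = \Delta^*_{2C} - \delta
\]
where $\delta > 0$. For $\delta$ sufficiently small, at $(\Delta^*_{1S}, \Delta^*_{2S}, \Delta'_{1C}, \Delta'_{2C})$, both $(IC_{A_i}^T\text{-}2)$s become slack and the $(IC_P^T)$s are unaffected, and hence, it is feasible. But then, as argued in Case 1, the solution $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon, \Delta'_{1C}, \Delta'_{2C})$ is also feasible for $\varepsilon > 0$ sufficiently small, and attains a higher value of $\Pi^T$.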
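    Case 3: Both $(IC_{A_i}^T\text{-}2)$s are binding at $(\Delta^*_{1S}, \Delta^*_{2S}, \Delta^*_{1C}, \Delta^*_{2C})$. Consider changing $(\Delta^*_{1S}, \Delta^*_{2S})$ to $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon)$. The left-hand side of $(IC_{A_i}^T\text{-}2)$ changes by
\[
\delta_i := L_i(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon) - L_i(\Delta^*_{1S}, \Delta^*_{2S})
\]
where
\begin{align*}
\delta_1 &= -\varepsilon\Big[2A\big(\Delta^*_{1S} - \tfrac12\varepsilon\big) - B\big(\Delta^*_{1S} - (\Delta^*_{2S} + \varepsilon)\big)\Big], \\
\delta_2 &= \varepsilon\Big[2A\big(\Delta^*_{2S} + \tfrac12\varepsilon\big) + B\big(\Delta^*_{1S} - (\Delta^*_{2S} + \varepsilon)\big)\Big].
\end{align*}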
Note that by A > 0, B > 0 and ε small enough, δ2 > 0.
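The expressions for δ1 and δ2, and their sum, can be confirmed with exact rational arithmetic. The sketch below is illustrative: the helper names `L` and `deltas` and the particular values of A, B, the bonuses, and ε are hypothetical choices, not taken from the model.

```python
from fractions import Fraction as F

def L(i, d1, d2, A, B):
    """Left-hand side L_i = A * Delta_iS^2 + B * Delta_1S * Delta_2S."""
    own = d1 if i == 1 else d2
    return A * own ** 2 + B * d1 * d2

def deltas(d1, d2, eps, A, B):
    """Change in L_1 and L_2 when (d1, d2) moves to (d1 - eps, d2 + eps)."""
    return tuple(L(i, d1 - eps, d2 + eps, A, B) - L(i, d1, d2, A, B) for i in (1, 2))

# Hypothetical values chosen only for illustration (A > 0, B > 0, B - A > 0, d1 > d2).
A, B = F(1, 3), F(5, 7)
d1, d2, eps = F(9, 10), F(2, 5), F(1, 100)

delta1, delta2 = deltas(d1, d2, eps, A, B)
gap = d1 - (d2 + eps)

# The closed forms stated in the text, checked exactly.
assert delta1 == -eps * (2 * A * (d1 - eps / 2) - B * gap)
assert delta2 == eps * (2 * A * (d2 + eps / 2) + B * gap)
assert delta1 + delta2 == 2 * eps * (B - A) * gap
```

Because `Fraction` arithmetic is exact, the assertions test the algebraic identities themselves rather than a floating-point approximation.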
                                                               
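    So, if $\delta_1 > 0$, the perturbation relaxes both $(IC_{A_i}^T\text{-}2)$s and, by the argument given in Case 1, $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon, \Delta^*_{1C}, \Delta^*_{2C})$ is an improvement.
    If $\delta_1 < 0$, $(IC_{A_1}^T\text{-}2)$ is now violated, but $(IC_{A_2}^T\text{-}2)$ has become slack. Also note that, by $B - A > 0$,
\[
\delta_2 + \delta_1 = 2\varepsilon\,(B - A)\big(\Delta^*_{1S} - (\Delta^*_{2S} + \varepsilon)\big) > 0.
\]
Now, set
\[
\Delta'_{1C} = \Delta^*_{1C} + \delta_1, \qquad \Delta'_{2C} = \Delta^*_{2C} - \delta_1 .
\]
Note that
\[
L_1(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon) = L_1(\Delta^*_{1S}, \Delta^*_{2S}) + \delta_1 = \Delta^*_{1C} + \delta_1 = \Delta'_{1C},
\]
and
\begin{align*}
L_2(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon) &= L_2(\Delta^*_{1S}, \Delta^*_{2S}) + \delta_2 \\
&= L_2(\Delta^*_{1S}, \Delta^*_{2S}) - \delta_1 + (\delta_2 + \delta_1) \\
&> L_2(\Delta^*_{1S}, \Delta^*_{2S}) - \delta_1 \\
&= \Delta^*_{2C} - \delta_1 = \Delta'_{2C} .
\end{align*}
Hence, $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon, \Delta'_{1C}, \Delta'_{2C})$ is feasible (note that the $(IC_P^T)$s are unaltered by construction), and for $\varepsilon > 0$ sufficiently small, attains a higher value of $\Pi^T$.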
    Step 5: Suppose pT − pT∅ < 0. Thus, (ICAT i -2)s are
                                         Li (∆1S , ∆2S ) ≤ ∆iC .
As before, there are three possibilities:
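    Case 1: Both $(IC_{A_i}^T\text{-}2)$s are slack at $(\Delta^*_{1S}, \Delta^*_{2S}, \Delta^*_{1C}, \Delta^*_{2C})$. By the argument in Case 1 in Step 4, this solution can be improved upon.
    Case 2: Exactly one of the two $(IC_{A_i}^T\text{-}2)$s is slack at $(\Delta^*_{1S}, \Delta^*_{2S}, \Delta^*_{1C}, \Delta^*_{2C})$. Suppose, say, that only $(IC_{A_1}^T\text{-}2)$ is slack (hence, $(IC_{A_2}^T\text{-}2)$ is binding). Set
\[
\Delta'_{1C} = \Delta^*_{1C} - \delta, \qquad \Delta'_{2C} = \Delta^*_{2C} + \delta
\]
where $\delta > 0$. As in Case 2 in Step 4, the solution $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon, \Delta'_{1C}, \Delta'_{2C})$ is also feasible for $\varepsilon > 0$ sufficiently small, and attains a higher value of $\Pi^T$.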
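    Case 3: Both $(IC_{A_i}^T\text{-}2)$s are binding at $(\Delta^*_{1S}, \Delta^*_{2S}, \Delta^*_{1C}, \Delta^*_{2C})$. Consider changing $(\Delta^*_{1S}, \Delta^*_{2S})$ to $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon)$. As in Case 3 in Step 4, the left-hand side of $(IC_{A_i}^T\text{-}2)$ changes by
\[
\delta_i := L_i(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon) - L_i(\Delta^*_{1S}, \Delta^*_{2S})
\]
where $\delta_2 > 0$ and, since $B - A < 0$ in this case,
\[
\delta_2 + \delta_1 = 2\varepsilon\,(B - A)\big(\Delta^*_{1S} - (\Delta^*_{2S} + \varepsilon)\big) < 0
\]
for $\varepsilon$ small enough.
    Together these imply $\delta_1 < -\delta_2 < 0$. As the constraints now read $L_i \le \Delta_{iC}$, $(IC_{A_2}^T\text{-}2)$ is violated, but $(IC_{A_1}^T\text{-}2)$ has become slack. Now, set
\[
\Delta'_{1C} = \Delta^*_{1C} - \delta_2, \qquad \Delta'_{2C} = \Delta^*_{2C} + \delta_2 .
\]
Note that
\begin{align*}
L_1(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon) &= L_1(\Delta^*_{1S}, \Delta^*_{2S}) + \delta_1 \\
&= L_1(\Delta^*_{1S}, \Delta^*_{2S}) - \delta_2 + (\delta_2 + \delta_1) \\
&< L_1(\Delta^*_{1S}, \Delta^*_{2S}) - \delta_2 \\
&= \Delta^*_{1C} - \delta_2 = \Delta'_{1C},
\end{align*}
and
\[
L_2(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon) = L_2(\Delta^*_{1S}, \Delta^*_{2S}) + \delta_2 = \Delta^*_{2C} + \delta_2 = \Delta'_{2C} .
\]
Hence, $(\Delta^*_{1S} - \varepsilon, \Delta^*_{2S} + \varepsilon, \Delta'_{1C}, \Delta'_{2C})$ is feasible (note that the $(IC_P^T)$s are unaltered by construction), and for $\varepsilon > 0$ sufficiently small, attains a higher value of $\Pi^T$.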
    Combining all cases stated above, we obtain that without loss of generality, we can focus
on the solution where ∆1S = ∆2S = ∆S , ∆1C = ∆2C = ∆C . And from (IRiT ), we obtain
that under such a solution, we must have w1F = w2F = wF . This observation completes the
proof of part (i) of this lemma.
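Part (ii). Since we focus on $\Delta_{1S} = \Delta_{2S} = \Delta_S$ and $\Delta_{1C} = \Delta_{2C} = \Delta_C$, the program can be simplified as:
\[
\mathcal{P}^T:\quad
\left\{
\begin{aligned}
\max_{\Delta_C,\,\Delta_S}\quad & 2\,(p^T)^2 P^T \Pr(\omega = G \mid x^T \in X_P)\,y\,\Delta_S + (1-p^T)\pi - \big(p^T P^T\big)^2\Delta_S^2 \\
\text{s.t.}\quad & \Big[\tfrac32\big(p^T P^T\big)^2 - \tfrac12\big(p^T_\emptyset P^T_\emptyset\big)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T\Big](\Delta_S)^2 \ge \big(p^T - p^T_\emptyset\big)\Delta_C && (IC_A^T\text{-}2) \\
& 2\Big[\Pr(\omega = G \mid x^T \in X_P)\,y - 2P^T\Delta_S\Big]\,p^T P^T \Delta_S \ge \pi - 2\Delta_C && (IC_P^T\text{-}1) \\
& 2\Big[\Pr(\omega = G \mid x^T \notin X_P)\,y - 2P_C^T\Delta_S\Big]\,p^T P^T \Delta_S \le \pi - 2\Delta_C && (IC_P^T\text{-}2)
\end{aligned}
\right.
\]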
    As in the case of individual assignment, we have two cases: XP = {g, ∅} and XP = {g}.
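    Case 1: $X_P = \{g, \emptyset\}$. Here, $p^T - p^T_\emptyset < 0$; so we have:
\[
\mathcal{P}^T_{\{g,\emptyset\}}:\ \left\{
\begin{array}{ll}
\displaystyle \max_{\Delta_C, \Delta_S}\ 2 (p^T)^2 P^T \Pr(\omega = G \mid x^T \in X_P)\, y \Delta_S + (1 - p^T)\pi - (p^T P^T)^2 \Delta_S^2 & \\[6pt]
\text{s.t.} & \\[2pt]
\Delta_C \ge l_A := \left[ \tfrac{3}{2} (p^T P^T)^2 - \tfrac{1}{2} (p^T_\emptyset P^T_\emptyset)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T \right] \dfrac{\Delta_S^2}{p^T - p^T_\emptyset} & (IC_A^T\text{-}2) \\[8pt]
\Delta_C \ge l_P := \tfrac{1}{2}\pi - \left( \Pr(\omega = G \mid x^T \in X_P)\, y - 2 P^T \Delta_S \right) p^T P^T \Delta_S & (IC_P^T\text{-}1) \\[6pt]
\Delta_C \le u_P := \tfrac{1}{2}\pi - \left( \Pr(\omega = G \mid x^T \notin X_P)\, y - 2 P_C^T \Delta_S \right) p^T P^T \Delta_S & (IC_P^T\text{-}2)
\end{array}
\right.
\]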
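Since $\Delta_C$ does not appear in the objective function, we can further simplify the program as:
\[
\left\{
\begin{array}{l}
\displaystyle \max_{\Delta_S}\ 2 (p^T)^2 P^T \Pr(\omega = G \mid x^T \in X_P)\, y \Delta_S + (1 - p^T)\pi - (p^T P^T)^2 \Delta_S^2 \\[6pt]
\text{s.t.} \\[2pt]
u_P \ge l_A \iff \tfrac{1}{2}\pi \ge \left( \Pr(\omega = G \mid x^T \notin X_P)\, y - 2 P_C^T \Delta_S \right) p^T P^T \Delta_S \\[4pt]
\qquad\qquad\qquad\quad + \dfrac{\Delta_S^2}{p^T - p^T_\emptyset} \left[ \tfrac{3}{2} (p^T P^T)^2 - \tfrac{1}{2} (p^T_\emptyset P^T_\emptyset)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T \right] \\[8pt]
u_P \ge l_P \iff \Delta_S \le \dfrac{y}{2(1-\mu)}.
\end{array}
\right.
\]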
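By routine calculation, one obtains $\Pr(\omega = G \mid x^T \in X_P) = \frac{1}{2-\alpha'}$, $\Pr(\omega = G \mid x^T \notin X_P) = 0$, and
\[
p^T = 1 - \tfrac{1}{2}\alpha'; \quad P^T = \mu + (1-\mu)\frac{1}{2-\alpha'}; \quad P_C^T = \mu;
\]
\[
p^T_\emptyset = 1 - \tfrac{1}{2}\alpha; \quad P^T_\emptyset = \mu + (1-\mu)\frac{1}{2-\alpha},
\]
where $\alpha' := 1 - (1-\alpha)^2$. Also, as in the proof of Lemma 2, to streamline notation, without loss of generality, we set $y = 1$ (and hence, $\pi = 1/4$). Plugging the values, the program becomes:
\[
\left\{
\begin{array}{l}
\displaystyle \max_{\Delta_S}\ -\frac{1}{4} \left(1 + \mu(1-\alpha')\right)^2 \left( \Delta_S - \frac{1}{1 + \mu(1-\alpha')} \right)^2 + \frac{1}{4}\left(1 + \tfrac{1}{2}\alpha'\right) \\[10pt]
\text{s.t.} \quad \Delta_S^2 \le \dfrac{1}{2\mu^2 \alpha(1-\alpha)} \quad \text{and} \quad \Delta_S \le \dfrac{1}{2(1-\mu)}
\end{array}
\right.
\]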
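The solution is given as:
\[
\Delta_S^* =
\begin{cases}
\dfrac{1}{1 + \mu(1-\alpha)^2} & \text{if } \dfrac{1}{1 + \mu(1-\alpha)^2} \le \dfrac{1}{2(1-\mu)} \\[10pt]
\dfrac{1}{2(1-\mu)} & \text{otherwise}
\end{cases},
\]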
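and the associated value is:
\[
V^T_{\{g,\emptyset\}} =
\begin{cases}
\dfrac{1}{4} + \dfrac{1}{8}\alpha(2-\alpha) & \text{if } \dfrac{1}{1 + \mu(1-\alpha)^2} \le \dfrac{1}{2(1-\mu)} \\[10pt]
\dfrac{1 + \mu(1-\alpha)^2}{4(1-\mu)} \left[ 1 - \dfrac{1 + \mu(1-\alpha)^2}{4(1-\mu)} \right] + \dfrac{1}{8}\alpha(2-\alpha) & \text{otherwise}
\end{cases}
\tag{3}
\]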
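As a cross-check on the Case 1 algebra, the Python sketch below evaluates the objective both in its general form and in the completed-square form obtained after plugging in the probabilities, and then evaluates the closed forms for $\Delta_S^*$ and the value in (3). It assumes $y = 1$ and $\pi = 1/4$ as in the text, restricts attention to $\mu < 1$, and reads the objective as $2(p^T)^2 P^T \Pr(\cdot)\, y \Delta_S + (1-p^T)\pi - (p^T P^T)^2 \Delta_S^2$. The function names are illustrative and are not part of the original proof.

```python
# Sanity check for Case 1 (X_P = {g, ∅}) with y = 1 and pi = 1/4.
# Function names are illustrative; they do not appear in the text.

def alpha_prime(alpha):
    """alpha' := 1 - (1 - alpha)^2."""
    return 1.0 - (1.0 - alpha) ** 2


def objective_general(d_s, alpha, mu):
    """Objective before substitution, using the Case 1 probabilities."""
    ap = alpha_prime(alpha)
    p_t = 1.0 - 0.5 * ap                      # p^T
    big_p_t = mu + (1.0 - mu) / (2.0 - ap)    # P^T
    pr_good = 1.0 / (2.0 - ap)                # Pr(omega = G | x^T in X_P)
    pi = 0.25
    return (2.0 * p_t**2 * big_p_t * pr_good * d_s
            + (1.0 - p_t) * pi
            - (p_t * big_p_t) ** 2 * d_s**2)


def objective_plugged(d_s, alpha, mu):
    """Objective after substitution, in completed-square form."""
    ap = alpha_prime(alpha)
    c = 1.0 + mu * (1.0 - ap)
    return -0.25 * c**2 * (d_s - 1.0 / c) ** 2 + 0.25 * (1.0 + 0.5 * ap)


def solve_case1(alpha, mu):
    """Closed-form (Delta_S^*, V^T_{g,∅}) for 0 <= mu < 1."""
    c = 1.0 + mu * (1.0 - alpha) ** 2
    peak = 1.0 / c                     # unconstrained maximizer
    cap = 1.0 / (2.0 * (1.0 - mu))     # upper bound from u_P >= l_P
    if peak <= cap:
        return peak, 0.25 + alpha * (2.0 - alpha) / 8.0
    r = c / (4.0 * (1.0 - mu))
    return cap, r * (1.0 - r) + alpha * (2.0 - alpha) / 8.0


if __name__ == "__main__":
    for alpha, mu in [(0.3, 0.1), (0.3, 0.6), (0.8, 0.2)]:
        d_star, value = solve_case1(alpha, mu)
        print(alpha, mu, d_star, value, objective_general(d_star, alpha, mu))
```

For any $(\alpha, \mu)$ with $\mu < 1$, the two objective functions should agree at every $\Delta_S$, and the value returned by `solve_case1` should coincide with the objective evaluated at the returned $\Delta_S^*$.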
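    Case 2: $X_P = \{g\}$. Here, $p^T - p^T_\emptyset > 0$, so we have:
\[
\mathcal{P}^T_{\{g\}}:\ \left\{
\begin{array}{ll}
\displaystyle \max_{\Delta_C, \Delta_S}\ 2 (p^T)^2 P^T \Pr(\omega = G \mid x^T \in X_P)\, y \Delta_S + (1 - p^T)\pi - (p^T P^T)^2 \Delta_S^2 & \\[6pt]
\text{s.t.} & \\[2pt]
\Delta_C \le u_A := \left[ \tfrac{3}{2} (p^T P^T)^2 - \tfrac{1}{2} (p^T_\emptyset P^T_\emptyset)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T \right] \dfrac{\Delta_S^2}{p^T - p^T_\emptyset} & (IC_A^T\text{-}2) \\[8pt]
\Delta_C \ge l_P := \tfrac{1}{2}\pi - \left( \Pr(\omega = G \mid x^T \in X_P)\, y - 2 P^T \Delta_S \right) p^T P^T \Delta_S & (IC_P^T\text{-}1) \\[6pt]
\Delta_C \le u_P := \tfrac{1}{2}\pi - \left( \Pr(\omega = G \mid x^T \notin X_P)\, y - 2 P_C^T \Delta_S \right) p^T P^T \Delta_S & (IC_P^T\text{-}2)
\end{array}
\right.
\]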
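As $\Delta_C$ does not appear in the objective function, we can replace the constraints by requiring $l_P \le u_A$ and $l_P \le u_P$, and the program simplifies to:
\[
\left\{
\begin{array}{l}
\displaystyle \max_{\Delta_S}\ 2 (p^T)^2 P^T \Pr(\omega = G \mid x^T \in X_P)\, y \Delta_S + (1 - p^T)\pi - (p^T P^T)^2 (\Delta_S)^2 \\[6pt]
\text{s.t.} \\[2pt]
\tfrac{1}{2}\pi \le \left( \Pr(\omega = G \mid x^T \in X_P)\, y - 2 P^T \Delta_S \right) p^T P^T \Delta_S \\[4pt]
\qquad\quad + \left[ \tfrac{3}{2} (p^T P^T)^2 - \tfrac{1}{2} (p^T_\emptyset P^T_\emptyset)^2 - p^T_\emptyset P^T_\emptyset\, p^T P^T \right] \dfrac{\Delta_S^2}{p^T - p^T_\emptyset} \\[8pt]
\Delta_S \le \dfrac{y}{2(1-\mu)}.
\end{array}
\right.
\]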
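Plugging the values for the probabilities (and parameters), we obtain:
\[
\left\{
\begin{array}{l}
\displaystyle \max_{\Delta_S}\ \Pi^T_{\{g\}}(\Delta_S) := -\frac{1}{4}\left( \alpha' (\Delta_S - 1) \right)^2 + \frac{1}{4}\left[ 1 - \alpha'\left( \frac{1}{2} - \alpha' \right) \right] \\[10pt]
\text{s.t.} \\[2pt]
\alpha(2-\alpha)\Delta_S - \tfrac{1}{2}\alpha(1-\alpha)\Delta_S^2 \ge \dfrac{1}{4} \\[8pt]
\Delta_S \le \dfrac{1}{2(1-\mu)}
\end{array}
\right.
\]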
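    Let $\hat{\alpha} := 0.12445$ and $K(\alpha) := \frac{1}{1-\alpha}\left( 2 - \alpha - \sqrt{(2-\alpha)^2 - \frac{1-\alpha}{2\alpha}} \right)$. It is routine to check that the program does not admit a solution if $\alpha < \hat{\alpha}$ or $K(\alpha) > \frac{1}{2(1-\mu)}$. Otherwise, the solution is as follows:
\[
\Delta_S^* =
\begin{cases}
1 & \text{if } \alpha \ge \hat{\alpha} \text{ and } K(\alpha) \le 1 \le \dfrac{1}{2(1-\mu)} \\[10pt]
\dfrac{1}{2(1-\mu)} & \text{if } \alpha \ge \hat{\alpha} \text{ and } K(\alpha) \le \dfrac{1}{2(1-\mu)} < 1 \\[10pt]
\tilde{\alpha} & \text{if } \alpha \ge \hat{\alpha} \text{ and } 1 < K(\alpha) \le \dfrac{1}{2(1-\mu)}
\end{cases},
\]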
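and the associated value function is
\[
V^T_{\{g\}} =
\begin{cases}
\Pi^T_{\{g\}}(1) & \text{if } \alpha \ge \hat{\alpha} \text{ and } K(\alpha) \le 1 \le \dfrac{1}{2(1-\mu)} \\[10pt]
\Pi^T_{\{g\}}\left( \dfrac{1}{2(1-\mu)} \right) & \text{if } \alpha \ge \hat{\alpha} \text{ and } K(\alpha) \le \dfrac{1}{2(1-\mu)} < 1 \\[10pt]
\Pi^T_{\{g\}}(\tilde{\alpha}) & \text{if } \alpha \ge \hat{\alpha} \text{ and } 1 < K(\alpha) \le \dfrac{1}{2(1-\mu)}
\end{cases}
\tag{4}
\]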
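The threshold $\hat{\alpha}$ and the function $K(\alpha)$ in Case 2 can be checked numerically. The sketch below assumes that $\hat{\alpha}$ is the point at which the term under the square root in $K(\alpha)$ changes sign, and that $K(\alpha)$ is the smaller root of the binding constraint $\alpha(2-\alpha)\Delta_S - \tfrac{1}{2}\alpha(1-\alpha)\Delta_S^2 = \tfrac{1}{4}$. Both readings are inferred from the formulas rather than stated in the text, and the helper names are hypothetical.

```python
import math


def discriminant(alpha):
    """Term under the square root in K(alpha)."""
    return (2.0 - alpha) ** 2 - (1.0 - alpha) / (2.0 * alpha)


def alpha_hat(lo=1e-6, hi=0.5, tol=1e-12):
    """Bisection for the sign change of the discriminant on (0, 1/2)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if discriminant(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def k_of_alpha(alpha):
    """K(alpha); defined for alpha_hat <= alpha < 1."""
    return (2.0 - alpha - math.sqrt(discriminant(alpha))) / (1.0 - alpha)


def constraint_lhs(d_s, alpha):
    """Left-hand side of the first constraint of the plugged-in program."""
    return alpha * (2.0 - alpha) * d_s - 0.5 * alpha * (1.0 - alpha) * d_s**2


def admits_solution(alpha, mu):
    """Feasibility test from the text, for 0 < alpha < 1 and 0 <= mu < 1."""
    if alpha < alpha_hat():
        return False
    return k_of_alpha(alpha) <= 1.0 / (2.0 * (1.0 - mu))


if __name__ == "__main__":
    print("alpha_hat ~", round(alpha_hat(), 5))
    for a in (0.2, 0.5, 0.9):
        print(a, k_of_alpha(a), constraint_lhs(k_of_alpha(a), a))
```

Under these assumptions the bisection should land close to the stated value of $\hat{\alpha}$, and `constraint_lhs(k_of_alpha(a), a)` should equal $1/4$ up to rounding.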
    Thus, we conclude that the program $\mathcal{P}^T$ always admits a solution for $X_P = \{g, \emptyset\}$ and admits a solution for $X_P = \{g\}$ if and only if $\alpha$ and $\mu$ are sufficiently large.
                                                                                                     Q.E.D.
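Proof of Proposition 2. Step 1. Notice that programs $\mathcal{P}^I_{\{g,\emptyset\}}$ and $\mathcal{P}^T_{\{g,\emptyset\}}$ have the same objective function. Denote the unconstrained maximum of that objective function as
\[
V_{\{g,\emptyset\}} = \frac{1}{4} + \frac{1}{8}\alpha'.
\]
Similarly, $\mathcal{P}^I_{\{g\}}$ and $\mathcal{P}^T_{\{g\}}$ have the same objective function, and we denote the unconstrained maximum as
\[
V_{\{g\}} = \frac{1}{4}\left[ 1 - \alpha'\left( \frac{1}{2} - \alpha' \right) \right].
\]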
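    Since the unconstrained maximum must be (weakly) larger than the value under a constrained maximization, we have
\[
V^I_{\{g,\emptyset\}} \le V_{\{g,\emptyset\}}, \quad V^T_{\{g,\emptyset\}} \le V_{\{g,\emptyset\}}, \quad V^I_{\{g\}} \le V_{\{g\}} \quad \text{and} \quad V^T_{\{g\}} \le V_{\{g\}}.
\]
    Further, we notice that $V_{\{g,\emptyset\}} - V_{\{g\}} = \frac{1}{4}\alpha'(1-\alpha') \ge 0$, so we have
\[
V_{\{g\}} \le V_{\{g,\emptyset\}}
\]
and equality holds if and only if $\alpha' = 0$ or $1$.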
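    Step 2. Recall that the solutions for the programs $\mathcal{P}^I_{\{g,\emptyset\}}$ and $\mathcal{P}^T_{\{g,\emptyset\}}$ (see (1) and (3); we maintain $y = 1$ to streamline notation) stipulate
\[
V^I_{\{g,\emptyset\}} = \frac{1}{4}\left( 1 + \frac{1}{2}\alpha' \right) = S^* = \frac{1}{4}\left( 1 + \alpha - \frac{1}{2}\alpha^2 \right)
\]
when $\frac{\alpha'\mu^2}{(1+\mu(1-\alpha'))^2} \le \frac{1}{2}$, and
\[
V^T_{\{g,\emptyset\}} = S^*
\]
when $\frac{1}{1+\mu(1-\alpha)^2} \le \frac{1}{2(1-\mu)}$.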
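    Let $\mu_0$ be the solution to the equation
\[
\frac{1}{1 + \mu(1-\alpha)^2} = \frac{1}{2(1-\mu)};
\]
that is,
\[
\mu_0 = \frac{1}{2 + (1-\alpha)^2} = \frac{1}{3 - \alpha'}. \tag{5}
\]
Note that, for $\mu \in [0, \mu_0)$, $\frac{1}{1+\mu(1-\alpha)^2} > \frac{1}{2(1-\mu)}$; and for $\mu \in [\mu_0, 1)$, $\frac{1}{1+\mu(1-\alpha)^2} \le \frac{1}{2(1-\mu)}$.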
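    Next, define $\mu_1$ as follows:
\[
\mu_1 =
\begin{cases}
1 & \text{if } \dfrac{\alpha'\mu^2}{(1+\mu(1-\alpha'))^2} < \dfrac{1}{2} \ \ \forall \mu \in [0,1] \\[10pt]
\mu^*(\alpha') & \text{otherwise}
\end{cases},
\tag{6}
\]
where
\[
\mu^*(\alpha') = \frac{1 - \alpha' + \sqrt{2\alpha'}}{2\alpha' - (1-\alpha')^2}
\]
is the unique solution to
\[
\frac{\alpha'\mu^2}{(1 + \mu(1-\alpha'))^2} = \frac{1}{2}
\]
in $[0, 1]$.
    Note that $\frac{\alpha'\mu^2}{(1+\mu(1-\alpha'))^2} \le \frac{1}{2}$ for $\mu \in [0, \mu_1]$ and $\frac{\alpha'\mu^2}{(1+\mu(1-\alpha'))^2} > \frac{1}{2}$ for $\mu \in (\mu_1, 1]$.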
    Step 3. Notice that $\mu_0 < \mu_1$ for all $\alpha \in [0, 1]$, since, using (5), one obtains
\[
\frac{\alpha'\mu_0^2}{(1 + \mu_0(1-\alpha'))^2} = \frac{\alpha'}{4(2-\alpha')^2} < \frac{1}{2}.
\]
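The ordering $\mu_0 < \mu_1$ can also be confirmed on a grid of $\alpha$ values. The sketch below implements (5) and (6) directly. It uses the observation, not spelled out in the text, that the ratio $\alpha'\mu^2/(1+\mu(1-\alpha'))^2$ is increasing in $\mu$, so the condition "for all $\mu \in [0,1]$" in (6) only needs to be checked at $\mu = 1$. Function names are illustrative.

```python
import math


def alpha_prime(alpha):
    """alpha' := 1 - (1 - alpha)^2."""
    return 1.0 - (1.0 - alpha) ** 2


def mu0(alpha):
    """Cutoff defined in (5)."""
    return 1.0 / (3.0 - alpha_prime(alpha))


def ratio(mu, ap):
    """alpha' mu^2 / (1 + mu (1 - alpha'))^2."""
    return ap * mu**2 / (1.0 + mu * (1.0 - ap)) ** 2


def mu_star(ap):
    """Solution of ratio(mu, alpha') = 1/2."""
    return (1.0 - ap + math.sqrt(2.0 * ap)) / (2.0 * ap - (1.0 - ap) ** 2)


def mu1(alpha):
    """Cutoff defined in (6)."""
    ap = alpha_prime(alpha)
    # The ratio is increasing in mu, so it stays below 1/2 on [0, 1]
    # exactly when it is below 1/2 at mu = 1.
    if ratio(1.0, ap) < 0.5:
        return 1.0
    return mu_star(ap)


if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.7, 1.0):
        print(alpha, mu0(alpha), mu1(alpha))
```

On every grid point the first cutoff should lie strictly below the second, which is the ordering used to partition the range of $\mu$ in the remainder of the proof.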
    Combining the above observations, we obtain: (i) if $\mu < \mu_0$, $S^* = V^I_{\{g,\emptyset\}} > \max\{V^T_{\{g,\emptyset\}}, V^I_{\{g\}}, V^T_{\{g\}}\}$; that is, individual assignment with $X_P = \{g, \emptyset\}$ is optimal; (ii) if $\mu > \mu_1$, $S^* = V^T_{\{g,\emptyset\}} > \max\{V^I_{\{g,\emptyset\}}, V^I_{\{g\}}, V^T_{\{g\}}\}$; that is, team assignment with $X_P = \{g, \emptyset\}$ is optimal; (iii) if $\mu_0 \le \mu \le \mu_1$, $S^* = V^I_{\{g,\emptyset\}} = V^T_{\{g,\emptyset\}} > \max\{V^I_{\{g\}}, V^T_{\{g\}}\}$; that is, both team and individual assignment with $X_P = \{g, \emptyset\}$ are optimal.
                                                                                                     Q.E.D.
Proof of Proposition 3. Step 1. From (5) it directly follows that µ0 is increasing in α.
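   Step 2. Now, consider the definition for $\mu_1$ as given in (6). Note that when $\alpha' < \frac{1}{2}$,
\[
\frac{\alpha'\mu^2}{(1+\mu(1-\alpha'))^2} \le \alpha'\mu^2 \le \alpha' < \frac{1}{2};
\]
so, $\mu_1 = 1$. And for $\alpha' \ge \frac{1}{2}$, we have
\[
\mu_1 = \min\{1, \mu^*(\alpha')\}.
\]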
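Note that
\[
\frac{d}{d\alpha'}\mu^*(\alpha') = -\frac{\left( 1 - \frac{1}{\sqrt{2\alpha'}} \right)\left( 2\alpha' - (1-\alpha')^2 \right) + 2(2-\alpha')\left( 1 - \alpha' + \sqrt{2\alpha'} \right)}{\left( 2\alpha' - (1-\alpha')^2 \right)^2}.
\]
For $\alpha' \in [\frac{1}{2}, 1)$ it is routine to check that $1 - \frac{1}{\sqrt{2\alpha'}} \ge 0$ and all other three terms in the numerator are strictly positive (the denominator is positive by virtue of being a square term). So, $\frac{d}{d\alpha'}\mu^*(\alpha') < 0$. Hence, $\mu^*(\alpha')$ is also strictly decreasing in $\alpha$ when $\alpha \in \left[ 1 - 1/\sqrt{2},\, 1 \right)$ (recall $\alpha' := 1 - (1-\alpha)^2$).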
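    Step 3. Finally, note that when $\alpha = 1 - \frac{1}{\sqrt{2}}$, $\mu^*(\alpha') = 2$; and when $\alpha = 1$, $\mu^*(\alpha') = \frac{1}{\sqrt{2}}$. As $\mu^*(\alpha')$ is decreasing in $\alpha$, by the Intermediate Value Theorem, there exists an $\alpha^*$ such that $\mu^*(\alpha^*) = 1$. Also, when $\alpha < \alpha^*$, $\mu^*(\alpha') > 1$; when $\alpha > \alpha^*$, $\mu^*(\alpha') < 1$.
    Thus, for $1 - \frac{1}{\sqrt{2}} \le \alpha \le \alpha^*$, $\mu_1 = \min\{1, \mu^*(\alpha')\} = 1$ and for $\alpha \ge \alpha^*$, $\mu_1$ is decreasing in $\alpha$.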
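The endpoint values and the crossing point $\alpha^*$ can be illustrated numerically. The closed form used below for $\alpha^*$ is not given in the text: it comes from setting $\mu^*(\alpha') = 1$, which reduces to $\sqrt{2\alpha'} = 2 - \alpha'$, that is, $\alpha'^2 - 6\alpha' + 4 = 0$, whose root in $[\frac{1}{2}, 1]$ is $\alpha' = 3 - \sqrt{5}$. If that algebra is right, $\alpha^*$ is roughly $0.514$. Function names are illustrative.

```python
import math


def alpha_prime(alpha):
    """alpha' := 1 - (1 - alpha)^2."""
    return 1.0 - (1.0 - alpha) ** 2


def mu_star(ap):
    """mu^*(alpha') as defined after (6)."""
    return (1.0 - ap + math.sqrt(2.0 * ap)) / (2.0 * ap - (1.0 - ap) ** 2)


def alpha_star():
    """alpha at which mu^* equals 1 (derived here, not stated in the text)."""
    ap = 3.0 - math.sqrt(5.0)          # root of ap^2 - 6 ap + 4 = 0 in [1/2, 1]
    return 1.0 - math.sqrt(1.0 - ap)   # invert alpha' = 1 - (1 - alpha)^2


def mu1(alpha):
    """mu_1 as characterized in Step 2."""
    ap = alpha_prime(alpha)
    if ap < 0.5:
        return 1.0
    return min(1.0, mu_star(ap))


if __name__ == "__main__":
    print("mu*(1/2) =", mu_star(0.5))
    print("mu*(1)   =", mu_star(1.0))
    print("alpha*   ~", round(alpha_star(), 4))
```

Evaluating `mu1` on a grid should show a flat segment at $1$ up to $\alpha^*$ followed by a strictly decreasing segment.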
                                                                                                 Q.E.D.
Proof of Proposition 4. Step 1: Since Lemmas 1 and 3 hold for any $\theta \in (\frac{1}{2}, 1)$ (note that the proofs of these lemmas presented above do not rely on any specific value of $\theta$), we may continue to limit attention to the set of four programs $\mathcal{P}^I_{\{g,\emptyset\}}$, $\mathcal{P}^I_{\{g\}}$, $\mathcal{P}^T_{\{g,\emptyset\}}$, and $\mathcal{P}^T_{\{g\}}$ as defined in the proofs of Lemmas 2 and 4. In this step, we compute the unconstrained maximum of these four programs. That is, for $\mathcal{P}^I_{X_P}$, $X_P \in \{\{g, \emptyset\}, \{g\}\}$, we solve for
\[
\bar{V}^I_{X_P} := \max_{\Delta_S}\ 2 (p^I)^2 P^I \Pr(\omega = G \mid x \in X_P)\, y \Delta_S + (1 - p^I)\pi - (p^I P^I)^2 \Delta_S^2,
\]
and for $\mathcal{P}^T_{X_P}$, $X_P \in \{\{g, \emptyset\}, \{g\}\}$, we solve for
\[
\bar{V}^T_{X_P} := \max_{\Delta_S}\ 2 (p^T)^2 P^T \Pr(\omega = G \mid x^T \in X_P)\, y \Delta_S + (1 - p^T)\pi - (p^T P^T)^2 \Delta_S^2.
\]
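    Plugging in the values for all the probabilities, and solving the optimization problem (notice that all objective functions are quadratic in $\Delta_S$; hence the solution exists and is unique), we obtain (recall that $\alpha' = 1 - (1-\alpha)^2$):
\[
\bar{V}^I_{\{g,\emptyset\}} = \bar{V}^T_{\{g,\emptyset\}} = \frac{1}{4}\left( (1 - \alpha'(1-\theta))^2 + \frac{1}{2}\alpha' \right) =: \bar{V},
\]
and
\[
\bar{V}^I_{\{g\}} = \bar{V}^T_{\{g\}} = \frac{1}{4}\left( (\alpha'\theta)^2 + \left(1 - \frac{1}{2}\alpha'\right) \right).
\]
Note that
\[
\bar{V} > \bar{V}^I_{\{g\}} = \bar{V}^T_{\{g\}}.
\]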
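The ranking of the two unconstrained maxima can be made explicit. Subtracting the two closed forms and simplifying gives $\bar{V} - \bar{V}_{\{g\}} = \frac{1}{4}\alpha'(2\theta - 1)(1 - \alpha')$, which is strictly positive for $\alpha' \in (0, 1)$ and $\theta > \frac{1}{2}$. This factorization is derived here and does not appear in the text. The short Python sketch below checks it on a grid; the function names are hypothetical.

```python
def v_bar(ap, theta):
    """Unconstrained maximum for X_P = {g, ∅}, as a function of alpha'."""
    return 0.25 * ((1.0 - ap * (1.0 - theta)) ** 2 + 0.5 * ap)


def v_bar_g(ap, theta):
    """Unconstrained maximum for X_P = {g}, as a function of alpha'."""
    return 0.25 * ((ap * theta) ** 2 + (1.0 - 0.5 * ap))


def gap(ap, theta):
    """Claimed closed form of v_bar - v_bar_g (derived, not from the text)."""
    return 0.25 * ap * (2.0 * theta - 1.0) * (1.0 - ap)


if __name__ == "__main__":
    for ap in (0.1, 0.5, 0.9):
        for theta in (0.6, 0.85, 1.0):
            print(ap, theta, v_bar(ap, theta) - v_bar_g(ap, theta), gap(ap, theta))
```

At $\theta = 1$ the expression for $\bar{V}$ reduces to $\frac{1}{4} + \frac{1}{8}\alpha'$, the unconstrained maximum used in the proof of Proposition 2.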
    In what follows, we focus our attention on programs $\mathcal{P}^I_{\{g,\emptyset\}}$ and $\mathcal{P}^T_{\{g,\emptyset\}}$, as we show that for any given set of parameters, at least one of them achieves the value $\bar{V}$.
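    Step 2: We show that for $\theta$ sufficiently large, there exists a cutoff $\mu_0(\alpha; \theta)$ such that $V^T_{\{g,\emptyset\}} < \bar{V}$ if $\mu < \mu_0(\alpha; \theta)$; and $V^T_{\{g,\emptyset\}} = \bar{V}$ otherwise. Plugging the values of the probabilities, the program $\mathcal{P}^T_{\{g,\emptyset\}}$ can be written as:
\[
\mathcal{P}^T_{\{g,\emptyset\}}:\ \left\{
\begin{array}{ll}
\displaystyle \max_{\Delta_S}\ \bar{V} - \frac{1}{4}\left[ \left( 1 - \alpha'(1-\theta) + \mu(1 - \alpha'\theta) \right)\Delta_S - \left( 1 - \alpha'(1-\theta) \right) \right]^2 & \\[8pt]
\text{s.t.} & \\[2pt]
\left[ 1 - \alpha'(1-\theta) + \mu(1 - \alpha'\theta) \right](1-\theta)\Delta_S + \tfrac{1}{2}\alpha(1-\alpha)\left( 1 - \theta(1-\mu) \right)^2 \Delta_S^2 \le \dfrac{1}{4} & (C_1^T) \\[8pt]
\Delta_S \le \dfrac{1}{2(1-\mu)} & (C_2^T)
\end{array}
\right.
\]
The objective function achieves its peak at $\Delta_S^* = \frac{1 - \alpha'(1-\theta)}{1 - \alpha'(1-\theta) + \mu(1 - \alpha'\theta)}$. If $\Delta_S^*$ is feasible under constraints $(C_1^T)$ and $(C_2^T)$, $V^T_{\{g,\emptyset\}} = \bar{V}$; and $V^T_{\{g,\emptyset\}} < \bar{V}$ otherwise. Next, we analyze conditions under which this solution may be feasible.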
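    Plugging $\Delta_S^*$ into $(C_2^T)$ and simplifying, we have:
\[
\mu \ge \frac{1 - \alpha'(1-\theta)}{3 - \alpha'\theta - 2\alpha'(1-\theta)} =: \mu_0(\alpha; \theta).
\]
    Thus, $\Delta_S^*$ satisfies constraint $(C_2^T)$ if and only if $\mu \ge \mu_0(\alpha; \theta)$.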
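    We also claim that $\Delta_S^*$ satisfies $(C_1^T)$ if $\theta > 0.85$. To see this, plug $\Delta_S^*$ into the left-hand side of $(C_1^T)$, and we obtain:
\[
\begin{aligned}
& (1-\theta)(1 - \alpha'(1-\theta)) + \tfrac{1}{2}\alpha(1-\alpha)(1 - \alpha'(1-\theta))^2 \left[ \frac{1 - \theta(1-\mu)}{1 - \alpha'(1-\theta) + \mu(1 - \alpha'\theta)} \right]^2 \\
&\quad \le (1-\theta)(1 - \alpha'(1-\theta)) + \tfrac{1}{2}\alpha(1-\alpha)(1 - \alpha'(1-\theta))^2 \left( \frac{1}{2 - \alpha'} \right)^2 \\
&\quad \le (1-\theta) + \frac{\alpha(1-\alpha)}{2(2-\alpha')^2} \\
&\quad \le \tfrac{1}{4}.
\end{aligned}
\]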
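The first inequality follows as the expression is increasing in $\mu$ (and is therefore bounded by its value at $\mu = 1$), the second one follows as $1 - \alpha'(1-\theta) \in [0, 1]$, and the final one holds since $1 - \theta < 0.15$ and $\frac{\alpha(1-\alpha)}{2(2-\alpha')^2} < 0.1$ (for $\alpha \in [0, 1]$).
    Hence, for $\theta > 0.85$, $V^T_{\{g,\emptyset\}} < \bar{V}$ when $\mu < \mu_0(\alpha; \theta)$, and $V^T_{\{g,\emptyset\}} = \bar{V}$ otherwise.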
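The two feasibility claims in Step 2 can be checked on a parameter grid. The sketch below evaluates $\Delta_S^*$, the left-hand side of $(C_1^T)$, and the cutoff $\mu_0(\alpha; \theta)$ exactly as written above; it is a numerical illustration under those formulas, not a substitute for the proof, and the function names are illustrative.

```python
def alpha_prime(alpha):
    """alpha' := 1 - (1 - alpha)^2."""
    return 1.0 - (1.0 - alpha) ** 2


def delta_star(alpha, mu, theta):
    """Unconstrained maximizer of the team objective."""
    ap = alpha_prime(alpha)
    a = 1.0 - ap * (1.0 - theta)
    return a / (a + mu * (1.0 - ap * theta))


def mu0(alpha, theta):
    """Cutoff mu_0(alpha; theta) obtained from (C_2^T)."""
    ap = alpha_prime(alpha)
    return (1.0 - ap * (1.0 - theta)) / (3.0 - ap * theta - 2.0 * ap * (1.0 - theta))


def c1_lhs(d_s, alpha, mu, theta):
    """Left-hand side of (C_1^T)."""
    ap = alpha_prime(alpha)
    linear = (1.0 - ap * (1.0 - theta) + mu * (1.0 - ap * theta)) * (1.0 - theta) * d_s
    quadratic = 0.5 * alpha * (1.0 - alpha) * (1.0 - theta * (1.0 - mu)) ** 2 * d_s**2
    return linear + quadratic


def c2_holds(d_s, mu):
    """(C_2^T) for mu < 1."""
    return d_s <= 1.0 / (2.0 * (1.0 - mu))


def bound_term(alpha):
    """alpha (1 - alpha) / (2 (2 - alpha')^2), claimed to be below 0.1."""
    return alpha * (1.0 - alpha) / (2.0 * (2.0 - alpha_prime(alpha)) ** 2)


if __name__ == "__main__":
    print(max(bound_term(i / 1000.0) for i in range(1001)))
    print(mu0(0.5, 1.0), 1.0 / (3.0 - alpha_prime(0.5)))  # theta = 1 recovers (5)
```

Setting $\theta = 1$ in `mu0` recovers the cutoff $\mu_0 = 1/(3 - \alpha')$ from (5), which serves as a consistency check between the two propositions.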
    Step 3: We show that for $\theta$ sufficiently large, there exists a cutoff $\mu_1(\alpha; \theta)$ such that $V^I_{\{g,\emptyset\}} = \bar{V}$ when $\mu \le \mu_1(\alpha; \theta)$ and $V^I_{\{g,\emptyset\}} < \bar{V}$ otherwise. The proof is analogous to the one given in Step 2 above.
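   Plugging the values of the probabilities, the program $\mathcal{P}^I_{\{g,\emptyset\}}$ can be written as:
\[
\mathcal{P}^I_{\{g,\emptyset\}}:\ \left\{
\begin{array}{ll}
\displaystyle \max_{\Delta_S}\ \bar{V} - \frac{1}{4}\left[ \left( 1 - \alpha'(1-\theta) + \mu(1 - \alpha'\theta) \right)\Delta_S - \left( 1 - \alpha'(1-\theta) \right) \right]^2 & \\[8pt]
\text{s.t.} & \\[2pt]
(1-\theta)\left[ 1 + \mu - \alpha'(1 - \theta + \mu\theta) \right]\Delta_S + \tfrac{1}{2}\alpha'\left( 1 - \theta(1-\mu) \right)^2 \Delta_S^2 \le \dfrac{1}{4} & (C_1^I) \\[8pt]
\Delta_S \le \dfrac{1}{1-\mu} & (C_2^I)
\end{array}
\right.
\]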
   The objective function achieves its peak at $\Delta_S^* = \frac{1 - \alpha'(1-\theta)}{1 - \alpha'(1-\theta) + \mu(1 - \alpha'\theta)}$. If $\Delta_S^*$ is feasible under constraints $(C_1^I)$ and $(C_2^I)$, $V^I_{\{g,\emptyset\}} = \bar{V}$; and $V^I_{\{g,\emptyset\}} < \bar{V}$ otherwise. Next, we analyze conditions under which this solution may be feasible.
   It is routine to check that $\Delta_S^*$ is always feasible under $C_2^I$:
\[
\Delta_S^* = \frac{1-\alpha'(1-\theta)}{1-\alpha'(1-\theta)+\mu(1-\alpha'\theta)} \le 1 \le \frac{1}{1-\mu} .
\]
   Now, plugging $\Delta_S^*$ into the left-hand side of $C_1^I$, we get:
\[
L(\mu;\alpha,\theta) := (1-\theta)(1-\alpha'(1-\theta)) + \frac{1}{2}\,\alpha'\,(1-\alpha'(1-\theta))^2\left[\frac{1-\theta+\mu\theta}{1-\alpha'(1-\theta)+\mu(1-\alpha'\theta)}\right]^2 .
\]
Note that L(µ; α, θ) is increasing in µ ∈ [0, 1], so it achieves its maximum at µ = 1, where:
                                                                                 2
                                                                1 − α0 (1 − θ)
                                                              
                                              0           1
                 L(1; α, θ) = (1 − θ)(1 − α (1 − θ)) + α0                           .
                                                          2          2 − α0
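Now, if $\theta > 0.85$, we have $L(1;0,\theta) = 1-\theta < \tfrac{1}{4}$ and $L(1;1,\theta) = \theta - \tfrac{1}{2}\theta^2 > \tfrac{1}{4}$; also
\[
\frac{d}{d\alpha}L(1;\alpha,\theta) = \frac{2-2\alpha}{(2-\alpha')^3}\,\bigl[R_1+R_2+R_3+R_4\bigr],
\]
where
\[
R_1 = \tfrac{1}{2}(1-\theta)^2\bigl(2\alpha'+\alpha'^{3}\bigr), \qquad R_2 = 3(1-\theta)^2\bigl(\alpha'-\alpha'^{2}\bigr),
\]
\[
R_3 = 8\left(\tfrac{3}{4}-\theta\right)^2\alpha', \qquad R_4 = 1-8(1-\theta)^2 .
\]
As $R_i \ge 0$ for $i = 1,\dots,4$, we have $\frac{d}{d\alpha}L(1;\alpha,\theta) \ge 0$. So, by the Intermediate Value Theorem, there exists a unique $\alpha^*(\theta)\in(0,1)$ such that $L(1;\alpha^*(\theta),\theta) = \tfrac{1}{4}$.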
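As an informal numerical sanity check of the two claims above (the decomposition of the derivative into $R_1,\dots,R_4$ and the existence of the threshold), the sketch below works directly with $\alpha'$ (`ap` in the code) rather than $\alpha$, because the mapping from $\alpha$ to $\alpha'$ is defined earlier in the dissertation and is not reproduced here. The function names, the step size, and the sample values $\theta = 0.9$ and $\alpha' = 0.5$ are illustrative assumptions and not part of the proof.

```python
def L1(ap, theta):
    """L(1; alpha, theta), written as a function of alpha' (ap)."""
    u = 1 - ap * (1 - theta)
    return (1 - theta) * u + 0.5 * ap * (u / (2 - ap)) ** 2


def r_sum(ap, theta):
    """R1 + R2 + R3 + R4 as defined in the text."""
    t = 1 - theta
    r1 = 0.5 * t**2 * (2 * ap + ap**3)
    r2 = 3 * t**2 * (ap - ap**2)
    r3 = 8 * (0.75 - theta) ** 2 * ap
    r4 = 1 - 8 * t**2
    return r1 + r2 + r3 + r4


def dL1_dap(ap, theta, h=1e-6):
    """Central finite difference of L(1; ., theta) with respect to alpha'."""
    return (L1(ap + h, theta) - L1(ap - h, theta)) / (2 * h)


def ap_threshold(theta, tol=1e-12):
    """Bisection for the alpha' at which L(1; ., theta) = 1/4.

    Assumes L1 is increasing in alpha' with L1(0) < 1/4 < L1(1),
    which is what the text establishes for theta > 0.85.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if L1(mid, theta) < 0.25:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta, ap = 0.9, 0.5
# dL/d(alpha') is compared with (R1+R2+R3+R4) / (2 - alpha')^3. The extra
# factor (2 - 2*alpha) in the text is taken here to be the chain-rule term
# d(alpha')/d(alpha); that reading is an assumption of this sketch.
print(dL1_dap(ap, theta), r_sum(ap, theta) / (2 - ap) ** 3)
print(ap_threshold(theta), L1(ap_threshold(theta), theta))
```

The two numbers on the first printed line should agree up to finite-difference error, and the second line reports the threshold in terms of $\alpha'$ together with the value of $L$ at that point.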
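    Next, define $\mu_1(\alpha;\theta)$ as follows: for $\alpha \le \alpha^*(\theta)$, let $\mu_1(\alpha;\theta) = 1$; and for $\alpha > \alpha^*(\theta)$, let $\mu_1(\alpha;\theta)$ be the solution to $L(\mu;\alpha,\theta) = \tfrac{1}{4}$. That is:
\[
\mu_1(\alpha;\theta) :=
\begin{cases}
1 & \text{if } \alpha \le \alpha^*(\theta), \\[6pt]
\dfrac{(1-\alpha'(1-\theta))\bigl(\sqrt{K}-(1-\theta)\sqrt{\alpha'}\bigr)}{(1-\alpha'(1-\theta))\,\theta\sqrt{\alpha'}-(1-\alpha'\theta)\sqrt{K}} & \text{otherwise,}
\end{cases}
\]
where $K := \tfrac{1}{2} - 2(1-\theta)(1-\alpha'(1-\theta))$.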
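The closed form for the interior branch of $\mu_1$ can be checked numerically by substituting it back into $L$. The following is a minimal sketch, again parameterized by $\alpha'$ (`ap`) rather than $\alpha$; the function names and the sample point $(\alpha', \theta) = (0.8, 0.9)$ are illustrative assumptions, chosen so that $L(1;\alpha,\theta) > \tfrac{1}{4}$ and the interior branch applies.

```python
import math


def L(mu, ap, theta):
    """Left-hand side of C1^I at Delta_S^*, as a function of alpha' (ap)."""
    u = 1 - ap * (1 - theta)
    ratio = (1 - theta + mu * theta) / (u + mu * (1 - ap * theta))
    return (1 - theta) * u + 0.5 * ap * u**2 * ratio**2


def mu1_interior(ap, theta):
    """Closed-form root of L(mu) = 1/4 (the 'otherwise' branch of mu_1)."""
    u = 1 - ap * (1 - theta)
    K = 0.5 - 2 * (1 - theta) * u
    num = u * (math.sqrt(K) - (1 - theta) * math.sqrt(ap))
    den = u * theta * math.sqrt(ap) - (1 - ap * theta) * math.sqrt(K)
    return num / den


theta, ap = 0.9, 0.8            # illustrative values with L(1) > 1/4
mu = mu1_interior(ap, theta)
print(mu, L(mu, ap, theta))     # the second value should be close to 0.25
```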
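    Notice that when $\alpha \le \alpha^*(\theta)$, we have $L(\mu;\alpha,\theta) \le \tfrac{1}{4}$ for all $\mu \le 1 = \mu_1(\alpha;\theta)$, i.e., $\Delta_S^*$ satisfies $C_1^I$. When $\alpha > \alpha^*(\theta)$, we have $L(\mu;\alpha,\theta) \le \tfrac{1}{4}$ for all $\mu \le \mu_1(\alpha;\theta)$, i.e., $\Delta_S^*$ satisfies $C_1^I$, and $L(\mu;\alpha,\theta) > \tfrac{1}{4}$ for all $\mu > \mu_1(\alpha;\theta)$, i.e., $\Delta_S^*$ violates $C_1^I$. As $\Delta_S^*$ always satisfies $C_2^I$, we conclude: for $\theta > 0.85$, $V^{I}_{\{g,\emptyset\}} = V$ when $\mu \le \mu_1(\alpha;\theta)$ and $V^{I}_{\{g,\emptyset\}} < V$ otherwise.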
    Step 4: Define $\theta^*$ as the largest solution in $[0,1]$ to the equation $\mu_0(1;\theta) = \mu_1(1;\theta)$; i.e.,
\[
\theta^* := \frac{1}{2}\left(1+\frac{1}{\sqrt{2}}\right).
\]
As $\theta^* > 0.85$, the definitions of $\mu_0$ and $\mu_1$ are valid for $\theta > \theta^*$.
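The numerical value of the threshold is easy to confirm. The snippet below is only an arithmetic check of the closed form for $\theta^*$ given in Step 4; the variable name is an illustrative choice.

```python
import math

# theta* = (1/2) * (1 + 1/sqrt(2)), as defined in Step 4
theta_star = 0.5 * (1 + 1 / math.sqrt(2))
print(round(theta_star, 4))  # 0.8536, which exceeds 0.85
```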
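    Step 5: Note that $\mu_0(\alpha;\theta)$ is increasing in both $\alpha$ and $\theta$ for $\theta \in (\theta^*, 1]$:
\[
\frac{d}{d\alpha}\mu_0(\alpha;\theta) = \frac{(2\theta-1)(2-2\alpha)}{\bigl[3-\alpha'\theta-2\alpha'(1-\theta)\bigr]^2} \ge 0,
\]
and
\[
\frac{d}{d\theta}\mu_0(\alpha;\theta) = \frac{\alpha'(2-\alpha')}{\bigl[3-\alpha'\theta-2\alpha'(1-\theta)\bigr]^2} \ge 0 .
\]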
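    Step 6: Next, we claim that $\mu_1(\alpha;\theta)$ is decreasing in $\alpha$ and increasing in $\theta$ for $\theta \in (\theta^*, 1]$.
    Recall that for $\alpha \le \alpha^*(\theta)$, $\mu_1(\alpha;\theta) = 1$; for $\alpha > \alpha^*(\theta)$, taking the derivative of $\mu_1(\alpha;\theta)$ with respect to $\alpha$, we obtain:
\[
\frac{d}{d\alpha}\mu_1(\alpha;\theta) = -\,(S_1S_2+S_3S_4)\,S_5 ,
\]
    where
\[
\begin{aligned}
S_1 &:= (1-\theta)\Bigl[(1-\alpha'(1-\theta))\,\theta\sqrt{\alpha'}-(1-\alpha'\theta)\sqrt{K}\Bigr], \\
S_2 &:= \frac{1}{2\sqrt{K}}\bigl(1-6(1-\theta)(1-\alpha'(1-\theta))\bigr) + \frac{1}{2\sqrt{\alpha'}}\bigl(1-3\alpha'(1-\theta)\bigr), \\
S_3 &:= (1-\alpha'(1-\theta))\Bigl[\sqrt{K}-(1-\theta)\sqrt{\alpha'}\Bigr], \\
S_4 &:= \frac{1}{2\sqrt{\alpha'}}\,\theta\bigl(1-3\alpha'(1-\theta)\bigr) + \frac{1}{2\sqrt{K}}\bigl(\theta+2\theta^2+6\alpha'^{2}-2\bigr), \\
S_5 &:= (2-2\alpha)\Big/\Bigl[(1-\alpha'(1-\theta))\,\theta\sqrt{\alpha'}-(1-\alpha'\theta)\sqrt{K}\Bigr]^2 .
\end{aligned}
\]
It is routine to check that $S_i \ge 0$ for all $i = 1,\dots,5$. Hence, $\frac{d}{d\alpha}\mu_1(\alpha;\theta) \le 0$.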
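   Next, consider the derivative of $\mu_1$ with respect to $\theta$:
\[
\frac{d}{d\theta}\mu_1(\alpha;\theta) = \frac{\dfrac{\sqrt{\alpha'}}{\sqrt{K}}\,T_1 + T_2}{\Bigl[(1-\alpha'(1-\theta))\,\theta\sqrt{\alpha'}-(1-\alpha'\theta)\sqrt{K}\Bigr]^2},
\]
where
\[
\begin{aligned}
T_1 &:= \tfrac{5}{2}(1-\theta) + \tfrac{1}{2}\bigl(-19+33\theta-13\theta^2\bigr)\alpha' + (1-\theta)\bigl(11-17\theta+4\theta^2\bigr)\alpha'^{2} - 4(1-\theta)^3\alpha'^{3}, \\
T_2 &:= -\tfrac{1}{2}\alpha'(2-\alpha') + (1-\alpha'(1-\theta))\Bigl[2-(3-2\theta)\alpha'+(1-\theta)(8\theta-3)\alpha'^{2}\Bigr].
\end{aligned}
\]
Below, we show that $T_1 > 0$ and $T_2 \ge 0$, which implies that $\mu_1(\alpha;\theta)$ is increasing in $\theta$.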
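   Step 6a: To show $T_1 > 0$, we consider two cases: $A > 0$ and $A \le 0$, where $A := -19+33\theta-13\theta^2$.
   Case 1: When $A > 0$, we have $11-17\theta+4\theta^2 < 0$ and $13-17\theta+4\theta^2 \ge 0$. Now,
\[
\begin{aligned}
T_1 &\ge \tfrac{5}{2}(1-\theta) + \tfrac{1}{2}\bigl(-19+33\theta-13\theta^2\bigr)\alpha' + (1-\theta)\bigl(11-17\theta+4\theta^2\bigr) - 4(1-\theta)^3 \\
&= \tfrac{1}{2}\bigl(-19+33\theta-13\theta^2\bigr)\alpha' + (1-\theta)\bigl(13-17\theta+4\theta^2\bigr) + (1-\theta)\Bigl(\tfrac{1}{2}-4(1-\theta)^2\Bigr) \\
&> 0 .
\end{aligned}
\]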
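   Case 2: When $A \le 0$ and $\theta > \theta^* > 0.85$, we have $11-17\theta+4\theta^2 < 0$. Now,
\[
\begin{aligned}
T_1 &\ge \tfrac{5}{2}(1-\theta) + \tfrac{1}{2}\bigl(-19+33\theta-13\theta^2\bigr) + (1-\theta)\bigl(11-17\theta+4\theta^2\bigr) - 4(1-\theta)^3 \\
&= \tfrac{1}{2}\,\theta(5\theta-4) > 0 .
\end{aligned}
\]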
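The simplification in Case 2 is a polynomial identity in $\theta$ and can be confirmed with exact rational arithmetic. The sketch below reads the last term of the bound as $4(1-\theta)^3$ and compares both sides at eight distinct points, which suffices because both sides are polynomials of degree at most three. The helper names are illustrative.

```python
from fractions import Fraction as F


def case2_bound(t):
    """Lower bound for T1 in Case 2 (i.e., T1 evaluated at alpha' = 1)."""
    return (F(5, 2) * (1 - t)
            + F(1, 2) * (-19 + 33 * t - 13 * t**2)
            + (1 - t) * (11 - 17 * t + 4 * t**2)
            - 4 * (1 - t) ** 3)


def simplified(t):
    """Claimed simplification: (1/2) * theta * (5*theta - 4)."""
    return F(1, 2) * t * (5 * t - 4)


# Two polynomials of degree <= 3 that agree at 8 points are identical.
points = [F(k, 7) for k in range(8)]
identity_holds = all(case2_bound(t) == simplified(t) for t in points)
print(identity_holds)
```

Since $\tfrac{1}{2}\theta(5\theta-4)$ is positive whenever $\theta > 0.8$, the bound is strictly positive on the range $\theta > \theta^*$ used in the proof.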
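    Step 6b: To show $T_2 \ge 0$, we first define
\[
\begin{aligned}
T_3 &:= 2-(3-2\theta)\alpha'+(1-\theta)(8\theta-3)\alpha'^{2}, \\
T_4 &:= 2(1-\theta)(2-\alpha')+\alpha'(1-\alpha'(1-\theta)).
\end{aligned}
\]
Now,
\[
\begin{aligned}
T_2 &= -\tfrac{1}{2}\alpha'(2-\alpha') + (1-\alpha'(1-\theta))\,T_3 \\
&\ge -\tfrac{1}{2}\alpha'(2-\alpha') + (1-\alpha'(1-\theta))\,T_4 \\
&\ge -\tfrac{1}{2}\alpha'(2-\alpha') + \tfrac{1}{2}(2-\alpha')^2 \\
&= (2-\alpha')(1-\alpha') \ge 0 .
\end{aligned}
\]
The first inequality follows as $T_3 \ge T_4$ (routine to check). The second inequality follows from the fact that, since $\alpha > \alpha^*(\theta)$, we have $L(1;\alpha,\theta) > \tfrac{1}{4}$, and
\[
L(1;\alpha,\theta) > \tfrac{1}{4} \iff (1-\alpha'(1-\theta))\,T_4 > \tfrac{1}{2}(2-\alpha')^2 .
\]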
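    Step 7: It is routine to check that $\mu_1(1;\theta) > \mu_0(1;\theta)$. So, for any $\theta \in (\theta^*, 1]$, $\mu_1(\alpha;\theta) > \mu_0(\alpha;\theta)$ for all $\alpha \in [0,1]$ (as $\mu_0$ is strictly increasing, and $\mu_1$ is decreasing in $\alpha$). Thus, from Steps 2 and 3, we find: for $\mu < \mu_0(\alpha;\theta)$, $V^{I}_{\{g,\emptyset\}} = V > \max\bigl\{V^{I}_{\{g\}}, V^{T}_{\{g,\emptyset\}}, V^{T}_{\{g\}}\bigr\}$; for $\mu > \mu_1(\alpha;\theta)$, $V^{T}_{\{g,\emptyset\}} = V > \max\bigl\{V^{I}_{\{g\}}, V^{I}_{\{g,\emptyset\}}, V^{T}_{\{g\}}\bigr\}$; otherwise, $V^{I}_{\{g,\emptyset\}} = V^{T}_{\{g,\emptyset\}} = V$. Thus, the characterization of optimal job design is qualitatively identical to that in Proposition 2.
                                                                                             Q.E.D.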
Appendix B: Proofs for Chapter 2
Proof of Lemma 5. Skaperdas (1996) proved the following: in a contest with a fixed number $n$ of players, a contest success function satisfies (B1)-(B5) if and only if it satisfies (B6).
(B1) $\sum_{i=1}^{n} p_i(x) = 1$ and $p_i(x) \ge 0$ for all $i \in \{1,\dots,n\}$ and all $x$; if $x_i > 0$, then $p_i(x) > 0$.
(B2) For all $i \in \{1,\dots,n\}$, $p_i(x)$ is increasing in $x_i$ and decreasing in $x_j$ for all $j \ne i$.
(B3) For any permutation $\varphi$ of $\{1,\dots,n\}$ (i.e., a bijection $\varphi : \{1,\dots,n\} \to \{1,\dots,n\}$) we have
\[
p_i(x) = p_{\varphi(i)}\bigl(x_{\varphi(1)},\dots,x_{\varphi(n)}\bigr), \quad \forall i \in \{1,\dots,n\}.
\]
(B4) Denote by $p_i^{m}$ the probability that player $i$ wins in the subcontest among the players in the subset $M$. Consistency requires
\[
p_i^{m}(x_1,\dots,x_n) = \frac{p_i(x_1,\dots,x_n)}{\sum_{j\in M} p_j(x_1,\dots,x_n)} \quad \forall i \in M,\ \forall M \subseteq \{1,\dots,n\}.
\]
(B5) $p_i^{m}$ is independent of the efforts $x_j$ of the players not included in the subset $M$.
(B6) $p_i(x) = \dfrac{f(x_i)}{\sum_{j=1}^{n} f(x_j)}$ for all $i \in \{1,\dots,n\}$, and $f$ is unique up to positive multiplicative transformations.
Let $\{p_i^{n}\}$ be a system of contest success functions that satisfies (A1)-(A4). Consider $n = 9$: $\{p_i^{9}\}$ satisfies (B1)-(B3), and $\{p_i^{n}\}_{n=1}^{8}$ satisfies (B4). (B5) is trivially satisfied, as the definition of $\{p_i^{n}\}_{n=1}^{8}$ only contains the efforts of players in the game. Thus, there exists an $f(\cdot)$ such that
\[
p_i^{n}(x) = \frac{f(x_i)}{\sum_{j=1}^{n} f(x_j)}, \quad \forall n \le 9 .
\]
       The next step is to prove that $p_i^{n}$ has the same form for all $n > 9$. Suppose, by contradiction, that there exist $k > 9$ and $i \in \{1,\dots,k\}$ such that $p_i^{k}(x) \ne \dfrac{f(x_i)}{\sum_{j=1}^{k} f(x_j)}$. By similar arguments, $\{p_i^{n}\}_{n=1}^{k}$ satisfies (B1)-(B5), so there exists a $g(\cdot)$ such that $p_i^{n}(x) = \dfrac{g(x_i)}{\sum_{j=1}^{n} g(x_j)}$, $\forall n \le k$. Thus, for $n \le 9$,
\[
p_i^{n}(x) = \frac{f(x_i)}{\sum_{j=1}^{n} f(x_j)} = \frac{g(x_i)}{\sum_{j=1}^{n} g(x_j)} .
\]
As Skaperdas (1996) proved, $f(\cdot)$ is unique up to positive multiplicative transformations, so $g(x_i) = \beta f(x_i)$ where $\beta > 0$. Plugging this back into $p_i^{k}$:
\[
p_i^{k}(x) = \frac{g(x_i)}{\sum_{j=1}^{k} g(x_j)} = \frac{\beta f(x_i)}{\sum_{j=1}^{k} \beta f(x_j)} = \frac{f(x_i)}{\sum_{j=1}^{k} f(x_j)} .
\]
Contradiction! Thus, $p_i^{n}(x) = \dfrac{f(x_i)}{\sum_{j=1}^{n} f(x_j)}$, $\forall n$.
                                                                                                                           Q.E.D.
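To illustrate the two properties of the ratio form that the proof of Lemma 5 relies on, the following sketch checks, for one example, that (i) renormalizing full-contest winning probabilities over a subset $M$, as in (B4), gives the same numbers as applying the ratio form directly to the subcontest, and (ii) rescaling $f$ by a constant $\beta > 0$ leaves all probabilities unchanged. The impact function $f(x) = \sqrt{x}$, the effort vector, and the subset are hypothetical choices made only for illustration.

```python
import math


def ratio_csf(efforts, f):
    """Ratio-form contest success function: p_i = f(x_i) / sum_j f(x_j)."""
    weights = [f(x) for x in efforts]
    total = sum(weights)
    return [w / total for w in weights]


f = math.sqrt                      # hypothetical impact function
efforts = [1.0, 4.0, 9.0, 16.0]    # hypothetical effort profile
M = [0, 2, 3]                      # indices of players in the subcontest

p_full = ratio_csf(efforts, f)

# (B4): renormalize the full-contest probabilities over the subset M
mass_M = sum(p_full[j] for j in M)
p_sub_consistency = [p_full[i] / mass_M for i in M]

# Ratio form applied directly to the efforts of the players in M
p_sub_direct = ratio_csf([efforts[i] for i in M], f)

# Rescaling f by beta > 0 does not change the probabilities
p_scaled = ratio_csf(efforts, lambda x: 3.0 * f(x))

print(p_sub_consistency, p_sub_direct)
print(p_full, p_scaled)
```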
Proof of Proposition 5. For player $i$, the maximization program can be written as:
\[
\max_{x_i}\ \sum_{n=1}^{\infty} \tilde{\pi}(n)\,p_i^{n}(x_i, x_{-i})\,v(n) - x_i .
\]
Suppose the equilibrium effort level is $x^*$. Plugging in the contest success function $p_i^{n}(x) = \dfrac{f(x_i)}{\sum_{j=1}^{n} f(x_j)}$, the program becomes:
\[
\max_{x_i}\ \sum_{n=1}^{\infty} \tilde{\pi}(n)\,\frac{f(x_i)}{f(x_i)+(n-1)f(x^*)}\,v(n) - x_i .
\]
Taking the derivative with respect to $x_i$ gives the first-order condition:
\[
\sum_{n=1}^{\infty} \tilde{\pi}(n)\,v(n)\,\frac{f'(x_i)\,(n-1)\,f(x^*)}{\bigl(f(x_i)+(n-1)f(x^*)\bigr)^2} = 1 .
\]
In equilibrium, it must be that $x_i = x^*$, which requires:
\[
\frac{f(x^*)}{f'(x^*)} = \sum_{n=1}^{\infty} \tilde{\pi}(n)\,v(n)\,\frac{n-1}{n^2} .
\]
$f(\cdot)$ is positive, concave, and increasing in $x$, so $\dfrac{f(x)}{f'(x)}$ is positive and increasing in $x$. Thus, one of the two cases must be true:
     • There is a unique $x^*$ such that $\dfrac{f(x^*)}{f'(x^*)} = \sum_{n=1}^{\infty} \tilde{\pi}(n)\,v(n)\,\dfrac{n-1}{n^2}$.
     • $\forall x \in [0,\infty)$, $\dfrac{f(x)}{f'(x)} > \sum_{n=1}^{\infty} \tilde{\pi}(n)\,v(n)\,\dfrac{n-1}{n^2}$.
In the first case, an equilibrium with effort level $x^*$ exists and is unique. In the second case, the effort level $x^* = 0$ is the only equilibrium, so it exists and is unique.
                                                                                                               Q.E.D.
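The equilibrium condition above can be solved numerically once $f$, $v$, and $\tilde{\pi}$ are specified. The sketch below uses bisection on $f(x)/f'(x)$, relying on the monotonicity established in the proof. The function name, the finite-support dictionary representation of $\tilde{\pi}$, the search bound `x_max`, and the lottery specification $f(x) = x$ used in the example are all assumptions made for illustration.

```python
def equilibrium_effort(pi_tilde, v, f, f_prime, x_max=1e6, tol=1e-12):
    """Solve f(x)/f'(x) = sum_n pi_tilde(n) v(n) (n-1)/n^2 by bisection.

    pi_tilde: dict {n: probability} with finite support (an assumption).
    Returns 0.0 when f(x)/f'(x) exceeds the right-hand side near x = 0,
    which corresponds to the second case in the proof.
    Assumes f(x)/f'(x) is increasing and exceeds the target at x_max.
    """
    target = sum(p * v(n) * (n - 1) / n**2 for n, p in pi_tilde.items())

    def gap(x):
        return f(x) / f_prime(x) - target

    lo, hi = 1e-12, x_max
    if gap(lo) > 0:
        return 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lottery(x):
    """Illustrative impact function f(x) = x."""
    return x


def lottery_prime(x):
    """Derivative of the illustrative impact function."""
    return 1.0


# Known two-player lottery contest with unit prize: effort is 1/4.
x_two = equilibrium_effort({2: 1.0}, lambda n: 1.0, lottery, lottery_prime)
# Uncertain size: two or three players with equal probability.
x_mix = equilibrium_effort({2: 0.5, 3: 0.5}, lambda n: 1.0, lottery, lottery_prime)
print(x_two, x_mix)
```

With $f(x) = x$ the condition reduces to $x^* = \sum_n \tilde{\pi}(n)v(n)\frac{n-1}{n^2}$, so the second example should return $\tfrac{1}{2}\cdot\tfrac{1}{4} + \tfrac{1}{2}\cdot\tfrac{2}{9} = \tfrac{17}{72}$.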
Lemma 6. Consider two distributions, $F$ and $G$, where $F$ is a discrete distribution over $N = \{1, 2, \dots, n, \dots\}$ with density function $f(n) = \pi(n)$ and $G$ is a discrete distribution with density function
\[
g(n) = \frac{\frac{1}{n}\pi(n)}{\sum_{i}\frac{1}{i}\pi(i)} .
\]
     $F$ and $G$ are valid distributions, and $F$ first-order stochastically dominates $G$.
Proof of Lemma 6. First, we verify that both $F$ and $G$ are valid distributions:
\[
\sum_{n=1}^{\infty}\pi(n) = 1, \qquad
\sum_{n=1}^{\infty}\frac{1}{n}\pi(n) \le \sum_{n=1}^{\infty}\pi(n) = 1, \qquad
\sum_{n=1}^{\infty}\frac{\frac{1}{n}\pi(n)}{\sum_{i}\frac{1}{i}\pi(i)} = 1 .
\]
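For any integer $j > 0$,
\[
\begin{aligned}
G(j) - F(j)
&= \frac{1}{\sum_{i}\frac{1}{i}\pi(i)}\sum_{n=1}^{j}\frac{1}{n}\pi(n) - \sum_{n=1}^{j}\pi(n) \\
&= \frac{1}{\sum_{i}\frac{1}{i}\pi(i)}\left[\sum_{n=1}^{j}\frac{1}{n}\pi(n) - \sum_{n=1}^{j}\pi(n)\sum_{i}\frac{1}{i}\pi(i)\right] \\
&= \frac{1}{\sum_{i}\frac{1}{i}\pi(i)}\left[\sum_{n=1}^{j}\frac{1}{n}\pi(n)\sum_{n=1}^{\infty}\pi(n) - \sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\sum_{n=1}^{j}\pi(n)\right] \\
&= \frac{1}{\sum_{i}\frac{1}{i}\pi(i)}\left[\sum_{n=1}^{j}\frac{1}{n}\pi(n)\sum_{n=j+1}^{\infty}\pi(n) - \sum_{n=j+1}^{\infty}\frac{1}{n}\pi(n)\sum_{n=1}^{j}\pi(n)\right] \\
&\ge \frac{1}{\sum_{i}\frac{1}{i}\pi(i)}\left[\sum_{n=1}^{j}\frac{1}{j}\pi(n)\sum_{n=j+1}^{\infty}\pi(n) - \sum_{n=j+1}^{\infty}\frac{1}{j+1}\pi(n)\sum_{n=1}^{j}\pi(n)\right] \\
&= \frac{1}{\sum_{i}\frac{1}{i}\pi(i)}\sum_{n=1}^{j}\pi(n)\sum_{n=j+1}^{\infty}\pi(n)\left(\frac{1}{j}-\frac{1}{j+1}\right) \\
&\ge 0 .
\end{aligned}
\]
Thus, we have $G(j) \ge F(j)$ for all $j$, so $F$ first-order stochastically dominates $G$.
                                                                                                             Q.E.D.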
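Lemma 6 can be illustrated on a small example by constructing $G$ from $F$ and comparing the two cumulative distribution functions pointwise. The sketch below assumes a finite-support $\pi$ represented as a dictionary; the function names and the example distribution are hypothetical.

```python
from itertools import accumulate


def reweight_by_inverse_size(pi):
    """Build g(n) = (pi(n)/n) / sum_i (pi(i)/i) from pi = {n: probability}."""
    weights = {n: p / n for n, p in pi.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def cdf(pmf):
    """Cumulative distribution function on the sorted support of pmf."""
    support = sorted(pmf)
    return dict(zip(support, accumulate(pmf[n] for n in support)))


pi = {1: 0.2, 2: 0.3, 3: 0.5}      # hypothetical population-size distribution
g = reweight_by_inverse_size(pi)
F_cdf, G_cdf = cdf(pi), cdf(g)

# F first-order stochastically dominates G iff G(j) >= F(j) for every j.
for j in sorted(pi):
    print(j, F_cdf[j], G_cdf[j])
```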
Proof of Proposition 6. In equilibrium, we have
\[
\sum_{n=1}^{\infty}\tilde{\pi}(n)\,v(n)\,\frac{n-1}{n^2} = \frac{f(x^*)}{f'(x^*)} .
\]
Under contest $C_1$, $v_1(n)$ is increasing in $n$, and under contest $C_3$, $v_3$ is a constant. Once we plug in the probability $\tilde{\pi}(n) = \dfrac{n\pi(n)}{\sum_{i} i\pi(i)}$, we have
\[
\frac{f(x^*)}{f'(x^*)} = \frac{1}{\sum_{i} i\pi(i)}\sum_{n=1}^{\infty} v(n)\,\pi(n)\,\frac{n-1}{n} .
\]
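   To facilitate our proof, we set a benchmark value
\[
B = \left[1-\sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\right]\sum_{n=1}^{\infty}\pi(n)\,v(n) .
\]
Since we have $E[v_1(n)] = E[v_2(n)] = v_3$, $B$ is a constant under all contests. Thus, we have
\[
\begin{aligned}
\sum_{n=1}^{\infty} v_3(n)\pi(n)\frac{n-1}{n} - B
&= \sum_{n=1}^{\infty} v_3(n)\pi(n)\left[1-\frac{1}{n}\right] - \left[1-\sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\right]\sum_{n=1}^{\infty}\pi(n)v_3(n) \\
&= \sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\sum_{n=1}^{\infty}\pi(n)v_3(n) - \sum_{n=1}^{\infty} v_3(n)\pi(n)\frac{1}{n} \\
&= v_3\left[\sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\sum_{n=1}^{\infty}\pi(n) - \sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\right] \\
&= 0
\end{aligned}
\]
and
\[
\begin{aligned}
\sum_{n=1}^{\infty} v_1(n)\pi(n)\frac{n-1}{n} - B
&= \sum_{n=1}^{\infty} v_1(n)\pi(n)\left[1-\frac{1}{n}\right] - \left[1-\sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\right]\sum_{n=1}^{\infty}\pi(n)v_1(n) \\
&= \sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\sum_{n=1}^{\infty}\pi(n)v_1(n) - \sum_{n=1}^{\infty} v_1(n)\pi(n)\frac{1}{n} \\
&= \sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\,E[v_1(n)] - \sum_{n=1}^{\infty} v_1(n)\pi(n)\frac{1}{n} \\
&= \sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\,\bigl[E[v_1(n)] - v_1(n)\bigr] \\
&= \left(\sum_{i=1}^{\infty}\frac{1}{i}\pi(i)\right)\sum_{n=1}^{\infty}\frac{\frac{1}{n}\pi(n)}{\sum_{i=1}^{\infty}\frac{1}{i}\pi(i)}\,\bigl[E[v_1(n)] - v_1(n)\bigr] \\
&= \left(\sum_{i=1}^{\infty}\frac{1}{i}\pi(i)\right)\int_{0}^{\infty}\bigl[E[v_1(n)] - v_1(n)\bigr]\,dG \\
&\ge \left(\sum_{i=1}^{\infty}\frac{1}{i}\pi(i)\right)\int_{0}^{\infty}\bigl[E[v_1(n)] - v_1(n)\bigr]\,dF \\
&= \left(\sum_{i=1}^{\infty}\frac{1}{i}\pi(i)\right)\left(E[v_1(n)] - \int_{0}^{\infty} v_1(n)\,dF\right) \\
&= 0 .
\end{aligned}
\]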
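The inequality holds because $F$ first-order stochastically dominates $G$ (Lemma 6) and $v_1(n)$ is increasing in $n$, so $E[v_1(n)] - v_1(n)$ is decreasing in $n$. Similarly,
\[
\begin{aligned}
\sum_{n=1}^{\infty} v_2(n)\pi(n)\frac{n-1}{n} - B
&= \sum_{n=1}^{\infty}\frac{1}{n}\pi(n)\,\bigl[E[v_2(n)] - v_2(n)\bigr] \\
&= \left(\sum_{i=1}^{\infty}\frac{1}{i}\pi(i)\right)\int_{0}^{\infty}\bigl[E[v_2(n)] - v_2(n)\bigr]\,dG \\
&\le \left(\sum_{i=1}^{\infty}\frac{1}{i}\pi(i)\right)\int_{0}^{\infty}\bigl[E[v_2(n)] - v_2(n)\bigr]\,dF \\
&= 0 .
\end{aligned}
\]
Thus, we have
\[
\sum_{n=1}^{\infty} v_2(n)\pi(n)\frac{n-1}{n} \le \sum_{n=1}^{\infty} v_3(n)\pi(n)\frac{n-1}{n} \le \sum_{n=1}^{\infty} v_1(n)\pi(n)\frac{n-1}{n} .
\]
This is equivalent to
\[
\frac{f(x_2^*)}{f'(x_2^*)} \le \frac{f(x_3^*)}{f'(x_3^*)} \le \frac{f(x_1^*)}{f'(x_1^*)} .
\]
We know that $\dfrac{f(x)}{f'(x)}$ is increasing in $x$; thus, we have:
\[
x_2^* \le x_3^* \le x_1^* .
\]
                                                                                                   Q.E.D.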
Proof of Proposition 7. Myerson and Wärneryd (2006) showed that x∗4 > x∗5 . Proposi-
tion 6 showed that:
    • when b > 0, x∗5 < x∗6 .
    • when b = 0, x∗5 = x∗6 .
    • when b < 0, x∗5 > x∗6 .
Now I want to compare x∗4 and x∗6 . x∗4 and x∗6 can be pinned down by:
\[
\frac{f(x_4^*)}{f'(x_4^*)} = \frac{v}{\mu}\,\frac{\mu-1}{\mu},
\qquad
\frac{f(x_6^*)}{f'(x_6^*)} = \frac{1}{\mu}\sum_{n=1}^{\infty}(a+bn)\,\pi(n)\,\frac{n-1}{n}.
\]
Thus, we have
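\[
\begin{aligned}
\frac{f(x_4^*)}{f'(x_4^*)} - \frac{f(x_6^*)}{f'(x_6^*)}
&= \frac{v}{\mu}\,\frac{\mu-1}{\mu} - \frac{1}{\mu}\sum_{n=1}^{\infty}(a+bn)\,\pi(n)\,\frac{n-1}{n} \\
&= \frac{1}{\mu}\left[\sum_{n=1}^{\infty}(a+bn)\,\pi(n)\,\frac{1}{n} - \frac{v}{\mu}\right] \\
&= \frac{1}{\mu}\left[\sum_{n=1}^{\infty}a\,\pi(n)\,\frac{1}{n} + b - \frac{a+\mu b}{\mu}\right] \\
&= \frac{a}{\mu}\left[\sum_{n=1}^{\infty}\pi(n)\,\frac{1}{n} - \frac{1}{\mu}\right] \\
&= \frac{a}{\mu}\left[E_\pi\!\left[\frac{1}{n}\right] - \frac{1}{E_\pi[n]}\right].
\end{aligned}
\]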
Here $E_\pi[\frac{1}{n}] - \frac{1}{E_\pi[n]} > 0$ by Jensen's inequality, since $\frac{1}{n}$ is a convex function, and
$\frac{f(x_4^*)}{f'(x_4^*)} > \frac{f(x_6^*)}{f'(x_6^*)} \Leftrightarrow x_4^* > x_6^*$ since f (.) is
increasing and concave. Thus, the comparison between x∗4 and x∗6 depends on a:
    • When a > 0, x∗4 > x∗6 .
    • When a = 0, x∗4 = x∗6 .
    • When a < 0, x∗4 < x∗6 .
The proposition focuses on scenarios with µ > 0, and there are only five possible combi-
nations: (i) a > 0, b < 0, (ii) a > 0, b = 0, (iii) a > 0, b > 0, (iv) a = 0, b > 0, and (v)
a < 0, b > 0. The proposition is an immediate result of the above analysis.
                                                                                                                          Q.E.D.
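The chain of equalities in the proof of Proposition 7 can be checked numerically. The sketch below is only an illustration: the three-point distribution `pi` is a made-up example, and the relation v = a + bµ is read off from the step of the derivation that replaces v with a + µb. It compares the difference of the two ratios, computed directly from the two pinning-down equations, with the closed form (a/µ)(E[1/n] − 1/E[n]).

```python
def mean(pi):
    """Mean of n under a pmf given as {n: probability} on positive integers."""
    return sum(n * p for n, p in pi.items())


def jensen_gap(pi):
    """Return E[1/n] - 1/E[n]; nonnegative because 1/n is convex."""
    mean_inverse = sum(p / n for n, p in pi.items())
    return mean_inverse - 1.0 / mean(pi)


def ratio_difference(a, b, pi):
    """f(x4*)/f'(x4*) - f(x6*)/f'(x6*), computed from the two defining equations."""
    mu = mean(pi)
    v = a + b * mu  # assumption taken from the derivation: v equals a + mu*b
    ratio_4 = (v / mu) * (mu - 1) / mu
    ratio_6 = sum((a + b * n) * p * (n - 1) / n for n, p in pi.items()) / mu
    return ratio_4 - ratio_6


# Hypothetical distribution of the number of players (illustration only).
pi = {1: 0.2, 2: 0.5, 5: 0.3}

for a, b in [(2.0, -0.1), (0.0, 1.0), (-1.0, 1.5)]:
    direct = ratio_difference(a, b, pi)
    closed_form = a / mean(pi) * jensen_gap(pi)
    print(f"a={a}, b={b}: direct={direct:.6f}, closed form={closed_form:.6f}")
```

In each of the three cases the two numbers agree, and their sign follows the sign of a, which is the comparison used in the proof.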
Appendix C: Proofs for Chapter 3
Proof of Proposition 8. For any equilibrium, the assimilation choice must be a cutoff
strategy. To see this, for any given agent i with skill level si and background I, the payoff is
\[
u_i =
\begin{cases}
f(m_A)\,s_A\,s_i - d & \text{if } a_i = 1, \\
f(m_I)\,s_I\,s_i & \text{if } a_i = 0.
\end{cases}
\]
As each agent has infinitesimal mass, the choice of one agent will not affect mA , mI , sA and
sI . Thus, one of the three cases must be true:
     • f (mA )sA si − d ≤ f (mI )sI si for all si ∈ [0, 1]. In this case, no one will assimilate.
     • f (mA )sA si − d ≥ f (mI )sI si for all si ∈ [0, 1]. In this case, all agents will assimilate.
     • f (mA )sA si − d = f (mI )sI si for some si = c ∈ [0, 1]. In this case, agents with si > c will
       assimilate, agents with si < c will not assimilate, and agents with si = c are indifferent
       between assimilating and not assimilating.
In each case, the agents' actions can be summarized as a cutoff strategy:
\[
a_i =
\begin{cases}
1 & \text{if } s_i > c, \\
0 & \text{if } s_i < c, \\
1 \text{ with probability } p & \text{if } s_i = c.
\end{cases}
\]
     The following steps prove the existence of the equilibrium.
First, let F (s) be the cumulative distribution function of skill levels of agents with
background I, and G(s) be the cumulative distribution function of skill levels of agents with
background A. Let sI be the average skill level of agents with background I, and sA be
the average skill level of agents with background A:
\[
s_I = \int_0^1 s\,dF(s) \quad\text{and}\quad s_A = \int_0^1 s\,dG(s).
\]
The cutoff strategy can be described as follows: a proportion τ of agents with the highest skill
levels will choose to assimilate. Thus, for any τ ∈ [0, 1], denote $c(\tau) = \min\{s : F(s) \ge \tau\}$ and
$p(\tau) = F(c(\tau)) - \tau$. Since F is increasing and right-continuous, the definition is valid. Thus,
the cutoff strategy can be fully characterized by a parameter τ ∈ [0, 1].
     Consider three situations: (i) τ = 1 is an equilibrium, (ii) τ = 0 is an equilibrium, and
(iii) τ ∈ (0, 1) is an equilibrium.
Case (i) If τ = 1 is an equilibrium, that means all agents will assimilate. This equilibrium
exists if d = 0.
Case (ii) If τ = 0 is an equilibrium, that means no agent will assimilate. This equilibrium
exists if f (1 − m)sA − d ≤ f (m)sI .
Case (iii) If τ ∈ (0, 1) is an equilibrium, then the agent with skill level c(τ ) is indifferent between
assimilating and not assimilating. Thus, the following equation must hold:
                          f (1 − m + mτ )sA c(τ ) − d = f (m(1 − τ ))sI c(τ )
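where
\[
s_A = \frac{(1-m)\,s_A + m\int_{c(\tau)}^{1} s\,dF(s)}{1-m+m\tau}
\quad\text{and}\quad
s_I = \frac{\int_{0}^{c(\tau)} s\,dF(s)}{1-\tau}
\]
(on the right-hand side of the first expression, sA is the average skill level of agents with background A defined above). The above equation is the same as
\[
\left[f(1-m+m\tau)\,s_A - f(m(1-\tau))\,s_I\right]c(\tau) = d.
\]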
If d > 0 and f (1 − m)sA − f (m)sI > d, then the left-hand side of the equation is larger
than d if τ = 0, and it is 0 if τ = 1. Further, it is continuous in τ , so by the Intermediate
Value Theorem, there exists a τ ∈ (0, 1) such that the equation holds. Thus, if d > 0 and
f (1 − m)sA − f (m)sI > d, an equilibrium exists.
    In summary, for any bounded measurable function s over N and any discrimination level
d ∈ R+ , an equilibrium always exists.
                                                                                                  Q.E.D.
Proof of Corollary 1. The corollary follows immediately from Proposition 8, given that there is
no mass point in the distribution of skill levels.
                                                                                                  Q.E.D.
Proof of Proposition 9. For any agent with background A, the utility maximization program is
\[
\max_{d}\; f(m_A)\,s_A \quad \text{s.t. } c \in C(s, d).
\]
Since the working skill distribution s ∈ S, choosing c is equivalent to choosing a cutoff θc such
that agents with θ ≥ θc will assimilate and θ < θc will not assimilate. θc can be supported as
an equilibrium as long as d = [f (mA )sA − f (mI )sI ]θc ≥ 0. It is evident that mA is decreasing
in θc , mI is increasing in θc , and sI is increasing in θc . For sA , it is decreasing when θc < θ̃
and increasing when θc > θ̃, where θ̃ is unique and can be calculated by sA = si (θ̃).
    If θc ≤ θ̃, then sA > sI and mA > mI , so any θc can be supported as an equilibrium by
some d.
    If θc > θ̃, [f (mA )sA − f (mI )sI ] is decreasing in θc , and it is continuous. Let θ̄ = 1 if
[f (mA )sA − f (mI )sI ] > 0 when θc = 1; otherwise, let θ̄ be the solution to the equation
[f (mA )sA − f (mI )sI ] = 0. θ̄ is unique as [f (mA )sA − f (mI )sI ] is monotonic on [θ̃, 1].
    Thus, choosing d ∈ [0, ∞) is equivalent to choosing θc ∈ [0, θ̄]. As a result, I can write
the maximization program of h as:
\[
\max_{\theta_c \in [0, \bar{\theta}]} f(m_A)\,s_A.
\]
Since f (mA )sA is continuous in θc and [0, θ̄] is compact, f (mA )sA achieves its maximum
on [0, θ̄]. Denote the maximizer as θ∗ . The corresponding d∗ can be calculated by d∗ =
[f (m∗A )s∗A − f (m∗I )s∗I ]θ∗ .
    The following steps show that θc = 0 and θc = 1 cannot be the maximizer, so the maximum
is achieved on (0, 1).
    First, mA is decreasing in θc , since a higher θc means fewer agents will assimilate. Also, sA
is decreasing in θc at θc = 1, as the assimilation of agents with the highest skill levels increases
the average skill level of group A. Thus, f (mA )sA is decreasing at θc = 1, so it does not achieve
its maximum at θc = 1.
    Next, I show that f (mA )sA does not achieve its maximum at θc = 0. Write f (mA )sA
as a function of θc :
\[
f(m_A)\,s_A = \frac{f(1-m\theta_c)}{1-m\theta_c}\left[(1-m)\,s_A + m\int_{\theta_c}^{1} s(t)\,dt\right].
\]
Denote $H(\theta_c) = \frac{f(1-m\theta_c)}{1-m\theta_c}$. Taking the derivative with respect to θc :
\[
\begin{aligned}
H'(\theta_c) &= \frac{f'(1-m\theta_c)(-m)(1-m\theta_c) - f(1-m\theta_c)(-m)}{(1-m\theta_c)^2} \\
&= \frac{-m}{(1-m\theta_c)^2}\left[f'(1-m\theta_c)(1-m\theta_c) - f(1-m\theta_c)\right].
\end{aligned}
\]
$f'(1) < f(1) \Rightarrow H'(0) > 0$, so H(θc ) is increasing in θc at θc = 0. Also, $\left[(1-m)\,s_A + m\int_{\theta_c}^{1} s(t)\,dt\right]$
is increasing in θc , so f (mA )sA does not achieve its maximum at θc = 0.
                                                                                            Q.E.D.
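Proof of Proposition 10. For an agent with background A, the utility maximization program
becomes:
\[
\max_{l_i \in [0,1]} \log(l_i) + \beta_A \log\big(f(m_A)\,s_A\,s_i\big) \quad \text{s.t. } s_i = \theta(1 - l_i).
\]
Substituting $l_i = 1 - s_i/\theta$ yields
\[
\max_{s_i}\; \log\left(1 - \frac{s_i}{\theta}\right) + \beta_A \log\big(f(m_A)\,s_A\,s_i\big).
\]
FOC:
\[
\frac{1}{1 - \frac{s_i}{\theta}}\left(-\frac{1}{\theta}\right) + \frac{\beta_A}{s_i} = 0.
\]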
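After simplification:
\[
s_i^* = \frac{\beta_A}{1+\beta_A}\,\theta.
\]
    Similarly, for an agent with background I and ai = 0, $s_i^* = \frac{\beta_I}{1+\beta_I}\,\theta$.
    For an agent with background I and ai = 1, the program becomes:
\[
\max_{l_i \in [0,1]} \log(l_i) + \beta_I \log\big(f(m_A)\,s_A\,s_i - d\big) \quad \text{s.t. } s_i = \theta(1 - l_i).
\]
    FOC:
\[
\frac{1}{1 - \frac{s_i}{\theta}}\left(-\frac{1}{\theta}\right) + \beta_I\,\frac{f(m_A)\,s_A}{f(m_A)\,s_A\,s_i - d} = 0.
\]
    After simplification:
\[
s_i^* = \frac{\beta_I}{1+\beta_I}\,\theta + s^*,
\]
where $s^* = \frac{d^*}{(1+\beta_I)\,f(m_A)\,s_A}$.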
                                                                                             Q.E.D.
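The closed forms in the proof of Proposition 10 can be sanity-checked by maximizing the objective numerically. The sketch below is an illustration under assumed values: `k` stands for the product f(mA)sA, which an individual agent takes as given, and the numbers chosen for β, θ, k and d are hypothetical. It uses a plain ternary search, which is valid here because the objective is strictly concave in si.

```python
import math


def utility(s, beta, theta, k, d=0.0):
    """log(l) + beta * log(k*s - d) with l = 1 - s/theta; k stands for f(m_A) * s_A."""
    return math.log(1 - s / theta) + beta * math.log(k * s - d)


def argmax_ternary(objective, lo, hi, iters=100):
    """Maximize a strictly concave function on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


def closed_form_skill(beta, theta, k, d=0.0):
    """beta/(1+beta)*theta + d/((1+beta)*k); the second term is s* and vanishes when d = 0."""
    return beta / (1 + beta) * theta + d / ((1 + beta) * k)


# Hypothetical parameter values (illustration only).
beta, theta, k, d = 0.8, 0.9, 0.6, 0.05
eps = 1e-9  # keeps the search strictly inside the domain d/k < s < theta

s_no_cost = argmax_ternary(lambda s: utility(s, beta, theta, k), eps, theta - eps)
s_with_cost = argmax_ternary(lambda s: utility(s, beta, theta, k, d), d / k + eps, theta - eps)

print(s_no_cost, closed_form_skill(beta, theta, k))
print(s_with_cost, closed_form_skill(beta, theta, k, d))
```

The numerical maximizers agree with the closed forms up to the tolerance of the search. The assimilation cost shifts the optimal skill up by s∗ and leaves the slope in θ unchanged.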
Proof of Proposition 11. Denote $\gamma_A = \frac{\beta_A}{1+\beta_A}$ and $\gamma_I = \frac{\beta_I}{1+\beta_I}$. According to Propositions
8, 9, and 10, if the equilibrium exists, it must have the following form: agents will assimilate
iff θ ≥ θ∗ . The on-path strategies are:
\[
s_i =
\begin{cases}
\gamma_A\theta_i & \text{if agent has background } A, \\
\gamma_I\theta_i + s^* & \text{if agent has background } I \text{ and } \theta_i \ge \theta^*, \\
\gamma_I\theta_i & \text{if agent has background } I \text{ and } \theta_i < \theta^*.
\end{cases}
\]
For agents with background I, the assimilation choice is
\[
a_i =
\begin{cases}
1 & \text{if } s_i \ge c^*, \\
0 & \text{if } s_i < c^*,
\end{cases}
\]
and the following conditions must hold:
\[
\begin{cases}
d^* = [f(m_A)\,s_A - f(m_I)\,s_I]\,c^* \\
\theta^* = \arg\max_{\theta} f(m_A)\,s_A \\
c^* \in [\gamma_I\theta^*,\; \gamma_I\theta^* + s^*] \\
s^* = \dfrac{d^*}{(1+\beta_I)\,f(m_A)\,s_A}
\end{cases}
\]
Simplify these conditions by substituting c∗ and d∗ :
\[
\begin{cases}
\theta^* = \arg\max_{\theta} f(m_A)\,s_A \\
\dfrac{(1+\beta_I)\,f(m_A)\,s_A\,s^*}{f(m_A)\,s_A - f(m_I)\,s_I} \in [\gamma_I\theta^*,\; \gamma_I\theta^* + s^*]
\end{cases}
\]
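     I first focus on the first condition. Given the skill acquisition strategies above, the
objective function could be written as:
\[
f(m_A)\,s_A =
\begin{cases}
\dfrac{f(1-m\theta)}{1-m\theta}\left[\tfrac{1}{2}(1-m)\gamma_A + \tfrac{1}{2}m\gamma_I(1-\theta^2) + m(1-\theta)\,s^*\right] & \text{if } \theta \ge \theta^*, \\[2ex]
\dfrac{f(1-m\theta)}{1-m\theta}\left[\tfrac{1}{2}(1-m)\gamma_A + \tfrac{1}{2}m\gamma_I(1-\theta^2) + m(1-\theta^*)\,s^*\right] & \text{if } \theta < \theta^*.
\end{cases}
\]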
f (mA )sA achieves its maximum at θ∗ if and only if it achieves its maximum at θ∗
on both [0, θ∗ ] and [θ∗ , 1]. Thus, I can take the derivative with respect to θ on the two intervals
separately. $\theta^* = \arg\max_{\theta} f(m_A)\,s_A$ is equivalent to:
\[
\begin{cases}
[f(1-m\theta^*) - (1-m\theta^*)\,f'(1-m\theta^*)]\,s_A \le f(1-m\theta^*)\,[\gamma_I\theta^* + s^*] \\
[f(1-m\theta^*) - (1-m\theta^*)\,f'(1-m\theta^*)]\,s_A \ge f(1-m\theta^*)\,\gamma_I\theta^*
\end{cases}
\]
     Now combine it with the second condition. The equilibrium exists as long as there exists
(θ∗ , s∗ ) that satisfies the condition below:
\[
\begin{cases}
[f(1-m\theta^*) - (1-m\theta^*)\,f'(1-m\theta^*)]\,s_A \le f(1-m\theta^*)\,[\gamma_I\theta^* + s^*] \\
[f(1-m\theta^*) - (1-m\theta^*)\,f'(1-m\theta^*)]\,s_A \ge f(1-m\theta^*)\,\gamma_I\theta^* \\
(1+\beta_I)\,s^* f(m_A)\,s_A \le [f(m_A)\,s_A - f(m_I)\,s_I]\,[\gamma_I\theta^* + s^*] \\
(1+\beta_I)\,s^* f(m_A)\,s_A \ge [f(m_A)\,s_A - f(m_I)\,s_I]\,\gamma_I\theta^*
\end{cases}
\tag{7}
\]
     I now show that there exists (θ∗ , s∗ ) that satisfies condition (7), which completes the proof.
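    First, I find (θ∗ , s∗ ) that satisfies the following equations:
\[
\begin{cases}
s^* = \dfrac{1}{1+\beta_I}\,\gamma_I\theta^* \\[1.5ex]
[f(1-m\theta^*) - (1-m\theta^*)\,f'(1-m\theta^*)]\,s_A = \dfrac{2+\beta_I}{1+\beta_I}\,f(1-m\theta^*)\,\gamma_I\theta^*
\end{cases}
\tag{8}
\]
I want to show that this (θ∗ , s∗ ) is well defined. I substitute s∗ and reduce condition (8)
to:
\[
\frac{1}{1-m\theta^*}\left[f(1-m\theta^*) - (1-m\theta^*)\,f'(1-m\theta^*)\right]
\left[(1-m)\tfrac{1}{2}\gamma_A + m\tfrac{1}{2}\gamma_I\big(1-(\theta^*)^2\big) + m(1-\theta^*)\frac{1}{1+\beta_I}\gamma_I\theta^*\right]
= \frac{2+\beta_I}{1+\beta_I}\,f(1-m\theta^*)\,\gamma_I\theta^*
\]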
The only variable in this equation is θ∗ ; the remaining parameters are exogenously given.
The LHS and RHS are both continuous in θ∗ . A solution of this equation is guaranteed by the
Intermediate Value Theorem, as LHS > RHS if θ∗ = 0, and LHS < RHS if θ∗ = 1 (note
γA < γI ). Thus, there always exists (θ∗ , s∗ ) that satisfies condition (8).
    Then I want to show that if (θ∗ , s∗ ) satisfies condition (8), it must also satisfy condition
(7). For the four inequalities in condition (7):
    • The first inequality holds with equality. It is a rearrangement of condition (8).
    • The second inequality holds, since the first inequality holds with equality, and s∗ > 0.
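     • The third inequality holds, as:
\[
\begin{aligned}
\frac{s_A}{s_I} &= \frac{2 s_A}{\gamma_I\theta^*} \\
&= (2+\beta_I)\,\frac{2}{1+\beta_I}\,\frac{f(m_A)}{f(m_A) - m_A f'(m_A)} \\
&> 2+\beta_I \\
\Rightarrow\quad \frac{f(m_A)\,s_A}{f(m_I)\,s_I} &\ge 2+\beta_I \\
\Leftrightarrow\quad f(m_A)\,s_A &\le \frac{2+\beta_I}{1+\beta_I}\,[f(m_A)\,s_A - f(m_I)\,s_I] \\
\Leftrightarrow\quad (1+\beta_I)\,s^* f(m_A)\,s_A &\le [f(m_A)\,s_A - f(m_I)\,s_I]\,[\gamma_I\theta^* + s^*]
\end{aligned}
\]
     • The fourth inequality holds, as:
\[
(1+\beta_I)\,s^* f(m_A)\,s_A = f(m_A)\,s_A\,\gamma_I\theta^* \ge [f(m_A)\,s_A - f(m_I)\,s_I]\,\gamma_I\theta^*.
\]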
                                                                                        Q.E.D.
Proof of Proposition 12. For $f(m) = \frac{1}{2} - \frac{1}{2}(1-m)^2 = m - \frac{1}{2}m^2$, I can write
$f(1-m\theta) = \frac{1}{2}(1-m\theta)(1+m\theta)$. The condition (7) becomes:
\[
\begin{cases}
(1-m\theta^*)\,s_A \le (1+m\theta^*)(\gamma_I\theta^* + s^*) \\
(1-m\theta^*)\,s_A \ge (1+m\theta^*)\,\gamma_I\theta^* \\
(1+\beta_I)\,s^* f(m_A)\,s_A \le [f(m_A)\,s_A - f(m_I)\,s_I]\,[\gamma_I\theta^* + s^*] \\
(1+\beta_I)\,s^* f(m_A)\,s_A \ge [f(m_A)\,s_A - f(m_I)\,s_I]\,\gamma_I\theta^*
\end{cases}
\]
    Let P = f (mA )sA , Q = f (mI )sI , then $s_A = \frac{2P}{(1-m\theta^*)(1+m\theta^*)}$. Substituting f (mA ), sA , f (mI ), sI
in the conditions:
\[
\begin{cases}
2P \le (1+m\theta^*)^2(\gamma_I\theta^* + s^*) \\
2P \ge (1+m\theta^*)^2\,\gamma_I\theta^* \\
(1+\beta_I)\,s^* P \le (P-Q)(\gamma_I\theta^* + s^*) \\
(1+\beta_I)\,s^* P \ge (P-Q)\,\gamma_I\theta^*
\end{cases}
\]
    The existence of (θ∗ , s∗ ) has been proved in Proposition 11. When (θ∗ , s∗ ) satisfies the
condition above, the equilibrium exists and the on-path strategies are as described below, which
completes the proof.
\[
s_i =
\begin{cases}
\gamma_A\theta_i & \text{if } i \in N_A; \\
\gamma_I\theta_i + s^* & \text{if } \theta_i \ge \theta^* \text{ and } i \in N_I; \\
\gamma_I\theta_i & \text{if } \theta_i < \theta^* \text{ and } i \in N_I.
\end{cases}
\]
\[
d^* = (1+\beta_I)\,f(m_A)\,s_A\,s^*.
\]
\[
a_i(s_i) =
\begin{cases}
1 & \text{if } s_i \ge \gamma_I\theta^* + s^*; \\
0 & \text{if } s_i < \gamma_I\theta^*.
\end{cases}
\]
                                                                                                  Q.E.D.
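The existence argument of Proposition 11 can be illustrated for the quadratic f of Proposition 12. The sketch below is not part of the original proof. It assumes θ is uniform on [0, 1] (as the term ½(1 − θ²) in the objective suggests), takes the average skill of non-assimilated agents to be ½γIθ∗ (as in the third-inequality step), and uses hypothetical values for m, βA and βI. It solves condition (8) by bisection, mirroring the Intermediate Value Theorem argument, and then checks the four inequalities of condition (7).

```python
def f(x):
    """Quadratic productivity function from Proposition 12: f(m) = m - m^2/2."""
    return x - 0.5 * x * x


def f_prime(x):
    return 1.0 - x


def objects(theta, m, beta_a, beta_i):
    """Equilibrium objects implied by a candidate cutoff theta and condition (8)."""
    gamma_a = beta_a / (1 + beta_a)
    gamma_i = beta_i / (1 + beta_i)
    s_star = gamma_i * theta / (1 + beta_i)  # first line of condition (8)
    m_a, m_i = 1 - m * theta, m * theta
    total_skill_a = (0.5 * (1 - m) * gamma_a
                     + 0.5 * m * gamma_i * (1 - theta ** 2)
                     + m * (1 - theta) * s_star)
    mean_skill_a = total_skill_a / m_a
    mean_skill_i = 0.5 * gamma_i * theta  # assumption: theta uniform on [0, 1]
    return gamma_i, s_star, m_a, m_i, mean_skill_a, mean_skill_i


def gap(theta, m, beta_a, beta_i):
    """LHS minus RHS of the second line of condition (8)."""
    gamma_i, _, m_a, _, mean_a, _ = objects(theta, m, beta_a, beta_i)
    lhs = (f(m_a) - m_a * f_prime(m_a)) * mean_a
    rhs = (2 + beta_i) / (1 + beta_i) * f(m_a) * gamma_i * theta
    return lhs - rhs


def solve_theta_star(m, beta_a, beta_i, tol=1e-12):
    """Bisection on [0, 1]; relies on gap(0) > 0 > gap(1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid, m, beta_a, beta_i) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def condition_7_holds(theta, m, beta_a, beta_i, slack=1e-9):
    """Check the four inequalities of condition (7), with slack for rounding."""
    gamma_i, s_star, m_a, m_i, mean_a, mean_i = objects(theta, m, beta_a, beta_i)
    marginal = (f(m_a) - m_a * f_prime(m_a)) * mean_a
    p, q = f(m_a) * mean_a, f(m_i) * mean_i
    return (marginal <= f(m_a) * (gamma_i * theta + s_star) + slack
            and marginal >= f(m_a) * gamma_i * theta - slack
            and (1 + beta_i) * s_star * p <= (p - q) * (gamma_i * theta + s_star) + slack
            and (1 + beta_i) * s_star * p >= (p - q) * gamma_i * theta - slack)


# Hypothetical parameters with gamma_A < gamma_I (illustration only).
m, beta_a, beta_i = 0.3, 0.5, 1.0
theta_star = solve_theta_star(m, beta_a, beta_i)
print(theta_star, condition_7_holds(theta_star, m, beta_a, beta_i))
```

For these values the bisection returns an interior cutoff and all four inequalities hold, with the first one binding, as the proof predicts.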
BIBLIOGRAPHY
Advani, Arun and Bryony Reich (2015), “Melting pot or salad bowl: The formation of
  heterogeneous communities.” Technical report, IFS Working Papers.
Aghion, Philippe and Jean Tirole (1997), “Formal and Real Authority in Organizations.”
  Journal of Political Economy, 105, 1–29.
Akerlof, George A. (1970), “The Market for “Lemons”: Quality Uncertainty and the Market
  Mechanism.” The Quarterly Journal of Economics, 84, 488–500.
Akerlof, George A. and Rachel E. Kranton (2000), “Economics and identity.” The Quarterly
  Journal of Economics, 115, 715–753.
Alonso, Ricardo, Wouter Dessein, and Niko Matouschek (2008), “When Does Coordination
  Require Centralization?” American Economic Review, 98, 145–179.
Alonso, Ricardo and Niko Matouschek (2008), “Optimal Delegation.” Review of Economic
  Studies, 75, 259–293.
Aoki, Masahiko (1986), “Horizontal vs. Vertical Information Structure of the Firm.” The
  American Economic Review, 971–983.
Athey, Susan and John Roberts (2001), “Organizational design: Decision rights and incentive
  contracts.” American Economic Review, 91, 200–205.
Azmat, Ghazala and Marc Möller (2009), “Competition among contests.” The RAND Jour-
  nal of Economics, 40, 743–768.
Benjamin, Daniel J., James J. Choi, and A. J. Strickland (2010), “Social identity and pref-
  erences.” The American Economic Review, 100, 1913–1928.
Besanko, David, Pierre Régibeau, and Katharine E. Rockett (2005), “A Multi-Task
  Principal-Agent Approach to Organizational Form.” Journal of Industrial Economics, 53,
  437–467.
Blanes i Vidal, Jordi and Marc Möller (2016), “Project selection and execution in teams.”
  The RAND Journal of Economics, 47, 166–185.
Bolton, P. and M. Dewatripont (1994), “The Firm as a Communication Network.” The
  Quarterly Journal of Economics, 109, 809–839.
Bouton, Laurent (2013), “A Theory of Strategic Voting in Runoff Elections.” American
  Economic Review, 103, 1248–1288.
Bouton, Laurent and Micael Castanheira (2012), “One Person, Many Votes: Divided Ma-
  jority and Information Aggregation.” Econometrica, 80, 43–87.
Campbell, Colin M. (1999), “Large Electorates and Decisive Minorities.” Journal of Political
  Economy, 107, 1199–1217.
Che, Yeon-Koo and Seung-Weon Yoo (2001), “Optimal Incentives for Teams.” The American
  Economic Review, 525–541.
Coller, Maribeth and Melonie B. Williams (1999), “Eliciting individual discount rates.”
  Experimental Economics, 2, 107–127.
Corts, Kenneth S. (2007), “Teams versus individual accountability: Solving multitask prob-
  lems through job design.” The RAND Journal of Economics, 467–479.
Cremer, Jacques (1980), “A Partial Theory of the Optimal Organization of a Bureaucracy.”
  The Bell Journal of Economics, 11, 683–693.
Currarini, Sergio, Matthew O. Jackson, and Paolo Pin (2009), “An economic model of friend-
  ship: Homophily, minorities, and segregation.” Econometrica, 77, 1003–1045.
Deimen, Inga and Dezsö Szalay (2019), “Delegated Expertise, Authority, and Communica-
  tion.” American Economic Review, 109, 1349–1374.
Dessein, Wouter (2002), “Authority and Communication in Organizations.” Review of Eco-
  nomic Studies, 69, 811–838.
Dessein, Wouter, Luis Garicano, and Robert Gertner (2010), “Organizing for Synergies.”
  American Economic Journal: Microeconomics, 2, 77–114.
Dewatripont, Mathias, Ian Jewitt, and Jean Tirole (2000), “Multitask agency problems:
  Focus and task clustering.” European Economic Review, 44, 869–877.
Dreyfus, Mark K. and W. K. Viscusi (1995), “Rates of time preference and consumer valu-
  ations of automobile safety and fuel efficiency.” The Journal of Law and Economics, 38,
  79–105.
Eguia, Jon X. (2017), “Discrimination and assimilation at school.” Journal of Public
  Economics, 156, 48–58.
Ekmekci, Mehmet and Stephan Lauermann (2022), “Information aggregation in Poisson
  elections.” Theoretical Economics, 17, 1–23.
Espinosa, Lorelle L., Jonathan M. Turk, Morgan Taylor, and Hollie M. Chessman (2019),
  “Race and ethnicity in higher education: A status report.”
Fiske, Susan T., Amy J. C. Cuddy, Peter Glick, and Jun Xu (2002), “A model of (often
  mixed) stereotype content: Competence and warmth respectively follow from perceived
  status and competition.” Journal of Personality and Social Psychology, 82, 878–902.
Friebel, Guido, Matthias Heinz, Miriam Krueger, and Nikolay Zubanov (2017), “Team In-
  centives and Performance: Evidence from a Retail Chain.” American Economic Review,
  107, 2168–2203.
Friebel, Guido and Michael Raith (2010), “Resource Allocation and Organizational Form.”
  American Economic Journal: Microeconomics, 2, 1–33.
Fullerton, Richard L. and R. Preston McAfee (1999), “Auctioning Entry into Tournaments.”
  Journal of Political Economy, 107, 573–605.
Galbraith, Jay R. (1971), “Matrix organization designs: How to combine functional and
  project forms.” Business Horizons, 14, 29–40.
Geanakoplos, John and Paul Milgrom (1991), “A theory of hierarchies based on limited
  managerial attention.” Journal of the Japanese and International Economies, 5, 205–225.
Gilbert, G. M. (1951), “Stereotype persistence and change among college students.” The
  Journal of Abnormal and Social Psychology, 46, 245–254.
Gromb, Denis and David Martimort (2007), “Collusion and the organization of delegated
  expertise.” Journal of Economic Theory, 137, 271–299.
Grossman, Sanford J. and Oliver D. Hart (1983), “An analysis of the principal-agent problem.”
  Econometrica, 51, 7–45.
Groves, Theodore (1973), “Incentives in Teams.” Econometrica, 41, 617–631.
Harrison, G. W., M. I. Lau, and M. B. Williams (2002), “Estimating individual discount rates
  in Denmark: A field experiment.” The American Economic Review, 92, 1606–1617.
Harstad, Ronald M., John H. Kagel, and Dan Levin (1990), “Equilibrium bid functions for
  auctions with an uncertain number of bidders.” Economics Letters, 33, 35–40.
Harstad, Ronald M., Aleksandar Saša Pekeč, and Ilia Tsetlin (2008), “Information aggrega-
  tion in auctions with an unknown number of bidders.” Games and Economic Behavior,
  62, 476–508.
Hausman, Jerry A. (1979), “Individual discount rates and the purchase and utilization of
  energy-using durables.” The Bell Journal of Economics, 10, 33–54.
Ho, Colin and Jay W. Jackson (2001), “Attitudes toward Asian Americans: Theory and
  measurement.” Journal of Applied Social Psychology, 31, 1553.
Hobday, Mike (2000), “The project-based organisation: an ideal form for managing complex
  products and systems?” Research Policy, 29, 871–893.
Holmström, Bengt (1982), “Moral Hazard in Teams.” The Bell Journal of Economics, 13,
  324–340.
Holmström, Bengt and Paul Milgrom (1991), “Multitask Principal-Agent Analyses: Incen-
  tive Contracts, Asset Ownership, and Job Design.” Journal of Law, Economics, and Or-
  ganization, 24–52.
Ishihara, Akifumi (2017), “Relational contracting and endogenous formation of teamwork.”
  The RAND Journal of Economics, 48, 335–357.
Ishihara, Akifumi (2020), “On Multitasking and Job Design in Relational Contracts.” The
  Journal of Industrial Economics, 68, 693–736.
Jackson, Linda A., Carole N. Hodge, Donna A. Gerard, Julie M. Ingram, Kelly S. Ervin,
  and Lori A. Sheppard (1996), “Cognition, affect, and behavior in the prediction of group
  attitudes.” Personality and Social Psychology Bulletin, 22, 306–316.
Karlins, Marvin, Thomas L. Coffman, and Gary Walters (1969), “On the fading of social
  stereotypes: Studies in three generations of college students.” Journal of Personality and
  Social Psychology, 13, 1–16.
Katz, D. and K. Braly (1933), “Racial stereotypes of one hundred college students.” The
  Journal of Abnormal and Social Psychology, 28, 280–290.
Krishna, Vijay and John Morgan (2012), “Voluntary voting: Costs and benefits.” Journal
  of Economic Theory, 147, 2083–2123.
Kvaløy, Ola and Trond E. Olsen (2006), “Team Incentives in Relational Employment Con-
  tracts.” Journal of Labor Economics, 24, 139–169.
Lai, Lei and Linda C. Babcock (2013), “Asian Americans and workplace discrimination: The
  interplay between sex of evaluators and the perception of social skills.” Journal of
  Organizational Behavior, 34, 310–326.
Larson, Erik W. and David H. Gobeli (1989), “Significance of project management structure
  on development success.” IEEE Transactions on Engineering Management, 36, 119–125.
Lechler, Thomas G. and Dov Dvir (2010), “An Alternative Taxonomy of Project Manage-
  ment Structures: Linking Project Management Structures and Project Success.” IEEE
  Transactions on Engineering Management, 57, 198–210.
Levin, Dan and Emre Ozdenoren (2004), “Auctions with uncertain numbers of bidders.”
  Journal of Economic Theory, 229–251.
Levin, Dan and James L. Smith (1994), “Equilibrium in Auctions with Entry.” American
  Economic Review, 84, 585–599.
Levitt, Steven D. and Christopher M. Snyder (1997), “Is no News Bad News? Information
  Transmission and the Role of “Early Warning” in the Principal-Agent Model.” The RAND
  Journal of Economics, 28, 644–661.
Lim, Wooyoung and Alexander Matros (2009), “Contests with a stochastic number of play-
  ers.” Games and Economic Behavior, 67, 584–597.
Marino, Anthony M. and Ján Zábojník (2004), “Internal Competition for Corporate
  Resources and Incentives in Teams.” The RAND Journal of Economics, 35, 710–727.
Marschak, Jacob (1955), “Elements for a Theory of Teams.” Management Science, 127–137.
Marschak, Jacob and Roy Radner (1972), Economic Theory of Teams. Yale University Press,
  New Haven and London.
Matthews, Steven (1987), “Comparing Auctions for Risk Averse Buyers: A Buyer’s Point of
  View.” Econometrica, 55, 633–646.
McAfee, R. Preston and John McMillan (1987), “Auctions with a Stochastic Number of
  Bidders.” Journal of Economic Theory, 43, 1–19.
McAfee, R. Preston and John McMillan (1991), “Optimal Contracts for Teams.” Interna-
  tional Economic Review, 32, 561–577.
Milchtaich, Igal (2004), “Random-player games.” Games and Economic Behavior, 47, 353–
  388.
Mookherjee, Dilip (1984), “Optimal Incentive Schemes with Many Agents.” The Review of
  Economic Studies, 51, 433–446.
Moore, Michael J. and W. K. Viscusi (1990), “Models for estimating discount rates for long-
  term health risks using labor market data.” Journal of Risk and Uncertainty, 3, 381–401.
Mukherjee, Arijit and Luis Vasconcelos (2011), “Optimal job design in the presence of im-
  plicit contracts.” The RAND Journal of Economics, 42, 44–69.
Myerson, Roger B. (1998), “Population uncertainty and Poisson games.” International Jour-
  nal of Game Theory, 27, 375–392.
Myerson, Roger B. (2000), “Large Poisson games.” Journal of Economic Theory, 94, 7–45.
Myerson, Roger B. (2002), “Comparison of Scoring Rules in Poisson Voting Games.” Journal
  of Economic Theory, 103, 219–251.
Myerson, Roger B. and Karl Wärneryd (2006), “Population uncertainty in contests.”
  Economic Theory, 469–474.
Münster, Johannes (2006), “Contests with an unknown number of contestants.” Public
  Choice, 129, 353–368.
Münster, Johannes (2007), “Contests with investment.” Managerial and Decision Eco-
  nomics, 28, 849–862.
Münster, Johannes (2009), “Repeated Contests with Asymmetric Information.” Journal of
  Public Economic Theory, 11, 89–118.
Pender, John L. (1996), “Discount rates and credit markets: Theory and evidence from rural
  India.” Journal of Development Economics, 50, 257–296.
Piketty, Thomas (2000), “Voting as Communicating.” Review of Economic Studies, 67, 169–
  191.
Rantakari, Heikki (2008), “Governing Adaptation.” Review of Economic Studies, 75, 1257–
  1285.
Rayo, Luis (2007), “Relational Incentives and Moral Hazard in Teams.” Review of Economic
  Studies, 937–963.
Ritzberger, Klaus (2009), “Price competition with population uncertainty.” Mathematical
  Social Sciences, 58, 145–157.
Satterthwaite, Mark and Artyom Shneyerov (2007), “Dynamic Matching, Two-Sided In-
  complete Information, and Participation Costs: Existence and Convergence to Perfect
  Competition.” Econometrica, 75, 155–200.
Schmalensee, Richard (1976), “A Model of Promotional Competition in Oligopoly.” The
  Review of Economic Studies, 43, 493–507.
Schöttner, Anja (2008), “Relational Contracts, Multitasking, and Job Design.” Journal of
  Law, Economics, and Organization, 24, 138–162.
Skaperdas, Stergios (1996), “Contest success functions.” Economic Theory, 7, 283–290.
Spence, Michael (1973), “Job Market Signaling.” The Quarterly Journal of Economics, 87,
  355–374.
Szymanski, Stefan (2003), “The Economic Design of Sporting Contests.” Journal of Eco-
  nomic Literature, 16, 1137–1187.
Tullock, Gordon (1980), “Efficient rent seeking.” In Toward a theory of the rent-seeking
  society, 97–112, College Station: Texas A & M University.
Verdier, Thierry and Yves Zenou (2017), “The role of social networks in cultural assimila-
  tion.” Journal of Urban Economics, 97, 15–39.
Wasser, Cédric (2013), “Incomplete information in rent-seeking contests.” Economic Theory,
  53, 239–268.