MSU

LIBRARIES
“

 

 

RETURNING MATERIALS:
PIace in book drop to
remove this checkout from
your record. FINES wiII
be charged if book is
returned after the date
stamped beIow.

 

 

 

 

 

 

 

EXISTENCE OF OPTIMAL NASH CONTRACTS
OF A PRINCIPAL-AGENT MODEL

SINGLE AND MULTIPERIOD CONSIDERATIONS

BY

Peter Cheng

A DISSERTATION

Submitted to
Michigan State University
1 partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Accounting

1982

ABSTRACT

EXISTENCE OF OPTIMAL NASH CONTRACTS
OF A PRINCIPAL-AGENT MODEL
SINGLE AND MULTIPERIOD CONSIDERATIONS
BY

Peter Cheng

This dissertation attempts (1) to construct an
abstract, general yet analytically rigorous agency
model of a business entity, and (2) to study the
conditions or circumstances such that the principal
can design an Optimal contract with his agent in both
the single and multiperiod settings.

In the single period model, it is shown that if
the contract to be negotiated belongs to the class of
totally bounded functions or to that of monotonic

increasing functions, the principal can be successful
in arriving at an Optimal contract.

The multiperiod model is constructed as a sequential
decision process on a discrete time basis. Dynamic
programming algorithm is a suggested solution
procedure to the prOblem. Under various assumptions,

it is shown that the proposed model meets the hypothesis

Peter Cheng

of tﬂne algorithm. The questions of the existence of

optimal or nearly optimal contracts, uniformly optimal
contracts and stationary Optimal contracts as well as
the convergence of the algorithm are investigated.

In the multiperiod model, the basic agency problem
is first considered on a "most well—behaved" setting.
This means that the payoff outcomes are observable by
both the principal and the agent. Analysis is carried
on both a finite and an infinite time horizon. Then,
it is assumed that the payoff is observable by the
agent alone while the principal receives a signal on
the payoff. The signal is chosen by the agent and
there is no restriction on how the agent should report
the payoff. This leads to an imperfect state information
model.

The imperfect state information model is analogous
to the reporting and auditing functions of an entity.

If appropriate measurability restrictions are imposed

on the various functions, it is shown that if the auditor
is acting in the best interest of the principal, the
audited financial statements are adequate for the deri-
vation of a long run Optimal contract through the dynamic

programming algorithm.

To
Christina, Nicholas and Cecilia

ii

ACKNOWLEDGMENTS

I wish to thank my committee members, Dr.

Ronald Marshall, Dr. Kenneth Janson and Dr. Peter
Lappan, Jr., for their continual help and guidence
throughout the years of research on the dissertation.
Special thanks are given to Dr. Lappan who patiently
taught me mathematics and showed me how to apply
mathematical analysis to study accounting and
economic problems. Also, I have to thank Mary
Reynolds for her excellent typing of the manuscript.
Finally, I must acknowledge the patience and under-
standing of my wife Christina and children, Nicholas
and Cecilia, from whom I took time normally available

for family activities to work on this project.

iii

TABLE OF CONTENTS

CHAPTER PAGE
CHAPTER I: PRELIMINARIES 1
1.1: Introduction. . . . . . . . 1
1.2: SCOpe and Objective . . . 3
1.3: Organization of the Study
and Summary of Results . . 9
CHAPTER II: SINGLE PERIOD MODELS 14
2.1: A General Agency Model. . . 14
2.2: The Pareto Optimal
Contract . . . l9
2.3: The Nash Optimal COntract . 20
2.4: Financial Reporting and
Auditing . . . . . . . . . 27
CHAPTER II*: SINGLE PERIOD MODELS 30
2.1*: The Model . . . . 30
2.2*: Totally Bounded Incentive
Functions . . . . . . 34
2.3*: Monotonic Incentives . . . 41
CHAPTER III: MULTIPERIOD MODELS — PRELIMINARIES 46
3.1: Introduction . . . . . 46
3.2: The Basic Multiperiod
Model . . . . . . . . . 52
3. 3: Optimality Concepts . . . . 57
3. 4: Dynamic Programming

Algorithm . . . . . . . . S9

CHAPTER

CHAPTER III*:

CHAPTER IV:

CHAPTER IV*:

CHAPTER V:

CHAPTER V*:

MULTIPERIOD MODELS — PRELIMI-

NARIES

3.1*:

3.2*:

FINITE
4.1:

4.2:

FINITE

4.1*:

4.2*:

Notations and
Assumptions .

The Basic MultiperiOd
Model .

Optimality Concepts

The Dynamic Programming
Algorithm

HORIZON MODEL
Introduction and

Assumptions
The Finite Horizon Model

HORIZON MODEL

Introduction and
Assumptions
The Finite Horizon Model

INFINITE HORIZON MODEL

1
.2
3

UlU'lU'l

UiU“
U14}.
.0 00

Introduction

The Contraction Assumption

The Monotonicity
Assumption

The Optimality EquatiOn

Convergence to Optimality
and Existence of Optimal
Contracts . . .

Remarks . .

INFINITE HORIZON MODEL

.1*:
5.2*:

The Contraction Assumption

The Monotonicity
Assumptions .

The Optimality EquatiOn.

Convergence to Optimality
and Existence of Optimal
Contracts

PAGE

66

66

68
70

71

73

73
75

8O

80
82

96

96
98

101
102

103
105
109
109
121
126

135

CHAPTER

CHAPTER VI:

CHAPTER VI*:

CHAPTER VII:

CHAPTER VII*:

BOREL MODELS

ONO O‘Cﬁ
NH

pan

Introduction .
Existence of Probability

Measures .

Analytic Sets .
Construction of the

"Selector"

BOREL MODELS

Introduction . .
Probability Measures on
Borel Spaces . .
Analytic Sets and Univer—
sally Measurability
Universally and Borel
Measurable Selection

IMPERFECT STATE INFORMATION MODEL

7.1
7 2

7.3:

7.4:

7.5:

Introduction

The Imperfect State
Information Model

The Perfect State Infor-
mation Model

Reduction of the Imperfect

State Information Model
Sufficient Statistic

IMPERFECT STATE INFORMATION MODEL

7.1*:
7.2*:

7.3*:
7.4*:

7.5*:

Introduction

The Imperfect State
Information Model

The Perfect State
Information Model

Reduction of the Imperfect
State Information Model

Existence of Statistics
Sufficient for
Contracting . . . .

vi

PAGE

148
148

150
153

156

158
158
162
180

188

221
221
224
231
232
233
236
236
238
243

247

259

CHAPTER

APPENDIX I

APPENDIX II

BIBLIOGRAPHY

GENERAL REFERENCES

vii

PAGE

268

289

301

305

CHAPTER I

PRELIMINARIES

1.1: Introduction

 

An agency can be defined as an economic arrange-
ment in which two or more individuals share the outcome
produced by an action or a sequence of actions and
the occurrence of some random event(s). The indivi—
dual who wishes to delegate the action responsibility
while receiving a share of the outcome is called the
principal. The party who makes the action decisions

is called the agent.

The study of the economic effects of various
contractual arrangements between the principal and
his agents has been of increasing interest in both
the accounting and economic literatures. Amershi and

Butterworth [1979] have identified three research

lines of inquiry on the subject. One branch of study
focuses on the welfare effects in terms of expected
utilities of contractual aggrangements among economic

agents. A number of writers have contributed to the

literature of this approach: Demski and Feltham [1978],
Harris and Raviv [1979], Holmstrom [1977,1979],
Kobayashi [1980], Mirrlees [1974,1976], Ross [1973,

1974], Shavell [1979], Spence and Zeckhauser [1971],

and Wilson [1968], among others. The question of market
efficiency with respect to the ability of markets to
absorb and transmit information about the activities

of economic agents was studied by Akerloff [1977],
Hirschleifer [1977], Spence [1974]. The third

direction focuses on investigating the interactive
effects of markets and contractual decentralization:
Alchian and Demetz [1972], Fama [1980], Jensen and
Meckling [1976].

This study concentrates on the first category.
Focus is placed on identifying the set of sufficient
conditions to guarantee the existence of optimal con-
tracts between the principal and hisagent in both a
single period and multiperiod settings. In the multi—
period model, emphasis is placed on the existence of

long run contracts. Although the formulation of the
model does not preclude the consideration of a two

period agency, the thrust of the research is to investi-
gate contracts over the life of the entity. The
multiperiod model is developed as a sequential decision
process on a discrete time bases. Contracts are

negotiated at the end of each period after the payoffs

are observed or reported by the agent. There is no
restriction that contracts for subsequent periods be
of the same form as those of the previous periods.
Under such a scenario, I would like to study the
circumstances, both economic and behaviorial, under
which the principal would be successful in designing
a contract which would maximize his expected utility

on the net return while at the same time allowing

the agent to maximize his own expected utility by

selecting an Optimal action choice.

1.2: Scope and Objective

 

The purposes of this research are: (l) to
construct an abstract, general yet analytically
rigorous agency model of a business entity, (2) to
study the conditions or circumstances such that the
principal can design an optimal contract with his
agent in both the single and multiperiod settings.

In a sole proPrietorship or a partnership, the
principals are the owners and the managers are the

agents. For corporations, the stockholders as repre-
sented by the board of directors are the principals
while the managers are the agents. The auditors are

assumed to act always in the best interest of the
principal: that is, they act c00perative1y with the

principal. It is also assumed that all principals

within the company act congruously and have common
utility preferences. Likewise, the agents within the
entity also have common goals and utility preferences.
The model in the subsequence chapters effectively
focuses on the relationship between one principal and
one agent.

Ng and Stoeckenius [1979] point out that, in
general, not only is the manager's action not
observable by the owner, the firm's payoff is also

unobservable by him as well. Since this asymmetry of

information exists between the principal and the agent,

the latter is required to issue financial reports
periodically to convey information concerning the firm
to the owner. If the manager's remuneration is
directly a function of his reported performance, then
the manager will have incentive to misrepresent in his
report. If auditing is effective in detecting the
misrepresentation, then it is of value to the owner,
provided that the audit cost is small enough.

Fama [1980] notes that the contracting literature

is almost uniformly concerned with one-period models.

He further argued that in a single period world, there
can be no enforcement of contracts through a wage
revision process imposed by the managerial labor

market. Radner [1981] shows that the sequential

observation of the agent's performance over time is by
itself an effective monitoring device. Since any real
world contracting process is dynamic, an agency model
is incomplete if its dynamics are not studied. A
complete dynamic general equilibrium analysis including

the external labor market is beyond the scope of the
research. I plan to study the partial equilibrium

effects on the contracting behaviors between the
principal and the agent. This means that the external
labor market Opportunities to the agent are represented
by a constant exogeneously determined. The principal

must pay him in such a way that the agent's expected
utility on his compensation is greater than the given

outside Opportunity set.

In the single period model, it will be shown, in
Chapter II, that if the contract to be negotiated is
bounded or if it is nondecreasing (monotonic increasing)
with respect to payoff, the principal can be successful
in searching for the optimal one within the above two
mentioned classes of contracts. Bounded contracts are
reasonable since no principal would pay his agents an
unlimited amount of compensation in excess of the payoff.
Monotonic increasing contracts are those of the bonus

type. Bonus has been an extremely common form of

remuneration. Agents are paid according to the payoff
outcomes. These two classes of contracts clearly
describe some common contracting behavior of economic
agents.

For the multiperiod model, there are no assumptions
nor restrictions placed on the form and types of the
long run contracts. All ad hoc assumptions and condi-

tions are imposed either on the net payoff function
(the payoff less the compensation paid to the agent),

the total expected discounted return function (the net
payoffs to the principal over the planning horizon
discounted to period 0) and the probability distribu-
tion of uncertain events (the period payoffs and the
reported payoffs). The first two are restrictions on

the economic behavior of the entity, how its payoffs,

per period and total, relate to each other and to

the environment in which the entity Operates. The
conditions on probability distributions are more be—
haviorial in nature. The distributions capture the
principal's beliefs of his expectation on the agent's
performance and the manner the agent reports, that is,
how truthful the latter is. These are discussed in
details in Chapter III and subsequent chapters. It is

the h0pe of this research that the general agency

model develoPed will describe, both normatively and

positively, the behavior of the principal and his

agent in the above context. It is a positive model
because if the conditions or assumptions are met and
Optimal contracts are found, the principal can predict
the behavior of the agent in terms of his performance.
The imperfect state information model proposed
in Chapter VII will attempt to describe the reporting
and auditing functions in terms of a multiperiod agency
model. The reduction of the imperfect state model to

a perfect state model through the auditing process and
the conclusion that the audited financial statement

is sufficient to design a long term Optimal contract
confirms the current belief about the value of auditing.
HOpefully, such a model will enhance some understanding
of the process of financial reporting and auditing as
they are related to the contracting procedures of the
company.

Most of the current researches in the agency area
are conducted by imposing certain ad hoc assumptions on
the model. Some of these assumptions are descriptions
of the economic environment of the entity and some are
behaviorial in nature while others are for mathematical
tractability of the model. Assuming a solution to the

problem, that is an optimal contract, exists under the

 

proposed assumptions, the researcher carries on either
to characterize the assumed Optimal contract or to
draw implications from the Optimal contract. There

is not any documentation about the kind of environment
and conditions under which a researcher can appro-

priately make such an assumption, the existence of an

optimal solution. This project attempts to lay the
foundation and investigate the fundamentals of the
agency problem.

Every attempt in the research is made to impose
the minimal amount of restrictions on the behavior
of the various functions. The model will be described
and analyzed in its most possible generality. It is

hoped that any results derived under such hypothesis
are applicable to a greater variety of situations.

Due to the mathematical and technical nature of the
analysis, discussions in the following chapters will

be separated into a general and non-technical descrip-
tion of the analysis and its results in the non-starred
chapters. The full technical model, with all theorems
and proofs is presented in the starred chapters. The
two appendices contain materials which are well-known

in the literature but are crucial to the develOpment

0f the model. They are collected there for completeness.

 

 

 

However, by developing the model in its most

general form, it becomes extremely difficult to give
a direct and elaborate characterization of the optimal

contract even after showing its existence. Character—
ization of the Optimal contract requires additional
hypothesis on the model. I do not intend to carry

the analysis to such an extent. Also, this research
will only discuss the conditions under which the
principal can design an optimal contract and suggest,

wherever possible, some procedure to arrive at or
approximate the solution. It will not discuss the

specific numerical aspects of the actual search for

solution.

1.3: Organization of the Study and Summary of Results

 

The organization of the study is as follows. A
general agency model with one principal and one agent

is analyzed in Chapter II. A set of sufficient
conditions for the existence of an Optimal contract

will be derived. Financial reporting and auditing are
then introduced into the model. As indicated earlier,
the auditor is assumed to act cooperatively with the
principal. This implies that the auditor's decision
variables are inputted into the model exogeneously

and the auditor is acting as a "surrogate" for the

principal. Auditing is essentially a dynamic process:

 

10

analysis of the auditing function is deferred until
Chapter VII in the multiperiod model.
In the multiperiod model, the basic agency model

is first investigated (Chapter III). The setting
is the most "well-behaved" one in the sense that

payoffs are observable by both the principal and the
agent. The model is well-defined stochastically at
time 0. This means that the sequence of payoffs over
the entire planning horizon is defined stochastically
given an initial payoff at the initial period. A
probability distribution on this sequence of payoffs
exists with a well-defined probability belief on the
initial payoff. Analysis of the model is carried on
a finite time horizon (Chapter IV) and infinite time
horizon (Chapter V).

In the second part of the multiperiod analysis,
it is assumed that the actual payoff is observable
only by the agent while the principal will receive
a signal on the payoff. The signal is chosen by the
agent and there is no restriction on how the agent

should "report" the payoff. This leads to an imperfect

state information model (Chapter VII). The principal
must attempt to assess the likelihood of the actual
payoff given the signal produced by the agent. Hopefully,

he can achieve such a task through the auditing and

11

contracting processes. Serious mathematical problems

arise under such a setting. These problems and their
suggested solutions are discussed in Chapter VI.

The model constructed in the subsequent chapters
imposes no differentiability restriction on the various
functions. Heterogeneous prObability beliefs between

the principal and the agent are allowed. In the
single period setting, it is first shown that under

very mild conditions on the agent's choice set and
utility function, the Nash constraint that the contract
is incentive compatible is met for all incentive
functions. Optimal contracts are shown to exist under
two classes of functions, totally bounded and monotonic
increasing functions.

The multiperiod model represents a sequential
decision process. Long run Optimal contracts are
determined on the basis of maximizing the principal's
total expected discounted net return (period payoffs
less the agent's remuneration) over the planning time
span. The model is considered under a finite and an
infinite horizon settings.

Dynamic programming algorithm is an iteration
procedure over time through which (1) it computes a

Conditional expectation: (2) the objective function

in two variables (state and incentive) is Optimized over

 

12

one of these variables (incentive): (3) if an
optimal contract is to be constructed, a "selector"
which maps each state to a contract which achieves

the maximum in the second step is to be chosen. It is

shown that under certain circumstances such an algorithm
is valid to solve the multiperiod agency problem in
both the finite and infinite horizon models.

In the finite horizon model, Optimal contract

exists if the principal's total expected discounted
net return is continuous and behaves "somewhat" linearly
with respect to his period net return. In addition,

the dynamic programming algorithm provides a stronger

result. The optimal contract thus obtained maximizes

the principal's period net return as well as his total
expected discounted net return.

The infinite horizon model is discussed under two
separate sets of assumptions. The first set includes
conditions that the discount factor is less than unity
and that the total discounted expected net return is

bounded for all contracts. Existence of optimal

contracts is established. Under the dynamic pro—
guamming algorithm, the Optimal contract derived is
stationary, that is, the form of the contract remains
the same throughout time. The second set of assumptions

is that the total discounted expected net return to

 

13

the principal is monotonic (either increasing or
decreasing) with respect to time. Under both sets of
assumptions, the infinite horizon solution is shown
to equal to the limit of that of the finite horizon
model when time is allowed to tend to infinity. Also,
it is shown the dynamic programming algorithm imple—
mented under these sets of conditions converges.

If appropriate measurability restrictions are
imposed on the various functions, the dynamic pro-
gramming algorithm can be implemented to the imperfect
state information model. In this part of the analysis
it is shown that if the auditor is acting strictly
in the best interest of the principal and the audit
is performed in the best possible manner, the audited
financial statement is adequate for the derivation of a

long run Optimal contract through the dynamic programming

algorithm.

CHAPTER II

SINGLE PERIOD MODELS

2.1: A General Agency Model

This section considers the situation in which
there is a principal with one agent. The agent is
entrusted with the task of selecting an action from
among a set of alternatives. Then some random event
occurs and the outcome of the action is some payoff
which is assumed to be observable by both parties.
The action choice, however, is not observable by the
principal. A problem is then to decide on some
"best" sharing scheme between the principal and the
agent. Under the assumption that the agent's utility
for wealth is positive and his utility for effort is
negative, it has been shown that a pure wage contract
is incentive incompatible, that is, the agent will
always choose the act which requires the minimum level
of effort (Ng and Stoeckenius [1979], Shavell [1979]).
On the other hand, since the action choice is not

observable by the principal, an incentive contract based

14

15

on the outcome alone may cause moral hazard problems.
Under such circumstances, Harris and Raviv [1979]

have shown that some information about the agent's
action by the principal is desirable even at some

cost, namely, some monitoring is "good". The solu-
tion to the general agency problem is the determination
of the action by the agent, an incentive function and

a monitoring system by the principal to motivate the
behavior of the agent.

The usual criteria adOpted by writers in agency
literature for the choice of optimal incentive con-
tracts are Pareto Optimality and Nash optimality.

The Pareto criterion provides the principal with an
expected utility at least as great as that which is
obtainable from among alternatives which satisfy

some minimal requirement (minimum security level)
imposed by the agent. This implies that the principal

and the agent decide COOperatively on the contract

and the action choice such that the expected utility
of the principal is maximized subject to the minimal
requirement. If the principal can observe the agent's
action choice as well as the payoff, or if there is
full and truthful communications between the principal
and the agent on the action and payoff, it would be
sufficient to guarantee that the agent will act

cooperatively. Some authors call this the first best

16

solution (Holmstrom [1979], Shavell [1979]). However,

a solution in the cooperative game does not necessarily
imply that there is full and truthful communication

of the action and payoff. The first best solution
implies that the agent always acts in the best interest
of the principal. He does not have a conflict of
interest over effort with the principal either by the
observability of the action or other monitoring means.
The Nash condition assumes noncooperative
behavior between the principal and the agent. It
requires an additional restriction, that the contract
must enable the agent to choose an action from among
the alternatives which maximizes his expected utility
under the contract. This allows the agent to choose
his actions freely based on his own choice criteria.
A typical situation of such a phenomenon would be the
asymetry of information between the principal and the

agent. The principal does not observe the agent's
action even after the payoff is Observed. There is no

way of enforcing any action choice on the agent. The
best thing the principal can do then is to design the
contract based on his "best" guess on the agent's
action. The solution set of the Nash problem is a

subset of that of the Pareto one. Hence there exist

17

solutions which are Pareto efficient but are not
Nash solutions.
The following is a general agency model. Define

the expected utility of the principal as

¢1(I.a) = E1U1(u)(a,s) -I(w(a,s))), where

a E A set of possible actions
s E S set of states of nature
{2:A.xS 4 R payoff function w(a,s) = m

units of wealth.

I: Q 4 R Incentive function I(w(a,s)).

Throughout this paper, the same notation will be
used to denote a function as an element of a functional
space and the value of the function in the range. For
example, w denotes a payoff function as well as the
units of wealth payoff for a certain action a and
state of nature 5. The actual reference should be
clear from the context.

Under most common economic circumstances, it will
not be unreasonable to assume that the agent has only a
finite number of mutually exclusive action choices.
Then, without loss of generality, A is assumed to be

a finite subset of Rn. The set S is defined with
its usual statistical meaning in a decision theory

context (Savage [1954]) with probability beliefs Pi

on S, where i = 1 denotes the principal and i = 2

18

denotes the agent. The usual homogeneity of beliefs

between the principal and the agent is not assumed in

this model. Let Ei denote the expectation Operators
with respect to the beliefs of the decision maker.

It is assumed that the utility function of the
principal, U1, is non-negative, concave and monotonic
increasing with respect to wealth. It can then be
shown that such a utility function is continuous on its

domain (Theorem 2.7).

For the agent, define his expected utility as

¢2 (Ila) = EZUZ (I (w(als) ) la) °

His utility function is also non-negative, concave, and
monotonic increasing in wealth, but non—negative, concave,
monotonic decreasing in effort. Assuming that both the
principal and the agent are utility maximizers, one can

formulate the problem as follows.

Maximize
E1U1(w(a.S) -I(w(a.S))
I e {1)
Subject to: 02(I,a) 2 v (1)
a 6 argmax E2U2(I(uwa,s)),a) (2)
aEA

v represents the minimum levels of security for the
agent to remain in the company. Solving the above

program subject to constraint (1) will yield the Pareto

19

solution. Constraint (2) represents the additional
incentive compatibility condition that the Nash

solution requires.

2.2: The Pareto Optimal Contract

 

Under the conditions of Pareto Optimality, all

parties are assumed to act COOperatively in the best
interest of the company. The existence of a Pareto
optimal contract is usually not too difficult to
guarantee. Given sufficient regularity conditions,
at least one solution is guaranteed. Wilson [1968]
has demonstrated the existence of a Pareto optimal
sharing rule and characterized the behavior of this

sharing rule in a syndicate setting. Kobayashi [1980]

analyzed the role of private information of an individual

in the syndicate by redefining the core. He showed

that an equilibrium contract belongs to the core as

well as the existence of such equilibrium contracts.
Amershi and Butterworth [1979] investigated the

problem assuming diverse beliefs among the principal and

the agents. They analyzed the conditions for existence

of the optimal contract and the characteristics of such

an Optimal contract. Most of their findings on the
Pareto optimal contract are consistent with the results

of previous studies.

20

2.3: The Nash Optimal Contract
In the formulation of the Pareto problem, one can

assume some kind of convexity prOperty for the Objec—
tive function and the constraints are generally well
behaved mathematically. Such assumptions will, to
some extent, simplify the solution techniques. Under
the Nash criterion, the additional constraint that the
agent is able to maximize his expected utility removes
any reason to impose the well behavior of the functions

of the model. Also, one cannot be sure that the

Optimal contracts are differentiable. Differentiability
is a requirement in most optimization techniques. The
problem is compounded if the principal and the agent
are allowed to hold diverse beliefs about the state of
the world.

One common method in showing existence is by
analyzing the first and second order conditions. Under

the assumptions of Kuhn-Tucker in the nonlinear

programming literature, these conditions are, in general,
satisfactory. Another usual method is to use the Euler—
Lagrange equation in the calculus of variation literature.
In the agency setting, however, both of these techniques
may not be apprOpriate. Both methods require
differentiability of the incentive contracts. The

principal's decision variable, incentive contract, is a

21

function belonging to some functional space. A space
is of finite dimension if its elements can be expressed
as some finite number of vectors (or basis). There

are only very few finite dimensional spaces that are
of interest. In fact, the only one which is of any

use is the space of polynomials of finite degree. The
principal is effectively maximizing his expect
utilities over infinite dimensional spaces. Under
such handicapped conditions, most of the common cal-
culus methods fail.

Both Mirrless [1974] and Gjesdal [1976] have
correctly shown that the differentiability assumption
may be too restrictive. Holmstrom [1977] constructed
a counter-example, the optimal solution of which can
be attained by a nondifferentiable sharing rule and
no differentiable rule can precisely attain this
solution.

One of the more extensive works in demonstrating
existence is also done by Holmstrom [1977]. He proved

the existence of two classes of incentive contracts
under the Nash conditions. His work relied heavily

on the assumption of homogeneity of beliefs between the
principal and the agent. He also put some additional
restrictions on the behavior of the functions to arrive

at his results.

22

The first step of the analysis is to generate

conditions on the agent's problem such that given any
contract I, his expected utility function will
always achieve a maximum. This will satisfy the Nash
additional condition that the contract is incentive
compatible.

To show incentive compatibility, the Weierstrass
Maximum Theorem (Theorem 2.3) is used. Loosely stated,
an upper semicontinuous function on a compact set
achieves a maximum. Upper semicontinuity is a milder
condition than continuity. When applied to the agent's
expected utility function with respect to the action
set, this means that if the agent shirks a very little
bit, that is, if a is allowed to change by a small
amount, then the agent's expected utility will not be
increased by a large amount. Hence, upper semi-
continuity simply means that the function is continuous
from above. The agent should not expect a substantial
gain if he changes his effort level slightly. This is
reasonable in terms of expected utility when all possible
states are considered.

Recall that A is assumed to be a finite subset
of Rn, it is bounded by definition. Compactness on
the real line means closed and bounded. All one

needs for A to be compact is that it is a closed

 

 

 

 

 

23

subset. Upper semicontinuity on ¢2 requires some
additional condition on the utility function of the

agent. It can be shown that if U2 is upper semi-

continuous with respect to a, the action choice,
then the resulting expected utility is also upper
semicontinuous (Theorem 2.5).

With the above construction, indeed only very
mild conditions, the Nash constraint is satisfied.
Then the principal's problem is considered, again
applying the Weierstrass Theorem on his expected utility
function.

To guarantee upper semicontinuity on Oi is easy.
The principal's utility is monotonic increasing and
concave with respect to wealth, which is generally
the residual of the payoff after the manager's
compensation is deducted. A concave function is always
continuous (Theorem 2.7), and the integral, in our case,
the expectation operator, is continuous if the integrand
is continuous, which in turn implies upper semicontinuity.

Compactness of the space of incentive contracts
is more difficult to show. The space of incentive
contracts is of infinite dimensions. It would require
stronger conditions on the contracts than just being
closed and bounded. The mathematical requirement is

that the contracts are totally bounded. Total

 

 

 

24

boundedness means that no matter how the elements

of the set (in this case set of functions) are
grouped, they are always enclosed in a ball of some
given radius. Here the radius has to be the same
for all possible groupings. It is a kind of uniform

bound for all possible payoff outcomes. Thus, if the

incentive contracts are totally bounded, then the
Weierstrass Theorem will guarantee the existence of
a maximum for the principal.

In most situations, one would expect that the
contracts should be increasing with respect to the
payoffs or a bonus as it is more commonly called. The
class of monotonic increasing contract is considered
next. Each contract of this type is assumed to be
bounded below. As the principal obtains the residual
of the outcome after he pays the agent and his utility
is increasing with wealth, given any wealth position
from the outcome, he would choose the incentive function
that would pay the agent the least amount. That is,
the optimal contract in the principal's viewpoint can
be defined as I; = inf [11,...,In} for each w E 0
where n denotes the number of possible contracts
given w. As the problem is formulated, the agent must
maintain a minimum security level in terms of expected

utility to remain in the company, the incentive function

should be bounded below for each m E O.

 

 

 

 

25

Now, consider the sequence [1;], the elements
of which are defined earlier. This is a decreasing
sequence and it is shown that it converges to a limit
1* which is also a monotonic increasing function
(Theorem 2.12). As mentioned earlier, the upper
semicontinuity on O? on A and the compactness
of the action set A guarantee the existence of an
optimal action for the agent given any incentive
contract. Thus, for each 1;, there exists a
corresponding a; for the agent. The final step is
to show that the sequence [a;] also converges to a
limit a* which is then the Optimal action for the
agent if he is given the Optimal contract 1*

(Theorem 2.13).

Up to this point, it has been shown that if the
agent's action choice set A is a closed subset of
Rn, and if his expected utility function is upper
semicontinuous with respect to A, there always exists
an optimal a* such that his expected utility is
maximized. This will satisfy the Nash constraint of
incentive compatibility for all possible incentive

contracts.

Two incentive spaces are then identified such that
‘by choosing the appropriated contracts from these two

spaces, it will be guaranteed that the principal's

expected utility will achieve a maximum.

 

26

The first of these spaces is the class of totally
bounded contracts. This is a very large class of
functions which includes the differentiable functions
and the equicontinuous functions which Holmstrom [1977]
has demonstrated to be optimal under homogeneous
belief assumptions. Functions from the totally
bounded class are compact. The principal's expected
utility function is continuous with respect to I.
These two conditions will guarantee, by the Weierstrass
Maximum Theorem, the existence of an optimal contract.

The second class of functions is the monotonic
increasing incentive functions. A typical example of
these functions is the bonus arrangement. Indeed it is
shown that one can always pick an Optimal monotonic
increasing contract to maximize the principal's expected
utility.

So far, an agency model has been described under
the usual decision choice theoretical context. The
assumptions imposed on the model, which are summarized
above, describe a very reasonable and general economic
environment. The restrictions on the utility function
are in accordance with the usual von Morgenstern
utility preference assumptions. If the contracts are
totally bounded or they are monotonic increasing,

the principal will always be able to negotiate a Nash

equilibrium contract such that both his and the agent's

27

expected utilities are maximized. In other words,
if the agent exhibits a von Morgenstern utility
preference with respect to wealth and effort, then
the principal can always design a contract which is
either totally bounded or monotonic increasing to

achieve a Nash equilibrium.

2.4: Financial Reporting and Auditing

 

Generally, financial reporting means the conveyance
of information to the principal (owner) about the out—
comes of the actions taken by the agent (manager) in

a period of time. When such a function is introduced

into an agency model, it implies as well that the out-
comes of the events are not observable by the principal.
An additional objective of the principal is then to
ensure that the agent is reporting the outcome truth-

fully. Ng and Stoeckenius [1979] have demonstrated
that without monitoring, an incentive compatible con-

tract always induces nontruthful reporting. This
suggests that when the actual payoff and the agent's
effort are not observable, information about the agent's
performance is always valuable to the principal in

terms of increasing his expected utility. Of course,
the benefits derived from such information should be
sufficient to justify its cost. Harris and Raviv

[1979] showed that even imperfect information is bene-

ficial.

28

As Baiman [1979] points out, there is a distinct
difference between monitoring and auditing. The

former means to verify the action taken by the agent
while auditing is taken as the verification of the
report of the outcome produced by the agent. In this
paper, the role of the auditor is defined as another
agent of the owner who "monitors" the reporting
function of the manager, while the incentive contract
is designed to minitor the actions of the manager.
Since the objective of the principal is to control
the reporting as well as the action choice by the

manager, the incentive contract and the auditor can

then together be considered as the monitoring system.

The "black" box of unobservables increases with
the addition of the auditing function into the model.
In most common situations, the original report submitted

by the manager is not observable by the principal.
The manager's report goes directly to the auditor who

performs the necessary tasks on the report, suggests
apprOpriate changes and adjustments and then attests

the report. The principal will receive the "audited"
report after all adjustments have been made or a
"qualified" or "disclaimed" report if the manager refuses
to make the suggested alterations. Under such a scenario,

the only variable which is observable by all parties is

29

the audited report. Any contract that is enforceable
has to be based on variables which are observable by

both the principal and the agent. Therefore, it is
not unreasonable to suggest an incentive contract

based on audited outcome. In fact, if the auditor
performs all his audit tasks in the best possible
manner and in the best interest of the principal, the
audited outcome is shown in Chapter VII to be
sufficient to design an Optimal contract. The dis-
cussion of Optimal contracts under a reporting and
auditing environment will be deferred until Chapter
VII when an imperfect state information model in a

multiperiod setting is discussed.

CHAPTER II*

SINGLE PERIOD MODELS

2.1* The Model

 

Define the expected utility of the principal as

m1(I.a) = E1U1(w(a,s)-I(w(a,s))), where

a 6 A set of possible actions

5 6 S set of states of the world

U):A.XS 4 R payoff function w(a,s) = w units of
wealth

I: Q 4 R incentive contract I(w) = k2.

The following are some standard assumptions on decision

choice theory.

Assumption 8.1: Let the action set A be a
closed, bounded subset of a; where 0’ is a normed

linear space of dimension n < m.

Assumption 8.2: The set S is non-empty with a
sigma-algebra 0(5) defining the events in S, Pi’
i = l (principal), 2 (agent), are probability measures

on 0(s). It is assumed that all functions are measur-

able with respect to 0(5).

3O

31

Assumption 3.3: (i) The utility function of the
principal U1, is assumed to be non-negative, con-
cave, and monotonically increasing with respect to
wealth.

(ii) The utility function of the agent U2,
is also nonnegative, concave, and monotonically in-

creasing in wealth, but decreasing in effort.

Assumption 8.4: There exists a mo. -w < wo < m.

such that ]w(a,s)] g 1mg] for each a 6 A and s E S.

Assumption 8.1 describes the action choice set.
More details about its tOpological structure will be
derived in the later part of this section. Since
focus is put on a single period model, at this stage
of analysis, S can be viewed as deterministic.
Hence no serious problem should arise on the measur-
ability of the functions, which are assumed to be
measurable. The assumptions on the utility function
simply imply that the preference relation of the
individuals is convex, complete and transitive.
Assumption 8.4 imposes a bound on the payoff function.
In any economic situation, it will be very unlikely
that the payoff is unbounded. One inherent consequence
of Assumption 8.4 is that the incentive function is

also bounded as no principal can pay his agent an

unspecified amount of compensation in excess of his return.

32

Thus, the expected utility of the agent can be

defined as
sz(IIa) = E2U2(I (W(a,S) ) Ia)

and the typical principal-agent problem represented

by the following program:

Maximize E1U1(w(a',s) - I(w(a'.S)))
I E C
Subject to: m2(I,a’) 2 V (l)

a: 6 argmax E2U2(I(w(a.S)),a) (2)
a€A

C represents the incentive space and V represents

the minimum level of security for the agent to remain
in the company or the Opportunity set he could attain
outside the company. Solving the program subject to
constraint (1) yields the Pareto optimal solution or
commonly known as the first-best solution. Constraint
(2) represents the additional condition that a Nash

equilibrium requires.

Theorem 2.1: Let (d,H°H1) be an n-dimensional

 

normed linear space over the real field and n < m.
Then (d,H'H1) is tOpologically isomorphic to

(an. n- 112)-

Proof: Refer to Larsen [1973].

33

Definition: A class of sets in a topological
space is said to cover a given set X if and only
if each point of X lies in at least one of the
sets. If the diameter of each set in a cover of X
is not greater than c, the class is called an

c-cover of X.

Definition: Let X be a subset of a topological
space. x is said to be compact if and only if every
class of open sets which covers X has a finite

subclass which also covers X.

Definition: A subset X of a complete normed
space is said to be sequentially compact if and only

if every sequence in X contains a convergent sub-

sequence with limit in X.

It can be shown that in a Banach space, compact-
ness and sequential compactness are equivalent. The
following is a well-known and useful result of a

finite dimensional space.

Theorem 2.2: Let 0' be an n—dimensional normed

 

linear space over the real field with n < m. Then

(i) d’ is a Banach space

(ii) If A c.d' is a closed bounded set, then A

is compact.

Proof: Refer to Rudin [1973].

34

Theorems 2.1 and 2.2 establish the structure of

the action choice set and guarantee its compactness.

By Theorem 2.1 and without loss of generality, A is
assumed, for the rest of this paper, to be a closed
subset of Rn with an appropriate norm and all results

can then be generalized to other topological spaces.

2.2* Totally Bounded Incentive Functions

In this section, by considering the requirements
of the Weierstrass Maximum Theorem, a set of sufficient
conditions is derived for the existence of an Optimal

contract under the conditions established in Section 2.1*.

Definition: A real—valued function f defined
on a normed space X is said to be upper semicontinuous
at xo if and only if lim sup f(x) g f(xo).

X"X
0

Theorem 2.3 (Weierstrass Maximum Theorem): An

upper semicontinuous function on a compact subset X

of a normed linear space achieves a maximum on X.
Proof: Refer to Luenberger [1969].

First consider the agent's problem. The idea is
that given any incentive contract, we want the agent
to be able to select a E A such that his expected
utility function will achieve a maximum. This will

satisfy the Nash additional condition that the contract

35

is incentive compatible. Since A is a closed and

bounded subset of Rn, it is compact by Theorem
2.2 (ii). All that remains to be shown is that m2

is upper semicontinuous on A.

Assumption 8.5: For each fixed 5 6 S and
e > 0, there exist a 6(5) > 0 such that 6: S 4 R
is measurable and U2(I(w(a,s)),a) < U2(I(w(ao,s)),ao)+-a

whenever Ha-—aOH < 6(5).

Theorem 2.5: Let I(w(a,s)), w(a,s) be bounded

 

measurable functions and U be the measure Of S.

Suppose for each fixed 5 E S and each s > 0, there
exists a 6(5) > 0 such that 6: S 4 R+ is measurable

and U2(I(w(a,s)),a) < U2(I(w(ao,s)),ao)+-e/2 when-

ever Ha-—aOH < 6(5). Then m2 = [S U2(I(w(a,s)),a)P2(s)ds

is upper semicontinuous on A.

Proof: Let c > 0 be given. For each s 6 S

define
60(5) = sup 6(5).
Clearly 60(5) is measurable since 6(5) is measurable.
Let
= f . ‘
So [5 E S. 60(5) > 1)

36

and

Since 60(5) > 1 for every 5 G S

For Y > 0 to be chosen later, there exist

a subset SI of S such that u(S-—S') < y.

k
Hence there is a k such that S’ = L) Sn and
n=0

50(5) <11 Vs 65’.
Let 5’ = %. Then

U2(I(w(a.S)).a) < U2(I(w(ao.S)).ao) + 9/2

/ ,2

. I
whenever Ha-aoﬂ \ 6 and 5 6 S .

This implies

IS! U2(I(w(a.s)).a)P2(s)ds < IS' 02(I(w(ao,s)),ao)92(s)ds

+ Is’ e/2 P2(s)ds.

By Assumption S.4, w(a,s) is bounded for each a 6 A,

s E S. This implies that I and consequently U2

are both bounded. Thus for each 6 > 0, there exists a

7 > 0 such that

g-S’ U2(I(w(als))ra)P2(S)dS < 62/2

whenever u(s-s’) < Y

37

[S U2(I(w(a.8)).a)P2(S)ds

= J‘s: U20: (1(a,s)),a)P2 (S)ds
+ £_S, U2(1(w(a.8)).a)P2(s)ds

< [S U2(I(w(ao.S)).ao)P2(S)dS + 5/2 + Is 5/2 P2(s)ds

Is U2(I(w(ao,s)),ao)P2(s)ds + e. QED

Assumption S.5 says that locally, when a approaches
a limit point, the increase in utility to the manager
given a very small change of effort will not be large.
This requirement seems reasonable, because with a small
increase in the agent's effort he should not expect a
large increase in the outcome as well as his compensation,

or else he will be remunerated for very trivial efforts.

Given an incentive contract I, Theorems 2.2 (ii)
and 2.6 guarantee the compliance of the requirements
of Weierstrass' Theorem and hence the Nash criterion is
satisfied. The problem is now reduced to finding an
incentive contract such that the principal's expected
utility function achieves a maximum.

It is easy to verify that space of incentive functions

satisfies the prOperties of a vector space. Since any

linear combination of two contracts is also a contract,

i.e., for every I and I E C, dI + (l-—o)I E C,

1 2 1 2

O l, C is locally convex. Formally, the above

Cr.

HA
IL’\

idea can be stated in the following theorem.

38
Theorem 2.6: The space of incentive functions C

is a locally convex vector space.

Assumption 8.6: C is complete.

Hence, by the above construction, C is a locally
convex Banach space.

Now, let us digress for a moment and investigate
the behavior of the utility function of the principal

under Assumption S.3 (ii).

Theorem 2.7: Let ‘U: R 4 R be a concave mapping

 

on (a,b). Then U is continuous on (a,b).

Proof: The following proof is due to Rudin [1974].
Suppose a < s < x < y < t < b. Write S for the point
(s,U(s)) in the plane and do the same thing with S,Y,
and T. Then X is on or above the line SY: hence Y
is on or below the line through S and X: also, Y is

on or above XT. As Y 4 X, it follows that Y 4 X,

i.e., U(Y) 4 U(X). Left-hand limits can be obtained

in the same manner. Then continuity of U follows. QED

The above theorem establishes the continuity of the
utility function on its domain. The integral of a

continuous function is also continuous.

Theorem 2.8: For a given a E A, m1(I.a) =

 

I U1(w(a.S)-I(w(a,s)))Pl(s)ds is continuous on C.
S

39

If C is of finite dimension, then by Theorem 2
(ii), a closed subset of C is compact. However, if
C is of infinite dimension the compactness of [I]

is not easy to guarantee. This leads to the following.

Definition: A subset X of a normed space is called

totally bounded if and only if, for each e > 0, there

is a finite set of Open balls which form an c-cover of X.

Theorem 2.9: If a closed subset I of a complete

 

normed space C is totally bounded, then it is

sequentially compact.

Proof: Let [In] be any infinite sequence in I.

As I is totally bounded, there is a finite l-cover of

1 1 . _
I, say [1(f1,§),...,I(fk,§)] where fj E I, j — 1,...,k.

At least one of these balls contains an infinite sub—
sequence, [In 1} say, of [In]. Take next a finite

%—cover of I and as before extract an infinite sub—

sequence, {I 1 say, of [I 1 contained in one of
n,2 n,l

these balls. Proceed in this way to find for each m

. c a ‘ r ‘1
an 1nf1n1te subsequence [In’mj say, of “In,m—l' such
that [In m} is contained in a ball of diameter m-l.

I
Now, consider the diagonal sequence [In n}. Then
CD , m .
{Ij,j}j=n 1s a subsequence of {In,n]j=n' and so 1s
contained in a ball of diameter n-1. Therefore,
‘In,n'-Im,m‘ g l/M1n(n,m).

Thus [I 7 is Cauchy, and hence is convergent in I

n,ni

as C is complete and I is closed.

40

This shows that [In] has a convergent subsequence,

and proves sequential compactness. QED

In a Banach space, compactness and sequential
compactness are equivalent. The requirements of the
Weierstrass Theorem are satisfied to guarantee the

existence of an optimal incentive contract such that

the expected utility of the principal is maximized.
The following gives a useful example of the class

of totally bounded functions.

Definition: Suppose Q is a subset of Rn,
C(O) is the sup-normed Banach space of all continuous
real-valued functions on 0. A set C C C(O) is
said to be equicontinuous if and only if, for each
6 > 0, there exists a 6 > 0 such that
II(w1) —I(w2)] < e for all w1,u12 6 O with

Im1-w2] < 6 and for all I E C.

Notice that for a given 6, the same 6 can be
chosen for every I in J. Equicontinuity means
roughly that the degree of continuity is independent
both of the position in the set 0 and of the functions

in the space C.

Theorem 2.10 (Ascoli): Suppose Q is a bounded

 

asubset of Rn, C(O) is the sup-normed Banach space

41

of all continuous real-valued functions on 0, and
C C C(O) is pointwise bounded and equicontinuous.

Then C is totally bounded in C(Q).

Proof: Refer to Rudin [1973].

2.3* Monotonic Incentives

 

Definition: 1(0) is monotonic increasing if

I(w1) g I(w2) whenever wl < wz for all w1,w2 6 O.

In this section, the class of incentive contracts
which are monotonic increasing as defined above is
considered. Recall that the principal's objective is
to find an incentive function such that his expected
utility is maximized. As the principal obtains the
residual of the outcome after he pays the agent and
his utility is increasing with wealth, given any wealth
position from the outcome, he would choose the incen—
tive function that would pay the manager the least
amount. That is, in the principal's viewpoint, the
optimal contract can be defined as I; = inf[Il,...,In]
for each w 6 0. In E C, and n denotes the possible
contracts given w. It is clear that I; is a
decreasing sequence. As the prOblem is formulated,

the agent must maintain a minimum security level in

terms of expected utility to remain in the company.

42

Thus, the incentive function should be bounded below
at least pointwise. More explicitly, there exists a

*
Io such that inf[In} 2 IC for every w 6 0.

Theorem 2.11 (Egoroff): Let the measure of a set X

be OIX) < m. If [fn] is a sequence of measurable
functions which converges pointwise at every point of x,

and if e > 0, there is a measurable set E C X, with

u(x-—E) < e, such that {fn} converges uniformly on E.
Proof: Refer to Rudin [1973].

Theorem 2.12: Suppose I C C, where C is the

 

class of monotonic increasing functions, each I is

pointwise bounded below for each x E 0, and Q is a
n * .

subset of R . Let In = inf [11,...,In} where

*
[In] C C. Then In 4 1* and the limit I* E C.

Proof: Let ml < wz and w1,w2 E 0. Con51der

a subsequence {1.1 k} of {Ii}, where l g i g n

and k=1,2,°". L81”.
* - )
and
1*
Then
*( < . *
In wl) - I1,j(wl) g I1 m(ml) = Il'm(w2) _ 1n(m2),

43

*
Thus, In is a monotonic increasing function of w
*
for each n. Clearly, 11(w) 2...2 Io(w) for every
*
w 6 0. Therefore, In(w) 4 I*(w) pointwise for each

*
w 6 0. By Theorem 2.11, In 4 I*. What remains to

be shown is that I* E C. Let s > 0 be arbitrary.

Let and be chosen as above. There exists

"’1 u”2

an n such that
* *
In(w2) < I (wz) + 6.
Then
it * * *
I (011) g In(wl) g In(w2) < I ((112) + s.

This implies that I*(w1) < I*(w2) + 5. Since 6 is

arbitrary, thus I* E C. QED

Theorem 2.12 provides a scheme to pick an optimal
contract from a class of nondecreasing functions and it
also establishes the existence of such a contract. If
the principal were to enforce I*, we still have to
ensure that the agent will be induced to choose the
correct action.

By Theorem 2.6, the upper semicontinuity of $2
on A and the compactness of the set A have
guaranteed the existence of an optimal strategy for
the agent given any incentive contract. Possibly
there can be more than one Optimal strategies for an

incentive contract. Under such circumstances, it is

 

 

CI.

44

assumed that the agent will choose among the possible
optimal strategies one which would maximize the princi-
pal's expected utility. Let a* denote such strategy.
Thus, for [1;], there exists a corresponding

sequence of optimal strategies [a;] by the agent.

'k
Clearly [an] C A.

Theorem 2.13: Suppose I C C where C is the

 

class of monotonic increasing functions, each I is
pointwise bounded below for each m E 0 and Q is a
n * . *
subset of R . Let 1 = 1nf{1 ,...,I 1 and 1 4 I
n 1 n n

on 0. Suppose the set A is bounded below. Then

there exists an a* such that m2(1*,a*) is maximized.

* '\ o a
Proof: Since {an} C A which is a closed and
bounded subset of Rn, there exists a subsequence
* 'k
{a } such that an 4 a* where a* E A. We claim
m m

that a* is the optimal action choice. Let c > O

*
be arbitrary. By the convergence of [In] on 0.
1* *) * * /2
@2( nlan < ®2(I Ian) + C o
By Theorem 2.5 m2 is semicontinuous on A.

* *
* ' *
m2(I ,an) s 11m sup m2(1 ,an)
n49

g C02 (I*Ia*) 0

Suppose there exists an a0 6 A such that

* s
C02(I*Ia) < 02(I*:ao).

45
*
Then, by the convergence of [In]

* *
w2(In'ao) > m2(I ’ao) — e/2

> m2(I*.a*) - e/2
* *

> o2(1 ,an) - e/2
* *

Since 5 is arbitrary, we have
* 'k *
mZIIn.ao) > 62(In.an).
But this is impossible, because a;, by construction,g
is the action which maximizes m2 given 1;. Therefore,

a* is the action which maximizes m2 given the

Optimal contract 1*. QED

CHAPTER III

MULTIPERIOD MODELS - PRELIMINARIES

3.1: Introduction

 

There are two major considerations in the
construction of the multiperiod model: mathematical
rigor and economic as well as behavior implications.
Obviously, there is never a sharp distinction between

the two, for they are not mutually exclusive. The
thrust of discussions in this and subsequent chapters
will be devoted to the develOpment of a rigorous

analytic model for the prOblem. No specific re-
strictions are imposed on the incentive contracts or
its parameters. The main interest of the analysis is
to investigate under what conditions would the princi-
pal be able to negotiate an Optimal contract with the
agent such that the expected utilities of the outcomes
for both parties would be maximized over the long run.
Most of the conditions which guarantee the existence
of long run Optimal contracts are imposed either on

the net wealth return to the principal per period, the

46

47

total expected discounted return to the principal,

or the probability distributions of the outcomes.
Conditions on the return functions, both per period
and in total, are descriptions of the economic
enrironments of the entity. For example, one of the
conditions required in the infinite horizon model
(Chapter V), is that the return to the principal per
period is bounded and the discount factor has to be
less than one. These are reasonable descriptions of
common characteristics of an economic enterprise.
Conditions on probability distributions are mainly
behaviorial in nature. The type of "Bayesian" update
as prOposed in the imperfect state information model
(Chapter VII) is similar to that imposed in a typical
behaviorial research.

Multiperiod models have not been extensively dealt
with in the agency literature. Authors of recent
works attempted to extend their models to their multi-
period counterpart at the end of their respective works,
for example, Baiman and Demski [1980]. Lambert [1981]
views the multiperiod model in terms of a two-period
agency and prOposed a delayed payment scheme for the
agent. Radner [1981] investigated the behavior of
contracts in a repeated agency setting and concluded

that if the process was repeated for long enough period

48

of time, one can approach the first-best solution (a
Pareto Optimal contract) within some predetermined
error bounds. His analysis imply an infinite horizon
model.

This and subsequent chapters will attempt to
formulate the multiperiod agency problem and obtain
sets of conditions which guarantee the existence of
Optimal or nearly optimal contracts. The problem is
first formulated in a typical decision choice model
setting in terms of the most basic and rudimentary
generic elements of the decision processes of the
principal and the agent. Virtually no assumptions are
imposed on the functions other than those which are

required on the action choice set and the utility
function of the agent to guarantee the compliance of

the Nash incentive compatibility requirement. Any
analysis cannot be carried too far under such a general
formulation of the model. Dynamic programming algorithm
is a procedure for searching solutions in dynamic problems
and is well documented in both the Operation research and
stochastic Optimal control literature. By reformula-

ting the multiperiod agency problem under the dynamic
programming hypothesis, 1 am able to adapt the recent
findings of the above mentioned two bulks of researches

to investigate the existence of long run Optimal contracts.

49

The adaptation of the dynamic programming algorithm
does not cause the model to lose any of its intended

generality. The algorithm provides a structural and
systematic procedure to search for an Optimal or
nearly optimal contract. It is through this systematic
procedure that one can find the conditions under which
an Optimal contract can be found. This research is
interested in the procedure of the search and the
adaptation of such a procedure to search for a solu—
tion. It does not intend to go on to investigate the
actual numerical and computational aspects of the
algorithm.

The multiperiod agency problem is developed as a
sequential decision process on a discrete time basis.
Contracts are negotiated at the end of each period

and there are no restrictions on the forms of the

contract. There is no requirement that the contracts
negotiated are to have the same form throughout the

entire time span. The agent is free to leave the com-
pany any time within the horizon the model considers.
When the agent leaves the company and a new one is
hired, then the model will revert back to time 0 and
the Optimization process will restart all over again.
'The principal will attempt to maximize his total
«Expected return over the time span and seek the corres-

EKNﬁding Optimal contracts to achieve his goal.

50

In the first part of the analysis, a most basic
multiperiod agency model will be considered. Both

the principal and the agent observe the payoff at
the end of each period before the contract is negotiated.

Payoffs in any particular period depend stochastically
on the payoffs in the previous period, the incentive
contract of the current period and the total accumula-
ted wealth of the company. Such a transition function
is assumed to be well-defined stochastically. Should
there be any disturbance arising from the above
description of the payoff transition, there always
exists a probability distribution on the disturbance
given the payoff and the incentive contract. In other
words, the entire sequence of payoffs over the planning
horizon can be stochastically defined at time 0 once
an initial payoff we is specified. The maximization
process is then reduced to finding a sequence of con—
tracts corresponding to the payoffs over all possible
initial payoffs.

Considerations of the basic model will be given

on finite horizon (Chapter IV) and infinite horizon

(Chapter V) assumptions. The finite horizon model

<:onsidered will be a long run model. Although a two-
;neriod agency is not excluded in the analysis, the focus
113 that of a much longer period, the economic life of

the entity.

51

The development of this part of the analysis
dwells heavily on writings in the Optimal contral
literature by Blackwell [1965a, 1965b, 1974, 1978],
Denardo [1967], Strauch [1966], and particularly,

Bertsekas and Shreve [1978].
The second part of the research centers on the

imperfect state information model. Payoffs are observ—
able only by the agent who reports a signal to the
principal concerning the payoffs. Since the agent can
choose any information signal he pleases, it becomes
difficult if not impossible for the principal to

assess the likelihood of the actual payoffs conditioned
on the signal received. He can no longer project the
sequence of expected payoffs such that he can Optimize
his expected utility as in the first part of the
analysis. In the imperfect state model, necessary steps
must be taken to guarantee the existence of a probability
distribution on the discrepancy between the actual pay-
off and the signal reported (Chapters V1 and VII). This
is important because the principal is maximizing his

total expected discounted wealth and there is no way
that he can assess his expected wealth without some de-

gree of control on the distribution of the disturbance

uﬂaich is the discrepancy in the current context.

52

Works in the literature concerning the imperfect
state information models are sparse. The more notable
ones are Astrom [1965], Juskevic [1976], Sawaragi and
Yoshikawa [1970], and Striebel [1975]. The proposed
model will draw materials from all these papers and
monographs freely, in particular that by Striebel whose

work is also documented by Bertsekas and Shreve [1978].

3.2: The Basic Multiperiod Model

It was proved in the single period model that
under very mild conditions on the agent's action choice
set and his utility function, the Nash constraint of
incentive compatibility is satisfied for all incentive
contracts. These conditions can be trivially carried
over to the multiperiod model. The condition on the
action choice set is that it must be compact. In a
multiperiod setting, the agent is choosing his sequence
of optimal actions from the possible action set which
is in fact the Cartesian or cross product of the sets
AO'A1"°"AN-l' Each of the Ai' i = O,l,...,N-l, is
compact by assumption. The Cartesian product of
compact sets is also a compact set. The multiperiod

model is formulated in such a way that both the
;principal and the agent is maximizing their expected

(iiscounted utilities on the outcomes. The agent will

(iiscount all his expected remuneration over the time

53

periods to time 0 and apply his utility preferences
on the discounted compensation. The only utility
function in question is the one at time 0 which
is assumed to be upper semicontinuous with respect to

the action choice set. With these two assumptions on
the action choice set and the agent's utility function,

the Weierstrass Maximum Theorem will guarantee that

the agent will be able to select a sequence of optimal

actions such that his expected utility is maximized.
In this and subsequent chapters, it is assumed

that all Ai's are closed subsets of Rn, and hence
compact (Theorems 2.1 and 2.2(ii)), and the agent's

utility function is upper semicontinuous with respect
to the actions (Assumption 8.5). Thus all contracts
under consideration are incetive compatible. It is
also assumed that if there are more than one Optimal
action choice for a particular contract, that is, the
solution to the agent's problem does not have a unique
solution, the agent will choose among the possible
"Optimal" solutions one which maximizes the principal's
expected utility.

Now, given any contract, the agent is guaranteed

that he can select an Optimal action a* corresponding
tc>the contract. He will execute a* with payoff
<n(a*,s). The payoff will effectively capture all the

Inandomness of the state of nature. This will allow the

54

following formulation of the multiperiod model to
suppress the state parameter and treat the payoff as
a random variable while assuming the existence of a
probability distrubition on all possible payoff wealth
positions.

The following definitions and conventions will

be adopted:

(1) O and C are two given nonempty spaces
referred to as the payoff and incentive spaces respec-
tively.

(2) For each m E 0, there is given a nonempty
subset U(w) of C referred to as the incentive
constraint set at w. The incentive constraint set
will exclude all contracts which give rise to an

expected utility to the agent which is less than the

outside labor market Opportunity.

(3) Denote by M the set of all functions
I: 0‘4 C such that 1(w) E U(w) for every w E 0.
Denote by U the set of all sequences F = (10.11,...)

such that I E M for every k. Elements of H are

k
referred to as contracts. Elements of H of the
form n = (1,1,...) where I E M are referred to as
stationary contracts.

Recall that the principal's utility function is

zassumed to be monotonic increasing with respect to

5'5

wealth which is its only argument. As his total

expected return is maximized, his expected utility
on the return will also be maximized simultaneously.
To simplify the analysis that follow, it is assumed
that the principal is risk neutral. All results in
subsequent chapters will hold if the principal is
risk adverse although the proofs to various theorems
have to be modified to accommodate the risk adverseness.

Define the function

N-l k
JN,TF(wO) = E{k§0 CL 9(wkolklyk)}

and

J (m ) = lim J (m )
TT 0 N409 N,TT 0

where a discount factor, a positive real number

g the principal's net return function,
g: QCY 4 R*

y a disturbance term. For each fixed
(m,I) 6 0C, a probability p(y]w,1) on
the disturbance is given and E[-]w,1]
denotes the expectation Operator with re-
spect to that probability.

R is used to denote the real line and R* to denote
the extended real line, that is, R* = R U [-m,m].

The Cartesian product of sets X1,X2,...,Xn is denoted

by xx ...Xn

1 2

56

J, as defined, is then the expected total net
return to the principal for N periods. It is
sometimes called the N-stage net return. Depending
on the values of a, that is, whether a is greater
or less than one, J represents the expected total

future or discounted wealth to the principal res-

pectively. The objective of the model is then to find

an incentive contract which will maximize JN.
TO complete the model, the manner in which the

payoff changes from period to period has to be

specified. The following transition function is

hypothesized.

wk+l = f(wI<’Ik'Jk'yk)’
The payoff in period k then depends on the payoff in
the previous period, the incentive contract which
directs the agent's action in the period and the
accumulated wealth of the company with a disturbance
term.
The basic multiperiod problem can now be formulated

as follows. For each we 6 Q,

Maximize J (m )
F611

Subject to wk+1 = f(wk,1k,Jk,yk).

57

3.3: Optimality Concepts

 

*
For a fixed w E 0. let JN(w) and J*(w)
denote the Optimal net return function to the princi-

pal in the finite and infinite horizon models

respectively. Notationally,

*
JN(w) = sup JN,v(w) for every w E 0
W611
J*(w) = sup Jv(w) for every w E O.

FGU

Regarding the Optimal contracts, v* denotes the

optimal contract in a sense that such a contract will

*

give rise to JN or J*. However, v* needs not

always exist. A weaker form of Optimality is also

considered. A v” contract is one which enables the

Ca

principal to get an expected total net return of at

*
least JN-e or J*-—e, where e is a predetermined

positive number. Such a contract is called a nearly
Optimal contract. By definition, the contract w*
taken as a whole sequence of incentive function

*
{11,12,...] will give JN or J* as the total ex-

pected net return to the principal for the N periods.

A stronger form of optimality implies that at each

stage, k g N-—l, 1 can guarantee that the principal

k

*
will Obtain wealth position Jk' Such a contract is

called uniformly N-stage Optimal contract.

58

Sometimes, it may become necessary to approximate
J or J* by considering a sequence of contracts
[an]. In doing so, the rate of convergence to Opti—

mality or near Optimality is always in important issue.

Given a sequence of positive real numbers [en] con-

verges to zero as n tends to infinity, the sequence

of contracts [en] is said to exhibit [en1—dominated

*

convergence to optimality if J N

N,” converges to J

* I
and JN,Wn 2 JN(w)-en, for n = 1,2,.... Thus, 1f

{en} converges to zero quickly, so does JN.Fn to JN.
The existence of an {en3—dominated convergence se-
quence of contracts to an optimal contract will provide
an iterative scheme such that an Optimal or nearly

Optimal contract results.

Up to this point, the multiperiod basic agency
model has been structured primarily on the usual
decision-making economic model in its most general
terms. Unfortunately, the solution to such a model is
extremely difficult to arrive at or even to guarantee

its existence. This is again, like the single period

Inodel, due to the fact that the decision variable for
the principal, namely the incentive contract, is a
function which belongs to an infinite dimension space.

[hiless one is willing to impose a number of restrictions

59

on the behavior of the various functions, it appears
impossible to solve the problem directly by using

any known analytical skills. Because of the unique
nature of the problem, a sequential decision process
on a discrete time basis, it can be reformulated under
a dynamic programming algorithm hypothesis. This

will describe a procedure which, under appropriate

conditions, guarantees an Optimal or nearly optimal

contract can be derived. with the various numerical
methods and computational algorithms developed by the
engineering discipline, more results can be generated

on the optimal contract.

3.4: Dynamic Programming Algorithm

In the Optimal control literature, multiperiod
models or dynamics are analyzed under two separate
branches of studies. The first one is that the
functions under consideration changes continuously
with respect to time, a continuous time model. A
natural solution procedure for such a model is to
formulate it as a system of differential equations.
The second branch is that the system changes at dis-
crete time intervals. A control function (the incentive
function for agency model) is selected at each of these

intervals to maximize or minimize the objective function.

Lbnaamic programming algorithm is the common procedure

60

used to search for solutions of this type of problem.
In all appearance, the multiperiod agency model as it
is applied to economic entities naturally belongs to
the latter group of discrete time models. The

following will give a discussion of the dynamic pro-

gramming algorithm as a pausible solution procedure
to the prOblem.

A dynamic programming algorithm is a systematic
sequential search for a solution to the problem of
either maximizing or minimizing an objective function

in a dynamic setting. The search can go forward or

backward in time. There are three operations performed
repetitively. These Operations will be discussed in
greater detail in Chapter VI under the Borel models.
The operations are as follows. First, there is the
evaluation of a conditional expectation. Second, the
objective function in two variables (payoff and incentive)
is optimized over one of these variables (incentive).
Finally, if an Optimal or nearly Optimal contract is to
be constructed, a "selector" which maps each state to
a contract which achieves or nearly achieves the maximum
in the second step must be chosen.

There are two stages in the analysis of a problem
under the dynamic programming algorithm. The first
stage involved in the investigation of whether the

three Operations as outlined above are feasible under

61

the assumptions of the problem. By feasibility, it
would mean whether or not one can execute these
operations. This is more of an existence phrase of
the procedure than the implementation steps. However,
there is a little more to be said than just mere
existence. At this stage, one is concerned with not
only the feasibility of the algorithm but also whether

or not there are iterative schemes to perform the

Operations. The second stage is the actual finding of
the solution procedure under the dynamic programming
algorithm hypothesis to arrive at a solution. This
includes the actual programming algorithm, the

various numerical aspects of the computation and

maybe the coding of the algorithm in terms of computer

software.

In Chapter 1, it was mentioned that the objective
of this research is to build a rigorous analytic model
in its most general terms to study the conditions
such that the principal can design an optimal contract.
Consistent with that objective, analysis in the sub-

sequent chapters on the multiperiod model are focused

on the first stage of the dynamic programming algorithm.

Given the complexity of the prOblem, it should not be

misled to believe that the second stage of actual

implementation is easy. The techniques used will

62

depend on the various behaviors of the functions of
the model. If the functions behaves in a "nice" way,

an analytic solution may be feasible. If they behave
in some pattern, there are standard numerical methods
which exist in the literature to solve the problem or
to approximate the solution, for example, Marchuk [1982],
Glowinski, Lions and Tremilieres [1981] and Dahlquist
and Bjork [1974]. It is not the intention of this
research to investigate and develOp the numerical
aspects of the solution to the problem.

Dynamic programming algorithm will be adopted to

search for long run optimal or nearly Optimal contracts

in a multiperiod agency model. The model as prOposed

in the previous section will be reformulated under the
dynamic programming algorithm hypothesis. Analysis

will be conducted to investigate (i) whether the ad hoc

assumptions imposed on the different models are met by

'k
the algorithm, (ii) whether a JN or J* can be

achieved under the assumptions, (iii) if a J; or J*
exists, then whether or not there are Optimal or nearly
Optimal contracts that would give rise to the J; or
J*, and (iv) at each stage of the iteration, are
there any numerical search procedures that can product
the optimum.

To construct a dynamic programming model for the

nuiltiperiod agency problem, first consider the mapping

Ii defined by

63
H(w,I,J) = E{g(w.I.J.y) +aJ(w.I.J.y))lI.w1

with the time subscript on the variables suppressed and
the expectation Operation is taken with respect to the

probability distribution of y given I and w.
Hence Hk represents the total expected discounted

net return to the principal from period k through N
given an incentive contract and a payoff outcome. The

dynamic programming algorithm will evaluate H

k
sequentially for k = 0,1,...,N-l and search for
*
an IR that will maximize the corresponding Hk'
Effectively, the algorithm is maximizing H recur-

k

sively N times with respect to 1k given w and an
initial value of J,JO. In order to describe the
Operation for all N periods together, define for each

I E M and every w E Q the mappings TI and T by

T: (J) (m) = H(w.I,J)

T(J)(w) = sup H(w.I.J)
I€U(w)

Same interpretation is given to TI as H, whereas
T -represents the solution of the dynamic programming

algorithm per period: the total discounted expected

*
return for Ik'

Let Tk, k = 1,2,... be the composition of T
‘Mith itself k times and TO(J) = J for every possible

.1. JN W(w) and Jn(w), the total eXpected net return

64

function can now be defined in the dynamic programming

context as follows.

JN’F(w) = (T T ...T )(JO)(w)

.IW(w) = lim (T 1? ...T )(JO)(un.

N4w Io I1 N-l

The solution under the dynamic programming algorithm

can then be expressed as follows.
_ N
JN(w) - T (Jo)(w)

J(w) = lim TN(JO)(w)
N-Ocn

for every w E Q for the finite and infinite horizon
models respectively.

The multiperiod model has been formulated using
two different approaches. A model based on decision
making process is first constructed. The same model
is given in a dynamic programming structure. The natural
question that arises is whether the two approaches are
equivalent. In particular, one would be interested to
know whether J; = JN = TN(JO). In the following
chapters, assumptions are hypothesized under each of
the finite and infinite horizon models. It will be

shown that the function H satisfies the assumptions,

*
and, under these assumptions, JN = TN(JO). This will

establish the validity of using dynamic programming in

65

generating solutions for the model. Next, the question
of the existence of Optimal or near Optimal contracts
will be investigated. Finally, the requirements to
arrive at the Optimal or near Optimal contracts will

be considered.

CHAPTER III*

MULTIPERIOD MODELS - PRELIMINARIES

3.1* Notations and Assumptions
The mathematical notations used in this work
are fairly standard. R is used to denote the real
line and R* to denote the extended real line, i.e.,
R* = R U [-m,m]. It is assumed throughout that R
is equipped with the usual tOpology generated by the
Open intervals (a,b), where a,b E R, and with
the Borel o-algebra generated by this topology.
Similarly R* is equipped with the tOpology generated
by the open intervals (a,b), a,b 6 R, together
with the sets (c,m], [-m,c), c E R, and with the
o-algebra generated by this tOpology. The Cartesian
product of sets X1,X2,...,xn is denoted by X1X2...Xn.
It was proved in the single period model that
the Nash constraint is satisfied for all incentive
contract if the action Space A is of finite dimension
and the utility function of the agent U2 is upper
semicontinuous with respect to his action choice a.

In this and subsequent parts of the analysis, it is

66

67

assumed that the contracts under consideration are
incentive compatible. Therefore, given any contract,
the payoff function is well-defined with respect to
the state space. This enables the following analysis
to suppress the state parameter and treat the payoff
as a random variable and assume the existence of a
probability measure on all possible payoff wealth
positions.

The following definitions and conventions will
be adOpted:

(l) O and C are two given sets referred to
as the payoff space and incentive space respectively.

(2) For each w 6 0, there is given a nonempty
subset U(w) of C referred to as the incentive
constraint set at w.

(3) Denote by M the set of all functions I: O 4 C

such that 1(w) 6 U(w) for every w 6 Q. Denote by

H the set of all sequences v = (10.11....) such that
Ik 6 M for every k. Elements of H are referred
to as contracts. Elements of U of the form v = (1,1,...

where 1 6 M are referred to as stationary contracts.
(4) Denote by
F the set of all extended real-valued functions
J: O 4 R*. (The exact form of J will be

defined in the next section.)

68

B the Banach space of all bounded real-valued
functions J: 0 4 R with the sup norm,

i.e., “J” = sup ]J(w)] for every J 6 B.
w60

(5) For all J,J’ write

J Jl if J(w) = (

JI w) for every w 6 0
if J(w) g J'(w) for every w E Q.

I

J J

"A

For all J 6 F and e 6 R, denote J+—e the function
taking the value J(m)+-e at each m 6 Q, i.e.,
(J4-c)(w) = J(w)+-c for every w 6 Q.

(6) The usual arithmetic for R* is adopted except

m-oo=—co+co=ooand O‘m=O.

Since the principal's utility function is assumed
to be monotonic increasing with respect to wealth which
is its only argument, it is assumed for the rest of
this analysis that the principal is risk neutral. Such
an assumption will not cause any loss in generality
because as the principal's total expected return is

maximized, so is his expected utility on the return.

3.2* The Basic Multiperiod Model

 

The expected utility of the principal or the agent
is defined as the expectation taken as an integral of
their respective utility functions with respect to the
probability measure on the state space. Such an

integral may not always exist as a Reimann integral.

69

An outer integral is adopted as the expectation

Operator.

Definition: If f 2 O, the outer integral of

f with respect to a probability measure p is

defined by

* , .
I fdp = inf {I gdp: f g g, g 18 Borel-measurable].

If f is arbitrary, define
* * *—
f fdp = f f+dp — f f dp

where f+ = max[0,f], f- = max[O,-f].

Define the total net return to the principal

for N periods as
N-l

JN,W(wO) = E*[ Z} dkg(mk,1k.yk)1
k=0
where a discount factor, a positive real number

9 the principal's net return function g: QCY 4 R*
y a disturbance which takes values in a

measurable space (Y,?). For each fixed

(w,1) 6 0C, a probability measure p(dy]m,1)

on (Y,W) is given and E*[-]w,1] denotes

the expectation in terms of the outer integral

with respect to that measure.

To complete the model, the following transition

function is hypothesized for the payoff function.

wk+l = f(uk’1k'Jk’yk)'

1

70

The basic multiperiod problem can now be

formulated as follows. For each wo 6 O,

Max1mlze JN,v(wo)
FER

Subject to wk+1 = f(mk,1k,Jk,yk).

3.3* Optimality_Concepts

 

*
For a fixed w 6 0, let JN(w) denote the optimal
net return functions to the principal in the finite

and infinite horizon models respectively, that is,

*
JN(w) = sup JN’V(w) for every w 6 Q
V€H
J*(w) = sup J (m) for every w 6 Q.

WEN

Regarding the contract, the following optimality

concepts are adOpted.

Definition: A contract F* 6 H is said to be
0 o *
N-stage opt1mal at w 6 0 1f JN'V*(w) — JN(w) and
optimal at w 6 0 if JW*(1) = J*(w).

A contract W* 6 H is said to be N-stage optimal

*
. = . . = *.
1f JN,F* JN and Opt1mal 1f Jv* J
. . . * * .
Def1n1t1on: A contract v* = (10,11,...) 15

Called uniformly N-stage optimal if the contracts

*

( *
Ii'Ii+l' o o

.) is (N-i)-stage optimal for all

= 0,1,...,N—1.

71

Definition: Given 6 > O, a contract We 6 U
is N-stage e—0ptimal if

* k 'f *
JN(U)) "' c. l JN(W) < °°

J (m) 2
N'Fc — l/e if J;(w) = w

The contract We 6 H is said to be e-optimal if

J*(m) -e if J*(w) < on
J (w) 2

”e l/e if J*(w) = on
Definition: If [on] is a sequence of positive
numbers with en 4 O, we say a sequence of contracts
[We] is said to exhibit [en1-dominated convergence

to optimality if

11m JN,T = JN
n4m n
and for n = 2,3,
*(. f J*
JN w)-—c 1 N(w) < w
JN,Vn(w) 2 J *
N.Fn_1(w) - 6 1f JN(w) = 0°

3.4* The Dynamic Programming Algorithm

The discrete time sequential decision process
provides a natural framework for adopting the dynamic

programming algorithm in generating an Optimal contract
or nearly optimal contracts. This section gives a
definition of JN "(w) and JF(w) in a dynamic

programming framework.

72

Consider the mapping H: OCF 4 R* defined by

H(UJ'I0J) = E*{g(w.I.J.y)+aJ(f(w.I.J.y))]I.w1-

Hk represents the total expected discounted net

return to the principal from period k through N-l

given an incentive contract and a payoff outcome. To
describe the Operations for all N periods together,

define for each 1 6 M the mappings TI: F 4 F by
TI(J)(w) = H(w,I,J) for every w 6 O

and T: F 4 F by

T(J)(w) = sup H(w,1,J) for every w 6 Q.
16U(w)
Let Tk, k = 1,2,... be composition of T with
itself k times and TO(J) = J for every J 6 F.
JN,w(w) and Jv(w) can now be defined in the dynamic

programming context.

J (w) = (T T . . .T ) (J )(w)
N,v IO 11 IN_l o
J (m) = lim (T T ...T )(J )(1)
F N4m Io Il IN-l o

The solution can be expressed as

JN(w) TN(JO)(w)

J(w)

11m TN<J )(w)
N—Dm 0

for every w 6 0 for the finite and infinite horizon

models respectively.

CHAPTER IV

FINITE HORIZON MODEL

4.1: Introduction and Assumptions

 

The finite horizon model describes the long
term planning behavior of an entity. It assumes the
fact that the entity will dissolve at the end of N
periods. The model seeks Optimal contracts such that

the total expected discounted net return to the
owner(s) over the N periods is maximized. Sole

prOprietorship and partnership are the two prime
examples of entities which dissolve in a finite period
of time.

In considering the N-stage Optimization problem,
as indicated towards the end of the last chapter, the

*
central question is whether J = TN(JO) in which

N
'k
the optimal JN is obtained by successively computing
T(JO),T2(JO),... via the dynamic programming algorithm.

Another issue is the existence of Optimal or nearly
Optimal contracts. In order to respond to the above

questions affirmatively, some assumptions have to be

73

74

imposed on the function H which describes the
transition of the total net return to the principal
from one period to another.

The first assumption concerns the behavior of H
in response to small changes in the value of J. For
every payoff wealth positions and every possible
contracts within the constraint set, given a small
change in J, the corresponding change in H is also
small, in fact within some predetermined bounds. This
is a weak statement to say that H is continuous with
respect to J.

The second assumption imposed a kind of linear
behavior on H from below with respect to J. It
says that if J is reduced by some positive value,

say r, then the value of H is reduced by no more

than r times the discount factor a. This assump—
tion restricts the behavior of H when a large change
in J occurs.

The third assumption is admittedly somewhat
more complicated. It allows one to get stronger
results on the existence of nearly Optimal contracts
than can be obtained under assumption two. It says
that if one were to approximate the entity's total net
return function J by a sequence of return functions,
[Jn] then each Jn can in turn be approximated by a

sequence of contracts within some predetermined bounds.

75

The first two assumptions seem reasonable in most

economic environments in which business entities
Operate. There should be no drastic changes in the
total net return (or total wealth) of a company in
any period given any changes in the total wealth of
the previous period. The third assumption simply
implies that there always exists some computational
procedures to approximate the return function to a
desired accuracy. Although the current work does not
include the analysis of the actual numerical methods
to be used to solve the problem, I will attempt at
least to guarantee that some computation procedures
exist and they are implementable under the various

assumptions studied.

4.2: The Finite Horizon Model

 

The first step in the analysis of the finite
horizon model is to establish the validity of the

assumptions in the model, that is, whether the function
H as defined in the dynamic programming formulation

of the problem satisfies the assumptions. If one

were to examine the assumptions one and two carefully,
one would notice that assumption one is really a
special case of assumption two. Since the real number
in assumption two can be any positive number, it can

be made arbitrary small which is exactly what assumption

76

one requires. All that is needed to be shown is that
H satisfies assumptions two and three (Theorem 4.1)
and extend the result to conclude that assumption one

is also satisfied.
*
The fact that JN = TN(JO) can be shown to be

true if Jk F(w) > —c under assumptions one or three.
*
This is also the case if JN < a under assumption two

(Theorems 4.2 and 4.3). J being negatively

k”(w)
finite is trivially satisfied in any economic situations.
No company can survive infinite losses for any one

period, not to mention that Jk is the total net re—

turn for k periods. The same situation applies for

* *
JN being finite. It is unrealistic to have JN in-

finite in any common business environment. With the
above facts established, one can say that dynamic
programming algorithm is apprOpriate as a solution pro—
cedure for the finite horizon models.

Next, the question of whether an Optimal or nearly
optimal contract would exist under the assumptions is

investigated. Under assumption three, it is shown
that there always exists an [en1—dominated convergence
to optimality sequence of contracts (Theorem 4.3).

Recall that assumption three says that if J is to be

approximated by a sequence of functions [Jn] then

77

each Jn can in turn be approximated by a sequence of
contracts within some predetermined bounds. Theorem
4.3 guarantees that such sequence of contracts is an
[en1—dominated convergence to optimality contracts.
This implies that assumption three not only guarantees
the existence of a nearly optimal contract, it ensures
that one can find or adOpt an iteration procedure to

approximate the nearly Optimal contract.

In the develOpment of the model, there are no
restrictions placed on the individual functions and
their parameters with the exception of U2, the
agent's utility function being continuous from above
with respect to the action choice to guarantee the
compliance of the Nash constraint and H under the
above three assumptions. The model is drawn in its
maximum generality. A trade-off the researcher has
always had to make under such general setting is the
inability to make specific characterization of the
Optimal or nearly Optimal contracts. Such comments can
only be made if more information about the behavior
of individual functions are known, that is, more
assumptions and restrictions on the functions are needed.
The only characterization of the Optimal contracts,

under the current general formulation, that can be
made is that how the contracts are related to each

stage in the iteration procedures of the dynamic

78

programming algorithm or how they behave each period
in the process of arriving at the Optimal. This may
sound disturbing to the application-oriented reader.
This research is carried on in the interplay of the
economic model and the dynamic programming model. The
correspondence in the two models, once established,
will allow the dynamic programming algorithm, which

is documented in both the Operation research and
Optimal control literatures, to take care of the
various numerical and computational aspects of the

actual search for solution to the model.

An uniformly N-stage Optimal contract is always
the more desirable contract because it maximizes the
total net return at any given point of time over the
planning horizon rather than just at the end of N
periods. Such a contract can be shown to exist if at
each period and for each payoff outcome, there is a

contract that would maximize Hk' that is the supremum

is attained in the relation Tk+1(Jo)(w) = sup H[w,I,T

16U(w)
And if a uniformly N-stage optimal contract exist,

k
(JO) ].

*
JN = TN(JO) (Theorem 4.4 and its corollaries).
In the dynamic programming algorithm, each Tk(Jo)
is computed sequentially over all possible contracts.

Under the structure of the model prOposed, this is

equivalent to saying that a sequential decision process

79

is adopted and the total wealth return is maximized
at each stage of the process. This entails the
question of whether an Optimal can be achieved each
period. If the incentive constraint set is compact,
which is a typical requirement for optimization
problems, an Optimal contract can always be achieved
(Theorem 4.5). Together with the results of Theorem
4.4, it can be concluded that under the apprOpriate
assumptions of the finite horizon model, an uniformly
N-stage Optimal contract is guaranteed to exist under

the dynamic programming algorithm.

CHAPTER IV*

FINITE HORIZON MODELS

4.1* Introduction and Assumptions

 

In considering the N-stage optimization problem,

*
the central question is whether J = sup J = TN(J )
N N,F 0
W611
*
such that the Optimal JN can be obtained by
successively computing T(JO),T2(JO),... via the

dynamic programming algorithm. Another issue is the
existence of Optimal or nearly Optimal contracts. In
order to respond to the above questions affirmatively,

some assumptions have to be imposed on the function H.

Assumption F.l: If {Jk} C F is a sequence
satisfying Jk+1 2 Jk for all k and H(w.I,J1) > -6

for all m 6 Q, I 6 U(w), then

lim H(w.I,Jk) = H(w,1,lim J
k-Ocn k-Ooo

k) (1160. I 6U(=.u).

Assumption F.2: There exists a scalar a 6 (0,6)
such that for all scalars r 6 (O,m) and functions

J 6 F, we have

H(wIIIJ) 2 H(wIIIJ-r) 2 H(WoIIJ) -ar UJ £17: I 6 U(LU)0

8O

81

Assumption F.3: There is a scalar B 6 (O,w)
such that if J 6 F, {Jn} c F, and [an] c R
satisfy
Zeno» en>0n=l,2,
n=l
J = lim Jn J 2 Jn n = 1,2,
nam
J(W)-€n n=1I2I W60 J(LU)<°°
Jn(w) 2
Jn_1(u1)--en n = 1,2, m 6 J(w) = w
H(w,1,J1) > -w for every w 6 O. I 6 U(w)
then there exists a sequence [In] CIM such that
lim T (J) = T(J)
n46 In n
and
T(J)(u1) -Ben = 1.2, w E Q,T(J) (1) <..
TI (Jn)(w) 2
n TI (Jn_1)(u)-Ben = 1.2. w 6 O.T(J)(w)==w

n-l

Actually, Assumption F.l is a special case of F.2.

Careful examination of the two assumptions will show

that if Assumption F.2 is met, F.l will be satisfied.

Assumption F.3 is somewhat more complicated.

It allows

one to get stronger results on the existence of nearly

Optimal contracts than can be obtained under F.2

(Theorem 4.3).

82

4.2* The Finite Horizon Model

 

The first step in the develOpment of the model
is to show that the function H as defined in the
previous chapter satisfies the three assumptions on
the finite horizon model. The proof of such a
statement requires four technical lemmas on outer
integrals and probability measures. These results
are very standard in mathematical analysis and
probability theory. For completeness, they are stated

without proof as follows.

Lemma 4.1: If e > O and f g g g f4—c, then

 

* 4* *
[fdp g] gdp g f fdp+2e.

Lemma 4.2: Let (X,B,p) be a probability space.

 

Let [en] be a sequence of positive numbers with

Z) c < m. Let {f ] be a sequence with lim f = f
n n n
n=l n4m

pointwise f 2 fn for n = 1,2,"'. Let

f(x) —cn if f(x) < co
fn(X) 2

H
8

fn-l (x) - en if f (x)

and [*fldp < w. Then lim [*fndp = f* fdp.

n-Dco

Lemma 4.3: Let (X,B,p) be a probability space.

 

If p*([x:f(x) = m}) > 0, then for every g,

‘k
g :X 4 [-m,m] and B-measurable, I (g4-f)dp = m.

83

Lemma 4.4: Let (X,B,p) be a probability space.

 

If E c.X satisfies p*(E) = 0, then for any

f:.X 4 R*
[*fdp = [*xx_Efdp.

Theorem 4.1: The mapping

 

H(w.I.J) = E*{g(w.I.J.y)+—aJ[f(w.I.J.y)]lu.I‘

satisfies Assumptions F.2 and F.3.

Proof: For r 6 (o,m)
H(WIIpJ) 2 H(WoI,J-r) o

By Lemma 4.1,

H(w.I.J) [*(g(w.I,J.y)4-OJ[f(w.I.J.y)])p(dy]w.l)

2 H(UJoIIJ"r)

2 H(w,I,J)-2dr.

Thus F.2 is satisfied. Let J 6 F, [Jn] C F, [an] C R

on

satisfy 2) e < m, e > O and for all n
=1 n n

J = lim Jn J 2 J
n-Oco —

n

J(w)-—en if J(w) < m

Jn(w) 2
Jn_1(w)-en if J(w) = w

H(w,I,J1) > -® V w 6 Q, I 6 U(w).

84

Let [In] C:M be such that for all n

T(J)(w)-en if T(J)(w) < e

T_ (J)(w) 2

In 1/en if T(J)(w) = e

T_ (J) 2 T_ (J).

In In-l

Consider the set

A(J) = {w 6 013 I 6U(w) with

p*({yIJ[f(w.I.J.y)] = wllwol) > 0]

where p* denotes p-outer measure. Let I 6 M be such
that
p*({le[f(w.f(w).J.y>] = mllw.i(w)) > o v w e A(J).
Define for all n
1(w) if w 6 A(J)
In(w) if m Z.A(J)

Claim that [In] thus defined satisfies the requirement

of F.3 with B = l4—2d. For m 6 A(J), by Lemma 4.2

lim inf T_(Jn)(w)
n4m I

lim inf f*{g[Wpi(UJ) IJIY] + aJn[f(WIi(w) IJIY)]1P(dY1woi(W))

n—Oco

[*{g[w.f(w).J.y]+—aJ[f(w.I(m).J.y)]}p(dy]w.I(W)).

Since T_(J)(w) > -m, by Lemma 4.3
I

lim inf T_ (Jn)(w) = e ; T(J)(w).

n4w I
n

85

For m 6 A(J), for all n

p*({y1J[f(wIIn(w)lJIY)] = 0°31w,In(u1)) = 0.

Let Bn 6 v {y]J[f(w.In(w),J,y)[ = e] C.Bn and
P(Bn(w.In(w)) = o v n.
By Lemmas 4.1 and 4.4

T (J )(w)
Inn

= f* xv_Bn(w)(g[w.In(w).J.y]-taJn[f(w.In(w).J.y)}p(dylw.In(w))

2
- 2de
n
= TI (J) (u) - Zaan
n

V w 6 A(J), lim inf T (Jn)(w) 2 lim inf T

11-400 In n46.)

T(J)(w)

lim inf TI (J )(w) > T(J)(w) V w 6 Q.
n4oo n n

But TI (Jn) g T(J) V n = 1,2,... by hypothesis
n

lim TI (Jn) = T(J).
n4co 11

If m is such that T(J)(w) < m, then m E A(J).

by Lemma 4.1,

T1 (Jn)(w) 2 T (J)(w) - Zoen
n n

IF ./

T(J) (w) — (1+ 261) en.

Ii: lX'V-Bn(W)1’g[mlIr1(W)IJIY]"1' aJ[f(w.In(w) IJIY) ]}p(dy1w,1n(w))

I (J)(w)
n

Thus,

86

If m is such that T(J)(w) = m, then either
(i) wEMJ) or (ii) w6A(J).

(1) Let the/A(J)

T (Jn) (w) 2 TI (J) (to) -2ae

1 n
n n
2 T1 (J) (to) - Zoen
n—l
2 TI (Jn_l)(w)-2oen.
n-l

(ii) Let h 6A(J)

* — — .,
TIn(Jn)(w) = [ {g[w,1(w),J.y]4-0Jn[f(w.1(w),J.y)]J
p(dy]w.1(w))
* _ — .
2 I {ng.I(w).J.y]+aJn_1[f<w.I(w>.J,y)]1
P(dylw.1(w)) -2den
= TI (Jn_1)(w)-2a€n
n-l
The claim is proved by letting 8 = 14-23. QED

Next, the correspondence of the solution values
between the economic model and the dynamic programming
*
model, that is, JN = TN(JO), is established under the

assumptions. Theorems 4.2 and 4.3 also show the existence

of a nearly Optimal contract under Assumptions F.2 and F.3.

Theorem 4.2: (a) Let F.l hold. Let

 

Jk U(w) > -w for all m 6 0, V 6 U, and k = 1,2,...,N.

Then

* N
JN - T (JO).

87

*
(b) Let F.2 hold. Let Jk(w) < m for all

m 6 Q and k = 1,2,...,N. Then
* N
JN — T (Jo)

and for every 6 > 0, there exists an N-stage e-

Optimal contract, i.e., a We 6 H such that

* J J*
JN 2 N,F 2 N - s.

Proof: (a) For each k = 0,1,...,N-l, consider

a sequence [1;] C:M such that

lim T .[TN'k'1(J )] — TN‘k(J ) k = 0,1, ,N-l
. 1 o 0
14m I
k
N-k-l N-k-l _
T i[T (JO)] g T 1+1[T (Jo)] k - 0.1.. .N-l
Ik Ik
1 = 0,1,
By F.1 and Jk F(w) > -m, we have
at3 \
JN 2 sup...sup T i ...T 1 (Jo)
1o lN-l I O I N-l
o N-l
= sup...sup T . ...T . sup T . (J )
. . 1 1 . 1 o
1 1 o N—2 1 N—l
o N-2 IO IN_2 N-l IN_1
= sup...sup T i ...T i [T(Jo)]
1o lN-Z I o 1 N-2
o N-2

88

*
Since J = sup J (w) V w 6 O
N NEH N'”
= sup (T T ...T )(J )
ten Io I1 IN-1 o
g TN(JO)
*
.. JN = TN(JO)
(b) The result clearly holds for N = 1. Suppose
it holds for N = k. Then J; = Tk(Jo) and for a
'k
c > O 3 We 6 H such that Jk,F€ 2 Jk-e. By F.2,
V I 6 M
J T \ *
k+l 3 1(Jk,v ) s TI(Jk) ' as

B ' d t' J* > k+1(J ) B definition
y in no ion, k+l = o . y
k+l *
T (Jo) 2 Jk+l
k+l _ *
T (J ) — Jk+1
Let E > 0 be given and let 5 = (Io,Il,...) be such
that
* _
J _ 2 Jk - e/Zd.
k,I
Let i e M be such that
'k * -
Ti(Jk) 2 T(Jk) - 6/2.
Consider the contract E_ = (I,Io,Il,...). Then
6
* ._ 'k _
J _ = T_(J _) 2 T_(Jk) - 5/2 3 T(Jk) - c
k+1,v= I k,v _ I I
* _
= J — e QED

k+1

89

If Assumption F.1 were to be replaced by the
following assumption, the results of Theorem 4.2 (a)

can be strengthened.

Assumption F.1': The function Jo satisfies
Jo(w) g H(w.I.Jo) V w 6 O, 1 6 U(w) and if
[Jk] C F is a sequence such that Jk+1 2 J 2 J

for all k, then

lim H(w.I.Jk) = H(w.I,1im J ) V w 6 C, I 6 U(w).

k—Ooo k-hao k

Corollary 4.2.1: Let F.1' hold. Then

 

*
JN = TN(JO).

Theorem 4.3: Let F.3 hold. Let Jk U(w) > -w

 

*
V w e o. n e n and k = 1,2,...,N. Then JN = TN(JO).
Furthermore, if {an} is a sequence of positive

numbers with en 4 0, then there exists a sequence
of contracts {Tn} exhibiting {en1-dominated

*
convergence to Optimality. If, in addition JN(w) < m

V w 6 0, then for every 3 > 0, there exists an

e—optimal contract.

Proof: Let K = l.

J:(w) sup J1 (h)

WED 'F

= sup H[w,I(w),JO]
I€M

= T(Jo)(w) V w 6 Q.

90

Given [en], by F.3, it is clear that there

a sequence [Tn] C H satisfying

lim J = J
n42 1,Vn l
*
J1(u)) - c
J (m) 2
1,J — -
n J1,7T (w) - on
n-l
Suppose the result is true for K = N-l.

exists

Let 8

be as specified in F.3. Consider a sequence

r .
.en} C R W1th an > O V n

co
lim e = O and Z) c
n4m n=1
A A n
r11C =
Let (Tn, H, where Vn (11,1
. A=J
11m J H N—l

n

-)

be such that

—oo

* -1 V *

JN_1(u)-B an m 6 Q JN_1(w) < m
J A ((1)) E _1 *
N-l,Trn J A (w)-B en V w 6 0 JN_l(w)

N-1,W

n-l

Since Jk ”(w) > —m V w 6 O, k = 1,2,...,N, W 6 H
then H(w,I,J A ) > —m V w 6 0, I 6 U(w). Since

N-l'Tr
n
1' J J*
1m =
A _
n4m N-l,w N 1

11

By F.3, it then implies H {1:} CIM such that V n

*
11m T (J A ) = T(JN-

* . a: ‘
T(JN_1)(.L) an if T(JN-l) (w) < co
T1n(JN 1 A )(w) 2 T (J >( ) 'f T(J* >( )
— In. A LU '6 1 (.U
o In-l N—1,w N-l
o -l
. . * N—l . . .
By induction JN-l = T (J ) and by definition of
* *
JN, TN(JO) 2 JN. Hence
* *
J(JN_1) = TN(JO) 2 JN
But
J* 2 1 '1‘ (J ) T(J* )
1m -
N n4m I N-1,F N—l
n
N
J - T(JN-l) — T (Jo)
Let n = (In,1n,1n, .). Then for all n
n o l 2
_ *
11m JN,W — JN
n4m n
‘k .. V . h *
JN(u))--en w 6 0 w1t JN(w) <
JN v (w) 2 * . *
’ n J (w)-—c V w 6 0 With J (N) =
N,W n N
n-l
Obviously, if J;(w) < w, Tn is e-optimal. QED

Theorems 4.1 through 4.3 established the validity

of adOpting dynamic programming algorithm as solution
procedures to the basic multiperiod agency problem.

The existence of nearly optimal contracts has also been
demonstrated. In fact, the dynamic programming algorithm

can provide much stronger results than those as stated.

92

Under conditions provided in Theorems 4.4 and 4.5,
the algorithm is actually attempting to arrive at a

uniformly optimal contract.

 

*
Theorem 4.4: A contract v* = (10,11,...) is
uniformly N-stage optimal if and only if
(T *TN-k-1)(JO) = TN‘k(Jo) k = 0,...,N-l.
Ik
Proof: Let (T *TN-k-1)(JO) = TN'k(JO), k = 0,...,N-1.
Ik
N-k
Then (T *...T * )(JO) = T (JO). But
Ik IN-l
J* T
N-k 2 (TI* .. 1* )(Jo)
k N-l
and
N—k *
T (Jo) 2 JN-k
* .—
JN_k (TI* .TI* )(JO)
k N-l

v* is uniformly N-stage Optimal.

Conversely, let U* be uniformly N-stage Optimal. Then

by definition, T(Jo) = J = T * (JO). For every

1 6 M, (TIT)(JO) = (T T * )(Jo). This implies

93

2
T (J ) = SUP (T T)(J )
0 16M I o
= SUP (TIT * )(JO)
16M IN_l
* _ 2
g J2 (TI* TI* )(JO) g T (JO)
N—2 N-l
2 _ * _ _
T (Jo) - J2 — (TI* T * )(Jo) — (TI* T)(Jo).
N—2 IN-l N-l

By induction, the result can easily be extended for all

QED

Corollary 4.4.1: There exists a uniformly N-stage

 

optimal policy if and only if the supremum is attained

in the relation

Tk+1 (JO) (1”) = sup H[(1.‘=.I,Tk (JO)]
I6U(w)
for each m 6 O and k = 0,1,...,N—l.

Corollary 4.4.2: If there exists a uniformly N-

 

stage Optimal policy, then

* N
JN — T (JO).

Theorem 4.4 and its corollaries state that if the
supremum can be achieved at each stage of the optimiza—
tion process, the resulting contract is uniformly

Optimal. The next theorem states that if the incentive
constraint set is compact, the existence of a supremum

at each stage is guaranteed.

94

Lemma 4.5: Let C be a Hausdorff space,

 

f: C 4 R* and U a subset of C. Let the set
U(i) = {1 e Ulf(1) 2 1] is compact for each 1 e R.

Then f attains a maximum over U.

Proof: If f(1) = m for all I 6 U, then
every 1 6 U attains the maximum. If
f* = sup[f(1)]1 6 U} < m, let [1n] be an increasing

sequence such that In < Xn+l for all n and

1 4 f*.
n

Thus U(ln) C‘U(1 ) for all n and the sets U(ln)

n+1
are nonempty and compact, H U (1n) = U(lo) is
n=l m
nonempty and compact. Let 1* 6 m U (1n) C‘U and

n=l
f(1*) 2 1n for all n
f(1*) 2 f*.

But f(1*) g f*

f(I*) = f*. QED

Theorem 4.5: Let the incentive space C be a

 

Hausdorff space and assume that for each m 6 0,

l 6 R and k = 0,1,...,N-1. The set

Uk(w,l) = [1 6 U(w)]H[w,I,Tk(JO)] 2 l] is compact.
Then J* = TN(JO) and there exists a uniformly N-stage

N
Optimal policy.

95

Proof: Direct application of Corollary 4.4.1

and 4.4. and Lemma 4.5.

CHAPTER V

INFINITE HORIZON MODELS

5.1: Introduction

 

The construction of the infinite horizon models
requires a few more considerations than those in the
finite horizon models. Recall that in the economic

formulation of the model J (w) = lim J (m), that
F N4m N,F

is, Jv(w) is the limiting value of the total expected
discounted net return to the principal as N gets

large and JW(w) = lim (TI ...TI )(Jo)(w) in the

N46 0 N-l
dynamic programming model. The natural initial question
would be whether such limits exist and if they do,
would they be equal. In addition, an immediate concern
would be the convergence of the dynamic programming
algorithm. Another important matter to resolve is
the question of whether or not optimal or nearly Optimal
contracts exist.

This chapter will address the above problems on

two separate sets of assumptions: contraction and

monotonicity. Both assumptions are reasonable

96

97

descriptions of the economic environments of a
business enterprise.
The contraction assumption says that payoffs

and returns that are receivable at a far distant

future have very little significance in the economic
decisions that are made today. In computing present
values, it is well-known that with a discount factor
of less than unity, the present value of any finite
amount for a long period of time is close to zero. A

discount factor of less than unity means that interest

rate is always greater than the inflation rate.

The second set of assumptions describe a mono—
tonicity behavior on the total net return function J.
This assumption can be set up in two distinct
scenarios. First, J can be monotonic increasing
with respect to time. Since J is the total accumulated
expected net return to the principal, the assumption
that J is monotonic increasing implies that the net
payoff each period to the principal is greater or equal

to zero. If one were to consider the net return for

each period in terms of expected value over all possible
payoffs, the requirement that the expected net payoff

to the principal is positive makes economic sense. The
entity will not survive if the expected payoff is

negative. 0n the other hand, if the total net return

98

function is monotonic decreasing, the situation
considered is still not totally unrealistic. Consider
the following situation. An entity has been suffering
losses consistently from period to period. This would
mean that the payoff for each period is negative and
consequently J is monotonic decreasing. The objec-
tive of the model is to maximize expected total return
which means that it will search for some incentive
contracts that would make the magnitude of period loss
diminish over time with the expectation that the loss
will become identically zero and eventually swing over

to the positive side. Once the period payoff becomes
positive, J will be monotonic increasing and the
modeling will continue under the earlier set of
assumptions. Of course, such a situation will occur
if the return becomes positive before the company goes

bankrupt.

5.2: The Contraction Assumption

The contraction assumption is motivated by the
contraction prOperty of the mapping H associated with
discounted stochastic optimal total return with bounded

net return per period. As mentioned earlier, this

assumption will be satisfied with the discount factor,

a < l and the net return function g uniformly bounded

99

above and below. As always is the case, bounded return

is an extremely reasonable assumption in most economic
settings. Theorem 5.1 will show that indeed H as
prOposed in Chapter III satisfies the contraction
assumption.

The next set of questions to be considered involve
whether the iteration process will generate contracts
that lead to an optimal total return function, that
is, whether J* = TN(J), and when the number of periods

N
N is allowed to tend to infinity, whether the dynamic

programming algorithm converges to any kind of limit.
If a finite limit exists, what is that limit equal to?
It can be shown that such a limit does exist and is
equal to the infinite horizon total return function for

any incentive contract (Theorem 5.2(a)). Also, it is

*
shown that JN equals to TN(J) for all N. These

two statements together imply that J*, the infinite

horizon Optimal return function is equal to lim TN(J)
N-Oa

(Theorem 5.2(b)). Also the mappings Tm and TT,

where m g N are contraction mappings (Theorem 5.2(c)).
With the aid of the Fixed Point Theorem which says

that a contraction mapping will converge to a unique

fixed point, J* is shown to be the unique fixed point.

Therefore, the validity of the dynamic programming algorithm

is established for the contraction assumptions.

100

Next, some attempts to describe the Optimal
contract are made. If for each financial position,

involving both the total return to date and the last
net payoff, there exists an Optimal contract at that
position, then the optimal contract is stationary
(Theorem 5.5). This means that the same form of
contract is Optimal over the entire time span. This
result can further be strengthened by showing that the
maximum is attained for each financial position if

the incentive constraint set is compact. Compactness
means that the set is closed and bounded and is the
usual requirement for optimization problems. Also,

since T is a contraction mapping, the sequence Tm(J)

does converge to a limit as m tends to infinity.
Hence the Optimal contract may be obtained in the limit
from finite horizon optimal contracts by successively

computing T(J),T2(J),... (Theorem 5.6). The conver-
gence of Tm(J) implies the convergence of the dynamic

programming algorithm for the infinite horizon model
under the contraction assumption.

As it has been stated, the contraction assumption
yields very desirable results. The optimal total return
J* can be computed through a finite horizon mode TN(J)

by letting N go to infinity. It is unique. If the

101

maximum is attained at each stage of the iteration
process, then the resulting optimal contract is

stationary.

5.3: The Monotonicity Assumption

 

In this section, two separate and parallel sets
of monotonicity assumptions are considered. It is

assumed that J the total return function to the

k’
principal at the end of time period k is a monotonic
increasing or monotonic decreasing function over, that
is Jo(w) g H(w,I,JO) (Assumption 1) or

Jo(w) 2 H(w,I,Jo) (Assumption D) for all w 6 O and
all I 6 U(w). The economic motivation and interpreta-
tion of these two assumptions have been discussed in

the introduction of this chapter. In the following

analysis under Assumptions 1 or D, two additional

assumptions are required on the behavior of the function
H. The first Assumption 1.1 or D.l, describes a
continuity prOperty on H. Similar to Assumption One

of the finite horizon models, this assumption guarantees

that for small changes in the value of J, there will

be only small corresponding changes in the value of H.
The second Assumption, 1.2 or D.2, imposes a kind of
linearity on H from below. This guarantees that H

cannot decrease more than a fixed multiple with a

102

decrease in J. Again, this assumption is similar
to Assumption Two under the finite horizon model.
The economic implications of these two assumptions
will be identical to that under the finite horizon
models.

As expected, Assumption 1 will be satisfied if g,
the net return to the principal per period is positive
and Assumption D is satisfied if g is negative
(Theorem 5.7). Hence, the analysis can be carried on
to investigate the equivalence of the economic and
dynamic programming model under these two assumptions.
This is done in three stages: (1) the Optimality
equation J* = T(J*) is investigated, (2) the question
of existence of optimal or nearly Optimal contracts
is settled, and (3) the convergence of the dynamic

programming algorithm is shown.

5.4: The Optimality Equation

Before considering whether the optimality equation
J* = T(J*) holds, the existence of a nearly optimal
contract for Assumption D is first established. If
Assumptions D, D.l and D.2 all hold, and the optimal
return J* is finite, the existence of nearly optimal
contracts is guaranteed (Theorem 5.8). In addition,
if the discount factor a is less than one, then the

nearly Optimal contract is stationary.

103

The optimality equation can be established under
all Assumptions D, D.l and D.2 (Theorem 5.9 and
Corollary 5.9.1). However, the optimality equation
holds under Assumption 1 only when Assumption 1 is

accompanied by one of the additional conditions 1.1 or
1.2 (Theorem 5.10 and Corollary 5.10.1). A consequence

of the results of this section is that the validity of
adopting the dynamic programming algorithm for the

monotonicity assumptions is established.

5.5: Convergence to Optimality and Existence of

 

Optimal Contracts

 

Under Assumptions D, D.l and D.2, it can be shown
that if the supremum is attained in the Optimality

equation

J*(w) = sup H(w.I.J*)
I6U(w)

for every w 6 0, then the resulting contract is an
optimal stationary contract. In fact, the conditions
are both necessary and sufficient (Theorem 5.11). The
same results can be obtained under Assumption 1

(Theorem 5.12). However, to arrive at the conclusion,
one additional assumption, Assumption 1.1, is required.

If the supremum cannot be attained for some w 6 0,

it is shown (Theorem 5.13) that, under Assumptions D,

104

D.l and D.2, the optimality equation can still be
used to construct a nearly optimal contract. In
addition, if the discount factor is less than one,
the nearly Optimal contract as constructed is
stationary.

Under Assumption 1, only a weak counterpart of
the above results can be given. If Assumptions 1
and 1.2 hold and if the optimal total return function

J* is finite, then a nearly Optimal contract exists

(Theorem 5.15). However, I am unable to give a counter—
part under Assumption 1 on conditions for existence
of a stationary nearly Optimal contract.

The dynamic programming model requires successive

generation of the functions T(JO),T2(JO),...,Tk(JO),...

for all k. In terms of the infinite horizon model,

it seems apprOpriate to define a function JOD by

Jm(w) = lim TN(J )(w) for every w 6 0.
N-Om O

The rest of this chapter will be devoted to the
discussion of whether J2 exists and whether Jco = J*.
In other words, it is a concern that whether the dynamic
programming algorithm converges under the two assump-
tions and if it does, will the limit be the same as

the Optimal total return.

105

Fortunately, if Assumption 1 and 1.1 hold, all

the above questions can be responded affirmatively
(Theorem 5.14). However, under Assumption D, the
equality J2 = J* may fail to hold even in very
simple situations. A preliminary result shows that

in order to have JOD = J*, it is necessary and

sufficient to have J2 T(Jm) (Theorem 5.16). The
convergence of the dynamic programming algorithm under
Assumption D is essential to arrive at the Optimal
total return.

To show tha fact that Jco = T(Jm) requires some
elaborate technical details concerning the algebraic
structure of the various functions. These are pre-
sented in detail in Chapter V*. It can be generally
stated that if the incentive constraint set is compact,
then J2 = T(Jm) = T* under Assumptions D, D.l and D.2
(Theorems 5.16 through 5.18).

Once convergence is established, it is shown

(Theorem 5.19) that the limit of the dynamic programming

algorithm is an optimal contract and it is stationary.

5.6: Remarks
1n Chapters 111 through V, the focus of attention
is on a basic multiperiod agency model in which the

period payoffs are observable to both the principal and

106

the agent and the sequence of payoffs are stochasti-
cally well-defined with a known probability

distribution. Analysis have been conducted on a
finite and infinite horizon settings. The circum-

stances or conditions under both settings that will
guarantee the principal's ability to negotiate an
optimal contract with the agent are extremely mild.

The two conditions imposed on the finite horizon
model are both on the manner in which the total net
return to the principal in the k4-1st period behaves
relative to the total net return to the kth period.
One of the conditions says that if the total return,
or total accumulated wealth in the entity, in period k
changes by a small amount, then given all other factors

equal, the expected change in the total wealth of

period k+-l is also small. The second condition goes
on further to say that the change is somewhat linear.
These two together implies that given some changes in
the wealth of a company, the effect of such changes in

subsequent periods is prOportionate to the change.

Both are conditions on the economic environment in
which the entity Operates. I believe that most if not
all business enterprises are Operating under these
circumstances. If future wealth were so unpredictable

with respect to current changes in company wealth,

107

companies will be very hesitent to declare dividends,

acquire ventures which requires capital outlays.
The contraction and monotonicity assumptions

under the infinite horizon model are also conditions
on the economic environment of the enterprise. The
contraction assumption relies on the same assumptions

that any discounted cash flow or present value models
take on. These assumptions which include bounded

payoffs and discount factor less than one are widely
accepted in the literature. The monotonicity assump-
tion is based on the belief that no company will remain
in business if its expected period payoff is negative.
If any one of these two assumptions is met and con-
sideration of the model is not restricted to a finite
number of periods, one would be able to construct an
optimal stationary contract. This is possible because
by extending the planning horizon to an infinite period
of time, one allows time to become a monitoring device.

By repeating the process over long enough periods, the

behavior of the agent becomes extremely predictable. On
the other hand, if the agent intends to remain in the
company "forever", it will be difficult for him to cheat
"forever" and remain undiscovered. It will then be in

his best interest to act cooperatively with the

principal.

108

Certainly, if the number of periods in the
finite horizon model is allowd to extend long enough,
all the infinite horizon model effects will be carried

through. The limiting effects of the infinite horizon

model provide an additional nicety to the optimal

contract: that it is stationary.

CHAPTER V*

INFINITE HORIZON MODEL

5.1* The Contraction Assumption

 

The following assumption is motivated by the
contraction prOperty of the mapping associated with

discounted stochastic optimal total return with

bounded return per period.

Assumption C: Let B be the Banach space of
all bounded real-valued functions on Q with the
supremum norm. There exists a closed subset Bo C B
such that JD 6 Bo' and for all J 6 BO, 1 6 M,

the functions T(J) and TI(J) belong to Bo' Further-

more, for every n = (10,11,...) 6 H, the limit
lim (T T ...T ) (J ) (w)
N-Oco IO I]. IN-l 0

exists and is a real number for each w 6 0. In
addition, there exists a positive integer m and

scalars w'a' With 0 < m < 1. a > 0, such that
)(TI(J) -TI(J’)|1 g oHJ—J’H v 1 6 M, J,J’ e B

[)(T T ...T )(J)-(T T ---T )(JVH

 

 

g {DJ-J!“ v I ,...,1m_1 6M, J,J’ e B

109

110

The assumption is stated in a very general setting.
It is often convenient to take B = B and assume
a < l with g uniformly bounded above and below.
However, in some special cases, the contraction pro-

perty can be verified only on a strict subset B of B.

To start the analysis, it is first shown that
the function H meets the requirements of the

assumption.

Theorem 5.1: Let H(w,1,J) =

 

E*{g(m.I,y)4-dJ[f(w.I,y)]]w,1} be a mapping. Let
Jo(w) = O for every w 6 0. Assume that a < l

and for some b 6 R, there hold
0 g g(w,1,y) g b V w 6 Q, 1 6 U(w). y 6 1.

Then Assumption C is satisfied with B0 = (J: J 6 B, J 2 O}

and a = 2a, m = d and m = 1.

Proof: Clearly Jo 6 B0 and T(J), T1(J) 6 Bo

for all J 6 Bo and 1 6 M. Then for any

W = (10,11,...) 6 H,
JO 2 TI (J ) g .3 (TI ...TI )(JO)
0 o k
g (1‘ ...T )(J ) <j
— Io Ik+l o =
lim (TI ...TI )(J )(w) exists for all m 6 O.
N4m O N-l 0

Since

lll

*
H = (6.1..1) =f{9(w.1.y)+aJ[f(w.I.y>])p(dylw.I)
* i:
g] g(w.I.y)p(dylw.I)+a] J[f(u;-.I.Y)]p(dy|x.1)

*
g b+o]‘ J[f(w,1,y) ]p(dy]u),1).

 

Thus
N—l
(TI ...TI )(JO)(u)) g E akb
o N-l k=0
g l?a VwecaN=1,2,...
'- lim (T -..T )(J)(u:)6R Vweo.
N4m Io IN-l O

ngJ ((1)) .y] + aJ[f(w.I (w) .y)]
g gfm (w) .y] +aJ’[f(w.I (w) .y) ] + aHJ -J’H

Vw6Q,J,J’6B,16M and ye'y.
By Lemma 4.1,

J1*{9[w,1(u).yl+ aJ[f(w.I (w) .y) ]}p(dy[h,1)
g (*{g[h,1(h).y]+aJ’[f(h.1(w).y)])p(dy|h.1)
-t2aWJ-J’W
Hence

TI(J) (w) - T (J’) (m) g ZoHJ-J’H.

I
Similarly, we obtain

TI(J’) (w) - TI (J) (m) g 2el1J-J’il
ITI (J) ((1)) -TI (1’) (w)l g 2oHJ-J'H.

Taking supremum on the left-hand side over w 6 O

()TI(J) (w) -T1(J') (111))! g onJ-J’H ‘v’ I 6 M, J,J’

6 B.
If J,J' 6 Bo' again we obtain

g[w.I (w) .y] +dJ[f(w.I ((11) .Y] g 9(w.I(i).Y1

+oJ'[f(m,I (w) .y) ] +aUJ—J’H.

112

Then

[*{g[w.1(w).y]+-aJ[f(w.I(w).y])p(dy]m.I)
g [*{9[w.1(w).y]i-GJ'[f(w.I(w).y)]i-QHJ-J’H1p(dy)w.l)

g J‘*{9[WoI(W) IY] + aJ’[f(wlI (W) IY) 11p (dy1LUII) +C1HJ “J’H-
Proceeding as before, we obtain

HTI(J)-TI(J’)H g oHJ-J’H V I 6 M, J,J’ 6 BO. QED

Theorem 5.2: Let Assumption C hold, then

 

(a) For every J 6 B0 and u 6 H,

J = lim (T ...T )(J ) = lim (T ...T )(J).
” N4m o N-l ° N4m Io IN—l

(b) For each positive integer N and each J 6 Bo'

sup (TI ...TI )(J) = TN(J) and

WE“ o N—l

J* (T T )(J ) TN(J )
= sup ... = .

N ten Io IN—l ° 0

(c) The mappings TM and T?. I 6 M are

contraction mappings in B0 with modulus m.

Proof: (a) Let k 2 0 be any integer and k = nm-tq

where q,n 2 O and 0 s q < m. By C, for any J,J' 6 Bo

()(TI ...T )(J)-—(TI ...T )(J’)H g onaqHJ-J’H.
o k-l o k-l

Since Jo 6 Bo

113

Taking limits as k 4 a and n 4 a

11m (T ...T )(J) = lim (T ...T )(Jo).

k4a Io Ik-l k4m Io Ik-l

(b) By assumption Tk(J) 6 ED for all k.
Thus Tk(J)(w) < a V w 6 O and k. For any 5 > O,
6 M, k = 0,1,...,N-1 be such that

let 1k

T__ (J) 2 T(J) - e
IN—l

(T_ T)(J) 2 T2(J) - e
IN-Z

(T_ TN‘1)(J) g TN(J) - c.

I
0

By Assumption C

TN(J) g (T_ TN’1)(J) + 6

IO
6 T_ [(T_ T1”) (J)+ e] + e
’ 1 I

o 1
g (T_ T_ ,TN—2)(J) + as + e
’ I I

o l

N-l k

g (T_ T T_ )(J) + ( Z) a 6)

IO Il IN_l k=O

N-l k
gsup(TI...TI )(J)+ (23 0.8)
— V6U o N-l k=O
TN(J) g sup (TI ...TI )(J).
NEH o N-1
But TN(J) 2 sup (TI ...TI )(J) by definition
— V6H o N-l
. * _ _ N
.JN—sup(TI...TI )(J)—T(J).

W6H o N-l

114

(c) By Assumption C, T? is a contraction

mapping. Also for all I 6 M, k = 0,...,m-l, and

k

J,J’ 6 B
O

(T ...T )(J) g (TI ...TI )(J') + cpHJ-J'H.
o m-l o m—l

Taking supremum of both sides over I 6 M and from

k
part (b)

Tm(J) g Tm(J’) + cpHJ-J’H.

Similarly T'“(J’) < T‘“(J) + cpllJ-J’H

HTm(J) -Tm(J') H < coHJ-J’H. QED

Theorem 5.2 provides some very preliminary results.
It establishes the validity of the dynamic programming

algorithm. The next theorem is the well-known Fixed
Point Theorem in Banach Space which is quoted below

without proof.

Theorem 5.3 (Fixed Point Theorem): If B0 is a

 

closed subset of a Banach space with an apprOpriate norm
and. L.:Bo 4 B0 is a mapping such that for some positive
integer m and scalar o 6 (0,1),

!!Lm(2) -Lm(z’)H _g_ (DMZ-2’1! for all z,z' 6 BO

Then L has a unique fixed point in Bo' Furthermore,

for every Z 6 B,

115
lim HLN(Z)-Z*H = 0
N49

where Z* 6 Bo such that L(Z*) = 2*

With the aid of the Fixed Point Theorem, Theorem
5.4 characterizes the optimal total return function
J* and Theorem 5.5 the total return function JI
corresponding to any stationary contract (1,1,...) 6

It also shows that these functions can be obtained

in the limit via successive application of T and TI

on any J 6 B.

Theorem 5.4: Let Assumption C hold. Then

(a) The optimal profit function J* 6 Bo and is
the unique fixed point of T within 30'
Furthermore, if J’ 6 B0 is such that
T(J’) 2 J’, then J* 2 J’

(b) For every I 6 M, the function J 6 B

I 0
and is the unique fixed point of TI within
B .
o
(c) lim HTN(J)-J*H = o v J 6 B
' o
N-ba
lim HT§(J)-J H = o v J 6 B , I 6 M.
I 0
N40
Proof: By Theorems 5.2(c) and 5.3, T and TI
have unique fixed points in 30' Clearly T¥(J) = JI.

Thus part (b) is proved.

116

Let 3* be the fixed point
3* = T(J*).
For any E > 0, let I 6 M be such that

T_(J*) 2 3* - E.
I

By Assumption c that HTI (J) -TI(J') H g oHJ -J’H

T2(J*) 2 T_(J*) - OLE 2 3* - (1+O)E.

I I
- . m ~ ~ m..1 _.
Continuing, T_(J*) 2 J* - (l-tG-t..uta )6. By the
I _
. I I
assumption that ”(TI TI ...TI )(J)--(TI TI ...TI )(J )h
o l m-l o l m-l
g T’EJ-J'H
2m ~ m ~ m-l -
T_ (J*) 2 T_(J*) - co(l+Cl+...+0. )e
I ‘ I
2 3* - (l+co) (l+(1+...+C1m—1)E.
Thus for all k 2 l
TEm(J*) 2 3* — (1+co+...+ cpk—l)(l+a+...+am-1)E.
I
. . km ~ . . .
Since J_ = 11m T_ (J*). Taking limits as k 4 a
I R40 I
J 2 3* — —1-—(l+a+...+om-1)E.
i - 1%

Let E =(1-CD)(1+G+...+CIm-1)-l

J_2J*—e.
I

But J* 2 J and e > O is arbitrary

117

On the other hand,

J* = sup lim (TI ...TI )(J*)
IEU N4a o N-l

lim TN(3’*) = 3*

N49

W\

~

J* = J*.

By Theorem 5.3, part (c) follows immediately.
Since T is a monotonic mapping by assumption by
part (c) it follows that

if J’ 6 Bo such that T(J’) 2 J’

then J* 2 J’. QED

Theorem 5.5: Let Assumption C hold. Then

 

(a) A stationary contract v* = (I*,I*,...) 6 U
is optimal if and only if TI*(J*) = T(J*).

Equivalently, r* is optimal if and only if
TI*(JI*) = T(JI*)

(b) If for each m 6 0 there exists a contract
which is Optimal at w, then there exists

a stationary optimal contract.

Proof: (a) If r* is Optimal, then J1* = J*.

By Theorem 5.4(a) and (b)
TI*(J*) = T(J*).
If TI*(J*) = T(J*), then TI*(J*) = J*. By Theorem

5.4(a), J1* = J*. If TI*(JI*) = T(JI*)' Again,

118

by Theorem 5.4(a)

,...) be an Optimal

contract at w 6 0.

By Theorems 5.2(a) and 5.4(a)

J*(w) = J *(w)
I
w
= lim (T * ...T * )(Jo)(w)
k-Ha I I
o,w k,w
= lim (T *, ... T * )(J*)(UD
k4ao I I
o,w k,w
3 lim (T * Tk)(J*)(w)
k4a I
o,w
= T * (J*)(w)
o,w
g T(J*) (w)
= J*(w)
T * (J*)(w) = T(J*)(w) for each w.
I

0,0.)

*
Define 1*(w) = 10 w(w) for 1* 6 M. Then

T *(J*) = T(J*). By part (a) the stationary contract
I

(I*,1*,...) is optimal. QED
Theorem 5.5 also establishes the existence and

characterization Of stationary optimal contracts.

Part (a) of Theorem 5.5 shows that there exists a

119

stationalry Optimal contract if and only if the

supremum is attained for every w 6 0 in the

optimality equation, J* = T(J*). Theorem 5.6 strengthens
this result by showing that the supremum is attained

if Uk(w,1), the incentive constraint set is compact.

It also shows that stationary Optimal contracts may be
obtained in the limit form finite horizon optimal

contracts by successively computing T(J),T2(J),--°.

At the same time, it also proves the convergence of

the algorithm.

Theorem 5.6: Let Assumption C hold. Let the

 

incentive space C be a Hausdorff space. Suppose
that for some J 6 Bo and some integers ko > O, the

sets

uk(u.x) = {1 e U(w>1u[h.1.Tk(J)] 2 1}

are compact for all m 6 Q, 1 6 It and k 2 k0. Then
(a) there exists a contract v* = (1;,I*,...) 6 H
attaining the supremum for all m 6 Q and
k 2 ko with initial function J, i.e.,

k

(T *Tk) (J) = T +1(J) v k 2 k.

Ik
(b) For every contract r* satisfying (a), the

'1‘
sequence [Ik(w)} has at least one limit

point for each m 6 Q.

120

(c) If 1*: 0.4 C is such that I*(w) is

*
a limit point of {Ik(w)} for each
w 6 0, then the stationary policy

(I*,1*,...) is optimal.

Proof: (a) Since Tk+1(J)(w) = sup H[w,1,Tk(J)]
16U(w)

and Uk(w,1) are compact for k 2 k0. By Lemma 4.5,
Tk+1(J)(w) attains a maximum

(T *Tk)(J) = Tk+1(J).

Ik

* 'k
(b) Let 1* = (10,1 .) satisfy part (a).

1,..

Define

€k=sup{iiJ*-Ti(J)H]i-2k} k=o'1'ooo

Since T(J*) = J*. By Assumption C and part (a)

1'.va - (T n“) (J) 2) = (T(J*) -T"”<J)H
1
n
g aHJ* -T“<J)!I v n 2 kc,
and
H (T ...T“) (J) — (T ,.Tk) (J) H g aHTn (J) -Tk (J) H
I I
n k
_g_ OHJ* -an) H + aHJ* -Tk(J) H
V n 2 k0, k = 0:11....
Thus

* k * n
Hiern(W)vT (J)] 2 H[w.1n(w).T (J)] - 296k

k

2 J*(w) - 306 V n 2 k, k 0

_ k _

m/

*
1n(w) 6 Uk[w,J*(w)-3dck] for all n 2 k and k 2 k0.

121
Since Uk[w,J*(w)-3ack] is compact

*
[In(w)} has a limit point in Uk[w,J*(w)-3dek].

(c) Let 1*(u0 be a limit point of [1;(w)].

By part (b), 1*(w) 6 Uk[w,J*(w)-3dek] V k 2 k0. Thus

(TI*Tk)(J)(w) 2 J*(w)-306 V w 6 O, k 2 k

k — o'

By Assumption C, for all k

(TI. (J*) - (T1,.T“) (J) I! g aHJ* —Tk (J) H 6 ask

TI*(J*)(w) 2 J*(w) - 4aek.

By Theorem 5.4(c), 6k 4 O

T (J*) 2 J*(w).

1*

But J* = T(J*) 2 TI*(J*)

TI*(J*) J*.

By Theorem 5.5, the stationary contract (I*,1*,...)

is optimal. QED

5.2* The Monotonicity Assumptions

 

For the rest of this chapter, the two parallel

sets of monotonicity assumptions are considered.

Assumption 1: Jo(w) g H(w,I,Jo) V w 6 Q, I 6 U(w).

 

Assumption 1.1: Let {Jk} C F be a sequence

 

such that JO 2 Jk g J for all k, then

k+1

) = H(w,1,lim Jk) V w 6 Q, I 6 U(T).
k-Mn

k k-Oco

122

Assumption 1.2: There exists a scalar o > 0
such that for all scalars r > 0 and functions

J6F with JogJ,

H(onIJ) g H(WoIIJ+r) g H(wlIIJ) + or

V w 6 O and I 6 U(w).
Assumption D: Jo(w) 2 H(w,I,JO) V w 6 0, I 6 U(w)

Assumption D.l: Let [Jk} C F be a sequence

such that J g Jk g Jo for all k. Then

k+1

lim H(w,I,Jk) = H(w,I,lim Jk) V w 6 Q and I 6 U(w).
k4a k-Nn

Assumption D.2: There exists a scalar a > O

 

such that for all scalars r > O and functions J 6 F

With J g Jo

H(wlIIJ) " (11' S H(onlJ-r) __<_ H(wlIIJ)

V w 6 Q and J 6 U(w)

Clearly, under either set of assumptions, JV is

guaranteed to be well-defined by the monotonicity of
J for all F 6 H. It is also easy to see that under

each of these sets of assumptions the limit,

lim (TI TI ...TI )(JO)(w) is well-defined as a
N44 o 1 N-l

real number or 1?- Indeed, in the case of Assumption 1,

123

O 0
g (TIOTI1 ...TIN_1)(JO) g ..., and
Jo 2 TIO(JO) 2 (TIOTII)(JO) 2
2 (TIOTIl ...TIN_1)(JO) 2

in the case of Assumption D. In both cases, the limit

clearly exists in the extended real numbers for each

m 6 0.

Once again, as a first step, the function H is

shown to satisfy these assumptions.

Theorem 5.7: Consider the mapping
H(w,I,J) = E*(g(w.I.y)i-aJ[f(w.I.y)])w.I).
Let Jo(w) = 0 V w 6 o. If
9(w.I.y) 2 0 V w 6 o. I 6 u(w). y 6 y

then Assumptions 1, 1.1 and 1.2 are satisfied with

the scalar in 1.2 equal to a. If
g(w.I.y) g 0 V w 6 0. I 6 U(w). y 6 W

then Assumptions D, D.l and D.2 are satisfied with

the scalar in D.2 equal to a.

Proof: Since Jo(w) = O V w 6 0 and

9(w.I.y) 2 O or g(w.I.y) g 0 V w 6 O. I 6 U(w). Y 6 W

124

Assumptions 1 and D are trivially satisfied. By the
monotone convergence theorem of integration,

Assumptions 1.1 and D.l are satisfied since

g(w,1,y) 2 O.
For all r > O and J 6 F with JO 2 J
H(onpJ+ r) = E*{g(wIIIY) +aJ[f(WrIIY)]+ar1wII]

E*[g(w.1.y) +aJ[f(w.I.y)]lw.I) + or

H(w,I,J) + or.

Hence 1.2 is satisfied. Similarly, using 9(w,1,y) g 0

for all r > O, J 6 F, J g Jo
H(w,1,J-—r) = H(m,1,J) — or

and D.2 is satisfied. QED

In proving the next theorem, the following,
admittedly confusing notation is adopted.

Notation: The contract Vk[w] = (I:[w],1§[w],...)
is associated with w. I§[w] denotes, for each

w 6 Q and k a function in M while I:[w][z]

denotes the value of 1§[w] at an element Z 6 0.

Theorem 5.8: Let Assumptions D, D.l and D.2

 

hold. Let J* < a and e > 0 be given, there exist

an e-optimal contract. Furthermore, if, in D.2, the

scalar a < l, the contract We is stationary.

125

Proof: Let [ck] be a sequence such that

6k > O for all k and

a
Z akek = e.
k=0

For each m 6 0, let [vk[w]} C U be a sequence of
contracts of the form rk[w] = (1:[w],1§[w],...)

such that for k = 0,1,...

Since J* < a, such a sequence exists. Let 1k 6 M

be defined by Ik(w) = I:[w](w) V w 6 Q and J

k
defined by

3 ((1)) =H[U.),I ((1)),lim(T ...T )(J)]Vw6o.

k k i-m 11m] I‘m] °
1
k = 0,1,

By D and D l

3 (w) = lim(T ...T )(J )(u)) = J ((1))

k 1.... Io[u)] 1‘36] ° ”km

2J*(w)-€k V 0.160: k=0,1,°"

By D.2, for all k = 1,2,... and w 6 O

T (3k)(w)
k-l

Hiw.Ik_1(u)) Ijk]

H[w.ik_1(LU) I (J*. - ek)]

ll\/

IIV

H[wI-I-k_1(W) III*] — (16k

IN

H[u),I (w).1im(T ...T )(J )]
k‘l IE-1[w] IE-1[w] °

- ask = Jk_1(w) - ock.

126

Then
TI [Ti (Jk)] 2 Ti (Jk_l-aek)
k-2 k—l k-2
2 Ti (Jk_1-—a ck)
k-2
2 J (a~ 4-026 )
— k—2 bk-l k
(Tf T_I_ )(Jk) 2 Jo - (del+...+d ck)
o k-l

k
(T_ T)(J)2J*-Eae..

I I O ‘ i=0 1

o k
Let Us = (10,11, ) and taking limits

JV“ 2 J* - c
If a < 1, take ck = e(l-d) for all k. Let
"a = (1,1,...), where I(w) = Io[w](w) for all m E Q-
The r is an c-optimal stationary policy. QED

5.3* The Optimality Equation

The next two theorems with their corollaries prove
the Optimality equation, J* = T(J*), hereby establishing
the validity of the dynamic programming algorithm. The

corollaries attempt to set up the algorithm for stationary

contracts. For Assumption D, the Optimality equation
requires the compliance of all D, D.l and D.2, whereas
under Assumption 1, the same results hold only under 1

and one of the additional conditions, 1.1 or 1.2.

127

Theorem 5.9: Let D, D.l and D.2 hold. Then

 

J* T(J*). Furthermore, if J’ e F is such that

J’ g Jo and J’ g T(J’), then J' g J*.

Proof: For every V = (Io'Il"") 6 H and

w E D. By D.l

JF(w) = if: (TIOTIl ..TIk)(Jo)(w)
= T [lim (T --T )(J )](w)
Io k4» I Ik o

1
S TI (J*)(w) g T(J*)(w).
0

Taking supremum of the left-hand side over F
J* g T(J*).

Let el and 82 > 0. By Theorem 5.2, 3 a

contract F = (i ,Il,...) such that

o
T_ (J*) g T(J*) - 61
I0
and
J7? > J* " 92
1
where
7T1 = (IloIZI ‘)
J_ = lim (T_ T_ ...T_ )(JO)
e T_ [lim (T_ ...T_ )(Jo)]
I R49 I I
o 1 k

E

H

128

= T (J_ ) 2 T (J*) - ac

- _ - 2
I0 ”1 I0
_2_ T(J*) - (61+a92).
But J* 2 J_ and 61 and 32 are arbitrary. Then
W
J* 2 T(J*)
J* = T(J*).
Let J’ e F be such that J’ g Jo and J’ g T(J’).
Let {ck} and a sequence with ek > O and a contract
E = (fo.f1,...) e n be such that
{(W)2NW)-% k=mL~-
Ik
From D.2
J* = sup lim (T ...T )(J )
wen ksa Io Ik °
2 sup lim sup (TI ...TI )(J')
_ TTEH k-Oco O k
2 lim sup (TI ...TI )(J')
— k-No o k
2 lim sup (T ...T )[T(J')-e ]
’ k4» Io Ik-l k
2 lim sup (T ...T )(J'-—e )
‘ kea Io Ik-l k
2 lim sup (TI ...TI )(J') — akek
_ k‘w O k-l
k i k i
21im [T(J’)—(Z a 6.1)] 2J' — Z a 61
_ R*m i=0 ’ i=0

since 6i are arbitrary

a J* 2 J .

QED

129

Corollary 5.9.1: Let D, D.1 and D.2 hold. Then

 

for every stationary contract

w = (1,1,...), JI = TI(J1).
Furthermore, if J' E F is such that J’ g Jo and
J’ g TI(J') then J’ g JI.
Proof: Use U (m) = {I(w)} instead of U(w) V w 6 0.

Use Theorem 5.9. QED

*
Lemma 5.1: Let I hold. Then J* = lim J .
N-Oa:

 

*
Proof: Clearly J* 2 JN for all N. Hence

*
J* 2 lim J . Also for all F = (10,11...-) 6 U:

— N4o N

(TI ...TI )(Jo) g J

*
o N-l N

Taking limits on both sides

*
J g lim J .
W — N4m N

Taking supremum of the left hand side

*
* I
J 3 11m JN

N-Ooo

*
J* = lim JN. QED
N-Om

Theorem 5.10: Let I and 1.1 hold. Then J* T(J*).

 

Furthermore, if J' 2 J0 and J’ 2 T(J'), then J’ 2 J*.

Proof: Using the arguments in Lemma 5.1 for all

w E 0

lim sup H(w,I,J;) = sup 1im H(w,I,J;).
N*m W€U(w) W€U(w) N*w

130

By 1.1, then

'k *
1im T(JN) = T(lim JN).
N—Ooo N-bco

Since I and 1.1 are equivalent to Assumption F.1',
by Corollary 4.2.1,

N
J — T (Jo).

* *
Thus T(JN) = TN+1(J ) = J

o N+1' By Lemma 5.1 and combining

the results we have

J* = T(J*).

Let J’ 6 F be such that J’ 2 Jo and J’ 2 T(J’). Then

J = sup 1im (T ...T )(J )
wen N4m Io IN-l 0

g lim sup (T ...T )(J)
- Nae wen Io IN-l °
g lim sup (TI ...TI )(J’)
— Nam JEH o N-l
g 1im TN(J’) g J’. QED
_ Nam _

Corollary 5.10.1: Let I. and 1.2 hold. Let Q

 

be a finite and J*(w) < m for all w E 0. Then
J* = T(J*). Furthermore, if J’ E F is such that

J’ 2 Jo and J’ 2 T(J*), then J’ 2 J*.

Proof: Using a nearly verbation repetition of the

*
proof of Theorem 4.2 (b), we have JN = TN(JO) for all

N = l,2,'°'. We will now show that

* * a
1im H(w,I,JN) = H(w,I,lim JN) V w 6 32, I E U(w) .
N—Ooo N400

131

Suppose for some 5 E O, I €‘U(E) and e > O

NN * ~~ *
H(w,I,Jk) + e < H(w,I,1im JN) k = 1,2,...
N-Om

*
Since 0 is finite and J*(w) = 1im JN(w) < m for all
N-an

w, 3 an integer k0 > 0 such that

* . *
Jk + (e/O) 2 lim JN V k 2 k

N-Ooo O

k

By 1.2, for all k 0

ll\/

~~ * ,.V * N N . *
H(w,I,J ) + e 2 H(w;TuJ +-(e/u)) 2 H(w,I,lim J )
k — k — Nam N

which contradicts the earlier inequality

1% *
.2 lim H(w,I,JN) = H(w,I,lim JN)
N-Oco N-Mn
and the results follow by Theorem 5.10. QED

Corollary 5.10.2: Let I and 1.1 hold. Then for

 

every stationary contract F = (1,1,...), I = TI(JI).

Furthermore, if J’ 6 F is such that J’ 2 Jo and

J’ 2 TI(J’), then J’ 2 J1.

Next, necessary and sufficient conditions for the
Optimality of a stationary contract under the two

assumptions are studied.

Theorem 5.11: Let D, D.l and D.2. Then a stationary

 

contract r* = (I*,1*,...) is optimal if and only if

T (J*) 'T(J*).

1*

Furthermore, if for each w E 0, there exists a

contract which is Optimal at w, then there exists a

stationary optimal contract.

132

Proof: If W* is optimal, then JI* = J*. By

Theorem 5.9 and Corollary 5.9.1, the result follows.
Conversely, if TI*(J*) = T(J*), By Theorem 5.9,

J* = T(J*) then it follows that TI*(J*) = J*. By

Corollary 5.9.1, JI* 2 J*.

W* is optimal.

* 'k *

= I
Let Wm (I ,

O,w l,w'°") be optimal at w E 0.

By D.1,
J*(w) = JI*,w(w)
= 1im (T * ...T * )(Jo)(w)
k4” I0.03 Ik.w

= T * [1im (T * ...T * )(JO)](w)
I k-Mo I I
0.0.) 11“) kn»

g T (J*)(w) g T(J*)(w) = J*(w)
Io.w

T * (J*)(w) = T(J*)(w) for all w E 0.

01w

Define I* E M by I*(w) = I; u)(w). Then
TI*(J*) = T(J*) and by result just proved (I*,1*,...)

is Optimal. QED

Theorem 5.12: Let I and I.1 hold. Then a

 

stationary contract v* = (I*,1*,...) is optimal if
and only if

T (J ).

I* I*

Proof: If F* is optimal, then JI* = J*. By

Theorem 5.10 and Corollary 5.10.2, the result follows.

133

Conversely, if TI*(JI*

5.10.2, JI* = T(JI*)' By Theorem 5.10, JI* 2 J*

) = T(JI*)' By Corollary

W* is optimal. QED

Theorem 5.11 states that under Assumption D, if

the supremum is attain in the Optimality equation

J*(w) = sup H(w,I,J*)
I€U(w)

for every w E Q, then there exists a stationary contract.
However, if the supremum cannot be attained for some
w 6 Q, the Optimality equation can still be used to
construct a nearly Optimal contract, which is stationary
whenever the scalar d in D.2 is strictly less than

one .

Theorem 5.13: Let D, D.l and D.2 hold. Then

 

(a) Let c > 0, and {31] be such that

°° k
Z) a ck = e. e. > 0, i = 0,l,°'°. Let
=0 1
'k 'k
F* = (10,1 ,...) 6 H be such that
T *(J*) 2 T(J*) "' 6k k = 0,1,...
Ik

then J* 2 JI* 2 J* - e.
(b) Let c > 0 and the scalar in D.2, d < 1.

Suppose I* E M is such that

TI*(J*) 2 T(J*) - e(l-—a). Then

J* 2 JI* 2 J* — e.

134

Proof: (a) Since T(J*) = J*

Apply T * to both sides
Ik-l

(T* T*)(J*) 2T* (J*)-(16k
Ik-l Ik Ik-l

2 J* - (ek4-dek).

Repeat the process, for every k = 1,2,...

k .
(T *...T 1,‘,)(J*)2J"' — (2 0116.1).
I I i=0
0 k
Since Jo 2 J*, it follows that
k i
(TI* ...TI*)(JO) 2 J* _ £2; a 6i k = 1,2,...
0 k 1-
Taking limit as k 4 a
JI* 2 J* - e.
(b) Let ck = e(l-—a) and I; = 1* for all k.
The result follows by part (a). QED

A weak counterpart of part (a) of Theorem 5.13
under Assumption I is given in Theorem 5.15. However,
I am unable to give a counterpart of part (b) under

Assumption I or conditions for existence of a

stationary Optimal contract.

135

5.4* Convergence to Optimality and Existence of Optimal

Contracts

 

Define a function Jan 6 F by

J (m) = lim T (J )(w) for every w 6 Q.
a 0
N49

This section is devoted to investigate whether J“ = J*.
Fortunately, the relationship hold for Assumption I

under very mild conditions.

Theorem 5.14: Let I hold and assume that either

 

*
1.1 holds or JN = TN(JO) for all N. Then Jo = J*

where

1im TN(JO)(w) V w e o.

Jc(w)
N49

*
Proof: By Lemma 6, J* = lim JN. By Corollary 4.2.1
N-Hn

*
JN = TN(JO)

*
J* = lim J = J . QED
Q
Nwo

The following is a counterpart of Theorem 5.8 and
part (a) of Theorem 5.13 under Assumption I for the

existence of nearly optimal contracts.

Theorem 5.15: Let I and 1.2 hold. Let Q be a

 

finite set and J*(w) < a for all m 6 0. Then for

any 6 > 0, there exists an e-optimal policy.

136

Proof: For each N, let 8N = e/2(l-ta-t..uth-1)
_ N n N
and JN - {Io,Il,...,IN_1,I,I,...] be such that

I E M and for k = 0,1,...,N-1, IE 6 M and

-k-1 -k
(T NTN )(Jo) 2 TN (Jo) - e

N.

N'
Ik
Thus TIN (Jo) 2 T(Jo) - 6N. Apply TIN to both Sides
N—l N-2
(TIN TIN )(Jo) 2 (TIN T)(Jo) - aeN
N-2 N-l N-Z
2
2 T (Jo) - (1+d)eN.
Continuing (TN...TN )(J ) 2TN(J) - (1+o+...+oN'1)e
o — o
I I
o N-l
For N = 0,1,
J 2 TN(J ) - e/2.
F — o
N
As in the proof of Corollary 5.10.1, the assumptions
imply
*- N f 11
JN — T (Jo) or a N.
. N * . . . .
By Theorem 5.14, lim T (Jo) = JN. Since Q is finite

N4:

and J*(w) < a for all w E Q E N such that
o

N
T 0(JO) 2 J* - e/2. Then

.and JN is the desired contract. QED
o

137

Under Assumption D, D.1 and D.2, the equality

Ja = J* may fail to hold even in very simple

situations. The following preliminary result shows

that in order to have J” = J*, it is necessary and
sufficient to have J" = T(J“), a condition implying

the convergence of the dynamic programming algorithm.

Theorem 5.16: Let D, D.1 and D.2 hold. Then

 

J 2 T(J”) 2 T(J*) = J*.
Furthermore, Jan = T(J”) = T(J*) = J* if and only if
Jco = T(J”).
Proof: Clearly Jan 2 JV for all F E H
J 2 J*.

By Theorem 5.9 T(J*) = J*. For all k 2 0

T(J”) = sup H(w,I,J¢)
I€U(w)
k _ k+1
S SUP H(wIIlT (JO)] _ T (JO).

' I€U(w)

"A
q

Taking limit on right side, T(J“)

J 2 T(J”) 2 T(J*) = J*.

T(J*) = J*

Let J = T(J )
a 9

Ja = T(Jo) by hypotheSis.

138

Let J... = T(JO). Since Jo 2 J”, by Theorem 5.9,
J 3 J*. Then Jan 2 T(J”) 2 T(J*) = J* implies that

J = T(J”) = T(J*) = J*. QED

To prove the fact that Jon = T(J”), the following

definitions and notations are needed.

Notations:
(1) For J E F, let E(J) denote the epigraph

of J, i.e., the subset of O x I1 given by

E(J) = {(w.1)IJ(w) 2 1]-

Under D, since Tk(JO) 2 Tk+1(Jo) for all k and

J2 = 1im Tk(Jo) thus
R*a

E(Jm)

ll
":38

k
E[T (JO)].

k 1

(2) For each k 2 l, the subset Ck of Q x C x I?

is given by
k-l
ck = {<w.I.M1H[w.I.T (Jon 2 x. w e n, I 6 own.

(3) Let P(Ck) denote the projection of Ck on

Q X 31, i.e.,
P(Ck) = {(w,X)Ia I e U(w) such that (n.1,i) e ck}.
(4) Let the set P(Ck) be defined as

P(Ck) = [(w,i)13 {in} such that

1nd)” (unkn) EP(Ck) n=O,1,...}.

139

Lemma 5.2: Let D hold. Then for all k 2 1

-————- k
P(Ck) c P(Ck) = E[T (JO)].

Furthermore, P(Ck) = P(C = E[Tk(Jo)] if and only if

k)

the supremum is attained for each m E Q in

Tk(Jo)(w) = sup H[w,T.Tk‘1(JO)].
I€U(w)

Proof: If (w,l) 6 E[Tk(Jo)], then

Tk(JO)(w) = sup H[w,I,Tk-1(Jo)] 2 1.
I€U(w)

Let {an} be a sequence such that an > 0, en 4 0 and

let [In] C'U(w) be such that

k- k
H[w.In.T l(JOH 2 T (JO) ((1)) - en 2 X - en.

Then (w,In,1-en) E Ck and (w,X-en) 6 P(Ck) for all n.

Since 1 - en * X. (w.l) E P(Ck)
E[Tk(J ) c P(C )
o k ‘
Let (w,1) E P(Ck) E a sequence {In} such that

An 4 1 and a corresponding sequence {In} C‘U(w) such

that
Tk(Jo)(w) 2 H[w.1n.Tk'1(Jo)] 2 1n.
Let n 4 a, Tk(Jo)(w) 2 1. Thus
k
(w,X) E E(T (JO)]
375;? c E[Tk(Jo)]

-————- k
P(Ck) c P(Ck) = E(T (JO)].

140

*
Assume that the supremum is attained by Ik_1(w)

for each m 6 0. Then for each (w,l) E E[Tk(JO)]
* k-l
H[Wllk_l((-U)9T (‘10)) g )‘0
This implies (unl) e P(Ck). Hence E[Tk(Jo)] c P(Ck).
By the first part of this theorem

P(Ck) = P(C = E[Tk(Jo)].

k)
Now let P(Ck) = (Ck) = E[Tk(Jo)]. For every w for

which Tk(Jo)(w) > -e
[w Tk(J )(m e P(C)
' o k '
This implies that H a I;_l(w) E U(w) such that

H[w,1;_1(u).Tk‘1(Jo)J 2 Tk(JO)(w)

sup H[w,I,Tk-1(Jo)]
IEU(w)

The supremum is attained for all m for

which Tk(Jo)(w) > -o.

It also is trivially attained by all I 6 U(w) whenever

 

Tkm )(w) = -... QED
0
Definition:
0
P( F) C ) = {(w,1)( H I 6 U(w) such that
k=1 k a
(w,I,1) e F) ck}
k=l
P( F) Ck) = {(w,1)( 3 {l ] such that
k=1 n

in a l, (m,xn) E P(kCH ck)}.

141

Lemma 5.3:

 

 

 

P( F) c ) c: 0 P(C ) c {W P(C )
k=1 k k=1 k k=1 k
= F) E[Tk(Jo) = E(Jw), and
k=1
0 m a k
p( F1 c ) c (A P(C ) = F1 E[T (J )] = E(J ).
k=1 k k=1 k k=l ° ”

Theorem 5.17: Let D, D.1 and D.2 hold. Then

 

(a) J“ = T(J”) (equivalently Ja = J*) if

and only if

 

 

P( F) ck) = F) P(C
k=l k=

(b) Jo = T(J”) and the supremum in

J2(w) = suP H(w.I.JO)
I€U(w)

is attained for each w E 0 if and only if

P( F] ck) = ﬂ P(C
k=l k=

 

Proof: (a) Let J” = T(J”) and (w,1) E E(J,)°

Thus J'(w) = sup H(w,I,Ja) 2 1. Let [an] be a
I€U(w)

sequence such that en > 0, en 4 0 3 a sequence {In}

such that

H(w,In,J') 2 i — en n = 1,2,...

k-l _
so H[w,In,T (Jo)] 2 i — en k,n — 1,2,...

(w,In,1-en) 6 C for all k,n

k

142

a
and (w.In,1-en) E kgl Ck for all n. Thus

a
(w,X-€n) 6 P(kCR Ck) for all n. But 1 - en 4 1

implies

 

(w,l) E P( F) ck)

 

 

 

 

 

k=1
E(Ja) CP( 0 ck).
k=1
By Lemma 5.3
ﬂ P(C ) = E(J ) = P( F) c ).

k=1 k k=1 k
Let P( F) ck) = ﬂ P(Ck). By Lemma 5.2,

k=1 k=1

 

P( F) Ck) = E(Jm).
k=1

Let w E Q be such that Ja(w) > -o. Then
L_E.
[w,J (w)] E P( F) Ck) 3 a sequence (1 ] with
” k=1 n

 

1n 4 Ja(w) and a sequence [In] C‘U(w) such that
H[w,I ,Tk'1(J )] 2 x k,n = 1,2,-~-
n o — n
By D.1, taking limit with respect to k
H[w,In,Jm] 2 1n n = l,2,°°'
Thus

T(J¢)(w) 2 H[w.In.J2] 2 kn.

143

Let n 4 m implies T(J°)(w) 2 Jm(w) for all w 6 O

and J2(w) > -o. The inequality also holds if
J (w) = -¢
0
. T(J“) 2 Jo
By Theorem 5.16, J” 2 T(J”)
C J = T(J ).
Q Q

(b) Let Ja = T(J”) and the supremum is attained

for each m E Q in J¢(w) = sup H(w,I,J°) H I* E M
I€U(w)

such that for each (w,1) E E(JG)

H[w,I* ((1)) ,Ja] 2 x.

k~l

Thus H[w,I*(w),T (Jo)] 2 1 for k = 1,2,... and
[WII* (w)0)\] 6 ﬂ Ck
k=1
(w,k) E P( F) Ck)
k=1
E(Ja) CP(ﬂ ck).
k=1
By Lemma 5.3, P( F) Ck) = E(Ja) = F] P(Ck). Conversely,
k=1 k=1

9
let P( 0 ck) = E(Ja). For all m e 0 with
k=1

Jm(w) > -a
[w,Jc(w)] e E(JQ) = P( 0 c

H a I*(w) E U(w) such that

w
[w,I*(w).Ja(w)] E F C

k=1 k

144

Thus

H[w,I*(w),Tk-1(Jo)] 2 Jo(w) k = 0,1,...

By D.1 and taking limits
T(J”) (w) 2 H[w.I*(w),J°] 2 Jam).

By Theorem 5.16, T(J”) = J”. If J°(w) = -o, every
I E U(w) attains the supremum and the proof is

complete. QED

Theorem 5.18: Let D, D.1 and D.2 hold. Let
the incentive space C be a Hausdorff space. Suppose

there exists an integer ko 2 0 such that for each

w 6 Q, 1 6 R and k 2 k0, the set
Uk(w,l) = (I E U(w))H[w,I,Tk(JO)] 2 X

is compact. Then

 

a
Proof: Let (w,l) be in F) P(Ck) H a sequence
k=1
{In} C U(w) such that

H[w.In,Tk(Jo)] 2 H[w,In,Tn(Jo)] 2 i v n 2 k.

Thus In E Uk(w,1) V n 2 k, k = 0,1,°". Uk(w,1) is
compact for k 2 k0. This implies that [In] has a
limit point I e Uk(w.l) V k 2 k0. But

Uo(w,1) D‘Ul(w,l) D... so I 6 Uk(w,l) for k = 0,1,...

145

H[w,I,Tk(Jo)] 2 l k = 0,1,...

- co
(willx) 6 n Ck.

k=1
G
This implies (w,1) E P( F) Ck)
k=1
O O
a P( F) c ) D F) P(C ).
k=1 k k=1 k

Since Uk(w,l) is compact, by Lemma 445, the supremum

in Tk(J )(w) = sup H[w,I,Tk-1(J )] is attained
o o
I€U(w)

for every w E Q and k 2 k0. By Lemma 5.2,
P(Ck) = (Ck) for k 2 k0. But P(Cl) 3 P(CZ) 3...

and P(Cl) D P(CZ) D...

 

F) P(Ck) = 0 P(C
k=1 k=

By Lemma 5.2,

QED

After proving the fact that Jco = T(Ja) and
hence establishing the convergence of the dynamic
programming algorithm under Assumption D, the following
provides the conditions for the existence and compu-
tation of optimal stationary contracts under the decreasing

assumption.

146

Theorem 5.19: Let the assumptions of Theorem

5.18 hold. Then

(a) there exists a contract F* = (1;,11. ) E H
attaining the supremum for all k 2 k0,
i.e.,
(T ,Tk)(J ) = Tk+l(J ) v k 2 k .
Ik o o — o

(b) For every contract w* satisfying (a),
the sequence (I;(w)] has at least one

limit point for each w 6 0 with

J*(w) > -~.

(c) Let 1*: Q 4 C be such that I*(w) is
a limit point of [1;(w)} for all m E Q
with J*(w) > -a and I*(w) 6 U(w) for

all m E 0 with J*(w) = -a. Then the

stationary contract (I*,1*,...) is Optimal.

Proof: (a) This follows from Lemma 4.5.
* *
(b) Let F* = (Io,Il,...) satisfy

(T ,Tk)(JO) = Tk+1(J > v k 2 k . w e o

o o
Ik
and J* ((1)) > -a

H[w,I;(w).Tk(JO)] 2 H[w,I;(w),Tn(Jo)]

*
In(w) E Uk[w,J*(w)] v k 2 k , n 2 k.

147

*
Uk[w,J*(w)] is compact. {In(w)} has at least one
limit point.

(c) Each limit point I*(w) C'U(w) and
H[w,I*(w),Tk(Jo)] 2 J*(w) v k 2 k0.

Using D.1 and taking limits
H[w,I*(w),Jc] = H[w,I*(w),J*] 2 J* for all w 6 0.

This relation holds trivially for all m E Q with

J*(w) = -a.
TI*(J*) 2 J* = T(J*).

This implies TI*(J*) = T(J*). By Theorem 5.11

(I*,1*,...) is Optimal. QED.

CHAPTER VI

BOREL MODELS

6.1: Introduction

 

In the previous chapters, a basic multiperiod

agency model is developed. It was shown that under
apprOpriate conditions, optimal or nearly optimal
contracts exist and the dynamic programming algorithm
can be implemented to construct such contracts. All

these results rely on the assumption that the dis-
turbance term, yk, behaves in a reasonable manner,

that is, it is countable and there is a well-defined
distribution on its behavior over time. Put into the
context of the model, the assumption implies that once
an initial payoff is specified, the sequence of sub-
sequent payoffs for the entire planning horizon will be
defined stochastically. This means that at time 0

all payoffs are defined with a known probability distri-
bution conditioned on the initial payoff. The

optimization process will then be reduced to finding

an Optimal contract for the corresponding payoffs.

148

149

If the assumptions in Chapters IV or V are met, an
Optimal or nearly Optimal contract can be guaranteed.

However, the disturbance can be arbitrary. Such
arbitrariness may be due to random externalities which
the company has no control of or to the internal
Operating procedures. The imperfect state information
model which is to be discussed in the next chapter is
perhaps the most common situation that gives rise to
an arbitrary disturbance. The actual payoff is not
observable by the principal who receives a report or

signal from the agent concerning the outcome. The

agent can freely choose his reporting function. This
will make it impossible for the principal to define his
expected payoffs at time 0 to search for an optimal
or nearly Optimal contract.

In fact, if the disturbance is allowed to be
arbitrary, various complications arise in the Optimi-

zation process. This chapter will discuss the problems

involved in the different phrases of the dynamic
algorithm. The main intent of going through the techni—
cal details is to set the stage for the imperfect state
information model such that the dynamic programming
algorithm can be utilized as a solution procedure. As

both the problems and their corresponding remedies are

150

highly technical in nature, discussions in this chapter

can only be carried on at a very general level. A
significant amount of detail is omitted. The techni-

cally oriented reader is referred to the starred
chapter for a complete develOpment.

It was mentioned earlier in this work that there
are three Operations performed repetitively. First,
there is the evaluation of a conditional expectation.
Second, an extended real-valued function in two variables

(state and incentive) is maximized over one of these

variables (incentive). Finally, if an optimal or nearly
optimal contract is to be constructed, a "selector"
which maps each state or payoff to a contract which
achieves or nearly achieves the Optimum for the second
step must be chosen. The following sections will take
each of these Operations in turn and discuss the problems
associated at each stage if the disturbance is not

countable.

6.2: Existence of Probability Measures

 

Elementary statistics say that probability is the
measurement of the likelihood of the occurrence of a
certain event from a collection (set) of events. It
is a measure of the likelihood of occurrence. If the

set of events or the set of all possible combinations

151

of events is countable, a probability distribution on
the elements of the set always exists. If the set

is arbitrary, then very little can be said about the
probability distribution of its elements. In the
context of the model in the previous chapters, the

conditional expectation involves not only the probability
distribution on one set, but on the product of two sets,

the payoff and the disturbance. It becomes essential
to investigate the interplay of the distributions on
these two sets.

Since probability distribution is a measure of
the likelihood of occurrence of the elements of a set,
its existence is closely related to the measurability

of the set. When arbitrary sets are encountered,
measurability is always a crucial issue. One can

envision measurability of a set as the ability to

count the elements in the set (counting measure) or

the ability to induce a distance between the elements

in the set in a one-dimensional setting or the area in

a two-dimensional case (a metric or norm). A probability
measure can be viewed as a function which maps the

elements in the set to the real line. The space of all

probability measures that can be defined on the given
set S is called the space of probability measures on

X. It is denoted by P(X). In Appendix I, it is shown

152

that P(X) inherits all the prOperties of the original

set X. Hence, P(X) is measurable whenever X is
measurable. 0r conversely, if X is measurable,

then an unconditional probability measure always exists
with respect to the specific measure of X. Neverthe—
less, an arbitrary measurable space is an extremely
large space for any meaningful analysis to be conducted

on. In order to draw any useful implications one has
to restrict the research on a smaller subset which is

typical enough to encompass most if not all the
characteristics of the original set. The Borel set is
the most common candidate for such a purpose. To de—
fine Borel sets, the idea of a O-algebra is needed.

A collection of subsets of a set X is said to
be a o-algebra in X if it has the following properties:
(1) it contains X, (2) it contains any subset A of

X and the complement of A relative to X, and (3)
it contains all possible unions of subsets of X. Then

the Borel sets of X is the smallest O—algebra in X
such that it contains every open set in X. Since P(X)
inherits all the prOperties of the original space, it
is also a Borel space.

As mentioned earlier, the dynamic programming

algorithm requires the evaluation of a conditional

expectation which involves prObability measures on a

153

product of the payoff and incentive spaces. It can be
shown (Theorem 6.2 and its corollaries) that a pro-
bability measure on a product of Borel spaces can be
decomposed into a marginal and a conditional probability.
Such decomposition is possible even when the parameters

or arguments of the distribution function are dependent.
In addition, such a process can be reversed (Theorem 6.3).
Given a probability measure and one or more conditional
probabilities, a unique probability measure on the
product space can be constructed. All these distributions

can be shown to be measurable if the original sets are

Borel sets.

With the establishment of probability distributions
on both the payoff and incentive spaces and the interplay
of these probabilities between these two spaces, the
conditional expectation Operations in Borel spaces are

well-defined.

6.3: Analytic Sets
The second stage of the dynamic programming algorithm

involves the maximization of an extended real-valued

function in two variables over one of these variables.
When the disturbance is countable, the whole array of
payoffs is defined stochastically given an initial payoff.
The incentive function is defined on the payoff space.

Under such circumstances, the resulting problem is a

154

standard multiperiod maximization problem which has
been treated somewhat in detail in Chapters IV and V.
When the disturbance term is an arbitrary element

from a Borel space, then wk cannot be deduced from
the knowledge of mo at time period 0. Also the
exact form of the optimal or nearly Optimal contract
cannot be specified at time 0 even if the existence
of such contracts are guaranteed. The best one can do
is to be able to construct a "selector" which maps
each payoff to a contract which achieves or nearly
achieves the maximum. Essentially, the algorithm searches
for the maximum along the projection of the incentive

function on the payoff space in this second stage of
the process.

In searching along the projection of sets from
Borel space, a very serious prOblem is encountered.
So far in Chapters III through V, the multiperiod agency
problem is formulated to involve dealing with "nice"
sets. These "nice" sets have been either measurable sets
or Borel sets. But, at this stage, when projections
of these "nice" sets are used to search for a solution,
it would be desirable that the projections are "nice"
also. It is at this point that the use of measurable
sets and Borel sets breaks down, because one cannot be

sure that the projections required will be of the same

155

type. The projections do not carry over the behavior

of the original sets. In fact, they may not be
measurable with respect to the spaces of the original
sets.

Fortunately, there is another class of sets
available, the so-called "analytic sets" which has the
desirable properties that are required in the current
model. There are many approaches to analytic sets, but
maybe the best for the current purposes is that the
analytic sets consist of the images of the Borel sets

under continuous functions. The image of an analytic

set under a continuous function is itself an analytic
set. The Borel sets thus form a subclass of the analytic
sets: each Borel set is an analytic set, but there are
analytic sets which are not Borel sets. Also, a pro-

jection is a continuous function. Now, letting the
analytic sets be the "nice" set, one obtains some

control of the results of projections, that is, a
guarantee of the measurability of the projections. This
will enable the investigation to carry forward. By
enlarging the Borel sets to include the analytic sets,
the model is ready for the implementation of the dynamic
programming algorithm. Technically, through the analytic
sets, the projection becomes measurable with respect to

the original sets.

156

6.4: Construction of the "Selector"

 

The last stage of the Optimization process is the
construction of a "selector" which maps each payoff

to an optimal or nearly Optimal contract. In the last
section, analytic sets are utilized to enhance the
measurability of the projection. If analytic sets

are to be employed, it becomes inherent that the
various functions should be defined such that they are
measurable with respect to the analytic sets. The

main result in this section is that one can construct

a selector which is measurable (Theorem 6.14). Because
of various technical measurability problems, a much
more general and larger space is used. It is in this
larger space, the universally measurable functional
space in which all functions and composition of functions
are measurable with respect to the relative analytic

sets that the "selector" is defined and constructed.
Throughout the process of constructing such a selector,

a very elaborate abstract algebraic structure is imposed
on the payoff and incentive sets and the various
functions. The actual implementation of the dynamic
programming algorithm to numerically evaluate the
optimal contract and meet all these measurability re-

quirements will not be easy.

157

However, the projections may not be that badly
behaved. Under certain conditions, it can be shown
that when extended real-valued functions involved
are semicontinuous, the selectors can be chosen to be

measurable with respect to the original Borel sets.
Such a selector is produced in Theorems 6.15 through

6.17.

The main concern in this chapter is to take care
of the technical difficulties in executing the dynamic
programming algorithm when the disturbance is un-
accountable. Admittedly, all these details have no
direct bearing on the original economic model. However,
if one were to adopt the algorithm to solve the im-
perfect state information model which is discussed in
the next chapter, one would have to guarantee the

feasibility of obtaining a solution through the algorithm.

CHAPTER VI*

BOREL MODELS

6.1* Introduction

 

If the state, incentive and disturbance spaces
are all arbitrary measure spaces, very little can be done.

Hence, for the general model, only sparse works are done

in the literature. One attempt in this direction is

the work of Striebel [1975] involving p-essential

suprema. The following objective function is adopted.

Jk+l(w) = pk-essential 52p E{g[w,I(w),y]

+ Jk[f(w,I(w),y,Jk_l)]} k = 0.---.N-l.
where the p-essential supremum is taken over all
measurable I from the payoff space 0 to incentive
space C satisfying any constraints which may have
been imposed. The functions Jk are measurable
and if the probability measures p0,...,pN_l are
chosen properly and the so-called countable €-

lattice property (refer to the monograph for a

158

159

precise definition) holds, the above modified dynamic

programming algorithm generates the Optimal net
return function and can be used to Obtain contracts
which are Optimal or nearly Optimal for pN_1 for
almost all initial states. However, the selection of
the prOper probability measures is as difficult as
executing the dynamic programming algorithm and the
verification of the countable e-lattice property is
nontrivial even in very simple situations.

A second approach is to investigate models in
which the payoff (state) and incentive spaces are
Borel spaces or even Rn and the expected net return

function

h(w.I) = g(w.I.y)p(dy)w,I)

is assumed to be semicontinuous and/Or convex. Semi-

continuous models of this type are mainly focused
on various combinations of semicontinuity and compact—

ness assumptions such that the functions Jk are
semicontinuous. Most of the researches that were done

in this model (Freedman [1974:L Furukawa [1972],
Himmelberg, et. a1 [1976], Maitra [1968] and Schal [1972])
are carried out in a finite-dimension Euclidean state

space with assumptions of convexity, semicontinuity or

160

both made on the net return function. Results are not
readily generalizable beyond Euclidean spaces
(Rockafellar [1976]).

Another approach, the Borel space framework was
introduced by Blackwell [1965]. The payoff (state)

0 and incentive C spaces were assumed to be Borel
spaces, and the functions defining the model were
assumed to be Borel-measurable. However, even over a
finite horizon the optimal total return function to

the principal need not be Borel-measurable and there
need not exist an everywhere e-0ptimal policy (Blackwell
(1965), Example 2). The problem arises from the
inability to choose a Borel-measurable function

uk: 0 4 C which nearly achieves the supremum uniformly

in w. The nonexistence of such a function interferes
with the construction of Optimal contracts via the

dynamic programming algorithm, since one must first
determine at each stage the measure p with respect
to which it is satisfactory to nearly achieve the
supremum for p almost every w. This is essentially
the same difficulty encountered with the Striebel
approach. The difficulties in constructing nearly
Optimal contracts over an infinite horizon are more
acute. Furthermore, from an applications point of

view, a p-—e-optimal contract, even if it can be

161

constructed, is a much less appealing object than an

everywhere e-0ptimal contract, since in many situa-

tions the distribution p is unknown or may change
when the system is Operated repetitively, in which
case a new p-—e-optimal contract must be computed.

In the formulation that follows, the class of

admissible contracts in the Borel model is enlarged
to include all universally measurable contracts.

It will be shown that this class is sufficiently rich
to ensure that there exist everywhere e-Optimal
contracts, and, if the supremum in the dynamic
programming algorithm is attained for every w and
k, then an everywhere Optimal contract exists. Thus
the notion of p-optimality can be dispensed with.
Another advantage of working with the class of univer-
sally measurable functions is that this class is closed
under certain basic operations such as integration with
respect to a universally measurable stochastic kernel
and composition.

In a dynamic programming algorithm, there are
three Operations performed repetitively. First, there
is the evaluation of a conditional expectation. Second,

an extended real-valued function in two variables
(state and incentive) is supremized over one of these

variables (incentive). Finally, if an optimal or

162

nearly Optimal contract is to be constructed, a
"selector" which maps each state to a contract which
achieves or nearly achieves the supremum in the

second step must be chosen. The following sections
will discuss the problems arising in each of these

operations and suggest solutions whenever feasible.

6.2* Probability Measures on Borel Spaces

The construction of a rigorous multiperiod agency

model via the dynamic programming algorithm is im-
possible when the payoff space and the incentive space

are arbitrary sets or even when they are arbitrary
measurable spaces. For this reason, the concept of a
Borel space is adopted and the prOperties of Borel
spaces are used to develoP the construction.

In evaluating the conditional expectation of the
total net return function, several prOperties of the
probability measures need to be developed, the first
and the obvious one being the unparameterized

probability measure. Since conditional expectation
involves probability measures on a product of Borel
spaces, it becomes essential to investigate the

interplay of the measures. It can be shown that a

probability measure on a product of Borel spaces

can be decomposed into a marginal and a Borel-

163

measurable stochastic kernel. This decomposition is
possible even when a measurable dependence on a
parameter is admitted. Such a result is essential
to the filtering algorithm for the imperfect state
information model which will be developed in the

next chapter. In addition, such a process can be
reversed, that is, given a probability measure and

one or more Borel-measurable stochastic kernels on
Borel spaces, a unique probability measure on the
product space can be constructed.

If X is a tOpological space, 32 is the
collection of closed subsets of X and 5k the Borel
o-algebra on X. The space of probability measures
on (X,Bk) is denoted by P(X). C(X) is the Banach
space of bounded, real-valued continuous functions
on X with the supremum norm for any metric d on
X consistent with its tOpology. A probability measure
p E P(X) determines a linear functional
1p: C(X) 4 R defined by 1p(f) = I fdp. On the other
hand, a function f E C(X) determines a real-valued
9f: P(X) 4 R defined by 9f(p) = I fdp.

The prOperties of the probability measure space

P(X) have been given much attention in statistics

literature (Ash [1972], Feller [1971] are just a couple

164

of classics), they are summarized in Appendix A. In

general, one can say that P(X) inherits all
characteristics of the space X. For example, if X
is a separable metrizable space, then P(X) is

separable and metrizable (Theorem A.4).

Definition: Let X and Y be separable
metrizable spaces. A stochastic kernel q(dy)x) on
Y given X is a collection of probability measures
in P(Y) parameterized by x 6 X. If 3’ is an
O-algebra on X and Y-1[6P(Y)] C 3, where
Y:IX 4 P(Y) is defined by y(x) = q(dylx), then
q(dy)x) is said to be JLmeasurable. If y is

continuous, q(dy]x) is said to be continuous.

Before proving the decomposition theorem for
stochastic kernels, the following theorem states their

general behavior when the state spaces are Borel spaces.

Theorem 6.1: Let X and Y be Borel spaces,

 

6 a collection of subsets of Y which generates 62
and is closed under finite intersections, and

q(dy]x) a stochastic kernel on Y given X. Then
q(dy]x) is Borel-measurable if and only if the

mapping 1E: X 4 [0,1] defined by 1E(x) = q(E]x)

is Borel measurable for every E 6 6.

165

Proof: Let y:.X 4 P(Y) be defined by

y(x) = q(dy[x). For E 6 6, 1E = BE 0 y. If

q(dy)x) is Borel-measurable, then Theorem A.ll implies
1E is Borel—measurable for every E 6 6. Conversely,
if XE is Borel—measurable for every E 6 6, then

-1
o[(J 1 (B )] c:6 .

By Theorem A.ll, it implies

-l -l -1
Y [5 ]=Y [NU 9 {EH}
P(Y) E66 E R
-l -l
=G[U Y (8 (6))]
E66 E R
-l
=o[u x (BHJCG
E66 E R X
Z q(dylx) is measurable. QED

Corollary 6.1.1: Let X and Y be Borel spaces

 

and q(dny) a Borel-measurable stochastic kernel on
Y given X. If B 6 BXY' then the mapping
AszX 4 [0,1] defined by AB(X) = q(BX)x) where

Bx = {y 6'Y: (x,y) 6 B] is Borel-measurable.

Proof: If B 6 EXY and >c6 X, then BX C‘Y
ls homeomorphlc to B O [{x}Y] 6 EXY' Thus BX 6 52
so q(BXlx) is defined. Let ﬁ*= [B 6 EXY: AB is
Borel-measurable]. ﬂ is a Dynkin system. By Theorem

6.1, ﬂ contains the measurable rectangles

.5 = BXY' QED

166

The next two results are the decomposition and
integration theorems for stochastic kernels. The

first one says that any probability measure on a
product of Borel spaces can be decomposed into a

marginal and a Borel-measurable stochastic kernel. The
second theorem is the reversed statement: given a
prObability measure and one or more Borel-measurable
stochastic kernels, a unique probability measure on

the product space can be constructed. Together with
their corollaries, the two theorems provide relation-

ships between two or more prObability spaces which are

useful in the later development of the models.

Theorem 6.2: Let (X,J) be a measurable space, let

 

Y and Z be Borel spaces, and let q(d(y,z))x) be a
stochastic kernel on YZ given X. Assume that q(B)x)
is Jemeasurable in x for every B 6 BXY‘ Then there
exists a stochastic kernel r(dz)x,y) on Z given XY
and a stochastic kernel s(dy)x) on Y given X such

that r(§]x,y) is JEk—measurable in (x,y) for every

IN

6 Bi, s(X]x) is Jemeasurable in X for every

Y E 52, and

“12.)” = [Y r(§)x.y)s(dyl><) V X 6 By. _z_ e 52.

167

Proof: Assume without loss of generality that
Y = Z = (0,1]. Let s(dy)x) be the marginal of
q(d(y,z)]x) on Y i.e., s(X]x) = q(XZ]x) for
every X_6 52. For each positive integer n, define
subsets of Y

M(j,n) = ((j-l)/2n,j/2n)] j = l,...,2n.

Thus each M(j,n4—l) is a subset of some M(k,n) and
the collection {M(j,n): n = 1,2,...;j = 1,...,2n]
generates By. For Z 6 0 D Z, where Q is the set
of rational numbers, define q(dy(0,Z]]x) to be the

measure on Y’ whose value at X_6 6% is q(X(O,Z])x).
Then q(dy(O,Z])x) is absolutely continuous with

respect to s(dy]x) for every 2 6 0 D Z and x 6 X.

Define for z 6 O D Z

q[M(j.n) (o.z])x]/s[M(j.n> Ix]

Gn(z)x,y) = if Y E M(jID) and S[M(j.n))X] > o

o if y e M(j,n) and s[M(j,n))x] = o.

For each 2, the set

B(z) ((x,y) 6 XY: 1im Gn(z)x,y) exists in R}

n-boo

((x,y) 6 XY: [Gn(zlx,y)] is Cauchy]

r) u r) {(x,y) 6XY:)Gn(z)X.y)-
k=1 N=l m,n2N

- Gm(z)X.Y)‘ < %]

is $52-measurable.

168

For fixed x and y and for m 2 n

Q[M(j.n)(O.ZJ)X] = f Gm(2)X.y)S(dy)X).
M(j.n)

But {M(j,n): j = l,...,2n] is the O-algebra generated

by Gn(z]x,y). This implies Gn(z)x,y) is a martingale
on Y under the measure s(dy]x). Each Gn(zlx,y)

is bounded above by 1, so by the Martingale convergence

theorem, Gn(z)x,y) converges for s(dylx) almost

every y

s[B(z)x)x] = 1.

Let
1im Gn(z]x,y) if (x,y) 6 B(z)
n-Ooo
G (Z‘XIY) =
2 otherwise

Let m 4 m, then

q[z(0.z])x] = [ G(z)x.y)s(dy)x) v x e x, z 6 Q n z
Y

and XL: M(j,n).
But 3; is a Dynkin system and by Theorem A.10, then

9[X(0o21)X] = I G(Z)X,y)s(dy)x) V x 6 x, z 6 Q n 2,
Y

X_6 By.
For each 20 6 Q 0 Z, define

C(zo) = {(x,y) 6 XY: 3 z e Q n z with z g 20

and G(z]x,y) > G(ZO)XoY)]

169

u
CZ

((x,y) 6 XY : G(z)X.y) > G(zo)x.y)}

C = L) C(zo)
206002

D(zo) {(x,y) 6 XY: G(°)x,y) is not right-

continuous at 20]

m

U P. U [(x,y) 6 XY: )G(ZIX.y)
n=1 k=1 ZEQnZ

2 g 2 g 20 + E - G(Zo)XoY)) 2 %]

206002

E = [(x,y) 6 XY: G(z)x,y) does not converge to

zero as 2 1 0]

co co

= U Q U [(x,y) 6 XY: )G(Z) (x,y) 2 35;}
n=1 k=1 zEQﬂZ
z<1/k

F

[(x,y) 6 XY: G(1)XoY) 7‘ 1}-
For fixed x 6 X and 20 6 Q 0 Z, and for all z 6 Q 0 Z,
2 g z

0

f G(z)x,y)s(dy)x) g] G(zo)x,y)s(dy)x)
Y Y

2 G(z]x,y) g G(zolx,y) for s(dy[x) almost all y
. s[C(zo)x)x] = O and s(CX)x) = O.
This implies that if z 1 20, z 6 Q 0 Z, then

I G(z]x,y)s(dylx) if G(zo)x,y)s(dy]x)
Y Y

170

and G(z]x,y) 1 G(zo)x,y) for s(dy)x) almost all y
s[D(zo)X)x] = O and s(DX)x) = 0.
Furthermore, as 2 1 O, z 6 0 0 Z

[Y G(z]x,y)s(dy]x) 1 0 V 2.6 52.

Since G(zlx,y) is‘non-decreasing in Z for s(dylx)
almost all y, then G(z)x,y) 1 O for s(dy)x)
almost all y

s(Exlx) = 0.

Let 2 = l in q[X(0,z])x], we obtain

f G(l)x,y)s(dy)x) = s(X[x) V Y 6 BY.
Y

Thus G(l[x,y) = 1 for s(dy)x) almost all y
s(FX[x) = 0.

For 2 6 Z, let {2n} be a sequence in 0 0 Z such

that 2n 1 2. For every x26 X, y 6 Y, define

1im G(z (x,y) if (x,y) 6 xy
n-Ooo n

F(z]x,y) = - (CUDUEUF)
2 otherwise

Clearly F(z)x,y) is well-defined, nondecreasing and

right—continuous.

It also satisfies for every (x,y) 6 XY
O§F(z)x,y)§1 V 262

F(l)x,y) = l

171

and

lim F(z)x,y) = O
210

2 For each (x,y) 3 a probability measure
r(dz]x,y) on Z such that

r((o.z])x.y) = F(Z)x,y) v z e (0.1].

The collection of subsets §_6 52 for which r(§]x,y)
is JEk-measurable in (x,y) forms a Dynkin system

which contains ((O,z])z 6 Z}
r(§]x,y) is 352-measurable for every 2'6 52.

By the m notone convergence theorem, then V x 6 X,

z 6 Z, X_6 HY

qmomllx] I P(zlx.y>s(dylx)
Y

J” r((o.z])x.y)s(dylx).
Y

Again, the collection of subsets g_6 52 for which

Q(XZ)X] = f r(z)x,y)s(dy)x) holds forms a Dynkin
Y

system which contains {(O,z])z 6 Z]

°. q(g_zlx] = [Y r(§)x,y)s(dy]x) v 2 6 BY, 2 e 52.

QED

Corollary 6.2.1: Let X,Y and Z be Borel spaces

 

and let q(d(y,z)]x) be a Borel—measurable stochastic

kernel on YZ given X. Then there exist

172

Borel-measurable stochastic kernels r(dz)x,y) and
s(dny) on Z given XY and on Y given X
respectively such that

q(_Y§)X) = [Y r(§)x.y)s(dy)x) V x 6 BY, _z_ e 32.

Corollary 6.2.2: Let Y and Z be Borel spaces

 

and q 6 P(YZ). Then there exists a Borel-measurable

stochastic kernel r(dz)y) on Z given Y such that

quZZ.) = [Y r(§)y)s(dy) V2 6 BY, 2 e BZ

where s is the marginal of g on Y.

Theorem 6.3: Let X1,X2,... be a sequence of

 

Borel spaces, Yn = X1X2"'Xn and Y = X1X2"°. Let
p 6 P(Xl) ‘be given, and, for n = 1,2,... let

qn(dxn+1)yn) be a Borel-measurable, stochastic kernel

on X given Yn. Then, for n = 2,3,... there

n+1
exist unique probability measures rn 6 P(Yn) such

that

(6.1) rn(X1X2. "Xn)
X ...,X

= [21535-4211 1 qn'an‘xl'XZ'

)

n-l

qn_2(dxn_llx1,...,xn_2)...q1(dx2[xl)p(dx1)

v5 e/sx ,...,>_<n 65X.

1 l n

173

If f:‘Yn 4 R* is Borel-measurable and either

[ f+drn < s or [ f'drn < m, then

(6.2) [Ynfdrn = fxlfxz...fxnf(xl,x2,...,xn)

qn_1(dxn'xllX2pooo'X ) coo

n-l
ql(dx2)x1)p(dxl).

Furthermore, there exists a unique probability measure

r on ‘Y = X1X2"' such that for each n, the marginal
of r on Y is r .
n n
Proof: The spaces Yn, n = 2,3,... and Y

are Borel. Let n = 2, for B 6 52 , by Corollary
2
6.2.1, define

r2(B) = le ql(BXl[xl)p(dxl)

it is easy to see that r2 6 P(YZ) and satisfies (6.1).
Let f:‘Y2 4 R* be Borel-measurable and I f-dr2 < m.

Consider f+:‘Y2 4 [0,w], 3 an increasing sequence of

simple functions such that fn 1 f+. By the monotone

convergence theorem

1im f (x ,x )q (dx Ix )
n22 IXZ n 1 2 l 2 l

_ +
_ Ixz f (x1,x2)ql(dx2)xl) v x1 6 x1

+ o
IXZ f (Xer2)ql(dX2)X1) is Borel-measurable

and

174

1im ] fndr 1im ]

n2” 2 n22 fn(xl,x2)q1(dx2]xl)p(dx1)

lexz

+
IXIIXZ f (Xl'X2)q1(dX2lX1)p(Xm) .

4.
But [Y2 fndr2 1 [Y2 f dr2

(6.2) holds for f+.

Similar arguments show (6.2) holds for f-

fdr = f+dr - f-dr
1Y2 2 [Y2 2 (Y2 2

f+(x1,x2)q1(dx2]x1)

1x312

- [X2 f’(xl.x2)q1(dlexl)]p(dx1>

(xifx2 f(x1,x2)q1(dx2)xl)p(dxl).

Now assume rk 6 P(Yk) exists for which (6.1) and (6.2)
hold when n = k.

For B 6 Y let

k+1'
rk+1(B) = [Yk qk(BYk)yk)rk(dyk).

Then rk+1 6 P(Yk+l)' If B = §1§2'°'§k§k+l' where

then

rk+l‘b) = f Xxlx2...xk(yk)qk(§k+l‘yk)rk(dyk)

I31£§2°"I§k qk)§k+l)xl'xz'"xk)qk—l(dxk)Xk—l)
. ql(dx2]x1)p(dxl)

by (6.2) when n = k.

175

(6.1) holds for n = k4-1.

Then use the previous result to show (6.2) for n = k4—1
proceeding as when n = 2 case.
By the induction hypothesis (6.1) and (6.2) holds.

Suppose r2 6 P(Yn) satisfies (6.1). Then the
collection ﬁ-= (B 6 B )r (B) = r’(B)} is a Dynkin
Yn n n

system containing the measurable rectangles

I
3 — 5&n and rn — rn

Each of the measures rn is consistent, i.e., if m 2 n

then the marginal of r on Y is r . If each X
m n n k

is complete, by the Kolmogorov theorem, 3 a unique

r 6 P(Y) whose marginal on each Yn is rn. If Xk is

N

not complete, consider the completion Xk

its completion Yn' Again, 3 E 6 P(Y) whose marginal

and Y on
n

N iv N
on each Yn is rn. The uniqueness of r implies the

uniqueness of its correspondence r 6 P(Y). QED

In the course of proving Theorem 6.3, the following

result has also been proved.

Theorem 6.4: Let X and Y be Borel spaces and

 

q(dy)x) a Borel-measurable stochastic kernel on Y
given X. If f: XY 4 R* is Borel-measurable, then
the function 1: X 4 R* defined by

1(x) = f f(x,y)q(dy)x) is Borel-measurable.

176

Corollary 6.4.1: Let X be a Borel space and
let f:.X 4 R* be Borel-measurable. Then the
function 9f : P(X) 4 12* defined by 9f(p) = J“ fdp

is Borel-measurable.

Proof: Define a Borel-measurable stochastic
kernel on X given P(X) by q(dep) = p(dx). Let

fsjp(x)x 4 R* be defined by fkp,x) = f(x). Then

ef(p> = J" f(x)p(dx) = f %'(p.x)q(dxlp).

By Theorem 6.4, 6f is Borel—measurable. QED

If f 6 C(XY) and q(dy]x) is continuous, then the
mapping y as defined in Theorem 6.4 is also continuous.

This is proved with the aid of two lemmas.

Lemma 6.1: Let Y be a metrizable space, d a

 

metric on Y consistent with its topology, and
X CLY. If g 6 Ud(X), then g has a continuous ex-

tension to Y.

Proof: Since 9 is uniformly continuous on X,
given 6 > 0 there exists 5(a) > 0 such that if
x1,x2 6 X and d(x1,x2) < 6(6), then Ig(xl)-g(x2)l < e.
Let X be the closure of X. Suppose y 6 X. Then
there exists a sequence (xn} C X for which XD 4 y.

Let s > 0 be given, 3 N(s) such that

d(xn,xm) < 6(6) for all n,m > N(e).

(g(xn)} is Cauchy in R.

177

A
Define g(y) = lim g(xn). Thus [g(xn)-3(y)) < c

n-Oco

whenever n > N(e). Suppose now x 6 X and
d(x,y) < 6(e)/2. Choose n > N(e) such that

d(xn.y) < 6(5)/2. Then d(x,xn) < 6(5) and

A A
lg(x) -g(y)) g )g(x) —g(xn>) + (g(xn) -g(y>)
. < 29

For any sequence {x5} C X with x; 4 Y

A _ . :
g(y) — lim g(xn).

n-Oco
A c
9(Y) is independent of the sequence [xn] chosen.
If y 6 X, take xn = y, n = 1,2,... and obtain
A
g(y) = g(y)
A

g is an extension of g.

If {ym} is a sequence in X which converges to y 6 X,

then there exist sequences (x ] in X with

mn
ym = lim xmn' Choose n1 < n2 <... such that
n-Oco
. _ 1
11m x - y and d(xmn ,ym) < 6(5)/2. Then
m-Om m m
A . A ‘ _
g(y) = 11m g(xmn ) and (g(xmn )-g(ym) < 2/m. Letting
m4oo m m

m 4 m, then

A , A
g(y) = 11m g(ym)
111-900

A _
and g is continuous on X. Clearly

SUP )g(x)) = sug )g(y))-
XEX y€X

178

- A
If X = Y, g is clearly unique. If X is a proper
A
subset of Y, by Tietze extension theorem, g can
be extended to all of Y such that

Hg) = sup ($(y>(. QED
y6Y

Lemma 6.2: Let X and Y be separable metrizable

 

spaces. Then the mapping 0: P(X)P(Y) 4 P(XY) defined
by o(p,q) = pq where pq is the product of the

measures p and q is continuous.

Proof: By Theorem A.7, X and Y can be
homeomorphically embedded in the Hilbert cube N. Let
X,Y C N and d be a metric on MN consistent with
its topology. Let g 6 Ud(X,Y), by Lemma 6.1, g can
be extended to a function 3 6 C(NV). Consider the
sit of finite linear combinations of the form

A

A A A
Z) f.(x)h.(y) where f. and h. range over C(ﬂ)
i=1 i i i l

and k any integer. Let c > 0 be given, by the
StoneAWeierstrass Theorem such a linear combination

exists and satisfies

k A A A
”.23 fihi-gH < 6/3.
1:
Let {pm} be a sequence in P(X) and pn 4 P, p 6 P(X)
and [gm] a sequence in P(Y) with qn 4 q, q 6 P(Y)

A A
Consider fi’hi the restrictions of fi and hi to

159

X and Y

1im sup 1] gd(pnqn) -J" gd(pq>l
n41: XY XY

k
1im su (9- Z f.h.)d( q)
n4ap REY i=1 1 1 pn n1

"A

k
2) 1' f.d h.d .— f.d h.d
+ i=1 n4:»‘£ 1 p“ I 1 q“ i 1 p i 1 q‘

K:

k
+ 1im If ( Z) fihi-g)d(pq)) S 6
n4: XY i=1 —

By the equivalence of Theorem A.5 (a) and (c)
pn 4 p = ] fdpn a [ fdp and qn a p = j qun 4 ] qu

o is continuous. QED

Theorem 6.5: Let X and Y be separable

 

metrizable spaces and let q(dy]x) be a continuous
stochastic kernel on Y given X. If f 6 C(XY),
then the function 1: X 4 R defined by

1(x) = f f(x,y)q(dylx) is continuous.

Proof: The mapping v:.X 4 P(XY) defined by

v(x) = qu(dy]x) is continuous by Corollary A.5.l

and Lemma 6.2. Thus 1(x) = (9f‘>v)(x) where
9f: P(XY) 4 R is defined by Bf(r) = I fdr. By
Theorem A.5, 6f is continuous

1 is continuous. QED

With the above results, it can be seen that the
conditional expectation Operators in Borel spaces are

well-defined.

180

6.3* Analytic Sets and Universally Measurability

The dynamic programming algorithm is centered
around maximization of functions, and this is intimately
connected with projections of sets. More specifically,
if f: XY 4 R* is given and f*: X 4 R* is defined

by

f*(x) = sup f(x,y),
y6Y

then for each c 6 R
{x e xlf*(x> > c} = projx(((x.y) 6 xy f(X.Y) > cl).

If f is a Borel-measurable function, then
{(x,y)lf(x,y) > c] is a Borel—measurable set. Un-
fortunately, the projection of a Borel-measurable set

need not be Borel-measurable.

As mentioned earlier in the introduction, the second
stage of the dynamic programming algorithm involves in
the supremization of an extended real-valued function
in two variables over one of these variables. Essentially,
the algorithm searches for the supremum along the
projection of the set. If one were unable to guarantee
the Borel-measurability of the projection, it would
become impossible to implement the algorithm. However,
in Borel spaces, the projection of a Borel set is an

analytic set. By enlarging the Borel space to include

181

all universally measurable functions, the model is
amendable for the implementation of the dynamic
programming algorithm.

Analytic sets have a very standard place in the
mathematical literature. The prOperties that are
required to develOp the multiperiod agency model are
summarized in Appendix B. This section will start

off with the measurability properties of analytic sets.

Definition: Let X be a set. A paving 9 of
X is a nonempty collection of subsets of X. The

pair (X,9) is called a paved space.

Let 0(9) be the G-algebra generated by 9. 95

denotes the collection of all intersections of

countably many members of 9 and 90 the collection
of all unions of countably many members of 9. N
denotes the set of positive integers. T and Z

are the set of all infinite and finite sequences of

positive integers respectively.

Definition: Let (X,9) be a paved space. A
Suslin scheme for 9 is a mapping from. Z) into 9.

The nucleus of a Suslin scheme 8:73 4 9 is

N(S)= U n S(Ol,...,o)
(01:02....)691 n=1

The set of all nuclei of Suslin schemes for a paving 9

is denoted by 2(9).

182

Definition: Let X be a Borel space. Denote
by 3* the collection of closed subsets of X. The

analytic subsets of X are the members of £(Jk).

Actually, there are a number of ways to define
the class of analytic sets in a Borel space. Theorem
A.l3 provides seven equivalent definitions. At the
beginning of this section, it was indicated that ex—
tended real-valued functions on a Borel space X
whose upper level sets are analytic arise naturally
via partial supremization. Because the collection of
analytic subsets of an uncountable Borel space is
strictly larger than the Borel O—algebra, such functions
need not be Borel-measurable. Nonetheless, they can
be integrated with respect to any probability measure
on (X,E&). The following will discuss the measurability

properties of analytic sets.

Definition: Let X be a Borel space. The
universal O—algebra ﬁx is defined by

7 = F) B (p). If E E , then B is uni-
*x pEP(X) x “X

versally measurable.

Theorem 6.6 (Lusin's Theorem): Let X be a Borel

 

space and S a Suslin scheme for ﬁx. Then N(S) is

universally measurable, i.e., 3(ux) = EX.

183

Proof: Let A = N(S), where S is a Suslin

scheme for 74X. For (01,...,ok) 62, define

UN

N(Gl,...,ck) = {(gl,g2,...) 6 m: g1 = 01,..., k k
and
M(Ol, .,ok) = {(gl,g2,...) E 52: gl g 01,...,gk g 0k)
= L} N(T ,...,T )
1 k
71§01,...,Tk§ok
Let
R(Oll° 10k) = U n 5(5).
z€M(Ol,...,Ok) s<z
Then R(Ol,...,ok) C K(cl,...,ok) where
k
K(Oll 00k) = T /Q' U ”- (g 391 8(713 v j)
19 1""' k“-2 k
Thus M(Cl) T m and R(ol) 1 A as 01 1 and
M(Ol,...,ck_1,ok) 1M(Ol,...,0k_1) and R(ol, 'Ok-l'ok)
R(Gl,...,ok_l) as Ok 1 m. Let p 6 P(X) and e > 0

be given. Choose :1'32"" such that

p*(A) g p*[R(Zl)] + 6/2

p*[R(:1.....§k_1>J g p*[R(Zl.....:k_l.

+ e/2k k = 2,3,"

Then p*(A) g p*[R(:1,...,§k)] + e. The set

K(7 ...,gk) is universally measurable

(F‘I

kn

184

H
II

P[K(Z1oo-u:k)] + P[X'K(Elo---I:k)]
P*[R(:1o---a§k)] + p[x'K(:10---I:k)]

2 p*(A) - e + pr-—K(§l,...,§k)].

ll\/

Let

x 6 ﬂ K(g .....g)
k=1 1 k

S('r1,...,'r.).

IIDB
"37"

k 1

1 71§§1,...,Tk§gk j

Suppose for every Tl g gl, 3 a positive interer k(T1)

such that
k(T1) k
x£S('r1) DI: m U on S('r1,'r2,...,'rj)]
k=2 72§§2,...,Tkggk j=2
Let k = max k(¢ ). Then
T gt 1
l- 1
ko k
XZ {5(71) ﬁrm U m S(T1IT2I°°°I':-j)]?
r — .' .2
Tlgbl k—2 ¢2§g2,...,¢k§bk j 2
k
0
3 U n S(T1o---oTj)
Tl§§1,...,Tk ggk 3:1
0 o
= K(g .....Q )
1 k0
Contradicting that x 6 ﬂ K(§ ,...,g )
l k
k=1
For some T1 g g1
x 6 S('rl) n [ H U f) S(?l.'r2.- .Tj)]
k=2 72§g2,...,¢k§gk j=l
Similarly, 3 ?2 g g2 such that
x E 551) H 851.;2)
m k _ _
n [ n U .n 5(71IT2It3I Iq.j)]

185

Continuing, we obtain a sequence T1 g gl, ?2 g (2,...
such that
O
x E H 8(7r ,...,7r ) CN(s) =A
k=1 l k
G
..rj K(g1,...,gk) CIA V (g1....) em.
k—l
As k 4 a, K(g1,...,§k) decreases to a set contained
in A and X-K(§l,...,gk) increases to a set con-
taining X-A.
Letting k 4 a
l;p*(A) —e+p*(X-—A)
This implies l 2 p*(A) + p*(X-—A). But for any
E CX, p*(E) + p*(X-E) 2 l
p*(A) + p*(X-—A) = l
A is measurable with respect to p. QED

Corollary 6.6.1: Let X be a Borel space,

 

every analytic subset of X is universally measurable.

Proof: The closed subsets of X are universally

7
measurable, so JK‘X) C'UX. QED

As remarked earlier, the class of analytic subsets
of an uncountable Borel space is not a U-algebra, so
there are universally measurable sets which are not

analytic.

186

In Theorem A.ll, when X is a Borel space, the
function 6A: P(X) 4 [0,1] defined by 9A(p) = p(A)
is Borel-measurable for every Borel-measurable A CZX.
The following theorem and its corollary will investi-
gate the property of this function when A is
analytic. The main result is that the set
{p 6 P(X)‘p(A) 2 c] is analytic for each real c.
Thus, there exists universally measurable probability

measure for the analytic set A.

Theorem 6.7: Let X be a Borel space and A an

 

analytic subset of X. For each c E R, the set

{p E P(X): p(A) 2 c} is analytic.

Proof: Let S be a Suslin scheme for 3* such
that A = N(S). For 5 G’ZL let N(s), M(s), R(s) and

K(s) be as defined as in Theorem 6.6, and

D
121 K(;1,...,gk) CA v (g1....,) e 5:.

Each K(s) is closed. Let p(A) Z c, for any n 2 l,

3 (E1,ZZ,...) E T such that
E(A) g EIR<Z1.....ER)] + 1/h.
Since R(s) C K(s)

§[K(El.....tk>] ; §[R(:1.....Ek>1

187
U n {p E P(X) :p[K(s)] 2 c-l/nF.
E

C
Now, let 5 E r) L) r) {p E P(X): p[K(s)] 2 c-l/n}.
n=1 26m s<z

For each n, 3 (gl,g2,...) 6 m such that

pr F) K(; ,...,c )] 1im §[K(g ,...,g )]
k=1 1 k k“. 1 k

2 c - l/h.

But F) K(g1,...,gk) C A. Thus p(A) 2 c - l/n
k=1 —

n = 1,2, and p(A) 2 c

{p E P(A) :p(A) ; C3

(1 U n {p€P(X):p[K(s)]2c-l/n}.
n=1 26m s<z

By Theorem A.ll, for each n 2 l and s E?ZL the set

Tn(s) = {p 6 P(X) :p[K(s)] 2 c-l/n}

is Borel-measurable in P(X) and

{p e P(X) :p(A) g c} =

"IDs

N (Tn) .

n 1

By Theorem A.l4 and Corollary A.l3.2,

{p e P(X)::p(A) 2 c} is analytic. QED

Corollary 6.7.1: Let X be a Borel space and A

 

an analytic subset of X. For each c E R, the set

{P F P(X): p(A) > c} is analytic.

Proof: For each c 6 R

{P E P(X) =P(A) > C} = U {p 6 P(X) :p(A) 2 c+1/n}
n=1

By Corollary A.l3.2 and Theorem 6.7, the set is analytic. QED

188

6.4 Universally and Borel Measurable Selection

 

The last stage of the optimization process is the
construction of a "selector" which maps each payoff
to a contract that achieves or nearly achieves the
supremum. The following discussion will concentrate
on the universally measurable functions and show its
existence by actually constructing a selector
(Theorem 6.14). If the projection of a particular
Borel-measurable set turns out to be Borel-measurable,
then a similar selector which is Borel-measurable can
also be constructed (Theorem 6.17). Clearly, a Borel-
measurable selector is a special case of the general

universally measurable one.

Definition: Let X be a Borel space. The
analytic O—algebra 0k is the smallest O—algebra
containing the analytic subsets of X. Notationally,
4x = U[J(JX)], where 3* is the collection of closed
subsets of X. If E 6 a&. E is analytically

measurable.

It is noted, by Theorems 6.2 and 6.6, that for any
C .
Borel space X, 5* $(Zx) C 4* C-UX
If X is countable, each of these collections of
sets is equal to the power set of X. However, if X
is uncountable, each set containment in the above

relationship is strict.

189

Definition: Let X and Y be Borel spaces
and f a function mapping D CjX into Y. If
-1 .
D E Gk and f (B) 6 0k for every B 6 E&. f 15

said to be analytically measurable.

If D 6 Ex and f-1(B) 6 ﬁx for every B E Bk,

f is said to be universally measurable.

Because of the set containment relationship, it
is clear that every Borel-measurable function is
analytically measurable, and every analytically measur—
able function is universally measurable. The converses

of these statements are false due to strict
containment prOperty.

In develOping the model for universally measurable
functions, it is necessary to show that universally
measurable stochastic kernels can be used to define
probability measures on product spaces in a manner
similar to Theorem 6.3. To do that, several preliminary

results are required.

Lemma 6.3: Let X be a Borel space and E C.X.

 

Then E 6 ux if and only if given any p E P(X),

there exists B 6 6* such that p(E.AB) = O.

190

Theorem 6.8: Let X,Y and Z be Borel spaces.

 

D 6 “X and E 6 NY. Suppose f: D 4 Y and g: E 4 Z
are universally measurable and f(D) C E. Then g<>f

is universally measurable.

Proof: For p e P(X), define p’ e P(Y) by
I -1
p (C) =p[f (0)] V c 6 By.

Let V E 5% be such that

p[f-1(V) A f-1(U)] = p’(V.AU) = 0.

Since f-1(V) 6 Uk. 3 W E 5% such that hW’Af-1(V)] = 0.
Then p[W Af-1(U)] = 0. By Lemma 6.3, f-1(U) 6 “X

. -1
for every U E U? Since g (B) 6 NY for B 6 ﬁg

-1[

f g-l(B)] is universally measurable. QED

Corollary 6.8.1: Let X and Y be Borel

 

spaces. D 6 uX and f: D 4 Y a universally

measurable function. If U E uy, then f-1(U) E “X'

Corollary26.8.2: Let X,Y and Z be Borel

 

spaces, D 6 Uk and E E Ay. Suppose f: D 4 Y
and g: E 4 Z are analytically measurable and

f(D) C E. Then g<>f is universally measurable. If

, -1
A c d&, then f (A) 6 uX.

191

Corollary 6.8.3: Let X and Y be Borel spaces,

 

let f:1X 4 Y be a function, and let q(dy]x) be a
stochastic kernel on Y given X such that, for
each x, q(dylx) assigns probability one to the
point f(x) E Y. Then q(dylx) is universally

measurable if and only if f is universally measurable.

Proof: By Corollary A.5.l, the mapping

6: Y 4 P(Y) defined by 5(y) = py is a homeomorphism.

Let yzix 4 P(Y) be the mapping v(x) = g(dylx).

Thus y = 6<>f and f = 6-1<>y. The result follows

from Theorem 6.8. QED

Lemma 6.4: Let X be a Borel space and f:.X 4 R*.

 

The function f is universally measurable if and

only if, for every p E P(X), there is a Borel—
measurable function f’ :X 4 R* such that f(x) = fp(x)

for p almost every x.

Proof: Let f be universally measurable and
p E P(X) be given. For r E 0*, let

U(r) = (x: f(x) g r}

f(x) = inf{r E Q*:x E U(r)].

Let B(r) E 6& be such that p[B(r) AU(r)] = 0. Define

fp(x) inf{r E Q* : x E B(r)}

= inf '1; (x)
rEQ*

192

where

r if x EB(r)
¢r(X) ={

m otherwise

fp: X 4 R* is Borel-measurable and

(x: f(x) ;=’ f (x)} C U [B(r) AU(r)] has
P rEQ*

p-measure 0.

Conversely, given p E P(X), let fp be a Borel—

measurable function such that f(x) = fp(x) for p

almost every x. Then

p({x:f(x) gc] A {xzfp(x) gen :0 Vc ER*
f is universally measurable. QED

Lemma 6.5: Let X and Y be Borel spaces and

 

g(dylx) be a stochastic kernel on Y given X. The

following statements are equivalent:

(a) The stochastic kernel q(dy‘x) is
universally measurable.

(b) For any B E 5&, the mapping XB.:X 4 R
defined by 1B(x) = q(B1x) is universally

measurable.

193

(c) For any p E P(X), there exists a Borel-
measurable stochastic kernel qp(dy]x)
on Y given X such that q(dylx) = qp(dylx)

for p almost every x.

Proof: Suppose (a) holds. The function
y: X 4 P(Y) as defined by v(x) = q(dylx) is univer-
sally measurable. Let B E 5k and lB(x) = q(B‘x).

Let : P(Y) 4 X be defined by EB(p) = p(B). Then

eB
1B = BB<>y. By Theorem A.ll and 6.8, 1B is
universally measurable. Suppose (b) holds and choose
p E P(X). Y is separable and metrizable, E a
countable base 6 for the topology in Y. Let 3
be the collection of sets in B and their finite

intersections. For F E 3, let fF be a Borel-

measurable function such that

fF(X) = q(FIx) V x E BF

where BF E 5k and p(BF) = 1. Such a fF and BF
exist by (b) and Lemma 6.4. For x E F) B . let
F
FEJ
q (dylx) = q(dy‘x). For x E H BF' let q (dylx)
p FE? P

be some fixed probability measure in P(Y).

194

q(dy]x) = qp(dylx) for p almost every x.

Y

Borel—measurable in x is a Dynkin system containing

The class of sets 3 in 5 for which qp(YIx) is

J. The class I is closed under finite intersection
and generates By.
By Theorem A.lO, statement (c) follows.

Suppose (c) holds and let p E P(X). Define

pr = X 4 P(Y) by

q(dylx)

.(
A
X
V
II

..‘
A
X
V
II

qp(dYIx)-

-1 —1 _
Let B E 5P(x)’ p[w (B) Ayp (B)] - 0. By Lemma 6.3,

y-1(B) is universally measurable

(c) = (a). QED

Lemma 6.6: Let X,Y and Z be Borel spaces and

 

let f: XY 4 Z be a universally measurable function.
For fixed x E X, define gX: Y 4 Z by
gX(y) = f(x,y). Then gX is universally measurable

for every x E X.

Proof: Let xo E X be fixed and let m: Y 4 XY

be the continuous function defined by w(y) = (xo,y).

For §_E 52, {y E‘Y: gX E Z} = m-1{(X.y) E XY: f(x,y) E Z}.

0
This set is universally measurable by Corollary 6.8.1. QED

195

Now, the main result is ready to be stated.

Theorem 6.9: Let X1,X2,...

Borel spaces, Yn = X1X2...Xn and Y = X1X2...

be a sequence of

 

Let p E P(Xl) be given and, for n = 1,2,... let
qn(dxn+1]yn) be a un1versally measurable stochast1c

kernel on Xn+1 given Yn' Then for n = 2,3,...,

there exist unique probability measures rn E P(Yn)

such that
(6.3) rn(§1§2-'-§n) =1 I "my qn—1(§an1'X2""'Xn-1)
X X X
—l -2 —n—1
qn_2(dxn_1lx1,x2,...,xn_2)
...q1(dx2‘xl)p(dxl)
V x1 E 5; ,...,Xn E 8X
1 n

If f: Yn 4 R* is universally measurable and either

+ .-
f f drn < m or I f drn < m, then

(6.4) f fdrn = f I ...‘f f(x1,x2,...,xn)
Y X X X
n 1 2 n
qn_1(dxn]x1,x2,...,xn_1)...q1(dx2lxl)p(dxl).

Furthermore, there exists a unique probability measure
r E P(Y) such that for each n the marginal of r on

Y is r .
n n

Proof: There is a Borel—measurable stochastic

kernel §l(dx21xl) which agrees with q(dx21x1) for

196

p almost every x1. By Theorem 6.3, define r2 E P(Y2)

by specifying it on measurable rectangles to be

r2(§i £2) = g q1(§21x1)p(dxl) v £1 6 5&1, g2 e 5&2.
-1

Assume f:'Y2 4 [0,m] is universally measurable and

f: Y2 4 [0,m] be Borel-measurable. Let f = f on
Y -N where N E B and r (N) = 0. By Theorem 6.3,
2 Y2 2
o = r2(N) = f f XN(x1,x2)ql(dlexl)p(dxl)
XX
—1 -2
= f q1(NX ‘Xl)p(dx1)
X1 1
ql(NX 1x1) = O for p almost every x1.
1
Since f(xl,x2) = f(xl,x2) for x2 E'NX . Thus

1

I; [f(x1.x2)-f(xl,x2)]ql(dlexl)l
2

g I lf(x1,x2)-f(x1,x2)]ql(dlexl) = o

N
Xi

for p almost every x1. Then

I f(x1,x2)ql(dx2‘xl) = I f(xl,x2)q1(dx21xl)
X1 X2

= j f(xl.x2)q1(dx21xl)
X2

for p almost every x1. By Theorem 6.4,

g f(xl,x2)q1(dx21xl) is Borel-measurable.
l

197

By Lemma 6.4, I f(x1,x2)q1(dx2le) is

X2

universally measurable.
Furthermore,

I fdr2 = I fdr
Y2 X2 1 2

f(xl,x2)q1(dx2‘xl)p(dxl)

M
II
XC—a

><‘——>

i f(x1.x2)q1(dx21x1>p<dx1)

(6.4) holds for n = 2 and f 2 0. If

If f: Y2 4 R* is universally measurable

and satisfies I f+dr < m or I f-dr < m,

2 2
then (6.4) holds for f+ and 5‘, so it

holds for f.

Let f = X X , we obtain (6.3). Now assume the
51 52

theorem holds for n = k. Let qk(dx be a

k+l‘yk)
stochastic kernel which agrees With qk(dxk+l]yk) for

r almost every Xk' Define r by specifying

k k+1

it on measurable rectangles to be

rk+l(§l §2"‘§k+1) = f qk(’—(k+1lxi'xz"”'xk)dr
§1§2"°§k

V £1 E Bxl'o--'2<.k.1 6 Ex

k

k+1

Proceed as in the case n = 2 for n = k+-l. The proof
for the existence of r E P(Y) such that the marginal
of r on Xn is rn, n = 2,3,... is the same as in

Theorem 6.3. QED

198

In the course of proving Theorem 6.9, the following

fact has also been established.

Theorem 6.10: Let X and Y be Borel spaces

 

and let f: XY 4 R* be universally measurable. Let
q(dny) be a universally measurable stochastic kernel
on Y given X. Then the mapping' 1.:X 4 R* defined

by 1(x) = f f(x,y)q(dy‘x) is universally measurable.

Corollary 6.10.1: Let X be a Borel space and

let f:IX 4 R* be universally measurable. Then the
function 6f: P(X) 4 R* defined by 6f(p) = I fdp

is universally measurable.

The functions obtained by supremizing bivariate,

extended real—valued, Borel—measurable functions over
one of their variables have analytic upper level sets.

These functions are called upper semianalytic functions.

Definition: Let X be a Borel space, D c.X,
and f: D 4 R*. If D is analytic and the set

{x E le(x) > c] is analytic for every c E R, then

f is said to be upper semianalytic.

It is clear that the idea behind upper semianalytic
functions is similar to that for upper semicontinuous

functions in the Borel model. The next step is to

199

investigate whether or not semianalyticity is preserved

in the Optimal function and under the expectation

Operator.

Lemma 6.7: (1) Let X be a Borel space, D

 

an analytic subset of X and f: D 4 X. The following

statements are equivalent. (a) The function f is

upper semianalytic, i.e., the set

{x E D: f(x) 2 c] is analytic for every c E R.

(b) The set in (a) is analytic for every c E R*.
(c) The set {x E D: f(x) 2 c} is analytic for every
c E R*. (d) The set in (c) is analytic for every
C E R*.
(2) Let X be a Borel space, D an analytic subset
of X, and fn: D 4 R*, n = 1,2,... a sequence of
upper semianalytic functions. Then the functions

inf fn, sup fn’ lim inf fn and lim sup fn are
n n n-Dm n-Nb

upper semianalytic. In particular, if fn 4 f, then

f is upper semianalytic.
(3) Let X and Y be Borel spaces, g:1X * Y,
and f: g(X) 4 R*. If g is Borel-measurable and f
is upper semianalytic. Then f<>g is upper semianalytic.

(4) Let X be a Borel space, D an analytic
subset of X, and f,g: D 4 R*. If f and g are

upper semianalytic, then f4-g is upper semianalytic.

200

If, in addition, g is Borel-measurable and g 2 0
or if f 2 0 and g 2 0, then fg is upper semi-

analytic, where we define 0 °w = w '0 = 0(—w)

: (_oo)0 : 0.

Proof: (1) We show (b) = (a) = (d) = (c) =9 (b).
Clearly (b) = (a) and (d) = (C). Suppose (a) holds,
then {x E D: f(x) 2 -m] = D which is analytic by

definition, while the sets

{x E D: f(x) 2 m} = {X E D: f(x) 2 n}

:3
II 38 II 38
H

H

{xED:f(x)2c}= [XED:f(x)>c-%}CER

.‘3

are analytic by Corollary A.13.2
(a) = (d).

If (c) holds, then the sets
{XED:f(X)>w1=¢

{xED:f(x)>—oo}= U [XED:f(x)2n}
{xED:f(x) 2c]: u {xED:f(x) 2c+rll]

are analytic by Corollary A.13.2

(C) = (b)-

(2) Let c E R,

{x E D: inf f (x) > c} = H {x E D: f (x) > c}
n n
n n=1
{xED:sup fn(x) 2c}: U {xED:fn(x) 2c}

:3
:5
H
H

 

 

201

inf fn and sup fn are upper semianalytic
n n

by Corollary A.l3.2 and part (1).

And
1im inf fn = sup inf fk
n4m n21 k2n .
lim sup f = inf sup fk are upper semianalytic.

n4m n n21 k2n

(3) By Theorem A.18, the domain g(X) of f is

analytic. Let c E R,

Ix e x: (fog) (x) > c} = g'1({y e g(X) : f(y) > c))

is analytic by Theorem A.18.

(4) Let c E R,

{x E D:f(x)+g(x) 2 c} = U [{X ED:f(X) 2 r}
rEQ

(1 {x E D:g(x) 2 c-r}].

This is true if we adopt f(x) + g(x) = m + w = w for
all x E D and c E R*. By Corollary A.l3.2, f4-g
is upper sem analytic whenever f and g are.

Suppose g is Borel-measurable and g 2 0. Let c 2 0,

{x E D: f(x)g(x) 2 c} = LI {x E D: f(x) 2 r,
rEQ.r>O
g(x) > C/r].

202

{X E D: f(X)g(X) > C} = {X E D: f(X) 2 O]

U {x e D: g(x) 2 0]

U [ (J {x E D: f(x) 2 r, g(x) < c/r}]
rEQ.r>O

{x E D: f(x)g(x) 2 c} is analytic.
Suppose f and g are both semianalytic and nonnegative.

For c E R, the set {x E D: f(x)g(x) 2 c] is analytic

as before

fg is upper semianalytic. QED

Theorem 6.11: Let X and Y be Borel spaces,

 

let D be an analytic subset of XY and let
f: D 4 R* be upper semianalytic. Then the function

f*: proj (D) 4 R* defined by f*(x) = sup f(x,y)
X
yEDX

is upper semianalytic. Conversely, if f*: X 4 R*
is a given upper semianalytic function and Y is an
uncountable Borel space, then there exists a Borel-
measurable function f: XY 4 R* such that

f*(x) = sup f(x,y) with D = XY.
yEDX

Proof: If f: D 4 R* is upper semianalytic and

c E R. The set

203

{x E pron(D): sup f(x,y) 2 c]
YEDX

= prOjX({(X.y) E D: f(X.Y) > Cl)

is analytic by Theorem A.l7. Suppose f*: X 4 R* is
upper semianalytic and Y an uncountable Borel space.
For r E Q, let A(r) = {x E X: f*(x) 2 r]. Clearly
A(r) is analytic. By Theorem A.17, there exists

B(r) E EXY such that A(r) = pron[B(r)]. Define

G(r) = L, B(s) and f: XY 4 R* by
SEst2r
f(x,y) = SUP {r E Q: (x,y) E G(r))
= SUP Wr(X.y)
rEQ
where “f(x,y) = r if (x,y) E G(r) and ¢r(x,y) = -0:

otherwise. f is Borel-measurable. Let g be defined
by g(x) = sup f(x,y). If f*(x) 2 c for some
yEY
c E R E r E Q such that f*(x) 2 r 2 c. Thus
x E A(r). 3 y E Y such that (x,y) E G(r) and

f(x,y) 2 r and g(x) 2 r 2 c
f* (X) 2 g(X).

If g(x) 2 c for some c E R 3 r E Q and y E Y
such that g(x) 2 r 2 c and (x,y) E G(r). Thus
for some 5 E Q, s 2 r, we have (x,y) E 8(5) and

x E A(s). This implies f*(x) 2 s 2 r 2 c
g (x) g f: (x)

Thus g(x) = f*(x) . QED

264

Theorem 6.12: Let X and Y be Borel spaces.

 

f: XY 4 R* be upper semianalytic, and q(dy]x) a
Borel-measurable stochastic kernel on Y given X.

Then the function 1: X 4 R* defined by

1(x) = I f(x,y)q(dylx) is upper semianalytic.

Proof: Let 9: XY 4 R* be defined as
g(x.y) = —f(x.y). Thus {(x.y) E XY:g(x.y) gb)
is analytic V b E R. Suppose g 2 0. Let
gn(x,y) = min{n,g(x,y)}. Each gn is lower semi—

analytic and gn ? g. Let

En = {(x.y.b) 6 XYR = gn(x.y) g b g n)
= m U {(X,y,r) E XYR : gn (XoY) < r!
k=1 rEQ
r _<_ 1 , 1}.

By Corollary A.l3.2 and Theorem A.l6, En is analytic.
Let u be the Lebesgue measure on R, p E P(XY)
and pu, be the product measure on XYR. By Fubini's

theorem,

(PUMEn) f (XE dudp

XY R n
=J‘ [n - 9n (my) ldp
XY
= n - I gn(x,y)dp.
XY

For c E R, by the monotone convergence theorem,

205

{p E P(XY): f g(x.y)dp g C)

= n {p e P(XY) =J‘ gn(x.y)dp g c}
=1 XY
= ﬂ {pEP(XY):(Pu)(En) 2n-C).

n 1
By Lemma 6.2, the mapping p 4 pp is continuous and by

Theorem 6.7, the function Sj :P(XY) 4 R* is defined by
-f 9(X:Y)dp is upper semianalytic.

Let 1(x) = 9f[q(dy}x)px]. Since the mapping x 4 q(dy1x)
is Borel-measurable and x 4 pX and [q(dy‘x),px] 4
q(dylx)pX are continuous from X to P(X) and

P(X)P(Y) to P(XY) respectively by Corollary A.5.l

and Lemma 6.2
By Lemma 6.7 (3), l is upper semianalytic.

Suppose g g 0. Let gn(x,y) = max{-n,g(x,y)}. Each

gn is lower semianalytic and gn 1 9. Let

En = {(x,y,b) E XYR : gn(x,y) g )5 g 0}.

En is analytic

(pu)(En) = f I X dudp = -f gn(x,y)dp.

E

XY R n XY
For c E R
(p E P(XY): f g(x.y)dp < c} = Llip E P(XY): I gn(x.y)dp < C‘
n=1 XY
= U {p e P(XY) : (pu) (En) > -c).
n=1

Use the same arguments as before. In the general case

I f(x,y)q(dy1X) = f f+(x.y)q(dylx)-f f-(x.y)q(dylx).

206

Since f+ and -f- are upper semianalytic, and by

the preceding arguments each of the summands on the

right is upper semianalytic. By Lemma 6.7 (4),

1(x) is upper semianalytic. QED

Theorems 6.11 and 6.12 show that both the Optimal
function and the expectation operator are well behaved.
The next two theorems will outline the procedures to

Obtain a measurable selector which assigns to each

x E X a y E Y which attains or nearly attains the
supremum in

f*(x) = SUP f(X.y)
yEY

Theorem 6.13 (JankOV-von Neumann Theorem): Let X

 

and Y be Borel spaces and A an analytic subset of XY.
There exists an analytically measurable function

¢>:prOjX(A) 4 Y such that gr(m) C A.

Proof: Let f: T 4 XY be continuous such that
A = f(m). Let g = projx<>f. Thus 9: m 4 X is con-
tinuous from m onto pron(A). For x E pron(A),

g-1({x)) is a closed nonempty subset of m.

207

Let C1(x) be the smallest integer which is the first
component of an element 21 6 g-1[(x)], and g2(x)
be the smallest integer which is the second compo-
nent of an element 22 E g-1({x}) whose first
component is §1(x). In general, let gk(x) be the
smallest integer which is the kth component of an

element 2k 6 g-1({x]) whose first (k-l)st

components are g1(x),...,gk_l(x). Let
WX) = (Cl(X).§2(X)...-)

Since 2k 4 $(x), ¢(X) E g-l({x}). Define m: pron(A) 4 Y
by m = pron° f0 u, so that gr(m) C.A. For

(01,...,Gk) 6 23, let

N(Ul,...,Gk) = {(C1,Q2,-oo) 6 97: C1 = 0110-00gk = 0k}

\v

M(ol,...,ok) = {(g1,g2,...) 6 91: gl g 01,...,gk g Gk}.
Let S = (01.02,...,Gk) E'ZL Suppose x 6 $-1[N(s)].

Let Mx) = (§1(x),g2(x),...). Then ¢'(x) €N(s) CM(s).

V 9[$(X)] 6 g[M(s)] and

§1(X) = Glyn-o,Ck(X) = Ck.

By the construction of w, 01 is the smallest integer
which is the first component of an element of g-1({x})
and for j = 2,...,k oj is the smallest integer which
is the jth component of an element of g-1({x})

whose first (j-1) components are Ul"'°'Gj—1

208

. -1 _ . _
..g([x}) ﬂM(01,...,oj_1,Oj-1)— gr 3 _ l,...,k
k
and xljglg[M(Ol....,oj_1,oj-1)]. This implies

k
J71
+ [NM] C 9[M(s)] - 3'91 9[M(01u--.0j_1 0]- -1)]-

It

Now suppose x 6 g[M(s)] - U g[M(C71,...,CI.l O.-1)].
j=1 3' 3

Since x€g{M(s)], 3 y= (711,712,...) Eg_1({x}) such

that

 

Clearly, x E pron(A) = g(ET?) i
ﬁx) is defined.

Let 1(X) = (;1(X).;2(X)....) and MX) Eg-l({x}).

Thus g[;(x)] = x. This implies 1' (X) EM(01,...,oj_l oj-l)

j = 112I"’Ik' Since $(X) Q’M(Ol-1), then c1(X) 2 01.
But g1 (x) is the smallest integer which is the first

component of an element of g—1({x}) .

C1 (X) g 711 S 01-

This implies (x) = o . Similarly, since
1 l

$'(X) {M(§1(X).02-1). Q2(X) ; 02. Again C2(x)

||/\
.‘3
N
l/\
o
to

.'. 52 (X) = 02.
Continuing v(x) €N(s) and x E i'-1[N(s)]. Thus
k

‘3-1[N(S)] :3 g[M(s)] .- U g[M(O ,...,O._ ,O’.—1].
j=1 l J 1 J

 

P09

k
. fl _
w, [N(s)] — g[M(s)] - girl/[(61,...,oj_l,oj -1]

3:1
M(t) is open in ‘3? for every t E Z. By Theorem A.18,
g[M(t)] is analytic
'. ¢'-1[N(S)] e ax v s e 23.
But {N(s) : s E Z] is a base for the topology on ‘R

O({N(S): s E'ZU) = 3m.

Thus

w-1[U({N(s):s e 21)]

= g[¢-1({N(s) : S GEM]

This implies ¢-1(B C ax and 11’ is analytically

an)
measurable. By the definition of co and the Borel-

measurability of f and pron

-1 ,-1 -l .-1
:9 (BY) = (f {pronY (BYH)
-l -l
c; (f [DXYD
c: w’lvs ) c a
m ‘X
I. a) is analytically measurable. QED

Theorem 6.14: Let X and Y be Borel spaces,

 

D C XY an analytic set and f: D 4 R* an upper semi-
analytic function. Define f*:pron(D) 4 R* by

f* (X) = sup f(x,y).
§’€I5(

 

 

210

(a) For every 6 > 0, there exists an
analytically measurable function
m:pron(D) 4 Y such that gr(m) C D

and for all x 6 pron (D)

f* (x) - c if f* (x) < co
f[X.co(X)] 2
— l/e if f* (x) = on

(b) The set I = {x 6 pron (D) : for some
= * . .
yX E DX' f(x,yx) f (x)} is universally
measurable and for every 6 > 0 there
exists a universally measurable function
co:projX (D) 4 Y such that grm) c: D and

for all x 6 pron (D)
fEX,cp(X)] = f* (X) if X E I

f*(x)-e if x£1,f*(x)<°°

fIX.co(X)] 2
1/6 if XEI, f*(x)=m

Proof: (a) The function f* is upper semi-

analytic by Theorem 6.11. For k = O,j-_l,:_2,...,

define
A(k) = {(xoy) e D: f(x.y) > kc}
B (k) = {x e pron(D) :ke < f* (x) g (k+ Us}
B (-..) = {x e pron(D) : f*(x) = -..]

B(cn) = {x E pron(D) : f*(x) = on}.

 

 

211

The sets A(k), k=0,_-f_1,:_2,... and B(co) are
analytic, while the sets B(k), k = O,_-i;l,:_2,... and

B(—a) are analytically measurable. By the Jankov—von

Neumann theorem, there exists, for each R = 0,11,:2, . . .

an analytically measurable wk : pron[A(k)] 4 Y with
(x,mk(x)) E A(k) for all x 6 pron[A(k)] and an
analytically measurable c-o : projx (D) 4 Y such that
(X,c-o(x)) E D for all x E projx (D) .

Let k* be an integer such that k* 2 1/62.
Define to : pron(D) 4 Y by

9%(x) if x EB(k), k =O,_~I;1,_+_2,...
:p(x) = g(x) if x 6 13(4)

03k" (X) if X € B(co)

Since B (k) C pron[A (k) ] and B (on) C pron[A(k)]
for all k, this definition is possible. Clearly,
co is analytically measurable and gr(cp) C D. If

x E B(k), then (x,cpk(x)) €A(k) and

fifX.co(X)] = f[X.cok (X) ]

>k€ ; f*(x)-e.

If x E B(-eo), then f(x,y) = -en for all y E DX

and

f[x,co(x)] = -o = f*(x).

 

 

212

If X E B(o)

f[X.co(x)] = f[x.cok*(X)] > k*e > l/e.

Emnce to has the required prOperties.

(b) Consider the set E C XYR* defined by

t1}
II

{(X.y.b): (X.y) 6 D. f(X.y) ; b}

k=1 r€Q*
By Theorem A.l6 and Corollary A.l3.2, E is analytic

in XYR*
2 The set A = pronR*(E) is analytic in XR*.

And the mapping T: pron(D) 4 R* defined by
T(x) = (x,f*(x)) is analytically measurable.
I = (x: (x,f*(x)) e A} = T'1(A)
I is universally measurable by Corollary 6.8.2.
Since E is analytic, by the Jankov-von Neumann theorem,
3 an analytically measurable «HA 4 Y such that
(x,m(x,b),b) E E for every (x,b) 6A. Define
‘4'; : I 4 Y by
‘HX) = co(X.f*(X)) = (co°T) (X) V x E I.
By Corollary 6.8.2, '1' is universally measurable and

by construction f[x,~lx(x)]2 f*(x) for x E I

f[X.'&:(x)] = f* (X) V x 6 I.

O
D U ((x.y.b) : (X.y) e o, f(x,y) 2 r, r 2 b-l/k].

 

213

By part (a) there exists an analytically measurable
1'-=Pr0jX(D) 4 Y such that
f*(x)-e if f*(x) <0:

f[X,¢€(X)] 2
l/E‘. if f* (X) = on .

Define co: projx (D) 4 Y by

p(x) if )< 6 I
(p(X) = o I
¢e(x) if x 6 prOjX(D)-I
Thus w is universally measurable and has the required
prOperties. QED

It is noted that since the composition of analyti-

cally measurable functions can fail to be analytically
measurable, the selector in the proof of Theorem 6.14 (b)

can fail to be analytically measurable. However, the

composition of universally measurable functions is

universally measurable, and so a selector which is

‘universally measurable is obtained.

Earlier in the chapter, it was mentioned that

the pnnajection of a Borel-measurable function need

Nevertheless, under certain

not be Borel-measurable.

conditions, it can be shown that when the extended

.realfwnalued functions involved are semicontinuous,

tﬂien tine selectors can be chosen to be Borel—measurable

 

 

214

Theorem 6.15: Let X and Y be separable

 

metrizable spaces. Let q(dylx) be a continuous
stochastic kernel on Y given X, and let f : XY 4 R*
be Borel-measurable. Define 1(x) = I f(x,y)q(dy)x) .
(a) If f is lower semicontinuous and bounded
below, then 1 is lower semicontinuous and
bounded below.
(b) If f is upper semicontinuous and bounded
above, then 1 is upper semicontinuous and

bounded above.

Proof: (a) Since f is lower semicontinuous and
bounded below, 3 a sequence, {fn] C C(XY) such that
fn ? f. Define 1n(x) = f fn(x,y)q(dy)x). Then kn

is continuous by Theorem 6.5 and by the monotone con-

vergence theorem kn T l

1 is lower semicontinuous.

(b) Same argument as (a) by complementation. QED

Theorem 6.16: Let X and Y be metrizable spaces

 

and 1£ﬂ2 f: XY 4 R* be given. Define

f*(x) = sup f(x,y).
y6Y

(a) If f is lower semicontinuous, then f* is
lower semicontinuous.

(b) If f is upper semicontinuous and Y is
compact, then f* is upper semicontinuous
and for every x E Y the supremum is attained

by some y E Y.

 

 

215

Proof: (a) Let (31 be ametric on X and d2

a metric on Y consistent with their topologies.

Let (ECZXY be open and x0 6 pron(G), 3 yO 6 Y
such that (Xo'yo) 6 G and some a > O with
N€(Xo.yo) = {(x,y) 6 XY::dl(X.XO) < c.

62(y.yo) < e] C G.

Then
C ' = 0 C '
xO \ pr03X[N€(xO,yo)] {x 6 X. dl(x,xo) < e} pronG
2 pron(G) is open in X.
Let c E R
{x E X: f*(x) > c? = projx({(x,y) 6 XY: f(x,y) > c}).
Since f is lower semicontinuous, this implies
{(x,y) : f(x,y) > c} is open
{x E X: f*(x) > c} is Open and f* lower
semicontinuous.
(b) Let x E X be fixed and let {yn} c.Y 'be
such tﬂnrt f(x,yn) 1 f*(x). This implies yn 4 yO

lim sup f(x,yn) g f(x,yo)

where yo E Y since
n4a

f (leo) = f* (X) -

Let {x11} C X be such that Xn 4 x0. Choose a sequence

{YD} <:‘Y such that

 

 

216

f(xn,yn) = f*(xn) n = 1,2,...

3 a mﬂmequence [(x ,y )) such that
“k “k

lim sup f(x .y ) = lim f(X oY )

Since Y’ is compact, {yn } 4 yo, yo 6 Y
R

 

1im sup f*(xn) = lim sup f(xn.yn)
n4: n41:

= 111“ f(X 0y )
k4o “k “k

*
_<_ f(xo.yo) g f (X0)
f* is upper semicontinuous. QED
The next lemma is very similar to Theorem 6.13 in

the analytic model and Theorem 6.17 is the corresponding

counterpart of Theorem 6.14 in Borel—measurable model.

Lemma 6.8: Let X be a metrizable space, Y a

 

.separable metrizable space, and G an Open subset of
Xﬁh. 'Then pron(G) is Open and there exists a Borel-
measurable function to : pron(G) 4 Y such that

gr(m) C G-

Prrxaf: Let {yn: n = 1,2,...} be a countable
dense subset Of Y. For fixed y 6 Y, the mapping
x 4 (x,y) is continuous

.. {>< 61X: (x,y) 6 G} is Open.

 

217

Let on = {x e x: (x,yn) e G}. Thus prOJX (G) = n31 on
and prOjX(G) is Open. Define co: pron (G) 4 Y by
y1 if x 6 G1
co(X) = n-l
y if x 6 G - Ll G , n = 2,3,...
n n k=1 n
Clearly gr(cp) C G and m is Borel-measurable. QED
Y

Theorem 6.17: Let X be a metrizable space,

a compact separable metrizable space, D an Open sub-

set Of XY, and let f : D 4 R* be upper semicontinuous.

= sup f(XIY) '

be given by f*(x)
y 6DX

Let f* : pron (D) 4 R*

Then prOjX(D) is open in X, f* is upper semicontinuous,
and for every 6 > 0, there exists a Borel-measurable

such that grace) C D and

function co- : pron(D) 4 Y

for all x 6 prOjX(D)
f*(x) - e if f*(x) < co

ffX,cp€(X)] 2
1/6 if f* (X) = a.

Proof: The set prOjX(D) is Open in X by Lemma

A
6.8. Let f: XY 4 R* be defined by

‘f(x,y) if (x,y) 6D

f(X.y) =
-¢ otherwise

For c 61R,

A
{x 6 X: sup f(x,y) > CT.

' f* (X) > c} =
y6Y

{x 6 pron(D) .

 

 

218

By Theorem 6.16 (b), f* is upper semicontinuous. Let

c > 0 be given. For k = O, i1,_+_-_2,..., define

MR) (091!) e D : f(x.y) > Re)

B (k)

{x e pron(D) :kc < f*(x) g (k+ 1):}

B(—-) = {x 6 prOjX(D) f*(x) = -u]

\

B(a) = {x 6 pron(D) : f*(x) = a]

l

(k+1)€

k6

 

 

 

 

The sets A(k) , k = 0.11.12“... are Open, while the
sets B(k), B(—¢), B(o) are Borel-measurable. By
Lemma 6.8, 3 for each k = 0,11,12,... a Borel-
measurable cpk : pron (Ak) 4 Y such that gr(cpk) C AR“
3 a Borel-measurable E): projx (D) 4 Y such that
gr (c?) C D. Let k* be an integer such that k* 2 1/92.

Define cps : projx (D) 4 Y by

 

 

219

cok(x) if x 6 B(k) k = O,i1,:2

mew) = {g(x) if x e B(w)

c9k*(x) if x 6 B(oo)

Since B(k) c:prOjX[A(k)] and B(a) C prOjX[A(k)] for
all k, this definition is possible. Clearly, we
is Borel-measurable and gr(m€) C D. If x 6 B(k),

then, since (X.wk(X)) 6 A(k),
fIXIm€(X)] = f[Xocok(X)] > ke _2_ f* (X) - e-
If x e B(-c), then f(x,y) = -a for all y e DX
f[X:co€(X)] = -~ = f*(X).
If x 6 B(an),
f[X.co€(X)] = fix.mk*(X)] > k*e < 1/e

m” has the required prOperties. QED

L

SO far in this chapter, all the rudiments that are
necessary to carry out the dynamic programming algorithm

in Borel spaces are discussed. The discussion is to a

large extent technical in nature. The main ideas are

to develop measurability requirements for the various
Operations of the algorithm. In the next chapter, our
attention will be returned to the economic model and an
imperfect information model will be built on the results

derived in this chapter.

220

It should be noted that the most logical order Of
events is to re-examine the finite and infinite models
for a basic multiperiod agency using the Borel space

ideas presented here. However, I choose not to follow
this sequence because the imperfect state information

model can be best used to describe the reporting function
Of an entity about its performance to the owners. Since
this application is Of the greatest interest to
accounting research, I shall proceed directly to the
imperfect state information model, leaving the basic

multiperiod agency Borel model for further research.

CHAPTER VII

IMPERFECT STATE INFORMATION MODEL

7.1: Introduction

 

Chapter VI discusses the problems that arise
when the disturbance space is uncountable. Specifi-
cally, it becomes impossible to define stochastically
the sequence of payoffs given an initial payoff.
Later in the chapter, by imposing measurability
restrictions on the functions and enlarging the payoff
and disturbance spaces to include all analytic sets,

the sequential search for Optimal or nearly Optimal

contracts can still be implemented.

The machineries built in Chapter VI, particularly
VI*, facilitate the modeling of the following economic
phenomenon. A contract is agreed upon by the principal
and the agent at the beginning Of a typical period,
say k. The agent makes his action choice in a manner
of a rational decision-maker. A payoff outcome occurs
at the end Of the period which is only observable by
the agent. The agent "reports" the payoff to the

principal. There is no reason to believe or to assume

221

222

that the agent always reports truthfully. In fact,

the manner the agent reports the payoff should be
One which is in his own best interest. This may

induce a discrepancy or disturbance between the actual
payoff and the reported payoff. Since the agent
"chooses" the manner to report the payoff, the resulting
discrepancy can be anything. What is observable to

the principal is a sequence of past contracts and re-
ported payoffs. A contract is enforceable only as its
arguments are observable by all parties. Under such
circumstances, the only candidates for contracting will
be the sequence of past contracts and reported payoffs.
However, it has been shown in the literature by

various writers (Ng and Stoeckenius [1979], Harris and
Raviv [1979]) that an incentive contract based on the
outcome alone will cause more hazard problems. The
question facing the principal will be under what cir-

cumstances can he design a long term contract to achieve
a Nash equilibrium over the planning horizon.

The discrepancy between the actual payoffs and
the reported payoffs under the imperfect state informa-
tion model does not behave as nicely as the countable
disturbance assumed in Chapters III through V. Since
the reporting function belongs to some functional space

which is Of infinite dimension and the principal has no

223

way of knowing what or how the agent is choosing the
reporting function, there is no reason to believe that
the disturbance is countable and has a well-defined
distribution function. It is uncountable because the
reporting function is unknown to the principal and it

is of infinite dimension, that is, the space is made
up Of an infinite number of independent vector basis.
The principal receives the value of the reporting

function, the mapping (or the how) of the actual payoff
to the reported payoff is never known to him. For this
reason, the principal is not able to induce the actual
payoff from the reported payoff. This causes more
serious problem in a multiperiod model. Since the
principal cannot induce the actual payoff given a
reported payoff in any period, it becomes more difficult
for him to define the whole sequence of payoffs even
stochastically given an initial value. Recall that the
payoffs form the state space for the multiperiod agency
model prOposed in the previous chapters, an alternative
state space must be defined before the problem can be
formulated. Such a state space must be inducible by

the knowledge of the past contracts and reported payoffs.

It must have a probability distribution conditioned on
the past contracts but independent Of the reported payoffs.

This condition is necessary to induce truthful reporting.

224

As indicated earlier, contracts based on reported

alone provides incentive for the agent to report un-
truthfully. It has been suggested that (Myerson [1979])
if the principal desires the agent to tell the truth
about a particular event, the contract should be drawn
independent of that event. It will be shown later in
this chapter that this alternate state space is used

as an argument for the incentive contract, it has to be
independent of the reported payoff to induce the agent

to report truthfully. Lastly, the principal must be
able to utilize the state space to design a long run

Optimal contract.

7.2: The Imperfect State Information Model (ISI)

Before proceeding to describe the imperfect state
information model, a remark on the solution is in order.
Since it is not possible to define the state space
stochastically at period 0, the Optimal contract,

which consists of a sequence of contracts over the time

periods, cannot be defined. The idea of a contract has
to be modified to a function whose image is on the inter-
val [O,l] and whose function value assign a probability
measure on all feasible contracts given the principal's
belief of the initial payoff and the past contracts

and the reported payoffs.

225

The model proposed here consists of the following
objects:

0 Payoff space

C Incentive space

Z Signal space, or the space of possible
reporting functions

é Information vector. Define for k = 0,1,...,N-l
Q = Z C .
o o

k "Ck-lzk

An element of Qk is called a kth
information vector, denoted by Qk.

k = O,...,N-l Incentive constraints.

These constraints exclude all contracts
which will give the agent an expected utility
less than Outside Opportunity set.

a Discount factor

g Net return function to the principal

s A conditional probability of the initial
signal given 0

s Conditional probability of Z given QC
N Time horizon: a positive integer or m

P Initial payoff probability space

The 181 model can now be described notationally as
follows. At period 0 the principal and the agent agree
on a contract ID with the principal's probability
belief Of getting the initial payoff mo to be p and
a reported payoff 20 with probability so given mo.

One can view 50 being the principal's assessment of

226

the truthfulness of the agent's reporting action.

The agent chooses ao Optimizing his expected utility
with outcome mo. He then reports 20 to the princi—
pal who then updates his information vector and revises

his belief on what he may receive as for the

21
possible contracts that he may entered into with the

agent as 12. Then 12

repeated. Of course, the biggest problem here is that

is agreed and the process

the only information that the principal has is the ¢k.
If he were to contract based on reports alone, it has
been shown (Ng and Stoeckenius [1979]) that such a
contract will induce non-truthful reporting.

At this stage of modeling, not much can be said.
A bit of comfort is that it can be shown (Theorem 6.3)

that a sequence of probability measures

Pk(QoZOCO...QkaCk)V,p) can be defined on
QOZOCo"'OkaCk given a contract n and initial dis-

tribution p. For notational ease, let Dk denote

the set Of all sequences of the form (w ,z I

O O'

’ I

o,..
wk'zk'lk) 6 OZC...OZC. This enables one to define the

N-stage total discounted expected return to the principal

corresponding to a contract F 6 H as

N-l

k
JN,F(p) = g [ggg a g(wkIIk)]Pk_1(Vop)de_lo
k-l -

For the infinite horizon model, then Jn = lim J

N4m N'V

227

In order to ensure the integral in JN v to be
defined, the following ad hoc condition is imposed

on the finite horizon model
JN,F(p) < m V v 6 H. p 6 P(Q).

This condition simply implies that the total expected
discounted return to the principal is finite.

For the infinite horizon models, to guarantee the
limit in the definition of JV to be well-defined,
one of the following conditions is needed.

(P) 0 g g(w,I) for every (w,I) 6 DC

(N) g(w,I) g 0 for every (w,I) 6 QC

(D) O < a < 1 and for some b 6 R, -b g g(w,I) g b

for every (w,I) 6 DC.

Condition (P) implies that the net return per period

to the principal g is positive. Condition (N) implies
that g is negative all the time. Condition (D) says
that g is bounded and the discount factor is less

than unity. Of the three conditions, condition (D)
certainly describes most common economic situations.

It is not unreasonable to assume that at least one of
the above conditions hold for the infinite horizon model.
As indicated earlier, based on the information

vector alone, there seems little h0pe to obtain a long

run Nash Optimal contract. However, by introducing a

228

monitoring device, a sufficient statistic, which is
defined below, the 151 model can be transformed into
a "perfect state information" (PSI) model. The process
of transformation will be described in the next section.
The sufficient statistic is defined in such a way that
knowledge Of its value is sufficient to design an
Optimal contract and control the system.

First, the term statistic is defined. A statistic
for the 181 model is a sequence (no,...,nN_1) of
functions nk: P(Q)i>k 4 5k where Ek is non-empty

with generic elements of E k = 0,1,...,N-1. Then

kl
for a statistic (no,...,nN_1) to be sufficient for the
151 model, all Of the following conditions have to be met.

(a) The statistic must guarantee that the

incentive constraint set Uk(¢k) can be recovered

from nk(p;¢k). This enables the PSI model which is

defined on Ek to search for an Optimum among the
same set of feasible contracts.
(b) It must guarantee that the distribution of

depends only on the values of E and I Thus,

k k'
can be used to construct the state

gk+1
the variables gk
space of the perfect state information model.

(c) It must guarantee that the net return function
to the principal 9 corresponding to a contract n can

be computed from the distribution induced on the

(gk'Ik) pairs.

229

What the sufficient statistic does is that the principal
can take the information vector and from there con-
struct an independent set of variables which he can

use to assess the performance Of the agent.

The 181 model describes very closely the reporting
function of an entity. Management performs their
routine tasks and decision making. A series of payoff
outcomes occur and management chooses a reporting
function to produce a set Of financial statements at
the end of each period. In most corporations, the
principal or the owner does not take part in both the
decision making and the reporting processes. The actual
payoff outcomes are then unobservable by them. Then
the financial statements are presented to the auditors
who perform the various audit tasks, make the necessary
recommendations for alteration and attest the financial

statements.

If one were to investigate the definition of a
sufficient statistic closely, it is not difficult to
see that it actually describes the audit function. The
auditor takes the reported outcome which is in fact
the information vector. After performing the audit tasks
and making the necessary changes, he produces the audited
financial statements. If an audit is performed in

accordance with the general accepted auditing standards,

230

it is believed that the audited financial statements
provides fair representation of the financial standing
of the company. In other words, from the audited

statements one can induce an expected payoff or return

to the owner which is exactly Condition (c).

The requirement that the distribution of §k+1
depends only on the values of 5k and Ik and not
on the information vector ¢i parallels the auditing
profession's emphasis on independence. Auditors are
not to be involved in the reporting function of the
company.

Condition (a) apparently is imposed more for
control purposes than reporting. However, this condi-
tion implies that the principal can use the audited
financial statements to search for optimal decision.
The possible alternatives induced by these statements
are identical with those if the actual payoffs are
known. This guarantees that the decisions made are
feasible and, as shown in the later section that, they
are Optimal also with respect to the actual payoff.
This fulfills the Objective of the financial statements
that they should provide relevant information for

decision making.

231

7.3: The Perfect State Information Model (PSI)

The manner that the sufficient statistic is de-
fined implies that given a distribution of the initial
payoff and the information vector, the principal can
transform the reported payoff into an expected payoff
with an independent probability distribution. If a

sufficient statistic is given, that is assuming it

exists, and its values can be computed from the know-

ledge Of P(Q) and § then the principal can define

k'
a perfect state information model in terms of Bk.
For notational purposes, a (A) is used to denote
Objects in the PSI model.

The perfect state information model consists of

the following:

E k = O. 1' o o o 'N-l State Space

kl
C Incentive space
A
Uk’ k = 0,1,...,N-l Incentive constraints
a Discount factor
A
gk, k = 0,1,...,N-l One period net return to the
principal
A
t , k = 0,1,...,N-2 Probability distribution of
k (e I: I)
k+1 ~k’ k

N Horizon

232

7.4: Reduction of the Imperfect State Information Model
This section studies the relationships Of the

various Objects between the 181 and the PSI models.

The discussion on the existence of a sufficient statistic

will be deferred until the next section. Throughout

this part Of the analysis, a sufficient statistic is
assumed tO exist. The Borel model as proposed in

Chapter VI guarantees the existence of an optimal solu-
tion (Theorems 6.13 and 6.14, Lemma 6.8 and Theorem 6.17).
By imposing the measurability restrictions on the E

and the incentive spaces, the PSI model consists of a

well-defined and measurable state space E with a

.—

probability distribution on E which depends on Ek

and I It becomes a direct application of the Borel

k’
model. Hence an Optimal contract for the PSI model is
guaranteed to exist. The natural question to ask is
whether or not the Optimal contract Obtained from the
PSI model is also optimal for the 151 model.

In order to establish correspondence between the
two models, the initial probability measure on E

O
must be ensured such that it can be induced from know-

ledge Of the distribution of the initial payoff (Theorem 7.1).
The inter-relationship between the probability distri-

bution on the set (wo,Io,zo,...,wk,Ik,zk) 6 Dk and

A
that on (EO,I given a contract n in

_r

O,...,1k_l,ek)

233

the PSI model is then derived (Lemma 7.1). The analysis
goes on to show that for every contract in the PSI
model, a corresponding contract in the ISI model can

be constructed (Theorem 7.2).

A
Next, given a particular contract w in PSI,

the total expected discounted net returns under the two
models are related by their respective probability
distribution (Theorem 7.3) and so are the Optimal total
return functions (Theorem 7.4). The final step is to
show the correspondence of the Optimal contracts between
the two models. I can only show a nearly Optimal con—
tract for the PSI model under the finiteness assumption
of the finite horizon model or the Assumptions (N) and (D)
of the infinite horizon model is also nearly Optimal for
the ISI model. However, correspondence for Optimal contracts
is shown for all assumptions Of both the finite and in-

finite models (Theorem 7.5).

The ISI model can then be reduced to the PSI model
for nearly Optimal contracts. In terms of the auditing
model, this implies that the nearly Optimal contracts

derived from the audited financial reports are as good

as those as if one were to Observe the actual payoff.

7.5: Sufficient Statistic

 

In this section, a sufficient statistic is prOposed
and shown to meet all the three conditions of a sufficient

statistic.

234

It is derived by a process called filtering. In
essence, filtering is very similar to the commonly
known process of Bayesian statistics. The system starts
with the initial wealth outcome wo which has a priori
distribution p. After 20 is observed, the distri-
bution is "up-dated". The up—dated distribution is
called a posteriori distribution and is shown to be
well-defined and unique (Lemma 7.6). At the kth
stage, there will be some a priori distribution p;
of “k based on dk_1. Contract Ik-l is negotiated,

some 2 is observed and an a posteriori distribution

k
of wk conditioned on (¢k-l'Ik-l'zk) is computed.
This distribution is again well-defined and unique.

The process of passing from an a priori to an a pos—
teriori distribution in this manner is called filtering.
Then the sequence Of a posterior distributions of

wk' pk: P(Q)4>k 4 P(C) is shown to be a sufficient
statistic (Theorem 7.6).

The filtering process seems to capture very closely
the audit function. The auditor comes into engagement
with a client. Based on initial interview with manage—
ment and evaluation of the company's internal contral
system, the auditor form some a priori Opinion about
the 'correctness' of the reported outcome, or what the

initial payoff should be. After performing the necessary

235

substance and compliance tests, the audit will up—date
the distribution. Such a distribution will again be

up-dated in subsequent periods based on the prior years'
working papers. The audit report thus become the out-

come of the filtering process.

In the process described above, again I am assuming
that there is no incentive nor moral hazard problems
exist between the principal and the auditor. The auditor
is acting strictly in the best interest Of the principal.
If the auditor is performing the audit tasks in the
best possible manner, the audited financial statements
is adequate for the derivation of a long-run Optimal
contract to be negotiated between the principal and

the agent.

CHAPTER VII*

IMPERFECT STATE INFORMATION MODEL

7.1* Introduction

 

The imperfect state information model consists
of two stochastic time series, namely the payoff
and the signal on the payoff. Obviously, the later
series is parameterized by the first one. In terms
of the economic model, the payoff is the wealth
outcome of a particular action selected by the agent.
He observes the payoff and then decides on the
manner that the outcome is to be reported to the
principal.

The information which is available to the

principal is a vector of the following form

Q = Z C

k 0 o...ck_lzk, k = O,...,N-l

where C is the incentive space and Z is the signal
space both Of which are assumed to be nonempty Borel
spaces. To the principal, since he has no direct

control on the form and magnitude of Zk, the report

or signal in his mind is nothing but a random event

236

237

stochastically generated via a signal kernel
s(dzk+1)Ik,wk+1). The zk+1 Signal is then added
to the past Signals and incentives (20.10,...,zk,1k)

to form the (k4—1)st information vector
Qk+l = (ZO'IO""’zk'Ik'zk+1)' The first information

vector o; = (20) is generated by the initial

Observation kernel s0(dzo)mo), and the initial pay-
off wo has some given initial distribution p.
Since Qk is Observable by both the agent and the
principal, it becomes a basis for incentive contracting.
However, it is well-known that any incentive based
solely on Qk is likely to induce misrepresentation

of the outcome. The following section will develop a

sufficient statistic for the payoff, or more specific,
a control on the reporting function.
To describe the system in terms of p, the

initial distribution of mo, define

t(B)w.I,J) P([Y= f(w,I,J,y) 6 B})w.I.J)

p(f'1(B)(w,LJ)lw.I.J)

for B 6 60.

Thus t(Blw,I,J) is the probability that the (k+—l)st

state is in B given that the kth state is w.

the kth contract is I and the kth total net

return to the principal is J. Alternatively, the

 

238

system can be viewed as moving from wk to wk+1

via the state transition kernel t(dwk+11wk'Ik'Jk)

and generates a return to the principal of g(wk'Ik)‘

7.2*

The Imperfect State Information Model (ISI)

The model will consist Of the following objects

and their corresponding assumptions

0

Payoff space: a nonempty Borel space

Incentive space: a nonempty Borel space

Signal space: a nonempty Borel space

Incentive constraints: For k = O,...,N-l,
let 6k = Zoco'°'Ck-lzk' An element Of
ék is called a kth information vector.

For each k, U is a mapping from ék

k
to the set of nonempty subsets of C such

that

I‘k = {(oka) : grk 6 6k. 1k 6 Uk(Q’k)]
is analytic
Discount factor: a positive real number
Payoff to the owner: an upper semianalytic
function from PR to R*
Initial signal kernel: a Borel-measurable
stochastic kernel on Z given 0
Signal kernel: a Borel-measurable stochastic

kernel on Z given CQ

239

t State transition kernel: a Borel-

measurable stochastic kernel on 0 given CC
N Horizon: a positive integer or a
P Initial payoff probability space: a nonempty

Borel space.

Definition: A contract for ISI is a sequence
v = (“O""'uN-l) such that, for each k, k = O,...,N-l.
“k(dIk‘P7¢k) is a universally measurable stochastic

kernel or C given P(Q)§k satisfying
uk‘Uk%>1P:¢k> = 1 V (p.¢k) e p(omk.

If for each p,k and , (dI p:¢') assigns mass
“1: k k

one to some point in C, v is said to be non-

randomized. Let U denote the set of all contracts n.

For ease Of notation, let Dk denote the set of
all sequences of the form (mo,Zo,Io,...,wk,Zk,Ik) 6
QZC...QZC. Thus, given p 6 P(Q) and
n = (”o""'uN-1) 6 U, by Theorem 6.3, there exists
a sequence of consistent probability measures
Pk(n,p) on Dk’ k = O,...,N-l defined on measurable

rectangles by

240

Pk(1r.P)(QoZogo. . 'ﬂkEk-gk)

= IQbIZOIS-o...jﬂkjgk Uk(9_~k‘p7zorlol°-°Ilk_lzk)
S(dzk‘Ik—l'wk)t(dwk‘Ik—l'wk—l)
...uo(dIolp:Zo)so(dZoIwo)p(dwo)

where .Q 6 B Z.6 E , 9-6 B

0' Z C'

Definition: Given p 6 P(Q), a contract

v = (“0""'uN—l) 6 H and a pOSItlve integer K g N.

the K-stage payoff corresponding to n at p is
K-l

k
JK'F(p) = ‘er-i[k§o a g(kak) ldPK_1(Tr.p)-

If N < m, the total net return to the principal

corresponding to v is JN v' For the finite horizon

model, either one of the following conditions on the

integral is assumed

N-l
a

[Z k+

(F-) fDN-l k=0 9 (wk.Ik)]dPN_1(7r.p) < a

V v6U.p6P(O).
+ N-l k _
(F > fgh 1 IQ§B a g (wk.1k)]dPN_l(v.p) < .

V Tr61'I.p6P(Q).

Then, JN n can be rewritten as follows
I

N-l k
JN’W(p) = gig a ka g(wk.1k)dpk_1(v.p)

241

If N = a for infinite horizon models, then

J = lim JN n’ the return to the principal corres—
N-Hn '

ponding to n. In order to ensure the limit JV is
well-defined in R*, one of the following conditions
is assumed on g

(P) 0 g g(w.I) for every (w,I) 6 0C.

(N) g(w.I) g 0 for every (w.I) 6 OC.

(D) O < d < l and for some b 6 R,

-b g g(w,I) g b for every (w,I) 6 0C,

Hence

J7me) = ki? 03‘ J‘Dk g(wk.Ik)de(7T.p) V W e n. p 6 pm).
=o

The concepts of Optimality at p, Optimality,
e-Optimality at p and e-Optimality of contract are

analogous to those given in Section 3.3*.

To aid in the analysis of the imperfect state
model, a sufficient statistic is introduced. It is
defined in such a way that knowledge of its value is
sufficient to design an Optimal contract and control

the system.

Definition: A statistic for the model ISI is a

sequence (no,...,nN_1) of Borel—measurable functions

H

nk: P(Q)§k 4 E where :k is a nonempty Borel space,

k = O,...,N-l.

242

Definition: The statistic (nb,...,nN_l) is

sufficient (SS) if:

(a)

(b)

(C)

For each k, there exists an analytic

A

A
set Pk C EkC such that prOjk(Tk)

D—l

=:_,k
and for every p 6 P(Q)
A
Pk = [(¢k.I)= [nk(p:Q&).I] 6 TR}-

-Ae-t
Define Uk _k) — ( k)gk.

There exist Borel-measurable stochastic

A
kernels tk(dgk+l‘§k'lk) on :k+1 given
EkC such that for every p 6 P(Q).
n 6 H. E 6 B: , for k = O,...,N-2,
—k+l "k+1

we have that

13k,fk) = Pk+1(w'p)[nk+l(p7ak+l) 6 Ek+1I

nk(P7¢k) = Ek'Ik = ik]

A D—O
tk(:"k+1

for Pk(n,p)-almost every (Ek,Ik), that is,

the set

A _
{(wolzoIIoIOOOIwkozklIk) 6 11k 3 tk(:k+1‘zk'lk)
= Pk+1(WaP)[’] When 5k = nk(P:¢k)IIk = 1k}
has Pk(n,p)-measure one.
There exist upper semi-analytic functions
A
gk: 9k 4 R* satisfying for every p 6 P(Q),

TFEH' k=O'Ooo'N-l'

243

for Pk(n,p) almost every (gk'lk) where
the expectation (in the sense of an outer

integral) is taken with respect to Pk(n,p).

Condition (a) of the definition of a sufficient

statistic guarantees that the incentive constraint set
Uk(¢k) can be recovered from nk(p:¢%). Indeed, for
any p e P(o). ok 6 ek, k = O,...,N—l
A
ukwk) — Uk[nk(p:¢k) ]-
If Uk(¢k) = C for every gk 6 ék' k = O,...,N—l,

A

then condition (a) is satisfied with Pk = EkC. This
is the case of no incentive constraint. Condition (b)
guarantees that the distribution of gk+1 depends only

and I . Thus, the variables

on the values of ER k

gk can be used to construct the state space of the
perfect state information model. Condition (c)
guarantees that the net payoff to the principal

corresponding to a contract can be computed from the

distribution induced on the (gk’Ik) pairs.

7.3* The Perfect State Information Model (PSI)

 

If a sufficient statistic as defined above exists

and its values can be computed from the knowledge Of

 

244

P(O) and then a perfect state information

@k'
model can be defined in terms of Ek. For notational

purposes a (A) is used to denote Objects in the

PSI model.

Definition: Let the ISI model and a sufficient
statistic (q0,...,nN_1) be given. The perfect
state information Optimal incentive model (PSI)

consists of the following:

Ek, k = O,...,N—l State space

C Incentive Space

A
Uk' k = O,...,N—l Incentive constraints
a Discount factor

A

gk, k = O,...,N-l One-period net return to
the principal

A

tk' k = O,...,N-2 State transition kernel as

defined in Definition SS(b)

 

Theorem 7.1: Define m: P(Q) 4 P(EO) by

co(p) (50) = (Q souzo: T10(p:Zo) e gollwommo)

E 6 E.
w :
O

for every

Then m is Borel-measurable.

245

Proof: AS defined, m(p) is the distribution
of the initial state 60 in PSI when the initial state

we in ISI has distribution p. Then

w§ﬂ(wo.p) = SOHZO: T(o(p:Zo) 6

)Iwo)

'—
H
H

is Borel-measurable for every 50 6 65 by Corollary

6.1.1. Define a Borel-measurable stochastic kernel

on Q given P(Q) by q(dwolp) = p(dwo). Then
co(p) (i0) = f wEo(wo.p)q(dwo)p).
By Theorems 6.1 and 6.4, m is Borel-measurable. QED

A .
Theorem 7.2: If n = (”O'°'°'GN-l) is a contract

 

for PSI, then the sequence

A A. .
(uo(d10)no(p7¢o)].....pN_l[dIN_l)no(p.¢6).Io.

""IN-2'nN-l(p7¢N-1)]

where Ok = (Z I k = O,...,N-l is a

o, 0'...'Ik-1'Zk)
contract for ISI.

Proof: Condition (a) of the statistic {nk} to
be sufficient guarantees that the incentive constraint

set can be recovered from nk(p:Ok). Since

I

k («A(J) : ok 6 ek. I e Ukwkﬂ

(Maggi) : [R(pngng 6 9k}

and

A — P

246
Thus for and p 6 P(Q), Ok 6 ék’ k = O,...,N-l

A
Uk(gk) = Uk[ﬁk(P:¢k)]

and the result follows. QED

Theorem 7.1 ensures that the initial probability
measure on E can be induced from knowledge of the
distribution of the initial payoff. Theorem 7.2 provides
the very preliminary result that for every contract for
PSI, a corresponding contract for ISI can be constructed.
Obviously, the question to ask is whether the Optimal

contract for ISI can be induced from this Optimal con-

tract for PSI. Before exploring this question, a few

definitions and notations are in order.

Definition: For p 6 P(Q), define the mapping

VP,k : DR 4 :OCOH'Eka by
vp’k(worzovlooo:-owkozkrIk)
= [no(p:¢o).Io....,rk(p;o’k),1k]

where Theorem 7.2 holds. Thus, for q 6 P(Eo) and
A'- A A (i h ' f

n — (“O""'uN-1) 6 , t ere :s a sequence O con-
sistent probability measures Pk(v,q) generated on

SOCO...Eka, k = O,...,N-l, which are defined on

measurable rectangles by

247

A A _‘ C H C )
Pk (7r,q) (::. . . .:._k —k

A A
= I50 (C ...jgk pk(gkl§o.xo.....Ik+1.ek)tk_1(dikl§k_1.1k_1>
A
...uo(dIOI§o)q(dEo)

where E 6 B: , g_ 6 B
2k "k k Ck

A+ A- A A A
When (F ). (F ). (P), (N) or (D) holds, the net
total return to the principal corresponding to a contract

9 for PSI at E 6 E is

O
N-l
A k A A A
J A == 23 a I g (E .I )dP (v.p )
N,W(§) k=0 E c ...: c k k k k 5

o O k k
where N 6 [1,o] and the optimal return for PSI at

E E is
. 6 o

7.4* Reduction of the Imperfect State Information Model
This section is devoted to study the relationships
between net returns, Optimal and nearly Optimal con-

tracts for the ISI and the PSI models. First, for a

A .
contract n for the PSI, these Objects are related to
A
the probability measures Pk(n,p) as defined in

Section 7.2 in the following manner.

248

A A
Lemma 7.1: Suppose p 6 P(O) and n 6 H. Then

 

for k = O,...,N-l and for every Borel set

B C SOCO...:ka, we have

A ... AA
Pk(v.p)[vpfk(B)] = Pk[v.e<p)](B).
ea...

=0 e
0

Proof: It suffices to prove if

c 66 ,...,E 65:, g 63 then
-0 Co k Ck k Ck

A
Pk(7T.p) ({no(p:¢o) 6 50 Io 6 Co.....nk(p:(7fk) 6 5k. Ik 6 9k?)
_ A A _ _

— Pk[7r.co(p) ] (so So: - --:_~=_k 2k)

where

(no(p:oo) 6E0. IO 6 Co.....nk(p:q() 6 Ek' Ik 69k]

{(wo.zo.10.....wk.zk.1k) : no(p:¢o) 6 50.

IO 6 co.....nk(p:¢k) 65k. Ik 69k)

oj = (20,10, . . . ,Ij_1,zj).

For k = O, the result clearly holds by the definitions

A A
of Po(Tr,p) on Do, cp(p)(;;o) and Po(7r,q) on :OCo.

_ A A
If Pk(r¢.p)[vp}k(B)l =Pkiv.co(p)](B) for k<N and

since

A
tk(-‘ik+i‘§k'lk) = Pk+i(”'P)[“k+i(P’¢k+i) 6 546+“

“WP a): 5k: 1k: EEK]

then

249

'1 EC!

A
Pk+1(Tr.p) ({no(p:¢o) 6 O 0

Am

'°°'nk+l(p7q<+l) 6 §k+i'1k+1 E -C—k+l})

A
= .. I .. I “k+i(9k+i‘5k+i)
{110(1): (30) EEOIJOESOI'°'IT)k(pI gk) GEk'Jkeg-k ik+1

A A
tk(d§k+1|nk(p:¢k).Ik)de(TT.p)
' c. C p C I ”k+1 —k+l‘§k+l)tk( ’k+l gk'Ik) th'w‘PH
Zo-T,"’:k-4< Zk+1
A
= P1:.+i[”“‘3(p)](io 90' ' 'ik+i E-k+1)° QED

The next theorem establishes the relation between
the total net return functions to the principal for the

. A
two models for a given contract n.

_ A_ A
Theorem 7.3: (F+,E+)(F ,F )(P,P)(N,N)(D,D). For

A A
every p 6 P(Q) and v 6 n. we have

A A '
J A(p) =1 J A(:o)co(1:>><<:iao).
7T :2 TT

N. o No

Proof:

A
i J A<:)e<p>(d:o)
:0 N,V

A A A
=I Z) a f gk(§k.Ik)de(vr.pE)w(p)(déo).

O k=O COCO...;ka

BY (F+) or (F‘) if N < a, by monotone convergence
theorem if N = o and under (P)(N) and by bounded

convergence theorem when N = m and under (D)

250

N‘1 k A A A
= a i I gk(§k.1k)de(W.pE)w(p)(dEo).

k=0 :0 :OCO. . . ka

III

A A
By definition of Pk(n,pE) and w(p)

N-l A A A
= 2 (1k f_ gk(:k.ik)de[rr.co(p)]-

k=O :OCO...:ka

By Lemma 7.1 and condition (c) of a sufficient statistic

N-l

= 2: (1k 9( I )de (VIP)
k: 0 ka L”k' k

= J A(p). QED
N,F

Before proceeding to prove the correspondence of
the Optimal return functions and Optimal contracts, a

few other results are studied. The following lemma
defines the Optimal return function for PSI in terms of
the initial payoff probability measure p. Next, the
relationship between the Optimal return functions for

the models for p 6 P(O) is explored.

A+A_AAA
Lemma 7.2: (F )(F )(P)(N)(D). For every p 6 P(Q)

 

A*
i JN(eo)e<p)<d§O)

O
(J (6 o)m(p)(d§o )
Eo N,

n.

:1)“,
:l>'U

251

:1)

A
Proof: For p 6 P(Q) and 6 H

A-k A
l JN(§o)e(p)<d§O> 2 J A(Eo)m<p><déo)

0 :0 N,TT
A E
2 in g JN #(-o)m(p)(d§o).
W6U "O '

A
Let c > O and let 7’ 6 ﬁ be e—Optimal. If

A*
n<p>((:olJN(:o) «J) = 0.

Then

I

A A
I J A (50)m(p)(d50) ; f J;(§o)m(P)(dEo) - e
: N,F E

O

O

. A A*
.. iuR ; JN #(Eo)w(p)(d§o) ; i JN<:o)e(p)(d:o)
WEU ”o ' ”O
If
A*
co(p) ((50 : JN(50) = 00)) > 0.
Then
A E
i J A(so)e(p)<d.0)
:0 N,W
A* A*
2 (J: JN(EO)cp(p) (ago) - e + :p(P) ((90 : JNGO)
(so JN(EO)<w)
If
A*
f JN(;O)e(p>(d;o) = -e.
*
{EonN(Eo)<w}

252

then

A
f J§<§o>w(p><d§0) = -e

H

11

O
and

A A*
sup 1 J A(:O>n<go> ; I JN(§O)m(p)(d§o).

AA
WEE o N’W o

[I
11

Otherwise, the right-hand is finite for any a > O

and
I 9 <5 ) (g )
SUP A cm 0 =o
#6ﬁ :0 N'”
. A* A
.. i JN(go)e(p)<dzo) = in? i JN #<:O)e(p><dgo).
"o W6H ”O '
QED

+ A+ _ A__ A A A
Theorem 7.4: (F ,F )(F ,F )(P,P)(N,N)(D,D). For

 

every p 6 P(Q)

*

A9:
JN(p) g JN(§O)e(p)(d€O).

INC—J

0

Proof: By Theorem 7.3

*
J (p) = SUP J (p) 2 sup J A(p)
N WGH N’W — #6ﬁ N,U
A
= sup I J A(SO)O(P)(d§O)-
A A E N,F
V6H 0
By Lemma 7.2
.... * d
- i JN(;O)e(p)( so). QED

253

The next result shows the correspondence of

contracts for the two models.

_A_ A
Lemma 7.3: (F+,£+)(F ,F )(P,P)(N,N)(D,D). Let
A A
p 6 P(Q) and F 6 H be given, there exists n 6 H

such that

JN'F(p) A(go)w(p)(d§o).

N,W

II
IIIC—‘a
q

Proof: Let p 6 P(Q) and

:1
ll

(uo.....uN_1) 6 U

be given. For k = O,...,N-l, let Qk(v,p) be the
probability measure on E C defined on measurable
rectangles to be
QkUanE,k 9k) = Pk(J.p)({nk(p:¢g) 6 5k. Ik 6 Ckl)
\
H a Borel-measurable stochastic kernel hk(dIk)Ek) on

Ck given Ek such that for every Borel set
B

M

C C

k k’

A
ok(v.p)(B) = ukusg Isk)dok(w.p).

III

I
kck k

In particular,

1 Pk(J.p)({¢'.Ik) 6 Tki)

A
= Pk(v.p)({[nk(p:¢().1k] E Tkl)
A
= Qk(W.p)(Tk)

A A
= f uk(Uk(€k)l§k)ko(J.p).
”kck

II

254

A
By altering uk(dIkl§k) on a set of measure zero if

necessary, we may assume that

A
Qk(W.p)(B) = f uk(ngI§k)ko(V.p)

ka

III

A A
and pk(Uk(§k)l§k) = l for every 5k 6 Ek' Let

A_A A
v - (no.....uN_1>.

PSI. Since

A .
Then TT‘ is a contract for

w(p)(§o) = (Q so({zo: no(p:zo) e gollwo)p(dwo>

v es:

:0
0

Clearly, the marginal of Qo(n,p) on E0 is ¢(p)

and for k = O, by Lemma 7.1

, I 6 C

A A
Pomcupungo e- 0 _kn.

lo

QO(J.p) (Eb _C_o)

Now assume that
= c - A A E J“ 6 ‘

Then

Qk+1(rr,p) (Ek+l —C-k+l)

A
= e fc “k+1‘9k+i‘-Ek+i)d0k+1”'p)
Zk+l—k+l

A
( ( -¢j )6 e w Uk+l(E-k+l‘nk+l(p7gk+l))dpk+1(Tnp).
T“k+1 p. k+1 :k-l-l"

By condition (b) of Definition SS

II
tﬁ

E-k 9k ik+ l

= I I

E C ...= C

255

A A
L1k+1 (EMU Ek+l) t (dgk+l‘ nk+1 ‘97 qt”) 'Ik) de (”'P)

A A
i “k+1‘9k+1‘Ek+i)t(dgk+i‘gk'IkakW'P)

A A
LJ1~'.+1(£k+ll §k+1)t(d§k+llgi,1k)

O O ”k k 5k+l

A A
deiF:w(P)]

A A ._
= pk+l[rr.o(p) ] ({§k+1 6 ik+1'1k 6 Elwin

m

For 'k 6

Qk (mp) (5k 9k)

By Theorem 7.3
JN,n(p)
Definition:
6 ﬁ is said to

i J A(to)q(deo>

Z
q

a, ,.g e e , k
Ck k C

ll
0
Z

l
...-l

A A
= Pk[J.co(p>]({§k 6 518 1k 6 gkl)

N,# (so)e(p)(dgo). QED

ll
NIL-o
Cl >

0

Let q 6 P(Eo) and e > O, a contract

be weakly q-e-Optimal if

A* _ A*
i JN(EO)q(d§o) - e If i JN(§o)q(d§o) < an
H "O

' A* e _
1f JN(§o)q(dso) — w

“FL—a

O

A
The contact v is said to be q-Optimal if

A A*

256

Lemma 7.4: Let J: O 4 R* be upper semianalytic.

 

Then for e > 0, there exists u E U(ClQ) such
that TU(J)(w) 2 T(J)(w) - c for every w E Q,

where T(J)(w) - 6 may be a.

Proof: By Theorem 6.14, there are universally
measurable selectors um: Q 4 C such that for
m=l,2,... and w€§7. UmEUhL‘) and

T(J)(w)-c if T(J)(w) (en

T (J)(w) 2
”m ' 2m if T(J) (w) = a

Let u(dIIw) assign mass one to u1(w) if

T(J)(m) < m and assign mass 1/2m to um(w).

m = 1,2,... if T(J)(w) = a. For each §_€ BC
Xg[“1(‘*')] if T(J) (w) < on
u(§Jw) = a
Z 3; xcfummn if T(J) (w) = a»
m=l 2 -—

is a universally measurable function of w.

u is a universally measurable stochastic

kernel with the desired properties. QED

Lemma 7.5: (F-) If JO: 0 4 R* is identically

 

zero, then Tk(JO)(w) < a for every w E O. K = l,...,N.

Proof: Suppose for some K g N and mo 6 O

that for every w G O

T3(Jo)(w> < a j O,...,K—l

257

and
k
T (Jo)(wo) — w
3 universally measurable selectors uj: Q 4 C

j = 1,...,K-1 such that uj(w) E U(w) and
(TuK_jTj'1) (JO) (w) 2 Tj (Jo) (w) - 1

j = l,...,K-1 V m 6 Q,

Then
(Tul...TuK_1) (JO) 2 (Tul...TuK_2)[T(JO) - 1]

; (Tu1~--TuK_3)[T2(JO) — 1 — a]

K-2

2 TK-1(Jo) - (l-ra-+..u+a ).

By Lemma 7.4, there is a stochastic kernel

u E U(CIQ) such that (TuOTK-l)

O (JO)(wo) = w. Then

K-l(

(TuoTul.-.TuK_l)(JO)(wo) g TuofT Jo)-

l-d-...-dK-2]

(w )

0

Choose any u E U(CIQ), let W = (u ,...,uK_1.ul,...,u)

so that v E H

K-l

k
2 d gdq (mp )
k=o I k u’o

J )

K,7T(wo

(Tuo...TuK_1) (Jo) (mo) = co

for some k g K-l f gqu(v,pw ) = a.
o

This contradicts (F'). QED

 

258

r ..A- A A A
Theorem 7.5: (F+,B+)(F ,F )(P,P)(N,N)(D,D)

 

* A* :
JN(p) = i JN(§o)m(p)(dso) v p e p(n).
”0

Furthermore, if 9 is Optimal, m(p)-0ptimal or
weakly m(p)-—e-0ptimal for PSI, then 9 is optimal,
optimal at p or e-Optimal at p respectively for ISI
If 9 is e-Optimal for PSI and (F-,§-), (N,§) or

A
(D,D) holds, then 9 is also e—Optimal for ISI.

*
Proof: JN(p) sup JN,v(p)

WEE

A
inf/3‘; J A(Eo)co(p) (GEO)

WEN o N'W

||\/

A*
= i JN(§o)w(p)(dEo)-
“o

By Lemma 7.3, then

* Air
JN(p) = l JN(EO)m(P)(dEO) v p 6 9(0).

ha
.1

O

A A
Let v be e-0ptima1 for PSI. Clearly, under (N,N),

A* V _.
JN(§O) < a :0 e :‘o

A*

A
J A(EO) g JN(§O) - e V :0 6 30'

- A- A*
Under (F ,F ), by Lemma 7.5, JN(§o) < m, and again
the above holds. From Theorem 7.3 and the first part

of this theorem

259

, A
J (p) = J (5 )co(P) (<35)
Nu? E-[zo N7]; ° o
A
2; J;(:O>cp<p) (d‘io) - a
-0
*
= JN(p) - e

A
F is e-optimal for ISI.

Similar arguments hold for the rest of this theorem. QED

7.5* Existence of Statistics Sufficient for Contracting

This section presents two statistics which are shown
to be sufficient for contracting. The first statistic
is derived by a process called filtering. To aid the
discussion of filtering, there is the following basic

lemma.

Lemma 7.6: For the ISI model, there exist Borel-

 

measurable stochastic kernels ro(dwo‘p:zo) on 0
given P(Q)Z and r(dwlp;I,z) on 0 given P(Q)CZ

which satisfy

(A) if) so(§o‘wo)P(d%) = g; ro(Qolp:zo)so(dzOlwo)p(dwo)
—o -0
V jg 6 30. go 6 52. p 6 P(Q)
(B) fs<§lx.w)p<dw> = J“ j r<§lp:1.z>s(dzlx.w)p(dx)
£2 0;

VQGBO.§EBZ,p€P(G),I€C.

 

260

Proof: For fixed (p;I) E P(O)C, define a
probability measure g on OZ by specifying its

values on measurable rectangles to be (Theorem 6.3)

q<gglpm = J" s<§lI.w)p(dw).
D

By Theorem 6.1 and 6.4, q(d(w,z)lp:I) is a Borel-
measurable stochastic kernel on OZ given P(Q)C.
By Corollary 6.2.1, this stochastic kernel can be
decomposed into its marginal on Z given P(Q)C
and a Borel—measurable stochastic kernel

r(dwlp:I.z) on 0 given P(Q)CZ such that the

theorem holds.
The existence of ro(dwo‘p;zo) is proved in a

similar manner. QED

The system starts with the initial wealth outcome
mo which has a priori distribution p. After Z0 is
observed, the distribution is "up-dated". The up-
dated distribution is called a posteriori distribution

and will be shown in the next lemma to be just

ro(dwolp:zo). At the kth stage, k 2 1, there will

be some a priori distribution pé of wk ‘based on

¢k_1 = (zo'Io""’Ik-2'zk—l)' Then, a contract Ik-l

is negotiated, some 2k is observed, and an a posteriori
distribution of wk conditioned on (¢k_1,1k_1,2k)

is computed. This distribution is just r(dw‘péylk_1,zk).

261

The process of passing from an a priori to an a
posterior distribution in this manner is called
filtering. The process is formalized in the following

definitions.

Definition: The function f: P(O)C 4 P(Q) is

defined by
(C) f(p:1)(5_>) = ft(glw.1)p(dw) Vg e 60
is called the one stage prediction equation.

By Theorems 6.1 and 6.4, f is Borel-measurable.

Definition: Given a sequence gk 6 ék such that
¢k+1 = (gk'lk’zk+l)' k = O,...,N-Z and given p 6 P(Q),

define recursively
(D) po(p:¢6) = ro(dwolp:zo)

(B) pk+1(p:¢k+1) = r(dwlf[pk(p:¢k).Ik]:Ik.Zk+l)
k = O,...,N-Z.

Thus defined, for each k, pk: P(Q)<I>k 4 P(O) is
Borel-measurable. Equations (A)-(E) are called the

filtering equations corresponding to the ISI model.

Lemma 7.7: Let the ISI modle be given. For any

 

p 6 P(Q), v = (u0,...,uN_l), v 6 H and ER 6 60, then

pk (TT,p) [wk 6 _Q‘kigk] = Pk (p7gk) (.Qk)

for Pk(n,p) almost every ¢k, k = O,...,N-l.

 

262

Proof: For any §% 6 B and 6 B

0 go

2

{2 £2 ] po(p:zo)(§%)dpo(v.p) = {2 £2 3 ro(§blp:zo)dpo(n,p)
L 0 ‘0 o -0'

= g g ro(£blp:zo)so(dzo‘wb)

—.o
P(dwo)
= g sosolwapwwo)
—o
= Pomp) (mo 6 530. 20 e gob.

The theorem follows for k = O by the definition of

conditional probability. Now assume the theorem holds

for k. For any 2k 6 HQ , 9k 6 Eb, Zk+l E 5% and
k

£§+1 6 50' Then

r 6? I 6C 2 'Z 1pk+l(p:Q&'Ik'zk+1)(£&+1)de+l(V.P)
\%( —k' k —-k' k+lt_k+1.

= f f Pk+i(P’Qi<' "k+1) (Qk+l)
Wkeq’k} 9k 1im £k+l
s (dzk+lllk’wk+l)t(dwk+llwk’Ik)

“k (dlklp: (Pfk) de (mp)

I f pk+l(p7¢k'Jk' zk+l) (ﬂkn)
k -C-k 531t+1 £194

S(dzk+l‘Ik"”k-+l)t(dmkstl‘""k'Ik)
uk(dIkIp:ﬂk)[pk(p:¢k)(dwk)]de(v.p)
= I I I I f Pk+1(P’¢k'Ik'zk+i) (Bk-t1)
{65%} 9k 9k ﬂk+l -Z—k+1
S(dzk+i‘1k"”k+1)t(dwk+1‘“’k'1k)

[Pk(p7¢k)(dwk)]uk(dIklp:¢k)de(F.p)

'DL’:

263

= I I J‘ r(_Qk+1lEi-pk (D:gk) 01k]71kt zk+l)
Wkeq’k] gt ﬂk+l gk+1

s (dzk+l‘Ik' wk+1)f{pk(p7¢k)'1k](dmk+l)

Uk(dIkIP7¢&)de(VoP)

= {(3 g: l g ilk S<Ek+l‘1k'wk+l)f[pk(P:¢k)'Ik](dwk+l)
k k —k —-+1
p,k(dlklp:¢k)de(TT:P)

a £1 s(-Z—k+lllk'w1v<+l)t(dwk+1lwk)[pk(p;¢k) (d‘kn
—- i-+l

uk(dIklp:¢k)de(W.p)

= Wk‘gik} i

= f I f I s(g-1~:+llIk'mk+l)t(du’k+llmk'1k)
kEQk} 3% 9k §h+1
uk(dIkIp:¢k) [pk(p:¢k) (dwk) lde(Tr.p)
= J‘ I I S(§k+1‘1k'wk+1)t(d°’k+1‘“’k'1k)
Wight} 9k 9k+i
Uk(dlkip79&)de(FIP)
= Pk+l(Tr'P) (Wit ‘5 q’k' Ik E 9k' Ll’k+1 ‘5 9k+1' zk+1 ‘5 §k+l})°
By the definition of conditional probability

Pk+1("'P) ”k+1 E £1k+1mi<+1] = Pk+1(P’¢k+1)(9k+1)

for Pk+1(v,p) almost every Qﬁ. QED

Theorem 7.6: For the ISI model, assume that

 

Uk(w) = C for every w 6 Q and k = O,...,N-l. Then

the sequence [po (p; (30) , . . . ,pN_1(p;¢N_1)] defined

by (D) and (E) is a sufficient statistic and the

resulting perfect state information model is stationary.

4' .-

264
Proof: Let Ek in the Definition SS be
P(Q), k = O,...,N-l. Since pk: P(Q)§k 4 P(Q) is
Borel-measurable for each k (p0,...,pN_1) is a
A
statistic. Let Pk = P(Q)C, k = O,...,N-l. Then

condition (a) of Definition SS is satisfied. For

g 6 P(Q), I 6 C and 5_6 E§(Q), define

_z_<§.I._E_> = {z e z=rtdwlf<§.1);1.z] e a}
A
t(§IE.I) =fJ“swans)I1.w’]t<dw’lw.1)a(dw).
O Q
But §j§,I,§_ is the (§,I)-section of the inverse image

of §_ under a Borel-measurable function. The

stochastic kernel

Hahn) = f 5(§11.w')t(dw'lw.l)
Q

is Borel-measurable by Theorems 6.1 and 6.4

A(_z_l:.1)

.fJ‘S(§i1nn3twdw’hmIJE(dw)
Q o

fk(§1w.1)§(dw)
Q

is Borel-measurable by the same theorems. By Theorem
A

6.1 and Corollary 6.1.1, t(d§’l§,1) is a Borel-

measurable stochastic kernel on P(Q) given P(Q)C.

For H 6 H, p 6 P(Q), E 6 B and k = 0,1,...,N—2,

P(Q)
by Lemma 7.7

 

265

Pk+1(v.p)[pk+1(p:¢k+l) 6 Elpk(p:¢k)

= Pk+1(T"oP)[zk+l E E(Ekllk'iipk(p7¢k) = Ek' Ik = 1k]

= E[Pk+1 (Tr'p)[zk+1 E g(gk'ik’E) i¢kiIk1ipk(p7¢k) = Ekllk =

Big; J; S[§_(Ekriko§_) ilk! (”k-+1]

t(dwk+llwk.1k)[pk(p:¢&)(dwk)]iPk(p:¢k) = Ek.1k = ’

A _ - _
= t(_':'_‘EkII-k)

for Pk(v,p) almost every (Ek'ik) where the

expectations are with respect to Pk+1(v,p).

Condition (B) is satisfied.

For H 6 H, p 6 P(C). and k O,...,N-l. By Lemma

7.7

E[g(uﬁ<vlk)‘pk “37%) = Ek' Ik = ik]

E{E[g(u>k.Ik)I¢K.Ik](pk(p;¢k) = Ek' Ik = ik}

E(gg(wk.1k)pk(p:¢k)(dwk)1pk(p:¢k) = Ek, Ik = 1k]

f g(wk.fk)Ek(dwk)

Q

for Pk(V,p) almost every (gk'ik) where the expecta-
tions are with respect to Pk(v,p). The function

A
g: P(0)C 4 R* defined by
A _ ..
g(E.I) = I g(w.I>E(dw>
Q

is upper semianalytic by Theorem 6.12

A
g satisfies condition (C). QED

266

Theorem 7.7: Let the ISI model be given. The

 

sequence of identity mappings on P(O)§k,

k = O,...,N-l is a sufficient statistic.

Proof: Let Ek in Definition SS by P(O)§k.

k O,...,N-l and let nk be the identity mapping

on P(O)§k. Then (no,...,nN_1) is a statistic.

A
Let Pk = P(C)? , k = O,...,N-l. Condition (A) is

satisfied. If ik+l 6 Bb(0)¢ . 3k 6 P(Q)§k,
k+1
and ik 6 Ck' define
(§k+1) - = {zk+l 6 Z‘ (P’zo'lo'""Ik-1'zk'Ik'zk+1)
(E.I)
k k
6 5k+l}
where 5k = (p7zo'Io""'Ik+l'zk}' Now, define for
A - -
_ _ ' p
k — O,...,N 2, the stochastic kernel tk(d’k+l‘§k'1k)
on P(QMk+1 given P(O)§kC by
A _ - - _ _
tk(5k+1‘§k'1k) ’ I S[(Ek+l)(§ f )‘Ik'wk+l]
+1 k' k
t(dwk+1iwk.ik)pk(gk)(dwk)
V E 6 B
i—k+l P(O)§k+1

where pk(§k) is as defined by (D) and (E). Using
similar arguments as in Theorem 7.6, it can be shown

A
that t is Borel-measurable. By Lemma 7.7

k

267

Pk+1(”'P)[”k+1(P’¢k+i) E Ek+1h‘k(p“3‘k) = Ek' Ik = i1:1

= Pk+1(n—Ip)[(gk'1k'zk+l) E Ek+1]

= “i 51-. (§k+l) (E f )lik'wk+l]t(dwk+1‘wk'fk)pk(-E-k) (dwk)
+1 k' k

_ A l? f

‘ t(-—k+1 "k' k)

M

for Pk(v,p) almost every (Ek'ik)' Condition (B) is

satisfied. For k = O,...,N-1, define

A
9k : P(C/Mkc 4 R* by

Sk(§k.fk> = f g(wk.ik)pk(2k)(dwk).

0k

By Theorem 6.12, is upper semianalytic for each k

A
9k
For p 6 P(Q), V 6 U and k = O,...,N-l from Lemma 7.7

E[g(mk.lk)‘n(p7¢k) = 5k. Ik = ik]

= $k<Ek.ik)

for Pk(v,p) almost every (gk’fk) where the expectation

is with respect to Pk(v,p)

Condition (C) is satisfied. QED

APPENDICES

APPENDIX ONE

This appendix develops the rudiments of probability
measures on Borel spaces. It tries to summarize the
basic facts of the subject. All results in this section
are available in the literature. They are collected
here for easy reference.

It is understood that throughout the appendix
X is a metrizable tOpological space with 5k as the
Borel o-algebra on X. The space of probability measures
on (X,ﬁk) is denoted by P(X). C(X) is the Banach
space of bounded, real-valued continuous functions on

X with the supremum norm for any metric d on X

consistent with its topology. Ud(X) is the space of
bounded, real-valued functions on X which are uniformly
continuous with respect to d. A probability measure

p.6 P(X) determines a linear functional 1p: C(X) 4 R
defined by 1p(f) = I fdp and conversely, a function

f e C(X) determines a real-valued function

268

269

9f: P(X) 4 R defined by 6f(p) = f fdp. These
relationships and the metrizability of the space X
enable one to show several properties Of P(X). In
particular, it can be proved that there is a natural
topology on P(X), the weakest topology with respect
to which every mapping Of the form of 6 is continuous,

under which P(X) is a Borel space whenever X is a

Borel space.

Definition: Let X be a metrizable space. A

probability measure p 6 P(X) is said to be regular

if for every B 6 Bk,

p(B) sup{p(F): F C.B, F closed}

inf{p(G): B C G, G Open].

Theorem A.l: Let X be a metrizable space.

 

Every probability measure in P(X) is regular.

Proof: Let p e P(X) be given and 6 be the

collection of B 6 EX such that

p(B) = sup{p(F): F CIB, F closed}
= inf{p(G): B C G, G Open}.
Let H ClX be Open, 3 an increasing sequence of closed

Q

sets {F } such that H = IJ F . Thus
n n=l n

270

inf{p(G): H CIG, G Open]

p(H)

1im p(F )
114$ n

g sup{p(F): F c:H, F closed}

S P(H)

p(H) = sup{p(F): F C.H, F closed} and H 6 6

6 contains every Open subset Of X.

Suppose B 6 6, then

p(BC) = l - p(B) 1 — sup{p(F): F CLB, F closed}

inf{p(G):BC C G, G Open}.

Similarly

p(BC) = sup{p(F): F C BC, F closed}

6 is closed under complementation.

Let {En} C 6. Choose e > O and Fn CBn C Gn such
. . n
that Fn lS closed, Gn is Open and p(Gn-Fn) g e/Z .

Then

U B cu G =(U Fn)u[u (en-an
n= n=1 n=1 n=1

C (U Bn) U [U (Gn-Fn)].
n=1 n=1

0 a
Thus p( L) Gn) g p( L! Bn) + g. This implies
n=1 n=1

271

P( L’ Bn) = inf{p(G) : L} Bn CIG, G Open}.
n=1 n=1

Also p( k} Bn) < p( L! Fn) + e and

n—1 n=1
m N
p( L] Bn) g p( L) Fn) + 2c for N sufficiently large.
n=1 - n=1
N
However, the finite union \J Fn is a closed subest
n=1

Q
of L! B . Thus
n=1 n

P( L3 Bn) = SUP{P(F)= F C IJ B . F closed}
n=1 n=1 n

6 is closed under countable unions.

Hence 6 is a O-algebra and 6 = 5k. QED

Theorem A.2: Let X be a metrizable space and

 

d a metric on X consistent with its tOpology. If

pl,p2 6 P(X) and
I gdpl = I gdp2 V g 6 Ud(X)

then p1 = p2.

Proof: Let F be any closed prOper subset Of X

_ . 1
and Gn — [x 6 X..d(x,F) < H]’

For sufficiently large n, F and "'Gn are
disjoint non-empty closed sets for which

inf d(x.y) > 0. By Urysohn's Lemma,
x6F,y6~Gn

H fn 6 U X) such that fn(x) = O for x 6 N Gn'

d(

272

fn(x) = 1 for x 6 F and 0 g fn(x) g 1 V x 6 X.

Then
p1(F) g f fndpl = I fndp2 g p2(Gn)

p10?) g p2(nQ Gn) = p2(F).

l
Reversing p1 and p2, we obtain pl(F) = p2(F).
By Theorem A.l, this implies p1(B) = p2(B) for every

B 6 13X. QED

These two theorems essentially says that a
probability measure on a metrizable space is completely
determined by its values on the Open or closed sets.
Also, a probability measure p on a metric space (X,d)
is completely determined by the values I gdp where g
ranges over Ud(X). Next, attention is paid to the
develOpment of the topology f[C(X)], the so—called
weak topology on P(X). However, the space C(X) is
tOO large to be manipulated easily, one would need a
countable set D C C(X) such that f(D) = ![C(X)].

Such a set D is produced by the next three lemmas.

Definition: Let c > O, p 6 P(X) and f 6 C(X),

V€(p:f) = {q 6 P(X) : If qu-f fdp) < 8].

Definition: Let D C C(X). Define the

collection Of subsets Of P(X):

v(D) = {V€(p:f): e > o. p e P(X), f e D].

273

Let 7(D) be the weak topology on P(X) which contains

7(D), i.e., the tOpology for which 7(D) is a subbase.

Lemma A.l: Let X be a metrizable space and

 

D C C(X). Let {pa} be a net in P(X) and p 6 p(x).

Then pa 4 p relative to the topology f(D) if and

only if I fdpa 4 I fdp for every f 6 D.

Proof: Let pa 4 p and f 6 D. Let c > 0 be

given, and let 6 be such that for d 2 B implies

P E Vc(p:f)

d
J” fdpa » f fdp.
Conversely, let I fdpa 4 I fdp for every f 6 D.

Suppose G 6 f(D) contains p. Then p is
n
contained in some basic Open set F) V (p;f ) C G
e k
k=1 k
where ck > O, f 6 D and k = l,...,n. Let B be

k
such that for all a 2 B

If fkdpa—J‘ fkdpl < ck k = l,...,n
pa 6 G for d 2 B
1 p 4 p. QED

(1

Lemma A.2: Let X be a metrizable space and d a

 

metric on X consistent with its tOpology. If
f 6 C(X), then there exists sequences [9n] and {hn}
such that g i f and h 1 f.

n n

Proof: Let b 6 R and x0 6 X be such that

b g f(x) g f(xo) < m for every x 6 X.

274

Define gn(x) = inf[f(y)4-nd(x,y)]. Thus for every
y6x

b g gn(x) g f(x) + nd(x,x)

g f(xO) + nd(x,xo) < a

b g gl g g2 §---§ f and lim gn g f

nan

For every x,y,z 6 X

f(y) + nd(x,y) g f(y) + nd(y,z) + nd(x,z)
gn(X) g gn(2) + nd(X.2).
Thus
lgn(X) —gn(2)l g nd(x,z)
gn 6 Ud(X) for each n.
Let a > O and {yn] C.X be such that
f(yn) + nd(x,yn) g gn(x) + e.
As n 4 c, either gn 1 a or y 4 x. If

n

9n 1 w = 11m 9n 2 f

n4m
gn T f.
If yn 4 x, since f is continuous
f(x) = lim f(y )
n
11ch

g lim 9 (X) + e
n
n49

lim g (x) = f(x). QED
n4m n

275

Lemma A.3: Let X be a metrizable space and d

 

a metric on X consistent with its topology. Then

r[c (X) ] = .7'[Ud(X) ].

Proof: Since Ud(X) CC(X), 'a'[Ud(X) C'Zf[C(X)].
This implies JTUd(X)] C:J[C(X)]. Let

pO 6 V€(p:f) 3 so > 0 such that V€O(p07f) C‘V€(P:f).

By Lemma A.2, H g,h 6 Ud(X) such that g g f g h
r.
and J fdpo < f gdpO + eO/z, f hdpO < f fdpo + 50/2.

Let q 6 Veo/2(po7g) ﬂ Veo/2(po:h), then

I fdpO < f gdpO + eO/Z < I gdq + co
g I qu + 60

and

.‘I

: P c

J qu g I hdq < J hdpO + eO/z < f fdpo + “O
r!

l, qu-J" fdpol < 60

q 6 v (po:f)

€O

and
VEO/2(po:g) ﬂ VeO/z(po:h) CiV€(P:f).
Since pO 6 V€(p:f) 6 7[C(X)] this implies V€(p;f)
is Open in the I[Ud(X)] tOpology and
Y[C(X)] C I[Ud(X)]

J’[C(X)] = .7'[Ud(X)]. QED

276

Lemma A.4: Let X be a metrizable space and d

 

a metric on X consistent with its topology. If D is

dense in Ud(X) then JIUd(X)] = f(D).
Proof: Clearly f(D) C23IUd(X)]. Let
v€(p7g) 6 V[Ud(X)] and pO 6 V€(p:g).
Let

so = e - If gde-f gdp! > 0.

Let h 6 D be such that Hg-—hH < 90/3. For any
q €‘Veo/3(Po:h)
If gdq-J" gdpl g If gdq-j hdql + If hdq-Ihdpol
+ If hde-f gdpol + lfgde-f gdpl
< 620/3 + 60/3 + 270/3
+ If gde-f gdp] = e
Veo/3(po7h) C V€(P:Q)-

This implies

.'/'[Ud(X)] c.7(D)

J’[Ud(X)] = f(D). QED

277

Theorem A.3: Let X be a separable metrizable

 

space. There exists a metric d on X consistent

with its tOpology and a countable dense subset D Of
Ud(X) such that f(D) is the weak topology J[C(X)]

on P(X).

Proof: By Urysohn's theorem, X can be
homeomorphically embedded into a subset of the Hilbert
cube. Since the Hilbert cube is compact by Tychonff's
theorem, it is totally bounded.

3 a totally bounded metrization d on X.
This implies that (X,d) can be isometrically embedded

as a dense subset Of a compact metric space (Xd,d1)

where X szd. Let g 6 Ud(X), g has a unique

extension 3 6 C(Xd) such that Hg” = Hg“. The mapping
9 4 3 is linear and norm-preserving. Since C(Xd) is
separable, this implies Ud(X) is separable.

H a countable dense set D in Ud(X).

The result follows from Lemmas A.3 and A.4. QED

From this point on, whenever X is metrizable, it
is understood that P(X) is a topological space with

the weak tOpology J[C(X)].

Theorem A.4: If X is a separable metrizable

 

space, then p(x) is separable and metrizable.

 

278

Proof: Let d be a metric on X consistent
with its tOpology and D a countable subset of Ud(X)
such that f(D) is the weak tOpology on P(X). Let
Rm be the product of countably many COpies Of R.

Define m: P(X) 4 Ra 'by
m(p) = (I gldp.I gzdp....)

where [g1,gz,...] is an enumeration Of D. Suppose

that m(p1) = m(p2), then
I gkdp1 = I gkdp2 for every gk 6 D.
Let g 6 Ud(X). H a sequence {gk} C D such that
Hgk "9H 4 O as i 4 a. Then
i
II gdp1-I gdpzl

3 lim supIJ(g- gki)dp11 + 1im supIJI gk dp1-J gk ldpzl

i469 i4co
+ lim suplI(gk -g)dp2I
i4a i

g 2 lim supHgki -9H = O

i-OQ
I gdpl = I gdpz-
By Theorem A.2, p1 = p2

m is one-tO-one.

By Lemma A.l, for each gk 6 D, the mapping p 4 I gkdp

is continuous

m is continuous.

279

Let {pa} be a net in P(X) such that m(pa) 4 m(p)
for some p 6 P(X). Then

I gkdpd * I gkdp for every gk 6 D.

By Lemma A.1, pa 4‘p
m- is continuous. The m is a homeomorphism.
Ron is metrizable and separable = P(X) is metrizable

and separable. QED

Theorem A.5: Let X be a separable metrizable
space and let d be a metric on X consistent with its
tOpology. Let {pn} be a sequence in P(X) and
p 6 P(X). The following statements are equivalent:

(a) pn 4 P

(b) I fdpn 4 I fdp for every f e C(X)

(c) I gdpn 4 I gdp for every 9 6 Ud(X)

(d) lim sup p (F) g p(F) for every closed

naa n -
set F CIX
(e) 1im inf pn(G) g p(G) for every Open

n-Oco
set G CIX.

Proof: The equivalence of (a), (b) and (c) follows
from Lemmas A.l and A.3. The equivalence of (d) and
(e) follows by complementation. To show (b) implies
(d), let F be a closed proper nonempty subset Of X.

Let Gk = [x 61X: d(x,F) < é}. F and “G. are disjoint

k

nonempty sets for k sufficiently large. 3 fk 6 C(X)

280

such that fk(x) = l for x 6 F, fk(x) = O for
x 6 ch and 0 g fk(x) g l for every x 6 X. Thus
lim sup pn(F) g lim I fkdpn = I fkdp g p(Gk).
n4o n4c
(d) follows by letting k 4 a.

To show (d) implies (b), let f 6 C(X) and assume
without loss Of generality that 0 g f g 1 Let K be
a positive integer and define

F=Ix6x-f2]5l k=0 K
k 6 . k _ K" ,..., .
K
Define m:IX 4 [0,1] by w(x) = Z)(k/K)XF -F (x)
k=0 k k+1
where FK+l = ¢. Then
1
f-Rgmgf
For any q 6 P(X).
I g5 )< 1 2E
codq= (-)Q(F -F )=- q(F)
k=0 K k k+1 K k=1 k
. l .
lim sup I fdpn - (K) g lim sup I mdpn
n-Oa n-Oa
1 6
= - lim sup p (F )
K n40 k=1 n k
:1 £5 I
S - P(F ) = mdP
— K k=1 k
g I fdp.

This implies

lim sup I fdpn g I fdp

n-Oca

for every f 6 C(X).

281

But the above argument holds for -f

lim inf I fdpn -lim sup I (-f)dpn

n4: n49

—I (-f)dp = I fdp

||\/

3. I fdpn 4 I fdp for every f 6 C(X). QED

The above two theorems guarantee that when X is
separable and metrizable, the tOpology on P(X) can be
characterized in terms of convergent sequences rather
than nets. Theorem A.5 gives several conditions which

are equivalent to convergence in P(X).

Corollary A.5.l: Let X be a metrizable space.

 

The mapping 5: X 4 P(X) defined by 5(x) = pX is a

homeomorphism.

Proof: Clearly 6 is one—tO-One.
Let {xn} be a sequence in X and x 6 X. If

xn 4 x and G is an open subset of X, then either

(i) x 6 G, implies xn 6 G for large n
1im inf p (G) = l = p (G) or
x x
n4e n

(ii) x 6 G, then 1im inf p

(G) 2 0 = p (G).
n4a n — X

X

By Theorem A.5, this implies pX 4 px
n

6 is continuous.

282

On the other hand, if pX 4 pX and G is an Open
n

neighborhood of x, since 1im inf p (G) 2 p (G) = l.
n4a Xn _ X

Then xn 6 G for sufficiently large n
x 4X
n

6 is a homeomorphism. QED

Theorem A.6: If X is a compact metrizable space,

 

then P(X) is a compact metrizable space.

Proof: If X is a compact metrizable space, it is
separable and C(X) is separable.

Let {f be a countable set in C(X) such that

k)
f1 5 l, kaH g l for every k and {fk} is dense in
the unit sphere {f 6 C(X): ”f” < 1}.

Define mpm .. [-1,1]°° by co(p) =
(I fldp,I fzdp,...). m can be easily verified as a

homeomorphism. Suppose {pn} is a sequence in P(X)

and m(pn) 4 (91.02,...) 6 I-l,l]m. Let 6 > 0 be

given and f 6 C(X) with Hf” g l, 3 fk with
_ ! E
Hf ka < 3. Also 3 n0 2 0 such that

II fkdpn-I fkdme < g whenever n,m ; no. Then

IJ fdpn-I fdpml

"A

II fdpn — I fkdpn) + II fkdpn —I fkdme
+ (I fkdpm-I fdme < e

I. {I fdpn] is Cauchy in [-l,l].

283

Let E(f) be the limit of such a sequence. If
“f“ > 1, define E(f) = Hf” E(Jén). Thus E is a

1

linear function on C(X), E(f) 2 0 whenever

f 2 O, IE(f)I g ”f“ for every f 6 C(X) and
E(fl) = 1. Let {hn} be a sequence in C(X) and
h (X) 1 O for every x 6 X. For each s > O, the

n

set Kn(e) = [x 61X: hn(X) > e] is compact and

For n sufficiently large, thH 1 O

which implies E(hn) 1 O.

3 a unique probability measure on O[ (J f (B )]
f6C (X)

which satisfies E(f) = I fdp for every f 6 C(X) since

E is a Daniell integral (see Royden [1968])

'. p 6 P(X)
and
ok = :3: I fkdpn = E(fk) = I fkdp k = 1,2,...
m(pn) 4 ¢(p) and m[P(X)] is closed on [-l,l]m.
Since [—l,l]co is compact, P(X) is compact. QED

Lemma A.5: Let X and Y be separable metrizable

 

spaces and m:iX 4 Y a homeomorphism. Define

k P(X) 4 P(Y) by

Mp) (B) = Men-1(8)] v B 6 BY

w is a homeomorphism.

284

Proof: Let p1,p2 6 P(X) and p1 #’p2. Since
p1 and p2 are regular, 3 an open set G C X for

which p1(G) 36P2(G)

m(G) is relatively Open in m(X).

Thus m(G) = m(X) 0 B, where B is Open in Y and

)(p1) (B) = p1(G) a" p2(G) = )(pz) (B)
w is one-tO-One.

Let {pn} be a sequence in P(X) and p 6 P(X). If

pn * p. since m-1(H) is Open in X for every Open

set H C‘Y. By Theorem A.5,

lim inf ¢(pn)(H) lim inf pan-1(H)]

n4m n4m
g pim'1(H>] = ¢<p2)<H)
W(pn) 4 ¢(P) and w is continuous.
Reversing the arguments with {pnI and p such that

¢(pn) 4 w(p) will show that pn 4 p

w is continuous. QED

Theorem A.7 (Urysohn's Theorem): Every separable

 

metrizable space is homeomorphic to a subset of the

Hilbert cube N.

Theorem A.8 (Alexandroff's Theorem): Let X be

 

a topologically complete space, Z a metrizalbe space,

285

and m: X 4 Z a homeomorphism. The m(X) is a
Gb—subset of Z. Conversely, if Y is a Gb-subset
of Z and Z is topologically complete, then Y is

tOpologically complete.

Theorem A.9: If X is a tOpologically complete

 

spearable space, then P(X) is topologically complete

and separable.

Proof: By Theorem A.7, 3 a homeomorphism

m:iX 4 N and the mapping w in Lemma A.5 with H
replacing Y is a homeomorphism from P(X) to P(H).

By Theorem A.8, m(X) is a Gé-subset Of N

¢[P(X)] = {p 6 PW) :pW-wXH = 0].

Since N is compact, by Theorem A.6, P(N) is compact.

m
Let G1 3 G2 3... such that v(x) = m G . Then

¢UWM] {PGEPW73PWFGQ =0}

n 1

II
8 ll ___)8

n n {p e Pm :pm-Gn) <5.
n=1 k=1

But by Theorem A.5 (d), for any closed set F and
c 6 R, the set {p 6 P(K):jp(F) g c} is closed and

this implies {p 6 P(ﬂ):jp(N-Gn) < %} is Open

¢[P(X)] is a Gé-subset of P(N).

Again by Theorem A.8, this implies P(X) is topologically

complete. QED

286

The next step is to establish the fact that
BP(X) is the smallest O-algebra with respect to

which 9B is measurable for every B 6 Ex. To do

that, a useful aid, the concept of a Dynkin system,

is needed.

Definition: Let X be a set and D a class Of
subsets of X. 3 is a Dynkin system if the following
conditions hold:

(a) X 6 3

(b) If A,B 6 6' and B c:A, then A-B 6 B

(c) If A

6 ﬂ and A C A C

l'AZ' l 2 '°°'

m

then L! A 6 3-
n
n=1

The following is a well-known result in measure

theory, which is quoted without proof.

N

Theorem A.lO (Dynkin System Theorem): Let a be

 

a class of subsets of a set X, and assume 3 is closed

under finite intersections. If D is a Dynkin system

containing J, then 3 also contains 0(3).

Theorem A.ll: Let X be a separable metrizable

 

space and 6 a collection Of subsets Of X which

generates Bk and is closed under finite intersections.

Then BP(X) is the smallest O-algebra with respect to

which all functions of the form EB(p) = P(E), E 6 6

287

are measurable from P(X) to [0,1], i.e.,

_ -1
E31°(x) — 0%ng 9E (BR)]°

Proof: Let 3' be the smallest O—algebra with re—

spect to which 9E is measurable for every E 6 6.

Let .8: {B 65X:GB is BP(X)

Clearly, ﬂ is a Dynkin system, and by Theorem A.lO,

-measurable].

ﬂ = 5*.

Let ﬁ’ = [B 6 Ex: 9 is Jemeasurable}. Again

B
6' is a Dynkin system. Since 6 C 6', ﬂ! ='6X° Thus
6f(p) = I fdp is Jemeasurable for f is the indicator
of a Borel set. 6f is Jemeasurable when f is a Borel—
measurable simple function.

Let f 6 C(X), E a sequence of simple functions fn

uniformly bounded below such that fn T f. By the

monotone convergence theorem 9f 1 9f. Thus 8f is
n
J—measurable. Then for e > O, p 6 P(X), f 6 C(X),

V€(p;f) is J-measurable.

'7

This implies BP(X) = c. QED

Theorem A.l2: If X is a Borel space, then P(X)

 

is a Borel space.

Proof: Let m be a homeomorphism mapping X onto
a Borel subset of a topologically complete separable

space Y. Then by Lemma A.5, P(X) is homeomorphic to

 

288

the Borel set {p 6 P(Y) :p[co(x)] = 1}. But P(Y)
is topologically complete and separable by Theorem A.9,

P(X) is a Borel space. QED

In this appendix, it has been shown that the
space P(X) can be taken to be a topological space
with the weak topology J[C(X)]. It inherits most
Of the properties Of the space X. When X is separable H
and metrizable, P(X) is separable and metrizable; when
X is separable and topologically complete, P(X) is
separable and tOpologically complete; and when X is

a Borel space, P(X) is a Borel space.

APPENDIX TWO

The properties Of analytic sets are summarized
in this appendix. The analysis will st0p with a
collection of equivalent definitions of the set.
Again, there are no new results in this appendix.
The reader may be referred to some standard mathematics
text for a more elaborate treatment of the topic.

The definitions of a paved space, Suslin scheme
and analytic sets are given in Chapter VI* and will
not be repeated here. First, some preliminary results

are stated without proof. (The reader may request
the author for proof of the following theorem and its

corollaries.)

Theorem A.13: Let X be a space with pairings

 

9 and 2 such that 9 Cil.
(a) M9) c 2(2)
(b) 2(9) 5 = 2(9)
(c) moo .249)

(d) 9 c H9)

(e) 3(9) = XIMQH

289

290

Corollary A.13.l: Let (X,9) ‘be a paved space
and suppose that the complement of each set in 9

is in 6(9). Then 0(9) C.&(9).

Corollary A.13.2: Let X be a Borel space. The
countable intersections and unions of analytic subsets

of X are analytic.

Theorem A.l4: Let X be a Borel space. Then

 

every Borel subset Of X is analytic. In fact, the

class Of analytic sets 343%) is equal to g(Ek).

Proof: Every Open subset Of X is an F so

O-I
every Open set is analytic. Corollary A.l3.l implies

B C 605 ). By Theorem A.13 (a) and (e), since J C E
x x x x

3(3):) c max) c AMUXH = Mix)

sax) = MEX). QED

If the Borel space X is countable, then every

subset of X is both analytic and Borel-measurable.
If X is uncountable, however, the class of analytic
subsets Of X is strictly larger than X' Note that
an immediate consequence of Theorem A.l4 is that if Y
is a Borel subset of the Borel space X, then the
analytic subsets of Y are the analytic subsets of X

contained in Y. A generalization of this fact is the

following.

291

Corollary A.l4.l: Let X and Y be Borel spaces

 

and m: X 4 Y a Borel isomorphism. Then A C:X is

analytic if and only if w(A) C‘Y is analytic.

Proof: Let m:jX 4 Y be a Borel isomorphism and
A CZX is analytic. Then A = N(S) where S is a
Suslin scheme for :X' Let q>oS be the Suslin scheme

for BY defined by

(CD 0 S) (S) = w[S(S)]

Thus (p(A) = N(chS).
a By Theorem A.l4, p(A) is analytic.
If m(A) C‘Y is analytic, A CIX is analytic by a

similar argument. QED

Definition: Let (X,0) be a paved space and S
a Suslin scheme for 9. The Suslin scheme S is regular

if for each n 6 N and (01,02,...,O ) 6 23, then

n+1

) .

s(Ol,OZ,...,on) : 5(01'02"°°'On'°n+1

Lemma A.6: Let (X,d) be a separable metric space

 

and S a Suslin scheme for :X' Then there exists a

regular Suslin scheme R for :k such that N(R) = N(S)

and, for every 2 = (61.62....) 6 T

lim diam R(gl,§2,...,gn) = 0 if
nag:

mq4yuqq)¢¢vn

292

Proof: By the Lindelbrf property, for each
positive integer k, X can be covered by a countable

collection of Open balls Of the form
Bkj = {x 62X: d(x,xkj) < l/k) j = l,2,°°°. For

(gllgllgzlgleOO) E m, deflne

R(El) = 5161

R(gl'cl'gZ'CZ) = R(Qloglogz) ﬂ S(Ql.,C,2) etc.

U

Thus
a: - a
C R(s) = [ F) Bk: ] n I 0 8(5)]
<(7 c E c ) k=1 k s<z
*1'~1'~2'~2°"
where z = (61,62,...). Clearly R is a regular Suslin

scheme for 3*

lim diam R(g ,g ,...,g ) = 0 if
n4” 1 2 n

R(g1.g2.....gn) #(Z ’v’ n.

Let x 6 N(R) 3 (€1,61,Z2,62,...) 6 m such that
x 6 _ r) _ R(s)
s<(61o61. £2: C2)

Thus

x 6 (7 8(5) CN(S)
s<(61.62....)

N(R) C N(S) .

293

Now let x 6 N(S), 3 (61,62....) 6 T such that

x 6 ﬂ 5(8).
s<(glog20000)

Since {Bkjg j = 1,2,...] covers X, 3 for each k a

positive integer Zk for which x 6 Bkzk. Thus
a
x 6 kCH Bka
x 6 ﬂ R(S) CN(R)-

s<(Z1,C1,:2162: . o .)

Thus N (S) C N(R)

N(R) = N(S). QED

Lemma A.7: Let (X,d) be a complete separable

 

metric space. If A C X is a nonempty analytic set,
then there exist a closed subset T1 of T and a
continuous function f: T1 4 X such that A = f(Tl).

Conversely, if T1 C T is closed and f: T1 4 X is

continuous, then f(Tl) is analytic.

Proof: Let A = N(R) be nonempty, where R is a

W

regular Suslin scheme for ax satisfying Lemma A.6.

Define

T1=IZ 6T: ’0 R(s) 76¢}.
s<z

Let 2: (gl,g2,...) be in 9). Let R(g1,g2....,gn) 7H3

for each n, E xn 6 R(gl,62,...,gn). By Lemma A.6,

294

{xnI is Cauchy and xn 4 x where x 6 X since X

is complete.

R is regular, for each n, {x

R(61,62,...,6n). Thus x6R(gl,62,...,gn)

x 6 ﬂ R(s).
s<z

Now, let 2 6 T — Tl. Then for some sn < z, R(sn) =

If 2 6 [w 6 T: sn < w} an Open neighborhood, then

{m 6 T: Sn < w] C T — T1. Thus T - T1 is Open and

T1 is closed. For 2 6 T1, define f(z) to be the
unique pOint in n R(s). Let {2k} C T1 and 2k 4 z0

s<z
where 20 = (61,62....) 6 T1. By Lemma A.8, given
6 > O 3 sn < 20 such that diam R(sn) < e. For k

sufficiently large, 2k 6 {z 6 T: sn < 2]
f (zk) 6 R (sn)
d(f(zk),f(zo)) g diam R(sn) < e
f is continuous.

Now, suppose T1 C T is closed and f: T1 4 T is

continuous. Define a regular Suslin scheme R for

 

3* by R(s) = f({z 6 T1: 5 < 2]) where R(s) = ¢ if

[z 6 T then

-s < z] = w. If 2 6 T

1' l’

f(z) 6 .C R(s) CN(R)
s<z

f(Tl) C N (R)

ﬂ.

295

Let x 6 N(R), then for some 20 = (61,62,...) 6 T,
x6 0 R(s).
s<zO

 

For each n, x 6 f({z 6 T1: (61.62....,6n) < 2]).

Given 6 > O, 3 Zn 6 T1 such that (61,62....,6n) < zn

and d(x,f(zn)) < 6. Then zn must converge to 20

as n 4 a. Since T1 is closed, 20 6 T1, f is
continuous implying

d(X,f(ZO)) g e

f(zo) = x

and
x 6 f(Tl). N(R) C f(Tl)

N(R) = f(T ). QED

1

So far, analytic sets are characterized as the

continuous images of closed subsets Of T. The following
lemma and theorem allows one to get an even sharper

characterization.

Lemma A.8: If T is a nonempty closed subset of

1

 

T, then there exists a continuous function g: T 4 T

such that T1 = g(T).

Proof: Let T1 be covered by a countable

collection of nonempty closed sets [8(61): 61 6 N]

296

satisfying

mle(€l)' diam 8(61) g1 61:1,2,...

where d is a metric on T consistent with its tOpology.
Cover each 8(61) with a countable collection Of non-

empty closed sets {8(61,62): 62 6 N} satisfying

s<g1) : 5(g1,g2), diam s<g1.g2) g %. g2 = 1.2,...
Thus, for any (61,62....,gn_1)
S(§11Q20000I£n) 56¢ Qn=1,2,...
O
5(51'C2"”'§n—1) = U 8(61.£2.....6n)
§n=l
s<gl,g2,...,gn_1) 3 s(g1,g2,...,gn_1.gn) gn = 1,2,...

. 1 _ ...
dlam(C1,C2,...,€n) g a Qn — 1'2,

~

Since T is complete, for each 2 6 T, F) S(s) consists
s<z
Of a single point.

Define g(z) to be this point. Thus N(S) = g(T) = Tl.

By the arguments of Lemma A.7, g is continuous. QED

Theorem A.lS: Let X be a Borel space. A nonempty

 

set A c:X is analytic if and only if A = f(T) for

some continuous function f: T‘4 X.

Proof: If X is complete, the theorem follows
from Lemmas A.7 and A.8. If X is not complete, it is

homeomorphic to a Borel subset of a complete separable

space and the result follows from Corollary A.l4.l. QED

297

The above theorem gives a very useful characteri-
zation of nonempty analytic sets in terms Of continuous
functions and the Baire null Space T. The Baire null
space has a simple description and its topology allows

considerable flexibility.

Lemma A.9: The space T is homeomorphic to any

 

finite or countably infinite product Of OOpies Of itself.

Proof: We prove the lemma for the case Of a countable
infinite product. Let v1,W2,... be a partition Of N
into infinitely many infinite sets. Define
w: T 4 TTT... by T(Z) = (21.22,...) where zk con-
sists of the components of z with indices in Wk.

Clearly m is one-to-one and onto. Since convergence
in a product space is componentwise, m is a

homeomorphism. QED

 

Theorem A.l6: Let X1,X2,... be a sequence of

Borel spaces and Ak an analytic subset of Xk'

k = 1,2,... . Then the sets A1A2"' and A1A2...An,
n = 1,2,... are analytic subsets of X1X2... and
X1X2...Xn respectively.

Proof: Let fk: T 4 Tk be continuous such that

A = fk(T), k = 1,2,"‘. Let 6 be defined as in

k

Lemma A.9 and F: TT... 4 X1X2... be defined by

298

F(zl,22,...) = (fl(zl),f2(22),...). SO Fcam is
continuous and maps T onto A1A2°°°. Similar
arguments for the finite products. QED

Another consequence of Theorem A.lS is that the

continuous image of an analytic set, in particular, the

projection of an analytic set, is analytic. It is

formalized in the following theorem.

Theorem A.l7: Let X and Y be Borel spaces and

 

A an analytic subset of XY. Then projx(A) is
analytic. Conversely, given any analytic set C C X
and any uncountable Borel space Y, there is a Borel

set B C:XY such that C = projx(B). If Y = T, B

can be chosen to be closed.

Proof: If A = f(T) c:XY where f is continuous
prOjX(A) = (prOjX<>f)(T) is analytic by Theorem A.20.

Now, suppose C = f(T) C X is nonempty and analytic.
C = projx[5r(f)] where
Eur) = {(f(z),z) e XT:z e 92}.

Since f is continuous, Er(f) is closed. If Y is
any uncountable Borel space, then 3 a Borel isomorphism
w from T onto Y.

Define a mapping T by §(x,z) = (x,m(z)). Then

@(x,z) is a Borel isomorphism from XT onto XY

C = pron<:[é}<f)1). QED

 

299

Theorem A.18: Let X and Y be Borel spaces

 

and f:iX 4 Y a Borel—measurable function. If
A C,X is analytic, then f(A) is analytic. If B C;Y

is analytic, then f-1(B) is analytic.

Proof: Suppose A CZX is analytic. By Theorem

A.l7, 3 a Borel set B C:XT such that
A = projx(B).
Define ¢:ZB 4 Y by T(x,z) = f(x). Clearly I is
Borel measurable. This implies gr($) 6 EXTY
f(A) = pron[gr<))].

By Theorem A.l7, f(A) is analytic. If B C'Y is
analytic, then B = N(S) where S is some Suslin scheme
for c

N(f-1<>S) where f.1 OS is the Suslin

H1
CD
II

scheme for 5x defined by (f_1<>S)(s) = f-1[S(s)] V s 623,

f-1(B) is analytic by Theorem A.l4. QED

Now, the above results can be summarized in the

following theorem.

Theorem A.l9: Let X be a Borel space. The

 

following definitions of the collection of analytic
subsets of X are equivalent:

(a) £(dx)

(b) £(IJX)

 

(e)

(f)

300

the empty set and the images of T under
continuous functions from T into X
the projections into X of the closed subsets

Of XT

the projections into X Of the Borel subsets
of XY, where Y is an uncountable Borel
space

the images of Borel subsets of Y under
Borel-measurable functions from Y into X,

where Y is an uncountable Borel space.

Proof: Only (f) needs to be shown.

If Y is an uncountable Borel space and f: Y 4 X

is Borel-measurable. For every B 6 by, f(B) is

analytic in X by Theorem A.18. Let 6 be a Borel

isomorphism from Y onto XT and let F C XT be

closed such that pron(F) = A. Define B = 6. (F) 6 BY.

Then (projx<>m)(B) = A is analytic. If A = ¢,

then f(¢)

= A for any Borel measurable f:‘Y 4 X. QED

 

BIBL IOGRAPHY

BIBLIOGRAPHY

Akerloff, G., "The Market for 'Lemons': Quality
Uncertainty and the Market Mechanism", Qparterly
Journal Of Economics, August 1977.

 

 

Alchian, A., and H. Demetz, "Production, Information
Costs, and Economic Organization", American
Economic Review, December 1972.

 

 

Amershi, A., and J. Butterworth, "The Theory of Agency
With Diverse Beliefs", Working Paper, University
of British Columbia, 1979.

Ash, R., Real Analysis and Probability, Academic Press,
New York, 1972.

 

Astrgm, K.J., "Optimal Contral Of Markov Processes With
Incomplete State Information", Journal Of Mathema~
tical Analysis and Applications, 1965.

 

 

Baiman, 8., "Discussion Of Auditing, Incentive and
Truthful Reporting", Journal of Accounting Research,
Supplement, 1979.

 

Baiman, S., and J. Demski, "Economically Optimal Performance
Evaluation and Control Systems", Journal of
Accounting Research, Supplement, 1980.

 

 

Bertsekas, D., and S. Shreve, Stochastic Optimal Control:
The Discrete Time Case, Academic Press, New York,
1978.

 

 

Blackwell, D., "Positive Dynamic Programming",
Proceedings of Fifth Berkeley Symposium on Mathema-
tical Statistics and Probability, 1965a.

 

 

Blackwell, D., "Discounted Dynamic Programming", Annuals of
Mathematical Statistics, 1965b.

 

 

301

302

Blackwell, D., "Borel—Programmable Functions", Annals
of Probability, 1978.

 

Blackwell, D., D. Freedman, and M. Orkin, "The Optimal
Reward Operator in Dynamic Programming", Annals of
Probability, 1974.

 

 

Dahlquist, G., and A. Berk, Numerical Methods, Prentice-
Hall, Inc., New Jersey, 1974.

 

Denardo, E.V., "Contraction Mappings in the Theory
Underlying Dynamic Programming", SIAM Review, 1967.

 

’

Demski, J., and G. Feltham, "Economic Incentive in
Budgetary Control Systems", Accounting Review,
April 1978.

 

Fama, E., "Agency Problems and the Theory Of the Firm",
Journal Of Political Economy, April 1980.

 

Feller, W., An Introduction to Probability Theory and its
Applications, Volume II, John Wiley and Sons, New
York, 1971.

 

Freedman, D., "The Optimal Reward Operator in Special
Classes Of Dynamic Programming Problems", Annals
Of Probability, 1974.

 

 

Furukawa, N., "Markovian Decision Processes with Compact
Action Spaces", Annals of Mathematical Statistics,
1972.

Gjesdal, F., "Accounting in Agencies", Working Paper,

Stanford University, 1976.

Glowinski, R., J.L. Lions, and R. Trémoliéres, Numerical
Analysis of Variation Inequalities, North-Holland
Publishing Company, Amsterdam, 1981.

 

 

Harris, M., and A. Raviv, "Optimal Incentive Contracts
with Imperfect Information", Journal Of Economic
Theory, December 1979.

Himmelberg, C.J., T. Parthasarathy, and F.S. Van Vleck,
"Optimal Plans for Dynamic Programming Problems",
Mathematical Operation Research, 1976.

 

Hirschleifer, J., "The Private and Social Value Of
Information and the Reward to Incentive Activity",
American Economic Review, September 1977.

 

303

Holmstrom, B., "On Incentives and Control in Organizations",
Unpublished Ph.D. Dissertation, Stanford University,
1977.

Holmstrom, B., "Moral Hazard and Observability”, The
Bell Journal of Economics, Spring 1979.

Jensen, M., and W. Meckling, "Theory of the Firm:
Managerical Behavior, Agency Costs and Ownership
Structure", Journal of Financial Economic, October
1976.

 

JuskeviE, A.A., "Reduction Of a Controlled Markov Model
with Incomplete Data to a Problem with Complete
Information in the Case of Borel State and Control
Spaces", Theory of Probability Applications, 1976.

Kobayashi, T., "Equilibrium Contracts for Syndicates
with Differential Information", Econometrica,
November 1980.

Lambert, R., "Managerial Incentives and Short—Run vs
Long-Run Optimization", Unpublished Ph.D.
Dissertation, Stanford University, 1981.

Luenberger, D.G., Optimization by Vector Space Methods,
John Wiley and Sons, New York, 1969.

Maitra, A., "Discounted Dynamic Programming on Compact
Metric Spaces", Sankhya, 1968.

Marchuk, G.I., Methods of Numerical Mathematics, Second
Edition, Springer—Verlag, New York, 1982.

Mirrless, J., "Notes on Welfare Economics, Information,
and Uncertainty", M. Balch, D. McFadden and S. Wu

(eds.), Essays on Economics Behavior Under

Uncertainty, North Holland Publishing Company,
Amsterdam, 1974.

 

 

Myerson, R., "Incentive Compatibility and the Bargaining
Problem”, Econometrica, January 1979.

 

Ng, D., and J. Stoeckenius, "Auditing, Incentives and
Truthful Reporting", Journal of Accounting Research,
Supplement 1979.

 

Radner, R., "Monitoring Cooperative Agreements in a
Repeated Principal-Agent Relationship", Econometrica,
September 1981.

 

 

304

Rockafellar, R.T., "Integral Functionals, Normal
Integrands and Measurable Selections", Nonlinear
Operators and the Calculus of Variations, Springer-
Verlag, New York, 1976.

 

 

Ross, 8., "The Economic Theory Of Agency: The Principal's
Problem”, American Economic Review, May 1973.

 

Ross, 8., "On the Economic Theory of Agency and the
Principle of Similarity", M. Balch, D. McFadden
and S. Wu (eds.), Essays on Economic Behavior
Under Uncertainty, North-Holland Publishing Company,
Amsterdam, 1974.

 

 

Rudin, W., Functional Analysis, McGraw-Hill Book Company,
New York, 1973.

 

Rudin, W., Real and Complex Analysis, McGraw—Hill Book
Company, New York, 1974.

 

Savage, L., The Foundations of Statistics, John Wiley and
Sons, New York, 1954.

Sawaregi, Y., and T. Yoshikawa, "Discrete-Time Markovian
Decision Process with Incomplete State Information",
Annals of Mathematical Statistics, 1970.

Schal, M., "0n Continuous Dynamic Programming with Discrete
Time Parameter", Z. Wahrecheinlichkeitstheorie und
Verw. Gebiete, 1972.

 

 

Shavell, 8., "Risk Sharing and Incentives in the Principal
and Agent Relationship", The Bell Journal of
Economics, Spring 1979.

 

 

Spence, A.M., Market Signalling Information Transfer in
Hiringiand Screening Processes, Harvard University
Press, Boston, 1974.

 

Spence, A.M., and R. Zeckhauser, "Insurance, Information

and Individual Action", American Economic Review,
May 1971.

 

Strauch, R.E., "Negative Dynamic Programming”, Annals Of
Mathematical Statistics, 1966.

 

 

Striebel, 0., Optimal Control of Discrete Time Stochastic
Systems, Springer-Verlag, New York, 1975.

 

Wilson, R., "On the Theory Of Syndicates", Econometrica,
January 1968.

 

GENERAL REFERENCES

GEN ERAL REF ERENCE 8

Arrow, K., Essays in the Theory of Risk-Bearing,
Chicago Press, 1970.

Arrow, K., "Political and Economic Evaluation Of Social
Effects and Externalities", M. Intriligator (ed.)
Frontiers of Quantitative Economics, North-
Holland Publishing Company, Amsterdam, 1971.

Bellman, R., Dypamic Programming, Princeton University
Press, New Jersey, 1957.

Bertsekas, D.P., Dynamic Programming and Spochastic
Control, Academic Press, New York, 1976.

Blackwell, D., "On Stationary Policies", Journal of
ngal Statistics Society, 1970.

 

Demski, J., "Cost Allocation Games", S. Moriarity (ed.)
Joint Cost Allocations-Proceedings Of the University
of Oklahoma Conference on Cost Allocations, Center
for Economic and Management Research, Normal,
Oklahoma, 1981.

Feltham, G., "Optimal Incentive Contracts: Penalties,
Costly Information and Multiple Workers", Working
Paper, University of British Columbia, 1977.

Fleming, W., and R. Rishel, Deterministic and Stochastic
Optimal Control, Springer-Verlag, 1975.

 

 

Hinderer, K., Foundations of Nonstationary Dynamic
Programming with Discrete Time Parameter, Springer-
Verlag, 1970.

Hoffman-Jorgensen, J., The Theory Of Analytic Spaces,
Aarhus Universitet, Aarhus, Denmark, 1970.

Huston, V., and J. Pym, Applications Of Functional
Analysis and Operator Theory, Academic Press, New
York, 1980.

 

305

306

IOffe, A.D., and V.M. Tihomirov, Theory Of Extremal
Problem, North—Holland Publishing Company,
Amsterdam, 1979.

 

Kreps, D.M., and E. Porteus, "Dynamic Choice Theory and
Dynamic Programming", Econometrica, January 1979.

 

Kreps, D.M., and R. Wilson, "Sequential Equilibria",
Econometrica, July 1982.

 

Larsen, R., Functional Analysis, Marcel Dekker Inc.,
New York, 1973.

 

Mirrless, J., "The Optimal Structure of Incentive and
Authority Within an Organization", The Bell Journal

of Economics, Spring 1976.

 

Ng, D., "An Information Economics Analysis of Financial
Reporting and External Reporting", Accounting
Review, October 1978.‘

 

Ng, D., "Supply and Demand Of Auditing Services and the
Nature of Regulations in Auditing", S. Davidson (ed.)
The Accounting Establishment in Perspective, Arthur
Young and Co., 1979.

 

Radner, R., ”Competitive Equilibrium Under Uncertainty",
Econometrica, 1968.

 

Royden, H.L., Real Analysis, Macmillan, New York, 1968.

 

Scott, W., "A Bayesian Approach to Asset Valuation and
Audit Size", Journal of Accounting Research, Autumn
1973.

 

Stiglitz, J., "Incentives and Risk Sharing in Sharecropping",
Review of Economic Studies, 1974.

 

Stiglitz, J., "Incentives, Risk and Information: Notes
Towards a Theory of Hierarchy", The Bell Journal
of Economics, Autumn 1975.

 

 

Yoshida, K., Functional Analysis, Springer-Verlag, New
York, 1980.