MSU LIBRARIES “ RETURNING MATERIALS: PIace in book drop to remove this checkout from your record. FINES wiII be charged if book is returned after the date stamped beIow. EXISTENCE OF OPTIMAL NASH CONTRACTS OF A PRINCIPAL-AGENT MODEL SINGLE AND MULTIPERIOD CONSIDERATIONS BY Peter Cheng A DISSERTATION Submitted to Michigan State University 1 partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Accounting 1982 ABSTRACT EXISTENCE OF OPTIMAL NASH CONTRACTS OF A PRINCIPAL-AGENT MODEL SINGLE AND MULTIPERIOD CONSIDERATIONS BY Peter Cheng This dissertation attempts (1) to construct an abstract, general yet analytically rigorous agency model of a business entity, and (2) to study the conditions or circumstances such that the principal can design an Optimal contract with his agent in both the single and multiperiod settings. In the single period model, it is shown that if the contract to be negotiated belongs to the class of totally bounded functions or to that of monotonic increasing functions, the principal can be successful in arriving at an Optimal contract. The multiperiod model is constructed as a sequential decision process on a discrete time basis. Dynamic programming algorithm is a suggested solution procedure to the prOblem. Under various assumptions, it is shown that the proposed model meets the hypothesis Peter Cheng of tflne algorithm. The questions of the existence of optimal or nearly optimal contracts, uniformly optimal contracts and stationary Optimal contracts as well as the convergence of the algorithm are investigated. In the multiperiod model, the basic agency problem is first considered on a "most well—behaved" setting. This means that the payoff outcomes are observable by both the principal and the agent. Analysis is carried on both a finite and an infinite time horizon. Then, it is assumed that the payoff is observable by the agent alone while the principal receives a signal on the payoff. The signal is chosen by the agent and there is no restriction on how the agent should report the payoff. This leads to an imperfect state information model. The imperfect state information model is analogous to the reporting and auditing functions of an entity. If appropriate measurability restrictions are imposed on the various functions, it is shown that if the auditor is acting in the best interest of the principal, the audited financial statements are adequate for the deri- vation of a long run Optimal contract through the dynamic programming algorithm. To Christina, Nicholas and Cecilia ii ACKNOWLEDGMENTS I wish to thank my committee members, Dr. Ronald Marshall, Dr. Kenneth Janson and Dr. Peter Lappan, Jr., for their continual help and guidence throughout the years of research on the dissertation. Special thanks are given to Dr. Lappan who patiently taught me mathematics and showed me how to apply mathematical analysis to study accounting and economic problems. Also, I have to thank Mary Reynolds for her excellent typing of the manuscript. Finally, I must acknowledge the patience and under- standing of my wife Christina and children, Nicholas and Cecilia, from whom I took time normally available for family activities to work on this project. iii TABLE OF CONTENTS CHAPTER PAGE CHAPTER I: PRELIMINARIES 1 1.1: Introduction. . . . . . . . 1 1.2: SCOpe and Objective . . . 3 1.3: Organization of the Study and Summary of Results . . 9 CHAPTER II: SINGLE PERIOD MODELS 14 2.1: A General Agency Model. . . 14 2.2: The Pareto Optimal Contract . . . l9 2.3: The Nash Optimal COntract . 20 2.4: Financial Reporting and Auditing . . . . . . . . . 27 CHAPTER II*: SINGLE PERIOD MODELS 30 2.1*: The Model . . . . 30 2.2*: Totally Bounded Incentive Functions . . . . . . 34 2.3*: Monotonic Incentives . . . 41 CHAPTER III: MULTIPERIOD MODELS — PRELIMINARIES 46 3.1: Introduction . . . . . 46 3.2: The Basic Multiperiod Model . . . . . . . . . 52 3. 3: Optimality Concepts . . . . 57 3. 4: Dynamic Programming Algorithm . . . . . . . . S9 CHAPTER CHAPTER III*: CHAPTER IV: CHAPTER IV*: CHAPTER V: CHAPTER V*: MULTIPERIOD MODELS — PRELIMI- NARIES 3.1*: 3.2*: FINITE 4.1: 4.2: FINITE 4.1*: 4.2*: Notations and Assumptions . The Basic MultiperiOd Model . Optimality Concepts The Dynamic Programming Algorithm HORIZON MODEL Introduction and Assumptions The Finite Horizon Model HORIZON MODEL Introduction and Assumptions The Finite Horizon Model INFINITE HORIZON MODEL 1 .2 3 UlU'lU'l UiU“ U14}. .0 00 Introduction The Contraction Assumption The Monotonicity Assumption The Optimality EquatiOn Convergence to Optimality and Existence of Optimal Contracts . . . Remarks . . INFINITE HORIZON MODEL .1*: 5.2*: The Contraction Assumption The Monotonicity Assumptions . The Optimality EquatiOn. Convergence to Optimality and Existence of Optimal Contracts PAGE 66 66 68 70 71 73 73 75 8O 80 82 96 96 98 101 102 103 105 109 109 121 126 135 CHAPTER CHAPTER VI: CHAPTER VI*: CHAPTER VII: CHAPTER VII*: BOREL MODELS ONO O‘Cfi NH pan Introduction . Existence of Probability Measures . Analytic Sets . Construction of the "Selector" BOREL MODELS Introduction . . Probability Measures on Borel Spaces . . Analytic Sets and Univer— sally Measurability Universally and Borel Measurable Selection IMPERFECT STATE INFORMATION MODEL 7.1 7 2 7.3: 7.4: 7.5: Introduction The Imperfect State Information Model The Perfect State Infor- mation Model Reduction of the Imperfect State Information Model Sufficient Statistic IMPERFECT STATE INFORMATION MODEL 7.1*: 7.2*: 7.3*: 7.4*: 7.5*: Introduction The Imperfect State Information Model The Perfect State Information Model Reduction of the Imperfect State Information Model Existence of Statistics Sufficient for Contracting . . . . vi PAGE 148 148 150 153 156 158 158 162 180 188 221 221 224 231 232 233 236 236 238 243 247 259 CHAPTER APPENDIX I APPENDIX II BIBLIOGRAPHY GENERAL REFERENCES vii PAGE 268 289 301 305 CHAPTER I PRELIMINARIES 1.1: Introduction An agency can be defined as an economic arrange- ment in which two or more individuals share the outcome produced by an action or a sequence of actions and the occurrence of some random event(s). The indivi— dual who wishes to delegate the action responsibility while receiving a share of the outcome is called the principal. The party who makes the action decisions is called the agent. The study of the economic effects of various contractual arrangements between the principal and his agents has been of increasing interest in both the accounting and economic literatures. Amershi and Butterworth [1979] have identified three research lines of inquiry on the subject. One branch of study focuses on the welfare effects in terms of expected utilities of contractual aggrangements among economic agents. A number of writers have contributed to the literature of this approach: Demski and Feltham [1978], Harris and Raviv [1979], Holmstrom [1977,1979], Kobayashi [1980], Mirrlees [1974,1976], Ross [1973, 1974], Shavell [1979], Spence and Zeckhauser [1971], and Wilson [1968], among others. The question of market efficiency with respect to the ability of markets to absorb and transmit information about the activities of economic agents was studied by Akerloff [1977], Hirschleifer [1977], Spence [1974]. The third direction focuses on investigating the interactive effects of markets and contractual decentralization: Alchian and Demetz [1972], Fama [1980], Jensen and Meckling [1976]. This study concentrates on the first category. Focus is placed on identifying the set of sufficient conditions to guarantee the existence of optimal con- tracts between the principal and hisagent in both a single period and multiperiod settings. In the multi— period model, emphasis is placed on the existence of long run contracts. Although the formulation of the model does not preclude the consideration of a two period agency, the thrust of the research is to investi- gate contracts over the life of the entity. The multiperiod model is developed as a sequential decision process on a discrete time bases. Contracts are negotiated at the end of each period after the payoffs are observed or reported by the agent. There is no restriction that contracts for subsequent periods be of the same form as those of the previous periods. Under such a scenario, I would like to study the circumstances, both economic and behaviorial, under which the principal would be successful in designing a contract which would maximize his expected utility on the net return while at the same time allowing the agent to maximize his own expected utility by selecting an Optimal action choice. 1.2: Scope and Objective The purposes of this research are: (l) to construct an abstract, general yet analytically rigorous agency model of a business entity, (2) to study the conditions or circumstances such that the principal can design an optimal contract with his agent in both the single and multiperiod settings. In a sole proPrietorship or a partnership, the principals are the owners and the managers are the agents. For corporations, the stockholders as repre- sented by the board of directors are the principals while the managers are the agents. The auditors are assumed to act always in the best interest of the principal: that is, they act c00perative1y with the principal. It is also assumed that all principals within the company act congruously and have common utility preferences. Likewise, the agents within the entity also have common goals and utility preferences. The model in the subsequence chapters effectively focuses on the relationship between one principal and one agent. Ng and Stoeckenius [1979] point out that, in general, not only is the manager's action not observable by the owner, the firm's payoff is also unobservable by him as well. Since this asymmetry of information exists between the principal and the agent, the latter is required to issue financial reports periodically to convey information concerning the firm to the owner. If the manager's remuneration is directly a function of his reported performance, then the manager will have incentive to misrepresent in his report. If auditing is effective in detecting the misrepresentation, then it is of value to the owner, provided that the audit cost is small enough. Fama [1980] notes that the contracting literature is almost uniformly concerned with one-period models. He further argued that in a single period world, there can be no enforcement of contracts through a wage revision process imposed by the managerial labor market. Radner [1981] shows that the sequential observation of the agent's performance over time is by itself an effective monitoring device. Since any real world contracting process is dynamic, an agency model is incomplete if its dynamics are not studied. A complete dynamic general equilibrium analysis including the external labor market is beyond the scope of the research. I plan to study the partial equilibrium effects on the contracting behaviors between the principal and the agent. This means that the external labor market Opportunities to the agent are represented by a constant exogeneously determined. The principal must pay him in such a way that the agent's expected utility on his compensation is greater than the given outside Opportunity set. In the single period model, it will be shown, in Chapter II, that if the contract to be negotiated is bounded or if it is nondecreasing (monotonic increasing) with respect to payoff, the principal can be successful in searching for the optimal one within the above two mentioned classes of contracts. Bounded contracts are reasonable since no principal would pay his agents an unlimited amount of compensation in excess of the payoff. Monotonic increasing contracts are those of the bonus type. Bonus has been an extremely common form of remuneration. Agents are paid according to the payoff outcomes. These two classes of contracts clearly describe some common contracting behavior of economic agents. For the multiperiod model, there are no assumptions nor restrictions placed on the form and types of the long run contracts. All ad hoc assumptions and condi- tions are imposed either on the net payoff function (the payoff less the compensation paid to the agent), the total expected discounted return function (the net payoffs to the principal over the planning horizon discounted to period 0) and the probability distribu- tion of uncertain events (the period payoffs and the reported payoffs). The first two are restrictions on the economic behavior of the entity, how its payoffs, per period and total, relate to each other and to the environment in which the entity Operates. The conditions on probability distributions are more be— haviorial in nature. The distributions capture the principal's beliefs of his expectation on the agent's performance and the manner the agent reports, that is, how truthful the latter is. These are discussed in details in Chapter III and subsequent chapters. It is the h0pe of this research that the general agency model develoPed will describe, both normatively and positively, the behavior of the principal and his agent in the above context. It is a positive model because if the conditions or assumptions are met and Optimal contracts are found, the principal can predict the behavior of the agent in terms of his performance. The imperfect state information model proposed in Chapter VII will attempt to describe the reporting and auditing functions in terms of a multiperiod agency model. The reduction of the imperfect state model to a perfect state model through the auditing process and the conclusion that the audited financial statement is sufficient to design a long term Optimal contract confirms the current belief about the value of auditing. HOpefully, such a model will enhance some understanding of the process of financial reporting and auditing as they are related to the contracting procedures of the company. Most of the current researches in the agency area are conducted by imposing certain ad hoc assumptions on the model. Some of these assumptions are descriptions of the economic environment of the entity and some are behaviorial in nature while others are for mathematical tractability of the model. Assuming a solution to the problem, that is an optimal contract, exists under the proposed assumptions, the researcher carries on either to characterize the assumed Optimal contract or to draw implications from the Optimal contract. There is not any documentation about the kind of environment and conditions under which a researcher can appro- priately make such an assumption, the existence of an optimal solution. This project attempts to lay the foundation and investigate the fundamentals of the agency problem. Every attempt in the research is made to impose the minimal amount of restrictions on the behavior of the various functions. The model will be described and analyzed in its most possible generality. It is hoped that any results derived under such hypothesis are applicable to a greater variety of situations. Due to the mathematical and technical nature of the analysis, discussions in the following chapters will be separated into a general and non-technical descrip- tion of the analysis and its results in the non-starred chapters. The full technical model, with all theorems and proofs is presented in the starred chapters. The two appendices contain materials which are well-known in the literature but are crucial to the develOpment 0f the model. They are collected there for completeness. However, by developing the model in its most general form, it becomes extremely difficult to give a direct and elaborate characterization of the optimal contract even after showing its existence. Character— ization of the Optimal contract requires additional hypothesis on the model. I do not intend to carry the analysis to such an extent. Also, this research will only discuss the conditions under which the principal can design an optimal contract and suggest, wherever possible, some procedure to arrive at or approximate the solution. It will not discuss the specific numerical aspects of the actual search for solution. 1.3: Organization of the Study and Summary of Results The organization of the study is as follows. A general agency model with one principal and one agent is analyzed in Chapter II. A set of sufficient conditions for the existence of an Optimal contract will be derived. Financial reporting and auditing are then introduced into the model. As indicated earlier, the auditor is assumed to act cooperatively with the principal. This implies that the auditor's decision variables are inputted into the model exogeneously and the auditor is acting as a "surrogate" for the principal. Auditing is essentially a dynamic process: 10 analysis of the auditing function is deferred until Chapter VII in the multiperiod model. In the multiperiod model, the basic agency model is first investigated (Chapter III). The setting is the most "well-behaved" one in the sense that payoffs are observable by both the principal and the agent. The model is well-defined stochastically at time 0. This means that the sequence of payoffs over the entire planning horizon is defined stochastically given an initial payoff at the initial period. A probability distribution on this sequence of payoffs exists with a well-defined probability belief on the initial payoff. Analysis of the model is carried on a finite time horizon (Chapter IV) and infinite time horizon (Chapter V). In the second part of the multiperiod analysis, it is assumed that the actual payoff is observable only by the agent while the principal will receive a signal on the payoff. The signal is chosen by the agent and there is no restriction on how the agent should "report" the payoff. This leads to an imperfect state information model (Chapter VII). The principal must attempt to assess the likelihood of the actual payoff given the signal produced by the agent. Hopefully, he can achieve such a task through the auditing and 11 contracting processes. Serious mathematical problems arise under such a setting. These problems and their suggested solutions are discussed in Chapter VI. The model constructed in the subsequent chapters imposes no differentiability restriction on the various functions. Heterogeneous prObability beliefs between the principal and the agent are allowed. In the single period setting, it is first shown that under very mild conditions on the agent's choice set and utility function, the Nash constraint that the contract is incentive compatible is met for all incentive functions. Optimal contracts are shown to exist under two classes of functions, totally bounded and monotonic increasing functions. The multiperiod model represents a sequential decision process. Long run Optimal contracts are determined on the basis of maximizing the principal's total expected discounted net return (period payoffs less the agent's remuneration) over the planning time span. The model is considered under a finite and an infinite horizon settings. Dynamic programming algorithm is an iteration procedure over time through which (1) it computes a Conditional expectation: (2) the objective function in two variables (state and incentive) is Optimized over 12 one of these variables (incentive): (3) if an optimal contract is to be constructed, a "selector" which maps each state to a contract which achieves the maximum in the second step is to be chosen. It is shown that under certain circumstances such an algorithm is valid to solve the multiperiod agency problem in both the finite and infinite horizon models. In the finite horizon model, Optimal contract exists if the principal's total expected discounted net return is continuous and behaves "somewhat" linearly with respect to his period net return. In addition, the dynamic programming algorithm provides a stronger result. The optimal contract thus obtained maximizes the principal's period net return as well as his total expected discounted net return. The infinite horizon model is discussed under two separate sets of assumptions. The first set includes conditions that the discount factor is less than unity and that the total discounted expected net return is bounded for all contracts. Existence of optimal contracts is established. Under the dynamic pro— guamming algorithm, the Optimal contract derived is stationary, that is, the form of the contract remains the same throughout time. The second set of assumptions is that the total discounted expected net return to 13 the principal is monotonic (either increasing or decreasing) with respect to time. Under both sets of assumptions, the infinite horizon solution is shown to equal to the limit of that of the finite horizon model when time is allowed to tend to infinity. Also, it is shown the dynamic programming algorithm imple— mented under these sets of conditions converges. If appropriate measurability restrictions are imposed on the various functions, the dynamic pro- gramming algorithm can be implemented to the imperfect state information model. In this part of the analysis it is shown that if the auditor is acting strictly in the best interest of the principal and the audit is performed in the best possible manner, the audited financial statement is adequate for the derivation of a long run Optimal contract through the dynamic programming algorithm. CHAPTER II SINGLE PERIOD MODELS 2.1: A General Agency Model This section considers the situation in which there is a principal with one agent. The agent is entrusted with the task of selecting an action from among a set of alternatives. Then some random event occurs and the outcome of the action is some payoff which is assumed to be observable by both parties. The action choice, however, is not observable by the principal. A problem is then to decide on some "best" sharing scheme between the principal and the agent. Under the assumption that the agent's utility for wealth is positive and his utility for effort is negative, it has been shown that a pure wage contract is incentive incompatible, that is, the agent will always choose the act which requires the minimum level of effort (Ng and Stoeckenius [1979], Shavell [1979]). On the other hand, since the action choice is not observable by the principal, an incentive contract based 14 15 on the outcome alone may cause moral hazard problems. Under such circumstances, Harris and Raviv [1979] have shown that some information about the agent's action by the principal is desirable even at some cost, namely, some monitoring is "good". The solu- tion to the general agency problem is the determination of the action by the agent, an incentive function and a monitoring system by the principal to motivate the behavior of the agent. The usual criteria adOpted by writers in agency literature for the choice of optimal incentive con- tracts are Pareto Optimality and Nash optimality. The Pareto criterion provides the principal with an expected utility at least as great as that which is obtainable from among alternatives which satisfy some minimal requirement (minimum security level) imposed by the agent. This implies that the principal and the agent decide COOperatively on the contract and the action choice such that the expected utility of the principal is maximized subject to the minimal requirement. If the principal can observe the agent's action choice as well as the payoff, or if there is full and truthful communications between the principal and the agent on the action and payoff, it would be sufficient to guarantee that the agent will act cooperatively. Some authors call this the first best 16 solution (Holmstrom [1979], Shavell [1979]). However, a solution in the cooperative game does not necessarily imply that there is full and truthful communication of the action and payoff. The first best solution implies that the agent always acts in the best interest of the principal. He does not have a conflict of interest over effort with the principal either by the observability of the action or other monitoring means. The Nash condition assumes noncooperative behavior between the principal and the agent. It requires an additional restriction, that the contract must enable the agent to choose an action from among the alternatives which maximizes his expected utility under the contract. This allows the agent to choose his actions freely based on his own choice criteria. A typical situation of such a phenomenon would be the asymetry of information between the principal and the agent. The principal does not observe the agent's action even after the payoff is Observed. There is no way of enforcing any action choice on the agent. The best thing the principal can do then is to design the contract based on his "best" guess on the agent's action. The solution set of the Nash problem is a subset of that of the Pareto one. Hence there exist 17 solutions which are Pareto efficient but are not Nash solutions. The following is a general agency model. Define the expected utility of the principal as ¢1(I.a) = E1U1(u)(a,s) -I(w(a,s))), where a E A set of possible actions s E S set of states of nature {2:A.xS 4 R payoff function w(a,s) = m units of wealth. I: Q 4 R Incentive function I(w(a,s)). Throughout this paper, the same notation will be used to denote a function as an element of a functional space and the value of the function in the range. For example, w denotes a payoff function as well as the units of wealth payoff for a certain action a and state of nature 5. The actual reference should be clear from the context. Under most common economic circumstances, it will not be unreasonable to assume that the agent has only a finite number of mutually exclusive action choices. Then, without loss of generality, A is assumed to be a finite subset of Rn. The set S is defined with its usual statistical meaning in a decision theory context (Savage [1954]) with probability beliefs Pi on S, where i = 1 denotes the principal and i = 2 18 denotes the agent. The usual homogeneity of beliefs between the principal and the agent is not assumed in this model. Let Ei denote the expectation Operators with respect to the beliefs of the decision maker. It is assumed that the utility function of the principal, U1, is non-negative, concave and monotonic increasing with respect to wealth. It can then be shown that such a utility function is continuous on its domain (Theorem 2.7). For the agent, define his expected utility as ¢2 (Ila) = EZUZ (I (w(als) ) la) ° His utility function is also non-negative, concave, and monotonic increasing in wealth, but non—negative, concave, monotonic decreasing in effort. Assuming that both the principal and the agent are utility maximizers, one can formulate the problem as follows. Maximize E1U1(w(a.S) -I(w(a.S)) I e {1) Subject to: 02(I,a) 2 v (1) a 6 argmax E2U2(I(uwa,s)),a) (2) aEA v represents the minimum levels of security for the agent to remain in the company. Solving the above program subject to constraint (1) will yield the Pareto 19 solution. Constraint (2) represents the additional incentive compatibility condition that the Nash solution requires. 2.2: The Pareto Optimal Contract Under the conditions of Pareto Optimality, all parties are assumed to act COOperatively in the best interest of the company. The existence of a Pareto optimal contract is usually not too difficult to guarantee. Given sufficient regularity conditions, at least one solution is guaranteed. Wilson [1968] has demonstrated the existence of a Pareto optimal sharing rule and characterized the behavior of this sharing rule in a syndicate setting. Kobayashi [1980] analyzed the role of private information of an individual in the syndicate by redefining the core. He showed that an equilibrium contract belongs to the core as well as the existence of such equilibrium contracts. Amershi and Butterworth [1979] investigated the problem assuming diverse beliefs among the principal and the agents. They analyzed the conditions for existence of the optimal contract and the characteristics of such an Optimal contract. Most of their findings on the Pareto optimal contract are consistent with the results of previous studies. 20 2.3: The Nash Optimal Contract In the formulation of the Pareto problem, one can assume some kind of convexity prOperty for the Objec— tive function and the constraints are generally well behaved mathematically. Such assumptions will, to some extent, simplify the solution techniques. Under the Nash criterion, the additional constraint that the agent is able to maximize his expected utility removes any reason to impose the well behavior of the functions of the model. Also, one cannot be sure that the Optimal contracts are differentiable. Differentiability is a requirement in most optimization techniques. The problem is compounded if the principal and the agent are allowed to hold diverse beliefs about the state of the world. One common method in showing existence is by analyzing the first and second order conditions. Under the assumptions of Kuhn-Tucker in the nonlinear programming literature, these conditions are, in general, satisfactory. Another usual method is to use the Euler— Lagrange equation in the calculus of variation literature. In the agency setting, however, both of these techniques may not be apprOpriate. Both methods require differentiability of the incentive contracts. The principal's decision variable, incentive contract, is a 21 function belonging to some functional space. A space is of finite dimension if its elements can be expressed as some finite number of vectors (or basis). There are only very few finite dimensional spaces that are of interest. In fact, the only one which is of any use is the space of polynomials of finite degree. The principal is effectively maximizing his expect utilities over infinite dimensional spaces. Under such handicapped conditions, most of the common cal- culus methods fail. Both Mirrless [1974] and Gjesdal [1976] have correctly shown that the differentiability assumption may be too restrictive. Holmstrom [1977] constructed a counter-example, the optimal solution of which can be attained by a nondifferentiable sharing rule and no differentiable rule can precisely attain this solution. One of the more extensive works in demonstrating existence is also done by Holmstrom [1977]. He proved the existence of two classes of incentive contracts under the Nash conditions. His work relied heavily on the assumption of homogeneity of beliefs between the principal and the agent. He also put some additional restrictions on the behavior of the functions to arrive at his results. 22 The first step of the analysis is to generate conditions on the agent's problem such that given any contract I, his expected utility function will always achieve a maximum. This will satisfy the Nash additional condition that the contract is incentive compatible. To show incentive compatibility, the Weierstrass Maximum Theorem (Theorem 2.3) is used. Loosely stated, an upper semicontinuous function on a compact set achieves a maximum. Upper semicontinuity is a milder condition than continuity. When applied to the agent's expected utility function with respect to the action set, this means that if the agent shirks a very little bit, that is, if a is allowed to change by a small amount, then the agent's expected utility will not be increased by a large amount. Hence, upper semi- continuity simply means that the function is continuous from above. The agent should not expect a substantial gain if he changes his effort level slightly. This is reasonable in terms of expected utility when all possible states are considered. Recall that A is assumed to be a finite subset of Rn, it is bounded by definition. Compactness on the real line means closed and bounded. All one needs for A to be compact is that it is a closed 23 subset. Upper semicontinuity on ¢2 requires some additional condition on the utility function of the agent. It can be shown that if U2 is upper semi- continuous with respect to a, the action choice, then the resulting expected utility is also upper semicontinuous (Theorem 2.5). With the above construction, indeed only very mild conditions, the Nash constraint is satisfied. Then the principal's problem is considered, again applying the Weierstrass Theorem on his expected utility function. To guarantee upper semicontinuity on Oi is easy. The principal's utility is monotonic increasing and concave with respect to wealth, which is generally the residual of the payoff after the manager's compensation is deducted. A concave function is always continuous (Theorem 2.7), and the integral, in our case, the expectation operator, is continuous if the integrand is continuous, which in turn implies upper semicontinuity. Compactness of the space of incentive contracts is more difficult to show. The space of incentive contracts is of infinite dimensions. It would require stronger conditions on the contracts than just being closed and bounded. The mathematical requirement is that the contracts are totally bounded. Total 24 boundedness means that no matter how the elements of the set (in this case set of functions) are grouped, they are always enclosed in a ball of some given radius. Here the radius has to be the same for all possible groupings. It is a kind of uniform bound for all possible payoff outcomes. Thus, if the incentive contracts are totally bounded, then the Weierstrass Theorem will guarantee the existence of a maximum for the principal. In most situations, one would expect that the contracts should be increasing with respect to the payoffs or a bonus as it is more commonly called. The class of monotonic increasing contract is considered next. Each contract of this type is assumed to be bounded below. As the principal obtains the residual of the outcome after he pays the agent and his utility is increasing with wealth, given any wealth position from the outcome, he would choose the incentive function that would pay the agent the least amount. That is, the optimal contract in the principal's viewpoint can be defined as I; = inf [11,...,In} for each w E 0 where n denotes the number of possible contracts given w. As the problem is formulated, the agent must maintain a minimum security level in terms of expected utility to remain in the company, the incentive function should be bounded below for each m E O. 25 Now, consider the sequence [1;], the elements of which are defined earlier. This is a decreasing sequence and it is shown that it converges to a limit 1* which is also a monotonic increasing function (Theorem 2.12). As mentioned earlier, the upper semicontinuity on O? on A and the compactness of the action set A guarantee the existence of an optimal action for the agent given any incentive contract. Thus, for each 1;, there exists a corresponding a; for the agent. The final step is to show that the sequence [a;] also converges to a limit a* which is then the Optimal action for the agent if he is given the Optimal contract 1* (Theorem 2.13). Up to this point, it has been shown that if the agent's action choice set A is a closed subset of Rn, and if his expected utility function is upper semicontinuous with respect to A, there always exists an optimal a* such that his expected utility is maximized. This will satisfy the Nash constraint of incentive compatibility for all possible incentive contracts. Two incentive spaces are then identified such that ‘by choosing the appropriated contracts from these two spaces, it will be guaranteed that the principal's expected utility will achieve a maximum. 26 The first of these spaces is the class of totally bounded contracts. This is a very large class of functions which includes the differentiable functions and the equicontinuous functions which Holmstrom [1977] has demonstrated to be optimal under homogeneous belief assumptions. Functions from the totally bounded class are compact. The principal's expected utility function is continuous with respect to I. These two conditions will guarantee, by the Weierstrass Maximum Theorem, the existence of an optimal contract. The second class of functions is the monotonic increasing incentive functions. A typical example of these functions is the bonus arrangement. Indeed it is shown that one can always pick an Optimal monotonic increasing contract to maximize the principal's expected utility. So far, an agency model has been described under the usual decision choice theoretical context. The assumptions imposed on the model, which are summarized above, describe a very reasonable and general economic environment. The restrictions on the utility function are in accordance with the usual von Morgenstern utility preference assumptions. If the contracts are totally bounded or they are monotonic increasing, the principal will always be able to negotiate a Nash equilibrium contract such that both his and the agent's 27 expected utilities are maximized. In other words, if the agent exhibits a von Morgenstern utility preference with respect to wealth and effort, then the principal can always design a contract which is either totally bounded or monotonic increasing to achieve a Nash equilibrium. 2.4: Financial Reporting and Auditing Generally, financial reporting means the conveyance of information to the principal (owner) about the out— comes of the actions taken by the agent (manager) in a period of time. When such a function is introduced into an agency model, it implies as well that the out- comes of the events are not observable by the principal. An additional objective of the principal is then to ensure that the agent is reporting the outcome truth- fully. Ng and Stoeckenius [1979] have demonstrated that without monitoring, an incentive compatible con- tract always induces nontruthful reporting. This suggests that when the actual payoff and the agent's effort are not observable, information about the agent's performance is always valuable to the principal in terms of increasing his expected utility. Of course, the benefits derived from such information should be sufficient to justify its cost. Harris and Raviv [1979] showed that even imperfect information is bene- ficial. 28 As Baiman [1979] points out, there is a distinct difference between monitoring and auditing. The former means to verify the action taken by the agent while auditing is taken as the verification of the report of the outcome produced by the agent. In this paper, the role of the auditor is defined as another agent of the owner who "monitors" the reporting function of the manager, while the incentive contract is designed to minitor the actions of the manager. Since the objective of the principal is to control the reporting as well as the action choice by the manager, the incentive contract and the auditor can then together be considered as the monitoring system. The "black" box of unobservables increases with the addition of the auditing function into the model. In most common situations, the original report submitted by the manager is not observable by the principal. The manager's report goes directly to the auditor who performs the necessary tasks on the report, suggests apprOpriate changes and adjustments and then attests the report. The principal will receive the "audited" report after all adjustments have been made or a "qualified" or "disclaimed" report if the manager refuses to make the suggested alterations. Under such a scenario, the only variable which is observable by all parties is 29 the audited report. Any contract that is enforceable has to be based on variables which are observable by both the principal and the agent. Therefore, it is not unreasonable to suggest an incentive contract based on audited outcome. In fact, if the auditor performs all his audit tasks in the best possible manner and in the best interest of the principal, the audited outcome is shown in Chapter VII to be sufficient to design an Optimal contract. The dis- cussion of Optimal contracts under a reporting and auditing environment will be deferred until Chapter VII when an imperfect state information model in a multiperiod setting is discussed. CHAPTER II* SINGLE PERIOD MODELS 2.1* The Model Define the expected utility of the principal as m1(I.a) = E1U1(w(a,s)-I(w(a,s))), where a 6 A set of possible actions 5 6 S set of states of the world U):A.XS 4 R payoff function w(a,s) = w units of wealth I: Q 4 R incentive contract I(w) = k2. The following are some standard assumptions on decision choice theory. Assumption 8.1: Let the action set A be a closed, bounded subset of a; where 0’ is a normed linear space of dimension n < m. Assumption 8.2: The set S is non-empty with a sigma-algebra 0(5) defining the events in S, Pi’ i = l (principal), 2 (agent), are probability measures on 0(s). It is assumed that all functions are measur- able with respect to 0(5). 3O 31 Assumption 3.3: (i) The utility function of the principal U1, is assumed to be non-negative, con- cave, and monotonically increasing with respect to wealth. (ii) The utility function of the agent U2, is also nonnegative, concave, and monotonically in- creasing in wealth, but decreasing in effort. Assumption 8.4: There exists a mo. -w < wo < m. such that ]w(a,s)] g 1mg] for each a 6 A and s E S. Assumption 8.1 describes the action choice set. More details about its tOpological structure will be derived in the later part of this section. Since focus is put on a single period model, at this stage of analysis, S can be viewed as deterministic. Hence no serious problem should arise on the measur- ability of the functions, which are assumed to be measurable. The assumptions on the utility function simply imply that the preference relation of the individuals is convex, complete and transitive. Assumption 8.4 imposes a bound on the payoff function. In any economic situation, it will be very unlikely that the payoff is unbounded. One inherent consequence of Assumption 8.4 is that the incentive function is also bounded as no principal can pay his agent an unspecified amount of compensation in excess of his return. 32 Thus, the expected utility of the agent can be defined as sz(IIa) = E2U2(I (W(a,S) ) Ia) and the typical principal-agent problem represented by the following program: Maximize E1U1(w(a',s) - I(w(a'.S))) I E C Subject to: m2(I,a’) 2 V (l) a: 6 argmax E2U2(I(w(a.S)),a) (2) a€A C represents the incentive space and V represents the minimum level of security for the agent to remain in the company or the Opportunity set he could attain outside the company. Solving the program subject to constraint (1) yields the Pareto optimal solution or commonly known as the first-best solution. Constraint (2) represents the additional condition that a Nash equilibrium requires. Theorem 2.1: Let (d,H°H1) be an n-dimensional normed linear space over the real field and n < m. Then (d,H'H1) is tOpologically isomorphic to (an. n- 112)- Proof: Refer to Larsen [1973]. 33 Definition: A class of sets in a topological space is said to cover a given set X if and only if each point of X lies in at least one of the sets. If the diameter of each set in a cover of X is not greater than c, the class is called an c-cover of X. Definition: Let X be a subset of a topological space. x is said to be compact if and only if every class of open sets which covers X has a finite subclass which also covers X. Definition: A subset X of a complete normed space is said to be sequentially compact if and only if every sequence in X contains a convergent sub- sequence with limit in X. It can be shown that in a Banach space, compact- ness and sequential compactness are equivalent. The following is a well-known and useful result of a finite dimensional space. Theorem 2.2: Let 0' be an n—dimensional normed linear space over the real field with n < m. Then (i) d’ is a Banach space (ii) If A c.d' is a closed bounded set, then A is compact. Proof: Refer to Rudin [1973]. 34 Theorems 2.1 and 2.2 establish the structure of the action choice set and guarantee its compactness. By Theorem 2.1 and without loss of generality, A is assumed, for the rest of this paper, to be a closed subset of Rn with an appropriate norm and all results can then be generalized to other topological spaces. 2.2* Totally Bounded Incentive Functions In this section, by considering the requirements of the Weierstrass Maximum Theorem, a set of sufficient conditions is derived for the existence of an Optimal contract under the conditions established in Section 2.1*. Definition: A real—valued function f defined on a normed space X is said to be upper semicontinuous at xo if and only if lim sup f(x) g f(xo). X"X 0 Theorem 2.3 (Weierstrass Maximum Theorem): An upper semicontinuous function on a compact subset X of a normed linear space achieves a maximum on X. Proof: Refer to Luenberger [1969]. First consider the agent's problem. The idea is that given any incentive contract, we want the agent to be able to select a E A such that his expected utility function will achieve a maximum. This will satisfy the Nash additional condition that the contract 35 is incentive compatible. Since A is a closed and bounded subset of Rn, it is compact by Theorem 2.2 (ii). All that remains to be shown is that m2 is upper semicontinuous on A. Assumption 8.5: For each fixed 5 6 S and e > 0, there exist a 6(5) > 0 such that 6: S 4 R is measurable and U2(I(w(a,s)),a) < U2(I(w(ao,s)),ao)+-a whenever Ha-—aOH < 6(5). Theorem 2.5: Let I(w(a,s)), w(a,s) be bounded measurable functions and U be the measure Of S. Suppose for each fixed 5 E S and each s > 0, there exists a 6(5) > 0 such that 6: S 4 R+ is measurable and U2(I(w(a,s)),a) < U2(I(w(ao,s)),ao)+-e/2 when- ever Ha-—aOH < 6(5). Then m2 = [S U2(I(w(a,s)),a)P2(s)ds is upper semicontinuous on A. Proof: Let c > 0 be given. For each s 6 S define 60(5) = sup 6(5). Clearly 60(5) is measurable since 6(5) is measurable. Let = f . ‘ So [5 E S. 60(5) > 1) 36 and Since 60(5) > 1 for every 5 G S For Y > 0 to be chosen later, there exist a subset SI of S such that u(S-—S') < y. k Hence there is a k such that S’ = L) Sn and n=0 50(5) <11 Vs 65’. Let 5’ = %. Then U2(I(w(a.S)).a) < U2(I(w(ao.S)).ao) + 9/2 / ,2 . I whenever Ha-aofl \ 6 and 5 6 S . This implies IS! U2(I(w(a.s)).a)P2(s)ds < IS' 02(I(w(ao,s)),ao)92(s)ds + Is’ e/2 P2(s)ds. By Assumption S.4, w(a,s) is bounded for each a 6 A, s E S. This implies that I and consequently U2 are both bounded. Thus for each 6 > 0, there exists a 7 > 0 such that g-S’ U2(I(w(als))ra)P2(S)dS < 62/2 whenever u(s-s’) < Y 37 [S U2(I(w(a.8)).a)P2(S)ds = J‘s: U20: (1(a,s)),a)P2 (S)ds + £_S, U2(1(w(a.8)).a)P2(s)ds < [S U2(I(w(ao.S)).ao)P2(S)dS + 5/2 + Is 5/2 P2(s)ds Is U2(I(w(ao,s)),ao)P2(s)ds + e. QED Assumption S.5 says that locally, when a approaches a limit point, the increase in utility to the manager given a very small change of effort will not be large. This requirement seems reasonable, because with a small increase in the agent's effort he should not expect a large increase in the outcome as well as his compensation, or else he will be remunerated for very trivial efforts. Given an incentive contract I, Theorems 2.2 (ii) and 2.6 guarantee the compliance of the requirements of Weierstrass' Theorem and hence the Nash criterion is satisfied. The problem is now reduced to finding an incentive contract such that the principal's expected utility function achieves a maximum. It is easy to verify that space of incentive functions satisfies the prOperties of a vector space. Since any linear combination of two contracts is also a contract, i.e., for every I and I E C, dI + (l-—o)I E C, 1 2 1 2 O l, C is locally convex. Formally, the above Cr. HA IL’\ idea can be stated in the following theorem. 38 Theorem 2.6: The space of incentive functions C is a locally convex vector space. Assumption 8.6: C is complete. Hence, by the above construction, C is a locally convex Banach space. Now, let us digress for a moment and investigate the behavior of the utility function of the principal under Assumption S.3 (ii). Theorem 2.7: Let ‘U: R 4 R be a concave mapping on (a,b). Then U is continuous on (a,b). Proof: The following proof is due to Rudin [1974]. Suppose a < s < x < y < t < b. Write S for the point (s,U(s)) in the plane and do the same thing with S,Y, and T. Then X is on or above the line SY: hence Y is on or below the line through S and X: also, Y is on or above XT. As Y 4 X, it follows that Y 4 X, i.e., U(Y) 4 U(X). Left-hand limits can be obtained in the same manner. Then continuity of U follows. QED The above theorem establishes the continuity of the utility function on its domain. The integral of a continuous function is also continuous. Theorem 2.8: For a given a E A, m1(I.a) = I U1(w(a.S)-I(w(a,s)))Pl(s)ds is continuous on C. S 39 If C is of finite dimension, then by Theorem 2 (ii), a closed subset of C is compact. However, if C is of infinite dimension the compactness of [I] is not easy to guarantee. This leads to the following. Definition: A subset X of a normed space is called totally bounded if and only if, for each e > 0, there is a finite set of Open balls which form an c-cover of X. Theorem 2.9: If a closed subset I of a complete normed space C is totally bounded, then it is sequentially compact. Proof: Let [In] be any infinite sequence in I. As I is totally bounded, there is a finite l-cover of 1 1 . _ I, say [1(f1,§),...,I(fk,§)] where fj E I, j — 1,...,k. At least one of these balls contains an infinite sub— sequence, [In 1} say, of [In]. Take next a finite %—cover of I and as before extract an infinite sub— sequence, {I 1 say, of [I 1 contained in one of n,2 n,l these balls. Proceed in this way to find for each m . c a ‘ r ‘1 an 1nf1n1te subsequence [In’mj say, of “In,m—l' such that [In m} is contained in a ball of diameter m-l. I Now, consider the diagonal sequence [In n}. Then CD , m . {Ij,j}j=n 1s a subsequence of {In,n]j=n' and so 1s contained in a ball of diameter n-1. Therefore, ‘In,n'-Im,m‘ g l/M1n(n,m). Thus [I 7 is Cauchy, and hence is convergent in I n,ni as C is complete and I is closed. 40 This shows that [In] has a convergent subsequence, and proves sequential compactness. QED In a Banach space, compactness and sequential compactness are equivalent. The requirements of the Weierstrass Theorem are satisfied to guarantee the existence of an optimal incentive contract such that the expected utility of the principal is maximized. The following gives a useful example of the class of totally bounded functions. Definition: Suppose Q is a subset of Rn, C(O) is the sup-normed Banach space of all continuous real-valued functions on 0. A set C C C(O) is said to be equicontinuous if and only if, for each 6 > 0, there exists a 6 > 0 such that II(w1) —I(w2)] < e for all w1,u12 6 O with Im1-w2] < 6 and for all I E C. Notice that for a given 6, the same 6 can be chosen for every I in J. Equicontinuity means roughly that the degree of continuity is independent both of the position in the set 0 and of the functions in the space C. Theorem 2.10 (Ascoli): Suppose Q is a bounded asubset of Rn, C(O) is the sup-normed Banach space 41 of all continuous real-valued functions on 0, and C C C(O) is pointwise bounded and equicontinuous. Then C is totally bounded in C(Q). Proof: Refer to Rudin [1973]. 2.3* Monotonic Incentives Definition: 1(0) is monotonic increasing if I(w1) g I(w2) whenever wl < wz for all w1,w2 6 O. In this section, the class of incentive contracts which are monotonic increasing as defined above is considered. Recall that the principal's objective is to find an incentive function such that his expected utility is maximized. As the principal obtains the residual of the outcome after he pays the agent and his utility is increasing with wealth, given any wealth position from the outcome, he would choose the incen— tive function that would pay the manager the least amount. That is, in the principal's viewpoint, the optimal contract can be defined as I; = inf[Il,...,In] for each w 6 0. In E C, and n denotes the possible contracts given w. It is clear that I; is a decreasing sequence. As the prOblem is formulated, the agent must maintain a minimum security level in terms of expected utility to remain in the company. 42 Thus, the incentive function should be bounded below at least pointwise. More explicitly, there exists a * Io such that inf[In} 2 IC for every w 6 0. Theorem 2.11 (Egoroff): Let the measure of a set X be OIX) < m. If [fn] is a sequence of measurable functions which converges pointwise at every point of x, and if e > 0, there is a measurable set E C X, with u(x-—E) < e, such that {fn} converges uniformly on E. Proof: Refer to Rudin [1973]. Theorem 2.12: Suppose I C C, where C is the class of monotonic increasing functions, each I is pointwise bounded below for each x E 0, and Q is a n * . subset of R . Let In = inf [11,...,In} where * [In] C C. Then In 4 1* and the limit I* E C. Proof: Let ml < wz and w1,w2 E 0. Con51der a subsequence {1.1 k} of {Ii}, where l g i g n and k=1,2,°". L81”. * - ) and 1* Then *( < . * In wl) - I1,j(wl) g I1 m(ml) = Il'm(w2) _ 1n(m2), 43 * Thus, In is a monotonic increasing function of w * for each n. Clearly, 11(w) 2...2 Io(w) for every * w 6 0. Therefore, In(w) 4 I*(w) pointwise for each * w 6 0. By Theorem 2.11, In 4 I*. What remains to be shown is that I* E C. Let s > 0 be arbitrary. Let and be chosen as above. There exists "’1 u”2 an n such that * * In(w2) < I (wz) + 6. Then it * * * I (011) g In(wl) g In(w2) < I ((112) + s. This implies that I*(w1) < I*(w2) + 5. Since 6 is arbitrary, thus I* E C. QED Theorem 2.12 provides a scheme to pick an optimal contract from a class of nondecreasing functions and it also establishes the existence of such a contract. If the principal were to enforce I*, we still have to ensure that the agent will be induced to choose the correct action. By Theorem 2.6, the upper semicontinuity of $2 on A and the compactness of the set A have guaranteed the existence of an optimal strategy for the agent given any incentive contract. Possibly there can be more than one Optimal strategies for an incentive contract. Under such circumstances, it is CI. 44 assumed that the agent will choose among the possible optimal strategies one which would maximize the princi- pal's expected utility. Let a* denote such strategy. Thus, for [1;], there exists a corresponding sequence of optimal strategies [a;] by the agent. 'k Clearly [an] C A. Theorem 2.13: Suppose I C C where C is the class of monotonic increasing functions, each I is pointwise bounded below for each m E 0 and Q is a n * . * subset of R . Let 1 = 1nf{1 ,...,I 1 and 1 4 I n 1 n n on 0. Suppose the set A is bounded below. Then there exists an a* such that m2(1*,a*) is maximized. * '\ o a Proof: Since {an} C A which is a closed and bounded subset of Rn, there exists a subsequence * 'k {a } such that an 4 a* where a* E A. We claim m m that a* is the optimal action choice. Let c > O * be arbitrary. By the convergence of [In] on 0. 1* *) * * /2 @2( nlan < ®2(I Ian) + C o By Theorem 2.5 m2 is semicontinuous on A. * * * ' * m2(I ,an) s 11m sup m2(1 ,an) n49 g C02 (I*Ia*) 0 Suppose there exists an a0 6 A such that * s C02(I*Ia) < 02(I*:ao). 45 * Then, by the convergence of [In] * * w2(In'ao) > m2(I ’ao) — e/2 > m2(I*.a*) - e/2 * * > o2(1 ,an) - e/2 * * Since 5 is arbitrary, we have * 'k * mZIIn.ao) > 62(In.an). But this is impossible, because a;, by construction,g is the action which maximizes m2 given 1;. Therefore, a* is the action which maximizes m2 given the Optimal contract 1*. QED CHAPTER III MULTIPERIOD MODELS - PRELIMINARIES 3.1: Introduction There are two major considerations in the construction of the multiperiod model: mathematical rigor and economic as well as behavior implications. Obviously, there is never a sharp distinction between the two, for they are not mutually exclusive. The thrust of discussions in this and subsequent chapters will be devoted to the develOpment of a rigorous analytic model for the prOblem. No specific re- strictions are imposed on the incentive contracts or its parameters. The main interest of the analysis is to investigate under what conditions would the princi- pal be able to negotiate an Optimal contract with the agent such that the expected utilities of the outcomes for both parties would be maximized over the long run. Most of the conditions which guarantee the existence of long run Optimal contracts are imposed either on the net wealth return to the principal per period, the 46 47 total expected discounted return to the principal, or the probability distributions of the outcomes. Conditions on the return functions, both per period and in total, are descriptions of the economic enrironments of the entity. For example, one of the conditions required in the infinite horizon model (Chapter V), is that the return to the principal per period is bounded and the discount factor has to be less than one. These are reasonable descriptions of common characteristics of an economic enterprise. Conditions on probability distributions are mainly behaviorial in nature. The type of "Bayesian" update as prOposed in the imperfect state information model (Chapter VII) is similar to that imposed in a typical behaviorial research. Multiperiod models have not been extensively dealt with in the agency literature. Authors of recent works attempted to extend their models to their multi- period counterpart at the end of their respective works, for example, Baiman and Demski [1980]. Lambert [1981] views the multiperiod model in terms of a two-period agency and prOposed a delayed payment scheme for the agent. Radner [1981] investigated the behavior of contracts in a repeated agency setting and concluded that if the process was repeated for long enough period 48 of time, one can approach the first-best solution (a Pareto Optimal contract) within some predetermined error bounds. His analysis imply an infinite horizon model. This and subsequent chapters will attempt to formulate the multiperiod agency problem and obtain sets of conditions which guarantee the existence of Optimal or nearly optimal contracts. The problem is first formulated in a typical decision choice model setting in terms of the most basic and rudimentary generic elements of the decision processes of the principal and the agent. Virtually no assumptions are imposed on the functions other than those which are required on the action choice set and the utility function of the agent to guarantee the compliance of the Nash incentive compatibility requirement. Any analysis cannot be carried too far under such a general formulation of the model. Dynamic programming algorithm is a procedure for searching solutions in dynamic problems and is well documented in both the Operation research and stochastic Optimal control literature. By reformula- ting the multiperiod agency problem under the dynamic programming hypothesis, 1 am able to adapt the recent findings of the above mentioned two bulks of researches to investigate the existence of long run Optimal contracts. 49 The adaptation of the dynamic programming algorithm does not cause the model to lose any of its intended generality. The algorithm provides a structural and systematic procedure to search for an Optimal or nearly optimal contract. It is through this systematic procedure that one can find the conditions under which an Optimal contract can be found. This research is interested in the procedure of the search and the adaptation of such a procedure to search for a solu— tion. It does not intend to go on to investigate the actual numerical and computational aspects of the algorithm. The multiperiod agency problem is developed as a sequential decision process on a discrete time basis. Contracts are negotiated at the end of each period and there are no restrictions on the forms of the contract. There is no requirement that the contracts negotiated are to have the same form throughout the entire time span. The agent is free to leave the com- pany any time within the horizon the model considers. When the agent leaves the company and a new one is hired, then the model will revert back to time 0 and the Optimization process will restart all over again. 'The principal will attempt to maximize his total «Expected return over the time span and seek the corres- EKNfiding Optimal contracts to achieve his goal. 50 In the first part of the analysis, a most basic multiperiod agency model will be considered. Both the principal and the agent observe the payoff at the end of each period before the contract is negotiated. Payoffs in any particular period depend stochastically on the payoffs in the previous period, the incentive contract of the current period and the total accumula- ted wealth of the company. Such a transition function is assumed to be well-defined stochastically. Should there be any disturbance arising from the above description of the payoff transition, there always exists a probability distribution on the disturbance given the payoff and the incentive contract. In other words, the entire sequence of payoffs over the planning horizon can be stochastically defined at time 0 once an initial payoff we is specified. The maximization process is then reduced to finding a sequence of con— tracts corresponding to the payoffs over all possible initial payoffs. Considerations of the basic model will be given on finite horizon (Chapter IV) and infinite horizon (Chapter V) assumptions. The finite horizon model <:onsidered will be a long run model. Although a two- ;neriod agency is not excluded in the analysis, the focus 113 that of a much longer period, the economic life of the entity. 51 The development of this part of the analysis dwells heavily on writings in the Optimal contral literature by Blackwell [1965a, 1965b, 1974, 1978], Denardo [1967], Strauch [1966], and particularly, Bertsekas and Shreve [1978]. The second part of the research centers on the imperfect state information model. Payoffs are observ— able only by the agent who reports a signal to the principal concerning the payoffs. Since the agent can choose any information signal he pleases, it becomes difficult if not impossible for the principal to assess the likelihood of the actual payoffs conditioned on the signal received. He can no longer project the sequence of expected payoffs such that he can Optimize his expected utility as in the first part of the analysis. In the imperfect state model, necessary steps must be taken to guarantee the existence of a probability distribution on the discrepancy between the actual pay- off and the signal reported (Chapters V1 and VII). This is important because the principal is maximizing his total expected discounted wealth and there is no way that he can assess his expected wealth without some de- gree of control on the distribution of the disturbance uflaich is the discrepancy in the current context. 52 Works in the literature concerning the imperfect state information models are sparse. The more notable ones are Astrom [1965], Juskevic [1976], Sawaragi and Yoshikawa [1970], and Striebel [1975]. The proposed model will draw materials from all these papers and monographs freely, in particular that by Striebel whose work is also documented by Bertsekas and Shreve [1978]. 3.2: The Basic Multiperiod Model It was proved in the single period model that under very mild conditions on the agent's action choice set and his utility function, the Nash constraint of incentive compatibility is satisfied for all incentive contracts. These conditions can be trivially carried over to the multiperiod model. The condition on the action choice set is that it must be compact. In a multiperiod setting, the agent is choosing his sequence of optimal actions from the possible action set which is in fact the Cartesian or cross product of the sets AO'A1"°"AN-l' Each of the Ai' i = O,l,...,N-l, is compact by assumption. The Cartesian product of compact sets is also a compact set. The multiperiod model is formulated in such a way that both the ;principal and the agent is maximizing their expected (iiscounted utilities on the outcomes. The agent will (iiscount all his expected remuneration over the time 53 periods to time 0 and apply his utility preferences on the discounted compensation. The only utility function in question is the one at time 0 which is assumed to be upper semicontinuous with respect to the action choice set. With these two assumptions on the action choice set and the agent's utility function, the Weierstrass Maximum Theorem will guarantee that the agent will be able to select a sequence of optimal actions such that his expected utility is maximized. In this and subsequent chapters, it is assumed that all Ai's are closed subsets of Rn, and hence compact (Theorems 2.1 and 2.2(ii)), and the agent's utility function is upper semicontinuous with respect to the actions (Assumption 8.5). Thus all contracts under consideration are incetive compatible. It is also assumed that if there are more than one Optimal action choice for a particular contract, that is, the solution to the agent's problem does not have a unique solution, the agent will choose among the possible "Optimal" solutions one which maximizes the principal's expected utility. Now, given any contract, the agent is guaranteed that he can select an Optimal action a* corresponding tc>the contract. He will execute a* with payoff O, a contract We 6 U is N-stage e—0ptimal if * k 'f * JN(U)) "' c. l JN(W) < °° J (m) 2 N'Fc — l/e if J;(w) = w The contract We 6 H is said to be e-optimal if J*(m) -e if J*(w) < on J (w) 2 ”e l/e if J*(w) = on Definition: If [on] is a sequence of positive numbers with en 4 O, we say a sequence of contracts [We] is said to exhibit [en1-dominated convergence to optimality if 11m JN,T = JN n4m n and for n = 2,3, *(. f J* JN w)-—c 1 N(w) < w JN,Vn(w) 2 J * N.Fn_1(w) - 6 1f JN(w) = 0° 3.4* The Dynamic Programming Algorithm The discrete time sequential decision process provides a natural framework for adopting the dynamic programming algorithm in generating an Optimal contract or nearly optimal contracts. This section gives a definition of JN "(w) and JF(w) in a dynamic programming framework. 72 Consider the mapping H: OCF 4 R* defined by H(UJ'I0J) = E*{g(w.I.J.y)+aJ(f(w.I.J.y))]I.w1- Hk represents the total expected discounted net return to the principal from period k through N-l given an incentive contract and a payoff outcome. To describe the Operations for all N periods together, define for each 1 6 M the mappings TI: F 4 F by TI(J)(w) = H(w,I,J) for every w 6 O and T: F 4 F by T(J)(w) = sup H(w,1,J) for every w 6 Q. 16U(w) Let Tk, k = 1,2,... be composition of T with itself k times and TO(J) = J for every J 6 F. JN,w(w) and Jv(w) can now be defined in the dynamic programming context. J (w) = (T T . . .T ) (J )(w) N,v IO 11 IN_l o J (m) = lim (T T ...T )(J )(1) F N4m Io Il IN-l o The solution can be expressed as JN(w) TN(JO)(w) J(w) 11m TN —c under assumptions one or three. * This is also the case if JN < a under assumption two (Theorems 4.2 and 4.3). J being negatively k”(w) finite is trivially satisfied in any economic situations. No company can survive infinite losses for any one period, not to mention that Jk is the total net re— turn for k periods. The same situation applies for * * JN being finite. It is unrealistic to have JN in- finite in any common business environment. With the above facts established, one can say that dynamic programming algorithm is apprOpriate as a solution pro— cedure for the finite horizon models. Next, the question of whether an Optimal or nearly optimal contract would exist under the assumptions is investigated. Under assumption three, it is shown that there always exists an [en1—dominated convergence to optimality sequence of contracts (Theorem 4.3). Recall that assumption three says that if J is to be approximated by a sequence of functions [Jn] then 77 each Jn can in turn be approximated by a sequence of contracts within some predetermined bounds. Theorem 4.3 guarantees that such sequence of contracts is an [en1—dominated convergence to optimality contracts. This implies that assumption three not only guarantees the existence of a nearly optimal contract, it ensures that one can find or adOpt an iteration procedure to approximate the nearly Optimal contract. In the develOpment of the model, there are no restrictions placed on the individual functions and their parameters with the exception of U2, the agent's utility function being continuous from above with respect to the action choice to guarantee the compliance of the Nash constraint and H under the above three assumptions. The model is drawn in its maximum generality. A trade-off the researcher has always had to make under such general setting is the inability to make specific characterization of the Optimal or nearly Optimal contracts. Such comments can only be made if more information about the behavior of individual functions are known, that is, more assumptions and restrictions on the functions are needed. The only characterization of the Optimal contracts, under the current general formulation, that can be made is that how the contracts are related to each stage in the iteration procedures of the dynamic 78 programming algorithm or how they behave each period in the process of arriving at the Optimal. This may sound disturbing to the application-oriented reader. This research is carried on in the interplay of the economic model and the dynamic programming model. The correspondence in the two models, once established, will allow the dynamic programming algorithm, which is documented in both the Operation research and Optimal control literatures, to take care of the various numerical and computational aspects of the actual search for solution to the model. An uniformly N-stage Optimal contract is always the more desirable contract because it maximizes the total net return at any given point of time over the planning horizon rather than just at the end of N periods. Such a contract can be shown to exist if at each period and for each payoff outcome, there is a contract that would maximize Hk' that is the supremum is attained in the relation Tk+1(Jo)(w) = sup H[w,I,T 16U(w) And if a uniformly N-stage optimal contract exist, k (JO) ]. * JN = TN(JO) (Theorem 4.4 and its corollaries). In the dynamic programming algorithm, each Tk(Jo) is computed sequentially over all possible contracts. Under the structure of the model prOposed, this is equivalent to saying that a sequential decision process 79 is adopted and the total wealth return is maximized at each stage of the process. This entails the question of whether an Optimal can be achieved each period. If the incentive constraint set is compact, which is a typical requirement for optimization problems, an Optimal contract can always be achieved (Theorem 4.5). Together with the results of Theorem 4.4, it can be concluded that under the apprOpriate assumptions of the finite horizon model, an uniformly N-stage Optimal contract is guaranteed to exist under the dynamic programming algorithm. CHAPTER IV* FINITE HORIZON MODELS 4.1* Introduction and Assumptions In considering the N-stage optimization problem, * the central question is whether J = sup J = TN(J ) N N,F 0 W611 * such that the Optimal JN can be obtained by successively computing T(JO),T2(JO),... via the dynamic programming algorithm. Another issue is the existence of Optimal or nearly Optimal contracts. In order to respond to the above questions affirmatively, some assumptions have to be imposed on the function H. Assumption F.l: If {Jk} C F is a sequence satisfying Jk+1 2 Jk for all k and H(w.I,J1) > -6 for all m 6 Q, I 6 U(w), then lim H(w.I,Jk) = H(w,1,lim J k-Ocn k-Ooo k) (1160. I 6U(=.u). Assumption F.2: There exists a scalar a 6 (0,6) such that for all scalars r 6 (O,m) and functions J 6 F, we have H(wIIIJ) 2 H(wIIIJ-r) 2 H(WoIIJ) -ar UJ £17: I 6 U(LU)0 8O 81 Assumption F.3: There is a scalar B 6 (O,w) such that if J 6 F, {Jn} c F, and [an] c R satisfy Zeno» en>0n=l,2, n=l J = lim Jn J 2 Jn n = 1,2, nam J(W)-€n n=1I2I W60 J(LU)<°° Jn(w) 2 Jn_1(u1)--en n = 1,2, m 6 J(w) = w H(w,1,J1) > -w for every w 6 O. I 6 U(w) then there exists a sequence [In] CIM such that lim T (J) = T(J) n46 In n and T(J)(u1) -Ben = 1.2, w E Q,T(J) (1) <.. TI (Jn)(w) 2 n TI (Jn_1)(u)-Ben = 1.2. w 6 O.T(J)(w)==w n-l Actually, Assumption F.l is a special case of F.2. Careful examination of the two assumptions will show that if Assumption F.2 is met, F.l will be satisfied. Assumption F.3 is somewhat more complicated. It allows one to get stronger results on the existence of nearly Optimal contracts than can be obtained under F.2 (Theorem 4.3). 82 4.2* The Finite Horizon Model The first step in the develOpment of the model is to show that the function H as defined in the previous chapter satisfies the three assumptions on the finite horizon model. The proof of such a statement requires four technical lemmas on outer integrals and probability measures. These results are very standard in mathematical analysis and probability theory. For completeness, they are stated without proof as follows. Lemma 4.1: If e > O and f g g g f4—c, then * 4* * [fdp g] gdp g f fdp+2e. Lemma 4.2: Let (X,B,p) be a probability space. Let [en] be a sequence of positive numbers with Z) c < m. Let {f ] be a sequence with lim f = f n n n n=l n4m pointwise f 2 fn for n = 1,2,"'. Let f(x) —cn if f(x) < co fn(X) 2 H 8 fn-l (x) - en if f (x) and [*fldp < w. Then lim [*fndp = f* fdp. n-Dco Lemma 4.3: Let (X,B,p) be a probability space. If p*([x:f(x) = m}) > 0, then for every g, ‘k g :X 4 [-m,m] and B-measurable, I (g4-f)dp = m. 83 Lemma 4.4: Let (X,B,p) be a probability space. If E c.X satisfies p*(E) = 0, then for any f:.X 4 R* [*fdp = [*xx_Efdp. Theorem 4.1: The mapping H(w.I.J) = E*{g(w.I.J.y)+—aJ[f(w.I.J.y)]lu.I‘ satisfies Assumptions F.2 and F.3. Proof: For r 6 (o,m) H(WIIpJ) 2 H(WoI,J-r) o By Lemma 4.1, H(w.I.J) [*(g(w.I,J.y)4-OJ[f(w.I.J.y)])p(dy]w.l) 2 H(UJoIIJ"r) 2 H(w,I,J)-2dr. Thus F.2 is satisfied. Let J 6 F, [Jn] C F, [an] C R on satisfy 2) e < m, e > O and for all n =1 n n J = lim Jn J 2 J n-Oco — n J(w)-—en if J(w) < m Jn(w) 2 Jn_1(w)-en if J(w) = w H(w,I,J1) > -® V w 6 Q, I 6 U(w). 84 Let [In] C:M be such that for all n T(J)(w)-en if T(J)(w) < e T_ (J)(w) 2 In 1/en if T(J)(w) = e T_ (J) 2 T_ (J). In In-l Consider the set A(J) = {w 6 013 I 6U(w) with p*({yIJ[f(w.I.J.y)] = wllwol) > 0] where p* denotes p-outer measure. Let I 6 M be such that p*({le[f(w.f(w).J.y>] = mllw.i(w)) > o v w e A(J). Define for all n 1(w) if w 6 A(J) In(w) if m Z.A(J) Claim that [In] thus defined satisfies the requirement of F.3 with B = l4—2d. For m 6 A(J), by Lemma 4.2 lim inf T_(Jn)(w) n4m I lim inf f*{g[Wpi(UJ) IJIY] + aJn[f(WIi(w) IJIY)]1P(dY1woi(W)) n—Oco [*{g[w.f(w).J.y]+—aJ[f(w.I(m).J.y)]}p(dy]w.I(W)). Since T_(J)(w) > -m, by Lemma 4.3 I lim inf T_ (Jn)(w) = e ; T(J)(w). n4w I n 85 For m 6 A(J), for all n p*({y1J[f(wIIn(w)lJIY)] = 0°31w,In(u1)) = 0. Let Bn 6 v {y]J[f(w.In(w),J,y)[ = e] C.Bn and P(Bn(w.In(w)) = o v n. By Lemmas 4.1 and 4.4 T (J )(w) Inn = f* xv_Bn(w)(g[w.In(w).J.y]-taJn[f(w.In(w).J.y)}p(dylw.In(w)) 2 - 2de n = TI (J) (u) - Zaan n V w 6 A(J), lim inf T (Jn)(w) 2 lim inf T 11-400 In n46.) T(J)(w) lim inf TI (J )(w) > T(J)(w) V w 6 Q. n4oo n n But TI (Jn) g T(J) V n = 1,2,... by hypothesis n lim TI (Jn) = T(J). n4co 11 If m is such that T(J)(w) < m, then m E A(J). by Lemma 4.1, T1 (Jn)(w) 2 T (J)(w) - Zoen n n IF ./ T(J) (w) — (1+ 261) en. Ii: lX'V-Bn(W)1’g[mlIr1(W)IJIY]"1' aJ[f(w.In(w) IJIY) ]}p(dy1w,1n(w)) I (J)(w) n Thus, 86 If m is such that T(J)(w) = m, then either (i) wEMJ) or (ii) w6A(J). (1) Let the/A(J) T (Jn) (w) 2 TI (J) (to) -2ae 1 n n n 2 T1 (J) (to) - Zoen n—l 2 TI (Jn_l)(w)-2oen. n-l (ii) Let h 6A(J) * — — ., TIn(Jn)(w) = [ {g[w,1(w),J.y]4-0Jn[f(w.1(w),J.y)]J p(dy]w.1(w)) * _ — . 2 I {ng.I(w).J.y]+aJn_1[f.J,y)]1 P(dylw.1(w)) -2den = TI (Jn_1)(w)-2a€n n-l The claim is proved by letting 8 = 14-23. QED Next, the correspondence of the solution values between the economic model and the dynamic programming * model, that is, JN = TN(JO), is established under the assumptions. Theorems 4.2 and 4.3 also show the existence of a nearly Optimal contract under Assumptions F.2 and F.3. Theorem 4.2: (a) Let F.l hold. Let Jk U(w) > -w for all m 6 0, V 6 U, and k = 1,2,...,N. Then * N JN - T (JO). 87 * (b) Let F.2 hold. Let Jk(w) < m for all m 6 Q and k = 1,2,...,N. Then * N JN — T (Jo) and for every 6 > 0, there exists an N-stage e- Optimal contract, i.e., a We 6 H such that * J J* JN 2 N,F 2 N - s. Proof: (a) For each k = 0,1,...,N-l, consider a sequence [1;] C:M such that lim T .[TN'k'1(J )] — TN‘k(J ) k = 0,1, ,N-l . 1 o 0 14m I k N-k-l N-k-l _ T i[T (JO)] g T 1+1[T (Jo)] k - 0.1.. .N-l Ik Ik 1 = 0,1, By F.1 and Jk F(w) > -m, we have at3 \ JN 2 sup...sup T i ...T 1 (Jo) 1o lN-l I O I N-l o N-l = sup...sup T . ...T . sup T . (J ) . . 1 1 . 1 o 1 1 o N—2 1 N—l o N-2 IO IN_2 N-l IN_1 = sup...sup T i ...T i [T(Jo)] 1o lN-Z I o 1 N-2 o N-2 88 * Since J = sup J (w) V w 6 O N NEH N'” = sup (T T ...T )(J ) ten Io I1 IN-1 o g TN(JO) * .. JN = TN(JO) (b) The result clearly holds for N = 1. Suppose it holds for N = k. Then J; = Tk(Jo) and for a 'k c > O 3 We 6 H such that Jk,F€ 2 Jk-e. By F.2, V I 6 M J T \ * k+l 3 1(Jk,v ) s TI(Jk) ' as B ' d t' J* > k+1(J ) B definition y in no ion, k+l = o . y k+l * T (Jo) 2 Jk+l k+l _ * T (J ) — Jk+1 Let E > 0 be given and let 5 = (Io,Il,...) be such that * _ J _ 2 Jk - e/Zd. k,I Let i e M be such that 'k * - Ti(Jk) 2 T(Jk) - 6/2. Consider the contract E_ = (I,Io,Il,...). Then 6 * ._ 'k _ J _ = T_(J _) 2 T_(Jk) - 5/2 3 T(Jk) - c k+1,v= I k,v _ I I * _ = J — e QED k+1 89 If Assumption F.1 were to be replaced by the following assumption, the results of Theorem 4.2 (a) can be strengthened. Assumption F.1': The function Jo satisfies Jo(w) g H(w.I.Jo) V w 6 O, 1 6 U(w) and if [Jk] C F is a sequence such that Jk+1 2 J 2 J for all k, then lim H(w.I.Jk) = H(w.I,1im J ) V w 6 C, I 6 U(w). k—Ooo k-hao k Corollary 4.2.1: Let F.1' hold. Then * JN = TN(JO). Theorem 4.3: Let F.3 hold. Let Jk U(w) > -w * V w e o. n e n and k = 1,2,...,N. Then JN = TN(JO). Furthermore, if {an} is a sequence of positive numbers with en 4 0, then there exists a sequence of contracts {Tn} exhibiting {en1-dominated * convergence to Optimality. If, in addition JN(w) < m V w 6 0, then for every 3 > 0, there exists an e—optimal contract. Proof: Let K = l. J:(w) sup J1 (h) WED 'F = sup H[w,I(w),JO] I€M = T(Jo)(w) V w 6 Q. 90 Given [en], by F.3, it is clear that there a sequence [Tn] C H satisfying lim J = J n42 1,Vn l * J1(u)) - c J (m) 2 1,J — - n J1,7T (w) - on n-l Suppose the result is true for K = N-l. exists Let 8 be as specified in F.3. Consider a sequence r . .en} C R W1th an > O V n co lim e = O and Z) c n4m n=1 A A n r11C = Let (Tn, H, where Vn (11,1 . A=J 11m J H N—l n -) be such that —oo * -1 V * JN_1(u)-B an m 6 Q JN_1(w) < m J A ((1)) E _1 * N-l,Trn J A (w)-B en V w 6 0 JN_l(w) N-1,W n-l Since Jk ”(w) > —m V w 6 O, k = 1,2,...,N, W 6 H then H(w,I,J A ) > —m V w 6 0, I 6 U(w). Since N-l'Tr n 1' J J* 1m = A _ n4m N-l,w N 1 11 By F.3, it then implies H {1:} CIM such that V n * 11m T (J A ) = T(JN- * . a: ‘ T(JN_1)(.L) an if T(JN-l) (w) < co T1n(JN 1 A )(w) 2 T (J >( ) 'f T(J* >( ) — In. A LU '6 1 (.U o In-l N—1,w N-l o -l . . * N—l . . . By induction JN-l = T (J ) and by definition of * * JN, TN(JO) 2 JN. Hence * * J(JN_1) = TN(JO) 2 JN But J* 2 1 '1‘ (J ) T(J* ) 1m - N n4m I N-1,F N—l n N J - T(JN-l) — T (Jo) Let n = (In,1n,1n, .). Then for all n n o l 2 _ * 11m JN,W — JN n4m n ‘k .. V . h * JN(u))--en w 6 0 w1t JN(w) < JN v (w) 2 * . * ’ n J (w)-—c V w 6 0 With J (N) = N,W n N n-l Obviously, if J;(w) < w, Tn is e-optimal. QED Theorems 4.1 through 4.3 established the validity of adOpting dynamic programming algorithm as solution procedures to the basic multiperiod agency problem. The existence of nearly optimal contracts has also been demonstrated. In fact, the dynamic programming algorithm can provide much stronger results than those as stated. 92 Under conditions provided in Theorems 4.4 and 4.5, the algorithm is actually attempting to arrive at a uniformly optimal contract. * Theorem 4.4: A contract v* = (10,11,...) is uniformly N-stage optimal if and only if (T *TN-k-1)(JO) = TN‘k(Jo) k = 0,...,N-l. Ik Proof: Let (T *TN-k-1)(JO) = TN'k(JO), k = 0,...,N-1. Ik N-k Then (T *...T * )(JO) = T (JO). But Ik IN-l J* T N-k 2 (TI* .. 1* )(Jo) k N-l and N—k * T (Jo) 2 JN-k * .— JN_k (TI* .TI* )(JO) k N-l v* is uniformly N-stage Optimal. Conversely, let U* be uniformly N-stage Optimal. Then by definition, T(Jo) = J = T * (JO). For every 1 6 M, (TIT)(JO) = (T T * )(Jo). This implies 93 2 T (J ) = SUP (T T)(J ) 0 16M I o = SUP (TIT * )(JO) 16M IN_l * _ 2 g J2 (TI* TI* )(JO) g T (JO) N—2 N-l 2 _ * _ _ T (Jo) - J2 — (TI* T * )(Jo) — (TI* T)(Jo). N—2 IN-l N-l By induction, the result can easily be extended for all QED Corollary 4.4.1: There exists a uniformly N-stage optimal policy if and only if the supremum is attained in the relation Tk+1 (JO) (1”) = sup H[(1.‘=.I,Tk (JO)] I6U(w) for each m 6 O and k = 0,1,...,N—l. Corollary 4.4.2: If there exists a uniformly N- stage Optimal policy, then * N JN — T (JO). Theorem 4.4 and its corollaries state that if the supremum can be achieved at each stage of the optimiza— tion process, the resulting contract is uniformly Optimal. The next theorem states that if the incentive constraint set is compact, the existence of a supremum at each stage is guaranteed. 94 Lemma 4.5: Let C be a Hausdorff space, f: C 4 R* and U a subset of C. Let the set U(i) = {1 e Ulf(1) 2 1] is compact for each 1 e R. Then f attains a maximum over U. Proof: If f(1) = m for all I 6 U, then every 1 6 U attains the maximum. If f* = sup[f(1)]1 6 U} < m, let [1n] be an increasing sequence such that In < Xn+l for all n and 1 4 f*. n Thus U(ln) C‘U(1 ) for all n and the sets U(ln) n+1 are nonempty and compact, H U (1n) = U(lo) is n=l m nonempty and compact. Let 1* 6 m U (1n) C‘U and n=l f(1*) 2 1n for all n f(1*) 2 f*. But f(1*) g f* f(I*) = f*. QED Theorem 4.5: Let the incentive space C be a Hausdorff space and assume that for each m 6 0, l 6 R and k = 0,1,...,N-1. The set Uk(w,l) = [1 6 U(w)]H[w,I,Tk(JO)] 2 l] is compact. Then J* = TN(JO) and there exists a uniformly N-stage N Optimal policy. 95 Proof: Direct application of Corollary 4.4.1 and 4.4. and Lemma 4.5. CHAPTER V INFINITE HORIZON MODELS 5.1: Introduction The construction of the infinite horizon models requires a few more considerations than those in the finite horizon models. Recall that in the economic formulation of the model J (w) = lim J (m), that F N4m N,F is, Jv(w) is the limiting value of the total expected discounted net return to the principal as N gets large and JW(w) = lim (TI ...TI )(Jo)(w) in the N46 0 N-l dynamic programming model. The natural initial question would be whether such limits exist and if they do, would they be equal. In addition, an immediate concern would be the convergence of the dynamic programming algorithm. Another important matter to resolve is the question of whether or not optimal or nearly Optimal contracts exist. This chapter will address the above problems on two separate sets of assumptions: contraction and monotonicity. Both assumptions are reasonable 96 97 descriptions of the economic environments of a business enterprise. The contraction assumption says that payoffs and returns that are receivable at a far distant future have very little significance in the economic decisions that are made today. In computing present values, it is well-known that with a discount factor of less than unity, the present value of any finite amount for a long period of time is close to zero. A discount factor of less than unity means that interest rate is always greater than the inflation rate. The second set of assumptions describe a mono— tonicity behavior on the total net return function J. This assumption can be set up in two distinct scenarios. First, J can be monotonic increasing with respect to time. Since J is the total accumulated expected net return to the principal, the assumption that J is monotonic increasing implies that the net payoff each period to the principal is greater or equal to zero. If one were to consider the net return for each period in terms of expected value over all possible payoffs, the requirement that the expected net payoff to the principal is positive makes economic sense. The entity will not survive if the expected payoff is negative. 0n the other hand, if the total net return 98 function is monotonic decreasing, the situation considered is still not totally unrealistic. Consider the following situation. An entity has been suffering losses consistently from period to period. This would mean that the payoff for each period is negative and consequently J is monotonic decreasing. The objec- tive of the model is to maximize expected total return which means that it will search for some incentive contracts that would make the magnitude of period loss diminish over time with the expectation that the loss will become identically zero and eventually swing over to the positive side. Once the period payoff becomes positive, J will be monotonic increasing and the modeling will continue under the earlier set of assumptions. Of course, such a situation will occur if the return becomes positive before the company goes bankrupt. 5.2: The Contraction Assumption The contraction assumption is motivated by the contraction prOperty of the mapping H associated with discounted stochastic optimal total return with bounded net return per period. As mentioned earlier, this assumption will be satisfied with the discount factor, a < l and the net return function g uniformly bounded 99 above and below. As always is the case, bounded return is an extremely reasonable assumption in most economic settings. Theorem 5.1 will show that indeed H as prOposed in Chapter III satisfies the contraction assumption. The next set of questions to be considered involve whether the iteration process will generate contracts that lead to an optimal total return function, that is, whether J* = TN(J), and when the number of periods N N is allowed to tend to infinity, whether the dynamic programming algorithm converges to any kind of limit. If a finite limit exists, what is that limit equal to? It can be shown that such a limit does exist and is equal to the infinite horizon total return function for any incentive contract (Theorem 5.2(a)). Also, it is * shown that JN equals to TN(J) for all N. These two statements together imply that J*, the infinite horizon Optimal return function is equal to lim TN(J) N-Oa (Theorem 5.2(b)). Also the mappings Tm and TT, where m g N are contraction mappings (Theorem 5.2(c)). With the aid of the Fixed Point Theorem which says that a contraction mapping will converge to a unique fixed point, J* is shown to be the unique fixed point. Therefore, the validity of the dynamic programming algorithm is established for the contraction assumptions. 100 Next, some attempts to describe the Optimal contract are made. If for each financial position, involving both the total return to date and the last net payoff, there exists an Optimal contract at that position, then the optimal contract is stationary (Theorem 5.5). This means that the same form of contract is Optimal over the entire time span. This result can further be strengthened by showing that the maximum is attained for each financial position if the incentive constraint set is compact. Compactness means that the set is closed and bounded and is the usual requirement for optimization problems. Also, since T is a contraction mapping, the sequence Tm(J) does converge to a limit as m tends to infinity. Hence the Optimal contract may be obtained in the limit from finite horizon optimal contracts by successively computing T(J),T2(J),... (Theorem 5.6). The conver- gence of Tm(J) implies the convergence of the dynamic programming algorithm for the infinite horizon model under the contraction assumption. As it has been stated, the contraction assumption yields very desirable results. The optimal total return J* can be computed through a finite horizon mode TN(J) by letting N go to infinity. It is unique. If the 101 maximum is attained at each stage of the iteration process, then the resulting optimal contract is stationary. 5.3: The Monotonicity Assumption In this section, two separate and parallel sets of monotonicity assumptions are considered. It is assumed that J the total return function to the k’ principal at the end of time period k is a monotonic increasing or monotonic decreasing function over, that is Jo(w) g H(w,I,JO) (Assumption 1) or Jo(w) 2 H(w,I,Jo) (Assumption D) for all w 6 O and all I 6 U(w). The economic motivation and interpreta- tion of these two assumptions have been discussed in the introduction of this chapter. In the following analysis under Assumptions 1 or D, two additional assumptions are required on the behavior of the function H. The first Assumption 1.1 or D.l, describes a continuity prOperty on H. Similar to Assumption One of the finite horizon models, this assumption guarantees that for small changes in the value of J, there will be only small corresponding changes in the value of H. The second Assumption, 1.2 or D.2, imposes a kind of linearity on H from below. This guarantees that H cannot decrease more than a fixed multiple with a 102 decrease in J. Again, this assumption is similar to Assumption Two under the finite horizon model. The economic implications of these two assumptions will be identical to that under the finite horizon models. As expected, Assumption 1 will be satisfied if g, the net return to the principal per period is positive and Assumption D is satisfied if g is negative (Theorem 5.7). Hence, the analysis can be carried on to investigate the equivalence of the economic and dynamic programming model under these two assumptions. This is done in three stages: (1) the Optimality equation J* = T(J*) is investigated, (2) the question of existence of optimal or nearly Optimal contracts is settled, and (3) the convergence of the dynamic programming algorithm is shown. 5.4: The Optimality Equation Before considering whether the optimality equation J* = T(J*) holds, the existence of a nearly optimal contract for Assumption D is first established. If Assumptions D, D.l and D.2 all hold, and the optimal return J* is finite, the existence of nearly optimal contracts is guaranteed (Theorem 5.8). In addition, if the discount factor a is less than one, then the nearly Optimal contract is stationary. 103 The optimality equation can be established under all Assumptions D, D.l and D.2 (Theorem 5.9 and Corollary 5.9.1). However, the optimality equation holds under Assumption 1 only when Assumption 1 is accompanied by one of the additional conditions 1.1 or 1.2 (Theorem 5.10 and Corollary 5.10.1). A consequence of the results of this section is that the validity of adopting the dynamic programming algorithm for the monotonicity assumptions is established. 5.5: Convergence to Optimality and Existence of Optimal Contracts Under Assumptions D, D.l and D.2, it can be shown that if the supremum is attained in the Optimality equation J*(w) = sup H(w.I.J*) I6U(w) for every w 6 0, then the resulting contract is an optimal stationary contract. In fact, the conditions are both necessary and sufficient (Theorem 5.11). The same results can be obtained under Assumption 1 (Theorem 5.12). However, to arrive at the conclusion, one additional assumption, Assumption 1.1, is required. If the supremum cannot be attained for some w 6 0, it is shown (Theorem 5.13) that, under Assumptions D, 104 D.l and D.2, the optimality equation can still be used to construct a nearly optimal contract. In addition, if the discount factor is less than one, the nearly Optimal contract as constructed is stationary. Under Assumption 1, only a weak counterpart of the above results can be given. If Assumptions 1 and 1.2 hold and if the optimal total return function J* is finite, then a nearly Optimal contract exists (Theorem 5.15). However, I am unable to give a counter— part under Assumption 1 on conditions for existence of a stationary nearly Optimal contract. The dynamic programming model requires successive generation of the functions T(JO),T2(JO),...,Tk(JO),... for all k. In terms of the infinite horizon model, it seems apprOpriate to define a function JOD by Jm(w) = lim TN(J )(w) for every w 6 0. N-Om O The rest of this chapter will be devoted to the discussion of whether J2 exists and whether Jco = J*. In other words, it is a concern that whether the dynamic programming algorithm converges under the two assump- tions and if it does, will the limit be the same as the Optimal total return. 105 Fortunately, if Assumption 1 and 1.1 hold, all the above questions can be responded affirmatively (Theorem 5.14). However, under Assumption D, the equality J2 = J* may fail to hold even in very simple situations. A preliminary result shows that in order to have JOD = J*, it is necessary and sufficient to have J2 T(Jm) (Theorem 5.16). The convergence of the dynamic programming algorithm under Assumption D is essential to arrive at the Optimal total return. To show tha fact that Jco = T(Jm) requires some elaborate technical details concerning the algebraic structure of the various functions. These are pre- sented in detail in Chapter V*. It can be generally stated that if the incentive constraint set is compact, then J2 = T(Jm) = T* under Assumptions D, D.l and D.2 (Theorems 5.16 through 5.18). Once convergence is established, it is shown (Theorem 5.19) that the limit of the dynamic programming algorithm is an optimal contract and it is stationary. 5.6: Remarks 1n Chapters 111 through V, the focus of attention is on a basic multiperiod agency model in which the period payoffs are observable to both the principal and 106 the agent and the sequence of payoffs are stochasti- cally well-defined with a known probability distribution. Analysis have been conducted on a finite and infinite horizon settings. The circum- stances or conditions under both settings that will guarantee the principal's ability to negotiate an optimal contract with the agent are extremely mild. The two conditions imposed on the finite horizon model are both on the manner in which the total net return to the principal in the k4-1st period behaves relative to the total net return to the kth period. One of the conditions says that if the total return, or total accumulated wealth in the entity, in period k changes by a small amount, then given all other factors equal, the expected change in the total wealth of period k+-l is also small. The second condition goes on further to say that the change is somewhat linear. These two together implies that given some changes in the wealth of a company, the effect of such changes in subsequent periods is prOportionate to the change. Both are conditions on the economic environment in which the entity Operates. I believe that most if not all business enterprises are Operating under these circumstances. If future wealth were so unpredictable with respect to current changes in company wealth, 107 companies will be very hesitent to declare dividends, acquire ventures which requires capital outlays. The contraction and monotonicity assumptions under the infinite horizon model are also conditions on the economic environment of the enterprise. The contraction assumption relies on the same assumptions that any discounted cash flow or present value models take on. These assumptions which include bounded payoffs and discount factor less than one are widely accepted in the literature. The monotonicity assump- tion is based on the belief that no company will remain in business if its expected period payoff is negative. If any one of these two assumptions is met and con- sideration of the model is not restricted to a finite number of periods, one would be able to construct an optimal stationary contract. This is possible because by extending the planning horizon to an infinite period of time, one allows time to become a monitoring device. By repeating the process over long enough periods, the behavior of the agent becomes extremely predictable. On the other hand, if the agent intends to remain in the company "forever", it will be difficult for him to cheat "forever" and remain undiscovered. It will then be in his best interest to act cooperatively with the principal. 108 Certainly, if the number of periods in the finite horizon model is allowd to extend long enough, all the infinite horizon model effects will be carried through. The limiting effects of the infinite horizon model provide an additional nicety to the optimal contract: that it is stationary. CHAPTER V* INFINITE HORIZON MODEL 5.1* The Contraction Assumption The following assumption is motivated by the contraction prOperty of the mapping associated with discounted stochastic optimal total return with bounded return per period. Assumption C: Let B be the Banach space of all bounded real-valued functions on Q with the supremum norm. There exists a closed subset Bo C B such that JD 6 Bo' and for all J 6 BO, 1 6 M, the functions T(J) and TI(J) belong to Bo' Further- more, for every n = (10,11,...) 6 H, the limit lim (T T ...T ) (J ) (w) N-Oco IO I]. IN-l 0 exists and is a real number for each w 6 0. In addition, there exists a positive integer m and scalars w'a' With 0 < m < 1. a > 0, such that )(TI(J) -TI(J’)|1 g oHJ—J’H v 1 6 M, J,J’ e B [)(T T ...T )(J)-(T T ---T )(JVH g {DJ-J!“ v I ,...,1m_1 6M, J,J’ e B 109 110 The assumption is stated in a very general setting. It is often convenient to take B = B and assume a < l with g uniformly bounded above and below. However, in some special cases, the contraction pro- perty can be verified only on a strict subset B of B. To start the analysis, it is first shown that the function H meets the requirements of the assumption. Theorem 5.1: Let H(w,1,J) = E*{g(m.I,y)4-dJ[f(w.I,y)]]w,1} be a mapping. Let Jo(w) = O for every w 6 0. Assume that a < l and for some b 6 R, there hold 0 g g(w,1,y) g b V w 6 Q, 1 6 U(w). y 6 1. Then Assumption C is satisfied with B0 = (J: J 6 B, J 2 O} and a = 2a, m = d and m = 1. Proof: Clearly Jo 6 B0 and T(J), T1(J) 6 Bo for all J 6 Bo and 1 6 M. Then for any W = (10,11,...) 6 H, JO 2 TI (J ) g .3 (TI ...TI )(JO) 0 o k g (1‘ ...T )(J ) ])p(dylw.I) * i: g] g(w.I.y)p(dylw.I)+a] J[f(u;-.I.Y)]p(dy|x.1) * g b+o]‘ J[f(w,1,y) ]p(dy]u),1). Thus N—l (TI ...TI )(JO)(u)) g E akb o N-l k=0 g l?a VwecaN=1,2,... '- lim (T -..T )(J)(u:)6R Vweo. N4m Io IN-l O ngJ ((1)) .y] + aJ[f(w.I (w) .y)] g gfm (w) .y] +aJ’[f(w.I (w) .y) ] + aHJ -J’H Vw6Q,J,J’6B,16M and ye'y. By Lemma 4.1, J1*{9[w,1(u).yl+ aJ[f(w.I (w) .y) ]}p(dy[h,1) g (*{g[h,1(h).y]+aJ’[f(h.1(w).y)])p(dy|h.1) -t2aWJ-J’W Hence TI(J) (w) - T (J’) (m) g ZoHJ-J’H. I Similarly, we obtain TI(J’) (w) - TI (J) (m) g 2el1J-J’il ITI (J) ((1)) -TI (1’) (w)l g 2oHJ-J'H. Taking supremum on the left-hand side over w 6 O ()TI(J) (w) -T1(J') (111))! g onJ-J’H ‘v’ I 6 M, J,J’ 6 B. If J,J' 6 Bo' again we obtain g[w.I (w) .y] +dJ[f(w.I ((11) .Y] g 9(w.I(i).Y1 +oJ'[f(m,I (w) .y) ] +aUJ—J’H. 112 Then [*{g[w.1(w).y]+-aJ[f(w.I(w).y])p(dy]m.I) g [*{9[w.1(w).y]i-GJ'[f(w.I(w).y)]i-QHJ-J’H1p(dy)w.l) g J‘*{9[WoI(W) IY] + aJ’[f(wlI (W) IY) 11p (dy1LUII) +C1HJ “J’H- Proceeding as before, we obtain HTI(J)-TI(J’)H g oHJ-J’H V I 6 M, J,J’ 6 BO. QED Theorem 5.2: Let Assumption C hold, then (a) For every J 6 B0 and u 6 H, J = lim (T ...T )(J ) = lim (T ...T )(J). ” N4m o N-l ° N4m Io IN—l (b) For each positive integer N and each J 6 Bo' sup (TI ...TI )(J) = TN(J) and WE“ o N—l J* (T T )(J ) TN(J ) = sup ... = . N ten Io IN—l ° 0 (c) The mappings TM and T?. I 6 M are contraction mappings in B0 with modulus m. Proof: (a) Let k 2 0 be any integer and k = nm-tq where q,n 2 O and 0 s q < m. By C, for any J,J' 6 Bo ()(TI ...T )(J)-—(TI ...T )(J’)H g onaqHJ-J’H. o k-l o k-l Since Jo 6 Bo 113 Taking limits as k 4 a and n 4 a 11m (T ...T )(J) = lim (T ...T )(Jo). k4a Io Ik-l k4m Io Ik-l (b) By assumption Tk(J) 6 ED for all k. Thus Tk(J)(w) < a V w 6 O and k. For any 5 > O, 6 M, k = 0,1,...,N-1 be such that let 1k T__ (J) 2 T(J) - e IN—l (T_ T)(J) 2 T2(J) - e IN-Z (T_ TN‘1)(J) g TN(J) - c. I 0 By Assumption C TN(J) g (T_ TN’1)(J) + 6 IO 6 T_ [(T_ T1”) (J)+ e] + e ’ 1 I o 1 g (T_ T_ ,TN—2)(J) + as + e ’ I I o l N-l k g (T_ T T_ )(J) + ( Z) a 6) IO Il IN_l k=O N-l k gsup(TI...TI )(J)+ (23 0.8) — V6U o N-l k=O TN(J) g sup (TI ...TI )(J). NEH o N-1 But TN(J) 2 sup (TI ...TI )(J) by definition — V6H o N-l . * _ _ N .JN—sup(TI...TI )(J)—T(J). W6H o N-l 114 (c) By Assumption C, T? is a contraction mapping. Also for all I 6 M, k = 0,...,m-l, and k J,J’ 6 B O (T ...T )(J) g (TI ...TI )(J') + cpHJ-J'H. o m-l o m—l Taking supremum of both sides over I 6 M and from k part (b) Tm(J) g Tm(J’) + cpHJ-J’H. Similarly T'“(J’) < T‘“(J) + cpllJ-J’H HTm(J) -Tm(J') H < coHJ-J’H. QED Theorem 5.2 provides some very preliminary results. It establishes the validity of the dynamic programming algorithm. The next theorem is the well-known Fixed Point Theorem in Banach Space which is quoted below without proof. Theorem 5.3 (Fixed Point Theorem): If B0 is a closed subset of a Banach space with an apprOpriate norm and. L.:Bo 4 B0 is a mapping such that for some positive integer m and scalar o 6 (0,1), !!Lm(2) -Lm(z’)H _g_ (DMZ-2’1! for all z,z' 6 BO Then L has a unique fixed point in Bo' Furthermore, for every Z 6 B, 115 lim HLN(Z)-Z*H = 0 N49 where Z* 6 Bo such that L(Z*) = 2* With the aid of the Fixed Point Theorem, Theorem 5.4 characterizes the optimal total return function J* and Theorem 5.5 the total return function JI corresponding to any stationary contract (1,1,...) 6 It also shows that these functions can be obtained in the limit via successive application of T and TI on any J 6 B. Theorem 5.4: Let Assumption C hold. Then (a) The optimal profit function J* 6 Bo and is the unique fixed point of T within 30' Furthermore, if J’ 6 B0 is such that T(J’) 2 J’, then J* 2 J’ (b) For every I 6 M, the function J 6 B I 0 and is the unique fixed point of TI within B . o (c) lim HTN(J)-J*H = o v J 6 B ' o N-ba lim HT§(J)-J H = o v J 6 B , I 6 M. I 0 N40 Proof: By Theorems 5.2(c) and 5.3, T and TI have unique fixed points in 30' Clearly T¥(J) = JI. Thus part (b) is proved. 116 Let 3* be the fixed point 3* = T(J*). For any E > 0, let I 6 M be such that T_(J*) 2 3* - E. I By Assumption c that HTI (J) -TI(J') H g oHJ -J’H T2(J*) 2 T_(J*) - OLE 2 3* - (1+O)E. I I - . m ~ ~ m..1 _. Continuing, T_(J*) 2 J* - (l-tG-t..uta )6. By the I _ . I I assumption that ”(TI TI ...TI )(J)--(TI TI ...TI )(J )h o l m-l o l m-l g T’EJ-J'H 2m ~ m ~ m-l - T_ (J*) 2 T_(J*) - co(l+Cl+...+0. )e I ‘ I 2 3* - (l+co) (l+(1+...+C1m—1)E. Thus for all k 2 l TEm(J*) 2 3* — (1+co+...+ cpk—l)(l+a+...+am-1)E. I . . km ~ . . . Since J_ = 11m T_ (J*). Taking limits as k 4 a I R40 I J 2 3* — —1-—(l+a+...+om-1)E. i - 1% Let E =(1-CD)(1+G+...+CIm-1)-l J_2J*—e. I But J* 2 J and e > O is arbitrary 117 On the other hand, J* = sup lim (TI ...TI )(J*) IEU N4a o N-l lim TN(3’*) = 3* N49 W\ ~ J* = J*. By Theorem 5.3, part (c) follows immediately. Since T is a monotonic mapping by assumption by part (c) it follows that if J’ 6 Bo such that T(J’) 2 J’ then J* 2 J’. QED Theorem 5.5: Let Assumption C hold. Then (a) A stationary contract v* = (I*,I*,...) 6 U is optimal if and only if TI*(J*) = T(J*). Equivalently, r* is optimal if and only if TI*(JI*) = T(JI*) (b) If for each m 6 0 there exists a contract which is Optimal at w, then there exists a stationary optimal contract. Proof: (a) If r* is Optimal, then J1* = J*. By Theorem 5.4(a) and (b) TI*(J*) = T(J*). If TI*(J*) = T(J*), then TI*(J*) = J*. By Theorem 5.4(a), J1* = J*. If TI*(JI*) = T(JI*)' Again, 118 by Theorem 5.4(a) ,...) be an Optimal contract at w 6 0. By Theorems 5.2(a) and 5.4(a) J*(w) = J *(w) I w = lim (T * ...T * )(Jo)(w) k-Ha I I o,w k,w = lim (T *, ... T * )(J*)(UD k4ao I I o,w k,w 3 lim (T * Tk)(J*)(w) k4a I o,w = T * (J*)(w) o,w g T(J*) (w) = J*(w) T * (J*)(w) = T(J*)(w) for each w. I 0,0.) * Define 1*(w) = 10 w(w) for 1* 6 M. Then T *(J*) = T(J*). By part (a) the stationary contract I (I*,1*,...) is optimal. QED Theorem 5.5 also establishes the existence and characterization Of stationary optimal contracts. Part (a) of Theorem 5.5 shows that there exists a 119 stationalry Optimal contract if and only if the supremum is attained for every w 6 0 in the optimality equation, J* = T(J*). Theorem 5.6 strengthens this result by showing that the supremum is attained if Uk(w,1), the incentive constraint set is compact. It also shows that stationary Optimal contracts may be obtained in the limit form finite horizon optimal contracts by successively computing T(J),T2(J),--°. At the same time, it also proves the convergence of the algorithm. Theorem 5.6: Let Assumption C hold. Let the incentive space C be a Hausdorff space. Suppose that for some J 6 Bo and some integers ko > O, the sets uk(u.x) = {1 e U(w>1u[h.1.Tk(J)] 2 1} are compact for all m 6 Q, 1 6 It and k 2 k0. Then (a) there exists a contract v* = (1;,I*,...) 6 H attaining the supremum for all m 6 Q and k 2 ko with initial function J, i.e., k (T *Tk) (J) = T +1(J) v k 2 k. Ik (b) For every contract r* satisfying (a), the '1‘ sequence [Ik(w)} has at least one limit point for each m 6 Q. 120 (c) If 1*: 0.4 C is such that I*(w) is * a limit point of {Ik(w)} for each w 6 0, then the stationary policy (I*,1*,...) is optimal. Proof: (a) Since Tk+1(J)(w) = sup H[w,1,Tk(J)] 16U(w) and Uk(w,1) are compact for k 2 k0. By Lemma 4.5, Tk+1(J)(w) attains a maximum (T *Tk)(J) = Tk+1(J). Ik * 'k (b) Let 1* = (10,1 .) satisfy part (a). 1,.. Define €k=sup{iiJ*-Ti(J)H]i-2k} k=o'1'ooo Since T(J*) = J*. By Assumption C and part (a) 1'.va - (T n“) (J) 2) = (T(J*) -T"” 0 such that for all scalars r > 0 and functions J6F with JogJ, H(onIJ) g H(WoIIJ+r) g H(wlIIJ) + or V w 6 O and I 6 U(w). Assumption D: Jo(w) 2 H(w,I,JO) V w 6 0, I 6 U(w) Assumption D.l: Let [Jk} C F be a sequence such that J g Jk g Jo for all k. Then k+1 lim H(w,I,Jk) = H(w,I,lim Jk) V w 6 Q and I 6 U(w). k4a k-Nn Assumption D.2: There exists a scalar a > O such that for all scalars r > O and functions J 6 F With J g Jo H(wlIIJ) " (11' S H(onlJ-r) __<_ H(wlIIJ) V w 6 Q and J 6 U(w) Clearly, under either set of assumptions, JV is guaranteed to be well-defined by the monotonicity of J for all F 6 H. It is also easy to see that under each of these sets of assumptions the limit, lim (TI TI ...TI )(JO)(w) is well-defined as a N44 o 1 N-l real number or 1?- Indeed, in the case of Assumption 1, 123 O 0 g (TIOTI1 ...TIN_1)(JO) g ..., and Jo 2 TIO(JO) 2 (TIOTII)(JO) 2 2 (TIOTIl ...TIN_1)(JO) 2 in the case of Assumption D. In both cases, the limit clearly exists in the extended real numbers for each m 6 0. Once again, as a first step, the function H is shown to satisfy these assumptions. Theorem 5.7: Consider the mapping H(w,I,J) = E*(g(w.I.y)i-aJ[f(w.I.y)])w.I). Let Jo(w) = 0 V w 6 o. If 9(w.I.y) 2 0 V w 6 o. I 6 u(w). y 6 y then Assumptions 1, 1.1 and 1.2 are satisfied with the scalar in 1.2 equal to a. If g(w.I.y) g 0 V w 6 0. I 6 U(w). y 6 W then Assumptions D, D.l and D.2 are satisfied with the scalar in D.2 equal to a. Proof: Since Jo(w) = O V w 6 0 and 9(w.I.y) 2 O or g(w.I.y) g 0 V w 6 O. I 6 U(w). Y 6 W 124 Assumptions 1 and D are trivially satisfied. By the monotone convergence theorem of integration, Assumptions 1.1 and D.l are satisfied since g(w,1,y) 2 O. For all r > O and J 6 F with JO 2 J H(onpJ+ r) = E*{g(wIIIY) +aJ[f(WrIIY)]+ar1wII] E*[g(w.1.y) +aJ[f(w.I.y)]lw.I) + or H(w,I,J) + or. Hence 1.2 is satisfied. Similarly, using 9(w,1,y) g 0 for all r > O, J 6 F, J g Jo H(w,1,J-—r) = H(m,1,J) — or and D.2 is satisfied. QED In proving the next theorem, the following, admittedly confusing notation is adopted. Notation: The contract Vk[w] = (I:[w],1§[w],...) is associated with w. I§[w] denotes, for each w 6 Q and k a function in M while I:[w][z] denotes the value of 1§[w] at an element Z 6 0. Theorem 5.8: Let Assumptions D, D.l and D.2 hold. Let J* < a and e > 0 be given, there exist an e-optimal contract. Furthermore, if, in D.2, the scalar a < l, the contract We is stationary. 125 Proof: Let [ck] be a sequence such that 6k > O for all k and a Z akek = e. k=0 For each m 6 0, let [vk[w]} C U be a sequence of contracts of the form rk[w] = (1:[w],1§[w],...) such that for k = 0,1,... Since J* < a, such a sequence exists. Let 1k 6 M be defined by Ik(w) = I:[w](w) V w 6 Q and J k defined by 3 ((1)) =H[U.),I ((1)),lim(T ...T )(J)]Vw6o. k k i-m 11m] I‘m] ° 1 k = 0,1, By D and D l 3 (w) = lim(T ...T )(J )(u)) = J ((1)) k 1.... Io[u)] 1‘36] ° ”km 2J*(w)-€k V 0.160: k=0,1,°" By D.2, for all k = 1,2,... and w 6 O T (3k)(w) k-l Hiw.Ik_1(u)) Ijk] H[w.ik_1(LU) I (J*. - ek)] ll\/ IIV H[wI-I-k_1(W) III*] — (16k IN H[u),I (w).1im(T ...T )(J )] k‘l IE-1[w] IE-1[w] ° - ask = Jk_1(w) - ock. 126 Then TI [Ti (Jk)] 2 Ti (Jk_l-aek) k-2 k—l k-2 2 Ti (Jk_1-—a ck) k-2 2 J (a~ 4-026 ) — k—2 bk-l k (Tf T_I_ )(Jk) 2 Jo - (del+...+d ck) o k-l k (T_ T)(J)2J*-Eae.. I I O ‘ i=0 1 o k Let Us = (10,11, ) and taking limits JV“ 2 J* - c If a < 1, take ck = e(l-d) for all k. Let "a = (1,1,...), where I(w) = Io[w](w) for all m E Q- The r is an c-optimal stationary policy. QED 5.3* The Optimality Equation The next two theorems with their corollaries prove the Optimality equation, J* = T(J*), hereby establishing the validity of the dynamic programming algorithm. The corollaries attempt to set up the algorithm for stationary contracts. For Assumption D, the Optimality equation requires the compliance of all D, D.l and D.2, whereas under Assumption 1, the same results hold only under 1 and one of the additional conditions, 1.1 or 1.2. 127 Theorem 5.9: Let D, D.l and D.2 hold. Then J* T(J*). Furthermore, if J’ e F is such that J’ g Jo and J’ g T(J’), then J' g J*. Proof: For every V = (Io'Il"") 6 H and w E D. By D.l JF(w) = if: (TIOTIl ..TIk)(Jo)(w) = T [lim (T --T )(J )](w) Io k4» I Ik o 1 S TI (J*)(w) g T(J*)(w). 0 Taking supremum of the left-hand side over F J* g T(J*). Let el and 82 > 0. By Theorem 5.2, 3 a contract F = (i ,Il,...) such that o T_ (J*) g T(J*) - 61 I0 and J7? > J* " 92 1 where 7T1 = (IloIZI ‘) J_ = lim (T_ T_ ...T_ )(JO) e T_ [lim (T_ ...T_ )(Jo)] I R49 I I o 1 k E H 128 = T (J_ ) 2 T (J*) - ac - _ - 2 I0 ”1 I0 _2_ T(J*) - (61+a92). But J* 2 J_ and 61 and 32 are arbitrary. Then W J* 2 T(J*) J* = T(J*). Let J’ e F be such that J’ g Jo and J’ g T(J’). Let {ck} and a sequence with ek > O and a contract E = (fo.f1,...) e n be such that {(W)2NW)-% k=mL~- Ik From D.2 J* = sup lim (T ...T )(J ) wen ksa Io Ik ° 2 sup lim sup (TI ...TI )(J') _ TTEH k-Oco O k 2 lim sup (TI ...TI )(J') — k-No o k 2 lim sup (T ...T )[T(J')-e ] ’ k4» Io Ik-l k 2 lim sup (T ...T )(J'-—e ) ‘ kea Io Ik-l k 2 lim sup (TI ...TI )(J') — akek _ k‘w O k-l k i k i 21im [T(J’)—(Z a 6.1)] 2J' — Z a 61 _ R*m i=0 ’ i=0 since 6i are arbitrary a J* 2 J . QED 129 Corollary 5.9.1: Let D, D.1 and D.2 hold. Then for every stationary contract w = (1,1,...), JI = TI(J1). Furthermore, if J' E F is such that J’ g Jo and J’ g TI(J') then J’ g JI. Proof: Use U (m) = {I(w)} instead of U(w) V w 6 0. Use Theorem 5.9. QED * Lemma 5.1: Let I hold. Then J* = lim J . N-Oa: * Proof: Clearly J* 2 JN for all N. Hence * J* 2 lim J . Also for all F = (10,11...-) 6 U: — N4o N (TI ...TI )(Jo) g J * o N-l N Taking limits on both sides * J g lim J . W — N4m N Taking supremum of the left hand side * * I J 3 11m JN N-Ooo * J* = lim JN. QED N-Om Theorem 5.10: Let I and 1.1 hold. Then J* T(J*). Furthermore, if J' 2 J0 and J’ 2 T(J'), then J’ 2 J*. Proof: Using the arguments in Lemma 5.1 for all w E 0 lim sup H(w,I,J;) = sup 1im H(w,I,J;). N*m W€U(w) W€U(w) N*w 130 By 1.1, then 'k * 1im T(JN) = T(lim JN). N—Ooo N-bco Since I and 1.1 are equivalent to Assumption F.1', by Corollary 4.2.1, N J — T (Jo). * * Thus T(JN) = TN+1(J ) = J o N+1' By Lemma 5.1 and combining the results we have J* = T(J*). Let J’ 6 F be such that J’ 2 Jo and J’ 2 T(J’). Then J = sup 1im (T ...T )(J ) wen N4m Io IN-l 0 g lim sup (T ...T )(J) - Nae wen Io IN-l ° g lim sup (TI ...TI )(J’) — Nam JEH o N-l g 1im TN(J’) g J’. QED _ Nam _ Corollary 5.10.1: Let I. and 1.2 hold. Let Q be a finite and J*(w) < m for all w E 0. Then J* = T(J*). Furthermore, if J’ E F is such that J’ 2 Jo and J’ 2 T(J*), then J’ 2 J*. Proof: Using a nearly verbation repetition of the * proof of Theorem 4.2 (b), we have JN = TN(JO) for all N = l,2,'°'. We will now show that * * a 1im H(w,I,JN) = H(w,I,lim JN) V w 6 32, I E U(w) . N—Ooo N400 131 Suppose for some 5 E O, I €‘U(E) and e > O NN * ~~ * H(w,I,Jk) + e < H(w,I,1im JN) k = 1,2,... N-Om * Since 0 is finite and J*(w) = 1im JN(w) < m for all N-an w, 3 an integer k0 > 0 such that * . * Jk + (e/O) 2 lim JN V k 2 k N-Ooo O k By 1.2, for all k 0 ll\/ ~~ * ,.V * N N . * H(w,I,J ) + e 2 H(w;TuJ +-(e/u)) 2 H(w,I,lim J ) k — k — Nam N which contradicts the earlier inequality 1% * .2 lim H(w,I,JN) = H(w,I,lim JN) N-Oco N-Mn and the results follow by Theorem 5.10. QED Corollary 5.10.2: Let I and 1.1 hold. Then for every stationary contract F = (1,1,...), I = TI(JI). Furthermore, if J’ 6 F is such that J’ 2 Jo and J’ 2 TI(J’), then J’ 2 J1. Next, necessary and sufficient conditions for the Optimality of a stationary contract under the two assumptions are studied. Theorem 5.11: Let D, D.l and D.2. Then a stationary contract r* = (I*,1*,...) is optimal if and only if T (J*) 'T(J*). 1* Furthermore, if for each w E 0, there exists a contract which is Optimal at w, then there exists a stationary optimal contract. 132 Proof: If W* is optimal, then JI* = J*. By Theorem 5.9 and Corollary 5.9.1, the result follows. Conversely, if TI*(J*) = T(J*), By Theorem 5.9, J* = T(J*) then it follows that TI*(J*) = J*. By Corollary 5.9.1, JI* 2 J*. W* is optimal. * 'k * = I Let Wm (I , O,w l,w'°") be optimal at w E 0. By D.1, J*(w) = JI*,w(w) = 1im (T * ...T * )(Jo)(w) k4” I0.03 Ik.w = T * [1im (T * ...T * )(JO)](w) I k-Mo I I 0.0.) 11“) kn» g T (J*)(w) g T(J*)(w) = J*(w) Io.w T * (J*)(w) = T(J*)(w) for all w E 0. 01w Define I* E M by I*(w) = I; u)(w). Then TI*(J*) = T(J*) and by result just proved (I*,1*,...) is Optimal. QED Theorem 5.12: Let I and I.1 hold. Then a stationary contract v* = (I*,1*,...) is optimal if and only if T (J ). I* I* Proof: If F* is optimal, then JI* = J*. By Theorem 5.10 and Corollary 5.10.2, the result follows. 133 Conversely, if TI*(JI* 5.10.2, JI* = T(JI*)' By Theorem 5.10, JI* 2 J* ) = T(JI*)' By Corollary W* is optimal. QED Theorem 5.11 states that under Assumption D, if the supremum is attain in the Optimality equation J*(w) = sup H(w,I,J*) I€U(w) for every w E Q, then there exists a stationary contract. However, if the supremum cannot be attained for some w 6 Q, the Optimality equation can still be used to construct a nearly Optimal contract, which is stationary whenever the scalar d in D.2 is strictly less than one . Theorem 5.13: Let D, D.l and D.2 hold. Then (a) Let c > 0, and {31] be such that °° k Z) a ck = e. e. > 0, i = 0,l,°'°. Let =0 1 'k 'k F* = (10,1 ,...) 6 H be such that T *(J*) 2 T(J*) "' 6k k = 0,1,... Ik then J* 2 JI* 2 J* - e. (b) Let c > 0 and the scalar in D.2, d < 1. Suppose I* E M is such that TI*(J*) 2 T(J*) - e(l-—a). Then J* 2 JI* 2 J* — e. 134 Proof: (a) Since T(J*) = J* Apply T * to both sides Ik-l (T* T*)(J*) 2T* (J*)-(16k Ik-l Ik Ik-l 2 J* - (ek4-dek). Repeat the process, for every k = 1,2,... k . (T *...T 1,‘,)(J*)2J"' — (2 0116.1). I I i=0 0 k Since Jo 2 J*, it follows that k i (TI* ...TI*)(JO) 2 J* _ £2; a 6i k = 1,2,... 0 k 1- Taking limit as k 4 a JI* 2 J* - e. (b) Let ck = e(l-—a) and I; = 1* for all k. The result follows by part (a). QED A weak counterpart of part (a) of Theorem 5.13 under Assumption I is given in Theorem 5.15. However, I am unable to give a counterpart of part (b) under Assumption I or conditions for existence of a stationary Optimal contract. 135 5.4* Convergence to Optimality and Existence of Optimal Contracts Define a function Jan 6 F by J (m) = lim T (J )(w) for every w 6 Q. a 0 N49 This section is devoted to investigate whether J“ = J*. Fortunately, the relationship hold for Assumption I under very mild conditions. Theorem 5.14: Let I hold and assume that either * 1.1 holds or JN = TN(JO) for all N. Then Jo = J* where 1im TN(JO)(w) V w e o. Jc(w) N49 * Proof: By Lemma 6, J* = lim JN. By Corollary 4.2.1 N-Hn * JN = TN(JO) * J* = lim J = J . QED Q Nwo The following is a counterpart of Theorem 5.8 and part (a) of Theorem 5.13 under Assumption I for the existence of nearly optimal contracts. Theorem 5.15: Let I and 1.2 hold. Let Q be a finite set and J*(w) < a for all m 6 0. Then for any 6 > 0, there exists an e-optimal policy. 136 Proof: For each N, let 8N = e/2(l-ta-t..uth-1) _ N n N and JN - {Io,Il,...,IN_1,I,I,...] be such that I E M and for k = 0,1,...,N-1, IE 6 M and -k-1 -k (T NTN )(Jo) 2 TN (Jo) - e N. N' Ik Thus TIN (Jo) 2 T(Jo) - 6N. Apply TIN to both Sides N—l N-2 (TIN TIN )(Jo) 2 (TIN T)(Jo) - aeN N-2 N-l N-Z 2 2 T (Jo) - (1+d)eN. Continuing (TN...TN )(J ) 2TN(J) - (1+o+...+oN'1)e o — o I I o N-l For N = 0,1, J 2 TN(J ) - e/2. F — o N As in the proof of Corollary 5.10.1, the assumptions imply *- N f 11 JN — T (Jo) or a N. . N * . . . . By Theorem 5.14, lim T (Jo) = JN. Since Q is finite N4: and J*(w) < a for all w E Q E N such that o N T 0(JO) 2 J* - e/2. Then .and JN is the desired contract. QED o 137 Under Assumption D, D.1 and D.2, the equality Ja = J* may fail to hold even in very simple situations. The following preliminary result shows that in order to have J” = J*, it is necessary and sufficient to have J" = T(J“), a condition implying the convergence of the dynamic programming algorithm. Theorem 5.16: Let D, D.1 and D.2 hold. Then J 2 T(J”) 2 T(J*) = J*. Furthermore, Jan = T(J”) = T(J*) = J* if and only if Jco = T(J”). Proof: Clearly Jan 2 JV for all F E H J 2 J*. By Theorem 5.9 T(J*) = J*. For all k 2 0 T(J”) = sup H(w,I,J¢) I€U(w) k _ k+1 S SUP H(wIIlT (JO)] _ T (JO). ' I€U(w) "A q Taking limit on right side, T(J“) J 2 T(J”) 2 T(J*) = J*. T(J*) = J* Let J = T(J ) a 9 Ja = T(Jo) by hypotheSis. 138 Let J... = T(JO). Since Jo 2 J”, by Theorem 5.9, J 3 J*. Then Jan 2 T(J”) 2 T(J*) = J* implies that J = T(J”) = T(J*) = J*. QED To prove the fact that Jon = T(J”), the following definitions and notations are needed. Notations: (1) For J E F, let E(J) denote the epigraph of J, i.e., the subset of O x I1 given by E(J) = {(w.1)IJ(w) 2 1]- Under D, since Tk(JO) 2 Tk+1(Jo) for all k and J2 = 1im Tk(Jo) thus R*a E(Jm) ll ":38 k E[T (JO)]. k 1 (2) For each k 2 l, the subset Ck of Q x C x I? is given by k-l ck = { 0, en 4 0 and let [In] C'U(w) be such that k- k H[w.In.T l(JOH 2 T (JO) ((1)) - en 2 X - en. Then (w,In,1-en) E Ck and (w,X-en) 6 P(Ck) for all n. Since 1 - en * X. (w.l) E P(Ck) E[Tk(J ) c P(C ) o k ‘ Let (w,1) E P(Ck) E a sequence {In} such that An 4 1 and a corresponding sequence {In} C‘U(w) such that Tk(Jo)(w) 2 H[w.1n.Tk'1(Jo)] 2 1n. Let n 4 a, Tk(Jo)(w) 2 1. Thus k (w,X) E E(T (JO)] 375;? c E[Tk(Jo)] -————- k P(Ck) c P(Ck) = E(T (JO)]. 140 * Assume that the supremum is attained by Ik_1(w) for each m 6 0. Then for each (w,l) E E[Tk(JO)] * k-l H[Wllk_l((-U)9T (‘10)) g )‘0 This implies (unl) e P(Ck). Hence E[Tk(Jo)] c P(Ck). By the first part of this theorem P(Ck) = P(C = E[Tk(Jo)]. k) Now let P(Ck) = (Ck) = E[Tk(Jo)]. For every w for which Tk(Jo)(w) > -e [w Tk(J )(m e P(C) ' o k ' This implies that H a I;_l(w) E U(w) such that H[w,1;_1(u).Tk‘1(Jo)J 2 Tk(JO)(w) sup H[w,I,Tk-1(Jo)] IEU(w) The supremum is attained for all m for which Tk(Jo)(w) > -o. It also is trivially attained by all I 6 U(w) whenever Tkm )(w) = -... QED 0 Definition: 0 P( F) C ) = {(w,1)( H I 6 U(w) such that k=1 k a (w,I,1) e F) ck} k=l P( F) Ck) = {(w,1)( 3 {l ] such that k=1 n in a l, (m,xn) E P(kCH ck)}. 141 Lemma 5.3: P( F) c ) c: 0 P(C ) c {W P(C ) k=1 k k=1 k k=1 k = F) E[Tk(Jo) = E(Jw), and k=1 0 m a k p( F1 c ) c (A P(C ) = F1 E[T (J )] = E(J ). k=1 k k=1 k k=l ° ” Theorem 5.17: Let D, D.1 and D.2 hold. Then (a) J“ = T(J”) (equivalently Ja = J*) if and only if P( F) ck) = F) P(C k=l k= (b) Jo = T(J”) and the supremum in J2(w) = suP H(w.I.JO) I€U(w) is attained for each w E 0 if and only if P( F] ck) = fl P(C k=l k= Proof: (a) Let J” = T(J”) and (w,1) E E(J,)° Thus J'(w) = sup H(w,I,Ja) 2 1. Let [an] be a I€U(w) sequence such that en > 0, en 4 0 3 a sequence {In} such that H(w,In,J') 2 i — en n = 1,2,... k-l _ so H[w,In,T (Jo)] 2 i — en k,n — 1,2,... (w,In,1-en) 6 C for all k,n k 142 a and (w.In,1-en) E kgl Ck for all n. Thus a (w,X-€n) 6 P(kCR Ck) for all n. But 1 - en 4 1 implies (w,l) E P( F) ck) k=1 E(Ja) CP( 0 ck). k=1 By Lemma 5.3 fl P(C ) = E(J ) = P( F) c ). k=1 k k=1 k Let P( F) ck) = fl P(Ck). By Lemma 5.2, k=1 k=1 P( F) Ck) = E(Jm). k=1 Let w E Q be such that Ja(w) > -o. Then L_E. [w,J (w)] E P( F) Ck) 3 a sequence (1 ] with ” k=1 n 1n 4 Ja(w) and a sequence [In] C‘U(w) such that H[w,I ,Tk'1(J )] 2 x k,n = 1,2,-~- n o — n By D.1, taking limit with respect to k H[w,In,Jm] 2 1n n = l,2,°°' Thus T(J¢)(w) 2 H[w.In.J2] 2 kn. 143 Let n 4 m implies T(J°)(w) 2 Jm(w) for all w 6 O and J2(w) > -o. The inequality also holds if J (w) = -¢ 0 . T(J“) 2 Jo By Theorem 5.16, J” 2 T(J”) C J = T(J ). Q Q (b) Let Ja = T(J”) and the supremum is attained for each m E Q in J¢(w) = sup H(w,I,J°) H I* E M I€U(w) such that for each (w,1) E E(JG) H[w,I* ((1)) ,Ja] 2 x. k~l Thus H[w,I*(w),T (Jo)] 2 1 for k = 1,2,... and [WII* (w)0)\] 6 fl Ck k=1 (w,k) E P( F) Ck) k=1 E(Ja) CP(fl ck). k=1 By Lemma 5.3, P( F) Ck) = E(Ja) = F] P(Ck). Conversely, k=1 k=1 9 let P( 0 ck) = E(Ja). For all m e 0 with k=1 Jm(w) > -a [w,Jc(w)] e E(JQ) = P( 0 c H a I*(w) E U(w) such that w [w,I*(w).Ja(w)] E F C k=1 k 144 Thus H[w,I*(w),Tk-1(Jo)] 2 Jo(w) k = 0,1,... By D.1 and taking limits T(J”) (w) 2 H[w.I*(w),J°] 2 Jam). By Theorem 5.16, T(J”) = J”. If J°(w) = -o, every I E U(w) attains the supremum and the proof is complete. QED Theorem 5.18: Let D, D.1 and D.2 hold. Let the incentive space C be a Hausdorff space. Suppose there exists an integer ko 2 0 such that for each w 6 Q, 1 6 R and k 2 k0, the set Uk(w,l) = (I E U(w))H[w,I,Tk(JO)] 2 X is compact. Then a Proof: Let (w,l) be in F) P(Ck) H a sequence k=1 {In} C U(w) such that H[w.In,Tk(Jo)] 2 H[w,In,Tn(Jo)] 2 i v n 2 k. Thus In E Uk(w,1) V n 2 k, k = 0,1,°". Uk(w,1) is compact for k 2 k0. This implies that [In] has a limit point I e Uk(w.l) V k 2 k0. But Uo(w,1) D‘Ul(w,l) D... so I 6 Uk(w,l) for k = 0,1,... 145 H[w,I,Tk(Jo)] 2 l k = 0,1,... - co (willx) 6 n Ck. k=1 G This implies (w,1) E P( F) Ck) k=1 O O a P( F) c ) D F) P(C ). k=1 k k=1 k Since Uk(w,l) is compact, by Lemma 445, the supremum in Tk(J )(w) = sup H[w,I,Tk-1(J )] is attained o o I€U(w) for every w E Q and k 2 k0. By Lemma 5.2, P(Ck) = (Ck) for k 2 k0. But P(Cl) 3 P(CZ) 3... and P(Cl) D P(CZ) D... F) P(Ck) = 0 P(C k=1 k= By Lemma 5.2, QED After proving the fact that Jco = T(Ja) and hence establishing the convergence of the dynamic programming algorithm under Assumption D, the following provides the conditions for the existence and compu- tation of optimal stationary contracts under the decreasing assumption. 146 Theorem 5.19: Let the assumptions of Theorem 5.18 hold. Then (a) there exists a contract F* = (1;,11. ) E H attaining the supremum for all k 2 k0, i.e., (T ,Tk)(J ) = Tk+l(J ) v k 2 k . Ik o o — o (b) For every contract w* satisfying (a), the sequence (I;(w)] has at least one limit point for each w 6 0 with J*(w) > -~. (c) Let 1*: Q 4 C be such that I*(w) is a limit point of [1;(w)} for all m E Q with J*(w) > -a and I*(w) 6 U(w) for all m E 0 with J*(w) = -a. Then the stationary contract (I*,1*,...) is Optimal. Proof: (a) This follows from Lemma 4.5. * * (b) Let F* = (Io,Il,...) satisfy (T ,Tk)(JO) = Tk+1(J > v k 2 k . w e o o o Ik and J* ((1)) > -a H[w,I;(w).Tk(JO)] 2 H[w,I;(w),Tn(Jo)] * In(w) E Uk[w,J*(w)] v k 2 k , n 2 k. 147 * Uk[w,J*(w)] is compact. {In(w)} has at least one limit point. (c) Each limit point I*(w) C'U(w) and H[w,I*(w),Tk(Jo)] 2 J*(w) v k 2 k0. Using D.1 and taking limits H[w,I*(w),Jc] = H[w,I*(w),J*] 2 J* for all w 6 0. This relation holds trivially for all m E Q with J*(w) = -a. TI*(J*) 2 J* = T(J*). This implies TI*(J*) = T(J*). By Theorem 5.11 (I*,1*,...) is Optimal. QED. CHAPTER VI BOREL MODELS 6.1: Introduction In the previous chapters, a basic multiperiod agency model is developed. It was shown that under apprOpriate conditions, optimal or nearly optimal contracts exist and the dynamic programming algorithm can be implemented to construct such contracts. All these results rely on the assumption that the dis- turbance term, yk, behaves in a reasonable manner, that is, it is countable and there is a well-defined distribution on its behavior over time. Put into the context of the model, the assumption implies that once an initial payoff is specified, the sequence of sub- sequent payoffs for the entire planning horizon will be defined stochastically. This means that at time 0 all payoffs are defined with a known probability distri- bution conditioned on the initial payoff. The optimization process will then be reduced to finding an Optimal contract for the corresponding payoffs. 148 149 If the assumptions in Chapters IV or V are met, an Optimal or nearly Optimal contract can be guaranteed. However, the disturbance can be arbitrary. Such arbitrariness may be due to random externalities which the company has no control of or to the internal Operating procedures. The imperfect state information model which is to be discussed in the next chapter is perhaps the most common situation that gives rise to an arbitrary disturbance. The actual payoff is not observable by the principal who receives a report or signal from the agent concerning the outcome. The agent can freely choose his reporting function. This will make it impossible for the principal to define his expected payoffs at time 0 to search for an optimal or nearly Optimal contract. In fact, if the disturbance is allowed to be arbitrary, various complications arise in the Optimi- zation process. This chapter will discuss the problems involved in the different phrases of the dynamic algorithm. The main intent of going through the techni— cal details is to set the stage for the imperfect state information model such that the dynamic programming algorithm can be utilized as a solution procedure. As both the problems and their corresponding remedies are 150 highly technical in nature, discussions in this chapter can only be carried on at a very general level. A significant amount of detail is omitted. The techni- cally oriented reader is referred to the starred chapter for a complete develOpment. It was mentioned earlier in this work that there are three Operations performed repetitively. First, there is the evaluation of a conditional expectation. Second, an extended real-valued function in two variables (state and incentive) is maximized over one of these variables (incentive). Finally, if an optimal or nearly optimal contract is to be constructed, a "selector" which maps each state or payoff to a contract which achieves or nearly achieves the Optimum for the second step must be chosen. The following sections will take each of these Operations in turn and discuss the problems associated at each stage if the disturbance is not countable. 6.2: Existence of Probability Measures Elementary statistics say that probability is the measurement of the likelihood of the occurrence of a certain event from a collection (set) of events. It is a measure of the likelihood of occurrence. If the set of events or the set of all possible combinations 151 of events is countable, a probability distribution on the elements of the set always exists. If the set is arbitrary, then very little can be said about the probability distribution of its elements. In the context of the model in the previous chapters, the conditional expectation involves not only the probability distribution on one set, but on the product of two sets, the payoff and the disturbance. It becomes essential to investigate the interplay of the distributions on these two sets. Since probability distribution is a measure of the likelihood of occurrence of the elements of a set, its existence is closely related to the measurability of the set. When arbitrary sets are encountered, measurability is always a crucial issue. One can envision measurability of a set as the ability to count the elements in the set (counting measure) or the ability to induce a distance between the elements in the set in a one-dimensional setting or the area in a two-dimensional case (a metric or norm). A probability measure can be viewed as a function which maps the elements in the set to the real line. The space of all probability measures that can be defined on the given set S is called the space of probability measures on X. It is denoted by P(X). In Appendix I, it is shown 152 that P(X) inherits all the prOperties of the original set X. Hence, P(X) is measurable whenever X is measurable. 0r conversely, if X is measurable, then an unconditional probability measure always exists with respect to the specific measure of X. Neverthe— less, an arbitrary measurable space is an extremely large space for any meaningful analysis to be conducted on. In order to draw any useful implications one has to restrict the research on a smaller subset which is typical enough to encompass most if not all the characteristics of the original set. The Borel set is the most common candidate for such a purpose. To de— fine Borel sets, the idea of a O-algebra is needed. A collection of subsets of a set X is said to be a o-algebra in X if it has the following properties: (1) it contains X, (2) it contains any subset A of X and the complement of A relative to X, and (3) it contains all possible unions of subsets of X. Then the Borel sets of X is the smallest O—algebra in X such that it contains every open set in X. Since P(X) inherits all the prOperties of the original space, it is also a Borel space. As mentioned earlier, the dynamic programming algorithm requires the evaluation of a conditional expectation which involves prObability measures on a 153 product of the payoff and incentive spaces. It can be shown (Theorem 6.2 and its corollaries) that a pro- bability measure on a product of Borel spaces can be decomposed into a marginal and a conditional probability. Such decomposition is possible even when the parameters or arguments of the distribution function are dependent. In addition, such a process can be reversed (Theorem 6.3). Given a probability measure and one or more conditional probabilities, a unique probability measure on the product space can be constructed. All these distributions can be shown to be measurable if the original sets are Borel sets. With the establishment of probability distributions on both the payoff and incentive spaces and the interplay of these probabilities between these two spaces, the conditional expectation Operations in Borel spaces are well-defined. 6.3: Analytic Sets The second stage of the dynamic programming algorithm involves the maximization of an extended real-valued function in two variables over one of these variables. When the disturbance is countable, the whole array of payoffs is defined stochastically given an initial payoff. The incentive function is defined on the payoff space. Under such circumstances, the resulting problem is a 154 standard multiperiod maximization problem which has been treated somewhat in detail in Chapters IV and V. When the disturbance term is an arbitrary element from a Borel space, then wk cannot be deduced from the knowledge of mo at time period 0. Also the exact form of the optimal or nearly Optimal contract cannot be specified at time 0 even if the existence of such contracts are guaranteed. The best one can do is to be able to construct a "selector" which maps each payoff to a contract which achieves or nearly achieves the maximum. Essentially, the algorithm searches for the maximum along the projection of the incentive function on the payoff space in this second stage of the process. In searching along the projection of sets from Borel space, a very serious prOblem is encountered. So far in Chapters III through V, the multiperiod agency problem is formulated to involve dealing with "nice" sets. These "nice" sets have been either measurable sets or Borel sets. But, at this stage, when projections of these "nice" sets are used to search for a solution, it would be desirable that the projections are "nice" also. It is at this point that the use of measurable sets and Borel sets breaks down, because one cannot be sure that the projections required will be of the same 155 type. The projections do not carry over the behavior of the original sets. In fact, they may not be measurable with respect to the spaces of the original sets. Fortunately, there is another class of sets available, the so-called "analytic sets" which has the desirable properties that are required in the current model. There are many approaches to analytic sets, but maybe the best for the current purposes is that the analytic sets consist of the images of the Borel sets under continuous functions. The image of an analytic set under a continuous function is itself an analytic set. The Borel sets thus form a subclass of the analytic sets: each Borel set is an analytic set, but there are analytic sets which are not Borel sets. Also, a pro- jection is a continuous function. Now, letting the analytic sets be the "nice" set, one obtains some control of the results of projections, that is, a guarantee of the measurability of the projections. This will enable the investigation to carry forward. By enlarging the Borel sets to include the analytic sets, the model is ready for the implementation of the dynamic programming algorithm. Technically, through the analytic sets, the projection becomes measurable with respect to the original sets. 156 6.4: Construction of the "Selector" The last stage of the Optimization process is the construction of a "selector" which maps each payoff to an optimal or nearly Optimal contract. In the last section, analytic sets are utilized to enhance the measurability of the projection. If analytic sets are to be employed, it becomes inherent that the various functions should be defined such that they are measurable with respect to the analytic sets. The main result in this section is that one can construct a selector which is measurable (Theorem 6.14). Because of various technical measurability problems, a much more general and larger space is used. It is in this larger space, the universally measurable functional space in which all functions and composition of functions are measurable with respect to the relative analytic sets that the "selector" is defined and constructed. Throughout the process of constructing such a selector, a very elaborate abstract algebraic structure is imposed on the payoff and incentive sets and the various functions. The actual implementation of the dynamic programming algorithm to numerically evaluate the optimal contract and meet all these measurability re- quirements will not be easy. 157 However, the projections may not be that badly behaved. Under certain conditions, it can be shown that when extended real-valued functions involved are semicontinuous, the selectors can be chosen to be measurable with respect to the original Borel sets. Such a selector is produced in Theorems 6.15 through 6.17. The main concern in this chapter is to take care of the technical difficulties in executing the dynamic programming algorithm when the disturbance is un- accountable. Admittedly, all these details have no direct bearing on the original economic model. However, if one were to adopt the algorithm to solve the im- perfect state information model which is discussed in the next chapter, one would have to guarantee the feasibility of obtaining a solution through the algorithm. CHAPTER VI* BOREL MODELS 6.1* Introduction If the state, incentive and disturbance spaces are all arbitrary measure spaces, very little can be done. Hence, for the general model, only sparse works are done in the literature. One attempt in this direction is the work of Striebel [1975] involving p-essential suprema. The following objective function is adopted. Jk+l(w) = pk-essential 52p E{g[w,I(w),y] + Jk[f(w,I(w),y,Jk_l)]} k = 0.---.N-l. where the p-essential supremum is taken over all measurable I from the payoff space 0 to incentive space C satisfying any constraints which may have been imposed. The functions Jk are measurable and if the probability measures p0,...,pN_l are chosen properly and the so-called countable €- lattice property (refer to the monograph for a 158 159 precise definition) holds, the above modified dynamic programming algorithm generates the Optimal net return function and can be used to Obtain contracts which are Optimal or nearly Optimal for pN_1 for almost all initial states. However, the selection of the prOper probability measures is as difficult as executing the dynamic programming algorithm and the verification of the countable e-lattice property is nontrivial even in very simple situations. A second approach is to investigate models in which the payoff (state) and incentive spaces are Borel spaces or even Rn and the expected net return function h(w.I) = g(w.I.y)p(dy)w,I) is assumed to be semicontinuous and/Or convex. Semi- continuous models of this type are mainly focused on various combinations of semicontinuity and compact— ness assumptions such that the functions Jk are semicontinuous. Most of the researches that were done in this model (Freedman [1974:L Furukawa [1972], Himmelberg, et. a1 [1976], Maitra [1968] and Schal [1972]) are carried out in a finite-dimension Euclidean state space with assumptions of convexity, semicontinuity or 160 both made on the net return function. Results are not readily generalizable beyond Euclidean spaces (Rockafellar [1976]). Another approach, the Borel space framework was introduced by Blackwell [1965]. The payoff (state) 0 and incentive C spaces were assumed to be Borel spaces, and the functions defining the model were assumed to be Borel-measurable. However, even over a finite horizon the optimal total return function to the principal need not be Borel-measurable and there need not exist an everywhere e-0ptimal policy (Blackwell (1965), Example 2). The problem arises from the inability to choose a Borel-measurable function uk: 0 4 C which nearly achieves the supremum uniformly in w. The nonexistence of such a function interferes with the construction of Optimal contracts via the dynamic programming algorithm, since one must first determine at each stage the measure p with respect to which it is satisfactory to nearly achieve the supremum for p almost every w. This is essentially the same difficulty encountered with the Striebel approach. The difficulties in constructing nearly Optimal contracts over an infinite horizon are more acute. Furthermore, from an applications point of view, a p-—e-optimal contract, even if it can be 161 constructed, is a much less appealing object than an everywhere e-0ptimal contract, since in many situa- tions the distribution p is unknown or may change when the system is Operated repetitively, in which case a new p-—e-optimal contract must be computed. In the formulation that follows, the class of admissible contracts in the Borel model is enlarged to include all universally measurable contracts. It will be shown that this class is sufficiently rich to ensure that there exist everywhere e-Optimal contracts, and, if the supremum in the dynamic programming algorithm is attained for every w and k, then an everywhere Optimal contract exists. Thus the notion of p-optimality can be dispensed with. Another advantage of working with the class of univer- sally measurable functions is that this class is closed under certain basic operations such as integration with respect to a universally measurable stochastic kernel and composition. In a dynamic programming algorithm, there are three Operations performed repetitively. First, there is the evaluation of a conditional expectation. Second, an extended real-valued function in two variables (state and incentive) is supremized over one of these variables (incentive). Finally, if an optimal or 162 nearly Optimal contract is to be constructed, a "selector" which maps each state to a contract which achieves or nearly achieves the supremum in the second step must be chosen. The following sections will discuss the problems arising in each of these operations and suggest solutions whenever feasible. 6.2* Probability Measures on Borel Spaces The construction of a rigorous multiperiod agency model via the dynamic programming algorithm is im- possible when the payoff space and the incentive space are arbitrary sets or even when they are arbitrary measurable spaces. For this reason, the concept of a Borel space is adopted and the prOperties of Borel spaces are used to develoP the construction. In evaluating the conditional expectation of the total net return function, several prOperties of the probability measures need to be developed, the first and the obvious one being the unparameterized probability measure. Since conditional expectation involves probability measures on a product of Borel spaces, it becomes essential to investigate the interplay of the measures. It can be shown that a probability measure on a product of Borel spaces can be decomposed into a marginal and a Borel- 163 measurable stochastic kernel. This decomposition is possible even when a measurable dependence on a parameter is admitted. Such a result is essential to the filtering algorithm for the imperfect state information model which will be developed in the next chapter. In addition, such a process can be reversed, that is, given a probability measure and one or more Borel-measurable stochastic kernels on Borel spaces, a unique probability measure on the product space can be constructed. If X is a tOpological space, 32 is the collection of closed subsets of X and 5k the Borel o-algebra on X. The space of probability measures on (X,Bk) is denoted by P(X). C(X) is the Banach space of bounded, real-valued continuous functions on X with the supremum norm for any metric d on X consistent with its tOpology. A probability measure p E P(X) determines a linear functional 1p: C(X) 4 R defined by 1p(f) = I fdp. On the other hand, a function f E C(X) determines a real-valued 9f: P(X) 4 R defined by 9f(p) = I fdp. The prOperties of the probability measure space P(X) have been given much attention in statistics literature (Ash [1972], Feller [1971] are just a couple 164 of classics), they are summarized in Appendix A. In general, one can say that P(X) inherits all characteristics of the space X. For example, if X is a separable metrizable space, then P(X) is separable and metrizable (Theorem A.4). Definition: Let X and Y be separable metrizable spaces. A stochastic kernel q(dy)x) on Y given X is a collection of probability measures in P(Y) parameterized by x 6 X. If 3’ is an O-algebra on X and Y-1[6P(Y)] C 3, where Y:IX 4 P(Y) is defined by y(x) = q(dylx), then q(dy)x) is said to be JLmeasurable. If y is continuous, q(dy]x) is said to be continuous. Before proving the decomposition theorem for stochastic kernels, the following theorem states their general behavior when the state spaces are Borel spaces. Theorem 6.1: Let X and Y be Borel spaces, 6 a collection of subsets of Y which generates 62 and is closed under finite intersections, and q(dy]x) a stochastic kernel on Y given X. Then q(dy]x) is Borel-measurable if and only if the mapping 1E: X 4 [0,1] defined by 1E(x) = q(E]x) is Borel measurable for every E 6 6. 165 Proof: Let y:.X 4 P(Y) be defined by y(x) = q(dy[x). For E 6 6, 1E = BE 0 y. If q(dy)x) is Borel-measurable, then Theorem A.ll implies 1E is Borel—measurable for every E 6 6. Conversely, if XE is Borel—measurable for every E 6 6, then -1 o[(J 1 (B )] c:6 . By Theorem A.ll, it implies -l -l -1 Y [5 ]=Y [NU 9 {EH} P(Y) E66 E R -l -l =G[U Y (8 (6))] E66 E R -l =o[u x (BHJCG E66 E R X Z q(dylx) is measurable. QED Corollary 6.1.1: Let X and Y be Borel spaces and q(dny) a Borel-measurable stochastic kernel on Y given X. If B 6 BXY' then the mapping AszX 4 [0,1] defined by AB(X) = q(BX)x) where Bx = {y 6'Y: (x,y) 6 B] is Borel-measurable. Proof: If B 6 EXY and >c6 X, then BX C‘Y ls homeomorphlc to B O [{x}Y] 6 EXY' Thus BX 6 52 so q(BXlx) is defined. Let fi*= [B 6 EXY: AB is Borel-measurable]. fl is a Dynkin system. By Theorem 6.1, fl contains the measurable rectangles .5 = BXY' QED 166 The next two results are the decomposition and integration theorems for stochastic kernels. The first one says that any probability measure on a product of Borel spaces can be decomposed into a marginal and a Borel-measurable stochastic kernel. The second theorem is the reversed statement: given a prObability measure and one or more Borel-measurable stochastic kernels, a unique probability measure on the product space can be constructed. Together with their corollaries, the two theorems provide relation- ships between two or more prObability spaces which are useful in the later development of the models. Theorem 6.2: Let (X,J) be a measurable space, let Y and Z be Borel spaces, and let q(d(y,z))x) be a stochastic kernel on YZ given X. Assume that q(B)x) is Jemeasurable in x for every B 6 BXY‘ Then there exists a stochastic kernel r(dz)x,y) on Z given XY and a stochastic kernel s(dy)x) on Y given X such that r(§]x,y) is JEk—measurable in (x,y) for every IN 6 Bi, s(X]x) is Jemeasurable in X for every Y E 52, and “12.)” = [Y r(§)x.y)s(dyl><) V X 6 By. _z_ e 52. 167 Proof: Assume without loss of generality that Y = Z = (0,1]. Let s(dy)x) be the marginal of q(d(y,z)]x) on Y i.e., s(X]x) = q(XZ]x) for every X_6 52. For each positive integer n, define subsets of Y M(j,n) = ((j-l)/2n,j/2n)] j = l,...,2n. Thus each M(j,n4—l) is a subset of some M(k,n) and the collection {M(j,n): n = 1,2,...;j = 1,...,2n] generates By. For Z 6 0 D Z, where Q is the set of rational numbers, define q(dy(0,Z]]x) to be the measure on Y’ whose value at X_6 6% is q(X(O,Z])x). Then q(dy(O,Z])x) is absolutely continuous with respect to s(dy]x) for every 2 6 0 D Z and x 6 X. Define for z 6 O D Z q[M(j.n) (o.z])x]/s[M(j.n> Ix] Gn(z)x,y) = if Y E M(jID) and S[M(j.n))X] > o o if y e M(j,n) and s[M(j,n))x] = o. For each 2, the set B(z) ((x,y) 6 XY: 1im Gn(z)x,y) exists in R} n-boo ((x,y) 6 XY: [Gn(zlx,y)] is Cauchy] r) u r) {(x,y) 6XY:)Gn(z)X.y)- k=1 N=l m,n2N - Gm(z)X.Y)‘ < %] is $52-measurable. 168 For fixed x and y and for m 2 n Q[M(j.n)(O.ZJ)X] = f Gm(2)X.y)S(dy)X). M(j.n) But {M(j,n): j = l,...,2n] is the O-algebra generated by Gn(z]x,y). This implies Gn(z)x,y) is a martingale on Y under the measure s(dy]x). Each Gn(zlx,y) is bounded above by 1, so by the Martingale convergence theorem, Gn(z)x,y) converges for s(dylx) almost every y s[B(z)x)x] = 1. Let 1im Gn(z]x,y) if (x,y) 6 B(z) n-Ooo G (Z‘XIY) = 2 otherwise Let m 4 m, then q[z(0.z])x] = [ G(z)x.y)s(dy)x) v x e x, z 6 Q n z Y and XL: M(j,n). But 3; is a Dynkin system and by Theorem A.10, then 9[X(0o21)X] = I G(Z)X,y)s(dy)x) V x 6 x, z 6 Q n 2, Y X_6 By. For each 20 6 Q 0 Z, define C(zo) = {(x,y) 6 XY: 3 z e Q n z with z g 20 and G(z]x,y) > G(ZO)XoY)] 169 u CZ ((x,y) 6 XY : G(z)X.y) > G(zo)x.y)} C = L) C(zo) 206002 D(zo) {(x,y) 6 XY: G(°)x,y) is not right- continuous at 20] m U P. U [(x,y) 6 XY: )G(ZIX.y) n=1 k=1 ZEQnZ 2 g 2 g 20 + E - G(Zo)XoY)) 2 %] 206002 E = [(x,y) 6 XY: G(z)x,y) does not converge to zero as 2 1 0] co co = U Q U [(x,y) 6 XY: )G(Z) (x,y) 2 35;} n=1 k=1 zEQflZ z<1/k F [(x,y) 6 XY: G(1)XoY) 7‘ 1}- For fixed x 6 X and 20 6 Q 0 Z, and for all z 6 Q 0 Z, 2 g z 0 f G(z)x,y)s(dy)x) g] G(zo)x,y)s(dy)x) Y Y 2 G(z]x,y) g G(zolx,y) for s(dy[x) almost all y . s[C(zo)x)x] = O and s(CX)x) = O. This implies that if z 1 20, z 6 Q 0 Z, then I G(z]x,y)s(dylx) if G(zo)x,y)s(dy]x) Y Y 170 and G(z]x,y) 1 G(zo)x,y) for s(dy)x) almost all y s[D(zo)X)x] = O and s(DX)x) = 0. Furthermore, as 2 1 O, z 6 0 0 Z [Y G(z]x,y)s(dy]x) 1 0 V 2.6 52. Since G(zlx,y) is‘non-decreasing in Z for s(dylx) almost all y, then G(z)x,y) 1 O for s(dy)x) almost all y s(Exlx) = 0. Let 2 = l in q[X(0,z])x], we obtain f G(l)x,y)s(dy)x) = s(X[x) V Y 6 BY. Y Thus G(l[x,y) = 1 for s(dy)x) almost all y s(FX[x) = 0. For 2 6 Z, let {2n} be a sequence in 0 0 Z such that 2n 1 2. For every x26 X, y 6 Y, define 1im G(z (x,y) if (x,y) 6 xy n-Ooo n F(z]x,y) = - (CUDUEUF) 2 otherwise Clearly F(z)x,y) is well-defined, nondecreasing and right—continuous. It also satisfies for every (x,y) 6 XY O§F(z)x,y)§1 V 262 F(l)x,y) = l 171 and lim F(z)x,y) = O 210 2 For each (x,y) 3 a probability measure r(dz]x,y) on Z such that r((o.z])x.y) = F(Z)x,y) v z e (0.1]. The collection of subsets §_6 52 for which r(§]x,y) is JEk-measurable in (x,y) forms a Dynkin system which contains ((O,z])z 6 Z} r(§]x,y) is 352-measurable for every 2'6 52. By the m notone convergence theorem, then V x 6 X, z 6 Z, X_6 HY qmomllx] I P(zlx.y>s(dylx) Y J” r((o.z])x.y)s(dylx). Y Again, the collection of subsets g_6 52 for which Q(XZ)X] = f r(z)x,y)s(dy)x) holds forms a Dynkin Y system which contains {(O,z])z 6 Z] °. q(g_zlx] = [Y r(§)x,y)s(dy]x) v 2 6 BY, 2 e 52. QED Corollary 6.2.1: Let X,Y and Z be Borel spaces and let q(d(y,z)]x) be a Borel—measurable stochastic kernel on YZ given X. Then there exist 172 Borel-measurable stochastic kernels r(dz)x,y) and s(dny) on Z given XY and on Y given X respectively such that q(_Y§)X) = [Y r(§)x.y)s(dy)x) V x 6 BY, _z_ e 32. Corollary 6.2.2: Let Y and Z be Borel spaces and q 6 P(YZ). Then there exists a Borel-measurable stochastic kernel r(dz)y) on Z given Y such that quZZ.) = [Y r(§)y)s(dy) V2 6 BY, 2 e BZ where s is the marginal of g on Y. Theorem 6.3: Let X1,X2,... be a sequence of Borel spaces, Yn = X1X2"'Xn and Y = X1X2"°. Let p 6 P(Xl) ‘be given, and, for n = 1,2,... let qn(dxn+1)yn) be a Borel-measurable, stochastic kernel on X given Yn. Then, for n = 2,3,... there n+1 exist unique probability measures rn 6 P(Yn) such that (6.1) rn(X1X2. "Xn) X ...,X = [21535-4211 1 qn'an‘xl'XZ' ) n-l qn_2(dxn_llx1,...,xn_2)...q1(dx2[xl)p(dx1) v5 e/sx ,...,>_ (xifx2 f(x1,x2)q1(dx2)xl)p(dxl). Now assume rk 6 P(Yk) exists for which (6.1) and (6.2) hold when n = k. For B 6 Y let k+1' rk+1(B) = [Yk qk(BYk)yk)rk(dyk). Then rk+1 6 P(Yk+l)' If B = §1§2'°'§k§k+l' where then rk+l‘b) = f Xxlx2...xk(yk)qk(§k+l‘yk)rk(dyk) I31£§2°"I§k qk)§k+l)xl'xz'"xk)qk—l(dxk)Xk—l) . ql(dx2]x1)p(dxl) by (6.2) when n = k. 175 (6.1) holds for n = k4-1. Then use the previous result to show (6.2) for n = k4—1 proceeding as when n = 2 case. By the induction hypothesis (6.1) and (6.2) holds. Suppose r2 6 P(Yn) satisfies (6.1). Then the collection fi-= (B 6 B )r (B) = r’(B)} is a Dynkin Yn n n system containing the measurable rectangles I 3 — 5&n and rn — rn Each of the measures rn is consistent, i.e., if m 2 n then the marginal of r on Y is r . If each X m n n k is complete, by the Kolmogorov theorem, 3 a unique r 6 P(Y) whose marginal on each Yn is rn. If Xk is N not complete, consider the completion Xk its completion Yn' Again, 3 E 6 P(Y) whose marginal and Y on n N iv N on each Yn is rn. The uniqueness of r implies the uniqueness of its correspondence r 6 P(Y). QED In the course of proving Theorem 6.3, the following result has also been proved. Theorem 6.4: Let X and Y be Borel spaces and q(dy)x) a Borel-measurable stochastic kernel on Y given X. If f: XY 4 R* is Borel-measurable, then the function 1: X 4 R* defined by 1(x) = f f(x,y)q(dy)x) is Borel-measurable. 176 Corollary 6.4.1: Let X be a Borel space and let f:.X 4 R* be Borel-measurable. Then the function 9f : P(X) 4 12* defined by 9f(p) = J“ fdp is Borel-measurable. Proof: Define a Borel-measurable stochastic kernel on X given P(X) by q(dep) = p(dx). Let fsjp(x)x 4 R* be defined by fkp,x) = f(x). Then ef(p> = J" f(x)p(dx) = f %'(p.x)q(dxlp). By Theorem 6.4, 6f is Borel—measurable. QED If f 6 C(XY) and q(dy]x) is continuous, then the mapping y as defined in Theorem 6.4 is also continuous. This is proved with the aid of two lemmas. Lemma 6.1: Let Y be a metrizable space, d a metric on Y consistent with its topology, and X CLY. If g 6 Ud(X), then g has a continuous ex- tension to Y. Proof: Since 9 is uniformly continuous on X, given 6 > 0 there exists 5(a) > 0 such that if x1,x2 6 X and d(x1,x2) < 6(6), then Ig(xl)-g(x2)l < e. Let X be the closure of X. Suppose y 6 X. Then there exists a sequence (xn} C X for which XD 4 y. Let s > 0 be given, 3 N(s) such that d(xn,xm) < 6(6) for all n,m > N(e). (g(xn)} is Cauchy in R. 177 A Define g(y) = lim g(xn). Thus [g(xn)-3(y)) < c n-Oco whenever n > N(e). Suppose now x 6 X and d(x,y) < 6(e)/2. Choose n > N(e) such that d(xn.y) < 6(5)/2. Then d(x,xn) < 6(5) and A A lg(x) -g(y)) g )g(x) —g(xn>) + (g(xn) -g(y>) . < 29 For any sequence {x5} C X with x; 4 Y A _ . : g(y) — lim g(xn). n-Oco A c 9(Y) is independent of the sequence [xn] chosen. If y 6 X, take xn = y, n = 1,2,... and obtain A g(y) = g(y) A g is an extension of g. If {ym} is a sequence in X which converges to y 6 X, then there exist sequences (x ] in X with mn ym = lim xmn' Choose n1 < n2 <... such that n-Oco . _ 1 11m x - y and d(xmn ,ym) < 6(5)/2. Then m-Om m m A . A ‘ _ g(y) = 11m g(xmn ) and (g(xmn )-g(ym) < 2/m. Letting m4oo m m m 4 m, then A , A g(y) = 11m g(ym) 111-900 A _ and g is continuous on X. Clearly SUP )g(x)) = sug )g(y))- XEX y€X 178 - A If X = Y, g is clearly unique. If X is a proper A subset of Y, by Tietze extension theorem, g can be extended to all of Y such that Hg) = sup ($(y>(. QED y6Y Lemma 6.2: Let X and Y be separable metrizable spaces. Then the mapping 0: P(X)P(Y) 4 P(XY) defined by o(p,q) = pq where pq is the product of the measures p and q is continuous. Proof: By Theorem A.7, X and Y can be homeomorphically embedded in the Hilbert cube N. Let X,Y C N and d be a metric on MN consistent with its topology. Let g 6 Ud(X,Y), by Lemma 6.1, g can be extended to a function 3 6 C(NV). Consider the sit of finite linear combinations of the form A A A A Z) f.(x)h.(y) where f. and h. range over C(fl) i=1 i i i l and k any integer. Let c > 0 be given, by the StoneAWeierstrass Theorem such a linear combination exists and satisfies k A A A ”.23 fihi-gH < 6/3. 1: Let {pm} be a sequence in P(X) and pn 4 P, p 6 P(X) and [gm] a sequence in P(Y) with qn 4 q, q 6 P(Y) A A Consider fi’hi the restrictions of fi and hi to 159 X and Y 1im sup 1] gd(pnqn) -J" gd(pq>l n41: XY XY k 1im su (9- Z f.h.)d( q) n4ap REY i=1 1 1 pn n1 "A k 2) 1' f.d h.d .— f.d h.d + i=1 n4:»‘£ 1 p“ I 1 q“ i 1 p i 1 q‘ K: k + 1im If ( Z) fihi-g)d(pq)) S 6 n4: XY i=1 — By the equivalence of Theorem A.5 (a) and (c) pn 4 p = ] fdpn a [ fdp and qn a p = j qun 4 ] qu o is continuous. QED Theorem 6.5: Let X and Y be separable metrizable spaces and let q(dy]x) be a continuous stochastic kernel on Y given X. If f 6 C(XY), then the function 1: X 4 R defined by 1(x) = f f(x,y)q(dylx) is continuous. Proof: The mapping v:.X 4 P(XY) defined by v(x) = qu(dy]x) is continuous by Corollary A.5.l and Lemma 6.2. Thus 1(x) = (9f‘>v)(x) where 9f: P(XY) 4 R is defined by Bf(r) = I fdr. By Theorem A.5, 6f is continuous 1 is continuous. QED With the above results, it can be seen that the conditional expectation Operators in Borel spaces are well-defined. 180 6.3* Analytic Sets and Universally Measurability The dynamic programming algorithm is centered around maximization of functions, and this is intimately connected with projections of sets. More specifically, if f: XY 4 R* is given and f*: X 4 R* is defined by f*(x) = sup f(x,y), y6Y then for each c 6 R {x e xlf*(x> > c} = projx(((x.y) 6 xy f(X.Y) > cl). If f is a Borel-measurable function, then {(x,y)lf(x,y) > c] is a Borel—measurable set. Un- fortunately, the projection of a Borel-measurable set need not be Borel-measurable. As mentioned earlier in the introduction, the second stage of the dynamic programming algorithm involves in the supremization of an extended real-valued function in two variables over one of these variables. Essentially, the algorithm searches for the supremum along the projection of the set. If one were unable to guarantee the Borel-measurability of the projection, it would become impossible to implement the algorithm. However, in Borel spaces, the projection of a Borel set is an analytic set. By enlarging the Borel space to include 181 all universally measurable functions, the model is amendable for the implementation of the dynamic programming algorithm. Analytic sets have a very standard place in the mathematical literature. The prOperties that are required to develOp the multiperiod agency model are summarized in Appendix B. This section will start off with the measurability properties of analytic sets. Definition: Let X be a set. A paving 9 of X is a nonempty collection of subsets of X. The pair (X,9) is called a paved space. Let 0(9) be the G-algebra generated by 9. 95 denotes the collection of all intersections of countably many members of 9 and 90 the collection of all unions of countably many members of 9. N denotes the set of positive integers. T and Z are the set of all infinite and finite sequences of positive integers respectively. Definition: Let (X,9) be a paved space. A Suslin scheme for 9 is a mapping from. Z) into 9. The nucleus of a Suslin scheme 8:73 4 9 is N(S)= U n S(Ol,...,o) (01:02....)691 n=1 The set of all nuclei of Suslin schemes for a paving 9 is denoted by 2(9). 182 Definition: Let X be a Borel space. Denote by 3* the collection of closed subsets of X. The analytic subsets of X are the members of £(Jk). Actually, there are a number of ways to define the class of analytic sets in a Borel space. Theorem A.l3 provides seven equivalent definitions. At the beginning of this section, it was indicated that ex— tended real-valued functions on a Borel space X whose upper level sets are analytic arise naturally via partial supremization. Because the collection of analytic subsets of an uncountable Borel space is strictly larger than the Borel O—algebra, such functions need not be Borel-measurable. Nonetheless, they can be integrated with respect to any probability measure on (X,E&). The following will discuss the measurability properties of analytic sets. Definition: Let X be a Borel space. The universal O—algebra fix is defined by 7 = F) B (p). If E E , then B is uni- *x pEP(X) x “X versally measurable. Theorem 6.6 (Lusin's Theorem): Let X be a Borel space and S a Suslin scheme for fix. Then N(S) is universally measurable, i.e., 3(ux) = EX. 183 Proof: Let A = N(S), where S is a Suslin scheme for 74X. For (01,...,ok) 62, define UN N(Gl,...,ck) = {(gl,g2,...) 6 m: g1 = 01,..., k k and M(Ol, .,ok) = {(gl,g2,...) E 52: gl g 01,...,gk g 0k) = L} N(T ,...,T ) 1 k 71§01,...,Tk§ok Let R(Oll° 10k) = U n 5(5). z€M(Ol,...,Ok) s 0 be given. Choose :1'32"" such that p*(A) g p*[R(Zl)] + 6/2 p*[R(:1.....§k_1>J g p*[R(Zl.....:k_l. + e/2k k = 2,3," Then p*(A) g p*[R(:1,...,§k)] + e. The set K(7 ...,gk) is universally measurable (F‘I kn 184 H II P[K(Z1oo-u:k)] + P[X'K(Elo---I:k)] P*[R(:1o---a§k)] + p[x'K(:10---I:k)] 2 p*(A) - e + pr-—K(§l,...,§k)]. ll\/ Let x 6 fl K(g .....g) k=1 1 k S('r1,...,'r.). IIDB "37" k 1 1 71§§1,...,Tk§gk j Suppose for every Tl g gl, 3 a positive interer k(T1) such that k(T1) k x£S('r1) DI: m U on S('r1,'r2,...,'rj)] k=2 72§§2,...,Tkggk j=2 Let k = max k(¢ ). Then T gt 1 l- 1 ko k XZ {5(71) firm U m S(T1IT2I°°°I':-j)]? r — .' .2 Tlgbl k—2 ¢2§g2,...,¢k§bk j 2 k 0 3 U n S(T1o---oTj) Tl§§1,...,Tk ggk 3:1 0 o = K(g .....Q ) 1 k0 Contradicting that x 6 fl K(§ ,...,g ) l k k=1 For some T1 g g1 x 6 S('rl) n [ H U f) S(?l.'r2.- .Tj)] k=2 72§g2,...,¢k§gk j=l Similarly, 3 ?2 g g2 such that x E 551) H 851.;2) m k _ _ n [ n U .n 5(71IT2It3I Iq.j)] 185 Continuing, we obtain a sequence T1 g gl, ?2 g (2,... such that O x E H 8(7r ,...,7r ) CN(s) =A k=1 l k G ..rj K(g1,...,gk) CIA V (g1....) em. k—l As k 4 a, K(g1,...,§k) decreases to a set contained in A and X-K(§l,...,gk) increases to a set con- taining X-A. Letting k 4 a l;p*(A) —e+p*(X-—A) This implies l 2 p*(A) + p*(X-—A). But for any E CX, p*(E) + p*(X-E) 2 l p*(A) + p*(X-—A) = l A is measurable with respect to p. QED Corollary 6.6.1: Let X be a Borel space, every analytic subset of X is universally measurable. Proof: The closed subsets of X are universally 7 measurable, so JK‘X) C'UX. QED As remarked earlier, the class of analytic subsets of an uncountable Borel space is not a U-algebra, so there are universally measurable sets which are not analytic. 186 In Theorem A.ll, when X is a Borel space, the function 6A: P(X) 4 [0,1] defined by 9A(p) = p(A) is Borel-measurable for every Borel-measurable A CZX. The following theorem and its corollary will investi- gate the property of this function when A is analytic. The main result is that the set {p 6 P(X)‘p(A) 2 c] is analytic for each real c. Thus, there exists universally measurable probability measure for the analytic set A. Theorem 6.7: Let X be a Borel space and A an analytic subset of X. For each c E R, the set {p E P(X): p(A) 2 c} is analytic. Proof: Let S be a Suslin scheme for 3* such that A = N(S). For 5 G’ZL let N(s), M(s), R(s) and K(s) be as defined as in Theorem 6.6, and D 121 K(;1,...,gk) CA v (g1....,) e 5:. Each K(s) is closed. Let p(A) Z c, for any n 2 l, 3 (E1,ZZ,...) E T such that E(A) g EIR] ; §[R(:1.....Ek>1 187 U n {p E P(X) :p[K(s)] 2 c-l/nF. E C Now, let 5 E r) L) r) {p E P(X): p[K(s)] 2 c-l/n}. n=1 26m sf is universally measurable. Proof: For p e P(X), define p’ e P(Y) by I -1 p (C) =p[f (0)] V c 6 By. Let V E 5% be such that p[f-1(V) A f-1(U)] = p’(V.AU) = 0. Since f-1(V) 6 Uk. 3 W E 5% such that hW’Af-1(V)] = 0. Then p[W Af-1(U)] = 0. By Lemma 6.3, f-1(U) 6 “X . -1 for every U E U? Since g (B) 6 NY for B 6 fig -1[ f g-l(B)] is universally measurable. QED Corollary 6.8.1: Let X and Y be Borel spaces. D 6 uX and f: D 4 Y a universally measurable function. If U E uy, then f-1(U) E “X' Corollary26.8.2: Let X,Y and Z be Borel spaces, D 6 Uk and E E Ay. Suppose f: D 4 Y and g: E 4 Z are analytically measurable and f(D) C E. Then g<>f is universally measurable. If , -1 A c d&, then f (A) 6 uX. 191 Corollary 6.8.3: Let X and Y be Borel spaces, let f:1X 4 Y be a function, and let q(dy]x) be a stochastic kernel on Y given X such that, for each x, q(dylx) assigns probability one to the point f(x) E Y. Then q(dylx) is universally measurable if and only if f is universally measurable. Proof: By Corollary A.5.l, the mapping 6: Y 4 P(Y) defined by 5(y) = py is a homeomorphism. Let yzix 4 P(Y) be the mapping v(x) = g(dylx). Thus y = 6<>f and f = 6-1<>y. The result follows from Theorem 6.8. QED Lemma 6.4: Let X be a Borel space and f:.X 4 R*. The function f is universally measurable if and only if, for every p E P(X), there is a Borel— measurable function f’ :X 4 R* such that f(x) = fp(x) for p almost every x. Proof: Let f be universally measurable and p E P(X) be given. For r E 0*, let U(r) = (x: f(x) g r} f(x) = inf{r E Q*:x E U(r)]. Let B(r) E 6& be such that p[B(r) AU(r)] = 0. Define fp(x) inf{r E Q* : x E B(r)} = inf '1; (x) rEQ* 192 where r if x EB(r) ¢r(X) ={ m otherwise fp: X 4 R* is Borel-measurable and (x: f(x) ;=’ f (x)} C U [B(r) AU(r)] has P rEQ* p-measure 0. Conversely, given p E P(X), let fp be a Borel— measurable function such that f(x) = fp(x) for p almost every x. Then p({x:f(x) gc] A {xzfp(x) gen :0 Vc ER* f is universally measurable. QED Lemma 6.5: Let X and Y be Borel spaces and g(dylx) be a stochastic kernel on Y given X. The following statements are equivalent: (a) The stochastic kernel q(dy‘x) is universally measurable. (b) For any B E 5&, the mapping XB.:X 4 R defined by 1B(x) = q(B1x) is universally measurable. 193 (c) For any p E P(X), there exists a Borel- measurable stochastic kernel qp(dy]x) on Y given X such that q(dylx) = qp(dylx) for p almost every x. Proof: Suppose (a) holds. The function y: X 4 P(Y) as defined by v(x) = q(dylx) is univer- sally measurable. Let B E 5k and lB(x) = q(B‘x). Let : P(Y) 4 X be defined by EB(p) = p(B). Then eB 1B = BB<>y. By Theorem A.ll and 6.8, 1B is universally measurable. Suppose (b) holds and choose p E P(X). Y is separable and metrizable, E a countable base 6 for the topology in Y. Let 3 be the collection of sets in B and their finite intersections. For F E 3, let fF be a Borel- measurable function such that fF(X) = q(FIx) V x E BF where BF E 5k and p(BF) = 1. Such a fF and BF exist by (b) and Lemma 6.4. For x E F) B . let F FEJ q (dylx) = q(dy‘x). For x E H BF' let q (dylx) p FE? P be some fixed probability measure in P(Y). 194 q(dy]x) = qp(dylx) for p almost every x. Y Borel—measurable in x is a Dynkin system containing The class of sets 3 in 5 for which qp(YIx) is J. The class I is closed under finite intersection and generates By. By Theorem A.lO, statement (c) follows. Suppose (c) holds and let p E P(X). Define pr = X 4 P(Y) by q(dylx) .( A X V II ..‘ A X V II qp(dYIx)- -1 —1 _ Let B E 5P(x)’ p[w (B) Ayp (B)] - 0. By Lemma 6.3, y-1(B) is universally measurable (c) = (a). QED Lemma 6.6: Let X,Y and Z be Borel spaces and let f: XY 4 Z be a universally measurable function. For fixed x E X, define gX: Y 4 Z by gX(y) = f(x,y). Then gX is universally measurable for every x E X. Proof: Let xo E X be fixed and let m: Y 4 XY be the continuous function defined by w(y) = (xo,y). For §_E 52, {y E‘Y: gX E Z} = m-1{(X.y) E XY: f(x,y) E Z}. 0 This set is universally measurable by Corollary 6.8.1. QED 195 Now, the main result is ready to be stated. Theorem 6.9: Let X1,X2,... Borel spaces, Yn = X1X2...Xn and Y = X1X2... be a sequence of Let p E P(Xl) be given and, for n = 1,2,... let qn(dxn+1]yn) be a un1versally measurable stochast1c kernel on Xn+1 given Yn' Then for n = 2,3,..., there exist unique probability measures rn E P(Yn) such that (6.3) rn(§1§2-'-§n) =1 I "my qn—1(§an1'X2""'Xn-1) X X X —l -2 —n—1 qn_2(dxn_1lx1,x2,...,xn_2) ...q1(dx2‘xl)p(dxl) V x1 E 5; ,...,Xn E 8X 1 n If f: Yn 4 R* is universally measurable and either + .- f f drn < m or I f drn < m, then (6.4) f fdrn = f I ...‘f f(x1,x2,...,xn) Y X X X n 1 2 n qn_1(dxn]x1,x2,...,xn_1)...q1(dx2lxl)p(dxl). Furthermore, there exists a unique probability measure r E P(Y) such that for each n the marginal of r on Y is r . n n Proof: There is a Borel—measurable stochastic kernel §l(dx21xl) which agrees with q(dx21x1) for 196 p almost every x1. By Theorem 6.3, define r2 E P(Y2) by specifying it on measurable rectangles to be r2(§i £2) = g q1(§21x1)p(dxl) v £1 6 5&1, g2 e 5&2. -1 Assume f:'Y2 4 [0,m] is universally measurable and f: Y2 4 [0,m] be Borel-measurable. Let f = f on Y -N where N E B and r (N) = 0. By Theorem 6.3, 2 Y2 2 o = r2(N) = f f XN(x1,x2)ql(dlexl)p(dxl) XX —1 -2 = f q1(NX ‘Xl)p(dx1) X1 1 ql(NX 1x1) = O for p almost every x1. 1 Since f(xl,x2) = f(xl,x2) for x2 E'NX . Thus 1 I; [f(x1.x2)-f(xl,x2)]ql(dlexl)l 2 g I lf(x1,x2)-f(x1,x2)]ql(dlexl) = o N Xi for p almost every x1. Then I f(x1,x2)ql(dx2‘xl) = I f(xl,x2)q1(dx21xl) X1 X2 = j f(xl.x2)q1(dx21xl) X2 for p almost every x1. By Theorem 6.4, g f(xl,x2)q1(dx21xl) is Borel-measurable. l 197 By Lemma 6.4, I f(x1,x2)q1(dx2le) is X2 universally measurable. Furthermore, I fdr2 = I fdr Y2 X2 1 2 f(xl,x2)q1(dx2‘xl)p(dxl) M II XC—a ><‘——> i f(x1.x2)q1(dx21x1>p c] is analytic for every c E R, then f is said to be upper semianalytic. It is clear that the idea behind upper semianalytic functions is similar to that for upper semicontinuous functions in the Borel model. The next step is to 199 investigate whether or not semianalyticity is preserved in the Optimal function and under the expectation Operator. Lemma 6.7: (1) Let X be a Borel space, D an analytic subset of X and f: D 4 X. The following statements are equivalent. (a) The function f is upper semianalytic, i.e., the set {x E D: f(x) 2 c] is analytic for every c E R. (b) The set in (a) is analytic for every c E R*. (c) The set {x E D: f(x) 2 c} is analytic for every c E R*. (d) The set in (c) is analytic for every C E R*. (2) Let X be a Borel space, D an analytic subset of X, and fn: D 4 R*, n = 1,2,... a sequence of upper semianalytic functions. Then the functions inf fn, sup fn’ lim inf fn and lim sup fn are n n n-Dm n-Nb upper semianalytic. In particular, if fn 4 f, then f is upper semianalytic. (3) Let X and Y be Borel spaces, g:1X * Y, and f: g(X) 4 R*. If g is Borel-measurable and f is upper semianalytic. Then f<>g is upper semianalytic. (4) Let X be a Borel space, D an analytic subset of X, and f,g: D 4 R*. If f and g are upper semianalytic, then f4-g is upper semianalytic. 200 If, in addition, g is Borel-measurable and g 2 0 or if f 2 0 and g 2 0, then fg is upper semi- analytic, where we define 0 °w = w '0 = 0(—w) : (_oo)0 : 0. Proof: (1) We show (b) = (a) = (d) = (c) =9 (b). Clearly (b) = (a) and (d) = (C). Suppose (a) holds, then {x E D: f(x) 2 -m] = D which is analytic by definition, while the sets {x E D: f(x) 2 m} = {X E D: f(x) 2 n} :3 II 38 II 38 H H {xED:f(x)2c}= [XED:f(x)>c-%}CER .‘3 are analytic by Corollary A.13.2 (a) = (d). If (c) holds, then the sets {XED:f(X)>w1=¢ {xED:f(x)>—oo}= U [XED:f(x)2n} {xED:f(x) 2c]: u {xED:f(x) 2c+rll] are analytic by Corollary A.13.2 (C) = (b)- (2) Let c E R, {x E D: inf f (x) > c} = H {x E D: f (x) > c} n n n n=1 {xED:sup fn(x) 2c}: U {xED:fn(x) 2c} :3 :5 H H 201 inf fn and sup fn are upper semianalytic n n by Corollary A.l3.2 and part (1). And 1im inf fn = sup inf fk n4m n21 k2n . lim sup f = inf sup fk are upper semianalytic. n4m n n21 k2n (3) By Theorem A.18, the domain g(X) of f is analytic. Let c E R, Ix e x: (fog) (x) > c} = g'1({y e g(X) : f(y) > c)) is analytic by Theorem A.18. (4) Let c E R, {x E D:f(x)+g(x) 2 c} = U [{X ED:f(X) 2 r} rEQ (1 {x E D:g(x) 2 c-r}]. This is true if we adopt f(x) + g(x) = m + w = w for all x E D and c E R*. By Corollary A.l3.2, f4-g is upper sem analytic whenever f and g are. Suppose g is Borel-measurable and g 2 0. Let c 2 0, {x E D: f(x)g(x) 2 c} = LI {x E D: f(x) 2 r, rEQ.r>O g(x) > C/r]. 202 {X E D: f(X)g(X) > C} = {X E D: f(X) 2 O] U {x e D: g(x) 2 0] U [ (J {x E D: f(x) 2 r, g(x) < c/r}] rEQ.r>O {x E D: f(x)g(x) 2 c} is analytic. Suppose f and g are both semianalytic and nonnegative. For c E R, the set {x E D: f(x)g(x) 2 c] is analytic as before fg is upper semianalytic. QED Theorem 6.11: Let X and Y be Borel spaces, let D be an analytic subset of XY and let f: D 4 R* be upper semianalytic. Then the function f*: proj (D) 4 R* defined by f*(x) = sup f(x,y) X yEDX is upper semianalytic. Conversely, if f*: X 4 R* is a given upper semianalytic function and Y is an uncountable Borel space, then there exists a Borel- measurable function f: XY 4 R* such that f*(x) = sup f(x,y) with D = XY. yEDX Proof: If f: D 4 R* is upper semianalytic and c E R. The set 203 {x E pron(D): sup f(x,y) 2 c] YEDX = prOjX({(X.y) E D: f(X.Y) > Cl) is analytic by Theorem A.l7. Suppose f*: X 4 R* is upper semianalytic and Y an uncountable Borel space. For r E Q, let A(r) = {x E X: f*(x) 2 r]. Clearly A(r) is analytic. By Theorem A.17, there exists B(r) E EXY such that A(r) = pron[B(r)]. Define G(r) = L, B(s) and f: XY 4 R* by SEst2r f(x,y) = SUP {r E Q: (x,y) E G(r)) = SUP Wr(X.y) rEQ where “f(x,y) = r if (x,y) E G(r) and ¢r(x,y) = -0: otherwise. f is Borel-measurable. Let g be defined by g(x) = sup f(x,y). If f*(x) 2 c for some yEY c E R E r E Q such that f*(x) 2 r 2 c. Thus x E A(r). 3 y E Y such that (x,y) E G(r) and f(x,y) 2 r and g(x) 2 r 2 c f* (X) 2 g(X). If g(x) 2 c for some c E R 3 r E Q and y E Y such that g(x) 2 r 2 c and (x,y) E G(r). Thus for some 5 E Q, s 2 r, we have (x,y) E 8(5) and x E A(s). This implies f*(x) 2 s 2 r 2 c g (x) g f: (x) Thus g(x) = f*(x) . QED 264 Theorem 6.12: Let X and Y be Borel spaces. f: XY 4 R* be upper semianalytic, and q(dy]x) a Borel-measurable stochastic kernel on Y given X. Then the function 1: X 4 R* defined by 1(x) = I f(x,y)q(dylx) is upper semianalytic. Proof: Let 9: XY 4 R* be defined as g(x.y) = —f(x.y). Thus {(x.y) E XY:g(x.y) gb) is analytic V b E R. Suppose g 2 0. Let gn(x,y) = min{n,g(x,y)}. Each gn is lower semi— analytic and gn ? g. Let En = {(x.y.b) 6 XYR = gn(x.y) g b g n) = m U {(X,y,r) E XYR : gn (XoY) < r! k=1 rEQ r _<_ 1 , 1}. By Corollary A.l3.2 and Theorem A.l6, En is analytic. Let u be the Lebesgue measure on R, p E P(XY) and pu, be the product measure on XYR. By Fubini's theorem, (PUMEn) f (XE dudp XY R n =J‘ [n - 9n (my) ldp XY = n - I gn(x,y)dp. XY For c E R, by the monotone convergence theorem, 205 {p E P(XY): f g(x.y)dp g C) = n {p e P(XY) =J‘ gn(x.y)dp g c} =1 XY = fl {pEP(XY):(Pu)(En) 2n-C). n 1 By Lemma 6.2, the mapping p 4 pp is continuous and by Theorem 6.7, the function Sj :P(XY) 4 R* is defined by -f 9(X:Y)dp is upper semianalytic. Let 1(x) = 9f[q(dy}x)px]. Since the mapping x 4 q(dy1x) is Borel-measurable and x 4 pX and [q(dy‘x),px] 4 q(dylx)pX are continuous from X to P(X) and P(X)P(Y) to P(XY) respectively by Corollary A.5.l and Lemma 6.2 By Lemma 6.7 (3), l is upper semianalytic. Suppose g g 0. Let gn(x,y) = max{-n,g(x,y)}. Each gn is lower semianalytic and gn 1 9. Let En = {(x,y,b) E XYR : gn(x,y) g )5 g 0}. En is analytic (pu)(En) = f I X dudp = -f gn(x,y)dp. E XY R n XY For c E R (p E P(XY): f g(x.y)dp < c} = Llip E P(XY): I gn(x.y)dp < C‘ n=1 XY = U {p e P(XY) : (pu) (En) > -c). n=1 Use the same arguments as before. In the general case I f(x,y)q(dy1X) = f f+(x.y)q(dylx)-f f-(x.y)q(dylx). 206 Since f+ and -f- are upper semianalytic, and by the preceding arguments each of the summands on the right is upper semianalytic. By Lemma 6.7 (4), 1(x) is upper semianalytic. QED Theorems 6.11 and 6.12 show that both the Optimal function and the expectation operator are well behaved. The next two theorems will outline the procedures to Obtain a measurable selector which assigns to each x E X a y E Y which attains or nearly attains the supremum in f*(x) = SUP f(X.y) yEY Theorem 6.13 (JankOV-von Neumann Theorem): Let X and Y be Borel spaces and A an analytic subset of XY. There exists an analytically measurable function ¢>:prOjX(A) 4 Y such that gr(m) C A. Proof: Let f: T 4 XY be continuous such that A = f(m). Let g = projx<>f. Thus 9: m 4 X is con- tinuous from m onto pron(A). For x E pron(A), g-1({x)) is a closed nonempty subset of m. 207 Let C1(x) be the smallest integer which is the first component of an element 21 6 g-1[(x)], and g2(x) be the smallest integer which is the second compo- nent of an element 22 E g-1({x}) whose first component is §1(x). In general, let gk(x) be the smallest integer which is the kth component of an element 2k 6 g-1({x]) whose first (k-l)st components are g1(x),...,gk_l(x). Let WX) = (Cl(X).§2(X)...-) Since 2k 4 $(x), ¢(X) E g-l({x}). Define m: pron(A) 4 Y by m = pron° f0 u, so that gr(m) C.A. For (01,...,Gk) 6 23, let N(Ul,...,Gk) = {(C1,Q2,-oo) 6 97: C1 = 0110-00gk = 0k} \v M(ol,...,ok) = {(g1,g2,...) 6 91: gl g 01,...,gk g Gk}. Let S = (01.02,...,Gk) E'ZL Suppose x 6 $-1[N(s)]. Let Mx) = (§1(x),g2(x),...). Then ¢'(x) €N(s) CM(s). V 9[$(X)] 6 g[M(s)] and §1(X) = Glyn-o,Ck(X) = Ck. By the construction of w, 01 is the smallest integer which is the first component of an element of g-1({x}) and for j = 2,...,k oj is the smallest integer which is the jth component of an element of g-1({x}) whose first (j-1) components are Ul"'°'Gj—1 208 . -1 _ . _ ..g([x}) flM(01,...,oj_1,Oj-1)— gr 3 _ l,...,k k and xljglg[M(Ol....,oj_1,oj-1)]. This implies k J71 + [NM] C 9[M(s)] - 3'91 9[M(01u--.0j_1 0]- -1)]- It Now suppose x 6 g[M(s)] - U g[M(C71,...,CI.l O.-1)]. j=1 3' 3 Since x€g{M(s)], 3 y= (711,712,...) Eg_1({x}) such that Clearly, x E pron(A) = g(ET?) i fix) is defined. Let 1(X) = (;1(X).;2(X)....) and MX) Eg-l({x}). Thus g[;(x)] = x. This implies 1' (X) EM(01,...,oj_l oj-l) j = 112I"’Ik' Since $(X) Q’M(Ol-1), then c1(X) 2 01. But g1 (x) is the smallest integer which is the first component of an element of g—1({x}) . C1 (X) g 711 S 01- This implies (x) = o . Similarly, since 1 l $'(X) {M(§1(X).02-1). Q2(X) ; 02. Again C2(x) ||/\ .‘3 N l/\ o to .'. 52 (X) = 02. Continuing v(x) €N(s) and x E i'-1[N(s)]. Thus k ‘3-1[N(S)] :3 g[M(s)] .- U g[M(O ,...,O._ ,O’.—1]. j=1 l J 1 J P09 k . fl _ w, [N(s)] — g[M(s)] - girl/[(61,...,oj_l,oj -1] 3:1 M(t) is open in ‘3? for every t E Z. By Theorem A.18, g[M(t)] is analytic '. ¢'-1[N(S)] e ax v s e 23. But {N(s) : s E Z] is a base for the topology on ‘R O({N(S): s E'ZU) = 3m. Thus w-1[U({N(s):s e 21)] = g[¢-1({N(s) : S GEM] This implies ¢-1(B C ax and 11’ is analytically an) measurable. By the definition of co and the Borel- measurability of f and pron -1 ,-1 -l .-1 :9 (BY) = (f {pronY (BYH) -l -l c; (f [DXYD c: w’lvs ) c a m ‘X I. a) is analytically measurable. QED Theorem 6.14: Let X and Y be Borel spaces, D C XY an analytic set and f: D 4 R* an upper semi- analytic function. Define f*:pron(D) 4 R* by f* (X) = sup f(x,y). §’€I5( 210 (a) For every 6 > 0, there exists an analytically measurable function m:pron(D) 4 Y such that gr(m) C D and for all x 6 pron (D) f* (x) - c if f* (x) < co f[X.co(X)] 2 — l/e if f* (x) = on (b) The set I = {x 6 pron (D) : for some = * . . yX E DX' f(x,yx) f (x)} is universally measurable and for every 6 > 0 there exists a universally measurable function co:projX (D) 4 Y such that grm) c: D and for all x 6 pron (D) fEX,cp(X)] = f* (X) if X E I f*(x)-e if x£1,f*(x)<°° fIX.co(X)] 2 1/6 if XEI, f*(x)=m Proof: (a) The function f* is upper semi- analytic by Theorem 6.11. For k = O,j-_l,:_2,..., define A(k) = {(xoy) e D: f(x.y) > kc} B (k) = {x e pron(D) :ke < f* (x) g (k+ Us} B (-..) = {x e pron(D) : f*(x) = -..] B(cn) = {x E pron(D) : f*(x) = on}. 211 The sets A(k), k=0,_-f_1,:_2,... and B(co) are analytic, while the sets B(k), k = O,_-i;l,:_2,... and B(—a) are analytically measurable. By the Jankov—von Neumann theorem, there exists, for each R = 0,11,:2, . . . an analytically measurable wk : pron[A(k)] 4 Y with (x,mk(x)) E A(k) for all x 6 pron[A(k)] and an analytically measurable c-o : projx (D) 4 Y such that (X,c-o(x)) E D for all x E projx (D) . Let k* be an integer such that k* 2 1/62. Define to : pron(D) 4 Y by 9%(x) if x EB(k), k =O,_~I;1,_+_2,... :p(x) = g(x) if x 6 13(4) 03k" (X) if X € B(co) Since B (k) C pron[A (k) ] and B (on) C pron[A(k)] for all k, this definition is possible. Clearly, co is analytically measurable and gr(cp) C D. If x E B(k), then (x,cpk(x)) €A(k) and fifX.co(X)] = f[X.cok (X) ] >k€ ; f*(x)-e. If x E B(-eo), then f(x,y) = -en for all y E DX and f[x,co(x)] = -o = f*(x). 212 If X E B(o) f[X.co(x)] = f[x.cok*(X)] > k*e > l/e. Emnce to has the required prOperties. (b) Consider the set E C XYR* defined by t1} II {(X.y.b): (X.y) 6 D. f(X.y) ; b} k=1 r€Q* By Theorem A.l6 and Corollary A.l3.2, E is analytic in XYR* 2 The set A = pronR*(E) is analytic in XR*. And the mapping T: pron(D) 4 R* defined by T(x) = (x,f*(x)) is analytically measurable. I = (x: (x,f*(x)) e A} = T'1(A) I is universally measurable by Corollary 6.8.2. Since E is analytic, by the Jankov-von Neumann theorem, 3 an analytically measurable «HA 4 Y such that (x,m(x,b),b) E E for every (x,b) 6A. Define ‘4'; : I 4 Y by ‘HX) = co(X.f*(X)) = (co°T) (X) V x E I. By Corollary 6.8.2, '1' is universally measurable and by construction f[x,~lx(x)]2 f*(x) for x E I f[X.'&:(x)] = f* (X) V x 6 I. O D U ((x.y.b) : (X.y) e o, f(x,y) 2 r, r 2 b-l/k]. 213 By part (a) there exists an analytically measurable 1'-=Pr0jX(D) 4 Y such that f*(x)-e if f*(x) <0: f[X,¢€(X)] 2 l/E‘. if f* (X) = on . Define co: projx (D) 4 Y by p(x) if )< 6 I (p(X) = o I ¢e(x) if x 6 prOjX(D)-I Thus w is universally measurable and has the required prOperties. QED It is noted that since the composition of analyti- cally measurable functions can fail to be analytically measurable, the selector in the proof of Theorem 6.14 (b) can fail to be analytically measurable. However, the composition of universally measurable functions is universally measurable, and so a selector which is ‘universally measurable is obtained. Earlier in the chapter, it was mentioned that the pnnajection of a Borel-measurable function need Nevertheless, under certain not be Borel-measurable. conditions, it can be shown that when the extended .realfwnalued functions involved are semicontinuous, tflien tine selectors can be chosen to be Borel—measurable 214 Theorem 6.15: Let X and Y be separable metrizable spaces. Let q(dylx) be a continuous stochastic kernel on Y given X, and let f : XY 4 R* be Borel-measurable. Define 1(x) = I f(x,y)q(dy)x) . (a) If f is lower semicontinuous and bounded below, then 1 is lower semicontinuous and bounded below. (b) If f is upper semicontinuous and bounded above, then 1 is upper semicontinuous and bounded above. Proof: (a) Since f is lower semicontinuous and bounded below, 3 a sequence, {fn] C C(XY) such that fn ? f. Define 1n(x) = f fn(x,y)q(dy)x). Then kn is continuous by Theorem 6.5 and by the monotone con- vergence theorem kn T l 1 is lower semicontinuous. (b) Same argument as (a) by complementation. QED Theorem 6.16: Let X and Y be metrizable spaces and 1£fl2 f: XY 4 R* be given. Define f*(x) = sup f(x,y). y6Y (a) If f is lower semicontinuous, then f* is lower semicontinuous. (b) If f is upper semicontinuous and Y is compact, then f* is upper semicontinuous and for every x E Y the supremum is attained by some y E Y. 215 Proof: (a) Let (31 be ametric on X and d2 a metric on Y consistent with their topologies. Let (ECZXY be open and x0 6 pron(G), 3 yO 6 Y such that (Xo'yo) 6 G and some a > O with N€(Xo.yo) = {(x,y) 6 XY::dl(X.XO) < c. 62(y.yo) < e] C G. Then C ' = 0 C ' xO \ pr03X[N€(xO,yo)] {x 6 X. dl(x,xo) < e} pronG 2 pron(G) is open in X. Let c E R {x E X: f*(x) > c? = projx({(x,y) 6 XY: f(x,y) > c}). Since f is lower semicontinuous, this implies {(x,y) : f(x,y) > c} is open {x E X: f*(x) > c} is Open and f* lower semicontinuous. (b) Let x E X be fixed and let {yn} c.Y 'be such tflnrt f(x,yn) 1 f*(x). This implies yn 4 yO lim sup f(x,yn) g f(x,yo) where yo E Y since n4a f (leo) = f* (X) - Let {x11} C X be such that Xn 4 x0. Choose a sequence {YD} <:‘Y such that 216 f(xn,yn) = f*(xn) n = 1,2,... 3 a mflmequence [(x ,y )) such that “k “k lim sup f(x .y ) = lim f(X oY ) Since Y’ is compact, {yn } 4 yo, yo 6 Y R 1im sup f*(xn) = lim sup f(xn.yn) n4: n41: = 111“ f(X 0y ) k4o “k “k * _<_ f(xo.yo) g f (X0) f* is upper semicontinuous. QED The next lemma is very similar to Theorem 6.13 in the analytic model and Theorem 6.17 is the corresponding counterpart of Theorem 6.14 in Borel—measurable model. Lemma 6.8: Let X be a metrizable space, Y a .separable metrizable space, and G an Open subset of Xfih. 'Then pron(G) is Open and there exists a Borel- measurable function to : pron(G) 4 Y such that gr(m) C G- Prrxaf: Let {yn: n = 1,2,...} be a countable dense subset Of Y. For fixed y 6 Y, the mapping x 4 (x,y) is continuous .. {>< 61X: (x,y) 6 G} is Open. 217 Let on = {x e x: (x,yn) e G}. Thus prOJX (G) = n31 on and prOjX(G) is Open. Define co: pron (G) 4 Y by y1 if x 6 G1 co(X) = n-l y if x 6 G - Ll G , n = 2,3,... n n k=1 n Clearly gr(cp) C G and m is Borel-measurable. QED Y Theorem 6.17: Let X be a metrizable space, a compact separable metrizable space, D an Open sub- set Of XY, and let f : D 4 R* be upper semicontinuous. = sup f(XIY) ' be given by f*(x) y 6DX Let f* : pron (D) 4 R* Then prOjX(D) is open in X, f* is upper semicontinuous, and for every 6 > 0, there exists a Borel-measurable such that grace) C D and function co- : pron(D) 4 Y for all x 6 prOjX(D) f*(x) - e if f*(x) < co ffX,cp€(X)] 2 1/6 if f* (X) = a. Proof: The set prOjX(D) is Open in X by Lemma A 6.8. Let f: XY 4 R* be defined by ‘f(x,y) if (x,y) 6D f(X.y) = -¢ otherwise For c 61R, A {x 6 X: sup f(x,y) > CT. ' f* (X) > c} = y6Y {x 6 pron(D) . 218 By Theorem 6.16 (b), f* is upper semicontinuous. Let c > 0 be given. For k = O, i1,_+_-_2,..., define MR) (091!) e D : f(x.y) > Re) B (k) {x e pron(D) :kc < f*(x) g (k+ 1):} B(—-) = {x 6 prOjX(D) f*(x) = -u] \ B(a) = {x 6 pron(D) : f*(x) = a] l (k+1)€ k6 The sets A(k) , k = 0.11.12“... are Open, while the sets B(k), B(—¢), B(o) are Borel-measurable. By Lemma 6.8, 3 for each k = 0,11,12,... a Borel- measurable cpk : pron (Ak) 4 Y such that gr(cpk) C AR“ 3 a Borel-measurable E): projx (D) 4 Y such that gr (c?) C D. Let k* be an integer such that k* 2 1/92. Define cps : projx (D) 4 Y by 219 cok(x) if x 6 B(k) k = O,i1,:2 mew) = {g(x) if x e B(w) c9k*(x) if x 6 B(oo) Since B(k) c:prOjX[A(k)] and B(a) C prOjX[A(k)] for all k, this definition is possible. Clearly, we is Borel-measurable and gr(m€) C D. If x 6 B(k), then, since (X.wk(X)) 6 A(k), fIXIm€(X)] = f[Xocok(X)] > ke _2_ f* (X) - e- If x e B(-c), then f(x,y) = -a for all y e DX f[X:co€(X)] = -~ = f*(X). If x 6 B(an), f[X.co€(X)] = fix.mk*(X)] > k*e < 1/e m” has the required prOperties. QED L SO far in this chapter, all the rudiments that are necessary to carry out the dynamic programming algorithm in Borel spaces are discussed. The discussion is to a large extent technical in nature. The main ideas are to develop measurability requirements for the various Operations of the algorithm. In the next chapter, our attention will be returned to the economic model and an imperfect information model will be built on the results derived in this chapter. 220 It should be noted that the most logical order Of events is to re-examine the finite and infinite models for a basic multiperiod agency using the Borel space ideas presented here. However, I choose not to follow this sequence because the imperfect state information model can be best used to describe the reporting function Of an entity about its performance to the owners. Since this application is Of the greatest interest to accounting research, I shall proceed directly to the imperfect state information model, leaving the basic multiperiod agency Borel model for further research. CHAPTER VII IMPERFECT STATE INFORMATION MODEL 7.1: Introduction Chapter VI discusses the problems that arise when the disturbance space is uncountable. Specifi- cally, it becomes impossible to define stochastically the sequence of payoffs given an initial payoff. Later in the chapter, by imposing measurability restrictions on the functions and enlarging the payoff and disturbance spaces to include all analytic sets, the sequential search for Optimal or nearly Optimal contracts can still be implemented. The machineries built in Chapter VI, particularly VI*, facilitate the modeling of the following economic phenomenon. A contract is agreed upon by the principal and the agent at the beginning Of a typical period, say k. The agent makes his action choice in a manner of a rational decision-maker. A payoff outcome occurs at the end Of the period which is only observable by the agent. The agent "reports" the payoff to the principal. There is no reason to believe or to assume 221 222 that the agent always reports truthfully. In fact, the manner the agent reports the payoff should be One which is in his own best interest. This may induce a discrepancy or disturbance between the actual payoff and the reported payoff. Since the agent "chooses" the manner to report the payoff, the resulting discrepancy can be anything. What is observable to the principal is a sequence of past contracts and re- ported payoffs. A contract is enforceable only as its arguments are observable by all parties. Under such circumstances, the only candidates for contracting will be the sequence of past contracts and reported payoffs. However, it has been shown in the literature by various writers (Ng and Stoeckenius [1979], Harris and Raviv [1979]) that an incentive contract based on the outcome alone will cause more hazard problems. The question facing the principal will be under what cir- cumstances can he design a long term contract to achieve a Nash equilibrium over the planning horizon. The discrepancy between the actual payoffs and the reported payoffs under the imperfect state informa- tion model does not behave as nicely as the countable disturbance assumed in Chapters III through V. Since the reporting function belongs to some functional space which is Of infinite dimension and the principal has no 223 way of knowing what or how the agent is choosing the reporting function, there is no reason to believe that the disturbance is countable and has a well-defined distribution function. It is uncountable because the reporting function is unknown to the principal and it is of infinite dimension, that is, the space is made up Of an infinite number of independent vector basis. The principal receives the value of the reporting function, the mapping (or the how) of the actual payoff to the reported payoff is never known to him. For this reason, the principal is not able to induce the actual payoff from the reported payoff. This causes more serious problem in a multiperiod model. Since the principal cannot induce the actual payoff given a reported payoff in any period, it becomes more difficult for him to define the whole sequence of payoffs even stochastically given an initial value. Recall that the payoffs form the state space for the multiperiod agency model prOposed in the previous chapters, an alternative state space must be defined before the problem can be formulated. Such a state space must be inducible by the knowledge of the past contracts and reported payoffs. It must have a probability distribution conditioned on the past contracts but independent Of the reported payoffs. This condition is necessary to induce truthful reporting. 224 As indicated earlier, contracts based on reported alone provides incentive for the agent to report un- truthfully. It has been suggested that (Myerson [1979]) if the principal desires the agent to tell the truth about a particular event, the contract should be drawn independent of that event. It will be shown later in this chapter that this alternate state space is used as an argument for the incentive contract, it has to be independent of the reported payoff to induce the agent to report truthfully. Lastly, the principal must be able to utilize the state space to design a long run Optimal contract. 7.2: The Imperfect State Information Model (ISI) Before proceeding to describe the imperfect state information model, a remark on the solution is in order. Since it is not possible to define the state space stochastically at period 0, the Optimal contract, which consists of a sequence of contracts over the time periods, cannot be defined. The idea of a contract has to be modified to a function whose image is on the inter- val [O,l] and whose function value assign a probability measure on all feasible contracts given the principal's belief of the initial payoff and the past contracts and the reported payoffs. 225 The model proposed here consists of the following objects: 0 Payoff space C Incentive space Z Signal space, or the space of possible reporting functions é Information vector. Define for k = 0,1,...,N-l Q = Z C . o o k "Ck-lzk An element of Qk is called a kth information vector, denoted by Qk. k = O,...,N-l Incentive constraints. These constraints exclude all contracts which will give the agent an expected utility less than Outside Opportunity set. a Discount factor g Net return function to the principal s A conditional probability of the initial signal given 0 s Conditional probability of Z given QC N Time horizon: a positive integer or m P Initial payoff probability space The 181 model can now be described notationally as follows. At period 0 the principal and the agent agree on a contract ID with the principal's probability belief Of getting the initial payoff mo to be p and a reported payoff 20 with probability so given mo. One can view 50 being the principal's assessment of 226 the truthfulness of the agent's reporting action. The agent chooses ao Optimizing his expected utility with outcome mo. He then reports 20 to the princi— pal who then updates his information vector and revises his belief on what he may receive as for the 21 possible contracts that he may entered into with the agent as 12. Then 12 repeated. Of course, the biggest problem here is that is agreed and the process the only information that the principal has is the ¢k. If he were to contract based on reports alone, it has been shown (Ng and Stoeckenius [1979]) that such a contract will induce non-truthful reporting. At this stage of modeling, not much can be said. A bit of comfort is that it can be shown (Theorem 6.3) that a sequence of probability measures Pk(QoZOCO...QkaCk)V,p) can be defined on QOZOCo"'OkaCk given a contract n and initial dis- tribution p. For notational ease, let Dk denote the set Of all sequences of the form (w ,z I O O' ’ I o,.. wk'zk'lk) 6 OZC...OZC. This enables one to define the N-stage total discounted expected return to the principal corresponding to a contract F 6 H as N-l k JN,F(p) = g [ggg a g(wkIIk)]Pk_1(Vop)de_lo k-l - For the infinite horizon model, then Jn = lim J N4m N'V 227 In order to ensure the integral in JN v to be defined, the following ad hoc condition is imposed on the finite horizon model JN,F(p) < m V v 6 H. p 6 P(Q). This condition simply implies that the total expected discounted return to the principal is finite. For the infinite horizon models, to guarantee the limit in the definition of JV to be well-defined, one of the following conditions is needed. (P) 0 g g(w,I) for every (w,I) 6 DC (N) g(w,I) g 0 for every (w,I) 6 QC (D) O < a < 1 and for some b 6 R, -b g g(w,I) g b for every (w,I) 6 DC. Condition (P) implies that the net return per period to the principal g is positive. Condition (N) implies that g is negative all the time. Condition (D) says that g is bounded and the discount factor is less than unity. Of the three conditions, condition (D) certainly describes most common economic situations. It is not unreasonable to assume that at least one of the above conditions hold for the infinite horizon model. As indicated earlier, based on the information vector alone, there seems little h0pe to obtain a long run Nash Optimal contract. However, by introducing a 228 monitoring device, a sufficient statistic, which is defined below, the 151 model can be transformed into a "perfect state information" (PSI) model. The process of transformation will be described in the next section. The sufficient statistic is defined in such a way that knowledge Of its value is sufficient to design an Optimal contract and control the system. First, the term statistic is defined. A statistic for the 181 model is a sequence (no,...,nN_1) of functions nk: P(Q)i>k 4 5k where Ek is non-empty with generic elements of E k = 0,1,...,N-1. Then kl for a statistic (no,...,nN_1) to be sufficient for the 151 model, all Of the following conditions have to be met. (a) The statistic must guarantee that the incentive constraint set Uk(¢k) can be recovered from nk(p;¢k). This enables the PSI model which is defined on Ek to search for an Optimum among the same set of feasible contracts. (b) It must guarantee that the distribution of depends only on the values of E and I Thus, k k' can be used to construct the state gk+1 the variables gk space of the perfect state information model. (c) It must guarantee that the net return function to the principal 9 corresponding to a contract n can be computed from the distribution induced on the (gk'Ik) pairs. 229 What the sufficient statistic does is that the principal can take the information vector and from there con- struct an independent set of variables which he can use to assess the performance Of the agent. The 181 model describes very closely the reporting function of an entity. Management performs their routine tasks and decision making. A series of payoff outcomes occur and management chooses a reporting function to produce a set Of financial statements at the end of each period. In most corporations, the principal or the owner does not take part in both the decision making and the reporting processes. The actual payoff outcomes are then unobservable by them. Then the financial statements are presented to the auditors who perform the various audit tasks, make the necessary recommendations for alteration and attest the financial statements. If one were to investigate the definition of a sufficient statistic closely, it is not difficult to see that it actually describes the audit function. The auditor takes the reported outcome which is in fact the information vector. After performing the audit tasks and making the necessary changes, he produces the audited financial statements. If an audit is performed in accordance with the general accepted auditing standards, 230 it is believed that the audited financial statements provides fair representation of the financial standing of the company. In other words, from the audited statements one can induce an expected payoff or return to the owner which is exactly Condition (c). The requirement that the distribution of §k+1 depends only on the values of 5k and Ik and not on the information vector ¢i parallels the auditing profession's emphasis on independence. Auditors are not to be involved in the reporting function of the company. Condition (a) apparently is imposed more for control purposes than reporting. However, this condi- tion implies that the principal can use the audited financial statements to search for optimal decision. The possible alternatives induced by these statements are identical with those if the actual payoffs are known. This guarantees that the decisions made are feasible and, as shown in the later section that, they are Optimal also with respect to the actual payoff. This fulfills the Objective of the financial statements that they should provide relevant information for decision making. 231 7.3: The Perfect State Information Model (PSI) The manner that the sufficient statistic is de- fined implies that given a distribution of the initial payoff and the information vector, the principal can transform the reported payoff into an expected payoff with an independent probability distribution. If a sufficient statistic is given, that is assuming it exists, and its values can be computed from the know- ledge Of P(Q) and § then the principal can define k' a perfect state information model in terms of Bk. For notational purposes, a (A) is used to denote Objects in the PSI model. The perfect state information model consists of the following: E k = O. 1' o o o 'N-l State Space kl C Incentive space A Uk’ k = 0,1,...,N-l Incentive constraints a Discount factor A gk, k = 0,1,...,N-l One period net return to the principal A t , k = 0,1,...,N-2 Probability distribution of k (e I: I) k+1 ~k’ k N Horizon 232 7.4: Reduction of the Imperfect State Information Model This section studies the relationships Of the various Objects between the 181 and the PSI models. The discussion on the existence of a sufficient statistic will be deferred until the next section. Throughout this part Of the analysis, a sufficient statistic is assumed tO exist. The Borel model as proposed in Chapter VI guarantees the existence of an optimal solu- tion (Theorems 6.13 and 6.14, Lemma 6.8 and Theorem 6.17). By imposing the measurability restrictions on the E and the incentive spaces, the PSI model consists of a well-defined and measurable state space E with a .— probability distribution on E which depends on Ek and I It becomes a direct application of the Borel k’ model. Hence an Optimal contract for the PSI model is guaranteed to exist. The natural question to ask is whether or not the Optimal contract Obtained from the PSI model is also optimal for the 151 model. In order to establish correspondence between the two models, the initial probability measure on E O must be ensured such that it can be induced from know- ledge Of the distribution of the initial payoff (Theorem 7.1). The inter-relationship between the probability distri- bution on the set (wo,Io,zo,...,wk,Ik,zk) 6 Dk and A that on (EO,I given a contract n in _r O,...,1k_l,ek) 233 the PSI model is then derived (Lemma 7.1). The analysis goes on to show that for every contract in the PSI model, a corresponding contract in the ISI model can be constructed (Theorem 7.2). A Next, given a particular contract w in PSI, the total expected discounted net returns under the two models are related by their respective probability distribution (Theorem 7.3) and so are the Optimal total return functions (Theorem 7.4). The final step is to show the correspondence of the Optimal contracts between the two models. I can only show a nearly Optimal con— tract for the PSI model under the finiteness assumption of the finite horizon model or the Assumptions (N) and (D) of the infinite horizon model is also nearly Optimal for the ISI model. However, correspondence for Optimal contracts is shown for all assumptions Of both the finite and in- finite models (Theorem 7.5). The ISI model can then be reduced to the PSI model for nearly Optimal contracts. In terms of the auditing model, this implies that the nearly Optimal contracts derived from the audited financial reports are as good as those as if one were to Observe the actual payoff. 7.5: Sufficient Statistic In this section, a sufficient statistic is prOposed and shown to meet all the three conditions of a sufficient statistic. 234 It is derived by a process called filtering. In essence, filtering is very similar to the commonly known process of Bayesian statistics. The system starts with the initial wealth outcome wo which has a priori distribution p. After 20 is observed, the distri- bution is "up-dated". The up—dated distribution is called a posteriori distribution and is shown to be well-defined and unique (Lemma 7.6). At the kth stage, there will be some a priori distribution p; of “k based on dk_1. Contract Ik-l is negotiated, some 2 is observed and an a posteriori distribution k of wk conditioned on (¢k-l'Ik-l'zk) is computed. This distribution is again well-defined and unique. The process of passing from an a priori to an a pos— teriori distribution in this manner is called filtering. Then the sequence Of a posterior distributions of wk' pk: P(Q)4>k 4 P(C) is shown to be a sufficient statistic (Theorem 7.6). The filtering process seems to capture very closely the audit function. The auditor comes into engagement with a client. Based on initial interview with manage— ment and evaluation of the company's internal contral system, the auditor form some a priori Opinion about the 'correctness' of the reported outcome, or what the initial payoff should be. After performing the necessary 235 substance and compliance tests, the audit will up—date the distribution. Such a distribution will again be up-dated in subsequent periods based on the prior years' working papers. The audit report thus become the out- come of the filtering process. In the process described above, again I am assuming that there is no incentive nor moral hazard problems exist between the principal and the auditor. The auditor is acting strictly in the best interest Of the principal. If the auditor is performing the audit tasks in the best possible manner, the audited financial statements is adequate for the derivation of a long-run Optimal contract to be negotiated between the principal and the agent. CHAPTER VII* IMPERFECT STATE INFORMATION MODEL 7.1* Introduction The imperfect state information model consists of two stochastic time series, namely the payoff and the signal on the payoff. Obviously, the later series is parameterized by the first one. In terms of the economic model, the payoff is the wealth outcome of a particular action selected by the agent. He observes the payoff and then decides on the manner that the outcome is to be reported to the principal. The information which is available to the principal is a vector of the following form Q = Z C k 0 o...ck_lzk, k = O,...,N-l where C is the incentive space and Z is the signal space both Of which are assumed to be nonempty Borel spaces. To the principal, since he has no direct control on the form and magnitude of Zk, the report or signal in his mind is nothing but a random event 236 237 stochastically generated via a signal kernel s(dzk+1)Ik,wk+1). The zk+1 Signal is then added to the past Signals and incentives (20.10,...,zk,1k) to form the (k4—1)st information vector Qk+l = (ZO'IO""’zk'Ik'zk+1)' The first information vector o; = (20) is generated by the initial Observation kernel s0(dzo)mo), and the initial pay- off wo has some given initial distribution p. Since Qk is Observable by both the agent and the principal, it becomes a basis for incentive contracting. However, it is well-known that any incentive based solely on Qk is likely to induce misrepresentation of the outcome. The following section will develop a sufficient statistic for the payoff, or more specific, a control on the reporting function. To describe the system in terms of p, the initial distribution of mo, define t(B)w.I,J) P([Y= f(w,I,J,y) 6 B})w.I.J) p(f'1(B)(w,LJ)lw.I.J) for B 6 60. Thus t(Blw,I,J) is the probability that the (k+—l)st state is in B given that the kth state is w. the kth contract is I and the kth total net return to the principal is J. Alternatively, the 238 system can be viewed as moving from wk to wk+1 via the state transition kernel t(dwk+11wk'Ik'Jk) and generates a return to the principal of g(wk'Ik)‘ 7.2* The Imperfect State Information Model (ISI) The model will consist Of the following objects and their corresponding assumptions 0 Payoff space: a nonempty Borel space Incentive space: a nonempty Borel space Signal space: a nonempty Borel space Incentive constraints: For k = O,...,N-l, let 6k = Zoco'°'Ck-lzk' An element Of ék is called a kth information vector. For each k, U is a mapping from ék k to the set of nonempty subsets of C such that I‘k = {(oka) : grk 6 6k. 1k 6 Uk(Q’k)] is analytic Discount factor: a positive real number Payoff to the owner: an upper semianalytic function from PR to R* Initial signal kernel: a Borel-measurable stochastic kernel on Z given 0 Signal kernel: a Borel-measurable stochastic kernel on Z given CQ 239 t State transition kernel: a Borel- measurable stochastic kernel on 0 given CC N Horizon: a positive integer or a P Initial payoff probability space: a nonempty Borel space. Definition: A contract for ISI is a sequence v = (“O""'uN-l) such that, for each k, k = O,...,N-l. “k(dIk‘P7¢k) is a universally measurable stochastic kernel or C given P(Q)§k satisfying uk‘Uk%>1P:¢k> = 1 V (p.¢k) e p(omk. If for each p,k and , (dI p:¢') assigns mass “1: k k one to some point in C, v is said to be non- randomized. Let U denote the set of all contracts n. For ease Of notation, let Dk denote the set of all sequences of the form (mo,Zo,Io,...,wk,Zk,Ik) 6 QZC...QZC. Thus, given p 6 P(Q) and n = (”o""'uN-1) 6 U, by Theorem 6.3, there exists a sequence of consistent probability measures Pk(n,p) on Dk’ k = O,...,N-l defined on measurable rectangles by 240 Pk(1r.P)(QoZogo. . 'flkEk-gk) = IQbIZOIS-o...jflkjgk Uk(9_~k‘p7zorlol°-°Ilk_lzk) S(dzk‘Ik—l'wk)t(dwk‘Ik—l'wk—l) ...uo(dIolp:Zo)so(dZoIwo)p(dwo) where .Q 6 B Z.6 E , 9-6 B 0' Z C' Definition: Given p 6 P(Q), a contract v = (“0""'uN—l) 6 H and a pOSItlve integer K g N. the K-stage payoff corresponding to n at p is K-l k JK'F(p) = ‘er-i[k§o a g(kak) ldPK_1(Tr.p)- If N < m, the total net return to the principal corresponding to v is JN v' For the finite horizon model, either one of the following conditions on the integral is assumed N-l a [Z k+ (F-) fDN-l k=0 9 (wk.Ik)]dPN_1(7r.p) < a V v6U.p6P(O). + N-l k _ (F > fgh 1 IQ§B a g (wk.1k)]dPN_l(v.p) < . V Tr61'I.p6P(Q). Then, JN n can be rewritten as follows I N-l k JN’W(p) = gig a ka g(wk.1k)dpk_1(v.p) 241 If N = a for infinite horizon models, then J = lim JN n’ the return to the principal corres— N-Hn ' ponding to n. In order to ensure the limit JV is well-defined in R*, one of the following conditions is assumed on g (P) 0 g g(w.I) for every (w,I) 6 0C. (N) g(w.I) g 0 for every (w.I) 6 OC. (D) O < d < l and for some b 6 R, -b g g(w,I) g b for every (w,I) 6 0C, Hence J7me) = ki? 03‘ J‘Dk g(wk.Ik)de(7T.p) V W e n. p 6 pm). =o The concepts of Optimality at p, Optimality, e-Optimality at p and e-Optimality of contract are analogous to those given in Section 3.3*. To aid in the analysis of the imperfect state model, a sufficient statistic is introduced. It is defined in such a way that knowledge of its value is sufficient to design an Optimal contract and control the system. Definition: A statistic for the model ISI is a sequence (no,...,nN_1) of Borel—measurable functions H nk: P(Q)§k 4 E where :k is a nonempty Borel space, k = O,...,N-l. 242 Definition: The statistic (nb,...,nN_l) is sufficient (SS) if: (a) (b) (C) For each k, there exists an analytic A A set Pk C EkC such that prOjk(Tk) D—l =:_,k and for every p 6 P(Q) A Pk = [(¢k.I)= [nk(p:Q&).I] 6 TR}- -Ae-t Define Uk _k) — ( k)gk. There exist Borel-measurable stochastic A kernels tk(dgk+l‘§k'lk) on :k+1 given EkC such that for every p 6 P(Q). n 6 H. E 6 B: , for k = O,...,N-2, —k+l "k+1 we have that 13k,fk) = Pk+1(w'p)[nk+l(p7ak+l) 6 Ek+1I nk(P7¢k) = Ek'Ik = ik] A D—O tk(:"k+1 for Pk(n,p)-almost every (Ek,Ik), that is, the set A _ {(wolzoIIoIOOOIwkozklIk) 6 11k 3 tk(:k+1‘zk'lk) = Pk+1(WaP)[’] When 5k = nk(P:¢k)IIk = 1k} has Pk(n,p)-measure one. There exist upper semi-analytic functions A gk: 9k 4 R* satisfying for every p 6 P(Q), TFEH' k=O'Ooo'N-l' 243 for Pk(n,p) almost every (gk'lk) where the expectation (in the sense of an outer integral) is taken with respect to Pk(n,p). Condition (a) of the definition of a sufficient statistic guarantees that the incentive constraint set Uk(¢k) can be recovered from nk(p:¢%). Indeed, for any p e P(o). ok 6 ek, k = O,...,N—l A ukwk) — Uk[nk(p:¢k) ]- If Uk(¢k) = C for every gk 6 ék' k = O,...,N—l, A then condition (a) is satisfied with Pk = EkC. This is the case of no incentive constraint. Condition (b) guarantees that the distribution of gk+1 depends only and I . Thus, the variables on the values of ER k gk can be used to construct the state space of the perfect state information model. Condition (c) guarantees that the net payoff to the principal corresponding to a contract can be computed from the distribution induced on the (gk’Ik) pairs. 7.3* The Perfect State Information Model (PSI) If a sufficient statistic as defined above exists and its values can be computed from the knowledge Of 244 P(O) and then a perfect state information @k' model can be defined in terms of Ek. For notational purposes a (A) is used to denote Objects in the PSI model. Definition: Let the ISI model and a sufficient statistic (q0,...,nN_1) be given. The perfect state information Optimal incentive model (PSI) consists of the following: Ek, k = O,...,N—l State space C Incentive Space A Uk' k = O,...,N—l Incentive constraints a Discount factor A gk, k = O,...,N-l One-period net return to the principal A tk' k = O,...,N-2 State transition kernel as defined in Definition SS(b) Theorem 7.1: Define m: P(Q) 4 P(EO) by co(p) (50) = (Q souzo: T10(p:Zo) e gollwommo) E 6 E. w : O for every Then m is Borel-measurable. 245 Proof: AS defined, m(p) is the distribution of the initial state 60 in PSI when the initial state we in ISI has distribution p. Then w§fl(wo.p) = SOHZO: T(o(p:Zo) 6 )Iwo) '— H H is Borel-measurable for every 50 6 65 by Corollary 6.1.1. Define a Borel-measurable stochastic kernel on Q given P(Q) by q(dwolp) = p(dwo). Then co(p) (i0) = f wEo(wo.p)q(dwo)p). By Theorems 6.1 and 6.4, m is Borel-measurable. QED A . Theorem 7.2: If n = (”O'°'°'GN-l) is a contract for PSI, then the sequence A A. . (uo(d10)no(p7¢o)].....pN_l[dIN_l)no(p.¢6).Io. ""IN-2'nN-l(p7¢N-1)] where Ok = (Z I k = O,...,N-l is a o, 0'...'Ik-1'Zk) contract for ISI. Proof: Condition (a) of the statistic {nk} to be sufficient guarantees that the incentive constraint set can be recovered from nk(p:Ok). Since I k («A(J) : ok 6 ek. I e Ukwkfl (Maggi) : [R(pngng 6 9k} and A — P 246 Thus for and p 6 P(Q), Ok 6 ék’ k = O,...,N-l A Uk(gk) = Uk[fik(P:¢k)] and the result follows. QED Theorem 7.1 ensures that the initial probability measure on E can be induced from knowledge of the distribution of the initial payoff. Theorem 7.2 provides the very preliminary result that for every contract for PSI, a corresponding contract for ISI can be constructed. Obviously, the question to ask is whether the Optimal contract for ISI can be induced from this Optimal con- tract for PSI. Before exploring this question, a few definitions and notations are in order. Definition: For p 6 P(Q), define the mapping VP,k : DR 4 :OCOH'Eka by vp’k(worzovlooo:-owkozkrIk) = [no(p:¢o).Io....,rk(p;o’k),1k] where Theorem 7.2 holds. Thus, for q 6 P(Eo) and A'- A A (i h ' f n — (“O""'uN-1) 6 , t ere :s a sequence O con- sistent probability measures Pk(v,q) generated on SOCO...Eka, k = O,...,N-l, which are defined on measurable rectangles by 247 A A _‘ C H C ) Pk (7r,q) (::. . . .:._k —k A A = I50 (C ...jgk pk(gkl§o.xo.....Ik+1.ek)tk_1(dikl§k_1.1k_1> A ...uo(dIOI§o)q(dEo) where E 6 B: , g_ 6 B 2k "k k Ck A+ A- A A A When (F ). (F ). (P), (N) or (D) holds, the net total return to the principal corresponding to a contract 9 for PSI at E 6 E is O N-l A k A A A J A == 23 a I g (E .I )dP (v.p ) N,W(§) k=0 E c ...: c k k k k 5 o O k k where N 6 [1,o] and the optimal return for PSI at E E is . 6 o 7.4* Reduction of the Imperfect State Information Model This section is devoted to study the relationships between net returns, Optimal and nearly Optimal con- tracts for the ISI and the PSI models. First, for a A . contract n for the PSI, these Objects are related to A the probability measures Pk(n,p) as defined in Section 7.2 in the following manner. 248 A A Lemma 7.1: Suppose p 6 P(O) and n 6 H. Then for k = O,...,N-l and for every Borel set B C SOCO...:ka, we have A ... AA Pk(v.p)[vpfk(B)] = Pk[v.e><<:iao). 7T :2 TT N. o No Proof: A i J A<:)e

(d:o) :0 N,V A A A =I Z) a f gk(§k.Ik)de(vr.pE)w(p)(déo). O k=O COCO...;ka BY (F+) or (F‘) if N < a, by monotone convergence theorem if N = o and under (P)(N) and by bounded convergence theorem when N = m and under (D) 250 N‘1 k A A A = a i I gk(§k.1k)de(W.pE)w(p)(dEo). k=0 :0 :OCO. . . ka III A A By definition of Pk(n,pE) and w(p) N-l A A A = 2 (1k f_ gk(:k.ik)de[rr.co(p)]- k=O :OCO...:ka By Lemma 7.1 and condition (c) of a sufficient statistic N-l = 2: (1k 9( I )de (VIP) k: 0 ka L”k' k = J A(p). QED N,F Before proceeding to prove the correspondence of the Optimal return functions and Optimal contracts, a few other results are studied. The following lemma defines the Optimal return function for PSI in terms of the initial payoff probability measure p. Next, the relationship between the Optimal return functions for the models for p 6 P(O) is explored. A+A_AAA Lemma 7.2: (F )(F )(P)(N)(D). For every p 6 P(Q) A* i JN(eo)e'U 251 :1) A Proof: For p 6 P(Q) and 6 H A-k A l JN(§o)e(p) 2 J A(Eo)m

O and let 7’ 6 fi be e—Optimal. If A* n

((:olJN(:o) «J) = 0. Then I A A I J A (50)m(p)(d50) ; f J;(§o)m(P)(dEo) - e : N,F E O O . A A* .. iuR ; JN #(Eo)w(p)(d§o) ; i JN<:o)e(p)(d:o) WEU ”o ' ”O If A* co(p) ((50 : JN(50) = 00)) > 0. Then A E i J A(so)e(p)(d;o) = -e. * {EonN(Eo)w(p>n ; I JN(§O)m(p)(d§o). AA WEE o N’W o [I 11 Otherwise, the right-hand is finite for any a > O and I 9 <5 ) (g ) SUP A cm 0 =o #6fi :0 N'” . A* A .. i JN(go)e(p). PSI. Since A . Then TT‘ is a contract for w(p)(§o) = (Q so({zo: no(p:zo) e gollwo)p(dwo> v es: :0 0 Clearly, the marginal of Qo(n,p) on E0 is ¢(p) and for k = O, by Lemma 7.1 , I 6 C A A Pomcupungo e- 0 _kn. lo QO(J.p) (Eb _C_o) Now assume that = c - A A E J“ 6 ‘ Then Qk+1(rr,p) (Ek+l —C-k+l) A = e fc “k+1‘9k+i‘-Ek+i)d0k+1”'p) Zk+l—k+l A ( ( -¢j )6 e w Uk+l(E-k+l‘nk+l(p7gk+l))dpk+1(Tnp). T“k+1 p. k+1 :k-l-l" By condition (b) of Definition SS II tfi E-k 9k ik+ l = I I E C ...= C 255 A A L1k+1 (EMU Ek+l) t (dgk+l‘ nk+1 ‘97 qt”) 'Ik) de (”'P) A A i “k+1‘9k+1‘Ek+i)t(dgk+i‘gk'IkakW'P) A A LJ1~'.+1(£k+ll §k+1)t(d§k+llgi,1k) O O ”k k 5k+l A A deiF:w(P)] A A ._ = pk+l[rr.o(p) ] ({§k+1 6 ik+1'1k 6 Elwin m For 'k 6 Qk (mp) (5k 9k) By Theorem 7.3 JN,n(p) Definition: 6 fi is said to i J A(to)q(deo> Z q a, ,.g e e , k Ck k C ll 0 Z l ...-l A A = Pk[J.co(p>]({§k 6 518 1k 6 gkl) N,# (so)e(p)(dgo). QED ll NIL-o Cl > 0 Let q 6 P(Eo) and e > O, a contract be weakly q-e-Optimal if A* _ A* i JN(EO)q(d§o) - e If i JN(§o)q(d§o) < an H "O ' A* e _ 1f JN(§o)q(dso) — w “FL—a O A The contact v is said to be q-Optimal if A A* 256 Lemma 7.4: Let J: O 4 R* be upper semianalytic. Then for e > 0, there exists u E U(ClQ) such that TU(J)(w) 2 T(J)(w) - c for every w E Q, where T(J)(w) - 6 may be a. Proof: By Theorem 6.14, there are universally measurable selectors um: Q 4 C such that for m=l,2,... and w€§7. UmEUhL‘) and T(J)(w)-c if T(J)(w) (en T (J)(w) 2 ”m ' 2m if T(J) (w) = a Let u(dIIw) assign mass one to u1(w) if T(J)(m) < m and assign mass 1/2m to um(w). m = 1,2,... if T(J)(w) = a. For each §_€ BC Xg[“1(‘*')] if T(J) (w) < on u(§Jw) = a Z 3; xcfummn if T(J) (w) = a» m=l 2 -— is a universally measurable function of w. u is a universally measurable stochastic kernel with the desired properties. QED Lemma 7.5: (F-) If JO: 0 4 R* is identically zero, then Tk(JO)(w) < a for every w E O. K = l,...,N. Proof: Suppose for some K g N and mo 6 O that for every w G O T3(Jo)(w> < a j O,...,K—l 257 and k T (Jo)(wo) — w 3 universally measurable selectors uj: Q 4 C j = 1,...,K-1 such that uj(w) E U(w) and (TuK_jTj'1) (JO) (w) 2 Tj (Jo) (w) - 1 j = l,...,K-1 V m 6 Q, Then (Tul...TuK_1) (JO) 2 (Tul...TuK_2)[T(JO) - 1] ; (Tu1~--TuK_3)[T2(JO) — 1 — a] K-2 2 TK-1(Jo) - (l-ra-+..u+a ). By Lemma 7.4, there is a stochastic kernel u E U(CIQ) such that (TuOTK-l) O (JO)(wo) = w. Then K-l( (TuoTul.-.TuK_l)(JO)(wo) g TuofT Jo)- l-d-...-dK-2] (w ) 0 Choose any u E U(CIQ), let W = (u ,...,uK_1.ul,...,u) so that v E H K-l k 2 d gdq (mp ) k=o I k u’o J ) K,7T(wo (Tuo...TuK_1) (Jo) (mo) = co for some k g K-l f gqu(v,pw ) = a. o This contradicts (F'). QED 258 r ..A- A A A Theorem 7.5: (F+,B+)(F ,F )(P,P)(N,N)(D,D) * A* : JN(p) = i JN(§o)m(p)(dso) v p e p(n). ”0 Furthermore, if 9 is Optimal, m(p)-0ptimal or weakly m(p)-—e-0ptimal for PSI, then 9 is optimal, optimal at p or e-Optimal at p respectively for ISI If 9 is e-Optimal for PSI and (F-,§-), (N,§) or A (D,D) holds, then 9 is also e—Optimal for ISI. * Proof: JN(p) sup JN,v(p) WEE A inf/3‘; J A(Eo)co(p) (GEO) WEN o N'W ||\/ A* = i JN(§o)w(p)(dEo)- “o By Lemma 7.3, then * Air JN(p) = l JN(EO)m(P)(dEO) v p 6 9(0). ha .1 O A A Let v be e-0ptima1 for PSI. Clearly, under (N,N), A* V _. JN(§O) < a :0 e :‘o A* A J A(EO) g JN(§O) - e V :0 6 30' - A- A* Under (F ,F ), by Lemma 7.5, JN(§o) < m, and again the above holds. From Theorem 7.3 and the first part of this theorem 259 , A J (p) = J (5 )co(P) (<35) Nu? E-[zo N7]; ° o A 2; J;(:O>cp = J“ j r<§lp:1.z>s(dzlx.w)p(dx) £2 0; VQGBO.§EBZ,p€P(G),I€C. 260 Proof: For fixed (p;I) E P(O)C, define a probability measure g on OZ by specifying its values on measurable rectangles to be (Theorem 6.3) q) = ft(glw.1)p(dw) Vg e 60 is called the one stage prediction equation. By Theorems 6.1 and 6.4, f is Borel-measurable. Definition: Given a sequence gk 6 ék such that ¢k+1 = (gk'lk’zk+l)' k = O,...,N-Z and given p 6 P(Q), define recursively (D) po(p:¢6) = ro(dwolp:zo) (B) pk+1(p:¢k+1) = r(dwlf[pk(p:¢k).Ik]:Ik.Zk+l) k = O,...,N-Z. Thus defined, for each k, pk: P(Q)k 4 P(O) is Borel-measurable. Equations (A)-(E) are called the filtering equations corresponding to the ISI model. Lemma 7.7: Let the ISI modle be given. For any p 6 P(Q), v = (u0,...,uN_l), v 6 H and ER 6 60, then pk (TT,p) [wk 6 _Q‘kigk] = Pk (p7gk) (.Qk) for Pk(n,p) almost every ¢k, k = O,...,N-l. 262 Proof: For any §% 6 B and 6 B 0 go 2 {2 £2 ] po(p:zo)(§%)dpo(v.p) = {2 £2 3 ro(§blp:zo)dpo(n,p) L 0 ‘0 o -0' = g g ro(£blp:zo)so(dzo‘wb) —.o P(dwo) = g sosolwapwwo) —o = Pomp) (mo 6 530. 20 e gob. The theorem follows for k = O by the definition of conditional probability. Now assume the theorem holds for k. For any 2k 6 HQ , 9k 6 Eb, Zk+l E 5% and k £§+1 6 50' Then r 6? I 6C 2 'Z 1pk+l(p:Q&'Ik'zk+1)(£&+1)de+l(V.P) \%( —k' k —-k' k+lt_k+1. = f f Pk+i(P’Qi<' "k+1) (Qk+l) Wkeq’k} 9k 1im £k+l s (dzk+lllk’wk+l)t(dwk+llwk’Ik) “k (dlklp: (Pfk) de (mp) I f pk+l(p7¢k'Jk' zk+l) (flkn) k -C-k 531t+1 £194 S(dzk+l‘Ik"”k-+l)t(dmkstl‘""k'Ik) uk(dIkIp:flk)[pk(p:¢k)(dwk)]de(v.p) = I I I I f Pk+1(P’¢k'Ik'zk+i) (Bk-t1) {65%} 9k 9k flk+l -Z—k+1 S(dzk+i‘1k"”k+1)t(dwk+1‘“’k'1k) [Pk(p7¢k)(dwk)]uk(dIklp:¢k)de(F.p) 'DL’: 263 = I I J‘ r(_Qk+1lEi-pk (D:gk) 01k]71kt zk+l) Wkeq’k] gt flk+l gk+1 s (dzk+l‘Ik' wk+1)f{pk(p7¢k)'1k](dmk+l) Uk(dIkIP7¢&)de(VoP) = {(3 g: l g ilk S = {z e z=rtdwlf<§.1);1.z] e a} A t(§IE.I) =fJ“swans)I1.w’]tk.Ik)I¢K.Ik](pk(p;¢k) = Ek' Ik = ik} E(gg(wk.1k)pk(p:¢k)(dwk)1pk(p:¢k) = Ek, Ik = 1k] f g(wk.fk)Ek(dwk) Q for Pk(V,p) almost every (gk'ik) where the expecta- tions are with respect to Pk(v,p). The function A g: P(0)C 4 R* defined by A _ .. g(E.I) = I g(w.I>E(dw> Q is upper semianalytic by Theorem 6.12 A g satisfies condition (C). QED 266 Theorem 7.7: Let the ISI model be given. The sequence of identity mappings on P(O)§k, k = O,...,N-l is a sufficient statistic. Proof: Let Ek in Definition SS by P(O)§k. k O,...,N-l and let nk be the identity mapping on P(O)§k. Then (no,...,nN_1) is a statistic. A Let Pk = P(C)? , k = O,...,N-l. Condition (A) is satisfied. If ik+l 6 Bb(0)¢ . 3k 6 P(Q)§k, k+1 and ik 6 Ck' define (§k+1) - = {zk+l 6 Z‘ (P’zo'lo'""Ik-1'zk'Ik'zk+1) (E.I) k k 6 5k+l} where 5k = (p7zo'Io""'Ik+l'zk}' Now, define for A - - _ _ ' p k — O,...,N 2, the stochastic kernel tk(d’k+l‘§k'1k) on P(QMk+1 given P(O)§kC by A _ - - _ _ tk(5k+1‘§k'1k) ’ I S[(Ek+l)(§ f )‘Ik'wk+l] +1 k' k t(dwk+1iwk.ik)pk(gk)(dwk) V E 6 B i—k+l P(O)§k+1 where pk(§k) is as defined by (D) and (E). Using similar arguments as in Theorem 7.6, it can be shown A that t is Borel-measurable. By Lemma 7.7 k 267 Pk+1(”'P)[”k+1(P’¢k+i) E Ek+1h‘k(p“3‘k) = Ek' Ik = i1:1 = Pk+1(n—Ip)[(gk'1k'zk+l) E Ek+1] = “i 51-. (§k+l) (E f )lik'wk+l]t(dwk+1‘wk'fk)pk(-E-k) (dwk) +1 k' k _ A l? f ‘ t(-—k+1 "k' k) M for Pk(v,p) almost every (Ek'ik)' Condition (B) is satisfied. For k = O,...,N-1, define A 9k : P(C/Mkc 4 R* by Sk(§k.fk> = f g(wk.ik)pk(2k)(dwk). 0k By Theorem 6.12, is upper semianalytic for each k A 9k For p 6 P(Q), V 6 U and k = O,...,N-l from Lemma 7.7 E[g(mk.lk)‘n(p7¢k) = 5k. Ik = ik] = $k O and Fn CBn C Gn such . . n that Fn lS closed, Gn is Open and p(Gn-Fn) g e/Z . Then U B cu G =(U Fn)u[u (en-an n= n=1 n=1 n=1 C (U Bn) U [U (Gn-Fn)]. n=1 n=1 0 a Thus p( L) Gn) g p( L! Bn) + g. This implies n=1 n=1 271 P( L’ Bn) = inf{p(G) : L} Bn CIG, G Open}. n=1 n=1 Also p( k} Bn) < p( L! Fn) + e and n—1 n=1 m N p( L] Bn) g p( L) Fn) + 2c for N sufficiently large. n=1 - n=1 N However, the finite union \J Fn is a closed subest n=1 Q of L! B . Thus n=1 n P( L3 Bn) = SUP{P(F)= F C IJ B . F closed} n=1 n=1 n 6 is closed under countable unions. Hence 6 is a O-algebra and 6 = 5k. QED Theorem A.2: Let X be a metrizable space and d a metric on X consistent with its tOpology. If pl,p2 6 P(X) and I gdpl = I gdp2 V g 6 Ud(X) then p1 = p2. Proof: Let F be any closed prOper subset Of X _ . 1 and Gn — [x 6 X..d(x,F) < H]’ For sufficiently large n, F and "'Gn are disjoint non-empty closed sets for which inf d(x.y) > 0. By Urysohn's Lemma, x6F,y6~Gn H fn 6 U X) such that fn(x) = O for x 6 N Gn' d( 272 fn(x) = 1 for x 6 F and 0 g fn(x) g 1 V x 6 X. Then p1(F) g f fndpl = I fndp2 g p2(Gn) p10?) g p2(nQ Gn) = p2(F). l Reversing p1 and p2, we obtain pl(F) = p2(F). By Theorem A.l, this implies p1(B) = p2(B) for every B 6 13X. QED These two theorems essentially says that a probability measure on a metrizable space is completely determined by its values on the Open or closed sets. Also, a probability measure p on a metric space (X,d) is completely determined by the values I gdp where g ranges over Ud(X). Next, attention is paid to the develOpment of the topology f[C(X)], the so—called weak topology on P(X). However, the space C(X) is tOO large to be manipulated easily, one would need a countable set D C C(X) such that f(D) = ![C(X)]. Such a set D is produced by the next three lemmas. Definition: Let c > O, p 6 P(X) and f 6 C(X), V€(p:f) = {q 6 P(X) : If qu-f fdp) < 8]. Definition: Let D C C(X). Define the collection Of subsets Of P(X): v(D) = {V€(p:f): e > o. p e P(X), f e D]. 273 Let 7(D) be the weak topology on P(X) which contains 7(D), i.e., the tOpology for which 7(D) is a subbase. Lemma A.l: Let X be a metrizable space and D C C(X). Let {pa} be a net in P(X) and p 6 p(x). Then pa 4 p relative to the topology f(D) if and only if I fdpa 4 I fdp for every f 6 D. Proof: Let pa 4 p and f 6 D. Let c > 0 be given, and let 6 be such that for d 2 B implies P E Vc(p:f) d J” fdpa » f fdp. Conversely, let I fdpa 4 I fdp for every f 6 D. Suppose G 6 f(D) contains p. Then p is n contained in some basic Open set F) V (p;f ) C G e k k=1 k where ck > O, f 6 D and k = l,...,n. Let B be k such that for all a 2 B If fkdpa—J‘ fkdpl < ck k = l,...,n pa 6 G for d 2 B 1 p 4 p. QED (1 Lemma A.2: Let X be a metrizable space and d a metric on X consistent with its tOpology. If f 6 C(X), then there exists sequences [9n] and {hn} such that g i f and h 1 f. n n Proof: Let b 6 R and x0 6 X be such that b g f(x) g f(xo) < m for every x 6 X. 274 Define gn(x) = inf[f(y)4-nd(x,y)]. Thus for every y6x b g gn(x) g f(x) + nd(x,x) g f(xO) + nd(x,xo) < a b g gl g g2 §---§ f and lim gn g f nan For every x,y,z 6 X f(y) + nd(x,y) g f(y) + nd(y,z) + nd(x,z) gn(X) g gn(2) + nd(X.2). Thus lgn(X) —gn(2)l g nd(x,z) gn 6 Ud(X) for each n. Let a > O and {yn] C.X be such that f(yn) + nd(x,yn) g gn(x) + e. As n 4 c, either gn 1 a or y 4 x. If n 9n 1 w = 11m 9n 2 f n4m gn T f. If yn 4 x, since f is continuous f(x) = lim f(y ) n 11ch g lim 9 (X) + e n n49 lim g (x) = f(x). QED n4m n 275 Lemma A.3: Let X be a metrizable space and d a metric on X consistent with its topology. Then r[c (X) ] = .7'[Ud(X) ]. Proof: Since Ud(X) CC(X), 'a'[Ud(X) C'Zf[C(X)]. This implies JTUd(X)] C:J[C(X)]. Let pO 6 V€(p:f) 3 so > 0 such that V€O(p07f) C‘V€(P:f). By Lemma A.2, H g,h 6 Ud(X) such that g g f g h r. and J fdpo < f gdpO + eO/z, f hdpO < f fdpo + 50/2. Let q 6 Veo/2(po7g) fl Veo/2(po:h), then I fdpO < f gdpO + eO/Z < I gdq + co g I qu + 60 and .‘I : P c J qu g I hdq < J hdpO + eO/z < f fdpo + “O r! l, qu-J" fdpol < 60 q 6 v (po:f) €O and VEO/2(po:g) fl VeO/z(po:h) CiV€(P:f). Since pO 6 V€(p:f) 6 7[C(X)] this implies V€(p;f) is Open in the I[Ud(X)] tOpology and Y[C(X)] C I[Ud(X)] J’[C(X)] = .7'[Ud(X)]. QED 276 Lemma A.4: Let X be a metrizable space and d a metric on X consistent with its topology. If D is dense in Ud(X) then JIUd(X)] = f(D). Proof: Clearly f(D) C23IUd(X)]. Let v€(p7g) 6 V[Ud(X)] and pO 6 V€(p:g). Let so = e - If gde-f gdp! > 0. Let h 6 D be such that Hg-—hH < 90/3. For any q €‘Veo/3(Po:h) If gdq-J" gdpl g If gdq-j hdql + If hdq-Ihdpol + If hde-f gdpol + lfgde-f gdpl < 620/3 + 60/3 + 270/3 + If gde-f gdp] = e Veo/3(po7h) C V€(P:Q)- This implies .'/'[Ud(X)] c.7(D) J’[Ud(X)] = f(D). QED 277 Theorem A.3: Let X be a separable metrizable space. There exists a metric d on X consistent with its tOpology and a countable dense subset D Of Ud(X) such that f(D) is the weak topology J[C(X)] on P(X). Proof: By Urysohn's theorem, X can be homeomorphically embedded into a subset of the Hilbert cube. Since the Hilbert cube is compact by Tychonff's theorem, it is totally bounded. 3 a totally bounded metrization d on X. This implies that (X,d) can be isometrically embedded as a dense subset Of a compact metric space (Xd,d1) where X szd. Let g 6 Ud(X), g has a unique extension 3 6 C(Xd) such that Hg” = Hg“. The mapping 9 4 3 is linear and norm-preserving. Since C(Xd) is separable, this implies Ud(X) is separable. H a countable dense set D in Ud(X). The result follows from Lemmas A.3 and A.4. QED From this point on, whenever X is metrizable, it is understood that P(X) is a topological space with the weak tOpology J[C(X)]. Theorem A.4: If X is a separable metrizable space, then p(x) is separable and metrizable. 278 Proof: Let d be a metric on X consistent with its tOpology and D a countable subset of Ud(X) such that f(D) is the weak tOpology on P(X). Let Rm be the product of countably many COpies Of R. Define m: P(X) 4 Ra 'by m(p) = (I gldp.I gzdp....) where [g1,gz,...] is an enumeration Of D. Suppose that m(p1) = m(p2), then I gkdp1 = I gkdp2 for every gk 6 D. Let g 6 Ud(X). H a sequence {gk} C D such that Hgk "9H 4 O as i 4 a. Then i II gdp1-I gdpzl 3 lim supIJ(g- gki)dp11 + 1im supIJI gk dp1-J gk ldpzl i469 i4co + lim suplI(gk -g)dp2I i4a i g 2 lim supHgki -9H = O i-OQ I gdpl = I gdpz- By Theorem A.2, p1 = p2 m is one-tO-one. By Lemma A.l, for each gk 6 D, the mapping p 4 I gkdp is continuous m is continuous. 279 Let {pa} be a net in P(X) such that m(pa) 4 m(p) for some p 6 P(X). Then I gkdpd * I gkdp for every gk 6 D. By Lemma A.1, pa 4‘p m- is continuous. The m is a homeomorphism. Ron is metrizable and separable = P(X) is metrizable and separable. QED Theorem A.5: Let X be a separable metrizable space and let d be a metric on X consistent with its tOpology. Let {pn} be a sequence in P(X) and p 6 P(X). The following statements are equivalent: (a) pn 4 P (b) I fdpn 4 I fdp for every f e C(X) (c) I gdpn 4 I gdp for every 9 6 Ud(X) (d) lim sup p (F) g p(F) for every closed naa n - set F CIX (e) 1im inf pn(G) g p(G) for every Open n-Oco set G CIX. Proof: The equivalence of (a), (b) and (c) follows from Lemmas A.l and A.3. The equivalence of (d) and (e) follows by complementation. To show (b) implies (d), let F be a closed proper nonempty subset Of X. Let Gk = [x 61X: d(x,F) < é}. F and “G. are disjoint k nonempty sets for k sufficiently large. 3 fk 6 C(X) 280 such that fk(x) = l for x 6 F, fk(x) = O for x 6 ch and 0 g fk(x) g l for every x 6 X. Thus lim sup pn(F) g lim I fkdpn = I fkdp g p(Gk). n4o n4c (d) follows by letting k 4 a. To show (d) implies (b), let f 6 C(X) and assume without loss Of generality that 0 g f g 1 Let K be a positive integer and define F=Ix6x-f2]5l k=0 K k 6 . k _ K" ,..., . K Define m:IX 4 [0,1] by w(x) = Z)(k/K)XF -F (x) k=0 k k+1 where FK+l = ¢. Then 1 f-Rgmgf For any q 6 P(X). I g5 )< 1 2E codq= (-)Q(F -F )=- q(F) k=0 K k k+1 K k=1 k . l . lim sup I fdpn - (K) g lim sup I mdpn n-Oa n-Oa 1 6 = - lim sup p (F ) K n40 k=1 n k :1 £5 I S - P(F ) = mdP — K k=1 k g I fdp. This implies lim sup I fdpn g I fdp n-Oca for every f 6 C(X). 281 But the above argument holds for -f lim inf I fdpn -lim sup I (-f)dpn n4: n49 —I (-f)dp = I fdp ||\/ 3. I fdpn 4 I fdp for every f 6 C(X). QED The above two theorems guarantee that when X is separable and metrizable, the tOpology on P(X) can be characterized in terms of convergent sequences rather than nets. Theorem A.5 gives several conditions which are equivalent to convergence in P(X). Corollary A.5.l: Let X be a metrizable space. The mapping 5: X 4 P(X) defined by 5(x) = pX is a homeomorphism. Proof: Clearly 6 is one—tO-One. Let {xn} be a sequence in X and x 6 X. If xn 4 x and G is an open subset of X, then either (i) x 6 G, implies xn 6 G for large n 1im inf p (G) = l = p (G) or x x n4e n (ii) x 6 G, then 1im inf p (G) 2 0 = p (G). n4a n — X X By Theorem A.5, this implies pX 4 px n 6 is continuous. 282 On the other hand, if pX 4 pX and G is an Open n neighborhood of x, since 1im inf p (G) 2 p (G) = l. n4a Xn _ X Then xn 6 G for sufficiently large n x 4X n 6 is a homeomorphism. QED Theorem A.6: If X is a compact metrizable space, then P(X) is a compact metrizable space. Proof: If X is a compact metrizable space, it is separable and C(X) is separable. Let {f be a countable set in C(X) such that k) f1 5 l, kaH g l for every k and {fk} is dense in the unit sphere {f 6 C(X): ”f” < 1}. Define mpm .. [-1,1]°° by co(p) = (I fldp,I fzdp,...). m can be easily verified as a homeomorphism. Suppose {pn} is a sequence in P(X) and m(pn) 4 (91.02,...) 6 I-l,l]m. Let 6 > 0 be given and f 6 C(X) with Hf” g l, 3 fk with _ ! E Hf ka < 3. Also 3 n0 2 0 such that II fkdpn-I fkdme < g whenever n,m ; no. Then IJ fdpn-I fdpml "A II fdpn — I fkdpn) + II fkdpn —I fkdme + (I fkdpm-I fdme < e I. {I fdpn] is Cauchy in [-l,l]. 283 Let E(f) be the limit of such a sequence. If “f“ > 1, define E(f) = Hf” E(Jén). Thus E is a 1 linear function on C(X), E(f) 2 0 whenever f 2 O, IE(f)I g ”f“ for every f 6 C(X) and E(fl) = 1. Let {hn} be a sequence in C(X) and h (X) 1 O for every x 6 X. For each s > O, the n set Kn(e) = [x 61X: hn(X) > e] is compact and For n sufficiently large, thH 1 O which implies E(hn) 1 O. 3 a unique probability measure on O[ (J f (B )] f6C (X) which satisfies E(f) = I fdp for every f 6 C(X) since E is a Daniell integral (see Royden [1968]) '. p 6 P(X) and ok = :3: I fkdpn = E(fk) = I fkdp k = 1,2,... m(pn) 4 ¢(p) and m[P(X)] is closed on [-l,l]m. Since [—l,l]co is compact, P(X) is compact. QED Lemma A.5: Let X and Y be separable metrizable spaces and m:iX 4 Y a homeomorphism. Define k P(X) 4 P(Y) by Mp) (B) = Men-1(8)] v B 6 BY w is a homeomorphism. 284 Proof: Let p1,p2 6 P(X) and p1 #’p2. Since p1 and p2 are regular, 3 an open set G C X for which p1(G) 36P2(G) m(G) is relatively Open in m(X). Thus m(G) = m(X) 0 B, where B is Open in Y and )(p1) (B) = p1(G) a" p2(G) = )(pz) (B) w is one-tO-One. Let {pn} be a sequence in P(X) and p 6 P(X). If pn * p. since m-1(H) is Open in X for every Open set H C‘Y. By Theorem A.5, lim inf ¢(pn)(H) lim inf pan-1(H)] n4m n4m g pim'1(H>] = ¢oS be the Suslin scheme for BY defined by (CD 0 S) (S) = w[S(S)] Thus (p(A) = N(chS). a By Theorem A.l4, p(A) is analytic. If m(A) C‘Y is analytic, A CIX is analytic by a similar argument. QED Definition: Let (X,0) be a paved space and S a Suslin scheme for 9. The Suslin scheme S is regular if for each n 6 N and (01,02,...,O ) 6 23, then n+1 ) . s(Ol,OZ,...,on) : 5(01'02"°°'On'°n+1 Lemma A.6: Let (X,d) be a separable metric space and S a Suslin scheme for :X' Then there exists a regular Suslin scheme R for :k such that N(R) = N(S) and, for every 2 = (61.62....) 6 T lim diam R(gl,§2,...,gn) = 0 if nag: mq4yuqq)¢¢vn 292 Proof: By the Lindelbrf property, for each positive integer k, X can be covered by a countable collection of Open balls Of the form Bkj = {x 62X: d(x,xkj) < l/k) j = l,2,°°°. For (gllgllgzlgleOO) E m, deflne R(El) = 5161 R(gl'cl'gZ'CZ) = R(Qloglogz) fl S(Ql.,C,2) etc. U Thus a: - a C R(s) = [ F) Bk: ] n I 0 8(5)] <(7 c E c ) k=1 k s O 3 sn < 20 such that diam R(sn) < e. For k sufficiently large, 2k 6 {z 6 T: sn < 2] f (zk) 6 R (sn) d(f(zk),f(zo)) g diam R(sn) < e f is continuous. Now, suppose T1 C T is closed and f: T1 4 T is continuous. Define a regular Suslin scheme R for 3* by R(s) = f({z 6 T1: 5 < 2]) where R(s) = ¢ if [z 6 T then -s < z] = w. If 2 6 T 1' l’ f(z) 6 .C R(s) CN(R) s O, 3 Zn 6 T1 such that (61,62....,6n) < zn and d(x,f(zn)) < 6. Then zn must converge to 20 as n 4 a. Since T1 is closed, 20 6 T1, f is continuous implying d(X,f(ZO)) g e f(zo) = x and x 6 f(Tl). N(R) C f(Tl) N(R) = f(T ). QED 1 So far, analytic sets are characterized as the continuous images of closed subsets Of T. The following lemma and theorem allows one to get an even sharper characterization. Lemma A.8: If T is a nonempty closed subset of 1 T, then there exists a continuous function g: T 4 T such that T1 = g(T). Proof: Let T1 be covered by a countable collection of nonempty closed sets [8(61): 61 6 N] 296 satisfying mle(€l)' diam 8(61) g1 61:1,2,... where d is a metric on T consistent with its tOpology. Cover each 8(61) with a countable collection Of non- empty closed sets {8(61,62): 62 6 N} satisfying sf)(T) is analytic by Theorem A.20. Now, suppose C = f(T) C X is nonempty and analytic. C = projx[5r(f)] where Eur) = {(f(z),z) e XT:z e 92}. Since f is continuous, Er(f) is closed. If Y is any uncountable Borel space, then 3 a Borel isomorphism w from T onto Y. Define a mapping T by §(x,z) = (x,m(z)). Then @(x,z) is a Borel isomorphism from XT onto XY C = pron<:[é}S) where f.1 OS is the Suslin H1 CD II scheme for 5x defined by (f_1<>S)(s) = f-1[S(s)] V s 623, f-1(B) is analytic by Theorem A.l4. QED Now, the above results can be summarized in the following theorem. Theorem A.l9: Let X be a Borel space. The following definitions of the collection of analytic subsets of X are equivalent: (a) £(dx) (b) £(IJX) (e) (f) 300 the empty set and the images of T under continuous functions from T into X the projections into X of the closed subsets Of XT the projections into X Of the Borel subsets of XY, where Y is an uncountable Borel space the images of Borel subsets of Y under Borel-measurable functions from Y into X, where Y is an uncountable Borel space. Proof: Only (f) needs to be shown. If Y is an uncountable Borel space and f: Y 4 X is Borel-measurable. For every B 6 by, f(B) is analytic in X by Theorem A.18. Let 6 be a Borel isomorphism from Y onto XT and let F C XT be closed such that pron(F) = A. Define B = 6. (F) 6 BY. Then (projx<>m)(B) = A is analytic. If A = ¢, then f(¢) = A for any Borel measurable f:‘Y 4 X. QED BIBL IOGRAPHY BIBLIOGRAPHY Akerloff, G., "The Market for 'Lemons': Quality Uncertainty and the Market Mechanism", Qparterly Journal Of Economics, August 1977. Alchian, A., and H. Demetz, "Production, Information Costs, and Economic Organization", American Economic Review, December 1972. Amershi, A., and J. Butterworth, "The Theory of Agency With Diverse Beliefs", Working Paper, University of British Columbia, 1979. Ash, R., Real Analysis and Probability, Academic Press, New York, 1972. Astrgm, K.J., "Optimal Contral Of Markov Processes With Incomplete State Information", Journal Of Mathema~ tical Analysis and Applications, 1965. Baiman, 8., "Discussion Of Auditing, Incentive and Truthful Reporting", Journal of Accounting Research, Supplement, 1979. Baiman, S., and J. Demski, "Economically Optimal Performance Evaluation and Control Systems", Journal of Accounting Research, Supplement, 1980. Bertsekas, D., and S. Shreve, Stochastic Optimal Control: The Discrete Time Case, Academic Press, New York, 1978. Blackwell, D., "Positive Dynamic Programming", Proceedings of Fifth Berkeley Symposium on Mathema- tical Statistics and Probability, 1965a. Blackwell, D., "Discounted Dynamic Programming", Annuals of Mathematical Statistics, 1965b. 301 302 Blackwell, D., "Borel—Programmable Functions", Annals of Probability, 1978. Blackwell, D., D. Freedman, and M. Orkin, "The Optimal Reward Operator in Dynamic Programming", Annals of Probability, 1974. Dahlquist, G., and A. Berk, Numerical Methods, Prentice- Hall, Inc., New Jersey, 1974. Denardo, E.V., "Contraction Mappings in the Theory Underlying Dynamic Programming", SIAM Review, 1967. ’ Demski, J., and G. Feltham, "Economic Incentive in Budgetary Control Systems", Accounting Review, April 1978. Fama, E., "Agency Problems and the Theory Of the Firm", Journal Of Political Economy, April 1980. Feller, W., An Introduction to Probability Theory and its Applications, Volume II, John Wiley and Sons, New York, 1971. Freedman, D., "The Optimal Reward Operator in Special Classes Of Dynamic Programming Problems", Annals Of Probability, 1974. Furukawa, N., "Markovian Decision Processes with Compact Action Spaces", Annals of Mathematical Statistics, 1972. Gjesdal, F., "Accounting in Agencies", Working Paper, Stanford University, 1976. Glowinski, R., J.L. Lions, and R. Trémoliéres, Numerical Analysis of Variation Inequalities, North-Holland Publishing Company, Amsterdam, 1981. Harris, M., and A. Raviv, "Optimal Incentive Contracts with Imperfect Information", Journal Of Economic Theory, December 1979. Himmelberg, C.J., T. Parthasarathy, and F.S. Van Vleck, "Optimal Plans for Dynamic Programming Problems", Mathematical Operation Research, 1976. Hirschleifer, J., "The Private and Social Value Of Information and the Reward to Incentive Activity", American Economic Review, September 1977. 303 Holmstrom, B., "On Incentives and Control in Organizations", Unpublished Ph.D. Dissertation, Stanford University, 1977. Holmstrom, B., "Moral Hazard and Observability”, The Bell Journal of Economics, Spring 1979. Jensen, M., and W. Meckling, "Theory of the Firm: Managerical Behavior, Agency Costs and Ownership Structure", Journal of Financial Economic, October 1976. JuskeviE, A.A., "Reduction Of a Controlled Markov Model with Incomplete Data to a Problem with Complete Information in the Case of Borel State and Control Spaces", Theory of Probability Applications, 1976. Kobayashi, T., "Equilibrium Contracts for Syndicates with Differential Information", Econometrica, November 1980. Lambert, R., "Managerial Incentives and Short—Run vs Long-Run Optimization", Unpublished Ph.D. Dissertation, Stanford University, 1981. Luenberger, D.G., Optimization by Vector Space Methods, John Wiley and Sons, New York, 1969. Maitra, A., "Discounted Dynamic Programming on Compact Metric Spaces", Sankhya, 1968. Marchuk, G.I., Methods of Numerical Mathematics, Second Edition, Springer—Verlag, New York, 1982. Mirrless, J., "Notes on Welfare Economics, Information, and Uncertainty", M. Balch, D. McFadden and S. Wu (eds.), Essays on Economics Behavior Under Uncertainty, North Holland Publishing Company, Amsterdam, 1974. Myerson, R., "Incentive Compatibility and the Bargaining Problem”, Econometrica, January 1979. Ng, D., and J. Stoeckenius, "Auditing, Incentives and Truthful Reporting", Journal of Accounting Research, Supplement 1979. Radner, R., "Monitoring Cooperative Agreements in a Repeated Principal-Agent Relationship", Econometrica, September 1981. 304 Rockafellar, R.T., "Integral Functionals, Normal Integrands and Measurable Selections", Nonlinear Operators and the Calculus of Variations, Springer- Verlag, New York, 1976. Ross, 8., "The Economic Theory Of Agency: The Principal's Problem”, American Economic Review, May 1973. Ross, 8., "On the Economic Theory of Agency and the Principle of Similarity", M. Balch, D. McFadden and S. Wu (eds.), Essays on Economic Behavior Under Uncertainty, North-Holland Publishing Company, Amsterdam, 1974. Rudin, W., Functional Analysis, McGraw-Hill Book Company, New York, 1973. Rudin, W., Real and Complex Analysis, McGraw—Hill Book Company, New York, 1974. Savage, L., The Foundations of Statistics, John Wiley and Sons, New York, 1954. Sawaregi, Y., and T. Yoshikawa, "Discrete-Time Markovian Decision Process with Incomplete State Information", Annals of Mathematical Statistics, 1970. Schal, M., "0n Continuous Dynamic Programming with Discrete Time Parameter", Z. Wahrecheinlichkeitstheorie und Verw. Gebiete, 1972. Shavell, 8., "Risk Sharing and Incentives in the Principal and Agent Relationship", The Bell Journal of Economics, Spring 1979. Spence, A.M., Market Signalling Information Transfer in Hiringiand Screening Processes, Harvard University Press, Boston, 1974. Spence, A.M., and R. Zeckhauser, "Insurance, Information and Individual Action", American Economic Review, May 1971. Strauch, R.E., "Negative Dynamic Programming”, Annals Of Mathematical Statistics, 1966. Striebel, 0., Optimal Control of Discrete Time Stochastic Systems, Springer-Verlag, New York, 1975. Wilson, R., "On the Theory Of Syndicates", Econometrica, January 1968. GENERAL REFERENCES GEN ERAL REF ERENCE 8 Arrow, K., Essays in the Theory of Risk-Bearing, Chicago Press, 1970. Arrow, K., "Political and Economic Evaluation Of Social Effects and Externalities", M. Intriligator (ed.) Frontiers of Quantitative Economics, North- Holland Publishing Company, Amsterdam, 1971. Bellman, R., Dypamic Programming, Princeton University Press, New Jersey, 1957. Bertsekas, D.P., Dynamic Programming and Spochastic Control, Academic Press, New York, 1976. Blackwell, D., "On Stationary Policies", Journal of ngal Statistics Society, 1970. Demski, J., "Cost Allocation Games", S. Moriarity (ed.) Joint Cost Allocations-Proceedings Of the University of Oklahoma Conference on Cost Allocations, Center for Economic and Management Research, Normal, Oklahoma, 1981. Feltham, G., "Optimal Incentive Contracts: Penalties, Costly Information and Multiple Workers", Working Paper, University of British Columbia, 1977. Fleming, W., and R. Rishel, Deterministic and Stochastic Optimal Control, Springer-Verlag, 1975. Hinderer, K., Foundations of Nonstationary Dynamic Programming with Discrete Time Parameter, Springer- Verlag, 1970. Hoffman-Jorgensen, J., The Theory Of Analytic Spaces, Aarhus Universitet, Aarhus, Denmark, 1970. Huston, V., and J. Pym, Applications Of Functional Analysis and Operator Theory, Academic Press, New York, 1980. 305 306 IOffe, A.D., and V.M. Tihomirov, Theory Of Extremal Problem, North—Holland Publishing Company, Amsterdam, 1979. Kreps, D.M., and E. Porteus, "Dynamic Choice Theory and Dynamic Programming", Econometrica, January 1979. Kreps, D.M., and R. Wilson, "Sequential Equilibria", Econometrica, July 1982. Larsen, R., Functional Analysis, Marcel Dekker Inc., New York, 1973. Mirrless, J., "The Optimal Structure of Incentive and Authority Within an Organization", The Bell Journal of Economics, Spring 1976. Ng, D., "An Information Economics Analysis of Financial Reporting and External Reporting", Accounting Review, October 1978.‘ Ng, D., "Supply and Demand Of Auditing Services and the Nature of Regulations in Auditing", S. Davidson (ed.) The Accounting Establishment in Perspective, Arthur Young and Co., 1979. Radner, R., ”Competitive Equilibrium Under Uncertainty", Econometrica, 1968. Royden, H.L., Real Analysis, Macmillan, New York, 1968. Scott, W., "A Bayesian Approach to Asset Valuation and Audit Size", Journal of Accounting Research, Autumn 1973. Stiglitz, J., "Incentives and Risk Sharing in Sharecropping", Review of Economic Studies, 1974. Stiglitz, J., "Incentives, Risk and Information: Notes Towards a Theory of Hierarchy", The Bell Journal of Economics, Autumn 1975. Yoshida, K., Functional Analysis, Springer-Verlag, New York, 1980.