This is to certify that the dissertation entitled

Parameter estimation in non-linear time series: Random coefficient autoregressive and self-exciting threshold models

presented by Lianfen Qian has been accepted towards fulfillment of the requirements for the Ph.D. degree in Statistics, Department of Statistics and Probability.

PARAMETER ESTIMATION IN NON-LINEAR TIME SERIES: RANDOM COEFFICIENT AUTOREGRESSIVE AND SELF-EXCITING THRESHOLD MODELS

By Lianfen Qian

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

1996

ABSTRACT

PARAMETER ESTIMATION IN NONLINEAR TIME SERIES: RANDOM COEFFICIENT AUTOREGRESSIVE AND SELF-EXCITING THRESHOLD MODELS

By Lianfen Qian

This dissertation studies parameter estimation in two nonlinear time series models: the random coefficient autoregressive model and the self-exciting threshold autoregressive model.
For the random coefficient autoregressive model of order $p$ (RCAR($p$)), we discuss a class of minimum distance (MD) estimators of the true unknown parameters. These estimators are defined via certain weighted empiricals as in Koul (1986). The class of estimators considered includes the least absolute deviation estimator and an analogue of the Hodges-Lehmann estimator. The dissertation contains a proof of the asymptotic normality of these estimators and a simulation study. It is observed that the RCAR(2) model fitted with the Hodges-Lehmann type estimator fits the Canadian lynx data at least as well as the same model fitted with the least squares estimator.

For the first order stationary ergodic self-exciting threshold autoregressive model with a single threshold parameter, we show that the maximum likelihood estimators of the underlying true parameters are strongly consistent under some regularity conditions on the error density. Then, we prove that the maximum likelihood estimator of the threshold parameter is $n$-consistent if the threshold parameter is a discontinuity point of the autoregressive function. Further, we derive the asymptotic normality of the estimators of the coefficient parameters. We also obtain a simple approximation of a sequence of normalized log-likelihood processes, and hence prove the tightness of that sequence.

To Jenny Yao, my threshold daughter, and Qingchuan Yao, my beloved husband

ACKNOWLEDGMENTS

I want to express my deep gratitude to Professor Hira L. Koul, my thesis advisor, for his continuous encouragement, expert guidance, generous support and extreme patience during the preparation of this dissertation. His love of statistics and devotion to research have served as the main source of inspiration for my research. I would also like to thank Professors Mandrekar, Gilliland and Shapiro for serving on my thesis committee.
My special thanks go to Professor Mandrekar for carefully reading my dissertation and making suggestions that improved the presentation, to Professor Gilliland for training me to work for the statistical consulting service, and to Professor Shapiro for giving me a strong mathematical background.

I appreciate the support of the Department of Statistics and Probability over the last three years while I pursued my Ph.D. in Statistics. I would like to thank all the faculty members and staff in the department for their help during my graduate work at Michigan State University.

Finally, I want to express my deep thanks to my husband, Qingchuan, for his continuous support, encouragement and patience.

A major portion of this research was supported by the NSF Grant DMS 94-02904.

Contents

Introduction

I Random Coefficient Autoregressive Model

1 Minimum distance estimation
  1.1 Introduction
  1.2 Assumptions and Theorems
  1.3 Proofs
  1.4 Simulation results

II A Self-Exciting Threshold Autoregressive Model

2 Definitions, assumptions and consistency
  2.1 The profile maximum likelihood estimation
  2.2 Assumptions
  2.3 Strong consistency of the MLE
  2.4 n-consistency of the threshold estimator

3 Limiting distribution of $\hat\theta_{1n}$
  3.1 Uniform consistency
  3.2 Asymptotic normality of $\hat\theta_{1n}$

4 Some asymptotic results on log-likelihood process
  4.1 An approximation of the normalized profile log-likelihood process $l_n$
  4.2 Tightness of $l_n$
  4.3 Some problems for future research

A Appendix

Bibliography

List of Tables

1.1 Simulation results
1.2 The estimators of the RCAR(2) model for lynx data

List of Figures

1.1 The graph of the dispersion $M_g(u)$

Introduction

Nonlinear time series analysis has developed rapidly in the past two decades. Two main factors have driven nonlinear time series model building: the essentially complete theory of linear time series analysis, and the existence of complicated dynamic phenomena that cannot be modeled by linear time series models. Many different types of nonlinear time series models have been studied in the literature (see Priestley (1980), Pagan (1980), Nicholls and Quinn (1982) and Tong (1983, 1990)). In this dissertation, we focus on parameter estimation in two nonlinear time series models: the random coefficient autoregressive model and the self-exciting threshold autoregressive model.

The first part of this dissertation is concerned with the random coefficient autoregressive model of order $p$ (RCAR($p$)), in which one observes $\{X_i, i \in \mathbb{Z}\}$ satisfying

$$X_i = (\theta + Z_i)^T Y_{i-1} + \varepsilon_i, \quad i \in \mathbb{Z}, \tag{0.1}$$

for some $\theta \in \mathbb{R}^p$, where $\{\varepsilon_i, i \in \mathbb{Z}\}$ and $\{Z_i, i \in \mathbb{Z}\}$ are independent sequences of independent identically distributed random vectors with respective distribution functions $F$ and $G$. Here $Y_0 := (X_0, \ldots, X_{1-p})^T$ is an observable random vector independent of $\{\varepsilon_i\}$, $Y_{i-1} := (X_{i-1}, \ldots, X_{i-p})^T$, $Z_i := (Z_{i1}, \ldots, Z_{ip})^T$, $p \ge 1$ is a known integer and $\mathbb{Z}$ denotes the set of all integers. For the importance of these models in time series analysis, see the Lecture Notes by Nicholls and Quinn (1982) and the monograph by Tong (1990). RCAR($p$) models include the well known AR($p$) models (take $Z_i$ to be degenerate at 0).

The problem of interest here is to estimate the unknown parameter $\theta$ based on $\{Y_0, X_1, \ldots, X_n\}$. We will study the Minimum Distance (MD) estimators $\hat\theta$ of $\theta$, which are based on minimizing certain types of distance functions $M_g(\cdot)$, called dispersions, related to the data and the parameter, for a measurable function $g$ from $\mathbb{R}^p$ to $\mathbb{R}^p$.
The importance of this methodology in linear models is discussed in Koul (1992, Chapters 5 and 7). These estimators have many desirable properties, including consistency, robustness against outliers in the errors, efficiency and asymptotic normality in AR models.

Koul (1992) discussed the asymptotic behavior of the estimators $\hat\theta$ under the AR($p$) setup. We obtain the asymptotic normality of $\hat\theta$ under the RCAR($p$) setup. The method of proof is similar to that of Koul (1992), which requires obtaining the asymptotic uniform quadraticity of $M_g(t)$ and showing that $|n^{1/2}(\hat\theta - \theta)| = O_p(1)$.

This part of the material is organized as follows. Assumptions and statements of the main theorems appear in Section 1.2, while proofs appear in Section 1.3. Section 1.4 contains a simulation study and an application to the Canadian lynx data. The simulation study shows that the Hodges-Lehmann type estimator is as good as the Least Square (LS) estimator and the Huber estimator (HE). For the additive effects outliers model, Dhar (1990, 1991) established the robustness of the MD estimator. Later, Dhar (1993), working on the AR($p$) model, showed through simulation that the MD estimator with $H(x) = x$ and $g(y) = y$ has the smallest absolute bias even for the small sample size $n = 10$, and the smallest mean square error for $n = 50$ and $100$, under the logistic error distribution. As an application, we fit the RCAR(2) model with MD estimators to the annual trappings of the Canadian lynx over the years 1821-1934. The result shows that this model provides an acceptable alternative to the more widely adopted class of AR models.

A self-exciting threshold autoregressive model is a piecewise linear model. It fits different autoregressive functions, linear in both the past variables and the parameters, to different subsets of the data. Tong (1977) first mentioned the usefulness of these models. Later, Tong (1978a, 1978b, 1980) developed these models further in a systematic way for the modeling of discrete time series data.
He argued that various phenomena such as limit cycles, jump resonance, harmonic distortion and chaos can be modeled by discrete time series that are piecewise linear. He called these models the self-exciting threshold autoregressive (SETAR) models. See Tong (1983, 1990) for a comprehensive introduction to general SETAR models.

The second part of the dissertation is concerned with the large sample behavior of maximum likelihood estimators in a special SETAR model, called SETAR(2;1,1), defined as follows:

$$X_i = h(X_{i-1}, \theta) + \varepsilon_i, \quad i \ge 1, \tag{0.2}$$

for some $\theta = (\theta_1^T, r)^T \in \mathbb{R}^5$, where $\theta_1 = (a_0, a_1, b_0, b_1)^T \in \mathbb{R}^4$ and, for any $x \in \mathbb{R}$,

$$h(x, \theta) = (a_0 + a_1 x) I(x \le r) + (b_0 + b_1 x) I(x > r).$$

Here, the errors $\{\varepsilon_i\}$ are independent and identically distributed random variables with mean zero and finite nonzero variance, and $\varepsilon_1$ is independent of $X_0$. The parameter $r$, the location of the change of the autoregressive function $h$, is called the threshold. Define a region $\Theta$ of parameters as follows:

$$\Theta = \{\vartheta = (\alpha_0, \alpha_1, \beta_0, \beta_1, s)^T \in \mathbb{R}^5 : \alpha_1 < 1,\ \beta_1 < 1,\ \alpha_1 \beta_1 < 1\}. \tag{0.3}$$

Petruccelli and Woolford (PW) (1984) proved that the model (0.2) with $a_0 = b_0 = 0$, $r = 0$ is ergodic if and only if $\theta \in \Theta$. Note that $\Theta$ is much wider than the region of stationarity of the AR(1) model. Chan, Petruccelli, Tong and Woolford (CPTW) (1985) continued PW's work and found some other sufficient conditions on the parameters for $\{X_i\}$ in model (0.2) to be ergodic.

Note that the process $\{X_i\}$ defined in (0.2) is a Markov chain. From ergodicity, one readily obtains stationarity if the initial distribution of the Markov chain equals the invariant measure. Since this part of the dissertation discusses the asymptotic properties of the maximum likelihood estimators, it will be assumed that the initial measure is equal to the invariant measure. That is, we will work with the stationary and ergodic SETAR(2;1,1) model.
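Model (0.2) and the region (0.3) can be illustrated with a short simulation. The following is only a sketch under stated assumptions: the errors are taken to be standard normal, the chain is started at 0, and a burn-in of 200 steps (a hypothetical choice) stands in for starting from the invariant measure. The function names `h`, `in_region` and `simulate_setar` are introduced here for illustration only.

```python
import random


def h(x, a0, a1, b0, b1, r):
    """Piecewise linear autoregressive function of model (0.2)."""
    return a0 + a1 * x if x <= r else b0 + b1 * x


def in_region(a1, b1):
    """Slope conditions of the region (0.3): a1 < 1, b1 < 1, a1 * b1 < 1."""
    return a1 < 1 and b1 < 1 and a1 * b1 < 1


def simulate_setar(n, theta, burn_in=200, seed=0):
    """Simulate n values of a SETAR(2;1,1) path.

    Assumptions (not from the text): N(0, 1) errors, start at 0, and the
    first `burn_in` values are discarded to approximate stationarity.
    """
    a0, a1, b0, b1, r = theta
    rng = random.Random(seed)
    x = 0.0
    path = []
    for i in range(n + burn_in):
        x = h(x, a0, a1, b0, b1, r) + rng.gauss(0.0, 1.0)
        if i >= burn_in:
            path.append(x)
    return path


# Slopes (-1.5, 0.5) lie in the region although |a1| > 1,
# which an AR(1) model would not allow.
theta = (0.0, -1.5, 0.0, 0.5, 0.0)
path = simulate_setar(300, theta, seed=3)
```

The example slopes show why the region is described as wider than the AR(1) stationarity region: one regime may be explosive on its own as long as the product of the slopes stays below one.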
For the case where the threshold $r$ has only a finite number of possible values and the errors are Gaussian, Tong (1983) constructed a maximum likelihood estimator of the unknown parameters using Akaike's (1973) Information Criterion. If the threshold $r$ is known, CPTW (1985) obtained the consistency and asymptotic normality of the least-squares estimators of the coefficient parameter $\theta_1$ under some regularity conditions. But in practice, the threshold parameter $r$ is unknown and can take infinitely many values in $\mathbb{R}$. In this case, Petruccelli (1986) proved that the conditional least-squares estimator (CLSE) of $\theta$ is strongly consistent for the SETAR(2;1,1) model. Chan (1993) established the strong consistency of the same CLSE in a general SETAR model. Furthermore, he claimed that he obtained the limiting distribution of the CLSE of the threshold under some regularity conditions on the errors.

We derive the asymptotics of a maximum likelihood estimator (MLE) of the underlying parameter $\theta$ in model (0.2) when the errors have a density $f$ that is not necessarily Gaussian. Unlike in the popular AR model, the likelihood function of the SETAR(2;1,1) model is not differentiable with respect to the parameters. In fact, it is in general not continuous in the threshold parameter. Thus the routine method of computing a maximum likelihood estimator cannot be adopted. Instead, in Chapter 2, Section 2.1 discusses a profile maximum likelihood method to obtain the MLE $\hat\theta_n = (\hat\theta_{1n}^T, \hat r_n)^T$ of $\theta = (\theta_1^T, r)^T$. Section 2.2 states assumptions for later use. In Section 2.3, Theorem 2.3.1 shows that the MLE is strongly consistent. If the threshold parameter $r$ is a discontinuity point of the autoregressive function $h$, then the maximum likelihood estimator $\hat r_n$ is not only consistent but also $n$-consistent, as shown in Theorem 2.4.1, i.e. $|n(\hat r_n - r)|$ is bounded in probability.
In Chapter 3, we develop the asymptotic normality of the coefficient parameter estimator $\hat\theta_{1n}$ and some further byproduct results. In Chapter 4, as a consequence of the $n$-consistency of $\hat r_n$, the suitably normalized sequence of log-likelihood processes $\{l_n\}$ (see Section 4.1) is shown to be approximated by a sequence of simpler processes which describe the log-likelihood under a known coefficient parameter $\theta_1$. Through the latter processes, the tightness of $\{l_n\}$ is derived. It is expected that this result will be useful in obtaining the limiting distribution of the standardized maximum likelihood estimator of the threshold parameter.

Notation. Throughout this dissertation, the symbol $\theta$ is the fixed unknown underlying parameter, the function $f$ is the probability density function of $\varepsilon_1$ and $F$ denotes the distribution function corresponding to $f$. The expectation under $\theta$ is denoted by $E$. Weak convergence is denoted by $\Rightarrow$. A (random) sequence that goes to zero (in probability) is denoted by $o(1)$ ($o_p(1)$), while $O(1)$ ($O_p(1)$) means that it is bounded (in probability). The multivariate normal distribution with mean zero and covariance matrix $\Gamma$ is denoted by $N(0, \Gamma)$. Let $\mathbb{R}$ be the real line $(-\infty, \infty)$ and $\bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$; the compactness of the set $\bar{\mathbb{R}}$ is under the metric $d(\cdot,\cdot)$ defined by $d(x, y) = |\arctan x - \arctan y|$. A function $\varphi$ satisfies Lip(1) if there exists $L \ge 0$ such that $|\varphi(x) - \varphi(y)| \le L|x - y|$ for all $x, y \in \mathbb{R}$. For any event $A$, the complement of $A$ is denoted by $A^c$ and the indicator function is denoted by $I(A)$. Throughout, the capital letter $C$ and the symbols $\gamma_i$, $i = 1, 2, \ldots$, stand for absolute constants, and they can have different values in different places. The notation $x^T y$ stands for the inner product of vectors $x$ and $y$. For any matrix $M = (m_{ij})$, $\|M\| = \sum_{i,j} |m_{ij}|$, $M^T$ stands for the transpose of $M$, and $\det(M)$ and $\mathrm{adj}(M)$ stand for the determinant and the adjoint matrix of $M$, respectively. Vectors of dimension more than one are denoted by bold face letters.
The index $i$ in the summation varies from 1 to $n$ unless specified otherwise.

Part I

Random Coefficient Autoregressive Model

Chapter 1

Minimum distance estimation

1.1 Introduction

This part of the dissertation considers the random coefficient autoregressive model of order $p$ (RCAR($p$)), in which one observes $\{X_i, i \in \mathbb{Z}\}$ satisfying

$$X_i = (\theta + Z_i)^T Y_{i-1} + \varepsilon_i, \quad i \in \mathbb{Z}, \tag{1.1}$$

for some unknown $\theta \in \mathbb{R}^p$ and for independent sequences $\{\varepsilon_i, i \in \mathbb{Z}\}$ and $\{Z_i, i \in \mathbb{Z}\}$ of independent identically distributed random vectors with distribution functions $F$ and $G$, respectively. Also, it is assumed that $E\varepsilon_1 = 0$ and $E\varepsilon_1^2 = \sigma_\varepsilon^2 > 0$, $EZ_1 = 0$ and $EZ_1 Z_1^T = \Sigma \ge 0$. Here, $Y_0 := (X_0, \ldots, X_{1-p})^T$ is an observable random vector independent of $\{\varepsilon_i\}$, $Y_{i-1} := (X_{i-1}, \ldots, X_{i-p})^T$, $Z_i := (Z_{i1}, \ldots, Z_{ip})^T$, $p \ge 1$ is a known integer and $\mathbb{Z}$ denotes the set of all integers. This model includes the well known AR($p$) model (take $Z_i$ to be degenerate at 0). For the importance of RCAR($p$) models in time series analysis, see the Lecture Notes by Nicholls and Quinn (1982) and the monograph by Tong (1990).

The problem of interest here is to estimate the unknown parameter $\theta$ in model (1.1) based on $\{Y_0, X_1, \ldots, X_n\}$. We study Minimum Distance (MD) estimators of $\theta$, which are based on minimizing certain types of distance functions, called dispersions, related to the data and the parameter. The importance of this methodology in linear models can be found in Koul (1992, Chapters 5 and 7). These estimators have many desirable properties, including consistency, robustness against outliers in the errors, efficiency and asymptotic normality in AR models.

To describe these estimators, let $g = (g_1, \ldots, g_p)^T$ be a measurable function from $\mathbb{R}^p$ to $\mathbb{R}^p$, and let $|\cdot|$ be the Euclidean norm.
For a given nondecreasing right continuous function $H$ on $\mathbb{R}$, define the dispersion function, for $u \in \mathbb{R}^p$,

$$M_g(u) = \int \Big| n^{-1/2} \sum_i g(Y_{i-1}) \{ I(X_i - u^T Y_{i-1} \le y) - I(-X_i + u^T Y_{i-1} < y) \} \Big|^2 dH(y)$$

and a class of MD estimators of $\theta$, one for each $g$ and $H$, to be

$$\hat\theta := \operatorname{argmin}\{M_g(u);\ u \in \mathbb{R}^p\}.$$

Here $I(A)$ is the indicator function of the event $A$. The existence of the MD estimator $\hat\theta$ follows from Dhar (1993). Note that if we take $H(x) = x$, $g(y) = y$, then $\hat\theta$ is the Hodges-Lehmann type estimator. If we denote $U_i := Z_i^T Y_{i-1} + \varepsilon_i$, then the RCAR($p$) model becomes $X_i = \theta^T Y_{i-1} + U_i$, and $M_g$ is essentially the same as the $K_g$ of Koul (1992) with $\varepsilon_i$ there replaced by $U_i$.

Koul (1992) discussed the asymptotic behavior of the estimators $\hat\theta$ under the AR($p$) setup. A simulation study of Dhar (1993) shows that many of these MD estimators outperform the least square estimator in an AR($p$) model with asymmetric errors. Here we obtain the asymptotic normality of $\hat\theta$ under the RCAR($p$) setup. The method of proof is similar to that of Koul (1992), which requires obtaining the asymptotic uniform quadraticity of $M_g(t)$ and showing that $|n^{1/2}(\hat\theta - \theta)| = O_p(1)$.

The material is organized as follows. Assumptions and statements of the main theorems appear in Section 1.2, while proofs appear in Section 1.3. Section 1.4 contains a simulation study and an application to the Canadian lynx data. The simulation study shows that the Hodges-Lehmann type estimator is at least as good as the least square (LS) estimator and the Huber estimator in the sense of having smaller bias and mean squared error. For the annual trappings data of the Canadian lynx over the years 1821-1934, it is observed that the RCAR(2) model with $\theta$ estimated by $\hat\theta$ provides an acceptable alternative to the more widely adopted class of AR models.

1.2 Assumptions and Theorems

Throughout this part of the dissertation, we assume that $\{X_i\}$ is strictly stationary and ergodic, satisfying model (1.1).
Sufficient conditions for this to happen are discussed in Theorems 2.1, 2.7 and Corollary 2.3.2 of Nicholls and Quinn (1982). In particular, when $p = 1$, the following two assumptions imply the strict stationarity and ergodicity of $\{X_i\}$.

(i) $\{\varepsilon_i, i \in \mathbb{Z}\}$ and $\{Z_i, i \in \mathbb{Z}\}$ have mean zero and finite variances $\sigma_\varepsilon^2$ and $\sigma_Z^2$, respectively.

(ii) $\theta^2 + \sigma_Z^2 < 1$.

Now let $\gamma$ be a measurable function from $\mathbb{R}^p$ to $\mathbb{R}$, $y, a \in \mathbb{R}$, $t \in \mathbb{R}^p$. Define $p_i(y; t, a) = \int F$ [...]

(C3) [...] $> 0$, $\forall\ t \in \mathbb{R}^p$,

$$\liminf_n P\Big( \int \Big[ n^{-1/2} \sum_{i=1}^n \gamma^{\pm}(Y_{i-1}) \big( p_i(y; t, \delta) - p_i(y; t, -\delta) \big) \Big]^2 dH(y) \le k\delta^2 \Big) = 1,$$

where $\gamma^+ = \max(\gamma, 0)$, $\gamma^- = \gamma^+ - \gamma$.

(C4) For every $t \in \mathbb{R}^p$,

$$\int \Big\{ n^{-1/2} \sum_{i=1}^n \gamma(Y_{i-1}) [p_i(y, t) - p_i(y, 0)] - A(y)t/2 \Big\}^2 dH(y) = o_p(1),$$

where $A(y) = 2E\,\gamma(Y_0) Y_0^T \int f(y - z^T Y_0)\,dG(z)$.

(C5) $\int E\,\gamma^2(Y_0) F(y - Z_1^T Y_0)(1 - F(y - Z_1^T Y_0))\,dH(y) < \infty$.

We state two more conditions required for the asymptotic normality of the estimator $\hat\theta$. Let $B(y) = E\,g(Y_0) Y_0^T \int f(y - z^T Y_0)\,dG(z)$, $y \in \mathbb{R}$.

(C6) The matrix $B(y)$ is nonnegative definite for each $y \in \mathbb{R}$, and $\int B(y)\,dH(y)$ and $\int B^T(y) B(y)\,dH(y)$ are positive definite $p \times p$ matrices.

(C7) Either $s^T g(Y_{i-1}) Y_{i-1}^T s \ge 0$, $\forall\ 1 \le i \le n$, $\forall\ s \in \mathbb{R}^p$, $|s| = 1$, a.s., or $s^T g(Y_{i-1}) Y_{i-1}^T s \le 0$, $\forall\ 1 \le i \le n$, $\forall\ s \in \mathbb{R}^p$, $|s| = 1$, a.s.

Remark 1 The above conditions are assumed so that the desired asymptotic uniform quadraticity and asymptotic normality of $\hat\theta$ are achievable. In the case of the AR($p$) model, i.e., when $Z_i \equiv 0$, the above conditions (C1)-(C4) correspond to the conditions (7.4.7a), (7.4.8)-(7.4.10) of Koul (1992), and the condition (C5) is implied by (7.4.7a) and (5.5.69) of Koul (1992). For the Huber type estimator, i.e., for $y \in \mathbb{R}^p$, $g(y) = y\,I(|y| \le c) + k \frac{y}{|y|} I(|y| > c)$ for some positive constants $k$ and $c$, the condition (C7) is a priori satisfied. If $H$ is a finite measure, $\gamma(y) = y_j$, the
$j$th component of $y$, $j = 1, \ldots, p$, $g(y) = y$ and $F$ has a uniformly continuous density having positive integral with respect to the measure induced by $H$, then all of the above conditions (C1)-(C7) are implied by the strict stationarity and ergodicity of $\{X_i\}$.

Now we are ready to state our main theorems.

Theorem 1.2.1 Suppose that conditions (F), (C1)-(C5) hold. Then, $\forall\ 0 < B < \infty$,

$$\sup_{|t| \le B} |K(\theta + n^{-1/2}t) - \hat K(\theta + n^{-1/2}t)| = o_p(1),$$

where

$$\hat K(\theta + n^{-1/2}t) = \int \Big[ V(y, 0) + V(-y, 0) - n^{-1/2} \sum_{i=1}^n \gamma(Y_{i-1}) + A(y)t \Big]^2 dH(y).$$

Upon using this result $p$ times, the $j$th time with

$$\gamma(Y_{i-1}) = g_j(Y_{i-1}), \quad j = 1, \ldots, p, \tag{1.2}$$

we obtain the required asymptotic uniform quadraticity of the dispersion $M_g(u)$. For stating the desired results, we need to clarify the conditions (C1)-(C5) when $\gamma$ is as in the equation (1.2). Condition (C1) is now equal to

(C1g) $E g_j^2(Y_0) < \infty$ for all $j = 1, \ldots, p$.

Similarly, condition (C2) is equal to: $\forall\ t \in \mathbb{R}^p$, $a \in \mathbb{R}$, $1 \le j \le p$, $\|t\| \le B < \infty$,

(C2g) $\int E g_j^2(Y_0) |p_1(y; t, a) - p_1(y; 0, 0)|\,dH(y) = o(1)$.

Let (C3g) stand for the condition (C3) after $\gamma^{\pm}(Y_{i-1})$ is replaced by $g_j^{\pm}(Y_{i-1})$, $1 \le j \le p$, $1 \le i \le n$, in condition (C3). Interpret (C4g), (C5g) similarly.

Corollary 1.2.1 Suppose that conditions (F), (C1g)-(C5g) hold. Then

$$\sup_{|t| \le B} |M_g(\theta + n^{-1/2}t) - \hat M_g(\theta + n^{-1/2}t)| = o_p(1), \tag{1.3}$$

where

$$\hat M_g(\theta + n^{-1/2}t) = \int \Big| n^{-1/2} \sum_i g(Y_{i-1}) \{ I(U_i \le y) - I(-U_i < y) \} + B(y)t \Big|^2 dH(y).$$

Theorem 1.2.2 In addition to the assumptions of Corollary 1.2.1, assume that the conditions (C6) and (C7) hold. Then

$$n^{1/2}(\hat\theta - \theta) \Rightarrow N(0, \Gamma), \tag{1.4}$$

where

$$\Gamma := V^{-1} E\big[ (\psi(-U_1) - \psi(U_1^-))\, g(Y_0) g^T(Y_0)\, (\psi(-U_1) - \psi(U_1^-))^T \big] V^{-1},$$

$$\psi(y) := \int I(x \le y) B^T(x)\,dH(x), \quad \psi(y^-) := \int I(x < y) B^T(x)\,dH(x), \quad V := \int B^T(y) B(y)\,dH(y).$$

Remark 2 Least Absolute Deviation Estimator (l.a.d.). If we choose $g(y) = y$ and $H(\cdot)$ degenerate at 0, then $\hat\theta$ is the l.a.d. estimator, viz.

$$\hat\theta_{lad} = \operatorname{argmin}\Big\{ \Big| n^{-1/2} \sum_i Y_{i-1}\,\mathrm{sign}(X_i - t^T Y_{i-1}) \Big|^2,\ t \in \mathbb{R}^p \Big\}.$$

Because of the importance of the l.a.d.
estimator, we summarize all the conditions on $f$ for the case $p = 1$. All conditions (F) and (C1)-(C7) are satisfied when $G$ is symmetric around zero, $F$ has a uniformly continuous and even density $f$, and $E X_0^2 f(Z_1 X_0) > 0$. Therefore, Theorem 1.2.2 implies that $n^{1/2}(\hat\theta_{lad} - \theta) \Rightarrow N(0, \sigma_{lad}^2)$, where

$$\sigma_{lad}^2 = \frac{E X_0^2 A^2(0) [I(-U_1 \le 0) - I(U_1 > 0)]^2}{A^4(0)} = \frac{E X_0^2}{A^2(0)} = \frac{E X_0^2}{4 (E X_0^2 f(Z_1 X_0))^2}.$$

1.3 Proofs

Notation For any measurable functions $f$ and $g$ from $\mathbb{R}^{p+1}$ to $\mathbb{R}$, define

$$|f_s - g_u|_H^2 := \int [f(y, s) - g(y, u)]^2\,dH(y),$$

where $u, s \in \mathbb{R}^p$. In the following, $W^{\pm}$, $\nu^{\pm}$ stand for $W$, $\nu$ with $\gamma$ replaced by $\gamma^{\pm}$.

The proof of Theorem 1.2.1 is similar to that of Theorem 7.4.1 of Koul (1992) and is facilitated by the following two lemmas.

Lemma 1.3.1 Suppose that the RCAR($p$) model (1.1) holds. Then the following hold.

(A) The condition (C2) implies that $\forall\ 0 < B < \infty$,

$$E \int [W^{\pm}(y; t, a) - W^{\pm}(y; t, 0)]^2\,dH(y) = o(1), \quad \forall\ |t| \le B,\ a \in \mathbb{R}. \tag{1.5}$$

(B) The condition (C3) implies that $\forall\ |t| \le B$, $\forall\ 0 < B < \infty$,

$$\liminf_n P\Big( \sup_{|s - t| \le \delta} |\nu^{\pm}_s - \nu^{\pm}_t|_H^2 \le k\delta^2 \Big) = 1, \tag{1.6}$$

where $k$ and $\delta$ are as in (C3).

(C) The conditions (C1), (C3) and (C4) imply that $\forall\ 0 < B < \infty$,

$$\sup_{|t| \le B} \int [\nu(y; t) - \nu(y; 0) - A(y)t/2]^2\,dH(y) = o_p(1). \tag{1.7}$$

Lemma 1.3.2 Suppose that the conditions (C2) and (C3) hold. Then $\forall\ 0 < B < \infty$,

$$\sup_{|t| \le B} \int [W^{\pm}(y, t) - W^{\pm}(y, 0)]^2\,dH(y) = o_p(1), \tag{1.8}$$

$$\sup_{|t| \le B} \int [W(y, t) - W(y, 0)]^2\,dH(y) = o_p(1). \tag{1.9}$$

The proofs of the above lemmas are similar to those of Lemmas 7.4.2 and 7.4.3 of Koul (1992) with the following modification: replace the $\sigma$-fields used there by $\mathcal{F}_i = \sigma\{\varepsilon_j, Z_j, Y_0,\ j \le i\}$, $i \ge 1$, and the linear term there by $A t/2$.

Proof of Corollary 1.2.1. Note that the $j$th summand in $M_g$ is the same as that in $K$ when the function $\gamma$ is replaced by $g_j$ in (1.2). Therefore (1.3) follows from Theorem 1.2.1 easily. $\Box$

Before proving Theorem 1.2.2, we need the following three lemmas.
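Lemma 1.3.3 Let $\{u_1, u_2, \ldots\}$ be a stationary ergodic stochastic process such that $E\{u_1^2\}$ is finite and $E(u_i \mid u_1, \ldots, u_{i-1}) = 0$, $\forall\ i \ge 1$, with probability one. Then the distribution of $n^{-1/2} \sum_{i=1}^n u_i$ approaches the normal distribution with mean zero and variance $E u_1^2$.

Proof. See Billingsley (1961).

Lemma 1.3.4 Suppose that the assumptions of Theorem 1.2.2 hold. Then

$$n^{1/2}(\hat\theta - \theta) = O_p(1).$$

Proof. It suffices to prove that, for every $\epsilon > 0$, $\exists\ B > 0$ and an integer $N_\epsilon$ such that

$$P\Big( \inf_{|n^{1/2}(u - \theta)| > B} M_g(u) \ge M_g(\theta) \Big) > 1 - \epsilon, \quad \forall\ n > N_\epsilon, \tag{1.10}$$

because then

$$P(|n^{1/2}(\hat\theta - \theta)| > B) \le P\Big( \inf_{|n^{1/2}(u - \theta)| > B} M_g(u) < M_g(\theta) \Big) < \epsilon, \quad \forall\ n > N_\epsilon.$$

The following lemma gives (1.10). $\Box$

Lemma 1.3.5 For any $\epsilon > 0$, $\exists\ B$ (depending on $\epsilon$) and $N_\epsilon$, $0 < B < \infty$, such that

$$P\Big( \inf_{|t| > B} \hat M_g(\theta + n^{-1/2}t) \ge M_g(\theta) \Big) > 1 - \epsilon, \quad \forall\ n \ge N_\epsilon, \tag{1.11}$$

$$P\Big( \inf_{|t| > B} M_g(\theta + n^{-1/2}t) \ge M_g(\theta) \Big) > 1 - \epsilon, \quad \forall\ n \ge N_\epsilon. \tag{1.12}$$

Proof. Write $V_j(y, t)$, $W_j(y, t)$ for $V(y, t)$, $W(y, t)$, respectively, when $\gamma$ is replaced by $g_j$ in (1.2) at the $j$th time. Put $V_g(y, t) := (V_1(y, t), \ldots, V_p(y, t))^T$ and $W_g(y, t) := (W_1(y, t), \ldots, W_p(y, t))^T$. Note that, the measure generated by $H$ being $\sigma$-finite, there exists a partition $\{A_i\}$ of $\mathbb{R}$ such that $0 < \int_{A_i} dH < \infty$, $i = 1, 2, \ldots$. Let $h$ be [...] $1_{A_i}$, so that $0 < |h|_H^2 = \int h^2\,dH < \infty$. For $t \in \mathbb{R}^p$, define

$$N(t) := \int [V_g(y, t) - V_g(-y, t)]\,h(y)\,dH(y), \qquad \hat N(t) := \int [V_g(y, 0) - V_g(-y, 0) + B(y)t]\,h(y)\,dH(y).$$

[...]

$$\inf_{|u| > B, |s| = 1} M_g(\theta + n^{-1/2}su) \ge \inf_{|u| > B, |s| = 1} \frac{|s^T N(su)|^2}{|h|_H^2}.$$

Similarly,

$$\inf_{|u| > B, |s| = 1} \hat M_g(\theta + n^{-1/2}su) \ge \inf_{|u| > B, |s| = 1} \frac{|s^T \hat N(su)|^2}{|h|_H^2}.$$

Note that, by the condition (C5),

$$E \int W^2(y, 0)\,dH(y) = \int E\,\gamma^2(Y_0) F(y - Z_1^T Y_0)(1 - F(y - Z_1^T Y_0))\,dH(y) < \infty.$$

[...] Hence, for every $\epsilon > 0$, $\exists\ M_\epsilon < \infty$ such that

$$P(M_g(\theta) \le M_\epsilon) \ge 1 - \epsilon/2, \quad \text{for all } n \ge 1. \tag{1.13}$$

Thus it suffices to prove that

$$P\Big( \inf_{|u| > B, |s| = 1} \frac{|s^T \hat N(su)|^2}{|h|_H^2} \ge M_\epsilon \Big) \ge 1 - \epsilon, \tag{1.14}$$

$$P\Big( \inf_{|u| > B, |s| = 1} \frac{|s^T N(su)|^2}{|h|_H^2} \ge M_\epsilon \Big) \ge 1 - \epsilon. \tag{1.15}$$

But, $\forall\ t \in \mathbb{R}^p$,

$$|N(t) - \hat N(t)|^2 \le \int |V_g(y, t) - V_g(y, 0) - V_g(-y, t) + V_g(-y, 0) - B(y)t|^2\,dH(y)\ |h|_H^2 \le 2|h|_H^2\,[\ldots].$$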
By using Lemma 1.3.1(C) $p$ times and Lemma 1.3.2, we obtain that $\forall\ 0 < B < \infty$,

$$\sup_{|t| \le B} |N(t) - \hat N(t)|^2 = o_p(1).$$

Now rewrite

$$\hat N(t) = \int [V_g(y, 0) - V_g(-y, 0)]\,h(y)\,dH(y) + \int B(y) h(y)\,dH(y)\ t =: \hat N_1 + \hat N_2 t.$$

By the Cauchy-Schwarz inequality, $|\hat N_1|^2 \le M_g(\theta) |h|_H^2$. Therefore, by (1.13), there exists $b = M_\epsilon^{1/2} |h|_H$ such that

$$P(|\hat N_1| \le b) \ge 1 - \epsilon. \tag{1.16}$$

Denote $a_s = s^T \int B(y) h(y)\,dH(y)\ s$ and $a = \inf_{|s| = 1} \{a_s\}$. Then, by the condition (C6), $a > 0$. The rest of the proof is similar to that of Lemma 5.5.4 of Koul (1992). $\Box$

Proof of Theorem 1.2.2. Expand $\hat M_g(\theta + n^{-1/2}t)$ in $t$:

$$\hat M_g(\theta + n^{-1/2}t) = M_g(\theta) + 2 t^T n^{-1/2} \sum_{i=1}^n \int [I(U_i \le y) - I(-U_i < y)]\,B^T(y)\,g(Y_{i-1})\,dH(y) + t^T \int B^T(y) B(y)\,dH(y)\ t.$$

Let $\tilde\theta := \operatorname{argmin}\{\hat M_g(u) : u \in \mathbb{R}^p\}$. Then, by the same proof as that of Theorem 5.4.1 of Koul (1992), we have

$$\Big| (\hat\theta - \tilde\theta)^T \int B^T(y) B(y)\,dH(y)\ (\hat\theta - \tilde\theta) \Big| = o_p(1).$$

Therefore it is enough to prove

$$n^{1/2}(\tilde\theta - \theta) \Rightarrow N(0, \Gamma). \tag{1.17}$$

But $\tilde\theta$ must satisfy the following equation:

$$n^{1/2}(\tilde\theta - \theta) = -\Big( \int B^T(y) B(y)\,dH(y) \Big)^{-1} S_n, \tag{1.18}$$

where

$$S_n := n^{-1/2} \sum_i (\psi(-U_i) - \psi(U_i^-))\,g(Y_{i-1}).$$

Since $\{X_i\}$ is strictly stationary and ergodic, so is $\{(\psi(-U_i) - \psi(U_i^-))\,g(Y_{i-1})\}$. Furthermore, by the condition (F), $U_1$ is a continuous random variable and, given $Y_0$, the conditional distribution of $U_1$ is the same as that of $-U_1$. Thus

$$E[\psi(-U_1) - \psi(U_1^-) \mid \mathcal{F}_0] = 0. \tag{1.19}$$

Now, for any $p$-component vector $\beta$,

$$E\{\beta^T (\psi(-U_i) - \psi(U_i^-))\,g(Y_{i-1}) g^T(Y_{i-1})\,(\psi(-U_i) - \psi(U_i^-))^T \beta\} = \beta^T E\{(\psi(-U_1) - \psi(U_1^-))\,g(Y_0) g^T(Y_0)\,(\psi(-U_1) - \psi(U_1^-))^T\} \beta.$$

This expectation exists if $E g_j^2(Y_0) < \infty$, $j = 1, \ldots, p$, and condition (C6) holds. Then $E\{\beta^T (\psi(-U_i) - \psi(U_i^-))\,g(Y_{i-1}) \mid \mathcal{F}_{i-1}\} = 0$ follows from (1.19) and the stationarity of $U_i$. An application of Lemma 1.3.3 shows that $n^{-1/2} \sum_{i=1}^n \beta^T (\psi(-U_i) - \psi(U_i^-))\,g(Y_{i-1})$ converges weakly to the normal distribution with mean zero and variance

$$\beta^T E\{(\psi(-U_1) - \psi(U_1^-))\,g(Y_0) g^T(Y_0)\,(\psi(-U_1) - \psi(U_1^-))^T\} \beta,$$

for all $\beta \in \mathbb{R}^p$.
Thus, by the Cramér-Wold device, $S_n$ converges weakly to the multivariate normal distribution with mean vector zero and covariance matrix

$$E\{(\psi(-U_1) - \psi(U_1^-))\,g(Y_0) g^T(Y_0)\,(\psi(-U_1) - \psi(U_1^-))^T\}.$$

Hence $n^{1/2}(\tilde\theta - \theta)$ converges in distribution to the normal distribution with mean vector zero and covariance matrix

$$V^{-1} E\{(\psi(-U_1) - \psi(U_1^-))\,g(Y_0) g^T(Y_0)\,(\psi(-U_1) - \psi(U_1^-))^T\} V^{-1}.$$

This ends the proof of Theorem 1.2.2. $\Box$

1.4 Simulation results

In this section, we investigate the performance of the MD estimators under the RCAR(1) model for finite samples. A simulation study (100 replications) was performed for samples of size $n = 20$, $n = 50$ and $n = 100$. The samples were generated by the RCAR(1) model

$$X_i = (\theta + Z_i) X_{i-1} + \varepsilon_i$$

with the true parameter $\theta = .5$, different error distributions and normally distributed random coefficients.

The comparison is made among the LS, Huber and MD estimators. The MD estimators considered for comparison under the RCAR(1) model are as follows. The functions $H$, $g$ are taken to be the identity function (i.e., $H(y) = g(y) = y$, $y \in \mathbb{R}$). The Huber function $\phi$ is defined as follows:

$$\phi(x) = \begin{cases} x, & \text{if } |x| \le c, \\ c\,\mathrm{sign}(x), & \text{if } |x| > c, \end{cases} \tag{1.20}$$

with $c$ estimated by $S_I = \mathrm{Median}\,|X_i|$.

The random coefficient distribution $G$ considered here is a normal distribution with mean zero and variance .25. This ensures that $\theta^2 + \sigma_Z^2 < 1$. The error distributions considered are the following.

(a) $F$ is the standard normal distribution,
(b) $F$ is the double exponential(1) distribution,
(c) $F$ is the logistic(1,1) distribution.

The RCAR(1) process is generated as follows. In case (a),

1. Generate a vector $\langle w(1), w(2) \rangle^T$, where the $w(i)$'s are successively generated by a standard normal random number generator, so they are independent.

2. Repeat step 1 $(n + 200)$ times, where $n$ is the desired sample size. Let the $w(1)$, $.5w(2)$ generated the $m$th time be $\varepsilon_m$, $Z_m$, $m = 1, \ldots, n + 200$, respectively. Then $\langle \varepsilon_1, \ldots, \varepsilon_{n+200} \rangle^T$ and $\langle Z_1, \ldots, Z_{n+200} \rangle^T$ will theoretically be independent, have zero means and $E\varepsilon_m^2$
$= 1$, and $EZ_m^2 = .25$.

3. Calculate $X_i = (\theta + Z_i) X_{i-1} + \varepsilon_i$, where $X_0$ is generated by the normal distribution with mean zero and variance 2. Then ignore the first 200 $X$ values produced. This enables $\{X_i\}$ to reach an equilibrium, since we assume $\{X_i\}$ is stable.

In case (b) (or (c)), we independently generate $w(1)$, $w(2)$ from the double exponential(1) (or logistic(1,1)) and the standard normal distribution, respectively. Then we proceed as in case (a) for steps 2 and 3.

The LS estimator is computed using the formula $\sum_{i=1}^n X_i X_{i-1} / \sum_{i=1}^n X_{i-1}^2$. The Huber estimator is the solution of

$$T(u) := \sum_{i=1}^n X_{i-1}\,\phi(X_i - u X_{i-1}) = 0, \tag{1.21}$$

with $\phi$ as given in (1.20). For the Huber estimator, $S_I = \mathrm{Median}\,|X_i - u X_{i-1}|$ is first computed using the LS estimator $\hat\theta_{ls}$, then $S_I$ is computed again using the zero of (1.21). This iterative procedure is terminated when the absolute difference between consecutive zeros of (1.21) is less than $10^{-6}$. To compute the MD estimator, we minimize the dispersion $M_g(u)$ over $[-1, 1]$, and the minimizer is denoted by $\hat\theta_{md}$.

Table 1.1 contains the simulation results: the averages (Mean) and the mean squared errors (MSE) of the estimators for the true parameter $\theta = 0.5$ in the RCAR(1) model with 100 replicates and sample size $n$.

Notice that the MD estimator computed in Table 1.1 is a local minimizer. According to Dhar (1993, Lemma 1.1), in the case $H(x) = x$, the minimizer of the function $M_g(u)$ can only be one element, or a convex combination of a pair of elements, of the set

$$D = \Big\{ \frac{X_i - X_j}{X_{i-1} - X_{j-1}},\ \frac{X_i + X_j}{X_{i-1} + X_{j-1}} :\ X_{i-1} \ne X_{j-1},\ X_{i-1} \ne -X_{j-1},\ 1 \le i, j \le n \Big\}.$$

Thus the global minimizer can be computed by comparing $(\hat\theta_{md}, M_g(\hat\theta_{md}))$ with the pairs $(u, M_g(u))$ for $u \in D$, starting with $(\hat\theta_{md}, M_g(\hat\theta_{md}))$.

From Table 1.1, we observe that the Huber estimator has the biggest MSEs except in the case of Dexp(1) errors and $n = 50$, which could be caused by the computing accuracy. Most of the biases and the MSEs of the MD estimator are less than those of the LSE.
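The generation steps and the LS and MD computations described above can be sketched in a few lines. This is an illustrative sketch, not the Mathematica code used for the study. It covers case (a) only, takes $p = 1$ and $H(y) = g(y) = y$, and evaluates the dispersion through the identity $\int [I(a \le y) - I(-a < y)][I(b \le y) - I(-b < y)]\,dy = |a + b| - |a - b|$, which is derived here and not stated in the text. The grid search over $[-1, 1]$ is an assumed stand-in for the minimization routine; all function names are hypothetical.

```python
import math
import random


def simulate_rcar1(n, theta=0.5, sd_z=0.5, burn_in=200, seed=0):
    """Return n + 1 consecutive values of X_i = (theta + Z_i) X_{i-1} + eps_i."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, math.sqrt(2.0))    # starting value ~ N(0, 2), step 3
    xs = [x]
    for _ in range(n + burn_in):
        eps = rng.gauss(0.0, 1.0)         # case (a): standard normal errors
        z = sd_z * rng.gauss(0.0, 1.0)    # Z_i ~ N(0, 0.25)
        x = (theta + z) * x + eps
        xs.append(x)
    return xs[burn_in:]                   # ignore the first 200 values


def ls_estimate(xs):
    """Least squares: sum X_i X_{i-1} / sum X_{i-1}^2."""
    num = sum(xs[i] * xs[i - 1] for i in range(1, len(xs)))
    den = sum(xs[i - 1] ** 2 for i in range(1, len(xs)))
    return num / den


def dispersion(u, xs):
    """M_g(u) for p = 1, H(y) = g(y) = y, using the closed form
    (1/n) * sum_i sum_j X_{i-1} X_{j-1} (|e_i + e_j| - |e_i - e_j|),
    where e_i = X_i - u X_{i-1}."""
    n = len(xs) - 1
    lag = xs[:-1]
    res = [xs[i] - u * xs[i - 1] for i in range(1, n + 1)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += lag[i] * lag[j] * (abs(res[i] + res[j]) - abs(res[i] - res[j]))
    return total / n


def md_estimate(xs, grid_size=201):
    """Grid minimizer of the dispersion over [-1, 1] (an assumed method)."""
    grid = [-1.0 + 2.0 * k / (grid_size - 1) for k in range(grid_size)]
    return min(grid, key=lambda u: dispersion(u, xs))


xs = simulate_rcar1(30, seed=1)
theta_ls = ls_estimate(xs)
theta_md = md_estimate(xs, grid_size=41)
```

Each dispersion evaluation costs on the order of $n^2$ operations, so a coarse grid is used in the usage lines. A grid minimizer is only approximate; the exact minimizer could be sought among the points of the set $D$ described above, at a higher computational cost.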
The estimated standard deviation of each average of the estimates can be computed as $\sqrt{SV}/\sqrt{100} = \sqrt{SV}/10$, where $SV$ is the sample variance of the 100 estimates, which is related to the MSE by the formula MSE $= (n - 1)SV/n + (\mathrm{Mean} - \theta)^2$ (here $n$ denotes the number of replications, 100). We also observe that all estimators underestimate $\theta$ on average. For the sample of size $n = 100$, the MD estimator with $H(x) = g(x) = x$ falls between the LSE and the Huber estimator. The simulation study was carried out in Mathematica.

Table 1.1: Simulation results

                      N(0,1)              Logistic(1,1)            Dexp(1)
Estimator        Mean       MSE         Mean       MSE         Mean       MSE
n = 20
  LS           .437970    .055660     .432906    .058949     .439949    .068925
  Huber        .452448    .061695     .447024    .066406     .447882    .075071
  MD           .441372    .055152     .433595    .061095     .437264    .076194
n = 50
  LS           .469420    .020073     .444754    .025691     .477704    .020230
  Huber        .463232    .026582     .458541    .030475     .479045    .026283
  MD           .473069    .020896     .454640    .0263490    .476869    .031124
n = 100
  LS           .475662    .013030     .486802    .010064     .474512    .013748
  Huber        .488874    .014368     .498575    .13089      .487110    .013204
  MD           .483978    .012277     .494234    .11173      .483249    .012016

Table 1.2: The estimators of the RCAR(2) model for lynx data

Estimators   $\hat\theta_1$   $\hat\theta_2$   $\hat\Sigma_{11}$   $\hat\Sigma_{12}$   $\hat\Sigma_{22}$   $\hat\sigma_\epsilon^2$
  LS            1.3844          -.7479           .0770              -.0694              .0821               .0364
  ML            1.4274          -.8073           .0664              -.0489              .0839               .0300
  MD            1.3932          -.7495           .0764              -.0706              .0845               .0367

An applied example

We now fit a second order random coefficient autoregressive model

\[ X_i^* = (\theta_1 + Z_{1i})X_{i-1}^* + (\theta_2 + Z_{2i})X_{i-2}^* + \epsilon_i \]

to the classical Canadian lynx data. Here $X_i^* = X_i - \bar X$, $\bar X$ is the average value of the $X_i$'s, and $X_i$ is the $\log_{10}$ of the $i$th observation. We took the first one hundred observations. The MD estimators $\hat\theta_1$, $\hat\theta_2$ of $\theta_1$, $\theta_2$ are 1.3932, -.749498. Let $Z_1 = (Z_{11}, Z_{21})'$ and

\[ \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12} & \Sigma_{22} \end{pmatrix} = E Z_1 Z_1^T. \]

To estimate the covariance matrix $\Sigma$ of $Z_1$ and the variance $\sigma_\epsilon^2$ of $\epsilon_1$, substitute the MD estimator $\hat\theta$ into (3.2.4) and (3.2.5) of Nicholls and Quinn. The estimators $\hat\Sigma_{11}$, $\hat\Sigma_{12}$, $\hat\Sigma_{22}$ and $\hat\sigma_\epsilon^2$ obtained thus are .076433, -.070556, .084552 and .03668.
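As a small numerical check on the covariance estimates in Table 1.2, the sketch below assembles each estimated matrix from its three entries, verifies that it is positive definite, and computes a matrix norm. The values are copied from the table. The choice of the Frobenius norm is an assumption made here for illustration, and the helper names are hypothetical.

```python
import math

# (Sigma_11, Sigma_12, Sigma_22) for each estimator, copied from Table 1.2.
SIGMA_HAT = {
    "LS": (0.0770, -0.0694, 0.0821),
    "ML": (0.0664, -0.0489, 0.0839),
    "MD": (0.0764, -0.0706, 0.0845),
}


def frobenius_norm(s11, s12, s22):
    """Frobenius norm of the symmetric 2x2 matrix [[s11, s12], [s12, s22]]."""
    return math.sqrt(s11 ** 2 + 2.0 * s12 ** 2 + s22 ** 2)


def is_positive_definite(s11, s12, s22):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return s11 > 0.0 and s11 * s22 - s12 ** 2 > 0.0


norms = {name: frobenius_norm(*entries) for name, entries in SIGMA_HAT.items()}
valid = {name: is_positive_definite(*entries) for name, entries in SIGMA_HAT.items()}
```

Under this choice of norm, all three estimated matrices are valid covariance matrices and the ML estimate has the smallest norm.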
The comparison of the LSE, the MLE of Nicholls and Quinn and the MD estimator is given in Table 1.2. From Table 1.2, we can see that the MD estimator performs at least as well as the LSE. Also, notice that the ML estimator of Nicholls and Quinn has the smallest estimated variance $\hat\sigma_\epsilon^2$ and the smallest norm of the estimated covariance matrix of $Z_1$. The three dimensional graph of the dispersion $M_g(u)$, $u \in \mathbb{R}^2$, is in Figure 1.1. The zeros of the characteristic polynomial $1 - 1.3932z + .749498z^2$ are $1.15509 \exp\{\pm i 2\pi/9.88329\}$, so the fitted RCAR(2) model exhibits a cycle with period 9.88329, which is close to the result of Moran (1953).

Figure 1.1: The graph of the dispersion $M_g(u)$

Part II

A Self-Exciting Threshold Autoregressive Model

Chapter 2

Definitions, assumptions and consistency

2.1 The profile maximum likelihood estimation

Recall that the SETAR(2;1,1) model is defined by

\[ X_i = h(X_{i-1}, \theta) + \epsilon_i, \quad i \ge 1, \tag{2.1} \]

for some $\theta = (\theta_1^T, r)^T \in \mathbb{R}^5$, where $\theta_1 = (a_0, a_1, b_0, b_1)^T \in \mathbb{R}^4$ and for any $x \in \mathbb{R}$,

\[ h(x, \theta) = (a_0 + a_1 x) I(x \le r) + (b_0 + b_1 x) I(x > r). \]

Here, the errors $\{\epsilon_i\}$ are independent and identically distributed random variables with mean zero and finite nonzero variance, and $\epsilon_1$ is independent of $X_0$.

We begin with the definition of the maximum likelihood estimators of the unknown underlying parameter $\theta$ in model (2.1). Assume $\theta$ is an interior point of $\Theta$ defined in (0.3). Note that $\Theta$ is an open subset of $\mathbb{R}^5$. There exists a compact subset $K$ of $\mathbb{R}^4$ such that $\theta$ is an interior point of $K \times \mathbb{R}$. Denote $\Omega = K \times \bar{\mathbb{R}}$, where $\bar{\mathbb{R}} = [-\infty, \infty]$; then $\Omega$ is a compact set. Let $\vartheta = (\alpha_0, \alpha_1, \beta_0, \beta_1, s)^T$ be any point in $\Omega$.

Note that $\{X_i\}$ in model (2.1) forms a Markov chain. Let $g_\vartheta(X_0)$ be the initial density of $X_0$ under $\vartheta$ and $f$ be the density function of $\epsilon_1$; then the one step transition density, starting from $X_{i-1}$, is $f(X_i - h(X_{i-1}, \vartheta))$, $i \ge 1$. If one observes $(X_0, \ldots, X_n)$, then the likelihood function under $\vartheta$ is $\prod_{i=1}^n f(X_i - h(X_{i-1}, \vartheta))\, g_\vartheta(X_0)$.
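To make the model and the product of transition densities concrete, here is a minimal Python sketch. It assumes standard normal errors (one admissible choice of $f$, made here only for illustration) and leaves out the initial-density factor $g_\vartheta(X_0)$. The function names are hypothetical.

```python
import math
import random


def h(x, a0, a1, b0, b1, r):
    """Autoregressive function of model (2.1): one line on each side of r."""
    return a0 + a1 * x if x <= r else b0 + b1 * x


def simulate_setar(n, theta, x0=0.0, seed=0):
    """Return [X_0, ..., X_n] with N(0, 1) errors (an assumed choice of f)."""
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n):
        x.append(h(x[-1], *theta) + rng.gauss(0.0, 1.0))
    return x


def log_normal_density(e):
    """Log of the standard normal density at e."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * e * e


def conditional_loglik(x, theta, log_f=log_normal_density):
    """Log of prod_i f(X_i - h(X_{i-1}, theta)), initial density omitted."""
    return sum(log_f(x[i] - h(x[i - 1], *theta)) for i in range(1, len(x)))
```

For example, with `theta = (1.0, 0.5, -1.0, 0.5, 0.0)` the function `h` jumps at the threshold `r = 0.0`, and `conditional_loglik(simulate_setar(200, theta), theta)` evaluates the log-likelihood at the parameter that generated the data.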
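Let $\hat\theta_n = (\hat a_{0n}, \hat a_{1n}, \hat b_{0n}, \hat b_{1n}, \hat r_n)^T$ be any measurable function of $(X_0, X_1, \ldots, X_n)$ from $\mathbb{R}^{n+1}$ to $\Omega$ such that $\hat\theta_n$ maximizes the conditional likelihood function

\[ L_n(\vartheta) := \prod_{i=1}^n f(X_i - h(X_{i-1}, \vartheta)) \]

over $\Omega$. Write $\vartheta = (\vartheta_1^T, s)^T$, $\theta = (\theta_1^T, r)^T$. Because of the behavior of the threshold parameter $r$ in the likelihood function, the maximization is carried out in the following fashion:

Step 1. For fixed $s \in \bar{\mathbb{R}} = [-\infty, \infty]$, denote $L_{ns}(\vartheta_1) = L_n(\vartheta_1, s) = L_n(\vartheta)$. Let $\vartheta_{1n}(s) \in K$ be any value satisfying

\[ \vartheta_{1n}(s) = \arg\max_{\vartheta_1 \in K} L_{ns}(\vartheta_1). \]

Step 2. Consider the profile conditional likelihood function $s \mapsto L_n(\vartheta_{1n}(s), s)$. Note that $L_n(\vartheta_{1n}(s), s)$ has only a finite number of possible values. Let $\hat r_n$ be any value satisfying

\[ \hat r_n = \arg\max_{s \in \bar{\mathbb{R}}} L_n(\vartheta_{1n}(s), s), \]

and substitute $\hat r_n$ into $\vartheta_{1n}(s)$ to get $\hat\theta_{1n} = \vartheta_{1n}(\hat r_n)$. Then

\[ \hat\theta_n = (\hat\theta_{1n}^T, \hat r_n)^T \text{ is a maximum likelihood estimator of } \theta. \tag{2.2} \]

To see (2.2), for any $\vartheta = (\vartheta_1^T, s)^T \in \Omega$, by the definitions of $\hat\theta_{1n}$ and $\hat r_n$, we have

\[ L_n(\hat\theta_{1n}, \hat r_n) = L_n(\vartheta_{1n}(\hat r_n), \hat r_n) \ge L_n(\vartheta_{1n}(s), s) \ge L_n(\vartheta), \]

and hence

\[ L_n(\hat\theta_{1n}, \hat r_n) = \sup_{\vartheta \in \Omega} L_n(\vartheta). \]

This means that $\hat\theta_n = (\hat\theta_{1n}^T, \hat r_n)^T$ is an MLE of $\theta$.

2.2 Assumptions

In this section, we list the assumptions and some examples. The following assumptions on the density $f$ of $\epsilon_1$ will be used in the following chapters of Part II.

(C1) $f$ is absolutely continuous and positive everywhere on $\mathbb{R}$. With the a.e. derivative $f'$, let $\varphi = f'/f$ and $I(f) = \int \varphi^2\, dF$ [...]

[...] and

\[ f'(x) = c(m)\left(-\frac{m+1}{m}\,x\right)\left(1 + \frac{x^2}{m}\right)^{-\frac{m+3}{2}}. \tag{2.3} \]

Thus,

\[ \varphi(x) = \frac{f'(x)}{f(x)} = -\frac{m+1}{m}\,\frac{x}{1 + x^2/m}, \quad \text{and} \quad |f'(x)| \le \frac{m+1}{2\sqrt{m}}\, f(x). \tag{2.4} \]

Hence (C1) holds. By (2.4),

\[ \frac{f'^2(x)}{f(x)} \le \left(\frac{m+1}{2\sqrt{m}}\right)^2 f(x). \]

This implies that $I(f) < \infty$. The Lip(1) property of $\varphi$ holds because

\[ |\varphi'(x)| = \left| \frac{m+1}{m}\,\frac{1 - x^2/m}{(1 + x^2/m)^2} \right| \le \frac{m+1}{m}. \]

[...] 6, $E|\epsilon_1|^3 < \infty$, which implies (C4).

Throughout the following proofs, we use the fact that $E|\epsilon_1|^k < \infty$ implies $E|X_0|^k < \infty$, for $k = 2, 3$, as proved by Chan, Petruccelli, Tong and Woolford (1985).

2.3 Strong consistency of the MLE

We are going to show the strong consistency of the MLE $\hat\theta_n$.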
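Before the formal argument, the two-step profile maximization of Section 2.1 can be sketched numerically. The sketch assumes standard normal errors, in which case maximizing the conditional likelihood for a fixed threshold reduces to ordinary least squares within each regime. It also ignores the restriction of the coefficients to the compact set $K$ and requires a minimum number of observations per regime; both are simplifications made here for illustration, and the function names are hypothetical.

```python
def ols_line(xs, ys):
    """Least-squares intercept and slope, with the residual sum of squares."""
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, rss


def profile_fit(prev, nxt, min_per_regime=2):
    """Return (a0, a1, b0, b1, s) minimizing the total residual sum of squares.

    prev holds the lagged values X_{i-1}; nxt holds the responses X_i.
    """
    best = None
    for s in sorted(set(prev)):          # the profile only changes at data points
        low = [(x, y) for x, y in zip(prev, nxt) if x <= s]
        high = [(x, y) for x, y in zip(prev, nxt) if x > s]
        if len(low) < min_per_regime or len(high) < min_per_regime:
            continue
        a0, a1, rss_low = ols_line(*zip(*low))      # Step 1, regime x <= s
        b0, b1, rss_high = ols_line(*zip(*high))    # Step 1, regime x > s
        rss = rss_low + rss_high
        if best is None or rss < best[0]:           # Step 2: keep the best s
            best = (rss, (a0, a1, b0, b1, s))
    return best[1] if best else None
```

For a series `x = [X_0, ..., X_n]`, the call is `profile_fit(x[:-1], x[1:])`. The profile likelihood is constant between consecutive observed lag values, so the maximizer over $s$ is an interval; this sketch reports its left endpoint.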
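To this effect, let $l_n$ be the conditional log-likelihood ratio:

\[ l_n(\vartheta) = \frac{1}{n} \sum_{i=1}^n \ln \frac{f(X_i - h(X_{i-1}, \vartheta))}{f(X_i - h(X_{i-1}, \theta))}, \tag{2.5} \]

and denote

\[ \psi(X_{i-1}, \epsilon_i, \vartheta) = \ln \frac{f(\epsilon_i + h(X_{i-1}, \theta) - h(X_{i-1}, \vartheta))}{f(\epsilon_i)}, \quad 1 \le i \le n. \tag{2.6} \]

Note that

\[ l_n(\vartheta) = \frac{1}{n} \sum_{i=1}^n \psi(X_{i-1}, \epsilon_i, \vartheta). \tag{2.7} \]

Write $\vartheta = (\vartheta_1^T, s)^T \in \Omega$ and $h(x, \vartheta) = h_s(x, \vartheta_1)$. Let

\[ \dot h_s(x) = (\partial/\partial\vartheta_1)\, h_s(x, \vartheta_1) = (I(x \le s),\ x I(x \le s),\ I(x > s),\ x I(x > s))^T, \quad s \in \bar{\mathbb{R}},\ x \in \mathbb{R}. \]

Observe that for any $x \in \mathbb{R}$,

\[ h(x, \vartheta) = \vartheta_1^T \dot h_s(x). \tag{2.8} \]

Also,

\[ |\dot h_s(x)| \le \sqrt{1 + x^2}, \tag{2.9} \]

and for any $s$, $t$,

\[ |\dot h_s(x) - \dot h_t(x)| \le \sqrt{2(1 + x^2)}\, I(s \wedge t < x \le s \vee t) \tag{2.10} \]
\[ \le \sqrt{2(1 + x^2)}\, I(|x - t| \le |s - t|). \tag{2.11} \]

Thus, by (2.8) and (2.9),

\[ |h(x, \vartheta)| \le |\vartheta_1| \sqrt{1 + x^2}. \tag{2.12} \]

Recall that $\theta \in \Theta$ means the stationarity and ergodicity of the underlying process $\{X_i\}$. Throughout, we will work with the stationary and ergodic process $\{X_i\}$.

Theorem 2.3.1 Suppose that the conditions (C1) and (C2) hold. Then,

\[ \hat\theta_n \to \theta \quad \text{a.s., as } n \to \infty \ (\text{under } \theta). \tag{2.13} \]

Before proving Theorem 2.3.1, we need the following lemma. Let $U_\vartheta$ denote any open neighborhood of $\vartheta$.

Lemma 2.3.1 Under the assumptions of Theorem 2.3.1, for any $\vartheta \in \Omega$ and its open neighborhood $U_\vartheta$,

\[ E \sup_{\vartheta^* \in U_\vartheta} |\psi(X_0, \epsilon_1, \vartheta^*) - \psi(X_0, \epsilon_1, \vartheta)| \to 0, \quad \text{as } U_\vartheta \text{ shrinks to } \vartheta. \tag{2.14} \]

Proof. Define

\[ U_\vartheta(\eta) = \{\vartheta^* = (\vartheta_1^{*T}, s^*)^T \in \Omega : |\vartheta_1^* - \vartheta_1| < \eta,\ d(s^*, s) < \eta\}, \quad \eta > 0. \]

It suffices to show that

\[ E \sup_{\vartheta^* \in U_\vartheta(\eta)} |\psi(X_0, \epsilon_1, \vartheta^*) - \psi(X_0, \epsilon_1, \vartheta)| \to 0, \quad \text{as } \eta \to 0. \tag{2.15} \]

Let $\epsilon_1(\vartheta) = X_1 - h(X_0, \vartheta)$ and $\delta(X_0, \vartheta^*) = h(X_0, \vartheta) - h(X_0, \vartheta^*)$. For any $\vartheta = (\vartheta_1^T, s)^T$, recall

\[ h_s(X_0, \vartheta_1) = h(X_0, \vartheta_1, s) = h(X_0, \vartheta), \tag{2.16} \]

and rewrite $\delta(X_0, \vartheta^*) = h_s(X_0, \vartheta_1) - h_{s^*}(X_0, \vartheta_1^*)$. For any $x \in \mathbb{R}$, by (2.8) and (2.9),

\[ |h_s(x, \vartheta_1) - h_s(x, \vartheta_1^*)| \le |\vartheta_1 - \vartheta_1^*| \sqrt{1 + x^2}, \tag{2.17} \]

and by (2.10) and (2.11),

\[ |h_s(x, \vartheta_1^*) - h_{s^*}(x, \vartheta_1^*)| \le |\vartheta_1^*| \sqrt{2(1 + x^2)}\, I(s \wedge s^* < x \le s \vee s^*) \tag{2.18} \]
\[ \le |\vartheta_1^*| \sqrt{2(1 + x^2)}\, I(|x - s| \le |s^* - s|). \]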
(2.19) Thus on U907) and for s E R, |5(Xo,19")| S lhs(Xo,191)-ha-(Xo,191)|+lhs-(Xo,191)-hs~(Xofl9I)| S [filfixllflxo — 8| S It" - 3|) + W: — 15"{llx/(1 +X3) s [filtllluxo — s) s 130(77)- sl) + nl\/(1+X3) A07, X0), (say): (220) 28 where 30(77) is such that d(so(17),s) = 77. Note that 9461(9)) = [90(61 + h(X0a9) — 110(0) 19)) — 9461)] + 0. Therefore, the finiteness of I(f) and (2.25) imply (2.15) for any 3 E R. In the case 3 = 00, similar to (2.20), (nmwns(fimmn>m+fl(uxa g (fi|01|I(Xo > so(n)) + n)\/1+ X3 431(7), X0) where d(so(17), oo) = 77. Again, EAi(n.Xo) —> 0. as n _. 0- Thus the proof goes through for s = 00. The proof is similar in the case 3 z —00, except one replaces I (X0 > s‘) by I (X0 < 3"). Therefore Lemma 2.3.1 is proved. D 29 Proof of Theorem 2.3.1. Let a(19) = E¢(Xo, 61,19) for 19 E Q. The conditions (Cl) and (C2), the mean value theorem, the independence of £1 and X0 and Cauchy- Schwarz inequality imply that E |z/)(Xo, 61,19)| < 00. Thus a is a well defined finite function from Q to ’R. Note that (1(0) = O and lnx < :c - 1, unless a: = 1. For any given open neighborhood V of 0 in Q and any 19 6 VC 2 Q \ V, an conditional argument yields that 61 + h(Xo, 0) — h(Xo, 19)) f (61) _ n rm + how) — h(Xo,t9)) - E{E[. 2.) P21} < E {/[m + h(xo,9) — hm, 19)) — 1m] dy} = 0 0(19) = Eln f( (2.26) By Lemma 2.3.1, 0 is continuous and hence the compactness of Vc implies that there exists 190 6 Vc, such that sup 0(19) = 0(190) < 0. 19ch Let 60 = —a(19o)/3. For any 19 6 VC, by Lemma 2.3.1 again, there exists no > 0, such that E SUP w(Xo, 61,1?) S E¢(Xoa 61,19) + 5o S 01(190) + 50 = —250- (2°27) 19‘EU0('IO) Again, the compactness of Vc implies that there exists a finite number M of U191,(170), 191' 6 VC, j = 1,2, ..., M such that U11" U191(170) = V". Then by the ergodic theorem and (2.27), there exists a no such that for any n 2 no, 1 S j S M, sup 1.09”) s 1: sup abacus-49*) 0'6Uoj(m) " 19‘ev,9j(no) S E sup w(Xo,61,19") + 60 S —60, a. s. 
19.6U19lno) J But, sup (”(19) _>_ ln(0) = 0. (2.28) 196V Therefore, for any neighborhood V of 0 in Q, 3 no, st. for all n _>_ no, sup 1,,(19') S max sup 1,,(19’) S —60 < 0 S sup 1,,(19). 19'eQ\V 1515M (1191(710) 196V 30 This implies that (line V, a. s.VVanan2no. By the arbitrary of V, 0,, goes to 0 almost surely. E] 2.4 n-consistency of the threshold estimator From now on we will invoke the condition (M). The discontinuity of h at 7‘ will give a stronger result about the estimator 1",, of the threshold 7‘, i.e., the n—consistency of A Tn. Theorem 2.4.1 Suppose conditions (CU-{C3} and (M) hold, then Infin _ 7')l = OP“)- The proof of Theorem 2.4.1 is technical and lengthy but interesting. We will begin with some notation. Let J : 722 —+ ’R. and p(a:) = EJ(x,q), 191(3) = E|J(:r,el)|, p2(:c) = EJ2(.’E,£1),.’L' E ’R, (2.29) For u _>_ 0, define G(u) = EI(r < X0 S r + u), Gn(u) = i210 < X;_1 S r + u), and Rn(u)=-71;(XZJ ,-_1,e,)I(rdGa| s l. [/ |J°(x,y)|dF(y)] (|P*-2(X1.dw) - Go)| 33 = |E(|JC(X0,61)|I(u1 < X0 3 u.) x (E[|J°(x,._1, ek)|I(u1 < X... g u2)|X1,Xo] —E|J°(Xo,e1)|1(u1 < X0 3 422))“ s Cpk‘2E|J°(Xo,el)ll(u1 < X0 3 u4)(1+ Ih(0.Xo)| + lell) s Cp"‘”(G(u2) — (3041)). Therefore (2.35) follows from the stationarity of {X.-} and the fact EM]- pk‘j = 0(n). The proof of (2.36) follows from the property of the square integrable martingale Rn(u) — rn(u), for fixed u > 0. That is, Var(n(Rn(u) — rn(u))) = nVar(J°(Xo, el)I(u1 < X0 S u2)). This completes the proof of Lemma 2.4.2. Proposition 2.4.1 Suppose that (Cl) holds and the functions p1 and p2 are contin- uous. Then, for each 6 > 0, n > 0, there is a constant B < 00, V 0 < 5 < 1 and V n 2 [13/5] + 1, P ( sup IGn(u)/G(u) — 1| < 77) > 1— e, (2.40) B/n 1 — e, (2.41) Note. The condition (C1) is for (2.40), the continuity of p; and p2 is for (2.41). Proof of Proposition 2.4.1. 
For any B > 0 and 0 < 6 < 1, choose a partition of the interval (B / n, 6] as follows: Fix a b > 1 and let Mo be the greatest integer less than or equal to ln(n6 / B) / 1n b. Note that Mo . . (B/n,6] = U 1,, 1,: (b'B/n,b‘+lB/n], i: 0,...,M0 — 1, [Mo = (bM°B/n,6]. i=0 Then (2.30) and (2.32) of Lemma 2.4.1 imply that V 171 > 0, P(sup IGn(biB/n)/G(biB/n) — 1| 2 171) S 2 VaT(Gn(b‘B/n))/(niG2(biB/n)) 3 OZ 1/(mnfBb‘) = C/(mnfBu — 44)). (2.42) 34 For 0 < a: S y S b3: S 6 with IGn(x)/G(x) — 1| < 771 and IGn(bx)/G(b:c) — 1| < 171, we have, (1 - 771)G(5'=)/G(b93) -1 S Gn($)/G(b$) - 1 S Gn(y)/G(y) — 1 < Guam/0(4) — 1 g G(ba:)/G(:c)(1 + m) — 1.(2.43) The strictly increasing property of G and Dini theorem imply that V 17 > 0, one can choose 771 > 0 and b > 1 sufficiently small such that G(b‘B/n) G(b‘+lB/n) 034320 G(bi+lB/n)(l _ ’71)‘ llv I G(b‘B/n) (1+ 771) — 1 < 77. (2.44) Now let _ Gn(b‘B/n) Gn(b‘+lB/n) An, — {09323640 G(b‘B/n) — 1. < "laosflslgg-1 C(b‘HB/n) —1 < Th . Then on An, (2.43) and (2.44) imply that sup lGn(u)/G(u) - 1| = @3350 321i)- lGn(u)/G(U) - 1| B/n 0) Rn(biB/n) — rn(b‘B/n) P (“3" C(biB/n) Z ’7‘) S 2: Var(R,, Rn(b’B/n) — rn(b'B/n))/(17 302(b‘B/n)) S 021“ mniBb‘) = C/(mniB(1 - b"‘)), (2.47) and (2.30) and (2.35) yield that P (sup |B.,,(b"B/n,bi+1B/n)/G(b£B/n) — B(biB/n,bi+lB/n)/G(biB/n)| 2 171) S 2 Var(I~Z,, Rn(b‘B/n, b'+1B/n))/(nsz(biB/n)) < C(b —1)/ B): 1/b’= Cb/(nfsz). (2.48) By (2.30) and (2.33), sup |B(b£B/n,bi+lB/n)/G(biB/n)l S C(biB(b —1)/n)/(mb‘B/n) S C(b-1)/m. ' (2.49) Thus for any 6 > 0, n > 0, one can choose 171 > 0 and b > 1 sufficiently small such that 771 + C (b — 1) / m < 17 and then choose sufficiently large B such that 20 v 2b mn?(1-b“)€ m’nié' Then, by choosing no = [8/6] + l, (2.46)-(2.49) imply that for any n 2 no, B> Rn(u) — MU) ) P su > < e. 2.50 (La/«I355 C(11) _ ’7 ( ) This completes the proof of Proposition 2.4.1. C] Now let fléi + a + flXi—l) f(€:') ’ where a = bo—ao, [3 = b1 —al. The functions p, p1 and 122 are defined correspondingly. 
J(X,'_1, 6,) E 1,0(X5_1, £5) E- in For u 2 0, define éw {—1,CI(T 0 and 17 > 0, there is a constant B < 00 such that V 0 < 6 <1 and V n 2 [B/6]+1, 1),,(u) — d..(u) 0(a) P( sup B/n 1— e, (2.51) Proof. The continuity of p1, p2 can be derived from the conditions (C1) and (C2) (See Appendix). Thus (2.51) follows (2.41) immediately. [I] Before proving Theorem 2.4.1, we need some more notation. Write f(Xi — hs(Xi-la t)) f(€4) 1,6(X,_1,e,-,t,s) = 1n , t E R4, 3 6 R, where h, is defined in (2.16). Let {(Xi—laéiatss) = 2Z'(‘Xi—la€iatas) - 123(Xi-196i9t7r)3 t E R4) 8 E R, 1323 Tl. Then,for1SiSn,tER4, 36R, 64-1.43») s gas-434,3» = -cp(X.- — h.(X.-_1, t))h,(X,-_1) - w(Xe - h..(X,-_1, t))h,(X.-_1) = — [w(x. — h.(X.-_1. t» — w(x. — h.(X.--1,t))] (1.06-1) — 0, there is a B > 0 and 'y > 0, 1 > 6 > 0 and no such that for any ”27107 sup P ( [ln(191,3)-ln(191,r)] B/n<|s—r|gs,i9en(6) GUS - Tl) < —7) > 1 — 25. (2.52) 37 For any 19 = (9?,3)T, denote ln,(191) = ln(191,s) = 1,,(19). Now, decompose ln(191,s)-— 1,,(191, 1') into two terms as follows: [11(1913 3) — [no.9], 7') = [Ins(19l) — l'nr('191) — (Ins(01) _ lnr(01))] 'l" [lns(91) _ lnr(01)] E 1,1,(19) + [g(s), (say). We shall prove that there exists a 6 small enough such that 1.1.09) SUP -— 2 013(1), (253) B/"<|3-TIS6.1960(6) GUS - r|) and 12(3) ) P su _n__<_2 >1“' 2.54 (B/fl r only and write 5 = r + u for some a > 0. For the case 3 < r, the proof will be exactly the same. To prove (2.53), by using the absolute continuity of 2b, Til-Z [C(X,‘_1,C{,’I91,S) — C(Xi—laéi,91,5)l = 1);]; tux-4.6.40. + M. - 91),.)(4. — 0.) dv. (2.55) n 1.1.(19) By the Lip(l) of cp, (2.17), (2.18), (2.9), (2.10) and (2.11) imply that there exists a constant L, for 1 S i S n, t E R4, 3 E R, |5(X'-1,€i,ta3)| L h,(X.-_1,t) — h,(X.-_1,t)((/l:—X_E_—l +[| 0 sufficiently small. 38 To prove (2.54), recall that a = bo — ao,fl = b1 — a1 and let J = 1/2 in the definition of p in (2.29). 
Then 1,2,(3) can be decomposed as follows: [2(3):l2(r+u)= %Z¢(X'-1’€ )I(r 0 such that 7 = [—p(a + fir) — 17(2 + |p(a + ,Gr)|)]/2 > 0. Note that sup En(u) B/n 0. Let 1 Dn(u)—d,,(u) En(u) Cu (14)) _ 0(a) 0(a) G 0 sufficiently small, P(£) 2 P(.A) 2 P(B n c n o) > 1 — e. (2.59) This completes the proof of (2.54). Thus, V e > 0, there exists '7 > 0 and 6 > 0 sufficiently small such that with (2.53) and (2.54), P( sup [171(191, 3) —ln(191,r)] < _7) g<|s—r|gs,i9en(6) GUS — r|) ll('5’) 12(3) 2 P sup —"——— < 7, sup --—'-‘—— < —27 (%<|a-r|56,t9en(6) G(|s ‘ Tl) gas—r56 GUS - 7'l) 13.09) _>_ — P su -——" Z —2 7) (gds-IZISIS Gas 7") 7) > — _— _ 1 P( sup G([s—r|) €<|a—r|gs,1960(6) > 1 — 26. This ends the proof of (2.52) and hence of Theorem 2.4.1. D Chapter 3 Limiting distribution of @177, We now consider the limiting distribution of 91,1. Recall that f(€,‘ + h(X'._1, 0) — h(X,'_1, 19)) f (G) 2,1)(X;_1, 6,319) = 1n , 19 E Q, and the log likelihood ratio function is 1 [11(19): 5 Z¢(Xi—la€iat9)a ‘9 E Q In the definition of the MLE 3", the first four components of the parameter point in Q is treated separately from the last component and we have proved that fin is n—consistent. Thus, we need some results of 191n(3) uniformly in s in the interval [r — B / n, r + B / n] for some B, 0 < B < 00 which is given in the following Theorem 3.1.1. 3.1 Uniform consistency Theorem 3.1.1 Suppose that (C1) and (02) hold. For any 0 < B < oo, sup |191n(s) — 01| = 0p(1). ls-rlSB/n First of all, we need an analogue of the Lemma 2.3.1. Recall that 19 = (19f,s)T and In,(191) = ("(191,s). Now write ¢‘(Xi‘1’ 65’ 191) = “Xi—1,64, 1913 8). 40 41 Let n > 0, define (119,07) = {191' € K= |19i-191I < 77}- Lemma 3.1.1 Under the conditions (CI) and {C2}, for any 191 E. K and its neigh- borhood U191(17) in K, E sup |2,Z2,(X0, 61,19?) — 2,1),(Xo, £1,191)| —> 0, as n —> 0. (3.1) 86R.19;€U191(n) Proof. Let 53(Xo,19'{) = h,(Xo,I9'f) — h,(Xo,191). Observe that by (2.17) on (119,07). l5.(Xo,I9I)| S I19? 
- 191|\/1+ X3 S n\/1+ Xci- (3-2) By the absolute continuity of In f, which follows from (C1), in/l-{JK'2 l¢8(X0, 61’ 19”" ¢8(X01 €1,191“ S 77 1+); l90(€1(19)+ v)l dv' (3'3) ‘ o The condition (C2), EXg < 00 and (2.12) imply that E sup cp2(61(19))) [E(1+ X3)]‘/2 + LE(1+ X3772 36 ' -> 0. awn—>0, thereby completing the proof of Lemma 3.1.1. C] Proof of Theorem 3.1.1. The following argument is similar to the one used in the proof of Theorem 2.3.1, except here we deal with the component 191 of 19 for all s E ”R. Let a,(191) = a(191,s). By the definition of the function a, for any open neighborhood V1 of 01 in K and any 191 6 V1", 3 E R, 0(19) < 0. Given V1 C K, by the continuity of a and compactness of (K \ V1) x 72, there exists a 1901 E (K \ V1) X R, such that sup 0(0) = 0(1901) < 0. 196(K\v,)xk 42 Let 601 = —a(1901)/3 > 0. By using Lemma 3.1.1, there exists an 1701 > 0, E5119 SUP ¢3(Xo, 61119;) S E¢3(Xo,€1,191)+501 S 0(901)+501 = —2501- (3-5) 3672 19iEUfl (7701) 1 By the compactness of K \ V1 again, there exists a finite number M1 of U19 (7701), 1: 191,- 6 K\V1, j = 1,2,...,M1 such that U?" (1191:0701): K\ V1. Then by the ergodic theorem and (3.5), there exists an n1 such that for any n 2 n1, 1 _<_ j 3 M1, 1 sup sup [w(fl'f) S — Z sup sup ¢,(X,-_1 , 6,, 19;) SE72 19:6(119110701) n 3612 flieuglj (’701) S E SUI} 3UP ¢3(Xoa €17 19;) + 501 36R 19:6(119 _(001) 11 S _6013 0.8., which implies that sup sup ln,(19; )_ < sup max sup ln8(19'{) S —601, a.s. (3.6) .612 fliemv. oeRl<3 sup In,(01) (3.7) :67; 0161/1 86R Taylor’s expansion of In f at 6.- yields that 3 7, M < 1, such that, In. (91) — $290M + 7( h,( X14491) — h,(X-_1,01)))[h.(X.-_1,91) — MOB-1191)]- Then the Lip(l) condition of cp and (2.19) imply that 1m(91) S 3;: [l‘P(Ci)l + L|h,(X.-_1,01) — h.(X.-_1,01)l] x [Ian/1+ X?_11(IX.-_1— r1 3 Is — ml i: [.,.(.,)l + Hallm] 19.1mm.-. 
- r1 3 Is — r|).(3.8) Therefore, for any B, 0 < B < 00, (3.8) yields that SUP llns(01)l Is—rISB/n l S E Z [W(Cill + L|91|v1+ X124 |01|VI+X,?_11(|X.-_1 — 1‘l S B/") 1 s ; Shag-11+ Ll01lx/1+(lrl+ B/nv] I91| x\/1+(|r|+ B/n)21(|X,_1— r| _<_ B/n). 43 Take expectation both sides to obtain E( sup lln,(01)|)=0(n-l). Ia—rISB/n Thus, for any 6 > 0, there exists n2, such that as n 2 122, V Is — r| S 8/12, P(ls—rilréfB/n ("3(01) > —(So) > 1 — C. By (3.6), (3.7) and the above, there exists a no = n1 V n; such that V n 2 no, P(sup sup In,(t9'{) < inf sup In,(191)) > 1 — e. (3.9) 3672 19;€K\V1 IS-rISB/n 6161/1 Let A, = sup sup In,(t9') < inf sup In,(191) . {36R 19ieK\V1 l ""58” 016V: } Then, on Ac, 191,,(8) 6 V1, V Is —r| S B/n, V n 2 no. Thus, by the arbitrary of V1, sup |191n(s) — 01| :2 0p(1). l-9--"|SB/n This ends the proof of Theorem 3.1.1. D 3.2 Asymptotic normality of 91,, Before stating the next theorem, recall that In,(191) = 1,,(191, s), and for a: E R, h,(:c) = 6—109:(h’($’191)):(1($ S s),:rI(.7: S s), [(33 > s),a:I(a: > 3))T. Let uish’l) = “@(Xz‘ _ hs(Xi—la01))izs(Xi—l)a 1 S 2 S ”- Denote A,(:r) = h,(:z:)(h,(a:))T. 44 Lemma 3.2.1 Suppose that O < I(f) < 00 and EXg < 00, then (if 2:21.401) => MO. I‘). (3.10) where F = E (P2(€1)Ar(X0)- Proof. Note that with .75} = o{Xj,j S 2'}, E0 [uer(91)l.7".'_1] = E9 {-90(€i)ilr(Xi-1)|F—1] = hr(Xi-1)E9(—90(€i)) = 0, a- 3- Therefore, for any vector v E ”R“, by the finite Fisher information of f, vT Z u,,.(01) is a zero mean square integrable martingale. By ergodic theorem, £2 vTiE luir(91)("ir(01))Tl'7:‘-1]}v = Jun-,1; Z AJXi—I)” ——1 vTI‘v, a.s. Thus, the martingale central limiting theorem of Hall and Hedye (1980) shows that the sum 71'”2 ZvTugrwl) converges weakly to the normal distribution with mean zero and variance vTFv for all u E R4. Thus, 12'”2 Z: u;,.(01) converges weakly to the multivariate normal distribution with mean vector zero and covariance matrix F. 0 Theorem 3.2.1 Suppose that conditions {CU-(C4) hold. 
Then for any B, 0 < B < m} |3_§;1 N(0, F—l). (3.11) As a consequence, for any B, 0 < B < 00, 811p lx/77(191n(8) - 01)| = 0P(1)- (3-12) Is-rIsB/n Proof. Note that for any 3 E 77,, a '56:(1ns(191)) = Z uia(‘91)- 45 Consider the Taylor’s expansion of (3/8191)(lm(191)) at 01: 0 8191 where 9;,” = 01 + 71(191 — 91), '71 is a function of (X0, ..,X,,,91,s), I71| < 1 and 8 J (t) = Z 5211,,(t) = Zfix. — h,(X,-_1,t))A,(X.-_1), t e 124. Then, the definition of 191,,(3) and (3.13) yield that 1 Jn8(0‘ns) 7.7; ZU,‘3(01)+ —71——- fi(61n(3) — 91) = 0. (3.14) For any matrix M = (mij), define ”M” = Z: Imgjl. Then for any finite number of ——(z.., 191) 2211;,(01) + J...(o 1',,,)(191 — 01), (3.13) matrices {M5} and finite number of real numbers a;, we have ”Em-M." _<_ 2 laelllMill- (3.15) By the definition of A3, ||A,(a:)|| = (1+ |x|)2, for all s E ’R and a: 6 'R. (3.16) First, we prove the following: For any B, 0 < B < oo, J_,,____,(0" n +,(_ — 010(1) (3.17) sup Is—rISB/n For any s, such that Is — r| S B/n, «In 40:...) = Z‘MX: —h(X.-_1,0;,,,))A,(X.-_1) = Zl‘MXi-hs X.-_1, 9in.))- ‘P’X( X"h ,(X.-_1,01))]A,(X.~_1) + )2 [so’(X.- — MUG-1191)) — so'(e.-)]A.(X.--1) + Z $0'(€e)As(Xi-1) a J1,,(0;,,,)+ J2... + J3... (say). (3.18) For the first term in (3.18), the Lip(l) of cp’, (3.15), (3.16) , (2.16) and (2.17) imply that 11.11.3013)” 3 Zlcp’Xe—h.Xe-1,9'{...))- 0, as Thus, sup Is-rISB/n Therefore, (3.20)-(3.23) and Markov inequality yield (3.17). J as I :2 + 1“" = 012(1). (3.23) Next, we are going to show that for any B, 0 < B < oo, = 0(n'1/2). (3.24) E [ sup ls-v'ISB/n % Dam.) - u..(01)] To prove (3.24), 2341491) - Uta-(91)] = Zl— (EXoI(Xo S r))2, 48 and EI(X0 > 7‘)EX31(X0 > 7‘) > (EX01(X0 > T'))2. Thus EA,(X0) is positive definite matrix and so is I‘. Denote PM = —Jn,(01m)/n By (3.17) and the positive definiteness of P, I‘m 1S positive definite eventually for every 3, Is — r| S B / n, and sup |det(l‘ns) — det(I‘)| = 0p(1). (3.27) Is—rISB/n By the Cramer’s rule, — dj(rns) I‘m 1 = a . 
( ) det(Fm) Then the continuity property of the adj(I‘n,) in the components of I‘M and (3.15) yield that -1 a_d___j(r..)_adj(1‘) Fifi/J”; P H“ [IT—ea (r...) det(I‘) H adj(I‘n,) - adj(I‘) . det(Fn,)— det(F) < d I‘ - l dear...) )II + "a ’( )" det(I‘n,)det(I‘ )="I That is, sup "F; l — F 1H = 0p(1). (3.28) Is—rISB/n It follows from (3.14), for any 3, Is — r| S B/n, «H(Ms) - o.) + Flu-”2 2:11.49.) = —;;—(r F'F)(n-1/22ug,(01) -n-1/2Zu.~,(01)) —r l(11-1/2):tt..,(111)—n-1/2§:u.-,.(31)) —;r ,1— r- F)n'1/2Zu,-r(01). (3.29) Thus, by (3.24), (3.28) and Lemma 3.2.1, SUP IVE/(9111(3) - 91) + P—ln-lfl Zair(91)| = 0P(1)- Ia-rISB/n Again by Lemma 3.2.1, Theorem 3.2.1 is proved. D As a corollary of Theorem 3.1.1 and 3.2.1, we have the following uniform conver- gence rate of 191,,(-). 49 Theorem 3.2.2 Suppose that (CU-(C4) hold, then for any B, 0 < B < oo, SUP “9171(3) _ I91n(")l : 0P(n_1/2)- Is—rISB/n Proof. Consider the Taylor’s expansions of (d/d191)(ln,(191) and (d/dt91)(lm.(191) at 01 and evaluate them at 191.1(3) and 191,,(1'), respectively. We have 0 = 295.491) + Jns(0ina)(‘9ln(3) '— 91) = Zita-(01) + Jnr(9;nr)(19ln(r) — 01). Hence, Y}? $121.30.) — was] = M16135 — 191.001 + [rm — r] \/1—1(191n(s) — 191)- [1“... — r]./?1(19.,.(r) — 91). Therefore, Theorem 3.2.1, (3.17) and (3.24) imply that sup ls-rISB/n Mame) — «91m» = can). This completes the proof. [I] Chapter 4 Some asymptotic results on log-likelihood process In this chapter, we discuss some asymptotic results for a sequence of normalized profile log-likelihood processes. It is expected that these results will be useful in obtaining the limiting distribution of the standardized maximum likelihood estimator of the threshold parameter. Recall that Xi—h ,(X;_1,191)) T s T [(191,s)-1—nf21 f(61) ,(19,, ) 652. For 2 E R, a sequence of normalized profile log-likelihood processes is ln(z) = —2n[ln(191n(r + z/n), r + z/n) — ln(191n(r), 1)]. 
Observe that in view of Theorem 3.1.1, 191,,(1' + z/n) is an approximation of 01 uniformly in 2 over bounded sets. Thus a natural candidate for the approximation of ln is in defined as follows: For 2 6 R, - hr+z/n(Xi—l 1 01)) . f(€1') 1.(2 )=-—-2n[l (01,r+z/n)-l( (01,1) ]nf=—2Zl . (4.1) 4.1 An approximation 1,, of the normalized profile log-likelihood process ln 50 51 Theorem 4.1.1 Suppose that (CU-(C4) hold. Then for any B, 0 < B < oo, 1.(z)—1”,.(z)( = 0,.(1). sup IZISB Proof. Without loss of generality, assume 7' = 0. Decompose the concerned process in the following way, _g. [ln(z) — 13(2)] = —% [ln(z) — 211. “Xi “ h./.(X.--.,a.))] f(€£) = Z [In “Xi — hz/n(X.-_1,191n(z/n))) _ 1n f(X1' - hz/n(X1—1,91)) f(X1‘ — h0(Xi-1191n(Z/n))) f(X1- (1009—1191)) f(X1- h0(Xi-1191n(Z/n))) + 2‘“ f(X.- - lam—1.19.30») 13.12) +1332), (say). It suffices to show that V B < oo, SUP |l1n(z)| = 010(1), (42) IZISB and sup |l2n(z)| = 0p(1). (4.3) IZISB Actually, we shall prove a slightly stronger result than (4.2). To state this stronger result, recall that for 1 S i S n, 61(191,S) 2 61(19): X; — h,(X.'_1,‘l91), (I9?,S)T E Q. Denote _ f(€1(t12/n)) . . a . pnz,-(t)—ln f(€.'(t,0)) .R R, lSzSn. Then, for any i, 1 S i S n, ani(t) = -1) — item—1)] «2(a) [Mm-1) — 532131)]. ‘44) 52 Note that 9171(2) = lenzi(‘91n(Z/n)) _ Pnzi(01)l- From (3.12), SUPlzISB |\/n(191n(z/n) - 01)| = Op(1). The stronger result that will be proved is that for any 0 < C < 00, E{ sup I: [11am - p...(01)1|} = 001-1”). (4.5) IzISBn/vYIt-91ISC Denote 913(t) = 01 + s(t — 01) and recall that (,0 = f’/f. By the absolute continuity of In f, [01 15.10.11» (t - 01) ds 5 3112(t) ‘l’ 32iz(t) + 3331(9), (say), pnzi (t) _ pnzl(61) where, by (4.4), slam = f baa-(01.0). z/n» + 103101.31. 0))1 13,323-33 — 91) «Is, 82120) = [01 [—cp(6.-(91.(t),0)) + 9(6)] [fig/“(X14 - 90(X1-1)]T (t - 91) 931 s...(t) = — P if and only if Pnr'l => Pr'l on D([sk,tk]) for all k 5ktk 5ktk and some sequence {[sk,tk], k 2 1} with Uiidshtk] = [0, oo). 
-1 Oath Corollary 4.2.1 Pn is relatively compact if and only if Pnr is relatively compact for all k and some sequence [shtk] such that UE‘3__1[sk,tk] = [0, oo). 56 Thus, take [shtk] = [k,k + 1], it is enough to work on D[k,k + 1], for every 11: Z 0. That is, for every 45 6 D[0, 00), define fl§°(¢) = sup [min{|¢(U') - ¢(u)|s |¢(U") - ¢(u)|}l kSu—bsu’SuSu”Su+65k-{-l +1 SUP” I44") - ¢(k)| + 8UP WU) - (WV +1)|s k 2. 0- _<_u5k k+l-6SuSk+l Theorem 4.2.2 Suppose that f is continuous, positive everywhere and bounded on R, then ({ln(-z),z Z 0}, {ln(z),z Z 0}) is tight. That is, V k 2 0, V e > 0, 15111531111 P(flflln) > c) = o. (4.12) Proof. We shall only show the tightness of {ln(z), z 2 0}, since the proof is the same for {ln(—z), z 2 0}. The following argument is similar to the proof of Lemma 3.2 in Ibragimov and Has’minski (1981, pp. 261). Let A.- = A;(u,u + 5] be the event that a trajectory of in possesses at least i discontinuities on the interval (u,u + 6]. We shall prove the following inequalities: P(Al) g 05, P(Ag) s 052. (4.13) A trajectory of in has at least one discontinuity on (u,u + 6] only if at least one X14 6 (r + u/n, r + (u + 15)/n]. Denote C.- = {X.-_1 6 (r + u/n, r + (u + 6)/n]}, then by the boundedness of f and Remark 2, P(Al) g 2 P(C.) g 06. (4.14) A trajectory of in has at least two discontinuities on (u, 11 +5] only if at least one pair of (X;_1,X,-_1) 6 (r + u/n,r + (u + 6)/n]2, i 71 j. Hence, by Proposition 4.2.1 and the stationarity of {X1}, P(Az) S ZP(X.-_1 E (r+u/n,r+(u+6)/n],X,-_1 E (r+u/n,r+(u+6)/n]) S 062. #9 This and (4.14) prove (4.13). Now let B be the event that on the interval [11,]: + 1], there exists at least two points of discontinuities of in such that the distance between them is less than 25. 57 Let us divide the interval [19, 11+ 1] into m = [6“] subintervals 6.- of length m‘l. Each interval with length less than 26 is totally contained in 6,- U (6.4.1 U 6,4,2). Therefore, n1 1n—2 B C U 142(65) U U A2(5;+1 U 6.4.2). 
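Hence,

\[ P(B) \le \sum_{i=1}^{m} P(A_2(\delta_i)) + \sum_{i=1}^{m-2} P(A_2(\delta_{i+1} \cup \delta_{i+2})) \le C m \delta^2 \le C\delta. \tag{4.15} \]

Furthermore, as long as the event $B$ does not occur (i.e., the complement of $B$, say $B^c$, occurs), any interval of the form $[u - \delta, u + \delta]$ possesses at most one point of discontinuity of $\tilde l_n$, so this function is continuous on either $[u, u + \delta]$ or $[u - \delta, u]$. For example, suppose that $\tilde l_n$ is continuous on $[u, u + \delta]$. Then $\tilde l_n$ has no jump on $[u, u + \delta]$. Note that $\tilde l_n$ is a step function, so $\tilde l_n$ is constant on $[u, u + \delta]$, i.e.

\[ \sup_{u \le u'' \le u + \delta} |\tilde l_n(u) - \tilde l_n(u'')| = 0. \]

Finally, on $B^c$, there is at most one discontinuity point of $\tilde l_n$ and, for every $\epsilon > 0$,

\[ \Big\{ \sup_{k \le u \le k + \delta} |\tilde l_n(u) - \tilde l_n(k)| > \epsilon/2 \Big\} \cap B^c \subset A_1(r + k/n,\ r + (k + \delta)/n]; \]

thus, by (4.13),

\[ P\Big( \Big\{ \sup_{k \le u \le k + \delta} |\tilde l_n(u) - \tilde l_n(k)| > \epsilon/2 \Big\} \cap B^c \Big) \le P(A_1(r + k/n,\ r + (k + \delta)/n]) \le C\delta, \tag{4.16} \]

and

\[ P\Big( \Big\{ \sup_{k + 1 - \delta \le u \le k + 1} |\tilde l_n(u) - \tilde l_n(k + 1)| > \epsilon/2 \Big\} \cap B^c \Big) \le P(A_1(r + (k + 1 - \delta)/n,\ r + (k + 1)/n]) \le C\delta. \tag{4.17} \]

Therefore, by (4.15)-(4.17), for every $\delta > 0$,

\[ P(\Delta_\delta^k(\tilde l_n) > \epsilon) \le P(B) + P(B^c \cap \{\Delta_\delta^k(\tilde l_n) > \epsilon\}) \]
\[ \le C\delta + P\Big( \Big\{ \sup_{k \le u \le k + \delta} |\tilde l_n(u) - \tilde l_n(k)| > \epsilon/2 \Big\} \cap B^c \Big) + P\Big( \Big\{ \sup_{k + 1 - \delta \le u \le k + 1} |\tilde l_n(u) - \tilde l_n(k + 1)| > \epsilon/2 \Big\} \cap B^c \Big) \le C\delta. \tag{4.18} \]

It follows from (4.18) that, for all $k \ge 0$ and all $\epsilon > 0$,

\[ \lim_{\delta \to 0} \limsup_{n} P(\Delta_\delta^k(\tilde l_n) > \epsilon) = 0. \]

Therefore, $\{\tilde l_n(z), z \ge 0\}$ is tight. □

4.3 Some problems for future research

Note that for each $n$, the process $\tilde l_n$ is a jump process with a finite number of possible jumps at $nX_{i-1}$, $i = 1, \ldots, n$. It is thus reasonable to expect that the limiting process of $\tilde l_n$ will asymptotically behave like a compound Poisson process with rate $g_\theta(r)$ whose left and right jump distributions are given by the conditional distribution of $\zeta_1 = -2\ln[f(\epsilon_1 - \alpha - \beta X_0)/f(\epsilon_1)]$ given $X_0 = r^-$ and the conditional distribution of $\zeta_2 = -2\ln[f(\epsilon_1 + \alpha + \beta X_0)/f(\epsilon_1)]$ given $X_0 = r^+$, respectively. The former conditional distribution is the limiting conditional distribution of $\zeta_1$ given $r - \delta_1 < X_0 \le r - \delta_2$ and the latter is that of $\zeta_2$ given $r + \delta_1 < X_0 \le r + \delta_2$, as $\delta_1 \downarrow 0$, $\delta_2 \downarrow 0$ and $\delta_1 \ge 0$, $\delta_2 > 0$. I am presently working on the above problem.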
After obtaining the limiting distribution of $\tilde l_n$, it will be easy to obtain some inference on the limiting distribution, which will be related to the compound Poisson process, of the standardized maximum likelihood estimator of the threshold parameter $r$.

Appendix A

Lemma A.0.1 Suppose that $p_1(y) = \int |\ln[f(x + y)/f(x)]|\, dF(x) < \infty$ for all $y \in \mathbb{R}$. Then (i) and (ii) below are equivalent.

(i) $p_1(y)$ is continuous at $\alpha + \beta r$,
(ii) $\int |\ln[f(x + \alpha + \beta y)/f(x)] - \ln[f(x + \alpha + \beta r)/f(x)]|\, dF(x)$, as a function of $y$, is continuous at $r$.

Proof. Suppose that (i) holds and $y_n \to r$. Let

\[ g_n(x) = \ln \frac{f(x + \alpha + \beta y_n)}{f(x)}, \quad g(x) = \ln \frac{f(x + \alpha + \beta r)}{f(x)}. \]

Then the continuity of $f$ implies that $g_n \to g$ pointwise, which implies

\[ g_n^+ \to g^+, \quad g_n^- \to g^-. \tag{A.1} \]

Thus, by (i),

\[ \int |g_n|\, dF \to \int |g|\, dF. \tag{A.2} \]

Combining (A.1) and (A.2) implies that

\[ \int g_n^\pm\, dF \to \int g^\pm\, dF. \tag{A.3} \]

Since $0 \le (g^\pm - g_n^\pm)^+ \le g^\pm$, $(g^\pm - g_n^\pm)^+ \to 0$ and $\int g^\pm\, dF \le \int |g|\, dF < \infty$, by the dominated convergence theorem,

\[ \int (g^\pm - g_n^\pm)^+\, dF \to 0. \tag{A.4} \]

The result (A.4) and $\int (g^\pm - g_n^\pm)\, dF \to 0$ imply that

\[ \int (g^\pm - g_n^\pm)^-\, dF \to 0. \]

Thus,

\[ \int |g^\pm - g_n^\pm|\, dF \to 0, \tag{A.5} \]

which implies

\[ \int |g_n - g|\, dF \to 0. \]

The fact that (ii) implies (i) is obvious. □

Lemma A.0.2 Let $p(y) = \int \ln[f(x + y)/f(x)]\, dF(x)$. Then the continuity of $p_2$ implies the continuity of $p_1$ and $p$.

Proof. Let $y_n \to r$ and denote

\[ h_n(x) = \ln \frac{f(x + \alpha + \beta y_n)}{f(x)}, \quad h(x) = \ln \frac{f(x + \alpha + \beta r)}{f(x)}. \]

Then

\[ h_n \to h, \quad \int h_n^2\, dF \to \int h^2\, dF. \tag{A.6} \]

A convergence theorem in Hájek-Šidák (1967, p. 154) and (A.6) imply that

\[ \int |h_n - h|^2\, dF \to 0. \tag{A.7} \]

By the Cauchy-Schwarz inequality and Lemma A.0.1, (A.7) implies the continuity of $p_1$. The continuity of $p$ follows from (ii) of Lemma A.0.1. □

Lemma A.0.3 The conditions (C1) and (C2) imply the continuity of $p_2(y) = E\{\ln[f(\epsilon_1 + y)/f(\epsilon_1)]\}^2$ on $\mathbb{R}$.

Proof. For any $x$ and $y$ in $\mathbb{R}$, {<———)<—)}| 11::11 (nmnnwl f(€1 + y) f(€1) f(€1) S 2 la: - yl (El