This is to certify that the dissertation entitled Modified Cox Tests for Time Series and Panel Data, presented by Donggeun Kim, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Economics.

Date: May 17, 2002

Modified Cox Tests for Time Series and Panel Data

By Donggeun Kim

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Economics

2002

Abstract

Testing nonnested models from two different families has long been one of the main interests of econometricians. However, the computational difficulties of nonnested testing have restricted its application to rather simple linear or nonlinear regression models. This dissertation proposes a new approach, based upon the specification of the conditional mean and the conditional variance, that resolves these computational difficulties and extends nonnested testing to more complicated cases, including time series and dynamic panel data. The first chapter proposes a modified Cox test under normality, examines its application to two different nonlinear error equation models with three different time series data sets, and performs Monte Carlo experiments to investigate the potential applicability of the proposed test.
Chapter two extends the test's applicability to nonnormal data and develops a robust modified Cox test. Chapter three presents an application to nonlinear dynamic panel data models using U.S. patents and R&D expenditures data.

To my parents

ACKNOWLEDGEMENTS

During my time at MSU, I have been helped and supported by many good people. I thank you all. I would like to thank my committee members, Richard Baillie, Anna-Maria Herrera, Christine Amsler, and Ernest Betts, for their suggestions and support. I must express my fullest respect to my mentor, Jeffrey M. Wooldridge. This dissertation could not have been started without his insightful guidance, invaluable advice, constant encouragement, and extreme patience. I would like to thank Ernest Betts for his support. I was able to learn many important things while working for him as a tutorial coordinator. I would like to thank my fellow Korean economics students, especially my peer Byungrae Choi and his family, for their affection and special friendship. I would also like to thank my pastor, Rick Erickson, for his steadfast support and prayer. I must thank my parents for their endless love and unlimited support. Without them, I would not be here today. Even though this dissertation could not have been written without the help and support of these special people, any remaining errors are my own responsibility.

Contents

1 A Modified Cox Test for Dynamic Models of Conditional Means and Variances
1.1 Introduction
1.2 A Modified Cox Test
1.3 Empirical Application
1.4 Simulation Experiments
1.5 Conclusions
2 A Robust Version of the Modified Cox Test
2.1 Introduction
2.2 A Robust Modified Cox Test
2.3 Monte Carlo Experiments
2.4 Empirical Application under Nonnormality
2.5 Conclusions
3 An Application of a Quasi-Modified Cox Test to Nonlinear Panel Data Models
3.1 Introduction
3.2 Two Competing Count Panel Data Models with the Unobserved Effects
3.3 An Empirical Application
3.4 Conclusion
Modified Cox test
Regularity Conditions
Bibliography

List of Figures

Figure 1.1 Cox test values of T1 when GARCH(1,1) is true
Figure 1.2 Cox test values of T2 when Bilinear(1,1) is true
Figure 2.1 Robust Cox Test when GARCH(1,1) is true with chi-dist
Figure 2.2 Nonrobust Cox Test when GARCH(1,1) is true with chi-dist
Figure 2.3 Robust Cox Test when Bilinear(1,1) is true with chi-dist
Figure 2.4 Nonrobust Cox Test when Bilinear(1,1) is true with chi-dist
Figure 2.5 Robust Cox Test when GARCH(1,1) is true with t-5 dist
Figure 2.6 Nonrobust Cox Test when GARCH(1,1) is true with t-5 dist
Figure 2.7 Robust Cox Test when Bilinear(1,1) is true with t-5 dist
Figure 2.8 Nonrobust Cox Test when Bilinear(1,1) is true with t-5 dist
Figure 2.9 Robust Cox Test when GARCH(1,1) is true with t-10 dist
Figure 2.10 Nonrobust Cox Test when GARCH(1,1) is true with t-10 dist
Figure 2.11 Robust Cox Test when Bilinear(1,1) is true with t-10 dist
Figure 2.12 Nonrobust Cox Test when Bilinear(1,1) is true with t-10 dist
Figure 2.13 Robust Cox Test when GARCH(1,1) is true with chi-dist
Figure 2.14 Robust Cox Test when Bilinear(1,1) is true with chi-dist
Figure 2.15 Robust Cox Test when Bilinear(1,1) is true with t-5 dist

List of Tables

1.1 Summary statistics: S&P 500
1.2 Summary statistics: British pound
1.3 Summary statistics: IP
1.4 Estimated GARCH Models: S&P 500
1.5 Estimated GARCH Models: British Pound
1.6 Estimated GARCH Models: IP
1.7 Estimated Bilinear Models: S&P 500
1.8 Estimated Bilinear Models: British Pound
1.9 Estimated Bilinear Models: IP
1.10 Test results: H0: GARCH vs. H1: Bilinear
1.11 Test results: H0: Bilinear vs. H1: GARCH
1.12 Simulation results when GARCH(1,1) is true
1.13 Simulation results when Bilinear(1,1) is true
1.14 Simulation results when GARCH(1,1) is true
1.15 Simulation results when Bilinear(1,1) is true
2.1 Robust Cox test results when GARCH(1,1) is true
2.2 Nonrobust Cox test results when GARCH(1,1) is true
2.3 Robust Cox test results when Bilinear(1,1) is true
2.4 Nonrobust Cox test results when Bilinear(1,1) is true
2.5 Robust Cox test results when GARCH(1,1) is true
2.6 Nonrobust Cox test results when GARCH(1,1) is true
2.7 Robust Cox test results when Bilinear(1,1) is true
2.8 Nonrobust Cox test results when Bilinear(1,1) is true
2.9 Robust Cox test results when GARCH(1,1) is true
2.10 Nonrobust Cox test results when GARCH(1,1) is true
2.11 Robust Cox test results when Bilinear(1,1) is true
2.12 Nonrobust Cox test results when Bilinear(1,1) is true
2.13 Robust Cox test results when GARCH(1,1) is true
2.14 Robust Cox test results when Bilinear(1,1) is true
2.15 Robust Cox test results when Bilinear(1,1) is true
2.16 Test results: H0: GARCH vs. H1: Bilinear
2.17 Test results: H0: Bilinear vs. H1: GARCH
2.18 Test results: H0: GARCH vs. H1: Bilinear
2.19 Test results: H0: Bilinear vs. H1: GARCH
3.1 Summary Statistics: the Patents and lnR&D Data
3.2 Estimation Results for the Patents Model: Linear Time Trend
3.3 Estimation Results for the Patents Model: Full Set of Year Dummies
3.4 Estimation Results for the Patents Model: Linear Time Trend
3.5 Estimation Results for the Patents Model: Linear Time Trend Only
3.6 The quasi-modified Cox Test Results
3.7 The quasi-modified Cox Test Results
3.8 The quasi-modified Cox Test Results
3.9 The quasi-modified Cox Test Results

Chapter 1

A Modified Cox Test for Dynamic Models of Conditional Means and Variances

1.1 Introduction

Since Cox (1961, 1962) devised a specification test based upon a modification of the Neyman-Pearson maximum-likelihood ratio, testing nonnested models has been one of the main interests of econometricians. However, the application of the nonnested Cox test has been restricted to rather simple linear or nonlinear regression models, mainly because the derivation of the pseudo-true value in the second part of the Cox test is complicated and, in many cases, intractable. (See, for example, Pesaran and Deaton (1978), Gourieroux, Monfort, and Trognon (1983), and Mizon and Richard (1986).) In general, the quasi-maximum likelihood estimate (QMLE) of a nonlinear model does not have a closed form, so it may not be possible to derive its pseudo-true value analytically or to obtain a finite-sample estimate of that pseudo-true value for the Cox test. To avoid these computational difficulties, some authors developed alternative approaches.
Davidson and MacKinnon (1981) combined the two nonnested models into an artificial nesting model and, to avoid identification problems, replaced the nuisance parameter under the null hypothesis with its estimated value under the alternative hypothesis. For example, suppose there are two competing specifications $M_1$ and $M_2$,

$$M_1: y_t = m_t(X_t,\gamma) + u_t, \quad \text{where } u_t \mid X_t \sim \text{i.i.d.}(0,\sigma^2),\ t = 1,\dots,T \tag{1.1}$$

$$M_2: y_t = \mu_t(X_t,\delta) + v_t, \quad \text{where } v_t \mid X_t \sim \text{i.i.d.}(0,\tau^2),\ t = 1,\dots,T \tag{1.2}$$

Davidson and MacKinnon (1981) then combined these two nonnested models as

$$y_t = (1-\lambda)\,m_t(X_t,\gamma) + \lambda\,\mu_t(X_t,\delta) + w_t, \quad \text{where } w_t \mid X_t \sim \text{i.i.d.}(0,\eta^2),\ t = 1,\dots,T \tag{1.3}$$

They replaced $\delta$ with an OLS estimate, $\hat\delta$, obtained under $M_2$, instead of the pseudo-true value of $\delta$ under $M_1$, and tested whether $\lambda = 0$. The DM test can be written as

$$y_t = m_t(X_t,\gamma_0) + \lambda\big(\mu_t(X_t,\hat\delta) - m_t(X_t,\gamma_0)\big) + w_t \tag{1.4}$$

We can therefore regard the DM test as an omitted-variables test of $\big(\mu_t(X_t,\hat\delta) - m_t(X_t,\hat\gamma)\big)$ in the nonlinear model $y_t = m_t(X_t,\gamma) + u_t$. In the presence of nonnormality, heteroskedasticity, or serial correlation, the test statistic becomes invalid. Wooldridge (1990, 1991) suggested a robust version of the Davidson and MacKinnon test (DM test) obtained by modifying the misspecification indicator of his conditional mean encompassing test (CME test). Under heteroskedasticity, a robust version of the DM test is derived by simply setting the misspecification indicator $\hat\lambda_t \equiv \big(\mu_t(X_t,\hat\delta) - m_t(X_t,\hat\gamma)\big)$ and applying the CME test procedure. For the weighted nonlinear least squares (WNLS) estimator, a robust DM test is obtained by applying the corresponding weighting to the same difference, $\big(\mu_t(X_t,\hat\delta) - m_t(X_t,\hat\gamma)\big)$ (see Wooldridge (1990)).
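As an illustration of the artificial nesting regression in (1.3) and (1.4), the following is a minimal sketch for the special case in which both $m_t$ and $\mu_t$ are linear in their parameters. The function names `ols` and `dm_j_test`, the simulated data, and the use of classical (non-robust) standard errors are assumptions made for this example only; it is not the robust procedure of Wooldridge (1990, 1991).

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients, residuals and the classical covariance matrix."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, resid, cov

def dm_j_test(y, X_null, X_alt):
    """t-statistic on lambda in y = X_null*g + lambda*(X_alt*d_hat) + w."""
    d_hat, _, _ = ols(y, X_alt)            # step 1: estimate the alternative model
    fitted_alt = X_alt @ d_hat             # mu_t(X_t, d_hat)
    X_aug = np.column_stack([X_null, fitted_alt])
    coef, _, cov = ols(y, X_aug)           # step 2: artificial nesting regression
    return coef[-1] / np.sqrt(cov[-1, -1])

# Illustrative data: the series is generated by M2 (regressor z), not M1 (regressor x).
rng = np.random.default_rng(0)
T = 500
x, z = rng.normal(size=T), rng.normal(size=T)
y = 1.0 + 2.0 * z + rng.normal(size=T)
X1 = np.column_stack([np.ones(T), x])      # M1: regressors (1, x)
X2 = np.column_stack([np.ones(T), z])      # M2: regressors (1, z)

t_lambda = dm_j_test(y, X1, X2)            # H0: M1; should reject strongly
t_reverse = dm_j_test(y, X2, X1)           # H0: M2; should not reject
```

Because the test is run in both directions, it can reject both models, neither, or exactly one, which is the usual feature of nonnested testing.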
To test for possible nonzero correlation between the residuals $\hat e_t = y_t - m_t(X_t,\hat\gamma)$ and a particular weighting of the difference in the estimated regression functions, the misspecification indicator is set to the difference $\big(\mu_t(X_t,\hat\delta) - m_t(X_t,\hat\gamma)\big)$ weighted by the estimated variance function for the model under the null and the estimated variance function for the model under the alternative (see Wooldridge (1991)).

On the other hand, Pesaran and Pesaran (1993) offered another way to deal with the computational difficulty of obtaining the pseudo-true value in the Cox test: a method of stochastic simulation. Let $H_f: f(y_t,\alpha \mid X_t)$ and $H_g: g(y_t,\beta \mid X_t)$ be the two nonnested competing models. Then the Cox test (1961, 1962) is based upon

$$T_f = \{L_f(\hat\alpha) - L_g(\hat\beta)\} - E_{\hat\alpha}\{L_f(\hat\alpha) - L_g(\hat\beta)\} \tag{1.5}$$

$$\phantom{T_f} = L_f(\hat\alpha) - L_g(\hat\beta) - C(\hat\alpha,\hat\beta_*), \tag{1.6}$$

where $C(\hat\alpha,\hat\beta_*) = E_{\hat\alpha}\{L_f(\hat\alpha) - L_g(\hat\beta)\}$; $L_f(\hat\alpha) = T^{-1}\sum_{t=1}^{T}\log f(y_t,\hat\alpha \mid X_t)$ and $L_g(\hat\beta) = T^{-1}\sum_{t=1}^{T}\log g(y_t,\hat\beta \mid X_t)$ are the maximized log likelihood functions under $H_f$ and $H_g$ respectively; and $C(\hat\alpha,\hat\beta_*)$ is the unconditional expectation of the log likelihood ratio when the null is correctly specified. To obtain $\hat\beta_*$ by simulation, a $T \times 1$ vector of independent observations of $y_t$ is artificially generated under $H_f$, and the ML estimate of $\beta$ under $H_g$ is then computed from these artificially generated observations. This procedure is replicated $R$ times to obtain

$$\hat\beta_*(R) = \frac{1}{R}\sum_{j=1}^{R}\hat\beta_j \tag{1.7}$$

The same simulation method is then applied to obtain $C(\hat\alpha,\hat\beta_*)$:

$$C_R(\hat\alpha,\hat\beta_*) = \frac{1}{R}\sum_{j=1}^{R}\{L_f(y_j,\hat\alpha) - L_g(y_j,\hat\beta_*)\} \tag{1.8}$$

Even though Pesaran and Pesaran (1993) argue that the estimators obtained by simulation converge to the pseudo-true values consistently and fairly quickly with a relatively small number of replications, the simulation method is not attractive to practitioners. Besides, it is very difficult to use the original Cox test if the given models contain lagged dependent variables, $f_t(y_t,\alpha \mid x_t,y_{t-1},x_{t-1},\dots)$
and $g_t(y_t,\beta \mid x_t,y_{t-1},x_{t-1},\dots)$, because potentially very severe computational difficulties arise from computing the unconditional expectation of the differenced log likelihood functions. Bera and Higgins (1997) presented nonnested Cox test results, using the stochastic simulation method proposed by Pesaran and Pesaran (1993), for two nonlinear equation error models, the autoregressive conditional heteroscedasticity (ARCH) model and the bilinear model, with three time series data sets.

The difficulty in applying the original Cox test to time series and possibly to dynamic panel data is that it requires computing the unconditional expectation of the differenced log likelihood functions when the null is correctly specified. In this paper, we propose a new approach that resolves the computational difficulties of the Cox statistic by using the conditional mean and the conditional variance. Our approach is to compute, for each $t$, the conditional expectation. In some important applications, including ARCH and GARCH models in time series, this leads to substantial simplifications. Another attractive feature of our approach is that it can test other distributional features, because it uses the first two conditional moments, whereas the DM test concerns the conditional mean, $E(y_t \mid x_t)$, only. In section 2, we describe the new modified Cox test procedure; in section 3, we present empirical results with three time series data sets; in section 4, we provide simulation experiments for the modified Cox test; and we draw conclusions in section 5.

1.2 A Modified Cox Test

1.2.1 Motivation and General Concepts

Suppose that there are $T$ independently and identically distributed random variables $y_t$, $t = 1,\dots,T$. Let $f(y_t,\alpha)$ be the probability density function under the null hypothesis, $H_f$, and $g(y_t,\beta)$ the probability density function under the alternative hypothesis, $H_g$, where $\alpha$ and $\beta$ are unknown parameters and $f(y_t,\alpha)$ and $g(y_t,\beta)$ belong to separate families.
If $H_f$ is not nested in $H_g$, and $H_g$ is not nested in $H_f$, then the two hypotheses, $H_f$ and $H_g$, are said to be nonnested. If one model can account for the results from the other model, then the former is said to encompass the latter (see Mizon and Richard (1986) and Hendry and Richard (1990)). This means that a correctly specified model can explain the results of its competing model, and the pseudo-true value is the probability limit of the estimator of the alternative model under the null hypothesis. Thus, the nonnested test statistic devised by Cox (1961, 1962) is an example of an encompassing test. The Cox test, $T_f$, of $H_f$ against $H_g$ is based on

$$T_f = \{L_f(\hat\alpha) - L_g(\hat\beta)\} - E_{\hat\alpha}\{L_f(\hat\alpha) - L_g(\hat\beta)\} \tag{1.9}$$

where $L_f(\hat\alpha)$ and $L_g(\hat\beta)$ are the maximized log likelihood functions under $H_f$ and $H_g$ respectively, and $\hat\alpha$, $\hat\beta$ are the maximum likelihood estimators. The test statistic is based upon the difference between the log likelihood ratio and its estimated expectation under the null hypothesis, $H_f$. If $E_{\hat\alpha}\{L_f(\hat\alpha) - L_g(\hat\beta)\} = 0$, the Cox test statistic reduces to the log likelihood ratio statistic, but in general this term is nonzero under nonnested hypotheses. So the Cox test takes the deviation of the maximized log likelihood ratio from its expected value under the null hypothesis. Under a correctly specified null hypothesis, $T_f$ should be close to zero, while a large deviation from zero constitutes evidence against the null hypothesis. The standardized Cox test statistic, $\sqrt{T}\,T_f/\sqrt{\hat V_f}$, where $\hat V_f$ is a consistent estimator of the asymptotic variance of $\sqrt{T}\,T_f$, is asymptotically distributed as unit normal. White (1982) provided general regularity conditions and established the asymptotic normality of the Cox test statistic. Despite its theoretical refinement, the derivation of the pseudo-true value in $E_{\hat\alpha}\{L_f(\hat\alpha) - L_g(\hat\beta)\}$ is not straightforward and may even be analytically intractable.
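To make the notion of a pseudo-true value concrete, the sketch below estimates one by simulation in the spirit of the averaging step (1.7) described in the introduction. The choice of an exponential null and a lognormal alternative, the function name `simulated_pseudo_true`, and the values of `T` and `R` are illustrative assumptions, not taken from this dissertation. This pair is chosen only because the pseudo-true values happen to have a closed form (if $y$ is exponential with mean $a$, then $E[\log y] = \log a - \gamma$ and $\mathrm{Var}[\log y] = \pi^2/6$, with $\gamma$ the Euler constant), so the simulation can be checked.

```python
import numpy as np

def simulated_pseudo_true(a_hat, T, R, rng):
    """Average the H_g ML estimates over R artificial samples drawn under H_f."""
    draws = np.empty((R, 2))
    for j in range(R):
        y = rng.exponential(scale=a_hat, size=T)   # artificial data under H_f
        log_y = np.log(y)
        draws[j] = log_y.mean(), log_y.var()       # lognormal MLE of (mu, s2)
    return draws.mean(axis=0)

rng = np.random.default_rng(1)
a_hat = 2.0                                        # hypothetical ML estimate under H_f
mu_star, s2_star = simulated_pseudo_true(a_hat, T=500, R=200, rng=rng)

# Closed-form pseudo-true values for this particular pair of models.
mu_exact = np.log(a_hat) - np.euler_gamma
s2_exact = np.pi ** 2 / 6
```

For most pairs of nonlinear models no such closed form exists, which is the difficulty the conditional-moment approach of this chapter is designed to avoid.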
To resolve these computational difficulties, we offer a new approach based upon the conditional mean and the conditional variance. What makes the Cox test difficult to apply is that it requires computing the unconditional expectation of the differenced log likelihood functions, which in many cases is not analytically tractable. The observations are assumed independent in Cox (1961, 1962), which reduces the computational difficulties to some degree but still requires significant computational effort. For this reason, it is very challenging to apply the original Cox test to time series applications that contain lagged dependent variables as explanatory variables. These difficulties can be avoided by computing the conditional expectation for each $t$, which leads to substantial simplifications in some important applications, including ARCH and GARCH models. White (1994) showed the computational simplification of the second part of the Cox test using conditional densities $f_t$ and $g_t$ given $I_{t-1}$, where $I_{t-1}$ is the information set ($\sigma$-algebra generated by $\{x_t, y_{t-1}, x_{t-1}, \dots\}$) available at time $t$. Let $H_f: f_t(x^t,\alpha)$ and $H_g: g_t(x^t,\beta)$ be the two nonnested competing models; then their maximized log likelihood functions are, respectively,

$$L_f(\hat\alpha) = \frac{1}{T}\sum_{t=1}^{T}\log f_t(x^t,\hat\alpha) \tag{1.10}$$

$$L_g(\hat\beta) = \frac{1}{T}\sum_{t=1}^{T}\log g_t(x^t,\hat\beta) \tag{1.11}$$

where $f_t$ and $g_t$ are conditional densities and $\hat\alpha$ and $\hat\beta$ are the QMLEs under $H_f$ and $H_g$ respectively. An estimate of the expected value of the average log likelihood ratio when the null is correctly specified is

$$\hat E_f[L_f - L_g] \equiv \frac{1}{T}\int\big(\log f^T(x^T,\hat\alpha) - \log g^T(x^T,\hat\beta)\big)\,f^T(x^T,\hat\alpha)\,d\nu^T(x^T) \tag{1.12}$$

$$= \int\Big(T^{-1}\sum_{t=1}^{T}\big[\log f_t(x^t,\hat\alpha) - \log g_t(x^t,\hat\beta)\big]\Big)\prod_{t=1}^{T}f_t(x^t,\hat\alpha)\,d\nu^T(x^T) \tag{1.13}$$

$$= T^{-1}\sum_{t=1}^{T}\int\big(\log f_t(x^t,\hat\alpha) - \log g_t(x^t,\hat\beta)\big)\,f^t(x^t,\hat\alpha)\,d\nu^t(x^t) \tag{1.14}$$

where $f^T = \prod_{t=1}^{T} f_t$. The multi-fold integration in equation (1.14) causes severe difficulties in computing the unconditional expectation. The observations are assumed to be independent in Cox (1961, 1962), so the integral above reduces to a $v$-fold integral,

$$= \frac{1}{T}\sum_{t=1}^{T}\int\big(\log f_t(x_t,\hat\alpha) - \log g_t(x_t,\hat\beta)\big)\,f_t(x_t,\hat\alpha)\,d\nu_t(x_t) \tag{1.15}$$

White (1994) argues that this still requires computational effort, even though it is much more tractable than before. In addition, the analytical intractability remains when we apply the Cox test to time series applications that include lagged dependent variables as explanatory variables. To avoid these difficulties, we suggest an approach using the conditional densities $f_t$ and $g_t$ given $I_{t-1}$. By computing the conditional expectation for each $t$, we can achieve substantial simplifications of the Cox test in some important applications. We can now rewrite the second part of the Cox test as

$$\hat E_f[L_f - L_g] = \frac{1}{T}\sum_{t=1}^{T}\int\big(\log f_t(x_t,\hat\alpha \mid I_{t-1}) - \log g_t(x_t,\hat\beta \mid I_{t-1})\big)\,f_t(x_t,\hat\alpha \mid I_{t-1})\,d\nu_t(x_t) \tag{1.16}$$

$$= \frac{1}{T}\sum_{t=1}^{T}E_f\big[\log f_t(x_t,\hat\alpha \mid I_{t-1}) - \log g_t(x_t,\hat\beta \mid I_{t-1}) \mid I_{t-1}\big] \tag{1.17}$$

Equation (1.17) is the conditional expectation of the differenced log likelihood functions.

1.2.2 A Modified Cox Test

Let $\{(y_t, Z_t);\ t = 1,\dots,T\}$ be a sequence of observable random vectors with $y_t$ $1 \times J$ and $Z_t$ $1 \times K$; $y_t$ is the vector of endogenous variables, and $Z_t$ is the vector of explanatory variables. We assume that the regularity conditions in White (1982) hold. Suppose the two competing nonnested parametric models are $f_t$ under the null and $g_t$ under the alternative:

$$M_1: f_t(y_t \mid I_{t-1};\theta_0), \quad \theta_0 \in \Theta \subseteq \mathbb{R}^p,\ t = 1,2,\dots,T \tag{1.18}$$

$$M_2: g_t(y_t \mid I_{t-1};\delta_0), \quad \delta_0 \in \Delta \subseteq \mathbb{R}^q,\ t = 1,2,\dots,T \tag{1.19}$$

where $I_{t-1}$ is the information set ($\sigma$-algebra generated by $\{y_{t-1}, Z_t, \dots, Z_1\}$) available at time $t$.
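If $\hat\theta$ is a $\sqrt{T}$-consistent estimator of $\theta_0$ under the null hypothesis, where $\theta_0$ is the true value of $\theta$, and $\hat\delta$ is a $\sqrt{T}$-consistent estimator of $\delta_0$ under the alternative hypothesis, where $\delta_0$ is the true value of $\delta$, then $\sqrt{T}(\hat\theta - \theta_0)$ and $\sqrt{T}(\hat\delta - \delta_0)$ are asymptotically normal. The null hypothesis is that $M_1$ is correctly specified and the alternative hypothesis is that $M_2$ is correctly specified. We write the modified Cox test as

$$T_{M_1} = T^{-1}\sum_{t=1}^{T}\{\log f_t(y_t \mid I_{t-1};\theta_0) - \log g_t(y_t \mid I_{t-1};\delta^*)\} - T^{-1}\sum_{t=1}^{T}E_{M_1}\{\log f_t(y_t \mid I_{t-1};\theta_0) - \log g_t(y_t \mid I_{t-1};\delta^*) \mid I_{t-1}\} \tag{1.20}$$

where $\delta^* = \operatorname{plim}\hat\delta$ when $M_1$ is true. It is important to note that $\delta^* \neq \delta_0$ in general. The parameters are unknown, so we replace them with $\hat\theta$ and $\hat\delta$, the ML estimators of $\theta_0$ and $\delta_0$ respectively. The modified Cox test is composed of two parts: the averaged log likelihood ratio of the null hypothesis to the alternative hypothesis, and its expected value under the null hypothesis. We analyze these two terms in turn.

i) The first term: let the log likelihood functions $\log f_t(y_t \mid I_{t-1};\theta)$ and $\log g_t(y_t \mid I_{t-1};\delta)$ be

$$\log f_t(y_t \mid I_{t-1};\theta) = -\frac{1}{2}\log 2\pi - \frac{1}{2}\log h_t(\theta) - \frac{(y_t - m_t(\theta))^2}{2h_t(\theta)} \tag{1.21}$$

$$\log g_t(y_t \mid I_{t-1};\delta) = -\frac{1}{2}\log 2\pi - \frac{1}{2}\log \eta_t(\delta) - \frac{(y_t - \mu_t(\delta))^2}{2\eta_t(\delta)} \tag{1.22}$$

Substituting these two log likelihood functions into the first part of the Cox test gives

$$\sum_{t=1}^{T}\{\log f_t(y_t \mid I_{t-1};\theta) - \log g_t(y_t \mid I_{t-1};\delta)\} = -\frac{1}{2}\sum_{t=1}^{T}\{\log h_t(\theta) - \log \eta_t(\delta)\} - \frac{1}{2}\sum_{t=1}^{T}\left\{\frac{(y_t - m_t(\theta))^2}{h_t(\theta)} - \frac{(y_t - \mu_t(\delta))^2}{\eta_t(\delta)}\right\} \tag{1.23}$$

ii) The second term: the conditional expectation of the log likelihood ratio of $\log f_t(y_t \mid I_{t-1};\theta)$ to $\log g_t(y_t \mid I_{t-1};\delta)$ is

$$\sum_{t=1}^{T}E_{M_1}\{\log f_t(y_t \mid I_{t-1};\theta) - \log g_t(y_t \mid I_{t-1};\delta) \mid I_{t-1}\} = \frac{1}{2}\sum_{t=1}^{T}\log \eta_t(\delta) - \frac{1}{2}\sum_{t=1}^{T}\log h_t(\theta) - \frac{T}{2} + \frac{1}{2}\sum_{t=1}^{T}\frac{h_t(\theta)}{\eta_t(\delta)} + \frac{1}{2}\sum_{t=1}^{T}\frac{(m_t(\theta) - \mu_t(\delta))^2}{\eta_t(\delta)} \tag{1.24}$$

Combining these two terms, we can rewrite the modified Cox test as

$$T_{M_1} = \frac{1}{2}T^{-1}\sum_{t=1}^{T}\left\{-\frac{(y_t - m_t(\theta))^2}{h_t(\theta)} + \frac{(y_t - \mu_t(\delta))^2}{\eta_t(\delta)} - \frac{h_t(\theta)}{\eta_t(\delta)} + 1 - \frac{(m_t(\theta) - \mu_t(\delta))^2}{\eta_t(\delta)}\right\} \tag{1.25}$$

$$= T^{-1}\sum_{t=1}^{T}\left[(y_t - m_t(\theta))\left\{\frac{m_t(\theta) - \mu_t(\delta)}{\eta_t(\delta)}\right\} + \frac{u_t(\theta)^2 - h_t(\theta)}{2}\left(\frac{1}{\eta_t(\delta)} - \frac{1}{h_t(\theta)}\right)\right] \tag{1.26}$$

where $u_t(\theta) = y_t - m_t(\theta)$. A more detailed derivation of this result is given in the appendix.

Equation (1.26) is the modified Cox test statistic when we assume $M_1$ is correctly specified. If we exchange the roles of the hypotheses, so that $M_2$ becomes the null hypothesis and $M_1$ becomes the alternative hypothesis, then under $M_2$ the modified Cox test takes a different form:

$$T_{M_2} = T^{-1}\sum_{t=1}^{T}\{\log g_t(y_t \mid I_{t-1};\delta_0) - \log f_t(y_t \mid I_{t-1};\theta^*)\} - T^{-1}\sum_{t=1}^{T}E_{M_2}\{\log g_t(y_t \mid I_{t-1};\delta_0) - \log f_t(y_t \mid I_{t-1};\theta^*) \mid I_{t-1}\} \tag{1.27}$$

$$= T^{-1}\sum_{t=1}^{T}\left[(y_t - \mu_t(\delta_0))\left\{\frac{\mu_t(\delta_0) - m_t(\theta^*)}{h_t(\theta^*)}\right\} + \frac{e_t(\delta_0)^2 - \eta_t(\delta_0)}{2}\left(\frac{1}{h_t(\theta^*)} - \frac{1}{\eta_t(\delta_0)}\right)\right] \tag{1.28}$$

where $e_t(\delta) = y_t - \mu_t(\delta)$.

We now consider the asymptotic distribution of the modified Cox test under the null hypothesis that $M_1$ is correctly specified. Since we do not know the true values of the parameters, $\theta_0$ and $\delta^*$, we use the consistent estimates $\hat\theta$ and $\hat\delta$ instead, so the test statistic is based upon

$$\hat T_{M_1} = T^{-1}\sum_{t=1}^{T}\left[(y_t - m_t(\hat\theta))\left\{\frac{m_t(\hat\theta) - \mu_t(\hat\delta)}{\eta_t(\hat\delta)}\right\} + \frac{u_t(\hat\theta)^2 - h_t(\hat\theta)}{2}\left(\frac{1}{\eta_t(\hat\delta)} - \frac{1}{h_t(\hat\theta)}\right)\right] \tag{1.29}$$

We can expand $\hat T_{M_1}$ by the mean-value theorem as

$$\hat T_{M_1} = T^{-1}\sum_{t=1}^{T}\left[(y_t - m_t(\theta_0))\left\{\frac{m_t(\theta_0) - \mu_t(\delta^*)}{\eta_t(\delta^*)}\right\} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}\left(\frac{1}{\eta_t(\delta^*)} - \frac{1}{h_t(\theta_0)}\right)\right] + T^{-1}\sum_{t=1}^{T}\nabla_\theta\left[(y_t - m_t(\bar\theta))\left\{\frac{m_t(\bar\theta) - \mu_t(\bar\delta)}{\eta_t(\bar\delta)}\right\} + \frac{u_t(\bar\theta)^2 - h_t(\bar\theta)}{2}\left(\frac{1}{\eta_t(\bar\delta)} - \frac{1}{h_t(\bar\theta)}\right)\right](\hat\theta - \theta_0) \tag{1.30}$$

where $\bar\theta$ and $\bar\delta$ lie on the segments between $(\hat\theta,\theta_0)$ and $(\hat\delta,\delta^*)$ respectively. Multiplying both sides by $\sqrt{T}$,

$$\sqrt{T}\,\hat T_{M_1} = T^{-1/2}\sum_{t=1}^{T}\left[(y_t - m_t(\theta_0))\left\{\frac{m_t(\theta_0) - \mu_t(\delta^*)}{\eta_t(\delta^*)}\right\} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}\left(\frac{1}{\eta_t(\delta^*)} - \frac{1}{h_t(\theta_0)}\right)\right] + T^{-1}\sum_{t=1}^{T}\nabla_\theta\left[(y_t - m_t(\bar\theta))\left\{\frac{m_t(\bar\theta) - \mu_t(\bar\delta)}{\eta_t(\bar\delta)}\right\} + \frac{u_t(\bar\theta)^2 - h_t(\bar\theta)}{2}\left(\frac{1}{\eta_t(\bar\delta)} - \frac{1}{h_t(\bar\theta)}\right)\right]\sqrt{T}(\hat\theta - \theta_0) \tag{1.31}$$

$$= T^{-1/2}\sum_{t=1}^{T}\left[(y_t - m_t(\theta_0))\left\{\frac{m_t(\theta_0) - \mu_t(\delta^*)}{\eta_t(\delta^*)}\right\} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}\left(\frac{1}{\eta_t(\delta^*)} - \frac{1}{h_t(\theta_0)}\right)\right] + T^{-1}\sum_{t=1}^{T}\nabla_\theta\left[(y_t - m_t(\theta_0))\left\{\frac{m_t(\theta_0) - \mu_t(\delta^*)}{\eta_t(\delta^*)}\right\} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}\left(\frac{1}{\eta_t(\delta^*)} - \frac{1}{h_t(\theta_0)}\right)\right]\sqrt{T}(\hat\theta - \theta_0) + o_p(1) \tag{1.32}$$

under the null hypothesis, since $\bar\theta \to \theta_0$ and $\bar\delta \to \delta^*$ by the mean-value property; $\nabla_\theta$ is the gradient operator. Define

$$\Psi_T(\theta_0,\delta^*) = T^{-1}\sum_{t=1}^{T}\nabla_\theta\left[(y_t - m_t(\theta_0))\left\{\frac{m_t(\theta_0) - \mu_t(\delta^*)}{\eta_t(\delta^*)}\right\} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}\left(\frac{1}{\eta_t(\delta^*)} - \frac{1}{h_t(\theta_0)}\right)\right]$$

Then

$$T^{-1/2}\sum_{t=1}^{T}\left[(y_t - m_t(\hat\theta))\left\{\frac{m_t(\hat\theta) - \mu_t(\hat\delta)}{\eta_t(\hat\delta)}\right\} + \frac{u_t(\hat\theta)^2 - h_t(\hat\theta)}{2}\left(\frac{1}{\eta_t(\hat\delta)} - \frac{1}{h_t(\hat\theta)}\right)\right] - T^{-1/2}\sum_{t=1}^{T}\left[(y_t - m_t(\theta_0))\left\{\frac{m_t(\theta_0) - \mu_t(\delta^*)}{\eta_t(\delta^*)}\right\} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}\left(\frac{1}{\eta_t(\delta^*)} - \frac{1}{h_t(\theta_0)}\right)\right] - \Psi_T(\theta_0,\delta^*)\sqrt{T}(\hat\theta - \theta_0) \xrightarrow{p} 0 \tag{1.33}$$

Define

$$D_t = u_t(\theta_0)\left(\frac{m_t(\theta_0) - \mu_t(\delta^*)}{\eta_t(\delta^*)}\right) + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}\left(\frac{1}{\eta_t(\delta^*)} - \frac{1}{h_t(\theta_0)}\right)$$

$$A_t(\theta_0) = \frac{\partial^2 \log f_t(\theta_0)}{\partial\theta\,\partial\theta'}$$

$$\psi_t(\theta_0,\delta^*) = \nabla_\theta\left[u_t(\theta_0)\left(\frac{m_t(\theta_0) - \mu_t(\delta^*)}{\eta_t(\delta^*)}\right) + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}\left(\frac{1}{\eta_t(\delta^*)} - \frac{1}{h_t(\theta_0)}\right)\right]$$

so that $\Psi_T(\theta_0,\delta^*) = T^{-1}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*)$. The asymptotic distribution of $\sqrt{T}\,T_{M_1}$ is equivalent to the asymptotic distribution of $T^{-1/2}\sum_{t=1}^{T}D_t + \Psi_T(\theta_0,\delta^*)\sqrt{T}(\hat\theta - \theta_0)$, with

$$\sqrt{T}(\hat\theta - \theta_0) = \left(-\frac{1}{T}\sum_{t=1}^{T}A_t\right)^{-1}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\nabla_\theta \log f(y_t,\theta_0).$$

Note that

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}D_t - \Psi_T(\theta_0,\delta^*)\left(E\left[\frac{1}{T}\sum_{t=1}^{T}A_t(\theta_0)\right]\right)^{-1}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\nabla_\theta \log f(y_t,\theta_0)$$

is a normalized sum of a martingale difference sequence with mean zero and variance $V(\theta_0,\delta^*)$ under the null hypothesis, where $V(\theta_0,\delta^*)$ is

$$V(\theta_0,\delta^*) = E_{M_1}\left[T^{-1}\sum_{t=1}^{T}\left(D_t - \Big(\frac{1}{T}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*)\Big)\Big(\frac{1}{T}\sum_{t=1}^{T}A_t(\theta_0)\Big)^{-1}\nabla_\theta \log f(y_t,\theta_0)\right)^2\right] \tag{1.34}$$

Therefore, $\sqrt{T}\,T_{M_1} \sim N(0, V(\theta_0,\delta^*))$ asymptotically, and $\sqrt{T}\,T_{M_1}/\sqrt{\hat V} \stackrel{a}{\sim} N(0,1)$ if $\hat V$ is a consistent estimator of $V(\theta_0,\delta^*)$. Under the regularity conditions, it is easily shown that

$$\hat V_T = T^{-1}\sum_{t=1}^{T}\left[\hat D_t - \Big(\frac{1}{T}\sum_{t=1}^{T}\hat\psi_t(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{t=1}^{T}\hat A_t(\hat\theta)\Big)^{-1}\nabla_\theta \log f_t(\hat\theta)\right]^2 \tag{1.35}$$

Thus,

$$\hat V_T - V(\theta_0,\delta^*) = \frac{1}{T}\sum_{t=1}^{T}\left[\hat D_t - \Big(\frac{1}{T}\sum_{t=1}^{T}\hat\psi_t(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{t=1}^{T}\hat A_t(\hat\theta)\Big)^{-1}\nabla_\theta \log f_t(\hat\theta)\right]^2 - \frac{1}{T}\sum_{t=1}^{T}\left[D_t - \Big(\frac{1}{T}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*)\Big)\Big(\frac{1}{T}\sum_{t=1}^{T}A_t(\theta_0)\Big)^{-1}\nabla_\theta \log f(y_t,\theta_0)\right]^2 \xrightarrow{p} 0 \tag{1.36}$$

Under the null hypothesis, the modified Cox statistic is asymptotically normally distributed with mean zero and variance $V_{M_1}$. Thus, the standardized modified Cox test, $\sqrt{T}\,T_{M_1}/\sqrt{\hat V_{M_1}}$, is asymptotically distributed as unit normal, $N(0,1)$, under the null hypothesis.

We now consider a time series application as an example. Suppose $\{y_t;\ t = 1,2,\dots,T\}$ is a sequence of observable random variables. Two competing models are given as

$$M_1: y_t = m_t(\theta_0) + u_t, \quad \text{where } u_t \sim N(0,h_t(\theta_0)), \tag{1.37}$$

$$E(y_t \mid I_{t-1}) = m_t(\theta_0), \tag{1.38}$$

$$\operatorname{Var}(y_t \mid I_{t-1}) = h_t(\theta_0), \quad \text{where } h_t(\theta_0) = \alpha_0 + \alpha_1 u_{t-1}^2 \quad \text{(ARCH(1))} \tag{1.39}$$

and

$$M_2: y_t = \mu_t(\delta_0) + e_t, \tag{1.40}$$

$$E(y_t \mid I_{t-1}) = \mu_t(\delta_0) + b_{11}e_{t-1}\xi_{t-1}, \tag{1.41}$$

$$\operatorname{Var}(y_t \mid I_{t-1}) = \sigma_\xi^2, \tag{1.42}$$

where $e_t = b_{11}e_{t-1}\xi_{t-1} + \xi_t$ and $\xi_t \sim N(0,\sigma_\xi^2)$ (bilinear model).

Under the null hypothesis that $M_1$ is correctly specified, we can write the modified Cox test as

$$\hat T_{M_1} = T^{-1}\sum_{t=1}^{T}\left[u_t(\hat\theta)\,\frac{m_t(\hat\theta) - \mu_t(\hat\delta)}{\hat\sigma_\xi^2} + \frac{u_t(\hat\theta)^2 - (\hat\alpha_0 + \hat\alpha_1 \hat u_{t-1}^2)}{2}\left(\frac{1}{\hat\sigma_\xi^2} - \frac{1}{\hat\alpha_0 + \hat\alpha_1 \hat u_{t-1}^2}\right)\right] \tag{1.43}$$

Define

$$D_{t1} \equiv \frac{m_t(\theta_0) - \mu_t(\delta^*)}{\sigma_\xi^{*2}}, \qquad D_{t2} \equiv \frac{1}{\sigma_\xi^{*2}} - \frac{1}{h_t(\theta_0)}, \qquad D_t \equiv u_t(\theta_0)D_{t1} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}D_{t2};$$

then

$$\sqrt{T}\,\hat T_{M_1} = T^{-1/2}\sum_{t=1}^{T}\left\{u_t(\hat\theta)\hat D_{t1} + \frac{u_t(\hat\theta)^2 - h_t(\hat\theta)}{2}\hat D_{t2}\right\} \tag{1.44}$$

Therefore,

$$T^{-1/2}\sum_{t=1}^{T}\left\{u_t(\hat\theta)\hat D_{t1} + \frac{u_t(\hat\theta)^2 - h_t(\hat\theta)}{2}\hat D_{t2}\right\} - T^{-1/2}\sum_{t=1}^{T}\left\{u_t(\theta_0)D_{t1} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}D_{t2}\right\} - \Big(\frac{1}{T}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*)\Big)\sqrt{T}(\hat\theta - \theta_0) \xrightarrow{p} 0 \tag{1.45}$$

where

$$T^{-1}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*) = T^{-1}\sum_{t=1}^{T}\nabla_\theta\left[u_t(\theta_0)D_{t1} + \frac{u_t(\theta_0)^2 - h_t(\theta_0)}{2}D_{t2}\right]$$

and

$$\sqrt{T}(\hat\theta - \theta_0) = -\Big(\frac{1}{T}\sum_{t=1}^{T}A_t(\theta_0)\Big)^{-1}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\nabla_\theta \log f(y_t,\theta_0).$$

The asymptotic distribution of the modified Cox test follows from

$$\sqrt{T}\,\hat T_{M_1} = T^{-1/2}\sum_{t=1}^{T}\left[D_t - \Big(\frac{1}{T}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*)\Big)\Big(\frac{1}{T}\sum_{t=1}^{T}A_t(\theta_0)\Big)^{-1}\nabla_\theta \log f(y_t,\theta_0)\right] + o_p(1) \tag{1.46}$$

Let $q_t = D_t - \big(\frac{1}{T}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*)\big)\big(\frac{1}{T}\sum_{t=1}^{T}A_t(\theta_0)\big)^{-1}\nabla_\theta \log f(y_t,\theta_0)$. Then

$$\sqrt{T}\,\hat T_{M_1} = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}q_t + o_p(1) \tag{1.47}$$

$$= \frac{1}{\sqrt{T}}\sum_{t=1}^{T}q_t^* + o_p(1) \tag{1.48}$$

where

$$q_t^* = D_t - \Big(\operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*)\Big)\Big(\operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}A_t(\theta_0)\Big)^{-1}\nabla_\theta \log f(y_t,\theta_0),$$

with $\operatorname{plim}_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\psi_t(\theta_0,\delta^*) = E[\psi_t(\theta_0,\delta^*)]$ and $\operatorname{plim}_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}A_t(\theta_0) = E[A_t(\theta_0)]$. Therefore, $E(q_t^* \mid I_{t-1}) = 0$ and

$$E(q_t^{*2} \mid I_{t-1}) = E\left[\Big(D_t - E[\psi_t(\theta_0,\delta^*)]\big(E[A_t(\theta_0)]\big)^{-1}\nabla_\theta \log f(y_t,\theta_0)\Big)^2 \,\Big|\, I_{t-1}\right] \tag{1.51}$$

$$= V_{M_1}(\theta_0,\delta^*) \tag{1.52}$$

Therefore,

$$\hat V_{M_1} = \frac{1}{T}\sum_{t=1}^{T}\hat D_t^2 - \frac{2}{T}\sum_{t=1}^{T}\hat D_t\Big(\frac{1}{T}\sum_{t=1}^{T}\hat\psi_t(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{t=1}^{T}\hat A_t(\hat\theta)\Big)^{-1}\nabla_\theta \log f(y_t,\hat\theta) + \Big(\frac{1}{T}\sum_{t=1}^{T}\hat\psi_t(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{t=1}^{T}\hat A_t(\hat\theta)\Big)^{-1}\Big[\frac{1}{T}\sum_{t=1}^{T}\nabla_\theta \log f(y_t,\hat\theta)\,\nabla_\theta \log f(y_t,\hat\theta)'\Big]\Big(\frac{1}{T}\sum_{t=1}^{T}\hat A_t(\hat\theta)\Big)^{-1}\Big(\frac{1}{T}\sum_{t=1}^{T}\hat\psi_t(\hat\theta,\hat\delta)\Big)' \tag{1.53}$$

The modified Cox test statistic based on the conditional mean and variance, $\sqrt{T}\,T_{M_1}/\sqrt{\hat V_{M_1}}$, is asymptotically standard normal, $N(0,1)$.

Proposition. Assume that the following conditions are satisfied under the null hypothesis:

1. The regularity conditions[1] hold (see White (1982)).
2. $T^{1/2}(\hat\theta - \theta_0) = O_p(1)$ and $T^{1/2}(\hat\delta - \delta_0) = O_p(1)$.
3. The conditional mean and conditional variance exist and are finite.

Then,

$$T_{M_1} = T^{-1}\sum_{t=1}^{T}\left[(y_t - m_t(\hat\theta))\,\frac{m_t(\hat\theta) - \mu_t(\hat\delta)}{\eta_t(\hat\delta)} + \frac{u_t(\hat\theta)^2 - h_t(\hat\theta)}{2}\left(\frac{1}{\eta_t(\hat\delta)} - \frac{1}{h_t(\hat\theta)}\right)\right] \tag{1.54}$$

and the standardized Cox test statistic, $\sqrt{T}\,T_{M_1}/\sqrt{\hat V_{M_1}}$, is asymptotically distributed as unit normal, $N(0,1)$, where $\hat V_{M_1}$ is a consistent estimator of the asymptotic variance of $\sqrt{T}\,T_{M_1}$.

[1] The regularity conditions are given in the appendix.

Note that equation (1.54) is a function of $\hat\theta$ and $\hat\delta$, the ML estimators of $\theta_0$ and $\delta_0$ respectively.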
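The closed form in (1.54) can be checked numerically against its definition as the average log likelihood ratio minus the conditional expectation of that ratio under normality. The sketch below does this for arbitrary input arrays; the function names `cox_direct` and `cox_compact` and the simulated inputs are assumptions made for illustration, and the fitted quantities $m_t$, $h_t$, $\mu_t$, $\eta_t$ are taken as given rather than estimated.

```python
import numpy as np

def cox_direct(y, m, h, mu, eta):
    """Average log likelihood ratio minus its conditional expectation under M1."""
    log_f = -0.5 * np.log(2 * np.pi * h) - (y - m) ** 2 / (2 * h)
    log_g = -0.5 * np.log(2 * np.pi * eta) - (y - mu) ** 2 / (2 * eta)
    # E[log f - log g | I_{t-1}] when y_t | I_{t-1} ~ N(m_t, h_t)
    expected = 0.5 * np.log(eta / h) - 0.5 + (h + (m - mu) ** 2) / (2 * eta)
    return np.mean(log_f - log_g - expected)

def cox_compact(y, m, h, mu, eta):
    """Closed form of the statistic, as in (1.54)."""
    u = y - m
    return np.mean(u * (m - mu) / eta + 0.5 * (u ** 2 - h) * (1 / eta - 1 / h))

# Hypothetical conditional moments for the two models, with y drawn under M1.
rng = np.random.default_rng(2)
T = 5000
m = rng.normal(size=T)
h = rng.uniform(0.5, 1.5, size=T)
mu = m + 0.3
eta = rng.uniform(0.5, 1.5, size=T)
y = m + np.sqrt(h) * rng.normal(size=T)

stat_direct = cox_direct(y, m, h, mu, eta)
stat_compact = cox_compact(y, m, h, mu, eta)
```

The two functions agree up to floating-point error for any inputs, and because `y` is generated under $M_1$ here, the statistic itself is close to zero.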
Compared with the Cox test (1961, 1962) and the simulation method of Pesaran and Pesaran (1993), the modified Cox test does not require pseudo-true parameters or estimators computed from artificially generated data. This approach, based upon conditional mean and conditional variance specifications, is computationally more convenient. We now apply the proposition as follows.

Procedure 1.1

1. Obtain $\hat\theta$ and $\hat\delta$, the ML estimators of $\theta_0$ and $\delta^*$. Save the residuals, $u_t(\hat\theta)$, and the conditional variance, $h_t(\hat\theta)$, from the log likelihood function $\log f_t(y_t \mid I_{t-1};\hat\theta)$, and $e_t(\hat\delta)$ and $\eta_t(\hat\delta)$ from the log likelihood function $\log g_t(y_t \mid I_{t-1};\hat\delta)$.

2. Compute $\hat D_{t1}$, $\hat D_{t2}$, $\hat\psi_t(\hat\theta,\hat\delta)$, and $\nabla_\theta \log f_t(\hat\theta)$, where

$$\hat D_{t1} \equiv \frac{m_t(\hat\theta) - \mu_t(\hat\delta)}{\eta_t(\hat\delta)}, \qquad \hat D_{t2} \equiv \frac{1}{\eta_t(\hat\delta)} - \frac{1}{h_t(\hat\theta)}, \qquad \hat\psi_t(\hat\theta,\hat\delta) \equiv \nabla_\theta\left[u_t(\hat\theta)\hat D_{t1} + \frac{u_t(\hat\theta)^2 - h_t(\hat\theta)}{2}\hat D_{t2}\right].$$

3. Compute

$$\sqrt{T}\,T_{M_1} = T^{-1/2}\sum_{t=1}^{T}\left[\hat u_t \hat D_{t1} + \frac{\hat u_t^2 - \hat h_t}{2}\hat D_{t2} - \Big(\frac{1}{T}\sum_{t=1}^{T}\hat\psi_t(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{t=1}^{T}\hat A_t(\hat\theta)\Big)^{-1}\nabla_\theta \log f_t(\hat\theta)\right]$$

and the variance estimator $\hat V_{M_1}$ given in (1.53), and treat the standardized Cox test statistic, $\sqrt{T}\,T_{M_1}/\sqrt{\hat V_{M_1}}$, as asymptotically unit normal under the null hypothesis.

1.3 Empirical Application

Bera and Higgins (1997) took generalized autoregressive conditional heteroscedasticity (GARCH), due to Bollerslev (1986), and bilinearity, due to Granger and Anderson (1978), as two competing models for nonlinear dependence in time series data. They reported nonnested Cox test results, using the stochastic simulation method of Pesaran and Pesaran (1993), for three time series data sets: the S&P 500 stock index, the daily pound/dollar exchange rate, and the rate of growth of the monthly U.S. index of industrial production. In this section we compare our modified Cox test results with the results of Bera and Higgins (1997).

1.3.1 GARCH and Bilinearity

Forecasting and estimation are both substantial components of econometrics.
These components play a very important role in the analysis of time series data. If a series is assumed to be white noise (a very common assumption in econometrics), the process is independent of its own past and the series becomes very difficult to forecast, because its past carries no information. But most financial and macroeconomic time series show evidence of dependence upon the past. Granger and Anderson (1978) suggested that a white noise process could be forecastable from its own past in a nonlinear manner and introduced a bilinear process that allows dependence on the past realizations of the series. Suppose $X_t = \beta\varepsilon_{t-1}X_{t-1} + \varepsilon_t$, where $\varepsilon_t$ is white noise with mean zero and variance $\sigma_\varepsilon^2$, and $X_t$ and $\varepsilon_t$ are uncorrelated. The conditional mean of $X_t$ is $\beta X_{t-1}\varepsilon_{t-1}$, while the unconditional mean is zero, because

$$E(X_t \mid I_{t-1}) = \beta X_{t-1}\varepsilon_{t-1} + E(\varepsilon_t \mid I_{t-1}) \tag{1.55}$$

$$= \beta X_{t-1}\varepsilon_{t-1} \tag{1.56}$$

while

$$E(X_t) = E(\beta X_{t-1}\varepsilon_{t-1} + \varepsilon_t) \tag{1.57}$$

$$= \beta E(X_{t-1})E(\varepsilon_{t-1}) + E(\varepsilon_t) \tag{1.58}$$

$$= 0 \tag{1.59}$$

where $I_{t-1}$ is the information set ($\sigma$-algebra) available at time $t$. Next, the conditional variance of $X_t$ is $\sigma_\varepsilon^2$, while the unconditional variance is $\sigma_\varepsilon^2/(1 - \beta^2\sigma_\varepsilon^2)$, because

$$\operatorname{Var}(X_t \mid I_{t-1}) = \operatorname{Var}(\varepsilon_t \mid I_{t-1}) \tag{1.60}$$

$$= \sigma_\varepsilon^2 \tag{1.61}$$

while

$$\operatorname{Var}(X_t) = \beta^2\operatorname{Var}(X_t)\operatorname{Var}(\varepsilon_{t-1}) + \operatorname{Var}(\varepsilon_t) \tag{1.62}$$

$$= \beta^2\operatorname{Var}(X_t)\sigma_\varepsilon^2 + \sigma_\varepsilon^2 \tag{1.63}$$

Therefore,

$$\operatorname{Var}(X_t) = \frac{\sigma_\varepsilon^2}{1 - \beta^2\sigma_\varepsilon^2} \tag{1.64}$$

If $\beta^2\sigma_\varepsilon^2 < 1$, the process is non-explosive and stationary, so $\beta^2\sigma_\varepsilon^2 < 1$ is a very important condition for stationarity.

Engle (1982) further developed the idea of nonlinearity in his model of autoregressive conditional heteroscedasticity (ARCH), which is very close to the bilinear process. Suppose $y_t = X_t'\beta + \varepsilon_t$; then $y_t \mid I_{t-1} \sim N(X_t'\beta, h_t)$, where $h_t = h(\varepsilon_{t-1},\dots,\varepsilon_{t-p},\alpha)$.
The conditional mean and the unconditional mean are both $X_t'\beta$:
\begin{align}
E(y_t \mid I_{t-1}) &= X_t'\beta + E(\varepsilon_t \mid I_{t-1}) \tag{1.65} \\
&= X_t'\beta \tag{1.66}
\end{align}
and
\begin{align}
E(y_t) &= E(X_t'\beta + \varepsilon_t) \tag{1.67} \\
&= X_t'\beta + E(\varepsilon_t) \tag{1.68} \\
&= X_t'\beta. \tag{1.69}
\end{align}
The conditional variance is $h_t = \alpha_0 + \alpha_1\varepsilon_{t-1}^2 + \cdots + \alpha_p\varepsilon_{t-p}^2$ but the unconditional variance is $\sigma_\varepsilon^2$:
\begin{align}
Var(y_t \mid I_{t-1}) &= Var(\varepsilon_t \mid I_{t-1}) \tag{1.70} \\
&= h_t \tag{1.71}
\end{align}
and
\begin{align}
Var(y_t) &= E(\varepsilon_t^2) \tag{1.72} \\
&= \sigma_\varepsilon^2. \tag{1.73}
\end{align}
Note that the conditional variance, $h_t$, contains the current and lagged values of the independent variables through the information set available at time $t$, because $\varepsilon_t = y_t - X_t'\beta$. Thus, we can decompose $h_t$ as follows [2]:
\begin{align}
h_t &= h_t(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-p}, \alpha, X_t, X_{t-1}, \ldots, X_{t-p}) \tag{1.74} \\
&= h_t(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-p}, \alpha)\, h_t(X_t, X_{t-1}, \ldots, X_{t-p}). \tag{1.75}
\end{align}

[2] See p. 3 of ARCH: Selected Readings (Engle, 1995).

Bollerslev (1986) extended the ARCH process to the generalized autoregressive conditional heteroscedasticity (GARCH) process, allowing for a longer memory and a more flexible lag structure. The GARCH(p,q) process includes the lagged conditional variances as well as the linear function of past squared errors of the ARCH(q) process, so it corresponds to, and forecasts from, its own past in an adaptive expectation fashion. Suppose $y_t = X_t'\beta + \varepsilon_t$, where $y_t$ is the dependent variable, $X_t$ is a vector of independent variables, and $\beta$ is a vector of unknown parameters. Then the GARCH(p,q) process is given as
\[
\varepsilon_t \mid I_{t-1} \sim N(0, h_t) \tag{1.76}
\]
where
\[
h_t = \alpha_0 + \sum_{i=1}^{q}\alpha_i\varepsilon_{t-i}^2 + \sum_{i=1}^{p}\beta_i h_{t-i} \tag{1.77}
\]
and $p \ge 0$, $q > 0$. Therefore, $y_t \sim N(X_t'\beta, h_t)$ where $Var(y_t \mid I_{t-1}) = h_t$, while $Var(y_t) = Var(\varepsilon_t) = \sigma_\varepsilon^2$.

The bilinear process, the GARCH(p,q) process, and the ARCH(q) process are all nonlinear and provide more information for forecasting from their own past realizations. Although it is hard to identify the true specification between the bilinear and the GARCH processes because of their similarity, there are some remarkable differences between these two processes.
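The contrast between a time-varying conditional variance and a constant unconditional variance can be illustrated numerically. The following is a minimal sketch in Python with NumPy, not the GAUSS code used for the results in this chapter. The function name `simulate_garch11` and the parameter values are illustrative assumptions, and the unconditional variance $\alpha_0/(1-\alpha_1-\beta_1)$ used as a benchmark is the standard GARCH(1,1) result rather than a formula stated in the text.

```python
import numpy as np

def simulate_garch11(alpha0, alpha1, beta1, T, seed=0):
    """Simulate eps_t | I_{t-1} ~ N(0, h_t) with
    h_t = alpha0 + alpha1 * eps_{t-1}^2 + beta1 * h_{t-1}.
    (Hypothetical helper; requires alpha1 + beta1 < 1.)"""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(T)
    h = np.empty(T)
    eps = np.empty(T)
    # Start the recursion at the unconditional variance (an assumption).
    h[0] = alpha0 / (1.0 - alpha1 - beta1)
    eps[0] = np.sqrt(h[0]) * z[0]
    for t in range(1, T):
        h[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * h[t - 1]
        eps[t] = np.sqrt(h[t]) * z[t]
    return eps, h

# Illustrative parameter values, not estimates from the text.
eps, h = simulate_garch11(alpha0=0.1, alpha1=0.1, beta1=0.8, T=100_000, seed=1)
print("range of conditional variance h_t:", h.min(), h.max())
print("sample variance of eps_t:", eps.var())
print("theoretical unconditional variance:", 0.1 / (1.0 - 0.1 - 0.8))
```

The conditional variance moves from period to period, while the sample variance of the simulated errors settles near a single constant.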
The main and fundamental difference between the bilinear process and the GARCH (or ARCH) process lies in the conditional moments. The conditional distributions of a dependent variable under these two processes are clearly distinguishable. Suppose a dependent variable $y_t$ is generated by $y_t = X_t'\beta + u_t$, where $u_t$ is a stochastic error. Under the null hypothesis, $u_t$ is specified as
\[
M_1: \; u_t \mid I_{t-1} \sim N(0, h_t) \tag{1.78}
\]
where
\[
h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} \quad \text{(GARCH(1,1) process)} \tag{1.79}
\]
and under the alternative hypothesis, $u_t$ is specified as
\[
M_2: \; u_t = b_{11}u_{t-1}\xi_{t-1} + \xi_t \tag{1.80}
\]
where
\[
\xi_t \sim N(0, \sigma_\xi^2) \quad \text{(bilinear process)}. \tag{1.81}
\]
In the GARCH(1,1) model, $E(y_t \mid I_{t-1}) = X_t'\beta$ and $Var(y_t \mid I_{t-1}) = h_t$; in the bilinear model, $E(y_t \mid I_{t-1}) = X_t'\beta + b_{11}u_{t-1}\xi_{t-1}$ and $Var(y_t \mid I_{t-1}) = \sigma_\xi^2$. The conditional mean of the bilinear model shows that the bilinear process augments the adaptive information between its past errors and innovations in a nonlinear manner, while its conditional variance is constant. This nonlinearity in the conditional mean may increase the forecastability of the dependent variable, whereas the GARCH(1,1) process brings no augmented information from its own past and innovations into either the unconditional or the conditional mean. On the other hand, the conditional variance of the GARCH(1,1) process provides augmented adaptive information from its own past realizations, while the conditional variance of the bilinear process is constant. Although the conditional distributions of the bilinear process and the GARCH process are fundamentally different, their unconditional distributions are very similar, as shown earlier. Because both processes are nonlinear and have similar unconditional distributions, it is more difficult to identify the true specification.
In the next section we carry out the modified nonnested Cox test between these two nonlinear specifications with three time series data sets.

1.3.2 Empirical Application

In this application, we consider three time series data sets: the daily percentage changes of the S&P 500 stock index, the daily log price changes of the British pound in terms of the U.S. dollar (£/$), and the annualized growth rate of the U.S. monthly index of industrial production (IP). Note that the first two data sets are high frequency financial time series and the third is a non-financial time series. These three data sets are the same ones that Bera and Higgins (1997) used. [3]

[3] They retained the last 10 per cent of the observations to compute root mean squared errors for the one-step-ahead forecasts from each of the models, and we used the same data samples as they did for the nonnested test between the GARCH and bilinear models.

We assume that the stochastic error equation is nonlinear and specify the GARCH model as the null hypothesis and the bilinear model as the alternative hypothesis. The exogenous variables are modeled as autoregressive (AR) models. [4]

Table 1.1: Summary statistics: S&P 500

                  Mean    s.d.    Skew    Kurt    Max     Min      Sample size
Bera & Higgins    .060    .820    -.651   8.759   3.468   -5.877   1138
Kim               .042    .925    -.711   8.796   3.455   -7.008   1138

Table 1.2: Summary statistics: British pound

                  Mean    s.d.    Skew    Kurt    Max     Min      Sample size
Bera & Higgins    -.023   .477    .032    4.758   1.959   -2.252   1210
Kim               .0260   .692    -.202   4.632   2.990   -2.784   1210
Now the model specifications are given as
\begin{align}
M_1: \; y_t &= X_t'\beta + u_t \tag{1.82} \\
u_t \mid I_{t-1} &\sim \text{i.i.d.}(0, h_t) \tag{1.83} \\
\text{where } h_t &= \alpha_0 + \alpha_1 u_{t-1}^2 + \beta h_{t-1} \tag{1.84} \\
M_2: \; y_t &= X_t'\beta + u_t \tag{1.85} \\
\text{where } u_t &= b_{11}u_{t-1}\xi_{t-1} + \xi_t, \tag{1.86} \\
\text{and } \xi_t &\sim \text{i.i.d.}(0, \sigma_\xi^2). \tag{1.87}
\end{align}

[4] In modeling the exogenous variables, we take autoregressive (AR) models and the order of the autoregression following Bera and Higgins (1997).

First, we take the daily S&P 500 stock index (SP) from January 4, 1978 to May 28, 1993 and compare our summary statistics with those of Bera and Higgins (1997) in Table 1.1. Next, we take the daily log exchange rate of the British pound to the U.S. dollar (£/$) for the sample period from December 12, 1985 to February 28, 1991 and present the summary statistics in Table 1.2. As in Bera and Higgins (1997), we also take the annualized growth rate of the U.S. monthly index of industrial production (IP), a non-financial time series, from January 1960 to March 1993 for the third empirical application and present the summary statistics in Table 1.3.

Table 1.3: Summary statistics: IP

                  Mean    s.d.     Skew    Kurt    Max      Min       Sample size
Bera & Higgins    3.357   10.604   -.645   5.653   37.699   -51.732   359
Kim               2.728   8.823    -.623   5.649   33.483   -42.364   359

Note that the summary statistics of Bera and Higgins (1997) and ours, given in Tables 1.1 to 1.3, are similar but not exactly the same, even though we used the same data sets and the same sample periods. Two features of the summary statistics are worth noting. First, as Tables 1.1 through 1.3 show, all the series have high kurtosis, especially the S&P 500 stock index series. Second, the mean values of the British Pound series differ in sign: -0.023 for Bera & Higgins and 0.026 for ours.

Tables 1.4 through 1.6 present the estimation results of the GARCH(1,1) model.
Again our estimation results for the British Pound series reveal a difference in sign: -0.024 for Bera & Higgins vs. 0.032 for our estimation in Table 1.5. Table 1.6 shows that the two sets of estimation results are very close and that the GARCH effects in the IP series are rather small in both estimations (0.101 from Bera and Higgins vs. 0.040 from our estimation) compared with the GARCH effects in the S&P and £/$ data sets.

Tables 1.7 through 1.9 present the estimation results of the bilinear model, and there are some significant differences between Bera & Higgins and our estimation results. First, the bilinear effects in the British Pound series differ in sign: (0.039, 0.083) from Bera & Higgins vs. (-0.016, 0.025) from our estimation. Second, the estimated variances of the bilinear model ($\hat\sigma_\xi^2$) also differ between Bera & Higgins and our results. Tables 1.10 and 1.11 present the modified Cox test results.

Table 1.4: Estimated GARCH Models: S&P 500

Bera & Higgins
  y_t = .052 + .066 y_{t-1} + u_t
       (.025)  (.031)
  h_t = .011 + .013 u_{t-1}^2 + .968 h_{t-1}
       (.006)  (.005)           (.013)
  l(theta-hat) = -1367.67

Kim
  y_t = .041 + .010 y_{t-1} + u_t
       (.023)  (.030)
  h_t = .013 + .012 u_{t-1}^2 + .972 h_{t-1}
       (.004)  (.004)           (.006)
  l(theta-hat) = -1511.091

Table 1.5: Estimated GARCH Models: British Pound

Bera & Higgins
  y_t = -.024 + u_t
        (.014)
  h_t = .010 + .059 u_{t-1}^2 + .897 h_{t-1}
       (.004)  (.002)           (.017)
  l(theta-hat) = -785.72

Kim
  y_t = .032 + u_t
       (.018)
  h_t = .017 + .065 u_{t-1}^2 + .902 h_{t-1}
       (.009)  (.022)           (.036)
  l(theta-hat) = -1231.208

Table 1.6: Estimated GARCH Models: IP

Bera & Higgins
  y_t = 2.68 + .279 y_{t-1} + .114 y_{t-2} + u_t
       (.303)  (.033)         (.013)
  h_t = 60.24 + .235 u_{t-1}^2 + .101 h_{t-1}
       (5.42)   (.034)           (.021)
  l(theta-hat) = -1301.327

Kim
  y_t = 2.135 + .270 y_{t-1} + .122 y_{t-2} + u_t
       (.766)   (.100)         (.054)
  h_t = 47.114 + [illegible] u_{t-1}^2 + .040 h_{t-1}
       (35.128)  (.116)                  (.524)
  l(theta-hat) = -1247.462

Table 1.7: Estimated Bilinear Models: S&P 500

Bera & Higgins
  y_t = .017 + .102 y_{t-1} + u_t
       (.030)  (.017)
  u_t = .053 u_{t-1} xi_{t-1} + xi_t
       (.017)
  l(theta-hat) = -1368.884
  sigma-hat_xi^2 = .651

Kim
  y_t = -.0004 + .045 y_{t-1} + u_t
        (.029)   (.031)
  u_t = .047 u_{t-1} xi_{t-1} + xi_t
       (.011)
  l(theta-hat) = -1517.299
  sigma-hat_xi^2 = .844
Table 1.10 reports the test results when the GARCH model is the null hypothesis and Table 1.11 reports the test results when the bilinear model is the null hypothesis. Note that the absolute values of our test results are bigger than those of Bera & Higgins when the bilinear model is the null hypothesis in Table 1.11.

Table 1.8: Estimated Bilinear Models: British Pound

Bera & Higgins
  y_t = -.024 + u_t
        (.020)
  u_t = .039 u_{t-1} xi_{t-1} + .083 u_{t-2} xi_{t-1} + xi_t
       (.021)                   (.024)
  l(theta-hat) = -819.62
  sigma-hat_xi^2 = .226

Kim
  y_t = .034 + u_t
       (.022)
  u_t = -.016 u_{t-1} xi_{t-1} + .025 u_{t-2} xi_{t-1} + xi_t
        (.031)                   (.040)
  l(theta-hat) = -1271.357
  sigma-hat_xi^2 = .478

Table 1.9: Estimated Bilinear Models: IP

Bera & Higgins
  y_t = 2.34 + .321 y_{t-1} + .125 y_{t-2} + u_t
       (.634)  (.053)         (.030)
  u_t = -.006 u_{t-1} xi_{t-1} + xi_t
        (.003)
  l(theta-hat) = -1311.15
  sigma-hat_xi^2 = 90.69

Kim
  y_t = 2.057 + .307 y_{t-1} + .133 y_{t-2} + u_t
       (.566)   (.069)         (.055)
  u_t = -.008 u_{t-1} xi_{t-1} + xi_t
        (.005)
  l(theta-hat) = -1255.231
  sigma-hat_xi^2 = 63.651

Table 1.10: Test results: H0: GARCH vs. H1: Bilinear

                        Bera & Higgins   Modified Cox test
S&P 500                 .023             .322
British Pound           .196             -.033
Industrial Production   .533             .021

When the GARCH model is the null hypothesis, our test results are close to zero for all three series, so we cannot reject the null hypothesis for any of the three data sets at any conventional significance level. In Table 1.11, where the null hypothesis is the bilinear model, Bera & Higgins reject the null for the British Pound series at the 1% significance level and for the IP series at the 10% significance level. In our test results the null is rejected for all three series, with test values much greater in absolute value than those of Bera and Higgins (1997). In Table 1.9, the estimated bilinear effect for the IP series is -0.008 with a standard error of 0.005 in our estimation results; it is only marginally significant, which indicates that the bilinear effect is negligible for the IP series. For the IP series the test value is -23.724, which is almost 15 times bigger than that of Bera and Higgins and rejects the bilinear model as the null hypothesis at any significance level.
This shows that the error equation does not follow the bilinear model in the IP series, which is consistent with the test result in Table 1.11 for the IP series.

Table 1.11: Test results: H0: Bilinear vs. H1: GARCH

                        Bera & Higgins   Modified Cox test
S&P 500                 -.910            -6.775
British Pound           -2.797           -8.300
Industrial Production   -1.643           -23.724

1.4 Simulation Experiments

In this section we perform some simulation experiments to investigate the potential applicability of the modified Cox test.

We consider a linear regression model with two different nonlinear error equations as competing models. We specify an AR(1)-GARCH(1,1) model as the null hypothesis and an AR(1)-first order bilinear model as the alternative hypothesis. Thus, the nonnested model specifications are
\begin{align}
M_1: \; y_{1,t} &= \alpha_0 + \alpha_1 y_{1,t-1} + u_t, \tag{1.88} \\
u_t \mid I_{t-1} &\sim N(0, h_t), \tag{1.89} \\
h_t &= \kappa + \gamma u_{t-1}^2 + \delta h_{t-1}, \tag{1.90} \\
\text{and } u_t &= \sqrt{h_t}\,\nu_t, \quad \nu_t \sim N(0,1) \tag{1.91} \\
M_2: \; y_{2,t} &= \beta_0 + \beta_1 y_{2,t-1} + \varepsilon_t, \tag{1.92} \\
\varepsilon_t &= b_{11}\varepsilon_{t-1}\xi_{t-1} + \xi_t, \tag{1.93}
\end{align}
and $\xi_t \sim N(0,1)$.

We generate the artificial data in the following way. First, we generate normally distributed random variables with the RNDN procedure in GAUSS to compute the AR(1)-GARCH(1,1) series, $y_{1,t}$. Then we again generate normally distributed random variables with RNDN for the AR(1)-first order bilinear series, $y_{2,t}$. The pseudo-true population parameters for $M_1$ are given as $y_{1,t} = 0.15 + 0.85y_{1,t-1} + u_t$ with a strong GARCH effect, $h_t = 0.1 + 0.2u_{t-1}^2 + 0.75h_{t-1}$. For $M_2$, the pseudo-true population parameters are given as $y_{2,t} = 0.19 + 0.8y_{2,t-1} + \varepsilon_t$ where $\varepsilon_t = 0.385\varepsilon_{t-1}\xi_{t-1} + \xi_t$. The parameter values chosen for both models correspond to the empirical estimates from the time series. Next we combine these two data sets with weight $\lambda$ to generate a new data set, $y_t = \lambda y_{1,t} + (1-\lambda)y_{2,t}$. Using this newly generated data, we perform the testing experiments with $\lambda = 0$ and $\lambda = 1$. If $\lambda = 1$, then $y_t = y_{1,t}$, so $M_1$ is the correctly specified model, while $M_2$ is correctly specified if $\lambda = 0$.
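The data generating procedure just described can be sketched as follows. This is an illustrative Python/NumPy version, not the GAUSS code used for the reported experiments. The function names, the zero starting values, and the burn-in length are assumptions introduced for the sketch; the parameter values are the ones given in the text.

```python
import numpy as np

def simulate_m1(T, rng, burn=200):
    """AR(1)-GARCH(1,1): y = 0.15 + 0.85*y_{t-1} + u,
    h = 0.1 + 0.2*u_{t-1}^2 + 0.75*h_{t-1}, u = sqrt(h)*nu, nu ~ N(0,1)."""
    n = T + burn                      # burn-in is an assumption of this sketch
    nu = rng.standard_normal(n)
    y, u = np.zeros(n), np.zeros(n)
    h = np.full(n, 0.1 / (1.0 - 0.2 - 0.75))
    for t in range(1, n):
        h[t] = 0.1 + 0.2 * u[t - 1] ** 2 + 0.75 * h[t - 1]
        u[t] = np.sqrt(h[t]) * nu[t]
        y[t] = 0.15 + 0.85 * y[t - 1] + u[t]
    return y[burn:]

def simulate_m2(T, rng, burn=200):
    """AR(1)-bilinear: y = 0.19 + 0.8*y_{t-1} + e,
    e = 0.385*e_{t-1}*xi_{t-1} + xi, xi ~ N(0,1)."""
    n = T + burn
    xi = rng.standard_normal(n)
    y, e = np.zeros(n), np.zeros(n)
    for t in range(1, n):
        e[t] = 0.385 * e[t - 1] * xi[t - 1] + xi[t]
        y[t] = 0.19 + 0.8 * y[t - 1] + e[t]
    return y[burn:]

def combine(y1, y2, lam):
    """y_t = lam*y_{1,t} + (1 - lam)*y_{2,t}; lam = 1 makes M1 the true model."""
    return lam * y1 + (1.0 - lam) * y2

rng = np.random.default_rng(0)
y1 = simulate_m1(1000, rng)
y2 = simulate_m2(1000, rng)
y_garch_true = combine(y1, y2, lam=1.0)     # M1 correctly specified
y_bilinear_true = combine(y1, y2, lam=0.0)  # M2 correctly specified
```

Each replication of the experiment would draw a fresh pair of series in this way before the two models are estimated and the test statistics computed.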
The QMLEs of these two specifications are calculated with the BHHH algorithm. The simulation results are based on 200 replications with sample sizes of 1000, 2000, 3000, and 5000, and with sample sizes of 250, 500, and 750 for the small sample properties. T1 is the modified Cox test of the null that $M_1$ is correctly specified and T2 is the modified Cox test of the null that $M_2$ is correctly specified. When the null is true, the test value (T) should be approximately zero.

Table 1.12: Simulation results when GARCH(1,1) is true

               N = 1000           N = 2000           N = 3000           N = 5000
               T1       T2        T1       T2        T1       T2        T1       T2
mean           0.056    -19.498   0.129    -34.577   -0.027   -44.458   0.054    -59.661
s.d.           0.846    7.045     0.986    3.986     0.945    2.353     1.019    0.678
skew           -0.002   1.770     -0.03    5.692     -0.369   6.129     0.449    -0.276
kurt           2.866    4.423     2.368    39.269    3.782    44.359    3.630    3.138
R.F. (α=.05)   0.020    0.960     0.040    0.995     0.045    1.000     0.055    1.000
toohigh        0.005    0.000     0.030    0.000     0.010    0.000     0.040    0.000
toolow         0.015    0.960     0.010    0.995     0.035    1.000     0.015    1.000

Two-tailed test with α = 0.05 and λ = 1.

In Table 1.12, we report the simulation results when the null is the GARCH(1,1) model with $\lambda = 1$. The first four moments of the simulated distribution of the test are very close to those of the normal distribution for all four sample sizes. The actual size is very close to the nominal size for all sample sizes except N = 1000, for which the actual size is slightly understated.

Table 1.13 reports the simulation results when the null is the bilinear model for N = 1000, 2000, 3000, and 5000. The distribution of the simulation results appears to be very close to the standard normal distribution for all sample sizes, and the actual size is very close to the nominal size.

Figures 1.1 and 1.2 present the empirical density functions (edfs) of the modified Cox test against the cdf of N(0,1) for N = 1000, 2000, 3000, and 5000 and 200 replications.
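The summary rows reported in the simulation tables (mean, s.d., skew, kurt, R.F., toohigh, toolow) can be computed from the vector of simulated test statistics roughly as follows. This is a hedged sketch in Python: the function name `summarize_tests` is hypothetical, and the exact moment conventions used for the tables (for example the degrees-of-freedom correction in the standard deviation) are assumptions. Kurtosis is taken as the raw fourth standardized moment, so that a normal sample gives a value near 3, which matches the scale of the reported values.

```python
import numpy as np
from statistics import NormalDist

def summarize_tests(stats, alpha=0.05):
    """Summarize simulated test statistics against a two-tailed N(0,1) test."""
    stats = np.asarray(stats, dtype=float)
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # about 1.96 for alpha = 0.05
    z = (stats - stats.mean()) / stats.std()         # standardized values
    return {
        "mean": stats.mean(),
        "sd": stats.std(ddof=1),
        "skew": np.mean(z ** 3),
        "kurt": np.mean(z ** 4),                     # raw kurtosis, 3 under normality
        "toohigh": np.mean(stats > crit),            # rejections in the upper tail
        "toolow": np.mean(stats < -crit),            # rejections in the lower tail
        "rf": np.mean(np.abs(stats) > crit),         # total rejection frequency
    }

# Tiny illustrative input, not simulation output from the text.
example = summarize_tests([-3.0, -1.0, 0.0, 1.0, 2.5])
print(example["rf"], example["toohigh"], example["toolow"])
```

By construction the rejection frequency equals the sum of the two tail frequencies, which is the relationship visible in the R.F., toohigh, and toolow rows of the tables.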
Figure 1.1 shows that the edfs of the simulation results of T1 appear to be normal for all sample sizes. Figure 1.2 shows that the edfs of the simulation results of T2 appear to be approximately normal.

Table 1.13: Simulation results when Bilinear(1,1) is true

               N = 1000           N = 2000           N = 3000           N = 5000
               T1       T2        T1       T2        T1       T2        T1       T2
mean           -9.236   -0.063    -13.045  -0.145    -15.883  -0.039    -20.811  -0.136
s.d.           3.125    1.029     4.050    0.923     4.631    0.969     5.394    0.905
skew           2.002    0.022     2.203    -0.118    2.369    0.022     2.548    -0.241
kurt           5.949    2.794     6.851    2.439     8.069    2.745     8.720    2.625
R.F. (α=.05)   0.945    0.050     0.945    0.045     0.995    0.050     0.980    0.030
toohigh        0.000    0.020     0.000    0.005     0.000    0.025     0.000    0.000
toolow         0.945    0.030     0.945    0.040     0.995    0.025     0.980    0.030

Two-tailed test with α = 0.05 and λ = 0.

[Figure 1.1: Empirical density functions of the modified Cox test T1 against N(0,1) for N = 1000, 2000, 3000, and 5000. Horizontal axis: Cox test values of T1; vertical axis: density function.]

[Figure 1.2: Empirical density functions of the modified Cox test T2 against N(0,1) for N = 1000, 2000, 3000, and 5000. Horizontal axis: Cox test values of T2; vertical axis: density function.]

Table 1.14: Simulation results when GARCH(1,1) is true

               N = 250            N = 500            N = 750
               T1       T2        T1       T2        T1       T2
mean           -0.277   -5.507    0.024    -11.027   -0.007   -15.509
s.d.           0.760    3.466     0.875    4.973     0.936    6.106
skew           -0.446   0.411     -0.223   1.107     -0.177   1.519
kurt           3.015    1.485     2.871    2.667     3.104    3.736
R.F. (α=.05)   0.030    0.725     0.020    0.880     0.040    0.935
toohigh        0.000    0.000     0.005    0.000     0.020    0.000
toolow         0.030    0.725     0.015    0.880     0.020    0.935

Two-tailed test with α = 0.05 and λ = 1.

Table 1.14 reports the simulation results for the small sample sizes N = 250, 500, and 750. The mean and standard deviation for N = 250 deviate slightly from those of the standard normal N(0,1) but are close to normal for the other sample sizes. The test is undersized for all three sample sizes, and the rejection frequency of T2 is lower than 0.95 for N = 250 and 500.

In Table 1.15, the means are somewhat greater than zero in absolute value for all three sample sizes, but this deviation becomes smaller as the sample size increases. The actual size and the rejection frequency are approximately equal to the nominal levels.

Table 1.15: Simulation results when Bilinear(1,1) is true

               N = 250            N = 500            N = 750
               T1       T2        T1       T2        T1       T2
mean           -4.952   -0.417    -6.806   -0.348    -8.274   -0.256
s.d.           1.493    1.047     2.017    1.028     2.530    0.968
skew           2.406    -0.015    2.198    -0.325    2.076    -0.004
kurt           9.508    2.730     7.411    2.573     6.253    2.560
R.F. (α=.05)   0.935    0.070     0.940    0.070     0.945    0.055
toohigh        0.005    0.010     0.000    0.000     0.000    0.005
toolow         0.930    0.060     0.940    0.070     0.945    0.050

Two-tailed test with α = 0.05 and λ = 0.

1.5 Conclusions

A new approach based upon the conditional mean and the conditional variance specifications has been proposed in order to solve the computational difficulties of the Cox test. This modified Cox test has some attractive features.
The major attraction of the modified Cox test is its computational convenience, because it does not require computing the pseudo-true values. As the proposed test is based upon the specification of the first two conditional moments, it can also test other distributional features, unlike the DM test, which concerns the conditional mean only. Furthermore, it can be easily extended to more complicated nonlinear models. Monte Carlo experiments indicate that the proposed test seems to perform well across the sample sizes considered. The actual size of the proposed test is almost always close to the nominal size, although it differs slightly from the nominal size for N = 250 and 500. Further study is needed to examine the finite-sample properties of the test.

Chapter 2

A Robust Version of the Modified Cox Test

2.1 Introduction

In the previous chapter we proposed a modified version of the Cox test under specification of the first two conditional moments. We examined its applicability with three different data sets (the S&P 500 stock index, the £/$ exchange rate, and the U.S. monthly IP series), and we also carried out some Monte Carlo simulation experiments. Both the empirical and the simulated test results support the applicability of the modified Cox test, and the actual size in the simulation results is very close to the nominal size regardless of sample size. But these empirical test results and simulation performances are derived under the normality assumption. In this chapter we relax the normality assumption and extend our model in the univariate case to a robust version of the modified Cox test under nonnormality. In section 2, we reexamine our modified Cox test under nonnormality and derive the robust and nonrobust versions of the modified Cox test. Section 3 summarizes some Monte Carlo simulation experiments under nonnormality.
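In section 4, we compare the test results of the previous chapter, which assume normality, with the test results from the robust modified Cox test under nonnormality. Section 5 provides a summary and conclusion.

2.2 A Robust Modified Cox Test

Assume there are two competing nonnested parametric models under the conditional mean and variance specifications:
\begin{align}
M_1: \; y_t &= m_t(\theta_0) + u_t \tag{2.1} \\
E(y_t \mid I_{t-1}) &= m_t(\theta_0) \tag{2.2} \\
V(y_t \mid I_{t-1}) &= h_t(\theta_0) \tag{2.3}
\end{align}
and
\begin{align}
M_2: \; y_t &= \mu_t(\delta_0) + \varepsilon_t \tag{2.4} \\
E(y_t \mid I_{t-1}) &= \mu_t(\delta_0) \tag{2.5} \\
V(y_t \mid I_{t-1}) &= \eta_t(\delta_0). \tag{2.6}
\end{align}
Following the previous chapter, when $M_1$ is correctly specified, the modified Cox test is
\[
\hat T_{M_1} = \frac{1}{T}\sum_{t=1}^{T}\left[\frac{u_t(\hat\theta)\{m_t(\hat\theta)-\mu_t(\hat\delta)\}}{\eta_t(\hat\delta)} + \frac{u_t(\hat\theta)^2-h_t(\hat\theta)}{2}\left(\frac{1}{\eta_t(\hat\delta)}-\frac{1}{h_t(\hat\theta)}\right)\right] \tag{2.7}
\]
and its asymptotic distribution follows from [1]
\[
\begin{aligned}
\sqrt{T}\,\hat T_{M_1} = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\Big[\,&\frac{(y_t-m_t(\theta_0))\{m_t(\theta_0)-\mu_t(\delta^*)\}}{\eta_t(\delta^*)} + \frac{u_t(\theta_0)^2-h_t(\theta_0)}{2}\Big(\frac{1}{\eta_t(\delta^*)}-\frac{1}{h_t(\theta_0)}\Big) \\
&- \Big(\operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{s=1}^{T}\psi_s(\theta_0,\delta^*)\Big)\Big(\operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{s=1}^{T}A_s(\theta_0)\Big)^{-1}\nabla_\theta\log f_t(\theta_0)\Big] + o_p(1).
\end{aligned} \tag{2.8}
\]

[1] We follow these from the previous chapter. See section 1.2.2 for more details.

Let $\phi_t$ denote the first two terms of the summand in (2.8) and define
\[
\phi_t^* \equiv \phi_t - \Big(\operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{s=1}^{T}\psi_s(\theta_0,\delta^*)\Big)\Big(\operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{s=1}^{T}A_s(\theta_0)\Big)^{-1}\nabla_\theta\log f_t(\theta_0).
\]
Then,
\begin{align}
E(\phi_t^* \mid I_{t-1}) &= 0, \tag{2.9} \\
E(\phi_t^{*2} \mid I_{t-1}) &= E\Big[\Big(\phi_t - \Big(\operatorname*{plim}\frac{1}{T}\sum_{s}\psi_s(\theta_0,\delta^*)\Big)\Big(\operatorname*{plim}\frac{1}{T}\sum_{s}A_s(\theta_0)\Big)^{-1}\nabla_\theta\log f_t(\theta_0)\Big)^2 \,\Big|\, I_{t-1}\Big] \tag{2.10} \\
&= V(\phi_t^* \mid I_{t-1}). \tag{2.11}
\end{align}
Under conditional normality, when $M_1$ is correctly specified,
\[
E(u_t(u_t^2-h_t) \mid I_{t-1}) = 0 \tag{2.12}
\]
and
\[
E(u_t^4 \mid I_{t-1}) = 3h_t(\theta_0)^2. \tag{2.13}
\]
So, with $D_{t1} \equiv \{m_t(\theta_0)-\mu_t(\delta^*)\}/\eta_t(\delta^*)$ and $D_{t2} \equiv 1/\eta_t(\delta^*) - 1/h_t(\theta_0)$,
\[
\begin{aligned}
E(\phi_t^{*2} \mid I_{t-1}) = {} & D_{t1}^2 h_t(\theta_0) + \frac{h_t(\theta_0)^2}{2}D_{t2}^2
+ \Big(\frac{1}{T}\sum_{s=1}^{T}\psi_s(\theta_0,\delta^*)\Big)\Big(\frac{1}{T}\sum_{s=1}^{T}A_s(\theta_0)\Big)^{-1}\Big(\frac{1}{T}\sum_{s=1}^{T}\psi_s(\theta_0,\delta^*)\Big)' \\
& - 2\Big(\frac{1}{T}\sum_{s=1}^{T}\psi_s(\theta_0,\delta^*)\Big)\Big(\frac{1}{T}\sum_{s=1}^{T}A_s(\theta_0)\Big)^{-1}\Big(D_{t1}\nabla_\theta m_t(\theta_0) + \frac{D_{t2}}{2}\nabla_\theta h_t(\theta_0)\Big).
\end{aligned} \tag{2.14}
\]
Thus,
\[
\begin{aligned}
\hat V_{T_1} = \frac{1}{T}\sum_{t=1}^{T}\Big[\,&\hat D_{t1}^2 h_t(\hat\theta) + \frac{h_t(\hat\theta)^2}{2}\hat D_{t2}^2
+ \Big(\frac{1}{T}\sum_{s}\psi_s(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{s}A_s(\hat\theta)\Big)^{-1}\Big(\frac{1}{T}\sum_{s}\psi_s(\hat\theta,\hat\delta)\Big)' \\
&- 2\Big(\frac{1}{T}\sum_{s}\psi_s(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{s}A_s(\hat\theta)\Big)^{-1}\Big(\hat D_{t1}\nabla_\theta m_t(\hat\theta) + \frac{\hat D_{t2}}{2}\nabla_\theta h_t(\hat\theta)\Big)\Big].
\end{aligned} \tag{2.15}
\]
Under conditional normality, the modified Cox test satisfies $\sqrt{T}\,\hat T_{M_1}/\hat V_{T_1}^{1/2} \sim N(0,1)$ asymptotically, but if conditional normality does not hold, the limiting distribution of $\sqrt{T}\,\hat T_{M_1}/\hat V_{T_1}^{1/2}$ is not standard normal in general. Under nonnormality the modified Cox test derived in the previous chapter is not valid, and the actual size of this nonrobust modified Cox test can differ from the nominal size. The robust modified Cox test under nonnormality is extended from equations (2.10) and (2.11). Under nonnormality,
\begin{align}
E[u_t(u_t^2-h_t) \mid I_{t-1}] &= E(u_t^3 - u_t h_t \mid I_{t-1}) = E(u_t^3 \mid I_{t-1}), \tag{2.16} \\
E[(u_t^2-h_t)^2 \mid I_{t-1}] &= E(u_t^4 - 2u_t^2h_t + h_t^2 \mid I_{t-1}) = E(u_t^4 \mid I_{t-1}) - h_t^2. \tag{2.17}
\end{align}
Under nonnormality, $E(u_t^3 \mid I_{t-1})$ and $E(u_t^4 \mid I_{t-1})$ are generally unspecified, but we can derive the conditional variance of the robust modified Cox test using the Law of Iterated Expectations (LIE):
\[
E\big[D_{t1}D_{t2}\,E(u_t(\theta_0)^3 \mid I_{t-1})\big] = E\big[D_{t1}D_{t2}\,u_t(\theta_0)^3\big] \tag{2.18}
\]
and
\[
E\Big[\frac{D_{t2}^2}{4}E(u_t(\theta_0)^4 \mid I_{t-1})\Big] = E\Big[\frac{D_{t2}^2}{4}u_t(\theta_0)^4\Big]. \tag{2.19}
\]
So the conditional variance of the robust modified Cox test under nonnormality is
\[
\begin{aligned}
\hat V_{T_1}^{R} = \frac{1}{T}\sum_{t=1}^{T}\Big[\,&\hat D_{t1}^2 h_t(\hat\theta) + \hat D_{t1}\hat D_{t2}\,u_t(\hat\theta)^3 + \frac{\hat D_{t2}^2}{4}\big(u_t(\hat\theta)^4 - h_t(\hat\theta)^2\big) \\
&+ \Big(\frac{1}{T}\sum_{s}\psi_s(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{s}A_s(\hat\theta)\Big)^{-1}\Big(\frac{1}{T}\sum_{s}\psi_s(\hat\theta,\hat\delta)\Big)' \\
&- 2\Big(\frac{1}{T}\sum_{s}\psi_s(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{s}A_s(\hat\theta)\Big)^{-1}\Big(\hat D_{t1}\nabla_\theta m_t(\hat\theta) + \frac{\hat D_{t2}}{4h_t(\hat\theta)^2}\big(u_t(\hat\theta)^4 - h_t(\hat\theta)^2\big)\nabla_\theta h_t(\hat\theta)\Big)\Big].
\end{aligned} \tag{2.20}
\]
Now we apply these properties as follows.

Procedure 2.1

1. Obtain $\hat\theta$ and $\hat\delta$, the QML estimators of $\theta_0$ and $\delta_0$. Save the residuals $u_t(\hat\theta)$ and the conditional variance $h_t(\hat\theta)$ from the quasi-log likelihood function $\log f(y_t \mid I_{t-1}; \theta)$, and $\varepsilon_t(\hat\delta)$ and $\eta_t(\hat\delta)$ from the quasi-log likelihood function $\log g(y_t \mid I_{t-1}; \delta)$.

2. Compute $\hat D_{t1}$, $\hat D_{t2}$, $\frac{1}{T}\sum_{t=1}^{T}\psi_t(\hat\theta,\hat\delta)$, $\big(\frac{1}{T}\sum_{t=1}^{T}A_t(\hat\theta)\big)^{-1}$, $\nabla_\theta\log f_t(\hat\theta)$, $u_t(\hat\theta)^3$, $u_t(\hat\theta)^4$, and $h_t(\hat\theta)^2$, where $\hat D_{t1} \equiv \{m_t(\hat\theta)-\mu_t(\hat\delta)\}/\eta_t(\hat\delta)$, $\hat D_{t2} \equiv 1/\eta_t(\hat\delta) - 1/h_t(\hat\theta)$, and $A_t$ and $\psi_t$ are as defined in the previous chapter.

3. Compute
\[
\sqrt{T}\,\hat T_{M_1} = T^{-1/2}\sum_{t=1}^{T}\Big[\,u_t(\hat\theta)\hat D_{t1} + \frac{u_t(\hat\theta)^2-h_t(\hat\theta)}{2}\hat D_{t2}
- \Big(\frac{1}{T}\sum_{s=1}^{T}\psi_s(\hat\theta,\hat\delta)\Big)\Big(\frac{1}{T}\sum_{s=1}^{T}A_s(\hat\theta)\Big)^{-1}\nabla_\theta\log f_t(\hat\theta)\Big]
\]
and $\hat V_{T_1}^{R}$ as in (2.20), and use the robust modified Cox test statistic, $\sqrt{T}\,\hat T_{M_1}/(\hat V_{T_1}^{R})^{1/2}$, which is asymptotically standard normal under the null hypothesis.

This robust modified Cox test has some appealing characteristics. First, as Bollerslev and Wooldridge (1992) showed for their robust version of the LM test, this approach is also valid under normality and can be applied when the normality assumption does hold. Second, because the procedure requires only the first derivatives of the conditional mean and variance functions, it is relatively easy to compute. Finally, even though this robust inference procedure requires the conditional third and fourth moments, this is not a restrictive condition, and we can calculate the third and fourth moments for the robust modified Cox test using the LIE.

2.3 Monte Carlo Experiments

To investigate the applicability of the robust modified Cox test, we perform some simulation experiments for different sample sizes. Following the previous chapter, we consider a linear regression model with two different nonlinear error equations: we specify the AR(1)-GARCH(1,1) model as the null hypothesis and the AR(1)-first order bilinear model as the alternative hypothesis. Thus, these two nonnested model specifications are
\begin{align}
M_1: \; y_t &= \alpha_0 + \alpha_1 y_{t-1} + u_t, \tag{2.21} \\
u_t \mid I_{t-1} &\sim \text{i.i.d.}(0, h_t), \tag{2.22} \\
h_t &= \kappa + \gamma u_{t-1}^2 + \delta h_{t-1}, \tag{2.23} \\
\text{and } u_t &= \sqrt{h_t}\,\nu_t \tag{2.24} \\
M_2: \; y_t &= \beta_0 + \beta_1 y_{t-1} + \varepsilon_t, \tag{2.25} \\
\text{and } \varepsilon_t &= b_{11}\varepsilon_{t-1}\xi_{t-1} + \xi_t. \tag{2.26}
\end{align}
As is often seen in time series analysis, high frequency financial time series are leptokurtic, and the unconditional distribution of many financial time series typically shows fatter tails than a normal distribution.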
But as shown in Engle (1982) and Bollerslev (1986), the unconditional error distribution can be leptokurtic even though the conditional error distribution is normal. Bollerslev (1987) proposed that a nonnormal error distribution, for example conditionally t-distributed errors, permits a leptokurtic conditional distribution and also accounts for the unconditional kurtosis. To investigate the validity of inference from the robust modified Cox test under nonnormality, we generate the error terms from two different nonnormal distributions. First, we let $\nu_t$ (and likewise $\xi_t$) be conditionally distributed as Student's t with 5 and 10 degrees of freedom. The mean and variance of the t-distribution are 0 and $v/(v-2)$ for $v \ge 3$, respectively, where $v$ is the degrees of freedom, so the variance of $t_v$ distributed random variables is bigger than the variance of standard normal random variables if $v$ is relatively small. But the t-distributed random variables still have a symmetric distribution. Next, in order to examine the effect of an asymmetric error distribution, we generate the error terms from two i.i.d. $\chi^2$ distributions; that is, $\nu_t$ (and likewise $\xi_t$) is formed from $x_{t1}$ and $x_{t2}$, where $x_{t1}$ and $x_{t2}$ are i.i.d. $\chi^2$ variates. Thus, the distribution of $\nu_t$ (and of $\xi_t$) is i.i.d.(0,1) and asymmetric. The $t_v$ distributed random variables were formed as $\sqrt{v-2}$ times a N(0,1) random variable divided by the square root of a $\chi_v^2$ variate. The normal variate was generated by the RNDN procedure in GAUSS and the $\chi_v^2$ variate with $v$ degrees of freedom by the RNDCHI procedure in GAUSS. Apart from the error generating procedures under nonnormality, we proceed as in the previous chapter for the pseudo-true population parameters for $M_1$ and $M_2$ [2] and for the data generating procedure (see 1.4 Simulation Experiments).
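The construction of standardized nonnormal errors can be sketched as follows. This is an illustrative Python/NumPy version rather than the GAUSS code (RNDN, RNDCHI) used for the experiments. The function names are hypothetical. The t construction follows the description above. The chi-square construction is an assumption: it simply centers and scales a single $\chi^2$ variate to mean 0 and variance 1, which is a stand-in for, not a reproduction of, the two-variate construction used in the text, and the degrees of freedom in the example call are likewise assumed.

```python
import numpy as np

def standardized_t(v, size, rng):
    """sqrt(v - 2) * N(0,1) / sqrt(chi2_v): a t_v variate rescaled to unit variance."""
    z = rng.standard_normal(size)
    chi2 = rng.chisquare(v, size)
    return np.sqrt(v - 2.0) * z / np.sqrt(chi2)

def standardized_chi2(k, size, rng):
    """Centered and scaled chi2_k variate: mean 0, variance 1, right-skewed.
    (Stand-in construction; an assumption of this sketch.)"""
    x = rng.chisquare(k, size)
    return (x - k) / np.sqrt(2.0 * k)

rng = np.random.default_rng(0)
nu_t5 = standardized_t(5, 200_000, rng)
nu_chi = standardized_chi2(1, 200_000, rng)
print("t(5):  mean, variance =", nu_t5.mean(), nu_t5.var())
print("chi2:  mean, variance, third moment =",
      nu_chi.mean(), nu_chi.var(), np.mean(nu_chi ** 3))
```

Both error series have mean near 0 and variance near 1, so they can replace the N(0,1) innovations in the data generating process without changing its scale; only the t errors are symmetric.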
The QMLEs for both models are found with the BHHH algorithm, and the simulation results are based upon 200 replications and sample sizes of 500, 1000, 2000, and 3000.

[2] We change the bilinear parameter value to $b_{11} = 0.085$ for computational convenience.

Tables 2.1 and 2.2 report the simulation results under nonnormality, with the error terms generated from the $\chi^2$ distribution. In Table 2.1, the first four moments of the simulated distribution of T1 are approximately normal, but the means are somewhat larger than zero in absolute value for N = 500 and 1000. The actual size is very close to the nominal size for all sample sizes.

Table 2.1: Robust Cox test results when GARCH(1,1) is true

               N = 500            N = 1000           N = 2000           N = 3000
               T1       T2        T1       T2        T1       T2        T1       T2
mean           -0.373   -14.269   -0.207   -23.380   -0.075   -35.692   0.105    -46.313
s.d.           0.966    3.341     1.069    4.390     1.008    5.043     1.050    2.436
skew           -0.272   2.023     -0.031   3.080     0.003    4.011     0.170    10.404
kurt           4.868    6.470     4.361    11.810    3.148    19.776    2.599    127.053
R.F. (α=.05)   0.050    1.000     0.060    1.000     0.060    1.000     0.055    1.000
toohigh        0.010    0.000     0.015    0.000     0.020    0.000     0.045    0.000
toolow         0.040    1.000     0.045    1.000     0.040    1.000     0.010    1.000
max [3]        0.148              0.076              0.015              0.060
min [4]        -0.096             -0.075             -0.056             -0.047
mean [5]       0.004              0.001              0.001              -0.000

Data are generated from the χ²(1) distribution and R = 200.

[3] The maximum value of correlations for sample sizes N = 500, 1000, 2000, and 3000, with 200 replications.
[4] The minimum value of correlations for sample sizes N = 500, 1000, 2000, and 3000, with 200 replications.
[5] The mean of correlations.

Table 2.2 reports the simulation results from the nonrobust modified Cox test under nonnormality, with the error terms formed from the $\chi^2$ distribution. Nonrobustness means that we apply the modified Cox test derived under the normality assumption to a situation in which the normality assumption no longer holds. The means and standard deviations are overstated in absolute value, and the actual size is more than twice as large as the nominal size, varying from 0.095 to 0.160. Under nonnormality, the robust modified Cox test performs far better than the nonrobust modified Cox test.

Table 2.2: Nonrobust Cox test results when GARCH(1,1) is true

               N = 500            N = 1000           N = 2000           N = 3000
               T1       T2        T1       T2        T1       T2        T1       T2
mean           -0.495   -13.930   -0.439   -22.590   0.059    -36.517   0.008    -45.741
s.d.           1.211    3.821     1.262    5.108     1.383    3.139     1.240    3.822
skew           0.165    1.669     0.029    2.337     0.168    6.017     0.106    6.561
kurt           3.582    4.636     2.875    7.194     2.879    48.729    3.354    54.734
R.F. (α=.05)   0.120    0.995     0.160    1.000     0.135    1.000     0.095    1.000
toohigh        0.035    0.000     0.035    0.000     0.085    0.000     0.045    0.000
toolow         0.085    0.995     0.125    1.000     0.050    1.000     0.050    1.000
max            0.117              0.083              0.072              0.045
min            -0.102             -0.069             -0.052             -0.055
mean           -0.001             -0.004             -0.003             0.003

Data are generated from the χ²(1) distribution and R = 200.

Figures 2.1 and 2.2 show the empirical density functions (edfs) of the robust and the nonrobust modified Cox tests against the cdf of N(0,1). In Figure 2.1, the empirical density functions appear to match the cdf of N(0,1) for all sample sizes except N = 500. In Figure 2.2, the edfs are distorted and far from normal for all four sample sizes.

Table 2.3 reports the simulation results of the robust modified Cox test under the $\chi^2$ distribution when the bilinear model is correctly specified.
The actual size is slightly larger than the nominal size but approaches the nominal size as the sample size becomes larger. The first four moments of the simulated distribution of T2 are close to normal but, again, the mean and standard deviation differ slightly from those of N(0,1) for N = 500.

Table 2.3: Robust Cox test results when Bilinear(1,1) is true

               N = 500            N = 1000           N = 2000           N = 3000
               T1       T2        T1       T2        T1       T2        T1       T2
mean           -3.736   -0.404    -4.898   -0.090    -6.865   -0.106    -8.278   0.152
s.d.           1.930    1.326     2.711    1.108     3.563    1.182     4.448    1.060
skew           0.215    -0.811    0.405    -0.294    0.556    -0.328    0.589    -0.469
kurt           2.689    4.066     2.389    2.479     2.268    2.549     2.182    3.276
R.F. (α=.05)   0.840    0.120     0.820    0.090     0.865    0.080     0.845    0.060
toohigh        0.000    0.005     0.000    0.020     0.000    0.010     0.000    0.030
toolow         0.840    0.115     0.820    0.070     0.865    0.070     0.845    0.030
max            0.151              0.101              0.101              0.088
min            -0.126             -0.077             -0.051             -0.044
mean           -0.002             -0.000             -0.002             0.001

Data are generated from the χ²(1) distribution and R = 200.

Table 2.4 reports the simulation results of the nonrobust modified Cox test under the $\chi^2$ distribution. As expected, the first four moments of the simulated distribution of T2 are far from normal, and the actual size is very different from the nominal size and overstated, varying from 0.150 to 0.260.

Figures 2.3 and 2.4 show the edfs of the robust and nonrobust modified Cox tests of T2.
In Figure 2.3, the edfs are very close to the cdf of N(0,1) for all four sample sizes, but the edfs are severely distorted and far from the cdf of N(0,1) in Figure 2.4.

Table 2.4: Nonrobust Cox test results when Bilinear(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -3.736    0.172   -4.898    0.309   -6.865    0.358   -8.278    0.322
s.d             1.930    2.080    2.711    2.820    3.563    2.912    4.448    2.877
skew            0.215    1.138    0.405    1.610    0.556    0.450    0.589    0.337
kurt            2.689    6.646    2.389    9.842    2.268    4.106    2.182    3.376
R.F.(α=.05)     0.840    0.290    0.820    0.350    0.865    0.450    0.845    0.475
toohigh         0.000    0.150    0.000    0.220    0.000    0.260    0.000    0.255
toolow          0.840    0.140    0.820    0.130    0.865    0.190    0.845    0.220

max                 0.135             0.074             0.079             0.049
min                -0.107            -0.068            -0.054            -0.045
mean               -0.002             0.001             0.001             0.000

Data are generated from the $\chi^2_{(1)}$ distribution and R = 200.

Table 2.5 and Table 2.6 report the simulation results for the robust and the nonrobust modified Cox tests under the $t_5$ distribution. In Table 2.5, the means and standard deviations are close to those of the standard normal distribution, but the actual size is below the nominal size for N = 500 and slightly above it for the other three sample sizes. Table 2.6 reports the simulation results of the nonrobust modified Cox test under the $t_5$ distribution. The actual size is far from the nominal size, and the four moments of the unconditional probability distribution of T1 are far from normal, for all sample sizes.

Table 2.5: Robust Cox test results when GARCH(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -0.112  -13.739    0.101  -23.109    0.026  -35.247    0.147  -45.876
s.d             0.965    3.003    1.065    3.504    1.115    5.193    1.105    2.170
skew            0.693    1.657    0.277    3.435   -0.016    3.866    0.469    8.670
kurt            3.748    4.852    2.937   15.485    3.410   18.060    3.251   96.825
R.F.            0.025    1.000    0.070    1.000    0.075    1.000    0.075    1.000
toohigh         0.020    0.000    0.055    0.000    0.055    0.000    0.050    0.000
toolow          0.005    1.000    0.015    1.000    0.020    1.000    0.025    1.000

Data are generated from the t-distribution with 5 degrees of freedom and R = 200.
Table 2.6: Nonrobust Cox test results when GARCH(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -0.754  -13.255   -0.407  -22.793   -0.102  -35.999    0.019  -45.765
s.d             1.912    3.338    2.309    4.134    2.228    3.470    2.232    2.439
skew           -1.327    1.352   -0.971    2.741   -0.531    5.113   -0.354    5.532
kurt            7.495    3.704    4.700   10.334    2.963   31.989    2.981   39.032
R.F.            0.250    1.000    0.325    1.000    0.370    1.000    0.385    1.000
toohigh         0.035    0.000    0.125    0.000    0.180    0.000    0.200    0.000
toolow          0.215    1.000    0.200    1.000    0.190    1.000    0.185    1.000

Data are generated from the t-distribution with 5 degrees of freedom and R = 200.

Figures 2.5 and 2.6 show the edfs of the robust and the nonrobust modified Cox tests. The edfs in Figure 2.5 are very close to the cdf of N(0,1), while the edfs in Figure 2.6 are far from the cdf of N(0,1).

Tables 2.7 and 2.8 report the simulation results of the robust and the nonrobust modified Cox tests under the $t_5$ distribution when the null is the bilinear model. The simulation results show that the robust modified Cox test generally performs far better than the nonrobust modified Cox test under nonnormality, except for N = 500, where the results from the robust modified Cox test are very similar to those from the nonrobust modified Cox test. The edfs in Figure 2.7 deviate slightly from the cdf of N(0,1). In Figure 2.8, the edfs are more distorted relative to the cdf of N(0,1). We suspect that this relatively poor performance may be mainly due to the parameter value chosen for the bilinear effect.6

6 We could not compute the Hessian matrix for the GARCH(1,1) model when we used the same parameter value (0.385) for the bilinear effect as in the previous chapter, so we chose a different parameter value (0.085) that managed to fit in the GARCH(1,1) model. But this parameter value chosen for the bilinear effect is rather small and does not provide a strong bilinear effect. Thus, the error terms generated from this bilinear parameter value do not fit well enough to perform the simulation experiments, and the performance is particularly poor in the small sample size.

Table 2.7: Robust Cox test results when Bilinear(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -4.052   -0.569   -5.386   -0.375   -7.823   -0.151   -8.496   -0.209
s.d             2.208    1.529    2.828    1.193    4.206    0.932    4.693    0.967
skew           -0.429   -1.332    0.167   -0.678    0.275   -0.323   -0.011   -0.277
kurt            4.425    5.217    2.316    3.979    1.931    2.752    1.908    3.141
R.F.(α=.05)     0.880    0.145    0.865    0.080    0.900    0.040    0.905    0.050
toohigh         0.000    0.005    0.000    0.000    0.000    0.000    0.000    0.005
toolow          0.880    0.140    0.865    0.080    0.900    0.040    0.905    0.045

Data are generated from the t-distribution with 5 degrees of freedom and R = 200.

Table 2.8: Nonrobust Cox test results when Bilinear(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -4.052   -0.101   -5.386   -0.222   -7.823   -0.174   -8.496   -0.320
s.d             2.208    1.587    2.828    1.577    4.206    1.630    4.693    1.840
skew           -0.429    1.126    0.167    0.824    0.275    0.050   -0.011   -0.783
kurt            4.425    7.683    2.316    4.716    1.931    4.170    1.908    7.390
R.F.(α=.05)     0.880    0.145    0.865    0.175    0.900    0.210    0.905    0.265
toohigh         0.000    0.080    0.000    0.070    0.000    0.080    0.000    0.105
toolow          0.880    0.065    0.865    0.105    0.900    0.130    0.905    0.160

Data are generated from the t-distribution with 5 degrees of freedom and R = 200.

Tables 2.9 and 2.10 report the simulation results under the $t_{10}$ distribution. In Table 2.9, the simulation results are very close to normal and the actual size is also very close to the nominal size for all four sample sizes. The simulation results of the nonrobust modified Cox test in Table 2.10 are far from normal. Figures 2.9 and 2.10 show evidence that the robust modified Cox test performs better than the nonrobust modified Cox test under nonnormality and that the robust modified Cox test is also very accurate.

Table 2.9: Robust Cox test results when GARCH(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -0.083  -13.378    0.041  -22.550    0.118  -35.794    0.079  -45.379
s.d             0.882    2.643    0.967    2.523    1.050    1.394    1.021    0.709
skew            0.130    1.826    0.246    3.332    0.085    6.267    0.051    0.630
kurt            2.532    5.476    3.438   16.512    2.723   62.431    3.496    5.046
R.F.            0.015    1.000    0.050    1.000    0.055    1.000    0.060    1.000
toohigh         0.005    0.000    0.035    0.000    0.030    0.000    0.045    0.000
toolow          0.010    1.000    0.015    1.000    0.025    1.000    0.015    1.000

Data are generated from the t-distribution with 10 degrees of freedom and R = 200.

Table 2.10: Nonrobust Cox test results when GARCH(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -0.030  -22.592   -0.096  -22.347    0.120  -35.860    0.129  -45.420
s.d             1.309    2.622    1.315    2.969    1.443    1.179    1.523    0.755
skew           -0.158    3.182    0.226    2.914   -0.117    4.036    0.269    0.377
kurt            3.131   14.538    2.921   12.381    2.859   29.950    2.813    2.816
R.F.            0.130    1.000    0.110    1.000    0.160    1.000    0.210    1.000
toohigh         0.050    0.000    0.030    0.000    0.090    0.000    0.125    0.000
toolow          0.080    1.000    0.080    1.000    0.070    1.000    0.085    1.000

Data are generated from the t-distribution with 10 degrees of freedom and R = 200.

Tables 2.11 and 2.12 report the simulation results of the robust and the nonrobust modified Cox tests under the $t_{10}$ distribution when the null is the bilinear model. As expected, the simulation results are very similar to those in Tables 2.7 and 2.8. Figures 2.11 and 2.12 show the edfs of the robust and the nonrobust modified Cox tests. Again, the edfs in Figure 2.11 deviate slightly from the cdf of N(0,1), but they are relatively close to the cdf of N(0,1) compared to the edfs in Figure 2.12.

Table 2.11: Robust Cox test results when Bilinear(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -3.885   -0.510   -6.280   -0.632   -8.182   -0.261  -10.227   -0.189
s.d             2.195    1.407    3.550    1.232    4.962    0.992    6.202    1.005
skew           -0.596   -1.532    0.060   -0.453    0.024   -0.293   -0.072   -0.204
kurt            2.951    6.808    2.097    3.329    1.865    2.914    1.700    2.499
R.F.(α=.05)     0.830    0.120    0.850    0.160    0.880    0.070    0.875    0.065
toohigh         0.000    0.000    0.000    0.010    0.000    0.010    0.000    0.010
toolow          0.830    0.120    0.850    0.150    0.880    0.060    0.875    0.055

Data are generated from the t-distribution with 10 degrees of freedom and R = 200.

Table 2.12: Nonrobust Cox test results when Bilinear(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -4.052   -0.101   -5.864   -0.285   -8.324   -0.153   -9.712    0.167
s.d             2.208    1.587    3.370    1.215    4.763    1.219    6.019    1.420
skew           -0.429    1.126   -0.075    0.619    0.043    0.343   -0.063    0.356
kurt            2.076    4.195    2.076    4.195    1.778    2.880    1.783    4.015
R.F.(α=.05)     0.870    0.120    0.870    0.120    0.915    0.120    0.885    0.125
toohigh         0.000    0.080    0.000    0.045    0.000    0.065    0.000    0.050
toolow          0.880    0.065    0.870    0.075    0.915    0.055    0.885    0.075

Data are generated from the t-distribution with 10 degrees of freedom and R = 200.

So far, we have illustrated the applicability of the robust modified Cox test from procedure 2.1, but in order to use this proposed test we have to calculate each term in the conditional variance in equation (2.15). This might be a little cumbersome. As an alternative, we suggest $E[q_t^2 \mid I_{t-1}]$ from equation (2.10) as the conditional variance of the robust modified Cox test. Both the robust and the nonrobust modified Cox tests are originally derived from equations (2.10) and (2.11). An attractive feature of this robust modified Cox test is that it does not require computing every term in equation (2.15). If the error terms do not follow normality, then the third and fourth moments are automatically calculated in equation (2.10). We did the simulation experiments only for some selected cases.
But these simulation results strongly suggest that this alternative robust modified Cox test would perform properly in other cases as well.

Table 2.13 reports the simulation results under nonnormality when we use equation (2.10) as the conditional variance for the robust modified Cox test. These simulation results are very similar to those in Table 2.1. Note that the actual size in Table 2.13 is slightly below the nominal size, while the actual size in Table 2.1 is slightly above the nominal size. But the differences are trivial in both cases, and the actual sizes are very close to the nominal size for all sample sizes. Figure 2.13 shows the edfs of the alternatively proposed robust modified Cox test. The edfs are also very close to those in Figure 2.1, and they approach the cdf of N(0,1) as the sample size increases.

Table 2.13: Robust Cox test results when GARCH(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -0.355  -14.130   -0.192  -22.715   -0.059  -35.198    0.043  -45.845
s.d             0.899    3.597    0.927    4.937    0.951    6.516   0.9745    3.761
skew            0.313    1.840    0.285    2.222   -0.668    3.498   -0.043    5.766
kurt            3.848    5.427    2.791    6.731    5.060   14.637    2.834   37.900
R.F.(α=.05)     0.030    0.995    0.025    1.000    0.040    1.000    0.045    1.000
toohigh         0.010    0.000    0.010    0.000    0.010    0.000    0.020    0.000
toolow          0.020    0.995    0.015    1.000    0.030    1.000    0.025    1.000

max                 0.096             0.061             0.047             0.060
min                -0.096            -0.077            -0.051            -0.056
mean                0.004             0.001            -0.000            -0.002

Data are generated from the $\chi^2_{(1)}$ distribution, R = 200, and the conditional variance from equation (2.10) is used.

Table 2.14 reports the simulation results under the $\chi^2_{(1)}$ distribution when the bilinear model is correctly specified. The means and standard deviations of these simulated results are close to those of the standard normal distribution, and the actual size is lower than that in Table 2.3 for N = 500 and 1000. Figure 2.14 shows the edfs of this proposed test. The edfs are very close to the cdf of N(0,1) for all sample sizes, as in Figure 2.3.
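The comparison of an edf with the cdf of N(0,1) that the figures make visually can also be summarized numerically. The sketch below is an illustration only, under the assumption that the edf is the usual empirical distribution function of the R simulated statistics; the function names are hypothetical, and the sup-distance summary (a Kolmogorov-type distance) is not a statistic reported in the dissertation.

```python
import math


def normal_cdf(x):
    """cdf of N(0,1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def edf(stats, x):
    """Empirical distribution function: share of statistics <= x."""
    return sum(1 for s in stats if s <= x) / len(stats)


def max_edf_gap(stats):
    """Largest vertical distance between the edf and the N(0,1) cdf.

    The edf is a step function, so the gap is checked just before and
    just after each jump.
    """
    xs = sorted(stats)
    n = len(xs)
    gap = 0.0
    for k, x in enumerate(xs):
        f = normal_cdf(x)
        gap = max(gap, abs((k + 1) / n - f), abs(k / n - f))
    return gap
```

A small value of `max_edf_gap` corresponds to an edf that lies close to the N(0,1) cdf, and a large value to the distorted edfs described for the nonrobust test.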
Table 2.15 reports the simulation results of the robust modified Cox test using the conditional variance from equation (2.10) under the $t_{10}$ distribution. The simulation statistics improve on those in Table 2.7. The four moments of the probability distribution of T2 are very close to normal, and the actual size is also very close to the nominal size for all sample sizes. Note that the actual size for N = 500 in Table 2.15 is 0.040, which is close to the nominal size of 0.05 and less than one third of the actual size of 0.145 in Table 2.7 for N = 500. Figure 2.15 shows the edfs of the robust modified Cox test.

Table 2.14: Robust Cox test results when Bilinear(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -3.736   -0.250   -4.898   -0.076   -6.865    0.091   -8.278    0.078
s.d             1.930    1.139    2.711    1.026    3.563    1.196    4.448    1.063
skew            0.215   -0.433    0.405   -0.380    0.556   -0.473    0.589   -0.074
kurt            2.689    2.651    2.390    2.900    2.268    3.068    2.182    2.944
R.F.(α=.05)     0.840    0.085    0.820    0.060    0.865    0.100    0.845    0.060
toohigh         0.000    0.005    0.000    0.010    0.000    0.040    0.000    0.025
toolow          0.840    0.080    0.820    0.050    0.865    0.060    0.845    0.035

max                 0.094             0.099             0.071             0.066
min                -0.122            -0.088            -0.063            -0.044
mean               -0.001            -0.001            -0.001             0.000

Data are generated from the $\chi^2_{(1)}$ distribution, R = 200, and the conditional variance from equation (2.10) is used.

Table 2.15: Robust Cox test results when Bilinear(1,1) is true

sample size       N=500             N=1000            N=2000            N=3000
                 T1       T2       T1       T2       T1       T2       T1       T2
mean           -4.052   -0.235   -5.386   -0.254   -7.823   -0.119   -8.496   -0.197
s.d             2.208    0.920    2.828    1.021    4.206    0.914    4.693    0.914
skew           -0.429   -0.095    0.167   -0.051    0.275   -0.143   -0.011   -0.180
kurt            4.425    2.538    2.316    2.609    1.931    2.553    1.908    3.065
R.F.(α=.05)     0.880    0.040    0.865    0.050    0.900    0.015    0.905    0.045
toohigh         0.000    0.000    0.000    0.010    0.000    0.000    0.000    0.005
toolow          0.880    0.040    0.865    0.040    0.900    0.015    0.905    0.040

Data are generated from the $t_{10}$ distribution, R = 200, and the conditional variance from equation (2.10) is used.
The edfs are almost equivalent to those in Figure 2.7, and they appear to approximate the cdf of N(0,1). The simulation results in Tables 2.13 through 2.15 show that the actual size is very close to the nominal size and that the simulation statistics are very similar to those in Tables 2.1, 2.3, and 2.7. The main difference between this type of robust modified Cox test and the previously proposed one is that the actual size of the alternative robust modified Cox test is slightly understated, while the actual size of the robust modified Cox test is slightly overstated.

[Figures 2.1 through 2.15: edfs of the robust and nonrobust modified Cox test statistics versus the cdf of N(0,1), for N = 500, 1000, 2000, and 3000.]

2.4 Empirical Application under Nonnormality

As noted earlier, high frequency financial time series usually exhibit leptokurtosis, and the distribution of these time series typically shows fatter tails than the normal distribution. Applying the modified Cox test, which assumes normality, to a situation with nonnormality generally leads to invalid test inferences. As seen in Tables 1.1 through 1.3 in section 1.3, all the empirical data sets reveal higher kurtosis than the normal distribution, so we suspect that these time series do not follow the normal distribution. To investigate this, we perform the robust modified Cox test for these three time series data sets and compare the test results from the robust modified Cox test to the test results from the modified Cox test in the previous chapter. Tables 2.16 and 2.17 report the test results from the robust modified Cox test, and Tables 2.18 and 2.19 report the test results from the alternative version of the robust modified Cox test.
First, when the null is the GARCH(1,1) model, the test values from the robust modified Cox test are smaller than those from the nonrobust test, but neither the robust nor the nonrobust modified Cox test rejects the null hypothesis at any significance level. Second, when the null is the bilinear model, the test values from the two types of robust modified Cox test are much smaller than those in Table 1.11, especially the test value for the IP series, but the test values from these two robust modified Cox tests are very similar to each other (compare Table 2.16 with Table 2.18, and Table 2.17 with Table 2.19). Both sets of test results reject the bilinear model as the null at any significance level. The test results in Tables 2.16 through 2.19 are different from those in Tables 1.10 and 1.11, and they show evidence that these time series do not follow the normal distribution and that the test statistics from the robust modified Cox test are more valid.

Table 2.16: Test results: H0: GARCH vs. H1: Bilinear

                        stochastically simulated Cox test   modified Cox test
S&P 500                               .023                        .114
British Pound                         .196                       -.011
Industrial Production                 .533                        .017

Table 2.17: Test results: H0: Bilinear vs. H1: GARCH

                        stochastically simulated Cox test   modified Cox test
S&P 500                              -.910                      -2.896
British Pound                       -2.797                      -6.289
Industrial Production               -1.643                      -3.802

Table 2.18: Test results: H0: GARCH vs. H1: Bilinear

                        stochastically simulated Cox test   modified Cox test
S&P 500                               .023                        .179
British Pound                         .196                       -.051
Industrial Production                 .533                        .020

Table 2.19: Test results: H0: Bilinear vs. H1: GARCH

                        stochastically simulated Cox test   modified Cox test
S&P 500                              -.910                      -2.896
British Pound                       -2.797                      -7.058
Industrial Production               -1.643                      -2.282

2.5 Conclusions

As noted earlier, most financial time series do not show evidence of conditional normality. Thus, applying the modified Cox test, which assumes normality, to a nonnormal situation yields invalid test inferences.
In this chapter, we have proposed the robust modified Cox test under nonnormality. Monte Carlo simulation experiments suggest that the robust modified Cox test performs fairly well and improves the validity of the test statistics and the actual size relative to the nonrobust modified Cox test under nonnormality. In comparison with the simulation results in the previous chapter, the means and standard deviations deviate slightly from those of the standard normal distribution for some sample sizes, such as N = 500. It is also shown that the robust modified Cox test is oversized, but not significantly, and that the actual size approaches the nominal size as the sample size increases. Evidence from Lumsdaine (1995) suggested that the robust modified Cox test would perform relatively well under nonnormality. She compared the robust traditional test statistics to the nonrobust traditional test statistics under normality and showed that the actual size and the test statistics are not close enough to the nominal levels even under normality. We infer that these results would be even worse under nonnormality.

We also performed an alternative version of the robust modified Cox test with the conditional variance from equation (2.10). The simulation results are in general very similar to those from the robust modified Cox test, but the actual size is usually understated, while the robust modified Cox test is usually slightly oversized. In some situations, as shown in Table 2.15, the alternative robust modified Cox test that we proposed here performs very well and the simulation results are very close to normal. We emphasize that this alternative robust modified Cox test has a computational advantage because it does not require computing every term in the conditional variance in equation (2.15).

We also summarized the nonrobust Cox test results for the three time series data sets in the previous chapter.
The robust test results are far different from the nonrobust test results, especially when the null is the bilinear model: the test values are much smaller in absolute value than those from the nonrobust modified Cox test for all three data sets.

Chapter 3

An Application of a Quasi-Modified Cox Test to Nonlinear Panel Data Models

3.1 Introduction

In many instances, the dependent variable takes on nonnegative integer values: for example, the number of hospital visits in a given year, the number of alpha particles emitted from a radioactive source during a given period of time, or the number of patents applied for and received by a firm during a year. When a variable takes on nonnegative integer values, it is referred to as a count variable. Given the nonnegativity of count data, the most popular functional form for the conditional mean is the exponential function: $E(y \mid x) = \exp(x\beta)$, where $y$ is a count variable and $x$ is a vector of explanatory variables. When there are unobserved effects in count panel data models, we cannot simply apply the standard linear unobserved effects model if we want to impose nonnegativity of the conditional mean. Hausman, Hall, and Griliches (1984) (hereafter HHG) is a pioneering work that deals with the unobserved effects in count panel data analysis using the conditional maximum likelihood (CML) approach of Anderson (1970, 1972). HHG also presented an application to the relationship between patents and R&D expenditures. Wooldridge (1999) showed that the QCMLE is consistent and asymptotically normal under just the conditional mean assumption in multiplicative models. He also showed that the Poisson QMLE is robust if the conditional mean is correctly specified. But it will be inefficient, in general, unless the conditional variance is also correctly specified.

The most popular distributional assumption for count data is the Poisson distribution. To remove the unobserved heterogeneity, or fixed effects, in nonlinear count panel data analysis, the Fixed Effects Poisson (FEP) model was developed by HHG. But one of the shortcomings of the Poisson model is that the first two moments are the same. In many applications the variance of a count variable is larger than its mean, and we encounter overdispersion of the data. To solve this problem, the Fixed Effects Negative Binomial (FENB) model was developed as an alternative to the FEP. As shown in Wooldridge (1999), both the FEP and the FENB models have the same form of the conditional mean and are estimated by the multinomial QCML methodology. Like the GARCH and the bilinear models in the previous chapters, the FEP and the FENB are two competing models in nonlinear count panel data analysis. It is worthwhile to note that the QCMLE of the FEP is consistent and asymptotically normal if the conditional first moment is correctly specified but, in general, it is inefficient. By contrast, the QCMLE of the FENB is not consistent unless the first two conditional moments are correctly specified, because the negative binomial distribution is not in the linear exponential family (LEF), but the FENB is usually more efficient than the FEP. Therefore, there is a trade-off between robustness and efficiency for these two models. In principle, we could try the Cox test for specification testing between these two models, because the Cox test applies to any two distributions and is derived from the difference between the log likelihood ratio and its expected value under the null. But using the original Cox test, or even the modified Cox test from the previous chapter, is a very challenging task in this case. The log likelihood function of the FEP model is derived from the Poisson distribution and the log likelihood function of the FENB model is derived from the negative binomial distribution, and these two log likelihood functions take very different forms from the normal log likelihood function.
Therefore, obtaining the difference between the log likelihood ratio of these two models and its expected value is very complicated and computationally very difficult. Instead, we want something computationally simpler, and we derive a new Cox test (the quasi-modified Cox test) using the property of the normal quasi-log-likelihood shown, for example, in Bollerslev and Wooldridge (1992). In this case the quasi-modified Cox test is based only upon the implied conditional variance, because the conditional means of these two models are the same. In section 2 we briefly explain these two models and the quasi-modified Cox test. In section 3 we present the applications of these models to the U.S. patents and R&D expenditures panel data, and then apply the quasi-modified Cox test to see if either model is rejected. Conclusions follow in section 4.

3.2 Two Competing Count Panel Data Models with the Unobserved Effects

Developed first by HHG, the FEP and the FENB models have been used as two competing counterparts in nonlinear count panel data analysis. In this section we briefly discuss these two models.

We assume random sampling from the cross section and let $\{(y_{it}, x_{it}, \phi_i),\ i = 1, 2, \ldots, N,\ t = 1, 2, \ldots, T\}$ be a sequence of i.i.d. random variables across $i$, but not $t$, where $y_{it}$ denotes the discrete observable count variable, $x_{it}$ is a vector of explanatory variables, and $\phi_i$ is an unobserved random scalar. For the FEP model we assume that

\[ y_{it} \mid x_i, \phi_i \sim \text{Poisson}(\phi_i \mu(x_{it}, \beta_0)), \quad t = 1, 2, \ldots, T \tag{3.1} \]

\[ y_{it},\, y_{is} \text{ are independent conditional on } x_i, \phi_i, \quad t \neq s, \tag{3.2} \]

and the conditional mean of $y_{it}$ is

\[ E(y_{it} \mid x_i, \phi_i) = E(y_{it} \mid x_{it}, \phi_i) = \phi_i \mu(x_{it}, \beta_0). \tag{3.3} \]

HHG took the functional form of the conditional mean of $y_{it}$ to be an exponential function: $E(y_{it} \mid x_i, \phi_i) = \phi_i \exp(x_{it}\beta_0)$. Under assumptions (3.1) and (3.2), HHG used the CML techniques of Anderson (1970, 1972) to estimate $\beta$, conditioning on the sum of the dependent variable across time, $\sum_{t=1}^{T} y_{it} = n_i$.
HHG showed that

\[ (y_{i1}, \ldots, y_{iT}) \mid n_i, x_i, \phi_i \sim \text{Multinomial}(n_i, p_{i1}(x_i, \beta_0), \ldots, p_{iT}(x_i, \beta_0)) \tag{3.4} \]

\[ \text{where } p_{it} = \exp(x_{it}\beta_0) \Big/ \sum_{r=1}^{T} \exp(x_{ir}\beta_0) \quad \text{and} \quad \sum_{t=1}^{T} p_{it} = 1. \tag{3.5} \]

Eq. (3.4) reveals that the distribution of $(y_{i1}, y_{i2}, \ldots, y_{iT})$ given $(n_i, x_i, \phi_i)$ does not depend upon the unobserved effects $\phi_i$. Therefore, the log likelihood function of the Poisson CML methodology by HHG can be written as

\[ l_i(\beta)_{FEP} = \log\Gamma(n_i + 1) - \sum_{t=1}^{T} \log\Gamma(y_{it} + 1) - \sum_{t=1}^{T} y_{it} \log \sum_{s=1}^{T} \exp(-(x_{it} - x_{is})\beta), \tag{3.6} \]

where $\Gamma(\cdot)$ is the gamma function.

Gourieroux, Monfort, and Trognon (1984) (hereafter GMT) showed that the multinomial QCMLE of the FEP is consistent even though the multinomial distribution is not correctly specified if

\[ E(y_{it} \mid n_i, x_i) = p_{it}(x_i, \beta_0)\, n_i, \quad \text{where } n_i = \sum_{t=1}^{T} y_{it}. \tag{3.7} \]

However, Wooldridge (1999) argued that this is too restrictive and showed that, while the FEP estimator is derived under assumptions (3.1) and (3.2), it is consistent and asymptotically normal under only the conditional mean assumption (3.3).

On the other hand, $E(y_{it} \mid x_i, \phi_i) = Var(y_{it} \mid x_i, \phi_i) = \lambda_{it}$, where $\lambda_{it}$ is the Poisson parameter from assumption (3.1). But it is not difficult to find, empirically, that the conditional variance of $y_{it}$ is not the same as the conditional mean of $y_{it}$. More likely, the conditional variance of $y_{it}$ is larger than the conditional mean of $y_{it}$, or it is increasing with $y_{it}$ in many cases. To solve this overdispersion problem, HHG used the negative binomial distribution and developed the Fixed Effects Negative Binomial (FENB) model as an alternative to the FEP. To derive the FENB by HHG, we assume that1

\[ y_{it} \mid x_i, \phi_i \sim \text{NegativeBinomial}(\mu(x_{it}, \beta_0), 1/\phi_i), \tag{3.8} \]

where $\phi_i$ is the unobserved effect and $\phi_i > 0$,

\[ y_{it},\, y_{ir} \text{ are independent conditional on } (x_i, \phi_i), \quad t \neq r, \tag{3.9} \]

\[ E(y_{it} \mid x_i, \phi_i) = \phi_i \mu(x_{it}, \beta_0). \tag{3.10} \]

Interestingly, under (3.8) to (3.10), the conditional mean is $E(y_{it} \mid n_i, x_i) = p_{it}(x_i, \beta_0)\, n_i$, which is the same as Eq. (3.7).
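Returning to the FEP model, the multinomial probabilities in (3.5) and the conditional log likelihood implied by (3.4) can be evaluated for one cross-section unit as sketched below. This is an illustrative sketch under the exponential mean assumed by HHG, not the estimation code used in the dissertation; the function names and the list-of-rows layout for the covariates are assumptions. The probabilities are computed after subtracting the largest index, which leaves the ratio in (3.5) unchanged but avoids overflow.

```python
import math


def fep_probs(x_i, beta):
    """p_it = exp(x_it b) / sum_r exp(x_ir b) for one unit i.

    x_i is a list of T covariate rows and beta a list of coefficients
    (hypothetical layout chosen for this sketch).
    """
    index = [sum(xk * bk for xk, bk in zip(row, beta)) for row in x_i]
    top = max(index)
    weights = [math.exp(v - top) for v in index]  # common factor cancels in the ratio
    total = sum(weights)
    return [w / total for w in weights]


def fep_loglik(y_i, x_i, beta):
    """Multinomial conditional log likelihood of (y_i1, ..., y_iT) given n_i."""
    p = fep_probs(x_i, beta)
    n_i = sum(y_i)
    # log n_i! minus the sum of log y_it!, via the log-gamma function.
    ll = math.lgamma(n_i + 1) - sum(math.lgamma(y + 1) for y in y_i)
    # Terms with y_it = 0 contribute nothing and are skipped.
    ll += sum(y * math.log(pt) for y, pt in zip(y_i, p) if y > 0)
    return ll
```

Because the unobserved effect cancels from the ratio in (3.5), neither function takes it as an argument; summing `fep_loglik` over units gives the objective that a numerical optimizer would maximize over `beta`.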
The conditional log likelihood function2 for the FENB by HHG is

\[ l_i(\beta)_{FENB} = \sum_{t=1}^{T} \left( \log\Gamma(\mu_{it} + y_{it}) - \log\Gamma(\mu_{it}) - \log\Gamma(y_{it} + 1) \right) + \log\Gamma\Big(\sum_{t=1}^{T} \mu_{it}\Big) + \log\Gamma(n_i + 1) - \log\Gamma\Big(\sum_{t=1}^{T} \mu_{it} + n_i\Big). \tag{3.11} \]

Under assumptions (3.8) to (3.10) and the strict exogeneity of $x_{it}$, the CMLE of the FENB is consistent and asymptotically normal.

1 We follow the notation from Wooldridge (1999).
2 See HHG, p. 924, for more details.

Now, we compare the Poisson model and the negative binomial model analyzed in Wooldridge (1999). In the Poisson model,

\[ E(y_{it} \mid x_{it}, \phi_i) = \phi_i \mu(x_{it}, \beta_0) \tag{3.12} \]

\[ \phantom{E(y_{it} \mid x_{it}, \phi_i)} = Var(y_{it} \mid x_{it}, \phi_i) \tag{3.13} \]

and the variance-to-mean ratio of the Poisson model is unity. In the negative binomial model,

\[ E(y_{it} \mid x_{it}, \phi_i) = \phi_i \mu(x_{it}, \beta_0) \tag{3.14} \]

\[ Var(y_{it} \mid x_{it}, \phi_i) = E(y_{it} \mid x_{it}, \phi_i)(1 + \phi_i) \tag{3.15} \]

and the variance-to-mean ratio of the NB model is $(1 + \phi_i) > 1$. The NB model exhibits overdispersion and also allows the variance-to-mean ratio to differ across $i$. The conditional mean of both the Poisson and the NB models, conditional on the sum of the dependent variable across time, is $E(y_{it} \mid n_i, x_i) = p_{it}(x_i, \beta_0)\, n_i$.

Next, we consider the conditional variance of both the FEP and the FENB models. Following HHG (1984), we first construct the conditional variance of the FEP model from the multinomial covariance matrix, $\Omega_i = \mathrm{diag}(p_i) - p_i' p_i$, where $p_{it} = \mu(x_{it}, \beta_0) / \sum_{r=1}^{T} \mu(x_{ir}, \beta_0)$. From the fixed effects assumption, $\Omega_i$ is singular by construction. Therefore, we remove the first row and column to construct $\Omega_i$, which is a $(T-1) \times (T-1)$ matrix.3 We derive the conditional variance of the FEP from the diagonal elements of $\Omega_i$: $V(y_{it} \mid n_i, x_i) = (1 - p_{it}) p_{it}$.

3 We can remove any time period from $\Omega_i$. We take the first row and column for convenience.

Next, we derive the conditional variance of the FENB as we did that of the FEP, but with an extra term added by the NB assumption: $g_i = \big(\sum_{t=1}^{T} y_{it} + \sum_{t=1}^{T} \mu(x_{it}, \beta_0)\big) / \big(1 + \sum_{t=1}^{T} \mu(x_{it}, \beta_0)\big)$ and
$\Omega_{FENB} = g_i\,(\mathrm{diag}(p_i) - p_i' p_i)$, and the conditional variance of the FENB is given by the diagonal terms of $\Omega_{FENB}$: $V(y_{it} \mid n_i, x_i) = g_i\,p_{it}(1-p_{it})$.

The original Cox test is

$$ T_f = \{L_f(\hat\alpha) - L_g(\hat\beta)\} - E_{\hat\alpha}\{L_f(\hat\alpha) - L_g(\hat\beta)\}. $$

The test statistic of the Cox test is based upon the difference between the log likelihood ratio and its estimated expectation under the null. In principle, we can try the original Cox test, but this may cause very severe computational difficulties. Instead, we use the first two conditional moments from the QCML methodology and construct a quasi-modified Cox test using the normal quasi-log likelihood framework. Bollerslev and Wooldridge (1992) showed that the normal log-likelihood and its expected value are maximized when the correct conditional mean and variance are used, even though the normality assumption is violated. Using this property, we now construct a quasi-modified Cox test.

Let $M_1$ denote the model defined by Eqs. (3.1) and (3.2). Under $M_1$,

$$ E(y_{it} \mid n_i, x_i) = p_{it}(x_i,\beta_0)\,n_i \tag{3.16} $$
$$ Var(y_{it} \mid n_i, x_i) = (1 - p_{it}(x_i,\beta_0))\,p_{it}(x_i,\beta_0) \tag{3.17} $$

where $n_i = \sum_{t=1}^{T} y_{it}$ and $p_{it} = \mu(x_{it},\beta_0) / \sum_{r=1}^{T}\mu(x_{ir},\beta_0)$.

Let $M_2$ denote the model defined by Eqs. (3.8) and (3.9), so that

$$ E(y_{it} \mid n_i, x_i) = p_{it}(x_i,\theta_0)\,n_i \tag{3.18} $$
$$ Var(y_{it} \mid n_i, x_i) = g_i\,(1 - p_{it}(x_i,\theta_0))\,p_{it}(x_i,\theta_0) \tag{3.19} $$

where $n_i = \sum_{t=1}^{T} y_{it}$, $p_{it} = \mu(x_{it},\theta_0) / \sum_{r=1}^{T}\mu(x_{ir},\theta_0)$, and $g_i = \big(\sum_{t=1}^{T} y_{it} + \sum_{t=1}^{T}\mu(x_{it},\theta_0)\big) \big/ \big(1 + \sum_{t=1}^{T}\mu(x_{it},\theta_0)\big)$.

We take these two sets of conditional means and variances from the QCML methodology and substitute them into the modified Cox test derived in the previous chapter to obtain the test statistic. Writing $\hat u_{it}(\hat\beta) = y_{it} - p_{it}(x_i,\hat\beta)\,n_i$, the quasi-modified Cox test has the form

$$ T_{M1} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{T-1}\sum_{t=1}^{T-1}\Bigg[\hat u_{it}(\hat\beta)\,\frac{p_{it}(x_i,\hat\beta)\,n_i - p_{it}(x_i,\hat\theta)\,n_i}{\hat g_i\,(1-p_{it}(x_i,\hat\theta))\,p_{it}(x_i,\hat\theta)} + \frac{\hat u_{it}(\hat\beta)^2 - (1-p_{it}(x_i,\hat\beta))\,p_{it}(x_i,\hat\beta)}{2} \times \Bigg(\frac{1}{\hat g_i\,(1-p_{it}(x_i,\hat\theta))\,p_{it}(x_i,\hat\theta)} - \frac{1}{(1-p_{it}(x_i,\hat\beta))\,p_{it}(x_i,\hat\beta)}\Bigg)\Bigg]. \tag{3.20} $$

Following the previous chapter (see footnote 4), we can derive the asymptotic distribution of the quasi-modified Cox test:

[Eq. (3.21), the first-order expansion of $\sqrt{N}\,T_{M1}$ with an $o_p(1)$ remainder, is not legible in the source scan.]

Footnote 4: See Section 2.2 for more detail.

And the variance of the robust quasi-modified Cox test, $\hat V_{T1}$, is

[Eq. (3.22), the expression for $\hat V_{T1}$ and the definitions of its components, is not legible in the source scan.]

And the robust quasi-modified Cox test statistic, $\sqrt{N}\,T_{M1} / \sqrt{\hat V_{T1}}$, is asymptotically standard normal under the null hypothesis.

Table 3.1: Summary Statistics: the Patents and lnR&D Data

            mean      s.d.      median    min       max      proportion of zeros
Patents     37.133    72.642    6.000     0.000     515      0.000
lnR&D       1.415     1.947     1.196     -3.849    7.034

3.3 An Empirical Application

3.3.1 An Application to U.S. Patents and R&D Data

In this section, we estimate the FEP and the FENB models under the CML framework using the U.S. patents and R&D expenditures data and apply the quasi-modified Cox test to these two competing models. We examine the dynamic specification properties of the data on U.S. patents and R&D expenditures from 1970 to 1979. We obtained this U.S. patents and R&D spending data set, patrhghtxt, from the data directory on the NBER website. This data set is a subset of the patents and R&D data used in HHG (1986), "Patents and R&D: Is there a Lag?", IER 27: 265-283.
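To make the test applied in this section concrete, the summand of the quasi-modified Cox statistic in Eq. (3.20) can be sketched for a single firm. This is an illustration under stated assumptions, not the author's implementation: it assumes the reading of (3.20) in which the first term is scaled by the FENB conditional variance, it drops the first period as the text does for convenience, and the function names and toy probabilities are hypothetical.

```python
def fenb_scale(y, mu):
    """g_i = (sum_t y_it + sum_t mu_it) / (1 + sum_t mu_it)."""
    total_mu = sum(mu)
    return (sum(y) + total_mu) / (1.0 + total_mu)

def cox_summand(y_t, m_t, mu_t, h_t, eta_t):
    """One term of the modified Cox statistic.

    m_t, h_t   : conditional mean and variance under the null model
    mu_t, eta_t: conditional mean and variance under the alternative
    """
    u_t = y_t - m_t  # residual under the null model
    return (u_t * (m_t - mu_t) / eta_t
            + 0.5 * (u_t ** 2 - h_t) * (1.0 / eta_t - 1.0 / h_t))

def unit_contribution(y, p_null, p_alt, g_i):
    """Average of the summands over T-1 periods for one firm (H0: FEP)."""
    n_i = sum(y)
    periods = range(1, len(y))  # first period dropped (assumed convention)
    total = 0.0
    for t in periods:
        m_t = p_null[t] * n_i
        mu_t = p_alt[t] * n_i
        h_t = (1.0 - p_null[t]) * p_null[t]
        eta_t = g_i * (1.0 - p_alt[t]) * p_alt[t]
        total += cox_summand(y[t], m_t, mu_t, h_t, eta_t)
    return total / len(periods)

# Toy inputs (assumed values): fitted probabilities from the two models.
y = [3, 0, 5]
p_fep = [0.2, 0.3, 0.5]
p_fenb = [0.25, 0.25, 0.5]
g_i = fenb_scale(y, mu=[1.0, 1.0, 2.0])
contribution = unit_contribution(y, p_fep, p_fenb, g_i)
```

Averaging `unit_contribution` over firms would give the nonrobust statistic; when the two models imply identical conditional moments, every summand is zero, which is a quick sanity check on any implementation.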
There are a total of 346 firms, of which 22 (about 6.4% of all the firms) have zero patents in every time period. We deleted these firms from our data because these observations do not contribute to the estimation. Table 3.1 presents the summary statistics of the dependent variable, patents, and the explanatory variable, lnR&D.

Table 3.2: Estimation Results for the Patents Model: Linear Time Trend

Parameter                   Fixed Effects Poisson    Fixed Effects Neg Bin
lnR&D                        0.428 (0.038)            0.261 (0.090)
lnR&D_1                     -0.159 (0.048)           -0.112 (0.115)
lnR&D_2                      0.021 (0.044)            0.042 (0.103)
lnR&D_3                      0.174 (0.041)            0.114 (0.098)
lnR&D_4                      0.090 (0.039)            0.178 (0.092)
lnR&D_5                      0.259 (0.030)            0.224 (0.068)
time                        -0.083 (0.003)           -0.080 (0.010)
Sum of lnR&D                 0.813                    0.707
log likelihood              -6069.156                -3935.991
Skewness of residuals        0.141                    0.207
Kurtosis of residuals        7.430                    7.559
Probability of Normality     0.000                    0.000
* The standard errors are in the parentheses.

Table 3.2 presents the estimation results for a patents model with a linear time trend using the FEP and the FENB estimators. In this table, both sets of estimates indicate that the contemporaneous effect of lnR&D is significant. The sums of the lnR&D coefficients are similar, although the sum for the FEP (0.813) is slightly larger than that for the FENB (0.707). The time coefficient is -8.3 per cent per year in the FEP estimator and -8 per cent per year in the FENB estimator. Table 3.3 presents the estimation results for the patents model with a full set of year dummies, using the FEP and the FENB estimators.
Table 3.3: Estimation Results for the Patents Model: Full Set of Year Dummies

Parameter                   Fixed Effects Poisson    Fixed Effects Neg Bin
lnR&D                        0.407 (0.039)            0.245 (0.090)
lnR&D_1                     -0.115 (0.058)           -0.087 (0.120)
lnR&D_2                      0.061 (0.045)            0.063 (0.108)
lnR&D_3                      0.111 (0.042)            0.084 (0.099)
lnR&D_4                      0.073 (0.040)            0.165 (0.094)
lnR&D_5                      0.279 (0.030)            0.238 (0.069)
year76                      -0.044 (0.014)           -0.052 (0.038)
year77                      -0.077 (0.014)           -0.105 (0.040)
year78                      -0.238 (0.015)           -0.233 (0.041)
year79                      -0.320 (0.015)           -0.309 (0.042)
Sum of lnR&D                 0.816                    0.708
log likelihood              -6042.707                -3934.719
Skewness of residuals        0.229                    0.259
Kurtosis of residuals        7.290                    7.437
Probability of Normality     0.000                    0.000
* The standard errors are in the parentheses.

Table 3.4: Estimation Results for the Patents Model: Linear Time Trend

Parameter                   Fixed Effects Poisson    Fixed Effects Neg Bin
lnR&D                        0.826 (0.009)            0.694 (0.020)
Time                        -0.065 (0.009)           -0.079 (0.020)
Time*lnR&D                  -0.008 (0.002)           -0.005 (0.005)
Sum of lnR&D                 0.826                    0.694
log likelihood              -6273.494                -3981.590
Skewness of residuals       -0.245                   -0.295
Kurtosis of residuals        36.014                   35.173
Probability of Normality     0.000                    0.000
* The standard errors are in the parentheses.

Table 3.5: Estimation Results for the Patents Model: Linear Time Trend Only

Parameter                   Fixed Effects Poisson    Fixed Effects Neg Bin
lnR&D                        0.800 (0.006)            0.679 (0.014)
Time                        -0.097 (0.003)           -0.096 (0.010)
Sum of lnR&D                 0.800                    0.679
log likelihood              -6280.996                -3982.094
Skewness of residuals       -0.392                   -0.394
Kurtosis of residuals        38.446                   37.159
Probability of Normality     0.000                    0.000
* The standard errors are in the parentheses.

In Table 3.3, apart from the last lag of lnR&D, only the contemporaneous effect of lnR&D is significant in both models. In the FEP estimator, the sum of the lnR&D coefficients is 0.816, while it is 0.708 in the FENB estimator. Table 3.4 presents the estimation results including only current lnR&D, the time trend, and the interaction of these two variables.
The coefficients of current lnR&D in both models are much higher than those in Tables 3.2 and 3.3, but the sums of the lnR&D coefficients are very similar. Table 3.5 presents the estimation results including current lnR&D and the time trend only. The time trend coefficient is -9.7 per cent for the FEP and -9.6 per cent for the FENB. These coefficients are bigger in absolute value than those in Table 3.4. Not surprisingly, the standard errors in the FENB are much larger than those in the FEP, as expected from the additional noise in the negative binomial specification.

3.3.2 The Quasi-Modified Cox Test Results

Our quasi-modified Cox test is used to compare the specifications of the FEP model and the FENB model. Table 3.6 presents the quasi-modified Cox test results for the patents model with a linear time trend.

Table 3.6: The quasi-modified Cox Test Results

                      H0: FEP vs. H1: FENB    H0: FENB vs. H1: FEP
Nonrobust Cox test    -8.740                  -6.318
Robust Cox test       -7.570                  -6.531
* Test results from the patents model with linear time trend

In this table, the nonrobust test results indicate that both models are rejected against the correctly specified model at any conventional significance level. The Jarque-Bera test (probability of normality) in Table 3.2 reveals that the residuals of both models are not normally distributed. The robust quasi-modified Cox test results also reject the correct specification of both models.

Table 3.7 presents the quasi-modified Cox test results for the patents model with a full set of year dummies.

Table 3.7: The quasi-modified Cox Test Results

                      H0: FEP vs. H1: FENB    H0: FENB vs. H1: FEP
Nonrobust Cox test    -0.725                  -2.170
Robust Cox test       -0.706                  -2.181
* Test results from the patents model with a full set of year dummies

In this table, the nonrobust quasi-modified Cox test results show that the FENB model is rejected against the correct specification at the 5 per cent significance level, but we fail to reject either model at the 1 per cent significance level. The probability of normality in Table 3.3 indicates that the residuals from both models are not normally distributed. The robust quasi-modified Cox test results are very close to the nonrobust test results: the FENB is rejected at the 5 per cent significance level, but neither model is rejected at the 1 per cent significance level.

Table 3.8 presents the quasi-modified Cox test results for the patents model including only current lnR&D, the time trend, and the interaction of these two variables. Both the nonrobust and the robust test results reveal that both models are rejected against the correctly specified model at any conventional significance level.

Table 3.8: The quasi-modified Cox Test Results

                      H0: FEP vs. H1: FENB    H0: FENB vs. H1: FEP
Nonrobust Cox test    -28.417                 -6.678
Robust Cox test       -10.621                 -7.043
* Test results from the patents model with linear time trend

Table 3.9 presents the quasi-modified Cox test results for the patents model including current lnR&D and the time trend only, and shows that both models are also rejected at any conventional significance level.

Table 3.9: The quasi-modified Cox Test Results

                      H0: FEP vs. H1: FENB    H0: FENB vs. H1: FEP
Nonrobust Cox test    -13.345                 -10.088
Robust Cox test       -9.773                  -11.437
* Test results from the patents model with linear time trend only

Interestingly, including a full set of year dummies seems to play an important role in correcting the model specification. We can illustrate the role of the time dummies with the example below. Suppose

$$ Var(y_{it} \mid x_i, \phi_i, \gamma_t) = \gamma_t\,\phi_i \exp(x_{it}\beta) = E(y_{it} \mid x_i, \phi_i, \gamma_t) = \phi_i \exp(x_{it}\beta + \alpha_t), \quad\text{where } \gamma_t = \exp(\alpha_t). $$

If $\alpha_t$ is a time dummy, then there is no overdispersion problem once we include this time dummy in the model. Further, suppose $\gamma_t$ is independent of $(\phi_i, x_i)$, with the normalization $E(\gamma_t) = 1$ and $Var(\gamma_t) = \sigma_\gamma^2$. Then

$$ Var(y_{it} \mid x_i, \phi_i) = E\big[Var(y_{it} \mid x_i, \phi_i, \gamma_t) \mid x_i, \phi_i\big] + Var\big[E(y_{it} \mid x_i, \phi_i, \gamma_t) \mid x_i, \phi_i\big] = \phi_i \exp(x_{it}\beta) + \sigma_\gamma^2\,[\phi_i \exp(x_{it}\beta)]^2 > \phi_i \exp(x_{it}\beta), $$

so $y_{it}$ appears overdispersed when $\gamma_t$ is omitted, even though its variance equals its mean conditional on $\gamma_t$. What we can infer from this example, and possibly from the test results, is that the overdispersion problem may be caused not by distributional misspecification but by parametric misspecification. Including the time dummy can correct the overdispersion problem and make the Poisson model the correctly specified model.

3.4 Conclusion

In count panel data models with unobserved effects, the FEP and the FENB models are frequently compared as two competing alternatives. The QMLE of the FEP is consistent as long as the conditional mean is correctly specified, but it is generally inefficient. By contrast, the QMLE of the FENB is not consistent unless the first two moments are correctly specified, but it is more efficient than that of the FEP. Therefore, there is a trade-off between robustness and efficiency across these two models.

We applied the FEP and the FENB models to the relationship between U.S. patents and lnR&D expenditures. The quasi-modified Cox test results indicate that including a full set of year dummies plays a major role in the correct model specification. When we include a full set of year dummies, the quasi-modified Cox test results differ from those without them: neither the FEP nor the FENB model is rejected against the correctly specified model at the 1 per cent significance level, while both models are rejected at any conventional significance level if we include the linear time trend instead. We can conjecture from these test results that the overdispersion problem may not be a matter of distributional misspecification but a matter of parametric misspecification. Further study is needed to find a more correctly specified model for nonlinear count or continuous panel data with unobserved effects.

Appendix A

Modified Cox test

Here we derive the modified Cox test.
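The original Cox test statistic can be written in terms of the information set available at time t as

$$ T_{M1} = \frac{1}{T}\sum_{t=1}^{T}\{\log f_t(y_t \mid I_{t-1};\theta_0) - \log g_t(y_t \mid I_{t-1};\delta^*)\} - E_{M_1}\Big[\frac{1}{T}\sum_{t=1}^{T}\{\log f_t(y_t \mid I_{t-1};\theta_0) - \log g_t(y_t \mid I_{t-1};\delta^*)\} \,\Big|\, I_{t-1}\Big] \tag{A.1} $$

where $\theta_0$ is the parameter value under $M_1$, estimated by the MLE, and $\delta^*$ is the value to which the MLE under $M_2$ converges when $M_1$ is correctly specified. Now we decompose equation (A.1) into two terms and rewrite each in turn.

i) The first term:

$$ \sum_{t=1}^{T}\{\log f_t(y_t \mid I_{t-1};\theta_0) - \log g_t(y_t \mid I_{t-1};\delta^*)\} \tag{A.2} $$

$$ \log f_t(y_t \mid I_{t-1};\theta_0) = -\tfrac{1}{2}\log 2\pi - \tfrac{1}{2}\log h_t(\theta_0) - \tfrac{1}{2}\frac{(y_t - m_t(\theta_0))^2}{h_t(\theta_0)} \tag{A.3} $$

$$ \log g_t(y_t \mid I_{t-1};\delta^*) = -\tfrac{1}{2}\log 2\pi - \tfrac{1}{2}\log \eta_t(\delta^*) - \tfrac{1}{2}\frac{(y_t - \mu_t(\delta^*))^2}{\eta_t(\delta^*)} \tag{A.4} $$

From now on we write $\log f_t(y_t \mid I_{t-1};\theta_0) = \log f_t$, $\log g_t(y_t \mid I_{t-1};\delta^*) = \log g_t$, $h_t(\theta_0) = h_t$, $\eta_t(\delta^*) = \eta_t$, $m_t(\theta_0) = m_t$, and $\mu_t(\delta^*) = \mu_t$ for notational simplicity, and $u_t = y_t - m_t$.

$$ \log f_t - \log g_t = \tfrac{1}{2}\log\eta_t - \tfrac{1}{2}\log h_t - \tfrac{1}{2}\Big[\frac{(y_t - m_t)^2}{h_t} - \frac{(y_t - \mu_t)^2}{\eta_t}\Big] \tag{A.5} $$

So,

$$ \sum_{t=1}^{T}\{\log f_t - \log g_t\} = -\tfrac{1}{2}\sum_{t=1}^{T}\Big[\frac{(y_t - m_t)^2}{h_t} - \frac{(y_t - \mu_t)^2}{\eta_t}\Big] - \tfrac{1}{2}\sum_{t=1}^{T}\{\log h_t - \log\eta_t\} \tag{A.6} $$

ii) The second term:

$$ E_{M_1}\Big[\sum_{t=1}^{T}\{\log f_t - \log g_t\} \,\Big|\, I_{t-1}\Big] = E_{M_1}\Big[\tfrac{1}{2}\sum_{t=1}^{T}\{\log\eta_t - \log h_t\} - \tfrac{1}{2}\sum_{t=1}^{T}\Big\{\frac{(y_t - m_t)^2}{h_t} - \frac{(y_t - \mu_t)^2}{\eta_t}\Big\} \,\Big|\, I_{t-1}\Big] \tag{A.7} $$

$$ = \tfrac{1}{2}\sum_{t=1}^{T}\log\eta_t - \tfrac{1}{2}\sum_{t=1}^{T}\log h_t - \tfrac{1}{2}\sum_{t=1}^{T} E_{M_1}\Big\{\frac{(y_t - m_t)^2}{h_t} \,\Big|\, I_{t-1}\Big\} + \tfrac{1}{2}\sum_{t=1}^{T} E_{M_1}\Big\{\frac{(y_t - \mu_t)^2}{\eta_t} \,\Big|\, I_{t-1}\Big\} \tag{A.8} $$

$$ = \tfrac{1}{2}\sum_{t=1}^{T}\log\eta_t - \tfrac{1}{2}\sum_{t=1}^{T}\log h_t - \frac{T}{2} + \tfrac{1}{2}\sum_{t=1}^{T} E_{M_1}\Big\{\frac{(y_t - \mu_t)^2}{\eta_t} \,\Big|\, I_{t-1}\Big\} \tag{A.9} $$

because $E_{M_1}\{(y_t - m_t)^2 \mid I_{t-1}\} = E_{M_1}\{u_t^2 \mid I_{t-1}\} = h_t$. Hence

$$ = \tfrac{1}{2}\sum_{t=1}^{T}\log\eta_t - \tfrac{1}{2}\sum_{t=1}^{T}\log h_t - \frac{T}{2} + \tfrac{1}{2}\sum_{t=1}^{T} E_{M_1}\Big\{\frac{(y_t - m_t + m_t - \mu_t)^2}{\eta_t} \,\Big|\, I_{t-1}\Big\} \tag{A.10} $$

Because

$$ (y_t - \mu_t)^2 = (y_t - m_t + m_t - \mu_t)^2 = (y_t - m_t)^2 + (m_t - \mu_t)^2 + 2(y_t - m_t)(m_t - \mu_t), $$

we have

$$ E_{M_1}\{(y_t - \mu_t)^2 \mid I_{t-1}\} = E_{M_1}\{(y_t - m_t)^2 \mid I_{t-1}\} + E_{M_1}\{(m_t - \mu_t)^2 \mid I_{t-1}\} + 2E_{M_1}\{(y_t - m_t)(m_t - \mu_t) \mid I_{t-1}\}, $$

and, since $m_t$ and $\mu_t$ are functions of $I_{t-1}$,

$$ E_{M_1}\{(y_t - m_t)(m_t - \mu_t) \mid I_{t-1}\} = E_{M_1}\{u_t(m_t - \mu_t) \mid I_{t-1}\} = E_{M_1}\{u_t \mid I_{t-1}\}\,(m_t - \mu_t) = 0. $$

So $E_{M_1}\{(y_t - \mu_t)^2 \mid I_{t-1}\} = h_t + (m_t - \mu_t)^2$, and the second term becomes

$$ = \tfrac{1}{2}\sum_{t=1}^{T}\log\eta_t - \tfrac{1}{2}\sum_{t=1}^{T}\log h_t - \frac{T}{2} + \tfrac{1}{2}\sum_{t=1}^{T}\frac{h_t}{\eta_t} + \tfrac{1}{2}\sum_{t=1}^{T}\frac{(m_t - \mu_t)^2}{\eta_t} \tag{A.11} $$

Now we combine these two terms. Subtracting (ii) from (i) gives

$$ -\tfrac{1}{2}\sum_{t=1}^{T}\{\log h_t - \log\eta_t\} - \tfrac{1}{2}\sum_{t=1}^{T}\frac{(y_t - m_t)^2}{h_t} + \tfrac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu_t)^2}{\eta_t} - \Big\{\tfrac{1}{2}\sum_{t=1}^{T}\log\eta_t - \tfrac{1}{2}\sum_{t=1}^{T}\log h_t - \frac{T}{2} + \tfrac{1}{2}\sum_{t=1}^{T}\frac{h_t}{\eta_t} + \tfrac{1}{2}\sum_{t=1}^{T}\frac{(m_t - \mu_t)^2}{\eta_t}\Big\} \tag{A.12} $$

$$ = \frac{T}{2} - \tfrac{1}{2}\Big\{\sum_{t=1}^{T}\frac{h_t}{\eta_t} + \sum_{t=1}^{T}\frac{(m_t - \mu_t)^2}{\eta_t} + \sum_{t=1}^{T}\frac{(y_t - m_t)^2}{h_t} - \sum_{t=1}^{T}\frac{(y_t - \mu_t)^2}{\eta_t}\Big\} \tag{A.13} $$

$$ = \frac{T}{2} - \tfrac{1}{2}\sum_{t=1}^{T}\frac{u_t^2}{h_t} - \sum_{t=1}^{T}\frac{h_t + (m_t - \mu_t)^2 - (y_t - \mu_t)^2}{2\eta_t} \tag{A.14} $$

$$ = \frac{T}{2} - \tfrac{1}{2}\sum_{t=1}^{T}\frac{u_t^2}{h_t} + \sum_{t=1}^{T}\frac{(y_t - \mu_t)^2 - h_t - (m_t - \mu_t)^2}{2\eta_t} \tag{A.15} $$

On the other hand, we can rewrite the numerator of the third term using

$$ (y_t - \mu_t)^2 = (m_t - \mu_t + u_t)^2 = (m_t - \mu_t)^2 + u_t^2 + 2u_t(m_t - \mu_t), $$

so that

$$ (y_t - \mu_t)^2 - h_t - (m_t - \mu_t)^2 = u_t^2 - h_t + 2u_t(m_t - \mu_t). $$

Therefore

$$ = \frac{T}{2} - \tfrac{1}{2}\sum_{t=1}^{T}\frac{u_t^2}{h_t} + \sum_{t=1}^{T}\frac{u_t(m_t - \mu_t)}{\eta_t} + \sum_{t=1}^{T}\frac{u_t^2 - h_t}{2\eta_t} \tag{A.16} $$

$$ = \frac{T}{2} - \sum_{t=1}^{T}\frac{u_t^2 - h_t + h_t}{2h_t} + \sum_{t=1}^{T}\frac{u_t(m_t - \mu_t)}{\eta_t} + \sum_{t=1}^{T}\frac{u_t^2 - h_t}{2\eta_t} \tag{A.17} $$

$$ = \frac{T}{2} - \sum_{t=1}^{T}\frac{u_t^2 - h_t}{2h_t} - \frac{T}{2} + \sum_{t=1}^{T}\frac{u_t(m_t - \mu_t)}{\eta_t} + \sum_{t=1}^{T}\frac{u_t^2 - h_t}{2\eta_t} \tag{A.18} $$

$$ = \frac{T}{2} - \frac{T}{2} + \tfrac{1}{2}\sum_{t=1}^{T}(u_t^2 - h_t)\Big(\frac{1}{\eta_t} - \frac{1}{h_t}\Big) + \sum_{t=1}^{T}\frac{(y_t - m_t)(m_t - \mu_t)}{\eta_t} \tag{A.19} $$

Therefore, the modified Cox test is derived as

$$ T_{M1} = \frac{1}{T}\sum_{t=1}^{T}\{\log f_t(y_t \mid I_{t-1};\theta_0) - \log g_t(y_t \mid I_{t-1};\delta^*)\} - E_{M_1}\Big[\frac{1}{T}\sum_{t=1}^{T}\{\log f_t(y_t \mid I_{t-1};\theta_0) - \log g_t(y_t \mid I_{t-1};\delta^*)\} \,\Big|\, I_{t-1}\Big] = \frac{1}{T}\sum_{t=1}^{T}\Big[(y_t - m_t)\frac{(m_t - \mu_t)}{\eta_t} + \frac{(u_t^2 - h_t)}{2}\Big(\frac{1}{\eta_t} - \frac{1}{h_t}\Big)\Big] \tag{A.20} $$

Appendix B

Regularity Conditions

Suppose $y_t$, $t = 1, \ldots, T$, is a vector of i.i.d. observations and we wish to compare the case in which $y_t$ has the density function $f(y_t, \theta)$ for some $\theta$ in $\Theta$ under the null hypothesis, $H_f$, with the case in which $y_t$ has the density function $g(y_t, \delta)$ for some $\delta$ in $\Delta$ under the alternative hypothesis, $H_g$. Let $\theta_0$ denote the true value of $\theta$ under $H_f$, let $\hat\theta$ be the MLE of $\theta_0$, and let $\delta^*$ denote the value to which $\hat\delta$, the QMLE of $\delta$, converges. For notational brevity, we state the regularity conditions in terms of $f(y, \theta)$, but these conditions apply to $g(y, \delta)$ as well. Below are the regularity conditions for the existence and the consistency of the QMLE (White, 1982).

1. The sequence of i.i.d. observations $y_t$, $t = 1, \ldots, T$, has a common joint distribution function $G$ on $\Omega$ with measurable Radon-Nikodym density $g = dG/d\nu$.

2. The Radon-Nikodym density $f(y, \theta) = dF(y, \theta)/d\nu$, where $F(y, \theta)$ is the family of distribution functions, is measurable in $y$ for every $\theta$ in $\Theta$, a compact subset of a p-dimensional Euclidean space, and continuous in $\theta$ for every $y$ in $\Omega$.

3. a) $|\log f(y, \theta)| \le m(y)$ for all $\theta$ in $\Theta$, where $m$ is integrable with respect to $G$. b) $E(\log f(y_t, \theta))$ has a unique maximum at $\theta_0$ in $\Theta$.

4. $\partial \log f(y, \theta)/\partial\theta_i$, $i = 1, \ldots, p$, are measurable functions of $y$ for each $\theta$ in $\Theta$ and continuously differentiable functions of $\theta$ for each $y$ in $\Omega$.

5. $|\partial^2 \log f(y, \theta)/\partial\theta_i\,\partial\theta_j|$ and $|\partial \log f(y, \theta)/\partial\theta_i \cdot \partial \log f(y, \theta)/\partial\theta_j|$, $i, j = 1, \ldots, p$, are dominated by functions integrable with respect to $G$ for all $y$ in $\Omega$ and $\theta$ in $\Theta$.

6.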
Define $A(\theta) \equiv \{E(\partial^2 \log f(y, \theta)/\partial\theta_i\,\partial\theta_j)\}$ and $B(\theta) \equiv \{E(\partial \log f(y, \theta)/\partial\theta_i \cdot \partial \log f(y, \theta)/\partial\theta_j)\}$. a) $\theta_0$ is interior to $\Theta$, b) $A(\theta_0)$ and $B(\theta_0)$ are nonsingular.

Under these conditions, $\sqrt{T}(\hat\theta - \theta_0)$ is asymptotically normally distributed.

Bibliography

[1] Bera, A. and Higgins, M., 1997, ARCH and Bilinearity as Competing Models for Nonlinear Dependence, Journal of Business & Economic Statistics, 15, 43-50

[2] Bollerslev, T., 1986, Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics, 31, 307-327

[3] Bollerslev, T. and Wooldridge, J.M., 1992, Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models With Time-Varying Covariances, Econometric Reviews, 11, 143-172

[4] Kuan, C. and Chen, Y., 2002, The Pseudo-True Score Encompassing Test for Non-Nested Hypotheses, Journal of Econometrics, vol. 106, 271-295

[5] Cox, D.R., 1961, Tests of Separate Families of Hypotheses, in Proceedings of the 4th Berkeley Symposium (1), Berkeley: University of California Press, 105-123

[6] Cox, D.R., 1962, Further Results on Tests of Separate Families of Hypotheses, Journal of the Royal Statistical Society, Series B, 24, 406-424

[7] Davidson, R. and MacKinnon, J.G., 1981, Several Tests for Model Specification in the Presence of Alternative Hypotheses, Econometrica, 49, 781-793

[8] Davidson, R. and MacKinnon, J.G., 1984, Model Specification Tests Based on Artificial Linear Regressions, International Economic Review, 25, 485-502

[9] Engle, R., 1982, Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation, Econometrica, 50, 987-1008

[10] Fisher, G.R. and McAleer, M., 1979, On the Interpretation of the Cox Test in Econometrics, Economics Letters, 4, 145-150

[11] Gourieroux, C., Monfort, A., and Trognon, A., 1983, Testing Nested or Non-Nested Hypotheses, Journal of Econometrics, 21, 83-115

[12] Gourieroux, C. and Monfort, A., 1994, Testing Non-Nested Hypotheses, in: R.F. Engle and D.L. McFadden, Eds., Handbook of Econometrics, vol. 4, Elsevier Science, 2583-2633

[13] Granger, C.W. and Andersen, A.P., 1978, An Introduction to Bilinear Time Series Models, Gottingen: Vandenhoeck & Ruprecht

[14] Hausman, J., Hall, B.H., and Griliches, Z., 1984, Econometric Models for Count Data with an Application to the Patents-R&D Relationship, Econometrica, vol. 52, 909-938

[15] Lumsdaine, R.L., 1991, Essays on Time Series Econometrics, unpublished Ph.D. dissertation, Harvard University, Dept. of Economics

[16] Lumsdaine, R.L., 1995, Finite-Sample Properties of the Maximum Likelihood Estimator in GARCH(1,1) and IGARCH(1,1) Models: A Monte Carlo Investigation, Journal of Business & Economic Statistics, vol. 13, 1-10

[17] Mizon, G.E. and Richard, J., 1986, The Encompassing Principle and its Applications to Testing Non-Nested Hypotheses, Econometrica, 54, 657-678

[18] Pesaran, M.H. and Deaton, A.S., 1978, Testing Non-Nested Nonlinear Regression Models, Econometrica, 46, 677-694

[19] Pesaran, M.H. and Pesaran, B., 1993, A Simulation Approach to the Problem of Computing Cox's Statistic for Testing Non-Nested Models, Journal of Econometrics, 57, 377-392

[20] Quandt, R.E., 1974, A Comparison of Methods for Testing Non-Nested Hypotheses, Review of Economics and Statistics, 56, 92-99

[21] White, H., 1982, Regularity Conditions for Cox's Test of Non-Nested Hypotheses, Journal of Econometrics, 19, 307-327

[22] White, H., 1994, Estimation, Inference and Specification Analysis, Econometric Society Monographs: Cambridge University Press

[23] Wooldridge, J.M., 1990, An Encompassing Approach to Conditional Mean Tests with Applications to Testing Non-Nested Hypotheses, Journal of Econometrics, 45, 331-350

[24] Wooldridge, J.M., 1990, A Unified Approach to Robust, Regression-Based Specification Tests, Econometric Theory, 6, 17-43

[25] Wooldridge, J.M., 1999, Distribution-free Estimation of Some Nonlinear Panel Data Models, Journal of Econometrics, 90, 77-97