ESSAYS ON TIME SERIES ECONOMETRICS By Cheol-Keun Cho A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Economics – Doctor of Philosophy 2014 ABSTRACT ESSAYS ON TIME SERIES ECONOMETRICS By Cheol-Keun Cho Chapter 1 develops an asymptotic theory for testing the presence of structural change in a weakly dependent time series regression model. The cases of full structural change and partial structural change are considered. A HAC estimator is involved in the construction of the test statistics. Depending on how the long run variance for pre- and postbreak regimes is estimated, two types of heteroskedasticity and autocorrelation (HAC) robust Wald statistics, denoted by Wald( F) and Wald(S) are analyzed. The fixed-b asymptotics established by Kiefer and Vogelsang (2005) is applied to derive the limits of the statistics with the break date being treated as a priori. The fixed-b limits turn out to depend on the location of break fraction and the bandwidth ratio as well as on the kernel being used. For both Wald statistics the limits capture the finite-sample randomness existing in the HAC estimators for the pre- and post-break regimes. The limit of Wald( F) further captures the finite-sample covariance between the pre-break estimators of regression parameters and the post-break estimators of the regression parameters. The fixed-b limit stays the same and is pivotal for Wald( F) irregardless of whether some of the regressors are not subject to structural change. Critical values for the tests are obtained by simulation methods. Monte Carlo simulations compare the finite sample size properties of the two Wald statistics and a local power analysis is conducted to provide guidance on the power properties of the tests. This Chapter extends its analysis to cover the case of the break date being unknown. Supremum, mean and exponential Wald statistics are considered and finite sample size distortions are examined via simulations with newly tabulated fixed-b critical values for these statistics. Chapter 2 generalizes the structural change test developed in Chapter 1 while allowing for a shift in the mean and(or) variance of the explanatory variable. Chapter 2 assumes the break date for the mean/variance is different from the possible break date for the regression parameters. The test is robust to serial correlation and heteroskedasticity of the error term and the explanatory variables. The fixed-b theory is applied to derive the limits of the statistics. The asymptotic theory in this paper is based on a new set of high level conditions which incorporates the possibility of the moments shift and serves to provide pivotal limits of the test statistics. Chapter 3 proposes a test of the null hypothesis of integer integration against the alternative of fractional integration. The null of integer integration is satisfied if the series is either I (0) or I (1), while the alternative is that it is I (d) with 0 < d < 1. The test is based on two statistics, the KPSS statistic and a new unit root test statistic . The null is rejected if the KPSS test rejects I (0) and the unit root test rejects I (1). The newly proposed unit root test is a lower-tailed KPSS test based on the first differences of the original data, so the test of the null of integer integration is called the "Double KPSS" test. Chapter 3 shows that the test has asymptotically correct size under the null that the series is either I (0) or I (1) and the test is consistent against I (d) alternatives for all d between zero and one. These statements are true under the assumption that the number of lags used in long-run variance estimation goes to infinity with sample size, but more slowly than sample size. Chapter 3 refers to this as "standard asymptotics." This requires some original asymptotic theory for the new unit root test, and also for the KPSS short memory test for the case that d = 1/2. Chapter 3 also considers "fixed-b asymptotics" as in Kiefer and Vogelsang (2005). Finite-sample size and power of the Double KPSS test is investigated using both the critical values based on standard asymptotics and the critical values based on fixed-b asymptotics. The new test is more accurate when it uses the fixed-b critical values. The conclusion is that one can distinguish integer integration from fractional integration using the Double KPSS test, but it takes a rather large sample size to do so reliably. To my beloved wife, Eun-Young, and my daughter, Ellin iv ACKNOWLEDGMENTS Over the past five years in East Lansing, I have owed to many people for their support and help. Without them I couldn’t get through the times I had. I want to deeply thank to Timothy J. Vogelsang, my advisor as well as coauthor, for his continuous support and excellent guidance. I can’t simply imagine how I would be now without him. He is the most influential person for me at MSU. I learned so many things from him in teaching and research and a lot of meetings we had were among the best memorable moments for me. I will always miss those times with him after I leave East Lansing. I am also so grateful to Peter Schmidt, my another excellent advisor and Christine Amsler. They are both my coauthors. It was greatly pleasing and exciting to work with them. I remember how warm they welcomed their TA’s including me at the appreciation dinner meeting every semester. Also, I am not forgetting how much they were helpful and considerate when I was in need of help. I could survive my second year with their support. Special thanks to Christine Amsler. I worked for her as research assistant but she was more than a supervisor and I deeply thank her again for encouraging me during those days. I also thank Leslie E. Papke, the graduate program director. She was so much responsive to what I was going through. I am also indebted to two other committee members, Jeffrey M. Wooldridge and Hira Koul at department of statistics and probability for what I have learned from their lectures and for their invaluable comments on my research. What I learned from them has become really important academic assets for me. I thank Margaret Lynch and Lori Jean Nichols for their support and tips in every important administration work. I had many good friends while I was in East Lansing. I could take a break from work and research and get refreshed while I was hanging out with them. They are Jaemin Baik, Mikayla Bowen, Sarah Brown, Hon Fong Cheah, Yeonjei Jung, Myoung-Jin Keay, Sang Hyun Kim, Soobin Kim, Sangjoon Lee, Seunghwa Rho and Sun Yu. I want to specially thank Reverend Borin Cho, Geum Jang, and many more at Lansing Korean v UMC. Speicial thanks also go to Sangryun Lee, Myungsook Kim, Bernhard G. Bodmann and Raul Susmel in Houston, Texas. I also deeply thank my parents, my parents in law and sisters and brother in Korea for their unceasing support and trust toward me. Last but not least I owe my deepest gratitude to my wife, Eun-Young Song and my daughter, Ellin for their love and for being together with me. vi TABLE OF CONTENTS LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii CHAPTER 1 Fixed-b Inference for Testing Structural Break in a Time Series Regresison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Review of the Fixed-b Asymptotics . . . . . . . . . . . . . . . . . . . . . . . 1.3 Model of Structural Change and Preliminary Results . . . . . . . . . . . . . 1.4 Asymptotic Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4.1 Asymptotic Limits under Traditional Approach . . . . . . . . . . . 1.4.2 Asymptotic Limits under Fixed-b Approach . . . . . . . . . . . . . . 1.5 Critical Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.6 Finite Sample Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.6.1 Wald(S) and Wald( F) : Traditional Inference vs. Fixed-b Inference . . 1.6.2 Fixed-b Inference: Wald(S) vs. Wald( F) . . . . . . . . . . . . . . . . . 1.7 Local Power Analysis of Fixed-b Inference . . . . . . . . . . . . . . . . . . . 1.7.1 Comparison of the Local Asymptotic Power of Wald(S) and Wald( F) 1.7.2 Impact of Breakpoint Location and Bandwidth on Power . . . . . . 1.8 Partial Structural Change Model . . . . . . . . . . . . . . . . . . . . . . . . 1.8.1 Setup and Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . 1.8.2 Asymptotic Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.9 When the Break Date is Unknown . . . . . . . . . . . . . . . . . . . . . . . 1.10 Summary and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A Test of Parameter Instability Allowing for Change in the Moments of Explanatory Variables . . . . . . . . . . . . . . . . . . . . 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Model of Structural Change and Preliminary Results . . . . . . . . . . . . . 2.3 Asymptotic Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.1 Stability of β . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.2 Stability of α . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5 Summary and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 5 8 14 14 15 19 20 22 24 26 27 29 35 35 36 40 42 72 89 . . . . . . . . . . 92 92 93 97 100 106 110 112 115 119 CHAPTER 2 CHAPTER 3 A Test of the Null of Integer Integration against the Alternative of Fractional Integration . . . . . . . . . . . . . . . 121 vii 3.1 3.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Setup and Assumptions . . . . . . . . . . . . . . . . . . . . . . 3.2.1 Null Hypothesis . . . . . . . . . . . . . . . . . . . . . . . 3.2.2 Alternative Hypothesis . . . . . . . . . . . . . . . . . . . 3.2.3 Test Statistics and the Rejection Rule . . . . . . . . . . . 3.3 Asymptotic Results . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 Asymptotic Results for ηµ . . . . . . . . . . . . . . . . . 3.3.2 Asymptotic Results for ηµd . . . . . . . . . . . . . . . . . 3.3.3 Correct Size and Consistency of the Double-KPSS Test . 3.4 Fixed-b Asymptotic Results . . . . . . . . . . . . . . . . . . . . 3.5 Monte Carlo Simulations . . . . . . . . . . . . . . . . . . . . . . 3.5.1 Design of the Experiment . . . . . . . . . . . . . . . . . 3.5.2 Results with Standard Critical Values . . . . . . . . . . 3.5.3 Results with Fixed-b Critical Values . . . . . . . . . . . 3.5.4 Comparison of the ηµd Test and the ADF Test . . . . . . 3.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 124 124 126 128 131 131 133 135 136 138 139 140 143 144 145 155 166 LIST OF TABLES Table 1.1 95% Fixed-b critical values of Wald( F) with Bartlett kernel, l = 1 . . . 44 Table 1.2 95% Fixed-b critical values of Wald( F) with Bartlett kernel, l = 2 . . . 44 Table 1.3 95% Fixed-b critical values of Wald( F) with Parzen kernel, l = 1 . . . . 45 Table 1.4 95% Fixed-b critical values of Wald( F) with Parzen kernel, l = 2 . . . . 45 Table 1.5 95% Fixed-b critical values of Wald( F) with QS kernel, l = 1 . . . . . . 46 Table 1.6 95% Fixed-b critical values of Wald( F) with QS kernel, l = 2 . . . . . . 46 Table 1.7 95% Fixed-b critical values of Wald(S) with Bartlett kernel, l = 1, b1 = b2 = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Table 1.8 95% Fixed-b critical values of Wald(S) with Bartlett kernel, l = 2, b1 = b2 = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Table 1.9 95% Fixed-b critical values of Wald(S) with Parzen kernel, l = 1, b1 = b2 = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 Table 1.10 95% Fixed-b critical values of Wald(S) with Parzen kernel, l = 2, b1 = b2 = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 Table 1.11 95% Fixed-b critical values of Wald(S) with QS kernel, l = 1, b1 = b2 = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 Table 1.12 95% Fixed-b critical values of Wald(S) with QS kernel, l = 2, b1 = b2 = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 Table 1.13 95% Fixed-b critical values of Wald(S) , l = 2, b1 = b2 . . . . . . . . . . 49 Table 1.14 The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. . . 50 Table 1.15 The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. . 51 Table 1.16 The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. . 52 Table 1.17 The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. . 53 ix Table 1.18 The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. . . 54 Table 1.19 The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. . 55 Table 1.20 The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , . . 56 Table 1.21 The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , . . 57 Table 1.22 The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , . . 58 Table 1.23 The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , . . 59 Table 1.24 The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , . . 60 Table 1.25 The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , . . 61 Table 1.26 Fixed-b 95% Critical Values of Wald( F) Unknown Break Date, Bartlett kernel, l =2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 Table 1.27 Fixed-b 95% Critical Values of Wald( F) Unknown Break Date, QS kernel, l =2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 Table 1.28 The Finite Sample Size of the SupW ( F) Test with 5% Nominal Size . . 63 Table 1.29 The Finite Sample Size of the SupW ( F) Test with 5% Nominal Size . . 64 Table 1.30 The Finite Sample Size of the SupW ( F) Test with 5% Nominal Size . . 65 Table 1.31 The Finite Sample Size of the MeanW ( F) Test with 5% Nominal Size . 66 Table 1.32 The Finite Sample Size of the MeanW ( F) Test with 5% Nominal Size . 67 Table 1.33 The Finite Sample Size of the MeanW ( F) Test with 5% Nominal Size . 68 Table 1.34 The Finite Sample Size of the ExpW ( F) Test with 5% Nominal Size . . 69 Table 1.35 The Finite Sample Size of the ExpW ( F) Test with 5% Nominal Size . . 70 Table 1.36 The Finite Sample Size of the ExpW ( F) Test with 5% Nominal Size . . 71 Table 2.1 Size and Power in Finite Samples, T=100, 300, λ x = 0.4, = .1, Bartlett kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 x Table 3.1 Summary of the existing asymptotic results for ηµ . . . . . . . . . . . . 146 Table 3.2 Fixed-b Critical Values for ηµ and ηµd , Bartlett kernel, l = bT − 1, l = b ( T − 1) − 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 Table 3.3 Size and Power Using Standard 5% Critical Values with Traditional Lag Choices,T = 50 and 100 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 Table 3.4 Size and Power Using Standard 5% Critical Values with Traditional Lag Choices, T = 200 and 500 . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 Table 3.5 Size and Power Using Standard 5% Critical Values with Traditional Lag Choices, T = 1,000 and 2,000 . . . . . . . . . . . . . . . . . . . . . . . . . 150 Table 3.6 Size and Power Using 5% Fixed-b Critical Values with Traditional Lag Choices, T = 50 and 100 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 Table 3.7 Size and Power Using 5% Fixed-b Critical Values with Traditional Lag Choices, T = 200 and 500 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152 Table 3.8 Size and Power Using 5% Fixed-b Critical Values with Traditional Lag Choices, T = 1,000 and 2,000 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Table 3.9 Size and Power of ADF and ηµ × ADF Tests Using Standard Critical Values with Traditional Lag Choices, T = 50, 100, 200, and 500 . . . . . . . . . 154 xi LIST OF FIGURES Figure 1.1 Empirical Null Rejection Probability of Structural Break Test using Wald( F) β 1 = β 2 = 0, θ = 0.1, ρ = 0.5, φ = 0.5, b = 0.2, T = 50, Replications=2, 500 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 Figure 1.2 Local Power, Bartlett and QS, λ = 0.2, b = 0.2, b1 = 1, b2 = 0.25 . . . . 28 Figure 1.3 Local Power, Bartlett and QS, λ = 0.2, b = 0.5, b1 = 2.5, b2 = 0.625 . . 28 Figure 1.4 Local Power, Bartlett and QS, λ = 0.2, b = 1, b1 = 5, b2 = 1.25 . . . . . 29 Figure 1.5 Local Power, Bartlett and QS, λ = 0.5, b = 1, b1 = 2, b2 = 1 . . . . . . 30 Figure 1.6 Local Power, Wald( F) , Bartlett, b = 0.1 . . . . . . . . . . . . . . . . . . . 31 Figure 1.7 Local Power, Wald( F) , QS, b = 0.1 . . . . . . . . . . . . . . . . . . . . . 31 Figure 1.8 Local Power, Wald(S) , Bartlett . . . . . . . . . . . . . . . . . . . . . . . 32 Figure 1.9 Local Power, Wald(S) , QS . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Figure 1.10 Local Power, Wald( F) , Bartlett, λ = 0.5 . . . . . . . . . . . . . . . . . . 33 Figure 1.11 Local Power, Wald( F) , QS, λ = 0.5 . . . . . . . . . . . . . . . . . . . . . 33 Figure 1.12 Local Power, Wald(S) , Bartlett, λ = 0.5 . . . . . . . . . . . . . . . . . . 34 Figure 1.13 Local Power, Wald(S) , QS, λ = 0.5 . . . . . . . . . . . . . . . . . . . . . 34 xii CHAPTER 1 Fixed-b Inference for Testing Structural Break in a Time Series Regresison 1.1 Introduction Chapter 1 focuses on fixed-b inference of heteroskedasticity and autocorrelation (HAC) robust Wald statistics for testing for a structural break in a time series regression. Commonly used kernel-based nonparametric HAC estimators are considered to estimate the asymptotic variance. HAC estimators allow for arbitrary structure of the serial correlation and heteroskedasticity of weakly dependent time series and are consistent estimators of the long run variance under the assumption that the bandwidth (M) is growing at a certain rate slower than the sample size (T). Under this assumption, the Wald statistics converge to the usual chi-square distributions. However, because the critical values from the chi-square distribution are based on a consistency approximation for the HAC estimator, the chi-square limit does not reflect the often substantial finite sample randomness of the HAC estimator. Furthermore, the chi-square approximation does not capture the impact of the choice of the kernel or the bandwidth on the Wald statistics. The sensitivity of the statistics to the finite sample bias and variability of the HAC estimator is well known in the literature; Kiefer and Vogelsang (2005) among others have illustrated by simulation that the traditional inference with a HAC estimator can have poor finite sample properties. Departing from the traditional approach, Kiefer and Vogelsang (2002a), Kiefer and Vogelsang (2002b) , and Kiefer and Vogelsang (2005) obtain an alternative asymptotic ap- 1 proximation by assuming that the ratio of the bandwidth to the sample size, b = M/T, is held constant as the sample size increases. Under this alternative nesting of the bandwidth, they obtain pivotal asymptotic distributions for the test statistics which depend on the choice of kernel and bandwidth tuning parameter. Simulation results indicate that the resulting fixed-b approximation has less size distortions in finite samples than the traditional approach especially when the bandwidth is not small. Theoretical explanations for the finite sample properties of the fixed-b approach include the studies by Hashimzade and Vogelsang (2008), Jansson (2004), Sun, Phillips, and Jin (2008, hereafter SPJ), Gonçalves and Vogelsang (2011) and Sun (2013). Hashimzade and Vogelsang (2008) provide an explanation for the better performance of the fixed-b asymptotics by analyzing the bias and variance of the HAC estimator. Gonçalves and Vogelsang (2011) provide a theoretical treatment of the asymptotic equivalence between the naive bootstrap distribution and the fixed-b limit. Higher order theory is used by Jansson (2004), SPJ (2008) and Sun (2013) to show that the error in rejection probability using the fixed-b approximation is more accurate than the traditional approximation. In a Gaussian location model Jansson (2004) proves that for the Bartlett kernel with bandwidth equal to sample size (i.e. b = 1), the error in rejection probability of fixed-b inference is O( T −1 log T ) which is smaller than the usual rate of O( T −1/2 ). The results in SPJ (2008) complement Jansson’s result by extending the analysis for a larger class of kernels and focusing on smaller values of bandwidth ratio b. In particular they find that the error in rejection probability of the fixed-b approximation is O( T −1 ) around b = 0. They also show that for positively autocorrelated series, which is typical for economic time series, the fixed-b approximation has smaller error than the chi-square or standard normal approximation even when b is assumed to decrease to zero although the stochastic orders are same. In this chapter fixed-b asymptotics is applied to testing for structural change in a weakly dependent time series regression. The structural change literature is now enor- 2 mous and no attempt will be made here to summarize the relevant literature. Some key references include Andrews (1993) and Andrews and Ploberger (1994). Andrews (1993) treats the issue of testing for a structural break in the GMM framework when the one-time break date is unknown and Andrews and Ploberger (1994) derive asymptotically optimal tests. Bai and Perron (1998) consider multiple structural change occurring at unknown dates and cover the issues of estimation of break dates, testing for the presence of structural change and for the number of breaks. For a comprehensive survey of the recent structural break literature see Perron (2006). Because a structural change in a regression relationship effectively divides the sample into two regimes, two different HAC estimators are considered when constructing Wald statistics. Asymptotically, estimators of the regression parameters are uncorrelated across the two regimes. One HAC estimator imposes this zero covariance restriction while the other estimator does not and this leads to two possible Wald statistics that can be used in practice denoted by Wald(S) and Wald( F) . When the break date is known and coefficients of all regressors are subject to structural change (i.e. full structural change), both Wald statistics have pivotal fixed-b limits but these limits are different. For the two Wald statistics the fixed-b critical values increase as the bandwidth gets bigger and as the hypothesized break date is closer to the boundary of the sample. When some of the regressors are not subject to structural change (i.e. partial structural change), the Wald statistic based on the unrestricted HAC estimator (Wald( F) ) still has the same pivotal fixed-b limit as in the case of full structural change. A local power analysis is carried out under the fixed-b approach and several patterns are reported. The local power of both Wald statistics is lower with bigger b especially for the QS kernel. Power also improves as the break date gets closer to the middle of the sample regardless of bandwidth or kernel. With the within-regime effective bandwidths matched across the two statistics, the power difference is more evident when a big value of b is used with the QS kernel. 3 A simulation study on the finite sample performance of fixed-b inference reveals that the Wald statistic based on the unrestricted HAC estimator has better overall size performance especially when the quadratic spectral (QS) kernel and large bandwidths are used in the case of persistent data. In the comparison of the performance of fixed-b inference with traditional inference, it is found that the latter is subject to substantially more size distortions. Similar to the known patterns in models without structural change, size distortions decrease as one uses bigger bandwidth or as the hypothesized break date is closer to the middle of the sample. When there is strong persistence in the data, rejections using fixed-b critical values can be above nominal levels although these distortions are much smaller compared to traditional inference. The remainder of this chapter is organized as follows. Section 1.2 reviews fixed-b asymptotic theory in a regression with no structural change. In Section 1.3 the basic set up of the full structural-change model. Some preliminary results are laid out and the two HAC estimators are introduced. Section 1.4 derives the fixed-b limits of the two Wald statistics. Section 1.5 explores patterns in the fixed-b critical values when the break date is treated as a priori. Section 1.6 compares finite sample size across different approach of inferences, different choices of kernel and HAC estimators under various DGP specifications. Section 1.7 examines local power. Section 1.8 generalizes the results in Section 1.4 to a model with partial structural change. It is shown that the fixed-b limit of Wald( F) in the partial structural change model is same as in the full structural change model. Section 1.9 covers the case where the break date is unknown providing the fixed-b critical values. Section 1.10 summarizes and concludes. Proofs and supplemental results are collected in the Appendix. 4 1.2 Review of the Fixed-b Asymptotics Consider a weakly dependent time series regression with p regressors given by yt = xt β + ut . (1.1) Model (1.1) is estimated by ordinary least squares (OLS) giving −1 T ∑ xt xt β= T ∑ xt yt t =1 , t =1 and ut = yt − xt β are the OLS residuals. The centered OLS estimator is given by −1 T β−β = ∑ xt xt t =1 T ∑ vt , t =1 where vt = xt ut . The asymptotic theory is based on the following two assumptions. [rT ] p Assumption 1. T −1 ∑t=1 xt xt → rQ, uniformly in r ∈ [0, 1], and Q−1 exists. [rT ] [rT ] Assumption 2. T −1/2 ∑t=1 xt ut = T −1/2 ∑t=1 vt ⇒ ΛWp (r ), r ∈ [0, 1], where ΛΛ = Σ, and Wp (r ) is a p × 1 standard Wiener process. For a more detailed discussion about the regularity conditions under which Assumptions 1 and 2 hold, refer to Kiefer and Vogelsang (2002b). See Davidson (1994), Phillips and Durlauf (1986), Phillips and Solo (1992), and Wooldridge and White (1988) for more details. The matrix Q is the non-centered variance-covariance matrix of xt and is typically estimated using the sample variance Q = 1 T ∑tT=1 xt xt . The matrix Σ ≡ ΛΛ is the asymp- totic variance of T −1/2 ∑tT=1 vt , which is, for a stationary series, given by ∞ Σ = Γ0 + ∑ (Γ j + Γ j ) with Γ j = E(vt vt− j ). j =1 5 Being a long run variance, Σ is commonly estimated by the kernel-based nonparametric HAC estimator Σ=T −1 T T ∑∑K t =1 s =1 |t − s| M v t v s = Γ0 + T −1 ∑K j =1 j M Γj + Γj , where Γ j = T −1 ∑tT= j+1 vt vt− j , vt = xt ut , M is a bandwidth, and K (·) is a kernel ighting function. Under some regularity conditions (see Andrews (1991), De Jong and Davidson (2000), Hansen (1992), Jansson (2002) or Newey and West (1987)), Σ is a consistent estimator of p Σ, i.e. Σ → Σ. These regularity conditions include the assumption that M/T → 0 as M, T → ∞. This asymptotics is called ‘traditional asymptotics’ throughout this chapter. In contrast to the traditional approach, fixed-b asymptotics assumes b = M/T is held constant as T increases. Assumptions 1 and 2 are the only regularity conditions required to obtain a fixed-b limit for Σ. Under the fixed-b approach, for b ∈ (0, 1], Kiefer and Vogelsang (2005) show that Σ ⇒ ΛP(b, Wp )Λ , (1.2) where Wp (r ) = Wp (r ) − rWp (1) is a p-vector of standard Brownian bridges and the form of the random matrix P(b, Wp ) depends on the kernel. Following Kiefer and Vogelsang (2005) three classes of kernels are considered. Let H p (r ) denote a generic vector of stochastic processes. H p (r ) denotes its transpose. Then P(b, H p ) is defined as follows: case 1 If K ( x ) is twice continuously differentiable everywhere (Class 1) such as the Quadratic Spectral kernel (QS), then P b, H p ≡ − 1 0 1 0 r−s b 1 K b2 H p (r ) H p (s) drds, where K (·) is the second derivative of the kernel K (·). 6 (1.3) case 2 If K ( x ) is the Bartlett kernel (Class 2), then P b, H p ≡ 2 b 1 0 H p (r ) H p (r ) dr − 1 b 1− b H p (r ) H p (r + b) + H p (r + b) H p (r ) dr. 0 (1.4) case 3 If K ( x ) is continuous, K ( x ) = 0 for | x | ≥ 1, and K ( x ) is twice continuously differentiable everywhere except for | x | = 1 (Class 3) like Parzen kernel, then P b, H p ≡ − + K |r − s | b 1 K b2 |r −s| λT ), where xt is p × 1 regressor vector, λ ∈ (0, 1) is a break point, and 1( · ) is the indicator function. Let [λT ] denote the integer part of λT. Note that x2t = 0 for t = 1, 2, ..., [λT ] and x1t = 0 for t = [λT ] + 1, ..., T. For the time being the potential break point λT is assumed to be known. The case of λ being unknown is discussed in Section 1.9. The regression model (1.7) implies that coefficients of all explanatory variables are subject to potential structural change and this model is labeled the ’full’ structural change model. 8 Of interest it is that the presence of a structural change in the regression parameters. Consider null hypothesis of the form H0 : Rβ = 0, (1.8) where R (l ×2p) = ( R1 , − R1 ) , and R1 is an l × p matrix with l ≤ p. Under the null hypothesis, it is being tested that one or more linear relationships on the regression parameter(s) do not experience structural change before and after the break point. Tests of the null hypothesis of no structural change about a subset of the slope parameters are special cases. For example one can test the null hypothesis that the slope parameter on the first regressor did not change by setting R1 = (1, 0, . . . , 0). One can test the null hypothesis that none of the regression parameters have structural change by setting R1 = I p . In order to establish the asymptotic limits of the HAC estimators and the Wald statistics, Assumptions 1 and 2 given in previous Section are sufficient. Those Assumptions imply that there is no heterogeneity in the regressors across the segments and the covariance structure of the errors are assumed to be same across segments as well. notation For later use, define a l × l nonsingular matrix A such that R1 Q−1 ΛΛ Q−1 R1 = AA , (1.9) and d R1 Q−1 ΛWp (r ) = AWl (r ), where Wl (r ) is l × 1 standard Wiener process. 9 (1.10) Focus on the OLS estimator of β given by −1 T β= ∑ wt wt T ∑ wt yt t =1 , t =1 which can be written for each segment as −1 T β1 = ∑ x1t x1t ∑ x1t yt t =1 −1 T β2 = ∑ x2t x2t  ∑ x2t yt t =1 t =1 ∑ xt yt ∑  −1  xt xt  , (1.11) t =1 t =1 T = [λT ] ∑ xt xt = t =1 T −1 [λT ] T ∑  t=[λT ]+1  T xt yt  . (1.12) t=[λT ]+1 Fixed-b results depend on the limiting behavior of the following partial sum process t St = ∑ wj uj = j =1 t ∑ wj y j − x1j β 1 − x2j β 2 j =1 t = ∑ wj u j − x1j β 1 − β 1 − x2j β 2 − β 2 . (1.13) j =1 Under Assumptions 1 and 2 the limiting behavior of β and the partial sum process St are easily obtained. Proposition 1. Let λ ∈ (0, 1) be given. Suppose the data generation process is given by (1.7) and let [rT ] denote the integer part of rT where r ∈ [0, 1]. Then under Assumptions 1 and 2 as T → ∞,  √ and √    −1 (λQ) ΛWp (λ)  T β1 − β1  d   T ( β − β) = √ → , T β2 − β2 ((1 − λ) Q)−1 Λ Wp (1) − Wp (λ)    (1) 0   Fp (r, λ)   Λ Λ 0  T −1/2 S[rT ] ⇒   ≡  Fp (r, λ) , (2) 0 Λ Fp (r, λ) 0 Λ 10 where (1) Fp (r, λ) = Wp (r ) − r Wp ( λ ) · 1 (r ≤ λ ), λ and (2) Fp (r, λ) = Wp (r ) − Wp ( λ ) − r−λ Wp (1) − Wp ( λ ) 1−λ · 1 (r > λ ). Proof: See the Appendix. It is easily seen that the asymptotic distributions of β 1 and β 2 are Gaussian and are independent of each other. Hence the asymptotic covariance of β 1 and β 2 is zero. The √ 1 −1 asymptotic variance of T ( β − β) is given by Q− λ ΩQλ , where     0 0 λQ  λΣ  Qλ ≡   and Ω ≡  . 0 (1 − λ ) Q 0 (1 − λ ) Σ In order to test the null hypothesis (1.8) HAC robust Wald statistisc are considered. These statistics are robust to heteroskedasticity and autocorrelation in the vector process, vt = xt ut . The generic form of the robust Wald statistic is given by 1 −1 RQ− λ ΩQλ R Wald = T R β where −1 Rβ ,   −1 [λT ]  T ∑ t =1 x t x t Qλ =  0 T −1 ∑tT=[λT ]+1 0 xt xt  , and Ω is an HAC robust estimator of Ω. Two estimators for Ω are analyzed. The first one, denoted by Ω( F) , is newly introduced in this chapter and it is constructed using the residuals directly from the dummy regression (1.7): Ω ( F ) = T −1 T T ∑∑K t =1 s =1 11 |t − s| M vt vs , (1.14) where vt = wt ut = x1t ut , x2t ut (2) xt ut 1(t ≤ λT ) and vt (1) 2p×1 . Denote the components of vt as vt = x1t ut = = x2t ut = xt ut 1(t > λT ). The other estimator is the HAC estimator appearing in existing literature (e.g. Bai and Perron (2006)) which is given by   Ω(S) =  where Σ (1) = Σ (2) = 1 [λT ]  λ Σ (1) 0 0 ( 1 − λ ) Σ (2) [λT ] [λT ] ∑ ∑K t =1 s =1 |t − s| M1 T T 1 K ∑ ∑ T − [λT ] t=[λT ]+1 s=[λT ]+1 (1) Note that Σ(1) is constructed using vt  , (1.15) (1) (1) (1.16) vt vs , |t − s| M2 (2) (2) vt vs . (1.17) (data from the pre-break regime) and uses the (2) bandwidth M1 and pre-break sample size [λT ]. Likewise Σ(2) is constructed using vt (data from the post-break regime) and uses the bandwidth M2 and post-break sample size T − [λT ]. In Andrews (1993) and Bai and Perron (2006), the estimator (1.15) is used to allow for a potential structural change in the long run variance Σ itself. In this chapter the assumption is maintained that Σ does not have structural change because allowing for heterogeneity in Σ results in a non-pivotal fixed-b limit of the Wald statistic. Finding an estimator of Ω that has a pivotal fixed-b limit when Σ has structural change is a topic of ongoing research. At a superficial level, the estimators Ω( F) and Ω(S) look different but they are directly 12 (1) related. Using vt = (vt   Ω( F) =    =  = ( F) Ω11 ( F) Ω21 (2) , vt ( F) Ω12 ( F) Ω22 ) one can write Ω( F) as    ∑tT=1 ∑sT=1 K |t−s| M (1) (1) vt vs T −1 ∑tT=1 ∑sT=1 K |t−s| M (2) (1) vt vs T −1 (1) (1) [λT ] [λT ] |t−s| ∑ t =1 ∑ s =1 K M v t v s   [λT ] (2) (1) |t−s| T −1 ∑tT=[λT ]+1 ∑s=1 K M vt vs   = (1) (2) vt vs T −1 ∑tT=1 ∑sT=1 K |t−s| M (2) (2) vt vs [λT ] ∑t=1 ∑sT=[λT ]+1 K [λT ] |t−s| M |t−s| M T −1 ∑tT=[λT ]+1 ∑sT=[λT ]+1 K T −1 ∑t=1 ∑sT=[λT ]+1 K λ Σ (1) [λT ] |t−s| M T −1 T −1 T −1 ∑tT=[λT ]+1 ∑s=1 K ∑tT=1 ∑sT=1 K T −1 (2) (1) |t−s| M ( 1 − λ ) Σ (2) vt vs    (1) (2) vt vs |t−s| M (2) (2) vt vs (1) (2) vt vs    (1.18)    It is seen that the diagonal blocks of Ω( F) are the same HAC estimators used by Ω(S) except that the same bandwidth, M, is used for each diagonal block of Ω( F) whereas the diagonal blocks of Ω(S) can have different bandwidths. Ω(S) is a restricted version of Ω( F) with the off-diagonal blocks set to 0, i.e. Ω(S) imposes a zero covariance between β 1 and β 2 matching the zero asymptotic covariance between β 1 and β 2 . In contrast Ω( F) does not impose this zero covariance which is consistent with the possibility that the finite sample covariance between β 1 and β 2 is not equal to zero (which is true in general). Note however that Ω( F) uses the same bandwidth for both regimes whereas Ω(S) allows different bandwidths in the two regimes. Thus, from the bandwidth perspective, Ω( F) is restrictive relative to Ω(S) . The next Section provides asymptotic results for the two HAC robust Wald statistics under the traditional asymptotics and under the fixed-b asymptotics. 13 1.4 1.4.1 Asymptotic Results Asymptotic Limits under Traditional Approach The goal of the traditional approach is to find conditions under which the HAC estimator is consistent. One requirement for consistency is that M grows with the sample but at a slower rate. Then under additional regularity conditions, Σ(1) and Σ(2) are consistent estimators of Σ and the limit of Ω(S) is straightforwardly given by   Ω(S) =  λ Σ (1) 0    0  p λΣ  → . 0 (1 − λ ) Σ ( 1 − λ ) Σ (2) 0 (2p×2p) However establishing consistency of Ω( F) requires some additional calculation beyond existing results in the literature. Proposition 2. Under regularity conditions for the consistency of the HAC estimators Σ(1) and Σ(2) , as T → ∞,   0 p  λΣ  Ω( F) →   0 (1 − λ ) Σ . 2p×2p Proof: See the Appendix. Let Wald(S) denote the Wald statistic based on Ω(S) and let Wald( F) denote the Wald statistic based on Ω( F) . Then the results given in this subsection, combined with Proposition 1, give us the limits of the test statistics under the traditional approach: Wald(S) , Wald( F) ⇒ λ (1 − λ ) 1 1 Wl (λ) − (Wl (1) − Wl (λ)) λ 1−λ × 1 1 Wl (λ) − (Wl (1) − Wl (λ)) , λ 1−λ where Wl is a l × 1 standard Wiener process. Note that for any given value of λ the limit 14 follows a chi-square distribution with degrees of freedom l. While convenient, the chisquare limit does not capture the impact of the randomness of Ω(S) and Ω( F) on the Wald statistics in finite samples. 1.4.2 Asymptotic Limits under Fixed-b Approach Now fixed-b limits for the HAC estimators and the test statistics are provided. The fixed-b limits presented in next Lemma and Corollary approximate the diagonal blocks of Ω( F) by random matrices. Also, it is shown that fixed-b approach gives nonzero limit for the off-diagonal block, which further distinguishes fixed-b asymptotics from the traditional asymptotics. Lemma 1. Let λ ∈ (0, 1) and b ∈ (0, 1] be given. Suppose M = bT. Then under Assumptions 1 and 2, as T → ∞,    Λ 0  Λ Ω( F) ⇒   × P b, Fp (r, λ) ×  0 Λ 0 where   Fp (r, λ) =  (1) Fp (r, λ) = Wp (r ) − (2) Fp (r, λ) = Wp (r ) − Wp ( λ ) − (1) Fp (r, λ) (2) Fp (r, λ)  0 , Λ (1.19)   , (1.20) r Wp ( λ ) 1 (0 ≤ r ≤ λ ) , λ r−λ Wp (1) − Wp ( λ ) 1−λ 1 ( λ < r ≤ 1) , (1.21) (1.22) and P p b, Fp (r, λ) is defined by (1.3), (1.4), and (1.5) with H p (r ) = Fp (r, λ). Proof: See the Appendix. Extra algebra leads to an alternative representation for P b, Fp (r, λ) . The proof for this Corollary is omitted. 15 Corollary 1.   P b, Fp (r, λ) =  P C (1) b, Fp (r, λ) C (1) (2) b, Fp (r, λ) , Fp (r, λ) (1) (2) b, Fp (r, λ) , Fp (r, λ) P (2) b, Fp (r, λ)    , (1.23) where (1) (2) C b, Fp (r, λ) , Fp (r, λ) =       −      − |r −s| λT ), Xt = ( x1t x2t ) , and β = ( β 1 β 2 ) . The coefficients on the xt regressors are unrestricted in terms of a structural change whereas the coefficients on the zt regressors are assumed to not have structural change. Denote y = (y1, y2 , . . . , y T ) , X = ( X1 , X2 , . . . XT ) , Z = ( z1 , z2 , . . . , z T ) , u = ( u1 , u2 , . . . , u T ) . 35 The parameters (α, β) are estimated by OLS and the OLS residual vector can be written as u = y − X β = u − X β − β − PZ u, where y = ( I − PZ ) y, X = ( I − PZ ) X, and PZ = Z ( Z Z )−1 Z . The residual for a individual observation is given by u t = u t − Xt β − β − z t Z Z Also, note that −1  X t = X t − X Z ( Z Z ) −1 z t = Z u. (1.31)  (1)  Xt   p ×1   .  X (2)  t p ×1 The following assumptions replace  Assumptions 1 and 2 inSection 1.2:  Λ1  [rT ]  xt ut  Assumption 1’. T −1/2 ∑t=1   ⇒ ΛWp+q (r ) ≡   Wp+q (r ), where Λ1 is a zt ut Λ2 p × ( p + q) matrix, Λ2 is a q × ( p + q) matrix, and Wp+q (r ) is a ( p + q) × 1 vector of independent Wiener process. [rT ] [rT ] [rT ] Assumption 2’. p lim T1 ∑t=1 zt zt = rQ ZZ , p lim T1 ∑t=1 xt xt = rQ xx , and p lim T1 ∑t=1 xt zt = 1 −1 rQ xZ uniformly in r ∈ [0, 1], and there exist Q− ZZ and Q xx . 1.8.2 Asymptotic Limits Continue to focus on tests of the null hypothesis of no structural change in the xt slope parameters of the form H0 : Rβ = r 36 with R1 , − R1 R = l ×2p l× p Recall that the OLS estimator, β = β 1 , β 2 β= (1.32) can be rewritten as −1 T and r = 0. l× p ∑ Xt Xt T ∑ Xt y t t =1 . (1.33) t =1 Proposition 3. Under Assumptions 1’ and 2’, as T → ∞  d  T 1/2 β − β → Q−1  XX  1 Λ1 Wp+q (λ) − λQ xZ Q− ZZ Λ2 Wp+q (1) Λ1 Wp + q (1) − Wp + q ( λ ) 1 − (1 − λ) Q xZ Q− ZZ Λ2 Wp+q (1)  , and √ H0 1 T R β − r ⇒ R1 Q − xx Λ1 1 1 Wp + q ( λ ) − Wp + q (1) − Wp + q ( λ ) λ 1−λ , (1.34) where Q X X = p lim T −1 ∑tT=1 Xt Xt . Proof: See the Appendix. As seen from the above proposition, β 1 and β 2 are not asymptotically independent in the partial structural change regression model. This is true because we are projecting out the variation of explanatory variables zt so that β 1 and β 2 depend on the entire series of xt and zt . The dichotomy that β 1 is dependent only on the pre-break data and β 2 depends only on the post-break data no longer holds in the partial structural change model. The 1 dependence manifests in the common term, Q xZ Q− ZZ Λ2 Wp+q (1) in the above Proposition. However, this term cancels out in (1.34) when the restriction matrix takes the form of (1.32). As a result, and also as suggested by equation (1.34), one only needs to estimate Λ1 Λ1 for the inference on structural change. But, Λ1 Λ1 can be easily estimated using x1t ut or x2t ut and the corresponding version of Wald(S) can be defined by constructing 37 Σ(1) with x1t ut and constructing Σ(2) with x2t ut . By doing this, we are ignoring the need to project zt out of the xt variables but this ignorance does not make the inference invalid. The next question is whether in conducting inference one can still take into account the dependence between β 1 and β 2 which is not due to the presence of zt . This can be verified by finding a version of Wald( F) which has a pivotal limit in the partial structural change model. The answer is positive as below. The Wald statistic is given by R Q −1 Ω Q −1 R Wald = T R β XX XX −1 Rβ , (1.35) where Q X X = T −1 ∑tT=1 Xt Xt . For constructing Wald( F) , consider a HAC estimator Ω( F) which is computed using Xt ut T t =1 : Ω ( F ) = T −1 T T ∑∑K t =1 s =1 |t − s| M ξtξs, (1.36) where ξ t = Xt ut . This is a straightforward extension of Wald( F) to the case of partial structural change. Next Lemma provides the limit of the the scaled partial sum process of ξ t premultiplied by an appropriate term. Lemma 3. Let St = ∑tj=1 ξ j . Under Assumptions 1’ and 2’, as T → ∞, ξ 1 (1) 1 (2) Fp+q (r, λ) − Fp+q (r, λ) , λ 1−λ 1 R Q−1 T −1/2 S[rT ] ⇒ R1 Q− xx Λ1 ξ XX where (1) Fp+q (r, λ) = Wp+q (r ) − (2) Fp+q (r, λ) = Wp + q (r ) − Wp + q ( λ ) − r Wp + q ( λ ) 1 (0 ≤ r ≤ λ ) , λ r−λ Wp + q (1) − Wp + q ( λ ) 1−λ Proof: See the Appendix. 38 1 ( λ < r ≤ 1) . As Lemma 3 shows, the partial sums of the inputs to Ω( F) are asymptotically propor√ tional to the same nuisance parameters as T R β − r . This is the key condition for an asymptotic pivotal fixed-b limit. The next Theorem provides the fixed-b limit of Wald( F) . Theorem 3. Let λ ∈ (0, 1) and b ∈ (0, 1] be given. Suppose M = bT. Then under Assumptions 1’ and 2’, Wald( F) weakly converges to the same limit in (1.24), i.e. as T → ∞, Wald( F) ⇒ 1 1 Wl (λ) − (Wl (1) − Wl (λ)) λ 1−λ 1 1 (1) (2) Fl (r, λ) × P b, Fl (r, λ) − λ 1−λ −1 × 1 1 Wl (λ) − (Wl (1) − Wl (λ)) . λ 1−λ Proof: See the Appendix. According to Theorem 3, the limit of Wald( F) in the partial structural change model is the same as in the full structural change model. Getting back to Wald(S) , as mentioned earlier, β 1 and β 2 are no longer asymptotically uncorrelated in the partial structural change model. Therefore, forcing the covariance between β 1 and β 2 to be zero cannot be justified any longer. Even though forcing the covariance to be zero is not theoretically justified, even asymptotically, Wald(S) can be (1) modified so that it has a pivotal fixed-b limit. This is obtained by using vt (2) xt ut 1(t ≤ λT ) for constructing Σ(1) and vt = x1t ut = = x2t ut = xt ut 1(t > λT ) for Σ(2) with ut being defined in (1.31). One can easily show that the limit of this Wald(S) is the same as in (1.26). It is tempted to use Xt ut 1(t ≤ λT ) for constructing Σ(1) (and similarly for Σ(2) using Xt ut 1(t > λT )). But this version of Wald(S) turns out to have a non-pivotal fixed-b limit. This can be easily verified by deriving the limits of the associated partial sum processes. 39 1.9 When the Break Date is Unknown Tests for a potential structural break with a unknown break date are well studied in Andrews (1993) and Andrews and Ploberger (1994). Andrews (1993) considers several tests based on the supremums across break points of Wald and LM statistics and shows they are asymptotically equivalent. Andrews and Ploberger (1994) derive tests that maximize average power across potential breakpoints. Wald( F) statistic is the only focus of this Section. Given a value of b, the test statistic is computed for a range of λ. This implicitly changes the effective bandwidths ( λb , 1−b λ ) as λ varies. This results in a built-in mechanism where bigger bandwidth ratio is used as the sample size of a regime shrinks. One might want to make a comparison with the test based on Wald(S) . But note that one needs to adjust the bandwidth ratios (b1 , b2 ) every time λ changes so that (b1 , b2 ) stay equal to ( λb , 1−b λ ). The implementation of tests with Wald(S) is omitted in this chapter. Denote the Wald( F) statistic computed using break date Tb ≡ [λT ] by Wald( F) ( Tb ). Also de( F) ( F) note the limit of Wald( F) ( Tb ) as Wald∞ (λ) where the form of Wald∞ (λ) depends on whether traditional or fixed-b asymptotic theory is being used. In the case of fixed-b the( F) ory, Wald∞ (λ) depends on P b, Fp (r, λ) . As argued by Andrews (1993) and Andrews and Ploberger (1994), break dates close to the end points of the sample cannot be used and so some trimming is needed. To that end define Ξ∗ = [ T, T − T ] with 0 < to be the set of admissible break dates. The tuning parameter <1 denotes the amount of trimming of potential break dates. Consider the three statistics following Andrews (1993) and Andrews and Ploberger (1994) defined as SupW ( F) ≡ sup Wald( F) ( Tb ), Tb ∈Ξ∗ MeanW ( F) ≡ 1 T ∑ ∗ Wald(F) (Tb ), Tb ∈Ξ 40 (1.37) (1.38) 1 T ExpW ( F) ≡ log ∑ ∗ exp Tb ∈Ξ 1 Wald( F) ( Tb ) 2 . (1.39) The asymptotic limits of these statistics follow from the continuous mapping theorem and are given by SupW ( F) ⇒ ( F) Wald∞ (λ), sup λ∈( ,1− ) MeanW ( F) ⇒ 1− ExpW ( F) ⇒ log ( F) Wald∞ (λ)dλ, 1− exp 1 ( F) Wald∞ (λ) dλ . 2 In Tables 1.26 and 1.27 fixed-b critical values for SupW ( F) , MeanW ( F) , and ExpW ( F) are provided for l = 2, = 0.05, 0.1, 0.2 and for b ∈ {0.02, 0.04, 0.06, 0.08, 0.1, 0.2, 0.3, ..., 0.9, 1}. Tables 1.28 through 1.36 presents the simulation result for some of the DGP specifications introduced in Section 1.6 and for T=100, 500 and 1000. Fixed-b critical values are used for the Bartlett and QS kernels. For the Bartlett kernel, results are also reported using the traditional critical values obtained by Andrews (1993) and Andrews and Ploberger (1994). Several patterns stand out for the null rejection probabilities associated with SupW ( F) in Table 1.28 to 1.30. First, rejections using the traditional critical values are often substantially above the 5% nominal level unless persistence is very weak and a small bandwidth is used. Rejections can be close to 100%. The situation is much improved by the use of fixed-b critical values but severe over-rejections are still possible. Size distortions are higher with more persistence in the data, with a smaller value of , and a smaller value of b. As was true in the case of a known break date, the QS kernel gives less size distortion than the Bartlett kernel although the use of large bandwidths causes under-rejections. But over-rejections and under-rejections dissipate as T grows. The Bartlett kernel can suffer from over-rejections that are not easily removed just by using a big bandwidth. A larger value of T helps along with more trimming. Similar 41 patterns hold for the MeanW ( F) and ExpW ( F) statistics; see Tables 1.31-1.33 and 1.34-1.36 respectively. 1.10 Summary and Conclusions In this chapter fixed-b asymptotics was applied to the problem of testing for the presence of a structural break in a weakly dependent time series regression. Two different HAC estimators and accordingly two different Wald statistics were investigated. The Wald( F) statistic is the Wald statistic that one obtains when structural change is expressed in terms of dummy variables interacted with regressors. The Wald(S) statistic is a restricted version of Wald( F) where the off-diagonal blocks of the HAC estimator are set to zero mimicking the asymptotic zero covariance between OLS estimators in the two regimes. The fixed-b limits of the two statistics were derived, and the fixed-b inference was compared with the traditional inference. In a model with full structural change, both Wald statistics have pivotal fixed-b limits. However, in models with partial structural change, the straightforwardly adapted version of Wald( F) has the same pivotal fixed-b limit as in the full structural change case whereas the straightforwardly adapted version of Wald(S) does not have pivotal fixed-b limit. In simulation study the finite sample size distortions associated with the fixed-b approach and the traditional approach were examined. The simulation results indicate that the traditional inference is more subject to severe size distortions. When small bandwidths are used, the gap is not huge but as b gets bigger, the difference becomes substantial. Overall, rejections obtained when using fixed-b critical values are closer to the nominal level compared to using the traditional chi-square critical values. When fixed-b critical values are used, finite sample size distortions becomes more pronounce as b gets smaller or as λ gets closer to 0 or 1. Local asymptotic power is decreasing in b and power is highest for structural change located near the center of the sample. 42 In a comparison of the Wald( F) and Wald(S) statistics for full structural change models it was found that Wald( F) tends to be less size distorted than Wald(S) when using fixed-b critical values especially when serial correlation is strong and a large bandwidth is used. The better size performance of Wald( F) comes at the cost of lower power reflecting the usual trade-off between size robustness and power typically found in fixed-b analyses. The choice between Wald( F) and Wald(S) becomes a choice between tolerance for overrejections relative to desire for high power. At a practical level, Wald( F) is appealing because it retains the same asymptotic pivotal fixed-b limit in models with partial structural change whereas Wald(S) becomes nonpivotal. Finally, some fixed-b critical values tabulated for SupW, MeanW, and ExpW statistics which are commonly used for testing the presence of a structural break when the break date is not known a priori. A simulation study revealed that over-rejections are a bigger concern when the break date is treated as unknown. Critical values based on traditional asymptotics can lead to very severe over-rejection problems. Rejections using fixed-b critical values are less distorted especially when the QS kernel is used with a bandwidth that is not too small. When the Bartlett kernel is used, fixed-b rejections show substantial over-rejections. 43 Table 1.1: 95% Fixed-b critical values of Wald( F) with Bartlett kernel, l = 1 b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ=0.1 5.94 8.36 10.69 12.68 14.77 23.82 32.48 41.16 50.17 58.98 68.93 77.9 87.48 97.25 0.2 4.68 5.59 6.45 7.37 8.26 13 17.66 22.74 27.59 32.89 37.77 42.82 47.7 53.15 0.3 4.34 4.85 5.33 5.94 6.55 9.75 13.6 17.5 21.61 25.87 29.8 33.54 37.35 41.77 0.4 4.22 4.58 4.97 5.39 5.86 8.49 11.79 15.22 18.81 22.28 25.79 29.52 32.84 36.3 0.5 4.2 4.52 4.89 5.21 5.61 8.06 11.19 14.59 18.85 21.89 25.3 28.36 31.88 35.58 0.6 4.21 4.56 4.95 5.33 5.78 8.37 11.78 15.18 18.75 22.39 25.97 29.37 32.59 36.3 0.7 4.31 4.78 5.32 5.96 6.52 9.74 13.64 17.62 21.77 25.86 29.98 34.03 37.78 41.77 0.8 4.59 5.44 6.26 7.21 8.21 12.89 17.73 22.72 27.98 32.9 38.06 43.43 48.04 53.15 0.9 5.92 8.28 10.55 12.67 14.81 24.15 32.85 41.63 50.61 59.99 69.38 78.76 88.14 97.25 Table 1.2: 95% Fixed-b critical values of Wald( F) with Bartlett kernel, l = 2 b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ=0.1 0.2 10.45 7.49 15.43 9.36 20.01 11.3 23.63 13.23 27.45 15.17 45.04 24.5 62.32 34.33 79.15 44.31 97.5 54.75 115.81 64.44 134.73 75.73 153.2 85.47 171.39 94.64 197.92 111.33 0.3 6.74 7.74 8.86 10.08 11.16 18.28 25.71 34.01 41.9 50.3 58.44 65.2 71.96 84.82 0.4 6.49 7.2 7.98 8.92 9.77 15.5 22.44 29.69 37.15 44.32 50.96 56.85 63.23 74.22 44 0.5 6.53 7.18 7.93 8.76 9.62 15.22 21.98 28.71 36 42.62 49.05 55.35 61.05 71.25 0.6 6.6 7.34 8.14 9 9.97 15.8 22.62 29.56 36.28 44.31 51.09 56.79 63.14 74.22 0.7 0.8 7.04 7.78 8.1 9.74 9.21 11.64 10.41 13.68 11.86 15.61 18.77 25.29 26.46 35.39 34.46 46.08 42.98 56.62 51.09 67.64 59.28 78.46 66.47 88.5 73.97 98.19 84.82 111.33 0.9 10.73 15.9 20.46 24.46 28.41 45.98 63.27 81.68 100.45 119.13 137.84 155.94 174.38 197.92 Table 1.3: 95% Fixed-b critical values of Wald( F) with Parzen kernel, l = 1 b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ=0.1 0.2 0.3 0.4 5.53 4.48 4.22 4.12 7.74 5.21 4.63 4.43 10.05 5.99 5.08 4.71 12.8 6.78 5.48 5.04 15.39 7.61 5.93 5.39 28.72 12.88 8.9 7.53 43.41 18.92 12.78 10.38 59.19 26.32 17.63 14.4 77.93 35.19 24.52 19.48 101.1 45.41 32.72 25.83 126.46 58.14 41.87 33.32 156.18 72.5 53.07 42.13 193.5 88.33 65.16 52.59 236.76 106.98 78.8 64.15 0.5 4.13 4.36 4.66 4.93 5.19 7.13 9.81 13.6 18.73 25.14 32.72 42.21 52.45 65.05 0.6 4.14 4.41 4.7 5.01 5.34 7.47 10.52 14.79 19.75 26.31 34.24 43.55 53.96 65.94 0.7 0.8 4.2 4.42 4.61 5.11 5 5.83 5.46 6.59 5.97 7.44 8.82 12.53 12.78 18.77 17.71 26.38 24.37 35.61 32.14 46.83 41.36 58.84 52.62 74.23 64.97 91.91 79.76 112.21 0.9 5.45 7.62 10.05 12.6 15.31 29.14 44.02 60.09 78.96 101.31 126.34 154.86 189.83 229.3 Table 1.4: 95% Fixed-b critical values of Wald( F) with Parzen kernel, l = 2 b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ=0.1 0.2 9.68 7.16 14.7 8.68 20.78 10.44 26.9 12.28 32.86 14.32 61.88 26.88 95.45 42.82 139.66 65.29 196.8 95.6 278.36 134.62 382.08 186.56 514.04 248.31 680.53 326.98 892.26 423.95 0.3 0.4 0.5 6.54 6.32 6.43 7.33 6.9 6.92 8.22 7.5 7.46 9.19 8.17 8.13 10.3 8.96 8.73 17.11 13.96 13.6 27.94 22.48 21.23 42.75 35.2 32.89 63.74 53.21 49.54 92.64 77.72 72.15 131.83 110.23 101.57 179.83 150.7 139.6 234.32 200.3 184.72 303.13 260.04 240.38 45 0.6 6.44 7.01 7.62 8.37 9.13 14.16 22.38 34.8 51.62 75.69 105.94 145.76 194.26 251.93 0.7 6.81 7.68 8.6 9.59 10.68 17.92 28.87 43.65 64.56 92.71 128.55 173.65 230.37 303.92 0.8 7.48 9.04 10.85 12.71 14.79 27.25 44.02 66.36 97.41 137.41 190.44 259.83 341.18 435.32 0.9 9.87 15.26 21.32 27.69 33.72 62.55 97.73 144.3 201.36 282.96 385.63 520.59 702.39 914.14 Table 1.5: 95% Fixed-b critical values of Wald( F) with QS kernel, l = 1 λ=0.1 b=0.02 7.43 0.04 12.21 0.06 17.56 0.08 22.64 0.1 27.85 0.2 57.72 0.3 99.95 0.4 158.39 0.5 249.48 0.6 367.72 0.7 532.6 0.8 726.47 0.9 970.42 1 1261.34 0.2 5.07 6.48 8.08 9.95 12.16 25.61 45.22 74.89 114.08 168.98 241.2 326.56 431.92 559.48 0.3 4.52 5.28 6.17 7.23 8.43 16.86 32.2 54.62 83.55 122.86 176.66 245.78 323.43 416.24 0.4 4.36 4.91 5.54 6.23 7.08 13.44 25.09 43.04 68.85 103.11 143.09 197.2 262.52 339.54 0.5 4.29 4.79 5.3 5.99 6.76 12.75 24.18 43.29 70.81 103.31 146.09 199.56 262.71 340.36 0.6 4.34 4.86 5.46 6.23 6.99 13.88 25.58 44.15 70.42 103.7 148.06 201.11 262.84 334.59 0.7 4.53 5.27 6.22 7.24 8.43 16.89 31.73 53.58 85.95 128.44 182.71 247.71 325.08 423.93 0.8 0.9 4.98 7.31 6.34 12.04 7.97 17.35 9.82 22.78 11.95 28.47 25.34 59.62 46.97 101.36 76.5 158.68 118.39 239.27 175.59 353.19 247.64 501.66 343.69 697.25 452.05 926.69 585.82 1216.77 Table 1.6: 95% Fixed-b critical values of Wald( F) with QS kernel, l = 2 b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ=0.1 14.06 26.97 38.59 50.18 62.26 146.13 312.09 658.6 1286.25 2448.94 4375.15 7288.27 11611.96 17823.91 0.2 0.3 0.4 8.45 7.16 6.81 11.83 8.83 7.95 15.95 11.06 9.39 20.7 13.43 11.1 25.97 16.55 13.27 67.5 43.02 35.24 158.42 108.51 87.98 331.17 241.54 196.85 655.18 499.07 420.69 1237.59 916.23 798.85 2209.76 1622.99 1431.74 3760.93 2728.88 2359.81 5871.16 4433.2 3809.97 8876.78 6763.38 5763.21 0.5 6.82 7.88 9.18 10.88 12.82 33.63 80.79 184.84 378.64 722.79 1296.51 2208.22 3646.67 5547.04 46 0.6 0.7 0.8 0.9 6.89 7.56 8.73 14.73 8.08 9.28 12.26 27.58 9.58 11.44 16.41 39.04 11.25 14.14 21.28 50.94 13.42 17.24 26.42 62.99 35.2 44.36 69.12 149.5 87.85 106.46 163.47 314.22 195.45 233.84 339.63 665.6 404.62 478.49 685.69 1357.87 773.38 897.04 1264.11 2526.8 1368.93 1595.1 2240.27 4503.02 2208.18 2662.06 3808.74 7606.67 3495.78 4198.07 6017.23 12269.85 5223.44 6474.23 9157.09 18845.07 Table 1.7: 95% Fixed-b critical values of Wald(S) with Bartlett kernel, l = 1, b1 = b2 = b b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ = 0.1 4.101 4.313 4.544 4.71 4.933 6.14 7.65 9.16 10.88 12.62 14.39 16.23 18.28 20.31 0.2 4.025 4.207 4.38 4.569 4.773 5.925 7.112 8.422 9.928 11.477 13.215 14.886 16.586 18.425 0.3 0.4 0.5 4.027 4.022 4.033 4.174 4.183 4.202 4.365 4.323 4.334 4.543 4.503 4.517 4.757 4.668 4.691 5.723 5.682 5.571 6.991 6.727 6.728 8.281 8.086 7.931 9.666 9.453 9.387 11.063 10.91 10.886 12.677 12.493 12.541 14.418 14.16 14.231 16.083 15.794 16.021 17.886 17.582 17.792 0.6 4.046 4.195 4.363 4.513 4.702 5.6 6.566 7.835 9.317 10.879 12.409 14.103 15.764 17.521 0.7 0.8 0.9 3.991 4.019 4.115 4.147 4.189 4.267 4.327 4.326 4.46 4.507 4.519 4.621 4.714 4.697 4.862 5.67 5.768 6.107 6.93 6.965 7.51 8.198 8.339 9.141 9.697 9.823 10.774 11.258 11.472 12.538 12.916 13.08 14.292 14.562 14.771 16.041 16.414 16.613 17.992 18.177 18.41 20.008 Table 1.8: 95% Fixed-b critical values of Wald(S) with Bartlett kernel, l = 2, b1 = b2 = b b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ=0.1 0.2 6.416 6.199 6.788 6.557 7.241 6.916 7.695 7.34 8.224 7.774 10.957 10.138 14.159 12.953 17.909 16.119 21.74 19.6 25.701 22.974 29.558 26.374 33.527 29.887 37.692 33.535 41.749 37.265 0.3 6.196 6.459 6.786 7.114 7.461 9.728 12.297 15.067 18.337 21.6 24.714 28.008 31.375 34.985 0.4 6.122 6.404 6.739 7.061 7.395 9.466 11.799 14.606 17.7 20.891 23.974 27.221 30.147 33.507 47 0.5 6.246 6.527 6.851 7.183 7.518 9.526 11.982 14.84 17.746 20.907 24.045 27.304 30.581 33.784 0.6 6.197 6.553 6.867 7.184 7.561 9.587 12.059 14.835 17.997 20.952 23.829 26.967 30.212 33.558 0.7 6.428 6.702 7.06 7.471 7.845 10.041 12.699 15.726 18.904 22.073 25.325 28.709 32.187 35.802 0.8 6.481 6.835 7.22 7.588 8.024 10.526 13.346 16.787 20.126 23.75 27.618 30.806 34.566 38.393 0.9 6.557 6.987 7.422 7.865 8.388 11.256 14.675 18.563 22.678 26.415 30.453 34.488 38.466 42.774 Table 1.9: 95% Fixed-b critical values of Wald(S) with Parzen kernel, l = 1, b1 = b2 = b b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ= 0.1 0.2 4.049 3.982 4.218 4.121 4.378 4.258 4.549 4.407 4.709 4.559 5.673 5.439 6.89 6.457 8.337 7.569 9.952 8.899 11.766 10.479 14.238 12.371 16.803 14.548 19.604 17.11 22.963 19.776 0.3 3.999 4.124 4.258 4.38 4.532 5.326 6.209 7.332 8.604 10.063 11.714 13.621 15.697 18.188 0.4 3.986 4.106 4.236 4.356 4.46 5.242 6.132 7.127 8.408 9.749 11.269 13.113 15.141 17.469 0.5 0.6 0.7 4.002 4.003 3.963 4.136 4.111 4.096 4.251 4.245 4.194 4.353 4.371 4.349 4.509 4.492 4.496 5.188 5.187 5.27 6.057 6.013 6.2 7.094 6.996 7.323 8.21 8.098 8.571 9.586 9.516 10.086 11.267 11.291 11.839 13.106 13.188 13.911 15.394 15.483 16.075 17.794 17.65 18.511 0.8 3.978 4.1 4.245 4.346 4.503 5.307 6.282 7.478 8.742 10.371 12.212 14.335 16.557 19.084 0.9 4.047 4.185 4.347 4.484 4.646 5.582 6.811 8.278 9.905 11.863 14.041 16.533 19.562 22.964 Table 1.10: 95% Fixed-b critical values of Wald(S) with Parzen kernel, l = 2, b1 = b2 = b b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 λ=0.1 0.2 6.318 6.118 6.593 6.427 6.926 6.723 7.314 7.027 7.734 7.344 9.978 9.293 13.036 11.711 16.699 14.885 21.601 18.733 27.984 23.528 35.876 29.316 45.449 36.522 56.114 44.869 68.506 54.481 0.3 6.118 6,368 6.558 6.83 7.109 8.823 11.071 13.781 17.057 21.279 26.426 32.699 40.393 48.77 0.4 6.064 6.266 6.519 6.796 7.08 8.604 10.663 13.07 16.273 20.453 25.412 31.458 38.288 45.915 48 0.5 6.157 6.425 6.658 6.918 7.194 8.775 10.725 13.217 16.441 20.275 25.186 30.822 37.563 45.317 0.6 6.145 6.377 6.658 6.911 7.169 8.794 10.809 13.365 16.498 20.393 25.205 30.985 37.654 45.607 0.7 6.333 6.587 6.841 7.14 7.438 9.174 11.466 14.299 17.645 22.011 27.459 34.199 41.742 49.919 0.8 6.423 6.688 6.992 7.269 7.597 9.51 12.068 15.271 19.301 24.447 30.67 38.416 47.109 57.401 0.9 6.435 6.782 7.14 7.52 7.923 10.252 13.438 17.304 22.368 28.489 36.487 45.707 57.801 70.667 Table 1.11: 95% Fixed-b critical values of Wald(S) with QS kernel, l = 1, b1 = b2 = b b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.1 0.2 4.176 4.097 4.466 4.324 4.789 4.629 5.109 4.91 5.461 5.239 7.908 7.158 11.14 9.864 15.772 13.486 21.421 18.5 28.285 24.521 37.351 32.408 48.093 41.538 60.429 52.591 75.711 64.723 0.3 0.4 0.5 4.082 4.082 4.098 4.33 4.303 4.299 4.572 4.515 4.542 4.858 4.756 4.792 5.15 5.055 5.045 6.91 6.746 6.723 9.397 9.007 8.873 12.522 12.13 12.109 16.806 16.16 16.533 22.728 21.041 21.838 29.597 27.6 28.351 38.324 35.403 36.689 48.937 44.754 46.413 60.735 55.78 57.464 0.6 4.078 4.298 4.534 4.776 5.031 6.603 8.798 12.24 16.272 21.285 27.798 35.926 46.106 56.974 0.7 4.065 4.281 4.537 4.816 5.095 6.924 9.357 12.867 17.009 22.432 29.577 38.133 48.34 60.451 0.8 4.073 4.301 4.551 4.817 5.121 7.033 9.723 13.339 17.961 24.054 32.03 41.491 52.617 66.712 0.9 4.162 4.418 4.709 5.027 5.376 7.796 11.1 15.716 21.766 29.2 38.786 49.723 62.802 78.41 Table 1.12: 95% Fixed-b critical values of Wald(S) with QS kernel, l = 2, b1 = b2 = b b=0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.1 6.545 7.194 7.939 8.715 9.665 16.041 27.467 45.53 72.151 108.625 158.481 221.964 304.042 404.653 0.2 6.371 6.916 7.533 8.208 8.968 13.952 22.471 35.924 54.831 82.168 117.545 163.498 220.842 296.357 0.3 0.4 0.5 0.6 0.7 6.319 6.233 6.359 6.326 6.53 6.741 6.662 6.792 6.83 7.014 7.247 7.182 7.315 7.316 7.55 7.811 7.69 7.836 7.875 8.242 8.486 8.306 8.462 8.466 8.922 13.07 12.468 12.481 12.526 13.544 20.156 19.41 19.266 19.249 21.214 31.708 30.315 29.839 30.121 33.261 49.151 45.982 45.157 45.123 50.044 72.852 66.928 66.757 66.24 72.815 103.562 96.266 94.481 93.173 104.123 143.644 133.73 129.628 129.225 144.891 193.648 182.573 174.469 174.765 193.792 256.175 243.928 231.015 229.871 256.899 0.8 6.635 7.19 7.747 8.427 9.2 14.544 23.57 37.764 57.892 86.59 125.598 175.599 242.272 316.291 Table 1.13: 95% Fixed-b critical values of Wald(S) , l = 2, b1 = b2 λ 0.2 0.2 0.2 0.4 0.5 0.5 b1 b2 0.2 0.05 0.5 0.125 1 0.25 1.25 0.83 1 1 2 2 Bartlett kernel 9.2936 14.9354 23.8607 34.864 33.7844 67.5688 49 QS kernel 11.8473 24.8345 59.9221 256.3 231.0145 1677.88 0.9 6.69 7.38 8.162 8.986 9.933 16.558 27.843 46.608 74.435 112.953 162.34 227.359 308.793 412.891 Table 1.14: The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. T 50 100 500 DGP A: (θ, ρ, ϕ) = (0.5, 0, 0). H0 : No Structural Change in both β 1 and β 2 at t = λT, λ = 0.2 b kernel Inference 0.020 0.040 0.060 0.080 0.100 0.200 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.1784 0.1892 0.1704 0.1888 0.1096 0.1176 0.106 0.1216 0.066 0.0724 0.0628 0.0756 0.1636 0.1916 0.1548 0.194 0.1024 0.1216 0.0936 0.1204 0.064 0.0832 0.064 0.0916 0.1544 0.1948 0.1372 0.1936 0.098 0.1296 0.0916 0.1356 0.0636 0.0924 0.0628 0.1052 0.1444 0.1984 0.1236 0.2012 0.1 0.1428 0.0884 0.1596 0.0632 0.1036 0.062 0.1224 0.1348 0.2008 0.1124 0.2132 0.098 0.1544 0.0896 0.1764 0.064 0.1104 0.062 0.142 0.1256 0.2544 0.1064 0.2988 0.0928 0.2096 0.0892 0.274 0.0632 0.1664 0.0668 0.2284 0.122 0.3088 0.0964 0.3936 0.0888 0.2692 0.0804 0.3688 0.0696 0.23 0.062 0.3304 0.800 0.900 1.000 0.1168 0.5416 0.0864 0.742 0.0864 0.4964 0.0672 0.726 0.0652 0.4664 0.062 0.7204 0.116 0.5724 0.0856 0.7776 0.0868 0.5296 0.0672 0.7676 0.0652 0.5068 0.0628 0.7692 0.116 0.5932 0.0848 0.8084 0.086 0.5596 0.0656 0.7984 0.0652 0.5396 0.0604 0.8024 T kernel Inference 0.400 0.500 0.600 b 0.700 50 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.1212 0.364 0.0916 0.4932 0.0876 0.322 0.072 0.4636 0.0696 0.2824 0.0608 0.4216 0.114 0.4152 0.0892 0.5764 0.0836 0.3644 0.072 0.5408 0.0668 0.3316 0.0616 0.5144 0.1144 0.464 0.0856 0.6404 0.0832 0.4116 0.0636 0.6176 0.0704 0.3816 0.062 0.5964 0.1176 0.5052 0.0868 0.7016 0.0876 0.4508 0.0644 0.6808 0.0672 0.4232 0.0628 0.6564 100 500 50 0.300 Table 1.15: The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. T 50 100 500 DGP A: (θ, ρ, ϕ) = (0.5, 0, 0). H0 : No Structural Change in both β 1 and β 2 at t = λT, λ = 0.4 b kernel Inference 0.020 0.040 0.060 0.080 0.100 0.200 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.1036 0.1088 0.0988 0.1088 0.0796 0.0816 0.0776 0.0812 0.0672 0.0692 0.0656 0.0712 0.0976 0.1096 0.0852 0.1108 0.08 0.092 0.078 0.0972 0.0684 0.0764 0.0676 0.0868 0.088 0.116 0.0848 0.1252 0.078 0.0996 0.0772 0.1084 0.066 0.088 0.0656 0.1 0.092 0.1288 0.0864 0.1376 0.0772 0.1068 0.0776 0.1244 0.0668 0.0968 0.0644 0.1124 0.09 0.136 0.088 0.1588 0.0748 0.12 0.0772 0.1432 0.068 0.1044 0.0668 0.126 0.0852 0.194 0.084 0.2504 0.076 0.1688 0.074 0.2268 0.0644 0.1508 0.0628 0.21 0.0844 0.2492 0.0792 0.3452 0.0716 0.2268 0.0696 0.3124 0.0676 0.2052 0.0572 0.3032 0.800 0.900 1.000 0.0848 0.4708 0.0764 0.6884 0.066 0.4468 0.0616 0.656 0.0664 0.432 0.0596 0.658 0.0848 0.506 0.076 0.7368 0.0664 0.4764 0.0596 0.7004 0.0656 0.4648 0.0588 0.7072 0.0852 0.5372 0.0728 0.776 0.066 0.5028 0.0556 0.7464 0.0648 0.4976 0.0576 0.7476 T kernel Inference 0.400 0.500 0.600 b 0.700 50 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.0832 0.306 0.0824 0.42 0.0716 0.272 0.0624 0.3948 0.0588 0.2628 0.056 0.3828 0.0844 0.348 0.0808 0.5048 0.066 0.3268 0.0604 0.4736 0.062 0.3084 0.0644 0.4608 0.0852 0.39 0.0832 0.5756 0.064 0.374 0.0604 0.5372 0.0636 0.3512 0.064 0.5348 0.0828 0.4348 0.0804 0.6368 0.0656 0.4156 0.0608 0.598 0.0656 0.39 0.0612 0.6028 100 500 51 0.300 Table 1.16: The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. T 50 100 500 DGP C: (θ, ρ, ϕ) = (0.5, 0.5, 0.5). H0 : No Structural Change in both β 1 and β 2 at t = λT, λ = 0.2 b kernel Inference 0.020 0.040 0.060 0.080 0.100 0.200 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.4812 0.4908 0.4712 0.4928 0.3904 0.4032 0.3596 0.3832 0.1892 0.1976 0.1492 0.164 0.424 0.3864 0.4528 0.4348 0.3904 0.346 0.4388 0.4184 0.334 0.286 0.3648 0.334 0.3028 0.2332 0.3516 0.3088 0.124 0.106 0.1476 0.14 0.0956 0.082 0.1244 0.126 0.3548 0.4224 0.3108 0.4092 0.2468 0.3144 0.192 0.2872 0.0948 0.1428 0.076 0.1408 0.336 0.4172 0.2756 0.3972 0.2136 0.302 0.1756 0.2808 0.092 0.1488 0.0716 0.1544 0.2444 0.4068 0.178 0.4156 0.1704 0.3148 0.1172 0.3348 0.08 0.194 0.0708 0.2328 0.2168 0.4428 0.1408 0.4812 0.1504 0.368 0.1044 0.422 0.0804 0.2472 0.0672 0.3396 0.800 0.900 1.000 0.2008 0.6372 0.1024 0.7904 0.144 0.5944 0.082 0.764 0.078 0.4956 0.0644 0.7236 0.2008 0.664 0.0996 0.8248 0.1456 0.6232 0.082 0.8048 0.0804 0.5268 0.0636 0.7648 0.2024 0.6888 0.096 0.8524 0.148 0.6508 0.0804 0.8332 0.0796 0.5588 0.0608 0.7908 T kernel Inference 0.400 0.500 0.600 b 0.700 50 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.2084 0.4832 0.1236 0.562 0.1464 0.4184 0.0944 0.5104 0.0832 0.3048 0.0696 0.4332 0.206 0.5308 0.1136 0.6236 0.1452 0.47 0.0868 0.5856 0.08 0.36 0.0676 0.5192 0.2036 0.5668 0.1056 0.6888 0.1476 0.5176 0.0844 0.6468 0.0804 0.4036 0.0668 0.594 0.2048 0.6088 0.1064 0.7476 0.1464 0.5576 0.084 0.71 0.0824 0.4532 0.0664 0.6664 100 500 52 0.300 Table 1.17: The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. T 50 100 500 DGP C: (θ, ρ, ϕ) = (0.5, 0.5, 0.5). H0 : No Structural Change in both β 1 and β 2 at t = λT, λ = 0.4 b kernel Inference 0.020 0.040 0.060 0.080 0.100 0.200 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.4076 0.416 0.4076 0.422 0.3544 0.362 0.336 0.3492 0.1116 0.1168 0.0856 0.094 0.372 0.3968 0.35 0.3836 0.2464 0.2692 0.208 0.24 0.0856 0.0968 0.0688 0.0888 0.3132 0.3532 0.2572 0.3284 0.2036 0.2328 0.1528 0.2076 0.0776 0.0956 0.068 0.0936 0.2564 0.328 0.2184 0.2956 0.1712 0.2192 0.1296 0.1992 0.0764 0.1016 0.0688 0.1044 0.2368 0.3092 0.1896 0.284 0.1528 0.2168 0.1184 0.2012 0.0772 0.1072 0.0688 0.1204 0.1776 0.3096 0.1264 0.3272 0.122 0.2364 0.0904 0.2596 0.0736 0.162 0.0636 0.202 0.158 0.3584 0.108 0.4092 0.1164 0.2884 0.0836 0.342 0.07 0.2124 0.0576 0.2896 0.800 0.900 1.000 0.1512 0.5744 0.0992 0.7484 0.1124 0.506 0.0728 0.6884 0.0656 0.4308 0.0608 0.6616 0.1556 0.6032 0.0952 0.788 0.112 0.5388 0.0736 0.7252 0.0692 0.4676 0.0592 0.7016 0.1548 0.6356 0.096 0.8152 0.1116 0.5744 0.0728 0.7696 0.07 0.5052 0.0588 0.7372 T kernel Inference 0.400 0.500 0.600 b 0.700 50 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.1536 0.4104 0.1004 0.4924 0.1156 0.3372 0.0792 0.422 0.0664 0.2548 0.0564 0.374 0.1512 0.4628 0.1012 0.5732 0.1072 0.3864 0.0784 0.5052 0.066 0.2988 0.0572 0.4588 0.1476 0.5112 0.0996 0.6324 0.1056 0.428 0.0748 0.5776 0.068 0.348 0.0576 0.5344 0.1492 0.5428 0.0992 0.6976 0.1108 0.472 0.0748 0.6392 0.0668 0.388 0.0588 0.602 100 500 53 0.300 Table 1.18: The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. T 50 100 500 DGP D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5). H0 : No Structural Change in both β 1 and β 2 at t = λT, λ = 0.2 b kernel Inference 0.020 0.040 0.060 0.080 0.100 0.200 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.5204 0.5296 0.5164 0.5312 0.4504 0.4624 0.4264 0.444 0.2132 0.2248 0.1668 0.1828 0.462 0.4936 0.426 0.4756 0.3984 0.4256 0.3676 0.4124 0.144 0.1676 0.108 0.1428 0.4212 0.4712 0.3788 0.4548 0.3492 0.3948 0.292 0.3688 0.1212 0.1576 0.0948 0.1444 0.3956 0.46 0.3428 0.4436 0.3044 0.3728 0.2464 0.3464 0.1084 0.1632 0.0872 0.1604 0.3728 0.4524 0.3052 0.4356 0.2728 0.3608 0.2184 0.3404 0.1028 0.1692 0.0808 0.1772 0.2832 0.4472 0.204 0.4608 0.2152 0.38 0.1528 0.39 0.0948 0.218 0.0804 0.2624 0.2532 0.4836 0.1676 0.5188 0.1944 0.4168 0.1292 0.4588 0.0916 0.28 0.0804 0.3588 0.800 0.900 1.000 0.2284 0.6768 0.12 0.8076 0.1828 0.6328 0.102 0.7836 0.0908 0.5124 0.0692 0.7344 0.2308 0.7028 0.1188 0.8356 0.1828 0.6616 0.1008 0.8232 0.092 0.5476 0.0692 0.7744 0.2308 0.726 0.118 0.8644 0.1836 0.6908 0.0964 0.8488 0.0928 0.5816 0.0676 0.8136 T kernel Inference 0.400 0.500 0.600 b 0.700 50 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.2392 0.5252 0.1436 0.5948 0.1884 0.4624 0.1184 0.5432 0.0936 0.3304 0.0772 0.4512 0.2328 0.5744 0.1332 0.656 0.1844 0.5084 0.11 0.6216 0.0912 0.3836 0.0768 0.5352 0.2336 0.6128 0.1268 0.716 0.1844 0.5572 0.1064 0.6868 0.0916 0.4296 0.0752 0.6156 0.2292 0.6436 0.1244 0.7652 0.182 0.596 0.1016 0.744 0.0904 0.474 0.0712 0.6836 100 500 54 0.300 Table 1.19: The Finite Sample Size associated with Wald(S) , l = 2, b1 = b2 = b. T 50 100 500 DGP D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5). H0 : No Structural Change in both β 1 and β 2 at t = λT, λ = 0.4 b kernel Inference 0.020 0.040 0.060 0.080 0.100 0.200 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.4664 0.4768 0.4668 0.4856 0.4148 0.422 0.3964 0.4084 0.1368 0.142 0.0992 0.1092 0.4304 0.4528 0.406 0.4396 0.292 0.3128 0.2464 0.2796 0.1 0.1144 0.0812 0.0992 0.3632 0.406 0.3092 0.3708 0.2368 0.2732 0.1772 0.2416 0.09 0.1152 0.0768 0.1088 0.3124 0.3692 0.2536 0.3392 0.2008 0.2552 0.1532 0.2256 0.0896 0.1228 0.0748 0.1236 0.2828 0.3512 0.2188 0.326 0.1868 0.2492 0.1408 0.2288 0.0848 0.1292 0.0736 0.1396 0.2144 0.3652 0.1576 0.3696 0.1528 0.2784 0.1168 0.296 0.0744 0.1764 0.0648 0.218 0.2064 0.4104 0.142 0.4496 0.1452 0.328 0.104 0.3792 0.074 0.2296 0.0608 0.3068 0.800 0.900 1.000 0.1908 0.6244 0.1088 0.7688 0.1344 0.5528 0.0876 0.7196 0.074 0.45 0.0648 0.6716 0.1916 0.6524 0.106 0.8028 0.1364 0.58 0.0868 0.7588 0.0756 0.482 0.0656 0.7188 0.1904 0.6808 0.1052 0.8328 0.1372 0.608 0.0864 0.7928 0.0756 0.5128 0.0648 0.7568 T kernel Inference 0.400 0.500 0.600 b 0.700 50 Bartlett Bartlett QS QS Bartlett Bartlett QS QS Bartlett Bartlett QS QS Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square Fixed-b chi square 0.1956 0.4584 0.132 0.5276 0.1404 0.376 0.0968 0.4664 0.0708 0.2796 0.0612 0.4004 0.1916 0.5012 0.1236 0.6188 0.1336 0.4192 0.0924 0.5576 0.0724 0.3228 0.0572 0.4768 0.1872 0.5492 0.1184 0.674 0.1336 0.4732 0.0924 0.624 0.0736 0.3732 0.058 0.5536 0.1888 0.59 0.1112 0.7256 0.1344 0.5148 0.0912 0.6752 0.0744 0.4096 0.0604 0.6176 100 500 55 0.300 Table 1.20: The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , l = 2, b1 = λb , b2 = b 1− λ , T = 50, H0 : No Structural Change in both β 1 and β 2 at t = λT Bartlett kernel Wald( F) Wald( F) Fixed-b Fixed-b chi square 0.1184 0.1168 0.2284 0.11 0.112 0.346 0.1132 0.1116 0.4924 Wald(S) DGP λ A .2 b .04 .1 .2 b1 b2 .2 .05 .5 .125 1.0 .25 .04 .1 .2 QS kernel Wald( F) Wald( F) Fixed-b Fixed-b chi square 0.0968 0.098 0.2664 0.0992 0.0928 0.464 0.0952 0.0932 0.6596 Wald(S) B .2 .2 .05 .5 .125 1.0 .25 0.2444 0.1704 0.1548 0.244 0.17 0.1544 0.3804 0.436 0.558 0.1868 0.1168 0.096 0.1884 0.1108 0.0932 0.3904 0.4936 0.6704 C .2 .04 .2 .05 .1 .5 .125 .2 1.0 .25 0.2948 0.1932 0.1668 0.2928 0.1952 0.1676 0.4244 0.4592 0.5728 0.2092 0.1216 0.0924 0.2116 0.1184 0.0928 0.4152 0.506 0.6668 D .2 .04 .1 .2 .2 .05 .5 .125 1.0 .25 0.3296 0.2312 0.2024 0.3292 0.2304 0.2036 0.47 0.514 0.6164 0.2428 0.1464 0.1156 0.2444 0.1416 0.1168 0.4596 0.5552 0.6984 E .2 .2 .05 .5 .125 1.0 .25 0.5672 0.3696 0.2768 0.5664 0.3688 0.284 0.6812 0.6444 0.682 0.4576 0.2148 0.1428 0.46 0.2128 0.1384 0.6612 0.6412 0.712 F .2 .04 .1 .2 .04 .2 .05 0.582 0.5804 0.6948 0.4816 0.4872 0.6732 .1 .5 .125 0.3924 0.3928 0.6728 0.2492 0.2452 0.6684 .2 1.0 .25 0.3064 0.3136 0.7064 0.1696 0.1632 0.7364 Note: The DGP labels are given by A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0), B: (θ, ρ, ϕ) = (0.5, 0.5, 0.0), C: (θ, ρ, ϕ) = (0.5, 0.5, 0.5), D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5), E: (θ, ρ, ϕ) = (0.8, 0.9, 0.5), and F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9). 56 Table 1.21: The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , l = 2, b1 = λb , b2 = b 1− λ , T = 100, H0 : No Structural Change in both β 1 and β 2 at t = λT Bartlett kernel Wald( F) Wald( F) Fixed-b Fixed-b chi square 0.0928 0.092 0.1924 0.0876 0.0884 0.3044 0.0868 0.086 0.4472 Wald(S) DGP λ A .2 b .04 .1 .2 b1 b2 .2 .05 .5 .125 1.0 .25 .04 .1 .2 QS kernel Wald( F) Wald( F) Fixed-b Fixed-b chi square 0.0872 0.0892 0.2408 0.08 0.0796 0.43 0.082 0.0784 0.6308 Wald(S) B .2 .2 .05 .5 .125 1.0 .25 0.1672 0.1264 0.1168 0.1668 0.1312 0.1152 0.2916 0.3872 0.5144 0.1204 0.088 0.0768 0.122 0.086 0.0716 0.3004 0.4556 0.6344 C .2 .04 .2 .05 .1 .5 .125 .2 1.0 .25 0.1816 0.1396 0.1244 0.1828 0.1376 0.1264 0.3132 0.4028 0.5288 0.1316 0.0904 0.0792 0.1344 0.092 0.0812 0.3124 0.4624 0.6424 D .2 .04 .1 .2 .2 .05 .5 .125 1.0 .25 0.2312 0.1772 0.1624 0.2296 0.1784 0.1636 0.3792 0.4528 0.58 0.1636 0.114 0.1116 0.168 0.1108 0.106 0.3696 0.5044 0.6892 E .2 .2 .05 .5 .125 1.0 .25 0.4656 0.276 0.2192 0.4672 0.28 0.23 0.6104 0.5848 0.6472 0.3404 0.1476 0.1048 0.3428 0.1448 0.1032 0.5716 0.586 0.6908 F .2 .04 .1 .2 .04 .2 .05 0.4864 0.4872 0.6308 0.3588 0.364 0.5996 .1 .5 .125 0.3008 0.3036 0.618 0.1764 0.1748 0.6156 .2 1.0 .25 0.25 0.2524 0.6784 0.1292 0.1232 0.7152 Note: The DGP labels are given by A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0), B: (θ, ρ, ϕ) = (0.5, 0.5, 0.0), C: (θ, ρ, ϕ) = (0.5, 0.5, 0.5), D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5), E: (θ, ρ, ϕ) = (0.8, 0.9, 0.5), and F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9). 57 Table 1.22: The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , l = 2, b1 = λb , b2 = b 1− λ , T = 500, H0 : No Structural Change in both β 1 and β 2 at t = λT Bartlett kernel Wald( F) Wald( F) Fixed-b Fixed-b chi square 0.0676 0.0672 0.1468 0.066 0.0628 0.274 0.0632 0.06 0.4128 Wald(S) DGP λ A .2 b .04 .1 .2 b1 b2 .2 .05 .5 .125 1.0 .25 .04 .1 .2 QS kernel Wald( F) Wald( F) Fixed-b Fixed-b chi square 0.0628 0.0636 0.1948 0.0592 0.0616 0.3872 0.0588 0.0604 0.6104 Wald(S) B .2 .2 .05 .5 .125 1.0 .25 0.0788 0.0768 0.078 0.078 0.076 0.076 0.174 0.2932 0.432 0.0676 0.0664 0.0668 0.0688 0.0616 0.0624 0.1996 0.3904 0.6008 C .2 .04 .2 .05 .1 .5 .125 .2 1.0 .25 0.0816 0.0724 0.0748 0.0816 0.0732 0.0756 0.1804 0.2928 0.4324 0.0724 0.0648 0.0604 0.0732 0.0632 0.0644 0.202 0.3908 0.6028 D .2 .04 .1 .2 .2 .05 .5 .125 1.0 .25 0.0976 0.0888 0.0892 0.096 0.0864 0.0884 0.2032 0.3212 0.4556 0.0788 0.08 0.0732 0.0796 0.078 0.0732 0.228 0.4216 0.62 E .2 .2 .05 .5 .125 1.0 .25 0.1964 0.1316 0.122 0.1972 0.1308 0.1208 0.3256 0.3876 0.5104 0.1236 0.0804 0.0788 0.1276 0.0772 0.0768 0.3092 0.436 0.6228 F .2 .04 .1 .2 .04 .2 .05 0.2196 0.2204 0.3652 0.142 0.1448 0.3468 .1 .5 .125 0.1568 0.1564 0.4344 0.098 0.0944 0.4756 .2 1.0 .25 0.1432 0.1436 0.548 0.0884 0.0836 0.6524 Note: The DGP labels are given by A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0), B: (θ, ρ, ϕ) = (0.5, 0.5, 0.0), C: (θ, ρ, ϕ) = (0.5, 0.5, 0.5), D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5), E: (θ, ρ, ϕ) = (0.8, 0.9, 0.5), and F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9). 58 Table 1.23: The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , l = 2, b1 = λb , b2 = DGP A b .5 1.0 .4 .5 .2 .5 1.0 λ .5 b 1− λ , T = 50, H0 : No Structural Change in both β 1 and β 2 at t = λT Bartlett kernel Wald(S) Wald( F) Wald( F) b1 b2 Fixed-b Fixed-b chi square 1.0 1.0 0.0888 0.0856 0.552 2.0 2.0 0.0888 0.0804 0.7256 1.25 .83 0.0828 0.0824 0.5572 2.5 .625 0.11 0.1116 0.6936 5.0 1.25 0.1116 0.1032 0.8172 QS kernel Wald(S) Wald( F) Wald( F) Fixed-b Fixed-b chi square 0.0672 0.068 0.808 0.0644 0.0636 0.9396 0.0724 0.0716 0.8072 0.0828 0.08 0.8828 0.0744 0.0628 0.9652 B .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.144 0.144 0.1384 0.15 0.1492 0.1476 0.1328 0.1316 0.1456 0.1376 0.6348 0.7748 0.636 0.736 0.8504 0.0964 0.0832 0.0876 0.0816 0.0832 0.0844 0.062 0.0784 0.076 0.0744 0.824 0.9464 0.8236 0.882 0.9648 C .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.1516 0.1516 0.1492 0.1564 0.1624 0.1548 0.1412 0.148 0.1596 0.146 0.6508 0.7912 0.6516 0.7412 0.8592 0.09 0.0848 0.092 0.0892 0.086 0.0864 0.0712 0.0844 0.08 0.082 0.8332 0.9528 0.8368 0.8752 0.9692 D .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.192 0.192 0.19 0.1908 0.1908 0.1944 0.182 0.1852 0.1928 0.1756 0.6788 0.8192 0.694 0.7792 0.8756 0.1124 0.1012 0.1056 0.102 0.0976 0.1032 0.0816 0.0976 0.0912 0.0824 0.8592 0.9588 0.8504 0.9008 0.9788 E .5 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.3756 0.3756 0.354 0.25 0.2564 0.3856 0.3608 0.354 0.2512 0.2412 0.7872 0.872 0.7876 0.7972 0.8752 0.2044 0.1724 0.1728 0.1108 0.1148 0.174 0.132 0.156 0.0964 0.0892 0.8788 0.964 0.8896 0.8852 0.9692 F .5 .5 1.0 .4 .5 .2 .5 1.0 .5 1.0 1.0 0.4008 0.4044 0.7988 0.2216 0.1904 0.8956 1.0 2.0 2.0 0.4008 0.388 0.8816 0.1908 0.136 0.97 .4 .5 1.25 .83 0.3832 0.3796 0.8168 0.186 0.1608 0.9032 .2 .5 2.5 .625 0.2876 0.2836 0.8148 0.1284 0.1144 0.8984 1.0 5.0 1.25 0.2892 0.2756 0.8948 0.1272 0.0972 0.974 Note: The DGP labels are given by A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0), B: (θ, ρ, ϕ) = (0.5, 0.5, 0.0), C: (θ, ρ, ϕ) = (0.5, 0.5, 0.5), D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5), E: (θ, ρ, ϕ) = (0.8, 0.9, 0.5), and F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9). 59 Table 1.24: The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , l = 2, b1 = λb , b2 = DGP A b .5 1.0 .4 .5 .2 .5 1.0 λ .5 b 1− λ , T = 100, H0 : No Structural Change in both β 1 and β 2 at t = λT Wald(S) b1 b2 Fixed-b 1.0 1.0 0.0668 2.0 2.0 0.0668 1.25 .83 0.0664 2.5 .625 0.0832 5.0 1.25 0.0872 Bartlett kernel Wald( F) Wald( F) Fixed-b chi square 0.058 0.5344 0.0568 0.7052 0.0664 0.5272 0.0832 0.666 0.076 0.7996 QS kernel Wald(S) Wald( F) Wald( F) Fixed-b Fixed-b chi square 0.0544 0.056 0.7856 0.052 0.054 0.9316 0.0556 0.052 0.7896 0.0688 0.0684 0.8724 0.0676 0.0628 0.9668 B .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.0856 0.0856 0.1004 0.1108 0.1144 0.088 0.0784 0.1008 0.1092 0.1048 0.5776 0.7428 0.5844 0.6988 0.8288 0.0604 0.0576 0.0692 0.0712 0.0724 0.0636 0.0508 0.0668 0.0716 0.0656 0.804 0.9412 0.8012 0.8716 0.9672 C .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.0988 0.0988 0.1112 0.1228 0.1216 0.0956 0.0852 0.108 0.1228 0.1076 0.5816 0.7492 0.5892 0.7184 0.8356 0.0648 0.0616 0.07 0.0816 0.0764 0.0588 0.0552 0.0696 0.0772 0.0664 0.8076 0.9428 0.8028 0.872 0.97 D .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.1168 0.1168 0.134 0.1644 0.166 0.1228 0.1104 0.1312 0.162 0.152 0.61 0.7588 0.6272 0.7584 0.856 0.082 0.078 0.0804 0.0872 0.0832 0.074 0.064 0.0804 0.084 0.0696 0.8096 0.9384 0.8204 0.8912 0.9696 E .5 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.2756 0.2756 0.2852 0.2128 0.216 0.2852 0.2596 0.2792 0.2112 0.2 0.742 0.8532 0.7604 0.7768 0.8808 0.1412 0.126 0.1364 0.0948 0.0992 0.1304 0.0932 0.1276 0.088 0.0812 0.87 0.9584 0.8852 0.8852 0.974 F .5 .5 1.0 .4 .5 .2 .5 1.0 .5 1.0 1.0 0.3044 0.3028 0.7664 0.1708 0.154 0.8764 1.0 2.0 2.0 0.3044 0.2872 0.8612 0.1596 0.1192 0.96 .4 .5 1.25 .83 0.2936 0.2948 0.776 0.1528 0.136 0.884 .2 .5 2.5 .625 0.2384 0.2408 0.7892 0.1152 0.0976 0.8864 1.0 5.0 1.25 0.2408 0.2252 0.8804 0.1168 0.0872 0.9728 Note: The DGP labels are given by A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0), B: (θ, ρ, ϕ) = (0.5, 0.5, 0.0), C: (θ, ρ, ϕ) = (0.5, 0.5, 0.5), D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5), E: (θ, ρ, ϕ) = (0.8, 0.9, 0.5), and F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9). 60 Table 1.25: The Finite Sample Size of the Tests Based on Wald(S) and Wald( F) , l = 2, b1 = λb , b2 = DGP A b .5 1.0 .4 .5 .2 .5 1.0 λ .5 b 1− λ , T = 500, H0 : No Structural Change in both β 1 and β 2 at t = λT Wald(S) b1 b2 Fixed-b 1.0 1.0 0.0488 2.0 2.0 0.0488 1.25 .83 0.064 2.5 .625 0.0664 5.0 1.25 0.0636 Bartlett kernel Wald( F) Wald( F) Fixed-b chi square 0.0532 0.4988 0.0428 0.6856 0.0612 0.5144 0.0676 0.6488 0.056 0.7952 Wald(S) Fixed-b 0.0532 0.0528 0.0568 0.056 0.0556 QS kernel Wald( F) Wald( F) Fixed-b chi square 0.0496 0.7708 0.0492 0.9264 0.058 0.7848 0.058 0.8664 0.0568 0.958 B .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.06 0.06 0.0692 0.0752 0.0724 0.0652 0.0524 0.0688 0.0764 0.0652 0.5024 0.6948 0.5148 0.6552 0.7932 0.0548 0.05 0.0576 0.054 0.0564 0.0472 0.0496 0.0564 0.0592 0.0576 0.776 0.9312 0.784 0.8624 0.9612 C .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.0596 0.0596 0.0668 0.076 0.0736 0.0632 0.0556 0.0688 0.076 0.066 0.514 0.6884 0.528 0.656 0.798 0.0552 0.0492 0.0572 0.0556 0.0584 0.0488 0.0428 0.06 0.0584 0.0516 0.7744 0.9308 0.7868 0.858 0.9572 D .5 .5 1.0 .4 .5 .2 .5 1.0 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.0652 0.0652 0.0744 0.0868 0.0884 0.0644 0.0568 0.07 0.086 0.0804 0.5264 0.7 0.534 0.6704 0.8148 0.0528 0.0504 0.0604 0.066 0.0612 0.0528 0.0464 0.0548 0.0696 0.058 0.7824 0.9364 0.792 0.8728 0.9596 E .5 1.0 1.0 2.0 2.0 1.25 .83 2.5 .625 5.0 1.25 0.1032 0.1032 0.11 0.1188 0.1196 0.1024 0.0964 0.1076 0.1188 0.1056 0.5728 0.7288 0.582 0.7032 0.8352 0.0724 0.0692 0.0728 0.0796 0.0708 0.0676 0.0648 0.0652 0.0652 0.0624 0.7968 0.9304 0.794 0.8704 0.9624 F .5 .5 1.0 .4 .5 .2 .5 1.0 .5 1.0 1.0 0.12 0.1216 0.5904 0.0784 0.07 0.8052 1.0 2.0 2.0 0.12 0.1116 0.7452 0.0688 0.06 0.932 .4 .5 1.25 .83 0.1224 0.1212 0.6112 0.0712 0.0692 0.812 .2 .5 2.5 .625 0.1416 0.1424 0.7324 0.0868 0.0796 0.872 1.0 5.0 1.25 0.138 0.1288 0.8432 0.0804 0.0732 0.9692 Note: The DGP labels are given by A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0), B: (θ, ρ, ϕ) = (0.5, 0.5, 0.0), C: (θ, ρ, ϕ) = (0.5, 0.5, 0.5), D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5), E: (θ, ρ, ϕ) = (0.8, 0.9, 0.5), and F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9). 61 Table 1.26: Fixed-b 95% Critical Values of Wald( F) Unknown Break Date, Bartlett kernel, l =2 b 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 SupW 30.293 48.447 61.976 73.862 84.848 138.92 193.94 254.14 313.06 374.36 433.71 491.83 549.63 608.99 = 0.05 MeanW 4.861 5.9489 7.0183 8.001 8.973 14.018 19.113 24.443 29.999 35.304 40.902 46.205 51.450 57.142 ExpW 9.588 18.1938 24.816 30.656 36.109 63.068 90.408 120.71 149.85 180.46 210.22 239.08 268.05 297.78 = 0.1 MeanW 4.235 4.974 5.729 6.496 7.278 11.323 15.596 20.009 24.565 29.202 33.625 38.016 42.238 46.623 SupW 18.230 26.034 33.172 39.957 46.263 76.971 109.11 142.31 176.51 212.05 245.66 279.65 311.37 344.26 ExpW 5.051 8.173 11.483 14.695 17.653 32.706 48.657 65.120 82.037 99.596 116.32 133.32 149.22 165.51 SupW 13.542 16.313 19.496 22.812 26.323 46.122 67.262 89.241 111.18 134.00 153.93 173.96 192.52 212.76 = 0.2 MeanW 3.263 3.688 4.162 4.617 5.146 8.052 11.216 14.464 17.912 21.386 24.666 27.702 30.670 33.936 ExpW 3.539 4.654 5.967 7.364 8.998 18.156 28.446 39.161 49.818 61.205 70.991 81.134 90.145 100.36 Table 1.27: Fixed-b 95% Critical Values of Wald( F) Unknown Break Date, QS kernel, l =2 b SupW 0.02 64.848 0.04 122.00 0.06 161.74 0.08 207.65 0.1 257.31 0.2 832.93 0.3 3339.8 0.4 13932 0.5 47253 0.6 136211 0.7 328737 0.8 719812 0.9 1444833 1 2647520 = 0.05 MeanW 5.678 8.102 10.617 13.202 16.139 40.501 99.975 239.82 537.89 1115.4 2170.5 3982.4 7015.5 11566 ExpW SupW 26.200 24.831 54.483 46.350 74.329 68.158 97.163 91.258 122.02 118.67 409.56 452.33 1663.0 2055.3 6959.4 8975.9 23620 31752 68099 91828 164361 224463 359899 488008 722409 970172 1323754 1829406 62 = 0.1 MeanW 4.641 6.059 7.630 9.461 11.671 30.155 77.012 185.18 411.53 850.69 1674.7 3100.4 5395.5 9072.3 ExpW SupW 7.548 15.051 17.433 20.670 28.148 28.305 39.595 38.905 53.066 52.759 219.29 240.65 1020.8 1144.7 4481.1 4771.4 15869 16684 45907 49492 112225 128234 243997 283267 485079 565285 914696 1062685 = 0.2 MeanW 3.458 4.205 5.060 6.143 7.491 19.924 51.677 124.22 276.98 580.43 1140.0 2099.3 3626.6 5951.4 ExpW 4.111 6.401 9.666 14.409 20.987 113.55 565.45 2378.8 8334.9 24740 64110 141627 282635 531336 Table 1.28: The Finite Sample Size of the SupW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.331 0.184 0.721 0.290 0.191 0.847 0.284 0.212 0.898 0.284 0.211 0.936 0.277 0.212 0.954 0.262 0.153 0.992 0.264 0.084 0.998 0.259 0.048 1.000 0.250 0.036 1.000 0.250 0.026 1.000 0.250 0.024 1.000 0.243 0.020 1.000 0.248 0.015 1.000 0.253 0.012 1.000 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.160 0.111 0.347 0.136 0.096 0.494 0.131 0.104 0.601 0.127 0.108 0.696 0.124 0.106 0.756 0.125 0.081 0.908 0.124 0.050 0.962 0.123 0.031 0.983 0.120 0.023 0.994 0.118 0.017 0.997 0.120 0.016 0.999 0.119 0.014 1.000 0.123 0.010 1.000 0.121 0.007 1.000 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.104 0.099 0.163 0.108 0.098 0.241 0.103 0.082 0.308 0.098 0.078 0.384 0.094 0.071 0.447 0.084 0.056 0.678 0.081 0.043 0.812 0.081 0.028 0.885 0.081 0.019 0.923 0.080 0.017 0.950 0.080 0.015 0.968 0.079 0.012 0.984 0.079 0.010 0.989 0.080 0.009 0.994 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.093 0.084 0.472 0.086 0.080 0.704 0.086 0.082 0.810 0.087 0.079 0.865 0.084 0.077 0.904 0.078 0.063 0.983 0.081 0.055 0.995 0.078 0.047 0.999 0.083 0.046 1.000 0.081 0.034 1.000 0.080 0.026 1.000 0.078 0.025 1.000 0.079 0.026 1.000 0.081 0.026 1.000 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.069 0.070 0.217 0.069 0.063 0.376 0.069 0.063 0.507 0.065 0.058 0.607 0.064 0.057 0.679 0.058 0.052 0.878 0.057 0.052 0.946 0.056 0.041 0.974 0.057 0.038 0.986 0.059 0.034 0.994 0.062 0.028 0.998 0.058 0.027 1.000 0.058 0.025 1.000 0.061 0.024 1.000 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.062 0.062 0.111 0.060 0.060 0.179 0.057 0.058 0.247 0.060 0.056 0.315 0.062 0.056 0.381 0.054 0.044 0.641 0.061 0.052 0.786 0.054 0.050 0.865 0.052 0.042 0.908 0.050 0.041 0.935 0.055 0.035 0.959 0.054 0.031 0.977 0.050 0.030 0.987 0.055 0.028 0.992 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.0772 0.0784 0.4132 0.078 0.072 0.6584 0.0784 0.0712 0.7832 0.0756 0.0592 0.8528 0.0696 0.0664 0.8956 0.0704 0.0584 0.9804 0.07 0.0472 0.994 0.0696 0.0448 0.9984 0.0716 0.0468 1 0.072 0.0456 1 0.0724 0.0472 1 0.0664 0.0436 1 0.0696 0.04 1 0.0708 0.04 1 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.056 0.0512 0.1932 0.0544 0.0536 0.3664 0.05 0.052 0.4984 0.0464 0.0484 0.5956 0.0496 0.0532 0.6788 0.05 0.0556 0.866 0.0488 0.0476 0.934 0.0432 0.0464 0.9688 0.052 0.0452 0.9852 0.0472 0.0484 0.9912 0.046 0.0492 0.996 0.054 0.0484 0.9992 0.0508 0.0456 0.9996 0.0524 0.044 1 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.0592 0.0572 0.116 0.0592 0.0592 0.1808 0.0572 0.0584 0.2452 0.056 0.054 0.3048 0.0564 0.0516 0.3712 0.0492 0.0472 0.6172 0.0488 0.0436 0.7744 0.05 0.05 0.8536 0.054 0.0548 0.8964 0.0464 0.05 0.928 0.0496 0.0468 0.956 0.0472 0.0468 0.974 0.0484 0.0456 0.9852 0.0484 0.0424 0.9928 63 Table 1.29: The Finite Sample Size of the SupW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.665 0.308 0.947 0.466 0.173 0.955 0.398 0.166 0.966 0.365 0.159 0.978 0.351 0.161 0.985 0.316 0.114 0.996 0.308 0.071 1.000 0.304 0.044 1.000 0.302 0.034 1.000 0.310 0.024 1.000 0.310 0.020 1.000 0.307 0.021 1.000 0.304 0.021 1.000 0.309 0.020 1.000 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.598 0.395 0.801 0.417 0.200 0.800 0.355 0.166 0.842 0.319 0.156 0.881 0.308 0.153 0.906 0.283 0.101 0.968 0.271 0.061 0.990 0.266 0.036 0.997 0.272 0.026 0.999 0.276 0.023 1.000 0.266 0.020 1.000 0.265 0.021 1.000 0.265 0.020 1.000 0.273 0.021 1.000 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.493 0.376 0.604 0.341 0.227 0.542 0.292 0.172 0.587 0.268 0.145 0.638 0.255 0.128 0.674 0.224 0.077 0.832 0.216 0.054 0.907 0.213 0.037 0.948 0.213 0.034 0.966 0.208 0.028 0.980 0.210 0.024 0.989 0.211 0.020 0.993 0.209 0.020 0.995 0.209 0.018 0.999 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.247 0.109 0.697 0.207 0.097 0.832 0.186 0.096 0.888 0.181 0.086 0.930 0.170 0.089 0.956 0.144 0.074 0.991 0.146 0.054 0.998 0.140 0.036 0.999 0.144 0.037 1.000 0.145 0.032 1.000 0.146 0.031 1.000 0.144 0.032 1.000 0.146 0.031 1.000 0.149 0.032 1.000 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.193 0.112 0.391 0.149 0.085 0.514 0.139 0.083 0.616 0.126 0.078 0.685 0.129 0.077 0.756 0.114 0.058 0.908 0.110 0.050 0.964 0.106 0.036 0.985 0.107 0.034 0.993 0.105 0.038 0.998 0.103 0.033 0.999 0.106 0.034 1.000 0.106 0.030 1.000 0.108 0.031 1.000 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.131 0.090 0.216 0.108 0.084 0.251 0.102 0.077 0.318 0.098 0.070 0.382 0.096 0.070 0.447 0.088 0.051 0.684 0.089 0.044 0.824 0.088 0.041 0.888 0.083 0.041 0.926 0.083 0.038 0.949 0.087 0.036 0.970 0.090 0.033 0.984 0.087 0.030 0.990 0.087 0.030 0.994 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.1628 0.0928 0.5716 0.144 0.0888 0.7504 0.1344 0.0844 0.844 0.1308 0.0796 0.8956 0.122 0.0672 0.9248 0.1196 0.0648 0.9828 0.1044 0.0512 0.9964 0.1072 0.0436 0.9996 0.1056 0.0424 0.9996 0.1056 0.042 1 0.1084 0.0412 1 0.1048 0.0416 1 0.1068 0.0404 1 0.1088 0.0404 1 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.1176 0.0836 0.2892 0.0996 0.0636 0.4304 0.0924 0.0576 0.5604 0.0856 0.058 0.654 0.0864 0.056 0.724 0.0828 0.052 0.8888 0.0752 0.058 0.9516 0.0768 0.046 0.9748 0.076 0.04 0.9884 0.0772 0.0392 0.9952 0.0804 0.0392 0.9992 0.0804 0.0404 0.9992 0.082 0.0404 1 0.0788 0.0404 1 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.0952 0.0744 0.1596 0.0884 0.0732 0.2132 0.0836 0.0732 0.2776 0.0828 0.066 0.3492 0.0816 0.0592 0.4104 0.0624 0.0504 0.6548 0.0648 0.0544 0.7856 0.0664 0.0516 0.862 0.0648 0.0512 0.9076 0.0692 0.0436 0.936 0.0688 0.0412 0.9568 0.066 0.0408 0.9788 0.0672 0.0364 0.9884 0.0668 0.0396 0.9912 64 Table 1.30: The Finite Sample Size of the SupW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.934 0.586 0.998 0.672 0.198 0.994 0.508 0.136 0.991 0.405 0.100 0.991 0.346 0.092 0.990 0.272 0.073 0.994 0.286 0.060 0.998 0.299 0.045 0.999 0.307 0.036 1.000 0.300 0.034 1.000 0.296 0.033 1.000 0.291 0.030 1.000 0.290 0.026 1.000 0.296 0.026 1.000 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.963 0.879 0.992 0.806 0.472 0.976 0.680 0.293 0.971 0.578 0.227 0.973 0.528 0.184 0.974 0.445 0.122 0.986 0.440 0.081 0.995 0.444 0.057 0.998 0.447 0.046 1.000 0.448 0.038 1.000 0.442 0.036 1.000 0.442 0.034 1.000 0.443 0.033 1.000 0.446 0.030 1.000 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.942 0.886 0.967 0.810 0.652 0.914 0.700 0.487 0.895 0.632 0.369 0.895 0.596 0.304 0.900 0.507 0.160 0.948 0.503 0.104 0.972 0.492 0.082 0.986 0.488 0.064 0.994 0.498 0.052 0.997 0.499 0.042 0.999 0.492 0.038 0.999 0.492 0.035 1.000 0.494 0.032 1.000 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.591 0.195 0.945 0.347 0.093 0.941 0.276 0.075 0.950 0.239 0.070 0.962 0.212 0.059 0.971 0.184 0.054 0.990 0.178 0.050 0.998 0.180 0.040 1.000 0.176 0.041 1.000 0.180 0.039 1.000 0.176 0.042 1.000 0.170 0.038 1.000 0.173 0.038 1.000 0.174 0.038 1.000 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.608 0.351 0.812 0.385 0.142 0.779 0.306 0.107 0.812 0.260 0.090 0.846 0.239 0.083 0.874 0.208 0.062 0.952 0.202 0.056 0.983 0.200 0.050 0.994 0.201 0.043 0.998 0.192 0.042 1.000 0.198 0.044 1.000 0.197 0.042 1.000 0.198 0.040 1.000 0.200 0.039 1.000 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.512 0.354 0.632 0.313 0.190 0.527 0.252 0.139 0.545 0.227 0.114 0.587 0.204 0.098 0.630 0.177 0.066 0.802 0.183 0.062 0.878 0.184 0.059 0.930 0.180 0.050 0.956 0.178 0.048 0.973 0.176 0.044 0.987 0.179 0.043 0.993 0.174 0.043 0.995 0.174 0.044 0.997 = 0.05 Fixed-b Andrews Bartlett QS Bartlett 0.4004 0.1216 0.8364 0.2688 0.0724 0.888 0.2176 0.0688 0.9216 0.1956 0.0564 0.9436 0.1728 0.0516 0.9568 0.1504 0.046 0.9908 0.1464 0.0444 0.9964 0.1464 0.0424 0.9996 0.1532 0.04 0.9996 0.1556 0.0396 0.9996 0.1532 0.0408 1 0.1416 0.042 1 0.1452 0.042 1 0.1448 0.0444 1 = 0.1 Fixed-b Andrews Bartlett QS Bartlett 0.3504 0.1736 0.5964 0.2204 0.0892 0.6408 0.1764 0.068 0.7052 0.1628 0.0576 0.772 0.1504 0.054 0.8152 0.132 0.0528 0.9236 0.1332 0.0448 0.9712 0.132 0.0352 0.9864 0.1288 0.0352 0.9924 0.1364 0.0388 0.9968 0.1332 0.0392 0.9992 0.1372 0.0412 0.9996 0.1292 0.0424 0.9996 0.1296 0.0408 1 = 0.2 Fixed-b Andrews Bartlett QS Bartlett 0.2788 0.1832 0.3852 0.192 0.1188 0.3712 0.156 0.0928 0.4092 0.1492 0.0812 0.478 0.1392 0.0696 0.5344 0.1232 0.0476 0.7312 0.118 0.0452 0.8332 0.122 0.036 0.8908 0.1232 0.036 0.924 0.126 0.038 0.9536 0.1164 0.0376 0.9736 0.1192 0.0388 0.9856 0.1204 0.0388 0.9936 0.1244 0.04 0.9968 65 Table 1.31: The Finite Sample Size of the MeanW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.162 0.148 0.290 0.175 0.190 0.446 0.182 0.202 0.586 0.192 0.215 0.686 0.190 0.226 0.759 0.204 0.194 0.941 0.209 0.165 0.984 0.208 0.132 0.996 0.216 0.120 0.999 0.212 0.112 1.000 0.216 0.110 1.000 0.216 0.102 1.000 0.218 0.097 1.000 0.217 0.097 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.101 0.094 0.157 0.105 0.105 0.243 0.108 0.114 0.328 0.107 0.114 0.406 0.103 0.116 0.482 0.109 0.109 0.750 0.114 0.096 0.895 0.117 0.087 0.950 0.119 0.080 0.977 0.121 0.080 0.991 0.120 0.075 0.996 0.123 0.071 1.000 0.121 0.068 1.000 0.122 0.066 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.084 0.082 0.120 0.089 0.090 0.156 0.090 0.087 0.201 0.090 0.088 0.243 0.086 0.081 0.291 0.086 0.079 0.518 0.081 0.068 0.674 0.086 0.069 0.771 0.089 0.065 0.850 0.087 0.058 0.902 0.084 0.059 0.931 0.086 0.058 0.955 0.088 0.056 0.971 0.087 0.055 0.982 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.064 0.066 0.134 0.063 0.066 0.248 0.060 0.069 0.363 0.063 0.068 0.471 0.064 0.070 0.570 0.069 0.066 0.853 0.071 0.065 0.956 0.068 0.057 0.986 0.065 0.057 0.995 0.065 0.056 1.000 0.068 0.058 1.000 0.069 0.058 1.000 0.068 0.055 1.000 0.067 0.055 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.063 0.060 0.103 0.058 0.060 0.163 0.058 0.064 0.235 0.059 0.061 0.302 0.059 0.054 0.372 0.056 0.054 0.666 0.056 0.057 0.827 0.055 0.053 0.911 0.054 0.052 0.954 0.059 0.055 0.975 0.057 0.052 0.990 0.058 0.048 0.996 0.058 0.047 0.998 0.054 0.050 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.056 0.060 0.084 0.054 0.056 0.113 0.056 0.058 0.151 0.059 0.056 0.189 0.061 0.055 0.231 0.056 0.056 0.465 0.055 0.060 0.628 0.062 0.055 0.741 0.060 0.051 0.818 0.054 0.052 0.869 0.057 0.051 0.912 0.056 0.050 0.940 0.056 0.049 0.961 0.060 0.050 0.971 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.0576 0.0576 0.1296 0.0604 0.0616 0.224 0.0596 0.058 0.3244 0.0588 0.0584 0.4352 0.0576 0.06 0.5264 0.0556 0.0588 0.8344 0.0556 0.0512 0.942 0.0532 0.0536 0.974 0.0592 0.056 0.9912 0.0584 0.0516 0.9976 0.056 0.0492 0.9988 0.058 0.0516 1 0.0556 0.0516 1 0.056 0.0528 1 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.0508 0.0524 0.0988 0.0524 0.0532 0.1516 0.0516 0.0484 0.2188 0.0504 0.05 0.2896 0.0504 0.0484 0.352 0.052 0.0488 0.6536 0.0492 0.046 0.8236 0.0472 0.0472 0.9012 0.0544 0.0508 0.9424 0.052 0.0508 0.9696 0.0512 0.0504 0.9868 0.0496 0.048 0.9956 0.0516 0.05 0.9984 0.0512 0.0468 1 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.0468 0.0476 0.0812 0.0476 0.05 0.112 0.0496 0.0484 0.1508 0.0504 0.05 0.1904 0.0512 0.0464 0.2332 0.048 0.0428 0.4424 0.0508 0.0492 0.6132 0.048 0.0496 0.736 0.0504 0.052 0.8072 0.0476 0.0452 0.8616 0.0488 0.0432 0.9028 0.0472 0.0476 0.9328 0.0492 0.0472 0.958 0.0504 0.0476 0.9712 66 Table 1.32: The Finite Sample Size of the MeanW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.664 0.530 0.806 0.523 0.365 0.806 0.476 0.312 0.852 0.456 0.301 0.893 0.445 0.291 0.926 0.432 0.234 0.985 0.440 0.201 0.996 0.448 0.178 1.000 0.462 0.155 1.000 0.457 0.146 1.000 0.460 0.143 1.000 0.460 0.134 1.000 0.456 0.129 1.000 0.455 0.121 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.527 0.420 0.632 0.386 0.273 0.593 0.343 0.245 0.646 0.328 0.231 0.700 0.317 0.219 0.752 0.310 0.176 0.914 0.325 0.156 0.973 0.333 0.138 0.990 0.347 0.127 0.997 0.342 0.130 0.998 0.339 0.122 1.000 0.342 0.115 1.000 0.339 0.111 1.000 0.341 0.105 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.420 0.324 0.488 0.283 0.198 0.414 0.235 0.167 0.431 0.229 0.158 0.475 0.216 0.154 0.521 0.208 0.136 0.712 0.214 0.121 0.822 0.223 0.118 0.897 0.224 0.114 0.938 0.224 0.106 0.958 0.222 0.099 0.977 0.226 0.094 0.984 0.226 0.090 0.992 0.229 0.089 0.996 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.160 0.110 0.291 0.142 0.103 0.412 0.133 0.106 0.519 0.134 0.102 0.614 0.128 0.103 0.694 0.136 0.086 0.915 0.136 0.088 0.976 0.139 0.075 0.994 0.148 0.078 0.999 0.147 0.078 1.000 0.144 0.077 1.000 0.144 0.075 1.000 0.144 0.074 1.000 0.141 0.073 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.127 0.091 0.201 0.110 0.084 0.255 0.103 0.087 0.330 0.101 0.088 0.396 0.099 0.084 0.468 0.103 0.074 0.737 0.105 0.077 0.879 0.101 0.071 0.941 0.104 0.072 0.974 0.106 0.072 0.989 0.105 0.070 0.996 0.108 0.068 0.999 0.104 0.070 1.000 0.105 0.067 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.109 0.085 0.152 0.089 0.070 0.172 0.086 0.070 0.214 0.083 0.068 0.254 0.081 0.070 0.295 0.080 0.061 0.502 0.082 0.063 0.675 0.084 0.061 0.776 0.082 0.065 0.848 0.083 0.062 0.900 0.085 0.062 0.929 0.084 0.061 0.955 0.084 0.062 0.970 0.086 0.065 0.982 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.1044 0.082 0.202 0.0932 0.0784 0.3076 0.0952 0.0732 0.434 0.0928 0.0732 0.5352 0.0912 0.0696 0.6228 0.094 0.068 0.8792 0.0952 0.0688 0.9568 0.0956 0.0648 0.9844 0.0948 0.0616 0.996 0.0984 0.058 0.9984 0.0968 0.0608 1 0.0984 0.0604 1 0.0952 0.0616 1 0.096 0.062 1 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.086 0.0672 0.1416 0.0768 0.0632 0.1972 0.0768 0.0624 0.2692 0.0716 0.0588 0.3376 0.0676 0.0624 0.4068 0.0716 0.0636 0.6992 0.072 0.0624 0.8524 0.0744 0.0624 0.9208 0.0736 0.054 0.9552 0.0736 0.0532 0.9772 0.072 0.0536 0.9912 0.0728 0.0532 0.9972 0.072 0.054 0.9992 0.0716 0.0512 1 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.0732 0.0612 0.1096 0.0696 0.0628 0.1332 0.066 0.0552 0.1768 0.0628 0.0544 0.2144 0.0608 0.0528 0.2552 0.0588 0.0536 0.464 0.0608 0.062 0.6356 0.0664 0.0596 0.7512 0.0612 0.0616 0.822 0.0624 0.056 0.8664 0.0604 0.0552 0.9092 0.0592 0.054 0.938 0.0616 0.0548 0.9604 0.0628 0.0568 0.974 67 Table 1.33: The Finite Sample Size of the MeanW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.985 0.948 0.996 0.910 0.758 0.985 0.825 0.591 0.979 0.771 0.478 0.978 0.723 0.407 0.979 0.650 0.269 0.992 0.646 0.243 0.997 0.655 0.218 0.999 0.668 0.207 1.000 0.670 0.201 1.000 0.659 0.194 1.000 0.659 0.184 1.000 0.662 0.182 1.000 0.664 0.174 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.964 0.924 0.982 0.859 0.735 0.945 0.783 0.610 0.932 0.724 0.513 0.937 0.687 0.448 0.942 0.628 0.302 0.976 0.619 0.260 0.991 0.634 0.233 0.997 0.641 0.228 0.998 0.642 0.215 1.000 0.638 0.204 1.000 0.631 0.195 1.000 0.640 0.189 1.000 0.641 0.181 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.916 0.854 0.943 0.772 0.658 0.858 0.680 0.538 0.824 0.627 0.462 0.822 0.586 0.409 0.838 0.533 0.302 0.912 0.543 0.252 0.955 0.545 0.235 0.975 0.552 0.223 0.988 0.546 0.207 0.996 0.537 0.194 0.998 0.546 0.180 0.998 0.546 0.175 0.998 0.547 0.171 0.999 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.653 0.451 0.788 0.458 0.241 0.761 0.382 0.181 0.794 0.342 0.160 0.833 0.324 0.145 0.868 0.296 0.118 0.963 0.312 0.108 0.992 0.306 0.107 0.998 0.316 0.099 1.000 0.320 0.097 1.000 0.310 0.095 1.000 0.310 0.094 1.000 0.306 0.093 1.000 0.304 0.091 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.532 0.390 0.645 0.360 0.225 0.568 0.302 0.179 0.595 0.275 0.156 0.642 0.261 0.140 0.691 0.248 0.118 0.872 0.257 0.112 0.952 0.263 0.111 0.980 0.268 0.102 0.992 0.265 0.099 0.995 0.261 0.094 0.998 0.267 0.093 1.000 0.271 0.092 1.000 0.270 0.091 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.429 0.318 0.502 0.284 0.193 0.404 0.235 0.157 0.408 0.217 0.137 0.432 0.206 0.124 0.469 0.188 0.104 0.656 0.198 0.110 0.792 0.204 0.108 0.867 0.204 0.105 0.918 0.203 0.094 0.947 0.204 0.089 0.968 0.200 0.088 0.983 0.204 0.089 0.989 0.204 0.084 0.992 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.358 0.22 0.5164 0.2628 0.1344 0.5592 0.2288 0.1116 0.6384 0.2136 0.1004 0.7148 0.2052 0.0948 0.7752 0.1968 0.0804 0.9344 0.2072 0.0792 0.9848 0.2072 0.0716 0.996 0.2116 0.068 0.998 0.2176 0.068 1 0.2136 0.07 1 0.2108 0.072 1 0.2116 0.0684 1 0.2064 0.0728 1 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.282 0.1912 0.3736 0.202 0.1224 0.3712 0.1768 0.1068 0.4324 0.1612 0.092 0.4948 0.1544 0.0892 0.5624 0.1508 0.0828 0.7948 0.1584 0.0772 0.906 0.1576 0.0664 0.9564 0.1604 0.0632 0.9788 0.1632 0.0664 0.9908 0.1596 0.068 0.998 0.1632 0.0664 0.9996 0.1608 0.0688 0.9996 0.162 0.0672 1 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.2272 0.154 0.2804 0.1576 0.1032 0.2536 0.1328 0.0912 0.2776 0.1228 0.0852 0.312 0.1208 0.074 0.362 0.1104 0.0672 0.5728 0.1204 0.0724 0.7132 0.1232 0.0676 0.8116 0.1264 0.0652 0.8716 0.1228 0.06 0.914 0.118 0.06 0.9404 0.1224 0.0584 0.9596 0.1248 0.0564 0.9724 0.1212 0.0588 0.9856 68 Table 1.34: The Finite Sample Size of the ExpW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP A: (θ, ρ, ϕ) = (0.5, 0.0, 0.0) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.368 0.198 0.712 0.316 0.201 0.839 0.301 0.218 0.900 0.303 0.221 0.938 0.291 0.217 0.956 0.271 0.154 0.994 0.271 0.084 0.999 0.266 0.048 1.000 0.256 0.036 1.000 0.253 0.026 1.000 0.254 0.024 1.000 0.246 0.020 1.000 0.250 0.015 1.000 0.254 0.012 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.178 0.135 0.348 0.160 0.113 0.494 0.150 0.113 0.600 0.144 0.114 0.694 0.136 0.113 0.760 0.137 0.083 0.915 0.134 0.051 0.967 0.129 0.031 0.986 0.126 0.023 0.994 0.124 0.017 0.998 0.124 0.016 1.000 0.125 0.014 1.000 0.126 0.010 1.000 0.124 0.007 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.108 0.104 0.179 0.114 0.108 0.248 0.111 0.094 0.314 0.111 0.090 0.386 0.104 0.080 0.454 0.090 0.058 0.690 0.089 0.043 0.817 0.086 0.028 0.889 0.086 0.019 0.929 0.084 0.017 0.954 0.082 0.015 0.976 0.082 0.012 0.987 0.084 0.010 0.993 0.082 0.009 0.996 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.094 0.086 0.404 0.092 0.082 0.657 0.090 0.083 0.774 0.090 0.079 0.843 0.086 0.078 0.888 0.082 0.063 0.978 0.082 0.056 0.994 0.081 0.047 0.999 0.083 0.046 1.000 0.081 0.034 1.000 0.081 0.026 1.000 0.079 0.025 1.000 0.080 0.026 1.000 0.082 0.026 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.070 0.072 0.196 0.070 0.064 0.345 0.072 0.064 0.470 0.068 0.060 0.572 0.064 0.058 0.655 0.060 0.053 0.866 0.060 0.052 0.946 0.058 0.041 0.974 0.058 0.038 0.987 0.059 0.034 0.995 0.062 0.028 1.000 0.059 0.027 1.000 0.059 0.025 1.000 0.062 0.024 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.062 0.064 0.115 0.060 0.058 0.177 0.058 0.057 0.247 0.061 0.058 0.308 0.060 0.055 0.367 0.055 0.044 0.632 0.062 0.052 0.779 0.057 0.050 0.860 0.055 0.042 0.906 0.050 0.041 0.941 0.056 0.035 0.960 0.055 0.031 0.975 0.051 0.030 0.987 0.055 0.028 0.992 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.0776 0.0776 0.3416 0.076 0.072 0.5936 0.0784 0.0716 0.736 0.076 0.0592 0.8136 0.0692 0.0664 0.8692 0.0696 0.0584 0.9728 0.0692 0.0472 0.9928 0.07 0.0448 0.9984 0.0712 0.0468 0.9996 0.072 0.0456 1 0.0724 0.0472 1 0.0668 0.0436 1 0.07 0.04 1 0.0708 0.04 1 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.0536 0.0488 0.1668 0.0544 0.0552 0.308 0.0528 0.0508 0.4544 0.0464 0.0484 0.5532 0.0472 0.0536 0.6348 0.0496 0.0556 0.8584 0.0492 0.0476 0.93 0.0448 0.0464 0.9684 0.0512 0.0452 0.9832 0.0468 0.0484 0.992 0.0464 0.0492 0.9964 0.0544 0.0484 0.9996 0.0508 0.0456 0.9996 0.052 0.044 1 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.0592 0.0572 0.1128 0.06 0.0604 0.174 0.0592 0.056 0.2368 0.0592 0.0564 0.298 0.056 0.052 0.3624 0.0488 0.0472 0.62 0.0488 0.0436 0.7656 0.0504 0.05 0.8516 0.0544 0.0548 0.8992 0.0464 0.05 0.9288 0.0504 0.0468 0.956 0.0476 0.0468 0.9768 0.0484 0.0456 0.9872 0.048 0.0424 0.9928 69 Table 1.35: The Finite Sample Size of the ExpW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP D: (θ, ρ, ϕ) = (0.8, 0.5, 0.5) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.708 0.333 0.954 0.517 0.183 0.960 0.435 0.172 0.975 0.396 0.164 0.982 0.372 0.165 0.986 0.324 0.116 0.997 0.315 0.072 1.000 0.307 0.044 1.000 0.309 0.034 1.000 0.312 0.024 1.000 0.315 0.020 1.000 0.310 0.021 1.000 0.308 0.021 1.000 0.313 0.020 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.646 0.452 0.818 0.467 0.226 0.817 0.398 0.182 0.861 0.353 0.164 0.896 0.332 0.159 0.917 0.300 0.102 0.973 0.283 0.062 0.994 0.280 0.036 0.998 0.281 0.026 1.000 0.281 0.023 1.000 0.273 0.020 1.000 0.272 0.021 1.000 0.270 0.020 1.000 0.278 0.021 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.518 0.398 0.627 0.360 0.243 0.568 0.310 0.195 0.600 0.292 0.163 0.651 0.275 0.139 0.689 0.235 0.079 0.844 0.225 0.054 0.916 0.226 0.037 0.954 0.219 0.034 0.970 0.216 0.028 0.982 0.220 0.024 0.990 0.215 0.020 0.996 0.218 0.020 0.999 0.214 0.018 1.000 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.260 0.113 0.647 0.213 0.100 0.794 0.193 0.097 0.868 0.185 0.088 0.912 0.172 0.090 0.946 0.148 0.074 0.990 0.149 0.054 0.999 0.141 0.036 1.000 0.144 0.037 1.000 0.146 0.032 1.000 0.147 0.031 1.000 0.144 0.032 1.000 0.148 0.031 1.000 0.150 0.032 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.188 0.122 0.368 0.158 0.087 0.485 0.143 0.085 0.590 0.135 0.081 0.668 0.133 0.078 0.737 0.115 0.058 0.911 0.113 0.050 0.973 0.107 0.036 0.988 0.109 0.034 0.995 0.107 0.038 0.999 0.104 0.033 1.000 0.106 0.034 1.000 0.106 0.030 1.000 0.108 0.031 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.132 0.094 0.211 0.110 0.084 0.253 0.104 0.080 0.318 0.102 0.076 0.382 0.096 0.072 0.446 0.092 0.051 0.683 0.088 0.044 0.822 0.089 0.041 0.893 0.085 0.041 0.925 0.084 0.038 0.954 0.088 0.036 0.972 0.091 0.033 0.985 0.089 0.030 0.991 0.087 0.030 0.997 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.1652 0.0948 0.5116 0.1444 0.0892 0.7064 0.1344 0.084 0.8044 0.1312 0.0796 0.8684 0.1224 0.0676 0.9048 0.1184 0.0648 0.9792 0.1048 0.0512 0.996 0.1072 0.0436 0.9988 0.106 0.0424 0.9996 0.1056 0.042 1 0.1088 0.0412 1 0.1048 0.0416 1 0.1072 0.0404 1 0.1088 0.0404 1 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.1212 0.0824 0.2648 0.108 0.0648 0.3912 0.0948 0.0564 0.52 0.0912 0.0572 0.6192 0.0868 0.0568 0.6972 0.0844 0.052 0.8804 0.0744 0.058 0.9476 0.0764 0.046 0.9788 0.0768 0.04 0.9896 0.0776 0.0392 0.9952 0.0808 0.0392 0.9988 0.0808 0.0404 0.9996 0.082 0.0404 1 0.0788 0.0404 1 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.0924 0.0708 0.1576 0.0848 0.0712 0.2124 0.0792 0.0712 0.278 0.0824 0.0688 0.3432 0.0784 0.06 0.3988 0.064 0.0504 0.64 0.0656 0.0544 0.78 0.0656 0.0516 0.8604 0.0656 0.0512 0.902 0.0692 0.0436 0.9336 0.0692 0.0412 0.9636 0.0664 0.0408 0.9808 0.0672 0.0364 0.9888 0.0668 0.0396 0.9932 70 Table 1.36: The Finite Sample Size of the ExpW ( F) Test with 5% Nominal Size H0 : No Structural Change in both β 1 and β 2 , DGP F: (θ, ρ, ϕ) = (0.9, 0.9, 0.9) T=100 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=500 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 T=1000 kernel b = 0.02 0.04 0.06 0.08 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.958 0.624 0.999 0.719 0.212 0.995 0.544 0.143 0.993 0.445 0.106 0.992 0.370 0.095 0.993 0.282 0.074 0.996 0.296 0.060 0.998 0.307 0.045 1.000 0.313 0.036 1.000 0.304 0.034 1.000 0.300 0.033 1.000 0.297 0.030 1.000 0.296 0.026 1.000 0.301 0.026 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.975 0.909 0.994 0.850 0.514 0.981 0.721 0.323 0.977 0.624 0.238 0.977 0.558 0.191 0.981 0.465 0.123 0.992 0.458 0.081 0.997 0.456 0.057 0.998 0.456 0.046 1.000 0.456 0.038 1.000 0.451 0.036 1.000 0.449 0.034 1.000 0.451 0.033 1.000 0.452 0.030 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.951 0.902 0.971 0.826 0.689 0.925 0.729 0.522 0.910 0.665 0.402 0.908 0.617 0.327 0.918 0.529 0.164 0.956 0.519 0.105 0.980 0.505 0.082 0.991 0.502 0.064 0.996 0.507 0.052 0.998 0.507 0.042 0.999 0.500 0.038 0.999 0.502 0.035 1.000 0.498 0.032 1.000 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.634 0.204 0.942 0.375 0.096 0.938 0.292 0.076 0.952 0.251 0.070 0.965 0.220 0.060 0.970 0.188 0.054 0.992 0.181 0.050 0.999 0.181 0.040 1.000 0.177 0.041 1.000 0.182 0.039 1.000 0.178 0.042 1.000 0.172 0.038 1.000 0.174 0.038 1.000 0.174 0.038 1.000 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.635 0.392 0.823 0.411 0.154 0.785 0.330 0.112 0.817 0.280 0.093 0.850 0.252 0.085 0.878 0.218 0.062 0.956 0.206 0.056 0.986 0.202 0.050 0.995 0.205 0.043 0.999 0.196 0.042 1.000 0.200 0.044 1.000 0.199 0.042 1.000 0.201 0.040 1.000 0.202 0.039 1.000 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.517 0.372 0.641 0.328 0.206 0.542 0.269 0.150 0.554 0.242 0.121 0.596 0.218 0.102 0.637 0.188 0.068 0.807 0.190 0.062 0.885 0.188 0.059 0.933 0.183 0.050 0.961 0.182 0.048 0.976 0.179 0.044 0.988 0.180 0.043 0.993 0.175 0.043 0.996 0.177 0.044 0.998 = 0.05 Fixed-b AP Bartlett QS Bartlett 0.426 0.1272 0.8092 0.2788 0.0728 0.8676 0.2228 0.0684 0.9092 0.198 0.0568 0.9348 0.1764 0.052 0.9592 0.1528 0.046 0.9916 0.1488 0.0444 0.9972 0.1472 0.0424 0.9996 0.1532 0.04 1 0.1564 0.0396 1 0.1536 0.0408 1 0.1416 0.042 1 0.1456 0.042 1 0.1448 0.0444 1 = 0.1 Fixed-b AP Bartlett QS Bartlett 0.3728 0.1908 0.5816 0.236 0.0948 0.626 0.1928 0.0704 0.6932 0.1712 0.0588 0.7588 0.1572 0.0544 0.8084 0.1364 0.0528 0.9288 0.1368 0.0448 0.9696 0.1312 0.0352 0.9884 0.1292 0.0352 0.9956 0.1356 0.0388 0.998 0.134 0.0392 0.9996 0.1384 0.0412 0.9996 0.1308 0.0424 1 0.1292 0.0408 1 = 0.2 Fixed-b AP Bartlett QS Bartlett 0.2912 0.1968 0.3884 0.1944 0.1196 0.3728 0.1612 0.0968 0.4176 0.1512 0.088 0.486 0.1436 0.074 0.536 0.1248 0.0476 0.7312 0.1212 0.0452 0.8356 0.1236 0.036 0.8956 0.1248 0.036 0.9328 0.1264 0.038 0.9576 0.1196 0.0376 0.9768 0.1184 0.0388 0.99 0.1208 0.0388 0.9944 0.1248 0.04 0.9972 71 APPENDIX 72 Appendix for Chapter 1 The following expression is a general representation of the HAC estimators: Ω=T −1 T T ∑∑K t =1 s =1 |t − s| M vt vs . This representation can be rewritten in terms of the partial sum processes St = ∑tj=1 v j following Kiefer and Vogelsang (2005) and Hashimzade and Vogelsang (2008) as follows. Let M = bT. Then for the kernels in Class 1, Ω = T −2 T −1 T −1 ∑ ∑ T −1/2 St T 2 ∆2t,s T −1/2 Ss , (1.40) t =1 s =1 where ∆2t,s ≡ (Kt,s − Kt,s+1 ) − (Kt+1,s − Kt+1,s+1 ) , |t − s| . bT Kt,s = K For the Class 2 kernel (Bartlett), Ω= 2 bT T −1 ∑ T −1 S t S t − t =1 1 bT T − M −1 ∑ T −1 St+bT St + T −1 St St+bT . (1.41) t =1 For the kernels in Class 3, Ω = T −2 ∑∑ T −1 St T 2 ∆2t,s Ss (1.42) |t−s| λ,   0 Λ 0   T −1/2 S[rT ] ⇒   λ Wp (r ) − Wp (λ) − 1r− 0 Λ − λ Wp (1) − Wp ( λ )   . Thus one can rewrite this result by using indicator functions as    (1) 0   Fp (r, λ)   Λ Λ 0  T −1/2 S[rT ] ⇒   Fp (r, λ) ,  ≡ (2) 0 Λ Fp (r, λ) 0 Λ where (1) Fp (r, λ) = Wp (r ) − r Wp ( λ ) · 1 (r ≤ λ ), λ and (2) Fp (r, λ) = Wp (r ) − Wp ( λ ) − r−λ Wp (1) − Wp ( λ ) 1−λ · 1 (r > λ ). Proof of Proposition 2: From (1.18),   Ω( F) =    = λ Σ (1) T −1 ∑tT=λT +1 ∑λT s =1 K T −1 |t−s| M (2) (1) vt vs λ Σ (1) T −1 ∑tT=λT +1 ∑λT s =1 K xt ut us xs |t−s| M ( 1 − λ ) Σ (2) T −1 |t−s| M T ∑λT t=1 ∑s=λT +1 K T ∑λT t=1 ∑s=λT +1 K |t−s| M ( 1 − λ ) Σ (2) (1) (2) vt vs     xt ut us xs  . The diagonal blocks converge respectively to λΣ and (1 − λ)Σ in probability under traditional asymptotics. One needs to show that the off-diagonal blocks converge to zero. 74 |t−s| M First, for the Bartlett K = 0 for |t − s| ≥ M. Therefore, the upper right off-diagonal block can be rewritten as ( F) Ω1,2 ≡ 1 T |t−s| M where Kt,s = K T λT ∑ ∑ (1) (2) Kt,s vt vs = t=1 s=λT +1 (1) . Set at = vt 1 T λT ∑ vt T ∑ (1) t =1 (2) Kt,s vs , s=λT +1 (2) and bt = ∑sT=λT +1 Kt,s vs and apply the partial summation formula λT ∑ a t bt = t =1 λT −1 t ∑ ∑ t =1 λT ( bt − bt + 1 ) + as s =1 ∑ as bλT s =1 to get 1 T λT ∑ vt T (1) t =1 ∑ (2) Kt,s vs = s=λT +1 1 T λT −1 ∑ t =1 (1) + SλT = (1) where St (1) = ∑tj=1 v j (1) 1 T T ∑ (1) St (2) Kt,s vs T s=λT +1 T ∑ ∑ − (2) Kt+1,s vs (1.43) s=λT +1 (2) KλT,s vs s=λT +1 λT −1 ∑ T ∑ (1) St t =1 (2) Kt,s vs s=λT +1 (2) for t ≤ λT and St T − ∑ (2) Kt+1,s vs , s=λT +1 (1) = ∑tj=λT +1 v j for t ≥ λT + 1. The above lines (2) used SλT = 0 and ST = 0. One can rewrite each component of the summands of this equation: T ∑ (2) Kt,s vs T −1 = s=λT +1 T ∑ (Kt,s − Kt,s+1 ) Ss ∑ (Kt,s − Kt,s+1 ) Ss , ∑ (Kt+1,s − Kt+1,s+1 ) Ss ∑ (Kt+1,s − Kt+1,s+1 ) Ss (2) s=λT +1 T −1 = and ∑ (2) Kt+1,s vs s=λT +1 (2) s=λT +1 T −1 = (2) (2) s=λT +1 T −1 = (2) + Kt,T ST − Kt,λT +1 SλT (2) s=λT +1 75 (2) (2) + Kt+1,T ST − Kt+1,λT +1 SλT Plugging these representations into (1.43) gives ( F) Ω1,2 1 = T λT −1 ∑ T −1 ∑ (1) (2) St (Kt,s − Kt,s+1 − Kt+1,s + Kt+1,s+1 ) Ss . t=1 s=λT +1 From Hashimzade and Vogelsang (2008) for the Bartlett kernel, Kt,s − Kt,s+1 − Kt+1,s + 1 for {(t, s) : M + t = s} and zero otherwise. This yields Kt+1,s+1 = − M ( F) Ω1,2 = 1 T λT −1 ∑ t=λT − M+1 −1 (1) (2) St St + M . M Taking the matrix-maximum norm yields ( F) Ω1,2 λT −1 1 (1) (2) T −1/2 St · T −1/2 St+ M ≤ ∑ TM t=λT − M+1 M−1 (1) (2) max T −1/2 St · max T −1/2 Ss TM λT − M+1≤t≤λT −1 λT +1≤s≤λT −1+ M 1 = O p (1) = o p (1). T ≤ Here we used (1) = O p (1), (2) = O p (1), max T −1/2 St max T −1/2 Ss λT − M+1≤t≤λT −1 and λT +1≤s≤λT −1+ M which is true given the result in Proposition 1 and the traditional assumption that M/T → 0. The same argument applies to the left lower off-diagonal block of Ω( F) yielding the desired result. Proof of Lemma 1: Plugging the limit of the partial sum process in Proposition 1 into the HAC estimators in (1.40), (1.41), or (1.42) the desired result follows from direct application of the continuous mapping theorem to obtain the desired result in (1.19). 76 Proof of Lemma 2: The proof is for the Bartlett kernel. Recall from (1.16) that 1 λT Σ (1) = [λT ] [λT ] |t − s| M1 ∑ ∑K t =1 s =1 (1) (1) vt vs . With the Bartlett kernel, one can rewrite this as: Σ 1 1 − b1 λT (1) [(1−b1 )λT ]−1 with b1 = ∑ [λT ]−1 ∑ (1) t =1 (1) (1) and St (1) (λT )−1/2 St (λT )−1/2 St (1) (λT )−1/2 St+b1 λT (λT )−1/2 St t =1 M1 λT 2 1 = b1 λT (1) (1) + (λT )−1/2 St (λT )−1/2 St+b1 λT (1) = ∑tj=1 v j . Apply the continuous mapping theorem using the result on the limit of the partial sum process in Proposition 1 to obtain Σ (1) ⇒ − − 1 b1 λ 1 b1 λ 1 2 b1 λ 0 (1−b1 )λ 0 (1−b1 )λ 0 (1) (1) λ−1/2 ΛFp (r, λ) Fp (r, λ) λ−1/2 Λ dr (1) (1) λ−1/2 ΛFp (r + b1 λ, λ) Fp (r, λ) λ−1/2 Λ dr (1) (1) λ−1/2 ΛFp (r, λ) Fp (r + b1 λ, λ) λ−1/2 Λ dr  = Λ (1) P b1 λ, Fp (r, λ) λ  Λ by recalling (1.4). One can easily get the limit of Σ(2) in the same way and this completes the proof. Proof of Theorem 1: Recall that Wald( F) = T R β 1 ( F ) −1 Qλ R RQ− λ Ω 77 −1 Rβ . , Using R = ( R1 , − R1 ) it follows that H0 d RT 1/2 β − β = R1 T 1/2 β 1 − β 1 − T 1/2 β 2 − β 2 R 1 Q −1 Λ 1 1 Wp ( λ ) − Wp (1) − Wp ( λ ) λ 1−λ → . Using Assumption 1 and Lemma 1, 1 ( F ) −1 RQ− Qλ R ⇒ λ Ω × −1 1 R1 Q−1 Λ, R1 Q−1 Λ × P b, Fp (r, λ) λ 1−λ −1 1 R1 Q−1 Λ, R 1 Q −1 Λ λ 1−λ (1) By writing out P b, Fp (r, λ) using Fp (r, λ) = . (2) Fp (r, λ) , Fp (r, λ) , one can obtain the following expression for above limit after some algebra: R1 Q−1 ΛP b, 1 1 (1) (2) Fp (r, λ) − Fp (r, λ) Λ Q−1 R1 . λ 1−λ Now apply the transformation d R1 Q−1 ΛWp (r ) = AWl (r ), with R1 Q−1 ΛΛ Q−1 R1 = AA , and conclude 1 ( F ) −1 RQ− Qλ R ⇒ AP b, λ Ω 1 1 (1) (2) Fl (r, λ) − F (r, λ) λ 1−λ l 78 A, giving the needed result: 1 1 Wl (λ) − (Wl (1) − Wl (λ)) λ 1−λ Wald( F) ⇒ −1 1 1 (1) (2) Fl (r, λ) × P b, Fl (r, λ) − λ 1−λ 1 1 Wl (λ) − (Wl (1) − Wl (λ)) . λ 1−λ × Proof of Corollary 2: Consider the Bartlerr kernel case. Other cases can be proved in (1) the same way. Look at P b, λ1 Fl (1) definitions of Fl (2) (r, λ) and Fl P b, − − 1 (1) 1 (2) Fl (r, λ) − F (r, λ) λ 1−λ l 1 (1) 1 (2) Fl (r, λ) − Fl (r, λ) λ 1−λ 1 (1) 1 (2) Fl (r, λ) − Fl (r, λ) λ 1−λ 0 1− b 1 1 (1) (2) Fl (r + b, λ) − Fl (r + b, λ) λ 1−λ 0 (1) 2 b − + 1− b 1 b 0 1 2 b 0 1 1 (2) (1 − λ ) (2) + Fl F 2 l F 2 l (2) (r, λ) Fl (2) (r + b, λ) Fl (1) (r, λ) Fl 0 (2) (r, λ) × Fl dr. (r, λ) = 0 as 1 (1) (1) Fl (r, λ) Fl (r, λ) dr 2 λ (r, λ) dr − (1) 1− b 1 b (r, λ) dr dr + (r + b, λ) + Fl dr 1 (1) 1 (2) Fl (r, λ) − Fl (r, λ) λ 1−λ 1 (1) 1 (1) (1) (1) Fl (r, λ) Fl (r + b, λ) + 2 Fl (r + b, λ) Fl (r, λ) 2 λ λ (2) (1 − λ ) 1 dr 1 (1) 1 (2) Fl (r + b, λ) − Fl (r + b, λ) λ 1−λ Rewrite the above expression by noting that Fl + (r, λ) in (1.24). Rewrite it using the (r, λ) in Lemma 1. 1 1 (1) (2) Fl (r, λ) − Fl (r, λ) λ 1−λ 0 1− b 1 b 1 b 1 2 b = (2) (r, λ) − 1−1 λ Fl (1 − λ ) 0 1− b 1 b 0 (2) (r + b, λ) Fl 79 1 (2) F 2 l (2) (r, λ) Fl dr (r + b, λ) 1 (1) (2) Fl (r, λ) Fl (r + b, λ) λ (1 − λ ) (2) (r, λ) + Fl (1) (r + b, λ) Fl (r, λ) dr. (1) By reminding that λ is given fixed and noticing Fl (r, λ) (2) 0≤r ≤ λ and Fl (r, λ) λ ≤r ≤1 are independent, one can still preserve the distribution of the above expression under the following transformations √ √ λWl∗ r λ d (1) = Wl (r ) for W (·) in Fl (r, λ), 1 − λWl∗∗ r−λ 1−λ Then apply change of variables: d (2) = Wl (r ) − Wl (λ) for W (·) in Fl (r, λ). r λ = u for the first two integrals and r −λ 1− λ = v for the next two integrals. Finally, notice the term outside the inverse in equation (1.24) is independent of the term inside the inverse. So one can apply the following separate transformation to the outside term: 1 1 d Wl (λ) − (Wl (1) − Wl (λ)) = λ 1−λ 1 λ (1 − λ ) Wl (1), which yields the desired result. Proof of Theorem 2: Using Lemma 2 and the transformation shown in the proof of Theorem 1, the result in Theorem 2 immediately follows. Proof of Corollary 3: Proof is similar to the proof of Corollary 2. Proof of Proposition 3: Frisch-Waugh Theorem gives T β= −1 ∑ Xt Xt t =1 T = −1 ∑ Xt Xt T ∑ Xt Xt t =1 (1.44) t =1 T ∑ Xt Xt β + t =1 = T ∑ Xt y t −1 T ∑ Xt u t − t =1 t =1 T T ∑ Xt Xt β + t =1 ∑ Xt u t t =1 80 , T ∑ X t z t ( Z Z ) −1 Z u t =1 and it immediately follows that T 1/2 β−β = T −1 −1 T ∑ Xt Xt T −1/2 = T . (1.45) −1 T ∑ ∑ Xt u t t =1 t =1 −1 T Xt − X Z ( Z Z ) −1 Xt − z t ( Z Z ) zt −1 ZX (1.46) t =1 × T −1/2 T ∑ X t − X Z ( Z Z ) −1 z t u t . (1.47) t =1 Under Assumption 1’ and 2’ it follows in a straightforward manner that  √  T ( β − β ) ⇒ Q −1  XX  1 Λ1 Wp+q (λ) − λQ xZ Q− ZZ Λ2 Wp+q (1) 1 Λ1 Wp+q (1) − Wp+q (λ) − (1 − λ) Q xZ Q− ZZ Λ2 Wp+q (1) In order to derive the limit of √ T ( R β − r ) following immediate results are useful:  T −1 Q XZ ≡ p lim T ∑ Xt z t t =1  =  λQ xZ (1 − λ) Q xZ T −1 T ∑ Xt Xt t =1 Q X X = p lim T −1 ,   2p×q   Q XX ≡ p lim   (1.48) 0  λQ xx =  0 (1 − λ) Q xx T ∑ Xt Xt , 2p×2p 1 = Q XX − Q XZ Q− ZZ Q XZ . t =1 Also by recalling from matrix algebra (see e.g. Schott (1997)) that 1 −1 −1 Q −1 = Q − XX + Q XX Q XZ Q ZZ − Q XZ Q XX Q XZ XX 81 −1 1 Q XZ Q− XX (1.49) and using (1.49) one can further show that   1 −1  λ Q xx Q −1 =  XX +P P 1 −1 1−λ Q xx P +P  , (1.50) where 1 −1 P = Q− xx Q xZ Q ZZ − Q xZ Q xx Q xZ −1 1 Q xZ Q− xx . Now plug (1.50) in (1.48) and conclude √ H0 T ( R β − r) = 1 ⇒ R1 Q − xx Λ1 √ TR β − β (1.51) 1 1 Wp + q ( λ ) + Wp + q ( λ ) − Wp + q (1) λ 1−λ . The following lemma is used in the proof of Lemma 3. 1 −1 −1 −1 Lemma 4. Let K = Q xZ Q− ZZ Q xZ . Then it holds that Q xx KP = P − Q xx KQ xx . Proof of Lemma 4: From equation (1.46),     −1 1 2 0 λ(1 − λ) Q xZ Q− λQ xx   λ Q xZ Q ZZ Q xZ ZZ Q xZ  QX X =  −   . −1 −1 2 0 (1 − λ) Q xx λ(1 − λ) Q xZ Q ZZ Q xZ (1 − λ) Q xZ Q ZZ Q xZ The desired result comes from the identity Q X X Q−1 = I by substituting equation (1.50) XX for Q−1 . XX Proof of Lemma 3: First note that implicit in the proof of Proposition 3 is the result that p lim Q−1 = Q−1 . For R = ( R1 , − R1 ) it follows that XX XX p lim R Q−1 = R1 XX 1 1 −1 1 Q xx , − Q− xx λ 1−λ 82 (1.52) using (1.50). The scaled partial sum process is given by T −1/2 S[rT ] = T −1/2 ξ =T −1/2 [rT ] [rT ] ∑ Xt u t − T ∑ Xt Xt −1 t =1 √ [rT ] ∑ Xt u t t =1 T ( β − β) − T −1 t =1 [rT ] ∑ Xt z t t =1 ZZ T −1 T −1/2 Z u . (1.53) For 0 ≤ r < λ, the first term in (1.53) satisfies T −1/2   −1 Λ1 Wp+q (r ) − λQ xZ Q ZZ Λ2 Wp+q (r ) [rT ] ∑ Xt u t ⇒  1 −(1 − λ) Q xZ Q− ZZ Λ2 Wp+q (r ) t =1 . (1.54) Hence with R = ( R1 , − R1 ) , from (1.52) and (1.54) it follows that R Q−1 T −1/2 XX  [rT ] ∑ Xt u t ⇒ R1 t =1  −1 Λ1 Wp+q (r ) − λQ xZ Q ZZ Λ2 Wp+q (r ) 1 1 −1 1 Q xx , − Q−  xx λ 1−λ =  1 −(1 − λ) Q xZ Q− ZZ Λ2 Wp+q (r ) 1 1 R1 Q − xx Λ1 Wp+q (r ). λ For the first part of the second term in (1.53) it follows that T −1   − [rT ]    −1 rλQ xZ Q ZZ Q xZ rQ xx 0 p× p  X X ⇒ − t ∑ t  t =1 0 p× p 0 p× p  1 rλQ xZ Q− ZZ Q xZ 1 r (1 − λ) Q xZ Q− ZZ Q xZ   1 r (1 − λ) Q xZ Q− ZZ Q xZ  0 p× p λ2 Q −1 xZ Q ZZ Q xZ 0 p× p   +r 1 0 p× p λ(1 − λ) Q xZ Q− ZZ Q xZ  0 p× p  1 λ(1 − λ) Q xZ Q− ZZ Q xZ  1 (1 − λ)2 Q xZ Q− ZZ Q xZ  2 2 rQ xx + (rλ − 2rλ)K −r (1 − λ) K  = , 2 2 −r (1 − λ ) K r (1 − λ ) K 83   1 where K = Q xZ Q− ZZ Q xZ . Hence with R = ( R1 , − R1 ) , R Q −1 T −1 XX  [rT ] ∑ Xt Xt ⇒ R1 t =1 + (rλ2 − 2rλ)K 1 −1 1 rQ xx 1 Q xx , − Q− × xx λ 1−λ −r (1 − λ )2 K = rR1 −r (1 − λ )2 K r (1 − λ )2 K λ − 1 −1 1 1 I − Q− Q xx K , xx K, λ λ which combined with (1.48) and Lemma 4 immediately yields R Q −1 XX T −1 [rT ] √ ∑ Xt Xt T ( β − β) ⇒ t =1   rR1 1 −1  λ Q xx 1 λ − 1 −1 1 I − Q− Q xx K ×  xx K, λ λ   × = P 1 −1 1−λ Q xx P +P 1 Λ1 Wp+q (1) − Wp+q (λ) − (1 − λ) Q xZ Q− ZZ Λ2 Wp+q (1) 1 −1 Q , 0 p× p λ2 xx  ×    1 Λ1 Wp+q (λ) − λQ xZ Q− ZZ Λ2 Wp+q (1)  = rR1 +P   1 Λ1 Wp+q (λ) − λQ xZ Q− ZZ Λ2 Wp+q (1) 1 Λ1 Wp+q (1) − Wp+q (λ) − (1 − λ) Q xZ Q− ZZ Λ2 Wp+q (1) r r 1 −1 −1 R Q Λ W ( λ ) − R1 Q − p + q 1 1 xx xx Q xZ Q ZZ Λ2 Wp+q (1). λ λ2 Finally, premultiplying the third term in (1.53) by R Q−1 gives XX R Q −1 T −1 XX = [rT ] ∑ Xt z t t =1 R Q −1 T −1 XX ZZ T −1 T −1/2 Z u [rT ] ∑ Xt − X Z ( Z Z ) t =1 84 −1 zt zt ZZ T −1 T −1/2 Z u        ⇒ R1 = 1 1 −1 1 Q xx , − Q− xx λ 1−λ   r (1 − λ) Q xZ  −1 ×  Q ZZ Λ2 Wp+q (1) −r (1 − λ) Q xZ r 1 −1 R1 Q − xx Q xZ Q ZZ Λ2 Wp+q (1). λ Combining the results for the three terms gives 1 r ξ 1 1 R Q−1 T −1/2 S[rT ] ⇒ R1 Q− Λ1 Wp + q (r ) − R1 Q − Λ1 2 Wp + q ( λ ) xx xx XX λ λ r r 1 −1 −1 −1 + R1 Q − xx Q xZ Q ZZ Λ2 Wp+q (1) − R1 Q xx Q xZ Q ZZ Λ2 Wp+q (1) λ λ 1 = R1 Q − xx Λ1 r 1 Wp + q (r ) − 2 Wp + q ( λ ) λ λ 1 (1) 1 = R1 Q − xx Λ1 Fp+q (r, λ ) . λ (1.55) Now consider λ ≤ r ≤ 1, for the first term in 1.53) it follows that T −1/2 [rT ]  ∑ Xt u t ⇒   t =1  1 Λ1 Wp+q (λ) − λQ xZ Q− ZZ Λ2 Wp+q (r ) 1 Λ1 Wp+q (r ) − Wp+q (λ) − (1 − λ) Q xZ Q− ZZ Λ2 Wp+q (r )  . Hence with R = ( R1 , − R1 ) , R Q−1 T −1/2 XX  R1 1 1 −1  Q xx , − Q −1 ×  λ 1 − λ xx Λ 1 = R1 Q − xx Λ1 [rT ] ∑ Xt u t ⇒ t =1 1 Λ1 Wp+q (λ) − λQ xZ Q− ZZ Λ2 Wp+q (r ) 1 Wp + q (r ) − Wp + q ( λ ) 1 − (1 − λ) Q xZ Q− ZZ Λ2 Wp+q (r ) 1 1 Wp + q ( λ ) − Wp + q (r ) − Wp + q ( λ ) λ 1−λ For the first part of the second term in (1.53) it follows that T −1 [rT ]  λQ xx ∑ Xt Xt ⇒  t =1  0 p× p 85 0 p× p (r − λ) Q xx   .   .   − λ2 Q  1 λ(1 − λ) Q xZ Q− ZZ Q xZ −1 1 λ(r − λ) Q xZ Q− ZZ Q xZ (r − λ )(1 − λ ) Q xZ Q ZZ Q xZ   − −1 xZ Q ZZ Q xZ λ2 Q −1 xZ Q ZZ Q xZ  1 λ(r − λ) Q xZ Q− ZZ Q xZ −1 1 λ(1 − λ) Q xZ Q− ZZ Q xZ (r − λ )(1 − λ ) Q xZ Q ZZ Q xZ   +r   λ2 Q    −1 xZ Q ZZ Q xZ 1 λ(1 − λ) Q xZ Q− ZZ Q xZ  1 λ(1 − λ) Q xZ Q− ZZ Q xZ − 2) λ2 K λQ xx + (r = (2 − r ) λ2 − λ K   1 (1 − λ)2 Q xZ Q− ZZ Q xZ (2 − r ) λ2   −λ K (r − λ) Q xx + ((r − 2) λ + r ) (λ − 1) K  . Hence with R = ( R1 , − R1 ) , R Q −1 T −1 XX  R1 [rT ] ∑ Xt Xt ⇒ t =1 − 2) λ2 K 1 1 −1 λQ xx + (r 1 Q xx , − Q− × xx λ 1−λ (2 − r ) λ2 − λ K = R1 I + (2 − r ) λ2 It directly follows that XX T −1 [rT ] ∑ Xt Xt t =1 86 √ −λ K (r − λ) Q xx + ((r − 2) λ + r ) (λ − 1) K r−λ λ ( r − 1 ) −1 1 Q xx K, − I + (r − 1) Q − xx K . 1−λ 1−λ R Q −1  T ( β − β) ⇒    I+ R1 1 −1  λ Q xx r−λ λ ( r − 1 ) −1 1 Q xx K, − I + (r − 1) Q − xx K ×  1−λ 1−λ   × P  1 Λ1 Wp+q (λ) − λQ xZ Q− ZZ Λ2 Wp+q (1) 1 Λ1 Wp+q (1) − Wp+q (λ) − (1 − λ) Q xZ Q− ZZ Λ2 Wp+q (1) = R1   ×  +P P 1 −1 1−λ Q xx   r−λ 1 −1 Q −1 Q xx , − λ (1 − λ)2 xx  1 Λ1 Wp+q (λ) − λQ xZ Q− ZZ Λ2 Wp+q (1) 1 Λ1 Wp+q (1) − Wp+q (λ) − (1 − λ) Q xZ Q− ZZ Λ2 Wp+q (1) 1 = R1 Q − xx Λ1   r−λ r−λ 1 Wp + q ( λ ) + Wp + q ( λ ) − Wp + q (1) 2 λ (1 − λ ) (1 − λ )2 1 −1 − R1 Q − xx Q xZ Q ZZ Λ2 1−r Wp + q (1). 1−λ Finally, premultiplying the third term in (1.53) by R Q−1 gives XX R Q −1 T −1 XX = [rT ] ∑ Xt z t t =1 R Q −1 T −1 XX ZZ T −1 T −1/2 Z u [rT ] ∑ Xt − X Z ( Z Z ) −1 t =1 zt zt ZZ T  ⇒ R1 = 1 −1 1 Q xx , − Q −1 λ 1 − λ xx −1 T −1/2 Z u   λ(1 − r ) Q xZ  −1 ×  Q ZZ Λ2 Wp+q (1) −λ(1 − r ) Q xZ 1−r −1 1 R1 Q − xx Q xZ Q ZZ Λ2 Wp+q (1). 1−λ Combining the results for the three terms gives 1 R Q−1 T −1/2 S[rT ] ⇒ R1 Q− xx Λ1 ξ XX 87 +P   × − 1 r−λ r−λ W (λ) + W (1) (W (r ) − W (λ)) − 2 1−λ (1 − λ ) (1 − λ )2 1 −1 + R1 Q − xx Q xZ Q ZZ Λ2 1 = R1 Q − xx Λ1 − 1−r 1−r 1 −1 W (1) − R1 Q − W (1) xx Q xZ Q ZZ Λ2 · 1−λ 1−λ 1 r−λ r−λ W (λ) + W (1) (W (r ) − W (λ)) − 2 1−λ (1 − λ ) (1 − λ )2 1 = − R1 Q − xx Λ1 · 1 (2) F (r, λ). 1 − λ p+q (1.56) By combining (1.55) and (1.56), for r ∈ [0.1], 1 R Q−1 T −1/2 S[Tr] ⇒ R1 Q− xx Λ1 ξ XX 1 1 (1) (2) Fp+q (r, λ) − Fp+q (r, λ) . λ 1−λ Proof of Theorem 3: To conserve on space the proof for this Theorem is provided only for the case of the Bartlett kernel with M = T (i.e. b = 1). However, the proof given here goes through for other kernels and different values of b. Note that with b = 1 the HAC estimator in equation (1.41) can be rewritten as ( F) Ω b =1 2 = T T −1 ∑ T −1/2 St T −1/2 St . ξ ξ t =1 With this HAC estimator the term within the inverse in (1.35) is given by  2 T T −1  ∑ R t =1 T −1 T −1 ∑ Xs Xs T −1/2 St T −1/2 St ξ T −1 ξ s =1 1 ⇒ P 1, R1 Q− xx Λ1 T ∑ Xs Xs s =1 −1 R    1 (1) 1 (2) Fp+q (r, λ) − Fp+q (r, λ) λ 1−λ where the limit is obtained directly from Lemma 3 and the continuous mapping theorem. The result for (1.35) can be obtained by using similar arguments as those used in Theorem d 1 1 where the transformation is used: R1 Q− xx Λ1 Wp+q (r ) = Ξ · Wl (r ), 0 ≤ r ≤ 1 for a p.d. 1 −1 matrix Ξ satisfying ΞΞ = R1 Q− xx Λ1 Λ1 Q xx R1 . l ×l 88 REFERENCES 89 REFERENCES Andrews, D. W. K.: (1991), ‘Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation’. Econometrica 59, 817–854. Andrews, D. W. K.: (1993), ‘Tests for Parameter Instability and Structural Change with Unknown Change Point’. Econometrica 61, 821–856. Andrews, D. W. K. and W. Ploberger: (1994), ‘Optimal Tests When a Nuisance Parameter is Present Only Under the Alternative’. Econometrica 62, 1383–1414. Bai, J. and P. Perron: (2006), ‘Multiple structural change models: a simulation analysis’. Econometric theory and practice: Frontiers of analysis and applied research pp. 212–237. Bai, J. S. and P. Perron: (1998), ‘Estimating and Testing Linear Models with Multiple Structural Breaks’. Econometrica 66, 47–78. Davidson, J.: (1994), Stochastic Limit Theory. New York: Oxford University Press. De Jong, R. M. and J. Davidson: (2000), ‘Consistency of Kernel Estimators of Heteroskedastic and Autocorrelated Covariance Matrices’. Econometrica 68, 407–424. Gonçalves, S. and T. J. Vogelsang: (2011), ‘Block Bootstrap HAC Robust Tests: The Sophistication of the Naive Bootstrap’. Econometric Theory 27(4), 745–791. Hansen, B. E.: (1992), ‘Consistent Covariance Matrix Estimation for Dependent Heterogenous Processes’. Econometrica 60, 967–972. Hashimzade, N. and T. J. Vogelsang: (2008), ‘Fixed-b Asymptotic Approximation of the Sampling Behavior of Nonparametric Spectral Density Estimators’. Journal of Time Series Analysis 29, 142–162. Jansson, M.: (2002), ‘Consistent Covariance Estimation for Linear Processes’. Econometric Theory 18, 1449–1459. Jansson, M.: (2004), ‘The Error Rejection Probability of Simple Autocorrelation Robust Tests’. Econometrica 72, 937–946. Kiefer, N. M. and T. J. Vogelsang: (2002a), ‘Heteroskedasticity-autocorrelation robust standard errors using the Bartlett kernel without truncation’. Econometrica 70, 2093–2095. Kiefer, N. M. and T. J. Vogelsang: (2002b), ‘Heteroskedasticity-Autocorrelation Robust Testing Using Bandwidth Equal to Sample Size’. Econometric Theory 18, 1350–1366. Kiefer, N. M. and T. J. Vogelsang: (2005), ‘A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests’. Econometric Theory 21, 1130–1164. 90 Newey, W. K. and K. D. West: (1987), ‘A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix’. Econometrica 55, 703–708. Perron, P.: (2006), ‘Dealing with structural breaks’. Palgrave handbook of econometrics 1, 278–352. Phillips, P. C. B. and S. N. Durlauf: (1986), ‘Multiple Regression with Integrated Processes’. Review of Economic Studies 53, 473–496. Phillips, P. C. B. and V. Solo: (1992), ‘Asymptotics for Linear Processes’. The Annals of Statistics 20, 971–1001. Schott, J. R.: (1997), ‘Matrix analysis for statistics’. A Wiley InterScience Publication, New York. Sun, Y.: (2013), ‘Fixed-smoothing Asymptotics in a Two-step GMM Framework’. Working Paper, Department of Economics, UCSD. Sun, Y., P. C. B. Phillips, and S. Jin: (2008), ‘Optimal Bandwidth Selection in Heteroskedasticity-Autocorrelation Robust Testing’. Econometrica 76, 175–194. Wooldridge, J. M. and H. White: (1988), ‘Some invariance principles and central limit theorems for dependent heterogeneous processes’. Econometric Theory pp. 210–230. 91 CHAPTER 2 A Test of Parameter Instability Allowing for Change in the Moments of Explanatory Variables 2.1 Introduction Most of the structural change literature addresses the instability of the parameters in the conditional model. The goal of this chapter is to develop a valid HAC-robust test when there is a shift in the mean and second moment of the explanatory variables. The break date, λ x T, for the mean and second moment is assumed to be different from the break date λT for the regression parameters. The bootstrap approach in Hansen (2000) considers a setup where general form of the structural change in the marginal distribution of the explanatory variables are allowed. But Hansen (2000) assumes that the error term is a martingale difference sequence with respect to a certain information set. Thus serial correlation of the product of x variables and the error, ut , is not allowed. This can be a limitation particularly in the static time series regression model where the series { xt ut }t=1,...,T often displays serial correlation. In this Chapter a valid HAC-robust approach for testing structural change in the regression slope/intercept parameters is developed. To ease the distribution theory a modified version of the standard set of high level conditions is introduced. To make the test robust to serial correlation and heteroskedasticity, a HAC estimator is used for constructing the test statistic and the fixed-b theory developed in Kiefer and Vogelsang (2005) is applied to derive the asymptotic distribution. The limiting distributions of the statistics are pivotal. The rest of the chapter is organized as follows. 92 Section 2.2 lays out the basic set up of the problem. Section 2.3 derives the limiting distributions of the appropriate test statistics under the fixed-b approach. Section 2.4 presents fixed-b critical values for certain break point value and bandwidths. The finite sample properties of the test is examined in Monte Carlo simulation experiments. Section 2.5 summarizes and concludes. Proofs are collected in the Appendix. 2.2 Model of Structural Change and Preliminary Results Suppose the univariate series { xt } has a mean shift at t = λ x T. Denote E ( x t ) = µi (t) , with µi (t)    µ1 for t ≤ λ x T . =   µ2 for t ≥ λ x T + 1 (2.1) The subscript i (t) indicates a regime for time t and µ1 is the mean in the first regime and µ2 is the mean in the second regime. Suppose λ x , µ1 , and µ2 are known. Now consider a simple time series regression model with a structural break given by y t = α 1 Dt + α 2 ( 1 − Dt ) + β 1 x t Dt + β 2 x t ( 1 − Dt ) + u t , (2.2) Dt = 1(t ≤ λT ), where xt is a regressor, λ ∈ (0, 1) is a hypothetical break point for the regression parameters, and 1( · ) is the indicator function. For expositional simplicity let us suppose λT and λ x T are integer-valued. The regression model (2.2) implies that both regression parameters are subject to potential structural change (full structural change model). Consider the 93 null hypothesis of no structural change in the slope parameter: H0 : β 1 = β 2 (2.3) The OLS estimators of α1 , α2 , β 1 , and β 2 are given by −1 λT β1 = ∑ ( xt − x1 ) t =1 T β2 = ∑ λT ∑ ( xt − x1 ) yt 2 t =1 −1 t=λT +1 (2.4) T ∑ ( x t − x 2 )2 , ( xt − x2 ) yt , t=λT +1 α1 = y1 − β 1 x 1 , α2 = y2 − β 2 x 2 , where 1 y1 = λT x1 = 1 λT λT 1 ∑ y t , y2 = (1 − λ ) T t =1 λT ∑ xt , x2 = t =1 1 (1 − λ ) T T ∑ yt , and t=λT +1 T ∑ xt . t=λT +1 Consider the case where λ > λ x . The asymptotic distribution of the slope estimator can be obtained under a certain set of high level conditions. A typical set of conditions is: [rT ] p Assumption 1. T −1 ∑t=1 xt2→ rq2x ,  uniformly in r ∈ [0, 1], and q2x is strictly positive. [rT ]  ut  Assumption 2. T −1/2 ∑t=1   ⇒ ΛW (r ), r ∈ [0, 1], where ΛΛ = Σ, and W (r ) is a xt ut 2 × 1 standard Wiener process. Assumptions 1 and 2 are the standard high level conditions used widely in the econometrics literature. Under these assumptions an immediate result is presented in the next Proposition without proof. Proposition 4. Suppose λ is known and λ > λ x . Under Assumptions 1 and 2 the OLS estimators 94 in (2.4) have the following asymptotic distributions: T 1/2 β1 − β1 ⇒ λq2x + λx λx 1− λ ( µ1 − µ2 ) T 1/2 β 2 − β 2 ⇒ (1 − λ) q2x −1 2 −1 × − µ2 − λx (µ1 − µ2 ) , 1 · ΛW (λ) , λ × (−µ2 , 1) · Λ (W (1) − W (λ)) . For simplicity assume q2x is known. To test the null hypothesis (2.3), one can consider using a robust Wald statistic −1 avar T 1/2 β 1 − β 2 Wald ( Tb ) = T β 1 − β 2 β1 − β2 , where Tb = λT denotes a hypothetical break date and avar T 1/2 β 1 − β 2 λ λq2x + λ x 1 − λx λ ( µ1 − µ2 )2 × Σ × − µ2 − + (1 − λ) (1 − λ) q2x −2 = −2 × − µ2 − λx ( µ1 − µ2 ) , 1 λ λx ( µ1 − µ2 ) , 1 λ × (−µ2 , 1) × Σ × (−µ2 , 1) , where Σ is a nonparametric kernel estimator of Σ = ΛΛ using a kernel K (·) and the bandwidth M: Σ = T −1 T T ∑∑K t =1 s =1  |t − s| M vt vs ,   ut  with vt =   . Unfortunately, Wald ( Tb ) does not have a pivotal asymptotic null xt ut distribution under fixed-b asymptotics as long as µ1 is not equal to µ2 and neither is the supremum statistic. The reason is the vector, λ λq2x + λ x 1 − − µ2 − λx λ λx λ ( µ1 − µ2 )2 (µ1 − µ2 ) , 1 , is generally different from (1 − λ) (1 − λ) q2x 95 −1 −1 × × (−µ2 , 1) unless µ1 is equal to µ2 . In order to obtain a test statistic which has a pivotal fixed-b limiting distribution, a modified regression equation and more general version of the high level conditions are introduced. The following two high-level conditions replace Assumptions 1 and 2.   2 p  rs21 uniformly in r ∈ [0, λ x ] [ rT ] − 1 → Assumption 1’. T ∑t=1 xt − µi(t)   λ x s2 + (r − λ x ) s2 uniformly in r ∈ [λ x , 1] 2 1 with s2i > 0 for i = 1, 2. Obviously under Assumption 1’ T −1 [rT ] x t − µi (t) ∑ si (t) t =1  [rT ]  Assumption 2’. T −1/2 ∑t=1  2 p → r uniformly in r ∈ [0, 1].  1 si (t) u t x t − µi (t) si (t) ut   ⇒ ΛW (r ), r ∈ [0, 1], where ΛΛ = Ω, W (r ) is a 2 × 1 standard Wiener process and si(t) = s1 1(t ≤ λ x T ) + s2 1(t ≥ λ x T + 1). When µ1 = µ2 = µ and s21 = s22 = s2 , Assumption 1’ is equivalent to Assumption 1 and Assumption 2’ is equivalent to Assumption 2. Under µ1 = µ2 = µ and s21 = s22 = s2 , [rT ] [rT ] p Assumption 1’ implies T −1 ∑t=1 ( xt − µt )2 = T −1 ∑t=1 ( xt − µ)2 → rs2 and it is easy to show this is equivalent to Assumption 1 by defining q2x = s2 + µ2 . Also, one can show the equivalence of Assumption 2’ and 2 through the relationship    s 0   Λ = Λ. µs s Note however that Assumption 1’ is more general than Assumption 1 since it allows the mean or probability limit of (centered and noncentered) sample second moment of xt to have a break. As a special case, if xt is a stationary process within each regime, then s2i is the variance of xt process and Assumption 1’ is a robust version of Assumption 1 96 allowing for a shift of the variance. Assumption 2’ is also a more general version than Assumption 2. To see this why, suppose one has evidence of shift in q2x (or s2i(t) ). Then it would be hard to justify Assumption 2 as being still valid. For example, take the case where xt is stationary and conditional homoskedasticity holds, E u2t | xt = σu2 . For a stationary variable, E xt2 = q2x . Then the variance of xt ut , denoted by Γ0 , is E xt2 u2t = E E xt2 u2t | xt = σu2 E xt2 = σu2 q2x . Hence Γ0 should have a structural break in general as long as there is a shift in q2x . This in turn implies that the long-run variance matrix Ω ≡ ΛΛ also has structural break because Γ0 is a component of the long run variance matrix. Therefore, it would not be appropriate to maintain Assumption 2 while allowing a shift in q2x or s2i(t) as in Assumption 1’. One needs to match Assumption 1’ with Assumption 2’ due to this reason. However, note that Assumption 2’ does not allow the long run variance to change in an arbitrary fashion. Assumption 2’ can only reflect the impact of change in q2x or s2i(t) on Ω in a particular way. In next Section tests of parameter instability of α or(and) β will be presented under a particular specification of breaks in µi(t) , si(t) and breaks in (α, β), which is as follows: µi(t) , si(t) has a structural break at λ x T and (α, β) is allowed to have a break at λT. There may be an empirical application where a different specification is more relevant. For example, only µi(t) has structural break at λ x T and (α, β) are allowed to have break at λT. But analysis for these other cases is not pursued in this chapter. 2.3 Asymptotic Results Suppose µi(t) , si(t) has a break at λ x T and (α, β) is allowed to change at an unknown break date, t = λT. Assume λ x and µi(t) , si(t) are given (known). Also, assume λ ∈ [ , (λ x − )] ∪ [(λ x + ) , 1 − ] following Andrews (1993). The admissible values for λ is obtained by trimming the values at both ends of the sample period and the values in the neighborhood of λ x T. It is necessarily implied that the break date for the regression 97 parameters (α, β) should not be same as the break date for the moments. Recall the regression equation (2.2). Define wt = xt si (t) . Without parameter instability, the equation can be written as yt = α + βxt + ut = α + βsi(t) xt si (t) + ut = α + β (s1 1 (t ≤ λ x T ) + s2 1 (t > λ x T )) wt + ut = α + βs1 1 (t ≤ λ x T ) wt + βs2 1 (t > λ x T ) wt + ut . Once we allow (α, β) to have a break at λT, the above regression equation can be rewritten as yt = α1 1 (t ≤ λT ) + α2 1 (t > λT ) + ( β 1 1 (t ≤ λT ) + β 2 1 (t > λT )) s1 1 (t ≤ λ x T ) wt + ( β 1 1 (t ≤ λT ) + β 2 1 (t > λT )) s2 1 (t > λ x T ) wt + ut . Note that there are four interaction terms made by the two time indicator functions and one of these four interactions is identical to zero. For example, if λ < λ x , then 1 (t ≤ λT ) × 1 (t > λ x T ) = 0. Depending on the relative magnitude of λ and λ x , one can rewrite the regression equation by reparametrization. For λ < λ x yt = α1 1 (t ≤ λT ) + α2 1 (λT < t ≤ λ x T ) + α3 1 (t > λ x T ) (2.5) + γ1 wt 1 (t ≤ λT ) + γ2 wt 1 (λT < t ≤ λ x T ) + γ3 wt 1 (t > λ x T ) + ε t , where γ1 = β 1 s1 , γ2 = β 2 s1 , γ3 = β 2 s2 , and α2 = α3 . For λ > λ x yt = α1 1 (t ≤ λ x T ) + α2 1 (λ x T < t ≤ λT ) + α3 1 (t > λT ) + γ1 wt 1 (t ≤ λ x T ) + γ2 wt 1 (λ x T < t ≤ λT ) + γ3 wt 1 (t > λT ) + ε t , where γ1 = β 1 s1 , γ2 = β 1 s2 , γ3 = β 2 s2 , and α1 = α2 . 98 (2.6) Denote as W the T × 6 matrix which collects the six explanatory variables: for (2.5), 1 (t ≤ λT ) , wt 1 (t ≤ λT ) , 1 (λT < t ≤ λ x T ) , wt 1 (λT < t ≤ λ x T ) , 1 (t > λ x T ) , wt 1 (t > λ x T ) in this order and for (2.6), 1 (t ≤ λ x T ) , wt 1 (t ≤ λ x T ) , 1 (λ x T < t ≤ λT ) , wt 1 (λ x T < t ≤ λT ) , 1 (t > λT ) , and wt 1 (t > λT ) in this order. Notice that the extra parameter for α is introduced for each equation so that the regression model takes the form of full structural change model with three regimes. The stability of α can be rewritten as α1 = α2 in (2.5) and α2 = α3 in (2.6). Likewise the stability of slope parameter β can be rewritten as β 1 = β 2 in (2.5) and β 2 = β 3 in (2.6). The parameters in (2.5) and (2.6) are estimated by OLS and robust Wald statistics will be constructed based on these estimators. The OLS estimators in (2.5) are given by −1 λT γ1 = ∑ ( w t − w1 ) t =1 −1 t =1 λx T ∑ γ2 = λT ∑ ( w t − w1 ) y t 2 −1 T ∑ ( w t − w2 ) y t , t=λT +1 t=λT +1 γ3 = λx T ∑ ( w t − w2 )2 , ( w t − w3 ) T ∑ 2 ( w t − w3 ) y t , t = λ x T +1 t = λ x T +1 and α1 = y1 − γ1 w1 , α2 = y2 − γ2 w2 , α3 = y3 − γ3 w3 , where w1 = 1 λT ∑λT t =1 xt s1 y1 = ∑t=x λT +1 λ T xt s1 and y2 = ∑tT=λx T +1 xt s2 y3 = w2 = 1 λ x T −λT w3 = 1 T −λ x T 1 λT ∑λT t =1 y t 1 λ x T −λT 1 T −λ x T λ T ∑t=x λT +1 yt . ∑tT=λx T +1 yt Defining x1 , x2 , and x3 similarly as above immediately gives w1 = 99 x1 s1 , w2 = x2 s1 , and w3 = x3 s2 . One can easily write down the expression for the OLS estimators (γ1 , γ2 , γ3 , α1 , α2 , α3 ) and all the other quantities when λ ∈ [(1 + ) λ x , 1 − ] . The next Proposition presents the asymptotic limits of the OLS estimators.    Λ1  Proposition 5. Suppose λ ∈ [ , (1 − ) λ x ] . Denote Λ =   . Under Assumptions 1’ Λ2 and 2’, as T → ∞ the OLS estimators in (2.5) have the following limits: W (λ) , λ µ W (λ) T 1/2 (α1 − α1 ) ⇒ s1 Λ1 − 1 Λ2 , s1 λ W (λ x ) − W (λ) T 1/2 (γ2 − γ2 ) ⇒ Λ2 , and λx − λ µ W (λ x ) − W (λ) . T 1/2 (α2 − α2 ) ⇒ s1 Λ1 − 1 Λ2 s1 λx − λ T 1/2 (γ1 − γ1 ) ⇒ Λ2 Suppose λ ∈ [(1 + ) λ x , 1 − ] . Then the OLS estimators in (2.6) have following limits: 1 Λ2 (W (λ) − W (λ x )) , λ − λx W (λ) − W (λ x ) µ2 , T 1/2 (α2 − α2 ) ⇒ s2 Λ1 − Λ2 s2 λ − λx 1 Λ2 (W (1) − W (λ)) , and T 1/2 (γ3 − γ3 ) ⇒ 1−λ µ2 W (1) − W ( λ ) T 1/2 (α3 − α3 ) ⇒ s2 Λ1 − Λ2 . s2 1−λ T 1/2 (γ2 − γ2 ) ⇒ Proof: See the Appendix. 2.3.1 Stability of β Consider the null hypothesis H0 : β is stable. 100 (2.7) This null hypothesis implies γ1 = γ2 in (2.5) and γ2 = γ3 in (2.6). From Proposition 10, for λ ∈ [ , λ x − ] , W (λ x ) − W (λ) W (λ) − λx − λ λ λx ∼ N 0, Λ2 Λ2 , λ (λ x − λ) T 1/2 (γ2 − γ1 ) ⇒ Λ2 (2.8) and for λ ∈ [λ x + , 1 − ] , W (1) − W ( λ ) W ( λ ) − W ( λ x ) − 1−λ λ − λx 1 − λx Λ2 Λ2 . ∼ N 0, (1 − λ ) ( λ − λ x ) T 1/2 (γ3 − γ2 ) ⇒ Λ2 (2.9) Test Statistic T1 Denote Tb = λT and define robust Wald statistics: β Wald1 β ( Tb ) = Wald2 ( Tb ) = T (γ2 − γ1 )2 Λ2 Λ2 T (γ3 − γ2 )2 Λ2 Λ2 for Tb ∈ [ T, (λ x − ) T ] ≡ Ξ1 and (2.10) for Tb ∈ [(λ x + ) T, (1 − ) T ] ≡ Ξ2 , (2.11) where Λ2 Λ2 is a nonparametric kernel HAC estimator given by Λ 2 Λ 2 = T −1 with vt = T T ∑∑K t =1 s =1 x t − µi (t) si (t) |t − s| M vt vs , (2.12) ut . The HAC estimator Λ2 Λ2 in Wald1 ( Tb ) is computed using the residuals ut from the reβ gression of equation (2.5) and Λ2 Λ2 in Wald2 ( Tb ) is computed with the residuals ut from β the regression of equation (2.6). Under the assumption of a fixed bandwidth ratio this 101 HAC estimator can be rewritten as a function of partial sum processes (see Kiefer and Vogelsang (2005)), [rT ] β S[rT ] = ∑ vt = t =1 [rT ] x t − µi (t) t =1 si (t) ∑ ut . The next Proposition presents the limit of this (scaled) partial sum process. Proposition 6. Under Assumptions 1’ and 2’, as T → ∞, the limit of the partial sum process is given by When 0 < λ < λ x , T 1/2 S[rT ] ⇒ β       Λ2 W (r ) − λr W (λ) for 0 ≤ r ≤ λ, Λ2 W (r ) − W (λ) − λrx−−λλ (W (λ x ) − W (λ)) for λ ≤ r ≤ λ x ,      Λ2 W (r ) − W (λ x ) − r−λx (W (1) − W (λ x )) for λ x ≤ r ≤ 1, 1− λ x and When λ x < λ < 1, T 1/2 S[rT ] ⇒ β            Λ2 W (r ) − Λ2 W (r ) − W ( λ x ) − r λx W (λ x ) r −λ x λ−λ x Λ2 W (r ) − W ( λ ) − r −λ 1− λ for 0 ≤ r ≤ λ x , (W (λ) − W (λ x )) (W (1) − W (λ)) for λ x ≤ r ≤ λ, for λ ≤ r ≤ 1. Proof: See the Appendix. Define H1 ≡ H1 (r, λ, λ x ) r = W1 (r ) − W1 (λ) · 1(r ≤ λ) λ + W1 (r ) − W1 (λ) − r−λ (W1 (λ x ) − W1 (λ)) · 1(λ < r ≤ λ x ) λx − λ + W1 (r ) − W1 (λ x ) − r − λx (W1 (1) − W1 (λ x )) · 1(λ x < r ≤ 1), 1 − λx and H2 ≡ H2 (r, λ, λ x ) 102 W1 (r ) − = + W1 (r ) − W1 (λ x ) − r W1 (λ x ) · 1(r ≤ λ x ) λx r − λx (W1 (λ) − W1 (λ x )) · 1(λ x < r ≤ λ) λ − λx + W1 (r ) − W1 (λ) − r−λ (W1 (1) − W1 (λ)) · 1(λ < r ≤ 1), 1−λ where W1 (·) is a 1-dimensional Wiener process. Theorem 4. Let λ ∈ (0, 1) and b ∈ (0, 1] be given. Suppose M = bT. Then under Assumptions 1’ and 2’, as T → ∞, the limits under the null hypothesis in (2.7) is given by H0 β Wald1 ( Tb ) ⇒ W1 (λ x )−W1 (λ) λ x −λ − W1λ(λ) 2 P (b, H1 ) and, β H0 Wald2 ( Tb ) ⇒ W1 (1)−W1 (λ) 1− λ W1 (λ x ) − W1 (λλ)− −λ x P (b, H2 ) ≡ Wald1∞ (λ, λ x ) , 2 ≡ Wald2∞ (λ, λ x ) , The definition of P (b, H1 ) and P (b, H2 ) can be found in Cho and Vogelsang (2014). Proof: See the Appendix. Finally, define a test statistic T1 : T1 = max β β max Wald1 ( Tb ) , max Wald2 ( Tb ) . Tb ∈Ξ2 Tb ∈Ξ1 This statistic can be used when the break date is unknown. Its limit is given by max sup λ∈[ ,(1− )λ x ] Wald1∞ (λ, λ x ) , sup λ∈[(1+ )λ x ,1− ] 103 Wald2∞ (λ, λ x ) . (2.13) ( F) Test Statistic T1 Alternatively, the inference in this setup can be based on different HAC estimators defined in Cho and Vogelsang (2014). These alternative HAC estimators are given by Υ 1 = T −1 T T ∑∑K t =1 s =1  (1) vt   =   x t − µ1 s1 x t − µ1 s1 x t − µ2 s2 |t − s| M (1) (1) vt vs , 1 (t ≤ λT ) (2.14)    1 (λT < t ≤ λ x T )   ut ,  1 (λ x T ≤ t ≤ T ) where λ < λ x , and Υ2 = T −1 T T ∑∑K t =1 s =1  (2) vt   =   x t − µ1 s1 x t − µ2 s2 x t − µ2 s2 |t − s| M (2) (2) vt vs , 1 (t ≤ λ x T ) (2.15)    1 (λ x T < t ≤ λT )   ut ,  1 (λT ≤ t ≤ T ) where λ > λ x . Define robust Wald statistics based on the above HAC estimators: T (γ2 − γ1 )2 for Tb ∈ [ T, (λ x − ) T ] ≡ Ξ1 , = D1 Q1−1 Υ1 Q1−1 D1 (2.16) T (γ3 − γ2 )2 = for Tb ∈ [(λ x + ) T, (1 − ) T ] ≡ Ξ2 , D2 Q2−1 Υ2 Q2−1 D2 (2.17) ( F ),β Wald1 ( Tb ) ( F ),β Wald2 ( Tb ) 104 where   1 0 0 D1 ≡ (1, − 1) ×  , 0 1 0  2 λT 1 0  T ∑ t =1 ( w t − w 1 )  2 λx T 1 Q1 =  0  T ∑t=λT +1 ( wt − w2 )  0 0   0 1 0 D2 ≡ (1, − 1) ×   , and 0 0 1  2 λx T 1 0  T ∑ t =1 ( w t − w 1 )  2 λT 1 Q2 =  0  T ∑ t = λ x T +1 ( w t − w 2 )  0 0  0 0 1 T ∑tT=λx T +1 (wt − w3 ) 2   ,    0 0 1 T ∑tT=λT +1 (wt − w3 ) 2   .   Theorem 5. Let λ ∈ (0, 1) and b ∈ (0, 1] be given. Suppose M = bT. Then under Assumptions 1’ and 2’, as T → ∞, the limits under the null hypothesis in (2.7) is given by ( F ),β Wald1 H ( Tb ) ⇒0 W1 (λ x )−W1 (λ) λ x −λ − W1λ(λ) 2 P (b, H3 ) and, ( F ),β Wald2 W1 (1)−W1 (λ) 1− λ H0 ( Tb ) ⇒ W1 (λ x ) − W1 (λλ)− −λ x P (b, H4 ) ( F ),∞ ≡ Wald1 (λ, λ x ) , 2 ( F ),∞ ≡ Wald2 (λ, λ x ) , where the processes H3 and H4 are defined as H3 ≡ H3 (r, λ, λ x ) = − 1 λx − λ 1 r W1 (r ) − W1 (λ) · 1(r ≤ λ) λ λ W1 (r ) − W1 (λ) − r−λ (W1 (λ x ) − W1 (λ)) · 1(λ < r ≤ λ x ), λx − λ 105 and H4 ≡ H4 (r, λ, λ x ) = 1 λ − λx − 1 1 − λx W1 (r ) − W1 (λ x ) − r − λx (W1 (λ) − W1 (λ x )) · 1(λ x < r ≤ λ) λ − λx W1 (r ) − W1 (λ) − r−λ (W1 (1) − W1 (λ)) · 1(λ < r ≤ 1). 1−λ The test statistic for testing the null hypothesis in (2.7) is simply given by ( F) T1 = max ⇒ max ( F ),β max Wald1 Tb ∈Ξ1 ( F ),β ( Tb ) , max Wald2 Tb ∈Ξ2 ( F ),∞ sup λ∈[ ,(1− )λ x ] Wald1 (λ, λ x ) , ( Tb ) sup λ∈[(1+ )λ x ,1− ] ( F ),∞ Wald2 (λ, λ x ) ≡ T ( F),∞ ( , λ x ) . 2.3.2 Stability of α Consider the null hypothesis H0 : α is stable. (2.18) With a hypothetical break point λT, the break in α may occur at λT. This implies α1 = α2 in (2.5) and α2 = α3 in (2.6) under the null hypothesis (2.18). From Proposition 10, for λ ∈ [ , λx − ] , W (λ x ) − W (λ) W (λ) − λx − λ λ   λx µ   s1   ∼ N 0, s1 , − 1 ΛΛ   , λ (λ x − λ) s1 − µs11 T 1/2 (α2 − α1 ) ⇒ µ s1 Λ1 − 1 Λ2 s1  106 (2.19) and for λ ∈ [λ x + , 1 − ] , W (1) − W ( λ ) W ( λ ) − W ( λ x ) − 1−λ λ − λx   1 − λx µ2  s2    s2 , − ΛΛ  ∼ N 0,  . s2 (1 − λ ) ( λ − λ x ) − µs22 µ2 s2 Λ1 − Λ2 s2  T 1/2 (α3 − α2 ) ⇒ (2.20) Test Statistic T2 Denote Tb = λT and define robust Wald statistics: T ( α2 − α1 )2 Wald1α ( Tb ) = s1 , − µ1 s1 × ΛΛ × s1 , − µ1 s1 for Tb ∈ [ T, (λ x − ) T ] ≡ Ξ1 and (2.21) Wald2α ( Tb ) = T ( α3 − α2 ) s2 , − µ2 s2 2 × ΛΛ × s2 , − µ2 s2 for Tb ∈ [(λ x + ) T, (1 − ) T ] ≡ Ξ2 , (2.22) where ΛΛ is a nonparametric kernel HAC estimator given by ΛΛ = T −1 T T ∑∑K t =1 s =1   with ξ t =  |t − s| M ξtξs, (2.23)  1 si (t) x t − µi (t) si (t)   ut . As before, the HAC estimator ΛΛ in Wald1α ( Tb ) is computed using the residuals ut from the regression of equation (2.5) and ΛΛ in Wald2α ( Tb ) is computed with the residuals ut from the regression of equation (2.6). Under the assumption of fixed bandwidth ratio, this HAC estimator can be rewritten as a function of partial sum processes (see Kiefer and Vogelsang (2005)), [rT ] S[αrT ] = ∑ ξt = t =1 [rT ]  ∑  t =1 107 1 si (t) x t − µi (t) si (t)    ut . The next Proposition presents the limit of the (scaled) partial sum processes. Proposition 7. Under Assumptions 1’ and 2’, as T → ∞, the limit of the partial sum process is given by When 0 < λ < λ x , T 1/2 S[αrT ] ⇒       Λ W (r ) − λr W (λ) for 0 ≤ r ≤ λ, Λ W (r ) − W (λ) − λrx−−λλ (W (λ x ) − W (λ)) for λ ≤ r ≤ λ x ,      Λ W (r ) − W (λ x ) − r−λx (W (1) − W (λ x )) for λ x ≤ r ≤ 1, 1− λ x and When λ x < λ < 1, T 1/2 S[αrT ] ⇒            Λ W (r ) − Λ W (r ) − W ( λ x ) − r λx W (λ x ) r −λ x λ−λ x Λ W (r ) − W ( λ ) − r −λ 1− λ for 0 ≤ r ≤ λ x , (W (λ) − W (λ x )) (W (1) − W (λ)) for λ x ≤ r ≤ λ, for λ ≤ r ≤ 1. Proof: See the Appendix. The next Theorem presents the limit of the statistics. The limit is the same as in Theorem 8. Theorem 6. Let λ ∈ (0, 1) and b ∈ (0, 1] be given. Suppose M = bT. Then under Assumptions 1’ and 2’, as T → ∞, the limits under the null hypothesis in (2.7) is given by H0 Wald1α ( Tb ) ⇒ Wald1∞ (λ, λ x ) , and, H0 Wald2α ( Tb ) ⇒ Wald2∞ (λ, λ x ) . The definition of P (b, H1 ) and P (b, H2 ) can be found in Cho and Vogelsang (2014). Proof: See the Appendix. 108 Using Theorem 9, the test statistic T2 defined below has the following limit: T2 = max max Wald1α ( Tb ) , max Wald2α ( Tb ) Tb ∈Ξ1 ⇒ max Tb ∈Ξ2 sup λ∈[ ,(1− )λ x ] Wald1∞ (λ, λ x ) , sup λ∈[(1+ )λ x ,1− ] Wald2∞ (λ, λ x ) . ( F) Test Statistic T2 Alternative statistic can be derived by a similar way as before. Construct HAC estimators (1) Υ3 , Υ4 using ξ t (2) and ξ t respectively where  (1) ξt        =        and 1 s1 1 ( t (2) ξt ≤ λT )    1 (t ≤ λT )    1 1 λT < t ≤ λ T ( )  x s1  × ut ,  x t − µ1 1 (λT < t ≤ λ x T )  s1    1 1 λ T ≤ t ≤ T ( ) x  s2  x t − µ2 1 λ T ≤ t ≤ T ( ) x s2 x t − µ1 s1         =         1 s1 1 ( t  ≤ λx T )    1 (t ≤ λ x T )    1 1 λ T < t ≤ λT ( )  x s2  × ut .  x t − µ2 1 (λ x T < t ≤ λT )  s2    1 1 λT ≤ t ≤ T ( )  s2  x t − µ2 1 λT ≤ t ≤ T ( ) s2 x t − µ1 s1 Define Wald statistics ( F ),α Wald1 T ( α2 − α1 )2 ( Tb ) = s1 , − µ1 s1 × D3 WW T −1 109 Υ3 WW T −1 (2.24) D3 × s1 , − µ1 s1 for Tb ∈ [ T, (λ x − ) T ] ≡ Ξ1 , and ( F ),α Wald2 ( Tb ) T ( α3 − α2 )2 = s2 , − µ2 s2 × D4 WW T −1 Υ4 WW T −1 (2.25) D4 × s2 , − µ2 s2 for Tb ∈ [(λ x + ) T, (1 − ) T ] ≡ Ξ2 , where D3 ≡ ( I2 , − I2 ) × ( I4 , 04×1 ) , D4 ≡ ( I2 , − I2 ) × (04×1 , I4 ) . Finally, ( F) T2 ≡ max ⇒ max ( F ),α max Wald1 Tb ∈Ξ1 Tb ∈Ξ2 ( F ),∞ sup λ∈[ ,(1− )λ x ] Wald1 (λ, λ x ) , ( F ),α ( Tb ) , Wald2 The asymptotic limits of Wald1 ( F ),∞ Wald1 2.4 ( F ),∞ (λ, λ x ) , Wald1 ( F ),α ( Tb ) , max Wald2 ( Tb ) sup λ∈[(1+ )λ x ,1− ] ( F ),α ( F ),∞ Wald2 ( F) ( Tb ) and T2 (λ, λ x ) . are the same as (λ, λ x ) and T ( F),∞ ( , λ x ) respectively in previous Theorem. Simulations In this Section the finite sample properties of the test are examined via Monte Carlo simulation. The data generating process (DGP) is given by yt = α1 1 (t ≤ λT ) + α2 1 (t > λT ) + β 1 1 (t ≤ λT ) xt + β 2 1 (t > λT ) xt + ε t , xt = si(t) ut + µi(t) , ut = ρut−1 + ηt , ε t = 0.5ε t−1 + νt , 110 where ηt ∼ N (0, ση2 ) and νt ∼ N (0, 1) are independent. The values of α1 and α2 are set to zero. The mean and variance of xt are allowed to have a single break at λ x = 0.4. Three specifications on the mean and variance of xt are considered. specification 0: µ1 , µ2 , s21 , s22 = (0, 0, 1, 1) specification 1: µ1 , µ2 , s21 , s22 = (0, 0.5, 1, 1.5) specification 2: µ1 , µ2 , s21 , s22 = (0, 1, 1, 2) specification 3: µ1 , µ2 , s21 , s22 = (0, 2, 1, 4) Specification 0 implies there is no break in the mean and variance. When the true values of λ x , µ1 , µ2 , s21 , s22 are known, one can recover ut from xt and use ut to construct a HAC estimator. So the break in mean and variance has no effect on inference and the empirical rejection probabilities should be same across the specifications. The values of ρ, ση2 are chosen so that the variance of xt becomes 1 for the first regime (before λ x T). The selected sets of those values are (0.5, 0.75) and (0.8, 0.36). The DGP with ρ, ση2 = (0.8, 0.36) has more persistent autocovariance function than the other and has the larger long run variance of ut . To examine the size property, the case where β 1 = β 2 = 1 is considered. To learn about power, the following values for λ0 , β 1 , β 2 are considered: (0.2, 1, 1.5) , (0.25, 1, 1.5), where λ0 denotes the true break point for the regression parameters. The value of the trimming parameter to be used in this experiment is 0.1. Based on this particular value of trimming parameter and the value of λ x , the admissible set of λ is [0.1, 0.4 − 0.1] [0.4 + 0.1, 1 − 0.1] . To check against the case where the true value does not belong to this admissible set, extra values of λ is considered: λ = 0.4. With = 0.1 and λ x = 0.4 and the Bartlett kernel being used, the 95% fixed-b critical values for the statistic T1 are 188.64 for ( F) b = 0.1 and 717.15 for b = 0.5. For T1 with = 0.1 and λ x = 0.4 the 95% fixed-b critical values are 35.11 for b = 0.1 and 162.07 for b = 0.5. Table 2.1 reports the results for the 111 simulation experiments for T = 100 and 300 and for three different tests: SupW ( F) test in ( F) Cho and Vogelsang (2014), and T1 and T1 tests proposed in this chapter. ( F) Table 2.1 shows that the test results for T1 and T1 are invariant to the break in the moments of the explanatory variable. This is, as mentined earlier, because the true value of λ x , µ1 , µ2 , s21 , s22 is assumed to be observable and these two statistics are numerically the same across different values of λ x , µ1 , µ2 , s21 , s22 . There are several points conveyed by this table. First, in the absence of break in the moments, SupW ( F) test displays better power property and comparable size property over the other two tests. But when there exists a break in the moments, the size and power of SupW ( F) test would be sensitive to the magnitude of the change in the moments and the size property does not get better as T ( F) increases. When there is change in the moments and T1 test is used for the inference, the size distortion is relatively small and the rejection frequency is not so much sensitive to the persistence of the underlying process. However, this test has poor power compared to ( F) SupW ( F) test. This is because for given a value of b, T1 uses bigger effective bandwidth for estimating the variance matrix in each regime. The test based on T1 is seen to be ( F) dominated by T1 2.5 test in terms of the size and power when T is 300. Summary and Conclusions This chapter proposes an inference procedure for testing stability of regression parameters allowing for a single break in the mean and second moments of the x variable. The mean and second moments are assumed to have a break at λ x T. The break point for the regression parameter (λT ) should be different from λ x T. The analysis focuses on a simple linear regression model but the proposed test can be generalized to a multiple linear regression model. A new set of high level conditions are introduced which incorporates the possibility of change in the mean and second moments of the x variable. Under fixed-b asymptotics, the limiting distribution of the robust Wald statistic is pivotal so the criti- 112 cal values can be simulated and used for conducting inference. The simulation results in this chapter show there is substantial size distortion in finite samples. This is not surprising because the three regimes induced by two different types of break points make even smaller sample size for each regime. Also, whether the main results will be still valid when λ x and the moments are unknown and need to be estimated should be a straightforward direction of future research. 113 Table 2.1: Size and Power in Finite Samples, T=100, 300, λ x = 0.4, µ1 = 0, µ2 = 0.5 s21 = 1, s22 = 1.5 No break in (µ, s2 ) ρ, ση2 b (.5, .75) 0.1 0.5 (.8, .36) 0.1 0.5 (.5, .75) 0.1 0.5 (.8, .36) 0.1 0.5 = .1, Bartlett kernel µ1 = 0, µ2 = 1 s21 = 1, s22 = 2 (.5, .75) 0.1 0.5 µ1 = 0, µ2 = 2 s21 = 1, s22 = 4 (.8, .36) 0.1 0.5 (.5, .75) 0.1 0.5 (.8, .36) 0.1 0.5 63.8 58.8 SupW ( F) Test T=100 Size Power λ0 = .2 .25 .4 T=300 Size Power λ0 = .2 .25 .4 13.3 11.9 23.0 20.7 16.5 14.1 25.3 21.1 24.6 19.4 30.8 ρ, b T=100 Size Power λ0 = .2 .25 .4 T=300 Size Power λ0 = .2 .25 .4 50.8 49.6 27.9 22.1 33.1 28.2 28.1 22.6 33.4 28.7 26.3 20.5 32.0 26.7 46.8 36.7 45.3 37.2 48.6 38.0 46.5 38.5 47.2 36.9 46.2 38.2 67.5 52.8 60.5 50.7 69.0 55.0 61.8 51.4 70.4 56.5 62.4 52.8 98.1 86.7 93.6 83.6 98.3 88.6 94.1 84.5 98.4 93.2 94.3 86.8 8.4 14.8 13.3 29.9 25.1 30.9 25.6 93.0 83.6 93.4 75.1 86.8 71.9 95.2 81.0 88.6 74.7 97.1 88.8 90.6 79.8 100 100 100 96.0 99.9 95.6 97.3 99.8 96.6 99.8 99.9 99 7.9 14.4 11.9 41.5 34.8 38.8 32.9 41.8 34.1 38.7 32.5 37.8 30.6 36.3 29.3 18.6 16.0 76.0 59.8 67.6 55.5 78.3 63.6 69.4 57.4 80.0 68.0 70.6 58.3 ( F) T1 Test ση2 26.7 T1 Test (.5, .75) 0.1 0.5 (.8, .36) 0.1 0.5 (.5, .75) 0.1 0.5 (.8, .36) 0.1 0.5 20.1 18.7 45.7 41.2 12.0 13.7 25.6 21.6 25.2 21.9 20.1 18.7 49.6 43.2 49.3 43.6 45.7 41.2 20.3 18.9 19.5 19.0 12.0 10.7 17.3 15.8 17.2 16.4 13.7 12.3 9.5 9.4 20.9 18.9 8.6 11.8 25.1 20.5 25.3 19.8 9.5 9.4 31.9 25.8 31.4 25.0 20.9 18.9 25.0 25.3 24.6 24.9 8.6 7.8 114 10.7 7.8 12.3 11.1 22.2 22.1 22.0 21.6 11.8 11.1 84.8 75.0 APPENDIX 115 Appendix for Chapter 2 Proof of Proposition 10: The proof is only provided for γ2 . One can easily prove the rest of the results. −1 λx T ∑ γ2 = ( w t − w2 ) λx T ∑ 2 ( w t − w2 ) y t t=λT +1 2 −1 λx T xt − t=λT +1 λx T ∑ = γ2 + t=λT +1 Since p lim x2 = p lim (λ 1 x −λ) T xt − x2 s1 ∑ s1 t=λT +1 x2 ut . λ T ∑t=x λT +1 xt = µ1 , it follows under Assumptions 1’ and 2’, W (λ x ) − W (λ) λx − λ T 1/2 (γ2 − γ2 ) ⇒ Λ2 . Proof of Proposition 11: Proof is only provided for r ∈ [λ, λ x ] when 0 < λ < λ x . [rT ] β S[rT ] = [rT ] = ( α2 − α2 ) ∑ t=λT +1 ∑ t =1 x t − µ1 s1 x t − µ1 s1 [rT ] ut = ∑ t=λT +1 x t − µ1 s1 [rT ] ∑ + (γ2 − γ2 ) t=λT +1 x t − µ1 s1 yt − α2 − γ2 [rT ] xt s1 xt + s1 t =∑ λT +1 x t − µ1 s1 ut . So, W (λ x ) − W (λ) × (r − λ) + Λ2 (W (r ) − W (λ)) λx − λ r−λ W (r ) − W ( λ ) − (W (λ x ) − W (λ)) . λx − λ T −1/2 S[rT ] ⇒ −Λ2 β = Λ2 Proof of Theorem 8: The results immediately follows from equation (2.8), (2.9), Proposition 3 and the transformation d Λ2 W (·) = AW1 (·), 116 where A is the positive constant satisfying Λ2 Λ2 = A2 . Proof of Proposition 12: Proof is only provided for r ∈ [λ, λ x ] when 0 < λ < λ x . [rT ] S[αrT ] = [rT ] = ∑ t=λT +1    1 s1 x t − µ1 s1    ut =  [rT ] = ( α2 − α2 ) ∑ [rT ] t=λT +1   ∑ t=λT +1 t =1  x t − µ1 s1  T −1/2 S[αrT ] ⇒ − (r − λ)  µ1 s21 1  [rT ] 1 s1   ut xt   yt − α2 − γ2 s1 x t − µ1 s1   − (r − λ )  t =1   + (γ2 − γ2 ) Hence x t − µ1 s1  1 s1    1 s1  ∑  1 s1  [rT ] ∑ ξt =  ∑ t=λT +1   1 s1 x t − µ1 s1  [rT ]   xt   + ∑  s1 t=λT +1 1 s1 x t − µ1 s1    ut .  µ1   s1 , − s1 0 Λ W (λ x ) − W (λ) λx − λ    Λ2 W (λ x ) − W (λ) λx − λ = Λ W (r ) − W ( λ ) − + Λ (W (r ) − W (λ)) r−λ (W (λ x ) − W (λ)) . λx − λ Proof of Theorem 9: The results immediately follows from equation (2.19), (2.20), Proposition 4 and the transformations µ1 s1 µ2 s2 , − s2 s1 , − d ΛW (·) = B1 W1 (·), d ΛW (·) = B2 W1 (·), 117 where the constants B1 and B2 satisfies s1 , − µ1 s1 ΛΛ s1 , − µ1 s1 = B12 , s2 , − µ2 s2 ΛΛ s2 , − µ2 s2 = B22 . 118 REFERENCES 119 REFERENCES Andrews, D. W. K.: (1993), ‘Tests for Parameter Instability and Structural Change with Unknown Change Point’. Econometrica 61, 821–856. Cho, C. K. and T. J. Vogelsang: (2014), ‘Fixed-b Inference for Testing Structural Change in a Time Series Regression’. Working paper, Department of Economics, Michigan State University. Hansen, B. E.: (2000), ‘Testing for structural change in conditional models’. Journal of Econometrics 97(1), 93–115. Kiefer, N. M. and T. J. Vogelsang: (2005), ‘A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests’. Econometric Theory 21, 1130–1164. 120 CHAPTER 3 A Test of the Null of Integer Integration against the Alternative of Fractional Integration 3.1 Introduction Chapter 2 proposes a test of the null hypothesis of “integer integration” against the alternative of fractional integration. More precisely, the null is that the series is either I (0) or I (1), while the alternative is that it is I (d) with 0 < d < 1. The null of integer integration is rejected in favor of the alternative of fractional integration if the KPSS test rejects the null of I (0) and a unit root test rejects the null of I (1). A new unit root test is used as the second part of the testing procedure, which is a lower-tailed KPSS test based on first differences of the data, but other unit root tests like the ADF test could also have been used. This two-part testing procedure will be called the “Double-KPSS” test because it consists of two steps, but it should be pointed out that the test is treated as one test and is evaluated for its properties (consistency and finite sample size and power) as such. The KPSS test of Kwiatkowski et al. (1992) was originally suggested as a test of the null of (short memory) stationarity against the alternative of a unit root. Conversely, standard unit root tests like the Dickey-Fuller tests, the augmented Dickey-Fuller (ADF) test of Said and Dickey (1984) or the Phillips-Perron test of Phillips and Perron (1988) were viewed as tests of the null of a unit root against the alternative of short-memory stationarity. So if the KPSS test rejected but the unit root test did not, the conclusion was that the series had a unit root. If the unit root test rejected but the KPSS test did not, the conclusion was 121 that the series was short-memory stationary. If neither test rejected, the conclusion was that the data were not informative enough to decide whether the series was I (0) or I (1). However, if both tests rejected, there was in some sense a contradiction. This apparent contradiction can be resolved by considering a wider class of processes, specifically long-memory processes. The leading example considered in this chapter, is the I (d) process (with 0 < d < 1) of Adenstedt (1974), Granger and Joyeux (1980), and Hosking (1981). Since both the KPSS test and unit root tests have power against longmemory alternatives, the “double rejection” outcome can be taken as evidence that the series has long memory, as opposed to being either I (0) or I (1). This is not a novel observation. However, the approach in this chapter is novel in its consideration of the doubletesting procedure as a single test, and its investigation of the size and power properties of this test. In this regard, the basic observation is that if the nominal size of each of the two tests is set to 5%, the double test also has size of 5% asymptotically. For example, if the DGP is I (0), then asymptotically the KPSS test will reject with probability 5% while the unit root test will reject with probability one, while if the DGP is I (1) the converse will occur. So whether the data are I (0) or I (1), the probability of rejection of the double test is asymptotically 5%. The practical issue to be faced is to what extent one can be reasonably sure that the double rejection outcome is due to fractional integration, as opposed to size distortions of the test under the I (0) or I (1) null. For example, Caner and Kilian (2001) and Müller (2005) have shown that the KPSS test has large size distortions if the DGP is AR(1) with autoregressive coefficient near unity. Conversely, Dejong et al. (1992), Phillips and Perron (1988) and Vogelsang and Wagner (2013), among others, have found that the Dickey-Fuller test and its variants can have large size distortions, especially if the DGP is ARI MA(0, 1, 1) with moving average root near (negative) unity. This does not imply that the Double KPSS test will suffer from large size distortions in either of these cases, since the cases for which the KPSS test has large size distortions correspond to cases in 122 which the unit root test may have low power, and conversely. However, it does argue for a careful investigation of the size and power properties of the new test in finite samples. As noted above, the specific unit root test used in this chapter is a lower-tail KPSS test based on the data in differences. The KPSS unit root test suggested by Shin and Schmidt (1992) and Breitung (2002) was considered. However, as shown by Lee and Amsler (1997), the KPSS unit root test is not consistent against I (d) alternatives with 1/2 < d < 1. One might also consider using the ADF test, but this test is known to have low power against I (d) alternatives (e.g. Diebold and Rudebusch (1991), Hassler and Wolters (1994)), and there is also the practical consideration that it is easier to prove the consistency of our test against I (d) alternatives for all d between zero and one than it is for the ADF test. In simulation, it makes little difference whether the new test or the ADF test is used. The consistency of the Double KPSS test depends on the consistency of the KPSS test and of the unit root test proposed in this chapter, and these in turn depend on the number of lags used in the estimation of the long-run variance going to infinity, but more slowly than sample size. Under this assumption a single critical value for each test (for each significance level) obtains, and these will be referred to as the “standard asymptotics” critical values. They do not depend on the kernel used to estimate the long-run variance or on the bandwidth (so long as the number of lags behaves as assumed above). However, following Kiefer and Vogelsang (2002a, 2002b, 2005), this chapter also considers “fixed-b asymptotics,” where b, defined as the ratio of the number of lags to the sample size, is constant as the sample size grows. The fixed-b critical values depend on the kernel and on the value of b, and there is evidence in many settings that they yield tests with smaller size distortions than the critical values based on the standard asymptotics. The main theoretical contribution is that the consistency of the Double-KPSS test is proved against I (d) for all d between zero and one. For the KPSS test, this can be shown using existing results except for the case of d = 1/2, so the divergence of the statistic for d = 1/2 is proven in this chapter. For the new unit root test, its asymptotic distribution 123 is established for d = 0, 0 < d < 1/2 and 1/2 < d < 1, and it is proven that the statistic converges to zero in probability when d = 1/2. Besides these theoretical results, this chapter contains substantial simulation results to show the extent to which this testing procedure is likely to be useful in finite samples. The plan of the chapter is as follows. Section 3.2 gives the definitions and basic properties of stationary short memory, long memory and unit root processes, and explicitly states the testing procedure. Section 3.3 gives the asymptotic results, using the standard asymptotics. The asymptotic limits of the two component tests are derived and consistency of the two-part test is proved. Section 3.4 presents the fixed-b asymptotics. Section 3.5 presents the results of simulations which explore the finite sample properties of the new test. Section 3.6 summarizes and concludes. Finally, an Appendix gives some proofs and technical details. 3.2 Setup and Assumptions The data is assumed to be generated by the DGP: yt = µ + t , t = 1, 2, ..., T. (2.1) That is, non-zero level of the yt series is allowed, but not trend. Allowing for trend would not change the basic principles of the research in this chapter, but it would change the asymptotics. 3.2.1 Null Hypothesis Under the null hypothesis { t }∞ t=1 is either a stationary short memory process or a unit root process. That is, either { t }∞ t=1 itself is a stationary short memory process or it is a cumulation of a short memory process. t Let {zt }∞ t=1 be a time series with zero mean, and let Zt = ∑ j=1 z j be its partial sum. 124 {zt }∞ t=1 is said to be a short-memory process if it satisfies the following two conditions. Assumption N1 σ2 = lim T −1 E ZT2 T →∞ exists and is nonzero. (2.2) Assumption N2 ∀r ∈ [0, 1], T −1/2 Z[rT ] ⇒ σW (r ), (2.3) where [rT ] denotes the integer part of rT, ⇒ means weak convergence, and W (r ) is the standard Wiener process. In addition to Assumptions N1 and N2, further regularity conditions are necessary for the consistency of HAC (heteroskedasticity and autocorrelation consistent) estimators. Examples of such conditions can be found in Andrews (1991), Newey and West (1987), De Jong and Davidson (2000), Jansson (2002), and Hansen (1992). It is implicitly assumed that one or more of these sets of conditions hold, so that the HAC estimators that appear in our test statistics are consistent. Unit root processes are the other class of DGP which belongs to the null hypothesis. A time series is said to be a unit root process if its first difference is a short memory process. Equivalently, a cumulation of a short memory process is a unit root process. That is, Zt is a unit root process if (1 − L) Zt ≡ zt ∼ short memory process. 125 (2.4) 3.2.2 Alternative Hypothesis Under the alternative hypothesis, { t }∞ t=1 is a fractionally integrated process. Specifically, consider the alternative that t follows an I (d) process with 0 < d < 1 : (1 − L ) d t = ut , ut ∼ i.i.d Normal(0, σu2 ), The class of I (d) processes with 0 < d < 1 2 (2.5) has been widely used in econometrics to represent long memory processes1 . More generally, a stationary process is said to have long memory if n lim n→∞ ∑ γ j = ∞, (2.6) j=−n where γ j is the autocovariance at lag j. Lo (1991) uses the following form of autocovariance function as a definition of a long-range dependent (long memory) process.      j2d−1 L( j) for d ∈ (0, 1 ) or  2 γj ∼ as j → ∞ ,  − j2d−1 L( j) for d ∈ (− 1 , 0)    (2.7) 2 where L( j) is a slowly varying function2 at infinity. This form of autocovariance function includes the autocovariance function of the I (d) process with 0 < d < 1 2, and is more general in the sense that it would accommodate the case that ut in (2.5) is a short memory but not necessarily an i.i.d. process. However, it does not accommodate the case of 1 2 ≤ d < 1. This chapter considers the I (d) process with i.i.d. innovations as in equation (2.5), which was analyzed by Granger and Joyeux (1980) and Hosking (1981). When − 21 < d < 1 2, d = 0, the process is a stationary long memory process, while it is a nonstationry long memory process for 1 2 ≤ d < 1. For d > − 21 , the process is invertible and has infinite 1 For more comprehensive treatment of this topic, see Giraitis et al. (2012). (1991, page 1286): A function L( x ) is said to be slowly varying at infinity if limt→∞ L(tx )/L(t) = 1. An example is log x. 2 Lo 126 order moving average representation: t = (1 − L ) −d ∞ ut = ∑ b j ut− j , bj = j =0 and when d < 1 2 Γ ( j + d) Γ ( d ) Γ ( j + 1) (2.8) it has infinite order AR representation: (1 − L ) d ∞ t = ∑ aj t− j = ut , a0 = 1, a j = j =0 Γ ( j − d) for j ≥ 1. Γ (1 − d ) Γ ( j + 1) (2.9) Also, for − 12 < d < 12 , d = 0 the process has a slowly decaying autocovariance function given by σu2 Γ (1 − 2d) Γ (k + d) γk = ∼ ck2d−1 as k → ∞ for some constant c, Γ ( d ) Γ (1 − d ) Γ ( k + 1 − d ) (2.10) and this autocovariance function satisfies (2.6) and (2.7). Hosking (1981) provided further results with ARMA(p, q) innovations. To establish the asymptotic results in this chapter, an invariance principle under the alternative hypothesis is needed. Davydov (1970) and Sowell (1990) provide an invariance principle for the fractionally integrated processes with i.i.d. innovations. Lee and Schmidt (1996) use the result in Sowell (1990), replacing his rth moment-condition by a normality assumption. Lo (1991) bases his asymptotic analysis upon the result of Taqqu (1975), assuming stationarity and Gaussianity of the long-memory process. More recently, Qiu and Lin (2011) proved an invariance principle for the fractionally integrated process with strong near-epoch dependent innovations, which is the most general functional central limit theorem for fractionally integrated processes currently available. The analysis in this chapter will focus on the I (d) process with normal i.i.d. innovations, following Lee and Schmidt (1996). Specifically, this chapter will use the functional central limit theorem 127 appearing in Sowell (1990) which is restated in Lee and Schmidt (1996), as follows.3 1 Suppose {zt }∞ t=1 is generated by (2.5) with − 2 < d < 1 2 and let σT2 = var ( ZT ) . From Sowell (1990), σT2 = σu2 · Γ (1 + d + T ) Γ (1 + d ) Γ (1 − 2d) × − Γ ( T − d) Γ (−d) (1 + 2d) Γ (1 + d) Γ (1 − d) (2.11) and as T → ∞, σT2 T 1+2d → σu2 · Γ (1 − 2d) ≡ ωd2 . (1 + 2d) Γ (1 + d) Γ (1 − d) (2.12) Also, Sowell (1990) gives the following invariance principle for the fractionally integrated processes with − 21 < d < 1 2 : σT−1 Z[rT ] ⇒ Wd (r ), (2.13) T −(d+1/2) Z[rT ] ⇒ ωd Wd (r ), (2.14) or equivalently where Wd (r ) is the fractional Brownian motion of Mandelbrot and Van Ness (1968), which is defined as Wd (r ) = 3.2.3 1 · Γ (1 + d ) 1 0 (r − s)d dW (s) (2.15) Test Statistics and the Rejection Rule As above, the model is: yt = µ + t , t = 1, 2, ..., T. The null hypothesis is: H0 : t (2.16) is a stationary short-memory process or a unit root process with short-memory innovations, and the alternative hypothesis is: H1 : t is an I (d) process with 0 < d < 1 and with normal i.i.d. innovations. 3 This result could presumably be generalized under the more general conditions in Qiu and Lin (2011). The technical details involved would be orthogonal to the main points of this paper. 128 New rejection rule will be based on two statistics. The first statistic, denoted by K1 , should have a non-degenerate limit only when the innovation t is stationary short- memory, and should diverge when the innovation has a unit root or is a long memory process. Conversely, the second statistic, denoted by K2 , should have a non-degenerate asymptotic distribution only when the innovation has a unit root, and should converge to zero when the t is a stationary short memory or when it is a long memory (0 < d < 1) process. If one can find two such statistics, K1 and K2 , the following rejection rule will asymptotically have size of 5% : Reject H0 if K1 > cv10.95 and K2 < cv20.05 , where cv10.95 is the upper 5% critical value from the asymptotic distribution of K1 when the error term is a short memory process, and cv20.05 is the lower 5% critical value from the asymptotic distribution of K2 when the error term is a unit root process. Now two specific statistics (K1 and K2 ) are proposed to make this procedure operational. The first statistic (K1 ) is the KPSS statistic (KPSS, 1992) which is defined as ηµ = T −2 ∑tT=1 St2 , s2 ( l ) (2.17) where s2 (l ) is a HAC estimator. This statistic is constructed using the OLS residuals ej T j =1 from equation (2.16). More specifically, et = yt − y = yt − 1 T T ∑ y j , St = j =1 t ∑ ej, (2.18) j =1 l s2 (l ) = γ0 + 2 ∑ w(s, l )γs , s =1 where w(s, l ) = 1 − s l +1 (Bartlett kernel) , γs = 1 T ∑ Tj=s+1 e j e j−s . Note that with the Bartlett kernel the number of lags, l, determines the maximum lag of the sample autocovariances 129 considered for estimating the long run variance of the error term. It is assumed that l → ∞ but l/T → 0 as T → ∞. This assumption characterizes the "traditional asymptotics" and is made to ensure the consistency of the test. Later Section will make some comparisions to the "fixed-b" asymptotics that arise when b ≡ l +1 T has a fixed non-zero limit. Section 3.3 collects existing asymptotic results for ηµ . The only case in which ηµ has a non-degenerate limiting distribution occurs when the innovation follows a short-memory process. Under all the other cases, i.e. a unit root process and the processes described by our alternative hypothesis, ηµ diverges to infinity. Also, a statistic K2 is needed that has a non-degenerate limit only under the unit root innovation processes. Shin and Schmidt (1992) and Breitung (2002) considered a slightly modified KPSS statistic, l T ηµ , which converges to a non-degenerate limiting distribution under the unit root innovation process and goes to zero under short-memory error processes. But as shown by Lee and Amsler (1997), this statistic cannot distinguish I (1) processes from I (d) processes with 1 2 < d < 1. This chapter therefore suggest an alter- native statistic which can distinguish I (1) processes from I (d) processes with (nonstationary long-memory processes) as well as from I (d) with 0 < d < 1 2 1 2 ≤d<1 (stationary long-memory processes) and stationary short-memory processes. Consider the differenced model, yt = t = et . (2.19) The second statistic K2 will be the KPSS statistic based on the differenced data { yt }tT=2 . The statistic is given by ηµd = T −2 ∑tT=2 St2 , s2 ( l ) 130 (2.20) where t St = t ∑ yj = j =2 ∑ j = t − 1, (2.21) j =2 l s2 (l ) = γ0 + 2 ∑ w(s, l )γs , s =1 with γs = 1 T −1 ∑tT=s+2 ∆yt ∆yt−s = 1 T −1 ∑tT=s+2 ∆ t ∆ t−s . It will be shown that this statistic has a non-degenerate distribution in the case of a unit root, while it converges to zero under stationary short memory or under I (d) with 0 < d < 1. Note that this will be a lower-tail test. Other unit root test could have considered, notably variants of the Dickey-Fuller test. The simulations will also compare the results with different unit root tests. From a technical point of view, ηµd is attractive because the proof of the consistency of the test is relatively straightforward. 3.3 Asymptotic Results This Section discusses the asymptotic distributions of the ηµ and ηµd statistics when t is I (0), I (d) with 0 < d < 1, and I (1). The asymptotic theory is established under the assumption that, as T → ∞, l → ∞ but l/T → 0, where l is the number of lags used in the Bartlett kernel for estimation of the relevant long-run variance. These are the "traditional asymptotics," as opposed to the "fixed-b asymptotics" which will be discussed in a later Section of this chapter. 3.3.1 Asymptotic Results for ηµ The existing results on the limit of ηµ are collected from Kwiatkowski et al. (1992), Shin and Schmidt (1992), Lee and Schmidt (1996), and Lee and Amsler (1997). Table 3.1 shows those results. Note that the case of d = 1 2 is missing in Table 3.1. Results for d = 1 2 similar to those in Table 3.1 are not currently available, so this case will be treated separately in 131 this Section. The implications of these results for the asymptotic distribution of ηµ are summarized in Theorem 7. Theorem 7. Given the data generation process in (2.16), the KPSS statistic, ηµ defined in (2.17) has following asymptotic limits. A. When t is a short-memory process (Kwiatkowski et al. (1992)), 1 ηµ ⇒ B. When t 0 B(r )2 dr. (2.22) is a unit root process (Shin and Schmidt (1992)), l ηµ ⇒ T 2 r 0 W ( s ) ds dr 1 2 0 W ( s ) ds 1 0 (2.23) implying ηµ → ∞ in probability. C. When t is a fractionally integrated process with 0 < d < 1/2 (Lee and Schmidt (1996)), l T 2d 1 ηµ ⇒ 0 Bd (r )2 dr (2.24) implying ηµ → ∞ in probability. D. When t is a fractionally integrated process with 1/2 < d < 1 (Lee and Amsler (1997)), l T ηµ ⇒ 1 0 2 r 0 W d∗ ( s ) ds dr 1 2 0 W d∗ ( s ) ds (2.25) implying ηµ → ∞ in probability. The above results cover all of the cases except d = 1/2. This case is covered by the following Theorem. Theorem 8. Given the data generation process in (2.16), the KPSS statistic, ηµ defined in (2.17) diverges to infinity when the error term is an I ( 21 ) process. 132 Proof: See the Appendix. Theorems 7 and 8 imply that the KPSS ηµ test is consistent against a unit root and also against I (d) alternatives for all 0 < d < 1. 3.3.2 Asymptotic Results for ηµd This Section considers the asymptotic behavior of ηµd under stationary short- or longmemory errors, under unit root errors, under nonstationary long-memory errors with 1 2 < d < 1, and under nonstationary long-memory errors with d = 12 . Recall ηµd ≡ T −2 ∑tT=2 St2 , with St ≡ s2 ( l ) t ∑ ∆y j = j =2 t ∑∆ j = t − 1, j =2 T 1 ∆yt ∆yt−s , s (l ) = γ0 + 2 ∑ w(s, l )γs , and γs = T − 1 t=∑ s +2 s =1 l 2 where ∆yt = yt − yt−1 , and ∆ t = t − t −1 . Since ηµd is a unit root test statistic, its asymptotic distribution is established under first, the unit root null. Then it will be shown that the limit of the statistic is zero for the case of stationary short memory and for I (d) processes with 0 < d < 1. Proposition 8. Under the data generation process in (2.16) with the error term root process, and under the assumption that l → ∞ and l T t being a unit → 0 as T → ∞, the statistic, ηµd defined in (2.20) weakly converges: ηµd ⇒ 1 0 W (r )2 dr, (2.26) where W (r ) is the standard Wiener process. Proof: See the Appendix. This is a lower tail test. The 1%, 5% and 10% lower tail critical values are 0.034, 0.056, and 0.076, respectively. These are different from the critical values of the KPSS unit root 133 test in Shin and Schmidt (1992) and Breitung (2002) because the data is differenced instead of demeaning the terms in St . As a consequence the result in (2.26) involves an ordinary Wiener process as opposed to a demeaned Wiener process. The next three results together prove that the limit of the statistic is zero for all cases except the case of unit root errors. Theorem 9. Under the data generation process in (2.16) with the error term short- or long-memory process and under the assumption that l → ∞ and l T t being a stationary → 0 as T → ∞, the statistic, ηµd defined in (2.20) has the following limiting distribution: l T where γ0 = E 2 t −1 d ηµd → γ0 + 12 , 2γ0 (2.27) p . Therefore, ηµd → 0. Proof: See the Appendix. The next Proposition shows the statistic ηµd can distinguish fractionally integrated processes with 1 2 < d < 1 from unit root processes. Recall that this is not the case for the KPSS unit root test (Lee and Amsler (1997)). Proposition 9. Under the data generation process in (2.16) with the error term ally integrated with i.i.d. normal innovations and with 1 2 t being fraction- < d < 1 (so, a nonstationary long- memory process), and under the assumption that l → ∞ and l T → 0 as T → ∞, the statistic, ηµd defined in (2.20) has the following limiting distribution: l T 2d∗ ηµd ⇒ 1 0 Wd∗ (r )2 dr, (2.28) p where d∗ = d − 1 and Wd∗ (r ) is the fractional Brownian motion. Therefore, ηµd → 0. Proof: See the Appendix. Lastly, consider the case of I 1 2 . 134 Theorem 10. Under the data generation process in (2.16) with the error term integrated with i.i.d. normal innovations and with d = process) and under the assumption that l → ∞ and l T 1 2 t being fractionally (so, a nonstationary long-memory → 0 as T → ∞, the statistic, ηµd defined in (2.20) converges to zero. Proof: See the Appendix. 3.3.3 Correct Size and Consistency of the Double-KPSS Test Now go back to the rejection rule in Section 3.2. The null of integer integration is rejected if the KPSS test rejects short memory and if the unit root test based on ηµd rejects unit root. This two-part test has correct size asymptotically and is consistent against I (d) alternatives with 0 < d < 1. Proposition 10. Suppose the data generation process is given by (2.16). Under the null hypothesis of integer integration and under the assumption that l → ∞ and l T → 0 as T → ∞, the rejection rule Reject H0 if ηµ > cv10.95 and ηµd < cv20.05 , gives a test with asymptotic size of 5%, where cv10.95 is the upper 5% percentile of (2.22) and cv20.05 is the lower 5% percentile of (2.26). Also, the test is consistent againt the alternative hypothesis of I (d) with 0 < d < 1. Proof: The Proposition follows immediately from the results of Sections 3.3.1 and 3.3.2. (1) If the series is I (0), asymptotically the KPSS test will reject with probability 0.05 and the ηµd test will reject with probability one, so the Double-KPSS test will reject with probability 0.05. (2) If the series is I (1), the KPSS test will reject with probability one and the ηµd test will reject with probability 0.05, so the Double-KPSS test will reject with probability 0.05. (3) If the series is I (d) with 0 < d < 1, asymptotically both tests will reject with probability one, and so the Double-KPSS test will reject with probability one. 135 3.4 Fixed-b Asymptotic Results Let b = l +1 T , the ratio of the number of lags (plus one) to the sample size. The asymptotic results of the previous Section were obtained under the assumption that b → 0 as T → ∞. This Section discusses the asymptotic distribution of the ηµ and ηµd statistics under the "fixed-b" assumption that b is held constant as T → ∞. The idea of fixed-b asymptotics was proposed by Kiefer and Vogelsang (2005) and Hashimzade and Vogelsang (2008). The fixed-b approach gives a random limit of the HAC estimator which depends on the choice of kernel and the bandwidth ratio b. The fixed-b approach is known to produce a better finite sample approximation to the distribution of test statistics in a variety of settings. Amsler et al. (2009) derived the fixed-b asymptotic distribution of the KPSS ηµ statistic under the I (0) null and under the I (1) alternative. Proposition 11 present their results. Proposition 11. Given the data generation process in (2.16), under the assumption that b = l +1 T ∈ [0, 1] is held constant as T increases, the KPSS statistic, ηµ defined in (2.17) has the following fixed-b asymptotic limits. When t is a short-memory process, ηµ ⇒ where Q0 (b) ≡ When t 2 b 1 0 B(r )2 dr − 1− b 0 B(r )2 dr , Q0 ( b ) (2.29) B(r ) B(r + b)dr with B(r ) ≡ W (r ) − rW (1). is a unit root process, ηµ ⇒ where Q1 (b) ≡ W (s) − 1 0 2 b 1 0 P(r )2 dr − 1− b 0 1 0 P(r )2 dr , Q1 ( b ) P(r ) P(r + b)dr with P(r ) ≡ 1 0 W ( u ) du. Proof: See Amsler et al. (2009). 136 (2.30) r 0 W ( s ) ds with W (s) ≡ As Proposition 11 shows, the KPSS statistic ηµ has a nondegenerate limit under both I (0) and I (1) data generation processes. So the KPSS ηµ test does not give a consistent test against the unit root under the fixed-b assumption. In the present context, the DoubleKPSS test would be conservative (undersized). If the DGP is I (1), the ηµd test would reject with probability 0.05, but the KPSS ηµ test would reject with probability less than one, under fixed-b asymptotics. So the probability of both tests rejecting would be less than 0.05. However, these issues should not be regarded as consequential. The assumption of fixed-bandwidth ratio does not recommend any rules for selecting the number of lags. But, no matter how one chooses the number of lags, it will be positive, and the fixed-b critical values will usually give a test of more accurate size than the traditional (b = 0) critical values. That is, fixed-b asymptotics is simply viewed as a way of generating a more accurate approximation to the finite sample distribution of the statistic. The next Proposition provides the fixed-b limits of ηµd under I (0) and I (1) data generation processes. To avoid confusion with the previous definition of b and l, let the number of lags and the ratio be denoted by l and b = l +1 T −1 . (We have T − 1 instead of T because one observation is used up in differencing.) Proposition 12. Given the data generation process in (2.16), under the assumption that b = l +1 T −1 ∈ [0, 1] is held constant as T increases, the KPSS statistic, ηµd has the following fixed-b asymptotic limits: When t is a short-memory process, ηµd where γ0 = E When t 2 t and ∞ ⇒ γ0 + 2 b γ0 + denotes the weak limit of 2 1 2 1 + T 2 ∞ , (2.31) as T → ∞. is a unit root process, ηµd ⇒ 1 2 0 W (r ) dr , Q1d (b ) 137 (2.32) where W (r ) is the standard Wiener process and Q1d (b ) ≡ 2 b 1 0 W (r )2 dr − 1− b 0 W (r )W (r + b )dr − 1 1− b W (r )W (1)dr + W (1)2 . Proof: See the Appendix. The fixed-b critical values for ηµ and ηµd are simulated using i.i.d. N(0,1) pseudo random numbers with T = 1, 000 and 50, 000 replications. Table 3.2 provides these fixed-b critical values. The fixed-b critical values for ηµ are slightly different from those in Amsler et al. (2009). This is partly due to randomness of the simulations, but it is also due to a slight difference in the definitions of b. They had b = l T whereas now we have b = l +1 T . Since the critical values were simulated using T = 1, 000, b = 0.02 in Amsler et al. (2009) would correspond to b = 0.021 in this chapter, for example. For bigger T (i.e. asymptotically) this difference obviously disappears. The simulation results in Amsler et al. (2009) showed that the size distortion associated with strong short-run persistence of the I (0) DGP can be fixed by using a relatively large number of lags and the corresponding fixed-b critical values. In their results, with the original KPSS critical value being used, as the short run persistence gets higher, the overrejection of the KPSS test gets worse. One can reduce the rejection frequency by using a higher number of lags but now the test becomes subject to an underrejection problem which is translated into low power. However, by taking a relatively large number of lags and using the fixed-b critical values, this problem can be partially fixed. Exactly the same considerations apply to the ηµd test and the Double-KPSS test. 3.5 Monte Carlo Simulations This Section reports the results of simulations designed to investigate the finite sample size and power properties of the Double-KPSS test. Some comparisons of the ηµd test and the ADF test will be made. 138 3.5.1 Design of the Experiment The data generating processes to be considered in the simulations are as follows. 1. I (0) DGP: yt = µ + t , t with µ = 0, 0 = ρ t −1 (2.33) + ut , = 0, ρ ∈ {0, 0.25, 0.5, 0.75, 0.95}, ut ∼ i.i.d. Normal (0, 1). 2. I (1) DGP: yt = µ + t , t = t −1 (2.34) + ηt , ηt = ut − φut−1 , with µ = 0, 0 = u0 = 0, φ ∈ {0, 0.25, 0.5, 0.75, 0.95}, ut ∼i.i.d. Normal (0, 1). 3. I (d) DGP: yt = µ + t , (1 − L ) d t (2.35) = ut ∼ i.i.d. Normal (0, 1), with µ = 0, and d ∈ {0.1, 0.2, 0.3, 0.4, 0.45, 0.499, 0.5, 0.6, 0.7, 0.75, 0.8, 0.9}. To generate I (d) processes with 0 < d < 1 2 Toeplitz matrix was used (formed from the autocovariances, as in Diebold and Rudebusch (1991)). For the case of 1 2 ≤ d < 1, first generated I (d) processes with − 12 ≤ d < 0 (again using the Toeplitz matrix) and cumulated them to obtain the I (d) processes with 1 2 ≤ d < 1. This is the same procedure as in Lee and Schmidt (1996). The experiments considered T = 50, 100, 200, 500, 1, 000, and 139 2, 000 and the number of replications was 5, 000. The numbers of lags used for computing the statistics were l0 (= 0), l4 , l12 , l25 and l50 , where lk ≡ k · T 100 1/4 . (2.36) For the ADF test, with p lags, p4 , p12 , and p25 lags were considered, where pk is defined in the same way as in equation (2.36). As a matter of notation, ηµ × ηµd will denote the Double-KPSS test based on ηµ and ηµd . Similarly ηµ × ADF will denote the double test but using the ADF test instead of ηµd . 3.5.2 Results with Standard Critical Values This Section discusses the results for the ηµ , ηµd and ηµ × ηµd tests, using the "standard" critical values that are valid asymptotically when l → ∞ and l T → 0 as T → ∞. These results are given in Tables 3.3, 3.4 and 3.5. Each table contains the results for two sample sizes (3.3: T = 50 and 100; 3.4: T = 200 and 500; 3.5: T = 1, 000 and 2, 000). The formatting for each sample size is the same. The results for the KPSS ηµ test are similar to those from previous simulations and will be discussed only briefly. Size under the I (0) null with ρ = 0 is essentially correct for l0 but the test is undersized with more lags. The test is oversized when ρ > 0 and severely so for the largest values of ρ (like ρ = 0.95). Size improves very slowly as T increases. Power against an I (1) alternative rises when T increases, falls as the number of lags increases, and falls as φ increases (since the series approaches stationarity as φ → 1). Power against I (d) alternatives grows with d, falls as the number of lags increases, and grows (but slowly) as T increases. The results for the ηµd unit root test show a pattern that is similar to what was seen for the KPSS ηµ test, but reversed. Size under the I (1) null with φ = 0 is essentially correct, but the test is undersized with more lags. The test is oversized when φ > 0 140 and severely so for the largest values of φ (like φ = 0.95). Size generally improves as T increases. Power against I (0) alternatives rises when T increases, falls as the number of lags increases, and falls as ρ increases (since the series approaches I (1) as ρ → 1). Power against I (d) alternatives grows as d decreases, falls as the number of lags increases, and grows as T increases. All of these statements would also be true for the ADF test. Some comparisons of the performance of the ηµd test and the ADF test will be given in Section 3.5.4. Now turn to the issue of main interest, the performance of the Double-KPSS (ηµ × ηµd ) test. This test rejects the null of integer integration if both the ηµ short-memory test and the ηµd unit root test reject their respective null hypotheses. As a result, the upward size distortions caused by short run dynamics must be smaller for the Double-KPSS test than for either of the individual tests. In many cases the rejection probability for the DoubleKPSS test is at least approximately equal to the product of the rejection probabilities for the two component tests, but this is not always the case (The two tests are not independent). Consider first the size of the Double-KPSS test under the I (0) null. In the most empirically relevant cases, like l4 lags with T = 100, or l12 lags with T = 200 or 500, its size is reasonably accurate, except perhaps for the biggest values of ρ. For the largest sample sizes (T = 1, 000 and 2, 000) the test has fairly accurate size, except for the case of ρ = 0.95, if the test uses l12 × l12 or l25 × l25 lags. However, the size of the test does not improve uniformly as T increases, because loosely speaking the power of the unit root test goes to one faster than the size of the short-memory test goes to 0.05. But as a general statement the size of the test is surprisingly good over a broad range of values of ρ. Now consider the size of the test under the I (1) null. Once again the size is reasonably accurate, except perhaps for the biggest values of φ, if reasonable numbers of lags, like l4 lags with T = 100, or l12 lags with T = 200 or 500 are used. The test is if anything undersized (due to the use of the standard critical values despite the positive number of 141 lags) for the smaller values of φ. For the largest values of T (1, 000 and 2, 000), size is quite good with l12 × l12 or l25 × l25 lags, except when φ = 0.75 or 0.95. Finally, consider the power of the test against I (d) alternatives. Now there is a potential problem, because if one uses the numbers of lags mentioned above as sufficient to control size, power is low. For example, if l12 lags, with T = 200 is used, the highest power is only 0.099 (against d = 0.4) and with T = 500 the highest power is 0.386 (against d = 0.5). Of course, there is a trade-off between size and power. If one uses only l4 lags, maximal power is 0.488 for T = 200 and 0.786 for T = 500. But with only l4 lags, there are large size distortions under the null for the larger values of ρ (for the I (0) null) or φ (for the I (1) null). It takes a very large sample size (like T = 1, 000 or 2, 000) to have reasonable power with l12 lags. So, what can one conclude from these simulations? In a view the main practical question is how large the sample size needs to be so that one can reasonably conclude that a rejection from the test is due to its power against a fractional alternative, as opposed to size distortions of one or both of the two component tests. This obviously will depend on the values of d against which we require power, as well as the values of the nuisance parameters that we want the null hypothesis to encompass. As an extreme example, there is no hope of success if we want to include in the I (0) null AR processes with local to unity roots, or if we want to include in the I (1) null ARI MA(0, 1, 1) processes with local to unity MA roots. The simulations results seem to indicate that the test can in fact reasonably distinguish fractional integration from non-extreme I (0) or I (1) processes, but that it will take a large sample size to do so. For example, for T = 500 and for the tests using l12 lags, power for d in the range [0.3, 0.7] is at least twice as large as the maximal size distortion for I (0) processes with AR roots less than or equal to 0.75 or for I (1) processes with MA roots less than or equal to 0.75. For smaller sample sizes, this statement would not be true, and to make a similar statement that is true would require a smaller range of d and/or a more 142 restrictive range of AR or MA roots. Conversely, to make a similar statement that is true for a larger range of d or of AR and MA parameters will require a larger sample size. For example, with T = 2000, power for d in the range [0.2, 0.8] is at least twice as large as the maximal size distortion for I (0) processes with ρ less than or equal to 0.75 or for I (1) processes with φ less than or equal to 0.75. The obvious problem with these statements is that, for economic time series data, T = 2000, or for that matter T = 500, is a very large sample size. 3.5.3 Results with Fixed-b Critical Values The fixed-b critical values, for the relevant values of b, are smaller than the traditional critical values for the KPSS ηµ test (an upper tail test) and larger for the ηµd test (a lower tail test). The fixed-b critical values will therefore lead to more rejections than the traditional critical values, if the same number of lags is used in both cases. If the number of lags increases with sample size but more slowly than sample size, the difference in the critical values (and the rejection probabilities) will go to zero since b will go to zero. The fixed-b critical values are very successful in removing the underrejection problem that occurs for the KPSS ηµ and ηµd tests when there are no short-run dynamics and the sample size is not large. For example, for KPSS ηµ with T = 50 and I (0) data with ρ = 0, and with l12 lags, compare size of 0.014 with traditional critical values to 0.053 with fixedb critical values. Or, for the ηµd test with T = 50 and I (1) data with φ = 0, and with l12 lags, compare 0.000 with traditional critical values to 0.039 with fixed-b critical values. For the Double-KPSS test there is also improvement in size in these cases from using the fixed-b critical values, but the improvement is not so striking. Upward size distortions in the presence of short run dynamics (I (0) data with large positive ρ or I (1) data with large positive φ) are worse when the fixed-b critical values are used. Also power against I (d) alternatives is higher when the fixed-b critical values are used. However, these differences are not large when the sample size is big enough for us 143 to have reasonable power (e.g., T greater than or equal to 500). Of course, that is because the rule for the choice of lags in the simulations implies that b goes to zero as T grows, but that is a reasonable feature for such a rule to have. Using the fixed-b critical values is recommended but recognize that if the number of lags is chosen reasonably this is likely to not make much difference. 3.5.4 Comparison of the ηµd Test and the ADF Test Although it is not the focus of this research, a new unit root test has been proposed in this chapter and it is relevant to ask how it compares to other existing unit root tests. There are of course a great many other existing unit root tests. The ADF τµ test will be taken as a standard of comparison . The number of lagged differences included in the ADF regression is denoted as p, and in making comparisons of size and power a value of p will be matched to the same value of l, the number of lags used for long run variance estimation in the ηµd test. Table 3.9 gives size and power for the ADF test for T = 50, 100, 200 and 500, and these results can be compared to the results previously given in Table 3.3 and 3.4. In terms of the size of the test, the results are mixed. However, for the larger sample sizes, the ADF test with p12 lags has smaller size distortions than the ηµd test with l12 lags for the larger values of φ. The ADF test generally has higher power against I (0) alternatives, while the ηµd test has higher power against I (d) alternatives except when d is very small (e.g. 0.1). If one compares the Double-KPSS (ηµ × ηµd ) test to the ηµ × ADF test, similar statements apply, but the differences are much smaller. In fact, the similarities between these two double tests far outweigh the differences. Unsurprisingly, perhaps, the precise choice of unit root test to use is not the main issue here. 144 3.6 Conclusions This chapter proposed a Double KPSS test to test the null of integer integration (I (0) or I (1)) against the alternative of fractional integration (I (d) with d between zero and one). The null of integer integration is rejected if the KPSS test rejects the null of short memory and a unit root test rejects the null of a unit root. A new unit root test was suggested for use in this testing procedure, but any other unit root test like the ADF test is also possible. This would be a good preliminary test to use before estimating a fractional model. An alternative, of course, is to just estimate the fractional model and see whether the estimated d is significantly different from zero and from one. However, there appears to be no clear consensus in the existing literature on how to allow for short-run dynamics in estimating d and conducting inference about it. The consistency of the test were proved. The main practical question is how large the sample size needs to be so that one can reasonably conclude that a rejection from the test is due to its power against a fractional alternative, as opposed to size distortions of the two component tests. The simulations results seem to indicate that the test can in fact distinguish fractional integration from non-extreme I (0) or I (1) processes, but that it will take a very large sample size to do reliably. This is not a surprising result. It takes a lot of data to distinguish I (0) from I (1) processes, if the range of short-run dynamics is not severely restricted. Now we are trying to do more, for example, to distinguish a unit root process from an I (d) process with d = 0.8. An important contribution of this chapter is to try to quantify how much data that takes. The required sample sizes would be very large indeed for macroeconomic applications, but perhaps not for applications in finance. 145 Table 3.1: Summary of the existing asymptotic results for ηµ H0 t ∼ St = ∑tj=1 e j : I (0) √1 S[rT ] T I (1) 1 S T 3/2 [rT ] ⇒ σB(r ), where B(r ) = W (r ) − rW (1) ∑tT=1 St2 ⇒ σ2 ∑ St2 : 1 T2 s2 ( l ) : s2 ( l ) → σ 2 1 0 where W (s) = 1 T4 B(r )2 dr p r 0 W ( s ) ds, 1 W (s) − 0 W (u)du ⇒σ ∑tT=1 St2 ⇒ σ2 1 2 lT s ( l ) 2 r 0 W ( s ) ds dr 1 0 1 2 0 W ( s ) ds ⇒ σ2 H1 t ∼ St = ∑tj=1 e j : ∑ St2 : s2 ( l ) : I (d), 0 < d < 1/2 I (d), 1/2 < d < 1 1 S ⇒ ωd Bd (r ), T d+1/2 [rT ] where Bd (r ) = Wd (r ) − rWd (1) 1 T 2( d +1) ∑tT=1 St2 ⇒ ωd2 1 0 Bd (r )2 dr p l −2d s2 (l ) → ωd2 ⇒ ωd∗ 0 W d∗ (s)ds, where d∗ = d − 1 and 1 W d∗ (r ) = Wd∗ (r ) − 0 Wd∗ (s)ds 1 T 2d∗ +4 ∑tT=1 St2 ⇒ ωd2∗ 1 s2 ( l ) lT 2d∗ +1 146 r 1 S T d∗ +3/2 [rT ] 1 0 ⇒ ωd2∗ 2 r 0 W d∗ ( s ) ds dr 1 2 0 W d∗ ( s ) ds Table 3.2: Fixed-b Critical Values for ηµ and ηµd , Bartlett kernel, l = bT − 1, l = b ( T − 1) − 1 b 0 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20 0.22 0.24 0.26 0.28 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 ηµ upper tail 5% 1% 0.463 0.739 0.453 0.705 0.446 0.673 0.439 0.639 0.434 0.609 0.428 0.582 0.421 0.561 0.416 0.541 0.409 0.522 0.403 0.504 0.398 0.489 0.394 0.476 0.391 0.464 0.388 0.455 0.385 0.447 0.383 0.441 0.381 0.435 0.380 0.430 0.381 0.426 0.380 0.424 0.381 0.422 0.383 0.421 0.386 0.422 0.390 0.425 0.394 0.430 0.400 0.437 ηµd lower tail 1% 5% 0.034 0.056 0.038 0.060 0.042 0.063 0.045 0.067 0.049 0.071 0.053 0.075 0.057 0.078 0.062 0.082 0.066 0.086 0.069 0.090 0.073 0.095 0.077 0.099 0.080 0.103 0.083 0.107 0.086 0.111 0.088 0.115 0.090 0.119 0.092 0.122 0.093 0.126 0.095 0.129 0.095 0.132 0.096 0.135 0.098 0.137 0.098 0.139 0.099 0.141 0.100 0.142 b 0.52 0.54 0.56 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1 147 ηµ upper tail 5% 1% 0.405 0.445 0.410 0.453 0.414 0.461 0.419 0.468 0.424 0.473 0.429 0.479 0.432 0.485 0.436 0.490 0.439 0.493 0.442 0.496 0.445 0.499 0.448 0.499 0.449 0.501 0.452 0.500 0.454 0.498 0.456 0.498 0.458 0.496 0.462 0.494 0.465 0.490 0.468 0.488 0.473 0.487 0.478 0.487 0.484 0.488 0.491 0.492 NA NA ηµd lower tail 1% 5% 0.100 0.144 0.101 0.145 0.101 0.147 0.102 0.148 0.102 0.150 0.103 0.151 0.104 0.151 0.104 0.152 0.104 0.153 0.105 0.154 0.106 0.155 0.106 0.156 0.106 0.157 0.107 0.158 0.107 0.159 0.107 0.160 0.108 0.160 0.108 0.161 0.108 0.162 0.109 0.163 0.109 0.163 0.109 0.164 0.110 0.165 0.110 0.166 0.110 0.166 Table 3.3: Size and Power Using Standard 5% Critical Values with Traditional Lag Choices,T = 50 and 100 T=50 lag l0 ηµ l4 l12 l0 ηµd l4 l12 l0 × l0 ηµ × ηµd l4 × l4 l12 × l12 0.014 0.016 0.021 0.035 0.118 0.959 0.923 0.867 0.641 0.108 (Power) 0.478 0.484 0.478 0.324 0.048 0.001 0.001 0.001 0.001 0.001 0.050 0.134 0.280 0.379 0.082 (Size) 0.015 0.021 0.031 0.030 0.003 0.000 0.000 0.000 0.000 0.000 ∼ I (0) ρ=0 0.25 0.50 0.75 0.95 (Size) 0.053 0.045 0.149 0.062 0.336 0.102 0.653 0.214 0.924 0.530 ∼ I (1) φ= 0 0.25 0.50 0.75 0.95 0.961 0.944 0.895 0.709 0.126 (Power) 0.711 0.703 0.679 0.571 0.100 0.353 0.348 0.331 0.262 0.035 0.042 0.141 0.400 0.786 0.955 (Size) 0.021 0.028 0.064 0.208 0.466 0.000 0.000 0.000 0.000 0.001 0.030 0.112 0.321 0.518 0.117 (Size) 0.002 0.003 0.011 0.062 0.036 0.000 0.000 0.000 0.000 0.000 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.133 0.252 0.399 0.546 0.616 0.674 0.679 0.775 0.851 0.879 0.907 0.939 (Power) 0.082 0.138 0.198 0.276 0.322 0.372 0.369 0.456 0.533 0.571 0.603 0.662 0.021 0.036 0.053 0.076 0.084 0.100 0.100 0.131 0.174 0.200 0.229 0.283 0.946 0.925 0.890 0.820 0.774 0.723 0.720 0.572 0.389 0.294 0.215 0.102 (Power) 0.464 0.430 0.379 0.322 0.282 0.251 0.239 0.167 0.110 0.085 0.066 0.034 0.001 0.001 0.001 0.000 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.124 0.227 0.340 0.427 0.450 0.458 0.457 0.402 0.290 0.222 0.164 0.077 (Power) 0.029 0.043 0.045 0.050 0.047 0.042 0.043 0.027 0.017 0.012 0.009 0.003 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 l0 ηµ l4 l12 l0 ηµd l4 l12 l0 × l0 ηµ × ηµd l4 × l4 l12 × l12 0.027 0.033 0.041 0.068 0.250 0.999 0.993 0.983 0.944 0.262 (Power) 0.751 0.764 0.797 0.784 0.172 0.040 0.046 0.064 0.100 0.029 0.043 0.142 0.354 0.673 0.247 (Size) 0.027 0.036 0.063 0.137 0.055 0.001 0.001 0.000 0.001 0.000 0.006 0.006 0.006 0.008 0.032 0.044 0.149 0.431 0.769 0.299 (Size) 0.009 0.015 0.040 0.188 0.178 0.000 0.000 0.000 0.001 0.002 0.041 0.037 0.034 0.031 0.026 0.029 0.028 0.023 0.018 0.015 0.013 0.010 0.155 0.343 0.528 0.681 0.726 0.744 0.742 0.690 0.533 0.427 0.311 0.130 (Power) 0.061 0.113 0.164 0.201 0.207 0.202 0.201 0.151 0.091 0.069 0.047 0.022 0.001 0.001 0.001 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 t t t T=100 lag ∼ I (0) ρ=0 0.25 0.50 0.75 0.95 (Size) 0.043 0.040 0.143 0.052 0.361 0.088 0.718 0.194 0.978 0.591 ∼ I (1) φ=0 0.25 0.50 0.75 0.95 0.993 0.989 0.978 0.905 0.300 (Power) 0.821 0.818 0.809 0.747 0.265 0.578 0.577 0.571 0.541 0.186 (Size) 0.046 0.029 0.157 0.041 0.451 0.091 0.863 0.332 0.998 0.718 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.156 0.346 0.539 0.715 0.783 0.830 0.830 0.908 0.954 0.967 0.975 0.985 (Power) 0.091 0.173 0.272 0.373 0.424 0.473 0.474 0.561 0.640 0.678 0.710 0.774 0.052 0.086 0.131 0.181 0.214 0.238 0.241 0.310 0.380 0.410 0.440 0.513 0.997 0.992 0.983 0.958 0.936 0.907 0.904 0.775 0.572 0.454 0.330 0.139 t t t (Power) 0.738 0.713 0.670 0.609 0.564 0.509 0.509 0.370 0.235 0.175 0.128 0.062 148 Table 3.4: Size and Power Using Standard 5% Critical Values with Traditional Lag Choices, T = 200 and 500 T=200 lag l4 ηµ l12 l25 l4 ηµd l12 l25 l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 ∼ I (0) ρ=0 0.25 0.50 0.75 0.95 (Size) 0.044 0.040 0.061 0.047 0.098 0.055 0.223 0.079 0.711 0.311 0.029 0.031 0.033 0.042 0.127 0.945 0.951 0.965 0.975 0.574 (Power) 0.555 0.569 0.615 0.706 0.333 0.008 0.009 0.012 0.021 0.032 0.041 0.057 0.093 0.215 0.358 (Size) 0.022 0.026 0.033 0.050 0.046 0.000 0.000 0.001 0.000 0.000 ∼ I (1) φ=0 0.25 0.50 0.75 0.95 (Power) 0.948 0.720 0.948 0.721 0.945 0.720 0.919 0.707 0.557 0.466 0.524 0.525 0.523 0.514 0.340 0.037 0.058 0.127 0.445 0.917 (Size) 0.022 0.023 0.031 0.094 0.499 0.004 0.003 0.004 0.003 0.006 0.026 0.041 0.100 0.385 0.501 (Size) 0.004 0.004 0.006 0.028 0.211 0.000 0.000 0.000 0.000 0.001 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.132 0.264 0.403 0.540 0.597 0.646 0.647 0.742 0.819 0.852 0.879 0.923 (Power) 0.084 0.143 0.219 0.292 0.334 0.379 0.381 0.457 0.526 0.558 0.594 0.658 0.051 0.082 0.117 0.161 0.183 0.214 0.215 0.271 0.333 0.364 0.391 0.457 0.940 0.926 0.904 0.863 0.831 0.785 0.785 0.646 0.442 0.341 0.246 0.105 (Power) 0.550 0.534 0.500 0.439 0.391 0.348 0.350 0.237 0.145 0.111 0.079 0.042 0.009 0.008 0.008 0.008 0.009 0.008 0.007 0.007 0.006 0.006 0.006 0.005 0.121 0.241 0.358 0.457 0.483 0.486 0.488 0.447 0.321 0.253 0.184 0.074 (Power) 0.046 0.074 0.096 0.099 0.094 0.084 0.091 0.057 0.029 0.022 0.016 0.006 0.001 0.001 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 l4 ηµ l12 l25 l4 ηµd l12 l25 t t t T=500 lag l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 ∼ I (0) ρ=0 0.25 0.50 0.75 0.95 (Size) 0.054 0.049 0.067 0.054 0.094 0.060 0.192 0.084 0.728 0.322 0.043 0.044 0.047 0.056 0.143 0.997 0.997 0.999 1.000 0.995 (Power) 0.873 0.883 0.918 0.971 0.969 0.556 0.570 0.616 0.728 0.784 0.053 0.067 0.094 0.192 0.724 (Size) 0.042 0.046 0.054 0.080 0.305 0.023 0.025 0.028 0.039 0.081 ∼ I (1) φ=0 0.25 0.50 0.75 0.95 (Power) 0.992 0.901 0.992 0.901 0.992 0.901 0.990 0.897 0.888 0.791 0.730 0.729 0.728 0.728 0.661 0.045 0.059 0.123 0.440 0.981 (Size) 0.038 0.042 0.054 0.160 0.778 0.023 0.025 0.027 0.046 0.400 0.041 0.054 0.117 0.431 0.870 (Size) 0.020 0.022 0.030 0.114 0.601 0.003 0.003 0.004 0.012 0.231 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.173 0.365 0.555 0.721 0.785 0.832 0.830 0.906 0.950 0.964 0.974 0.987 (Power) 0.119 0.225 0.341 0.461 0.515 0.566 0.568 0.662 0.746 0.778 0.808 0.861 0.084 0.145 0.220 0.297 0.339 0.379 0.381 0.461 0.530 0.571 0.609 0.672 0.996 0.995 0.991 0.980 0.967 0.950 0.947 0.844 0.642 0.507 0.366 0.141 (Power) 0.868 0.860 0.839 0.799 0.760 0.715 0.715 0.565 0.361 0.272 0.186 0.083 0.555 0.546 0.515 0.460 0.418 0.363 0.364 0.242 0.139 0.102 0.076 0.042 0.173 0.362 0.549 0.705 0.755 0.786 0.782 0.756 0.600 0.477 0.345 0.134 (Power) 0.101 0.192 0.281 0.359 0.380 0.384 0.386 0.339 0.221 0.163 0.108 0.045 0.047 0.075 0.101 0.113 0.109 0.095 0.099 0.063 0.029 0.020 0.014 0.008 t t t 149 Table 3.5: Size and Power Using Standard 5% Critical Values with Traditional Lag Choices, T = 1,000 and 2,000 T=1,000 lag l4 ηµ l12 l25 l4 ηµd l12 l25 l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 ∼ I (0) ρ=0 0.25 0.50 0.75 0.95 (Size) 0.047 0.045 0.057 0.049 0.076 0.056 0.147 0.073 0.641 0.272 0.042 0.044 0.046 0.055 0.133 1.000 1.000 1.000 1.000 1.000 (Power) 0.966 0.971 0.985 0.999 1.000 0.804 0.817 0.866 0.946 0.996 0.047 0.057 0.076 0.147 0.641 (Size) 0.043 0.047 0.055 0.073 0.272 0.034 0.037 0.040 0.052 0.132 ∼ I (1) φ=0 0.25 0.50 0.75 0.95 (Power) 0.998 0.959 0.998 0.959 0.998 0.959 0.997 0.958 0.975 0.931 0.849 0.848 0.848 0.848 0.819 0.049 0.061 0.117 0.390 0.988 (Size) 0.042 0.046 0.064 0.174 0.851 0.034 0.037 0.042 0.085 0.599 0.048 0.060 0.116 0.387 0.963 (Size) 0.030 0.034 0.050 0.148 0.786 0.013 0.014 0.016 0.041 0.467 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.197 0.414 0.630 0.799 0.860 0.899 0.899 0.953 0.978 0.987 0.991 0.995 (Power) 0.139 0.273 0.422 0.576 0.637 0.695 0.696 0.792 0.859 0.885 0.906 0.939 0.106 0.192 0.299 0.406 0.459 0.507 0.509 0.600 0.681 0.713 0.747 0.800 1.000 1.000 0.999 0.997 0.994 0.986 0.984 0.925 0.740 0.603 0.446 0.186 (Power) 0.964 0.962 0.950 0.923 0.903 0.866 0.870 0.729 0.513 0.395 0.291 0.126 0.802 0.795 0.780 0.748 0.706 0.652 0.645 0.504 0.322 0.239 0.170 0.081 0.197 0.414 0.630 0.797 0.853 0.886 0.884 0.879 0.718 0.591 0.438 0.182 (Power) 0.133 0.262 0.398 0.524 0.567 0.591 0.593 0.555 0.410 0.319 0.234 0.099 0.085 0.150 0.227 0.293 0.311 0.305 0.300 0.257 0.159 0.112 0.076 0.032 l4 ηµ l12 l25 l4 ηµd l12 l25 t t t T=2,000 lag l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 ∼ I (0) ρ=0 0.25 0.50 0.75 0.95 (Size) 0.048 0.046 0.058 0.050 0.074 0.054 0.132 0.067 0.599 0.242 0.044 0.045 0.047 0.054 0.122 1.000 1.000 1.000 1.000 1.000 (Power) 0.995 0.996 0.999 1.000 1.000 0.940 0.949 0.971 0.994 1.000 0.048 0.058 0.074 0.132 0.599 (Size) 0.046 0.050 0.054 0.067 0.242 0.041 0.043 0.046 0.054 0.122 ∼ I (1) φ=0 0.25 0.50 0.75 0.95 (Power) 1.000 0.989 1.000 0.989 1.000 0.989 1.000 0.989 0.998 0.982 0.940 0.940 0.941 0.941 0.927 0.051 0.062 0.111 0.375 0.995 (Size) 0.045 0.049 0.066 0.165 0.878 0.040 0.041 0.048 0.092 0.677 0.051 0.062 0.110 0.375 0.993 (Size) 0.042 0.046 0.062 0.159 0.861 0.027 0.028 0.033 0.069 0.614 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.223 0.496 0.745 0.890 0.932 0.958 0.959 0.987 0.995 0.996 0.998 1.000 (Power) 0.163 0.345 0.533 0.693 0.764 0.815 0.817 0.892 0.938 0.952 0.966 0.981 0.125 0.254 0.397 0.523 0.588 0.651 0.651 0.753 0.820 0.848 0.873 0.910 1.000 1.000 1.000 1.000 1.000 0.999 0.998 0.977 0.849 0.707 0.552 0.217 (Power) 0.995 0.994 0.993 0.985 0.976 0.957 0.955 0.858 0.651 0.527 0.385 0.157 0.938 0.937 0.925 0.895 0.869 0.834 0.829 0.692 0.489 0.369 0.258 0.115 0.223 0.496 0.745 0.890 0.932 0.957 0.958 0.965 0.844 0.704 0.550 0.216 (Power) 0.163 0.344 0.529 0.683 0.745 0.777 0.779 0.755 0.596 0.487 0.359 0.146 0.117 0.237 0.367 0.467 0.505 0.528 0.525 0.493 0.361 0.271 0.185 0.080 t t t 150 Table 3.6: Size and Power Using 5% Fixed-b Critical Values with Traditional Lag Choices, T = 50 and 100 T=50 lag t t t t t l0 l4 l12 l25 l0 l4 l12 l25 l0 × l0 ηµ × ηµd l4 × l4 l12 × l12 l25 × l12 ∼ I (0) ρ=0 0.25 0.5 0.75 0.95 0.053 0.149 0.336 0.653 0.924 (Size) 0.052 0.053 0.073 0.061 0.114 0.071 0.236 0.102 0.551 0.247 0.057 0.061 0.065 0.066 0.079 0.959 0.923 0.867 0.641 0.108 (Power) 0.597 0.085 0.608 0.096 0.613 0.121 0.487 0.168 0.103 0.083 0.031 0.034 0.042 0.058 0.057 0.050 0.134 0.280 0.379 0.082 (Size) 0.025 0.005 0.035 0.005 0.056 0.007 0.073 0.010 0.016 0.006 0.003 0.002 0.002 0.002 0.000 ∼ I (1) φ=0 0.25 0.5 0.75 0.95 0.961 0.944 0.895 0.709 0.126 (Power) 0.727 0.495 0.717 0.492 0.696 0.481 0.589 0.414 0.114 0.097 0.168 0.170 0.168 0.154 0.079 0.042 0.141 0.400 0.786 0.955 (Size) 0.041 0.039 0.059 0.039 0.113 0.037 0.321 0.044 0.586 0.077 0.039 0.037 0.030 0.026 0.033 0.030 0.112 0.321 0.518 0.117 (Size) 0.007 0.003 0.012 0.003 0.031 0.003 0.129 0.003 0.058 0.005 0.001 0.001 0.001 0.005 0.004 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.133 0.252 0.399 0.546 0.616 0.674 0.679 0.775 0.851 0.879 0.907 0.939 (Power) 0.095 0.072 0.156 0.097 0.222 0.128 0.301 0.162 0.348 0.185 0.397 0.208 0.396 0.210 0.484 0.257 0.556 0.316 0.593 0.343 0.623 0.374 0.681 0.436 0.065 0.073 0.083 0.089 0.094 0.101 0.101 0.113 0.126 0.132 0.137 0.155 0.946 0.925 0.890 0.820 0.774 0.723 0.720 0.572 0.389 0.294 0.215 0.102 (Power) 0.578 0.085 0.556 0.082 0.508 0.081 0.443 0.083 0.409 0.081 0.373 0.078 0.359 0.075 0.265 0.072 0.181 0.064 0.145 0.061 0.119 0.056 0.073 0.045 0.034 0.035 0.037 0.033 0.031 0.032 0.030 0.034 0.036 0.036 0.036 0.039 0.124 0.227 0.340 0.427 0.450 0.458 0.457 0.402 0.290 0.222 0.164 0.077 (Power) 0.047 0.005 0.075 0.005 0.089 0.005 0.093 0.005 0.093 0.005 0.092 0.006 0.091 0.006 0.069 0.005 0.044 0.005 0.033 0.005 0.027 0.004 0.014 0.003 0.003 0.003 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.001 0.001 l50 l4 l50 l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 l50 × l25 T=100 lag t ηµd ηµ ηµd ηµ l4 l12 l25 l12 l25 ∼ I (0) ρ=0 0.25 0.5 0.75 0.95 0.045 0.060 0.095 0.208 0.608 (Size) 0.045 0.048 0.052 0.052 0.061 0.057 0.093 0.068 0.305 0.137 0.051 0.053 0.058 0.065 0.054 0.801 0.813 0.841 0.839 0.244 (Power) 0.391 0.054 0.410 0.057 0.447 0.072 0.508 0.109 0.209 0.132 0.028 0.028 0.032 0.042 0.072 0.032 0.044 0.073 0.158 0.092 (Size) 0.013 0.002 0.015 0.003 0.016 0.004 0.023 0.010 0.015 0.012 0.002 0.002 0.002 0.001 0.002 ∼ I (1) φ=0 0.25 0.5 0.75 0.95 0.832 0.830 0.820 0.758 0.277 (Power) 0.621 0.445 0.617 0.444 0.615 0.442 0.588 0.422 0.236 0.182 0.086 0.085 0.084 0.081 0.079 0.042 0.059 0.123 0.389 0.777 (Size) 0.043 0.044 0.046 0.042 0.063 0.040 0.123 0.038 0.372 0.051 0.044 0.040 0.038 0.036 0.030 0.017 0.026 0.060 0.238 0.203 (Size) 0.004 0.005 0.005 0.005 0.008 0.004 0.027 0.003 0.072 0.003 0.002 0.002 0.002 0.004 0.007 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.100 0.186 0.288 0.390 0.439 0.489 0.489 0.575 0.655 0.691 0.728 0.786 (Power) 0.077 0.065 0.121 0.085 0.171 0.110 0.231 0.146 0.259 0.163 0.292 0.182 0.292 0.184 0.360 0.220 0.428 0.266 0.459 0.290 0.494 0.316 0.562 0.383 0.057 0.062 0.062 0.074 0.077 0.078 0.076 0.080 0.078 0.080 0.081 0.083 0.788 0.767 0.726 0.667 0.623 0.576 0.577 0.434 0.294 0.230 0.172 0.091 (Power) 0.393 0.054 0.379 0.055 0.362 0.059 0.314 0.060 0.285 0.060 0.246 0.062 0.254 0.057 0.193 0.060 0.143 0.060 0.120 0.057 0.098 0.057 0.065 0.050 0.028 0.029 0.030 0.032 0.033 0.036 0.034 0.038 0.038 0.039 0.039 0.041 0.073 0.134 0.197 0.235 0.243 0.249 0.246 0.196 0.131 0.102 0.076 0.037 (Power) 0.023 0.020 0.030 0.025 0.035 0.026 0.032 0.023 0.035 0.023 0.030 0.017 0.034 0.018 0.022 0.014 0.015 0.010 0.010 0.008 0.008 0.006 0.006 0.004 0.003 0.003 0.003 0.001 0.002 0.002 0.002 0.002 0.003 0.002 0.002 0.002 151 Table 3.7: Size and Power Using 5% Fixed-b Critical Values with Traditional Lag Choices, T = 200 and 500 T=200 lag t t t t t l4 l12 l25 l50 l4 l12 l25 l50 l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 l50 × l25 ∼ I (0) ρ=0 0.25 0.5 0.75 0.95 0.047 0.066 0.102 0.233 0.720 (Size) 0.053 0.053 0.057 0.055 0.065 0.060 0.094 0.074 0.340 0.186 0.053 0.056 0.057 0.064 0.091 0.956 0.963 0.974 0.983 0.641 (Power) 0.663 0.346 0.677 0.356 0.727 0.395 0.826 0.475 0.540 0.407 0.046 0.049 0.060 0.084 0.150 0.045 0.063 0.100 0.227 0.414 (Size) 0.034 0.017 0.038 0.016 0.044 0.019 0.074 0.023 0.119 0.020 0.017 0.018 0.021 0.025 0.017 ∼ I (1) φ=0 0.25 0.50 0.75 0.95 0.951 0.950 0.948 0.923 0.564 (Power) 0.741 0.587 0.739 0.587 0.738 0.583 0.725 0.576 0.487 0.414 0.377 0.378 0.378 0.376 0.283 0.048 0.073 0.150 0.478 0.930 (Size) 0.047 0.048 0.051 0.051 0.069 0.053 0.180 0.083 0.616 0.286 0.046 0.046 0.046 0.041 0.046 0.034 0.054 0.122 0.419 0.516 (Size) 0.011 0.002 0.013 0.003 0.022 0.004 0.078 0.010 0.287 0.095 0.002 0.002 0.003 0.006 0.060 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.137 0.271 0.415 0.550 0.608 0.655 0.659 0.749 0.826 0.860 0.884 0.925 (Power) 0.099 0.084 0.167 0.123 0.241 0.173 0.318 0.221 0.364 0.246 0.406 0.281 0.408 0.281 0.481 0.343 0.550 0.398 0.584 0.428 0.617 0.459 0.684 0.518 0.072 0.091 0.115 0.141 0.152 0.174 0.175 0.204 0.238 0.257 0.278 0.323 0.952 0.941 0.920 0.884 0.852 0.811 0.811 0.677 0.479 0.377 0.280 0.126 (Power) 0.661 0.344 0.652 0.334 0.624 0.318 0.575 0.285 0.531 0.262 0.492 0.239 0.492 0.237 0.375 0.188 0.258 0.142 0.201 0.122 0.158 0.101 0.086 0.071 0.048 0.049 0.051 0.053 0.055 0.057 0.053 0.058 0.060 0.059 0.055 0.053 0.128 0.252 0.375 0.476 0.504 0.513 0.516 0.478 0.357 0.289 0.216 0.094 (Power) 0.065 0.054 0.109 0.079 0.143 0.102 0.161 0.107 0.165 0.104 0.158 0.102 0.163 0.104 0.126 0.078 0.083 0.047 0.060 0.034 0.045 0.025 0.021 0.013 0.027 0.035 0.038 0.034 0.028 0.024 0.030 0.018 0.011 0.009 0.007 0.004 l50 l4 l50 l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 l50 × l25 T=500 lag t ηµd ηµ ηµd ηµ l4 l12 l25 l12 l25 ∼ I (0) ρ=0 0.25 0.5 0.75 0.95 0.054 0.070 0.097 0.197 0.733 (Size) 0.055 0.054 0.058 0.056 0.064 0.060 0.090 0.068 0.335 0.166 0.052 0.054 0.056 0.063 0.099 0.998 0.998 1.000 1.000 0.997 (Power) 0.901 0.672 0.908 0.687 0.941 0.738 0.983 0.848 0.984 0.927 0.365 0.377 0.415 0.507 0.639 0.054 0.069 0.097 0.197 0.730 (Size) 0.049 0.035 0.052 0.037 0.059 0.042 0.088 0.055 0.325 0.139 0.033 0.035 0.039 0.049 0.084 ∼ I (1) φ=0 0.25 0.5 0.75 0.95 0.992 0.992 0.992 0.990 0.890 (Power) 0.908 0.749 0.908 0.748 0.907 0.748 0.906 0.747 0.802 0.685 0.593 0.593 0.592 0.593 0.552 0.048 0.067 0.131 0.454 0.984 (Size) 0.051 0.047 0.055 0.049 0.070 0.057 0.193 0.101 0.815 0.545 0.049 0.048 0.050 0.060 0.227 0.044 0.062 0.126 0.445 0.874 (Size) 0.028 0.013 0.031 0.014 0.043 0.017 0.146 0.037 0.641 0.347 0.007 0.008 0.009 0.021 0.269 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.178 0.371 0.559 0.725 0.790 0.836 0.834 0.909 0.952 0.965 0.975 0.988 (Power) 0.129 0.102 0.239 0.164 0.356 0.244 0.478 0.324 0.530 0.365 0.583 0.408 0.582 0.409 0.677 0.489 0.758 0.559 0.790 0.596 0.819 0.636 0.871 0.693 0.085 0.123 0.172 0.220 0.244 0.278 0.278 0.339 0.405 0.436 0.464 0.530 0.997 0.996 0.992 0.983 0.970 0.954 0.952 0.855 0.658 0.523 0.387 0.153 (Power) 0.897 0.669 0.888 0.662 0.871 0.647 0.835 0.598 0.803 0.564 0.755 0.524 0.757 0.516 0.619 0.390 0.419 0.253 0.322 0.196 0.233 0.146 0.107 0.084 0.363 0.354 0.336 0.290 0.274 0.242 0.245 0.181 0.134 0.112 0.096 0.069 0.178 0.369 0.554 0.712 0.762 0.794 0.790 0.770 0.617 0.493 0.367 0.146 (Power) 0.114 0.090 0.210 0.144 0.304 0.207 0.389 0.264 0.415 0.280 0.422 0.288 0.421 0.287 0.386 0.264 0.269 0.178 0.207 0.135 0.148 0.094 0.064 0.036 0.066 0.104 0.147 0.178 0.182 0.186 0.176 0.138 0.085 0.065 0.046 0.022 152 Table 3.8: Size and Power Using 5% Fixed-b Critical Values with Traditional Lag Choices, T = 1,000 and 2,000 T=1,000 lag t t t t t l4 l12 l25 l50 l4 l12 l25 l50 l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 l50 × l25 ∼ I (0) ρ=0 0.25 0.5 0.75 0.95 0.049 0.059 0.079 0.151 0.647 (Size) 0.049 0.051 0.054 0.052 0.059 0.056 0.077 0.063 0.286 0.145 0.049 0.050 0.052 0.055 0.092 1.000 1.000 1.000 1.000 1.000 (Power) 0.973 0.851 0.978 0.866 0.989 0.901 0.999 0.966 1.000 0.998 0.621 0.635 0.685 0.807 0.964 0.049 0.059 0.079 0.151 0.647 (Size) 0.048 0.044 0.052 0.045 0.059 0.051 0.077 0.061 0.286 0.144 0.042 0.043 0.046 0.053 0.091 ∼ I (1) φ=0 0.25 0.5 0.75 0.95 0.998 0.998 0.998 0.997 0.976 (Power) 0.963 0.861 0.962 0.860 0.962 0.860 0.961 0.860 0.934 0.832 0.707 0.707 0.707 0.707 0.686 0.052 0.065 0.123 0.396 0.988 (Size) 0.053 0.052 0.057 0.053 0.076 0.064 0.195 0.117 0.867 0.659 0.054 0.055 0.058 0.079 0.413 0.051 0.063 0.121 0.393 0.965 (Size) 0.040 0.023 0.044 0.023 0.059 0.029 0.170 0.066 0.805 0.528 0.012 0.012 0.015 0.036 0.419 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.201 0.418 0.635 0.803 0.862 0.901 0.902 0.955 0.978 0.987 0.991 0.996 (Power) 0.146 0.116 0.281 0.207 0.432 0.316 0.584 0.425 0.647 0.476 0.703 0.527 0.707 0.528 0.799 0.619 0.862 0.693 0.890 0.728 0.911 0.761 0.943 0.815 0.094 0.150 0.221 0.296 0.334 0.378 0.378 0.456 0.529 0.563 0.592 0.647 1.000 1.000 0.999 0.998 0.994 0.987 0.986 0.929 0.749 0.615 0.455 0.194 (Power) 0.972 0.848 0.968 0.844 0.960 0.829 0.935 0.800 0.916 0.764 0.886 0.716 0.885 0.711 0.754 0.573 0.547 0.392 0.431 0.307 0.319 0.227 0.143 0.115 0.620 0.616 0.601 0.570 0.531 0.486 0.485 0.366 0.248 0.200 0.153 0.093 0.201 0.418 0.635 0.802 0.857 0.889 0.888 0.884 0.728 0.603 0.448 0.190 (Power) 0.140 0.099 0.271 0.172 0.413 0.257 0.541 0.330 0.584 0.350 0.613 0.357 0.614 0.351 0.581 0.313 0.443 0.213 0.356 0.163 0.263 0.121 0.115 0.055 0.080 0.126 0.180 0.227 0.240 0.253 0.246 0.217 0.146 0.108 0.075 0.030 l50 l4 l50 l4 × l4 ηµ × ηµd l12 × l12 l25 × l25 l50 × l25 T=2,000 lag t ηµd ηµ ηµd ηµ l4 l12 l25 l12 l25 ∼ I (0) ρ=0 0.25 0.5 0.75 0.95 0.049 0.060 0.074 0.134 0.604 (Size) 0.049 0.047 0.051 0.049 0.056 0.052 0.070 0.057 0.245 0.133 0.047 0.048 0.049 0.052 0.080 1.000 1.000 1.000 1.000 1.000 (Power) 0.996 0.954 0.997 0.960 0.999 0.979 1.000 0.996 1.000 1.000 0.802 0.818 0.866 0.950 1.000 0.049 0.060 0.074 0.134 0.604 (Size) 0.049 0.045 0.051 0.047 0.056 0.051 0.070 0.057 0.245 0.133 0.045 0.046 0.049 0.052 0.080 ∼ I (1) φ=0 0.25 0.5 0.75 0.95 1.000 1.000 1.000 1.000 0.998 (Power) 0.990 0.944 0.990 0.944 0.990 0.943 0.989 0.943 0.983 0.932 0.815 0.815 0.815 0.815 0.810 0.052 0.065 0.114 0.380 0.995 (Size) 0.051 0.052 0.058 0.054 0.073 0.061 0.176 0.108 0.886 0.708 0.053 0.054 0.058 0.078 0.482 0.051 0.064 0.114 0.380 0.993 (Size) 0.048 0.037 0.054 0.038 0.069 0.044 0.171 0.084 0.870 0.648 0.019 0.020 0.024 0.051 0.546 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.226 0.499 0.749 0.892 0.935 0.959 0.960 0.988 0.995 0.996 0.998 1.000 (Power) 0.167 0.131 0.350 0.267 0.538 0.407 0.700 0.535 0.770 0.599 0.819 0.662 0.824 0.663 0.893 0.759 0.942 0.826 0.954 0.855 0.966 0.878 0.982 0.916 0.107 0.191 0.287 0.393 0.434 0.487 0.489 0.570 0.651 0.688 0.718 0.769 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.978 0.853 0.712 0.557 0.222 (Power) 0.996 0.953 0.995 0.950 0.993 0.940 0.988 0.912 0.979 0.892 0.961 0.856 0.961 0.852 0.869 0.723 0.669 0.526 0.543 0.408 0.408 0.295 0.167 0.136 0.800 0.799 0.786 0.755 0.723 0.677 0.675 0.538 0.369 0.279 0.211 0.110 0.226 0.499 0.749 0.892 0.934 0.958 0.959 0.966 0.848 0.709 0.556 0.221 (Power) 0.166 0.124 0.349 0.254 0.534 0.382 0.691 0.484 0.754 0.529 0.785 0.554 0.790 0.551 0.767 0.523 0.617 0.397 0.504 0.308 0.381 0.222 0.158 0.099 0.103 0.182 0.271 0.354 0.381 0.405 0.403 0.380 0.293 0.224 0.155 0.061 153 Table 3.9: Size and Power of ADF and ηµ × ADF Tests Using Standard Critical Values with Traditional Lag Choices, T = 50, 100, 200, and 500 T=50 lag t t t ∼ I (0) ρ=0 0.25 0.50 0.75 0.95 ∼ I (1) φ=0 0.25 0.50 0.75 0.95 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 p0 ADF p4 0.041 0.157 0.515 0.956 1.000 (Power) 0.789 0.683 0.495 0.220 0.056 (Size) 0.044 0.041 0.052 0.182 0.716 1.000 1.000 0.999 0.978 0.941 0.882 0.880 0.686 0.422 0.307 0.219 0.094 (Power) 0.672 0.536 0.406 0.294 0.249 0.209 0.207 0.143 0.102 0.084 0.070 0.053 p4 ADF p12 1.000 1.000 0.978 0.474 0.064 T=100 ηµ × ADF l4 × p4 l12 × p12 p12 l0 × p 0 0.072 0.069 0.061 0.047 0.038 0.053 0.149 0.320 0.238 0.043 0.038 0.034 0.031 0.030 0.065 0.032 0.119 0.413 0.665 0.126 (Size) 0.000 0.000 0.001 0.002 0.012 (Size) 0.020 0.018 0.014 0.010 0.001 0.063 0.057 0.050 0.043 0.043 0.042 0.042 0.040 0.037 0.039 0.038 0.037 0.133 0.252 0.398 0.525 0.559 0.561 0.564 0.476 0.298 0.217 0.159 0.068 (Power) 0.001 0.001 0.002 0.004 0.004 0.007 0.007 0.010 0.013 0.015 0.017 0.017 p0 ADF p4 0.000 0.000 0.000 0.000 0.000 1.000 1.000 1.000 0.974 0.116 0.000 0.000 0.000 0.000 0.000 0.054 0.190 0.585 0.978 1.000 (Power) 0.994 0.983 0.932 0.623 0.091 (Size) 0.051 0.050 0.058 0.182 0.947 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1.000 1.000 1.000 1.000 0.998 0.998 0.930 0.692 0.516 0.362 0.145 (Power) 0.970 0.901 0.761 0.573 0.479 0.401 0.397 0.255 0.155 0.128 0.102 0.068 p4 ADF p12 l0 × p 0 0.346 0.315 0.270 0.185 0.060 0.043 0.143 0.361 0.693 0.105 0.043 0.044 0.041 0.041 0.206 0.051 0.179 0.563 0.883 0.300 (Size) 0.002 0.001 0.003 0.008 0.013 (Size) 0.023 0.024 0.022 0.016 0.008 0.274 0.205 0.158 0.123 0.111 0.101 0.100 0.082 0.067 0.059 0.055 0.048 0.156 0.346 0.539 0.715 0.782 0.828 0.828 0.838 0.645 0.484 0.338 0.133 (Power) 0.002 0.005 0.009 0.012 0.013 0.017 0.017 0.019 0.019 0.020 0.021 0.024 T=200 lag t t t ∼ I (0) ρ=0 0.25 0.50 0.75 0.95 ∼ I (1) φ=0 0.25 0.50 0.75 0.95 ∼ I (d) d =0.1 0.2 0.3 0.4 0.45 0.499 0.5 0.6 0.7 0.75 0.8 0.9 0.046 0.046 0.052 0.165 0.997 (Power) 1.000 1.000 1.000 0.999 0.661 (Size) 0.047 0.045 0.045 0.047 0.480 1.000 1.000 0.997 0.956 0.892 0.791 0.788 0.543 0.317 0.231 0.169 0.096 (Power) 0.751 0.595 0.444 0.313 0.265 0.225 0.224 0.153 0.110 0.094 0.079 0.060 1.000 1.000 1.000 1.000 0.899 p25 l4 × p 4 ηµ × ADF l4 × p4 l12 × p12 p12 0.007 0.008 0.010 0.011 0.002 0.004 0.011 0.056 0.170 0.049 0.011 0.017 0.025 0.034 0.040 0.048 0.048 0.053 0.038 0.028 0.019 0.006 T=500 ηµ × ADF l12 × p12 l25 × p25 0.875 0.867 0.846 0.778 0.359 0.054 0.067 0.094 0.192 0.422 0.046 0.047 0.048 0.047 0.098 0.043 0.042 0.041 0.043 0.371 (Size) 0.043 0.044 0.047 0.056 0.135 (Size) 0.018 0.073 0.359 0.719 0.661 0.193 0.149 0.116 0.089 0.081 0.077 0.076 0.060 0.052 0.049 0.046 0.044 0.045 0.066 0.077 0.064 0.064 0.063 0.062 0.051 0.044 0.043 0.040 0.039 (Power) 0.051 0.082 0.117 0.161 0.183 0.214 0.215 0.267 0.249 0.184 0.110 0.034 0.028 0.028 0.030 0.033 0.030 1.000 1.000 1.000 1.000 0.899 0.011 0.011 0.010 0.011 0.121 0.046 0.046 0.052 0.165 0.997 (Power) 1.000 1.000 1.000 0.999 0.661 (Size) 0.047 0.045 0.045 0.047 0.480 0.004 0.005 0.005 0.004 0.003 0.003 0.003 0.002 0.002 0.002 0.001 0.002 1.000 1.000 1.000 0.999 0.999 0.991 0.990 0.870 0.552 0.398 0.262 0.106 (Power) 1.000 0.993 0.944 0.798 0.695 0.575 0.572 0.356 0.203 0.152 0.115 0.071 154 p25 l4 × p 4 ηµ × ADF l12 × p12 l25 × p25 0.875 0.867 0.846 0.778 0.359 0.054 0.067 0.094 0.192 0.422 0.046 0.047 0.048 0.047 0.098 0.043 0.042 0.041 0.043 0.371 (Size) 0.043 0.044 0.047 0.056 0.135 (Size) 0.018 0.073 0.359 0.719 0.661 0.749 0.592 0.432 0.300 0.252 0.206 0.205 0.139 0.094 0.083 0.068 0.055 0.173 0.355 0.484 0.498 0.459 0.384 0.378 0.253 0.154 0.115 0.094 0.062 (Power) 0.084 0.145 0.220 0.297 0.339 0.379 0.381 0.461 0.516 0.492 0.367 0.095 0.028 0.028 0.030 0.033 0.030 0.011 0.011 0.010 0.011 0.121 0.050 0.077 0.094 0.091 0.083 0.072 0.069 0.040 0.022 0.018 0.013 0.010 APPENDIX 155 Appendix for Chapter 3 Proof of Theorem 8 Liu (1998), Theorem 3.4, shows that ∑ St2 = O p T 3 when t 1 2 is I . It also shows that s2 (0) = O p (ln T ) . However, he does not establish the order in probability of s2 (l ) when l → ∞ as T → ∞. The proof here will establish the limiting behavior of s2 (l ) when l → ∞ using results of Tanaka (1999). Tanaka provides the invariance principle for I 1 2 processes having i.i.d. innovations. He defines the process ts2T − s2j 1 1 XT (t) = y j + 2 y j − y j −1 , sT s j − s2j−1 s T s2j−1 s2T ≤t≤ s2j s2T , (2.37) where (1 − L)1/2 yt = ut ∼ I ID (0, σu2 ) and s2j = Var y j . Lemma 2.1 of Liu (1998) shows that s2j = with K = 2σu2 π 4σu2 π j 1 ∑ 2k − 1 = K · L( j) (2.38) k =1 and L( T ) = 1. T →∞ log T (2.39) lim Theorem 2.1 in Tanaka (1999) states XT = { XT (t)} weakly converges to the standard Wiener process defined on [0, 1] . Note that in our terminology, t replaces yt in Tanaka (1999). Rewrite s2 (l ) as s2 ( l ) = 1 T T l ∑ e2t + 2 ∑ t =1 1− s =1 156 s l+1 1 T T ∑ t = s +1 et et−s . Multiplying s2 (l ) by 1 l ln T · ln1T yields 1 1 1 1 1 · · s2 ( l ) = l ln T ln T l ln T T + T ∑ (ln T )−1/2 et · (ln T )−1/2 et (2.40) t =1 1 l s 1− ∑ l s =1 l+1 1 1 ln T T T ∑ (ln T )−1/2 et · (ln T )−1/2 et−s . t = s +1 Consider the absolute value of the second sum. 1 l s 1− ∑ l s =1 l+1 ≤ 1 l s 1− ∑ l s =1 l+1 = ≤ 1 1 ln T T T ∑ (ln T )−1/2 et · (ln T )−1/2 et−s 1 1 · max ln T T 1≤s≤l 1 1 1 · max 2 ln T T 1≤s≤l T ∑ (2.41) t = s +1 T ∑ (ln T )−1/2 et · (ln T )−1/2 et−s t = s +1 (ln T )−1/2 et · (ln T )−1/2 et−s t = s +1 T 1 1 1 · max ∑ (ln T )−1/2 et · (ln T )−1/2 et−s . 2 ln T T 1≤s≤l t=s+1 It turns out that it is enough to show this last expression is o p (1). max 1≤ s ≤ l 1 1 ln T T T ∑ (ln T )−1/2 et · (ln T )−1/2 et−s t = s +1 1 1 ≤ max ( T − s) max (ln T )−1/2 et · (ln T )−1/2 et−s ln T T s +1≤ t ≤ T 1≤ s ≤ l 1 ≤ max max (ln T )−1/2 et · max (ln T )−1/2 et 1≤ t ≤ T 1≤s≤l ln T 1≤t≤ T 1 1 = max (ln T )−1/2 et · max (ln T )−1/2 et = · O p (1) = o p (1). ln T 1≤t≤T ln T 1≤ t ≤ T 157 (2.42) The second to last equality in (2.42) comes from the following. max (ln T )−1/2 et 1≤ t ≤ T ≤ max 1≤ t ≤ T (ln T )−1/2 ≤ 2 · max (ln T ) + (ln T )−1/2 t −1/2 K1/2 2 · max · sT 1≤ t ≤ T t 1≤ t ≤ T (2.43) t with large T = 2K1/2 · max | XT (r )| ⇒ 2K1/2 · max |W (r )| = O p (1). 0≤r ≤1 0≤r ≤1 The weak convergence result in the last line follows from Tanaka (1999). Similarly, one can show 1 1 1 l ln T T ∑tT=1 √ 1 et ln T · √ 1 et ln T = o p (1). Hence 1 l (ln T )2 · s2 ( l ) = o p (1). Now rewrite ηµ as ηµ = and recall that above p lim 1 T3 1 T3 ∑tT=1 St2 1 s2 ( l ) l (ln T )2 × T l (ln T )2 , (2.44) ∑tT=1 St2 = O p (1) and its weak limit is not zero (Liu 1998). Also from the 1 s2 ( l ) l (ln T )2 = 0. Therefore, since T l (ln T )2 goes to infinity under the traditional choice of the number of the lags, ηµ diverges to infinity if l → ∞ and l T → 0 as T → ∞. Proof of Proposition 8 In this case, ∆ t is a short memory process with zero mean. So the limiting behavior of s2 (l ) and St should be the same as that of s2 (l ) and St from the model with short memory error and no intercept. Hence the followings are immediate: p 1 s2 (l ) → σ2 , √ S[rT ] ⇒ σW (r ), T 1 1 T 2 2 S ⇒ σ W (r )2 dr, and ∑ t 2 T t =2 0 ηµd ⇒ 1 0 (2.45) W (r )2 dr. Note that the weak limit of ηµd is a functional of a standard Wiener process instead of a Brownian bridge process (for the KPSS test of short memory) or a demeaned Brownian motion (for the KPSS unit root test). This is because we difference the data instead of 158 demeaning the terms in St . Proof of Theorem 9 Rewrite the numerator as T −2 ∑tT=2 St2 = T −2 ∑tT=2 ( T −2 ∑tT=2 t −1 T ∑ St2 t =2 ∑tT=2 1 = T 1 T = 1 T − 1) t + 2 1 2 = T −2 ∑tT=2 2 t −2 1 · + T −1 12 . Multiplying by T gives T since t T ∑ 2 t 2 t + t =2 T ∑ T 1 −2 1 T 2 1 ∑ (2.46) t =2 + o p (1), t =2 p t → 0. Therefore, T −1 T d ∑ St2 → γ0 + 2 1. (2.47) t =2 Second, to figure out the limiting behavior of s2 (l ), rewrite γs as below. 1 T γs = = 1 T T ∑ T ∑ ( t− t −1 ) ( t − s T t t−s t = s +2 t − s −1 ) T ∑ − − t = s +2 t t − s −1 − t = s +2 ∑ T t −1 t − s + t = s +2 ∑ t −1 t − s −1 . t = s +2 Plugging this into s2 (l ) yields: − − 1 T 1 T T 1 T s2 ( l ) = ∑ t =2 T ∑ t =2 ∑ t s =1 l t 1 T T ∑ T t = s +2 l T t −1 t + 2 ∑ w ( s, l ) s =1 159 1 T t t−s t = s +2 1 t−1 + 2 ∑ w ( s, l ) T s =1 T t =2 l t + 2 ∑ w ( s, l ) ∑ ∑ t = s +2 t t − s −1 t −1 t − s (2.48) 1 T + T l ∑ t −1 t −1 t =2 + 2 ∑ w(s, l ) s =1 1 T T ∑ . t −1 t − s −1 t = s +2 Then the equation (2.48) can be rewritten by collecting the terms according to the time lags of cross products of m n ’s s2 ( l ) = + + 1 T ∑ [2 − 2w(1, l )] + t t T T t =2 + T 1 1 (2.49) T −1 ∑ 2 [2w(1, l ) − w(0, l ) − w(2, l )] t t −1 t =3 T T −1 + 2 [w(1, l ) − 1] 2 1 T + T −1 ∑ 2 [2w(2, l ) − w(1, l ) − w(3, l )] t t −2 t =4 2 [w(2, l ) − w(1, l )] · [ T T T −2 + 3 1] +··· T −1 ∑ 2 [2w(l − 1, l ) − w(l − 2, l ) − w(l, l )] t t − l +1 t = l +1 + 2 [w(l − 1, l ) − w(l − 2, l )] · [ T 1 + T + T −1 2 [w(1, l ) − w(0, l )] · 1 + T 1 + T 1 T (i.e. the value of m − n). T T − l +1 + l 1] T −1 ∑ 2 [2w(l, l ) − w(l − 1, l )] t t−l t = l +2 2 [w(l, l ) − w(l − 1, l )] · [ T T T −l + l +1 1 ] − 1 T T ∑ 2w(l, l ) t t − l −1 . t = l +2 Now, fix l and let T increase to infinity. Because max1≤s,t≤T | t s| = O p (T ) 4 one can obtain the following as T increases: p s2 (l ) → [2 − 2w(1, l )] γ0 + 2 [2w(1, l ) − w(0, l ) − w(2, l )] γ1 4 Notice that max 2 t 1≤s,t≤ T | t s | = max1≤t≤ T t and recall that for t ∼ I (1), max1≤t≤ T √ T equivalently max1≤t≤T t2 = O p ( T ) . Hence we conclude that max1≤t≤T t2 = O p ( T ) when t short or long memory process. 160 (2.50) 2 = O p (1), or is a stationary + · · · + 2 [2w(l − 1, l ) − w(l − 2, l ) − w(l, l )] γl −1 + 2 [2w(l, l ) − w(l − 1, l )] γl − 2w(l, l )γl +1 . Note that with the Bartlett kernel w( j, l ) = 1 − j l +1 , (2.49) can be simplified because 2w( j, l ) − w( j − 1, l ) − w( j + 1, l ) = 0 for j = 1, 2, ..., l. Hence p s2 (l ) → [2 − 2w(1, l )] γ0 − 2w(l, l )γl +1 = 2 (γ0 − γl +1 ) . l+1 (2.51) p Therefore, l · s2 (l ) → 2γ0 as l increases since γl +1 → 0 under the assumption of either stationary short- or stationary long-memory process. Now it is straightforward to see that T d T ηµ = l l T −2 ∑tT=2 St2 s2 ( l ) = T −1 ∑tT=2 St2 , ls2 (l ) (2.52) and therefore, using (2.47) and (2.51), T d d γ0 + 12 η → . l µ 2γ0 p This implies that ηµd → 0 as l → ∞, T → ∞, l T (2.53) → 0. Proof of Proposition 9 In this case, ∆ t ∼ I (d∗ ) where d∗ = d − 1 with − 21 < d∗ < 0. This means ∆ anti-persistent process. From Table 3.1 in Section 3.3.1, 1 T 2( d ∗ S +1) [rT ] ≡ = 1 T d∗ +1/2 1 [rT ] ∑∆ [rT ] ∑∆ T d∗ +1/2 t =1 ⇒ ωd∗ Wd∗ (r ), 161 t t =2 t − 1 T d∗ +1/2 ∆ 1 t is an so T 1 T 2( d ∗ +1) ∑ St2 ⇒ ωd2∗ t =2 1 0 Wd∗ (r )2 dr, and p l −2d∗ s2 (l ) → ωd2∗ . Therefore, l T That is, ηµd = O p T 2d∗ l 2d∗ 1 ηµd ⇒ 0 Wd∗ (r )2 dr. and ηµd goes to zero as l → ∞, T → ∞, l T → 0. Proof of Theorem 10 Since t ∼ I (1/2), it it true that ∆yt = ∆ t ∼ I (−1/2). Fix l and increase T to get s2 (l ) → σ2 (l ) = γ0∗ + 2 ∑ls=1 Ws,l γs∗ , where γs∗ = E (∆ t ∆ t−s ) . As in Lee and Schmidt (1996, p.291), one can show 2 ( l + 1) σ ( l ) = ( l + 1) γ0∗ l + 2 ∑ (l + 1 − s) γs∗ (2.54) s =1 l +2 = var ∑∆ = var ( j l +2 − 1) j =2 = var ( l +2 ) + var ( 1 ) − 2ρ where ρ is the correlation between l +2 and var ( t ) = where L(t) = 4 ∑tj=1 1 2j−1 and L(t) ln t 1. var ( l +2 ) var ( 1 ), Recall from equation (2.38) that 2σu2 L ( t ), π → 1 as t increases. Divide equation (2.54) by ln(l + 1) and let l increase to yield ( l + 1) σ 2 ( l ) 2σ2 → u as l grows. ln(l + 1) π 162 This is due to the facts that var ( 1 )/ ln(l + 1) → 0 as l → ∞ and |ρ| ≤ 1. Hence p lim l · s2 ( l ) ( l + 1) s2 ( l ) 2σ2 = p lim = u ≡ K, ln l ln(l + 1) π ln l l which implies s2 (l ) = O p when T and l grow but l/T goes to zero. Next, consider the sum of St2 in the numerator of ηµd . Look at the absolute value of the appropriately scaled sum and see 1 T ln T T 1 T ∑ St2 = t =2 ≤ 4 · max √ 1≤ t ≤ T T 1 ∑ √ln T ( √ K1/2 · 4 · max sT 1≤ t ≤ T t − t =2 2 1 ln T 1) · t t 1 ln T ( t− 1) with large T = 4K1/2 · max | XT (r )| ⇒ 4K1/2 · max |W (r )| = O p (1). 0≤r ≤1 0≤r ≤1 Therefore, one can obtain ηµd T −2 ∑tT=2 St2 = = s2 ( l ) = l ln T ln l T l ln T ln l T · O p (1) 2σu2 π + o p (1) T 1 2 T ln T ∑t=2 St l 2 ln l s ( l ) = o p (1). Proof of Proposition 12 Unlike the case for ηµ , the calculation of ηµd does not involve demeaning so that the full sum of the data ST is not zero. So, the correct representation of the HAC estimator with the Bartlett kernel is (see Hashimzade and Vogelsang (2008), page 161) s (l ) = T −1 2 2 b ( T − 1)2 − ∑ St2 t =2 2 b ( T − 1)2 − T −b ( T −1)−1 2 b ( T − 1)2 T −1 ∑ t = T − b ( T −1) 163 St S T + ∑ S t S t + b ( T −1) t =2 1 S2 , T−1 T (2.55) with b = l +1 T −1 . Now suppose s (l ) = T −1 2 2 ∑( b ( T − 1)2 = 2 T −1 ∑ 2 t + b ( T −1) − 1 (2.56) ( t− − 1) ( T 1) + 2 t T −1 −2 ∑ 1 t =2 t + ( T − 2) t + 2 1 1 ( T−1 − t =2 T − 1) 2 b ( T − 1)2 2 × T −b ( T −1)−1 t t + b ( T −1) − ∑ 1 2 1 + T − b ( T − 1) − 1 · t + b ( T −1) t =2  T −1 2 b ( T − 1) 1) t = T − b ( T −1) 2 b ( T − 1) ( t−  ∑  t =2 − ∑ − T −1 T −b ( T −1)−1 ∑ 1) t =2 2 b ( T − 1) − t in (2.55) yields T −b ( T −1)−1 2 t =2  − t− 1 follows a short-memory process. Plugging St = t 2 ∑   T −1 t T ∑ − t = T − b ( T −1) 1( t ηµd = T) + b ( T − 1) 2 1 + t = T − b ( T −1) ≡ H (b ) + Combining this with + 1 ( T−1 T − 1) 2 ∑tT=2 St = 1 T2 ∑tT=2 2 t −2 T 1 ∑ t =2 t + ( T − 1) 2 1 1 T2 ∑tT=2 St = s2 ( l ) 1 T ∑tT=2 2 t −2 T 1 ∑ t =2 t + ( T − 1) 2 1 Denote the weak limit of T as T grows as ∞ T T −1 T − 1) . 1 T2 T · H (b ) + 1 ( T−1 ( T − 1) 2 gives . and apply the functional central limit theo- rem and continuous mapping theorem to obtain 2 b (γ0 + b 1 γ0 + 12 2 2 b γ0 + 1 + = Secondly, suppose that 2 1 γ0 + ηµd ⇒ t ∞) + ( ∞ 2 ∞ − 1) 2 γ0 + 12 < 2γ0 + 12 + 2 ∞ <1 follows a unit root process. Rearranging equation (2.56) 164 2 gives 2 ( T − 1)2 2 s (l ) = 2 bT T − 2 bT T −1 ∑ T −1 ∑ T −1/2 2 t − t =2 T −1/2 t · T −1/2 2 bT T+ t = T − b ( T −1) T −b ( T −1)−1 ∑ T −1/2 t · T −1/2 t + b ( T −1) t =2 T−1 T −1/2 ( T T− 1) 2 + o p (1), where the remaining terms are negligible since T −1/2 1 = o p (1), and 1 T2 T −1 ∑ t 1 = T −1/2 1· t =2 1 T T −1 ∑ T −1/2 t = o p (1)O p (1) = o p (1). t =2 Therefore by applying the functional CLT and continuous mapping theorem one can show ηµd ⇒ 1 2 0 W (r ) dr , Q1d (b ) where W (r ) is the standard Wiener process and Q1d (b ) ≡ 2 b 1 0 W (r )2 dr − 1− b 0 W (r )W (r + b )dr − Note that this limit does not degenerate to limit of ηµ in Proposition 11, which is 1 2 1 2 1 1− b W (r )W (1)dr + W (1)2 . for b = 1. This is in contrast with the fixed-b under both I (0) and I (1) DGPs. 165 REFERENCES 166 REFERENCES Adenstedt, R. K.: (1974), ‘On large-sample estimation for the mean of a stationary random sequence’. The Annals of Statistics 2, 1095–1107. Amsler, C., P. Schmidt, and T. J. Vogelsang: (2009), ‘The KPSS test using fixed-b critical values: size and power in highly autocorrelated time series’. Journal of Time Series Econometrics 1. Andrews, D. W. K.: (1991), ‘Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation’. Econometrica 59, 817–854. Breitung, J.: (2002), ‘Nonparametric Tests for Unit Roots and Cointegration’. Journal of Econometrics 108, 343–363. Caner, M. and L. Kilian: (2001), ‘Size distortions of tests of the null hypothesis of stationarity: evidence and implications for the PPP debate’. Journal of International Money and Finance 20, 639–657. Davydov, Y. A.: (1970), ‘The invariance principle for stationary processes’. Theory of Probability & Its Applications 15, 487–498. De Jong, R. M. and J. Davidson: (2000), ‘Consistency of Kernel Estimators of Heteroskedastic and Autocorrelated Covariance Matrices’. Econometrica 68, 407–424. Dejong, D. N., J. C. Nankervis, N. E. Savin, and C. H. Whiteman: (1992), ‘Integration versus Trend Stationarity in Time Series’. Econometrica 60, 423–434. Diebold, F. X. and G. D. Rudebusch: (1991), ‘On the power of Dickey-Fuller tests against fractional alternatives’. Economics Letters 35, 155–160. Giraitis, L., H. L. Koul, and D. Surgailis: (2012), Large sample inference for long memory processes. Imperial College Press London. Granger, C. W. and R. Joyeux: (1980), ‘An introduction to long-memory time series models and fractional differencing’. Journal of Time Series Analysis 1, 15–29. Hansen, B. E.: (1992), ‘Consistent Covariance Matrix Estimation for Dependent Heterogenous Processes’. Econometrica 60, 967–972. Hashimzade, N. and T. J. Vogelsang: (2008), ‘Fixed-b Asymptotic Approximation of the Sampling Behavior of Nonparametric Spectral Density Estimators’. Journal of Time Series Analysis 29, 142–162. Hassler, U. and J. Wolters: (1994), ‘On the power of unit root tests against fractional alternatives’. Economics Letters 45, 1–5. 167 Hosking, J. R.: (1981), ‘Fractional differencing’. Biometrika 68, 165–176. Jansson, M.: (2002), ‘Consistent Covariance Estimation for Linear Processes’. Econometric Theory 18, 1449–1459. Kiefer, N. M. and T. J. Vogelsang: (2002a), ‘Heteroskedasticity-autocorrelation robust standard errors using the Bartlett kernel without truncation’. Econometrica 70, 2093–2095. Kiefer, N. M. and T. J. Vogelsang: (2002b), ‘Heteroskedasticity-Autocorrelation Robust Testing Using Bandwidth Equal to Sample Size’. Econometric Theory 18, 1350–1366. Kiefer, N. M. and T. J. Vogelsang: (2005), ‘A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests’. Econometric Theory 21, 1130–1164. Kwiatkowski, D., P. Phillips, P. Schmidt, and Y. Shin: (1992), ‘Testing the Null Hypothesis of Stationarity Against the Alternative of a Unit Root: How Sure are We that Economic Time Series Have a Unit Root’. Journal of Econometrics 54, 154–178. Lee, D. and P. Schmidt: (1996), ‘On the power of the KPSS test of stationarity against fractionally-integrated alternatives’. Journal of Econometrics 73, 285–302. Lee, H. S. and C. Amsler: (1997), ‘Consistency of the KPSS unit root test against fractionally integrated alternative’. Economics Letters 55, 151–160. Liu, M.: (1998), ‘Asymptotics of nonstationary fractional integrated series’. Econometric Theory 14, 641–662. Lo, A. W.: (1991), ‘Long-Term Memory in Stock Market Prices’. Econometrica 59, 1279– 1313. Mandelbrot, B. B. and J. W. Van Ness: (1968), ‘Fractional Brownian motions, fractional noises and applications’. SIAM Review 10, 422–437. Müller, U. K.: (2005), ‘Size and power of tests of stationarity in highly autocorrelated time series’. Journal of Econometrics 128, 195–213. Newey, W. K. and K. D. West: (1987), ‘A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix’. Econometrica 55, 703–708. Phillips, P. C. B. and P. Perron: (1988), ‘Testing for a Unit Root in Time Series Regression’. Biometrika 75, 335–346. Qiu, J. and Z. Lin: (2011), ‘The invariance principle for fractionally integrated processes with strong near-epoch dependent innovations’. Science China Mathematics 54, 117–132. Said, S. E. and D. A. Dickey: (1984), ‘Testing for Unit Roots in Autoregressive-Moving Average Models of Unknown Order’. Biometrika 71, 599–607. Shin, Y. and P. Schmidt: (1992), ‘The KPSS stationarity test as a unit root test’. Economics Letters 38, 387–392. 168 Sowell, F.: (1990), ‘The fractional unit root distribution’. Econometrica 58, 495–505. Tanaka, K.: (1999), ‘The nonstationary fractional unit root’. Econometric Theory 15, 549– 582. Taqqu, M. S.: (1975), ‘Weak convergence to fractional Brownian motion and to the Rosenblatt process’. Probability Theory and Related Fields 31, 287–302. Vogelsang, T. J. and M. Wagner: (2013), ‘A fixed-b perspective on the Phillips–Perron unit root tests’. Econometric Theory 29(03), 609–628. 169