
Studies in nonlinear and long memory time series econometrics
By

Rehim Kılıç

A DISSERTATION

Submitted to
Michigan State University
in partial fulﬁllment of the requirements

for the degree of

DOCTOR OF PHILOSOPHY

Department of Economics

2002

ABSTRACT

Studies in nonlinear and long memory time series econometrics
By

Rehim Kılıç

This dissertation explores long memory and nonlinear dynamics in foreign exchange,
commodity and stock markets. The first two chapters review nonlinearity and long
memory in econometrics. Chapter one provides a concise overview of Smooth Transition
Autoregressive (STAR) models, cast in terms of specification procedures for smooth
transition models. It provides simulation evidence on the size and power of the tests
proposed in the literature for detecting STAR-type nonlinearity in a univariate time
series, and studies the small sample properties of the nonlinear least squares
estimator of STAR models. Chapter two discusses specification, estimation and
inference for long memory Autoregressive Fractionally Integrated Moving Average
(ARFIMA) models of the conditional mean of a process, and for Generalized
Autoregressive Conditional Heteroscedastic (GARCH) and Fractionally Integrated
GARCH (FIGARCH) models of its conditional volatility.

Chapter three investigates a well-known puzzle in the international finance
literature. The purchasing power parity puzzle relates to the slow adjustment of
real exchange rates. We investigate the transactions cost-nonlinearity explanation
of the puzzle by utilizing STAR models. The findings point to the difficulty of
explaining the puzzle by the transactions cost theory alone. The estimated models
and further analysis reveal the extreme persistence in real exchange rates over the
floating period.

The fourth chapter investigates long memory dynamics in commodity markets. Both
cash and future prices of several commodities (coffee, corn, gold, silver, soybean
and unleaded gasoline) are analyzed. The findings indicate that commodity cash and
future prices are approximately martingales with long term dependence in the higher
moments. The volatility proxies (squared returns, absolute returns, and intraday
range) are found to exhibit a long memory component. The finding of long memory has
important implications for optimal hedge ratios.

Chapter five analyzes long memory dynamics in an emerging capital market, using
daily and weekly dollar returns on the Istanbul Stock Exchange (ISE) National 100
index together with their absolute and squared values. Both parametric FIGARCH
models and nonparametric methods are employed. The results indicate the presence of
long memory dynamics in the conditional variance, which can be modelled adequately
by a FIGARCH model.

The last chapter revisits the persistence and nonlinearity of deviations from PPP.
It develops new unit root tests that are specifically designed to test a random walk
without drift and a random walk with drift against stationary exponential smooth
transition autoregressive models. The asymptotic distributions of the tests are
derived and shown to be nonstandard. The power and size of the tests in finite
samples are studied by simulation. The fitted exponential STAR models and further
analysis reveal the nonlinear nature of real exchange rates as well as the
persistence of the deviations.

For Ada, Gülen and in loving memory of my mother, and my brothers.


ACKNOWLEDGMENTS

"We act as though comfort and luxury were the chief requirements of life, while
all that we really need is something to be enthusiastic about."

Albert Einstein

The completion of this dissertation would not have been possible without the input,
advice, assistance and encouragement of many individuals throughout the entire
process. I am happy to have the opportunity to express my gratitude to those
individuals, although I realize that it may be impossible to express, in these
pages, the depth of my appreciation to the many who are deserving.

When I started my PhD research, I wasn't quite sure whether this was something
to be enthusiastic about. When I look back, I can surely say that it was. A decisive
factor in bringing about this change of mind has been the opportunity to know the
distinguished economists Richard T. Baillie, Robert de Jong, Peter Schmidt, Rowena
Pecchenino and Jeffrey Wooldridge. Having known them personally and worked with
them, I find it impossible not to be enthusiastic about economic and econometric
research. I can only hope to continue to exchange ideas and cooperate with them in
the future.

I thank Professor Richard T. Baillie not only for being my supervisor, but also for
his great support and inspiration from the very first until the very last day of my
years at Michigan State University (MSU). Despite his busy schedule, he has been
incredible in directing and stimulating my research. I deeply appreciate his generous
financial support through a research assistantship, which not only helped me and
my family financially but also led to the completion of a chapter of this dissertation.
I thank the members of my committee, Professor Peter Schmidt, Professor Rowena
Pecchenino, and Professor Jeffrey Wooldridge, for their insightful comments on this
thesis. I thank them for being there whenever I needed their advice and help. They
have been a great support and inspiration throughout my years at MSU. A special word
of thanks goes to Professor Robert de Jong, for his constant help on questions that
I didn't have a clue about. He has been not only a great inspiration but also a very
good friend to talk to about many issues. I am also grateful to Peter Schmidt and
Rowena Pecchenino for all of the support and advice that they offered me throughout
my PhD years at MSU. I would like to thank Professor S. Tamer Cavusgil for reading
my dissertation, for his financial support through a research assistantship, and for
his moral support. I would also like to thank Professors Ana Maria Herrera and
Steven J. Matusz for their useful comments and suggestions on certain chapters
of this dissertation.

I thank my fellow PhD students at Michigan State University, especially
Scott Adams, Ali M. Berker, Chirok Han, Vinit Jagdish, Daiji Kawaguchi, Alina
Luca, Pınar Özbay, Yuri Soares, and Chien-Ho Wang, for their advice and help at
certain stages of this dissertation and for their friendship. I am grateful to the
staff in the Department of Economics at MSU, in particular Ms. Margaret Linch, Ms.
Amy Fekete, Ms. Wendy Tate and Ms. Linda Wirick, for their support. I would also
like to thank the Graduate School at MSU and the Dean of the College of Social
Sciences at MSU for providing financial support.

Of course there are things in life to be more enthusiastic about than writing a
dissertation, although at times I have found it necessary for other people to point
this out to me. My special thanks go to my best friend and wife, Gülen Kılıç, and
my daughter, Ada Birge, for reminding me of and arousing my enthusiasm for life.
Without Gülen's never-failing moral support and love, and the joy Ada brought to my
life, this dissertation wouldn't exist.

I would like to thank my best friends, Gülten and Sait Akgün, and Gün Ayhan
Utkan, for always being there when I needed them, both morally and financially. I
would like to thank my friends Leyla Parvizi-Yilan, for proofreading my writing, and
Gültekin Yilan, for his moral support. Finally, I would like to thank my father, my
sisters, and my mother-in-law and father-in-law for their moral support throughout
these years.


TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

1 Smooth Transition Autoregressive Model: specification, estimation, and inference
1.1 Introduction
1.2 The STAR Model: Representation, Specification, and Inference
1.3 Properties of the STAR Model
1.4 Empirical Specification of STAR models
1.4.1 Specifying an appropriate linear AR model
1.4.2 Testing linearity against STAR
1.5 Estimation of STAR Models
1.6 Diagnostic Checking of Estimated STAR model
1.6.1 Tests for serial autocorrelation
1.6.2 Testing for remaining nonlinearity
1.6.3 Testing parameter constancy
1.7 Impulse response function analysis of estimated STAR model
1.8 Conclusion
BIBLIOGRAPHY

2 Review of long memory models for conditional mean and variance
2.1 Introduction: Definition and sources of long memory in economic time series
2.2 Long Memory Models
2.2.1 The ARFIMA Model
2.3 Long memory volatility models
2.3.1 The (G)ARCH Model
2.3.2 The IGARCH Model
2.3.3 The FIGARCH Model
2.4 ARFIMA-FIGARCH Model: Modelling long memory in both conditional mean and variance
2.5 Estimation and Inference
2.5.1 Regression based estimation in the frequency domain
2.5.2 Parametric Methods: Approximate Maximum Likelihood
2.5.3 Whittle's approximate MLE
2.5.4 Approximate MLE in the time domain
2.6 Conclusion
BIBLIOGRAPHY

3 Persistence and Nonlinearity in Real Exchange Rates
3.1 Introduction
3.2 Modelling Nonlinearity by Smooth Transition Autoregressive Models
3.3 Nonlinearity, Non-stationarity and Real Exchange Rates
3.4 Empirical Results
3.4.1 The Data
3.4.2 Nonlinearity tests and STAR model specification
3.4.3 Results from the Estimated STAR Models
3.5 Further Analysis of the Dynamics of Estimated STAR Models: Characteristic Roots and GIRFs
3.6 Conclusion
BIBLIOGRAPHY

4 Long Memory in Commodity Markets
4.1 Introduction
4.2 The Data
4.3 Results from GARCH and FIGARCH Models
4.4 Conclusion
BIBLIOGRAPHY

5 On the long memory properties of Emerging Capital Markets: Evidence from Istanbul Stock Exchange
5.1 Introduction
5.2 The Data
5.3 Empirical Results
5.4 Conclusion
BIBLIOGRAPHY

6 Revisiting the nonlinearity and persistence in real exchange rates: evidence from a new unit root test and an ESTAR specification
6.1 Introduction
6.2 Foundations of nonlinear adjustment of real exchange rates and ESTAR model
6.2.1 Motivation for a nonlinear adjustment in real exchange rates
6.2.2 Stationarity of ESTAR model
6.3 Testing Unit root against stationary ESTAR alternatives
6.4 Empirical Critical Values and size and power properties of the sup Wald tests
6.5 Empirical Results
6.5.1 The data
6.5.2 Unit root test results
6.5.3 ESTAR model estimation and persistence of real exchange rates
6.6 Conclusion
6.7 Appendix: Proof of propositions 1 and 2
BIBLIOGRAPHY

LIST OF TABLES

1.1 Lag selection frequencies in AR(p) model
1.2 Parameter Specifications for the generated DGPs: All of the DGPs are generated with $c = 0$ and $\gamma = 5$
1.3 Empirical power of the linearity tests
1.4 Empirical size of the linearity test
1.5 Simulation Results on the finite sample performance of NLSE of STAR models
1.6 Simulation Results on the finite sample performance of NLSE of STAR models
1.7 Simulation Results on the finite sample performance of NLSE of STAR models
1.8 Simulation Results on the finite sample performance of NLSE of STAR models
1.9 Simulation Results on the finite sample performance of NLSE of STAR models
1.10 Simulation Results on the finite sample performance of NLSE of STAR models

3.1 Empirical rejection frequencies of linearity tests, Sample size = 305
3.2 Empirical rejection frequencies for ADF, PP and KPSS tests
3.3 Results on unit root and stationarity tests: PP and KPSS
3.4 p-values of LM tests for STAR type of nonlinearity in monthly logarithmic differences of real exchange rates
3.5 Estimation Results from ESTAR models: Sample size: 291 (after adjusting end points)
3.6 Tests for remaining nonlinearity and parameter constancy
3.7 Characteristic Roots in extreme regimes

4.1 Summary statistics for commodity future and cash returns
4.2 Summary statistics for commodity future absolute and squared returns and intraday range
4.3 KPSS and Phillips-Perron test results for commodity future log prices levels, returns, absolute returns, squared returns and intraday range
4.4 Estimated MA-GARCH Models for the commodity future returns
4.5 Estimated MA-FIGARCH Models for the commodity future returns
4.6 Estimated MA-GARCH Models for the commodity cash returns
4.7 Estimated MA-FIGARCH Models for the commodity cash returns
4.8 GPH estimation results for the cash returns, squared and absolute returns
4.9 GPH estimation results for the future returns, squared and absolute returns and intraday range
4.10 Local Whittle Estimates of long memory parameter for commodity cash and future returns and volatility proxies

5.1 Summary statistics for ISE100 stock returns
5.2 Estimated ARMA(P, Q)-FIGARCH(p, $\delta$, q) Models for ISE 100 Index returns
5.3 GPH, CSS and local Whittle estimates of long memory parameter for the ISE100 stock squared returns and absolute returns

6.1 Empirical critical values of the unit root tests
6.2 Empirical size of the unit root tests
6.3 Empirical power of the unit root tests
6.4 Results on unit root and stationarity tests: PP, supWald and KPSS tests
6.5 Estimation Results from ESTAR models: Sample size: 312

LIST OF FIGURES

1.1 Examples of the exponential and logistic functions for values of $\gamma$ = 3, 5, and 25 and threshold parameter $c = 0$
1.2 Sample realizations from the STAR models with $\pi_{1,1} = -0.3$, $\pi_{2,1} = 0.7$, $c = 0$ and $u_t \sim NID(0,1)$

2.1 Sample realizations from ARFIMA(p, d, q) processes
2.2 Autocorrelations of the sample realizations from ARFIMA(p, d, q) processes
2.3 Autocorrelations of $u_t^2$ from sample realizations of GARCH(1, 1) and FIGARCH(1, d, 1) processes

3.1 Estimated Transition Function versus Time and Threshold Variable
3.2 Generalized Impulse Response Functions from estimated ESTAR models

4.1 Cash returns, absolute and squared returns
4.2 Commodity future returns, absolute and squared returns, and intraday range
4.3 Autocorrelations for cash returns, absolute and squared returns
4.4 Autocorrelations for future returns, absolute and squared returns, and intraday range
4.5 Future returns and estimated conditional variances

5.1 ISE National 100 Daily stock indices, index returns, absolute and squared returns
5.2 Correlograms of ISE 100 stock index returns

6.1 Estimated j-step ahead covariances from the simulated ESTAR model
6.2 Real exchange rate series and fitted values, residuals, and estimated transition function versus time and transition variable
6.3 Generalized Impulse Response Functions from estimated ESTAR models
6.4 Distribution of Generalized Impulse Responses

CHAPTER 1

Smooth Transition Autoregressive
Model: specification, estimation,
and inference

1.1 Introduction

The aim of this chapter is to review the smooth transition model and to discuss
aspects of the model that are relevant to the subsequent chapters. The presentation
is framed in terms of the empirical specification and estimation of smooth transition
autoregressive (STAR) models, the basics of which are discussed in Granger and
Teräsvirta (1993), Teräsvirta (1994), and Eitrheim and Teräsvirta (1996). Reviews of
the STAR model similar in spirit to this chapter are given by Teräsvirta (1998) and
by van Dijk et al. (2000). This chapter contains three Monte Carlo simulation
experiments. The first experiment suggests that standard lag selection criteria
(e.g. AIC, BIC) may not always select the correct lag order in STAR models. The
second experiment examines the properties of standard and heteroscedasticity
consistent (HCC) variants of nonlinearity tests. The results suggest that both
variants have comparable power (i.e. the ability to reject linearity when it is
false), but that the size of the standard tests is worse than that of the HCC
variants. The third experiment examines the finite sample properties of nonlinear
least squares (NLS) estimates of STAR models. The results indicate that in samples of
size 100 (approximately the sample size available for several macroeconomic
variables) the estimator performs poorly in terms of mean squared error. When the
sample size is doubled, the NLS method performs better.

1.2 The STAR Model: Representation, Specification, and Inference

The smooth transition model for a univariate time series $y_t$, which is observed at
times $t = 1-p, 2-p, \ldots, -1, 0, 1, \ldots, T-1, T$, is given by

$$y_t = \pi_1' x_t \,\bigl(1 - F(z_t;\gamma,c)\bigr) + \pi_2' x_t \,F(z_t;\gamma,c) + u_t, \qquad t = 1,\ldots,T, \tag{1.1}$$

where $x_t$ is a vector consisting of lagged endogenous and exogenous variables,
$x_t = (1, \tilde{x}_t')'$ with $\tilde{x}_t = (y_{t-1}, \ldots, y_{t-p}, w_{1t}, \ldots, w_{kt})'$,
and $\pi_i = (\pi_{i,0}, \ldots, \pi_{i,m})'$, $i = 1, 2$, with $m = p + k$. The STAR
model is obtained if one considers $\tilde{x}_t = (y_{t-1}, \ldots, y_{t-p})'$.
The presentation in this chapter is restricted to the STAR model, as it is the model
used in the applications in this dissertation. The disturbances $u_t$ are assumed
to be a martingale difference sequence with respect to the history of the time series
up to time $t-1$, which is denoted by $\Omega_{t-1} = \{y_{t-1}, \ldots, y_{1-p}\}$.
This means that $E[u_t \mid \Omega_{t-1}] = 0$. For simplicity, we also assume that the
conditional variance of $u_t$ is constant, that is, $E[u_t^2 \mid \Omega_{t-1}] = \sigma^2$.
The transition function $F(z_t; \gamma, c)$ is a continuous function that is bounded
between 0 and 1. The transition variable $z_t$ can be a lagged endogenous variable,
$z_t = y_{t-d}$ for a certain integer $d > 0$, as is assumed in most empirical
applications. It can also be an exogenous variable, or a function of both lagged
exogenous and endogenous variables, say $z_t = z(\tilde{x}_t)$. This function can, in
principle, be either linear or nonlinear, and it can be parametric or non-parametric.
In most applications it is taken to be a linear function of lagged endogenous
variables. Another possibility is to let $z_t$ be a linear time trend, $z_t = t$, in
which case the STAR model has smoothly changing parameters; see Lin and Teräsvirta
(1994). In order to keep the generality, we do not assume any particular form for
the transition function throughout this chapter. One can write out the STAR model
given in equation (1.1) in more detail as follows:

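$$\begin{aligned}
y_t ={}& (\pi_{1,0} + \pi_{1,1} y_{t-1} + \cdots + \pi_{1,p} y_{t-p})\bigl(1 - F(z_t;\gamma,c)\bigr) \\
&+ (\pi_{2,0} + \pi_{2,1} y_{t-1} + \cdots + \pi_{2,p} y_{t-p})\,F(z_t;\gamma,c) + u_t.
\end{aligned} \tag{1.2}$$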

There are two possible ways of interpreting the STAR model. It can be thought of as
a regime switching model that allows for two regimes, associated with the extreme
values of the transition function, $F(\cdot) = 0$ and $F(\cdot) = 1$, where the
transition from one regime to the other is gradual. Alternatively, the STAR model
can be thought of as involving a continuum of regimes, each associated with a
different value of the transition function between 0 and 1. The regime that prevails
at time $t$ is determined by the observable variable $z_t$ and the associated value
of $F(\cdot)$. Different choices for the transition function $F(\cdot)$ lead to
different types of state-dependent or regime-switching behavior. In most applications
in econometrics, the most popular choices are either the logistic function,

$$F(z_t;\gamma,c) = \frac{1}{1 + \exp[-\gamma(z_t - c)]}, \qquad \gamma > 0, \tag{1.3}$$

or the exponential function,

$$F(z_t;\gamma,c) = 1 - \exp[-\gamma(z_t - c)^2], \qquad \gamma > 0. \tag{1.4}$$

The choice of the logistic function leads to the logistic STAR (LSTAR) model, while
the choice of the exponential function results in the so-called exponential STAR
(ESTAR) model. The parameter $c$ in the LSTAR model is interpreted as the threshold
between the two regimes corresponding to $F(\cdot) = 0$ and $F(\cdot) = 1$, in the
sense that the logistic function changes from 0 to 1 as $z_t$ increases and
$F(c;\gamma,c) = 0.5$. The parameter $\gamma$ determines the smoothness of the change
in the value of the logistic function and thus the smoothness of the transition from
one regime to the other. Figure 1.1 shows graphs of the logistic and the exponential
functions for different parameter specifications. From the figure it is obvious that
as $\gamma$ becomes larger and larger the logistic function approaches the indicator
function $I[z_t > c]$, defined as $I(\cdot) = 1$ if its argument is true and
$I(\cdot) = 0$ otherwise. As a result the transition from one regime to the other
happens almost instantaneously at $z_t = c$. This implies that the LSTAR model nests
a two-regime threshold autoregressive (TAR) model as a special case. When
$z_t = y_{t-d}$ the model is called the self-exciting TAR (SETAR) model. TAR models
are discussed extensively in Tong (1990). As $\gamma$ approaches zero the logistic
function approaches the constant 0.5, and when $\gamma = 0$ the LSTAR model reduces
to a linear model.
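
The limiting behavior of the two transition functions can be checked numerically. The following Python sketch is an illustration only and is not part of the original analysis; the function names and the grid of evaluation points are assumptions made for this example. It evaluates (1.3) and (1.4) at the slope values $\gamma = 3, 5, 25$ used in Figure 1.1, with $c = 0$.

```python
import math


def logistic_transition(z, gamma, c):
    """Logistic transition function (1.3): increases from 0 to 1 in z."""
    # Clip the exponent so that math.exp cannot overflow for very large gamma.
    exponent = max(min(-gamma * (z - c), 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(exponent))


def exponential_transition(z, gamma, c):
    """Exponential transition function (1.4): symmetric around c, zero at z = c."""
    return 1.0 - math.exp(-gamma * (z - c) ** 2)


if __name__ == "__main__":
    c = 0.0
    grid = (-1.0, -0.1, 0.0, 0.1, 1.0)  # illustrative values of z_t
    for gamma in (3.0, 5.0, 25.0):
        logistic = [round(logistic_transition(z, gamma, c), 3) for z in grid]
        exponential = [round(exponential_transition(z, gamma, c), 3) for z in grid]
        print("gamma =", gamma, "logistic:", logistic, "exponential:", exponential)
```

As $\gamma$ grows, the logistic values move toward 0 below $c$ and toward 1 above it, while the exponential function stays at 0 for $z_t = c$ and takes the same value at equal distances on either side of $c$.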

The type of regime switching implied by the LSTAR model may be useful for modelling
economic time series that exhibit asymmetries between expansions and recessions.
This is because in the LSTAR model the two regimes correspond to small and large
values of the transition variable $z_t$ relative to the threshold $c$, which allows
one to distinguish expansions and recessions in a given time series. For this reason
the LSTAR model has been used in the empirical business cycle literature for
modelling the asymmetric behavior of macroeconomic variables, such as output and
unemployment, over the business cycle. For example, if $y_t$ is the rate of
unemployment and the transition variable is the unemployment rate at a predetermined
date, say the unemployment rate of the previous period, $z_t = y_{t-1}$, then the
model is capable of distinguishing high and low unemployment relative to a threshold
rate, say the natural rate of unemployment (assuming such a rate exists), over the
business cycle. Similarly, if $y_t$ is the growth rate of an output variable, the
transition variable is the growth rate in the previous period, and $c \approx 0$,
then the LSTAR model can distinguish periods of positive and negative growth, namely
periods of expansion and contraction over the business cycle. The LSTAR model
has been applied by Teräsvirta and Anderson (1992) and Teräsvirta, Tjøstheim and
Granger (1994) to study the different dynamics of industrial production in a
number of OECD countries.

It is quite plausible to come up with empirical problems in economics where dif-
ferent types of regime-switching behavior may be much more appropriate than the
one implied under the LSTAR model. A major example would be the behavior of
real exchange rates. The dynamic behavior of real exchange rates could possibly de-
pend on the magnitude of the deviations from purchasing power parity [PPP]. For
instance, the presence of transaction costs may lead to the notion of different regimes
in real exchange rates. In particular, the proﬁts from commodity arbitrage, which
is generally thought to be the ultimate force behind maintaining PPP, do not make
up for the costs involved in the necessary transactions for small deviations from the
equilibrium value. This means that there may exist a band around the equilibrium
rate in which there is no tendency for the real exchange rate to revert to its equilib-
rium value. Whenever the rate is outside the band that is speciﬁed by the relevant
costs, arbitrage becomes proﬁtable. This in turn forces the real exchange rate back
towards the band. Dumas (1992), for instance, builds a general equilibrium model
that implies the type of behavior outlined above.

If we want to model the type of behavior described in the above example by a STAR
model, with $y_t$ being the real exchange rate and $z_t = y_{t-d}$, it appears
much more appropriate to choose the transition function such that the regimes are
associated with small and large absolute values of $z_t$. A specification along
these lines would be, for example, the exponential function given in (1.4), as it
allows one to model symmetric adjustment towards the equilibrium value of real
exchange rates. The ESTAR model has been applied to real exchange rates by Michael,
Nobay, and Peel (1997) and Taylor, Peel, and Sarno (2001), among others.

Note that the exponential function in (1.4) becomes a constant whenever
$\gamma \to 0$ or $\gamma \to \infty$; see Figure 1.1. Thus the ESTAR model becomes
linear in both cases, and it does not nest a self-exciting threshold autoregressive
(SETAR) model as a special case. To remedy this drawback, the use of the quadratic
logistic function,

$$F(z_t;\gamma,c) = \frac{1}{1 + \exp[-\gamma(z_t - c_1)(z_t - c_2)]}, \qquad c_1 \le c_2, \quad \gamma > 0, \tag{1.5}$$

has been suggested in the literature; see, for instance, Jansen and Teräsvirta (1996).
With the quadratic transition function, the model becomes linear if $\gamma \to 0$.
When $\gamma \to \infty$ and $c_1 \ne c_2$, the transition function is equal to 1 for
$z_t < c_1$ and $z_t > c_2$ and equal to 0 in between. Thus the specification for the
transition function in (1.5) nests a three-regime SETAR model.
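
The two limits of the quadratic logistic function can be illustrated with a small Python sketch. It is a hedged example rather than part of the original text: the function name is hypothetical, and a very large finite $\gamma$ stands in for the limit $\gamma \to \infty$. The thresholds $c_1 = 0$ and $c_2 = 0.5$ are the values used for panel f of Figure 1.2.

```python
import math


def quadratic_logistic_transition(z, gamma, c1, c2):
    """Quadratic logistic transition function (1.5), assuming c1 <= c2."""
    # Clip the exponent so that math.exp cannot overflow for very large gamma.
    exponent = max(min(-gamma * (z - c1) * (z - c2), 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(exponent))


if __name__ == "__main__":
    c1, c2 = 0.0, 0.5
    for z in (-1.0, 0.25, 1.0):
        # gamma = 0 gives the constant 0.5 (linear model);
        # a very large gamma approximates the three-regime step function.
        flat = quadratic_logistic_transition(z, 0.0, c1, c2)
        steep = quadratic_logistic_transition(z, 1e6, c1, c2)
        print("z =", z, "gamma=0:", flat, "gamma=1e6:", steep)
```

With the large slope, the function is numerically 1 outside $[c_1, c_2]$ and numerically 0 inside it, which is the step pattern behind the three-regime SETAR limit.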

1.3 Properties of the STAR Model

In this section we brieﬂy discuss some properties of the STAR family models. The
discussion here is rather informal and intuitive. A much more formal discussion of
STAR models is given in Tong (1990) and Terasvirta (1994). Throughout this section
we concentrate on those models with autoregressive lag equal to 1 as it is easier to
present the important characteristics of the models without exposing their complex
details.

One of the first things to note about STAR models is the relatively large variety of
dynamic patterns that can be obtained by choosing the parameters appropriately.
To give an impression of the potential dynamic patterns, the panels of Figure 1.2
show realizations of $T = 250$ observations from an ESTAR model with $p = 1$ and
$z_t = y_{t-1}$. The realizations are obtained by setting $\pi_{1,1} = -0.3$ and
$\pi_{2,1} = 0.7$, and the parameters of the exponential function, $(\gamma, c)$,
are set equal to 3 and 0, respectively. The disturbances $u_t$, $t = 1, \ldots, T$,
are drawn independently from a standard normal distribution, i.e.
$u_t \sim \text{i.i.d. } N(0,1)$. All series are started with $y_0 = 0$, and the same
draws of the disturbances are used for every series. The intercepts $\pi_{1,0}$ and
$\pi_{2,0}$ are varied to generate different behavior. The panels of Figure 1.2 show
that one can obtain quite rich dynamic patterns in STAR models just by changing the
intercepts across the regimes. In other words, keeping the autoregressive parameters
in the two extreme regimes fixed but varying the intercepts generates series with
quite different behavior. This also illustrates how the constant terms can play an
important role in nonlinear models. To give some idea of the dynamics of STAR models
with different autoregressive parameters, Figure 1.2 also shows a realization from
the ESTAR model with $\pi_{1,1} = 1$ and $\pi_{2,1} = -0.3$, where all other
parameters are the same as above except that $\pi_{1,0} = \pi_{2,0} = 0$. Panel f of
Figure 1.2 gives a sample realization from an LSTAR model with the quadratic logistic
function given in (1.5), with $c_1 = 0$, $c_2 = 0.5$, $\pi_{1,0} = \pi_{2,0} = 0$,
$\pi_{1,1} = 1$ and $\pi_{2,1} = -0.3$. In these latter panels of Figure 1.2, the
autoregressive parameter in the inner (middle) regime is unity. This implies that the
process acts like a unit root process in the inner regime and becomes a stationary
process in the outer regime. Thus, as the deviation of the transition variable (in
these examples, $y_{t-1}$) from the threshold level becomes larger and larger, the
process becomes increasingly mean reverting, in the sense that it tends to move back
to the inner regime. Therefore, although the generated processes behave locally like
a random walk, they are globally stationary.
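
A data generating process of this kind can be sketched in a few lines of Python. The code below is a minimal illustration, assuming a first-order ESTAR model with $z_t = y_{t-1}$ as in (1.2) and (1.4); the function name `simulate_estar1`, the seed handling and the use of Python's standard random number generator are assumptions of this example and do not reproduce the draws behind Figure 1.2.

```python
import math
import random


def simulate_estar1(T, pi10, pi11, pi20, pi21, gamma, c, seed=0, y0=0.0):
    """Simulate T observations from a first-order ESTAR model with z_t = y_{t-1}."""
    rng = random.Random(seed)  # fixed seed: the same shocks for every parameter choice
    y_prev = y0
    path = []
    for _ in range(T):
        # Exponential transition function (1.4) evaluated at z_t = y_{t-1}.
        F = 1.0 - math.exp(-gamma * (y_prev - c) ** 2)
        u = rng.gauss(0.0, 1.0)  # standard normal disturbance
        # Equation (1.2) with p = 1: weighted average of the two regimes plus noise.
        y = (pi10 + pi11 * y_prev) * (1.0 - F) + (pi20 + pi21 * y_prev) * F + u
        path.append(y)
        y_prev = y
    return path


if __name__ == "__main__":
    # Zero intercepts with the autoregressive parameters quoted in the text.
    base = simulate_estar1(250, 0.0, -0.3, 0.0, 0.7, gamma=3.0, c=0.0, seed=1)
    # Unit root in the inner regime, mean reversion in the outer regime.
    unit_inner = simulate_estar1(250, 0.0, 1.0, 0.0, -0.3, gamma=3.0, c=0.0, seed=1)
    print(min(base), max(base))
    print(min(unit_inner), max(unit_inner))
```

Reusing the same seed across calls mirrors the design described above, in which only the parameters change between series while the disturbances are held fixed.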

The conditions required for the stationarity of STAR models are relatively
unexplored. They have only been established for the first-order SETAR model, which
is obtained from (1.2) with $p = 1$ and (1.3) by letting $\gamma \to \infty$. Chan,
Petruccelli, Tong, and Woolford (1985) show that the first-order SETAR model is
stationary if and only if one of the following conditions is satisfied:

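i. $\pi_{1,1} < 1$, $\pi_{2,1} < 1$, $\pi_{1,1}\pi_{2,1} < 1$;
ii. $\pi_{1,1} = 1$, $\pi_{2,1} < 1$, $\pi_{1,0} > 0$;
iii. $\pi_{1,1} < 1$, $\pi_{2,1} = 1$, $\pi_{2,0} < 0$;
iv. $\pi_{1,1} = 1$, $\pi_{2,1} = 1$, $\pi_{2,0} < 0 < \pi_{1,0}$;
v. $\pi_{1,1}\pi_{2,1} = 1$, $\pi_{1,1} < 0$, $\pi_{2,0} + \pi_{2,1}\pi_{1,0} > 0$.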

Condition (i) allows one of the autoregressive (AR) parameters to be smaller
than -1. Note also that conditions (ii) to (iv) allow unit root behavior in one or
both of the regimes. In these cases, the time series is locally nonstationary. Global
stationarity is obtained because of the conditions on the intercept terms in the two
regimes. The conditions in (ii) and (iii) on the intercepts $\pi_{1,0}$ and
$\pi_{2,0}$ are such that the time series has a tendency to revert to the stationary
regime, and hence the time series is globally stationary. Condition (iv) allows both
AR parameters to be unity, so that the time series is nonstationary within both
regimes, but the conditions on the intercepts guarantee the global stationarity of
the series. The testing problem for unit roots in SETAR models is discussed in Caner
and Hansen (2001), Enders and Granger (1998), Berben and van Dijk (1999), and in
Chapter 6 of this dissertation.


1.4 Empirical Speciﬁcation of STAR models

Issues relating to the empirical specification of STAR models have been discussed
extensively in Granger (1993), Granger and Teräsvirta (1993), and Teräsvirta (1994).
The empirical specification procedure advocated by these authors follows a
strategy that starts with a simple or restricted model and proceeds to a more
general one only if diagnostic tests indicate that the maintained model is inadequate.
The procedure put forward in Teräsvirta (1994) consists of the following steps.

1. Specify an appropriate linear AR model of order p [AR(p)] for the time series under study;
2. Test the null hypothesis of linearity against the alternative of STAR-type nonlinearity. If linearity is rejected, select the appropriate transition variable $z_t$ and the form of the transition function $F(z_t; \gamma, c)$;
3. Estimate the parameters in the selected STAR model;
4. Evaluate the model using diagnostic tests;
5. Modify the model if necessary;
6. Use the model for descriptive or forecasting purposes.

The following sections discuss each of these steps in detail.

1.4.1 Specifying an appropriate linear AR model

The important issue involved in specifying an AR(p) model for the time series under
consideration is the selection of the lag order $p$. The residuals from the AR(p) model
need to be approximately white noise, as the tests for nonlinearity that are used in the
second step are sensitive to residual autocorrelation. There are several conventional
methods that can be used for lag selection. The most commonly used criteria
in linear models are the Akaike Information Criterion (AIC), $\mathrm{AIC} = T \ln \hat{\sigma}^2 + 2k$,
the Schwarz Information Criterion (BIC), $\mathrm{BIC} = T \ln \hat{\sigma}^2 + k \ln(T)$, the Hannan and Quinn
Criterion (HQ), $\mathrm{HQ} = T \ln \hat{\sigma}^2 + k \ln(\ln(T))$, and the Ljung-Box (LB) statistic, where
$\hat{\sigma}^2$ is the residual variance and $k$ is the number of estimated parameters. The
LB statistic tests directly for residual autocorrelation. It is given by
$\mathrm{LB}(m) = T(T + 2) \sum_{j=1}^{m} (T - j)^{-1} r_j^2(\hat{u})$, where $r_j(\hat{u})$ is the $j$-th autocorrelation of the
residuals. Under the null hypothesis of no residual autocorrelation at lags 1 through
$m$, the LB test has an asymptotic $\chi^2$ distribution with $m - p$ degrees of freedom.
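
The criteria above can be illustrated with a minimal sketch. It is not the code used for the experiments reported in this chapter. It assumes that the AR(p) model includes an intercept, so that $k = p + 1$, that $\hat{\sigma}^2$ is the residual sum of squares divided by the number of observations used, and that the function names (`fit_ar`, `info_criteria`, `ljung_box`) are purely illustrative.

```python
import numpy as np

def fit_ar(y, p, max_p=None):
    """OLS fit of an AR(p) with intercept; returns residuals and k = p + 1."""
    y = np.asarray(y, dtype=float)
    # Dropping max_p initial observations keeps the sample common across orders.
    start = p if max_p is None else max_p
    n = len(y) - start
    X = np.column_stack([np.ones(n)] +
                        [y[start - j:len(y) - j] for j in range(1, p + 1)])
    target = y[start:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ beta, X.shape[1]

def info_criteria(resid, k):
    """AIC, BIC and HQ as written in the text (T ln sigma^2 plus a penalty)."""
    resid = np.asarray(resid, dtype=float)
    T = len(resid)
    log_s2 = np.log(resid @ resid / T)
    return {"AIC": T * log_s2 + 2 * k,
            "BIC": T * log_s2 + k * np.log(T),
            "HQ": T * log_s2 + k * np.log(np.log(T))}

def ljung_box(resid, m):
    """LB(m) = T(T+2) * sum_{j=1}^{m} r_j^2 / (T - j)."""
    e = np.asarray(resid, dtype=float)
    e = e - e.mean()
    T = len(e)
    denom = e @ e
    r = np.array([(e[j:] @ e[:-j]) / denom for j in range(1, m + 1)])
    return T * (T + 2) * np.sum(r ** 2 / (T - np.arange(1, m + 1)))
```

The lag order would then be chosen as the value of $p$ that minimizes the selected criterion, or as the smallest $p$ for which `ljung_box` is below the relevant $\chi^2$ critical value.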

These methods were mostly developed for linear time series models. The use of
these information criteria and (partial) autocorrelation based methods may not be
appropriate in the case of nonlinear time series. One reason is that the autocorrelations
of nonlinear time series processes may have quite different properties. For instance,
Granger and Teräsvirta (1999) and Diebold and Inoue (2001) discuss certain regime
switching models whose autocorrelations resemble those of long memory processes.
Especially in finite samples, estimated autocorrelations may be quite substantial and
may decline very slowly. Therefore, when an AR(p) model is considered for these
series, the selected lag order may become large.

In order to better assess the appropriateness of the methods discussed above within
the context of STAR models, the following simulation experiment was conducted.
Time series are generated from the ESTAR model given in (1.2) with (1.4) and with
$p = 1$, $z_t = y_{t-1}$. The autoregressive parameters in the two regimes were specified to be
$\pi_{1,1} = 0.6$ and $\pi_{2,1} = 0.3$, the smoothness parameter was chosen to be $\gamma = 3$, and the
threshold parameter was kept at $c = 0.5$ during simulations. The sample sizes were taken to be
$T = 250$ and $T = 500$ observations. The errors were generated as $u_t \sim \text{iid } N(0, 1)$.
The constant terms in both regimes were kept at zero during simulations. An AR(p)
model is specified for the generated ESTAR series, where $p$ is set equal to the lag length
that minimizes AIC, BIC, or HQ, with maximum order $p = 6$, or to the minimum lag
length for which the LB statistic with $m = 15$ is not statistically significant at the 5%
level. Table (1.1) shows the frequencies, out of 1000 replications, with which different
values of $p$ are selected as the appropriate lag order. The results in table (1.1) indicate
that in some cases standard lag selection criteria overestimate the autoregressive
lag order. This suggests that straightforward application of these criteria may not
always be appropriate. Hence, one needs to pay particular attention when using these
selection criteria in STAR type modelling.
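
A single replication of an experiment of this kind could be sketched as follows. This is an illustration only, under explicit assumptions: the exponential transition function is taken to be $F = 1 - \exp(-\gamma (z_t - c)^2)$, which is consistent with the properties of (1.4) described in this chapter; a burn-in of 100 observations is used, which the text does not specify for this experiment; and the candidate orders $p = 1, \ldots, 6$ are compared on a common sample. The names `simulate_estar` and `select_lag_aic` are hypothetical.

```python
import numpy as np

def simulate_estar(T, pi1=0.6, pi2=0.3, gamma=3.0, c=0.5, burn=100, seed=0):
    """Simulate an ESTAR(1) series with zero intercepts and z_t = y_{t-1}."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T + burn)          # u_t ~ iid N(0, 1)
    y = np.zeros(T + burn)
    for t in range(1, T + burn):
        # Assumed form of the exponential transition function (1.4).
        F = 1.0 - np.exp(-gamma * (y[t - 1] - c) ** 2)
        y[t] = pi1 * y[t - 1] * (1.0 - F) + pi2 * y[t - 1] * F + u[t]
    return y[burn:]                            # discard the burn-in (an assumption)

def select_lag_aic(y, max_p=6):
    """Return the AR order in 1..max_p that minimizes AIC = T ln(s2) + 2k."""
    y = np.asarray(y, dtype=float)
    n = len(y) - max_p                         # common estimation sample
    target = y[max_p:]
    best_p, best_aic = None, np.inf
    for p in range(1, max_p + 1):
        X = np.column_stack([np.ones(n)] +
                            [y[max_p - j:len(y) - j] for j in range(1, p + 1)])
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        e = target - X @ beta
        aic = n * np.log(e @ e / n) + 2 * (p + 1)
        if aic < best_aic:
            best_p, best_aic = p, aic
    return best_p
```

Repeating `select_lag_aic(simulate_estar(250, seed=s))` over many seeds and tabulating the selected orders gives frequencies of the kind reported in table (1.1), although the exact numbers depend on the assumptions listed above.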

1.4.2 Testing linearity against STAR

Once an AR(p) model is specified, one can proceed with testing linearity against the
alternative of STAR-type nonlinearity. This step is crucial, as a failure to reject
the null hypothesis of linearity removes the justification for fitting a STAR model
to the time series under investigation.

In order to facilitate the discussion in this section, re-write the STAR model given
in (1.1) as

$$y_t = \pi_1' x_t \left(1 - F(z_t; \gamma, c)\right) + \pi_2' x_t F(z_t; \gamma, c) + u_t, \qquad t = 1, \ldots, T, \tag{1.6}$$

where $x_t = (1, \tilde{x}_t')'$ with $\tilde{x}_t = (y_{t-1}, \ldots, y_{t-p})'$. The null hypothesis of linearity can
be formulated in different ways. A straightforward formulation involves setting the
autoregressive parameters in the two regimes to be equal, that is, $H_0: \pi_1 = \pi_2$
against the alternative hypothesis $H_1: \pi_{1,j} \neq \pi_{2,j}$ for at least one $j \in \{0, \ldots, p\}$.
Testing for linearity against STAR-type nonlinearity is complicated by the presence of
unidentified nuisance parameters under the null hypothesis.
This is because the STAR model contains parameters which are not restricted by
the null hypothesis but which cannot be identified when the null hypothesis holds true. For
instance, the null hypothesis given above does not restrict the parameters in the
transition function, namely $\gamma$ and $c$. However, whenever the
null hypothesis holds true, the transition function $F(z_t; \gamma, c)$, and hence $\gamma$ and $c$, drop
out of the model.

The unidentified nuisance parameter problem can also be seen by
expressing the null hypothesis of linearity in different ways. In addition to
the equality of the AR parameters in the two regimes, $H_0: \pi_1 = \pi_2$, one can formulate
the null hypothesis as $H_0': \gamma = 0$. This alternative formulation of the null hypothesis
also gives rise to a linear model. For example, if $\gamma = 0$ the logistic function in (1.3) is
equal to 0.5 for all values of $z_t$, and the STAR model in (1.6) reduces to an AR model
with parameter $(\pi_1 + \pi_2)/2$. Similarly, under $H_0'$ the exponential function in (1.4) becomes
zero and hence the ESTAR model reduces to a linear AR model with parameter $\pi_1$.
Under this alternative null hypothesis, $\pi_1$, $\pi_2$ and the threshold parameter $c$ can
take any values.

A recent account of the problem of unidentified nuisance parameters under the
null hypothesis is given in Hansen (1996). The main consequence of the presence of
unidentified parameters under the null hypothesis is that conventional statistical
theory cannot be applied to obtain the asymptotic distribution of the test statistics.
The relevant test statistics in general tend to have non-standard distributions for
which an analytic expression is not available. Hence the critical values need to be
determined by means of simulation methods, which can be computationally prohibitive
depending on the statistic.

To avoid the nuisance parameter problem in testing for linearity against
STAR type nonlinearity, Luukkonen, Saikkonen and Teräsvirta (1988) proposed
replacing the transition function $F(\cdot)$ by a suitable Taylor series approximation. The
benefit of such a solution is that the problem is re-parameterized so that the
identification problem is no longer present. Linearity is then tested by means of a
Lagrange Multiplier (LM) statistic which has a standard asymptotic $\chi^2$ distribution
under the null hypothesis. This procedure is quite appealing as it does not require
the estimation of the model under the alternative hypothesis. It also avoids the use
of simulation methods to assess the significance of the test statistics. One shortcoming
of this method is that the LM tests can potentially have power against any other
form of misspecification or nonlinearity that is captured by the approximation of the
transition function used. In other words, rejection of the null may not always indicate that
the correct specification is a STAR model. Thus, diagnostic tests need to be used in
evaluating the fit of the models before concluding in favor of STAR type nonlinearity.
As noted in Granger and Teräsvirta (1993), in testing linearity against the
alternative of a STAR model, based on an AR(p) model under the null hypothesis,
one needs to distinguish three situations depending on the nature of the transition
variable $z_t$:

1. $z_t$ is a lagged endogenous variable $y_{t-d}$, with $1 \le d \le p$;
2. $z_t$ is a lagged endogenous variable $y_{t-d}$ with $d > p$, or an exogenous variable $w_t$;
3. $z_t$ is a linear combination of $y_{t-1}, \ldots, y_{t-p}$, that is $\alpha'\tilde{x}_t$, with $\alpha$ unknown.

The first two situations test linearity against STAR with a specified transition
variable, which is the case most often encountered in applications of STAR modelling in
economics and finance. The test statistic differs slightly in the first situation compared
to the second, as $z_t$ is contained as a regressor in the model under the null hypothesis
whenever $d \le p$. The test statistics that result in situation three are usually
interpreted as general tests against STAR-type nonlinearity, see for instance Teräsvirta
(1998). In the rest of this section we first present derivations of the test statistics
that are used in the first situation and then give some remarks on the differences that
arise in the second and third cases.

Testing against LSTAR

In order to facilitate the presentation we first discuss the tests against the LSTAR
model and then the ESTAR model. Given the LSTAR model as in (1.6) with the
transition function (1.3) and with $z_t = y_{t-d}$ for certain $1 \le d \le p$, re-write (1.6) as

$$y_t = \pi_1' x_t + (\pi_2 - \pi_1)' x_t F(y_{t-d}; \gamma, c) + u_t \tag{1.7}$$

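Following the suggestion of Luukkonen et al. (1988) and approximating the transition
function with a first order Taylor approximation around $\gamma = 0$, we have

$$
\begin{aligned}
F_1(y_{t-d}; \gamma, c) &= F(y_{t-d}; 0, c) + \gamma \left.\frac{\partial F(y_{t-d}; \gamma, c)}{\partial \gamma}\right|_{\gamma=0} + R_1(y_{t-d}; \gamma, c) \\
&= \frac{1}{2} + \frac{1}{4}\gamma (y_{t-d} - c) + R_1(y_{t-d}; \gamma, c)
\end{aligned}
\tag{1.8}
$$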

where $R_1(\cdot)$ is the remainder term. Substituting $F_1(\cdot)$ for $F(\cdot)$ in (1.7) and rearranging
terms gives the auxiliary model

$$y_t = \phi_{0,0} + \phi_0' \tilde{x}_t + \phi_1' \tilde{x}_t y_{t-d} + \eta_t \tag{1.9}$$

where $\eta_t = u_t + (\pi_2 - \pi_1)' x_t R_1(y_{t-d}; \gamma, c)$. Note that under the null hypothesis
the remainder term is equal to 0 and $\eta_t = u_t$. Thus the remainder term does not
affect the properties of the residuals under the null hypothesis. This in turn implies that
the distribution of the test statistics will not be affected by the remainder term. The
relationships between the parameters $\phi_{0,0}$ and $\phi_i = (\phi_{i,1}, \ldots, \phi_{i,p})'$, $i = 0, 1$, in the auxiliary
regression model (1.9) and the parameters in the LSTAR model (1.7) are given
by

$$\phi_{0,0} = \tfrac{1}{2}(\pi_{1,0} + \pi_{2,0}) - \tfrac{1}{4}\gamma c\,(\pi_{2,0} - \pi_{1,0}) \tag{1.10}$$

$$\phi_{0,d} = \tfrac{1}{2}(\pi_{1,d} + \pi_{2,d}) - \tfrac{1}{4}\gamma c\,(\pi_{2,d} - \pi_{1,d}) + \tfrac{1}{4}\gamma\,(\pi_{2,0} - \pi_{1,0}) \tag{1.11}$$

$$\phi_{0,j} = \tfrac{1}{2}(\pi_{1,j} + \pi_{2,j}) - \tfrac{1}{4}\gamma c\,(\pi_{2,j} - \pi_{1,j}), \qquad j = 1, \ldots, p,\ j \neq d, \tag{1.12}$$

$$\phi_{1,j} = \tfrac{1}{4}\gamma\,(\pi_{2,j} - \pi_{1,j}), \qquad j = 1, \ldots, p. \tag{1.13}$$

These relationships show that the restrictions $\pi_1 = \pi_2$ or $\gamma = 0$ imply $\phi_{1,j} = 0$ for
$j = 1, \ldots, p$. Therefore testing the null hypothesis $H_0: \pi_1 = \pi_2$ or $H_0': \gamma = 0$ in (1.7)
is equivalent to testing the null hypothesis $H_0'': \phi_1 = 0$ in (1.9). This hypothesis
can be tested by a standard variable addition test. The test statistic is the standard
Lagrange Multiplier test for a parameter restriction and is denoted by $LM_1$. This statistic
is asymptotically $\chi^2$ distributed with $p$ degrees of freedom under the null hypothesis of linearity,
under certain regularity conditions which are given in Saikkonen and Luukkonen (1988). This
test is usually referred to as an LM-type statistic because the $LM_1$ statistic does not test
the original null hypothesis $H_0': \gamma = 0$ but rather the auxiliary null hypothesis
$H_0'': \phi_1 = 0$.
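
As a check on the mapping between the auxiliary parameters and the LSTAR parameters, the first order approximation (without the remainder) can be substituted into (1.7) directly. The following is a sketch of that algebra, using only the definitions given above:

```latex
\begin{align*}
y_t &\approx \pi_1' x_t + (\pi_2 - \pi_1)' x_t
      \left[\tfrac{1}{2} + \tfrac{1}{4}\gamma\,(y_{t-d} - c)\right] + u_t \\
    &= \left[\tfrac{1}{2}(\pi_1 + \pi_2) - \tfrac{1}{4}\gamma c\,(\pi_2 - \pi_1)\right]' x_t
      + \tfrac{1}{4}\gamma\,(\pi_2 - \pi_1)' x_t\, y_{t-d} + u_t, \\
(\pi_2 - \pi_1)' x_t\, y_{t-d}
    &= (\pi_{2,0} - \pi_{1,0})\, y_{t-d}
      + \sum_{j=1}^{p} (\pi_{2,j} - \pi_{1,j})\, y_{t-j}\, y_{t-d}.
\end{align*}
```

Because $x_t = (1, \tilde{x}_t')'$, the intercept difference multiplies $y_{t-d}$, which is already a regressor in $\tilde{x}_t$; this is why the coefficient on $y_{t-d}$ picks up the extra term $\tfrac{1}{4}\gamma(\pi_{2,0} - \pi_{1,0})$, while the coefficients on the cross products $y_{t-j} y_{t-d}$ are $\tfrac{1}{4}\gamma(\pi_{2,j} - \pi_{1,j})$.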

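The above test statistic does not have power in cases where only the intercept
differs across regimes, that is, when $\pi_{1,0} \neq \pi_{2,0}$ but $\pi_{1,j} = \pi_{2,j}$, $j = 1, \ldots, p$. This can
easily be seen from (1.10) through (1.13), which show that in this case $\phi_{1,j} = 0$, $j = 1, \ldots, p$.
Luukkonen et al. (1988) suggest the use of a third order Taylor approximation of the
transition function to solve this problem. The third order is used because the second order
term in the Taylor expansion of the logistic function around $\gamma = 0$ is zero. The third order
Taylor approximation of the transition function is

$$
\begin{aligned}
F_3(y_{t-d}; \gamma, c) &= F(y_{t-d}; 0, c) + \gamma \left.\frac{\partial F(y_{t-d}; \gamma, c)}{\partial \gamma}\right|_{\gamma=0} + \frac{1}{6}\gamma^3 \left.\frac{\partial^3 F(y_{t-d}; \gamma, c)}{\partial \gamma^3}\right|_{\gamma=0} + R_3(y_{t-d}; \gamma, c) \\
&= \frac{1}{2} + \frac{1}{4}\gamma (y_{t-d} - c) - \frac{1}{48}\gamma^3 (y_{t-d} - c)^3 + R_3(y_{t-d}; \gamma, c)
\end{aligned}
\tag{1.14}
$$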

Now replacing the transition function $F(\cdot)$ with its third order approximation results
in the auxiliary model

$$y_t = \phi_{0,0} + \phi_0' \tilde{x}_t + \phi_1' \tilde{x}_t y_{t-d} + \phi_2' \tilde{x}_t y_{t-d}^2 + \phi_3' \tilde{x}_t y_{t-d}^3 + \eta_t \tag{1.15}$$

where $\eta_t = u_t + (\pi_2 - \pi_1)' x_t R_3(y_{t-d}; \gamma, c)$, and $\phi_{0,0}$ and the $\phi_i$, $i = 1, 2, 3$, are functions
of the parameters $\pi_1$, $\pi_2$, $\gamma$, and $c$. The null hypothesis of linearity $H_0'$ becomes
$H_0'': \phi_1 = \phi_2 = \phi_3 = 0$. This hypothesis can also be tested by a standard LM-type
test. Under the null hypothesis of linearity, the test statistic, denoted by $LM_3$, has
an asymptotic $\chi^2$ distribution with $3p$ degrees of freedom. A parsimonious version
of the $LM_3$ statistic can be obtained by first observing that the only parameters that
depend on the constants $\pi_{1,0}$ and $\pi_{2,0}$ are $\phi_{2,d}$ and $\phi_{3,d}$, and hence by augmenting the
auxiliary equation (1.9) with the regressors $y_{t-d}^3$ and $y_{t-d}^4$, that is,

$$y_t = \phi_{0,0} + \phi_0' \tilde{x}_t + \phi_1' \tilde{x}_t y_{t-d} + \phi_{2,d}\, y_{t-d}^3 + \phi_{3,d}\, y_{t-d}^4 + \eta_t \tag{1.16}$$

The null hypothesis of linearity can be tested by testing the hypothesis $H_0'': \phi_1 = 0$
and $\phi_{2,d} = \phi_{3,d} = 0$. The resulting test statistic, denoted by $LM_3^e$, has an asymptotic
$\chi^2$ distribution with $p + 2$ degrees of freedom.

Testing against ESTAR

Granger and Teräsvirta (1993) and Teräsvirta (1994) show that linearity can
be tested against an ESTAR alternative, given by (1.7) with (1.4), by replacing the
exponential transition function with a first order Taylor approximation around $\gamma = 0$.
Approximating the exponential function around $\gamma = 0$ gives

$$
\begin{aligned}
F_1(y_{t-d}; \gamma, c) &= F(y_{t-d}; 0, c) + \gamma \left.\frac{\partial F(y_{t-d}; \gamma, c)}{\partial \gamma}\right|_{\gamma=0} + R_1(y_{t-d}; \gamma, c) \\
&= \gamma (y_{t-d} - c)^2 + R_1(y_{t-d}; \gamma, c),
\end{aligned}
\tag{1.17}
$$

which leads to the auxiliary model

$$y_t = \phi_{0,0} + \phi_0' \tilde{x}_t + \phi_1' \tilde{x}_t y_{t-d} + \phi_2' \tilde{x}_t y_{t-d}^2 + \eta_t \tag{1.18}$$

where $\eta_t = u_t + (\pi_2 - \pi_1)' x_t R_1(y_{t-d}; \gamma, c)$. Granger and Teräsvirta (1993) and Teräsvirta
(1994) show that the restriction $\gamma = 0$ corresponds to $\phi_1 = \phi_2 = 0$ in (1.18). The
$LM_2$ statistic which tests this null hypothesis has an asymptotic $\chi^2$ distribution with
$2p$ degrees of freedom.

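Recently, Escribano and Jorda (1999) argue that a first order approximation of
the exponential function is not sufficient to capture certain characteristics of the
function, especially its two inflection points. They suggest
a second order Taylor approximation,

$$
\begin{aligned}
F_2(y_{t-d}; \gamma, c) &= F(y_{t-d}; 0, c) + \gamma \left.\frac{\partial F(y_{t-d}; \gamma, c)}{\partial \gamma}\right|_{\gamma=0} + \frac{1}{2}\gamma^2 \left.\frac{\partial^2 F(y_{t-d}; \gamma, c)}{\partial \gamma^2}\right|_{\gamma=0} + R_2(y_{t-d}; \gamma, c) \\
&= \gamma (y_{t-d} - c)^2 - \frac{1}{2}\gamma^2 (y_{t-d} - c)^4 + R_2(y_{t-d}; \gamma, c)
\end{aligned}
\tag{1.19}
$$

Substituting back into (1.7) yields the auxiliary regression

$$y_t = \phi_{0,0} + \phi_0' \tilde{x}_t + \phi_1' \tilde{x}_t y_{t-d} + \phi_2' \tilde{x}_t y_{t-d}^2 + \phi_3' \tilde{x}_t y_{t-d}^3 + \phi_4' \tilde{x}_t y_{t-d}^4 + \eta_t \tag{1.20}$$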

The null hypothesis to be tested is $H_0'': \phi_1 = \phi_2 = \phi_3 = \phi_4 = 0$. The resulting
LM-type test is denoted by $LM_4$. It has an asymptotic $\chi^2$ distribution with $4p$ degrees of
freedom under the null hypothesis. Escribano and Jorda (1999) show by simulation
that the $LM_4$ test has higher power than the $LM_2$ test.

When $z_t$ is a lagged endogenous variable $y_{t-d}$ with $d > p$ or an exogenous variable $w_t$, the resulting
test statistics are very similar to the ones derived above. The only difference is the
additional regressors $z_t^i$, $i = 1, 2, \ldots$, that enter the auxiliary model. For example,
the auxiliary model (1.9) based on the first order Taylor approximation of the logistic
function now becomes

$$y_t = \phi_{0,0} + \phi_0' \tilde{x}_t + \phi_{1,0}\, z_t + \phi_1' \tilde{x}_t z_t + \eta_t$$

while the auxiliary model (1.15) based on the third order Taylor approximation of the
logistic function becomes

$$y_t = \phi_{0,0} + \phi_0' \tilde{x}_t + \phi_{1,0}\, z_t + \phi_1' \tilde{x}_t z_t + \phi_{2,0}\, z_t^2 + \phi_2' \tilde{x}_t z_t^2 + \phi_{3,0}\, z_t^3 + \phi_3' \tilde{x}_t z_t^3 + \eta_t.$$

When linearity is tested against an alternative with $z_t = \alpha'\tilde{x}_t$, the number of
auxiliary regressors in the re-parameterized model increases very rapidly when the
parameter vector $\alpha$, which defines the linear combination of $y_{t-1}, \ldots, y_{t-p}$ that is
used as transition variable, is left completely unspecified. In order to compute the
test in practice, $p$ needs to be set fairly small or the length of the time series has to
be sufficiently large. Discussion of this issue can be found in Granger and Teräsvirta
(1993).

In small samples, the usual suggestion is to use F-versions of the LM test
statistics because these have better size and power properties than the $\chi^2$ versions.
The F-versions of the LM tests can be computed as follows:

1. Estimate the model under the null hypothesis of linearity by regressing $y_t$ on $x_t$.
Compute the residuals $\hat{u}_t$ and the sum of squared residuals $SSR_0 = \sum_{t=1}^{T} \hat{u}_t^2$.

2. Estimate the relevant auxiliary regression of $\hat{u}_t$ on $x_t$ and $\tilde{x}_t y_{t-d}^i$, where the range of $i$
depends on the LM statistic considered. For instance, in the case of the $LM_3$ statistic
based on (1.15), $i$ runs from 1 to 3. After estimating the relevant auxiliary model,
compute the sum of squared residuals and label it $SSR_1$.

3. The LM statistic is computed as

$$LM = \frac{(SSR_0 - SSR_1)/df_0}{SSR_1/df_1}$$

where $df_0$ and $df_1$ refer to the degrees of freedom of the numerator
and the denominator, which depend on the LM statistic considered. For
example, in the case of $LM_3$ based on (1.15), the F-version is

$$LM_3 = \frac{(SSR_0 - SSR_1)/3p}{SSR_1/(T - 4p - 1)}$$

which under the null hypothesis is approximately F distributed with $3p$ and
$T - 4p - 1$ degrees of freedom.
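
The three steps above can be sketched for the $LM_3$ case as follows. This is a minimal illustration rather than the implementation used for the simulations in this chapter. It assumes that $x_t$ contains an intercept and $p$ lags, that $z_t = y_{t-d}$ with $1 \le d \le p$, and that $T$ is the number of observations remaining after the first $p$ are lost to lags. The function name `lm3_f` is hypothetical.

```python
import numpy as np

def lm3_f(y, p, d):
    """F-version of the LM3 linearity test based on auxiliary model (1.15)."""
    y = np.asarray(y, dtype=float)
    target = y[p:]
    T = len(target)                                    # effective sample size
    # x_tilde_t = (y_{t-1}, ..., y_{t-p})', one column per lag
    lags = np.column_stack([y[p - j:len(y) - j] for j in range(1, p + 1)])
    x = np.column_stack([np.ones(T), lags])            # x_t = (1, x_tilde_t')'
    z = lags[:, d - 1]                                 # transition variable y_{t-d}

    # Step 1: regress y_t on x_t, keep residuals and SSR_0.
    u = target - x @ np.linalg.lstsq(x, target, rcond=None)[0]
    ssr0 = u @ u

    # Step 2: regress the residuals on x_t and x_tilde_t * y_{t-d}^i, i = 1, 2, 3.
    aux = np.column_stack([x] + [lags * z[:, None] ** i for i in (1, 2, 3)])
    e = u - aux @ np.linalg.lstsq(aux, u, rcond=None)[0]
    ssr1 = e @ e

    # Step 3: form the F statistic with 3p and T - 4p - 1 degrees of freedom.
    df0, df1 = 3 * p, T - 4 * p - 1
    return ((ssr0 - ssr1) / df0) / (ssr1 / df1), df0, df1
```

A p-value can then be obtained from the F distribution with `df0` and `df1` degrees of freedom, for example with `scipy.stats.f.sf(stat, df0, df1)` if SciPy is available.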


Selection of transition variable and function

The selection of an appropriate transition variable in the STAR model and the choice
of a suitable transition function are usually made during the linearity testing step
of the specification. As illustrated in Teräsvirta (1994), the $LM_3$ statistic, although
developed for testing linearity against the LSTAR alternative, should have power against
the ESTAR alternative as well. Intuitively this can be seen by comparing the auxiliary
models (1.15) and (1.18), which are used for computing the $LM_3$ and $LM_2$ statistics,
respectively. It is easy to see that all auxiliary regressors in (1.18) are included in (1.15).
Hence it is intuitive to think that the $LM_3$ test might have power against ESTAR
alternatives. Observing this, Teräsvirta (1994) suggests that the appropriate transition
variable in the STAR model can be determined, without specifying the form
of the transition function, by computing the $LM_3$ statistic for several candidate
transition variables $z_{1t}, \ldots, z_{mt}$, say, and selecting the one for which the p-value of the
test is smallest. The rationale behind this procedure is that the test should have the
highest power when the alternative model is correctly specified, that is, if the
correct transition variable is used. In other words, if the auxiliary regression model that
is used in calculating the $LM_3$ statistic is considered to approximate the (L)STAR
model to a certain degree of accuracy, then selecting $z_t$ as the choice which minimizes
the residual variance of the auxiliary model is equivalent to selecting $z_t$ as the
variable that maximizes the LM-type statistic. This is because the LM-type statistic is a
monotonic transformation of the residual variance. Simulation results in Teräsvirta
(1994) indicate that this procedure works quite well in a univariate setting.

If linearity tests indicate the presence of STAR type nonlinearity in the time series
and an appropriate transition variable has been selected, then one usually proceeds
with the selection of the transition function that appropriately models the STAR type
of nonlinear dynamics. In general, the logistic, the exponential, or the quadratic
logistic function, given in equations (1.3), (1.4) and (1.5), is used. Teräsvirta (1994)
suggests using a decision rule based upon a sequence of tests nested within the null
hypothesis corresponding to $LM_3$. In particular, he proposes to test the hypotheses

$H_{03}: \phi_3 = 0$,
$H_{02}: \phi_2 = 0 \mid \phi_3 = 0$,
$H_{01}: \phi_1 = 0 \mid \phi_3 = \phi_2 = 0$,

in (1.15) by means of LM-type tests. Under the assumption that a first order Taylor
approximation of the exponential function is sufficient, it can be observed by inspecting
the expressions for the auxiliary parameters $\phi_1$, $\phi_2$ and $\phi_3$ in terms of the parameters
of the original STAR model that $\phi_3$ is nonzero only if the model is an LSTAR model,
that $\phi_2$ is zero if the model is an LSTAR model with $\pi_{1,0} = \pi_{2,0}$ and $c = 0$ but is
always nonzero if the model is an ESTAR model, and that $\phi_1$ is zero if the model
is an ESTAR model with $\pi_{1,0} = \pi_{2,0}$ and $c = 0$ but is always nonzero if the model is
an LSTAR model. These observations suggest the following decision rule: if the
p-value corresponding to $H_{02}$ is the smallest, an ESTAR model should be selected,
while in all other cases an LSTAR model should be the preferred choice.
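
The decision rule can be written down compactly once the three p-values are available. The sketch below only encodes the rule as stated above; the p-values themselves would come from the nested LM-type tests of $H_{03}$, $H_{02}$ and $H_{01}$, which are not computed here. The function name `choose_transition` is hypothetical, and treating ties as "all other cases" is an assumption.

```python
def choose_transition(p_h03, p_h02, p_h01):
    """Select ESTAR if the p-value of H02 is strictly the smallest, else LSTAR."""
    if p_h02 < min(p_h03, p_h01):
        return "ESTAR"
    return "LSTAR"   # all other cases, including ties (an assumption)
```

For example, `choose_transition(0.20, 0.01, 0.30)` selects the ESTAR model, whereas `choose_transition(0.01, 0.20, 0.30)` selects the LSTAR model.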

An alternative method proposed by Escribano and Jorda (1999) involves the use of
$LM_4$ as a test for general STAR-type nonlinearity. The proposed decision rule for
choosing between the LSTAR and ESTAR alternatives is based on the observation
that, assuming $\pi_{1,0} = \pi_{2,0}$ and $c = 0$ in (1.7), the properties of $\phi_1$ and $\phi_2$ given above
also apply to $\phi_3$ and $\phi_4$ in (1.20), respectively. Hence, they suggest testing the
hypotheses

$H_{0E}: \phi_2 = \phi_4 = 0$,
$H_{0L}: \phi_1 = \phi_3 = 0$,

in (1.20). The selection rule is to choose the LSTAR (ESTAR) model if the minimum
p-value is obtained for $H_{0L}$ ($H_{0E}$). Their simulation results indicate that when the
true data generating process (DGP) is an LSTAR model, the power of the $LM_3$ test
is in general higher than the power of the $LM_4$ test, while the reverse holds if the DGP is
an ESTAR model. This finding is intuitive, as the $p$ additional auxiliary regressors
$\tilde{x}_t y_{t-d}^4$ in (1.20) are redundant in the case of an LSTAR model, and the use of $p$ extra
degrees of freedom by the $LM_4$ statistic causes a loss in power. In the case of an ESTAR
model, however, these extra terms contain vital information which more than
compensates for the use of additional degrees of freedom. They also find that their
procedure for deciding between LSTAR and ESTAR models performs better than that
of Teräsvirta (1994). Recent increases in computational power have made the
decision rules discussed above less important. It is now possible
to estimate a number of STAR models with different transition functions and to
choose among them at the evaluation stage by using misspecification tests. Given the
results in Teräsvirta (1994) that the above mentioned procedure may not always select the
correct model, it seems that, rather than using these decision rules, one may
prefer to estimate several STAR models and choose the one that best describes the
data at hand by using the misspecification tests that will be discussed in section 1.6.

Effects of Heteroscedasticity on tests of STAR type nonlinearity

If there is neglected heteroscedasticity, it will have effects similar to residual
autocorrelation, in that it may lead to spurious rejection of the null hypothesis of
linearity. Wooldridge (1990, 1991) has developed specification tests which can be
used in the presence of heteroscedasticity of unknown form. Wooldridge's (1990,
1991) procedure can be applied in the present context to robustify the tests against
STAR-type nonlinearity, see also Granger and Teräsvirta (1993, pp. 69-70). As an
illustration, consider the $LM_3$ test discussed above. The heteroscedasticity-consistent
(HCC) variant of the $LM_3$ statistic based upon (1.15) can be computed as follows:

1. Regress $y_t$ on $x_t$ and obtain the residuals $\hat{u}_t$;

2. Regress the auxiliary regressors $\tilde{x}_t y_{t-d}^i$, $i = 1, 2, 3$, on $x_t$ and compute the residuals $\hat{e}_t$;

3. Weight the residuals $\hat{e}_t$ from the regression in step 2 with the residuals $\hat{u}_t$ obtained in step 1 and regress 1 on $\hat{u}_t \hat{e}_t$. The explained sum of squares from this regression is the LM-type statistic.
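
The heteroscedasticity-consistent procedure can be sketched in the same way. The code below is an illustration under stated assumptions, not the implementation used in this chapter: $x_t$ contains an intercept and $p$ lags, $z_t = y_{t-d}$ with $1 \le d \le p$, the regression of 1 on the weighted residuals has no intercept, and the explained sum of squares is computed as $T$ minus the residual sum of squares. The function name `lm3_hcc` is hypothetical.

```python
import numpy as np

def lm3_hcc(y, p, d):
    """Heteroscedasticity-consistent LM3 statistic; returns (statistic, df)."""
    y = np.asarray(y, dtype=float)
    target = y[p:]
    T = len(target)
    lags = np.column_stack([y[p - j:len(y) - j] for j in range(1, p + 1)])
    x = np.column_stack([np.ones(T), lags])
    z = lags[:, d - 1]

    # Step 1: residuals from regressing y_t on x_t.
    u = target - x @ np.linalg.lstsq(x, target, rcond=None)[0]

    # Step 2: residuals from regressing each auxiliary regressor on x_t.
    aux = np.column_stack([lags * z[:, None] ** i for i in (1, 2, 3)])
    e = aux - x @ np.linalg.lstsq(x, aux, rcond=None)[0]

    # Step 3: regress a vector of ones on u_t * e_t; the statistic is the
    # explained sum of squares, T - SSR.
    w = u[:, None] * e
    ones = np.ones(T)
    fitted = w @ np.linalg.lstsq(w, ones, rcond=None)[0]
    ssr = np.sum((ones - fitted) ** 2)
    return T - ssr, 3 * p
```

Under the null hypothesis the returned statistic would be compared with a $\chi^2$ distribution with $3p$ degrees of freedom.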

One issue raised by the simulation results in Lundbergh and Teräsvirta (1998) on
robustifying the linearity tests against heteroscedasticity of unknown form is that
in some cases the robustification removes most of the power of the linearity tests, so
that existing nonlinearity may not be detected. In order to better understand the
power and size properties of the LM-type tests, a simulation study is conducted. To see
how the two versions of the linearity tests behave when the true DGP is linear or
nonlinear in the conditional mean, data are generated from AR and LSTAR models with
and without GARCH effects in the conditional variances. The parameter
specifications for the different models and conditional variances are given in table (1.2), where
a missing value denotes that the corresponding parameter in the respective model is
equal to zero.

The number of replications in the simulation study is set to 2000. The length of
the generated time series is 100, 300, 500, and 1000 observations, after removing the
first 100 observations from the beginning of the series to eliminate the effects of the
initial values, which are set to zero. For each replication, two versions of the $LM_2$, $LM_3$ and $LM_4$
tests against STAR-type nonlinearity and the corresponding p-values are computed,
namely the standard least squares (LS) based version and the heteroscedasticity-consistent
version based on Wooldridge (1990, 1991).

To see how the two versions of the tests behave when nonlinearity is present in the
conditional mean, data are generated from LSTAR models with autoregressive lag orders
set equal to 1 and 2. For convenience, these DGPs are denoted by LSTAR(1) and
LSTAR(2). The conditional variances are generated to be either constant or to follow a
GARCH(1,1) process. The results from this experiment are given in table (1.3). One
clear result from the table is that as the sample size increases, the empirical power
increases substantially for all of the LM-type tests considered. The power
of the tests is higher when LSTAR(2) is the alternative model. Both
versions of the tests have higher power when there are GARCH effects. For moderate
sample sizes the LS versions of the tests have slightly higher power than the HCC
versions, but this difference disappears as the sample size increases. When there are both
nonlinearity and GARCH effects, the two versions have comparable power, with the LS
variants performing marginally better. This may be because the LS variants do not take
GARCH effects into consideration and may have some power against them, so that
they reject the null of linearity more often than the HCC variants. In other
words, the standard versions of the tests may spuriously suggest nonlinearity when there
is conditional heteroscedasticity. This is also evident from table (1.4),
which gives the empirical size of the tests. For most of the cases considered, the
empirical size of the LS variants is higher than that of the HCC variants and
sometimes exceeds the nominal size of the test. Thus in some cases, especially when
there are GARCH effects, the standard tests erroneously suggest nonlinearity. The results
from this simulation experiment indicate that both versions of the tests have good size
and power properties in terms of detecting STAR-type nonlinearity in the conditional
mean of a given time series, and that the HCC version has better size properties than the
LS version in the presence of heteroscedasticity of GARCH form.


Presence of outliers and their effects on nonlinearity tests

As might have been observed above, STAR models can be parameterized to generate
very asymmetric realizations, in the sense that the realizations resemble linear time
series with a few outliers. A relevant question in this context is how the LM-type tests
discussed above perform when the DGP is a linear model but the observations are
contaminated by occasional outliers. This question is studied by van Dijk, Franses
and Lucas (1999). Their findings show that in the presence of additive outliers these
tests tend to reject the correct null hypothesis too often, even asymptotically. As
a solution they suggest using outlier-robust estimation techniques. An additive
outlier can be viewed as an observation which is the genuine data point plus or
minus some value. This latter value can be nonzero because of a recording error or
because of a cause outside the intrinsic economic environment that generates the
time series data. For instance, in the case of stock market or exchange rate data, a
misinterpretation of sudden news flashes can cause stock returns or
exchange rate returns to take unexpectedly large absolute values. In this sense the
data point is aberrant. Formally, an additive outlier for the time series $y_t$ can be
defined by $y_t = x_t + \varphi I[t = \tau]$, $t = 1, \ldots, T$, where $I[t = \tau]$ is an indicator variable
taking a value of 1 when $t = \tau$ and a value of zero otherwise. The time series $x_t$ is the
uncontaminated but unobserved time series, while $y_t$ is the observed variable. The
size of the outlier is given by $\varphi$ and, in practice, the value of $\tau$ is unknown.

Robust estimators are developed to obtain better parameter estimates in the
presence of contamination by assigning less weight to influential observations such as
outliers, see for instance Huber (1981). For example, a robust estimator for the AR(p)
model $y_t = \beta' x_t + u_t$ can be obtained as the solution to the first order conditions

$$\sum_{t=1}^{T} w_r(r_t)\, x_t\, (y_t - \beta' x_t) = 0 \tag{1.21}$$

where $r_t$ denotes the standardized residual, $r_t \equiv (y_t - \beta' x_t)/(\sigma_u w_x(x_t))$, with $\sigma_u$ a
measure of scale of the residuals $u_t \equiv y_t - \beta' x_t$, and $w_x(\cdot)$ and $w_r(\cdot)$ are weight
functions that are bounded between 0 and 1. From (1.21) it can be seen that the
robust estimator is a type of weighted least squares estimator, with the weight for
the $t$-th observation given by the value of $w_r(\cdot)$. The functions $w_x(\cdot)$ and $w_r(\cdot)$ are
chosen such that the $t$-th observation receives a relatively small weight if either the
regressor $x_t$ or the standardized residual $r_t$ becomes unusually large. The weight
function $w_r(r_t)$ is usually specified in terms of a function $\psi(r_t)$ as $w_r(r_t) = \psi(r_t)/r_t$ for
$r_t \neq 0$ and $w_r(0) = 1$. Common choices for the $\psi(\cdot)$ function are the Huber and Tukey
bisquare functions. The Huber $\psi(\cdot)$ function is given by

$$
\psi(r_t) =
\begin{cases}
-\kappa & \text{if } r_t \le -\kappa, \\
r_t & \text{if } -\kappa < r_t \le \kappa, \\
\kappa & \text{if } r_t > \kappa,
\end{cases}
\tag{1.22}
$$

or w(r) = med(-n, rs, r), where med denotes the median and n > 0. The tuning
constant K. determines the robustness and efﬁciency of the resulting estimator. Since
robustness and efﬁciency properties of the estimator are decreasing and increasing
functions of It, the tuning constant should be chosen such that the two are balanced.
Usually n is taken to be 1.345 to produce an estimator that has an eﬂiciency of 95
percent compared to ordinary least squares,(OLS) estimator if ut is normally dis-
tributed. The weights implied by the Huber function have the attractive property
that w,(rt) = 1, if —n g rt < It. Only observations outside this region receive less
weight. A noted disadvantage of the Huber function is that weights decline to zero
very slowly, hence subjective judgement is required to decide whether a weight is

small or not. The Tukey’s bisquare function is given by

$$\psi(r_t) = \begin{cases} r_t\,(1 - (r_t/\kappa)^2)^2 & \text{if } |r_t| \le \kappa, \\ 0 & \text{if } |r_t| > \kappa. \end{cases} \qquad (1.23)$$

The tuning constant $\kappa$ again determines the robustness and efficiency of the resulting
estimator. Usually $\kappa$ is set equal to 4.685 to achieve 95 percent efficiency for normally
distributed $u_t$. In this function downweighting occurs for all nonzero values of $r_t$.
Unlike the Huber function, the resulting weights decline to zero quite rapidly.
Several other weighting functions have been proposed in the literature;
for a discussion of possible specifications for $\psi(\cdot)$ see van Dijk et al. (1999).
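The two weighting schemes can be compared numerically. The following is a minimal sketch in Python with NumPy, not code from the original study; the function names and the example residuals are hypothetical, and the tuning constants are the values 1.345 and 4.685 quoted above.

```python
import numpy as np

def psi_huber(r, kappa=1.345):
    """Huber psi function (1.22): identity on [-kappa, kappa], clipped outside."""
    return np.clip(r, -kappa, kappa)

def psi_bisquare(r, kappa=4.685):
    """Tukey bisquare psi function (1.23): equals zero for |r| > kappa."""
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= kappa, r * (1.0 - (r / kappa) ** 2) ** 2, 0.0)

def weights(psi, r):
    """Weights w_r(r) = psi(r) / r, with w_r(0) = 1 by convention."""
    r = np.asarray(r, dtype=float)
    safe_r = np.where(r == 0.0, 1.0, r)      # placeholder value avoids 0/0
    return np.where(r == 0.0, 1.0, psi(safe_r) / safe_r)

# Hypothetical standardized residuals: none, moderate, large, extreme
r = np.array([0.0, 1.0, 3.0, 6.0])
w_huber = weights(psi_huber, r)
w_bisq = weights(psi_bisquare, r)
print(np.round(w_huber, 3))
print(np.round(w_bisq, 3))
```

In this sketch the Huber weight at $r_t = 6$ is still positive (it equals $1.345/6$), whereas the bisquare weight is exactly zero, which illustrates the slow versus rapid decline of the weights described in the text.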

The weight function $w_x(x_t)$ for the regressor is usually specified as

$$w_x(x_t) = \psi(d(x_t)^{\alpha})/d(x_t)^{\alpha}, \qquad (1.24)$$

where $\psi(\cdot)$ is any appropriate function and $d(x_t)$ is the distance given by
$d(x_t) = |x_t - m_x|/\sigma_x$, with $m_x$ and $\sigma_x$ measures of location and scale of $x_t$, respectively.
These measures can be estimated robustly by the median $m_x = \mathrm{med}(x_t)$ and the median
absolute deviation (MAD) $\sigma_x = 1.483 \cdot \mathrm{med}|x_t - m_x|$. The constant 1.483 is used to
make the MAD a consistent estimator of the standard deviation when $x_t$ is
normally distributed. It is usual practice to set $\alpha = 2$ in order to obtain robust
standard errors.

Since the weights $w_r(\cdot)$ depend on the unknown parameters $\beta$ they need to be
determined endogenously. This in turn implies that the first order condition given in (1.21)
is nonlinear in $\beta$ and $\sigma_u$, and estimation of these parameters requires an iterative
procedure. Recognizing that $w_r(\cdot)$ is a function of $(\beta, \sigma_u)$, written $w_r(\beta, \sigma_u)$, and denoting
the estimates from the $n$th iteration by $\hat\beta^{(n)}$ and $\hat\sigma_u^{(n)}$ respectively, it follows from (1.21)
that $\hat\beta^{(n+1)}$ can be obtained as the weighted least squares estimate

$$\hat\beta^{(n+1)} = \frac{\sum_{t=1}^{T} w_r(\hat\beta^{(n)}, \hat\sigma_u^{(n)})\, x_t y_t}{\sum_{t=1}^{T} w_r(\hat\beta^{(n)}, \hat\sigma_u^{(n)})\, x_t^2},$$

where the estimate of $\sigma_u$ can be updated at each iteration using a robust estimate
of scale, such as the MAD given above.
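The iteration can be sketched as follows for a single regressor. This is an illustrative Python/NumPy sketch under simplifying assumptions that are not part of the original text: the regressor weight $w_x(x_t)$ is set to one, the Huber function is used for $w_r$, and the data, the outlier pattern and all function names are hypothetical.

```python
import numpy as np

def mad_scale(u):
    """Robust scale estimate: 1.483 times the median absolute deviation."""
    return 1.483 * np.median(np.abs(u - np.median(u)))

def huber_weight(r, kappa=1.345):
    """w_r(r) = psi(r)/r for the Huber psi; equals 1 inside [-kappa, kappa]."""
    a = np.abs(r)
    return np.where(a <= kappa, 1.0, kappa / np.maximum(a, kappa))

def robust_slope(x, y, n_iter=50, tol=1e-10):
    """Iteratively reweighted least squares for y_t = beta * x_t + u_t."""
    beta = np.sum(x * y) / np.sum(x * x)          # OLS starting value
    for _ in range(n_iter):
        u = y - beta * x
        sigma = mad_scale(u)                      # scale updated each iteration
        w = huber_weight(u / sigma)               # assumption: w_x(x_t) = 1
        beta_new = np.sum(w * x * y) / np.sum(w * x * x)
        if abs(beta_new - beta) < tol:
            return beta_new
        beta = beta_new
    return beta

# Hypothetical data: true slope 0.5, five large additive outliers
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + 0.1 * rng.normal(size=200)
idx = np.argsort(np.abs(x))[-5:]                  # contaminate high-leverage points
y[idx] += 20.0 * np.sign(x[idx])

beta_ols = np.sum(x * y) / np.sum(x * x)
beta_rob = robust_slope(x, y)
print(beta_ols, beta_rob)
```

With this contamination the OLS slope is pulled far above 0.5, while the reweighted estimate stays close to the true value because the outlying observations receive weights near zero.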

The above method gives robust estimators under the null hypothesis of linearity.
Robust estimation of STAR models has not been developed yet. The robust
estimation procedures allow one to construct test statistics that are robust to
outliers. As illustrated in van Dijk et al. (1999), outlier-robust variants of the LM-type
tests discussed above can be obtained as $TR^2$, using the $R^2$ from the regression of
the weighted residuals $\hat\psi(\hat r_t) = \hat w_r(\hat r_t)\hat r_t$ on the weighted regressors $\hat w_x(x_t) \odot \nu_t$, where
$\odot$ denotes element-by-element multiplication and $\nu_t$ is the vector that includes the
auxiliary regressors. For instance, in the case of the LM3 statistic,
$\nu_t = (x_t', \tilde x_t' z_t, \tilde x_t' z_t^2, \tilde x_t' z_t^3)'$. The
weights are obtained from the robust estimation of the AR(p) model under the null. The
F-versions of the tests can be computed as well. The simulation results in van Dijk
et al. (1999) suggest that the robustified LM-type tests have good size properties in
small samples, also in the presence of outliers. In the case of no outliers the power of
the tests is lower than that of their non-robust counterparts. The power of the standard
tests decreases drastically in the presence of outliers, while the power of the robustified
tests is hardly affected.

1.5 Estimation of STAR Models

If the linearity tests indicate the presence of STAR-type nonlinearity, then one needs
to determine the transition variable $z_t$ and the transition function $F(z_t, \gamma, c)$ as above.
The next step involves estimation of the relevant STAR model. The estimation of
the STAR model is carried out by nonlinear least squares (NLS). The parameter vector
$\pi = (\pi_1', \pi_2', \gamma, c)'$ can be estimated as

$$\hat\pi = \arg\min_{\pi} Q_T(\pi) = \arg\min_{\pi} \sum_{t=1}^{T} (y_t - S(x_t; \pi))^2, \qquad (1.25)$$

where $S(x_t; \pi)$ is the skeleton of the model, that is,

$$S(x_t; \pi) = \pi_1' x_t (1 - F(z_t, \gamma, c)) + \pi_2' x_t F(z_t, \gamma, c). \qquad (1.26)$$

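Under the normality assumption on the disturbances, NLS is equivalent to maximum
likelihood. Under certain regularity conditions, which are discussed in
Gallant (1987) and Pötscher and Prucha (1997) among others, the NLS estimates are
consistent and asymptotically normal. In other words, under certain conditions

$$\sqrt{T}(\hat\pi - \pi_0) \to N(0, \Sigma), \qquad (1.27)$$

where $\pi_0$ denotes the true parameter vector, and $\Sigma$ denotes the asymptotic covariance
matrix of the NLS estimates $\hat\pi$. $\Sigma$ can be estimated consistently by $\hat H_T^{-1} \hat J_T \hat H_T^{-1}$, where
$\hat H_T$ is the Hessian evaluated at $\hat\pi$, namely

$$\hat H_T = -\frac{1}{T} \sum_{t=1}^{T} \nabla^2 q_t(\hat\pi) = \frac{1}{T} \sum_{t=1}^{T} \left[ \nabla S(x_t; \hat\pi) \nabla S(x_t; \hat\pi)' - \nabla^2 S(x_t; \hat\pi)\, \hat u_t \right], \qquad (1.28)$$

with $q_t(\hat\pi) = (y_t - S(x_t; \hat\pi))^2$, $\nabla S(x_t; \hat\pi) = \partial S(x_t; \hat\pi)/\partial\pi$, and $\hat J_T$ is the outer product
of the gradient

$$\hat J_T = \frac{1}{T} \sum_{t=1}^{T} \nabla q_t(\hat\pi) \nabla q_t(\hat\pi)' = \frac{1}{T} \sum_{t=1}^{T} \hat u_t^2\, \nabla S(x_t; \hat\pi) \nabla S(x_t; \hat\pi)'. \qquad (1.29)$$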

The estimation can be performed by using any standard nonlinear optimization
procedure, see Hamilton (1994, sec. 5.7) for a brief survey. The following are the
important issues that deserve attention when carrying out the estimation procedure.

The use of good starting values helps the optimization procedure to work smoothly.
In order to get good starting values, note that for fixed values of the parameters
in the transition function, $\gamma$ and $c$, the STAR model is linear in the autoregressive
parameters $\pi_1$ and $\pi_2$. Thus, conditional upon $\gamma$ and $c$, estimates of $\pi = (\pi_1', \pi_2')'$ can
be obtained by ordinary least squares (OLS) as

$$\hat\pi(\gamma, c) = \left( \sum_{t=1}^{T} x_t(\gamma, c)\, x_t(\gamma, c)' \right)^{-1} \left( \sum_{t=1}^{T} x_t(\gamma, c)\, y_t \right), \qquad (1.30)$$

where $x_t(\gamma, c) = (x_t'(1 - F(z_t, \gamma, c)), x_t' F(z_t, \gamma, c))'$ and the notation $\hat\pi(\gamma, c)$
indicates that the estimate of $\pi$ is conditional upon $\gamma$ and $c$. The OLS residuals
and the corresponding variance can be computed as $\hat u_t = y_t - \hat\pi(\gamma, c)' x_t(\gamma, c)$ and
$\hat\sigma^2(\gamma, c) = T^{-1} \sum_{t=1}^{T} \hat u_t^2(\gamma, c)$. An appropriate method proposed in the literature (see
for instance Teräsvirta (1998)) for obtaining sensible starting values for the
nonlinear optimization algorithm involves a two-dimensional grid search over $\gamma$ and $c$,
selecting those parameter estimates which give the smallest estimate of the residual
variance $\hat\sigma^2(\gamma, c)$.
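The grid search over $\gamma$ and $c$ can be sketched as below. This is an illustrative Python/NumPy sketch rather than the procedure used for the results in this chapter. It assumes a logistic transition function, an LSTAR(1) data generating process with hypothetical parameter values, and a grid for $c$ taken from sample quantiles of the transition variable, which is a common practical choice but an assumption here.

```python
import numpy as np

def logistic_transition(z, gamma, c):
    """Assumed logistic form of the transition function F(z_t, gamma, c)."""
    return 1.0 / (1.0 + np.exp(-gamma * (z - c)))

def conditional_ols(y, X, z, gamma, c):
    """OLS of y on x_t(gamma, c) = (x_t'(1 - F), x_t'F)', as in (1.30)."""
    F = logistic_transition(z, gamma, c)[:, None]
    Xg = np.hstack([X * (1.0 - F), X * F])
    pi_hat, *_ = np.linalg.lstsq(Xg, y, rcond=None)
    resid = y - Xg @ pi_hat
    return pi_hat, np.mean(resid ** 2)            # residual variance sigma^2(gamma, c)

def grid_search(y, X, z, gammas, cs):
    """Return (sigma2, gamma, c, pi_hat) with the smallest residual variance."""
    best = None
    for g in gammas:
        for c in cs:
            pi_hat, s2 = conditional_ols(y, X, z, g, c)
            if best is None or s2 < best[0]:
                best = (s2, g, c, pi_hat)
    return best

# Hypothetical LSTAR(1) series with gamma = 5 and c = 0
rng = np.random.default_rng(1)
T, burn = 500, 100
y = np.zeros(T + burn + 1)
for t in range(1, T + burn + 1):
    F = logistic_transition(y[t - 1], 5.0, 0.0)
    y[t] = (-0.3 - 0.5 * y[t - 1]) * (1 - F) + (0.1 + 0.5 * y[t - 1]) * F + rng.normal()
y = y[burn:]                                      # drop the burn-in observations

z = y[:-1]                                        # transition variable y_{t-1}
X = np.column_stack([np.ones(T), z])              # x_t = (1, y_{t-1})'
gammas = np.linspace(0.5, 20.0, 40)
cs = np.quantile(z, np.linspace(0.1, 0.9, 17))
s2, g0, c0, pi0 = grid_search(y[1:], X, z, gammas, cs)
print(s2, g0, c0, pi0)
```

The pair `(g0, c0)` and the conditional estimates `pi0` would then serve as starting values for the full nonlinear optimization.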

Another method, suggested by Leybourne, Newbold and Vougas (1998) to simplify
the estimation problem, involves concentrating the sum of squares function. Since the
STAR model is linear in the autoregressive parameters for fixed values of $\gamma$ and $c$, the
sum of squares function $Q_T(\pi)$ can be concentrated with respect to $\pi_1$ and $\pi_2$ as

$$Q_T(\gamma, c) = \sum_{t=1}^{T} (y_t - \hat\pi(\gamma, c)' x_t(\gamma, c))^2. \qquad (1.31)$$

The estimate $\hat\pi(\gamma, c)$ is obtained by minimizing (1.31) over different values
of $\gamma$ and $c$, and the pair that gives the lowest residual variance is chosen as
the final estimate of $\gamma$ and $c$. This reduces the dimensionality of the NLS estimation problem
considerably, as the sum of squares function given in (1.31) is minimized with respect
to the two parameters $\gamma$ and $c$ only.
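A minimal sketch of the concentrated problem is given below in Python/NumPy. It is not the algorithm of Leybourne, Newbold and Vougas (1998); it only illustrates that the objective depends on $(\gamma, c)$ alone once the autoregressive parameters are profiled out by OLS. The logistic transition function, the simulated data and the simple successive grid refinement (used here in place of a proper numerical optimizer) are all assumptions made for illustration.

```python
import numpy as np

def concentrated_ssr(gamma, c, y, X, z):
    """Q_T(gamma, c) in (1.31): SSR after profiling out pi_1, pi_2 by OLS."""
    F = (1.0 / (1.0 + np.exp(-gamma * (z - c))))[:, None]   # assumed logistic F
    Xg = np.hstack([X * (1.0 - F), X * F])
    pi_hat, *_ = np.linalg.lstsq(Xg, y, rcond=None)
    return float(np.sum((y - Xg @ pi_hat) ** 2)), pi_hat

def zoom_search(y, X, z, g_box, c_box, n=9, rounds=5):
    """Evaluate Q_T on an n x n grid, shrink the box around the best point, repeat."""
    (g_lo, g_hi), (c_lo, c_hi) = g_box, c_box
    best = (np.inf, None, None, None)
    for _ in range(rounds):
        for g in np.linspace(g_lo, g_hi, n):
            for c in np.linspace(c_lo, c_hi, n):
                q, pi_hat = concentrated_ssr(g, c, y, X, z)
                if q < best[0]:
                    best = (q, g, c, pi_hat)
        # halve the box width around the current best point, staying in bounds
        g_half, c_half = (g_hi - g_lo) / 4.0, (c_hi - c_lo) / 4.0
        g_lo, g_hi = max(best[1] - g_half, g_box[0]), min(best[1] + g_half, g_box[1])
        c_lo, c_hi = max(best[2] - c_half, c_box[0]), min(best[2] + c_half, c_box[1])
    return best

# Hypothetical LSTAR(1) series with gamma = 5 and c = 0
rng = np.random.default_rng(2)
y = np.zeros(601)
for t in range(1, 601):
    F = 1.0 / (1.0 + np.exp(-5.0 * y[t - 1]))
    y[t] = (-0.3 - 0.5 * y[t - 1]) * (1 - F) + (0.1 + 0.5 * y[t - 1]) * F + rng.normal()
y = y[100:]                                       # 501 observations after burn-in
X = np.column_stack([np.ones(500), y[:-1]])

q_hat, g_hat, c_hat, pi_hat = zoom_search(y[1:], X, y[:-1], (0.5, 20.0), (-1.0, 1.0))
print(q_hat, g_hat, c_hat, pi_hat)
```

Only two parameters are searched over, whatever the autoregressive order, because `pi_hat` is recomputed by OLS at every candidate pair.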

One difficulty reported in the estimation of STAR models is obtaining a precise
estimate of the smoothness parameter $\gamma$. A reason why it is difficult to obtain a
precise estimate of $\gamma$ is that for large values of $\gamma$, the shape of the transition function
changes only slightly. Thus, in order to get an accurate estimate of $\gamma$ one needs many
observations in the immediate neighborhood of the threshold $c$. As this is not typically
the case, the estimate of $\gamma$ is usually imprecise and often insignificant when judged
by its t-statistic. Granger and Teräsvirta (1993) and Teräsvirta (1994) argue that
insignificance of the estimate of $\gamma$ should not be taken as evidence against the presence
of STAR-type nonlinearity. This should be assessed by means of different diagnostics,
some of which will be discussed in the next section.
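The flatness of the transition function in $\gamma$ can be checked directly. The short Python/NumPy sketch below assumes a logistic transition function with $c = 0$ and a hypothetical grid of values for the transition variable; it compares how much $F$ changes when $\gamma$ is doubled, both away from and close to the threshold.

```python
import numpy as np

def logistic_transition(z, gamma, c=0.0):
    """Assumed logistic form of the transition function."""
    return 1.0 / (1.0 + np.exp(-gamma * (z - c)))

z = np.linspace(-2.0, 2.0, 41)          # grid with spacing 0.1 around c = 0
far = np.abs(z) >= 0.5                  # points away from the threshold

def max_change(g_low, g_high, mask):
    """Largest change in F on the masked grid when gamma goes from g_low to g_high."""
    return np.max(np.abs(logistic_transition(z[mask], g_high)
                         - logistic_transition(z[mask], g_low)))

change_small_gamma = max_change(1.0, 2.0, far)     # doubling a small gamma
change_large_gamma = max_change(20.0, 40.0, far)   # doubling a large gamma
change_near_c = max_change(20.0, 40.0, ~far)       # same doubling, close to c
print(change_small_gamma, change_large_gamma, change_near_c)
```

In this sketch, doubling $\gamma$ from 20 to 40 leaves $F$ essentially unchanged except for the few grid points next to $c$, so only observations in that neighborhood carry information about $\gamma$.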

To better understand the finite sample properties of the NLS estimates, the
following simulation experiment is performed. Time series are generated from an
ESTAR model, with $\pi = 1, 0.8, 0.5$, $\pi^* = 0.9, 0.4, -0.5$, $\gamma = 1, 5, 15$, $c = 0, 0.5$ and
$u_t \sim$ i.i.d. $N(0, 1)$. The sample size is taken to be $T = 100$, 300, and 500 observations.
In each replication the first 100 observations are deleted in order to minimize the
initialization problem. The parameters of the STAR model, with the lag orders set at
their true values and the correct transition function and variable, are estimated by
NLS. Tables 1.5 through 1.10 show the mean parameter estimates, mean standard
errors, root mean squared errors, skewness and kurtosis. The simulation results are
based on 2000 replications. The findings of the simulation experiment indicate that
as the sample size grows from 100 to 500 the parameter estimates improve in terms of
having smaller biases, root mean squared errors and standard errors. For most of the
designs the estimates of the autoregressive and threshold parameters
are very precise, especially for sample sizes of 300 and 500. On the other hand, the
estimate of the smoothness parameter has relatively higher bias, root mean squared
error, skewness and kurtosis. Although the precision of the smoothness parameter estimate
increases with sample size, for small and large parameter specifications the estimates
are relatively less precise. The skewness and kurtosis values indicate that the
distribution of the parameter estimates is far from normal, especially for small sample
sizes. As the sample size increases, the estimated skewness and kurtosis statistics get closer
to values that are more in line with a normally distributed random variable. The
kurtosis for $\pi$ and $\gamma$ is mostly above 3, indicating that large estimates are obtained for
these parameters more often than one would expect for a normally distributed random
variable. On the other hand, kurtosis estimates for $\pi^*$ and $c$ are mostly piled up around
values less than 3. In all experimental designs, the parameter estimates have positive
skewness except in one of the designs, in which $\pi = 0.5$, $\pi^* = -0.5$, $\gamma = 5$, $c = 0$.
The nonzero skewness estimates reported in Tables 1.5-1.10 indicate that the distribution
of the parameter estimates is not symmetric around the mean parameter estimate and is
most often skewed in the positive direction. The general result from this experiment
is that NLS usually performs poorly for sample sizes of 100 (which corresponds to
the sample size available for many macroeconomic time series) and improves for
sample sizes higher than 300. In applications of STAR models with reasonable sample
sizes one needs to interpret inference based on asymptotic theory with caution.
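The data generating step of such an experiment can be sketched as follows. The exact form of the ESTAR model in (1.2) and (1.4) is not reproduced in this section, so the specification below (an AR(1) coefficient $\pi$ in the inner regime, $\pi^*$ in the outer regime, an exponential transition function in $y_{t-1}$, no intercepts) is an assumption made for illustration, as are the function name and the seed.

```python
import numpy as np

def simulate_estar(T, pi, pi_star, gamma, c, burn=100, seed=0):
    """Simulate y_t = pi*y_{t-1}*(1 - F) + pi_star*y_{t-1}*F + u_t, u_t ~ N(0, 1),
    with the assumed exponential transition F = 1 - exp(-gamma*(y_{t-1} - c)^2).
    The first `burn` observations are discarded to reduce start-up effects."""
    rng = np.random.default_rng(seed)
    y = np.zeros(T + burn)
    for t in range(1, T + burn):
        F = 1.0 - np.exp(-gamma * (y[t - 1] - c) ** 2)
        y[t] = pi * y[t - 1] * (1.0 - F) + pi_star * y[t - 1] * F + rng.normal()
    return y[burn:]

# One design from the experiment: pi = 0.8, pi* = 0.4, gamma = 5, c = 0, T = 300
y = simulate_estar(T=300, pi=0.8, pi_star=0.4, gamma=5.0, c=0.0)
print(y[:5])
```

A full Monte Carlo study would repeat this with a different seed in each of the 2000 replications and estimate the model by NLS on every simulated series.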

1.6 Diagnostic Checking of Estimated STAR Model

This section discusses some diagnostic tests which can be used to evaluate estimated
STAR models. In particular, diagnostic tests for residual autocorrelation, remaining
nonlinearity, and parameter constancy will be discussed as developed in Eitrheim
and Teräsvirta (1996), Lundbergh, Teräsvirta, and van Dijk (1999), and van Dijk
and Franses (1999).

1.6.1 Tests for serial autocorrelation

In order to facilitate the review consider the STAR model of order $p$,

$$y_t = S(x_t; \pi) + u_t, \qquad (1.32)$$

where $x_t = (1, \tilde x_t')'$, $\tilde x_t = (y_{t-1}, \ldots, y_{t-p})'$ as before, and $S(x_t; \pi)$, given in (1.26), is
the skeleton of the model. As shown in Eitrheim and Teräsvirta (1996), an LM
test for $k$-th order serial dependence in $u_t$ can be obtained as $TR^2$, where $R^2$ is the
coefficient of determination from the regression of $\hat u_t$ on $\partial S(x_t; \hat\pi)/\partial\pi$ and $k$ lagged
residuals $\hat u_{t-1}, \ldots, \hat u_{t-k}$. Hats indicate that the relevant quantities are estimates under
the null hypothesis of serial independence of $u_t$. The resulting test statistic, denoted
by $LM_S(k)$, is asymptotically $\chi^2$ distributed with $k$ degrees of freedom. As shown in
Eitrheim and Teräsvirta (1996), this test is a generalization of the LM test for serial
correlation in an AR(p) model of Breusch and Pagan (1979), which is based on the
auxiliary regression

$$\hat u_t = \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=1}^{k} \delta_i \hat u_{t-i} + v_t, \qquad (1.33)$$

where now $\hat u_t$ are the residuals from the AR(p) model. In a linear AR(p) model (without
an intercept) $S(x_t; \pi) = \sum_{i=1}^{p} \pi_i y_{t-i}$, and
$\partial S(x_t; \hat\pi)/\partial\pi = (y_{t-1}, \ldots, y_{t-p})'$. In the case of the STAR model, the skeleton is given by
$S(x_t; \pi) = \pi_1' x_t (1 - F(z_t, \gamma, c)) + \pi_2' x_t F(z_t, \gamma, c)$. Hence, in this case the parameter
vector is $\pi = (\pi_1', \pi_2', \gamma, c)'$ and the relevant partial derivatives $\partial S(x_t; \hat\pi)/\partial\pi$ can be
obtained in a straightforward manner; for details see Eitrheim and Teräsvirta (1996). The
nonlinear function $S(x_t; \pi)$ needs to be twice differentiable in order for the above testing
procedure to work.
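For the linear AR(p) special case, the $TR^2$ statistic from the auxiliary regression (1.33) can be sketched as follows. This Python/NumPy sketch makes several assumptions for illustration only: the model has no intercept, pre-sample residuals are set to zero, the uncentered $R^2$ is used, and the simulated series and function names are hypothetical. The STAR case would replace the AR regressors with the partial derivatives of the skeleton.

```python
import numpy as np

def lagmat(v, k):
    """Columns v_{t-1}, ..., v_{t-k}, with pre-sample values set to zero."""
    out = np.zeros((len(v), k))
    for j in range(1, k + 1):
        out[j:, j - 1] = v[:-j]
    return out

def lm_serial_ar(y, p, k):
    """T*R^2 from regressing AR(p) residuals on the AR regressors and k lagged residuals."""
    Y = y[p:]
    X = np.column_stack([y[p - i:len(y) - i] for i in range(1, p + 1)])
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    u = Y - X @ b                                 # residuals under the null
    Z = np.hstack([X, lagmat(u, k)])              # auxiliary regressors
    d, *_ = np.linalg.lstsq(Z, u, rcond=None)
    e = u - Z @ d
    r2 = 1.0 - (e @ e) / (u @ u)                  # uncentered R^2
    return len(u) * r2

def simulate_ar(phi, T, seed, burn=100):
    """Gaussian AR(len(phi)) series without intercept."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    y = np.zeros(T + burn)
    for t in range(p, T + burn):
        y[t] = sum(phi[i] * y[t - 1 - i] for i in range(p)) + rng.normal()
    return y[burn:]

# Correctly specified AR(1) versus an AR(1) fitted to AR(2) data
lm_ok = lm_serial_ar(simulate_ar([0.5], 1000, seed=0), p=1, k=2)
lm_mis = lm_serial_ar(simulate_ar([0.5, 0.3], 1000, seed=0), p=1, k=2)
print(lm_ok, lm_mis)
```

Each statistic would be compared with a $\chi^2$ critical value with $k = 2$ degrees of freedom (5.99 at the 5 percent level); the misspecified fit should produce a value far above it.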

1.6.2 Testing for remaining nonlinearity

It is important to assess whether the estimated nonlinear model adequately
captures the nonlinearity in the time series under investigation. An intuitive method
to examine this question is to apply a test for no remaining nonlinearity in the
estimated model(s). In the case of STAR models, one approach is to specify the alternative
hypothesis of remaining nonlinearity as the presence of an additional regime. This
approach is suggested by Eitrheim and Teräsvirta (1996). For instance, one can test
the null hypothesis that a two-regime model is adequate against the alternative that a
third regime is necessary. Eitrheim and Teräsvirta (1996) develop an LM statistic to
test a two-regime STAR model against the alternative of an additive three-regime model,
which can be written as

$$y_t = \pi_1' x_t + (\pi_2 - \pi_1)' x_t F_1(z_{1t}, \gamma_1, c_1) + (\pi_3 - \pi_2)' x_t F_2(z_{2t}, \gamma_2, c_2) + u_t, \qquad (1.34)$$

where $F_1(\cdot)$ and $F_2(\cdot)$ are the transition functions given either in (1.3) or (1.4) and
where $c_1 < c_2$ is also assumed. The null hypothesis of a two-regime STAR model
can be expressed as either $H_0: \gamma_2 = 0$ or $H_0: \pi_3 = \pi_2$. This testing problem suffers
from an identification problem similar to that of testing the null hypothesis
of linearity against the alternative of a two-regime STAR model discussed in section
1.4. The proposed solution is the same, namely approximating the transition function
$F_2(z_{2t}, \gamma_2, c_2)$ around $\gamma_2 = 0$. In the case of a third order approximation, it is shown
in Eitrheim and Teräsvirta (1996) that the resulting auxiliary model is

$$y_t = \phi_0' x_t + (\pi_2 - \pi_1)' x_t F_1(z_{1t}, \gamma_1, c_1) + \phi_1' \tilde x_t z_{2t} + \phi_2' \tilde x_t z_{2t}^2 + \phi_3' \tilde x_t z_{2t}^3 + \eta_t, \qquad (1.35)$$

where the parameters $\phi_i$, $i = 0, 1, 2, 3$, are functions of the parameters $\pi_1$, $\pi_2$, $\gamma_2$ and
$c_2$. The null hypothesis $H_0: \gamma_2 = 0$ in (1.34) translates into $H_0': \phi_1 = \phi_2 = \phi_3 = 0$
in (1.35). The test statistic is computed as $TR^2$ from the auxiliary regression of
the residuals $\hat u_t$, obtained from estimating the model under the null hypothesis, on the
partial derivatives of the regression function with respect to the parameters in the
two-regime model, $\pi_1$, $\pi_2$, $\gamma_1$ and $c_1$, evaluated under the null hypothesis, and the auxiliary
regressors $\tilde x_t z_{2t}^i$, $i = 1, 2, 3$. The resulting test statistic is shown in Eitrheim and
Teräsvirta (1996) to have an asymptotic $\chi^2$ distribution with $3p$ degrees of freedom.
The statistic is denoted by $LM_{AMR,3}$, where the subscript AMR is used to indicate
that this statistic is designed as a test against an additive multiple regime model.
van Dijk and Franses (1999) derived an LM-type statistic for testing the null of
a two-regime STAR model against the alternative of a four-regime STAR model by
using the same procedure as above. The null hypothesis is the two-regime STAR
model given in (1.2) and the alternative is now given by the following multiple regime
STAR model developed in van Dijk and Franses (1999):

$$y_t = [\pi_1' x_t (1 - F_1(z_{1t}, \gamma_1, c_1)) + \pi_2' x_t F_1(z_{1t}, \gamma_1, c_1)][1 - F_2(z_{2t}, \gamma_2, c_2)]$$
$$\quad + [\pi_3' x_t (1 - F_1(z_{1t}, \gamma_1, c_1)) + \pi_4' x_t F_1(z_{1t}, \gamma_1, c_1)] F_2(z_{2t}, \gamma_2, c_2) + u_t. \qquad (1.36)$$

In this model the relationship between $y_t$ and its lagged values is given by a linear
combination of four linear AR models, each associated with a particular combination
of $F_1(z_{1t})$ and $F_2(z_{2t})$ being equal to 0 or 1. This model is called the Multiple Regime
STAR (MRSTAR) model and is discussed in detail in van Dijk and Franses (1999).
The test statistic developed in van Dijk and Franses (1999) involves replacing
the second transition function $F_2(z_{2t}, \gamma_2, c_2)$ by a third order Taylor approximation, which
renders the auxiliary regression

$$y_t = \phi_0' x_t + (\pi_2 - \pi_1)' x_t F_1(z_{1t}, \gamma_1, c_1) + \phi_1' \tilde x_t z_{2t} + \phi_2' \tilde x_t z_{2t}^2 + \phi_3' \tilde x_t z_{2t}^3$$
$$\quad + \phi_4' \tilde x_t F_1(z_{1t}, \gamma_1, c_1) z_{2t} + \phi_5' \tilde x_t F_1(z_{1t}, \gamma_1, c_1) z_{2t}^2 + \phi_6' \tilde x_t F_1(z_{1t}, \gamma_1, c_1) z_{2t}^3 + \eta_t. \qquad (1.37)$$

The null hypothesis can again be stated as $H_0: \gamma_2 = 0$ in (1.36). It becomes
$H_0': \phi_j = 0$, $j = 1, \ldots, 6$, in (1.37), which can be tested in exactly the same way as above.
The resulting test statistic, denoted by $LM_{EMR,3}$, is asymptotically $\chi^2$ distributed with
$6(p + 1)$ degrees of freedom, where the subscript EMR indicates that the statistic is
designed as a test against an 'encapsulated' multiple regime model.

1.6.3 Testing parameter constancy

In order to assess parameter stability in the estimated model, LM-type tests
are developed in Lundbergh, Teräsvirta and van Dijk (1999). For this purpose they
consider the MRSTAR model given in (1.36) with the second transition function $F_2$
being a function of time $t$ rather than $z_{2t}$. In other words, replacing the transition
variable in the second transition function with $t$ gives rise to the so-called Time-Varying
STAR (TVSTAR) model, which allows for both nonlinear dynamics of the STAR type
and time-varying parameters. With this replacement the model in (1.36) becomes

$$y_t = [\pi_1' x_t (1 - F_1(z_t, \gamma_1, c_1)) + \pi_2' x_t F_1(z_t, \gamma_1, c_1)][1 - F_2(t, \gamma_2, c_2)]$$
$$\quad + [\pi_3' x_t (1 - F_1(z_t, \gamma_1, c_1)) + \pi_4' x_t F_1(z_t, \gamma_1, c_1)] F_2(t, \gamma_2, c_2) + u_t. \qquad (1.38)$$

This model is discussed in detail in Lundbergh, Teräsvirta and van Dijk (1999). The
relevance of this model here is that by testing the hypothesis $H_0: \gamma_2 = 0$, one tests
for parameter constancy in the two-regime STAR model (1.2) against the alternative
of smoothly changing parameters. The appropriate LM-type test statistic, based on
a relevant, say $j$th-order, Taylor approximation of $F_2(t, \gamma_2, c_2)$, is denoted by $LM_{C,j}$
and is similar to the $LM_{EMR,j}$ statistic with $z_{2t} = t$. They also note that the asymptotic
theory works even if the transition variable is a non-stationary deterministic
trend; see also Lin and Teräsvirta (1994).

1.7 Impulse response function analysis of estimated STAR model

Since parameter estimates generally do not provide much information about
the dynamics of the estimated STAR model, one needs to utilize alternative tools
in order to characterize the dynamic behavior of the series under study. Impulse
response functions (IRFs) are convenient tools for evaluating the properties of
the estimated model, as they allow one to examine the effects of shocks $u_t$ on the future
evolution of the time series under investigation and hence provide a measure of the
response of $y_{t+k}$ to an impulse $\iota$ at time $t$.

In the case of linear models IRFs are defined as the difference between two
realizations of $y_{t+k}$ which start from identical histories of the time series up to time $t-1$,
denoted as $\omega_{t-1}$. In one realization, the process is hit by a shock of size $\iota$ at time
$t$, while in the other realization no shock occurs at time $t$. All shocks occurring in
the intermediate periods are set equal to zero in both realizations. This IRF is named
the traditional IRF by van Dijk and Teräsvirta (2000) and is given by

$$TI_y(k, \iota, \omega_{t-1}) = E[y_{t+k} \mid u_t = \iota, u_{t+1} = \cdots = u_{t+k} = 0, \omega_{t-1}]$$
$$\quad - E[y_{t+k} \mid u_t = 0, u_{t+1} = \cdots = u_{t+k} = 0, \omega_{t-1}] \qquad (1.39)$$

for $k = 0, 1, 2, \ldots$, where $E$ denotes the expectation operator. The second conditional
expectation in (1.39) is usually called the benchmark profile of the series. The IRF
given in (1.39) has certain properties whenever the time series $y_t$ follows a linear
model. First of all it is symmetric, in that a shock of size $-\iota$ has an effect that is
exactly opposite to that of a shock of size $+\iota$. Moreover, it is linear in the sense that
the IRF is proportional to the size of the shock. Lastly, it is history independent, as its
shape does not depend on the particular history $\omega_{t-1}$. These properties of the traditional
IRF can be easily observed by considering an AR(1) model. In the AR(1)
model $y_t = \beta_0 + \beta_1 y_{t-1} + u_t$, since
$y_{t+k} = \text{const} + \beta_1^k y_t + u_{t+k} + \beta_1 u_{t+k-1} + \cdots + \beta_1^{k-1} u_{t+1}$,
one can easily show that $TI_y(k, \iota, \omega_{t-1}) = \beta_1^k \iota$ for $k = 0, 1, 2, \ldots$. From this equation it is
straightforward to verify the properties mentioned above. As discussed in Koop et al. (1996) and
Pesaran and Potter (1997), in general these somewhat simple properties do not hold
when the time series follows a nonlinear model, for example a STAR model. It is
shown that the impact of a shock depends not only on the history of the process but
also on the sign and size of the shock. Furthermore, as shown in Pesaran and Potter
(1997), when one wants to analyze the effect of a shock on the time series $k > 1$
periods ahead, the assumption that no shocks occur in the intermediate periods may
give misleading inference concerning the propagation mechanism of the model. The
assumption of no shocks in the intermediate periods is justified for linear models
by the existence of the Wold representation of a linear time series,

$$y_t = \sum_{j=0}^{\infty} \psi_j u_{t-j}, \qquad (1.40)$$

which shows that shocks in different periods do not interact. For nonlinear time
series, however, there is no Wold representation. Nonlinear time series can be
represented in terms of past and present shocks by means of the Volterra expansion,

$$y_t = \sum_{j=0}^{\infty} \psi_j u_{t-j} + \sum_{j=0}^{\infty} \sum_{i=j}^{\infty} \zeta_{ji} u_{t-j} u_{t-i} + \sum_{j=0}^{\infty} \sum_{i=j}^{\infty} \sum_{h=i}^{\infty} \zeta_{jih} u_{t-j} u_{t-i} u_{t-h} + \cdots, \qquad (1.41)$$

as given in Granger and Teräsvirta (1993). From this representation of a nonlinear
model it is clear that the effect of the shock $u_t$ on $y_{t+k}$ depends on the shocks
$u_{t+1}, \ldots, u_{t+k}$, as well as on the history of the shocks, $u_{t-1}, u_{t-2}, \ldots$. In order to deal
with these problems Koop et al. (1996) developed the so-called Generalized Impulse
Response Function (GIRF). The GIRF for a specific shock $u_t = \iota$ is defined as

$$GI_y(k, \iota, \omega_{t-1}) = E[y_{t+k} \mid u_t = \iota, \omega_{t-1}] - E[y_{t+k} \mid \omega_{t-1}] \qquad (1.42)$$

for $k = 1, 2, \ldots$. Note that the expectations of $y_{t+k}$ are conditioned only on the history
and/or on the shock. In other words, the problem of dealing with shocks occurring
in the intermediate periods is dealt with by averaging them out. That also explains
why the benchmark profile is the expectation of $y_{t+k}$ given only the history of the
process $\omega_{t-1}$. Therefore, in the benchmark profile the current shock is averaged out
as well. This GIRF reduces to the traditional IRF when the model is linear.
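The three properties of the traditional IRF in the linear case can be verified numerically. The Python/NumPy sketch below iterates an AR(1) model forward with and without a shock at time $t$; the coefficient values $\beta_0 = 0.2$ and $\beta_1 = 0.7$ and the function names are hypothetical choices for illustration.

```python
import numpy as np

def ar1_path(y_prev, shocks, b0, b1):
    """Iterate y_t = b0 + b1*y_{t-1} + u_t over the given shock sequence."""
    path, y = [], y_prev
    for u in shocks:
        y = b0 + b1 * y + u
        path.append(y)
    return np.array(path)

def traditional_irf(y_prev, iota, horizon, b0=0.2, b1=0.7):
    """Shocked path (u_t = iota) minus benchmark path (u_t = 0), with all
    later shocks set to zero, for k = 0, ..., horizon."""
    shocked = ar1_path(y_prev, [iota] + [0.0] * horizon, b0, b1)
    baseline = ar1_path(y_prev, [0.0] * (horizon + 1), b0, b1)
    return shocked - baseline

k = np.arange(6)
irf_a = traditional_irf(y_prev=0.0, iota=1.0, horizon=5)
irf_b = traditional_irf(y_prev=3.0, iota=1.0, horizon=5)      # different history
irf_neg = traditional_irf(y_prev=0.0, iota=-1.0, horizon=5)   # opposite sign
irf_big = traditional_irf(y_prev=0.0, iota=2.0, horizon=5)    # doubled size
print(irf_a)
```

In this sketch `irf_a` equals $0.7^k$, `irf_b` coincides with `irf_a` (history independence), `irf_neg` is its mirror image (symmetry), and `irf_big` is twice `irf_a` (linearity in the size of the shock).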

Koop et al. (1996) emphasize that the GIRF given in (1.42) is indeed a random
variable. The GIRF is a function of $\iota$ and $\omega_{t-1}$, which are realizations of the random
variables $u_t$ and the information set $\Omega_{t-1}$. In this framework, the GIRF given in (1.42)
can be written in a more general form as

$$GI_y(k, u_t, \Omega_{t-1}) = E[y_{t+k} \mid u_t, \Omega_{t-1}] - E[y_{t+k} \mid \Omega_{t-1}]. \qquad (1.43)$$

The reformulation in (1.43) is flexible and useful for certain purposes, as it allows
one to consider a number of conditional versions of the GIRF. For
example, one might consider only a particular history $\omega_{t-1}$ and treat the GIRF as a random
variable in terms of $u_t$ only, that is,

$$GI_y(k, u_t, \omega_{t-1}) = E[y_{t+k} \mid u_t, \omega_{t-1}] - E[y_{t+k} \mid \omega_{t-1}]. \qquad (1.44)$$

It is also possible to reverse the roles of the shock and the history by fixing the shock at
$u_t = \iota$ and defining the GIRF as a random variable with respect to the history $\Omega_{t-1}$.
Koop et al. (1996) show that in general it is possible to compute GIRFs conditional
on any particular subsets $A$ and $B$ of shocks and histories respectively.

The GIRFs can be utilized in several ways in analyzing the dynamic properties of
the estimated model. They can be used to analyze the persistence of shocks. A shock
$u_t = \iota$ is called transient at history $\omega_{t-1}$ if $GI_y(k, \iota, \omega_{t-1})$ becomes equal to zero as
$k \to \infty$. If, on the other hand, the GIRF approaches a nonzero finite value when $k \to \infty$, then
the shock is said to be persistent. Intuitively, if a time series process
is stationary and ergodic, the effects of all shocks eventually converge to zero for all
possible histories of the process. Hence the distribution of $GI_y(k, \iota, \omega_{t-1})$ collapses to
a spike at 0 as $k \to \infty$. In contrast, for non-stationary time series the dispersion of
the distribution of $GI_y(k, \iota, \omega_{t-1})$ is positive for all $k$. Koop et al. (1996) suggest that
the dispersion of the distribution of $GI_y(k, \iota, \omega_{t-1})$ at finite horizons can conveniently
be used to obtain information about the persistence of shocks. For instance, one can
compare densities of GIRFs conditional on positive and negative shocks to find out
whether there is a difference in persistence between negative and positive shocks.

GIRFs can also be used to assess the significance of asymmetric effects over time.
Potter (1994) defines a measure of asymmetric response to a particular shock $u_t = \iota$,
given a particular history $\omega_{t-1}$, as the sum of the GIRF for this particular shock and
the GIRF for the shock of the same magnitude but with opposite sign, that is,

$$ASY_y(k, \iota, \omega_{t-1}) = GI_y(k, \iota, \omega_{t-1}) + GI_y(k, -\iota, \omega_{t-1}). \qquad (1.45)$$

An alternative measure of asymmetry can be obtained by considering the distribution
of the random asymmetry measures given above for each history and averaging across
all possible histories to obtain

$$ASY_y(k, \iota) = E[GI_y(k, \iota, \omega_{t-1})] + E[GI_y(k, -\iota, \omega_{t-1})]$$
$$\quad = E[y_{t+k} \mid u_t = \iota] + E[y_{t+k} \mid u_t = -\iota] - 2E[y_{t+k}]. \qquad (1.46)$$

One problem in computing the GIRFs is that analytic expressions for the
conditional expectations are not available for $k > 1$. Therefore they need to be estimated.
Koop et al. (1996) discuss in detail simulation methods to estimate GIRFs. In
particular, Monte Carlo or bootstrap methods are suggested for the computation of GIRFs.
For details see Koop et al. (1996).
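A simulation estimate of the GIRF in (1.42) and of the asymmetry measure in (1.45) can be sketched as follows. This Python/NumPy sketch follows the general Monte Carlo idea of averaging over future shocks, but it is not a reproduction of the algorithm in Koop et al. (1996). The LSTAR(1) parameter values, the Gaussian shocks, the number of replications and the function names are assumptions made for illustration, and the sketch also reports the horizon $k = 0$.

```python
import numpy as np

def star_step(y_prev, u, gamma=5.0, c=0.0):
    """One step of an illustrative LSTAR(1) model (hypothetical parameters)."""
    F = 1.0 / (1.0 + np.exp(-gamma * (y_prev - c)))
    return (-0.3 - 0.5 * y_prev) * (1.0 - F) + (0.1 + 0.5 * y_prev) * F + u

def girf(y_hist, iota, horizon, step=star_step, reps=5000, seed=0):
    """Monte Carlo estimate of GI_y(k, iota, omega_{t-1}) for k = 0, ..., horizon,
    where the history is summarized by y_{t-1} = y_hist."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(size=(reps, horizon + 1))   # column 0 plays the role of u_t
    y_shock = np.full(reps, float(y_hist))
    y_base = np.full(reps, float(y_hist))
    out = np.empty(horizon + 1)
    for k in range(horizon + 1):
        # shocked path: u_t fixed at iota, later shocks random
        u_shock = np.full(reps, float(iota)) if k == 0 else shocks[:, k]
        y_shock = step(y_shock, u_shock)
        # benchmark path: all shocks random, so u_t is averaged out as well
        y_base = step(y_base, shocks[:, k])
        out[k] = y_shock.mean() - y_base.mean()
    return out

gi_pos = girf(y_hist=-1.0, iota=1.0, horizon=8)
gi_neg = girf(y_hist=-1.0, iota=-1.0, horizon=8)
asy = gi_pos + gi_neg                               # asymmetry measure as in (1.45)
print(np.round(gi_pos, 3))
print(np.round(asy, 3))
```

Using the same random draws for the shocked and benchmark paths is a design choice that reduces simulation noise in the difference. For a linear model the same routine reproduces the traditional IRF up to simulation error, which gives a simple check of the code.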

1.8 Conclusion

This chapter reviewed STAR models with reference to specification, estimation
and inference. Both ESTAR and LSTAR models are discussed extensively. Issues
pertaining to testing for the presence of STAR-type nonlinearity, specification of
autoregressive orders, estimation, diagnostic checking and inference procedures are discussed in
some detail. The simulation experiments indicate that standard information
criteria, such as AIC or BIC, may not always select the correct autoregressive order within
STAR models; hence they need to be used cautiously. Both the standard and the
heteroscedasticity consistent versions of the STAR-type nonlinearity tests have comparable
power in detecting STAR-type nonlinearity. The performance of NLS
in finite samples is analyzed by an extensive Monte Carlo experiment. The
findings of the experiment indicate that NLS performs poorly for sample sizes of 100 but
improves for sample sizes higher than 300.

BIBLIOGRAPHY

[1] Anderson, H.M. (1997), Transactions costs and non-linear adjustment towards
equilibrium in the US treasury bill market, Oxford Bulletin of Economics and
Statistics 59, 465-484.

[2] Berben, R.-P. and D. van Dijk (1999), Unit root tests and asymmetric adjustment,
Econometric Institute Report 9902, Erasmus University Rotterdam.

[3] Breusch, T.S. and A.R. Pagan (1979), A simple test for heteroscedasticity and
random coefficient variation, Econometrica 47, 1287-94.

[4] Caner, M. and B.E. Hansen (2001), Threshold autoregression with a unit root,
Econometrica 69, 1555-1596.

[5] Chan, K.S., J.D. Petruccelli, H. Tong, and S.W. Woolford (1985), A multiple
threshold AR(1) model, Journal of Applied Probability 22, 267-279.

[6] Dumas, B. (1992), Dynamic equilibrium and the real exchange rate in a spatially
separated world, Review of Financial Studies 5, 153-180.

[7] Enders, W. and C.W.J. Granger (1998), Unit root tests and asymmetric
adjustment with an example using the term structure of interest rates, Journal of
Business and Economic Statistics 16, 304-311.

[8] Eitrheim, Ø. and T. Teräsvirta (1996), Testing the adequacy of smooth transition
autoregressive models, Journal of Econometrics 74, 59-76.

[9] Gallant, A.R. (1987), Nonlinear Statistical Models, New York: John Wiley.

[10] Granger, C.W.J. and T. Teräsvirta (1993), Modelling Nonlinear Economic
Relationships, Oxford: Oxford University Press.

[11] Hansen, B.E. (1996), Inference when a nuisance parameter is not identified under
the null hypothesis, Econometrica 64, 413-30.

[12] Huber, P.J., Robust Statistics, New York: John Wiley.

[13] Jansen, D.W. and T. Teräsvirta (1996), Testing parameter constancy and super
exogeneity in econometric equations, Oxford Bulletin of Economics and Statistics
58, 735-768.

[14] Koop, G., M.H. Pesaran and S.M. Potter (1996), Impulse response analysis in
nonlinear multivariate models, Journal of Econometrics 74, 119-147.

[15] Lin, C.-F.J. and T. Teräsvirta (1994), Testing the constancy of regression
parameters against continuous structural change, Journal of Econometrics 62, 211-228.

[16] Leybourne, S., P. Newbold, and D. Vougas (1998), Unit roots and smooth
transitions, Journal of Time Series Analysis 19, 83-97.

[17] Lundbergh, S. and T. Teräsvirta (1998), Modelling economic high-frequency time
series with STAR-GARCH models, Working Papers in Economics and Finance
291, Stockholm School of Economics.

[18] Lundbergh, S., T. Teräsvirta and D. van Dijk (1999), Time-varying smooth
transition autoregressive models, Stockholm School of Economics, unpublished
manuscript.

[19] Luukkonen, R., P. Saikkonen and T. Teräsvirta (1988), Testing linearity against
smooth transition autoregressive models, Biometrika 75, 491-9.

[20] Michael, P., A.R. Nobay and D.A. Peel (1997), Transaction costs and nonlinear
adjustment in real exchange rates: an empirical investigation, Journal of Political
Economy 105, 862-879.

[21] Pesaran, M.H. and S.M. Potter (1997), A floor and ceiling model of US output,
Journal of Economic Dynamics and Control 21, 661-695.

[22] Potter, S.M. (1994), Asymmetric economic propagation mechanisms, in W.
Semmler (ed.), Business Cycles: Theory and Empirical Methods, Boston: Kluwer,
pp. 527-560.

[23] Pötscher, B.M. and I.R. Prucha (1997), Dynamic Nonlinear Econometric Models:
Asymptotic Theory, Berlin: Springer-Verlag.

[24] Taylor, M.P., D.A. Peel, and L. Sarno (2001), Non-linear mean reversion in real
exchange rates: towards a solution of the purchasing power parity puzzles, Working
Paper, Centre for Economic Policy Research, London, UK.

[25] Teräsvirta, T. (1994), Specification, estimation and evaluation of smooth
transition autoregressive models, Journal of the American Statistical Association 89,
208-218.

[26] Teräsvirta, T. (1998), Modelling economic relationships with smooth transition
regressions, in A. Ullah and D.E.A. Giles (editors), Handbook of Applied
Economic Statistics, New York: Marcel Dekker, pp. 507-552.

[27] Teräsvirta, T., D. Tjøstheim and C.W.J. Granger (1994), Aspects of modeling
nonlinear time series, in R.F. Engle and D.L. McFadden (editors), Handbook of
Econometrics, vol. IV, Amsterdam: Elsevier Science.

[28] Teräsvirta, T. and H.M. Anderson (1992), Characterizing nonlinearities in
business cycles using smooth transition autoregressive models, Journal of Applied
Econometrics 7, S119-S136.

[29] Tong, H. (1990), Non-linear Time Series: a Dynamical Systems Approach,
Oxford: Oxford University Press.

[30] van Dijk, D., T. Teräsvirta and P.H. Franses (2000), Smooth transition
autoregressive models - a survey of recent developments, SSE/EFI Working Paper Series
in Economics and Finance No. 380, Stockholm School of Economics.

[31] van Dijk, D., P.H. Franses and A. Lucas (1999), Testing for smooth transition
nonlinearity in the presence of additive outliers, Journal of Business and
Economic Statistics 17, 217-235.

[32] van Dijk, D. and P.H. Franses (1999), Modeling multiple regimes in the business
cycle, Macroeconomic Dynamics 3, 311-40.

[33] Wooldridge, J.M. (1990), A unified approach to robust, regression-based
specification tests, Econometric Theory 6, 17-43.

[34] Wooldridge, J.M. (1991), On the application of robust, regression-based
specification tests, Journal of Econometrics 47, 5-46.

Table 1.1: Lag selection frequencies in AR(p) model

AR order        AIC              BIC              HQC              LB
    p      T=250  T=500     T=250  T=500     T=250  T=500     T=250  T=500
    1       734    728       984    993       906    938       875    870
    2       120    114        13      6        67     43         8      6
    3        62     68         2      1        12     15        10     14
    4        35     37         1      0        11      3        15     14
    5        25     28         0      0         2      0        15     21
    6        24     25         0      0         2                77     75

Frequencies of lag length selection in AR(p) models on series generated from the ESTAR model
(1.2) and (1.4), with $\pi_{1,0} = \pi_{2,0} = 0$, $\pi_{1,1} = 0.6$, $\pi_{2,1} = 0.3$, $c = 0.5$, $u_t \sim$ i.i.d. $N(0, 1)$.

Table 1.2: Parameter specifications for the generated DGPs. All of the DGPs are
generated with $c = 0$ and $\gamma = 5$.

                                        Conditional mean equation
DGP                     $\pi_{1,0}$  $\pi_{2,0}$  $\pi_{1,1}$  $\pi_{2,1}$  $\pi_{1,2}$  $\pi_{2,2}$
LSTAR(1)                   -0.3         0.1        -0.5         0.5
LSTAR(1)-GARCH(1,1)        -0.3         0.1        -0.5         0.5
LSTAR(2)                   -0.3         0.1        -0.5         0.3         0.5        -0.3
LSTAR(2)-GARCH(1,1)        -0.3         0.1        -0.5         0.3         0.5        -0.3
AR(1)                       0.5                     0.8
AR(1)-GARCH(1,1)            0.5                     0.8
AR(2)                       0.5                     0.8                    -0.4
AR(2)-GARCH(1,1)            0.5                     0.8                    -0.4

Conditional variance (columns $\omega$, $\alpha$; row alignment lost in extraction):
1   1   0.3   0.3   0.3   0.3   5   0.6   0.6   0.6   0.6

Table 1.3: Empirical power of the linearity tests.

                         LS                      HCC
DGP                LM2    LM3    LM4      LM2    LM3    LM4
Sample size: T=100
STAR(1)            0.26   0.23   0.20     0.19   0.15   0.12
STAR(1)-G(1,1)     0.22   0.20   0.19     0.16   0.14   0.10
STAR(2)            0.56   0.50   0.45     0.39   0.33   0.25
STAR(2)-G(1,1)     0.62   0.57   0.53     0.44   0.39   0.31
Sample size: T=300
STAR(1)            0.65   0.62   0.57     0.61   0.57   0.50
STAR(1)-G(1,1)     0.62   0.59   0.55     0.57   0.52   0.46
STAR(2)            0.98   0.99   0.99     0.96   0.98   0.94
STAR(2)-G(1,1)     1.00   0.99   0.99     0.98   0.98   0.97
Sample size: T=500
STAR(1)            0.88   0.87   0.83     0.87   0.84   0.79
STAR(1)-G(1,1)     0.86   0.83   0.80     0.83   0.79   0.75
STAR(2)            0.99   1.00   0.99     0.98   0.99   0.97
STAR(2)-G(1,1)     1.00   1.00   1.00     1.00   1.00   1.00
Sample size: T=1000
STAR(1)            0.99   0.99   0.99     1.00   0.99   0.99
STAR(1)-G(1,1)     1.00   1.00   1.00     1.00   1.00   0.99
STAR(2)            1.00   1.00   1.00     1.00   1.00   1.00
STAR(2)-G(1,1)     1.00   1.00   1.00     1.00   1.00   1.00

Note: LS stands for the standard least squares based versions of the LM-type tests;
HCC refers to the Wooldridge versions of the tests, which are consistent under
heteroscedasticity of unknown form. The empirical powers are computed at the 5%
significance level. The transition variable used in the linearity tests is $y_{t-1}$.

Table 1.4: Empirical size of the linearity test.

                         LS                      HCC
DGP                LM2    LM3    LM4      LM2    LM3    LM4
Sample size: T=300
AR(1)              .044   .042   .043     .037   .037   .028
AR(1)-G(1,1)       .043   .036   .039     .044   .029   .028
AR(2)              .048   .034   .034     .039   .033   .022
AR(2)-G(1,1)       .048   .048   .044     .037   .029   .021
Sample size: T=500
AR(1)              .049   .043   .037     .048   .039   .030
AR(1)-G(1,1)       .052   .041   .037     .051   .036   .028
AR(2)              .051   .045   .053     .050   .040   .042
AR(2)-G(1,1)       .053   .045   .045     .040   .028   .025
Sample size: T=1000
AR(1)              .045   .040   .046     .048   .044   .041
AR(1)-G(1,1)       .052   .044   .044     .049   .048   .037
AR(2)              .056   .053   .050     .055   .048   .045
AR(2)-G(1,1)       .057   .056   .055     .050   .037   .035

Note: Each cell represents the proportion of rejections of the true null hypothesis of
linearity at the 5% significance level. LS columns give the standard least squares based tests
and HCC columns give the Wooldridge type heteroscedasticity consistent versions of the
tests. The transition variable used in the linearity tests is $y_{t-1}$.

Table 1.5: Simulation Results on the finite sample performance of NLSE of STAR models

Parm.   Mean Est.   Mean S.E.   RMSE     Bias     Skewness   Kurtosis
T=100
π1       1.043       1.143      1.717    0.430     1.370      5.360
π1*      0.850       0.322      0.172   -0.051     1.010      1.071
γ        4.605       1.981      6.651    3.605     2.144      5.329
T=300
π1       0.964       0.522      0.795   -0.036     1.376      3.274
π1*      0.885       0.188      0.092   -0.015     1.014      1.038
γ        3.657       1.900      5.123    2.657     2.313      5.091
T=500
π1       1.008       0.425      0.631    0.008     1.308      2.705
π1*      0.888       0.164      0.078   -0.012     1.008      1.020
γ        3.100       1.785      5.100    2.100     2.270      4.950

Key: Mean, RMSE, bias, skewness and kurtosis of the NLS estimates of the parameters
in the ESTAR model, with $\pi_1 = 1$, $\pi_1^* = 0.9$, $\gamma = 1$, $c = 0$ and $u_t \sim$ i.i.d. $N(0, 1)$. The table
is based on 2000 replications.

Table 1.6: Simulation Results on the finite sample performance of NLSE of STAR models

Parm.   Mean Est.   Mean S.E.   RMSE     Bias     Skewness   Kurtosis
T=100
π1       1.081       1.000      1.704    0.081     1.611      6.406
π1*      0.852       0.269      0.166   -0.048     1.005      1.044
γ        4.383       2.050      5.441   -0.617     2.138      5.246
T=300
π1       1.021       0.500      0.790    0.021     1.116      3.023
π1*      0.878       0.161      0.115   -0.022     1.006      1.022
γ        4.830       1.800      4.900   -0.170     2.036      4.850
T=500
π1       0.994       0.406      0.590   -0.006     1.039      2.636
π1*      0.883       0.160      0.106   -0.017     1.008      1.015
γ        4.885       1.650      4.225   -0.115     2.016      4.550

Mean, RMSE, bias, skewness and kurtosis of the NLS estimates of the parameters in
the ESTAR model, with $\pi_1 = 1$, $\pi_1^* = 0.9$, $\gamma = 5$, $c = 0$ and $u_t \sim$ i.i.d. $N(0, 1)$. The table is
based on 2000 replications.

Table 1.7: Simulation Results on the finite sample performance of NLSE of STAR models

Parm.   Mean Est.   Mean S.E.   RMSE      Bias      Skewness   Kurtosis
T=100
π1       1.028       1.053       1.611     0.028     1.740      7.025
π1*      0.846       0.396       0.180    -0.054     1.015      1.057
γ        4.086       2.294      12.065   -10.914     2.258      5.958
T=300
π1       1.007       0.673       1.090     0.007     1.060      3.994
π1*      0.883       0.146       0.118    -0.017     1.009      1.022
γ        8.874       2.078       9.900    -6.126     2.006      4.395
T=500
π1       1.005       0.465       0.790     0.005     1.004      3.676
π1*      0.885       0.108       0.106    -0.015     1.008      1.011
γ       10.389       2.005       7.151    -4.611     2.120      4.255

Mean, RMSE, bias, skewness and kurtosis of the NLS estimates of the parameters in
the ESTAR model, with $\pi_1 = 1$, $\pi_1^* = 0.9$, $\gamma = 15$, $c = 0$ and $u_t \sim$ i.i.d. $N(0, 1)$. The table
is based on 2000 replications.

Table 1.8: Simulation Results on the finite sample performance of NLSE of STAR models

Parm.   Mean Est.   Mean S.E.   RMSE     Bias     Skewness   Kurtosis
T=100
π1       0.960       0.439      1.527   -0.040     3.433      9.159
π1*      0.815       0.219      0.213   -0.085     1.022      1.073
γ        6.948       1.365      8.767    5.948     1.694      3.224
c        0.203       0.366      2.357   -0.297     0.142      2.524
T=300
π1       0.934       0.242      0.807   -0.066     1.851      4.186
π1*      0.878       0.210      0.157   -0.022     0.996      1.031
γ        5.875       1.335      7.168    4.875     1.249      2.933
c        0.440       0.326      2.119   -0.060     0.368      2.023
T=500
π1       0.975       0.171      0.545   -0.025     1.984      4.019
π1*      0.887       0.071      0.067   -0.013     0.771      1.055
γ        4.099       1.206      6.951   -3.099     1.118      3.349
c        0.515       0.289      1.847   -0.015     0.040      2.689

Mean, RMSE, bias, skewness and kurtosis of the NLS estimates of the parameters in
the ESTAR model, with $\pi_1 = 1$, $\pi_1^* = 0.9$, $\gamma = 1$, $c = 0.5$ and $u_t \sim$ i.i.d. $N(0, 1)$. The table
is based on 2000 replications.

Table 1.9: Simulation Results on the finite sample performance of NLSE of STAR models

Parm.   Mean Est.   Mean S.E.   RMSE      Bias     Skewness   Kurtosis
T=100
π1       0.913       1.769       3.229    0.113     1.931      9.199
π1*      0.379       0.657       0.278   -0.021     1.056      4.117
γ        7.811       2.343      11.077    2.811     3.029     10.697
T=300
π1       0.901       0.866       1.944    0.101     2.004      5.082
π1*      0.393       0.442       0.202   -0.007     1.071      3.419
γ        6.611       2.176       7.783    1.611     2.902      6.179
T=500
π1       0.881       0.822       1.299    0.081     1.638      4.236
π1*      0.395       0.330       0.133   -0.013     1.016      1.686
γ        5.991       2.110       6.817    0.991     2.771      5.353

Mean, RMSE, bias, skewness and kurtosis of the NLS estimates of the parameters in
the ESTAR model, with $\pi_1 = 0.8$, $\pi_1^* = 0.4$, $\gamma = 5$, $c = 0$ and $u_t \sim$ i.i.d. $N(0, 1)$. The table
is based on 2000 replications.

Table 1.10: Simulation Results on the finite sample performance of NLSE of STAR models

Parm.   Mean Est.   Mean S.E.   RMSE      Bias     Skewness   Kurtosis
T=100
π1       0.853       1.778       3.190    0.353     2.177     12.876
π1*     -0.480       0.441       0.223   -0.020    -0.803      2.323
γ        8.293       2.065      12.933    3.293     3.432     14.816
T=300
π1       0.724       1.121       1.800    0.224     2.094      6.876
π1*     -0.507       0.225       0.167   -0.007    -1.049      2.146
γ        6.684       1.175       8.286    1.684     3.174      7.559
T=500
π1       0.625       0.976       1.447    0.125     2.674      4.190
π1*     -0.504       0.215       0.112   -0.004    -0.509      1.726
γ        6.097       1.634       7.064    1.097     2.578      6.532

Mean, RMSE, bias, skewness and kurtosis of the NLS estimates of the parameters in
the ESTAR model, with $\pi_1 = 0.5$, $\pi_1^* = -0.5$, $\gamma = 5$, $c = 0$ and $u_t \sim$ i.i.d. $N(0, 1)$. The
table is based on 2000 replications.

 

 

Figure 1.1: Examples of the exponential and logistic functions for values of γ = 3, 5, and
25 and threshold parameter c = 0.

[Figure: two panels, a. Exponential and b. Logistic; vertical axis from 0.0 to 1.0.]

Figure 1.2: Sample realizations from the STAR models with $\pi_{1,1} = -0.3$, $\pi_{1,2} = 0.7$, $c = 0$
and $u_t \sim NID(0, 1)$

[Figure: six time series panels, each plotted over observations 0 to 250.]
(a) $\pi_{1,0} = -0.5$, $\pi_{2,0} = 0.5$
(b) $\pi_{1,0} = 0.5$, $\pi_{2,0} = -0.5$
(c) $\pi_{1,0} = -1.5$, $\pi_{2,0} = 1.5$
(d) $\pi_{1,0} = 1.5$, $\pi_{2,0} = -1.5$
(e) $\pi_{1,0} = \pi_{2,0} = 0$, $\pi_{1,1} = 1$, $\pi_{2,1} = -0.3$
(f) $\pi_{1,0} = \pi_{2,0} = 0$, $\pi_{1,1} = 1$, $\pi_{2,1} = -0.3$

Notes: The figures in 2a-2e are sample realizations from the ESTAR model with the given
parameter specifications, while the figure in 2f is a sample realization from the LSTAR model with
the quadratic logistic function given in (1.5), with the same parameter specification as in 2e,
except that the thresholds are specified to be $c_1 = 0$, $c_2 = 0.5$.

CHAPTER 2

Review of long memory models for
conditional mean and variance

2.1 Introduction: Deﬁnition and sources of long
memory in economic time series

This chapter briefly discusses the properties of long memory processes, with
particular attention given to fractionally integrated processes. Surveys of long memory
processes, their statistical properties and applications in economics, finance and some
other fields can be found in Baillie (1996) and Beran (1994).

Traditionally, long memory has been defined in the time domain in terms of decay
rates of long-lag autocorrelations, or in the frequency domain in terms of rates
of explosion of low-frequency spectra. A process with the long-lag autocorrelation
function given by

$$\gamma_k \sim c_\gamma k^{2d-1} \quad \text{as } k \to \infty \qquad (2.1)$$

is called a long memory process. The definition in (2.1) implies the following condition,

$$\lim_{T \to \infty} \sum_{j=-T}^{T} |\rho_j| = \infty. \qquad (2.2)$$

That is, for a discrete time series, the autocorrelation function $\rho_j$ is not absolutely
summable. See, for instance, McLeod and Hipel (1978).
In the spectral domain, a long memory process is defined in terms of the behavior of
the spectral density at low frequencies. A process is called long memory if the spectral
density satisfies $f_y(\omega) \sim c_f \omega^{-2d}$ as $\omega \to 0^+$. A more general definition in the
frequency domain, provided by Heyde and Yang (1997), is simply $f(\omega) \to \infty$ as $\omega \to 0^+$. Note
that the constants $c_\gamma$ and $c_f$ can be replaced by so-called slowly varying functions,
i.e., functions $L$ such that for any $t > 0$, $L(ty)/L(y) \to 1$ as $y \to \infty$ or $y \to 0$. Since
knowing the covariances (or correlations and variance) is equivalent to knowing the
spectral density, the long-lag autocorrelation definition in the time domain and the
low-frequency spectral definition are equivalent under the conditions given, for example,
in Beran (1994, pp. 42-44).

A third definition of long memory involves the rate of growth of variances of partial
sums,

$$S_T = \sum_{t=1}^{T} y_t.$$

A process is said to be a long memory process if $\operatorname{var}(S_T) = O(T^{2d+1})$ for $d > 0$.
In other words, a process is a long memory process if the variances
of its partial sums grow at a rate of order $T^{2d+1}$. There is a connection between the
variance-of-partial-sum definition of long memory and the spectral definition of long
memory (and hence also the autocorrelation definition of long memory). In particular,
because the spectral density at frequency zero is proportional to the limit of $\frac{1}{T}\operatorname{var}(S_T)$, a process has long
memory in the generalized spectral sense of Heyde and Yang if and only if it has
long memory for some $d > 0$ in the variance-of-partial-sum sense. Therefore, the
variance-of-partial-sum definition of long memory is quite general.
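As a rough numerical illustration of the variance-of-partial-sum definition, the sketch below takes fractional white noise (the ARFIMA(0, d, 0) process discussed in Section 2.2.1) as an assumed example. It builds the autocorrelations from the recursion $\rho_k = \rho_{k-1}(k-1+d)/(k-d)$, which follows from the gamma-function form of the autocorrelations of that process, and estimates the log-log growth rate of $\operatorname{var}(S_T)$. The function names and the values of d and T are illustrative assumptions, not part of the original study.

```python
import math


def fwn_autocorrelations(d, max_lag):
    """Autocorrelations rho_0..rho_max_lag of fractional white noise (d < 0.5)."""
    rho = [1.0]
    for k in range(1, max_lag + 1):
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    return rho


def partial_sum_variance(d, T):
    """var(S_T) in units of gamma_0: sum over |k| < T of (T - |k|) * rho_k."""
    rho = fwn_autocorrelations(d, T - 1)
    return T * rho[0] + 2.0 * sum((T - k) * rho[k] for k in range(1, T))


def growth_exponent(d, T):
    """Log-log slope of var(S_T) between sample sizes T and 2T."""
    ratio = partial_sum_variance(d, 2 * T) / partial_sum_variance(d, T)
    return math.log(ratio) / math.log(2.0)


if __name__ == "__main__":
    d = 0.3
    # The estimated slope should be close to 2d + 1 for long memory (d > 0).
    print(growth_exponent(d, 2000), 2 * d + 1)
```

For d = 0 the recursion returns zero autocorrelations at all nonzero lags, so the slope reduces to the familiar linear growth of the variance of a sum of uncorrelated terms.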

It should be emphasized that these definitions are asymptotic in the sense that
they characterize the ultimate behavior of the correlations and of the variance of partial
sums as lags and/or sample size approach infinity. In general they do not specify
the correlations and/or the variance of the partial sums for any fixed finite lag and/or for
any fixed finite sample size. In particular, neither the correlation definition nor the spectral
density definition determines the absolute size of the correlations. In other
words, each individual correlation can be arbitrarily small while the decay rate of
correlations is slow.

There is a natural desire to understand the nature of various mechanisms that
could generate long memory. Most econometric attention has focused on the role
of aggregation. Granger (1980) considered the aggregation of $i = 1, \ldots, N$
cross-sectional components, $y_{i,t} = \alpha_i y_{i,t-1} + \epsilon_{i,t}$, where $\epsilon_{i,t}$ is white noise, and it is also
assumed that for $i \neq j$, $\epsilon_{i,t}$ is independent of $\epsilon_{j,t}$ and $\alpha_i$ is also independent of $\epsilon_{j,t}$
for all $i, j, t$. As $N \to \infty$, it is shown in Granger (1980) that the spectrum of the
aggregated process, $y_t = \sum_{i=1}^{N} y_{i,t}$, is approximately given by

$$f_y(\omega) \approx \frac{N}{2\pi} E[\operatorname{var}(\epsilon_{i,t})] \int \frac{1}{|1 - \alpha \exp(i\omega)|^2} \, dF(\alpha),$$

where $F(\alpha) = \int_0^{\alpha} \frac{2 t^{2p-1}(1-t^2)^{b-1}}{B(p,b)} \, dt$ is the cumulative distribution function governing the $\alpha_i$'s.
Here, $B(p, b) = \int_0^1 \alpha^{p-1}(1-\alpha)^{b-1} \, d\alpha = \frac{\Gamma(p)\Gamma(b)}{\Gamma(p+b)}$ is the beta function, and $p, b > 0$.
Upon assuming that the $\alpha_i$'s are distributed as a Beta distribution with parameters $(p, b)$,

$$dF(\alpha) = \frac{2}{B(p,b)} \alpha^{2p-1}(1-\alpha^2)^{b-1} \, d\alpha, \quad 0 \leq \alpha \leq 1,$$

the kth autocovariance of $y_t$ is

$$\gamma_y(k) = \frac{2}{B(p,b)} \int_0^1 \alpha^{2p+k-1}(1-\alpha^2)^{b-2} \, d\alpha = C k^{1-b}.$$

Thus Granger (1980) shows that the aggregated series $y_t$ is a long memory process,
in the sense that it is integrated of order $(1 - b/2)$.
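The hyperbolic decay of the aggregated autocovariance can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the substitution $t = \alpha^2$, which turns the integral above into $\frac{1}{2}B(p + k/2, b - 1)$ and requires $b > 1$, and it normalizes the innovation variance to one. The function names and parameter values are hypothetical.

```python
import math


def log_beta(x, y):
    """Logarithm of the beta function B(x, y) for x, y > 0."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def aggregated_autocov(k, p, b):
    """(2 / B(p, b)) * integral of a^(2p+k-1) (1 - a^2)^(b-2) over [0, 1].

    With t = a^2 the integral equals B(p + k/2, b - 1) / 2, so b > 1 is needed.
    """
    if b <= 1.0:
        raise ValueError("this closed form requires b > 1")
    return math.exp(log_beta(p + 0.5 * k, b - 1.0) - log_beta(p, b))


def decay_exponent(k, p, b):
    """Log-log slope of the autocovariance between lags k and 2k."""
    ratio = aggregated_autocov(2 * k, p, b) / aggregated_autocov(k, p, b)
    return math.log(ratio) / math.log(2.0)


if __name__ == "__main__":
    p, b = 1.0, 1.5
    # The slope should approach 1 - b, which matches 2d - 1 with d = 1 - b/2.
    print(decay_exponent(2000, p, b), 1.0 - b)
```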

Recently, Lippi and Zaffaroni (1999) generalized Granger’s result by replacing
Granger’s assumed beta distribution with weaker semi-parametric assumptions and
obtained similar results. Chambers (1998) considers temporal aggregation in addition
to cross sectional aggregation in both discrete and continuous time as the source of
long memory.

An alternative source of long memory, which also involves aggregation, has been
studied by Cioczek-Georges and Mandelbrot (1995), Taqqu, Willinger and Sherman
(1997), and Parke (1999). This source of long memory involves the distribution of
the duration between consecutive events. In particular, the idea is based on the
modelling of aggregate traffic in computer networks. For illustration, consider the stationary
continuous time binary series $S(t)$, $t \geq 0$, such that $S(t) = 1$ during "on" periods and
$S(t) = 0$ during "off" periods. The lengths of the on and off periods are assumed to
be independently and identically distributed (i.i.d.) at all leads and lags. It is also
assumed that on and off periods alternate. Under these assumptions, consider $M$
sources, $S_m(t)$, $t \geq 0$, $m = 1, \ldots, M$, and define the aggregate count in the interval
$[0, tT]$ by

$$S_M(tT) = \int_0^{tT} \left( \sum_{m=1}^{M} S_m(v) \right) dv.$$

Let $F_1(y)$ denote the c.d.f. of durations of on periods and $F_2(y)$ the c.d.f. of
durations of off periods, and further assume the following for the tails of the distributions
of on and off durations,

$$1 - F_1(y) \sim c_1 y^{-\alpha_1} L_1(y), \quad \text{with } 1 < \alpha_1 < 2,$$

$$1 - F_2(y) \sim c_2 y^{-\alpha_2} L_2(y), \quad \text{with } 1 < \alpha_2 < 2.$$

Thus the power-law tails imply infinite variance for the on and off durations. By
letting first $M \to \infty$ and then $T \to \infty$, Cioczek-Georges and Mandelbrot (1995)
and Taqqu et al. (1997) show that $S_M(tT)$, after being appropriately standardized,
converges to a fractional Brownian motion. The regular Brownian motion, $B(r)$,
is a continuous time stochastic process whose increments are independent and Gaussian
distributed. The fractional Brownian motion, $B_d(r)$, is regarded as the approximate
$(-d)$ fractional derivative of regular Brownian motion, $B_d(r) = \frac{1}{\Gamma(d+1)} \int_0^r (r-y)^d \, dB(y)$.
See Beran (1994) for details. Hence, the aggregate count in the interval $[0, tT]$ is a
long memory process.

Parke (1999) considers a closely related discrete-time error duration model. In
particular, he assumes that the aggregate process $y_t$ is generated by the following
sum, $y_t = \sum_{s=-\infty}^{t} g_{s,t} u_s$, where $u_s \sim$ i.i.d. $(0, \sigma^2)$ and $g_{s,t} = 1(t \leq s + n_s)$, where
$1(\cdot)$ is the indicator function and $n_s$ is the stochastic duration between consecutive
errors. Assuming a probability law for the distribution of $n_s$ that implies infinite
variance for the durations, similar to the above, leads $y_t$ to have long memory.

An alternative route that may lead to long memory, explored by Diebold and
Inoue (2001), involves structural change or stochastic regime switching. They show
how some simple stochastic regime switching models may produce realizations that
appear to have long memory under conditions that ensure that, as sample size increases,
the realizations tend to have just a few breaks. For illustration purposes consider the
following mixture model,

$$y_t = \mu_t + \epsilon_t$$

$$\mu_t = \mu_{t-1} + v_t$$

$$v_t = \begin{cases} 0 & \text{w.p. } 1 - p \\ w_t & \text{w.p. } p \end{cases}$$

where $w_t \sim$ i.i.d. $N(0, \sigma_w^2)$ and $\epsilon_t \sim$ i.i.d. $N(0, \sigma_\epsilon^2)$. They show that under the assumption
that $p = O(T^{2d-2})$, $0 < d < 1$, $y_t$ will be an $I(d)$ (integrated of order d) process.
Diebold and Inoue (2001) show that several other stochastic models can, under certain
conditions (mostly assumptions that dictate how certain parameters, such as mixture
probabilities, vary with T), generate realizations with long memory. Their
theoretical results indicate that regime switching (structural change) and long memory are
easily confused when only a small number of regime switches or breaks occur. Guided
by their theoretical results, they conduct an extensive Monte Carlo analysis to verify
how, in finite samples, fixed-parameter stochastic regime switching models whose
dynamics are either $I(0)$ or $I(1)$ can produce realizations that have long memory
dynamics. Diebold and Inoue (2001) conjecture that threshold autoregressive (TAR) and
smooth transition autoregressive (STAR) models may have realizations with long
memory once one allows thresholds to change appropriately with sample size.
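The mixture model above is simple to simulate. The following is a minimal sketch, assuming a break probability of the form $p = c\,T^{2d-2}$ with a proportionality constant `c` that is a hypothetical tuning choice (the text only states the order of p), and assuming standard library Gaussian draws. It returns the simulated series together with the break probability and the number of breaks, so that one can see how few mean shifts occur in a given sample.

```python
import random


def simulate_mean_plus_noise(T, d, sigma_w=1.0, sigma_eps=1.0, c=1.0, seed=0):
    """Simulate y_t = mu_t + eps_t, where mu_t shifts with probability p.

    p = min(1, c * T**(2d - 2)); the constant c is an illustrative assumption.
    """
    rng = random.Random(seed)
    p = min(1.0, c * T ** (2.0 * d - 2.0))
    mu, breaks, y = 0.0, 0, []
    for _ in range(T):
        if rng.random() < p:           # v_t = w_t with probability p
            mu += rng.gauss(0.0, sigma_w)
            breaks += 1
        y.append(mu + rng.gauss(0.0, sigma_eps))
    return y, p, breaks


if __name__ == "__main__":
    series, p, breaks = simulate_mean_plus_noise(T=1000, d=0.4, c=50.0, seed=1)
    print(len(series), p, breaks)
```

Because p shrinks with T, larger samples contain only a handful of breaks, which is the situation in which the text notes that regime switching and long memory are easily confused.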

2.2 Long Memory Models

This section discusses parametric models that are capable of capturing long
memory phenomena in both the conditional mean and the conditional variance of a
univariate series. In particular, the fractionally integrated autoregressive moving
average (ARFIMA) model, developed by Granger and Joyeux (1980), Granger (1980),
and Hosking (1981) for the conditional mean of a time series, and the fractionally
integrated generalized autoregressive conditional heteroscedastic (FIGARCH) model due to Baillie
et al. (1996) will be reviewed in terms of representation, specification, estimation,
and inference.

2.2.1 The ARFIMA Model

Integrated autoregressive moving average (ARIMA) models were introduced by Box
and Jenkins (1970). The theory of statistical inference for ARIMA models is
well developed; see, for instance, Brockwell and Davis (1997) and Hamilton (1994).
ARFIMA models are natural extensions of ARIMA models. Therefore, let us
first recall the definition of ARMA and ARIMA processes. To simplify the notation,
assume that $E(y_t) = \mu = 0$. Otherwise, $y_t$ needs to be replaced by $y_t - \mu$ in the
following formulas. First define the polynomials

$$\phi(x) = 1 - \sum_{j=1}^{p} \phi_j x^j,$$

$$\theta(x) = 1 + \sum_{j=1}^{q} \theta_j x^j,$$

where $p$ and $q$ are integers. Assuming that all the solutions of the polynomial equations
$\phi(x) = 0$ and $\theta(x) = 0$ are outside the unit circle, an ARMA(p, q) model is defined
to be the stationary solution of

$$\phi(L) y_t = \theta(L) u_t, \qquad (2.3)$$

where $L$ is the lag operator, and the disturbances $u_t$ are usually assumed to have zero
mean, $E(u_t) = 0$, and finite variance, $E(u_t^2) = \sigma_u^2$, and to be serially uncorrelated,
$E(u_t u_s) = 0$ for $t \neq s$. If equation (2.3) holds true for the dth difference $(1 - L)^d y_t$,
then $y_t$ is called an ARIMA(p, d, q) process, with the corresponding equation now
given by

$$\phi(L)(1 - L)^d y_t = \theta(L) u_t. \qquad (2.4)$$

Note that the ARMA(p, q) model is encompassed by the ARIMA(p, d, q) model in the
sense that the ARMA(p, q) model is obtained from the ARIMA(p, d, q) model by letting
$d = 0$. If $d \geq 1$, then the original series $y_t$ is not stationary, and hence, to obtain
a stationary process, $y_t$ needs to be differenced d times. Generalization of (2.4) to
non-integer values of d gives the ARFIMA(p, d, q) model. Note that if d is an integer
($d \geq 0$), then $(1 - L)^d$ can be written as

$$(1 - L)^d = \sum_{k=0}^{d} \binom{d}{k} (-1)^k L^k,$$

with the binomial coefficients

$$\binom{d}{k} = \frac{d!}{k!(d-k)!} = \frac{\Gamma(d+1)}{\Gamma(k+1)\Gamma(d-k+1)},$$

where $\Gamma(\cdot)$ denotes the gamma function, defined by

$$\Gamma(s) = \int_0^{\infty} \exp(-x) x^{s-1} \, dx.$$

Since the gamma function is defined for real (not only integer) arguments, the binomial coefficients
can be extended to all real numbers d. For any real number d, $(1 - L)^d$ is defined by

$$(1 - L)^d = \sum_{k=0}^{\infty} \binom{d}{k} (-1)^k L^k = 1 - dL - \frac{d(1-d)}{2!} L^2 - \cdots \qquad (2.5)$$

$$= F(-d, 1, 1; L) = \sum_{k=0}^{\infty} \frac{\Gamma(k-d)}{\Gamma(k+1)\Gamma(-d)} L^k,$$

where $F$ stands for the hypergeometric function, which is defined formally by

$$F(m, n; s; x) = \frac{\Gamma(s)}{\Gamma(m)\Gamma(n)} \sum_{j=0}^{\infty} \frac{\Gamma(m+j)\Gamma(n+j)}{\Gamma(s+j)\Gamma(j+1)} x^j.$$

For positive integer d only the first d + 1 terms are nonzero, and hence, for positive
integer d, (2.5) is the usual dth difference operator, while for non-integer d the
summation in (2.5) is genuinely over an infinite number of indices. Given (2.5),
Granger and Joyeux (1980) and Hosking (1981) proposed the following definition for
the ARFIMA model:

Definition 2.1 Let $y_t$ be a stationary process such that

$$\phi(L)(1 - L)^d y_t = \theta(L) u_t \qquad (2.6)$$

for some $-\frac{1}{2} < d < \frac{1}{2}$. Then $y_t$ is called an ARFIMA(p, d, q) process.

The range that makes the ARFIMA(p, d, q) process in (2.6) long memory is
$0 < d < \frac{1}{2}$. The upper bound $d < \frac{1}{2}$ makes the process covariance stationary.
For $d \geq \frac{1}{2}$ the ARFIMA(p, d, q) process is not covariance stationary. In particular,
the usual definition of the spectral density of $y_t$ would lead to a non-integrable
function. Whenever d falls in $[\frac{1}{2}, 1)$, the process is considered to be covariance
non-stationary. Moreover, the ARFIMA(p, d, q) process given in (2.6) is invertible
for values of $d > -\frac{1}{2}$ and has an infinite order autoregressive representation. For
the range $-\frac{1}{2} < d < \frac{1}{2}$ the ARFIMA(p, d, q) process is invertible and stationary
and can be represented as either an infinite order autoregressive or an infinite order
moving average process. These representations for the general ARFIMA(p, d, q) are
given in Sowell (1992). They are complicated functions of the hypergeometric function.
For $p = q = 0$ the ARFIMA(0, d, 0) process is also called fractional white noise, see
Baillie (1996). This is because a random walk is the discrete analog of the Brownian
motion, and similarly the discrete time version of fractional Brownian motion is the
fractionally differenced white noise. Note that the ARFIMA(0, d, 0) process is given by

$$(1 - L)^d y_t = u_t. \qquad (2.7)$$

In this case, the infinite order autoregressive and moving average representations are
easy to obtain from (2.7), as shown in Hosking (1981). In particular, the infinite order
autoregressive representation is

$$\sum_{k=0}^{\infty} \pi_k y_{t-k} = u_t, \qquad (2.8)$$

where the infinite order autoregressive weights $\pi_k$ are given in (2.5) and, for $k \to \infty$,

$$\pi_k \sim \frac{1}{\Gamma(-d)} k^{-d-1}. \qquad (2.9)$$
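In practice the autoregressive weights of the fractional difference operator are rarely computed from gamma functions directly. A minimal sketch, assuming the standard recursion $\pi_0 = 1$, $\pi_k = \pi_{k-1}(k-1-d)/k$ implied by the ratio of successive gamma-function coefficients, is given below. The function names are illustrative, and the filter is truncated at the start of the sample, which is an approximation for non-integer d.

```python
import math


def frac_diff_weights(d, n):
    """Coefficients pi_0..pi_n of (1 - L)^d via pi_k = pi_{k-1} (k - 1 - d) / k."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (k - 1 - d) / k)
    return w


def fractional_difference(x, d):
    """Apply (1 - L)^d to the list x, treating pre-sample values as zero."""
    w = frac_diff_weights(d, len(x) - 1)
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]


if __name__ == "__main__":
    d = 0.3
    w = frac_diff_weights(d, 2000)
    # Compare the exact weight at lag 2000 with the hyperbolic approximation.
    print(w[2000], 2000 ** (-d - 1) / math.gamma(-d))
    # For d = 1 the filter reduces to the ordinary first difference.
    print(fractional_difference([1.0, 3.0, 6.0, 10.0], 1))
```

For integer d the recursion terminates on its own, since the weights beyond lag d are exactly zero.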

The infinite order moving average representation is obtained by use of the Wold
decomposition, and is given by

$$y_t = (1 - L)^{-d} u_t = \sum_{k=0}^{\infty} \psi_k u_{t-k} = \left( 1 + dL + \frac{d(d+1)}{2!} L^2 + \frac{d(d+1)(d+2)}{3!} L^3 + \cdots \right) u_t. \qquad (2.10)$$

The infinite order moving average coefficients can alternatively be expressed by use
of the gamma function. Since $\Gamma(d+k)/\Gamma(d) = d(d+1)(d+2)\cdots(d+k-1)$, it follows that
$\psi_k = \frac{\Gamma(k+d)}{\Gamma(d)\Gamma(k+1)}$. When $k \to \infty$, the infinite order moving average coefficients will be
approximately equal to

$$\psi_k \sim \frac{1}{\Gamma(d)} k^{d-1}. \qquad (2.11)$$

Equation (2.6) can be interpreted in several ways. For instance, defining
$\tilde{y}_t = \phi^{-1}(L)\theta(L)u_t$, it can be written as

$$(1 - L)^d y_t = \tilde{y}_t.$$

This representation means that an ARMA process is obtained after passing $y_t$ through
the fractional difference operator (or infinite linear filter) $(1 - L)^d$. Alternatively, (2.6)
can be written as

$$y_t = \phi(L)^{-1}\theta(L) y_t^*,$$

where $y_t^*$ is an ARFIMA(0, d, 0) process defined in (2.7). In this representation, $y_t$ is
obtained by passing an ARFIMA(0, d, 0) process through an ARMA filter. Figures
1.a to 1.d show sample realizations of several ARFIMA processes with disturbances
$u_t \sim$ i.i.d. $N(0, 0.25)$ and the same long memory parameter $d = 0.3$. It is apparent
from these graphs that many different types of dynamic behavior can be obtained.
Figures 2.a to 2.d show the first fifty autocorrelations of the corresponding processes
together with the 95 percent confidence intervals. As is evident from the figures, the
sample realizations are quite persistent in their autocorrelations, in that there are
highly significant correlations at higher lags. The parameter d determines the long run
behavior of the process, while the autoregressive and moving average parameters allow
one to model short-run dynamics more flexibly. In this sense, ARFIMA models are
very flexible and parsimonious, as they allow one to model both the short run and the long
run behavior of a time series simultaneously.
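Sample paths of this kind can be generated in several ways. The sketch below uses one simple possibility, a truncated moving average representation of fractional white noise followed by an optional AR(1) filter. This is an assumed illustration rather than the procedure used to produce the figures; the truncation lag, the seed, the reading of 0.25 as the innovation variance, and all function names are hypothetical choices.

```python
import random


def simulate_fractional_noise(T, d, sigma=0.5, trunc=500, seed=0):
    """Approximate ARFIMA(0, d, 0) draws from an MA filter truncated at lag trunc."""
    rng = random.Random(seed)
    psi = [1.0]
    for k in range(1, trunc + 1):
        psi.append(psi[-1] * (k - 1 + d) / k)
    # Innovations with standard deviation sigma (variance 0.25 when sigma = 0.5).
    u = [rng.gauss(0.0, sigma) for _ in range(T + trunc)]
    return [sum(psi[k] * u[t + trunc - k] for k in range(trunc + 1)) for t in range(T)]


def apply_ar1(x, phi):
    """Pass x through the AR(1) filter 1 / (1 - phi L), starting from zero."""
    out, prev = [], 0.0
    for v in x:
        prev = phi * prev + v
        out.append(prev)
    return out


def sample_autocorr(y, lag):
    """Sample autocorrelation of y at the given lag."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - lag] - mean) for t in range(lag, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


if __name__ == "__main__":
    noise = simulate_fractional_noise(T=2000, d=0.3, seed=42)
    series = apply_ar1(noise, phi=0.5)   # an ARFIMA(1, d, 0)-type path
    print([round(sample_autocorr(noise, k), 3) for k in (1, 10, 50)])
    print([round(sample_autocorr(series, k), 3) for k in (1, 10, 50)])
```

Truncating the moving average filter slightly understates the persistence of the exact process, so longer truncation lags give a closer approximation at greater computational cost.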

The spectral density of an ARFIMA process can be obtained directly from (2.6).
Note that the spectral density of an ARMA process, $\tilde{y}_t$, is given by

$$f_{\tilde{y}}(\omega) = \frac{\sigma_u^2}{2\pi} \frac{|\theta(e^{i\omega})|^2}{|\phi(e^{i\omega})|^2},$$

where $\omega$ is the angular frequency. Since the ARFIMA process is obtained from a
process $\tilde{y}_t$ with spectral density $f_{\tilde{y}}$ by applying the infinite linear filter $\sum_{k=0}^{\infty} \psi_k \tilde{y}_{t-k}$,
then by a result from Priestley (1981, pp. 243-66), the spectral density of $y_t$ is equal
to $|A(\omega)|^2 f_{\tilde{y}}(\omega)$, where $A(\omega) = \sum_{k=0}^{\infty} \psi_k e^{ik\omega}$. Hence, it follows from (2.6) that the
spectral density of $y_t$ will be

$$f_y(\omega) = |1 - e^{i\omega}|^{-2d} f_{\tilde{y}}(\omega), \qquad (2.12)$$

where $|1 - e^{i\omega}| = 2\sin(\frac{1}{2}\omega)$. Since $\lim_{\omega \to 0} \omega^{-1}\left(2\sin(\frac{1}{2}\omega)\right) = 1$, the behavior of the
spectral density of the process at low frequencies (alternatively, at high periods, or
as sample size approaches infinity) will be given by

$$f_y(\omega) \sim f_{\tilde{y}}(0)\,|\omega|^{-2d} = \frac{\sigma_u^2}{2\pi} \frac{|\theta(1)|^2}{|\phi(1)|^2} |\omega|^{-2d}. \qquad (2.13)$$

For $-\frac{1}{2} < d < 0$, $f_y(0) = 0$, and hence the sum of all autocorrelations is zero. For
$d = 0$, the spectral density reduces to that of an ordinary ARMA(p, q) process with
bounded spectral density. Long-range dependence, and/or long memory, occurs when
$0 < d < \frac{1}{2}$. To transform $y_t$ into a process with bounded spectral density, the infinite
linear filter $(1 - L)^d$ needs to be applied.
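The low-frequency approximation can be compared with the exact spectral density numerically. The sketch below is an illustration with assumed parameter values and hypothetical function names; it evaluates the ARFIMA spectral density using complex arithmetic and the power-law approximation near frequency zero.

```python
import math


def arfima_spectral_density(omega, d, phi=(), theta=(), sigma2=1.0):
    """|1 - e^{i w}|^(-2d) times the ARMA spectral density, for w != 0."""
    z = complex(math.cos(omega), math.sin(omega))          # e^{i omega}
    ar = 1.0 - sum(c * z ** (j + 1) for j, c in enumerate(phi))
    ma = 1.0 + sum(c * z ** (j + 1) for j, c in enumerate(theta))
    f_arma = sigma2 / (2.0 * math.pi) * abs(ma) ** 2 / abs(ar) ** 2
    return abs(1.0 - z) ** (-2.0 * d) * f_arma


def low_frequency_approx(omega, d, phi=(), theta=(), sigma2=1.0):
    """Power-law approximation: ARMA density at zero times |w|^(-2d)."""
    ar1 = 1.0 - sum(phi)
    ma1 = 1.0 + sum(theta)
    return sigma2 / (2.0 * math.pi) * ma1 ** 2 / ar1 ** 2 * abs(omega) ** (-2.0 * d)


if __name__ == "__main__":
    d, phi, theta = 0.3, (0.5,), (0.2,)
    for omega in (0.5, 0.1, 0.01, 0.001):
        exact = arfima_spectral_density(omega, d, phi, theta)
        approx = low_frequency_approx(omega, d, phi, theta)
        # The ratio should approach one as omega approaches zero.
        print(omega, exact / approx)
```

Setting d = 0 recovers the bounded ARMA spectral density, while any d > 0 makes the density unbounded as the frequency approaches zero.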

Obtaining explicit expressions for all covariances of the ARFIMA(p, d, q) process
is relatively difficult, except in the case of the ARFIMA(0, d, 0) process. In this case, it
is shown in Sowell (1992) that the covariances are given by the formula

$$\gamma_k = \sigma_u^2 \frac{(-1)^k \Gamma(1 - 2d)}{\Gamma(k - d + 1)\Gamma(1 - k - d)}. \qquad (2.14)$$

The autocorrelations are given by

$$\rho_k = \frac{\Gamma(1 - d)\Gamma(k + d)}{\Gamma(d)\Gamma(k + 1 - d)}. \qquad (2.15)$$

By using the approximation $\frac{\Gamma(k+a)}{\Gamma(k+b)} \approx k^{a-b}$ for large k, $\rho_k$ can be expressed
asymptotically by

$$\rho_k \sim c\, k^{2d-1} \quad \text{as } k \to \infty, \qquad (2.16)$$

with $c = \Gamma(1-d)/\Gamma(d)$.
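The exact autocorrelations of fractional white noise and their hyperbolic approximation are easy to compare. The sketch below assumes $0 < d < \frac{1}{2}$, so that all gamma-function arguments are positive, and uses log-gamma values to avoid overflow at long lags; the function names are illustrative.

```python
import math


def fwn_autocorr(k, d):
    """Exact autocorrelation of ARFIMA(0, d, 0) at lag k >= 0, for 0 < d < 0.5."""
    log_rho = (math.lgamma(1.0 - d) + math.lgamma(k + d)
               - math.lgamma(d) - math.lgamma(k + 1.0 - d))
    return math.exp(log_rho)


def fwn_autocorr_approx(k, d):
    """Hyperbolic approximation Gamma(1 - d) / Gamma(d) * k^(2d - 1)."""
    return math.gamma(1.0 - d) / math.gamma(d) * k ** (2.0 * d - 1.0)


if __name__ == "__main__":
    d = 0.3
    for k in (1, 10, 100, 1000):
        print(k, fwn_autocorr(k, d), fwn_autocorr_approx(k, d))
    # Partial sums of the autocorrelations keep growing with the cutoff,
    # in line with the non-summability condition in the first definition.
    print(sum(fwn_autocorr(k, d) for k in range(1, 1001)),
          sum(fwn_autocorr(k, d) for k in range(1, 10001)))
```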

To obtain the covariances of the general ARFIMA(p, d, q) process, as suggested in
Beran (1994), one can use the covariances of the ARFIMA(0, d, 0) process. This can
be done by first recalling that $y_t$ is obtained by passing an ARFIMA(0, d, 0) process,
$y_t^*$, through the linear filter

$$\Lambda(L) = \phi^{-1}(L)\theta(L) = \sum_{i=0}^{\infty} \lambda_i L^i.$$

Denote the covariances of $y_t^*$ by $\gamma_k^*$. In the first step, calculate the coefficients $\lambda_i$ by
matching the powers of $\phi^{-1}(L)\theta(L)$ with those of $\Lambda(L)$. In the second step, the covariances
of the ARFIMA(p, d, q) process $y_t$ are obtained from $\Lambda(L)$ and the covariances $\gamma_k^*$ by

$$\gamma_k = \sum_{i,l=0}^{\infty} \lambda_i \lambda_l \gamma^*_{k+i-l}. \qquad (2.17)$$

See Chung (1994) for an alternative derivation of the autocorrelations of the ARFIMA(p, d, q)
model. The asymptotic formulas for the covariances and autocorrelations are

$$\gamma_k \sim c_\gamma(d, \phi, \theta) |k|^{2d-1}, \qquad (2.18)$$

where

$$c_\gamma(d, \phi, \theta) = \frac{\sigma_u^2}{\pi} \frac{|\theta(1)|^2}{|\phi(1)|^2} \Gamma(1 - 2d) \sin(\pi d),$$

and

$$\rho_k \sim c_\rho(d, \phi, \theta) |k|^{2d-1}, \qquad (2.19)$$

where

$$c_\rho(d, \phi, \theta) = \frac{c_\gamma(d, \phi, \theta)}{\int_{-\pi}^{\pi} f(\omega) \, d\omega}.$$
2.3 Long memory volatility models

Risk is an important factor in financial markets. At a theoretical level, the
Capital Asset Pricing Model (CAPM) developed by Sharpe (1964) and Merton (1973)
indicates the presence of a direct relationship between the return and the risk of an asset. Also,
an important determinant of the price of an option is the risk associated with the price of the
underlying asset, as measured by its volatility. One of the stylized facts of asset
returns in financial markets is that volatilities of assets change over time. Periods
of large price changes are followed by periods of relatively stable prices. This
property of asset prices is referred to in the literature as volatility clustering. The time
varying nature of volatility was recognized as early as the 1960s; see, for instance,
Mandelbrot (1963a, 1963b) and Fama (1965). Econometric modelling of the volatility
clustering phenomenon began relatively recently, in the 1980s. The Autoregressive
Conditional Heteroscedasticity (ARCH) model, introduced first by Engle (1982) and
generalized by Bollerslev (1986) under the label Generalized Autoregressive Conditional
Heteroscedasticity (GARCH), and its extensions have become popular
among both practitioners and researchers. GARCH models are able to describe certain
properties of economic time series, such as volatility clustering and excess kurtosis.
Although the GARCH model is able to capture the volatility clustering phenomenon
well, it is not able to capture certain other empirically relevant properties of financial
time series. For instance, in the standard GARCH model the effect of a shock on
volatility depends only on the shock's size, not its sign. However, as observed in Black
(1976), negative shocks or news may affect volatility quite differently than positive
ones. Hence, the sign of the shock may be relevant in understanding the dynamic
nature of volatility. Another example is the persistence of the effects
of shocks in the volatility process. As observed in Ding, Granger, and Engle (1993),
sample autocorrelations of certain volatility measures, such as absolute and squared
returns, decline at a hyperbolic rate. Standard GARCH models fail to account for this
slow decay in the autocorrelations, which is inherent in the volatility process. These
considerations led several researchers to develop volatility models that are capable of
modelling several aspects of volatility in financial markets. In this section, we will
review the GARCH class of models, with particular attention given to the parametric long
memory volatility model of Baillie et al. (1996), namely the fractionally integrated
GARCH (FIGARCH) model.

In general, an observed time series $y_t$ can be written as the sum of a predictable
and an unpredictable component,

$$y_t = E[y_t | \Omega_{t-1}] + u_t, \qquad (2.20)$$

where $\Omega_{t-1}$ is the information set consisting of all relevant information up to
and including time $t - 1$. In the previous section, different specifications (such
as ARMA(p, q) or ARFIMA(p, d, q)) for the predictable component or conditional mean
$E[y_t | \Omega_{t-1}]$ have been discussed. In section 2.2, the unpredictable part or
disturbance $u_t$ is assumed to satisfy the white noise properties. In particular, it was
assumed that $u_t$ is both conditionally and unconditionally homoscedastic, that is,
$E[u_t^2] = E[u_t^2 | \Omega_{t-1}] = \sigma_u^2$ for all t. In the ARCH modelling of volatility, this
assumption is relaxed and replaced by the assumption that the conditional variance of $u_t$ can
vary over time, that is, $E[u_t^2 | \Omega_{t-1}] = h_t$ for some nonnegative function $h_t \equiv h_t(\Omega_{t-1})$.
Hence, the disturbances are conditionally heteroscedastic. Following Engle (1982), a
convenient functional form is

$$u_t = z_t \sqrt{h_t}, \qquad (2.21)$$

where $z_t$ is independent and identically distributed with zero mean and unit variance.
For convenience, it is usually assumed that $z_t$ has a standard normal distribution.
This latter assumption can be replaced with another distributional assumption; for
example, following Bollerslev (1987), one may assume that $z_t$ follows a student-t
distribution with $\nu$ degrees of freedom. From (2.21) and the properties of $z_t$ it follows that
the distribution of $u_t$ conditional upon the history $\Omega_{t-1}$ is either normal or student-t
with mean zero and variance $h_t$. The unconditional variance of $u_t$ is

$$\sigma_u^2 \equiv E[u_t^2] = E[E[u_t^2 | \Omega_{t-1}]] = E[h_t], \qquad (2.22)$$

where the latter equality follows from the law of iterated expectations, assuming
that the expectations exist. It follows that the unconditional variance of $u_t$ should
be constant, that is, the unconditional mean $E[h_t]$ is constant. Equations (2.21)-(2.22)
specify the general representation of GARCH type models. The complete
specification involves how one assumes the conditional variance of $u_t$ evolves over
time. GARCH type models specify the conditional variance of $u_t$ in such a way that the specified
model captures (some of) the empirically observed features of economic and financial
time series.

2.3.1 The (G)ARCH Model

Engle (1982) introduced the class of Autoregressive Conditionally Heteroscedastic
(ARCH) models to capture the volatility clustering phenomenon that occurs in
economic and financial time series. In the basic ARCH model, the conditional variance
of the disturbance that occurs at time t is specified to be a linear function of the
squares of past disturbances. The general ARCH(q) model is given by

$$h_t = \omega + \sum_{j=1}^{q} \alpha_j u_{t-j}^2. \qquad (2.23)$$

Obviously, the conditional variance $h_t$ needs to be nonnegative. To guarantee
nonnegativity of the conditional variance, it is required that $\omega > 0$ and $\alpha_j \geq 0$ for all
$j = 1, \ldots, q$. To understand why the ARCH model can describe volatility clustering,
observe that model (2.21) with (2.23) basically states that the conditional variance of
$u_t$ is an increasing function of the squared disturbances or shocks that occurred in the previous q
periods, with some nonnegative weights. Hence, if, say, $u_{t-1}$ is large in absolute value,
$u_t$ is expected to be large in absolute value as well. In other words, large (small)
shocks tend to be followed by large (small) shocks of either sign. An alternative way
to see the same thing is to note that the ARCH(q) model can be written as an AR(q)
model for $u_t^2$. Adding $u_t^2$ to (2.23) and subtracting $h_t$ from both sides gives

$$u_t^2 = \omega + \sum_{j=1}^{q} \alpha_j u_{t-j}^2 + v_t, \qquad (2.24)$$

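where $v_t \equiv u_t^2 - h_t = h_t(z_t^2 - 1)$. Note that $E[v_t \mid \Omega_{t-1}] = 0$. Given the AR
representation of the ARCH($q$) process, the condition that needs to be satisfied in order
for $u_t^2$ to be covariance stationary is that the roots of the lag polynomial
$1 - \alpha_1 L - \cdots - \alpha_q L^q$ lie outside the unit circle. Moreover, the unconditional
variance of $u_t$, or unconditional mean of $u_t^2$, can be obtained as

\[ \sigma_u^2 \equiv E[u_t^2] = \frac{\omega}{1 - \sum_{j=1}^{q} \alpha_j} \tag{2.25} \]

Hence $\sum_{j=1}^{q} \alpha_j < 1$ is required for the unconditional variance to be well defined.
Under these conditions, (2.24) can be rewritten as

\[
\begin{aligned}
u_t^2 &= \frac{\omega}{1 - \sum_{j=1}^{q}\alpha_j}\Big(1 - \sum_{j=1}^{q}\alpha_j\Big) + \sum_{j=1}^{q}\alpha_j u_{t-j}^2 + v_t \\
      &= \Big(1 - \sum_{j=1}^{q}\alpha_j\Big)\sigma_u^2 + \sum_{j=1}^{q}\alpha_j u_{t-j}^2 + v_t \\
      &= \sigma_u^2 + \sum_{j=1}^{q}\alpha_j\big(u_{t-j}^2 - \sigma_u^2\big) + v_t
\end{aligned}
\tag{2.26}
\]

Equation (2.26) shows that if $u_{t-1}^2$ is larger (smaller) than its unconditional expected
value $\sigma_u^2$, $u_t^2$ is expected to be larger (smaller) than $\sigma_u^2$ as well.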
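The AR representation and the unconditional variance in (2.25) can be checked by simulation. The following is a minimal sketch for the ARCH(1) case with Gaussian $z_t$; the function names, the burn-in length and the parameter values are illustrative assumptions and do not come from the text.

```python
import random

def simulate_arch1(T, omega, alpha1, seed=0, burn=500):
    """Simulate u_t = z_t * sqrt(h_t) with h_t = omega + alpha1 * u_{t-1}^2, z_t ~ N(0, 1)."""
    rng = random.Random(seed)
    u_prev_sq = omega / (1.0 - alpha1)  # start at the unconditional variance (2.25)
    u = []
    for t in range(T + burn):
        h_t = omega + alpha1 * u_prev_sq        # conditional variance, (2.23) with q = 1
        u_t = rng.gauss(0.0, 1.0) * h_t ** 0.5  # functional form (2.21)
        u_prev_sq = u_t * u_t
        if t >= burn:                           # discard the burn-in draws
            u.append(u_t)
    return u

def second_moment(u):
    """Sample analogue of E[u_t^2]; the mean of u_t is zero by construction."""
    return sum(x * x for x in u) / len(u)

def lag1_autocorr_of_squares(u):
    """First-order sample autocorrelation of u_t^2."""
    s = [x * x for x in u]
    m = sum(s) / len(s)
    num = sum((s[t] - m) * (s[t - 1] - m) for t in range(1, len(s)))
    den = sum((x - m) ** 2 for x in s)
    return num / den
```

With $\omega = 1$ and $\alpha_1 = 0.2$, the sample second moment should be close to $\omega/(1-\alpha_1) = 1.25$, and the first autocorrelation of $u_t^2$ should be close to $\alpha_1$, as the AR(1) form of (2.24) suggests.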

The ARCH model can capture not only the volatility clustering of the time series under
investigation but also their excess kurtosis, which is common in economic and financial
time series. From (2.21) it can be seen that the kurtosis of $u_t$ is always greater than
that of $z_t$,

\[ E[u_t^4] = E[z_t^4]\,E[h_t^2] \ge E[z_t^4]\,(E[h_t])^2 = E[z_t^4]\,(E[u_t^2])^2, \]

where the inequality follows from Jensen's inequality. As shown by Engle (1982), for
the ARCH(1) model with normally distributed $z_t$ the kurtosis of $u_t$ is equal to

\[ \mathrm{Kurt}_u = \frac{E[u_t^4]}{E[u_t^2]^2} = \frac{3(1 - \alpha_1^2)}{1 - 3\alpha_1^2}, \]

which is finite if $3\alpha_1^2 < 1$. It is clear that $\mathrm{Kurt}_u$ is always larger than the normal
value of 3.
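As a small numerical illustration of the ARCH(1) kurtosis formula above, the sketch below evaluates it directly; the function name is an assumption made for this example.

```python
def arch1_kurtosis(alpha1):
    """Kurtosis of u_t in a Gaussian ARCH(1) model: 3(1 - a^2) / (1 - 3a^2)."""
    if alpha1 < 0.0 or 3.0 * alpha1 ** 2 >= 1.0:
        # The fourth moment is finite only if 3 * alpha1^2 < 1.
        raise ValueError("kurtosis is finite only for 0 <= alpha1 and 3*alpha1^2 < 1")
    return 3.0 * (1.0 - alpha1 ** 2) / (1.0 - 3.0 * alpha1 ** 2)
```

For $\alpha_1 = 0$ the value is 3, the normal case, and it rises quickly as $3\alpha_1^2$ approaches 1.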


To capture the dynamic patterns in conditional volatility adequately by means
of an ARCH($q$) model, $q$ often needs to be quite large. It can then be quite
cumbersome to estimate the parameters of an ARCH($q$) model, as
nonnegativity and stationarity conditions need to be imposed. To reduce the
computational problems one needs to impose some structure on the parameters, such as
$\alpha_j = \alpha(q + 1 - j)/(q(q + 1)/2)$, $j = 1, \dots, q$, which implies that the parameters of the
lagged squared shocks decline linearly and sum to $\alpha$; see Engle (1982).
An alternative method is suggested by Bollerslev (1986), which involves adding lagged
conditional variances to the ARCH specification. For instance, adding $p$ such
conditional variances to the ARCH($q$) model results in the GARCH($p, q$) model,

\[
\begin{aligned}
h_t &= \omega + \sum_{j=1}^{q} \alpha_j u_{t-j}^2 + \sum_{j=1}^{p} \beta_j h_{t-j} \\
    &= \omega + \alpha(L) u_t^2 + \beta(L) h_t
\end{aligned}
\tag{2.27}
\]

This model avoids the necessity of adding many lagged squared disturbance terms
by adding lagged values of the conditional variance. To see why a GARCH
specification removes the need for a large number of lagged squared disturbances,
consider the GARCH(1, 1) model,

\[ h_t = \omega + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1}. \tag{2.28} \]

This model can be rewritten as

\[ h_t = \omega + \alpha_1 u_{t-1}^2 + \beta_1\big(\omega + \alpha_1 u_{t-2}^2 + \beta_1 h_{t-2}\big), \]

or, by continuing the recursive substitution, one can obtain

\[ h_t = \sum_{j=1}^{\infty} \beta_1^{j-1}\omega + \alpha_1 \sum_{j=1}^{\infty} \beta_1^{j-1} u_{t-j}^2. \tag{2.29} \]

This equation shows that the GARCH(1, 1) model corresponds to an ARCH($\infty$)
model with a particular parameter structure. This illustrates why in most
applications a low order model, for instance a GARCH(1, 1) model, is usually found to
be general enough to capture the dynamic behavior of many economic and financial
time series.
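The equivalence between the GARCH(1, 1) recursion (2.28) and the ARCH($\infty$) form (2.29) can be verified numerically. The sketch below compares the recursion with the sum in (2.29) truncated at a finite number of lags; the function names and the starting value `h0` are assumptions introduced for illustration.

```python
def garch11_recursion(u, omega, alpha1, beta1, h0):
    """Return [h_0, ..., h_n] with h_t = omega + alpha1 * u_{t-1}^2 + beta1 * h_{t-1}.

    u[t] holds u_t for t = 0, ..., n-1 and h0 is an assumed starting value.
    """
    h = [h0]
    for x in u:
        h.append(omega + alpha1 * x * x + beta1 * h[-1])
    return h

def arch_inf_truncated(u, omega, alpha1, beta1, n_lags):
    """Sum in (2.29) truncated at n_lags, for the period following the last observation."""
    n = len(u)
    total = 0.0
    for j in range(1, n_lags + 1):
        total += beta1 ** (j - 1) * (omega + alpha1 * u[n - j] ** 2)
    return total
```

The two values differ only by the term $\beta_1^{J} h_{t-J}$, where $J$ is the truncation lag, so the difference vanishes geometrically as $J$ grows.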
An alternative representation of the GARCH(1, 1) model can be obtained by adding
$u_t^2$ to both sides of (2.28) and moving $h_t$ to the right-hand side,

\[ u_t^2 = \omega + (\alpha_1 + \beta_1) u_{t-1}^2 + v_t - \beta_1 v_{t-1}, \tag{2.30} \]

where again $v_t = u_t^2 - h_t$. This ARMA(1, 1) representation allows one to establish
conditions for the covariance stationarity of the GARCH(1, 1) process. From (2.30) it
follows that the GARCH(1, 1) model is covariance stationary if and only if $\alpha_1 + \beta_1 < 1$.
In this case the unconditional mean of $u_t^2$, or unconditional variance of $u_t$, is equal
to

\[ \sigma_u^2 = \frac{\omega}{1 - \alpha_1 - \beta_1}. \tag{2.31} \]
The parameters in the GARCH(1, 1) model need to satisfy $\omega > 0$, $\alpha_1 > 0$ and $\beta_1 \ge 0$ in
order to guarantee that $h_t \ge 0$. Moreover, $\alpha_1$ needs to be strictly positive in order for
$\beta_1$ to be identified. This is because, if $\alpha_1 = 0$ in (2.30), both the AR and MA polynomials
become $1 - \beta_1 L$; hence, when one rewrites the ARMA(1, 1) model for $u_t^2$ as an MA($\infty$)
process, the polynomials cancel out,

\[ u_t^2 = \frac{\omega}{1 - \beta_1} + \frac{1 - \beta_1 L}{1 - \beta_1 L}\, v_t = \frac{\omega}{1 - \beta_1} + v_t, \]

which indicates that $\beta_1$ is not identified; see Bollerslev (1986) for details.

In the case of the GARCH(1, 1) model, Bollerslev (1986) showed that the kurtosis of $u_t$ under
normality of $z_t$ is given by

\[ \mathrm{Kurt}_u = \frac{3\,[1 - (\alpha_1 + \beta_1)^2]}{1 - (\alpha_1 + \beta_1)^2 - 2\alpha_1^2}, \]

which is always larger than the normal value of 3. The autocorrelations of $u_t^2$ are
derived in Bollerslev (1988) and are given by

\[ \rho_1 = \alpha_1 + \frac{\alpha_1^2 \beta_1}{1 - 2\alpha_1\beta_1 - \beta_1^2}, \tag{2.32} \]

\[ \rho_k = (\alpha_1 + \beta_1)^{k-1} \rho_1 \quad \text{for } k = 2, 3, \dots \tag{2.33} \]

The decay factor of the autocorrelations is $\alpha_1 + \beta_1$. This means that if this sum is close
to 1, the autocorrelations decline gradually, though still at an exponential rate. If the fourth
moment of $u_t$ does not exist (that is, if $(\alpha_1 + \beta_1)^2 + 2\alpha_1^2 \ge 1$, as shown by Bollerslev 1986),
then the autocorrelations of $u_t^2$ are time-varying. As shown by Ding and Granger
(1996), if the GARCH(1, 1) model is covariance stationary but with infinite fourth
moment, one can still compute the sample autocorrelations.
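The kurtosis and autocorrelation formulas for the Gaussian GARCH(1, 1) model are easy to evaluate. The following sketch assumes the fourth moment exists; the function names are illustrative and not part of the original exposition.

```python
def garch11_kurtosis(alpha1, beta1):
    """Kurtosis of u_t under normality of z_t (Bollerslev, 1986)."""
    s = alpha1 + beta1
    denom = 1.0 - s * s - 2.0 * alpha1 ** 2
    if denom <= 0.0:
        raise ValueError("fourth moment does not exist")
    return 3.0 * (1.0 - s * s) / denom

def garch11_acf_squares(alpha1, beta1, max_lag):
    """Autocorrelations rho_1, ..., rho_max_lag of u_t^2 (Bollerslev, 1988)."""
    rho1 = alpha1 + alpha1 ** 2 * beta1 / (1.0 - 2.0 * alpha1 * beta1 - beta1 ** 2)
    # Beyond lag one the autocorrelations decay geometrically with factor alpha1 + beta1.
    return [rho1 * (alpha1 + beta1) ** (k - 1) for k in range(1, max_lag + 1)]
```

For example, with $\alpha_1 = 0.2$ and $\beta_1 = 0.7$ the first autocorrelation is about 0.32 and each further lag multiplies it by 0.9.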

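In the general GARCH($p, q$) model, if all the roots of $1 - \beta(L)$ lie outside the unit
circle, the model can be written as an infinite-order ARCH model,

\[
\begin{aligned}
h_t &= \frac{\omega}{1 - \beta(1)} + \frac{\alpha(L)}{1 - \beta(L)}\, u_t^2 \\
    &= \frac{\omega}{1 - \beta_1 - \cdots - \beta_p} + \sum_{j=1}^{\infty} \delta_j u_{t-j}^2
\end{aligned}
\tag{2.34}
\]

To guarantee the nonnegativity of the conditional variance, all $\delta_j$ need to be
nonnegative. The ARMA($m, p$) representation of $u_t^2$ is given by

\[ u_t^2 = \omega + \sum_{j=1}^{m} (\alpha_j + \beta_j) u_{t-j}^2 - \sum_{j=1}^{p} \beta_j v_{t-j} + v_t, \tag{2.35} \]

where $m = \max(p, q)$, $\alpha_j \equiv 0$ for $j > q$ and $\beta_j \equiv 0$ for $j > p$. The GARCH($p, q$) model is
covariance stationary if all the roots of $1 - \alpha(L) - \beta(L)$ lie outside the unit circle.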

2.3.2 The IGARCH Model

In applications of the GARCH(1, 1) model to high frequency economic and
financial data, it is usually found that the estimates of $\alpha_1$ and $\beta_1$ are such that their
sum is close to or equal to 1. The GARCH(1, 1) model with the restriction $\alpha_1 + \beta_1 = 1$
is referred to as the Integrated GARCH (IGARCH) model. The reason is that the
restriction on these parameters leads to a unit root in the ARMA(1, 1) representation of the
GARCH(1, 1) model. From equation (2.30) the ARMA representation of the model
becomes

\[ (1 - L) u_t^2 = \omega + v_t - \beta_1 v_{t-1}. \]

From (2.31) it can easily be seen that the unconditional variance of $u_t$ is not finite.
Therefore, the IGARCH(1, 1) model is not covariance stationary. Although the
autocorrelations of $u_t^2$ for an IGARCH model are not properly defined, Ding and
Granger (1996) show that they are approximately equal to

\[ \rho_k = \tfrac{1}{3}(1 + 2\alpha_1)(1 + 2\alpha_1^2)^{-k/2}. \]

The autocorrelations still decay exponentially. This is in sharp contrast to an $I(1)$
process, for instance a random walk model, for which the autocorrelations are
approximately equal to 1.

2.3.3 The FIGARCH Model

The properties of the conditional variance $h_t$ as implied by the IGARCH model are
not very attractive from an empirical point of view. The IGARCH model implies that
a shock to the volatility process will have very persistent effects. The IGARCH model
also implies that there is a linear trend in the forecast of the volatility process,
i.e. $E_t h_{t+k} = h_t + k\omega$; hence, the forecasts of the future conditional variance increase
linearly with the forecast horizon. This is not realistic from an empirical point of
view. On the other hand, estimates of the GARCH(1, 1) model from high frequency
financial time series invariably yield a sum of $\alpha_1$ and $\beta_1$ close to 1, with $\alpha_1$ small
and $\beta_1$ large. From the ARCH($\infty$) representation of the GARCH(1, 1) model, equation
(2.29), it can be seen that the impact of a shock $u_t$ on the conditional variance at a
future date, $h_{t+k}$, is given by $\alpha_1 \beta_1^{k-1}$. With $\beta_1$ close to 1, the impact of a shock at
time $t$ on the conditional variance will decay very slowly as $k$ gets larger.
Moreover, the autocorrelations of $u_t^2$ given in (2.32) and (2.33) die out very slowly
if the sum $\alpha_1 + \beta_1$ is close to 1, although the decay is still at an exponential rate.
This can be seen from panel a of figure 2.3, which displays the autocorrelations of
$u_t^2$ from a sample realization of a GARCH(1, 1) process with $\omega = 0.001$, $\alpha_1 = 0.2$, and $\beta_1 = 0.7$.
It is evident from the figure that the autocorrelations decay slowly, but the decay
rate is still too fast to mimic the observed autocorrelation patterns of empirical volatility
processes. For example, Ding, Granger, and Engle (1993), de Lima, Breidt, and Crato
(1994), Baillie, Bollerslev, and Mikkelsen (1996), Lobato and Savin (1998),
Dacorogna et al. (1993), Andersen and Bollerslev (1997), and Baillie, Cecen, and Han
(2001) all report that the sample autocorrelations of absolute returns and power
transformations of returns for various asset prices at different frequencies decline only
at a hyperbolic rate. As discussed in the previous section, this type of behavior of
autocorrelations can be modelled by means of long memory or fractionally integrated
processes.

Baillie, Bollerslev, and Mikkelsen (1996) propose the class of Fractionally
Integrated GARCH (FIGARCH) models. The FIGARCH process is capable of modelling
very slow hyperbolic decay in the autocorrelations of the volatility process quite
flexibly. Rewriting the ARMA($m, p$) representation of the GARCH($p, q$) model as

\[ [1 - \beta(L) - \alpha(L)]\, u_t^2 = \omega + [1 - \beta(L)]\, v_t, \]

the FIGARCH($p, \delta, q$) model can be obtained by simply adding a $(1 - L)^{\delta}$ term
on the left-hand side of this ARMA($m, p$) representation. More explicitly, the
FIGARCH($p, \delta, q$) model is given by

\[ \phi(L)(1 - L)^{\delta} u_t^2 = \omega + [1 - \beta(L)]\, v_t, \tag{2.36} \]

where $\phi(L) = [1 - \beta(L) - \alpha(L)](1 - L)^{-\delta}$, all the roots of $\phi(L)$ and $[1 - \beta(L)]$ lie
outside the unit circle, and $0 < \delta < 1$. For $0 < \delta < 1$, $\phi(L)$ is an infinite order
polynomial, while it is of order $m - 1$ for $\delta = 1$. As is evident from (2.36), the
FIGARCH model nests the GARCH and IGARCH models in the sense that when $\delta = 0$
the FIGARCH model reduces to the GARCH model, while for $\delta = 1$ it becomes an
IGARCH model. Rearranging the terms in (2.36), an alternative representation for
the FIGARCH model can be obtained as

\[ [1 - \beta(L)]\, h_t = \omega + [1 - \beta(L) - \phi(L)(1 - L)^{\delta}]\, u_t^2. \tag{2.37} \]

From this representation, the conditional variance of $u_t$, or infinite ARCH
representation of the FIGARCH process, is simply

\[
\begin{aligned}
h_t &= \frac{\omega}{1 - \beta(1)} + \left[1 - \frac{\phi(L)}{1 - \beta(L)}(1 - L)^{\delta}\right] u_t^2 \\
    &\equiv \frac{\omega}{1 - \beta(1)} + \lambda(L)\, u_t^2,
\end{aligned}
\tag{2.38}
\]

where $\lambda(L) = \lambda_1 L + \lambda_2 L^2 + \cdots$. For the FIGARCH($p, \delta, q$) process to be well defined
and the conditional variance to be positive for all $t$, all the coefficients in the infinite
ARCH representation in (2.38) need to be nonnegative, i.e. $\lambda_j \ge 0$ for $j = 1, 2, \dots$.
The general conditions for nonnegativity of the lag coefficients in $\lambda(L)$ are not easy to
establish, but as illustrated in Baillie et al. (1996) it is possible to show sufficient
conditions on a case by case basis.
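Nonnegativity of the ARCH($\infty$) weights can also be checked numerically for given parameter values. The sketch below does this for a FIGARCH(1, $\delta$, 1) model, under the assumption that $\phi(L) = 1 - \phi_1 L$ and $\beta(L) = \beta_1 L$ are first-order polynomials; the function names and the parameter values used to illustrate it are hypothetical.

```python
def frac_diff_coeffs(delta, n):
    """Coefficients pi_0, ..., pi_n in the expansion of (1 - L)^delta."""
    pi = [1.0]
    for k in range(1, n + 1):
        pi.append(pi[-1] * (k - 1 - delta) / k)
    return pi

def figarch_11_arch_weights(delta, phi1, beta1, n):
    """lambda_1, ..., lambda_n in lambda(L) = 1 - (1 - phi1 L)(1 - L)^delta / (1 - beta1 L)."""
    pi = frac_diff_coeffs(delta, n)
    c_prev = 1.0  # coefficient on L^0 of the ratio; it cancels against the leading 1
    lam = []
    for k in range(1, n + 1):
        a_k = pi[k] - phi1 * pi[k - 1]  # coefficient of (1 - phi1 L)(1 - L)^delta
        c_k = a_k + beta1 * c_prev      # division by (1 - beta1 L)
        lam.append(-c_k)
        c_prev = c_k
    return lam
```

For instance, with $\delta = 0.4$, $\phi_1 = 0.2$ and $\beta_1 = 0.5$ every computed weight is nonnegative and the partial sums approach one from below, so `min(lam) >= 0` is a quick screening device before estimation output is interpreted.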

The FIGARCH process implies a slow hyperbolic rate of decay for the autocorrelations
of $u_t^2$, as can be seen from panel b of figure 2.3, which displays the first fifty
autocorrelations of $u_t^2$ from a sample realization of a FIGARCH(1, $\delta$, 1) process. For
$0 < \delta \le 1$, $\lambda(1) = 1$ and hence the second moment of the unconditional distribution of
$u_t$ is infinite, and the FIGARCH process is not covariance stationary, similar to IGARCH
processes. As argued in Baillie et al. (1996), just like IGARCH processes,
FIGARCH processes can be shown to be strictly stationary and ergodic for $0 < \delta \le 1$.
Baillie et al. (1996) show that it is possible to obtain impulse response coefficients
from the definition given in (2.36). Specifically, these are the coefficients of the lag
polynomial $\gamma(L)$ in

\[
\begin{aligned}
(1 - L)\, u_t^2 &= (1 - L)^{1-\delta}\phi(L)^{-1}\omega + (1 - L)^{1-\delta}\phi(L)^{-1}[1 - \beta(L)]\, v_t \\
                &\equiv c + \gamma(L)\, v_t.
\end{aligned}
\tag{2.39}
\]

The long run impact of past shocks on the volatility process can be assessed in terms
of the cumulative impulse response weights,

\[ \gamma(1) = \lim_{k \to \infty} \sum_{i=0}^{k} \gamma_i = \lim_{k \to \infty} \lambda_k = F(\delta - 1, 1, 1; 1)\, \phi(1)^{-1} [1 - \beta(1)], \]

where $F(\delta - 1, 1, 1; 1)$ is the hypergeometric function. For details, see Baillie et
al. (1996). Since for $0 \le \delta < 1$, $F(\delta - 1, 1, 1; 1) = 0$, shocks to the conditional
variance of a FIGARCH process will die out eventually in a forecasting sense, similar
to a GARCH process. But shocks to the GARCH process dissipate at a fast
exponential rate, while shocks to the conditional variance of a FIGARCH process
die out much more slowly, at a hyperbolic rate. In contrast, for $\delta = 1$, $F(\delta - 1, 1, 1; 1) = 1$
and hence the cumulative impulse response weights for an IGARCH process converge to the nonzero
constant $\gamma(1) = \phi(1)^{-1}[1 - \beta(1)]$. This implies that shocks to the conditional variance
of the IGARCH process persist indefinitely. For an illustration, consider the basic
FIGARCH(1, $\delta$, 0) model discussed in Baillie et al. (1996). This model can be written
as

\[ (1 - L)^{\delta} u_t^2 = \omega + v_t - \beta_1 v_{t-1}. \]

Using the definition $v_t = u_t^2 - h_t$, this can be rewritten as an ARCH($\infty$) process
for the conditional variance,

\[
\begin{aligned}
h_t &= \frac{\omega}{1 - \beta_1} + \left[1 - \frac{(1 - L)^{\delta}}{1 - \beta_1 L}\right] u_t^2 \\
    &\equiv \frac{\omega}{1 - \beta_1} + \lambda(L)\, u_t^2,
\end{aligned}
\]

where $\lambda(L) \equiv 1 - (1 - L)^{\delta}/(1 - \beta_1 L)$. By using the expansion (2.6) for $(1 - L)^{\delta}$, it
can be shown that for large $k$

\[ \lambda_k \approx [(1 - \beta_1)\Gamma(\delta)^{-1}]\, k^{-\delta - 1}. \]

It is evident from this expression that the effect of $u_t$ on $h_{t+k}$ decays only at a
hyperbolic rate as $k$ increases.
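The contrast between exponential and hyperbolic decay can be made concrete by computing the lag weights directly. The following sketch builds the ARCH($\infty$) weights of a FIGARCH(1, $\delta$, 0) model from the definition of $\lambda(L)$ and compares them with the GARCH(1, 1) weights $\alpha_1\beta_1^{k-1}$; the function names and parameter values are assumptions chosen for illustration.

```python
import math

def figarch_10_weights(delta, beta1, n):
    """lambda_1, ..., lambda_n in lambda(L) = 1 - (1 - L)^delta / (1 - beta1 L)."""
    lam = []
    pi_k = 1.0    # coefficient of L^0 in (1 - L)^delta
    c_prev = 1.0  # coefficient of L^0 in the ratio
    for k in range(1, n + 1):
        pi_k *= (k - 1 - delta) / k  # next coefficient of (1 - L)^delta
        c_k = pi_k + beta1 * c_prev  # division by (1 - beta1 L)
        lam.append(-c_k)
        c_prev = c_k
    return lam

def garch11_weights(alpha1, beta1, n):
    """Weights alpha1 * beta1^(k-1) on u_{t-k}^2 in the ARCH(infinity) form (2.29)."""
    return [alpha1 * beta1 ** (k - 1) for k in range(1, n + 1)]

def loglog_slope(w, k1, k2):
    """Slope of log(w_k) against log(k) between lags k1 and k2."""
    return (math.log(w[k2 - 1]) - math.log(w[k1 - 1])) / (math.log(k2) - math.log(k1))
```

On a log-log scale the FIGARCH weights fall along an approximately straight line at long lags, whereas the GARCH weights are already negligible after a few hundred lags.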


2.4 ARFIMA-FIGARCH Model: Modelling long memory in both conditional mean and variance

A model that combines long memory processes for the conditional mean and the conditional
variance, and thus allows one to model jointly a time series that may have the long
memory property in both, is
the ARFIMA($P, d, Q$)-FIGARCH($p, \delta, q$) model. The ARFIMA-FIGARCH process can
be expressed as

\[
\begin{aligned}
\Phi(L)(1 - L)^{d} y_t &= \Theta(L)\, u_t \\
u_t &= z_t \sqrt{h_t} \\
[1 - \beta(L)]\, h_t &= \omega + [1 - \beta(L) - \phi(L)(1 - L)^{\delta}]\, u_t^2
\end{aligned}
\tag{2.40}
\]

where $\beta(L)$ and $\phi(L)$ are the same as before, while $\Phi(L) = 1 - \Phi_1 L - \cdots - \Phi_P L^P$ and
$\Theta(L) = 1 + \Theta_1 L + \cdots + \Theta_Q L^Q$ have all their roots outside the unit circle.
Moreover, $E_{t-1} z_t = 0$ and $E_{t-1}(z_t^2) = 1$. This model is capable of modelling both short
run dynamics and long run properties of a time series in both the conditional mean
and the conditional variance very parsimoniously. Note that if $h_t = \omega$ then the model reduces to
the ARFIMA($P, d, Q$) model for the conditional mean process discussed above. If
$P = Q = d = 0$ the model becomes the so-called Martingale-FIGARCH process for the
conditional mean and variance. The Martingale-FIGARCH model is appealing as
it allows one to model the random walk behavior and the highly persistent conditional second
moments of many high frequency asset prices. The Martingale-FIGARCH model is
fit to daily and high frequency exchange rate data (hourly or half-hourly data)
by Baillie, Bollerslev and Mikkelsen (1996), and most recently by Baillie, Cecen,
and Han (2001). On the other hand, Baillie, Han and Kwon (2001) applied the
ARFIMA($P, d, Q$)-FIGARCH($p, \delta, q$) model to inflation series and obtained
results that indicate the presence of long memory dynamics in both the conditional mean
and variance of the inflation series for several industrial countries. As noted in
Baillie, Han and Kwon (2001), contrary to the pure ARFIMA process, the ARFIMA-FIGARCH
process has an infinite unconditional variance for all $d$, given $\delta \ne 0$.

2.5 Estimation and Inference

Several methods of estimating the long memory parameter $d$ have been suggested
in the literature. The early methods are mostly heuristic in the sense that they
are simple diagnostic tools used in detecting the presence of long memory. Most of
these methods are discussed extensively in Beran (1994). More advanced and rigorous
methods have been developed, in both the time and frequency domains, to estimate
the parameters of the long memory models discussed in the previous sections. A
complete review and discussion of them can be found in Baillie (1996) and Beran
(1994) and the references therein. This section discusses those methods most widely
used by applied economists. In particular, semi-parametric estimation
in the frequency domain (or least squares regression in the frequency domain)
due to Geweke and Porter-Hudak (1983) and Robinson (1994, 1995), approximate
maximum likelihood estimation in the frequency domain due to Whittle (1951) and
Fox and Taqqu (1986), and approximate maximum likelihood estimation (or nonlinear
least squares estimation, or conditional sum of squares estimation) in the time
domain, due originally to Hosking (1984) in the context of ARFIMA processes and
Baillie and Chung (1993) in the context of ARFIMA-FI/GARCH processes, will be
discussed in some detail within the context of the long memory models discussed in
the previous sections.

2.5.1 Regression based estimation in the frequency domain

In the spectral domain, Geweke and Porter-Hudak (1983) suggested a semi-parametric
procedure to obtain an estimate of the fractional differencing parameter
$d$ based on the slope of the spectral density function around the angular frequency $\omega = 0$.
The spectral density of a stationary Gaussian long-memory time series $y_t$ is given by

\[ f(\omega) = |1 - \exp(-i\omega)|^{-2d} f^{*}(\omega), \tag{2.41} \]

where $d \in (-0.5, 0.5)$ and $f^{*}(\omega)$ is an even, positive continuous function on $[-\pi, \pi]$,
bounded above and bounded away from zero, with first derivative $f^{*\prime}(0) = 0$ and second
and third derivatives bounded in a neighborhood of zero. The function $f^{*}(\omega)$ endows
the model (2.41) with a short-term correlation structure which is free of any
parametrically imposed constraints. For this reason the semi-parametric model in (2.41) may
be preferable to the assumption that the time series obeys an ARFIMA($p, d, q$) process
with $p$ and $q$ finite, either known or unknown, as in the ARFIMA($p, d, q$) model
discussed above. Note that the ARFIMA model is a special case of (2.41)
that can be obtained by assuming $f^{*}(\omega)$ to be the spectral density of a stationary
invertible ARMA($p, q$) process as in (2.12). The long memory parameter $d$ can be
estimated semi-parametrically based on the first $m$ periodogram ordinates

\[ I_j = \frac{1}{2\pi T} \left| \sum_{t=0}^{T-1} y_t \exp(i\omega_j t) \right|^2, \quad j = 1, \dots, m, \tag{2.42} \]

where $\omega_j = 2\pi j / T$ and $m$ is a positive integer. The semi-parametric estimator, which
is also known as the GPH estimator in the literature, is given by $-\tfrac{1}{2}$ times the least
squares estimate of the slope parameter in an ordinary linear regression of $\{\log I_j\}_{j=1}^{m}$
on the explanatory variable

\[ x_j = \log|1 - \exp(-i\omega_j)| = \log\left|2\sin\left(\frac{\omega_j}{2}\right)\right|, \]

together with a constant term. Therefore the GPH estimator can be written as

\[ \hat{d}_{GPH} = \frac{-0.5 \sum_{j=1}^{m} (x_j - \bar{x}) \log I_j}{\sum_{k=1}^{m} (x_k - \bar{x})^2}, \tag{2.43} \]

where $\bar{x} = \frac{1}{m}\sum_{j=1}^{m} x_j$. The GPH estimator can be motivated heuristically by noting
that

\[ \log I_j = (\log f_0^{*} - C) - 2 d x_j + \log\frac{f_j^{*}}{f_0^{*}} + \epsilon_j, \]

where $\epsilon_j = \log(I_j / f_j) + C$, with $f_j = f(\omega_j)$, $f_j^{*} = f^{*}(\omega_j)$, and $C = 0.577215\ldots$ is
Euler's constant. It is assumed that $m \to \infty$, so that the variance of $\hat{d}_{GPH}$ will
decrease to zero as $T \to \infty$, and also that $m/T \to 0$, so that the bias due to the
non-constancy of $\log(f_j^{*}/f_0^{*})$ will tend to zero.

Although the GPH estimator is widely used in practice, its consistency for all
$d \in (-0.5, 0.5)$ and asymptotic normality have only recently been proved by Hurvich
et al. (1998). Robinson (1995) did prove consistency and asymptotic normality for
a modified regression estimator which regresses $\{\log I_j\}_{j=l+1}^{m}$ on $\{x_j\}_{j=l+1}^{m}$, where $l$
is a lower truncation point which tends to infinity more slowly than $m$. However,
simulations (e.g. Hurvich and Beltrao, 1994) indicate that the modified estimator is
typically outperformed in finite samples by the GPH estimator itself. The reason is
that any bias reduction resulting from omission of the first $l$ periodogram ordinates
from the regression is more than offset by inflation of the variance (see Hurvich and
Beltrao, 1994). Hurvich et al. (1998) show that the optimal choice of $m$ (in the sense that it
minimizes the theoretical mean squared error of the GPH estimator) is
of order $O(T^{4/5})$. They present simulation results to assess the accuracy of
their asymptotic theory on the mean squared error for finite sample sizes. Their
findings indicate that the choice $m = T^{1/2}$, originally suggested by Geweke and
Porter-Hudak (1983) and used extensively in the empirical literature, can lead to
performance which is markedly inferior to that of the asymptotically optimal choice in
reasonably small samples.
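The regression in (2.42) and (2.43) is short enough to write out in full. The following is a minimal sketch using only the Python standard library; the function names are assumptions, the default bandwidth $m = T^{1/2}$ is only one of the choices discussed above, and the direct evaluation of the periodogram is adequate for small $m$ but is not meant as an efficient implementation.

```python
import cmath
import math

def periodogram(y, m):
    """First m periodogram ordinates I_j at omega_j = 2*pi*j/T, as in (2.42)."""
    T = len(y)
    ordinates = []
    for j in range(1, m + 1):
        w = 2.0 * math.pi * j / T
        s = sum(y[t] * cmath.exp(1j * w * t) for t in range(T))
        ordinates.append(abs(s) ** 2 / (2.0 * math.pi * T))
    return ordinates

def gph_from_periodogram(I, T):
    """GPH estimate (2.43): -1/2 times the OLS slope of log I_j on x_j."""
    m = len(I)
    x = [math.log(abs(2.0 * math.sin(math.pi * j / T))) for j in range(1, m + 1)]
    x_bar = sum(x) / m
    num = sum((x[j] - x_bar) * math.log(I[j]) for j in range(m))
    den = sum((x[j] - x_bar) ** 2 for j in range(m))
    return -0.5 * num / den

def gph_estimate(y, m=None):
    """GPH estimate of d from a series y; m defaults to floor(sqrt(T))."""
    T = len(y)
    if m is None:
        m = int(math.sqrt(T))
    return gph_from_periodogram(periodogram(y, m), T)
```

Separating `gph_from_periodogram` from `periodogram` makes it easy to experiment with different bandwidths $m$ without recomputing the ordinates.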

GPH estimation only allows one to estimate the long memory parameter. In
a parametric model, such as the ARFIMA($p, d, q$) model given in (2.6), all
of the other parameters (i.e. the ARMA parameters, the variance, and the mean)
can in principle be estimated in a second step by any appropriate method, such as
maximum likelihood, once the series is filtered using the estimate of the long memory
parameter, $\hat{d}_{GPH}$. A problem with this two-step approach is that the sampling
distribution of the resulting estimators is not yet known. The problem may be much more serious in
models with GARCH or FIGARCH effects in the conditional variance of the process.
Moreover, there is some evidence that in the case of autocorrelated disturbances the
GPH estimator may have serious biases; see for instance Agiakloglu, Newbold, and
Wohar (1993). The next subsection discusses methods that jointly estimate the long
memory parameter and the ARMA parameters.

2.5.2 Parametric Methods: Approximate Maximum Likelihood

If one is only interested in detecting the presence of long memory in a time series,
the GPH estimator may provide sufficient information.
If, on the other hand, one needs to understand both the
short run and long run dynamics of a time series, and to use the model for describing
the dynamic structure of the series and/or for forecasting purposes,
the GPH estimator will not reveal anything about the short term properties
of the process. Methods which allow one to model the whole autocorrelation
structure, or, equivalently, the whole spectral density at all frequencies, have to be used
to characterize the short-run behavior of the series. One such approach is to use
parametric models, such as the ARFIMA model in (2.6), and to estimate the parameters, for
example, by maximizing the likelihood. One such method is the exact maximum
likelihood estimator (MLE) of the ARFIMA($p, d, q$) model under the assumption that
$u_t$ is normally distributed. The exact MLE for the ARFIMA($p, d, q$) model is
developed in Sowell (1992). Given the ARFIMA($p, d, q$) process in (2.6), the log-likelihood
function is

\[ \ell(y; \varphi) = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\log\det\Sigma(\varphi) - \frac{1}{2}\, y'\Sigma(\varphi)^{-1} y, \tag{2.44} \]

where $\Sigma$ is the variance-covariance matrix whose $(i, j)$th element is given by $\Sigma_{ij} =
\gamma_{|i-j|}$, $y$ is the $T$-dimensional vector of observations on the process $y_t$, and $\varphi =
(d, \phi_1, \dots, \phi_p, \theta_1, \dots, \theta_q, \sigma_u^2)'$ is the parameter vector of the ARFIMA($p, d, q$) model
with known mean $\mu$. The exact MLE of $\varphi$ is obtained by maximizing (2.44) with
respect to the $k = p + q + 2$ dimensional parameter vector. The consistency and
asymptotic normality of the exact MLE of the ARFIMA($p, d, q$) model are established in Yajima
(1985) and Dahlhaus (1989) for Gaussian long memory processes. Although the exact
MLE of $\varphi$ can in principle be obtained by this procedure, in practice it
has serious computational problems. The exact MLE requires the inversion of a $T \times T$
matrix of nonlinear hypergeometric functions at each iteration of the maximization
of the likelihood. To solve the computational problem, an alternative approach is to
maximize an approximation to the likelihood function. There are several alternative
approximate MLEs of the ARFIMA($p, d, q$) model under normality of the disturbances.
Two such approximate MLEs that are widely used in empirical work are discussed
here.

2.5.3 Whittle’s approximate MLE

The two terms in (2.44) that depend on the parameter vector $\varphi$ are the logarithm
of the determinant of the covariance matrix,

\[ \log\det\Sigma(\varphi), \]

and the quadratic form

\[ y'\Sigma(\varphi)^{-1} y. \]

Whittle's approximate MLE uses approximations for these terms in the
log-likelihood function. In particular, for large $T$ the first term is approximated by

\[ \log\det\Sigma(\varphi) \approx \sum_{j=1}^{T-1} \log[(2\pi) f(\omega_j; \varphi)], \]

and the second term is approximated by $\sum_{j=1}^{T-1} I(\omega_j)/f(\omega_j; \varphi)$. Then the approximate
log-likelihood is, up to sign and scale,

\[ \ell_W = \sum_{j=1}^{T-1} \log[(2\pi) f(\omega_j; \varphi)] + \sum_{j=1}^{T-1} \frac{I(\omega_j)}{f(\omega_j; \varphi)}, \tag{2.45} \]

where $\omega_j = 2\pi j / T$, $j = 1, \dots, T - 1$, and $f(\cdot)$ is the spectral density. An alternative approximate
MLE is given by Fox and Taqqu (1986), which numerically minimizes the quantity

\[ \sum_{j=1}^{m} \frac{I(\omega_j)}{f(\omega_j; \varphi)}, \tag{2.46} \]

where $m$ is the number of frequencies used. For a detailed discussion of Whittle's
approximate MLE see Beran (1994) and the references therein.
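To make the Whittle objective concrete, the sketch below evaluates (2.45) for a pure fractional noise model, ARFIMA(0, $d$, 0), and minimizes it over a grid of $d$ values. The spectral density $f(\omega; d) = (\sigma^2/2\pi)|2\sin(\omega/2)|^{-2d}$ of that special case, the known innovation variance, the grid search and all function names are assumptions made for this illustration; practical implementations use a numerical optimizer and also estimate the remaining parameters.

```python
import math

def fi_spectral_density(w, d, sigma2=1.0):
    """Spectral density of ARFIMA(0, d, 0): sigma2 / (2*pi) * |2 sin(w/2)|^(-2d)."""
    return sigma2 / (2.0 * math.pi) * abs(2.0 * math.sin(w / 2.0)) ** (-2.0 * d)

def whittle_objective(I, d, sigma2=1.0):
    """Objective (2.45); I[j-1] is the periodogram at omega_j = 2*pi*j/T, j = 1..T-1."""
    T = len(I) + 1
    total = 0.0
    for j, I_j in enumerate(I, start=1):
        f_j = fi_spectral_density(2.0 * math.pi * j / T, d, sigma2)
        total += math.log(2.0 * math.pi * f_j) + I_j / f_j
    return total

def whittle_d_grid(I, sigma2=1.0, step=0.01):
    """Grid-search minimizer of the Whittle objective over d in (-0.5, 0.5)."""
    n = int(round(0.49 / step))
    grid = [k * step for k in range(-n, n + 1)]
    return min(grid, key=lambda d: whittle_objective(I, d, sigma2))
```

Each term of the objective, viewed as a function of $f(\omega_j; \varphi)$, is smallest when the model spectral density equals the periodogram ordinate, which is why minimizing the sum pulls the fitted spectrum toward the data at all frequencies.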

2.5.4 Approximate MLE in the time domain

In this subsection the estimation of long memory models will be discussed within
the context of both the ARFIMA($P, d, Q$) model for the conditional mean process
and the FIGARCH($p, \delta, q$) model for the conditional volatility. The setup of the
technique is general enough to cover both types of long memory processes and the dual
long memory model ARFIMA($P, d, Q$)-FIGARCH($p, \delta, q$). To this end, general
principles are discussed first, and some remarks on specific models are then given.

Consider the ARFIMA($P, d, Q$)-FIGARCH($p, \delta, q$) model given in (2.40). Under
the assumption that the disturbances are conditionally normally distributed, the
conditional log-likelihood can be written in the time domain as

\[ \ell(u_1, \dots, u_T; \varphi) = -\frac{T}{2}\ln 2\pi - \frac{1}{2}\sum_{t=1}^{T}\left[\ln h_t + \frac{u_t^2}{h_t}\right], \tag{2.47} \]

t=1

where $\varphi' = (\mu, \Phi_1, \dots, \Phi_P, \Theta_1, \dots, \Theta_Q, \omega, \delta, \beta_1, \dots, \beta_p, \phi_1, \dots, \phi_q)$. Since conditional
normality of $u_t$ is often not a very realistic assumption for many economic and financial
time series, the resulting model fails to capture the kurtosis in the data. Instead,
following Bollerslev (1987), one sometimes assumes that $z_t$ is drawn from a (standardized)
Student-t distribution. Note that the standardized Student-t density with
$\nu$ degrees of freedom is

\[ f(z_t) = \frac{\Gamma((\nu + 1)/2)}{\sqrt{\pi(\nu - 2)}\,\Gamma(\nu/2)} \left(1 + \frac{z_t^2}{\nu - 2}\right)^{-(\nu + 1)/2}. \]

The Student-t distribution is symmetric around zero (and thus $E[z_t] = 0$), and it
converges to the normal distribution as the number of degrees of freedom $\nu$ becomes
larger. A further characteristic of the Student-t distribution is that only moments
of order less than $\nu$ exist. Hence, for $\nu > 4$, the fourth moment of $z_t$ exists and is equal
to $3(\nu - 2)/(\nu - 4)$. As this is larger than the normal value of 3, the unconditional
kurtosis of $u_t$ will also be larger than in the case where $z_t$ follows a normal
distribution. The number of degrees of freedom of the Student-t distribution can
be estimated along with the other parameters of the model. Indeed, any other
distribution can be assumed. The parameters of the model under consideration
can then be estimated by maximizing the log-likelihood corresponding to this
particular distribution. As one can never be sure that the specified distribution of the
disturbances is the correct one, an alternative approach is to ignore the problem and
base the likelihood on the normal distribution as in (2.47). This method is usually
referred to as quasi-maximum likelihood estimation (QMLE). In general, the
resulting estimates are still consistent and asymptotically normal, provided that the
models for the conditional mean and conditional variance are correctly specified. Li
and McLeod (1986) have shown the consistency and asymptotic normality of the QMLE
for the ARFIMA($P, d, Q$)-homoscedastic model with mean $\mu$ either known or zero.
Dahlhaus (1988, 1989) and Moehring (1990) showed the same result with $\mu$ unknown.
In particular, they show that the parameter estimates in the ARFIMA model with
homoscedastic disturbances are asymptotically normal, with the ARFIMA parameter
estimates being $T^{1/2}$ consistent while the QMLE of $\mu$ is $T^{1/2-d}$ consistent. For
the conditional variance process, asymptotic normality and consistency have only
been shown in specific cases. Weiss (1984, 1986) has demonstrated consistency and
asymptotic normality for the QMLE of the ARCH($q$) model as in (2.24), while Bollerslev and
Wooldridge (1992), Lee and Hansen (1994) and Lumsdaine (1996) have obtained the
same result where $h_t$ follows a GARCH(1, 1) process, under varying assumptions on the
properties of $z_t$. Lumsdaine (1996) also showed consistency and asymptotic normality
for the QMLE of the IGARCH(1, 1) model. While simulation experiments for FIGARCH
processes in Baillie and Bollerslev (1996) indicate consistency and asymptotic
normality of the QMLE, a fully general theoretical treatment is not yet available. In the
case of the more general ARFIMA-GARCH and ARFIMA-FIGARCH models, Baillie,
Chung, and Tieslau (1996) and Baillie, Han, and Kwon (2001) provide simulation
evidence that the QMLE is consistent and asymptotically normal.
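The two likelihood ingredients discussed above can be written down compactly. The sketch below evaluates the Gaussian (quasi-)log-likelihood (2.47) for the special case where $h_t$ follows a GARCH(1, 1) recursion, and the standardized Student-t density; the restriction to GARCH(1, 1), the treatment of the first conditional variance as a given starting value `h0`, and the function names are assumptions made for this example.

```python
import math

def garch11_gaussian_loglik(u, omega, alpha1, beta1, h0):
    """Gaussian log-likelihood (2.47) with h_1 = h0 and a GARCH(1, 1) recursion for h_t."""
    loglik = 0.0
    h = h0
    for t, x in enumerate(u):
        if t > 0:
            h = omega + alpha1 * u[t - 1] ** 2 + beta1 * h  # update conditional variance
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(h) + x * x / h)
    return loglik

def standardized_t_pdf(z, nu):
    """Density of a Student-t variate rescaled to unit variance (requires nu > 2)."""
    log_c = math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
    c = math.exp(log_c) / math.sqrt(math.pi * (nu - 2.0))
    return c * (1.0 + z * z / (nu - 2.0)) ** (-(nu + 1.0) / 2.0)
```

A Student-t likelihood would replace the Gaussian term for each observation by `math.log(standardized_t_pdf(x / math.sqrt(h), nu)) - 0.5 * math.log(h)`, with $\nu$ treated as an additional parameter.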

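As the true distribution of $z_t$ is not assumed to be the same as the normal
distribution which is used to construct the likelihood function, the standard errors of the
parameters have to be adjusted accordingly. In particular, the asymptotic covariance
matrix of $D_T(\hat{\varphi} - \varphi_0)$ is equal to

\[ D_T^{-1} A(\varphi_0)^{-1} B(\varphi_0) A(\varphi_0)^{-1} D_T^{-1}, \tag{2.48} \]

where $A(\cdot)$ is the Hessian, i.e. the negative of the matrix of second-order partial
derivatives of the log-likelihood function with respect to the parameters in the model,
$H(\varphi) \equiv -\partial^2 \ell(u_1, \dots, u_T; \varphi)/\partial\varphi\,\partial\varphi'$, $B(\cdot)$ is the expected value of the outer product
of the gradient, and $D_T$ is a diagonal matrix with $\mathrm{diag}(D_T) = [T^{1/2-d}, T^{1/2}, \dots, T^{1/2}]$.
The matrices $A(\cdot)$ and $B(\cdot)$ can be consistently estimated by their sample analogs, namely,

\[ A_T(\hat{\varphi}) = -\frac{1}{T}\sum_{t=1}^{T} \frac{\partial^2 \ell_t(\hat{\varphi})}{\partial\varphi\,\partial\varphi'} \]

and

\[ B_T(\hat{\varphi}) = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial \ell_t(\hat{\varphi})}{\partial\varphi}\, \frac{\partial \ell_t(\hat{\varphi})}{\partial\varphi'}, \]

where $\ell_t$ denotes the contribution of observation $t$ to the log-likelihood.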

As the first order conditions in the maximization of the log-likelihood are nonlinear
functions of the parameters of the models discussed here, an iterative optimization
procedure has to be used to obtain the MLE $\hat{\varphi}$. The most frequently used iterative
optimization procedures typically require
the existence of first and second order derivatives of the log-likelihood function with
respect to $\varphi$, that is, the score $S(\varphi) \equiv \partial\ell/\partial\varphi$ and the Hessian matrix $H(\varphi)$ defined above.
For example, the iterations in the well known Newton-Raphson method take the form

\[ \hat{\varphi}^{k} = \hat{\varphi}^{k-1} + \lambda \left(\sum_{t=1}^{T} H_t(\hat{\varphi}^{k-1})\right)^{-1} \sum_{t=1}^{T} S_t(\hat{\varphi}^{k-1}), \tag{2.49} \]

where $\hat{\varphi}^{k}$ is the estimate of the parameter vector obtained in the $k$th iteration and
the scalar $\lambda$ denotes a step size. In the BHHH algorithm, which is by far the most
popular method to estimate GARCH and FIGARCH models, the Hessian $H_t(\hat{\varphi}^{k-1})$ in
(2.49) is replaced by the outer product of the gradient matrix $B_t(\hat{\varphi}^{k-1})$ as given above.
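The mechanics of a BHHH iteration can be illustrated on a deliberately simple likelihood. The sketch below applies the update to an i.i.d. normal sample with unknown mean and variance, a toy model chosen only because its per-observation scores are available in closed form; it is not one of the GARCH or FIGARCH likelihoods discussed in the text. The step-halving rule used to choose $\lambda$, the stopping tolerance and all function names are assumptions.

```python
import math
import random

def normal_loglik(data, mu, s2):
    """Gaussian log-likelihood of an i.i.d. sample with mean mu and variance s2."""
    return sum(-0.5 * (math.log(2.0 * math.pi * s2) + (x - mu) ** 2 / s2) for x in data)

def bhhh_normal(data, mu, s2, max_iter=500, tol=1e-12):
    """BHHH iterations for (mu, s2): the Hessian is replaced by the sum of score outer products."""
    for _ in range(max_iter):
        g1 = g2 = b11 = b12 = b22 = 0.0
        for x in data:
            s_mu = (x - mu) / s2                            # score with respect to mu
            s_s2 = ((x - mu) ** 2 - s2) / (2.0 * s2 * s2)   # score with respect to s2
            g1 += s_mu
            g2 += s_s2
            b11 += s_mu * s_mu
            b12 += s_mu * s_s2
            b22 += s_s2 * s_s2
        det = b11 * b22 - b12 * b12
        step_mu = (b22 * g1 - b12 * g2) / det   # first element of B^{-1} g
        step_s2 = (b11 * g2 - b12 * g1) / det   # second element of B^{-1} g
        ll_old = normal_loglik(data, mu, s2)
        lam = 1.0
        accepted = False
        while lam > 1e-10:                      # halve the step until the likelihood improves
            new_mu = mu + lam * step_mu
            new_s2 = s2 + lam * step_s2
            if new_s2 > 0.0 and normal_loglik(data, new_mu, new_s2) >= ll_old:
                accepted = True
                break
            lam *= 0.5
        if not accepted:
            break
        moved = abs(new_mu - mu) + abs(new_s2 - s2)
        mu, s2 = new_mu, new_s2
        if moved < tol:
            break
    return mu, s2
```

Because the outer-product matrix is positive semi-definite by construction, the BHHH direction is an ascent direction whenever that matrix is nonsingular, which is one reason the method is convenient when analytical second derivatives are tedious to obtain.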

2.6 Conclusion

This chapter provided a concise review of the long memory models for the
conditional mean and variance of a time series. In particular, the ARFIMA($p, d, q$) model
for the conditional mean of a time series and the GARCH($p, q$) and FIGARCH($p, \delta, q$)
models for the conditional variance are discussed. The discussion is cast in terms of
the properties of the models and their estimation. Chapters 4 and 5 of the
dissertation include applications of these models in commodity and stock markets.


BIBLIOGRAPHY

[1] Agiakloglu, C., P. Newbold and M. Wohar (1993), Bias in an estimator of the
fractional difference parameter, Journal of Time Series Analysis 14, 235-246.

[2] Andersen, T. G. and T. Bollerslev (1997), Heterogeneous information arrivals and
return volatility dynamics: Uncovering the long-run in high frequency returns,
Journal of Finance 52(3), 975-1005.

[3] Aydogan, K. and G. G. Booth (1988), Are there long cycles in common stock returns?,
Southern Economic Journal 55, 141-149.

[4] Baillie, R. T. (1996), Long memory processes and fractional integration in
econometrics, Journal of Econometrics 73, 5-59.

[5] Baillie, R. T. (1998), Comment, Journal of Business & Economic Statistics 16,
273-276.

[6] Baillie, R. T., T. Bollerslev, and H. O. Mikkelsen (1996), Fractionally integrated
generalized autoregressive conditional heteroscedasticity, Journal of
Econometrics 74, 3-30.

[7] Baillie, R. T., C.-F. Chung, and M. A. Tieslau (1996), Analyzing inflation by the
fractionally integrated ARFIMA-GARCH model, Journal of Applied
Econometrics 11, 23-40.

[8] Baillie, R. T., Y. W. Han, and Tae-Go Kwon (2001), Further long memory
properties of inflationary shocks, forthcoming, Southern Economic Journal.

[9] Baillie, R. T., A. A. Cecen, and Y.-W. Han (2001), High frequency Deutsche mark-
US dollar returns: FIGARCH representations and non linearities, Manuscript,
Department of Economics, Michigan State University, East Lansing, MI.

[10] Beran, J. (1994), Statistics for Long-Memory Processes, Chapman & Hall.

[11] Black, F. (1976), The pricing of commodity contracts, Journal of Financial
Economics 3, 167-179.

[12] Bollerslev, T. (1986), Generalized autoregressive conditional heteroskedasticity,
Journal of Econometrics 31, 307-327.

[13] Bollerslev, T. (1987), A conditionally heteroscedastic time series model for
speculative prices and rates of return, Review of Economics and Statistics 69, 542-547.

[14] Bollerslev, T. (1988), On the correlation structure of the generalized
autoregressive conditional heteroscedastic process, Journal of Time Series Analysis 9,
121-131.

[15] Bollerslev, T. and J. M. Wooldridge ( 1992), Quasi-maximum likelihood estima-
tion and inference in dynamic models with time varying covariances, Econometric
Reviews 11, 143-172.

[16] Bollerslev, T. and H.O.A. Mikkelsen (1996), Modeling and pricing long memory
in stock market volatility, Journal of Econometrics 73, 151-184.

[17] Chambers, M. (1998), Long memory and aggregation in macroeconomic time
series, International Economic Review 39, 1053—1072.

[18] Cheung, Y. and Lai, K. (1995), A search for long memory in international stock
market returns, Journal of Internationtal Money and Finance, 14, 597-615.

[19] Cioczek-Georges, R., and B. B. Mandelbrot (1995), A class of micropulses and
antipersistent fractional brownian motion, Stochastic Processes and Their Ap-
plications 60, 1-18.

[20] Chung, C-F. and RT. Baillie, 1993, Small sample bias in conditional sum of
squares estimators of fractionally integrated ARMA models, Empirical Eco-
nomics 18, 791-806.

[21] Chung, C-F., (1994), A note on calculating the autocovariances of the fractionally
integrated ARMA models, Economic Letters, Economic Letters 45, 293-297.

[22] Crato, N. (1994), Some international evidence regarding the stochastic behavior
of stock returns, Applied Financial Economics, 4, 33-9.

[23] Dacorogna, M. M., U. A. Muller, R. J. Naglar, R. B. Olsen, and O. V. Pictet
( 1993), A geographical model for the daily and weekly seasonal volatility inthe
foreign exchange markets, Journal of International Money and Finance 12, 413—
438.

85

[24] Dahlhaus, R. (1989), Efficent parameter estimation for self-similar processes,
Annals of Statistics 17 , 1749-1766.

[25] Diebold, F. X., and A. Inoue (2001), Long memory and regime switching, Journal
of Econometrics 105, 131-159.

[26] Ding, Z., C.W.J. Granger, and RF. Engle (1993), A long memory property of
stock returns and a new model, Journal of Empirical Finance, 1, 83-106.

[27] Ding, Z. and C. W. J. Granger ( 1996), Modeling volatility persistence of specu-
lative returns: a new approach, Journal of Econometrics 73, 185-215.

[28] Engle, R. F. ( 1982), Autoregressive conditionally heteroscedasticity with esti-
mates of the variance of United Kingdom inﬂation, Econmetrica 50, 987-1007.

[29] Fama, E. F. (1965), The behavior of stock market prices, Journal of Business
38, 34-105.

[30] Fox, R., and Taqqu, M. S. (1986), Large sample properties of parameter estimates
for strongly dependent stationary Gaussian time series, Annals of Statistics, 14,
517-532.

[31] Granger, C., (1980), Long Memory Relationships and the Aggregation of Dy-
namic Models, Journal of Econometrics, 14, 227-238.

[32] Granger, C., and R. Joyeux (1980), An Introduction to Long Memory Time
Models and Fractional Differencing, Journal of Time Series Analysis, 1, 15-29.

[33] Green, M. T., and Fieltz B. D. (1977), Long-term Dependence in Common Stock
Returns, Journal of Financial Economics, 4, 339-349.

[34] Gweke, J. and Portar-Hudak, S. (1983), The estimation and application of long
memory time series models, Journal of Time Series Analysis 4, 221-238.

[35] Hamilton, J. (1994), Time Series Analysis, Princeton, New Jersey: Princeton
University Press.

[36] Hosking, J. (1981), Fractional Differencing, Biometrika, 68, 165—176.

[37] Hosking, J. R. M. (1984), Modelling persistence in hydrological time series using
fractional differencing, Water Resources Research 20, 1898-1908.

[38] Hurst, H. (1951), Long Term Storage Capacity of Reservoirs, Transactions of the
American Society of Civil Engineers, 116, 770-799.

86

[39] Hyde, C. C., and Y. Yang (1997), On deﬁning long range dependence, Journal
of Applied Probability 34, 939-944.

[40] Hurvich, C. M. and Beltrao, K. I. (1994), Automatic semiparametric estimation
of the parameter of a long memory time series, Journal of Time Series Analysis,
15, 285-302.

[41] Hurvich, C.M., Deo, R. and Brodsky, J. (1998), The mean squared error of Gweke
and Portar-Hudak’s estimator of the long memory parameter of a long-memory
time series, Journal of Time Series Analysis 19, 19—46.

[42] Lee, S. W. and B. E. Hansen (1994), Asymptotic theory for the GARCH(1, 1)
quasi-maximum likelihood estimator, Econometric Theory 10, 29-52.

[43] Lumsdaine, R. L. ( 1996), Consistency and asymptotic normality of quasi-
maximum likelihood estimator in IGARCH(1,1) and covariance stationary
GARCH(1,1) models, Econometrica 64, 575-596.

[44] Lippi, M., and P. Zaffaroni (1999), Contemporaneous aggregation of linear dy-
namic models, Manuscript, Research Department, Bank of Italy.

[45] Lo, A. W. (1991), Long-term memory in stock market prices, Econometrica, 59,
1279-313.

[46] Lobato, I. N., and Savin, N. E. (1998), Real and spurious long-memory proper-
ties of stock-market data, (with discusssion), Journal of Business 62 Economic
Statistics,, 16, 261-283.

[47] Mandelbrot, B., (1963a) New methods in statistical economics, Journal of P0-
litical Economy 71, 421-40.

[48] Mandelbrot, B. (1963b) The variation of certain speculative prices, Journal of
Business 36, 394-419.

[49] Mandelbrot, B. B. (1971), When can price be arbitraged efficiently? A limit to
the validity of the random walk and martingale models, Review of Economics
and Statistics, 53, 225-36.

[50] McLeod, A. 1., and K. W. Hipel (1978), Preservation of the rescaled adjusted
range, Water Resources Research 14, 491-518.

[51] Parke, W. R. (1999), What is fractional integration? Review of Economics and
Statistics 81, 632-638.

87

[52] Robinson, RM. (1990), Time series with strong dependence, Advances in econo-
metrics, 6th world congress, Cambridge University Press, Cambridge.

[53] Robinson, P. M. (1994), Semiparametric analysis of long-memorytime series, The
Annals of Statistics 22, 515-539.

[54] Robinson, P. M. (1995), Log-periodgram regression time series with long-range
dependence Annals of Statistics 23, 1048-72.

[55] Robinson, RM. and F.J. Hidalgo, (1997), Time series regression with long-range
dependence, Annals of Statistics 27, 77-104.

[56] Samarov, A. and MS. Taqqu (1988), On the eﬂicency of the sample mean in
long memory noise, Journal of Time Series Analysis 9, 191-200.

[57] Sowell, F. (1992), Maximum likelihood estimation of stationary univariate frac-
tionally integrated time series models, Journal of Econometrics, 53, 165-188.

[58] Taylor, S. (1986), Modelling Financial Time Series, John Wiley & Sons, New
York.

[59] Taqqu, M. S., W. Willinger, and R. Sherman (1997), Proof of a fundamental
result in self-similar traffic modelling, Computer Communication Review 27, 5-
23.

[60] Yajima, Y. (1985), On estimation of long memory time series models, Australian
Journal of Statistics 27, 303-320.

[61] Yajima, Y. (1991), Asymptotic properties of the LSE in a regression model with
long-memory stationary errors, Annals of Statistics 19, 158.

[62] Whittle, P. (1951) Hypothesis testing in time series analysis, Haftner: New York.

88

Figure 2.1: Sample realizations from ARFIMA(p, d, q) processes. Panels: (a) ARFIMA(0, 0.3, 0),
(b) ARFIMA(0, 0.3, 1), (c) ARFIMA(1, 0.3, 0), (d) ARFIMA(1, 0.3, 1). Horizontal axis in each
panel: Time, 0 to 500.

 

Figure 2.2: Autocorrelations of the sample realizations from ARFIMA(p, d, q) processes.
Panels: (a) ARFIMA(0, 0.3, 0), (b) ARFIMA(0, 0.3, 1), (c) ARFIMA(1, 0.3, 0),
(d) ARFIMA(1, 0.3, 1). Autocorrelations are plotted for lags 0 to 50.

Figure 2.3: Autocorrelations of $u_t^2$ from sample realizations of GARCH(1, 1) and
FIGARCH(1, d, 1) processes. Panels: (a) GARCH(1, 1) with $h_t = 0.001 + 0.2u_{t-1}^2 + 0.7h_{t-1}$,
(b) FIGARCH(1, d, 1). Autocorrelations are plotted for lags 0 to 50.

CHAPTER 3

Persistence and Nonlinearity in
Real Exchange Rates

3.1 Introduction

The purchasing power parity (PPP) condition states that a common basket of
goods quoted in the same currency needs to cost the same in all countries. The
condition rests on the assumption of perfect commodity arbitrage across countries.
Although very few economists would believe that PPP holds true continuously in
the real world, most would believe some form of PPP holds at least as a long-run
relationship. Both traditional and new open economy macroeconomics based on
intertemporal optimizing models assume some variant of PPP (Obstfeld and Rogoff,
1996). Apart from a constant term reflecting differences in units of measurement,
the real exchange rate is defined as the deviation from PPP,

$$q_t = s_t - (p_t - p_t^*), \tag{3.1}$$

where $s_t$ is the logarithm of the nominal exchange rate observed at time $t$, and $p_t$
and $p_t^*$ are the logarithms of the domestic and foreign price levels, respectively. A
necessary condition for PPP to hold in the long run is that the real exchange rate
is stationary, that is, not driven by permanent shocks.

Previous results from many single equation unit root tests indicate that the unit
root hypothesis in real exchange rates cannot be rejected in data from the free-floating
period. Similarly, there is an absence of cointegration between nominal exchange rates
and relative price levels; see Froot and Rogoff (1996) and Rogoff (1996) for recent
surveys. Only in data going back to 1900 or earlier is there evidence that real exchange
rates are stationary; see, for instance, Diebold et al. (1991). To overturn this somewhat
puzzling empirical evidence, Pedroni (1995), Frankel and Rose (1996), Oh (1996), Wu (1996)
and Lothian (1997), among others, applied panel data variants of standard unit root
and cointegration tests. The idea behind these studies is to increase the power of the
tests by increasing the sample size. These studies report evidence of mean reversion
in real exchange rates for the floating era. One important critique of the panel data
methods came from O’Connell (1998a). O’Connell’s criticism centers on the failure of
the panel data tests to control for cross-sectional dependence in the data. He finds no
evidence against the unit root in real exchange rate data for several countries when
cross-sectional dependencies are taken into account. As noted by Rogoff (1996), the
results of panel data and long-span studies seem to indicate that the half-life of
deviations from PPP is about three to five years. Since it is hard to believe that real
shocks account for the majority of the short run volatility of real exchange rates, and
it is intuitive to think that nominal shocks can have strong effects only over a time
period in which nominal wages and prices are sticky, the apparent persistence
of real exchange rates is puzzling, even if real exchange rates are mean reverting.
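
To give a sense of the persistence implied by a half-life of three to five years, the following sketch converts between a half-life and an autoregressive coefficient. It assumes, purely for illustration, that deviations from PPP follow a linear AR(1) process; the function names are hypothetical.

```python
import math


def half_life(rho):
    """Periods needed for half of a shock to die out in q_t = rho * q_{t-1} + u_t."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(rho)


def implied_rho(half_life_periods):
    """AR(1) coefficient consistent with a given half-life (in periods)."""
    return 0.5 ** (1.0 / half_life_periods)


# Half-lives of three and five years expressed in months.
for years in (3, 5):
    print(years, round(implied_rho(12 * years), 4))
```

Under this assumption, half-lives of three to five years correspond to monthly autoregressive coefficients between roughly 0.98 and 0.99, which is very close to a unit root.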

A recent strand of literature stresses the importance of allowing for market
imperfections in understanding the persistence in the adjustment of real exchange rates
towards their long run equilibrium. General equilibrium models of real exchange rate
determination developed in Dumas (1992) and in Sercu et al. (1995) take transaction
costs into account and show that the adjustment of real exchange rates toward
PPP is a nonlinear process. In these models, transaction costs create a band of
inaction within which international price differentials are not arbitraged away, as only
the price differentials exceeding transaction costs (outside the band) are profitable to
arbitrage away. Therefore, the presence of transaction costs leads to the notion of
different regimes in real exchange rates. In particular, the profits from commodity
arbitrage, which is generally thought to be the ultimate force behind maintaining PPP,
do not make up for the costs involved in the necessary transactions for small
deviations from the equilibrium value. This means that there may exist a band around the
equilibrium rate in which there is no tendency for the real exchange rate to revert to
its equilibrium value. Whenever the rate is outside the band that is specified by the
relevant costs, arbitrage becomes profitable, which in turn forces the real exchange rate
back towards the band.

Several studies have tested and modelled the implications of transaction costs in
real exchange rates. Michael et al. (1997) use a long span of annual data as well as
monthly data for the interwar period and report statistically significant evidence of
nonlinearity in the adjustment of real exchange rates. Sarantis (1999) and Sarno
(2000) reject linearity for several effective and bilateral real exchange rates,
respectively, for a group of industrial countries over the floating period. Baum et al. (2001)
fit Exponential Smooth Transition Autoregressive (ESTAR) models to deviations
from PPP which are obtained using the Johansen cointegration method on nominal
exchange rates and home and foreign price levels. Taylor et al. (2001) report supportive
evidence that the speed of convergence of real exchange rates towards their long run
equilibrium increases with the size of the PPP deviation over the floating period for
a number of US Dollar real exchange rates. On the other hand, O’Connell (1998b)
finds large deviations from PPP to be at least as persistent as small deviations.

The results of the literature seem to be unsettled and contentious in explaining
the puzzling behavior of real exchange rates. Although findings from the more recent
studies that take nonlinearities into account are promising, there are certain issues
that need to be investigated in judging the empirical success of these studies. Michael
et al. (1997) and Baum et al. (2001) test for cointegration in PPP, and subsequently
apply the ESTAR model to the residuals from the cointegration relationship to
analyze the adjustment process towards PPP. This approach may be questionable on the
ground that if the residuals of the PPP relationship follow a nonlinear process, the
validity of the linear cointegration tests and the interpretation of these residuals are doubtful.
Moreover, the concept of equilibrium in nonlinear models may be different from that
of linear models. To avoid these problems, this chapter applies the Smooth Transition
Autoregressive (STAR) models directly to the real exchange rate and then
investigates the dynamic properties of the exchange rate process using well established
statistical methods. Note also that the theoretical models in Dumas (1992) and in Sercu
et al. (1995) analyze directly the dynamic behavior of the real exchange rate process
rather than the residuals that are obtained from a cointegration regression. Taylor
et al. (2001) fit ESTAR models to the log real exchange rates, and then test whether
any nonlinearity remains unmodelled. The problem with their approach
is that the testing procedure in Taylor et al. (2001) departs from the original PPP
framework by calling on further economic information about the other real exchange
rates in the testing step, while this additional information is left aside
in the univariate estimation of the ESTAR models for each real exchange rate. For this
reason, the stationarity evidence provided by their panel data tests may not
carry over to the univariate real exchange rates. If real exchange rates are nonstationary in
the sample, then the results of their specification tests may also be questionable, as
these tests are based on the assumption of stationary residuals. Moreover, since the
transition variable used in their study was the lagged log real exchange rate, if the
real exchange rates were nonstationary in their sample, then the process has a certain
probability of being absorbed into a single regime. This in turn may invalidate the
inference in the other regime.

Given the concerns discussed above, the purpose of the present chapter is twofold:
first, to reinvestigate more rigorously the threshold type nonlinear behavior in real
exchange rates; second, to analyze carefully the persistence/mean reverting nature of
real exchange rates when a nonlinearity of threshold form is allowed. More precisely,
this chapter attempts to address the question: to what extent does the presence of
threshold dynamics in the real exchange rate resolve the puzzling evidence from unit
root tests? To this end, this chapter carefully tests for the presence of threshold type
nonlinearities. Three different forms of nonlinearity tests and their robust variants
that take possible heteroscedasticity and outliers into consideration are applied. In
addition to standard residual diagnostics, newly developed specification tests due
to Eitrheim and Teräsvirta (1996) and van Dijk and Franses (1999), and generalized
impulse response functions, developed by Koop et al. (1996), are used as diagnostic
tools to better evaluate the estimated models. The results of linearity tests and
estimated STAR models provide evidence on the presence of threshold behavior in
real exchange rates for several currencies, but with the caveat that real exchange rates
are still reasonably persistent when far away from PPP. This finding on persistence
is similar to the findings of O’Connell (1998b) but contrary to Taylor et al. (2001),
who employ a similar approach to modeling nonlinearity. The main reason for the
different finding is that this chapter considers the first differences of real exchange
rates, while Taylor et al. (2001) consider the levels. The simulation experiments on
the power/size of the standard unit root and stationarity tests support the findings,
in that these tests have power to detect nonlinear mean reversion in general. Hence,
allowing for transaction costs may not be able to solve the PPP puzzle alone.

The rest of this chapter is structured as follows. Section 3.2 discusses the issues
relating to representation, testing and specification of the STAR model. Section 3.3
discusses nonstationarity and nonlinearity of real exchange rates and presents the
simulation results on the power/size properties of the LM type linearity tests, unit
root and stationarity tests. The data and empirical results are presented in section 3.4.
In section 3.5, the dynamic behavior of real exchange rates is evaluated by analyzing
the characteristic roots in different regimes and by estimating the generalized impulse
response functions from the fitted ESTAR models. Finally, section 3.6 concludes and
discusses the implications of the empirical findings.

3.2 Modelling Nonlinearity by Smooth Transition
Autoregressive Models

The nonlinear dynamic behavior of real exchange rates in this chapter is modelled in
terms of the STAR models that were discussed in chapter 1. In this section, for the sake
of completeness, a brief overview of the model is given. The STAR model for a univariate
time series $y_t$, which is observed at times $t = 1-p, 2-p, \ldots, -1, 0, 1, \ldots, T-1, T$, is
given by

$$\begin{aligned} y_t = {} & (\pi_{1,0} + \pi_{1,1} y_{t-1} + \cdots + \pi_{1,p} y_{t-p})(1 - F(z_t; \gamma, c)) \\ & + (\pi_{2,0} + \pi_{2,1} y_{t-1} + \cdots + \pi_{2,p} y_{t-p}) F(z_t; \gamma, c) + u_t, \end{aligned} \tag{3.2}$$

where $y_t$ is a stationary process and the disturbances $u_t$ form a martingale difference
sequence with respect to the history of the time series up to time $t-1$, which is
denoted by $\Omega_{t-1} = \{y_{t-1}, \ldots, y_{1-p}\}$. This means that $E[u_t | \Omega_{t-1}] = 0$. It is usually
assumed that the conditional variance of $u_t$ is constant, that is, $E[u_t^2 | \Omega_{t-1}] = \sigma^2$. The
transition function $F(z_t; \gamma, c)$ is a continuous function that is bounded between 0 and
1. The transition variable $z_t$ can be a lagged endogenous variable, $z_t = y_{t-d}$ for a
certain integer $d > 0$, as assumed most of the time in empirical studies. As discussed in
chapter 1, the logistic and/or the exponential function are frequently used as transition
functions in empirical studies. Since the STAR models and their specification and
estimation are discussed in chapter 1, only the strategy as applied in this chapter is
briefly discussed here.
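
A minimal sketch of the model in (3.2) may help fix ideas. The two transition functions below are the standard logistic and exponential forms from the STAR literature, which are assumed here to match those of chapter 1. The simulation routine, its argument names and the parameter values in the usage lines are hypothetical and chosen only for illustration.

```python
import math
import random


def logistic_transition(z, gamma, c):
    """LSTAR transition: rises monotonically from 0 to 1 as z passes c."""
    return 1.0 / (1.0 + math.exp(-gamma * (z - c)))


def exponential_transition(z, gamma, c):
    """ESTAR transition: 0 at z = c, tends to 1 as |z - c| grows."""
    return 1.0 - math.exp(-gamma * (z - c) ** 2)


def simulate_star(pi1, pi2, gamma, c, d, T, transition, burn=100, seed=0):
    """Simulate (3.2) with z_t = y_{t-d} and standard Gaussian disturbances.

    pi1 and pi2 hold [intercept, lag 1, ..., lag p] for the two regimes.
    """
    rng = random.Random(seed)
    p = len(pi1) - 1
    y = [0.0] * max(p, d)  # starting values set to zero
    for _ in range(T + burn):
        lags = [y[-j] for j in range(1, p + 1)]
        F = transition(y[-d], gamma, c)
        regime1 = pi1[0] + sum(a * v for a, v in zip(pi1[1:], lags))
        regime2 = pi2[0] + sum(a * v for a, v in zip(pi2[1:], lags))
        y.append(regime1 * (1.0 - F) + regime2 * F + rng.gauss(0.0, 1.0))
    return y[-T:]  # drop the starting values and the burn-in


series = simulate_star([0.0, 0.9], [0.0, 0.3], gamma=2.0, c=0.0, d=1, T=305,
                       transition=exponential_transition)
print(len(series))
```

With the exponential function, the first parameter vector governs the dynamics when the transition variable is close to c and the second governs them when it is far from c in either direction.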

In this study the autoregressive (AR) order is selected by a combined use of AIC,
BIC, and Ljung-Box statistics for autocorrelation. Whenever these criteria do not
agree on the appropriate lag order, the highest lag number is selected, because a low
AR order may not be able to take care of the possible serial correlation in the series,
which in turn might lower the power of the nonlinearity tests. The usual practice
in the literature is to first identify a linear AR(p) model and then to estimate STAR
models with the same specified order in each regime. This approach is somewhat
problematic, as the true AR order in a linear model may not be the same as in a nonlinear
STAR type of model. Simulation evidence reported in chapter 1 suggests that these
criteria may fail to select the true lag order in STAR models. In this chapter,
whenever an estimate is found to be statistically insignificant, it is removed
and the model is re-estimated with different AR orders in each regime. Diagnostic
tests are used to decide whether the removal of a lag is appropriate or not.
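
The rule of taking the highest order when the criteria disagree can be sketched as follows. For brevity the sketch obtains the innovation variance of each AR(p) model from the Levinson-Durbin recursion on sample autocovariances (a Yule-Walker fit, swapped in here for least squares estimation) and it leaves out the Ljung-Box check. The function names and the simulated AR(2) series are assumptions made for the example.

```python
import math
import random


def autocovariances(x, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def innovation_variances(gamma):
    """Levinson-Durbin recursion: innovation variance of AR(p), p = 0..P."""
    sigma2 = [gamma[0]]
    phi = []
    for p in range(1, len(gamma)):
        num = gamma[p] - sum(phi[j] * gamma[p - 1 - j] for j in range(p - 1))
        kappa = num / sigma2[-1]  # partial autocorrelation at lag p
        phi = [phi[j] - kappa * phi[p - 2 - j] for j in range(p - 1)] + [kappa]
        sigma2.append(sigma2[-1] * (1.0 - kappa ** 2))
    return sigma2


def select_ar_order(x, max_lag=8):
    """Return (AIC order, BIC order, the higher of the two)."""
    n = len(x)
    sigma2 = innovation_variances(autocovariances(x, max_lag))
    aic = [n * math.log(s) + 2 * p for p, s in enumerate(sigma2)]
    bic = [n * math.log(s) + p * math.log(n) for p, s in enumerate(sigma2)]
    p_aic, p_bic = aic.index(min(aic)), bic.index(min(bic))
    return p_aic, p_bic, max(p_aic, p_bic)


rng = random.Random(42)
y = [0.0, 0.0]
for _ in range(2100):
    y.append(0.5 * y[-1] + 0.3 * y[-2] + rng.gauss(0.0, 1.0))
y = y[100:]  # discard the start-up observations
p_aic, p_bic, p_star = select_ar_order(y)
print(p_aic, p_bic, p_star)
```

Because the BIC penalty exceeds the AIC penalty for any realistic sample size, the BIC order never exceeds the AIC order in this sketch, so the disagreement rule matters mainly once a third criterion such as the Ljung-Box statistic is included.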

Testing linearity against the STAR type of nonlinearity is carried out by use
of the LM tests discussed in chapter 1. Standard, heteroscedasticity robust and
outlier robust versions of LM2, LM3 and LM4 are applied in this chapter. To specify
the value of the delay parameter, d, the tests are performed for values of d ranging
from 1 to 12. Following Teräsvirta (1994), the delay parameter is usually determined
by $d = \arg\min_{1 \le d \le 12} P(d)$, where $P(d)$ is the p-value of the LM3 test. The
choice between the LSTAR and the ESTAR model is usually made by a sequence of
tests nested within the null hypotheses corresponding to the LM3 and the LM4 tests;
see Teräsvirta (1994) and Escribano and Jordà (1999). The type of regime switching
implied by the LSTAR model can be convenient for modelling certain economic time
series that exhibit asymmetries in terms of expansions and recessions. This is because
in the LSTAR model, the two regimes correspond to small and large values of
the transition variable $z_t$ relative to the threshold $c$. The ESTAR model may be
better suited for modelling real exchange rates, as regimes in the ESTAR model
are associated with small and large absolute values of the transition variable. In
other words, the properties of the ESTAR model allow symmetric adjustment of the real
exchange rate for deviations above and below the equilibrium level. In the context of
real exchange rates both models imply that there are distinct regimes in the exchange
rate market, for example, an appreciating regime and a depreciating regime. The
LSTAR model implies that real exchange rates behave differently in the two regimes,
while the ESTAR model implies that the two regimes have rather similar dynamics,
while the transition period can have different dynamics. In this chapter, instead of
excluding the LSTAR model a priori as a possible model for the real exchange rates,
LSTAR models are also estimated along with the ESTAR models to check the
adequacy of the ESTAR model. In all of the cases reported in section 3.4, the ESTAR
model is found to better represent the dynamic behavior of real exchange rates. This
way of selecting the appropriate STAR model and delay parameter is quite flexible
and in general may be preferable to the strict application of the procedures described
in Teräsvirta (1994) and Escribano and Jordà (1999), as it allows one to compare the
estimated models for each of the transition variables and functions. This approach
is also suggested by Teräsvirta (1998). Another difference from the studies which
apply STAR modelling to exchange rates is that this study estimates STAR type
models with different autoregressive orders in each regime. Given the results from
linearity tests, several ESTAR and LSTAR models are estimated by nonlinear least
squares (NLS). Under certain regularity conditions, which are discussed in Gallant
(1987) and Pötscher and Prucha (1997) among others, the NLS estimates are consistent and
asymptotically normal. The estimation is performed using the constrained maximum
likelihood library of GAUSS. The Newton-Raphson algorithm is used in optimization.
Apart from the standard diagnostic analysis of residuals, the diagnostic tests developed
by Eitrheim and Teräsvirta (1996) and van Dijk and Franses (1999) are applied. For
details, see chapter 1.


3.3 Nonlinearity, Non-stationarity and Real
Exchange Rates

The application of linearity tests and of the STAR models presumes stationary
time series. An issue that deserves particular attention in modelling real exchange
rates by STAR type models involves the treatment of non-stationarity. The recent
empirical literature argues that standard unit root tests fail to detect mean reverting
behavior of real exchange rates because the true data generating process (DGP) for
the real exchange rates follows a nonlinear model of the STAR type. This idea rests
on the following re-parameterization of the model for real exchange rates:

$$\begin{aligned} \Delta q_t = {} & \Big(\alpha + \rho q_{t-1} + \sum_{j=1}^{p-1} \pi_{1,j} \Delta q_{t-j}\Big)(1 - F(z_t; \gamma, c)) \\ & + \Big(\alpha' + \rho' q_{t-1} + \sum_{j=1}^{p-1} \pi_{2,j} \Delta q_{t-j}\Big) F(z_t; \gamma, c) + u_t. \end{aligned} \tag{3.3}$$

Note that equation (3.3) indicates that when the process is in the middle regime (which
corresponds to $F(\cdot) = 0$ in the ESTAR model) the behavior of real exchange rates is
mostly determined by the value of $\rho$, and when the process is in the outer regime (which
corresponds to $F(\cdot) = 1$ in the ESTAR model) the behavior is mostly determined by
the value of $\rho'$. Hence, for small deviations from PPP the coefficient $\rho$ governs the
adjustment process, whereas for large deviations from PPP the coefficient $\rho'$ becomes
more and more important. In this sense, STAR models of the form (3.3) are consistent
with the predictions of equilibrium models of real exchange rate determination in the
presence of transaction costs. In particular, the larger the deviation from PPP, the
stronger the tendency to move back to equilibrium, provided that the estimates of
$\rho$ and $\rho'$ are such that $\rho'$ is negative, while $\rho$ may even be positive. These conditions
ensure the global stationarity of the real exchange rates generated from the model in
(3.3). If the true DGP of real exchange rates is given by the model in (3.3), then unit
root tests which are based on a linear AR(p) model of the augmented Dickey-Fuller
regression form

$$\Delta q_t = \alpha^* + \rho^* q_{t-1} + \sum_{i=1}^{p-1} \pi_i^* \Delta q_{t-i} + u_t \tag{3.4}$$

may not be able to detect the mean reverting behavior of real exchange rates, as
the estimates of the parameter $\rho^*$ in (3.4) will tend to be a combination of $\rho$ and $\rho'$.
Thus, failure to reject the unit root hypothesis on the basis of a linear model does not
necessarily invalidate long-run PPP. That is, the unit root hypothesis $H_0: \rho^* = 0$ may
not be rejected against the stationary linear alternative hypothesis $H_1: \rho^* < 0$, even
though the true DGP is a nonlinear globally stable process. Given this possibility of
non-rejection of the unit root hypothesis when in fact the true process is nonlinearly
mean reverting, it is worthwhile to investigate the frequency with which the hypothesis
of a unit root can be rejected using standard test procedures when the true data
generating process is a mean reverting STAR process. This may
shed some light on the power/size properties of the standard tests and
may reveal information on the reasons why previous research has resulted in
non-rejection of the unit root null or rejection of the stationarity null for real exchange
rates over the floating period.
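
The sense in which a linear estimate blends the two regime coefficients can be seen from the local adjustment coefficient implied by (3.3). The sketch below assumes the exponential transition function with the lagged real exchange rate as transition variable and a threshold of zero. The function names and the parameter values (a unit root inner regime with coefficient 0 and an outer regime coefficient of -0.5) are illustrative assumptions only.

```python
import math


def estar_transition(z, gamma, c=0.0):
    """Exponential transition function: 0 at z = c, tends to 1 as |z - c| grows."""
    return 1.0 - math.exp(-gamma * (z - c) ** 2)


def local_adjustment(q_lag, rho, rho_prime, gamma, c=0.0):
    """Coefficient on q_{t-1} in (3.3) at a given deviation: rho(1 - F) + rho'F."""
    F = estar_transition(q_lag, gamma, c)
    return rho * (1.0 - F) + rho_prime * F


# Speed of adjustment at increasing deviations from equilibrium.
for deviation in (0.0, 0.5, 1.0, 2.0):
    print(deviation, round(local_adjustment(deviation, 0.0, -0.5, 1.0), 3))
```

The coefficient is zero at equilibrium and approaches the outer regime value as the deviation grows. A linear regression such as (3.4) can only return a single coefficient somewhere between these two extremes.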

Since it is not known a priori whether or not real exchange rates are stationary,
it is also worthwhile to investigate the frequency with which the hypothesis of
linearity is rejected when the true DGP is a linear unit root and/or stationary process.
This is important as the linearity tests and the estimation of STAR models assume that
the time series under study is stationary. Results of this experiment, combined with
the results of the experiment on the power/size of unit root/stationarity tests, will
guide us in testing and estimating the STAR models in the subsequent sections.

To investigate the size of linearity tests, data is generated from AR(p) model.
To investigate the power/size properties of unit root and stationarity tests the data

is generated from the ESTAR model with p = 1 and p = 2. The parameters in

101

ESTAR models are speciﬁed so that the generated series are globally stationary even
though they may behave as a random walk in the middle regime. In all experiments,
disturbances are generated from independent and identically distributed Gaussian
innovations with zero mean and unit variance. Starting values are set equal to zero and
in each replication the ﬁrst 100 observation is discarded in order to remove the possible
effects of starting values. A sample size of 305 observations is generated from AR(p)
and ESTRAR(p) models as this corresponds to the sample size used in this study.
The results are given in Tables 3.1 and 3.2. Table 3.1 gives the empirical rejection
frequencies of the F variants of the LM-type tests. The linearity tests and the
corresponding p-values are computed and compared with the 5% significance level. Both
levels and first differences are used in computing the tests. The first value in each
cell of the table is the empirical size of the test when the level of the generated data
is used, while the value in square brackets is the size of the test when the first
difference of the data is used. The tests are computed given the true lag order of 2.
Experiments were conducted with different values of p; since the results are similar,
only those for p = 2 are reported. Table 3.1 indicates that, for values of the AR
parameters that make the AR(p) model stationary, the standard versions of the LM-type
tests have empirical sizes close to the nominal size of 5%. As the coefficients of the
AR(p) process take values that make it a near unit root or a pure unit root process, the
empirical size of the tests deteriorates and can reach unity. This means
that the LM-type tests may spuriously suggest the presence of nonlinearity even though
the true DGP is a linear process. The results also indicate that first differencing the
series generally improves the size of the tests.
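The size experiment described above can be sketched in a few lines. This is a minimal illustration, not the code behind Table 3.1: it assumes a third-order auxiliary regression of the Luukkonen et al. (1988) type with $y_{t-d}$ as the candidate transition variable, and it uses the asymptotic chi-square form of the LM statistic rather than the F variants reported in the table. The function names, the replication count and the hard-coded critical value are illustrative assumptions.

```python
import numpy as np

# 5% critical value of the chi-square distribution with 6 degrees of freedom
# (3 * p auxiliary regressors for p = 2); an assumption of this sketch.
CHI2_95_DF6 = 12.592


def simulate_ar2(rho1, rho2, n=305, burn=100, rng=None):
    """Generate n observations from a Gaussian AR(2) after discarding a burn-in."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(n + burn)
    y = np.zeros(n + burn)  # starting values set to zero
    for t in range(2, n + burn):
        y[t] = rho1 * y[t - 1] + rho2 * y[t - 2] + u[t]
    return y[burn:]


def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return float(resid @ resid)


def lm_linearity_stat(y, p=2, d=1):
    """LM statistic T * (SSR0 - SSR1) / SSR0 from a third-order auxiliary regression."""
    n = len(y)
    target = y[p:]
    lags = np.column_stack([y[p - j:n - j] for j in range(1, p + 1)])
    s = y[p - d:n - d][:, None]  # candidate transition variable y_{t-d}
    X0 = np.column_stack([np.ones(n - p), lags])                     # linear AR(p)
    X1 = np.column_stack([X0, lags * s, lags * s**2, lags * s**3])   # augmented
    ssr0, ssr1 = ssr(X0, target), ssr(X1, target)
    return (n - p) * (ssr0 - ssr1) / ssr0


def rejection_frequency(rho1, rho2, reps=500, seed=0):
    """Share of replications in which linearity is rejected at the 5% level."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        y = simulate_ar2(rho1, rho2, rng=rng)
        hits += int(lm_linearity_stat(y) > CHI2_95_DF6)
    return hits / reps
```

Calling `rejection_frequency(0.3, 0.6)` and `rejection_frequency(1.0, 0.0)` gives the levels counterpart of two designs in Table 3.1 under these assumptions; applying `np.diff` to the simulated series before testing gives the first-difference counterpart.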

The results in Table 3.2 indicate that the ability of the Phillips-Perron (1988) (PP),
Augmented Dickey-Fuller (ADF) and KPSS tests to detect stationarity when the series is
in fact stationary depends on the parametric specification of the true data generating
process (DGP). When the true DGP is a STAR model with near unit root or unit root
behavior in the middle (inner) regime and stationary behavior in the outer regime, so
that the process is globally stable, the unit root and stationarity tests have good
power and size properties in terms of detecting the global stationarity of the series.
However, as the autoregressive root in the outer regime approaches unity, the ability
of the ADF and PP tests to detect nonlinear mean reversion declines. This indicates
that the power of the ADF and PP tests depends on the behavior of the process in the
outer regime, since the global behavior of the time series in an ESTAR model is
dictated by the roots of the autoregressive polynomial in the outer regime. As the
autoregressive parameter(s) in the outer regime approach unity, the ESTAR process
becomes more and more persistent; hence the ADF and PP tests lose power in detecting
the global stationarity of the process, while the rejection frequency of the KPSS test
rises, as KPSS has power against persistent but stationary alternatives.
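The power experiment for the first design in Table 3.2 can be sketched as follows. The sketch assumes the exponential transition function $F(s;\gamma,c) = 1 - \exp(-\gamma (s-c)^2)$ and replaces the ADF and PP tests by a plain Dickey-Fuller regression with a constant and no augmentation lags. The critical value of -2.87 is an approximate 5% value for samples of this size, and all function names are illustrative.

```python
import numpy as np

DF_CRIT_5PCT = -2.87  # approximate 5% Dickey-Fuller critical value (constant, no trend)


def estar_transition(s, gamma, c):
    """Exponential transition function: 0 at s = c, approaching 1 far from c."""
    return 1.0 - np.exp(-gamma * (s - c) ** 2)


def simulate_estar1(pi_mid, pi_out, gamma=5.0, c=0.0, n=305, burn=100, rng=None):
    """ESTAR(1): AR coefficient pi_mid in the middle regime, pi_out in the outer one."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(n + burn)
    y = np.zeros(n + burn)
    for t in range(1, n + burn):
        F = estar_transition(y[t - 1], gamma, c)
        y[t] = pi_mid * y[t - 1] * (1.0 - F) + pi_out * y[t - 1] * F + u[t]
    return y[burn:]  # discard the burn-in observations


def df_tstat(y):
    """t-ratio on y_{t-1} in the regression dy_t = a + b * y_{t-1} + e_t."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta = np.linalg.lstsq(X, dy, rcond=None)[0]
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se


def df_power(pi_mid, pi_out, reps=500, seed=0):
    """Share of replications in which the unit root null is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        y = simulate_estar1(pi_mid, pi_out, rng=rng)
        rejections += int(df_tstat(y) < DF_CRIT_5PCT)
    return rejections / reps
```

Comparing `df_power(1.0, -0.5)` with `df_power(1.0, 0.95)` reproduces the qualitative pattern discussed above: the rejection frequency falls as the outer-regime coefficient approaches unity.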

3.4 Empirical Results
3.4.1 The Data

The data used in this study consist of monthly observations on consumer
price indices for Belgium, Canada, France, Germany, Italy, Japan, the Netherlands,
Switzerland, the UK, and the US, and end-of-period spot exchange rates for the Belgian
franc, Canadian dollar, French franc, German mark, Italian lira, Japanese yen, Dutch
guilder, Swiss franc, and UK pound against the US dollar. All data cover the sample
period from 1973M03 to 1998M07 and are taken from the International Monetary
Fund's International Financial Statistics compact disks. The logarithmic real
exchange rate series are constructed from these data as in equation (3.1), with $s_t$ taken
as the logarithm of the dollar price of the foreign currency, $p_t$ as the logarithm of
the US price level, and $p_t^{*}$ as the logarithm of the price level of the relevant country.
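As a small illustration of the data construction, the sketch below assumes that equation (3.1) takes the standard form $q_t = s_t - p_t + p_t^{*}$ implied by the definitions above; the equation itself is not restated in this section, so the exact form, the function name and the argument names are assumptions.

```python
import numpy as np


def log_real_exchange_rate(spot_usd_per_fx, cpi_us, cpi_foreign):
    """Log real exchange rate q_t = s_t - p_t + p_t^* (assumed form of (3.1))."""
    s = np.log(np.asarray(spot_usd_per_fx, dtype=float))   # log dollar price of currency
    p = np.log(np.asarray(cpi_us, dtype=float))            # log US price level
    p_star = np.log(np.asarray(cpi_foreign, dtype=float))  # log foreign price level
    return s - p + p_star


def log_difference(q):
    """First difference of the log real exchange rate, used in the later analysis."""
    return np.diff(np.asarray(q, dtype=float))
```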

The PP statistic of Phillips and Perron (1988) and the KPSS statistic of Kwiatkowski,
Phillips, Schmidt, and Shin (1992), computed for both levels and first differences, are
used to evaluate the stationarity of the real exchange rates. The results
are given in Table 3.3. They indicate that all of the real exchange rate series are
nonstationary in levels and have a unit root, whereas the log-differenced
real exchange rates are all stationary. In light of these results and of the simulation
experiments reported above, the first differences of the logarithmic real exchange
rates are used to analyze the nonlinear behavior of the real exchange rate series
over the free floating period in the rest of the study.

3.4.2 Nonlinearity tests and STAR model speciﬁcation

The p-values for the linearity tests, with the maximum AR lag determined by the combined
use of the AIC, BIC and LB statistics, are reported in Table 3.4. Following the suggestion
in Teräsvirta (1994, 1998), F variants of the linearity tests are used, as they have more
power in finite samples. The table gives three versions of each of the LM-type tests
discussed above. Each row of Table 3.4 gives the transition variable(s) for which
at least one of the p-values from any version of the test is less than 0.10. One
of the striking results in Table 3.4 is that for some of the currencies (especially
the Belgian franc, British pound, Dutch guilder, French franc, Italian lira and
Japanese yen) the standard variant of the tests indicates the presence of highly
significant nonlinearity, while the HCC variant, the OR variant or both have highly
insignificant p-values. This suggests either that the results from the LS variants are
spurious, in the sense that the finding of nonlinearity is due to heteroscedasticity,
outliers or both, or that the robust variants are unable to detect the nonlinearity.
The HCC variants of the tests provide almost no evidence of nonlinearity at any
reasonable significance level for the British pound and Swiss franc in this sample.
For all other currencies, some or all of the tests indicate the presence of
STAR-type nonlinearity at the 5% or 10% significance level. In some of these
cases the evidence of STAR-type nonlinearity from the HCC and/or OR versions of the
tests is not very strong, and it is not clear what to conclude about the presence of
nonlinearity. One approach is to estimate STAR
models for all of the delay parameters for which nonlinearity is suggested by the LS
versions of the nonlinearity tests and then let the diagnostic and specification tests
reveal the relevance of the nonlinear model for the data. This approach is intuitive:
if there is no STAR-type nonlinearity in the data, either the estimation
procedure will fail (indicating that threshold-type nonlinearity is not identified)
or, in the case of curve fitting, the fitted model will fail at least some
of the diagnostic and specification tests. This is the approach taken in the remainder
of this chapter.

3.4.3 Results from the Estimated STAR Models

For all currencies, both ESTAR and LSTAR models are estimated for each of the
transition variables for which some evidence of nonlinearity is obtained from the
linearity tests. The LSTAR models are used for comparison, to check whether the ESTAR
models appropriately capture the dynamics of real exchange rates as suggested by economic
intuition. Consistent with this intuition, in all cases the ESTAR model is found
to represent the dynamics better than the LSTAR model. The estimated models
for the Belgian franc, British pound, French franc and Swiss franc either failed at
the estimation stage or failed the diagnostic tests, especially the tests for
remaining nonlinearity and for serial correlation. Hence no results
for these currencies are reported in what follows. The selection of the model with
the appropriate transition variable is done by means of diagnostic statistics. Using
diagnostic tests to select the appropriate transition variable and function is
quite flexible and in general should be preferred, as it allows one to compare the
estimated models across transition variables and functions. For example, for
the French franc and Italian lira the LS versions of the tests indicated the presence
of strong nonlinearity, especially at d = 1, while the other versions suggested that these
findings are probably due to the presence of heteroscedasticity or outliers. Despite
this, both LSTAR and ESTAR models were estimated with d = 1, and it was found
that considerable nonlinearity remained at higher delay parameters and that the
residuals were significantly correlated. Hence these and several other
estimated models were discarded, as they failed the diagnostic tests. On the
other hand, for the German mark, consistent with the results of the LS variant of the
linearity tests, the ESTAR model with delay parameter d = 1 is found to be the best
one. STAR models of the form given in (3.3) are estimated without any restriction.
The hypothesis that the process is white noise in the outer regime as suggested by
economic theory, is tested by testing the null $H_0: \rho^{*} = -1,\ \pi^{*}_{1} = \cdots = \pi^{*}_{p} = 0$ in
(3.3). This hypothesis implies that real exchange rates, although they can behave as
random walks or even follow explosive paths in the neighborhood of the threshold
level, become increasingly mean reverting as the absolute size of the deviation
from the equilibrium level grows. In all of the estimated models this hypothesis is
strongly rejected. Parameters that are found to be insignificant are deleted
and the model is re-estimated. The model that best fits the data, in terms of adequate
diagnostic properties, is selected and reported.

Tables 3.5 and 3.6 present the results for five of the countries. The ESTAR
model is found to be an adequate representation for the rates reported. This implies
that real exchange rates move from high or low levels towards the middle level, or
their normal level, in a similar fashion. Diagnostic statistics are satisfactory in all
cases. The estimates of $\gamma$ vary across countries, with the speed of adjustment for
some real exchange rates being much higher than for others. The estimated values of
$\gamma$ for all series are significantly different from zero. The estimate of the
threshold parameter, $c$, is indistinguishable from zero.

To better evaluate the estimated models, the panels of Figure 3.1 display
the graphs of the estimated transition functions against time and against the
threshold variable. The figures reveal that the transition functions generally visit
each of the extreme regimes. This means that real exchange rates behave in a nonlinear
fashion, in that they visit the extreme regimes quite often, and a linear
representation that ignores this behavior is not appropriate for fully understanding
the dynamic behavior of real exchange rates. The panels of Figure 3.1 show that the
Dutch guilder and Italian lira rates spend most of the sample period closer to the outer
regime, while the German mark, Canadian dollar and Japanese yen rates stay closer to the
middle regime. The estimated transition functions plotted against the threshold
variable indicate that the transition between regimes is relatively fast. That is,
real exchange rate differences adjust to shocks rapidly, as the slopes of the
transition functions for all currencies are steep. The estimated transition functions
in general provide evidence of nonlinearity for all of the series.
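The link between the slope parameter and the speed of transition can be made concrete with a short sketch. It assumes the exponential transition function $F(s;\gamma,c) = 1 - \exp(-\gamma (s-c)^2)$; the values of $\gamma$ used in the example are hypothetical and are not the estimates of Tables 3.5 and 3.6.

```python
import numpy as np


def estar_transition(s, gamma, c=0.0):
    """Exponential transition function F(s; gamma, c) = 1 - exp(-gamma * (s - c)^2)."""
    return 1.0 - np.exp(-gamma * (np.asarray(s, dtype=float) - c) ** 2)


def half_transition_distance(gamma):
    """Deviation |s - c| at which F reaches 0.5, i.e. sqrt(ln 2 / gamma)."""
    return float(np.sqrt(np.log(2.0) / gamma))


if __name__ == "__main__":
    # Hypothetical slope values: a larger gamma means a smaller deviation is needed
    # to move halfway from the middle regime to the outer regime.
    for gamma in (1.0, 50.0, 500.0):
        print(gamma, half_transition_distance(gamma))
```

Under this functional form, a larger $\gamma$ shrinks the deviation from $c$ needed to reach the outer regime, which is the sense in which a steep transition function implies fast adjustment.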

3.5 Further Analysis of the Dynamics of Estimated STAR Models: Characteristic Roots and GIRFs

To gain some insight into the dynamic behavior of real exchange rates, this section
examines the dynamics of the estimated models, first by computing the characteristic
roots of the estimated equations and second by analyzing the propagation mechanism
of shocks to the real exchange rate process through the use of generalized impulse
response functions (GIRFs). Characteristic roots are obtained by solving the equation
$$\lambda^{p} - \sum_{j=1}^{p} \left[ \pi_{1,j}\,\bigl(1 - F(y_{t-d}; \gamma, c)\bigr) + \pi_{2,j}\,F(y_{t-d}; \gamma, c) \right] \lambda^{p-j} = 0. \qquad (3.5)$$

For illustration, the two extreme regimes are considered, namely F = 0 (middle regime)
and F = 1 (outer regime). Characteristic roots are computed for the level series.
Table 3.7 gives the roots for each regime. The striking result is that for all of the
series the modulus is equal to unity in the middle regime. This implies that the real
exchange rates behave as if they were unit root processes in this regime. For all
the series, the modulus in the outer regime is less than one, although very close
to unity. This implies that, although real exchange rates tend toward the stationary
equilibrium as time passes, the speed with which they tend to the equilibrium level
is very slow. In other words, when a real exchange rate is in the outer regime it
adjusts towards its equilibrium level, but the size of the adjustment is most probably
very small, so that it takes a long time for the real exchange rate to revert
to its equilibrium path. The rest of this section further investigates this
implied persistence by means of the GIRFs developed by Koop et al. (1996).
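The regime-specific roots can be computed numerically as sketched below. The coefficients in the example are taken from one of the ESTAR(2) simulation designs in Table 3.2, not from the estimated models of Tables 3.5 to 3.7, and the function names are illustrative.

```python
import numpy as np


def regime_roots(pi_mid, pi_out, F):
    """Roots of lambda^p - sum_j [pi_mid_j (1 - F) + pi_out_j F] lambda^(p-j) = 0."""
    pi_mid = np.asarray(pi_mid, dtype=float)
    pi_out = np.asarray(pi_out, dtype=float)
    coeffs = pi_mid * (1.0 - F) + pi_out * F  # effective AR coefficients at this F
    # np.roots expects polynomial coefficients from the highest power downwards.
    return np.roots(np.concatenate(([1.0], -coeffs)))


def dominant_modulus(pi_mid, pi_out, F):
    """Largest modulus among the characteristic roots for a given value of F."""
    return float(np.max(np.abs(regime_roots(pi_mid, pi_out, F))))


if __name__ == "__main__":
    pi_mid, pi_out = [0.3, 0.7], [0.4, -0.6]       # a design from Table 3.2
    print(dominant_modulus(pi_mid, pi_out, 0.0))   # middle regime
    print(dominant_modulus(pi_mid, pi_out, 1.0))   # outer regime
```

For this design the middle regime has a root on the unit circle, while the outer regime has a pair of complex roots with modulus $\sqrt{0.6}$, so the process is locally nonstationary but globally stable.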

Impulse response functions (IRFs) for a linear model and for a nonlinear model
differ. An IRF for a linear model is symmetric, in that a shock of size $-\delta$ has an
effect that is exactly opposite to that of a shock of size $+\delta$. Moreover, it is
linear in the sense that the IRF is proportional to the size of the shock. Lastly, it is
history independent, as its shape does not depend on the particular history
$\omega_{t-1}$. As discussed in Koop et al. (1996) and Pesaran and Potter (1997), the
properties of IRFs from a linear model do not in general carry over to IRFs from a
nonlinear model. Koop et al. (1996) show that, when the time series follows a nonlinear
process such as a STAR model, the impact of a shock depends not only on the history of
the process but also on the sign and size of the shock. Furthermore, as shown in
Pesaran and Potter (1997), when one wants to analyze the effect of a shock on the time
series $k > 1$ periods ahead, the assumption that no shocks occur in the intermediate
periods may give misleading inference concerning the propagation mechanism of the
model. The GIRF for a specific shock $u_t = \delta$ is defined as
$$GI_y(k, \delta, \omega_{t-1}) = E[y_{t+k} \mid u_t = \delta, \omega_{t-1}] - E[y_{t+k} \mid \omega_{t-1}], \qquad (3.6)$$
for $k = 1, 2, \ldots$. Note that the expectations of $y_{t+k}$ are conditioned only on the
history and/or on the shock. In other words, the problem of dealing with shocks
occurring in the intermediate periods is handled by averaging them out. This also
explains why the benchmark profile is the expectation of $y_{t+k}$ given only the history
of the process, $\omega_{t-1}$: in the benchmark profile the current shock is averaged out
as well. This GIRF reduces to the traditional IRF when the model is linear. Koop et al.
(1996) emphasize that the GIRF given in (3.6) is itself a random variable. The GIRF is
a function of $\delta$ and $\omega_{t-1}$, which are realizations of the random variables
$u_t$ and $\Omega_{t-1}$, the information set.

The GIRFs can be used in several ways to analyze the dynamic properties of
the estimated model. They can be used to analyze the persistence of shocks. A shock
$u_t = \delta$ is called transient at history $\omega_{t-1}$ if $GI_y(k, \delta, \omega_{t-1})$
goes to zero as $k \to \infty$. If, on the other hand, the GIRF approaches a nonzero
finite value as $k \to \infty$, the shock is said to be persistent. Intuitively, if a
time series process is stationary and ergodic, the effects of all shocks eventually
converge to zero for all possible histories of the process. Hence the distribution of
$GI_y(k, \delta, \omega_{t-1})$ collapses to a spike at 0 as $k \to \infty$. In contrast,
for nonstationary time series the dispersion of the distribution of
$GI_y(k, \delta, \omega_{t-1})$ is positive for all $k$. Koop et al. (1996) suggest
that the dispersion of the distribution of $GI_y(k, \delta, \omega_{t-1})$ at finite
horizons can be used to obtain information about the persistence of shocks. GIRFs can
also be used to assess the significance of asymmetric effects over time. One difficulty
in computing GIRFs is that analytic expressions for the conditional expectations
are not available for $k > 1$; they therefore need to be estimated. Koop et al.
(1996) discuss in detail simulation methods for estimating GIRFs, in particular Monte
Carlo and bootstrap methods. In this study, the conditional expectations are estimated
from simulated realizations obtained by iterating the estimated ESTAR model forward,
drawing randomly with replacement from the estimated residuals of the model, and then
averaging over 5,000 random draws for $h = 0, 1, 2, \ldots, 60$. For each combination of
history and initial shock, we compute generalized impulse responses for horizons
$k = 1, 2, \ldots, N$ with $N = 60$. More explicitly, the conditional expectations in
(3.6) are estimated as the means over 5,000 realizations of $\Delta q_{t+k}$, with and
without using the selected initial shock to obtain $\Delta q_t$, and using randomly
sampled residuals of the estimated ESTAR models elsewhere. All generalized impulse
responses are initialized such that they equal $\delta/\hat{\sigma}_u$ at $k = 0$.
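The bootstrap procedure described above can be sketched for an ESTAR(1) process. This is an illustration under stated assumptions rather than the code used in this study: the exponential transition function, the single-lag specification, the parameter tuple `params = (pi_mid, pi_out, gamma, c)` and the function names are all hypothetical, and the same resampled residuals are used for the shocked and benchmark paths after the initial period.

```python
import numpy as np


def estar1_step(y_prev, u, pi_mid, pi_out, gamma, c):
    """One step of an ESTAR(1) recursion with an exponential transition function."""
    F = 1.0 - np.exp(-gamma * (y_prev - c) ** 2)
    return pi_mid * y_prev * (1.0 - F) + pi_out * y_prev * F + u


def bootstrap_girf(history, delta, resid, params, horizon=60, draws=5000, seed=0):
    """Estimate GI(k, delta, history) for k = 0, ..., horizon by simulation."""
    rng = np.random.default_rng(seed)
    # Residuals drawn with replacement; column k holds the shocks for period t + k.
    shocks = rng.choice(np.asarray(resid, dtype=float), size=(draws, horizon + 1))
    y_shock = np.full(draws, float(history))
    y_base = np.full(draws, float(history))
    girf = np.empty(horizon + 1)
    for k in range(horizon + 1):
        # Shocked paths use u_t = delta at k = 0; later shocks are shared by both sets.
        u_shock = np.full(draws, float(delta)) if k == 0 else shocks[:, k]
        y_shock = estar1_step(y_shock, u_shock, *params)
        y_base = estar1_step(y_base, shocks[:, k], *params)
        # Difference between the two estimated conditional expectations.
        girf[k] = y_shock.mean() - y_base.mean()
    return girf
```

Averaging the output of `bootstrap_girf` over a set of histories (for example, those in the outer regime) gives a mean GIRF in the spirit of the one reported below; dividing `delta` by the residual standard error standardizes the shock.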

There are different ways of obtaining GIRFs. One way is to estimate GIRFs for
each history vector. Alternatively, one could estimate the conditional
expectations for each history $\omega_{t-1}$ and then average the resulting sequences
over all possible drawings of $\omega_{t-1}$. A third way is to estimate GIRFs by setting
the conditioning vector to $\omega^{*}_{t-1} = E[\omega_{t-1}]$. GIRFs from all of these
strategies are computed. The mean GIRFs from histories that correspond to the upper
10 percent quantile of the estimated transition function are given in the panels of
Figure 3.2. GIRFs are computed for the levels of the real exchange rates by cumulating
the impulse responses of the logarithmic differences of the real exchange rates over
the horizon. Inspection of the generalized impulse response functions reveals that, for
all of the series, the effects of shocks to real exchange rates do not dissipate as the
horizon increases. That is, consistent with a modulus close to unity, a shock
has quite persistent effects, in that real exchange rates do not return to their
equilibrium path within a short period of time. This contrasts with the argument that
real exchange rates should be mean reverting when deviations from the equilibrium
level implied by the PPP condition are large. This result indicates that, although
the presence of transaction costs may lead to nonlinear behavior that can be
modelled appropriately by ESTAR models, it does not necessarily imply that real
exchange rates are anti-persistent.
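The cumulation step that converts impulse responses of the differenced series into responses of the level can be illustrated as follows. The AR(1) response used in the example, with coefficient 0.5, is hypothetical and serves only to show the mechanics.

```python
import numpy as np


def cumulate_to_levels(girf_diff):
    """Level response at horizon k = sum of the difference responses up to k."""
    return np.cumsum(np.asarray(girf_diff, dtype=float))


if __name__ == "__main__":
    # Hypothetical response of the differenced series: 0.5**k for k = 0, ..., 60.
    girf_diff = 0.5 ** np.arange(61)
    girf_level = cumulate_to_levels(girf_diff)
    # The difference response dies out, while the level response settles
    # at 1 / (1 - 0.5) = 2 instead of returning to zero.
    print(girf_diff[-1], girf_level[-1])
```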


3.6 Conclusion

The three nonlinearity tests and their variants robust to
heteroscedasticity and outliers indicated the presence of STAR-type nonlinearity at
different transition variables for most of the currencies considered in this study. The
results of the nonlinearity tests also revealed the importance of evaluating the
estimated STAR models in several respects, as a rejection of linearity may be
due to some other property of the data. Accordingly, several diagnostic tests
are used to evaluate the estimated STAR models. For the Belgian franc, British
pound, and French franc rates, the estimated models did not pass all of the diagnostic
tests, especially the tests for remaining nonlinearity and for serial correlation in
the residuals, despite the evidence of nonlinearity from the LM tests.

Further evaluation of the dynamic behavior of real exchange rates from the estimated
STAR models revealed that shocks to real exchange rates have quite persistent effects,
which is consistent with a nonstationary process. This finding is consistent with the
results of the simulation experiments on the power and size of the PP, ADF and KPSS
statistics, which indicated that unit root and stationarity tests are capable of
detecting a globally stationary process even if the true DGP is nonlinear. The findings
here support those of O'Connell (1998b), in that small deviations from PPP
can be as persistent as large deviations. The identified threshold-type nonlinearity
may indicate that a certain component of real exchange rates tends
to be nonlinearly mean reverting, but the apparent persistence indicates that either
the nominal exchange rates or the relative prices converge too slowly. As such,
transaction costs by themselves are not able to induce real exchange
rates to converge to their long-run equilibrium levels. General equilibrium models that
incorporate transaction costs, such as Dumas (1992) and Sercu et al. (1995), indicate
that real exchange rates spend most of the time away from equilibrium. Still, they
presume that relative prices and nominal exchange rates converge to the long-run
equilibrium at the same rate. Since in these models adjustments in relative prices are
the main force that causes real exchange rates to revert to equilibrium, the findings
here raise the question of why adjustments in relative prices are not able to induce
real exchange rates to move to equilibrium faster. Perhaps, as argued by Engel and
Morley (2001), nominal exchange rates and relative prices have different speeds of
adjustment, and the persistence of real exchange rates can be explained by the
persistence of nominal exchange rates rather than of relative prices. An interesting
issue that may be worth investigating is the persistence and nonlinear behavior of
nominal exchange rates and relative prices separately, as this may reveal important
information on the adjustment dynamics and the speed with which nominal exchange rates
and relative prices converge to their long-run equilibrium levels. Given the observed
strong correlation between nominal and real exchange rates, it is possibly the relative
prices that exhibit threshold-type mean reversion rather than the nominal exchange
rates. This issue is left for future research.


BIBLIOGRAPHY

[1] Baum, C., J. Barkoulas, and M. Caglayan (2001), Nonlinear adjustment to
purchasing power parity in the post-Bretton Woods era, Journal of International
Money and Finance 20, 379-399.

[2] Diebold, F. X., S. Husted, and M. Rush (1991), Real exchange rates under the
gold standard, Journal of Political Economy 99 (6), 1252-1271.

[3] Diebold, F. X. and A. Inoue (1999), Long memory and structural change, Working
paper, Stern School of Business, NYU.

[4] Dumas, B. (1992), Dynamic equilibrium and the real exchange rate in a spatially
separated world, Review of Financial Studies 5, 153-180.

[5] Engel, C. and J.C. Morley (2001), The adjustment of prices and the adjustment
of the exchange rate, NBER Working Paper 8550, Cambridge, MA.

[6] Eitrheim, O. and T. Teräsvirta (1996), Testing the adequacy of smooth transition
autoregressive models, Journal of Econometrics 74, 59-76.

[7] Escribano, A. and O. Jordá (1999), Improved testing and specification of smooth
transition regression models, in P. Rothman (ed.), Nonlinear Time Series
Analysis of Economic and Financial Data, Boston: Kluwer, pp. 289-319.

[8] Frankel, J.A. and A.K. Rose (1996), Mean reversion within and between
countries: a panel project on purchasing power parity, Journal of International
Economics 40, 209-224.

[9] Froot, K.A. and K. Rogoff (1995), Perspectives on PPP and long-run real exchange
rates, in Grossman, G. and K. Rogoff (eds.), Handbook of International Economics,
North-Holland, Amsterdam (chap. 32).

[10] Gallant, A. R. (1987), Nonlinear Statistical Models, New York: John Wiley.

[11] Granger, C.W.J. and T. Teräsvirta (1993), Modelling Nonlinear Economic
Relationships, Oxford: Oxford University Press.


[12] Koop, G., M. H. Pesaran and S. M. Potter (1996), Impulse response analysis in
nonlinear multivariate models, Journal of Econometrics 74, 119-147.

[13] Kwiatkowski, D., P.C.B. Phillips, P. Schmidt, and Y. Shin (1992), Testing the
null hypothesis of stationarity against the alternative of a unit root: How sure
are we that economic time series have a unit root? Journal of Econometrics 54,
159-178.

[14] Lothian, J.R. (1997), Multi-country evidence on the behavior of purchasing power
parity under the current float, Journal of International Money and Finance 16,
19-35.

[15] Lundbergh, S. and T. Teräsvirta (1998), Modelling economic high-frequency time
series with STAR-GARCH models, Working Papers in Economics and Finance
291, Stockholm School of Economics.

[16] Lundbergh, S., T. Teräsvirta and D. van Dijk (1999), Time-varying smooth
transition autoregressive models, Stockholm School of Economics, unpublished
manuscript.

[17] Luukkonen, R., P. Saikkonen and T. Teräsvirta (1988), Testing linearity against
smooth transition autoregressive models, Biometrika 75, 491-499.

[18] Michael, P., A.R. Nobay, and D.A. Peel (1997), Transactions costs and nonlinear
adjustment in real exchange rates: an empirical investigation, Journal of Political
Economy 105, 862-879.

[19] Obstfeld, M. and A. M. Taylor (1997), Nonlinear aspects of goods-market
arbitrage and adjustment: Heckscher's commodity points revisited, Journal of the
Japanese and International Economies 11, 441-479.

[20] Obstfeld, M. and K. Rogoff, Foundations of International Macroeconomics, MIT
Press, Cambridge, Massachusetts.

[21] O'Connell, P.G.J. (1998a), The overvaluation of purchasing power parity, Journal
of International Economics 44, 1-19.

[22] O'Connell, P.G.J. (1998b), Market frictions and real exchange rates, Journal of
International Money and Finance 17, 71-95.

[23] Oh, K.Y. (1996), Purchasing power parity and unit root tests using panel data,
Journal of International Money and Finance 15, 405-418.


[24] Pedroni, P. (1995), Panel cointegration: asymptotic and finite sample properties
of pooled time series tests with an application to the PPP hypothesis,
Unpublished Working Paper 95-013, Department of Economics, Indiana University,
Bloomington, IN.

[25] Pesaran, M. H. and S. M. Potter (1997), A floor and ceiling model of US output,
Journal of Economic Dynamics and Control 21, 661-695.

[26] Phillips, P.C.B. and P. Perron (1988), Testing for a unit root in time series
regression, Biometrika 75, 335-346.

[27] Pötscher, B.M. and I.R. Prucha (1997), Dynamic Nonlinear Econometric Models:
Asymptotic Theory, Berlin: Springer-Verlag.

[28] Rogoff, K. (1996), The purchasing power parity puzzle, Journal of Economic
Literature 34, 647-668.

[29] Sarantis, N. (1999), Modelling non-linearities in real effective exchange rates,
Journal of International Money and Finance 18, 27-45.

[30] Sarno, L. (2000), Systematic sampling and real exchange rates,
Weltwirtschaftliches Archiv 136, 24-57.

[31] Sarno, L. and M. P. Taylor (1999), The Economics of Exchange Rates, Cambridge
and New York: Cambridge University Press.

[32] Sercu, P., R. Uppal, and C. Van Hulle (1995), The exchange rate in the presence
of transaction costs: implications for tests of purchasing power parity, Journal
of Finance 50, 1309-1319.

[33] Taylor, M.P., D.A. Peel, and L. Sarno (2001), Non-linear mean-reversion in real
exchange rates: towards a solution of the purchasing power parity puzzles, Working
Paper, Centre for Economic Policy Research, London, UK.

[34] Teräsvirta, T. (1994), Specification, estimation and evaluation of smooth
transition autoregressive models, Journal of the American Statistical Association 89,
208-218.

[35] Teräsvirta, T. (1998), Modelling economic relationships with smooth transition
regressions, in A. Ullah and D.E.A. Giles (eds.), Handbook of Applied Economic
Statistics, New York: Marcel Dekker, pp. 507-552.


[36] van Dijk, D., P.H. Franses and A. Lucas (1999), Testing for smooth transition
nonlinearity in the presence of additive outliers, Journal of Business and Economic
Statistics 17, 217-235.

[37] van Dijk, D. and P.H. Franses (1999), Modeling multiple regimes in the business
cycle, Macroeconomic Dynamics 3, 311-340.

[38] Wooldridge, J.M. (1990), A unified approach to robust, regression-based
specification tests, Econometric Theory 6, 17-43.

[39] Wooldridge, J.M. (1991), On the application of robust, regression-based
specification tests, Journal of Econometrics 47, 5-46.

[40] Wu, Y. (1996), Are real exchange rates nonstationary? Evidence from panel-data
tests, Journal of Money, Credit, and Banking 28, 54-63.


Figure 3.1: Estimated Transition Function versus Time and Threshold Variable

Panels: (a) Canadian Dollar, (b) Dutch Guilder, (c) German Mark, (d) Italian Lira, (e) Japanese Yen.
Each panel plots the F-function vs. time (left) and the F-function vs. transition variable (right).

Figure 3.2: Generalized Impulse Response Functions from estimated ESTAR models

Panels: (a) Canadian Dollar, (b) Dutch Guilder, (c) German Mark, (d) Italian Lira, (e) Japanese Yen.

Note: The mean GIRFs from shocks of 10% (solid lines with stars), 5% (dotted
lines with triangles), -5% (dots with squares), and -10% (dashes with circles) are
given for the histories that correspond to the outer regime. Shocks
are standardized by dividing by the standard error of the residuals from the estimated
models.

Table 3.1: Empirical rejection frequencies of linearity tests, sample size = 305.
Model design: $y_t = \rho_1 y_{t-1} + \rho_2 y_{t-2} + u_t$, $u_t \sim$ i.i.d. $N(0, 1)$

Parameters            Rejection frequencies
$\rho_1$   $\rho_2$   LM2             LM3             LM4
0.3        0.6        0.077 [0.041]   0.067 [0.040]   0.064 [0.044]
1.0        0.0        0.105 [0.044]   0.098 [0.039]   0.108 [0.046]
0.7        0.3        0.110 [0.047]   0.090 [0.048]   0.097 [0.049]
0.3        0.7        0.292 [0.045]   0.262 [0.043]   0.247 [0.049]
0.5        0.5        0.193 [0.046]   0.162 [0.040]   0.165 [0.045]
0.7        0.4        0.999 [0.997]   1.000 [1.000]   1.000 [1.000]

Notes: The rejection frequencies are obtained by computing the F variants of the LM
tests and the corresponding p-values 5000 times. Since the true data generating model
is linear, these frequencies indicate the empirical sizes of the tests. The nominal
significance level is 5%. Values in square brackets correspond to the first-differenced
series.

Table 3.2: Empirical rejection frequencies for ADF, PP, and KPSS tests

Model design:
$y_t = \pi_{1,1} y_{t-1} (1 - F(y_{t-1}; 5, 0)) + \pi_{1,2} y_{t-1} F(y_{t-1}; 5, 0) + u_t$, $u_t \sim \text{i.i.d. } N(0, 1)$

Parameter specification           Rejection frequency
                                  KPSS     PP       ADF
π1,1 = 0.9, π1,2 = -0.5           0.067    0.990    0.970
π1,1 = 1.0, π1,2 = -0.5           0.071    0.899    0.900
π1,1 = 1.0, π1,2 = -0.1           0.355    0.997    0.990
π1,1 = 1.1, π1,2 = -0.5           0.085    0.994    0.991
π1,1 = 1.2, π1,2 = -0.5           0.120    0.991    0.995
π1,1 = 1.0, π1,2 = 0.5            0.800    0.845    0.840
π1,1 = 1.0, π1,2 = 0.7            0.870    0.835    0.830
π1,1 = 1.0, π1,2 = 0.95           0.850    0.540    0.520
π1,1 = 1.1, π1,2 = 0.95           0.880    0.480    0.475

Model design:
$y_t = [\pi_{1,1} y_{t-1} + \pi_{1,2} y_{t-2}] (1 - F(y_{t-1}; 5, 0)) + [\pi_{2,1} y_{t-1} + \pi_{2,2} y_{t-2}] F(y_{t-1}; 5, 0) + u_t$, $u_t \sim \text{i.i.d. } N(0, 1)$

Parameter specification                                 Rejection frequency
                                                        KPSS     PP       ADF
π1,1 = 0.6, π1,2 = 0.4, π2,1 = 0.4, π2,2 = -0.6         0.104    0.890    0.992
π1,1 = 0.4, π1,2 = 0.6, π2,1 = 0.4, π2,2 = -0.6         0.344    0.995    0.994
π1,1 = 0.7, π1,2 = 0.3, π2,1 = 0.4, π2,2 = -0.6         0.059    0.996    0.992
π1,1 = 0.3, π1,2 = 0.7, π2,1 = 0.4, π2,2 = -0.6         0.613    0.998    0.993
π1,1 = 0.3, π1,2 = 0.7, π2,1 = 0.4, π2,2 = 0.4          0.815    0.722    0.720
π1,1 = 0.3, π1,2 = 0.7, π2,1 = 0.6, π2,2 = 0.3          0.828    0.718    0.715

Note: Rejection frequencies are based on 5000 replications.
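The first design in Table 3.2 can be simulated in a few lines. The sketch below is illustrative only: it assumes that F is the exponential transition function F(y; γ, c) = 1 - exp(-γ (y - c)^2), which the table does not state, and the names `transition` and `simulate_star` as well as the sample size n = 305 (borrowed from Table 3.1) are hypothetical choices.

```python
import math
import random

def transition(y, gamma, c):
    """Exponential transition function (an assumption; the table only writes F)."""
    return 1.0 - math.exp(-gamma * (y - c) ** 2)

def simulate_star(pi_11, pi_12, n, gamma=5.0, c=0.0, seed=0):
    """Simulate y_t = pi_11*y_{t-1}*(1 - F) + pi_12*y_{t-1}*F + u_t, u_t ~ N(0, 1)."""
    rng = random.Random(seed)
    y_prev, path = 0.0, []
    for _ in range(n):
        f = transition(y_prev, gamma, c)
        y_prev = pi_11 * y_prev * (1.0 - f) + pi_12 * y_prev * f + rng.gauss(0.0, 1.0)
        path.append(y_prev)
    return path

# One replication of the design with a unit root in the inner regime.
path = simulate_star(pi_11=1.0, pi_12=-0.5, n=305)
```

Repeating such a simulation 5000 times and applying a unit root or stationarity test to each path would give rejection frequencies of the kind reported in the table.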


 

Table 3.3: Results of unit root and stationarity tests: PP and KPSS

Currency            Level                First difference
                    PP        KPSS       PP         KPSS
Belgian franc       -1.351    0.997      -16.299    0.091
Canadian dollar     -1.504    2.812      -14.253    0.180
French franc        -1.534    1.354      -17.046    0.206
German Dmark        -1.882    3.217      -16.259    0.166
Italian lira        -2.589    3.239      -15.102    0.438
Japanese yen        -0.483    3.695      -12.532    0.162
Dutch guilder       -1.397    3.088      -16.612    0.100
Swiss franc         -2.226    3.205      -15.950    0.228
British pound       -2.941    2.706      -11.586    0.312

Notes: The reported values for the PP test are based on the regression of the time series
on a constant and its lagged value. The lag truncation for the Bartlett kernel is obtained
from the formula floor(4(T/100)^{2/9}). The 1% and 5% critical values are -3.454 and -2.871
respectively for the PP tests. The reported values for the KPSS test are based on a
regression of the series on a constant only. The 1% and 5% critical values for the KPSS
tests are 0.739 and 0.463 respectively. The PP statistic tests the null hypothesis of a unit
root against the alternative of stationarity, while the KPSS statistic has the null of
covariance stationarity against non-stationarity.


Table 3.4: p-values of LM tests for STAR-type nonlinearity in monthly logarithmic
differences of real exchange rates.

d    LS                        HCC                       OR
     LM2     LM3     LM4       LM2     LM3     LM4       LM2     LM3     LM4

Belgian franc, p = 2
1    0.0094  0.0005  0.0040    0.3631  0.5446  0.2361    0.7686  0.5849  0.0290
9    0.0828  0.1255  0.0628    0.2762  0.4011  0.2579    0.0318  0.0182  0.0597
11   0.1208  0.2529  0.0912    0.1143  0.2820  0.0851    0.0134  0.0016  0.0188
British pound, p = 3
3    0.0478  0.1351  0.1260    0.5300  0.8183  0.3079    0.2695  0.0433  0.0671
5    0.0663  0.1536  0.0242    0.3113  0.5744  0.1923    0.3186  0.0492  0.4010
Canadian dollar, p = 1
8    0.0971  0.2434  0.0964    0.0699  0.1791  0.0728    0.2052  0.0797  0.3906
10   0.2462  0.0970  0.0792    0.3092  0.3097  0.1735    0.0778  0.0623  0.1340
Dutch guilder, p = 2
1    0.0199  0.0007  0.0096    0.4889  0.5581  0.3328    0.4807  0.1423  0.0790
9    0.2268  0.2120  0.1761    0.4380  0.7388  0.4993    0.0467  0.0640  0.0688
11   0.0740  0.1985  0.0468    0.1161  0.2429  0.0691    0.0350  0.0035  0.0477
French franc, p = 1
1    0.0575  0.0147  0.0575    0.1025  0.1153  0.1025    0.0453  0.0923  0.0627
5    0.4571  0.1114  0.3617    0.2203  0.0346  0.2047    0.0254  0.0514  0.0697
11   0.1462  0.0703  0.0468    0.3514  0.3636  0.2592    0.0108  0.0026  0.0288
German mark, p = 1
1    0.0032  0.0001  0.0032    0.0723  0.1506  0.0723    0.0373  0.3271  0.0032
5    0.1411  0.0331  0.3912    0.0454  0.0404  0.3588    0.0383  0.4653  0.0863
9    0.1719  0.2021  0.2524    0.2533  0.4151  0.5244    0.0175  0.0632  0.0475
Italian lira, p = 2
1    0.0278  0.0027  0.1901    0.1643  0.0422  0.5446    0.0023  0.0040  0.0021
7    0.0377  0.0150  0.0039    0.1813  0.1709  0.0863    0.0071  0.0043  0.0164
9    0.0228  0.0589  0.0450    0.1479  0.2457  0.0870    0.0217  0.0058  0.0360
11   0.0512  0.1480  0.0557    0.1044  0.2245  0.0787    0.0620  0.0166  0.0901
Japanese yen, p = 3
1    0.0387  0.0538  0.1588    0.0929  0.0768  0.2206    0.2055  0.2918  0.1987
8    0.1970  0.4093  0.1128    0.0814  0.2169  0.0500    0.1797  0.0255  0.2454
11   0.3895  0.1872  0.1080    0.1504  0.0931  0.0596    0.0746  0.0452  0.1076
Swiss franc, p = 1
4    0.0294  0.0904  0.1872    0.1862  0.1262  0.1981    0.4382  0.5533  0.2568
12   0.2384  0.0445  0.2205    0.5011  0.1578  0.4964    0.0920  0.8255  0.1221

Key: LS, HCC, and OR stand for the Least Squares, Heteroscedasticity Consistent, and
Outlier Robust variants of the LM tests described in the paper. The column d gives those
delay parameters, and hence the transition variables, for which most of the p-values from
the three variants of the LM-type tests are less than 0.1.


Table 3.5: Estimation results from ESTAR models. Sample size: 291 (after adjusting
end points).

Parameters    Parameter estimates for each currency
              CD         DG         GM         IL         JY
π1,0          0.003      -0.073     .          .          .
              (0.001)    (0.034)    .          .          .
π1,1          0.271      -1.138     0.610      0.592      0.223
              (0.103)    (0.520)    (0.158)    (0.302)    (0.128)
π2,0          0.002      0.013      0.035      0.063      .
              (0.001)    (0.008)    (0.017)    (0.026)    .
ρ'            -0.024     -0.010     -0.034     -0.008     -0.002
              (0.012)    (0.007)    (0.017)    (0.004)    (0.001)
π2,1          .          .          -0.385     .          0.425
              .          .          (0.187)    .          (0.179)
γ             25.091     21.473     15.508     12.578     6.661
              (1.116)    (0.935)    (0.495)    (1.494)    (2.636)
c             0.016      -0.067     -0.002     0.077      0.046
              (0.132)    (0.483)    (0.113)    (0.594)    (0.340)
Skewness      -0.036     0.310      0.184      0.587      -0.558
Kurtosis      2.830      3.846      3.213      4.259      3.982
pLM(6)        0.385      0.408      0.681      0.348      0.491
pLM(12)       0.178      0.526      0.819      0.293      0.421
pARCH(6)      0.685      0.254      0.650      0.158      0.464
pARCH(12)     0.147      0.446      0.627      0.338      0.667
d             8          1          1          9          8

Notes: HCC standard errors are given in parentheses underneath the parameter estimates.
The transition variable and the transition function are indicated in the first row of the
table along with the currency. The rows corresponding to pLM(6) and pLM(12) give
p-values from LM statistics for 6th and 12th order serial correlation in the residuals. The
rows corresponding to pARCH(6) and pARCH(12) report the p-values for the presence of
ARCH effects up to 6th and 12th orders in the residuals. d gives the lag of the transition
variable used in the estimation.


Table 3.6: Tests for remaining nonlinearity and parameter constancy

p-values from the LM_AMR test: HCC version

Tr. var     CD       DG       GM       IL       JY
y_{t-1}     0.813    0.139    .        0.942    0.955
y_{t-2}     0.670    0.027    0.141    0.372    0.561
y_{t-3}     0.444    0.596    0.455    0.373    0.278
y_{t-4}     0.012    0.129    0.060    0.705    0.680
y_{t-5}     0.318    0.799    0.182    0.552    0.108
y_{t-6}     0.367    0.688    0.702    0.331    0.717
y_{t-7}     0.854    0.154    0.138    0.481    0.443
y_{t-8}     0.914    0.908    0.600    0.763    0.642
y_{t-9}     0.644    0.688    0.853    0.664    0.738
y_{t-10}    0.282    0.367    0.917    0.569    0.477
y_{t-11}    0.100    0.392    0.721    0.165    0.072
y_{t-12}    0.707    0.318    0.919    0.614    0.633

p-values from the LM_EMR test: HCC version

Tr. var     CD       DG       GM       IL       JY
y_{t-1}     0.651    0.098    .        0.950    0.760
y_{t-2}     0.304    0.106    0.251    0.519    0.241
y_{t-3}     0.768    0.828    0.521    0.244    0.168
y_{t-4}     0.042    0.288    0.173    0.872    0.454
y_{t-5}     0.408    0.405    0.160    0.540    0.247
y_{t-6}     0.415    0.398    0.589    0.468    0.848
y_{t-7}     0.779    0.427    0.339    0.194    0.441
y_{t-8}     .        0.746    0.751    0.890    0.460
y_{t-9}     0.179    0.460    0.693    0.081    0.737
y_{t-10}    0.556    0.344    0.976    0.894    0.683
y_{t-11}    0.316    0.590    0.872    0.413    0.197
y_{t-12}    0.729    0.432    0.694    0.843    0.477

p-values from the LM_{C,j} tests for parameter constancy

Statistic   CD       DG       GM       IL       JY
LM_C1       0.869    0.544    0.406    0.379    0.854
LM_C2       0.900    0.331    0.519    0.231    0.945
LM_C3       0.529    0.305    0.456    0.140    0.987
 


Table 3.7: Characteristic roots in extreme regimes

Currency    Regime    Characteristic roots    Modulus
CD          M         1.000, 0.271            1.000
            O         0.976                   0.976
DG          M         1.000, -1.138           1.138
            O         1.00, 0.077             1.00
GM          M         1.000, 0.610            1.00
            O         0.976, 0.395            0.976
IL          M         1.000, 0.592            1.000
            O         0.992                   0.992
JY          M         1.000, 0.285            1.000
            O         0.967, 0.285            0.967

Note: M stands for the middle regime, and O for the outer regime.
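The entries of Table 3.7 can be checked numerically. The sketch below assumes (this is a reading of the table, not something it states) that a regime can be written as an AR(2) in levels, y_t = a1 y_{t-1} + a2 y_{t-2}, whose characteristic roots solve z^2 - a1 z - a2 = 0. The function name `ar2_roots` and the example coefficients are illustrative.

```python
import cmath

def ar2_roots(a1, a2):
    """Roots of z**2 - a1*z - a2 = 0 for y_t = a1*y_{t-1} + a2*y_{t-2}."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0

# Hypothetical middle regime: a unit root in levels combined with an AR(1)
# coefficient of 0.271 in first differences, i.e.
# y_t = 1.271*y_{t-1} - 0.271*y_{t-2}.
roots = ar2_roots(1.271, -0.271)
moduli = sorted(abs(r) for r in roots)  # smallest modulus first
```

Under this assumption the two moduli are 0.271 and 1, which agrees with the roots listed for the middle regime of CD.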


CHAPTER 4

Long Memory in Commodity Markets

4.1 Introduction

In accord with the efficient markets hypothesis, asset price returns and exchange
rate returns exhibit very little serial correlation. Their volatilities, on the other
hand, contain a much richer structure: certain transformations of asset price and
exchange rate returns display an extremely persistent and distinct form of
autocorrelation. There is considerable evidence that the conditional volatility of
asset price returns and exchange rate returns displays long memory. Ding et al.
(1993), de Lima and Crato (1993), Bollerslev and Mikkelsen (1996), and Granger and
Ding (1996) have shown that asset price return volatilities have the long memory
property, and Baillie et al. (1996) have shown that exchange rate volatility displays
it as well. Previous literature has found daily commodity series to be well described
by martingale-GARCH(1,1) models; see, for example, Baillie and Myers (1991).

The purpose of this chapter is to examine daily commodity futures and cash returns
for several primary commodities and their volatilities, in particular their squared
and absolute returns as well as their intradaily ranges. The subject of this chapter
is thus the modelling of volatility in commodity markets. At a substantive level, one
may be interested in forecasting the volatility in these markets. Moreover, knowledge
of the dynamic properties of return volatilities may have implications for the dynamic
nature of commodity prices and for forecasting optimal hedge ratios: a finding of time
dependence in the second conditional moments of cash and futures returns implies that
optimal hedge ratios should be time dependent as well; see, for instance, Baillie and
Myers (1991). The results of this study may be helpful in comparing the dynamic
features of commodity markets with those of stock and foreign exchange markets, which
in turn may have implications for theoretical modelling of prices in these markets.
This study tries to answer the following question. Do daily commodity cash and futures
prices have the long memory property, with cash and futures returns being
approximately uncorrelated and with very persistent autocorrelation in certain proxies
for volatility, such as squared and absolute returns and intradaily ranges?

Granger and Ding (1995), using the results of Luce (1980), showed that the expected
absolute return, and any power transformation of this return, may be interpreted as a
measure of risk. Hence, the volatility literature routinely uses absolute or squared
returns as volatility proxies. In this chapter, following Garman and Klass (1980),
Parkinson (1980), and Andersen and Bollerslev (1998), we consider a third proxy,
namely the range, defined here as the difference between the highest and lowest log
asset price during a discrete sampling interval. It is by now well known that the
conditional distributions of log absolute and log squared returns are far from
Gaussian. On the other hand, Alizadeh, Brandt, and Diebold (1999) show both
theoretically and empirically that the log range is approximately Gaussian, in sharp
contrast to popular volatility proxies such as log absolute or squared returns. There
is a considerable literature on both absolute and squared returns in stock and
exchange rate markets, but little attention has been paid to extreme value volatility
proxies. The range as a proxy for volatility has been appreciated in the business
press, which routinely displays high and low prices. One potential problem in the use
of the range as a proxy for volatility is the downward bias in the range induced by
discrete sampling (Rogers and Satchell 1991). However, as Alizadeh, Brandt, and
Diebold (1999) and Andersen and Bollerslev (1998) show, on days with substantial price
reversals return-based proxies underestimate daily volatility, because the closing
price is not very different from the opening price despite large intraday price
fluctuations. The range in this sense may better reflect intraday volatility. In this
chapter, the long memory property of absolute and squared returns as well as the
intraday range will be analyzed. If the intraday log range exhibits long range
dependence, this may support the findings of Andersen and Bollerslev (1998) and
Alizadeh, Brandt, and Diebold (1999) and motivate consideration of the intraday log
range in modelling financial market volatility.

We utilize the Fractionally Integrated GARCH (FIGARCH) model of Baillie et al.
(1996) to model the dynamics of volatility in commodity cash and futures returns.
The GARCH model accounts for volatility persistence, but this persistence decays
relatively fast; we therefore use it as a benchmark and compare its results with
those of the FIGARCH model, which is capable of modelling very long temporal
dependencies in the conditional variance of a process. In order to better assess the
presence of long memory in the volatility of commodity futures and cash returns, this
chapter also models absolute returns, squared returns, and intraday ranges using the
Autoregressive Fractionally Integrated Moving Average (ARFIMA) model of Granger and
Joyeux (1980) and Hosking (1981). Moreover, estimates of the long memory parameter
for the volatility proxies are obtained from semi-parametric methods, in particular
the GPH estimator of Geweke and Porter-Hudak (1983) and a local Whittle estimator
based on Fox and Taqqu (1986).

The rest of the chapter is organized as follows. Section 4.2 describes the data
and examines the empirical autocorrelations of the series. Section 4.3 presents and
discusses the results from the estimation of the FIGARCH models for daily cash and
futures return volatilities. Results from the estimation of the ARFIMA models and
nonparametric methods for squared and absolute returns are discussed in section 4.4.
The last section provides the conclusion.

4.2 The Data

We analyze cash and futures prices for six commodities: coffee, corn, gold, silver,
soybeans, and unleaded gasoline. The data are obtained from the Chicago Mercantile
Exchange and consist of daily observations for each commodity. The sample period
differs across commodities: coffee, 03/20/84-12/29/00; corn, 03/20/85-03/14/01;
gold, 04/21/75-03/31/00; silver, 12/26/89-12/26/97; soybeans, 03/20/80-12/29/00;
and unleaded gasoline, 04/25/86-12/29/00. Each contract starts trading well before
the delivery month. For all commodities except gold and silver we consider the
contract that expires in March of each year. For gold the December contract is used,
and for silver the April contract.

Following standard practice, returns are defined as $R_t = 100 \times \Delta \ln(P_t)$,
where $P_t$ is the price (either cash or futures) at date $t$, absolute returns as
$|R_t|$, and squared returns as $R_t^2$. Daily returns are computed for each contract
and then combined to obtain a series of futures returns. In estimation, dummy
variables are included to see whether contract expiration dates have any statistically
significant effect on the return and volatility dynamics. For none of the commodities
were the estimated coefficients of the dummy variables significant. Following
Parkinson (1980), the range is defined by

$$RR_t = \frac{\ln(P_t^{H}) - \ln(P_t^{L})}{2 \ln 2},$$

where $P_t^{H}$ and $P_t^{L}$ are the highest and lowest prices on day $t$, respectively.
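The return and volatility proxy definitions above translate directly into code. The following is a minimal sketch under stated assumptions: the function name `returns_and_proxies` and the price lists are made up for illustration, and the range uses the 2 ln 2 scaling exactly as printed in the text.

```python
import math

def returns_and_proxies(close, high, low):
    """Return R_t, |R_t|, R_t^2 and the range RR_t for t = 1, ..., n-1."""
    n = len(close)
    # Percentage log returns: R_t = 100 * (ln P_t - ln P_{t-1}).
    r = [100.0 * (math.log(close[t]) - math.log(close[t - 1])) for t in range(1, n)]
    abs_r = [abs(x) for x in r]
    sq_r = [x * x for x in r]
    # Range proxy: (ln high - ln low) / (2 ln 2), as printed in the text.
    scale = 2.0 * math.log(2.0)
    rr = [(math.log(high[t]) - math.log(low[t])) / scale for t in range(1, n)]
    return r, abs_r, sq_r, rr

# Made-up daily prices, for illustration only.
close = [100.0, 101.0, 99.5, 100.2]
high = [100.5, 101.8, 100.9, 100.6]
low = [99.4, 99.9, 99.0, 99.3]
r, abs_r, sq_r, rr = returns_and_proxies(close, high, low)
```

The first observation is dropped from every series so that the returns and the range are aligned on the same dates.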


The panels of figures 4.1 and 4.2 give the graphs of the daily cash and futures
returns, absolute returns, and squared returns, as well as the intraday range for the
commodity futures, over each sample period. It appears from the graphs that, for all
commodities, relatively volatile periods characterized by large price changes
alternate with more tranquil periods in which prices remain more or less stable. Large
cash and futures returns (both positive and negative) thus seem to occur in clusters,
and so does volatility. The volatility clustering phenomenon, which is typical of
stock prices and exchange rates, seems to occur in commodity markets as well.

Summary statistics for the futures and cash returns are given in table 4.1. The
table indicates that most of the series have small negative means and medians equal
to zero over their respective sample periods. One of the usual ways of getting an
idea of the distribution of a time series $y_t$ is to compare its kurtosis and
skewness with those of a normal random variable. The last two columns of table 4.1
indicate that the kurtosis of every return series is much larger than that of a
normal random variable. This reflects the fact that the tails of the distributions of
these return series are fatter than the tails of the normal distribution, which in
turn means that large realizations occur more often than one would expect for a
normally distributed variable.

Since any symmetric distribution has skewness equal to zero, table 4.1 indicates
that the distribution of the daily cash returns has some asymmetry. From table 4.1 it
is seen that all of the futures returns and three out of six cash returns (silver,
soybeans, and unleaded gasoline) have negative skewness. This implies that for those
commodities the left tail of the distribution is fatter than the right tail, that is,
large negative returns tend to occur more often than large positive ones. The analysis
here indicates that daily futures and cash return distributions are far from normal.
This finding is consistent with the distributions of daily stock price returns and
exchange rate returns.
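For reference, sample skewness and kurtosis of the kind compared with the normal benchmarks (0 and 3) can be computed as follows. This is a sketch using plain moment estimators with divisor n; whether the reported tables apply any small-sample correction is not stated, and the function name `skewness_kurtosis` and the sample values are illustrative.

```python
def skewness_kurtosis(x):
    """Moment-based sample skewness and kurtosis (divisor n, no bias correction)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Made-up sample with one large negative observation.
sample = [-5.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 2.0]
skew, kurt = skewness_kurtosis(sample)
```

A negative value of `skew` corresponds to the fatter left tail discussed above, and a value of `kurt` above 3 to tails fatter than those of the normal distribution.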


Table 4.2 gives the summary statistics for the return-based and range-based
volatility proxies for the commodity futures. For almost all commodities, the intraday
range has a lower sample variance and skewness than absolute and squared returns.
Squared returns always have the highest kurtosis. It seems that not only the
return-based volatility proxies but also the log range is far from normal, a result in
contrast to the findings of Alizadeh, Brandt, and Diebold (1999).

Table 4.3 reports the results from the Phillips-Perron (PP) test of Phillips and
Perron (1988) and the KPSS test of Kwiatkowski et al. (1992). The PP test has the
null hypothesis of a unit root, I(1), against the alternative of I(0), while the KPSS
test has the null of an I(0) process against the alternative of an I(1) process. As
shown in Lee and Schmidt (1996), the KPSS test has power against long memory
alternatives as well. Both tests indicate that commodity futures and cash prices are
non-stationary and possibly have a unit root, while daily cash and futures returns are
stationary. The PP test indicates that all of the volatility proxies are stationary.
The KPSS test, on the other hand, rejects the null of I(0) for the squared and
absolute futures returns of coffee, gold, soybeans, and unleaded gasoline. Combined
with the results of the PP test, this may indicate long memory behavior in the squared
and absolute futures returns for these commodities. The KPSS test also rejects its
null for the intraday ranges of coffee, gold, silver, and soybeans. Hence, there is
some evidence from the unit root and stationarity tests that the volatility proxies
may have long memory behavior for some of the commodity futures returns. For cash
returns, the KPSS test rejects its null at the 5 percent level for the squared returns
of coffee and gold and for the absolute returns of coffee, gold, soybeans, and
unleaded gasoline. Hence, the evidence of long memory in the squared and absolute cash
returns is not as strong as in the squared and absolute futures returns.
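To make the stationarity test concrete, the level-stationarity KPSS statistic can be sketched in a few lines: it is the scaled sum of squared partial sums of the demeaned series divided by a Bartlett-kernel long-run variance. The function names below are illustrative, and the lag rule in `bartlett_lag` is a commonly used default that is assumed here, not a statement of the lag choice behind table 4.3.

```python
import math

def kpss_level(x, lags):
    """KPSS statistic for the null of level stationarity."""
    n = len(x)
    mean = sum(x) / n
    e = [v - mean for v in x]
    # Partial sums of the demeaned series.
    s, partial = 0.0, []
    for v in e:
        s += v
        partial.append(s)
    # Long-run variance with Bartlett weights w_j = 1 - j / (lags + 1).
    lrv = sum(v * v for v in e) / n
    for j in range(1, lags + 1):
        gamma_j = sum(e[t] * e[t - j] for t in range(j, n)) / n
        lrv += 2.0 * (1.0 - j / (lags + 1.0)) * gamma_j
    return sum(p * p for p in partial) / (n * n * lrv)

def bartlett_lag(n):
    """Common lag-truncation rule floor(4 * (n / 100)^(2/9)) (an assumed default)."""
    return math.floor(4.0 * (n / 100.0) ** (2.0 / 9.0))

# A deterministic trend is strongly rejected; 0.463 is the 5% critical value
# quoted in the notes to Table 3.3.
trend_stat = kpss_level([float(t) for t in range(1, 101)], lags=0)
```

Values of the statistic above the critical value lead to rejection of I(0), which is the pattern reported for several of the volatility proxies.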

To gain further insight into the dependence structure of the series, the panels of
figures 4.3 and 4.4 display the first 100 autocorrelations of the daily log cash and
futures returns, absolute returns, squared returns, and intraday range, together with
two-sided 5 percent critical values ($\pm 1.96/\sqrt{T}$), where $T$ is the respective
sample size. The autocorrelations of the futures and cash returns are very small even
at low lags, and for a majority of lags they lie within the 5 percent bands. Hence,
the autocorrelations of returns mimic the autocorrelation structure of a stationary
process. By contrast, for the absolute and squared returns and the intraday ranges the
autocorrelations start off at a moderate level but remain significantly positive for a
substantial number of lags. Moreover, the autocorrelation in absolute returns is
generally somewhat higher than that in squared returns, and for all commodities the
autocorrelations in absolute returns hardly become insignificant at any of the lags
considered. This illustrates what has become known as the 'Taylor property' (see
Taylor, 1986, pp. 52-55): when calculating the autocorrelations of the series
$|R_t|^{\delta}$ for various values of $\delta$, one almost invariably finds that the
autocorrelations are largest for $\delta = 1$.
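The autocorrelation comparison described above can be reproduced with a short routine. The sketch below computes sample autocorrelations of the power-transformed absolute returns and the two-sided 5 percent band 1.96/sqrt(T); the function names and the toy return series are assumptions made for illustration.

```python
import math

def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom

def power_acf(returns, delta, max_lag):
    """Autocorrelations of |R_t|**delta at lags 1, ..., max_lag."""
    z = [abs(r) ** delta for r in returns]
    return [autocorr(z, k) for k in range(1, max_lag + 1)]

def band(n):
    """Two-sided 5 percent critical value for a sample of size n."""
    return 1.96 / math.sqrt(n)

# Toy returns with a volatile cluster in the middle (made-up numbers).
toy = [0.1, -0.2, 0.1, -0.1, 2.0, -1.8, 2.2, -2.1, 0.1, 0.2, -0.1, -0.1]
acf_abs = power_acf(toy, 1.0, 3)   # delta = 1: absolute returns
acf_sq = power_acf(toy, 2.0, 3)    # delta = 2: squared returns
```

Evaluating `power_acf` over a grid of `delta` values on actual return data is one way to examine the Taylor property.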

As is evident from the graphs, the autocorrelations of absolute returns are not only
larger than those of squared returns but also much more persistent, in the sense that
they decay much more slowly. Moreover, the autocorrelations of the intraday range are
usually higher and more persistent than those of absolute and squared returns. The
autocorrelations of absolute and squared returns and intraday range seem to mimic the
correlation properties of a long memory process rather than those of a short memory
stationary process, for which autocorrelations decay to zero at an exponential rate.
They decay very slowly, indicating that the linear association between distant
observations is persistent and that the autocorrelations decay at a hyperbolic rate.
This behavior is consistent with time series models with long memory, or long range
dependence. The characteristics of the autocorrelations in log commodity futures and
cash prices described above conform with the findings from stock and foreign exchange
markets; see, for example, Ding et al. (1993), Baillie et al. (1996), and Bollerslev
and Mikkelsen (1996).
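The contrast between the two decay rates can be made explicit. As a standard characterization (not taken from this chapter; the symbols $C$, $c$, $r$, and $d$ are introduced here for illustration), the autocorrelations $\rho(k)$ of a stationary ARMA process are bounded by a geometric sequence, whereas those of a stationary long memory process with memory parameter $0 < d < 1/2$ decay hyperbolically:

```latex
|\rho(k)| \le C\, r^{k}, \quad 0 < r < 1 \quad \text{(short memory)},
\qquad
\rho(k) \sim c\, k^{2d-1} \quad \text{as } k \to \infty \quad \text{(long memory)}.
```

In the first case the autocorrelations are absolutely summable; in the second case their sum diverges, which is one common way of defining long memory.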

4.3 Results from GARCH and FIGARCH Models

A class of parametric models that is capable of modelling volatility clustering
and the persistence in the autocorrelations of absolute and squared cash returns is
the Fractionally Integrated Generalized Autoregressive Conditional Heteroscedastic
(FIGARCH) model of Baillie et al. (1996). The details of the volatility models are
discussed in chapter 2.
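One way to see the difference between the two models is through their ARCH(infinity) weights on past squared errors. The sketch below uses the standard FIGARCH(1, δ, 1) representation, in which the conditional variance equals a constant plus [1 - (1 - βL)^{-1}(1 - φL)(1 - L)^δ] applied to the squared errors; the function name, the parameter names `phi` and `beta`, and the numerical values are illustrative assumptions, not estimates from this chapter.

```python
def figarch_arch_weights(delta, phi, beta, n):
    """First n ARCH(infinity) weights of a FIGARCH(1, delta, 1) model.

    Setting delta = 0 gives the weights of a GARCH(1, 1) model with
    alpha = phi - beta, i.e. (phi - beta) * beta**(k - 1).
    """
    c_prev = delta                   # c_1 in (1 - L)^delta = 1 - sum_k c_k L^k
    lam = [phi - beta + delta]       # weight on the first lag
    for k in range(2, n + 1):
        c_k = c_prev * (k - 1 - delta) / k
        lam.append(beta * lam[-1] + c_k - phi * c_prev)
        c_prev = c_k
    return lam

# Hypothetical parameter values, chosen only to illustrate the decay rates.
garch = figarch_arch_weights(0.0, 0.9, 0.6, 200)    # geometric decay
figarch = figarch_arch_weights(0.4, 0.3, 0.6, 200)  # hyperbolic decay
```

With these values the weight at lag 200 is negligible for the GARCH case but still clearly positive for the FIGARCH case, which is the sense in which the latter captures very long temporal dependence in the conditional variance.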

In light of the discussion in chapter 2, the conditional variances of commodity
cash and futures returns are modelled by GARCH and FIGARCH processes. The robust
Wald statistic is used to check whether the estimated FIGARCH model represents the
long memory property of the data better than a GARCH specification. Results for the
estimated ARMA$(p, q)$-FIGARCH$(P, \delta, Q)$ models for futures and cash returns
are presented in tables 4.4-4.7. The conditional mean specification for cash and
futures returns varies across commodities. An MA(1) specification was found to be
satisfactory for modelling the conditional mean of cash and futures returns for all
commodities except coffee, for which an MA(3) was found to be a better specification.
The estimates of the long memory parameter, $\delta$, for daily futures and cash
returns are significantly different from zero. Various specification tests were
performed. In particular, the last rows of tables 4.5 and 4.7 give the robust Wald
test statistics for a stationary GARCH(1, 1) model under the null hypothesis against a
FIGARCH$(1, \delta, 1)$ model under the alternative. For each of the commodities, the
robust Wald statistics indicate clear rejection of the null hypothesis when compared
with the critical values of a $\chi^2$ distribution with one degree of freedom. For
none of the commodities did the estimated GARCH models perform better than the
FIGARCH models. The sum of the estimates of $\alpha$ and $\beta$ in the GARCH models
is close to one for all commodities, indicating that the volatility process is highly
persistent. In all cases the standardized residuals exhibit less skewness and kurtosis
than the returns. Perhaps of greater importance, the Ljung-Box statistic, Q, fails to
reject the null hypothesis of independently and identically distributed standardized
residuals and squared standardized residuals for most of the commodities. One striking
result from table 4.7 is the finding of dual long memory in both the conditional mean
and the conditional variance of the coffee cash returns. As the table indicates, an
ARFIMA$(0, d, 1)$-FIGARCH$(1, \delta, 0)$ model seems to fit the coffee cash returns
better than the other specifications. Although the estimate of the long memory
parameter is small, it is significantly different from zero.
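Comparing a Wald statistic with a chi-squared distribution with one degree of freedom needs no special library, because the upper-tail probability has a closed form in terms of the complementary error function. The sketch below is illustrative; the function name and the example statistic are made up and are not values from the tables.

```python
import math

def chi2_1_pvalue(w):
    """Upper-tail p-value of a chi-squared(1) variable: P(X > w) = erfc(sqrt(w / 2))."""
    return math.erfc(math.sqrt(w / 2.0))

# Hypothetical robust Wald statistic for GARCH(1,1) against FIGARCH(1, delta, 1).
wald = 25.0
p_value = chi2_1_pvalue(wald)
reject_at_5_percent = p_value < 0.05
```

The 5 percent critical value of this distribution is approximately 3.84, so any statistic above that level rejects the GARCH(1, 1) null.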

To obtain some insight into the volatility in the commodity markets, the panels of
figure 4.5 present the commodity futures returns together with the estimated
conditional variances from the FIGARCH models. As the figures indicate, the estimated
models do very well in describing in-sample volatility in the commodity markets. The
FIGARCH models are quite accurate in capturing the time dependence and clustering in
the volatility.

In the FIGARCH model, once the mean parameters are taken out, the squared error
term coincides with the squared return. Hence, the FIGARCH model estimates provide
evidence that the squared returns exhibit long memory. As indicated in section 4.2,
the autocorrelations of squared returns, absolute returns, and intraday range seem to
mimic the autocorrelation structure of a long memory process. Moreover, the results of
the unit root and stationarity tests indicate that the volatility proxies are neither
unit root, I(1), nor I(0) processes, a result that can be interpreted as evidence of
long memory. To further analyze the long memory in the volatility proxies, tables 4.8
and 4.9 present the GPH estimates for different numbers of periodogram ordinates, and
table 4.10 reports results from local Whittle estimation. The results show that both
cash and futures squared returns and absolute returns exhibit the long memory
property, with the estimates of the long memory parameter being significantly greater
than zero and less than one. In most cases the estimate is less than 0.5, indicating
both long memory and stationarity. These findings are consistent with the FIGARCH
estimates. Interestingly, the intraday range also exhibits long memory, with long
memory parameter estimates that are usually greater than those of squared and absolute
returns.
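The GPH estimator mentioned above is a least squares regression of the log periodogram on log(4 sin^2(λ/2)) over the first m Fourier frequencies, with the long memory parameter estimated by minus the slope. The following is a minimal sketch of that regression; the function name, the bandwidth rule m = sqrt(T), and the white-noise example are assumptions made for illustration and do not reproduce the settings behind tables 4.8 and 4.9.

```python
import math
import random

def gph_estimate(x, m):
    """Log-periodogram (GPH) estimate of d using the first m Fourier frequencies."""
    n = len(x)
    mean = sum(x) / n
    xs, ys = [], []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        re = sum((x[t] - mean) * math.cos(lam * t) for t in range(n))
        im = sum((x[t] - mean) * math.sin(lam * t) for t in range(n))
        periodogram = (re * re + im * im) / (2.0 * math.pi * n)
        xs.append(math.log(4.0 * math.sin(lam / 2.0) ** 2))
        ys.append(math.log(periodogram))
    x_bar, y_bar = sum(xs) / m, sum(ys) / m
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    return -sxy / sxx   # d is minus the regression slope

# White noise has d = 0, so the estimate should be close to zero.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(512)]
d_hat = gph_estimate(sample, m=int(len(sample) ** 0.5))
```

In applications the function would be applied to a volatility proxy such as absolute returns, for several values of `m`, to check how sensitive the estimate is to the number of periodogram ordinates.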

4.4 Conclusion

In this chapter, we analyzed daily commodity cash and futures returns for certain
primary commodities. The returns are modelled through GARCH and FIGARCH models. The
chapter found evidence supporting the FIGARCH models, in the sense that they fit the
data better than the GARCH models. The FIGARCH specification is able to capture both
the long and the short run dynamic characteristics of the volatility process. The
estimates of the fractional integration parameter were found to be significantly
different from zero. Robust Wald tests were used to test the FIGARCH models against
the GARCH models, and in all cases the tests rejected a GARCH(1, 1) model in favor of
a FIGARCH$(1, \delta, 1)$ model. This implies that both time dependence and long term
dependence need to be considered in forecasting optimal hedge ratios. Doing so,
however, requires bivariate FIGARCH modelling of cash and futures returns. This is a
potentially interesting question that may also raise econometric issues to be studied
in the future.

For each commodity the chapter also considered measures of risk, or volatility
proxies, namely squared returns, absolute returns, and the intraday range. The sample
autocorrelations, the unit root and stationarity tests, and the estimates from the
semi-parametric methods, namely the GPH and local Whittle estimates of the long memory
parameter, indicate the presence of a long memory component in the volatility proxies.
The findings indicate that, in addition to squared and absolute returns, the intraday
range exhibits the long memory property, and it seems to be more persistent than the
squared and absolute returns. This supports the findings of Alizadeh et al. (1999) in
that the intraday range can be as good a proxy for volatility as squared and absolute
returns.

The findings in this chapter indicate that, on a practical level, one needs to take
into account the long memory in the conditional volatility of commodity cash and
futures returns when assessing risk and return relations in these markets. The results
also indicate that optimal hedge ratios should be time dependent and that the long
memory dynamics in the conditional volatility should be taken into account in
forecasting them. As shown in Baillie and Myers (1991), optimal hedge ratios should be
time dependent when there are GARCH effects. The findings in this chapter indicate
that, building on Baillie and Myers (1991), one can improve forecasts of hedge ratios
by considering the long memory in the conditional variance of cash and futures
returns.
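As a reminder of why time-varying second moments matter for hedging, the minimum-variance hedge ratio in the sense of Baillie and Myers (1991) can be written as the ratio of a conditional covariance to a conditional variance. The notation below ($b_{t-1}$, $R^{c}_t$, $R^{f}_t$, $\Omega_{t-1}$ for the hedge ratio, the cash return, the futures return, and the information set) is introduced here for illustration and is not taken from the chapter:

```latex
b_{t-1} = \frac{\operatorname{Cov}\left(R^{c}_t, R^{f}_t \mid \Omega_{t-1}\right)}
               {\operatorname{Var}\left(R^{f}_t \mid \Omega_{t-1}\right)}
```

If the conditional variance and covariance follow long memory processes, forecasts of $b_{t-1}$ at longer horizons inherit that persistence, which is why a bivariate long memory volatility model would be needed.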


BIBLIOGRAPHY

[1] Alizadeh, S., M. Brandt, and F. X. Diebold (1999), Range-based estimation of
stochastic volatility models or exchange rate dynamics are more interesting than
you think, Working paper, Stern School, NYU.

[2] Andersen, T. G. and T. Bollerslev (1998), Answering the skeptics: yes, standard
volatility models do provide accurate forecasts, International Economic Review
39, 885-905.

[3] Baillie, R.T., T. Bollerslev, and H.O. Mikkelsen (1996), Fractionally integrated
Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econo-
metrics 74, 3-30.

[4] Bollerslev, T. and H.O.A. Mikkelsen (1996), Modeling and pricing long memory
in stock market volatility, Journal of Econometrics 73, 151-184.

[5] Baillie, R. T., and R. J. Myers (1991), Bivariate GARCH estimation of the
optimal commodity futures hedge, Journal of Applied Econometrics 6, 109-124.

[6] Crato, N. (1994), Some international evidence regarding the stochastic behavior
of stock returns, Applied Financial Economics, 4, 33—9.

[7] de Lima, P.J.F. and N. Crato (1993), Long range dependence in the conditional
variance of stock returns, paper presented at the August 1993 Joint Statistical
Meeting, San Francisco. Proceedings of the Business and Economic Statistics
Section.

[8] Ding, Z., C.W.J. Granger, and R.F. Engle (1993), A long memory property of
stock returns and a new model, Journal of Empirical Finance, 1, 83-106.

[9] Fox, R., and Taqqu, M. S. (1986), Large sample properties of parameter estimates
for strongly dependent stationary Gaussian time series, Annals of Statistics, 14,
517-532.

[10] Granger, C., (1980), Long Memory Relationships and the Aggregation of Dy-
namic Models, Journal of Econometrics, 14, 227-238.


[11] Granger, C., and R. Joyeux (1980), An Introduction to Long Memory Time
Models and Fractional Differencing, Journal of Time Series Analysis, 1, 15-29.

[12] Granger, C. and Z. Ding (1995), Some properties of absolute return: an alternative
measure of risk, Annales D'Economie et de Statistique 40, 67-91.

[13] Granger, C. and Z. Ding (1996), Varieties of long memory models, Journal of
Econometrics 73, 61-77.

[14] Hosking, J. (1981), Fractional Differencing, Biometrika, 68, 165-176.

[15] Garman, M. B., and M. J. Klass (1980), On the estimation of security price
volatility from historical data, Journal of Business 53, 67-78.

[16] Geweke, J. and S. Porter-Hudak (1983), The estimation and application of long
memory time series models, Journal of Time Series Analysis 4, 221-238.

[17] Kwiatkowski, D., P.C.B. Phillips, P. Schmidt, and Y. Shin (1992), Testing the
null hypothesis of stationarity against the alternative of a unit root: How sure
are we that economic time series have a unit root? Journal of Econometrics 54,
159-178.

[18] Luce, R. D. (1980), Several possible measures of risk, Theory and Decision 12,
217-228.

[19] Parkinson, M. (1980), The extreme value method for estimating the variance of
the rate of return, Journal of Business 53, 61-65.

[20] Phillips, P.C.B. and P. Perron (1988), Testing for a unit root in time series
regression, Biometrika 75, 335-346.

[21] Rogers, L.C.G., and S. E. Satchell (1991), Estimating variance from high, low
and closing prices, Annals of Applied Probability 1, 504-512.

[22] Taylor, S. (1986), Modelling Financial Time Series, John Wiley & Sons, New
York.

Figure 4.1: Cash returns, absolute and squared returns
Panels: a. Coffee, b. Corn, c. Gold, d. Silver, e. Soybean, f. Unleaded Gasoline

Figure 4.2: Commodity future returns, absolute returns, and intraday range
Panels: a. Coffee, b. Corn, c. Gold, d. Silver, e. Soybean, f. Unleaded Gasoline

Figure 4.3: Autocorrelations for cash returns, absolute and squared returns
Panels: a. Coffee, b. Corn, c. Gold, d. Silver, e. Soybean, f. Unleaded Gasoline

 

 

 

 

Figure 4.4: Autocorrelations for future returns, absolute and squared returns, and
intraday range
Panels: a. Coffee, b. Corn, c. Gold, d. Silver, e. Soybean, f. Unleaded Gasoline

Figure 4.5: Future returns and estimated conditional variances
Panels: a. Coffee, b. Corn, c. Gold, d. Silver, e. Soybean, f. Unleaded Gasoline

Table 4.1: Summary statistics for commodity future and cash returns

commodity  series    mean     med      min     max   var.   skew.   kurt.
coffee     future  -0.020   0.007  -14.247  12.739  4.453  -0.275   7.289
coffee     cash    -0.020   0.000  -14.458  21.328  4.544   0.008  12.950
corn       future  -0.022   0.000   -5.264   5.213  1.416   0.016   5.098
corn       cash    -0.008   0.000   -7.486   7.903  2.168  -0.334   6.068
gold       future  -0.018  -0.025   -9.909   9.745  1.580  -0.046  10.056
gold       cash     0.008   0.000   -7.750   9.291  1.591   0.116   9.928
silver     future  -0.015   0.000   -9.776   7.801  2.082  -0.241   7.709
silver     cash     0.009   0.000   -9.432   5.827  1.805  -0.280   7.128
soybean    future  -0.002   0.000   -6.172   6.433  1.591  -0.070   5.201
soybean    cash    -0.004   0.000  -11.490   7.867  1.936  -0.446   7.049
u. gas.    future   0.039   0.034  -14.618  10.285  2.754  -0.202   7.768
u. gas.    cash    -0.018   0.000  -18.251  12.573  6.189  -0.242   6.882

Table 4.2: Summary statistics for commodity future absolute and squared returns
and intraday range

commodity  series            mean    med    min      max     var.   skew.    kurt.
coffee     absolute return  1.480  1.022  0.000   14.247    2.263   2.368   12.213
coffee     squared return   4.453  1.045  0.000  202.979  124.949   7.693   88.993
coffee     intraday range   1.542  1.271  0.000    9.866    1.303   1.838    9.038
corn       absolute return  0.871  0.657  0.000    5.264    0.657   1.855    7.447
corn       squared return   1.416  0.432  0.000   27.714    8.215   4.326   26.218
corn       intraday range   0.938  0.830  0.000    5.676    0.262   1.857    9.886
gold       absolute return  0.813  0.503  0.000    9.909    0.919   2.842   15.267
gold       squared return   1.580  0.253  0.000   98.190   22.619   8.755  119.255
gold       intraday range   0.749  0.556  0.000    7.162    0.542   2.469   13.416
silver     absolute return  0.996  0.677  0.000    9.776    1.090   2.440   12.565
silver     squared return   2.081  0.458  0.000   95.568   29.124   7.934   98.294
silver     intraday range   0.168  0.000  0.000    4.535    0.213   4.300   27.064
soybean    absolute return  0.922  0.678  0.000    6.433    0.742   1.898    7.652
soybean    squared return   1.592  0.460  0.000   41.385   10.656   4.537   30.726
soybean    intraday range   0.945  0.822  0.000    5.381    0.293   1.763    8.452
u. gas.    absolute return  1.187  0.877  0.000   14.618    1.347   2.462   14.960
u. gas.    squared return   2.755  0.769  0.000  213.693   51.209  12.336  279.598
u. gas.    intraday range   0.789  0.554  0.000   13.058    0.872   2.355   17.903

Table 4.3: KPSS and Phillips-Perron test results for commodity future log price
levels, returns, absolute returns, squared returns and intraday range

series coffee corn gold silver soybean u. gasoline

a. KPSS Test: Commodity Future Prices
level 2.682 2.804 5.930 3.698 2.892 2.813
return 0.080 0.079 0.160 0.163 0.041 0.107
squared return 2.333 0.180 4.779 0.305 0.484 0.570
absolute return 3.633 0.245 9.202 0.414 0.798 0.653
intraday range 4.909 0.351 9.619 0.911 1.578 0.115

b. Phillips-Perron Test: Commodity Future Prices
level -2.096 -2.551 -1.908 -2.400 -3.026 -3.254
return -64.127 -60.126 -79.168 -46.425 -71.684 -52.950
squared return -53.998 -51.024 -59.158 -41.983 -59.146 -43.791
absolute return -49.740 -51.875 -57.835 -40.487 -60.718 -43.756
intraday range -40.084 -46.548 -42.264 -35.396 -49.315 -28.550

c. KPSS Test: Commodity Cash Prices
level 2.401 1.856 6.459 4.643 1.663 1.238
return 0.089 0.054 0.270 0.145 0.044 0.063
squared return 6.055 0.278 4.952 0.295 0.375 0.412
absolute return 11.567 0.335 8.945 0.356 0.658 1.099

d. Phillips-Perron Test: Commodity Cash Prices
level -1.447 -2.065 -1.956 -2.019 -2.862 -3.004
return -61.993 -59.290 -82.966 -45.274 -73.017 -51.983
squared return -47.393 -49.861 -53.496 -39.515 -57.993 -47.105
absolute return -52.613 -48.496 -53.590 -39.029 -58.757 -44.989

Table 4.4: Estimated MA-GARCH models for the commodity future returns

coffee corn gold silver soybean u. gasoline
μ -0.044 -0.018 0.036 -0.045 -0.028 0.028
(0.025) (0.016) (0.010) (-0.030) (0.014) (0.024)
θ 0.040 0.048 . . . .
(0.017) (0.018) . . . .
ω 0.042 0.033 0.002 0.019 0.035 0.055
(0.015) (0.009) (0.001) (0.010) (0.008) (0.019)
α 0.110 0.096 0.053 0.026 0.085 0.097
(0.016) (0.012) (0.009) (0.007) (0.009) (0.017)
β 0.888 0.882 0.949 0.956 0.893 0.883
(0.016) (0.015) (0.008) (0.009) (0.011) (0.021)
ln(L) -8530.280 -6075.413 -8935.093 -3505.472 -8205.102 -5703.621
Skewness -0.122 -0.318 -0.304 -0.166 0.045 -0.113
Kurtosis 4.900 7.026 7.183 7.118 4.381 4.355
Q(20) 22.544 26.091 28.905 28.300 21.135 28.822
Q²(20) 30.579 19.145 29.009 11.033 36.399 17.683
T 4206 4055 6295 2002 5267 3153

Key: ln(L) is the maximized log likelihood. The numbers in parentheses are the asymptotic
robust QMLE standard errors of the corresponding parameter estimates. Q(20) and Q²(20)
are the Ljung-Box statistics at 20 degrees of freedom based on the standardized
residuals and the squared standardized residuals, respectively. The skewness and kurtosis
are based on the standardized residuals. T is the sample size.
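
The diagnostics in the key can be illustrated with a minimal sketch. It assumes the standard GARCH(1,1) recursion h_t = ω + α e_{t-1}^2 + β h_{t-1} and the textbook Ljung-Box formula; the function names and the choice of the sample variance as the starting value are illustrative assumptions, not the estimation code used for the table.

```python
import math

def garch11_variance(resid, omega, alpha, beta):
    """Conditional variances h_t = omega + alpha * e_{t-1}^2 + beta * h_{t-1}."""
    # Assumption: the recursion is started at the sample variance of the residuals.
    h = [sum(e * e for e in resid) / len(resid)]
    for e_prev in resid[:-1]:
        h.append(omega + alpha * e_prev ** 2 + beta * h[-1])
    return h

def ljung_box(x, lags=20):
    """Ljung-Box statistic Q = T (T + 2) * sum_{k=1}^{lags} r_k^2 / (T - k)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k.
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total
```

With residuals `e` and variances `h`, the standardized residuals are `z = [a / math.sqrt(b) for a, b in zip(e, h)]`; applying `ljung_box` to `z` and to `[v * v for v in z]` gives statistics of the kind reported as Q(20) and Q²(20).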

Table 4.5: Estimated MA-FIGARCH models for the commodity future returns

coffee corn gold silver soybean u. gasoline
μ -0.037 -0.018 0.034 -0.042 -0.029 0.027
(0.025) (0.016) (0.010) (-0.030) (0.014) (0.024)
θ 0.040 0.050 . . . 0.038
(0.018) (0.017) . . . (0.019)
δ 0.533 0.582 0.424 0.241 0.546 0.541
(0.085) (0.121) (0.050) (0.040) (0.106) (0.089)
ω 0.062 0.036 0.015 0.198 0.041 0.136
(0.024) (0.012) (0.006) (0.046) (0.014) (0.032)
β 0.684 0.653 0.691 0.579 0.650 0.451
(0.070) (0.097) (0.073) (0.024) (0.092) (0.095)
φ 0.326 0.162 0.388 0.420 0.168 .
(0.062) (0.052) (0.066) (0.024) (0.045) .
ln(L) -8514.112 -6080.833 -8907.757 -3512.494 -8209.061 -5706.865
Skewness -0.148 -0.121 -0.318 -0.112 0.064 -0.130
Kurtosis 4.669 4.201 7.026 7.290 4.531 4.317
Q(20) 24.182 27.984 26.091 27.913 21.998 23.087
Q²(20) 30.796 31.805 19.145 9.629 36.046 12.738
T 4206 4055 6295 2002 5267 3153
W(δ=0) 39.351 23.206 73.116 36.644 26.358

Key: W(δ=0) is the robust Wald test statistic for testing the null of a GARCH(1,1)
model against a FIGARCH(1,δ,1) model. The rest of the table is the same as Table 4.4.

Table 4.6: Estimated MA-GARCH models for the commodity cash returns

coffee corn gold silver soybean u. gasoline
μ -0.066 0.019 0.008 -0.025 -0.010 0.036
(0.039) (0.019) (0.010) (-0.027) (0.015) (0.040)
d 0.151 . . . . .
(0.030) . . . . .
θ -0.097 . -0.060 . . 0.080
(0.038) . (0.016) . . (0.019)
ω 0.022 0.041 0.009 0.030 0.028 0.111
(0.013) (0.011) (0.009) (0.018) (0.007) (0.044)
α 0.094 0.107 0.081 0.041 0.084 0.076
(0.015) (0.013) (0.044) (0.015) (0.010) (0.017)
β 0.909 0.878 0.920 0.943 0.903 0.907
(0.014) (0.014) (0.045) (0.022) (0.011) (0.022)
ln(L) -7970.439 -6890.949 -9006.606 -3338.934 -8598.366 -7096.422
Skewness -0.381 -0.399 -0.023 -0.167 0.182 -0.088
Kurtosis 13.492 4.800 11.233 5.997 4.490 4.352
Q(20) 30.029 32.118 50.366 26.312 26.110 22.398
Q²(20) 18.066 19.05 23.071 25.830 23.917 17.398
T 4206 4055 6295 2002 5267 3153

Key: d is the long memory parameter in the ARFIMA(0,d,1) model that is fitted to the
conditional mean of coffee cash returns. The rest of the table is the same as Table 4.4.

Table 4.7: Estimated MA-FIGARCH models for the commodity cash returns

coffee corn gold silver soybean u. gasoline
μ -0.074 0.019 0.009 -0.021 -0.012 -0.031
(0.034) (0.019) (0.009) (-0.027) (0.015) (0.040)
d 0.074 . . . . .
(0.017) . . . . .
θ 0.074 0.030 -0.063 . . 0.084
(0.022) (0.019) (0.014) . . (0.019)
δ 0.367 0.499 0.342 0.268 0.668 0.438
(0.047) (0.144) (0.034) (0.046) (0.158) (0.097)
ω 0.172 0.110 0.042 0.137 0.036 0.216
(0.077) (0.028) (0.022) (0.034) (0.010) (0.102)
β 0.235 0.396 0.500 0.570 0.738 0.602
(0.069) (0.164) (0.127) (0.025) (0.107) (0.127)
φ . . 0.313 0.429 0.151 0.280
. . (0.130) (0.025) (0.059) (0.084)
ln(L) -8013.027 -6887.237 -8824.516 -3335.056 -8604.379 -7098.089
Skewness -0.841 -0.380 -0.034 -0.111 -0.156 -0.089
Kurtosis 14.736 4.739 8.786 5.778 4.527 4.445
Q(20) 30.775 28.622 53.507 29.054 25.693 22.824
Q²(20) 22.495 32.006 7.564 20.588 27.344 11.748
T 4206 4055 6295 2002 5267 3153
W(δ=0) . 99.924 34.201 17.879 41.593

Key: W(δ=0) is the robust Wald test statistic for testing the null of a GARCH(1,1)
model against a FIGARCH(1,δ,1) model. The rest of the table is the same as Table 4.4.

Table 4.8: GPH estimation results for the cash returns, squared and absolute returns

m coffee corn gold silver soybean u. gasoline

a. Cash returns
T^0.55 -0.002 0.132 0.004 -0.071 -0.135 -0.154
(-0.037) (2.023) (-0.077) (-1.313) (-2.220) (-2.206)
T^0.65 0.078 0.081 0.082 -0.071 -0.037 -0.041
(1.838) (1.876) (2.193) (-1.313) (0.943) (-0.887)
T^0.75 0.047 0.078 0.010 -0.027 -0.080 -0.057
(1.669) (2.732) (0.430) (-0.717) (3.087) (-1.832)

b. Cash squared returns
T^0.55 0.276 0.478 0.496 0.385 0.345 0.358
(6.467) (7.296) (8.536) (4.845) (5.663) (5.121)
T^0.65 0.276 0.372 0.399 0.170 0.474 0.248
(6.467) (8.630) (10.668) (3.125) (11.971) (5.299)
T^0.75 0.247 0.210 0.355 0.146 0.337 0.211
(8.795) (7.393) (14.689) (3.925) (13.064) (6.730)

c. Cash absolute returns
T^0.55 0.455 0.519 0.496 0.438 0.435 0.394
(7.020) (7.927) (8.542) (5.512) (7.143) (5.633)
T^0.65 0.373 0.416 0.421 0.264 0.463 0.298
(8.755) (9.650) (11.260) (4.848) (11.681) (6.371)
T^0.75 0.277 0.281 0.370 0.164 0.334 0.249
(9.875) (9.860) (15.351) (4.423) (12.945) (7.944)

Key: m is the number of periodogram ordinates used in the GPH estimator.
The values in parentheses are the t statistics for testing the null H0: δ = 0 against the
alternative H1: δ > 0. The t values are computed using the theoretical variance
π²/(24m).
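
As a rough illustration of how entries of this kind can be computed, the sketch below implements the usual log-periodogram regression: ln I(λ_j) is regressed on ln(4 sin²(λ_j/2)) for the first m Fourier frequencies, the estimate is minus the slope, and the t statistic uses the theoretical variance π²/(24m) from the key. The function name, the plain (non-FFT) periodogram and the truncation of m to an integer are assumptions made for the example.

```python
import cmath
import math

def gph_estimate(x, power=0.55):
    """Log-periodogram (GPH) estimate of the long memory parameter.

    Returns (estimate, t statistic, m), where m = int(T ** power).
    """
    n = len(x)
    m = int(n ** power)  # number of periodogram ordinates (truncation is an assumption)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    regressor, log_pgram = [], []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        # Periodogram ordinate I(lam) = |sum_t x_t exp(-i lam t)|^2 / (2 pi n).
        dft = sum(d * cmath.exp(-1j * lam * t) for t, d in enumerate(dev))
        log_pgram.append(math.log(abs(dft) ** 2 / (2.0 * math.pi * n)))
        regressor.append(math.log(4.0 * math.sin(lam / 2.0) ** 2))
    xbar = sum(regressor) / m
    ybar = sum(log_pgram) / m
    slope = (sum((a - xbar) * (b - ybar) for a, b in zip(regressor, log_pgram))
             / sum((a - xbar) ** 2 for a in regressor))
    estimate = -slope
    t_stat = estimate / math.sqrt(math.pi ** 2 / (24.0 * m))
    return estimate, t_stat, m
```

For T = 4206 and power 0.55 this gives m = 98, so an estimate of 0.455 corresponds to a t statistic of about 7.02, in line with the coffee entry in panel c.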

Table 4.9: GPH estimation results for the future returns, squared and absolute returns
and intraday range

m coffee corn gold silver soybean u. gasoline

a. Future returns
T^0.55 -0.023 0.039 -0.008 -0.055 -0.040 0.078
(-0.357) (0.599) (-0.133) (-0.692) (-0.659) (1.115)
T^0.65 0.045 0.096 0.078 -0.075 -0.020 0.029
(1.044) (2.219) (2.087) (-1.375) (-0.508) (0.629)
T^0.75 -0.008 -0.021 0.009 -0.057 -0.033 -0.053
(-0.279) (-0.721) (0.360) (-1.548) (-1.268) (-1.680)

b. Future squared returns
T^0.55 0.219 0.429 0.444 0.307 0.413 0.437
(3.377) (6.549) (7.655) (3.857) (6.783) (6.246)
T^0.65 0.327 0.419 0.370 0.170 0.382 0.347
(7.673) (9.717) (9.901) (3.124) (9.638) (7.409)
T^0.75 0.271 0.354 0.415 0.084 0.365 0.262
(9.652) (12.452) (17.191) (2.260) (14.152) (8.376)

c. Future absolute returns
T^0.55 0.375 0.400 0.464 0.339 0.442 0.519
(5.796) (6.110) (7.993) (4.268) (7.257) (7.421)
T^0.65 0.401 0.367 0.403 0.211 0.441 0.316
(9.411) (8.503) (10.779) (3.873) (11.137) (6.756)
T^0.75 0.314 0.336 0.350 0.162 0.366 0.285
(11.170) (11.807) (14.487) (4.365) (14.194) (9.104)

d. Future intraday ranges
T^0.55 0.468 0.421 0.483 0.415 0.476 0.558
(7.218) (6.429) (8.324) (5.219) (7.827) (7.979)
T^0.65 0.515 0.480 0.490 0.370 0.532 0.531
(12.079) (11.123) (13.115) (6.802) (13.440) (11.352)
T^0.75 0.415 0.374 0.409 0.239 0.395 0.501
(14.785) (13.156) (16.959) (6.448) (15.299) (16.014)

Key: Same as Table 4.8.

Table 4.10: Local Whittle estimates of the long memory parameter for commodity cash
and future returns and volatility proxies

series coffee corn gold silver soybean u. gasoline

a. Cash series
return 0.081 0.082 0.047 -0.072 -0.009 -0.180
(0.051) (0.051) (0.038) (0.076) (0.052) (0.053)
squared return 0.431 0.564 0.440 0.394 0.422 0.384
(0.044) (0.060) (0.035) (0.071) (0.048) (0.051)
absolute return 0.552 0.596 0.494 0.710 0.600 0.562
(0.039) (0.057) (0.032) (0.084) (0.044) (0.050)

b. Future series
return 0.094 -0.018 0.048 -0.043 -0.045 0.057
(0.051) (0.045) (0.038) (0.075) (0.043) (0.055)
squared return 0.379 0.599 0.349 0.323 0.472 0.408
(0.043) (0.064) (0.031) (0.067) (0.049) (0.054)
absolute return 0.552 0.538 0.503 0.473 0.598 0.583
(0.041) (0.057) (0.032) (0.074) (0.044) (0.052)
intraday range 0.562 0.491 0.567 0.452 0.644 0.774
(0.043) (0.052) (0.036) (0.065) (0.040) (0.078)

Key: The values in parentheses are the robust standard errors.
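
For orientation only, a local Whittle estimate can be sketched as the minimizer over d of the commonly used objective R(d) = ln[(1/m) Σ λ_j^{2d} I(λ_j)] - (2d/m) Σ ln λ_j, where I(λ_j) are periodogram ordinates at the first m Fourier frequencies. Whether this is the exact variant behind the table, the grid search, the search range and the function name are all assumptions of this sketch; the robust standard errors reported above are not reproduced.

```python
import math

def local_whittle(freqs, periodogram):
    """Grid-search minimizer of the local Whittle objective over d.

    freqs: the first m Fourier frequencies (all > 0).
    periodogram: periodogram ordinates at those frequencies.
    """
    m = len(freqs)
    mean_log_freq = sum(math.log(lam) for lam in freqs) / m

    def objective(d):
        scale = sum(lam ** (2.0 * d) * i_j
                    for lam, i_j in zip(freqs, periodogram)) / m
        return math.log(scale) - 2.0 * d * mean_log_freq

    # Hypothetical search range for d, in steps of 0.001.
    grid = [k / 1000.0 for k in range(-490, 1000)]
    return min(grid, key=objective)
```

If the ordinates follow an exact power law, I(λ_j) = λ_j^{-2d}, the objective is minimized at that d, which gives a simple sanity check of the implementation.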

CHAPTER 5

On the long memory properties of Emerging Capital Markets:
Evidence from Istanbul Stock Exchange

5.1 Introduction

The presence of long memory components in stock returns has important implications
for many of the paradigms of financial economics. If stock returns display long-term
dependence, then they exhibit significant autocorrelation between observations widely
separated in time. Since the series realizations are not independent over time,
realizations from the remote past can help predict future returns, giving rise to
the possibility of consistent speculative profits. This is in contrast to the martingale
or random walk behavior that many theoretical financial asset pricing models assume.
Optimal consumption/savings and portfolio decisions may therefore become sensitive
to the investment horizon. The presence of long memory in asset returns also
contradicts the weak form of the market efficiency hypothesis, which states that,
conditional on past returns, future asset returns are unpredictable. A finding of
long memory in asset returns calls into question linear modelling and invites the
development of nonlinear pricing models at the theoretical level to account for long
memory behavior. Mandelbrot (1971) observes that in the presence of long memory,
the arrival of new market information cannot be fully arbitraged away and
martingale models of asset prices cannot be obtained from arbitrage. If the
underlying continuous stochastic processes of asset returns exhibit long memory, then
the pricing of derivatives by martingale models, as well as statistical inference
concerning asset pricing models based on standard testing procedures (Yajima, 1985),
may not be appropriate.

Because of the theoretical and empirical importance of the issue, there is an extensive
literature analyzing the long memory properties of financial asset returns in major
financial markets. Greene and Fielitz (1977), using the R/S statistic of Hurst
(1951), test for long-term dependence in the daily returns of 200 individual stocks on
the New York Stock Exchange from December 23, 1963, to November 29, 1968, and report
evidence of persistence. Aydogan and Booth (1988) also use the original R/S analysis
to test for long memory in common stock returns. Lo (1991), using a modified
version of the R/S statistic that controls for possible short-term dependence in
the data, finds no evidence of long memory in the monthly and daily returns
on Center for Research in Security Prices (CRSP) stock indexes. Ding, Granger, and
Engle (1993) examine several transformations of the absolute value of daily returns
on the Standard and Poor's (S&P) 500 and obtain considerable evidence of long memory
in the squared and absolute returns. Crato (1994), using the exact maximum likelihood
method of Sowell (1992), finds no evidence of long memory in the stock return series
of the G-7 countries. Using both the modified R/S method of Lo (1991) and the Geweke
and Porter-Hudak (1983) (GPH) method, Cheung and Lai (1995) find no evidence of
persistence in several international stock return series. Lobato and Savin (1998)
test for long memory in daily S&P 500 returns and their squares using semi-parametric
procedures. Their results indicate no evidence of long memory in the levels of
daily returns, but evidence of long memory in absolute and squared returns.

Despite the extensive literature on the long memory properties of prices in major
stock markets, little research has been done on the time series properties
of emerging market asset prices. Outside the world's developed economies, there
is a host of emerging capital markets (ECMs) in Europe, Latin America, Asia, the
Middle East and Africa. As pointed out by Harvey (1995), compared to developed
markets, ECMs exhibit higher expected returns as well as higher volatility. Because
ECM returns have low correlation with developed countries' stock markets, the
unconditional portfolio risk of a world investor who holds them would be significantly
reduced. These markets have attracted a great deal of attention from investors and
investment funds seeking to further diversify their portfolios, as they provide a new
menu of opportunities for investors of the world. Despite temporary setbacks, ECMs
continue to be important conduits of diversification, and a complete characterization
and understanding of the dynamic behavior of stock returns in ECMs is warranted.
ECMs are likely to exhibit characteristics different from those observed in developed
capital markets. Barkoulas et al. (2000) recently analyzed the long memory properties
of weekly Greek stock market data and obtained strong evidence of long memory in
the conditional mean process, a finding contrary to the results from developed stock
markets. One may expect biases due to market thinness and non-synchronous trading
to be more severe in ECMs. Moreover, in contrast to developed capital markets, which
are highly efficient in terms of the speed of information reaching all traders,
investors in emerging capital markets may react slowly and gradually to new
information. All of this may lead one to expect ECM stock returns to behave
differently and to have distinct properties compared to those of developed capital
markets.

The purpose of this chapter is to analyze the long memory properties of stock price
returns in an emerging capital market: the Istanbul Stock Exchange (ISE). Specifically,
the chapter tries to answer the following question. Do daily and weekly ISE index
returns have the long memory property, in the sense that index returns are
approximately uncorrelated while squared and absolute returns display very persistent
autocorrelation? To my knowledge, no study has analyzed the long memory dynamics of
Istanbul Stock Exchange market returns.

The ISE, the only stock exchange in Turkey, was formally inaugurated in late 1985.
The number of companies traded on the exchange increased from 80 at the end of 1986
to 262 at the end of 1998 (Yuksel 2000). The national market is the major component
of the ISE. The total market capitalization of the firms traded increased from 938
million US dollars at the end of 1986 to 56 billion US dollars in the middle of 1999.
Turkey has one of the most liberal foreign exchange regimes in the world, with a fully
convertible currency as well as a policy that has allowed foreign institutional and
individual investment in securities listed on the ISE since 1989. Turkish stock and
bond markets are open to foreign investors, without any constraints on the repatriation
of capital and profits. Between the beginning of 1996 and the end of 1999 alone, foreign
investment in the ISE more than tripled. According to Yuksel (2000), about half of the
floating equity in the ISE is owned by foreign investors. These observations show that
the ISE is one of the important ECMs in the world economy, and that a better
understanding of the dynamic properties of the ISE index returns will be useful not
only for comparison purposes, but also for international investors whose portfolios
include equities from the ISE.

This chapter uses the Fractionally Integrated Generalized Autoregressive Conditional
Heteroscedasticity (FIGARCH) model of Baillie et al. (1996). The Generalized
Autoregressive Conditional Heteroscedasticity (GARCH) model attempts to account for
volatility persistence, but has the feature that persistence decays relatively fast.
We therefore use the GARCH model as a benchmark and compare its results with those of
the FIGARCH model, which is capable of modelling very long temporal dependencies in
the conditional variance of a process. In order to better assess the presence
of long memory in the volatility of index returns, this chapter also models absolute
returns and squared returns using the Autoregressive Fractionally Integrated Moving
Average (ARFIMA) model of Granger and Joyeux (1980) and Hosking (1981). Moreover,
estimates of the long memory parameter for the volatilities of stock returns
are also obtained from semi-parametric methods. In particular, the GPH estimator
of Geweke and Porter-Hudak (1983) and a local Whittle estimator based on Fox
and Taqqu (1986) are used. The findings of this chapter indicate the presence of
long memory in the volatility process of ISE 100 stock returns. Contrary to empirical
evidence from some other ECMs, the conditional mean of ISE 100 daily and weekly
dollar stock index returns does not possess a long memory component.
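
To make the contrast between fast-decaying and long memory dependence concrete, the following sketch computes the weights of the fractional differencing operator (1 - L)^d = Σ π_k L^k that underlies both ARFIMA and FIGARCH, using the recursion π_0 = 1, π_k = π_{k-1} (k - 1 - d) / k. The function name is an assumption made for illustration.

```python
def frac_diff_weights(d, n_weights):
    """First n_weights coefficients of (1 - L)^d = sum_k pi_k L^k."""
    weights = [1.0]
    for k in range(1, n_weights):
        # Recursion pi_k = pi_{k-1} * (k - 1 - d) / k.
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

if __name__ == "__main__":
    # For 0 < d < 1 the weights die out hyperbolically rather than exponentially.
    for k, w in enumerate(frac_diff_weights(0.4, 6)):
        print(k, round(w, 4))
```

For d = 1 the weights reduce to the ordinary first difference (1, -1, 0, 0, ...), while for fractional d they decay at a hyperbolic rate, which is the source of the slowly decaying dependence described above.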

The rest of the chapter is organized as follows. Section 5.2 describes the data
and examines the empirical autocorrelations of the series. Section 5.3 presents and
discusses the empirical results. The last section provides the conclusion.

5.2 The Data

The data set consists of daily US dollar-Turkish lira spot exchange rates and the
Turkish stock index based on the closing prices of a value-weighted index comprising
the top one hundred firms listed on the ISE National Market, ranked by market
capitalization. Exchange rate data are obtained from the Central Bank of the Republic
of Turkey (CBRT), while ISE 100 index data are obtained from the ISE. In choosing the
stocks included in the index, the stocks are ranked in descending order according to
market value and daily average traded value. The stocks with the highest market
values and daily average trading values are included in the ISE National-100 index.
The sample period spans 01/04/1988 to 09/28/2001, for a total of 3440 observations.
The index used in this study is expressed in US dollars in order to avoid
the effect of local inflation risks. The base year for the index is adjusted so that the
index at 01/04/1988 is equal to 100. The following formula is then used to convert
the index into a dollar-denominated base: $100 \times (P_t / S_t)\, S_{base}$, where
$P_t$ is the index at time $t$, $S_t$ is the spot exchange rate at date $t$ and
$S_{base}$ is the spot exchange rate at the base date. The weekly index series is
constructed from the daily data by taking the index value for Thursday of each week.
In cases where data are not available for Thursday, Wednesday data are used.

Following standard practice, stock returns are defined as
$R_t = 100 \times \Delta \ln(P_t)$, where $P_t$ is the stock index at date $t$, absolute
returns as $|R_t|$, and squared returns as $R_t^2$. Figure 5.1 gives the graphs of the
daily stock index returns, absolute returns and squared returns over the sample period.
It appears from the graphs that relatively volatile periods, characterized by large
price changes, alternate with more tranquil periods in which the index remains more or
less stable. Large index returns (both positive and negative) seem to occur in
clusters, and so does volatility. The volatility clustering phenomenon, which is
typical of asset prices and exchange rates, seems to occur in the ISE as well.
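
The transformations described in this section can be sketched as follows. The sketch assumes that the dollar-denominated index is the local-currency index divided by the spot rate and rebased to 100 at the first observation; the function names and the toy inputs are hypothetical. Since returns are log differences, any constant scaling of the index leaves them unchanged.

```python
import math

def dollar_index(prices, rates):
    """Dollar-denominated index, rebased to 100 at the first (base) date.

    prices: local-currency index levels; rates: local currency per US dollar.
    """
    base = prices[0] / rates[0]
    return [100.0 * (p / s) / base for p, s in zip(prices, rates)]

def return_series(index):
    """Returns R_t = 100 * (ln P_t - ln P_{t-1}), with |R_t| and R_t^2."""
    returns = [100.0 * (math.log(b) - math.log(a))
               for a, b in zip(index, index[1:])]
    absolute = [abs(r) for r in returns]
    squared = [r * r for r in returns]
    return returns, absolute, squared
```

For example, an index that rises 10 percent in local currency while the currency depreciates 10 percent against the dollar has a flat dollar index and a dollar return of zero.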

Summary statistics for the index returns are given in table 5.1. The table indicates
that both daily and weekly stock returns have small negative means and medians over
the sample period. One of the usual ways of getting an idea of the distribution of
a time series y, is to look at the kurtosis and the skewness and compare them with
that of a normal random variable. The last two columns of table 5.1 indicate that
the kurtosis of both daily and weekly returns are much larger than that of a normal
random variable. This reflects the fact that the tails of the distribution of index
returns are fatter than the tails of the normal distribution. This in turn means that

large observations occur more often than one might expect for a normally distributed

186

variable.

Since any symmetric distribution has skewness equal to zero, table 5.1 indicates
that the distributions of daily and weekly stock index returns have some asymmetry.
The negative values of skewness indicate that for the ISE stock returns over the sample
period considered, the left tail of the distribution is fatter than the right tail, that is, large
negative returns tend to occur more often than large positive ones. The analysis here
indicates that the daily stock return distribution is far from being normal.
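The return definition and the moment comparison above can be sketched in a few lines. This is only an illustration under stated assumptions: the price list is hypothetical (it is not ISE data), the function names are made up for this sketch, and kurtosis is taken as the raw fourth standardized moment, so that a normal variable has kurtosis 3.

```python
import math

def log_returns(prices):
    """Percentage log returns R_t = 100 * (ln P_t - ln P_{t-1})."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices[:-1], prices[1:])]

def summary_stats(x):
    """Sample mean, variance, skewness and raw kurtosis (normal = 3)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }

# Hypothetical index levels, for illustration only.
prices = [100.0, 101.5, 99.0, 99.5, 104.0, 97.0, 98.0, 98.5]
returns = log_returns(prices)
abs_returns = [abs(r) for r in returns]      # |R_t|
sq_returns = [r * r for r in returns]        # R_t^2
stats = summary_stats(returns)
```

A negative value of `stats["skewness"]` would correspond to a fatter left tail, and a value of `stats["kurtosis"]` well above 3 to fatter tails than the normal distribution.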

To gain some insight into the dependence structure of the series, figure 5.2 displays
the first 100 autocorrelations for the daily stock index, index returns, absolute returns
and squared returns together with two-sided 5 percent critical values ($\pm 1.96/\sqrt{T}$,
where T is the sample size). The asymptotic critical values are not strictly valid for a
process with ARCH effects, but they remain useful as guidelines. It
is clear from the figure that the ISE 100 log index has autocorrelations close to unity at
all selected lags and, hence, it seems to mimic the correlation properties of a random
walk process. There is a small, positive but significant first order autocorrelation
in the stock index returns, while higher orders are not significant at conventional
levels. On the other hand, for the absolute and squared returns, the autocorrelations
start off at a moderate level (about 0.32) but remain significantly positive for a
substantial number of lags. Moreover, autocorrelation in the absolute returns is
generally somewhat higher than the autocorrelation in the squared returns. This
illustrates what has become known as the 'Taylor property' (see Taylor, 1986, pp. 52-55),
that is, when calculating the autocorrelations for the series $|R_t|^{\delta}$ for various values of
$\delta$, one almost invariably finds that autocorrelations are largest for $\delta = 1$. As is evident
from the figure, autocorrelations for absolute returns are not only larger than those
of squared returns, but also much more persistent in the sense that they decay much
more slowly. The autocorrelations in absolute and squared returns seem to mimic
the correlation properties of a long memory process rather than a short memory
stationary process, for which autocorrelations decay to zero at an exponential rate.
The autocorrelations in absolute and squared returns
decay very slowly, indicating that linear association between distant observations is
persistent and that autocorrelations decay at a hyperbolic rate. This
behavior of autocorrelations in absolute and squared returns is consistent with
time series models with long memory or long range dependence. The above described
characteristics of autocorrelations in the ISE 100 index, index returns, absolute and
squared returns are in conformity with the findings from developed stock markets.
For example, see Ding, Granger, and Engle (1993).
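The correlogram comparison described above can be reproduced with a short routine. The following is a minimal sketch, not the code behind figure 5.2: the function names are hypothetical, the sample autocorrelation uses the usual biased estimator, and the band is the $\pm 1.96/\sqrt{T}$ guideline quoted in the text.

```python
import math

def autocorr(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def taylor_autocorr(returns, delta, max_lag):
    """Autocorrelations of |R_t|**delta, used to compare delta = 1 with delta = 2."""
    return autocorr([abs(r) ** delta for r in returns], max_lag)

def critical_band(n):
    """Two-sided 5 percent guideline band, 1.96 / sqrt(T)."""
    return 1.96 / math.sqrt(n)

# Hypothetical returns, for illustration only.
returns = [0.5, -2.0, 1.5, -0.2, 0.1, 3.0, -2.5, 0.3, 0.4, -0.1]
acf_abs = taylor_autocorr(returns, 1.0, 3)   # absolute returns
acf_sq = taylor_autocorr(returns, 2.0, 3)    # squared returns
band = critical_band(len(returns))
```

Under the Taylor property one would expect the entries of `acf_abs` to exceed those of `acf_sq` at most lags on real return data; the toy series here is too short to show this.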

5.3 Empirical Results

In light of the discussion in section 5.2, the conditional variance of the ISE 100 stock
index returns is modelled by the FIGARCH process, which allows one to model
persistence in the autocorrelations of index returns as well as the volatility clustering
phenomenon. The robust Wald statistic is used to check whether the estimated FIGARCH
model better represents the long memory property of the data compared to a GARCH
specification. Results of the estimated ARMA(P, Q)-FIGARCH(p, $\delta$, q) models
for returns are presented in table 5.2. The estimate of the long memory parameter, $\delta$,
for daily data is 0.538 and for the weekly returns it is 0.319. These estimates are
significantly different from zero. Various tests for specification of the models were
performed. In particular, a robust Wald test of a stationary GARCH(1, 1) model
under the null hypothesis versus a FIGARCH(1, $\delta$, 1) model under the alternative
hypothesis has a numerical value of 35.060, which shows a clear rejection of the
null hypothesis when compared with the critical values of a $\chi^2$ distribution with one
degree of freedom. At neither data frequency did the estimated GARCH models
perform better than the FIGARCH models, and the sum of the estimates of $\alpha$ and
$\beta$ in the GARCH models was very close to one, indicating that the volatility process
is highly persistent. In both daily and weekly returns the standardized residuals from
the estimated models exhibit less skewness and kurtosis than the returns. The Box-Pierce
portmanteau statistic, Q, fails to reject the null hypothesis of independently
and identically distributed squared standardized residuals at conventional significance
levels.
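As a quick check of the rejection reported above, the Wald value of 35.060 can be compared with the $\chi^2(1)$ distribution. The sketch below assumes only the standard identity that a $\chi^2(1)$ variable is the square of a standard normal; the function name is hypothetical and 3.841 is the usual tabulated 5 percent critical value.

```python
import math

def chi2_1_sf(x):
    """Upper tail P(chi2(1) > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

wald = 35.060        # robust Wald statistic reported in the text
crit_5pct = 3.841    # tabulated 5 percent critical value of chi2(1)

p_value = chi2_1_sf(wald)
reject = wald > crit_5pct
```

The resulting `p_value` is far below any conventional significance level, in line with the clear rejection of the GARCH(1, 1) null stated in the text.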

The results from the FIGARCH(1, $\delta$, 0) model indicate that the conditional variance
of ISE 100 index returns contains long memory. In the FIGARCH model the long
memory parameter corresponds to the squared error term. Hence, results from table
5.2 provide evidence that the squared stock returns exhibit long memory. To further
investigate this issue, table 5.3 gives the estimates of the long memory parameter
from the GPH, Conditional Sum of Squares (CSS), and local Whittle estimators
as applied to the squared and absolute returns. The results from table 5.3 indicate
that both squared and absolute returns have statistically significant long memory.
This result is supported by all estimation methods. Moreover, the findings also
support the Taylor Effect. In general, the estimate of the long memory parameter is
higher for the absolute returns than for the squared returns. The results are in
line with those of the FIGARCH estimates reported in table 5.2.
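For readers who want to see how a GPH-type estimate and the two t statistics of table 5.3 might be obtained, the following is a minimal sketch of the log-periodogram regression. It is not the estimation code used for the table: the function name and the `power` argument are assumptions, the bandwidth is taken as $m = \lfloor T^{power} \rfloor$, and the standard error uses the theoretical variance $\pi^2/(24m)$ mentioned in the key of table 5.3. NumPy is assumed to be available.

```python
import numpy as np

def gph_estimate(x, power=0.5):
    """Log-periodogram (GPH-type) estimate of the long memory parameter d.

    Regresses log I(lambda_j) on -log(4 sin^2(lambda_j / 2)) for the first
    m = floor(T**power) Fourier frequencies; the slope estimates d.
    Returns (d_hat, t_stat_for_d_equal_0, t_stat_for_d_equal_1, m).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    m = int(n ** power)
    freqs = 2.0 * np.pi * np.arange(1, m + 1) / n
    dft = np.fft.fft(x - x.mean())
    periodogram = np.abs(dft[1:m + 1]) ** 2 / (2.0 * np.pi * n)
    regressor = -np.log(4.0 * np.sin(freqs / 2.0) ** 2)
    design = np.column_stack([np.ones(m), regressor])
    coef, *_ = np.linalg.lstsq(design, np.log(periodogram), rcond=None)
    d_hat = float(coef[1])
    se = np.pi / np.sqrt(24.0 * m)   # theoretical standard error
    return d_hat, d_hat / se, (d_hat - 1.0) / se, m

# Illustration on simulated white noise (true d = 0), not on ISE data.
rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)
d_hat, t_d0, t_d1, m = gph_estimate(noise, power=0.5)
```

Applied to squared or absolute returns, one would call `gph_estimate(np.abs(r), power)` for each of the bandwidth exponents 0.5 to 0.8 listed in table 5.3.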

5.4 Conclusion

This chapter has investigated volatility clustering and long memory in
an emerging capital market, namely the Istanbul Stock Exchange, by utilizing the
ISE National 100 daily and weekly index returns. The long memory MA(1)-FIGARCH(1, $\delta$, 0)
model is found to provide a good representation of the daily
returns, while a Martingale-FIGARCH(1, $\delta$, 0) model is found to fit better for the
weekly returns data. Estimates of the long memory parameter are found to be significantly
different from zero, indicating that the ISE 100 index volatility is a long
memory process, thus rejecting a GARCH specification.

Further analysis of squared and absolute returns supports the presence of long
memory in the volatility process. In particular, autocorrelations of squared and absolute
returns, and estimates from GPH, local Whittle, and CSS methods all support
the findings from the FIGARCH model. Moreover, results from estimates of the long
memory parameter provide evidence of the so-called Taylor Effect. The evidence of
approximate Martingale behavior in the conditional mean of the ISE 100 index returns
and the presence of long memory in absolute and squared returns is similar
to that obtained from major capital markets in the literature. The finding of short
memory in returns is in contrast to the evidence of long memory in the conditional
mean of the return process for some other Emerging Capital Markets. The evidence of the
long memory component presented in this study may indicate that financial security
prices are not immune to persistent informational asymmetries, especially over longer
time spans. Following Andersen and Bollerslev (1997), if we interpret the volatility
as a combination of heterogeneous information arrivals then it may be argued that, despite
the short memory information arrivals, the conditional variance of stock returns
exhibits long memory characteristics. In this sense, the evidence of long memory is an
intrinsic feature of the returns generating process. The finding of long memory at both
the daily and weekly frequencies supports the argument that long memory is an intrinsic
property of the return process rather than the result of exogenous occasional shifts. To better
understand this issue, it may be worthwhile to study the dynamics of individual stock
returns from Emerging Capital Markets. Moreover, use of high frequency data may
also reveal important information on the long memory component of stock returns.

BIBLIOGRAPHY

[1] Andersen, T. G. and T. Bollerslev (1997), Heterogeneous information arrivals and
return volatility dynamics: Uncovering the long-run in high frequency returns,
Journal of Finance, 52(3), 975-1005.

[2] Aydogan, K. and G. G. Booth (1988), Are there long cycles in common stock returns?,
Southern Economic Journal 55, 141-149.

[3] Baillie, R. T. (1996), Long Memory Processes and Fractional Integration in
Econometrics, Journal of Econometrics, 73, 5-59.

[4] Baillie, R. T. (1998), Comment, Journal of Business & Economic Statistics, 16,
273-276.

[5] Baillie, R. T., T. Bollerslev, and H. O. Mikkelsen (1996), Fractionally integrated
Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics
74, 3-30.

[6] Baillie, R. T., C.-F. Chung, and M. A. Tieslau (1996), Analyzing inflation by the
fractionally integrated ARFIMA-GARCH model, Journal of Applied Econometrics
11, 23-40.

[7] Baillie, R. T., Y. W. Han, and Tae-Go Kwon (2001), Further long memory
properties of inflationary shocks, forthcoming, Southern Economic Journal.

[8] Barkoulas, J. T., Baum, C. F., and Travlos, N. (2000), Long memory in the
Greek stock market, Applied Financial Economics, 10, 177-84.

[9] Beran, J. (1994), Statistics for Long-Memory Processes, Chapman & Hall.

[10] Bollerslev, T. (1986), Generalized autoregressive conditional heteroskedasticity,
Journal of Econometrics 31, 307-327.

[11] Bollerslev, T. and J. M. Wooldridge (1992), Quasi-maximum likelihood estima-
tion and inference in dynamic models with time varying covariances, Econometric
Reviews 11, 143-172.

[12] Bollerslev, T. and H. O. A. Mikkelsen (1996), Modeling and pricing long memory
in stock market volatility, Journal of Econometrics 73, 151-184.

[13] Chung, C.-F. and R. T. Baillie (1993), Small sample bias in conditional sum of
squares estimators of fractionally integrated ARMA models, Empirical Economics
18, 791-806.

[14] Cheung, Y. and Lai, K. (1995), A search for long memory in international stock
market returns, Journal of International Money and Finance, 14, 597-615.

[15] Crato, N. (1994), Some international evidence regarding the stochastic behavior
of stock returns, Applied Financial Economics, 4, 33-9.

[16] Ding, Z., C. W. J. Granger, and R. F. Engle (1993), A long memory property of
stock returns and a new model, Journal of Empirical Finance, 1, 83-106.

[17] Fox, R., and Taqqu, M. S. (1986), Large sample properties of parameter estimates
for strongly dependent stationary Gaussian time series, Annals of Statistics, 14,
517-532.

[18] Granger, C. (1980), Long Memory Relationships and the Aggregation of Dynamic
Models, Journal of Econometrics, 14, 227-238.

[19] Granger, C., and R. Joyeux (1980), An Introduction to Long Memory Time
Series Models and Fractional Differencing, Journal of Time Series Analysis, 1, 15-29.

[20] Greene, M. T., and Fielitz, B. D. (1977), Long-term Dependence in Common Stock
Returns, Journal of Financial Economics, 4, 339-349.

[21] Hosking, J. (1981), Fractional Differencing, Biometrika, 68, 165-176.

[22] Hurst, H. (1951), Long Term Storage Capacity of Reservoirs, Transactions of the
American Society of Civil Engineers, 116, 770-799.

[23] Geweke, J. and Porter-Hudak, S. (1983), The estimation and application of long
memory time series models, Journal of Time Series Analysis 4, 221-238.

[24] Harvey, C. R. (1995), Predictable risk and returns in Emerging Markets, The
Review of Financial Studies, 8, 773-816.

[25] Hurvich, C. M. and Beltrao, K. I. (1994), Automatic semiparametric estimation
of the parameter of a long memory time series, Journal of Time Series Analysis,
15, 285-302.

[26] Hurvich, C. M., Deo, R. and Brodsky, J. (1998), The mean squared error of Geweke
and Porter-Hudak's estimator of the long memory parameter of a long-memory
time series, Journal of Time Series Analysis 19, 19-46.

[27] Lee, S. W. and B. E. Hansen (1994), Asymptotic theory for the GARCH(1, 1)
quasi-maximum likelihood estimator, Econometric Theory 10, 29-52.

[28] Lo, A. W. (1991), Long-term memory in stock market prices, Econometrica, 59,
1279-313.

[29] Lobato, I. N., and Savin, N. E. (1998), Real and spurious long-memory properties
of stock-market data (with discussion), Journal of Business & Economic
Statistics, 16, 261-283.

[30] Mandelbrot, B. B. (1971), When can price be arbitraged efficiently? A limit to
the validity of the random walk and martingale models, Review of Economics
and Statistics, 53, 225-36.

[31] Robinson, P. M. (1990), Time series with strong dependence, Advances in Econometrics,
6th World Congress, Cambridge University Press, Cambridge.

[32] Robinson, P. M. (1995), Log-periodogram regression of time series with long-range
dependence, Annals of Statistics 23, 1048-72.

[33] Robinson, P. M. and F. J. Hidalgo (1997), Time series regression with long-range
dependence, Annals of Statistics 27, 77-104.

[34] Samarov, A. and M. S. Taqqu (1988), On the efficiency of the sample mean in
long memory noise, Journal of Time Series Analysis 9, 191-200.

[35] Sowell, F. (1992), Maximum likelihood estimation of stationary univariate frac-
tionally integrated time series models, Journal of Econometrics, 53, 165-188.

[36] Taylor, S. (1986), Modelling Financial Time Series, John Wiley & Sons, New
York.

[37] Yajima, Y. (1985), On estimation of long memory time series models, Australian
Journal of Statistics 27, 303-320.

[38] Yajima, Y. (1991), Asymptotic properties of the LSE in a regression model with
long-memory stationary errors, Annals of Statistics 19, 158.

[39] Yuksel, S. A. (2000), Three essays on the microstructure of the Turkish stock
market, PhD thesis, Department of Finance, Michigan State University, E. Lansing,
MI.

Table 5.1: Summary statistics for ISE100 stock returns

Series            mean     med      min      max  variance  skewness  kurtosis
daily returns   -0.004   0.031  -13.288   13.040     2.281    -0.348    10.730
weekly returns  -0.017   0.059  -17.688   12.915    13.780    -0.261     5.143

Table 5.2: Estimated ARMA(P, Q)-FIGARCH(p, $\delta$, q) Models for ISE 100 Index
returns

              Daily Returns   Weekly Returns
$\mu$              -0.005          0.0025
                  (0.025)         (0.099)
$\theta_1$          0.131            .
                  (0.021)            .
$\omega$            0.173          0.319
                  (0.040)         (0.135)
$\beta$             0.269          0.023
                  (0.123)         (0.108)
$\delta$            0.538          0.319
                  (0.108)         (0.135)
T                    3339            686
ln(L)           -5808.093      -1830.700
Skewness           -0.227         -0.192
Kurtosis            5.337          4.004
Q(10)              27.432         23.217
$Q^2$(10)          12.490          6.490
Q(20)              36.683         35.799
$Q^2$(20)          21.720         15.119

Key: ln(L) is the value of the maximized Gaussian likelihood, and QMLE standard errors
are presented in parentheses below corresponding parameter estimates. The Q(10), $Q^2$(10),
Q(20), and $Q^2$(20) are the Ljung-Box test statistics with 10 and 20 degrees of freedom
based on the standardized residuals, and squared standardized residuals respectively. The
sample skewness and kurtosis are also based on the standardized residuals.

Table 5.3: GPH, CSS and local Whittle estimates of long memory parameter for the
ISE100 stock squared returns and absolute returns

Ordinates              $R_t^2$                  $|R_t|$
m                 Daily      Weekly       Daily      Weekly
$T^{0.5}$         0.226       0.154       0.365       0.180
                 (2.685)     (1.227)     (4.336)     (1.435)
                [-9.191]    [-6.724]    [-7.540]    [-6.517]
$T^{0.6}$         0.183       0.324       0.334       0.287
                 (3.289)     (3.576)     (5.979)     (3.164)
               [-14.636]    [-7.451]   [-11.938]    [-7.863]
$T^{0.7}$         0.133       0.220       0.266       0.265
                 (3.573)     (3.368)     (7.157)     (4.044)
               [-23.347]   [-11.911]   [-19.762]   [-11.235]
$T^{0.8}$         0.192       0.194       0.268       0.216
                 (7.759)     (4.107)    (10.856)     (4.572)
               [-32.725]   [-17.103]   [-29.629]   [-16.638]
$d_{CSS}$         0.258       0.209       0.250       0.202
                (0.0973)     (0.095)     (0.030)     (0.051)
$d_{Whittle}$     0.246       0.287       0.479       0.537
                 (0.050)     (0.121)     (0.049)     (0.114)

Key: m stands for the number of periodogram ordinates used in the GPH estimator. The
values in parentheses are the t statistics for testing the null of $H_0: d = 0$ versus $H_1: d > 0$,
and the values in square brackets are the t statistics for testing the null of $H_0: d = 1$
versus the alternative of $H_1: d < 1$. The t statistics are computed by using the theoretical
variance of $\pi^2/(24m)$. The $d_{CSS}$ and $d_{Whittle}$ are the estimates of the long memory parameter
from the CSS estimator and the local Whittle estimator respectively. For these two rows,
the values in parentheses are the robust standard errors.

Figure 5.1: ISE National 100 Daily stock indices, index returns, absolute and squared
returns

[Figure: time series panels; horizontal axis dated from 1/89 to 1/01]

Figure 5.2: Correlograms of ISE 100 stock index returns

CHAPTER 6

Revisiting the nonlinearity and
persistence in real exchange rates:
evidence from a new unit root test
and an ESTAR specification

6.1 Introduction

As discussed in chapter 3, there is a growing strand of research on the nonlinear
behavior of real exchange rates. The findings of chapter 3 and the discussion of the
empirical and theoretical literature there indicated that in the presence of transaction
costs real exchange rates are expected to adjust to equilibrium in a nonlinear fashion.
It is also shown that the power of the standard unit root and stationarity tests
depends on the parametric specification of the STAR model. When the parametric
specification implies that the generated data have a unit root in the
middle regime while the root(s) in the outer regime(s) are close to unity (hence
the generated data are locally nonstationary but globally stationary), the
Augmented Dickey-Fuller (ADF) (Dickey and Fuller 1984) and the Phillips-Perron
(PP) (Phillips and Perron 1988) tests lack power in detecting the nonlinear mean
reversion. The formal testing of the conjecture that the real exchange rate can be
mean reverting once the nonlinearity is controlled for remains a challenge for empirical
researchers. As discussed in chapters 1 and 3, the linearity tests and the estimation
of STAR models require the time series under consideration to be stationary. As the
simulation experiments in chapter 3 indicated, if the true data generating process
is a linear random walk, the linearity tests may spuriously indicate the presence of
nonlinearity. This finding implies that the distribution of the linearity tests possibly
differs for a nonstationary process, hence the use of asymptotic $\chi^2$ critical values may not
be appropriate. This issue deserves further analysis, which is beyond the scope of this
chapter. To avoid this problem, the first differences of real exchange rates are used
in chapter 3. This chapter develops a unit root test that is specifically designed to
test the random walk with or without drift against a globally mean reverting ESTAR
process.

Some recent studies also considered the issues pertaining to stationarity and nonlinearity
within the context of STAR models and real exchange rates. Taylor et
al. (2001) show empirically the stationarity of real exchange rates from multivariate
tests before proceeding to their ESTAR model estimation. Kilian and Taylor (2001)
use simulations to assess the level of their test of random walk against an ESTAR
alternative. These approaches are not totally satisfactory. Indeed, the Multivariate
ADF (MADF) and the Johansen Likelihood Ratio (JLR) tests of Taylor and Sarno
(1998) are not designed specifically to test a unit root against mean reverting STAR alternatives.
Taylor et al. (2001) show by simulation that these tests have better power
properties compared to the univariate ADF test when the true data generating process
is a mean reverting ESTAR model. The MADF test assumes that all the series have
a unit root under the null hypothesis, hence the test has a tendency to reject the
null even when only one of the series is stationary. This problem was also pointed
out in Taylor and Sarno (1998). To avoid the pitfall of the MADF test, the JLR test
assumes that at least one of the series has a unit root under the null hypothesis. The
rejection of this null implies that all the series are stationary only if we assume that
each of the series is a realization of an I(0) or I(1) process. Otherwise, the rejection
of the null hypothesis in the JLR test will mean that at least one of the series is not a
unit root process. Hence, it will not be informative about the other series. Moreover,
the testing procedure in Taylor et al. (2001) departs from the original PPP criterion
by calling for further economic information about the other real exchange rates in
the testing step, but has the drawback that this additional information is left aside
in the univariate estimation of ESTAR models for the real exchange rate. The Kilian and
Taylor (2001) approach is relevant provided that the rejection of their null of a unit
root guarantees the stationarity of their nonlinear ESTAR representation under the
alternative, which in fact needs to be shown.

This chapter departs from chapter 3 in that it develops a unit root test, namely
a sup Wald test, that has power against nonlinear mean reversion. Two
null hypotheses are considered, random walk without drift and random walk with drift,
each against a mean reverting ESTAR alternative. The distributions of the test statistics are
derived and are conjectured to be nuisance parameter free. We apply the tests to G-7
countries' real exchange rates against the US dollar for the floating period. Findings
from the new tests support the nonlinear mean reversion of real exchange rates. The
empirical power and size of the tests are studied through simulations and are compared
with those of the standard unit root tests. The simulations indicate that the sup Wald
tests have good size and power properties and perform better than the standard
unit root tests. This chapter also studies the dynamic adjustment mechanism of real
exchange rates to a shock by utilizing generalized impulse response functions. The
results from the estimated ESTAR models, the generalized impulse response functions
and the distributions of generalized impulse responses in the outer regimes reveal the
nonlinear and persistent behavior of the real exchange rates in this study.

The rest of the chapter is organized as follows. The next section discusses the foundations
of nonlinear behavior of real exchange rates, and conditions for stationarity
in the ESTAR model. Section 6.3 introduces the sup Wald test and gives the asymptotic
distribution of the tests. The empirical size and power of the tests are discussed
in section 6.4. Section 6.5 gives and discusses the empirical findings. Section 6.6
concludes the chapter. The proofs of the propositions are given in the appendix to
the chapter.

6.2 Foundations of nonlinear adjustment of real exchange rates and ESTAR model

6.2.1 Motivation for a nonlinear adjustment in real exchange rates

As in chapter 3, we choose to study the nonlinear dynamics in real exchange
rates by using the ESTAR model that is discussed in chapter 1. As discussed in chapter
3, the nonlinear behavior of the real exchange rate may result from transaction costs.
Dumas (1992) and Sercu et al. (1995) study a two-country model with trading
costs. The models in these papers predict that the presence of trading costs leads
to the existence of a region of no trade in which the real exchange rate may follow
a random walk as arbitrage does not take place. Outside the region, international
arbitrage takes place and brings the real exchange rate back to the nearest threshold
level, which corresponds to the marginal cost of shipping. As a result, the exchange
rate is expected to behave discontinuously. Since in the real world there are several
goods and transaction costs differ for each good, it is intuitive to think that the
shifts will be gradual rather than abrupt. Hence, a Smooth Transition Autoregressive
model should better represent the shifts in the real exchange rates than the Threshold
Autoregressive (TAR) models.

The presence of transaction costs alone could not account for many of the observed
very large movements in real exchange rates, either in terms of day-to-day volatility
or in terms of periods of substantial and persistent overvaluation or undervaluation of
real exchange rates. An example is the overvaluation of the U.S. dollar
in the 1980s. Kilian and Taylor (2001) propose a complementary explanation that
is based on the presence of heterogeneous foreign exchange traders: noise traders and
rational speculators (or arbitrageurs). Noise traders' demand for foreign exchange
is affected by beliefs that are not fully justified by news about the fundamentals.
Arbitrageurs, on the other hand, form fully rational expectations about the return
on holding foreign exchange; they sell foreign exchange when noise traders push
prices up and buy when noise traders depress prices, thereby making a profit in
the process. In this model, the unpredictability of noise traders' future opinions
creates risk to arbitrageurs that prevents complete arbitrage. Arbitrage is limited
by three types of risk: first, the future realizations of the fundamentals may turn out to be
higher than expected; second, because of the unpredictable swings in the demand of noise
traders, foreign exchange that is overpriced today may be even more overpriced
tomorrow; and lastly, the equilibrium value of the exchange rate cannot be observed
directly, and hence arbitrageurs will have difficulty in detecting deviations from
fundamentals. Assuming that agents assign less probability to levels of the exchange rate
corresponding to large deviations from the fundamental level than to values close to
the fundamental (because larger deviations are increasingly implausible from
a theoretical point of view), few rational traders will be inclined to take a strong
position when the exchange rate is close to the fundamental value. Therefore, closer
to the unobserved equilibrium the exchange rate is driven mainly by noise traders.
As the exchange rate moves away from the unobserved equilibrium, a consensus will
gradually be reached among the rational traders that the exchange rate is misaligned,
inducing them to take stronger positions against the prevailing exchange rate and
ensuring the ultimate mean reversion of the exchange rate toward the unobserved
true economic fundamental. As argued by Kilian and Taylor (2001), this nonlinearity
may be described by a STAR model in which the strength of mean reversion is an
increasing function of past deviations from the equilibrium.

In contrast to chapter 3, we postulate an ESTAR model of the following form for the real
exchange rates:

$$q_t = \phi(L)\Delta q_t + [\mu + \rho q_{t-1}](1 - F(z_t; \gamma, c)) + [\mu^* + \rho^* q_{t-1}]F(z_t; \gamma, c) + u_t \qquad (6.1)$$

where $\phi(L) = \phi_1 L + \phi_2 L^2 + \cdots + \phi_{p-1}L^{p-1}$, $F(\cdot)$ is the exponential transition function
given in chapters 1 and 3, and $z_t = q_{t-d}$ for $d \in \{1, 2, \ldots, \bar{d}\}$. As discussed in chapter 3,
the exponential form of the transition function makes good economic sense in this
application because it implies symmetric adjustment of the real exchange rate above
and below equilibrium (or positive and negative deviations from PPP). The transition
parameter $\gamma$ determines the speed of transition between the two extreme regimes,
with lower values of $\gamma$ implying slower transition. The middle regime corresponds to
$q_{t-d} = c$, when $F = 0$ and (6.1) becomes a linear model:

$$q_t = \phi(L)\Delta q_t + \mu + \rho q_{t-1} + u_t.$$

The outer regime corresponds, for a given $\gamma$, to $\lim_{[q_{t-d} - c] \to \pm\infty} F(q_{t-d}; \gamma, c) = 1$, where
(6.1) becomes a different AR(p) model:

$$q_t = \phi(L)\Delta q_t + \mu^* + \rho^* q_{t-1} + u_t,$$

with a correspondingly different speed of mean reversion so long as $\rho \neq \rho^*$. In any
empirical application of STAR models, it is necessary to determine the dimension $\bar{d}$
and the number of lagged values of the real exchange rate influencing the transition
function, that is, the delay parameter $d$. In general, applied practice with ESTAR
models has favored restricting $d$ to be a singleton (see e.g. Teräsvirta, 1994; Taylor,
Peel and Sarno, 2001; and Kilian and Taylor, 2001). Granger and Teräsvirta (1993)
and Teräsvirta (1994) suggest a series of nested tests for determining the appropriate
delay parameter. In the present application to monthly real exchange rate data,
similar to Taylor, Peel, and Sarno (2001), we found that the model that worked best
for each country (in terms of goodness of fit, statistical significance of parameters,
and adequate diagnostics) set the delay parameter to 1. The finding of the delay
parameter being 1 seems reasonably intuitive since it allows deviations
from equilibrium to affect the nonlinear dynamics with a short lag rather than longer
lags: there is no compelling reason why there should be very long
lags before the real exchange rate begins to adjust in response to a shock.

6.2.2 Stationarity of ESTAR model

Since this chapter aims to test the random walk against a stationary ESTAR
alternative, we need to determine under which conditions the ESTAR model given in
(6.1) is a globally stationary process. To this end, consider the ESTAR(p) model
given in the following equation:

$$y_t = \pi' x_t (1 - F(z_t; \gamma, c)) + \pi^{*\prime} x_t F(z_t; \gamma, c) + u_t \qquad (6.2)$$

where $x_t = (1, y_{t-1}, \ldots, y_{t-p})'$, $F(z_t; \gamma, c) = 1 - \exp(-\gamma(z_t - c)^2)$, $z_t = y_{t-d}$ for $d =
1, 2, \ldots, p_m$. As for the disturbances, we have the following assumption.

Assumption 1: Assume that $u_t \sim iid$, with $E(u_t) = 0$, $E|u_t| < \infty$ and independent
of $y_0$. The distribution of $u_t$ is absolutely continuous and its density is positive
everywhere.

Note that Assumption 1 is satisfied for $u_t \sim iid(0, \sigma^2)$. As discussed in Tjøstheim
(1990), the stationarity properties of the ESTAR model given in (6.2) are dictated
by what happens in the limit when $z_t$ goes to infinity. As $z_t$ goes to infinity (both
positive and negative infinity), $F(\pm\infty; \gamma, c)$ converges to 1. Therefore, as $z_t$ goes to
infinity, $y_t$ becomes a two-regime self exciting threshold model:

$$y_t = \pi' x_t (1 - I(|z_t| > c)) + \pi^{*\prime} x_t I(|z_t| > c) + u_t \qquad (6.3)$$

The stationarity properties of general threshold models are not known. Chan et al.
(1985) give necessary and sufficient conditions for the stationarity of a multiple regime TAR(1) model
with d=1. At an intuitive level, we can expect the process for $y_t$ given by (6.2)
to be globally stationary when the roots of the autoregressive polynomial in the outer
regime lie outside the unit circle. In other words, the smallest root in absolute value
of the characteristic polynomial in the outer regime, $1 - \pi_1^*\xi - \pi_2^*\xi^2 - \cdots - \pi_p^*\xi^p = 0$,
should be greater than 1. This means that the smallest root in the middle regime, $1 - \pi_1\xi -
\pi_2\xi^2 - \cdots - \pi_p\xi^p = 0$, may be equal to one (a unit root in the inner regime)
while the process stays globally stationary.

In order to gain some insight into the stationarity of the data generated from an
ESTAR process with a parameter specification that satisfies the conditions stated in the
last paragraph, a simulation experiment is conducted. The data, $y_t$, for $t = 1, \ldots, T$
are generated from the ESTAR model
$y_t = \rho y_{t-1}(1 - F(y_{t-1}; \gamma, c)) + \rho^* y_{t-1}F(y_{t-1}; \gamma, c) + u_t$,
with $\rho = 1$, $\rho^* = 0.8$, $\gamma = 3, 5, 10, 20$, and $u_t \sim iid\,N(0, 1)$. The
threshold parameter, c, is kept at 0. The data are generated N=10,000 times and in
each replication the first 100 simulated data points are discarded. The sample sizes of
T = 300, 500, 1000 are used. Let $y_{t,i}$ be the value of $y_t$ in simulation replication i
for $t = 1, \ldots, T$ and $i = 1, \ldots, N$. The j-step ahead covariances across replications,
$\delta_{t,j} = \frac{1}{N}\sum_{i=1}^{N} y_{t,i}\,y_{t-j,i}$, for $t = j+1, \ldots, T$ and $j = 1, 2, 3, \ldots, J = 10$, are estimated
and graphed against time t for each j. The purpose of this simulation is to see
whether $\delta_{t,j}$ does or does not depend on t. For a covariance stationary process we
should expect $\delta_{t,j}$ to stay approximately constant over time t. Since the estimated
$\delta_{t,j}$ for any given j do not differ across the different specifications of $\gamma$ and sample
size T, the results from $\gamma = 10$ for j = 2, 5, 7, 9 and T = 1000 are given in the panels of
figure 6.1. As can be seen from the graphs, $\delta_{t,j}$ stays almost constant over time
for any given j. This indicates that the data generated from the ESTAR model have, on
average, covariances that do not depend on time, implying covariance stationarity.
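The simulation experiment described above can be sketched as follows. This is an illustrative reimplementation rather than the original simulation code: the function names are hypothetical, NumPy is assumed to be available, and the number of replications and the sample size are deliberately smaller than the N = 10,000 and T = 1000 reported in the text so that the sketch runs quickly.

```python
import numpy as np

def transition(z, gamma, c=0.0):
    """Exponential transition function F(z; gamma, c) = 1 - exp(-gamma (z - c)^2)."""
    return 1.0 - np.exp(-gamma * (z - c) ** 2)

def simulate_estar(T, N, rho=1.0, rho_star=0.8, gamma=10.0, c=0.0,
                   burn=100, seed=0):
    """Simulate N replications of length T; rows index time, columns replications.

    y_t = rho * y_{t-1} * (1 - F) + rho_star * y_{t-1} * F + u_t,
    with F evaluated at y_{t-1} and u_t iid N(0, 1). The first `burn`
    simulated points are discarded.
    """
    rng = np.random.default_rng(seed)
    total = T + burn + 1
    y = np.zeros((total, N))
    u = rng.standard_normal((total, N))
    for t in range(1, total):
        F = transition(y[t - 1], gamma, c)
        y[t] = rho * y[t - 1] * (1.0 - F) + rho_star * y[t - 1] * F + u[t]
    return y[burn + 1:]

def cross_replication_cov(y, j):
    """delta_{t,j} = (1/N) sum_i y_{t,i} y_{t-j,i} for t = j+1, ..., T."""
    return np.mean(y[j:] * y[:-j], axis=1)

y = simulate_estar(T=200, N=2000, gamma=10.0)
delta_2 = cross_replication_cov(y, j=2)
half = delta_2.size // 2
early, late = delta_2[:half].mean(), delta_2[half:].mean()
```

If the process is covariance stationary, `early` and `late` should be close to each other, which mirrors the visual check of flat $\delta_{t,j}$ paths described in the text.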

6.3 Testing Unit root against stationary ESTAR alternatives

Following Michael et al. (1997) we can rewrite the ESTAR model given in (1.1)
as follows:

$y_t = \phi(L)\Delta y_t + [\mu + \rho\, y_{t-1}]\,(1 - F(z_t;\gamma,c)) + [\mu^* + \rho^*\, y_{t-1}]\,F(z_t;\gamma,c) + u_t$, (6.4)

where $\phi(L) = \phi_1 L + \phi_2 L^2 + \cdots + \phi_{p-1}L^{p-1}$. We can re-parameterize the transition
function by first letting $\lambda = \sqrt{\gamma}\,c$. This parameterization will be useful in proving the
asymptotic behavior of the unit root tests. Note that we can write $F(\cdot)$ as
$F(z_t;\lambda,c) = 1 - \exp\left(-\left(\frac{\lambda}{c}z_t - \lambda\right)^2\right)$. In model (6.4) we can test
$H_0^a: \mu = \mu^* = 0$ and $\rho = \rho^* = 1$, random walk without drift, and
$H_0^b: \mu = \mu^*$ and $\rho = \rho^* = 1$, random walk with drift,
against the alternative $H_1$: $y_t$ follows a stationary ESTAR process. Under the null
hypotheses we assume that the roots of $1 - \alpha_1\xi - \alpha_2\xi^2 - \cdots - \alpha_{p-1}\xi^{p-1} = 0$, where
$\alpha_1 = (1 + \phi_1)$, $\alpha_i = \phi_i$ for $i$ odd and $\alpha_i = \phi_i - \phi_{i-1}$ for $i$ even, lie outside the unit
circle. Under both null hypotheses the parameters $\lambda$ and $c$ are not identified. Thus
it is impossible to obtain consistent estimates of $\lambda$ and $c$ under both null hypotheses.
The proposed unit root test is the Wald test, which tests the parameter restrictions
given in the above null hypotheses. The unrestricted model is given by equation (6.4).

The restricted model is given by

$y_t = \phi(L)\Delta y_t + y_{t-1} + u_t$, (6.5)

$y_t = \phi(L)\Delta y_t + \mu + y_{t-1} + u_t$

under $H_0^a$ and $H_0^b$ respectively.

As noted by Leybourne et al. (1998) the ESTAR model given in (6.4) is linear in
the autoregressive parameters for given $\lambda$ and $c$. Hence, for given $\lambda$ and $c$ we can estimate
the unrestricted and restricted models by OLS. Denoting the vector of residuals from
the unrestricted model by $\hat{u}$ and the vector of residuals from the restricted model
by $\tilde{u}$, the Wald test can be written in terms of the residual sums of squares,
$\hat{u}'\hat{u}$ and $\tilde{u}'\tilde{u}$, under homoscedasticity.
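
For illustration only, a Wald-type statistic of this kind can be computed from the two residual sums of squares once the transition parameters are held fixed. The sketch below assumes the common form $T(\tilde{u}'\tilde{u} - \hat{u}'\hat{u})/\hat{u}'\hat{u}$, sets $p = 1$ (no lagged differences), uses $z_t = y_{t-1}$, and tests the driftless random walk null. The function name `estar_wald` and these simplifications are assumptions made here, not the exact statistic used in this chapter.

```python
import numpy as np

def estar_wald(y, gamma, c):
    """Wald-type statistic for H0: mu = mu* = 0, rho = rho* = 1, given (gamma, c)."""
    y_lag, y_cur = y[:-1], y[1:]
    F = 1.0 - np.exp(-gamma * (y_lag - c) ** 2)        # transition with z_t = y_{t-1}
    # Regressors of the unrestricted model: (1-F), y_{t-1}(1-F), F, y_{t-1}F.
    X = np.column_stack([1.0 - F, y_lag * (1.0 - F), F, y_lag * F])
    beta, *_ = np.linalg.lstsq(X, y_cur, rcond=None)   # unrestricted OLS
    rss_u = np.sum((y_cur - X @ beta) ** 2)
    rss_r = np.sum((y_cur - y_lag) ** 2)               # restricted: y_t = y_{t-1} + u_t
    T = y_cur.size
    return T * (rss_r - rss_u) / rss_u                 # assumed homoscedastic form

rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(312))                # a driftless random walk
print(estar_wald(y, gamma=1.0, c=0.5))
```

Because the restricted model is nested in the unrestricted one, the statistic is non-negative up to rounding error.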

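Proposition 1: Let $d = \bar{d} = 1$ be fixed. Let $\lambda > 0$ and $\bar{c} = c/\sqrt{T} > 0$ be fixed.
Suppose $(\lambda,\bar{c})$ belongs to $\Lambda$, where $\Lambda$ is a compact set of $R^2$. Under $H_0^a$, the Wald
test satisfies

$W_T(\lambda,\bar{c}) \xrightarrow{L} \zeta(\varphi)$ (6.7)

pointwise in $(\lambda,\bar{c})$, where $\varphi = (\lambda,\bar{c},\delta)$, $\delta = \sigma/(1 - \alpha_1 - \alpha_2 - \cdots - \alpha_{p-1})$ and $\zeta(\varphi)$
is a function of Brownian motions given in the proof of the proposition. Under the
alternative the statistic diverges.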

Since under the null hypothesis $\gamma$ and $c$ are not identified, we can make any
assumptions about them. The assumption $c = \sqrt{T}\bar{c}$ is reminiscent of the assumption
made in the structural change literature, where the break point is hypothesized to be
equal to $T\tau$ where $\tau$ is in (0,1). Under $H_0^a$, $y_t/\sqrt{T}$ converges to a Brownian motion
$\delta B(r)$ with $r = t/T$. Note that since $z_t = y_{t-d}$, the behavior of the transition
function in the limit will be characterized by the behavior of $y_t$ as T goes to infinity.

If we assume that $\gamma$ and $c$ are fixed, then the transition function

$F(z_t;\gamma,c) = 1 - \exp\left(-\left(\sqrt{\gamma}\,z_t - c\sqrt{\gamma}\right)^2\right) \xrightarrow{p} 1$

as $T \to \infty$. This means that for fixed $\gamma$ and $c$ the process becomes linear asymptotically,
and hence the test statistic will lose its power in detecting nonlinear stationarity
of the time series under consideration. On the other hand, if we assume that $(\lambda, \bar{c})$
are fixed, then we have

$F(z_t;\gamma,c) = 1 - \exp\left[-\left(\frac{\lambda}{\bar{c}}\frac{y_{t-1}}{\sqrt{T}} - \lambda\right)^2\right] \xrightarrow{L} 1 - \exp\left[-\left(\frac{\lambda}{\bar{c}}\,\delta B(r) - \lambda\right)^2\right]$ as $T \to \infty$.

The following proposition gives the distribution of the Wald test under the null
hypothesis $H_0^b$. As noted in Hamilton (1994), the distributions of the ADF and PP tests
differ under "random walk without drift" and under "random walk with drift". In a
similar fashion, proposition 2 shows that the distribution of the Wald test is different
from the distribution one obtains under $H_0^a$.

Proposition 2: Let $d = \bar{d} = 1$, and let $\bar{c} = c/T$ and $\lambda$ be fixed. Suppose $(\lambda, \bar{c})$ belongs to
$\Lambda$, a compact set of $R^2$. Under the null hypothesis $H_0^b$ the asymptotic distribution
of the Wald test given in equation (6.7) is a $\chi^2(\varphi)$ variate, where $\varphi$ is given in the proof of
the proposition. Under the alternative the statistic diverges.

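Note that under $H_0^b$, when $(\gamma, c)$ are fixed,

$F(z_t;\gamma,c) = 1 - \exp\left(-\gamma T^2\left(\frac{z_t}{T} - \frac{c}{T}\right)^2\right) \xrightarrow{p} 1$ as $T \to \infty$.

When we assume that $(\lambda, \bar{c})$ are fixed, then

$F(z_t;\gamma,c) = 1 - \exp\left(-\left(\frac{\lambda}{c}z_t - \lambda\right)^2\right) = 1 - \exp\left(-\left(\frac{\lambda}{\bar{c}}\frac{z_t}{T} - \lambda\right)^2\right) \xrightarrow{p} 1 - \exp\left(-\left(\frac{\lambda}{\bar{c}}\mu - \lambda\right)^2\right)$ as $T \to \infty$.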

The proofs of propositions 1 and 2 are given in the appendix.


Note that the limiting distribution of the Wald test under both null hypotheses
depends on the unknown parameters $(\lambda, c)$. As these parameters are not identified
under the null hypotheses, the choice of $(\lambda, c)$ is arbitrary. Hence the limiting
distribution of the test statistic is not nuisance parameter free. One way to get around
this problem and gain power is to use the same testing strategy as in testing
linearity against the self-exciting threshold autoregressive (SETAR) model (see for instance
Hansen (1997) and Caner and Hansen (2001)), namely taking the supremum of the
test statistic with respect to the nuisance parameters. The sup Wald test is then
given by:

$\sup W \equiv \sup_{(\lambda,c)\in\Omega\times C} W_T(\lambda, c)$, (6.8)

where $\Omega = [\underline{\lambda}, \bar{\lambda}]$ and $C = [\underline{c}, \bar{c}]$ are such that $0 < \underline{\lambda} < \lambda < \bar{\lambda}$ and $0 < \underline{c} < c < \bar{c}$. Since
the test will have power for any $\lambda$, any fixed $\Omega$ can be chosen. Obviously the test will
have power even if we choose one single value for $\lambda$, but the use of a range of values
will increase the power of the test. One important issue is not to make the interval
too wide, as a very large $\lambda$ may make the transition function F flat. As for the
choice of C, we can follow the same approach taken in the SETAR literature (see for
instance Hansen (1997) and Caner and Hansen (2001)) and select the values of $c$ from
the ordered values of $|z_t|$, discarding the 15% highest and the 15% smallest values. This will
guarantee that the boundaries $\underline{c}$ and $\bar{c}$ do not depend on any unknown parameter. We
conjecture that the distribution of the sup Wald tests will be nonstandard, in the sense
that it is going to be the supremum of a number of random functions, but nuisance
parameter free. Unfortunately, a rigorous proof of this conjecture requires
uniform convergence in $\Lambda = \Omega \times C$, which we have not been able to prove. To the best of our
knowledge, there is no result in the econometrics literature that we can use to prove
our conjecture. If we had uniform convergence of the Wald tests discussed above, the
proof of our conjecture for the sup Wald tests would be trivial, in the sense that our
conjecture would follow by the continuous mapping theorem. In the rest of the chapter
we assume that our conjecture is true and, following Caner and Hansen (2001), we
compute critical values by simulation.
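
The grid construction and the supremum step can be sketched as follows. This is an illustrative sketch under stated assumptions: `trimmed_c_grid`, `sup_statistic` and `toy_stat` are hypothetical names, and `toy_stat` is a placeholder standing in for a Wald statistic evaluated at fixed $(\lambda, c)$.

```python
import numpy as np

def trimmed_c_grid(z, trim=0.15):
    """Ordered |z_t| values with the lowest and highest `trim` share removed."""
    abs_z = np.sort(np.abs(np.asarray(z, dtype=float)))
    k = int(trim * abs_z.size)                 # number of points dropped at each end
    return abs_z[k:abs_z.size - k]

def sup_statistic(y, stat, lam_grid, c_grid):
    """Largest value of stat(y, lam, c) over the Cartesian grid of (lam, c)."""
    return max(stat(y, lam, c) for lam in lam_grid for c in c_grid)

def toy_stat(y, lam, c):
    """Placeholder statistic (an assumption here), peaked at lam = 2 and c = 1."""
    return -(lam - 2.0) ** 2 - (c - 1.0) ** 2

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(312))
c_grid = trimmed_c_grid(y[:-1])                # transition variable z_t = y_{t-1}
lam_grid = np.arange(0.25, 15.25, 0.25)        # hypothetical grid for lambda
print(len(c_grid), sup_statistic(y, toy_stat, lam_grid, c_grid))
```

Building the grid for $c$ from the order statistics of $|z_t|$ keeps the boundaries data-driven, so no unknown parameter enters their definition.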

6.4 Empirical Critical Values and size and power
properties of the sup Wald tests

To compute the empirical critical values we have generated data from (6.5). When
fitting (6.5) to real exchange rates, $\mu$ was found to be statistically indistinguishable
from zero for most of the real exchange rates, and it was around 0.05 for some of
the rates. Hence data are generated with $\mu = 0$ and with $\mu = 0.05$ in computing
the critical values. In generating the data, the disturbances, $u_t$ in (6.5), are drawn from
iid N(0,1). Table 6.1 reports the empirical critical values from 20,000 replications
of sample size 312, since 312 corresponds to the sample size in this study. The
two-dimensional grid search in $\gamma$ and $c$ was performed for the following sets of values:
$\gamma \in \{0.25, 0.5, 0.75, 1, 1.25, \cdots, 15\}$ and $c \in [\underline{c}, \bar{c}]$, with $\underline{c}$ and $\bar{c}$ such that 15% of the
smallest and highest values of $|y_{t-1}|$ are excluded from the grid. In addition to the
standard version, heteroscedasticity-robust versions of the tests are also computed.
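
The simulation of critical values follows a generic pattern that can be sketched as below. The function name `simulated_critical_values` and the toy statistic in the usage line are assumptions made for illustration. In the chapter the statistic would be the sup Wald test, and 20,000 replications of length 312 are used.

```python
import numpy as np

def simulated_critical_values(stat, T=312, n_rep=2000, mu=0.0,
                              levels=(0.10, 0.05, 0.01), seed=0):
    """Upper-tail critical values of `stat` under a random walk null (drift mu)."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_rep)
    for i in range(n_rep):
        y = np.cumsum(mu + rng.standard_normal(T))   # y_t = mu + y_{t-1} + u_t
        draws[i] = stat(y)
    # The critical value at level a is the (1 - a) quantile of the simulated draws.
    return {a: float(np.quantile(draws, 1.0 - a)) for a in levels}

# Toy check: y_T / sqrt(T) is N(0, 1) under the driftless null,
# so the 5% critical value should be near 1.645.
cv = simulated_critical_values(lambda y: y[-1] / np.sqrt(y.size), T=100, n_rep=5000)
print(cv)
```

Using a statistic with a known null distribution, as in the toy check, is a simple way to validate the simulation loop before plugging in a nonstandard statistic.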

In order to analyze the size and power properties of the proposed tests, a finite
sample study is performed. The empirical critical values reported in table 6.1 are used
in the simulation experiments. Therefore, the power is actually a size-corrected power.
In computing the size of the tests, the data are generated under the null hypotheses
$H_0^a$ and $H_0^b$ with $\mu = 0$ and $\mu = 0.05$. The disturbances are drawn from iid N(0,1).
The standard error is normalized to unity in all of the experiments. Table (6.2)
reports the empirical rejection frequencies from 5,000 Monte Carlo replications with
T = 312. For comparison purposes, the empirical sizes of the Augmented Dickey-Fuller
(ADF) and Phillips-Perron (PP) statistics are also reported. The empirical
size of the sup Wald test is quite accurate and comparable with the size of ADF and
PP. The heteroscedasticity-robust versions, supWh and supWhp, seem to be slightly
more conservative than the standard versions.

The power of the tests is examined by generating 5,000 series under the alternative
(6.1) for various parameter values. Throughout the experiments, $\rho$ is kept fixed at
unity, while the autoregressive parameter in the outer regime, $\rho^*$, is varied to see the
effect of having an autoregressive root in the outer regime that changes from values
in the stationary range to values closer to unity. This parameterization is consistent
with the fitting of ESTAR models to the data, as we will see in the next section. The
data are generated under $\mu = \mu^* = 0$ and under $\mu \neq \mu^*$. Since the results did not vary
significantly, only $\mu = \mu^* = 0$ and $\mu = 0.05$, $\mu^* = -0.05$ are reported in table 6.3.
The smoothness parameter, $\gamma$, is varied to see the influence of a change in the
curvature of the transition function on the power of the tests. The values reported
are close to the smoothness parameter estimates obtained in the empirical section.
Since it did not have any significant effect on the power of the tests, the threshold
parameter, $c$, is set at 0.05. Again for comparison purposes, the power of the ADF and
PP tests is also reported. As can be observed from the table, as the autoregressive
parameter in the outer regime, $\rho^*$, approaches unity, the power of all tests declines.
However, the fall in the power of ADF and PP is larger than that of the sup Wald
tests. For instance, the power of the ADF and PP tests is about 40 percent, while that
of sup W is about 83 percent in the case given in panel d of the table. In cases
where $\rho^* = 0.95$ the sup Wald tests outperform the ADF and PP tests. Moreover,
the power of the sup Wald tests in general increases with $\gamma$.


6.5 Empirical Results

6.5.1 The data

The data set comprises monthly observations on consumer price indices for the
US, the UK, Canada, Germany, Italy, Japan, and Switzerland, and end-of-period
spot exchange rates for the UK pound (BP), German mark (GM), Canadian dollar
(CD), Italian lira (IL), Japanese yen (JY), and Swiss franc (SF) against the US dollar.
The data covers the sample period from 1973:01 to 1998:12, and is taken from the
International Monetary Fund’s (IMF) International Financial Statistics data compact
discs. Real exchange rate series are constructed with these data in logarithmic form
as in chapter 3. The data are centered around the sample mean.

6.5.2 Unit root test results

Table (6.4) gives the results from standard unit root tests, namely ADF (Dickey and
Fuller, 1981) and PP (Phillips and Perron, 1988), and the stationarity test of Kwiatkowski,
Phillips, Schmidt, and Shin (1992) (KPSS), together with the results of the sup Wald
tests applied to real exchange rates. The PP and ADF tests reject the unit root null
only for BP and IL, and only at the 10 percent level. For all other series, the ADF and PP tests
indicate the presence of a unit root at the 10 percent significance level. ADF and PP
fail to reject the null hypothesis of a unit root for all of the real exchange rates at
the 5 percent level. KPSS rejects the null of stationarity in all real exchange rates.

Since we have seen that the ADF and PP tests lose power when the autoregressive
parameter in the outer regime becomes closer to unity, we can argue that these
results cannot constitute strong evidence for non-stationarity of real exchange rates.
According to the sup Wald tests reported in table 6.4, the random walk hypothesis
is rejected strongly for all of the real exchange rates in favor of a globally stationary
ESTAR model. Note that, except for IL and JY, we were not able to obtain constant term
estimates in the fitted ESTAR model for any of the real exchange rates in our sample.
Therefore, we did not test the null hypothesis of random walk with a drift ($H_0^b$) for those series. For
the JY and the IL, the sup Wald tests reject the null of random walk with drift at the 5
and 10 percent levels, respectively. Given the results from the sup Wald tests we can
argue that real exchange rates in our sample are globally stationary, although they
may exhibit random walk behavior locally. This result indicates that once a threshold
type of nonlinearity is taken into consideration, real exchange rates are stationary.
After empirically showing that real exchange rates are stationary, the next task is to
model the nonlinear behavior of real exchange rates under the alternative of a globally
stationary ESTAR model.

6.5.3 ESTAR model estimation and persistence of real exchange rates

While the results of the sup Wald tests impart some idea of the mean-reverting nature
of real exchange rates, a sensible way to gain full insight into the mean-reverting
properties of real exchange rates is to model this behavior by the nonlinear model
that is assumed under the alternative hypothesis, and also to look at the propagation
mechanism through which the adjustment process takes place after a shock to the level
of real exchange rates. Thus, table 6.5 reports the estimated ESTAR models of the
form given in (6.1). The estimation of the ESTAR model given in (6.1) was performed
using the constrained maximum likelihood method. The CML library in Gauss with
the Newton-Raphson optimization algorithm is used in estimation. The constraints
$\gamma > 0$ and $c \in [\underline{c}, \bar{c}]$, with $\underline{c}$ and $\bar{c}$ such that 15% of the observations in absolute value
are below $\underline{c}$ and 15% are above $\bar{c}$, are imposed. Following Leybourne et al. (1998), the
objective function is concentrated so that optimization is carried out for $\gamma$ and $c$ only.
For details, see Leybourne et al. (1998) or chapter 1 of this dissertation. The starting
values are obtained from a two-dimensional grid search over $\gamma$ and $c$. Following the
suggestion of Teräsvirta (1998), the transition function is reparameterized as follows:

 

$F(z_t;\gamma,c) = 1 - \exp\left(-\frac{\gamma}{s.e.(z_t)^2}\,(z_t - c)^2\right)$,

where $s.e.(z_t)$ is the sample standard deviation of the transition variable, so as to
make $\gamma$ approximately scale-free. The grid for $\gamma$ was set arbitrarily to 0.1, 0.2, $\cdots$, 20,
while the grid for $c$ is set as explained above.
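
The effect of this rescaling can be checked numerically. The sketch below assumes that the exponent is divided by the sample variance of $z_t$ (the square of $s.e.(z_t)$). The function name `estar_transition` is hypothetical.

```python
import numpy as np

def estar_transition(z, gamma, c):
    """Exponential transition function with gamma scaled by the sample variance of z."""
    z = np.asarray(z, dtype=float)
    s2 = np.var(z, ddof=1)                      # squared sample standard deviation
    return 1.0 - np.exp(-(gamma / s2) * (z - c) ** 2)

rng = np.random.default_rng(0)
z = rng.standard_normal(50)
F_original = estar_transition(z, gamma=2.0, c=0.3)
F_rescaled = estar_transition(100.0 * z, gamma=2.0, c=30.0)   # same data in other units
print(np.max(np.abs(F_original - F_rescaled)))
```

Under this assumption, multiplying the transition variable and the threshold by the same constant leaves the value of the transition function unchanged for a given $\gamma$, which is the sense in which $\gamma$ becomes scale-free.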

For each of the estimated ESTAR models, we could not reject the hypothesis of
no remaining nonlinearity of ESTAR form for values of d ranging from 2 to 12 on
the basis of the p-values of Lagrange multiplier (LM) tests (table 6.5 reports only
the p-values corresponding to the maximal value of the LM statistic, pNLESm).
Neither could we reject the hypothesis of no remaining nonlinearity of the LSTAR variety
with values of the delay parameter in the range of 1 to 12 (pNLLSm in the table). This
procedure suggests setting d = 1. The residual diagnostic statistics are satisfactory
in all cases (Eitrheim and Teräsvirta, 1996). The estimated transition parameter in
each case appears to be strongly significantly different from zero, both on the basis of
the individual t-ratios and in terms of the empirical marginal significance levels
reported in square brackets. Since under the null hypothesis that $\gamma = 0$ each of
the real exchange rate series follows a unit root process, the usual t-ratios should
be interpreted with caution. In the presence of a unit root under the null hypothesis
we cannot assume that the distribution of the t-ratio is given by Student's t
distribution. Following Taylor, Peel, and Sarno (2001), the empirical p-values are
computed by Monte Carlo methods, assuming that the true data generating process for
the logarithm of the real exchange rate series was a random walk, with the parameters
of the data generating process calibrated using the actual real exchange rate over the
sample period. The empirical p-values are based on 5,000 simulations of length
412, initialized at 0, from which the first 100 data points were discarded in each case.


At each replication an ESTAR model of the form reported in table (6.5) was estimated. The
percentage of replications for which the t-ratio for the estimated transition parameter
was greater in absolute value than that reported in table (6.5) was then
reported as the empirical p-value in each case. Note that since this test can also be
considered a unit root test against a nonlinear mean-reverting alternative, the
results also support the findings from the sup Wald tests reported in the previous section.
As can be seen from the panels of figure 6.1, the estimated models fit the data very well,
and the real exchange rates visit both the inner and outer regimes in each case. The graph
of the transition function against time reveals that the BP, DG, GM, and SF (European
zone except IL) series tend to stay closer to the outer regime until 1985, stay
closer to the inner regime between 1986 and 1993, and then again tend to stay closer
to the outer regime after the early 1990s. On the other hand, CD, IL, and JY tend
to stay closer to the outer regime for most of the time during our sample period.
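
The last step of the Monte Carlo procedure for the empirical p-values described above, counting the share of simulated t-ratios that exceed the observed one in absolute value, amounts to the following. The function name and the numbers in the usage line are illustrative assumptions, not values from table 6.5.

```python
import numpy as np

def empirical_p_value(t_observed, t_simulated):
    """Share of simulated t-ratios larger in absolute value than the observed one."""
    t_simulated = np.asarray(t_simulated, dtype=float)
    return float(np.mean(np.abs(t_simulated) > abs(t_observed)))

# Hypothetical numbers for illustration only.
print(empirical_p_value(2.0, [-3.0, 1.0, 2.5, 0.5]))
```

In the chapter the simulated t-ratios come from re-estimating the ESTAR model on each of the 5,000 random walk series.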
The ESTAR estimates reported in table 6.5 indicate that the autoregressive pa-
rameter in the inner regime is, for all series, either unity or above unity, implying a
unit root behavior in the inner regime. This is consistent with the theoretical foun-
dations given above in the sense that whenever the deviation from the equilibrium is
small real exchange rates behave as a random walk. On the other hand, the autore-
gressive estimate for the outer regime is, although less than unity for all series, close
to unity, implying near unit root behavior in the real exchange rates even globally.
This ﬁnding is consistent with the ﬁndings of chapter 3 in that it implies that devi-
ations from equilibrium should persist for a long time. This ﬁnding also motivates
the need to evaluate estimated models on the basis of impulse response functions as
the estimated parameters indicate that the real exchange rates may reveal persistent
deviations from equilibrium. To this end, the panels of ﬁgure 6.2 give the estimated
generalized impulse response functions (GIRF). The GIRFs are calculated as in chapter 3.
For a linear univariate model, the impulse response function is equivalent to
a plot of the coefficients of the moving average representation (see e.g. Hamilton,
1994, p. 318). As discussed in chapter 1, estimating the impulse response function
for a nonlinear model raises special problems both of interpretation and of
computation (see also Koop, Pesaran, and Potter, 1996). In particular, with nonlinear
models, the shape of the impulse response function is not independent with respect
to either the history of the time series at the moment the shock occurs, the size of
the shock considered, or the distribution of future exogenous innovations. In this
sense, impulse response functions are themselves random variables. As discussed in
chapter 1, the distribution of impulse responses can be utilized to gain insight about
the persistence of shocks in STAR models. It is intuitive to think that if a time
series process is stationary and ergodic, the effects of all shocks eventually converge
to zero for all possible histories of the process. Hence the distribution of impulse
responses collapses to a spike at 0 as the horizon approaches infinity. In contrast,
for non-stationary time series the dispersion of the distribution of impulse responses
is positive for all horizons. Koop, Pesaran, and Potter (1996) suggest using the dispersion
of the distribution of generalized impulse responses at finite horizons as a tool for
obtaining information about the persistence of shocks.

In this chapter we compute history- and shock-specific generalized impulse
responses for all observations in the sample period, as discussed in chapters 1 and 3.
The values of the normalized initial shock are set equal to $\delta/\hat{\sigma}_u = 1, 5, 10, 20, 40$, where $\hat{\sigma}_u$
denotes the estimated standard deviation of the residuals from the ESTAR model.
For each combination of history and initial shock, we compute generalized impulse
responses for horizons $k = 1, 2, \cdots, N$ with N = 120. The conditional expectations in
(1.42) are estimated as the means over 5,000 realizations of $q_{t+k}$, with and without
using the selected initial shock to obtain $q_t$, and using randomly sampled residuals
of the estimated ESTAR models elsewhere. All generalized impulse responses are
initialized such that they equal $\delta/\hat{\sigma}_u$ at $k = 0$.
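
A minimal Monte Carlo sketch of this computation for a first-order ESTAR model without constants is given below. The function names, the parameter values, and the simulated "residuals" are assumptions made for illustration. The output is in levels, so dividing by the residual standard deviation gives the normalized response.

```python
import numpy as np

def estar_step(y_prev, u, gamma, c, rho, rho_star):
    """One step of y_t = rho*y_{t-1}*(1-F) + rho_star*y_{t-1}*F + u_t."""
    F = 1.0 - np.exp(-gamma * (y_prev - c) ** 2)
    return rho * y_prev * (1.0 - F) + rho_star * y_prev * F + u

def girf(history, shock, residuals, gamma, c, rho, rho_star,
         horizon=120, n_rep=5000, seed=0):
    """Mean difference between shocked and baseline paths for k = 0, ..., horizon."""
    rng = np.random.default_rng(seed)
    # The same resampled residuals drive both paths after the impact period.
    future = rng.choice(residuals, size=(n_rep, horizon + 1), replace=True)
    base = estar_step(history, future[:, 0], gamma, c, rho, rho_star)
    shocked = estar_step(history, np.full(n_rep, shock), gamma, c, rho, rho_star)
    out = np.empty(horizon + 1)
    out[0] = np.mean(shocked - base)
    for k in range(1, horizon + 1):
        base = estar_step(base, future[:, k], gamma, c, rho, rho_star)
        shocked = estar_step(shocked, future[:, k], gamma, c, rho, rho_star)
        out[k] = np.mean(shocked - base)
    return out

rng = np.random.default_rng(1)
resid = 0.03 * rng.standard_normal(312)        # hypothetical residuals
sigma = resid.std(ddof=1)
for size in (1, 5, 10):
    g = girf(0.0, size * sigma, resid, gamma=50.0, c=0.0, rho=1.0,
             rho_star=0.9, horizon=24, n_rep=2000)
    print(size, g[12] / g[0])                  # share of the shock left after 12 periods
```

For a linear model (`rho == rho_star`) the function reproduces the usual geometric decay of an AR(1) impulse response, which gives a simple check of the implementation.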


The estimated generalized impulse responses that correspond to the histories
associated with the average value of the transition function are graphed in the panels
of figure 6.2 for each of the real exchange rates. These impulse response functions
very clearly illustrate the nonlinear nature of the adjustment, with the impulse
response functions for larger shocks decaying much faster than those for smaller shocks.
Careful analysis of the panels of figure 6.2 indicates that, although the responses to shocks
to the level of real exchange rates decay for all shock sizes, the speed with which
the impulse responses decay to half of the original normalized value of
the initial shock changes with the magnitude of the initial shock. Even for moderate
size shocks it takes several months for the shocks to revert to half of their initial
magnitude. Since impulse response functions are random variables that depend on
the shock and the initial history of the series considered, the distributions of impulse
responses for those histories corresponding to values of the transition function
above the 95th percentile are given in the panels of figure 6.3. Note that these
impulse responses correspond practically to periods where the real exchange rate is in
the outer regime. Therefore we expect the real exchange rate to be mean reverting
and hence the distribution of generalized impulse responses to accumulate around
zero at finite horizons. The panels of figure 6.3 illustrate clearly that as the horizon
increases the distribution of generalized impulse responses tends to pile up around
zero. However, in none of the cases does the distribution of generalized impulse responses
form a spike around zero, even for horizons of 120 months, which correspond
to 10 years after an initial shock occurs. These results support the findings in chapter 3
and lead us to a similar conclusion: despite the evidence of mean-reverting
nonlinearity in real exchange rates, they are very persistent in terms of the
response to shocks.


6.6 Conclusion

The high persistence of the deviations from PPP is well documented in the
literature. This chapter explored the nonlinear mean reversion of deviations from
PPP within the context of an exponential smooth transition autoregressive model.
The chapter proposes sup Wald tests to test the random walk hypothesis against
globally stationary ESTAR alternatives. Results from standard unit root tests and
the KPSS test indicate non-stationarity of real exchange rates while results from
sup Wald test revealed stationarity of real exchange rates once nonlinearities are
controlled for. The Monte Carlo experiments on the power of sup Wald and standard
unit root tests indicated that for parametric speciﬁcations that are closer to the ﬁtted
ESTAR models in the data, sup Wald tests have better power properties than the
standard unit root tests. Estimation, and further analysis of real exchange rates
by generalized impulse response functions, indicated the nonlinearity and persistence
of deviations from PPP. Although larger deviations tend to decay more
rapidly, the half-life estimates seem to be consistent with those of studies that do not take
nonlinearity into consideration; see, for instance, Rogoff (1996).


6.7 Appendix: Proof of propositions 1 and 2

For the sake of completeness, in the following we ﬁrst reproduce the deﬁnition of
a regular transformation and the theorem 3.1 of Park and Phillips (1999).

Definition 6.1: (Definition 3.1 of Park and Phillips, 1999) A transformation
T is said to be regular if and only if

(a) it is continuous in a neighborhood of infinity, and

(b) on every compact set $K$, there exist $\underline{T}_\epsilon$, $\bar{T}_\epsilon$ and $\delta_\epsilon > 0$ for each $\epsilon > 0$ satisfying

$\underline{T}_\epsilon(x) \le T(y) \le \bar{T}_\epsilon(x)$

for all $x, y \in K$ such that $|x - y| < \delta_\epsilon$, and $\int_K (\bar{T}_\epsilon - \underline{T}_\epsilon)(x)\,dx \to 0$ as $\epsilon \to 0$.

According to Park and Phillips (1999) the class of regular transformations includes
all continuous functions on a compact support. For that reason, the exponential
function is a regular function for any given value of $\lambda$ and $c$. Since in the proofs
we assume that the parameter space for $(\lambda, \bar{c})$ is compact, the exponential function
indexed by the parameters $(\lambda, c)$ satisfies the regularity conditions given in definition
3.2 of Park and Phillips. Moreover, since the class of regular transformations is closed under
addition, subtraction, and multiplication, the transformations obtained by addition,
subtraction and multiplication of the exponential function are regular. For details, see
Park and Phillips (1999) pages 810.

Definition 6.2: (Definition 3.2 of Park and Phillips, 1999) We say that
the function $T(x,\pi)$ (defined on a compact parameter space $\Pi$) is regular if

(a) $T(\cdot,\pi)$ is regular for all $\pi \in \Pi$, and

(b) for all $x \in R$, $T(x,\cdot)$ is equicontinuous in a neighborhood of $x$.

Since the exponential function is continuous for all $x$ and $(\gamma, c)$, it should satisfy
the regularity conditions stated above.
Theorem: (Theorem 3.1 of Park and Phillips, 1999) Under certain regularity
conditions on the disturbances of the time series process given in (6.2) ($u_t$ being
a martingale difference sequence is enough) and under a regular transformation T on
a compact set $\Pi$,

$\frac{1}{n}\sum_{t=1}^{n} T\left(\frac{y_t}{\sqrt{n}},\pi\right) \xrightarrow{a.s.} \int_0^1 T(B(r),\pi)\,dr$

uniformly in $\pi \in \Pi$. Moreover, if $T(\cdot,\pi)$ is regular, then

$\frac{1}{\sqrt{n}}\sum_{t=1}^{n} T\left(\frac{y_t}{\sqrt{n}},\pi\right)u_t \xrightarrow{L} \int_0^1 T(B(r),\pi)\,dB(r)$ as $n \to \infty$.

The proofs of the propositions use these results frequently.
Proof of Proposition 1: The proof of the proposition follows steps similar to those
given in Hamilton (1994, chapter 17) and uses theorem 3.1 of Park and Phillips (1999).

Letting $v_t = y_t - y_{t-1}$, the model in (6.2) can be written as

$y_t = x_t'\beta + u_t$ (6.9)

where

$x_t = (v_{t-1}, \cdots, v_{t-p+1}, (1 - F_t), y_{t-1}(1 - F_t), F_t, y_{t-1}F_t)'$,
$\beta = (\phi_1, \cdots, \phi_{p-1}, \mu, \rho, \mu^*, \rho^*)'$,

$u_t \sim iid(0, \sigma_u^2)$ and, for notational simplicity, the dependence of the transition function
on $t$ is denoted by $F_t$. Note that $x_t$ depends on $\lambda$ and $\bar{c}$, which we have assumed to be
fixed. Given the representation in (6.9), the deviation of the OLS estimate ($\hat{\beta}$) from the
true value ($\beta$) is

$\hat{\beta} - \beta = \left[\sum x_t x_t'\right]^{-1} \sum x_t u_t$ (6.10)
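These can be written as follows:

$\sum x_t x_t' = \begin{bmatrix} A_{11} & A_{21}' \\ A_{21} & A_{22} \end{bmatrix}$ (6.11)

where

$A_{11} = \begin{bmatrix} \sum v_{t-1}^2 & \sum v_{t-1}v_{t-2} & \cdots & \sum v_{t-1}v_{t-p+1} \\ \sum v_{t-2}v_{t-1} & \sum v_{t-2}^2 & \cdots & \sum v_{t-2}v_{t-p+1} \\ \vdots & \vdots & \ddots & \vdots \\ \sum v_{t-p+1}v_{t-1} & \sum v_{t-p+1}v_{t-2} & \cdots & \sum v_{t-p+1}^2 \end{bmatrix}$,

$A_{21} = \begin{bmatrix} \sum (1 - F_t)v_{t-1} & \cdots & \sum (1 - F_t)v_{t-p+1} \\ \sum y_{t-1}(1 - F_t)v_{t-1} & \cdots & \sum y_{t-1}(1 - F_t)v_{t-p+1} \\ \sum F_t v_{t-1} & \cdots & \sum F_t v_{t-p+1} \\ \sum y_{t-1}F_t v_{t-1} & \cdots & \sum y_{t-1}F_t v_{t-p+1} \end{bmatrix}$,

and $A_{22}$ is a symmetric matrix given by (lower triangle shown)

$A_{22} = \begin{bmatrix} \sum (1 - F_t)^2 & & & \\ \sum y_{t-1}(1 - F_t)^2 & \sum y_{t-1}^2(1 - F_t)^2 & & \\ \sum F_t(1 - F_t) & \sum F_t y_{t-1}(1 - F_t) & \sum F_t^2 & \\ \sum y_{t-1}F_t(1 - F_t) & \sum y_{t-1}^2 F_t(1 - F_t) & \sum y_{t-1}F_t^2 & \sum y_{t-1}^2 F_t^2 \end{bmatrix}$.

The vector in the second expression of (6.10) is

$\sum x_t u_t = \left(\sum v_{t-1}u_t,\ \sum v_{t-2}u_t,\ \cdots,\ \sum v_{t-p+1}u_t,\ \sum (1 - F_t)u_t,\ \sum y_{t-1}(1 - F_t)u_t,\ \sum F_t u_t,\ \sum y_{t-1}F_t u_t\right)'$ (6.12)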

Under $H_0^a$, since the true process is a random walk without drift, following Hamilton
(1994) we can use the following $(p - 1 + 4) \times (p - 1 + 4)$ diagonal scaling matrix
($\Upsilon_T$) with diagonal elements $(\sqrt{T}, \cdots, \sqrt{T}, \sqrt{T}, T, \sqrt{T}, T)$.

Premultiplying (6.10) by $\Upsilon_T$, we obtain

$\Upsilon_T(\hat{\beta} - \beta) = \left\{\Upsilon_T^{-1}\left[\sum x_t x_t'\right]\Upsilon_T^{-1}\right\}^{-1}\left\{\Upsilon_T^{-1}\left[\sum x_t u_t\right]\right\}$ (6.13)

Now consider the matrix $\Upsilon_T^{-1}\left[\sum x_t x_t'\right]\Upsilon_T^{-1}$. Elements in the upper left $(p - 1) \times (p - 1)$
block of $\sum x_t x_t'$ (i.e. elements of $A_{11}$) are divided by T. The first and third
rows of $A_{21}$ (similarly, the first and third columns of $A_{21}'$) are divided by T. The second
and fourth rows of $A_{21}$ are divided by $T^{3/2}$. On the other hand, those entries of the
sub-matrix $A_{22}$ that do not contain $y_{t-1}$ are divided by T, those that contain $y_{t-1}$ are divided by
$T^{3/2}$, and those with $y_{t-1}^2$ are divided by $T^2$. By the Law of Large Numbers,

$\frac{1}{T}\sum v_{t-i}v_{t-j} \xrightarrow{p} E[v_{t-i}v_{t-j}] = \zeta_{|i-j|}$.

Note that under $H_0^a$, $y_t$ is a random walk without drift and $y_t/\sqrt{T}$ converges
to $\delta B(r)$, $r = t/T$, where $B(\cdot)$ is a standard Brownian motion. Note also that
$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} u_t$ converges to $\sigma B(r)$, where $[Tr]$ is the largest integer that is less than
or equal to $Tr$. Since continuous transformations of the exponential transition
function $F(z;\lambda,\bar{c}) = 1 - \exp\left[-\left(\frac{\lambda}{\bar{c}}z - \lambda\right)^2\right]$ are themselves continuous in $z$ and in
$(\lambda, \bar{c}) \in \Lambda$, they are regular in the sense of the definition given in Park and Phillips
(1999). Therefore we can apply their theorem 3.1 to the remaining terms of (6.13).

For this purpose denote

$F(r) = 1 - \exp\left[-\left(\frac{\lambda}{\bar{c}}\,\delta B(r) - \lambda\right)^2\right]$

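where $B(r)$ is a standard Brownian motion on [0,1]. By theorem 3.1 of Park and
Phillips (1999),

$\frac{1}{T}\sum F_t v_{t-i} \xrightarrow{p} 0$

$\frac{1}{T}\sum (1 - F_t)v_{t-i} \xrightarrow{p} 0$

$\frac{1}{T}\sum \frac{y_{t-1}}{\sqrt{T}}(1 - F_t)v_{t-i} \xrightarrow{p} 0$

$\frac{1}{T}\sum \frac{y_{t-1}}{\sqrt{T}}F_t v_{t-i} \xrightarrow{p} 0$

$\frac{1}{T}\sum \frac{y_{t-1}}{\sqrt{T}}F_t(1 - F_t) \xrightarrow{L} \delta\int_0^1 B(r)F(r)(1 - F(r))\,dr$

$\frac{1}{T}\sum \frac{y_{t-1}}{\sqrt{T}}(1 - F_t)^2 \xrightarrow{L} \delta\int_0^1 B(r)(1 - F(r))^2\,dr$

$\frac{1}{T}\sum \frac{y_{t-1}^2}{T}F_t(1 - F_t) \xrightarrow{L} \delta^2\int_0^1 B(r)^2F(r)(1 - F(r))\,dr$

$\frac{1}{T}\sum \frac{y_{t-1}^2}{T}(1 - F_t)^2 \xrightarrow{L} \delta^2\int_0^1 B(r)^2(1 - F(r))^2\,dr$

$\frac{1}{T}\sum \frac{y_{t-1}^2}{T}F_t^2 \xrightarrow{L} \delta^2\int_0^1 B(r)^2F(r)^2\,dr$

pointwise in $(\lambda,\bar{c}) \in \Lambda$. The convergence here is pointwise rather than uniform, as
theorem 3.1 of Park and Phillips (1999) applies here for fixed values of $\lambda$ and $\bar{c}$.
Ideally, we would like to have uniform convergence in $\Lambda$, which is very difficult to
prove. To our knowledge, there does not exist a result that extends Park and Phillips's
theorem 3.1 to the case where convergence is uniform in $\Lambda$. Applying theorem 3.1 of
Park and Phillips to the rest of the terms,

$\frac{1}{T}\sum F_t(1 - F_t) \xrightarrow{p} \int_0^1 F(r)(1 - F(r))\,dr$

$\frac{1}{T}\sum F_t^2 \xrightarrow{p} \int_0^1 F(r)^2\,dr$

$\frac{1}{T}\sum (1 - F_t)^2 \xrightarrow{p} \int_0^1 (1 - F(r))^2\,dr$

uniformly in $(\lambda,\bar{c}) \in \Lambda$.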

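Hence, we have shown that

$\Upsilon_T^{-1}\left[\sum x_t x_t'\right]\Upsilon_T^{-1} \xrightarrow{L} \begin{bmatrix} V & 0 \\ 0 & Q \end{bmatrix}$ (6.14)

where

$V = \begin{bmatrix} \zeta_0 & \zeta_1 & \cdots & \zeta_{p-2} \\ \zeta_1 & \zeta_0 & \cdots & \zeta_{p-3} \\ \vdots & \vdots & \ddots & \vdots \\ \zeta_{p-2} & \zeta_{p-3} & \cdots & \zeta_0 \end{bmatrix}$, $\quad Q = \begin{bmatrix} Q_{11} & Q_{21}' \\ Q_{21} & Q_{22} \end{bmatrix}$

with

$Q_{11} = \begin{bmatrix} \int_0^1 (1 - F(r))^2\,dr & \delta\int_0^1 B(r)(1 - F(r))^2\,dr \\ \delta\int_0^1 B(r)(1 - F(r))^2\,dr & \delta^2\int_0^1 B(r)^2(1 - F(r))^2\,dr \end{bmatrix}$

$Q_{21} = \begin{bmatrix} \int_0^1 F(r)(1 - F(r))\,dr & \delta\int_0^1 B(r)F(r)(1 - F(r))\,dr \\ \delta\int_0^1 B(r)F(r)(1 - F(r))\,dr & \delta^2\int_0^1 B(r)^2F(r)(1 - F(r))\,dr \end{bmatrix}$

$Q_{22} = \begin{bmatrix} \int_0^1 F(r)^2\,dr & \delta\int_0^1 B(r)F(r)^2\,dr \\ \delta\int_0^1 B(r)F(r)^2\,dr & \delta^2\int_0^1 B(r)^2F(r)^2\,dr \end{bmatrix}$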

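Now consider the vector $\Upsilon_T^{-1}\left[\sum x_t u_t\right]$ in (6.13). Following Hamilton (1994, pages
520-21) this term can be decomposed into two parts. Using the result from Hamilton
(1994), the first $(p - 1)$ elements of this vector satisfy the usual central limit theorem
and hence

$\begin{bmatrix} \frac{1}{\sqrt{T}}\sum v_{t-1}u_t \\ \frac{1}{\sqrt{T}}\sum v_{t-2}u_t \\ \vdots \\ \frac{1}{\sqrt{T}}\sum v_{t-p+1}u_t \end{bmatrix} \xrightarrow{L} h_1 \sim N(0, \sigma^2 V)$ (6.15)

The asymptotic behavior of the last four elements can be obtained by using the results
in Hamilton and Park and Phillips (1999). For any given $(\lambda, \bar{c})$ we have

$\begin{bmatrix} \frac{1}{\sqrt{T}}\sum (1 - F_t)u_t \\ \frac{1}{T}\sum y_{t-1}(1 - F_t)u_t \\ \frac{1}{\sqrt{T}}\sum F_t u_t \\ \frac{1}{T}\sum y_{t-1}F_t u_t \end{bmatrix} \xrightarrow{L} h_2 \sim \begin{bmatrix} \sigma\int_0^1 (1 - F(r))\,dB(r) \\ \sigma\delta\int_0^1 B(r)(1 - F(r))\,dB(r) \\ \sigma\int_0^1 F(r)\,dB(r) \\ \sigma\delta\int_0^1 B(r)F(r)\,dB(r) \end{bmatrix}$ (6.16)

Substituting (6.15) through (6.16) into (6.13) results in

$\Upsilon_T(\hat{\beta} - \beta) \xrightarrow{L} \begin{bmatrix} V & 0 \\ 0 & Q \end{bmatrix}^{-1}\begin{bmatrix} h_1 \\ h_2 \end{bmatrix} = \begin{bmatrix} V^{-1}h_1 \\ Q^{-1}h_2 \end{bmatrix}$ (6.17)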

The null hypothesis Hg : p = p.21: = 0, p = p* = 1 can be represented by Rﬂ = q,
I
whereR= [0 I4],q=(0,1,0,1),withObeinga4x(p—1)zeromatrix

225

and 14b eing the 4 x 4 identity matrix. The Wald test is then
. I -1 ‘1 .
W7: (6 — 5) R’ [623 (2 mg) R] R (6 — (3) (6.18)

Deﬁne T} be the following (4 x 4) matrix:

r- -(

«T 0 0 0

- 0 T 0 0
0 0 «if 0

0 0 0 T

L. -

 

 

Notice that (6.18) can be written
W7: (6‘ — ﬂ), RTT [5217,52, (2: $539-1 Rh] 4 TTR (B —- ﬂ) (6.20)
Observe that the matrix TT has the property that
ha = RTT

for R = [ 0 I4 ] and TT the (p + 3) x (p + 3) diagonal scaling matrix given above.
From (6.17),

1211(6 — (3) —1—» 0%.

Therefore, (6.20) implies that

W7: ([3 — ﬂ), (RTT)’ [621217. (2 ztx;)—1TTR]—l TTR (,6 — ﬂ)
—"—> (Q2122) [21621]“ (Q4222)

= (1262—1’12/‘72 E (W) (621)

Note that under the alternative hypothesis $y_t$ follows a stationary ESTAR process, since
$\rho^*$ is strictly less than 1. Under the alternative, the estimates $\hat\beta$ converge at rate
$\sqrt{T}$ to their pseudo-true values, which are functions of $\gamma$ and $c$. Hence, the
test statistic should diverge.
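
The statistic in (6.18) can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not the procedure used in the dissertation: it takes p = 1 (so there are no lagged differences and R reduces to the identity), uses y_{t-1} as the transition variable, fixes the transition parameters (gamma, c), and orders the regressors as (1 − F_t), y_{t-1}(1 − F_t), F_t, y_{t-1}F_t to match q = (0, 1, 0, 1)'. The function names `estar_transition` and `wald_unit_root` are hypothetical.

```python
import numpy as np

def estar_transition(z, gamma, c):
    """Exponential transition function F = 1 - exp(-gamma * (z - c)^2)."""
    return 1.0 - np.exp(-gamma * (z - c) ** 2)

def wald_unit_root(y, gamma, c):
    """Wald statistic for H0: (mu, rho, mu*, rho*) = (0, 1, 0, 1) in
    y_t = (mu + rho*y_{t-1})(1 - F_t) + (mu* + rho* y_{t-1}) F_t + u_t.
    Assumes p = 1 and a fixed (gamma, c); both are illustrative choices."""
    y_lag, y_cur = y[:-1], y[1:]
    F = estar_transition(y_lag, gamma, c)
    X = np.column_stack([1 - F, y_lag * (1 - F), F, y_lag * F])
    beta, *_ = np.linalg.lstsq(X, y_cur, rcond=None)
    resid = y_cur - X @ beta
    sigma2 = resid @ resid / (len(y_cur) - X.shape[1])
    q = np.array([0.0, 1.0, 0.0, 1.0])
    # With R = I_4, (b - q)' [s2 (X'X)^{-1}]^{-1} (b - q) = ||X (b - q)||^2 / s2
    Xd = X @ (beta - q)
    return float(Xd @ Xd / sigma2)

# Usage: a driftless random walk generated under the null
rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(200))
print(wald_unit_root(y, gamma=2.5, c=0.05))
```

Writing the quadratic form as a squared norm avoids inverting X'X explicitly, which can be poorly conditioned when F_t is close to 0 or 1 for most observations.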


Proof of Proposition 2: Note that under the null, since the process is a random
walk with drift (i.e. $y_t = \mu + y_{t-1} + u_t$), we need to use the following diagonal matrix
with the diagonal elements $(\sqrt{T}, \ldots, \sqrt{T}, \sqrt{T}, T^{3/2}, \sqrt{T}, T^{3/2})$. Note also that under
the null $y_t/t$ converges to $\mu$ as $T \to \infty$. Since under the null the OLS estimate of $\mu$ is
consistent, we can act as if we know $\mu$. Denote

F(6) =1— exp (— (€11 — A)2).

Using Theorem 3.1 in Park and Phillips (1999) and proceeding as in the proof of
Proposition 1, we can show that

121”“ (1-12.) )L/(m —F(d11))) F10)
3sz —’-’—» [$11112de
%Z(1—F1)2L/(l-FMDZFM)

uniformly over the transition parameter space. The rest of the terms converge in
probability pointwise over the same parameter space. That is,
1
T 2 17101—1 "5+ 0
1
a; E (1 — Ft)vt—i L 0

W2 y—t 1( _)Ft Ut— 1 ‘—* 0
111-1 P
T12 TFt'Ut—l —“’ 0
1T: #1111216 — F.) i» / EF(11)(1— Fm) dFm)
%Zy_‘-1(1-)F.2 1’ [5(1—F())11F(1)
— .).: g’2-—1F.(1— F.) i» [3 -"—F(11)( 1 — (F611) 1111(1)

.). 1’;;——1<1-F.)L/1—‘3- (1—F<11))d F()


1 313-1 2 P fl‘z” 2 "
T: 12 F. 3 F02) 1m).

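pointwise over the transition parameter space. In the above, integration is over the
support of u. Applying similar steps to those in the proof of Proposition 1, we can obtain

$$T_T^{-1}\left(\sum x_t x_t'\right)T_T^{-1} \xrightarrow{p} \begin{bmatrix} V & 0 \\ 0 & Q \end{bmatrix}$$

where now V is the same as above and Q becomes

$$Q = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}$$

with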
6211— 1(1—”(11))2f”(u) f’§(1-F(u))2d17‘(u)
5 1—F()1) 611502) 1"?” (l-F(u))2d13(u)

112(1) 12.126.) (1 — F(u))dF<11)
112(1) f 2312(1) (1 — F66) 112(1)

f 1500261151 (12) f £151 (FWF

Q22 = ~ ~ 2 ~ ~
I 15F ((2)2111? (12) F61" (u)2dF (u)

(6.22)

The limiting distribution of the first (p − 1) elements of the vector
$T_T^{-1}\left[\sum x_t u_t\right]$ is given in (6.15). The last four elements of this vector follow,
asymptotically,

. - ﬁg“,

ﬁEU-qu. 2
1 Z 1-1701) “t
T—gﬁz y.-.(1—F.)u. .. 7'1" ( )

1 2312.11. ..
7? 717~ZF(u)u1

_ 7:375 23 #(t - 1)F(u)u1

 

 

#72 Z yt—IFtut

 

 


.. 1
——2 saw—mm. #2301216?)

(6.23)

Combining each component of (6.13), it follows that

$$T_T(\hat\beta - \beta) \xrightarrow{L} \begin{bmatrix} V^{-1}h_1 \\ N(0, \sigma^2 Q^{-1}) \end{bmatrix} \qquad (6.24)$$

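Under the null, consider the following selection matrix:

$$R = [\,0 \;\; R_4\,]$$

where 0 is a 4 x (p − 1) zero matrix, and

$$R_4 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

and define $\tilde T_T$ now to be

$$\tilde T_T = \begin{bmatrix} \sqrt{T} & 0 & 0 & 0 \\ 0 & T^{3/2} & 0 & 0 \\ 0 & 0 & \sqrt{T} & 0 \\ 0 & 0 & 0 & T^{3/2} \end{bmatrix} \qquad (6.25)$$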

Proceeding in a similar fashion to the proof of Proposition 1, we can show that
the asymptotic distribution of $W_T$ is

WT— —2 $11(0.02Q“)'Q“N(0.02Q“) -—2 X2012) (626)

By the same argument given in the proof of proposition 1, under the alternative the

Wald test should diverge. This completes the proof.


BIBLIOGRAPHY

[1] Caner, M. and B. E. Hansen (2001), Threshold autoregression with a unit root,
Econometrica 69, 1555-1596.

[2] Chan, K.S., J.D. Petrucelli, H. Tong, and S.W. Woolford (1985), A multiple
threshold AR(1) model, Journal of Applied Probability 22, 267—279.

[3] Dickey, D. and W. Fuller, (1981), Likelihood ratio statistics for autoregressive
time series with a unit root, Econometrica 49, 1057-1072.

[4] Dumas, B. (1992), Dynamic equilibrium and the real exchange rate in a spatially
separated world, Review of Financial Studies 5, 153-180.

[5] Eitrheim, O. and T. Teräsvirta (1996), Testing the adequacy of smooth transition
autoregressive models, Journal of Econometrics 74, 59-76.

[6] Granger, C.W.J. and T. Teräsvirta (1993), Modelling Nonlinear Economic
Relationships, Oxford: Oxford University Press.

[7] Hamilton, J. (1994), Time Series Analysis, Princeton, New Jersey: Princeton
University Press.

[8] Hansen, B. E. (1997), Inference in TAR models, Studies in Nonlinear Dynamics
and Econometrics 1, 119-131.

[9] Kilian, L. and M. Taylor (2001), Why is it difficult to beat the random walk
forecast of exchange rates? Manuscript, Department of Economics, University
of Michigan.

[10] Koop, G., M. H. Pesaran and S. M. Potter (1996), Impulse response analysis in
nonlinear multivariate models, Journal of Econometrics 74, 119—147.

[11] Kwiatkowski, D., P.C.B. Phillips, P. Schmidt, and Y. Shin (1992), Testing the
null hypothesis of stationarity against the alternative of a unit root: How sure
are we that economic time series have a unit root? Journal of Econometrics 54,
159-178.

[12] Michael, P., R.A. Nobay, and D.A. Peel (1997), Transactions costs and nonlinear
adjustment in real exchange rates: an empirical investigation, Journal of Political
Economy 105, 862-879.

[13] O'Connell, P.G.J. (1998), Market frictions and real exchange rates, Journal of
International Money and Finance 17, 71-95.

[14] Park, J. Y. and P. C. B. Phillips (1999), Nonlinear regressions with integrated
time series, working paper, Department of Economics, Yale University.

[15] Phillips, P.C.B. and P. Perron (1988), Testing for a unit root in time series
regression, Biometrika 75, 335-346.

[16] Rogoff, K. (1996), The purchasing power parity puzzle, Journal of Economic
Literature 34, 647-668.

[17] Sercu, P., R. Uppal, and C. Van Hulle (1995), The exchange rate in the presence
of transaction costs: implications for tests of purchasing power parity, Journal
of Finance 50, 1309-1319.

[18] Taylor, M. P. and L. Sarno (1998), The behavior of real exchange rates during the
post-Bretton Woods period, Journal of International Economics 46, 281-312.

[19] Taylor, M.P., D.A. Peel, and L. Sarno (2001), Non-linear mean-reversion in real
exchange rates: towards a solution of the purchasing power parity puzzles,
Working Paper, Centre for Economic Policy Research, London, UK.

[20] Teräsvirta, T. (1994), Specification, estimation and evaluation of smooth transition
autoregressive models, Journal of the American Statistical Association 89,
208-218.

[21] Teräsvirta, T. (1998), Modelling economic relationships with smooth transition
regressions, in A. Ullah and D.E.A. Giles (editors), Handbook of Applied Economic
Statistics, New York: Marcel Dekker, pp. 507-552.

[22] Tjøstheim, D. (1990), Nonlinear time series and Markov chains, Advances in
Applied Probability 22, 587-611.

Figure 6.1: Estimated j-step ahead covariances from the simulated ESTAR model

[Four panels, each titled "Covs from simulated STAR model"; vertical axis: Covariance; horizontal axis: Time.]

Figure 6.2: Real exchange rate series and fitted values, residuals, and estimated
transition function versus time and transition variable

[Panels: (a) BP, (b) CD, (c) DG, (d) GM, (e) IL, (f) JY, (g) SF.]

Figure 6.3: Generalized Impulse Response Functions from Estimated ESTAR Models

[Seven panels, one per real exchange rate, beginning with (a) BP; horizontal axis ticks run from 1 to 120.]

Figure 6.4: Distribution of Generalized Impulse Responses

[One panel per real exchange rate, including (d) GM.]

Table 6.1: Empirical critical values of the unit root tests

           1%      5%      10%     15%     20%     80%     85%     90%     95%     99%
supW       0.912   1.972   2.756   3.370   3.950   13.140  15.084  17.860  23.849  44.077
supWh      0.960   2.094   2.941   3.590   4.230   14.548  16.732  20.268  26.817  45.915
supW_μ     0.244   0.945   1.621   2.207   2.733   11.430  13.247  15.886  21.631  41.532
supWh_μ    0.286   1.040   1.741   2.391   2.984   12.514  14.427  17.537  23.868  42.473

Notes: supW and supWh stand for the standard and heteroscedasticity-robust versions of the sup Wald
test of a random walk without drift against a stationary ESTAR alternative, while supW_μ
and supWh_μ stand for the standard and heteroscedasticity-robust versions of the sup Wald test of a
random walk with drift against a stationary ESTAR alternative. Critical values are computed from
20,000 replications with μ = 0.05, and errors are drawn from i.i.d. N(0, 1).
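
Empirical critical values of this kind are quantiles of the simulated null distribution of the statistic. The sketch below illustrates the idea for the driftless case only and is not a reproduction of the dissertation's simulation design: the sample size, the grid over (gamma, c), the choice p = 1, the use of y_{t-1} as transition variable and the function names `sup_wald` and `simulate_critical_values` are all assumptions made for illustration, so the output should not be expected to match Table 6.1.

```python
import numpy as np

def sup_wald(y, gammas, cs):
    """Largest Wald statistic for H0: (mu, rho, mu*, rho*) = (0, 1, 0, 1)
    over a grid of transition parameters (gamma, c). Assumes p = 1."""
    y_lag, y_cur = y[:-1], y[1:]
    q = np.array([0.0, 1.0, 0.0, 1.0])
    best = 0.0
    for g in gammas:
        for c in cs:
            F = 1.0 - np.exp(-g * (y_lag - c) ** 2)
            X = np.column_stack([1 - F, y_lag * (1 - F), F, y_lag * F])
            beta, *_ = np.linalg.lstsq(X, y_cur, rcond=None)
            resid = y_cur - X @ beta
            sigma2 = resid @ resid / (len(y_cur) - 4)
            Xd = X @ (beta - q)          # equals X beta_hat - X q
            best = max(best, float(Xd @ Xd / sigma2))
    return best

def simulate_critical_values(T, reps, probs, seed=0):
    """Empirical quantiles of sup Wald under a driftless random walk
    with i.i.d. N(0, 1) errors. The grid below is purely illustrative."""
    rng = np.random.default_rng(seed)
    gammas = np.array([1.0, 5.0, 15.0])
    cs = np.array([-0.1, 0.0, 0.1])
    stats = np.empty(reps)
    for r in range(reps):
        y = np.cumsum(rng.standard_normal(T))
        stats[r] = sup_wald(y, gammas, cs)
    return np.quantile(stats, probs)

# Usage: a small run; 20,000 replications would follow the same pattern
print(simulate_critical_values(T=100, reps=200, probs=[0.90, 0.95, 0.99]))
```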

Table 6.2: Empirical size of the unit root tests

Theoretical size   ADF     PP      supW    supWh   supW_μ  supWh_μ
1%                 0.013   0.012   0.011   0.010   0.011   0.010
5%                 0.050   0.051   0.054   0.052   0.044   0.041
10%                0.102   0.102   0.106   0.100   0.078   0.077

Notes: The columns corresponding to supW and supWh give the rejection frequencies of the true null
hypothesis of a random walk without drift, while the columns corresponding to supW_μ and supWh_μ
give the rejection frequencies of the true null of a random walk with drift. The data are generated under the
respective null hypotheses with μ = 0 and μ = 0.05. The rejection frequencies for ADF and PP correspond
to μ = 0.

Table 6.3: Empirical power of the unit root tests

a. γ = 2.5, c = 0.05, μ = μ* = 0

Test       ρ = 1.0, ρ* = -0.5       ρ = 1.0, ρ* = 0.5        ρ = 1.0, ρ* = 0.95
           1%     5%     10%        1%     5%     10%        1%     5%     10%
ADF        0.970  0.975  0.978      0.835  0.850  0.865      0.410  0.445  0.450
PP         0.968  0.977  0.980      0.805  0.844  0.866      0.400  0.425  0.448
supW       1.000  1.000  1.000      0.995  0.998  0.999      0.479  0.685  0.803
supWh      1.000  1.000  1.000      0.995  0.996  0.997      0.446  0.633  0.750
supW_μ     1.000  1.000  1.000      0.996  0.998  0.999      0.507  0.712  0.813
supWh_μ    1.000  1.000  1.000      0.993  0.995  0.997      0.481  0.666  0.787

b. γ = 15, c = 0.05, μ = μ* = 0

Test       ρ = 1.0, ρ* = -0.5       ρ = 1.0, ρ* = 0.5        ρ = 1.0, ρ* = 0.95
           1%     5%     10%        1%     5%     10%        1%     5%     10%
ADF        0.961  0.970  0.972      0.828  0.839  0.855      0.385  0.411  0.420
PP         0.962  0.975  0.977      0.788  0.812  0.846      0.378  0.405  0.417
supW       1.000  1.000  1.000      0.998  0.998  1.000      0.499  0.715  0.817
supWh      1.000  1.000  1.000      0.996  0.997  0.998      0.476  0.673  0.785
supW_μ     1.000  1.000  1.000      0.995  0.998  0.999      0.538  0.740  0.833
supWh_μ    1.000  1.000  1.000      0.994  0.996  0.998      0.494  0.688  0.801

c. γ = 2.5, c = 0.05, μ = 0.05, μ* = -0.05

Test       ρ = 1.0, ρ* = -0.5       ρ = 1.0, ρ* = 0.5        ρ = 1.0, ρ* = 0.95
           1%     5%     10%        1%     5%     10%        1%     5%     10%
ADF        0.935  0.950  0.958      0.810  0.822  0.835      0.377  0.400  0.414
PP         0.932  0.950  0.960      0.776  0.811  0.836      0.375  0.400  0.413
supW       1.000  1.000  1.000      0.997  0.996  0.998      0.500  0.714  0.820
supWh      1.000  1.000  1.000      0.995  0.996  0.998      0.488  0.675  0.790
supW_μ     1.000  1.000  1.000      0.999  1.000  1.000      0.667  0.814  0.885
supWh_μ    1.000  1.000  1.000      0.996  0.999  0.999      0.628  0.773  0.850

d. γ = 15, c = 0.05, μ = 0.05, μ* = -0.05

Test       ρ = 1.0, ρ* = -0.5       ρ = 1.0, ρ* = 0.5        ρ = 1.0, ρ* = 0.95
           1%     5%     10%        1%     5%     10%        1%     5%     10%
ADF        0.935  0.950  0.958      0.812  0.820  0.837      0.377  0.400  0.414
PP         0.930  0.948  0.956      0.789  0.817  0.837      0.375  0.400  0.413
supW       1.000  1.000  1.000      0.998  0.998  0.998      0.524  0.746  0.834
supWh      1.000  1.000  1.000      0.995  0.997  0.999      0.488  0.713  0.820
supW_μ     1.000  1.000  1.000      1.000  1.000  1.000      0.679  0.829  0.890
supWh_μ    1.000  1.000  1.000      0.999  1.000  1.000      0.645  0.794  0.861

Notes: The rows corresponding to supW and supWh give the rejection frequencies of the false null
hypothesis of a random walk without drift, while the rows corresponding to supW_μ and supWh_μ give
the rejection frequencies of the false null of a random walk with drift. The data are generated under the
alternative hypothesis of a globally stationary ESTAR model.
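
Data under the alternative can be generated recursively from the ESTAR equation. The following is a minimal sketch, assuming p = 1, a delay of one period in the transition variable, i.i.d. N(0, 1) errors, a zero starting value and a burn-in period; none of these details are stated in the table notes, and the function name `simulate_estar` is hypothetical. The parameter values in the usage line are taken from panel a of Table 6.3.

```python
import numpy as np

def simulate_estar(T, gamma, c, mu, rho, mu_star, rho_star, seed=0, burn=100):
    """Simulate y_t = (mu + rho*y_{t-1})(1 - F_t) + (mu* + rho* y_{t-1}) F_t + u_t
    with F_t = 1 - exp(-gamma * (y_{t-1} - c)^2) and u_t ~ i.i.d. N(0, 1).
    The first `burn` observations are discarded (an illustrative choice)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T + burn)
    y = np.zeros(T + burn)
    for t in range(1, T + burn):
        F = 1.0 - np.exp(-gamma * (y[t - 1] - c) ** 2)
        inner = mu + rho * y[t - 1]            # regime active near y_{t-1} = c
        outer = mu_star + rho_star * y[t - 1]  # regime active far from c
        y[t] = inner * (1.0 - F) + outer * F + u[t]
    return y[burn:]

# Usage: gamma = 2.5, c = 0.05, mu = mu* = 0, rho = 1.0, rho* = 0.5
y = simulate_estar(300, gamma=2.5, c=0.05, mu=0.0, rho=1.0, mu_star=0.0, rho_star=0.5)
print(y[:5])
```

Empirical power is then the share of simulated series for which a test statistic exceeds its critical value.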


Table 6.4: Results on unit root and stationarity tests: PP, supWald and KPSS tests

      PP       KPSS    ADF      supW       supWh      supW_μ   supWh_μ
BP    -2.571   2.242   -3.009   24.505     29.024     n.a.     n.a.
CD    -1.192   2.357   -1.382   1462.232   1749.536   n.a.     n.a.
GM    -2.126   2.675   -1.784   2547.000   2617.812   n.a.     n.a.
IL    -2.697   2.675   -2.785   49.679     55.319     13.058   19.663
JY    -0.376   3.041   -0.136   58.269     65.303     34.965   33.632
DG    -1.536   2.570   -1.311   3030.276   3191.674   n.a.     n.a.
SF    -2.440   2.665   -2.112   249.205    269.036    n.a.     n.a.

Key: The reported values for the PP test are based on the regression of the time series on a
constant and its lagged value. The lag truncation for the Bartlett kernel is obtained from the formula
floor(4(T/100)^(2/9)). The 1%, 5% and 10% critical values are -3.454, -2.871, and -2.570 respectively
for the PP tests. The reported values for the KPSS test are based on a regression of the series
on a constant only. The 1%, 5%, and 10% critical values for the KPSS tests are 0.739, 0.463 and
0.347 respectively. The size of the Bartlett window for KPSS is obtained by using floor(8(T/100)^(1/4)).
The ADF test is based on the regression of the first differenced real exchange rate on a constant, the lagged real
exchange rate and p − 1 lags of the first differenced real exchange rate. The lag length is chosen
according to the Ljung-Box statistic and is found to be 1 for all real exchange rates. The 1%, 5%, and
10% critical values for the ADF test are -3.454, -2.871, and -2.570.
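
The two bandwidth rules in the key are simple to evaluate. As a small worked sketch, the code below applies them to T = 312, assuming that the unit root tests use the same sample size as reported for the ESTAR estimates in Table 6.5; the function names are hypothetical.

```python
import math

def pp_lag_truncation(T):
    """Lag truncation for the Bartlett kernel: floor(4 * (T/100)^(2/9))."""
    return math.floor(4 * (T / 100) ** (2 / 9))

def kpss_window(T):
    """Bartlett window size for KPSS: floor(8 * (T/100)^(1/4))."""
    return math.floor(8 * (T / 100) ** (1 / 4))

# T = 312 is an assumption carried over from Table 6.5
print(pp_lag_truncation(312), kpss_window(312))  # prints: 5 10
```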

 


Table 6.5: Estimation Results from ESTAR models: Sample size: 312

 

BP CD DG GM IL JY SF
5, 0.004 0.002 0.001 0.003
(0.001) (0.001) (0.000) (0.001)
52 0002 . .
(0.001) . .
μ             .        .        .        .        0.024    -0.017   .
              .        .        .        .        (0.007)  (0.009)  .
ρ             1.054    1.002    1.035    1.042    0.946    1.065    1.037
              (0.053)  (0.007)  (0.034)  (0.036)  (0.028)  (0.093)  (0.022)
μ*            .        .        .        .        0.004    0004     .
              .        .        .        .        (0.002)  (0.002)  .
ρ*            0.983    0.996    0.984    0.981    0.993    0.996    0.978
              (0.007)  (0.020)  (0.008)  (0.007)  (0.003)  (0.003)  (0.006)
γ             9.049    14.011   10.466   11.736   5.120    10.480   16.436
              (0.730)  (1.157)  (1.792)  (1.673)  (0.420)  (0.835)  (1.582)
              [0.032]  [0.007]  [0.025]  [0.021]  [0.028]  [0.018]  [0.013]
c             .        -0.140   -0.017   -0.169   -0.456   .        -0.215
              .        (0.038)  (0.150)  (0.143)  (0.040)  .        (0.120)
Skew          0.344    0.078    0.030    0.050    0.542    -0.694   -0.015
Kurt          3.737    0.210    4.053    3.663    4.229    3.905    3.706
pLM(1-6)      0.139    0.136    0.444    0.234    0.236    0.242    0.453
pLM(1-12)     0.390    0.064    0.593    0.396    0.277    0.291    0.534
pNLES_max     0.185    0.873    0.767    0.753    0.205    0.163    0.470
pNLLS_max     0.114    0.149    0.027    0.389    0.243    0.306    0.072
SSR           0.173    0.034    0.315    0.321    0.277    0.230    0.406
pLMc          0.326    0.797    0.659    0.692    0.091    0.153    0.574

 

Heteroscedasticity-robust standard errors are given underneath the parameter estimates. The values
in square brackets are the computed marginal significance levels. The rows corresponding to
pLM(1-6) and pLM(1-12) are the p-values from Lagrange multiplier test statistics for up
to 6th and 12th order serial correlation in the residuals respectively, constructed as in Eitrheim and
Teräsvirta (1996). pNLES_max is the p-value for the maximal Lagrange multiplier test statistic for no
remaining ESTAR nonlinearity with delay in the range from 2 to 12 (Eitrheim and Teräsvirta, 1996).
pNLLS_max is the p-value corresponding to no remaining LSTAR nonlinearity with delay in the range
1 to 12 (Eitrheim and Teräsvirta, 1996). SSR is the sum of squared residuals of the regression. pLMc
is the p-value for the Lagrange multiplier test statistic for parameter constancy in the estimated ESTAR
model (Eitrheim and Teräsvirta, 1996).
