SUPERPOSITIONS OF ORNSTEIN-UHLENBECK TYPE PROCESSES: INTERMITTENCY AND APPLICATIONS TO FINANCE

By

Irena Tesnjak

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Statistics - Doctor of Philosophy

2017

ABSTRACT

SUPERPOSITIONS OF ORNSTEIN-UHLENBECK TYPE PROCESSES: INTERMITTENCY AND APPLICATIONS TO FINANCE

By

Irena Tesnjak

Ornstein-Uhlenbeck (OU) type processes driven by Lévy noise are useful in modeling the activity time in the fractal activity time geometric Brownian motion (FATGBM) model for a risky asset. Discrete superpositions of these processes can be constructed to incorporate non-Gaussian marginal distributions and long or short range dependence. While the partial sums of finite superpositions of OU type processes obey the central limit theorem, we show that the partial sums of a large class of infinite long range dependent superpositions are intermittent. We discuss the property of intermittency and the behavior of the cumulants for the long range dependent superpositions of OU type processes. In addition, we show an application of finite superpositions in modeling financial time series and the superiority of the model at hand compared to the Black-Scholes model when modeling log-returns.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
Introduction
Chapter 1  Overview of the Ornstein-Uhlenbeck (OU) type processes and their superpositions
  1.1 Preliminaries
  1.2 Lévy processes
  1.3 Infinitely divisible distributions
    1.3.1 Infinite divisibility and the Lévy-Khintchine representation
    1.3.2 The correspondence between infinitely divisible distributions and Lévy processes in law
    1.3.3 Self-decomposable distributions
  1.4 Ornstein-Uhlenbeck type processes
    1.4.1 Stochastic integral with respect to the Lévy process
    1.4.2 Existence and uniqueness of the OU type process
    1.4.3 Properties of the OU type processes
      1.4.3.1 The stationary OU type process
      1.4.3.2 The stationary OU type processes and self-decomposable distributions
      1.4.3.3 Covariance of the stationary OU type process
  1.5 Superpositions of the OU type processes
    1.5.1 Existence of supOU processes
    1.5.2 The covariance structure of supOU processes
      1.5.2.1 Finite supOU processes
      1.5.2.2 Infinite supOU processes
    1.5.3 Examples of OU type superpositions
      1.5.3.1 Gamma supOU process
      1.5.3.2 Inverse Gaussian supOU process
      1.5.3.3 Variance gamma supOU process
      1.5.3.4 Normal inverse Gaussian supOU process
      1.5.3.5 Tempered stable supOU process
Chapter 2  Intermittency of supOU processes
  2.1 Intermittency
  2.2 Limiting properties of partial sums of finite supOU processes
  2.3 Limiting properties of partial sums of infinite supOU processes
Chapter 3  Applications in finance
  3.1 The Black-Scholes Model: Geometric Brownian Motion (GBM)
  3.2 Empirical evidence against the Black-Scholes model
    3.2.1 The non-Gaussian character of the returns
      3.2.1.1 Skewness (asymmetry) and excess kurtosis (fat tails)
    3.2.2 The dependence structure of the log-returns: regular, absolute and squared returns
    3.2.3 Stochastic volatility and clustering
  3.3 Fractal activity time geometric Brownian motion model with supOU as activity time
    3.3.1 Fractal activity time geometric Brownian motion process
    3.3.2 Properties of log-returns in the FATGBM model
      3.3.2.1 Moments, skewness and kurtosis
      3.3.2.2 The dependence structure in the FATGBM model
      3.3.2.3 Stochastic volatility - conditional heteroscedasticity
    3.3.3 Distribution theory
      3.3.3.1 Gamma distribution of the unit increments of the activity time process
  3.4 Data calibration
    3.4.1 VG parameter estimation using method of moments
    3.4.2 VG parameter fitting - results
    3.4.3 Conclusion
APPENDICES
  Appendix A  Mixing coefficients and some important results
  Appendix B  Financial prerequisites
BIBLIOGRAPHY

LIST OF TABLES

Table 3.1: Values for sample skewness, kurtosis and the resulting decisions of the normality test.
Table 3.2: Values for sample skewness, kurtosis, mean and standard deviation.

LIST OF FIGURES

Figure 3.1: Histogram of the S&P 500 returns against the normal curve fit.
Figure 3.2: Empirical autocorrelation function of regular, squared and absolute log-returns for the S&P 500 index.
Figure 3.3: Autocorrelation function of regular, squared and absolute log-returns for the simulated geometric Brownian motion, calibrated on the S&P 500 index.
Figure 3.4: S&P 500 real returns against the simulated GBM returns.
Figure 3.5: The historical volatility for the S&P 500 index returns (daily readings).
Figure 3.6: Fitting of VG distribution parameters to the S&P 500 index returns. The plot shows the histogram of the empirical returns along with three differently calibrated VG densities: the VG MOM method (example 2), the VG MLE method (example 3) and the normal curve.
Figure 3.7: Fitting VG, NIG and normal distribution parameters to the S&P 500 index returns. The plot shows the histogram of the empirical returns along with three different densities.
Figure 3.8: Tail probabilities of the fitted VG (MOM, MLE), NIG and normal distributions calibrated on the S&P 500 returns. The plot shows how the three distributions fit the empirical distribution.
Figure 3.9: Fitting of VG distribution parameters to the S&P 500 index returns. The plot shows three different examples explaining different methods of estimation.
Figure 3.10: Fitting the VG distribution to the S&P 500 index returns. The plot shows outputs of the MLE approach.

Introduction

Fischer Black, Myron Scholes and Robert C. Merton developed the geometric Brownian motion (GBM) model for risky asset evolution and provided the finance community with an explicit method for pricing a European option, see [12, 56]. Although the classical Black-Scholes GBM model is undoubtedly the most renowned and widely accepted model among both academics and practitioners worldwide, the fit of the model has been questioned continuously over the last decades. A large body of literature has documented the departure of the model from real financial data; see for instance Heyde and Liu (2001) [39] and Heyde (2009) [38], and also Cont (2001) and Granger (2005) [16, 35], who thoroughly explored the characteristics of the empirical log-returns of financial time series. The 'stylized features' of the log-returns include heavier-tailed and higher-peaked distributions of the log-returns, uncorrelated and stationary returns, and dependence of squared and absolute returns (both short and long range), all of which contradict the GBM model. A number of models that incorporate non-Gaussian distributions and/or dependence in log-returns have been proposed (see [52], [51], [37], [36], [10], and many others), including the class of stochastic volatility models of Barndorff-Nielsen and Shephard, see [8]. The focus here is motivated by fractal activity time GBM models (FATGBM) that, along with other stochastic volatility models, address the stylized features. FATGBMs were first introduced in Heyde (1999) [37] and further developed for the variance gamma, normal inverse Gaussian and Student distributions in [24-26, 36, 48, 49]. Heyde's FATGBM model belongs to the class of subordinator type models, where a random time change (the activity time process), with desirable characteristics, is used instead of the regular deterministic time to account for the stylized features of the log-returns. Several versions of Heyde's model have the capacity to capture the stylized features of the financial data if a suitable activity time is chosen.
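To make the time-change idea concrete, the following is a minimal simulation sketch, not the calibrated model of Chapter 3: it assumes i.i.d. gamma activity time increments with unit mean (the thesis instead builds dependent increments from supOU processes) and shows that evaluating a Brownian motion at the random activity time already produces heavier-tailed (leptokurtic) returns than the GBM benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 10_000, 0.01          # number of daily returns, return scale

# Activity time: positive increments with E[dT] = 1 per day. Here they are
# i.i.d. Gamma(shape=b, scale=1/b); the thesis uses (sup)OU-based increments.
b = 2.0
dT = rng.gamma(shape=b, scale=1.0 / b, size=n)

r_fatgbm = sigma * np.sqrt(dT) * rng.standard_normal(n)  # B evaluated at T_t
r_gbm = sigma * rng.standard_normal(n)                   # ordinary GBM returns

def excess_kurtosis(x):
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2)**2 - 3.0

print("excess kurtosis, GBM   :", round(excess_kurtosis(r_gbm), 3))     # ~ 0
print("excess kurtosis, FATGBM:", round(excess_kurtosis(r_fatgbm), 3))  # > 0
```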
As suggested by Heyde himself [37], if the activity time process $T_t$ is chosen such that its increments exhibit dependence and have heavier tails than Gaussian, then these features are inherited by the resulting model. Furthermore, there is strong empirical evidence (Heyde and Liu (2001) [39]) that the activity time $T_t$ should be asymptotically self-similar. In this thesis we focus on FATGBM models with a specific time change, namely, models that use superpositions of Ornstein-Uhlenbeck type processes (supOU) to construct the increments of the activity time, as elaborated in [48, 49]. The finite and infinite supOU processes were first introduced by Barndorff-Nielsen and collaborators, see [3, 4, 8]. The motivation behind this approach lies in the fact that both finite and infinite supOU processes exhibit the properties that an 'ideal' activity time should have, as mentioned in Heyde (1999) [37] and established further in the literature. In addition to allowing for non-Gaussian marginals, supOU processes have a very flexible correlation structure (a summation of exponential terms) that allows for easy modeling of short and long range dependence. Moreover, these models are analytically very tractable (their marginal distributions are known), which is a very useful property because it makes implementation straightforward.

It can be shown that the partial sums of finite supOU processes have a Gaussian limiting distribution, hence they are asymptotically self-similar. In this thesis, the limiting properties of the infinite supOU processes are investigated. We find that the partial sums of the infinite superpositions are intermittent, which in turn implies unusual cumulant behavior that may preclude central limit theorem (CLT) type results. Therefore, any use of the infinite superposition processes would be very difficult from the application standpoint, since testing efficiency of estimators, constructing confidence intervals or performing hypothesis tests would be challenging. Another aspect, from the finance standpoint, is that, as a result of our findings, use of the infinite LRD supOU process as an activity time would not address the challenge of building a FATGBM model that incorporates both long range dependence of the log-returns and asymptotic self-similarity of the activity time.

The thesis is organized as follows. The first chapter includes the preliminary material introducing the superpositions of Ornstein-Uhlenbeck type processes. The second chapter presents new theoretical results. The third chapter focuses on applications of finite superpositions of the OU type processes for building tractable FATGBMs.

Chapter 1

Overview of the Ornstein-Uhlenbeck (OU) type processes and their superpositions

1.1 Preliminaries

This section contains standard terminology, definitions, and theorems that will be used throughout. Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $X : \Omega \to \mathbb{R}$ be an $\mathcal{F}/\mathcal{B}(\mathbb{R})$-measurable real-valued mapping, or simply a random variable. Here $\mathcal{B}(\mathbb{R})$ is the sigma-algebra of Borel sets on $\mathbb{R}$. The probability measure induced by $X$, $F_X(B) = P(X \in B) = P(\{\omega : X(\omega) \in B\})$ for all $B \in \mathcal{B}(\mathbb{R})$, will be referred to as the distribution of $X$, or sometimes the law of $X$. In addition, whenever two random variables $X, Y$ on $\mathbb{R}$, not necessarily defined on the same probability space, have identical distributions, i.e. $F_X = F_Y$, we will write $X \stackrel{D}{=} Y$.
The convolution $\mu$ of two distributions $\mu_1$ and $\mu_2$ on $\mathbb{R}$, denoted by $\mu = \mu_1 * \mu_2$, is the distribution defined by
$$\mu(B) = \int_{\mathbb{R}\times\mathbb{R}} 1_B(x+y)\,\mu_1(dx)\,\mu_2(dy), \quad B \in \mathcal{B}(\mathbb{R}).$$
Note that if $X_1 \sim \mu_1$ and $X_2 \sim \mu_2$ are two independent random variables on $\mathbb{R}$, then their sum $X_1 + X_2$ has distribution $\mu_1 * \mu_2$.

The characteristic function of a random variable $X$ with distribution $\mu$, usually denoted by $\varphi_X(z)$, or sometimes $\hat{\mu}(z)$, is given by
$$\varphi_X(z) = \hat{\mu}(z) = E e^{izX}, \quad z \in \mathbb{R}.$$
The cumulant (generating) function of a random variable $X$ will be denoted by
$$\kappa_X(z) = \log E e^{izX}, \quad z \in \mathbb{R}.$$
Assuming it exists, one can define the $m$-th cumulant of $X$, denoted $\kappa_X^{(m)}$, $m \in \mathbb{N}$, via the $m$-th derivative of the cumulant function $\kappa_X$ evaluated at zero:
$$\kappa_X^{(m)} = (-i)^m \frac{d^m}{dz^m}\, \kappa_X(z) \Big|_{z=0}.$$
In this context, we say that the cumulant function $\kappa_X(\cdot)$ is analytic around the origin if and only if, at any point in a neighborhood of the origin, its Taylor series converges to the value of the function, i.e. if for any $z$ around the origin,
$$\kappa_X(z) = \sum_{m=1}^{\infty} \frac{(iz)^m}{m!}\, \kappa_X^{(m)}.$$

In this paragraph, we briefly explain regularly and slowly varying functions, which are an essential tool when dealing with heavy tails, long range dependence, and domains of attraction. Intuitively, regularly varying functions behave, in the long run, like power functions; for regularly varying distributions this means that their tails are approximately power-law, like the Pareto distribution for example.

Definition 1.1.1. A measurable function $U : \mathbb{R}_+ \to \mathbb{R}_+$ is said to be regularly varying at $\infty$ with index $\rho \in \mathbb{R}$ if for any scale $x > 0$,
$$\lim_{t\to\infty} \frac{U(tx)}{U(t)} = x^{\rho},$$
where the index $\rho$ is called the exponent of variation. Note that regularly varying functions are asymptotically scale invariant.

Definition 1.1.2. A measurable function $L : \mathbb{R}_+ \to \mathbb{R}_+$ is slowly varying at $\infty$ if for any scale $x > 0$,
$$\lim_{t\to\infty} \frac{L(tx)}{L(t)} = 1,$$
i.e. it is a regularly varying function with $\rho = 0$.

Notice that if $U \in RV_\rho$ then $U(x)/x^{\rho} \in RV_0$, which in turn means that any $\rho$-regularly varying function can be represented as $U(x) = x^{\rho} L(x)$. Simple examples of slowly varying functions are $\log x$ and $\log(\log x)$, while power functions are regularly varying. On the other hand, the exponential function $e^x$ is an example of a function that is not regularly varying.
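As a quick numerical illustration of Definition 1.1.1 (a sketch with an arbitrarily chosen test function), the ratio $U(tx)/U(t)$ for $U(x) = x^{\rho}\log x$ approaches $x^{\rho}$ as $t$ grows, while the analogous ratio for $e^x$ diverges:

```python
import numpy as np

# U(x) = x^rho * log(x) is regularly varying with index rho:
# U(tx)/U(t) -> x^rho as t -> infinity, for every fixed x > 0.
rho = 1.5
U = lambda x: x**rho * np.log(x)

x = 2.0
for t in (1e2, 1e4, 1e8):
    print(f"t = {t:.0e}:  U(tx)/U(t) = {U(t * x) / U(t):.6f}")
print(f"limit x**rho       = {x**rho:.6f}")
# For e^x the same ratio equals e^{t(x-1)}, which blows up with t,
# so the exponential is not regularly varying.
```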
In the next paragraphs we define notions related to stochastic processes and families of random variables. From here on, a probability space $(\Omega, \mathcal{F}, P)$ is fixed and all random variables are defined on it. We start with the family of finite-dimensional distributions and continue with stochastic processes.

Definition 1.1.3. For any fixed integer $n$ and any choice of $0 \le t_1 < t_2 < \dots < t_n$, the joint distribution $P[X(t_1) \in B_1, X(t_2) \in B_2, \dots, X(t_n) \in B_n]$ determines a probability measure on $\mathcal{B}(\mathbb{R}^n)$. The family of probability measures over all possible choices of $n$ and $\{t_i\}_{i=1}^n$ is called the system of finite-dimensional distributions of a process $\{X_t\}$.

Definition 1.1.4. Let $X_j$ be a random variable with values in $\mathbb{R}^{d_j}$, $j = 1, \dots, n$. One says that the finite family of random variables $\{X_1, \dots, X_n\}$ is independent if for every $B_j \in \mathcal{B}(\mathbb{R}^{d_j})$, $j = 1, \dots, n$, the joint probability can be decomposed as
$$P(X_1 \in B_1, \dots, X_n \in B_n) = \prod_{i=1}^{n} P(X_i \in B_i).$$
Note that an infinite family of random variables is independent if every finite subfamily of it is independent.

Definition 1.1.5 (Modification). A stochastic process $\{Y_t\}$ is said to be a modification of a stochastic process $\{X_t\}$ if they are almost surely equal at each time point $t$, i.e. if for all $t \in [0, \infty)$, $P(X_t = Y_t) = 1$.

Definition 1.1.6. Two stochastic processes $\{X_t\}$ and $\{Y_t\}$, not necessarily defined on the same probability space, are said to be equal in law (in distribution), denoted $\{X_t\} \stackrel{D}{=} \{Y_t\}$, if the systems of their finite-dimensional distributions are equal. Note that, from here on, whenever we talk about equality in distribution of two stochastic processes we mean equality of their finite-dimensional distributions.

Definition 1.1.7 (Stationarity).
(i) The stochastic process $\{X_t\}$ is said to be stationary in the narrow sense if the system of its finite-dimensional distributions is shift invariant, i.e. if for all $h \ge 0$, $\{X_t\}_{t\ge0} \stackrel{D}{=} \{X_{t+h}\}_{t\ge0}$.
(ii) The stochastic process $\{X_t\}$ is said to be stationary in the wide sense (weakly stationary), also called second order stationary, if the one- and two-dimensional distributions of the process $\{X_t\}$ are shift invariant.

An important concept involving stochastic processes, either continuous or discrete, is the memory of the process as $t \to \infty$. Intuitively, depending on the behavior of the process, its short or long term memory will allow one to extrapolate across time scales and deduce long time behavior from short time behavior. The most obvious way to measure the length of memory in a stochastic process with second-order moments is to look at the rate at which its correlations decay with the time lag. Hence, consider a weakly (second-order) stationary time series $\{X_n\}_{n\in\mathbb{Z}}$ with finite second moment. Let us first define measures of dependence in time series: let $R_X(k)$ and $\rho_X(k)$ be the autocovariance function (ACVF) and autocorrelation function (ACF) of the process $\{X_n\}$. For any time points $t, k \in \mathbb{Z}$, we define
$$R_X(k) = \mathrm{Cov}(X_0, X_k) = \mathrm{Cov}(X_t, X_{t+k}), \tag{1.1}$$
and
$$\rho_X(k) = \frac{\mathrm{Cov}(X_0, X_k)}{\mathrm{Var}(X_0)} = \frac{R_X(k)}{R_X(0)}. \tag{1.2}$$
Now we can define the notions of short and long range dependence.

Definition 1.1.8. A second-order stationary time series $\{X_n\}_{n\in\mathbb{Z}}$ with finite second moment is called short range dependent (SRD, in short) if
$$\sum_{k=-\infty}^{+\infty} |R_X(k)| < \infty, \tag{1.3}$$
and long range dependent (LRD, in short) if
$$\sum_{k=-\infty}^{+\infty} |R_X(k)| = \infty. \tag{1.4}$$

Note that short range dependence is also referred to as short memory or weak dependence, and similarly, long range dependence is often referred to as long memory or strong dependence. In the case of a continuous time stochastic process, the above series are replaced by integrals over the time axis, $t \in \mathbb{R}$.
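The dichotomy in Definition 1.1.8 can be illustrated numerically. The sketch below uses two hypothetical ACFs, an exponentially decaying one (as for an AR(1) series) and a power-law one, and compares the growth of the partial sums of $|R_X(k)|$:

```python
import numpy as np

ks = np.arange(100_001)

# SRD: exponentially decaying ACF (e.g. an AR(1) with coefficient 0.9).
acf_srd = 0.9 ** ks
# LRD: power-law decaying ACF rho(k) ~ k^{-d} with 0 < d < 1.
acf_lrd = (1.0 + ks) ** -0.4

for n in (10, 1_000, 100_000):
    print(f"n = {n:>7}:  SRD partial sum = {acf_srd[:n+1].sum():7.2f},  "
          f"LRD partial sum = {acf_lrd[:n+1].sum():9.2f}")
# The SRD sum stabilizes near 1/(1 - 0.9) = 10, while the LRD sum keeps
# growing like n^{0.6}, i.e. it diverges as n -> infinity.
```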
1.2 Lévy processes

Lévy processes are named after the famous French mathematician Paul Lévy, whose contribution, although he was not alone in it, played an important role in understanding and characterizing processes with stationary and independent increments.

Definition 1.2.1 (Lévy process). Let $\{Z_t\}_{t\ge0}$ be a stochastic process defined on a probability space $(\Omega, \mathcal{F}, P)$. We say that a real-valued stochastic process $Z_t$ is a Lévy process in law if the following conditions are satisfied:
(i) It starts at zero almost surely, i.e. $Z_0 = 0$ a.s.
(ii) $Z_t$ has independent increments, i.e. for any $n \ge 1$ and any choice of $0 \le t_0 < \dots < t_n$, the family of random variables $\{Z(t_k) - Z(t_{k-1})\}_{k=1}^n$ is independent.
(iii) $Z_t$ has stationary increments (temporal homogeneity), i.e. for any $s \in [0, \infty)$, the two processes are equal in law, $\{Z_{t+s} - Z_s\}_{t\ge0} \stackrel{D}{=} \{Z_t\}_{t\ge0}$. In other words, the distribution of $Z_{t+h} - Z_t$ does not depend on the specific time point $t$, only on the time increment $h$.
(iv) $Z_t$ is stochastically continuous, or in other words, for any $t \in [0, \infty)$ and $\varepsilon > 0$, $\lim_{s\to t} P(|Z_s - Z_t| > \varepsilon) = 0$.

Statements (i)-(iv) above define a Lévy process in law (in distribution), and the Kolmogorov extension theorem guarantees the existence of a stochastic process with a given system of finite-dimensional distributions. Regarding the sample path properties of a Lévy process, it can be shown that (i)-(iv) imply the existence of a modification that is right-continuous with left limits, or continue à droite, limite à gauche in French, denoted càdlàg, see Sato (1999) [63]. Hence, for every Lévy process in law there exists $\Omega_0 \in \mathcal{F}$ with $P(\Omega_0) = 1$ such that for all $\omega \in \Omega_0$, $Z_t(\omega)$ has càdlàg sample paths. We will refer to this càdlàg modification as a Lévy process throughout. In addition to the right-continuity of their sample paths, one can also show that on each finite time interval they have at most countably many random jump discontinuities, occurring at random times.

Even though we defined a Lévy process only on the positive real axis, $\{Z_t\}_{t\ge0}$, it can be extended to the whole set of real numbers, i.e. $\{Z_t\}_{t\in\mathbb{R}}$, provided that on the negative time axis $t < 0$ the process is defined as follows.

Definition 1.2.2. Let $\{Z_t\}_{t\ge0}$ be a Lévy process with $E[e^{izZ_t}] = e^{t\kappa_0(z)}$. Denote by $\{\tilde{Z}_t\}_{t\ge0}$ an independent copy of $\{Z_t\}_{t\ge0}$. Then the Lévy process on the negative time axis $t < 0$ is defined as $Z_t = -\tilde{Z}_{(-t)-}$ for any $t < 0$.

Below we list some important properties of Lévy processes.
(i) The stationary and independent increments property implies that a Lévy process is a Markov process, more precisely a temporally homogeneous Markov process. Combined with the almost sure right continuity of sample paths, one may show that Lévy processes are strong Markov processes as well.
(ii) They form a special subclass of semimartingales.
(iii) They are 'analogs' of random walks in continuous time.

The next section addresses the specific type of marginal distributions that Lévy processes can have, by introducing the class of infinitely divisible distributions.

1.3 Infinitely divisible distributions

1.3.1 Infinite divisibility and the Lévy-Khintchine representation

In this section we define infinitely divisible distributions and give their representation and main properties. We also briefly explain their correspondence to Lévy processes in distribution. As standard notation we write $\mu^{n*} = \mu * \dots * \mu$ for the $n$-fold convolution of a probability measure $\mu$ with itself.

Definition 1.3.1. A probability measure $\mu$ on $\mathbb{R}$ is said to be infinitely divisible (ID) if, for any positive integer $n$, there exists a probability measure $\mu_n$ on $\mathbb{R}$ such that $\mu = \mu_n^{n*}$.

Since the convolution of measures corresponds to the product of characteristic functions, or to the sum of independent random variables, we can equivalently say that a probability measure $\mu$ is infinitely divisible if and only if, for each $n$, the $n$-th root of the characteristic function $\hat{\mu}$ is also the characteristic function of some probability measure: $\hat{\mu}(z) = (\hat{\mu}_n(z))^n$. Equivalently, one can also say that a random variable $X$ is infinitely divisible (ID) iff for any $n \in \mathbb{N}$ there exists an i.i.d. sequence of random variables $Y_1^{(n)}, \dots, Y_n^{(n)}$ such that $X \stackrel{D}{=} Y_1^{(n)} + \dots + Y_n^{(n)}$.
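Definition 1.3.1 can be checked by simulation for a concrete law. The sketch below (with arbitrary illustrative parameters) uses the Gamma distribution, whose $n$-th convolution root is again a Gamma law, as discussed in Example 1.3.1 (ii) below:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, beta, n, size = 2.0, 3.0, 10, 50_000   # rate alpha, shape beta

# Direct samples of Gamma(alpha, beta); numpy parameterizes by shape, scale.
x = rng.gamma(shape=beta, scale=1.0 / alpha, size=size)

# Sum of n i.i.d. Gamma(alpha, beta/n) "n-th root" components.
y = rng.gamma(shape=beta / n, scale=1.0 / alpha, size=(size, n)).sum(axis=1)

# The two samples should be statistically indistinguishable.
print(stats.ks_2samp(x, y))   # large p-value: same distribution
```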
As it turns out, all infinitely divisible distributions have a specific form of characteristic function, given by the very well known Lévy-Khintchine representation (or Lévy-Khintchine formula) below.

Theorem 1.3.1 (Lévy-Khintchine). If $\mu$ is an infinitely divisible distribution on $\mathbb{R}$, then there exists a unique triplet $(A, \nu, \gamma)$ (referred to as the generating triplet) such that
$$\hat{\mu}(z) = e^{\kappa(z)} = \exp\Big\{ -\frac{1}{2} A z^2 + i\gamma z + \int_{\mathbb{R}} \big( e^{izx} - 1 - izx\, 1_{[-1,1]}(x) \big)\, \nu(dx) \Big\}, \quad z \in \mathbb{R}, \tag{1.5}$$
where $\gamma \in \mathbb{R}$, $A \ge 0$ and $\nu$ is the Lévy measure, that is, a $\sigma$-finite Borel measure on $\mathbb{R}$ such that
$$\nu(\{0\}) = 0, \qquad \int_{\mathbb{R}} (1 \wedge x^2)\, \nu(dx) < \infty. \tag{1.6}$$
Conversely, given a generating triplet $(A, \nu, \gamma)$, there exists an infinitely divisible distribution $\mu$ on $\mathbb{R}$ whose characteristic function is given by (1.5).

The exponent $\kappa(z) = \log \hat{\mu}(z)$ in the above formula is often referred to as the Lévy exponent. From the above theorem one can conclude that the generating triplet $(A, \nu, \gamma)$ uniquely determines the infinitely divisible random variable $X$ with law $\mu$. The parameters $A$, $\nu$, $\gamma$ are called, respectively, the Gaussian term, the Lévy measure of $\mu$, and the drift. Below we provide some examples of well known distributions that belong to the class of infinitely divisible distributions, and some that do not. Note that, in order to show that a specific distribution belongs to the class of infinitely divisible distributions, one can either show that its characteristic function is of the Lévy-Khintchine form (1.5), or directly use Definition 1.3.1 and exhibit a re-parameterization of its characteristic function that yields the $n$-th root.

Example 1.3.1.
(i) Let $X$ be a Gaussian random variable, $X \sim N(\gamma, A)$. It is infinitely divisible since its characteristic function is of the form
$$\varphi_X(z) = \Big( e^{i\frac{\gamma}{n} z - \frac{1}{2}\frac{A}{n} z^2} \Big)^n = \big( \varphi_{X_n}(z) \big)^n,$$
where the $n$-th root is $X_n \sim N(\gamma/n, A/n)$. On the other hand, one can also see that $\varphi_X(z)$ is of the Lévy-Khintchine form (1.5) with generating triplet $(A, 0, \gamma)$: the variance of the normal variable is the Gaussian term $A$, the mean is the drift term $\gamma$, while the Lévy measure is zero, $\nu = 0$. Note that the Lévy measure is zero when, and only when, the distribution is Gaussian.

(ii) Let the Gamma distribution be defined by the density
$$\Gamma(\alpha, \beta) \sim \frac{\alpha^{\beta}}{\Gamma(\beta)}\, x^{\beta-1} e^{-\alpha x}\, 1_{[0,\infty)}(x), \quad \alpha > 0,\ \beta > 0.$$
Similarly, it belongs to the class of infinitely divisible distributions. While the infinite divisibility is easily seen by exhibiting the $n$-th root of its characteristic function,
$$\varphi_{\Gamma(\alpha,\beta)}(z) = \Big(1 - \frac{iz}{\alpha}\Big)^{-\beta} = \Big( \big(1 - \tfrac{iz}{\alpha}\big)^{-\beta/n} \Big)^n = \big( \varphi_{\Gamma(\alpha,\beta/n)}(z) \big)^n,$$
the Lévy-Khintchine form of its characteristic function is not as straightforward to find. It can be shown that it is given by
$$\varphi_{\Gamma(\alpha,\beta)}(z) = \exp\Big\{ \beta \int_0^{\infty} (e^{izx} - 1)\, (e^{-\alpha x}/x)\, dx \Big\}.$$
From this representation one can see that the generating triplet is of the form $A = 0$, $\gamma = \int_0^1 x\, \nu(dx)$, $\nu(dx) = \beta\, (e^{-\alpha x}/x)\, dx$, and that the Lévy measure has a density with respect to Lebesgue measure. In addition, since the exponential distribution is a special case of the Gamma distribution, it is also infinitely divisible.

(iii) Following a similar procedure, one can see that the Poisson distribution is infinitely divisible as well, i.e.
$$\varphi_{P(\lambda)}(z) = e^{-\lambda(1 - e^{iz})} = \Big( e^{-\frac{\lambda}{n}(1 - e^{iz})} \Big)^n = \big( \varphi_{P(\lambda/n)}(z) \big)^n.$$
After rewriting the above characteristic function, one can see that the Lévy-Khintchine exponent is of the form
$$\kappa_{P(\lambda)}(z) = \lambda \int_{\mathbb{R}} \big\{ e^{izx} - 1 - izx\, 1_{[-1,1]}(x) \big\}\, \delta_1(dx),$$
and hence the generating triplet of the Poisson distribution is given by $\big(0,\ \lambda\delta_1,\ \lambda \int_{-1}^{1} x\, \delta_1(dx)\big)$, where $\delta_1$ denotes the Dirac measure supported on $\{1\}$.

(iv) The compound Poisson distribution is given as $\sum_{i=1}^{N} \xi_i \sim \mu$, where $N$ is a Poisson random variable with parameter $\lambda$ and $\{\xi_i\}_{i\ge1}$ is an i.i.d. sequence of random variables with common law $F$, assumed to be independent of $N$. Its characteristic function is of the form
$$\hat{\mu}(z) = e^{\lambda \int_{\mathbb{R}} (e^{izx} - 1)\, F(dx)} = e^{\lambda(\hat{F}(z) - 1)} = \Big( e^{\frac{\lambda}{n}(\hat{F}(z) - 1)} \Big)^n.$$
Hence it belongs to the class of infinitely divisible distributions, with generating triplet $\big(0,\ \lambda F(dx),\ \lambda \int_{-1}^{1} x\, F(dx)\big)$.

(v) The distribution of a Lévy process $\{Z_t\}_{t\ge0}$ at any time $t$ is an infinitely divisible distribution, as seen below. Trivially, for any $n \ge 1$ and $t \in [0, \infty)$,
$$Z_t = (Z_{t_n} - Z_{t_{n-1}}) + \dots + (Z_{t_1} - Z_{t_0}),$$
where $t_k = kt/n$, $k = 0, 1, \dots, n$, and $t_n \equiv t$. Stationarity and independence of the Lévy increments yield that $\{Z_{t_k} - Z_{t_{k-1}}\}_{k=1}^n$ is a sequence of i.i.d. random variables; hence, for a fixed time $t$, the random variable $Z_t$ is infinitely divisible.

(vi) In general, it turns out that any distribution with bounded support (except $\delta$-distributions) is not infinitely divisible. Hence the uniform and binomial distributions are examples of distributions that are not infinitely divisible.

Remark 1.3.1. Some other continuous infinitely divisible distributions are the Cauchy distribution and the one-sided strictly stable distributions of index 1/2, while other discrete infinitely divisible distributions are the geometric, negative binomial, etc. In the examples given above one can easily find the $n$-th root of the characteristic function, but there are many other distributions whose infinite divisibility is much harder to prove, for example Student's t-distribution, Pareto, F-distribution, Gumbel, Weibull, log-normal, and logistic distributions, among others, see [63].

For the sake of completeness, some interesting properties of infinitely divisible distributions are given in the form of lemmas below; see Sato (1999) [63][Chapter 2] for more details.

Lemma 1.3.2.
(i) If $\mu$ is an infinitely divisible distribution then its characteristic function $\hat{\mu}(z)$ has no zeros, i.e. $\hat{\mu}(z) \neq 0$ for any $z \in \mathbb{R}$.
(ii) If $\mu_1$ and $\mu_2$ are infinitely divisible then their convolution $\mu_1 * \mu_2$ is infinitely divisible.
(iii) If $\{\mu_k\}_{k\ge0}$ is a sequence of infinitely divisible distributions and $\mu_k \to \mu$ as $k \to \infty$, then the limiting distribution $\mu$ is infinitely divisible.
(iv) If $\mu$ is infinitely divisible then, for any $t \in [0, \infty)$, $\mu^t$ is both well-defined and infinitely divisible. Note $\mu^0 = \delta_0$.
(v) Every infinitely divisible distribution is the weak limit of a sequence of compound Poisson distributions.

1.3.2 The correspondence between infinitely divisible distributions and Lévy processes in law

Let us now present the details of the well known fact that there is a one-to-one correspondence between the class of infinitely divisible distributions and Lévy processes in law. We will explain this relationship in a very simple manner and then state the theorem. From Example 1.3.1 we already know that the Lévy process $Z_t$ is infinitely divisible for any fixed $t \ge 0$, which means that for any integer $n$ we can decompose $Z_t$ as a sum of $n$ i.i.d. reparametrized distributions.
Hence, the more interesting question is whether we can decompose it as a sum of $t$ i.i.d. $Z_1$-distributions, where $t$ is a real number, i.e. if and when the following is true for any Lévy process $\{Z_t\}$ and any time $t \in \mathbb{R}_+$, $t \ge 0$:
$$\varphi_{Z_t}(z) = E[e^{izZ_t}] = e^{t\kappa(z)} = (\varphi_{Z_1}(z))^t, \qquad \kappa(z) = \log E[e^{izZ_1}].$$
From the relationship $Z_t = (Z_{t_n} - Z_{t_{n-1}}) + \dots + (Z_{t_1} - Z_{t_0})$, by taking specific integer valued times ($t = m$, $n = m$ and $t = m$, $n = n$), one easily gets
$$(\varphi_{Z_1})^m = \varphi_{Z_m} = \big( \varphi_{Z_{m/n}}(z) \big)^n.$$
Hence, for any $n, m \in \mathbb{N}$, and therefore for any rational number, the following equation holds:
$$\varphi_{Z_{m/n}} = (\varphi_{Z_1})^{m/n}.$$
If we simply choose a sequence of rationals $\{r_n\}$ such that $r_n \to t$, then, together with the fact that the Lévy process is stochastically continuous, we get that $Z_{r_n} \to Z_t$ in probability, which implies convergence in distribution. This in turn implies that the law of $Z_t$ is $\mu^t$, where $\mu$ is the law of $Z_1$, and $\varphi_{Z_t} = e^{t\kappa(z)}$, for any real time $t \ge 0$ and not only for integer or rational times.

From the above discussion it is clear that each Lévy process can be associated with an infinitely divisible distribution; what is not straightforward is whether, given an infinitely divisible distribution $\mu$, one can construct a Lévy process $Z_t$ such that $Z_1 \stackrel{D}{=} \mu$. This question is answered in the theorem below, see Sato (1999) [63] for more details. Intuitively, starting with an infinitely divisible distribution one needs to construct a corresponding Lévy process in law, i.e. one needs to propose an appropriate system of finite-dimensional distributions that is well defined (satisfies the Kolmogorov consistency theorem), satisfies $Z_1 \stackrel{D}{=} \mu$, and has the properties of a Lévy process. If one shows that the proposed system of f.d.d. is well defined, then one only needs to show that the system defines a Lévy process in law. See the theorem below.

Theorem 1.3.3. Let $\{Z_t\}_{t\ge0}$ be a real-valued Lévy process in law and denote $Z_1 \stackrel{D}{=} \mu$.
(i) Then for every $t \in [0, \infty)$, $Z_t$ is infinitely divisible and $Z_t \stackrel{D}{=} \mu^t$.
(ii) Conversely, if $\mu$ is an infinitely divisible distribution on $\mathbb{R}$, then there is a real-valued Lévy process in law, denoted $\{Z_t\}_{t\ge0}$, such that $Z_1 \stackrel{D}{=} \mu$.
(iii) If $\{Z_t\}$ and $\{Z'_t\}$ are two real-valued Lévy processes in law such that $Z_1 \stackrel{D}{=} Z'_1$, then the processes are identical in law, i.e. $\{Z_t\}_{t\ge0} \stackrel{D}{=} \{Z'_t\}_{t\ge0}$.

Note, $\mu$ is said to be the infinitely divisible distribution corresponding to the Lévy process in law $\{Z_t\}$, where $Z_1 \stackrel{D}{=} \mu$; conversely, the process $\{Z_t\}$ is said to be the Lévy process in law corresponding to the infinitely divisible distribution $\mu$. The generating triplet of the infinitely divisible distribution $\mu$, denoted $(A, \nu, \gamma)$, is also called the Lévy generating triplet.

Remark 1.3.2. Even though the stationary and independent increments property of the Lévy process $\{Z_t\}$ is enough to claim that its marginal distributions are infinitely divisible, it is not enough to prove property (i); one still needs stochastic continuity.

Below we give a couple of well known examples of Lévy processes. We assume that $(\Omega, \mathcal{F}, P)$ is a probability space and that each given process is defined on it.

Example 1.3.2.
(i) A standard Brownian motion is the classical example of a Lévy process with normal marginal distributions, i.e. a real-valued process with stationary and independent increments that starts at zero a.s., has a modification with continuous sample paths, and at each time $t > 0$ the process $B_t$ has a normal distribution with variance $t$, i.e. $B_t \sim N(0, t)$. Note also that a linear Brownian motion, or a BM with drift, $\tilde{B}_t \equiv B_t + \gamma t$, is a Lévy process too.
(ii) A continuous time, non-negative integer-valued process $\{N_t\}_{t\ge0}$ is said to be a Poisson process with intensity $\lambda > 0$ if it is a Lévy process and, in addition, at each time $t > 0$, $N_t$ has a Poisson distribution with parameter $\lambda t$. Since it is a Lévy process, its characteristic function is of the Lévy-Khintchine form, $E[e^{izN_t}] = e^{-\lambda t(1 - e^{iz})}$.

Remark 1.3.3. Note, another way of defining the Poisson process is as a counting process; in fact, it is the only process that is both a Lévy process and a counting process at the same time. The Poisson process counts the arrivals of randomly occurring events, i.e. events with exponential inter-arrival times, in the time interval $[0, t]$:
$$N_t = \sum_{n\ge1} 1\{t \ge T_n\}, \qquad T_n = \sum_{i=1}^{n} \tau_i,$$
where $T_n$ is a random walk with exponentially distributed waiting times $\tau_i \sim \exp(\lambda)$. Since it is a Lévy process, the sample paths of the Poisson process are a.s. right continuous by the definition, but some other path properties come more intuitively when using the second definition. Hence, from here we may also conclude that the trajectories of a Poisson process are always increasing, piece-wise constant functions with jumps of size 1 only, where jumps occur whenever the random events occur. Note also that a Poisson process itself cannot be used to model asset prices because the jump size is always equal to one, which is far too restrictive, but it can be used as a building block to construct more advanced financial models that exhibit jumps.

(iii) Let $\lambda > 0$ and let $F$ be a distribution on $\mathbb{R}$ such that $F(\{0\}) = 0$. A real-valued, continuous time process $\{Z_t\}_{t\ge0}$ is a compound Poisson process associated with $\lambda$ and $F$ if it is a Lévy process and at each time $t > 0$ it has a compound Poisson distribution,
$$E[e^{izZ_t}] = e^{-\lambda t(1 - \hat{F}(z))}, \quad z \in \mathbb{R}.$$
Another way of defining a compound Poisson process with jump intensity $\lambda$ and jump size distribution $F$ is as a random walk, $Z_t = \sum_{i=1}^{N_t} \xi_i$, where $\{\xi_i\}_{i\ge1}$ is a sequence of independent random variables with law $F$ and $N_t$ is a Poisson process with intensity $\lambda$, independent of the given sequence $\{\xi_i\}_{i\ge1}$. One can see that the jumps arrive randomly according to a Poisson process and the size of the jumps is also random, with a specified probability distribution.
Second, while a compound Poisson with positive jumps does have an a.s. non-decreasing paths, it does not have paths that are almost surely strictly increasing 2 . Remark 1.3.6. Note that the Poisson and the compound Poisson process with strictly positive increments are the most well-known non-negative Levy processes. As mentioned their paths are non-decreasing piece-wise constant functions where jumps happen rather rarely. Consequently, their increments are often exactly equal to zero, even when measured over large time intervals. This feature is fundamentally different from the Gamma process since its increments are strictly positive regardless of how much time has elapsed. This means that 1 Frequently referred to as jump intensity. 2 Levy processes whose paths are almost surely non-decreasing are called subordinators. 20 at every time point this process jumps, hence it will have infinity number of small jumps in any finite time interval. The Levy processes that exhibit this feature are said to be of infinite activity (jump intensity). The path of these processes have rough upward trend with occasional large jumps and intuitively one can conclude that the paths are not continuous anywhere. From the Definition 1.2.1 alone, it is difficult to see, just how rich the class of Levy processes really is. In general, there are many processes that satisfy the definition of the Levy process and on the first encounter they can seem considerably different from one another. For example, firstly, the Brownian motion has an a.s. continuous sample paths whereas the Poisson process is a jump process only. Secondly, the Brownian motion is a.s. nowhere differentiable, i.e. its paths are of unbounded variation over finite time horizons, while the Poisson process is a non-decreasing process and thus has paths of bounded variation over finite time horizons. Still, they both belong to a one large class of processes which makes modeling with Levy processes very flexible and suitable for various situations. Beside their flexibility with the sample path properties which allows for both continuous or jump sample paths, they have also larger class of possible marginal distributions, as mentioned previously, class of infinitely divisible distributions. Therefore, the probability models driven by Levy processes are applicable to many different situations. One of the drawbacks when modeling directly with Levy is their independent increments property, which is not true in many real life situations. Hence, if one wants to model phenomena that has correlation structure present when modeling squared or absolute increments, Levy processes would not be suitable. So, from the modeling and application perspective, because of their correlation structure, the OU and OU type processes, see the section 1.4, allow for more flexibility when fitting the data. These can be also used to define more advanced models, 21 referred to in literature as superpositions of OU type processes, see the section 1.5. They were first introduced by Barndorff-Nielsen and other colaborators, see [3, 4, 8] and the class of superspositions is richer than standard OU type processes. 1.3.3 Self-decomposable distributions In this section we will define a class of self-decomposable (SD) distributions, which, due to the specific form of their Levy measure, is a subclass of infinitely divisible distributions. We will digress here and try to explain some applicational motivation behind the processes with self-decomposable distributions. 
From Definition 1.2.1 alone, it is difficult to see just how rich the class of Lévy processes really is. In general, there are many processes that satisfy the definition of a Lévy process, and at first encounter they can seem considerably different from one another. For example, the Brownian motion has a.s. continuous sample paths whereas the Poisson process is a pure jump process. Secondly, the Brownian motion is a.s. nowhere differentiable, i.e. its paths are of unbounded variation over finite time horizons, while the Poisson process is a non-decreasing process and thus has paths of bounded variation over finite time horizons. Still, they both belong to one large class of processes, which makes modeling with Lévy processes very flexible and suitable for various situations. Besides their flexibility with sample path properties, which allows for both continuous and jump sample paths, they also have a large class of possible marginal distributions, as mentioned previously: the class of infinitely divisible distributions. Therefore, probability models driven by Lévy processes are applicable to many different situations. One of the drawbacks when modeling directly with Lévy processes is their independent increments property, which does not hold in many real life situations. Hence, if one wants to model phenomena with a correlation structure present in the squared or absolute increments, Lévy processes are not suitable. So, from the modeling and application perspective, because of their correlation structure, the OU and OU type processes (see Section 1.4) allow for more flexibility when fitting data. These can also be used to define more advanced models, referred to in the literature as superpositions of OU type processes, see Section 1.5. They were first introduced by Barndorff-Nielsen and other collaborators, see [3, 4, 8], and the class of superpositions is richer than standard OU type processes.

1.3.3 Self-decomposable distributions

In this section we define the class of self-decomposable (SD) distributions, which, due to the specific form of their Lévy measure, is a subclass of infinitely divisible distributions. We digress here to explain some motivation from applications behind processes with self-decomposable distributions. The most standard models for option pricing (Black and Scholes, 1973 [12]) usually assume that continuously compounded returns are normally distributed. The primary motivation behind this assumption is the central limit theorem, which says that if the price returns are realized as a sum of a large number of independent influences, properly scaled, then the returns should be normally distributed. Now, the fact that the distribution of the price returns for time series and option data is not normal has led many scientists to research jump-diffusion models, stochastic volatility models, pure-jump Lévy processes and others. So the question is: are there alternatives to the Gaussian distribution as a limiting law, i.e. can we construct models that are limits of properly scaled partial sums, where the scaling function is not necessarily $\sqrt{n}$? (There is no economic motivation for the scaling function to be $\sqrt{n}$ over any other function of $n$.) The answer is yes: there is a class of distributions with these properties, the so-called laws of class L, which were first defined by Khintchine (1938) and Lévy (1937) as limiting distributions for sums of $n$ independent variables that are centered and scaled by functions of $n$, and this class is identical to the class of self-decomposable distributions. Examples are the normal and stable distributions. Regardless, the self-decomposable laws represent an important generalization of the Gaussian and stable laws, as they describe limit laws with more general scaling constants than only $\frac{1}{\sqrt{n}}$. This flexibility is important in applications when the independent sources of randomness being summed are of different orders of magnitude. Now that we have explained why these models are interesting from a finance/modeling perspective, let us proceed with the definition.

Definition 1.3.2. A random variable $X$ is self-decomposable if, for all $c \in (0, 1)$, its characteristic function can be decomposed (factorized) as
$$\varphi_X(z) = \varphi_{cX}(z)\, \varphi_c(z),$$
for some characteristic function $\varphi_c(z)$, $z \in \mathbb{R}$.

Intuitively, a random variable is self-decomposable if it can be decomposed into a sum of a scaled down version of itself and an independent residual term, i.e. if for every $c \in (0, 1)$ there exists an independent random variable $X_c$ such that $X \stackrel{D}{=} cX + X_c$.

Remark 1.3.7. To summarize, the self-decomposable laws arise naturally as limiting distributions for certain generalizations of the central limit problem. More generally, the class of all infinitely divisible distributions coincides with those distributions that arise as limits of row sums of uniformly asymptotically negligible triangular arrays of random variables.

There is another important characterization of self-decomposable distributions, through a special form of their Lévy measure. In particular, since self-decomposable distributions are a subclass of infinitely divisible distributions, their characteristic function is already of the Lévy-Khintchine form, but it can be shown that their Lévy measure has a density with respect to the Lebesgue measure, see Sato (1999) [63][Corollary 15.11]; i.e. their Lévy measure is of the form
$$\nu(dx) = \frac{k(x)}{|x|}\, dx, \tag{1.7}$$
for some non-negative function $k(\cdot) : \mathbb{R} \to \mathbb{R}$ that is increasing on $(-\infty, 0)$ and decreasing on $(0, \infty)$ and also satisfies the integrability condition $\int_{-\infty}^{\infty} (1 \wedge x^2)\, \frac{k(x)}{|x|}\, dx < \infty$.
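Definition 1.3.2 is easy to verify by simulation for the normal law, where the residual $X_c$ is known in closed form. The following sketch (with arbitrary parameters) checks $X \stackrel{D}{=} cX + X_c$ with a two-sample Kolmogorov-Smirnov test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mu, sigma, c, size = 1.0, 2.0, 0.7, 100_000

# For X ~ N(mu, sigma^2) the residual in X =_D c X + X_c is again normal:
# X_c ~ N((1 - c) mu, (1 - c^2) sigma^2), independent of X.
x = rng.normal(mu, sigma, size)
x_c = rng.normal((1 - c) * mu, np.sqrt(1 - c**2) * sigma, size)
decomposed = c * rng.normal(mu, sigma, size) + x_c  # c * (independent copy of X) + X_c

print(stats.ks_2samp(x, decomposed))   # large p-value: same law
```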
Another interesting property of self-decomposable distributions is that their densities are unimodal, see Yamazato (1978) [68] or Sato (1999) [63]. The proof that every density in the class L, i.e. the self-decomposable class, is unimodal is non-trivial, and the interested reader may refer to the references suggested above. This property can be helpful in determining that some distribution is not self-decomposable. Below is an example where a distribution is infinitely divisible but not self-decomposable.

Example 1.3.3.
(i) Any stable distribution on $\mathbb{R}$ is self-decomposable. For example, consider the one-sided $\alpha$-stable law, $0 < \alpha < 2$. Its Lévy generating triplet is of the form $(0, \nu, \gamma)$, where $\gamma \in \mathbb{R}$ and the Lévy measure is
$$\nu(dy) = \begin{cases} C\alpha\, y^{-\alpha-1}\, dy, & y > 0, \\ 0, & y < 0. \end{cases}$$
Directly from the above, one can see that the given Lévy measure has a density with respect to Lebesgue measure of the form (1.7), with $k(y) = C\alpha\, y^{-\alpha}$ decreasing on $(0, \infty)$; hence it is self-decomposable. In addition, the characteristic function of the stable distribution is given by the formula
$$\hat{\nu}(z) = \begin{cases} e^{-C\,\Gamma(1-\alpha)\,(-iz)^{\alpha}}, & \text{for } 0 < \alpha < 1, \\ e^{C\,\frac{\Gamma(2-\alpha)}{\alpha-1}\,(-iz)^{\alpha}}, & \text{for } 1 < \alpha < 2. \end{cases}$$

(ii) As mentioned previously, the Lévy-Khintchine form of the Gamma characteristic function is given by
$$\varphi_{\Gamma(\alpha,\beta)}(z) = \exp\Big\{ \beta \int_0^{\infty} (e^{izx} - 1)\, (e^{-\alpha x}/x)\, dx \Big\}.$$
From this representation one can see that the Lévy measure has a density with respect to the Lebesgue measure, $\nu(dx) = \beta\, (e^{-\alpha x}/x)\, dx$, which is of the form (1.7) with $k(x) = \beta e^{-\alpha x}$ decreasing on $(0, \infty)$; hence the Gamma distribution is self-decomposable. In addition, since the exponential distribution is a special case of the Gamma distribution, it is self-decomposable as well.

(iii) Other well known self-decomposable distributions are the double-exponential (Laplace), log-normal, Pareto, t-, and F-distributions, etc.

(iv) An interesting example of a distribution that is infinitely divisible but not self-decomposable can be constructed in the following way. Take the sum of independent normal and Poisson random variables. The convolution is infinitely divisible, but for suitable parameters $\lambda$ the resulting distribution can be multi-modal, while self-decomposable distributions are unimodal; hence it is not self-decomposable. Another way to construct a continuous distribution that is infinitely divisible but not self-decomposable is via the Lévy-Khintchine representation, while making sure that the Lévy measure is not absolutely continuous with respect to the Lebesgue measure.

In the next section one will see a different view of self-decomposable distributions: as the stationary distributions of a class of Markov processes called Ornstein-Uhlenbeck type processes. This interpretation of a self-decomposable distribution gives another representation of its Lévy measure, as opposed to equation (1.7) given above.

1.4 Ornstein-Uhlenbeck type processes

Named after Leonard Ornstein and George Eugene Uhlenbeck, the real-valued Ornstein-Uhlenbeck (OU) process $\{Y_t\}_{t\ge0}$, driven by a Brownian motion $B_t$, is the a.s. unique solution to the following stochastic differential equation (SDE), commonly referred to as the Langevin SDE,
$$dY_t = -\lambda Y_t\, dt + dB_t.$$
A standard Ornstein-Uhlenbeck process has many useful modeling properties: for example, it is a continuous time analogue of the discrete AR(1) process, it is relatively straightforward to simulate, and it has a mean reversion property (it converges towards its long-term mean as $t \to \infty$). However, a potential drawback of modeling via the standard OU process is that it can fit only data that exhibit normality. Its exponentially decaying dependence structure is not very flexible either, so there is an obvious need for a model that improves on these features.
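For later comparison with the OU type case, here is a minimal simulation sketch of the standard Gaussian OU process above, using the exact AR(1) recursion over a time step $dt$ (arbitrary illustrative parameters). It exhibits the stationary variance $1/(2\lambda)$ and the exponentially decaying autocorrelation mentioned in the text:

```python
import numpy as np

rng = np.random.default_rng(4)
lam, dt, n = 2.0, 0.01, 100_000

# Exact recursion for dY = -lam*Y dt + dB:
# Y_{t+dt} = e^{-lam dt} Y_t + eps,  eps ~ N(0, (1 - e^{-2 lam dt})/(2 lam)),
# i.e. a discrete AR(1) with coefficient phi = e^{-lam dt}.
phi = np.exp(-lam * dt)
sd = np.sqrt((1 - np.exp(-2 * lam * dt)) / (2 * lam))

y = np.empty(n)
y[0] = 5.0                      # start far from the long-term mean 0
for i in range(1, n):
    y[i] = phi * y[i - 1] + sd * rng.standard_normal()

tail = y[n // 2:]               # discard the burn-in
print("sample variance:", tail.var(), " theory 1/(2 lam) =", 1 / (2 * lam))
k = 100                         # ACF should decay as e^{-lam * k * dt}
print("ACF at lag", k, ":", np.corrcoef(tail[:-k], tail[k:])[0, 1],
      " vs ", np.exp(-lam * k * dt))
```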
In this thesis our focus is on processes of Ornstein-Uhlenbeck type (OU type for short), which are analogues of the above mentioned ordinary OU processes where the Brownian motion is replaced by a general Lévy process $\{Z_t\}_{t\ge0}$:
$$dY_t = -\lambda Y_t\, dt + dZ_t.$$
As we will see later, the class of possible marginal distributions for the OU type processes coincides with the class of self-decomposable distributions, which includes the Gamma, inverse Gaussian, normal inverse Gaussian, and many other distributions useful for applications in finance. Self-decomposable distributions are specifically useful when modeling data that exhibit heavier tails and higher peaks than the normal distribution (frequently referred to as leptokurtic distributions). Hence, the OU type models are more flexible when fitting empirical data.

Here we summarize some of the available definitions and results from Sato and Rocha-Arteaga's book [60]. In this chapter, we give a layout for the construction of an Ornstein-Uhlenbeck type process, which is a necessary building block for the upcoming chapters. As suggested in [60][Chapter 2], in order to construct an Ornstein-Uhlenbeck type process, first a stochastic integral of a deterministic integrand with respect to a Lévy process $\{Z_t\}$ needs to be defined, which is done in Section 1.4.1 by following the reference provided. There, a stochastic integral of a step function on a bounded time interval is defined first. Due to the fact that any measurable function can be approximated by a sequence of simple functions, the Lévy integral can then be defined as a limit in probability of the previously constructed simple integrals. The Ornstein-Uhlenbeck type process is afterwards introduced as a stochastic integral with respect to a Lévy process $\{Z_t\}$. Then, as suggested in [60, 63], since the Lévy integral is well defined, the existence and uniqueness of an Ornstein-Uhlenbeck type process is established as an almost sure, strong and unique solution to the Langevin integral equation, see Section 1.4.2.

Section 1.4.3 contains important results on the stationarity of the Ornstein-Uhlenbeck type process and its Markovian property, see Sato's book [63] and [1, 60] for more details. It is well known that the ordinary OU process driven by Brownian motion has a limiting stationary distribution as $t \to \infty$, which is Gaussian. On the other hand, an OU type process does not necessarily have a limiting stationary distribution, since it is driven by a general Lévy noise that tends to have more erratic behavior due to jumps. In that section we give the conditions on the Lévy measure of the driving Lévy process under which processes of OU type have stationary limiting distributions. After establishing the limiting stationary distribution of an OU type process, we briefly explain the correspondence between self-decomposable distributions and stationary processes of OU type. For more properties of OU type processes and their generalizations see Mandrekar & Rüdiger (2007) [54], Barndorff-Nielsen (2001) [4], Barndorff-Nielsen & Stelzer (2011) [9].

1.4.1 Stochastic integral with respect to the Lévy process

In this section we summarize the approach to defining the stochastic integral of a bounded measurable function, defined on a bounded closed interval in $\mathbb{R}$, with respect to a given Lévy process; see the details in [60]. Note that this is a special case of integration with respect to a semimartingale.
In addition, the expression for the characteristic function of the stochastic integral with respect to a given Lévy process is provided; refer to the same references [1, 60, 63]. It turns out that its characteristic function depends on the characteristic function of the driving Lévy process. We start by defining a step function.

Definition 1.4.1. Let $0 \le t_0 < t_1 < \infty$. We say that a function $f$, defined on the interval $[t_0, t_1]$, is a step function if, for a finite number of points $t_0 = s_0 < s_1 < \dots < s_n = t_1$ and some $a_j \in \mathbb{R}$, $j = 1, \dots, n$, it can be written in the form
$$f(s) = \sum_{j=1}^{n} a_j\, 1_{[s_{j-1}, s_j]}(s). \tag{1.8}$$

As explained, we first define the stochastic integral of a given step function $f$, defined on a bounded closed interval $[t_0, t_1]$, with respect to a given Lévy process.

Definition 1.4.2. Let $\{Z_t\}_{t\ge0}$ be a Lévy process on $\mathbb{R}$, with $Z_1 \stackrel{D}{=} \mu_0$ uniquely determined by the generating triplet $\mu_0 \sim (A_0, \nu_0, \gamma_0)$ and characteristic function of the form
$$\varphi_{Z_t}(z) = E[e^{izZ_t}] = e^{t\kappa_0(z)}, \tag{1.9}$$
where $\kappa_0(z)$ is the Lévy exponent of $\hat{\mu}_0$, given by
$$\kappa_0(z) = -\frac{1}{2} A_0 z^2 + i\gamma_0 z + \int_{\mathbb{R}} \big( e^{izx} - 1 - izx\, 1_{[-1,1]}(x) \big)\, \nu_0(dx). \tag{1.10}$$
Then the Lévy integral of a given step function $f$ is defined as
$$\int_{t_0}^{t_1} f(s)\, dZ_s = \sum_{j=1}^{n} a_j \big( Z_{s_j} - Z_{s_{j-1}} \big). \tag{1.11}$$

Following the steps in [60], we present the properties of the above defined Lévy integral of a simple function.

Proposition 1.4.1. The distribution of the random variable $Y \equiv \int_{t_0}^{t_1} f(s)\, dZ_s$ is infinitely divisible for any step function $f$, and its characteristic function is given in the form below:
$$\varphi_Y(z) = E\Big\{ \exp\Big( iz \int_{t_0}^{t_1} f(s)\, dZ_s \Big) \Big\} = \exp\Big\{ \int_{t_0}^{t_1} \kappa_0(z f(s))\, ds \Big\}. \tag{1.12}$$

Proof. By the independent and stationary increments property of the Lévy process, combined with expression (1.9), we get
$$E\Big\{ \exp\Big( iz \int_{t_0}^{t_1} f(s)\, dZ_s \Big) \Big\} = \prod_{j=1}^{n} E\big\{ e^{iz a_j (Z_{s_j} - Z_{s_{j-1}})} \big\} = \prod_{j=1}^{n} e^{(s_j - s_{j-1})\, \kappa_0(a_j z)} = \exp\Big\{ \sum_{j=1}^{n} (s_j - s_{j-1})\, \kappa_0(a_j z) \Big\} = \exp\Big\{ \int_{t_0}^{t_1} \kappa_0(z f(s))\, ds \Big\}.$$
The infinite divisibility of the Lévy integral in (1.11) is a direct consequence of its definition.

In the subsequent steps we still follow the reference [60]; hence we define the Lévy integral for a larger class of functions, the real-valued bounded measurable functions on $[t_0, t_1]$.

Proposition 1.4.2. Let $f(s)$ be a real-valued bounded measurable function on $[t_0, t_1]$, such that there are uniformly bounded step functions $\{f_n(s)\}_{n\ge1}$ on $[t_0, t_1]$ satisfying $f_n \to f$. Then the Lévy integrals $\int_{t_0}^{t_1} f_n(s)\, dZ_s$ converge in probability to a real-valued random variable $Y$. The law of $Y$ is infinitely divisible and its characteristic function is given by
$$E e^{izY} = \exp\Big\{ \int_{t_0}^{t_1} \kappa_0(z f(s))\, ds \Big\}. \tag{1.13}$$
Note, the limit $Y$ does not depend on the choice of $f_n$, up to probability zero.

Proof. As suggested in the reference [60], we start with a lemma from Sato's book [63]. Even though the characteristic function is uniformly continuous, it is not straightforward to show that the Lévy characteristic exponent is continuous. Lemma 7.6 (p. 33) in Sato (1999) says that any continuous function $\varphi(z) : \mathbb{R} \to \mathbb{C}$ with $\varphi(0) = 1$ and $\varphi(z) \neq 0$ for any $z$ has a unique continuous Lévy exponent $\kappa(z) : \mathbb{R} \to \mathbb{C}$ such that $\kappa(0) = 0$ and $e^{\kappa(z)} = \varphi(z)$.
Now, due to the continuity of the Lévy exponent $\kappa_0$ in (1.10), one gets $\kappa_0((f_n(s) - f_m(s))\, z) \to 0$ for almost every $s$, as $n, m \to \infty$. Since the step functions are assumed to be uniformly bounded and the Lévy exponent $\kappa_0$ is continuous, the dominated convergence theorem implies
$$\int_{t_0}^{t_1} \kappa_0\big( z (f_n(s) - f_m(s)) \big)\, ds \to 0, \quad \text{as } n, m \to \infty.$$
By (1.12),
$$E e^{iz \int_{t_0}^{t_1} ( f_n(s) - f_m(s) )\, dZ_s} = e^{\int_{t_0}^{t_1} \kappa_0( z (f_n(s) - f_m(s)) )\, ds} \to 1, \tag{1.14}$$
which implies convergence to zero in probability of the above sequence, i.e. $\int_{t_0}^{t_1} (f_n(s) - f_m(s))\, dZ_s \stackrel{P}{\to} 0$. This means that the sequence of integrals is Cauchy in probability, and since the metric of convergence in probability is complete, one may conclude that there exists a random variable $Y$ which is the limit in probability of the sequence of random variables $\int_{t_0}^{t_1} f_n(s)\, dZ_s$ defining the Lévy integral.

The law of $Y$ is infinitely divisible because Lemma 1.3.2 implies that the weak limit of a sequence of infinitely divisible distributions is infinitely divisible. Since the sequence of simple Lévy integrals $\int_{t_0}^{t_1} f_n(s)\, dZ_s$ is infinitely divisible, see Proposition 1.4.1, the rest follows.

It is straightforward to get the convergence in distribution of the sequence of Lévy integrals. The continuity of the Lévy exponent $\kappa_0$ implies $\kappa_0(f_n(s) z) \stackrel{n\to\infty}{\longrightarrow} \kappa_0(f(s) z)$, and by the DCT, $\int_{t_0}^{t_1} \kappa_0(f_n(s)\, z)\, ds = \sum_{j=1}^{n} \kappa_0(a_j z)(s_j - s_{j-1}) \to \int_{t_0}^{t_1} \kappa_0(f(s)\, z)\, ds$. Finally Proposition 1.4.1 implies, as $n \to \infty$,
$$E e^{iz \int_{t_0}^{t_1} f_n(s)\, dZ_s} \to e^{\int_{t_0}^{t_1} \kappa_0( f(s)\, z )\, ds}.$$
The existence of the limit in probability and the above convergence of the characteristic functions, combined with the continuity theorem, give the distributional form of the limit, i.e. the above equation (1.13). Uniqueness follows from equation (1.14): with two approximating sequences $f_n(s)$, $g_n(s)$, both uniformly bounded and converging a.s. to $f$, the following holds: $\int_{t_0}^{t_1} f_n(s)\, dZ_s - \int_{t_0}^{t_1} g_n(s)\, dZ_s \to 0$ in probability.

Remark 1.4.1. It is explained in [60] that by using Lusin's theorem it is possible to construct a sequence of uniformly bounded step functions such that if $f$ is a real-valued bounded measurable function on $[t_0, t_1]$, then $Y = \int_{t_0}^{t_1} f(s)\, dZ_s$ is definable and Proposition 1.4.2 holds.

Definition 1.4.3. The real-valued random variable $Y$ in the proposition above is the stochastic integral of $f$ with respect to the Lévy process $\{Z_t\}$, and it is denoted by $Y \equiv \int_{t_0}^{t_1} f(s)\, dZ_s$.

Proposition 1.4.3. If $f$ is a real-valued bounded measurable function on $[t_0, t_1]$, then $Y = \int_{t_0}^{t_1} f(s)\, dZ_s$ is definable and Proposition 1.4.2 holds. Note, the above proposition shows the existence of the Lévy integral on the bounded time interval $[t_0, t_1]$.

Provided that the assumptions in Proposition 1.4.1 hold, a couple of important properties of Lévy integrals hold true; see the references [1, 60] for more details. See below.

Proposition 1.4.4 (Some properties of the Lévy integrals).
(i) For any Lévy integral of the type defined in Definition 1.4.3, the following holds:
$$\int_{t_0}^{t_1} f(s)\, dZ_{s+h} \stackrel{D}{=} \int_{t_0}^{t_1} f(s)\, dZ_s.$$
The above statement is true due to the stationarity of the Lévy increments and due to the fact that convergence in probability implies convergence in distribution. In addition, if the distribution of $\int_0^{\infty} f(s)\, dZ_s$ exists, the following property holds as well: $\int_0^{\infty} f(s)\, dZ_{s+h} \stackrel{D}{=} \int_0^{\infty} f(s)\, dZ_s$.
(ii) Provided that a Lévy process on the negative time axis is defined as $\{Z'_t : t < 0\} \stackrel{D}{=} \{-Z_t : t \ge 0\}$, where $Z'_t$ is independent of $Z_t$, the following property is true:
$$\int_0^{t} e^{-\lambda(t-s)}\, dZ_s \stackrel{D}{=} \int_0^{t} e^{-\lambda s}\, dZ_s.$$
Again, it holds because of the stationary increments of the Lévy process and the fact that $\int_0^{t} e^{\lambda s}\, dZ_{-s} \stackrel{D}{=} -\int_0^{t} e^{\lambda s}\, dZ_s$, as a consequence of the stationarity of the Lévy increments.
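Formula (1.12) can be verified numerically for a concrete step function. The sketch below (arbitrary parameters) takes a Gamma subordinator as the driving Lévy process, for which $\kappa_0(z) = -\beta \log(1 - iz/\alpha)$, and compares the Monte Carlo characteristic function of the Lévy integral with the right-hand side of (1.12):

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, beta = 1.0, 2.0                # Gamma(alpha, beta): rate, shape
s = np.array([0.0, 0.5, 1.25, 2.0])   # partition t0 = s0 < ... < sn = t1
a = np.array([1.0, -0.5, 2.0])        # step heights a_j on (s_{j-1}, s_j]

# Monte Carlo: integral of f dZ = sum_j a_j (Z_{s_j} - Z_{s_{j-1}}),
# where the Gamma process increments are Gamma(alpha, beta * ds_j).
N = 200_000
ds = np.diff(s)
incr = rng.gamma(beta * ds, 1.0 / alpha, size=(N, len(ds)))
Y = incr @ a

z = 0.7
lhs = np.mean(np.exp(1j * z * Y))                       # empirical E e^{izY}
kappa0 = lambda u: -beta * np.log(1 - 1j * u / alpha)   # Levy exponent
rhs = np.exp(np.sum(ds * kappa0(z * a)))                # formula (1.12)
print("Monte Carlo:", lhs, "  formula:", rhs)           # should agree closely
```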
Note that an additive process is a stochastically continuous process with independent increments that starts at zero a.s. It will be used in the next two propositions, which we include from [60] for the sake of completeness. Even though we are not going to give a detailed proof of the existence of an OU type process, the Fubini-type proposition involving stochastic integrals is a necessary step in defining it, so we include it here.

Proposition 1.4.5. Let $f(s)$ be a locally bounded and measurable function on the interval $s \in [0, \infty)$. Then there exists an additive real-valued process $\{Y_t\}$ such that, for every $t > 0$,
$$P\Big( Y_t = \int_0^t f(s)\, dZ_s \Big) = 1.$$
Note, since $\int_0^t f(s)\, dZ_s$ is thus a modification of $Y_t$, the integral $\int_{t_0}^{t_1} f(s)\, dZ_s$ is a modification of $Y_{t_1} - Y_{t_0}$.

Similarly, for the sake of completeness, we include the proposition of Fubini type for Levy integrals.

Proposition 1.4.6. Let $f(s)$ and $g(s)$ be bounded measurable functions on $[t_0, t_1]$. Then
$$\int_{t_0}^{t_1} g(s) \Big( \int_{t_0}^{s} f(u)\, dZ_u \Big)\, ds = \int_{t_0}^{t_1} \Big( \int_{u}^{t_1} g(s)\, ds \Big) f(u)\, dZ_u, \quad \text{a.s.} \tag{1.15}$$

1.4.2 Existence and uniqueness of the OU type process

Given a real-valued Levy process $\{Z_t\}_{t \ge 0}$ with generating triplet $(A_0, \nu_0, \gamma_0)$ satisfying the Levy-Khintchine formula (1.5), a real constant $\lambda > 0$ and a starting point $x \in \mathbb{R}$, we define a new process $\{Y_t\}_{t \ge 0}$ that is a.s. right continuous with left limits and a.s. satisfies the integral equation
$$Y_t(\omega) = x + Z_t(\omega) - \lambda \int_0^t Y_s(\omega)\, ds, \quad \text{for } t \ge 0. \tag{1.16}$$

Theorem 1.4.7. Considered as an equation for $Y_t(\omega)$, equation (1.16) has an a.s. unique solution, given by
$$Y_t(\omega) = x e^{-\lambda t} + e^{-\lambda t} \int_0^t e^{\lambda s}\, dZ_s(\omega). \tag{1.17}$$

In general, the solution proposed in (1.17) is well defined whenever the integral with respect to the measure induced by the Levy process is well defined, see Section 1.4.1. Hence, the integral equation (1.16) has an a.s. strong and unique solution regardless of the type of Levy process used as driving noise.

Remark 1.4.3. In many applications the driving Levy process is an increasing, positive Levy process, frequently referred to as a subordinator. It can be shown that the sample functions of these processes are almost surely of bounded variation on any finite time interval (so that the last integral in (1.17) is a path-wise Lebesgue-Stieltjes integral in $s$); hence the above integral is well defined. So, in case the Levy process is increasing (or non-decreasing), we can rewrite the integral equation (1.16) as a path-wise differential equation, which, viewed as a function of $t$ and $\omega$, is stochastic. The OU type process can thus be viewed as the a.s. unique solution to the stochastic differential equation (SDE)
$$dY_t(\omega) = -\lambda Y_t(\omega)\, dt + dZ_t(\omega), \quad t \ge 0. \tag{1.18}$$
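A minimal simulation sketch of the SDE (1.18), assuming (for illustration only) a compound Poisson subordinator with exponential jumps as the driving noise; all parameter values are hypothetical:

    import numpy as np

    rng = np.random.default_rng(1)
    lam, T, dt = 0.5, 200.0, 1e-3         # mean reversion, horizon, step (illustrative)
    rate, jump_mean = 2.0, 1.0            # subordinator: jump rate, Exp jump mean
    n = int(T / dt)

    # Increments dZ of the subordinator: Poisson number of Exp jumps per step
    counts = rng.poisson(rate * dt, size=n)
    dZ = np.array([rng.exponential(jump_mean, c).sum() for c in counts])

    # Euler scheme for dY = -lam * Y dt + dZ, cf. (1.17)-(1.18)
    Y = np.empty(n + 1)
    Y[0] = rate * jump_mean / lam         # start near the stationary mean
    for i in range(n):
        Y[i + 1] = Y[i] - lam * Y[i] * dt + dZ[i]

    # Sample mean vs. stationary mean E[Z_1]/lam (approximately equal)
    print(Y.mean(), rate * jump_mean / lam)

The stationary mean used in the check follows from (1.17): $E Y_\infty = E[Z_1] \int_0^\infty e^{-\lambda s}\,ds = E[Z_1]/\lambda$.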
Since all the necessary elements have now been defined, one can proceed with the definition of an Ornstein-Uhlenbeck type (OU type) process.

Definition 1.4.4 (Ornstein-Uhlenbeck type process). Let $\lambda > 0$, let $M$ be a real-valued random variable, and let $Z_t$ be a Levy process on $\mathbb{R}$ generated by the triplet $(A_0, \nu_0, \gamma_0)$, such that $M$ and $\{Z_t\}_{t \ge 0}$ are independent. The process that is the a.s. unique solution of the equation
$$Y_t = M + Z_t - \lambda \int_0^t Y_s\, ds, \quad \text{for } t \ge 0, \tag{1.19}$$
is called the Ornstein-Uhlenbeck type process generated by $Z_t$ and $\lambda$, starting from $Y_0 = M$. One may also refer to it as the Ornstein-Uhlenbeck type process generated by the quadruple $(A_0, \nu_0, \gamma_0, \lambda)$. Equivalently, the corresponding stochastic differential equation is
$$dY_t = -\lambda Y_t\, dt + dZ_t, \quad Y_0 = M. \tag{1.20}$$

Note that the above process $Z_t$ is often referred to as the background driving Levy process (BDLP for short), and the parameter $\lambda$ is usually referred to as the mean reversion parameter. Intuitively, $\lambda$ controls how fast the process $Y_t$ reverts to its long-term mean. Moreover, it controls the memory of the process $Y_t$ through the autocorrelation function, see Section 1.4.3.3.

Remark 1.4.4. Frequently in the literature, the BDLP of a given OU type process is rescaled from $Z_t$ to $Z_{\lambda t}$, so that the stochastic differential equation (1.20) becomes
$$dY_t = -\lambda Y_t\, dt + dZ_{\lambda t}.$$
This scaling makes the marginal distribution of the OU type process free of the parameter $\lambda$: when fitting a distribution to the data at hand there is then one parameter fewer to estimate, and $\lambda$ is usually estimated from the empirical correlation function.

1.4.3 Properties of the OU type processes

1.4.3.1 The stationary OU type process

Recall that a process is stationary if and only if its family of finite-dimensional distributions is shift invariant; in other words, if for any $u \ge 0$, $\{Y_t\}_{t \ge 0} \stackrel{D}{=} \{Y_{t+u}\}_{t \ge 0}$. Levy processes usually do not have limiting distributions as $t \to \infty$, except in the case of the zero process. On the other hand, since OU type processes drift toward the origin with a magnitude proportional to the distance from the origin, they may have a limit if certain assumptions are satisfied, in which case stationarity will hold. One way to show stationarity is to write down the characteristic function directly and determine the conditions under which the limiting distribution exists, i.e. the distribution as $t \to \infty$; see Barndorff-Nielsen (1998) [5]. Another approach, proposed by Sato (1999) [63], is to exploit the Markov properties of an OU type process and find the conditions under which the limiting distribution of a Markov process exists and is stationary. To simplify, we only explain the first approach and explore the stationarity of the marginal distribution of the OU type process, as in Barndorff-Nielsen (1998) [5].

The proposition below reveals a simple yet important property of the OU type process, which will be used throughout; for more details refer to [60].

Proposition 1.4.8. Let $\{Y_t\}_{t \ge 0}$ be an Ornstein-Uhlenbeck type process given by $Y_t = e^{-\lambda t} Y_0 + \int_0^t e^{-\lambda(t-s)}\,dZ_s$, $\lambda > 0$. Then
$$Y_{t+u} = e^{-\lambda u} Y_t + \int_0^u e^{-\lambda(u-s)}\, dZ_s, \quad u \ge 0. \tag{1.21}$$

Proof.
$$Y_{t+u} = e^{-\lambda(t+u)} Y_0 + \int_0^{t+u} e^{-\lambda(t+u-s)}\, dZ_s = e^{-\lambda(t+u)} Y_0 + e^{-\lambda(t+u)} \int_0^{t} e^{\lambda s}\, dZ_s + e^{-\lambda(t+u)} \int_t^{t+u} e^{\lambda s}\, dZ_s$$
$$= e^{-\lambda u}\Big( e^{-\lambda t} Y_0 + \int_0^{t} e^{-\lambda(t-s)}\, dZ_s \Big) + e^{-\lambda(t+u)} \int_0^{u} e^{\lambda(s'+t)}\, dZ_{s'+t} \stackrel{D}{=} e^{-\lambda u} Y_t + e^{-\lambda u} \int_0^{u} e^{\lambda s'}\, dZ_{s'}.$$
The last step holds because the Levy process $Z_t$ has stationary increments, i.e. $dZ_{s'+t} \stackrel{D}{=} dZ_{s'}$, which implies $\int_0^u e^{\lambda s'}\, dZ_{s'+t} \stackrel{D}{=} \int_0^u e^{\lambda s'}\, dZ_{s'}$.

Remark 1.4.5. In the proof above, $Y_0$ and the two integrals defined on the disjoint intervals $[0, t]$ and $[t, t+u]$ are independent, due to the independent increment property of the Levy process $Z_t$. Hence we may conclude that $Y_t$ and $\int_0^u e^{\lambda s}\, dZ_s$ are independent.
Let us continue with the stationarity of the OU type process $Y_t$. The necessary and sufficient condition for stationarity of the marginal distribution is that, for any $u > 0$, $\varphi_{Y_t}(z) = \varphi_{Y_{t+u}}(z)$. Denote by
$$\varphi_{Y_t^{(c)}}(z) = E\big[ e^{iz\int_0^u e^{-\lambda(u-s)}\,dZ_s} \big]$$
the characteristic function of the random variable $Y_t^{(c)}$ representing the integral term in equation (1.21). Using the fact that $Y_t^{(c)}$ and $Y_t$ are independent (Remark 1.4.5), we get
$$\varphi_{Y_t}(z) = \varphi_{Y_{t+u}}(z) = \varphi_{Y_t}(e^{-\lambda u} z)\, \varphi_{Y_t^{(c)}}(z). \tag{1.22}$$
Proposition 1.4.1 implies
$$\varphi_{Y_t^{(c)}}(z) = E\big[ e^{iz \int_0^u e^{-\lambda(u-s)}\, dZ_s} \big] = \exp\Big\{ \int_0^u \kappa_{Z_1}\big( z e^{-\lambda(u-s)} \big)\, ds \Big\}, \tag{1.23}$$
where $\kappa_{Z_1}(z)$ is the Levy exponent of the BDLP $Z_t$, given by $\varphi_{Z_1}(z) = e^{\kappa_{Z_1}(z)}$.

Remark 1.4.6. Since equation (1.22) holds for all $u > 0$, it holds for all $c = e^{-\lambda u} \in (0, 1)$. Hence one may conclude that if the marginal distribution of $Y_t$ is stationary, as in (1.22), then the marginal distribution must be self-decomposable.

We are still interested in finding the condition under which the OU type process is stationary. Substituting $\omega = z e^{-\lambda(u-s)}$, it follows from (1.22) and (1.23) that
$$\varphi_{Y_t}(z) = \varphi_{Y_t}(e^{-\lambda u} z)\, \exp\Big\{ \lambda^{-1} \int_{z e^{-\lambda u}}^{z} \kappa_{Z_1}(\omega)\, \omega^{-1}\, d\omega \Big\}.$$
Since any characteristic function is continuous at zero, and so is $\varphi_{Y_t}(z)$, letting $u \to \infty$ gives
$$\varphi_{Y_t}(z) = \lim_{u \to \infty} \exp\Big\{ \lambda^{-1} \int_{z e^{-\lambda u}}^{z} \kappa_{Z_1}(\omega)\, \omega^{-1}\, d\omega \Big\} = \exp\Big\{ \lambda^{-1} \int_0^{z} \kappa_{Z_1}(\omega)\, \omega^{-1}\, d\omega \Big\}. \tag{1.24}$$

Remark 1.4.7. It is important to notice that equation (1.24) tells us how to choose the distribution of $Z_1$, or its Levy exponent $\kappa_{Z_1}$, in order to obtain a particular marginal distribution of the OU type process $Y_t$ with characteristic function $\varphi_{Y_t}$.

Let us continue with the stationarity of the OU type process. For the right-hand side of equation (1.24) to be well defined, the Levy exponent $\kappa_{Z_1}$ must necessarily satisfy
$$\int_0^z |\kappa_{Z_1}(\omega)|\, \omega^{-1}\, d\omega < \infty, \quad \text{for } z > 0. \tag{1.25}$$
According to Wolfe (1982), the above condition is equivalent to
$$E\big( \log^+ |Z_1| \big) = E\big( 0 \vee \log|Z_1| \big) < \infty. \tag{1.26}$$
Jurek and Mason (1993) state in Theorem 3.6.6 that this condition is in turn equivalent to the condition given in Sato (1999), namely
$$\int_{|x|>2} \log|x|\ \nu(dx) < \infty, \tag{1.27}$$
where $\nu$ is the Levy measure of the background driving Levy process (BDLP) $Z_t$. This is the condition that allows an OU type process to be stationary.
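To illustrate Remark 1.4.7, here is a short worked example (a sketch added for illustration, using only (1.24) and the gamma cumulant function that appears later in Section 1.5.3.1). Differentiating (1.24) gives the general recipe $\kappa_{Z_1}(z) = \lambda\, z\, \kappa_{Y_t}'(z)$, where $\kappa_{Y_t} = \log \varphi_{Y_t}$. For a stationary $\Gamma(\alpha, \beta)$ marginal, $\kappa_{Y_t}(z) = -\alpha \log(1 - iz/\beta)$, so
$$\kappa_{Z_1}(z) = \lambda z\, \frac{i\alpha}{\beta - iz} = \lambda\alpha \Big( \frac{\beta}{\beta - iz} - 1 \Big),$$
which is the cumulant function of a compound Poisson process with jump rate $\lambda\alpha$ and Exp($\beta$) jump sizes. The BDLP of the gamma OU type process is therefore compound Poisson, even though the gamma marginal itself has an infinite Levy measure.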
A Markov process is stationary if its transition function has a stationary limiting distribution as $t \to \infty$. Since an OU type process is a Markov process, we can explore the self-decomposability of its marginals and its stationarity by examining if and when its limiting distribution exists as $t \to \infty$. By the definition of the OU type process and Proposition 1.4.4, the following holds:
$$Y_t = e^{-\lambda t} Y_0 + \int_0^t e^{-\lambda(t-s)}\, dZ_s \stackrel{D}{=} e^{-\lambda t} Y_0 + \int_0^t e^{-\lambda s}\, dZ_s.$$
As $t \to \infty$, if the limiting distribution exists, it is given by
$$Y_\infty \stackrel{D}{=} \int_0^\infty e^{-\lambda s}\, dZ_s.$$
The existence of this limiting distribution is closely related to the existence of a stationary (or invariant) distribution of the OU type process. Provided that $\nu$ is the Levy measure of the BDLP $Z_1$, it can be shown, see [1], that the necessary condition for the existence of the distribution of $Y_\infty$ is
$$\int_{|x|>1} \log(1+|x|)\ \nu(dx) < \infty.$$

Remark 1.4.8. Provided that the above limiting distribution $Y_\infty$ exists and the OU type process $Y_t$ is stationary, the following are alternative representations of an Ornstein-Uhlenbeck type process:
$$Y_t \stackrel{D}{=} \int_0^\infty e^{-\lambda s}\, dZ_s \stackrel{D}{=} \int_{-\infty}^0 e^{\lambda s}\, dZ_s \stackrel{D}{=} \int_{-\infty}^t e^{-\lambda(t-s)}\, dZ_s. \tag{1.28}$$
Provided the OU type process is stationary, all three representations can be proved using the above approaches and an appropriate change of variable.

Remark 1.4.9. From the third representation in (1.28), one can see that the OU type process is a special case of a moving average process $Y_t = \int_{-\infty}^{+\infty} f(s-t)\, dZ_s$ with $f(s) = 1_{(-\infty, 0]}(s)\, e^{\lambda s}$, see [1].

1.4.3.2 The stationary OU type processes and self-decomposable distributions

There is a one-to-one correspondence between the class of self-decomposable distributions and the stationary OU type processes. As briefly mentioned in Remark 1.4.6, it is not difficult to see that if an OU type process is stationary, then its marginal distribution is self-decomposable; see Proposition 1.4.9 below.

Proposition 1.4.9. Let $Y_t$ be a stationary OU type process on $\mathbb{R}$, given by $Y_t = e^{-\lambda t} Y_0 + \int_0^t e^{-\lambda(t-s)}\, dZ_s$, $\lambda > 0$. Then the distribution of $Y_0$ is self-decomposable, with $c = e^{-\lambda t}$ and $Y(t)^{(c)} = \int_0^t e^{-\lambda s}\, dZ_s$.

Remark 1.4.10. It can be proved that any self-decomposable measure $\mu = \mu_c * \mu_t^{(c)}$ is an invariant measure, see [1] and Applebaum's lectures [2]. This means that any self-decomposable distribution $\mu$ is a potential candidate for the marginal distribution of some stationary Markov process $Y_t$ with $Y_0 \stackrel{D}{=} \mu$ (recall that a Markov process $X$ is stationary iff $\mu \triangleq \mathcal{L}(X_0)$ is an invariant measure).

In fact, it is possible to show that the converse is true: any self-decomposable distribution $\mu$ is the weak limit of a stationary Ornstein-Uhlenbeck type process $Y_t$ with $Y_0 \stackrel{D}{=} \mu$. We closely follow Applebaum's book on Levy processes and stochastic calculus [1] and present the proposition below.

Proposition 1.4.10. Given any self-decomposable random variable $X$ with distribution $\mu$, there exists a stationary Ornstein-Uhlenbeck type process $Y_t$ such that
$$X \stackrel{D}{=} Y_0 = \int_0^\infty e^{-\lambda s}\, dZ_s,$$
and $\mathcal{L}(Y_0) = \mu$.

Proof. We give a sketch of the proof. Intuitively, since any self-decomposable measure can be decomposed for any $c \in (0, 1)$, i.e. $\mu = \mu_c * \mu_t^{(c)}$, take the specific $c = e^{-\lambda t}$ to get, for any $t > 0$, $\mu = \mu_{e^{-\lambda t}} * \mathcal{L}(Y(t)^{(c)})$, where $Y(t)^{(c)} = \int_0^t e^{-\lambda s}\, dZ_s$. From here one can see that it is possible to construct a stationary OU type process $Y_t$ and its BDLP Levy process $Z_t$ satisfying the above equality, but this is not straightforward; see Jurek and Vervaat (1983) [43] for details.

The result in the above proposition can be interpreted as a complete characterization of self-decomposable distributions, see below.

Proposition 1.4.11. Let $X$ be a self-decomposable random variable with Levy measure $Q$ of the form $Q(dx) = q(x)\,dx$, where $q(x) = k(x)/|x|$ for some non-negative real function $k(\cdot)$, increasing on $(-\infty, 0)$, decreasing on $(0, \infty)$, and satisfying $\int_{-\infty}^{\infty} (1 \wedge x^2)\, \frac{k(x)}{|x|}\, dx < \infty$. If, in addition, $\int_{|x|>1} \log(1+|x|)\, Q(dx) < \infty$, then $X$ can be represented as
$$X = \int_0^\infty e^{-s}\, dZ_s, \tag{1.29}$$
where $Z_t$ is a Levy process whose law is uniquely determined by that of $X$. In fact, $Z$ is the BDLP corresponding to $X$, and $Z_1$ is uniquely determined by its Levy measure $W(dx)$. Furthermore, the Levy measure $W(dx)$ of $Z_1$ is related to the Levy density $q(x)$ of $X$ through its tails $W^+(x) = W([x, \infty))$ and $W^-(x) = W((-\infty, x])$ by
$$W^+(x) = x\, q(x), \quad \text{for } x > 0, \qquad W^-(x) = |x|\, q(x), \quad \text{for } x < 0. \tag{1.30}$$
In addition, if the Levy density $q(x)$ of $X$ is differentiable, then the Levy measure $W(dx)$ of $Z_1$ has a density $w$ with respect to Lebesgue measure, and the densities $w(x)$ and $q(x)$ are related by
$$w(x) = -q(x) - x\, q'(x). \tag{1.31}$$
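As a quick consistency check of (1.30)-(1.31), consider again the gamma case (a worked example added for illustration; here we use the standard gamma Levy density $q(x) = \alpha x^{-1} e^{-\beta x}$ on $(0, \infty)$, i.e. $k(x) = \alpha e^{-\beta x}$). Then
$$w(x) = -q(x) - x\, q'(x) = -\frac{\alpha e^{-\beta x}}{x} + \frac{\alpha e^{-\beta x}}{x} + \alpha\beta e^{-\beta x} = \alpha\beta e^{-\beta x}, \qquad x > 0,$$
so the Levy measure of $Z_1$ has finite total mass $\int_0^\infty \alpha\beta e^{-\beta x}\, dx = \alpha$ with exponential jump-size density, consistent with the compound Poisson BDLP obtained in the example after Remark 1.4.7 (there with rate $\lambda\alpha$, because of the mean-reversion scaling). One also checks (1.30) directly: $\int_x^\infty w(y)\, dy = \alpha e^{-\beta x} = x\, q(x)$.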
1.4.3.3 Covariance of the stationary OU type process

The covariance structure of a process is important from many different aspects. For example, in modeling it is essential to know whether the data exhibit long or short range dependence and how it evolves with time; statistical estimation and inference techniques rely heavily on the dependence structure of the underlying process and on whether the process satisfies mixing conditions. The covariance of the OU type process can be derived by elementary calculation:
$$Cov(Y_t, Y_{t+u}) = Cov\Big( Y_t,\ e^{-\lambda u} Y_t + \int_0^u e^{-\lambda(u-s)}\, dZ_s \Big) \tag{1.32}$$
$$= e^{-\lambda u}\, Var(Y_t) + Cov\Big( Y_t,\ \int_0^u e^{-\lambda(u-s)}\, dZ_s \Big) \tag{1.33}$$
$$= e^{-\lambda u}\, Var(Y_t), \tag{1.34}$$
where $Y_t$ and the integral $\int_0^u e^{-\lambda(u-s)}\, dZ_s$ are independent, see Proposition 1.4.8. From the above formula for the covariance, one can easily see that the autocorrelation function of an OU type process is
$$\rho(u) = e^{-\lambda|u|}. \tag{1.35}$$
This formula also shows that the parameter $\lambda$ controls the memory of the OU type process: the autocorrelation decays more slowly as $\lambda$ decreases.

Remark 1.4.11. Because its autocorrelation function decreases exponentially as the number of lags increases, the Ornstein-Uhlenbeck type process is short range dependent (SRD).
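A quick numerical check of (1.35) (a sketch reusing the Euler scheme from the example in Section 1.4.2; the unit-jump compound Poisson driver and all parameters are illustrative):

    import numpy as np

    rng = np.random.default_rng(7)
    lam, dt, n = 0.5, 0.01, 2_000_000
    dZ = rng.poisson(2.0 * dt, n).astype(float)   # compound Poisson with unit jumps
    Y = np.empty(n); Y[0] = 4.0
    for i in range(1, n):
        Y[i] = Y[i - 1] * (1 - lam * dt) + dZ[i]

    for u in (1, 2, 3):                           # lags in units of time
        k = int(u / dt)
        emp = np.corrcoef(Y[:-k], Y[k:])[0, 1]
        print(u, emp, np.exp(-lam * u))           # empirical vs. theoretical e^{-lam*u}

The empirical autocorrelations line up with $e^{-\lambda u}$ up to sampling noise, as (1.35) predicts regardless of the marginal distribution of the driver.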
1.5 Superpositions of the OU type processes

1.5.1 Existence of supOU processes

Superpositions of OU type processes, or supOU processes for short, were introduced in the papers of Barndorff-Nielsen [4], Barndorff-Nielsen and Shephard [8] and Barndorff-Nielsen and Leonenko [7], among others. In this section we give a brief overview of superposition type models: their existence, uniqueness, and main properties, including their covariance structure. The focus of this thesis is mainly on their limiting behavior, see Chapter 2. We start with the definition of finite and infinite superpositions and the assumptions necessary for them to be well defined.

Definition 1.5.1. Assume the following:

(A) Let $\{Y^{(k)}(t)\}_{k \ge 1}$ be a sequence of independent stationary OU type processes, i.e. independent processes such that each $Y^{(k)}(t)$ is a stationary solution of the equation
$$dY^{(k)}(t) = -\lambda_k Y^{(k)}(t)\, dt + dZ^{(k)}(\lambda_k t), \quad t \ge 0, \tag{1.36}$$
in which the Levy processes $\{Z^{(k)}\}_{k \ge 1}$ are independent and $\lambda_k > 0$ for all $k \ge 1$. In addition, we assume that the self-decomposable distribution of $Y^{(k)}$ has finite moments up to order $q$, $q \ge 2$, that it is closed under convolution with respect to at least one distributional parameter $\delta_k$, and that the cumulants of order $q \ge 2$ of $Y^{(k)}$ are proportional to the parameter $\delta_k$.

(B) Another assumption concerns the growth of the moments of $Y^{(k)}$:
$$\sum_{k=1}^\infty E\, Y^{(k)}(t) < \infty \quad \text{and} \quad \sum_{k=1}^\infty Var\, Y^{(k)}(t) < \infty. \tag{1.37}$$

We define finite (for an integer $m$) and infinite superpositions of OU type processes as follows:

(i) If assumption (A) with $q = 2$ holds, the finite superposition for an integer $m$ is defined as
$$Y_m(t) = \sum_{k=1}^m Y^{(k)}(t), \quad t \in \mathbb{R}. \tag{1.38}$$

(ii) Provided that both assumptions, (A) with $q = 2$ and (B), hold, the infinite superposition is
$$Y_\infty(t) = \sum_{k=1}^\infty Y^{(k)}(t), \quad t \in \mathbb{R}. \tag{1.39}$$

The construction with the infinite superposition is well defined, in the sense of mean-square or almost-sure convergence, provided assumption (B) holds. Although assumption (A) may seem restrictive, it is satisfied in many examples with tractable distributions of superpositions. Section 1.5.3 provides a number of examples where both assumptions (A) and (B) are satisfied, including the gamma, inverse Gaussian and other well-known distributions. Their superpositions have marginal distributions belonging to the same class as the marginal distributions of the components of the superposition.

1.5.2 The covariance structure of supOU processes

1.5.2.1 Finite supOU processes

In the case of a finite superposition, the covariance function of the resulting process is given by
$$R_{Y_m}(t) = cov(Y_m(0), Y_m(t)) = \sum_{k=1}^m Var\big(Y^{(k)}(t)\big)\, e^{-\lambda_k t}.$$
Recall from Section 1.1 that a stochastic process is short-range dependent if its covariance function is integrable, i.e. $\int_{-\infty}^{+\infty} |R_{Y_m}(t)|\, dt < \infty$. Since
$$\int_{-\infty}^{+\infty} |R_{Y_m}(t)|\, dt = \int_{-\infty}^{+\infty} \sum_{k=1}^m Var\big(Y^{(k)}(t)\big)\, e^{-\lambda_k |t|}\, dt < \infty,$$
we may conclude that the finite superposition is a short-range dependent process.

1.5.2.2 Infinite supOU processes

In the case of the infinite superposition, the covariance function is of the form
$$R_{Y_\infty}(t) = cov(Y_\infty(0), Y_\infty(t)) = \sum_{k=1}^\infty Var\big(Y^{(k)}(t)\big)\, e^{-\lambda_k t},$$
and under condition (A) the variance of $Y^{(k)}(t)$ is proportional to $\delta_k$, that is,
$$Var\big(Y^{(k)}(t)\big) = \delta_k C_2,$$
where the constant $C_2$ does not depend on $k$ and reflects the parameters of the marginal distribution of $Y^{(k)}$. If one chooses
$$\delta_k = k^{-(1+2(1-H))}, \qquad \lambda_k = \lambda/k \ \text{ for some } \lambda > 0, \qquad \tfrac12 < H < 1,$$
then the covariance function of the infinite superposition is
$$R_{Y_\infty}(t) = C_2 \sum_{k=1}^\infty \frac{1}{k^{1+2(1-H)}}\, e^{-\lambda t/k}. \tag{1.40}$$
The lemma below shows that the covariance function (1.40) is not integrable for the chosen parameters $\delta_k$ and $\lambda_k$; thus the process obtained via the infinite superposition exhibits long-range dependence.

Lemma 1.5.1. For the infinite superposition of OU type processes satisfying condition (A) with $q = 2$ and condition (B), the covariance function of $Y_\infty(t)$, given by (1.40) with $\lambda^{(k)} = \lambda/k$ and $\delta_k = k^{-(1+2(1-H))}$, $\frac12 < H < 1$, can be written as
$$R_{Y_\infty}(t) = \frac{L(t)}{t^{2(1-H)}}, \quad t > 0,$$
where $L$ is a slowly varying at infinity function.

Proof. The proof of this lemma is essentially the same as the proofs presented for particular cases of superpositions of OU processes in [47], [49]. We provide it here for completeness and for the remark that follows, which will be used in proofs later in the thesis. Let
$$L(t) = C_2\, t^{2(1-H)} \sum_{k=1}^\infty \frac{1}{k^{1+2(1-H)}}\, e^{-\lambda t/k}.$$
Estimate the sum appearing in the expression for $L$ as follows:
$$\int_1^\infty \frac{e^{-\lambda t/u}}{u^{1+2(1-H)}}\, du \ \le\ \sum_{k=1}^\infty \frac{e^{-\lambda t/k}}{k^{1+2(1-H)}} \ \le\ \int_1^\infty \frac{e^{-\lambda t/u}}{u^{1+2(1-H)}}\, du + e^{-\lambda t}.$$
Transform the variables $\lambda t/u = s$ to get
$$\frac{C_2}{\lambda^{2(1-H)}} \int_0^{\lambda t} e^{-s} s^{2(1-H)-1}\, ds \ \le\ L(t) \ \le\ \frac{C_2}{\lambda^{2(1-H)}} \int_0^{\lambda t} e^{-s} s^{2(1-H)-1}\, ds + C_2\, e^{-\lambda t}\, t^{2(1-H)}.$$
Since
$$\int_0^{\lambda t} e^{-s} s^{2(1-H)-1}\, ds \to \Gamma(2(1-H)) \quad \text{as } t \to \infty,$$
it follows that $\lim_{t \to \infty} L(tv)/L(t) = 1$ for any fixed $v > 0$.

Remark 1.5.1. From the proof of Lemma 1.5.1,
$$L([Nt]) \le \frac{C_2}{\lambda^{2(1-H)}} \int_0^{\lambda[Nt]} e^{-s} s^{2(1-H)-1}\, ds + C_2\, e^{-\lambda[Nt]}\, [Nt]^{2(1-H)} \le \frac{C_2}{\lambda^{2(1-H)}}\, \Gamma(2(1-H)) + C_2\, e^{-2(1-H)} \Big( \frac{2(1-H)}{\lambda} \Big)^{2(1-H)}$$
for all $N \ge 1$ and $t \ge 0$, since the function $x^{2(1-H)} e^{-x}$ is bounded (it attains its maximum at $x = 2(1-H)$). Also from the proof of Lemma 1.5.1,
$$L(N) \ge \frac{C_2}{\lambda^{2(1-H)}} \int_0^{\lambda N} e^{-s} s^{2(1-H)-1}\, ds \ge \frac{C_2}{\lambda^{2(1-H)}} \int_0^{\lambda} e^{-s} s^{2(1-H)-1}\, ds$$
for all $N \ge 1$. Note also that $L(0) = 0$. Therefore the ratio $L([Nt])/L(N)$ is bounded uniformly in $N \ge 1$ and $t \ge 0$.
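A numerical illustration of Lemma 1.5.1 (a sketch with illustrative $\lambda$, $H$ and $C_2 = 1$, and the infinite sum truncated at a large $K$): the bounds in the proof show $L(t) \to C_2 \lambda^{-2(1-H)} \Gamma(2(1-H))$ as $t \to \infty$, so $t^{2(1-H)} R_{Y_\infty}(t)$ should flatten out at that constant.

    import numpy as np
    from scipy.special import gamma as Gamma

    lam, H, C2 = 1.0, 0.75, 1.0
    k = np.arange(1, 2_000_000, dtype=float)         # truncation of the series
    w = k ** -(1 + 2 * (1 - H))                      # delta_k

    def L(t):
        return C2 * t ** (2 * (1 - H)) * np.sum(w * np.exp(-lam * t / k))

    target = C2 * lam ** -(2 * (1 - H)) * Gamma(2 * (1 - H))
    for t in (10.0, 100.0, 1000.0):
        print(t, L(t), target)    # L(t) drifts toward the constant: slow variation

The slow approach of $L(t)$ to its limit (and the small truncation error for large $t$) is exactly the slowly varying correction that prevents $R_{Y_\infty}$ from being integrable, i.e. long-range dependence.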
1.5.3 Examples of OU type superpositions

The examples in this section have been discussed in [4, 47]. We briefly present them to illustrate that conditions (A) and (B) are satisfied for a number of OU type processes.

1.5.3.1 Gamma supOU process

The gamma distribution $\Gamma(\alpha, \beta)$, $\alpha > 0$, $\beta > 0$, is defined by the probability density function (pdf)
$$\pi(x) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}\, 1_{[0,\infty)}(x).$$
It is self-decomposable, and hence there is a stationary OU type process $\{Y(t), t \ge 0\}$ with gamma marginal distribution, often referred to in the literature as the stationary gamma OU type process. Its covariance function is
$$R_Y(t) = \frac{\alpha}{\beta^2}\, \exp\{-\lambda|t|\}.$$
The cumulant generating function is
$$\kappa_Y(z) = \log E e^{izY(t)} = -\alpha \log\Big( 1 - \frac{iz}{\beta} \Big) = \sum_{m=1}^\infty \frac{\alpha (iz)^m}{m \beta^m}, \quad |z| < \beta, \tag{1.41}$$
so that
$$E\, Y(t) = \frac{\alpha}{\beta}, \qquad Var\, Y(t) = \frac{\alpha}{\beta^2}.$$
Let us look at the superposition of stationary gamma OU type processes. Suppose $\{Y^{(k)}(t), t \ge 0\}$, $k \ge 1$, are independent stationary gamma OU type processes with marginals $\Gamma(\alpha_k, \beta)$, $k \ge 1$, where
$$\alpha_k = \alpha\, k^{-(1+2(1-H))}, \qquad \tfrac12 < H < 1.$$
It follows from our setup and (1.41) that the processes $\{Y^{(k)}(t), t \ge 0\}$, $k \ge 1$, satisfy condition (A) with the choice $\delta_k = \alpha_k$. In addition, since $\sum_{k=1}^\infty \alpha_k < \infty$, condition (B) is satisfied as well; hence the infinite superposition of stationary gamma OU type processes is well defined. Thus the supOU process
$$Y_\infty(t) = \sum_{k=1}^\infty Y^{(k)}(t), \quad t \ge 0,$$
has marginal $\Gamma\big( \sum_{k=1}^\infty \alpha_k,\ \beta \big)$ distribution, and the covariance function is of the form
$$R_{Y_\infty}(t) = \frac{1}{\beta^2} \sum_{k=1}^\infty \alpha_k\, e^{-\lambda^{(k)} t}.$$

Remark 1.5.2. With the above choice of $\delta_k$ and the particular choice $\lambda^{(k)} = \lambda/k$, we obtain a long range dependent stationary gamma supOU process.

1.5.3.2 Inverse Gaussian supOU process

The inverse Gaussian distribution $IG(\delta, \gamma)$, $\delta > 0$, $\gamma > 0$, with pdf
$$\pi(x) = \frac{\delta e^{\delta\gamma}}{\sqrt{2\pi}}\, \frac{1}{x^{3/2}}\, \exp\Big\{ -\frac12 \Big( \frac{\delta^2}{x} + \gamma^2 x \Big) \Big\}\, 1_{[0,\infty)}(x), \tag{1.42}$$
is self-decomposable, so there exists a stationary OU type process $\{Y(t), t \ge 0\}$ with $IG(\delta, \gamma)$ marginal distribution and correlation function $\exp\{-\lambda|t|\}$. The cumulant generating function is
$$\kappa_Y(z) = \log E e^{izY(t)} = \delta\Big( \gamma - \sqrt{\gamma^2 - 2iz} \Big) = \sum_{m=1}^\infty \frac{\delta\, (2m)!\, (iz)^m}{(2m-1)(m!)^2\, 2^m\, \gamma^{2m-1}},$$
and
$$E\, Y(t) = \frac{\delta}{\gamma}, \qquad Var\, Y(t) = \frac{\delta}{\gamma^3}.$$
It follows that independent stationary OU type processes $\{Y^{(k)}(t), t \ge 0\}$, $k \ge 1$, with marginals $IG(\delta_k, \gamma)$, $k \ge 1$, where
$$\delta_k = \delta\, k^{-(1+2(1-H))}, \qquad \tfrac12 < H < 1,$$
satisfy conditions (A) and (B), and we obtain the inverse Gaussian supOU process
$$Y_\infty(t) = \sum_{k=1}^\infty Y^{(k)}(t), \quad t \ge 0,$$
with marginal $IG\big( \sum_{k=1}^\infty \delta_k,\ \gamma \big)$ distribution. The covariance function is
$$R_{Y_\infty}(t) = \frac{1}{\gamma^3} \sum_{k=1}^\infty \delta_k\, \exp\{-\lambda_k |t|\}.$$
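A simulation sketch for the gamma supOU construction above (assumptions: the standard compound Poisson recursion for the skeleton of a stationary gamma OU process, cf. the worked example after Remark 1.4.7; a truncated superposition with $K$ components standing in for the infinite sum; illustrative parameter values):

    import numpy as np

    rng = np.random.default_rng(8)

    def gamma_ou_skeleton(alpha, beta, lam, n):
        """Stationary Gamma(alpha, beta) OU type process sampled at integer times."""
        y = np.empty(n)
        y[0] = rng.gamma(alpha, 1 / beta)                  # stationary start
        for i in range(1, n):
            m = rng.poisson(lam * alpha)                   # BDLP jumps over (i-1, i]
            jumps = rng.exponential(1 / beta, m) * np.exp(-lam * (1 - rng.uniform(size=m)))
            y[i] = np.exp(-lam) * y[i - 1] + jumps.sum()
        return y

    alpha, beta, lam, H = 1.0, 1.0, 1.0, 0.75
    K, n = 100, 2_000                                      # truncation level, path length
    alphas = alpha * np.arange(1, K + 1) ** -(1 + 2 * (1 - H))   # alpha_k
    lams = lam / np.arange(1, K + 1)                             # lambda_k = lam / k
    Y = sum(gamma_ou_skeleton(a, beta, l, n) for a, l in zip(alphas, lams))
    print(Y.mean(), alphas.sum() / beta)   # marginal mean of Gamma(sum alpha_k, beta)

The sample mean of the superposed path matches $\sum_k \alpha_k / \beta$, in line with the closure of the gamma family under convolution used in condition (A).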
1.5.3.3 Variance gamma supOU process

The variance gamma distribution $VG(\kappa, \alpha, \beta, \mu)$, with parameters
$$\kappa > 0, \quad \alpha > |\beta| > 0, \quad \mu \in \mathbb{R}, \quad \gamma^2 = \alpha^2 - \beta^2,$$
is defined by the pdf
$$\pi(x) = \frac{\gamma^{2\kappa}}{\sqrt{\pi}\, \Gamma(\kappa)\, (2\alpha)^{\kappa - 1/2}}\, |x-\mu|^{\kappa - 1/2}\, K_{\kappa - 1/2}(\alpha|x-\mu|)\, e^{\beta(x-\mu)}, \quad x \in \mathbb{R},$$
where $K_\lambda(z)$ denotes the modified Bessel function of the second kind. Since the variance gamma distribution is self-decomposable, the corresponding stationary OU type process, denoted $\{Y(t), t \ge 0\}$, exists, and its cumulant generating function is given by
$$\kappa_Y(z) = \log E e^{izY(t)} = i\mu z + 2\kappa \log \frac{\gamma}{\sqrt{\alpha^2 - (\beta + iz)^2}}, \quad |\beta + z| < \alpha.$$
It follows that the $VG(\kappa, \alpha, \beta, \mu)$ distribution is closed under convolution with respect to the parameters $\kappa$ and $\mu$. Independent stationary OU type processes $\{Y^{(k)}(t), t \ge 0\}$, $k \ge 1$, with marginals $VG(\kappa_k, \alpha, \beta, \mu_k)$, $k \ge 1$, where $\sum_{k=1}^\infty \mu_k$ converges and
$$\kappa_k = \kappa\, k^{-(1+2(1-H))}, \qquad \tfrac12 < H < 1,$$
satisfy conditions (A) and (B), and we obtain the variance gamma supOU process
$$Y_\infty(t) = \sum_{k=1}^\infty Y^{(k)}(t), \quad t \ge 0,$$
with marginal $VG\big( \sum_{k=1}^\infty \kappa_k,\ \alpha,\ \beta,\ \sum_{k=1}^\infty \mu_k \big)$ distribution. The covariance function is of the form
$$R_{Y_\infty}(t) = \frac{2}{\gamma^2}\Big( 1 + \frac{2\beta^2}{\gamma^2} \Big) \sum_{k=1}^\infty \kappa_k\, \exp\{-\lambda_k |t|\}.$$

1.5.3.4 Normal inverse Gaussian supOU process

The normal inverse Gaussian distribution $NIG(\alpha, \beta, \delta, \mu)$, with parameters
$$\alpha \ge |\beta| \ge 0, \quad \delta > 0, \quad \mu \in \mathbb{R},$$
is defined by the pdf
$$\pi(x) = \frac{\alpha\delta}{\pi}\, e^{\delta\gamma}\, \frac{K_1\big( \alpha\sqrt{\delta^2 + (x-\mu)^2} \big)}{\sqrt{\delta^2 + (x-\mu)^2}}\, e^{\beta(x-\mu)}, \quad x \in \mathbb{R},$$
where $K_\lambda(z)$ denotes the modified Bessel function of the second kind. Since it is self-decomposable, the corresponding stationary OU type process $\{Y(t), t \ge 0\}$ exists, and its cumulant generating function is given by
$$\kappa_Y(z) = \log E e^{izY(t)} = i\mu z + \delta\Big( \sqrt{\alpha^2 - \beta^2} - \sqrt{\alpha^2 - (\beta + iz)^2} \Big), \quad |\beta + z| < \alpha.$$
It follows that the $NIG(\alpha, \beta, \delta, \mu)$ distribution is closed under convolution with respect to the parameters $\delta$ and $\mu$. Independent stationary OU type processes $\{Y^{(k)}(t), t \ge 0\}$, $k \ge 1$, with marginals $NIG(\alpha, \beta, \delta_k, \mu_k)$, $k \ge 1$, with convergent $\sum_{k=1}^\infty \mu_k$ and $\delta_k$ given as
$$\delta_k = \delta\, k^{-(1+2(1-H))}, \qquad \tfrac12 < H < 1,$$
satisfy conditions (A) and (B), and we obtain the normal inverse Gaussian supOU process
$$Y_\infty(t) = \sum_{k=1}^\infty Y^{(k)}(t), \quad t \ge 0,$$
with marginal $NIG\big( \alpha,\ \beta,\ \sum_{k=1}^\infty \delta_k,\ \sum_{k=1}^\infty \mu_k \big)$ distribution. The covariance function is given by
$$R_{Y_\infty}(t) = \frac{\alpha^2}{(\alpha^2 - \beta^2)^{3/2}} \sum_{k=1}^\infty \delta_k\, \exp\{-\lambda_k |t|\}.$$

1.5.3.5 Tempered stable supOU process

The positive tempered stable distribution $TS(\kappa, \delta, \gamma)$, $\kappa \in (0,1)$, $\delta > 0$, $\gamma > 0$, is self-decomposable; hence there exists a stationary OU type process $\{Y(t), t \ge 0\}$ with $TS(\kappa, \delta, \gamma)$ as its marginal distribution. The distribution of $Y(t)$ can be specified through its cumulant generating function, given by
$$\kappa_Y(z) = \log E e^{izY(t)} = \delta\gamma - \delta\big( \gamma^{1/\kappa} - 2iz \big)^{\kappa}.$$
As in the preceding examples, with $\delta_k = \delta\, k^{-(1+2(1-H))}$, $\frac12 < H < 1$, one obtains a tempered stable supOU process satisfying conditions (A) and (B).

Chapter 2

Intermittency of supOU processes

Intermittency is classically defined through moment Lyapunov exponents: for a random field $\{\psi(t), t \ge 0\}$, set
$$\gamma(k) = \lim_{t \to \infty} \frac{\log E|\psi(t)|^k}{t}, \tag{2.1}$$
and call $\psi$ intermittent if $k \mapsto \gamma(k)/k$ is strictly increasing. Partial sums
$$S(t) = \sum_{i=1}^{[t]} \xi_i \tag{2.2}$$
of positive i.i.d. random variables $\xi_i$ with finite moments are not intermittent in this sense, since $\log E\, S(t)^k$ grows only logarithmically in $t$, so that $\gamma(k) = 0$ for every $k$. By contrast, for the product $\psi_n = \xi_1 \cdots \xi_n$ of i.i.d. positive non-degenerate random variables, $E \psi_n^k = (E\xi_1^k)^n$, so that $\gamma(k) = \log E\xi_1^k$; by Jensen's inequality, for $0 < k < l$, $E\xi_1^k < \big( E\xi_1^l \big)^{k/l}$, showing that $\gamma(k)/k$ is strictly increasing. The wild growth of moments of $\psi_n$ provides the main heuristic argument that intermittency implies unusual limiting behavior. A formal argument showing that under some assumptions intermittency implies large peaks in the space coordinate of the random field can be found in [44], some ideas of which will be used later in this work.

By far the most investigated model exhibiting intermittent behavior is the parabolic Anderson model (see [29-32]). In this work we consider models provided by the partial sums of discrete superpositions of Levy driven Ornstein-Uhlenbeck (OU) type processes. While models based on Levy flights describe the position of a particle, models given by OU dynamics describe the velocity of a particle trapped in a field generated by a quadratic potential ([20]). Applications of Levy-driven OU type processes include financial econometrics [8, 48, 50], fluid dynamics [66], plasma physics [15] and biology [59]. The stochastic model discussed in this thesis provides another example of an intermittency model based on velocity (see [27, Section 8.5]).
First, we modify the preceding definition of intermittency to tailor it to the analysis of sequences of partial sum processes. In the case of finite superpositions we show that the central limit theorem holds. In the case of infinite long range dependent superpositions, we show that the growth of cumulants is such that the partial sum process is intermittent. Section 1.5.3 contains examples that fit our assumptions, which cover, to our knowledge, all the examples with tractable distributions of superpositions.

2.1 Intermittency

For a process $\{Y(t), t \ge 0\}$, denote $\bar{q} = \sup\{q > 0 : E|Y(t)|^q < \infty \ \forall t\}$. Our definition of intermittency is based on a version of the Lyapunov exponent that replaces $t$ in the denominator of (2.1) with $\log t$. For a stochastic process $\{Y(t), t \ge 0\}$, define the scaling function at a point $q \in [0, \bar{q})$ as
$$\tau(q) = \lim_{t \to \infty} \frac{\log E|Y(t)|^q}{\log t}, \tag{2.3}$$
assuming the limit exists and is finite for every $q \in [0, \bar{q})$. Objects similar to the scaling function (2.3) appear in the theory of multifractal processes (see e.g. [33]); however, there are some important differences [45]. The following proposition gives some properties of $\tau$.

Proposition 2.1.1. The scaling function $\tau$ defined by (2.3) has the following properties:
(i) $\tau$ is non-decreasing, and so is $q \mapsto \tau(q)/q$;
(ii) $\tau$ is convex;
(iii) if for some $0 < p < r < \bar{q}$, $\tau(p)/p < \tau(r)/r$, then there is a $q \in (p, r)$ such that $\tau(p)/p < \tau(q)/q < \tau(r)/r$.

Proof. (i) For $0 \le q_1 < q_2 < \bar{q}$, Jensen's inequality implies
$$E|Y(t)|^{q_1} = E\big( |Y(t)|^{q_2} \big)^{q_1/q_2} \le \big( E|Y(t)|^{q_2} \big)^{q_1/q_2},$$
and thus
$$\tau(q_1) \le \frac{q_1}{q_2}\, \tau(q_2),$$
proving part (i).

(ii) Take $0 \le q_1 < q_2 < \bar{q}$ and $w_1, w_2 \ge 0$ such that $w_1 + w_2 = 1$. It follows from Holder's inequality that
$$E|Y(t)|^{w_1 q_1 + w_2 q_2} \le \big( E|Y(t)|^{q_1} \big)^{w_1} \big( E|Y(t)|^{q_2} \big)^{w_2}.$$
Taking logarithms, dividing by $\log t$ for $t > 1$ and letting $t \to \infty$, we have $\tau(w_1 q_1 + w_2 q_2) \le w_1 \tau(q_1) + w_2 \tau(q_2)$.

(iii) This is clear, since $q \mapsto \tau(q)/q$ is continuous by (ii).

We now define intermittency for a stochastic process, and for a sequence of random variables via the corresponding partial sum process.

Definition 2.1.1. A stochastic process $\{Y(t), t \ge 0\}$ is intermittent if there exist $p, r \in (0, \bar{q})$ such that
$$\frac{\tau(p)}{p} < \frac{\tau(r)}{r}.$$

Later in the thesis, we will investigate intermittency of a stationary sequence of random variables $\{Y_i, i \in \mathbb{N}\}$ with finite mean. In this sense, intermittency will be understood as intermittency of the centered partial sum process
$$S(t) = \sum_{i=1}^{[t]} Y_i - \sum_{i=1}^{[t]} E Y_i, \quad t \ge 0.$$
Proposition 2.1.1(i) shows that the function $q \mapsto \tau(q)/q$ is always non-decreasing. What makes a process intermittent is the existence of points of strict increase. In the next two sections, we connect this property to the limiting behavior of cumulants of partial sums of superpositions of Ornstein-Uhlenbeck type processes. We show that while the partial sums of finite superpositions obey the central limit theorem, partial sums of infinite long-range dependent superpositions provide examples of intermittent processes.

Remark 2.1.1. To conclude this section, we note that intermittent sequences exhibit erratic behavior which may preclude the usual central limit theorem type of results. Take, for example, the case of positive i.i.d. random variables, see (2.2), which we already showed are not intermittent: it is well known that they obey central limit theorem type results if properly normalized. On the contrary, for an intermittent sequence such a normalization does not exist. We will see this in the next two sections for partial sums of supOU processes: the partial sums of finite superpositions of Ornstein-Uhlenbeck type processes obey the central limit theorem, while partial sums of infinite long-range dependent superpositions provide examples of intermittent processes and most likely do not obey CLT type results.
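A numerical sketch of the scaling function (2.3) (the function name and estimation scheme are our own illustration: regress $\log E|S(t)|^q$ on $\log t$ over a grid of times, from many simulated paths):

    import numpy as np

    def empirical_scaling_function(paths, qs):
        """Estimate tau(q) in (2.3): slope of log E|S(t)|^q against log t.

        paths: array (n_paths, T) of centered partial-sum trajectories."""
        n_paths, T = paths.shape
        ts = np.unique(np.logspace(1, np.log10(T), 30).astype(int)) - 1
        taus = []
        for q in qs:
            moments = np.mean(np.abs(paths[:, ts]) ** q, axis=0)
            taus.append(np.polyfit(np.log(ts + 1), np.log(moments), 1)[0])
        return np.array(taus)

    # Sanity check on a non-intermittent case: centered i.i.d. partial sums,
    # where E|S(t)|^q ~ t^{q/2}, so tau(q)/q should be flat at 1/2.
    rng = np.random.default_rng(2)
    paths = np.cumsum(rng.standard_normal((1000, 10_000)), axis=1)
    qs = np.array([1.0, 2.0, 3.0, 4.0])
    print(empirical_scaling_function(paths, qs) / qs)   # approx [0.5, 0.5, 0.5, 0.5]

For an intermittent sequence, the same estimator would produce a strictly increasing $\tau(q)/q$, which is what Corollary 2.3.5 below establishes analytically for infinite supOU superpositions.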
2.2 Limiting properties of partial sums of finite supOU processes

We begin with the limit distribution of the partial sum process for the finite superposition. For $t > 0$, define the partial sum process
$$S_m(t) = \sum_{i=1}^{[t]} Y_m(i), \tag{2.4}$$
where the supOU process at time $i$ is given by $Y_m(i) = \sum_{k=1}^m Y^{(k)}(i)$, $i \in \mathbb{Z}$.

In the following we give a brief proof of the asymptotic normality of the partial sum process, which is easy to establish using the strong mixing property ($\alpha$-mixing) of the OU type processes. The strong mixing property was established in the work of Jongbloed, Van Der Meulen and Van Der Vaart, see [42]. They showed that the integrability condition on the Levy measure of the BDLP, $\int_{|x|>2} \log|x|\, \nu(dx) < \infty$, given in (1.27), suffices for $\beta$-mixing, which in turn implies the strong mixing property of the stationary OU type process. The $\beta$-mixing of general, multi-dimensional OU type processes is treated in Masuda, see [55]. He made a stronger assumption than (1.27), related to the existence of absolute moments of order $\alpha$ of the marginal distribution: Masuda assumed that the process is strictly stationary and that
$$\int_{\mathbb{R}} |x|^\alpha\, \pi(dx) < \infty \quad \text{for some } \alpha > 0$$
(here $\pi$ is the unique invariant probability measure), which yields a stronger result than $\beta$-mixing alone: the $\beta$-mixing coefficient converges to zero at an exponential rate, i.e. for some $a > 0$, $\beta_Y(t) = O(e^{-at})$ as $t \to \infty$. From this stronger assumption the asymptotic normality of the partial sum process follows directly, see below. Note that asymptotic normality of partial sums was previously reported for inverse Gaussian and gamma finite superpositions [48]; the result below is a straightforward generalization to a more general class of processes.

Theorem 2.2.1. For a fixed integer $m > 0$, let $Y_m$ be defined by (1.38), where the OU type processes $\{Y^{(k)}, k = 1, \dots, m\}$ defined by (1.36) are independent and $E|Y^{(k)}|^{2+d} < \infty$ for some $d > 0$ and all $k = 1, \dots, m$. Then the partial sum process (2.4), centered and appropriately normed, converges to Brownian motion:
$$\frac{1}{c_m N^{1/2}} \big( S_m([Nt]) - E S_m([Nt]) \big) \to B(t), \quad t \in [0, 1],$$
as $N \to \infty$, in the sense of weak convergence in the Skorokhod space $D[0,1]$. The norming constant $c_m$ is given by
$$c_m = \left( \sum_{k=1}^m Var\big( Y^{(k)}(0) \big)\, \frac{1 + e^{-\lambda^{(k)}}}{1 - e^{-\lambda^{(k)}}} \right)^{1/2}.$$

Proof. Since each OU type process in the superposition has a finite second moment, $\beta$-mixing (absolute regularity, see Appendix A) holds for each OU process with exponential rate: there exists $a_k > 0$ such that the mixing coefficient satisfies $\beta_{Y^{(k)}}(t) = O(e^{-a_k t})$ [55, Theorem 4.3]. Denote by $\alpha^{(k)}(t)$ the strong mixing coefficient of the process $Y^{(k)}$; since $\beta$-mixing implies $\alpha$-mixing [13], we get
$$2\alpha^{(k)}(t) \le \beta^{(k)}(t) \le D_k e^{-a_k t}$$
for a constant $D_k$, for each $k = 1, \dots, m$. A finite sum of $\alpha$-mixing processes with exponentially decaying mixing coefficients is again $\alpha$-mixing with exponentially decaying mixing coefficient, see again [13]; therefore, weak convergence of the partial sums of the process $Y_m$ in $D[0,1]$ follows from Davydov [18, Theorem 4.2] (see Appendix A).
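A simulation sketch checking Theorem 2.2.1 for a finite gamma supOU (assumptions: the exact gamma OU skeleton from the example in Section 1.5.3.1, $m = 3$ components with illustrative parameters, and the long-run variance constant $c_m^2$ as stated in the theorem):

    import numpy as np

    rng = np.random.default_rng(3)

    def gamma_ou_path(alpha, beta, lam, n):
        """Exact skeleton of a stationary Gamma(alpha, beta) OU process."""
        y = np.empty(n)
        y[0] = rng.gamma(alpha, 1 / beta)
        for i in range(1, n):
            m = rng.poisson(lam * alpha)
            jumps = rng.exponential(1 / beta, m) * np.exp(-lam * (1 - rng.uniform(size=m)))
            y[i] = np.exp(-lam) * y[i - 1] + jumps.sum()
        return y

    alphas, beta = [1.0, 0.5, 0.25], 1.0        # m = 3 components (illustrative)
    lams = [0.3, 0.6, 0.9]
    N, n_rep = 1_000, 300
    Z = np.empty(n_rep)
    for r in range(n_rep):
        S = sum(gamma_ou_path(a, beta, l, N).sum() for a, l in zip(alphas, lams))
        Z[r] = (S - N * sum(alphas) / beta) / np.sqrt(N)

    c2 = sum((a / beta**2) * (1 + np.exp(-l)) / (1 - np.exp(-l))
             for a, l in zip(alphas, lams))
    print(Z.var(), c2)    # sample variance of normalized sums vs. c_m^2 (approx. equal)

The agreement of the two printed values reflects the AR(1) long-run variance $Var(Y^{(k)})(1+\rho_k)/(1-\rho_k)$, $\rho_k = e^{-\lambda^{(k)}}$, summed over the independent components.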
2.3 Limiting properties of partial sums of infinite supOU processes

Before presenting the results of this section, we briefly recall an important theorem related to regular variation, defined in the Preliminaries (Section 1.1). There we defined a regularly varying (RV) function as a function that asymptotically behaves like a power function. Karamata's theorem, stated below, says that a $\rho$-regularly varying function integrates the way one would expect a power function to integrate: for the purpose of integration, a $\rho$-varying function behaves roughly like $x^\rho$. A brief explanation is that any RV function $U \in RV_\rho$ can be represented as $U(x) = x^\rho L(x)$, where $L(x)$ is a slowly varying function; since a slowly varying function asymptotically behaves like a constant, one may guess that it can be passed through the integral, leaving only a power function to integrate:
$$\int_0^x U(t)\, dt = \int_0^x t^\rho L(t)\, dt \sim L(x) \int_0^x t^\rho\, dt = L(x)\, \frac{x^{\rho+1}}{\rho+1}.$$

Theorem 2.3.1 (Karamata's theorem). (i) Suppose $\rho \ge -1$ and $U \in RV_\rho$. Then $\int_0^x U(t)\,dt \in RV_{\rho+1}$ and
$$\lim_{x \to \infty} \frac{x\, U(x)}{\int_0^x U(t)\, dt} = \rho + 1.$$
(ii) Suppose $\rho < -1$. Then $U \in RV_\rho$ implies that $\int_x^\infty U(t)\,dt$ is finite, $\int_x^\infty U(t)\,dt \in RV_{\rho+1}$, and
$$\lim_{x \to \infty} \frac{x\, U(x)}{\int_x^\infty U(t)\, dt} = -\rho - 1.$$

Now we can continue with the limiting properties of the infinite superposition process. For $t > 0$, consider the partial sum process of the infinite supOU process,
$$S_\infty(t) = \sum_{i=1}^{[t]} Y_\infty(i). \tag{2.5}$$
We now proceed with the limit distribution of the partial sum process for the infinite superposition (1.39). The variance of this process was computed in [47, Equation (5.3)]; however, the result on the asymptotic normality of the partial sum process [47, Theorem 3] was not correct. Also incorrect was statement (30) of [6, Theorem 5]. Here we provide the derivation of the variance and correct the result on the limit distribution.

Lemma 2.3.2. For the infinite superposition (1.39) of OU type processes satisfying condition (A) with $q = 2$ and condition (B), set $\lambda^{(k)} = \lambda/k$ and $\delta_k = k^{-(1+2(1-H))}$, $\frac12 < H < 1$. Then
$$Var\big( S_\infty([Nt]) \big) = \frac{L(N)\, [Nt]^{2H}}{H(2H-1)}\, (1 + o(1)), \quad \text{as } N \to \infty, \tag{2.6}$$
where $L$ is a slowly varying at infinity function.

Proof. Using the expression for the covariance function of the infinite superposition from Lemma 1.5.1, write
$$Var\big( S_\infty([Nt]) \big) = \sum_{m,n=1}^{[Nt]} cov\big( Y_\infty(m), Y_\infty(n) \big) = [Nt]\, Var\big( Y_\infty(1) \big) + 2 \sum_{\substack{m,n=1 \\ m > n}}^{[Nt]} \frac{L(m-n)}{(m-n)^{2(1-H)}} = C_2 [Nt]\, \zeta(1+2(1-H)) + 2 \sum_{j=1}^{[Nt]-1} ([Nt]-j)\, \frac{L(j)}{j^{2(1-H)}},$$
where $\zeta(\cdot)$ is Riemann's zeta function. The sum appearing in the expression for the variance,
$$\sum_{j=1}^{[Nt]-1} ([Nt]-j)\, \frac{L(j)}{j^{2(1-H)}},$$
is a Riemann sum for the following integral:
$$\int_0^1 ([Nt]-[Nt]u)\, \frac{L([Nt]u)}{([Nt]u)^{2(1-H)}}\, [Nt]\, du = [Nt]^{2H} \int_0^1 (1-u)\, u^{2H-2}\, L([Nt]u)\, du.$$
Consider the integral
$$\int_0^1 u^{2H-2}\, L([Nt]u)\, du = \frac{1}{[Nt]^{2H-1}} \int_0^{[Nt]} v^{2H-2}\, L(v)\, dv,$$
and apply Karamata's theorem [58, Theorem 2.1] to get
$$\int_0^{[Nt]} v^{2H-2}\, L(v)\, dv = \frac{L(N)\, [Nt]^{2H-1}}{2H-1}\, (1+o(1)) \quad \text{as } N \to \infty.$$
Similarly,
$$\int_0^1 u^{2H-1}\, L([Nt]u)\, du = \frac{L(N)}{2H}\, (1+o(1)) \quad \text{as } N \to \infty,$$
and therefore
$$\int_0^1 ([Nt]-[Nt]u)\, \frac{L([Nt]u)}{([Nt]u)^{2(1-H)}}\, [Nt]\, du = \frac{L(N)\, [Nt]^{2H}}{2H(2H-1)}\, (1+o(1)).$$
Since $L([Nt]u)/L(N) \to 1$ as $N \to \infty$ uniformly on bounded intervals [64, Theorem 1.1], the uniform convergence implies that
$$\int_0^1 (1-u)\, u^{2H-2}\, \frac{L([Nt]u)}{L(N)}\, du = \int_0^1 (1-u)\, u^{2H-2}\, du\, (1+o(1)) = \frac{1}{2H(2H-1)}\, (1+o(1)).$$
For $\frac12 < H < 1$, the second term in the expression for the variance of $S_\infty([Nt])$ dominates the first, and (2.6) follows.

In order to characterize the limit distribution of the partial sums of the infinite superposition, we use the representation of the discretized stationary OU process as a first-order autoregressive sequence
$$Y^{(k)}(i) = e^{-\lambda_k}\, Y^{(k)}(i-1) + W^{(k)}(i), \tag{2.7}$$
where $W^{(k)}(i)$ is independent of $Y^{(k)}(j)$ for all $j < i$. Denote $\rho_k = e^{-\lambda_k}$. The following lemma provides a useful representation of the partial sum process for the infinite superposition.

Lemma 2.3.3. The centered partial sums of the superposition of processes satisfying condition (A) with $q = 2$ and condition (B), with $\lambda^{(k)} = \lambda/k$ and $\delta_k = k^{-(1+2(1-H))}$, $\frac12 < H < 1$, can be written as
$$S_\infty([Nt]) - E S_\infty([Nt]) = \sum_{k=1}^\infty b^{(k)}_{[Nt]}\, \tau^{(k)}(0) + \sum_{j=1}^{[Nt]} \sum_{k=1}^\infty a^{(k)}_{[Nt]-j}\, V^{(k)}(j), \tag{2.8}$$
where $\tau^{(k)}(0)$, $V^{(k)}(j)$ are independent for different $k$, and for each $k$ the $V^{(k)}(j)$ are independent for different $j$ and also independent of $\tau^{(k)}(0)$. The series in (2.8) converge almost surely, and the coefficients are given by
$$b^{(k)}_{[Nt]} = \sum_{i=1}^{[Nt]} \rho_k^i = \frac{\rho_k \big( 1 - \rho_k^{[Nt]} \big)}{1 - \rho_k} \tag{2.9}$$
and
$$a^{(k)}_{[Nt]-j} = \sum_{i=0}^{[Nt]-j} \rho_k^i = \frac{1 - \rho_k^{[Nt]-j+1}}{1 - \rho_k}. \tag{2.10}$$

Proof. Center the variables, $\tau^{(k)}(i) = Y^{(k)}(i) - E Y^{(k)}(i)$, $V^{(k)}(i) = W^{(k)}(i) - E W^{(k)}(i)$, to arrive at the centered version of (2.7),
$$\tau^{(k)}(i) = \rho_k\, \tau^{(k)}(i-1) + V^{(k)}(i). \tag{2.11}$$
Iterate (2.11) to obtain
$$\tau^{(k)}(i) = \rho_k^i\, \tau^{(k)}(0) + \sum_{j=1}^i \rho_k^{i-j}\, V^{(k)}(j).$$
Now the partial sum of $\tau^{(k)}$ can be written as
$$\sum_{i=1}^{[Nt]} \tau^{(k)}(i) = \tau^{(k)}(0) \sum_{i=1}^{[Nt]} \rho_k^i + \sum_{i=1}^{[Nt]} \sum_{j=1}^{i} \rho_k^{i-j}\, V^{(k)}(j) = \tau^{(k)}(0) \sum_{i=1}^{[Nt]} \rho_k^i + \sum_{j=1}^{[Nt]} \Big( \sum_{i=j}^{[Nt]} \rho_k^{i-j} \Big) V^{(k)}(j) = \tau^{(k)}(0) \sum_{i=1}^{[Nt]} \rho_k^i + \sum_{j=1}^{[Nt]} \Big( \sum_{m=0}^{[Nt]-j} \rho_k^{m} \Big) V^{(k)}(j) = b^{(k)}_{[Nt]}\, \tau^{(k)}(0) + \sum_{j=1}^{[Nt]} a^{(k)}_{[Nt]-j}\, V^{(k)}(j),$$
where the coefficients are given by (2.9) and (2.10). Note that for different $j$ the $V^{(k)}(j)$ are independent due to (2.7), and they are also independent of $\tau^{(k)}(0)$; for different $k$, independence follows from the independence of the OU type processes $Y^{(k)}$. Summing with respect to $k$ completes the derivation of (2.8), provided that the series in (2.8) converge almost surely.

Series convergence holds because the terms have zero mean and the series of second moments converge. Indeed, for the first series in (2.8),
$$\sum_{k=1}^\infty \big( b^{(k)}_{[Nt]} \big)^2\, E\big( \tau^{(k)}(0) \big)^2 = C_2 \sum_{k=1}^\infty \big( b^{(k)}_{[Nt]} \big)^2\, \delta_k = C_2 \sum_{k=1}^\infty \sum_{j,i=1}^{[Nt]} \rho_k^{i+j}\, \delta_k = \sum_{j,i=1}^{[Nt]} \frac{L(i+j)}{(i+j)^{2(1-H)}}.$$
The sum can be viewed as a Riemann sum for the double integral:
$$\sum_{j,i=1}^{[Nt]} \frac{L(i+j)}{(i+j)^{2(1-H)}} = [Nt]^2 \int_0^1 \!\! \int_0^1 \frac{L([Nt](x+y))}{([Nt](x+y))^{2(1-H)}}\, dx\, dy\, (1+o(1)) = \frac{[Nt]^2\, L(N)}{[Nt]^{2(1-H)}} \int_0^1 \!\! \int_0^1 \frac{dx\, dy}{(x+y)^{2(1-H)}}\, (1+o(1))$$
as $N \to \infty$. The last equality is justified using Karamata's theorem as in Lemma 2.3.2, or by considering
$$\frac{1}{L(N)} \int_{x=\epsilon}^1 \int_{y=0}^1 \frac{L([Nt](x+y))}{(x+y)^{2(1-H)}}\, dx\, dy$$
and using Remark 1.5.1 and the dominated convergence theorem; more precisely, using the fact that $L([Nt](x+y))/L(N)$ converges to 1 uniformly in $x+y \in [\epsilon, 1]$ by [64, Theorem 1.1], and letting $\epsilon \to 0$. Therefore the variance of the first series in (2.8) is of order $L(N)\, N^{2H}$.
For the second term in (2.8), the series of the variances is
$$\sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \big( a^{(k)}_{[Nt]-j} \big)^2\, E\big( V^{(k)}(j) \big)^2 = \sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \Big( \sum_{i=0}^{[Nt]-j} \rho_k^i \Big)^2 (1-\rho_k^2)\, C_2\, \delta_k,$$
since $Var(V^{(k)}(j)) = (1-\rho_k^2)\, Var(\tau^{(k)})$. The series of variances becomes
$$\sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \sum_{i_1,i_2=0}^{[Nt]-j} \rho_k^{i_1+i_2} (1-\rho_k^2)\, C_2\, \delta_k = \sum_{j=1}^{[Nt]} \sum_{i_1,i_2=0}^{[Nt]-j} \left( \frac{L(i_1+i_2)}{(i_1+i_2)^{2(1-H)}} - \frac{L(i_1+i_2+2)}{(i_1+i_2+2)^{2(1-H)}} \right)$$
$$= [Nt]^3 \int_{x=0}^1 \int_{y=0}^{1-x} \int_{z=0}^{1-x} \left( \frac{L([Nt](y+z))}{([Nt](y+z))^{2(1-H)}} - \frac{L([Nt](y+z)+2)}{([Nt](y+z)+2)^{2(1-H)}} \right) dx\, dy\, dz\; (1+o(1)).$$
Arguing in the same way as for the variance of the first term in (2.8), we have
$$\sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \sum_{i_1,i_2=0}^{[Nt]-j} \rho_k^{i_1+i_2} (1-\rho_k^2)\, C_2\, \delta_k = [Nt]^{2H+1} L(N) \int_{x=0}^1 \int_{y=0}^{1-x} \int_{z=0}^{1-x} \left( \frac{1}{(y+z)^{2(1-H)}} - \frac{1}{(y+z+2/[Nt])^{2(1-H)}} \right) dx\, dy\, dz\; (1+o(1))$$
$$= \frac{[Nt]^{2H} L(N)}{2H-1} \int_{x=0}^1 \int_{y=0}^{1-x} [Nt] \Big( (y+2/[Nt])^{2H-1} - y^{2H-1} - \big( (y+1-x+2/[Nt])^{2H-1} - (y+1-x)^{2H-1} \big) \Big)\, dx\, dy\; (1+o(1)).$$
It is not hard to see that as $[Nt] \to \infty$ the integrand converges to
$$2(2H-1)\big( y^{2H-2} - (y+1-x)^{2H-2} \big).$$
Also, by the mean value theorem,
$$[Nt]\big( (y+2/[Nt])^{2H-1} - y^{2H-1} \big) = 2(2H-1)\, \theta^{2H-2} \le 2(2H-1)\, y^{2H-2}$$
for some $\theta \in (y, y+2/[Nt])$. A similar integrable bound holds for the second difference in the integrand above, and the dominated convergence theorem yields
$$\sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \sum_{i_1,i_2=0}^{[Nt]-j} \rho_k^{i_1+i_2} (1-\rho_k^2)\, C_2\, \delta_k = 2[Nt]^{2H} L(N) \int_{x=0}^1 \int_{y=0}^{1-x} \big( y^{2H-2} - (y+1-x)^{2H-2} \big)\, dx\, dy\; (1+o(1)),$$
which shows that the series in the second term converges almost surely, and that the variance of the second term has the same order as the variance of the first term, namely $L(N)\, [Nt]^{2H}$.

The next theorem gives the asymptotic behavior of the cumulants of the partial sum process.

Theorem 2.3.4. Assume that the superposition of processes satisfies condition (A) for all $q \ge 2$ and condition (B), with $\lambda^{(k)} = \lambda/k$ and $\delta_k = k^{-(1+2(1-H))}$, $\frac12 < H < 1$. In addition, assume that for each $k$, the cumulant function of $\tau^{(k)}$, given in equation (2.11), is analytic around the origin. Then the $m$-th cumulant of the centered partial sums has the following asymptotic behavior:
$$\kappa_{m,N} = D_m\, L(N)\, [Nt]^{m-2(1-H)}\, (1+o(1)) \quad \text{as } N \to \infty,$$
where $D_m = C_m K$ for some positive constant $K$.

Proof. Using (2.8), the logarithm of the characteristic function of the partial sum process can be written as
$$\log E \exp\{ iu( S_\infty([Nt]) - E S_\infty([Nt]) ) \} = \sum_{k=1}^\infty \log E \exp\{ i b^{(k)}_{[Nt]} u\, \tau^{(k)}(0) \} + \sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \log E \exp\{ i a^{(k)}_{[Nt]-j} u\, V^{(k)}(j) \}.$$
Under this theorem's assumptions (basically, assumption (A) and the analyticity of the cumulant function of $\tau^{(k)}$), the logarithm of the characteristic function of $\tau^{(k)}(0)$ can be expanded as
$$\log E \exp\{ iu\, \tau^{(k)}(0) \} = \sum_{m=2}^\infty \frac{(iu)^m}{m!}\, C_m\, \delta_k,$$
where the summation starts from $m = 2$ due to centering. From (2.11), the logarithm of the characteristic function of $V^{(k)}(j)$ can also be expanded:
$$\log E \exp\{ iu\, V^{(k)}(j) \} = \log E \exp\{ iu\, \tau^{(k)}(i) \} - \log E \exp\{ iu\rho_k\, \tau^{(k)}(i-1) \} = \sum_{m=2}^\infty \frac{(iu)^m}{m!}\, C_m\, \delta_k - \sum_{m=2}^\infty \frac{(iu\rho_k)^m}{m!}\, C_m\, \delta_k = \sum_{m=2}^\infty \frac{(iu)^m}{m!}\, C_m\, (1-\rho_k^m)\, \delta_k.$$
Therefore the $m$-th cumulant of the centered partial sum process is
$$\kappa_{m,N} = C_m \sum_{k=1}^\infty \big( b^{(k)}_{[Nt]} \big)^m \delta_k + C_m \sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \big( a^{(k)}_{[Nt]-j} \big)^m (1-\rho_k^m)\, \delta_k = I + II.$$
Consider the first term:
$$I = C_m \sum_{k=1}^\infty \big( b^{(k)}_{[Nt]} \big)^m \delta_k = C_m \sum_{k=1}^\infty \Big( \sum_{i=1}^{[Nt]} \rho_k^i \Big)^m \delta_k = C_m \sum_{i_1,\dots,i_m=1}^{[Nt]} \sum_{k=1}^\infty \delta_k\, \rho_k^{i_1+\cdots+i_m} = \frac{C_m}{C_2} \sum_{i_1,\dots,i_m=1}^{[Nt]} \frac{L(i_1+\cdots+i_m)}{(i_1+\cdots+i_m)^{2(1-H)}}$$
$$= \frac{C_m}{C_2}\, [Nt]^m \int_0^1 \cdots \int_0^1 \frac{L([Nt](x_1+\cdots+x_m))}{([Nt](x_1+\cdots+x_m))^{2(1-H)}}\, dx_1 \dots dx_m\, (1+o(1)) = \frac{C_m}{C_2}\, \frac{[Nt]^m\, L(N)}{[Nt]^{2(1-H)}} \int_0^1 \cdots \int_0^1 \frac{dx_1 \dots dx_m}{(x_1+\cdots+x_m)^{2(1-H)}}\, (1+o(1)),$$
where we used Remark 1.5.1 and the dominated convergence argument for the slowly varying function. This shows that the first part of the expression for the $m$-th cumulant behaves like $L(N)\, [Nt]^{m-2(1-H)}$ multiplied by the constant
$$D_{m,I} = \frac{C_m}{C_2} \int_0^1 \cdots \int_0^1 \frac{dx_1 \dots dx_m}{(x_1+\cdots+x_m)^{2(1-H)}}.$$
Now consider the second term:
$$II = C_m \sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \big( a^{(k)}_{[Nt]-j} \big)^m (1-\rho_k^m)\, \delta_k = C_m \sum_{j=1}^{[Nt]} \sum_{k=1}^\infty \Big( \sum_{i=0}^{[Nt]-j} \rho_k^i \Big)^m (1-\rho_k^m)\, \delta_k = C_m \sum_{j=1}^{[Nt]} \sum_{i_1,\dots,i_m=0}^{[Nt]-j} \sum_{k=1}^\infty \delta_k \big( \rho_k^{i_1+\cdots+i_m} - \rho_k^{i_1+\cdots+i_m+m} \big)$$
$$= \frac{C_m}{C_2} \sum_{j=1}^{[Nt]} \sum_{i_1,\dots,i_m=0}^{[Nt]-j} \left( \frac{L(i_1+\cdots+i_m)}{(i_1+\cdots+i_m)^{2(1-H)}} - \frac{L(i_1+\cdots+i_m+m)}{(i_1+\cdots+i_m+m)^{2(1-H)}} \right)$$
$$= \frac{C_m}{C_2}\, [Nt]^{m+1} \int_{x=0}^1 \int_{y_1=0}^{1-x} \cdots \int_{y_m=0}^{1-x} \left( \frac{L([Nt](y_1+\cdots+y_m))}{([Nt](y_1+\cdots+y_m))^{2(1-H)}} - \frac{L([Nt](y_1+\cdots+y_m)+m)}{([Nt](y_1+\cdots+y_m)+m)^{2(1-H)}} \right) dy_1 \dots dy_m\, dx\; (1+o(1)).$$
Remark 1.5.1 and the dominated convergence argument, followed by integration with respect to $y_m$, yield
$$II = \frac{C_m}{C_2}\, L(N)\, [Nt]^{m-2(1-H)+1} \int_{x=0}^1 \int_{y_1=0}^{1-x} \cdots \int_{y_m=0}^{1-x} \left( \frac{1}{(y_1+\cdots+y_m)^{2(1-H)}} - \frac{1}{(y_1+\cdots+y_m+m/[Nt])^{2(1-H)}} \right) dy_1 \dots dy_m\, dx\; (1+o(1))$$
$$= \frac{C_m\, L(N)\, [Nt]^{m-2(1-H)}}{C_2 (2H-1)} \int_{x=0}^1 \int_{y_1=0}^{1-x} \cdots \int_{y_{m-1}=0}^{1-x} [Nt] \Big( (y_1+\cdots+y_{m-1}+m/[Nt])^{2H-1} - (y_1+\cdots+y_{m-1})^{2H-1} - \big( (y_1+\cdots+y_{m-1}+1-x+m/[Nt])^{2H-1} - (y_1+\cdots+y_{m-1}+1-x)^{2H-1} \big) \Big)\, dy_1 \dots dy_{m-1}\, dx\; (1+o(1)).$$
Note that as $[Nt] \to \infty$ the limit of the integrand is
$$m(2H-1)\big( (y_1+\cdots+y_{m-1})^{2H-2} - (y_1+\cdots+y_{m-1}+1-x)^{2H-2} \big).$$
The same argument as in the proof of Lemma 2.3.3 and the dominated convergence theorem yield
$$II = \frac{m\, C_m\, L(N)\, [Nt]^{m-2(1-H)}}{C_2} \int_{x=0}^1 \int_{y_1=0}^{1-x} \cdots \int_{y_{m-1}=0}^{1-x} \big( (y_1+\cdots+y_{m-1})^{2H-2} - (y_1+\cdots+y_{m-1}+1-x)^{2H-2} \big)\, dy_1 \dots dy_{m-1}\, dx\; (1+o(1)) = D_{m,II}\, L(N)\, [Nt]^{m-2(1-H)}\, (1+o(1))$$
with
$$D_{m,II} = \frac{m\, C_m}{C_2} \int_{x=0}^1 \int_{y_1=0}^{1-x} \cdots \int_{y_{m-1}=0}^{1-x} \big( (y_1+\cdots+y_{m-1})^{2H-2} - (y_1+\cdots+y_{m-1}+1-x)^{2H-2} \big)\, dy_1 \dots dy_{m-1}\, dx.$$
Thus the asymptotic behavior of the second term is the same as that of the first term, namely $L(N)\, [Nt]^{m-2(1-H)}$.

Remark 2.3.1. The analyticity of the cumulant function $\kappa_X$ in the above theorem ensures the existence of all cumulants of the marginal distribution of the underlying supOU process $X$. It follows from Lemma 3.1 of [34] that the cumulant function of $X(t)$ is analytic in a neighborhood of the origin if there exists $a > 0$ such that $E e^{a|X(t)|} < \infty$. This implies in particular that all the moments and cumulants of $X(t)$ exist. The condition is satisfied for many self-decomposable distributions, for example the IG, NIG, and others. Other examples of supOU processes satisfying the above conditions can be obtained by taking the marginal distribution to be gamma, variance gamma, tempered stable, Euler's gamma, or the z-distribution. For more details, see [34].

Corollary 2.3.5. Under the assumptions of Theorem 2.3.4, the centered partial sum process $\{S_\infty(t) - E S_\infty(t), t \ge 0\}$ is intermittent.

Proof. Let $Y(u) = S_\infty(u) - E S_\infty(u)$. We show intermittency at $p = 2$ and $r = 4$. From Theorem 2.3.4, the $m$-th cumulant of $Y([Nt])$ equals $D_m L(N)[Nt]^{m-2(1-H)}(1+o(1))$ as $N \to \infty$. Since $L([Nt])/L(N) \to 1$ as $N \to \infty$ for any $t > 0$, the $m$-th cumulant of $Y(u)$, denoted by $\tilde\kappa_{m,u}$, equals $D_m L(u) u^{m-2(1-H)}(1+o(1))$ as $u \to \infty$. Using the relation between moments and cumulants, it follows from Theorem 2.3.4 that
$$E|Y(u)|^2 = \tilde\kappa_{2,u} + \tilde\kappa_{1,u}^2 = D_2\, L(u)\, u^{2H}\, (1+o(1)),$$
$$E|Y(u)|^4 = \tilde\kappa_{4,u} + 3\tilde\kappa_{2,u}^2 = D_4\, L(u)\, u^{2H+2}\, (1+o(1)) + 3 D_2^2\, L(u)^2\, u^{4H}\, (1+o(1))$$
as $u \to \infty$.
Since $H < 1$ implies $2H + 2 > 4H$, we have $\tau(2) = 2H$, $\tau(4) = 2H + 2$, and thus $\tau(2)/2 < \tau(4)/4$.

Note that the behavior of moments shown in the proof implies that $E Y(u)^4 / (E Y(u)^2)^2$ grows to infinity as $u \to \infty$, the behavior noted by Frisch ([27, Section 8.2]) as a manifestation of intermittency. Other examples of unusual growth of moments are given in [62] in the context of fractional diffusion. Similar behavior of cumulants was obtained in [4, Example 4.1] for the case of continuous (integrated) superpositions of OU type processes, and also in [22] in the context of proving central limit theorem type results. The authors of [22] noted that the existence of the limit was unlikely given the behavior of the cumulants. We concur with this statement, but showing that there is no weak limit under intermittency in the usual partial sum setting remains an open problem.

The reasoning for why the existence of a weak limit is unlikely is as follows. If the limit of the partial sum process for the infinite superposition existed in the sense of convergence of all finite-dimensional distributions, then by Lamperti's theorem (see, for example, [21, Theorem 2.1.1]) the norming would have to be a regularly varying function of $N$; that is, the weak convergence would hold for
$$N^a L_1(N)\, \big( S_\infty([Nt]) - E S_\infty([Nt]) \big)$$
for some $a \in \mathbb{R}$ and a slowly varying at infinity function $L_1$. However, no matter which $a \in \mathbb{R}$ is chosen, the cumulants of the centered and normed partial sums cannot all converge: the $m$-th cumulant of $N^a ( S_\infty([Nt]) - E S_\infty([Nt]) )$ behaves like $N^{m(a+1)-2(1-H)}$, and no single $a$ controls this exponent for every $m$ simultaneously.

Also note that for even $q$, the scaling function defined in (2.3) is in this case $\tau(q) = q - 2(1-H)$, and
$$\frac{\tau(q)}{q} = 1 - \frac{2(1-H)}{q}$$
is strictly increasing in $q$. The term $-2(1-H)$ in the exponent of the asymptotic behavior of the cumulants, $\kappa_{q,N} = D_q L(N)[Nt]^{q-2(1-H)}(1+o(1))$, is the reason both for the increasing behavior of $\tau(q)/q$ and for the lack of a norming that would make the cumulants converge. Of course, convergence of cumulants provides a sufficient means for proving the existence of a limit by showing the convergence of the characteristic function. The formal link between intermittency and the lack of limit theorems needs to be further developed for partial sums and other sequences of stochastic processes.

Chapter 3

Applications in finance

For the past thirty years, extensive research has been devoted to finding classes of stochastic models that can capture the essential statistical properties of stock price returns. The Black-Scholes model, by far the most renowned and widely used, also referred to as the geometric Brownian motion model or GBM for short, was first introduced by Black, Scholes and Merton, see [12]. It models the stock price evolution as
$$S_t = S_0 \exp\Big\{ \Big( \mu - \frac{1}{2}\sigma^2 \Big)t + \sigma B(t) \Big\}, \quad t > 0. \tag{3.1}$$
Even though the Black-Scholes model is the most widely used model so far, as we will see later on it is not able to capture some important features of the log-returns, and therefore new alternatives have been suggested in order to improve the overall model fit. The assumptions it makes about the nature of the stock price process impose log-returns that are independent, stationary and normally distributed, characteristics that differ from the empirical features of the returns (see Section 3.2). The first improvement on the GBM in (3.1), which uses Levy processes instead of Brownian motion as the noise component, was suggested by Merton in the 1970s [56].
Since most of the properties of the Levy process, for example independent and stationary increments and the existence of a version that is a.s. cadlag, carry over to Merton's improved model, the new model allows for jumps in the stock price evolution. The class of possible marginal distributions (infinitely divisible distributions) is also larger than just the Gaussian of the geometric Brownian motion model.

The next refinement of the Black-Scholes model, proposed by Mandelbrot (1967) [53], had a completely different flavor. He used fractional Brownian motion $B_H(t)$ subordinated by a positive stochastic process $\theta(t)$ independent of the FBM; i.e. he modeled the stock price evolution with the process $B_H(\theta(t))$. Mandelbrot's model belongs to the group of subordinator models, i.e. models where deterministic time is replaced by an increasing stochastic process. Note that the financial interpretation of time in subordinator type models is completely different from that in regular stochastic models, see Howison and Lamper (2001) [41].

Another popular class of models, which aims to correct the volatility parameter $\sigma$ in equation (3.1), is the class of stochastic volatility models. The assumptions of the GBM imply that volatility is constant, while in reality this parameter depends on both time and the strike price; this phenomenon is referred to in the financial literature as the volatility smile. Widely recognized and used models from this class are the Heston model, the Barndorff-Nielsen and Shephard (BNS) model, and the superposition of Ornstein-Uhlenbeck type processes (supOU) model [4, 8].

Finally, the model we are going to use for calibration purposes is a slight modification of Heyde's fractal activity time GBM model, or FATGBM for short, a version frequently referred to in the literature as the skewed FATGBM model. This model belongs to the class of subordinator models: the Brownian motion is subordinated to a specific activity time process, a finite (or infinite) supOU process. In fact, it was shown in [8] that the BNS stochastic volatility model is a special type of FATGBM model with a specific choice of the activity time; the equivalence between the two is a consequence of the self-similarity of Brownian motion.

3.1 The Black-Scholes Model: Geometric Brownian Motion (GBM)

In this section we briefly present the rationale behind stock price modeling and why geometric Brownian motion seems to be a good representation of reality. In various option pricing models, the time evolution of a stock price $\{S_t\}_{t \ge 0}$ is modeled by describing the price change in a small time interval, $\Delta S_t = S_{t+\Delta t} - S_t$. The question is whether it is better to use absolute, relative or log-returns. One finds in the finance literature that the most commonly used metrics are relative and log-returns; both have their strengths and weaknesses, and one is used over the other depending on the specific situation. In order to be able to compare investments in different securities, instead of the absolute price change $\Delta S_t$ we will focus on modeling the relative price change (relative return) $\Delta S_t / S_t$. (In finance, the relative price change is usually called a simple return. When evaluating portfolios holding more than one security, simple returns are a more convenient metric than absolute or log-returns; however, most authors in the financial literature prefer working with log-returns due to their additivity over time and the exponential nature of many financial models.)

It is reasonable to expect that over a small time interval $\Delta t$ the return can be decomposed into a systematic and a random component:
$$\Delta S_t / S_t = \underbrace{\mu \Delta t}_{\text{systematic}} + \underbrace{\sigma \Delta B(t)}_{\text{random}}, \quad S_0 > 0, \tag{3.2}$$
where the systematic part is the expected value of the return and the random part is noise.
Remark 3.1.1. Equation (3.2) simply says that, since the stock price fluctuates stochastically, it is reasonable to assume that it oscillates around some expected (deterministic) value, the oscillation being due to random noise.

It is natural to assume that over a short period of time both the expected return (systematic term) and the variance of the random component are proportional to the time increment $\Delta t$. The financial intuition here is that over a short time span the expected increase or decrease in price is proportional to the length of the interval, and also that the longer $\Delta t$ is, the more volatile the stock price may be. Therefore, in the Black-Scholes setup the deterministic part is of the form $\mu \Delta t$, where the proportionality parameter $\mu$, called the drift, represents the mean rate of return of the stock and is assumed constant over time. At the same time, the random component in the Black-Scholes model is modeled by $\sigma \Delta B(t)$, where $\Delta B(t)$ is the noise increment driven by Brownian motion and $\sigma > 0$ is assumed constant over time. So the random term is driven by a normally distributed error with variance $\sigma^2 \Delta t$. The parameter $\sigma$ governs how volatile the stock price is (how much effect the noise has on the stock return, i.e. how much the stock price fluctuates) and is commonly referred to as the volatility of the stock.

Remark 3.1.2. The above decomposition of the relative stock return into a deterministic and a random component can be applied in general, regardless of the model. How the deterministic term $\mu$ changes with time, whether it should be a constant function of time or random, and how the error term is distributed are all tied to a specific model.

Heuristically, as $\Delta t \to 0$, equation (3.2) becomes an SDE of the form
$$dS_t / S_t = \mu\, dt + \sigma\, dB(t), \quad S_0 > 0, \tag{3.3}$$
where $\{B(t)\}_{t \ge 0}$ is a Brownian motion process. The stochastic differential equation (3.3) has an almost surely unique strong solution, called geometric Brownian motion (GBM), given by the following expression:
$$S_t = S_0 \exp\Big\{ \Big( \mu - \frac{1}{2}\sigma^2 \Big)t + \sigma B(t) \Big\}, \quad t > 0. \tag{3.4}$$

Remark 3.1.3. Notice that GBM is an exponential of Brownian motion with drift. From (3.4) we can conclude that the log-returns of the stock price process are normally distributed, $\log(S_t / S_0) \sim N\big( (\mu - \frac{1}{2}\sigma^2)t,\ \sigma^2 t \big)$, and that the stock price itself is log-normally distributed.
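A minimal simulation of (3.4) using exact log-normal increments (a sketch; the values of $\mu$, $\sigma$ are illustrative, not calibrated):

    import numpy as np

    rng = np.random.default_rng(6)
    mu, sigma, S0 = 0.08, 0.2, 100.0       # annualized drift/volatility (illustrative)
    T, n = 1.0, 252                        # one year of daily steps
    dt = T / n

    # Exact scheme: log-increments are i.i.d. N((mu - sigma^2/2) dt, sigma^2 dt)
    log_incr = rng.normal((mu - 0.5 * sigma**2) * dt, sigma * np.sqrt(dt), n)
    S = S0 * np.exp(np.concatenate(([0.0], np.cumsum(log_incr))))
    print(S[-1])                           # terminal price of one simulated path

Because the log-increments are drawn from their exact distribution, this scheme has no discretization error, in contrast to an Euler scheme applied to (3.3).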
3.2 Empirical evidence against the Black-Scholes model

As mentioned earlier, the empirical evidence suggests that the classical Black-Scholes model does not describe the properties of financial time series very well. We present here the essential ways in which the statistical features of real log-returns differ from those implied by the Black-Scholes model:

1. The empirical distribution of returns is more sharply peaked, with fatter tails.
2. Dependence structure of the returns:
(a) The empirical returns are uncorrelated (independent in the Gaussian case).
(b) The real absolute and squared returns exhibit strong and persistent dependence (as opposed to Black-Scholes returns).
3. The conditional variance of the returns, E[(X_t − µ)² | F_{t−1}], behaves erratically as a function of time for the empirical returns, i.e. it seems to behave randomly and exhibits possible signs of clustering and long-range dependence. (Recalling the volatility parameter from the GBM SDE, equations (3.2) and (3.3), this statement simply says that the volatility parameter changes stochastically over time.) Under the Black-Scholes assumption it is constant: E[(X_t − µ)² | F_{t−1}] = σ², for all t > 0.

In subsequent sections we provide more details about each stylized fact, along with accompanying figures and tables. The data consist of daily adjusted closing prices of the Standard and Poor's 500 Index (S&P 500), from the 9th of September 2005 to the 9th of September 2016.

3.2.1 The non-Gaussian character of the returns

While the GBM assumption about the stock price evolution implies that log-returns are normally distributed (see Remark 3.1.3), in practice the returns have higher peaks and fatter tails (leptokurtic distributions), see Figure 3.1.

[Figure 3.1: Histogram of the S&P 500 returns against the Normal curve fit.]

There are different ways one can showcase the non-normality of the returns. The most popular among practitioners is to test normality using the Jarque-Bera, Kolmogorov-Smirnov, Shapiro-Wilk or Anderson-Darling test, each of which has its own benefits and weaknesses. Our null hypothesis is that the log-returns are i.i.d. normal under the GBM, and we look for evidence against this null hypothesis. In Table 3.1 below we present the results of the tests for the S&P 500 real and simulated returns, alongside other stock returns, from the period 2005 to 2016. One can clearly see evidence against normality, as in Figure 3.1 above. As we continue, we will see more and more evidence against the normality assumption.

Asset returns    N      Sam. skewness   Sam. kurtosis   p-value     Decision (H0)
S&P500 (real)    2770   -0.3325         13.5867         << 0.001    reject
S&P500 (sim)     2770   -0.0033          3.0491         > 0.5       retain
DJI              2770   -0.0961         13.4402         << 0.001    reject
NDX              2770   -0.1746         10.9485         << 0.001    reject
GSK              2770   -0.3673          8.7517         << 0.001    reject
PFE              2770   -0.1940         10.2959         << 0.001    reject
COF              2770   -0.3491         17.6885         << 0.001    reject
BAC              2770   -0.3049         24.1146         << 0.001    reject
MSFT             2770    0.0729         13.2692         << 0.001    reject
IBM              2770   -0.1871          8.5777         << 0.001    reject
GOOG             2770    0.5622         13.2928         << 0.001    reject

Table 3.1: Values for sample skewness, kurtosis and the resulting decisions of the normality test.

Note that the Jarque-Bera test is a goodness-of-fit type test which checks whether the sample skewness and kurtosis match those of a normal distribution, i.e. the null hypothesis is the joint hypothesis that both the skewness and the excess kurtosis are zero. (This test is overly sensitive for smaller sample sizes; a typical rule of thumb is a sample size larger than 2000.) Hence it is similar to the tests we suggested. In the next section, as Table 3.2 will show, even the sample skewness and kurtosis alone can indicate the degree to which a given distribution is asymmetric and has heavier tails than the normal.
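As an illustration of how such a normality check might be run in practice, the following sketch applies scipy's Jarque-Bera test; the return series is a synthetic Student-t placeholder standing in for a fat-tailed empirical series, not the thesis data:

```python
import numpy as np
from scipy import stats

# Placeholder data: in the thesis this would be daily S&P 500 log-returns,
# e.g. returns = np.diff(np.log(adjusted_close)). Student-t noise is used
# here only to mimic a fat-tailed return series.
rng = np.random.default_rng(0)
returns = 0.01 * rng.standard_t(df=4, size=2770)

jb_stat, jb_pvalue = stats.jarque_bera(returns)
print(f"Jarque-Bera: stat = {jb_stat:.2f}, p-value = {jb_pvalue:.3g}")
print("reject H0" if jb_pvalue < 0.05 else "retain H0")
```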
3.2.1.1 Skewness (asymmetry) and excess kurtosis (fat tails)

Let us denote by µ = µ₁ = E[X] the mean, by µ₂ = E[(X − µ)²] the variance, and by µ_k = E[(X − µ)^k] the k-th central moment of a random variable X. Let us also assume that {X_i}_{i=1}^N is a sample of size N from the distribution of X, with sample (central) moments

    µ̂_k(N) = (1/N) Σ_{i=1}^{N} (X_i − X̄(N))^k,   k = 1, 2, 3, 4,

where X̄(N) = (1/N) Σ_{i=1}^{N} X_i is the sample average.

In our case we assume that {X_i}_{i=1}^N is a sample of returns that are uncorrelated (not necessarily independent) and identically distributed random variables. Skewness (γ₁) measures the degree to which a given distribution is asymmetric, while kurtosis (γ₂) measures how heavy its tails are compared to the normal. The two formulas are, respectively,

    γ₁ = E[(X − µ)³] / (Var[X])^{3/2} = µ₃ / µ₂^{3/2},   γ₂ = E[(X − µ)⁴] / (Var[X])² = µ₄ / µ₂².

Since we assumed that the returns are uncorrelated, in order to estimate the skewness and kurtosis of the marginal distribution of the returns we need estimation techniques that are robust to the presence of correlation in our data. So, provided the assumptions below are satisfied, we employ method of moments estimation, which exploits the fact that, for N large enough, the theoretical moments of the distribution are close to the sample moments. This holds provided certain mixing properties, and therefore ergodicity of the returns, are satisfied; these imply consistency and asymptotic normality of the sample moments and hence of the sample skewness and kurtosis. Note that in our case the mixing property is satisfied, see Theorem 2.4. Finally, the sample skewness and kurtosis are given by

    γ̂₁(N) = µ̂₃(N) / µ̂₂(N)^{3/2},   γ̂₂(N) = µ̂₄(N) / µ̂₂(N)².

Remark 3.2.1. With respect to skewness there are two types of distributions: negatively skewed (longer tail to the left than to the right) and positively skewed (the reverse). Regarding kurtosis, a mesokurtic distribution behaves like a normal, with kurtosis value three; a platykurtic distribution has a flatter top and kurtosis less than three; and a leptokurtic distribution has higher peaks and kurtosis greater than three. In general, kurtosis greater than 3 indicates a distribution whose tails decay more slowly than those of the normal distribution and which is more sharply peaked. Both skewness and kurtosis are usually measured with reference to a normal distribution, which has skewness equal to zero and kurtosis equal to three (i.e. excess kurtosis zero).

In addition to the normality tests performed above (Table 3.1), below we provide Table 3.2 with values of the sample skewness and kurtosis, which indicates that, for the majority of our data sets, the sample skewness is slightly negative relative to the normal, while the sample kurtosis is significantly higher. To be more specific, the sample skewness of our S&P 500 index is only slightly negative, around −0.3325, while typically the observed daily log-returns of different indices may exhibit at least some significant negative skewness. Regarding the kurtosis, we clearly see that the S&P 500 returns, as well as the majority of the other securities, have kurtosis significantly higher than three. From these results we conclude that for the majority of our data the empirical distribution is leptokurtic and also has a slightly longer tail to the left. The fact that the distribution of real returns is more leptokurtic than the normal was already noted by Fama (1965), see [23].
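The sample statistics above amount to a few lines of array arithmetic; the following sketch implements γ̂₁(N) and γ̂₂(N) exactly as defined (the plain method-of-moments versions, not a library's bias-corrected variants):

```python
import numpy as np

def sample_skew_kurt(x):
    """Method-of-moments sample skewness and kurtosis:
    gamma1 = mu3 / mu2^(3/2), gamma2 = mu4 / mu2^2."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()                 # center at the sample average
    mu2 = np.mean(xc**2)              # biased sample variance, as defined
    mu3 = np.mean(xc**3)
    mu4 = np.mean(xc**4)
    return mu3 / mu2**1.5, mu4 / mu2**2

# For an i.i.d. N(0,1) sample, gamma1 should be near 0 and gamma2 near 3
rng = np.random.default_rng(1)
g1, g2 = sample_skew_kurt(rng.normal(size=2770))
print(f"skewness = {g1:.4f}, kurtosis = {g2:.4f}")
```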
The financial intuition behind a model with high kurtosis is that large movements in a stock price occur more frequently than in a model with normally distributed increments. This phenomenon has been observed in the markets for quite some time; one recent example is the 2007/2008 financial (mortgage) crisis. This feature is one of the reasons for modeling stock price processes with jumps. Regarding skewness, one possible explanation for the negative skewness is that investors tend to react more strongly to negative information than to positive information, see Rydberg (2000) [61]. (Another interesting perspective on positively/negatively skewed returns is that investors are more willing to buy stocks that are positively skewed, meaning frequent small losses and a few extreme gains. Intuitively, investors would rather accept smaller, even slightly negative, expected returns with an occasional extreme return, either gain or loss, than invest in negatively skewed stock returns; real-life examples are buying lottery tickets, "penny stocks", small-cap growth stocks, etc.)

Risky asset returns   N      Sam. Mean      Sam. St.dev   Sam. Skewness   Kurtosis
S&P500 (real)         2770    0.00019458    0.0127        -0.3325         13.5867
S&P500 (sim GBM)      2770   -0.00010136    0.0125        -0.0033          3.0491
DJI                   2770    0.00019027    0.0117        -0.0961         13.4402
NDX                   2770    0.00038597    0.0137        -0.1746         10.9485
GSK                   2770    0.00013589    0.0137        -0.3673          8.7517
PFE                   2770    0.00025785    0.0144        -0.1940         10.2959
COF                   2770    0.000016862   0.0312        -0.3491         17.6885
BAC                   2770   -0.00027863    0.0353        -0.3049         24.1146
MSFT                  2770    0.00036112    0.0174         0.0729         13.2692
IBM                   2770    0.00031453    0.0138        -0.1871          8.5777
GOOG                  2770    0.00058731    0.0192         0.5622         13.2928

Table 3.2: Values for sample mean, standard deviation, skewness and kurtosis.

3.2.2 The dependence structure of the log-returns: regular, absolute and squared returns

Another interesting phenomenon that is not captured well by the Black-Scholes model is the dependence structure of the log-returns and of their absolute and squared values. The dependence structure of the returns is very important because it tells us how predictable stock returns are and how much we can infer about the future using only historical data. A standard graphical approach for exploring dependence in returns is the autocorrelation function (ACF), which measures the correlation between returns that are k lags apart. To investigate further, given a sample of N log-returns, we compute and plot the sample autocovariance and autocorrelation functions,

    R̂^{(N)}(k) = (1/N) Σ_{t=1}^{N−k} (X_t − X̄)(X_{t+k} − X̄),   ρ̂^{(N)}(k) = R̂^{(N)}(k) / R̂^{(N)}(0),

where X̄ is the sample average and k is the number of lags. If we assume that the given sample of log-returns comes from a population generated by white noise then, for relatively large N, the sample autocorrelation estimator is approximately normally distributed with variance 1/N, where the number of lags should be much smaller than the total number of observations, 1 ≤ k ≤ m << N, see Anderson (1942). This means that we can test whether the given returns are purely random (white noise) by checking whether the observed autocorrelations lie within the 95% confidence bounds based on the asymptotic distribution of the sample autocorrelation coefficient under the white noise assumption. These bounds correspond to the levels ±1.96/√N.
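A minimal sketch of this diagnostic, computing ρ̂^{(N)}(k) for the regular, squared and absolute returns and comparing against the ±1.96/√N bands (the return series is again a synthetic placeholder):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho_hat(k) = R_hat(k) / R_hat(0)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    r0 = np.sum(xc**2) / n
    return np.array([np.sum(xc[:n - k] * xc[k:]) / (n * r0)
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(2)
returns = 0.01 * rng.standard_t(df=4, size=2770)    # placeholder series

n, m = len(returns), 150
band = 1.96 / np.sqrt(n)                             # white-noise 95% bounds
for series, name in [(returns, "X"), (returns**2, "X^2"),
                     (np.abs(returns), "|X|")]:
    acf = sample_acf(series, m)
    frac = np.mean(np.abs(acf) > band)
    print(f"{name}: {100 * frac:.1f}% of lags outside +-{band:.3f}")
```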
The interpretation of the test is the following: we reject H0 (the population is generated by white noise) only if, for a fixed number of lags 1 ≤ k ≤ m << N, we observe that ρ̂^{(N)}(k) falls outside the interval bounds more than 5% of the time, in which case we call these correlations significant. In our case, from Figure 3.2 below, we see that the sample autocorrelations of the real log-returns of various risky assets diminish rapidly and that there is little or no autocorrelation present in the returns past a couple of lags, i.e. the correlations are statistically insignificant (consistent with white noise). This is not enough to say that the returns are independent; one can only conclude that they are uncorrelated. If we want to understand the correlation structure of the returns better, it is necessary to also study the serial correlation of the squared and absolute returns. This, in turn, may reveal relevant information about how the variance changes with time and whether the accumulation of large/small absolute returns persists as t → ∞. For example, strong dependence of the absolute returns, as with the squared returns, tells us whether any accumulation of big movements of risky asset prices exists in our data, see below.

[Figure 3.2: Empirical autocorrelation function of regular, squared and absolute log-returns for the S&P 500 index, with upper and lower white noise bands.]

In Figure 3.2 we also present the sample autocorrelations of the absolute and squared log-returns, |X_t| and X_t², t = 0, ..., N − k, where the black dotted lines on the plot represent the upper and lower bounds of the 95% confidence interval for the sample autocorrelations under H0 (the Gaussian white noise assumption). Interestingly, the figure does show strong persistence in the autocorrelations of both the squared and the absolute log-returns, so there is strong evidence against the GBM hypothesis. This empirical evidence matches the previous observations of Heyde and Liu (2001) [39] and Cont (2001) [16], and also supports the "stylized facts" outlined in Granger (2005) [35]. Nowadays most authors take the empirical features of risky asset returns as the foundation for building a reliable model that is a good representation of reality.

Let us compare our findings about the empirical log-returns, given in the figure above, against simulated GBM returns calibrated on the same S&P 500 data. We took the simplest approach to calibration, i.e. we used the (historical) sample mean and variance to estimate the drift and volatility parameters. From the resulting plot, see Figure 3.3, one can clearly see that there is little autocorrelation present for all three sequences of log-returns: regular, squared and absolute. This is considered additional evidence of the Black-Scholes deficiencies.

Recall that a process is long-range dependent (LRD) if its autocorrelation function is not integrable, i.e.

    ∫_{−∞}^{+∞} |ρ(t)| dt = ∞

in the continuous case, and Σ_{k=−∞}^{+∞} |ρ(k)| = ∞ in the discrete case. (LRD processes are also called long/strong memory or strong dependence processes, while short-range dependent processes are called short memory, weak memory or weak dependence processes. There is also an alternative characterization of LRD, via the autocorrelation behaving as a product of a power function and a slowly varying function.) In contrast, a series whose autocorrelation is integrable may display short-range dependence.
Many authors consider long-range/short-range dependence a very important empirical feature of the returns. In general there are two schools of thought on this subject. One group of authors claims that long-range dependence exists for risky assets and has been empirically proven, see Heyde and Yang (1997) [40] and references therein; however, this is not a universally accepted view. One way to distinguish between strong and weak dependence in the data is to estimate the so-called Hurst parameter, which, in a way, represents a direct link between the intensity of dependence of a given process and its self-similar scaling nature.

[Figure 3.3: Autocorrelation function of regular, squared and absolute log-returns for the simulated geometric Brownian motion, calibrated on the S&P 500 index.]

One of the reasons the infinite superposition model was introduced was as a long-range dependent solution that keeps the most useful properties of the finite superposition model (tractability, a non-normal class of marginal distributions and a flexible correlation structure) while introducing long-range dependence in addition. For this thesis, given our findings, we refrain from using the long-range dependent model and instead work only with the finite superposition model, which we know is short-range dependent.

If these correlations are statistically significant, we have evidence of predictability. One can use the Ljung-Box (LB) test to check the joint significance of the autocorrelation coefficients over more than one lag.

3.2.3 Stochastic Volatility and clustering

In a stochastic volatility model the price X_t is a solution of the SDE

    dX_t = µ(t, X_t) dt + σ(t, X_t) dB_t,

which exists under certain conditions on the coefficients. The functions µ and σ are known as the drift and the volatility of the underlying process X_t; the volatility function is also referred to as the instantaneous absolute volatility. In the above setting, assuming the drift is deterministic, the volatility term σ(t, X_t) directly influences the variance contributed by the Brownian motion. In particular, in the Black-Scholes model, where a GBM evolution is assumed, both the drift and volatility functions are constants, µ(t, X_t) = µ and σ(t, X_t) = σ, hence the squared volatility parameter equals the variance rate of the returns; it simply describes how much effect the noise term has on the price, see Figure 3.4 below.

In order to understand and model the volatility, one must start with the following question: how do we use the Black-Scholes model to price an option today (now) using the past information we have (historical price movements of the underlying stock)? A first step in option pricing is to estimate the value of the volatility parameter σ, which is assumed constant under the GBM.
There is a rich literature on how to measure and estimate the volatility function σ(t, X_t) in general, but since this is not the topic of the discussion, for calibration purposes we focus only on estimating the volatility under the GBM model.

[Figure 3.4: S&P 500 real returns against the simulated GBM returns: real returns, simulated GBM returns, and the two overlaid, September 2005 to September 2016.]

A key difficulty in measuring market volatility is that it is not directly observable; it is a latent variable and therefore must be inferred from market prices. The two most common techniques for calibrating the volatility parameter are historical and implied volatility. Here we focus only on historical volatility, a retrospective measure that reflects how volatile the asset has been in the past. Different historical volatility models can be used, from the simplest, like moving average (MA) and exponentially weighted moving average (EWMA) models, to GARCH and stochastic volatility models.

A simple moving average model for one-day volatility at time t ≥ 0 is the sample standard deviation of all daily observed log-returns over a fixed time window W preceding t. Denoting the past log-returns by X_i, i = t − W, ..., t − 1, where the window W can be one year, one month, etc., the MA model of volatility is given by

    σ̂_t = sqrt( (1/(W−1)) Σ_{i=1}^{W} (X_{t−i} − X̄)² ),   X̄ = (1/W) Σ_{i=1}^{W} X_{t−i},

where the X_i are the observed log-returns. Usually the daily standard deviations are annualized by multiplying σ̂_t by the square root of the number of trading days in one calendar year, typically 250. This annualized standard deviation is what is most commonly thought of as the historical volatility. Every day t the above moving average formula for the spot volatility is updated, annualized, and used in the Black-Scholes model to price an option on that particular day.

Even though under the GBM the volatility should be constant, plots of the one-year and one-month MA annualized historical volatilities of the S&P 500 index show that the estimated volatility parameter changes over time, see Figures 3.5a and 3.5b. This effect can also be observed in the earlier plots of the regular S&P 500 returns, see Figure 3.4. Moreover, one can even recognize a volatility clustering effect in both figures.

[Figure 3.5: The historical volatility for the S&P 500 index returns (daily readings): (a) one-year MA, (b) one-month MA.]

From Figures 3.5a and 3.5b the volatility seems to behave erratically over time, which prompts one to consider non-deterministic models for this parameter. Moreover, one can also see a slight mean-reversion effect, along with periodic behavior of the volatility, which is better observed over the long term, see Figure 3.5b.
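A sketch of the rolling MA estimator with the annualization convention above (again with a synthetic placeholder series; the window lengths are illustrative):

```python
import numpy as np

def ma_volatility(returns, window, trading_days=250):
    """Rolling sample standard deviation of log-returns over a fixed
    window, annualized by sqrt(trading_days)."""
    returns = np.asarray(returns, dtype=float)
    vols = np.array([returns[t - window:t].std(ddof=1)  # 1/(W-1) normalization
                     for t in range(window, len(returns) + 1)])
    return vols * np.sqrt(trading_days)

rng = np.random.default_rng(3)
returns = 0.01 * rng.standard_t(df=4, size=2770)   # placeholder series

one_month = ma_volatility(returns, window=21)      # about 21 trading days
one_year = ma_volatility(returns, window=250)
print(one_month[-1], one_year[-1])
```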
The phenomenon where periods with high variance of returns are followed by periods with low variance is called the clustering effect; intuitively, large price variations are more likely to be followed by large price variations, and vice versa. The clustering effect can be clearly observed in Figure 3.4. The fact that volatility exhibits clustering periodically in time makes it partially predictable, which may be of great help in risk management and option pricing.

3.3 Fractal Activity time geometric Brownian motion model with supOU as activity time

Since considerable investigation of the characteristics of stock price returns shows signs of non-Gaussianity and a very delicate dependence structure, new models have been proposed that try to describe empirical log-returns more realistically. The most common problems faced by the more advanced models are complexity and intractability: these models become complicated either by introducing a large number of parameters to be estimated or simply because their marginal densities are intractable, i.e. have no closed form. Recall from the previous chapters that finite/infinite supOU processes have tractable distributions if suitable marginal distributions are chosen. Hence, to keep tractability in the calibration of the S&P 500 data, we suggested using Heyde's skewed FATGBM subordinator model in which the activity time is constructed via finite superpositions of OU type processes. This model generalizes the classical GBM model by introducing a time-changed version of the Brownian motion and, most importantly, the subordinating process is chosen so that the final model supports all of the desired features of the empirical log-returns.

Remark 3.3.1. It is important to notice that this approach, proposed by Heyde [37], combines all of the desired properties: non-Gaussian distributional features, the dependence structure of the log-returns, and tractable marginal distributions.

Various types of models that try to incorporate non-Gaussian distributions have been proposed and investigated by many authors, see, e.g., Praetz, Sorensen and Rydberg. For example, the approach that uses Levy processes instead of a Brownian motion in the GBM model, proposed by Eberlein and Raible (1999) [19], allows a larger class of possible marginal distributions (the infinitely divisible distributions) and also introduces jumps in the stock price process. Then there are the already mentioned fractional Brownian motion (FBM) subordinator model of Mandelbrot [53] and the stochastic volatility models of Barndorff-Nielsen and Shephard [8]. Here we focus on models that describe volatility as a stochastic process. There are two approaches to building models of this type. One is to model σ, the volatility parameter, directly as a stochastic process; models of this type are commonly called stochastic volatility models. The other is through subordination of a Brownian motion (or a Levy process), treating time as a random activity time process; these are usually called subordinator or change-of-time models. Clearly, the model we suggested for the calibration is of the second type. Moreover, different activity time models have been proposed depending on how the activity time is constructed. In particular, the use of Levy processes to model the activity time was suggested by Madan and Seneta (1990) and Madan, Carr and Chang (1998), see [51, 52].
Bender and Marquardt (2009) [10] offered a construction of the activity time via convolution of a Levy subordinator with a deterministic kernel. While all these constructions have many desirable features, they do not specify the marginal distribution of the returns in closed form, which makes these models extremely hard to implement.

3.3.1 Fractal Activity time geometric Brownian motion process

This section introduces the fractal activity time geometric Brownian motion process, or FATGBM for short. FATGBM models, which belong to the group of subordinator models, were first introduced by Heyde (1999) [37]; see also Heyde and Liu (2001) [39]. The FATGBM model describes the stock price process {S_t, t ≥ 0}, driven by a standard Brownian motion B_t evaluated at a random activity time T_t, independent of B_t. The stock price process is given by the following expression:

    log S_t = log S_0 + µt + θT_t + σB(T_t),   (3.5)

where µ, θ ∈ R and σ > 0 are constants referred to as the drift, asymmetry and volatility. The activity time process {T_t}_{t≥0} is positive and nondecreasing, with stationary (but not independent) increments, denoted τ_t = T_t − T_{t−1}, and it starts at zero a.s., T_0 = 0. In addition, the processes B_t and T_t are F_t-adapted.

Remark 3.3.2. In the case T_t = t, the FATGBM model in equation (3.5) becomes the classical GBM model, with the corresponding log-returns being independent and identically distributed normal random variables at any point in time t.

Remark 3.3.3. The restricted case θ = 0 is frequently called the symmetric model, in which case the price process is of the form S_t = S_0 exp{µt + σB(T_t)}.

If certain conditions are satisfied, see Kobayashi (2010) [46], the FATGBM process in equation (3.5) is the unique strong solution of the time-changed stochastic differential equation

    dS_t / S_t = µ dt + (θ + σ²/2) dT_t + σ dB_{T_t}.   (3.6)

(Since the SDE (3.6) is a homogeneous linear SDE, besides the standard assumptions for existence and uniqueness of the solution one needs additional assumptions on the process T_t; specifically, the activity time is assumed to be positive, non-decreasing, with continuous sample paths and T_0 = 0.)

The activity time process T_t has an attractive interpretation as the time over which the market price evolves. In the finance literature it is often associated with trading volume or with the flow of new information: the more information is released into the market, or the more "frenzied" trading becomes, the faster the activity time flows, see Howison and Lamper [41].

3.3.2 Properties of log-returns in the FATGBM model

As with the Black-Scholes model, one is more interested in modeling the log-returns directly rather than the stock price process; the FATGBM log-returns take the form

    X_t = log S_t − log S_{t−1} = µ + θ(T_t − T_{t−1}) + σ(B(T_t) − B(T_{t−1}))   (3.7)
        =_D µ + θτ_t + σB(τ_t)   (3.8)
        =_D µ + θτ_t + σ√τ_t B(1),   (3.9)

where τ_t = T_t − T_{t−1}. The above log-returns can be expressed simply in terms of standard normal errors,

    X_t = µ + θτ_t + σ√τ_t ξ(t),   t = 1, 2, ...,

where ξ(t) is an i.i.d. sequence of standard normal random variables, independent of the activity time τ_t, as assumed by the model. In the subsequent sections we explain the properties of the FATGBM log-returns and how they relate to the stylized facts of empirical stock returns. We will see how the moments, skewness, kurtosis and dependence structure of the returns behave as a result of our assumptions about the parameters θ, µ and Eτ_t.
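As a sanity check on representation (3.9), the following sketch simulates FATGBM unit log-returns. For illustration only, the activity increments τ_t are drawn i.i.d. Gamma(α, α) (so Eτ_t = 1), which reproduces the marginal behavior but not the supOU dependence structure introduced later; the parameter values are illustrative (α ≈ 0.3 roughly matches the MOM estimate ν̂ ≈ 3.5 found in the calibration section):

```python
import numpy as np

def fatgbm_log_returns(n, mu, theta, sigma, alpha, rng=None):
    """Simulate FATGBM unit log-returns via (3.9):
    X_t = mu + theta * tau_t + sigma * sqrt(tau_t) * xi_t,
    with illustrative i.i.d. Gamma(alpha, alpha) activity increments
    (mean 1) standing in for the supOU-based activity time."""
    rng = np.random.default_rng(rng)
    tau = rng.gamma(shape=alpha, scale=1.0 / alpha, size=n)  # E[tau_t] = 1
    xi = rng.standard_normal(n)
    return mu + theta * tau + sigma * np.sqrt(tau) * xi

x = fatgbm_log_returns(n=2770, mu=2e-4, theta=-5e-4, sigma=0.012,
                       alpha=0.3, rng=0)
print(x.mean(), x.std())
```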
In addition, the fact that the variance of the FATGBM log-returns is heteroscedastic will also be explored.

3.3.2.1 Moments, skewness and kurtosis

Assuming finiteness of at least the fourth-order moment of τ_t, the moments of the FATGBM log-returns up to fourth order are

    EX_t = µ + θEτ_t,   (3.10)
    E(X_t − EX_t)² = σ²Eτ_t + θ²M₂,   (3.11)
    E(X_t − EX_t)³ = 3θσ²M₂ + θ³M₃,   (3.12)
    E(X_t − EX_t)⁴ = 3σ⁴(M₂ + (Eτ_t)²) + 6σ²θ²(Eτ_t M₂ + M₃) + θ⁴M₄,   (3.13)

where M_i = E(τ_t − Eτ_t)^i, i = 2, 3, 4. Note that Seneta (2004) [65] assumed Eτ_t = 1 in his paper, and we exploit this assumption when calibrating the variance-gamma parameters, see Section 3.4. In the above expressions one can see how different assumptions about θ, µ and Eτ_t affect the properties of the returns. The coefficients of skewness (asymmetry) and kurtosis (fatness of tails) are, respectively,

    γ₁ = E[(X_t − EX_t)³] / (Var[X_t])^{3/2} = (3θσ²M₂ + θ³M₃) / (σ²Eτ_t + θ²M₂)^{3/2},   (3.14)
    γ₂ = E[(X_t − EX_t)⁴] / (Var[X_t])² = (3σ⁴(M₂ + (Eτ_t)²) + 6σ²θ²(Eτ_t M₂ + M₃) + θ⁴M₄) / (σ²Eτ_t + θ²M₂)².   (3.15)

Notice that symmetric log-returns (γ₁ = 0, symmetric about the mean) correspond to the case θ = 0, while skewed returns correspond to θ ≠ 0. In the symmetric case θ = 0 the kurtosis is

    γ₂ = 3 (M₂ + (Eτ_t)²) / (Eτ_t)²,

which is still ≥ 3. This allows the returns to have heavier tails than the normal distribution even when they are symmetric. From equations (3.10)-(3.13) one can also see that even when θ = 0 the variance can be heteroskedastic, i.e. one can pick E(τ_t) to be time-dependent.

3.3.2.2 The dependence structure in the FATGBM model

In this section we show that the FATGBM subordinator model is capable of reproducing the dependence properties of the simple, squared and absolute empirical log-returns, in contrast to the Black-Scholes model. First we explain some properties of the covariance of the log-returns in terms of the activity time process {τ_t}. Assuming finiteness of the relevant moments, for k ≥ 1 one gets

    cov(X_t, X_{t+k}) = cov(θτ_t + σ√τ_t B₁(1), θτ_{t+k} + σ√τ_{t+k} B₂(1))
                      = θ²(Eτ_t τ_{t+k} − Eτ_t Eτ_{t+k}) = θ² cov(τ_t, τ_{t+k}),

where B₁ and B₂ are independent copies of Brownian motion. Now let us see how the dependence structure changes as we pick different parameters. For example, when θ = 0 then cov(X_t, X_{t+k}) = 0, i.e. the log-returns are uncorrelated (independent in the Gaussian case). In addition, the above expression shows that if the increments of the activity time τ_t in the FATGBM model are independent, or if the activity time is deterministic (T_t = t, so τ_t ≡ 1), the log-returns are uncorrelated as well.

Remark 3.3.4. Note that when T_t = t, FATGBM behaves like a classical geometric Brownian motion with i.i.d. Gaussian log-returns.

Now let us explore the dependence structure of the squared returns in the general case µ, θ ≠ 0:

    cov(X_t², X_{t+k}²) = cov( (µ + θτ_t + σ√τ_t B₁(1))², (µ + θτ_{t+k} + σ√τ_{t+k} B₂(1))² )
                        = (σ⁴ + 4θ²µ² + 4θµσ²) cov(τ_t, τ_{t+k}) + θ⁴ cov(τ_t², τ_{t+k}²)
                          + (θ²σ² + 2θ³µ) ( cov(τ_t², τ_{t+k}) + cov(τ_t, τ_{t+k}²) ),

which in the symmetric case θ = 0 reduces to cov(X_t², X_{t+k}²) = σ⁴ cov(τ_t, τ_{t+k}).
From the above expression one can conclude that even in the symmetric case (θ = 0), which implies uncorrelated log-returns (cov(X_t, X_{t+k}) = 0), the covariance of the squared returns persists and depends only on the covariance of the subordinator: cov(X_t², X_{t+k}²) = σ⁴ cov(τ_t, τ_{t+k}) for θ = 0.

Remark 3.3.5. One can conclude that the dependence properties of the activity time process {τ_t} directly determine the properties of the squared log-returns {X_t²}.

Next we explore the covariance structure of the absolute log-returns when µ = θ = 0:

    cov(|X_t|, |X_{t+k}|) = cov( σ√τ_t |B₁(1)|, σ√τ_{t+k} |B₂(1)| )
                          = σ² E|B₁(1)| E|B₂(1)| cov(√τ_t, √τ_{t+k})
                          = (2/π) σ² cov(√τ_t, √τ_{t+k}).

Similarly as for the squared returns, we see that θ = 0 implies uncorrelated log-returns, while the absolute log-returns still depend directly on the covariance structure of the activity time process.

3.3.2.3 Stochastic volatility - conditional heteroscedasticity

In this section we show that, with an appropriate choice of the activity time τ_t, the FATGBM log-returns display conditional heteroscedasticity, that is, a time-dependent conditional variance. More precisely, if one chooses an activity time τ_t with a specific dependence structure, heteroscedasticity simply follows. Let F_t = σ({B(u), u ≤ T_t}, {T_u, u ≤ t}) be the sigma-algebra of information available up to time t; then the conditional variance is of the form

    Var(X_t | F_{t−1}) = E(X_t² | F_{t−1}) − (E(X_t | F_{t−1}))²   (3.16)
                       = θ² E(τ_t² | F_{t−1}) + (σ² + 2µθ) E(τ_t | F_{t−1})   (3.17)
                         − 2µθ E(τ_t | F_{t−1}) − θ² (E(τ_t | F_{t−1}))²   (3.18)
                       = θ² Var(τ_t | F_{t−1}) + σ² E(τ_t | F_{t−1}).   (3.19)

In the restricted model, when θ = 0, the above expression reduces to Var(X_t | F_{t−1}) = σ² E(τ_t | F_{t−1}), see Heyde and Liu (2001) [39].

Remark 3.3.6. What is interesting here is that comparing the time-dependent conditional variance of the FATGBM model with the constant conditional variance of the Black-Scholes model, one concludes that the former offers considerably more flexibility. That is, one can model the conditional variance (the "volatility") as one pleases: either as a random function of time or as a deterministic function of time. Note that even in the reduced/symmetric FATGBM model with θ = 0, where the returns are uncorrelated, one can still model the conditional variance.

Remark 3.3.7. It is common in industry and finance to refer to the parameter σ, which in the Black-Scholes model simply represents the standard deviation of the returns, as the volatility parameter. Throughout the applied literature this "volatility" is referred to as random or stochastic; even though the intended meaning is understood from context, this is not mathematically correct, since a deterministic standard deviation cannot be random. Hence here we use the term volatility only for the conditional variance, which can indeed be random.

3.3.3 Distribution theory

Since the FATGBM log-returns are of the form X_t =_D µ + θτ_t + σ√τ_t B(1), the conditional distribution of the return X_t given the activity increment τ_t = V is normal: when θ = 0 it has mean µ and variance σ²V, while when θ ≠ 0 the mean is µ + θV and the variance is the same. That is,

    X_t | τ_t ∼ N(µ + θτ_t, σ²τ_t).
Since the distribution of the log-returns X_t is conditionally normal, they belong to the class of normal mixture distributions, such as the so-called generalized hyperbolic distributions.

3.3.3.1 Gamma distribution of the unit increments of the activity time process

Suppose that the activity time increments τ_t follow a Gamma distribution; then the distribution of the log-returns X_t is a variance-gamma (VG) distribution. This is in fact why the distribution is called variance-gamma: conditionally on its Gamma-distributed mixing variable it is normal,

    X | V ∼ N(µ + θV, σ²V),   where V has a Gamma distribution.

Although the VG distribution was explained in detail in Section 1.5.3.3, here we introduce a different parametrization, the one later used for the calibration of the returns. As a reminder, if τ_t is distributed as Γ(α, β), where α, β > 0, then its density and characteristic function are

    f_Γ(x) = (β^α / Γ(α)) x^{α−1} e^{−βx},  x > 0,   φ_Γ(u) = (1 − iu/β)^{−α},  u ∈ R.   (3.20)

The moments of the τ_t-marginal Gamma distribution are

    Eτ_t = α/β,   M₂ = α/β²,   M₃ = 2α/β³,   M₄ = 3α(α+2)/β⁴,   (3.21)

where M_i = E(τ_t − Eτ_t)^i.

The specific choice of a Gamma marginal distribution for the activity time τ_t, coupled with the FATGBM model setup, results in the log-returns X_t having a marginal skew variance-gamma distribution, see [51], with density

    f_VG(x) = sqrt(2/π) · (β^α e^{(x−µ)θ/σ²} / (σΓ(α))) · ( |x−µ| / sqrt(θ² + 2βσ²) )^{α−1/2} · K_{α−1/2}( |x−µ| sqrt(θ² + 2βσ²) / σ² ),   (3.22)

where K_η(ω) is the modified Bessel function of the third kind with index η (referred to as the modified Bessel function of the second kind in some texts); for η ∈ R and ω > 0 it is given by

    K_η(ω) = (1/2) ∫₀^∞ z^{η−1} e^{−(ω/2)(z + 1/z)} dz.

Note that the tail behavior of the VG distribution differs from that of a normal distribution. The VG belongs to the class of semi-heavy tailed distributions, which decay more slowly than the normal but faster than heavy-tailed distributions; its tail probability is a product of a power and an exponential function. Namely, if X_t has a VG distribution, then as x → ∞ the tail probability satisfies (the notation f(x) ∼ g(x) means lim_{x→∞} f(x)/g(x) = 1)

    P(|X_t| > x) ∼ const(α, β, σ) · x^{α−1} e^{−x sqrt(2β)/σ}.

The characteristic function of the VG log-returns X_t is

    φ_VG(u) = e^{iµu} ( 1 − iθu/β + σ²u²/(2β) )^{−α},   u ∈ R.   (3.23)

In the symmetric case (θ = 0) the characteristic function takes the form

    φ_VG(u) = e^{iµu} ( 1 + σ²u²/(2β) )^{−α}.   (3.24)

Remark 3.3.8. The distribution defined in (3.22) and (3.23) will be denoted VG(µ, θ, σ², α, β).

Specializing the central-moment equations (3.10)-(3.13) of the FATGBM model to the case where τ_t has a Gamma marginal distribution, the central moments of the log-returns X_t become

    EX_t = µ + θα/β,
    E(X_t − EX_t)² = σ²α/β + θ²α/β²,
    E(X_t − EX_t)³ = 3θσ²α/β² + 2θ³α/β³,
    E(X_t − EX_t)⁴ = 3σ⁴α(α+1)/β² + 6σ²θ²α(α+2)/β³ + 3θ⁴α(α+2)/β⁴.

Remark 3.3.9. Note that so far we have defined only the univariate characteristics of the activity time process {τ_t}, i.e. we have fully defined the marginal distributions but have not yet introduced any time dependence. This will be done through a particular choice of the covariance structure for the activity time process; this step is necessary in order to get the time-dependent conditional variance and a suitable dependence structure of the returns.
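As a numerical cross-check of (3.22), the sketch below evaluates the skew VG density using scipy's modified Bessel function K_ν and verifies that it integrates to approximately one; the parameter values are illustrative, not fitted:

```python
import numpy as np
from scipy.special import kv, gammaln

def vg_pdf(x, mu, theta, sigma, alpha, beta):
    """Skew variance-gamma density (3.22): a Gamma(alpha, beta)
    mean-variance mixture of normals."""
    x = np.asarray(x, dtype=float)
    ax = np.abs(x - mu)
    s = np.sqrt(theta**2 + 2.0 * beta * sigma**2)
    # log of the constant: sqrt(2/pi) * beta^alpha / (sigma * Gamma(alpha))
    logc = (0.5 * np.log(2.0 / np.pi) + alpha * np.log(beta)
            - np.log(sigma) - gammaln(alpha))
    bessel = kv(alpha - 0.5, ax * s / sigma**2)
    return (np.exp(logc + (x - mu) * theta / sigma**2)
            * (ax / s)**(alpha - 0.5) * bessel)

# Even point count keeps x = mu off the grid, where the formula is 0 * inf
# numerically (the density itself is finite there for alpha > 1/2).
x = np.linspace(-0.2, 0.2, 20000)
pdf = vg_pdf(x, mu=0.0, theta=-0.001, sigma=0.012, alpha=0.7, beta=0.7)
print(np.trapz(pdf, x))  # should be close to 1
```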
Remark 3.3.10. Notice that if an appropriate activity time process τ_t, along with a convenient dependence structure, is chosen, then all the empirical properties of the stock log-returns can be modeled accordingly.

3.4 Data calibration

In this section we validate the approach of using a version of the FATGBM process to model the stock price evolution. The marginal distribution of the activity time is chosen to be a Gamma distribution, which in turn implies that the marginal distribution of the log-returns is variance-gamma. We showcase the strengths of this approach through a series of plots and tables presenting the stylized facts of the log-returns discussed previously, comparing it to the classical geometric Brownian motion (GBM) model.

The objective of this calibration exercise is to estimate the parameters of the variance-gamma (VG) distribution while keeping the assumption of dependent log-returns. While in the option pricing literature there are many ways to approach model calibration problems, a frequently used method is to minimize the squared residual errors between the model and the data at hand. Although this approach is effective and does not require many assumptions about the underlying distribution of the returns, it is considerably more complex to implement than classical maximum likelihood (ML) or the method of moments (MOM). Because the returns in our model are dependent, the regular MLE is not appropriate, so we take the simpler route of deriving parameter estimates for the VG distribution using the method of moments. Importantly, this technique, relying on stationarity of the increments and the ergodicity property, gives consistent estimators. As an exploration, we also used a version of the ML technique called pseudo-likelihood for comparison purposes; it is explained below and in the results section in more detail.

We digress for a moment to briefly compare the strengths and weaknesses of the MOM estimation technique and the maximum likelihood approach. It is well known that even though the MOM is a relatively simple and reliable technique which, under very mild assumptions, gives consistent estimators, these estimators are often biased. In addition, when working with small samples, MOM estimates can fall outside the parameter space. On the other hand, when estimating parameters from a known family of distributions, the maximum likelihood estimator is very efficient. There are situations when using the method of moments is more appropriate, for example when the likelihood equations are intractable or when one does not want to make any distributional assumptions. In our case, we want to keep the assumption of dependent observations and use estimation techniques that take this into account, hence the regular MLE would not be appropriate.

In addition, we briefly experiment with the pseudo-likelihood (PL) method, where the PL is an approximation of the joint likelihood of a set of observed data. Pseudo-likelihood and quasi-likelihood are very useful techniques in situations where the full likelihood is difficult to work with or unknown; in our case there is dependence in the data, hence the joint likelihood is unknown. The pseudo-likelihood approach maximizes the product of the marginal densities, as opposed to the full joint density. If the observations are independent, the two coincide.
In certain cases one can show that the pseudo-likelihood estimators are even consistent, see Cox and Reid (2004) [17]. We attempt to use the MLE technique along with the unrealistic assumption of independent increments, which is, in a way, equivalent to using the pseudo-likelihood method: we maximize the product of the marginal densities on the assumption that "the observations are independent", which is nothing but the product approximation of the joint likelihood, and the weaker the dependence in the data, the closer the two will be. This means that the MLE parameter estimates may not be far from the true parameter values, especially if the dependence is not pronounced. As we shall see in the results section 3.4.2, our calibration exercise showed that the MLE estimators fit better than the MOM ones, which suggests that the PL is relatively close to the joint likelihood of the data. In addition, the parameter θ turned out to be close to zero, which intuitively implies that the dependencies are not that strong.

3.4.1 VG parameter estimation using method of moments

As mentioned, we consider FATGBM returns, see (3.5) and (3.7), given by

    X_t = log S_t − log S_{t−1} = µ + θτ_t + σ√τ_t B(1),   (3.25)

where µ, θ and σ > 0 are real constants and τ_t = T_t − T_{t−1} is the increment of the activity time process: positive, increasing, with stationary increments, and independent of the Brownian motion. We assume the increments of the activity time follow a marginal gamma distribution Γ(α, β). In addition, since Eτ_t < ∞, without loss of generality we can assume that

    Eτ_t = 1,   (3.26)

which simply scales the expected activity time change to units of one over unit calendar time; the scaling constant is absorbed into θ and σ. Since Eτ_t = 1, the moments of the Gamma distribution given in (3.21) become

    Eτ_t = 1,   M₂ = 1/α,   M₃ = 2/α²,   M₄ = 3(α+2)/α³.   (3.27)

Given the above assumptions, when the increments of the activity time follow a marginal gamma distribution Γ(α, α), the returns follow a marginal (skew) VG distribution, denoted VG(µ, θ, σ², α) and given by

    f_VG(x) = sqrt(2/π) · (α^α e^{(x−µ)θ/σ²} / (σΓ(α))) · ( |x−µ| / sqrt(θ² + 2ασ²) )^{α−1/2} · K_{α−1/2}( |x−µ| sqrt(θ² + 2ασ²) / σ² ).   (3.28)

For this VG(µ, θ, σ², α) distribution, equations (3.10)-(3.13) give the central moments

    EX_t = µ + θ,   (3.29)
    E(X_t − EX_t)² = σ² + θ²M₂,   (3.30)
    E(X_t − EX_t)³ = 3θσ²M₂ + θ³M₃,   (3.31)
    E(X_t − EX_t)⁴ = 3σ⁴(M₂ + 1) + 6σ²θ²(M₂ + M₃) + θ⁴M₄.   (3.32)

Substituting the moments M₂, ..., M₄ from (3.27) into the above equations, we obtain

    EX_t = µ + θ,   (3.33)
    E(X_t − EX_t)² = σ² + θ²/α,   (3.34)
    E(X_t − EX_t)³ = 3θσ²/α + 2θ³/α²,   (3.35)
    E(X_t − EX_t)⁴ = 3σ⁴(1 + 1/α) + (6σ²θ²/α)(1 + 2/α) + (3θ⁴/α²)(1 + 2/α).   (3.36)

From these equations we can obtain expressions for the coefficients of skewness and kurtosis. If we assume θ ≈ 0, we can ignore all terms of order θ², θ³, θ⁴, ... in equations (3.33)-(3.36), and so we get

    µ₁ = EX_t = µ + θ,   (3.37)
    µ₂ = E(X_t − EX_t)² ≈ σ²,   (3.38)
    µ₃ = E(X_t − EX_t)³ ≈ 3θσ²/α,   (3.39)
    µ₄ = E(X_t − EX_t)⁴ ≈ 3σ⁴(1 + 1/α).   (3.40)

Denote the sample (central) moments by µ̂_k(N) = (1/N) Σ_{i=1}^{N} (X_i − X̄(N))^k, for k = 1, 2, 3, 4, where X̄(N) = (1/N) Σ_{i=1}^{N} X_i is the sample average.
Now, equating the theoretical moments to the sample moments, we get

    µ̂₁ = µ̂ + θ̂,   µ̂₂ = σ̂²,   µ̂₃ = 3θ̂σ̂²/α̂,   µ̂₄ = 3σ̂⁴(1 + 1/α̂).

From now on, to simplify the presentation of the results, instead of the parameter α we use ν = 1/α, in which case the MOM estimators are given by the following set of formulas:

    µ̂ = µ̂₁ − θ̂,   σ̂² = µ̂₂,   θ̂ = α̂µ̂₃/(3σ̂²) = µ̂₃/(3σ̂²ν̂),   α̂ = 1/( µ̂₄/(3σ̂⁴) − 1 ),   ν̂ = µ̂₄/(3σ̂⁴) − 1.   (3.41)

Remark 3.4.1. If θ̂ in the actual sample is small, then the full set of equations (3.33)-(3.36) will be very closely satisfied by the MOM estimators µ̂, θ̂, σ̂ and α̂ (= 1/ν̂).

Remark 3.4.2. Ideally we would like to estimate the full set of parameters of the VG distribution VG(µ, θ, σ², α, β), but because solving a large system of nonlinear equations can be challenging at best, we make some small simplifications in order to obtain meaningful results. Note that no independence-of-increments assumption is needed to estimate the above parameters.

Remark 3.4.3. Notice that even though we assumed θ is small, we avoided setting θ = 0 from the start, in which case we would be working with the symmetric VG distribution and a model with skewness parameter equal to zero and uncorrelated returns. To be more specific, setting θ = 0 in (3.7) means that the returns X_t are distributed symmetrically about their mean µ with variance σ². Actually, in the financial context the symmetric model assumption is not far from the truth, since near-symmetry is usually the case when working with empirical data.
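The estimators (3.41) translate directly into code; a sketch computing (µ̂, θ̂, σ̂, ν̂) from a return series under the small-θ approximation (placeholder data again; note the formulas require ν̂ > 0, i.e. sample kurtosis above 3):

```python
import numpy as np

def vg_mom_estimates(x):
    """Method-of-moments estimates (3.41) for VG(mu, theta, sigma^2, alpha),
    under the small-theta approximation and E[tau_t] = 1."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    m2, m3, m4 = (np.mean(xc**k) for k in (2, 3, 4))
    sigma2 = m2
    nu = m4 / (3.0 * sigma2**2) - 1.0   # nu = 1/alpha; needs kurtosis > 3
    theta = m3 / (3.0 * sigma2 * nu)
    mu = x.mean() - theta
    return mu, theta, np.sqrt(sigma2), nu

rng = np.random.default_rng(4)
returns = 0.01 * rng.standard_t(df=4, size=2770)   # placeholder series
mu_hat, theta_hat, sigma_hat, nu_hat = vg_mom_estimates(returns)
print(mu_hat, theta_hat, sigma_hat, nu_hat)
```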
3.4.2 VG parameter fitting - results

The data used for this calibration exercise are the same as in the previous chapters: daily adjusted closing prices of the Standard and Poor's 500 Index (S&P 500) over the eleven-year period from the 9th of September 2005 to the 9th of September 2016. The total number of daily log prices was N = 2770. Figure 3.6a shows the S&P 500 price evolution over the period 2005-2016, and Figure 3.6b shows the log-returns from the same period. We tried a couple of different ways to calibrate the parameters: two method-of-moments approaches are explained in the first and second examples, while the MLE approach is given in the third.

[Figure: (a) Empirical price path for the S&P 500 index against the simulated GBM price evolution; (b) empirical log-return sequence for the S&P 500 index against the simulated GBM returns.]

Example 3.4.1. In this example we followed exactly the estimation approach suggested in Seneta's (2004) paper [65], as explained previously: we assumed θ ≈ 0, so the terms θ², θ³, θ⁴, ... are ignored. Using the MOM estimates provided in (3.37)-(3.41), the results are:

    µ̂ = 1.9458 × 10⁻⁴,   σ̂ = 0.0127,   θ̂ = −1.0477 × 10⁻¹¹,   ν̂ = 3.5266   (β̂ = 0.2836).

First, notice that θ̂ is very small, hence θ², θ³, θ⁴, ... are nearly negligible, which implies the above estimates can be considered relatively close to the moment estimates. The resulting VG density is shown in Figure 3.9a. To compare this method with the other two approaches in the examples below, see Figure 3.6.

Example 3.4.2. In this example we kept the terms θ and θ² while ignoring θ³, θ⁴, ... in equations (3.33)-(3.36) and solved the resulting system of nonlinear equations in MATLAB. The estimates obtained are:

    µ̂ = 0.0005978,   σ̂ = 0.0127071,   θ̂ = −0.0004032197,   ν̂ = 3.50211.

Here too θ̂ is relatively small, though larger than in the previous example. Note that while σ̂ and ν̂ are essentially unchanged, the estimates for µ and θ are considerably affected compared to the previous example, which casts doubt on the assumption θ ≈ 0. The VG density with the estimated parameters is shown in Figure 3.9b. To compare this method with the other two approaches, see Figure 3.6.

Example 3.4.3. In the final example we fitted the parameters of the VG distribution by maximizing the corresponding likelihood under the additional assumption of independent increments. For the initial values of the algorithm we used the MOM estimates, which is standard procedure. We interpret the resulting estimates as quasi-likelihood, or better, pseudo-likelihood type estimates: since the pseudo-likelihood is simply an approximation of the joint likelihood, any differences between these and the MOM estimates can be explained by possibly persistent dependencies in our data, or simply by the independent-increments assumption being inappropriate to start with. The PLE estimates are:

    µ̂ = 0.0008363,   σ̂ = 0.0119243,   θ̂ = −0.0006523,   ν̂ = 1.3369743.

In general, comparing the pseudo-likelihood and MOM estimates from the previous example, they are at least of the same order: σ̂ and θ̂ are close, while ν̂ and µ̂ differ. Looking at the plot comparing all three fitting examples against the empirical log-returns, Figure 3.6, we see that the MLE estimates provide a far better fit than both MOM estimates. A possible explanation is that under the circumstances the approximated likelihood is very close to the true joint likelihood; see the conclusion section below for a more detailed explanation. In Figure 3.9c we plot the calibrated VG densities for all three methods; observe that the MLE-fitted density is the least peaked. Finally, since the MLE performed best, we include further plots related to the MLE estimation, see Figures 3.10a-3.10d.

To illustrate the fit we compare the empirical returns with the VG and the inferior Gaussian distribution in Figure 3.7. One can clearly see that the VG fits the data better than the normal distribution: in Figure 3.7, in addition to higher peaks than the Gaussian, one can observe that typical return data also have heavier tails. To illustrate the tail fit, we plot the tails of the VG distribution, fitted by MOM and MLE, alongside the NIG and normal distributions, see Figures 3.8a and 3.8b. The conclusions are similar to those stated previously: the MLE method generally does better than MOM, and both the VG and NIG distributions are more appropriate for the data than the normal.

[Figure 3.6: Fitting of VG distribution parameters to the S&P 500 Index returns. The plot shows the histogram of the empirical returns along with three differently calibrated VG densities: the VG MOM method (Example 2), the VG MLE method (Example 3), and the normal curve.]
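For completeness, a hedged sketch of the pseudo-likelihood fit of Example 3.4.3: it maximizes the sum of the log marginal VG densities, started from the MOM estimates. It reuses the hypothetical helpers vg_pdf and vg_mom_estimates sketched earlier and a placeholder returns array; it is not the code used to produce the thesis results:

```python
import numpy as np
from scipy.optimize import minimize

def neg_pseudo_loglik(params, x):
    """Negative pseudo-log-likelihood: minus the sum of log marginal
    VG densities (the product approximation of the joint likelihood)."""
    mu, theta, log_sigma, log_alpha = params
    sigma, alpha = np.exp(log_sigma), np.exp(log_alpha)
    pdf = vg_pdf(x, mu, theta, sigma, alpha, beta=alpha)  # beta = alpha, E[tau]=1
    if np.any(pdf <= 0) or np.any(~np.isfinite(pdf)):
        return np.inf
    return -np.sum(np.log(pdf))

mu0, theta0, sigma0, nu0 = vg_mom_estimates(returns)      # MOM starting point
x0 = np.array([mu0, theta0, np.log(sigma0), np.log(1.0 / nu0)])
res = minimize(neg_pseudo_loglik, x0, args=(returns,), method="Nelder-Mead")
mu_hat, theta_hat = res.x[0], res.x[1]
sigma_hat, nu_hat = np.exp(res.x[2]), 1.0 / np.exp(res.x[3])
```

The log-parametrization of σ and α keeps both positive without explicit bound constraints, which is why the derivative-free Nelder-Mead method is a convenient choice here.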
[Figure 3.7: Fitting VG, NIG and normal distribution parameters to the S&P 500 Index returns. The plot shows the histogram of the empirical returns along with the three fitted densities.]

[Figure 3.8: Tail probabilities of the fitted VG (MOM, MLE), NIG and normal distributions calibrated on the S&P 500 returns. The plot shows how the three distributions fit the empirical distribution.]

[Figure 3.9: Fitting of VG distribution parameters to the S&P 500 Index returns. The plot shows the three estimation examples: (a) Example 1, MOM method (θ ≠ 0); (b) Example 2, MOM method (θ, θ² ≠ 0); (c) Example 3, MLE method.]

[Figure 3.10: Fitting the VG distribution to the S&P 500 Index returns. The plots show the outputs of the MLE approach: (a) VG density, (b) VG log-density, (c) VG Q-Q plot, (d) VG P-P plot; fitted parameters (vgC, sigma, theta, nu) = (0.001, 0.012, −0.001, 1.337).]

3.4.3 Conclusion

Based on all the provided results, we see that the PL (pseudo-likelihood) gives the best fit to the data at hand, which seems counterintuitive in this situation, see Figure 3.6. One of the main properties of this model is that θ ≠ 0, i.e. the existence of a dependence structure in the log-returns. If we simply disregard this by assuming independence, one would expect to get far worse estimates; hence we would expect MOM to be superior to the MLE. One possible explanation is that the degree of autocorrelation of the increments in the data is not as high as one would expect, i.e. the parameter θ is quite small. This leads us to the conclusion that, unless the data at hand are extremely skewed and/or correlated and, in addition, the interest lies particularly in estimating only the asymmetry (dependence) parameter θ, pseudo-likelihood estimation is probably the best bet. Note, however, that one is unlikely to encounter such highly skewed and dependent data in practice; the data we worked with are quite realistic values of actual stock prices. Even so, the results are somewhat surprising, the more so given that in the non-independence case the pseudo-likelihood has theoretical drawbacks.
Given that maximum likelihood is the superior estimation method for independent data, it seems that, at least for the cases considered, the dependence structure did not make the true empirical joint density sufficiently different from the product of the marginal densities to cause the MOM method to perform better than the MLE.

APPENDICES

Appendix A Mixing Coefficients and some important results

Denote by α(t) the strong mixing coefficient and by β(t) the β-mixing coefficient (absolute regularity), see Bradley (2008) [13]:

    α(t) = sup |P(A ∩ B) − P(A)P(B)|,   A ∈ F_s, B ∈ F^{s+t},

and

    β(t) = sup |P(A | B) − P(A)|,   A ∈ F_s, B ∈ F^{s+t},

where F_s = σ(Y(u) : u ≤ s) and F^{s+t} = σ(Y(u) : u ≥ s + t), for any t ≥ 0. Below we present a very important result on mixing, due to Davydov [18].

Theorem A.0.1 (Davydov). Let Y_t satisfy the strong mixing condition, i.e. α(t) → 0 as t → ∞. Let also E|Y₀|^{2+σ} = τ^{2+σ} < ∞ and Σ_n α(n)^{σ/(2(2+σ))} = C < ∞ for some σ > 0. If σ_n² = Var( Σ_{j=0}^{n−1} Y_j ) → ∞, then Y_n(t) converges in the sense of finite-dimensional distributions to W(t) as n → ∞, where Y_n(t) is the interpolated partial-sum process on [0, 1] and W(t) is a Brownian motion.
Appendix B

Financial prerequisites

Financial instruments

In this section we give a brief overview of the basic financial instruments. There are different ways in which companies can raise funds for working capital and capital expenditures. One is raising capital by equity financing, which refers to the sale of an ownership interest in a company by selling shares/stocks. The value of a company's shares typically reflects the value of the company's real assets along with its earning power. Another way of financing, called debt financing, occurs when a firm raises money by selling bonds, bills or notes to individuals and institutional investors. In return for lending the money, the investors or creditors receive a promise that both the principal and the interest on the debt will be repaid at some specified time in the future. Below we give definitions of the most commonly used instruments in the market.

Definition B.0.1. A stock (also known as a share or equity) is a type of security that provides partial ownership in a company; the holder of a stock (a shareholder) has a claim on part of the company's assets and earnings. (The level of ownership is determined by the ratio of shares held to the number of shares outstanding.)

Definition B.0.2. A stock index is a measure that follows the value of a portfolio of stocks representing the overall market or only a portion of it. (Typically, depending on which sector of the stock market one is interested in, a smaller sample of stocks is selected as a good representative of the whole and used to track the performance of that sector.)

Ideally, a change in the index price should be proportional to a change in the prices of the stocks included. In the past, the price of a stock market index used to be calculated as an average of the included stock prices, while nowadays a weighted average is used, with either price-based weighting (which is very susceptible to stock splits, since a split changes a company's weight in the index) or, most commonly, market capitalization based weighting; the market capitalization ("market cap") is the total dollar market value of a company's outstanding shares, calculated by multiplying the number of shares outstanding by the current market price of one share. A toy computation contrasting the two weighting schemes is sketched at the end of this subsection. Keep in mind that a stock index is nothing more than a list of stocks and that anybody can create one. The most commonly known indices are the Dow Jones Industrial Average (DJIA), the Standard and Poor's 500 (S&P 500), the Nasdaq Composite, etc. The DJIA is one of the most recognized indices and used to be referred to as "the market"; it contains 30 of the largest and most influential companies in the U.S. On the other hand, the S&P 500 index contains 500 of the most widely held companies in the U.S., which makes it a better representative of the entire U.S. stock market, and it is commonly used as a market benchmark.
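The following toy computation (hypothetical prices and share counts, purely for illustration) contrasts the two weighting schemes mentioned above:

```python
# A minimal sketch with made-up numbers: price weighting averages the share
# prices, while market-cap weighting weights each stock by price * shares
# outstanding, so a lower-priced stock with many shares can still dominate.
prices = {"A": 100.0, "B": 50.0, "C": 25.0}      # current share prices (hypothetical)
shares_out = {"A": 1e6, "B": 10e6, "C": 5e6}     # shares outstanding (hypothetical)

# Price-weighted level (DJIA-style, ignoring the divisor used in practice):
price_weighted = sum(prices.values()) / len(prices)

# Market-cap weights (S&P 500-style):
caps = {k: prices[k] * shares_out[k] for k in prices}
total_cap = sum(caps.values())
weights = {k: caps[k] / total_cap for k in caps}

print("price-weighted level:", price_weighted)
print("market-cap weights  :", weights)   # stock B dominates despite its lower price
```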
Modeling assumptions

The models used in this thesis share common underlying assumptions about the market, its participants and the mathematical structure of the price processes. In this section we give a brief overview of, and the rationale behind, these assumptions. In every model we assumed a simplistic market framework in which only two assets are traded continuously: a risky asset and a riskfree asset. Hence, the first group of assumptions relates to the risky instruments and is frequently referred to as the set of "frictionless market assumptions", while the other set of assumptions is associated with the riskfree asset itself.

Frictionless market assumptions:

1. No transaction costs: By transaction costs we mean the broker's commission, the bid/ask spread, taxes, margin requirements, etc. Note that this is not as strong an assumption as it seems, since most of the above requirements are satisfied for large market participants and financial institutions, which makes it a very reasonable approximation of reality.

2. Perfectly liquid markets: This assumption implies that it is possible to trade any amount or fraction of a financial instrument available on the market, at any given time. More precisely, it is a market with a large number of bid and ask offers, low spreads and also lower volatility. In such markets it is straightforward to execute a trade quickly and at any desirable price, since there are numerous buyers and sellers; relative changes in supply and demand have only a small impact on the price of an instrument. The opposite of a liquid market is often called a thin or illiquid market. An example of a perfectly liquid market is the market for the stock of a Fortune 500 company, or the forex market, which is considered the largest and most liquid market. The assumption of a perfectly liquid market is well justified here, considering that we are modeling stocks of Fortune 500 companies, for which demand and supply are very high.

3. No restrictions, penalties or fees on short sales: In February 2010, the Securities and Exchange Commission (SEC) adopted a new short sale price restriction, commonly referred to as the alternative uptick rule, designed to preserve investor confidence and promote market efficiency, since short selling can potentially have both a beneficial and a harmful impact on the market. (The original uptick rule, or short sale rule, was in place from 1938 until July 2007 as a way to control a declining market, since at that time mechanisms and liquidity could not be guaranteed to prevent panic stock declines or other types of manipulation. It stated that stocks could be shorted only on an uptick, i.e. when the last share price is higher than the one before, or on a zero-plus tick, i.e. when the last trade equals the previous one, which was itself an uptick, and not on a downtick. Naked shorting, i.e. selling shares whose availability cannot be verified, remains forbidden.) The alternative uptick rule is in place to restrict short selling from further driving down the price of a stock that has dropped more than 10 percent in one day relative to the previous day's closing price. Once this circuit breaker (also known as a collar: a control measure approved by the SEC to prevent panic sellouts on the U.S. stock exchanges and to manage excessive volatility in the market) is triggered, long sellers gain an advantage over short sellers, so that they can sell their shares and exit their positions before the price drops even further.

4. No transaction delays: Trading is continuous, i.e. markets never close, and there is no delay in the execution of trades. Note that with this assumption we are excluding high-frequency trading as a possibility.

5. No default risk: The risk of failure of a company or of bankruptcy, along with other risks due to economic and political activity, is ignored for simplicity. This makes the assumption of equal continuously compounded interest rates for borrowing and lending plausible.

6. Competitive markets (market participants act as price takers): Price takers are individuals and companies who do not have enough market share to influence the market price on their own, so they have to accept the currently available spot price. This means that the market is not influenced by the trading of individuals, and that even large trades by a single participant cannot influence the prices in the market. In addition, we assume that brokers accept all trades.

7. Rational agents (market participants prefer more to less): This is a relatively weak assumption, since it does not impose anything on the preferences and beliefs of the market participants except that they are rational in the sense that they prefer more money for the same services.

8. No arbitrage: The rationale behind this assumption is that if there existed a contract in the market offering "something for nothing" (also known as a "free lunch"), then the simple principle of supply and demand would force its price to converge to an equilibrium price.

Riskfree asset assumptions:

The second set of assumptions concerns the riskless asset, or bank account, and consists of the following conditions:

1. As mentioned previously, our market framework consists of two assets: a stock as the risky asset and a bank account as the riskless one. Instead of a stock one can use an index, and instead of a bank account one can use a riskfree bond. It is also assumed that all investors in the market are allowed to trade continuously up to a fixed finite time horizon T.

2. The bank account B_t is assumed to grow exponentially, as described by the differential equation dB_t/B_t = r dt, with r ≥ 0. In other words, we assume that the continuously compounded rate of return of the asset B_t is r, where r is deterministic and constant; the explicit solution is derived just after this list.

3. It is assumed that investors can deposit and borrow money at the same interest rate, i.e. the borrowing and lending rates are both equal to the riskfree rate of return r.
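For completeness, the bank account equation in assumption 2 integrates in closed form (a standard derivation, added here for the reader's convenience):

d(ln B_t) = dB_t / B_t = r dt  ⟹  ln B_t − ln B_0 = rt  ⟹  B_t = B_0 e^{rt}, 0 ≤ t ≤ T,

so an initial deposit B_0 grows deterministically at the continuously compounded rate r; for example, with r = 0.05 per year, B_1 = B_0 e^{0.05} ≈ 1.0513 B_0.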
Probability setup

Even though in practice stock prices are observed only when the exchange is open, and prices are restricted to discrete values (often multiples of 0.01 dollars), we use a continuous-time stochastic process with continuous realizations to model the stock price evolution. Hence, we model the price process of our asset by a continuous-time stochastic process {S_t : 0 ≤ t ≤ T}, where S_t gives the price of the stock at time t.

Let (Ω, F, P) be a probability space. We assume that Ω is the set of all possible outcomes that we are interested in, e.g. the set of all possible outcomes for one stock. F contains the set of all events we would like to evaluate, e.g. the event that the stock price will be above $100. The probability measure P reflects everything that happens in the market, including both investors' personal beliefs about the price and their risk preferences; that is, the measure P includes investors' individual biases. An example is given below.

Example B.0.1. When deciding whether to go long or short in a certain stock, an investor would like to know the chances of the stock increasing or decreasing. First, they will look at the historical data to see whether there are any increasing or decreasing trends, which provides an initial belief about the distribution of up and down movements. Second, they will look at how volatile the stock was in the past, and based on their risk preference they will adjust their initial probabilities. To summarize, based on historical data and their risk preference they construct a subjective probability measure which they use to make decisions.

Remark B.0.1. Notice that the investor's risk preference, risk averse versus risk seeking, is an important factor both when building an investment portfolio and when pricing instruments with the same return value but different levels of risk. As such, the risk preference has a strong influence on investors' decision making and is therefore reflected in the overall market probability measure P.

BIBLIOGRAPHY

[1] D. Applebaum. Lévy Processes and Stochastic Calculus. Cambridge University Press, Cambridge, 2009.

[2] D. Applebaum. Lectures on Lévy Processes and Stochastic Calculus (University of Sheffield). Lecture 5: The Ornstein-Uhlenbeck Process, July 2010.

[3] O.E. Barndorff-Nielsen. Processes of normal inverse Gaussian type. Finance and Stochastics, 2(1):41–68, 1997.

[4] O.E. Barndorff-Nielsen. Superposition of Ornstein-Uhlenbeck type processes. Theory of Probability & Its Applications, 45(2):175–194, 2001.

[5] O.E. Barndorff-Nielsen, J.L. Jensen, and M. Sorensen. Some stationary processes in discrete and continuous time. Advances in Applied Probability, 30:989–1007, 1998.

[6] O.E. Barndorff-Nielsen and N.N. Leonenko. Burgers' turbulence problem with linear or quadratic external potential. Journal of Applied Probability, 42(2):550–565, 2005.

[7] O.E. Barndorff-Nielsen and N.N. Leonenko. Spectral properties of superpositions of Ornstein-Uhlenbeck type processes. Methodology and Computing in Applied Probability, 7(3):335–352, 2005.

[8] O.E. Barndorff-Nielsen and N. Shephard. Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2):167–241, 2001.

[9] O.E. Barndorff-Nielsen and R. Stelzer. Multivariate supOU processes. The Annals of Applied Probability, 21(1):140–182, 2011.

[10] C. Bender and T. Marquardt. Integrating volatility clustering into exponential Lévy models. Journal of Applied Probability, 46(3):609–628, 2009.

[11] L. Bertini and N. Cancrini. The stochastic heat equation: Feynman-Kac formula and intermittence. Journal of Statistical Physics, 78(5-6):1377–1401, 1995.
[12] F. Black and M. Scholes. The pricing of options and corporate liabilities. Journal of Political Economy, 81:637–654, 1973.

[13] R.C. Bradley. Basic properties of strong mixing conditions. A survey and some open questions. Probability Surveys, 2:107–144, 2005.

[14] R. Carmona and S.A. Molchanov. Parabolic Anderson Problem and Intermittency, volume 518. American Mathematical Society, 1994.

[15] A.V. Chechkin, V.Yu. Gonchar, M. Szydłowski, et al. Fractional kinetics for relaxation and superdiffusion in a magnetic field. Physics of Plasmas, 9(1):78–88, 2002.

[16] R. Cont. Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance, 1:223–236, 2001.

[17] D.R. Cox and N. Reid. A note on pseudolikelihood constructed from marginal densities. Biometrika, 91:729–737, 2004.

[18] Yu.A. Davydov. Convergence of distributions generated by stationary stochastic processes. Theory of Probability & Its Applications, 13(4):691–696, 1968.

[19] E. Eberlein and S. Raible. Term structure models driven by general Lévy processes. Mathematical Finance, 9:31–53, 1999.

[20] I. Eliazar and J. Klafter. Lévy, Ornstein-Uhlenbeck, and subordination: Spectral vs. jump description. Journal of Statistical Physics, 119(1-2):165–196, 2005.

[21] P. Embrechts and M. Maejima. Selfsimilar Processes. Princeton University Press, Princeton, NJ, 2002.

[22] E. Iglói and G. Terdik. Superposition of diffusions with linear generator and its multifractal limit process. ESAIM: Probability and Statistics, 7:23–88, 2003.

[23] E. Fama. The behavior of stock-market prices. Journal of Business, 38:34–105, 1965.

[24] R. Finlay and E. Seneta. Stationary-increment Student and variance-gamma processes. Journal of Applied Probability, 43(4):441–453, 2006.

[25] R. Finlay and E. Seneta. Option pricing with VG-like models. International Journal of Theoretical and Applied Finance, 11(8), 2007.

[26] R. Finlay and E. Seneta. A gamma activity time process with noninteger parameter and self-similar limit. Journal of Applied Probability, 44(4):943–955, 2008.

[27] U. Frisch. Turbulence: The Legacy of A.N. Kolmogorov. Cambridge University Press, Cambridge, 1995.

[28] H. Fujisaka. Theory of diffusion and intermittency in chaotic systems. Progress of Theoretical Physics, 71(3):513–523, 1984.

[29] J. Gärtner, F. den Hollander, and G. Maillard. Intermittency on catalysts: voter model. The Annals of Probability, 38(5):2066–2102, 2010.

[30] J. Gärtner and F. den Hollander. Intermittency in a catalytic random medium. The Annals of Probability, 34(6):2219–2287, 2006.

[31] J. Gärtner, W. König, and S. Molchanov. Geometric characterization of intermittency in the parabolic Anderson model. The Annals of Probability, 35(2):439–499, 2007.

[32] J. Gärtner and S.A. Molchanov. Parabolic problems for the Anderson model. Communications in Mathematical Physics, 132(3):613–655, 1990.

[33] D. Grahovac and N.N. Leonenko. Bounds on the support of the multifractal spectrum of stochastic processes. arXiv preprint arXiv:1406.2920, 2015.

[34] D. Grahovac, N.N. Leonenko, A. Sikorskii, and M. Taqqu. The unusual properties of long-range dependent superpositions of Ornstein-Uhlenbeck type processes. 2016 (under review).

[35] C.W.J. Granger. The past and future of empirical finance: Some personal comments. Journal of Econometrics, 129:35–40, 2005.

[36] C.C. Heyde and N.N. Leonenko. Student processes. Advances in Applied Probability, 37:342–365, 2005.

[37] C.C. Heyde. A risky asset model with strong dependence through fractal activity time. Journal of Applied Probability, 36:1234–1239, 1999.
[38] C.C. Heyde. Scaling issues for risky asset modelling. Mathematical Methods of Operations Research, 69:593–603, 2009.

[39] C.C. Heyde and S. Liu. Empirical realities for a minimal description risky asset model. The need for fractal features. Journal of the Korean Mathematical Society, 38:1047–1059, 2001.

[40] C.C. Heyde and Y. Yang. On defining long-range dependence. Journal of Applied Probability, 34:939–944, 1997.

[41] S. Howison and D. Lamper. Trading volume in models of financial derivatives. Applied Mathematical Finance, 8:119–135, 2001.

[42] G. Jongbloed, F.H. van der Meulen, and A.W. van der Vaart. Nonparametric inference for Lévy-driven Ornstein-Uhlenbeck processes. Bernoulli, 11(5):759–791, 2005.

[43] Z.J. Jurek and W. Vervaat. An integral representation for selfdecomposable Banach space-valued random variables. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 62:247–262, 1983.

[44] D. Khoshnevisan. Analysis of Stochastic Partial Differential Equations, volume 119. American Mathematical Society, 2014.

[45] D. Khoshnevisan, K. Kim, and Y. Xiao. Intermittency and multifractality: A case study via parabolic stochastic PDEs. arXiv preprint arXiv:1503.06249, 2015.

[46] K. Kobayashi. Stochastic calculus for a time-changed semimartingale and the associated stochastic differential equations. Journal of Theoretical Probability, 24(3):789–820, 2011.

[47] N. Leonenko and E. Taufer. Convergence of integrated superpositions of Ornstein-Uhlenbeck processes to fractional Brownian motion. Stochastics: An International Journal of Probability and Stochastic Processes, 77(6):477–499, 2005.

[48] N.N. Leonenko, S. Petherick, and A. Sikorskii. Fractal activity time models for risky asset with dependence and generalized hyperbolic distributions. Stochastic Analysis and Applications, 30(3):476–492, 2012.

[49] N.N. Leonenko, S. Petherick, and A. Sikorskii. A normal inverse Gaussian model for a risky asset with dependence. Statistics & Probability Letters, 82(1):109–115, 2012.

[50] L. Li and V. Linetsky. Time-changed Ornstein-Uhlenbeck processes and their applications in commodity derivative models. Mathematical Finance, 24(2):289–330, 2014.

[51] D.B. Madan, P.P. Carr, and E.C. Chang. The variance gamma process and option pricing. European Finance Review, 2:79–105, 1998.

[52] D.B. Madan and E. Seneta. The variance gamma (V.G.) model for share market returns. The Journal of Business, 63(4):511–524, 1990.

[53] B. Mandelbrot and H.M. Taylor. On the distribution of stock price differences. Operations Research, 15(6):1057–1062, 1967.

[54] V. Mandrekar and B. Rüdiger. Stochastic Integration in Banach Spaces. Springer Verlag, 2007.

[55] H. Masuda. On multidimensional Ornstein-Uhlenbeck processes driven by a general Lévy process. Bernoulli, 10(1):97–120, 2004.

[56] R.C. Merton. An intertemporal capital asset pricing model. Econometrica, 41:867–887, 1973.

[57] S.A. Molchanov. Ideas in the theory of random media. Acta Applicandae Mathematicae, 22(2-3):139–282, 1991.

[58] S.I. Resnick. Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer Science & Business Media, New York, 2007.

[59] L.M. Ricciardi and L. Sacerdote. The Ornstein-Uhlenbeck process as a model for neuronal activity. Biological Cybernetics, 35(1):1–9, 1979.

[60] A. Rocha-Arteaga and K. Sato. Topics in Infinitely Divisible Distributions and Lévy Processes. Sociedad Matemática Mexicana, Mexico, 2003.

[61] T.H. Rydberg. Realistic statistical modelling of financial data. International Statistical Review, 68:233–258, 2000.

[62] T. Sandev, A.V. Chechkin, N. Korabel, H. Kantz, I.M. Sokolov, and R. Metzler. Distributed-order diffusion equations and multifractality: Models and solutions. Physical Review E, 92(4):042117, 2015.
[63] K. Sato. Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Cambridge, 1999.

[64] E. Seneta. Regularly Varying Functions. Springer Verlag, Berlin-Heidelberg-New York, 1976.

[65] E. Seneta. Fitting the variance-gamma model to financial data. Journal of Applied Probability, 41(A):177–187, 2004.

[66] T.H. Solomon, E.R. Weeks, and H.L. Swinney. Observation of anomalous diffusion and Lévy flights in a two-dimensional rotating flow. Physical Review Letters, 71(24):3975, 1993.

[67] W.A. Woyczyński. Burgers-KPZ Turbulence (Göttingen Lectures). Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1998.

[68] M. Yamazato. Unimodality of infinitely divisible distribution functions of class L. The Annals of Probability, 6:523–531, 1978.

[69] Ya.B. Zel'dovich, S.A. Molchanov, A.A. Ruzmaikin, and D.D. Sokolov. Intermittency in random media. Soviet Physics Uspekhi, 30(5):353, 1987.