TRUNCATED GAUSSIAN PROCESS REGRESSION FOR PREDICTING GROWTH
OF ABDOMINAL AORTIC ANEURYSM AND FOR TEMPORAL MODELING OF
SENTIMENTS
By
Ahsan Ijaz

A THESIS
Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of
Electrical Engineering - Master of Science
2013

ABSTRACT

TRUNCATED GAUSSIAN PROCESS REGRESSION FOR PREDICTING
GROWTH OF ABDOMINAL AORTIC ANEURYSM AND FOR
TEMPORAL MODELING OF SENTIMENTS
By
Ahsan Ijaz
An abdominal aortic aneurysm (AAA) is a form of vascular disease causing focal enlargement of
the abdominal aorta. As part of the present study, we use a series of computed tomography (CT)
scans of small AAAs taken at different times to model and predict the spatio-temporal evolution
of AAAs. Using the proposed methodology and available CT scan data, the shape of an AAA can be
predicted for any time using truncated Gaussian process regression. The results of our case
study show that the predictions of our algorithms compare well with the true CT scan images.
The second part of the thesis concerns the temporal modeling of sentiments expressed through
textual information in social networks. As part of this study, we explore the issues related
to temporal models and provide an efficient method that overcomes the inefficiencies
associated with traditional schemes. A nonparametric, computationally efficient temporal
model is provided using truncated Gaussian process regression. The model is built so that a
noise parameter is estimated from the sentiment classification error metrics and inserted into
the regression setting. This makes the method generic: any form of quantification of
sentiments (through manual labeling or by some other classification scheme) can be used, with
improvement in the final results. Baseline sentiment analysis schemes are used in conjunction
with the proposed temporal model on data crawled from Twitter to demonstrate the utility of
the scheme.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ALGORITHMS

Chapter 1 Prediction of Abdominal Aortic Aneurysms using Gaussian Process Regression
1.1 Introduction
    1.1.1 Notation
    1.1.2 Organization
1.2 Data and Methods
    1.2.1 Data
    1.2.2 Observations of Center Lines from the Data
    1.2.3 Observations of AAA Surfaces from the Data
    1.2.4 Gaussian Process Regression
1.3 Spatio-temporal Modeling of an AAA
    1.3.1 Spatio-Temporal Modeling of the Center Line
    1.3.2 AAA Surface Prediction
1.4 Case Study and Results
1.5 Discussion
    1.5.1 Decision Making via Prediction and its Confidence Region
    1.5.2 Scheduling of CT Scans
    1.5.3 Hyperparameters as Possible Feature Vectors
    1.5.4 Limitations and Future Research Directions
1.6 Conclusion

Chapter 2 Temporal Modeling and Forecasting of Sentiments in Online Social Networks
2.1 Related Work
    2.1.1 Sentiment Analysis
    2.1.2 Temporal Modeling
    2.1.3 Notation
2.2 Proposed Method
    2.2.1 Sentiment Classification
    2.2.2 Error Characterization
        2.2.2.1 Sampling and Scalability
        2.2.2.2 Temporal Model
2.3 Experimental Setup and Data
    2.3.1 Training Data and Feature Extraction
2.4 Results
    2.4.1 Classification Results
    2.4.2 Gaussian process based Temporal Model
    2.4.3 Effects of Sampling
2.5 Conclusion and Future Work

APPENDICES
Appendix A: Generation of samples of the center line
Appendix B: Surface parameterization

BIBLIOGRAPHY

LIST OF TABLES

Table 1.1  Scan times for Patient given as days after first scan
Table 1.2  Hyperparameters For Surface Prediction
Table 1.3  Error measures in Prediction using data of Patient B
Table 2.1  Twitter data collected
Table 2.2  Training Data using emoticons
Table 2.3  Classification Results
Table 2.4  Most informative features

LIST OF FIGURES

Figure 1.1  Parametrized axis system r(s, θ) (black) and center line (red), where
            s is the travel length in mm along the center line and θ ∈ Θ is the
            angle in radians. The output of r is given as the distance from the
            center line to the point on the surface. For interpretation of the
            references to color in this and all other figures, the reader is referred
            to the electronic version of this thesis.
Figure 1.2  The predicted center line for fourth scan using first three scans. The
            predicted center line is shown in green and the original center line is
            in blue.
Figure 1.3  Case 1: (a) Parameterized surface using original data for time t3. (b)
            Parameterized surface using results of prediction for time t3.
Figure 1.4  Case 1: (a) Original surface of aorta at time t3. (b) Reconstructed
            image of aorta using predicted surface and center line for time t3.
Figure 1.5  Case 2: (a) Parameterized surface using original data for time t4. (b)
            Parameterized surface using results of prediction for time t4.
Figure 1.6  Case 2: (a) Original surface of aorta at time t4. (b) Reconstructed
            image of aorta using predicted surface and center line for time t4.
Figure 1.7  Case 3: (a) Parameterized surface using original data for time t3. (b)
            Parameterized surface using results of interpolation for time t3.
Figure 1.8  Case 3: (a) Original surface of aorta at time t3. (b) Reconstructed
            image of aorta using interpolated surface and center line for time t3.
Figure 1.9  Case 2: Predicted surface (middle) with confidence intervals (up and
            down) at time t4.
Figure 2.1  Aggregated sentiments for Obama (blue) and Romney (red)
Figure 2.2  Predicted Target Sentiments (Green) with Aggregated sentiment (Blue)
            for Obama
Figure 2.3  Predicted function (green) for Obama sentiments (blue) with confidence
            interval (grey)
Figure 2.4  Predicted function (green) for Romney sentiments (red) with confidence
            interval (grey)
Figure 2.5  Decrease of mean square error from Original temporal sentiments
            with increase of Samples

LIST OF ALGORITHMS

Algorithm 1  Generation of Samples of Center Line
Algorithm 2  Surface Parameterization

Chapter 1
Prediction of Abdominal Aortic
Aneurysms using Gaussian Process
Regression

1.1 Introduction

The aorta is the main artery that carries blood from the heart to the rest of the body. An
aortic aneurysm is defined as an enlargement of the aorta by more than 50% of its normal
diameter. The vast majority of aortic aneurysms occur in the abdominal region (AAAs), and
over 90% of these AAAs occur specifically within the infrarenal aorta [1] [2], where a
diameter greater than 3 cm is considered an aneurysm. The infrarenal aorta is the section of
the abdominal aorta that lies between the renal branches and the iliac bifurcation. AAAs are
a serious medical condition that, when left untreated, can cause vessel rupture with patient
mortality rates of more than 95% [3] [4]. Therefore, a thorough understanding of the
expansion and rupture of AAAs is desired.
Several studies have associated the expansion rate of AAAs with their rupture potential.
Multifaceted biological processes have been identified as affecting the growth of AAAs,
including biochemical, biomechanical, cellular, and proteolytic factors [5]. Morphological
features and regional geometrical variations have been analyzed. Relations between wall
stress distribution and hemodynamic factors have been studied using fluid-structure
interaction (FSI) for ruptured aneurysms [6], in the presence of intraluminal thrombus (ILT),
and as a result of morphology and blood pressure [7–9]. Findings over recent decades have
also demonstrated that the vascular tissues exhibit a remarkable ability of adaptation under various physiological conditions [10–12]. In particular, blood vessels seek to maintain a
preferred (homeostatic) mechanical state in conditions of altered blood ﬂow [13, 14], blood
pressure [15,16], axial extension [17–19], and during disease processes. Based on the increased
understanding of vascular diseases and advances in theoretical and computational modeling
of vascular adaptation, many computational models have been developed to describe vascular adaptation in various physiological and pathological conditions such as altered blood
ﬂow, sustained axial stretch, hypertension, intracranial aneurysm, and vasospasm [20–23].
The earlier studies, however, have focused mainly on hypothesis-testing, i.e., proving the
feasibility of stress-mediated mechanisms in vascular adaptation that have been proposed as
hypotheses, mostly with simple geometries.
Some recent works have been conducted on growth and remodeling (G&R) of AAAs
based on patient-speciﬁc geometry as well. Zeinali-Davarani et al. [24] developed the G&R
model to account for both elastic degeneration and stress-mediated collagen turnover during
AAA development using ﬁnite element analysis (FEA). A coupled simulation of G&R with
hemodynamics was conducted for studying its eﬀects on AAA expansion [25]. Geometric,
kinetic and material parameters have also been identiﬁed for individual patients using inverse
optimization techniques for modeling the growth of AAAs. For the estimation of constitutive
material parameters of the artery, a nonlinear estimation method has been suggested [26].
In addition, it is reported that kinetic parameters such as collagen turnover, rates of
production, half-life, deposition stretch, and material stiffness depend strongly on
spatio-temporal changes in wall thickness, biaxial stresses, and maximum collagen stretch
[27] [28]. The spatial distribution of the thickness and material properties of porcine
thoracic aortas has also
been investigated experimentally by extension-inﬂation tests with a stereo-vision system [29].
Furthermore, it has been shown that the same material parameters for AAA expansion can
predict the intrasac-pressure dependent vascular adaptation after endovascular repair [30].
Transferring such advances in computational modeling of vascular diseases into an individualized predictive tool in clinical treatment, however, requires a major paradigm-shift due
to the incompleteness of the model, limited information, and uncertainty associated with
clinical measurement.
While patient-specific computer models of G&R for an AAA provide insight into the
associated risks, their dependence on a multitude of factors can obscure the prediction
results. For a more precise, feature-independent prediction, we analyze the spatio-temporal
patient-specific geometrical variations within a purely statistical framework.
The goal of this chapter is to provide a data-driven, nonparametric statistical framework
using patient-speciﬁc data for improving the prediction results for aneurysm growth, hence
assisting clinical management planning. While a computational G&R model is not utilized,
the nonparametric modeling approach (using Gaussian process regression) in this study can
be viewed as a step towards a Bayesian approach that will be capable of incorporating
various uncertainties, patient-speciﬁc data, and computational models for G&R. To this
end, for example, the computational G&R model can also be modeled in a nonparametric
fashion.
In this chapter, the longitudinal patient-speciﬁc data used for the study consists of four
CT scan images of an AAA; see Section 1.2.1 for details about the data. To achieve
our goal, a parameterization of the data is first developed as follows. Segmentation of an AAA
is followed with image registration taking lumbar vertebrae as the static reference. The
center line of an AAA for each scan is obtained and used for surface parameterization. The
parameterization is carried out so that the distance of a point (on the AAA surface) from
center line r normal to the center line is a function of length s, angle θ and time t of a scan
as shown in Fig. 1.1. The surface prediction to assess the condition of an AAA for a future
time is carried out using truncated spatio-temporal Gaussian process regression [31]. This
requires combining two predictions, one of the center line and one of the AAA surface. Using
longitudinal data of AAA scans is desirable because it reveals the point-wise progression of
the surface along with the associated prediction uncertainty for any time of interest. Thus,
both local and global changes are observed, and the risk of rupture for small AAAs is also
highlighted.
The rate of progression is expressed in the model by hyperparameters of spatio-temporal
Gaussian process which in turn are estimated using the maximum likelihood estimator [32].
Predicted AAAs in diﬀerent cases are compared to existing true CT images to evaluate the
performance of the proposed approach. A preliminary study about the technique using a
diﬀerent patient with two scans without the prediction of center lines, reconstruction of AAA
surface, cases for validation and comparison between predicted images and original data was
reported in [33].
In summary, the main contributions of this chapter are as follows.
• Surface parameterization: A unique surface parameterization for AAAs for visualization and analysis is developed.
• Prediction of the center line of an AAA: The temporal variations in the center line of
AAAs are used to develop a statistical model that estimates the center line at a desired
time.
• Prediction of an AAA surface: A statistical model for a parameterized AAA surface
(with respect to a center line) is developed using computationally eﬃcient truncated
Gaussian process regression.
• Prediction of an AAA and its validation: Predicted AAAs are validated for three different cases. Each case has a training data set that is a subset of valuable longitudinal
data of four CT scan AAA images of a patient. Comparison results of predictions with
respect to the true (not-used) scan images are provided to evaluate the accuracy of
the proposed scheme.
• Prediction uncertainty: The point-wise confidence interval associated with the prediction
is obtained for the predicted AAA surface. Error estimation using the available data is also
carried out.
• Possible utility of the methodology: Possible utility of the proposed method is discussed
from helping decision making to feature extraction applications.
To the best of our knowledge, this is the first study that predicts AAA growth using
available (patient-specific) CT scan data from a statistical perspective, allowing
uncertainty quantification in the predicted AAA.

1.1.1 Notation

Standard notation will be used throughout this chapter. Let $\mathbb{R}$, $\mathbb{R}_{\geq 0}$,
$\mathbb{R}_{>0}$, and $\mathbb{Z}$ denote, respectively, the sets of real, non-negative real,
positive real, and integer numbers. $I_n$ denotes the identity matrix of size $n$. For column
vectors $v_a \in \mathbb{R}^a$, $v_b \in \mathbb{R}^b$, and $v_c \in \mathbb{R}^c$,
$\mathrm{col}(v_a, v_b, v_c) := [v_a^T \; v_b^T \; v_c^T]^T \in \mathbb{R}^{a+b+c}$ stacks all
vectors to create one column vector, and $\|v_a\|$ denotes the Euclidean norm (or vector
2-norm) of $v_a$. $|A|$ denotes the determinant of a matrix $A \in \mathbb{R}^{n \times n}$.
Let $\mathrm{E}(z)$ and $\mathrm{Var}(z)$ denote, respectively, the expectation and the
variance of a random vector $z$. A random vector $z \in \mathbb{R}^q$ that follows a
multivariate Gaussian distribution with mean $\mu \in \mathbb{R}^q$ and covariance matrix
$\Sigma \in \mathbb{R}^{q \times q}$ is denoted by $z \sim \mathcal{N}(\mu, \Sigma)$. The
first derivative operator $\nabla$ on $h : \mathbb{R}^m \to \mathbb{R}$ with respect to the
vector $s \in \mathbb{R}^m$ is as follows.

\[
\nabla h(s) = \frac{\partial h(s)}{\partial s}
= \left[ \frac{\partial h(s)}{\partial s_1}, \ldots, \frac{\partial h(s)}{\partial s_m} \right].
\]

1.1.2 Organization

This chapter is organized as follows. Section 1.2 explains our data and methods in detail.
Sections 1.2.2 and 1.2.3 describe how we obtain observations from the data. Our main
method, Gaussian process regression, is introduced in Section 1.2.4. Section 1.3 illustrates
spatio-temporal modeling of AAAs using the observations and Gaussian process regression.
Results from our methodology are illustrated for three different cases in
Section 1.4. Discussion and conclusion follow in Sections 1.5 and 1.6, respectively.

1.2 Data and Methods

1.2.1 Data

Table 1.1: Scan times for Patient given as days after first scan

Scan Number    Time of Scan (days)
Scan 1         t1 = 0
Scan 2         t2 = 386
Scan 3         t3 = 756
Scan 4         t4 = 1120

To evaluate our model, we used longitudinal data consisting of four CT scan images of a
54-year-old male patient. The resolution of these CT scans is approximately 0.7 mm per
pixel. Details about the scan times are provided in Table 1.1. This study was subject
to Institutional Review Board (IRB) approvals at both Michigan State University and Seoul
National University Hospital. No patient consent was necessary since the data was collected
for a retrospective study. Three dimensional (3D) models are reconstructed from CT scans
using Mimics (Materialise, Leuven, Belgium) to get the longitudinal model set using semiautomatic segmentation. The longitudinal model set is further subjected to global image
registration with respect to lumbar vertebrae, which is assumed to be relatively unchanging
over time. This provides the spatial transform which maps the positions and orientations
of AAAs with respect to the lumbar vertebrae. Image registration allows for an accurate
investigation of the true spatial diﬀerences between scans at diﬀerent times. The vertebra
of the ﬁrst scan is selected as the reference and the vertebra of second scan along with
associated lumen and tissue models are aligned according to it. This registration is important
for building the statistical growth model of the AAA since the spatial points of an AAA for
all times should be aligned for building an accurate temporal evolution model. Thus image
registration allows for the unique visualization of the true spatial diﬀerences of the AAA
geometry and oﬀers insight into the surface evolution of an AAA. The collection of (point
cloud) data sets obtained from four scan images is denoted by Dscan := {D1 , · · · , D4 } for
further development.

1.2.2 Observations of Center Lines from the Data

The center line of an AAA acts as a reference for surface parameterization and for analyzing
morphological features. To obtain the center line, an iterative algorithm is developed that
generates the center line of an arterial surface by collecting the center points of maximally
inscribed spheres within the surface boundaries at fixed lengths. Using these center points
and 4th-order polynomial basis functions, a smooth approximation of the center line is
obtained as a function of the length of an AAA. The algorithm is discussed in detail in
Appendix A. From the points of the center line $C$ obtained by Algorithm 1 (in Appendix A),
a parameterization with respect to $s$ is obtained. Here $s$ is an equidistant discrete set of
values defined along the center line. These points are later used to analyze a discrete set
of longitudinal planes for the parameterization of an AAA. A smooth approximation function
is generated from basis functions $\phi_i(s)$ [28] multiplied with the set of points $C$ as in
Eq. (1.1):

\[
\rho(s) = \sum_{i=1}^{m} \phi_i(s)\, C(i), \tag{1.1}
\]

where $m$ is the total number of discrete points of the center line generated by Algorithm 1
of Appendix A.
By applying Algorithm 1 to the four point cloud data sets $D_{scan} = \{D_1, \cdots, D_4\}$, we
obtained observations of center lines $\{\bar{\rho}(s, t_i) \mid s \in S_i\}$ for all
$i \in I := \{1, \cdots, 4\}$.
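As an illustration of Eq. (1.1), the sketch below builds the weights $\phi_i(s)$ from a least-squares polynomial fit, so that the smoothed center line is a linear combination of the sampled points $C(i)$. This is only one possible choice made for illustration: the basis of [28] used in the thesis may differ, and the function names `smoother_weights` and `smooth_center_line` are hypothetical.

```python
import numpy as np

def smoother_weights(s_nodes, s_eval, order=4):
    """Weights phi_i(s) such that rho(s) = sum_i phi_i(s) * C[i].

    Assumption: phi comes from a least-squares polynomial fit of the given
    order; the basis actually used in the thesis may differ.
    """
    B = np.vander(s_nodes, order + 1, increasing=True)      # m x (order + 1)
    B_eval = np.vander(s_eval, order + 1, increasing=True)  # q x (order + 1)
    # Row k of the result holds phi_1(s_k), ..., phi_m(s_k).
    return B_eval @ np.linalg.pinv(B)                       # q x m

def smooth_center_line(C, s_nodes, s_eval, order=4):
    """Apply Eq. (1.1) to each coordinate of the m x 3 point array C."""
    return smoother_weights(s_nodes, s_eval, order) @ C
```

In practice it may help to rescale $s$ to the interval [0, 1] before building the polynomial basis, since raw lengths in mm can make the fit poorly conditioned.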

1.2.3 Observations of AAA Surfaces from the Data

The surface data is then parameterized with respect to the center line by deﬁning a function
r : S × Θ → R>0 , where S := [0, zmax ] and Θ := [0, 2π] with the input coordinate system
of s ∈ S as the travel length in mm along the center line and θ ∈ Θ as the angle in radians.
The output of r is given as the distance from the center line to the point on the surface at
a given set of input coordinates (s, θ). A visualization of this coordinate system is shown
in Fig. 1.1. Therefore, in this chapter, an AAA is modeled by r(s, θ) with respect to ρ(s),
where $s \in S$ and $\theta \in \Theta$. Detailed information on how to obtain a noisy version
of $r(s, \theta)$ from the point cloud data $D$ is given in Appendix B and summarized in
Algorithm 2 of Appendix B. The outputs of Algorithms 1 and 2 from the data set $D$ are denoted
as $\bar{\rho}(s, t)$ and $\bar{r}(s, \theta, t)$, respectively, where
$t \in \{t_1, \cdots, t_4\}$. They are considered to be observations obtained from a small
number of sampling times, e.g., the limited number of CT scan images of a patient.
In summary, by applying Algorithm 2 to the four point cloud data sets $D_1, \cdots, D_4$, we
have obtained observations of AAA surfaces
$\{\bar{r}(s, \theta, t_i) \mid s \in S_i, \theta \in \Theta_i\}$ for all
$i \in I := \{1, \cdots, 4\}$.

Figure 1.1: Parametrized axis system r(s, θ) (black) and center line (red), where s is the
travel length in mm along the center line and θ ∈ Θ is the angle in radians. The output of r
is given as the distance from the center line to the point on the surface. For interpretation
of the references to color in this and all other figures, the reader is referred to the electronic
version of this thesis.

1.2.4 Gaussian Process Regression

In Section 1.3, we develop a spatio-temporal model of an AAA for a given observation set
that is generated by the previous subsections. To this end, Gaussian process regression plays
a key role in constructing a spatio-temporal model of an AAA. In this subsection, we brieﬂy
review Gaussian process regression. A Gaussian process is formally deﬁned as follows [34].
Deﬁnition 1: A Gaussian process is a collection of random variables, any ﬁnite number
of which have a joint Gaussian distribution.
A Gaussian process is completely specified by its mean and covariance functions. Let
$x \in \mathcal{Q} := \mathcal{R} \times \mathcal{T} \subset \mathbb{R}^d$ denote the index
vector, where $x := [r^T \; t]^T$ contains the sampling location
$r \in \mathcal{R} \subset \mathbb{R}^{d-1}$ and the sampling time
$t \in \mathcal{T} \subset \mathbb{R}_{\geq 0}$.
For illustrative purposes, we consider a Gaussian process

\[
z(x) \sim \mathcal{GP}\left(\mu(x), K(x, x')\right).
\]

In general, the mean and the covariance functions of a Gaussian process can be estimated a
priori by maximizing the likelihood function [32].
Suppose we have $p$ noise-corrupted observations
$D_e = \{(x^{(i)}, \bar{z}^{(i)}) \mid i = 1, \cdots, p\}$. Assume that

\[
\bar{z}^{(i)} = z^{(i)} + n^{(i)},
\]

where $n^{(i)}$ is independent and identically distributed (i.i.d.) white Gaussian noise with
variance $\sigma_n^2$. $x$ is defined as $x = \mathrm{col}(x^{(1)}, x^{(2)}, \ldots, x^{(p)})$.
The collections of the realizations $z = [z^{(1)}, \ldots, z^{(p)}]^T \in \mathbb{R}^p$ and
the observations $\bar{z} = [\bar{z}^{(1)}, \ldots, \bar{z}^{(p)}]^T \in \mathbb{R}^p$ have the
Gaussian distributions

\[
z \sim \mathcal{N}\left(\mu(x), K(x)\right), \qquad
\bar{z} \sim \mathcal{N}\left(\mu(x), K(x) + \sigma_n^2 I_p\right),
\]

where $K(x) \in \mathbb{R}^{p \times p}$ is the covariance matrix of $z$, obtained by
$K_{ij}(x) = K(x^{(i)}, x^{(j)})$, and $I_p \in \mathbb{R}^{p \times p}$ is the identity
matrix. We can predict the value $z_*$ of the Gaussian process at a point $x_*$ [34] as

\[
z_* \mid D_e \sim \mathcal{N}\left(\mu_*(x), \sigma_*^2(x)\right), \tag{1.2}
\]

where the predictive mean $\mathrm{E}(z_* \mid D_e)$ is

\[
\mu_*(x) = \mu(x_*) + k^T(x) \left(K(x) + \sigma_n^2 I_p\right)^{-1} \left(\bar{z} - \mu(x)\right) \tag{1.3}
\]

and the predictive variance is given by

\[
\sigma_*^2(x) = \mathrm{Var}(z_* \mid D_e)
= \sigma^2 - k^T(x) \left(K(x) + \sigma_n^2 I_p\right)^{-1} k(x). \tag{1.4}
\]

Here $k(x) \in \mathbb{R}^p$ is the covariance vector between $z$ and $z_*$, obtained by
$k_j(x) = K(x^{(j)}, x_*)$, and $\sigma^2 = K(x_*, x_*) \in \mathbb{R}$ is the variance at $x_*$.
It can be seen from Eqs. (1.3) and (1.4) that the calculation of both the predictive mean
and the predictive variance requires the inversion of a covariance matrix whose size depends
on the number of observations $p$, i.e., the complexity is $O(p^3)$. Hence, a drawback of
Gaussian process regression is its computational complexity: a large $p$ makes it impractical
to compute Eqs. (1.3) and (1.4) using all data points. To cope with limited computational
resources, a number of approximation methods have been proposed. For instance, the sparse
greedy approximation method [35], the Nyström method [36], the informative vector machine
[37], the likelihood approximation [38], and the Bayesian committee machine [39] have been
employed for different problems. In particular, it has been proposed that spatio-temporal
Gaussian process regression can be applied to truncated observations that include only
measurements near the position and time of interest [31]. To justify prediction based on only
the most recent observations, a similar argument has been made in [40], in the sense that data
from the remote past do not change the predictors significantly under exponentially decaying
correlation functions. In this chapter, to cope with the computational complexity, we will
also use local observations near the point of interest when we compute the prediction at that
target point.
1.3

Spatio-temporal Modeling of an AAA

From now on, we explain how to model the evolution of an AAA of a patient based on the
limited data set of CT scan images such that estimation (or prediction) and the error variance
of an AAA can be computed for any given time (including future time). In this section, we

12

will use noisy observations of center lines {¯(s, ti )|s ∈ Si } and surfaces {¯(s, θ, ti )|s ∈ Si , θ ∈
ρ
r
Θi }, where ∀i ∈ I := {1, · · · , 4} computed from the four point cloud data sets D1 , · · · , D4
as described from Sections 1.2.2 and 1.2.3.

1.3.1 Spatio-Temporal Modeling of the Center Line

The x, y, and z coordinates of the center lines at previous times are assumed to be
independent of each other in both the spatial and temporal directions. Hence, the prediction
framework for the center line of an AAA utilizes three independent zero-mean Gaussian
processes:

\[
\begin{aligned}
\rho_x(s, t) &\sim \mathcal{GP}\left(0, K_x(s, t, s', t'; \Phi_x)\right), \\
\rho_y(s, t) &\sim \mathcal{GP}\left(0, K_y(s, t, s', t'; \Phi_y)\right), \\
\rho_z(s, t) &\sim \mathcal{GP}\left(0, K_z(s, t, s', t'; \Phi_z)\right),
\end{aligned}
\]

where $t \in \mathcal{T}$ and $\rho_x(s, t) := \rho(s, t) \cdot e_x$,
$\rho_y(s, t) := \rho(s, t) \cdot e_y$, and $\rho_z(s, t) := \rho(s, t) \cdot e_z$ are the
x, y, and z coordinates of the center line at time $t$ at a distance of $s$ mm from the origin.
Here $e_x$, $e_y$, and $e_z$ denote the unit vectors codirectional with the x, y, and z axes,
respectively. The standard squared exponential kernel function is used as the covariance
function for each direction:

\[
K_\alpha(s, t, s', t'; \Phi_\alpha) = \sigma_{f\alpha}^2
\exp\left(-\frac{|t - t'|^2}{2\sigma_{t\alpha}^2}\right)
\exp\left(-\frac{|s - s'|^2}{2\sigma_{s\alpha}^2}\right)
\]

with hyperparameters $\Phi_\alpha := [\sigma_{f\alpha} \; \sigma_{s\alpha} \; \sigma_{t\alpha}]^T$
for all $\alpha \in \{x, y, z\}$, where $\sigma_{s\alpha}$ and $\sigma_{t\alpha}$ are the
bandwidths for space and time. The hyperparameters are determined by maximizing the
likelihood function. The obtained hyperparameters are shown in Table 1.2. Having estimated
the hyperparameters $\{\Phi_\alpha, \forall \alpha \in \{x, y, z\}\}$ from the observations
$\{\bar{\rho}(s, t_i) \mid s \in S_i\}$, where $i \in I$, using the covariance function form
given above, we can now predict the center line of the AAA for any time using Gaussian process
regression as described in Section 1.2.4. The prediction at any space-time point
$(s_*, t_*)$ is given by the conditional expectation

\[
\hat{\rho}(s_*, t_*) := \mathrm{E}\left(\rho(s_*, t_*) \mid \{\bar{\rho}(s, t_i) \mid s \in S_i\}, \forall i \in I\right).
\]

Since the point-wise variances obtained in each coordinate dimension are independent, the
uncertainty envelope at each point of the center line is an ellipsoid. The predicted
center line for the time of the fourth scan $t_4$, obtained using the center line data of the
first three scans, is shown together with the original center line in Fig. 1.2.

Figure 1.2: The predicted center line for fourth scan using first three scans. The predicted
center line is shown in green and the original center line is in blue.
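The center-line model can be sketched as three independent zero-mean Gaussian processes that share the same separable kernel form over $(s, t)$. The code below is an illustrative sketch only: the function names and the noise level `sigma_n` are assumptions (the noise variance for the center line is not stated in this section), and a direct linear solve is used for brevity.

```python
import numpy as np

def k_center(X1, X2, sigma_f, sigma_s, sigma_t):
    """Separable squared-exponential kernel over rows (s, t) of X1 and X2."""
    ds = X1[:, None, 0] - X2[None, :, 0]
    dt = X1[:, None, 1] - X2[None, :, 1]
    return (sigma_f**2
            * np.exp(-dt**2 / (2 * sigma_t**2))
            * np.exp(-ds**2 / (2 * sigma_s**2)))

def predict_center_line(X, rho_obs, X_star, hyper, sigma_n=1.0):
    """Predict x, y, z coordinates with three independent zero-mean GPs.

    X: (p, 2) array of (s, t); rho_obs: (p, 3) observed coordinates;
    hyper: dict mapping "x", "y", "z" to (sigma_f, sigma_s, sigma_t).
    sigma_n is an assumed noise standard deviation, not a value from the thesis.
    """
    mean = np.empty((len(X_star), 3))
    var = np.empty((len(X_star), 3))
    for j, axis in enumerate("xyz"):
        sf, ss, st = hyper[axis]
        A = k_center(X, X, sf, ss, st) + sigma_n**2 * np.eye(len(X))
        k = k_center(X, X_star, sf, ss, st)
        mean[:, j] = k.T @ np.linalg.solve(A, rho_obs[:, j])
        var[:, j] = sf**2 - np.sum(k * np.linalg.solve(A, k), axis=0)
    return mean, var
```

Because each coordinate has its own variance, the three values in a row of `var` give the squared semi-axes (up to a confidence scaling) of an axis-aligned uncertainty ellipsoid around the predicted point.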

1.3.2 AAA Surface Prediction

In this section, the AAA surface $r(s, \theta, t)$ is modeled by Gaussian process regression
using observations of AAA surfaces
$\{\bar{r}(s, \theta, t_i) \mid s \in S_i, \theta \in \Theta_i\}$, for all
$i \in \{1, \cdots, 4\}$. We assume that the AAA surface parameter $r$ is a Gaussian process,
i.e.,

\[
r(s, \theta, t) \sim \mathcal{GP}\left(\mu_r, K(s, \theta, t, s', \theta', t'; \Psi)\right),
\]

where the covariance function $K(s, \theta, t, s', \theta', t'; \Psi)$ with hyperparameter
vector $\Psi := [\sigma_f, \sigma_s, \sigma_\theta, \sigma_t]^T$ is calculated using the kernel
function [41] given as

\[
K(s, \theta, t, s', \theta', t'; \Psi) = \sigma_f^2
\exp\left(-\frac{|s - s'|^2}{2\sigma_s^2}\right)
\exp\left(-\frac{1 - \cos(\theta - \theta')}{2\sigma_\theta^2}\right)
\exp\left(-\frac{|t - t'|^2}{2\sigma_t^2}\right), \tag{1.5}
\]

where $\sigma_f^2$ represents the range over which $r$ varies vertically at a fixed point.
Again, $\sigma_s$ and $\sigma_t$ are the bandwidths in space and time. The effect of a
bandwidth can be illustrated as follows. If $r$ does not change much in time at a spatial
point, $\sigma_t$ would be large and a strong correlation would be reflected in the
corresponding entries of the covariance matrix. The torus function for $\theta$ in Eq. (1.5)
ensures that the covariance factor contributed by $\theta$ takes its highest value when
$\theta - \theta' = 2N\pi$ and its lowest value when $\theta - \theta' = (2N + 1)\pi$, where
$N \in \mathbb{Z}$.
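The behavior of the angular term in Eq. (1.5) is easy to check numerically. The sketch below evaluates the kernel for a pair of points; the function name `k_surface` is hypothetical, and the hyperparameter values in the usage lines are the surface values reported in Table 1.2.

```python
import numpy as np

def k_surface(p, q, sigma_f, sigma_s, sigma_theta, sigma_t):
    """Covariance of Eq. (1.5) between points p = (s, theta, t) and q."""
    s, th, t = p
    s2, th2, t2 = q
    return (sigma_f**2
            * np.exp(-(s - s2)**2 / (2 * sigma_s**2))
            * np.exp(-(1 - np.cos(th - th2)) / (2 * sigma_theta**2))
            * np.exp(-(t - t2)**2 / (2 * sigma_t**2)))

psi = dict(sigma_f=23.21, sigma_s=62.3, sigma_theta=1.72, sigma_t=336.0)
k_same = k_surface((10.0, 0.3, 0.0), (10.0, 0.3, 0.0), **psi)
k_wrap = k_surface((10.0, 0.3, 0.0), (10.0, 0.3 + 2 * np.pi, 0.0), **psi)
k_opp = k_surface((10.0, 0.3, 0.0), (10.0, 0.3 + np.pi, 0.0), **psi)
```

Here `k_same` and `k_wrap` agree, since angles that differ by a full turn describe the same surface point, while `k_opp` is the smallest value the angular factor allows, namely $\sigma_f^2 \exp(-1/\sigma_\theta^2)$ when $s$ and $t$ coincide.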
The hyperparameters in $\Psi$ are calculated by maximizing the likelihood function [32]. The
estimated parameters are given in Table 1.2. Using the covariance function in Eq. (1.5) with
the estimated hyperparameters plugged in, Gaussian process regression can be performed
as discussed in Section 1.2.4. In particular, the prediction $\hat{r}$ can be made at any
input point and time $(s_*, \theta_*, t_*)$, given by the conditional expectation

\[
\hat{r}(s_*, \theta_*, t_*) := \mathrm{E}\left(r(s_*, \theta_*, t_*) \mid \{\bar{r}(s, \theta, t_i) \mid s \in S_i, \theta \in \Theta_i\}, \forall i \in I\right).
\]
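Maximum likelihood estimation of the hyperparameters amounts to minimizing the negative log marginal likelihood of the observations. The sketch below shows that objective for a zero-mean Gaussian process with a one-dimensional squared-exponential kernel, together with a coarse search over candidate bandwidths. It is an assumption-laden illustration: the optimizer actually used in the study is not described in this section, and the function names are hypothetical.

```python
import numpy as np

def neg_log_marginal_likelihood(x, z, sigma_f, length, sigma_n):
    """-log p(z | x) for a zero-mean GP with a squared-exponential kernel."""
    d = x[:, None] - x[None, :]
    A = sigma_f**2 * np.exp(-d**2 / (2 * length**2)) + sigma_n**2 * np.eye(len(x))
    L = np.linalg.cholesky(A)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, z))
    # 0.5 * z^T A^{-1} z + 0.5 * log|A| + (p / 2) * log(2 * pi)
    return 0.5 * z @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * len(x) * np.log(2 * np.pi)

def fit_length_scale(x, z, candidates, sigma_f=1.0, sigma_n=0.1):
    """Pick the bandwidth that minimizes the negative log marginal likelihood."""
    scores = [neg_log_marginal_likelihood(x, z, sigma_f, l, sigma_n) for l in candidates]
    return candidates[int(np.argmin(scores))]
```

A grid search is used here only to keep the example dependency-free; a gradient-based optimizer over all hyperparameters jointly would be the more common choice.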

Table 1.2: Hyperparameters For Surface Prediction

Gaussian Process          Hyperparameters             Estimated Values
Surface Prediction        Ψ := [σf, σs, σθ, σt]       [23.21, 62.3, 1.72, 336]
Center line Prediction    Φx := [σfx, σsx, σtx]       [12.3, 63.6, 336]
                          Φy := [σfy, σsy, σty]       [28.4, 67.4, 336]
                          Φz := [σfz, σsz, σtz]       [123.5, 137.5, 336]

As discussed in Section 1.2.4, prediction at the target spatio-temporal point is made using
the points nearest to its spatial locations in previous scans. For our case, each scan consists
of 66600 distinct spatial points with s ∈ S := {0, 1, · · · , 184} and θ ∈ Θ := {0, 1, · · · , 360}.
The subset of data used for making prediction feasible at a single point $(\bar{s}, \bar{\theta}, t_i)$ for case $j$
(to be defined shortly), where $j \in J$, is given by

\[
\{\{\bar{r}(\bar{s}, \bar{\theta}, t_i) \mid \bar{s} \in N_i(\bar{s}) \subset S_i, \bar{\theta} \in \Theta_i\}, \forall i \in I_j\},
\tag{1.6}
\]

where $N_i(\bar{s})$ is the set of local indices near $\bar{s}$ at time index $i$. The cardinality of this index set
determines the computational complexity of Gaussian process regression. For example, if
its cardinality $|N_i(\bar{s})|$ is less than $N_{\max}$, then the complexity will be less than $O(N_{\max}^3)$, as
can be obtained from Eqs. (1.3) and (1.4). As discussed previously, similar approaches using
local observations to reduce the complexity were proposed in [31] and [40]. Xu et al., [31]
showed that the quality of prediction based on truncated observations does not deteriorate
much as compared with that of prediction based on all data points. For further details, the
reader is referred to Theorem 3.4 on the error analysis in [31].
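The idea of truncation can be sketched on a one-dimensional toy problem. The code below is only a minimal sketch under stated assumptions, not the implementation used in this study: it uses a plain squared-exponential kernel, a zero prior mean, and hypothetical names (`sq_exp`, `solve`, `truncated_gp_predict`). Only the `n_max` observations nearest to the target are kept, so the linear systems that have to be solved are of size `n_max` regardless of the total number of observations.

```python
import math

def sq_exp(a, b, sigma_f, ell):
    """Squared-exponential covariance between scalar inputs a and b."""
    return sigma_f ** 2 * math.exp(-(a - b) ** 2 / (2.0 * ell ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def truncated_gp_predict(xs, ys, x_star, n_max, sigma_f, ell, sigma_n):
    """Predict at x_star using only the n_max observations nearest to it."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x_star))[:n_max]
    X = [xs[i] for i in nearest]
    y = [ys[i] for i in nearest]
    # Covariance of the retained observations plus observation noise.
    K = [[sq_exp(a, b, sigma_f, ell) + (sigma_n ** 2 if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    k_star = [sq_exp(a, x_star, sigma_f, ell) for a in X]
    alpha = solve(K, y)        # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = sq_exp(x_star, x_star, sigma_f, ell) - sum(k * w for k, w in zip(k_star, v))
    return mean, var

# Toy data: 200 noise-free samples of a smooth function.
xs = [float(i) for i in range(200)]
ys = [math.sin(0.1 * x) for x in xs]
mean, var = truncated_gp_predict(xs, ys, 100.5, n_max=10,
                                 sigma_f=1.0, ell=5.0, sigma_n=0.05)
```

With 200 observations available, each solve above involves a 10-by-10 system only, which is the source of the computational savings.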

1.4

Case Study and Results

In this section, using the observations Dscan = {D1 , · · · , D4 } of the four scan images, we
formulate problems for three cases in order to illustrate the eﬀectiveness of our approach in
realistic scenarios. To validate our approach, we pretend that one data
set Di is not available and predict it using only the remaining data sets {Dj : j ≠ i}.
We then compare the prediction results with the existing true data set Di . We summarize
our three cases as follows.
• Case 1: (Extrapolation) For the first case, we use D1 and D2 taken at t1 and t2 to
predict the AAA at t3. The third scan D3 is available, but we pretend that it is
not. In particular, prediction results are obtained for the center line $\{\hat{\rho}(s, t_3)\}$ and
the AAA surface $\{\hat{r}(s, \theta, t_3)\}$. The reconstructed AAA is then compared with the true
third scan image D3. The results of case 1 are shown in Fig. 1.3 and Fig. 1.4.
• Case 2: (Extrapolation) For case 2, the ﬁrst three scans (D1 , D2 , D3 ) are used to get
prediction results at time of the fourth scan image, i.e., t4 . Similar to the previous
case, both center line and surface predictions are made and the three dimensional
AAA surface is reconstructed to form the vessel. The results of this case are shown in
Fig. 1.5 and Fig. 1.6.
• Case 3: (Interpolation) For the third case, interpolation is performed using the first
two scans D1, D2 and the fourth scan D4, taken at t1, t2 and t4 respectively. The interpolation
results are obtained at the time of the third scan image, i.e., at t3. As in the previous
cases, reconstruction is performed using the interpolated center line $\{\hat{\rho}(s, t_3)\}$ and the
interpolated parameterized surface $\{\hat{r}(s, \theta, t_3)\}$. The results of this case are shown in
Fig. 1.7 and Fig. 1.8.
Considering Eq. (1.6), the new data sets for cases 1, 2, 3 can be systematically organized
by J = {1, 2, 3}, I1 = {1, 2}, I2 = {1, 2, 3}, and I3 = {1, 2, 4}.
The hyperparameter vectors Ψ and {Φα }, where α ∈ {x, y, z}, estimated by maximizing
the likelihood function for the given data, are shown in Table 1.2.
For each case, the root mean square error (RMSE) is calculated by comparing the two
functions of the original and predicted parameterized AAA surfaces. For a fixed time, the
original surface is represented as $\bar{r}(s, \theta)$ whereas the predicted surface is represented as $\hat{r}(s, \theta)$.
The two surfaces are grid-aligned and the RMSE is calculated over all values of $s$ and $\theta$ as
follows.

\[
\mathrm{RMSE} = \sqrt{\frac{1}{n \times k} \sum_{i=1}^{n} \sum_{j=1}^{k} \left(\bar{r}(s_i, \theta_j) - \hat{r}(s_i, \theta_j)\right)^2}
\tag{1.7}
\]

The RMSE and maximum error between collections of grid points on surfaces for all cases
are given in Table 1.3.
Table 1.3: Error measures in Prediction using data of Patient B

Case Number    RMSE (mm)    Maximum Error
Case 1         3.2          10.5
Case 2         2.05         6.9
Case 3         1.6          4.9
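The two error measures can be computed from grid-aligned surfaces as in the following minimal sketch. The function name `surface_errors` and the toy 2-by-2 surfaces are hypothetical and serve only to illustrate Eq. (1.7); they are not data from this study.

```python
import math

def surface_errors(r_true, r_pred):
    """Return (RMSE, maximum absolute error) between two grid-aligned surfaces.

    r_true and r_pred are n-by-k nested lists indexed as [i][j] for (s_i, theta_j).
    """
    diffs = [a - b for row_t, row_p in zip(r_true, r_pred)
             for a, b in zip(row_t, row_p)]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    max_err = max(abs(d) for d in diffs)
    return rmse, max_err

# Toy surfaces with purely illustrative values.
r_bar = [[10.0, 12.0], [11.0, 13.0]]
r_hat = [[10.0, 14.0], [11.0, 13.0]]
rmse, max_err = surface_errors(r_bar, r_hat)
```

For these toy values a single grid point differs by 2, so the RMSE is 1.0 and the maximum error is 2.0.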

For case 1, Figs. 1.3 and 1.4 show the predicted parameterized surface and the surface
of an AAA reconstructed using the predicted center line at time t3. As can be seen, the prediction
results match the original surface quite accurately. With the increase of longitudinal data
as in case 2, the prediction results improve further, as can be seen in Table 1.3. The
prediction results along with a visualization of the original data for time t4 for case 2 are shown in
Fig. 1.5. In case 3, which used data from three CT scans for interpolation, the results obtained
were the best of the three cases. This is expected in the sense that nonparametric regression
(such as Gaussian process regression) performs better in interpolation than in extrapolation
(prediction at a future time). The interpolated surface of an AAA at time t3 along with data
from the original scan for case 3 is shown in Fig. 1.7. Table 1.3 summarizes these results using
the RMSE. It shows a decrease in error from 3.2 mm to 2.05 mm with the addition of one scan
and the best results for interpolation, with the RMSE going down to 1.6 mm.

1.5

Discussion

In this section, we discuss possible utility and limitations of our approach along with future
research directions.

1.5.1

Decision Making via Prediction and its Conﬁdence Region

The major possible utility of our algorithms is to help clinicians in conducting medical
treatment of an AAA (such as monitoring, open surgery or endovascular repair) by providing
the predicted AAA (at future time) and its conﬁdence region generated from the limited
number of available CT scan images. The results from our case study showed excellent
performance of our algorithms under three diﬀerent cases.
Prediction error variances for predicted values can be computed using Eq. (1.4), which
is one of the main advantages of using Gaussian process regression to model AAAs. Using
Eq. (1.4), we can compute the confidence regions. For a clear visualization, let us present
a confidence region for the predicted parameterized AAA surface. The surface predicted at
t4 for case 2, along with point-wise upper and lower 90% confidence intervals, is shown in
Fig. 1.9. The confidence regions can be straightforwardly computed for three-dimensionally
reconstructed AAAs. In this way, uncertainty quantification in predicted AAAs
is readily available by taking into account uncertainties arising from, for example, the available
CT images, different observation noise (or resolution) levels in CT images, and patient-specific
estimated hyperparameters in an empirical Bayes method. This capability of gauging
uncertainty in the predicted AAAs, however, is not available in standard G&R computational
models [22–25]. Again, confidence regions on predicted AAAs will be very useful in
clinical decision making, as they gauge the level of confidence in any decision made.
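Given a predicted value and its prediction error variance, a point-wise confidence interval follows directly from the Gaussian assumption. The snippet below is a minimal sketch: the function name `pointwise_interval` and the numbers passed to it are hypothetical, and the interval is the usual symmetric one around the predicted mean.

```python
import math
from statistics import NormalDist

def pointwise_interval(mean, variance, level=0.90):
    """Symmetric Gaussian confidence interval around a predicted value."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.645 for a 90% level
    half_width = z * math.sqrt(variance)
    return mean - half_width, mean + half_width

# Hypothetical predicted radius (mm) and prediction error variance (mm^2).
lower, upper = pointwise_interval(mean=20.0, variance=4.0)
```

Applying this to every grid point of the predicted surface would give upper and lower surfaces of the kind shown in Fig. 1.9.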

1.5.2

Scheduling of CT Scans

The number of scans and the time diﬀerence between scans are inﬂuential in generating a
good quality prediction of an AAA at a particular time. Since a large time diﬀerence, in the
prediction phase, would result in little correlation, higher uncertainties in ﬁnal prediction
would occur. In general, a large number of scans for a patient is also desirable for better
quality of both hyperparameter estimation and AAA prediction. This implies that the
conﬁdence region is a function of the scanning times and other parameters such as resolution,
noise levels etc. Therefore, given all other parameters and previous CT scans of a particular
subject, the next CT scan can be scheduled in order to meet a desired level of prediction
quality by calculating its conﬁdence region. Note that once hyperparameters are ﬁxed,
prediction and its conﬁdence region can be calculated at any future time as illustrated using
Eqs. (1.3) and (1.4).
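The scheduling idea can be illustrated with a deliberately simplified sketch. It assumes a purely temporal squared-exponential covariance, a single previous scan, and a hypothetical observation noise level `sigma_n`; the function names and the tolerance are also hypothetical. Under these assumptions the predictive standard deviation grows with the time elapsed since the last scan, and the latest admissible time for the next scan is the largest horizon at which it stays within the tolerance.

```python
import math

def predictive_std(horizon, sigma_f, sigma_t, sigma_n):
    """Predictive standard deviation `horizon` time units after a single scan."""
    k = sigma_f ** 2 * math.exp(-horizon ** 2 / (2.0 * sigma_t ** 2))
    var = sigma_f ** 2 - k ** 2 / (sigma_f ** 2 + sigma_n ** 2)
    return math.sqrt(max(var, 0.0))

def latest_next_scan(tolerance, sigma_f, sigma_t, sigma_n, max_horizon=2000):
    """Largest integer horizon whose predictive std stays within `tolerance`."""
    best = None
    for h in range(1, max_horizon + 1):
        if predictive_std(h, sigma_f, sigma_t, sigma_n) <= tolerance:
            best = h
        else:
            break
    return best

# sigma_f and sigma_t are taken from Table 1.2; sigma_n and the tolerance are assumptions.
h = latest_next_scan(tolerance=5.0, sigma_f=23.21, sigma_t=336.0, sigma_n=1.0)
```

With several previous scans and the full spatio-temporal covariance, the same search would use the prediction error variance of Eq. (1.4) instead of the closed form above.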

1.5.3

Hyperparameters as Possible Feature Vectors

The hyperparameters estimated for the given data are shown in Table 1.2. For the surface of
an AAA, σf in Eq. (1.5) is an indication of the range on which the radius r varies for a given
input point. The hyperparameter σs in Eq. (1.5) is the scaling factor in direction of center
line s and captures the correlation structure of the surface along s. For example, a high
value for σs implies that the AAA surface varies smoothly whereas a lower value indicates
that the surface has high variance in direction of s. Similarly, σθ in Eq. (1.5) is the scaling
factor for θ and σt in Eq. (1.5) is the temporal scaling factor in the covariance structure.
The hyperparameter vector can be viewed as a feature vector that may encode information of the AAA evolution. The hyperparameters estimated for the regression provide a
unique patient-speciﬁc feature vector which captures both the temporal and spatial variation
patterns across and around the length of AAA surface. Collective feature vectors obtained
from more patients could be useful in building a classiﬁcation module capable of detecting
patients with imminent danger of rupture [42]. In the presence of more longitudinal data,
an estimate of the temporal hyperparameter would also be a guide for specifying the ideal
time interval at which CT scans of the AAA should be conducted for a specific patient.

1.5.4

Limitations and Future Research Directions

Our current method presented in this chapter is based on an empirical Bayes method in which
estimators for uncertain values such as hyperparameters and center lines are plugged in (as
an approximation) instead of integrating out the uncertainties in such variables (as in a fully
Bayesian approach). Hence, uncertainties in such variables are not fully accounted for when gauging
confidence regions. However, prediction error variances in center lines are small and can be
easily accounted for in the confidence regions of predicted AAAs. Gaussian process regression is
robust to the selection of hyperparameters. It is common practice to obtain hyperparameters
a priori by maximizing the likelihood function in an empirical Bayes fashion [34].
We have justified our use of an empirical Bayes method by showing excellent
prediction results with respect to true AAAs that were not used as training data in our case
study in Section 1.4. The fully Bayesian approach using Gaussian process regression with
an uncertain covariance function is computationally expensive. It would add considerable
complexity to the current cost of $O(n^3)$ with $n$ observations. In addition, prior distributions
on uncertain variables need to be carefully selected. For further information, the reader is
referred to [43, 44]. Hence, a future research direction is to develop a fully Bayesian version
of our proposed scheme that takes into account uncertainties in hyperparameters and center
line prediction. Given the excellent results from our current empirical Bayes method,
we would not expect significantly better performance even if a fully Bayesian approach were used.
Nonetheless, it can provide a complete solution to our proposed
formulation without the approximation used in empirical Bayes methods.
As can be seen from the results in Table 1.3, the interpolation results (case 3) are better
than those of extrapolation (cases 1 and 2). It can be expected that the quality of the
predicted AAA will decrease as the prediction time horizon increases. This is more pronounced
in our current formulation because the nonparametric regression technique is used
without inclusion of the G&R computational model [22–25]. Including
the G&R computational model in our approach will be a computationally and theoretically challenging
task given the computational complexity of the model and its unknown input parameters.
However, a well-adopted computational model structure will provide a constraint in space and
time, which will help in reducing the size of the confidence region of the predicted AAA
at a future time. Therefore, the incorporation of the computational model in our Bayesian
framework shall be our future research direction.

1.6

Conclusion

In this chapter, we formulated the AAA modeling and its growth using patient-speciﬁc CT
scan image data in a purely statistical framework. As part of the work, a unique visualization
of an aneurysm is provided using a surface parameterization in r(s, θ, t) coordinate system
with respect to a center line of ρ(s, t) at time t. Using the proposed methodology and
available CT scan data, the prediction of an AAA can be made for any time using truncated
Gaussian process regression. The results of the case study showed excellent performance of
our algorithms when they are compared to the true CT scan images. To the best of our
knowledge, this is the first study that predicts AAA growth using available (patient-specific) CT scan data from a statistical perspective, allowing uncertainty quantification in the
predicted AAA. In doing so, it provides some interesting insights along with limitations of
such models for studying the nature of AAA growth. Possible utility and limitations of
our approach along with future research directions have been discussed. With advances in
computing technology and new sampling methods, the use of the Bayesian approach will have
a great potential to revolutionize application of computational modeling in the treatment of
vascular diseases.

Figure 1.3: Case 1: (a) Parameterized surface using original data for time t3. (b) Parameterized surface using results of prediction for time t3.

Figure 1.4: Case 1: (a) Original surface of aorta at time t3. (b) Reconstructed image of
aorta using predicted surface and center line for time t3.

Figure 1.5: Case 2: (a) Parameterized surface using original data for time t4. (b) Parameterized surface using results of prediction for time t4.

Figure 1.6: Case 2: (a) Original surface of aorta at time t4. (b) Reconstructed image of
aorta using predicted surface and center line for time t4.

Figure 1.7: Case 3: (a) Parameterized surface using original data for time t3. (b) Parameterized surface using results of interpolation for time t3.

Figure 1.8: Case 3: (a) Original surface of aorta at time t3. (b) Reconstructed image of
aorta using interpolated surface and center line for time t3.

Figure 1.9: Case 2: Predicted surface (middle) with confidence intervals (up and down) at
time t4.

Chapter 2
Temporal Modeling and Forecasting
of Sentiments in Online Social
Networks
The recent proliferation of online social networks (OSNs) as a medium for expressing opinions and
sentiments has met with increasing interest from a gamut of fields. These sentiments, often
expressed as textual nuggets, prove to be an invaluable resource for the constant assessment
of policies, product reviews, and reactionary responses, and as feedback for improving
strategies.
Sentiments expressed through microblogging and the blogosphere are important for a variety
of purposes, for instance to marketers, economists, political scientists, and the medical
community. This wide range of interest is further
helped by the vast amount of data available for analysis given the microblogging and news
consumption culture online.
A multi-faceted role of the microblogging framework has emerged. Users provide recommendations
and express sentiments while assimilating information through these diverse sources.
The power and role of these networking sites have been highlighted by many studies. It
has been shown that OSNs can be used to detect world events [45], provide inference
about public health [46], predict election results [47] and play a pivotal role during emergency
situations [48]. The ease of spreading information has further turned these media
sources into a platform for campaigning and modern-day activism. The opinions expressed in
microblogging have also been shown to be highly correlated with traditional polls [49]. Thus, the
public using these sources provides live feedback about major events, products and government
decisions. E-commerce also makes use of online feedback for recommendation
systems [50, 51]. The textual exchange has also been shown to be a source of opinion
formation [52]. Therefore, discovering knowledge from this information is imperative for policy
makers, economists and political scientists. This necessitates a proper understanding and
quantification of opinions and sentiments expressed through OSNs. Moreover, sentiments
expressed through the web are dynamic and vary over time. Thus, capturing users' temporal
preferences is of prime importance [53]. The quantification of expressed information is often
achieved through sentiment analysis techniques [54], whereas the temporal behavior is primarily
captured through aggregation over time [49]. While such models for capturing the temporal
behavior of sentiments have been shown to be effective, they are tied to issues of scalability,
sampling and quantification of sentiment classification error. The scalability issues arise from
the sheer amount of data being generated in OSNs [45]. The sampling problems are a result
of data acquisition bottlenecks and information analysis complexity [55]. Furthermore,
researchers are also constrained to use certain samples based on the application and the quality of
data. For example, it has been shown that about 40% of all tweets from the Twitter feed are
pointless babble. Also, for sentiment analysis, it is necessary to identify and use samples
with subjective information.
In this study, we propose a computationally efficient mathematical model for characterizing
the temporal behavior of sentiments. The proposed model addresses the inherent problems
associated with traditional aggregating schemes for temporal modeling of sentiments. The
model can be used to study changes in opinions, predict agitation in online social communities
and forecast the success of a product, campaign or political candidate. The proposed model
is both flexible and extensible. It can be used with any classification scheme and
can easily be extended to include spatial forecasting. As a case study, we use the model to
analyze the sentiments expressed on the day of the 2012 United States presidential election. The
textual opinions were collected from the “Twittersphere” as tweets about the two main presidential
candidates, Obama and Romney.

2.1

Related Work

Studies on the temporal evolution of sentiments have been carried out in numerous fields. Since
all sentiment analysis and classification methods in the literature come with some classification
error, it is desirable that the temporal model be flexible enough to incorporate it. Also, for building
a real-time sentiment analysis and forecasting system, the issues of scalability and sampling
come into play. Acquiring textual data samples is usually constrained by rate limitations
imposed by OSNs. For example, the Twitter APIs cap tweet retrieval through rate
limitation. Moreover, not all acquired samples are subjective opinions, and those that are not cannot be
used. Therefore, for building any real-time sentiment forecasting model, there is no control
over usable temporal samples.

2.1.1

Sentiment Analysis

Sentiment analysis is a prolific and growing research field. Given its profitable market and
wide applicability, keen interest has been shown in this field. Several studies have been
conducted to improve the methods of sentiment mining and classification. An exhaustive
survey of work prior to 2008 is given in [54]. The process generally consists of identifying subjective
sentences [56, 57], followed by feature extraction [58–60], polarity assignment [61] and
classification. Recent studies show that the application-dependent variation and complexity
associated with sentiment analysis leave further room for improvement. Techniques for
refined feature extraction have been suggested. The authors of [62] compensated for the frequency
bias of discriminative features using a frequency-based weight-penalizing prior on the regularization
process in an elastic net framework. The case for application-dependent variation is made
through the extraction of target-dependent sentiment expressions from Twitter data [63]. That
study also incorporates and assigns polarity to the informal language of tweets. Another
interesting problem in sentiment analysis is acquiring training data. Several studies manually
annotate labels for a small sample of the data [48, 52], which is quite expensive. Some other
studies make use of tag words like emoticons [64, 65] for labeling the training set. Zhang et al.
(2011) use a lexicon-based method for performing sentiment classification and then apply
a supervised classifier to improve the recall, using the training examples provided by
the preceding lexicon-based approach.

2.1.2

Temporal Modeling

The opinions and sentiments expressed on social networking sites are dynamic and vary
over time. The importance of capturing users' temporal preferences has been discussed
and a recommendation for a user-specific time scale is made in [53]. A number of studies use
aggregation of text sentiments for time series modeling. The behavior of the stock market has been
tied to emotions expressed in blogs [66, 67]. Temporal happiness in songs, blogs and about
presidents has been modeled using the same aggregation scheme [68]. Strong correlations
between textual sentiments expressed in microblog messages and contemporary polling data
have been shown [49] by aggregating sentiments.
A time-evolving, user-specific scale for aggregating diverse data and predicting users' interests
has been proposed [69]. It has further been shown that changing the time segments affects
both the prediction performance and local effects. In a study on event detection using
Twitter, the issue of scalability of detection algorithms has been raised [45]. The quality of
usable data in microblogs has been questioned. Furthermore, the effects of temporal smoothing
through a moving average are discussed and a suggestion for an improved stratified sampling
technique has been made [49]. The issues of data acquisition bottlenecks and information
analysis complexity are raised in [55]. The data acquisition bottleneck occurs due to rate
limitations on publicly available APIs (Application Programming Interfaces), whereas huge
volumes of data cause complexities in information analysis. Another study shows optimal
scheduling of tweets for maximizing message diffusion [70]. The case for repeating
location-specific diurnal patterns has been made in [71], with another study showing the consistency of
culture-specific diurnal mood patterns.

2.1.3

Notation

Standard notation will be used throughout this chapter. Let $\mathbb{R}$, $\mathbb{R}_{\geq 0}$, $\mathbb{R}_{>0}$, and $\mathbb{Z}$ denote,
respectively, the sets of real, non-negative real, positive real, and integer numbers. $I_n$ denotes
the identity matrix of size $n$. For column vectors $v_a \in \mathbb{R}^a$, $v_b \in \mathbb{R}^b$, and $v_c \in \mathbb{R}^c$,
$\mathrm{col}(v_a, v_b, v_c) := [v_a^T \; v_b^T \; v_c^T]^T \in \mathbb{R}^{a+b+c}$ stacks all vectors to create one column vector, and $\|v_a\|$
denotes the Euclidean norm (or vector 2-norm) of $v_a$. $|A|$ denotes the determinant of a matrix
$A \in \mathbb{R}^{n \times n}$. Let $E(z)$ and $\mathrm{Var}(z)$ denote, respectively, the expectation and the variance
of a random vector $z$. A random vector $z \in \mathbb{R}^q$ that follows a multivariate Gaussian
distribution with mean $\mu \in \mathbb{R}^q$ and covariance matrix $\Sigma \in \mathbb{R}^{q \times q}$ is denoted by $z \sim \mathcal{N}(\mu, \Sigma)$.
Let $\mathrm{Bern}(p)$ denote a Bernoulli distribution with mean value $p$ and $B(n, p)$ be a Binomial
distribution where $n$ is the number of trials and $p$ represents the success probability.

2.2

Proposed Method

A classification module is developed based on labeled training data, and random samples are
taken from the corpus for building the temporal model. The acquired samples are filtered
through a selection criterion. Feature extraction followed by sentiment classification is
performed on the selected data samples, and a Gaussian process based spatio-temporal model is
formed for prediction and forecasting of sentiments.

2.2.1

Sentiment Classiﬁcation

Our study mainly concerns the improvement of the temporal model by incorporating the
classification error from sentiment analysis. Any classification scheme can be used, together with some
parameters of the model that are explained later. For illustrative purposes, we use the Naive Bayes
classifier with a combination of unigrams and bigrams as features as the classification module.
Naive Bayes is chosen since it has been shown to work well for sentiment classification based
on textual features [72]. The data set is represented as $D := \{(o_1, t_1, c_1), \cdots, (o_g, t_g, c_g)\}$,
where $o_i$ is the $i$th opinion (usually expressed as a textual nugget), $t_i$ is the time at which
that opinion was expressed, $c_i$ is its labeled class, and $g$ is the total number of opinions
expressed in the data set. Using the Naive Bayes model for classification, class $c^*$ is assigned
to an opinion $o$ by:

\[
\begin{aligned}
c^* &= \arg\max_{c} P(c \mid w_1, w_2, w_3, \ldots, w_h) \\
    &= \arg\max_{c} P(c) \prod_{j=1}^{h} P(w_j \mid c) \\
    &= \arg\max_{c} \prod_{j=1}^{h} P(w_j \mid c),
\end{aligned}
\]

where $w_j$ is the $j$th selected textual feature from the opinion $o$ and $h$ is the total number of
features selected for analysis. The last equality holds when the class priors $P(c)$ are equal.
The conditionals of this equation are estimated using the maximum
likelihood estimator.
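The decision rule above can be sketched as follows. This is a minimal illustration rather than the classifier used in this study: the function names and the toy training sentences are hypothetical, equal class priors are assumed, and add-one smoothing is used in place of the plain maximum likelihood estimates so that features unseen in a class do not drive the product to zero.

```python
import math
from collections import Counter

def features(text):
    """Unigrams and bigrams of a whitespace-tokenized, lower-cased text."""
    tokens = text.lower().split()
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]

def train(examples):
    """examples: iterable of (text, label). Returns per-class feature counts."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(features(text))
    return counts

def classify(text, counts):
    """Return arg max_c prod_j P(w_j | c), assuming equal class priors."""
    vocab = set().union(*counts.values())
    best_label, best_score = None, -math.inf
    for label, c in counts.items():
        total = sum(c.values())
        # Log-probabilities with add-one smoothing (an assumption of this sketch).
        score = sum(math.log((c[w] + 1) / (total + len(vocab)))
                    for w in features(text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training set: 1 = positive sentiment, 0 = negative sentiment.
toy = [("great speech tonight", 1), ("really great win", 1),
       ("terrible debate tonight", 0), ("really terrible loss", 0)]
model = train(toy)
label = classify("great win", model)
```

Working in log space avoids numerical underflow when the number of features h is large.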

2.2.2

Error Characterization

The classification module used for sentiment classification is tied to some classification
error. In the temporal models that use aggregation for quantification of these sentiments,
the incorporation of these classification errors is insufficient. The classified sentiments form
one of the following four cases:

TP := (c* = 1 | c = 1)    True Positive
FP := (c* = 1 | c = 0)    False Positive
TN := (c* = 0 | c = 0)    True Negative
FN := (c* = 0 | c = 1)    False Negative

For our model, we use sensitivity and speciﬁcity given in Eq. (2.1) for incorporating the
classiﬁcation errors.

\[
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP},
\tag{2.1}
\]
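As a worked example of Eq. (2.1), the two quantities can be computed from true and classified labels as in the sketch below. The function name and the toy label lists are hypothetical.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Sensitivity and specificity of Eq. (2.1) for labels in {0, 1}."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for c, c_star in pairs if c == 1 and c_star == 1)
    fn = sum(1 for c, c_star in pairs if c == 1 and c_star == 0)
    tn = sum(1 for c, c_star in pairs if c == 0 and c_star == 0)
    fp = sum(1 for c, c_star in pairs if c == 0 and c_star == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy labels: 4 positive and 6 negative instances.
truth     = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(truth, predicted)
```

Here three of the four positives and four of the six negatives are recovered, giving a sensitivity of 0.75 and a specificity of 2/3.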

Sensitivity is the proportion of the actual positives identiﬁed to the total number of
positive instances in the data set. Similarly, speciﬁcity is the ratio between the actual
negatives identiﬁed and the total number of negative instances in the data set. Together,
they take into account both type-I (due to false positives) and type-II (due to false negatives)
errors. In our case we assign a value of 1 for a positive sentiment and 0 for a negative
sentiment during the classiﬁcation. Sensitivity therefore translates as the success probability
of correctly classifying a sentiment as positive, whereas speciﬁcity is the success probability of
correctly classifying a sentiment as negative. Classiﬁcation of a sentiment as either positive
or negative can therefore be modeled as a Bernoulli trial with success probability pα =
Sensitivity for the positively identiﬁed instances and pβ = 1−Speciﬁcity (also known as false
positive rate) for the negatively identiﬁed instances. Therefore, the classiﬁed sentiments c∗
can be modelled as a Bernoulli trial as:



\[
c^* \sim
\begin{cases}
\mathrm{Bern}(p_\beta) & \text{if } c^* \equiv 0, \\
\mathrm{Bern}(p_\alpha) & \text{if } c^* \equiv 1.
\end{cases}
\tag{2.2}
\]

Each sentiment, classified as either positive or negative, can be represented by an
independent and identically distributed (i.i.d.) Bernoulli distribution as shown in Eq. (2.2).

A time unit $\Delta t$ is selected as the interval over which the sentiments are aggregated, so that
the time line of the data used is broken down into $l$ equally spaced time intervals. The data
set thus formed can be represented as $D_i = \{c^*_{1i}, \cdots, c^*_{n_i i}\}$ for all $i \in \mathcal{I} := \{1, \cdots, l\}$.
Note that $n_i$, the number of sentiments expressed in the $i$th time interval,
might be different for each $i \in \mathcal{I}$. The summation of these sentiments leads to two separate
binomial distributions for each time interval, one for the positive sentiments expressed in
that interval and the other for the negative sentiments, represented as $s_{\alpha i}$ and $s_{\beta i}$,
respectively, for all $i \in \mathcal{I}$:

\[
s_{\beta i} = \sum_{j=1}^{\beta_i} c^*_{ji} \quad \forall c^*_{ji} \equiv 0, \qquad
s_{\alpha i} = \sum_{j=1}^{\alpha_i} c^*_{ji} \quad \forall c^*_{ji} \equiv 1,
\tag{2.3}
\]

where $\alpha_i$ is the total number of positive sentiments in the $i$th time interval and $\beta_i$ is the
total number of negative sentiments expressed in the interval. This leads to binomial
distributions over the $l$ time intervals given as:

\[
s_{\gamma i} \sim B(\gamma_i, p_\gamma) \quad \forall i \in \mathcal{I}, \; \gamma \in \{\alpha, \beta\}.
\tag{2.4}
\]

With an appropriate choice of $\Delta t$ according to the streaming speed of the data, the values
of $\alpha_i$ and $\beta_i$ become large enough to approximate the binomial distributions of Eq. (2.4)
by Gaussian distributions [73]. The accumulated sentiments $s_{\alpha i}$ and $s_{\beta i}$ can thus be
represented as Gaussian distributions given by:

\[
s_{\gamma i} \sim \mathcal{N}\left(\gamma_i p_\gamma, \; \gamma_i p_\gamma (1 - p_\gamma)\right) \quad \forall i \in \mathcal{I}, \; \gamma \in \{\alpha, \beta\}.
\tag{2.5}
\]
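The normal approximation of Eq. (2.5) amounts to matching the mean and variance of each binomial distribution. A minimal sketch for a single time interval is given below; the function name, the counts, and the sensitivity and specificity values are all hypothetical.

```python
def gaussian_approximation(count, p):
    """Mean and variance of N(count*p, count*p*(1-p)), approximating B(count, p)."""
    return count * p, count * p * (1.0 - p)

p_alpha = 0.75               # sensitivity (hypothetical value)
p_beta = 1.0 - 0.60          # 1 - specificity (hypothetical value)
alpha_i, beta_i = 400, 250   # classified positives and negatives in one interval (hypothetical)

mean_alpha, var_alpha = gaussian_approximation(alpha_i, p_alpha)
mean_beta, var_beta = gaussian_approximation(beta_i, p_beta)
```

For these values the positive count is approximated by a Gaussian with mean 300 and variance 75, and the negative count by a Gaussian with mean 100 and variance 60.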

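For each $i \in \mathcal{I}$, we can get an overall estimate of the sentiment by adding the two
Gaussian distributions expressed in Eq. (2.5). This leads to:

\[
\begin{aligned}
s_i &= s_{\alpha i} + s_{\beta i}, \\
s_i &\sim \mathcal{N}\left(\alpha_i p_\alpha + \beta_i p_\beta, \; \alpha_i p_\alpha (1 - p_\alpha) + \beta_i p_\beta (1 - p_\beta)\right).
\end{aligned}
\tag{2.6}
\]

2.2.2.1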

Sampling and Scalability

The issues of scalability, sample size and sampling strategy have been discussed in [55], with comparisons
of different sampling schemes and topologies. While that study suggests incorporating both
topology and user context over naive methods, it is specific to an information diffusion model
and might not work for modeling sentiments. Moreover, in sentiment analysis, it is necessary
to discard certain samples because of the quality and subjectivity of the textual information. To
date, the effects of different sampling strategies on characterizing sentiments have not been
explored. In this study, however, we focus on compensating for the sampling deficiency by using a
predictive model for estimating sentiments at unsampled times. We suggest selecting textual
information from a uniform distribution with a judicious sample size and discarding any
samples that fail the sentiment analysis criteria. The missing sample points are reflected in
our proposed model by an increase in point-wise uncertainty.

2.2.2.2

Temporal Model

The temporal model is formulated by treating the obtained data in a Gaussian process
framework. The sentiments $\bar{s}(t)$ obtained in Section 2.2.2 are modeled as:

\[
\bar{s}(t) \sim \mathcal{GP}\left(0, K(t, t'; \Psi)\right).
\]

The covariance function $K(t, t'; \Psi)$ used to model the sentiments has the hyperparameter
vector $\Psi := [\sigma_f, \sigma_t]^T$ with the kernel selected as:

\[
K(t, t'; \Psi) = \sigma_f^2 \exp\left(-\frac{|t - t'|^2}{2\sigma_t^2}\right).
\]
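A minimal sketch of this temporal kernel and of the covariance matrix it induces over a set of time points is given below. The function names, the time points and the hyperparameter values are hypothetical; the gap between the third and fourth time points stands in for unsampled intervals.

```python
import math

def temporal_kernel(t, t_prime, sigma_f, sigma_t):
    """Squared-exponential covariance between two time points."""
    return sigma_f ** 2 * math.exp(-abs(t - t_prime) ** 2 / (2.0 * sigma_t ** 2))

def covariance_matrix(times, sigma_f, sigma_t):
    """Covariance matrix with entries K(t_a, t_b) for all pairs of time points."""
    return [[temporal_kernel(a, b, sigma_f, sigma_t) for b in times] for a in times]

times = [0.0, 1.0, 2.0, 5.0]   # time points with a gap before the last one (hypothetical)
K = covariance_matrix(times, sigma_f=2.0, sigma_t=1.5)
```

The entries of `K` shrink as the time difference grows, so an estimate at a time far from any sampled interval borrows little information from the data and carries a larger uncertainty.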

The hyperparameters are estimated using the method discussed in Section 1.2.4. With
an estimate of $\sigma_f$ and Eq. (2.6), the error due to classification is accounted for
by modeling it as a noise term. The noise is obtained as:

\[
\sigma_n^T I \sigma_n = \alpha_i p_\alpha (1 - p_\alpha) + \beta_i p_\beta (1 - p_\beta) - \sigma_f^2 I, \quad \forall i \in \mathcal{I}.
\tag{2.7}
\]

2.3

Experimental Setup and Data

We evaluated our model on Twitter data collected through the Twitter streaming API with
“Obama and Romney” as query terms. The tweets were obtained from
November 5, 2012, 6:00am to November 7, 2012, 12:00am. The distribution of the collected data
is shown in Table 2.1. As the table shows, more than twice as many tweets were about Obama
as about Romney. The same pattern was observed by [74], where it is shown that Twitter users
discussed Obama twice as much as Romney during the time leading up to the elections.

Table 2.1: Twitter data collected

Topic       Total tweets
Obama       3740519
Romney      1680522
Combined    5421041

2.3.1

Training Data and Feature Extraction

We acquired the training data by using the emoticons present in tweets [75, 76]. These
emoticons are used to obtain training examples because each emoticon carries a positive or
negative connotation. Once each emoticon is identified as either positive or negative, this
information can be used to obtain a labeled training data set. The emoticons used for
mapping positive and negative sentiments, along with the distribution of total tweets between the
two candidates, are shown in Table 2.2. This table shows that Obama, being more popular
in the Twittersphere, received more opinion nuggets through tweets than
Romney. For the training set, 16000 tweets were used, with an equal number of positive and
negative tweets. While forming the training data, the tweets were winnowed by removing
tweets in the following categories:
• Non-English languages: Any tweets in a language other than English were removed
from the training set.
• Dual candidate names: Any tweets referring to both Obama and Romney were
removed.
• Dual polarity emoticons: Any tweets containing emoticons of both positive and negative
polarity were removed.
• Non-subjective tweets: Tweets without an adjective were removed from the training
data set. This was done because we only need to analyze opinions, and the presence of
adjectives has been shown to be highly correlated with the subjectivity of a sentence [57].
Table 2.2: Training Data using emoticons

Sentiment   Emoticons used         Candidate   Tweets   Total tweets
Positive    :) , :} , :D , :))     Obama       25564    34441
                                   Romney      8877
Negative    :( , :’( , :(( , :@    Obama       5620     8470
                                   Romney      2850


Adjectivity and English filters were applied using WordNet [77]. For feature extraction,
some of the words in the tweets were also filtered out:
• Small length words: Any word of length less than or equal to 3 was removed.
• Candidate names: Barack, Obama, Mitt and Romney were removed from the
training set. Because the distribution is biased in favor of Obama, the classifier might
otherwise select one of the names as a feature contributing towards the favored
sentiment, and this bias would corrupt the final results.
• Emoticons: The emoticons used for labeling were also removed.
• Retweet information: The RT label that marks a retweet was removed.
• Mentions: Mentions of any name through hashtags and @ were also removed from the
tweets of the training data.
• Website links: Many people post website links with their tweets. These links were also
removed during creation of the training data.
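
The word-level filters listed above can be sketched in Python as follows. This is only an
illustration under stated assumptions, not the code used for the experiments: the regular
expressions, the function names `extract_tokens` and `bigrams`, and the `min_length`
parameter are hypothetical choices, and the WordNet-based adjectivity and language checks
are not reproduced here.

```python
import re

CANDIDATE_NAMES = {"barack", "obama", "mitt", "romney"}
# Emoticons from Table 2.2, longest first so that ":))" is removed before ":)".
EMOTICONS = [":))", ":((", ":'(", ":)", ":}", ":D", ":(", ":@"]

def extract_tokens(tweet, min_length=4):
    """Return the unigram tokens of a tweet after applying the word filters."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", tweet)  # website links
    text = re.sub(r"[@#]\w+", " ", text)                 # mentions and hashtags
    text = re.sub(r"\bRT\b", " ", text)                  # retweet label
    for emoticon in EMOTICONS:
        text = text.replace(emoticon, " ")
    words = re.findall(r"[a-z']+", text.lower())
    # Drop short words (length <= 3 by default) and candidate names.
    return [w for w in words if len(w) >= min_length and w not in CANDIDATE_NAMES]

def bigrams(tokens):
    """Adjacent token pairs, usable as bigram features."""
    return list(zip(tokens, tokens[1:]))

tokens = extract_tokens("RT @user: Obama is winning tonight :) http://t.co/abc #election")
```

For the hypothetical tweet above, only the tokens "winning" and "tonight" survive the
filters, and `bigrams(tokens)` yields the single pair built from them.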
An example of a parsed positive tweet is “Not even an American but i’m hoping for whatever
happens in America somehow affects Singapore too”, and a negative tweet collected in this
manner is “lost the popular vote but won by the electoral vote which is sad because the 50%
for just lost their voice”.
Unigram and bigram features were obtained from these tweets. After training the classifier,
40000 tweets were randomly sampled from the data set for classification. The sampled tweets
were checked for adjectivity and English language in the same way as the training samples,
and were then divided into Romney and Obama tweets so that the sentiments for each candidate
could be compared. Since 40000 samples spread over the collection period amount to only
about 16 tweets per minute, we aggregated the sentiments into 10 minute windows. The
prediction and sampling strategies are evaluated by taking sub-samples from these aggregated
sentiments. After obtaining the sentiments, we applied Gaussian process regression with
hyperparameter learning using maximum likelihood estimation. Since ln p(ts|Φ) is generally
non-convex and can have multiple maxima, the starting points for the hyperparameters were
selected by visual inspection of the data.
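
The windowing step can be sketched as follows. The sketch assumes that each classified tweet
contributes a label of +1 or −1 and that a window score is the sum of its labels; the exact
aggregation rule of section 2.2.2 is not reproduced here, and the function name and the
example data are hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 10 * 60  # ten minute windows, as in the text

def aggregate_sentiments(labeled_tweets, start_time):
    """Sum the +1/-1 labels falling in each window; returns {window_index: score}."""
    totals = defaultdict(int)
    for timestamp, label in labeled_tweets:
        window = int((timestamp - start_time) // WINDOW_SECONDS)
        totals[window] += label
    return dict(totals)

# Average rate behind the choice of window: 40000 tweets over 42 hours.
tweets_per_minute = 40000 / (42 * 60)

# Hypothetical (seconds since start, label) pairs.
example = [(0, +1), (30, -1), (599, +1), (600, +1), (1900, -1)]
scores = aggregate_sentiments(example, start_time=0)
```

Windows that receive no tweets are simply absent from the result, which matches the view of
the aggregated series as a set of samples at irregular time locations.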

2.4 Results

2.4.1 Classification Results

Classification accuracy was obtained by using 66% of the labeled data for training and 34%
for testing. Table 2.3 summarizes the results. The temporal analysis of the data was done
only with unigram features and the Naive Bayes classifier, which was selected because of its
simplicity and good results. For the sake of completeness, a comparison with Support Vector
Machines and with bigram features is also given. Table 2.3 shows that the Naive Bayes
classifier with unigrams as features works better for our application.

Table 2.3: Classification Results

Feature Set   Classification framework   Accuracy   Precision
Unigram       Naive Bayes                80.7%      81.2%
              SVM                        77.8%      81.3%
Bigram        Naive Bayes                73.2%      74.6%
              SVM                        75.4%      79.1%

Table 2.4 shows the likelihood ratio for the top features selected using the Naive Bayes
algorithm. It is interesting to note that most of the top features are for negative polarity.
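
One way to compute a NEG:POS ratio of the kind listed in Table 2.4 is to compare how often a
word occurs in negative and in positive training tweets. The sketch below is an assumption
about the general idea and not necessarily the estimator used to produce the table: the
additive smoothing constant, the function name and the toy documents are all hypothetical.

```python
def likelihood_ratio(word, neg_docs, pos_docs, smoothing=1.0):
    """Ratio P(word present | NEG) / P(word present | POS) with additive smoothing."""
    neg_hits = sum(1 for doc in neg_docs if word in doc)
    pos_hits = sum(1 for doc in pos_docs if word in doc)
    p_neg = (neg_hits + smoothing) / (len(neg_docs) + 2 * smoothing)
    p_pos = (pos_hits + smoothing) / (len(pos_docs) + 2 * smoothing)
    return p_neg / p_pos

# Toy token sets standing in for filtered tweets.
neg_docs = [{"lost", "vote"}, {"lost"}, {"gone"}]
pos_docs = [{"happy"}, {"vote"}, {"well"}]

ratio_lost = likelihood_ratio("lost", neg_docs, pos_docs)  # above 1: leans negative
ratio_well = likelihood_ratio("well", neg_docs, pos_docs)  # below 1: leans positive
```

A ratio above one would be reported as NEG:POS = r:1 and a ratio below one as 1:(1/r), which
is the convention visible in Table 2.4.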
The sentiments aggregated over ten minute windows are shown in Figure 2.1. As can be seen,
even with a ten minute window the results are quite noisy, and it is difficult to discern
the true temporal pattern of the sentiments expressed around election time. However, the
graph shows that Obama had an overall better sentiment score than Romney. The total
sentiment score for Obama during the 42 hour period around the election is 6597, whereas
Romney receives 4300.

Figure 2.1: Aggregated sentiments for Obama (blue) and Romney (red)

Table 2.4: Most informative features

Feature Name   NEG:POS
sad            114.3:1
damn           100.8:1
hang           69.7:1
pic            48.5:1
returns        41.4:1
winning        32.3:1
chicago        29.9:1
aww            29.2:1
2013           1:28.7
lose           25.7:1
footage        24.7:1
well           1:24.6
losing         23.8:1
hilarious      1:21.8
estan          1:20.6
gone           18.4:1
cry            16.9:1
approve        1:15.6

2.4.2 Gaussian process based Temporal Model

Due to the non-convex nature and multiple maxima of the likelihood function, the selection
of the starting point for estimating the hyperparameters is of prime importance. Since the
prediction data is low dimensional, we used visual inspection of the data to select the
initial points. The selected starting point for Obama’s sentiments was σf = 30 and σt = 10,
and a local maximum was found at σf = 23.7 and σt = 14.95. For the time series representing
sentiments expressed about Romney, the selected initial point was σf = 15 and σt = 10, and
it converged to a local maximum at σf = 12.3 and σt = 8.1. Using the Gaussian process
framework, the target sentiments were estimated. Figure 2.2 shows the temporal behavior of
the sentiments emerging from the noisy signal. This estimation is performed using sentiments
obtained from only 16 hours of data instead of all 42 hours. The samples were drawn uniformly
at random, and the model was used to predict the sentiments at the unsampled time locations.
The variation of the pointwise confidence interval in Figure 2.3 reflects the missing samples
along with the classification error noise. It is also evident from the figure that the
framework models both local and global effects. The plateaus in sentiment scores are retained.
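
The prediction step can be illustrated with standard Gaussian process regression, as in [34].
The sketch below is a simplified stand-in and not the truncated scheme developed in this
thesis: it uses all training samples, a zero mean, the squared exponential kernel, and a
single noise standard deviation `sigma_n`. The function names, the hyperparameter values and
the data are hypothetical.

```python
import math

def kernel(t1, t2, sigma_f, sigma_t):
    return sigma_f ** 2 * math.exp(-((t1 - t2) ** 2) / (2.0 * sigma_t ** 2))

def cholesky(A):
    """Lower-triangular L with A = L L^T for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward and then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(train_t, train_s, test_t, sigma_f, sigma_t, sigma_n):
    """Posterior mean and variance of the latent sentiment at each test time."""
    K = [[kernel(a, b, sigma_f, sigma_t) + (sigma_n ** 2 if i == j else 0.0)
          for j, b in enumerate(train_t)] for i, a in enumerate(train_t)]
    L = cholesky(K)
    alpha = solve_cholesky(L, train_s)
    means, variances = [], []
    for t in test_t:
        k_star = [kernel(t, a, sigma_f, sigma_t) for a in train_t]
        v = solve_cholesky(L, k_star)
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        variances.append(kernel(t, t, sigma_f, sigma_t)
                         - sum(k * w for k, w in zip(k_star, v)))
    return means, variances

# Hypothetical aggregated sentiments at four sampled window times.
means, variances = gp_predict([0.0, 10.0, 20.0, 40.0], [5.0, 12.0, 9.0, -3.0],
                              [10.0, 30.0, 500.0],
                              sigma_f=10.0, sigma_t=10.0, sigma_n=0.1)
```

In this sketch the predictive variance is small at a sampled time, larger between samples,
and returns to σf² far from all samples, which is the behavior that the confidence interval
in Figure 2.3 is meant to convey.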

Figure 2.2: Predicted target sentiments (green) with aggregated sentiments (blue) for Obama

Figure 2.3: Predicted function (green) for Obama sentiments (blue) with confidence interval (grey)

The predicted target distribution of sentiments for Obama and Romney is shown in
Figure 2.4. The distributions are more meaningful as compared to the noisy sentiments
observed through simple aggregation.

2.4.3 Effects of Sampling

To study the effect of sample size on prediction, sentiments were obtained from the labeled
data of the training set, and the data was segmented into 255 equal time windows, each
spanning 10 minutes. Gaussian process regression was used to make predictions while varying
the sample size from 1 to 255. For each sample size, the mean square distance between the
predicted mean and the actual data was calculated. As can be seen from Figure 2.5, the error
between the prediction and the actual data settles at around 60 samples. Hence, accurate
predictions can be made by using only about 23% of the data (60 of 255 windows).
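
The two bookkeeping pieces of this experiment, drawing a random subset of windows and
measuring the mean square distance, can be sketched as follows. The regression itself is
omitted, and the function names and the fixed random seed are hypothetical.

```python
import random

def mean_square_distance(predicted, actual):
    """Average squared difference between predicted and actual window scores."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def sample_windows(num_windows, sample_size, seed=0):
    """Sorted indices of windows drawn uniformly at random without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_windows), sample_size))

indices = sample_windows(255, 60)
fraction = len(indices) / 255  # about 0.235, the 23% quoted in the text
```

Repeating the draw for every sample size from 1 to 255 and recording the distance between
the prediction and the full series would give a curve of the kind shown in Figure 2.5.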

Figure 2.4: Predicted function (green) for Romney sentiments (red) with confidence interval (grey)
[Vertical axis: Predicted Sentiments. Panel label: (b) Using 30 random points of data]

Figure 2.5: Decrease of mean square error from original temporal sentiments with increase
of samples
[Horizontal axis: Number of Samples. Vertical axis: Mean square distance from true values]


2.5 Conclusion and Future Work

In this study, we have surveyed the challenges incurred during temporal modeling of
sentiments. In particular, we have identified four problems: scalability, sampling,
classification error, and capturing both local and global phenomena. We have proposed a
Gaussian process framework that addresses these challenges. The extensibility of the model
has been discussed and a mathematical formulation of spatio-temporal prediction is given. As
a case study, Twitter data covering the 42 hours leading up to and through election day is
used. The predicted sentiments in the temporal model have been shown to be a better indicator
than traditional aggregation schemes. Finally, it has been shown that high confidence in
prediction can be achieved with only 23% of the samples. This is still a new field of
research with many interesting open problems. The error model in this study is a linear
Gaussian noise variable. Better stochastic modeling with deterministic parameters can be used
to improve the prediction. Furthermore, Gaussian random fields can be used instead of a
continuous function to decrease the computational complexity of spatio-temporal models. This
study only looked at the effects of prediction with random sampling of data. Better sampling
models that are optimal for sentiment modeling are desired.

APPENDICES

Appendix A
Generation of samples of the center line
In what follows, we show how to generate a finite collection of samples on the center line
of an AAA surface. As a first step, the method requires open ends of the vessel. These are
obtained by transversely truncating the AAA surface with truncation planes 2.5mm from the
top and bottom of the vessel. The data thus obtained is a subset of the surface data and is
denoted by D. The center line is initialized at the middle point of the bottom-most
transverse plane. A vector a is drawn between this initial guess and the point on the AAA
surface closest to it. The initial center point is then pushed in the direction away from
vector a by a constant amount δ (1mm). This step is repeated a pre-defined number of times,
with δ halved whenever the center point location has not changed by more than 1mm over 15
iterations. The next center line point is obtained by a linear shift of the previous point
in the z-axis direction. For each of the remaining center points along the length of the
AAA, an initial guess is obtained by a linear shift in the direction of a vector b drawn
between the previous two center points. A vector c is then obtained as the projection of a
onto the plane normal to b, where a is calculated as explained before. The center point is
translated in the direction opposite to vector c by δ. This process is repeated a pre-defined
number of times, with δ halved whenever the location of the center point has not changed by
1mm over a fixed number of iterations. The procedure is summarized in Algorithm 1.

Algorithm 1 Generation of Samples of Center Line
Input:
  D = [dα(1), dα(2), . . . , dα(n)]^T ∀ α ∈ {x, y, z}
Output:
  ρ(s)                                               ▷ Center line of Aorta
Algorithm:
  l ← 1
  for all i ∈ I do
    if dz(i) = min(dz) then
      bx(l) ← dx(i)
      by(l) ← dy(i)
      l ← l + 1
    end if
  end for
  C(1) ← {average(bx), average(by), min(dz)}
  cinit ← C(1)
  for l = 1 → MaxNumIters do                         ▷ δ is selected as 1mm–2mm
    dmin ← arg min over D(i), i ∈ I, of ‖v_{D(i)/C(1)}‖
    a ← v_{C(1)/dmin}
    C(1) ← C(1) − δ × a/‖a‖
    if l mod 15 = 0 and ‖v_{cinit/C(1)}‖ ≤ 1mm then
      δ ← δ/2
    end if
  end for
  C(2) ← {Cx(1), Cy(1), Cz(1) + v}                   ▷ v is constant
  k ← 3
  while Cz(k) ≤ max(dz) do
    b ← v_{C(k−1)/C(k−2)}
    Cd ← C(k − 1) + v × b/‖b‖
    Cinit ← Cd
    for l = 1 → MaxNumIters do                       ▷ δ is selected as 1mm–2mm
      dmin ← arg min over D(i), i ∈ I, of ‖v_{D(i)/C(k)}‖
      a ← v_{Cd/dmin}
      c ← (b × (a × b)/‖b‖)/‖b‖
      if l mod 15 = 0 and ‖v_{Cinit/C(k)}‖ ≤ 1mm then
        δ ← δ/2
      end if
      C(k) ← Cd − δ × c/‖c‖
    end for
    k ← k + 1
  end while
  ρ(s) = Σ_{i=1}^{m} φi(s) C(i)
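
The geometric core of Algorithm 1, projecting a onto the plane normal to b and pushing the
center point by δ, can be sketched in Python as follows. This is an illustration under
stated assumptions rather than the original implementation: a is assumed to point from the
current center towards its nearest surface point, the helper names are hypothetical, and the
iteration control (halving of δ, stopping rule) is left out.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(x * x for x in u))

def project_onto_normal_plane(a, b):
    """c = b x (a x b) / |b|^2, which equals a minus its component along b."""
    bb = sum(x * x for x in b)
    return tuple(x / bb for x in cross(b, cross(a, b)))

def push_step(center, surface_points, b, delta):
    """One update C <- C - delta * c / |c|, away from the nearest surface point."""
    d_min = min(surface_points, key=lambda p: norm(sub(p, center)))
    a = sub(d_min, center)              # assumed direction: center -> nearest point
    c = project_onto_normal_plane(a, b)
    length = norm(c)
    if length == 0.0:                   # nearest point lies along b: no in-plane push
        return center
    return tuple(ci - delta * x / length for ci, x in zip(center, c))

c_example = project_onto_normal_plane((1.0, 2.0, 3.0), (0.0, 0.0, 1.0))
ring = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
moved = push_step((0.5, 0.0, 0.0), ring, b=(0.0, 0.0, 1.0), delta=0.25)
```

In the example, the off-center point (0.5, 0, 0) is closest to the surface point (1, 0, 0)
and is therefore moved to (0.25, 0, 0), that is, towards the middle of the ring.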

Appendix B
Surface parameterization
In what follows, we provide detailed information on how to parameterize the surface from
the point cloud data D with respect to the calculated center line ρ(s). A coordinate system N
is defined to acquire longitudinal acquisition planes. The first vector defining the
coordinate system, N1(s), is the unit vector drawn between consecutive center points in ρ(s).
N2(s) is obtained from the known Cartesian standard basis so that it is perpendicular to
N1(s), whereas N3(s) is obtained as the cross product of N1(s) and N2(s) for each s. The
point cloud data D belonging to these longitudinal planes are identified for each s by a
minimum distance criterion using the dot product. The points satisfying this criterion are
then used to obtain rh, where h indexes the points which lie on the longitudinal plane
located at s. At each longitudinal plane defined by (N1(s), N2(s), N3(s)), a collection of
vectors rh exists in Cartesian coordinates, describing the displacement from the center line
point ρ(s) to the surface points D. To analyze the data more efficiently on a per-plane
basis, a transformation to polar coordinates is applied. In polar coordinates, the magnitudes
of the rh vectors represent the radius. By taking dot products of each rh vector with N2(s)
and N3(s) within each longitudinal plane at a given s, the angular value θ associated with
each rh within that plane is obtained. This procedure is summarized in Algorithm 2.

Algorithm 2 Surface Parameterization
Input:
  D = [dα(1), dα(2), . . . , dα(n)]^T ∀ α ∈ {x, y, z}
  ρ(s)                                               ▷ Center line of Aorta
Output:
  N(s) = col(N1(s), N2(s), N3(s))                    ▷ Longitudinal planes along s
  rh(s, θ)
Algorithm:
  for all s do
    N1(s) ← ∂ρ(s)/∂s
    N2(s) ← (ex − (N1(s) · ex) N1(s)) / ‖ex − (N1(s) · ex) N1(s)‖
    N3(s) ← N1(s) × N2(s)
  end for
  for all s do
    h ← 0
    for all i ∈ {1, . . . , n} do
      if |N1(s) · v_{ρ(s)/D(i)}| < 0.01 then
        h ← h + 1
        rh(s) ← v_{ρ(s)/D(i)}
      end if
    end for
    GetAngle(N, rh(s))
  end for

  function GetAngle(N, rh(s))
    if rh(s) · N2(s) = 1 then
      θh(s) ← 0
    else if rh(s) · N2(s) = −1 then
      θh(s) ← π
    else if (|rh(s) · N2(s)| ≥ 0 and rh(s) · N3(s) ≥ 0) then
      θh(s) ← cos⁻¹(rh(s) · N2(s))
    else   ▷ (|rh(s) · N2(s)| ≥ 0 and rh(s) · N3(s) < 0)
      θh(s) ← −cos⁻¹(rh(s) · N2(s)) + 2π
    end if
  end function
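
The angle computation of the GetAngle function can be sketched in Python as follows. The
sketch assumes that N2 and N3 are orthonormal and that r is nonzero; unlike the pseudocode,
it normalizes r before taking the inverse cosine and clamps the cosine to [−1, 1] to guard
against rounding. The function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def get_angle(r, n2, n3):
    """Angle of r in [0, 2*pi), measured from n2 towards n3 within the plane."""
    length = math.sqrt(dot(r, r))
    cos_theta = max(-1.0, min(1.0, dot(r, n2) / length))
    theta = math.acos(cos_theta)
    # Points on the negative n3 side are mapped to the upper half of [0, 2*pi).
    if dot(r, n3) < 0 and theta > 0:
        theta = 2.0 * math.pi - theta
    return theta

n2, n3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
angles = [get_angle(r, n2, n3)
          for r in [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]]
```

For the four example vectors, which point along +N2, +N3, −N2 and −N3, the angles are 0,
π/2, π and 3π/2 respectively, and the radius is the length of each vector.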

BIBLIOGRAPHY

[1] C. M. Porth, Essentials of pathophysiology: Concepts of altered health states. Lippincott
Williams & Wilkins, 2010.
[2] A. R. Zankl, H. Schumacher, U. Krumsdorf, H. A. Katus, L. Jahn et al., “Pathology,
natural history and treatment of abdominal aortic aneurysms,” Clinical Research in
Cardiology, vol. 96, no. 3, pp. 140–151, 2007.
[3] A. Klink, F. Hyaﬁl, J. Rudd, P. Faries, V. Fuster, Z. Mallat, O. Meilhac, W. J. Mulder, J.-B. Michel, F. Ramirez et al., “Diagnostic and therapeutic strategies for small
abdominal aortic aneurysms,” Nature Reviews Cardiology, vol. 8, no. 6, pp. 338–347,
2011.
[4] H. Kniemeyer, T. Kessler, P. U. Reber, H. B. Ris, H. Hakki, and M. K. Widmer,
“Treatment of ruptured abdominal aortic aneurysm, a permanent challenge or a waste
of resources? prediction of outcome using a multi-organ-dysfunction score,” European
Journal of Vascular and Endovascular Surgery, pp. 190–196, 2000.
[5] M. Wassef, B. T. Baxter, R. L. Chisholm, R. L. Dalman, M. F. Fillinger, J. Heinecke,
J. D. Humphrey, H. Kuivaniemi, W. C. Parks, W. H. Pearce et al., “Pathogenesis of
abdominal aortic aneurysms: a multidisciplinary research program supported by the
national heart, lung, and blood institute,” Journal of Vascular Surgery, vol. 34, no. 4,
pp. 730–738, 2001.
[6] S. Einav, J. Ricotta, and D. Bluestein, “Abdominal aortic aneurysm risk of rupture:
patient-speciﬁc FSI simulations using anisotropic model,” Journal of Biomechanical
Engineering, vol. 131, pp. 031 001–1, 2009.
[7] M. F. Fillinger, M. L Raghavan, S. P. Marra, J. L. Cronenwett, F. E. Kennedy et al., “In
vivo analysis of mechanical wall stress and abdominal aortic aneurysm rupture risk.”
Journal of Vascular Surgery, vol. 36, 2002.
[8] D. Bluestein, K. Dumont, M. De Beule, J. Ricotta, P. Impellizzeri, B. Verhegghe, and
P. Verdonck, “Intraluminal thrombus and risk of rupture in patient speciﬁc abdominal
aortic aneurysm–FSI modelling,” Computer Methods in Biomechanics and Biomedical
engineering, vol. 12, no. 1, pp. 73–81, 2009.


[9] B. Wolters, M. C. M. Rutten, G. W. H. Schurink, U. Kose, J. De Hart, and F. N.
Van De Vosse, “A patient-speciﬁc computational model of ﬂuid–structure interaction in
abdominal aortic aneurysms,” Medical Engineering & physics, pp. 871–883, 2005.
[10] F. H. Epstein, G. H. Gibbons, and V. J. Dzau, “The emerging concept of vascular
remodeling,” New England Journal of Medicine, vol. 330, no. 20, pp. 1431–1438, 1994.
[11] J. D. Humphrey, Cardiovascular solid mechanics: cells, tissues, and organs. Springer
Verlag, 2002.

[12] N. Resnick, H. Yahav, A. Shay-Salit, M. Shushy, S. Schubert, L. C. M. Zilberman,
and E. Wofovitz, “Fluid shear stress and the vascular endothelium: for better and for
worse,” Progress in Biophysics and molecular biology, vol. 81, no. 3, pp. 177–199, 2003.
[13] A. B. Driss, J. Benessiano, P. Poitevin, B. I. Levy, and J.-B. Michel, “Arterial expansive
remodeling induced by high ﬂow rates,” American Journal of Physiology-Heart and
Circulatory Physiology, vol. 272, no. 2, pp. H851–H858, 1997.
[14] M. Zamir, “Shear forces and blood vessel radii in the cardiovascular system.” The Journal of General physiology, vol. 69, no. 4, pp. 449–461, 1977.
[15] M. A. Hajdu and G. L. Baumbach, “Mechanics of large and small cerebral arteries in
chronic hypertension,” American Journal of Physiology-Heart and Circulatory Physiology, vol. 266, no. 3, pp. H1027–H1033, 1994.
[16] J.-J. Hu, S. Baek, and J. D. Humphrey, “Stress–strain behavior of the passive basilar
artery in normotension and hypertension,” Journal of biomechanics, vol. 40, no. 11, pp.
2559–2563, 2007.
[17] J. D. Humphrey, J. Eberth, W. Dye, and R. L. Gleason, “Fundamental role of axial
stress in compensatory adaptations by arteries,” Journal of biomechanics, vol. 42, no. 1,
pp. 1–8, 2009.
[18] Z. S. Jackson, D. Dajnowiec, A. I. Gotlieb, and B. L. Langille, “Partial oﬀ-loading
of longitudinal tension induces arterial tortuosity,” Arteriosclerosis, thrombosis, and
vascular biology, vol. 25, no. 5, pp. 957–962, 2005.
[19] Z. S. Jackson, A. I. Gotlieb, and B. L. Langille, “Wall tissue remodeling regulates
longitudinal tension in arteries,” Circulation research, vol. 90, no. 8, pp. 918–925, 2002.


[20] S. Baek, K. R. Rajagopal, J. D. Humphrey et al., “A theoretical model of enlarging
intracranial fusiform aneurysms,” Transactions-ASME Journal of Biomechanical Engineering,
vol. 128, no. 1, p. 142, 2006.
[21] S. Baek, A. Valentin, and J. D. Humphrey, “Biochemomechanics of cerebral vasospasm
and its resolution: II. Constitutive relations and model simulations,” Annals of biomedical
engineering, vol. 35, no. 9, pp. 1498–1509, 2007.
[22] R. L. Gleason and J. D. Humphrey, “A mixture model of arterial growth and remodeling
in hypertension: altered muscle tone and tissue turnover,” Journal of vascular research,
vol. 41, no. 4, pp. 352–363, 2004.
[23] ——, “Eﬀects of a sustained extension on arterial growth and remodeling: a theoretical
study,” Journal of biomechanics, vol. 38, no. 6, pp. 1255–1261, 2005.
[24] S. Zeinali-Davarani, A. Sheidaei, and S. Baek, “A ﬁnite element model of stress-mediated
vascular adaptation: application to abdominal aortic aneurysms,” Computer methods
in biomechanics and biomedical engineering, vol. 14, no. 9, pp. 803–817, 2011.
[25] A. Sheidaei, S. C. Hunley, S. Zeinali-Davarani, L. G. Raguin, and S. Baek, “Simulation of
abdominal aortic aneurysm growth with updating hemodynamic loads using a realistic
geometry,” Medical engineering & physics, vol. 33, no. 1, pp. 80–88, 2011.
[26] S. Zeinali-Davarani, J. Choi, and S. Baek, “On parameter estimation for biaxial mechanical behavior of arteries,” Journal of biomechanics, vol. 42, no. 4, pp. 524–530,
2009.
[27] J. S. Wilson, S. Baek, and J. D. Humphrey, “Parametric study of eﬀects of collagen
turnover on the natural history of abdominal aortic aneurysms,” Proceedings of the
Royal Society A: Mathematical, Physical and Engineering Science, vol. 469, no. 2150,
2013.
[28] S. Zeinali-Davarani, L. G. Raguin, D. A. Vorp, and S. Baek, “Identiﬁcation of in vivo
material and geometric parameters of a human aorta: toward patient-speciﬁc modeling
of abdominal aortic aneurysm,” Biomechanics and modeling in mechanobiology, vol. 10,
no. 5, pp. 689–699, 2011.
[29] J. Kim and S. Baek, “Circumferential variations of mechanical behavior of the porcine
thoracic aorta during the inﬂation test,” Journal of biomechanics, vol. 44, no. 10, pp.
1941–1947, 2011.


[30] S. T. Kwon, J. E. Rectenwald, S. Baek et al., “Intrasac pressure changes and vascular remodeling after endovascular repair of abdominal aortic aneurysms: review and
biomechanical model simulation.” Journal of biomechanical engineering, vol. 133, no. 1,
p. 011011, 2011.
[31] Y. Xu, J. Choi, and S. Oh, “Mobile sensor network navigation using gaussian processes
with truncated observations,” IEEE Transactions on Robotics, vol. 27, no. 6, pp. 1118–
1131, 2011.
[32] Y. Xu and J. Choi, “Adaptive sampling for learning Gaussian processes using mobile
sensor networks,” Sensors, vol. 11, no. 3, pp. 3051–3066, 2011.
[33] A. Ijaz, J. Choi, W. Lee, and S. Baek, “Prediction of abdominal aortic aneurysms using
sparse gaussian process regression,” Proceedings of the ASME 2013 Summer Bioengineering Conference, 2013.
[34] C. Williams and C. Rasmussen, “Gaussian processes for machine learning,” MIT Press,
2006.
[35] A. J. Smola and P. Bartlett, “Sparse greedy Gaussian process regression,” in Advances
in Neural Information Processing Systems 13. Citeseer, 2001.
[36] C. Williams and M. Seeger, “Using the Nyström method to speed up kernel machines,”
in Advances in Neural Information Processing Systems 13. Citeseer, 2001.
[37] N. D. Lawrence, M. Seeger, and R. Herbrich, “Fast sparse Gaussian process methods:
The informative vector machine,” Advances in neural information processing systems,
vol. 15, no. 15, pp. 609–616, 2002.
[38] M. Seeger, “Bayesian Gaussian process models: Pac-bayesian generalisation error
bounds and sparse approximations,” 2003.
[39] V. Tresp, “A bayesian committee machine,” Neural Computation, vol. 12, no. 11, pp.
2719–2741, 2000.
[40] A. Brix and P. J. Diggle, “Spatiotemporal prediction for log-Gaussian cox processes,”
Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 63,
no. 4, pp. 823–841, 2001.
[41] A. J. Storkey, “Truncated covariance matrices and toeplitz methods in Gaussian processes,” in Artiﬁcial Neural Networks, vol. 1. IET, 1999, pp. 55–60.

[42] S. S. Raut, S. Chandra, J. Shum, and E. A. Finol, “The role of geometric and biomechanical factors in abdominal aortic aneurysm rupture risk assessment,” Annals of biomedical
engineering, pp. 1–19, 2013.
[43] M. Gaudard, M. Karson, E. Linder, and D. Sinha, “Bayesian spatial prediction,” Environmental and Ecological Statistics, vol. 6, no. 2, pp. 147–171, 1999.
[44] Y. Xu, J. Choi, S. Dass, and T. Maiti, “Sequential bayesian prediction and adaptive
sampling algorithms for mobile sensor networks,” IEEE Transactions on Automatic
Control, vol. 57, no. 8, pp. 2078–2084, 2012.
[45] J. Weng and B. Lee, “Event detection in twitter,” Proc. of ICWSM, 2011.
[46] M. Paul and M. Dredze, “You are what you tweet: Analyzing twitter for public health,”
in Fifth International AAAI Conference on Weblogs and Social Media (ICWSM 2011),
2011.
[47] A. Tumasjan, T. Sprenger, P. Sandner, and I. Welpe, “Predicting elections with twitter:
What 140 characters reveal about political sentiment,” in Proceedings of the fourth
international aaai conference on weblogs and social media, 2010, pp. 178–185.
[48] S. Verma, S. Vieweg, W. Corvey, L. Palen, J. Martin, M. Palmer, A. Schram, and K. Anderson,
“Natural language processing to the rescue?: Extracting ‘situational awareness’
tweets during mass emergency,” Proc. ICWSM, 2011.
[49] B. O’Connor, R. Balasubramanyan, B. Routledge, and N. Smith, “From tweets to polls:
Linking text sentiment to public opinion time series,” in Proceedings of the International
AAAI Conference on Weblogs and Social Media, 2010, pp. 122–129.
[50] S. Morinaga, K. Yamanishi, K. Tateishi, and T. Fukushima, “Mining product reputations on the web,” in Conference on Knowledge Discovery in Data: Proceedings of
the eighth ACM SIGKDD international conference on Knowledge discovery and data
mining, vol. 23. Citeseer, 2002, pp. 341–349.
[51] J. Blitzer, M. Dredze, and F. Pereira, “Biographies, bollywood, boom-boxes and
blenders: Domain adaptation for sentiment classification,” in Annual Meeting-Association
for Computational Linguistics, vol. 45, no. 1, 2007, p. 440.
[52] R. Balasubramanyan, W. Cohen, D. Pierce, and D. Redlawsk, “Modeling polarizing
topics: When do diﬀerent political communities respond diﬀerently to the same news?”
in Sixth International AAAI Conference on Weblogs and Social Media, 2012.

[53] L. Xiang, Q. Yuan, S. Zhao, L. Chen, X. Zhang, Q. Yang, and J. Sun, “Temporal
recommendation on graphs via long-and short-term preference fusion,” in Proceedings
of the 16th ACM SIGKDD international conference on Knowledge discovery and data
mining. ACM, 2010, pp. 723–732.
[54] B. Pang and L. Lee, Opinion mining and sentiment analysis. Now Pub, 2008.
[55] M. De Choudhury, Y. Lin, H. Sundaram, K. Candan, L. Xie, and A. Kelliher, “How
does the data sampling strategy impact the discovery of information diﬀusion in social
media,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social
Media, 2010, pp. 34–41.
[56] B. Pang and L. Lee, “A sentimental education: Sentiment analysis using subjectivity
summarization based on minimum cuts,” in Proceedings of the 42nd Annual Meeting on
Association for Computational Linguistics. Association for Computational Linguistics,
2004, p. 271.
[57] J. Wiebe, R. Bruce, and T. O’Hara, “Development and use of a gold-standard data
set for subjectivity classiﬁcations,” in Proceedings of the 37th annual meeting of the
Association for Computational Linguistics on Computational Linguistics. Association
for Computational Linguistics, 1999, pp. 246–253.
[58] Y. Mejova and P. Srinivasan, “Exploring feature definition and selection for sentiment
classifiers,” in Proceedings of the Fifth International AAAI Conference on Weblogs and
Social Media (ICWSM-2011), 2011.
[59] W. Peng and D. Park, “Generate adjective sentiment dictionary for social media sentiment
analysis using constrained nonnegative matrix factorization,” Urbana, vol. 51, p.
61801, 2004.
[60] Y. Lu, M. Castellanos, U. Dayal, and C. Zhai, “Automatic construction of a context-aware
sentiment lexicon: an optimization approach,” in Proceedings of the 20th international
conference on World wide web. ACM, 2011, pp. 347–356.
[61] S. Baccianella, A. Esuli, and F. Sebastiani, “Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining,” in Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC10), Valletta, Malta,
May. European Language Resources Association (ELRA), 2010.
[62] A. Rafraﬁ, V. Guigue, and P. Gallinari, “Coping with the document frequency bias
in sentiment classiﬁcation,” in Sixth International AAAI Conference on Weblogs and
Social Media, 2012.

[63] L. Chen, W. Wang, M. Nagarajan, S. Wang, and A. Sheth, “Extracting diverse sentiment expressions with target-dependent polarity from twitter,” in Proceedings of the
Sixth International AAAI Conference on Weblogs and Social Media (ICWSM), 2012,
pp. 50–57.
[64] D. Davidov, O. Tsur, and A. Rappoport, “Enhanced sentiment learning using twitter
hashtags and smileys,” in Proceedings of the 23rd International Conference on Computational Linguistics: Posters. Association for Computational Linguistics, 2010, pp.
241–249.
[65] E. Kouloumpis, T. Wilson, and J. Moore, “Twitter sentiment analysis: The good the
bad and the omg,” in Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, 2011.
[66] E. Gilbert and K. Karahalios, “Widespread worry and the stock market,” in Proceedings
of the international conference on weblogs and social media, vol. 2, no. 1, 2010, pp. 229–
247.
[67] M. Koppel and I. Shtrimberg, “Good news or bad news? let the market decide,” Computing attitude and aﬀect in text: Theory and applications, pp. 297–301, 2006.
[68] P. S. Dodds and C. M. Danforth, “Measuring the happiness of large-scale written expression: Songs, blogs, and presidents,” Journal of Happiness Studies, vol. 11, no. 4,
pp. 441–456, 2010.
[69] N. Nori, D. Bollegala, and M. Ishizuka, “Exploiting user interest on social media for aggregating diverse data and predicting interest,” in Fifth International AAAI Conference
on Weblogs and Social Media, 2011.
[70] O. Dabeer, P. Mehendale, A. Karnik, and A. Saroop, “Timing tweets to increase eﬀectiveness of information campaigns,” Proc. ICWSM, 2011.
[71] M. Naaman, A. Zhang, S. Brody, and G. Lotan, “On the study of diurnal urban routines
on twitter,” in Sixth International AAAI Conference on Weblogs and Social Media, 2012.
[72] C. Manning and H. Schütze, Foundations of statistical natural language processing. MIT
press, 1999.
[73] D. B. Peizer and J. W. Pratt, “A normal approximation for binomial, f, beta, and other
common, related tail probabilities, i,” Journal of the American Statistical Association,
vol. 63, no. 324, pp. 1416–1456, 1968.

[74] “Twitter votes to obama.” [Online]. Available:
http://www.buzzfeed.com/jwherrman/twitter-users-say-they-voted-for-obama-2-to1
[75] A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classiﬁcation using distant supervision,” CS224N Project Report, Stanford, pp. 1–12, 2009.
[76] A. Pak and P. Paroubek, “Twitter as a corpus for sentiment analysis and opinion mining,” in Proceedings of LREC, vol. 2010, 2010.
[77] C. Fellbaum, “Wordnet,” Theory and Applications of Ontology: Computer Applications,
pp. 231–243, 2010.
