COMMUNITY DETECTION IN TEMPORAL MULTI-LAYER NETWORKS

By

Esraa Mustafa Al-sharoa

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Electrical Engineering - Doctor of Philosophy

2019

ABSTRACT

COMMUNITY DETECTION IN TEMPORAL MULTI-LAYER NETWORKS

By

Esraa Mustafa Al-sharoa

Many real world systems and relational data can be modeled as networks or graphs. With the availability of large amounts of network data, it is important to be able to reduce the network's dimensionality and extract useful information from it. A key approach to network data reduction is community detection. The objective of community detection is to summarize the network by a set of modules, where the similarity within the modules is maximized while the similarity between different modules is minimized. Early work in graph based community detection methods focused on static or single-layer networks. This type of network is usually considered an oversimplification of many real world complex systems, such as social networks, where there may be different types of relationships that evolve with time. Consequently, there is a need for a meaningful representation of such complex systems. Recently, multi-layer networks have been used to model complex systems where the objects may interact through different mechanisms. However, there is a limited amount of work on community detection methods for dynamic and multi-layer networks.

In this thesis, we focus on detecting and tracking the community structure in dynamic and multi-layer networks. Two particular applications of interest are considered: temporal social networks and dynamic functional connectivity networks (dFCNs) of the brain. In order to detect the community structure in dynamic single-layer and multi-layer networks, we have developed methods that capture the structure of these complex networks. In Chapter 2, a low-rank + sparse estimation based evolutionary spectral clustering approach is proposed to detect and track the community structure in temporal networks. The proposed method decomposes the network into low-rank and sparse parts and simultaneously obtains smooth cluster assignments by minimizing the subspace distance between consecutive time points. The effectiveness of the proposed approach is evaluated on several synthetic and real social temporal networks and compared to existing state-of-the-art algorithms. As the method developed in Chapter 2 is limited to dynamic single-layer networks and can only take a limited amount of historic information into account, a tensor-based approach is developed in Chapter 3 to detect the community structure in dynamic single-layer and multi-layer networks. The proposed framework is used to track the change points as well as identify the community structure across time and multiple subjects of dFCNs constructed from resting state functional magnetic resonance imaging (rs-fMRI) data. The dFCNs are summarized into a set of FC states that are consistent over time and subjects. The detected community structures are evaluated using a consistency measure. In Chapter 4, an information-theoretic approach is introduced to aggregate the dynamic networks and identify the time points that are topologically similar in order to combine them into a tensor. The community structure of the reduced network is then detected using a tensor based approach similar to the one described in Chapter 3.
In Chapter 5, a temporal block spectral clustering framework is introduced to detect and track the community structure of multi-layer temporal networks. A set of intra- and inter-layer adjacency matrices is constructed and combined to create a set of temporal supra-adjacency matrices. In particular, both the connections between nodes of the network within a time window, i.e., intra-layer adjacency, and the connections between nodes across different time windows, i.e., inter-layer adjacency, are taken into account. The community structure is then detected by applying spectral clustering to these supra-adjacency matrices. The proposed approach is evaluated on dFCNs constructed from rs-fMRI across time and subjects, revealing dynamic connectivity patterns between the resting state networks (RSNs).

This work is dedicated to my beloved parents, Mustafa Al-sharoa and Shama Al-kofahi, who have been the source of inspiration, encouragement and unconditional love during all times. To my soulmate and lovely husband, Mahmood Al-khassaweneh, who has been supporting and encouraging me to make my dreams come true. Thank you for being in my life and believing in me. To my precious children, Ahmad, Ghadaq, Alaya and Tabarak. You are the source of my happiness, strength and success. I love you to the moon and beyond. To my family in law, sisters, brothers and friends. Thank you for your prayers, kind words and support.

ACKNOWLEDGMENTS

I would like to thank my advisor, Prof. Selin Aviyente; without her guidance, valuable comments and input this dissertation would not have been possible. Many thanks to Prof. Aviyente for being ready to help at all times. In addition, I would like to thank my committee members, Prof. John Deller, Prof. Hayder Radha and Dr. Mark Iwen, for their useful comments and guidance. A very special thanks goes to the Schlumberger Foundation, Faculty for the Future Fellowship, who supported me through my PhD. I am very grateful to them. I would also like to thank my husband, Mahmood Al-khassaweneh, and my children, Ahmad, Ghadaq, Alaya and Tabarak. With their encouragement, support and love I was able to finish this dissertation. Finally, I would like to thank the best role models in my life, my parents Mustafa Al-sharoa and Shama Al-kofahi, who have been showering me with their love and prayers. They instilled in me the will to learn and build a bright future. Thanking them will never be enough.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ALGORITHMS

Chapter 1  Introduction
  1.1 Graph Theory
  1.2 Community Detection in Static Networks
    1.2.1 Graph Cut problem and Spectral Clustering
    1.2.2 Modularity Based Clustering
  1.3 Background on Tensors
  1.4 Community Detection in Dynamic Networks
  1.5 Community Detection in Dynamic Multi-layer Networks
  1.6 Dynamic Functional Connectivity Networks of the Brain
  1.7 Organization and Contributions of this Thesis

Chapter 2  Detecting and Tracking Community Structure in Temporal Networks: A Low-Rank + Sparse Estimation Based Evolutionary Clustering Approach
  2.1 Introduction
  2.2 Background
    2.2.1 Spectral Clustering
    2.2.2 Low-rank + Sparse matrix approximation
    2.2.3 Chordal Subspace Distance Measure
    2.2.4 Variation of Information
  2.3 Methods: low-rank + sparse estimation based evolutionary spectral clustering (LS-ESC)
    2.3.1 Problem Formulation
    2.3.2 Distance Between Successive Subspaces
    2.3.3 Problem Solution
    2.3.4 Tracking the community structure over time
    2.3.5 Detecting the number of clusters
  2.4 Results
    2.4.1 Simulated temporal networks
      2.4.1.1 Simulated binary temporal networks
      2.4.1.2 Simulated weighted temporal networks
    2.4.2 Real social network: MIT Reality Mining Dataset
    2.4.3 Real social network: Primary school network
      2.4.3.1 Primary school data set description and temporal network construction
      2.4.3.2 Detecting and tracking the community structure in the primary school temporal network (PSTN)
  2.5 Conclusions

Chapter 3  Tensor Based Temporal and Multi-layer Community Detection
  3.1 Introduction
  3.2 Background
    3.2.1 Quality Metrics to Determine the Number of Clusters
  3.3 Methods
    3.3.1 Spectral Clustering
    3.3.2 3D-Windowed Tensor Approach (3D-WTA)
    3.3.3 Running Time Tensor Approach (RTTA)
    3.3.4 4D-Windowed Tensor Approach (4D-WTA)
    3.3.5 Tracking the Community Structure Over Time
  3.4 Results
    3.4.1 Simulated Temporal Networks
      3.4.1.1 Description of the Data sets
      3.4.1.2 Comparison of different quality metrics
      3.4.1.3 Comparison between 3D-WTA and state of the art algorithms
      3.4.1.4 Comparison between 3D-WTA and 4D-WTA
      3.4.1.5 Flocks of Boids Dataset
    3.4.2 MIT Reality Mining Dataset
    3.4.3 Application to dFCN From EEG Dataset
    3.4.4 Application to dFCN From rs-fMRI Dataset
      3.4.4.1 Data Preprocessing and Functional Connectivity Networks Construction
    3.4.5 Dynamic Functional Connectivity Network Construction
      3.4.5.1 Resting State Networks (RSNs)
      3.4.5.2 Detecting the Community Structure in rs-fMRI
      3.4.5.3 Community structure of each FC state
      3.4.5.4 State transition profiles
      3.4.5.5 Consistency of the RSNs across time
      3.4.5.6 Comparison Between 3D-WTA and 4D-WTA
  3.5 Conclusions

Chapter 4  Detecting Brain Dynamics During Resting State: A Tensor Based Evolutionary Clustering Approach
  4.1 Introduction
  4.2 Materials and Methods
    4.2.1 Data and Preprocessing
    4.2.2 Dynamic Functional Connectivity Networks (dFCNs)
      4.2.2.1 Dynamic Functional Connectivity Network Construction
      4.2.2.2 Surrogate Data Generation for Testing Statistical Significance of the dFCN
    4.2.3 Reducibility of Brain dFCNs
    4.2.4 Dynamic Network Clustering
      4.2.4.1 HOOI Clustering Based Approach
      4.2.4.2 Choosing the Number of Clusters
  4.3 Results
    4.3.1 dFCN Reduction
    4.3.2 FC States Summarization
  4.4 Conclusions

Chapter 5  Temporal Block Spectral Clustering for Multi-layer Temporal Functional Connectivity Networks
  5.1 Introduction
  5.2 Methods
    5.2.1 Construction of multi-layer temporal networks
      5.2.1.1 Construction of a temporal network from signals
      5.2.1.2 Construction of multi-layer temporal network
    5.2.2 Temporal block spectral clustering approach (TBSCA)
  5.3 Results
    5.3.1 Simulated networks
    5.3.2 Community Detection in dynamic functional connectivity networks (dFCNs)
      5.3.2.1 Data and preprocessing
      5.3.2.2 Dynamic Community structure
      5.3.2.3 Consistency of the RSNs across time
  5.4 Conclusions

Chapter 6  Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work

APPENDIX

BIBLIOGRAPHY

LIST OF TABLES

Table 2.1: Parameters of the simulated binary networks from the Girvan and Newman benchmark for six different data sets and the selected values of the regularization parameters.

Table 2.2: Parameters of the truncated Gaussian distribution used to generate the edges of the simulated weighted temporal networks for three different data sets, the ground truth of the nodes' community membership for the generated networks and the selected values of the regularization parameters.

Table 2.3: Different classes and number of teachers and students participating in the primary school data from each class.

Table 3.1: Parameters of the truncated Gaussian distribution used to generate the edges of the simulated networks for four different datasets, and the ground truth of the nodes' community membership for the generated networks.

Table 3.2: Parameters of the truncated Gaussian distribution used to generate the edges of the simulated networks for the 4D simulated data, and the ground truth of the nodes' community membership for the generated network.

Table 3.3: The ROIs belonging to the different subnetworks in resting state. The numbers in the table refer to the ROIs defined in Table 1.

Table 4.1: The ROIs that define the RSNs. The numbers in the table refer to the ROIs defined in Table 1.

Table 4.2: Key Attributes of the Tensors Constructed from the Reduced dFCN.

Table 1: List of the ROIs in the AAL 90 template. The odd and even indices refer to the left and right brain hemisphere ROIs, respectively.

LIST OF FIGURES

Figure 1.1: An example of a dynamic multi-layer network: The network consists of 6 nodes, 3 time points and 3 layers. Solid lines represent the edges between the nodes in the same layer while dashed lines represent the edges between nodes across layers.

Figure 2.1: VI comparison between LS-ESC, AFFECT and DYNMOGA for simulated binary networks from the Girvan and Newman benchmark with the parameters presented in Table 2.1. LS-ESC scored the lowest VI over time compared to the other methods: (a) Data set 1; (b) Data set 2; (c) Data set 3; (d) Data set 4; (e) Data set 5; and (f) Data set 6.
Figure 2.2: Comparison between LS-ESC and state of the art algorithms for the different weighted networks in Table 3.1: (a)-(c) The cost function calculated by LS-ESC; (d)-(f) The detected number of clusters (DNOC) for the simulated networks using different algorithms. LS-ESC estimated the correct number of clusters over time and the cost function reflected the correct change points; (g)-(i) Comparison of variation of information (VI) between LS-ESC, AFFECT and GenLov for the simulated networks. LS-ESC outperforms the other methods and scored the lowest variation of information over time.

Figure 2.3: Cost function calculated by LS-ESC for the MIT Reality Mining network. The changes in the cost function coincide with six important dates during the school year (red dashed lines).

Figure 2.4: Detected community structure of the MIT Reality Mining temporal network at different times: (a) Week 5; (b) Week 15; (c) Week 27; and (d) Week 39.

Figure 2.5: Cost function calculated by LS-ESC for the primary school temporal network for the: (a) first day, (b) second day. Red dashed lines refer to the time during the day.

Figure 2.6: Detected community structure in the first day of the primary school temporal network at different times: (a) morning, (b) break, (c) lunch and (d) end of the school day.

Figure 2.7: Detected community structure in the second day of the primary school temporal network at different times: (a) morning, (b) break, (c) lunch and (d) end of the school day.

Figure 3.1: Detected number of clusters (DNOC) by 3D-WTA for the data sets in Table 3.1 using different metrics, including: Eigengap (EG), modularity (Q), multi-resolution modularity Qγ and asymptotical surprise (AS).

Figure 3.2: Comparison between 3D-WTA and state of the art algorithms for the different data sets in Table 3.1: (a)-(d) The normalized cost function calculated by 3D-WTA; (e)-(h) The detected number of clusters (DNOC) for the simulated networks using different algorithms. 3D-WTA estimated the correct number of clusters over time and the normalized cost function reflected the correct change points represented by the pink dashed lines; (i)-(l) Comparison of variation of information (VI) between 3D-WTA, AFFECT, PCM-NC, PCQ-NC, TC and GenLov for the simulated networks. 3D-WTA scored the lowest variation of information over time compared to the other methods.

Figure 3.3: Variation of information (VI) comparison between the detected community structure by 3D-WTA and 4D-WTA and the ground truth for simulated data.

Figure 3.4: (a) The normalized cost function and the detected number of clusters for the Flocks of Boids dataset using different algorithms. 3D-WTA results in the best estimate of the number of clusters, especially after the change point; (b) Rand index comparison between 3D-WTA, AFFECT, PCM and PCQ for the Flocks of Boids dataset.

Figure 3.5: Detected Change Points for Reality Mining Data.

Figure 3.6: Detected Change Points for ERN Data.
Figure 3.7: Community structure for the ERN networks obtained by RTTA: (a) Pre-ERN, (b) ERN, (c) Post-ERN.

Figure 3.8: A flowchart of the application of 4D-WTA to rs-fMRI: (a) Time series from each ROI; (b) dFCNs constructed across time and subjects using Pearson correlation coefficient; (c) After random subsampling across time and subjects, 4 dimensional FC tensors are constructed for each sample. Each tensor consists of 30 random time points from each subject; (d) Tucker decomposition of each FC tensor; (e) Community structure found by 4D-WTA for each sample; (f) The detected community structure is summarized into 5 functional connectivity states defined through community detection using k-means.

Figure 3.9: (a) Distribution of weights across time for all samples, (b) Distribution of weights across subjects for all samples.

Figure 3.10: The similarity among the community structures obtained by 4D-WTA across the 200 subsamples across time and subjects. The similarity between the different community structures is quantified by variation of information. Samples with similar community structure are grouped together by k-means.

Figure 3.11: State transition profiles for subjects 4, 7, 9 and 12.

Figure 3.12: Variation in RSNs' consistency across subjects calculated by (a) 3D-WTA, (b) PCQ-NC, (c) GenLov, (d) 4D-WTA.

Figure 3.13: Full and circular view of the community structure of the FC states found by 4D-WTA: (a) State 1 with 9 clusters; (b) State 2 with 4 clusters; (c) State 3 with 8 clusters; (d) State 4 with 7 clusters and (e) State 5 with 6 clusters.

Figure 4.1: Jensen-Shannon divergence between all pairs of time points in the dFCN.

Figure 4.2: The dendrogram resulting from hierarchical clustering and the corresponding values of q. The dashed red line identifies the cut point of the dendrogram from the second peak of q.

Figure 4.3: The detected community structure for the FC States (1-10). For each state, (left) the optimized weighted average of the adjacency matrices within the state. Each cluster is denoted by the main ROIs detected within the cluster. The clusters shown in the circular graphs (right) refer to the different RSNs defined in literature, and the connections between them are detected by the proposed algorithm. The number of clusters for the FC states varies between 3 and 4.

Figure 5.1: Comparison between the detected community structure by TBSCA, BNLSC and GenLov and the ground truth for the synthetic data set in terms of variation of information. The number of clusters changes from 3 to 2 at time point 21, and from 2 to 3 at time point 41.

Figure 5.2: Agreement between the detected community structures across time for subject 1 measured by variation of information (VI). The dashed lines refer to the detected change points.

Figure 5.3: Detected community structure from selected change points. The ROIs in different networks reconfigured their community membership across time. Visualized by circularGraph toolbox [1].
Figure 5.4: Agreement between the detected community structures across time for all subjects measured by variation of information (VI).

Figure 5.5: Consistency of different RSNs across subjects calculated from the detected community structure by TBSCA.

LIST OF ALGORITHMS

Algorithm 2.1: LS-ESC
Algorithm 3.2: 3D-Windowed Tensor Approach (3D-WTA)
Algorithm 3.3: Running Time Tensor Approach (RTTA)
Algorithm 3.4: TrackU
Algorithm 3.5: 4D-Windowed Tensor Approach (4D-WTA)
Algorithm 4.6: HOOI clustering based approach

Chapter 1

Introduction

Many real world systems can be modeled as complex networks. The objects in these systems are defined as the nodes or vertices of the graph, while the interactions between these objects define the edges of the graph. With the growth of network data, it is important to be able to reduce the dimensionality and extract useful information from these networks. Examples of real systems that are naturally structured as networks include social [2], biological [3] and communication systems [4]. One of the most important characteristics of these networks is the community structure, where the nodes are organized into a set of modules. One common way to understand the community structure in these networks is through community detection techniques. The objective of graph based community detection is to partition a set of nodes into smaller subsets (modules or communities) such that nodes within the same module are similar to each other and are densely connected, while they are different from the nodes in the other modules.

Since many complex systems can be modeled as graphs or networks, various graph based community detection techniques have been developed. However, most of these methods are limited to static networks, i.e., the communities do not change with time. As many real world systems exhibit variation in their community structure over time, there is a growing need to develop algorithms that detect the community structure and track its evolution across time [5]. Other complex systems might have different types of relationships between their entities, and these systems can be modeled as multi-layer networks [6]. Multi-layer networks consider relationships between nodes within layers, i.e., intra-layer edges, and across layers, i.e., inter-layer edges. In particular, multi-layer networks consider all possible edges between nodes within and across layers. Consequently, it is important to develop community detection algorithms that can detect and track the community structure in multi-layer networks in order to understand these systems correctly.

In this thesis, we focus on developing community detection approaches for two types of networks: dynamic (temporal) single-layer networks and dynamic multi-layer networks. In dynamic single-layer networks, the network topology changes over time as the nodes might come and go or the interactions between them might change across time. For instance, we can model a group of users on Facebook and their interactions by a dynamic network, where the interactions between them might vary across time as shown in the first row of Fig. 1.1.
In multi-layer networks, the objects in the network have different types of interactions between them and each layer corresponds to one type of these interactions. For instance, a group of users on Facebook may also have accounts on Twitter and Instagram, and interactions between these users can occur through the different accounts, which reflects the multiple relationships between them. In dynamic multi-layer networks, the relationships between the objects vary both by the type of the relationship and with time. For example, the interactions between the same group of users on Facebook, Twitter and Instagram may change over time as well as across the different layers, and this can be represented by a dynamic multi-layer network as shown in Fig. 1.1.

Figure 1.1: An example of a dynamic multi-layer network: The network consists of 6 nodes, 3 time points and 3 layers. Solid lines represent the edges between the nodes in the same layer while dashed lines represent the edges between nodes across layers.

1.1 Graph Theory

Let G = {V, E, A} be an undirected weighted graph with a set of vertices or nodes V = {v_1, ..., v_n} and a set of edges, E, that represents the pairwise weights between the nodes. The weighted adjacency matrix, A ∈ R^{n×n}, is symmetric and its elements represent the similarity between each pair of nodes. For binary graphs, A_{ij} ∈ {0, 1}, and for weighted graphs, A_{ij} ∈ [0, 1]. The degree of a vertex v_i ∈ V is defined as $d_i = \sum_{j=1}^{n} A_{ij}$, and the degree matrix, D, is defined as the diagonal matrix with {d_1, ..., d_n} on its diagonal [7, 8].

For a dynamic network with M time points, a set of weighted and undirected graphs is constructed to represent the network over time as {G(t_m)}, where t_m ∈ {t_1, t_2, ..., t_M}. The graph at each time point is represented by the adjacency matrix A(t_m) ∈ R^{n×n}.

1.2 Community Detection in Static Networks

Early work in community detection has focused on static networks. Different clustering approaches have been proposed for community detection, including hierarchical clustering, partitional clustering [9] [7] and modularity optimization approaches [10, 11]. In hierarchical clustering approaches, a hierarchical structure is constructed using a distance matrix and visualized through a dendrogram. Hierarchical clustering is usually implemented by agglomerative or divisive algorithms. In agglomerative algorithms, clustering starts with individual nodes as separate clusters; then, at each step, the closest pairs of clusters are merged until all of the nodes form one big cluster. On the other hand, divisive algorithms start with all nodes as one cluster; then, at each step, the clusters are divided until each cluster contains only one node [12, 13]. Under partitional algorithms, squared-error based algorithms and graph-theoretic algorithms have been introduced to assign the objects to r clusters. Squared-error based algorithms rely on optimizing a criterion function in order to produce the clusters. k-means clustering, the most popular squared-error algorithm, relies on a cost function that minimizes the within-cluster sum of squares (intra-cluster distances) and maximizes the between-cluster sum of squares (inter-cluster distances). Graph-theoretic algorithms partition the graph using a cut cost that depends on the spectral properties of the graph, i.e., the eigenvalues and eigenvectors of the adjacency matrix [7].
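As a concrete illustration of the agglomerative strategy described above, the short sketch below clusters a toy weighted graph by converting similarities into distances and cutting the resulting dendrogram into two communities. It is only a minimal example: the 6-node graph, the 1 − A distance convention and the average-linkage choice are illustrative assumptions, not part of this thesis.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Toy weighted adjacency: two 3-node groups with weak links between them.
blk = np.ones((3, 3)) - np.eye(3)
A = np.block([[blk, 0.1 * np.ones((3, 3))],
              [0.1 * np.ones((3, 3)), blk]])

# Agglomerative hierarchical clustering works on a distance matrix; one simple
# (purely illustrative) choice is to turn high similarity into small distance.
Dist = 1.0 - A
np.fill_diagonal(Dist, 0.0)

Z = linkage(squareform(Dist), method="average")   # merge the closest clusters step by step
labels = fcluster(Z, t=2, criterion="maxclust")   # cut the dendrogram into 2 communities
print(labels)                                     # e.g. [1 1 1 2 2 2]
```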
Under modularity optimization methods, the modularity measures the deviation of the within community edges from the expected value of the same quantity in a network with the same community partitions but random connections between the vertices [11]. Even though the aforementioned methods are limited to static networks, recent work has extended these methods to handle time varying networks. More discussion follows in Sec. 1.4.

1.2.1 Graph Cut problem and Spectral Clustering

Over the past decades, several methods have been proposed for graph partitioning. One of the simplest ways to partition a graph, G = {V, E, A}, is to solve the mincut problem. The mincut problem can be solved by choosing the partition C = {C_1, ..., C_r}, with C_l ∩ C_{l'} = ∅ for l ≠ l' and C_1 ∪ ... ∪ C_r = V, that minimizes the following objective function:

\[
\mathrm{cut}(C_1, \ldots, C_r) = \frac{1}{2} \sum_{l=1}^{r} A(C_l, \bar{C}_l), \tag{1.1}
\]

where $\bar{C}_l$ is the complement of $C_l$ and $A(C_l, \bar{C}_l) = \sum_{i \in C_l,\, j \in \bar{C}_l} A_{ij}$ for two disjoint sets $C_l, \bar{C}_l \subset V$. The mincut problem can be solved efficiently when r = 2, as discussed in [14]. However, solving this problem might result in unbalanced clusters, as it might separate one single vertex from the rest of the graph and consider it as a separate cluster. In order to overcome this problem, normalized versions of the cut function have been proposed. The two most commonly used ones are RatioCut [15] and normalized cut, Ncut [16]. In RatioCut, the size of a cluster in the graph is measured by the number of vertices, |C_l|, whereas in Ncut the size is measured by the total weight of the within-cluster edges, vol(C_l). These functions are defined as:

\[
\mathrm{RatioCut}(C_1, \ldots, C_r) = \frac{1}{2} \sum_{l=1}^{r} \frac{A(C_l, \bar{C}_l)}{|C_l|}, \tag{1.2}
\]

\[
\mathrm{Ncut}(C_1, \ldots, C_r) = \frac{1}{2} \sum_{l=1}^{r} \frac{A(C_l, \bar{C}_l)}{\mathrm{vol}(C_l)}. \tag{1.3}
\]

Minimizing both of these functions leads to balanced clusters, normalized by the number of vertices or the edge weights, respectively. However, adding the balancing conditions, |C_l| and vol(C_l), makes the cut problem NP-hard, as discussed in [17]. Spectral clustering offers an efficient solution to the relaxed versions of these problems. Relaxing the RatioCut leads to unnormalized spectral clustering and relaxing the Ncut leads to normalized spectral clustering.

Spectral clustering maps the relationship between the nodes in a graph into a lower dimensional subspace. Given a network with adjacency matrix A ∈ R^{n×n}, spectral clustering solves the following trace optimization problem [7],

\[
\min_{U \in \mathbb{R}^{n \times r}} \; \mathrm{Trace}\!\left(U^{\dagger} \Phi U\right), \quad \text{s.t. } U^{\dagger} U = I, \tag{1.4}
\]

where Φ = I − A_N is the normalized Laplacian matrix with A_N = D^{-1/2} A D^{-1/2} the normalized adjacency matrix [18, 19]. The normalized Laplacian matrix is a square, symmetric and positive semi-definite matrix. Consequently, all the eigenvalues, {λ_1, λ_2, ..., λ_n}, of Φ are real and non-negative, with 0 as the smallest eigenvalue. The solution to the optimization problem in Eq. (1.4) is found by choosing U as the matrix that contains the r eigenvectors that correspond to the smallest r eigenvalues of Φ. The community structure is then determined by applying k-means to the matrix U [7].

Since we are concerned with community detection in dynamic networks, one approach is to apply standard spectral clustering at each time point individually. However, this approach is very sensitive to instantaneous noise and produces inaccurate clustering results for noisy networks.
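The normalized spectral clustering procedure summarized by Eq. (1.4) — form the normalized Laplacian, keep the eigenvectors of its r smallest eigenvalues, then run k-means on their rows — can be sketched in a few lines of NumPy/scikit-learn. The toy two-community graph and the function name below are illustrative assumptions; this is not the implementation used in later chapters.

```python
import numpy as np
from sklearn.cluster import KMeans

def normalized_spectral_clustering(A, r, seed=0):
    """Cluster the nodes of a weighted undirected graph with adjacency A into r groups."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_N = D_inv_sqrt @ A @ D_inv_sqrt          # normalized adjacency D^{-1/2} A D^{-1/2}
    Phi = np.eye(A.shape[0]) - A_N             # normalized Laplacian of Eq. (1.4)
    eigvals, eigvecs = np.linalg.eigh(Phi)     # eigh returns ascending eigenvalues
    U = eigvecs[:, :r]                         # eigenvectors of the r smallest eigenvalues
    return KMeans(n_clusters=r, n_init=10, random_state=seed).fit_predict(U)

# Toy graph: two 3-node communities weakly connected to each other.
blk = np.ones((3, 3)) - np.eye(3)
A = np.block([[blk, 0.05 * np.ones((3, 3))],
              [0.05 * np.ones((3, 3)), blk]])
print(normalized_spectral_clustering(A, r=2))   # e.g. [0 0 0 1 1 1], up to label permutation
```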
1.2.2 Modularity Based Clustering

Modularity based clustering is one of the most popular optimization based clustering techniques. This approach relies on maximizing the modularity function across all possible partitions [11, 20]. Modularity can be defined as

\[
Q = \frac{1}{2E} \sum_{c \in C} \sum_{i,j \in V_c} \left[ A_{ij} - \frac{D_i D_j}{2E} \right], \tag{1.5}
\]

where $2E = \sum_{i,j=1}^{n} A_{ij}$ is the total edge weight, A_{ij} is the ijth entry of the adjacency matrix, D_i is the degree of node i, C = {C_1, ..., C_r} is the set of clusters and V_c is the set of nodes in the cth cluster. Modularity measures the deviation of the within-community edges in the network from the expected value of the same quantity in a network with the same community partitions but random connections between the vertices [11]. However, there are several drawbacks to modularity based clustering, such as its computational cost, the selection of the null model, its preference for bigger clusters and the possibility of converging to a sub-optimal solution [20, 21].

1.3 Background on Tensors

Tensors are multi-dimensional arrays that can be used to provide a compact representation for higher-order data and preserve the multi-linear relationships in the data. Different approaches have been developed to summarize and extract meaningful structure from higher-order data using tensors. The advantages of using tensor analysis over two-dimensional analysis for higher-order data have been discussed in great detail in [22-24].

Tensor matricization: A J-order tensor is denoted as X ∈ R^{I_1×I_2×···×I_J}, where x_{i_1,i_2,...,i_J} corresponds to the (i_1, i_2, ..., i_J)th element of the tensor X. A J-order tensor can be transformed into a matrix by reordering its elements in a process called matricization or unfolding. The mode-j matricization of a tensor X ∈ R^{I_1×I_2×···×I_J} is denoted as X_{(j)}. A tensor element (i_1, i_2, ..., i_J) is mapped to a matrix element (i_j, k), where

\[
k = 1 + \sum_{l=1,\, l \neq j}^{J} (i_l - 1) K_l \quad \text{with} \quad K_l = \prod_{q=1,\, q \neq j}^{l-1} I_q. \tag{1.6}
\]

j-mode product: The j-mode product of a tensor X ∈ R^{I_1×···×I_j×···×I_J} and a matrix U ∈ R^{K×I_j} is denoted as X ×_j U and has size I_1 ×···× I_{j-1} × K × I_{j+1} ×···× I_J, with its elements defined as

\[
(\mathcal{X} \times_j U)_{i_1 \ldots i_{j-1}\, k\, i_{j+1} \ldots i_J} = \sum_{i_j=1}^{I_j} x_{i_1 \ldots i_J}\, u_{k i_j}, \tag{1.7}
\]

which can be expressed in terms of unfolded tensors as

\[
\mathcal{C} = \mathcal{X} \times_j U \;\Leftrightarrow\; C_{(j)} = U X_{(j)}. \tag{1.8}
\]

The j-rank: The rank of the jth mode of a tensor X ∈ R^{I_1×I_2×···×I_J} is denoted as rank_j(X) and is defined as the column rank of X_{(j)}. The n-rank of X is then defined as the collection of the ranks of the mode matrices X_{(j)}, j = 1, ..., J, and can be expressed as

\[
n\text{-rank}(\mathcal{X}) = \left\{\mathrm{rank}(X_{(1)}), \mathrm{rank}(X_{(2)}), \ldots, \mathrm{rank}(X_{(J)})\right\}. \tag{1.9}
\]

Tensor factorization: Matrix factorization methods such as singular value decomposition (SVD) and principal component analysis (PCA) have been extended to tensors through parallel factor analysis (PARAFAC) and Tucker decomposition [25]. Tucker decomposition provides orthogonal subspace information along each mode of the tensor, where the tensor is decomposed into a core tensor multiplied by an orthogonal component matrix along each mode. A J-mode tensor X ∈ R^{I_1×I_2×···×I_J} can be written using the Tucker model as follows:

\[
\mathcal{X} = \mathcal{Y} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_J U^{(J)}, \tag{1.10}
\]

where Y ∈ R^{I_1×I_2×···×I_J} is the core tensor and U^{(j)} ∈ R^{I_j×I_j}, for j = 1, 2, ..., J, is the orthogonal component matrix along the jth mode. Tucker decomposition is commonly implemented through high-order SVD (HOSVD) [26] or higher order orthogonal iteration (HOOI) [27].
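A minimal NumPy sketch of mode-j matricization and the j-mode product may help make Eqs. (1.6)-(1.8) concrete. It assumes the column-major (Fortran-order) unfolding convention implied by Eq. (1.6) and uses 0-based mode indices; the tensor sizes are arbitrary placeholders.

```python
import numpy as np

def unfold(X, j):
    """Mode-j matricization X_(j): mode j becomes the rows, Eq. (1.6) column ordering."""
    return np.reshape(np.moveaxis(X, j, 0), (X.shape[j], -1), order="F")

def mode_product(X, U, j):
    """j-mode product X x_j U, Eq. (1.7): contract mode j of X with the rows of U."""
    return np.moveaxis(np.tensordot(U, X, axes=(1, j)), 0, j)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 3))    # a 3-order tensor with I1=4, I2=5, I3=3
U = rng.standard_normal((2, 5))       # U in R^{K x I2}, K=2, acting on mode j=1 (0-based)

C = mode_product(X, U, 1)             # result has size 4 x 2 x 3
# Eq. (1.8): the unfolded result equals U times the unfolded input.
assert np.allclose(unfold(C, 1), U @ unfold(X, 1))
```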
HOSVD performs SVD on the unfolded matrix X_{(j)} along each mode of the tensor independently [28]. On the other hand, HOOI performs alternating optimization to find the best projection matrices.

HOOI is an iterative algorithm used to compute a low rank approximation of a tensor. For a J-mode tensor X, let r_1, r_2, ..., r_J be a set of integers satisfying r_j ≤ I_j for j = 1, ..., J. The best rank-(r_1, r_2, ..., r_J) approximation of the tensor X is achieved by finding a set of matrices U^{(j)} ∈ R^{I_j×r_j} with orthonormal columns, for j = 1, 2, ..., J, and a core tensor Y ∈ R^{r_1×r_2×···×r_J} that satisfy the following least squares optimization problem,

\[
\min_{U^{(1)}, U^{(2)}, \ldots, U^{(J)},\, \mathcal{Y}} \; \left\| \mathcal{X} - \mathcal{Y} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_J U^{(J)} \right\|_F^2. \tag{1.11}
\]

For fixed component matrices, the optimal core tensor is given by

\[
\mathcal{Y} = \mathcal{X} \times_1 U^{(1)T} \times_2 U^{(2)T} \cdots \times_J U^{(J)T}, \tag{1.12}
\]

with U^{(j)T} U^{(j)} = I, where I ∈ R^{r_j×r_j} is the identity matrix. HOOI relies on an alternating least squares approach to solve the best rank-(r_1, r_2, ..., r_J) approximation problem, and it solves a set of restricted optimization problems successively as follows,

\[
\min_{U^{(j)}} \; \left\| \mathcal{X} - \mathcal{Y} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_J U^{(J)} \right\|_F^2, \tag{1.13}
\]

where the optimization is performed over the jth matrix U^{(j)} with the latest optimized values of the other U^{(p)}, p ≠ j, substituted in (1.11) and (1.12). A more detailed description can be found in [27].
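The HOOI iteration described above — HOSVD initialization followed by alternating SVD updates of each factor matrix and the core update of Eq. (1.12) — can be sketched as follows. This is a generic illustration of the algorithm in [27] with arbitrary tensor sizes and ranks, not the code used in this thesis.

```python
import numpy as np

def unfold(X, j):
    # Mode-j matricization (same convention as Eq. (1.6)).
    return np.reshape(np.moveaxis(X, j, 0), (X.shape[j], -1), order="F")

def multi_mode_product(X, mats, transpose=False):
    # Apply U^(j) (or its transpose) along every mode j for which a matrix is given.
    for j, U in enumerate(mats):
        if U is None:
            continue
        M = U.T if transpose else U
        X = np.moveaxis(np.tensordot(M, X, axes=(1, j)), 0, j)
    return X

def hooi(X, ranks, n_iter=10):
    """Best rank-(r1,...,rJ) Tucker approximation of X via HOOI, Eqs. (1.11)-(1.13)."""
    # HOSVD initialization: leading left singular vectors of each unfolding.
    U = [np.linalg.svd(unfold(X, j), full_matrices=False)[0][:, :r]
         for j, r in enumerate(ranks)]
    for _ in range(n_iter):
        for j, r in enumerate(ranks):
            # Project X on all modes except j, then update U^(j) from the j-unfolding.
            others = [Uk if k != j else None for k, Uk in enumerate(U)]
            W = multi_mode_product(X, others, transpose=True)
            U[j] = np.linalg.svd(unfold(W, j), full_matrices=False)[0][:, :r]
    core = multi_mode_product(X, U, transpose=True)      # core update, Eq. (1.12)
    return core, U

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 8, 6))
core, U = hooi(X, ranks=(3, 3, 2))
X_hat = multi_mode_product(core, U)                      # Y x_1 U^(1) ... x_J U^(J)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))     # relative approximation error
```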
1.4 Community Detection in Dynamic Networks

The community structure of most real world systems varies across time. For instance, in social networks, the community structure can change with time as people leave or join different communities [29, 30]. As a result, evolutionary clustering approaches have been implemented to detect the community structure in this type of network. Evolutionary clustering provides a framework to cluster the network at each time point such that it preserves the network structure at the current time and the cluster assignments change smoothly across time.

Over the past decade, different evolutionary clustering approaches have been developed to detect the community structure of dynamic networks. These approaches include simple extensions of static clustering methods [31, 32], statistical modeling based methods [33], tensor based methods [34] and modularity based methods [35].

Extensions of static clustering methods such as k-means and agglomerative hierarchical clustering have been developed in [31] for dynamic networks. A cost function that considers both the cluster membership at the current time and the deviation from the previous time point was introduced. In k-means evolutionary clustering, this deviation is quantified through a distance measure between the centroids of clusters from consecutive time points. However, this distance measure is sensitive to variation in the historic centroids, since it depends on matching the current centroid with its peer from recent history. Similarly, in evolutionary hierarchical clustering, the merging step of the tree nodes at the current time is affected by the historic community assignment. In [32], the k-means evolutionary clustering approach was extended using two spectral clustering based frameworks: preserving cluster quality (PCQ) and preserving cluster membership (PCM). In both frameworks, the clustering depends on a cost function to guarantee temporal smoothness. In PCQ, the temporal smoothness is defined as the cost of applying the current clustering results to the previous data point; consequently, the current clustering results are rejected if they disagree with recent history. On the other hand, PCM considers the projection distance between the current and previous clustering results, which is not a proper metric [36], to measure the difference between the current and historic clusterings. Moreover, the suggested temporal cost function requires a priori knowledge about the changes in the community structure and depends on the choice of a changing parameter.

Under statistical modeling based methods, the authors in [33] assumed a statistical model for the evolution of the adjacency matrix and introduced an adaptive forgetting factor for evolutionary clustering and tracking (AFFECT). In AFFECT, static clustering methods were performed after smoothing the proximity between objects over time. However, AFFECT depends on a stochastic block model for the adjacency matrix and has higher computational complexity compared to static clustering due to the estimation of the forgetting factor. Under modularity based methods, a statistical method based on greedy modularity optimization has been proposed in [35] to identify the changing modular organization in temporal multiplex networks. Yet this approach has some drawbacks, as it depends on modularity optimization, as discussed in [20]. Under tensor based methods, a non-negative tensor factorization approach was introduced in [34]. In this approach, the community structure of the temporal network is detected using PARAFAC, where all the successive adjacency matrices in the temporal network are represented by a single 3-way tensor. However, this representation of the temporal network may not reflect the community structure of the network at different time points.

1.5 Community Detection in Dynamic Multi-layer Networks

Most real world complex systems cannot be simply represented by static networks. In many disciplines, the systems are represented by dynamic networks [5] that evolve over time, and the community structure has to be detected across time. As a result, different evolutionary clustering approaches have been introduced, as discussed in Section 1.4. Moreover, many of these systems can be comprised of multiple types of relationships and can be modeled as multi-relational or multi-layer networks [6]. In multi-layer networks, the entities or nodes in the network exhibit different types of relationships, i.e., within-layer and across-layer relationships, and these relationships should be taken into account to understand the correct underlying structure of these networks [37]. For instance, social networks can be modeled as multi-layer networks, where the nodes correspond to the same group of individuals across all layers, while the edges within each layer correspond to intra-layer relationships and the edges across layers correspond to inter-layer relationships [37-39]. Another example is the air transportation network [40], where the nodes correspond to the airports and the edges correspond to the direct flights between them. Each layer then corresponds to a different airline. More recently, the need to combine the concepts of dynamic networks and multi-layer networks has increased, resulting in a new representation for complex systems as dynamic multi-layer networks. Dynamic multi-layer networks provide a complete representation for systems with various types of relationships that evolve across time.
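For computation, a single time point of such a multi-layer network is often flattened into a supra-adjacency matrix whose diagonal blocks hold the intra-layer edges and whose off-diagonal blocks hold the inter-layer edges; sets of such matrices are what the temporal framework of Chapter 5 later feeds to spectral clustering. The sketch below assembles one such matrix from random placeholder weights; the sizes and the random inter-layer couplings are purely illustrative, not the data-driven construction proposed in this thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, L = 5, 3                        # nodes per layer and number of layers (illustrative)

def random_sym(n, rng):
    W = rng.uniform(0, 1, (n, n))
    W = (W + W.T) / 2
    np.fill_diagonal(W, 0.0)
    return W

# One time point of a multi-layer network: an intra-layer adjacency per layer and
# an inter-layer adjacency for every ordered pair of distinct layers.
intra = [random_sym(n, rng) for _ in range(L)]
inter = {(a, b): rng.uniform(0, 1, (n, n)) for a in range(L) for b in range(L) if a != b}

# Supra-adjacency matrix: layers stacked into an (nL x nL) block matrix whose diagonal
# blocks are the intra-layer edges and whose off-diagonal blocks are inter-layer edges.
S = np.zeros((n * L, n * L))
for a in range(L):
    S[a*n:(a+1)*n, a*n:(a+1)*n] = intra[a]
    for b in range(L):
        if a != b:
            S[a*n:(a+1)*n, b*n:(b+1)*n] = inter[(a, b)]
S = (S + S.T) / 2                  # keep the supra-adjacency symmetric
```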
The aforementioned social network example can be remodeled as a dynamic multi-layer network when considering the changes in the relationships between individuals across time.

Over the past years, few methods have been developed to detect the community structure in dynamic multi-layer networks. In [35], the authors introduced a general framework that encompasses time-evolving, multi-scale and multi-layer networks to determine the community structure in multi-slice networks. These networks include a set of adjacency matrices that represent the variation in connectivity between nodes across time, the variation in the types of relationships across layers, or the variation of the same network across multiple scales. The framework relies on a generalization of the modularity optimization problem for detecting the community structure of a single network. Since this framework depends on modularity, it has multiple drawbacks, including its computational cost, null model selection and the possibility of convergence to a non-optimal solution, as discussed in [20]. Moreover, this approach only considers the inter-layer edges between each node and itself and ignores the inter-layer edges between different nodes across layers. Another evolutionary clustering approach to mine and track dynamic multi-layer networks is introduced in [41]. The proposed framework relies on a multi-objective optimization representation and the solution is obtained by a genetic algorithm. This approach has a high computational cost and is impractical for detecting the community structure of large networks. Furthermore, this approach does not consider any inter-layer edges between nodes across layers.

In this thesis, we address the problem of community detection in dynamic multi-layer networks. In particular, we focus on analyzing brain functional connectivity by modeling it as a dynamic network across multiple subjects. This representation allows a comprehensive understanding of the dynamic behavior of brain functional connectivity and its variation across subjects.

1.6 Dynamic Functional Connectivity Networks of the Brain

With the advancement of neuroimaging technology, it is now possible to collect neuronal data from the human brain across different modalities, spatial and temporal resolutions, such as electroencephalography (EEG) and fMRI data [42-44]. One of the major ways of analyzing this multidimensional neurophysiological data has been to construct functional connectivity networks of the brain.
This was accomplished by investigating the functional connectivity of the brain networks from fMRI during resting state. FC between dif- ferent brain regions has been quantified through the computation of Pearson’s correlation between regions of interest (ROIs) [54–56]. Increased interest in studying resting state fMRI (rs-fMRI) has shown that the different brain regions of the motor cortex are active even in the absence of a motor task [57]. Many of the current studies on rs-fMRI reported a set of networks that appear consis- tently in healthy subjects such as default mode network, visual network, somatomotor network, executive control network and auditory network [58, 59]. Early work on FCN analysis relied on static network representation. However, recent studies revealed evidence for dynamic nature of FCNs [51–53], [60–62] Different methods have been developed to study the dynamics of the functional connectivity networks (dFCNs). These methods include clustering based methods [55], [63–67], statistical state modeling methods [68, 69] and subspace analysis based methods [56], [70–72], [73]. 14 Under clustering based methods, different algorithms have been developed assuming that the network is at one distinct state at each time point, and the states reoccur across time. In [55], an approach based on k-means clustering of dFCNs across time and subjects is introduced to identify ”FC states” from rs-fMRI data. Similarly, the authors in [63–66] used k-means clustering to ob- tain ”synchrostates” from a time-frequency dependent phase difference data. Another clustering based approach is introduced in [67], where a codebook was obtained to facilitate the description of the ”FCµstate” through Neural Gas algorithm. Under statistical state modeling, an approach to identify the network states through hierarchical clustering followed by a Hidden Markov Model (HMM) is introduced in [68]. Similarly, the authors in [69] obtained the network states and the transitions between them through independent vector analysis and Markov modeling. Under sub- space modeling methods, an approach based on principal component analysis (PCA) is introduced in [56], where it is assumed that the network states are made up of multiple building blocks, or eigenconnectivities. A similar PCA based approach is adopted in [70] using EEG data to recon- struct the principal resting state dynamics. Most of the current work focuses on extracting FC states and reduces the dimension of the dFCNs by vectorizing the connectivity matrices before extracting the FC states. This procedure may result in the loss of the topological structure of the FCN. These approaches also reduce the dFCNs into a few common basis functions or states without explic- itly identifying the topological organization within each state. Moreover, they do not necessarily identify the change points between these states. More recently, approaches that can first detect the change points in dFCN connectivity patterns and then summarize the network structure have been introduced. These include greedy algorithms using graphical approaches [74], Bayesian network modeling based models [75] and tensor based approaches [76], [73]. 15 1.7 Organization and Contributions of this Thesis In this thesis, we present novel approaches to both track and detect the community structure in dynamic multi-layer networks. 
A particular focus of this thesis is detecting and tracking the com- munity structure in temporal social networks and dynamic multi-layer functional connectivity net- works (dFCNs) of the brain. In Chapter 2, we introduce a low-rank + sparse estimation based evolutionary spectral clus- tering approach to detect and track the community structure in temporal networks. The proposed method decomposes the network into low-rank and sparse parts and obtains smooth cluster assign- ments by minimizing the subspace distance between consecutive time points. The method tries to obtain smooth cluster assignments by minimizing the subspace chordal distance between con- secutive time points. Furthermore, the proposed algorithm explicitly considers the changes in the dimensions of the consecutive subspaces over time by modifying the chordal distance. In addition, the introduced framework is robust to noise and outliers and can detect the community structure in both binary and weighted temporal networks efficiently. Therefore, unlike existing evolutionary clustering methods, the proposed algorithm does not require any prior knowledge or make assump- tions about the community structure and does not assume any particular model for the network. The proposed approach is evaluated by applying it to multiple simulated and real social temporal networks. In Chapter 3, we propose a tensor based approach using 3-way and 4-way tensors to detect the community structure of dynamic single-layer and multi-layer networks, respectively. In the dynamic single-layer network case, two approaches, windowed tensor approach (3D-WTA) and running time tensor approach (RTTA) are proposed. In 3D-WTA, a fixed length widow is used to construct the tensors that represent the network at each time point, while in RTTA, a variable 16 length window is used to construct the tensors. The proposed framework takes the history of the networks over time into account and optimizes the contribution of each time point in the network. Moreover, a normalized cost function is introduced to track the changes in the community structure over time. The proposed approach is applied to synthetic and real networks including social net- works and dFCNs. In particular, it is applied to EEG data collected during a study of error-related negativity (ERN) in order to understand the dynamic behavior of the FCN during cognitive con- trol. For dynamic multi-layer networks, 3D-WTA is extended for temporal multi-layer networks, namely 4D-WTA, to identify and track the community structure across time and layers, simultane- ously. The proposed algorithm is applied to study the temporal evolution of communities in FCNs constructed across different regions of interests (ROIs) in the brain from rs-fMRI. In 4D-WTA, the contribution of each subject and time point within the framework is optimized. Moreover, the dFCNs are summarized into a set of FC states defined through their community structure that repeat over time and subjects. The detected community structures are evaluated through a con- sistency measure and the dynamic behavior of the FCNs is compared to the results from prior work. In Chapter 4, we address the issue of community detection in dynamic networks by considering the ability to reduce the network and aggregate the time points that are topologically similar to achieve more compact and efficient community detection. The proposed approach consists of two steps. 
The first step relies on an information-theoretic function to aggregate the dynamic networks and identify the time points that possess similar topological structure in order to combine them into a tensor. In the second step, a tensor based spectral clustering approach similar to the one proposed in Chapter 3 is applied to identify the community structure of the constructed tensors. The detected community structure is summarized as a set of connectivity patterns or states that reoccur across time. A particular application of this approach is the analysis of resting state FCNs of the brain from fMRI signals. A set of FC states is extracted from the FCNs and evaluated by comparing it to the findings of previous studies.

In Chapter 5, we propose a temporal block spectral clustering framework to detect and track the community structure of multi-layer temporal networks. The main contribution of the proposed approach is the construction of data-driven within- and across-layer adjacency matrices. The proposed approach considers all possible edges between nodes within and across layers to construct intra- and inter-layer adjacency matrices, unlike existing methods that ignore inter-layer edges or consider only the inter-layer edges between a node and itself, i.e., multiplex networks. In particular, a sliding window correlation approach is used to model the pairwise temporal relationships between all nodes within and across time windows. A set of intra- and inter-adjacency matrices is constructed and combined to create a set of temporal supra-adjacency matrices. The community structure is then determined by applying spectral clustering to these supra-adjacency matrices. The proposed algorithm is evaluated on simulated networks and on dFCNs from rs-fMRI across time and subjects and compared to results reported in previous studies.

Finally, Chapter 6 summarizes the contributions of this thesis and suggests future work to improve the current approaches.

Chapter 2

Detecting and Tracking Community Structure in Temporal Networks: A Low-Rank + Sparse Estimation Based Evolutionary Clustering Approach

2.1 Introduction

Many real world systems and relational data can be modeled as networks or graphs, where the system's objects and the interactions between them are represented by the vertices and edges of a graph, respectively. With the availability of large amounts of network data, it is important to be able to reduce the network's dimensionality and extract useful information from it. A key approach to network data reduction is community detection [77]. Early work in graph based community detection methods has focused on static networks [2] [7]. This type of network is usually considered an oversimplification, as many real world complex systems exhibit variation in their community structure over time. Consequently, there is a growing need to develop algorithms that detect the community structure and track its evolution across time [5].

Over the past decade, various methods have been proposed to detect the community structure in temporal networks. Evolutionary clustering approaches provide a framework to cluster the network over time, such that the network structure at each time point is preserved and the cluster assignments change smoothly across time. These approaches include extensions of static clustering methods [31] [32], statistical modeling based methods [33], tensor based methods [34] [78], modularity based methods [35] and multi-objective optimization genetic algorithms [79].
Extensions of static clustering methods to temporal networks include k-means and agglomerative hierarchical clustering approaches [31]. In [32], k-means evolutionary clustering approach was extended to two spectral evolutionary clustering based frameworks: preserving cluster quality (PCQ) and pre- serving cluster membership (PCM). The clustering in both frameworks depends on a cost function to guarantee temporal smoothness. However, this cost function requires a priori knowledge about the community structure of the underlying network. For statistical modeling algorithms, a statisti- cal model for the evolution of the adjacency matrix was assumed [33] and an adaptive forgetting factor for evolutionary clustering and tracking (AFFECT) was introduced. In AFFECT, the prox- imity between objects was smoothed over time followed by a static clustering method. AFFECT assumes a stochastic block model for the adjacency matrix and has higher computational complex- ity compared to static clustering due to the estimation of the forgetting factor. Under tensor based methods, non-negative PARAFAC tensor factorization community detection approaches were in- troduced in [34] and [78] where the temporal network is represented by a single 3-way tensor. In [78], PARAFAC decomposition and k-means are used to learn the network generative models and detect the clusters, respectively. However, this framework requires a priori knowledge about the ground truth clusters in the temporal network, including the number of the clusters, the nodes that belong to each cluster and the presence of the clusters across time, to determine the significant 20 clusters and evaluate them. Modularity based methods are based on greedy modularity optimiza- tion to identify the changing modular organization [35]. This approach has some drawbacks such as computational cost, null model selection, preference for bigger clusters and the possibility to converge to a sub-optimal solution as discussed in [20]. Multi-objective optimization genetic al- gorithms consider a multi-objective approach (DYNMOGA) [79] that consist of two objectives: Snapshot cost and temporal cost. The snapshot cost relies on computing four quality measures, including modularity, conductance, normalized cut and community score. On the other hand, the temporal cost relies on minimizing the normalized mutual information between consecutive parti- tions. However, DYNMOGA has high computational complexity that increases with the number of generations. Moreover, DYNMOGA performs well in binary networks whereas its performance declines for weighted networks, especially when the networks are fully connected. Although a variety of methods have been developed to detect the community structure in tem- poral networks, there are still some problems and limitations associated with these methods. First, some of the existing approaches are limited to binary networks [34]. Second, some of the current algorithms require a priori knowledge or make assumptions about the community structure or the adjacency matrix [33] [35] [78]. Third, some of the existing algorithms such as [79] have a high computational complexity. Finally, many of the existing algorithms are not robust to outliers which leads to the detection of inaccurate community structure. Consequently, there is a need to develop new approaches to overcome these shortcomings. 
In this chapter, a low-rank + sparse estimation based evolutionary spectral clustering approach is proposed to detect and track the community structure in temporal weighted and binary networks. The major advantages of the proposed approach over existing methods are fivefold. First, the proposed method can detect and track the community structure in both weighted and binary temporal networks without any adjustments to the optimization problem. Second, the extraction of the low-rank part of the adjacency matrix at each time point removes the sparse noise and outliers in the network, which improves the robustness of the method. Third, minimizing the subspace distance between consecutive time points results in smooth cluster assignments across time, i.e. it preserves the community structure at the current time point and guarantees that the structure evolves smoothly across time. Fourth, a cost function defined through the subspace distance measure provides a way to track the changes in the community structure across time. Finally, the proposed approach does not require a priori knowledge or make any assumptions about the community structure, unlike some existing methods.

2.2 Background

2.2.1 Spectral Clustering

A popular approach for detecting the community structure in static networks is spectral clustering, which maps the relationship between the nodes in a graph to a lower dimensional subspace. Given a network with adjacency matrix A ∈ R^{n×n}, spectral clustering solves the following trace minimization problem [7],

min_{U ∈ R^{n×r}} Tr(U^† Φ U),  s.t.  U^† U = I_r,    (2.1)

where † is the transpose operator and Φ = I − D^{-1/2} A D^{-1/2} is the normalized Laplacian matrix with D as the diagonal degree matrix [19]. Φ is always a positive semi-definite matrix for undirected graphs. The solution to the optimization problem is the matrix U containing the r eigenvectors that correspond to the smallest r eigenvalues of Φ. The community structure is then determined by applying k-means to the matrix U [7]. The number of eigenvectors, r, corresponds to the number of clusters in the network [7].

2.2.2 Low-rank + Sparse matrix approximation

Different methods have been proposed to find a low-rank approximation of a noise corrupted matrix A ∈ R^{n1×n2}. For example, principal component analysis (PCA) finds a low-rank approximation by solving the following optimization problem [80], [81], [82]:

min_{L ∈ R^{n1×n2}} ||A − L||_F^2,  s.t.  rank(L) ≤ r,    (2.2)

where L ∈ R^{n1×n2} is the low-rank approximation of A. The solution to this problem is given in terms of the singular value decomposition (SVD) of the matrix A. Let A = Q_1 Σ Q_2^† be the SVD of A, where Q_1 ∈ R^{n1×n1} and Q_2 ∈ R^{n2×n2} are the left and right singular vector matrices, respectively, and Σ ∈ R^{n1×n2} is a diagonal matrix with the singular values on its diagonal. The rank-r matrix L is given as L = Q̃_1 Σ̃ Q̃_2^†, where Q̃_1 ∈ R^{n1×r}, Q̃_2 ∈ R^{n2×r} and Σ̃ ∈ R^{r×r} are the top-r singular vector and singular value matrices, respectively. Even though PCA is the most commonly used technique for identifying a low-rank approximation of a corrupted observation matrix, it is known that its performance declines when the corruptions are non-Gaussian. In order to address this issue, robust PCA (RPCA) has been introduced to recover the low-rank approximation from data corrupted by non-Gaussian noise [83].
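Before turning to RPCA, the PCA solution of Eq. (2.2) can be illustrated with the following short NumPy sketch (illustrative only; the function and variable names are not part of the thesis):

import numpy as np

def lowrank_pca(A, r):
    """Rank-r approximation of A from the truncated SVD, i.e. the solution of Eq. (2.2)."""
    # Full (thin) SVD: A = Q1 @ diag(s) @ Q2^T
    Q1, s, Q2t = np.linalg.svd(A, full_matrices=False)
    # Keep the top-r singular vectors/values and rebuild the low-rank matrix
    return Q1[:, :r] @ np.diag(s[:r]) @ Q2t[:r, :]

# Example: a noisy matrix whose underlying structure has rank 2
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 2))
A = B @ B.T + 0.05 * rng.standard_normal((50, 50))
L = lowrank_pca(A, r=2)
print(np.linalg.matrix_rank(L))   # prints 2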
RPCA decomposes the matrix A into a low-rank component, L, and a sparse component, S, by solving the following optimization problem:

min_{L, S ∈ R^{n1×n2}} ||L||_* + µ ||S||_1,  s.t.  L + S = A,    (2.3)

where ||L||_* is the nuclear norm of L, ||S||_1 is the l1-norm of S and µ > 0 is the regularization parameter. The unconstrained version of the problem in Eq. (2.3) is given by:

min_{L, S ∈ R^{n1×n2}} ||L||_* + µ ||S||_1 + (γ/2) ||A − L − S||_F^2,    (2.4)

which can be solved efficiently by an iterative SVD soft-thresholding algorithm [83].

2.2.3 Chordal Subspace Distance Measure

Let B be a subspace of dimension r1 in R^n, and C another subspace of dimension r2 in R^n, where r1, r2 ≤ n. The chordal distance, which defines a metric on subspaces of different dimensions, is defined as [84]:

d^2_Ch(B, C) = |r2 − r1| + Σ_{i=1}^{min{r1,r2}} (sin θ_i)^2,    (2.5)

where θ_i represents the ith principal angle between B and C, which quantifies how close the two subspaces are geometrically.

2.2.4 Variation of Information

In order to evaluate the performance of a community detection algorithm, the degree of agreement between different partitions of the same network should be determined. Throughout this chapter, the variation of information metric introduced in [85] is used for this purpose. Variation of information (VI) between two partitions C̄ = (C_1, C_2, . . . , C_r) and C̄' = (C'_1, C'_2, . . . , C'_{r'}), with r and r' clusters respectively, is defined as:

VI(C̄, C̄') = H(C̄ | C̄') + H(C̄' | C̄),    (2.6)

where H(C̄ | C̄') and H(C̄' | C̄) are conditional entropies. VI ranges between 0 ≤ VI ≤ log n, where n is the number of nodes in the network. A value of 0 means the partitions are exactly the same, and as the VI value increases, the degree of agreement between the partitions decreases.

2.3 Methods: low-rank + sparse estimation based evolutionary spectral clustering (LS-ESC)

2.3.1 Problem Formulation

Given noisy adjacency matrix observations, A(t) ∈ R^{n×n}, of a low-rank temporal network, L(t) ∈ R^{n×n}, the objective is to obtain temporally smooth community detection assignments from these noisy adjacency matrix observations. To accomplish this objective, we propose to decompose the corrupted adjacency matrix into low-rank, L(t) ∈ R^{n×n}, and sparse, S(t) ∈ R^{n×n}, components. The estimated low-rank adjacency matrix at each time point is then used to extract the corresponding subspace, i.e. cluster assignment, with the additional constraint that the communities evolve slowly over time. This results in the following optimization problem:

min_{L(t), S(t) ∈ R^{n×n}, U(t) ∈ R^{n×r}} ||L(t)||_* + µ ||S(t)||_1 + λ1 Tr(U(t)^† Φ(t) U(t)) + λ2 d^2_Ch(U(t), U(t−1)),    (2.7)
s.t.  A(t) = L(t) + S(t),  U(t)^† U(t) = I,  L(t) = L(t)^†,  L(t)_{ij} ≥ 0,  L(t)_{ii} = 0.

The different terms in the problem in Eq. (2.7) achieve the following goals:

• The first two terms represent the RPCA problem. The constraints L(t) = L(t)^†, L(t)_{ij} ≥ 0, L(t)_{ii} = 0 are added to ensure the symmetry and nonnegativity of the low-rank adjacency matrix with zeros on the diagonal.

• The third term, Tr(U(t)^† Φ(t) U(t)), with the constraint U(t)^† U(t) = I, represents the spectral clustering problem and is introduced to detect the community structure of the underlying low-rank network at time point t.
The normalized Laplacian matrix, ΦΦΦ(t), is defined in terms of the low-rank component as ΦΦΦ(t) = I− D(t)−0.5L(t)D(t)−0.5. • The last term, d2 Ch(U(t),U(t−1)), is introduced to minimize the distance between the consec- utive networks’ subspaces at time points t and t − 1. By adding this term to the optimization problem in Eq. (2.7), the community structure of the temporal network is guaranteed to be evolving smoothly over time. • µ,λ1 and λ2 > 0 are regularization parameters. 2.3.2 Distance Between Successive Subspaces In order to define the distance between consecutive subspaces, let U(t) be the matrix whose columns form a basis for the r-dimensional subspace in Rn, defined by the r eigenvectors corresponding to the smallest eigenvalues of the graph Laplacian at time point t. Similarly, U(t−1) can be defined as the basis spanning the r1-dimensional subspace in Rn, that contains the r1 eigenvectors corre- sponding to the smallest eigenvalues of the graph Laplacian at time point (t − 1), i.e. U(t) ∈ Rn×r and U(t−1) ∈ Rn×r1. One of our main objectives is to find the subspace spanned by U(t) that is close to the subspace spanned by U(t−1), and at the same time conserves the nodes’ connectivity at time point t. Using the definition of chordal distance in Eq. (2.5) and the fact that the columns of U(t) are orthonormal, we have three different cases. In the first case, the dimensions of the subspaces corresponding to the consecutive time points are equal to each other, i.e. r = r1. In the second case, the dimension of U(t) is smaller than the dimension of U(t−1), i.e. r ≤ r1. In the third case, the dimension of U(t) is larger than the dimension 26 of U(t−1), i.e. r > r1. The chordal distance between the successive subspaces for the three cases can be simplified as follows. For the first case: Ch(U(t),U(t−1)) = d2 = i=1 (sinθi)2 (1 − (cosθi)2) r1∑ r1∑ = r1 − r1∑ = r1 − Trace(U(t)U(t)†U(t−1)U(t−1)† (cosθi)2 i=1 i=1 For the second case: ∑ ∑ i=1 r (sinθi)2 (1 − (cosθi)2) Ch(U(t),U(t−1)) = r1 − r + d2 = r1 − r + = r1 − r + r − r ∑ = r1 − r ∑ = r1 − Trace(U(t)U(t)† ˆU(t−1) ˆU(t−1)† (1 − (cosθi)2) (1 − (cosθi)2) i=1 i=1 (cid:104) ˆU(t−1) ¯U(t−1)(cid:105) , ˆU(t−1) ∈ Rn×r is the matrix that contains the first r eigenvectors where U(t−1) = of the Laplacian matrix at the previous time point, and ¯U(t−1) ∈ Rn×(r1−r) is the matrix that contains the last r1 − r eigenvectors of the Laplacian matrix at the previous time point. (2.8) (2.9) ), ), r i=1 27 For the third case: i=1 r1∑ r1∑ i=1 (sinθi)2 (1 − (cosθi)2) Ch(U(t),U(t−1)) = r − r1 + d2 = r − r1 + = r − r1 + r1 − r1∑ = r − r1∑ = r − Trace( ˆU(t) ˆU(t)†U(t−1)U(t−1)† (1 − (cosθi)2) (1 − (cosθi)2) i=1 i=1 (2.10) ), (cid:104) ˆU(t) ¯U(t)(cid:105) , ˆU(t) ∈ Rn×r1 is the matrix that contains the first r1 eigenvectors of the where U(t) = Laplacian matrix at the current time point, and ¯U(t) ∈ Rn×(r−r1) is the matrix that contains the last r− r1 eigenvectors of the Laplacian matrix at the current time point. 2.3.3 Problem Solution The solution to the optimization problem in Eq. (2.7) can be obtained by alternating direction method of multipliers (ADMM) [86] combined with proximal algorithms [87] [88] due to the non- differentiability of the nuclear and l1− norms. First, the problem in Eq. 
(2.7) is split by adding an auxiliary variable W(t) as follows: (cid:107)L(t)(cid:107)∗ + µ(cid:107)S(t)(cid:107)1 + λ1Tr(U(t)† (I − D(t)−0.5W(t)D(t)−0.5 )U(t)) (2.11) min L(t)∈Rn×n,S(t)∈Rn×n, U(t)∈Rn×r,W(t)∈Rn×n +λ2d2 ch(U(t),U(t−1)), s.t A(t) = L(t) + S(t), W(t) = L(t),U(t)†U(t) = I,W(t) = W(t)† , W (t) i j ≥ 0, W (t) ii = 0. Second, the augmented Lagrange multipliers (ALM) function is rewritten by adding additional quadratic terms to guarantee the attainment of the first two equality constraints: 28 (I − D(t)−0.5W(t)D(t)−0.5 min (cid:107)L(t)(cid:107)∗ + µ(cid:107)S(t)(cid:107)1 + λ1Tr(U(t)† L(t)∈Rn×n,S(t)∈Rn×n, U(t)∈Rn×r,W(t)∈Rn×n 1,A(t) − L(t) − S(t)(cid:105) + λ2d2 F + (cid:104)Xk 2,W(t) − L(t)(cid:105) + + i j ≥ 0, W (t) s.t U(t)†U(t) = I, W(t) = W(t)† , W (t) ch(U(t),U(t−1)) + (cid:104)Xk (cid:107)A(t) − L(t) − S(t)(cid:107)2 γ1 2 (cid:107)W(t) − L(t)(cid:107)2 F , γ2 2 ii = 0, )U(t)) (2.12) with Lagrange multipliers (dual variables) at the kth iteration defined as, Xk+1 1 = Xk 1 + γ1(A(t) − L(t)k+1 − S(t)k+1 ), Xk+1 2 = Xk 2 + γ2(W(t)k+1 − L(t)k+1 ). (2.13) (2.14) The problem in Eq. (2.12) can be solved at each time point, t, by following the steps in Algorithm 2.1 where the primal variables L(t),S(t),W(t) and U(t) are updated iteratively as follows: L(t)k+1 = prox 1 γ1+γ2 (cid:107)L(t)(cid:107)∗ (cid:18)γ1Yk 1 + γ2Yk 2 γ1 + γ2 (cid:19) , (2.15) (2.16) + Xk 1 γ1 , Yk 2 = W(t)k + where Yk 1 = A(t)−S(t)k 2 (cid:107) L−Y (cid:107)2 is the proximity operator of the convex function f where f (L) in this case is defined as f (L) = (cid:107)L(t)(cid:107)∗. Details are included in the Appendix. and prox f (Y) = argminL∈Rn×n f (L) + 1 Xk 2 γ2 F Similarly, S(t)k+1 = prox µ γ1 (cid:107)S(t)(cid:107)1 A(t) − L(t)k+1 + Xk 1 γ1 (cid:18) (cid:19) . with f (S) defined as f (S) = (cid:107)S(t)(cid:107)1 as explained in the Appendix. The update of W(t) is performed by computing the closed form solution as defined in Eq. (2.17) followed by the spectral projected gradient (SPG) [89] method as explained in the Appendix to fulfill all the constraints: W(t)k+1 = L(t)k+1 − Xk 2 γ2 + λ1 γ2 (D(t)k −0.5 ) U(t)kU(t)k† (D(t)k −0.5 ) (2.17) 29 Finally, the update of U(t) requires the solution of multiple optimization subproblems depend- ing on the dimension of the consecutive subspaces. For the first two cases defined in Section 2.3.2, U(t) is updated by solving: min λ1Tr(U(t)†ΦΦΦ(t)k+1 modU(t)), s.t U(t)†U(t) = I, (2.18) U(t) ∈Rn×r mod = ΦΦΦ(t)k+1 − λ2 λ1 where ΦΦΦ(t)k+1 ˆU(t−1) ˆU(t−1)†. For the last case, U(t) is updated by solving: min λ1Tr( ˆU(t)†ΦΦΦ(t)k+1 mod ˆU(t)) s.t λ1Tr( ¯U(t)†ΦΦΦ(t)k+1 ¯U(t)), ˆU(t) ∈Rn×r1 min ¯U(t)† ¯U(t) = I, ˆU(t)† ¯U(t) = 0, ¯U(t) ∈Rn×(r−r1) s.t ˆU(t)† ˆU(t) = I, ¯U(t)(cid:105) . (2.19) (cid:104) ˆU(t) where ΦΦΦ(t)k+1 mod = ΦΦΦ(t)k+1 − λ2 λ1 U(t−1)U(t−1)† and U(t) = The derivations are included in the Appendix. 2.3.4 Tracking the community structure over time In order to track changes in the community structure over time, a cost function is defined using the distance between consecutive subspaces as follows: CostFunction = d2 ch(U(t),U(t−1)), (2.20) where this cost represents the minimum distance achieved between consecutive subspaces. The changes in the cost function reflect the changes in the community structure. 2.3.5 Detecting the number of clusters During the application of LS-ESC, the number of clusters at each time point is determined using the asymptotical surprise (AS) metric [90]. 
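The updates of L(t) and S(t) above are proximal steps, namely singular value thresholding for the nuclear norm and entrywise soft thresholding for the l1-norm, and the smoothness term relies on the chordal distance of Eq. (2.5). A minimal NumPy/SciPy sketch of these building blocks is given below; it is illustrative only and is not the thesis implementation.

import numpy as np
from scipy.linalg import subspace_angles

def svt(Y, tau):
    """Singular value thresholding: proximal operator of tau*||.||_* (used in the L update)."""
    Q1, s, Q2t = np.linalg.svd(Y, full_matrices=False)
    return Q1 @ np.diag(np.maximum(s - tau, 0.0)) @ Q2t

def soft_threshold(Y, tau):
    """Entrywise soft thresholding: proximal operator of tau*||.||_1 (used in the S update)."""
    return np.sign(Y) * np.maximum(np.abs(Y) - tau, 0.0)

def chordal_dist_sq(U, V):
    """Squared chordal distance of Eq. (2.5) between the column spans of U and V."""
    theta = subspace_angles(U, V)                      # principal angles between the subspaces
    return abs(U.shape[1] - V.shape[1]) + np.sum(np.sin(theta) ** 2)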
During the first iterations of Algorithm 2.1, U(t)^{k+1} in Step 10 is calculated for a range of candidate numbers of eigenvectors, e.g. 2-20, each followed by k-means clustering. The correct number of clusters at time point t is then determined as the number that maximizes the AS metric [90] with respect to the underlying network and is defined as:

r : argmax_r AS(A(t), Clustering labels),    (2.21)

where Clustering labels : {C_1, C_2, . . . , C_r}. The number of clusters is then fixed for time point t until the algorithm converges. This process is repeated at each time point to update the number of clusters across time.

Algorithm 2.1 LS-ESC
Input: A(t) ∈ R^{n×n}, U(t−1) ∈ R^{n×r1}, µ, λ1, λ2, ε.
Output: L(t), S(t), U(t), Clustering labels.
1: k ← 0.
2: Initialize L(t) ← A(t), S(t) ← zeros(n, n), W(t) ← L(t).
3: γ1 ← 1, γ2 ← 1.
4: X^k_1 ← A(t) − L(t)^k − S(t)^k, X^k_2 ← W(t)^k − L(t)^k.  % Define dual variables
5: P^k_1 = ||L(t)^k||_*, P^k_2 = µ||S(t)^k||_1, P^k_3 = λ1 Tr(U(t)^{k†} Φ(t)^k U(t)^k), P^k_4 = λ2 d^2_Ch(U(t)^k, U(t−1)).  % Define primal objectives
6: while ||P^{k+1}_j − P^k_j||^2_F / ||P^k_j||^2_F > ε for j = 1, . . . , 4 and ||X^{k+1}_1 − X^k_1||^2_F / ||X^k_1||^2_F > ε and ||X^{k+1}_2 − X^k_2||^2_F / ||X^k_2||^2_F > ε do
7:   Update L(t)^{k+1} using Eq. (2.15).
8:   Update S(t)^{k+1} using Eq. (2.16).
9:   Update W(t)^{k+1} using Eq. (2.17) followed by the SPG step.
10:  Update U(t)^{k+1} by solving Eq. (2.18) or Eq. (2.19).
11:  if k = 2 then
12:    Determine the number of clusters using Eq. (2.21).
13:  end if
14:  Update X^{k+1}_1 using Eq. (2.13).
15:  Update X^{k+1}_2 using Eq. (2.14).
16:  Update P^{k+1}_1, P^{k+1}_2, P^{k+1}_3 and P^{k+1}_4 using Step 5.
17:  k ← k + 1.
18: end while
19: Apply k-means on U(t) to obtain clustering labels.

2.4 Results

In order to validate the performance of the proposed algorithm in detecting and tracking the community structure of temporal networks, the algorithm is evaluated on a set of simulated and real binary and weighted temporal networks. The performance of the proposed LS-ESC is compared to the state-of-the-art graph-based community detection algorithms including AFFECT [33], DYNMOGA [79] and generalized Louvain for multi-layer modularity maximization (GenLov) [35]. The comparison is conducted in terms of the detected number of clusters (DNOC) and the accuracy of the detected community structure across time using the VI metric.

2.4.1 Simulated temporal networks

2.4.1.1 Simulated binary temporal networks

To evaluate the performance of the proposed algorithm in detecting the community structure of binary temporal networks, the benchmark networks proposed by Girvan and Newman in [91] and implemented in [92] are adopted. In this data set, the temporal network consists of 20 time points with 128 nodes divided into 4 clusters. Each node has a fixed average degree (AD) and shares a number (z) of edges with nodes from other communities. As AD increases, the network becomes more dense, and as z decreases, the clusters become more distinct.

Data set   AD   z   NC%   µ, λ1, λ2/λ1
1          32   3   10    0.3, 0.1, 0.3
2          32   5   10    0.4, 0.1, 0.3
3          20   2   20    0.5, 0.1, 0.1
4          20   1   30    0.5, 0.1, 0.1
5          16   3   20    0.5, 0.1, 0.1
6          16   3   30    0.6, 0.1, 0.1

Table 2.1: Parameters of the simulated binary networks from the Girvan and Newman benchmark for six different data sets and the selected values of the regularization parameters.
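For illustration, a single snapshot of such a planted four-community network can be generated roughly as follows (a sketch only; the experiments use the benchmark implementation of [92], not this code, and the function name is hypothetical):

import numpy as np

def gn_snapshot(ad=16, z=3, n_nodes=128, n_comm=4, seed=0):
    """One Girvan-Newman-style snapshot: planted communities, average degree ad,
    of which z edges per node go (on average) to other communities."""
    rng = np.random.default_rng(seed)
    size = n_nodes // n_comm
    labels = np.repeat(np.arange(n_comm), size)
    p_in = (ad - z) / (size - 1)          # probability of an intra-community edge
    p_out = z / (n_nodes - size)          # probability of an inter-community edge
    same = labels[:, None] == labels[None, :]
    P = np.where(same, p_in, p_out)
    upper = np.triu(rng.random((n_nodes, n_nodes)) < P, k=1)
    A = (upper | upper.T).astype(int)     # symmetric binary adjacency, zero diagonal
    return A, labels

A, labels = gn_snapshot(ad=32, z=3)
print(A.sum(axis=1).mean())               # close to the target average degree of 32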
In order to impose nonstationarity on the community structure over time, a percentage of nodes (NC%) is moved randomly from their communities and assigned to other communities at each time step. A set of temporal networks is generated with different parameters to obtain different community structures, as summarized in Table 2.1. The regularization parameters are determined empirically and reported in Table 2.1.

The performance of the proposed approach is compared to the state-of-the-art algorithms using the VI metric in Fig. 2.1. The simulations are repeated 100 times and the VI is averaged over all simulations. As can be seen in Fig. 2.1, the proposed algorithm performs better than AFFECT and DYNMOGA in detecting the community structure over time for the different networks. The performance of DYNMOGA is found to be comparable to the proposed algorithm for data sets 1 and 4 in Fig. 2.1a and Fig. 2.1d, respectively. This is due to the fact that in data set 1, AD is high and NC% is low, i.e. the networks are densely connected and more stationary. On the other hand, in data set 4, z is low, i.e. the clusters are more distinct. However, the performance of DYNMOGA degrades as AD decreases and z and NC% increase, i.e. in cases where the network is sparse, non-stationary and less modular, such as data sets 5 and 6. It is found that AFFECT does not perform well in binary networks.

Figure 2.1: VI comparison between LS-ESC, AFFECT and DYNMOGA for simulated binary networks from the Girvan and Newman benchmark with the parameters presented in Table 2.1. LS-ESC scored the lowest VI over time compared to the other methods: (a) Data set 1; (b) Data set 2; (c) Data set 3; (d) Data set 4; (e) Data set 5; and (f) Data set 6.

2.4.1.2 Simulated weighted temporal networks

As many real-world networks are weighted, a set of weighted temporal networks is generated to evaluate the performance of the proposed approach in detecting and tracking the community structure over time. Temporal networks are generated for 60 time points with 100 nodes. Intra- and inter-cluster edges are selected from a truncated Gaussian distribution in the range of [0,1]. Different levels of sparse noise (SN%) are added to the inter-cluster edges. The parameters and the details of the community structure for these data sets are given in Table 2.2. The values of the regularization parameters are chosen empirically and reported in Table 2.2. The performance of LS-ESC and the state-of-the-art algorithms in detecting the community structure for the weighted temporal networks in Table 2.2 is compared in terms of the DNOC and VI over time in Fig. 2.2. DNOC at each time point for LS-ESC and AFFECT is determined as the number that maximizes the asymptotical surprise metric. The simulations are repeated 100 times and the cost function, DNOC and VI are averaged over all simulations. As can be seen in Fig.
2.2(a)-(c), the cost function can track the change points in the com- munity structure over time and detects changes at time points 20 and 40. Fig. 2.2(d)-(f) show that the proposed approach performs better than the other methods in detecting the correct number of clusters over time. Finally, Fig. 2.2(g)-(i) present a comparison between the different methods using the VI metric. LS-ESC outperforms the other methods in terms of detecting and tracking the correct community structure over time. Moreover, LS-ESC has better performance around the change points and detects the correct community structure faster than the other methods. The performances of AFFECT and GenLov decline as the SN% increases and the size of clusters de- creases. In particular, both methods show preference to large clusters and tend to merge small clusters together as can be seen during the time intervals (40-60), (20-40) and (20-60) in data sets 35 Figure 2.2: Comparison between LS-ESC and state of the art algorithms for the different weighted networks in Table 3.1: (a)-(c) The cost function calculated by LS-ESC, (d)-(f) The detected number of clusters (DNOC) for the simulated networks using different algorithms. LS-ESC estimated the correct number of clusters over time and the cost function reflected the correct change points; (g)-(i) Comparison of variation of information (VI) between LS-ESC, AFFECT and GenLov for the simulated networks. LS-ESC outperforms other methods and scored the lowest variation of information over time. 1, 2 and 3, respectively. The results for DYNMOGA are not presented in these figures as it does not perform well on weighted networks. 36 0204060Time Point (a) 00.511.5Cost Function0204060Time Point (d) 345DNOC0204060Time Point (g) 00.020.040.060.080.1VI0204060Time Point (b) 00.511.5Cost Function0204060Time Point (e) 22.533.544.5DNOCAFFECTGenLovLS-ESC0204060Time Point (h) 00.020.040.060.080.1VI0204060Time Point (c) 00.511.5Cost Function0204060Time Point (i) 00.020.040.060.080.1VI0204060Time Point (f) 456DNOCDataSet 3DataSet 2DataSet 1 Data Sets Time points 1-20 Time points 21-40 Time points 41-60 1 2 3 µintra,σintra, µiner,σinter Nodes in each cluster C Number of clusters SN%, µ, λ1, λ2 λ1 0.5,0.3,0.2,0.1 C1(1− 30), C2(31− 45), C3(46− 60),C4(61− 80) C5(81− 100) 0.5,0.2,0.1,0.1 0.5,0.3,0.2,0.1 C1(1− 30), C2(31− 60), C3(61− 80),C4(81− 100) C1(1− 60), C2(61− 80), C3(81− 100) 5 4 3 10%, 0.3, 0.1, 0.1 10%, 0.3, 0.1, 0.1 10%, 0.3, 0.1, 0.1 µintra,σintra, µiner,σinter Nodes in each cluster C C1(1− 30), C2(31− 60), C3(61− 80),C4(81− 100) 0.7,0.3,0.2,0.1 0.7,0.2,0.3,0.2 0.7,0.3,0.2,0.1 C1(1− 60), C2(61− 80), C3(81− 100) C1(1− 30), C2(31− 60), C3(61− 80),C4(81− 100) Number of clusters SN%, µ, λ1, λ2 λ1 4 3 4 20%, 0.5, 0.1, 0.1 20%, 0.5, 0.1, 0.1 20%, 0.5, 0.1, 0.1 µintra,σintra, µiner,σinter Nodes in each cluster C C1(1− 30), C2(31− 60), C3(61− 80),C4(81− 100) 0.8,0.2,0.3,0.2 0.6,0.1,0.2,0.1 0.8,0.2,0.3,0.2 C1(1− 30), C2(31− 45), C3(46− 60), C4(61− 80) C5(81− 100) C1(1− 15), C2(16− 30), C3(31− 45),C4(46− 60) C5(61− 80), C6(81− 100) Number of clusters SN%, µ, λ1, λ2 λ1 4 5 6 30%, 0.25, 0.1, 0.1 30%, 0.2, 0.1, 0.1, 30%, 0.2, 0.1, 0.1 Table 2.2: Parameters of the truncated Gaussian distribution used to generate the edges of the simulated weighted temporal networks for three different data sets, the ground truth of the nodes’ community membership for the generated networks and the selected values of the regularization parameters. 
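For illustration, one time step of such a weighted network can be generated roughly as follows. This is a sketch only: the actual generator is the one described above and parameterized in Table 2.2; here the truncation is implemented by clipping to [0,1], and injecting sparse noise by replacing a random fraction SN% of the inter-cluster entries is an assumption made for this sketch.

import numpy as np

def weighted_snapshot(sizes, mu_intra, sd_intra, mu_inter, sd_inter, sn=0.1, seed=0):
    """One weighted snapshot: block-structured edge weights drawn from Gaussians clipped
    to [0,1], with a fraction sn of inter-cluster entries replaced by sparse noise."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(len(sizes)), sizes)
    n = labels.size
    intra = rng.normal(mu_intra, sd_intra, (n, n))
    inter = rng.normal(mu_inter, sd_inter, (n, n))
    A = np.where(labels[:, None] == labels[None, :], intra, inter)
    # sparse outliers on a random fraction sn of the inter-cluster entries
    noise_mask = (labels[:, None] != labels[None, :]) & (rng.random((n, n)) < sn)
    A[noise_mask] = rng.random(noise_mask.sum())
    A = np.clip((A + A.T) / 2.0, 0.0, 1.0)    # symmetrize and truncate to [0,1]
    np.fill_diagonal(A, 0.0)
    return A, labels

A, labels = weighted_snapshot([30, 30, 20, 20], 0.5, 0.3, 0.2, 0.1, sn=0.1)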
2.4.2 Real social network: MIT Reality Mining Dataset

This data set was collected between August 2004 and June 2005 by recording the cell phone activity of 94 students and staff at MIT [93]. This data set can be represented as a binary dynamic network across 46 time points, where each time point refers to a week during the school year. Two dominant clusters are assumed at each time point. The LS-ESC approach is applied to this data set to detect the community structure across time. The cost function is used to track the changes in the community structure over time, as shown in Fig. 2.3. The changes in the cost function coincide with six important dates during the school year, namely the beginning and end of the Fall and Spring semesters, Thanksgiving and spring breaks. The detected community structures for selected time points are shown in Fig. 2.4. These time points are selected to present the detected community structures at the beginning of the Fall semester, Thanksgiving break, 2 weeks after the beginning of the Spring semester and the end of the Spring semester. The black frames represent the expected ground truth and the different colors represent the communities detected by LS-ESC. As can be seen from Fig. 2.4a, small clusters are detected at the beginning of the Fall semester, indicating fewer interactions among students during this time. On the other hand, bigger clusters are detected during Thanksgiving break and the Spring semester, indicating more interactions between participants during these times, as shown in Fig. 2.4b and Fig. 2.4c. Finally, two big clusters are detected at the end of the Spring semester, as shown in Fig. 2.4d. However, these clusters are less dense than the ones detected during the semester, which indicates that the interactions between participants decrease toward the end of the semester.

Figure 2.3: Cost function calculated by LS-ESC for the MIT Reality Mining network. The changes in the cost function coincide with six important dates during the school year (red dashed lines).

2.4.3 Real social network: Primary school network

In order to validate the performance of the proposed approach in detecting and tracking the community structure in real weighted temporal networks, a high-resolution data set that describes a temporal social network in a primary school is used [94]. This data set studies the relation between contact patterns among children in a primary school and the propagation of diseases [95].

2.4.3.1 Primary school data set description and temporal network construction

This data set is collected by the SocioPatterns collaboration (http://www.sociopatterns.org) during two consecutive days in October 2009: Thursday, October 1st from 8:45 am to 5:20 pm and Friday, October 2nd from 8:30 am to 5:05 pm. The data set is collected using wearable sensors that sense face-to-face proximity interactions (about 1 to 1.5 meters) between individuals. The sensing system has a temporal resolution of 20 seconds and the proximity interactions are detected over consecutive 20-second-long time intervals. The school consists of five grades, where each grade is divided into two sections. The total number of participants in the data set is 242, including 232 students and 10 teachers. Details about the students in the different grades are presented in Table 2.3.
Details about the students in the different grades are presented in Table 39 051015202530354045Weeks00.20.40.60.811.21.41.61.8Cost functionCost function calculated by LS-ESC for reality mining data setSpring term beginsThanksgivingSpring term endsFall term endsSpring breakFall term begins (a) (c) (b) (d) Figure 2.4: Detected community structure of the MIT reality mining temporal network at different times: (a) Week 5; (b) Week 15; (c) Week 27; and (d) Week 39. 40 Detected community structure by LS-ESC during week 5102030405060708090102030405060708090Detected community structure by LS-ESC during week 15102030405060708090102030405060708090Detected community structure by LS-ESC during week 27102030405060708090102030405060708090Detected community structure by LS-ESC during week 39102030405060708090102030405060708090 Class name 1A 1B 2A 2B 3A 3B 4A 4B 5A 5B Teachers Number of partici- pants 23 25 23 26 23 22 21 23 22 24 10 Table 2.3: Different classes and number of teachers and students participating in the primary school data from each class. 2.3. The school day runs from 8:30 am 4:30 pm, with lunch break from 12:00 pm to 2:00 pm and two breaks of 20− 25 minutes around 10:30 am and 3:30 pm. Only two or three classes have breaks at the same time. Lunch is served in a common canteeen in two consecutive time periods and a common playground located outside the building is used during the other two breaks. A more detailed description of the data set can be found in [94]. In order to analyze the primary school data set, two temporal networks are constructed to present the temporal interactions between students and teachers for two consecutive days. In each network, the raw sensor data is averaged over time intervals of approximately 13 minutes, i.e. each time step of the constructed temporal network represents the average interactions between individuals over 13 minutes. Each time step is represented by an undirected weighted adjacency matrix A(t) ∈ Rn×n with n = 242 nodes. This results in two temporal networks with T = 40 time points representing each day. An important advantage of this network is that since it reflects the interactions in a primary school, a set of communities are expected to represent the different classes in the school. Moreover, the classes, lunch and break times are known which provides an expected ground truth about the community structure and how it changes over time. For instance, face-to-face interactions are expected to be higher between children in the same class during class times while the interactions between children from different classes are expected to increase during lunch or break times. 41 2.4.3.2 Detecting and tracking the community structure in the primary school temporal network (PSTN) The proposed LS-ESC approach is applied to PSTN for both days to detect and track the commu- nity structure over time. In both days, the detected number of communities in the network varies between 4 and 15 communities. The proposed cost function is calculated for both days and presented in Fig. 2.5(a) and (b). As it can be seen in Fig. 2.5, both cost functions follow a similar pattern during the day where the changes in the cost function reflect the changes in the community structure corresponding to breaks and lunch times. In particular, the biggest variation in the cost functions corresponds to lunch times where face-to-face proximity interactions between students from different classes increase. 
The detected community structures at selected time points are shown in Fig.2.6 and Fig. 2.7. These time points are selected to present the detected community structures in the morning, break, lunch and end of the day. Teachers are represented by the last ten nodes in red, the black frames represent the different classes and the different colors represent the communities detected by LS- ESC. As it can be seen from Fig.2.6 and Fig. 2.7, the detected community structures in the morning and at the end of the school day, i.e. Fig. 2.6(a) and (d) and Fig. 2.7 (a) and (d), represent the different classes in the school where the face-to-face proximity interactions are high between children from the same classes. Also, teachers are members of these communities during these times. These results agree with the activity pattern of the school since these times represent class times. On the other hand, bigger clusters that consist of more than one class are detected during break and lunch times as shown in Fig. 2.6 and Fig. 2.7(b) and (c), respectively, which reflects the true activity pattern during these times. 42 2.5 Conclusions In this chapter, a new low-rank + sparse estimation based evolutionary spectral clustering approach is proposed to detect and track the community structure in temporal networks. In particular, the proposed method decomposes the network into low-rank and sparse parts and obtains smooth cluster assignments by minimizing the subspace distance between consecutive time points, simul- taneously. The estimated low-rank adjacency matrix is used for clustering and the subspaces are defined through spectral embedding. Furthermore, a cost function to track changes in the com- munity structure across time is introduced through the subspace distance measure. The algorithm is validated on multiple simulated and real weighted and binary temporal networks and compared to the existing state-of-the-art evolutionary clustering algorithms. The simulation results show that the proposed algorithm can adapt to the changing community structure and detect the correct community structure over time. The proposed approach detects and tracks the community structure in both weighted and binary temporal networks using the same optimization problem efficiently unlike some existing methods that are limited to either binary or weighted networks. Moreover, LS-ESC approach is robust to noise and outliers, non-stationarity and abrupt changes in the structure of the temporal network due to the estimation of the low-rank part of the adjacency matrix at each time point. 43 (a) (b) Figure 2.5: Cost function calculated by LS-ESC for the primary school temporal network for the: (a) first day, (b) second day. Red dashed lines refer to the time during the day. 44 024681012Cost function calculated by LS−ESCTimeCost function10:0011:0012:0002:0003:0004:0005:00Break TimeBreak Time01:00Lunch Time02468101214TimeCost functionCost function calculated by LS−ESC for the second day 10:00 11:00 05:00 04:00 03:00 02:00 12:00Break TimeLunch TimeBreak Time 01:00 Figure 2.6: Detected community structure in the first day of the primary school temporal network at different times: (a) Morning, (b) break, (c) lunch and (d) end of the school day. 
45 Detected community structure by LS−ESC at 9:11 amDetected community structure by LS−ESC at 10:55 amDetected community structure by LS−ESC at 1:05 pmDetected community structure by LS−ESC at 4:55 pm(b)(a)(c)(d)1A 1B 2A3A3B4A4B5A5BTeachers Figure 2.7: Detected community structure in the second day of the primary school temporal net- work at different times: (a) Morning, (b) break, (c) lunch and (d) end of the school day. 46 Detected community structure by LS−ESC at 9:35 amDetected community structure by LS−ESC at 10:40 amDetected community structure by LS−ESC at 1:00 pmDetected community structure by LS−ESC at 4:25 pm(a)(c)(d)(b) Chapter 3 Tensor Based Temporal and Multi-layer Community Detection 3.1 Introduction Networks are commonly used to model the relationships and interactions between people or objects such as in social, biological, and communication networks. A lot of research has been done in detecting the community structure of networks, especially for static networks [96]. However, in many applications, the community structure changes over time. In these applications, the system is represented by a dynamic network [5]. The topology of dynamic networks change over time, since nodes and edges might come and go. In other applications, the nodes of each graph might have different types of relationships that evolve with time, and this variation should be taken into account to understand the correct community structure of these networks [37]. In this case, the system is represented by a dynamic multi-layer network [41]. Over the past decade, multiple approaches have been proposed to investigate the community structure in dynamic networks. Yet, community detection algorithms for dynamic multi-layer networks are very limited. Standard spectral clustering can be used to analyze these networks at each time point or layer separately, but this approach is very sensitive to instantaneous noise. 47 In recent years, different approaches have been developed to detect the community structure of dynamic networks [31–34]. These approaches include simple extensions of static clustering meth- ods, statistical modeling based methods and tensor based methods. Extensions of static clustering methods such as k-means and agglomerative hierarchical clustering approaches were developed in [31] for dynamic networks. A cost function that considers both the cluster membership at the current time and the deviation from the previous time point was introduced. In [32], two spectral clustering based frameworks were introduced: preserving cluster quality (PCQ) and preserving cluster membership (PCM). In both frameworks, the clustering depends on a cost function to guar- antee temporal smoothness. However, this cost function requires a priori knowledge about the number of clusters and depends on the choice of a changing parameter. Under statistical modeling based methods, [33] assumed a statistical model for the evolution of the adjacency matrix and an adaptive forgetting factor for evolutionary clustering and tracking (AFFECT) was introduced. In AFFECT, static clustering methods were used after smoothing the proximity between objects over time. Under tensor based methods, a non-negative tensor factorization approach was introduced in [34]. This approach used PARAFAC to detect the community structure in temporal networks, where a single 3-way tensor was constructed to represent all the successive adjacency matrices in the temporal network. 
Over the past years, few methods have been developed to detect the community structure in dynamic multi-layer networks. In [35], the authors introduced a general framework that encom- passes time-evolving, multi-scale and multi-layer networks to determine the community structure in multi-slice networks. These networks include a set of adjacency matrices that represents the vari- ation in connectivity between nodes across time, or variation in the types of relationships across layers, or variation of the same network across multiple scales. The framework relies on a gener- alization of the modularity optimization problem for detecting the community structure of a single 48 network. Since this framework depends on modularity, it has multiple drawbacks including the computational cost, null model selection and the possibility of the convergence to a non-optimal solution as discussed in [20]. Another evolutionary clustering approach to mine and track dynamic multi-layer networks is introduced in [41]. The proposed framework relies on a multi-objective optimization representation and the solution is obtained by a genetic algorithm. This approach has a high computational cost and impractical in the case of detecting the community structure for large networks. In this chapter, we propose a tensor based approach using 3-way and 4-way tensors to detect the community structure of dynamic single-layer and dynamic multi-layer networks, respectively. In the case of dynamic networks, two approaches, 3-dimensional windowed tensor approach (3D- WTA) and running time tensor approach (RTTA) are proposed. In 3D-WTA, a fixed length sliding window is used to construct the tensors, while a variable length sliding widow is used to con- struct the tensors in RTTA. In the case of dynamic multi-layer networks, we extend 3D-WTA to 4-dimensional windowed tensor approach (4D-WTA) to optimize the contribution of the multiple layers and time points to the community structure, simultaneously. The main contributions of the proposed work are five-fold. First, a tensor based approach that extends spectral clustering to tem- poral and multi-layer networks is introduced. The proposed approach simultaneously takes the past history across time or time and multiple layers into account to obtain a common community structure for the network. Moreover, unlike simple averaging based methods, the proposed ap- proach optimizes the contribution of each layer and time point within the framework. Second, the proposed approach preserves the topological structure of the networks by using tensors to repre- sent the network across time and layers. Third, the proposed framework offers a way to track the changes in the community structure over time. In particular, a normalized cost function is intro- duced to track the changes in the community structure over time. Fourth, the proposed framework, 49 unlike AFFECT, is model-free and it tracks and identifies the dynamic community structure across time and layers without making any assumptions about the changes in the community structure like PCM and PCQ the temporal clustering (TC) framework in [78]. Fifth, the proposed work can be applied to investigate different types of networks, such as social and biological networks. An application of interest in our work is studying and analyzing the dynamic functional connectivity networks of the brain during resting state and cognitive control, with a major focus on resting state fMRI (rs-fMRI). 
In particular, the dFCNs are summarized into a set of functional connectivity patterns or states that repeat over time. For each subject, each time point in the dFCN is repre- sented by a FC state. Unlike previous work [55], [56] and [97], the FC states are defined through a common community structure. Furthermore, transition profiles between different states are ob- tained for each subject to show the variance in the dynamic brain activity across subjects during rest. A consistency measure is employed to quantify the dynamics of the different subnetworks in rs-fMRI. 3.2 Background 3.2.1 Quality Metrics to Determine the Number of Clusters Different methods and quality metrics have been utilized to determine the number of clusters and to quantify the quality of a community structure when the ground truth is unknown [98,99]. Some of the most commonly used metrics include eigengap criterion [7], [100], modularity [11] and asymptotical surprise [90]. In the eigengap criterion, the number of clusters, r, for a given normalized similarity matrix, AN, with eigenvalues λ1 ≥ λ2 . . . ≥ λn is determined by min{r : λi − λi+1 ≤ δ ,for all i > r}. The threshold δ is determined as the 0.95 quantile of the largest eigengap obtained from the eigenvalues 50 of a null model generated from random networks with the same degree distribution and weights as the original network [100]. community detection methods [30, 33, 92]. Modularity is defined as Q = 1 Another commonly used metric is modularity [11], which has been used as a quality metric in [Ai j − DiD j 2E ] Ai j is the total edge weight, Di is the degree of node i, ¯C = {C1, . . . ,Cr} is the set of clusters and Vc is the set of nodes in the cth cluster. This metric measures the deviation where 2E = n ∑ i, j=1 2E ∑ c∈ ¯C ∑ i, j∈Vc fraction of the within community edges in the network from the expected value of the same quan- tity in the network with the same community partitions but random connections between the ver- tices [11]. However, modularity has been shown to suffer from the resolution and null model selection problems [101]. The addition of a tunable parameter namely a resolution parameter γ is suggested in [102] to address this issue and the definition of the modularity is modified as Qγ = 1 2E ∑ c∈ ¯C ∑ i, j∈Vc [Ai j − γ DiD j 2E ], where Qγ is the multi-resolution modularity metric. An approach to calculate the resolution parameter is suggested in [103]. Despite the fact that the addition of γ improves the performance of modularity, it still does not provide a reliable solution with the inappropriate choice of the null model [20]. In order to overcome the limitations of modularity, asymptotical surprise is proposed in [90]. Asymptotical surprise is defined as Sa = 2E DKL(q||(cid:104)q(cid:105)), where 2E is the sum of edge weights in the adjacency matrix, q = Eint edge weights and (cid:104)q(cid:105) = pint 2E is the observed ratio of the intra-cluster edge weights to the total p is the expected ratio of intra-cluster edge weights with pint as the total sum of possible intra-cluster edge weights. DKL is the Kullback-Leibler divergence [104]. This quality metric compares the distribution of the nodes and edges in the detected communities in a certain network with respect to a null model. The community structure with the highest score is then selected. As discussed in [90], [105] and [98], this metric is not affected by the resolution limit unlike modularity. 
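As an illustration of the modularity score defined above, the following NumPy sketch (illustrative only) computes Q_γ for a given weighted adjacency matrix and partition; setting γ = 1 recovers the standard modularity:

import numpy as np

def modularity(A, labels, gamma=1.0):
    """(Multi-resolution) Newman modularity Q_gamma of a partition of a weighted graph."""
    degrees = A.sum(axis=1)
    two_e = degrees.sum()                      # 2E: total edge weight (both directions)
    same = labels[:, None] == labels[None, :]  # indicator that i and j share a cluster
    expected = gamma * np.outer(degrees, degrees) / two_e
    return ((A - expected) * same).sum() / two_e

# Example: two 3-node cliques joined by a single edge
A = np.zeros((6, 6))
A[:3, :3] = 1; A[3:, 3:] = 1; A[0, 3] = A[3, 0] = 1
np.fill_diagonal(A, 0)
print(round(modularity(A, np.array([0, 0, 0, 1, 1, 1])), 3))   # prints 0.357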
51 3.3 Methods 3.3.1 Spectral Clustering In graph theory, a common method to detect the community structure in static networks is spectral clustering, as it represents the relationship between the nodes in the graph in a lower dimensional subspace. Spectral clustering solves the following trace optimization problem [7, 8], min U∈Rm×r Trace U†ΦΦΦU , s.t U†U = I, (3.1) where, ΦΦΦ = I − AN is the normalized Laplacian matrix. AN = D− 1 2 is the normalized adjacency matrix, where AN ∈ Rn×n, n is the number of nodes and D is the diagonal degree matrix 2 AND− 1 which is defined as Dii = n ∑ j=1 ai j. Spectral clustering problem can also be written as, (cid:16) (cid:16) (cid:17) (cid:17) max U∈Rn×r Trace U†ANU , s.t U†U = I, (3.2) or since AN is positive semi-definite, (3.2) can be written as, (cid:107) U†ANU (cid:107)2 F , max U∈Rn×r s.t U†U = I. (3.3) The solution to the optimization problems in (3.1), (3.2) and (3.3) is found by choosing U as the matrix that contains the r eigenvectors that correspond to the smallest eigenvalues of ΦΦΦ or the largest eigenvalues of AN. The community structure is then determined by applying k-means to the matrix U [7]. 52 3.3.2 3D-Windowed Tensor Approach (3D-WTA) In clustering dynamic or temporal networks at time t, it is important to take the recent history of the network into account so that the resulting community structure is smooth across time. One way to accomplish this is to employ a weighted average of the adjacency matrices within a time window of length L and then optimize the conventional spectral clustering cost function over this weighted average. This results in an optimization over both the subspace vectors matrix, U(t), and the weight vector, W (t). The optimization problem in (3.3) can then be rewritten as, (cid:32) L ∑ l=1 (cid:107) U(t)† max U(t),W (t) (cid:33) w(t) l A(l) N U(t) (cid:107)2 F , ,s.t U(t)†U(t) = I, W (t) ≥ 0,(cid:107) W (t) (cid:107)2= 1, (3.4) (cid:105) = N where wl are the weights for the adjacency matrices within a time window of length L. (cid:104) In the proposed approach, 3D-WTA, we build a third order tensor, X (t) ∈ Rn×n×L A(t−L+1) matrix A(l) , . . . ,A(t) N N . A sliding fixed size tensor is constructed across time and (3.4) can be reformulated in , where the lth frontal slice of the tensor X (t) is the normalized adjacency terms of Tucker decomposition as follows, (cid:107) X (t) ×1 U(t)† ×2 U(t)† ×3 W (t)† (cid:107)2 F , s.t U(t)†U(t) = I,(cid:107) W (t) (cid:107)2= 1, (3.5) max U(t),W (t) where U(t) is the basis along the first and second modes and W (t) corresponds to the weighting vector across time. Since the networks are undirected, the components along the first two modes are the same. The optimization in (3.5) can be solved by higher-order orthogonal iteration (HOOI), where in each iteration either the matrix U(t) or the vector W (t) is fixed to optimize the other one. This approach is summarized in the pseudo code in Algorithm 3.2. In this approach, we ini- 53 tialize the eigenvectors matrix using HOSVD for the initial tensor and then update this estimate using HOOI. For the remaining time points, in order to reduce the computational cost, the eigen- vectors matrix at time point t is initialized using the estimated optimal eigenvectors matrix from the preceding time point (t − 1), U(t−1). N , . . . ,A(M) Algorithm 3.2 3D-Windowed Tensor Approach (3D-WTA) Input: A(1) Output: Clustering Labels, Change points. 1: for t = L : M do 2: N ,L. Construct a similarity tensor , . . . ,A(t) X (t) = . 
N Unfold the tensor along mode 3 to get X(t) (3). if t = L then N (cid:104) A(t−L+1) (cid:105) else Initialize U(t) Initialize U(t) 0 by HOSVD of X (t). 0 by Ut−1. 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: end if i = 0 Calculate W (t) Compute ¯A = Obtain U(t) if (cid:107) U(t) i+1 as the dominant left singular vector of X(t) (3) t ∑ l=t−L+1 (W (t) i+1)lA(l) N . i+1 by eigenvalue decomposition of ¯A. F > ε then i (cid:107)2 i+1 − U(t) i = i + 1. goto line 10. 13: 14: 15: 16: 17: 18: 19: end for 20: Track the change points using (3.11). end if Normalize the columns of U(t). Apply k−means on U(t) to obtain clustering labels. (cid:16) i ⊗ U(t) U(t) i (cid:17) . 3.3.3 Running Time Tensor Approach (RTTA) (cid:104) N ···A(t) A(1) N (cid:105) In this approach, the tensor at time point t is built by taking all history into account, where X (t) = . This approach estimates the optimal eigenspace that represents the network at time 54 t in a similar fashion to 3D-WTA. The main differences are in the tensor size and initialization of the matrix U(t). This approach uses a window size that grows with time. Consequently, the tensor constructed at each time point has a different size. Moreover, at each time point the initial eigenvectors matrix is an updated version of the previous one. This update was motivated by the fact that the tensor at time t is the same as the tensor at time (t − 1) except for one time slice. Therefore, the matrix U(t) is adaptively updated based on the new adjacency matrix. We used TrackU algorithm [106, 107] to update the eigenvectors matrix. The RTTA is described in Algorithm 3.3, and the eigenvectors matrix update is described in Algorithm 3.4. N . . .A(M) Algorithm 3.3 Running Time Tensor Approach (RTTA) Input: A(1) N , starttime, f orgetting f actor = β . Output: Clustering Labels, Change points. 1: for t = starttime : M do 2: N . . .A(t) A(1) Construct a similarity tensor X (t) = N Unfold the tensor along mode 3 to get X(t) (3). if t = starttime then (cid:104) (cid:105) . 0 by HOSVD of X (t) 1 . 0 =TrackU(U(t−1),S(t−1),A(t) N ,β ). 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: Initialize U(t) else Initialize U(t) end if i = 0 Calculate W (t) Compute ¯A = (cid:16) i ⊗ U(t) U(t) i (cid:17) . i+1 as the dominant left singular vector of X(t) (3) t ∑ l=1 (W (t) i+1)lA(l) N . i+1 by eigenvalue decomposition of ¯A as, ¯A = U(t) F > ε then i+1S(t) i+1 (cid:16) U(t) i+1 (cid:17)† . Obtain U(t) if (cid:107) U(t) i+1 and S(t) i (cid:107)2 i+1 − U(t) i = i + 1. goto Line 10. 13: 14: 15: 16: 17: 18: 19: end for 20: Track the change points using the normalized cost function given in (3.11). end if Normalize the columns of U(t). Apply k-means on U(t) to obtain clustering labels. 55 Algorithm 3.4 TrackU Input: Eigenvectors matrix: U, Eigenvalues matrix: S, Adjacency matrix at time point t: A(t) N , forgetting factor: β . Output: Updated eigenvectors matrix. 1: CA =size(AN,2), CU =size(U,2), s=diag(S). 2: for i = 1 : CA do ¯b = AN(:,i). 3: for j = 1 : CU do 4: 5: ¯b j. y j = u† j s(t) j = β s j + y2 j. Compute the error: e j = ¯b j − y ju j. u j = u j + 1 y je j. s j ¯b j+1 = ¯b j − y ju j. end for 9: 10: 11: end for 12: Orthogonalize U using Gram-Schmidt. 6: 7: 8: 3.3.4 4D-Windowed Tensor Approach (4D-WTA) In the case of four dimensional networks, where graphs are built over time and across multi-layer or multi-subjects, we want to track the changes in the community structure at time t across all layers based on the past history. 
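Before moving to the four-dimensional extension, the following NumPy sketch illustrates the alternating updates at the core of 3D-WTA. It is a simplified, illustrative version only: HOSVD initialization, the column normalization of U and the subsequent k-means step of Algorithm 3.2 are omitted, and taking the absolute value of the weight vector to resolve the SVD sign ambiguity is an added assumption.

import numpy as np

def wta_step(slices, U):
    """One alternating update: refresh the time weights w, then the subspace U.
    `slices` is a list of L normalized adjacency matrices; U is n x r with orthonormal columns."""
    # Mode-3 unfolding of the tensor: each row is one vectorized adjacency matrix.
    X3 = np.stack([A.ravel() for A in slices])            # L x n^2
    # Project onto the current subspace, then take the dominant left singular vector as w.
    M = X3 @ np.kron(U, U)                                # L x r^2
    w = np.abs(np.linalg.svd(M, full_matrices=False)[0][:, 0])
    # Weighted average of the slices; its top-r eigenvectors become the new U.
    A_bar = sum(wl * Al for wl, Al in zip(w, slices))
    vals, vecs = np.linalg.eigh(A_bar)
    r = U.shape[1]
    return vecs[:, -r:], w                                # eigenvectors of the largest eigenvalues

def wta_subspace(slices, r, n_iter=20):
    """Iterate the updates from a random orthonormal start (HOSVD initialization omitted)."""
    n = slices[0].shape[0]
    U = np.linalg.qr(np.random.default_rng(0).standard_normal((n, r)))[0]
    for _ in range(n_iter):
        U, w = wta_step(slices, U)
    return U, w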
In a similar fashion to the 3D case, we can build a weighted average of the adjacency matrices within a time window of length L across all layers and time, then optimize the conventional spectral clustering cost function over this weighted average. This results in an optimization over the subspace vectors matrix, U(t), the weighting vector across layers, W (t) s , and the weighting vector over time points within a window, W (t) w . The optimization problem in Eq. (3.3) can then be rewritten as, (cid:32) K L (cid:107) U(t)† ∑ max U(t),W (t) k=1 s s.t U(t)†U(t) = I, W (t) s ,W (t) w w(t) sk ∑ wl A(l,k) w(t) w ≥ 0, (cid:107) W (t) , W (t) l=1 N U(t) (cid:107)2 F , (cid:107)2= 1, (cid:107) W (t) w (cid:107)2= 1, (3.6) (cid:33) s where w(t) wl are the weights for the adjacency matrices within a time window of length L, w(t) sk are 56 the weights across layers and K is the total number of layers. In the proposed 4D-WTA, we build a fixed size fourth order tensor, X (t) ∈ Rn×n×L×K = slice of the tensor X (t) is the adjacency matrix A(l,k) at the time point l for subject k. (3.6) can be , where the (l,k)th , . . . ,A(t,1:K) N N N (cid:104) A(t−L+1,1:K) (cid:105) reformulated in terms of Tucker decomposition as follows, (cid:107) X (t) ×1 U(t)† ×2 U(t)† ×3 W (t) w † ×4 W (t) s † (cid:107)2 F , (3.7) max U(t),W (t) s s.t U(t)†U(t) = I, (cid:107) W (t) ,W (t) w w (cid:107)2= 1, (cid:107) W (t) s (cid:107)2= 1, where U(t) is the basis along the first and second modes, W (t) w corresponds to the weighting vector across time and W (t) s is the weighting vector across layers. The components along the first two modes are the same, since the networks are undirected. The problem in (3.7) can be solved by HOOI. This approach is summarized in the pseudo code in Algorithm 3.5. In this approach, the pro- jection matrices for the first tensor are initialized using HOSVD and then updated using HOOI. For the remaining time points, the eigenvectors matrix, U(t), and the weighting vector over time, W (t) w , are initialized from the estimated optimal eigenvectors matrix and time weighting vector at the preceding time point (t − 1), U(t−1) and W (t−1) , respectively. w 3.3.5 Tracking the Community Structure Over Time To track the changes in the community structure using the proposed approaches, we define a cost function that quantifies the quality of the cluster assignment. For 4D-WTA, this cost function is defined as (cid:107) X (t) ×1 U† ×2 U(t)† ×3 W (t) † (cid:107)F and satisfies the following inequality, † ×4 W (t) w s Cost = (cid:107) X (t) ×1 U(t)† ×2 U(t)† ×3 W (t) w (cid:107)2(cid:107) W (t) ≤ (cid:107) X (t) (cid:107)F(cid:107) U(t) (cid:107)2 2(cid:107) W (t) † ×4 W (t) (cid:107)2, w s s † (cid:107)F (3.8) 57 N N ,L. , . . . ,A(M,1:K) Algorithm 3.5 4D Windowed Tensor Approach (4D-WTA) Input: A(1,1:K) Output: Clustering Labels, Change points. 1: for t = L : M do 2: Construct a similarity tensor X (t) = Unfold the tensor along mode 3 and mode 4 to get X(t) if t = L then (cid:104) A(t−L+1,1:K) , . . . ,A(t,1:K) (cid:105) N N . Initialize U(t) Initialize U(t) 0 and W (t) 0 by U(t−1) and W (t) w,0 by HOSVD of X(t). w,0 by W (t−1) w . (3) and X(t) (4), respectively. s,i+1 as the dominant left singular vector of X(t) (4) w,i+1 as the dominant left singular vector of X(t) (3) (cid:16) (cid:16) (cid:17) i ⊗W (t) . i ⊗ U(t) w,i i (cid:17) . 
3.3.5 Tracking the Community Structure Over Time

To track the changes in the community structure using the proposed approaches, we define a cost function that quantifies the quality of the cluster assignment. For 4D-WTA, this cost function is defined as \| X^{(t)} \times_1 U^{(t)\dagger} \times_2 U^{(t)\dagger} \times_3 W_w^{(t)\dagger} \times_4 W_s^{(t)\dagger} \|_F and satisfies the following inequality,

\text{Cost} = \| X^{(t)} \times_1 U^{(t)\dagger} \times_2 U^{(t)\dagger} \times_3 W_w^{(t)\dagger} \times_4 W_s^{(t)\dagger} \|_F
\le \| X^{(t)} \|_F \, \| U^{(t)} \|_2^2 \, \| W_w^{(t)} \|_2 \, \| W_s^{(t)} \|_2,   (3.8)

which can be further simplified to,

\text{Cost} = \| X^{(t)} \times_1 U^{(t)\dagger} \times_2 U^{(t)\dagger} \times_3 W_w^{(t)\dagger} \times_4 W_s^{(t)\dagger} \|_F \le \| X^{(t)} \|_F,   (3.9)

since \| W_w^{(t)} \|_2 = 1, \| W_s^{(t)} \|_2 = 1 and \| U^{(t)} \|_2^2 = \lambda_{\max}(U^{(t)\dagger}U^{(t)}), where \lambda_{\max} is the largest eigenvalue of U^{(t)\dagger}U^{(t)}, which equals 1 since U^{(t)\dagger}U^{(t)} = I. Therefore, dividing both sides of (3.9) by \| X^{(t)} \|_F yields a normalized cost function that always lies between 0 and 1,

\text{Cost}_{4D,\text{Normalized}} = \frac{\| X^{(t)} \times_1 U^{(t)\dagger} \times_2 U^{(t)\dagger} \times_3 W_w^{(t)\dagger} \times_4 W_s^{(t)\dagger} \|_F}{\| X^{(t)} \|_F}.   (3.10)

Similarly, we define the normalized cost function used to detect the changes in the community structure with 3D-WTA and RTTA as,

\text{Cost}_{3D,\text{Normalized}} = \frac{\| X^{(t)} \times_1 U^{(t)\dagger} \times_2 U^{(t)\dagger} \times_3 W^{(t)\dagger} \|_F}{\| X^{(t)} \|_F}.   (3.11)

Changes in the normalized cost functions in Eqs. (3.10) and (3.11) reflect changes in the community structure. Both cost functions are computed based on the detected community structure at each time point and quantify the agreement between the detected community structure and the underlying network structure. The change points are then determined using a change detection algorithm [108], [109]. In this algorithm, the normalized cost function is partitioned by choosing random change points. The total deviation of all points within each partition from the empirical mean is calculated. A total residual error is then obtained by summing these deviations across all partitions. These steps are repeated by varying the locations of the change points until the residual error is minimized.

In 3D-WTA and RTTA, the number of clusters at each time point t is determined by repeating the algorithm for different numbers of clusters, usually 2-10, and the number of clusters that maximizes the asymptotical surprise metric is taken as the best representation of the underlying network. Similarly, in 4D-WTA the algorithm is repeated over a different number of clusters. The asymptotical surprise of the adjacency matrix for each layer and time point is then computed and averaged across all layers. Finally, the number of clusters that maximizes the average asymptotical surprise is selected as the number of clusters for all layers at that time point.
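The normalized cost functions in Eqs. (3.10) and (3.11) amount to projecting the tensor onto the estimated factors and dividing by its Frobenius norm. The short NumPy sketch below is ours, for illustration only, and assumes the factors have already been estimated.

```python
import numpy as np

def normalized_cost_4d(X, U, w_w, w_s):
    """Normalized cost of Eq. (3.10): project the 4-way tensor X (n x n x L x K)
    onto the subspace U and the weight vectors w_w, w_s, then divide by the
    tensor norm so the value lies in [0, 1]."""
    core = np.einsum('ijlk,ia,jb,l,k->ab', X, U, U, w_w, w_s)
    return np.linalg.norm(core) / np.linalg.norm(X)

def normalized_cost_3d(X, U, w):
    """Normalized cost of Eq. (3.11) for a 3-way tensor (n x n x L)."""
    core = np.einsum('ijl,ia,jb,l->ab', X, U, U, w)
    return np.linalg.norm(core) / np.linalg.norm(X)
```

Tracking the sequence of these values over t and feeding it to the change detection procedure described above yields the change points.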
3.4 Results

3.4.1 Simulated Temporal Networks

3.4.1.1 Description of the Data Sets

Four simulated data sets with different community structures and different noise levels are generated to validate the performance of the proposed 3D-WTA. For each data set, a dynamic network is generated for 60 time points with 100 nodes. Intra- and inter-cluster edges are selected from a truncated Gaussian distribution in the range [0,1]. The parameters and the details of the community structures for these data sets are given in Table 3.1. The SNR of each network is defined as the ratio of the total power of intra-cluster edges to the total power of inter-cluster edges and can be calculated as:

\text{SNR} = 10 \log_{10} \frac{\frac{1}{t_{M'} - t_m + 1} \sum_{t=t_m}^{t_{M'}} \sum_{i,j \in \text{intra}} \big(A_{ij}^{(t)}\big)^2}{\frac{1}{t_{M'} - t_m + 1} \sum_{t=t_m}^{t_{M'}} \sum_{i,j \in \text{inter}} \big(A_{ij}^{(t)}\big)^2},   (3.12)

where t_m and t_{M'} refer to the start and end time points in the network, respectively, and the term (t_{M'} − t_m + 1) is the total number of time points in the network.

In order to introduce variability across the different data sets, we have further varied the structure of the networks for data sets 1 and 2 during the intervals 21-40 and 1-20, respectively. For these networks and time intervals, the large cluster with 60 nodes is generated such that the nodes close to the diagonal of the cluster have stronger connections than the nodes on the off-diagonal. In particular, all the edges close to the diagonal of the cluster are randomly selected from a truncated Gaussian distribution with µintra and σintra, while 60% of the edges on the off-diagonal of the cluster are selected from a truncated Gaussian distribution with µinter and σinter and the remaining edges are selected from the distribution with µintra and σintra.
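Assuming the community assignment is fixed within the interval over which the SNR is evaluated, Eq. (3.12) can be computed along the following lines (an illustrative sketch with our own naming, not code from the thesis):

```python
import numpy as np

def snr_db(A_seq, labels):
    """SNR of a simulated dynamic network as in Eq. (3.12): ratio of the total
    power of intra-cluster edges to that of inter-cluster edges, averaged over
    time. A_seq has shape (T, n, n); labels holds the ground-truth community
    of each node, assumed constant over the T time points considered."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]          # intra-cluster mask
    off_diag = ~np.eye(labels.size, dtype=bool)
    intra = same & off_diag
    inter = ~same
    A_seq = np.asarray(A_seq, dtype=float)
    p_intra = (A_seq[:, intra] ** 2).sum(axis=1).mean()  # average intra power
    p_inter = (A_seq[:, inter] ** 2).sum(axis=1).mean()  # average inter power
    return 10.0 * np.log10(p_intra / p_inter)
```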
Table 3.1: Parameters of the truncated Gaussian distribution used to generate the edges of the simulated networks for the four data sets, and the ground-truth community membership of the nodes.
Data set 1:
  Time points 1-20: (µintra, σintra, µinter, σinter) = (0.7, 0.4, 0.4, 0.2), SNR = −1.69 dB, 4 clusters: C1(1−30), C2(31−60), C3(61−80), C4(81−100).
  Time points 21-40: (0.5, 0.3, 0.2, 0.2), SNR = 3.32 dB, 3 clusters: C1(1−60), C2(61−80), C3(81−100).
  Time points 41-60: (0.7, 0.4, 0.4, 0.2), SNR = −1.77 dB, 4 clusters: C1(1−30), C2(31−60), C3(61−80), C4(81−100).
Data set 2:
  Time points 1-20: (0.5, 0.4, 0.3, 0.2), SNR = 1.839 dB, 3 clusters: C1(1−60), C2(61−80), C3(81−100).
  Time points 21-40: (0.5, 0.2, 0.2, 0.2), SNR = −1.289 dB, 5 clusters: C1(1−20), C2(21−40), C3(41−60), C4(61−80), C5(81−100).
  Time points 41-60: (0.5, 0.4, 0.3, 0.2), SNR = −0.643 dB, 4 clusters: C1(1−40), C2(41−60), C3(61−80), C4(81−100).
Data set 3:
  Time points 1-20: (0.6, 0.4, 0.2, 0.2), SNR = 1.233 dB, 4 clusters: C1(1−30), C2(31−60), C3(61−80), C4(81−100).
  Time points 21-40: (0.4, 0.2, 0.2, 0.1), SNR = 0.1628 dB, 5 clusters: C1(1−30), C2(31−45), C3(46−60), C4(61−80), C5(81−100).
  Time points 41-60: (0.6, 0.4, 0.2, 0.2), SNR = −1.191 dB, 6 clusters: C1(1−15), C2(16−30), C3(31−45), C4(46−60), C5(61−80), C6(81−100).
Data set 4:
  Time points 1-20: (0.8, 0.5, 0.3, 0.2), SNR = 0.0339 dB, 4 clusters: C1(1−30), C2(31−60), C3(61−80), C4(81−100).
  Time points 21-40: (0.6, 0.2, 0.2, 0.3), SNR = −4.541 dB, 8 clusters: C1(1−15), C2(16−30), C3(31−45), C4(46−60), C5(61−70), C6(71−80), C7(81−90), C8(91−100).
  Time points 41-60: (0.8, 0.5, 0.3, 0.2), SNR = 0.0388 dB, 4 clusters: C1(1−30), C2(31−60), C3(61−80), C4(81−100).

Figure 3.1: Detected number of clusters (DNOC) by 3D-WTA for the data sets in Table 3.1 using different metrics, including: eigengap (EG), modularity (Q), multi-resolution modularity (Qγ) and asymptotical surprise (AS).

3.4.1.2 Comparison of different quality metrics

For 3D-WTA, a 3-mode tensor, X^(t) ∈ R^{100×100×4}, is constructed at each time point, and 3D-WTA is applied with a window length of 4. For each data set, the networks are simulated 100 times. In order to choose the best quality metric for determining the number of clusters, the different metrics discussed in Section 3.2.1 are employed. The detected number of clusters (DNOC) for the four different data sets by 3D-WTA using the eigengap criterion (EG) with δ calculated as suggested in [100], modularity (Q) [11], multi-resolution modularity (Qγ) with γ calculated as suggested in [103] and asymptotical surprise (AS) [90] are presented in Fig. 3.1.

As can be seen in Fig. 3.1, the best quality metric in determining the number of clusters over time is asymptotical surprise. With the EG criterion, the number of clusters is detected more accurately when the networks have a high SNR and large clusters with all the nodes strongly connected within these clusters. On the other hand, the modularity metric performs better than the multi-resolution modularity for the different data sets. However, both fail to detect the number of clusters correctly when the community structure is noisy with small clusters. Consequently, asymptotical surprise is selected to quantify the quality of the detected community structure and to determine the number of clusters in the rest of this chapter.

3.4.1.3 Comparison between 3D-WTA and state-of-the-art algorithms

The performance of the proposed approach is compared to other state-of-the-art community detection algorithms, including AFFECT [33], normalized cut PCM (PCM-NC), normalized cut PCQ (PCQ-NC) [32], temporal clustering (TC) [78] and generalized Louvain (GenLov) for multi-layer modularity maximization with the value of γ calculated as described in [103], using the variation of information metric. The TC framework is implemented for the different data sets with different numbers of generative models and the number that detects the best community structure is presented (5, 7, 8 and 7 are the numbers of generative models that performed best for data sets 1, 2, 3 and 4, respectively). The GenLov algorithm detected the correct community structure best with γ = 1. A range of coupling coefficients has been tested with GenLov and the best performance is achieved with coupling coefficient ω = 1 [35], [110]. The number of clusters at each time point for 3D-WTA, AFFECT, PCM-NC and PCQ-NC is determined as the number that maximizes the asymptotical surprise metric. On the other hand, the ground truth community structures are given to the TC algorithm a priori to evaluate the community structure over time.

As can be seen from Fig. 3.2 (a)-(d), the normalized cost function reflects the changes in the networks, where change points are detected around time points 20 and 40. The proposed algorithm is able to detect the correct number of clusters over time better than the other methods, as shown in Fig. 3.2 (e)-(h). The variation of information scores achieved by the different methods are given in Fig. 3.2 (i)-(l). The results indicate that the proposed algorithm outperforms the other algorithms in terms of detecting the correct community structure over time and can correct itself faster to capture the true community structure after the change points. As seen in Fig. 3.2 (e)-(l), the performance of AFFECT, PCM, PCQ and GenLov declines with increasing noise level and decreasing cluster size.
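Since the variation of information (VI) is the main figure of merit in these comparisons, a small self-contained sketch of the metric is included below for reference; it uses the natural logarithm, and the values plotted in the figures may additionally be normalized, which is not shown here.

```python
import numpy as np

def variation_of_information(labels_a, labels_b):
    """Variation of information between two partitions of the same node set.
    Lower is better; 0 means the partitions are identical."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    n = labels_a.size
    vi = 0.0
    for a in np.unique(labels_a):
        in_a = labels_a == a
        p_a = in_a.sum() / n
        for b in np.unique(labels_b):
            n_ab = np.logical_and(in_a, labels_b == b).sum()
            if n_ab == 0:
                continue
            p_ab = n_ab / n
            p_b = (labels_b == b).sum() / n
            # VI = H(A|B) + H(B|A), accumulated over the joint distribution
            vi -= p_ab * (np.log(p_ab / p_a) + np.log(p_ab / p_b))
    return vi
```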
3.4.1.4 Comparison between 3D-WTA and 4D-WTA

In order to compare 3D-WTA with 4D-WTA, a simulated dynamic network for 60 time points with 64 nodes across 20 layers is generated. The edges of the graphs of the simulated dynamic network are taken from a truncated Gaussian distribution in the range [0,1], with the specifications shown in Table 3.2. For evaluating 3D-WTA, the networks at each time point are averaged across layers. A 3-mode tensor, X^(t) ∈ R^{64×64×4}, is constructed at each time point, with a window length of 4. Moreover, 4D-WTA is applied to the same dynamic network, where a 4-mode tensor, X^(t) ∈ R^{64×64×4×20}, is constructed at each time point, with a window length of 4. This experiment is repeated 100 times. A comparison between the community structure detected by 3D-WTA and 4D-WTA and the network's ground truth in terms of variation of information is shown in Fig. 3.3. The variation of information is calculated by comparing the clustering results with the ground truth for each layer separately and then averaged across all layers. The results show that 4D-WTA performs better than 3D-WTA in detecting the community structure over time. This indicates that 4D-WTA captures the true community structure better than 3D-WTA since it captures the variations across layers, whereas 3D-WTA loses this variability due to averaging the networks across layers.

Figure 3.2: Comparison between 3D-WTA and state-of-the-art algorithms for the different data sets in Table 3.1: (a)-(d) The normalized cost function calculated by 3D-WTA, (e)-(h) The detected number of clusters (DNOC) for the simulated networks using different algorithms. 3D-WTA estimated the correct number of clusters over time and the normalized cost function reflected the correct change points, represented by the pink dashed lines. (i)-(l) Comparison of variation of information (VI) between 3D-WTA, AFFECT, PCM-NC, PCQ-NC, TC and GenLov for the simulated networks. 3D-WTA scored the lowest variation of information over time compared to the other methods.

Figure 3.3: Variation of information (VI) comparison between the community structure detected by 3D-WTA and 4D-WTA and the ground truth for simulated data.
Table 3.2: Parameters of the truncated Gaussian distribution used to generate the edges of the simulated networks for the 4D simulated data, and the ground truth of the nodes' community membership for the generated network.
Layers 1−5:
  Time points 1-20: (µintra, σintra, µinter, σinter) = (0.2, 0.1, 0.1, 0.1), clusters: C1(1−22), C2(23−44), C3(45−64).
  Time points 21-40: (0.2, 0.2, 0.1, 0.1), clusters: C1(1−22), C2(23−64).
  Time points 41-60: (0.2, 0.2, 0.1, 0.1), clusters: C1(1−22), C2(23−44), C3(45−64).
Layers 6−15:
  Time points 1-20: (0.4, 0.1, 0.1, 0.1), clusters: C1(1−22), C2(23−44), C3(45−64).
  Time points 21-40: (0.2, 0.1, 0.1, 0.1), clusters: C1(1−22), C2(23−64).
  Time points 41-60: (0.1, 0.1, 0.1, 0.1), clusters: C1(1−22), C2(23−44), C3(45−64).
Layers 16−20:
  Time points 1-20: (0.2, 0.1, 0.1, 0.1), clusters: C1(1−25), C2(26−49), C3(50−64).
  Time points 21-40: (0.2, 0.1, 0.1, 0.1), clusters: C1(1−25), C2(26−64).
  Time points 41-60: (0.4, 0.1, 0.1, 0.1), clusters: C1(1−25), C2(26−49), C3(50−64).

3.4.1.5 Flocks of Boids Dataset

This data set represents the natural flocking behavior of birds [33], [111]. The data set is represented by a dynamic network across 40 time points. The network at each time point consists of 100 boids and the community structure changes over time by changing the flock membership of one boid at each time step. The adjacency matrices across a window at time point t were represented by a 3-mode tensor. The proposed 3D-WTA algorithm was applied to this dataset to detect the community structure and track the changes over time. The number of clusters at each time point was detected using modularity, and the normalized cost function in (3.11) was used to track the changes in the community structure over time. The boids are arranged into 4 clusters until time point 18, then they rearrange into 2 clusters until the end. The change points and the number of clusters shown in Fig. 3.4(a) were detected using 3D-WTA with L = 8. In Fig. 3.4(b), the Rand index of the proposed algorithm and of other state-of-the-art evolutionary clustering methods are compared. The results show that our algorithm performs almost the same as AFFECT and better than PCQ and PCM in terms of Rand index. On the other hand, the proposed algorithm performs better than the existing methods in terms of detecting the changes in community structure and the number of clusters over time. Moreover, the proposed algorithm is robust over multiple runs with respect to tracking changes and the number of clusters over time.

3.4.2 MIT Reality Mining Dataset

This data set represents the cell phone activity of 94 students and staff at the MIT Media Laboratory. It was collected between August 2004 and June 2005 [93]. The data set is represented by a dynamic network across 46 time points, where each time point refers to a week during the school year. Each time point was represented by a 3-mode tensor. Both algorithms were applied to this data to track the changes over time. We assumed that there were two dominant clusters at each time point, and we used the normalized cost function in (3.11) to track the changes in the community structure over time. The detected change points correspond to the end of the Fall semester, the start of the Spring semester, spring break and the end of the Spring semester. The change points shown in Fig. 3.5 were detected by the WTA with α = 11.
3.4.3 Application to dFCN From EEG Dataset

The proposed tensor tracking approach is applied to a set of connectivity graphs constructed from EEG data containing the error-related negativity (ERN). The ERN is a brain potential response that occurs following performance errors in a speeded reaction time task, usually 25-75 ms after the response. Previous work [112] indicates that there is increased coordination between the lateral prefrontal cortex (lPFC) and medial prefrontal cortex (mPFC) within the theta frequency band (4-8 Hz) and the ERN time window. EEG data from 63 channels was collected in accordance with the 10/20 system on a Neuroscan Synamps2 system (Neuroscan, Inc.), sampled at 128 Hz, from 91 subjects. The task was a common speeded-response letter (H/S) flanker task, where error response-locked trials from each subject were utilized. The EEG data are pre-processed by the spherical spline current source density (CSD) waveforms to sharpen event-related potential (ERP) scalp topographies and reduce volume conduction [113].

We constructed average connectivity networks at each time point across subjects to represent the adjacency matrix of the connectivity graphs. Individual connectivity networks were constructed by using a previously introduced phase synchrony measure [114]. This results in a dynamic network with 63 nodes and 256 time points. We applied the proposed RTTA to detect the community structure over time. Starting from time point 30, a 3-mode tensor was built at each time point. The change points were detected by the normalized cost function in (3.11), as shown in Fig. 3.6. As can be seen in Fig. 3.6, there is an increase in the cost function around −78 ms, right before the response time. Similarly, there is a decrease in the cost function around 400 ms, corresponding well with the end of the ERN waveform. The number of clusters was determined by the eigengap criterion and the community structures detected for these three time periods are given in Fig. 3.7. While the community structures for the pre- and post-ERN intervals are similar, there is increased segregation during the ERN, especially in the frontal-central brain regions. This is aligned with the increased coordination between lPFC and mPFC during this time.

3.4.4 Application to dFCN From rs-fMRI Dataset

3.4.4.1 Data Preprocessing and Functional Connectivity Network Construction

The rs-fMRI dataset used in this chapter is the Bangor rs-fMRI dataset, which is part of the 1000 Functional Connectomes Project [115]. The dataset includes 20 healthy subjects (males, aged between 19−38). The rs-fMRI data was collected using a 3T magnet scanner from all subjects while their eyes were open. The TR equals 2 s, yielding 34 slices and 265 time points. The pre-processing was done using the SPM12 [116] and CONN [117] functional connectivity toolboxes. The structural images were registered to the mean functional image and spatially normalized to MNI space for each subject. Slice timing and motion correction were performed on every functional image. The functional images were regionally parcellated by the automated anatomical labeling (AAL) atlas [118] and smoothed by spatial convolution with a 4-mm FWHM Gaussian kernel. Confounds such as motion parameters obtained from realignment and BOLD signals obtained from white matter and CSF masks were regressed out, and band-pass (0.0084-0.09 Hz) temporal filtering was applied to the functional images of each subject.
3.4.5 Dynamic Functional Connectivity Network Construction

The dynamic functional connectivity networks are constructed for each subject by computing the Pearson correlation coefficient, ρ, between different ROIs (90 ROIs: 45 ROIs in the left and the other 45 in the right hemisphere, listed in Table 1) using a sliding window approach with a rectangular window of length equal to 60 TRs and an overlap of 1 TR [60], [55], [56]. For each subject, a sequence of weighted and undirected graphs {A^(t)}_{t_1,...,t_M} is constructed with A^(t)_{ij} = ρ^(t)_{ij}, where A^(t) ∈ R^{n×n} is the adjacency matrix representing the network at time step t, n is the total number of nodes (ROIs) and M is the total number of time points.

3.4.5.1 Resting State Networks (RSNs)

In order to study the brain dynamics over time, our analysis focuses on the temporal dynamics of functional connectivity communities. The detected communities will be evaluated with respect to well-known resting state networks, including the default mode network (DMN), auditory network (AN), visual network (VN), somatomotor network (SMN), bilateral limbic network (BiN), subcortical network (SCN) and cognitive control network (CCN), as summarized in Table 3.3 [55]. Each subnetwork is defined by a group of ROIs that are commonly categorized to be part of these networks. The purpose of applying the proposed approach to the dFCN is to determine the functional organization of the brain and evaluate the obtained functional communities with respect to these commonly defined resting state networks.

Table 3.3: The ROIs belonging to the different subnetworks in resting state. The numbers in the table refer to the ROIs defined in Table 1.
  VN: (43−56)
  AN: (29, 30, 79−82)
  BiN: (21, 22, 27, 28, 37−42, 83, 84, 87, 88)
  DMN: (3−6, 23−26, 31−36, 65−68, 85, 85)
  SMN: (1, 2, 17−20, 57−60, 63, 64, 69, 70)
  SCN: (71−78)
  CCN: (61, 62, 89, 90, 7−16)

3.4.5.2 Detecting the Community Structure in rs-fMRI

The proposed community detection approaches were applied to dFCNs constructed from the rs-fMRI dataset described in Section 4.2.2.1. 90 ROIs were extracted after the preprocessing procedure for each subject. Networks built using Pearson correlation are positive semi-definite and consequently can be used as A_N in (3.2) and (3.3). 3D-WTA is applied to each subject separately to track and detect the community structure over time and to analyze variability across subjects. At each time point, a 3-mode tensor is constructed using a sliding window with a length of 30, X^(t) ∈ R^{90×90×30}. For each subject, a total of 177 tensors are constructed. The constructed tensors are approximated by Tucker decomposition as X^(t) ≈ C ×1 U ×2 U ×3 W, where C ∈ R^{r×r×1} is the core tensor, U ∈ R^{90×r} is the orthogonal low-rank component matrix along the connectivity mode and W ∈ R^{30×1} is the weighting vector across the time mode. r is the rank, which corresponds to the number of clusters. The number of clusters at each time point is determined using the asymptotical surprise metric and varied between 2 and 10 clusters.

For the application of 4D-WTA to rs-fMRI, the fact that not all subjects have the same community structure within the same time window is taken into account. Recent studies in the dynamic rs-fMRI literature have revealed groups of reproducible patterns of dFCNs, named FC states [55], [119]. As a result, the possibility that dFCNs across subjects may cycle through a small number of distinct FC states over time has been hypothesized and investigated [56], [71], [120], [121].
Following what has been proposed in the literature, 4D-WTA is applied to the rs-fMRI data to extract common network connectivity patterns, or connectivity states, across time and subjects. The main difference between our approach and what is proposed in the literature is that the states are identified through their community structures and not just through co-activation patterns. It is also important to note that the goal of 4D-WTA is to achieve data reduction rather than to track the dynamics of FC within each subject, unlike 3D-WTA. Following the preliminary steps shown in Fig. 3.8, the application of 4D-WTA to the rs-fMRI can be summarized as:

1. Subsample 30 random time points across all subjects, similar to [55], and form a 4D tensor, X^(sample) ∈ R^{90×90×30×20}. This subsampling process was repeated 200 times.
2. Apply 4D-WTA to detect the community structure of the tensor constructed from this random sampling across subjects and time. The tensor corresponding to each random subsampling is approximated by Tucker decomposition following lines 4-16 of Algorithm 3.5.
3. Determine the number of clusters for each subsampling using asymptotical surprise.
4. Use U from Eq. (3.7) to obtain the common community structure of the subsampling.
5. Determine the similarity between the community structures from different subsamples using the variation of information metric and construct a 200×200 similarity matrix.
6. Apply k-means clustering to the similarity matrix from Step (5) to identify the samples with similar community structure, i.e., the network states.

After performing Step 2, the optimized weights for the different time points within a time window are found to be very close to each other, with a mean of 0.1824 and a standard deviation of 0.0074. However, the optimized weights across subjects show more variation, with a mean of 0.2155 and a standard deviation of 0.0597. This is consistent with the variations of functional connectivity networks across subjects. The distributions of the weights across time and subjects for all samples are shown in Fig. 3.9(a) and (b), respectively. From Fig. 3.9, it can be noticed that the distributions are highly concentrated, indicating that all time points and subjects contributed almost equally to the clustering results.

In the application of 4D-WTA to the rs-fMRI, the detected number of clusters for the different subsamplings varied between 4 and 10. After applying k-means clustering to identify the samples with similar community structure, 5 clusters that correspond to the FC states shown in Fig. 3.10 are obtained. The optimal number of states is determined using the elbow criterion. The community structure that corresponds to each FC state is determined as the community structure of the centroid of each cluster found by k-means. The community structure corresponding to the detected FC states is compared to the dynamic functional connectivity network of each subject. For each subject and at each time point, the state that best represents that time point in terms of the asymptotical surprise metric is assigned as the community structure at that time point. Finally, a state transition profile that tracks the community structure is obtained for each subject. The obtained state transition profiles for the different subjects followed different patterns, which reflects the variation in brain activity across subjects during resting state, as can be seen in Fig. 3.11.
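A schematic sketch of Steps 1, 5 and 6 of this pipeline is given below. Here `detect_communities` and `partition_distance` are placeholders for the 4D-WTA routine (Steps 2-4, Algorithm 3.5) and the variation of information; they are assumptions of this illustration rather than functions defined in the thesis.

```python
import numpy as np
from sklearn.cluster import KMeans

def extract_states(dfcn, detect_communities, partition_distance,
                   n_samples=200, n_timepoints=30, n_states=5, seed=None):
    """Schematic version of the subsampling pipeline above. dfcn has shape
    (n_rois, n_rois, n_windows, n_subjects); detect_communities(X) should
    return one label per ROI for a subsampled tensor X, and
    partition_distance(a, b) a dissimilarity between two partitions."""
    rng = np.random.default_rng(seed)
    n_windows = dfcn.shape[2]
    partitions = []
    for _ in range(n_samples):
        idx = rng.choice(n_windows, size=n_timepoints, replace=False)
        partitions.append(detect_communities(dfcn[:, :, idx, :]))  # Steps 1-4
    # Step 5: pairwise dissimilarity between the subsample partitions
    D = np.zeros((n_samples, n_samples))
    for a in range(n_samples):
        for b in range(a + 1, n_samples):
            D[a, b] = D[b, a] = partition_distance(partitions[a], partitions[b])
    # Step 6: group subsamples whose community structures are similar
    state_of_sample = KMeans(n_clusters=n_states, n_init=10).fit_predict(D)
    return state_of_sample, partitions
```

As in the thesis, the clustering in Step 6 operates directly on the rows of the pairwise (dis)similarity matrix.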
In the next part of this subsection, the community structure corresponding to each FC state and the state transition profiles for the different subjects are discussed. Moreover, the dynamic behavior of the different RSNs is quantified through the consistency measure. Finally, a comparison between 3D-WTA and 4D-WTA in terms of tracking brain dynamics is presented.

3.4.5.3 Community structure of each FC state

After applying 4D-WTA to the rs-fMRI, the community structures of the FC states that are persistent across time and subjects are extracted, as shown in Fig. 3.13. First, the detected FC states are evaluated based on their frequency of occurrence across time and subjects. This is achieved by calculating the average occurrence rate (AR%) across time and subjects, as shown on the top of each state in Fig. 3.13. For each subject, the occurrence rate of each state is calculated as the number of times this state occurred divided by the total number of time points. The occurrence rate of each state across subjects is then averaged to get the AR%. As shown in Fig. 3.13(b), State 2 has the highest AR% at 25%. This state consists of one large cluster containing the DMN and BiN. Furthermore, there is another large cluster that reflects the connectivity between other RSNs, such as the SMN, AN and SCN. A similar state was observed previously in [55]. The other states, States 1 and 3−5, have AR% between 13% and 22%.

A major network of interest during resting state is the DMN, as this network is known to be activated during rest and deactivated during task-related events [122], [123]. Across the different states, we observe that different ROIs of the DMN are clustered together in some states and clustered with other RSNs in other states. For example, in States 4 and 5, the DMN module includes the main ROIs of the DMN, namely the posterior cingulate gyrus (PCG), precuneus (PCUN), anterior cingulate gyrus (ACG), angular gyrus (AG) and superior frontal gyrus, including the dorsolateral (SFGdor), medial (SFGmed) and medial orbital (ORBsupmed) parts. On the other hand, in State 2, the DMN is assigned to the same community with ROIs from the BiN, mainly the hippocampus (HIP) and parahippocampal gyrus (PHG). In fact, HIP and PHG are known to play an important role in long-term memory [124, 125], and previous work considered them as part of the DMN [126], [55]. On the other hand, in States 1 and 3, the PCUN, PCG, ACG, SFGmed and the middle frontal gyrus (MFG) from the CCN form a cluster, while the remaining ROIs of the DMN are clustered with other RSNs.
Furthermore, the VN cluster includes the ROIs of the VN and the superior parietal gyrus (SPG) across States 1, 2 and 5, and this agrees with the results reported in [128]. Another observation across the states is related to the dynamics of the SCN. As can be seen in Fig. 3.13(b)-(d), the ROIs of the SCN exhibit variation in their community membership. In State 1, a complete separation between subcortical and cortical ROIs is observed. Moreover, during this state the ROIs of the DMN join other ROIs from different RSNs to form small clusters. This observation suggests that State 1 represents a light sleep state, which is commonly observed in rs-fMRI, as in [55].

3.4.5.4 State transition profiles

In order to understand the dynamics of the FCNs, we studied the state transition profiles for the different subjects over time. These profiles reflect two important phenomena. First, the state transition profiles followed relatively different patterns over time, suggesting variation in brain activity across subjects during resting state. Moreover, for some subjects the brain may fluctuate between a small number of states and not necessarily occupy all the detected FC states. Second, we observed that the community structure did not fluctuate quickly between states over time. State transition profiles for selected subjects are shown in Fig. 3.11.

3.4.5.5 Consistency of the RSNs across time

In order to calculate the consistency of a subnetwork, we first calculated the recruitment coefficient defined in [129] for each ROI within the subnetwork. The ROI's recruitment coefficient is defined as the probability that this ROI is in the same cluster as the other ROIs of its own subnetwork at each time point. The subnetwork consistency is then calculated as the average of the recruitment coefficients of all ROIs that belong to the subnetwork.

After the application of 3D-WTA and 4D-WTA to the rs-fMRI FCNs, the consistency of the RSNs across time and subjects was computed to quantify their dynamic behavior. The variation in consistency across the different RSNs is shown in Fig. 3.12(a) for 3D-WTA and Fig. 3.12(d) for 4D-WTA. The VN, SCN, SMN and AN have relatively higher consistency across subjects, implying that the ROIs within each of these networks are strongly connected. In other words, these networks experience less variability over time. On the other hand, the DMN, CCN and BiN are less consistent over time, indicating that these RSNs are more dynamic. By comparing the variation in consistency calculated by 4D-WTA and 3D-WTA, we observe that the consistency results of 3D-WTA are less representative, since the variation in consistency across the different RSNs is very similar. Thus, it is hard to reach a definitive conclusion about the dynamics of these RSNs using 3D-WTA.

In order to compare the results of 3D-WTA and 4D-WTA with other state-of-the-art algorithms, the algorithms that performed best on the simulated data are selected. These methods are PCQ-NC and GenLov. The variation in consistency across all subjects for the different RSNs is shown in Fig. 3.12(b) for PCQ-NC and Fig. 3.12(c) for GenLov. From these figures, it can be seen that the consistency results from both methods follow similar patterns that align with the results from both 3D-WTA and 4D-WTA. However, 3D-WTA and 4D-WTA show less variation in the consistency of the different RSNs across subjects, indicating the stability of our methods compared to existing algorithms.
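A minimal sketch of this consistency computation, assuming a label matrix with one community assignment per ROI and time point, is shown below. The function name is ours, and the AAL indices in Table 3.3 are 1-based, so they would need to be shifted by one before use.

```python
import numpy as np

def subnetwork_consistency(labels, rois):
    """Consistency of one resting-state subnetwork via recruitment coefficients.
    labels has shape (n_timepoints, n_rois): community label of every ROI at
    every time point; rois lists the (0-based) ROI indices of the subnetwork."""
    rois = np.asarray(rois)
    recruit = []
    for i in rois:
        others = rois[rois != i]
        # fraction of (time point, other ROI) pairs sharing ROI i's community
        same = labels[:, others] == labels[:, [i]]
        recruit.append(same.mean())
    return float(np.mean(recruit))
```

For example, the VN consistency of one subject would be obtained as subnetwork_consistency(labels, np.arange(43, 57) - 1) under 0-based indexing.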
3.4.5.6 Comparison Between 3D-WTA and 4D-WTA

In order to study the degree of agreement between the community structures detected by 3D-WTA and 4D-WTA for the different subjects, the variation of information is calculated between the clustering results found by 4D-WTA and the clustering results found by 3D-WTA for each subject separately. The VI averaged across time and subjects has a mean of 0.46 and a standard deviation of 0.0507. These results indicate that, in general, there is high agreement between the group and individual community structures.

3.5 Conclusions

In this chapter, we introduced a new tensor based framework to detect and track the community structure in temporal and temporal multi-layer networks using 3-way and 4-way tensors, respectively. The framework is implemented through two approaches to detect the community structure in temporal networks: 3D-WTA and RTTA. In 3D-WTA, a fixed-length sliding window is used to construct the tensors, while a variable-length sliding window is used to construct the tensors in RTTA. On the other hand, we extended 3D-WTA to 4D-WTA to detect the community structure in temporal multi-layer networks. In the proposed framework, either a 3-way or a 4-way tensor is constructed at each time point by taking the past history across time, or across time and layers, into account. The community structure is detected by finding a common subspace through Tucker decomposition of the constructed tensors. Moreover, the proposed algorithms optimize the contribution of each time point and layer within the framework. Furthermore, a normalized cost function is introduced to track the changes in the community structure over time. The algorithms were tested on multiple simulated and real data sets and compared to existing state-of-the-art algorithms. The application of the proposed framework to simulated and real social networks shows that the proposed algorithms are able to detect the correct community structure and the changes in the number of clusters over time, even in noisy networks.

A particular application of interest of the proposed work is to detect and track the community structure dynamics in FCNs of the brain during cognitive control or resting state, with the major focus on FCNs from rs-fMRI. The application to functional connectivity brain networks from EEG data revealed changes in the community structure across time that aligned well with the experimental setting and prior work on cognitive control.

In the application of the proposed framework to the dFCN from rs-fMRI, the dFCNs are summarized into a set of FC states that repeat over time and subjects. The detected FC states are defined through their community structure. Moreover, state transition profiles which reflect the variance in the dynamic brain behavior across subjects during resting state are obtained. The detected community structures were evaluated through the consistency measure. From this measure, it is clear that the VN, AN, SCN and SMN are the most stationary RSNs, whereas the DMN, CCN and BiN reconfigure themselves more frequently across time. In particular, the DMN showed the highest dynamic activity over time, as expected.

In addition, the results of 3D-WTA and 4D-WTA showed high agreement in the community structure in terms of variation of information. However, the community structure detected by 4D-WTA reflected the RSNs' dynamics and evolution across time better than 3D-WTA.
This is due to the fact that 4D-WTA uses more data, from both time and subjects, and preserves the topological structure of the FCNs.

Future work will consider the extension of this framework to higher-order tensor type data clustering across different modes. Moreover, we will consider reducing the computational cost of the algorithms by reducing the size of the constructed tensors. This can be achieved by adding extra terms or constraints to the objective functions. The major goal of this modification will be to select the time points or layers that can be included in the framework. A possible selection criterion could depend on selecting the time points or layers that are sparser than others, as this will guarantee the selection of the time points or layers with less noisy community structure.

Figure 3.4: (a) The normalized cost function and the detected number of clusters for the Flocks of Boids dataset using different algorithms. 3D-WTA results in the best estimate of the number of clusters, especially after the change point. (b) Rand index comparison between 3D-WTA, AFFECT, PCM and PCQ for the Flocks of Boids dataset.

Figure 3.5: Detected change points for the Reality Mining data.

Figure 3.6: Detected change points for the ERN data.

Figure 3.7: Community structure for the ERN networks obtained by RTTA: (a) Pre-ERN, (b) ERN, (c) Post-ERN.

Figure 3.8: A flowchart of the application of 4D-WTA to rs-fMRI: (a) Time series from each ROI; (b) dFCNs constructed across time and subjects using the Pearson correlation coefficient; (c) After random subsampling across time and subjects, 4-dimensional FC tensors are constructed for each sample, where each tensor consists of 30 random time points from each subject; (d) Tucker decomposition of each FC tensor; (e) Community structure found by 4D-WTA for each sample; (f) The detected community structure is summarized into 5 functional connectivity states defined through community detection using k-means.

Figure 3.9: (a) Distribution of weights across time for all samples, (b) Distribution of weights across subjects for all samples.

Figure 3.10: The similarity among the community structures obtained by 4D-WTA across the 200 subsamples across time and subjects. The similarity between the different community structures is quantified by variation of information. Samples with similar community structure are grouped together by k-means.

Figure 3.11: State transition profiles for subjects 4, 7, 9 and 12.
Figure 3.12: Variation in RSNs' consistency across subjects calculated by (a) 3D-WTA, (b) PCQ-NC, (c) GenLov, (d) 4D-WTA.

Figure 3.13: Full and circular view of the community structure of the FC states found by 4D-WTA: (a) State 1 with 9 clusters; (b) State 2 with 4 clusters; (c) State 3 with 8 clusters; (d) State 4 with 7 clusters and (e) State 5 with 6 clusters.

Chapter 4
Detecting Brain Dynamics During Resting State: A Tensor Based Evolutionary Clustering Approach

4.1 Introduction

With the advancement of neuroimaging technology, researchers are able to collect digital data for the brain across different modalities and spatial and temporal resolutions. With the growth in neuroimaging data, it is now possible to map human brain activity as a complex network. In particular, the nodes of the network correspond to the different regions of the brain and the edges are quantified through the interactions between these regions. In recent years, different approaches have focused on understanding the functional mechanisms of the human brain. A particular field of interest is understanding the spontaneous activity of the brain during resting state. This has been accomplished by investigating the functional connectivity of brain networks from functional magnetic resonance imaging (fMRI) during resting state. fMRI is a noninvasive neuroimaging modality that measures the changes of the blood oxygenation level-dependent (BOLD) signal. FC is defined as the statistical dependency between different brain regions and has been quantified through the computation of Pearson's correlation between regions of interest (ROIs) [54–56]. Increased interest in studying resting state fMRI (rs-fMRI) has shown that the different brain regions of the motor cortex are active even in the absence of a motor task [57]. Since that time, rs-fMRI has been widely used as one of the major tools to investigate and explore FC [58, 59, 130].

Many of the current studies on rs-fMRI report a set of functional connectivity networks that appear consistently in healthy subjects, such as the default mode network, visual network, somatomotor network, executive control network and auditory network [58, 59]. A major network of interest during resting state is the default mode network (DMN), which consists mainly of the posterior cingulate gyrus, precuneus, anterior cingulate gyrus, angular gyrus and medial frontal gyrus. The DMN has attracted a lot of attention since it reflects the default neural activity and is deactivated during task-related events [122, 123, 131].
Early work on rs-fMRI assumed stationarity of functional connectivity; however, recent studies have revealed evidence for the dynamic nature of FC during resting state [53, 60]. Different methods have been developed to study the dynamics of the functional connectivity networks (FCNs). These methods include clustering based methods [55], statistical state modeling methods [68] and subspace analysis based methods [56]. Most of the current work focuses on extracting FC states and reduces the dimension of the dynamic FCNs (dFCNs) by vectorizing the connectivity matrices before extracting the FC states. This procedure may result in the loss of the topological structure of the FCN. These approaches also reduce the dFCNs into a few common basis functions or states without explicitly identifying the topological organization within each state. A well-known method to study the network organization is community detection [132]. Community detection algorithms have attracted a lot of attention as the detected structure reflects the topological relationships within the network [50, 133]. However, most of the current work on community detection focuses on detecting the community structure of static networks, such as standard spectral clustering [7].

Over the past decade, different evolutionary clustering approaches have been developed to detect the community structure of dynamic networks. These approaches include simple extensions of static clustering methods [31, 32], statistical modeling based methods [33], tensor based methods [34, 134] and modularity based methods [35]. Extensions of static clustering methods such as k-means and agglomerative hierarchical clustering were developed in [31] for dynamic networks, where a cost function that considers both the cluster membership at the current time and the deviation from the previous time point was introduced. In [32], two spectral clustering based frameworks were introduced: preserving cluster quality (PCQ) and preserving cluster membership (PCM). In both frameworks, the clustering depends on a cost function to guarantee temporal smoothness. However, this cost function requires a priori knowledge about the changes in the community structure and depends on the choice of a changing parameter. Under statistical modeling based methods, the authors in [33] assumed a statistical model for the evolution of the adjacency matrix, and an adaptive forgetting factor for evolutionary clustering and tracking (AFFECT) was introduced. In AFFECT, static clustering methods are performed after smoothing the proximity between objects over time. However, AFFECT depends on a stochastic block model for the adjacency matrix and requires a higher computational cost compared to static clustering to estimate the forgetting factor. Under tensor based methods, a non-negative tensor factorization approach was introduced in [34]. In this approach, the community structure of the temporal network is detected using PARAFAC, where all the successive adjacency matrices in the temporal network are represented by a single 3-way tensor. However, this representation of the temporal network may not preserve the community structure of the different time points. Under modularity based methods, a statistical method based on greedy modularity optimization was proposed in [35] to identify the changing modular organization in temporal multiplex networks, yet this approach has some drawbacks since it depends on modularity optimization, as discussed in [20].
In this chapter, we introduce a two-step dynamic community detection algorithm to identify the network community structure of the dFCN of the brain. First, we introduce an information-theoretic function to reduce the dFCN across time and identify the time points that carry similar topological information in order to combine them into a multi-way array, or tensor. In the second step of the algorithm, a tensor based approach is developed to identify the community structure of the reduced dFCN over time and summarize it in a set of "FC patterns" or "FC states".

The contributions of the proposed work are four-fold. First, the proposed approach relies on quantum theory principles to reduce the whole dFCN into a few distinct network states that have similar topological information. This approach identifies the time points with similar network organization; unlike simple averaging methods, it optimizes the contribution of each time point within each state through an optimized weighted average, where the common subspace and the weights across time are optimized simultaneously. Second, the proposed approach preserves the topological structure of the time points in each state, where the FC matrices from the original dFCN are used to identify the community structure within the state, unlike most recent studies that vectorize the network before clustering. Third, the proposed approach detects and tracks the community structure of the dFCN across time without any assumptions about the network structure, the number of clusters or the community membership across time. Finally, the proposed approach provides evidence for the dynamic behavior of the RSNs and reveals the fluctuations in the community membership of the different ROIs.

4.2 Materials and Methods

4.2.1 Data and Preprocessing

The data used in this chapter are from the Bangor rs-fMRI dataset. This dataset is part of the 1000 Functional Connectomes Project [115]. The dataset includes 20 healthy subjects (males, aged between 19−38). The rs-fMRI data was collected using a 3T magnet scanner from all subjects while their eyes were open. The TR equals 2 s, yielding 34 slices and 265 time points. The pre-processing was done using the SPM12 [116] and CONN [117] functional connectivity toolboxes. The structural images were registered to the mean functional image and spatially normalized to MNI space for each subject. Slice timing and motion correction were performed on every functional image. The functional images were regionally parcellated by the automated anatomical labeling (AAL) atlas [118] and smoothed by spatial convolution with a 4-mm FWHM Gaussian kernel. Confounds such as motion parameters obtained from realignment and BOLD signals obtained from white matter and CSF masks were regressed out, and band-pass (0.008-0.09 Hz) temporal filtering was applied to the functional images of each subject.

In order to study the brain dynamics over time, we focus our analysis on some well-known resting state networks (RSNs), including the default mode network (DMN), auditory network (AN), visual network (VN), somatomotor network (SMN), bilateral limbic network (BiN), subcortical network (SCN) and cognitive control network (CCN). These networks are commonly defined by a set of ROIs. In this study, the set of ROIs categorized within these RSNs is summarized in Table 4.1 [55, 56].
Table 4.1: The ROIs that define the RSNs. The numbers in the table refer to the ROIs defined in Table 1.
  VN: (43−56)
  AN: (79−82)
  BiN: (21, 22, 27, 28, 37−42, 83, 84, 87, 88)
  DMN: (3−6, 23−26, 31−36, 65−68, 85, 85)
  SMN: (1, 2, 17−20, 57−60, 63, 64, 69, 70)
  SCN: (71−78)
  CCN: (61, 62, 89, 90, 7−16, 29, 30)

4.2.2 Dynamic Functional Connectivity Networks (dFCNs)

4.2.2.1 Dynamic Functional Connectivity Network Construction

The dynamic functional connectivity networks are constructed for each subject by computing the Pearson correlation coefficient, ρ, between different ROIs (90 ROIs: 45 ROIs in the left and 45 ROIs in the right hemisphere, listed in Table 1) using a sliding window approach with a rectangular window of length equal to 60 TRs and an overlap of 1 TR [55, 56, 60]. For each subject, a sequence of weighted and undirected graphs {A^(t)_k}_{t_1,...,t_M} is constructed with A^(t)_k(i,j) = ρ^(t)_k(i,j), where A^(t)_k ∈ R^{n×n} is the adjacency matrix representing the network at time step t for subject k, n is the total number of nodes (ROIs) and M is the total number of time points. The final dFCN at each time point is averaged across all subjects [135–137] to obtain {A^(t)}_{t_1,...,t_M}, where A^(t)(i,j) = (1/K) Σ_{k=1}^{K} ρ^(t)_k(i,j) and A^(t) ∈ R^{n×n}.

4.2.2.2 Surrogate Data Generation for Testing Statistical Significance of the dFCN

Before applying the proposed clustering approach to the dFCN, we first apply thresholding to the connectivity matrices to obtain sparse adjacency matrices with nonzero edges corresponding to the significant connections. We generated surrogate data [138] from the ROI time series and kept all edges that were significantly different from the null model.

For each ROI, surrogate data was generated using the iterative amplitude-adjusted Fourier transform (IAAFT) [139]. For each subject, we created 100 surrogate time series for each ROI. We then computed the Pearson correlation coefficient between each pair of ROIs using a sliding rectangular window of length 60 TRs and an overlap of 1 TR. This process was repeated for all 100 surrogate time series. For each pair of ROIs, i and j, and for each window, the correlation values obtained from the null model were normally distributed with a sample mean of µ^(t)(i,j) and a standard deviation of σ^(t)(i,j). For each edge, A^(t)_k(i,j), in the original dFCN of each subject, we calculated the absolute z-score as z^(t)_k(i,j) = |A^(t)_k(i,j) − µ^(t)(i,j)| / σ^(t)(i,j) and removed all the edges with z^(t)_k(i,j) < 3, which corresponds to a statistical test with a significance of 99.8%. This yields a z-score thresholded network {B^(t)_k}_{t_1,...,t_M} for each subject. The final z-score thresholded dFCN, {B^(t)}_{t_1,...,t_M}, is constructed at each time point by averaging {B^(t)_k}_{t_1,...,t_M} across subjects, where B^(t)(i,j) = (1/K) Σ_{k=1}^{K} z^(t)_k(i,j).

4.2.3 Reducibility of Brain dFCNs

The first step in our proposed approach is the reduction of the dFCN. The goal of the reduction step is to cluster the time points that possess similar structure into a common partition to identify the distinct network states. The z-score thresholded dFCN, {B^(t)}_{t_1,...,t_M}, constructed from the surrogate data is used to reduce the original dFCN. The reduction method relies on quantum theory principles to group similar time points by maximizing the distinguishability between the reduced temporal network and the corresponding fully aggregated graph, defined as B_agg = Σ_{m=1}^{M} B^(t_m). A detailed description and derivation of the method can be found in [140].
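Assuming the surrogate correlation matrices for a given window have already been computed, the edge-wise z-score thresholding of Section 4.2.2.2 reduces to a few lines. The sketch below is illustrative, uses our own function names, and does not include the IAAFT surrogate generation itself.

```python
import numpy as np

def zscore_threshold(A, A_surr, z_thresh=3.0):
    """Edge-wise significance test against a surrogate null model. A is one
    windowed connectivity matrix (n x n) and A_surr holds the matching
    surrogate connectivity matrices, shape (n_surrogates, n, n).
    Returns the z-scored, thresholded matrix B."""
    mu = A_surr.mean(axis=0)                   # null mean per edge
    sigma = A_surr.std(axis=0, ddof=1)         # null standard deviation per edge
    z = np.abs(A - mu) / np.maximum(sigma, 1e-12)
    B = np.where(z >= z_thresh, z, 0.0)        # keep only significant edges
    np.fill_diagonal(B, 0.0)                   # no self-connections
    return B
```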
The method can be summarized in three main steps: (i) computation of the Jensen-Shannon divergence, D_JS, between all pairs of time points in the temporal network, (ii) classical hierarchical clustering of the M time points and (iii) calculation of the relative entropy, q, which is used as the quality function of the resulting partition. The partition that maximizes q is chosen.

In the first step, D_JS is computed between all pairs of time points. It is defined in terms of the Von Neumann entropy of each graph, B^(t),

h(B^{(t)}) = -\mathrm{tr}\big(L_{B^{(t)}} \log_2 L_{B^{(t)}}\big) = -\sum_{i=1}^{n} \lambda_i^{(t)} \log_2 \lambda_i^{(t)},   (4.1)

where L_{B^{(t)}} = (D_{B^{(t)}} - B^{(t)}) / \sum_{i,j=1}^{n} B^{(t)}(i,j) is the rescaled Laplacian of graph B^(t), which has all the properties of a density matrix, {λ^(t)_1, λ^(t)_2, . . . , λ^(t)_n} are the eigenvalues of L_{B^(t)}, tr is the trace of a matrix and D_{B^(t)} is the diagonal degree matrix corresponding to B^(t). The dissimilarity between any pair of time points in the temporal network can be measured by computing D_JS between them. Given the rescaled Laplacians of a pair of time points, L_{t_i} and L_{t_j}, we can compute a new density matrix L = (1/2)(L_{t_i} + L_{t_j}); D_JS is then calculated as,

D_{JS}(L_{t_i} \| L_{t_j}) = h(L) - \tfrac{1}{2}\big(h(L_{t_i}) + h(L_{t_j})\big).   (4.2)

In the second step, classical hierarchical clustering of the M time points is performed using D_JS between all pairs of time points as the dissimilarity measure. At each step of the algorithm, the time points that are separated by the smallest D_JS are aggregated by summing up the corresponding adjacency matrices. The distances between the resulting new pairs of the reduced temporal network are then updated. This procedure is repeated M − 1 times until the fully aggregated graph, B_agg, is formed. As a result of the hierarchical clustering algorithm, we obtain a dendrogram associated with the aggregation procedure.

In the third step, the relative entropy, q, is used as the quality function to quantify the distinguishability between the fully aggregated graph, B_agg, and the reduced temporal network, G_red, where G_red : {G^(1), G^(2), . . . , G^(M̄)}, with M̄ ≤ M. The matrix G^(α) is either one of the z-score thresholded matrices or the sum of two or more of them, with α = 1, . . . , M̄. The quality function of the reduced network is defined as,

q(G_{red}) = 1 - \frac{\bar{H}(G_{red})}{h(B_{agg})},   (4.3)

where H̄(G_red) is the average Von Neumann entropy of the reduced z-score thresholded dynamic network. Finally, the partition that maximizes the relative entropy is chosen, in which the z-score thresholded matrices of the time points that carry similar information are clustered together.

The output of the reduction step is a set of multidimensional arrays, or tensors, where the slices of each tensor consist of adjacency matrices from the original dFCN. The adjacency matrices included within each tensor are marked and selected from the original dFCN through the reduction step as they carry similar topological information. The reduced dFCN can be defined as Y : {Y^(1), . . . , Y^(M̄)} with M̄ ≤ M, where Y^(τ) ∈ R^{n×n×L^(τ)_T} with τ = 1, . . . , M̄, and L^(τ)_T is the number of time points included in the tensor Y^(τ), which corresponds to the third dimension of the tensor and might vary between tensors. For each tensor, Y^(τ), we define L^(τ) as a vector of indices of the time points from the original dFCN included in the tensor. The lth frontal slice of the 3-mode tensor, Y^(τ)(:,:,l), is the normalized adjacency matrix A_N^{L^(τ)(l)}, where l ∈ L^(τ).
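Equations (4.1) and (4.2) can be evaluated directly from the rescaled Laplacians. The small NumPy sketch below is ours for illustration and assumes weighted graphs with zero diagonal, so that the rescaled Laplacian has unit trace.

```python
import numpy as np

def _density_matrix(B):
    """Rescaled Laplacian of a weighted graph with zero diagonal; its
    eigenvalues are nonnegative and sum to one, like a density matrix."""
    B = np.asarray(B, dtype=float)
    L = np.diag(B.sum(axis=1)) - B
    return L / B.sum()

def _entropy_of_density(L):
    lam = np.linalg.eigvalsh(L)
    lam = lam[lam > 1e-12]                     # convention: 0 * log2(0) = 0
    return float(-(lam * np.log2(lam)).sum())

def von_neumann_entropy(B):
    """Eq. (4.1): entropy of the eigenvalues of the rescaled Laplacian."""
    return _entropy_of_density(_density_matrix(B))

def jensen_shannon(Bi, Bj):
    """Eq. (4.2): Jensen-Shannon divergence between two network snapshots."""
    Li, Lj = _density_matrix(Bi), _density_matrix(Bj)
    return _entropy_of_density(0.5 * (Li + Lj)) - 0.5 * (
        _entropy_of_density(Li) + _entropy_of_density(Lj))
```

The resulting pairwise D_JS matrix is what feeds the hierarchical clustering in the second step.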
4.2.4 Dynamic Network Clustering

4.2.4.1 HOOI Clustering Based Approach

The evolutionary community structure of the reduced dFCN, Y, can be determined by identifying the community structure for each partition, Y^{(τ)}. Each Y^{(τ)} can be represented as a multidimensional array, and the optimal subspace U^{(τ)} that describes the collection of adjacency matrices can be found through spectral clustering of the weighted average of the adjacency matrices. This results in an optimization over both the subspace matrix, U^{(τ)}, and the weight vector, W^{(τ)}, for each partition. The optimization problem in (3.3) can then be rewritten for each tensor as

max_{U^{(τ)}, W^{(τ)}}  || U^{(τ)†} ( Σ_{l∈L^{(τ)}} w_l^{(τ)} A^N_{L^{(τ)}(l)} ) U^{(τ)} ||_F^2,   s.t.  U^{(τ)†} U^{(τ)} = I,  W^{(τ)} ≥ 0,  || W^{(τ)} ||_2 = 1,    (4.4)

where w_l^{(τ)} are the weights for the adjacency matrices within Y^{(τ)} and L^{(τ)} refers to the vector of indices of the time points from the original dFCN included in the tensor. The first two modes of the tensor refer to the connectivity and the third mode refers to time. A variable size set of tensors is constructed to represent the reduced dFCN across time, and (4.4) can be reformulated in terms of a Tucker decomposition for each tensor as follows:

max_{U^{(τ)}, W^{(τ)}}  || Y^{(τ)} ×_1 U^{(τ)†} ×_2 U^{(τ)†} ×_3 W^{(τ)†} ||_F^2,   s.t.  U^{(τ)†} U^{(τ)} = I,  || W^{(τ)} ||_2 = 1,    (4.5)

where U^{(τ)} is the basis along the first and second modes and W^{(τ)} corresponds to the weighting vector along the time mode. Since the networks are undirected, the components along the first two modes are the same. The optimization in (4.5) can be solved by higher-order orthogonal iteration (HOOI), where in each iteration either the matrix U^{(τ)} or the vector W^{(τ)} is fixed to optimize the other one. This approach is summarized in the pseudo code in Algorithm 4.6.

Algorithm 4.6 HOOI clustering based approach.
Input: Y : {Y^{(1)}, ..., Y^{(M̄)}}, L = {L^{(1)}, ..., L^{(M̄)}}.
Output: Clustering labels.
1: for τ = 1 : M̄ do
2:    Initialize U_0^{(τ)} by HOSVD of Y^{(τ)}, i = 0.
3:    Unfold the tensor Y^{(τ)} along mode 3 to get Y^{(τ)}_{(3)}.
4:    Calculate W_{i+1}^{(τ)} as the dominant left singular vector of Y^{(τ)}_{(3)} (U_i^{(τ)} ⊗ U_i^{(τ)}).
5:    Compute Ā = Σ_{l∈L^{(τ)}} (W_{i+1}^{(τ)})_l A^N_{L^{(τ)}(l)}, where A^N_{L^{(τ)}(l)} = Y^{(τ)}(:,:,l).
6:    Obtain U_{i+1}^{(τ)} by eigenvalue decomposition of Ā.
7:    if || U_{i+1}^{(τ)} − U_i^{(τ)} ||_F^2 > ε then
8:        i = i + 1, go to line 4.
9:    end if
10:   Use modularity to determine the number of clusters.
11:   Normalize the columns of U^{(τ)}.
12:   Apply k-means on U^{(τ)} to obtain the clustering labels.
13: end for

4.2.4.2 Choosing the Number of Clusters

The number of clusters for each state was determined by modularity [10]. In the proposed approach, we calculated the modularity of the detected community structure for each tensor using the adjacency matrices of the time points included in that tensor separately. Then, an average modularity across all of the time points within that FC state was calculated, and the number of clusters that maximized the average modularity was chosen.
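As a rough numerical sketch of the alternating updates in Algorithm 4.6 for a single tensor, the following assumes the slices have already been normalized, uses scikit-learn's KMeans, and normalizes the rows of U before k-means (a common spectral-clustering variant); in practice the rank r would be selected with the modularity criterion of Section 4.2.4.2. The helper names and initialization are illustrative assumptions rather than the exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def hooi_cluster(Y, r, eps=1e-6, max_iter=50):
    """Y: (n, n, L) stack of normalized adjacency slices for one FC state; r: number of clusters."""
    n, _, L = Y.shape
    w = np.ones(L) / np.sqrt(L)                      # simple initialization of the time weights
    U = np.zeros((n, r))
    for _ in range(max_iter):
        A_bar = np.tensordot(Y, w, axes=([2], [0]))  # weighted average of the slices
        vals, vecs = np.linalg.eigh(A_bar)
        U_new = vecs[:, -r:]                         # r dominant eigenvectors of A_bar
        # W update: dominant left singular vector of the mode-3 unfolding times (U kron U)
        Y3 = Y.reshape(n * n, L).T                   # (L, n*n) mode-3 unfolding
        M = Y3 @ np.kron(U_new, U_new)               # (L, r*r)
        w = np.linalg.svd(M)[0][:, 0]
        w = np.abs(w) / np.linalg.norm(w)            # keep weights nonnegative with unit norm
        if np.linalg.norm(U_new - U) ** 2 < eps:
            U = U_new
            break
        U = U_new
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)  # row-normalize before k-means
    labels = KMeans(n_clusters=r, n_init=10).fit_predict(U)
    return labels, w
```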
4.3 Results

4.3.1 dFCN Reduction

The first step of the proposed approach is to combine the time points that carry similar topological information, so that the time points with similar FC patterns are grouped together. Consequently, the dimension of the dFCN is reduced while the distinguishability between the reduced dFCN and the fully aggregated graph is maximized.

We performed the reducibility analysis for the subject-averaged dFCN following the procedures explained in Sections 4.2.2.2 and 4.2.3. The Jensen-Shannon distance matrix and the quality function of the temporal network obtained after each step of the hierarchical clustering algorithm are shown in Fig. 4.1 and Fig. 4.2, respectively.

Figure 4.1: Jensen-Shannon divergence between all pairs of time points in the dFCN.

Figure 4.2: The dendrogram resulting from hierarchical clustering and the corresponding values of q. The dashed red line identifies the cut point of the dendrogram from the second peak of q.

From the dendrogram shown in Fig. 4.2, the reduced dFCN consists of 12 FC states represented by 12 tensors. The time points included in each tensor carry similar information. As can be seen from Fig. 4.2, we cut the dendrogram using the quality function curve. The cut point is chosen as the second peak of the quality function, since the partition resulting from the first peak produced 18 tensors and some of these tensors corresponded to fewer than 5 time points.

4.3.2 FC States Summarization

The HOOI based clustering approach is applied to analyze the brain functional connectivity dynamics. A set of 12 tensors is constructed corresponding to the reduced dFCN. The third dimension of each tensor represents the number of time points included in that partition. The proposed approach is applied to each tensor to detect its community structure. The constructed tensors were approximated by Tucker decomposition as Y^{(τ)} ≈ C^{(τ)} ×_1 U^{(τ)} ×_2 U^{(τ)} ×_3 W^{(τ)}, where C^{(τ)} ∈ R^{r×r×1} is the core tensor, U^{(τ)} ∈ R^{90×r} is the orthogonal low-rank component matrix along the connectivity mode and W^{(τ)} ∈ R^{L_T^{(τ)}×1} is the weighting vector along the time mode. The rank r corresponds to the number of clusters. The number of clusters for each tensor is determined using modularity, as explained in Section 4.2.4.2, and varied between 3 and 4.

State  Tensor Size   Time Points (seconds)                          Reoccurrence percentage
1      90×90×16      (120−140), (154−162)                           7.8%
2      90×90×19      (141−153), (163−188)                           9.2%
3      90×90×25      (189−206), (246−256), (282−298), (301−302)     12%
4      90×90×32      (207−245), (257−281), (299−300)                15.5%
5      90×90×19      (303−314), (330−342), (386−396)                9.2%
6      90×90×20      (315−329), (356−358), (452−470)                9.7%
7      90×90×19      (343−355), (397−422)                           9.2%
8      90×90×13      (360−384)                                      6.3%
9      90×90×14      (423−451)                                      6.7%
10     90×90×29      (471−503), (505−530)                           14%

Table 4.2: Key Attributes of the Tensors Constructed from the Reduced dFCN.

In this chapter, we are investigating the whole brain dynamics by detecting the community structure of the brain over time. This is achieved by summarizing the dynamic behavior of the brain through a set of “FC states”. As the detected community structure across some states is the same, the final resulting FC states are summarized into 10 states. Table 4.2 summarizes the key attributes of the tensors constructed to detect the community structure of the dFCN states.

Fig. 4.3 shows the detected community structures for all of the FC states. For each state, the optimally weighted average of the adjacency matrices within the state is given along with the cluster assignments (left). The clusters are labeled using the major ROIs detected within the cluster, i.e., a cluster might contain ROIs from multiple RSNs, but it is labeled by the most dominant RSN within the cluster.
Moreover, the circular graph (right) shows the detected community structures using the circularGraph toolbox [1] and the connections between the ROIs of the different RSNs.

The extracted FC states are analyzed with respect to two aspects. First, the frequency of occurrence of each FC state is computed across time. Second, the variability in the community membership of the core ROIs corresponding to different RSNs is investigated across FC states and the reconfiguration of the community membership of the ROIs is analyzed.

The different FC states are first analyzed based on their frequency of occurrence. States 4 and 10 are the two states with the highest reoccurrence percentage, with relative frequencies of occurrence of 15.5% and 14%, respectively. In State 4, a cluster which includes most of the core ROIs known to belong to the DMN is detected. State 10, on the other hand, has a large SMN cluster that also includes ROIs of the SCN, CCN and BiN. The reoccurrence percentage of the remaining states varied between 6.3% and 9.7%, as shown in Table 4.2.

Regarding the variability in the community membership of the ROIs, we observed that some ROIs have a relatively stable community membership across the multiple states, i.e., these ROIs were clustered together across the different states. For example, the ROIs of the VN revealed a consistent community membership across all states, similar to the results in [141]. On the other hand, the VN expanded to include the superior parietal gyrus (SPG) during States 1, 2, 5, 6, 8, 9 and 10. Similar findings were reported in [128], where innovation-driven co-activation patterns (iCAPs) from rs-fMRI were extracted and an iCAP consisting of ROIs of the VN and the SPG was identified.

Another set of ROIs that clustered consistently together across all states are the core ROIs of the SMN, including the paracentral lobule (PCL), precentral gyrus (PreCG), postcentral gyrus (PoCG) and supplementary motor area (SMA). Even though these ROIs were clustered together across all states, the SMN was extended to include ROIs from the SCN in States 1−4, 9 and 10, as previously noted in [55]. Also, the HIP is included within the SMN across States 1−4, 6 and 10. In contrast, States 5 and 8 exhibited a separation of the ROIs of the SMN and SCN. Moreover, we observed that the ROIs of the AN, the superior temporal gyrus (STG) and Heschl gyrus (HES), were co-activated with the SMN across all states, which agrees with the results found in [128]. Similar to the other RSNs, some of the ROIs of the SCN were relatively stable, such as the putamen (PUT) and pallidum (PAL), whereas the thalamus (THA) and caudate (CAU) show variation in their community membership between the DMN, SCN and SMN [128, 142].

Furthermore, the ROIs of the BiN exhibited a dynamic behavior across the different states. For instance, in States 5 and 7−9, the hippocampus (HIP) and parahippocampal gyrus (PHG) are included in the DMN; as these ROIs are known to play an important role in long-term memory [124, 125], previous studies have included them as part of the DMN [126]. During States 1−3, the HIP and AMYG joined the SMN while the PHG joined the DMN. On the other hand, the ROIs HIP, PHG and AMYG are clustered together during State 10.

As the DMN is a major network of interest during resting state [122, 123], we will present some interesting features of the detected DMN cluster across the different states.
In State 4, which scored the highest reoccurrence percentage, a large DMN cluster is detected. This cluster consists of the core ROIs of the DMN, including the precuneus (PCUN), posterior cingulate gyrus (PCG), anterior cingulate gyrus (ACG), angular gyrus (AG), superior frontal gyrus (SFG), middle frontal gyrus (MFG), inferior frontal gyrus (IFG) and middle temporal gyrus (MTG). Most of the ROIs of the DMN were stable across all states except for the PCUN. The PCUN exhibited different community memberships across the FC states, and it is known to play a crucial role in the DMN [127, 143]. In particular, it is affiliated with the DMN cluster only in State 4. On the other hand, a consistent co-activation between the PCUN and the ROIs of the VN in States 1, 3, 5, 6 and 10 is revealed, as can be seen in Fig. 4.3, and this agrees with previously reported results in [55, 127], where a FC state that included the PCUN and the ROIs of the VN was identified. Moreover, States 7−9 revealed a cluster with co-activation between the PCUN and the ROIs of the SCN and CCN, similar to the results in [55, 144].

By studying the community structures detected across the different FC states, we observe that some RSNs show a relatively stationary behavior, while other RSNs show a dynamic behavior. From the community structures shown in Fig. 4.3, it can be seen that the VN and AN are more stationary across the states, where the ROIs of these networks are clustered together across multiple states and experienced less variability in their community membership. On the other hand, the DMN, SMN, SCN, BiN and CCN exhibited a dynamic behavior across multiple states, as most of the ROIs of these RSNs keep changing their community membership across states. Moreover, in some states the aforementioned RSNs joined each other to form bigger clusters. For instance, the SMN, SCN, AN and BiN formed one cluster during States 1, 2, 4, 6, 9 and 10, and the DMN and ROIs from the CCN and SCN formed one cluster during States 2, 5 and 6.

4.4 Conclusions

In this chapter, we introduce a two-step dynamic community detection algorithm to investigate the whole brain FC dynamics during resting state. The first step in the proposed approach relies on an information-theoretic function to reduce the dimensionality of the dFCN across time. This reduction step identifies the time points that possess similar topological information across time and combines them as a tensor. The second step introduces a tensor based clustering approach to detect the community structure of the reduced dFCN and summarizes it into a set of FC states that reoccur across time. The extracted FC states provide evidence for the temporal evolution of the brain dFCN, as the ROIs of the different RSNs reconfigure their community membership across time. The extracted FC states are evaluated through the frequency of reoccurrence across time and the variability in the community membership of the ROIs across the different states. The results show that some RSNs are more dynamic across time than others. In particular, we observed that the VN and AN are the most stationary networks across states, whereas the DMN, SCN, BiN, CCN and SMN reconfigure their community membership more often across the different states. Moreover, the results show that the FC state that is repeated most frequently across time is State 4, which has a cluster that includes most of the core ROIs of the DMN. This result provides further evidence for the important role played by the DMN during resting state.
The proposed approach makes some key contributions to the field of dFCN analysis. First, the proposed algorithm depends on quantum theory principles to reduce the dFCN across time into a set of FC states. Second, the proposed approach uses a tensor representation for the extracted partitions to detect the community structure across them, and this representation preserves the topological structure of the dFCN. Third, the proposed tensor based clustering approach optimizes the contribution of each time point within each state. Finally, the proposed approach detects and tracks the community structure and its evolution in dFCNs without any prior assumptions.

Future work will focus on different extensions of the proposed approach. First, the reduction of the dFCN will be performed across time and subjects in order to construct a reduced multi-layer temporal FCN. Second, FC states will be extracted across time and subjects and will be used to evaluate and compare the variability in the dFCN community structure between groups and individuals.

Figure 4.3: The detected community structure for the FC States (1−10). For each state, (left) the optimized weighted average of the adjacency matrices within the state. Each cluster is denoted by the main ROIs detected within the cluster. The clusters shown in the circular graphs (right) refer to the different RSNs defined in the literature, and the connections between them are detected by the proposed algorithm. The number of clusters for the FC states varies between 3 and 4.

Chapter 5

Temporal Block Spectral Clustering for Multi-layer Temporal Functional Connectivity Networks

5.1 Introduction

A network representation is useful for describing the structure of a large variety of complex systems in the biological, social and physical sciences [145]. Traditional studies of networks generally assume that nodes are connected to each other by a single type of static edge. However, this assumption is almost always an oversimplification, and it ignores important network dynamics such as time dependence. Multi-layer networks explicitly incorporate multiple channels of connectivity in a system. A fundamental aspect of describing multi-layer networks is defining and quantifying the interconnectivity between different layers. One common approach to characterizing these connections is community detection.

While different community detection approaches have been developed for static networks [7, 9–11], there is a growing need to develop algorithms that detect the community structure in multi-layer networks [6, 37]. Examples of multi-layer community detection methods include temporal clustering approaches [31–33], subspace learning based methods [146], stochastic block model (SBM) based multi-layer community detection methods [147, 148] and tensor based methods [149, 150]. More recently, a multi-objective optimization based genetic algorithm [41] was proposed for detecting and tracking community structure in dynamic multi-layer networks. However, none of the aforementioned methods take the inter-layer couplings into account, and they are thus mere extensions of multi-view data clustering approaches.

Recently, modularity optimization and spectral clustering based methods for multi-layer networks with inter-layer connections have been proposed to address this limitation [35, 151]. In [35], a modularity optimization framework to determine the community structure in multi-slice networks is introduced.
Since this framework depends on modularity, it has multiple drawbacks, as discussed in [20]. Moreover, the approach is limited to the case where each node is connected only to itself in the adjacent time windows, i.e., multiplex networks with the inter-layer coupling set to a constant parameter. In [151], a spectral clustering method to detect the community structure in multi-layer networks with inter-layer couplings is introduced. However, this approach is limited as it constructs a single block adjacency matrix to represent all the graphs in the network and cannot capture the temporal variation in the community structure.

Therefore, there is an increasing need to look at the dynamic community structure of multi-layer temporal networks with unconstrained inter-layer or inter-temporal connectivity [6, 152–154]. In this chapter, a framework to detect and track the community structure in multi-layer temporal networks (MLTNs) is introduced. In the proposed approach, the temporal relationships between all node pairs within and across time windows are modeled using a sliding window correlation approach. In this manner, non-trivial intra- and inter-adjacency matrices are constructed, resulting in temporal supra-adjacency matrices. The dynamic community structure is detected by applying spectral clustering to these temporal block supra-adjacency matrices.

This work makes significant contributions to the field of dynamic community detection and differs from existing multi-layer community detection methods in some key ways. First, unlike the work in [35], the inter-layer adjacency matrices across time are not limited to the interactions between each node and itself across time and consider the interactions across all nodes and all time points. Second, unlike the work in [151] that constructs a single block matrix for the whole network, the community structure at each time point is updated by taking the history of the networks into account.

A particular focus of this chapter is community detection in dynamic functional connectivity networks (dFCNs) of the brain. More specifically, the dFCNs are modeled for each subject as a MLTN where the different brain regions correspond to the different nodes and the different time windows correspond to the different layers.

5.2 Methods

In this chapter, a temporal block spectral clustering approach is introduced to detect and track the community structure in multi-layer temporal networks. In the proposed approach, a set of supra-adjacency matrices that captures the intra- and inter-temporal relations is constructed as follows.

5.2.1 Construction of multi-layer temporal networks

5.2.1.1 Construction of a temporal network from signals

Given a set of multivariate signals, {x_1(t), x_2(t), ..., x_n(t)} with t ∈ {t_1, t_2, ..., t_M}, a set of adjacency matrices is constructed at each time point describing the temporal network, with the signals corresponding to the nodes of the network. A sliding window of length W is used to construct both the intra-adjacency matrix, A^{(t_m)}, with t_m ∈ {t_W, t_{W+1}, ..., t_M}, and the inter-adjacency matrix, A^{(t_k, t_l)}, with t_k, t_l ∈ {t_W, t_{W+1}, ..., t_M}. A^{(t_m)} captures the correlations between the different signals within a time window W, whereas A^{(t_k, t_l)} captures the correlations between signals across different time windows. A^{(t_k, t_l)} is defined as:

A^{(t_k, t_l)} = [ ρ^{W(t_k,t_l)}_{x_1 x_1}   ρ^{W(t_k,t_l)}_{x_1 x_2}   ...   ρ^{W(t_k,t_l)}_{x_1 x_n}
                  ρ^{W(t_k,t_l)}_{x_2 x_1}   ρ^{W(t_k,t_l)}_{x_2 x_2}   ...   ρ^{W(t_k,t_l)}_{x_2 x_n}
                  ...                         ...                        ...   ...
                  ρ^{W(t_k,t_l)}_{x_n x_1}   ρ^{W(t_k,t_l)}_{x_n x_2}   ...   ρ^{W(t_k,t_l)}_{x_n x_n} ],

where ρ^{W(t_k,t_l)}_{x_i x_j} = ρ(x_i(t_k−W+1 : t_k), x_j(t_l−W+1 : t_l)) is the absolute correlation between x_i in time window W(t_k) and x_j in time window W(t_l). The constructed A^{(t_k, t_l)} measures the similarities between each node in window W(t_k) and all other nodes in window W(t_l), unlike previous work that focused mostly on node-to-node edges [35]. A^{(t_m)} is considered as a special case of A^{(t_k, t_l)}, where A^{(t_m)} = A^{(t_m, t_m)}. Note that A^{(t_l, t_k)} = A^{(t_k, t_l)†}.
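The following is a minimal sketch of the windowed correlation construction above, assuming the signals are stored as an (n × T) array; the window indexing and helper names are illustrative assumptions rather than the exact implementation.

```python
import numpy as np

def window_corr(X, k, l, W):
    """Inter-adjacency A^(t_k, t_l): absolute correlations between every signal in window
    W(t_k) and every signal in window W(t_l). X: (n, T) array; k, l: 0-based window end indices."""
    Xi = X[:, k - W + 1:k + 1]                               # (n, W) segments ending at t_k
    Xj = X[:, l - W + 1:l + 1]                               # (n, W) segments ending at t_l
    Xi = (Xi - Xi.mean(1, keepdims=True)) / (Xi.std(1, keepdims=True) + 1e-12)
    Xj = (Xj - Xj.mean(1, keepdims=True)) / (Xj.std(1, keepdims=True) + 1e-12)
    return np.abs(Xi @ Xj.T) / W                             # |corr(x_i in W(t_k), x_j in W(t_l))|

def intra_adjacency(X, m, W):
    """Intra-adjacency A^(t_m) as the special case A^(t_m, t_m)."""
    return window_corr(X, m, m, W)
```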
5.2.1.2 Construction of multi-layer temporal network

In this chapter, the variations in the community structure of temporal networks are studied. In order to achieve this, a set of block supra-adjacency matrices is constructed to reflect the relationships between the different networks across time. In the proposed approach, a sliding window technique is used over the time-varying networks A^{(t_m)} to construct a set of intra- and inter-block adjacency matrices, A_intra and A_inter, respectively. Each window considers T consecutive networks to construct:

A^{(t_m')}_{intra} = [ A^{(t_m'−T+1)}   0    ...   0
                       0               ...   ...   ...
                       ...             ...   ...   0
                       0               ...   0    A^{(t_m')} ],   and

A^{(t_m')}_{inter} = [ 0                          A^{(t_m'−T+1, t_m'−T+2)}   ...   A^{(t_m'−T+1, t_m')}
                       A^{(t_m'−T+2, t_m'−T+1)}   0                          ...   ...
                       ...                        ...                         ...   ...
                       A^{(t_m', t_m'−T+1)}       ...                         ...   0 ],    (5.1)

where t_m' ∈ {t_{W+1}, ..., t_M}, T < M − W + 1, and A^{(t_m')}_{intra}, A^{(t_m')}_{inter} ∈ R^{nT×nT}. Note that T is different from the W used in Section 5.2.1.1: T is the number of networks included in constructing A^{(t_m')}_{intra} and A^{(t_m')}_{inter}, while W is the signal length used to construct the adjacency matrices.

For each set of T consecutive networks, a supra-adjacency matrix with the intra-adjacency matrices on the diagonal and the inter-adjacency matrices on the off-diagonal is constructed. As this matrix is symmetric with positive entries, the corresponding Laplacian is positive semi-definite. A^{(t_m')}_{Supra} ∈ R^{nT×nT} is

A^{(t_m')}_{Supra} = A^{(t_m')}_{intra} + A^{(t_m')}_{inter}.    (5.2)

A supra-normalized Laplacian matrix can then be constructed following the definition in [151, 155],

L^{(t_m')}_{Supra} = D^{(t_m')−1/2}_{Supra} ( L^{(t_m')}_{intra} + L^{(t_m')}_{inter} ) D^{(t_m')−1/2}_{Supra},    (5.3)

where L^{(t_m')}_{intra} = D^{(t_m')}_{intra} − A^{(t_m')}_{intra} and L^{(t_m')}_{inter} = D^{(t_m')}_{inter} − A^{(t_m')}_{inter}, with D^{(t_m')}_{intra}, D^{(t_m')}_{inter} and D^{(t_m')}_{Supra} defined as the degree matrices of A^{(t_m')}_{intra}, A^{(t_m')}_{inter} and A^{(t_m')}_{Supra}, respectively.

5.2.2 Temporal block spectral clustering approach (TBSCA)

Block spectral clustering [151, 155] simplifies the clustering problem of multi-layer networks to the standard spectral clustering of a single L_Supra that corresponds to all graphs in the network. The advantage of this approach is that it can provide a globally optimal solution for the clustering problem using the eigenvectors of L_Supra. In order to obtain temporally smooth communities, it is important to determine a community structure that simultaneously fits the network structure at time t and does not deviate from the past community structures. To accomplish this objective, a temporal block spectral clustering approach (TBSCA) is introduced. In this approach, a set of supra-adjacency matrices is constructed as described in (5.1) and (5.2) across time to capture the dynamics of the network.
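As a rough sketch of the block construction in (5.1)–(5.3), the following assumes the T intra-adjacency matrices for the current window are available in a list and that a helper (for example, the window_corr sketch above) supplies the inter-adjacency blocks; names are illustrative.

```python
import numpy as np

def supra_matrices(A_intra_list, inter_fn):
    """Build A_Supra and the supra-normalized Laplacian for T consecutive networks.
    A_intra_list: list of T (n, n) intra-adjacency matrices;
    inter_fn(a, b): returns the (n, n) inter-adjacency block between windows a and b."""
    T = len(A_intra_list)
    n = A_intra_list[0].shape[0]
    A_intra = np.zeros((n * T, n * T))
    A_inter = np.zeros((n * T, n * T))
    for a in range(T):
        A_intra[a*n:(a+1)*n, a*n:(a+1)*n] = A_intra_list[a]      # intra blocks on the diagonal
        for b in range(T):
            if a != b:
                A_inter[a*n:(a+1)*n, b*n:(b+1)*n] = inter_fn(a, b)  # inter blocks off the diagonal
    A_supra = A_intra + A_inter                                  # (5.2)
    L_intra = np.diag(A_intra.sum(1)) - A_intra
    L_inter = np.diag(A_inter.sum(1)) - A_inter
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_supra.sum(1) + 1e-12))
    L_supra = D_inv_sqrt @ (L_intra + L_inter) @ D_inv_sqrt      # (5.3)
    return A_supra, L_supra
```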
From each A^{(t_m')}_{Supra}, the corresponding supra-normalized Laplacian, L^{(t_m')}_{Supra}, is constructed for T consecutive networks using (5.3). The clustering problem is then formulated as

min_{Û^{(t_m')} ∈ R^{nT×r_m'}}  trace( Û^{(t_m')†} L^{(t_m')}_{Supra} Û^{(t_m')} )   s.t.  Û^{(t_m')†} Û^{(t_m')} = I.    (5.4)

Û^{(t_m')} can be obtained from the eigendecomposition of L^{(t_m')}_{Supra}. The number of clusters is determined using the eigen-gap criterion [7]. By applying k-means on Û^{(t_m')}, a cluster label vector C ∈ R^{1×nT} is obtained for the nT vertices. Finally, the cluster labels for time point t_m' are obtained as C(1, nT − n + 1 : nT).
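A minimal sketch of this clustering step is given below, assuming scikit-learn is available; the particular eigen-gap rule and the row normalization before k-means are common simplifications and not necessarily the exact choices made in the thesis.

```python
import numpy as np
from sklearn.cluster import KMeans

def tbsca_labels(L_supra, n, r_max=10):
    """Spectral clustering of one supra-Laplacian: pick r with an eigen-gap heuristic,
    run k-means on the leading eigenvectors and return the labels of the most recent
    time window (the last n rows), as in (5.4)."""
    vals, vecs = np.linalg.eigh(L_supra)             # eigenvalues in ascending order
    gaps = np.diff(vals[:r_max + 1])
    r = int(np.argmax(gaps)) + 1                     # one common form of the eigen-gap rule
    U = vecs[:, :r]                                  # smallest-r eigenvectors
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    C = KMeans(n_clusters=r, n_init=10).fit_predict(U)
    return C[-n:]                                    # labels for time point t_m' (last block)
```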
5.3 Results

5.3.1 Simulated networks

A temporal network with 150 nodes is generated for 60 time points. This temporal network is created with 3 clusters within the time intervals 1−20 and 40−60 and 2 clusters within the time interval 21−40. At each time point, A_intra and A_inter are constructed. The intra- and inter-adjacency matrices are generated with the within- and between-cluster edges selected randomly from uniform distributions in the ranges [0, 1] and [0, 0.5], respectively. 30% of the within-cluster and 5% of the between-cluster edges are randomly set to 1. The TBSCA is implemented to detect and track the community structure of the network over time with T = 4 time points. This simulation is repeated 100 times. The number of clusters at each time point is detected using the eigen-gap criterion. The proposed algorithm is able to detect the correct community structure and the change points across time, as shown in Fig. 5.1. The performance of the proposed approach (TBSCA), BNLSC [151] and the generalized Louvain for multi-layer modularity maximization (GenLov) [35] is compared using the variation of information (VI) metric [4]. A VI value of 0 means the partitions are exactly the same, and as the VI value increases, the degree of agreement between the partitions decreases. The results in Fig. 5.1 indicate that the proposed algorithm is very comparable to the GenLov method, which is considered as the state-of-the-art algorithm for detecting the community structure in temporal networks, and is better than BNLSC.

Figure 5.1: Comparison between the detected community structure by TBSCA, BNLSC and GenLov and the ground truth for the synthetic data set in terms of variation of information. The number of clusters changes from 3 to 2 at time point 21, and from 2 to 3 at time point 41.

5.3.2 Community Detection in dynamic functional connectivity networks (dFCNs)

Resting state fMRI (rs-fMRI) is commonly used to investigate spontaneous brain activity. Previous work [55, 59] indicates that there is a set of resting state networks (RSNs) that appears consistently in healthy subjects. These RSNs are the default mode network (DMN), visual network (VN), somatomotor network (SMN), cognitive control network (CCN), bilateral network (BiN) and auditory network (AN). Even though early studies in rs-fMRI assumed stationary functional connectivity, recent studies have revealed evidence for temporal evolution of the brain functional connectivity during resting state [53, 60].

5.3.2.1 Data and preprocessing

The Bangor rs-fMRI dataset (http://www.nitrc.org/projects/fcon1000), which includes 20 healthy subjects (males, aged between 19−38), is used in this chapter. The rs-fMRI data was collected using a 3T magnet scanner from all subjects while their eyes were open. The TR is equal to 2 s, yielding 34 slices and 265 time points. The pre-processing was done using the SPM12 (http://www.fil.ion.ucl.ac.uk/spm/) and CONN (http://www.nitrc.org/projects/conn) toolboxes. The structural images were registered to the mean functional image and spatially normalized to MNI space for each subject. Slice timing and motion correction were performed on every functional image. The functional images were regionally parcellated by the automated anatomical labeling (AAL) atlas [118] into 90 regions of interest (ROIs) and smoothed by spatial convolution with a 4-mm FWHM Gaussian kernel. Confounds such as motion parameters obtained from realignment and BOLD signals obtained from white matter and CSF masks were regressed out, and band-pass (0.008−0.09 Hz) temporal filtering was applied to the functional images of each subject.

5.3.2.2 Dynamic Community structure

The proposed approach is applied to dFCNs constructed from rs-fMRI for each subject following the steps described in Section 5.2.1 with W = 60 (2 minutes), similar to previous studies [56]. The constructed dFCN consists of 206 time points. The TBSCA with T = 2 networks is applied to detect and track the community structure for each subject separately. The detected number of clusters varied between 2 and 7 for the different subjects across time.

The application of the TBSCA to the multi-layer temporal networks for each subject resulted in a community structure for each subject at each time point. For a typical subject, Fig. 5.2 shows the similarity between the detected community structures across consecutive time points measured by VI. From Fig. 5.2, it can be seen that the detected community structure changes smoothly across time and there are certain time points where the structure shows significant changes, indicating the dynamics of resting state connectivity. Fig. 5.3 displays the detected community structure at some of these change points.

Figure 5.2: Agreement between the detected community structures across time for subject 1 measured by variation of information (VI). The dashed lines refer to the detected change points.

Figure 5.3: Detected community structure from selected change points: (a) time point 14, (b) time point 40, (c) time point 104, (d) time point 200. The ROIs in different networks reconfigured their community membership across time. Visualized by the circularGraph toolbox [1].

In an attempt to quantify the similarity of the community structure across time and subjects, a 4100×4100 similarity matrix is constructed to reflect the degree of agreement between the detected community structures across time and subjects in terms of VI, as shown in Fig. 5.4. From Fig. 5.4, it can be seen that the dFCNs exhibit a smooth variation in the community structure across time for a given subject, indicating the dynamic behavior of the FCNs during rest. Furthermore, the community structure of the dFCNs for the different subjects follows different patterns across time, as during resting state the subjects are not necessarily in the same network state at the same time.

Figure 5.4: Agreement between the detected community structures across time for all subjects measured by variation of information (VI).

5.3.2.3 Consistency of the RSNs across time

In order to quantify the dynamics of the RSNs across time and subjects, the consistency of each RSN is calculated. To calculate the consistency of a RSN, the recruitment coefficient defined in [129] is calculated for each ROI within the RSN. The ROI's recruitment coefficient is defined as the probability that the ROI is in the same cluster as the other ROIs of its own subnetwork at each time point. The RSN's consistency is then calculated as the average of the recruitment coefficients of all ROIs that correspond to that RSN.
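One way to compute this quantity, following the verbal definition above (the exact formulation in [129] may differ in detail), is sketched below; it assumes the detected cluster labels are stored as a (time × ROI) array and the helper name is illustrative.

```python
import numpy as np

def rsn_consistency(labels, rsn_rois):
    """Consistency of one RSN from a (M, n) matrix of cluster labels over time.
    The recruitment coefficient of an ROI is the fraction of (time point, other RSN ROI)
    pairs sharing its cluster label; the RSN consistency is the average over its ROIs."""
    sub = labels[:, rsn_rois]                         # (M, |RSN|)
    M, p = sub.shape
    recruit = np.zeros(p)
    for a in range(p):
        same = (sub == sub[:, [a]])                   # True where an ROI shares ROI a's label
        same[:, a] = False                            # exclude the ROI itself
        recruit[a] = same.sum() / (M * (p - 1))
    return recruit.mean()
```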
The consistency is calculated for the different RSNs across time and subjects after applying the TBSCA to the dFCN. The variation in consistency across subjects is shown in Fig. 5.5. From Fig. 5.5, the VN, AN and SMN have a relatively higher consistency than the other RSNs. This reflects a relatively stronger connection between the ROIs within these networks over time. On the other hand, the ROIs of the DMN, SCN, CCN and BiN exhibit more variation in their community membership over time, which indicates the dynamic behavior of these RSNs during rest.

Figure 5.5: Consistency of different RSNs across subjects calculated from the detected community structure by TBSCA.

5.4 Conclusions

In this chapter, a temporal block spectral clustering framework is introduced to detect and track the community structure of multi-layer temporal networks. In the proposed approach, a sliding window correlation approach is used to model the pair-wise temporal relationships between all nodes within and across time windows. A set of intra- and inter-adjacency matrices is constructed and combined to create a set of temporal supra-adjacency matrices. The dynamic community structure is then detected by applying spectral clustering to these supra-adjacency matrices. The simulation results show that the proposed algorithm is able to detect and track the correct community structure over time. The proposed method is also evaluated on dFCNs constructed from rs-fMRI across time and subjects, revealing dynamic connectivity patterns between the RSNs. Moreover, the detected community structures for the different subjects are found to follow different patterns across time, as expected for resting-state data. Finally, a consistency measure is used to quantify the dynamics of the RSNs across time and subjects. Based on this measure, certain RSNs, such as the DMN, SCN, CCN and BiN, have been found to exhibit more dynamic behavior during rest.

Chapter 6

Conclusions and Future Work

6.1 Conclusions

In this thesis, we proposed various methods to detect and track the community structure in dynamic and multi-layer networks. In order to better understand the community structure in these complex networks, we started our work by analyzing dynamic single-layer networks and then generalized it to analyze dynamic multi-layer networks.
A particular focus of this thesis is detecting the community structure in social networks and functional connectivity networks of the brain.

In Chapter 2, we approached the problem of community detection in dynamic networks by introducing a low-rank + sparse estimation based evolutionary spectral clustering approach. The proposed method decomposes the network into low-rank and sparse parts and obtains smooth cluster assignments by minimizing the subspace chordal distance between consecutive time points. Moreover, the proposed method explicitly considers the changes in the number of clusters between consecutive time points and reformulates the objective function depending on these changes. The proposed framework is robust to noise and outliers and can detect the community structure in both binary and weighted temporal networks efficiently. Furthermore, a cost function to track changes in the community structure across time is introduced through the subspace distance measure. Therefore, unlike existing evolutionary clustering methods, the proposed algorithm does not require any prior knowledge or make assumptions about the community structure, and it does not assume any particular model for the network. The proposed approach is evaluated on both simulated temporal networks and social temporal networks. The results show that the proposed algorithm is able to detect and track the correct community structure over time even in noisy networks.

As the method proposed in Chapter 2 is limited to temporal single-layer networks and can take only limited history into account, a computationally efficient tensor based method is proposed in Chapter 3 to detect the community structure in dynamic single-layer and multi-layer networks and to track changes across a longer history. The proposed method relies on a tensor based representation of the dynamic network. A 3-mode tensor is used to represent the dynamic network, while a 4-mode tensor is used to represent the dynamic multi-layer network. In the case of dynamic single-layer networks, the framework is implemented through two approaches: the 3D-windowed tensor approach (3D-WTA) and the running time tensor approach (RTTA). The proposed framework uses a sliding window approach in order to take the past history of the networks into account and optimizes the contribution of each time point within the window. In 3D-WTA, a fixed length sliding window is used to construct the tensors, while a variable length sliding window is used in RTTA. For each window, a common subspace across all time points within the window is identified using Tucker decomposition. Furthermore, a normalized cost function is introduced to track the changes in the community structure over time. The performance of the proposed framework is evaluated using multiple synthetic and real networks, including social networks and FCNs of the brain. The change points detected by the introduced normalized cost function reflect the true changes in these networks. Moreover, the application of the proposed framework to the dFCNs of the brain constructed from EEG data during ERN and from rs-fMRI revealed changes in the community structure across time that are well-aligned with prior work.

As one of the major applications of our work is exploring the dFCNs from rs-fMRI, we applied 4D-WTA to identify and track the brain network community structure across time and subjects.
In particular, this approach is adopted to study the temporal evolution of communities in the resting state networks. In 4D-WTA, the contribution of each subject and time point within the framework is optimized. Moreover, the dFCNs are summarized into a set of FC states, defined through their community structures, that repeat over time and subjects. The detected community structures are evaluated through a consistency measure, and the dynamic behavior of the FCNs is compared to the results from prior work.

In Chapter 4, we introduce an information-theoretic approach to reduce the dynamic network and identify the time points that carry similar topological structure in order to aggregate them into a tensor. A tensor based spectral clustering approach, similar to the one in Chapter 3, is then applied to identify the community structure of the constructed tensors. A particular focus of the proposed approach is detecting the dynamics of the community structure in the FCNs constructed from rs-fMRI data. The objective of the proposed algorithm is to reduce the dynamic FCNs by grouping the time points that are topologically similar into a set of functional connectivity states. A tensor based approach is then performed to detect the community structure of these states. Finally, the detected community structure is summarized as a set of FC states that reoccur across time. The extracted FC states revealed the dynamic behavior of the resting state networks. The detected community structure of the FC states is found to be in line with results reported in previous studies of rs-fMRI.

In Chapter 5, a temporal block spectral clustering framework is proposed to detect and track the community structure of multi-layer temporal networks. The main contribution of the proposed approach is the construction of data-driven intra- and inter-layer adjacency matrices. In particular, the proposed approach considers all possible edges between nodes within and across layers to construct the intra- and inter-layer adjacency matrices, unlike existing methods that either ignore inter-layer edges or consider inter-layer edges only between a node and itself, i.e., multiplex networks. This is performed by using a sliding window correlation approach to model the pair-wise temporal relationships between all nodes within and across time windows. A set of intra- and inter-adjacency matrices is constructed and combined to create a set of temporal supra-adjacency matrices. The community structure is then determined by applying spectral clustering to these supra-adjacency matrices. The proposed algorithm is evaluated on simulated networks and dFCNs from rs-fMRI across time and subjects and compared to results reported in previous studies.

6.2 Future Work

In Chapter 2, the proposed approach considers the minimization of the subspace distance between consecutive time points to obtain smooth cluster assignments. Future work can extend the current approach by considering a longer history and minimizing the distance between the current subspace and the subspaces of multiple previous time points. One main challenge of the proposed approach is the computational complexity, where the dominant cost for each time step corresponds to the computation of the nuclear proximal operator and the eigenvalue decomposition (EVD). Future work will consider reducing this cost by using randomized algorithms for SVD [156] or partial SVD and partial EVD as suggested in [157, 158].
In Chapter 3, the proposed approaches rely on a sliding window technique that is applied at each time point to identify the current community structure. Future work will consider the optimal selection of the window length to determine how many time points from past history should be included. Moreover, we will reduce the computational cost by reducing the sizes of the constructed tensors. This can be accomplished by adding extra constraints to the objective function. The added constraints can guarantee that only the time points or layers that are sparser than others are included in the framework, thus excluding noisy adjacency matrices.

In Chapter 4, future work will focus on different extensions of the proposed approach for temporal multi-layer networks. In particular, the aggregation of the multi-layer network can be performed by considering both intra- and inter-layer edges, since the current work focuses on intra-layer edges only. The aggregation process can be achieved by a two-step approach: first, a sliding window approach can be used to construct supra-adjacency matrices that consider both intra- and inter-layer connections between nodes within the time window; second, the time windows that possess similar topological structure can be aggregated together. This may result in a better representation of the temporal multi-layer network and reflect a more accurate community structure. After the temporal multi-layer network is reduced, a tensor based community detection approach can be applied to detect the underlying community structure.

In Chapter 5, the proposed approach detects the community structure in temporal multi-layer networks through a temporal block spectral clustering framework. Future research will investigate the extension of different definitions of graph cut to multi-layer networks and derive the corresponding spectral clustering algorithms. Moreover, a multi-layer network may also have multiple aspects. For example, each layer may correspond to a different frequency band and vary across time for neuronal networks. Extensions of spectral clustering for these types of multiple aspect multi-layer networks will also be considered.

APPENDIX

Update L(t)^{k+1}

Update L(t) by keeping only the terms with L(t):

L(t)^{k+1} = argmin_{L(t) ∈ R^{n×n}}  ||L(t)||_* + ⟨X_1^k, A(t) − L(t) − S(t)^k⟩ + (γ_1/2) ||A(t) − L(t) − S(t)^k||_F^2 + ⟨X_2^k, W(t)^k − L(t)⟩ + (γ_2/2) ||W(t)^k − L(t)||_F^2,    (1)

which can be simplified to:

L(t)^{k+1} = argmin_{L(t) ∈ R^{n×n}}  ||L(t)||_* + (γ_1/2) ||L(t) − (A(t) − S(t)^k + X_1^k/γ_1)||_F^2 + (γ_2/2) ||L(t) − (W(t)^k + X_2^k/γ_2)||_F^2
           = argmin_{L(t) ∈ R^{n×n}}  ||L(t)||_* + ((γ_1 + γ_2)/2) ||L(t) − (γ_1 Y_1^k + γ_2 Y_2^k)/(γ_1 + γ_2)||_F^2
           = prox_{(1/(γ_1+γ_2)) ||L(t)||_*} ( (γ_1 Y_1^k + γ_2 Y_2^k)/(γ_1 + γ_2) ),    (2)

where Y_1^k = A(t) − S(t)^k + X_1^k/γ_1, Y_2^k = W(t)^k + X_2^k/γ_2 and prox_f(Y) = argmin_{L ∈ R^{n×n}} f(L) + (1/2)||L − Y||_F^2 is the proximity operator of the convex function f [88]. Let Y = (γ_1 Y_1^k + γ_2 Y_2^k)/(γ_1 + γ_2), γ = γ_1 + γ_2 and Y = Q_Y Σ_Y V_Y^† be the SVD of the matrix Y; singular value soft thresholding is then used to update L(t)^{k+1} as:

L(t)^{k+1} = Q_Y Ω_{1/γ}(Σ_Y) V_Y^†,    (3)

where Ω_τ is the element-wise thresholding operator defined as Ω_τ(a) = sgn(a) max(|a| − τ, 0).
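A minimal numerical sketch of the singular value soft-thresholding step in (3) is given below, assuming NumPy; variable names are illustrative.

```python
import numpy as np

def soft_threshold(x, tau):
    """Element-wise thresholding operator: sgn(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def update_L(A, S, W, X1, X2, g1, g2):
    """Low-rank update L^{k+1} via singular value soft thresholding, as in (1)-(3)."""
    Y1 = A - S + X1 / g1
    Y2 = W + X2 / g2
    Y = (g1 * Y1 + g2 * Y2) / (g1 + g2)              # weighted combination of the two targets
    Q, sig, Vt = np.linalg.svd(Y, full_matrices=False)
    return Q @ np.diag(soft_threshold(sig, 1.0 / (g1 + g2))) @ Vt
```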
Update S(t)^{k+1}

Update S(t) by keeping only the terms with S(t):

S(t)^{k+1} = argmin_{S(t) ∈ R^{n×n}}  μ ||S(t)||_1 + ⟨X_1^k, A(t) − S(t) − L(t)^{k+1}⟩ + (γ_1/2) ||A(t) − S(t) − L(t)^{k+1}||_F^2
           = argmin_{S(t) ∈ R^{n×n}}  μ ||S(t)||_1 + (γ_1/2) ||S(t) − (A(t) − L(t)^{k+1} + X_1^k/γ_1)||_F^2.    (4)

Similar to updating L(t)^{k+1}, S(t)^{k+1} can be computed as:

S(t)^{k+1} = prox_{(μ/γ_1) ||S(t)||_1} (A(t) − L(t)^{k+1} + X_1^k/γ_1) = Ω_{μ/γ_1} (A(t) − L(t)^{k+1} + X_1^k/γ_1).    (5)

Update W(t)^{k+1}

Update W(t) by keeping only the terms with W(t):

W(t)^{k+1} = argmin_{W(t) ∈ R^{n×n}} f(W(t)) = argmin_{W(t) ∈ R^{n×n}}  λ_1 Tr( U(t)^{k†} (I − D(t)^{k,−1/2} W(t) D(t)^{k,−1/2}) U(t)^k ) + ⟨X_2^k, W(t) − L(t)^{k+1}⟩ + (γ_2/2) ||W(t) − L(t)^{k+1}||_F^2,
s.t.  W(t) = W(t)^†,  W_{ij}(t) ≥ 0,  W_{ii}(t) = 0,    (6)

where D(t)^k is the degree matrix calculated from the previous iteration. Eq. (6) can then be simplified to:

argmin_{W(t) ∈ R^{n×n}}  −λ_1 Tr( U(t)^{k†} D(t)^{k,−1/2} W(t) D(t)^{k,−1/2} U(t)^k ) + (γ_2/2) ||W(t) − (L(t)^{k+1} − X_2^k/γ_2)||_F^2,
s.t.  W(t) = W(t)^†,  W_{ij}(t) ≥ 0,  W_{ii}(t) = 0.    (7)

The gradient of the function f(W(t)) can be written as:

∇_{W(t)} f(W(t)) = γ_2 (W(t) − L(t)^{k+1} + X_2^k/γ_2) − λ_1 D(t)^{k,−1/2} U(t)^k U(t)^{k†} D(t)^{k,−1/2}.    (8)

A closed form solution for W(t) can be calculated as:

W(t)^{k+1} = L(t)^{k+1} − X_2^k/γ_2 + (λ_1/γ_2) D(t)^{k,−1/2} U(t)^k U(t)^{k†} D(t)^{k,−1/2}.    (9)

In order to guarantee that W(t) satisfies all the constraints, the spectral projected gradient (SPG) method [89] is used to update W(t) by following the steps in Algorithm 1, with the projection operator, P(W(t)), defined as:

P(W(t))_{ij} = W_{ij}(t) if i ≠ j and W_{ij}(t) ≥ 0;   P(W(t))_{ij} = 0 if i = j or W_{ij}(t) < 0.

Algorithm 1 Spectral projected gradient method (SPG)
Input: W(t), L(t), X_2(t), U(t), λ_1, γ_2, ε.
Output: W(t).
1: l ← 0
2: Initialize W(t)^0 using Eq. (9).
3: σ ← 1.
4: while ||W(t)^{l+1} − W(t)^l||_F^2 > ε do
5:    R ← P(W(t)^l − σ ∇f(W(t)^l)) − W(t)^l.
6:    Compute the step length ρ using line search [89].
7:    W(t)^{l+1} ← W(t)^l + ρ R.
8:    s ← vec(W(t)^{l+1} − W(t)^l).
9:    y ← vec(∇f(W(t)^{l+1}) − ∇f(W(t)^l)).
10:   σ ← y^†y / s^†y.
11:   l ← l + 1
12: end while

Update U(t)^{k+1}

Update U(t) by keeping all the terms with U(t) only:

U(t)^{k+1} = argmin_{U(t) ∈ R^{n×r}} f(U(t)) = argmin_{U(t) ∈ R^{n×r}}  λ_1 Tr(U(t)^† Φ(t)^{k+1} U(t)) + λ_2 d_{ch}^2(U(t), U(t−1)),   s.t.  U(t)^† U(t) = I.    (10)

Considering the three different cases for d_{Ch}^2(U(t), U(t−1)) as discussed in Section 2.3.2, the function f(U(t)) in Eq. (10) can be rewritten for the first case as:

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(U(t)^† Φ(t)^{k+1} U(t)) + λ_2 {r − Trace(U(t) U(t)^† U(t−1) U(t−1)^†)},   s.t.  U(t)^† U(t) = I.    (11)

Using the cyclic property of the trace operator and ignoring the constant term λ_2 r, Eq. (11) becomes

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(U(t)^† Φ(t)^{k+1} U(t)) − λ_2 Trace(U(t)^† U(t−1) U(t−1)^† U(t)),   s.t.  U(t)^† U(t) = I.    (12)

By rearranging the problem using properties of the trace function we get

min_{U(t) ∈ R^{n×r}}  λ_1 Trace( U(t)^† ( Φ(t)^{k+1} − (λ_2/λ_1) U(t−1) U(t−1)^† ) U(t) ),   s.t.  U(t)^† U(t) = I,    (13)

which can be rewritten as

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(U(t)^† Φ(t)^{k+1}_{mod} U(t)),   s.t.  U(t)^† U(t) = I,    (14)

where Φ(t)^{k+1}_{mod} is defined as the modified Laplacian matrix at time point t and is calculated as Φ(t)^{k+1}_{mod} = Φ(t)^{k+1} − (λ_2/λ_1) U(t−1) U(t−1)^†.

For the second case, the function f(U(t)) in Eq. (10) can be rewritten as:

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(U(t)^† Φ(t)^{k+1} U(t)) + λ_2 {r_1 − Trace(U(t) U(t)^† Û(t−1) Û(t−1)^†)},   s.t.  U(t)^† U(t) = I.    (15)

Using the cyclic property of the trace operator and ignoring the constant term, Eq. (15) becomes

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(U(t)^† Φ(t)^{k+1} U(t)) − λ_2 Trace(U(t)^† Û(t−1) Û(t−1)^† U(t)),   s.t.  U(t)^† U(t) = I,    (16)

which can be rewritten as

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(U(t)^† Φ(t)^{k+1}_{mod} U(t)),   s.t.  U(t)^† U(t) = I,    (17)

where Φ(t)^{k+1}_{mod} = Φ(t)^{k+1} − (λ_2/λ_1) Û(t−1) Û(t−1)^†. The solutions to the problems in Eq. (14) and Eq. (17) are found by choosing U(t) as the matrix containing the r eigenvectors that correspond to the smallest r eigenvalues of Φ(t)^{k+1}_{mod}, where Φ(t)^{k+1}_{mod} ∈ S_n^+ and S_n^+ is the set of positive semidefinite matrices.

For the third case, the problem in Eq. (10) can be rewritten as:

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(U(t)^† Φ(t)^{k+1} U(t)) + λ_2 {r − Trace(Û(t) Û(t)^† U(t−1) U(t−1)^†)},   s.t.  U(t)^† U(t) = I.    (18)

By ignoring the constant term, Eq. (18) becomes

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(U(t)^† Φ(t)^{k+1} U(t)) − λ_2 Trace(Û(t) Û(t)^† U(t−1) U(t−1)^†),   s.t.  U(t)^† U(t) = I.    (19)

Considering the first term in Eq. (19) and defining U(t) = [Û(t)  Ū(t)] with Û(t) ∈ R^{n×r_1} and Ū(t) ∈ R^{n×(r−r_1)}, then

Trace(U(t)^† Φ(t)^{k+1} U(t)) = Trace( [ Û(t)^† Φ(t)^{k+1} Û(t)   Û(t)^† Φ(t)^{k+1} Ū(t) ;  Ū(t)^† Φ(t)^{k+1} Û(t)   Ū(t)^† Φ(t)^{k+1} Ū(t) ] ) = Trace(Û(t)^† Φ(t)^{k+1} Û(t)) + Trace(Ū(t)^† Φ(t)^{k+1} Ū(t)).    (20)

The optimization problem in Eq. (19) can then be rewritten as:

min_{Û(t) ∈ R^{n×r_1}, Ū(t) ∈ R^{n×(r−r_1)}}  λ_1 Trace(Û(t)^† Φ(t)^{k+1} Û(t)) + λ_1 Trace(Ū(t)^† Φ(t)^{k+1} Ū(t)) − λ_2 Trace(Û(t) Û(t)^† U(t−1) U(t−1)^†),   s.t.  U(t)^† U(t) = I,    (21)

which is equivalent to

min_{Û(t) ∈ R^{n×r_1}, Ū(t) ∈ R^{n×(r−r_1)}}  λ_1 Trace( Û(t)^† ( Φ(t)^{k+1} − (λ_2/λ_1) U(t−1) U(t−1)^† ) Û(t) ) + λ_1 Trace(Ū(t)^† Φ(t)^{k+1} Ū(t)),   s.t.  U(t)^† U(t) = I.    (22)

A modified Laplacian using the current Laplacian and the eigenvectors found for the previous time point can be defined as Φ(t)^{k+1}_{mod} = Φ(t)^{k+1} − (λ_2/λ_1) U(t−1) U(t−1)^†. The optimization problem in Eq. (22) can then be written as:

min_{U(t) ∈ R^{n×r}}  λ_1 Trace(Û(t)^† Φ(t)^{k+1}_{mod} Û(t)) + λ_1 Trace(Ū(t)^† Φ(t)^{k+1} Ū(t)),   s.t.  U(t)^† U(t) = I.    (23)

The optimization problem in Eq. (23) can be written as two separate optimization problems, which are solved by splitting the single constraint in Eq. (23) into multiple orthogonality constraints as follows:

min_{Û(t) ∈ R^{n×r_1}}  λ_1 Tr(Û(t)^† Φ(t)^{k+1}_{mod} Û(t)),   s.t.  Û(t)^† Û(t) = I,
min_{Ū(t) ∈ R^{n×(r−r_1)}}  λ_1 Tr(Ū(t)^† Φ(t)^{k+1} Ū(t)),   s.t.  Ū(t)^† Ū(t) = I,  Û(t)^† Ū(t) = 0,    (24)

where Φ(t)^{k+1}_{mod} = Φ(t)^{k+1} − (λ_2/λ_1) U(t−1) U(t−1)^† is defined as the modified Laplacian matrix at time point t and Φ(t)^{k+1}_{mod} ∈ S_n^+, where S_n^+ is the set of positive semidefinite matrices.

Similar to the previous cases, the solution of the first part of this problem is found as the r_1 eigenvectors corresponding to the smallest r_1 eigenvalues of the modified Laplacian. After Û(t) is found, it is used to solve the second part subject to the two constraints in Eq. (24). In particular, a complementary set of eigenvectors that corresponds to the graph at time point t is calculated. These vectors should be orthogonal among themselves and simultaneously orthogonal to the eigenvectors from the first part, i.e., the inner product between Û(t) and Ū(t) is zero. The solution can be found using the trace penalty minimization model introduced in [159]. The trace penalty minimization model [159] that represents the second part of Eq. (24) is

min_{Ū(t) ∈ R^{n×(r−r_1)}}  f(Ū(t)) = (1/2) Trace(Ū(t)^† Φ(t)^{k+1} Ū(t)) + (ζ/4) ||Ū(t)^† Ū(t) − I||_F^2,   s.t.  Û(t)^† Ū(t) = 0,    (25)

where Û(t)^† Û(t) = I and ζ is the penalty parameter. The problem in Eq. (25) can be solved using the modified projected gradient method proposed in [159]. Initializing Ū_0(t) such that Û(t)^† Ū_0(t) = 0 yields

Ū_{j+1}(t) = Ū_j(t) − δ_j (I − Û(t) Û(t)^†) ∇f(Ū_j(t)),    (26)

where j refers to the jth iteration and δ_j is the step size. The initialization of Ū_0(t) is done by solving the following optimization problem:

min_{Ū(t) ∈ R^{n×(r−r_1)}}  Trace(Ū(t)^† Φ(t)^{k+1} Ū(t)),   s.t.  Ū(t)^† Ū(t) = I.    (27)

A closed form expression can be used to calculate the gradient of f(Ū_j(t)) at each iteration as follows:

∇f(Ū_j(t)) = Φ(t)^{k+1} Ū_j(t) + ζ Ū_j(t) (Ū_j(t)^† Ū_j(t) − I).    (28)

The step size is computed as

δ_j = Trace(V_j^† H_j) / ||H_j||_F^2,    (29)

where V_j = Ū_{j+1}(t) − Ū_j(t) and H_j = ∇f(Ū_{j+1}(t)) − ∇f(Ū_j(t)). The penalty parameter ζ is chosen such that ζ ∈ [max(0, λ_{r−r_1}), λ_n], where λ_{r−r_1} is the (r − r_1)th smallest generalized eigenvalue of the graph Laplacian Φ(t)^{k+1}. The termination criterion for the iterations is ||∇f(Ū_j(t))||_F ≤ ε_1. At the end of the iterations, Û(t) and Ū(t) are combined to form U(t).
ROIs in the AAL Template

Index     ROI
(1,2)     Precentral gyrus (PreCG)
(3,4)     Superior frontal gyrus, dorsolateral (SFGdor)
(5,6)     Superior frontal gyrus, orbital part (ORBsup)
(7,8)     Middle frontal gyrus (MFG)
(9,10)    Middle frontal gyrus, orbital part (ORBmid)
(11,12)   Inferior frontal gyrus, opercular part (IFGoperc)
(13,14)   Inferior frontal gyrus, triangular part (IFGtriang)
(15,16)   Inferior frontal gyrus, orbital part (ORBinf)
(17,18)   Rolandic operculum (ROL)
(19,20)   Supplementary motor area (SMA)
(21,22)   Olfactory cortex (OLF)
(23,24)   Superior frontal gyrus, medial (SFGmed)
(25,26)   Superior frontal gyrus, medial orbital (ORBsupmed)
(27,28)   Gyrus rectus (REC)
(29,30)   Insula (INS)
(31,32)   Anterior cingulate and paracingulate gyri (ACG)
(33,34)   Median cingulate and paracingulate gyri (DCG)
(35,36)   Posterior cingulate gyrus (PCG)
(37,38)   Hippocampus (HIP)
(39,40)   Parahippocampal gyrus (PHG)
(41,42)   Amygdala (AMYG)
(43,44)   Calcarine fissure and surrounding cortex (CAL)
(45,46)   Cuneus (CUN)
(47,48)   Lingual gyrus (LING)
(49,50)   Superior occipital gyrus (SOG)
(51,52)   Middle occipital gyrus (MOG)
(53,54)   Inferior occipital gyrus (IOG)
(55,56)   Fusiform gyrus (FFG)
(57,58)   Postcentral gyrus (PoCG)
(59,60)   Superior parietal gyrus (SPG)
(61,62)   Inferior parietal, but supramarginal and angular gyri (IPL)
(63,64)   Supramarginal gyrus (SMG)
(65,66)   Angular gyrus (ANG)
(67,68)   Precuneus (PCUN)
(69,70)   Paracentral lobule (PCL)
(71,72)   Caudate (CAU)
(73,74)   Lenticular nucleus, putamen (PUT)
(75,76)   Lenticular nucleus, pallidum (PAL)
(77,78)   Thalamus (THA)
(79,80)   Heschl gyrus (HES)
(81,82)   Superior temporal gyrus (STG)
(83,84)   Temporal pole: superior temporal gyrus (TPOsup)
(85,86)   Middle temporal gyrus (MTG)
(87,88)   Temporal pole: middle temporal gyrus (TPOmid)
(89,90)   Inferior temporal gyrus (ITG)

Table 1: List of the ROIs in the AAL 90 template. The odd and even indices refer to the left and right brain hemisphere ROIs, respectively.

BIBLIOGRAPHY

[1] “Circulargraph toolbox,” https://www.mathworks.com/matlabcentral/fileexchange/48576-circulargraph.

[2] S. Borgatti, A. Mehra, D. Brass, and G. Labianca, “Network analysis in the social sciences,” science, vol. 323, no. 5916, pp. 892–895, 2009.

[3] M. Rubinov and O. Sporns, “Complex network measures of brain connectivity: uses and interpretations,” Neuroimage, vol. 52, no. 3, pp. 1059–1069, 2010.

[4] B. Malin, “Data and collocation surveillance through location access patterns,” in Proc. NAACSOS Conf, 2004.

[5] P. Holme and J. Saramäki, “Temporal networks,” Physics reports, vol. 519, no. 3, pp. 97–125, 2012.

[6] M. Kivelä, A. Arenas, M. Barthelemy, J. Gleeson, Y. Moreno, and M. Porter, “Multilayer networks,” Journal of complex networks, vol. 2, no. 3, pp. 203–271, 2014.

[7] U. Luxburg, “A tutorial on spectral clustering,” Statistics and computing, vol. 17, no. 4, pp. 395–416, 2007.

[8] N. Andrew, J. Michael, and Y. Weiss, “On spectral clustering: analysis and an algorithm,” Advances in neural information processing systems, vol. 2, pp. 849–856, 2002.

[9] A. Jain, “Data clustering: 50 years beyond k-means,” Pattern recognition letters, vol. 31, no. 8, pp. 651–666, 2010.

[10] M. Newman, “Modularity and community structure in networks,” Proceedings of the national academy of sciences, vol. 103, no. 23, pp. 8577–8582, 2006.

[11] M. Newman and M. Girvan, “Finding and evaluating community structure in networks,” Physical review E, vol. 69, no. 2, p. 026113, 2004.

[12] A. Jain, M. Murty, and P.
[12] A. Jain, M. Murty, and P. Flynn, “Data clustering: a review,” ACM Computing Surveys (CSUR), vol. 31, no. 3, pp. 264–323, 1999.
[13] P. Tan et al., Introduction to Data Mining. Pearson Education India, 2006.
[14] M. Stoer and F. Wagner, “A simple min-cut algorithm,” Journal of the ACM (JACM), vol. 44, no. 4, pp. 585–591, 1997.
[15] L. Hagen and A. Kahng, “New spectral methods for ratio cut partitioning and clustering,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 11, no. 9, pp. 1074–1085, 1992.
[16] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
[17] D. Wagner and F. Wagner, “Between min cut and graph bisection,” in International Symposium on Mathematical Foundations of Computer Science. Springer, 1993, pp. 744–750.
[18] S. White and P. Smyth, “A spectral clustering approach to finding communities in graphs,” in Proceedings of the 2005 SIAM International Conference on Data Mining. SIAM, 2005, pp. 274–285.
[19] J. Forman, P. Clemons, S. Schreiber, and S. Haggarty, “SpectralNET–an application for spectral graph analysis and visualization,” BMC Bioinformatics, vol. 6, no. 1, p. 260, 2005.
[20] S. Fortunato and D. Hric, “Community detection in networks: A user guide,” Physics Reports, vol. 659, pp. 1–44, 2016.
[21] K. Steinhaeuser and N. Chawla, “Identifying and evaluating community structure in complex networks,” Pattern Recognition Letters, vol. 31, no. 5, pp. 413–421, 2010.
[22] E. Acar and B. Yener, “Unsupervised multiway data analysis: A literature survey,” IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 1, pp. 6–20, 2009.
[23] E. Acar, S. Camtepe, M. Krishnamoorthy, and B. Yener, “Modeling and multiway analysis of chatroom tensors,” ISI, vol. 2005, pp. 256–268, 2005.
[24] F. Estienne, N. Matthijs, D. Massart, P. Ricoux, and D. Leibovici, “Multi-way modelling of high-dimensionality electroencephalographic data,” Chemometrics and Intelligent Laboratory Systems, vol. 58, no. 1, pp. 59–72, 2001.
[25] T. Kolda and B. Bader, “Tensor decompositions and applications,” SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.
[26] L. D. Lathauwer, B. D. Moor, and J. Vandewalle, “A multilinear singular value decomposition,” SIAM Journal on Matrix Analysis and Applications, vol. 21, pp. 1253–1278, January 2000.
[27] L. D. Lathauwer, B. D. Moor, and J. Vandewalle, “On the best rank-1 and rank-(R1, R2, ..., RN) approximation of higher-order tensors,” SIAM Journal on Matrix Analysis and Applications, vol. 21, pp. 1324–1342, January 2000.
[28] T. Kolda, Multilinear Operators for Higher-Order Decompositions. United States Department of Energy, 2006.
[29] G. Palla, A. Barabási, and T. Vicsek, “Quantifying social group evolution,” Nature, vol. 446, no. 7136, pp. 664–667, 2007.
[30] T. Yang, Y. Chi, S. Zhu, Y. Gong, and R. Jin, “Detecting communities and their evolutions in dynamic social networks: a Bayesian approach,” Machine Learning, vol. 82, no. 2, pp. 157–189, 2011.
[31] D. Chakrabarti, R. Kumar, and A. Tomkins, “Evolutionary clustering,” in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. II. ACM, 2006, pp. 554–560.
[32] Y. Chi, X. Song, D. Zhou, K. Hino, and B. Tseng, “Evolutionary spectral clustering by incorporating temporal smoothness,” in Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. II. ACM, 2007, pp. 153–162.
[33] K. Xu, M. Kliger, and A. Hero, “Adaptive evolutionary clustering,” Data Mining and Knowledge Discovery, vol. 28, no. 2, pp. 304–336, 2014.
[34] L. Gauvin, A. Panisson, and C. Cattuto, “Detecting the community structure and activity patterns of temporal networks: a non-negative tensor factorization approach,” PLoS ONE, vol. 9, no. 1, 2014.
[35] P. Mucha, T. Richardson, K. Macon, M. Porter, and J. Onnela, “Community structure in time-dependent, multiscale, and multiplex networks,” Science, vol. 328, no. 5980, pp. 876–878, 2010.
[36] S. Chevallier, Q. Barthélemy, and J. Atif, “Metrics for multivariate dictionaries,” arXiv preprint arXiv:1302.4242, 2013.
[37] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications. Cambridge University Press, 1994, vol. 8.
[38] D. Krackhardt, “Cognitive social structures,” Social Networks, vol. 9, no. 2, pp. 109–134, 1987.
[39] J. Padgett and C. Ansell, “Robust action and the rise of the Medici, 1400-1434,” American Journal of Sociology, vol. 98, no. 6, pp. 1259–1319, 1993.
[40] A. Cardillo, J. Gómez-Gardenes, M. Zanin, M. Romance, D. Papo, F. D. Pozo, and S. Boccaletti, “Emergence of network features from multiplexity,” Scientific Reports, vol. 3, 2013.
[41] A. Amelio and C. Pizzuti, “Evolutionary clustering for mining and tracking dynamic multi-layer networks,” Computational Intelligence, vol. 33, no. 2, pp. 181–209, 2017.
[42] F. Babiloni, F. Cincotti, C. Babiloni, F. Carducci, D. Mattia, L. Astolfi, A. Basilisco, P. Rossini, L. Ding, Y. Ni et al., “Estimation of the cortical functional connectivity with the multimodal integration of high-resolution EEG and fMRI data by directed transfer function,” NeuroImage, vol. 24, no. 1, pp. 118–131, 2005.
[43] A. D. Martino, A. Scheres, D. Margulies, A. Kelly, L. Uddin, Z. Shehzad, B. Biswal, J. Walters, F. Castellanos, and M. Milham, “Functional connectivity of human striatum: a resting state fMRI study,” Cerebral Cortex, vol. 18, no. 12, pp. 2735–2747, 2008.
[44] M. V. D. Heuvel and H. Pol, “Exploring the brain network: a review on resting-state fMRI functional connectivity,” European Neuropsychopharmacology, vol. 20, no. 8, pp. 519–534, 2010.
[45] M. Steriade, P. Gloor, R. Llinas, F. D. Silva, and M. Mesulam, “Basic mechanisms of cerebral rhythmic activities,” Electroencephalography and Clinical Neurophysiology, vol. 76, no. 6, pp. 481–508, 1990.
[46] S. Bressler, “Large-scale cortical networks and cognition,” Brain Research Reviews, vol. 20, no. 3, pp. 288–304, 1995.
[47] F. Varela, J. Lachaux, E. Rodriguez, and J. Martinerie, “The brainweb: phase synchronization and large-scale integration,” Nature Reviews Neuroscience, vol. 2, no. 4, p. 229, 2001.
[48] E. Bullmore and O. Sporns, “Complex brain networks: graph theoretical analysis of structural and functional systems,” Nature Reviews Neuroscience, vol. 10, no. 3, p. 186, 2009.
[49] M. van den Heuvel, C. Stam, M. Boersma, and H. Pol, “Small-world and scale-free organization of voxel-based resting-state functional connectivity in the human brain,” NeuroImage, vol. 43, no. 3, pp. 528–539, 2008.
[50] C. Stam and J. Reijneveld, “Graph theoretical analysis of complex networks in the brain,” Nonlinear Biomedical Physics, vol. 1, no. 1, p. 3, 2007.
[51] R. Prabhakaran, S. Blumstein, E. Myers, E. Hutchison, and B. Britton, “An event-related fMRI investigation of phonological–lexical competition,” Neuropsychologia, vol. 44, no. 12, pp. 2209–2221, 2006.
[52] P. Boveroux, A. Vanhaudenhuyse, M. Bruno, Q. Noirhomme, S. Lauwick, A. Luxen, C. Degueldre, A. Plenevaux, C. Schnakers, C. Phillips et al., “Breakdown of within- and between-network resting state functional magnetic resonance imaging connectivity during propofol-induced loss of consciousness,” Anesthesiology: The Journal of the American Society of Anesthesiologists, vol. 113, no. 5, pp. 1038–1053, 2010.
[53] C. Chang and G. Glover, “Time–frequency dynamics of resting-state brain connectivity measured with fMRI,” NeuroImage, vol. 50, no. 1, pp. 81–98, 2010.
[54] B. Rogers, V. Morgan, A. Newton, and J. Gore, “Assessing functional connectivity in the human brain by fMRI,” Magnetic Resonance Imaging, vol. 25, no. 10, pp. 1347–1357, 2007.
[55] E. Allen, E. Damaraju, S. Plis, E. Erhardt, T. Eichele, and V. Calhoun, “Tracking whole-brain connectivity dynamics in the resting state,” Cerebral Cortex, p. bhs352, 2012.
[56] N. Leonardi, J. Richiardi, M. Gschwind, S. Simioni, J. Annoni, M. Schluep, P. Vuilleumier, and D. V. D. Ville, “Principal components of functional connectivity: a new approach to study dynamic brain connectivity during rest,” NeuroImage, vol. 83, pp. 937–950, 2013.
[57] B. Biswal, F. Yetkin, V. Haughton, and J. Hyde, “Functional connectivity in the motor cortex of resting human brain using echo-planar MRI,” Magnetic Resonance in Medicine, vol. 34, no. 4, pp. 537–541, 1995.
[58] C. Rosazza and L. Minati, “Resting-state brain networks: literature review and clinical applications,” Neurological Sciences, vol. 32, no. 5, pp. 773–785, 2011.
[59] J. Damoiseaux, S. Rombouts, F. Barkhof, P. Scheltens, C. Stam, S. Smith, and C. Beckmann, “Consistent resting-state networks across healthy subjects,” Proceedings of the National Academy of Sciences, vol. 103, no. 37, pp. 13848–13853, 2006.
[60] R. Hutchison, T. Womelsdorf, E. Allen, P. Bandettini, V. Calhoun, M. Corbetta, S. D. Penna, J. Duyn, G. Glover, J. Gonzalez-Castillo et al., “Dynamic functional connectivity: promise, issues, and interpretations,” NeuroImage, vol. 80, pp. 360–378, 2013.
[61] R. Goebel, F. Esposito, and E. Formisano, “Analysis of functional image analysis contest (FIAC) data with BrainVoyager QX: From single-subject to cortically aligned group general linear model analysis and self-organizing group independent component analysis,” Human Brain Mapping, vol. 27, no. 5, pp. 392–401, 2006.
[62] J. Schrouff, V. Perlbarg, M. Boly, G. Marrelec, P. Boveroux, A. Vanhaudenhuyse, M. Bruno, S. Laureys, C. Phillips, M. Pélégrini-Issac et al., “Brain functional integration decreases during propofol-induced loss of consciousness,” NeuroImage, vol. 57, no. 1, pp. 198–205, 2011.
[63] W. Jamal, S. Das, and K. Maharatna, “Existence of millisecond-order stable states in time-varying phase synchronization measure in EEG signals,” in Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE. IEEE, 2013, pp. 2539–2542.
[64] W. Jamal, S. Das, K. Maharatna, D. Kuyucu, F. Sicca, L. Billeci, F. Apicella, and F. Muratori, “Using brain connectivity measure of EEG synchrostates for discriminating typical and autism spectrum disorder,” in Neural Engineering (NER), 2013 6th International IEEE/EMBS Conference on. IEEE, 2013, pp. 1402–1405.
[65] W. Jamal, S. Das, I. Oprescu, K. Maharatna, F. Apicella, and F. Sicca, “Classification of autism spectrum disorder using supervised learning of brain connectivity measures extracted from synchrostates,” Journal of Neural Engineering, vol. 11, no. 4, p. 046019, 2014.
[66] W. Jamal, S. Das, K. Maharatna, F. Apicella, G. Chronaki, F. Sicca, D. Cohen, and F. Muratori, “On the existence of synchrostates in multichannel EEG signals during face-perception tasks,” Biomedical Physics & Engineering Express, vol. 1, no. 1, p. 015002, 2015.
[67] S. Dimitriadis, N. Laskaris, and A. Tzelepi, “On the quantization of time-varying phase synchrony patterns into distinct functional connectivity microstates (FCµstates) in a multi-trial visual ERP paradigm,” Brain Topography, vol. 26, no. 3, pp. 397–409, 2013.
[68] J. Ou, L. Xie, C. Jin, X. Li, D. Zhu, R. Jiang, Y. Chen, J. Zhang, L. Li, and T. Liu, “Characterizing and differentiating brain state dynamics via hidden Markov models,” Brain Topography, vol. 28, no. 5, pp. 666–679, 2015.
[69] S. Ma, V. Calhoun, R. Phlypo, and T. Adalı, “Dynamic changes of spatial functional network connectivity in healthy individuals and schizophrenia patients using independent vector analysis,” NeuroImage, vol. 90, pp. 196–206, 2014.
[70] S. Mehrkanoon, M. Breakspear, and T. Boonstra, “Low-dimensional dynamics of resting-state cortical activity,” Brain Topography, vol. 27, no. 3, pp. 338–352, 2014.
[71] N. Leonardi, W. Shirer, M. Greicius, and D. V. D. Ville, “Disentangling dynamic networks: Separated and joint expressions of functional connectivity patterns in time,” Human Brain Mapping, vol. 35, no. 12, pp. 5984–5995, 2014.
[72] A. Mutlu, E. Bernat, and S. Aviyente, “A signal-processing-based approach to time-varying graph analysis for dynamic brain network identification,” Computational and Mathematical Methods in Medicine, vol. 2012, 2012.
[73] A. Mahyari, D. Zoltowski, E. Bernat, and S. Aviyente, “A tensor decomposition-based approach for detecting dynamic network states from EEG,” IEEE Transactions on Biomedical Engineering, vol. 64, no. 1, pp. 225–237, 2017.
[74] I. Cribben, R. Haraldsdottir, L. Atlas, T. Wager, and M. Lindquist, “Dynamic connectivity regression: determining state-related changes in brain connectivity,” NeuroImage, vol. 61, no. 4, pp. 907–920, 2012.
[75] J. Zhang, X. Li, C. Li, Z. Lian, X. Huang, G. Zhong, D. Zhu, K. Li, C. Jin, X. Hu et al., “Inferring functional interaction and transition patterns via dynamic Bayesian variable partition models,” Human Brain Mapping, vol. 35, no. 7, pp. 3314–3331, 2014.
[76] A. Ozdemir, E. Bernat, and S. Aviyente, “Recursive tensor subspace tracking for dynamic brain network analysis,” IEEE Transactions on Signal and Information Processing over Networks, 2017.
[77] S. Fortunato, “Community detection in graphs,” Physics Reports, vol. 486, no. 3, pp. 75–174, 2010.
[78] K. Tu, B. Ribeiro, A. Swami, and D. Towsley, “Temporal clustering in dynamic networks with tensor decomposition,” arXiv preprint arXiv:1605.08074, 2016.
[79] F. Folino and C. Pizzuti, “Multiobjective evolutionary community detection for dynamic networks,” in Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation. ACM, 2010, pp. 535–536.
[80] C. Eckart and G. Young, “The approximation of one matrix by another of lower rank,” Psychometrika, vol. 1, no. 3, pp. 211–218, 1936.
[81] H. Hotelling, “Analysis of a complex of statistical variables into principal components,” Journal of Educational Psychology, vol. 24, no. 6, p. 417, 1933.
[82] N. Kishore Kumar and J. Schneider, “Literature survey on low rank approximation of matrices,” Linear and Multilinear Algebra, vol. 65, no. 11, pp. 2212–2244, 2017.
[83] E. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?” Journal of the ACM (JACM), vol. 58, no. 3, p. 11, 2011.
[84] K. Ye and L. Lim, “Distance between subspaces of different dimensions,” arXiv e-prints, 2014.
[85] M. Meilă, “Comparing clusterings: an information based distance,” Journal of Multivariate Analysis, vol. 98, no. 5, pp. 873–895, 2007.
[86] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein et al., “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends® in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
[87] N. Parikh, S. Boyd et al., “Proximal algorithms,” Foundations and Trends® in Optimization, vol. 1, no. 3, pp. 127–239, 2014.
[88] P. Combettes and J. Pesquet, “Proximal splitting methods in signal processing,” in Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Springer, 2011, pp. 185–212.
[89] E. Birgin, J. Martínez, M. Raydan et al., “Spectral projected gradient methods: review and perspectives,” J. Stat. Softw., vol. 60, no. 3, pp. 1–21, 2014.
[90] V. Traag, R. Aldecoa, and J. Delvenne, “Detecting communities using asymptotical surprise,” Physical Review E, vol. 92, no. 2, p. 022816, 2015.
[91] M. Girvan and M. Newman, “Community structure in social and biological networks,” Proceedings of the National Academy of Sciences, vol. 99, no. 12, pp. 7821–7826, 2002.
[92] Y. Lin, Y. Chi, S. Zhu, H. Sundaram, and B. Tseng, “Facetnet: a framework for analyzing communities and their evolutions in dynamic networks,” in Proceedings of the 17th International Conference on World Wide Web. ACM, 2008, pp. 685–694.
[93] N. Eagle and A. Pentland, “Reality mining: sensing complex social systems,” Personal and Ubiquitous Computing, vol. 10, no. 4, pp. 255–268, 2006.
[94] J. Stehlé, N. Voirin, A. Barrat, C. Cattuto, L. Isella, J. Pinton, M. Quaggiotto, W. V. den Broeck, C. Régis, B. Lina et al., “High-resolution measurements of face-to-face contact patterns in a primary school,” PLoS ONE, vol. 6, no. 8, p. e23176, 2011.
[95] V. Gemmetto, A. Barrat, and C. Cattuto, “Mitigation of infectious disease at school: targeted class closure vs school closure,” BMC Infectious Diseases, vol. 14, no. 1, p. 695, 2014.
[96] S. Fortunato, “Community detection in graphs,” Physics Reports, vol. 486, no. 3, pp. 75–174, 2010.
[97] M. Yaesoubi et al., “Mutually temporally independent connectivity patterns: A new framework to study the dynamics of brain connectivity at rest with application to explain group difference based on gender,” NeuroImage, vol. 107, pp. 85–94, 2015.
[98] “Metrics for community analysis: A survey.”
[99] G. Rossetti and R. Cazabet, “Community discovery in dynamic networks: A survey,” ACM Computing Surveys (CSUR), vol. 51, no. 2, p. 35, 2018.
[100] “Global spectral clustering in dynamic networks.”
[101] S. Fortunato and M. Barthelemy, “Resolution limit in community detection,” Proceedings of the National Academy of Sciences, vol. 104, no. 1, pp. 36–41, 2007.
[102] J. Reichardt and S. Bornholdt, “Statistical mechanics of community detection,” Physical Review E, vol. 74, p. 016110, 2006.
[103] M. Newman, “Community detection in networks: Modularity optimization and maximum likelihood are equivalent,” arXiv preprint arXiv:1606.02319, 2016.
[104] S. Kullback and R. Leibler, “On information and sufficiency,” Annals of Mathematical Statistics, vol. 22, pp. 79–86, 1951.
[105] C. Nicolini et al., “Community detection in weighted brain connectivity networks beyond the resolution limit,” NeuroImage, vol. 146, pp. 28–39, 2017.
[106] S. Papadimitriou, J. Sun, and C. Faloutsos, “Streaming pattern discovery in multiple time-series,” in Proceedings of the 31st International Conference on Very Large Data Bases. VLDB Endowment, 2005, pp. 697–708.
[107] J. Sun, D. Tao, S. Papadimitriou, P. Yu, and C. Faloutsos, “Incremental tensor analysis: Theory and applications,” ACM Transactions on Knowledge Discovery from Data, vol. 2, no. 3, p. 11, 2008.
[108] M. Lavielle, “Using penalized contrasts for the change-point problem,” Signal Processing, vol. 85, no. 8, pp. 1501–1510, 2005.
[109] “findchangepts,” www.mathworks.com/help/signal/ref/findchangepts.html.
[110] “A generalized Louvain method for community detection implemented in MATLAB.”
[111] C. Reynolds, “Flocks, herds and schools: A distributed behavioral model,” ACM SIGGRAPH Computer Graphics, vol. 21, no. 4, pp. 25–34, 1987.
[112] J. Cavanagh, M. Cohen, and J. Allen, “Prelude to and resolution of an error: EEG phase synchrony reveals cognitive control dynamics during action monitoring,” The Journal of Neuroscience, vol. 29, pp. 98–105, 2009.
[113] J. Kayser and C. Tenke, “Principal components analysis of Laplacian waveforms as a generic method for identifying ERP generator patterns: I. Evaluation with auditory oddball tasks,” Clinical Neurophysiology, vol. 117, no. 2, pp. 348–368, 2006.
[114] S. Aviyente, E. Bernat, W. Evans, and S. Sponheim, “A phase synchrony measure for quantifying dynamic functional integration in the brain,” Human Brain Mapping, vol. 32, no. 1, pp. 80–93, 2011.
[115] “1000 Functional Connectomes Project,” http://www.nitrc.org/projects/fcon_1000
[116] “Statistical Parametric Mapping,” http://www.fil.ion.ucl.ac.uk/spm/
[117] “Functional Connectivity Toolbox,” http://www.nitrc.org/projects/conn
[118] N. Tzourio-Mazoyer, B. Landeau, D. Papathanassiou, F. Crivello, O. Etard, N. Delcroix, B. Mazoyer, and M. Joliot, “Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain,” NeuroImage, vol. 15, no. 1, pp. 273–289, 2002.
[119] N. Leonardi and D. V. D. Ville, “Identifying network correlates of brain states using tensor decompositions of whole-brain dynamic functional connectivity,” in Pattern Recognition in Neuroimaging (PRNI), 2013 International Workshop on. IEEE, 2013, pp. 74–77.
[120] X. Li et al., “Dynamic functional connectomics signatures for characterization and differentiation of PTSD patients,” Human Brain Mapping, vol. 35, no. 4, pp. 1761–1778, 2014.
[121] H. Eavani, T. Satterthwaite, R. Gur, R. Gur, and C. Davatzikos, “Unsupervised learning of functional network dynamics in resting state fMRI,” in International Conference on Information Processing in Medical Imaging. Springer, 2013, pp. 426–437.
[122] M. Raichle, A. MacLeod, A. Snyder, W. Powers, D. Gusnard, and G. Shulman, “A default mode of brain function,” Proceedings of the National Academy of Sciences, vol. 98, no. 2, pp. 676–682, 2001.
[123] M. Greicius, K. Supekar, V. Menon, and R. Dougherty, “Resting-state functional connectivity reflects structural connectivity in the default mode network,” Cerebral Cortex, vol. 19, no. 1, pp. 72–78, 2009.
[124] Y. Zhou, N. Shu, Y. Liu, M. Song, Y. Hao, H. Liu, C. Yu, Z. Liu, and T. Jiang, “Altered resting-state functional connectivity and anatomical connectivity of hippocampus in schizophrenia,” Schizophrenia Research, vol. 100, no. 1, pp. 120–132, 2008.
[125] A. David and L. Pierre, “Hippocampal neuroanatomy,” in The Hippocampus Book. Oxford University Press, 2009.
[126] J. Vincent, A. Snyder, M. Fox, B. Shannon, J. Andrews, M. Raichle, and R. Buckner, “Coherent spontaneous activity identifies a hippocampal-parietal memory network,” Journal of Neurophysiology, vol. 96, no. 6, pp. 3517–3531, 2006.
[127] Z. Yang, R. Craddock, D. Margulies, C. Yan, and M. Milham, “Common intrinsic connectivity states among posteromedial cortex subdivisions: Insights from analysis of temporal dynamics,” NeuroImage, vol. 93, pp. 124–137, 2014.
[128] F. Karahanoğlu and D. V. D. Ville, “Transient brain activity disentangles fMRI resting-state dynamics in terms of spatially and temporally overlapping networks,” Nature Communications, vol. 6, 2015.
[129] “Learning-induced autonomy of sensorimotor systems.”
[130] M. Lowe, B. Mock, and J. Sorenson, “Functional connectivity in single and multislice echo-planar imaging using resting-state fluctuations,” NeuroImage, vol. 7, no. 2, pp. 119–132, 1998.
[131] M. Greicius, B. Krasnow, A. Reiss, and V. Menon, “Functional connectivity in the resting brain: a network analysis of the default mode hypothesis,” Proceedings of the National Academy of Sciences, vol. 100, no. 1, pp. 253–258, 2003.
[132] D. Meunier, R. Lambiotte, and E. Bullmore, “Modular and hierarchically modular organization of brain networks,” Frontiers in Neuroscience, vol. 4, p. 200, 2010.
[133] E. Bullmore and D. Bassett, “Brain graphs: graphical models of the human brain connectome,” Annual Review of Clinical Psychology, vol. 7, pp. 113–140, 2011.
[134] E. Al-sharoa, M. Al-khassaweneh, and S. Aviyente, “A tensor based framework for community detection in dynamic networks,” in Proceedings of ICASSP 2017, to be published.
[135] D. Meunier, S. Achard, A. Morcom, and E. Bullmore, “Age-related changes in modular organization of human brain functional networks,” NeuroImage, vol. 44, no. 3, pp. 715–723, 2009.
[136] M. Rubinov and O. Sporns, “Weight-conserving characterization of complex functional brain networks,” NeuroImage, vol. 56, no. 4, pp. 2068–2079, 2011.
[137] J. Power, A. Cohen, S. Nelson, G. Wig, K. Barnes, J. Church, A. Vogel, T. Laumann, F. Miezin, B. Schlaggar et al., “Functional network organization of the human brain,” Neuron, vol. 72, no. 4, pp. 665–678, 2011.
[138] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. Farmer, “Testing for nonlinearity in time series: the method of surrogate data,” Physica D: Nonlinear Phenomena, vol. 58, no. 1-4, pp. 77–94, 1992.
[139] T. Schreiber and A. Schmitz, “Surrogate time series,” Physica D: Nonlinear Phenomena, vol. 142, no. 3, pp. 346–382, 2000.
[140] M. D. Domenico, V. Nicosia, A. Arenas, and V. Latora, “Structural reducibility of multilayer networks,” Nature Communications, vol. 6, p. 6864, 2015.
[141] X. Di and B. Biswal, “Dynamic brain functional connectivity modulated by resting-state networks,” Brain Structure & Function, vol. 220, no. 1, p. 37, 2015.
[142] A. Schaefer, D. Margulies, G. Lohmann, K. Gorgolewski, J. Smallwood, S. Kiebel, and A. Villringer, “Dynamic network participation of functional connectivity hubs assessed by resting-state fMRI,” Frontiers in Human Neuroscience, vol. 8, 2014.
[143] P. Fransson and G. Marrelec, “The precuneus/posterior cingulate cortex plays a pivotal role in the default mode network: Evidence from a partial correlation network analysis,” NeuroImage, vol. 42, no. 3, pp. 1178–1184, 2008.
[144] L. Uddin, K. Clare, B. Biswal, C. Xavier, and M. Milham, “Functional connectivity of default mode network components: correlation, anticorrelation, and causality,” Human Brain Mapping, vol. 30, no. 2, pp. 625–637, 2009.
[145] M. Newman, Networks: An Introduction. Oxford University Press, 2010.
[146] X. Dong, P. Frossard, P. Vandergheynst, and N. Nefedov, “Clustering with multi-layer graphs: A spectral perspective,” IEEE Transactions on Signal Processing, vol. 60, no. 11, pp. 5820–5831, 2012.
[147] N. Stanley, S. Shai, D. Taylor, and P. Mucha, “Clustering network layers with the strata multilayer stochastic block model,” IEEE Transactions on Network Science and Engineering, vol. 3, no. 2, pp. 95–105, 2016.
[148] T. Valles-Catala, F. Massucci, R. Guimera, and M. Sales-Pardo, “Multilayer stochastic block models reveal the multilayer structure of complex networks,” Physical Review X, vol. 6, no. 1, p. 011036, 2016.
[149] E. Al-Sharoa, M. Al-khassaweneh, and S. Aviyente, “A tensor based framework for community detection in dynamic networks,” in Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE International Conference on. IEEE, 2017, pp. 2312–2316.
[150] E. Al-sharoa, M. Al-khassaweneh, and S. Aviyente, “Detecting brain dynamics during resting state: a tensor based evolutionary clustering approach,” in Wavelets and Sparsity XVII, vol. 10394. International Society for Optics and Photonics, 2017, p. 103940G.
[151] C. Chen, M. Ng, and S. Zhang, “Block spectral clustering for multiple graphs with inter-relation,” Network Modeling Analysis in Health Informatics and Bioinformatics, vol. 6, no. 1, p. 8, 2017.
[152] R. Kanawati, “Multiplex network mining: a brief survey,” IEEE Intelligent Informatics Bulletin, vol. 16, pp. 24–28, 2015.
[153] M. Berlingerio, M. Coscia, F. Giannotti, A. Monreale, and D. Pedreschi, “Evolving networks: Eras and turning points,” Intelligent Data Analysis, vol. 17, no. 1, pp. 27–48, 2013.
[154] M. D. Domenico, A. Solé, S. Gómez, and A. Arenas, “Random walks on multiplex networks,” arXiv preprint arXiv:1306.0519, 2013.
[155] C. Chen, M. Ng, and S. Zhang, “Block spectral clustering methods for multiple graphs,” Numerical Linear Algebra with Applications, vol. 24, no. 1, 2017.
[156] R. Witten and E. Candès, “Randomized algorithms for low-rank matrix factorizations: sharp performance bounds,” Algorithmica, vol. 72, no. 1, pp. 264–281, 2015.
[157] D. Goldfarb and S. Ma, “Convergence of fixed-point continuation algorithms for matrix rank minimization,” Foundations of Computational Mathematics, vol. 11, no. 2, pp. 183–210, 2011.
[158] Y. Nakatsukasa and N. Higham, “Stable and efficient spectral divide and conquer algorithms for the symmetric eigenvalue decomposition and the SVD,” SIAM Journal on Scientific Computing, vol. 35, no. 3, pp. A1325–A1349, 2013.
[159] Z. Wen, C. Yang, X. Liu, and Y. Zhang, “Trace-penalty minimization for large-scale eigenspace computation,” Journal of Scientific Computing, vol. 66, no. 3, pp. 1175–1203, 2016.