A MULTIVARIATE TIME-FREQUENCY BASED PHASE SYNCHRONY MEASURE AND APPLICATIONS TO DYNAMIC BRAIN NETWORK ANALYSIS By Ali Yener Mutlu A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Electrical Engineering 2012 ABSTRACT A MULTIVARIATE TIME-FREQUENCY BASED PHASE SYNCHRONY MEASURE AND APPLICATIONS TO DYNAMIC BRAIN NETWORK ANALYSIS By Ali Yener Mutlu Irregular, non-stationary, and noisy multichannel data are abound in many fields of research. Observations of multichannel data in nature include changes in weather, the dynamics of satellites in the solar system, the time evolution of the magnetic field of celestial bodies, population growth in ecology and the dynamics of the action potentials in neurons [1, 2]. One particular application of interest is the functional integration of neuronal networks in the human brain. Human brain is known to be one of the most complex biological systems and quantifying functional neural coordination in the brain is a fundamental problem. It has been recently proposed that networks of highly nonlinear and non-stationary reciprocal interactions are the key features of functional integration. Among many linear and nonlinear measures of dependency, time-varying phase synchrony has been proposed as a promising measure of connectivity. Current state-of-the-art in time-varying phase estimation uses either the Hilbert transform or the complex wavelet transform of the signals [3]. Both of these methods have some major drawbacks such as the assumption that the signals are narrowband for the Hilbert transform and the non-uniform time-frequency resolution inherent to the wavelet analysis. Furthermore, the current phase synchrony measures are limited to quantifying bivariate relationships and do not reveal any information about multivariate synchronization patterns which are important for understanding the underlying oscillatory networks. In this dissertation, a new phase estimation method based on the Rihaczek distribution and Reduced Interference Rihaczek distribution belonging to Cohen’s class is proposed. These distributions offer phase estimates with uniformly high time-frequency resolution which can be used for defining time and frequency dependent phase synchrony within the same frequency band as well as across different frequency bands. Properties of the phase estimator and the corresponding phase synchrony measure are evaluated both analytically and through simulations showing the effectiveness of the new measures compared to existing ones. The proposed distribution is then extended to quantify the cross-frequency phase synchronization between two signals across different frequencies. In addition, a cross frequency-spectral lag distribution is introduced to quantify the amount of amplitude modulation between signals. Furthermore, the notion of bivariate synchrony is extended to multivariate synchronization to quantify the relationships within and across groups of signals. Measures of multiple correlation and complexity are used as well as a more direct multivariate synchronization measure, ‘Hyperspherical Phase Synchrony’, is proposed. This new measure is based on computing pairwise phase differences to create a multidimensional phase difference vector and mapping this vector to a high dimensional space. Hyperspherical phase synchrony offers lower computational complexity and is more robust to noise compared to the existing measures. Finally, a subspace analysis framework is proposed for studying timevarying evolution of functional brain connectivity. The proposed approach identifies event intervals accounting for the underlying neurophysiological events and extracts key graphs for describing the particular intervals with minimal redundancy. Results from the application to EEG data indicate the effectiveness of the proposed framework in determining the event intervals and summarizing brain activity with a few number of representative graphs. Copyright by ALI YENER MUTLU 2012 TABLE OF CONTENTS LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chapter 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 Assessing Functional Brain Connectivity Using Phase Synchronization . . 1.2 Existing Phase Synchrony Methods and Extensions to Dynamic Networks 1.3 Contributions and Organization of the Dissertation . . . . . . . . . . . . . . . . . . . . 1 2 5 9 Chapter 2 2.1 2.2 2.3 2.4 2.5 2.6 2.7 A Time-Frequency Based Approach to Phase and Phase Synchrony Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.1 Measures of Phase Synchrony . . . . . . . . . . . . . . . . . . . . . . 2.2.2 Continuous Wavelet Transform . . . . . . . . . . . . . . . . . . . . . 2.2.3 Cohen’s Class of Time-Frequency Distributions . . . . . . . . . . . . Rihaczek Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.1 Reduced Interference Rihaczek Distribution (RID-Rihaczek) . . . . . 2.3.2 Implementation of the Proposed TFDs . . . . . . . . . . . . . . . . . 2.3.3 Time-Varying Phase Estimation and Phase Synchrony . . . . . . . . Cramer-Rao Lower Bound for the Phase Estimator . . . . . . . . . . . . . . Evaluation of the Statistical Properties of Phase Difference and Phase Locking Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5.1 Distribution of the Phase Difference . . . . . . . . . . . . . . . . . . . 2.5.2 Bias of the Synchrony Measure: Dependency of Phase Locking Value on the Number of Trials . . . . . . . . . . . . . . . . . . . . . . . . . Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.6.1 Performance of the Rihaczek Distribution in Estimating Time-Varying Phase and Phase Difference . . . . . . . . . . . . . . . . . . . . . . . 2.6.2 Effect of the Kernel on Phase Estimation . . . . . . . . . . . . . . . . 2.6.3 Statistical Evaluation of the Phase Estimator: Comparison of CRLB and Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.6.4 Evaluation of the Time-Frequency Resolution of RID-TFPS . . . . . 2.6.5 Robustness of RID-TFPS to Noise . . . . . . . . . . . . . . . . . . . 2.6.6 Kuramoto Model: Comparison of RID-TFPS with Wavelet-TFPS for Multiple Oscillators . . . . . . . . . . . . . . . . . . . . . . . . . . . . Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix v 12 12 16 16 19 20 22 23 24 26 28 31 32 33 36 37 38 40 41 44 47 50 Chapter 3 Joint-Frequency Representations for Cross-Frequency Coupling 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Cross Frequency Phase Synchronization . . . . . . . . . . . . . . . . . . . . . 3.2.1 Accuracy of RID-Rihaczek Distribution in Estimating CF Phase Synchrony . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.2 Simulation Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.2.1 Evaluating Amplitude Modulation Through CFPLV . . . . 3.3 Joint Frequency Spectral Lag Representation for Cross-Frequency Modulation Analysis in the Brain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 Joint Frequency-Spectral Lag Distribution . . . . . . . . . . . . . . . 3.3.2 Cross Frequency-Spectral Lag Distribution . . . . . . . . . . . . . . . 3.3.2.1 Quantification of Modulation . . . . . . . . . . . . . . . . . 3.3.3 Simulation Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.4 Application to EEG Data . . . . . . . . . . . . . . . . . . . . . . . . 3.3.4.1 Power-to-power Modulation . . . . . . . . . . . . . . . . . . 3.3.4.2 Phase-to-power Modulation . . . . . . . . . . . . . . . . . . 3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 52 56 57 59 62 64 66 67 70 73 76 77 78 79 Chapter 4 Methods for Quantifying Multivariate Phase Synchronization 81 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.2 Bivariate Synchrony Dependent Multivariate Synchrony Measures . . . . . . 85 4.2.1 Measures of Multivariate Synchronization . . . . . . . . . . . . . . . 85 4.2.2 Proposed Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.2.3 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 4.2.3.1 Rossler Oscillator Model . . . . . . . . . . . . . . . . . . . . 89 4.2.3.2 Performance of Multivariate Synchrony Measures In Quantifying Within and Between Cluster Synchrony . . . . . . . . 90 4.2.3.3 Significance Testing For the Multivariate Synchrony Measures 96 4.2.4 Application to EEG Data . . . . . . . . . . . . . . . . . . . . . . . . 97 4.3 Hyperspherical Phase Synchrony . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.3.1 Simulation Results: Robustness of HPS to Noise . . . . . . . . . . . . 101 4.3.2 Application to EEG Data . . . . . . . . . . . . . . . . . . . . . . . . 102 4.3.2.1 EEG Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 4.3.2.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 4.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 Chapter 5 Time-Varying Graph Analysis for Dynamic Brain Network Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.2 Network Change Detection Using PCA . . . . . . . . . . . . . . . . . 5.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3.1 Forming Time-Varying Graphs via Phase Synchronization . . . . . . 5.3.2 Event Interval Detection . . . . . . . . . . . . . . . . . . . . . . . . . vi 110 110 113 113 114 115 115 116 5.3.3 Summarizing Signal Subspace for Key Graph Estimation 5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.1 Application to Social Networks . . . . . . . . . . . . . . 5.4.2 Application to EEG Data . . . . . . . . . . . . . . . . . 5.4.2.1 Data . . . . . . . . . . . . . . . . . . . . . . . . 5.4.2.2 Network-wide Change Detection . . . . . . . . 5.4.3 Significance Testing for the Key Graph Estimation . . . 5.4.3.1 Key Graph Estimation . . . . . . . . . . . . . . 5.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chapter 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 121 121 124 124 126 127 129 133 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . 135 Bibliography . . . . . . . . . . . . . . . vii . . . . . . . . . . . . . . . . . . . 140 LIST OF TABLES Table 4.1 Table 5.1 Means and standard deviations of Rv , Sg , Sy and ST for the networks in Fig. 4.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 Mean and standard deviation values of the normalized Frobenius inner products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 viii LIST OF FIGURES Figure 2.1 Effect of filtering: Magnitudes of the original and the Reduced Interference Rihaczek distributions for the sum of two signals, x(t) = (t−100)2 (t−200)2 2 2 ej0.2t + e− ej0.8t . For interpretation of the refere− ences to color in this and all other figures, the reader is referred to the electronic version of this dissertation. . . . . . . . . . . . . . . . Figure 2.2 Figure 2.3 Figure 2.4 Figure 2.5 Figure 2.6 Figure 2.7 Figure 2.8 25 Experimental and theoretical distributions of the phase difference for a low-synchrony signal pair . . . . . . . . . . . . . . . . . . . . . . . 32 Empirical and the hypothesized distributions of ξ in Eq. (2.32): Empirical pdf is obtained from 100000 independent random variables generated in accordance with Eq. (2.32). . . . . . . . . . . . . . . . 36 Average PLV value for a low-synchrony pair as a function of the number of trials (Error bar indicates the standard deviation) . . . . 37 Performance of Rihaczek distribution in estimating the time-varying phase difference between two chirp signals: (Left) Magnitudes of the two signals, (Right) Actual and estimated phase differences (in radians) as a function of time. . . . . . . . . . . . . . . . . . . . . . . . 38 Actual and estimated unwrapped time-varying phase at the instantaneous frequency for the Rihaczek and RID-Rihaczek distributions in the absence of noise: The signal in Eq. (2.38) is considered with 100 Hz sampling frequency. . . . . . . . . . . . . . . . . . . . . . . . 39 Comparison of the performance of RID-Rihaczek with the performance of CWT in tracking the time and frequency varying phase: (a)-(b) illustrate the total variance and total CRLB for both of the phase estimators as a function of SNR. 1600 simulations are performed for 128 time points. . . . . . . . . . . . . . . . . . . . . . . . 42 Comparison of the performance of RID-Rihaczek with the performance of CWT in tracking the time and frequency varying phase: (a)-(b) illustrate the normalized bias for both of the phase estimators as a function of SNR. 1600 simulations are performed for 128 time points. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 ix Figure 2.9 Figure 2.10 Figure 2.11 Figure 3.1 Figure 3.2 Figure 3.3 RID-TFPS vs Wavelet-TFPS: Phase synchrony in the time-frequency plane for two chirp signals with constant phase shift . . . . . . . . . 45 Comparison of noise robustness: Performances of the RID-TFPS and Wavelet-TFPS in estimating the true phase synchrony between a signal pair with constant phase difference for different SNR values, bars indicate the standard deviations. . . . . . . . . . . . . . . . . . . . . 46 Mean RID-TFPS and mean Wavelet-TFPS (PLV, averaged over all possible pairs of 16 oscillators) as a function of the coupling strength K, where the number of trials, N = 200. Error bar indicates the standard deviation. Center frequency of the oscillators is w0 = 200 rad/sec and the distribution width γ = 60 rad/sec which results in Kcrit = 120. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 Principal Forms of Cross-Frequency Interactions [4]: (a) A slow oscillatory signal in the theta band (e.g., 8 Hz): the frequency remains fairly constant whereas the amplitude (red line) of the signal fluctuates. The different ways in which faster oscillatory signals (e.g., gamma oscillations) can interact with such a signal are: (b) the fluctuations in power of the faster oscillations are correlated with power changes in the lower frequency band. This interaction is independent of the phases of the signals; (c) (n : m) phase synchrony occurs between slower and faster oscillations. In each slow cycle, there are four faster cycles and their phase relationship remains fixed which indicates a phase locking between the signals; (d) the frequency of the fast oscillations is modulated by the phase of the slower oscillations; and (e) the power of the faster oscillations is modulated by the phase of the slower oscillations. . . . . . . . . . . . . . . . . . . . . . . . . 54 CF phase synchrony for the signal model in Eq. (3.8). CF P LV (20, 70) has the largest synchrony value and the CFPLV is concentrated around this point. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 CF phase synchrony for the signal model in Eq. (3.9): The dashed black line indicates the ideal CFPLV profile, where CFPLV is equal to one on this line and zero everywhere else, and represents both the range of instantaneous frequencies, f1 and f2 , and the relationship between them, f2 (t) = 2f1 (t). f1 has the range [20, 50] Hz and f2 has the range [40, 100] Hz. . . . . . . . . . . . . . . . . . . . . . . . 61 x Figure 3.4 CF phase synchrony for the signal model in Eq. (3.10): The dashed black quadratic curve indicates the ideal CFPLV profile. The relation f (t) f 2 (t) 1 between the two instantaneous frequencies is f2 (t) = 1 + 160 , 10 where f1 has the range [10, 120] Hz and f2 has the range [1.625, 102] Hz. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Figure 3.5 Figure 3.6 Figure 3.7 Figure 3.8 63 CFPLV for the signal model in Eq. (3.11): x1 (t), at frequency 40 Hz, is CF synchronized with x2 (t), at both 20 Hz and 100 Hz. This is caused by the modulation of x2 (t) by x1 (t) and shows that AM might result in a CFPLV between signals. . . . . . . . . . . . . . . . . . . 64 Cross frequency-spectral lag distribution of the modulation model in Eq.(3.14), which is symmetric with respect to the origin. 2 and 5 are caused by the auto terms, whereas impulses 1, 3, 4 and 6 are caused by the modulation terms. . . . . . . . . . . . . . . . . . . . . . . . . 69 For the model in eq. (3.23), (a) shows the cross frequency-spectral lag distribution when there is no modulation whereas (b) shows the distribution when modulation exists. Entropies of the distributions are: H(Px1 ,x2 ) = 0.0428, H(Px1 ,x3 ) = 0.0677. . . . . . . . . . . . . 74 For the model in eq. (3.24), (a) shows the cross frequency-spectral lag distribution when there is no modulation whereas (b) shows the distribution when modulation exists. Entropies of the distributions are: H(Py1 ,y2 ) = 0.1189, H(Py1 ,y3 ) = 0.2465 . . . . . . . . . . . . . 75 Figure 3.9 (a) and (b) show the cross frequency-spectral lag distributions for the theta-gamma (H(PTheta,Gamma ) = 0.4362) and alpha-gamma (H(PAlpha,Gamma ) = 0.3635) bands of sample EEG data, respectively. 79 Figure 4.1 Rossler network for evaluating the dependency of the multivariate synchrony measures on the coupling strengths. The coupling strengths ρ1 and ρ2 are increased from 0 to 1 in steps of 0.2. . . . . . . . . . . 90 Dependency of the multivariate synchrony measures, Sg and Sy , on the coupling strengths ρ1 and ρ2 . . . . . . . . . . . . . . . . . . . . 91 Dependency of the multivariate synchrony measures, ST and Rv , on the coupling strengths ρ1 and ρ2 . . . . . . . . . . . . . . . . . . . . 92 Figure 4.2 Figure 4.3 xi Figure 4.4 12 different Rossler networks for evaluating the performance of the S and Rv -estimators in estimating the within-cluster and betweencluster synchrony. Each node represents an oscillator and each connection represents the two symmetric coupling strengths, which are equal to 1, between two oscillators. . . . . . . . . . . . . . . . . . . . 94 Figure 4.5 Line crossings, such as the black dots, indicate the sampled 3-dimensional direction vectors based on uniform angular sampling of a 2-sphere. . 100 Figure 4.6 Comparison of noise robustness: Performances of the hyperspherical phase synchrony and S-estimator in estimating the true multivariate phase synchrony within a group of oscillators consisting of highly synchronized sinusoidal signals having constant phase differences for different SNR values, bars indicate the standard deviations. . . . . . 103 Figure 4.7 The mean HPS values computed over all subjects at each time and frequency point within the ERN interval and theta band for the two electrode groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 Figure 4.8 The mean S-values computed over all subjects at each time and frequency point within the ERN interval and theta band for the two electrode groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 Figure 4.9 ROC curves for HPS and S-estimator . . . . . . . . . . . . . . . . . 107 Figure 5.1 Flow chart describing the proposed framework for extracting key graphs using subspace analysis . . . . . . . . . . . . . . . . . . . . . 120 Figure 5.2 Angular Similarity for MIT Reality Mining Data Set: Five event intervals are detected by the change detection algorithm . . . . . . 121 Figure 5.3 Extracted key graphs for the MIT Reality Mining Data (Diagonal entries are made zero for contrast purposes): (b) and (d) show the key graphs for Fall and Spring semesters, respectively, whereas (a) and (e) show the key graphs for the summer break and (c) shows the graph for the winter break . . . . . . . . . . . . . . . . . . . . . . . 123 Figure 5.4 Event interval detection: 6 event intervals are identified which correspond to the stimulus processing (-1000 to -179 ms), pre-ERN (-178 to 0 ms), ERN (1 to 94 ms), post-ERN (95 to 281 ms), Pe (282 to 462 ms) and inter-trial (463 to 1000 ms) intervals, respectively. The subjects respond to the stimulus at time 0 ms and the red lines indicate E(t) = 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 xii Figure 5.5 Stimulus processing interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. . . . 128 Figure 5.6 Pre-ERN interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. . . . . . . . . . . 129 Figure 5.7 ERN interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. . . . . . . . . . . . . 130 Figure 5.8 Post-ERN interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. . . . . . . . . . . 131 Figure 5.9 Pe interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. . . . . . . . . . . . . 132 Figure 5.10 Inter-trial interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. . . . . . . . . . . 133 xiii Chapter 1 Introduction Irregular, non-stationary, and noisy time-series have been observed in many fields of research including changes in weather patterns, the dynamics of satellites in the solar system, the time evolution of the magnetic field of celestial bodies, population growth in ecology, the dynamics of the action potentials in neurons, and molecular vibrations [1, 2]. In engineering, irregular and noisy time-series have been found in communications, control, pattern recognition and measurement devices. For instance, in communication, chaotic oscillators have been used as carrier signals for encryption of information [5] and in optics, multichannel data acquisition techniques have been used for sensing the temperature at a number of measuring points from optical sensors [6]. One particular application of interest is the neuronal networks which have complex dynamic behavior and consist of large assemblies of neurons. EEG signals reflect the dynamics of the underlying neuronal networks and have been considered to be generated by nonlinear time-varying systems exhibiting chaotic behavior [7, 8]. The underlying neuronal system behaves as a deterministic chaotic attractor if the correlation dimension, a measure of complexity, is found to be low. Several studies have revealed that the EEG has a finite non-integer correlation dimension which led to the evidence that EEG is generated by a chaotic neural process [7, 9, 10, 11]. Therefore, there has been a growing interest in applying techniques from the domains of nonlinear analysis and chaos theory to investigate the behavior of human brain from noninvasive multivariate time-series such as multichannel EEG signals. 1 The relationships among simultaneously recorded multichannel signals are usually quantified by first evaluating the reciprocal and pairwise interactions. For this purpose, either linear measures such as cross-correlation, spectral coherence and Granger causality or nonlinear measures such as mutual information have been employed [12]. However, linear measures are limited to quantifying only the linear relations and assume stationarity of the underlying signals whereas most real life signals possess non-stationary behavior. Similarly, reliable estimation of mutual information requires a large amount of data. Recently, tools from nonlinear dynamics, in particular, phase synchrony, have received much attention. Phase synchrony has been shown to be a better indicator of the statistical relationships between oscillators than amplitude dependent measures [13]. In addition, it can account for the nonstationary and nonlinear nature of the oscillators. Hence, one can use phase synchronization to understand the interactions between irregular and non-stationary oscillators [14, 15]. Generally, synchronization can be interpreted as the appearance of relations between functionals of two processes due to interaction. The characteristics of the functionals is to some extent subtle and depend on the problem under consideration. In the classical case of periodic self-sustained oscillators, phase synchronization is usually defined as locking of the phases of two oscillators, while the amplitudes can remain uncorrelated or independent [14]. 1.1 Assessing Functional Brain Connectivity Using Phase Synchronization Human brain is known to be one of the most complex biological systems and understanding the functional connectivity patterns to distinguish between normal and disrupted brain behavior still remains as a challenge [16, 17, 18]. Functional connectivity is defined as the 2 statistical dependencies among remote neurophysiological events indicating the integration of functionally segregated brain regions [16] and can be inferred from different neuroimaging data such as the functional magnetic resonance imaging (fMRI), electroencephalography (EEG) and magnetoencephalography (MEG) [19]. fMRI provides a high spatial resolution whereas EEG and MEG have more limited spatial resolution. On the other hand, EEG and MEG offer higher temporal resolution compared to fMRI. Although the bases of the functional relationships in the brain have been argued for decades, it has been recently proposed that networks of reciprocal interactions are the key features of functional integration [20, 21]. Among the approaches to quantifying reciprocal interactions, phase synchrony has been one of the most promising one. Classically, phase synchronization of two oscillators is the adjustment of their rhythmicity, or more precisely, that their phases are locked [3]. Studies of visual binding, or the so-called perceptual ’binding-problem‘, propose phase synchrony as a basic tool to investigate the large scale cognitive integration of the brain that is needed for perception [22, 13]. Moreover, the relation between phase synchrony and cognition has been studied in the concept of sensory-motor interactions and planning [23, 24], and memory [25]. Recently, large-scale phase synchronization is associated with consciousness, as an integrative process in the constitution of unitary cognitive moments [26]. Similarly, during face recognition, a consistent pattern of gamma frequency synchrony among occipital, parietal, and frontal areas, has been reported [13]. Synchronization in the brain is also assumed to play a role in the manifestation of various neurological diseases, such as epilepsy, Parkinson’s disease and psychopathologies such as schizophrenia, where optimal brain network becomes disrupted [27, 28]. This has, in turn, motivated the quest for robust methods for quantifying phase synchrony in specific frequency bands from experimentally recorded multivariate neurophysiological signals. These studies indicate both the importance of large-scale synchrony 3 in the human brain during cognition and its existence within several frequency bands as well as across frequency bands [29, 30]. For instance, modulations between neuronal oscillations in different frequency bands have been observed for electrohysiological recordings in humans and animals during cognitive tasks and it was found that the power of the fast gamma oscillations (30-150 Hz) was systematically modulated during the course of a theta (4-8 Hz) rhythm [4]. In all of these studies, the basic form of phase synchronization analysis applies only to the bivariate case. However, EEG data is essentially multivariate and the examination of multivariate data has been accomplished by the repeated application of bivariate synchronization measures. Such applications of bivariate synchrony impose a limitation to multivariate analysis such that only synchrony between pairs of signals can be directly studied [31]. Hence, ( ) this requires working in the computationally costly space of N signal pairs of N signals. 2 Furthermore, in multivariate complex systems such as the brain, two processes do not have to interact directly [32]. Therefore, bivariate analysis is often not sufficient to reveal the correct interaction structure and there is a growing need for quantifying multivariate or global phase synchronization in understanding the group dynamics as a whole rather than focusing on the bivariate interactions [33, 34]. For instance, Knyazeva et al. were able to infer a specific whole-head surface topographic synchronization landscape relevant to the clinical picture of the schizophrenia disease by using a recently developed multivariate phase synchrony technique [35]. A popular approach to look at the multivariate synchronization patterns has been through the use of complex network theory which introduces graphs or networks to represent realworld complex systems. A network is a mathematical representation of a system with relational information and can be represented by a graph consisting of a set of vertices (or nodes) 4 and a set of edges (or connections) between pairs of nodes. The presence of a connection between two vertices means that there is some kind of relationship or interaction between the nodes. In order to emphasize the strength of the connectivity between nodes, one can assign weights to each of the edges and the corresponding graph is called a weighted graph. In the study of functional brain networks, nodes represent the different brain regions and the edges correspond to the functional connectivity between these nodes which are usually quantified by the magnitudes of temporal correlations in activity. Most of the current approaches to quantifying functional connectivity provide a single graph to describe the network activity within a given time interval rather than tracking the evolution of functional connectivity. Hence, this results in neglecting possible time-varying properties of the underlying topologies [36, 37, 38]. However, a time-invariant description of the brain connectivity using a single graph is not sufficient to represent the communication patterns of the brain and can be considered as an unreliable snapshot of functional connectivity. Evidence suggests that the emergence of a unified neural process is mediated by the continuous formation and destruction of functional links over multiple time scales [39]. Therefore, there has been a growing interest in analyzing the time-varying dynamic evolution of functional brain networks. The extension from static to dynamic networks reveals that the processing of a stimulus involves optimized functional integration of distant brain regions by dynamic reconfiguration of links. 1.2 Existing Phase Synchrony Methods and Extensions to Dynamic Networks In order to quantify the bivariate phase synchrony between two signals, in brief, two steps are needed. First, instantaneous phase of each signal is estimated at a particular frequency 5 of interest and second, a statistical criterion is employed to quantify the degree of phase locking. In order to address the first step, two traditional approaches for estimating the time and frequency dependent phase of a signal have been proposed. The first one is to use the analytic signal concept through the Hilbert transform and to estimate the instantaneous phase from this analytic form [15]. However, this method requires the bandpass filtering of the signal to have a meaningful and reliable phase estimate. The second approach computes a time-varying complex energy spectrum using either the continuous wavelet transform with a complex Morlet wavelet [40] or the short-time Fourier transform (STFT) [41]. The wavelet approach has an implicit non-uniform time-frequency tiling, which distorts the phase spectrum whereas in the case of (STFT), there is a trade-off between time and frequency resolution due to the window function. Therefore, there is a need for reliable, high resolution phase and corresponding phase synchrony estimates to quantify large-scale functional integration within and across frequency bands in the brain. For the second step, the deviation of the empirical distribution of the relative phase difference from a uniform distribution is usually quantified using indices based on either Shannon entropy or circular variance of phases using ‘Phase Locking Value’ (PLV) [12]. However, PLV is a measure to quantify the degree of phase locking between only two signals and is not able to account for the group dynamics. Recently, multivariate measures of synchronization have been much of interest in understanding the group dynamics. Existing approaches to multivariate phase synchronization are based on the computation of the whole set of bivariate synchrony values and forming connectivity matrices or graphs, which leads to a large amount of mostly redundant information. In the context of graph theory, cluster analysis has been proposed to maximize group connectivity within each cluster while minimizing the connectivity between clusters [42, 43]. This 6 description of the multivariate structure in the form of clusters is followed by the specification of a degree of participation of each element within its cluster. One basic method to obtain clusters is to threshold the matrix elements [44, 45], which is very sensitive to the fluctuation of individual bivariate connectivity indices. Another approach is to use spectral clustering using eigenvalues and eigenvectors of the correlation matrix [46, 47], which were motivated by the application of random matrix theory to empirical correlation matrices. In the context of phase synchrony analysis, a preliminary approach is the partial synchrony adapted from partial coherence to reveal the indirect interactions among the oscillators within a network [32]. However, this method quantifies indirect bivariate synchronization and cannot quantify group dynamics. Allefeld and colleagues have proposed a mean-field approach to analyze functional connectivity from EEG data. They assume that each signal within the network contributes to a single synchronization cluster to a different extent [48], which is not legitimate since the underlying clustering structure of brain networks usually consists of multiple clusters. To address this drawback, an approach based on the eigenvalue decomposition of the pairwise bivariate synchronization matrix has been proposed [49]. Multiple synchronization clusters are detected, where the strength of each cluster depends on the magnitude of its associated eigenvalue and the corresponding eigenvectors account for the internal structure of each cluster. However, it has recently been shown that there are important special cases, clusters of similar strength that are slightly synchronized to each other, where the assumed one-to-one correspondence of eigenvectors and clusters is completely lost [50]. Existing approaches to dynamic network analysis are either graph theory based, such as direct extensions of component finding, clustering coefficient [51, 52, 53] and community detection [54] from the static to the dynamic case, or are feature based where features extracted from each graph in the time series are used to form time-varying graph metrics 7 [55, 56]. More recently, the dynamic nature of the modular structure in the functional brain networks has been investigated by finding modules for each time window and comparing the modularity of the partitions across time [17]. However, this approach does not evaluate the dynamic evolution of the clusters across time and is basically an extension of static graph analysis for multiple static graphs. Mucha et al. [54] proposed a new time-varying clustering algorithm which addresses this issue by defining a new modularity function across time. All of these module finding algorithms result in multiple clustering structures across time and there is a need to reduce this multitude of data into a few representative networks or to quantify the evolution of the network in time using reliable metrics. Therefore, these approaches do not track the change in connectivity or clustering patterns and cannot offer meaningful summarizations of time-varying network topology. Recently, researchers in signal processing have addressed problems in dynamic network analysis such as detection of anomalies or distinct subgraphs in large, noisy background [57, 58]. For instance, in [59], direction of the principal eigenvector of a matrix based on the graph is tracked over time, and an anomaly is detected if the direction changes by more than some threshold. [60] uses scan statistics to track the history of a node’s neighborhood and looks for large deviations to detect anomalous behavior. Tracking dynamic networks [61] using shrinkage estimation, or simple approaches such as sliding window or exponentially weighted moving averaging have been proposed for inferring long-term information or trends [62, 63]. However, these methods have some disadvantages such as preserving historical affinities indefinitely, which makes the network topology denser as time evolves [62]. 8 1.3 Contributions and Organization of the Dissertation In Chapter 2, a new time-varying phase estimation method based on a modified Rihaczek distribution, Reduced Interference Rihaczek distribution, belonging to Cohen’s class is proposed. The performance of the phase estimator and the corresponding synchrony measure are evaluated both analytically and through simulations in comparison to existing measures in particular to continuous wavelet transform based estimates. Both the analytical and the simulation results show Reduced Interference Rihaczek distribution based phase and synchrony estimators to be more robust to noise, have better time-frequency resolution and perform better at detecting actual synchrony in the system, in particular for a network of oscillators. In Chapter 3, we propose two complementary methods to quantify nonlinear relationships between oscillators across frequencies in terms of both phase synchrony across frequencies and amplitude modulation relationship. The first method is based on the Reduced Interference Rihaczek distribution and extends the Reduced Interference Rihaczek based phase synchrony measure to quantify the phase synchrony between two signals across different frequencies. The second method, which is closely related to the modulation frequency and modulation spectrum in speech processing literature, defines a cross frequency-spectral lag distribution based on the Wigner distribution to represent the modulation relationships between two signals. The cross frequency-spectral lag distribution offers cross-frequency coupling information and focuses on quantifying the amount of amplitude modulation between two signals. This approach has been shown to reveal the modulation effect of the theta frequency band (4-8 Hz) on the high frequency gamma band (40-70 Hz). 9 The contribution of Chapter 4 is two fold. First, we extend the notion of bivariate synchrony to multivariate synchronization by employing measures of multivariate correlation and complexity from statistics to quantify the synchronization within and across groups of signals rather than between pairs. The proposed measures depend on quantities such as multiple correlation and R2 and are redefined in the context of phase synchrony. In particular, a measure of association for multivariate data sets is used to quantify the degree of synchronization between groups of variables. We also exploit a global complexity measure based on the spectral decomposition of the bivariate synchronization matrix to estimate the multivariate synchronization within a network. The proposed measures extend the current state of the art phase synchrony analysis from quantifying bivariate relationships to multivariate ones. This shift from pairwise bivariate synchrony analysis to multivariate analysis within and across groups offers advantages for understanding functional brain connectivity where the bivariate relationships do not always reflect the underlying network structure. The second contribution of this chapter is a novel and direct method of computing the multivariate phase synchronization within a group of oscillators without the need for computing bivariate synchrony values. This new method is referred to as hyperspherical phase synchrony and is based on extending the definition of phase synchrony from the two-dimensional space to an N-dimensional space by employing uniform angular sampling of a unit sphere in an Ndimensional hyperspherical coordinate system. Hyperspherical phase synchrony eliminates the need for computing pairwise synchrony values and offers lower computational complexity and improved performance in terms of robustness to noise compared to the existing measures. Finally, in Chapter 5, we propose a framework for analyzing dynamic evolution of functional brain networks which is based on identifying meaningful time intervals corresponding to the underlying neurophysiological events and extracting key networks or graphs for de10 scribing the particular intervals with minimal redundancy. The proposed framework is based on subspace analysis using principal component analysis (PCA) to focus on signal subspace only and to discard noise subspace. The resulting key networks contain only the information related to the signal subspace. Results from the application to real EEG data containing the ERN supports the effectiveness of the proposed framework in determining the event intervals of dynamic brain networks and summarizing network activity with a few number of representative networks. 11 Chapter 2 A Time-Frequency Based Approach to Phase and Phase Synchrony Estimation 2.1 Introduction Irregular, non-stationary, and noisy bivariate data such as chaotic oscillators are abound in many fields of research. Observations of chaotic behavior in nature include changes in weather, the dynamics of satellites in the solar system, the time evolution of the magnetic field of celestial bodies, population growth in ecology, the dynamics of the action potentials in neurons, and molecular vibrations [1, 2]. In engineering, chaotic behavior has found applications in communications, control, pattern recognition and measurement devices. For instance, in communication, chaotic oscillators have been used as carrier signals for encryption of information [5]. The relationship between two simultaneously recorded signals is usually quantified through either linear measures such as cross-correlation or nonlinear measures such as mutual information. Recently, tools from nonlinear dynamics, in particular, phase synchrony, have received much attention [14, 15]. Phase synchronization of chaotic oscillators occurs in many complex systems such as the human brain during different cognitive 12 processes. Synchronization measures have been applied to multichannel electroencephalography (EEG) and magnetoencephalography (MEG) recordings to quantify the dependencies between the activity of remote brain areas in humans (e.g., [15, 40]). Classically, synchronization of two oscillators is understood as the temporal adjustment of their rhythms, or appearance of a certain relation between the phases of two oscillators. Weak phase locking condition at a particular frequency is defined as Φn,m (t) = |nϕ1 (t) − mϕ2 (t)|mod2π < constant, where n and m are some integers and Φn,m is the generalized phase difference and mod2π is used to account for the noise-induced phase jumps [14]. The first step in quantifying phase synchrony between two signals is to extract the time-varying phase of the signals. Two closely related approaches for extracting the time and frequency dependent phase of a signal have been proposed. In both cases, the original signal x(t) is transformed with the help of an auxiliary function into a complex-valued signal, from which the instantaneous phase is easily obtained. The first method is based on computing the Hilbert transform of the signal to obtain an analytic form of the signal and estimate the instantaneous phase from this analytic form [15]. To do so, one has to ensure that the signal is composed of a narrowband of frequencies. Thus, this method requires the bandpass filtering of the signal around a frequency of interest and then applies the Hilbert transform to obtain the instantaneous phase. Recently, a data dependent tool, called the empirical mode decomposition (EMD) [64, 65], has been used as a pre-processing tool to eliminate the need for bandpass filtering. EMD is proposed as a way to extract the individual frequency components, called the implicit mode functions (IMFs), in the signal and thus can be used prior to extracting phase and computing phase synchrony with Hilbert transform. However, it has several drawbacks. First, EMD is completely data driven, thus there is no guarantee that the extracted IMFs really represent the fundamental modes of the data. Second, there is 13 no systematic uniqueness or stability theory for EMD [65]. Furthermore, in order to evaluate the phase synchrony between two signals at a particular frequency, ω0 , one needs to extract two IMFs at the same frequency which is hard to guarantee with univariate EMD. The second approach to phase synchrony computes a time-varying complex energy spectrum using either the continuous wavelet transform (CWT) with a complex Morlet wavelet [40] or the shorttime Fourier transform (STFT) [41]. The Morlet wavelet has a Gaussian modulation both in the time and in the frequency domains and therefore has an optimal time and frequency resolution [66, 67]. It has been observed that the CWT and STFT are similar to the Hilbert transform based methods with the prior giving higher resolution phase synchrony estimates over time and frequency, especially at the low frequency range [3]. The main difference between the two approaches is that the Hilbert transform is actually a filter with unit gain at every frequency [14], so that the whole range of frequencies is taken into account to define the instantaneous phase. Therefore, if the signal is broadband it is necessary to pre-filter it in the frequency band of interest before applying the Hilbert Transform, in order to get an accurate estimate of the phase (e.g., [68], [69], [70]). On the other hand, the wavelet function is non-zero only for those frequencies close to the frequency of interest, so it is equivalent to band-pass filtering x(t) at this frequency, which makes the pre-filtering unnecessary. Although the wavelet and STFT based phase synchrony estimates address the issue of non-stationarity, they suffer from a number of drawbacks. In the case of the wavelet transform, a representation is obtained where the frequency resolution is high at low frequencies and low at high frequencies. Although this property makes wavelet transform attractive in detecting high frequency transients in a given signal, it inherently imposes a non-uniform time-frequency tiling on the analyzed signal and thus results in biased energy representations and corresponding phase estimates. In the case of STFT, there is a trade-off between 14 time and frequency resolution due to the window function. A wider window in time domain results in better frequency resolution but poor time resolution, and vice versa. For these reasons, there is a need for high time-frequency resolution phase distributions that can track dynamic changes in phase synchrony over the whole time-frequency plane. An alternative approach to estimating phase synchrony is through parametric modeling such as the polynomial phase model [71]. The idea is to estimate the parameters of the polynomial phase function instead of directly estimating the time-varying phase. Some of the approaches for estimating the parameters of the polynomial phase signals (PPSs) are high-order ambiguity function (HAF) [72], which provides good results for high signal-tonoise ratio (SNR) but is not robust against high noise variance and cross-terms occurring for multicomponent signals, and the multilag HAF. The use of multilag concept in the computation of HAF and multiplying the HAFs obtained for different lag sets is proposed to address these problems [73]. Both of these approaches are parametric and thus suffer from inaccuracies in determining the order of the polynomial function. Quantifying the phase relationship between two signals has also been a topic of interest in nonlinear dynamics literature. Recurrence plot analysis (RPA) has been introduced to analyze the dynamics of phase space trajectories of nonstationary and relatively short signals [74]. An extension of the recurrence plots to cross recurrence plots has been proposed to compare the phase space trajectories of two signals in the same phase space [75]. A synchrony measure called correlation probability of recurrence (CPR), which is based on the recurrence probabilities, has been proposed to quantify the phase synchrony between two signals [76]. However, unlike nonstationary signal processing based methods, CPR cannot quantify time and frequency dependent phase synchrony for time-varying signals. Recently, we have introduced a new time-varying phase estimation method based on a 15 modified Rihaczek distribution, Reduced Interference Rihaczek distribution, belonging to Cohen’s class [77]. In this chapter, the statistical performance of this new phase estimator is quantified by deriving the Cramer-Rao lower bound and the bias for a signal in additive white noise model. The derived Cramer-Rao lower bounds are evaluated for simulated signals and compared to the variance. After evaluating the statistical properties of the phase estimator, the bias and properties of the corresponding phase synchrony measure are also evaluated. It is important to note that although the proposed phase estimator has lower variance than existing measures, due to its bias it is more appropriate for the estimation of time and frequency dependent phase difference or phase synchrony between two signals since the initial phase of the signal will be lost through the time-frequency transformation. Finally, the different properties of the proposed phase estimator and the corresponding synchrony measure, such as resolution and robustness to noise, are compared with the existing methods for different simulated models including a large set of coupled oscillators. 2.2 2.2.1 Background Measures of Phase Synchrony Phase synchrony is defined as the temporal adjustment of the rhythms of two oscillators while the amplitudes can remain uncorrelated. In order to quantify the phase synchrony between two signals, first the instantaneous phase of the individual signals must be estimated around the frequency of interest. As discussed in the introduction, the two major approaches to quantifying the instantaneous phase of the signal are the Hilbert transform and the complex wavelet transform. Both methods aim at obtaining an expression for the signal in the form of x(t, ω) = a(t)exp(j(ωt + ϕ(t))), where a(t) is the time-varying instantaneous amplitude ˜ 16 and ϕ(t) is the instantaneous phase at the frequency of interest ω. This formulation can be repeated for different frequencies to obtain a time and frequency dependent phase estimate. The relationship between the temporal organization of two signals, x and y, can be quantified through the difference of their instantaneous phase estimates, Φxy (t) = |nϕx (t) − mϕy (t)|. Once the phase difference between two signals is estimated, it is important to quantify the amount of synchrony. The most common scenario for the assessment of phase synchrony entails the analysis of the synchronization between pairs of signals. In the case of noisy oscillations, the length of stable segments of relative phase gets very short; furthermore, the phase jumps occur in both directions, so the time series of the relative phase Φxy (t) looks like a biased random walk (unbiased only at the center of the synchronization region). Therefore, direct analysis of the unwrapped phase differences Φxy (t) has been seldom used. As a result, phase synchrony can only be detected in a statistical sense. Two different indices have been proposed to quantify the synchrony based on the relative phase difference, i.e. Φxy (t) wrapped into the interval [0, 2π), and are defined as follows [12]: 1. Information theoretic measure of synchrony: This measure evaluates the distribution of Φxy (t) by partitioning the interval [0, 2π) into L bins and comparing it with the distribution of the cyclic relative phase obtained from two time series with independent phases [15]. This comparison is carried out by estimating the Shannon entropy of both Φxy (t) for the original signals and Φxy (t) for a pair of independent signals. A normalized phase synchrony index, ρ = (Smax − S)/Smax (2.1) is obtained where S is the entropy of the distribution of the phase difference for the 17 original signals and Smax is the maximum entropy for the same number of bins, i.e. the entropy of the uniform distribution. Normalized in this way, the index is constrained as 0 ≤ ρ ≤ 1. ρ = 1 indicates perfect phase synchronization, whereas ρ ≈ 0 indicates independent oscillators. One drawback of this measure is that ρ depends on the number of bins used to calculate the histogram of Φxy (t). This might result in low values of ρ values even for perfect phase synchrony. One can expect to use Smax = log L for independent phases. However, the distribution of the phase differences is not uniform even for uncorrelated series due to finite size effects [78]. Hence, Smax should be estimated constructing a surrogate data set, i.e, a set of independent signal pairs obtained by randomly shuffling one of the phases while keeping the other unchanged [12]. 2. Phase Synchronization Index: This index is also known as ‘mean phase coherence’ (PC), or ‘phase locking value’ (PLV), √ γ= < cos(Φxy (t)) >2 + < sin(Φxy (t)) >2 = N −1 1 ∑ jΦxy (t ) k e N (2.2) k=0 where the brackets denote averaging over time and N is the number of time points. This index is a measure of how the relative phase is distributed over the unit circle. If the two signals are phase synchronized, the relative phase will occupy a small portion of the circle and mean phase coherence will be high. This measure is equal to 1 for the case of complete phase synchronization and tends to approach zero for independent oscillators. This measure can be applied by either averaging phase differences over time or multiple realizations of the same process. When the phase differences are averaged over trials, it is referred to as the phase locking value (PLV) and can quantify the 18 consistency of response-locked phase differences across trials as follows: N 1 ∑ P LV (t, ω) = exp(jΦk (t, ω)) , 1,2 N (2.3) k=1 where N is the number of trials and Φk (t, ω) is the time-varying phase difference 1,2 estimate between two signals for the k th trial. If the phase difference varies little across the trials, PLV is close to 1 which indicates high phase synchrony pair signals. 2.2.2 Continuous Wavelet Transform One commonly used approach to extract time-varying phase information is the continuous wavelet transform (CWT) with complex wavelet functions. The phase spectrum of the signal can be extracted from its wavelet transform, which is the convolution of the signal with a complex wavelet: ∫ ∞ Wx (t, f ) = x(u)Ψ∗ (u)du t,f −∞ (2.4) where Ψ∗ (u) represents the complex conjugate of the wavelet function [3]. In particular, t,f Morlet wavelet is used for phase extraction and is defined as follows: Ψt,f (u) = √ (u−t)2 j2πf (u−t) e− 2σ 2 fe (2.5) where Ψt,f (u) is a Gaussian window centered at time t with variance σ 2 modulated by a complex exponential at frequency, f . The phase spectrum of x(t) can be evaluated as follows: [ Wx (t, f ) Φx (t, ω) = arg |Wx (t, f )| 19 ] (2.6) Similarly, the phase difference between two signals, x1 (t) and x2 (t), can be computed as: [ ∗ W1 (t, ω) W2 (t, ω) Φ12 (t, ω) = arg |W1 (t, ω)| |W2 (t, ω)| ] (2.7) The spread of the window, σ, is inversely proportional to f and determines the frequency resolution of time-varying phase estimates [3]. 2.2.3 Cohen’s Class of Time-Frequency Distributions Bilinear time-frequency distributions (TFDs) belonging to Cohen’s class can be expressed as 1 [79]: ∫ ∫ ∫ C(t, ω) = τ τ ϕ(θ, τ )x(u + )x∗ (u − )ej(θu−θt−τ ω) du dθ dτ, 2 2 (2.8) where ϕ(θ, τ ) is the kernel function and x is the signal. TFDs represent the energy distribution of a signal over time and frequency, simultaneously. The kernel completely determines the properties of its corresponding TFD. Some of the most desired properties of TFDs are the energy preservation, satisfying the marginals, and the reduced interference. Energy preservation and satisfying the marginals are defined as: ∫ ∫ ∫ ∫ |x(t)|2 dt = C(t, ω) dt dω = |X(ω)|2 dω, ∫ ∫ 2 , C(t, ω) dω = |x(t)| C(t, ω) dt = |X(ω)|2 . (2.9) 1 All integrals are from −∞ to ∞ unless otherwise stated. 20 For multicomponent signals, bilinear TFDs suffer from the existence of cross-terms or interference, i.e. if x(t) = ∑N i=1 xi (t) then C(t, ω) = ∑N i=1 Cxi ,xi (t, ω) + ∑ i̸=j 2Re(Cxi ,xj (t, ω)), where Cxi ,xi and Cxi ,xj refer to the auto-terms and cross-terms, respectively. The crossterms introduce time-frequency structures that do not correspond to the time-frequency spectrum of the actual signal. For real signals, the cross-terms might contaminate the spectrum of the auto-terms since the individual signal components may not be disjoint in the time-frequency plane. Hence, the cross-terms should be filtered out using an appropriate kernel function. Any TFD given by equation (2.8) can be equivalently written as: ∫ ∫ C(t, ω) = where A(θ, τ ) = ∫ ϕ(θ, τ )A(θ, τ )e−j(θt+τ ω) dτ dθ (2.10) x(u + τ )x∗ (u − τ )ejθu du is the ambiguity function of the signal. Since the 2 2 ambiguity function tends to group the auto-terms close to the θ − τ axis, the kernel function is usually designed as a lowpass filter. In this chapter, reduced interference distributions (RIDs) will be used to address the problem of cross-terms, with |ϕ(θ, τ )| << 1 for θτ >> 0, to concentrate the energy around the auto-terms [80]. The major advantages of Cohen’s class of TFDs over other time-frequency representations such as the wavelet transform are the nonlinearity of the distribution, energy preservation and the uniform resolution over time and frequency. Most of the members of Cohen’s class, such as the spectogram and the Wigner distribution, are real valued energy distributions describing the energy of the signal over time and frequency, simultaneusly. However, since these distributions do not have phase information, they cannot be used for estimating the phase of an individual signal and the phase synchrony between two signals. Therefore, there is a need for high resolution complexvalued TFDs that carry both the energy and the phase information of the underlying signals. 21 2.3 Rihaczek Distribution Rihaczek derived the signal energy distribution in time and frequency by application of the complex signal notation. If two complex signals at the same frequency, x1 (t) and x2 (t), are considered where x1 (t) may be interpreted as the voltage and x2 (t) as the current generated ∫ in an impedance, the total complex energy is defined as x1 (t)x∗ (t)dt. Rihaczek extended 2 the idea of complex energy to define the interaction energy at a frequency of interest, ω, within some frequency band and at a given time, t, within an infinitesimal time interval as [81]: 1 C(t, ω) = √ x(t)X ∗ (ω)e−jωt , 2π (2.11) where x(t) is the signal and X(ω) is its Fourier transform and measures the complex energy of a signal around time t and frequency ω. A geometric interpretation of the Rihaczek Distribution was developed in [82]. The Rihaczek distribution can be expressed as a complex Hilbert space inner product between the time series and its infinitesimal stochastic Fourier generator, which results in an illuminating geometry [83], wherein the angle between the time series and its infinitesimal stochastic Fourier generator for a given frequency component and time instant is characterized by the Rihaczek distribution. The complex energy density function provides a fuller appreciation of the properties of phase-modulated signals that is not available with other time-frequency distributions. While the time-frequency resolution of the STFT or the wavelet transform is determined by the window function or the basis functions used to expand the signal, for the Rihaczek distribution, the time-frequency resolution is determined by the rate of change of the instantaneous frequency which provides better localization for phase-modulated signals. Similar to other members of Cohen’s class of distributions, the Rihaczek distribution 22 is a bilinear, time and frequency shift covariant time-frequency distribution that satisfies the marginals, preserves the energy of the signal with strong time and frequency support properties [81]. With these properties, the Rihaczek distribution is a complex TFD that provides both a time-varying energy spectrum as well as a phase spectrum with good timefrequency localization for phase modulated signals. 2.3.1 Reduced Interference Rihaczek Distribution (RID-Rihaczek) For a multicomponent signal such as, x(t) = x1 (t) + x2 (t), the Rihaczek distribution is: 1 ∗ ∗ C(t, ω) = √ (x1 (t)X1 (ω)e−jωt + x2 (t)X2 (ω)e−jωt 2π ∗ ∗ + x1 (t)X2 (ω)e−jωt + x2 (t)X1 (ω)e−jωt , (2.12) where the last two terms in Eq. (2.12) are the cross-terms. These cross-terms are located at the same time and frequency locations as the original signals and will lead to biased energy and phase estimates. In order to get rid of these cross-terms, we have recently proposed a reduced interference version of the Rihaczek distribution by applying a kernel function to filter the cross-terms in the ambiguity domain [77]. Different kernel functions such as the Choi-Wiliams (CW), BornJordan or binomial kernels can be used to address the issue of cross-terms with similar results [79]. In this chapter, we employ the Choi-Williams kernel where the resulting distribution can be written in terms of the product of the CW kernel and the kernel for the Rihaczek 23 distribution as: ∫ ∫ C(t, ω) = ( (θτ )2 exp − σ ) CW kernel θτ ) A(θ, τ )e−j(θt+τ ω) dτ dθ. 2 Rihaczek kernel exp(j (2.13) θτ where ej 2 is the kernel function for the Rihaczek distribution. This new distribution, which will be referred to as RID-Rihaczek, will have an equivalent time-frequency kernel ϕ(θ, τ ) = (θτ )2 θτ e− σ ej 2 . Since this kernel satisfies the constraints, ϕ(θ, 0) = ϕ(0, θ) = ϕ(0, 0) = 1, the corresponding distribution will satisfy the marginals and preserve the energy, and is a complex energy distribution at the same time. The value of σ can be adjusted to achieve a desired trade-off between resolution and the amount of cross-terms retained. Fig. 2.1 illustrates the effect of the kernel function on the magnitude of the Rihaczek distribution for a multicomponent signal. 2.3.2 Implementation of the Proposed TFDs The time-frequency distributions employed in this chapter for phase estimation, i.e. Rihaczek and RID-Rihaczek distributions, have been implemented using MATLAB. The discrete-time discrete-frequency Rihaczek and RID-Rihaczek TFDs, are implemented as follows [81, 84]: • Compute the local autocorrelation function R[n, τ ]: R[n, τ ] = x∗ [n]x[((n+τ ))N ], where n = 1, 2, . . . , N is the discrete time, τ = 1, 2, . . . , N is the discrete lag variable, and (·)N refers to the mod N operation. • Compute the ambiguity function by taking the N point FFT of R[n, τ ]: A[θ, τ ] = FFT{R[n, τ ]} = ∑N n=1 R[n, τ ]e −j(2π/N )nθ , where θ = 1, 2, . . . , N is the discrete Doppler lag variable. 24 Magnitude of Rihaczek Distribution Normalized Frequency 1 0.8 0.6 0.4 0.2 0 50 100 150 200 250 Time (a) Magnitude of RID−Rihaczek Distribution Normalized Frequency 1 0.8 0.6 0.4 0.2 0 50 100 150 200 250 Time (b) Figure 2.1: Effect of filtering: Magnitudes of the original and the Reduced Interference (t−100)2 (t−200)2 j0.2t + e− 2 2 Rihaczek distributions for the sum of two signals, x(t) = e e ej0.8t . − For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation. • Discrete time-frequency Rihaczek distribution is obtained as: C[n, k] = IFFT {FFT{A[θ, τ ]}} = ∑N θ=1 ∑N 25 j(2π/N )θn e−j(2π/N )τ k . τ =1 A[θ, τ ]e • RID-Rihaczek distribution is obtained by first multiplying the ambiguity function with the kernel function and then computing the IFFT and FFT: { C[n, k] = IFFT = N N ∑∑ (θτ )2 − σ } FFT{A[θ, τ ]e } (θτ )2 A[θ, τ ]e− σ ej(2π/N )θn e−j(2π/N )τ k θ=1 τ =1 where k = 1, 2, . . . , M is the discrete frequency variable, M is the number of frequency bins and σ = 0.001 in our implementations. Discretization of different kernel functions are explained in detail in [85]. 2.3.3 Time-Varying Phase Estimation and Phase Synchrony In this section, we will define a time and frequency dependent phase estimate and a corresponding synchrony measure using the proposed complex TFD. For a signal x(t) = A(t)ejϕ(t) with Fourier transform X(ω) = B(ω)ejθ(ω) , the time-varying phase estimate based on the Rihaczek distribution can be defined as 2 [77]: [ ] ] C(t, ω) A(t)ejϕ(t) B(ω)e−jθ(ω) e−jωt Φ(t, ω) = arg = arg , |C(t, ω)| A(t)B(ω) [ ] = arg ejϕ(t) e−jθ(ω) e−jωt , [ = ϕ(t) − θ(ω) − ωt, (2.14) where ϕ(t) and θ(ω) refer to the phase in the time and the frequency domains, respectively. Once the phase estimate in the time-frequency domain is obtained, the phase difference 2 In this section, all of the derivations for time-varying phase spectrum and phase synchrony will be based on the original definition of Rihaczek distribution for purposes of simplicity. Similar computations for RID-Rihaczek can be done numerically. 26 between two signals, x1 (t) and x2 (t), can be computed as: [ ] ∗ C1 (t, ω) C2 (t, ω) Φ12 (t, ω) = arg , |C1 (t, ω)| |C2 (t, ω)| [ ] j((ϕ1 (t)−ϕ2 (t))−(θ1 (ω)−θ2 (ω))) , = arg e = (ϕ1 (t) − ϕ2 (t)) − (θ1 (ω) − θ2 (ω)), (2.15) where ϕ1 (t) and ϕ2 (t) correspond to the phases of the two signals in the time domain, whereas θ1 (ω) and θ2 (ω) correspond to the phases of the two signals in the frequency domain. For a real-valued signal, the phase difference between a signal x1 (t) and its shifted version x1 (t − t0 ) is given by: [ ] ∗ x1 (t)X1 (ω)e−jωt x∗ (t − t0 )X1 (ω)e−jωt0 ejωt 1 Φ12 (t, ω) = arg , |x1 (t)||X1 (ω)| |x1 (t − t0 )||X1 (ω)| [ ] x1 (t) x∗ (t − t0 )e−jωt0 1 = arg , |x1 (t)| |x1 (t − t0 )| = −ωt0 , (2.16) which is a linear function of frequency as expected 3 . In most applications where a bivariate relationship between two signals is desired, the time-frequency dependent phase estimates are not directly useful. In order to further quantify the bivariate relationship or the coupling between signals, a measure of phase synchrony needs to be defined. In this chapter, phase locking value (PLV), described in Section 2.2.1, will be used. The phase synchrony estimate based on the RID-Rihaczek distribution will be referred to as RID-Time-Frequency Phase Synchrony (RID-TFPS) measure. Similarly, the phase synchrony estimate given by the CWT will be referred to as Wavelet-TFPS measure. 12 (t, ω) = −ωt0 with modulus of π. 3Φ 27 2.4 Cramer-Rao Lower Bound for the Phase Estimator In this section, the Cramer-Rao lower bound is derived to quantify the efficiency of the proposed time-varying phase estimators based on Rihaczek and RID-Rihaczek distributions ˆ in discrete time, θ[n] = Φ[n, w(n)] where w(n) is the instantaneous frequency of interest for a particular time n. Let z be a complex signal with additive complex white Gaussian noise e: z[n] = x[n] + e[n] (2.17) where e ∼ CN (0, C), C is the covariance matrix and x[n] = A[n]ejθ[n] is a function of the unknown time-varying phase, θ = θ[n], which will be denoted as a deterministic vector parameter θ for n = 1, 2, . . . , N . From the properties of the complex Gaussian pdf, z ∼ CN (x, C) and the covariance matrix C does not depend on θ. For any parameter estimator, ˆ θ, for θ, the Fisher’s information matrix for a complex Gaussian pdf is given as [86]: [ ] [ ] H ∂C −1 ∂C ∂µz −1 ∂µz [I(θ)]kl = tr C −1 C + 2Re C ∂θk ∂θl ∂θk ∂θl (2.18) where k, l = 1, 2, . . . , N and µz = x. Since C is independent of θ, equation (2.18) reduces to: [ ] ∂µH −1 ∂µz z [I(θ)]kl = 2Re C . ∂θk ∂θl 28 (2.19) For the complex signal x[n] = A[n]ejθ[n] , with θ = [θ[1] . . . θ[N ]],     0             0     jθ[2]    A[2]e ∂µz      µz =  = A[k]jejθ[k]  ⇒     ∂θk . .     . .     .     .   jθ[N ] A[N ]e   0 A[1]ejθ[1] (2.20) and the Fisher’s information matrix for the time-varying phase estimator can be derived as:     0         . .    .     ]     −jθ[k] . . . 0 C −1 (j)A[l]ejθ[l]  [I(θ)]kl = 2Re   0 . . . (−j)A[k]e        .      .     .                 0             [    (2.21) where, C −1 = 12 I, is a diagonal matrix since e is assumed to be complex white Gaussian σ noise and σ 2 is the variance of the complex noise and I is the identity matrix. Therefore;    0,  [I(θ)]kl = if k ̸= l  2  2A [k]   2 , if k = l (2.22) σ and the CRLB for the unbiased estimators is written as: ˆ COV(θ) Cθ ≽ [I(θ)]−1 = CRLB ˆ 29 (2.23) In order to account for biased estimators, the Cramer-Rao lower bound (CRLB) for the biased vector parameter estimator of the phase of a complex signal with additive white Gaussian noise should be modified as follows [86]: ˆ COV(θ) ∂Ψ(θ) Cθ ≽ [I(θ)]−1 ˆ ∂θ ( ) ∂Ψ(θ) T = CRLB ∂θ (2.24) ˆ where Ψ(θ) = b(θ) + θ and b(θ) = E[θ] − θ is the bias. The expression for the CRLB in Eq. (2.24) is also valid for the unbiased estimators since b(θ) = 0 and ∂Ψ(θ) ∂θ = I, where I is the identity matrix. Therefore, since the Fisher’s information matrix is diagonal: ( ) Tr Cθ ≥ Tr (CRLB) ˆ (2.25) which means the total variance of the phase estimator is lower bounded by the total CRLB. In order to find the lower bound in (2.24), the gradient, ∇b(θ), needs to be computed. However, for the estimator proposed in this chapter, exact computation of the bias gradient is not possible. Hence, an unbiased and consistent sample mean estimate of ∇θ b(θ) is exploited, which is given by [87]: ∇θ b(θ k ) = 1 L−1 L ∑ i=1  ˆ θ k − 1 i L L ∑  ˆk θj  ∇θ ln fz (zi ; θ) − ∇θ k (2.26) j=1 where {zi }L is a set L of i.i.d realizations of the signal model given in eq. (2.17), fz (zi ; θ) i=1 ˆk is the complex Gaussian pdf evaluated for the ith realization and θi is the phase estimate at time instance k, computed from the ith realization. The N × N matrix 30 ∂Ψ(θ) ∂θ can be written as:  [∂b(θ)]1  ∂θ1  ∂Ψ(θ)  =  ∂θ   ··· . . . ... [∂b(θ)]N ∂θ1 ··· [∂b(θ)]1 ∂θN   . . . [∂b(θ)]N ∂θN   + I,   (2.27) where the kth row is computed using the gradient of the bias for the kth time point, from ∇θ b(θ k ). The bound in Eq. (2.24) indicates that even though the signal’s phase is independent from its amplitude, for a noisy signal the estimator variance depends on the noise power. 2.5 Evaluation of the Statistical Properties of Phase Difference and Phase Locking Value In order to understand the performance of the phase synchrony measure corresponding to the RID-Rihaczek phase estimator, it is important to quantify the statistical properties of the underlying phase difference. However, finding analytic expressions for the statistical properties of the phase difference between two arbitrary signals, such as the distribution of phase difference (or phase) and phase synchrony (e.g., PLV), is not possible since these properties depend on the underlying signals. Thus, a simulation model involving a lowsynchrony signal pair, consisting of two independent white Gaussian noise sequences with equal variance (σ 2 ), is used for quantifying the distribution of phase difference and PLV under the null hypothesis, i.e. for the case where the signals are independent [40, 88]. As long as the two noise sequences are independent, the amount of the noise variance does not have any effect on these distributions. 31 Distributions of the Phase Difference for a Low Synchrony Signal Pair 400 Frequency of Occurance 350 300 250 200 150 100 Experimental Theoretical 50 0 −4 −2 0 Phase Difference (rad) 2 4 Figure 2.2: Experimental and theoretical distributions of the phase difference for a lowsynchrony signal pair 2.5.1 Distribution of the Phase Difference The distribution of the phase and phase differences for the proposed estimation method are evaluated using a simulation model with low-synchrony signals. The phase difference Φ12 (t, ω) between the two signals is expected to be distributed uniformly over [−π, π] since the phase differences between two white noise sequences are randomly distributed between [−π, π]. Fig. 2.2 illustrates the theoretical and experimental distributions of the phase difference, where the signal length is 180 and the number of simulations is 20000. The experimental distribution is close to a uniform distribution as shown by the Chi-square goodness-of-fit test at the 5% significance level where the null hypothesis that the data is from a uniform distribution over [−π, π] can not be rejected (p = 0.1486 > 0.05). 32 2.5.2 Bias of the Synchrony Measure: Dependency of Phase Locking Value on the Number of Trials Once the distribution of the phase difference is determined, the distribution and bias of PLV estimates can be found. PLV in Eq. (2.3) can be expanded as: P LV (t, ω) = N 1 ∑ exp(jΦk (t, ω)) 1,2 N k=1 N ) 1 ∑( = cos Φk (t, ω) + j sin Φk (t, ω) 1,2 1,2 N k=1 N N ∑ 1 ∑ k (t, ω) + j = cos Φ1,2 sin Φk (t, ω) 1,2 N k=1  = 1 N  k=1 N ∑ 2  cos Φk (t, ω) +  1,2 N ∑ 2 sin Φk (t, ω) 1,2 k=1 k=1 N −1 ∑ ∑ N 1 = [N + 2 (cos Φi (t, ω) cos Φk (t, ω) 1,2 1,2 N i=1 k=i+1 + sin Φi (t, ω) sin Φk (t, ω))]1/2 1,2 1,2 (2.28) Since cos Φi (t, ω) cos Φk (t, ω) + sin Φi (t, ω) sin Φk (t, ω) = cos(Φi (t, ω) − Φk (t, ω)), 1,2 1,2 1,2 1,2 1,2 1,2 Eq. (2.28) can be rewritten as: 1 P LV (t, ω) = N N +2 N −1 ∑ ∑ N ˜ cos Φik (t, ω) 1,2 (2.29) i=1 k=i+1 ˜ where Φik (t, ω) = Φi (t, ω) − Φk (t, ω). 1,2 1,2 1,2 In particular, the phase difference, Φk (t, ω) for k = 1, 2, . . . , N , between two low1,2 synchrony signals is modeled as a uniformly distributed random variable over [−π, π]. Then 33 ˜ the difference, Φik (t, ω) = Φi (t, ω) − Φk (t, ω), will have a triangular distribution since 1,2 1,2 1,2 the sum of two independent identically distributed random variables results in a new random variable, whose probability density function (pdf) is the convolution of the two initial pdfs (convolution of two uniform distributions results in a triangular distribution). Hence, the ˜ pdf of Φik (t, ω) is: 1,2 ˜ fΦ (Φ) = ˜    2π+Φ ˜    4π 2     ˜ if − 2π ≤ Φ ≤ 0 ˜ 2π−Φ ˜ if 0 < Φ ≤ 2π  4π 2       0  otherwise ˜ Making the substitution, Υik (t, ω) = cos Φik (t, ω), Eq. (2.29) can be written as: 1,2 1,2 1 P LV (t, ω) = N N +2 N −1 ∑ ∑ N Υik (t, ω) 1,2 (2.30) i=1 k=i+1 ˜ ˜ ˜ ˜ Two solutions of Υ = cos Φ exist for Φ in [−π, π], Φ1 = cos−1 (Υ) and Φ1 = − cos−1 (Υ). ˜ ˜ ˜ Differentiation of Υ with respect to Φ leads to d (cos Φ) = − sin Φ and the pdf of the random ˜ dΦ variable, Υ, can be computed as: fΥ (Υ) = 1/2π 1/2π 1 + = −1 (Υ))| −1 (Υ))| | − sin(cos | − sin(− cos π sin(cos−1 Υ) for − 1 ≤ Υ ≤ 1 (2.31) To further simplify the expression in Eq. (2.30), a new random variable, ξ, is introduced as: ξ= N −1 N −i 1 ∑ ∑ ik Υ1,2 (t, ω). N2 i=1 k=1 34 (2.32) Hence, PLV can be written as: √ P LV (t, ω) = 1 + 2ξ, N (2.33) and the expected value of the PLV can be evaluated as follows [89]: [ ] E P LV (t, ω) = ∫ 1√ 1 0 N + 2ξ fξ (ξ) dξ (2.34) The random variable ξ can be numerically approximated as a shifted exponential random process with probability density function [89]: ( ) 1 −N ξ+ 2N fξ (ξ) = N e , (2.35) Fig. 2.3 shows the empirical pdf of ξ and the hypothesized distribution in Eq. (2.35). Empirical pdf is obtained from 100000 independent random variables generated in accordance with Eq. (2.32). The empirical distribution is close to the hypothesized pdf as shown by the Kolmogorov-Smirnov goodness-of-fit test at the %5 significance level, where the null hypothesis that ξ follows the pdf in Eq. (2.35) can not be rejected (p = 0.3281 > 0.05). Once this approximation is made, it has been shown that the expected value of the phase synchrony is governed by [89]: [ ] 1 E P LV (t, ω) ≈ √ N (2.36) Based on these results, the expected value of the phase synchrony for a low synchrony signal pair is not exactly equal to zero due to the limited number of trials and has a bias which is inversely proportional to N . 35 Frequency of Occurrence Approximation of ξ to the Shifted Exponential Random Variable 0.12 0.1 0.08 Empirical Distribution Hypothesized Distribution 0.06 0.04 0.02 0 −0.01 0 0.01 0.02 ξ 0.03 0.04 0.05 0.06 Figure 2.3: Empirical and the hypothesized distributions of ξ in Eq. (2.32): Empirical pdf is obtained from 100000 independent random variables generated in accordance with Eq. (2.32). This result is experimentally validated through 200 simulations for different number of trials, N , for signals with 128 samples. Fig. 2.4 indicates that the average PLV value for a low-synchrony signal pair has a bias for small number of trials. Furthermore, average PLV follows the rule given in Eq. (2.36). 2.6 Simulation Results In this section, we will illustrate the accuracy of the proposed phase estimator and corresponding phase synchrony measure through simulations and comparisons to CWT based methods. We will also evaluate the derived CRLB and compare with estimator variance for simulated signals. 36 Bias of the RID−TFPS for a Low Synchrony Signal Pair 1 Average RID-TFPS √ 1/ N Average RID−TFPS 0.8 0.6 0.4 0.2 0 50 100 150 200 250 300 Number of Trials (N) 350 400 Figure 2.4: Average PLV value for a low-synchrony pair as a function of the number of trials (Error bar indicates the standard deviation) 2.6.1 Performance of the Rihaczek Distribution in Estimating TimeVarying Phase and Phase Difference In order to test the performance of the proposed method in tracking the time-varying phase difference, two chirp signals with exponentially decaying amplitudes are considered: ) ) ( ( j 2π −10 n+12 n 5.12 x1 (n) = e−n e , ( ) ) ( −8 −n ej 2π 5.12 n+12 n+π/2 x2 (n) = e for 1 ≤ n ≤ 256 (2.37) The two signals, x1 and x2 , have different linear chirp rates resulting in a time-varying phase difference. As seen in Fig. 2.5, the actual and the estimated phase differences using Rihaczek distribution overlap at the instantaneous frequency. Therefore, the phase estimator can track the evolution of changing phase difference as a function of time. This example can 37 be generalized to any signal pairs. 1 7 Actual Estimated X1(t) X2(t) 0.8 6 0.6 Phase Difference Amplitude 0.4 0.2 0 0.2 0.4 5 4 3 2 0.6 1 0.8 1 0 1 2 0 3 Time 0 1 2 3 Time Figure 2.5: Performance of Rihaczek distribution in estimating the time-varying phase difference between two chirp signals: (Left) Magnitudes of the two signals, (Right) Actual and estimated phase differences (in radians) as a function of time. 2.6.2 Effect of the Kernel on Phase Estimation Although the kernel function is useful for removing the cross-terms for multicomponent signals, it has a smoothing effect in the ambiguity domain. Hence, the time varying phase estimation will be influenced by the kernel function. The effect of the kernel on the bias and variance of the estimator are shown in detail in Section 2.6.3. In this section, we want to illustrate the effect on the actual phase function as a function of time. The following signal is considered to illustrate the effect of the kernel function on the time-varying phase 38 estimation: 40 x1 (n) = ej2π 100 n for 1 ≤ n ≤ 100. (2.38) Fig. 2.6 shows the actual and estimated phase at the instantaneous frequency for the Rihaczek and RID-Rihaczek distributions in the absence of noise. Rihaczek distribution is able to track the time-varying phase, whereas the phase estimated by the RID-Rihaczek deviates from the actual phase slightly. This is the effect of the kernel on the phase estimation occurring due to the convolution of the original spectrum and phase with the kernel function. Actual and estimated unwrapped phases 350 300 250 Phase 200 150 100 Actual Estimated by Rihaczek Estimated by RID−Rihaczek 50 0 −50 0 0.2 0.4 0.6 0.8 1 Time Figure 2.6: Actual and estimated unwrapped time-varying phase at the instantaneous frequency for the Rihaczek and RID-Rihaczek distributions in the absence of noise: The signal in Eq. (2.38) is considered with 100 Hz sampling frequency. 39 2.6.3 Statistical Evaluation of the Phase Estimator: Comparison of CRLB and Variance In order to assess the statistical performances of the phase estimators, a signal set consisting of two constant amplitude chirp signals with different chirp rates is considered: 0.3 2 z1 [n] = ej(0.1n+ 256 n ) + e1 [n] 0.8 2 z2 [n] = ej(0.1n+ 256 n ) + e2 [n] for 1 ≤ n ≤ 128 (2.39) where e1 and e2 are complex white Gaussian noise sequences. SNR values from 0 dB to 30 dB are considered with 1600 simulations for 128 time points. For both signals, using the relation in Eq. (2.25), the total variance of the RID-Rihaczek and CWT phase estimators are compared with the total CRLBs computed at the frequency of interest, i.e., the instantaneous frequency. Furthermore, to examine the bias-variance trade-off of the two estimators, averaged normalized bias is plotted as a function of SNR. Figs. 2.7(a) and 2.7(b) show the total variance and total CRLB as a function of SNR for the signals, z1 and z2 for both RID-Rihaczek and CWT, respectively. Similarly, Figs. 2.8(a) and 2.8(b) show the normalized bias as a function of SNR for the two methods. First, the CRLB is always less than the variance for both of the phase estimators as expected. Similarly, as the SNR increases, CRLB and variance decrease for both methods. However, for the signal z1 , the decrease in the variance is faster for RID-Rihaczek compared to CWT. Second, the CRLB for RID-Rihaczek is lower than the CRLB for CWT when SNR is between 0 and 25 dB. In addition, the variance of RID-Rihaczek is also lower than the variance of CWT when SNR is between 3 and 26 dB. This indicates that the RID-Rihaczek is more robust 40 to noise and is a more precise phase estimator than the CWT for a wide range of SNRs. This improved performance of RID-Rihaczek in terms of variance comes at the expense of increased bias as shown by Figs. 2.8(a) and 2.8(b). On the other hand, for the signal z2 , the CRLB for RID-Rihaczek is lower than the CRLB for CWT when the SNR is between 0 and 10 dB and the variance of RID-Rihaczek is lower than the variance of CWT when the SNR is between 5 and 10 dB. However, the normalized bias for RID-Rihaczek is lower than the normalized bias for CWT for all SNR values. The increase in the variance and the decrease in the bias for RID-Rihaczek are due to the faster rate of change of the frequency spectrum. When the signal covers a broader band of frequencies, CWT has a harder time discriminating between the different components thus leading to a higher bias. RID-Rihaczek, on the other hand, offers uniform resolution resulting in lower bias at the expense of increased variance. Therefore, Figs. 2.7 and 2.8 illustrate the bias-variance trade-off when the chirp rate of the signal is varied. 2.6.4 Evaluation of the Time-Frequency Resolution of RID-TFPS The performance of the RID-TFPS measure is compared to the Wavelet-TFPS measure for signals with time-varying frequency content. The goal of this comparison is to illustrate how the two methods differ in the way that they track phase synchrony across time and a range of frequencies. For this purpose, two linear chirp signals are considered with constant phase difference, i.e. x1 (t) = exp(j(ω0 t + βt2 )) and x2 (t) = exp(j(ω0 t + βt2 + θ)), where ω0 = 0.5, β = −0.00117 in terms of the normalized frequency for 128 time samples. The parameters ω0 and β were chosen such that the signal covers a broad range of frequencies, i.e. it is not narrowband, and that there is no aliasing in frequency. 200 simulations of this signal model are performed with uniformly distributed random phase difference and additive 41 Total Variance and Total CRLB for the Signal z 1 Total Variance and CRLB 100 RID−Rihaczek Total Variance CWT Total Variance RID−Rihaczek Total CRLB CWT Total CRLB 80 60 40 20 0 0 5 10 15 SNR 20 25 30 (a) Total variance and total CRLB for the signal z1 Total Variance and Total CRLB for the Signal z 2 Total Variance and CRLB 100 RID−Rihaczek Total Variance CWT Total Variance RID−Rihaczek Total CRLB CWT Total CRLB 80 60 40 20 0 0 5 10 15 SNR 20 25 30 (b) Total variance and total CRLB for the signal z2 Figure 2.7: Comparison of the performance of RID-Rihaczek with the performance of CWT in tracking the time and frequency varying phase: (a)-(b) illustrate the total variance and total CRLB for both of the phase estimators as a function of SNR. 1600 simulations are performed for 128 time points. white Gaussian noise at a SNR value of 10 dB. Fig. 2.9 shows that around the instantaneous frequency, which is the derivative of the instantaneous phase and shown by the black lines, RID-TFPS is close to 1 (the maximum 42 Normalized Bias for the Signal z1 Normalized Bias 0.05 0.04 0.03 RID−Rihaczek CWT 0.02 0.01 0 0 5 10 15 20 25 SNR (a) Normalized bias for the signal z1 Normalized Bias for the Signal z2 30 Normalized Bias 0.04 0.03 RID−Rihaczek CWT 0.02 0.01 0 0 5 10 15 SNR 20 25 30 (b) Normalized bias for signal z2 Figure 2.8: Comparison of the performance of RID-Rihaczek with the performance of CWT in tracking the time and frequency varying phase: (a)-(b) illustrate the normalized bias for both of the phase estimators as a function of SNR. 1600 simulations are performed for 128 time points. synchrony is equal to 0.9942) and smoothly tapers off as the frequency moves away from the instantaneous frequency. On the other hand, Wavelet-TFPS has a larger spread around the instantaneous frequency (the maximum synchrony is equal to 0.9586) compared to the 43 spread of the RID-TFPS. The difference in the concentration of the PLV values around the instantaneous frequency can be quantified by computing the mean squared error (MSE) between the ideal PLV profile and the computed ones. The ideal PLV profile is a timefrequency surface with ones at the instantaneous frequency and zeros everywhere else, and can be easily obtained by computing the Wigner distribution and estimating the instantaneous frequency. For RID-TFPS, MSE is equal to 0.2427 whereas for Wavelet-TFPS it is equal to 0.3682, indicating a larger deviation from the ideal case. It is also important to note that for the Wavelet-TFPS, the bandwidth around the instantaneous frequency increases as the instantaneous frequency increases. This is due to the fact that at high frequencies the wavelet transform has high time resolution at the expense of low frequency resolution. 2.6.5 Robustness of RID-TFPS to Noise The robustness of the RID-TFPS measure in noise is evaluated and compared to the performance of the Wavelet-TFPS measure. In order to evaluate the robustness in noise, a high synchrony signal pair consisting of two sinusoids with constant phase difference is considered: x1 [n] = sin(16πn) + e1 [n] x2 [n] = sin(16πn + π/4) + e2 [n] (2.40) where e1 and e2 are independent white Gaussian noise processes at different variance levels, with SNR varying in the range of −20 dB to 30 dB. 200 simulations with 200 trials of the two signal models for 64 time points are considered to evaluate the distribution of the synchrony values. Ideally, the synchrony value between the two signals with constant phase difference is 44 Resolution of RID−TFPS 0.9 Normalized Frequency 1.2 0.8 1 0.7 0.6 0.8 0.5 0.6 0.4 0.3 0.4 0.2 0.1 0.2 20 30 40 50 60 70 80 90 100 Time Samples (a) RID-TFPS Resolution of Wavelet−TFPS 0.9 1.2 Normalized Frequency 0.8 1 0.7 0.6 0.8 0.5 0.4 0.6 0.3 0.4 0.2 0.1 0.2 20 30 40 50 60 70 80 90 100 Time Samples (b) Wavelet-TFPS Figure 2.9: RID-TFPS vs Wavelet-TFPS: Phase synchrony in the time-frequency plane for two chirp signals with constant phase shift expected to be equal to one. Fig. 2.10 illustrates the performances of the RID-TFPS and Wavelet-TFPS in estimating the true phase synchrony for different SNR values. The average synchrony values are plotted with increasing SNR values and the bars indicate the standard 45 deviations. When the SNR is less than −10 dB and greater than 10 dB, performances of both methods are almost the same. However, RID-TFPS is larger than the Wavelet-TFPS for −10 dB ≤ SNR ≤ 10 dB. Furthermore, for all of the SNR values, a Welch’s t-test is used at 5% significance level to test the null hypothesis that the synchrony values from RID-TFPS and Wavelet-TFPS are independent random samples from normal distributions with equal means, against the alternative that the means are not equal. At all SNR values, the p-values provided by the t-test are less than 0.05. Hence, the null hypothesis is rejected at all SNR values. This indicates that the RID-TFPS is significantly larger than the Wavelet-TFPS and outperforms the Wavelet-TFPS, especially for the case −10 dB ≤ SNR ≤ 10 dB. Robustness to Noise: RID−Rihaczek vs CWT 1 PLV 0.8 0.6 0.4 RID−TFPS Wavelet−TFPS 0.2 0 −20 −10 0 10 SNR (dB) 20 30 Figure 2.10: Comparison of noise robustness: Performances of the RID-TFPS and WaveletTFPS in estimating the true phase synchrony between a signal pair with constant phase difference for different SNR values, bars indicate the standard deviations. 46 2.6.6 Kuramoto Model: Comparison of RID-TFPS with WaveletTFPS for Multiple Oscillators In order to assess the performance of the RID-TFPS measure in representing the true phase synchrony and to compare it with the Wavelet-TFPS in a multiple oscillator setting, a wellstudied model of globally coupled limit-cycle oscillators, introduced by Kuramoto [90], is used. A detailed overview on Kuramoto model can be found in [91, 92, 93]. The model describes the phase dynamics of a large network of M globally coupled limitcycle oscillators [94]. Based on this model, phase dynamics are given as follows: M dϕi K ∑ = wi + sin (ϕj − ϕi ) dt M (2.41) j=1 where ϕi represents the phase of the ith oscillator, with the natural frequency wi distributed with a given probability density g(w), and K denotes the coupling strength (strength of the contributions from the other oscillators to the phase of the ith oscillator). Hence, the time-varying phase of each oscillator is determined by its natural frequency and the average influence of all other oscillators. In other words, each oscillator tries to oscillate independently at its own frequency but the coupling tends to synchronize it to the other oscillators in the network [92]. A common choice of the distribution, g(w), of the natural frequencies is the Lorentzian distribution with center frequency w0 and width γ and is given by: g(w) = γ ] [ 2 + (w − w )2 π γ 0 (2.42) When all the oscillators are running independently, the magnitude of the phase synchrony goes to zero and tends to increase in synchronized states. When the network is perfectly 47 phase-locked, the phase synchrony is equal to 1. Kuramoto showed that there is a phase transition from a desynchronized to a partially synchronized state at a critical K value, K = Kcrit . When K < Kcrit , the network is not synchronized and when K > Kcrit , a cluster of synchronized oscillators emerges and expands with increasing K. Kuramoto also showed that if the natural frequencies are collected from the Lorentzian distribution in Eq. (2.42), the critical value of K is as follows: Kcrit = 2γ (2.43) Hence, the phase transition of the network is determined by the width of the distribution of the natural frequencies. Kuramoto Model: RID−Rihaczek Distribution vs Wavelet Transform 1 Mean PLV 0.8 0.6 RID−TFPS Real PLV Wavelet−TFPS 0.4 0.2 0 0 100 200 300 K 400 500 600 Figure 2.11: Mean RID-TFPS and mean Wavelet-TFPS (PLV, averaged over all possible pairs of 16 oscillators) as a function of the coupling strength K, where the number of trials, N = 200. Error bar indicates the standard deviation. Center frequency of the oscillators is w0 = 200 rad/sec and the distribution width γ = 60 rad/sec which results in Kcrit = 120. Model Simulations: In the simulations, a system of M = 16 oscillators is used. For each oscillator, Eq. (2.41) is numerically integrated using the Runge-Kutta method with 48 a time step of ∆t = 1/128 sec corresponding to a sampling frequency of 128 Hz and thus, the time-varying phases of all oscillators are obtained. The time series corresponding to the oscillator i is formed by sin (ϕi (t)) + ni (t) for i = 1, 2, . . . , 16, with 128 samples, where ni (t) is additive white Gaussian noise at a SNR of 10 dB. The natural frequencies are collected from the Lorentzian distribution given by Eq. (2.42) with the center frequency w0 = 200 rad/sec and the width γ = 60 rad/sec. Hence, the critical K value is Kcrit = 2γ = 120. Simulations are performed for increasing K values from 0 to 540, in steps of 60. For each value of the coupling strength K, RID-TFPS and Wavelet-TFPS values, averaged over all pairs of 16 oscillators, are computed using Eq. (2.3) where the number of trials, N = 200, and are plotted as a function of K. For each oscillator pair, a single synchrony value is obtained by computing two PLV values at the two different natural frequencies of the two oscillators and calculating the mean of the two values. Averaging over all pairs, a single PLV value is obtained for each K value. Fig. 2.11 illustrates how the mean RIDTFPS and Wavelet-TFPS change with increasing coupling strength. First, in the absence of synchrony (K = 0), Wavelet-TFPS is larger than 0.35 indicating a bias in the wavelet based measure. For K = 0, ideally phase synchrony should be equal to zero but Wavelet-TFPS gives a spuriously high synchrony value at this level and stays at about a synchrony of 0.4 for K < Kcrit = 120. In contrast, RID-TFPS is almost equal to zero at K = 0 which represents the true synchrony and it stays low for K < 120. Second, when K = Kcrit , the model starts to generate partially coupled oscillators and RID-TFPS is very sensitive to the increase in the coupling strength of the system. This can be seen by the sudden increase in the RID-TFPS from K = 60 to K = 120. The Wavelet-TFPS also increases at this K value but the sensitivity of it in detecting the partial synchrony in the system is smaller compared to RID-TFPS. Thus, RID-TFPS performs better in tracking the true changes in 49 the coupling strength, especially for low K values and is more reliable than Wavelet-TFPS. Finally, both of the methods perform well in estimating the true synchrony, which should be close to 1, for large values of K. From all these results, it can be stated that RID-Rihaczek is a more efficient and reliable method than Wavelet-TFPS in estimating the underlying phase synchrony in a large network of oscillators and it tracks the real synchrony better than Wavelet-TFPS. The degradation in performance of the Wavelet-TFPS is due to the fact that the time-frequency resolution of the wavelet transform is lower than the resolution of the RID-Rihaczek distribution. The wavelet transform produces a larger bandwidth around the natural frequency of the oscillators and although the spread of the natural frequencies of the oscillators is large, γ = 60 rad/sec, this causes an overlap between the time-frequency energy distributions of the oscillators. This overlap results in a large Wavelet-TFPS value even for small coupling strengths. 2.7 Conclusions In this chapter, a new time-varying phase estimation method based on the RID-Rihaczek distribution is proposed. The performance of the phase estimator and the corresponding synchrony measure are evaluated both analytically and through simulations in comparison to existing measures in particular to continuous wavelet transform based estimates. Both the analytical and the simulation results show RID-Rihaczek phase and synchrony estimators to be more robust to noise, have better time-frequency resolution and perform better at detecting actual synchrony in the system, in particular for a network of oscillators. Future work will concentrate on exploring different complex time-frequency distributions, such as multivariate empirical mode decomposition [95], to develop more statistically sta- 50 ble and computationally efficient time-varying phase estimates for non-stationary signals. Applications of the developed measures to real signals include determining the functional connectivity networks of the brain from EEG and MEG signals through time-varying phase synchrony analysis. 51 Chapter 3 Joint-Frequency Representations for Cross-Frequency Coupling 3.1 Introduction Electrophysiological brain oscillations in various frequency bands, ranging from delta (1-4 Hz) to gamma (30-150 Hz) and beyond, have been linked to a range of cognitive and perceptual processes. Recently, cross-frequency (CF) coupling has been suggested as a plausible mechanism in neuronal computation, communication and learning [30, 29, 4]. CF coupling can be defined as a phenomenon where the basic activity resources defined by distinct frequencies might cause a more complex regulatory structure through interactions between different frequency bands. These interactions transfer information from large-scale brain networks to the fast, local cortical processing, thus integrating functional systems across multiple spatiotemporal scales [96]. Although oscillations in different frequency bands may remain independent, statistical relationships among neuronal oscillations in different frequency bands have been observed for electrohysiological recordings in humans and animals during cognitive tasks [4, 96, 25]. There are several manifestations of cross-frequency coupling including powerto-power modulation where the power (or amplitude) of the high frequency oscillations is modulated by the power of the low frequency oscillations and phase-to-power modulation where the phase of the low frequency oscillations modulates the power of the high frequency 52 oscillations. For instance, Canolty et al. investigated the relationship between theta and gamma oscillations and found that the power (or amplitude) of the fast gamma oscillations was systematically modulated during the course of a theta cycle [97]. However, little is known about the nature of interactions among oscillations in various frequency bands. Theoretically, cross-frequency interactions might occur in multiple ways (Fig. 3.1 [4]) and among these, two principal forms of CF coupling have been recognized: (n : m) or CF phase synchrony, which indicates phase locking between n cycles of one oscillation and m cycles of another oscillation [15], and systematical amplitude modulation (AM) of the power of high frequency oscillations by the phase of low frequency oscillations, which is known as the modulation effect [4]. Palva et al. used neuromagnetic recordings to investigate whether the human cortex exhibits (n : m) phase synchrony among oscillations and quantified this synchrony using the continuous wavelet transform (CWT) with a complex Morlet wavelet to estimate the time and frequency dependent phase of recordings [29]. Cohen, on the other hand, investigated the modulation effect using phase synchrony by first narrow band-pass filtering EEG recordings in combination with the Hilbert transform. In the first step, the power series corresponding to the gamma band is extracted. In the second step, a lower frequency band that the upper frequency power time series might be synchronized with is empirically identified. Finally, the phase of the two time series are estimated using the Hilbert transform and phase-to-power coupling is quantified [30]. Similarly, Voytek et al. [98] used phase synchrony to identify the modulation effect of theta and alpha phase on the gamma amplitude [98]. The similarities or differences between the two forms of CF coupling remain subtle and have not been addressed before. Although the two forms are not directly related to each other, one can not claim that the two approaches to CF coupling are independent. For 53 Figure 3.1: Principal Forms of Cross-Frequency Interactions [4]: (a) A slow oscillatory signal in the theta band (e.g., 8 Hz): the frequency remains fairly constant whereas the amplitude (red line) of the signal fluctuates. The different ways in which faster oscillatory signals (e.g., gamma oscillations) can interact with such a signal are: (b) the fluctuations in power of the faster oscillations are correlated with power changes in the lower frequency band. This interaction is independent of the phases of the signals; (c) (n : m) phase synchrony occurs between slower and faster oscillations. In each slow cycle, there are four faster cycles and their phase relationship remains fixed which indicates a phase locking between the signals; (d) the frequency of the fast oscillations is modulated by the phase of the slower oscillations; and (e) the power of the faster oscillations is modulated by the phase of the slower oscillations. instance, the instantaneous frequency of a signal x3 (t) = x1 (t)x2 (t), where x1 (t) is the amplitude modulator and x2 (t) is the carrier, depends on the frequency of the modulator. Hence, the instantaneous phase of x3 (t) is affected by the modulator and this might cause a CF phase synchrony between the modulator, x1 (t), and the signal, x3 (t). For example, if x1 (t) is a sinusoid at 20 Hz and x2 (t) is a sinusoid at 60 Hz, there will be (2 : 1) and (4 : 1) CF phase synchronization between the signals x1 (t) and x3 (t). Thus, for this example, amplitude modulation results in a CF phase synchrony. However, the same CF phase synchrony can be observed between the signals x1 (t), a sinusoid at 20 Hz, and x3 (t), sum of two sinusoids at frequencies 40 Hz and 80 Hz. Therefore, despite the fact that amplitude modulation might lead to a CF phase synchrony, one can not uniquely identify the cause of the observed 54 phase synchrony. Thus, CF phase synchrony does not provide direct information about the generating signal interaction model. Moreover, if the modulating signal, x1 (t), or the carrier, x2 (t), are not narrowband, it will be hard to learn any information about the modulation relationship from the CF phase synchrony since one will observe phase synchrony values at a large range of frequencies. In summary, CF phase synchrony is not a direct way of quantifying the modulation relationship between signals and one needs a more precise and direct approach to identify the modulation effect. In this chapter, we propose two complementary methods to quantify CF phase synchrony and modulation effect. The first method is based on the RID-Rihaczek distribution and is an extension of Palva et al.’s Morlet wavelet based CF phase synchrony measure. The second method is based on defining a cross frequency-spectral lag distribution that is focused on quantifying the amount of amplitude modulation between signals. Both methods can be employed to quantify both phase-phase locking and amplitude-phase locking. The major difference between the two proposed methods is that the first method does not imply any particular form for the underlying signal model whereas the cross frequency-spectral lag based approach assumes an AM generating model for the signals. The organization of this chapter is as follows: In section 3.2, we present the first approach and demonstrate the accuracy of the CF phase synchrony estimator through simulations. In section 3.3, we propose the second approach which is closely related to the modulation frequency and modulation spectrum in speech processing literature. The performance of the proposed distribution is illustrated for simulated signals as well as for electroencephalogram (EEG) signals. Finally, in section 3.4, we discuss and summarize the similarities and differences between the two approaches for investigating CF interactions. 55 3.2 Cross Frequency Phase Synchronization In this section, we will extend the RID-Rihaczek based phase synchrony measure to quantify the CF phase synchrony between two signals across different frequencies. As illustrated in chapter 2, RID-Rihaczek distribution offers time and frequency dependent phase information, ϕ(t, ω), of a signal x(t). In this section, we exploit this property of RID-Rihaczek distribution and modify the definition of (1 : 1) phase locking value (PLV) to define (n : m) or CF phase locking value (CFPLV) between signals, x1 (t) and x2 (t), as follows: ( ( )) N 1 ∑ ω2 k k (t, ω ) CF P LV1,2 (t, ω1 , ω2 ) = exp j , ϕ (t, ω1 ) − ϕ2 2 N ω1 1 (3.1) k=1 where N is the number of trials and ϕk (t, ω) and ϕk (t, ω) are the time and frequency depen1 2 dent phases of signals, x1 (t) and x2 (t), for the k th trial. According to the classical definition of (n : m) phase synchrony, n and m have to be integers [15, 29]. However, this is not a necessary condition for the CF phase synchrony to evaluate phase locking due to the fact that at least one period of the slower oscillation, which is phase locked to multiple periods of the faster oscillation, provides sufficient information about the signal’s characteristics and is adequate for tracking the phase locking between the slower and faster oscillations. In Eq. ω (3.1), n corresponds to ω2 , any arbitrary positive value, whereas m = 1. Similar to the PLV, 1 CFPLV is in the range [0, 1] where 1 indicates perfect CF phase locking and zero indicates no coupling between the two signals. CFPLV can be considered as a 3-dimensional CF phase synchrony measure which has the axes, time, ω1 and ω2 , respectively. Since it might be hard to inprete these three-dimensional synchrony values for the CF coupling, in this section, we 56 will integrate the CFPLV and compute the mean along the time axis: T 1∑ CF P LV1,2 (t, ω1 , ω2 ) CF P LV (ω1 , ω2 ) = T (3.2) t=1 where T is the number of time points. Therefore, CFPLV will be represented as a twodimensional function of ω1 and ω2 . By looking at the CFPLV, one can identify the CF phase synchrony at all possible frequency pairs. For example, for a signal set consisting of two sinusoids, x1 (t) = sin (ω1 t) and x2 (t) = sin (ω2 t), the only nonzero element of the 2-D CFPLV should be at (ω1 , ω2 ) since the two signals have constant instantaneous frequencies. 3.2.1 Accuracy of RID-Rihaczek Distribution in Estimating CF Phase Synchrony In Chapter 2, it was shown that the phase estimate from the Rihaczek distribution is not unbiased. Hence, the accuracy of the proposed distribution, which depends on the Rihaczek distribution, in quantifying CF phase synchrony needs to be evaluated. The time and frequency dependent phase from Rihaczek distribution was estimated as ϕ(t, w) = ϕ(t) − θ(ω) − ωt, where ϕ(t) and θ(ω) refer to the phase in the time and the frequency domains, respectively. However, finding analytical expressions for the frequency domain phase of signals with arbitrary phase in the time domain is not possible since these expressions depend on the underlying signals. Therefore, we consider a simpler signal set consisting of two second 57 order polynomial phase signals: 2 x1 (t) = ejϕ1 (t) = ej(βt ) β 2 x2 (t) = ejϕ2 (t) = ej( 2 t +K) (3.3) where K is the constant phase difference and the instantaneous frequency of x1 (t), ω1 (t) = 2βt, is two times the instantaneous frequency of x2 (t), ω2 (t) = βt. Since ϕ1 (t) = 2ϕ2 (t), (1 : 2) CF phase synchrony exists between the two signals since |Φ1,2 (t) = ϕ1 (t) − 2ϕ2 (t)| = 2K is constant. Therefore, the ideal CF phase synchrony is equal to one. θ1 (ω) and θ2 (ω) need to be computed from the Fourier transforms, X1 (ω) and X2 (ω): ( 1 X1 (ω) = √ e 2β 2 −j ω − π 4β 4 1 X2 (ω) = ejK √ e β ) ( 2 −j ω − π 2β 4 ) (3.4) as ω2 π − ) 4β 4 2 ω π θ2 (ω) = K − ( − ) 2β 4 θ1 (ω) = −( (3.5) Note that the instantaneous frequency of x1 (t), ω1 (t) = 2ωt , is two times the instantaneous frequency of x2 (t), ω2 (t) = ωt . Hence, the time and frequency dependent phases, ϕ1 (t, w) 58 and ϕ2 (t, w), are computed as: (2βt)2 π − − 2βt2 4β 4 2 2 βt (βt) π ϕ2 (t, w) = + − − βt2 2 2β 4 ϕ1 (t, w) = βt2 + (3.6) Finally, the phase difference between the two signals is obtained as: |Φ1,2 (t, w)| = |ϕ1 (t, w) − 2ϕ2 (t, w)| = π 4 (3.7) One can see that the Rihaczek distribution provides a constant phase difference for the signal model in Eq. (3.3), which results in an ideal CF phase synchrony value. Therefore, Rihaczek distribution can be used to quantify the CF phase synchrony since the bias is removed due to the CF interactions as in Eq. (3.7). In fact, this example signal model does not provide an analytical proof that the Rihaczek distribution is an unbiased CF phase synchrony estimator, but it provides some inspiration for the use of the proposed method in estimating CF phase synchrony. 3.2.2 Simulation Examples In order to evaluate the performance of the proposed method in estimating CF phase synchrony, we consider a signal set consisting of two sinusoids with constant phase difference: x1 (t) = cos (2π20t) + e1 (t) x2 (t) = cos (2π70t + π/3) + e2 (t) 59 (3.8) where e1 (t) and e2 (t) are complex white Gaussian noise sequences. SNR value of 10 dB is considered with 200 trials for 256 time points and the sampling rate is 256 Hz. The two signals are (3.5, 1) phase synchronized and ideally, the only nonzero element of CFPLV should be at (20Hz, 70Hz) point. Fig. 3.2 shows the CFPLV for the signal model in Eq. (3.8). As expected, CF P LV (20, 70) has the largest synchrony value of 0.802. There is some spread around this point due the smoothing filter in the ambiguity domain. CFPLV 120 Largest CFPLV Value 100 0.7 0.6 0.5 60 0.4 40 f2(Hz) 80 0.3 0.2 20 0.1 20 40 60 80 f (Hz) 100 120 1 Figure 3.2: CF phase synchrony for the signal model in Eq. (3.8). CF P LV (20, 70) has the largest synchrony value and the CFPLV is concentrated around this point. The second signal model we consider is a signal set consisting of two constant amplitude chirp signals, x1 (t) and x2 (t), with constant phase difference where the instantaneous frequency, ω2 (t), of x2 (t) is a function of the instantaneous frequency, ω1 (t), of x1 (t): 2 x1 (t) = ejϕ1 (t) + e1 (t) = ej2π(20t+15t ) + e1 (t) 2 x2 (t) = ejϕ2 (t) + e2 (t) = ej2π(40t+30t +π/3) + e2 (t) 60 (3.9) SNR value of 10 dB is considered with 200 trials for 256 time points and the sampling 1 rate is 256 Hz. The instantaneous frequency f1 (t) = 2π ∂ϕ1 (t) ∂t , is initially equal to 20 Hz and increases up to 50 Hz which is the final value. f2 (t) is two times the value of f1 (t) and therefore, it is initially equal to 40 Hz and increases up to 100 Hz. Fig. 3.3 shows the CFPLV for the two chirp signals. The dashed black line indicates the ideal CFPLV profile, where CFPLV is equal to one on this line and zero everywhere else, and represents both the range of instantaneous frequencies, f1 and f2 , and the relationship between them. The slope of the line is constant and equal to two. As one can see, CFPLV is highest CFPLV 0.7 120 0.6 f2 (Hz) 100 0.5 80 0.4 60 0.3 40 0.2 20 0.1 20 40 60 80 f (Hz) 100 120 1 Figure 3.3: CF phase synchrony for the signal model in Eq. (3.9): The dashed black line indicates the ideal CFPLV profile, where CFPLV is equal to one on this line and zero everywhere else, and represents both the range of instantaneous frequencies, f1 and f2 , and the relationship between them, f2 (t) = 2f1 (t). f1 has the range [20, 50] Hz and f2 has the range [40, 100] Hz. (0.7283) and concentrated on this line. For this example model, the relationship between the instantaneous frequencies of the two chirp signals is linear. The spread around the black line or the deviation from the ideal CFPLV profile is because of the residual cross-terms and 61 the noise effect. Finally, we consider a signal model where the relationship between instantaneous frequencies is nonlinear: 2 x1 (t) = ejϕ1 (t) + e1 (t) = ej2π(10t+55t ) + e1 (t) ) ( 260 3960 2 12100 3 jϕ2 (t) + e (t) = ej2π 160 t+ 320 t + 480 t +π/3 + e (t) x2 (t) = e 2 2 (3.10) where f1 (t) = 10 + 110t, is initially equal to 10 Hz and increases up to 120 Hz and f2 (t) = 2 f1 (t) f1 (t) + 160 = 260 + 3960 t+ 12100 t2 , is initially equal to 1.625 Hz and increases up to 102 Hz. 10 160 160 160 Fig. 3.4 shows the CFPLV for the model in Eq. (3.10). The dashed black quadratic curve indicates the ideal CFPLV profile. CFPLV is highest (0.4937) and concentrated around this nonlinear curve. Hence, RID-Rihaczek distribution performs well in estimating CF phase synchrony between signals whether the relationship between their instantaneous frequencies is linear or nonlinear. 3.2.2.1 Evaluating Amplitude Modulation Through CFPLV As discussed in the Introduction, modulation effect and CF phase synchrony are interrelated approaches to quantifying CF coupling between signals. Here, we consider an amplitude modulation model consisting of two sinusoids to evaluate the modulation effect through CFPLV: x1 (t) = cos(2π40t) + e1 (t) x2 (t) = x1 (t) cos(2π60t) + e2 (t) 62 (3.11) CFPLV 0.45 100 f2 (Hz) 120 0.4 0.35 80 0.3 60 0.25 40 0.2 0.15 20 0.1 20 40 60 80 f (Hz) 100 120 1 Figure 3.4: CF phase synchrony for the signal model in Eq. (3.10): The dashed black quadratic curve indicates the ideal CFPLV profile. The relation between the two instanta2 f1 (t) f (t) neous frequencies is f2 (t) = 1 + 160 , where f1 has the range [10, 120] Hz and f2 has the 10 range [1.625, 102] Hz. where x1 (t) is the amplitude modulator of x2 (t) and SNR value of 10 dB is considered with 200 trials for 256 time points and the sampling rate is 256 Hz. From Fig. 3.5, one can see that x1 (t), at frequency 40 Hz, is phase synchronized with x2 (t), at both 20 Hz and 100 Hz. This is caused by the modulation of x2 (t) by x1 (t). For this example, amplitude modulation results in a clear CF phase synchrony since the modulator and the carrier signals are sinusoids. However, for wide-band modulating and carrier signals, one will observe CF synchrony values at a large range of frequencies. 63 CFPLV 120 0.8 100 0.6 f2(Hz) 80 60 0.4 40 0.2 20 20 40 60 80 f (Hz) 100 120 1 Figure 3.5: CFPLV for the signal model in Eq. (3.11): x1 (t), at frequency 40 Hz, is CF synchronized with x2 (t), at both 20 Hz and 100 Hz. This is caused by the modulation of x2 (t) by x1 (t) and shows that AM might result in a CFPLV between signals. 3.3 Joint Frequency Spectral Lag Representation for Cross-Frequency Modulation Analysis in the Brain Until recently, modulation between different frequency ranges has been quantified using phase synchrony measures. However, CF phase synchrony does not provide direct information about the nature of the interactions, i.e. it’s not clear from CF phase coupling whether there is an AM relationship between the two signals. Therefore, we define a new distribution which can directly identify the existence and the amount of amplitude modulation between signals and is closely related to the modulation frequency and spectrum in speech processing literature. Modulation spectrum contains both short-term and long-term information about the signal representing patterns of time variation and is generally computed by first applying a Fourier transform on short-time windows of the signal, and then taking a Fourier transform 64 of the sequence of short-term magnitude spectrum features. The concepts of modulation frequency along with modulation spectrum are originally encountered in acoustics, speech and audio processing. The joint frequency analysis has been important in speech and audio recognition and classification applications and has been referred to as joint acoustic and modulation frequency representation. This joint representation was inspired by the observation that very low frequency modulations of sound are the fundamental carrier of information in speech of timbre in music, i.e. low bandwidth processes modulate higher bandwidth carriers [99, 100]. Augmenting the conventional concept of system function frequency analysis, Zadeh first proposed joint acoustic and modulation frequency analysis, a separate frequency dimension for representing system variation [101]. Kailath performed the first analysis of this bifrequency system function and showed its application to modulation-based systems [102]. More recently, Gardner extended the concept of bifrequency analysis for cyclostationary systems, and these cyclostationary approaches have been widely applied in parameter estimation and detection [103]. These studies do not focus on decomposing a signal into its modulators and carriers. Recently, Atlas et al. proposed modulation spectrum analysis and modulation filtering for separating out the low frequency temporal envelope from the high frequency carrier signal [104, 99]. Modulation filtering is defined as the process of modifying an analytical subband signal by filtering its modulator and recombining the result with the original carrier [105]. Interest in modulation filtering arises from the observation that envelope fluctuations in speech tend to be low-pass in nature with a peak around the syllabic rate, or 3-4 Hz. However, modulation filtering, which is based on the modulation spectrum, much like time-frequency distribution, is limited to identifying the modulator and carrier of an individual signal, rather than investigating the modulation relationship between signals. In this section, we introduce cross frequency65 spectral lag representation based on the Wigner distribution to represent the modulation relationships between two signals. The cross frequency-spectral lag distribution offers crossfrequency coupling information and it focuses on amplitude modulation between two signals. Furthermore, we define a measure of entropy and use some existing symmetric information divergence measures to quantify the amount of modulation observed in the cross frequencyspectral lag distribution. 3.3.1 Joint Frequency-Spectral Lag Distribution Joint frequency representation, also referred to as joint frequency-spectral lag distribution, can be defined based on the bilinear class of joint time-frequency distributions [79]. The simplest joint frequency distribution for a signal x(t), Px (η, ω), where ω is the actual frequency and η is the spectral lag or the modulation frequency, is defined as 1 : ∫ ∫ τ τ x∗ (t − )x(t + )e−jωτ e−jηt dτ dt, 2 2 η η = X ∗ (ω − )X(ω + ) 2 2 1 P (η, ω) = 2π and is based on the Wigner distribution of the signal, W (t, ω), since P (η, ω) = (3.12) ∫ W (t, ω)e−jηt dt. Similarly, P (η, ω) can be expressed as the 2-D Fourier transform of the signal’s local autocorrelation function or time-varying correlation, R(t, τ ) = x(t + τ )x∗ (t − τ ). Bispectrum, which 2 2 is defined as the third order spectrum or the first member of higher-order spectra and is an efficient way of capturing non-Gaussian correlations [106], resembles this joint representation since it is the 2-D Fourier transform of the third-order cumulant sequences. However, there is no direct connection between bispectrum and the joint frequency representation discussed 1 All integrals are from −∞ to ∞ unless otherwise noted. 66 here [104]. Due to the quadratic nature of the representation, cross-terms or interference may occur. For real life signals, the effects of the cross-terms are reduced by two-dimensional smoothing similar to the spectrogram. First, a two-dimensional smoothing function with an appropriately chosen window length is used to estimate a joint time-frequency representation and then a Fourier transform is applied along the time dimension yielding an estimate of SP the modulation spectrum, Px (η, ω) = Px (η, ω) ∗ Ph (η, ω), where Ph is the joint frequency representation of the window function h. 3.3.2 Cross Frequency-Spectral Lag Distribution Previous work in modulation spectrum analysis has focused on understanding how low frequency information modulates high frequency carrier within a signal. However, for the analysis of modulations between neural oscillations, the modulation spectrum defined above is inadequate. For this reason, in this section we introduce cross-frequency-spectral lag distribution to quantify the modulating effect of one signal, x1 , on another one, x2 : ∫ ∫ τ τ x∗ (t − )x2 (t + )e−jωτ e−jηt dτ dt, 1 2 2 η η ∗ = X1 (ω − )X2 (ω + ) 2 2 1 Px1 x2 (η, ω) = 2π (3.13) This cross frequency-spectral lag distribution is closely related to the Cross-Wigner distribution of x1 and x2 through the Fourier transform. Similarly, Px1 x2 (η, ω) can be expressed as the 2-D Fourier transform of the local cross-correlation function, x∗ (t − τ )x2 (t + τ ). When 1 2 2 ∗ η = 0, this distribution reduces to cross-spectral density, X1 (ω)X2 (ω), and for non-zero η values it quantifies the cross-spectrum for different frequency lags. In order to illustrate the behavior of the proposed distribution, we consider a simple 67 amplitude modulation (AM) model: x1 (t) = A cos ωm t, x2 (t) = m(t)c(t) = (A + A cos ωm t)B cos ωc t (3.14) where m(t) = A + x1 (t) is the modulator, c(t) = B cos ωc t is the carrier of x2 (t) and ωm and ωc are the modulation and carrier frequencies, respectively. The corresponding cross frequency-spectral lag distribution is: η η ∗ Px1 x2 (η, ω) = X1 (ω − )X2 (ω + ) 2 2 (3.15) } η A{ η η ∗ X1 (ω − ) = δ(ω − − ωm ) + δ(ω − + ωm ) 2 2 2 2 Auto terms (3.16) where and } η AB { η η X2 (ω + ) = δ(ω + − ωc ) + δ(ω + + ωc ) 2 2 2 2 Auto Terms { } AB η η + δ(ω + − ωc − ωm ) + δ(ω + − ωc + ωm ) 4 2 2 Modulation Terms { } AB η η + δ(ω + + ωc − ωm ) + δ(ω + + ωc + ωm ) 4 2 2 Modulation Terms (3.17) Fig. 3.6 shows the magnitude of the cross frequency-spectral lag distribution of the modulation model in Eq.(3.14), which consists of 12 impulse functions on the (η, ω) plane located ∗ at the line crossings. Since, |Px1 x2 (η, ω)| = |Px1 x2 (−η, −ω)|, it is symmetric with respect to 68 the origin, the first quadrant will be adequate to analyze the distribution. Therefore, only the six impulses in the first quadrant are enumerated and their coordinates are shown on the (η, ω) plane in Fig. 3.6. (ωc − ωm, (ωc+ωm)/2) ω (ω , (ω +2ω )/2) c c ω (frequency) 2 m 3 (ω − 2ω , ω /2) c m c 1 0 4 5 η 6 (ω +2ω , ω /2) c m c (ω , (ω −2ω )/2) c c m (ωc+ωm, (ωc−ωm)/2) 0 η (frequency lag) Figure 3.6: Cross frequency-spectral lag distribution of the modulation model in Eq.(3.14), which is symmetric with respect to the origin. 2 and 5 are caused by the auto terms, whereas impulses 1, 3, 4 and 6 are caused by the modulation terms. Impulses at 2 and 5 refer to the products of the two spectra at their center frequencies, ∗ ∗ X1 (ωm )X2 (ωc ) and X1 (−ωm )X2 (ωc ), respectively. These terms would appear regardless of whether there is any modulation between the two signals or not. The impulses at 1 and 4 ∗ ∗ refer to the products, X1 (ωm )X2 (ωc − ωm ) and X1 (−ωm )X2 (ωc − ωm ), and measure the amount of modulation between the modulating signal, x1 , at ωm and the modulated signal, ∗ x2 , at ωc − ωm . Similarly, the impulses 3 and 6 refer to the products, X1 (ωm )X2 (ωc + ωm ) ∗ and X1 (−ωm )X2 (ωc + ωm ). If there is no modulation between the two signals, X2 (ωc + ωm ) and X2 (ωc − ωm ) will be approximately zero for narrowband signals and there won’t be any 69 activity at locations 1, 3, 4 and 6. 3.3.2.1 Quantification of Modulation Once the magnitude of cross frequency-spectral lag distribution, |Px1 x2 (η, ω)|, is obtained, one needs to quantify the amount of modulation between signals. For this purpose, we will consider both the entropy of |Px1 x2 (η, ω)| and symmetric information divergence measures: 1. Entropy: Applying entropy to time-frequency distributions for quantifying the number of signal terms has been previously suggested [107]. For the case that there is no modulation between signals, the distribution will be concentrated around two impulses (2 and 5 in Fig.2.1) corresponding to the difference and sum of the frequencies, ωm and ωc . On the other hand, when there is some modulation, side lobes or extra terms will appear in the distribution (impulses 1, 3, 4 and 6 in Fig.2.1). Therefore, one can expect the entropy of |Px1 x2 (η, ω)| to be higher if there is any modulation between signals since the distribution of |Px1 x2 (η, ω)| will be closer to a uniform distribution. This fact will be exploited to determine whether there is any amplitude modulation between the two signals, using the entropy defined as: H(|Px1 ,x2 |) = − n ∑ pi (|Px1 ,x2 |) log pi (|Px1 ,x2 |) (3.18) i=1 where pi (|Px1 ,x2 |) is the n bin histogram of |Px1 ,x2 |. 2. Symmetric Information Divergence Measures: The impulses at 2 and 5 in Fig. 3.6 refer to the cross-product of the two spectra 70 at their center frequencies and appear regardless of whether there is any modulation relationship between the two signals or not. However, the remaining four impulses would appear if and only if there exists a modulation between the signals. Therefore, we can claim that the impulses at 2 and 5 belong to the same class, corresponding to the case there is no modulation effect, and the remaining four impulses belong to another class, corresponding to the information about the modulation. For real signals which still have to be narrowband, instead of impulses, one would observe a spread or distribution around the impulses as illustrated in Fig. 3.9. Hence, these distributions can be represented by 2-D Gaussian functions. In order to cluster these distributions into two different classes, we use a Gaussian mixture model to describe the 2-D distribution, |Px1 x2 (η, ω)|, such that the distribution is represented by multiple 2-D Gaussian functions. A Gaussian mixture model is defined as a weighted sum of M component Gaussian densities: p(x|θ) = M ∑ wi g(x|µi , Σi ) (3.19) i=1 where x is a D-dimensional vector, wi and g(x|µi , Σi ), i = 1, . . . , M , are the mixture weights and component Gaussian densities, respectively. Each component density is a D-variate Gaussian function of the form: } 1 T Σ−1 (x − µ ) , g(x|µi , Σi ) = exp − (x − µi ) i i 2 (2π)D/2 |Σi |1/2 { 1 (3.20) with mean, µi , and covariance, Σi . The mixture weights must satisfy the constraint that ∑M i=1 wi = 1 and the complete Gaussian mixture model is parameterized by 71 θ = {wi , µi , Σi }. As long as the analyzed signals are narrowband, one will observe six distributions where the power or magnitude of these distributions depend on the strength of modulation. Therefore, in this section, we can assume that D = 2 and M = 6. We use the two parameters, mean and covariance, of the two Gaussian distributions corresponding to the impulses at 2 and 5, to generate a 2-Dimensional distribution, Q(η, ω), which then will be compared to the original distribution, |Px1 x2 (η, ω)|. If there is no modulation between the two signals, one expects to see a low divergence between |Px1 x2 (η, ω)| and Q(η, ω). In contrast, the divergence between the two distributions will increase as the amount of modulation increases since side lobes or extra terms will appear in |Px1 x2 (η, ω)|. In order to use information divergence measures for quantifying the amount of modulation, we first normalize the two distributions and update them as: |Px x (η, ω)| P (η, ω) = ∑ ∑ 1 2 η ω |Px1 x2 (η, ω)| Q(η, ω) Q(η, ω) = ∑ ∑ η ω Q(η, ω) (3.21) such that they are valid joint probability distributions. In this section, we will use symmetric Kullback-Leibler (K(P ||Q)), Chi-square (χ2 (P ||Q)) and Arithmetic-Geometric 72 (T (P ||Q)) divergence measures which are defined as [108]: K(P ||Q) = ∑∑ η χ2 (P ||Q) = ( (P (η, ω) − Q(η, ω)) log ω P (η, ω) Q(η, ω) ) ∑ ∑ (P (η, ω) − Q(η, ω))2 (P (η, ω) + Q(η, ω)) P (η, ω)Q(η, ω) η ω ) ( ∑ ∑ (P (η, ω) + Q(η, ω)) P (η, ω) + Q(η, ω) √ T (P ||Q) = log 2 2 P (η, ω)Q(η, ω) η ω 3.3.3 (3.22a) (3.22b) (3.22c) Simulation Examples In order to illustrate the performance of the proposed distribution in determining the AM modulation between signals, we consider two example signal models. The first model consists of sinusoidal signals: x1 (t) = cos(2π10t) x2 (t) = cos(2π60t) x3 (t) = (1 + x1 (t)) cos(2π60t) (3.23) and the second model consists of Gabor logon signals centered at the same frequencies as the sinusoidal signals: ) (t − 50/256)2 cos (2π10t) y1 (t) = exp − 2σ 2 ( ) (t − 50/256)2 y2 (t) = exp − cos (2π60t) 2σ 2 ( ) (t − 50/256)2 y3 (t) = (1 + y1 (t)) exp − cos (2π60t) 2σ 2 ( 73 (3.24) Cross frequency−spectral lag distribution for x1 and x2 1 40 0.8 30 0.6 20 0.4 10 f = ω / (2π) (Hz) 50 0.2 20 40 η (Hz) 60 80 (a) Px1 x2 Cross frequency−spectral lag distribution for x1 and x3 1 40 0.8 30 0.6 20 0.4 10 f = ω / (2π) (Hz) 50 0.2 20 40 η (Hz) 60 80 (b) Px1 x3 Figure 3.7: For the model in eq. (3.23), (a) shows the cross frequency-spectral lag distribution when there is no modulation whereas (b) shows the distribution when modulation exists. Entropies of the distributions are: H(Px1 ,x2 ) = 0.0428, H(Px1 ,x3 ) = 0.0677. where 0 ≤ t ≤ 1, σ = 12.8 and the sampling frequency is 256 Hz. In both models, modulation is only between the first and third signals, whose carrier is the second signal. Figs. 3.7 and 3.8 show the magnitude of the cross frequency-spectral lag distributions 74 Cross frequency−spectral lag distribution for y1 and y2 f = ω / (2π) (Hz) 50 0.8 40 0.6 30 20 0.4 10 0.2 20 40 η (Hz) 60 80 (a) Py1 y2 Cross frequency−spectral lag distribution for y1 and y3 1 40 0.8 30 0.6 20 0.4 10 f = ω / (2π) (Hz) 50 0.2 20 40 η (Hz) 60 80 (b) Py1 y3 Figure 3.8: For the model in eq. (3.24), (a) shows the cross frequency-spectral lag distribution when there is no modulation whereas (b) shows the distribution when modulation exists. Entropies of the distributions are: H(Py1 ,y2 ) = 0.1189, H(Py1 ,y3 ) = 0.2465 for the two example models, which are consistent with Fig. 3.6. Because of the modulation between x1 and x3 , and y1 and y3 , entropies and the divergence values of the corresponding distributions in Fig. 3.7(b) and 3.8(b), are higher. From Figs. 3.7 and 3.8, it can be con- 75 cluded that as long as the analyzed signals are narrowband, energy of the cross distribution will be concentrated around the same locations as the location of the impulses in Fig. 3.6, which depend on the center frequencies, ωm and ωc . However, instead of observing impulses, one will observe a distribution spread around the location of the impulses as seen in Figs. 3.8(a) and 3.8(b).2 3.3.4 Application to EEG Data The proposed measure is applied to a set of EEG data containing the error-related negativity (ERN). The ERN is a brain potential response that occurs following performance errors in a speeded reaction time task [109]. EEG data from 63-channels was collected in accordance with the 10/20 system on a Neuroscan Synamps2 system (Neuroscan, Inc.). A speededresponse flanker task was employed, and response-locked averages were computed for each subject. In this section, we have analyzed data from 46 subjects. The proposed cross frequency-spectral lag distribution is used as a novel tool to determine the strength of the modulation between theta-gamma and alpha-gamma bands of EEG. For each subject, three channels which have the maximum power for theta (4-8 Hz), alpha (8-12 Hz) and gamma (40-70 Hz) bands are identified, respectively. Then all trials of the selected channels are bandpass filtered around the frequency band of interest, so that they are only composed of the relevant frequency bands. In this section, the interactions between the low frequency bands, theta and alpha, and the high frequency band, gamma, are investigated using both power-to-power and phase-to-power modulation of EEG. 2 This is validated using several simulation models but only Gabor logon signal model is included in this section. 76 3.3.4.1 Power-to-power Modulation Power-to-power modulation is the phenomenon where the fluctuations in the power of the faster gamma oscillations are modulated by the power changes in the lower frequency band. For each subject, the three electrodes having the highest power in the theta, alpha and gamma bands are identified. For all trials, cross frequency-spectral lag distributions for theta-gamma and alpha-gamma bands are obtained and the modulation strength between the frequency bands are quantified using both the entropy measure in Eq. (3.18) and the symmetric information divergence measures in Eq. (3.22). Among the 46 subjects, 44 subjects yield higher entropy values for the theta-gamma modulation compared to the entropy values for the alpha-gamma modulation. Using the Wilcoxon rank sum test, the null hypothesis that the entropy values come from identical distributions with equal medians, against the alternative that they do not have equal medians, is rejected at %5 significance level (p-value = 0.00023). Moreover, the divergence measures provide much better results where all of the subjects yield higher divergence values for the theta-gamma modulation compared to the divergence values for the alpha-gamma modulation. Using the Wilcoxon rank sum test for K(P ||Q), χ2 (P ||Q) and T (P ||Q), the null hypothesis is rejected at %5 significance level with p-values, 5.7759 × 10−16 , 1.7958 × 10−14 and 4.7636 × 10−16 , respectively. Therefore, the modulation strength between the theta and gamma bands is significantly larger than the modulation strength between the alpha and gamma bands as observed previously in the literature [30]. Fig. 3.9 shows the magnitude of the cross frequency-spectral lag distributions for the theta-gamma and alpha-gamma bands of sample EEG data. The regions with the highest energies are enumerated such that one can clearly compare the strengths of the modulations. Regions 1, 3, 4, and 6 have higher 77 energy in Fig. 3.9(a), which indicate stronger modulation effect between theta and gamma bands. Furthermore, the coordinates of the regions in both figures are consistent with Fig. 3.6, which shows the locations of auto and modulation terms as functions of ωm and ωc (ωm (theta) ≈ 6 Hz, ωm (alpha) ≈ 10 Hz, ωc (gamma) ≈ 45 Hz). 3.3.4.2 Phase-to-power Modulation In phase-to-power modulation, the power of the fast gamma oscillations is systematically modulated during the course of a theta cycle [4]. In other words, there is a strong correlation between the phase of theta oscillations and the power of gamma oscillations. The same procedure described in section 3.3.4.1 is followed but this time the time-varying phase of the low frequency bands, theta and alpha bands, is extracted to quantify phase-to-power modulation between low frequency and high frequency bands. Hilbert transform is used to estimate the phase of the theta and alpha bands. Among the 46 subjects, 30 subjects yield higher entropy values for the theta (phase)-gamma modulation compared to the entropy values for the alpha (phase)-gamma modulation. Wilcoxon rank sum test is used and the null hypothesis is rejected at %5 significance level (p-value = 0.0489). On the other hand, the divergence measures provide better results where 45 out of the 46 subjects yield higher divergence values for the theta phase-gamma modulation compared to the divergence values for the alpha phase-gamma modulation. Using the Wilcoxon rank sum test for K(P ||Q), χ2 (P ||Q) and T (P ||Q), the null hypothesis is rejected at %5 significance level with p-values, 1.8029 × 10−16 , 9.1633 × 10−15 and 1.4821 × 10−16 , respectively. Hence, the modulation strength between the theta phase and gamma bands is significantly larger compared to alpha phase and gamma bands. 78 f = ω / (2π) (Hz) Cross frequency−spectral lag distribution for Theta and Gamma 120 100 100 3 80 2 80 1 60 60 40 40 20 20 5 4 20 6 40 η (Hz) 60 (a) PT heta,Gamma f = ω / (2π) (Hz) Cross frequency−spectral lag distribution for Alpha and Gamma 120 70 100 3 60 2 80 1 50 40 60 30 40 20 20 5 4 20 40 η (Hz) 6 10 60 (b) PAlpha,Gamma Figure 3.9: (a) and (b) show the cross frequency-spectral lag distributions for the thetagamma (H(PTheta,Gamma ) = 0.4362) and alpha-gamma (H(PAlpha,Gamma ) = 0.3635) bands of sample EEG data, respectively. 3.4 Conclusions In this chapter, we proposed two complementary approaches to quantify CF phase synchrony and modulation effect. The first approach is based on the RID-Rihaczek distribution and 79 has been applied to simulated signals. Simulation results demonstrate that the proposed approach performs well in quantifying the phase synchrony across frequencies. The second method is based on a new cross frequency-spectral lag distribution which can determine the cross-frequency modulation between two signals and has been applied to both simulated signal models and EEG data. Simulation results and EEG applications show that the proposed method performs well in identifying the modulation effects. In particular, the hypothesis that the power of the gamma oscillations is modulated by the phase of the theta band in EEG has been supported by this new approach. Although the two approaches are not directly related to each other, they are not independent either. Both methods can quantify phase-phase and amplitude-phase locking. However, the major difference between the two proposed methods is that the first method does not imply any particular form for the underlying signal model whereas the cross frequencyspectral lag based approach assumes an AM generating model for the signals. Moreover, the first method is based on the individual signal analysis using phase estimate from the RID-Rihaczek distribution, whereas the second method is based on the Fourier transform of the cross Wigner distribution or the two dimensional Fourier transform of the local cross correlation of two signals. Hence, the first method depends on the phase spectrum of individual signals, whereas the second depends on the cross-energy density of two signals. Future work will concentrate on the application of the approach based on the RIDRihaczek distribution to EEG signals and exploring different measures to quantify the amount of modulation. We will also investigate methods for determining the low frequency bands that modulate high frequency oscillations in EEG signals without a priori knowledge. 80 Chapter 4 Methods for Quantifying Multivariate Phase Synchronization 4.1 Introduction Cooperative dynamic behavior of complex systems is relevant in many fields of research, from climactic processes and geophysics to economics, human cardio-respiratory system and neuroscience. In many cases, this complex dynamics is to be conceived as arising through the interaction of subsystems which can be observed in the form of multivariate time series reflecting the measurements from the different parts of the system. Usually, the degree of interaction of two subsystems is quantified using bivariate measures of signal interdependence such as traditional cross-correlation and spectral coherence techniques or nonlinear measures such as mutual information [12]. More recently, tools from nonlinear dynamics, in particular, phase synchronization have received much attention since they offer a way of extracting information on the interdependence of weakly interacting systems that cannot be obtained by traditional methods [14, 15, 110]. Phase synchronization of a network of oscillators occurs in many complex systems including the human brain, where both linear and nonlinear dependencies are quantified though bivariate phase synchrony using noninvasive measurements such as EEG data [77]. The current phase synchrony measures are limited to quantifying bivariate relationships and do not reveal any information about the global 81 connectivity patterns which are important for understanding the underlying oscillatory networks. Recently, phase synchronization of a group of oscillators, which is referred to as global or multivariate phase synchronization, has been of interest for understanding the group dynamics and characteristic behavior of complex networks [33, 34]. Contrary to the bivariate phase synchrony, which is limited to pairwise relationships, multivariate synchrony captures the global synchronization patterns quantifying the degree of interactions within a group of oscillators. The current application of bivariate measures to multivariate data sets with N time series results in an N × N matrix of bivariate indices, which leads to a large amount of mostly redundant information. Therefore, it is necessary to reduce the complexity of the data set in such a way to reveal the relevant underlying structures using multivariate analysis methods. Recently, different multivariate analysis tools have been proposed to define multivariate phase synchronization. The basic approach used for multivariate phase synchronization is to trace the observed pairwise correspondences back to a smaller set of direct interactions using approaches such as partial coherence adapted to phase synchronization [32]. Another complementary way to achieve such a reduction is cluster analysis, a separation of the parts of the system into different groups, such that the signal interdependencies within each group tend to be stronger than in between groups [42, 43]. Allefeld and colleagues have proposed two complementary approaches to identify synchronization clusters and applied their methods to EEG data [48, 49, 111, 112]. In [48], Allefeld et al. have proposed a mean-field approach to analyze EEG data, where each signal is contributing to a single cluster to a different extent [48]. The existence of a single synchronization cluster is not a reasonable assumption since the underlying clustering structure of brain networks, which usually con82 sist of multiple clusters, cannot be inferred. This method has the disadvantage of assuming a single cluster and thus cannot identify the underlying clustering structure. In [49], an approach that addresses the limitation of the single cluster approach has been introduced using methods from random matrix theory. This method is based on the eigenvalue decomposition of the pairwise bivariate synchronization matrix and appears to allow identification of multiple clusters. Each eigenvalue greater than 1 is associated with a synchronization cluster and quantifies its strength within the data set. The internal structure of each cluster is described by the corresponding eigenvector. Combining the eigenvalues and the eigenvectors, one can define a participation index for each oscillator and its contribution to different clusters. This method assumes that the synchrony between systems belonging to different clusters, i.e. between-cluster synchronization, is equal to zero and requires an adjustment for proper computation of the participation indices in the case that there is between-cluster synchronization. Despite the usefulness of eigenvalue decomposition for the purposes of cluster identification, it has recently been shown that there are important special cases, clusters of similar strength that are slightly synchronized with each other, where the assumed one-toone correspondence between eigenvectors and clusters is not realistic [50]. Other alternative measures that quantify multivariate relationships include the directed transfer function and Granger causality defined for an arbitrary number of channels [113, 114]. Both of these methods have been applied to study interdependencies and causal relationships, however, are limited to stationary processes and linear dependencies. The contribution of this chapter is two fold. First, we extend the notion of bivariate synchrony to multivariate synchronization by employing measures of multivariate correlation and complexity from statistics to quantify the synchronization within and across groups of signals rather than between pairs. The proposed measures will depend on quantities such 83 as multiple correlation and R2 and will be redefined in the context of phase synchrony. In particular, Rv , a measure of association for multivariate data sets introduced in [115] will be used to quantify the degree of association or synchronization between groups of variables. Rv is a particularly attractive measure for quantifying the similarity between groups of variables since it has been shown to be a unifying metric that when maximized, with relevant constraints, yields the solutions to different linear multivariate methods including principal component analysis, canonical correlation analysis and multivariate linear regression [115]. We also exploit a global complexity measure based on the spectral decomposition of the bivariate synchronization matrix similar to the S measure defined in [33], in order to complement the findings of Rv by quantifying the synchronization within a network. However, these approaches depend on the estimation of bivariate phase synchrony values and offer an indirect and a limited way to estimate the multivariate synchrony within a network. Therefore, the second contribution of this chapter is that we propose a novel and direct method, which will be referred to as hyperspherical phase synchrony (HPS), to compute the multivariate phase synchronization within a group of oscillators. The proposed method is is based on extending the definition of phase synchrony from the two-dimensional space to an N-dimensional space by employing a uniform angular sampling of a unit sphere in an N-dimensional hyperspherical coordinate system. This approach offers a new definition of multivariate phase synchronization since the network dynamics are characterized using the whole ensemble of phase differences in a multidimensional space rather than considering the pairwise relationships. Hence, the proposed methods in this Chapter will be useful for applications such as EEG signals where the synchronization within or across regions is more important than individual pairwise synchrony to characterize the functional networks. 84 4.2 Bivariate Synchrony Dependent Multivariate Synchrony Measures 4.2.1 Measures of Multivariate Synchronization In the proposed work, we will develop measures of association based on bivariate phase synchrony in an attempt to capture multivariate synchronization effects. One measure of interest is R2 like measure of association proposed by Robert and Escoufier [115], which is a multivariate generalization of Pearson correlation coefficient, defined as: tr(Rxy Ryx ) Rv = √ tr(R2 )tr(R2 ) xx yy (4.1) where x and y refer to groups of variables, Rxy , Ryx , Rxx , Ryy are the auto- and crosscorrelation matrices between the variables and Rv quantifies the association between the variables x1 , x2 , . . . , xq and y1 , y2 , . . . , yp . This measure has been shown to be equivalent to a distance measure between normalized covariance matrices and is always between 0 and 1. The numerator corresponds to a scalar product between positive semi-definite matrices, the denominator is the Frobenius matrix scalar product [116] and Rv is equivalent to the cosine between the covariances of the two data matrices. The closer to 1 it is, the better is y as a substitute for x. It has been shown that the major approaches within statistical multivariate data analysis, such as principal component analysis, canonical correlation and multivariate regression, can all be brought into a common framework in which the Rv coefficient is maximized subject to relevant constraints [115]. In the case of multivariate synchronization, the matrices Rxy , Ryx , Rxx , Ryy are formed by computing the pairwise bivariate phase 85 synchrony across different groups of variables and within each group, respectively. For a network consisting of N nodes, a second closely related measure that will be adapted for multivariate synchronization is the S-estimator [33], which exploits the eigenvalue spectrum of the N × N bivariate synchronization matrix to quantify the amount of synchronization within a group of oscillators: ∑N S =1+ i=1 λi log(λi ) log(N ) (4.2) where λi s are the N normalized eigenvalues. This measure is an information theoretic inspired measure since it is complement to the entropy of the normalized eigenvalues of the correlation matrix. The more disperse the eigenspectrum is the higher the entropy would be. In this chapter, this estimator will be applied to the bivariate synchronization matrix instead of the correlation matrix. If all of the oscillations in a group are completely synchronized, i.e. the entries of the pairwise synchrony matrix are all equal to 1, then all of the eigenvalues except one will be equal to zero and the value of S will be equal to 1 indicating perfect multivariate synchrony. This measure can quantify the amount of synchronization within a group of signals and thus is useful as a global complexity measure. 4.2.2 Proposed Approach Let N be the number of oscillators or channels in a system. The proposed multivariate phase synchronization measures can be computed from data as follows: 1. Compute the RID-Rihaczek distribution and the corresponding phase spectrum for the N oscillators, xi (tm ), m = 0, 1, . . . , M − 1, using Eq. (2.13). 86 For each oscillator xi : i 2. From the RID-Rihaczek distribution, find the frequency, ωmax , having the maximum i energy and extract the time-varying phase, Φi (t, ωmax ), from the phase spectrum. 3. Compute the pairwise synchrony (bivariate synchrony) between ith and jth oscillators, γi,j : γi,j = M −1 1 ∑ j i exp(j(Φi (tm , ωmax )) − Φj (tm , ωmax ))) . M (4.3) m=0 4. Form the bivariate phase synchrony matrix R as   γ1,2  1    γ2,1 1  R=  . . .  . .  .  γN,1 γN,2 . . . γ1,N ... γ2,N . . . . ...      .     (4.4) 1 .. 5. Using R, compute S for the whole network by finding the normalized eigenvalues of R and computing the expression given by equation 4.2. 6. The measure Rv quantifies the degree of association between two oscillator groups and can be computed for any groups of oscillators from the nextwork. For example, consider two oscillator groups, x and y formed by oscillators {1, 2 . . . , N ′ } and {N ′ + 1, . . . , N }, respectively. Rv between these two groups can be computed using equation 4.1 with 87 matrices Rxx , Ryy , Rxy , Ryx computed as follows:        Rxx =            Ryy =       1 γ1,2 . . . .. . . . . γ1,N ′ .. . γ2,N ′ . . . .. . .. 1 . . . . . . ... γN ′ ,1 γN ′ ,2 .      ,     1  γN ′ +1,N ′ +2 . . . γN ′ +1,N ... ... γN ′ +2,N . . . .. γN,N ′ +1 γN,N ′ +2 . .. . . . . .. . 1   γ1,N ′ +1 γ1,N ′ +2 . . . γ1,N   . .. .. .  . . γ2,N .  Rxy =   . . ... ... . .  . .   . γN ′ ,N ′ +1 γN ′ ,N ′ +2 . . γN ′ ,N   γN ′ +1,1 γN ′ +1,2 . . . γN ′ +1,N ′   . .. .. .  . . γN ′ +2,N ′ .  Ryx =   . . .. .. . .  . . . .   ... γ γ γ ′ N,1 N,2      ,          ,           .     (4.5) N,N The procedure described above can be extended to the case of computing synchrony across realizations instead of across time. This modification would require the computation of RIDRihaczek distribution for each realization and computing the phase coherence by taking an average over realizations instead of time. 88 4.2.3 Simulation Results 4.2.3.1 Rossler Oscillator Model In this chapter, in order to evaluate the performance of the proposed multivariate measures, a well known model of nonlinear oscillators, called Rossler oscillators, is used. These chaotic oscillators, investigated by [14, 117], form a system that is known to have characteristic phase synchronization properties and to exhibit clusters of phase synchronization depending on the coupling strengths within the system. The model consists of a network of multivariate time series coupled in a way to form synchronization clusters of different size as well as desynchronized oscillators. The networks considered in this chapter consist of N = 6 Rossler oscillators which are coupled diffusively via their z-components: xj = 10(yj − xj ), ˙ yj = 28xj − yj − xj zj , ˙ zj = −8/3zj + xj yj + ˙ N ∑ ϵij (zi − zj ). (4.6) i=1 The coupling coefficients, ϵij , are chosen from the interval [0, 1] to construct different networks. The differential equations are numerically integrated using the Runge-Kutta method with a time step of ∆t = 1/25 sec corresponding to a sampling frequency of 25 Hz, where the initial conditions are randomly chosen from the interval, [0, 100]. The first 2500 samples are discarded to eliminate the initial transients. In the remaining of this chapter, the z-components of the Rossler oscillators are used for the procedure described in section 4.2.2. 89 4.2.3.2 Performance of Multivariate Synchrony Measures In Quantifying Within and Between Cluster Synchrony In this example, the dependency of the multivariate synchrony measures on the coupling strengths in a Rossler network is evaluated using 500 samples. The network in Fig. 4.1 is formed and the coupling strengths ρ1 = ϵ1,4 = ϵ4,1 and ρ2 = ϵ2,6 = ϵ6,2 are increased from 0 to 1 in steps of 0.2, with ϵ1,2 = ϵ2,1 = ϵ1,3 = ϵ3,1 = ϵ2,3 = ϵ3,2 = 1, ϵ4,5 = ϵ5,4 = ϵ4,6 = ϵ6,4 = ϵ5,6 = ϵ6,5 = 1 and all other coupling strengths set to zero. 1 6 2 U 2 U 1 5 3 4 Figure 4.1: Rossler network for evaluating the dependency of the multivariate synchrony measures on the coupling strengths. The coupling strengths ρ1 and ρ2 are increased from 0 to 1 in steps of 0.2. Figs. 4.2 and 4.3 show the dependency of the Sg , Sy , ST and Rv on the coupling strengths ρ1 and ρ2 , in the absence of noise. When both ρ1 and ρ2 are equal to zero, Sg and Sy have the highest values, which is equivalent to the network in Fig. 4.4(a). In this case, there are two completely separate clusters and each cluster has the maximum phase synchrony, with no between-cluster synchrony. This result is expected since the S values represent the within-cluster phase synchrony and increasing the coupling coefficients ρ1 and 90 Dependency of the S on the coupling strengths g 0 0.9 0.2 0.8 ρ 1 0.4 0.7 0.6 0.6 0.8 0.5 1 0 0.2 0.4 0.6 ρ 0.8 1 2 (a) Sg Dependency of the Sy on the coupling strengths 0 0.9 0.8 0.4 0.7 ρ 1 0.2 0.6 0.6 0.8 0.5 1 0 0.2 0.4 ρ2 0.6 0.8 1 (b) Sy Figure 4.2: Dependency of the multivariate synchrony measures, Sg and Sy , on the coupling strengths ρ1 and ρ2 ρ2 synchronizes the two of the oscillators from each cluster to the other two oscillators in the other cluster, which destroys the within-cluster phase synchrony. Thus, maximum withincluster synchrony is achieved when ρ1 = 0 and ρ2 = 0 and increasing either or both ρ1 and ρ2 results in the reduced within-cluster phase synchrony values, shown by Figs. 4.2(a) and 91 Dependency of the S on the coupling strengths T 0.6 0 0.5 0.4 ρ 1 0.2 0.6 0.4 0.8 1 0.3 0 0.2 0.4 ρ2 0.6 0.8 1 (a) ST Dependency of the Rv on the coupling strengths 0 0.4 0.4 0.3 ρ 1 0.2 0.6 0.2 0.8 0.1 1 0 0.2 0.4 ρ2 0.6 0.8 1 (b) Rv Figure 4.3: Dependency of the multivariate synchrony measures, ST and Rv , on the coupling strengths ρ1 and ρ2 4.2(b). ST , which shows the within cluster synchrony for the whole network, has the maximum value when ρ1 and ρ2 are both equal to 1. This is also an expected result since these two coupling strengths try to synchronize the two clusters with each other. Reduction in either or both of ρ1 and ρ2 results in a reduced ST value which is shown in Fig. 4.3(a). 92 Fig. 4.3(b) shows that Rv , which represents the between-cluster synchrony, is directly proportional to ρ1 and ρ2 . An increase in only one of the coupling strengths is not enough to increase Rv . However, when both of these coupling strengths increase, Rv also increases and reaches its maximum value when ρ1 = 1 and ρ2 = 1. This is an expected result since both ρ1 and ρ2 are responsible for the increased between-cluster synchrony. Moreover, by looking at Figs. 4.3(a) and 4.3(b), one can say that there is a strong positive correlation between ST and Rv . The reason for this is that Rv represents the between-cluster synchrony and ST represents the synchrony of the whole network, both of which increase with the increasing coupling strengths, ρ1 and ρ2 . In order to evaluate the performance of the S and Rv -estimators in estimating the withincluster and between-cluster synchrony in detail, 12 different Rossler networks, shown in Fig. 4.4, consisting of 6 oscillators are generated. Each connection represents two symmetric coupling strengths, equal to 1, between two oscillators. For each network, 200 simulations are performed with additive white Gaussian noise at a SNR value of 5 dB. Table 4.1: Means and standard deviations of Rv , Sg , Sy and ST for the networks in Fig. 4.4 Rv Sg Sy ST Network 1 0.1834 ± 0.1482 0.7844 ± 0.0431 0.7921 ± 0.0728 0.5682 ± 0.0681 Network 2 0.1209 ± 0.0870 0.4005 ± 0.2133 0.4882 ± 0.2379 0.3293 ± 0.1241 Network 3 0.1689 ± 0.1198 0.6135 ± 0.1246 0.5983 ± 0.0972 0.4339 ± 0.0763 Network 4 0.6451 ± 0.2255 0.6692 ± 0.1352 0.6832 ± 0.1354 0.6239 ± 0.1392 Network 5 0.9671 ± 0.0239 0.7789 ± 0.0433 0.7831 ± 0.0594 0.8203 ± 0.0387 Network 6 0.7642 ± 0.2923 0.4340 ± 0.2249 0.4462 ± 0.2103 0.5253 ± 0.2251 Network 7 0.6824 ± 0.2261 0.4849 ± 0.1448 0.4976 ± 0.1927 0.5379 ± 0.1466 Network 8 0.2452 ± 0.1556 0.3358 ± 0.1809 0.3144 ± 0.1771 0.2848 ± 0.1430 Network 9 0.4475 ± 0.1228 0.4202 ± 0.0783 0.4317 ± 0.0779 0.4358 ± 0.0982 Network 10 0.7871 ± 0.0562 0.4156 ± 0.0946 0.4141 ± 0.0862 0.5411 ± 0.0632 Network 11 0.3378 ± 0.1185 0.4168 ± 0.0669 0.4146 ± 0.0728 0.4096 ± 0.0582 Network 12 0.4234 ± 0.1334 0.4174 ± 0.0724 0.1754 ± 0.0822 0.3358 ± 0.0423 93 (a) Network 1 (b) Network 2 (c) Network 3 (d) Network 4 (e) Network 5 (f) Network 6 (g) Network 7 (h) Network 8 (i) Network 9 (j) Network 10 (k) Network 11 (l) Network 12 Figure 4.4: 12 different Rossler networks for evaluating the performance of the S and Rv estimators in estimating the within-cluster and between-cluster synchrony. Each node represents an oscillator and each connection represents the two symmetric coupling strengths, which are equal to 1, between two oscillators. 94 Table 4.1 shows the mean and standard deviation values for all 12 networks. Networks 1 and 5 have the largest Sg and Sy values, which indicates that the within-cluster phase synchrony is very strong for these networks. Strong within-cluster synchrony, or high S value, is usually obtained when all possible within-cluster connections exist and the betweencluster connections are either very strong or non-existent. Networks 1 and 5 satisfy these conditions with network 1 having no between-cluster connections and network 5 having strong connections between the two clusters. On the other hand, networks 3 and 4 have all possible within-cluster connections but they have smaller Sg and Sy values compared to network 5 since the small number of between-cluster connections are not adequate to synchronize the two clusters and are also disruptive to the within-cluster synchrony. The largest Rv value, or between-cluster synchrony, is observed for network 5 which results from the three connections between the two clusters, forcing the two clusters to be highly synchronized with each other. Networks 6 and 10 also have a large number of connections between the two clusters but they lack some of the within-cluster connections, which results in reduced Rv values for these networks. Network 6 has a larger Rv value compared to network 7 but has smaller S values since there are 3 connections between the clusters. Networks 8, 9, 11 and 12 all have small Rv and S values since the within-cluster and between-cluster synchronies are both weak due to the lack of connectivity in these networks. ST , on the other hand, measures the within cluster synchronization when all six oscillators are assumed to form a big cluster. Network 5 has the largest ST value because there is one big cluster which is formed by multiple smaller clusters with strong within-cluster connectivity. Since the connectivity in the whole network is strong, the eigenvalues of the synchrony matrix tend to be better concentrated, which results in a low entropy value and a high ST value. On the other hand, networks 2 and 8 have small ST values since the within network connectivity 95 is not strong. 4.2.3.3 Significance Testing For the Multivariate Synchrony Measures Determining the statistical significance involves hypothesis testing and requires the formation of a null hypothesis. In some cases, it may be possible to derive analytically the distribution of the given measure under a given null hypothesis. However, in the case of the multivariate synchronization measures this proves to be a very difficult problem, therefore this distribution is estimated by direct Monte Carlo simulations. For this purpose, an ensemble of surrogate data sets are generated [118]. The surrogate data set is generated by first computing the Fourier transform of the data and then randomizing the phase. Finally, the inverse Fourier transform is taken to obtain the surrogate data which has the same power spectrum and autocorrelation function as the original data. This operation preserves the amplitude relationships while randomizing the phase dependencies. For each surrogate data set, the procedure described in section 4.2.2 is followed to compute the multivariate measures. From this ensemble of statistics, the distribution is approximated. A robust way to define significance would be directly in terms of the p-values with rank statistics. For example, if the observed time series has a Rv or S value that is in the lower one percentile of all the surrogate statistics, then a p-value of p=0.01 could be quoted. For the networks in Figs. 4.2 and 4.3, 100 surrogate data sets are formed and all multivariate synchrony measures, Rv , Sg , Sy and ST , are found to be significant at p = 0.05. The surrogate testing demonstrates that our simulation results are not likely to occur by chance. 96 4.2.4 Application to EEG Data The proposed multivariate phase synchronization measures were applied to a set of EEG data containing the error-related negativity (ERN). The ERN is an event-related brain potential that occurs following performance errors in a speeded reaction time task [109, 119] and is observed as a sharp negative trend in EEG recordings which typically peaks from 75-80 ms after the error response. Previous work indicates that there is increased phase synchrony associated with ERN for the theta frequency band (4-8 Hz) and ERN time window (2575 ms) between frontal and central electrodes versus central and parietal [120, 121]. EEG data from 63-channels was collected in accordance with the 10/20 system on a Neuroscan Synamps2 system (Neuroscan, Inc.) 1 . A speeded-response flanker task was employed, and response-locked averages were computed for each subject. In this chapter, we analyzed data from 11 subjects corresponding to the error responses from five electrodes corresponding to the areas of interest, i.e., two frontal electrodes (F3 and F4), one central electrode (FCz) and two parietal electrodes (CP3 and CP4). All five electrodes were analyzed by RID-Rihaczek distribution and the phases in the theta frequency band are extracted to compute the pairwise synchrony values. After the 5 × 5 bivariate synchronization matrix was formed, we computed S and Rv values considering two groups of electrodes, F3, F4 and FCz and CP3, CP4 and FCz. For all 11 subjects, S value of the electrode group, F3-F4-FCz, is larger than the S value of the group, CP3-CP4-FCz. A Wilcoxon rank sum test is used at %5 significance level to test the null hypothesis that the S values of the group F3-F4-FCz and the S values of the group CP3-CP4-FCz are independent samples from identical distributions with equal 1 We would like to acknowledge Dr. Edward Bernat from Florida State University for sharing his EEG data with us. 97 medians, against the alternative that they do not have equal medians. The p-value provided by the test is 0.0187, which is less than 0.05. Thus, the null hypothesis is rejected at 5% significance level, which shows that the S values of the group F3-F4-FCz are significantly larger than the S values of the group CP3-CP4-FCz. This result indicates that the frontal (F3 and F4) electrodes are more strongly coupled to the central electrode (FCz), compared to the coupling between the parietal (CP3 and CP4) and central electrodes, which is shown by the significantly larger within-cluster synchrony values. Moreover, Rv value between the central-frontal and Rv value between the central-parietal electrode groups are computed. Rv value between the central-frontal electrodes is larger for 10 subjects out of 11. Using the Wilcoxon rank sum test, the null hypothesis is rejected with p-value = 0.0122, which demonstrates that the synchrony between the frontal (F3 and F4) electrodes and central electrode (FCz) is larger compared to the synchrony between the parietal (CP3 and CP4) and central electrodes, which is shown by the significantly larger between-cluster synchrony. These results are consistent with the previous work in [122]. 4.3 Hyperspherical Phase Synchrony Bivariate phase synchrony is based on the circular variance of the two-dimensional direction vectors on a unit circle (1-sphere), obtained by mapping the phase differences, {Φk (t, ω)}k=1,...,L , between the two time-series onto a Cartesian coordinate system. If 1,2 the circular variance of these direction vectors is low, the time-series are said to be locked to each other. In this chapter, we propose an extension of this idea to the multivariate case and k k k define {θ1 (t, ω), θ2 (t, ω), . . . , θN −1 (t, ω)} as the (N − 1) angular coordinates at time t and k frequency ω for the k th trial, where θi (t, ω) = Φk (t, ω) − Φk (t, ω) is the phase difference i i+1 98 between the ith and (i + 1)th time series within a group of N oscillators2 . We map these (N − 1) angular coordinates onto an N -dimensional space by forming direction vectors in an N -dimensional hyperspherical coordinate system. The resulting direction vectors cover the whole (N − 1)-sphere, which is a generalization of an ordinary sphere to an arbitrary dimension3 . For any natural number N , an N − 1-sphere of radius r is defined as the set of points in (N )-dimensional Euclidean space which are at distance r from a central point, where the radius r may be any positive real number. The set of coordinates in N -dimensional space, γ1 , γ2 , . . . , γN , that define an (N − 1)-sphere is represented by: r2 = N ∑ (γi − ci )2 (4.7) i=1 where c = [c1 , . . . , cN ] is the center point and r is the radius. In this Chapter, r = 1 and the center point is the origin. Fig. 4.5 shows an example of a 2-sphere where the 3-dimensional direction vectors are shown by the line crossings. k k Using the N − 1 angular coordinates, {θ1 (t, ω), . . . , θN −1 (t, ω)}, we define the set of N Cartesian coordinates on a unit N − 1 sphere which forms a direction vector, Γk (t, ω) = 2 Note that the sequence of the phases when computing the angular coordinates does not have any effect on the circular variance of the resulting direction vectors. 3 To generate a suitable set of direction vectors, unit hyperspheres are sampled based on uniform angular sampling methods. 99 z 1 0 −1 1 1 0 y −1 −1 0 x Figure 4.5: Line crossings, such as the black dots, indicate the sampled 3-dimensional direction vectors based on uniform angular sampling of a 2-sphere. k k [γ1 (t, ω), . . . , γN (t, ω)], as: k k γ1 (t, ω) = cos (θ1 (t, ω)), k k k γ2 (t, ω) = sin (θ1 (t, ω)) × cos (θ2 (t, ω)), k k k k γ3 (t, ω) = sin (θ1 (t, ω)) × sin (θ2 (t, ω)) × cos (θ3 (t, ω)), . . . k k k k γN −1 (t, ω) = sin (θ1 (t, ω)) × · · · × sin (θN −2 (t, ω)) × cos (θN −1 (t, ω)), k k k k γN (t, ω) = sin (θ1 (t, ω)) × · · · × sin (θN −2 (t, ω)) × sin (θN −1 (t, ω)), (4.8) Therefore, for N signals, we define the hyperspherical phase synchrony (HPS) as: L 1 ∑ k Γ (t, ω) HP S(t, ω) = L k=1 (4.9) 2 where HP S(t, ω) is the multivariate synchronization value at time t and frequency ω, ∥.∥2 is 100 the l2 norm and L is the number of trials. Note that HPS is equivalent to PLV for a network consisting of two signals. In the case of perfect multivariate phase synchronization of the network, HPS is equal to 1 and equals to zero when the oscillators are independent. The proposed hyperspherical phase synchrony measure is calculated through the com2 putation of N Cartesian coordinates, which requires N −N −2 multiplications, and L2 norm 2 of an N -dimensional vector which requires N multiplications. Therefore, the computational complexity of the proposed measure is O(N 2 ). On the other hand, S-estimator is obtained ( ) by the computation of N bivariate synchrony values, which has O(N 2 ) complexity, and 2 the computation of eigenvalues which requires the eigenvalue decomposition of the bivariate synchrony matrix, using one of the many iterative algorithms which utilize a large class of fast recursive matrix multiplication algorithms, such as the ‘Rayleigh quotient iteration algorithm’ [123] and the QR algorithm. It has been shown that the computational complexity of the eigenvalue problem for a N × N matrix is O(N 3 ) [124]. Therefore, the proposed measure is computationally more efficient than the S-estimator especially for large group of oscillators. 4.3.1 Simulation Results: Robustness of HPS to Noise The robustness of the hyperspherical phase synchrony measure in noise is evaluated and compared to the performance of the S-estimator in quantifying multivariate synchrony. In order to evaluate the robustness in noise, a group of oscillators, {x1 (t), . . . , x16 (t)}, consisting of highly synchronized sinusoidal signals having constant phase differences is considered: ( (i − 1) xi (t) = cos 2π40t + π 16 101 ) + ni (t), i = 1, 2, . . . , 16 (4.10) where ni s are independent white Gaussian noise processes at different variance levels, with signal-to-noise ratio (SNR) varying in the range of −30 dB to 30 dB, in steps of 2 dB. 200 simulations of the group of oscillators for 1024 time points are considered to evaluate the distribution of the synchrony values. RID-Rihaczek distribution introduced in Chapter 2 is employed to estimate time and frequency dependent phase estimates, {Φ1 (t, ω), . . . , ϕ16 (t, ω)}. Ideally, the multivariate synchrony value for a group of signals with constant phase shifts is expected to be equal to 1. Fig. 4.6 illustrates the performance of the hyperspherical phase synchrony and S-estimator in estimating the true multivariate phase synchrony for different SNR values. The average multivariate synchrony values over time are plotted with increasing SNR values and the bars indicate the standard deviations. From Fig. 4.6, for all SNR values, one can see that the multivariate phase synchrony provided by the hyperspherical phase synchrony measure is always larger than the one given by the S-estimator, especially for low SNR values. Furthermore, using the Wilcoxon rank sum test, the null hypothesis that the multivariate synchrony values come from identical distributions with equal medians is rejected at %5 significance level for all of the SNR values. Hence, the multivariate synchrony values from hyperspherical phase synchrony measure are significantly larger than the ones provided by the S-estimator. Therefore, HPS is more robust to noise and outperforms the S-estimator in quantifying the multivariate phase synchrony. 4.3.2 Application to EEG Data 4.3.2.1 EEG Data In order to evaluate the performance of the proposed measure in quantifying the multivariate synchronization across different brain regions, we use a set of EEG data containing the 102 Robustness to Noise Multivariate Synchrony 1 0.8 Hyperspherical Synchrony S−Estimator 0.6 0.4 0.2 0 −40 −30 −20 −10 0 10 SNR (dB) 20 30 40 Figure 4.6: Comparison of noise robustness: Performances of the hyperspherical phase synchrony and S-estimator in estimating the true multivariate phase synchrony within a group of oscillators consisting of highly synchronized sinusoidal signals having constant phase differences for different SNR values, bars indicate the standard deviations. ERN. Cavanagh et al. have shown that lateral prefrontal cortex (lPFC) activity is phasesynchronous with medial-frontal theta, supporting the idea that medial prefrontal (mPFC) and lPFC regions are functionally integrated during error processing [122]. Recent work has also supported the hypothesis that mPFC regions are highly integrated with other prefrontal areas during control processing [121]. Therefore, in this Chapter, application of the proposed measure to EEG data is based on the hypothesis that the medial-frontal region will play a central functional role during the ERN, and will have significant integration with frontal areas within the theta frequency band. Therefore, multivariate phase synchronization is expected to be higher for frontal and central electrode group compared to parietal and central one. In this Chapter, we analyzed 62-channel EEG data collected from 32 subjects corresponding to the error responses4 . Before applying the proposed measure, all EEG epochs were 4 We would like to acknowledge Dr. Jason Moser from Michigan State University for 103 converted to current source density (CSD) to accentuate local activity and distal activity (e.g. volume conduction), using published methods [125, 126]. Theta Band: Frequency (Hz) HPS for Frontal−Central Electrodes (F1, F2, F3, F4, FCz) 0.5 8 0.45 7 0.4 6 0.35 5 0.3 0.25 4 0.2 10 20 30 40 50 60 70 80 90 100 ERN Interval: Time (ms) (a) HPS: Frontal-Central Theta Band: Frequency (Hz) HPS for Parietal−Central Electrodes (CP1, CP2, CP3, CP4, FCz) 0.5 8 0.45 7 0.4 6 0.35 5 0.3 0.25 4 0.2 10 20 30 40 50 60 70 80 90 100 ERN Interval: Time (ms) (b) HPS: Parietal-Central Figure 4.7: The mean HPS values computed over all subjects at each time and frequency point within the ERN interval and theta band for the two electrode groups. sharing his EEG data with us. 104 Theta Band: Frequency (Hz) S−estimator for Frontal−Central Electrodes (F1, F2, F3, F4, FCz) 0.35 8 7 0.3 6 0.25 5 0.2 4 10 20 30 40 50 60 70 80 90 ERN Interval: Time (ms) (a) S-value: Frontal-Central 100 Theta Band: Frequency (Hz) S−estimator for Parietal−Central Electrodes (CP1, CP2, CP3, CP4, FCz) 0.35 8 7 0.3 6 0.25 5 0.2 4 10 20 30 40 50 60 70 80 90 ERN Interval: Time (ms) (b) S-value: Parietal-Central 100 Figure 4.8: The mean S-values computed over all subjects at each time and frequency point within the ERN interval and theta band for the two electrode groups. 4.3.2.2 Results The proposed hyperspherical multivariate synchrony measure is applied to both the frontal (F1, F2, F3, F4)-central (FCz) electrode group and the parietal (CP1, CP2, CP3, CP4)central (FCz) group. For each subject, we focused on the ERN interval, [ta , tb ], and theta 105 frequency band, [ωa , ωb ], and compared the mean HPS value, 1 HPS = T ×Ω ∑tb t=ta ∑ωb ω=ωa HPS(t, ω), where T and Ω are the total number of time and frequency bins, respectively, to identify if the frontal-central group has stronger multivariate synchronization compared to the parietal-central group. For all subjects, frontal-central electrode group resulted in significantly larger HPS values compared to the parietal-central group using a Welch’s t-test at 1% significance level. This result is consistent with previously observed interactions in the theta band between medial prefrontal cortex (mPFC) and lateral prefrontal cortex (lPFC) during error-related negativity [122]. Figs. 4.7(a) and 4.7(b) show the mean HPS values computed over all subjects at each time and frequency point within the ERN interval and theta band for the two electrode groups. Moreover, we compared the performance of HPS with the S-estimator in discriminating between the multivariate synchronization of the two groups. Figs. 4.8(a) and 4.8(b) show the mean S-values computed over all subjects. We found that the HPS values for frontal-central group in Fig. 4.7(a) are significantly larger compared to parietal-central group in Fig. 4.7(b) with p < 0.01. On the other hand, the S-values in Fig. 4.8(a) are significantly larger compared to parietal-central group with p < 0.05. Therefore, the proposed measure yields more significant differences and outperforms S-estimator in discriminating between the multivariate synchronization of the two groups. We also compare our proposed measure with S-estimator for detecting significant multivariate synchronization for the frontal-central electrodes using an ROC curve. ROC curves were developed in the context of signal detection [127], and have been widely used for EEG analysis purposes in neuroscience [128, 129]. ROC curves describe the full trade-off between detection and false alarm rates and allow one to compare two methods over the complete spectrum of operating conditions. Each point on each curve corresponds to a different thresh106 old for signaling an alarm. In general, if one curve lies entirely above another curve, then the corresponding method is considered to be superior in that it handles the trade-off between missed detections and false alarms better. In this Chapter, for each subject, a true positive (detection) is determined when the mean multivariate synchrony value within the ERN interval and theta band for the frontal-central electrodes is larger than the threshold, whereas a false alarm is defined when the mean multivariate synchronization for the parietal-central group is larger than the threshold. Fig. 4.9 shows the ROC curves for HPS and S-estimator. One can clearly see that HPS performs better than the S-estimator in detecting frontal-central multivariate synchronization since the area under the ROC curve of HPS is much greater compared to the ROC curve of S-estimator. Including more subjects in the current EEG data analysis could improve the performance of our proposed approach and enable the ROC curves to approach to the optimal upper left corner. ROC Curve for Comparison of HPS and S−estimator Probability of Detection 1 0.8 0.6 0.4 HPS S−estimator 0.2 0 0 0.2 0.4 0.6 0.8 Probability of False Alarm Figure 4.9: ROC curves for HPS and S-estimator 107 1 4.4 Conclusions In this chapter, we first introduced an approach consisting of two complementary measures, S-estimator and Rv , for quantifying multivariate phase synchronization within a group of oscillators as well as between groups. The proposed approach is based on the application of RID-Rihaczek distribution for extracting time and frequency dependent phase information, and adapting measures of correlation from statistics to multivariate analysis. The proposed approach extends the current state of the art phase synchrony analysis from quantifying bivariate relationships to multivariate ones. Second, we proposed a novel and direct method, called hyperspherical phase synchrony, to compute the multivariate phase synchrony within a group of oscillators coupled through either direct or indirect relationships, such as the global coupling across different brain sites. Simulation results and application to real EEG data show the effectiveness and superiority of hyperspherical phase synchrony compared to the existing methods. This shift from pairwise bivariate synchrony analysis to multivariate analysis offers advantages especially for complex system analysis such as the brain where the bivariate relationships do not always reflect the underlying network structure. Future work will focus on the extension of S-estimator and Rv measures using different multivariate analysis techniques such as cluster analysis and canonical correlation to obtain a more detailed understanding of the synchronization networks as well as the application of these measures to a large number of signals as is the case with EEG. For the hyperspherical phase synchrony measure, we will concentrate on exploring different sampling point-sets such that the resulting direction vectors are distributed uniformly on the N -sphere since the set of direction vectors based on uniform sampling in the angular coordinate system results in nonuniformly distributed direction vectors as shown in Fig. 4.5. In the current application 108 to EEG data, one limitation of the proposed measures is that the groups of oscillators to be analyzed have to be identified a priori with known synchronization patterns and an exhaustive search to find synchronization clusters would be computationally complex. Therefore, it would be valuable to first use preprocessing methods, such as eigenvalue decomposition or measures of association and complexity, which can help us to discover the underlying networks. Furthermore, we will extend the application of the proposed measure to compare the dynamic nature of functional brain networks for error and correct responses to get a more complete understanding of cognitive control. 109 Chapter 5 Time-Varying Graph Analysis for Dynamic Brain Network Identification 5.1 Introduction In recent years, there has been a growing need for quantifying multivariate relationships across multiple brain regions to provide an understanding of the collective behavior of functional brain connectivity. A popular approach to look at the multivariate synchronization patterns has been through the use of a new, multidisciplinary approach to the study of complex systems based on graph theory. Graph theory provides a way to capture the topology of the functional networks and to quantify the multivariate relationships among neuronal activations across brain regions [130]. It also suggests models for functional brain networks which may allow us to better understand the relation between network structure and the processes taking place on these networks. One such model is the ’small-world’ network introduced by Watts and Strogatz [131], that demonstrates both clustered (’cliquish’) inter-connectivity within groups of nodes (like regular lattices) and a short path length between any two nodes (like random graphs). This is an attractive configuration for the functional architecture of the brain, because small-world networks are known to optimize information transfer, increase the rate of learning, and support both segregated and distributed information processing. Recently, there have been multiple functional network studies using graph theory based on 110 fMRI [132, 133], EEG [134, 135], and MEG data [136, 137] which have shown small-world patterns in functional networks of healthy subjects. Several studies have also shown how brain pathology, such as schizophrenia and Alzheimer’s diseases, may interfere with the normal small-world architecture [138, 139, 140, 136, 135]. Furthermore, the development of neurobiologically meaningful and easily computable measures, such as graph theory based clustering coefficient and characteristic path length, have been shown to reliably characterize functional brain connectivity [141, 139, 140, 136, 142]. These measures also offer a simple way to compare functional network topologies between subject populations and reveal presumed connectivity abnormalities in neurological and psychiatric disorders [135, 143]. Currently, topological features of functional brain networks such as clustering coefficient, path length, small world parameter [39], modularity, global and local efficiency are defined over long periods of time, thus focusing on static networks and neglecting possible time-varying properties of the topologies [36, 37, 38]. However, evidence suggests that the emergence of a unified neural process is mediated by the continuous formation and destruction of functional links over multiple time scales [39]. Hence, a single graph is not sufficient to represent the communication patterns of the brain and can be considered as an unreliable snapshot of functional connectivity. In recent years, there has been an interest in characterizing the dynamic evolution of networks. Most of the existing approaches to dynamic network analysis are either graph theory based such as direct extensions of component finding [51, 52, 53] and community detection [54] from the static to the dynamic case, or are feature based where features extracted from each graph in the time series are used to form time-varying graph metrics [55, 56]. The analysis of time-varying features of the functional connectivity reveals that the processing of a stimulus involves optimized functional integration of distant brain regions 111 by dynamic reconfiguration of links. More recently, the dynamic nature of the modular structure in the functional brain networks has been investigated by finding modules for each time window and comparing the modularity of the partitions across time [17]. However, this approach does not evaluate the dynamic evolution of the clusters across time and is basically an extension of static graph analysis for multiple static graphs. Mucha et al. [54] proposed a new time-varying clustering algorithm which addresses this issue by defining a new modularity function across time. All of these module finding algorithms result in multiple clustering structures across time and there is a need to reduce the multitude of data into a few representative networks or to quantify the evolution of the network in time using reliable metrics. Therefore, these approaches do not track the change in connectivity or clustering patterns and cannot offer meaningful summarizations of time-varying network topology. Recently, researchers in signal processing have addressed problems in dynamic network analysis such as detection of anomalies or distinct subgraphs in large, noisy background [57] and tracking dynamic networks [61]. Simple approaches such as sliding window or exponentially weighted moving averaging have been proposed for inferring long-term information or trends [62, 63]. However, these methods have some disadvantages such as preserving historical affinities indefinitely, which makes the network topology denser as time evolves [62]. In Chapters 2, 3 and 4, we focused on quantifying the functional brain connectivity within a given time interval and did not provide an understanding of time-varying evolution of these functional networks. Therefore, in this Chapter, we will contribute to the line of dynamic network analysis by finding the event intervals in functional brain connectivity patterns, revealing the most relevant information for each interval and summarizing brain network activity with a few number of representative networks. This is similar to data reduction 112 in signal processing where the ideal summary should conserve the minimum redundancy in representing the dynamics of the particular interval. In this Chapter, we first construct time-varying graphs, which are needed to describe the brain activity across time for a given frequency band, by quantifying the time-varying phase synchrony between different brain regions defined through EEG data [110]. Then, a framework for summarizing or reducing the information in dynamic brain networks into a few representative networks will be proposed by using subspace analysis based on principal component analysis (PCA) to focus on signal subspace only and to discard noise subspace. We compute the angular similarity between subsequent graphs in signal subspace to determine the event boundaries and finally, we form a key network for each interval such that this key network captures only the information belonging to the signal subspace. 5.2 5.2.1 Background Notation We use uppercase bold letters, such as X, to denote a matrix and lowercase bold letters, such as x, to denote a vector, where X(t) and x(t) represent the matrix and vector at time point t, respectively. Xi,j (t) represents the entry of matrix X(t) at the ith row and j th ∑ column, whereas xi (t) denotes the ith entry of vector x(t). ∥x(t)∥p = ( k |xi (t)|p )1/p is i=1 the lp norm of a k-dimensional vector x(t). 113 5.2.2 Network Change Detection Using PCA PCA is a dimensionality reduction technique that results in a compact representation of a multivariate data set by projecting the data onto a lower dimensional subspace defined by a set of new axes called principal components (PCs) and each PC points in the direction of maximum variance remaining in the data, given the variance already accounted for in the preceding components. Lakhina et al. were the first to use PCA as a tool to detect network traffic anomalies, in particular for computer network data streams, and PCA based network wide anomaly detection has been extensively investigated in [144, 145, 146]. Let X = [x(1), . . . , x(m)] be a n × m time-series data matrix, centered to have zero mean, where x(t) is an n-dimensional data vector at time point t. Then, the set of n principal components, {vi }n , are defined as: i=1  vi = arg max ∥v∥2 =1 X − i−1 ∑  T Xvj vj  v (5.1) j=1 1 Solution of Eq. (5.1) is given by the eigenvectors of the covariance matrix, C = m XX T , which form an n × n matrix V = [v1 , . . . , vn ] having associated eigenvalues λ1 ≥ λ2 ≥ · · · ≥ λn arranged in decreasing order. It has been observed that although the original data spans a high dimensional space, normal traffic patterns lie in a low dimensional signal (normal) subspace spanned by the first l PCs corresponding to the l largest eigenvalues and anomalous behavior lies in a noise (anomalous) subspace spanned by the remaining (n − l) PCs [144]. Hence, the data at time point t, x(t), can be decomposed as: x(t) = xSig (t) + xNoi (t) 114 (5.2) where xSig (t) and xNoi (t) correspond to the signal and noise components, respectively, and can be computed as: xSig (t) = P P T x(t) and xNoi (t) = (I − P P T )x(t) (5.3) where P = [v1 , . . . , vl ] consists of the eigenvectors with the largest l eigenvalues and I is the identity matrix. When there is a large change in the noise component, xNoi (t), an anomalous network wide behavior is declared [144]. Thus, by performing statistical testing using a Q-statistic on the squared prediction error, ∥xNoi (t)∥2 , one can determine whether an anomaly is observed or not [147, 144]. 5.3 5.3.1 Methods Forming Time-Varying Graphs via Phase Synchronization In order to identify event intervals and infer the evoked network activity within each interval, we first need to obtain time-varying graphs representing the interactions across different brain sites. Let {G(t)}t=1,2,...,T be a time sequence of matrices where G(t) is an N × N weighted and undirected graph corresponding to the functional connectivity network at time t for a fixed frequency or frequency band, T is the total number of time points and N is the number of nodes within the network. The nodes in the network correspond to different brain regions and edges correspond to the connectivity strength quantified by the average phase locking value (PLV) within a frequency band and at a certain time as: ω b 1 ∑ P LVi,j (t, ω) Gi,j (t) = Ω ω=ω a 115 (5.4) where P LVi,j (t, ω) = L ( ) 1 ∑ exp jΦk (t, ω) 1,2 L (5.5) k=1 is the phase synchrony between nodes i and j at time t and frequency ω, L is the number of trials and Φk (t, ω) = |Φi (t, ω) − Φj (t, ω)| is the phase difference estimate for the k th trial. i,j The time and frequency dependent phase, Φi (t, ω), of a signal, si , is estimated using the RIDRihaczek distribution introduced in Chapter 2. Gi,j (t) ∈ [0, 1] represents the connectivity strength between the nodes i and j within the frequency band of interest, [ωa , ωb ], and Ω is the number of frequency bins in that band. Since the graphs are undirected and symmetric, we create vectors, {x(t)}t=1,2,...,T , to equivalently represent these graphs where x(t) is an (N ) 2 -dimensional column vector obtained by stacking the columns of the upper triangular portion of G(t). In this Chapter, our focus is to evaluate the dynamics of the networks over time and the proposed framework is designed accordingly. However, one can extend this framework to consider each time and frequency bin separately to evaluate the network changes over both time and frequency. 5.3.2 Event Interval Detection Once the time-varying graphs and corresponding column vectors, [x(1), . . . , x(T )], are obtained, we need to identify network-wide changes to determine time intervals which may correspond to the underlying neurophysiological events such as evoked potentials. We define an event interval as a time window during which the similarity in terms of edge strength between the consecutive time-varying graphs is maximized. Change detection in time-varying data streams has usually been performed by using distance functions or similarity metrics [148]. One approach is to use information theoretical measures, e.g., KullbackLeibler or 116 Jensen-Shannon divergence, as distance functions to compute dissimilarity between subsequent time-varying graphs [149]. However, this requires a large amount of data for the reliable computation of empirical histograms. Another most commonly used similarity measure is the lp norm, such as Euclidean, l1 and l∞ norms, of the difference between subsequent column vectors consisting of the graph edges [150]. Although these metrics are relatively straightforward and easy to implement, they might impose a loss of information in revealing important features of the difference between graph edges. For instance, l∞ norm focuses only on the maximum absolute valued difference between the edges and discards the rest of the information in the remaining elements. On the other hand, Euclidean and l1 norms consider the total energy of the difference, which results in emphasizing the importance of the elements having larger magnitude and neglecting the contribution of the ones with smaller absolute values. Alternatively, correlation of subsequent time-varying graphs has been used as a similarity index [151], which also emphasizes the elements with larger absolute values. Therefore, we propose to track the angle of time-varying direction vectors, formed by the edge values, to detect event intervals depending on the change in the angular similarity over time. Furthermore, since the edge values could be suppressed by the effect of the noise subspace, we propose to analyze the dynamic network in the signal subspace, which is similar to spatial filtering of multivariate time-series. For this purpose, we define the set of direction vectors, [y(1), . . . , y(T )], as: y(t) = P T x(t) ∥P T x(t)∥2 (5.6) where P is formed with the l eigenvectors of the covariance matrix, 1 C = T [x(1), . . . , x(T )][x(1), . . . , x(T )]T , corresponding to the largest l eigenvalues such 117 that: ∑l j=1 λj × 100 ≥ 90 ∑(N ) 2 j=1 λj (5.7) Note that y(t) is an l-dimensional direction vector on a unit l − 1 sphere at time t and lies in the signal subspace. Then, we define the angular similarity at time t as: at = y T (t − 1)y(t) (5.8) Note that this method is similar to the cosine similarity and does not depend on the magnitude of the edge values, which results in an equal contribution of each edge value in defining a network-wide change. If there is an abrupt decrease in the angular similarity between the subsequent direction vectors, this would indicate a significant change in the network patterns. Hence, we detect event intervals as follows:    1, if a < µ − 3σ  t t t Et =   0, if a ≥ µ − 3σ  t t t (5.9) where we propose to employ a change detection algorithm based on adaptive thresholding. An event boundary is detected, Et = 1, depending on the decrease of at , 3σt = √ ∑ ∑ 3 1 δ (at−k − µt )2 , from µt = 1 δ at−k . The length of the moving average wink=1 k=1 δ δ dow, δ, can be chosen based on the sampling frequency and total number of time samples, T. 118 5.3.3 Summarizing Signal Subspace for Key Graph Estimation After determining the event intervals, our goal is to form key graphs which best summarize the particular intervals. An ideal key graph should describe the particular interval with minimal redundancy. In this Chapter, we propose to discard noise subspace and consider only signal subspace by projecting the original data onto the signal subspace. Let [x(i), . . . , x(i + m − 1)] be the set of m vectors obtained from the time-varying graphs that compose a detected event interval we try to summarize. The corresponding projection matrix, Pi , is computed such that l eigenvalues capture 90% of the total energy as described in section 5.3.2. For each time point within the particular event interval, signal subspace component is extracted as: xSig (t) = Pi PiT x(t) t = i, . . . , i + m − 1 (5.10) Hence, the new set of vectors contains only the information related to the signal subspace and conserves 90% of the total energy within the particular event interval. For each event interval, we compute a weighted mean vector: ∑i+m−1 xSig = SN RIt xSig (t) ∑i+m−1 SN RIt t=i t=i (5.11) where we define a time-varying signal-to-noise ratio index, (SN RI), which is a measure of how the relative energy captured by the signal subspace compared to the noise subspace evolves over time, as follows: SN RIt = ∥Pi PiT x(t)∥2 ∥(I − Pi Pi (t)T )x(t)∥2 119 = ∥xSig (t)∥2 ∥xN oi (t)∥2 (5.12) The corresponding key graph is obtained by reshaping xSig such that it constitutes the upper triangular part of the symmetric key graph for the particular interval. The proposed framework is summarized in Fig. 5.1. Extract Phase: x RID-Rihaczek Distibution Form Time-Varying Graphs: x Compute Bivariate synchrony indices x Use Phase Locking Value Event Interval Detection Time-Series Data Change Detection Algorithm: x Adaptive Thresholding x Deviation of Angular Similarity Key Graph Estimation Time-Varying Graphs Extraction of Key Graphs Using Subspace Analysis for Dynamic Functional Brain Networks x Extract Signal Subspace Component x Compute SNR for each graph x Form Key Graph Based on SNR Similarity Metric: x Track Changes in Direction Vector x Angular Similarity Subspace Analysis: x Project the Data onto the Signal Subspace x Capture 90% of the Energy Key Graph Figure 5.1: Flow chart describing the proposed framework for extracting key graphs using subspace analysis 120 Event Interval Detection for Reality Mining Data 0.95 0.9 a(t) 0.85 0.8 0.75 0.7 Fall Semester 0.65 5 10 15 Spring Semester 20 25 30 Time in Weeks 35 40 45 Figure 5.2: Angular Similarity for MIT Reality Mining Data Set: Five event intervals are detected by the change detection algorithm 5.4 5.4.1 Results Application to Social Networks Before applying the proposed framework to functional brain networks, in order to evaluate the performance of our framework in inferring the evolution of dynamic networks, we employed the time-varying network data set collected by the MIT Reality Mining Project [152]. The data set is obtained between 2004 and 2005 by recording cell phone communication of 94 students, including graduate and undergraduate students, laboratory staff and professors within a major research institution. Each subject had a phone with built-in Bluetooth devices and recorded the Media Access Control addresses of nearby phones at five minute intervals. We constructed a weighted adjacency matrix by utilizing the physical proximity data for each week within the nine months from August 2004 to June 2005, which results in 121 a dynamic graph consisting of 46 discrete time steps1 . Fig. 5.2 shows the angular similarity between subsequent graphs where the event interval detection algorithm recognized five different intervals separated by the vertical lines. Detected event intervals are consistent with the academic calendar of MIT since the second and fourth intervals correspond to the Fall and Spring semesters, respectively, whereas the remaining intervals correspond to the winter and summer breaks. After detecting the intervals, we extracted five key networks which are shown in Fig. 5.3. We see that the network connectivity is denser within the Fall and Spring semesters where we observe communication between students, undergraduates and graduates, staff members and professors. On the other hand, the network density decreases during the breaks where we observe some connections between graduate students (nodes 1 to 36), first year graduate students (nodes 37 to 51) or laboratory staff members (nodes 52 to 57). This is expected since almost all of the staff members and some of the graduate students continue working in the MIT Media Laboratory whereas undergraduate students (nodes 58 to 66) and Sloan Business School students (nodes 67 to 93) leave the campus during the breaks. In order to evaluate the performance of our framework, we compare the subspace analysis approach with averaging the networks within each event interval to summarize network dynamics. For each interval, we computed the normalized Frobenius inner product of the key graph with each weighted adjacency matrix within the event window and compared it with the same inner product for the averaging approach. Table 5.1 shows the mean and standard deviation of the normalized inner product values for both approaches. For each interval, subspace analysis approach results in a larger inner product value compared to 1 We would like to thank Kevin S. Xu for providing the affinity matrices of MIT Reality Mining data. 122 Key Graph for weeks: 6 to 16 0.8 20 40 60 80 Node Labels Node Labels Key Graph for weeks: 1 to 5 0.6 0.4 0.2 0 20 40 60 80 Node Labels 0.6 20 40 60 80 0.4 0.2 20 40 60 80 Node Labels (a) (b) Key Graph for weeks: 17 to 24 Key Graph for weeks: 25 to 36 0.4 20 40 60 80 Node Labels Node Labels 0 0.2 0 20 40 60 80 Node Labels 0.6 20 40 60 80 0.4 0.2 20 40 60 80 Node Labels (c) (d) Node Labels Key Graph for weeks: 37 to 46 20 40 60 80 0.4 0.2 20 40 60 80 Node Labels 0 (e) Figure 5.3: Extracted key graphs for the MIT Reality Mining Data (Diagonal entries are made zero for contrast purposes): (b) and (d) show the key graphs for Fall and Spring semesters, respectively, whereas (a) and (e) show the key graphs for the summer break and (c) shows the graph for the winter break sample averaging and differences are significant at α = 0.05 level with Wilcoxon rank sum test. Therefore, the proposed approach offers more representative key graphs compared to sample averaging approach. 123 Table 5.1: Mean and standard deviation values of the normalized Frobenius inner products Interval 1 Interval 2 Interval 3 Interval 4 Interval 5 Subspace Analysis 0.72 ± 0.05 0.68 ± 0.06 0.73 ± 0.11 0.76 ± 0.04 0.65 ± 0.08 Mean Network 0.44 ± 0.04 0.52 ± 0.11 0.46 ± 0.12 0.58 ± 0.03 0.47 ± 0.14 5.4.2 Application to EEG Data 5.4.2.1 Data To evaluate the performance of the proposed measure in summarizing the event intervals with biological data, we use a set of EEG data containing the error-related negativity (ERN). The ERN is an event-related potential that occurs following performance errors in a speeded reaction time task [119, 153]. The ERN is observed as a sharp negative trend in EEG recordings which typically peaks from 75-80 ms after the error response. Previously reported EEG data from 62-channels were utilized [109]. This study included 90 (34 male) undergraduate students (two of the original 92 participants were dropped due to artifacts rendering computation of the PLV values problematic). Full methodological details of the recording are available in the previous report [109]. The task was a common speeded-response letter (H/S) flanker, where error and correct response-locked trials from each subject were utilized. A random subset of correct trials was selected, to equate the number of errors relative to correct trials for each participant. Before computing the phase-synchrony measures, all EEG epochs were converted to current source density (CSD) using published methods [125, 126]. This was done to accentuate local activity and to attenuate distal activity (e.g. volume conduction). There has been longstanding interest in time-frequency representations of the ERN [154, 124 155, 156]. It has now been established that the time-frequency energy in the ERN occurs in the theta band (4-8 Hz) of the EEG, occurring medial-frontally. This activity has been shown to have primary sources in the anterior cingulate cortex (ACC) [157, 158, 159]. Observations of similar theta activity across a number of different tasks has been reported, suggesting that midline frontal theta activity may serve related roles across a number of cognitive processes [120]. New attention has been focused on the functional connectivity occurring during the ERN, to better understand the role of medial-frontal theta activity in functional networks subserving cognitive control. Cavanagh and colleagues [122], for example, found evidence that lateral prefrontal cortex (lPFC) activity was phase-synchronous with medial-frontal theta, supporting the idea that medial prefrontal (mPFC) and lPFC regions are functionally integrated during error processing. By assessing medial-frontal regions active during the ERN in relation to diffusion tensor imaging (DTI), new work has also helped demonstrate how mPFC regions are highly integrated with other prefrontal areas during control processing [121]. Together, advances in this area support the view that medial-frontal sources serve as a central region of activity during error-processing, and that phase-synchrony measures of theta activity can index this functional integration. At the same time, work in this area is nascent, and new research into the nature of this functional integration are important. The proposed approach is a graph-based data-driven approach to characterizing functional connectivity, and can offer a new look at network patterns occurring during the ERN. Thus, while the primary aims of the current work are methodological (i.e. developing a method for characterizing time-varying graphs), we hypothesize that the medial-frontal region will play a central functional role during the ERN, and will have significant integration with frontal areas, including lateral-frontal. Such findings can offer support that the proposed timevarying graph approach produces effects consistent with current theoretical and empirical 125 work in the field. 5.4.2.2 Network-wide Change Detection In this Chapter, we analyzed data from 90 subjects corresponding to the error responses. For each subject, time and frequency dependent phase synchrony between all possible electrode (q) pairs is computed by RID-Rihaczek based PLV measure and time-varying graphs, Gt , t = 1, . . . , 256, for the q th subject are constructed using Eq. (5.4) where the number of nodes, N , is equal to 62, the frequency band of interest is the theta band (4-8 Hz) and the sampling ¯ frequency is 128 Hz. Furthermore, a mean time-varying graph sequence, Gt , is computed over all subjects as: 1 ∑ (q) ¯ Gt = Gt 90 90 (5.13) q=1 ¯ and the event interval detection algorithm is applied to this average sequence, Gt , where the length of the moving average window, δ, is chosen as 5% of the sampling period. The value of δ is selected such that the window length is able to both detect the abrupt changes in the connectivity patterns and prevent over-smoothing. Different values of moving average window can be chosen depending on the sampling frequency or the application type. We detected 6 event intervals using Eq. (5.9) which roughly correspond to the stimulus processing (-1000 to -179 ms), pre-ERN (-178 to 0 ms), ERN (1 to 94 ms), post-ERN (95 to 281 ms), Pe (282 to 462 ms) and inter-trial (463 to 1000 ms) intervals, respectively, as shown in Fig. 5.4. The detected event intervals are consistent with the speeded reaction time task as the subjects respond to the stimulus at time 0 ms. The first interval indexes complex processing of the imperative stimulus before making a response. The Pre-ERN and Post-ERN intervals, 126 Event Interval Detection 0.98 0.96 Stimulus Processing a(t) 0.94 0.92 0.9 Pre−ERN Inter−trial ERN 0.88 Post−ERN 0.86 Pe −1000−800 −600 −400 −200 0 200 Time (ms) 400 600 800 1000 Figure 5.4: Event interval detection: 6 event intervals are identified which correspond to the stimulus processing (-1000 to -179 ms), pre-ERN (-178 to 0 ms), ERN (1 to 94 ms), post-ERN (95 to 281 ms), Pe (282 to 462 ms) and inter-trial (463 to 1000 ms) intervals, respectively. The subjects respond to the stimulus at time 0 ms and the red lines indicate E(t) = 1. just before and after the ERN, index activity around the incorrect motor response. Importantly, the ERN interval and Pe interval are detected successfully by the event detection algorithm. The Pe (error-positivity) interval corresponds to a P3-like component observed subsequent to the incorrect response [160, 161]. However, measures of P3 energy generally show activity in lower frequency delta bands (e.g. [162, 163, 164, 165]), rather than the currently measured theta activity. 5.4.3 Significance Testing for the Key Graph Estimation Since the distribution of the interactions under the null hypothesis which form a key graph for a particular interval cannot be obtained analytically, we resort to generating random 127 networks to derive this distribution. For each key graph extracted for a given time interval, we derived an ensemble of 2000 surrogate time-varying networks by randomly reshuffling the edge weights [143]. The key graph estimation algorithm is applied to each surrogate time-varying graph set in each interval which resulted in 2000 surrogate key graphs. In order to compare the original key graphs with the ones obtained from the surrogate data sets, we selected two different P-values, p < 0.01 and p < 0.001, to determine the significant interactions at 99% and 99.9% significance levels, respectively. Figure 5.5: Stimulus processing interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. 128 Figure 5.6: Pre-ERN interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. 5.4.3.1 Key Graph Estimation For each interval, subspace summarization approach described in section 5.3.3 is employed to estimate xSig , which construct the corresponding symmetric key graph. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are statistically significant. For each event interval, Figs. 5.5-5.10 show the interactions which are significant at two different significance levels where the interactions with p < 0.01 and p < 0.001 are represented in blue and red colors, respectively. As one can see from Figs. 5.5-5.10, ERN interval has much more significant 129 Figure 5.7: ERN interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate timevarying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. connections compared to the Pre-ERN and Post-ERN intervals as expected because of the complex activity associated with the error commission. In particular, the frontal electrodes (F5, FZ, F2 and F4) have significant connections with the central electrode (FCz) with p < 0.001, consistent with previously observed interactions in theta band between medial prefrontal cortex (mPFC) and lateral prefrontal cortex (lPFC) during error-related cognitive control processes [122], whereas the other event intervals do not include such interactions among frontal and central sites. During the Pe, on the other hand, we observe significant connections only among the parietal and occipital-parietal electrodes with p < 0.01 and p < 0.001. Hypotheses about theta activity during the Pe are underdeveloped in the literature, 130 Figure 5.8: Post-ERN interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. because P3-related activity generally occurs at lower frequencies (e.g. 0-3 Hz, as described above). Thus, while the observed pattern of effects could be interpreted, it is more reasonable to note that this interval contains the fewest connections between nodes among the identified intervals. We also focused on the change in connectivity for FCz electrode with the remaining 61 electrodes within the key graphs for Pre-ERN, ERN and Post-ERN intervals and compared these connectivity values to identify if FCz has stronger connectivity during the ERN interval compared to the Pre-ERN and Post-ERN intervals. We used a Welch’s t-test at 5% significance level to test the null hypothesis that the connectivity strengths from different 131 Figure 5.9: Pe interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate timevarying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. key graphs are independent random samples from normal distributions with equal means. For both comparisons, Pre-ERN vs ERN and Post-ERN vs ERN, the null hypothesis is rejected where FCz has a larger mean connectivity for the ERN interval indicating that the central electrode has significantly larger connectivity with the rest of the brain during the ERN interval. Moreover, we compared the connectivity values for Pre-ERN and Post-ERN where there is no significant difference between the connectivity values from these intervals. 132 Figure 5.10: Inter-trial interval: A key graph is obtained using the framework described in section 5.3.3. We compared the extracted key graphs with the ones obtained from the surrogate time-varying graphs and identified the interactions which are significant. The interactions which are found to be significant at two different levels, p < 0.01 and p < 0.001, are represented in blue and red colors, respectively. 5.5 Conclusions In this Chapter, we proposed a new framework to describe the dynamic evolution of brain networks. The proposed approach is based on finding the event intervals and revealing the informative component of each interval by considering the signal subspace and discarding the noise subspace such that the key graph would summarize the particular interval with minimal redundancy. Expectable results from the application to real EEG data containing the ERN supports the effectiveness of the proposed framework in determining the event intervals of dynamic brain networks and summarizing network activity with a few number 133 of representative networks. Future work will concentrate on exploring alternative decomposition techniques for subspace analysis, which may result in an improved performance in summarizing dynamic networks. Furthermore, the proposed framework will be extended to compare the dynamic nature of functional networks for error and correct responses to get a more complete understanding of cognitive control. In addition, we will employ the proposed framework to analyze data in other frequency bands including delta, which may be more central to activity during the Pe interval. Future work will also consider exploring single dipole [159, 166] and distributed dipole [167] source solutions to the inverse problem for extending our proposed dynamic functional connectivity analysis framework to the source domain. Finally, we will explore different group analysis methods to consider the variability across individual subjects and possibly reveal the distinctive network features for each subject rather than averaging the time-varying graphs from all subjects. 134 Chapter 6 Conclusions and Future Work In this thesis, a new time-varying phase estimation method based on the RID-Rihaczek distribution is proposed. The performance of the phase estimator and the corresponding synchrony measure are evaluated both analytically and through simulations in comparison to existing measures. Both the analytical and the simulation results show RID-Rihaczek distribution based phase and synchrony estimators to be more robust to noise, have better time-frequency resolution and perform better at detecting actual synchrony in the system compared to existing measures. Second, we proposed two complementary approaches, based on the RID-Rihaczek distribution and the cross frequency-spectral lag distribution to quantify the cross-frequency modulation between two signals, namely the CF phase synchrony and modulation effect, respectively. CF phase synchrony is based on the Reduced Interference Rihaczek distribution and extends the Reduced Interference Rihaczek based phase synchrony measure to quantify the phase synchrony between two signals across different frequencies. The cross frequency-spectral lag distribution offers cross-frequency coupling information and it focuses on quantifying the amount of amplitude modulation between two signals. Simulation results and EEG applications show that the proposed methods perform well in quantifying the phase synchrony across frequencies and identifying the modulation effects. In Chapter 4, we introduced methods for quantifying multivariate phase synchronization within and across groups of signals. The first method was based on two complementary measures of multivariate correlation and complexity, S-estimator and Rv . RID-Rihaczek 135 distribution is used to extract time and frequency dependent phase estimates and measures of correlation are adapted from statistics to multivariate analysis. The proposed approach extends the current state of the art phase synchrony analysis from quantifying bivariate relationships to multivariate ones. We then proposed a novel and direct method, called hyperspherical phase synchrony, to compute the multivariate phase synchrony within a group of oscillators coupled through either direct or indirect relationships, such as the global coupling across different brain sites. Hyperspherical phase synchrony offers lower computational complexity and improved performance in terms of robustness to noise compared to the existing measures. Also, application to real EEG data show the effectiveness of the proposed measure in quantifying functional connectivity across different brain sites. Finally, we extended from bivariate analysis to multivariate connectivity through construction of connectivity graphs. Most of the current literature on using complex network theory to study functional connectivity focuses on constructing a single graph that describes the static relationships and neglect possible time-varying properties of the underlying connectivity patterns of the brain. Hence, in order to account for the changes in these connectivity patterns, we introduce a framework for understanding dynamic evolution of functional brain networks. The proposed approach is based on detecting key event intervals corresponding to the underlying neurophysiological events and inferring key networks or graphs for describing the particular intervals with minimal redundancy. We employ the RID-Rihaczek distribution to compute time-varying functional brain networks and then use subspace analysis based on principal component analysis to separate the edges belonging to the signal subspace from the background edges similar to extracting signal from noise. The resulting key networks contain only the information related to the signal subspace. Results from the application to real EEG data illustrates the effectiveness of the proposed framework in determining the event 136 intervals of dynamic brain networks and summarizing underlying network activity with a few number of representative networks. Future work will concentrate on exploring different generalized complex time-frequency distributions to develop more statistically stable and computationally efficient time-varying phase estimates for non-stationary signals. These new measures will be applied to real signals including the quantification of functional connectivity in the brain from EEG signals. Moreover, the CF phase synchrony measure based on the RID-Rihaczek distribution will be applied to EEG signals and different measures will be explored to quantify the amount of amplitude modulation. Methods for determining the low frequency bands that modulate high frequency oscillations in EEG signals without a priori knowledge will also be investigated. Furthermore, the proposed approaches for multivariate synchrony analysis will be extended. An important limitation of the proposed multivariate synchrony measures is that the groups of oscillators to be analyzed have to be identified a priori with known synchronization patterns and an exhaustive search to find synchronization clusters would be computationally complex. Therefore, it would be valuable to first use preprocessing methods, such as eigenvalue decomposition or measures of association and complexity, which can help to discover the underlying synchronization clusters. Thus, future work will focus on developing statistical measures and methods for quantifying multivariate synchrony. After obtaining the information about the synchronization clusters, one can use the proposed measures to compute multivariate phase synchronization to quantify the within and between cluster relationships. Finally, different decomposition techniques for subspace analysis will be explored, which may result in an improved performance in understanding time-varying functional brain networks. The proposed framework will be employed to analyze EEG data in other frequency bands including delta, beta and gamma. Different group analysis methods 137 will be explored to consider the variability across individual subjects and possibly reveal the distinctive network features for each subject rather than averaging the time-varying graphs from all subjects. 138 BIBLIOGRAPHY 139 BIBLIOGRAPHY [1] R. Sneyers, “Climate chaotic instability: statistical determination and theoretical background,” Environmetrics, vol. 8, no. 5, pp. 517–532, 1997. [2] C. Kyrtsou and W. Labys, “Detecting positive feedback in multivariate time series: the case of metal prices and US inflation,” Physica A: Statistical Mechanics and its Applications, vol. 377, no. 1, pp. 227–229, 2007. [3] M. Le Van Quyen, J. Foucher, J. P. Lachaux, E. Rodriguez, A. Lutz, J. Martinerie, and F. J. Varela, “Comparison of hilbert transform and wavelet methods for the analysis of neuronal synchrony,” Journal of Neuroscience Methods, vol. 111, no. 2, pp. 83–98, 2001. [4] O. Jensen and L. Colgin, “Cross-frequency coupling between neuronal oscillations,” Trends in Cognitive Sciences, vol. 11, no. 7, pp. 267–269, 2007. [5] M. Hasler, “Engineering chaos for encryption and broadband communication,” Philosophical Transactions A, vol. 353, no. 1701, pp. 115–126, 1995. [6] K. Goto, “Multi-channel optical sensing system,” Jan. 4 1983, US Patent 4,367,040. [7] R. Ferri, F. Alicata, S. Del Gracco, M. Elia, S. Musumeci, and M. Stefanini, “Chaotic behavior of EEG slow-wave activity during sleep,” Electroencephalography and Clinical Neurophysiology, vol. 99, no. 6, pp. 539–543, 1996. ¨ [8] E. Ubeyli, “Lyapunov exponents/probabilistic neural networks for analysis of EEG signals,” Expert Systems with Applications, vol. 37, no. 2, pp. 985–992, 2010. [9] W. Freeman and C. Skarda, “Spatial EEG patterns, non-linear dynamics and perception: the neo-sherringtonian view,” Brain Research Reviews, vol. 10, no. 3, pp. 147–175, 1985. [10] I. Dvorak and J. Siska, “On some problems encountered in the estimation of the correlation dimension of the EEG,” Physics Letters A, vol. 118, no. 2, pp. 63–66, 1986. 140 [11] A. Meyer-Lindenberg, “The evolution of complexity in human brain development: an EEG study,” Electroencephalography and Clinical Neurophysiology, vol. 99, no. 5, pp. 405–411, 1996. [12] E. Pereda, R. Quiroga, and J. Bhattacharya, “Nonlinear multivariate analysis of neurophysiological signals,” Progress in Neurobiology, vol. 77, no. 1-2, pp. 1–37, 2005. [13] E. Rodriguez, N. George, J. Lachaux, J. Martinerie, B. Renault, and F. Varela, “Perception’s shadow: long-distance synchronization of human brain activity,” Nature, vol. 397, no. 6718, pp. 430–433, 1999. [14] M. G. Rosenblum, A. A. Pikovsky, and J. Kurths, “Phase synchronization of chaotic oscillators,” Physical Review Letters, vol. 76, no. 11, pp. 1804–1807, 1996. [15] P. Tass, M. Rosenblum, J. Weule, J. Kurths, A. Pikovsky, J. Volkmann, A. Schnitzler, and H. Freund, “Detection of n:m phase locking from noisy data: application to magnetoencephalography,” Physical Review Letters, vol. 81, no. 15, pp. 3291–3294, 1998. [16] K. Friston, “Functional and effective connectivity: A review,” Brain Connectivity, vol. 1, no. 1, pp. 13–36, 2011. [17] M. Chavez, M. Valencia, V. Latora, and J. Martinerie, “Complex networks: new trends for the analysis of brain connectivity,” International Journal of Bifurcation and Chaos, vol. 20, no. 06, pp. 1677–1686, 2010. [18] T. Leergaard, C. Hilgetag, and O. Sporns, “Mapping the connectome: Multi-level analysis of brain connectivity,” Frontiers in Neuroinformatics, vol. 6, p. 14, 2012. [19] C. Stam, “Nonlinear dynamical analysis of EEG and MEG: review of an emerging field,” Clinical Neurophysiology, vol. 116, no. 10, pp. 2266–2301, 2005. [20] J.-P. Lachaux, E. Rodriguez, J. Martinerie, and F. J. Varela, “Measuring phase synchrony in brain signals,” Human Brain Mapping, vol. 8, pp. 194–208, 1999. [21] F. Varela, J. P. Lachaux, E. Rodriguez, and J. Martinerie, “The brainweb: phase synchronization and large-scale integration,” Nature Reviews Neuroscience, vol. 2, no. 4, pp. 229–239, 2001. [22] W. Singer and C. Gray, “Visual feature integration and the temporal correlation hypothesis,” Annual Review of Neuroscience, vol. 18, no. 1, pp. 555–586, 1995. 141 [23] P. Roelfsema, A. Engel, P. Konig, and W. Singer, “Visuomotor integration is associated with zero time-lag synchronization among cortical areas,” Nature, vol. 385, no. 6612, pp. 157–161, 1997. [24] E. Vaadia, I. Haalman, M. Abeles, H. Bergman, Y. Prut, H. Slovin, and A. Aertsen, “Dynamics of neuronal interactions in monkey cortex in relation to behavioural events,” Nature, vol. 373, no. 6514, pp. 515–518, 1995. [25] F. Mormann, J. Fell, N. Axmacher, B. Weber, K. Lehnertz, C. Elger, and G. Fern´ndez, a “Phase/amplitude reset and theta–gamma interaction in the human medial temporal lobe during a continuous word recognition memory task,” Hippocampus, vol. 15, no. 7, pp. 890–900, 2005. [26] D. Cosmelli, O. David, J. Lachaux, J. Martinerie, L. Garnero, B. Renault, and F. Varela, “Waves of consciousness: ongoing cortical patterns during binocular rivalry,” Neuroimage, vol. 23, no. 1, pp. 128–140, 2004. [27] T. White, M. Schmidt, D. Kim, and V. Calhoun, “Disrupted functional brain connectivity during verbal working memory in children and adolescents with schizophrenia,” Cerebral Cortex, vol. 21, no. 3, pp. 510–518, 2011. [28] W. de Haan, W. van der Flier, H. Wang, P. Van Mieghem, P. Scheltens, and C. Stam, “Disruption of functional brain networks in Alzheimer’s disease: What can we learn from graph spectral analysis of resting-state magnetoencephalography?” Brain Connectivity, vol. 2, no. 2, pp. 45–55, 2012. [29] J. Palva, S. Palva, and K. Kaila, “Phase synchrony among neuronal oscillations in the human cortex,” Journal of Neuroscience, vol. 25, no. 15, pp. 3962–3972, 2005. [30] M. Cohen, “Assessing transient cross-frequency coupling in EEG data,” Journal of Neuroscience Methods, vol. 168, no. 2, pp. 494–499, 2008. [31] D. Rudrauf, A. Douiri, C. Kovach, J. Lachaux, D. Cosmelli, M. Chavez, C. Adam, B. Renault, J. Martinerie, and M. Le Van Quyen, “Frequency flows and the timefrequency dynamics of multivariate phase synchronization in brain signals,” Neuroimage, vol. 31, no. 1, pp. 209–227, 2006. [32] B. Schelter, M. Winterhalder, R. Dahlhaus, J. Kurths, and J. Timmer, “Partial phase synchronization for multivariate synchronizing systems,” Physical Review Letters, vol. 96, no. 20, p. 208103, 2006. 142 [33] K. Oshima, C. Carmeli, and M. Hasler, “State Change Detection Using Multivariate Synchronization Measure from Physiological Signals,” J Signal Process, vol. 10, no. 4, pp. 223–226, 2006. [34] D. Cui, X. Liu, Y. Wan, and X. Li, “Estimation of genuine and random synchronization in multivariate neural series,” Neural Networks, vol. 23, no. 6, pp. 698–704, 2010. [35] M. Knyazeva, M. Jalili, A. Brioschi, I. Bourquin, E. Fornari, M. Hasler, R. Meuli, P. Maeder, and J. Ghika, “Topography of EEG multivariate phase synchronization in early Alzheimer’s disease,” Neurobiology of Aging, vol. 31, no. 7, pp. 1132–1144, 2010. [36] S. Dimitriadis, N. Laskaris, V. Tsirka, M. Vourkas, S. Micheloyannis, and S. Fotopoulos, “Tracking brain dynamics via time-dependent network analysis,” Journal of Neuroscience Methods, vol. 193, no. 1, pp. 145–155, 2010. [37] F. D. V. Fallani, L. Astolfi, F. Cincotti, D. Mattia, M. G. Marciani, S. Salinari, J. Kurths, S. Gao, A. Cichocki, A. Colosimo, and F. Babiloni, “Cortical functional connectivity networks in normal and spinal cord injured patients: Evaluation by graph analysis,” Human Brain Mapping, vol. 28, no. 12, pp. 1334–1346, 2007. [38] M. Rubinov and O. Sporns, “Complex network measures of brain connectivity: uses and interpretations,” Neuroimage, vol. 52, no. 3, pp. 1059–1069, 2010. [39] M. Valencia, J. Martinerie, S. Dupont, and M. Chavez, “Dynamic small-world behavior in functional brain networks unveiled by an event-related networks approach,” Physical Review E, vol. 77, no. 5, p. 050905, 2008. [40] J. P. Lachaux, A. Lutz, D. Rudrauf, D. Cosmelli, M. Le Van Quyen, J. Martinerie, and F. Varela, “Estimating the time-course of coherence between single-trial brain signals: an introduction to wavelet coherence,” Neurophysiologie Clinique/Clinical Neurophysiology, vol. 32, no. 3, pp. 157–174, 2002. [41] D. Li and R. Jung, “Quantifying coevolution of nonstationary biomedical signals using time-varying phase spectra,” Annals of Biomedical Engineering, vol. 28, no. 9, pp. 1101–1115, 2000. [42] M. Newman, “Finding community structure in networks using the eigenvectors of matrices,” Physical Review E, vol. 74, no. 3, p. 36104, 2006. [43] ——, “Modularity and community structure in networks,” Proceedings of the National Academy of Sciences, vol. 103, no. 23, pp. 8577–8582, 2006. 143 [44] D. Kim and H. Jeong, “Systematic analysis of group identification in stock markets,” Physical Review E, vol. 72, no. 4, p. 046133, 2005. [45] C. Zhou, L. Zemanov´, G. Zamora, C. Hilgetag, and J. Kurths, “Hierarchical organizaa tion unveiled by functional connectivity in complex brain networks,” Physical Review Letters, vol. 97, no. 23, p. 238103, 2006. [46] A. Utsugi, K. Ino, and M. Oshikawa, “Random matrix theory analysis of cross correlations in financial markets,” Physical Review E, vol. 70, no. 2, p. 026110, 2004. [47] M. M¨ller, G. Baier, A. Galka, U. Stephani, and H. Muhle, “Detection and characu terization of changes of the correlation structure in multivariate time series,” Physical Review E, vol. 71, no. 4, p. 046116, 2005. [48] C. Allefeld and J. Kurths, “An approach to multivariate phase synchronization analysis and its application to event-related potentials,” International Journal of Bifurcation and Chaos, vol. 14, no. 2, pp. 417–426, 2004. [49] C. Allefeld, M. M¨ller, and J. Kurths, “Eigenvalue decomposition as a generalized synu chronization cluster analysis,” International Journal of Bifurcation and Chaos, vol. 17, pp. 3493–3497, 2007. [50] C. Allefeld and S. Bialonski, “Detecting synchronization clusters in multivariate time series via coarse-graining of Markov chains,” Physical Review E, vol. 76, no. 6, pp. 66 207–66 215, 2007. [51] A. Casteigts, P. Flocchini, W. Quattrociocchi, and N. Santoro, “Time-varying graphs and dynamic networks,” Ad-hoc, Mobile, and Wireless Networks, vol. 6811, pp. 346– 359, 2011. [52] P. Basu, A. Bar-Noy, R. Ramanathan, and M. Johnson, “Modeling and analysis of time-varying graphs,” Arxiv preprint arXiv:1012.0260, 2010. [53] V. Nicosia, J. Tang, M. Musolesi, G. Russo, C. Mascolo, and V. Latora, “Components in time-varying graphs,” Arxiv preprint arXiv:1106.2134, 2011. [54] P. Mucha, T. Richardson, K. Macon, M. Porter, and J. Onnela, “Community structure in time-dependent, multiscale, and multiplex networks,” Science, vol. 328, no. 5980, pp. 876–878, 2010. [55] J. Tang, S. Scellato, M. Musolesi, C. Mascolo, and V. Latora, “Small-world behavior in time-varying graphs,” Physical Review E, vol. 81, no. 5, p. 055101, 2010. 144 [56] N. Santoro, W. Quattrociocchi, P. Flocchini, A. Casteigts, and F. Amblard, “Timevarying graphs and social network analysis: Temporal indicators and metrics,” Arxiv preprint arXiv:1102.0629, 2011. [57] B. Miller, M. Beard, and N. Bliss, “Matched filtering for subgraph detection in dynamic networks,” in IEEE Statistical Signal Processing Workshop (SSP), 2011, pp. 509–512. [58] ——, “Eigenspace analysis for threat detection in social networks,” in Proceedings of the 14th International Conference on Information Fusion (FUSION). IEEE, 2011, pp. 1–7. [59] T. Ide and H. Kashima, “Eigenspace-based anomaly detection in computer systems,” in Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2004, pp. 440–449. [60] C. Priebe, J. Conroy, D. Marchette, and Y. Park, “Scan statistics on enron graphs,” Computational & Mathematical Organization Theory, vol. 11, no. 3, pp. 229–247, 2005. [61] K. Xu, M. Kliger, and A. Hero, “A shrinkage approach to tracking dynamic networks,” in IEEE Statistical Signal Processing Workshop (SSP), 2011, pp. 517–520. [62] C. Cortes, D. Pregibon, and C. Volinsky, “Computational methods for dynamic graphs,” Journal of Computational and Graphical Statistics, vol. 12, no. 4, pp. 950–970, 2003. [63] H. Tong, S. Papadimitriou, P. Yu, and C. Faloutsos, “Proximity tracking on timeevolving bipartite graphs,” in Proc. of SDM, 2008, pp. 704–715. [64] N. Huang, Z. Shen, S. Long, M. Wu, H. Shih, Q. Zheng, N. Yen, C. Tung, and H. Liu, “The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis,” Proceedings: Mathematical, Physical and Engineering Sciences, vol. 454, no. 1971, pp. 903–995, 1998. [65] C. Sweeney-Reed and S. Nasuto, “A novel approach to the detection of synchronisation in EEG based on empirical mode decomposition,” Journal of Computational Neuroscience, vol. 23, no. 1, pp. 79–111, 2007. [66] S. Mallat, “A theory for multiresolution signal decomposition: the wavelet representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674 –693, 1989. 145 [67] K. G¨chenig, Foundations of time-frequency analysis. Birkh¨user, 2001. o a [68] L. Angelini, M. D. Tommaso, M. Guido, K. Hu, P. C. Ivanov, D. Marinazzo, G. Nardulli, L. Nitti, M. Pellicoro, C. Pierro, and S. Stramaglia, “Steady-state visual evoked potentials and phase synchronization in migraine patients,” Physical Review Letters, vol. 93, no. 3, p. 38103, 2004. [69] J. Bhattacharya, “Reduced degree of long-range phase synchrony in pathological human brain,” Acta Neurobiologiae Experimentalis, vol. 61, no. 4, pp. 309–318, 2001. a a [70] M. Koskinen, T. Sepp¨nen, J. Tuukkanen, A. Yli-Hankala, and V. J¨ntti, “Propofol anesthesia induces phase synchronization changes in EEG,” Clinical Neurophysiology, vol. 112, no. 2, pp. 386–392, 2001. [71] C. Ioana and A. Quinquis, “Time-frequency analysis using warped-based high-order phase modeling,” EURASIP Journal on Applied Signal Processing, vol. 17, pp. 2856– 2873, 2005. [72] B. Porat, Digital Processing of Random Signals: Theory and Methods. Prentice Hall, 1994. [73] S. Barbarossa, A. Scaglione, and G. Giannakis, “Product high-order ambiguity function for multicomponent polynomial-phase signal modeling,” IEEE Transactions on Signal Processing, vol. 46, no. 3, pp. 691–708, 1998. [74] J. Eckmann, S. Oliffson Kamphorst, and D. Ruelle, “Recurrence plots of dynamical systems,” EPL (Europhysics Letters), vol. 4, pp. 973–977, 1987. [75] J. Zbilut, A. Giuliani, and C. Webber, “Detecting deterministic signals in exceptionally noisy environments using cross-recurrence quantification,” Physics Letters A, vol. 246, no. 1, pp. 122–128, 1998. [76] N. Marwan, M. Carmen Romano, M. Thiel, and J. Kurths, “Recurrence plots for the analysis of complex systems,” Physics Reports, vol. 438, no. 5-6, pp. 237–329, 2007. [77] S. Aviyente, E. Bernat, W. Evans, and S. Sponheim, “A phase synchrony measure for quantifying dynamic functional integration in the brain,” Human Brain Mapping, vol. 32, no. 1, pp. 80–93, 2011. o [78] T. Kreuz, R. Andrzejak, F. Mormann, A. Kraskov, H. St¨gbauer, C. Elger, K. Lehnertz, and P. Grassberger, “Measure profile surrogates: A method to validate the per146 formance of epileptic seizure prediction algorithms,” Physical Review E, vol. 69, no. 6, p. 61915, 2004. [79] L. Cohen, Time-frequency analysis: theory and applications. Prentice Hall, 1995. [80] J. Jeong and W. Williams, “Kernel design for reduced interference distributions,” IEEE Transactions on Signal Processing, vol. 40, pp. 402–412, 1992. [81] A. Rihaczek, “Signal energy distribution in time and frequency,” IEEE Transactions on Information Theory, vol. 14, no. 3, pp. 369–374, 1968. [82] P. Schreier, L. Scharf, and A. Hanssen, “A geometric interpretation of the Rihaczek time-frequency distribution for stochastic signals,” in International Symposium on Information Theory (ISIT), 2005, pp. 966–969. [83] L. Scharf, B. Friedlander, P. Flandrin, and A. Hanssen, “The Hilbert space geometry of the stochastic Rihaczek distribution,” in Asilomar Conference on Signals Systems and Computers, vol. 1, 2001, pp. 720–725. [84] B. Gillespie and L. Atlas, “Optimizing time-frequency kernels for classification,” IEEE Transactions on Signal Processing, vol. 49, no. 3, pp. 485–496, 2001. [85] J. O’neill and W. Williams, “Shift covariant time-frequency distributions of discrete signals,” IEEE Transactions on Signal Processing, vol. 47, no. 1, pp. 133–146, 1999. [86] S. Kay, Fundamentals of statistical signal processing: estimation theory, CH 15, Pages 45, 525, 1993. [87] A. Hero III, J. Fessler, and M. Usman, “Exploring estimator bias-variance tradeoffs using the uniform CR bound,” IEEE Transactions on Signal Processing, vol. 44, no. 8, pp. 2026–2041, 1996. [88] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. Doyne Farmer, “Testing for nonlinearity in time series: the method of surrogate data,” Physica D: Nonlinear Phenomena, vol. 58, no. 1-4, pp. 77–94, 1992. [89] G. Tcheslavski, “Coherence and phase synchrony analysis of electroencephalogram, Chapter 3, Page 46,” Ph.D. dissertation, Virginia Polytechnic Institute and State University, 2005. 147 [90] Y. Kuramoto, “Self-entrainment of a population of coupled nonlinear oscillators,” in International symposium on mathematical problems in theoretical physics, Lecture notes in Physics, vol. 39, 1975, pp. 420–422. [91] S. Strogatz, “From Kuramoto to Crawford: exploring the onset of synchronization in populations of coupled oscillators,” Physica D: Nonlinear Phenomena, vol. 143, no. 1-4, pp. 1–20, 2000. [92] J. Acebr´n, L. Bonilla, C. P´rez Vicente, F. Ritort, and R. Spigler, “The Kuramoto o e model: A simple paradigm for synchronization phenomena,” Reviews of Modern Physics, vol. 77, no. 1, pp. 137–185, 2005. [93] M. Yeung and S. Strogatz, “Time delay in the Kuramoto model of coupled oscillators,” Physical Review Letters, vol. 82, no. 3, pp. 648–651, 1999. [94] C. Stam, G. Nolte, and A. Daffertshofer, “Phase lag index: Assessment of functional connectivity from multi channel EEG and MEG with diminished bias from common sources,” Human Brain Mapping, vol. 28, no. 11, pp. 1178–1193, 2007. [95] N. Rehman and D. Mandic, “Multivariate empirical mode decomposition,” Proceedings of the Royal Society of London, Series A: Mathematical, Physical and Engineering Sciences, vol. 466, no. 2117, pp. 1291–1302, 2010. [96] R. T. Canolty and R. T. Knight, “The functional role of cross-frequency coupling,” Trends in Cognitive Sciences, vol. 14, no. 11, pp. 506 – 515, 2010. [97] R. T. Canolty, E. Edwards, S. S. Dalal, M. Soltani, S. S. Nagarajan, H. E. Kirsch, M. S. Berger, N. M. Barbaro, and R. T. Knight, “High gamma power is phase-locked to theta oscillations in human neocortex,” Science, vol. 313, no. 5793, pp. 1626–1628, 2006. [98] B. Voytek, R. Canolty, A. Shestyuk, N. Crone, J. Parvizi, and R. Knight, “Shifts in Gamma Phase–Amplitude Coupling Frequency from Theta to Alpha Over Posterior Cortex During Visual Tasks,” Frontiers in Human Neuroscience, vol. 4, 2010. [99] S. Sukittanon, L. Atlas, and J. Pitton, “Modulation-scale analysis for content identification,” IEEE Transactions on Signal Processing, vol. 52, no. 10, pp. 3023–3035, 2004. [100] H. Dudley, “Remaking speech,” The Journal of the Acoustical Society of America, vol. 11, p. 165, 1939. 148 [101] L. Zadeh, “Frequency analysis of variable networks,” Proceedings of the IRE, vol. 38, no. 3, pp. 291–299, 2006. [102] T. Kailath, “Channel characterization: Time-variant dispersive channels,” Lectures on Communication System Theory, pp. 95–123, 1961. [103] W. Gardner, “Exploitation of spectral redundancy in cyclostationary signals,” IEEE Signal Processing Magazine, vol. 8, no. 2, pp. 14–36, 2002. [104] L. Atlas and S. Shamma, “Joint acoustic and modulation frequency,” EURASIP Journal on Applied Signal Processing, vol. 2003, p. 675, 2003. [105] P. Clark and L. Atlas, “Time-frequency coherent modulation filtering of nonstationary signals,” IEEE Transactions on Signal Processing, vol. 57, no. 11, pp. 4323–4332, 2009. [106] C. Nikias and M. Raghuveer, “Bispectrum estimation: A digital signal processing framework,” Proceedings of the IEEE, vol. 75, no. 7, pp. 869–891, 2005. [107] R. Baraniuk, P. Flandrin, A. Janssen, and O. Michel, “Measuring time-frequency information content using the R´nyi entropies,” IEEE Transactions on Information Theory, e vol. 47, no. 4, pp. 1391–1409, 2002. [108] I. Taneja, “Generalized symmetric divergence measures and inequalities,” Arxiv preprint math/0501301, 2005. [109] J. R. Hall, E. M. Bernat, and C. J. Patrick, “Externalizing psychopathology and the error-related negativity,” Psychological Science, vol. 18, no. 4, pp. 326–333, 2007. [110] S. Aviyente and A. Mutlu, “A time-frequency-based approach to phase and phase synchrony estimation,” IEEE Transactions on Signal Processing, vol. 59, no. 7, pp. 3086–3098, 2011. [111] C. Allefeld, S. Frisch, and M. Schlesewsky, “Detection of early cognitive processing by event-related phase synchronization analysis,” Neuroreport, vol. 16, no. 1, pp. 13–16, 2005. [112] C. Allefeld and J. Kurths, “Multivariate phase synchronization analysis of eeg data,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 86, no. 9, pp. 2218–2221, 2003. 149 [113] C. W. J. Granger, “Investigating causal relations by econometric models and crossspectral methods,” Econometrica, vol. 37, no. 3, pp. 424–438, 1969. [114] L. A. Baccala and K. Sameshima, “Partial directed coherence: a new concept in neural structure determination,” Biological Cybernetics, vol. 84, no. 6, pp. 463–474, 2001. [115] P. Robert and Y. Escoufier, “A unifying tool for linear multivariate statistical methods: the RV-coefficient,” Applied Statistics, vol. 25, no. 3, pp. 257–265, 1976. [116] R. Horn and C. Johnson, Matrix analysis. Cambridge Univ Pr, 1990. [117] G. Osipov, A. Pikovsky, M. Rosenblum, and J. Kurths, “Phase synchronization effects in a lattice of nonidentical R¨ssler oscillators,” Physical Review E, vol. 55, no. 3, pp. o 2353–2361, 1997. [118] D. Prichard and J. Theiler, “Generating surrogate data for time series with several simultaneously measured variables,” Physical Review Letters, vol. 73, no. 7, pp. 951– 954, 1994. [119] M. Falkenstein, J. Hohnsbein, J. Hoormann, and L. Blanke, “Effects of crossmodal divided attention on late ERP components. II. error processing in choice reaction tasks,” Electroencephalography and Clinical Neurophysiology, vol. 78, no. 6, pp. 447– 455, 1991. [120] J. Cavanagh, L. Zambrano-Vazquez, and J. Allen, “Theta lingua franca: A common mid-frontal substrate for action monitoring processes,” Psychophysiology, vol. 49, no. 2, pp. 220–238, 2012. [121] M. Cohen, “Error-related medial frontal theta activity predicts cingulate-related structural connectivity,” Neuroimage, vol. 55, no. 3, pp. 1373–1383, 2011. [122] J. Cavanagh, M. Cohen, and J. Allen, “Prelude to and resolution of an error: EEG phase synchrony reveals cognitive control dynamics during action monitoring,” Journal of Neuroscience, vol. 29, no. 1, pp. 98–105, 2009. [123] L. Trefethen and D. Bau, Numerical linear algebra. Society for Industrial Mathematics, 1997, no. 50. [124] J. Demmel, I. Dumitriu, and O. Holtz, “Fast linear algebra is stable,” Numerische Mathematik, vol. 108, no. 1, pp. 59–91, 2007. 150 [125] J. Kayser and C. Tenke, “Principal components analysis of laplacian waveforms as a generic method for identifying ERP generator patterns: I. evaluation with auditory oddball tasks,” Clinical Neurophysiology, vol. 117, no. 2, pp. 348–368, 2006. [126] ——, “Principal components analysis of laplacian waveforms as a generic method for identifying ERP generator patterns: II. adequacy of low-density estimates,” Clinical Neurophysiology, vol. 117, no. 2, pp. 369–380, 2006. [127] J. Egan, “Signal detection theory and ROC analysis,” 1975. a o ıa, o [128] D. Ab´solo, R. Hornero, C. G´mez, M. Garc´ and M. L´pez, “Analysis of EEG background activity in Alzheimer’s disease patients with Lempel-Ziv complexity and central tendency measure,” Medical engineering & physics, vol. 28, no. 4, pp. 315–322, 2006. [129] M. Le Van Quyen, J. Soss, V. Navarro, R. Robertson, M. Chavez, M. Baulac, and J. Martinerie, “Preictal state identification by synchronization changes in long-term intracranial EEG recordings,” Clinical Neurophysiology, vol. 116, no. 3, pp. 559–568, 2005. [130] E. Bullmore and O. Sporns, “The economy of brain network organization,” Nature Reviews Neuroscience, vol. 13, no. 5, pp. 336–349, 2012. [131] D. Watts and S. Strogatz, “Collective dynamics of small-world networks,” Nature, vol. 393, no. 6684, pp. 440–442, 1998. [132] K. Supekar, V. Menon, D. Rubin, M. Musen, and M. Greicius, “Network analysis of intrinsic functional brain connectivity in Alzheimer’s disease,” PLoS Computational Biology, vol. 4, no. 6, p. e1000100, 2008. [133] M. Rubinov and O. Sporns, “Weight-conserving characterization of complex functional brain networks,” Neuroimage, vol. 56, no. 4, pp. 2068–2079, 2011. [134] C. Stam, M. Breakspear, A. van Walsum, and B. van Dijk, “Nonlinear synchronization in EEG and whole-head MEG recordings of healthy subjects,” Human Brain Mapping, vol. 19, no. 2, pp. 63–78, 2003. [135] C. Stam, B. Jones, G. Nolte, M. Breakspear, and P. Scheltens, “Small-world networks and functional connectivity in Alzheimer’s disease,” Cerebral Cortex, vol. 17, no. 1, pp. 92–99, 2007. 151 [136] D. Bassett, A. Meyer-Lindenberg, S. Achard, T. Duke, and E. Bullmore, “Adaptive reconfiguration of fractal small-world human brain functional networks,” Proceedings of the National Academy of Sciences, vol. 103, no. 51, pp. 19 518–19 523, 2006. [137] L. Douw, M. Schoonheim, D. Landi, M. Van Der Meer, J. Geurts, J. Reijneveld, M. Klein, and C. Stam, “Cognition is related to resting-state small-world network topology: an magnetoencephalographic study,” Neuroscience, vol. 175, pp. 169–177, 2011. [138] C. Stam, T. Montez, B. Jones, S. Rombouts, Y. Van Der Made, Y. Pijnenburg, and P. Scheltens, “Disturbed fluctuations of resting state EEG synchronization in Alzheimer’s disease,” Clinical Neurophysiology, vol. 116, no. 3, pp. 708–715, 2005. [139] S. Achard, R. Salvador, B. Whitcher, J. Suckling, and E. Bullmore, “A resilient, lowfrequency, small-world human brain functional network with highly connected association cortical hubs,” The Journal of Neuroscience, vol. 26, no. 1, pp. 63–72, 2006. [140] D. Bassett and E. Bullmore, “Small-world brain networks,” The Neuroscientist, vol. 12, no. 6, pp. 512–523, 2006. [141] O. Sporns, D. Chialvo, M. Kaiser, and C. Hilgetag, “Organization, development and function of complex brain networks,” Trends in Cognitive Sciences, vol. 8, no. 9, pp. 418–425, 2004. [142] O. Sporns, C. Honey, and R. K¨tter, “Identification and classification of hubs in brain o networks,” PLoS One, vol. 2, no. 10, p. e1049, 2007. [143] C. Stam, W. De Haan, A. Daffertshofer, B. Jones, I. Manshanden, A. van Cappellen van Walsum, T. Montez, J. Verbunt, J. de Munck, B. van Dijk, H. Berendse, and P. Scheltens, “Graph theoretical analysis of magnetoencephalographic functional connectivity in Alzheimer’s disease,” Brain, vol. 132, no. 1, pp. 213–224, 2009. [144] A. Lakhina, M. Crovella, and C. Diot, “Diagnosing network-wide traffic anomalies,” in ACM SIGCOMM Computer Communication Review, vol. 34, no. 4, 2004, pp. 219–230. [145] ——, “Mining anomalies using traffic feature distributions,” in ACM SIGCOMM Computer Communication Review, vol. 35, no. 4, 2005, pp. 217–228. [146] L. Huang, X. Nguyen, M. Garofalakis, M. Jordan, A. Joseph, and N. Taft, “In-network PCA and anomaly detection,” Advances in Neural Information Processing Systems, vol. 19, p. 617, 2007. 152 [147] J. Jackson and G. Mudholkar, “Control procedures for residuals associated with principal component analysis,” Technometrics, vol. 21, no. 3, pp. 341–349, 1979. [148] D. Kifer, S. Ben-David, and J. Gehrke, “Detecting change in data streams,” in Proceedings of the Thirtieth International Conference on Very Large Data Bases, vol. 30, 2004, pp. 180–191. [149] R. Veldhuis and E. Klabbers, “On the computation of the Kullback-Leibler measure for spectral distances,” IEEE Transactions on Speech and Audio Processing, vol. 11, no. 1, pp. 100–103, 2003. [150] H. Ding, G. Trajcevski, P. Scheuermann, X. Wang, and E. Keogh, “Querying and mining of time series data: experimental comparison of representations and distance measures,” Proceedings of the VLDB Endowment, vol. 1, no. 2, pp. 1542–1552, 2008. [151] P. Rosin, “Thresholding for change detection,” in Sixth International Conference on Computer Vision. IEEE, 1998, pp. 274–279. [152] N. Eagle, A. Pentland, and D. Lazer, “Inferring friendship network structure by using mobile phone data,” Proceedings of the National Academy of Sciences, vol. 106, no. 36, pp. 15 274–15 278, 2009. [153] W. Gehring, B. Goss, M. Coles, D. Meyer, and E. Donchin, “A neural system for error detection and compensation,” Psychological Science, vol. 4, no. 6, pp. 385–390, 1993. [154] E. M. Bernat, W. J. Williams, and W. J. Gehring, “Decomposing ERP time-frequency energy using PCA,” Clinical Neurophysiology, vol. 116, no. 6, pp. 1314–1334, 2005. [155] P. Luu and D. Tucker, “Regulating action: alternating activation of midline frontal and motor cortical networks,” Clinical Neurophysiology, vol. 112, no. 7, pp. 1295–1306, 2001. [156] L. Trujillo and J. Allen, “Theta EEG dynamics of the error-related negativity,” Clinical Neurophysiology, vol. 118, no. 3, pp. 645–668, 2007. [157] E. Hochman, Z. Eviatar, Z. Breznitz, M. Nevat, and S. Shaul, “Source localization of error negativity: additional source for corrected errors,” NeuroReport, vol. 20, no. 13, pp. 1144–1148, 2009. [158] W. Miltner, C. Braun, and M. Coles, “Event-related brain potentials following incorrect feedback in a time-estimation task: Evidence for a generic neural system for error detection,” Journal of Cognitive Neuroscience, vol. 9, no. 6, pp. 788–798, 1997. 153 [159] S. Dehaene, M. Posner, and D. Tucker, “Localization of a neural system for error detection and compensation,” Psychological Science, vol. 5, no. 5, pp. 303–305, 1994. [160] H. Leuthold and W. Sommer, “ERP correlates of error processing in spatial S-R compatibility tasks,” Clinical Neurophysiology, vol. 110, no. 2, pp. 342–357, 1999. [161] T. Overbeek, S. Nieuwenhuis, and K. Ridderinkhof, “Dissociable components of error processing: On the functional significance of the Pe vis-`-vis the ERN/Ne.” Journal a of Psychophysiology, vol. 19, no. 4, pp. 319–329, 2005. [162] E. Bernat, S. Malone, W. Williams, C. Patrick, and W. Iacono, “Decomposing delta, theta, and alpha time-frequency ERP activity from a visual oddball task using PCA,” International Journal of Psychophysiology, vol. 64, no. 1, pp. 62–74, 2007. [163] T. Demiralp, A. Ademoglu, Y. Istefanopulos, C. Ba¸ar-Eroglu, and E. Ba¸ar, “Wavelet s s analysis of oddball p300,” International Journal of Psychophysiology, vol. 39, no. 2, pp. 221–227, 2001. [164] C. Gilmore, S. Malone, E. Bernat, and W. Iacono, “Relationship between the p3 event-related potential, its associated time-frequency components, and externalizing psychopathology,” Psychophysiology, vol. 47, no. 1, pp. 123–132, 2010. [165] K. Jones, B. Porjesz, D. Chorlian, M. Rangaswamy, C. Kamarajan, A. Padmanabhapillai, A. Stimus, and H. Begleiter, “S-transform time-frequency analysis of p300 reveals deficits in individuals diagnosed with alcoholism,” Clinical Neurophysiology, vol. 117, no. 10, pp. 2128–2143, 2006. [166] C. Michel, M. Murray, G. Lantz, S. Gonzalez, L. Spinelli, and R. Grave de Peralta, “EEG source imaging,” Clinical Neurophysiology, vol. 115, no. 10, pp. 2195–2222, 2004. [167] M. Herrmann, J. Rommler, A. Ehlis, A. Heidrich, and A. Fallgatter, “Source localization (LORETA) of the error-related-negativity (ERN/Ne) and positivity (Pe),” Cognitive Brain Research, vol. 20, no. 2, pp. 294–299, 2004. 154