GRAPH-BASED METHODS FOR INFERRING NEURONAL CONNECTIVITY FROM
SPIKE TRAIN ENSEMBLES
By
Seif El-din El-dawlatly

A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
Electrical Engineering
2011

ABSTRACT
GRAPH-BASED METHODS FOR INFERRING NEURONAL CONNECTIVITY FROM
SPIKE TRAIN ENSEMBLES
By
Seif El-din El-dawlatly

Understanding the brain’s inner workings requires studying the underlying complex networks
that bind its basic computational elements, the neurons. Advances in extracellular neural
recording techniques have enabled simultaneous recording of spike trains from multiple single
neurons in awake, behaving subjects. Yet, devising methods to infer connectivity among these
neurons has been significantly lacking. We introduce a connectivity inference framework based
on graphical models. We first infer the functional connectivity between neurons by searching for
clusters of statistically dependent spike trains. We then infer the effective connectivity between
neurons within each cluster by building Dynamic Bayesian Network (DBN) model fit to the
spike train data. Using probabilistic models of neuronal firing, we demonstrate the utility of this
framework to infer neuronal connectivity in moderate and large scale networks with a substantial
gain in performance compared to classical methods. We further use this framework to examine
the role of spike timing correlation in infragranular layer V of the primary somatosensory cortex
(S1) of the rat during unilateral whisker stimulation in vivo. Stable, whisker-specific networks
provided more information about the stimulus than individual neurons’ response. We finally
demonstrate how this framework enables tracking and quantifying plastic changes in
connectivity in biologically-plausible models of spike-timing-dependent-plasticity as well as
changes in S1 response maps following sensory deprivation in the awake, behaving rat.

Copyright by
SEIF EL-DIN EL-DAWLATLY
2011

th

To the martyrs of the January 25 , 2011 Egyptian revolution

iv

ACKNOWLEDGMENTS

First, I want to express my gratitude to Allah whose blessings made my efforts fruitful.
Second, I could not have done this without many people who have directly or indirectly helped
me. I would like to thank them all, but there are some people who deserve special recognition.
I would like to thank my advisor Dr. Karim Oweiss for his continuous guidance. His
encouragement, patience and meeting with me every single week are the primary reasons this
work has progressed. He has provided me with an excellent atmosphere for doing research as
well as generous support to present our work at many prestigious conferences.
I would like to express my appreciation to my dissertation committee members Dr. Hayder
Radha and Dr. Hongbing Wang. I am also grateful to Dr. Rong Jin for providing valuable
comments and suggestions throughout our collaboration.
I would also like to thank all my current and previous lab mates who were more of friends
than colleagues: Mehdi Aghagolzadeh, Ki Yong Kwon, Ali Mohebi, John Daly, Fei Zhang,
Debbie Mukherjee, Jianbo Liu, Jasmina Jakupovic, Jennifer Parker and Faisal Abu-Nimeh. My
thanks also go to my friends who have always been there for me: Kareem Refaat, Sameh
Zakhary, Cherif Salama, Rana Alkhamra, Jameel Haddad, Mohamed El-gafy, Sara Al-alili,
Mona Alchaer, Bassem Amin, Ahmed Mamoun and the TAP Wednesday soccer team.
I would also like to thank my parents and my brother for their support and encouragement
during years of my work. Living thousands of miles away has not prevented them from
supporting me every single day during my PhD.
Finally, I would like to thank the generous funding of my PhD by the National Institute of
Neurological Disorders and Stroke (NINDS) grant number NS054148 and the Graduate School.

v

TABLE OF CONTENTS

LIST OF TABLES....................................................................................................................... viii
LIST OF FIGURES ....................................................................................................................... ix
LIST OF ABBREVIATIONS..................................................................................................... xvii
Chapter 1: Introduction ................................................................................................................... 1
Chapter 2: Background and Overview............................................................................................ 7
2.1
Neural Activity................................................................................................................ 7
2.2
Recording and Pre-processing of Extracellular Neural Activity .................................... 9
2.3
Existing Methods for Inferring Neuronal Connectivity................................................ 13
2.3.1
The Crosscorrelogram........................................................................................... 13
2.3.2
Information-theoretic Methods ............................................................................. 15
2.3.3
Granger Causality ................................................................................................. 17
2.3.4
Generalized Linear Model .................................................................................... 19
2.4
Graphical Models for Inferring Neuronal Connectivity ............................................... 22
Chapter 3: A Multiscale Clustering Approach for Identifying Neuronal Functional
Connectivity...................................................................................................................................26
3.1
Neuron Population Model............................................................................................. 26
3.2
Scale-space Representation........................................................................................... 30
3.3
Similarity Measures ...................................................................................................... 32
3.4
Spectral Clustering........................................................................................................ 34
3.5
Results........................................................................................................................... 36
3.5.1
Clustering Synaptically-Coupled Populations ...................................................... 37
3.5.1.1 Population with Fixed History Interval................................................................. 37
3.5.1.2 Population with Variable History Interval............................................................ 40
3.5.1.3 Clusters with Distinct Temporal Interactions ....................................................... 42
3.5.1.4 Distinct within-cluster Temporal Interactions ...................................................... 42
3.5.2
Variability in Network Topology.......................................................................... 44
3.5.2.1 Structural Plasticity............................................................................................... 45
3.5.2.2 Spike-Timing Dependent Plasticity ...................................................................... 47
3.5.3
Clustering Large Populations................................................................................ 49
3.5.4
Estimating the Time Scale of Maximum Accuracy.............................................. 51
3.6
Validation using Barrel Cortex Data............................................................................. 52
3.7
Discussion and Conclusion ........................................................................................... 56
Chapter 4: Inferring Neuronal Effective Connectivity using Dynamic Bayesian Networks........ 59
4.1
Dynamic Bayesian Network Theory............................................................................. 59
4.2
Learning Bayesian Networks........................................................................................ 63
4.3
DBN and Spike Trains .................................................................................................. 65
4.4
Simulation Model.......................................................................................................... 67

vi

4.5
Results........................................................................................................................... 69
4.5.1
Networks with Fixed Synaptic Latency................................................................ 70
4.5.2
Mismatch between Actual Synaptic Latency and DBN Markov Lag................... 75
4.5.3
Networks with Variable Synaptic Latencies......................................................... 78
4.5.4
Identifying Monosynaptic Connectivity ............................................................... 79
4.5.5
Networks with Unobserved and Independent Neurons ........................................ 82
4.5.6
Identifying Connectivity in Large Populations..................................................... 86
4.5.7
Estimating the Markov Lag of Maximum Accuracy ............................................ 88
4.6
Discussion and Conclusion ........................................................................................... 90
Chapter 5: Millisecond-Timescale Stimulus-specific Network Coding ....................................... 93
5.1
Rat Barrel Cortex .......................................................................................................... 93
5.2
Methods......................................................................................................................... 95
5.2.1
Barrel Cortex Recording ....................................................................................... 95
5.2.2
Single Unit Analysis ............................................................................................. 98
5.2.3
Dynamic Bayesian Networks Analysis................................................................. 99
5.2.4
Network Similarity and Network Information...................................................... 99
5.2.5
Decoding Whisker Identity ................................................................................. 102
5.2.6
Crosscorrelogram Analysis................................................................................. 103
5.2.7
Predicting Single Neuron Firing from Pre-synaptic Peers.................................. 104
5.3
Results......................................................................................................................... 106
5.3.1
Firing Characteristics .......................................................................................... 106
5.3.2
Whisker-specific Networks................................................................................. 108
5.3.3
Relating Network Properties to Individual Neuronal Responses ....................... 116
5.3.4
Crosscorrelogram Comparison ........................................................................... 118
5.3.5
Network Model-based Prediction of Single Neuron Firing................................ 120
5.4
Position-specific Networks in the Medial Prefrontal Cortex...................................... 122
5.5
Discussion and Conclusion ......................................................................................... 124
Chapter 6: Network Dynamics Associated with Experience-dependent Plasticity in the Rat
Somatosensory Cortex................................................................................................................. 130
6.1
Experience-dependent Plasticity in the Barrel Cortex ................................................ 130
6.2
Methods....................................................................................................................... 132
6.2.1
Barrel Cortex Recording and Electrode Implantation ........................................ 132
6.2.2
Data Analysis ...................................................................................................... 134
6.3
Results......................................................................................................................... 135
6.3.1
Changes in Individual Neuron Response............................................................ 135
6.3.2
Changes in Network Response ........................................................................... 137
6.4
Discussion and Conclusion ......................................................................................... 139
Chapter 7: Concluding Remarks and Future Directions ............................................................. 142
REFERENCES ........................................................................................................................... 148

vii

LIST OF TABLES

Table 4.1:Conditional probabilities Pr(W|R) and Pr(U|R)............................................................ 61
t+1

Table 4.2: Conditional probabilities Pr(S

t

t

| S , R )..................................................................... 62

viii

LIST OF FIGURES
Figure

2.1:
(a)
Simple
neuron
structure
(Adapted
from
www.solarnavigator.net/human_brain.htm). (b) Two neurons in contact. (c) A train
of three action potentials or spikes. (d) Excitatory post-synaptic potential (top) and
inhibitory post-synaptic potential (bottom) (Adapted from [32]). For interpretation of
the references to color in this and all other figures, the reader is referred to the
electronic version of this dissertation. ......................................................................... 8

Figure 2.2: (a) Micro-wire arrays. Wires made of tungsten with a diameter of 50μm and a
minimum inter-wire separation of 250μm (Adapted from www.tdt.com). (b) Siliconbased multi-electrode array (Michigan array) with (Left) 8 shanks and (Right) 4
shanks and 4 electrodes/shank [44]. (c) Another example of silicon-based arrays
(Utah array) with 100 single electrodes [45]. ............................................................ 10
Figure 2.3: Extracellular recording pre-processing. Spikes are first detected (shown in red) from
the filtered raw signal. Characteristic features are extracted from the detected spikes
shape (using principal component analysis in the diagram). Each point in the feature
space represents one spike. A clustering algorithm is then used to identify the spikes
belonging to each neuron. In this example, 3 different neurons are recorded (red,
blue and magenta) while the black cluster represents miss-detected spikes (i.e. noise).
................................................................................................................................... 12
Figure 2.4: (a) Crosscorrelogram and (b) Post Stimulus Time Histogram (PSTH) and Joint PSTH
(JPSTH) computed from the spike trains of two neurons in the rat somatosensory
(barrel) cortex in response to whisker stimulation. Bin size was set to 10 ms in both
figures. The crosscorrelogram in (a) indicates an excitatory interaction between the
two neurons as indicated by the significant narrow peak around 10 ms. The JPSTH in
(b) indicates that the significant synchrony between the two neurons occurs about 30
ms post stimulus presentation and therefore may indicate that the stimulus represents
the cause of the synchrony. ....................................................................................... 14
Figure 2.5: (a) GLM of a 2-neuron population used to fit spike trains from stimulus-driven retinal
ganglion cell (RGC) responses in vitro. Each neuron input is processed by a stimulus
filter ki, a post-spike filter αii and coupling filters that capture other neurons’
dependencies, αij’s. (b) Parameters of an example retinal ganglion ON cell model:
Temporal and spatial components of center and surround filter, post-spike filter and
coupling filters from neighboring OFF and ON cells. (Adapted from [83]). ............ 20
Figure 2.6: (a) Undirected graph of a sample neuronal population where edges indicate the
membership of the connected neurons in the same cluster. (b) Directed graph of the
same population in (a) where edges indicate the causal influence among the neurons.
(c) Sample small-world network. Connections are mainly local with the exception of
a few distal connections (neurons 2 to 6 and neurons 5 to 8). (d) Sample scale-free
network, where a small number of neurons appears as hubs (neurons 2 and 8)........ 24
ix

Figure 3.1: Synaptic weights for (a) Mpq = 40 bins (Hpq = 120 ms) and (b) Mpq =120 bins (Hpq
= 360 ms). In both cases, Apq = 2.5, B = 2000 and F = 3000. When the history
interval is limited to 40 bins, the function exhibits smaller time constant within the
short time support where the function is non-zero. In contrast, the function exhibits
larger time constant when Mpq =120. ........................................................................ 30
Figure 3.2: Two-level Haar scale space representation of a sample spike train. .......................... 32
Figure 3.3: A sample undirected graph with 4 objects and arbitrary edge weights for which the
spectral clustering algorithm clusters objects with strong similarity (small edge
weight indicates stronger similarity between objects)............................................... 35
Figure 3.4: Pseudocode of the multiscale clustering method. ...................................................... 36
Figure 3.5: (a) Network topology for the independent clusters population (solid lines), I =
inhibition, E = excitation. (b) Sample 3-sec trace of the obtained spike trains. The
initialization period is indicated by a black bar. (c) Sample ISI histogram for a
neuron in the population and an independent neuron at 12 ms and 360 ms history
intervals, respectively. Coefficient of variation of the ISI histogram. (d) Augmented
correlation matrix using time scales up to 96 ms. (e) Average clustering accuracy vs.
time scale for a fixed 360 ms history interval obtained using probabilistic spectral
clustering and k-means clustering.............................................................................. 38
rd

Figure 3.6: (a) Clustering accuracy vs. time scale fitted with 3 order polynomial for the
network in Figure 3.5a for variable history intervals. (b) (Gray dots) Time scale of
maximum clustering accuracy obtained for the 25 trials vs. the population true
history interval. (Black dots) Average of maximum accuracy time scales. .............. 41
rd

Figure 3.7: (a) Average clustering accuracy ± SD vs. time scale fitted with 3 order polynomial
3

4

1

2

3

4

for different settings of M and M (M and M fixed to 20 bins). H = H = H . (b)
(Gray dots) Time scale of maximum clustering accuracy vs. history interval of
clusters 3 and 4 for each of the 25 trials. (Black dots) Average of maximum accuracy
time scales.................................................................................................................. 42
Figure 3.8: (a) Average clustering accuracy ± SD over all trials vs. time scale. (b) Average
clustering accuracy using crosscorrelograms for different window sizes. (c) Average
clustering accuracy using JPSTH for different window sizes. .................................. 43
Figure 3.9: (a) Network topology with across-cluster synaptic coupling (dotted lines). (b)
Average clustering accuracy vs. time scale. (c) Augmented R matrix using time
scales up to 96 ms. ..................................................................................................... 46

x

Figure 3.10: (a) Network topology with paired stimulation to induce STDP. (b) EPSP slope as a
function of pre-post synaptic spikes amplitude (A) and time constant (τ) were chosen
as 1.01 and 14.8 ms for LTP, and -0.52 and 33.8 ms for LTD, respectively. (c)
Average clustering accuracy post stimulation phases assuming 4 clusters and 2
clusters. (d) Similarity matrix using time scales up to 96 ms post the 10 stimulation
phases and (e) post 120 stimulation phases. .............................................................. 48
Figure 3.11: (a) Network topology for a population of 120 neurons, where each cluster has 30
neurons. The bidirectional connections refer to an excitation connection in one
direction and an inhibition in the opposite one. Black connections represent the
within cluster connections, while gray connections represent the across clusters 1 and
2 connections that undergo functional plasticity. (b) Average pair-wise distance
between neurons in clusters 1 and 2 for variable connectivity strengths. A
connectivity strength of 1 indicates a connection across-clusters of the same strength
as a connection within clusters. (c) Average clustering accuracy for variable acrossclusters connectivity strengths (clusters 1 and 2). ..................................................... 50
rd

Figure 3.12: Average Dunn index ± SD over all trials vs. time scale fitted with 3 order
polynomial for the network in Figure 3.5a for different history intervals................. 52
Figure 3.13: (a) Sample stimulus-specific networks formed of clusters for 3 different whiskers
(C1, D1 and D2) of rat R3. (b) Network feature space of rat R3 for 3 different
whiskers (C1, D1 and D2). Each dot corresponds to the projection of one network
onto a 2-dimension principal components (PC1 and PC2) feature space. (c) Network
information for rat R3 as a function of the number of clusters used for each of
whiskers C1, D1 and D2............................................................................................ 54
Figure 3.14: (a) Histogram of the spatial difference between the connected neurons quantified in
terms of the horizontal and vertical separations between the electrodes on which
neurons were recorded. (b) Histogram of the temporal difference between the
connected neurons quantified in terms of the difference in the connected neurons’
response latency to the stimulus. The number of clusters for each rat in both (a) and
(b) was set to the number that maximizes the mutual information between the
networks and the stimulus. ........................................................................................ 55
Figure 4.1: (a) An example of a Bayesian Network. Random variable R indicates if it rains or not,
W indicates if the grass is wet or not, and U indicates if people bring umbrella or not.
(b) An example of a Dynamic Bayesian Network. Each morning an automatic
sprinkler randomly decides whether it should water the lawn or not conditioned on
whether it watered the day before and whether there was rain the day before.......... 61
Figure 4.2: Schematic of the DBN structure search algorithm..................................................... 64
Figure 4.3: (a) A simple causal network where nodes represent neurons and edges represent
connections. lij indicates the synaptic latency associated with each connection. (b)

xi

Corresponding DBN where black nodes represent the neuron firing state at a specific
Markov lag while white nodes represent the present state......................................... 66
Figure 4.4: An illustrative graph of α ij (t ) where lijΔ is the synaptic latency, Aij models the
+

connectivity strength, Mij/3000 is the time constant of the decaying exponential
where Mij denotes the number of history bins, and MijΔ is the history interval........ 68
Figure 4.5: (a) DBN performance vs. number of excitatory input (pre-synaptic) connections.
Inset: Connection strengths Aij for each choice of that number. (b) DBN performance
vs. firing history interval length. Inset: Connection strengths Aij for each choice of
this length. (c) DBN performance (solid) vs. ratio of inhibitory to excitatory synaptic
strength (I/E Ratio). The accuracy gets closer to unity as inhibition strength surpasses
excitation strength (I/E ratio increases above 1). Partial Directed Coherence (PDC)
(dashed) and Generalized Linear Model (GLM) fit (dotted) performance shown for
comparison. (d) Coefficient of variation for the data analyzed in (c). (e) DBN
performance vs. background rate............................................................................... 71
+
Figure 4.6: (a) Coupling function α ij (t ) for different history intervals M (in bins). (b) DBN

performance vs. Markov lag for variable history intervals and fixed synaptic latency
(4 bins).(c) PDC performance vs. model order for the data analyzed in (a). (d) GLM
fit performance vs. detection threshold for the data analyzed in (b). (e) GLM fit
performance vs. GLM parameter WGLM when the model parameter M was set to 60
bins..............................................................................................................................76
Figure 4.7: (a) A simple network illustrating the dependence on the synaptic heterogeneity index
(HI). The subpopulation inside the dashed circle has a HI of 3 while the entire
population (neurons 1 to 6) has a HI of 5. The firing history was fixed at 60 bins (180
ms). (b) DBN, PDC and GLM-fit performance vs. HI.............................................. 79
Figure 4.8: (a) Example 3-neuron chain, where solid arrows represent true connections while the
dotted one represents a spurious connection, l indicates the synaptic latency
associated with each connection. (b) Average number of spurious connections
inferred per chain....................................................................................................... 80
Figure 4.9: (a) Network structure examined where black nodes 1-5 in the dashed rectangle
represent unobserved neurons. Solid arrows indicate real connections while dashed
ones indicate the expected connections to be inferred by DBN. (b) DBN performance
vs. difference between the synaptic latency of the connections to odd numbered
neurons and even numbered neurons (lo − le) where le is fixed at 1 bin. (c) A network
of 20 neurons, where white nodes indicate observed neurons while black nodes
indicate unobserved neurons. (d) A network with 15 observed neurons; neurons 1-10
(inside the dashed circle) receive connections while neurons 11-15 are not connected
to any neurons. The history interval was set to 60 bins (180ms) and the synaptic
latency was fixed to 1 bin (3 ms) for all connections. ............................................... 84
xii

Figure 4.10: (a) DBN performance vs. number of neurons/subpopulation for different search
intervals. 10 different populations of 120 neurons each were simulated. (b) Search
time required by DBN to achieve 100% accuracy (F-Measure = 1) vs. number of
neurons/subpopulation. (c) DBN performance vs. search time for 120 neuron
populations with no clusters. ..................................................................................... 87
Figure 4.11: (a) Mean influence score (IS) vs. Markov lag for variable history intervals for the
same data analyzed in Figure 4.6a. Note that the mean IS has similar profile to the
accuracy. (b) DBN performance vs. Markov lag for networks with HI equals 3.
Maximum accuracy is attained when the maximum Markov lag matches the
maximum synaptic latency in the population. (c) Mean influence score vs. Markov
lag for networks with HI equals 3.............................................................................. 89
Figure 5.1: (a) (Left) The rat’s right snout with the whiskers organized in rows, indexed in letters,
and columns, indexed in numbers. (Right) A horizontal section of the barrel cortex at
layer IV showing the whiskers mapping (Adapted from [197]). (b) Flow of
information from the Thalamus through the barrel system. The thickness of the
arrows correspond to the strength of the connections [198]...................................... 94
Figure 5.2: (a) Experimental setup where rats were anesthetized and whiskers were deflected one
at a time using a capillary tube glued to a piezoelectric bimorph. (b) Whiskers were
deflected horizontally for 100 ms. Red curve indicates whisker displacement......... 97
Figure 5.3: Firing characteristics of the recorded neurons. Two sample neurons: (a) Neuron 2a
and (b) Neuron 10a from rat R2 response to stimulation of whiskers C2, D1 and D2.
(Top) Whisker displacement. (Middle) Spike raster over multiple repeated trials.
(Bottom) Post-stimulus Time Histogram (PSTH) with 0.5ms bin size. Neuron 2a
shows stronger and faster response to whisker C2 than other whiskers while Neuron
10a shows a slightly stronger and faster response to whisker D1. (c) Nissl stained
coronal section (50 μm) in rat R5. This rat was chronically implanted over 35 days.
Dashed curve indicates the original shape of the section that was damaged due to the
implant. Black arrowhead points to an electrolytic lesion mark of the deepest
recording site on one of the shanks of the multi-electrode array. The depth of the
lesion mark (~1250 μm) is consistent with the depth recorded using the
micromanipulator during the surgery and corresponds to layer Vb of the barrel cortex
(1.1 mm posterior and 5.2 mm lateral to bregma). (d) Peak PSTH counts of each
neuron for each whisker versus its mean first-spike latency (n = 240). Each dot
corresponds to the response of a single neuron to a single whisker. (e) Normalized
mutual information between spike count/first-spike latency and whisker identity.
Each dot corresponds to one neuron. Black diagonal line represents equal
information (n = 80)................................................................................................. 107
Figure 5.4: Sample whisker-specific networks. (a) DBN networks inferred from one population
(Rat R2) for 3 individually stimulated whiskers: C2, D1 and D2. Undirected links
indicate bidirectional connections. Network of each whisker was inferred from a

xiii

dataset of length 18 sec (180 trials x 100 ms). (b) Connection probability in the
DBNs as a function of the horizontal and vertical separations between the electrodes
on which neurons were recorded. The number of connections inferred at each
distance was normalized by the corresponding total number of possible connections.
................................................................................................................................. 110
Figure 5.5: Networks similarity within- and across-whiskers. (a) Network feature space of rat R2
for 3 different whiskers (C2, D2 and D1) and a shuffled dataset. Each dot
corresponds to the projection of one network onto a 2-dimension principal
components (PC1 and PC2) feature space. Insets: Sample networks from the 3
whiskers. Black edges represent common connections with the top left sample
network (whisker C2). (b) Similarity between networks inferred for the same
whisker and between networks inferred for different whiskers averaged across
subjects (mean ± SD). Similarity for a given pair of networks was quantified as 1 –
the normalized distance between the projections of the pair in the principal
component space. Within whisker similarity was corrected for the bias resulting from
the overlap in the data (estimated from the shuffled data). The figure indicates that
the within whisker networks cluster more closely compared to across-whisker
networks. *P < 1e-6, two-sample t-test. (c) Normalized mutual information between
each of the networks inferred from the original and shuffled data, and the stimulus
averaged across subjects (mean ± SD). *P < 0.01, two-sample t-test. .................... 112
Figure 5.6: Overlap in the original and the shuffled datasets. (a) Distribution of the amount of
overlap between any pair of datasets for (Right) the original data and (Left) the
shuffled data. Blue curve indicates a binomial distribution fit with parameters p = 0.2
and n = 180. Both figures indicate that both the original and the shuffled data have
the same degree of overlap where any given pair of datasets would have an overlap
of 20±4%. (b) Network feature space of the shuffled datasets extracted from rat R2
data. Each dot corresponds to the projection of one network onto a 2-dimension
principal components (PC1 and PC2) feature space. (c) Similarity between networks
inferred for the same shuffled dataset and between networks inferred for different
shuffled datasets averaged across subjects (mean ± SD). Similarity for a given pair of
networks was quantified as 1 – the distance between the projections of the two
networks in the principal component feature space................................................. 113
Figure 5.7: Individual neurons subnetworks information. (a) Network feature space of one
sample neuron (Neuron 16b of rat R2) for 3 different whiskers (C2, D1 and D2).
Each dot corresponds to the projection of one network onto a 2-dimension principal
components (PC1 and PC2) feature space. (b) Information in the individual neurons’
networks inferred for the shuffled data versus the original data (mean ± SD). * P = 0,
two-sample t-test. (c) Normalized information in first-spike latency and spike count
versus normalized information in the networks of each neuron. Each dot corresponds
to one neuron. Black diagonal line represents equal information. Network
information was corrected for any statistical bias by subtracting the network
information computed for each neuron from the shuffled data in (b). .................... 115

xiv

Figure 5.8: Network information in the original data is consistently higher than that in the
shuffled data, independent of the bin size. (A) Network information in the original
and the shuffled data as a function of the bin size used to estimate the mutual
information averaged across the 5 subjects (mean ± SD). (B) Information in the
network of individual neurons computed from the original and the shuffled data as a
function of the bin size used to estimate the mutual information, averaged across the
80 neurons (mean ± SD). ......................................................................................... 116
Figure 5.9: Network properties and individual neuronal responses. (a) Histogram of the
difference between the mean first-spike latency of the post-synaptic cells and the
pre-synaptic cells for each inferred connection. Only unidirectional connections were
counted in the histogram. Red bars indicate the fraction of connections consistent
with the difference between latencies while blue bars indicate connections that are
not. (b) Ratio between the number of outgoing connections and ingoing connections
for each neuron (source index) as a function of its mean first-spike latency for each
whisker. Each dot corresponds to one neuron for a given whisker (n = 240). Z-scores
of the mean first-spike latency and the source index are reported on the X-axis and
the Y-axis, respectively. Gray curve indicates decaying exponential fit. (c)
Information in the networks of each neuron as a function of the source index
averaged across whiskers (n = 80). Each dot corresponds to the standardized z-scores
of one neuron. Gray line indicates regression line. ................................................. 117
Figure 5.10: Comparison with networks inferred using crosscorrelograms. (a) Histogram of the
difference between the mean first-spike latency of the post-synaptic neurons and the
pre-synaptic neurons for each connection inferred using the crosscorrelogram
technique. Only unidirectional connections were counted in the histogram. Red bars
indicate the fraction of connections consistent with the difference between latencies
while blue bars indicate connections that are not. (b) Ratio between the number of
outgoing connections and ingoing connections for each neuron (source index) as a
function of its mean first-spike latency for each whisker. Each dot corresponds to
one neuron for a given whisker (n = 240). Z-scores of the mean first-spike latency
and the source index are reported on the X-axis and the Y-axis, respectively. Gray
curve indicates decaying exponential fit. (c) Fraction of possible chain effect-induced
connections and common input-induced connections inferred by crosscorrelogram
and DBN (mean ± SD). * P < 0.001, two-sample t-test. ......................................... 119
Figure 5.11: Predicting single neuron firing. (a) ROC curves of sample neuron 2b in response to
whisker C2 deflection (left) and neuron 17b in response to whisker D2 deflection
(right) in rat R2. Predicted spike trains were obtained using the pre-synaptic cells’
history inferred by DBN and stimulus onset time, pre-synaptic cells’ history inferred
by the GLM and stimulus onset time, and the stimulus onset time only. Each curve
was computed from tenfold cross-validation datasets. (b) Predictive power
comparison across all rats for multiple models (mean ± SD). * P < 0.001, twosample t-test. (c) GLM predictive power versus that obtained using DBN and
stimulus onset time. Black diagonal line represents equal prediction (n = 240). (d)
Predictive power obtained using DBN and stimulus as a function of the variability in

xv

mean first-spike latency. (e) Difference between predictive power for each neuron
obtained using DBN and stimulus and that obtained using GLM as a function of the
variability in mean first-spike latency. Each dot corresponds to one neuron for a
given whisker (n = 240). Gray lines in (d) and (e) indicate regression lines........... 121
Figure 5.12: Partitioning the spike trains to obtain position-specific datasets for 24 right-turn
trials (similar analysis for 18 left-turn trials)........................................................... 123
Figure 5.13: Networks inferred from left and right trials data for different maze position intervals.
Only connections appeared in at least 3 out of the 8 datasets of each position interval
are shown. Clear dissimilarity between network models of left and right trials can be
seen. ......................................................................................................................... 125
Figure 5.14: (a) Similarity between the inferred networks at different maze positions computed
using the F-Measure (equation 4.9) averaged across both sides. On-diagonal entries
are eliminated. (b) Average similarity between the inferred networks vs. distance
between maze position intervals. Values were extracted from the matrix shown in (a).
It can be seen that networks become increasingly dissimilar as a function of the
distance between position intervals, suggesting a differential population encoding
mechanism that depends on the behavioral goal. .................................................... 126
Figure 6.1: (a) Timeline of the experiment. (b) Difference in the evoked response and (c) in the
first-spike latency across the spared whiskers averaged across neurons and subjects
for the control data (blue) and the plasticity data (red) recorded 1-2 days and 6-7
days post whisker-trimming. Y-axis was normalized by the maximum evoked
response in (b) and the maximum first-spike latency in (c) across the spared whiskers
for each neuron. *P < 0.05, two-sample t-test......................................................... 136
Figure 6.2: (a) Control (left) and 7-days post whisker-pairing (right) network feature space of a
sample rat for 3 different whiskers (D4, D5 and D6). Whiskers D4 and D5 were
paired after recording the control data. Each dot corresponds to the projection of one
network onto a 2-dimension principal components (PC1 and PC2) feature space. (b)
Network similarity across spared whiskers for the control data (blue) and data
recorded 1-2 days and 6-7 days post-pairing (red) averaged across 4 subjects (mean ±
SD). Similarity for a given pair of networks was quantified as 1 – the normalized
distance between the projections of the pair in the principal component space. *P <
0.05, two-sample t-test............................................................................................. 138
Figure 7.1: A DBN model that distinguishes between signal and noise correlations. Gray node
(W) takes a value ∈ {0, 1}, where 0 indicates no whisker deflection, while 1
indicates the deflection of any whisker. Dotted edges indicate signal influence while
solid edges indicate noise influence. Signal correlation can be inferred from the
graph by searching for neurons receiving connections from W (such as neurons 1 and
2) whereas noise correlation can be estimated by searching for neurons receiving
connections from a common parent (such as neurons 3 and 4)............................... 145

xvi

LIST OF ABBREVIATIONS

AUC

Area Under the Curve

DBN

Dynamic Bayesian Networks

EPSP

Excitatory Post-synaptic Potential

GLM

Generalized Linear Model

HI

Heterogeneity Index

IPSP

Inhibitory Post-synaptic Potential

IS

Influence Score

ISI

Inter-spike Interval

JPSTH

Joint Post-stimulus Time Histogram

mPFC

Medial Prefrontal Cortex

PC

Principal Component

PDC

Partial Directed Coherence

PSTH

Post-stimulus Time Histogram

ROC

Receiver Operating Characteristics

SD

Standard Deviation

STDP

Spike Timing-dependent Plasticity

xvii

Chapter ____________________________________________

1

Introduction

______________________________________________________________________________

Hundred billion neurons with a total of two million miles of axons and trillion connections
make the human brain the most complex organ in the human body [1, 2]. Studying the brain and
its inner workings is therefore an extremely complex task. The functionality of the brain is
known to lie in the enormous number of connections among the neurons that interact, rather than
act independently. These interactions mediate many complex brain functions along numerous
parallel and sequential pathways with highly intricate network structures. The massive size of
these networks suggests the vastly complex information processing mechanism that underlies
their operation [3, 4]. Many nervous system diseases and disorders are also known to result from
abnormal changes in the function and/or the structure of these networks [5-7]. Deciphering the
neural circuitry underlying the observed behavior is therefore of utmost importance in systems
and clinical neuroscience.
The long standing goal of inferring neural circuitry in vivo has always been challenged by the
lack of technology that enables simultaneous recording of brain activity at the individual neuron
level. Despite the emergence of different non-invasive techniques to record brain activity such as
Electroencephalogram

(EEG),

Functional

Magnetic

Resonance

Imaging

(fMRI)

and

Magnetoencephalogram (MEG), these techniques do not directly measure the activity of
individual neurons [8]. They rather measure the aggregate activity in a certain brain area which
1

leads into low spatial and temporal resolutions. On the other hand, invasive recording techniques
have progressed over decades from recording the activity of an individual neuron using a single
penetrating electrode to simultaneously recording the activity of multiple neurons using
microelectrode arrays [9-12].
Implantable microelectrode arrays have enabled scrutinizing activity from cortical neurons at
an unprecedented scale, and greatly accelerated our ability to monitor functional alterations of
cortical networks in awake, behaving subjects [13-15]. The concomitant neurobiological
discoveries have paved the way for these devices to become an essential constituent in
neuroprosthetic devices to provide assistive technology to people with severe disability.
Consequently, capitalizing on these advances by developing data analysis tools to identify
neuronal connectivity from simultaneously recorded neuronal activity becomes the next essential
step towards understanding how the brain encodes information in the population activity.
There has been extensive research in the past few decades to develop methods to discover
neuronal connectivity from simultaneously recorded spike trains. Neuronal connectivity can be
classified into two different types: anatomical connectivity and inferred connectivity. Anatomical
connectivity refers to the direct unidirectional synaptic links between the neurons which can be
excitatory or inhibitory. On the other hand, inferred connectivity refers to the connectivity that
can be identified between observed neurons in the presence or absence of a specific stimulus
presentation. The inferred connections represent statistical dependence between the observed
neurons that may or may not necessarily match the anatomical connections. Two types of
inferred connectivity are frequently used in the literature: functional connectivity and effective
connectivity [16, 17]. Functional connectivity refers to the undirected statistical dependence
between the observed variables that is often represented in the form of clusters [18, 19]. At the
2

neuronal level, this can result from the presence of a synaptic link between the neurons, or can be
observed when two unlinked neurons respond to a common driving input, for e.g., an unobserved
neuron. Effective connectivity, on the other hand, refers to the causal relationships governing this
dependence and involves identifying the putative pre- and post-synaptic neurons in the observed
mixture. Existing neuronal connectivity inference methods include cross correlation-based
methods, spike pattern classification, likelihood, information theory and frequency-domain
methods [16, 20-22]. However, many of these methods were developed for analyzing single
electrode recordings, and therefore are constrained by the number of neurons that a single depth
electrode can pick, mostly pairs or triplets of neurons [23, 24]. When large-scale recording of
multiple single units became more common, usage of these methods to infer a complex dynamic
network structure has shown to be a nontrivial task. In addition some of these methods suffer
from a number of limitations such as relying on pair-wise correlations and detecting linear
relationships only.
The objective of this dissertation is to:
Develop advanced graph-based techniques to infer neuronal connectivity in large scale
ensembles and demonstrate their analysis of simultaneously recorded spike trains
Graph-based techniques have become increasingly popular in the analysis of large scale
neural data [25, 26]. Advantages of graph-based techniques over classical measures of neuronal
connectivity include the ability to consider the activity of the entire observed neuronal
population rather than relying on pair-wise measures, the ability to detect non-linear
relationships and the intuitive graphical representation of the inferred neuronal networks which
allows visualizing and quantifying the properties of these networks as well as the plastic changes
that occur to these networks, for example, as a result of learning and memory.
3

This dissertation reviews previous methods that have been proposed to identify neuronal
connectivity from population activity and discusses their utility as well as their limitations.
Graph-based methods are emphasized as a way to circumvent many of the limitations of the
reviewed methods. We introduce two graphical approaches in this dissertation: First, the
multiscale clustering algorithm for inferring functional connectivity which attempts to find
disjoint subspaces (clusters) with arbitrary structures in the entire neural space using unlabeled
data [27]. Second, using Dynamic Bayesian Networks (DBNs) to find the intrinsic structure of
the system underlying each subspace by transforming the inference problem to a structure
learning problem for which many efficient search algorithms exist [28]. In addition to validating
the introduced approaches using data simulated by biologically-plausible models, we also
provide evidences for the validity of these approaches by applying them to data recorded from
the rat somatosensory cortex. Together, the two algorithms constitute a two-stage framework for
statistical inference in large scale neural population data [29].
Chapter 2 provides the background and overview of the research presented in this
dissertation. The chapter begins with a brief introduction of the biological aspects of neural
activity since it is crucial to the understanding of the nature of the analyzed data. This is followed
by the pre-processing steps required to extract neuronal spike trains from recorded extracellular
activity. We then survey the most-widely used connectivity inference methods, stating the
advantages and the limitations of each method. We finally introduce the utility of graph-based
methods as a general framework for inferring neuronal connectivity.
Chapter 3 suggests a multiscale clustering approach to identify functional connectivity
between neuronal elements from their simultaneously recorded spike trains [27]. In particular,
we identify clusters of neurons that exhibit functional interdependency over variable spatial and
4

temporal patterns of interaction. We represent neurons as objects in a graph and connect them
using arbitrarily-defined similarity measures calculated across multiple time scales. We then use
a probabilistic spectral clustering algorithm to cluster the neurons in the graph by solving a
minimum graph cut optimization problem. Using point process theory to model population
activity, we demonstrate the robustness of the approach in tracking a broad spectrum of neuronal
interaction, from synchrony to rate co-modulation, by systematically varying the length of the
firing history interval and the strength of the connections that govern the discharge pattern of
each neuron. We also demonstrate how activity-dependent plasticity can be tracked and
quantified in multiple network topologies built to mimic distinct behavioral contexts. We
compare the performance to classical approaches to illustrate the substantial gain in performance.
We finally validate the approach using real data recorded from the rat somatosensory cortex.
Chapter 4 investigates the applicability of Dynamic Bayesian Networks (DBNs) in inferring
the effective connectivity between spiking cortical neurons from their observed spike trains [28].
We demonstrate that DBNs can infer the underlying non-linear and time-varying causal
interactions between these neurons and can discriminate between mono- and poly-synaptic links
between them under certain constraints governing their putative connectivity. We analyzed
conditionally-Poisson spike train data mimicking spiking activity of cortical networks of small
and moderately-large sizes. The performance was assessed and compared to other methods under
systematic variations of the network structure to mimic a wide range of responses typically
observed in the cortex. Results demonstrate the utility of DBN in inferring the effective
connectivity in cortical networks.
Chapter 5 presents a comprehensive study in which stimulus-specific networks are inferred in
layer V of the rat somatosensory cortex [30]. We examined spike timing correlation between
5

simultaneously recorded layer V neurons within and across columns of the primary
somatosensory cortex of anesthetized rats during unilateral whisker stimulation. For each
stimulated whisker, we inferred stable, whisker-specific, DBNs over many repeated trials, with
network similarity of 83.3±6% within whisker, compared to only 50.3±18% across whiskers.
These networks provided information about whisker identity that was 6 times higher than that
provided by the latency to first spike and 22 times higher than that provided by the spike count of
individual neurons. Furthermore, prediction of individual neurons’ precise firing conditioned on
knowledge of putative pre-synaptic cell firing was 3 times higher than predictions conditioned on
stimulus onset alone. These results suggest the presence of a temporally precise network coding
mechanism that integrates information across neighboring columns within layer V about vibrissa
position to mediate whisker movement by motor areas innervated by layer V. In addition, we
demonstrate the inference of stable position-specific networks in the rat medial prefrontal cortex
(mPFC) during a working memory task [29].
Chapter 6 demonstrates the use of DBNs to track plastic changes in layer V of the rat
somatosensory cortex resulting from sensory alteration induced by trimming all whiskers but two,
referred to as whisker pairing. Stimulus-driven activity before pairing and after 1-2 days and 6-7
days of pairing were analyzed using DBN. An increased similarity between the networks inferred
for the spared whiskers was observed as a result of pairing where the similarity for the control
data was 0.58±0.1, while after 1-2 days of pairing it increased to 0.66±0.1, followed by a further
increase after 6-7 days to 0.71±0.1. These changes are consistent with previous studies that
hypothesized that plasticity resulting from whisker pairing follows Hebb’s postulate of “Neurons
that fire together, wire together”.
Finally, Chapter 7 provides concluding remarks and future directions.
6

Chapter ____________________________________________

2

Background and Overview

______________________________________________________________________________

2.1 Neural Activity
Neurons have four distinct regions that can be ordered according to the signal flow direction
as: the dendrites, the cell body (the soma), the axon, and the axon terminals as shown in Figure
2.1a [31]. The dendrites serve as the antennae of the neuron through which the neuron receives
inputs from other neurons. When the received inputs are strong enough, the cell body decides to
fire an output signal, called an action potential or a spike, shown in Figure 2.1b. The spike is then
propagated through the axonal terminals that deliver the output of the neuron to other adjacent
and distal neurons.
Neurons communicate with each other through specialized sites called synapses as shown in
Figure 2.1c [31]. A synapse is the point at which the axon terminal of the pre-synaptic neuron
(transmitter neuron) comes in contact with the dendrites of the post-synaptic neuron (receiver
neuron). In general, connections between neurons are unidirectional, and are either excitatory or
inhibitory. A spike from the pre-synaptic neuron increases the post-synaptic neuron firing
probability in case of an excitatory connection, while for an inhibitory connection, it decreases
the post-synaptic neuron firing probability. A neuron generates a spike if the integration of the
inputs coming through the dendrites is higher than a predefined membrane threshold. In case of

7

Dendrites

AP Direction

Cell Body

Synapse

Axon

(b)

Post-synaptic
Cell
EPSP

Memb. Potential

+50 mV

(a)
Action Potential

Pre-synaptic
Cell

Axon
Terminals

10 - 500 ms
10 - 500 ms

-60 mV

Time
IPSP
2 ms
(c)
(d)
Figure
2.1:
(a)
Simple
neuron
structure
(Adapted
from
www.solarnavigator.net/human_brain.htm). (b) Two neurons in contact. (c) A train of three
action potentials or spikes. (d) Excitatory post-synaptic potential (top) and inhibitory postsynaptic potential (bottom) (Adapted from [32]). For interpretation of the references to color in
this and all other figures, the reader is referred to the electronic version of this dissertation.

an excitatory connection, a spike from the pre-synaptic neuron leads to an increase in the
membrane potential of the post-synaptic neuron. This increase is called excitatory post-synaptic
potential (EPSP), which is shown in Figure 2.1d [32]. For an inhibitory connection, a presynaptic spike causes an inhibitory post-synaptic potential (IPSP) leading to a decrease in the
membrane potential of the post-synaptic cell.
For both EPSP and IPSP, the membrane potential decays back to the resting potential 10 to
500 ms (depending on the synapse properties) after the occurrence of the pre-synaptic spike at
time 0. This implies that the effect of a pre-synaptic spike on the firing of the post-synaptic
neuron lasts for a certain period of time during which that effect varies. The influence of that pre-

8

synaptic spike on the post-synaptic neuron firing at a certain time instant is the amount of change
caused by the corresponding EPSP or IPSP signal in the post-synaptic neuron membrane
potential at that time instant. Thus from the post-synaptic neuron perspective, it considers at each
time instant the effects caused by previous pre-synaptic spikes, which leads to the notion of the
firing history. The firing history of pre-synaptic neurons plays an important role in governing the
firing of the post-synaptic neurons.

2.2 Recording and Pre-processing of Extracellular Neural Activity
Techniques for recording brain activity can be classified into two categories: non-invasive
techniques

and

invasive

Electroencephalogram

techniques.

(EEG),

Functional

Examples
Magnetic

of

non-invasive

Resonance

techniques

Imaging

(fMRI)

are
and

Magnetoencephalogram (MEG). Being non-invasive, these methods measure the aggregate
activity in a certain brain area not the activity of individual neurons leading to low spatial and
temporal resolutions [8]. Numerous functional neuroimaging studies suggest that cortical regions
selectively couple to one another [33-35], for example, during complex visual stimulus
presentations [36-38], during motor prehension and execution in visuomotor tasks [39, 40], or in
working memory and auditory-related tasks [41, 42]. It is widely believed, however, that the low
temporal and spatial resolutions of these techniques do not enable the investigation of the brain’s
functional dynamics at the single neuron level [43], nor do they enable the identification of
causal relationships between multi-units across distant cortical regions.
This dissertation is mainly focused on signals recorded using invasive techniques,
specifically, signals recorded using implantable high-density microelectrode arrays [11, 12]
which have enabled scrutinizing activity from cortical neurons at temporal and spatial resolutions
9

(a)

(b)

(c)
Figure 2.2: (a) Micro-wire arrays. Wires made of tungsten with a diameter of 50μm and a
minimum inter-wire separation of 250μm (Adapted from www.tdt.com). (b) Silicon-based multielectrode array (Michigan array) with (Left) 8 shanks and (Right) 4 shanks and 4
electrodes/shank [44]. (c) Another example of silicon-based arrays (Utah array) with 100 single
electrodes [45].

(1-2 ms and 10-100 μm) much higher than what can be achieved using non-invasive recording
techniques [24, 46]. Figure 2.2 illustrates two major kinds of microelectrode arrays: the tungstenbased arrays and the silicon-based arrays. The geometry of these arrays enables them to record
from individual neurons extracellularly across different brain layers. The high temporal and
spatial resolution of these arrays opened the door to investigating how populations of neurons
encode information at the individual neuron level. They have been extensively used in studies
addressing crucial neuroscience questions such as how neuronal ensembles represent sensory
inputs [47-49], plan motor actions [50-52] and perform complex cognitive tasks such as decision
making [13, 53]. Although the methods presented in this dissertation are introduced to analyze
data recorded using microelectrode arrays, these methods can be applied to other signal
modalities such as spiking recorded using 2-photon Calcium imaging [54].
10

It is widely accepted in the literature that information processing in the brain is carried out
using information in the spike event timing and/or rate, and not in the spike shape. Given that
extracellular recording of ensembles of neurons is inevitable with multi-electrode arrays, spike
sorting typically takes place to convert the neural recording mixture to spike trains of the
individual neurons observed as illustrated in Figure 2.3 [55-57]. First, the recorded raw signals
are band-pass filtered in the range [300, 5000] Hz which is the frequency band of the spikes.
Second, spikes are detected on each electrode using a spike detection algorithm which typically
identifies the time of the spikes by comparing the recorded signal amplitude to a pre-defined
threshold [58]. In the time-domain, a threshold of 3 to 5 times the standard deviation of the noise
amplitude, depending on the Signal-to-Noise Ratio (SNR), is normally used for spike detection
[59, 60]. Third, a window with a pre-defined width, typically 1 to 2 ms, around the detection
point of each spike is extracted. The extracted spikes are then aligned at either their peaks or
their troughs depending on the recorded signals. Fourth, characteristic features are extracted from
the spike shapes. Although spikes of all neurons are stereotypical with the same shape when
measured intracellularly, they would have different shapes when recorded extracellularly. For
example, the amplitude of the spikes emitted by a given neuron recorded on an electrode is
inversely proportional to the distance between that neuron and the electrode. Therefore, given
that spikes from different neurons will be recorded with different shapes, features extracted from
the spike shapes can be used for spike sorting in which the recorded spikes are assigned to
different neurons. Simple features can be extracted from the spike shapes such as the amplitude,
the spike width and rise time. However, many more accurate approaches exist for feature
extraction that lead to more compact clusters and more separation between the clusters of
different neurons in the feature space and thus better spike sorting results. The most commonly
11

Raw Data

0

1
Time (sec)

2

Detection

0
Feature
Extraction

1
Time (sec)

2

Feature Space After Clustering

Feature Space

PC2

PC 1

Clustering

PC 2

PC1
Spike
Assignment

Figure 2.3: Extracellular recording pre-processing. Spikes are first detected (shown in red) from
the filtered raw signal. Characteristic features are extracted from the detected spikes shape (using
principal component analysis in the diagram). Each point in the feature space represents one
spike. A clustering algorithm is then used to identify the spikes belonging to each neuron. In this
example, 3 different neurons are recorded (red, blue and magenta) while the black cluster
represents miss-detected spikes (i.e. noise).
12

used approach is principal component analysis (PCA) in which the first 2 or 3 principal
components are used as the classification features [59]. Finally, a clustering algorithm such as kmeans, fuzzy c-means or expectation maximization is used to cluster the feature space and
identify the spikes belonging to each neuron [61, 62].

2.3 Existing Methods for Inferring Neuronal Connectivity
The problem of inferring connectivity between neurons can be stated as follows: we observe
the spike trains of a population of N neurons, denoted S=[Si]N×T, during an interval of length T.
We want to identify the set of neurons, πi, in the set of the observed N neurons that affect the
firing of neuron i, Si, in a statistical sense. We assume that S is expressed in the form of digital
spike trains with a very small bin width Δ (usually 1-3 ms). This form assumes that at time bin m,
Si(mΔ)=1 would indicate a spike is present, while Si(mΔ)=0 would indicate no spike.
2.3.1

The Crosscorrelogram

The Crosscorrelogram is one of the oldest and most widely used methods for inferring
neuronal connectivity [23, 63]. It is computed as a histogram of spike counts for a given neuron
within a bin of pre-defined width ω, relative to the spike count of another neuron. For a given
pair of neurons (i, j), and a given lag kω, the crosscorrelogram CCij is expressed as

CC ij (kω ) =

T

kω Δ

m =1

l =1+ (k −1)ω Δ

∑ S i (mΔ )

∑ S j ((m + l )Δ )

(2.1)

The peak of this histogram at a certain lag indicates excitation, while a trough indicates
inhibition. The sign of the lag at which the peak or the trough occurs determines the direction of

13

the inferred connection. A significant peak or a trough at a positive lag indicates a connection
from neuron i to neuron j, while a significant peak or a trough at a negative lag indicates a
connection from neuron j to neuron i. An extension to the crosscorrelogram method is the Joint
Peri-Stimulus Time Histogram (JPSTH) which measures the lagged synchrony between pairs of
neurons relative to the onset of an external stimulus [21]. JPSTH results in a 2-dimensional
histogram in which the value at a given element (t1, t2) represents the number of times the first
neuron fired at time t1 and the second neuron fired at time t2 as shown in Figure 2.4. Both t1 and

PSTH of Unit 1

0.35
0.3
0.25
0.2
0.15
0.1
0.05
0
-400 -200

Average Spikes/bin

Normalized Counts/bin

Sample Crosscorrelogram

0

200

0.2
0.1
0
PSTH of Unit 2
0.2
0.1
0

400

Time (ms)

1
(a)

200
Time (ms)

400

JPSTH

400
2

0

200
0

0

200
400
Time (ms)

(b)
Figure 2.4: (a) Crosscorrelogram and (b) Post Stimulus Time Histogram (PSTH) and Joint PSTH
(JPSTH) computed from the spike trains of two neurons in the rat somatosensory (barrel) cortex
in response to whisker stimulation. Bin size was set to 10 ms in both figures. The
crosscorrelogram in (a) indicates an excitatory interaction between the two neurons as indicated
by the significant narrow peak around 10 ms. The JPSTH in (b) indicates that the significant
synchrony between the two neurons occurs about 30 ms post stimulus presentation and therefore
may indicate that the stimulus represents the cause of the synchrony.

14

t2 are measured relative to the stimulus onset. The further the distance from the diagonal, the
more lagged the synchrony is.
Both methods have a number of limitations. First, the analysis bin size is fixed and this
means that neuronal firing is assumed stationary for both neurons during this interval.. There is
ample evidence, however, suggesting that the time scale of neuronal excitability is largely timevarying, possibly to maintain a principled excitation–inhibition homeostatic balance [64, 65].
Most importantly, recent studies on spike-timing-dependent plasticity (STDP) have demonstrated
that neurochemical receptors can generate excitatory post synaptic potentials (EPSPs) at resting
potential that can last a few hundred milliseconds, which elongates the time window required for
synaptic integration and eventually alters the temporal length for neuronal excitability [64, 6671]. Second, both the CC and JPSTH methods only consider correlation between pairs of
neurons. Considering the effect of the entire population becomes important when quantifying
information transmission in large networks that is not accounted for by using pair-wise
association measures such as cross correlation. Third, cross correlation can only be interpreted as
a measure of statistical dependence when the pair-wise responses are jointly Gaussian, which
may not always be the case. Lastly, the analysis is typically limited to very short coincidence
intervals above which the effect of covariation of firing rates in response to a common driving
input cannot be ruled out [72].
2.3.2

Information-theoretic Methods

Information

theoretic

(IT)

methods

attempt

to

circumvent

the

limitations

of

crosscorrelograms by considering measures, such as mutual information [73], that rely on higherorder statistics but may differ in the neuronal response property used. The advantage is that the

15

analysis can be carried out independent of the detailed system properties as it relies on building
empirical histogram estimates of joint response distributions. In some cases, precise spike timing
is used to estimate these distributions [74-76], thereby revealing the tendency of neurons to
synchronize their firing. In others, a binned version of the spike trains is used, thereby measuring
stimulus-dependent covariation of firing rates. One example is the transfer entropy measure (TE)
defined as [77]

TE j →i =

I ⎛ Si ; S h
⎜
j
⎝
f

S ih ⎞ =
⎟
⎠

∑∑∑ Pr (

f

si , s h , sih
j

f h h
si s j si

Pr ⎛ si s h , sih ⎞
⎜
⎟
j
⎝
⎠,
log
f
Pr ⎛ si sih ⎞
⎜
⎟
⎝
⎠

)

f

(2.2)

where sif , s h and s ih are binary variables, I ⎛ S i f ; S h S ih ⎞ is the mutual information between the
⎜
⎟
j
j
⎝

⎠

f

future firing of neuron i, S i , and the past firing of neuron j, S h , that cannot be accounted for by
j
using the past firing of neuron i, S ih . This measure can also be computed using binned versions
of the spike trains.
IT methods can also be viewed as pattern detection methods that search for frequent patterns
of spikes (symbols) appearing more than chance level [78, 79]. The size of the pattern (word
length) is an important feature that has to be identified to determine if the frequency of
occurrence of a certain pattern above chance is indicative of a functional relationship. In a few
studies [80, 81], the choice of the timescale was found to yield strongly correlated network states,
even though pair-wise correlations computed over a fixed bin width was statistically insignificant.
One potential caveat of IT methods is the need for large amounts of data where the connectivity
can be assumed stationary to provide unbiased estimates of the joint distributions of the patterns.
Neural plasticity may affect these estimates if it occurs over short time scales [77].

16

The most popular among IT methods is the Maximum Entropy model (MaxEnt), rooted in
statistical physics, in which the joint density of the population is expressed as an exponential
function of the weighted individual and pair-wise activities
⎞
⎛
⎟
⎜N
N N
1
1
Pr (S1 , S 2 ,..., S N ) = exp⎜ ∑ δ i S i + ∑ ∑ γ ij S i S j ⎟
⎟
⎜
Z
2 i =1 j =1
⎟
⎜ i =1
j ≠i
⎠
⎝

(2.3)

where Z is a normalization factor, and δi and γij are the unknown parameters of the model. These
parameters are estimated using an iterative scaling algorithm so that the expected values of the
individual and pair-wise responses match the measurable information in the experiment [82] the mean spike count within a fixed window and pair-wise correlations.
Pair-wise MaxEnt models have been shown to fit experimental data in vitro for small systems
such as retinal ganglion cell cultures [83]. Their performance for larger systems (> ~20 neurons)
remains questionable [84]. Recently, we proposed a Minimum Entropy Distance (MinED)
algorithm to alleviate this curse of dimensionality when characterizing larger systems [85]. The
method attempts to find the network structure that minimizes the distance between the entropy of
single trials and the true entropy of the data, estimated from a large number of repeated trials. It
has been tested on data from biologically plausible models of spiking neurons with natural
adaptation mediated by an STDP mechanism [86]. The method demonstrated good
generalization capability given limited data available.
2.3.3

Granger Causality

Granger causality is one popular method that has been applied to study causal links between
random variables [87]. Specifically, suppose that the spike train of neuron i at time bin m can be
17

predicted given the neuron’s own firing history and that of another neuron j using the bivariate
autoregressive model

S i (mΔ ) =

K

∑ Aii (k )Si ((m − k )Δ ) +

k =1

K

∑ Aij (k )S j ((m − k )Δ ) + ε i j (mΔ )

(2.4)

k =1

where K is the maximum number of lags (model order) and the A’s represent the linear
regression coefficients obtained by minimizing the squared prediction error εi|j when Sj is used to
predict Si. Neuron j is said to Granger cause neuron i if the inclusion of Sj in equation (2.4)
reduces the variance of the prediction error. This bivariate model can be extended to a
multivariate model by summing the effect of other neurons in the population in the second and
third terms of equation (2.4) [88]. Extending the time domain model in equation (2.4) to the
frequency-domain yields the Partial Directed Coherence (PDC) and Directed Transfer Function
(DTF) approaches [89, 90]. The main advantage is the abolition of the dependence of the time
domain models on a specific bin size, under the assumption of stationarity within the analysis
interval.
Granger causality, whether computed in the time-domain or the frequency-domain, assumes
linear interactions by virtue of the autoregressive model structure. This might lead to erroneous
conclusions when nonlinear interactions occur, particularly in higher-level association cortices
[91]. For example, auditory neurons have been consistently found to perform coincidence
detection of their dendritic inputs [92, 93], while posterior parietal cortex neurons perform gain
modulation to guide motor behavior in response to visual stimuli [94, 95]. It has also been
reported that Granger causality cannot detect inhibitory connections with the same accuracy as
excitatory ones [96, 97], and cannot discriminate between mono and polysynaptic connections.

18

2.3.4

Generalized Linear Model

The Generalized Linear Model (GLM) is a generative model of wide use in many statistical
problems. In the context of modeling neuronal population activity, GLM models the output of
each neuron in terms of a conditional intensity function, λ(t F(t)) of a stochastic point process,
where F(t) models all the factors that influence the neuron output [98, 99]. For inhomogeneous
Poisson processes, the conditional intensity of neuron i is expressed as

⎛

λi (t Fi (t )) = exp⎜ β i + k i .ξ h +
⎜
⎝

⎞
αij .S h ⎟,
∑ j⎟
j ∈π i
⎠

(2.5)

h

where Fi(t) is a linear sum of the neuron‘s background rate βi, an extrinsic covariate ξ that the
neuron independently encodes, and the firing history S h of the neurons driving neuron i forming
j
the set πi prior to time t (inclusive of neuron i to model self-inhibition caused by refractoriness).
The parameters ki and αij model the neuron’s receptive field and synaptic coupling with neuron j,
respectively. While the GLM has been used as a generative model and not as a method to infer
neuronal connectivity, optimizing its parameters to fit the observed data involves calculating the
coupling parameters αij’s and therefore can be viewed as a tool to identify neuronal connectivity.
Typically, αij’s are calculated using maximum likelihood estimation techniques [100].
GLMs have been successfully shown to fit data in vivo from primary motor cortex during
goal-directed 2D arm reach movements [98], as well hippocampus place cell firing activity of
rats in spatial navigation tasks [101]. They have also been shown to fit population data from in
vitro neural cultures such as the primate retinal ganglion cells (RGCs) [83], a structure with wellcharacterized anatomical connectivity. Figure 2.5 shows a block diagram of the model for a pair

19

Stimulus Filter
Coupled Spiking Model
Stimulus
Stochastic
Filter
Nonlinearity Spiking
-200 -100 0
Time (ms)

Post-spike Filter

Post-spike Filter
1

Gain

Neuron 1
Coupling
Filters

0

20 40
Time (ms)
Incoming Coupling Filters
OFF
ON

Neuron 2

3
2
1
0

(a)

0

1.0
0.8
0.6
20 40 0 20
Time (ms)

40

(b)
Figure 2.5: (a) GLM of a 2-neuron population used to fit spike trains from stimulus-driven retinal
ganglion cell (RGC) responses in vitro. Each neuron input is processed by a stimulus filter ki, a
post-spike filter αii and coupling filters that capture other neurons’ dependencies, αij’s. (b)
Parameters of an example retinal ganglion ON cell model: Temporal and spatial components of
center and surround filter, post-spike filter and coupling filters from neighboring OFF and ON
cells. (Adapted from [83]).

of RGC neurons recorded in the study by Pillow et al. [83], where an example of the estimated
parameters for one ON cell is illustrated, including the estimated synaptic coupling parameters
with the neighboring ON and OFF cells.
A critical assumption in the GLM is that the dependence of the firing rate on the external
covariates is implicitly linear, with possibly the inclusion of a nonlinearity, such as the
exponential function in the Poisson case. There are many prime differences, however, between
the structure of cortical networks observed in vivo and that of neuronal cultures in vitro such as

20

the well-characterized spatial layout of the RGC system. More specifically, the linearity property
may not always hold in many cortical areas where nonlinear mechanisms and non-Poisson output
characteristics tend to predominate [13, 102-104]. For example, neuronal responses in prefrontal
areas have been shown to represent discrete – possibly binary – values (rather than a continuum)
reflecting possible “decision making” mechanisms reminiscent of sensorimotor integration (e.g.
[105]). Many complex brain functions such as attention [106-108] and motion perception are
thought to be mediated by highly heterogeneous and nonlinear firing patterns of neurons in other
areas such as area MT [104, 109-111]. Most importantly, the GLM assumes the neuron responds
h

to a covariate ξ that is known or can be explicitly measured. However, in many cortical areas
(e.g. premotor), the intrinsic driving input to the observed population may not be explicitly
measured, and therefore the exact functional form of the encoding mechanism – the
h

mathematical relationship between extrinsic covariates, ξ ’s, and individual neurons’ outputs,

λ ’s - remains a subject of intense debate [112-115].
From a computational standpoint, GLM fitting - like most maximum likelihood optimization
techniques - is computationally intense and may tend to overfit the data because of the large
number of parameters to be estimated. Some potential remedies include Bayesian regularization
techniques that attempt to maximize the posterior density instead of the likelihood, assuming a
priori beliefs about receptive field parameters. Another way is to assume sparse connectivity and
use known anatomical properties of the populations being measured, or by assuming a specific
functional form for the coupling between the neurons [83, 98], effectively reducing the number
of parameters. Recent experimental evidence from selected brain areas seems to support this
approach [116, 117].

21

2.4 Graphical Models for Inferring Neuronal Connectivity
Graphical models are considered an elegant blend of probability and graph theories to
capture complex dependencies between random variables. They are a cornerstone in statistical
inference and have been widely used in mathematics and machine learning applications such as
bioinformatics, communication theory, statistical physics, signal and image processing, and
information retrieval [118]. Their use in modeling neuroscience data has been surprisingly very
limited, but is increasingly gaining strong attention in the computational and systems
neuroscience communities.
A graphical model uses a node to represent a random variable and an edge to connect two
nodes representing a probabilistic relationship between the corresponding random variables.
Formally, a graph G consists of a set of nodes V and edges E, written as G =< V, E >. Each node
in V, denoted by vi, corresponds to a random variable xi. Each edge in E ⊆ V × V , denoted by vi
→ vj for directed graphs, indicates the causal influence of random variable xi on random variable
xj (i.e. effective connectivity). For undirected graphs, edges indicate statistical dependence that is
symmetric between the two random variables xi and xj (i.e. functional connectivity). An
unweighted graph can be represented as an n × n binary adjacency matrix A, where n is the total
number of nodes in the graph. Each element A(i, j) takes the value ‘1’ if there is a connection
from vi to vj and ‘0’ if there is no connection between the corresponding nodes. An undirected
edge between nodes vi and vj is represented by A(i, j) = 1 and A(j, i) = 1. In the context of
neuronal networks, each neuron is represented by a node in a graph and each connection between
a given pair of neurons is represented by an edge. Figure 2.6a illustrates an example of an
undirected graph of neurons where an edge between a pair of neurons indicates their membership

22

in the same cluster (functional connectivity), while Figure 2.6b illustrates a directed graph for the
same population with edges indicating causal influence (effective connectivity). Throughout this
dissertation, for any effective connection a

b, neuron a will be referred to as the pre-synaptic

neuron while neuron b will be referred to as the post-synaptic neuron. These terms refer to both
monosynaptic (direct anatomical connection) and polysynaptic (indirect influence through a
chain of neurons interconnected by monosynaptic connections) connectivity unless otherwise
indicated.
Graphical representation of neuronal interactions is intuitive because it can be easily
visualized and allows the use of many established graph metrics to quantify certain features of
the inferred graph that can be compared to imaging and anatomical neural data [17, 119]. For
instance, using graph-theoretic metrics, it has been discovered that anatomical and functional
neuronal networks in C. elegans, vertebrates and primates follow a small-world structure [120123]. In this structure, nodes tend to connect to their nearest neighbors with a very high
probability in addition to some distant nodes with a very low probability as illustrated in Figure
2.6c [120]. This structure results in a short path length between any given pair of nodes
Lα log(n ) , where n is the total number of neurons, facilitating the information transfer while

maintaining a low connection cost. Another network structure that has been argued to co-exist in
brain networks is the scale-free structure shown in Figure 2.6d. In this structure, few nodes
appear as hubs in the network with a large number of connections while the rest of the nodes
would have small number of connections asymptotically following a power law where the
probability of having k connections for a given neuron Pr (k )α k −c , where c is a constant [124,
125]. These discoveries all became possible as a result of using graphical representations.

23

1

5

2
3

7

4

1

6

3

8

5

2

6
7

4

(a)

(b)

3

8

3

4

4

2

5

5

1

6

2

1

6

8

8
7

7

(c)
(d)
Figure 2.6: (a) Undirected graph of a sample neuronal population where edges indicate the
membership of the connected neurons in the same cluster. (b) Directed graph of the same
population in (a) where edges indicate the causal influence among the neurons. (c) Sample smallworld network. Connections are mainly local with the exception of a few distal connections
(neurons 2 to 6 and neurons 5 to 8). (d) Sample scale-free network, where a small number of
neurons appears as hubs (neurons 2 and 8).

In addition to the measures graph-theory offers to post-process the inferred networks, they
can provide more accurate models compared to existing methods of inferring neuronal
connectivity. Existing directed and undirected traditional techniques for identifying neuronal
connectivity surveyed above, such as crosscorrelograms and partial directed coherence, mainly
rely on pair-wise correlations assessed over a very fine time scale (typically 0-5 milliseconds).
While these techniques may be feasible to implement for a small number of neurons, they
become computationally prohibitive for large number of cells and inadequate to collectively
assess causal relationships in networks with polysynaptic connectivity in which interaction
occurs over broader time scales (10’s-100’s of milliseconds). Graphical models, on the other

24

hand, do not rely on pair-wise relationships as detailed in Chapters 4 and 5. They rather take into
account the activity of the entire observed ensemble when searching for relationships between
the neurons. This enables identifying only direct, and not indirect, relationships between
observed neurons. This is particularly important when dealing with cortical networks with
complex connectivity patterns that are likely to span broader temporal and spatial scales. In such
a case, simple pair-wise measures may erroneously lead to classify indirect relationships as direct,
or entirely miss certain connections.
An added advantage is that graphical models have a unique ability to detect nonlinear
relationships that other techniques for inferring connectivity (such as Granger causality) cannot
detect. Thus, using graphical models in analyzing neuronal spike trains might facilitate revealing
how the underlying cortical circuits are reconfigured in response to complex, nonlinear stimuli
features. Moreover, they can be used to explain the observed correlation in cortical network
nd

states beyond what 2

order maximum entropy models can reveal, given their sensitivity to

variations in temporal dependence [82, 126].

25

Chapter ____________________________________________

3

A Multiscale Clustering Approach for
Identifying Neuronal Functional
Connectivity

______________________________________________________________________________

3.1 Neuron Population Model
Here we introduce the multiscale clustering approach to identify neuronal functional
connectivity [27, 29]. A common assumption among all of the classical techniques of inferring
neuronal connectivity surveyed in Chapter 2 is that the interval over which the functional
relationship between any neuron pair remains fixed. While this may be the case in local
populations, such a simplistic assumption does not take into account a number of factors
governing the discharge probability of a neuron. This probability can vary considerably over
short latencies of a few milliseconds depending on the strength of synaptic connectivity to other
adjacent neurons [65]. It can also vary over relatively longer latencies, possibly hundreds of
milliseconds [127], depending on the dynamics of neurotransmitter release. In addition,
information content in spike trains was shown to vary considerably depending on the timescale
chosen for the analysis, and, depending on a particular choice of that timescale, strongly
correlated network states can emerge even though correlations over a fixed bin width may be
statistically insignificant [80, 81]. Thus, there is an urgent need to assess the interaction between
simultaneously recorded neuronal elements over a multitude of timescales to better understand
the nature of neural computations that give rise to behavioral events.

26

A neuron population model can be formulated as follows: suppose that in a P neuron
population, we want to approximate the probability of the pth neuron firing within an interval of
length T. If the interval is binned to a very small bin width Δ chosen to account for the
refractory period (i.e., the digital form of the spike train can accommodate consecutive ‘1’ bins,
and no more than one event can occur in any given bin), then the neural discharge pattern during
T will be a vector of length N= T/ Δ , and the spikes will occupy some fraction Np/N of these N
bins, where Np denotes the number of events within T. This simple calculation gives an empirical
estimate of the firing rate of the neuron within T. Alternatively, if T = [0, t ) , this probability can

(

)

also be calculated with knowledge of the conditional intensity function λ p t H p (t ) , where
H p (t ) denotes the history of all the processes that affect the firing probability of the neuron p

up to time t (e.g. stimulus feature, other neurons’ interaction, the neuron’s own firing history,
etc…). In this model, the history H p (t ) models the effects caused by post-synaptic potentials.
Specifically, the probability of a spike event for small Δ is [128]

(

)

Pr{spike from neuron p in (t , t + Δ ] | H p (t ) } ≅ λ p t H p (t ) Δ

(3.1)

The spike train Sp can be modeled as a conditionally independent Bernoulli random process with
probability mass function [129]

(

)

⎧ λ p t H p (t ) Δ,
⎪
f s p (t ) = ⎨
⎪1 − λ p t H p (t ) Δ,
⎩

(

)

(

)

if S p (t ) = 1

(3.2)

if S p (t ) = 0

Here the success probability of the Bernoulli random variable is time-varying, given its
dependence on the history interval. This interval is likely to be of variable duration when the

27

intrinsic properties of the synapses undergo substantial changes, for example, as a result of
Hebbian learning.
Consider for example a version of the conditional intensity of an inhomogeneous Poisson
model [128]
⎞
⎛
P M pq
⎜β +
λ p t i α p , H p (t i ) = exp⎜ p ∑ ∑ α pq (mΔ )S q (t i − mΔ )⎟
⎟
⎟
⎜
q =1 m=1
⎠
⎝

(

)

(3.3)

th

where α pq denotes the synaptic coupling between neurons p and q in the m

time bin, Mpq is

the length (in bins) of the history interval that relates neuron p’s firing probability to activity
from neuron q and can vary as q varies, S q (t i − mΔ ) is an indicator function that takes the value
th

of ‘1’ if neuron q fired a spike in the m

time bin and 0 otherwise, and βp is the log of the

background rate of neuron p. The α pq ’s are time-varying weight factors governing how much
the occurrence of a spike from neuron q, going back Mpq bins in time, influences the probability
of firing of neuron p at time ti. The model in equation (3.3) has been previously demonstrated to
be powerful in fitting data from motor cortex during goal-directed arm reach movements [98,
130], as well as data from place cells in rat hippocampus area CA1 during foraging in a linear
track [131].
A related model in [99] was restricted to a fixed length history interval among all neurons in
the ensemble. The model in equation (3.3) is more realistic for two reasons: first, it takes into
account variable length of the history interval among any given neuron pair consistent with the
expected variations in the post-synaptic potentials duration depending on the cell morphology;
and second, it accounts for temporal precision in spike occurrence when Δ is too small (for
example, 1 ms) which is important for identifying spike-timing dependent plasticity (STDP) in
synaptic connectivity as will be detailed later.
28

Consider the weight function α pq that models the connection from neuron q to p given by

α pq (t ) = A pq sin (Fπt M pq )exp(− Bt M pq )

(3.4)

where F and B are constants, and Apq is a magnitude controlled by synaptic modification. The
damped sinusoidal form in this expression implies that a spike from a neuron q occurring
towards the end of the history interval has a smaller contribution to the firing probability of
neuron p than if it were to occur at an earlier time. This weight function mimics the change in the
membrane potential of post-synaptic neurons caused by an excitatory post-synaptic potential
(EPSP) when Apq is positive or an inhibitory post-synaptic potential (IPSP) when Apq is
negative. The damped sinusoidal form of α pq models a synchronized oscillatory impulse
response of the effect of potentially many unobserved, bidirectional, excitatory and inhibitory
neurons that affect the discharge pattern of neurons p and q. The time-varying characteristics of

α pq are controlled by the sinusoid frequency and the time constant of the exponential term,
which is equivalent to the bandpass filtering effect of synaptic transmission [132]. Other
functions can also be used to mimic different characteristics. A sample function is shown in
Figure 3.1 for two different history intervals while other constants are fixed. A narrow analysis
time scale will be suitable to track the dynamics of α pq when Mpq is small, while a larger
analysis time scale will be needed if Mpq is large. It is very likely that the interaction history
intervals between multiple neurons in an ensemble will be distinct, given the possibility of a
neuron to simultaneously participate in multiple neuronal clusters consistent with Hebbian
learning.

29

2
1.5
1
0.5
0
-0.5
-1
-1.5
-2

Amplitude

Amplitude

2
1.5
1
0.5
0
-0.5
-1
-1.5
-2

0

50
100
Time (ms)
(a)

150

0

50
100
Time (ms)
(b)

150

Figure 3.1: Synaptic weights for (a) Mpq = 40 bins (Hpq = 120 ms) and (b) Mpq =120 bins (Hpq
= 360 ms). In both cases, Apq = 2.5, B = 2000 and F = 3000. When the history interval is limited
to 40 bins, the function exhibits smaller time constant within the short time support where the
function is non-zero. In contrast, the function exhibits larger time constant when Mpq =120.

The above analysis emphasizes the fact that correlations are highly dependent on the time
scale that govern the functional connectivity among neurons, which in turn depends on the
strengths and directions of the synaptic coupling as well as the particular elements of a given
cluster. In a real situation, specific information about these elements and the associated history is
not available to the experimenter, and therefore functional interdependence between any neuron
pair has to be identified without knowledge of these parameters.

3.2 Scale-space Representation
The dependence of the correlation on the variable-duration history terms necessitates a preprocessing step to identify any intricate variations in neuronal discharge probability across
multiple time scales. This step consists of projecting the spike train onto a scale space using a
variable duration Haar rectangular basis [133]. At any given time scale j, the spike train can be
expressed as
30

j
S p (t ) = W ( j ) S p (t )

where W

(j)

j = 0,1,..., J

(3.5)

is a lumped matrix representing the Haar wavelet transform, and J is the total number

of time scales, i.e., depth of temporal analysis. This is a parameter to be chosen by the user.
Figure 3.2 shows an example of a scale space representation of a spike train snippet up to two
(j)

time scales. The Haar transformation operator W

for the first two time scales takes the form

0
0
0
0
0
0
⎡1 2 1 2
⎢ 0
0
0
0
0
0 12 12
⎢
⎢ 0
12
0
0
0
0
0 12
⎢
0
0 12 12
0
0
0
0
W (1) = ⎢
⎢1 2 − 1 2 0
0
0
0
0
0
⎢
0
0
0
0
0
1 2 −1 2
⎢0
⎢0
1 2 −1 2 0
0
0
0
0
⎢
0
0
0
0
0
1 2 −1 2
⎢0
⎣
⎡1 4
⎢ 0
(2)
W =⎢
⎢1 4
⎢
⎣ 0

14
0

14
0

14
0

0
14

0
14

0
14

14
0

−1 4
0

−1 4
0

0
14

0
14

0
−1 4

where multiplying the top half of W

(1)

⎤
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎦
0 ⎤
14 ⎥
⎥
0 ⎥
⎥
−1 4 ⎦

(3.6)

by the spike train Sp gives the probability of firing in

1
successive bins that are two samples in size. These represent the approximation coefficients a p .

Multiplying the bottom half of W

(1)

by the spike train gives the relative change in the firing

pattern in successive bins that are two samples in size. These represent the detail coefficients
(2)
d 1 . Similarly, a 2 and d 2 can be obtained by using W , which is equivalent to computing the
p
p
p

probability of firing and the relative change in successive bins that are four samples in size, and
so on.
Two important features can be extracted from the scale space representation: 1) the
j
probability of firing at time scale j indicated by the sequence a p (t ) along the transform

approximation path (left branch); 2) the relative change in firing probability across successive

31

S p = {1, 0, 0, 0, 0,1,1,1}
d 1 = { 1 2 ,0, −1 2 , 0}
p

a1 = { 1 2 ,0, 1 2 , 1}
p
a 2 = { 1 4 , 3 4}
p

d 2 = { 1 4 , −1 4}
p

Figure 3.2: Two-level Haar scale space representation of a sample spike train.

j
bins indicated by the sequence d p (t ) along the details path (right branch), where

positive/negative values indicate a decrease/increase in the probability of firing relative to the
previous bin, respectively. At this stage, the scale space representation incorporates all the
desired information about the dynamics of the discharge patterns.

3.3 Similarity Measures
Once the scale space representation is obtained for every spike train, a similarity measure is
computed between pair-wise neurons in each time scale up to a temporal depth set by the user.
The choice of this depth largely depends on the expected latency of interaction between neurons,
which typically depends on the brain region(s) from which these neurons are recorded. The
similarity measure used throughout this chapter is the sample cross-correlation between neurons
p and q at time scale j, and can be expressed as

[

]

j
j
σ pq (t i , t l ) = E S p (t i )S qj (t l )

(3.7)

( j)
∈ ℜ P × P , can be formed using equation (3.7)
The full correlation matrix at time scale j, ∑

for all p, q ∈ { ,..., P} . Other similarity measures can also be used (e.g. mutual information,
1
coherence, Granger causality, etc…).

32

Let’s now define the total “instantaneous energy” of a given cluster, C k ∈ ℜ

P ×P

, as the

algebraic sum of all instantaneous pair-wise correlations between the elements of that cluster,
i.e., C k = ∑

j
∑ σ pq , where the time indices have been dropped for notation simplicity. We

j p ,q∈{k }

note here that all the off-diagonal entries C k ( p, q ) are zero for p, q ∉ {k } . The objective is to
determine the nonzero entries C k ( p, q), ∀p, q ∈ {k} that can serve as similarity measures
between pair-wise neurons, irrespective of the length of the history interval governing the
interaction between the two neurons. These entries can be obtained by simultaneously
( j)
diagonalizing the correlation matrices ∑
computed from equation (3.7) for j = 0,1,..., J .

Specifically, a block diagonal matrix R ∈ ℜ ( P×J )×( P×J ) that has the j

th

( j)
P × P block as ∑
is

formed. This matrix is diagonalized using Singular Value Decomposition (SVD) as

⎡ ∑ ( 0)
⎢
R=⎢
⎢
⎢
⎢
⎣

⎤
⎥ Q
∑ (1)
⎥ ≅ λ u uT
⎥ ∑ i i i
O
⎥ i =1
(J )
∑ ⎥
⎦

(3.8)

th

where λ i u i denote the eigenvalue/eigenvector pair associated with the i dominant mode of
the block diagonal matrix R .
To compute an instantaneous similarity measure between pair-wise neurons, the dominant Q
modes of the decomposition in equation (3.8) are used to approximate the augmented correlation
( j)
matrix R. In practice, this is achieved by first vectorizing the J matrices ∑
to P 2 × 1 -

dimensional vectors, concatenating the J vectors obtained in a matrix of dimension P 2 × J ,
performing SVD on the resulting matrix and keeping the first Q modes in a matrix of size

33

P 2 × Q , then finally reshaping the P 2 × Q matrix to Q matrices of size P × P. SVD is used here
as an efficient way to fuse the correlation matrices obtained at different time scales into one
similarity matrix while taking into account the expected variance in neuronal temporal
interactions. SVD captures second order statistics while simultaneously decorrelating the
information across multiple time scales, thereby eliminating any redundancy in the cross
correlations obtained.

3.4 Spectral Clustering
The next step relies on clustering a graphical representation of the population. Using graph
theory, each neuron is represented as an object. Each object p, in our case a neuronal spike train,
is represented as a node in the graph. Any two objects, p and q, are connected by a directed edge
whose weight w pq is the similarity between the two objects. An example of an undirected
th

graph is shown in Figure 3.3. In our case, the weight w pq is the sum of the ( p, q ) entry of the
Q principal matrices weighted by the corresponding eigenvalue obtained from equation (3.8).
The present task is to cluster the neurons in this graphical representation by identifying the
groups of strongly similar objects. In our case, we use a spectral clustering approach drawn from
Machine Learning theory [134, 135]. A spectral clustering algorithm performs graph partitioning
using a minimum cut algorithm. Specifically, the connected graph is divided into multiple
disconnected sub-graphs such that only edges of small weights are removed. In the example
shown in Figure 3.3, by removing the edges with small weights, we are able to identify two
clusters of strongly correlated objects, with object 1 and 2 in the first cluster and object 3 and 4
in the second cluster. Since a typical graph cut problem is NP-hard, some approximation

34

O1
2
O2

6
7

7
6

O3
2
O4

Figure 3.3: A sample undirected graph with 4 objects and arbitrary edge weights for which the
spectral clustering algorithm clusters objects with strong similarity (small edge weight indicates
stronger similarity between objects).

methods exist to efficiently solve the problem using “soft” memberships to reduce the
computational complexity [136].
th

Let’s denote by a pk the probability of the p

neuron to be in the k

th

cluster and solve the

minimum cut problem by finding the set of probabilities {a pk } that maximize the following
objective function
P P

l=

K

∑

k =1

(3.9)

∑ ∑ a pk a qk w pq

p =1q =1
P P

∑ ∑ a pk w pq

p =1q =1

The final cluster memberships are derived from {a pk } by assigning each object to the cluster
*
with the largest probability, i.e., k p = arg max a pk . Note that although many clustering
k∈[1,...,K ]

algorithms exist, a number of studies have shown that spectral clustering is able to outperform
many well known clustering algorithms, particularly when cluster boundaries are complex to
define [137]. Our results demonstrate that this is indeed the case. Figure 3.4 illustrates the steps
of the entire algorithm.

35

•

For j = 1 to J (Time scale index)
For p = 1 to P (Neuron index)
j
Obtain the wavelet representation S p of each spike train S p

End
Compute the P × P correlation matrix ∑ ( j ) of the spike trains at time scale j
End
•

Form the block diagonal correlation matrix R from the matrices ∑ ( j ) and obtain its
spectral decomposition using SVD

•

Obtain the pair-wise similarity of neurons using the weighted sum of the first few
dominant modes of R

•

Compute the cluster membership of each neuron by maximizing the objective
function (3.9)

Figure 3.4: Pseudocode of the multiscale clustering method.

3.5 Results
At that stage of development, it was essential to test the algorithm on spike train data for
which we knew the absolute truth. Therefore, we extensively tested the algorithm on data
simulated using the point process model of equation (3.3) with known relationships among
neuronal elements. We also propose metrics to assess the performance when the ground truth is
not available when analyzing real data.
36

3.5.1

Clustering Synaptically-Coupled Populations

We used synaptic coupling in equation (3.3) according to the following expressions [99]

(
) (
)
α + (t ) = 2.5 sin(4000πt M pq ) exp(− 3000t M pq ),
pq
α − (t ) = −3 sin(2000πt M pq )exp(− 3000t M pq ),
pq
α 0 (t ) = −2 sin 3000πt M pp exp − 3000t M pp ,
pp

(3.10)

where t is in seconds, α 0 , α ± are the auto-inhibitory, and excitatory/cross-inhibitory
pp
pq
interactions, respectively. We generated data sets (n=25) containing 16 neurons each with a total
duration of 98 sec in each trial. In each dataset, the 16 neurons were divided into 4 clusters with
4 neurons each.
3.5.1.1 Population with Fixed History Interval
Figure 3.5a shows the network topology of the 16-neuron population with no across cluster
connectivity. Within a cluster, each neuron inhibits itself as well as its neighbor in the clockwise
direction, while excites the other neighbor in the anti-clockwise direction. In this dataset, the
parameter Mpq was set to 120 for all interactions. Figure 3.5b shows a 3-seconds raster of the
generated spike trains. A sample inter-spike interval (ISI) histogram of a neuron in the
population and that of an independent Poisson model for two choices of Mpq (4 and 120 bins), is
shown in Figure 3.5c demonstrating the effect of inhibition (first 40 ms) and the excitation (40100 ms) intervals resulting from interaction with other neurons when Mpq is large, while this
interaction effect diminishes when Mpq decreases. We also computed the Coefficient of
Variation (CV) of the Inter-spike Interval (ISI) histograms ( CV = std (ISI ) mean ( ISI ) ) to
compare the statistics of the data generated by the model to those typically observed in cortical

37

I

1

E
I

2

I

I

C1 I
E
I
4 E 3
I

I

E
13 I

14

6

I

E
I C2 I
I
8 E 7
I
I

E I
I

E
5 I

E

I

I

E
I C4 I
I
16 E 15
I
I
E

E
9 I

10

I

E I C3 I
E
I
12 E 11
I
I
(a)
C1
C2
C3
C4

80
Count

60

0.75

1.5
Time (s)
(b)
Dep.
Indep.

Dep.
Indep.

40
20
0
0

3

1
0.9
0.8
60
120
240 360
Hist. Interval (ms)

150
300 450
ISI (ms)
(c)
Figure 3.5: (a) Network topology for the independent clusters population (solid lines), I =
inhibition, E = excitation. (b) Sample 3-sec trace of the obtained spike trains. The initialization
period is indicated by a black bar. (c) Sample ISI histogram for a neuron in the population and an
independent neuron at 12 ms and 360 ms history intervals, respectively. Coefficient of variation
of the ISI histogram. (d) Augmented correlation matrix using time scales up to 96 ms. (e)
Average clustering accuracy vs. time scale for a fixed 360 ms history interval obtained using
probabilistic spectral clustering and k-means clustering.
150
300
ISI (ms)

450 0

2.25

Coeff. of Variation

__________ 0

38

2 4 6 8 10 12 14 16
Neuron
(d)
Figure 3.5 (cont’d)

10

5

0

Avg. Clustering Acc.

Neuron

16
14
12
10
8
6
4
2

Spectral
k-means

1
0.75
0.5
0.001

0.01

0.1
1
Time Scale (s)
(e)

10

100

neurons [106, 138]. The rightmost plot in Figure 3.5c illustrates that as the history interval
increases, the CV decreases below 1 (CV = 1 is equivalent to a Poisson neuron), consistent with
characteristics of many types of cortical neurons [139].
In each trial, four across-cluster interactions pairs were randomly added to simulate noise or
potential error margins that may be encountered, for example, during spike sorting of
simultaneously recorded neurons [59]. Each pair consisted of an excitation in one direction and
an inhibition in the other using the same expressions in equation (3.10). Figure 3.5d shows the
augmented R matrix obtained by considering correlation matrices up to a time scale of 96 ms
illustrating the computed similarity between pair-wise neurons averaged across trials. Neurons
were ordered such that neurons 1-4 represent cluster 1, neurons 5-8 represent cluster 2, and so
on. Neurons in the same cluster are observed to share significantly larger similarity than those
across-clusters. The average clustering accuracy using the proposed approach shown in Figure
3.5e (the solid curve) was above 96% for time scales in the range of ~200 ms to ~1.5 sec and
peaks around 360 ms, consistent with the fixed history interval chosen in the model ( M pq Δ =
360 ms), then it declines below 80% for smaller (< 10 ms) as well as larger time scales (> 10

39

sec). Since we did not account for synchrony explicitly in this dataset, these results confirm our
expectation that the clustering accuracy is maximized at temporal resolution matching the
network properties.
To assess the effect of each of the computational steps in the algorithm on the clustering
accuracy, we first examined the contribution of the scale space step. When the similarity matrix
was computed using a fixed bin width, i.e., without the scale space representation prior to
clustering, an average accuracy of only ~51% was achieved. We then compared the performance
when the k-means was used as opposed to spectral clustering. Figure 3.5e shows that the
clustering accuracy deteriorates (maximum accuracy in this case is 61%) compared to the case
when spectral clustering is used. Therefore, we concluded that both steps are important for
superior performance.
3.5.1.2 Population with Variable History Interval
As derived earlier, the interaction history interval with neighboring neurons plays a
fundamental role on the probability of firing of the neuron under consideration. For example, if
the time constant of an excitatory synapse between neurons p and q is small (e.g. Mpq = 4 bins as
in Figure 3.5c), neuron p is likely to fire Mpq times the likelihood of firing independently. In
contrast, when the time constant is large (e.g. Mpq = 120 bins), neuron p incorporates the history
of neuron q’s firing over a much longer time scale and therefore ignores the precise timing of
neuron q’s firing and emphasizes firing rate information. Thus, the former case represents
synchrony effects governing the probability of firing, while the latter represents response to rate
modulation effects, both believed to be major constituents of the neural code [140, 141].

40

We varied the history parameter Mpq and examined the algorithm performance as illustrated
in Figure 3.6a. The performance systematically peaked around the specific history interval
chosen in the model. This is further indicated by observing the monotonic relationship in Figure
3.6b between the time scale where maximum accuracy was attained and the history interval in
the population model. In our model, the firing pattern of each neuron is a highly nonlinear
function of the individual pre-synaptic neurons’ firing pattern as indicated by the exponential
term and the nonlinear relation between Mpq and the synaptic strength in equation (3.4).
Therefore, we did not expect the relationship between the time scale of maximum accuracy and
the history interval to be linear. It is also noteworthy that we are not interested in estimating the
history intervals of interaction, but rather identify the clusters of functionally interdependent
neurons.

Max. Acc. Time Scale (s)

Avg. Clustering Acc.

1
0.8
0.6
0.4

H=60 ms
H=120 ms
0.2
H=240 ms
H=360 ms
0
0.001 0.01 0.1
1
10
Time Scale (s)

10
1
0.1
0.01

0.001

100

(a)

0.1
History Interval (s)

1

(b)
Figure 3.6: (a) Clustering accuracy vs. time scale fitted with 3 order polynomial for the
network in Figure 3.5a for variable history intervals. (b) (Gray dots) Time scale of maximum
clustering accuracy obtained for the 25 trials vs. the population true history interval. (Black dots)
Average of maximum accuracy time scales.
rd

41

3.5.1.3 Clusters with Distinct Temporal Interactions
We further examined the effect of varying history intervals across clusters, while maintaining
it fixed among neurons within a given cluster. We used the same topology in Figure 3.5a with
1

2

3

4

k

M = M = 20, and we varied both M and M , where M denotes the history interval (in bins) of
interaction between neurons in cluster k. Figure 3.7a shows the clustering accuracy for different
3

4

k

k

settings of H and H (H = M × Δ) with a peak accuracy of more than 90% in all cases. As can
be seen from Figure 3.7b, the monotonic relationship indicates that the accuracy is not
diminished as a result of the distinct patterns of interactions expressed by the variable history
intervals.
3.5.1.4 Distinct within-cluster Temporal Interactions
We also examined the performance of the approach when variable history intervals of

Max. Acc. Time Scale (s)

Avg. Clustering Acc.

interaction occur within a given cluster. This case is relatively challenging since the dependence

1
0.9
0.8
0.7
0.6
0.5
0.4

H=120 ms
H=240 ms
H=420 ms
H=840 ms
0.001 0.01 0.1
1
10
Time Scale (s)

100

(a)

1
0.1
0.01
0.001

0.1
History Interval (s)
(b)

1

rd

Figure 3.7: (a) Average clustering accuracy ± SD vs. time scale fitted with 3 order polynomial
3

4

1

2

3

4

for different settings of M and M (M and M fixed to 20 bins). H = H = H . (b) (Gray dots)
Time scale of maximum clustering accuracy vs. history interval of clusters 3 and 4 for each of
the 25 trials. (Black dots) Average of maximum accuracy time scales.

42

between any pair of neurons’ firing pattern is only governed by the presence of a non-zero α pq ,
while dependence on similar Mpq is not present. The same topology of Figure 3.5a was used with
M1i, M5i, M9i, and M13i set to 10, M2i, M6i, M10i, and M14i set to 20, M3i, M7i, M11i, and M15i
set to 110, and M 4i , M 8i , M 12i , and M 16i set to 120, where i ∈ {1, 2, ...,16} . Figure 3.8a
demonstrates that the average clustering accuracy reaches ~97% around the 100 ms time scale

1
Avg. Clustering Acc.

Avg. Clustering Acc.

1
0.8
0.6
0.4
0.2
0.001 0.01

0.1
1
10
Time Scale (s)
(a)

100

0.8
0.6
0.4
0

0.6
0.12
Window Size (s)
(b)

0.18

Aveg. Clustering Acc.

1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.01

0.1
1
10
100
Window Size (s)
(c)
Figure 3.8: (a) Average clustering accuracy ± SD over all trials vs. time scale. (b) Average
clustering accuracy using crosscorrelograms for different window sizes. (c) Average clustering
accuracy using JPSTH for different window sizes.

43

showing that the neuronal elements in each cluster are correctly identified despite the variability
in the history intervals within the cluster.
For comparison, we assessed the performance to that obtained using two standard techniques:
the crosscorrelogram and the Joint Post Stimulus Time Histogram (JPSTH) [21]. Both methods
rely on using a fixed bin width correlation analysis as detailed in Chapter 2, section 2.3.1, and
therefore are not expected to be robust to variations in the history interval. We tested window
sizes of various numbers of bins (each bin was 4 samples). A correlation matrix was formed in
each case, where each entry corresponded to the peak of the crosscorrelogram/JPSTH obtained
for the corresponding pair of neurons that exceeded a predefined significance level. Figure 3.8b
and Figure 3.8c show the average clustering accuracy for different window sizes, obtained using
the crosscorrelogram and JPSTH-based similarity matrices as input to a standard hierarchical
clustering. In this analysis, we used the average linkage hierarchical clustering algorithm in
which the distance between two clusters is computed as the average of the pair-wise distance
between the members of the two clusters [142]. Comparing Figure 3.8a to Figure 3.8b and Figure
3.8c, better clustering accuracy is obtained using the proposed approach.
3.5.2

Variability in Network Topology

So far we have examined the performance of the algorithm under systematic variation in
temporal dependence to illustrate its ability to track nonstationarity in the temporal interaction
between neurons within a fixed network structure. Herein, we examine the ability of the
algorithm to track variations in spatial interactions. This was investigated in two different ways:
1) by manually inducing changes in the structure of the network in which new connections
randomly appear; and 2) by gradually varying the connectivity strength based on an activity-

44

dependent spike-timing dependent plasticity (STDP) model, widely believed to underlie Hebbian
learning.
3.5.2.1 Structural Plasticity
To invoke structural plasticity in the model, new connections were instantaneously created in
the population between subsets of neurons from different clusters as shown in Figure 3.9a. This
can mimic changes in neuronal circuitry resulting from instantaneous activation of distinct
pathways, for example, when cortical rewiring takes place post traumatic brain injury or stroke
[143]. These structural changes are long thought to accompany functional plasticity across
multiple time scales, even under normal conditions [144, 145]. Another example of structural
changes triggered by functional plasticity is the sudden transition in behavioral goal during a
motor task [146]. This may trigger an observed neuron -previously not involved in encoding
behavioral task parameters- to be instantaneously “recruited” based on the correlation between
its tuning characteristics and the new task demands. In such case, the synaptic connections
linking this neuron to other across-cluster neurons can be modeled as a step function centered at
the time marker of the response trigger, and weighted by an excitatory/inhibitory coupling α pq
with appropriate strength.
rd

For illustration purposes, the 3

st

neuron in each cluster now excites the 1 neuron from the
th

nd

neighboring cluster in the clockwise direction, while the 4 neuron in a cluster inhibits the 2

neuron from the neighboring cluster in the anti-clockwise direction. The synaptic coupling was
introduced over distinct time markers and over multiple time scales. Noise was added in the
same way as previously described. The across-cluster interaction expressions used for synaptic
coupling are

45

I

1

E
I

2

I

I

I

E

C1 I
E
I
4 E 3
I
I

E I
I
E

E
I

E
I
13 I

14

I

E
5 I

6

I

E
I C2 I
I
8 E 7
I
E

E

I

I

E
I C4 I
I
16 E 15
I
I
E

E
9 I

I
10

I

E I C3 I
E
I
12 E 11
I
I

1
0.9

Neuron

Avg. Clustering Acc.

(a)

0.8
0.7
0.6

16
14
12
10
8
6
4
2

15
10
5

Isolated
0
Interacting
2 4 6 8 10 12 14 16
0.4
0.001 0.01 0.1
1
10
100
Neuron
Time Scale (s)
(b)
(c)
Figure 3.9: (a) Network topology with across-cluster synaptic coupling (dotted lines). (b)
Average clustering accuracy vs. time scale. (c) Augmented R matrix using time scales up to 96
ms.
0.5

46

α + (t ) = 1.25 sin (4000πt M pq )exp (− 3000t M pq )
pq
α − (t ) = −1.5 sin
pq

(2000πt M pq )exp(− 3000t M pq )

(3.11)

The expression in equation (3.11) differs from equation (3.10) by the constant amplitude term
Apq, which indicates relatively weaker coupling between clusters. The history interval was fixed
to 120 bins for all interactions.
The similarity matrix shown in Figure 3.9b shows significant correlation appearing between
neuron 5 and neurons 2, 3, and 4 consistent with the model since neuron 5 is excited by neuron 3,
and neuron 3 is coupled to neurons 2 and 4. Similarly, significant correlation appears between
neuron 8 and neurons 1, 2, and 3. These changes in the correlation structure lead to a slightly
diminished clustering accuracy (about ~4% decrease) compared to the isolated cluster topology
illustrated in Figure 3.9c. These results demonstrate that despite the presence of weak
interactions in the similarity matrix, the clustering accuracy was not significantly affected,
indicating the ability of the algorithm to detect subtle differences between weak across-cluster
interactions relative to stronger within-cluster ones.
3.5.2.2 Spike-Timing Dependent Plasticity
Herein, we investigated an activity-dependent model of plasticity by modifying synaptic
strength based on the timing of pre- and post-synaptic spike occurrences [86, 147]. Figure 3.10a
shows the topology used where STDP was induced by simultaneously stimulating two neurons
from each cluster. This was achieved by simultaneously increasing the background rate β p in
equation (3.3) for the paired neurons at specific phases. By doing so, we increase the likelihood
of observing two spikes from the pre- and the post-synaptic neurons within a shorter interval,
thereby potentiating their synaptic connections. We carried out 120 stimulation phases at 0.2 Hz

47

I

E

I

I

E

2
E C1 E
E
4
3
I
I

E

E

E

1

I

14 I
E C4 E
E
16
15
I
I
13

5

E

Normalized EPSP slope (%)

Stimulation
I

6
C2 E
E
E
7
8
I
I

E

E 10 I
E C3 E
E
11
12
I
I

I

9

Avg. Clustering Acc.

(a)

2
1.5
1
0.5
-100
-50
0
50
100
Pre/post Synaptic Spike Interval (ms)
(b)

1
0.9
0.8
0.7
0.6
0

4 Clusters
2 Clusters
20 40 60 80 100 120
Stimulation Phases

16
14
12
10
8
6
4
2

4

Neuron

Neuron

(c)

2

16
14
12
10
8
6
4
2

15
10
5

0
2 4 6 8 10 12 14 16
2 4 6 8 10 12 14 16
Neuron
Neuron
(d)
(e)
Figure 3.10: (a) Network topology with paired stimulation to induce STDP. (b) EPSP slope as a
function of pre-post synaptic spikes amplitude (A) and time constant (τ) were chosen as 1.01 and
14.8 ms for LTP, and -0.52 and 33.8 ms for LTD, respectively. (c) Average clustering accuracy
post stimulation phases assuming 4 clusters and 2 clusters. (d) Similarity matrix using time
scales up to 96 ms post the 10 stimulation phases and (e) post 120 stimulation phases.

48

with 12 ms duration in each. The parameter Mpq was set to 40 bins (120 ms) for all synaptic
couplings. The percentage change in synaptic strength was modeled using the relation
dα pq = A exp(− dt τ ) , where dt is the pre/post inter-spike interval. The parameters A and τ were

chosen as 1.01 and 14.8 ms respectively for Long Term Potentiation (LTP), and -0.52 and 33.8
ms for Long Term Depression (LTD), respectively. These parameters were chosen according to
the model described in [147] in which the measured synaptic changes were fitted with
exponential functions similar to Figure 3.10b.
Increasing stimulation time results in LTP, and this means that the post-synaptic neuron
should gradually “merge” into the pre-synaptic neuron’s cluster. For example, neuron pair (2, 5)
was weakly connected before stimulation. During each stimulation phase, these two neurons
were stimulated simultaneously. This repetitive pairing resulted in LTP, and the same process
was applied to neuron pairs (3, 8), (9, 14), and (12, 15). Thus, the clustering accuracy should
decrease if the number of clusters is held equal to the number of clusters prior to the paired
stimulation, which is indeed the case in Figure 3.10c. Figure 3.10d and Figure 3.10e show the
similarity matrix obtained using time scales up to 96 ms after 10 and 120 stimulation phases,
respectively. The results demonstrate that the algorithm is able to track the effect of naturally
occurring spikes as a result of STDP and the corresponding changes in the network topology.
3.5.3

Clustering Large Populations

We examined the performance of the proposed approach when applied to a larger population
of 120 neurons. The population consisted of 4 clusters, 30 neurons each, where each cluster is
formed of a 2D grid of neurons. The strength of the connections between clusters 1 and 2 was
manually varied similar to the 16-neuron population above as shown in Figure 3.11a.

49

Avg. Pairwise Distance

Cluster 1

Cluster 2

Cluster 4

Cluster 3

8
7
6
5
4
3
0

0.2
0.4
0.6
0.8
Connectivity Strength

1

(b)
Avg. Clustering Acc.

(a)
1
0.9
0.8
0.7
0.6
0.5
0.4
0

4 Clusters
3 Clusters
0.5
1
Connectivity Strength

(c)
Figure 3.11: (a) Network topology for a population of 120 neurons, where each cluster has 30
neurons. The bidirectional connections refer to an excitation connection in one direction and an
inhibition in the opposite one. Black connections represent the within cluster connections, while
gray connections represent the across clusters 1 and 2 connections that undergo functional
plasticity. (b) Average pair-wise distance between neurons in clusters 1 and 2 for variable
connectivity strengths. A connectivity strength of 1 indicates a connection across-clusters of the
same strength as a connection within clusters. (c) Average clustering accuracy for variable
across-clusters connectivity strengths (clusters 1 and 2).

Figure 3.11b shows the average pair-wise distance between neurons of clusters 1 and 2 for
variable connectivity strengths between the clusters. These distances were computed using the
similarity matrix prior to probabilistic spectral clustering. As expected, the distance decreases as
the connectivity strength between the two clusters increases, which is reflected on the average

50

clustering accuracy in Figure 3.11c. As the across-clusters connections strengthen, the clustering
accuracy decreases when the number of clusters is fixed to be 4, while it increases when the
number of clusters is fixed to be 3.
3.5.4

Estimating the Time Scale of Maximum Accuracy

We have demonstrated that a functional relationship exists between the time scale at which
maximum accuracy occurs and the history interval length chosen to model the population
interaction. When dealing with real data, prior knowledge of this history interval is not available
and therefore this relationship cannot be exactly discovered. For that purpose, we computed a
clustering validity index, the Dunn Index (DI) [148], defined as

DI =

min

1≤ i ≤ N,1 ≤ j ≤ N
i≠ j

(

d Ci , C j

)

max diam(C k )

1≤ k ≤ N

(3.12)

where d(Ci, Cj) is the average distance between spike trains of pair-wise neurons in clusters i and
j ( i ≠ j ), diam(Ck) is the maximum pair-wise distance between spike trains of neurons in cluster
k, and N is the total number of clusters. This index will be maximized when the across-clusters
distance (the numerator) is maximized, while the within-cluster distance (the denominator) is
minimized. Thus, it can identify sets of clusters that are compactly separated.
Figure 3.12 shows the average Dunn index at different time scales for the data sets analyzed
in section 3.5.1.2. The values shown were computed directly using the clustering outcome from
the proposed approach. Comparing Figure 3.12 with the clustering accuracy shown previously in
Figure 3.6a, it can be seen that for each choice of history interval H, the Dunn index is
maximized around the same time scale where the clustering accuracy peaks. In practice, the
Dunn index is computed using entries of the augmented R matrix by normalizing the entries in

51

Average Dunn Index

1

0.5

0

H=60 ms
H=120 ms
H=240 ms
H=360 ms
0.01

0.1
1
Time Scale (s)

10

100
rd

Figure 3.12: Average Dunn index ± SD over all trials vs. time scale fitted with 3
polynomial for the network in Figure 3.5a for different history intervals.

order

the R matrix by the maximum value, then subtracting each entry in the normalized R from one.
The resulting matrix thus comprises the distance between all pair-wise neurons and is eventually
used in the computation of equation (3.12).

3.6 Validation using Barrel Cortex Data
To demonstrate the performance of the multiscale clustering algorithm in the analysis of real
data, we applied the algorithm to stimulus-driven activity recorded from the rat barrel cortex.
Details of the experimental procedure are presented in Chapter 5, section 5.2.1. Briefly, 5 adult
female Sprague Dawley rats were used in the experiment. Subjects were anesthetized and their
left somatosensory cortex was exposed. An array of 32-channel microelectrode array was
advanced into layer V of the barrel field. For each rat, 3 whiskers were selected for mechanical
stimulation where each whisker was horizontally deflected 900 times. For each deflected
whisker, spike trains within 100 ms of each stimulus onset were considered as one trial. A total

52

of 100 datasets, 18 sec each, for each whisker were extracted from the recorded 900
trials/whisker, where each dataset was formed by concatenating 180 trials that were randomly
chosen with a uniform distribution from the 900 trials.
Multiscale clustering was applied to each of the stimulus-specific datasets with variable
number of clusters. Figure 3.13a illustrates sample inferred networks for three distinct whiskers
in one rat showing 2 clusters for whisker C1, 3 clusters for whisker D1 and 6 clusters for whisker
D2. To determine the optimal number of clusters for each rat, we computed the mutual
information between the inferred networks and the stimulus (i.e. whisker identity) for each
possible setting of the number of clusters. The optimal number of clusters was set as the number
of clusters that maximizes the mutual information. Details of the mutual information (network
information) computation are presented in Chapter 5, section 5.2.4. Briefly, principal component
analysis (PCA) feature space of the inferred networks was constructed as illustrated in Figure
3.13b for each setting of the number of clusters where each point represents a network of one
dataset. Network information was then computed from the obtained feature space. Figure 3.13c
demonstrates the mutual information obtained for each setting of the number of clusters for one
sample rat. For this particular example, the maximum network information was achieved by
clustering C1 data to 2 clusters, D1 data to 3 clusters and D2 data to 6 clusters. The network
information obtained across all rats was 1.4±0.23 bits.
To assess the validity of these networks in the absence of knowledge of the underlying true
connectivity, we first examined the connection probability as a function of the horizontal and
vertical separations between electrodes for optimal setting of the number of clusters. We
expected that neurons recorded on the same or adjacent electrodes are more likely to be
connected in the inferred networks, consistent with anatomical and physiological studies in
53

Whisker C1
16a

8b

25c
1b

8a
2a

25b 17b
25a

7a

24b 17a

25a
7a

2b

1a
9c

9b

Whisker D1
17b

7b

24a

2a

9a

8a

9a
1a

1b

24c

24a

25b

Whisker D2
25a
8b

1a

7a

24a
17a

24c
25c
17a

8b

2b
9b
16b

16a

25b

24b

9c

7b

16b

2a

16a

24b

24c

16b
9b

9a

9c

2b

17b

8a

1b

25c

7b

(a)

Network Info. (bits)

Network Space

0
-5

-10
-10

C1
D1
D2
5

D2 Clusters

PC2

5

7
6
5
4
3
2

1.5
1

0.5
7
6
5
4
3
0
2
D1 Clusters

7
6
5
4
0
3
2
C1 Clusters
PC1
(c)
(b)
Figure 3.13: (a) Sample stimulus-specific networks formed of clusters for 3 different whiskers
(C1, D1 and D2) of rat R3. (b) Network feature space of rat R3 for 3 different whiskers (C1, D1
and D2). Each dot corresponds to the projection of one network onto a 2-dimension principal
components (PC1 and PC2) feature space. (c) Network information for rat R3 as a function of
the number of clusters used for each of whiskers C1, D1 and D2.

-5

54

the neocortex suggesting that connectivity tends to be mostly local for economic wiring [13, 149,
150]. Figure 3.14a demonstrates that neurons recorded on the same or on adjacent electrodes
have a higher probability of being connected. Furthermore, the connection probability decreased
with increasing electrode separation (r = -0.42, P < 0.05, n = 24, t-test).
We also validated the identified clusters temporally where neurons with similar temporal
characteristics were expected to belong to the same cluster. We examined whether neurons with
similar response latencies to whisker stimulation were clustered together where the response
st

latency was quantified as the time to the 1

spike fired in response to whisker deflection

(Chapter 5, equation (5.1)). Figure 3.14b illustrates a histogram of the connections as a function
of the difference in the response latency between the connected pairs using the optimal setting of
the number of clusters. As can be seen, neurons with small difference tend to be clustered
together more than neurons with larger difference (r = -0.76, P < 0.05, n = 7, t-test).

500
400
300
200
100
0

0.06
0.04
0.02
0

Fraction of Conn.

Vertical Sep. (μm)

Connection Prob.
0.8
0.6
0.4
0.2

0
0 5 10 15 20 25 30 35
Response Latency Difference (msec)
(a)
(b)
Figure 3.14: (a) Histogram of the spatial difference between the connected neurons quantified in
terms of the horizontal and vertical separations between the electrodes on which neurons were
recorded. (b) Histogram of the temporal difference between the connected neurons quantified in
terms of the difference in the connected neurons’ response latency to the stimulus. The number
of clusters for each rat in both (a) and (b) was set to the number that maximizes the mutual
information between the networks and the stimulus.

0 400 800 1200
Horizontal Sep. (μm)

55

3.7 Discussion and Conclusion
Two major features of the proposed approach are specifically tailored to the type of neuronal
correlation likely to be observed in cortical data: First, the projection of the spike trains onto a
scale space to account for any synergistic interplay between temporal and rate codes that may be
simultaneously present in a given ensemble activity pattern; and second, the probabilistic
spectral clustering to overcome any complexity of clustering data with arbitrary structures in a
feature space. This latter feature yields two additional advantages: First, it is nonparametric and
therefore can track a wide variety of cluster shapes, and second, its probabilistic nature is useful
in quantifying the degree of uncertainty in the membership of a given neuron to each cluster.
This implies more robustness to plastic changes in cortical circuitry within and across multiple
trials. The results demonstrated in this chapter show the effectiveness of the proposed multiscale
clustering approach in identifying functional connectivity in both simulated data under different
model parameters and real data recorded from the rat barrel cortex.
A typical problem arises in the use of clustering algorithms in general is the knowledge of the
exact number of independent clusters prior to the analysis. While a few heuristic approaches
have been proposed in the literature for many types of data, this problem can be quite
challenging to the application at hand given the ever-present cortical adaptation caused by
synaptic plasticity, cellular conditioning and/or representational plasticity [151]. One approach
for estimating the number of clusters to look for in the data is the elbow criterion [142], which
was further improved by the introduction of the gap statistic [152]. Specifically, the gap statistic
criterion is based on the observation that the within-cluster dispersion, defined as the overall
distance between pair-wise elements in a cluster, tends to decrease significantly when the

56

number of clusters exceeds the true number. Therefore, it is possible to estimate the correct
number of clusters by identifying the elbow (or the “knee”) in the graph of within-cluster
dispersion and comparing it to the reduction observed in the within-cluster dispersion of a
reference distribution.
It has been demonstrated that the gap statistic approach was capable of detecting the correct
number of independent neuronal clusters when applied to a population of leaky integrate-and-fire
(LIF) neurons [153]. Similar to the Dunn index used in this chapter to determine the maximum
clustering accuracy, the gap statistic can also be considered as a clustering validity index.
However, the main difference between the use of both measures is that the gap statistic is used
prior to the clustering to estimate the number of clusters to look for in the data, while the Dunn
index is used post clustering to choose the clustering result corresponding to the time scale that
best represents the temporal interaction between neuronal elements in the observed activity
patterns. In case of stimulus-specific data, we used the mutual information between the obtained
clustering result of all trials and the stimulus to determine the optimal number of clusters.
In addition to the numerous systems neuroscience applications that may benefit from the
proposed approach, a fundamental objective in many recent Brain Machine Interface (BMI)
applications is to decode ensemble spike trains to better characterize the nature of cortical motor
control associated with distinct motor behaviors [50, 154-158]. Many of these studies have
examined the utility of multiple single unit recordings in the primary motor, premotor and
supplementary motor areas of the cortex in controlling artificial devices. In the overwhelming
majority of these studies, neuronal firing rates estimated over a fixed bin width (typically 50-100
ms) are believed to contain all the information needed to estimate the neural correlate of
behavioral. However, the large variability in individual firing rates typically observed across
57

multiple repeated trials results in unreliable decoding. These algorithms require extensive
periodic “calibration”, even when recordings are stable and the subject’s performance remains
steady [159]. Faithful decoding of motor cortex activity patterns when the subject is freely
behaving and naturally interacting with the surrounding requires continuous identification of
clusters of neurons involved in encoding behavioral parameters at different time scales.
Clustering techniques such as the one reported in this chapter may constitute an important predecoding step to identify the underlying clusters constituting a lower dimensional subspace of
the observed neural space.

58

Chapter ____________________________________________

4

Inferring Neuronal Effective
Connectivity using Dynamic Bayesian
Networks

______________________________________________________________________________

4.1 Dynamic Bayesian Network Theory
This chapter introduces the use of Dynamic Bayesian Networks (DBNs) in inferring the
effective connectivity between spiking cortical neurons from their observed spike trains [28, 29].
Inferring causal links between neurons is a challenging problem because observing an effect may
have many causes, some of which may be unobservable, leading to the possibility of inferring
spurious connections between neurons. In the case of time varying stochastic processes with
potentially many possible causes such as spike trains, Bayesian analysis provides a powerful
framework because it treats all model quantities as random variables.  Causal links between these
variables – now explicitly appearing as vertices in a graph - can therefore be represented as
directed edges. In such case, DBN can be used as a graphical model of spiking neurons because
it explicitly models time dependencies between the variables [160, 161] .
A Bayesian Network (BN) is a graphical representation of statistical relationships between
random variables, widely used for statistical inference and machine learning [162, 163]. A BN is
denoted by B =< G, P >, where G is a directed acyclic graph (DAG) and P is a set of conditional
probabilities. Each graph G consists of a set of nodes V and edges E, and is usually written as G
=< V, E >. Each node in V, denoted by vi, corresponds to a random variable xi. Each directed
59

edge in E, denoted by vi → vj, indicates that node vi (i.e., random variable xi) is a parent of node
vj (i.e., random variable xj). Conditional probabilities in P are used to capture the statistical
dependence between child nodes and parent nodes. In particular, given a random variable xi in
the graph, we denote by xπ(i) the set of random variables that are parents of xi. The statistical
dependence between xi and its parent nodes xπ(i) is captured by the conditional probabilities
Pr(xi|xπ(i)). More precisely, the value of random variable xi is decided by the values of its parents
via the conditional probabilities Pr(xi|xπ(i)), and is independent from the values of other random
variables in the graph given the value of xπ(i). Thus, the joint probability distribution of the
random variables xi can be expressed given the conditional dependence on the parents using the
chain rule
N

(

Pr( x1 , x2 ,..., x N ) = ∏ Pr xi xπ (i )
i =1

)

(4.1)

Figure 4.1a shows an example of a Bayesian Network, in which we have three binary random
variables: random variable R indicates “if it rains” or not, W indicates “if the grass is wet” or not,
and U indicates “if people bring umbrella” or not. The arc from R to W indicates that “the grass
being wet” is decided by “if it rains”. Similarly, the arc from R to U indicates that “whether or
not people bring umbrella” is decided by “if it rains”. The conditional probabilities Pr(U|R) and
Pr(W|R) are summarized in Table 4.1. According to the chain rule described before, the joint
probability Pr(R, U, W) is written as
Pr (U , R, W ) = Pr (R ) Pr (U R ) Pr (W R ).

(4.2)

60

R
W

U
(a)

Day t-1

Day t

Day t+1

Sprinkle

Sprinkle

Sprinkle

Rain

Rain

Rain

(b)
Figure 4.1: (a) An example of a Bayesian Network. Random variable R indicates if it rains or not,
W indicates if the grass is wet or not, and U indicates if people bring umbrella or not. (b) An
example of a Dynamic Bayesian Network. Each morning an automatic sprinkler randomly
decides whether it should water the lawn or not conditioned on whether it watered the day before
and whether there was rain the day before.

A Dynamic Bayesian Network (DBN) is an extension of BN to handle time-series or
sequential data [161] and, therefore, are much more suitable to analyze nonstationary random
processes. In a DBN, the status of a node (variable) at time t0 is conditionally-dependent on its
parents’ state in history. Specifically, given a random variable xi at time T = t + 1, denoted by

( )

( )

xi t +1 , and its parents xπ(i), the value of xi t +1 is decided by the values of its parents xπ(i)

(1
observed during the interval T = 1 to T = t, denoted by x π :(ti)) . Note that T here is assumed to be a

( )

t +1
and
discrete variable. Similar to Bayesian networks, the statistical dependence between xi

Pr(W|R)
R = true
R = false
Pr(U|R)
W = true
0.7
0.4
U = true
W = false
0.3
0.6
U = false
Table 4.1:Conditional probabilities Pr(W|R) and Pr(U|R).

61

R = true
0.9
0.1

R = false
0.2
0.8

(t +1) | x (1:t ) ), and the conditional probability
π (i )

(1
x π :(ti)) is captured by conditional probabilities Pr( xi

(
(
Pr ⎛ x1t +1) , x 2t +1) ,..., x (t +1) x (1:t ) ⎞ is computed as
⎜
⎟
N
⎝
⎠
N

(1:t )
(
(
Pr⎛ x1t +1) , x2t +1) ,..., x (t +1) x (1:t ) ⎞ = ∏ Pr⎛ xi(t +1) xπ (i ) ⎞.
⎜
⎟
⎟
⎜
N
⎝
⎠
⎠
⎝

(4.3)

i =1

( )

In many cases, and for the sake of simplicity, it is often assumed that xi t +1 is only
dependent on the value of its parents observed at time T = t, which simplifies the conditional

(t +1) x (1:t ) ⎞ to Pr ⎛ x (t +1) x (t ) ⎞ . This is known as the Markov assumption
probabilities Pr ⎛ x i
⎜
⎜ i
π (i ) ⎟
π (i ) ⎟
⎝
⎠
⎝
⎠
with Markov lag equal to 1. A special case of DBN is the well-known Hidden Markov Model
(HMM) [164]. This simplification to the Markov assumption can be extended to include multiple

( )

Markov lags. For instance, a DBN with maximum Markov lag equals 3 implies that xi t +1 is

(t −2:t )
decided by the value of its parents observed at time T = t, t − 1, t − 2, or Pr⎛ xi(t +1) xπ (i ) ⎞ .
⎜
⎟
⎝
⎠
Figure 4.1b is an example of a DBN. Each morning, an automatic sprinkler decides whether to
water a lawn. The sprinkler makes a probabilistic decision based on 1) whether it watered the
lawn the day before and 2) whether there was rain the day before. The conditional probability
distribution table of Pr(S

t+1

t

t

= true| S , R ) is listed in Table 4.2. Note that in this example, the

variable S is also one of its parents.

t+1

Pr(S

t

t

= true| S , R )
0.8
0.3
0.3
0.1
t+1

Table 4.2: Conditional probabilities Pr(S

t

t

t

S
false
true
false
true

| S , R ).
62

t

R
false
false
true
true

4.2 Learning Bayesian Networks
Learning a Bayesian network from data involves two tasks: learning the structure of the
network and learning the parameters of the conditional probability distributions. Structure
learning of Bayesian networks is much more difficult compared to parameter learning because
once the structure is known, it is easy to learn the parameters of the conditional probability
distributions using existing algorithms such as Maximum Likelihood Estimation (MLE).
Learning the structure of the network can be formulated as searching for a network structure
G* that best fits the observed data D. Formally, this is expressed as

G* = arg max(log Pr (G D ))

(4.4)

G

= arg max(log Pr(D G ) + log Pr(G )),
G

where Pr(D|G) is the likelihood of the data D given the structure G and Pr(G) is the prior
probability of G. In our experiments, we assumed a uniform distribution for Pr(G).
There are many structure learning algorithms, and the most popular among them are the
score-based approaches [163, 165, 166]. Figure 4.2 outlines the steps of the DBN structure
search algorithm. A scoring function is first defined to estimate the likelihood Pr(D|G) by which
a given Bayesian network structure is evaluated on a given dataset, and then a search is
performed through the space of all possible structures to find the one with the highest score.
Score-based approaches are typically based on well-established statistical principles such as
Minimum Description Length (MDL) [167], Bayesian Dirichlet equivalent (BDe) score [168] or
Bayesian Information Criterion (BIC) [169]. In our analysis, we used the BDe score. Let θG
denote the parameters for the conditional probability distribution for structure G. To evaluate the

63

Spike Trains
D

Initial Random
Structure
Structure

Markov Lags

Compute Bayesian
Score
Pr(G|D)

G

Score
No

Modify

End Search?
Yes

Inferred Structure (G*)
Figure 4.2: Schematic of the DBN structure search algorithm.

posterior Pr(D|G), one needs to consider all possible parameter assignments of θG, namely

log Pr(D G ) = log ∫ Pr (D G, θ G ) Pr (θ G G ) dθ G ,

(4.5)

where Pr(D|G,θG) is the probability of the data given the network structure and the parameters
and Pr(θG|G) is the prior probability of the parameters which penalizes complex structures.
Under the assumption that the distribution of each node in the network can be learned
independently of all other distributions in the network and assuming Dirichlet priors, the BDe
score can be expressed in a closed form of equation (4.5) as [163, 170]

( )

(

'
⎛
'
Γ Z ijk + Z ijk
Γ Z ik
⎜
+ ∑ log
log Pr (D G ) = ∑ ⎜ log
'
'
Γ Z ijk
Γ Z ik + Z ik
i ,k ⎜
j
⎝

(

)

( )

)⎞⎟

(4.6)

⎟
⎟
⎠

∞

x −1 −t
where Γ( x ) = ∫ t e dt is the Gamma function satisfying Г(x+1)=xГ(x) and Г(1)=1,
0

(1
Z ik = ∑ Z ijk , Zijk is the number of times random variable xi(t ) = j and x π :(ti)) = k ,
'
Z ik =

'
(
'
∑ Z ijk and Z ijk = a Pr(xi(t ) = j, xπ1:(ti)) = k G0 ), where a is the equivalent sample size and
j

j

G0 is a prior structure.

64

The associated optimization of this approach, however, is intractable [171]. Commonly used
methods to alleviate this problem include: 1) using some pre-processing methods like conditional
independence test to infer the Markov blanket for each node, and thus limiting the candidate
structure space; 2) using heuristics, like greedy search or simulated annealing, to find the suboptimal structures. Herein, we use a score-based approach with simulated annealing search.
Simulated annealing search avoids falling in local maxima by choosing certain modifications to
the structure that do not necessarily increase the score. Specifically, if for a given modification
the score increases, then this modification is accepted, while if the score decreases, then this
modification is accepted with probability exp(Δe/T0) where Δe is the change in the score, which
is negative in this case, and T0 is the system ‘temperature’. The search starts with a very high T0
so that almost every structure modification is accepted, and then decays gradually as the search
process progresses [172].

4.3 DBN and Spike Trains
DBN has been recently used in inferring transcriptional regulatory networks from gene
expression data [173-175]. For brain connectivity, DBN has been coarsely applied to infer
effective connectivity between multiple brain areas from fMRI data [176, 177] and, at a more
refined resolution, from multi-unit activity recorded throughout the songbird auditory pathway
[178]. Nevertheless, the use of DBNs to characterize cortical networks at the single neuron and
the ensemble level has not been fully explored. This chapter demonstrates how DBN can be used
to infer neuronal connectivity from spike trains.
In the context of spiking cortical networks, we model each neuron in the data by a set of
nodes where each node corresponds to the neuron’s firing state (‘0’ or ‘1’) at a given Markov
65

lag. A directed edge between two nodes represents a causal relationship between them that is
detected at the associated Markov lag. Figure 4.3 illustrates an example of a causal network and
the corresponding DBN. Consider for example the connection between neurons 3 and 4 with a
synaptic latency of 2 bins. This connection is represented in the DBN as a directed edge from
one node representing neuron 3 firing state at a Markov lag of 2 to another node representing the
current firing state of neuron 4. Modeling time dependency allows DBN to model cycles or
feedback loops as illustrated by the loop involving neurons 1, 2 and 3 (represented by a DAG in
the DBN). Moreover, this makes DBN capable of capturing relationships that appear at different
time lags. This is particularly useful considering that many unobserved neurons, e.g.
interneurons, may exist in the pathway between observed efferent and afferent neurons. This
may contribute considerable latency between causal events as a result of the aggregate synaptic
delay.

2

...

3

...
...

1

l31=1
l32=3

3
l43=2

2

Neuron

l12=2

1

4

1

Markov Lag
. . . t-3 t-2 t-1
...

4

(a)

t

(b)
Figure 4.3: (a) A simple causal network where nodes represent neurons and edges represent
connections. lij indicates the synaptic latency associated with each connection. (b) Corresponding
DBN where black nodes represent the neuron firing state at a specific Markov lag while white
nodes represent the present state.

66

4.4 Simulation Model
To examine the performance of DBN in reconstructing a spiking neuronal network, it was
essential to use a model to simulate the spike train data in which the “ground truth” was known.
In our simulations, we used a variant of the generalized linear model (GLM) proposed by [98]
similar to the model used in Chapter 3. In this model, the spike train Si of neuron i is expressed

(

)

as a conditionally-Poisson point process with mean intensity function λi t H i (t ) , where H i (t )
denotes the firing history of all the processes that affect the firing probability of neuron i up to
time t (e.g. stimulus feature, other neurons’ interaction, the neuron’s own firing history, etc…).
Herein, we do not consider an explicit external stimulus to drive the network, albeit this can be
included as an object in the graphical representation in the stationary case or as a set of objects
with known transitions in the nonstationary case. We focus on two main variants contributing to

λi (t H i (t )) : a) the neuron’s background level of activity; and b) the spiking history of the
neuron itself and that of other neurons connected to it. Mathematically, the firing probability
f (S i (t ) = 1) of neuron i at time t can be expressed as
M ij
⎛
⎞⎞
⎛
⎜
⎟⎟
⎜
f (S i (t ) = 1) ≅ λi (t H i (t ))Δ = ⎜ exp⎜ β i + ∑ ∑ α ij (mΔ )S j (t − mΔ )⎟ ⎟Δ,
⎟⎟
⎜
⎜
j ∈π (i ) m = 0
⎠⎠
⎝
⎝

(4.7)

where Δ is a very small bin width, βi is the log of the background rate of neuron i, π(i) is the set
of pre-synaptic neurons of neuron i (this set includes neuron i itself to model self-inhibition), Mij
is the number of history bins that relate the firing probability of neuron i to activity from neuron
j, αij models the connection between neuron i and neuron j (excitatory or inhibitory), and Sj(t −
mΔ) is the state of neuron j in bin m (takes the value of 0 or 1). Unlike the model in [98], the
length of the history interval of interaction between neurons i and j (equals Mij × Δ) was not

67

fixed for all neuron pairs, consistent with the notion of cell assemblies [179]. To mimic the
influence of excitatory post-synaptic potential (EPSP) and inhibitory post-synaptic potential
(IPSP), we used the following decaying exponential functions for synaptic coupling [180-182]
⎧
⎪0
⎪
⎩± Aij exp − 3000 t − lij Δ M ij

±
α ij (t ) = ⎨

(

(

)

, if t < lij Δ

)

(4.8)

, if t ≥ lij Δ

where t is the time (in seconds), +/- indicate excitatory/inhibitory interactions, Aij models the
strength of the connection, and lij models the synaptic latency (in bins) associated with that
connection. Similar to the effects of EPSP and IPSP, the decaying exponential in these
expressions implies that a spike from a neuron j occurring towards the end of the history interval
has a smaller influence on the firing of neuron i than if it were to occur at a closer time. Figure
4.4 illustrates α ij (t ) showing that α ij (t ) attains its peak value at lijΔ and that the time constant
+

+

of the decaying exponential is M ij 3000 .

⎧0
⎪
⎪ Aij exp − 3000 t − lij Δ M ij
⎩

+
α ij (t ) = ⎨

Aij

lijΔ

(

(

l ij Δ + M ij 3000

)

)

, if t < lij Δ
, if t ≥ lij Δ

M ij Δ

t

+
Figure 4.4: An illustrative graph of α ij (t ) where lijΔ is the synaptic latency, Aij models the

connectivity strength, Mij/3000 is the time constant of the decaying exponential where Mij
denotes the number of history bins, and MijΔ is the history interval.
68

4.5 Results
We tested the algorithm on multiple network structures simulated using the point process
model in equation (4.7). For each parameter setting in the results that will follow, we generated
100 networks of different structures, each containing 10 randomly connected neurons, where for
each neuron, the indices of pre-synaptic neurons were drawn from a uniform distribution to
avoid biasing the algorithm to a specific model. In addition, each neuron had a self-inhibitory
connection to model post-firing refractoriness and recovery effects. The duration of the
generated spike trains was set to 1 minute with a bin width of 3 ms that included the refractory
period.
In our experiments, we used the Bayesian Network Inference with Java Objects (BANJO)
toolbox [178]. We used the simulated annealing search algorithm in all the analyses [172]. The
search time was set to 1 minute in all the analyses except for Section 4.5.6, where the
performance was examined as the search time was increased for large populations. All the
analyses were performed on a Dual Intel Xeon machine (2.33 GHz, 64 bit) with 8 GB of
memory. We used the F-measure to quantify the inference accuracy [183]. This measure is the
harmonic mean of two quantities: the recall R and the precision P, defined by

C
C
,P =
C+M
C +W
2 RP
2C
=
F=
,
R + P 2C + M + W

(4.9)

R=

where C is the number of correctly inferred connections, M is the number of missed connections
and W is the number of erroneously inferred spurious connections. Thus, F will be ‘0’ if and only
if none of the true connections are inferred (C = 0) and will be ‘1’ if and only if all the true
connections are inferred (M = 0) and no spurious connections are inferred (W = 0). Note that we

69

are interested here in comparing the connections inferred by the DBN to the true connections
regardless of the Markov lag they appear at or the strength of the actual connections. For
example, if a true connection has a synaptic latency of 1 bin, it should be inferred at a Markov
lag of 1. If it were to be inferred at a Markov lag of 2, it will still be considered as correct when
computing F.
4.5.1

Networks with Fixed Synaptic Latency

We thoroughly investigated the performance of DBN by varying the model parameters in
equations (4.7) and (4.8) while keeping the synaptic latency for all neurons fixed at 3 ms. This
mimics direct connectivity that may exist in a local population rather than diffuse connections
that may be observed across different parts of the cortex. Figure 4.5 shows the performance of
DBN for different settings of the model parameters when the DBN Markov lag was set to match
the synaptic latencies. Each point represents the mean and standard deviation of the inference
accuracy for 100 different network structures.
We initially examined the performance as the number of pre-synaptic neurons was varied
between 1 and 6 while fixing all other parameters. All the connections here were excitatory with
a history interval of 180 ms. We decreased the strength of the connections Aij in equation (4.7) as
shown in the inset in Figure 4.5a as the number of pre-synaptic neurons increased in order to
keep the mean firing rate in the range of 20 to 25 spikes/sec and prevent unstable network
dynamics while the background rate of each neuron was set to 10 spikes/sec. The results in
Figure 4.5a suggest that DBN achieves 100% accuracy with a slight decline above 4 input
connections. This decline can be attributed to the decrease in the synaptic strength

70

0.9

0.9

0.6

1
0

2 4 6
1
2
3
4
5
6
# Pre-synaptic Connections
(a)

F-Measure

0.9
0.8
0.7
0.6

No 0.25 0.5 1 2
I/E Ratio
Inhib.
(c)

0.7

2
1

0
0 200 400
60
180
300
420
History Interval (ms)
(b)

DBN
PDC
GLM

1

0.8

0.6

Coefficient of Variation

0.7

2

Aij

0.8

F-Measure

1

Aij

F-Measure

1

4

1.4
1.2
1
0.8
0.6
0.4

No 0.25 0.5 1
I/E Ratio
Inhib.
(d)

2

4

F-Measure

1
0.9
0.8
0.7
0

5
10
15
20
25
Background Rate (spikes/sec)
(e)
Figure 4.5: (a) DBN performance vs. number of excitatory input (pre-synaptic) connections.
Inset: Connection strengths Aij for each choice of that number. (b) DBN performance vs. firing

history interval length. Inset: Connection strengths Aij for each choice of this length. (c) DBN
performance (solid) vs. ratio of inhibitory to excitatory synaptic strength (I/E Ratio). The
accuracy gets closer to unity as inhibition strength surpasses excitation strength (I/E ratio
increases above 1). Partial Directed Coherence (PDC) (dashed) and Generalized Linear Model
(GLM) fit (dotted) performance shown for comparison. (d) Coefficient of variation for the data
analyzed in (c). (e) DBN performance vs. background rate.

71

Aij, thereby reducing the influence an input spike from neuron j has on the firing probability of
neuron i.
We then varied the history interval of the interaction by varying Mij and examined the
performance. Each neuron received two excitatory connections of equal weight from two
neurons. The weights were adjusted in order to keep the mean firing rate in the same regime for
different settings of Mij as illustrated in the inset of Figure 4.5b. Given that the synaptic latency
was fixed among all neurons and matched the DBN Markov lag used, we expected the
performance to be invariant to changes in the history interval length. Figure 4.5b illustrates that
this is indeed the case and that the DBN performance is almost steady regardless of the length of
the history interval.
When inhibitory connections exist, the inference task becomes more complicated,
particularly for hypoexcitable neurons, due to the fact that the neuron’s spiking pattern becomes
very sparse. As a result, observing a spike event may contain a lot more information about causal
effects than not observing one. We further investigated the performance in the presence of
inhibitory connections with variable degrees of strength. Each neuron received two input
connections, one excitatory and one inhibitory. The excitatory synaptic strengths in equation
(4.8) were fixed at 2.5, while those of the inhibitory connections were varied. We defined I/E
ratio as the ratio of the cross-inhibitory to cross-excitatory synaptic strength and tested ratios of
0.25, 0.5, 1, 2, and 4, respectively. All other parameters were set as previously described. Taking
the self-inhibition mechanism inherent in our model into account (Aii = -2.5), an I/E ratio of 4
would correspond to a neuron with equal degree of inhibition and excitation. Such neuron should
display homogeneous – Poisson like - characteristics reminiscent of independent firing. Figure
4.5c illustrates the inference accuracy as a function of the I/E ratio. For ratios below 1, a neuron
72

is more affected by the excitatory connection than the cross-inhibitory one. Therefore, the drop
in performance observed suggests that DBN is unable to detect the presence of weak inhibition in
the presence of a strong excitation. This can be attributed to the relatively insignificant effect of
weak inhibitory input on the firing characteristics of post-synaptic neurons conditioned on
receiving a strong excitation. When the I/E ratio increases above 1, the accuracy rapidly gets
closer to unity and does not deteriorate even when the inhibitory connections are 4 times stronger
than the excitatory connections.
To interpret these results, we used the coefficient of variation (CV) of the inter-spike interval
(ISI) histograms (CV = std(ISI)/mean(ISI)) to quantify the variability in the firing characteristics
for variable I/E ratios. We hypothesized that if weak inhibitory connections do not significantly
influence the firing of these neurons, then the CV should be close to those receiving pure
excitation. Moreover, one would not expect to see a sharp transition in CV characteristics for I/E
values adjacent to purely-excitatory connections. Figure 4.5d demonstrates that this is indeed the
case. The average CV for I/E ratios of 0.25 is not significantly different from that of purelyexcitatory connections. This suggests that weak cross-inhibition did not result in large effects on
the neurons’ firing characteristics, making it more difficult for DBN to detect this type of
connectivity. The CV monotonically increases as the I/E ratio increases, reaching an average of 1
(similar to that of an independent Poisson neuron) when cross-inhibition, on average, is 4 times
stronger than excitation (I/E ratio = 4).
We compared DBN performance to a related measure of connectivity - Partial Directed
Coherence (PDC) – that has been proposed to study causal relationships between signal sources
at coarser resolution in EEG and fMRI data [89]. PDC is the frequency domain equivalent of
Granger causality that is based on vector autoregressive models of certain order. A connection is
73

inferred between a given pair of neurons if the PDC at any frequency exceeded a threshold of
0.1. For each network, PDC was applied using models of orders 1 to 30. The accuracy shown by
the dashed plot in Figure 4.5c represents the maximum accuracy achieved across all model
orders. As can be seen, DBN outperformed PDC over the entire range of the I/E ratio, while
exhibits similar performance around an I/E ratio of 0.25. Closer examination of the networks
inferred using PDC for I/E ratios greater than 0.25 revealed that the deterioration in the PDC
performance compared to the DBN was due to its inability to detect most of the inhibitory
connections, consistent with previous findings with spectral coherence as well as Granger
causality [96, 97].
We also compared DBN performance to that obtained using a Generalized Linear Model
(GLM) fit [98, 184]. Ideally, GLM fit should yield the best result for our data since this is the
generative model we used in equation (4.7). A maximum likelihood estimate of the coupling
function αij in equation (4.8) for each neuron i is computed in terms of the spiking history of all
other neurons j within a certain window of length WGLM. Since our goal is to identify the
existence of connections and not to estimate the coupling function, we post-processed the
estimated coupling functions such that a connection was inferred if the estimated coupling
function was larger/lower than a given threshold for excitatory/inhibitory connections,
respectively, for 3 consecutive bins while their p-values were significant. The spiking history
considered for fitting WGLM was set to 20 bins (60 ms). The threshold was varied between 0 and
±1.5 and the p-value between 0.1 and 0.0001. The dotted plot in Figure 4.5c shows the maximum
accuracy obtained across all thresholds and p-values. Superior performance of the GLM can be
seen compared to DBN at I/E ratios below 1, while comparable if not slightly inferior for I/E
ratios above 1. We note, however, that the GLM approach has a number of limitations: First, the
74

performance is highly dependent on the choice of the fitting spiking history interval WGLM.
Second, the inference threshold and the p-values have to be carefully set to identify the
connections. Finally, the GLM method search time was approximately 20 times that needed by
the DBN to estimate the coupling functions for each 10-neuron population.
Finally, we examined the performance as a function of the background rate in equation (4.7).
This may mimic variations in the afferent input current to the neuron, and increments in this
input when there is not coupling to other neurons are known to impact estimates of correlation
between their output spike trains [185]. This is because correlation between a neuron pair cannot
be orthogonally separated from their firing rates, thereby potentially leading to spurious
connectivity inference [186]. Here, we had 2 input connections per neuron, one excitatory and
one inhibitory having the same strength. Figure 4.5e illustrates that the inference accuracy was
above 96% for background rates higher than 5 spikes/sec. Most remarkable is the ability of DBN
to infer roughly 70% of the connections at background rates around 2 spikes/sec, despite that, at
this low rate, neurons are “silent” most of the time.
4.5.2

Mismatch between Actual Synaptic Latency and DBN Markov Lag

In practice, the synaptic latency between any two observed neurons is unknown and it is
more likely that synaptic latencies between network elements to be heterogeneous, reflecting the
highly distributed nature of cortical processing. For example, the synaptic weight functions
expressing the EPSP and IPSP characteristics in Figure 4.6a for different choices of M illustrate
that for relatively longer history intervals, the effect of an input spike on the firing probability of
a post-synaptic neuron should last longer. This should enable the DBN to infer these connections
even if the Markov lag is chosen to be larger than the true synaptic latency. On the

75

M=8
M=20
M=60
M=100
M=140

α+(t)
ij

0.8
0.6
0.4

1
F-Measure

1

0.2
0

0.8
0.6
0.4

M=8
M=20
M=60
M=100
M=140

0.2
0

20

40

0

60 80 100
Bins

(a)

2
4
6
Markov Lag (bins)
(b)

8

1

F-Measure

0.8
0.6
0.4

M=8
M=20
M=60
M=100
M=140

0.2
0

0

10
20
PDC Order
(c)

1

1
0.5
M=8
0
M=20
M=60
M=100
M=140
0
0.5
1
Threshold

F-Measure

F-Measure

30

0.8
0.6
0.4
0.2
0
0

1.5

20

40

60
80
WGLM (bins)

100

(e)

(d)

+
Figure 4.6: (a) Coupling function α ij (t ) for different history intervals M (in bins). (b) DBN

performance vs. Markov lag for variable history intervals and fixed synaptic latency (4 bins).(c)
PDC performance vs. model order for the data analyzed in (a). (d) GLM fit performance vs.
detection threshold for the data analyzed in (b). (e) GLM fit performance vs. GLM parameter
WGLM when the model parameter M was set to 60 bins.
76

other hand, the influence of an input spike on the post-synaptic neuron firing for a short history
interval should behave in a similar fashion.
Figure 4.6b demonstrates the performance of the DBN when there is a mismatch between the
DBN Markov lag and the synaptic latency in the model. In this experiment, we generated 100
different networks, 10 neurons each, in which each neuron had two input connections, one
excitatory and one inhibitory of the same strength. The synaptic latency was fixed for all neurons
at 4 bins. As can be seen, when the Markov lag was set to a value smaller than the populations’
true average synaptic latency, almost none of the connections was inferred. On the other hand,
when the Markov lag was set to be larger than the true synaptic latency, the inference accuracy
deteriorated only slightly for relatively long history intervals, while deteriorated significantly for
shorter history intervals. To interpret this result, let’s first denote by lij the synaptic latency
between neurons j and i. The firing of neuron i at time bin m is only affected by the firing of
neuron j in the range [m−Mij, m−lij], where Mij is the number of bins in the history interval.
Thus, setting the Markov lag to a value less than lij (for example, trying to relate the firing of
neuron i at time bin m to that of neuron j at time bin m − lij + 1) did not help to detect the
presence of the connection between the neuron pair. Not surprisingly, DBN attained accuracy
close to unity when the Markov lag matched the synaptic latency, consistent with the results
previously shown in Figure 4.5b.
For the sake of comparison, Figure 4.6c and Figure 4.6d show the performance of PDC and
GLM-fit, respectively. For PDC, almost none of the true connections were inferred when the
model order was less than the synaptic latency. The performance improved when the model
order was set greater than or equal to the synaptic latency, but the accuracy never exceeded 0.8
across the entire range of history intervals considered in the model. The drop in performance was
77

again due to the inability of PDC to infer inhibitory connections. For the GLM fit shown in
Figure 4.6d, the spiking history interval WGLM considered in the fitting procedure was kept fixed
at 20 bins (60 ms) for all the examined populations. As can be seen, the GLM fit highly depends
on the threshold used to detect the existence of connections.
To demonstrate GLM fit dependency on the choice of WGLM, we examined its performance
in Figure 4.6e when the history Mij in the model was fixed to 60 bins. Peak performance was
obtained when WGLM was set to 20 bins. For this choice of Mij (60 bins), the weight function

α ij (t ) in equation (4.8) falls to roughly 95% of its peak value after 20 bins (Figure 4.6a).
Therefore, a choice of WGLM that does not match the GLM model order may result in over or
under fitting.
4.5.3

Networks with Variable Synaptic Latencies

As stated before, a functional cortical network is more likely to have heterogeneous synaptic
latencies. Typically, synaptic latencies can range from a few milliseconds for monosynaptic
connections [31, 187] to a few tens of milliseconds for polysynaptic connections [188, 189]. In
addition, the limited sample size precludes the ability to record all the neurons that are directly
connected in a given population. It is therefore much more realistic for our model to have
variable synaptic latencies in order to test the applicability of the method to data from real
cortical networks.
We investigated the performance when multiple synaptic latencies exist within the
population. We defined a heterogeneity index (HI) as the number of distinct synaptic latencies
that exist in the network as illustrated by the simple network in Figure 4.7a. Figure 4.7b
demonstrates the accuracy for networks of 10 neurons with different HIs where HI = 1 implies

78

1
l=4
5

l=2

l=1

3

l=3
2

F-Measure

1

l=2
4

l=5
6

0.9
0.8
DBN
GLM
PDC

0.7
0.6

1

2

3
HI

4

5

(a)
(b)
Figure 4.7: (a) A simple network illustrating the dependence on the synaptic heterogeneity index
(HI). The subpopulation inside the dashed circle has a HI of 3 while the entire population
(neurons 1 to 6) has a HI of 5. The firing history was fixed at 60 bins (180 ms). (b) DBN, PDC
and GLM-fit performance vs. HI.

that all neurons had the same synaptic latency while 5 implies that every 20% of the neurons in
the network had similar synaptic latency. Each neuron received 2 random connections, one
excitatory and one inhibitory. In order to identify connections at different synaptic latencies, we
applied DBN with a range of Markov lags such that the maximum lag matched the maximum
synaptic latency in the population while the minimum lag was set to 1. In this case, DBN
considers those lags simultaneously and tries to identify the best lag at which each connection is
most prominent. Surprisingly, Figure 4.7b demonstrates that the accuracy was not significantly
impacted at high HI. In addition, the DBN performance was superior to that of the GLM-fit and
PDC for the same populations.
4.5.4

Identifying Monosynaptic Connectivity

One important feature of using DBN in structure learning is its ability to detect causal
relationships between the children nodes and their immediate monosynaptic parents and not the

79

higher-order polysynaptic parents. Figure 4.8a shows a simple network consisting of a 3-neuron
chain where a directed information flow characterizes the connectivity such that neuron A excites
neuron B, while in turn neuron B excites neuron C. The synaptic latency of both connections was
set to 1 bin. In this case, neurons A and B are considered monosynaptic parents of neurons B and
nd

C, respectively. On the other hand, neuron A is considered a 2

order polysynaptic parent

(ancestor) of neuron C. It is expected that the connections between neurons B and C and their
monosynaptic parents can be best detected using a DBN with a single bin Markov lag, while a
nd

spurious monosynaptic relationship between neuron A and neuron C (2

order connection)

might be detected with a 2-bin Markov lag as a result of the polysynaptic pathway between the
two neurons. The complexity in discriminating between these two types of connections would be
most pronounced in the absence of prior knowledge of the best Markov lag to use, particularly
when analyzing a heterogeneous network such as the ones analyzed in Figure 4.7. Herein, we
demonstrate the ability of DBN to restrict its inference to only direct causal relationships despite

False Connections/Chain

using a non-optimal Markov lag.

l=2
A

l=1

B

l=1

C

0.15
0.1
0.05
0
1

2
3
(1, 2)
Markov Lag (bins)

(b)
(a)
Figure 4.8: (a) Example 3-neuron chain, where solid arrows represent true connections while the
dotted one represents a spurious connection, l indicates the synaptic latency associated with each
connection. (b) Average number of spurious connections inferred per chain.
80

We investigated the performance using 100 different network structures with a history
interval fixed to 60 bins (180 ms), a fixed synaptic latency of 1 bin, and 1 excitatory and 1
inhibitory input connections per neuron. Each of the 100 networks examined had a total of 40
chains with 3 neurons each similar to the one in Figure 4.8a. A successful DBN inference should
nd

yield two connections matching the true monosynaptic connections and no 2
nd

per chain. Figure 4.8b shows the number of erroneously inferred 2

order connections

order connections per chain

versus Markov lag. It can be seen that when a Markov lag of 1 bin was used, DBN did not infer
nd

any of the spurious 2

order connections as expected since the lag is shorter than the

superposition of two synaptic latencies. When a Markov lag of only 2 bins was used, DBN
surprisingly had a very low error rate despite that true monosynaptic connections should be
nd

inferred in addition to the spurious 2

order connection in each chain as shown in Figure 4.8b.

We attributed this superior performance to two factors: first, the effect of the pre-synaptic
neuron on the probability of firing of the post-synaptic neuron lasts for a period of time governed
by the synaptic coupling function given in equation (4.8). Given the exponential decay form of
this coupling, it is expected that the probability of firing at times beyond those governed by the
synaptic latency will be influenced by the time constant of that exponential function. Consider
the example in Figure 4.8a, when the synaptic latency is 3 ms (1 bin) and given the firing of
neuron A at time t, equation (4.8) yields a probability of firing for neuron B at time t + 3 ms of
0.25, while that at time t + 6 ms (i.e. after 2 bins of neuron A firing) to be 0.2. Thus, the
probability of firing at t + 6 ms has only declined by 20% of its maximum attained at t + 3 ms.
Therefore, the connection A→B can still appear at a Markov lag of 2. The same applies to the
connection B→C. Second, as the DBN tends to find the network structure with the least number

81

of edges to explain the data, the connection A→C is a redundant one given the presence of the
connections A→B and B→C.
Figure 4.8b also shows that a Markov lag of 3 results in a slight increase in the number of
nd

spurious 2

order connections. This can also be attributed to the effect of the time constant of

the synaptic coupling. As the lag increases, the effect of the input spike decreases, making it
relatively harder to detect true monosynaptic connections, while increasing the probability of
nd

detecting only spurious 2

order connections. Finally, at a Markov lag range of 1 and 2 in which
nd

connections at both lags should be simultaneously identified, no 2

order connections were

inferred since in this case DBN finds the best lag (or a combination of lags) that best explains the
data. These results indicate that even if the Markov lag is not correctly set, DBN can still succeed
in identifying only true monosynaptic connections provided that the lag range is chosen equal to
or more than the maximum anticipated synaptic latency in the population.
4.5.5

Networks with Unobserved and Independent Neurons

Cortical neurons are known to receive common excitatory and inhibitory inputs from other
regions [190-193]. These common inputs are a major source of complexity in inferring causal
relationships since they cannot be distinguished from actual coupling caused by true synaptic
links. We investigated DBN performance when unobserved common inputs exist in the form of
synaptic coupling from unobserved neurons. Figure 4.9a shows a sample structure of the network
examined in which unobserved neurons in the top row simultaneously excite pairs of observable
neurons in the bottom row. The dashed connections in the figure indicate the connections that are
expected to be inferred by DBN when the synaptic latency is fixed. These bidirectional

82

connections indicate that each pair of neurons receives the same input at the same time, and
therefore there should be no causal, time-dependent relationship between them.
We studied the performance of DBN in inferring connectivity between the observed neurons
(indexed 6 to 15). The synaptic latency of connections to even numbered neurons le was kept
fixed at 1 bin (3 ms) while that to odd numbered neurons lo was varied. Figure 4.9b demonstrates
the inference accuracy plotted as a function of the difference l o - l e . We considered a
bidirectional connection as an indicator of a common input. The Markov lag of the DBN was set
to 1 bin (3 ms). As can be seen, the inference accuracy peaks when the synaptic latency of the
connections to both odd and even numbered neurons was the same (i.e. the difference in synaptic
latency lo − le is 0). As the difference in synaptic latency increases, the accuracy decreases. This
may be interpreted as follows: when there is a difference between the synaptic latencies of the
common input, an indirect relationship between the neurons receiving that common input
emerges. For example, consider the common input from neuron 1 to neurons 6 and 7 shown in
Figure 4.9a when the synaptic latency of the connection 1→ 6 is 1 bin (3 ms) while that of 1→7
is 2 bins (6 ms). In this case, the influence of an event from the common input (neuron 1) on
neuron 7 probability of firing has a latency of 1 bin relative to its influence on neuron 6. Thus,
not observing the common input coupled with a difference in synaptic latency between that input
and the observed neurons results in a spurious connection between the observed neurons directed
from the shorter latency neuron to the longer latency one. On the other hand, an inferred
bidirectional connection between two observed neurons might indicate a common input
simultaneously received by both neurons. Nevertheless, it can still be indicative of a pair of
unidirectional physical connection between the two neurons rather than a common input.

83

1

…

2

le

lo
7

6

le
8

lo
9

5

le
14

…

lo
15

(a)

F-Measure

1
0.8
0.6
0.4
0.2
0
0

1
2
l - l (bins)
o

(b)

15
1

8

20
2

3

9

14

2

4

6

11

10

15

13

16

9

3
6

5

12

17

7
13

8

1

4

5
12

e

14

18

10

3

11

19

7

(c)
(d)
Figure 4.9: (a) Network structure examined where black nodes 1-5 in the dashed rectangle
represent unobserved neurons. Solid arrows indicate real connections while dashed ones indicate
the expected connections to be inferred by DBN. (b) DBN performance vs. difference between
the synaptic latency of the connections to odd numbered neurons and even numbered neurons (lo
− le) where le is fixed at 1 bin. (c) A network of 20 neurons, where white nodes indicate
observed neurons while black nodes indicate unobserved neurons. (d) A network with 15
observed neurons; neurons 1-10 (inside the dashed circle) receive connections while neurons 1115 are not connected to any neurons. The history interval was set to 60 bins (180ms) and the
synaptic latency was fixed to 1 bin (3 ms) for all connections.

84

Discrimination between the two cases can be achieved if prior knowledge about the anatomy of
the brain area and the duration of the synaptic latencies are available.
We further examined the performance in the case of unobserved neurons within the
population. We were interested in assessing whether this would hinder our ability to detect
functional relationships between the actual observed neurons. We randomly selected 6 neurons
to be unobserved from 20-neuron populations as shown in Figure 4.9c. DBN was applied to the
spike trains of the remaining 14 neurons. The synaptic latency of all the neurons was set to 1 bin
(3 ms) and the Markov lag was also set to 1 bin to match the synaptic latency. The inference
accuracy achieved by examining 100 different networks was 0.96±0.03. When computing the
accuracy, all the connections that involved the unobserved neurons in the simulated networks
were not considered.
We also examined the performance when some of the observed neurons are not elements of
the functional network. These can be typically thought of as task-independent neurons.
Identifying these neurons can be useful in the context of decoding spike trains. In such a case,
the performance of a decoder should improve if the posterior probability of the stimulus is
expressed in terms of the conditional probabilities of the task-dependent neurons, while
excluding the task-independent ones from the computation of the joint prior distribution [194].
Figure 4.9d shows a sample network of 15 neurons in which neurons 1-10 receive 2 input
connections each (1 excitatory and 1 inhibitory), while neurons 11-15 are not connected to any
other neurons. The Markov lag was set at 1 bin. The inference accuracy obtained by applying the
DBN to 100 networks was 0.98±0.07. The number of spurious connections inferred per network
between any of the connected neurons (i.e. neurons 1-10) and the unconnected ones (i.e. neurons
11-15) was 0.1±0.3, and occurred in 10 out of 100 networks. The number of connections inferred
85

per network among unconnected neurons (i.e. neurons 11-15) was 0.05±0.2 and occurred in only
5 out of 100 networks. To compare this number with what would be expected by chance, we
simulated 100 different networks with 10 unconnected neurons each. The number of connections
inferred per network was 0.02±0.14, and occurred in 2 out of 100 networks.
4.5.6

Identifying Connectivity in Large Populations

All the analyses shown so far were carried out on relatively small populations of 10 to 15
neurons each. Given that simultaneous recording of multiple single units in excess of these
numbers is expected, we were interested to investigate how the method scales as a function of
the number of neurons in the population. We simulated 10 different populations, 120 neurons
each. Each population consisted of 12 clusters, 10 neurons each, in which each neuron received 3
excitatory connections from neurons belonging only to its own cluster. The history interval was
set to 60 bins (180 ms) and the synaptic latency to 1 bin (3 ms).
Figure 4.10a shows the inference accuracy when DBN was applied to the entire population
with a Markov lag set to 1 bin (3 ms) for different search times. For a search time of 1 min, it is
clearly seen that DBN was unable to infer the network structure (only 15% accuracy). To
improve the accuracy, two different approaches were examined. First, when allowing the DBN
more search time (2 hours), an inference accuracy of 100% was attained for the 120 neuron
population. Second, one can try to break down the large population into smaller, functionallyinterdependent subpopulations if prior knowledge of any clusters in the neural space is available.
When this is the case, we found that applying DBN to subpopulations when keeping the search
interval fixed at 1 min significantly improved the performance. Specifically, we divided each of
the 120 neuron populations into 2 subpopulations with 60 neurons each (i.e. clusters 1-6 in one

86

100
Time for F=1 (min)

1
F-Measure

0.8
0.6
0.4
0.2
0

1 min
5 min
10 min
60 min
120 min
0
30
60
90
120
# Neurons/Subpopulation
(a)

10

1

0.1
0

30
60
90
120
# Neurons/Subpopulation
(b)

F-Measure

1
0.8
0.6
0.4
0.2
0

40

80 120 160 200 240
Search Time (min)
(c)
Figure 4.10: (a) DBN performance vs. number of neurons/subpopulation for different search
intervals. 10 different populations of 120 neurons each were simulated. (b) Search time required
by DBN to achieve 100% accuracy (F-Measure = 1) vs. number of neurons/subpopulation. (c)
DBN performance vs. search time for 120 neuron populations with no clusters.

subpopulation and clusters 7-12 in the other). The performance increased in this case by more
than two fold as indicated in Figure 4.10a. When subsequent division is carried out into 3, 4, 6
and 12 subpopulations, with 40, 30, 20, and 10 neurons each, respectively, accuracy of 100%
was achieved on populations of sizes 10 to 20 neurons each. In this realm, the multiscale
clustering algorithm reported in Chapter 3 [27] was applied prior to DBN analysis. In summary,
for large populations, a two-stage framework is suggested in which the neural space is first
87

clustered to identify any potential statistical dependence between neuronal elements. This is then
followed by DBN analysis to identify the effective connectivity structure within each cluster.
Figure 4.10b illustrates the minimum search time needed to achieve 100% accuracy for each
subpopulation size. It can be seen that the computational complexity increases exponentially
with the population size. Finally, we investigated the performance when no clusters are present.
As can be seen in Figure 4.10c, when more search time is allowed (in this case by 2 fold), the
underlying network was correctly inferred.
4.5.7

Estimating the Markov Lag of Maximum Accuracy

We have demonstrated how the DBN performance is highly dependent on the selection of the
Markov lag and that the accuracy is maximized when the Markov lag is equal to or slightly
greater than the population true average synaptic latency. When dealing with real neural data,
prior knowledge of this synaptic latency may be unavailable, and therefore some measure is
needed to devise the best Markov lag to use for a given population. For that purpose, we
computed the mean influence score (IS), defined as the average absolute value of the influence
score for each inferred connection. The influence score IS(i, j) measures the degree to which
neuron j influences neuron i’s firing, independent of the output of the other parents of neuron i
[195]. Its computation is based on the same conditional probabilities used in inferring the
network structure. IS can be either positive or negative depending on whether the connection is
excitatory or inhibitory, respectively. Since IS measures the degree of influence of the presynaptic neuron on the post-synaptic neuron for a given inferred connection, it is expected that
the mean of the absolute value of the IS of all the inferred connections for a given network
structure will be maximum if the inferred structure matches the true one.

88

Mean IS

0.03
0.02
0.01

M=8
M=20
M=60
M=100
M=140

0
0

2
4
6
Markov Lag (bins)
(a)

8

0.025

0.8
Mean IS

F-Measure

1

0.6
0.4
0.2

0.02
0.015
0.01
0.005

0

0
1
2
3
4
5
1
2
3
4
5
Maximum Markov Lag (bins)
Maximum Markov Lag (bins)
(b)
(c)
Figure 4.11: (a) Mean influence score (IS) vs. Markov lag for variable history intervals for the
same data analyzed in Figure 4.6a. Note that the mean IS has similar profile to the accuracy. (b)
DBN performance vs. Markov lag for networks with HI equals 3. Maximum accuracy is attained
when the maximum Markov lag matches the maximum synaptic latency in the population. (c)
Mean influence score vs. Markov lag for networks with HI equals 3.

Figure 4.11a shows the mean IS at different Markov lags for the data sets previously
analyzed in Figure 4.6 in which the heterogeneity index (HI) was 0. Comparing the results in
Figure 4.11a with those in Figure 4.6, it can be seen that for each choice of history interval
length M, the mean IS profile matches the F-measure profile obtained when exact knowledge of
the network structures was available. Thus, the mean IS can be used to quantify the degree of
confidence in the inferred network and to potentially estimate the best Markov lag that explains
the data. Figure 4.11b shows the inference accuracy for more complex networks with HI = 3 at

89

different ranges of Markov lags. The inference accuracy peaks when the maximum Markov lag
matches the maximum synaptic latency in the population. Figure 4.11c shows similar result
using the mean IS. It is clearly seen that, similar to the fixed synaptic latency case (HI = 0), the
mean IS behaves similar to the F-measure. This confirms the utility of the mean IS as a metric
for determining which network best explains the data when information about the anticipated
synaptic latency is unavailable.

4.6 Discussion and Conclusion
Identifying the effective connectivity between cortical neurons is a fundamental goal in
systems neuroscience. In this chapter, we demonstrated the utility of Dynamic Bayesian
Networks (DBN) in inferring this connectivity from the observed spike trains under a wide
variety of conditions. Our main conclusion is that DBN is a useful tool for analyzing spike trains,
and is capable of identifying causal relationships between distinct neuronal elements in small and
moderately-large populations of neurons, particularly when nonlinear interaction between these
neurons is present.
We have applied the method to probabilistic neuronal network models. These models are
increasingly being used in the neuroscience community because they are highly non-linear,
stochastic, and faithfully model the dynamic discharge patterns of many cortical neurons. DBN
outperformed Partial Directed Coherence (PDC), particularly when temporal dependence was
variable among the population elements. The diminished PDC performance was mainly
attributed to its inability to detect inhibitory connections. Compared to GLM fit, DBN
performance was comparable overall, and occasionally superior in highly heterogeneous network
structures. The GLM performance was not surprising, given that it was tested on data simulated
90

using the same generative model, while the DBN did not assume any specific generative model.
Therefore, we expect DBN to have more powerful generalization capability compared to GLM
fit, although not tested here. Some disadvantages of the GLM method are the need to specify a
detection threshold (with associated p-values) due to the dependence of the neuron’s firing on the
spiking history of other neurons that is likely to be variable across neuron pairs. In addition, the
orders of magnitude difference in computational time, in favor of the DBN, makes the inference
task more daunting for the GLM.
Two issues deserved detailed investigations in the proposed approach. The first is the
dependence of the DBN performance on the Markov lag, which can be regarded as playing the
same role as the model order in the GLM fit. Nevertheless, choosing the Markov lag to be
exactly matching the synaptic latencies is not necessary, as long as it is not smaller than the
maximum synaptic latency expected in the population. Information about synaptic latency may
be available if knowledge of the anatomical structure of the sampled cortical regions is available.
In such a case, running the DBN using different Markov lags and finding the Markov lag(s) at
which the mean influence score is maximized should yield the best accuracy. The structure
obtained would be the most likely network structure that best explains the data.
The second issue is the complexity of the search in the high-dimensional neural space that is
directly proportional to the number of the neurons in the population. We found that the
computational complexity increases exponentially with the population size. However, we found
that when clusters of functionally interdependent neurons do exist, the computational complexity
can be reduced. This is important, given the plethora of studies suggesting that neuronal
interaction across multiple time scales may underlie multisensory neuronal integration in
naturalistic behavior, perception, and cognition. The results suggest that a two-step framework
91

may be best suited for large-scale analysis, whereby populations are first clustered to discover
any inherent statistical dependency between their elements. These clusters are subsequently
analyzed by DBN to infer their underlying structures [29]. The clustering step, however, can be
eliminated without affecting the DBN performance if longer search time is permitted.
As detailed later in Chapter 6, the proposed approach can also be useful in quantifying
synaptic plasticity that is strongly believed to underlie learning and memory formation. In such
case, graphical approaches such as DBNs can reveal potential plastic changes reminiscent of
cortical re-organization during learning or recovery from injury. It can also be used to identify
task-dependent neurons when neural decoding is sought to reconstruct a sensory stimulus or
predict an intended motor behavior in a wide range of neuroprosthetic applications.

92

Chapter ____________________________________________

5

Millisecond-Timescale Stimulusspecific Network Coding

______________________________________________________________________________

5.1 Rat Barrel Cortex
Temporal precision plays an important role in the coordination among neocortical neurons
and has been hypothesized to enhance information flow along sensory pathways [196]. However,
its role in mediating the integration of information at the output of these pathways remains
poorly understood. Using DBN and information theoretic measures, this chapter demonstrates
some evidence for the presence of stable, stimulus-specific networks that emerge in the barrel
cortex in response to individual whisker deflection [30]. The chapter also provides evidence for
the existence of position-specific networks in medial prefrontal cortex (mPFC) as rats forage in a
T-shaped maze during a working memory task [29].
The barrel cortex is part of the rat somatosensory cortex in which neurons encode
information about whiskers deflection [31]. Rats use their whiskers to gather information about
their surrounding environment and recognize features of external objects, much like humans use
their hands and digits for tactile sensation. The barrel cortex is organized into vertical columns,
or barrels, across different cortical layers. The cells within each barrel respond maximally to the
deflection of a single principal whisker. Figure 5.1a illustrates the organization of the barrel
cortex (layer IV) and the associated whiskers where whiskers/barrels are denoted by A-E for the

93

Rat Snout

A 1 2 34
B
C
D
E

Barrel Cortex

Layer
II/III

E
1
D
B C
3
A
4
P M
L A

IV
Va

VPM Thalamus
(a)
(b)
Figure 5.1: (a) (Left) The rat’s right snout with the whiskers organized in rows, indexed in letters,
and columns, indexed in numbers. (Right) A horizontal section of the barrel cortex at layer IV
showing the whiskers mapping (Adapted from [197]). (b) Flow of information from the
Thalamus through the barrel system. The thickness of the arrows correspond to the strength of
the connections [198].

rows and 1, 2, 3, etc. for the columns [197]. Figure 5.1b shows a schematic of the information
flow to the barrel system [198]. Specifically, upon a certain whisker deflection,
mechanoreceptors in its follicle modulate their firing activity. This activity is carried by sensory
fibers projecting to the brain stem and subsequently project to the thalamus [199]. Axons from
thalamic neurons innervate the barrel at layer IV corresponding to the deflected whisker as
shown in Figure 5.1b. This layer is considered as the entry point of information to the barrel
system at the cortical level. Inside each column, excitatory connections project from layer IV up
to layer II/III and down to layer V. Weaker excitatory and inhibitory connections extend to
neighboring barrels.
It has been suggested that precise spike timing relative to stimulus onset carries most of the
information about whisker displacement [48], and that this mechanism aids the animal during
active whisking episodes to recognize external objects. This is primarily present in layer IV
neurons as this layer receives numerous inputs from the ventral posterial medial (VPM) thalamus
through the lemniscal and paralemniscal pathways [200, 201]. Neurons in infragranular layer V,
on the other hand, exhibit more complex dynamics as they integrate inputs from multiple barrels
94

within and across hemispheres [202-204]. This integration creates larger receptive fields than
those typically found in layer IV cells [205, 206].
Because layer V is a major output layer to multiple structures such as the posterior medial
nucleus, zona incerta, pontine nuclei and the primary motor cortex [207-209], the frequently
observed stimulus-dependent correlation among layer V cells is believed to play an important
role in providing two streams of information [210]: a spatial code from ascending pathways
representing current whisker position, and a corticothalamic feedback representing information
about past whisker position [211]. These two streams are necessary to provide sufficient
information to primary motor cortex for mediating active whisking cycles during adaptive
exploratory behavior [212]. Despite the large body of reports suggesting that individual layer V
neurons encode spatial information, the precise mapping of this spatial information to temporal
coordination among layer V neurons during whisker movements remains poorly understood.
Given the information integration that occurs in layer V, we hypothesized that: For a given
stimulated whisker, a stable network representation would be obtained - as measured by the
degree of structural similarity between networks inferred across multiple repeated trials. In
addition, the structure of these networks would be less similar to those inferred during
stimulation of other whiskers.

5.2 Methods

5.2.1

Barrel Cortex Recording

Five adult female Sprague Dawley rats weighing ~300 g were used in this study (Rats R1R5). All procedures involving animals were approved by the Michigan State University

95

Institutional Animal Care and Use Committee (IACUC). Animals were anesthetized using a
cocktail of ketamine and xylazine (75 and 5 mg/kg injected intrapertoneally, respectively). The
left somatosensory cortex was exposed (4 × 4 mm craniotomy, 0-4 mm posterior and 4-8 mm
lateral to bregma). A 32-channel microelectrode silicon array (NeuroNexus Technologies, Ann
Arbor, MI, USA) with 4 shanks, 8 recording sites/shank, 400 μm shank separation and 100 μm
electrode separation within shank was advanced into the barrel field in 100 μm steps. Acquired
signals were amplified and band-pass filtered in the range 300-5000 Hz and sampled at 25 KHz.
Stimulus-driven activity was recorded at depths of 1100-1500 μm corresponding to layer V of
the barrel cortex. Subjects were perfused at the end of the experiments using 0.1 M phosphatebuffered saline and 4% paraformaldehyde. Coronal sections (50 μm) were cut and sections were
Nissl-stained. The laminar depth of the arrays was confirmed to be in layer V by examining
either the length of the electrode tracks or electrolytic lesions created by passing 4 μA current for
5 sec.
Prior to vibrissae stimulation, whiskers were all trimmed to 6 mm length. For each rat, 3
whiskers were selected for mechanical stimulation that resulted in maximal modulation of the
firing rate based on the observed neuronal response to manual deflection (Whiskers C2, C3 and
D2 for R1; C2, D2 and D1 for R2; C1, D1 and D2 for R3; B2, B3 and B4 for R4; B1, B2 and B3
for R5). The selected whiskers were deflected one at a time by inserting each whisker into a
capillary tube glued to a piezoelectric bimorph (Piezo Systems, Cambridge, MA, USA). Each
whisker was horizontally deflected 900 times with a displacement of 80 μm for 100 ms (rise time
and fall time were each set to 1 ms) at 1 Hz frequency as illustrated in Figure 5.2 [48, 213].
Spikes in multiple single unit activity were detected and sorted using NeuroQuest; a
MATLAB toolbox for neural data processing and analysis [214, 215]. Spikes were declared
96

Deflection (μm)

80
0

1

98
Time (ms)

1

(b)
(a)
Figure 5.2: (a) Experimental setup where rats were anesthetized and whiskers were deflected one
at a time using a capillary tube glued to a piezoelectric bimorph. (b) Whiskers were deflected
horizontally for 100 ms. Red curve indicates whisker displacement.

present if the raw waveform surpassed a threshold set at 3 times the noise standard deviation.
Due to the observed overlap in the recorded spikes in the data, a short spike length of 0.5 ms was
used for spike sorting (0.25 ms pre threshold crossing and 0.25 ms post threshold crossing).
Spikes were aligned at their trough. Principal Component Analysis (PCA) was applied to the
detected spikes, and the first 2 principal components were used as features for spike sorting. An
average population size of 16±7.8 single units/rat was recorded (12 units for R1, 27 for R2, 21
for R3, 8 for R4 and 12 for R5). Spike trains were binned at Δ=0.5 ms. Quality of the spike
sorting was assessed using inter-spike interval histogram (ISI) to ensure that no spikes with interspike intervals of less than 1.5 ms were classified as belonging to the same unit. Neurons were
indexed by channel number (1-32) and unit number (a, b, c, etc ...).

97

5.2.2

Single Unit Analysis

For each neuron, Post-stimulus Time Histograms (PSTHs) were computed as the average
firing across trials for each stimulated whisker with 0.5 ms bin size within a window of 100 ms
post stimulus onset. The peak PSTH count for a given neuron for each whisker was then
extracted. The mean first-spike latency Liw of each neuron i for a given whisker w was computed
as [48, 216]
(5.1)

T

1 w tw
Liw =
∑ τ (1)
Tw t =1 i
w

t

where Tw is the total number of trials for whisker w and τ i w is a vector of the spike times of
neuron i on trial tw relative to stimulus onset.
We quantified the amount of information present in the individual neuron and population
response property, namely the first-spike latency/spike count and network representation,
respectively, about the deflected whisker identity using mutual information [48, 217]. For a
given neuron i, the mutual information between its response property Xi and the stimulus W (in
our case the whisker identity) was computed as

⎛ Pr ( xi , w ) ⎞
I ( X i ;W ) = ∑ ∑ Pr ( xi , w ) log ⎜
⎜ Pr ( x ) Pr (w ) ⎟
⎟
i
⎝
⎠
xi w

(5.2)

where Xi corresponds to the time stamp of the first spike of neuron i within 100 ms of the
stimulus onset in the case of first-spike latency, Xi corresponds to the total number of spikes fired
by neuron i during the same time interval in the case of spike count, and Xi corresponds to the
subnetwork of neuron i in the case of network representation as detailed later. The more distinct
the response distribution of a given neuron to different whiskers is, the higher the mutual

98

information, and thus, the higher the information it conveys about the identity of the deflected
whisker. To normalize I(Xi;W) between 0 and 1, we divided equation (5.2) by the joint entropy
H(Xi,W).
5.2.3

Dynamic Bayesian Networks Analysis

To infer the whisker-specific networks, we used the Bayesian Network Inference with Java
Objects (BANJO) toolbox [178] with simulated annealing search algorithm [172]. For each
deflected whisker, spike trains within 100 ms of each stimulus onset were considered as one trial.
A total of 100 datasets, 18 sec each, for each whisker were extracted from the recorded 900
trials/whisker, where each dataset was formed by concatenating 180 trials that were randomly
chosen with a uniform distribution from the 900 trials. This results in an overlap between any
given pair of datasets that follows a binomial distribution with a mean overlap of 20% and a
standard deviation of 4% (as will be shown later in Figure 5.6). The spike trains of each dataset
were analyzed using DBN with Markov lags in the range [1, 10] bins ([0.5, 5] ms). A Markov lag
range of [1, 5] bins ([0.5, 2.5] ms) was found to be the best range for all datasets based on
calculations of an influence score that measures the degree of influence each pre-synaptic cell
has on post-synaptic cells [28, 195]. In case a connection was inferred at more than one Markov
lag, only the largest lag was considered. The maximum number of pre-synaptic cells for each cell
was set to 10.
5.2.4

Network Similarity and Network Information

To quantify the similarity between the inferred networks, we first represented each inferred
network as an n × n binary adjacency matrix A, where n is the total number of neurons in the

99

network. Each element A(i, j) takes the value ‘1’ if there is a connection from neuron i to neuron
j and ‘0’ if there is no connection between the corresponding neurons. For a given population of
n neurons, K deflected whiskers and M datasets per whisker, all the adjacency matrices of the
2

inferred networks were vectorized and stacked together into one KM × n matrix. Principal
component analysis (PCA) was then applied to this matrix to extract significant features from the
inferred networks by projecting the adjacency matrices into a p-dimension network space, where
2

p ≤ n , that accounts for most of the variance in the networks [218]. The similarity R(Al, Am)
between a pair of adjacency matrices Al and Am was defined as

R( Al , Am ) = 1 − ql − q m

(5.3)

where ql and qm are the projections of Al and Am in the p-dimension network space, respectively,
and ||.|| is the Euclidean distance (lp-norm) between the two projections. The network space was
normalized such that the maximum possible distance between any pair of projections is 1. The
number of principal components used p was set to 2. The average across-whiskers similarity

R Across and the average within-whisker similarity RWithin for a given population were
therefore defined as
(5.4)

2
R Across =
∑ ∑ R (w1 , w2 ),
K (K − 1) w w ≠ w
1

RWithin =

2

1

1
∑ R (w, w)
K w

where the average similarity between the networks inferred for a given pair of whiskers w1 and
w2, R (w1 , w2 ) , and within a given whisker w, R (w, w ) , were defined as

100

R (w1 , w2 ) =

2

w
w
R (Al 1 , Am 2 )
2 ∑∑

M l m
2
w
R (w, w) =
∑ ∑ R Alw , Am
M (M − 1) l m ≠ l

(

(5.5)

)

Similar to the expression given in equation (5.2), the mutual information between the network
projection Q and the stimulus W was computed as

⎛ Pr (q , w ) ⎞
I (Q;W ) = ∑ ∑ Pr (q , w ) log ⎜
⎜ Pr (q ) Pr (w ) ⎟
⎟
⎝
⎠
q w

(5.6)

where the probabilities used in equation (5.6) were computed from the 2-dimensional network
space after discretizing it into a 10 × 10 bins. For the mutual information measure of individual
neurons’ subnetworks, Q corresponds to the projection of the subnetworks of each neuron onto
the network space. The normalized I(Q;W) was computed by dividing equation (5.6) by the joint
entropy H(Q,W).
In order to estimate any bias in R (w, w ) and I(Q;W) that results from the overlap in the
datasets of each whisker, surrogate datasets were created from the original data such that each
surrogate contains trials from all whiskers that has the same degree of overlap as the original
datasets. Thus, within a surrogate, any excess similarity between the networks inferred for the
datasets extracted from it would only result from the overlap between these datasets and not from
the stimulus. To form the shuffled datasets, the 900 trials recorded for each whisker were first
split into 3 different groups, 300 trials each. One group was then selected from each whisker and
concatenated with 1 group from each of the other 2 whiskers forming a 900 trial shuffled
surrogate (For example: group 1 of whisker 1, group 1 of whisker 2 and group 1 of whisker 3
form one shuffled surrogate; group 2 of whisker 1, group 2 of whisker 2 and group 2 of whisker

101

3 form another shuffled surrogate, ... etc). For each subject, 3 shuffled surrogates, 900 trials each,
were formed. A total of 100 datasets, 18 sec each, were then extracted from each shuffled
surrogate by concatenating 180 trials that were randomly chosen with a uniform distribution
from the 900 trials. Therefore, these datasets are not whisker-specific and thus any similarity
between the networks inferred within the same shuffled surrogate that is larger than that across
surrogates would only result from the overlap between the datasets extracted from the same
surrogate. Similarly, any non-zero mutual information between the networks inferred for the
shuffled surrogates and the identity of these surrogates would also result from the overlap
between the datasets extracted from the same surrogate. We used the difference between the
within-surrogate similarity and the across-surrogates similarity as an estimate of the bias in the
within-whisker similarity of the original data that results from the overlap. We also used the
mutual information computed for the shuffled datasets as an estimate of the bias resulting from
the overlap.
5.2.5

Decoding Whisker Identity

We used a leave-one-out cross-validation approach to test the ability to decode the identity of
the deflected whisker using the inferred networks [219]. The network obtained for each 18 sec
dataset of a given whisker was compared to the other networks inferred for the same whisker (M
- 1 networks) and the other whiskers (M(K-1) networks). The identity of the deflected whisker
w* for a given test dataset was computed as

w* = arg max
w

(

1 Z
∑ R A, Azw
Z z =1

)

(5.7)

102

where Z = M when w ≠ w* or Z = M - 1 when w = w* (to exclude the test dataset network from
the similarity comparison). Therefore, w* is the whisker whose inferred networks (templates)
result in maximum similarity with the test data fit A.
Decoding for a given test dataset based on the response latency of individual neurons was
obtained by, first, computing the mean first-spike latency for each neuron from all datasets of all
whiskers excluding the test dataset (training datasets). The mean first-spike latency for each
neuron in the test dataset was computed and compared to that obtained from the training datasets.
Whiskers whose training datasets for each cell had the least absolute latency difference were
identified as the ones being stimulated. Single-cell decoding accuracy was computed as the
percentage of test datasets for which the identified whiskers matched the actual whiskers. The
overall decoding accuracy was computed by averaging across cells. For comparison, we also
computed the decoding accuracy using a majority-voting rule in which whisker identity was
determined as the one with the majority of cells having the least absolute latency difference
between training and test datasets.
5.2.6

Crosscorrelogram Analysis

For the sake of comparison, we computed the standard crosscorrelogram approach to assess
potential causal influence between pairs of neurons with bin size of 0.5 ms and range [-5, 5] ms
[63]. Ten jittered versions of each of the datasets analyzed using DBN were formed in which
each spike was randomly displaced over a uniform interval in the range [-10, 10] ms around the
original spike time. A peak or trough (indicating excitation or inhibition, respectively) in the
crosscorrelogram for a given neuron pair in the original dataset was determined to represent a
connection if it crossed a confidence band computed from the jittered datasets. The upper and

103

lower limits of the confidence band were computed from the maximum and minimum counts of
the jittered crosscorrelograms, respectively, with an acceptance level of 0.99 [13].
5.2.7

Predicting Single Neuron Firing from Pre-synaptic Peers

The firing probability of each neuron was estimated using the firing history (5 bins or 2.5 ms)
of pre-synaptic cells determined from the networks inferred for each whisker and the stimulus.
The conditional firing probability of a given neuron i at time t was estimated as

{ ( (

)

)}

Pr⎛ ri (t ) = 1 r j t − lij − 5 Δ : t − lij Δ
, S (t − Liw − 5Δ : t − Liw )⎞
⎜
⎟
j∈π (i )
⎝
⎠

{ ( ( )
{ ( ( )

(5.8)

)}

Pr⎛ ri (t ) = 1, r j t − lij − 5 Δ : t − lij Δ
, S (t − Liw − 5Δ : t − Liw )⎞
⎜
⎟
j∈π (i )
⎝
⎠
=
Pr⎛ r j t − lij − 5 Δ : t − lij Δ
, S (t − Liw − 5Δ : t − Liw )⎞
⎜
⎟
j∈π (i )
⎝
⎠

)}

where π(i) is the set of pre-synaptic neurons inferred for whisker w, lij is the Markov lag at which
a connection from neuron j to neuron i was inferred, S(t) is a binary vector with a nonzero entry
of ‘1’ only at the stimulus onset and 0 otherwise, and Liw is the mean first-spike latency
computed in equation (5.1). Using ten-fold cross-validation, 10 training datasets (72 sec duration
each) were extracted from each 90 sec whisker dataset by sliding a 72 sec window with 5 sec
steps. The joint probabilities on the right hand side of equation (5.8) were computed from each
training dataset using kernel density estimation with a normal function kernel and a bandwidth
of 0.001 [220]. The probabilities estimated from each training dataset were used to predict the
firing of each neuron in the remaining 18 sec test dataset. When conditioned on the stimulus
only, equation (5.8) can be re-written as

104

Pr(ri (t ) = 1 S (t − Liw − 5Δ : t − Liw )) =

Pr(ri (t ) = 1, S (t − Liw − 5Δ : t − Liw ))
.
Pr(S (t − Liw − 5Δ : t − Liw ))

(5.9)

The probability estimates were then smoothed by applying a moving average filter with a
window of 15 ms. Predicted spike trains were computed for test datasets (10 datasets/whisker, 18
sec each) using the smoothed estimates for threshold values in the range [0, 1] with a step of
0.001. At each time point, the original spike trains were used in the prediction. True and false
positive rates were computed from the predicted spike trains for each threshold. Receiver
Operating Characteristic (ROC) curves were constructed from the true and false positive rates for
each cell for a given whisker. The area under the ROC curve was used as a measure of the
predictive power as 2 × (Area under ROC – 0.5) [221].
We compared the predictive power using the DBN method to that obtained using
Generalized Linear Models (GLMs) [98]. GLM expresses the firing probability of a neuron i as

{( (

)

)}

Pr⎛ ri (t ) = 1 r j t − lij − 5 Δ : t − lij Δ
, S (t − Liw )⎞
⎜
⎟
∀j ≠i
⎝
⎠
5
5
⎛
⎞
= exp⎜ ∑ ∑αij (mΔ )r j (t − mΔ ) + ∑αiw (mΔ )S (t − Liw − mΔ )⎟Δ
⎜
⎟
m =1
⎝ j ≠i m=1
⎠

(5.10)

where αij models the coupling filters between neurons i and j and αiw models the stimulus filter.
The history interval was set to 5 bins (2.5 ms) similar to the DBN analysis. The coupling and
stimulus filters were estimated using the same training datasets for the DBNs using iterative
reweighted least squares (IRLS). These filters were then used to compute the firing probability of
each neuron at each time point of the test datasets. Spike train predictions were computed by
comparing the estimated firing probability to threshold values in the range [0, 1]. ROC curves

105

were constructed and the predictive power was also computed for each neuron for a given
whisker.

5.3 Results

5.3.1

Firing Characteristics

We recorded a total of 80 single units from layer V of the barrel cortex in five anesthetized
rats (R1-R5). Microelectrode arrays with 32 channels were used in each rat to record neural
responses to unilateral stimulation of individual whiskers, one at a time. Each whisker was
stimulated 900 times at a frequency of 1 Hz. Compared to previous studies that addressed
individual neurons’ firing characteristics in which 50 trials only were used [48, 205, 222], we
used this large number of trials (900 trials) to guarantee enough data to infer causal networks.
Figure 5.3a and b illustrate the discharge patterns of two sample neurons to the deflection of
three whiskers in rat R2. Onset response latencies were in the range of 10+ ms, corresponding to
typical response patterns of pyramidal cells in layer V of the barrel cortex [223, 224]. The
laminar depth of the arrays was also confirmed to be in layer V by examining electrode tracks or
electrolytic lesions as illustrated in Figure 5.3c. Neurons showed significant whisker-specific
modulation of their firing pattern in response to whisker deflection, with 71.3% of the recorded
units exhibiting a significantly stronger response to stimulation of a principal whisker during the
first 100 ms post stimulus onset compared to other non-principal whiskers (P < 0.05, two-sample
t-test for each pair of whiskers).

106

Raster for Neuron 2a

PSTH

PSTH

400
PSTH
0
50 100 150
Time (msec)

0
50 100 150
Time (msec)

0
50 100 150
Time (msec)

(a)

Whisker D1 Stim.

Whisker D2 Stim.

Raster for Neuron 10a

Raster for Neuron 10a

PSTH

PSTH

Whisker C2 Stim.

80
0
Raster for Neuron 10a
800
400

0
0.05
0

PSTH
0 50 100 150
Time (msec)

0
50 100 150
Time (msec)

0

(b)

0.4

Sp. Count Info.

D (μm)

Whisker D2 Stim.

Raster for Neuron 2a

Raster for Neuron 2a
800

0
0.05
0

Sp./Trial Trial

Whisker D1 Stim.

Whisker C2 Stim.

Peak PSTH (Sp./Tr.)

D (μm)
Sp./Trial Trial

80
0

0.2
0

50 100 150
Time (msec)

0.1
0.05

0
20
40
0
0.05
0.1
st
st
Mean 1 -sp. Lat. (ms)
1 -sp. Lat. Info.
(c)
(d)
(e)
Figure 5.3: Firing characteristics of the recorded neurons. Two sample neurons: (a) Neuron 2a
and (b) Neuron 10a from rat R2 response to stimulation of whiskers C2, D1 and D2. (Top)
Whisker displacement. (Middle) Spike raster over multiple repeated trials. (Bottom) Poststimulus Time Histogram (PSTH) with 0.5ms bin size. Neuron 2a shows stronger and faster
response to whisker C2 than other whiskers while Neuron 10a shows a slightly stronger and
faster response to whisker D1. (c) Nissl stained coronal section (50 μm) in rat R5. This rat was
chronically implanted over 35 days. Dashed curve indicates the original shape of the section that
was damaged due to the implant. Black arrowhead points to an electrolytic lesion mark of the
deepest recording site on one of the shanks of the multi-electrode array. The depth of the lesion
mark (~1250 μm) is consistent with the depth recorded using the micromanipulator during the
surgery and corresponds to layer Vb of the barrel cortex (1.1 mm posterior and 5.2 mm lateral to
bregma). (d) Peak PSTH counts of each neuron for each whisker versus its mean first-spike
latency (n = 240). Each dot corresponds to the response of a single neuron to a single whisker.
(e) Normalized mutual information between spike count/first-spike latency and whisker identity.
Each dot corresponds to one neuron. Black diagonal line represents equal information (n = 80).
200 μm

107

It has been reported that the first spike post-stimulus onset in a barrel column conveys most
of the information about the corresponding whisker deflection [48, 216]. Therefore, we used the
post stimulus first-spike latency as a measure of temporal response fidelity of the recorded
neurons. We found that 83.8% of the recorded neurons had a significantly smaller first-spike
latency for a single whisker compared to other whiskers (P < 0.05, two-sample t-test for each
pair of whiskers). A fraction of 46.3% of the recorded units showed preference to the same
whisker in terms of both firing rate and first-spike latency, suggesting the presence of both rate
and temporal coding mechanisms [225]. Neurons with strong response modulation to a given
whisker also exhibited short latency as illustrated in Figure 5.3d (r = -0.76, P < 0.0001, n = 240,
t-test).
To confirm the salience of spike timing in encoding whisker movements, we computed the
mutual information between the stimulus and each individual response property (namely, the
first-spike latency and the spike count) [48, 217]. More information was conveyed about the
stimulus by the first-spike latency than the spike count as illustrated in Figure 5.3e (P < 0.0001,
n = 80, two-sample t-test). Indeed, 96.25% of the recorded neurons had larger first-spike latency
information than spike count information, indicating that temporal coding was more pronounced
compared to rate coding. (Information in first-spike latency: 0.19±0.08 bits, information in spike
count: 0.05±0.04 bits; Normalized information in first-spike latency: 0.04±0.02, normalized
information in spike count: 0.01±0.01, mean ± SD).
5.3.2

Whisker-specific Networks

As illustrated in Figure 5.3, variability in the temporal characteristics of the responses across
whiskers was observed. We asked whether this variability could be accounted for using a

108

network model of the recorded ensemble beyond what is provided by individual neurons’
response variability. To address this question, we analyzed the data by fitting a dynamic
Bayesian network (DBN) model [28]. For each stimulated whisker, one hundred 18-sec long
spike train datasets were formed from the 900 trials of each whisker by randomly choosing 180
trials out of the 900 trials following a uniform distribution. DBN fit was then obtained for each
of these datasets. Figure 5.4a illustrates sample inferred networks for three distinct whiskers in
one rat. To assess the validity of these networks in the absence of knowledge of the underlying
true connectivity, we examined the connection probability as a function of the horizontal and
vertical separation between electrodes. We expected that neurons recorded on the same or
adjacent electrodes are more likely to be connected in the inferred networks, consistent with
anatomical and physiological studies in the neocortex suggesting that connectivity tends to be
mostly local for economic wiring [13, 149, 150]. Figure 5.4b demonstrates that neurons recorded
on the same or on adjacent electrodes have a higher probability of being connected. Furthermore,
the connection probability decreased with increasing electrode separation (r = -0.47, P < 0.05, n
= 24, t-test).
To examine whether the inferred networks are indeed whisker-specific, we measured the
similarity between the networks inferred for the same whisker (termed herein within-whisker
similarity) and the similarity between the networks inferred for different whiskers (termed
across-whisker similarity). Principal component analysis (PCA) was used to construct a network
feature space where each point in that space corresponds to one network as illustrated in Figure
5.5a [218]. We quantified the similarity between a pair of networks as (1 – the normalized
distance between their corresponding projections in the PCA network space). Distance
normalization ensured that the maximum possible pair-wise distance measure in the network
109

Whisker C2
16a
16b
17a
17b
24a
24b
25a
25b

10b

26a

Whisker D1

10a 9b 9a 6b

26b 26c 31a 31b

6a

16a
16b
17a
17b
24a
24b
25a
25b

2b

2a
1b
1a
32c
32b
32a
31c

10b

26a

10a 9b 9a 6b

26b 26c 31a 31b

6a

2b

2a
1b
1a
32c
32b
32a
31c

Whisker D2
10b

Ver. Separation ( m)
μ

16a
16b
17a
17b
24a
24b
25a
25b
26a

10a 9b

9a 6b

26b 26c 31a 31b

6a

2b

2a
1b
1a
32c
32b
32a
31c

(a)
Conn. Probability

500
400
300
200
100
0

0.08
0.06
0.04
0.02

0
0 400 800 1200
Hor. Separation (μ m)
(b)
Figure 5.4: Sample whisker-specific networks. (a) DBN networks inferred from one population
(Rat R2) for 3 individually stimulated whiskers: C2, D1 and D2. Undirected links indicate
bidirectional connections. Network of each whisker was inferred from a dataset of length 18 sec
(180 trials x 100 ms). (b) Connection probability in the DBNs as a function of the horizontal and
vertical separations between the electrodes on which neurons were recorded. The number of
connections inferred at each distance was normalized by the corresponding total number of
possible connections.

110

feature space does not exceed ‘1’. As illustrated by Figure 5.5b, networks inferred for the same
whisker were significantly more similar compared to those inferred for other whiskers (83.3±6%
within whisker, 50.3±18% across whiskers, P < 1e-6, two-sample t-test). This suggests that the
inferred networks are whisker-specific, and that temporal coordination among the observed
neurons bared a signature of the deflected whisker identity.
We next examined whether the overlap between the datasets extracted for the same whisker
(20±4% overlap between any given pair of datasets) may have resulted in a statistical bias
towards the within-whisker similarity measures. We estimated that bias by creating multiple
shuffled surrogate datasets for each population. Each surrogate dataset consisted of 900 trials,
300 trials from each whisker. One hundred 18-sec long datasets were then extracted from each
surrogate dataset in the same way as the original 100 datasets/whisker were extracted. This
shuffling procedure destroyed whisker-specific features within each dataset but maintained the
same amount of overlap (Figure 5.6). Therefore, similarity within each surrogate dataset that
exceeds similarity across surrogate datasets would only result from the overlap between the
datasets within each surrogate. We found the bias of within-whisker similarity to be 1.8±3%
(Figure 5.6). In addition, the projection of the networks inferred from the surrogate data clustered
around the origin in the network space as seen in Figure 5.5a. This demonstrates that the
networks inferred from the shuffled data represented pure noise and, therefore, the inferred
networks from the original data are whisker-specific (distance from the origin for the original
data: 1.83±0.9, distance from the origin for the surrogate data: 0.75±0.5, P = 0, two-sample ttest).
We then examined the amount of information conveyed by the networks about the stimulus
and compared it to those conveyed by single individual neuron responses. Averaged across
111

Network Space
10a

9b

9a

10b

6b

2

6a

16a

2b
16b
2a
17a
1b

10a

9b

9a

10b

6b
6a

16a

2b
16b
2a

17b

17a
1b

1a
24a

17b

32c

1a

24b

24a

32b

32c

25a

24b

32a

25b

32b
25a

31c

26a
26b

26c

31a

31b

32a

25b

31c

26a

PC2

26b

9b

9a

C2
D2
D1
Shuffled

10b

6b
6a

16a

2b
16b
2a
17a
1b
17b
1a
24a
32c
24b
32b
25a
32a

25b

31c

26a
26b

26c

31a

31b

31a

9b

9a

31b

0
-2

10a

26c

-4
-4

10a
10b

6b
6a

16a

2b
16b
2a
17a
1b
17b
1a

-2

0
PC1

2

24a
32c
24b
32b
25a
32a

25b

31c

26a
26b

26c

31a

31b

(a)

*

Network Information

Network Similarity

1
0.8
0.6
0.4
0.2
0

Within
Whisker

0.4
0.3
0.2
0.1
0

Across
Whisker

*

Original
Data

Shuffled
Data

(b)
(c)
Figure 5.5: Networks similarity within- and across-whiskers. (a) Network feature space of rat R2
for 3 different whiskers (C2, D2 and D1) and a shuffled dataset. Each dot corresponds to the
projection of one network onto a 2-dimension principal components (PC1 and PC2) feature
space. Insets: Sample networks from the 3 whiskers. Black edges represent common connections
with the top left sample network (whisker C2). (b) Similarity between networks inferred for the
same whisker and between networks inferred for different whiskers averaged across subjects
(mean ± SD). Similarity for a given pair of networks was quantified as 1 – the normalized
distance between the projections of the pair in the principal component space. Within whisker
similarity was corrected for the bias resulting from the overlap in the data (estimated from the
shuffled data). The figure indicates that the within whisker networks cluster more closely
compared to across-whisker networks. *P < 1e-6, two-sample t-test. (c) Normalized mutual
information between each of the networks inferred from the original and shuffled data, and the
stimulus averaged across subjects (mean ± SD). *P < 0.01, two-sample t-test.

112

Original Data

0.05

0
0

50
Overlap (%)

100

Fraction of Pairs of Datasets

Fraction of Pairs of Datasets

0.1

0.1

Shuffled Data

0.05

0
0

50
Overlap (%)

100

(a)

Shuff. 1
Shuff. 2
Shuff. 3

2
PC2

Network Similarity

Shuffled Data Network Space

1
0
-1
-1

0

1

1
0.8
0.6
0.4
0.2
0

2

Within
Dataset

Across
Datasets

PC1
(b)
(c)
Figure 5.6: Overlap in the original and the shuffled datasets. (a) Distribution of the amount of
overlap between any pair of datasets for (Right) the original data and (Left) the shuffled data.
Blue curve indicates a binomial distribution fit with parameters p = 0.2 and n = 180. Both figures
indicate that both the original and the shuffled data have the same degree of overlap where any
given pair of datasets would have an overlap of 20±4%. (b) Network feature space of the
shuffled datasets extracted from rat R2 data. Each dot corresponds to the projection of one
network onto a 2-dimension principal components (PC1 and PC2) feature space. (c) Similarity
between networks inferred for the same shuffled dataset and between networks inferred for
different shuffled datasets averaged across subjects (mean ± SD). Similarity for a given pair of
networks was quantified as 1 – the distance between the projections of the two networks in the
principal component feature space.

subjects, information of 0.28±0.08 (Un-normalized: 1.4±0.3 bits) was obtained from the original
data compared to only 0.09±0.07 (Un-normalized: 0.28±0.28 bits) from the shuffled data which
can be used as an estimate of the bias resulting from the overlap among the within-whisker
datasets (Figure 5.5c). Therefore, after correcting for the bias, the inferred whisker-specific

113

networks convey an average net normalized information of ~0.2. Compared to individual
neurons’ response information, we found network information to be approximately 6 times the
information provided by the first-spike latency and 13 times the information provided by the
spike count. This suggests that the network code provides orders of magnitude more information
about the stimulus compared to rate and temporal codes combined.
Within each of the networks inferred, some neurons exhibited strong participation in
subnetworks. We investigated whether, within each subnetwork, individual neurons’ conveyed
more information about the stimulus compared to their individual response characteristics. This
was done by, first, extracting the subnetwork where each neuron was an actual element, and
second, computing the PC’s of these subnetworks as illustrated in the example of Figure 5.7a.
Individual neuron subnetworks conveyed more information about the stimulus, albeit at a
reduced precision compared to the parent network case as illustrated in Figure 5.7b (Normalized
information for original data: 0.15±0.06, shuffled data: 0.08±0.02, P = 0, two-sample t-test; Unnormalized information for original data: 0.82±0.3 bits, shuffled data: 0.4±0.2 bits, P = 0, twosample t-test). Moreover, 78.6% and 94.3% of the neurons conveyed more information through
their subnetworks than their individual first- spike latency and spike count, respectively, as can
be seen in Figure 5.7c. This suggests that the network code provides better stimulus
discrimination than rate and temporal code not only at the global population level but also at the
local subnetwork level. It is noteworthy that the estimated probabilities, and so the computed
mutual information, vary with the network space discretization bin size. Our results, however,
did not seem to be affected by the choice of the bin size (Figure 5.8).
The significant amount of information conveyed by the network suggests that decoding
whisker identity based on network features would be more accurate. For a given 18-sec long
114

Neuron 16b Subnetwork Space
1

PC2

9b 2b 9a
1b
16a
0.5
1a 16b
32c
17a
25a 2a

2a 31a
31b

0

1b

16b
2b

-0.5
-1
-1.5

C2
D2
D1

-2

-1

PC1

0

1

0.2

st

*

Resp. Info.

Subnetwork Info.

(a)

0.1

0.2

1 -sp. Lat.
Sp. Count

0.1

0
Original Shuffled
0
0.1
0.2
Subnet. Info.
Data
Data
(b)
(c)
Figure 5.7: Individual neurons subnetworks information. (a) Network feature space of one
sample neuron (Neuron 16b of rat R2) for 3 different whiskers (C2, D1 and D2). Each dot
corresponds to the projection of one network onto a 2-dimension principal components (PC1 and
PC2) feature space. (b) Information in the individual neurons’ networks inferred for the shuffled
data versus the original data (mean ± SD). * P = 0, two-sample t-test. (c) Normalized
information in first-spike latency and spike count versus normalized information in the networks
of each neuron. Each dot corresponds to one neuron. Black diagonal line represents equal
information. Network information was corrected for any statistical bias by subtracting the
network information computed for each neuron from the shuffled data in (b).
0

dataset of a given whisker, the identity of the deflected whisker was decoded as the whisker
whose inferred networks had the highest similarity to the network of that dataset (using leaveone-out cross-validation method). Overall, decoding accuracy reached 97.6±3% across subjects

115

1.5
1
0.5
0
-0.5
0

Original
Shuffled
80 100

Indiv.l Neuron Net. Info. (bits)

Network Information (bits)

2

2
1.5
1
0.5

Original
Shuffled
80
100

0
40
60
0
20
40
60
Bin Size
Bin Size
(a)
(b)
Figure 5.8: Network information in the original data is consistently higher than that in the
shuffled data, independent of the bin size. (A) Network information in the original and the
shuffled data as a function of the bin size used to estimate the mutual information averaged
across the 5 subjects (mean ± SD). (B) Information in the network of individual neurons
computed from the original and the shuffled data as a function of the bin size used to estimate the
mutual information, averaged across the 80 neurons (mean ± SD).
20

(1464 of 1500 datasets were classified correctly), compared to only 79.7±12% when the response
latency of each neuron was used as a feature for decoding, whereas using a majority voting
method using all neurons reached an accuracy of 87.7±13%.
5.3.3

Relating Network Properties to Individual Neuronal Responses

To examine whether the inferred networks are consistent with individual neuron response
properties, we first examined whether the direction of an inferred connection between a given
pair of neurons is consistent with the sign of the difference between their mean first-spike
latencies. As Figure 5.9a illustrates, we found that for 93.3% of the inferred unidirectional
connections, neurons acting as putative pre-synaptic cells had smaller latencies than neurons
acting as putative post-synaptic cells.
We next examined the relationship between the number of connections each neuron
sends/receives and its response latency. Neurons with relatively large number of outgoing
116

0.3
0.2
0.1
0
-40 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 40
Post-Pre First-spike Latency (ms)
(a)
Subnet. Info. (z-score)

Fraction of Connections
Source Index (z-score)

6

0.4

4
2
0
-2

2
1
0
-1
-2

-2
0
2
-1
0
1
2
3
Mean 1st-sp. Lat. (z-score)
Source Index (z-score)
(b)
(c)
Figure 5.9: Network properties and individual neuronal responses. (a) Histogram of the
difference between the mean first-spike latency of the post-synaptic cells and the pre-synaptic
cells for each inferred connection. Only unidirectional connections were counted in the
histogram. Red bars indicate the fraction of connections consistent with the difference between
latencies while blue bars indicate connections that are not. (b) Ratio between the number of
outgoing connections and ingoing connections for each neuron (source index) as a function of its
mean first-spike latency for each whisker. Each dot corresponds to one neuron for a given
whisker (n = 240). Z-scores of the mean first-spike latency and the source index are reported on
the X-axis and the Y-axis, respectively. Gray curve indicates decaying exponential fit. (c)
Information in the networks of each neuron as a function of the source index averaged across
whiskers (n = 80). Each dot corresponds to the standardized z-scores of one neuron. Gray line
indicates regression line.

connections constituted central hubs in the network, thereby acting as “source” nodes, as can be
seen in Figure 5.4a. The identity of these source neurons was whisker–specific (for e.g., neuron
2b for whisker C2, neuron 16b for whisker D1, and neuron 17b for whisker D2). We observed
that the identity of these neurons correlates with neurons with short response latency. On the
other hand, neurons with large response latency were observed to have more ingoing connections
117

than ones with short response latency, thereby acting as “sink” nodes. Figure 5.9b illustrates the
ratio of the number of outgoing (fan-out) to ingoing connections (fan-in) for each neuron, termed
the source index, as a function of its response latency. This index decays exponentially with the
mean first-spike latency (Time constant = -0.67, r2 = 0.53, n = 240, t-test). One way to interpret
these observations is that neurons with short response latency receive the information about
whisker deflection before neurons with larger response latency. In addition to these findings,
neurons with high source index were found to convey more information about the stimulus than
neurons with low source index as seen in Figure 5.9c (r = 0.54, P < 1e-6, n = 80, t-test). This
suggests that few strongly connected hub neurons are key players in orchestrating the local
population response to the stimulus.
5.3.4

Crosscorrelogram Comparison

The crosscorrelogram is a known method to identify functional connectivity between cells
over very short time scales (< 5 ms) [13, 63]. Crosscorrelogram analysis of our data revealed
some similarity but also some substantial differences from the DBN analysis. In particular,
Figure 5.10a shows the connection probability calculated from crosscorrelogram analysis as a
function of the difference between post and pre-synaptic neurons latencies. Contrary to the DBN
result in Figure 5.9a, the probability of inferring a connection from small latency neurons to
large latency neurons was not significantly higher than inferring a connection in the opposite
direction. (Crosscorrelogram: 50.1%, DBN: 93.3%). Similar to Figure 5.9b, Figure 5.10b shows
a negative correlation between the source index and response latency but less significant than

118

4
2
0
-2

0.3
0.2
0.1
0
-40 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 40
Post-Pre First-spike Latency (ms)
(a)

Fraction of Connections

Fraction of Connections

Source Index (z-score)

6

0.4

1

Crosscorr.
DBN
*

*

0.5

0
-2
0
2
ChainCommon
st
Mean 1 -sp. Lat. (z-score)
effect
Input
(b)
(c)
Figure 5.10: Comparison with networks inferred using crosscorrelograms. (a) Histogram of the
difference between the mean first-spike latency of the post-synaptic neurons and the pre-synaptic
neurons for each connection inferred using the crosscorrelogram technique. Only unidirectional
connections were counted in the histogram. Red bars indicate the fraction of connections
consistent with the difference between latencies while blue bars indicate connections that are not.
(b) Ratio between the number of outgoing connections and ingoing connections for each neuron
(source index) as a function of its mean first-spike latency for each whisker. Each dot
corresponds to one neuron for a given whisker (n = 240). Z-scores of the mean first-spike latency
and the source index are reported on the X-axis and the Y-axis, respectively. Gray curve
indicates decaying exponential fit. (c) Fraction of possible chain effect-induced connections and
common input-induced connections inferred by crosscorrelogram and DBN (mean ± SD). * P <
0.001, two-sample t-test.

what was obtained using DBN analysis (Crosscorrelogram: time constant = -0.23, r2 = 0.04, n =
240; DBN: time constant = -0.67, r2 = 0.53, n = 240).
The crosscorrelogram analysis, however, revealed some inconsistency with individual neuron
response analysis. In particular, more connections were observed that were inconsistent with the
response latency of the neurons, indicating they mostly represent spurious connections. This was
119

expected because, in contrast to DBN, the crosscorrelogram method – by virtue of the fact that it
is a pair-wise measure – does not explain away unlikely causes of correlation. To demonstrate
that this is indeed the case, consider for example the case of 3 neurons A, B and C forming a 3neuron chain where A→B and B→C but no connection exists between A and C. A pair-wise
measure such as the crosscorrelogram would infer a direct connection A→C. In the case of a
common input A driving both B and C, the crosscorrelogram would detect the spurious
connections B→C or C→B. We quantified the ability of the DBN method to infer direct causal
influence in the 3-neuron chain case as well as the common input case and compared it to the
crosscorrelogram method. We found that the crosscorrelogram inferred a significantly higher
number of spurious connections compared to DBN (P < 0.001, two-sample t-test). On average,
the crosscorrelogram inferred 36.2% more connections than DBN for the 3-neuron chain case,
and 64.4% more connections for the common input case as shown in Figure 5.10c.
5.3.5

Network Model-based Prediction of Single Neuron Firing

We further examined the ability of the inferred networks to predict the firing of individual
neurons. The probability of firing of each neuron at any given time point was estimated based on
the firing history of the neuron’s input connections as determined by the network structure for a
given whisker dataset and stimulus onset. Predicted spike trains were obtained by comparing the
probability of firing to a variable threshold (see section 5.2.7) and computing the percentage of
times the spikes in the predicted spike train matched the original spike train (True positives) and
the percentage of times they did not match (False positives) for each threshold value [221]. We
then compared the Receiver Operating Characteristics (ROC) obtained using DBN fit to those
obtained using Generalized Linear Models (GLM), conditioned on the pre-synaptic cells’ history

120

Neuron 17b - Whisker D2

0.5
DBN (Pre-syn. + Stim)
GLM (Pre-syn. + Stim)
Stim

0
0

0.5
False Positive Rate

*

GLM Predictive Power

(a)

1

*

0.5

Stim

DBN Predictive Power

0

1

GLM
DBN
(Pre-syn.+Stim) (Pre-syn.+Stim)
(b)

0.5

0
0
20
40
Within Whisker Lat. Std (ms)

Neuron 2b - Whisker C2

0.5

0
0

1

DBN - GLM Pred. Pow.

Predictive Power

1
True Positive Rate

True Positive Rate

1

DBN (Pre-syn. + Stim)
GLM (Pre-syn. + Stim)
Stim

0.5
False Positive Rate

1

1

0.5

0
0

0.5
1
DBN Predictive Power
(c)

0.4
0.2
0
-0.2
-0.4
0
20
40
Within Whisker Lat. Std (ms)

(d)
(e)
Figure 5.11: Predicting single neuron firing. (a) ROC curves of sample neuron 2b in response to
whisker C2 deflection (left) and neuron 17b in response to whisker D2 deflection (right) in rat
R2. Predicted spike trains were obtained using the pre-synaptic cells’ history inferred by DBN
and stimulus onset time, pre-synaptic cells’ history inferred by the GLM and stimulus onset time,
and the stimulus onset time only. Each curve was computed from tenfold cross-validation
datasets. (b) Predictive power comparison across all rats for multiple models (mean ± SD). * P <
0.001, two-sample t-test. (c) GLM predictive power versus that obtained using DBN and
stimulus onset time. Black diagonal line represents equal prediction (n = 240). (d) Predictive
power obtained using DBN and stimulus as a function of the variability in mean first-spike
latency. (e) Difference between predictive power for each neuron obtained using DBN and
stimulus and that obtained using GLM as a function of the variability in mean first-spike latency.
Each dot corresponds to one neuron for a given whisker (n = 240). Gray lines in (d) and (e)
indicate regression lines.
121

and the stimulus [98]. GLM has been shown to fit spike train data in a number of brain structures
[83, 98, 221]. Figure 5.11a demonstrates sample ROC curves of two cells. As can be seen, the
predictive power of DBN conditioned on the pre-synaptic cells’ history as well as the stimulus
onset was the highest, suggesting that network inference was the most accurate.
To quantify the predictive power for each neuron, we used the area under the ROC curve
(AUC) [221]. An AUC of 0 indicates a prediction that is similar to chance level, while an AUC
of 1 indicates perfect prediction with a highly deterministic conditioned response. Figure 5.11b
shows a comparison between the predictive power of different models. DBN predictive power
conditioned on the pre-synaptic cells’ history and the stimulus onset time was significantly better
than that obtained conditioned on the stimulus onset time only (stimulus only: 0.2±0.24, DBN
pre-synaptic cells’ history and stimulus: 0.58±0.16, P = 0, two-sample t-test). The predictive
power of the DBN was slightly better than that of the GLM (GLM: 0.53±0.16, P < 0.001, n =
240, two-sample t-test) as illustrated in Figure 5.11c. DBN predictive power was inversely
proportional to the variability in response latency as seen in Figure 5.11d (r = -0.56, P < 1e-20, n
= 240, t-test). Thus, better overall prediction was obtained for cells with small variance in the
response temporal precision compared to cells with high variance. For this group of cells, DBN
prediction was better than GLM as shown in Figure 5.11e (r = -0.26, P < 0.001, n = 240, t-test).

5.4 Position-specific Networks in the Medial Prefrontal Cortex
We used DBN to infer causal relationships between simultaneously recorded neurons in the
medial prefrontal cortex (mPFC) of an awake, behaving rat, during a working memory task [13,
29]. This brain area is known to integrate multiple inputs during decision making to guide motor
behavior. This experiment was carried out in Gyorgy Buzsaki’s laboratory at Rutgers University
122

in which rats were trained to decide, based on an odor cue, whether to traverse the left arm or the
right arm of a T-maze to be rewarded. A session consisted of 24 right-turn trials and 18 left-turn
trials. Using crosscorrelograms features, recorded neurons were found to fire selectively in
different regions of the maze and exhibit some form of pair-wise connectivity. These mPFC
neurons were therefore argued to encode behavioral goal representation that is mediated by
integrating a myriad of sensory inputs. In addition, inferring monosynaptic connectivity between
some of the recorded neurons was hypothesized to form local dynamic circuits indicative of
short-term plasticity [13].
Spike trains from 39 simultaneously recorded neurons in layer 2/3 of the mPFC were
examined here for the existence of consistent networks using the graphical models discussed
above. Spike trains were first downsampled to 2000 Hz (0.5 ms bins). This permitted detecting
connectivity with synaptic latency of 0.5 ms or more. The length of each maze arm was
normalized between 0 and 1 and divided into 5 intervals (0 0.2, 0.2 0.4, 0.4 0.6, 0.6 0.8
and 0.8 1). The spike train data corresponding to each interval were concatenated together to
nd

form 5 position-specific datasets for each side (2
Trial 1

Trial 2

row in Figure 5.12). Each of the positionTrial 3

Trial 24
...

0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
T1 T2 . . . T24
0 0.2

T1 T2 . . . T24
0.2 0.4

D1 D2 D3 D4
D5 D6 D7 D8

D1 D2 D3 D4
D5 D6 D7 D8

0 0.2 0.4 0.6 0.8 1

T1 T2 . . . T24 T1 T2 . . . T24 T1 T2 . . . T24
0.4 0.6
0.8 1
0.6 0.8
D1 D2 D3 D4
D5 D6 D7 D8

D1 D2 D3 D4
D5 D6 D7 D8

D1 D2 D3 D4
D5 D6 D7 D8

Figure 5.12: Partitioning the spike trains to obtain position-specific datasets for 24 right-turn
trials (similar analysis for 18 left-turn trials).

123

specific datasets was subsequently divided into 8 smaller datasets, where each dataset was
rd

formed by randomly selecting 6 sub-trials and grouping them together (3 row in Figure 5.12).
For each position-specific dataset, connections with synaptic latency in the range [0.5, 3] ms
with a step of 0.5 ms were identified. Figure 5.13 illustrates the networks obtained for each maze
position interval for the left and right datasets. Only connections inferred in at least 3 out of the 8
datasets examined are shown. We then examined the similarity between the inferred networks for
different maze positions in Figure 5.14a. Figure 5.14b illustrates that the similarity decreases
with increasing separation between maze positions. This suggests a graded transition between
network states as the rat navigates through the maze. In addition, we found that networks
corresponding to left arm datasets have more connections (17.8±5.3) than those of right arm
datasets (12.6±7). This may be attributed to the difference in firing rate of each neuron for left
and right trials, where 62% of the analyzed neurons were more active during left trials than right
trials.

5.5 Discussion and Conclusion
Neural coding theories in the sensory neocortex of many species posit that fast integration of
sensory information is crucial to the organism’s ability to guide motor actions. In the rat
somatosensory cortex, previous studies have consistently demonstrated that putative pyramidal
neurons in layer IV receiving input from trigeminal nuclei through VPM thalamus showed the
prevalence of temporal coding over rate and correlation coding [48, 205, 225-227]. Whether
temporal coding at the population level provides sufficient information to subserve the rapid
sensorimotor integration mechanisms needed to perform active whisking remained poorly
understood.
124

192
186

25

130

49

246
271

135

30

22

32

295

55

207

246

139

182

30

133

271

135

30

247

131

245
244

133

Position = 0 -> 0.2 Right
201

Position = 0 -> 0.2 Left
190
143
32
72
201
22
207
295
55

73
40
Position = 0.2 -> 0.4 Left
45
295
246
22
30
55
139
32
131
19
130 135
197
49 40 186
143
271
207
271
201
192 52
73 190
247
71
138
72

192
Position = 0.2 -> 0.4 Right

135

245
54
Position = 0.4 -> 0.6 Right
143
244
295 22 135
190
186
246
45

139
190 40
135
49 216
249
72
130 25 192
143 247
244
52
133
295
30
245
186
246

182

131

186

71

201

246

201

Position = 0.4 -> 0.6 Left
131

197

295

190

72
52

139 182
247
271
241

40

201

130

30
131

Figure 5.13: Networks inferred from left and right trials data for different maze position intervals.
Only connections appeared in at least 3 out of the 8 datasets of each position interval are shown.
Clear dissimilarity between network models of left and right trials can be seen.

125

Position Interval

Network Similarity
0.19
0.18
0.17
0.16
0.15

0-0.2
0.2-0.4
0.4-0.6
0.6-0.8
0.8-1

Average F-Measure

Position = 0.6 -> 0.8 Right
Position = 0.6 -> 0.8 Left
245
295 182
143
182
197 244
19
246
192
49
22 52 247
25
49
130
150
30
72
135 25 190
30
135
216
186
73
131
186
249 40
139
72 245
130
32
19
246
216
150
295
201
201
133
207
Position = 0.8 -> 1 Right
Position = 0.8 -> 1 Left
138
52
197
25
130 30
138 246
45
40
186
131 245
72
19 241
283 247 133 143 271 192
216
190
135 54 249
216
55
22
73
71
201
135
201
Figure 5.13 (cont’d)

283
54
271
138

241

0.2
0.18
0.16

0.2 0.4 0.6 0.8
0-0.2 0.2-0.40.4-0.60.6-0.8 0.8-1
Position Intervals Distance
Position Interval
(b)
(a)
Figure 5.14: (a) Similarity between the inferred networks at different maze positions computed
using the F-Measure (equation 4.9) averaged across both sides. On-diagonal entries are
eliminated. (b) Average similarity between the inferred networks vs. distance between maze
position intervals. Values were extracted from the matrix shown in (a). It can be seen that
networks become increasingly dissimilar as a function of the distance between position intervals,
suggesting a differential population encoding mechanism that depends on the behavioral goal.

126

Here, we examined the dynamics of spike timing correlation between local, simultaneously
observed neurons in layer V in response to unilateral whisker stimulation in the anesthetized rat.
We showed that rapid network dynamics between these neurons, as determined by the stable,
whisker-specific dynamic Bayesian networks, provided evidence of a synergistic code that
mediates information flow. Furthermore, we were able to demonstrate that the effective
connectivity revealed by the structure of these networks provided more information about the
stimulus than what is provided by both temporal and rate codes of each neuron analyzed
individually. In particular, we showed that the decoding performance of these networks was
~18% higher compared to that of the first-spike latency, and that the inferred connections were
important in predicting individual neurons’ firing patterns. This agrees with previous studies that
addressed similar questions in other brain areas [53, 80, 83], albeit at a much finer temporal and
spatial resolution.
It is widely accepted that complex, possibly nonlinear, response characteristics occurring at
the population level such as those described here are more pronounced in higher cortical areas
such as the prefrontal cortex, or subcortical areas such as the hippocampus and the thalamus [13,
228, 229]. These characteristics have been also hypothesized to underlie the fast oscillatory
patterns observed throughout many cortical layers [230, 231]. These patterns are known to
originate in the cortex and do not require rhythmic drives from the thalamus, as demonstrated by
direct electrical stimulation studies [116, 212]. Here we demonstrate that the origin of these
dynamics may be rooted in the precise temporal coordination among layer V neurons possibly to
subserve the integration of multiple information processing streams needed to mediate
sensorimotor transformations by areas innervated by layer V, and in particular the motor cortex.
Whether the connectivity inferred in this study represents actual anatomical connectivity
127

between the cells was not possible to achieve with our recording or analysis methods.
Nevertheless, there is substantial evidence in the literature indicating that local anatomical
connectivity in sensory cortices is predominantly present [232, 233], and our results provide
strong support of these findings.
A plausible interpretation of our findings would be that since layer V is a major output of the
barrel cortex to other brain areas such as the thalamus, the motor cortex and the pontine nuclei
[207], information has to be rapidly integrated across cortical columns to regulate the whisking
behavior needed to discriminate between different objects that come in contact with the whiskers
[211]. Thus, these high-level functions might require a gain-modulation mechanism that is
dynamically shaped by a variable number of participating neurons coordinating their
information-bearing signals to represent stimulus attributes that cannot be provided by single
units responding individually to their principal whiskers. Our results therefore suggest a strong
account for a synergistic coding mechanism in layer V that may reflect stimulus-dependent states
of the observed population. While these findings agree with previous reports of multi-whisker
integration in layer V of the barrel cortex at the single cell level [205, 207], they provide the first
evidence that this integration occurs at the network level within millisecond timescales.
It is noteworthy that the recurrent nature of cortical circuits, particularly those present in
layer V, makes it especially difficult for pair-wise connectivity measures such as
crosscorrelograms to differentiate between direct and indirect coupling, such as in a neuronal
chain, or when a common input is present. The DBN approach overcomes these limitations, as it
integrates immediate evidence with prior information (long-term knowledge) and uses this
process to explain away unlikely causes of firing.

128

It is important to bring our findings in perspective with two related hypotheses, namely cellassemblies and synfire chains [179, 234]. Integration across neighboring columns as
demonstrated by our analysis seems to support the cell assembly hypothesis given the active,
dynamic participation of the observed neurons in whisker-specific networks. This implies that
each post-synaptic neuron in the assembly “reads” patterns of firing of its pre-synaptic peers
within temporal integration windows with possibly variable lengths. This is particularly
interesting, in part because cell-assemblies that reflect different degrees of synchrony between
their elements have been hypothesized to correlate with presumed ‘top-down’ processing of
sensory information [228, 235]. Herein, our findings suggest that similar mechanisms occur
during bottom-up processing where sensory information is propagated to upstream cortical
networks, and that coordination among the cells at millisecond timescale is a more wide spread
phenomenon than previously thought.
Finally, consistent with these findings in the somatosensory cortex, data from the medial
prefrontal cortex (mPFC) showed a similar trend in which networks inferred during a working
memory task were position-specific. We found that the similarity between the inferred networks
is inversely proportional to the distance between the corresponding positions, suggesting a
graded transition between network states as the rat navigates through the maze. These findings in
both somatosensory cortex and mPFC suggest that localized network codes may represent a
universal encoding mechanism for mediating information flow across many neocortical
structures.

129

Chapter ____________________________________________

6

Network Dynamics Associated with
Experience-dependent Plasticity in
the Rat Somatosensory Cortex

______________________________________________________________________________

6.1 Experience-dependent Plasticity in the Barrel Cortex
Brain plasticity, or plasticity in short, refers to the continuous change in the neurons’ intrinsic
properties or the connections among them that occurs due to a variety of reasons such as
experience, aging, brain diseases or injury [31, 236]. The problem of identifying plasticity in a
recorded neural population has long been the subject of intense research given its importance in
understanding how the brain develops and adapts [66, 86, 237, 238]. Experience-dependent
plasticity might be the most vital to brain development because it includes memory formation,
accompanies learning and sensory alteration. The latter results in reorganization of the brain area
that represents the altered sensory inputs [239, 240]. For example, sensory alteration has been
studied in humans and monkeys by amputating hand fingers or blocking nerves that innervate
them resulting in reorganization of connectivity in the somatosensory cortex [241, 242].
Similar phenomena have been observed in the somatosensory cortex of adult rats as a result
of whisker trimming [198]. For example, sparing a single whisker results in an expansion of the
barrel corresponding to that whisker and shrinkage of the barrels corresponding to the deprived
whiskers [243]. In this chapter, we focus on experience-dependent plasticity induced by
trimming all whiskers but two, termed whisker pairing. Consistent with previous studies, we

130

hypothesize that whisker pairing will result in merging of the two barrels corresponding to the
spared whiskers [198]. This merging form of plasticity is consistent with Hebb’s hypothesis
which postulates that “neurons that fire together, wire together” [179]. This can be explained by
virtue of the fact that in the case of whisker pairing, rats are likely to rely on the two spared
whiskers during active whisking episodes to palpate new objects and explore the surroundings.
Subsequently, neurons of the two corresponding barrels would likely fire nearly simultaneously
than if all whiskers were intact. There are multiple physiological mechanisms behind this type of
plasticity. For example, the blockade of NMDA receptors or the lack of α-CaMKII
(calcium/calmodulin-dependent protein kinase II, type α) impairs plasticity in the somatosensory
cortex [198]. In addition, inhibitory connections are also altered where the number and density of
GABA synapses in layer IV was observed to decrease significantly as a result of whisker
deprivation. Other studies have also demonstrated a significant increase in excitatory axonal
projections from the barrels corresponding to the deprived whiskers to the barrels corresponding
to the spared whiskers, whereas a significant increase in the inhibitory axonal projections occurs
in the opposite direction [244].
At the individual neuron level, neuronal firing rates in layer IV in one principal sparedwhisker (W1) barrel become more similar to those in the other spared-whisker (W2) barrel in
response to deflection of W2 [245]. Plasticity in layer IV starts 3 days post whisker trimming and
reaches its peak 7-10 days post whisker trimming [246], while in layers II/III and layer V it
starts as early as 1 day post whisker trimming [247]. The elevated response of the neurons of one
whisker to the other spared whisker is hypothesized to result from the emergence of new
connections between the two corresponding barrels or the strengthening of already existing
connections [198]. Changes in the effective connectivity, measured using crosscorrelograms,
131

across the barrels corresponding to the spared whiskers have been reported in layer IV indicating
an increase in the strength of these connections [213].
In this chapter, we examine the applicability of Dynamic Bayesian Networks (DBNs) in
tracking plasticity induced by whisker pairing at the network level. We hypothesized that
stimulus-specific networks inferred for the spared whiskers will exhibit more similarity after
pairing compared to networks inferred before pairing in the same animal. We provide evidences
that plasticity induced by whisker pairing occurs in layer V and that DBN can track changes in
the stimulus-specific networks.

6.2 Methods

6.2.1

Barrel Cortex Recording and Electrode Implantation

Four adult female Sprague Dawley rats weighing ~300 g were used in this study. All
procedures involving animals were approved by the Michigan State University Institutional
Animal Care and Use Committee (IACUC). Animals were anesthetized using a cocktail of
ketamine and xylazine (75 and 5 mg/kg injected intrapertoneally, respectively). The left
somatosensory cortex was exposed (4 × 4 mm craniotomy, 0-4 mm posterior and 4-8 mm lateral
to bregma). A 32-channel microelectrode silicon array (NeuroNexus Technologies, Ann Arbor,
MI, USA) with 8 shanks, 4 recording sites/shank, 200 μm shank separation and 100 μm electrode
separation within shank was advanced into the barrel field in 100 μm steps. Acquired signals
were amplified and band-pass filtered in the range 300-5000 Hz and sampled at 25 KHz.
Stimulus-driven activity was observed at depths of 1100-1500 μm corresponding to layer V of

132

the barrel cortex. After reaching the desired depth in layer V, the electrode array was secured in
place using dental cement.
Rats were left to recover for 7-10 days post surgery after which they were anesthetized and
stimulus-driven S1 activity was recorded (control data). For each rat, 3 whiskers contralateral to
the side of the implant were selected for mechanical stimulation based on the observed neuronal
response to manual deflection (mapping experiment). The selected whiskers were deflected one
at a time by inserting each whisker into a capillary tube glued to a piezoelectric bimorph (Piezo
Systems, Cambridge, MA, USA). Each whisker was horizontally deflected 900 times with a
displacement of 80 μm for 100 ms (rise time and fall time were each set to 1 ms) at 1 Hz
frequency. Two whiskers were then selected to be paired and all other whiskers on the same side
of the rat’s mystacial pad were trimmed to the skin level. Rats were returned back to their cages
where enrichments were introduced to encourage them to actively whisk new objects using the
spared whiskers. After 1-2 days and 6-7 days post-whisker trimming, rats were anesthetized and
stimulus-driven activity was recorded (plasticity data) in response to spared whiskers deflection.
At the end of the 1-2 days recording session, re-grown whiskers (not previously spared) were
trimmed again to restrict the pairing to the same pair of whiskers.
For both control and plasticity data, spikes from multiple single unit activity were detected
and sorted using NeuroQuest; a MATLAB toolbox for neural data processing and analysis [214,
215]. Spikes were declared present if the raw waveform surpassed a threshold set at 3 times the
noise standard deviation. A spike length of 1 ms was used for spike sorting (0.25 ms pre
threshold crossing and 0.75 ms post threshold crossing). Spikes were aligned at their trough.
Principal Component Analysis (PCA) was applied to the detected spikes, and the first 2 principal
components were used as features for spike sorting. An average population size of 23.2±6.7
133

single units/rat was recorded. Spike trains were binned at Δ=1 ms. Quality of the spike sorting
was assessed using inter-spike interval histogram (ISI) to ensure that no spikes with inter-spike
intervals of less than 2 ms were classified as belonging to the same unit.
6.2.2

Data Analysis

For each neuron, the total number of evoked spikes in a given trial for each stimulated
whisker was computed as the difference between the total number of spikes fired by the neuron
within the 100 ms-trial and the total number of spikes fired by the neuron in the 100 ms window
preceding the trial. The mean first-spike latency of each neuron for a given whisker was
computed using the same expression given in equation (5.1).
To infer the stimulus-specific networks for both control and plasticity data, a total of 100
datasets, 18 sec each, for each whisker were extracted in a similar manner to the datasets used in
Chapter 5 from the recorded 900 trials/whisker. Each dataset was formed by concatenating 180
trials that were randomly chosen with a uniform distribution from the 900 trials. The spike trains
of each dataset were analyzed using DBN with Markov lags in the range [1, 5] bins ([1, 5] ms).
In case a connection was inferred at more than one Markov lag, only the largest lag was
considered. The maximum number of pre-synaptic cells for each cell was set to 10.
Similar to the analysis presented in Chapter 5, the similarity between the inferred networks
was quantified as follows: we first represented each inferred network as an n × n binary
adjacency matrix A, where n is the total number of neurons in the network. Each element A(i, j)
takes the value ‘1’ if there is a connection from neuron i to neuron j and ‘0’ if there is no
connection between the corresponding neurons. For a given population of n neurons, K deflected
whiskers and M datasets per whisker, all the adjacency matrices of the inferred networks were

134

2

vectorized and stacked together into one KM × n matrix. Principal component analysis (PCA)
was then applied to this matrix to extract significant features from the inferred networks by
2

projecting the adjacency matrices into a p-dimension network space, where p ≤ n , that accounts
for most of the variance in the networks [218]. The distance D(Al, Am) between a pair of
adjacency matrices Al and Am was defined as

D( Al , Am ) = ql − q m

(6.1)

where ql and qm are the projections of Al and Am in the p-dimension network space, respectively,
and ||.|| is the Euclidean distance (lp-norm) between the two projections. The number of principal
components used p was set to 2. The average distance between the networks inferred for a given
pair of whiskers w1 and w2, D (w1 , w2 ) , was defined as

D (w1 , w2 ) =

2
M

w
w
D (Al 1 , Am 2 )
2 ∑∑

(6.2)

l m

6.3 Results

6.3.1

Changes in Individual Neuron Response

We recorded a total of 325 single units from layer V of the barrel cortex in four chronically
implanted rats. Microelectrode arrays with 32 channels were used in each rat to record neural
responses to unilateral stimulation of individual whiskers, one at a time. One week postimplantation, stimulus-driven activity was recorded (control data) over a number of days after
which all but two whiskers on one side of the rat’s face were trimmed to the level of the skin to
induce experience-dependent plasticity. Stimulus-driven activity was then recorded 1-2 days and
135

6-7 days post whisker trimming (plasticity data) as illustrated by the timeline in Figure 6.1a.
Each spared whisker was stimulated 900 times at a frequency of 1 Hz.
To assess the changes in the response of individual neurons as a result of whisker pairing, we
computed the number of evoked spikes by each neuron as well as the first-spike latency in
response to the deflection of each of the spared whiskers. Whisker pairing was expected to result
in an increased similarity in the response of the neurons to the deflection of each of the spared

Record

Record
Trim all
but two

Trim re-grown
Keep two

Start
Pairing

Control

Diff. in 1 Sp. Lat.

*

0.4

st

Diff. in Response

*

0.6

0.2
0

Control

Maintain
Pairing

6-7 Days

1-2 Days

(a)

0.8

Record

1-2

6-7

Days
Post-Trimming

0.4

*

0.3

*

0.2
0.1
0

Control

1-2

6-7
Days
Post-Trimming

(c)
(b)
Figure 6.1: (a) Timeline of the experiment. (b) Difference in the evoked response and (c) in the
first-spike latency across the spared whiskers averaged across neurons and subjects for the
control data (blue) and the plasticity data (red) recorded 1-2 days and 6-7 days post whiskertrimming. Y-axis was normalized by the maximum evoked response in (b) and the maximum
first-spike latency in (c) across the spared whiskers for each neuron. *P < 0.05, two-sample t-test.

136

whiskers [213, 245-247]. We quantified the similarity in the response by the difference in the
evoked number of spikes and in the first-spike latency across the spared whiskers. There was no
significant change in this difference 1-2 days post-trimming, however, this difference decreased
significantly 6-7 days post-trimming as illustrated in Figure 6.1b and Figure 6.1c (Normalized
difference in evoked spikes for control data: 0.41±0.23, 6-7 days post-trimming: 0.29±0.21;
Normalized difference in first-spike latency for control data: 0.18±0.12, 6-7 days post-trimming:
0.15±0.11, P < 0.05, two-sample t-test). This indicates an increased similarity in the response of
individual neurons to the spared whiskers 6-7 days post-trimming as a result of pairing.
6.3.2

Changes in Network Response

For each stimulated whisker in both control and plasticity data, one hundred 18-sec long
spike train datasets were formed from the 900 trials of each whisker by randomly choosing 180
trials out of the 900 trials following a uniform distribution. DBN fit was then obtained for each
of these datasets. Similar to the analysis in Chapter 5, principal component analysis (PCA) was
used to construct a network feature space where each point in that space corresponds to one
network. Therefore, we quantified the similarity between a pair of networks as (1 – the
normalized distance between their corresponding projections in the PCA network space). The
normalization constant was computed as the maximum distance between the projections of the
spared whiskers’ networks obtained from both control and plasticity data of all rats. Distance
normalization ensured that the maximum possible pair-wise distance measure in the network
feature space does not exceed ‘1’.
Figure 6.2a illustrates the network space of one sample rat for the control data and after 7
days of pairing of whiskers D4 and D5. The figure clearly shows the increased similarity

137

between the networks of these two whiskers compared to the control data illustrated by the
merging of the two clusters of whiskers D4 and D5. Figure 6.2b summarizes the average
similarity across all rats between the networks of the spared whiskers before and after pairing. A
significant increase in the similarity was observed that is proportional to the number of days
post-trimming (Network similarity for control data: 0.58±0.1, 1-2 days post-trimming: 0.66±0.1,

Day 7

Control
D4
D5
D6

0

D4
D5
D6

2
PC2

PC2

2

-2

0
-2

-2

0
PC1

2

-2

Paired Whiskers Net. Sim.

(a)

1

*

0
PC1

2

*

0.5

0

Control

1-2

6-7
Days
Post-trimming
(b)
Figure 6.2: (a) Control (left) and 7-days post whisker-pairing (right) network feature space of a
sample rat for 3 different whiskers (D4, D5 and D6). Whiskers D4 and D5 were paired after
recording the control data. Each dot corresponds to the projection of one network onto a 2dimension principal components (PC1 and PC2) feature space. (b) Network similarity across
spared whiskers for the control data (blue) and data recorded 1-2 days and 6-7 days post-pairing
(red) averaged across 4 subjects (mean ± SD). Similarity for a given pair of networks was
quantified as 1 – the normalized distance between the projections of the pair in the principal
component space. *P < 0.05, two-sample t-test.

138

6-7 days post-trimming: 0.71±0.1, P < 0.05, two-sample t-test). This finding suggests that the
change in the functional/effective connectivity between neurons as a result of whisker pairing
could be tracked using DBN and is consistent with the longstanding Hebbian plasticity
hypothesis.

6.4 Discussion and Conclusion
Identifying plasticity in cortical neural ensembles in vivo is important in studying systems
neuroscience to permit tracking the dynamics of biological cortical networks during learning and
behavior. In this chapter, we provide evidence that graphical models such as DBNs can track
experience-dependent plasticity in layer V of the rat barrel cortex after sensory deprivation. Our
results indicate a gradual increase in the similarity among stimulus-specific networks inferred for
the spared whiskers that peaks after 6-7 days of pairing. Although no significant change was
observed after 1-2 days of pairing at the individual neuron level, we were able to detect
increased similarity at the network level after the same time period. This suggests the
effectiveness of such algorithms in tracking plasticity that might not be feasible to detect using
simple measures such as changes in firing rates or response latency.
Our experimental paradigm provides a more accurate quantification of the effects induced by
plasticity compared to previous studies for two reasons: First, changes due to plasticity in this
study were identified in the same animal by using a chronic implant to record neuronal activity
whereas in previous studies, comparisons were carried out between two different groups of
animals using acute recordings: a plasticity group and a control group [213, 245-247].
Identifying plasticity effects in the same animal could provide results that are more objective
than comparing effects across animals. The downside with our experimental paradigm might be
139

the issues associated with chronic implants such as losing the signals after some time and the
need to anesthetize the animals before every recording session. This could affect the overall
condition of the animals. Second, this study is the first study, to our knowledge, that attempts to
quantify whisker pairing-induced plasticity at the ensemble level in the barrel cortex in vivo.
With the exception of only one study that quantified changes in the crosscorrelograms of pairs of
neurons in the barrels of the spared whiskers [213], other studies only hypothesized that the
changes in neuronal responses due to whisker pairing result from changes in the neuronal
networks in the corresponding barrels [198, 243]. These changes were hypothesized to be in the
form of either emergence of new connections across the two barrels or strengthening already
existing connections [198, 243]. The study presented in this chapter provides evidence for
changes that occur at the ensemble level using a more powerful tool than pair-wise measures to
support such hypothesis.
Although this study provides evidence for the possibility of tracking plasticity at the
population level in vivo, several modifications to the experimental paradigm should be
considered to improve this study: First, more control data needs to be recorded across a number
of days before whisker trimming to establish a baseline of day-to-day changes in neural
responses that should not be attributed to experience-dependent plasticity. The control data used
in this study did not follow the same timeline used for recording the plasticity data, where for 2
rats, 1 control dataset was recorded, while for each of the other 2 rats, 3 control datasets were
recorded over a period of 1 week. Second, more data needs to be recorded during the plasticity
period within the 1 week pairing period as well as after that to determine whether the similarity
observed scales linearly, sublinearly or supralinearly with the plasticity period. Third, the
reversibility of the process needs to be examined, i.e., if all whiskers were allowed to re-grow,
140

the mapping should be repeated to determine if ensemble representation of whisker deflection
would become gradually dissimilar across barrels. Repeated anesthesia might prevent
implementing these suggestions. Recording from awake, head-fixed animals might overcome
such challenge, and there is evidence that plasticity at the individual neuron level could be
detected within only few hours after whisker pairing in these rats [248]. Another challenge,
however, would manifest itself in the need to ensure rats are trained not to move their whiskers
voluntarily during the experimental trial.

141

Chapter ____________________________________________

7

Concluding Remarks and Future
Directions

______________________________________________________________________________

Devising methods to identify neuronal connectivity from simultaneously observed neurons is
crucial to understand how neurons collectively and dynamically respond to external stimuli or
give rise to an observed motor behavior. We have reviewed and proposed a number of methods
that could be used to infer neuronal connectivity and discussed their utility as well as their
limitations in analyzing population activity. In particular, we have emphasized the use of
graphical models to circumvent many of these limitations. The multiscale clustering algorithm
for inferring functional connectivity attempts to find disjoint subspaces with arbitrary structures
in the entire neural space using unlabeled data. Dynamic Bayesian Networks (DBNs), on the
other hand, finds the intrinsic structure of the population underlying each subspace by
transforming the inference problem to a structure learning problem for which many efficient
search algorithms exist. This two-step approach provides a framework to decipher the complex
connectivity patterns that may exist in large neural populations.
From a system identification standpoint, the multiscale clustering step helps alleviate the
uncertainty about the temporal dynamics of the system that are governed by a time-scaledependent interaction between the elements. It therefore takes into consideration the effect of
pre-synaptic cells on the post-synaptic cell firing probability without relying on the hard-tomeasure synaptic latencies. The latter would require measuring post-synaptic potentials in the
142

post-synaptic cell in response to controlled input to a physiologically tagged pre-synaptic cell
(for example, using a current or voltage clamp technique) – a technique that is simply not
feasible to perform in large populations with current state of the art technology. The DBN, on the
other hand, helps alleviate the uncertainty about the spatial dynamics of the system that are
governed by the specific labels of the elements. It does so by integrating the immediate evidence
(which cell fired) with the long-term knowledge (prior probability) and uses it to explain away
unlikely causes of the observed effects. In doing so, it searches for the best graphical model that
fits the data. Taken together, both steps would be essential to capture uncertainties about
dynamic interactions between neurons in large complex networks.
Despite the superior performance of the methods introduced in this dissertation compared to
other classical methods, more work is needed to devise algorithms for estimating some
parameters during the analysis such as the best time scale and the best wavelet for the multiscale
clustering algorithm, and the Markov lag for the DBN. We have provided a number of metrics the Dunn Index, maximum mutual information and the Mean Influence Score – to devise some
suggested values to use for these parameters. These metrics, however, require running the
algorithms using a range of values for each parameter and then choosing the value that
maximizes them. This is a computationally intense task. Future work should provide more
systematic ways of estimating these parameters from the data, for example using maximum
likelihood, prior to running the algorithms. In addition, larger network structures, compared to
the small size of the networks we demonstrated in Chapters 3 and 4, could be used to simulate
the spiking data of populations mimicking real anatomic neuronal networks in which each
neuron receives thousands of connections from other unobserved neurons.

143

Using data recorded from the rat barrel cortex, we showed evidence for the validity of both
multiscale clustering and DBN in Chapters 3 and 5. In particular, we showed that the inferred
connectivity was consistent with known connection probability in local cortical areas and that
this connectivity was sufficient to build a model that accurately predicted the firing of individual
neurons. In addition, we showed that the inferred connectivity using both methods provided a
substantial amount of information about the stimulus that was higher than the information carried
by individual neurons’ response properties. Although the results reported are encouraging to
claim the validity of both methods, more direct validation is still needed in which the anatomical
properties of the connections - rather than their functional properties - are used as ground-truth.
For example, recording neuronal activity from layers II/III, IV and V of the rat barrel cortex and
comparing the direction of the inferred connections across layers to the well-documented
anatomical connections might provide a strong argument towards the validity of these methods.
Although it is almost impossible in this case to confirm that the inferred connectivity directly
corresponds to anatomical connectivity in the same animal because it is practically impossible to
confirm the identity of the recorded cells post-mortem to permit anatomical tract tracing studies.
Matching the direction of the inferred connectivity with what has been reported in the literature
(From layer IV to layer II/III and from layer IV to layer V [198]), however, should be a good
first step. Another way to validate the proposed methods using anatomical ground-truth is to
compare properties of the networks inferred from animal models of brain disorders (such as
stroke or Alzheimer’s disease) to control subjects. For example, Alzheimer’s disease is known to
correlate with a significant loss in the total number of connections [249]. Therefore, we would
expect that the networks inferred from Alzheimer’s subjects using the proposed methods would
have a significant decrease in the total number of connections compared to control subjects.
144

Another extension to the barrel cortex study is to examine the separation between signal and
noise correlations [81, 250]. Signal correlation refers to the correlation between the mean firing
rates of a pair of neurons in response to stimulus presentation. Noise correlation, on the other
hand, refers to the correlation observed in the trial-to-trial deviation of the firing rates from their
mean values. Noise correlation has been often attributed to the possibility of anatomical
connectivity among the neurons. This inherently assumes that the stimulus directly influences the
firing of all of the observed neurons and that the contribution of the anatomical connectivity is
purely noise [83]. The distinction between signal and noise correlation in the barrel cortex data
could be examined using DBN by adding a variable (a node) in the graphical model to represent
whisker deflection as illustrated by the simple example in Figure 7.1. In this case, connections
that are inferred using DBN between the added node and the neurons will correspond to the
signal influence (dotted edges) whereas connections among the neurons will correspond to noise
(solid edges). Signal correlation for a given pair of neurons can be inferred from the graph if both
neurons receive a connection from the whisker node whereas noise correlation can be inferred if
they both receive a connection from another common neuron. In this model, a pair of neurons
can be correlated in both signal and noise if they receive common connections from the whisker

1

3

W

5
2

4

Figure 7.1: A DBN model that distinguishes between signal and noise correlations. Gray node
(W) takes a value ∈ {0, 1}, where 0 indicates no whisker deflection, while 1 indicates the
deflection of any whisker. Dotted edges indicate signal influence while solid edges indicate noise
influence. Signal correlation can be inferred from the graph by searching for neurons receiving
connections from W (such as neurons 1 and 2) whereas noise correlation can be estimated by
searching for neurons receiving connections from a common parent (such as neurons 3 and 4).
145

node as well as another common neuron. The model given in Figure 7.1 can be further extended
to include multiple whisker nodes in the graph, one for each whisker, to investigate whiskerspecific signal correlations. This proposed model could help in distinguishing between
connections inferred from the data as a result of stimulus-driven statistical dependence and those
that represent other causal influence among the neurons. Spontaneous data could be utilized as
well to examine connectivity that is not stimulus-driven. However, low firing rates of neurons
associated with spontaneous activity compared to stimulus-driven activity might hinder the
ability of the proposed methods to infer the connections using the same amount of data.
The methods presented in this dissertation were designed specifically to analyze spike train
data. An important factor to consider that may affect the performance is the accuracy of spike
sorting [55, 251]. Although we did not assess the effect of spike sorting errors on the accuracy,
we suspect that errors above certain level may affect the accuracy, for example, when resolving
spike overlaps. A detailed simulation study is needed to address how sensitive these methods are
to spike detection and sorting deficiencies.
Connectivity inference methods used here were designed for offline analysis. Developing
real-time implementation of these methods would be extremely useful for applications that
require continuous knowledge of the underlying neural dynamics such as decoding motor cortex
neural signals for Brain Machine Interfaces (BMI) or tracking rapid plasticity. For decoding
applications, knowledge of the stimulus prior as well as the connectivity between observed
neurons may improve the ability to decode spike trains in a Bayesian sense [101, 194, 252, 253].
One way to think about this is to consider factorizing the likelihood function - the joint neural
response given the stimulus - as a product of marginal densities. Each density would express an
independent neural response to the stimulus given its pre-synaptic inputs, which would be
146

inferred by the connectivity algorithm. In a simulation study, our results suggested that distinct
and consistent network connectivity patterns may arise during goal-directed 2D arm reach
movements [85]. When these connectivity patterns were used to estimate the parameters of a
Bayesian decoder offline, the decoding performance was significantly improved compared to the
case when information about network connectivity was absent [194]. Experimental results by
other groups using the activity of retinal ganglion cells in vitro to decode visual scenes agree
with the above findings, i.e., better scene reconstruction was obtained when information about
connectivity was taken into account [83]. Tracking the underlying connectivity online could
therefore improve the decoding performance. Given the extensive computational run-time of the
methods presented here, a potential alternative is to use non-negative matrix factorization (NMF)
to speed the analysis. NMF factorizes a matrix into the product of a non-negative bases matrix
and a non-negative encoding matrix [254]. It has been used to relate the firing rates of
simultaneously recorded single neurons to behavioral variables [255] and to infer the functional
2+

connectivity from firing rates extracted from Ca

imaging data [256]. Inferring neuronal

connectivity using NMF from extracellular recordings has not been fully explored yet. This
could be achieved by applying NMF to the continuous-time firing rates of the neurons to
eliminate the combinatorial optimization required by DBN to find the best discrete structure that
fits the data. NMF, however, assumes linear relationships between the examined random
variables.
Lastly, quantifying neural plasticity can greatly benefit from the ability to systematically
identify network connectivity. For example, learning and memory formation has long been
hypothesized to accompany changes in network connectivity [145, 257-262], as well as recovery
after spinal cord injury, traumatic brain injury and stroke [143, 259, 263, 264]. We have shown
147

that the methods presented here can track changes in the shape and number of neuronal clusters
in simulated data [27]. We have also shown how they can track network dynamics associated
with experience-dependent plasticity after sensory deprivation. An ultimate goal would be to
validate these methods with anatomical and/or imaging data.

148

REFERENCES

149

REFERENCES
[1]

R. W. Williams and K. Herrup, "The control of neuron number," Annu. Rev. Neurosci.,
vol. 11, pp. 423–453, 1988.

[2]

Y. Tang, J. R. Nyengaard, D. M. D. Groot, and H. J. Gundersen, "Total regional and
global number of synapses in the human brain neocortex," Synapse, vol. 41, pp. 258–273,
2001.

[3]

S. L. Bressler, "Large-scale cortical networks and cognition," Brain Res., vol. 20, pp.
288–304, 1995.

[4]

F. Varela, J. P. Lachaux, E. Rodriguez, and J. Martinerie., "The brainweb: phase
synchronization and large-scale integration," Nat. Rev. Neurosci., vol. 2, pp. 229–239,
2001.

[5]

M. Catani and D. H. fftyche, "The rises and falls of disconnection syndromes," Brain, vol.
128, pp. 2224–2239, 2005.

[6]

K. Supekar, V. Menon, D. Rubin, M. Musen, and M. D. Greicius, "Network analysis of
intrinsic functional brain connectivity in Alzheimer's disease," PLoS Comput. Biol., vol. 4,
p. e1000100, 2008.

[7]

Y. Liu, M. Liang, Y. Zhou, Y. He, Y. Hao, M. Song, C. Yu, H. Liu, Z. Liu, and T. Jiang,
"Disrupted small-world networks in schizophrenia," Brain, vol. 131, pp. 945–961, 2008.

[8]

M. v. Gerven, J. Farquhar, R. Schaefer, R. Vlek, J. Geuze, A. Nijholt, N. Ramsey, P.
Haselager, L. Vuurpijl, S. Gielen, and P. Desain, "The brain–computer interface cycle,"
Journal of Neural Engineering, vol. 6, p. 041001, 2009.

[9]

R. C. Gesteland, B. Howland, J. Y. Lettvin, and W. H. Pitts, "Comments on
microelectrodes," Proc. IRE, vol. 47, pp. 1856–1862, 1959.

[10]

E. E. Fetz, "Operant conditioning of cortical unit activity," Science, vol. 163, pp. 955–
958, 1969.

[11]

R. Normann, E. M. Maynard, P. J. Rousche, and D. J. Warren, "A neural interface for a
cortical vision prosthesis," Vision Research, vol. 39, pp. 2577–2587, 1999.

150

[12]

K. Wise, D. Anderson, J. Hetke, D. Kipke, and K. Najafi, "Wireless Implantable
Microsystems: High-Density Electronic Interfaces to the Nervous System," Proc. of the
IEEE, vol. 92-1, pp. 76–97, 2004.

[13]

S. Fujisawa, A. Amarasingham, M. Harrison, and G. Buzsáki, "Behavior-dependent
short-term assembly dynamics in the medial prefrontal cortex," Nat. Neurosci., vol. 11,
pp. 823–833, 2008.

[14]

A. B. Schwartz, "Cortical neural prostheses," Ann. Rev. Neurosci., vol. 27, pp. 487-507,
2004.

[15]

A. Jackson, J. Mavoori, and E. E. Fetz, "Long-term motor cortex plasticity induced by an
electronic neural implant," Nature, vol. 444, pp. 56–60, 2006.

[16]

K. J. Friston, "Functional and effective connectivity in neuroimaging: A synthesis," Hum
Brain Mapping, vol. 2, pp. 56-78, 1994.

[17]

E. T. Bullmore and O. Sporns, "Complex brain networks: graph-theoretical analysis of
structural and functional systems," Nature Rev Neurosci., vol. 10, pp. 186–198, 2009.

[18]

S. Dodel, J. M. Hermann, and T. Geisel, "Functional connectivity by cross-correlation
clustering," Neurocomputing, pp. 1065-1070, 2002.

[19]

H. Tamura, H. Kaneko, and I. Fujita, "Quantitative analysis of functional clustering of
neurons in the macaque inferior temporal cortex," Neurosci Res, vol. 52, pp. 311–322,
2005.

[20]

A. M. Aertsen, G. L. Gerstein, M. K. Habib, and G. Palm, "Dynamics of neuronal firing
correlation: modulation of "effective connectivity"," Journal of Neurophysiology, vol. 61,
pp. 900-917, 1989.

[21]

G. L. Gerstein and D. H. Perkel, "Simultaneously recorded trains of action potentials:
analysis and functional interpretation," Science vol. 164, pp. 828-830, 1969.

[22]

E. N. Brown, R. Kass, and P. Mitra, "Multiple neural spike train data analysis: State-ofthe-Art and future challenges," Nature Neurosc., vol. 7 pp. 456-461, 2004.

151

[23]

G. L. Gerstein, "Cross correlation measures of unresolved multi-neuron recordings," J.
Neurosci Methods, vol. 100, pp. 41-51, 2000.

[24]

G. Buzsaki, "Large-scale recording of neuronal ensembles," Nat Neurosci, vol. 7, pp.
446-451, 2004.

[25]

O. Sporns, "Graph theory methods for the analysis of neural connectivity patterns," in
Neuroscience Databases. A Practical Guide, R. Kotter, Ed. Boston: Kluwer, 2002, pp.
169–83.

[26]

B. Denise, W. David, N. Richard, A. Amos, and G. Sonja, "Spatially organized spike
correlation in cat visual cortex," Neurocomputing, vol. 70, pp. 2112--2116, 2007.

[27]

S. Eldawlatly, R. Jin, and K. G. Oweiss, "Identifying Functional Connectivity in LargeScale Neural Ensemble Recordings: A Multiscale Data Mining Approach," Neural
Computation, vol. 21, pp. pp. 450-477, 2009.

[28]

S. Eldawlatly, Y. Zhou, R. Jin, and K. G. Oweiss, "On The Use of Dynamic Bayesian
Networks in Reconstructing Functional Neuronal Networks from Spike Train
Ensembles," Neural Computation, vol. 22, pp. 158-189, 2010.

[29]

S. Eldawlatly and K. Oweiss, "Graphical Models of Functional and Effective Neuronal
Connectivity," in Statistical Signal Processing for Neuroscience and Neurotechnology, K.
Oweiss, Ed.: Academic Press, Elsevier, 2010, pp. 129-174.

[30]

S. Eldawlatly and K. Oweiss, "Millisecond-Timescale Local Network Coding in the Rat
Primary Somatosensory Cortex," PLoS ONE, in press.

[31]

E. R. Kandel, J. H. Schwartz, and T. M. Jessell, Principles of Neuroscience, fourth ed.
New York: McGraw-Hill Health Professions Division, 2000.

[32]

H. Lodish, D. Baltimore, A. Berk, S.L. Zipursky, P. Matsudaira, and J. Darnell,
Molecular Cell Biology, 4th ed. New York: W.H. Freeman, 1999.

[33]

M. D. Greicius, B. Krasnow, A. L. Reiss, and V. Menon, "Functional connectivity in the
resting brain: a network analysis of the default mode hypothesis," PNAS vol. 100, pp.
253–258, 2003.

152

[34]

R. Winder, C. R. Cortes, J. A. Reggia, and M.-A. Tagamets, "Functional connectivity in
fMRI: A modeling approach for estimation and for relating to local circuits," Neuroimage,
vol. 34, pp. 1093–1107, 2007.

[35]

H. Koshino, P. A. Carpenter, N. J. Minshew, V. L. Cherkassky, T. A. Keller, and M. A.
Just, "Functional connectivity in an fMRI working memory task in high-functioning
autism," Neuroimage, vol. 24, pp. 810–821, 2005.

[36]

U. Hasson, E. Yang, I. Vallines, D. J. Heeger, and N. Rubin, "A Hierarchy of Temporal
Receptive Windows in Human Cortex," J. Neurosci., vol. 28, pp. 2539 - 2550, 2008.

[37]

D. Y. Tsao, N. Schweers, S. Moeller, and W. A. Freiwald, "Patches of face-selective
cortex in the macaque frontal lobe," Nature Neuroscience, vol. 11, pp. 877-879, 2008.

[38]

A. H. Bell, F. Hadj-Bouziane, J. B. Frihauf, R. B. H. Tootell, and L. G. Ungerleider,
"Object representations in the temporal cortex of monkeys and humans as revealed by
functional magnetic resonance imaging," Journal of Neurophysiology, vol. 101, p. 688,
2009.

[39]

L. Verhagen, H. C. Dijkerman, M. J. Grol, and I. Toni, "Perceptuo-Motor Interactions
during Prehension Movements," J. Neurosci., vol. 28, pp. 4726 - 4735, 2008.

[40]

M. Lotze, P. Montoya, M. Erb, E. Hulsmann, H. Flor, U. Klose, N. Birbaumer, and W.
Grodd, "Activation of cortical and cerebellar motor areas during executed and imagined
hand movements: an fMRI study," Journal of Cognitive Neuroscience, vol. 11, pp. 491501, 1999.

[41]

A. Caclin and P. Fonlupt, "Functional and effective connectivity in an fMRI study of an
auditory-related task," Eur J Neurosci, vol. 23, pp. 2531-2537, 2006.

[42]

M. d'Esposito, G. K. Aguirre, E. Zarahn, D. Ballard, R. K. Shin, and J. Lease, "Functional
MRI studies of spatial and nonspatial working memory," Cognitive Brain Research, vol.
7, pp. 1-13, 1998.

[43]

N. K. Logothetis, D. Leopold, M. Wilke, A. Maier, N. K. Logothetis, K. Moutoussis, G.
A. Keliris, Z. Kourtzi, N. K. Logothetis, and C. Juchem, "What can and cannot fMRI tell
us about neural activity: A primer in combined imaging and electrophysiology"},"
Journal of Neuroscience}, vol. 25, p. 2526, 2005.

153

[44]

A. B. Schwartz, "Useful signals from motor cortex," J. Physiol., vol. 579, pp. 581–601,
2007.

[45]

P. J. Rousche and R. A. Normann, "Chronic recording capability of the Utah intracortical
electrode array in cat sensory cortex," J. Neurosci. Methods, vol. 82, pp. 1–15, 1998.

[46]

K. Oweiss, "Introduction," in Statistical Signal Processing for Neuroscience and
Neurotechnology, K. Oweiss, Ed.: Academic Press, Elsevier, 2010, pp. 1-13.

[47]

P. Tiesinga, J. M. Fellous, and T. J. Sejnowski, "Regulation of spike timing in visual
cortical circuits," Nat Rev Neurosci vol. 9, pp. 97–107, 2008.

[48]

S. Panzeri, R. S. Petersen, S. R. Schultz, M. Lebedev, and M. E. Diamond, "The role of
spike timing in the coding of stimulus location in rat somatosensory cortex," Neuron, vol.
29, pp. 769-777, 2001.

[49]

M. DeWeese, M. Wehr, and A. Zador, "Binary spiking in auditory cortex," J Neurosci,
vol. 23, pp. 7940–7949, 2003.

[50]

L. R. Hochberg, M. D. Serruya, G. M. Friehs, J. A. Mukand, M. Saleh, A. H. Caplan, A.
Branner, D. Chen, R. D. Penn, and J. P. Donoghue, "Neuronal ensemble control of
prosthetic devices by a human with tetraplegia," Nature, vol. 442, pp. 164–171, 2006.

[51]

J. Reimer and N. G. Hatsopoulos, "Periodicity and evoked responses in motor cortex," J
Neurosci vol. 30, pp. 11506-11515, 2010.

[52]

M. Velliste, S. Perel, M. C. Spalding, A. S. Whitford, and A. B. Schwartz, "Cortical
control of a prosthetic arm for self-feeding," Nature, vol. 453, pp. 1098–1101, 2008.

[53]

K. D. Harris, J. Csicsvari, H. Hirase, G. Dragoi, and G. Buzsaki, "Organization of cell
assemblies in the hippocampus," Nature, vol. 424, pp. 552–556, 2003.

[54]

Y. Ikegaya, G. Aaron, R. Cossart, D. Aronov, I. Lampl, D. Ferster, and R. Yuste,
"Synfire chains and cortical songs: temporal modules of cortical activity," Science, vol.
304, pp. 559–564, 2004.

154

[55]

K. D. Harris, D. A. Henze, J. Csicsvari, H. Hirase, and G. Buzsaki, "Accuracy of tetrode
spike separation as determined by simultaneous intracellular and extracellular
measurements," J. Neurophysiol. , vol. 84, pp. 401–414, 2000.

[56]

M. S. Fee, P. P. Mitra, and D. Kleinfeld, "Automatic sorting of multiple unit neuronal
signals in the presence of anisotropic and non-Gaussian variability," J Neurosci Methods,
vol. 69, pp. 175-88, 1996.

[57]

K. G. Oweiss and D. J. Anderson, "Spike sorting: a novel shift and amplitude invariant
technique," Neurocomputing, vol. 44-46, pp. 1133–1139, 2002.

[58]

M. Aghagolzadeh and K. Oweiss, "Detection and Classification of Extracellular Action
Potential Recordings " in Statistical Signal Processing for Neuroscience and
Neurotechnology, K. Oweiss, Ed.: Academic Press, Elsevier, 2010, pp. 15-74.

[59]

M. Lewicki, "A review of methods for spike sorting: the detection and classification of
neural action potentials," Network Comput. Neural Syst. , vol. 9, pp. R53–R78, 1998.

[60]

Q. R. Quiroga, Z. Nadasdy, and Y. Ben-Shaul, "Unsupervised spike detection and sorting
with wavelets and super-paramagnetic clustering," Neural Comput., vol. 16, p.
1661−1687, 2004.

[61]

G. Zouridakis and D. C. Tam, "Identification of reliable spike templates in multi-unit
extracellular recordings using fuzzy clustering," Comput. Methods Prog. Biomed, vol. 61,
pp. 91–98, 2000.

[62]

S. Shoham, M. R. Fellows, and R. A. Normann, "Robust, automatic spike sorting using
mixtures of multivariate t-distributions," J. Neurosci. Methods, vol. 127, pp. 111–122,
2003.

[63]

D. H. Perkel, G. Gerstein, and G. P. Moore, "Neuronal spike trains and stochastic point
processes. II. Simultaneous spike trains," J. Biophys vol. 7, pp. 419-40, 1967.

[64]

M. Frerking, J. Schulte, S. P. Wiebe, and U. Stäubli, "Spike timing in CA3 pyramidal
cells during behavior: implications for synaptic transmission”," J. Neurophysiol. , vol. 94,
pp. 1528-1540, 2005.

[65]

W. Maass and T. Natschlager, "A Model for Fast Analog Computation Based on
Unreliable Synapses," Neural Computation, vol. 12, pp. 1679-1704, 2000.

155

[66]

G. Q. Bi and M. M. Poo, "Synaptic modifications in cultured hippocampal neurons:
dependence on spike timing, synaptic strength, and postsynaptic cell type," Journal of
Neuroscience vol. 18, pp. 10464-72, 1998.

[67]

H. Markram, J. Lübke, M. Frotscher, and B. Sakmann, "Regulation of synaptic efficacy
by coincidence of postsynaptic APs and EPSPs," Science, vol. 275, pp. 213-215, 1997.

[68]

M. V. Tsodyks and H. Markram, "The neural code between neocortical pyramidal
neurons depends on neurotransmitter release probability " PNAS, vol. 94, pp. 719-723,
1997.

[69]

P. D. Roberts and C. C. Bell, "Spike-timing dependent synaptic plasticity in biological
systems," Biological Cybernetics, vol. 87, pp. 392-403, 2002.

[70]

A. Destexhe and E. Marder, "Plasticity in single neuron and circuit computations,"
Nature, vol. 431, pp. 789-795, 2004.

[71]

R. Dzakpasu and M. Zochowski, "Discriminating different types of synchrony in neural
systems," Physica D., vol. 208, pp. 115-122, 2005.

[72]

W. S. Smith and E. E. Fetz, "Synaptic Interactions Between Forelimb-Related Motor
Cortex Neurons in Behaving Primates," Journal of Neurophysiology, vol. 102, pp. 10261039, Aug 2009.

[73]

T. M. Cover and J. A. Thomas, Elements of Information Theory: Wiley-Interscience New
York, 2006.

[74]

S. Yamada, K. Matsumoto, M. Nakashima, and S. Shiono, "Information theoretic
analysis of action potential trains. II. Analysis of correlation among n neurons to deduce
connection structure," J. Neurosci. Methods, vol. 66, pp. 35-45, 1996.

[75]

L. M. A. Bettencourt, G. J. Stephens, M. I. Ham, and G. W. Gross, "Functional structure
of cortical neuronal networks grown in vitro," Phys Rev E Stat Nonlin Soft Matter Phys,
vol. 75, p. 021915, 2007.

[76]

P.-O. Amblard and O. J. J. Michel, "On directed information theory and Granger
causality graphs," J Comput Neurosci, vol. 30, pp. 7–16, 2011.

156

[77]

B. Gourevitch and J. J. Eggermont, "Evaluating Information Transfer Between Auditory
Cortical Neurons," J Neurophysiol, vol. 97, pp. 2533 - 2543, 2007.

[78]

M. Abeles and G. Gerstein, "Detecting Spatiotemporal Firing Patterns Among
Simultaneously Recorded Single Neurons," Journal of Neurophysiology, vol. 60, pp. 909924, 1988.

[79]

S. Grün, M. Diesmann, and A. Aertsen, "Unitary Events in Multiple Single-Neuron
Spiking Activity: II. Nonstationary Data," Neural Computation, vol. 14, pp. 81-119, 2002.

[80]

E. Schneidman, M. J. Berry, R. Segev, and W. Bialek, "Weak pairwise correlations imply
strongly correlated network states in a neural population," Nature, vol. 440, pp. 1007–
1012, 2006.

[81]

B. B. Averbeck, P. Latham, and A. Pouget, "Neural correlations, population coding and
computation," Nature Reviews Neuroscience vol. 7, pp. 358-366, 2006.

[82]

A. Tang, D. Jackson, J. Hobbs, W. Chen, J. L. Smith, H. Patel, A. Prieto, D. Petrusca, M.
I. Grivich, A. Sher, P. Hottowy, W. Dabrowski, A. M. Litke, and J. M. Beggs, "A
Maximum Entropy Model Applied to Spatial and Temporal Correlations from Cortical
Networks In Vitro," J. Neurosci., vol. 28, pp. 505-518, January 9, 2008 2008.

[83]

J. W. Pillow, J. Shlens, L. Paninski, A. Sher, A. M. Litke, E. J. Chichilnisky, and E. P.
Simoncelli, "Spatio-temporal correlations and visual signalling in a complete neuronal
population " Nature, vol. 454, pp. 995-999 2008.

[84]

Y. Roudi, S. Nirenberg, and P. E. Latham, "Pairwise maximum entropy models for
studying large biological systems: when they can work and when they can't," PLoS
Comput Biol, vol. 5, p. e1000380, May 2009.

[85]

M. Aghagolzadeh, S. Eldawlatly, and K. Oweiss, "Synergistic Coding by Cortical Neural
Ensembles," IEEE Transactions on Information Theory, vol. 56, pp. 875-889, 2010.

[86]

Y. Dan and M. M. Poo, "Spike timing-dependent plasticity of neural circuits," Neuron,
vol. 44, pp. 23-30, 2004.

[87]

C. W. J. Granger, "Investigating causal relations by econometric models and crossspectral methods," Econometrica, vol. 37 pp. 424-438, 1969.

157

[88]

M. Kami ski, M. Ding, W. Truccolo, and S. Bressler, "Evaluating causal relations in
neural systems: Granger causality, directed transfer function and statistical assessment of
significance," Biological Cybernetics, vol. 85, pp. 145-157, 2001.

[89]

K. Sameshima and L. Baccala, "Using partial directed coherence to describe neuronal
ensemble interactions," J Neurosci Methods, vol. 94, pp. 93–103, 1999.

[90]

P. J. Franaszczuk, G. K. Bergey, and M. Kaminski, "Analysis of mesial temporal seizure
onset and propagation using the directed transfer function method," Electroenceph. clin.
Neurophysiol., vol. 91, pp. 413-427, 1994.

[91]

L. Lin, R. Osan, and J. Z. Tsien, "Organizing principles of real-time memory encoding:
neural clique assemblies and universal neural codes," Trends Neurosci, vol. 29, pp. 48-57,
Jan 2006.

[92]

J. L. Peña, S. Viete, K. Funabiki, K. Saberi, and M. Konishi, "Cochlear and neural delays
for coincidence detection in owls," J. Neurosci., vol. 21, pp. 9455–9459, 2001.

[93]

H. Agmon-Snir, C. E. Carr, and J. Rinzel, "The role of dendrites in auditory coincidence
detection," Nature, vol. 393, pp. 268–272, 1998.

[94]

E. Salinas and P. Their, "Gain modulation: a major computational principle of the central
nervous system," Neuron, vol. 27, pp. 15-21, 2000.

[95]

Y. E. Cohen and R. A. Andersen, "A common reference frame for movement plans in the
posterior parietal cortex," Nature Reviews Neuroscience, vol. 3, pp. 553–562, 2002.

[96]

A. J. Cadotte, T. B. DeMarse, P. He, and M. Ding, "Causal measures of structure and
plasticity in simulated and living neural networks," PLoS ONE, vol. 3, e3355, 2008.

[97]

R. Dahlhaus, M. Eichler, and J. Sandkuhler, "Identification of synaptic connections in
neural ensembles by graphical models," J Neurosci Methods, vol. 77, pp. 93–107, 1997.

[98]

W. Truccolo, U. T. Eden, M. Fellow, J. D. Donoghue, and E. N. Brown, "A point process
framework for relating neural spiking activity to spiking history, neural ensemble, and
extrinsic covariate effects," J. Neurophysiol., vol. 93, pp. 1074-1089, 2005.

158

[99]

M. Okatan, M. A. Wilson, and E. N. Brown, "Analyzing Functional Connectivity Using a
Network Likelihood Model of Ensemble Neural Spiking Activity," Neural Computation,
vol. 17, pp. 1927-1961, 2005.

[100] L. Paninski, J. W. Pillow, and E. P. Simoncelli, "Maximum Likelihood Estimation of a
Stochastic Integrate-and-Fire Neural Encoding Model," Neural Computation, vol. 16, pp.
2533-2561, 2004.
[101] R. Barbieri, M. A. Wilson, L. M. Frank, and E. N. Brown, "An analysis of hippocampal
spatio-temporal representations using a Bayesian algorithm for neural spike train
decoding," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol.
13, pp. 131-136, 2005.
[102] R. Barbieri, M. Quirk, L. Frank, M. Wilson, and E. Brown, "Construction and analysis of
non-Poisson stimulus-response models of neural spiking activity," Journal of
Neuroscience Methods, vol. 105, pp. 25-37, 2001.
[103] D. A. Dombeck, M. S. Graziano, and D. W. Tank, "Functional clustering of neurons in
motor cortex determined by cellular resolution imaging in awake behaving mice," J
Neurosci, vol. 29, pp. 13751-60, Nov 4 2009.
[104] M. R. Cohen and W. T. Newsome, "Estimates of the contribution of single neurons to
perception depend on timescale and noise correlation," J Neurosci, vol. 29, pp. 6635-48,
May 20 2009.
[105] J. I. Gold and M. N. Shadlen, "The Neural Basis of Decision Making," Annual Review of
Neuroscience, vol. 30, pp. 535-574, 06/28/ 2007.
[106] M. N. Shadlen and W. T. Newsome, "The variable discharge of cortical neurons:
Implications for connectivity, computation, and information coding," J. Neurosci., vol. 18,
pp. 3870-3896, 1998.
[107] L. Paninski, J. Pillow, and J. Lewi, "Statistical models for neural encoding, decoding, and
optimal stimulus design," Progress in brain research, vol. 165, p. 493, 2007.
[108] M. Cohen and J. Maunsell, "Attention improves performance primarily by reducing
interneuronal correlations," Nature neuroscience, vol. 12, pp. 1594-1600, 2009.

159

[109] E. M. Callaway, "Local circuits in primary visual cortex of the macaque monkey," Annu
Rev Neurosci, vol. 21, pp. 47-74, 1998.
[110] Y. Yoshimura and E. M. Callaway, "Fine-scale specificity of cortical networks depends
on inhibitory cell type and connectivity," Nat Neurosci, vol. 8, pp. 1552-9, Nov 2005.
[111] M. M. Churchland and K. V. Shenoy, "Temporal Complexity and Heterogeneity of
Single-Neuron Activity in Premotor and Motor Cortex," Journal of Neurophysiology, vol.
97, p. 4235, 2007.
[112] J. M. Carmena, M. A. Lebedev, R. E. Crist, J. E. O'Doherty, D. M. Santucci, D. F.
Dimitrov, P. G. Patil, C. S. Henriquez, and M. A. L. Nicolelis, "Learning to control a
brain-machine interface for reaching and grasping by primates," PLoS Biology, vol. 1, p.
e2, 2003.
[113] L. Paninski, M. R. Fellows, N. G. Hatsopoulos, and J. P. Donoghue, "Spatiotemporal
tuning of motor cortical neurons for hand position and velocity," Journal of
neurophysiology, vol. 91, pp. 515-532, 2004.
[114] W. Wu, M. J. Black, D. Mumford, Y. Gao, E. Bienenstock, and J. P. Donoghue,
"Modeling and decoding motor cortical activity using a switching Kalman filter,"
Biomedical Engineering, IEEE Transactions on, vol. 51, pp. 933-942, 2004.
[115] D. W. Moran and A. B. Schwartz, "Motor cortical representation of speed and direction
during reaching," Journal of Neurophysiology, vol. 82, pp. 2676–2692, 1999.
[116] M. Histed, V. Bonin, and R. Reid, "Direct activation of sparse, distributed populations of
cortical neurons by electrical microstimulation," Neuron, vol. 63, pp. 508–522 2009.
[117] S. Song, P. J. Sjostrom, M. Reigl, S. Nelson, and D. B. Chklovskii, "Highly nonrandom
features of synaptic connectivity in local cortical circuits," PLoS Biol, vol. 3, p. e68, Mar
2005.
[118] M. J. Wainwright and M. Jordan, "Graphical Models, Exponential Families, and
Variational Inference," Foundations and Trends in Machine Learning vol. 1, pp. 1-305,
2008 2008.
[119] C. J. Stam and J. C. Reijneveld, "Graph theoretical analysis of complex networks in the
brain," Nonlin. Biomed. Phys., vol. 1, p. 3, 2007.

160

[120] D. J. Watts and S. H. Strogatz, "Collective dynamics of 'small-world' networks," Nature,
vol. 393, pp. 440–442, 1998.
[121] C. C. Hilgetag, G. A. P. C. Burns, M. A. O'Neill, J. W. Scannell, and M. P. Young,
"Anatomical connectivity defines the organisation of clusters of cortical areas in macaque
monkey and cat," Philos Trans R Soc Lond B Biol Sci, vol. 355, pp. 91–110, 2000.
[122] K. E. Stephan, C.-C. Hilgetag, G. A. P. C. Burns, M. A. O'Neill, M. P. Young, and R.
Kotter, "Computational analysis of functional connectivity between areas of primate
cerebral cortex," Phil Trans R Soc Lond B, vol. 355, pp. 111-126, 2000.
[123] G. Buzsaki, C. Geisler, D. A. Henze, and X. J. Wang, "Interneuron diversity series:
Circuit complexity and axon wiring economy of cortical interneurons," Trends Neurosci.,
vol. 27, pp. 186–193, 2004.
[124] V. M. Eguíluz, D. R. Chialvo, G. A. Cecchi, M. Baliki, and A. V. Apkarian, "Scale-free
brain functional networks," Phys. Rev. Lett., vol. 94, p. 018102, 2005.
[125] B. J. He, J. M. Zempel, A. Z. Snyder, and M. E. Raichle, "The temporal structures and
functional significance of scale-free brain activity," Neuron, vol. 66, pp. 353–369, 2010.
[126] J. Shlens, G. D. Field, J. L. Gauthier, M. I. Grivich, D. Petrusca, A. Sher, A. M. Litke,
and E. J. Chichilnisky, "The structure of multi-neuron firing patterns in primate retina," J.
Neurosci., vol. 26, pp. 8254-8266, 2006.
[127] J. M. Fellous, P. H. E. Tiesinga, P. J. Thomas, and T. J. Sejnowski, "Discovering Spike
Patterns in Neuronal Responses," Journal of Neuroscience, vol. 24, p. 2989, 2004.
[128] D. Brillinger, "The Identification of Point Process Systems," Annals of Probability, vol. 3,
pp. 909-924, 1975.
[129] E. N. Brown, "Theory of Point Processes for Neural Systems " in Methods and Models in
Neurophysics, C. C. Chow, B. Gutkin, D. Hansel, C. Meunier, and J. Dalibard, Eds.
Paris: Elsevier, 2005, pp. 691-726.
[130] L. Srinivasan, U. T. Eden, A. S. Willsky, and E. N. Brown, "A State-Space Analysis for
Reconstruction of Goal-Directed Movements Using Neural Signals," Neural Comp., vol.
18, pp. 2465-2494, October 1, 2006 2006.

161

[131] E. N. Brown, D. P. Nguyen, L. M. Frank, M. A. Wilson, and V. Solo, "An analysis of
neural receptive field plasticity by point process adaptive filtering," PNAS, vol. 98, pp.
12261-12266, October 9, 2001 2001.
[132] C. E. Armstrong-Gold and F. Rieke, "Bandpass Filtering at the Rod to Second-Order Cell
Synapse in Salamander (Ambystoma tigrinum) Retina," J. Neurosci., vol. 23, pp. 37963806, May 1, 2003 2003.
[133] S. Mallat, A Wavelet Tour of Signal Processing, second ed.: Kluwer Acedemic Publishers,
1998.
[134] F. R. K. Chung, Spectral Graph Theory vol. 92. Providence, RI: American Mathematical
Society, 1994.
[135] C. H. Q. Ding, X. He, H. Zha, M. Gu, and H. D. Simon, "A Min-max Cut Algorithm for
Graph Partitioning and Data Clustering " in Proceedings of the 2001 IEEE International
Conference on Data Mining 2001, pp. 107 - 114.
[136] R. Jin, C. Ding, and F. Kang, "A Probabilistic Approach for Optimizing Spectral
Clustering," in Advances in Neural Information Processing Systems, 2005.
[137] A. Ng, M. Jordan, and Y. Weiss, "On spectral clustering: Analysis and an algorithm,"
Advances in Neural Information Processing Systems, vol. 14, 2001.
[138] C. F. Stevens and A. M. Zador, "Input synchrony and the irregular firing of cortical
neurons," Nat. Neurosci., vol. 1, pp. 210-217, 1998.
[139] F. Rieke, "Spikes:: Exploring the Neural Code," 1997.
[140] P. J. Sjöström, G. G. Turrigiano, and S. B. Nelson, "Rate, timing, and cooperativity
jointly determine cortical synaptic plasticity," Neuron, vol. 32, pp. 1149-64, 2001.
[141] S. B. Nelson, P. J. Sjöström, and G. G. Turrigiano, "Rate and timing in cortical synaptic
plasticity," Philos Trans R Soc Lond B Biol Sci., vol. 357, pp. 1851-7, 2002.
[142] A. K. Jain, M. N. Murty, and P. J. Flynn, "Data clustering: a review," ACM Comp. Surv.,
vol. 31, pp. 264-323, 1999.

162

[143] I. R. Winship and T. H. Murphy, "In Vivo Calcium Imaging Reveals Functional
Rewiring of Single Somatosensory Neurons after Stroke," J. Neurosci., vol. 28, pp. 65926606, June 25, 2008 2008.
[144] L. Gross, "A New Window into Structural Plasticity in the Adult Visual Cortex," PLoS
Biology, vol. 4, p. e42, 2006.
[145] S. Fusi, W. F. Asaad, E. K. Miller, and X. J. Wang, "A Neural Circuit Model of Flexible
Sensorimotor Mapping: Learning and Forgetting on Multiple Timescales," Neuron, vol.
54, pp. 319-333, 2007.
[146] K. Sakamoto, H. Mushiake, N. Saito, K. Aihara, M. Yano, and J. Tanji, "Discharge
Synchrony during the Transition of Behavioral Goal Representations Encoded by
Discharge Rates of Prefrontal Neurons," Cereb. Cortex, p. bhm234, February 5, 2008
2008.
[147] R. C. Froemke and Y. Dan, "Spike-timing-dependent synaptic modification induced by
natural spike trains," Nature, vol. 416, pp. 433-438, 2002.
[148] J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Transactions
on Systems, Man, and Cybernetics, Part B, , vol. 28, pp. 301-315, 1998.
[149] C. A. Atencio and C. E. Schreiner, "Columnar Connectivity and Laminar Processing in
Cat
Primary
Auditory
Cortex,"
PLoS
ONE,
vol.
5,
p.
e9521.
doi:10.1371/journal.pone.0009521 2010.
[150] A. Nicoll and C. Blakemore, "Patterns of local connectivity in the neocortex," Neural
Computation, vol. 5, pp. 665–680, 1993.
[151] D. V. Buonomano and M. M. Merzenich, "CORTICAL PLASTICITY: From Synapses to
Maps," Annual Review of Neuroscience, vol. 21, pp. 149-186, 1998.
[152] R. Tibshirani, G. Walther, and T. Hastie, "Estimating the number of clusters in a data set.
via the gap statistic," Journal of the Royal Statistical Society B vol. 63, pp. 411–423,
2001.
[153] F. Chen, S. El-Dawlatly, R. Jin, and K. Oweiss, "Identifying and Tracking the number of
independent clusters of functionally interdependent neurons from biophysical models of

163

population activity," in Proc. of 3rd Int. IEEE EMBS Conf. on Neural Engineering,
Hawaii, 2007, pp. 542-545.
[154] D. M. Taylor, S. I. H. Tillery, and A. B. Schwartz, "Direct Cortical Control of 3D
Neuroprosthetic Devices," Science, vol. 296, p. 1829, 2002.
[155] M. A. Lebedev and M. A. Nicolelis, "Brain-machine interfaces: past, present and future,"
Trends Neurosci, vol. 29, pp. 536-546, 2006.
[156] M. A. Lebedev, J. E. O'Doherty, and M. A. L. Nicolelis, "Decoding of Temporal
Intervals From Cortical Ensemble Activity," J Neurophysiol, vol. 99, pp. 166-186,
January 1, 2008 2008.
[157] D. W. Moran and A. B. Schwartz, "Motor Cortical Activity During Drawing Movements:
Population Representation During Spiral Tracing," Journal of Neurophysiology, vol. 82,
pp. 2693-2704, 1999.
[158] M. D. Serruya, N. G. Hatsopoulos, L. Paninski, M. R. Fellows, and J. P. Donoghue,
"Brain-machine interface: Instant neural control of a movement signal," Nature, vol. 416,
pp. 141-142, 2002.
[159] M. Serruya, N. Hatsopoulos, M. Fellows, L. Paninski, and J. Donoghue, "Robustness of
neuroprosthetic decoding algorithms," Biological Cybernetics, vol. 88, pp. 219-228, 2003.
[160] K. Murphy and S. Mian, "Modelling gene expression data using dynamic Bayesian
networks," Computer Science Division, University of California Berkeley, CA 1999.
[161] K. Murphy, "Dynamic Bayesian Networks: Representation, Inference and Learning,"
PhD thesis, UC Berkeley, Computer Science Division, 2002.
[162] J. Pearl, Probabilistic Reasoning in Intelligent Systems. San Francisco, CA: Morgan
Kaufmann, 1988.
[163] D. Heckerman, "A tutorial on learning with bayesian networks," Technical Report MSRTR-95-06, Microsoft Research, 1995.
[164] L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in
Speech Recognition," Proceedings of the IEEE, vol. 77, pp. 257–286, 1989.

164

[165] N. Friedman, I. Nachaman, and D. Peer, "Learning bayesian network structure from
massive datasets: The sparse candidate algorithm," in Proc. of the 15th Conference on
Uncertainty in Artificial Intelligence (UAI), 1999, pp. 206-215.
[166] A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young, "Using graphical
models and genomic expression data to statistically validate models of genetic regulatory
networks," in Pacific Symposium on Biocomputing, 2001.
[167] W. Lam and F. Bacchus, "Learning bayesian belief networks: an approach based on mdl
principle," Computational Intelligence, vol. 10, pp. 269-293, 1994.
[168] D. Heckerman, D. Geiger, and M. Chickering, "Learning bayesian networks: The
combination of knowledge and statistical data," Machine Learning, vol. 9, pp. 197–243,
1995.
[169] G. Schwarz, "Estimating the dimension of a model," Annals of Statistics, vol. 6, pp. 461–
464, 1978.
[170] G. F. Cooper and E. Herskovits, "A Bayesian method for the induction of probabilistic
networks from data," Machine Learning, vol. 9, pp. 309-347, 1992.
[171] M. Chickering, C. Meek, and D. Heckerman, "Large-sample learning of bayesian
networks is np-hard," in 19th Annual Conference on Uncertainty in Artificial Intelligence
(UAI-03), 2003.
[172] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by Simulated Annealing,"
Science, vol. 220, pp. 671-680, 1983.
[173] A. Bernard and A. J. Hartemink, "Informative structure priors: Joint learning of dynamic
regulatory networks from multiple types of data," Pac Symp Biocomput, vol. 10, pp. 459–
470, 2005.
[174] N. Dojer, A. Gambin, A. Mizera, B. Wilczynski, and J. Tiuryn, "Applying dynamic
Bayesian networks to perturbed gene expression data," BMC Bioinformatics, vol. 7, 2006.
[175] F. Geier, J. Timmer, and C. Fleck, "Reconstructing gene-regulatory networks from time
series, knock-out data, and prior knowledge," BMC Systems Biology, vol. 1, 2007.

165

[176] J. C. Rajapakse and J. Zhou, "Learning effective brain connectivity with dynamic
Bayesian networks " NeuroImage, vol. 37, pp. 749-760, 2007.
[177] L. Zhang, D. Samaras, N. Alia-Klein, N. Volkow, and R. Goldste, "Modeling neuronal
interactivity using dynamic Bayesian networks," in Advances in Neural Information
Processing Systems, Cambridge, MA, 2006.
[178] V. A. Smith, J. Yu, T. V. Smulders, A. J. Hartemink, and E. D. Jarvis, "Computational
inference of neural information flow networks," PLoS Computational Biology, vol. 2, pp.
1436-1449, 2006.
[179] D. O. Hebb, The Organization of Behavior; a Neuropsychological Theory: Wiley, New
York, 1949.
[180] L. Kuhlmann, A. N. Burkitt, A. G. Paolini, and G. M. Clark, "Summation of
spatiotemporal input patterns in leaky integrate-and-fire neurons: Application to neurons
in the cochlear nucleus receiving converging auditory nerve fiber input," J. Comput.
Neurosci., vol. 12, pp. 55-73, 2002.
[181] H. Sprekeler, C. Michaelis, and L. Wiskott, "Slowness: An objective for spike-timingdependent plasticity?," PLoS Computational Biology, vol. 3, p. e112.
doi:10.1371/journal.pcbi.0030112, 2007.
[182] X. Zhang and L. H. Carney, "Response properties of an integrate-and-fire model that
receives subthreshold inputs," Neural Computation, vol. 17, pp. 2571-2601, 2005.
[183] C. J. v. Rijsbergen, Information Retrieval. London: Buttersworth, 1979.
[184] G. Czanner, U. T. Eden, S. Wirth, M. Yanike, W. A. Suzuki, and E. N. Brown, "Analysis
of between-trial and within-trial neural spiking dynamics," J Neurophysiol., vol. 99, pp.
2672-2693, 2008.
[185] J. de La Rocha, B. Doiron, E. Shea-Brown, Josi, K. cacute, and A. Reyes, "Correlation
between neural spike trains increases with firing rate," Nature, vol. 448, pp. 802-806,
2007.
[186] S. Amari, "Measure of correlation orthogonal to changing in firing rate," Neural Comput,
vol. 21, pp. 960-972, 2009.

166

[187] D. Debanne, "Information processing in the axon," Nat Rev Neurosci, vol. 5, pp. 304–316,
2004.
[188] G. Molnar, S. Olah, G. Komlosi, M. Fule, J. Szabadics, C. Varga, P. Barzo, and G.
Tamas, "Complex events initiated by individual spikes in the human cerebral cortex,"
PLoS Biol vol. 6, p. p. e222, 2008.
[189] N. Toni, D. A. Laplagne, C. Zhao, G. Lombardi, C. E. Ribak, F. H. Gage, and A. F.
Schinder, "Neurons born in the adult dentate gyrus form functional synapses with target
cells," Nature Neuroscience, vol. 11, pp. 901-907, 2008.
[190] Y. Yoshimura, J. L. Dantzker, and E. M. Callaway, "Excitatory cortical neurons form
fine-scale functional networks," Nature, vol. 433, p. 868−873, 2005.
[191] D. A. Keen and A. J. Fuglevand, "Common input to motor neurons innervating the same
and different compartments of the human extensor digitorum muscle," J Neurophysiol,
vol. 91, pp. 57–102, 2004.
[192] K. S. Turker and R. K. Powers, "Effects of common excitatory and inhibitory inputs on
motoneuron synchronization," Journal of Neurophysiology, vol. 86, pp. 2807-2822, 2001.
[193] K. S. Turker and R. K. Powers, "The effects of common input characteristics and
discharge rate on synchronization in rat hypoglossal motoneurones," Journal of
Physiology, vol. 541, pp. 254-260, 2002.
[194] M. Aghagolzadeh, S. Eldawlatly, and K. Oweiss, "Identifying Functional Connectivity of
Motor Neuronal Ensembles Improves the Performance of Population Decoders," Proc. of
4th Int. IEEE EMBS Conf. on Neural Engineering, pp. 534-537, 2009.
[195] J. Yu, V. A. Smith, P. P. Wang, A. J. Hartemink, and E. D. Jarvis, "Advances to Bayesian
network inference for generating causal networks from observational biological data,"
Bioinformatics, vol. 20, pp. 3594-3603, 2004.
[196] S. Panzeri, N. Brunel, N. K. Logothetis, and C. Kayser, "Sensory neural codes using
multiplexed temporal scales," Trends Neurosci, vol. 33, pp. 111–120, 2010.
[197] C. I. Moore, S. B. Nelson, and M. Sur, "Dynamics of neuronal processing in rat
somatosensory cortex," Trends in Neurosciences, vol. 22, p. 513:520, 1999.

167

[198] D. E. Feldman and M. Brecht, "Map plasticity in somatosensory cortex," Science, vol.
310, pp. 810–815, 2005.
[199] M. E. Diamond, M. v. Heimendahl, P. M. Knutsen, D. Kleinfeld, and E. Ahissar,
"“Where” and “what” in the whisker sensorimotor system," Nat Rev Neurosci vol. 9, pp.
601–612, 2008.
[200] C. C. H. Petersen, "The functional organisation of the barrel cortex," Neuron, vol. 56, pp.
339-355, 2007.
[201] D. Kleinfeld, E. Ahissar, and M. E. Diamond, "Active sensation: insights from the rodent
vibrissa sensorimotor system," Curr. Opin. Neurobiol. , vol. 16, pp. 435–444, 2006.
[202] A. A. Ghazanfar and M. A. L. Nicolelis, "The structure and function of dynamic cortical
and thalamic receptive fields," Cereb Cortex, vol. 11, pp. 183–193, 2001.
[203] M. C. Wiest, N. Bentley, and M. A. L. Nicolelis, "Heterogeneous integration of bilateral
whisker signals by neurons in primary somatosensorycortex of awake rats," J
Neurophysiol, vol. 93, pp. 2966–2973, 2005.
[204] D. Schubert, J. F. Staiger, N. Cho, R. Kötter, K. Zilles, and H. J. Luhmann, "Layerspecific intracolumnar and transcolumnar functional connectivity of layer V pyramidal
cells in rat barrel cortex," J Neurosci, vol. 21, pp. 3580-3592, 2001.
[205] N. Wright and K. Fox, "Origins of cortical layer V surround receptive fields in the rat
barrel cortex," J. Neurophysiol., vol. 103, pp. 709–724, 2010.
[206] D. J. Simons, G. E. Carvell, A. E. Hershey, and D. P. Bryant, "Responses of barrel cortex
neurons in awake rats and effects of urethane anesthesia," Exp Brain Res, vol. 91, pp.
259–272, 1992.
[207] A. A. Ghazanfar and M. A. L. Nicolelis, "Spatiotemporal properties of layer V neurons in
the rat primary somatosensory cortex," Cereb. Cortex, vol. 9, pp. 348-361, 1999.
[208] B. E. Mercier, C. R. Legg, and M. Glickstein, "Basal ganglia and cerebellum receive
different somatosensory information in rats," Proc Natl Acad Sci USA, vol. 87, pp. 4388–
4392, 1990.

168

[209] R. Izraeli and L. L. Porter, "Vibrissal motor cortex in the rat: connections with the barrel
field," Exp Brain Res, vol. 104, pp. 41–54, 1995.
[210] M. Zhang and K. D. Alloway, "Stimulus-induced intercolumnar synchronization of
neuronal activity in rat barrel cortex: a laminar analysis," J. Neurophysiol., vol. 92, pp.
1464–1478, 2004.
[211] K. D. Alloway, "Information processing streams in rodent barrel cortex: the differential
functions of barrel and septal circuits," Cereb. Cortex, vol. 18, pp. 979–989, 2008.
[212] F. Matyas, V. Sreenivasan, F. Marbach, C. Wacongne, B. Barsy, C. Mateo, R. Aronoff,
and C. C. H. Petersen, "Motor Control by Sensory Cortex," Science, vol. 330, pp. 12401243, 2010.
[213] M. A. Lebedev, G. Mirabella, I. Erchova, and M. E. Diamond, "Experience-dependent
Plasticity of Rat Barrel Cortex: Redistribution of Activity across Barrel-columns," Cereb
Cortex, vol. 10, pp. 23 - 31, 2000.
[214] K. Y. Kwon, S. Eldawlatly, and K. G. Oweiss, "NeuroQuest: A Comprehensive Tool for
Large Scale Neural Data Processing and Analysis," in Proc. of 4th Int. IEEE EMBS Conf.
on Neural Engineering, 2009, pp. 622-625.
[215] K. Y. Kwon, S. Eldawlatly, and K. Oweiss, "NeuroQuest: A Comprehensive Analysis
Tool for Extracellular Neural Ensemble," Journal of Neuroscience Methods, in review.
[216] S. Temereanca, E. N. Brown, and D. J. Simons, "Rapid changes in thalamic firing
synchrony during repetitive whisker stimulation," J Neurosci, vol. 28, pp. 11153-11164,
2008.
[217] C. Shannon, "A mathematical theory of communication," Bell. Sys. Tech. J., vol. 27, pp.
379–423, 1948.
[218] B. Luo, R. C. Wilson, and E. R. Hancock, "Spectral Embedding of Graphs," Pattern
Recognition, vol. 36, pp. 2213-2230, 2003.
[219] R. Q. Quiroga and S. Panzeri, "Extracting information from neuronal populations:
information theory and decoding approaches," Nature Rev. Neurosci., vol. 10, pp. 173–
185, 2009.

169

[220] M. C. Jones, J. S. Marron, and S. J. Sheather, "A Brief Survey of Bandwidth Selection for
Density Estimation," Journal of the American Statistical Association, vol. 91, pp. 401–
407, 1996.
[221] W. Truccolo, L. R. Hochberg, and J. P. Donoghue, "Collective dynamics in human and
monkey sensorimotor cortex: predicting single neuron spikes," Nat Neurosci, vol. 13, pp.
105-111, 2010.
[222] R. S. Petersen, S. Panzeri, and M. E. Diamond, "Population coding of stimulus location
in rat somatosensory cortex," Neuron, vol. 32, pp. 503–514, 2001.
[223] C. P. de Kock, R. M. Bruno, H. Spors, and B. Sakmann, "Layer- and cell-type-specific
suprathreshold stimulus representation in rat primary somatosensory cortex," J. Physiol.
(Lond.), vol. 581, pp. 139–154 2007.
[224] T. Celikel, V. A. Szostak, and D. E. Feldman, "Modulation of spike timing by sensory
deprivation during induction of cortical map plasticity," Nat. Neurosci., vol. 7, pp. 534–
541, 2004.
[225] R. S. Petersen, S. Panzeri, and M. E. Diamond, "Population coding in somatosensory
cortex," Curr. Opin. Neurobiol., vol. 12, pp. 441–447, 2002.
[226] S. P. Jadhav, J. Wolfe, and D. E. Feldman, "Sparse temporal coding of elementary tactile
features during active whisker sensation," Nat. Neurosci., vol. 12, pp. 792–800, 2009.
[227] G. Foffani, B. Tutunculer, and K. A. Moxon, "Role of spike timing in the forelimb
somatosensory cortex of the rat," J. Neurosci., vol. 24, pp. 7266–7271, 2004.
[228] K. D. Harris, "Neural signatures of cell assembly organization," Nat. Rev. Neurosci., vol.
6, pp. 399–407, 2005.
[229] G. Foffani, M. L. Morales-Botello, and J. Aguilar, "Spike timing, spike count, and
temporal information for the discrimination of tactile stimuli in the rat ventrobasal
complex," J. Neurosci., vol. 29, pp. 5964–5973, 2009.
[230] D. S. Barth, "Submillisecond synchronization of fast electrical oscillations in neocortex,"
J Neurosci, vol. 23, pp. 2502–2510, 2003.

170

[231] G. Buzsáki, "Neural Syntax: Cell Assemblies, Synapsembles, and Readers " Neuron, vol.
68, pp. 362-385, 2010.
[232] B. Haider and D. A. McCormick, "Rapid neocortical dynamics: cellular and network
mechanisms," Neuron, vol. 62, pp. 171–189, 2009.
[233] B. M. Hooks, S. A. Hires, Y.-X. Zhang, D. Huber, L. Petreanu, K. Svoboda, and G. M. G.
Shepherd, "Laminar Analysis of Excitatory Local Circuits in Vibrissal Motor and
Sensory Cortical Areas," PLoS Biol, vol. 9, p. e100057, 2011.
[234] M. Abeles, Local Cortical Circuits: An Electrophysiological study. Berlin: Springer,
1982.
[235] A. K. Engel, P. Fries, and W. Singer, "Dynamic predictions: oscillations and synchrony
in top-down processing," Nat. Rev. Neurosci., vol. 2, pp. 704–716, 2001.
[236] B. Kolb and I. Q. Whishaw, "Brain plasticity and behavior," Annu Rev Psychol, vol. 49,
pp. 43–64, 1998.
[237] S. A. Siegelbaum and E. R. Kandel, "Learning-Related Synaptic Plasticity: LTP and
LTD," Curr Opin Neurobiol, vol. 1, pp. 113–120, 1991.
[238] L. F. Abbott and S. B. Nelson, "Synaptic plasticity: taming the beast," Nature Neurosci.,
vol. 3, pp. 1178–1183, 2000.
[239] K. Fox and R. O. Wong, "A comparison of experience-dependent plasticity in the visual
and somatosensory systems," Neuron, vol. 48, pp. 465–477, 2005.
[240] E. Gao and N. Suga, "Experience-dependent plasticity in the auditory cortex and the
inferior colliculus of bats: role of the corticofugal system," Proc. Natl Acad. Sci. USA,
vol. 97, pp. 8081–8086, 2000.
[241] T. Weiss, W. H. R. Miltner, R. Huonker, R. Friedel, I. Schmidt, and E. Taub, "Rapid
functional plasticity of the somatosensory cortex after finger amputation," Exp. Brain
Res., vol. 134, pp. 199–203, 2000.
[242] J. H. Kaas, "Plasticity of sensory and motor maps in adult mammals," Annu. Rev.
Neurosci., vol. 14, pp. 137–167, 1991.

171

[243] K. Fox, "Anatomical pathways and molecular mechanisms for plasticity in the barrel
cortex," Neuroscience, vol. 111, pp. 799-814, 2002.
[244] M. Fu and Y. Zuo, "Experience-dependent structural plasticity in the cortex " Trends
Neurosci, vol. 34, pp. 177-187, 2011.
[245] M. E. Diamond, M. Armstrong-James, and F. F. Ebner, "Experience-dependent plasticity
in adult rat barrel cortex," Proc. Natl Acad. Sci. USA, vol. 90, pp. 2082–2086, 1993.
[246] M. Armstrong-James, M. E. Diamond, and F. F. Ebner, "An innocuous bias in whisker
use in adult rats modifies receptive fields of barrel cortex neurons," J. Neurosci. Methods,
vol. 14, pp. 6978–6991, 1994.
[247] M. E. Diamond, W. Huang, and F. F. Ebner, "Laminar comparison of somatosensory
cortical plasticity," Science, vol. 265, pp. 1885-1888, 1994.
[248] H. Sellien and F. F. Ebner, " Rapid plasticity follows whisker pairing in barrel cortex of
the awake rat," Exp. Brain Res., vol. 177, pp. 1–14, 2007.
[249] S. W. Scheff and D. A. Price, "Synaptic pathology in Alzheimer's disease: a review of
ultrastructural studies," Neurobiol Aging, vol. 24, p. 1029−1046, 2003.
[250] D. Lee, N. L. Port, W. Kruse, and A. P. Georgopoulos, "Variability and correlated noise
in the discharge of neurons in motor and parietal areas of the primate cortex," J Neurosci,
vol. 18, pp. 1161–1170, 1998.
[251] I. Bar-Gad, Y. Ritov, E. Vaadia, and H. Bergman, "Failure in identification of
overlapping spikes from multiple neuron activity causes artificial correlations," J
Neurosci Methods, vol. 107, pp. 1-13, 2001.
[252] J. W. Pillow, L. Paninski, V. J. Uzzell, E. P. Simoncelli, and E. J. Chichilnisky,
"Prediction and Decoding of Retinal Ganglion Cell Responses with a Probabilistic
Spiking Model," J. Neurosci., vol. 25, pp. 11003-11013, November 23, 2005 2005.
[253] W. Wu, Y. Gao, E. Bienenstock, J. P. Donoghue, and M. J. Black, "Bayesian Population
Decoding of Motor Cortical Activity Using a Kalman Filter," Neural Computation, vol.
18, pp. 80-118, 2005.

172

[254] D. D. Lee and S. Seung, "Learning the parts of objects by nonnegative matrix
factorization," Nature, vol. 401, pp. 788 - 791, 1999.
[255] S. P. Kim, Y. N. Rao, D. Erdogmus, J. C. Sanchez, M. A. L. Nicolelis, and J. C. Principe,
"Determining patterns in neural activity for reaching movements using nonnegative
matrix factorization," EURASIP J. Appl. Signal Process., vol. 19, pp. 3113-3121, 2005.
[256] L. Tao, J. D. Lauderdale, and A. T. Sornborger, "Mapping functional connectivity
between neuronal ensembles with larval zebrafish transgenic for a ratiometric calcium
indicator," Front. Neural Circuits, vol. 5, 2011.
[257] J. W. H. Schnupp and O. Kacelnik, "Cortical Plasticity: Learning from Cortical
Reorganisation," Current Biology, vol. 12, pp. R144-R146, 2002.
[258] J. R. Wickens, J. N. Reynolds, and B. I. Hyland, "Neural mechanisms of reward-related
motor learning," Curr Opin Neurobiol, vol. 13, pp. 685-90, 2003.
[259] O. A. Gharbawie and I. Q. Whishaw, "Parallel stages of learning and recovery of skilled
reaching after motor cortex stroke:“Oppositions” organize normal and compensatory
movements," Behav Brain Res, vol. 175, pp. 249-262, 2006.
[260] E. H. Baeg, Y. B. Kim, J. Kim, J. W. Ghim, J. J. Kim, and M. W. Jung, "LearningInduced Enduring Changes in Functional Connectivity among Prefrontal Cortical
Neurons," Journal of Neuroscience, vol. 27, p. 909, 2007.
[261] D. Dupret, A. Fabre, M. D. Döbrössy, A. Panatier, J. J. Rodríguez, S. Lamarque, V.
Lemaire, S. H. R. Oliet, P. V. Piazza, and D. N. Abrous, "Spatial Learning Depends on
Both the Addition and Removal of New Hippocampal Neurons," PLoS Biol, vol. 5, p.
e214, 2007.
[262] D. E. Feldman, "Synaptic Mechanisms for Plasticity in Neocortex," Annual Review of
Neuroscience, vol. 32, pp. 33-55, 2009.
[263] N. Dancause, S. Barbay, S. B. Frost, E. J. Plautz, D. Chen, E. V. Zoubina, A. M. Stowe,
and R. J. Nudo, "Extensive cortical rewiring after brain injury," Journal of Neuroscience,
vol. 25, pp. 10167-10179, 2005.

173

[264] J. Girgis, D. Merrett, S. Kirkland, G. A. S. Metz, V. Verge, and K. Fouad, "Reaching
training in rats with spinal cord injury promotes plasticity and task specific recovery,"
Brain, vol. 130, p. 2993, 2007.

174