c5
‘1?

E

This is to certify that the
thesis entitled

Methods for Neural Signal Processing and Analysis

presented by

Yasir Suhail

LIBRARY
Michigan State
Umversity

has been accepted towards fulﬁllment
of the requirements for the

MS. degree in Electrical Engineering

 

 

 

War Professor/s Signature
0:3 1/ z 3 ,1 o f

Date

MSU is an Afﬁnnative Action/Equal Opportunity Institution

 

PLACE IN RETURN Box to remove this checkout from your record.
To AVOID FINES return on or before date due.
MAY BE RECAILED with earlier due date if requested.

 

DATE DUE

DATE DUE

DATE DUE

 

 

Pitt» we

 

 

 

 

 

 

 

 

 

 

 

 

 

 

2/05 c:/ClRC/DateDue.ind¢p.15

 

Methods for Neural Signal Processing and
Analysis

By

Yasir Suhail

A THESIS

Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of

MASTER OF SCIENCE
Department of Electrical and Computer Engineering

2005

ABSTRACT

Methods for Neural Signal Processing and Analysis

By

Yasir Suhail

High-density microelectrode arrays offer indispensable tools for neuroscience re-
search and potential benefits for neuroprosthetic applications at the micro and nano
scales. Advances in microfabrication and microelectronics have enabled integration
of large number of electrodes on a single device with modest signal processing capa-
bilities, which in turn triggered many new scientiﬁc discoveries in the neuroscience
community. From a system viewpoint, these rapid developments present numerous
challenges to the associated data processing and analysis stages. In this work, we
present algorithms and performance analyses of a neural interface system designed
for large scale processing of the neural data obtained. We evaluate the performance
and tradeoffs of the system’s computational complexity versus data precision and
signal ﬁdelity during real-time telemetry transmission. We also evaluate perfor-
mance tradeoffs between single and multiple channel signal detection techniques.
On the data mining aspect, we present a new multiscale processing and clustering
technique to infer functional interdependency between populations of neurons from
the recorded mixture to ease the identiﬁcation of local and global biological neural

circuits.

Copyright © by
Yasir Suhail

2005

To my parents and my sister

iv

ACKNOWLEDGEMENTS

I would like to express my gratitude to my thesis advisor Dr. Karim G. Oweiss
for his constant guidance, support and encouragement throughout my two years of
graduate study. I must credit him with ﬁnding time whenever it was needed for
discussing any results and progress. His support has been indispensable not only
for this work, but also for my graduate study and career.

I also wish to record the support and camaraderie of my lab mates and fellow
graduate students, notably Kyle Thomson, Lance Slifka, and April Thompson which
made it easier to get through the work and enjoy it. I have beneﬁtted greatly from
the company of many people at Michigan State, most notably Professors Hayder
Radha and Selin Aviyente who have taught me most of what I know of Signal
Processing, Information Theory, and Statistics. Professor Rong Jin has been very
helpful with his collaboration in the area of probabilistic clustering. Michigan State
University hosts one of the friendliest people to learn from and spend time with. A
diverse community of wonderful students within the Department and the University
in general have been a bedrock of support and have made my stay memorable.

I would not have been able to achieve any of this without the unconditional love,

support, and good wishes of my parents and my little sister. I do not have the

words to express my feelings and appreciation to them. Most of all, I thank the
Almighty for bestowing on me these blessings so I could comfortably engage myself

in the pursuit of knowledge that I have enjoyed so much.

vi

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

1 Wavelet Transform Block for Implantable Neural Interfaces
1.1 Introduction ...............................
1.2 Theory ..................................
1.2.1 The Traditional Mallat’s Algorithm ..............
1.2.2 The Lifting Approach ......................
1.2.3 Integer Lifting DWT ......................
1.2.4 Quantization of the Filter Coefﬁcients ............
1.2.5 Wavelet Thresholding ......................
1.2.6 Computation and Data Rate Savings .............
1.3 Methodology ..............................
1.4 Results ..................................

1.5 Conclusion ................................

2 Spike Detection
2.1 Introduction ...............................
2.2 Data model ...............................
2.3 Wavelet domain detection .......................
2.3.1 The Stationary Wavelet Packet Tree .............
2.4 Analysis of the Bayesian Tests for Spike Detection ..........
2.4.1 Likelihood Ratio Test for the known signal case .......

CDOOOOO'IuD-iiki—il-t

i—Ii—ii—ar—a
\IOOMM

19
19

29
29

2.4.2 Likelihood Ratio Test for the unknown signal case ......

2.5 Results. . .

2.6 Conclusions

ooooooooooooooooooooooooooooooo

3 Clustering Neural Spike Trains

3.1 Introduction ...............................
3.1.1 Background ...........................
3.2 Simulation Method ...........................
3.2.1 Proposed Technique ......................

3.3 Multi-resolution Approach .......................

3.4 A Principal Components Approach for Multi-scale Clustering . . . .

3.5 Graph Theoretic Clustering ......................

3.6 Clustering Results ...........................

3.7 Conclusions

BIBLIOGRAPHY

ooooooooooooooooooooooooooooooo

viii

36
44
49

51
51
52
54
55
56
65
66
69
74

76

1.1

2.1

3.1
3.2

LIST OF TABLES

Update and Prediction ﬁlter coefﬁcients for symmlet 4. ....... 8
Signal to Noise Ratio in the different SWPT nodes. ......... 29
Eigenvalues associated with the Principal Components ....... 66
Clustering Errors obtained in the k-means algorithm ........ 72

ix

1.1

1.2

1.3

1.4

1.5

1.6

1.7

1.8

2.1
2.2
2.3

2.4

2.5

LIST OF FIGURES

The basic analysis block of the pyramidal algorithm proposed by
Mallat[10] for 3 decomposition levels. .................

Lifting scheme for computing a single level of DWT decomposition.

Comparison of denoising performance for various precisions of ﬁlter
coefﬁcients for a sample channel of a 4-channel array with the data
quantized to 10 bits. . . . '. ......................

Denoising performance for various wavelet transform computations.

Performance in the distance metric of the signal subspace for data
denoised using various wavelet transform computations ........

Performance in the distance metric of the signal subspace for data
denoised using various wavelet transform computations ........

Example spike waveforms obtained with DWT computation with
different ﬁlter coefﬁcient bit widths ...................

Effect of round off and quantization errors on the signal ﬁdelity as a
function of multiplier complexity. ...................

The Stationary Wavelet Packet Tfee. .................
A single channel represented in different SWPT nodes. .......

Receiver operating Characteristics for Array and singe channel de-
tection ...................................

Receiver operating Characteristics for Array and singe channel de-
tection ...................................

ROC curves for Multiresoution and Time domain detection for a
single channel data at an SN R of 7.6024. ...............

15

16

16

17

25

28

41

43

45

2.6 ROC curves for Multiresoution and Time domain detection for a
single channel data at an SNR of 8.7623. ...............

2.7 ROC curves for Multiresoution and Time domain detection for a
single channel data at an SNR of 10.857. ...............

2.8 ROC curves for Multiresoution and Time domain detection for a two
channel array at an SNR of 8.1177. ..................

2.9 ROC curves for Multiresoution and Time domain detection for a two
channel array at an SNR of 6.9579. ..................

2.10 ROC curves for Multiresoution and Time domain detection for a two
channel array at an SNR of 10.2124. .................

2.11 Noisy data on one channel used to detect spikes ............
2.12 The clean spike data to be detected. .................

2.13 A clean and noisy spike example. ...................

3.1 The Haar basis approximation function. ...............
3.2 The Haar basis detail function. ....................
3.3 The Wavelet 'Iree representation to 5 levels. .............
3.4 Correlations in different subbands. ..................
3.5 Principal Components of correlations ..................
3.6 Fused proﬁle of the ﬁrst 4 principal components ............

3.7 The true clusters depicted in the ﬁrst two components resolved from
the fused matrix D ............................

3.8 Result of the k—means algorithm applied to the first two principal
components of the fused matrix .....................

xi

45

46

46

47

48
48
49

59
64
67
70

71

73

CHAPTER 1

Wavelet Transform Block for

Implantable Neural Interfaces

1 . 1 Introduction

Electrical recordings from neural tissue give direct observations of the neural activ-
ity. Compared to Electroencephelogram (EEG) recordings recordings or chemical
detection of neurotransmitters and neuromodulators, invasive recordings from elec-
trodes placed inside the neural tissue provide data with high temporal (of the order
of 30 kHz) and spatial (micron range) resolution. Cellular recordings from neurons
have been used to study the neural systems such as the motor, somatosensory, and

visual cortical areas of the brain.

With the advances in micro fabrication, it is possible to manufacture dense

arrays of microeletrodes[1]. These arrays can record from a large number of neu-
rons and provide high volumes of data for further analysis. For artiﬁcially grown
neural tissue, it is possible to have relatively long term cellular recordings from
individual neurons through techniques such as patch clamping[2] and directed neu-
ral growth[3]. ln-situ recordings from animals, however is more complicated for
various reasons. The recorded data is corrupted to a large extent with noise from
the neighboring tissue, and the signal conditions evolve over time. Another prob-
lem is the actual transmission and processing of this information. The traditional
technique relies on ampliﬁers situated on the recording probe relaying the recorded
signals over wires to a bank of ﬁlters, analog to digital converters, and a computer
for signal processing. This complicates things for experiments in live, freely behav-
ing animals. It is not always possible to have bulky wired connections to a large
data processing system. The opportunities for clinical neural prosthetic devices
also present similar challenges. These devices have the potential to help patients

with motor disorders and spinal cord injuries lead more normal lives.

Our motivation is the development of on-chip signal processing and wireless data
transmission for high density, implantable neural recording probes. Researchers
have tried to move towards wireless telemetry[4] to make the neural probes less
obtrusive and suitable for both long term behavior experiments and prosthetic
devices. It is clear that complex signal processing is required on this data for
compression, detection of action potentials and local ﬁeld potentials, separation of

neural sources, and decoding the neural signals. Passive probes transmitting all the

data wirelessly face a deluge of data. A modestly sized array with 96 electrodes
recording at a sampling rate of 25kHz per channel to a precision of 12bits / sample,
required bit rate would be 29 Mbps. Further complicating this problem is that the
biological environment places stringent limits on the size and power availability for

such devices.

We focus on the development of on-chip signal processing prior to data trans-
mission to extract some valuable information and reduce the telemetry bandwidth
requirements. Studies have shown that on-chip processing may reduce the data
throughput to approximately 75% for bursting activity and 85-90% for spontaneous
activity and reasonable bit rates are achieved for a 32 channel device sampling at
25 kHz, with 8 bit sample quantization[5]. This technique relied heavily on the

Discrete Wavelet Transform (DWT) [6]

The Discrete Wavelet Transform is also the basis of many algorithms for spike
detection[7], noise reduction[8], spike sorting[9] of multichannel microelectrode
recordings, including those presented in this thesis. On-chip DWT computation
of the recorded data will be an essential component of any implantable electrode
recording array with intelligent wireless transmission of data. With the stringent
power and ares limitations imposed on the system, a suitable implementation of
the DWT is paramount. The use of the lifting technique for DWT computation
results in considerable savings in computational hardware, memory, and bandwidth

requirements.

1 .2 Theory

The compaction property of the wavelet transform, which expresses the signal in
a few larger coefﬁcients while expressing the noise in many small coefﬁcients is
used in most of the algorithms for neural signal processing. Discarding the smaller
coefﬁcients results in denoising[8]. Also, the fact that the signal of interest (neural

spikes) occupy a small number of coefﬁcients is beneﬁcial for compression strategies.

1.2.1 The Traditional Mallat’s Algorithm

The traditional Mallat’s algorithm[10] for evaluating the wavelet transform of a
given signal involves recursively convolving the signal through two decomposition
ﬁlters h and g, and decimating the result to obtain the approximation and detail

coefﬁcients at every decomposition level j as illustrated in Fig.1.1.

These ﬁlters l and h are the low pass scaling function and high pass wavelet
functions respectively. The approximation and detail coefficients at any arbitrary

level are given by

. K-1 .
a-7(z') = Z aJ—1(2z' — km,
k=0
K—l

dj(i) = Z aj_1(2z’— k)hk (1.1)
1:20

where K is the length of the ﬁlters, also known as the support of the wavelet

dj-l-l

 

 

 

 

 

 

 

 

 

 

IQ

a,
Kn.
A

 

 

 

 

 

 

 

 

 

 

 

Ht?) t2 r:— dj+3
- L(:) - t2 ~—~ “1 as (: t2 ~3-

L(.:) 4,2 L—g
a HZ) l2”—

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

IQ

 

 

 

 

 

 

 

 

 

 

 

 

 

j+3

 

 

 

 

IQ

 

 

 

 

 

 

 

 

 

Figure 1.1. The basic analysis block of the pyramidal algorithm proposed by Mallat
for 3 decomposition levels.

function.

1.2.2 The Lifting Approach

From the computational standpoint, the lifting scheme for DWT computation has
shown numerous beneﬁts over traditional DWT computation[11]. It is a fast and
efﬁcient mechanism for implementing the DWT. For asymptotically long ﬁlters, the
computational cost of the lifting algorithm is one half of the cost of the standard
algorithm, and it requires only iii-place computations. It can also be adapted to
a lossless integer—to—integer transform which does not require any ﬂoating point

computations. The lifting scheme relies on factorizing the l and h ﬁlters to obtain

the ﬁlters S N and TN as

hit/E) + law—7)

W?) + M?) '

 

2
hfﬁ) - h(\/:E)

2
l(x/E) - INT-3)

 

_ 2

p

K 0

0 K 0

 

I SN(Z) I

2
0

I TN(z) 1

d

1

 

d

(1.2)
l 0
T1(Z) 1

In this scheme, there is no need to compute the two different discrete convolu-

tions for the whole data record. The memory required is the same as that required

to store the data, while Mallat’s algorithm requires twice as much memory as the

initial data to compute the wavelet transform. In addition to these savings, an

integer approximation of the data and quantization of the ﬁlter coefﬁcients further

reduces the computational load.

 

 

 

 

 

 

_f_. 1.
JD
‘1’
—_——i Split 71(2) 81(2) ...

 

 

 

 

 

 

 

 

 

 

 

 

 

1,, ‘
ﬂ

 

SN(z)

 

 

 

 

 

—1
8." G

j+l

Figure 1.2. Lifting scheme for computing a single level of DWT decomposition.

The lifting scheme of Fig.1.2 replaces one stage of the Mallat’s algorithm shown

in Fig.1.1. It is based on three steps: splitting the data into even and odd samples,
predicting the odd samples from the even samples, and updating the even samples

to obtain the approximation coefﬁcients.

For our hardware implementation with strict power and area constraints, we con-
sider an integer to integer wavelet transform[12]. Here, the ﬁrst factor in Eq. (1.3)
is discarded, and the realized transform is in fact a scaled version of the original
wavelet and scaling functions. The symmleM wavelet basis has been found suitable
for processing of neural recordings[9]. Factoring the scaling and wavelet functions

for this basis results in the following steps to be used in the scheme of Fig.1.2.

stepl: gilt]: lgolil+01folill

step2= filil = [folil+C2gilil+C3glli+1lJ

step3= 92M = lgilil +C4f1lil +05f1li-1ll (13)
steel: alil = lfllil+0692lil+0792li-lll

step5 : d[i] = [g2[i] + Cga[i + 1]]

where the input registers go[i] and f0[i], are the pair of even and odd samples of the
signal, a[i] and d[i] are a pair of approximation and detail coefﬁcients computed,
and [.j denotes the floor function. Note that we truncate the result at every stage,

since we wish to implement the integer version of the transform.

The constants 01,02, ..., C8, which are the coefﬁcients of the ﬁlters Tn(z) and

Table 1.1. Update and Prediction ﬁlter coefﬁcients for symmlct 4.

 

 

 

Coefﬁcient Value Coefficient. Value
C1 0.39114 C5 0.16203
Cg -0.12439 CG 0.43128
C3 -0.33924 C7 0.43128
C4 -1.41951 C8 -1.04925

 

 

 

 

 

571(2) are given given in Table 1.1.

1.2.3 Integer Lifting DWT

The data recorded at any channel is scaled and quantized to obtain data sam-
ples within a 1—bit integer precision. The DWT is then computed using the lift-
ing scheme described in Eq. (1.3) to the required number of decomposition levels.
The last step (of the corresponding floating point lifting) of multiplication by the
constants K and K ‘1 (in Eq 1.3) is omitted. Hence, the dynamic range of the

transform at each level will change by K.

1.2.4 Quantization of the Filter Coefﬁcients

The ﬁlter coefﬁcients C1 through C8 have real values given in Table 1.1. However,
since we are working with 10-bit integer data, we wish to remove all the floating
point operations from the DWT computation strategy. The ﬁlter coefﬁcients are
therefore quantized to a ﬁxed number of bits. We have experimented with 4, 6,

8, 10, and 12-bit precision of the lifting factorized coefﬁcients. Quantization to a

small number of bits results in a signiﬁcant reduction in the number of additions

and shift operations required for each multiplication in Eq. (1.3).

1.2.5 Wavelet Thresholding

The DWT representation of neural signals is suited for noise reduction as the sig-
nal components are compactly represented by few large amplitude coefﬁcients be-
cause of the inherent similarity between the wavelet basis functions and the spike
waveforms[13], and the noise is spread across all decomposition levels in many small
amplitude coefﬁcients. This allows thresholding[14] the small coefﬁcients to take
place and simultaneously reduces the number of coefﬁcients to code for extracuta—
neous transmission. The coefﬁcients that are thresholded can be run-length encoded

to give signiﬁcant savings in bandwidth[5].

We formulate a measures to quantitatively assess the performance of our DWT
module for typical neural signal processing problems. The Signal to Noise Ratio of

the recorded data at any stage of processing is deﬁned as

Average Spike Peak to Peak Amplitude

SNR (dB) = 20109.10 Noise RMS

 

(1.4)

This deﬁnition of the SN R is relevant for spike detection in noisy data as dis-

cussed in Chapter 2.

We also wish to evaluate a measure of neural source separability in multiunit

array recordings. The wavelet based MASSIT algorithm[9] for spike sorting relies on
the invariance of the signal subspace in the “best” subbands of the DWT expansion.
The approximations performed in the DWT module design should not compromise

the accuracy of the spike sorting stage.

To compute the eigenvectors for the MASSIT spike sorter, we model the ob-

served spike waveform across multiple channels as
Y=X+Z am

where Y E ERMXN denotes the observed spike in the interval [1, . . .,N] across
an array of M channels, X = apsg is the noise free observed spike across the
array such that the column vector ap E ERMXI describes the array response to
the spike waveform Sp 6 RN X1 elicited by neuron p, and Z 6 giMXN denotes a

zero mean additive noise component with a non-diagonal spatial covariance matrix

RzéiliMXM.

The column vector ap spans the same subspace as the ﬁrst eigenvector "1 of the

spatial covariance matrix of Y, computed as

M
Ry = E[YYT] = UYDYU$ = Z amumug (1.6)

m=1

Therefore, a qualitative assessment of the neural source separability is a distance

metric between the eigenvector obtained with ﬂoating-point standard Mallat DWT

10

algorithm and the eigenvector obtained using ﬁxed-point integer lifting DWT com-
putation. The distance between both vectors, in a mean square sense is computed

as

dp = ”111 - at”2 (1.7)
where [III is the Euclidean norm.

This distance metric computed after thresholding in the Integer Lifting DWT
module, and the regular DWT module provides a qualitative measure of the per-

formance in spike sorting.

The distortion in the reconstructed spike waveform due to the quantization and
approximation errors in the DWT module is also considered. Faithful reconstruc-
tion of the spike waveform is essential for all spike sorting algorithms, whether for

single channel[15] or array data[9]. We compute the distortion as

xxT
(X — Yr)(x — Yr)?”

 

Distortion SNR (dB) 2 10log10 (1.8)

where Yr 6 §R1XN is the reconstructed signal after wavelet thresholding for the
DWT module, and X E ili‘lXN is the true neural signal for any single channel.

This distortion metric can be evaluated for the Integer Lifting DWT module
with arbitrary data and ﬁlter quantization, and compared to the “ideal” distortion

obtained with the traditional ﬂoating point DWT module.

11

1.2.6 Computation and Data Rate Savings

The lifting scheme for DWT computation requires less computation than the tra-
ditional Mallat’s algorithm, even without any data or ﬁlter quantization. For the
symmleU, orthogonal wavelet with a ﬁlter length of eight (Ihl = 8, [II = 8), the
number of ﬂoating point operations (flops) required for the standard algorithm is

34, while the lifting algorithm requires only 20 flops [11].

Further, the quantization of the data and ﬁlter coefﬁcients C1, C2, ..., C8 reduce
the complexity of the multipliers used in the lifting scheme computations (Eq. (1 ..3))
If the data is quantized to Nd bits and the lifting ﬁlter coefﬁcients are quantized
to N f bits, the product will be truncated to Nd bits. Hence, we estimate the total
number of 1-bit addition steps required for one multiplication in the lifting module
to be Nd x N f. The required precision of the data is directly proportional to the

data rate needed for transmission or storage on on-chip buffers.

1.3 Methodology

A template spike waveform from a single neural source was used to generate a
simulated spike train across a four channel array. The inter-spike interval times
were realizations of an exponentially distributed random process. Experimental
neural noise from a closely spaced array recording was superimposed on the spike

train. The noise level was varied to obtain carious SNR conditions.

To evaluate the expected performance in source separation, we used Monte Carlo
simulations for various choices of mixing matrices for different single units. The
dominant eigenvectors were computed from the denoised data and their distance
metric was evaluated from the original mixing matrix A. The distortion in the
spike waveform in an individual channel as a function of the multiplier complexity

and transmission data rate are also plotted

1.4 Results

Figure 1.3(a) illustrates the noisy signal, and its denoised versions using the tra-
ditional ﬂoating point DWT, Integer Lifting DWT with 10-bit ﬁlter coefﬁcients,
and Integer Lifting DWT with 4-bit ﬁlter coefﬁcients. Figure 1.4 illustrates the de-
noising performance comparison of the ﬂoating point DWT, Integer Lifting DWT
with 10—bit and 4-bit ﬁlter coefﬁcients on 10-bit data. The interesting result here
is that the performance of the Integer Lifting DWT module with 4-bit ﬁlter coefﬁ-
cients is close to that of the 10-bit DWT module, which translates into tremendous

computational and memory savings.

The performance of the source separation distance metric was evaluated and
illustrated in Figure 1.6. The plot indicates that the signal subspace is maintained
when replacing the ﬂoating point DWT to the Integer to Integer Lifting based DWT.
Figure 1.7 shows that the reconstructed waveform after denoising for the different

cases is very similar. We can locate an optimal operating point for minimum

13

2m .. -—‘~— —a ——v~~--—v-—-~—1 15w .- —~ __._. -— ._ ~ ~ —«

“”Wtl‘tit‘ll'i’li‘iilli’tlilllimit l l h l"
f l l‘ . I
i l l -1 50 [ I

. I l
-9“) L- -- .1. A D.L_._c_fn_. Ll -25(_ii,___1.m.1. .4 L. A.” .___.._ 1_J

 

 

i. liv‘lj )l I: ',“‘KI.‘,,‘,'. i'."j 1
Oil J?" 5i; {ill l. lull} ll ".li.‘ ii“ l‘i'il' f‘ \11
,v

 

 

(a) A single channel of noisy data. (b) Wavelet thresholded data using the tra-
ditional DWT

 

 

 

 

— m —r— “ r— r m 200 * *— rrrr Hf * *Fr‘r‘ “’ﬁ
180; i i
‘ . . l l ' l
r. : ‘ , , . ‘ . l.-\"ii,"’ f. ' 19". ‘1‘ n”. "i." .3: ‘hu; 4‘," v. " . h - ' -
”i159." ‘ .‘ “"23ut»+~’\;(f’i' .‘|.'. i‘f“.‘_.'r"(_P‘ ii If). ”LIN! , 0 I' "1' -i ‘, '.. ‘ " ' l "r" i' ' .‘. ill ‘ Er " ,. 3 J7 ‘it’ “w ‘
L ' 'l :4 ' . 'l l :39": 1) "ml -:,.l ./, - 'j ,
l l I l
150i i | J -200, ‘ l
L -__ _.. A -._r_-_ . . _ L_ _r__._ 1-..-.. _.__- L _

 

(c) Wavelet thresholded data with Integer (d) Wavelet thresholded data with Integer
Lifting DWT with 10-bit ﬁlter coefﬁcients. Lifting DWT with 4—bit ﬁlter coeﬂicients.

Figure 1.3. Comparison of denoising performance for various precisions of ﬁlter
coefﬁcients for a sample channel of a 4-channel array with the data quantized to 10
bits.

computational complexity while giving good signal ﬁdelity from Figure 1.8. We see
that using 10-bits to represent the data, and constructing the DWT module with
4-bit ﬁlter coeﬂicients gives very little waveform distortion, while huge savings on

the hardware are obtained.

14

30L —Floating Point DWT

SNR after denoising (dB)
to N
o o:

.3
01

10

 

8 10 12 14 16

 

--------- IWT 10 bit quant. ' _ v"
- ~~IWT 4 bit quant. . . '/

 

 

 

 

l l l l L 41 L I J

18 20
SNR of noisy data (dB)

22 24 26

l

28

Figure 1.4. Denoising performance for various wavelet transform computations.

p p p p
a: a: '4 on
I

S3
00

Distance for denoised data
0
_ a

p
N

0.1

 

. Data denoised with Floating point DWT
0 Data denoised with Integer Lifting for 10 bit coefficients
' Data denoised with Integer Lifting for 4 bit coefficients
—Floating point (best fit)
"mm-10 bit coefficients (best ﬁt)
' "4 bit coefficients jbest fit)

 

 

no.

0.

1

 

L i 1

0" 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.3

Distance for noisy data

 
    
   
   
 
 
 

Figure 1.5. Performance in the distance metric of the signal subspace for data
denoised using various wavelet transform computations.

15

 

0 Data denoised with Floating point DWT
° Data denoised with Integer Lifting for 10 bit coefficients
0 Data denoised with Integer Lifting for 4 bit coefficients
—Floating point (best ﬁt)
' ....... 10 bit coefficients (best fit)
' ‘ '4 bit coefficients (best f'rL

F3
on

o
q

 

 

P
0)

.0
an
I

Distance for denoised data
9 o
w a

 

 

I L I J

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Distance for noisy data

Figure 1.6. Performance in the distance metric of the signal subspace for data
denoised using various wavelet transform computations.

 

 

 

 

 

i. 2 i:
'I :E it 1!,
:I 32 ii ,
I‘ E: ii i
l' 3': ,'
I 5. .‘.
it, 2‘. l!
l: 2:
ti, 2‘, ii.
_ A | E ' I 1
(U \J‘] ‘ $5 \ 'L- ~ u,“ \
g“) i: a; v ‘i, x
‘0‘) l: i3 l
i: ii H i
H 32 9i :
i: a; :i 1
It 'i ,! ll
i: E: i! 3
'I 2; i! 1
II :3 i: it
i ‘ i ?
Time

 

-— Thresholded with ﬂoating point DWT

-—- Thresholded with Integer Lifting

........ Thresholded with Integer Lifting. 4—bit Wavelet
-'-‘- Thresholded with Integer Lifting, 6-bit Wavelet
—-- Thresholded with Integer Lifting, 8-bit Wavelet

 

 

 

Figure 1.7. Example spike waveforms obtained with DWT computation with dif-
ferent ﬁlter coefﬁcient bit widths.

16

Distortlon in spike waveform vs Computations

 

 

 

 

 

 

g .
I T f
6 ' _,._.-—- - W“ s
' l/ T g i.
/
4 /
j/
I
a 2 ~ —4 bit Wavelet ﬁter coefﬁcients
£1 — 6 bit Wavelet ﬁlter coefﬁcients
5) 0 _ -—- 8 bi Wavelet ﬁlter coefﬁcients
— 10 bit Wavelet ﬁlter coefﬁcierts
-—- 12 bit Wavelet filter coefﬁcients
-2 - O 4 bit data quantization
,', ,z r; X 6 be data quantization
4, .. T + 8b't data quantization
” 0 10 bit data quantization
V 12 bit data quantization
-6 1 l l
0 50 100 1 50

Mulﬁpier Complexity (equivalent bit additions/sample)

Figure 1.8. Effect of round off and quantization errors on the signal ﬁdelity as a
function of multiplier complexity.

1 .5 Conclusion

A robust, computationally simple module with anticipated major reduction in chip
area and power consumption for implementing the DWT has been demonstrated.
The proposed design was motivated by the need to efﬁciently process neural record-
ings from high—density implantable neural interface modules in the nervous system.
The ability of the DWT to preserve the signal characteristics in the temporal,
spectral and spatial domains intuitively makes it a desirable signal processing tool
to pipeline the neural data with reduced bandwidth in any associated telemetry
system. The anticipated reduction in chip area and power consumption without re-

ducing the performance to a noticeable degree strongly suggests that the proposed

17

module design can be a key processing stage in the next generation of neuropros—

thetic devices.

A circuit design and simulation of the proposed module has indeed conﬁrmed
the savings in chip area and power requirements in addition to the versatility of

the design for real-time and streaming data processing[16].

18

CHAPTER 2

Spike Detection

2.1 Introduction

Spike Detection refers to the detection of action potentials generated at the mem-
brane of neurons from electrical signals recorded from neural electrodes. There is
generally a noise process from the surrounding neural tissue corrupting the recorded
signal. The neural spikes or action potentials are transient signals of small duration
that appear as local voltage surges. The wavelet transform appears as a natural
way to represent data containing transient signals due to its localization properties
in both time and frequency. We have demonstrated use of the wavelet transform to
do low level processing on the recorded data for denoising[17] and compression[18].
Therefore, the wavelet transform computations would need to be performed in our
framework of a modular design of the prosthetic devices, with the detection algo—

rithms adding very little overhead.

19

The algorithms presented here detect the spikes through the peaks in signal
amplitude occurring at any point over its waveform. For wavelet bases which might
be correlated with the signal waveform to some extent, the ratio of the signal peaks
associated with the spike to the noise variance increases in some wavelet subbands.

The wavelet transform essentially gives a higher SNR for detection in the relevant

subbands

The waveform of spikes generated by a neuron may change over time, and in
general cannot be assumed known. For chronic experiments or neural prosthetic
devices, active neurons whose action potentials are observed at the electrodes may
appear and go due to electrode movement, and changes in neural tissue. For these
reasons we assume no knowledge of the spike waveform or amplitude. Nenadic[19]
also gives a wavelet based method for spike detection heavily based on Oweiss[20].
However, Nenadic’s method implicitly assumes knowledge of the spike waveform by
choosing the wavelet function and decomposition levels which are highly correlated
with the spike waveform. Further, the estimation of signal strength and signal
probability makes the detection susceptible to changes in signal conditions. These
assumptions and estimations will run into more problems when action potentials
from more than one neural source are to be detected, and the problem becomes cou-
pled with that of estimating the number of sources. Nenadic’s approach also relies
on a continuous wavelet transform which is highly computationally expensive. Our
emphasis is evaluating approaches for a very blind signal detection scenario with

very little assumptions about the signal and noise conditions. The work presented

20

here uses algorithms for single electrode channel and array observations similar to

those presented in [20] and derives results for performance in different cases.

2.2 Data model

We assume the following data model for the an M electrode array (Eq. (2.1)).

Y = AS+Z

= x+z (2.1)

where S E §RPXN is the neural signal, A E §RMXP is the mixing matrix,
Z E meN is the noise and Y E ERMXN is the observed data, N being the

number of time samples, and P being the number of neurons.

For simplicity, we assume stationarity and iid distributions at each time sample.

Hence, for any time snapshot we get Eq. (2.2).

y = As+z

: x + z (2.2)

21

where s E mel is the neural signal, 2 E ERMXI is the noise and y E ERMXI is

the observed data.

z ~ N(O, R) (2.3)

We assume the noise (Eq. (2.3)) to be temporally white Gaussian, with the

spatial correlation structure given by R 6 RM XM .

2.3 Wavelet domain detection

The compaction property of the wavelet transform condenses the energy of the
signal into a few coefﬁcients, while the noise (assumed white) is distributed across
all the nodes. Hence, it should be easier to detect the signals in the wavelet domain.
Our approach is to perform a multi level stationary wavelet packet decomposition of
the observed data. The stationary (or undecimated) wavelet transform is performed
because the truncated wavelet transform reduces the number of samples at each
subsequent level by half. Though it takes less memory or registers, the truncated
wavelet transform reduces the temporal resolution for transient signal detection.
The wavelet packet tree decomposes both the approximations and details at each

level, as opposed to decomposing only the approximations as in a wavelet tree.

22

2.3.1 The Stationary Wavelet Packet Tree

The Stationary Wavelet Transform seeks to represent the signal in a redundant,
shift-invariant basis with uniform time-scale tiling. To aid the understanding of the
transform as a matrix operation, consider the following deﬁnition of a translation

operator[21].

Deﬁnition 2.1 Suppose z is a vector. Deﬁne (Rk2)n = zn_k. Rk is a translation

operator.

If the low pass ﬁlter function (also called the father wavelet) be designated
as u E gile, and the high pass ﬁlter function (also called the mother wavelet)
designated v 6 Elile, where f is the length of the ﬁlters (compact support of the
wavelet), then the matrices for the approximations and detail decomposition of a

signal of length N at any level can be written as in Eq. (2.4) and Eq. (2.5).

Wa = [Rlu R211 RNu] (2.4)

W(1 = [Rlv Rgv RNV] (2.5)

23

Here, Wa E me N is the transformation matrix for the approximations, and
Wd 6 iliNx N is the transformation matrix for the details. The approximations

sa 6 mlxN and details sd 6 ERUN for the signal s E ililxN can be obtained as

in Eq. (2.6).

s8 = sWa

sd = SWd (2.6)

Note that at every level of decomposition the amount of data (or memory re-
quirement) is doubled for a stationary wavelet packet tree. The multilevel Station-

ary Wavelet Packet Tree can be represented as in Fig.2.1.

The node 1 in this tree corresponds to the un-processed time domain signal.
We can therefore write the transformation matrix for obtaining the signal at any
node as a product of Wa and Wd. For example, to obtain the wavelet transformed

signal at the node 5 we have,

4 5 5 7
W W” w W“ wa dea wd
8 9 10 11 12 13 14 15

Figure 2.1. The Stationary Wavelet Packet Tree.

=> W(5) = wawd. (2.7)

Looking back at the data model for the observations, we formulate the wavelet

transform equations for the array data

25

Y = AS+Z

= x+z

y(.i) : yw(j)
Y0) : xw(j)+zw(j)

3(0) 2 x(j)+zfj) (2.8)

where YO) E iIiMXN is the wavelet transform in the jth node of the observations

YERMXN.

The observations are assumed to be sampled at 20kHz, and since the spike
waveforms are usually concentrated at bandwidths less than 10kHz, the ﬁrst level
details (labelled Node 3 in Fig.2.1) and its offspring do not include data of interest.
Hence, to save computations the complete wavelet tree is not computed. Rather,
only Node 2 and its children are computed. The detection tests are run only on

these subbands.

Similar to Eq. (2.2), we describe a snapshot of the data at a single time sample

351

,0) 2 x0) + 20'), (2.9)

26

Here, yo.) 6 ERMxl is simply one column of the Y matrix, corresponding to the
time instant of interest. The algorithms considered in this work operate on single
time snapshots. The spike detection tests are carried out independently across all
the wavelet sub-bands and the decisions are logically OR’ed. For convenience and
reducing visual clutter, the subscript corresponding to the node (subband) of the
Wavelet Packet Tree will be dropped, and all the data will be understood to be

from a particular wavelet subband.

It is shown in the sections on detection performance evaluation, that the
detection performance is a function of the Signal to Noise Ratio of a channel.
The wavelet transform helps in the detection performance because neural spike
waveform, being localized in time would be correlated to some extent with the
wavelet functions. This in effect, increases the equivalent SNR at some wavelet
subbands. Previous work[8],[17] has shown the symmlet 4 to be a good wavelet
basis for neural spike signals, and this is the wavelet basis used in this study. As an
illustration, we have plotted the signal in various nodes of the Stationary Wavelet
Packet Tree for a single channel of data containing 4 spikes in Fig.2.2. The red
dots denote the instant of the largest absolute peak for a spike, which is to be
detected. As we can see, some of the nodes, for example Nodes 8 and 17 are well
tuned for capturing the spike energy. The SNR’s in the different SWPT nodes for

this example are tabulated in Table 2.1.

27

Amplitude Amplltude Amplltude Amplltude Amplltude Amplltude Amplitude Amplitude

500
0
-500
500
0
-500
500

0

-500
200

0

500

-500
200

0
-200
200
0
-200
200
0

-200

Mi
in MM
200

 

NE Node1 ‘

 

 

 

 

 

 

 

 

W
kmwlllMWW "0601M
WWWMMMWN

it ill-iii Milli it'll tiliﬁiill l

100

 

 

 

 

 

 

 

 

 

300
Time Samples

Mill MM ,1 M

500

 

WWW? til” will I“ 'l'l‘l'IW

Elli ll ,,h‘1WV1.,lil lI‘IlIIlIIill IIIIIIIII‘II ”#11le Will}

WWW
WW W‘ltll'illilii

r 1,13,11,ng111‘111, “Will" iI'l 12011“.th

100

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Time Samples

Figure 2.2. A single channel represented in different SWPT nodes.

28

Table 2.1. Signal to Noise Ratio in the different SWPT nodes.

 

 

 

SWPT Node Signal to Noise Ratio
1 10.4799
2 13.3108
4 14.8127
5 6.1543
8 15.5377
9 13.8415
10 3.3308
11 7.2197
16 11.9609
17 16.7398
18 5.9863
19 14.3290
20 4.1822
21 1.2984
22 3.0082
23 6.4636

 

 

 

2.4 Analysis of the Bayesian Tests for Spike De-

tection

2.4.1 Likelihood Ratio Test for the known signal case

For simplicity, we consider a single neural source impinging on an array. In this
case, we have a binary hypothesis. Under hypothesis H0, no signal is present and
s = 0. Under the alternative hypothesis H1, the signal (spike) is present and

As=x=p..

29

The LRT then becomes,

—<y — u)TR‘1(y — u)

 

1

 

 

 

e 2
—N 2
v27r|R| / TR_1 §g371 (2.10)
‘y y
1 _
e
v27rlR|_N/2

where N is the number of electrode channels and 17 is the detection threshold.

This reduces to

_ H u
yTR 1p. §H(1) logn + L__ E 7 (2.11)

We therefore define the Sufficient Statistic T for the array spike detection with

Eq. (2.12). For a vector case, this reduces to a matched ﬁlter.

T = yTR_1 p. (2.12)

30

To compute this Sufficient Statistic, we require the noise covariance R. In the
absence of any information about the noise structure (the case in real spike detec-
tion applications), it will be estimated from the data. This is simply the sample
covariance of the observations (Eq. (2.13)). We will assume this noise covariance

estimate for all subsequent expressions.

R = -—YYT (2.13)

Since the Sufficient Statistic T is a linear function of a Gaussian random
variable Y, it is also a Gaussian. We can determine the pdf of the Sufﬁcient

Statistic T for the two hypothesis in terms of the means and variances.

H0 = EIT] = E [yTR"1ul = 0

E [(T - T32} = E[(yTR‘1u)T(yTR”IM)l = ”TR—1M
H1 I ElTl = ElyTR_1ul = ”TR—ll!

E[(T — W] = uTR‘lu

(2.14)

31

Therefore, we can evaluate the performance of the detection in terms of these

numbers.

 

 

PF = PHO(T > A) = g [1— Q(I£TR31M\/§)] (2.15)

Correspondingly, we have similar expressions for single channel spike detection

too.

 

.2 ‘
21-”
1 a]-
PD =PH1(T>/\)=-2- l-Q ”.2
- 03' ﬂ _

 

 

J

 

1 A
PF =PHO(T>/\)=-2- l—Q ”.2
U'zﬁ

- .7

(2.16)

 

 

Hence, for a Gaussian distributed Sufficient Statistic T, the distance metric
between the hypotheses is simply “TR-1p. For a single channel case, this is

simply the energy of the signal divided by the variance of the noise.

32

To compare the performance of the array detection vs. the single channel de-
tection, we observe the distance measure between the hypotheses in the two sce-

narios. The array processing will give better performance if the following condition

(Eq. (2.17)) holds true.

0

”TR—1p. > El??- (2.17)
0]-

where pj is the strength of the signal at the the channel j where we are detecting

the signal, and aj is the variance of the noise at that channel.

Hence, we can describe “TR—1p, as an equivalent Signal to Noise Ratio for
the array case. The observations (array or channel) with a better Signal to Noise

Ratio give better detection performance.

Performance Comparison for Spatially Uncorrelated Noise

If we assume the noise covariance to be diagonal (Eq. (2.18)), we can get results
on the equivalent SNR’S for a single channel and the array data. Eq. (2.19) shows
that the equivalent SNR, and hence the performance of a detector using the array

data will be better than that working on any single channel.

33

 

 

 

 

012 0 0 0
0 022 0 0
R: (2.18)
L 0 0 0 0M2
01—2 0
#TR_1#=#T u
1 0 ”NI—2-
M #2
=25» —:—:Vie{1,2.. M} (2.19)
1210i

Performance Comparison for Spatially Correlated Noise

In this case, the noise covariance R is not diagonal. Since we have already have
the expressions for the detection performance in various cases, our purpose here
is simply to qualitatively describe the effect of the correlation. For mathematical
simplicity, we assume a two channel array. The signal and noise covariances are

described in Eq. (2.20).

34

#1

’1’:
H2
2
01 0102/)
R: (2.20)
0102/) 012

An inequality condition (Eq. (2.21)) can be derived for the array and channel
SNRs. By symmetry, a similar condition holds for channel 2. Hence, we conclude

that the array performance is better than a single channel performance.

 

 

T -1 it? 031% -- 2H1H20102P + #30? it?
" R f‘ ‘ _2 = 2 2 2 - ‘2
01 0102 — (0102p) 01
2
(#2002 (1- £323)
— “2 1 > 0 (2.21)

clog (1 - p2)

We can also try to understand the effect of the noise correlation on the per-
formance. For some ﬁxed signal conditions, and ﬁxed channel noise variances, the
array SNR performance improves with the noise correlation coefficient opposite in
sign to the signal correlation, and deteriorates when the noise correlation coefﬁcient

is of the same sign as the signal.

2.4.2 Likelihood Ratio Test for the unknown signal case

Here, we construct the generalized likelihood ratio test because we do not know
the signal waveform or strength. The generalized estimate for the signal therefore

is the observed data itself.

-(y - y)TR"1(Y - y)

 

 

 

 

1
e 2
,f‘zn'IRl-Nﬂ H1
T _1 §H 77 (2.22)
1 -y R y 0
, e
«mm-M?
This reduces to
_ H
yTR 1y §Hé2logn (2.23)

This is essentially a blind energy detector. We therefore deﬁne the Sufficient

Statistic T for the array spike detection with Eq. (2.24).

T = yTR-ly (2.24)

The Sufficient Statistic in this case is X2 distributed with degrees of freedom
equal to the dimension of y. The X2 statistics can be proven if we consider the

singular value decomposition of the noise covariance (Eq. (2.25)).

36

R = UDUT (2.25)

The Sufﬁcient Statistic can thus be expressed as the norm of the whitened

observations y.

T = {IR—13'
= yT(UDUT)_1y
: (D’1/2U‘1y)T(D’1/2U‘1y)

= 9% (2.26)

Now, 57 = D—l/zU—ly, being a linear relation on a Gaussian Random variable
y, is also Gaussian. The norm of a multidimensional Gaussian is X2 distributed,
and hence the Sufficient Statistic (Eq. (2.26)) for unknown signal detection in an
array is X2 distributed. We can calculate some statistics for the two hypotheses

(Eq. (2.27)) in terms of the whitened observations.

37

= 0
E[(y— “Xi—WT] =1
'9'” = 0T0 = 0
(2.27)
H1 2 = E121 = E [D—l/zU‘ly]
= D—1/2U—1M
E [(9 — m - W] = 1

if? = (D”1/2U‘IM)T(D_1/2U‘1u) = ”TR—III

A

Therefore, we can construct the probability density functions and cumulative
distribution functions for the Sufficient Statistic T in terms of the non-central X2

distribution functions.

 

Tn/2-le—T/2
H : T = n T,0 =
0 p( ) p( ) I‘(%n)2n/2
F(T) = F(T|n, 0) (2.28)

38

H1 : p<T> = pn(T,uTR‘1u) = MT, 1)
_ T(n/2—1)e—(T+/\)/2 00 (ATV:
_ 2n/2 k=0 22kk!r(k + $72.)
P(T) : ﬁrm, A) (2.29)

 

 

where F (T ln, A) is the cumulative distribution function of a non-central X2 dis-

tributed random variable with 17. degrees of freedom and non-centrality parameter )1.

These statistics (Eq. (2.28), Eq. (2.29)) can help us calculate the performance

of the detection as follows.

PD 2 P(T > 2logn|H1)=1— FH1(2logn)
=1— F(2 log nln, A)
= 1 — F(2logn|n,pTR“1p)
PF 2 P(T > 2log77|H0) =1— FH0(2logn)

= 1 — F(2 log nln,0) (2.30)

where PD is the probability of detection, and PF is the probability of false alarm.

To compare the performance of the array detection with that of the single

channel detection, we write the corresponding probabilities of detection and false

39

alarm for the single channel case.

2

ﬂ.
PD =1-FH1(210g77)=1-F(210gn|1,ﬁ)

.7
PF =1—FH0(2logn)=1—F(2logn|1,0) (2.31)

where 113- is the mean of the signal at the channel j where we are detecting the

signal.

In this case of unknown signal detection, we do not have one expression
similar to Eq. (2.17) to compare the performance of array and single channel
detection. However, it is possible to compare the performance in different cases nu-

merically using the tabulated values of various X2 cumulative distribution functions.

Let us consider an example of a three channel electrode array. Assume the
signal strength at the 3 electrodes in the presence of a spike given by Eq. (2.32),

and the noise covariance given by Eq. (2.33).

2.2

As : z = p = 2 (2.32)

1.8

 

 

40

1 0.4 0.3
R: 0.4 1 0.2 (2-33)

0.3 0.2 1

 

 

The detection performance in terms of the Receiver Operating Characteristics

are plotted for this case in F ig.2.3.

 

 

 

 

 

 

 

 

HOB"; t'“:“ 6 ‘I “an“
09 ——“
0'8 ---- Channel 1
c - - Channel 2
g 0-7 - - - Channel 3
0
Q)
15 0.6
o
'5 0.5
g
g 0.4
.0
9
o. 0.3
0.2
0.1
0 1 1 l 1 l
0 0.2 0.4 0.6 0.8 1

Probability of False Alarm

Figure 2.3. Receiver operating Characteristics for Array and singe channel detec-
tion.

It is seen that in this case, spike detection using the array data gives better

performance.

41

Let us now consider a slightly different case with the same noise covari-

ance(Eq. (2.33)), but the signal strengths given by Eq. (2.34).

2.48

As=x=u= 2.1 (2-34)

1.8

 

 

The detection performance in terms of the Receiver Operating Characteristics

are plotted for this case in Fig.2.4.

On close examination, it is revealed that for low values of Probability of False
Alarm, Channel 1 detection gives better performance compared to detection with
the Array data. The situation is reversed for higher values of Probability of False
Alarm. We thus observe that for the generalized likelihood ratio detection, we can

not derive a Uniformly Most Powerful property for array vs. channel detection.

It will be worthwhile to calculate the values of the Signal to Noise Ratio at
channel 1, and the equivalent array Signal to Noise Ratio we had derived in the

known signal detection case (Eq. (2.35)).

42

 

 

 

 

 

 

 

0.9 ,
_ y’ , ’ ' — Array
0'8 .z ’ ’ , ’ ’ ----- Channel 1
:07. ,I" x - - Channel2
2 ' .I ,’ - -- Channel 3
8 ' I,
g 0.6 r /
'5 0.5
g:
a 0.4
.D
9 ’I
n. 0.3 . I
l
0.2 I
0.1 l
O L L L l l
0 0.2 0.4 0.6 0.8 1
Probability of False Alarm

Figure 2.4. Receiver operating Characteristics for Array and singe channel detec-
tion.

2
Channell SNR = 1010g10 (5%) = 7.889 dB
(7
1

Equivalent Array SNR = lologlo (pTR_1 p) = 9.3516 dB (235)

Channel 1 data and the Array data give comparable detection performance in
this case. However, if we had known the signal (and applied the test of Eq. (2.10),

the performance of the Array detection would have been much better due to its

43

higher equivalent SN R. Thus, we can conclude that the performance gain in array

detection is more pronounced in the case of known signals.

2.5 Results

we run Monte Carlo simulations for spike detection of unknown signals for single
channel and array observations at various SNR conditions. The performance
of spike detection in the time domain and multiresolution (wavelet) domain is

compared.

Fig.2.5 shows the performance of detection for single channel detection in
multiresolution and time domain for an SNR of 7.6024. As we can see, the
multiresoltion detection performs much better than the time domain detection.
Figs.2.6 and 2.7 also show similar improvements in detection performance due to

the multiresolution analysis.

F igs.2.8, 2.9, and 2.10 Show ROC curves for multiresolution and time domain
detection at various SNR conditions. An example of the noisy data and the real

locations of spikes for an SNR of 8.117 is shown in Figs.2.11 and 2.12.

At these noise levels, the spikes are well masked by the noise so as to make

detection by visual inspection quite difficult. The level of masking can be seen in

44

 

 

 

 

 

 

c

.9

8 -7 cf v Multiresolution Scatter

8 0.6 f1 i 1 Time Domain Scatter

3 "‘~ ' —Multiresolution Best Fit

1; --'-'T1me Domain Best Fit

3 0.

(U

.D

9

o. 0. .
of a 1 1 1 a
0 0.2 0.4 0.6 0.8 1

Probabiity'of False Alarm

Figure 2.5. ROC curves for Multiresoution and Time domain detection for a single
channel data at an SNR of 7.6024.

   
  
 

 

 

 

 

 

c 0. v Multiresolution Scatter

1%: 1 Time Domain Scatter

% 0 —Multiresolution Best Fit

0 ‘ '-~-'Time Domain Best Fit

"5

2* .

=3 0.4 -

(U

.D

9

‘1 0.2
Gﬁ 1 1 1 1 1
O 0.2 0.4 0.6 0.8 1

Probabiity of False Alarm

Figure 2.6. ROC curves for Multiresoution and Time domain detection for a single
channel data at an SNR of 8.7623.

45

 

 

 

 

 

 

c -
.9
g v Multiresolution Scatter
g _ A Time Domain Scatter
‘6 — Multiresolution Best Fit
3‘ '- . -'Time Domain Best Fit
‘5 0.4
<5
.0
9
“L 0.2
0 1 1 1 1 1
0 0.2 0.4 0.6 0.8 1
Probabiity of False Alarm

Figure 2.7. ROC curves for Multiresoution and Time domain detection for a single
channel data at an SNR of 10.857.

    
 

 

 

 

 

1
c 0.8
.Q
'3
‘55 0 6 _ v Multiresolution Scatter
e ., A Time Domain Scatter
: . — Multiresolution Best Fit
:15: Q4 '-'-‘Time Domain Best Fit
9
a.

J I

0 0.2 0.4 0.6 0.8 1
Probabiity of False Alarm

Figure 2.8. ROC curves for Multiresoution and Time domain detection for a two
channel array at an SNR of 8.1177.

46

      
 

 

 

 

 

 

c 0.8
.9
*6
(D
g 0.6 ..
‘5 I | .f "1% v Multiresolution Scatter
% Q4141: 5 A Time Domain Scatter
S ‘ 1 — Multiresolution Best Fit
9 1- , -'Time Domain Best Fit
:1
0.2
Or: I A 1 I l
0 0.2 0.4 0.6 0.8 1

Probabiity 01 False Alarm

Figure 2.9. ROC curves for Multiresoution and Time domain detection for a two
channel array at an SN R of 6.9579.

 

 

 

 

 

 

c
2% 19 v Multiresolution Scatter
{3 0 6 ‘ A; A Time Domain Scatter
c1 ' , A — Multiresolution Best Fit
° v-~-1Time Domain Best Fit
E :_‘
'5 0.4 ~
(1! "
.0
e :
‘L 0.25
C. - . e 1 .
0 0.2 0.4 0.6 0.8 1

Probabiity of False Alarm

Figure 2.10. ROC curves for Multiresoution and Time domain detection for a two
channel array at an SN R of 10.2124

47

 

Noisy Data (channel 1)

 

i I L i L I

400
Time (ms)

Figure 2.11. Noisy data on one channel used to detect spikes.

 

Clean Data (channel 1)

 

300 400 500
ﬁne (ms)

Figure 2.12. The clean spike data to be detected.

48

the spike which is shown in detail in Fig.2.13.

25

20*

15*

10*

 

 

 

 

 

 

 

 

 

100

10

-20»

 

 

 

 

 

 

 

 

 

 

20

40

6080

Figure 2.13. A clean and noisy spike example.

2.6 Conclusions

We have described Bayesian detection tests for spike detection in single electrode
and array data. Analytical results for the Receiver Operating Characteristics have
been derived for the known and unknown signal cases. Detection using the multi-
electrode array data gives potentially better performance than single channel spike
detection. A Uniformly Most Powerful property for the array detection versus the

single channel detection is proved for the known signal case in terms of an “equiva-

49

lent array SNR”. The effect of the spatial correlation of the noise on the performance
of the array detection is also investigated. In the case of an unknown signal case,
the array detection gives better performance than the single channel detection for
certain signal and noise cases, but it may not be Uniformly Most Powerful. We
have concentrated on the unknown signal case, as we assume spike detection to
occur in the preliminary stages of the neural signal processing before any signal
conditions are estimated. Transforming the data in a multi-resolution space like
the wavelet transform and running the tests on the various time scales improves
detection because the wavelet compresses the signal energy in a small number of
coefficients. The performance gain in wavelet domain detection has been experi-
mentally veriﬁed for various Signal to Noise Ratio conditions in single channel and

array cases.

50

CHAPTER 3

Clustering Neural Spike Trains

3.1 Introduction

Ensemble coding of stimuli[22], motor commands[23], and memory[24] has been
demonstrated by neuroscientists and is a subject of contemporary research. We
consider high dimensional electrode arrays recording from subthalamic regions
of the central nervous system. After the preliminary signal processing stages of
compression, telemetry, spike detection, and spike sorting, we obtain spike trains
of individual neurons. The recorded data usually comprise spike trains of a large
number of neurons. While such neural yield presents rich data sets for understand-
ing complex neural circuits and coding mechanisms, it makes inferring relationships
between neurons quite a formidable task. However, not all the recorded neurons
might be responding to a common stimulus under study. Our goal here is to cluster

the data into groups of neurons which have correlated neural activity. This re-

51

duces the data dimensions for any secondary processing of the recorded spike trains.

3. 1 .1 Background

The spike trains, which are a series of action potentials, are characterized as
discrete-event random processes. The exact statistical characteristics of the spike
trains is still a research topic in the neuroscience community, with most studies sub-

scribing to either a rate coding or temporal coding mechanism in the neural system.

The Poisson process has been traditionally used to model the arrival times of
spike events. Alternatively, the inter spike interval generally obeys an exponential
density. If the spike train is modelled as a homogenous Poisson process, it will
be described completely by its mean ﬁring rate. However, a homogenous Poisson
assumption implies stationarity. Since neural signals encode time varying stimuli,
commands, and higher brain functions, inhomogenous Poisson processes appear as
a more natural way of modelling spike trains. An inhomogenous Poisson process
is completely characterized by a time varying rate function. The rate function
describes the probability of an action potential in a unit time interval, as a function
of time. In this study we assume all neural spike trains to be inhomogenous spike
trains. Various methods exist to characterize a spike train’s Poisson rate function.

The simplest way to do so is to count the number of spikes occurring in a time

52

window of some duration 6T. This is known as binning the data with a bin width

of size 6T.

Inferring relationships between neural spike trains is an integral part of any
multi unit spike train analysis[25]. Presently a number of methods exist to
analyze these associations. The cross-corellogram is the cross variance of two
spike trains calculated for different time lags. Cross-intensity is a similar relation
that characterizes the ﬁring rate of one neuron as a function of the ﬁring time of

the other neuron. These methods can be characterized as time domain methods[26].

Another widely used approach is the characterization of the ﬁring activity of
two or more neurons after the presentation of a stimulus. The Joint Peri-Stimulus
Time Histogram (JPSTH) for two neurons is a matrix of the ﬁring rate of one
neuron at a time t1 after the application of the stimulus, and ﬁring rate of the
other neuron at a time t2 after the application of the stimulus. Computing the
J PSTH requires repeatedly recording the same neurons under repetetive controlled

application of the stimulus.

Analogous to these time domain methods, there are various frequency domain
methods that infer relationships between spike trains. Computing the correlation

between the Fourier transforms of spike trains leads to measures of coherence.

53

Operating in the frequency domain, these methods are not bin size dependent but
assume stationarity on the data. Spectrograms and coherograms can be seen as
the frequency domain counterparts of the JPSTH as they characterize the joint

frequency domain ﬁring rates of neurons in the presence of stimuli[27].

Multiple spike trains can also be modelled as jointly random discrete event
processes. The joint probability functions can thus be characterized by a number
of parameters, depending on the form of the probability function assumed.
Standard statistical techniques can be used to estimate these parameters like
maximum likelihood[28] and minimum errors. However, this method requires a
priori knowledge of the parametric form of the multidimensional point process

probability function, and sufﬁcient data to estimate the parameters.

3.2 Simulation Method

Our simulation consists of 120 spike trains with 4 groups of 30 neurons each having
related ﬁring rates within the group, and independent of neurons outside the group.
Every group of neurons had a basic ﬁring rate function. For example, the kth group
Sk is associated with a ﬁring rate function fk E 3?le sampled at N points in time.

A basic ﬁring rate function was selected independently for each of the four groups.

54

For each group Sk, ﬁring rate functions fm E 3?le were generated for all

neurons m, m E Sk with linear and delay relations as in Eq. (3.1).

fm[n] =afk[n—T] +5 (3.1)

where a and B are randomly generated parameters for a linear relation, 7' is a

randomly generated delay parameter.

For each neural ﬁring rate fm, a spike train sm is obtained as a realization
of an in—homogenous Poisson process.‘ Spike trains were thus generated for 30
neurons m1, m2, - ~ ,m30 based on the basic ﬁring rate function of each group k,
where m1,m2, - -- ,m30 E Sk. For 4 groups k = {1,2, 3, 4}, a total of 120 spike
trains are thus generated. Our aim is to cluster these 120 neuronal spike trains

into the 4 natural clusters.

3.2.1 Proposed Technique

. In order to overcome the difficulties in predetermining the bin size in time domain
methods, the stationarity assumptions in frequency domain methods, and the data
requirements of the the parametric methods, we rely on a novel method to cluster

these spike trains.

A discrete wavelet transform representation of the spike train is obtained to 8

levels. The correlations among the neuronal spike trains in different time scales are
calculated and plotted in F ig.3.4. For visual clarity, each group of related neural
spike train is numbered consecutively. Hence, a cluster appears as a continuous
band of 30 neurons. We see that different clusters appear with different strengths

in different time scales.

Graph theoretic and k-means clustering results with features extracted by

Principal Component Analysis are reported.

3.3 Multi-resolution Approach

We assume that functionally interdependent neurons due to synaptic connections
or shared stimuli will have related ﬁring rates. The relations in the ﬁring rate can

be characterized in the pair-wise correlations of the estimated ﬁring rates.

We have indicated that time domain methods like cross-corellograms require
a ﬁxed bin width depending on the assumed frequency content of the signal.
Increasing the bin width gives a more efﬁcient estimate of the ﬁring rate in the
sense of a smaller variance of the estimate. However, the temporal resolution in the

estimation of a varying ﬁring rate would degrade with larger bin widths. Therefore,

56

this results in a variance-resolution trade off in the ﬁring rate estimation. The
optimal bin width will depend on the ﬁring rate, its rate of variation, and its
relation to the stimulus or signal being encoded by the neural signals. Frequency

domain methods are not bin-size dependent but assume stationarity on the signals.

Multi-resolution estimation of the ﬁring rates resolves these problems by de-
composing the spike trains with a wavelet transform (Fig.3.3). We propose the use
of the Haar wavelet basis to represent the observed spike trains. The use of the
Haar is well-suited for the spike trains because the wavelet coefﬁcient at each level
3' of the wavelet tree can be related to the ﬁring rate computed at a certain bin
width Zj . The Haar basis approximation u[n] and detail v[n] functions are shown

in Figs.3.1, and 3.2.

 

05* i

 

 

Amplitude

—0.5*

 

 

.- L

-1 -0.5 0 0.5 1 1.5 2
Time

 

Figure 3.1. The Haar basis approximation function u[n].

57

 

0.5*

 

 

8

a

3 c

S
-o.5»

 

 

 

 

 

11 -o.5 o 0.5 1 1.5 2
Time

Figure 3.2. The Haar basis detail function v[n].

At any level j, the details coefﬁcients d[n] of a signal s[n] in the wavelet
transform are obtained as din] = $(v[2j(r — n)]s[r]), and the approximations
coefficients a[n] by a[n] = Eli-(2423.0 — n)]s[r]). The Haar basis functions have
compact support, and the constant absolute magnitude within the support does
not scale the data as in the case of any higher order basis function. With the
wavelet functions for the Haar basis, the details coefﬁcient d[n] is simply the
change in the ﬁring rate at the nth bin of size 2j_1. The approximation coefﬁcient

a[n] is the ﬁring rate at the nth of size 2j .

A wavelet representation of the neural spike trains to characterize processes
as Poisson or non-Poisson has been reported by Cao[29] for decoding neural
spike trains in neural prosthetic devices. However, our purpose here is to have a
convenient representation for the neural spike train ﬁring rate for inferring clusters

of neurons with correlated activity.

58

Figure 3.3. The Wavelet Tree representation to 5 levels.

The wavelet transformation at level j can be represented as a lumped matrix
W0) E iffﬂ x N where N is the length of the spike train being decomposed[21]. We
initially aszshme the spike trains to be binned to a very small bin width 6T where
within 6T no more than one event can occur. The spike train expressing the neural
activity during a duration N 6T will be a vector of length N. Therefore, the wavelet

details of the mth spike train sm at level j can be computed as in Eq. (3.2).

53m = W(j)sm (3.2)

where S‘' 6 ER is the representation of sJ at node j.
m 97.. x1 m

59

The representation of the M spike trains S E ERNX M at node j of the wavelet

tree is given by Sj 6 RN xM (Eq. (3.3)).
2.7

sj = SW“) = [9'1 s1; 55M] (3.3)

We can assume the spike train vector at bin width 6T to be a noisy estimate
of its underlying ﬁring rate fm = [fm[1] fm [2] fm [N ]]T sampled at intervals of
6T. Hence, for a neuron m with ﬁring rate given by fm E 9?le (sampled at N

intervals 6T time apart), the spike train data can be represented as in Eq. (3.4).

where qm is expresses a random process associated with the uncertainty due to
using a single realization of the random process (spike train Sm) to estimate the
parameter (ﬁring rate fm). Note that we chose 6T small enough so that not more

than a single spike can occur in one bin (due to the refractory period). This implies

We can derive some properties of the ”noise” term qm. At a particular instance
of the discrete time, p, the probability of observing n spikes due to its Poisson

nature, is given by Eq. (3.5).

60

._. (fmipiérwe-(fmipiazr)

PlSmlpl = ni n,

 

(3.5)

We thus observe that qm[p], at any time instance 19 is given by Eq. (3.6).
Writing out its probability distribution function(Eq. (3.7)), we observe that it is a
zero mean, non-stationary, temporally white process which is uncorrelated across

neurons(Eq. (3.8)).

Qmipl : Smlpl " fmlPl5T (3-6)

_ (fm[p]6T)(m+fm[Pl5T)e-(fm[p]6T)

 

Pllepl = ml (m + fmlplciT)! ’ (3-7)
Vm -— fm[p]6T E {..., —-1,0, 1,2, ...}
ElQmiPll = 0
Ei(qmlpl)zl = fmlplé‘T —<fmip16T)2 (3.8)

Elmnlpilqimlpzll = 0

13(qu [PlQm2 [Pl] = 0

The property of the noise process being white across neurons is because we had
assumed neurons in the same group to have related ﬁring rates but no second
probabilistic dependence. This ma or may not hold in practice depending on the

biological mechanism producing the correlated ﬁring activity. Note that this ap-

61

proximation (Eq. (3.4)) of the binned spike train eases further analysis. We assume
each of the M neurons as belonging to any one of the K groups (51 $2 Sk ...SK)
with some related ﬁring rate. We assume the underlying basic ﬁring rate function
of the kth group of neurons to be fk. Within group 5k it is assumed that mk

neuronal ﬁring rates are linearly related by

K
k=1

where amk are coefﬁcients describing the degree of belongingness of neurons to the

different groups with amk = OVm ¢ SR, and amk 74 OVm E Sk-

The wavelet representation at level j of the spike train can thus be expressed as

sin = amkW0)fk6T + W(5)qm, Vm e (8),) (3.10)

To infer relations between neural spike trains, a classical technique is to compute
the correlation of the estimated ﬁring rates at a certain bin width[30]. By taking
the discrete wavelet transform of the spike trains the multiple resolutions obtained
correspond to multiple bin widths. We now compute the sample cross-correlation
between different neurons at the wavelet subband j as

. . . N . .
Cfﬁi,m2 :- %5Jm1T5]m2 : % Z SJmlinlsfnglnl
=0

2 E [3%, [1113152 [71]] (3.11)

62

The complete correlation matrix at level j, CO) 6 ER Mx M will have as its

(3')

elements cm1 1m2 .

If we assume the underlying basic ﬁring rates fk, k = 1, - - - K of the different
neural groups (SI - - - SK) to be independent, we can derive the following relations

on the expectation of the correlation.

_ 1 . T .
E [6913712] = FE [3,1111 SJmZ]
1 . . T ° '
= NE [(amlkW(J)fk6T + W(J)qm1) (amsz0)fk5T + W0)Qm2)]
T

K K
1 ' ' 2
Z N Zamrkwmfk ZWU)fkam2k (6T)
k=1 k=1
= 0 le E Sk1,m2 E Sk2,k1 # kg

1

. T . -
= NamlkamgkfkTWO) W(J)fk(6T)2 = amlkamgkefc vm11m2 E Sk(3-12)

- . T .
where 6']: = %fkTW(-l) W(~l)fk((5T)2 represents the energy of the kth groups

ﬁring rate fk at time scale 3'.

This is interpreted as the expectation of the correlation between neurons ml

63

and m2 at any time scale j being zero if the neurons do not have correlated ﬁring
Within that chorce of bin w1dth. The expectation of the correlatlon is am1 kamzkek,
with 6]: being a scaling coefficient describing the spectral content of the ﬁring rate
of the neural group k at time scale j, and amk, being some unknown relations on

the ﬁring rates with amk 7e 0 describing the membership of neuron m in the group k.

 

 

 

 

 

 

Subband 4 Subband 5 Subband 6
12° , “mm-.7791) 9., - ... . .. »
10011: 5-57. 13‘ 3 i
E 80:. i t
E 601:: " > e '. 1 g: i '1
4o; _ '2- ; ‘ EH ’1" t
20:1: . . ' w i:
ﬁmwmxmr ,0" '\ ‘ll “3"?! 'IHM' l l

 

 

 

 

    

 

 

 

4‘ 'u‘ ,, ii . "1" -.‘ 3. .
ﬁgm Imeﬁmh "14$ Wm wean-‘1: ‘mﬂMﬁTﬁ!
405030100120 204060n80100120 2040m80100120
Neuron Nouro Neuron

Figure 3.4. Correlations in different subbands.

64

3.4 A Principal Components Approach for

Multi—scale Clustering

Eq. (3.11) hints that some clusters can be identiﬁed in certain time scales with
different strengths 6%. Correlated neural activity at time scale 3' appears as
non zero correlation among the neurons at that time scale. Consider a cluster
correlation A(k) E me M for every cluster k = 1,... K with the m1,m2th

element being A(k)[m1,m2] = am1 kamz k- The cluster correlations A(k) can be

K
thought of as the essential features making up the correlation CO) = Z A0045";c
at any arbitrary time scale 3'. Therefore, Principal Component Analysifgll] should
resolve the cluster energies in the ﬁrst few dominant principal components, with the
noise energy distributed among the smaller principal components. The correlation
proﬁles in different principal components are given in Fig.3.5. Figures in this
thesis appear in color. The dominant principal components express the energy of
the correlated ﬁring rates in them, while the independent noise occupy the less
dominant principal components. The eigenvalues associated with the principal
components are tabulated in Table 3.1. The ﬁrst four principal components capture
75.5% of the energy, and we assume these to capture the required information
for all the clusters. We form a fused matrix from the principal components to be

used for clustering the neurons. The entries of this matrix D are calculated as in

Eq. (3.13) using the 4 most dominant principal components.

65

Table 3.1. Eigenvalues associated with the Principal Components

 

Principal Component Eigenvalue
1 0.2537
0.1926
0.1739
0.1349
0.1136
0.0598
0.0326
0.0226
0.0162

 

 

 

 

QDOOKIOBCJWiD-CON

 

4
dm1,m2 = Z laiiiimglAm (3'13)
. p=1

(p)

where cm m is the correlation between the neurons m and m in rincipal com-
11 2 1 2 p

h

ponent p, and A(p) is the eigenvalue associated with the pt principal component.

This resulting matrix has a proﬁle plotted in F ig.3.6. As we can see, all the four

natural clusters are well represented and distinctly visible in this matrix.

3.5 Graph Theoretic Clustering

A graph theoretic clustering approach[32] is needed in our application because

the neurons to be clustered are related through pair-wise similarity measures

66

       
    

.11. ...-a ”an? :23.

Principal Component 1 Principal Component 2 Principal Component 3

. “i -_I I' s {I

1 1
l

 

I.—

r111 l

21.-.... _ _._J

.. ..., 5 4
20 40 80 no 100120 20 40 60 80 100120 20 40 60 80 100120
Neuron Neuron Neuron

Figure 3.5. Principal Components of correlations.

67

rather than a low dimensional vector associated with each neuron as required
in vector clustering algorithms[33]. The true clusters of neurons are plotted in
the space of two components resolved from the edge weight matrix of Eq. (3.13)
are plotted in Fig.3.7. While our choice of the edge weights computed from the
principal components of the correlation matrices in different subbands offers a
good representation of the clusters, the data cannot be satisfactorily clustered with

any of the classical “vector data” algorithms like k—means or k-Nearest Neighbors.

The neural spike trains are represented as the vertices of an undirected graph.
The weight, or capacity of an edge connecting any two neurons m1 and m2 is a
measure of correlation among the neurons. The weights of the edges are assigned

from the D matrix, with the edge weight between neurons m1 and m2 given by

Eq. (3.13).

We use a minimum cut type of algorithm on our graph representation to
separate the neurons in clusters. Probabilistic clustering, with “sof ” member-
ship functions to characterize the membership of a neuron in a cluster is used.
Probabilistic clustering is a technique to reduce the computational complexity
of an otherwise NP-hard clustering problem[34]. We wish to ﬁnd clusters that
have maximum edge weights between neurons within a cluster, and minimum
edge weights between neurons in two different clusters, or across a cut. With the

pm”; denoting the membership of neuron ml in cluster k, we form an objective

68

function[35] described by Eq. (3.14). We note that intuitively, we want pch to

have a high value if amk is statistically non-zero.

MM

2 Z Pi,kpj,kdz',j

K i=13=1
i: Z (3.14)

 

Maximizing the objective function 1 attempts to minimize the edge weights
(which are measures of correlated ﬁring) across cuts separating the clusters. Finally,
crisp decisions on the cluster memberships are calculated by assigning the neuron

to the cluster in which it has the highest probabilistic membership.

3.6 Clustering Results

To compare the performance of our probabilistic graph clustering with traditional
methods, we tried to use the ﬁrst few resolved components from the fused matrix
D to perform clustering with the k-means algorithm. As we can see in Fig.3.7,
neural groups 1 and 3 are well mixed, and can not be linearly separated in the ﬁrst
two component space. Fig.3.8 shows a k-means result for the ﬁrst two principal
components of the fused matrix. As expected, this gives quite a signiﬁcant clus-

tering error of 32.5% We deﬁne the clustering error as the number of incorrectly

69

Neuron

120

100

80

60

4o

20

 

4 Principal Components

(Ii.

- «1.5

 

 

 

20 40 60 80 100120
Neuron

Figure 3.6. Fused proﬁle of the ﬁrst 4 principal components.

70

2.5 w r

 

 

 

 

 

 

 

 

0 Ar Cluster 1
2 _ 0 O O Cluster 2 H
O o O g 06) o 0 U Cluster 3
O O 0 O A Cluster 4
N 1.5“ O 00 no ‘
"" O
C
a) O O 000 .59; 3%“
g 1 l. g '* A.“ A A w .1
Q D O “A A
E [:1 :D at ”E El Dalia/3} 1 as
8 EU ﬁ%‘*ﬂ A
0 5 - 1:1 Cl I D a -
D #0 Cl
D are are * *
:1 D *
Oi .
13
_0.5 l 1 r 1
—2 —1.5 -1 —0.5 0 0.5

Component 1

Figure 3.7. The true clusters depicted in the ﬁrst two components resolved from
the fused matrix D.

71

Table 3.2. Clustering Errors obtained in the k-means algorithm

 

 

No. of Principal Components Clustering Error
2 33.21%
3 27.49%
4 14.61%
5 16.87%
6 16.93%
7 17.57%
15 16.41%

 

 

 

 

classiﬁed neurons as a fraction of the total number of neurons. Since the results
of the k-means algorithm depend on the initial guess of cluster centers, we run
the k-means in 1000 iterations and compute the mean error. Table 3.2 shows the
mean clustering errors for the k-means algorithms applied to different numbers of
principal components. As we see, the clustering error does not decrease uniformly
by increasing the number of principal components. Though increasing the number
of principal components captures more useful information, it also adds statistical
noise to the data. This shows that the performance of the k—means algorithm is
limited due to the inherent nature of the pair-wise similarity measures in our prob—

lem, which lends itself better to a graph theoretic representation.

Probabilistic clustering with the edge weights of Eq. (3.13) gave a clustering

error of 2.5% z'.e.,only 3 neurons out of the 120 being classiﬁed in the wrong cluster.

72

2.5

 

 

 

 

 

 

O 1. Cluster 1
2 _ 0 O O Cluster 2
O o O 806) o O D Cluster 3
1 5 O 8 go 0 A Cluster 4
N - b A
E C] O . 1 X1”; [Tic
a:J DB ”L. 1.35"“. ’ 451,0.
8 1 h D are ALA /.
E 3 SE C] D D *gie; ; ‘E
8 05’ 0633?] DD *1#*’21:* H
DD C] a?“ *
1:1 ,.. #91;
0 F E]
O.§2 -1.5 —1 -O.5 0 0.5
Component 1

Figure 3.8. Result of the k-means algorithm applied to the ﬁrst two principal
components of the fused matrix.

73

3.7 Conclusions

We have demonstrated the use of a novel multi-scale clustering algorithm for par-
titioning a large set of neural spike trains into groups with correlated ﬁring. A
multi-scale or wavelet domain representation of the neural spike train eliminates
the need of ﬁxing a bin width and assuming stationarity, which are the limita-
tions of traditional time domain and frequency domain methods respectively. We
have demonstrated that the probabilistic graph theoretic clustering approach out-
performs traditional algorithms like k-means for our application. We believe that
this algorithm can go a long way in enabling neuroscientists to analyze large scale
neural recordings and give a better systems understanding of neural activities. In
future, we need to test this approach on data sets generated from a more biologically

faithful representation of neural circuits or on real large scale data sets.

74

BIBLIOGRAPHY

BIBLIOGRAPHY

[1] K.D Wise, D.J. Anderson, J.F. Hetke, D.R. Kipke, and K. Najaﬁ. Wireless
implantable microsystems: High-density electronic interfaces to the nervous
system. In Proc. of the IEEE, volume 92, pages 76—97, 2004.

[2] FA. Edwards, A. Konnerth, B. Sakmann, and T. Takahashi. A thin slice
preparatiOn for patch clamp recordings from neurones of the mammalian

central nervous system. Pﬁugers Archive: European Journal of Physiology,
414(5):600—612, September 1989.

[3] J .C. Chang, G.J. Brewer, and BC. Wheeler. Microelectrode array recordings
of patterned hippocampal neurons for four weeks. Biomedical Microdevices,
2(4):245—253, December 2000.

[4] M. Ghovanloo and K. Najaﬁ. A wideband frequency-shift keying wireless ink
for inductively powered biomedical implants. IEEE Transactions on Circuits
and Systems I, 51(12):2374—2383, December 2004.

[5] KC. Oweiss, D.J. Anderson, and M.M. Papaefthymiou. Optimizing signal
coding in neural interface system-on-a-chip modules. In Proceedings of the
25th IEEE Conference of EMBS, pages 2016-2019, September 2003.

[6] S. Mallat. A Wavelet Tour of Signal Processing. Academic Publishers, 2
edition, 1999.

[7] K. G. Oweiss and D. J. Anderson. A multiresolution generalized maximum
likelihood approach for the detection of unknown transient multichannel signals
in colored noise with unknown covariance. In Proceedings of the I CASSP ’2002,
volume 3, pages 2993—2996, May 2002.

[8] KC. Oweiss and DJ. Anderson. Noise reduction in multichannel neural

recordings using a new array wavelet denoising algorithm. Neurocomputing,
38-4021687—1693, July 2001.

[9] KC. Oweiss and DJ. Anderson. Massit-multiresolution analysis of signal sub-
space invariance technique: A novel algorithm for blind source separation. In
Pmc. of the IEEE 35th Asilomar Conference on Signals, Systems and Com-
puters, pages 819—823, November 2001.

76

[10]

[11]

Ml

[13l

l14l

[15]

[16]

[17]

[18]

[19]

[20]

[21]

S.G. Mallat. A theory for multiresolution signal decomposition: the wavelet
representation. IEEE Transactions on Pattern Analysis and Machine Intelli-
gence, 11(7):674—693, July 1989.

I. Daubechies and W. Swedens. Factoring wavelet transform into lifting steps.
Journal of Fourier Anaysis Apllications, 41(3):247—269, 1998.

W. Sweldens, A.R. Calderbank, I. Daubechies, and BL. Yeo. Wavelet trans-
forms that map integers to integers. Applied Computational Harmonic Analy-
sis, 5(3):332-369, 1998.

K. G. Oweiss and D. J. Anderson. A uniﬁed framework for advancing ar-
ray signal processing technology of multichannel microprobe neural recording
devices. In Proceedings of the 2nd IEEE Conference on Microtechnology in
Medicine and Biology, pages 245—250, May 2002.

David L. Donoho and Iain M. Johnstone. Ideal spatial adaptation by wavelet
shrinkage. Biometrika, 81(3):425—455, 1994.

Michael S. Lewicki. A review of methods for spike sorting: the detection
and classiﬁcation of neura action potentials. Network: Computational Neura

Systems, 9:R53—R78, 1998.

Karim Oweiss, Andrew Mason, Kyle Thomson, Jian Li, and Yasir Suhail. A
scalable wavelet transform vlsi architecture for real-time neural signal condi-
tioning in implantable multichannel neuroprosthetic devices. Under review,
for IEEE Transactions on Circuits and Systems.

Y. Suhail and KC. Oweiss. A reduced complexity integer lifting wavelet-
based module for real-time processing in implantable neural interface devices.
In Conference Proceedingsof the 26th Annual International Conference of the
Engineering in Medicine and Biology Society, pages 4552 — 4555, Sept. 2004.

K. Oweiss, K. Thomson, , and D. Anderson. A systems approach for real-
time data compression in advanced brain-machine interfaces. In Conference
Proceedings of the 2nd International IEEE EMBS Conference on Neural En-
gineering, 2005.

Z. Nenadic and J. W. Burdick. Spike detection using the continuous wavelet
transform. IEEE Transactions on Biomedical Engineering, 52(1):74—87, Jan-
uary 2005.

K. G. Oweiss. Multiresolution Analysis of Multichannel Neural Recordings in
the Context of Signal Detection, Estimation, Classiﬁcation and Noise Suppres-
sion. PhD thesis, University of Michigan, Ann Arbor, 2002.

Michael W. Frazier. An Introduction to Wavelets Through Linear Algebra.
Springer-Verlag, 1999.

77

[22] Paul M. Gochin, Michael Colombo, A. Gerald Dorfman., George L. Gerstein,
and Charles G. Gross. Neural ensemble coding in inferior temporal cortex.
Journal of Neurophysiogy, 71(6):2325—2327, June 1994.

[23] Mark Laubach, Johan Wessberg, and Miguel A. L. Nicolelis. Cortical ensemble
activity increasingly predicts behaviour outcomes during learning of a motor
task. Nature, 405:567—571, June 2000.

[24] H Eichenbaum, SI Wiener, ML Shapiro, and NJ Cohen. The organization
of spatial coding in the hippocampus: a study of neural ensemble activity.
Journal of Neuroscience, 9:2764—2775, August 1989.

[25] Emery N. Brown, Robert E. Kass, and Partha P. Mitra. Multiple neural
spike train data analysis: state-of-the—art and future challenges. Nature Neu-
roscience, 7(5):456—461, May 2004.

[26] Z. H. Perkel, G. L. Gerstein, and G. P. Moore. Neuronal spike trains and
stochastic point processes. ii. simultaneous spike trains. Biophysics Journal,
7(4):419—40, July 1967.

[27] M. R. Jarvis and Mitra P. P. Sampling properties of the spectrum and co-
herency of sequences of action potentials. Neural Computation, 13:717-749,
2001.

[28] ES. Chornoby, L.P. Schramm, and A. F. Karr. Maximum likelihood estimation
of neural point processes. Biological Cybernetics, 59:265—275, 1988.

[29] Shiyan Cao. Spike Train Characterization and Decoding for Neural Prosthetic
Devices. PhD thesis, California Institute of Technology, 2003.

[30] Y. Hata, T. T sumoto, H. Sato, and H. Tamura. Horizontal interactions between

visual cortical neurones studied by cross-correlation analysis in the cat. Journal
of Physiology, 441:593—614, September 1991.

[31] IT. Jolliffe. Principal Component Analysis. Springer, 2 edition, October 2002.

[32] D. W. Matula. Classiﬁcation and Clustering, chapter Graph theoretic tech-
niques for cluster anaysis algorithms, pages 95—129. Academic Press, 1977.

[33] J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan
Kaufmann, 2001.

[34] A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: Analysis and an
algorithm. Advances in Neural Information Processing Systems, 14, 2001.

[35] K. Oweiss, R. Jin, and Y. Suhail. Scale-space processing and clustering for ef-
ﬁcient multi-electrode data analysis of large-size neuronal ensebles. In Confer-
ence Proceedings of the 2nd International IEEE EMBS Conference on Neural
Engineering, 2005.

‘Q
00