This is to certify that the

thesis entitled

ON-LINE BLIND SIGNAL SEPARATION
OF SPEECH SOURCES

presented by
Walter Andres Zuluaga
has been accepted towards fulfillment

of the requirements for

Master's degree in Electrical Eng


ON-LINE BLIND SIGNAL SEPARATION TO SPEECH SOURCES
BY

Walter Andrés Zuluaga

A THESIS
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
MASTER OF SCIENCE

Department of Engineering

1998

ABSTRACT
ON-LINE BLIND SIGNAL SEPARATION APPLIED TO SPEECH SOURCES
By

Walter Andrés Zuluaga

Over the past decade, a great amount of literature on Independent Component
Analysis (ICA) and blind signal separation has been reported. The ICA concept
pertains to separating mixtures of signals that have been mixed algebraically by
a constant, but unknown, matrix. There are many areas where the ICA concept
may potentially be applied, e.g. separation of signals in communications, medical
diagnosis, and speech processing. The objective of this thesis is to investigate
and evaluate the real-time implementation of some of the algorithms found in the
literature, and thus assess the feasibility of using them in on-line applications.
A DSP system is used to implement and execute the algorithms in an on-line
environment.

The signals to be separated are assumed to be in the human vocal bandwidth
range. A pair of mixed signals is considered to be the input to the system. The
mixed signals are sampled by the DSP system and processed according to the
algorithms. The generated pair of outputs is taken to be an estimate of the
original unmixed signals. The implemented algorithms vary depending on the
functions used to extract the high-order statistics, and on the functions used to
govern the learning rate of the learning rule. A performance measure for comparing
the various algorithms is proposed in this investigation.

To my exceptional parents.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
1 Background Theory
1.1 Blind Signal Separation
1.2 Signal Separation
1.2.1 Independent Component Analysis
1.2.2 Algorithms and learning rules
1.2.2.1 An algorithm by Cardoso
1.2.2.2 An algorithm by Amari
2 Implementation
2.1 Off-line
2.2 Real Time
2.2.1 DSP Environment
2.2.1.1 DSP
2.2.1.2 Interface
2.2.1.3 Signal Acquisition
2.2.2 Program and Algorithms
2.2.2.1 PC program (host)
2.2.2.2 DSP program
2.2.2.2.1 Program flow
2.2.2.2.2 Considerations in the program design
2.2.2.2.3 Implemented Algorithms
3 Tests and Results
4 Conclusions and Recommendations
APPENDIX A
REFERENCES

LIST OF TABLES

Table 1. Average of weight matrices for identity mixing channel.
Table 2. Algorithm performance for identity mixing matrix.
Table 3. Average of weight matrices for an arbitrary mixing channel.
Table 4. Algorithm performance for mixing matrix A.

LIST OF FIGURES

Figure 1. Model of the system.
Figure 2. Input signals.
Figure 3. Mixed signals.
Figure 4. Estimated signal for s1(t).
Figure 5. Estimated signal for s2(t).
Figure 6. Influence of silence periods.
Figure 7. Program flowchart.
Figure 8. Sampling and program length.
Figure 9. Learning rate.
Figure 10. Original signals.
Figure 11. Mixed signals.
Figure 12. Estimated signals.
Figure 13. Additional input to solve the invertibility problem.

INTRODUCTION

Technologies allow users to interact with electronic devices by means of their
voice, either as a source of information to be transmitted or as the means to
control equipment. This can be seen, for example, in the increased use of
cellular phones and in the control of a computer by voice. These kinds of
applications require an improvement in the processing of speech signals, by
isolating the voice from the surrounding environment. It is not sufficient to
reduce the background noise; it is also necessary to remove unwanted voice
signals that might share similar characteristics. The final objective is to perform
speech separation in voice-controlled systems, isolating the controlling signal,
or to obtain better quality in the transmission of voice.

This type of application requires on-line processing, in order to perform the
separation at the same time the signals are produced. As the user generates the
speech signal, it must be processed without significant delay. In some other
applications this may not be necessary, e.g. the separation of EEG signals, but
for the present scenarios an immediate response is necessary.

The present thesis seeks to implement, in a real-time environment, various
learning algorithms proposed in the literature [5][12] for blind signal separation
by means of linear neural networks. Their feasibility is tested so that they
can be applied in the future to the devices mentioned above, focusing on speech
separation but not limited to this field.

The thesis is organized as follows. The first chapter covers the theoretical
background, stating the problem to be addressed and developing the equations
and learning rules to be applied, based on the work of Amari et al. [5] and
of Cardoso [12]. The selection of the different parameters of the algorithm is
also studied, namely the functions applied to the input signals and the
learning rate functions to be used. The second chapter describes the implementation
of the algorithm in different environments, its design and how it works,
emphasizing the problems encountered in making it work adequately in real time. A
description of the DSP environment is also included. The third chapter covers the
tests executed to study the performance of the different algorithms, and a
measure of the intersymbol interference is provided in order to evaluate the
quality of the weight matrix obtained. Finally, the last chapter contains
conclusions and recommendations drawn from the results obtained.

1 Background Theory

1.1 Blind Signal Separation

Blind signal separation consists of separating statistically independent or nearly
independent signals that have been mixed by an unknown medium about which
there is no prior information. It is applied in different areas such as array
processing, communications (e.g. multipath propagation), medical signal
processing (e.g. isolating artifacts from heart and brain signals), and speech
processing [10].

Consider a system with n statistically independent sources $s_i(t)$, which by means
of a mixing medium represented by a mixing matrix $A$, produces m signals $x_i(t)$.
The separation of the signals is made by adaptively calculating a separating
matrix $W(t)$, without any a priori information about the mixing matrix, obtaining an
estimate $y_i(t)$ of the input signals.

The block diagram of this system is as follows:

Figure 1. Model of the system.
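
The model can be illustrated numerically. The following is a minimal sketch, assuming two sources and a 2x2 mixing matrix whose values are chosen only for illustration; the helper names are hypothetical. In the blind setting the matrix A is not available, so the ideal separating matrix computed here by inversion is exactly what the adaptive algorithm has to approximate.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inverse_2x2(m):
    """Invert a 2x2 matrix; raises ZeroDivisionError if it is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical mixing matrix and one sample of the two sources.
A = [[1.0, 0.5], [0.25, 1.0]]
s = [0.3, -0.7]

x = matvec(A, s)        # observed mixtures, x(t) = A s(t)
W = inverse_2x2(A)      # ideal separating matrix (unknown in practice)
y = matvec(W, x)        # estimates, y(t) = W x(t)
```

With the ideal W the estimates y reproduce the sources s up to rounding; a blind algorithm can at best recover them up to scaling and ordering.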

 

 

 

 

 

 

 

 

 

 

1.2 Signal Separation

1.2.1 Independent Component Analysis

A linear adaptive method, called Principal Component Analysis (PCA), is used to
decompose patterns (in this specific case, speech signals) based on the
covariance among the components of the input signal, thus considering only their
second-order statistics. This kind of decomposition is appropriate for the
separation of Gaussian signals, but since other signals have higher-order
statistics, the PCA approach is not suited for signal separation in more general
applications [9] [11].

For example, in speech signals, the information is contained in both the
amplitude and phase spectra. However, the autocorrelation of a speech signal, a
second-order statistic, only carries information about the amplitude spectrum,
and the rest of the information is discarded. Consequently, for signal separation
it is necessary to include higher-order statistics that carry the missing information
[1].

Independent Component Analysis (ICA) may be viewed as a nonlinear version of
PCA that incorporates high-order statistics. More information is contained in
these statistics, and the signals are assumed to be independent, stationary,
zero-mean stochastic processes. Many different statistics can be used, so the
problem is to choose the most appropriate one for the case considered.

Adaptive separation based on ICA [2] is performed by adjusting the separation
matrix, using the mixed signal vector $x(t)$, according to the equation

$$W(t+1) = W(t) - \mu(t)\, f(W(t), x(t)) \tag{1}$$

where the term $\mu(t)$, known as the adaptation rate, is used for stabilization and
convergence, and $f(W(t), x(t))$ is the function that describes the learning
algorithm. The algorithm must be equivariant [12] in order for the
estimator to be totally independent of the mixing matrix. If the estimator
$\mathcal{A}$ of the mixing matrix is considered as a mapping, equivariance is
satisfied if

$$\mathcal{A}(MX) = M\,\mathcal{A}(X) \tag{2}$$

with $M$ an invertible matrix and $X$ a data set. Once this condition is satisfied, the
performance of the algorithm does not depend on the mixing matrix, but only on
the present value of the separation matrix and the input data [2].

Consider the following updating algorithm:

$$W(t+1) = W(t) - \mu(t)\, H(y(t))\, W(t) \tag{3}$$

If the global system is represented by $C(t) = W(t)A$, then right-multiplying (3) by $A$,
and using $y(t) = W(t)x(t) = W(t)A\,s(t) = C(t)s(t)$, we obtain:

$$W(t+1)A = W(t)A - \mu(t)\, H(y(t))\, W(t)A$$
$$C(t+1) = C(t) - \mu(t)\, H(y(t))\, C(t)$$
$$C(t+1) = C(t) - \mu(t)\, H(W(t)A\,s(t))\, C(t)$$
$$C(t+1) = C(t) - \mu(t)\, H(C(t)s(t))\, C(t) \tag{4}$$
$$C(t+1) = \{ I - \mu(t)\, H(C(t)s(t)) \}\, C(t) \tag{5}$$

This algorithm (5) is equivariant for the global system: its evolution depends only on
$C(t)$ and the sources, and therefore does not rely on the mixing matrix itself. This
kind of algorithm is known as serial updating [12].
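
The equivariance of the serial update can be checked numerically. The following is a minimal sketch, assuming a 2x2 system and an illustrative choice H(y) = f(y) y^T - I with a cubic f; the mixing matrices, source samples and function names are hypothetical. Two different mixing matrices are started from the same global system C(0), and the global system after a few updates is compared.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def h_cubic(y):
    """Illustrative H(y) = f(y) y^T - I with f(y_i) = y_i**3 (an assumption)."""
    n = len(y)
    return [[y[i] ** 3 * y[j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def serial_update(w, x, mu, h):
    """One step of equation (3): W <- W - mu * H(y) W, with y = W x."""
    y = matvec(w, x)
    hw = matmul(h(y), w)
    return [[w[i][j] - mu * hw[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


def global_system_after(a, c0, sources, mu, h):
    """Run the update on x = A s from W = C0 A^-1 and return C = W A."""
    w = matmul(c0, inverse_2x2(a))
    for s in sources:
        w = serial_update(w, matvec(a, s), mu, h)
    return matmul(w, a)


C0 = [[1.0, 0.2], [-0.1, 0.9]]
SOURCES = [[0.5, -1.0], [1.0, 0.2], [-0.3, 0.8]]
A1 = [[1.0, 0.5], [0.25, 1.0]]
A2 = [[2.0, -1.0], [0.5, 3.0]]

c_a1 = global_system_after(A1, C0, SOURCES, 0.05, h_cubic)
c_a2 = global_system_after(A2, C0, SOURCES, 0.05, h_cubic)
```

Up to rounding, c_a1 and c_a2 coincide, which is the content of equation (5): the trajectory of the global system is determined by its starting point and the sources, not by the particular mixing matrix.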

Based on this update law, two different classes of $H(y(t))$ can be considered,
according to the form of this function [2]: symmetric and non-symmetric.

- Symmetric form. Two approaches to obtain the function are
considered. The first one consists of taking a stochastic approach.

In order to find the function $H(y)$, it is necessary to assume that the
sources are independent, that is, the mutual information is 0. The algorithm then
solves for the value of $W$ under the constraint

$$E\, H(y(t)) = 0 \tag{6}$$

hence obtaining a stationary point. It is necessary to minimize an objective
function that, according to Cardoso [12], takes the form

$$c(B) \triangleq E\,\phi(y) = E\,\phi(Bx) \tag{7}$$

with $\phi$ a differentiable function. Expanding $\phi(y + \mathcal{E}y)$ we obtain

$$c((I + \mathcal{E})B) = c(B) + E\,\phi'(y)^T \mathcal{E} y + o(\mathcal{E}) \tag{8}$$

Minimizing $c(B)$ is accomplished by choosing $\mathcal{E}$ proportional to

$$-E\,\phi'(y)\, y^T \tag{9}$$

Finally, the function is obtained when the expectation in equation (9) is dropped.
Hence, the resulting equation uses the function

$$H(y(t)) = \phi'(y)\, y^T \tag{10}$$

The other possibility comes from a statistical consideration.
Assuming a differentiable probability distribution $p_i$ for each of the signals $s_i$, the
function can be formed as

$$H(y(t))_{ij} = \psi_i(y_i)\, y_j - \delta_{ij} \tag{11}$$

where

$$\psi_i = -\frac{p_i'}{p_i} \tag{12}$$

With this kind of function, the previous update rule yields the maximum likelihood
estimate.

- Non-symmetric functions. These functions originate from using high-order
statistics and optimizing them, since they contain information
related to the independence between the signals. Although high-order
statistics are in general difficult and computationally expensive to calculate, a
high-order quantity such as the kurtosis (a fourth-order cumulant)
is easily implemented and yields good performance in signal
separation. This process is usually composed of two stages, the
whitening stage and the orthogonal stage [11] [12] [4].

- Whitening. This stage preprocesses the signals in order to obtain
new signals with a covariance matrix approaching the identity. The
corresponding update law, see Cardoso [12], is developed as
follows:

The objective function to be minimized, with $R_z = E\, zz^T$ as the covariance, is

$$\Psi(S) \triangleq \mathrm{Trace}(R_z) - \log\det(R_z) - n \tag{13}$$

which provides the following expansion:

$$\Psi((I + \mathcal{E})S) = \Psi(S) + 2\,\mathrm{Trace}\big((E\, zz^T - I)\,\mathcal{E}\big) + o(\mathcal{E}) \tag{14}$$

Subsequently, the update rule for $S$ (the whitening or sphering matrix) is

$$S(t+1) = \{ I - \lambda\,( z(t)z(t)^T - I ) \}\, S(t) \tag{15}$$

- Orthogonalization. This stage optimizes the fourth-order cumulant.
It minimizes the quantity

$$\sum_{i=1}^{n} E\,|y_i|^4 \tag{16}$$

given that the covariance matrix of the input signals is equal to the identity, which is
already ensured by the whitening stage. The update law for the
orthogonalization matrix $U$ is then computed as follows:

Let $c(U)$ be the objective function:

$$c(U) \triangleq E\,\phi(y) = E \sum_{i} |y_i|^4 \tag{17}$$

Applying the same expansion as in (8), one gets

$$c((I + \mathcal{E})U) = c(U) + E\,\phi'(y)^T \mathcal{E} y + o(\mathcal{E}) \tag{18}$$

This expansion has to be restricted slightly in order to satisfy the unitary constraint
on $U$ (i.e. $UU^T = I$), so it is necessary for $\mathcal{E}$ to be skew-symmetric
(i.e. $\mathcal{E} = -\mathcal{E}^T$). Choosing $\mathcal{E}$ proportional to

$$-E\,[\,\phi'(y)\, y^T - y\, \phi'(y)^T\,] \tag{19}$$

the resulting function is

$$H(y(t)) = \phi'(y)\, y^T - y\, \phi'(y)^T \tag{20}$$

and its corresponding update rule is

$$U(t+1) = \{ I - \lambda\,[\,\phi'(y)\, y^T - y\, \phi'(y)^T\,] \}\, U(t) \tag{21}$$

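Combining both update rules, (15) and (21), into one:

$$W(t) = U(t)\,S(t) \tag{22}$$

$$S(t+1) = [\, I - \lambda\,(zz^T - I)\,]\, S(t) \tag{23}$$

$$U(t+1) = \{ I - \lambda\,[\,\phi'(y)y^T - y\phi'(y)^T\,] \}\, U(t) \tag{24}$$

$$W(t+1) = U(t+1)\,S(t+1)$$

$$W(t+1) = \{ I - \lambda\,[\,\phi'(y)y^T - y\phi'(y)^T\,] \}\, U(t)\, [\, I - \lambda\,(zz^T - I)\,]\, S(t) \tag{25}$$

$$W(t+1) = \{ I - \lambda\,[\,\phi'(y)y^T - y\phi'(y)^T\,] \}\, \{ U(t) - \lambda\, U(t)(zz^T - I) \}\, S(t) \tag{26}$$

$$W(t+1) = \{ U - \lambda\,[\, U(zz^T - I) + [\,\phi'(y)y^T - y\phi'(y)^T\,]\,U \,] + \lambda^2 [\,\phi'(y)y^T - y\phi'(y)^T\,]\, U (zz^T - I) \}\, S \tag{27}$$

Disregarding the term with $\lambda^2$:

$$W(t+1) = \{ U - \lambda\,[\, (Uzz^T - U) + [\,\phi'(y)y^T - y\phi'(y)^T\,]\,U \,] \}\, S \tag{28}$$

Since $U^T U = I$:

$$W(t+1) = \{ U - \lambda\,[\, (Uzz^T(U^T U) - U) + [\,\phi'(y)y^T - y\phi'(y)^T\,]\,U \,] \}\, S \tag{29}$$

$$W(t+1) = US - \lambda\,\{\, ((Uz)(z^T U^T)\,US - US) + [\,\phi'(y)y^T - y\phi'(y)^T\,]\,US \,\} \tag{30}$$

With $y = Uz$, $y^T = z^T U^T$ and $W(t) = U(t)S(t)$:

$$W(t+1) = W(t) - \lambda\,\{\, (yy^T - I)\,W(t) + [\,\phi'(y)y^T - y\phi'(y)^T\,]\,W(t) \,\} \tag{31}$$

$$W(t+1) = \{ I - \lambda\,[\, (yy^T - I) + \phi'(y)y^T - y\phi'(y)^T \,] \}\, W(t) \tag{32}$$

The update law is of the form

$$W(t+1) = W(t) - \mu(t)\, H(y(t))\, W(t) \tag{33}$$

where

$$H(y(t)) = yy^T - I + \phi'(y)\, y^T - y\, \phi'(y)^T \tag{34}$$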
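
The step from the two-stage rules (15) and (21) to the single rule (32) can be checked numerically. The following is a minimal sketch, assuming a 2x2 system, a rotation for U, and a cubic nonlinearity standing in for $\phi'$; all numeric values and function names are hypothetical. It compares one two-stage step with one combined step; the two should differ only by the term in $\lambda^2$ that the derivation disregards.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def identity_minus(scale, m):
    """Return I - scale * m for a square matrix m."""
    n = len(m)
    return [[(1.0 if i == j else 0.0) - scale * m[i][j] for j in range(n)]
            for i in range(n)]


def outer_minus_identity(v):
    """v v^T - I, the whitening term."""
    n = len(v)
    return [[v[i] * v[j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def skew_term(y):
    """g(y) y^T - y g(y)^T with g(y_i) = y_i**3 standing in for phi' (assumption)."""
    g = [v ** 3 for v in y]
    n = len(y)
    return [[g[i] * y[j] - y[i] * g[j] for j in range(n)] for i in range(n)]


def two_stage_step(u, s, x, lam):
    """Update S by (15) and U by (21), then form W = U S."""
    z = matvec(s, x)
    y = matvec(u, z)
    s_next = matmul(identity_minus(lam, outer_minus_identity(z)), s)
    u_next = matmul(identity_minus(lam, skew_term(y)), u)
    return matmul(u_next, s_next)


def combined_step(u, s, x, lam):
    """Update W = U S directly with H(y) from (34)."""
    w = matmul(u, s)
    y = matvec(w, x)
    white, skew = outer_minus_identity(y), skew_term(y)
    n = len(y)
    h = [[white[i][j] + skew[i][j] for j in range(n)] for i in range(n)]
    return matmul(identity_minus(lam, h), w)


def gap(lam):
    """Largest entry-wise difference between the two updates."""
    theta = 0.3
    u = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]
    s = [[1.2, 0.1], [0.0, 0.8]]
    x = [1.0, -0.2]
    a = two_stage_step(u, s, x, lam)
    b = combined_step(u, s, x, lam)
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

Calling gap(0.1) and gap(0.01) shows the difference shrinking by a factor of about 100 when the step size shrinks by 10, as expected for a second-order term.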


1.2.2 Algorithms and learning rules

The following learning algorithms are for a linear feed-forward neural network,
represented by the weight matrix W[t]. The input speech signals are assumed to
be statistically independent.

From the previous section, the update algorithm to be used is

$$W(t+1) = W(t) - \mu(t)\, H(y(t))\, W(t) \tag{33}$$

with different options for the function $H(y(t))$:
- Stochastic approach
$$H(y(t)) = \phi'(y)\, y^T \tag{35}$$
- Maximum likelihood estimate
$$H(y(t))_{ij} = \psi_i(y_i)\, y_j - \delta_{ij}, \quad \text{with } \psi_i = -\frac{p_i'}{p_i} \tag{36}$$
- Optimization of a contrast function
$$H(y(t)) = yy^T - I + \phi'(y)\, y^T - y\, \phi'(y)^T \tag{37}$$

Both Cardoso and Amari et al. use the serial update rule, but the choice of
the function $H(y)$ and of the learning rate function differs between the two, along
with the measure of performance.

1.2.2.1 An algorithm by Cardoso

Cardoso [12] uses the adaptive algorithm described in the previous section:

$$W[k+1] = W[k] - \mu[k]\, H(y[k])\, W[k]$$

The simulation results obtained by Cardoso use a symmetric function with
$\phi(y_i) = |y_i|^2 y_i$ and a constant value for $\mu(t)$ [12]. The performance is
measured by the intersymbol interference: multiplying the resulting separating
matrix $W$ by the mixing matrix $A$ should yield a matrix close to the identity. The
energy of the signal in each of the estimates is given by $|(WA)_{ii}|^2$. Hence, the
relative power of the jth source in the ith signal estimate is $|(WA)_{ij}|^2$.
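
This measure is straightforward to compute once the mixing matrix is known, as it is in a controlled test. The following is a minimal sketch, assuming 2x2 matrices with illustrative values; the per-estimate summary ratio and the function names are hypothetical additions, not a quantity defined in [12].

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def relative_power(w, a):
    """Entry (i, j) is |(WA)_ij|^2: the power of source j in estimate i."""
    return [[abs(v) ** 2 for v in row] for row in matmul(w, a)]


def interference_ratio(power):
    """Per estimate: power of the non-dominant sources over the dominant one.

    Taking the largest entry of each row as the intended source is an
    assumption made here so that a permutation of the outputs is tolerated.
    """
    ratios = []
    for row in power:
        dominant = max(row)
        ratios.append((sum(row) - dominant) / dominant)
    return ratios


A = [[1.0, 0.5], [0.25, 1.0]]          # hypothetical mixing matrix
W_NONE = [[1.0, 0.0], [0.0, 1.0]]      # no separation at all
W_IDEAL = inverse_2x2(A)               # perfect separation

ratio_none = interference_ratio(relative_power(W_NONE, A))
ratio_ideal = interference_ratio(relative_power(W_IDEAL, A))
```

A ratio near zero for every estimate indicates that WA is close to a scaled permutation of the identity, i.e. that the sources have been separated.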

1.2.2.2 An algorithm by Amari

The learning rule proposed by Amari et al. is the following [5]:

$$\frac{dw_{ij}(t)}{dt} = \mu(t) \Big\{ w_{ij}(t) - f_i(y_i(t)) \sum_{p=1}^{n} w_{pj}(t)\, y_p(t) \Big\} \tag{38}$$

Expressed in matrix form, it becomes

$$\frac{dW(t)}{dt} = \mu(t)\, \{ W(t) - f(y(t))\, y^T(t)\, W(t) \} \tag{39}$$

In discrete time, the last equation is given as

$$W[k+1] = W[k] + \mu[k]\, \{ I - f(y)\, y^T \}\, W[k] \tag{40}$$

and the serial update algorithm is obtained.
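
One step of the discrete rule (40) can be written in a few lines. The following is a minimal sketch, assuming a cubic nonlinearity for f and a 2x2 weight matrix; the function names and the numeric values are hypothetical and chosen only so that the result is easy to verify by hand.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def amari_step(w, x, mu, f=lambda v: v ** 3):
    """One step of W[k+1] = W[k] + mu * (I - f(y) y^T) W[k], with y = W[k] x.

    The cubic default for f is an illustrative assumption.
    """
    y = matvec(w, x)
    n = len(y)
    bracket = [[(1.0 if i == j else 0.0) - f(y[i]) * y[j] for j in range(n)]
               for i in range(n)]
    delta = matmul(bracket, w)
    return [[w[i][j] + mu * delta[i][j] for j in range(n)] for i in range(n)]


W0 = [[1.0, 0.0], [0.0, 1.0]]
W1 = amari_step(W0, [1.0, 2.0], 0.1)
```

Starting from the identity with x = (1, 2) and a step of 0.1, the bracket is [[0, -2], [-8, -15]], so W1 is approximately [[1, -0.2], [-0.8, -0.5]]. A step this large is used only to make the arithmetic visible; it is not a recommended learning rate.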
Amari et al. [5] propose that $f(y)$ ($H(y)$ in this text) be typically of the form
$f(y) = y^{2k+1}$, clearly a non-symmetric function, and therefore a stochastic approach
solution. The learning rates used by Amari et al. [5] are of two types:

Standard: The learning rate is a predefined function, for example a decaying
exponential, of the form

$$\mu(t) = \mu_0 e^{-\beta t} \tag{41}$$

With this kind of function, the search is initially executed with large steps to
rapidly find an energy minimum and, as the learning rate is reduced, the search
becomes finer.

Adaptive: The value of the learning rate is established by a set of differential
equations. This set allows the learning rate to increase in case one parameter of
the update rule is driven high. This might happen when the input signals change
after the algorithm has already converged. In this case, the learning rate will have
a large value and the search algorithm will start again with coarse steps.

The proposed set of differential equations for updating this learning rate is:

$$\tau_1 \frac{dv_{ij}(t)}{dt} = -v_{ij}(t) + g_{ij}(t)$$
$$\tau_2 \frac{d\mu_{ij}(t)}{dt} = -\mu_{ij}(t) + \alpha\,|v_{ij}(t)| \tag{41}$$

where $\tau_1, \tau_2 > 0$, $\alpha > 0$ and

$$g_{ij}(t) = w_{ij}(t) - f_i(y_i(t)) \sum_{p=1}^{n} w_{pj}(t)\, y_p(t) \tag{42}$$

In discrete time, one has

$$v_{ij}[k+1] = \big( v_{ij}[k]\,(\tau_1 - 1) + g_{ij}[k] \big) / \tau_1$$
$$\mu_{ij}[k+1] = \big( \mu_{ij}[k]\,(\tau_2 - 1) + \alpha\,|v_{ij}[k]| \big) / \tau_2 \tag{43}$$

When these equations are discretized, the numerical integration method used
may introduce some numerical imprecision in the learning algorithm. The next chapter
discusses this issue.
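
The discrete learning-rate recursion (43) can be sketched for a single matrix entry. The following is a minimal sketch, assuming scalar state for one (i, j) pair and a constant driving term g; the time constants, the gain and the function names are hypothetical values chosen for illustration.

```python
def adaptive_rate_step(v, mu, g, tau1, tau2, alpha):
    """One step of (43) for a single entry (i, j).

    v  : low-pass filtered version of the update term g
    mu : learning rate, which follows alpha * |v| with time constant tau2
    """
    v_next = (v * (tau1 - 1.0) + g) / tau1
    mu_next = (mu * (tau2 - 1.0) + alpha * abs(v)) / tau2
    return v_next, mu_next


def run_constant_drive(g, steps, tau1=4.0, tau2=10.0, alpha=0.1,
                       v=0.0, mu=0.01):
    """Iterate the recursion with a constant g (an illustrative input)."""
    for _ in range(steps):
        v, mu = adaptive_rate_step(v, mu, g, tau1, tau2, alpha)
    return v, mu


v_one, mu_one = adaptive_rate_step(0.0, 0.01, 2.0, 4.0, 10.0, 0.1)
v_end, mu_end = run_constant_drive(2.0, 400)
```

With a constant g the filtered term v settles at g and the learning rate settles at alpha times |g|. This is the behavior described above: a persistently large update term drives the learning rate up, while a vanishing one lets it decay.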


2 Implementation

The design of the algorithm assumed that two different mixed signals were going
to be sampled. The program was developed with this in mind, but with
the possibility of increasing the number of mixed signals if necessary.
It was also assumed that for n source signals there are n
corresponding channels, meaning that the separation matrix is square.

The signal separation algorithm was implemented in two different environments.
The first was an off-line implementation, used to check the algorithm's behavior and
feasibility and to allow fast prototyping of the program, since it is easier to modify
under controlled conditions. After certain tests were performed with
predetermined speech signals and the parameters were chosen, the program
was migrated to an on-line process on the DSP system.

2.1 Off-line

The program was initially designed on a Sun SPARC workstation, to run under
the Matlab 5.0 environment. Speech samples available in Matlab were used,
and the performance of the algorithm in separating the signals was tested. Note
that the speed at which the algorithm achieved this separation was not
analyzed, since the final objective of the work was the on-line
implementation. It is enough to say that the separation took about 6 times the
length of the sample. After the parameters were adjusted to obtain a good output
for certain random mixing matrices, the program was implemented in ANSI C on
the same workstation.

This next step is logical, since the C compiler for the DSP was going to be used
to generate the assembler program. The C program is also at a lower level
than the one written in Matlab, allowing better control over the low-level functions
and optimizations, and therefore increasing the speed of execution. Matlab
was still used as the output interface of the program. The same tests executed
in Matlab were executed with the C program, obtaining the same results both in
the weight matrix and in the appearance of the output signals. As expected, the
execution time was reduced by almost half, which is appropriate for an off-line
process, where immediate results are not needed.

Note that the program was written with a static data structure, since the test
signals were finite and had a predetermined length, and were read in one stage at
the beginning of the program.

The following example figures show the behavior of the algorithm initially
implemented on the PC. Two speech signals were applied, each of 42028
samples. The corresponding waveforms are shown in Figure 2. These two
signals were mixed through a random matrix, obtaining the mixed signals to be
processed and separated by the algorithm. The mixed signals are shown in
Figure 3. Note the dominance of signal s1(t) over the other signal s2(t). The
algorithm was applied to the mixed signals and the corresponding signal estimates
were obtained, as depicted in Figure 4 and Figure 5, where each of the estimates
(blue) is accompanied by the original signal (green). The estimate of signal s1(t)
accurately resembles the original signal, but the other estimate still presents
some crossover. This is due to the attenuation of this signal in certain periods of
time.

Figure 2. Input signals

Figure 3. Mixed signals

Figure 4. Estimated signal for s1(t)

Figure 5. Estimated signal for s2(t)

Since the learning rate is adaptive, the presence of such periods affects the
value of the parameter, producing big changes in the weight matrix and therefore
altering the output. Even if the separation matrix has the ideal values, such a
disturbance will alter the coefficients; when the strength of the signal rises again,
the values will converge back to their previous values.

This can be seen in Figure 6. There, the signal s2(t) (green) is shown with the
estimated signal (blue) and the behavior of the individual weights over time. When
the strength of the signal decreases (region A), a variation of the learning rate is
reflected in the weights and therefore the estimates are affected. The influence
is higher in the first portion, since the main envelope of the learning rate is higher
there. As time progresses, the learning rate decreases even if disturbances
are present, and therefore the influence of these “silent periods” diminishes with
time. If the final weight matrix were used without allowing any further change over
the subsequent samples, a very good separation would be obtained, provided that the
mixing matrix did not change after that point. Although the signals may not
look similar in some regions, the sound produced by the estimates is close to that
of the original sounds prior to mixing.

Figure 6. Influence of silence periods

2.2 Real Time

After the program was tested on the workstation, it was modified to satisfy the
requirements of the DSP environment. Although the algorithm is the same,
several modifications had to be made to optimize it for the DSP
architecture and to change the source of the input signals. In addition, a PC program
was prepared to interface with the DSP board. It was also necessary to change
the static data structure already mentioned, since the signals were not going to
arrive all at once and be stored, but had to be processed one by one as
they were sampled.

As a first step, the program was migrated to the DSP and the input signals, the
same ones used on the workstation, were sent from the PC. Once the
program was running smoothly, the source of the signals was changed to the
daughter modules, where analog mixed signals were sampled and processed
directly.

2.2.1 DSP Environment

The DSP environment where the application was implemented consists of a
development module based on a TMS320C32 DSP from Texas Instruments [13].
It uses a PC as a host, with a bank of DPRAM for fast data exchange. It also has
two banks of SRAM for data storage and program execution. In addition, the
PC/C32 interfaces with an on-board LSI daughter module with dual-channel
ADC/DAC ports. A description of each of the parts follows [15] [16] [17].

2.2.1.1 DSP

The TMS320C32 is a 60 MHz digital signal processor with 32-bit floating-point
capabilities and a peak performance of 60 MFLOPS.

The C32 has a single-cycle floating-point/integer multiplier that can operate in
parallel with the ALU. The ALU also performs single-cycle operations, and this
parallelism is accomplished by the use of two CPU buses that carry
operands from memory and two register buses that carry data from the
register file. The CPU also has two auxiliary register arithmetic units, which
generate addresses.

The register file consists of a total of 28 registers. Of these, 8 are extended-precision
registers, supporting 32-bit integer numbers and 40-bit floating-point
numbers. There are 8 auxiliary registers, used for indirect addressing, as loop
counters, or as 32-bit general-purpose registers. The other registers are for specific
purposes such as data-page pointer, stack pointer, indexing, etc.

The memory organization of the C32 consists of two RAM banks, each of
256 x 32 bits. These banks can be accessed twice in a single cycle, taking
advantage of the two address buses. It also has a cache memory that stores 64
32-bit instructions, reducing the number of off-chip accesses.

The system has 256 kbytes of extended memory, with certain memory positions
restricted and reserved for the system.

2.2.1.2 Interface

The communication between the PC and the DSP is made through two types of
interfaces: a memory-mapped interface and an I/O-mapped interface.

The memory-mapped interface is implemented through a 2k x 16 DPRAM. This memory
allows information to be exchanged between the DSP and the PC without
interrupting the process on either device. Since any sector of the DPRAM
can be accessed by both devices at the same time, semaphore logic is provided to
control access to the resource and avoid data corruption.

The I/O-mapped interface consists of five 16-bit registers in the PC I/O map, forming a
control interface that provides access to software-configurable functions and other
facilities. The five registers are a control register, a status register, an interrupt
register, a DPRAM base address register and a semaphore register.

2.2.1.3 Signal Acquisition

Signal acquisition is done using the LSI AM/D16SA daughter module.
This is an on-board module that provides two analog-to-digital converters, allowing
two input signals to be sampled, and two digital-to-analog converters for the
outputs. Each channel has 16-bit resolution, and the conversion, made by
successive approximation, can attain a maximum sample rate of 200 kHz. There
are also input and output filters, which can be changed according to the
application. The configuration and data registers are mapped into the memory of
the DSP, and allow a large range of operations.

2.2.2 Program and Algorithms

The program implemented on the system is composed of two different parts, one
running on the PC where the DSP board is installed and the other running on the

DSP. Both programs are synchronized by means of a software semaphore

26

implemented on the system, preventing data corruption in the DPFIAM, as it

might be changed by any of the programs at any time.

The DSP program was designed around a polling model of the input and output
ports, rather than interrupts. When interrupts are used, the program is
“interrupted” from its normal flow; in this case, it would be interrupted every
time a new sample is acquired. Since it is not desirable to interrupt in the
middle of executing the update rule, which is the dominant part of the program
and uses the input data for its computations, interrupts would have to be
disabled for this part. Moreover, all the calculations depend on the input data,
so there is a single program path, and an interrupt would add unnecessary
overhead each time it is invoked. When polling is used, the program regularly
asks the port whether new data has arrived, and when it has, the program uses it
to perform the corresponding calculations. When the calculations are done, the
program waits for the next samples. A description of each of the programs now
follows. The flow charts of both the PC and DSP programs are shown in Figure 7.
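The polling model described above can be sketched as follows. This is only an illustrative sketch, not the thesis code: the mask 0x00010000 is the one tested in the Appendix A listings, while the function name `poll_sample`, the pointer arguments and the `max_polls` timeout are assumptions added so that the example can run without the DSP hardware.

```c
#include <stdint.h>

/* Status bit tested in the Appendix A listings to detect a new sample. */
#define SAMPLE_READY 0x00010000u

/* Hypothetical helper: busy-wait until SAMPLE_READY is set in *status,
   then copy the sample from *data into *out.
   max_polls bounds the wait so the sketch cannot hang (an assumption;
   the thesis code waits indefinitely).
   Returns 0 on success and -1 if no sample arrived in time. */
int poll_sample(volatile const uint32_t *status,
                volatile const int16_t *data,
                unsigned long max_polls, int16_t *out)
{
    unsigned long n;

    for (n = 0; n < max_polls; n++) {
        if (*status & SAMPLE_READY) {
            *out = *data;   /* new data arrived: hand it to the caller */
            return 0;
        }
    }
    return -1;              /* nothing arrived within max_polls checks */
}
```

On the real board the two pointers would refer to the memory-mapped registers of the daughter module; here they can point at ordinary variables for testing.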


2.2.2.1 PC program (host)

1. The host program takes care of the initialization of the DSP board. First, the
I/O address and the DPRAM position are established, and with these values
the board is initialized.

2. The assembled DSP program is loaded into the DSP memory and executed from
that position.

3. The PC acquires the semaphore and sends the following information: the initial
weight matrix, the initial value of the learning rate and the parameters of
the update rule. It also sets the EndData flag to 1, indicating to the DSP that
it can start acquiring data from the input ports.

4. The semaphore is released, and the program waits for the user input that
signals the end of the process.

5. When this input arrives, the PC requests the semaphore and then sets the
EndData flag to 0.

6. The semaphore is released while the PC waits for the DSP to return the final
values. When the PC is again in possession of the semaphore, these values are
read and the DSP system is closed.


2.2.2.2 DSP program

Two DSP programs were implemented during the development of the thesis.
The two versions differ in the source of the input signals. In the first
program the input came from the PC by means of the DPRAM, while the second
one obtained the training data from the input ports of the daughter module. The
following explanation corresponds to the latter implementation, since it is the
one to be executed in real time.

2.2.2.2.1 Program flow

The host program dictates when the DSP program starts executing.

1. As soon as the DSP program starts, it waits for the semaphore to be
released, in order to read the initial parameters of the implemented algorithm.
Then the daughter module is initialized, setting the clock source, the
sampling rate and other configuration parameters. The DSP releases the
semaphore, so that the PC program can change the EndData flag. Next, the main
loop starts, controlled by the EndData flag set by the PC. As long as this flag
is 1, the loop is executed.

2. The DSP polls the input port until a new sample is obtained.

3. The main loop is composed of three stages. The first stage reads the input
buffer status, and reads the data from the two input ports when the buffer is
full.

4. The second stage calculates the output data based on the last value of the
weight matrix, and the result is sent to the output ports.

5. The last stage calculates the update rule for the weight matrix, depending on
the input values.

6. Once the EndData flag is set to 0, the loop finishes, and the semaphore is
requested by the DSP.

7. When it is free, the weight matrix is written to the DPRAM, and the
semaphore is released again.
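The three-stage main loop in the steps above can be sketched in C as shown below. This is a simplified sketch under stated assumptions: the input ports are replaced by an array of two-channel samples, the output ports by an output array, and the weight update is passed in as a callback. The names `run_loop` and `update_fn` are hypothetical and do not appear in the thesis.

```c
#include <stddef.h>

/* Hypothetical callback type standing in for stage 3 (the weight update). */
typedef void (*update_fn)(float W[2][2], const float y[2]);

/* Runs the loop while *end_data is non-zero and samples remain.
   x[i] holds the two input channels of sample i (stand-in for the ports),
   y_out[i] receives the two separated outputs.
   Returns the number of samples processed. */
size_t run_loop(const volatile int *end_data, float x[][2], size_t n,
                float W[2][2], float y_out[][2], update_fn update)
{
    size_t i;

    for (i = 0; i < n && *end_data; i++) {
        /* Stage 1: read the two input channels. */
        float a = x[i][0];
        float b = x[i][1];

        /* Stage 2: y = W x, written to the outputs. */
        y_out[i][0] = W[0][0] * a + W[0][1] * b;
        y_out[i][1] = W[1][0] * a + W[1][1] * b;

        /* Stage 3: update the weight matrix (skipped if no callback). */
        if (update != NULL)
            update(W, y_out[i]);
    }
    return i;
}
```

Passing the update as a callback keeps the loop identical for every algorithm, which matches the observation that the implementations differ only in the update stage.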

[Figure 7: flowchart with two columns. PC Program (Host): Start ... Send End ...
Results. DSP Program: Start, Receive parameters, Input buffer full?, Read data,
Calculate output data, Update weights, Received end flag?]

Figure 7. Program flowchart


2.2.2.2.2 Considerations in the program design

Several considerations had to be taken into account in the realization of the
program. The first is the numerical approximation of the update rule. The
original update rule is continuous, and hence a numerical integration method
must be used. In this case the selected method is the rectangular Riemann sum,
which is described in the next section. This approximation is also applied to
the differential equations defining the learning rate in the adaptive rule by
Amari et al. [5].

The other important factor is the sampling rate. The frequency at which the input
signal is sampled has a lower limit and an upper limit. The lower limit is the
Nyquist rate, which depends on the bandwidth of the signals to be processed.
For a signal with bandwidth $\omega_m$, the sampling frequency $\omega_s$ must
satisfy

$$\omega_s > 2\omega_m \tag{44}$$

in order to be able to reconstruct the original signal.

Since this is a theoretical value, a higher value is used in practice as the
lower bound of the sampling frequency.


The upper limit is determined by the time required to process the data between
samples. If, after getting a sample, the program takes too long to execute
before reading the next sample, it can miss that sample, and the effective
sampling frequency would be altered. The minimum time between samples can
generally be broken into three portions (Figure 8): (a) reading the input data,
(b) calculating and writing the new output data, and (c) updating the weight
matrix. Of these stages, the last one consumes the longest time. Stage (c) also
depends on the particular algorithm implemented, where the update rule and the
learning rate may require special consideration.

[Figure 8: sampling train, with the interval between two samples divided into
stages a, b and c]

Figure 8. Sampling and program length.

The frequencies used as lower and upper limits were 40 kHz and 50 kHz
respectively. The Nyquist rate was chosen based on the input low-pass filter,
which has a cut-off frequency of 18 kHz. Taking this as the bandwidth of the
signal, a minimum sampling frequency of 36 kHz is obtained. The 40 kHz frequency
was selected to provide a safety margin. The upper frequency (50 kHz) was
obtained by executing the largest algorithm and visually checking the form of
known output signals, generated by echoing the input signals.
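The arithmetic behind these limits can be made explicit with a small sketch. The 18 kHz bandwidth and the 40 kHz and 50 kHz limits come from the text; the function names and any stage durations passed to `fits_between_samples` are hypothetical, since the thesis does not report per-stage timings.

```c
/* Time available between two consecutive samples, in microseconds. */
double sample_period_us(double fs_hz)
{
    return 1.0e6 / fs_hz;
}

/* Minimum (Nyquist) sampling frequency for a given signal bandwidth. */
double nyquist_rate_hz(double bandwidth_hz)
{
    return 2.0 * bandwidth_hz;
}

/* Returns 1 if stages (a), (b) and (c) fit between two samples at fs_hz,
   0 otherwise. The stage times are inputs the caller must measure. */
int fits_between_samples(double read_us, double output_us, double update_us,
                         double fs_hz)
{
    return (read_us + output_us + update_us) <= sample_period_us(fs_hz);
}
```

With these definitions, `nyquist_rate_hz(18000.0)` gives 36 kHz, and the whole loop body must complete within 25 microseconds at 40 kHz or 20 microseconds at 50 kHz.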

2.2.2.2.3 Implemented Algorithms

The implementations of the algorithms vary only in the selection of the H(y)
function, depending on the optimization method selected (stochastic, maximum
likelihood estimator or contrast function), and in the selection of the
learning rate.

Of the three optimization methods, the maximum likelihood estimator was not
considered, since it is based on the knowledge or assumption of the probability
distribution of the input signals. The other two were considered, selecting two
non-symmetric functions (stochastic) and one symmetric function (4th order
cumulant). These functions are:

Non-symmetric functions:

$$H(y) = y^3(t)\,y(t)^T \tag{45}$$

$$H(y) = y^5(t)\,y(t)^T \tag{46}$$

Symmetric function:

$$H(y) = y(t)\,y(t)^T - I + y^3(t)\,y(t)^T - y(t)\,y^3(t)^T \tag{47}$$

For the learning rate, two different functions were used. The first one is a
piecewise function, where the first part is a constant value $\mu_0$, the second
part is a decaying exponential and the last piece is a constant value of 0,
meaning that the update is stopped (see Figure 9). The constant part allows the
algorithm to take coarse steps in the search space; after some time, finer steps
are taken until they go to 0.

[Figure 9: plot of the learning rate $\mu(t)$]

Figure 9. Learning rate


The second function is the solution of the differential equations proposed by
Amari’s group, where the learning rate is adaptive. This system of equations is
rewritten here:

$$v_{ij}[k+1] = \left(v_{ij}[k]\,(\tau_1 - 1) + g_{ij}[k]\right)/\tau_1$$

$$\mu_{ij}[k+1] = \left(\mu_{ij}[k]\,(\tau_2 - 1) + \alpha\,(v_{ij}[k])^2\right)/\tau_2$$
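One discrete step of this adaptive learning rate, for a single weight, might be sketched as follows. The sketch follows the order of operations in listing (a) of Appendix A, where the smoothed term v is updated first and its new value is then squared in the learning-rate update. The function name `adaptive_rate_step` and its argument names are hypothetical.

```c
/* One step of the adaptive learning rate for one weight w_ij.
   g     : current value of the update term g_ij[k]
   tau1  : time constant of the smoothed term v
   tau2  : time constant of the learning rate mu
   alpha : gain applied to v squared
   *v and *mu are updated in place. */
void adaptive_rate_step(float g, float tau1, float tau2, float alpha,
                        float *v, float *mu)
{
    /* Smoothed update term: leaky average of g. */
    *v = (*v * (tau1 - 1.0f) + g) / tau1;

    /* Learning rate: leaky average driven by the squared smoothed term. */
    *mu = (*mu * (tau2 - 1.0f) + alpha * (*v) * (*v)) / tau2;
}
```

When the update term g stays near zero, v decays and the learning rate shrinks; a persistent non-zero g raises v and therefore the learning rate again.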

For both implementations, the gradient descent update rule and the system of
first order differential equations that describe the behavior of $\mu(t)$, two
numerical integration methods based on the Riemann sum were used. The first and
more basic is the rectangular rule (or Euler approximation), given by the
formula

$$R_n(f) = h\sum_{k=1}^{n} f(a+kh), \qquad h = (b-a)/n \tag{48}$$

Separating the last term of the sum,

$$R_n(f) = h\sum_{k=1}^{n-1} f(a+kh) + h f(b) \tag{49}$$

$$R_n(f) = R_{n-1}(f) + h f(b) \tag{50}$$

Since

$$w_i(t) = \int_0^t \mu_i(t)\{H(t)\,w(t)\}\,dt \tag{51}$$

using the approximation,

$$w[k+1] = h\,\mu_i[k]\{H[k]\,w[k]\} + w[k] \tag{52}$$

the discrete update rule to be implemented is obtained by setting h = 1.
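As an illustration, one step of the discrete rule with h = 1 and the first non-symmetric function can be sketched in C. The arithmetic mirrors listing (b) of Appendix A, where the term multiplied by the learning rate is W - y^3 (y^T W). The function name `update_step` and its interface are assumptions; the thesis code performs these operations inline.

```c
/* One rectangular-rule (h = 1) update step for a 2x2 weight matrix.
   x  : the two mixed input samples
   mu : current learning rate
   y  : receives the two output samples y = W x
   W is updated in place. */
void update_step(float W[2][2], const float x[2], float mu, float y[2])
{
    float ytemp[2], y3[2], g;
    int k, j;

    /* Output of the separating network. */
    y[0] = W[0][0] * x[0] + W[0][1] * x[1];
    y[1] = W[1][0] * x[0] + W[1][1] * x[1];

    /* ytemp = y^T W, computed once before W is modified. */
    for (j = 0; j < 2; j++)
        ytemp[j] = y[0] * W[0][j] + y[1] * W[1][j];

    /* Cubic nonlinearity applied element-wise. */
    for (k = 0; k < 2; k++)
        y3[k] = y[k] * y[k] * y[k];

    /* w[k+1] = w[k] + mu * g, with g = W - y3 * (y^T W). */
    for (k = 0; k < 2; k++) {
        for (j = 0; j < 2; j++) {
            g = W[k][j] - y3[k] * ytemp[j];
            W[k][j] = W[k][j] + mu * g;
        }
    }
}
```

Each entry of g depends only on the corresponding entry of W and on the precomputed vectors, so updating W in place inside the loop does not change the result.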

The second method is the trapezoidal rule, given by

$$T_n(f) = h\left[\frac{f(a)}{2} + f(a+h) + f(a+2h) + \cdots + f(a+(n-1)h) + \frac{f(b)}{2}\right], \qquad h = (b-a)/n \tag{53}$$

where by manipulating the terms we obtain

$$T_n(f) = T_{n-1}(f) + h\left[\frac{f(a+(n-1)h)}{2} + \frac{f(b)}{2}\right], \qquad h = (b-a)/n \tag{54}$$

Applying it to the gradient descent rule, the update rule becomes

$$w[k+1] = w[k] + \left[h\,\mu_i[k]\{H[k]\,w[k]\} + h\,\mu_i[k-1]\{H[k-1]\,w[k-1]\}\right]/2 \tag{55}$$

Both rules were implemented and tested for different mixing matrices. Since no
improvement in the performance was noticed, the simplest rule (the rectangular)
was used. Note that the implementation of the trapezoidal rule adds more
instructions to the algorithm, which increases the execution time per sample and
therefore tightens the upper limit on the sampling frequency discussed above.
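The extra work required by the trapezoidal rule can be seen in a short sketch: the previous update term and learning rate must be stored and combined with the current ones. The appendix listings only contain the rectangular rule, so the function `trapezoid_step` and its arguments are hypothetical and serve only to illustrate the rule.

```c
/* One trapezoidal-rule update step for a 2x2 weight matrix.
   g_now   : current update term {H[k] w[k]}
   g_prev  : update term from the previous sample, overwritten with g_now
   mu_now  : current learning rate
   mu_prev : learning rate of the previous sample
   h       : integration step (1 in the discrete implementation) */
void trapezoid_step(float W[2][2], float g_now[2][2], float g_prev[2][2],
                    float mu_now, float mu_prev, float h)
{
    int k, j;

    for (k = 0; k < 2; k++) {
        for (j = 0; j < 2; j++) {
            /* Average of the current and previous scaled update terms. */
            W[k][j] += 0.5f * (h * mu_now * g_now[k][j]
                               + h * mu_prev * g_prev[k][j]);
            /* Keep the current term for the next call. */
            g_prev[k][j] = g_now[k][j];
        }
    }
}
```

Compared with the rectangular rule, each weight needs one more multiplication, one more addition and one more stored value per sample.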


3 Tests and Results

To test the implementation of the on-line separation algorithms and the
feasibility of applying them in different systems, several tests were performed.
Different algorithms were programmed, depending on the approaches listed in the
previous chapters. The update rule

$$w[k+1] = h\,\mu_i[k]\{H[k]\,w[k]\} + w[k]$$

is common to all the algorithms, but variations were introduced in the function
H[k] and in the function describing the learning rate $\mu(t)$. The three H[k]
functions already described and the two descriptions of the learning rate yield
six algorithms to be tested.

The complexity of the algorithms varies. The simplest is the one using the
fewest calculations, based on the function

$$H(y) = y^3(t)\,y(t)^T \tag{45}$$
and the learning rate

$$\mu(t) = \begin{cases} \mu_0, & 0 \le t \le t_0 \\ \mu_0\,e^{-at}, & t_0 < t < t_1 \\ 0, & t_1 \le t \end{cases} \tag{56}$$


The most complex uses the function

$$H(y) = y(t)\,y(t)^T - I + y^3(t)\,y(t)^T - y(t)\,y^3(t)^T \tag{47}$$

and the system of differential equations for the learning rate.

Two sources are used as test signals. One is a sine wave of frequency 10 kHz
and the second is a triangular wave of frequency 3 kHz. The lower frequency of
the latter was chosen because a triangular wave contains high-frequency
components; given the bandwidth restriction of the input, the components of a
higher-frequency triangular wave would be lost. In any case, the bandwidth
covered is sufficient for the applications considered, which pertain to speech
separation.

The signal sources are mixed by an analog circuit, in which the matrix
coefficients are easily changed to test different conditions. Each of the
coefficients can take any value between 0 and 1, and by using an inverting
buffer a negative coefficient can be obtained, hence covering a wide range of
mixing matrices.

Two different tests were executed for each of the algorithms. The first consisted
of applying the signals without mixing them, that is, the mixing matrix is the
identity. The expected weight matrix is a permutation matrix P multiplied by a
diagonal matrix D:

$$W = P\,D \tag{57}$$

A weight matrix close to the following is an acceptable result:

$$W = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1.2 & 0 \\ 0 & 1.05 \end{bmatrix} = \begin{bmatrix} 0 & 1.05 \\ 1.2 & 0 \end{bmatrix} \tag{58}$$

Note that in this case the output signals will be switched, depending on the
initial configuration. The entries near zero represent the intersymbol
interference, which in this ideal case is equal to zero. The further these
interference “weights” are from zero, the less acceptable is the performance of
the algorithm.

The second test consisted of applying a random mixing matrix and then observing
the results as the algorithm converged. In order to analyze the weight matrix,
it was necessary to multiply the resulting matrix by the mixing matrix to obtain
the permutation matrix.

In both cases it was necessary to reorganize the results of the permutation
matrix, so that the weights obtained would be in the same position for all the
samples. The reorganization was done so that the matrix would resemble the
identity matrix.
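The analysis step described above can be sketched as follows: the weight matrix is multiplied by the mixing matrix, and the rows of the product are swapped when the off-diagonal entries dominate, so that the result resembles the identity. The thesis does not give the exact reorganization procedure, so the row swap and the names `global_matrix` and `abs_f` are assumptions.

```c
/* Absolute value helper (avoids depending on the math library). */
static float abs_f(float v)
{
    return v < 0.0f ? -v : v;
}

/* Computes P = W A and reorders its rows so that the largest entries
   lie on the diagonal (2x2 case only). */
void global_matrix(float W[2][2], float A[2][2], float P[2][2])
{
    int r, c;
    float tmp;

    /* Plain 2x2 matrix product. */
    for (r = 0; r < 2; r++)
        for (c = 0; c < 2; c++)
            P[r][c] = W[r][0] * A[0][c] + W[r][1] * A[1][c];

    /* If the outputs are switched, swap the two rows. */
    if (abs_f(P[0][1]) + abs_f(P[1][0]) > abs_f(P[0][0]) + abs_f(P[1][1])) {
        for (c = 0; c < 2; c++) {
            tmp = P[0][c];
            P[0][c] = P[1][c];
            P[1][c] = tmp;
        }
    }
}
```

Applied to the acceptable weight matrix of the identity test, with rows (0, 1.05) and (1.2, 0), the function returns a matrix with 1.2 and 1.05 on the diagonal, which can then be averaged with the results of the other runs.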

To check the performance of the algorithms, each algorithm was executed 50
times and the results were averaged to obtain a matrix used to compare the
intersymbol interference.

The values obtained for the identity mixing matrix are shown in Table 1.

Table 1. Average of weight matrices for identity mixing channel.

H(y) / Learning rate                                      Exponential          Adaptive (Amari)

H(y) = y^3(t) y(t)^T                                      [1.2470  0.1652]     [1.2509  0.0700]
                                                          [0.1270  1.1124]     [0.0355  1.1038]

H(y) = y^5(t) y(t)^T                                      [1.2447  0.1459]     [1.1568  0.0562]
                                                          [0.1221  1.1170]     [0.0253  1.0360]

H(y) = y(t)y(t)^T - I + y^3(t)y(t)^T - y(t)y^3(t)^T       [?.?562  0.?944]     [1.2325  0.0049]
                                                          [0.1914  1.2969]     [0.0051  1.4294]


By normalizing the terms corresponding to the estimated channel, a measure of
the intersymbol interference is obtained to evaluate the performance. After
the normalization, the maximum interference ratio is selected as the performance
value for the specific algorithm. This can be expressed by the formula

$$p = \max\left(\frac{w_{12}}{w_{11}},\ \frac{w_{21}}{w_{22}}\right) \tag{58}$$

where p is the performance measure of the weight matrix. Table 2 shows the
averaged performance for each case.
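The performance measure can be computed with a few lines of C, shown here as a sketch. The function name `interference` is hypothetical, and taking absolute values of the ratios is an added assumption so that negative weights are handled; the weights reported in Table 1 are all positive.

```c
/* Absolute value helper (avoids depending on the math library). */
static double abs_d(double v)
{
    return v < 0.0 ? -v : v;
}

/* Performance measure p for a reorganized 2x2 weight matrix:
   each off-diagonal (interference) weight is normalized by the diagonal
   weight of its row, and the larger ratio is returned. */
double interference(double W[2][2])
{
    double r12 = abs_d(W[0][1] / W[0][0]);
    double r21 = abs_d(W[1][0] / W[1][1]);

    return r12 > r21 ? r12 : r21;
}
```

For the first matrix of Table 1, with rows (1.2470, 0.1652) and (0.1270, 1.1124), the two ratios are approximately 0.1325 and 0.1142, so p is approximately 0.1325, in agreement with the corresponding entry of Table 2.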

Table 2. Algorithm performance for identity mixing matrix.

H(y) / Learning rate                                      Exponential    Adaptive (Amari)
H(y) = y^3(t) y(t)^T                                      0.1325         0.0560
H(y) = y^5(t) y(t)^T                                      0.1172         0.0486
H(y) = y(t)y(t)^T - I + y^3(t)y(t)^T - y(t)y^3(t)^T       0.1476         0.0040


The same procedure was carried out for a mixing matrix A with arbitrarily
selected values. Note that the algorithm converged in all executions. The
analogous results are summarized in Tables 3 and 4.

$$A = \begin{bmatrix} 0.44975 & 0.40647 \\ 0.66568 & 0.80043 \end{bmatrix}$$


Table 3. Average of weight matrices for an arbitrary mixing channel.

H(y) / Learning rate                                      Exponential    Adaptive (Amari)
H(y) = y^3(t) y(t)^T                                      [illegible]    [illegible]
H(y) = y^5(t) y(t)^T                                      [illegible]    [illegible]
H(y) = y(t)y(t)^T - I + y^3(t)y(t)^T - y(t)y^3(t)^T       [illegible]    [illegible]


Using the same measure of performance:

Table 4. Algorithm performance for mixing matrix A.

H(y) / Learning rate                                      Exponential    Adaptive (Amari)
H(y) = y^3(t) y(t)^T                                      0.2388         0.2337
H(y) = y^5(t) y(t)^T                                      0.2315         0.2488
H(y) = y(t)y(t)^T - I + y^3(t)y(t)^T - y(t)y^3(t)^T       0.2419         0.2511


As can be seen, in the case of the identity mixing matrix the best results were
obtained from the algorithm implemented with an adaptive learning rate and the
symmetric function. However, the performance is almost the same for the other
experiments. Because of this, it is recommended to use the algorithm that had
the best performance in the first test. Note that this algorithm is the most
complex, and therefore takes the longest time to execute.

The following are printouts of the input (Figure 10), mixed (Figure 11) and
estimated signals (Figure 12).

[Oscilloscope capture, 3 May 1998]

Figure 10. Original signals

[Oscilloscope capture, 3 May 1998]

Figure 11. Mixed signals

[Oscilloscope capture, 3 May 1998]

Figure 12. Estimated signals


4 Conclusions and Recommendations

The on-line implementation of the algorithms shows that their application to
real-time systems is potentially feasible, taking into account the restrictions
imposed by the length of the input signals, which affect the frequency ranges
and the computational time. The performance might be improved by rewriting the
code completely in assembly language, but the improvement might not be
considerable, given that the optimizer provided with the C compiler does a good
job. This optimization may not be necessary if the program is to be used in an
off-line environment, where the only frequency limit would be imposed by the
sampling rate, since no immediate response is necessary.

A problem that can be seen in the implementation of the algorithms, either
on-line or off-line, is channel switching. This happens when the source signal
in channel 1 appears as the output in channel 2 and vice versa. Given that it is
not possible to know the structure of the mixing matrix, no modification of the
algorithm that prevents this has been found yet. To resolve this permutation it
is necessary to implement a post-processing stage, so that the algorithm can be
used in real on-line applications. Since the off-line implementation allows the
user to choose the desired output, switching may not be an issue in that case,
but resolving it is crucial in on-line algorithms. To design this stage, it may
be necessary to use a priori information about the input signals, depending on
the application. In speech, for example, some features can be used to select the
desired separated output from the others, based on the energy spectrum,
waveform characteristics, etc. Another piece of a priori information that may
also be used is the probability distribution of the signals, since it can
determine the nature of the desired function H(y). Some special kinds of noise
can also be filtered out with this knowledge.

Another problem to address is the use of a dynamic mixing matrix. Assuming
that the mixing medium varies slowly in time (slowly enough for the algorithm to
converge), the initial solution found may no longer be valid some time after the
update process has finished. To solve this problem, the learning rate must be
taken into account. If it has a constant value, the update rule will adapt to a
changing environment, but it might not produce an accurate result: it will
either take coarse steps in the search space when large values are used, or take
a long time to converge when the search is fine, possibly longer than the time
over which the mixing matrix changes.


This can be partially overcome by using an exponential instead of a constant. A
decaying exponential, such as the one implemented in the algorithms, poses
problems if the system changes after the separation matrix has converged. This
is because the final value of the learning rate is 0, and therefore no further
change in the weights is produced. By selecting some measure of the variation of
the weights, evaluated assuming a constant learning rate (that is, before
multiplying by $\mu(t)$), and comparing it to some established value, the
exponential can be applied repeatedly, restarting the process every time the
condition is met and hence allowing the system to change.
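The restart idea can be sketched as a small state machine: the rate is held constant, then decays, then stays at 0 until a change is reported, at which point the schedule starts again. This is an illustrative sketch only. The structure `rate_sched`, the function `sched_next` and the `changed` flag are hypothetical; the caller is assumed to set `changed` when its chosen measure of the weight variation exceeds the established value. The exponential is realized as a multiplication by a fixed ratio per sample, which is a sampled decaying exponential.

```c
/* State of a restartable learning-rate schedule (all names hypothetical). */
typedef struct {
    double mu0;    /* initial (constant) learning rate            */
    double ratio;  /* per-sample decay factor, 0 < ratio < 1      */
    long   hold;   /* number of samples at the constant rate      */
    long   decay;  /* number of samples in the decaying part      */
    long   i;      /* samples elapsed since the last (re)start    */
    double mu;     /* rate to be used for the next sample         */
} rate_sched;

/* Returns the learning rate for the current sample and advances the state.
   changed is non-zero when the caller has detected a change in the mixing. */
double sched_next(rate_sched *s, int changed)
{
    double mu;

    if (s->i >= s->hold + s->decay) {
        if (!changed)
            return 0.0;        /* schedule finished: update is stopped */
        s->i = 0;              /* change detected: restart the schedule */
        s->mu = s->mu0;
    }

    mu = s->mu;
    s->i++;
    if (s->i >= s->hold)
        s->mu *= s->ratio;     /* decaying part: shrink the next rate */
    return mu;
}
```

With hold = 2, decay = 2, mu0 = 1 and ratio = 0.5, successive calls return 1, 1, 0.5, 0.25 and then 0 until `changed` is set, after which the sequence starts again from 1.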

Finally, this problem is addressed by Amari by using the system of two
differential equations to describe an adaptive learning rate, which depends
both on its previous values and on the current value of the weight matrix. The
difficulty is that the parameters are hard to select and have to be fine-tuned
for each application. An improvement may be obtained by using a previous stage
where these parameters can be set or “learned” based on information about the
signals to be treated. For example, the decay rate of the exponential would
depend on the frequency of the signals, and the “change sensitivity” would be
set by knowing how fast the medium may change.


The final problem to be considered, and maybe the most important one, is the
non-invertibility of the mixing matrix. If the system is not invertible or has a
very small determinant, the algorithm will not converge. New alternatives must
be sought, perhaps by applying some kind of pre-processing to the signals, using
some knowledge about their nature and modifying one of the inputs in such a way
that one of the mixed signals is affected.

The next step for future work is to apply actual speech signals and process them
in real time, to verify the performance of the implementation. Since the
algorithm works well with a controlled mixing medium (simulated by adding the
signals), the mixing medium to be used next should be the environment, and the
input signals should come from microphones. Since in this case the mixing matrix
is unknown, the evaluation of performance can only be done by listening to the
actual output. A precise comparison of the waveforms can be done, but the delays
produced between the input and the output, along with the acquisition of the
original signals, may make it difficult to set up an adequate test environment.
Enough time and patience may produce good results, but the evaluation obtained
by listening to the estimated signals should be satisfactory in some cases.


Also, as stated before, the switching problem should be addressed before
considering an actual application in real devices, along with the problem of
invertibility. One possible solution is to change the parameters of the mixing
medium. If a non-invertible medium is detected (the algorithm does not converge
in a reasonable time), it might be possible to alter the position of an input
microphone. By providing another input located in a different position and
switching to it, the characteristics of the environment would be different, and
the signals may be separated (Figure 13). This solution should be considered for
real applications, until a new training algorithm is developed to solve the
problem in an easier way.

   
 

[Figure 13: block diagram with inputs labeled Mic1 and Mic 2b feeding a Blind
Separation block]

Figure 13. Additional input to solve the invertibility problem


APPENDIX


APPENDIX A

PC Program

#define True 1
#define False 0

#define DSP_SEM_LOCATION 0xc00040
#define DSP_BUFFER_START 0xc00004
#define DSP_BUFFER_END 0xc001fd

#define DSP_BUFFER_OUT_START 0xc00200
#define DSP_BUFFER_OUT_END 0xc003fd

#include "stdlib.h"
#include "stdio.h"
#include "time.h"
#include "d:\user\zuluaga\tic32nt.h"

#pragma hdrstop

void main()
{
    char * filename;
    int c,i;
    HANDLE hProcessor_Data;
    ULONG PC,port,port_out;
    float dataout2,t2,t1,alpha,mu,w;
    char deviceName[100];
    FILE *FID, *FID2;

    ((PLSI_WIN95_PCC32)deviceName) -> baseAddress = 0x290;
    ((PLSI_WIN95_PCC32)deviceName) -> DPRAMAddress = 0xd0000;
    c=Open_System();
    hProcessor_Data = Open_Processor_ID(deviceName);
    c=Global_Reset(hProcessor_Data);
    filename="d:/user/zuluaga/amari.out";
    c=Load_Object_File(filename, hProcessor_Data);
    Set_Processor_Data_Type_Size_32(hProcessor_Data);
    PC=Get_Entry_PC(hProcessor_Data);
    Run_From(PC,hProcessor_Data);

    // Initialize parameters.
    Request_Semaphore(hProcessor_Data);
    while (Read_Semaphore(hProcessor_Data))
    {
        Request_Semaphore(hProcessor_Data);
    }
    port=DSP_SEM_LOCATION;
    c=Put_DPRAM_Float_32(port, 1, hProcessor_Data);
    srand( (unsigned)time( NULL ) );

    t1=10000; /*1000,1000,.01 */
    t2=10000;
    alpha=0.1;
    mu=.001;

    port=DSP_BUFFER_START;
    port_out=DSP_BUFFER_OUT_START;

    // t2, t1, alpha, mu
    c=Put_DPRAM_Float_32(port++, t2, hProcessor_Data);
    c=Put_DPRAM_Float_32(port++, t1, hProcessor_Data);
    c=Put_DPRAM_Float_32(port++, alpha, hProcessor_Data);
    c=Put_DPRAM_Float_32(port++, mu, hProcessor_Data);

    // W
    w=2*((float)rand()/RAND_MAX)-1;
    c=Put_DPRAM_Float_32(port++, w, hProcessor_Data);
    w=2*((float)rand()/RAND_MAX)-1;
    c=Put_DPRAM_Float_32(port++, w, hProcessor_Data);
    w=2*((float)rand()/RAND_MAX)-1;
    c=Put_DPRAM_Float_32(port++, w, hProcessor_Data);
    w=2*((float)rand()/RAND_MAX)-1;
    c=Put_DPRAM_Float_32(port++, w, hProcessor_Data);

    // DSP reads data
    Release_Semaphore(hProcessor_Data);

    // Stop process
    scanf(": \n");
    Request_Semaphore(hProcessor_Data);
    while (Read_Semaphore(hProcessor_Data))
    {
        Request_Semaphore(hProcessor_Data);
    }
    port=DSP_SEM_LOCATION;
    c=Put_DPRAM_Float_32(port, 0, hProcessor_Data);
    Release_Semaphore(hProcessor_Data);

    FID=fopen("w.txt","a");
    port_out=DSP_BUFFER_OUT_START+2;
    c=Get_DPRAM_Float_32(port_out++, &dataout2, hProcessor_Data);
    printf("%f \t",dataout2);
    fprintf(FID,"%f \t",dataout2);
    c=Get_DPRAM_Float_32(port_out++, &dataout2, hProcessor_Data);
    printf("%f \n",dataout2);
    fprintf(FID,"%f \t",dataout2);
    c=Get_DPRAM_Float_32(port_out++, &dataout2, hProcessor_Data);
    printf("%f \t",dataout2);
    fprintf(FID,"%f \t",dataout2);
    c=Get_DPRAM_Float_32(port_out++, &dataout2, hProcessor_Data);
    printf("%f \n",dataout2);
    fprintf(FID,"%f \n",dataout2);
    Close_Processor_ID(deviceName);
    Close_System();
    fclose(FID);
}

DSP Programs


a) Symmetric function, adaptive learning rate.

#include "D:/tool_dir/math.h"

#define DSP_SEM_LOCATION 0xc00040
#define DSP_BUFFER_START 0xc00004
#define DSP_BUFFER_END 0xc001fd
#define DSP_BUFFER_OUT_START 0xc00200
#define DSP_BUFFER_OUT_END 0xc003fd
#define SEMAPHORE 0x820000
#define TIMER1 0x81A005
#define UCR 0x81A008
#define ACR 0x81A00A
#define CONFIG 0x81A00F
#define IMR 0x81A00B
#define DCR 0x81A00C
#define CH0 0x81A002
#define CH1 0x81A006
#define IMR_DEF 0x000010000
#define DCR_DEF 0x0001e0000
#define TIMER1_DEF 0x0Fecb0000
#define UCR_DEF 0x0A4000000
#define ACR_DEF 0x000F20000
#define CONFIG_DEF 0x08dff0000
#define CLEARINT 0x0FFFFFFFE
#define ITTP 0x06000000

typedef unsigned long UINT32;

void main(void)
{
    const int scale =(-0x4fFFFFFF);
    float * dport,* dport_out,a,b;
    float mu_i, t1, t2, alpha;
    float t1_temp,t2_temp,alpha_temp;
    volatile unsigned int * sem;
    float * end_data;
    UINT32 * confdm;
    register float y[2],g,g2,y3[2],ytemp[2],ytemp2[2];
    register float W[2][2],mu[2][2],v[2][2];
    register short * dataDM0,* dataDM1;
    signed int k,j,i;

    confdm = (UINT32 *)DCR;
    * confdm = DCR_DEF;
    confdm = (UINT32 *)UCR;
    * confdm = UCR_DEF;
    confdm = (UINT32 *)ACR;
    * confdm = ACR_DEF;
    confdm = (UINT32 *)TIMER1;
    * confdm = TIMER1_DEF;
    confdm = (UINT32 *)CONFIG;
    * confdm = CONFIG_DEF;
    confdm = (UINT32 *)IMR;
    * confdm = IMR_DEF;

    sem=(unsigned int *)SEMAPHORE;
    * sem = 1;
    dataDM0 = (short *)CH0;
    dataDM1 = (short *)CH1;

    // Wait for parameters.
    dport = (float *)DSP_BUFFER_START;
    dport_out = (float *)DSP_BUFFER_OUT_START;
    // Initialize parameters.
    * sem=0;
    while (* sem)
    {
        * sem=0;
    }
    t2= * dport++;
    t1= * dport++;
    alpha= * dport++;
    mu_i=* dport++;
    W[0][0]=* dport++;
    W[0][1]=* dport++;
    W[1][0]=* dport++;
    W[1][1]=* dport++;
    t1_temp=(t1-1)/(t1);
    t2_temp=(t2-1)/(t2);
    alpha_temp=alpha/(t2);
    * sem = 1;
    i=0;
    v[0][0]=0;
    v[0][1]=0;
    v[1][1]=0;
    v[1][0]=0;
    mu[0][0]=mu_i;
    mu[0][1]=mu_i;
    mu[1][0]=mu_i;
    mu[1][1]=mu_i;
    //Receive Data
    end_data=(float *)DSP_SEM_LOCATION;
    while (* end_data)
    {
        while (!((* confdm) & (0x00010000)))
        {}
        a=(float)(* dataDM0)/scale;
        b=(float)(* dataDM1)/scale;
        y[0]=(W[0][0]*a+W[0][1]*b);
        y[1]=(W[1][0]*a+W[1][1]*b);

        * dataDM0=scale*y[0];
        * dataDM1=scale*y[1];
        ytemp[0]=y[0]*W[0][0]+y[1]*W[1][0];
        ytemp[1]=y[0]*W[0][1]+y[1]*W[1][1];
        y3[0]=y[0]*y[0]*y[0];
        y3[1]=y[1]*y[1]*y[1];
        ytemp2[0]=y3[0]*W[0][0]+y3[1]*W[1][0];
        ytemp2[1]=y3[0]*W[0][1]+y3[1]*W[1][1];

        for (k=0;k<2;k++)
        {
            for (j=0;j<2;j++)
            {
                g=W[k][j]-(y3[k]*ytemp[j]-y[k]*ytemp2[j]+y[k]*ytemp[j]);
                W[k][j]=W[k][j]+mu[k][j]*g;
                v[k][j]=v[k][j]*(t1_temp)+g/(t1);
                mu[k][j]=(mu[k][j]*(t2_temp))+(alpha_temp*v[k][j]*v[k][j]);
            }
        }
    }

    * sem=0;
    while (* sem)
    {
        * sem=0;
    }

    dport_out = (float *)(DSP_BUFFER_OUT_START+2);
    * dport_out++=W[0][0];
    * dport_out++=W[0][1];
    * dport_out++=W[1][0];
    * dport_out++=W[1][1];
    * sem = 1;
}


b) Non-symmetric function, exponential learning rate.

#include "D:/tool_dir/math.h"

#define DSP_SEM_LOCATION 0xc00040
#define DSP_BUFFER_START 0xc00004
#define DSP_BUFFER_END 0xc001fd
#define DSP_BUFFER_OUT_START 0xc00200
#define DSP_BUFFER_OUT_END 0xc003fd
#define SEMAPHORE 0x820000
#define TIMER1 0x81A005
#define UCR 0x81A008
#define ACR 0x81A00A
#define CONFIG 0x81A00F
#define IMR 0x81A00B
#define DCR 0x81A00C
#define CH0 0x81A002
#define CH1 0x81A006
#define IMR_DEF 0x000010000
#define DCR_DEF 0x0001e0000
#define TIMER1_DEF 0x0Ff100000
#define UCR_DEF 0x0A4000000
#define ACR_DEF 0x000F20000
#define CONFIG_DEF 0x08dff0000
#define CLEARINT 0x0FFFFFFFE
#define ITTP 0x06000000

typedef unsigned long UINT32;

void main(void)
{
    const int scale =(-0x4fFFFFFF);
    float * dport,* dport_out,a,b;
    float mu,mu_i, t1, t2, alpha;
    float t1_temp,t2_temp,alpha_temp; /* used below; declaration missing in the printed listing */
    volatile unsigned int * sem;
    float * end_data;
    UINT32 * confdm;
    register float y[2],g,g2,y3[2],ytemp[2],W[2][2];
    register short * dataDM0,* dataDM1;
    signed int k,j,i;

    confdm = (UINT32 *)DCR;
    * confdm = DCR_DEF;
    confdm = (UINT32 *)UCR;
    * confdm = UCR_DEF;
    confdm = (UINT32 *)ACR;
    * confdm = ACR_DEF;
    confdm = (UINT32 *)TIMER1;
    * confdm = TIMER1_DEF;
    confdm = (UINT32 *)CONFIG;
    * confdm = CONFIG_DEF;
    confdm = (UINT32 *)IMR;
    * confdm = IMR_DEF;

    sem=(unsigned int *)SEMAPHORE;
    dataDM0 = (short *)CH0;
    dataDM1 = (short *)CH1;

    // Wait for parameters.
    dport = (float *)DSP_BUFFER_START;
    dport_out = (float *)DSP_BUFFER_OUT_START;

    // Initialize parameters.
    * sem=0;
    while (* sem)
    {
        * sem=0;
    }
    t2= * dport++;
    t1= * dport++;
    alpha= * dport++;
    mu_i=* dport++;
    W[0][0]=* dport++;
    W[0][1]=* dport++;
    W[1][0]=* dport++;
    W[1][1]=* dport++;
    t1_temp=(t1-1)/(t1);
    t2_temp=(t2-1)/(t2);
    alpha_temp=alpha/(t2);
    * sem = 1;
    i=0;
    mu=mu_i;

    //Receive Data
    end_data=(float *)DSP_SEM_LOCATION;
    while (* end_data)
    {
        while (!((* confdm) & (0x00010000)))
        {}
        a=(float)(* dataDM0)/scale;
        b=(float)(* dataDM1)/scale;
        y[0]=(W[0][0]*a+W[0][1]*b);
        y[1]=(W[1][0]*a+W[1][1]*b);

        * dataDM0=scale*y[0];
        * dataDM1=scale*y[1];
        ytemp[0]=y[0]*W[0][0]+y[1]*W[1][0];
        ytemp[1]=y[0]*W[0][1]+y[1]*W[1][1];
        y3[0]=y[0]*y[0]*y[0];
        y3[1]=y[1]*y[1]*y[1];

        if (i<30000)
        {
            for (k=0;k<2;k++)
            {
                for (j=0;j<2;j++)
                {
                    g=W[k][j]-(y3[k]*ytemp[j]);
                    W[k][j]=W[k][j]+mu_i*g;
                }
            }
            i++;
        }
        else
        {
            if (i<40000)
            {
                for (k=0;k<2;k++)
                {
                    for (j=0;j<2;j++)
                    {
                        g=W[k][j]-(y3[k]*ytemp[j]);
                        W[k][j]=W[k][j]+mu*g;
                    }
                }
                mu=mu_i*exp(-.0005*(i-30000));
                i++;
            }
            else
            {
                g=W[0][0]-(y3[0]*ytemp[0]);
                g2=g*g;
                i=0;
                mu=mu_i;
                if (g2>100)
                {
                    dport=(float *)DSP_BUFFER_START+4;
                    W[0][0]=* (dport++);
                    W[0][1]=* (dport++);
                    W[1][0]=* (dport++);
                    W[1][1]=* (dport);
                }
            }
        }
    }

    * sem=0;
    while (* sem)
    {
        * sem=0;
    }
    dport_out = (float *)(DSP_BUFFER_OUT_START+2);
    * dport_out++=W[0][0];
    * dport_out++=W[0][1];
    * dport_out++=W[1][0];
    * dport_out++=W[1][1];
    * sem = 1;
}


REFERENCES


[1] Bell, A.J. and Sejnowski, T.J. “Learning the higher-order structure of a
natural sound”. 1996. [Online] Available
ftp://ftp.cnl.salk.edu/pub/tony/Sﬁr4.ps, April 29, 1998.

[2] Cardoso, J. “Performance and implementation of invariant source separation
algorithms”. Proceedings ISCAS'96, 1996. [Online] Available
ftp://sig.enst.fr/pub/jfc/Papers/iscas96_invar.ps.gz, April 29, 1998.

[3] Cardoso, J. and Comon, P. “Independent component analysis, a survey of
some algebraic methods”. Proceedings ISCAS'96, vol. 2, pp. 93-96, 1996.
[Online] Available ftp://sig.enst.fr/pub/jfc/Papers/iscas96_algebra.ps.gz,
April 29, 1998.

[4] Cardoso, J. and Souloumiac, A. “An efficient technique for blind separation
of complex sources”. Proceedings IEEE SP Workshop on Higher-Order Stat., Lake
Tahoe, USA, pp. 275-279, 1993. [Online] Available
ftp://sig.enst.fr/pub/jfc/Papers/hos93.ps.gz, April 29, 1998.

[5] Cichocki, A., Amari, S., Adachi, M. and Kasprzak, W. “Self-Adaptive Neural
Networks for Blind Separation of Sources”. 1996 IEEE International Symposium
on Circuits and Systems, ISCAS'96, Vol. 2, IEEE, Piscataway, NJ, 1996,
pp. 157-160. [Online] Available
http://www.bip.riken.go.jp/absl/kas/PSPAP/iscas96.ps.gz, April 29, 1998.

[6] Comon, P. “Independent Component Analysis, A New Concept?”. Signal
Processing, vol. 36, pp. 287-314, 1994. Elsevier.

[7] Davis, Philip and Rabinowitz, Philip. Methods of Numerical Integration.
Academic Press, Inc. 1984.

[8] De Lathauwer, L., Comon, P., De Moor, B. and Vandewalle, J. “Higher-Order
Power Method - Application in Independent Component Analysis”. [Online]
Available http://www.bip.riken.go.jp/absl/nolta/delathauwer.ps.gz.

[9] Haykin, S. Neural Networks: A Comprehensive Foundation. Macmillan
Publishing Company, 1994.

[10] Karhunen, J., Hyvarinen, A., Vigario, R., Hurri, J. and Oja, E.
“Applications of Neural Blind Separation to Signal and Image Processing”. Proc.
ICASSP 1997.

[11] Karhunen, J., Hyvarinen, A., Vigario, R., Hurri, J. and Oja, E. “A Class of
Neural Networks for Independent Component Analysis”. IEEE Transactions on
Neural Networks. May 1997.

[12] Laheld, B. and Cardoso, J. “Adaptive source separation with uniform
performance”. Proceedings EUSIPCO, pp. 183-186, Edinburgh, September
1994. [Online] Available ftp://sig.enst.fr/pub/jfc/Papers/eusipco94_PFS.ps.gz,
April 29, 1998.

[13] TMS320C3x, User’s Guide. Texas Instruments. 1997.

[14] “Generating Efficient Code with TMS320 DSPs: Style Guidelines.”
Application Report SPRA366. Texas Instruments. July 25, 1997. [Online]
Available http://www-s.ti.com/sc/psheets/spra366/spra366.pdf, April 29, 1998.

[15] PC/C32 Board, Technical Reference Manual. Loughborough Sound Images.
August 1996.

[16] AM/D16SA Burr-Brown ADC/DAC Daughter Module, User Manual.
Loughborough Sound Images. March 1996.

[17] PC/C32 Win32 Support Package, User Guide. Loughborough Sound
Images. June 1997.

General References:

Smaragdis, P. “Paris’ Independent Component Analysis & Blind Source
Separation page”. [Online] Available http://sound.media.mit.edu/~paris/ica.html,
April 29, 1998.

Lee, T. W. Page for Blind Source Separation. [Online] Available
http://www.cnl.salk.edu/~tewon/Blind/blind.html, April 29, 1998.

Kasprzak, W. Basic Blind Source Separation. [Online] Available
http://www.bip.riken.go.jp/absl/kas/researchBS.html, April 29, 1998.
