.‘u .14? b.” i ,1 m. 5.. fight .oaenumé. . .i 5.1:. I I Maw 21)! 4:“.an an» . hwmkummm. ; . um ...... .3 . Haw... A 14.... u. . :- .nr. A . £34...qu , ‘ . 1.“: .1. mm . . L. :3. fig .. 3.1.6“ £551....» «fist vizllfin‘ .713. 5:6 ...._ .3... : .. :{1 i . Ii. EM 1.52....34. A “I .‘I '7: ‘ n «a I... 51:1. v.0..- 11 .1 $29.5... iv 3. r.q.%.fi..§.§_ ._ .. - .. . x; ‘ A. ,. .figésfififififi . ‘ z ‘ : ‘ xégsa . . . . 1w ‘2. I ‘Irl I'.I|I mssxs Z TUIWHIHHIZII: WI!UHHHNIINUHI 301 566 6963 + LIBRARY Mlchigan State University This is to certify that the dissertation entitled A POLYPHASE-BASED ALIAS-FREE STRUCTURE FOR ADAPTIVE FILTERING AND TRACKING presented by Umashankar S. Iyer has been accepted towards fulfillment of the requirements for Ph. D. degree in Electrical Eng. Wj/Ovax Major professor Date ’3’ M' 7‘ MS U is an Affirmative Action/Equal Opportunity Institution 0-12771 wt ~ -—v.. *__. '__.' A" -‘r o _. V‘_.—fi.——‘ fl“.— v-k PLACE N RETURN BOX to mow-this mum your record. TO AVOID FINES return on or More data duo. DATE DUE DATE DUE DATE DUE MSU Is An Affirmative Action/Equal Opportunity Institution W ”3-9.1 A Polyphase-Based Alias-flee Structure For Adaptive Filtering And Tracking By Umashankar S. [yer A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSPHY Department of Electrical Engineering 1996 ABSTRACT A Polyphase-Based Alias-Free Structure For Adaptive Filtering And Tracking By Umashankar S. [yer An asymptotically alias-free parallel structure for adaptive filtering applications, called the Polyphase Structure (PPS) is considered and analyzed here. The structure promotes the polyphase implementation of a frequency sampling filter bank. Using accumulators in each band as post-filters, the scheme forms a zooming mechanism on alias-free points in the frequency domain. This implies that the bands will become asymptotically uncorrelated which allows independent adaptations in each sub-band. If sufficient number of bands is used and the input is rich enough, perfect identification of unknown FIR systems is achieved, both analytically and experimentally. The effectiveness of this method in identifying long impulse responses is presented, especially under adverse input coloring and changing environments. A generalized version of the algorithm is developed which optimizes the solution based on a priori information about the input signal. This alias-free parallel structure has also been extended to applications in blind equalization of communications channels. With theoretically proven convergence behavior, this method outperforms the LMS and RLS type algorithms in many cases, some of which are presented in the thesis. To my Grandparents who got me started, my Parents who got me through, and my Wife who helped me finish. iii ACKNOWLEDGMENTS This thesis has materialized because of the help and guidance of numerous people. I would first like to thank my advisor, Dr. Majid Nayeri. In the six years of my association with him he has been my mentor and friend. With the correct balance of firmness and latitude, he has been able to help me perform at my best. His thoughtful comments and insight have pulled me out of many tight spots. Without his help I definitely would not have been able to achieve my goals. I thank the members of my guidance committee, Professors Deller, Frazier and Khalil for their help and advice. Special thanks to Dr. Deller for his suggestions and comments that have helped improve the quality of this document. Dr. Hiroshi Ochi is another person to whom I owe deep gratitude. It was my association with him that gave me the problem which I address in my thesis. His help and guidance has taken me miles ahead of where I could have gone on my own steam. Shobha, P.G, Shanti, Sanjay, Venky, Anuj, Vivek, Sachin, Mona and Veekay, have been great friends. My association with them has made me richer. To Shanti I owe a lot for being there when I needed him the most. He has been a great moral support for me through my rough periods here at Michigan State University. Finally I would like to thank my parents and grandparents for inspiring me to be what I am today. My wife has been the pillar of my strength in the rough times I have faced. It has been her love and encouragement that has helped me remain focussed during trying times. To her I owe deep gratitude for her support. Umashankar Iyer East Lansing, Michigan iv TABLE OF CONTENTS LIST OF TABLES LIST OF FIGURES 1 Introduction 1.1 1.2 1.3 1.4 1.5 1.6 1.7 2 The 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 The echo problem ................................. Channel equalization ............................... Sub-band schemes ................................ Multirate adaptive filtering ........................... 1.4.1 Multirate basics ............................. 1.4.2 Filter banks ................................ 1.4.3 Perfect reconstruction filterbanks .................... Polyphase representation ............................. DF T filter bank .................................. Statistical properties of multirate systems ................... Polyphase Algorithm Transform-domain adaptive filtering ...................... Sub-band adaptive filtering ........................... Prior work ..................................... 2.3.1 FIR algorithms .............................. 2.3.2 IIR filterbanks .............................. 2.3.3 IIR echo canceller ............................ Equivalence of block processing and sub-band schemes ............ Further properties of the uniform DFT filter bank .............. Adaptive scheme based on frequency sampling theory ............ Convergence analysis ............................... Time-varying environment ............................ Computational complexity ............................ Recursive PPS .................................. vii viii CDWNO‘MH 10 12 12 16 18 24 24 25 27 27 3 Performance Evaluation of PPS 3.1 3.2 Time invariant systems .............................. 3.2.1 White noise ................................ 3.2.2 Colored noise ............................... Time-varying systems .............................. 3.3 3.4 3.5 3.6 3.7 Simulations .................................... Echo cancellation ................................. Dilation and rotation ............................... Signal-to-noise ratio and computational complexity . . . . ~ .......... Channel equalization ............................... 3.7.1 Blind equalization ............................ 3.7.2 Cost functions for blind equalization .................. 3.7.3 PPS for blind equalization ........................ 4 Discussion and Future Work 4.1 4.2 Diagonal decomposition of the system in the transform—domain ....... 4.1.1 Generalized sub-band decomposition .................. 4.1.2 Adaptive structure ............................ Equalization of non-minimum-phase channels ................. 5 Conclusion BIBLIOGRAPHY vi 50 51 52 52 54 60 66 69 71 73 74 75 77 81 82 82 85 87 88 90 2.1 2.2 2.3 3.1 LIST OF TABLES PPS algorithm. .................................. 40 Number of computations per block (M samples) for PPS. .......... 47 Recursive PPS algorithm. ............................ 49 Comparison of computational complexities of LMS, FTP and PPS ...... 73 vii 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 2.1 2.2 2.3 LIST OF FIGURES Echo problem in a telephone system ....................... A teleconferencing echo problem. ........................ An echo canceller. ................................ PAM communications system ........................... Impulse response of a channel. ......................... (a) A basic decimator block. (b) A basic interpolator block. ......... Effects of decimation and interpolation in the transform (frequency) do- main.(a) The input signal X (6”), (b) The decimated signal (M=2). Notice the aliasing effect. (c) The interpolated signal (L=2). Observe the spectral images. ...................................... (a) A complete decimator with a band-limiting anti-aliasing filter. Typical cut-off frequency of the decimation filter is 1r /M . (b) A complete interpolator with an image suppressing filter. The cut-off frequency of the interpolation filter is 7r / L ..................................... (a) An analysis filter bank. (b) A synthesis filter bank ............. Generation of polyphase components. ..................... (a-b) Noble Identity 1. (c-d) Noble Identity 2 .................. Efficient structures for decimators and interpolators. (a) Polyphase imple— mentation of a decimator. (b) Simplification of (a) using Noble Identities. (c) A similar structure for an interpolator using a Type-II decomposition and noble identities ................................... A DF T filter bank ................................. A polyphase representation of an uniform DFT bank. ............ An efficient implementation of a uniform DF T filter bank, when its outputs have been decimated by M. ........................... A general multirate system. ........................... A two sub-band block with a common post filter ................ A general sub-band adaptive filter. ....................... (a) A block processing system. (b) Its multirate equivalent. ......... Commutator models for (a) decimators, (b) interpolators. .......... viii (Dam-#0300 10 11 11 14 14 15 16 18 19 19 20 26 32 34 2.4 2.5 2.6 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11 3.13 3.14 3.15 3.16 3.17 3.18 3.19 Commutator based implementation of the analysis and synthesis section of a DP T filter bank. ................................ 35 Normalized frequency response of a 4-band DF T filter bank .......... 36 Polyphase adaptive structure ........................... 37 System identification configuration ........................ 51 Inverse filtering configuration ........................... 51 Impulse response of a room ............................ 52 Comparison of learning curves for LMS and PPS when M=32. ....... 53 Comparison of learning curves for LMS and PPS when M=128 ........ 54 32 FIR filter. Estimated impulse response .................... 55 32 tap FIR filter. Comparison of learning curves ............... 56 The unknown system is the impulse response of a room. IRDC for LMS and PPS are plotted here. .............................. 57 Comparison of learning curves when the unknown system is the impulse response of a room ................................. 58 Comparison of Performance of PPS and FLMS. N = 256 and input is a low pass signal with cut-off of 0.257r. ........................ 59 Tracking behavior of LMS and PPS with slow variations in one parameter of a 16-tap FIR system (A = 0.98). ........................ 60 Tracking behavior of LMS and PPS with fast variations in one parameter of a 16-tap FIR system (A = 0.95). ........................ 61 Tracking behavior of LMS and PPS with slow variations in one parameter of a 512-tap FIR system (A = 0.45) ......................... 62 Tracking behavior of LMS and PPS with variations in 8 parameters of a 128-tap FIR system (A = 0.35) .......................... 63 Tracking behavior of F TF and PPS with variations in 1 parameter of a 16-tap FIR system (App, = 0.95, Aft; = .985). ..................... 64 Tracking behavior of FTP and PPS with variations in 1 parameter of a 128- tap FIR system (App, 2 0.9,Aflf = .9). Notice that FTF goes unstable around sample 2500. ............................... 65 Comparison of learning curves for LMS, FDLMS, DCTLMS and PPS when the input is a low pass colored noise. The unknown system is a 512 tap FIR filter ......................................... 66 Comparison of learning curves for LMS, FDLMS, DCTLMS and PPS when the input is a high pass colored noise. The unknown system is a 512 tap FIR filter ......................................... 67 Comparison of learning curves for LMS, F DLMS, DCTLMS, and PPS when the input is a speech signal. The unknown system is the 4096 tap impulse response of a conference room. ......................... 68 ix 3.20 3.21 3.22 3.23 3.24 3.25 3.26 3.27 3.28 3.29 4.1 4.2 Rotation and dilation with PPS. ........................ 69 Coloring filter with eight equally spaced nulls along the frequency axis. . . . 70 Effect of rotation on the performance of PPS .................. 71 Effect of decimation on the performance of PPS. (a) No decimation. (b) With decimation .................................. 72 The performance of LMS, FTP and PPS under various SNR conditions. . . 74 Blind equalization configuration. ........................ 75 PPS modified for blind equalization. Notice the delay introduced to account for the block processing. ............................. 77 Performance of PPS under blind equalization with all parameters initialized to zero. ...................................... 78 Performance of recursive PPS under blind equalization with parameters ini- tialized to non-zero values ............................. 79 Comparison of the impulse response estimates from PPS, FTP and LMS while equalizing a minimum-phase channel. The ideal equalizer has length 512. 80 Diagonalized system function in the transform-domain ............. 83 Adaptive structure based on generalized sub-band decomposition ....... 86 CHAPTER 1 Introduction Adaptive identification and equalization of systems has been a long studied problem. Many efficient algorithms and structures have been proposed for solving these problems and are summarized in [23, 38, 70]. The adaptive algorithms yield estimates of the parameters of the linear model based generally on a gradient or least squares methods. A major decision in identification is how to parameterize the properties of a system or signal using a model of suitable structure. Typically models can be classified as either input-output models or state-space models. Signal processing literature concentrates more on input-output models than on the others. Typical I/O models are 1. Linear difference equations; 2. Lattice models. The properties of an algorithm are dependent on the model structure. The speed of convergence, the robustness of the algorithm, are all structure dependent. Lattice type algorithms have better properties with regard to stability monitoring and numerical ro- bustness, while on the other hand linear difference equation (or transversal filter) models yield algorithms which are simpler and easier to implement and understand. For this matter most algorithms are identified with a structure. Adaptive filtering has found widespread application in dynamic system identification, channel equalization, echo cancellation, multi-path equalization, beam-forming and various bio-medical applications [21, 33, 39, 54, 55, 63, 73]. Most of these applications are discussed in [23, 70]. There is renewed interest in echo cancellation and channel equalization with rapid advances in the semiconductor technology. Numerous problems related to echo cancellation limit the efficiency of classical algorithms. We study these problems and propose a novel structure to overcome these limitations. 1.1 The echo problem Echo occurs in many scenarios. Cases of interest are, telephony and tele-conferences. In telephone networks a bidirectional two-wire circuit connects the subscriber to the local office. The local offices are interconnected by four-wire trunks, two for each direction of communication. A hybrid connects the two-wire local circuit, where the incoming and outgoing signals are together into four-wire trunks where the incoming and outgoing signals are separated. A typical telephone loop is shown in Figure 1.1. The hybrid circuits are required to separate the incoming and outgoing channels. However due the variations in the number of connections and the length of the two-wire lines, it is not possible to achieve ideal separation. There is leakage of the incoming signal into the outgoing circuit. This leakage along with the round trip delay produces an echo. The further the distance over which the call is made, the more annoying the echo gets. Similarly, in a teleconference echo occurs because of a feedback path from the loud speak- ers to the microphone. In this case the signal from the speaker leaks into the microphone. Figure 1.2 depicts the case of an echo in a teleconference. The leakage path has been found to be essentially linear [63], with an unknown amplitude and phase characteristic. The fact that the echo signal has non-stationary characteristics, attributable to the nature of the speech signal, and the changing environment (the echo path changes due to switching in the telephone network or movement of people in conferences), a stationary solution for cancelling the echo is ruled out. This gives rise to the application of adaptive filtering which provides an ideal formulation for echo cancellation [45, 70]. The adaptive echo cancellers have gained much popularity over the fixed echo suppressors for Echo Path Figure 1.1. Echo problem in a telephone system. Signal from remote site x(n) Interference or Echo Path A - I .daptlve Filter / 3(n) Figure 1.2. A teleconferencing echo problem. 71/ x(n) Hybnd i Hybrid AD + 11(n) / / + + 7n) , 5'01) L f g . + W) sm) /lr:Leakage Figure 1.3. An echo canceller. their tracking capabilities and performance, especially when the round-trip echo path is greater than 100 milliseconds [70]. The length of the echo path determines the order of the model. The longer the echo , path, the larger the order of the model. Typically the length of a long distance telephone connection is of the order of several hundred milliseconds. This requires that the linear model chosen for the echo be either an infinite impulse response (IIR) filter, or a very long finite impulse response (FIR) filter. IIR adaptive filters offer the advantage of a fewer number of parameters to be considered, but are severely disadvantaged by the absence of a fast and robust adaptation law [30, 59]. FIR filters on the other hand tend to require many more parameters for an acceptable performance, but have been traditionally considered as the method of choice owing to the stability of the adaptive process. Let us consider the generic case shown in Figure 1.3. Here the speech from the speaker s(n), is corrupted by the echo signal 17(n). Depending on the situation, signal x(n) might be the speech from a remote speaker (Figure 1.2), or it might be the echo of the speaker’s own speech (Figure 1.1). The signal 3(n) is known as the near-end speech. The return signal, denoted by y(n), contains an echo component which must be removed by appropriate devices. This may be accomplished with an adaptive filter (ADF). Since the echo path is essentially a linear system, we attempt to model it with an adaptive filter. The output of input signal - t - - x(m) = x(m'I') norse Tl( ) received Signal r(t) TINT! _, s A A M) \,\/ Figure 1.4. PAM communications system. the adaptive filter @(n) approximates the echo signal. Once the echo path is identified then the output of the canceller .§(n) is free of echo. Other applications of echo cancellation have been in cellular and mobile phones and in high speed data networks. These applications along with the problems presented below limit the performance of classical algorithms. These limitations are 1. The length of the echo path; 2. Colored and non-stationary inputs; 3. High sampling (or data) rate. 1 .2 Channel equalization In telephone channels, dispersion between successive samples, know as inter-symbol inter- ference (ISI), greatly complicates reliable transmission and reception of digital signals. ISI arises in all pulse modulation schemes, but is best explained for baseband pulse amplitude modulation (PAM) systems. In PAM [35, 65] the output signal is a pulse whose amplitude depends on the input signal value. For instance in a quaternary system the input is quan- tized into four levels -3, -1, 1 and 3. A model of a PAM communications system is shown in Figure 1.4. At instant mT a symbol x(m) is transmitted, where T is the sampling period and x(m) represents the quantized value of the sample x(t) at t = mT. A typical channel impulse response is shown in Figure 1.5. The received signal r(t) is the superposition of the impulse response of the channel to each transmitted signal and additive noise n(t). Mr) A Figure 1.5. Impulse response of a channel. rm = ijhu - m + 770) (1.1) Equalization of such channels involves filtering the output of the channel to recover the transmitted signal. Once again the time-varying characteristics of the channel rule out a fixed solution, [23, 39, 40]. Formally the problem of channel equalization can be stated as follows [23]. Consider the system in Figure 1.4. The problem is to recover the unknown input x(m), by identifying the inverse h‘1 of the channel h, given the observed system output sequence r(m). Thus channel equalization is effectively the identification of the inverse of the channel. When the channel has a moving average (MA or FIR) model, the the inverse is an auto- regressive (AR or IIR) model. Identifying an MA system with an AR model leads to very complex and unstable algorithms [30, 59]. Using an MA model to approximate an AR model yields a good solution, however the length of the MA model should be very long, as the impulse response of an AR model is of infinite duration. Another problem arises when the channel is non-minimum-phase [50]. In this case, the inverse system is noncausal. Under such circumstances algorithms based on second-order statistics yield only the minimum-phase equivalent of the channel, producing phase distortion. Solving the non-minimum-phase problem involves the use of higher-order statistics which yields more complex algorithms. Hence for channel equalization of minimum-phase channels, we need algorithms, which very much like echo cancellation, are efficient in identifying long impulse responses. While colored and non-stationary inputs limit the convergence speed of gradient type algorithms and their tracking capabilities [38, 71], sampling rates limit the time available between samples for all the computations leading to the adaptations of the model parame- ters. Therefore acoustic echo cancellation and channel equalization with FIR adaptive filters entails adapting a large number of parameters (order of thousands) at very high speed in the presence of colored and non-stationary inputs. These requirements, however, impose constraints on real-time equalization capabilities of a conventional adaptive system. To overcome this problem several non-conventional structures have been proposed [3, 20, 62]. These schemes typically are sub-band schemes and attempt to pose the problem as a parallel processing paradigm. In the subsequent chapters we shall develop a sub-band method which yields a perfectly parallel system. The solution obtained by this method is optimal and is based on the properties of the DFT filter bank. This method overcomes the problems faced the the existing schemes and provides a cost-efficient solution to the problems discussed above. Before developing this algorithm, we shall discuss the basics of sub-band schemes. 1.3 Sub-band schemes Sub-band schemes have been considered as an alternative to conventional schemes for cases where either the correlation of the input or the order of the system poses a prob- lem. The forerunner of the sub-band schemes have been transform-domain algorithms [48]. Transform-domain algorithms attempt to solve the problem by transforming the problem into a more suitable domain by applying unitary transformations. The transformation is chosen to exploit the properties of the model in the transform-domain. The simplest exam- ple is the time domain to frequency domain transformation via a discrete Fourier transform (DFT). These transformations simplify convolution in time domain to multiplication in the frequency domain. The first use of transformation based algorithms was to overcome the adverse effects of colored (correlated) inputs on adaptation speeds. See Section 2.1 for details. The transform- domain algorithms based on the DFT, discrete cosine transform (DCT), and discrete sine transform (DST) attempt to do that with great success. Consequently a more general class of algorithms is the frequency domain adaptive filtering [60]. In frequency domain adaptive filtering the input signal is transformed into a more de- sirable form before further processing. It is split into an array of signals which are band limited to a smaller range (band) of frequencies. The advantages of using a transformation are two fold. First the transformation generates signals that are generally less correlated [48, 60]. This lessens the colored input problem. Secondly the speed of an adaptive process is governed by the signal energy [23, 70], and normalizing the signal energy in each band in the transform-domain can make convergence uniform across the adaptive filter [48]. Having partitioned the (full band) signal across (almost) non-overlapping bands (sub- bands), one can gain additional savings by reducing the sampling rate in the sub-bands in accordance with the Nyquist criterion. This process known as decimation yields great computational advantage and is a basic operation in multirate signal processing. 1.4 Multirate adaptive filtering Multirate filters and systems have found widespread applications in communications, speech processing, image compression and the digital audio industry [69]. Interest in multirate adaptive systems is a recent development [11, 69]. In conventional digital systems the sam- pling rate is the same throughout the system. In contrast, in a multirate system sampling rate can vary from point to point. This is advantageous because we can optimize the sam- pling rate from point to point in a system. There is however a price to be paid in the form of aliasing errors. This can be minimized by proper design of the multirate system. Input Output x(m) YD(n) H [M —» Input Output x( m) 1’10!) (b) Figure 1.6. (a) A basic decimator block. (b) A basic interpolator block. 1.4.1 Multirate basics The basic building blocks of a multirate system are decimators and interpolators. Dec- imators or downsamplers, decrease the sampling rate, while interpolators or upsamplers increase them. Let us consider Figure 1.6. The output of the M -fold decimator is given as A 310(71): x(Mn) (1.2) and the output of the interpolator is given as A x(n / L), if n is an integer-multiple of L y1(n) = (1.3) 0, otherwise The z-transforms of the signals yo and y; are, respectively, given by 1 M-1 YD(2) = HZXRVMW") (1.4) k=0 Y1(z) = X(zL) (1.5) Graphically we can interpret this in the transform-domain (loosely speaking frequency domain) as decimation stretches the spectrum of the signal, while interpolation compresses it. When decimating, expansion of the spectrum in the transform-domain may lead to link _._.’7 t—l lute dec, 1.4. Figure 1.7. Effects of decimation and interpolation in the transform (frequency) domain.(a) The input signal X (81”), (b) The decimated signal (M22). Notice the aliasing effect. (c) The interpolated signal (L=2). Observe the spectral images. aliasing effects which may result in the loss of information. In order for a decimated signal to be undistorted due aliasing, it is important that it be band-limited to 1r/M, where M is the decimation factor. These effects are illustrated in Figure 1.7(b). In most cases a decimator is preceded by a filter called the decimation filter, which ensures that the signal being decimated is band-limited. Also an interpolator is followed by a filter, known as the interpolation filter, that suppresses the spectral images created by interpolation. Typical decimation and interpolation circuits can be seen in Figure 1.8. 1.4.2 Filter banks A filter bank is a collection of filters with a common input or a common output as 11 Input Output x(m) y(n) .———————- .EI(Z) IF ‘.y1v1 [—————I> (a) Input Output x(m) W") o———> f L ———> H(z) ———> (b) Figure 1.8. (a) A complete decimator with a band-limiting anti-aliasing filter. Typical cut- off frequency of the decimation filter is 1r/M. (b) A complete interpolator with an image suppressing filter. The cut-off frequency of the interpolation filter is 1r / L. x(n) Hdz) ——>+M x001) yam) o——> 4L ——> Pom mm —>+M_>x,(n) mu) o———> 4L m2) 3(a) L—p HM,,(z)_—>*M—> xM.1(n) YM-1(")H+L —> FM.1(Z) —>—L> (a) (b) Figure 1.9. (a) An analysis filter bank. (b) A synthesis filter bank. illustrated in Figure 1.9. The filter bank shown in Figure 1.9(a) has a common input and is known as an analysis filter bank. This system splits a signal x(n) into M signals zk(n) for further analysis. Typically the filters in the bank, split the frequency domain into almost non-overlapping bands. Therefore an analysis bank decomposes a full band signal x(n) into sub.band signals xk(n). Conversely, the system shown in Figure 1.9(b) synthesizes M signals into one signal i:(n) and is known as a synthesis filter bank. ”A? i 12 1.4.3 Perfect reconstruction filterbanks The filterbanks split the frequency domain into sub-bands. In a uniform filterbank these sub-bands are of the same width. The output of the analysis filter, i.e., the sub-band signals, occupy a smaller range of frequencies and hence can be sub-sampled in accordance with the Nyquist criteria. Therefore it is natural to have decimators following the output of an analysis filterbank. Similarly, the inputs of a synthesis filterbank are preceded by interpolators. It is known that the introduction of decimators or interpolators produces aliasing. Now consider a system where a synthesis filterbank immediately follows an analysis filterbank so that y,-(n) = z;(n), i = 0,1, . . ., M -1. In general the output :i'(n) is not equal to the input x(n), but when the filters in the filterbanks are chosen carefully the aliasing introduced by decimation and interpolation can be cancelled and 53(n) = x(n). Such filterbanks are known as perfect reconstruction (PR) filterbanks. From this point on, we shall only be interested in designing PR filterbanks and studying their applications in transform-domain adaptive filtering. As an important tool in this design we shall develop an alternate representation for multirate filterbanks which has a very efficient implementation. 1.5 Polyphase representation Polyphase representation of multirate systems, introduced in 1976 by Bellanger [5], led to the development of very efficient structures for implementing multirate systems. The basic idea of polyphase representation is as follows. Consider a system H(z) = z: h(n)z-". (1.6) Separating even numbered and odd numbered coefficients of h(n) we can rewrite (1.6) as H(z) = i0: h(2n)z"2" + 2’1 f: h(2n +1)z-2". (1.7) n=—oo n=-oo 13 Defining Eo(z) = Z h(2n)z'"’ E1(z) = Z h(2n+1)z'" we can rewrite H (2:) as H(z) = 130(22) + 271E1(22). Extending this idea to M, we can decompose H(z) as H(z) = Z?=-ooh(nM)z‘"M + 2‘1 2°:_oo h(nM+1)z‘"M + 2"(M'1) 22°=_oo h(nM + M — 1)z‘"M. As done in (1.8) we can rewrite H(z) as M—l H(z) = Z z‘kEk(zM) k=0 where Ek(z)= Z ek(n)z"" 71:-00 with ek(n)éh(Mn+k), OSkSM—l. (1.8) (1.9) (1.10) (1.11) (1.12) (1.13) The decomposition of H (2) given in (1.11) is known as the Type-I polyphase representa- tion (or decomposition) and Ek(z) are known as the Type-I polyphase components. Figure 1.10 summarizes how ek(n) can be generated from h(n). Alternatively H (2) can also be represented as M—l H(z) = Z 2(M"1"k)Rk(zM). k=0 (1.14) Of dii fol. If 1 1.1 poll 14 Input Output ’10") 2" ekfn) o—h ——> {M ———> Figure 1.10. Generation of polyphase components. Input Output Input Output If ) ( ) x(m) y(n) m.___. A M G(z) y" A *—*' G(ZM) {M ‘—’ (a) (1)) Input Output Input Output x(m) y(n) (m) )4") ~——~ 6(2) [L H A x-—- t L G(z’“)—-> (C) ((1) Figure 1.11. (a-b) Noble Identity 1. (c-d) Noble Identity 2. Rk(z) are just permutations of Ek(z) and the decomposition in (1.14) is known as the Type- II polyphase representation. For a discussion of why the term polyphase came into use see [69]. Polyphase decomposition yields very efficient structures for the implementation of mul- tirate systems. Before we delve into this aspect, we shall consider certain interconnections of multirate blocks and their simplifications. Consider the multirate systems depicted in Figure 1.11. Typically we have seen that in order to overcome the problem of aliasing, a filter precedes a decimator (Figure 1.11(a)). A different type of cascade is encountered in polyphase representations. In this type, a filter follows a decimator (Figure l.ll(b)), or a filter precedes an interpolator (Figure 1.11(d)). If the function G(z) is rational, then figures 1.11(a) and 1.11(b), and figures 1.11(c) and 1.11(d) are equivalent. These identities are known as Noble Identities [68, 69] Armed with the Noble Identities we can set about simplifying multirate systems using polyphase representations. Consider the system shown in Figure 1.8(a) with M = 2. Sub- 15 x(n) 50(2) Z-l Elfzz) [ 2 —> (a) x(n) [2 L-u 50m A 2 I‘" 51(2) (1)) x(n) R0(z2) 5* T 2 p. RM) 1 2 (c) Figure 1.12. Efficient structures for decimators and interpolators. (a) Polyphase imple- mentation of a decimator. (b) Simplification of (a) using Noble Identities. (c) A similar structure for an interpolator using a Type-II decomposition and noble identities. stituting the polyphase representation the system can be simplified as in Figure 1.12(a). This can be further simplified using the Noble Identities to yield the system in 1.12(b). If H (2) was an Nth order FIR filter, it would requires N + 1 multiplications and N additions per unit time, whereas each of Eo(z) and E1(z) in the structure in 1.12(b) require only (N + 1) / 2 multiplications and N / 2 additions per unit time. Notice that due to decimation not only have the computations been halved, but also the unit time has been doubled. A similar implementation for the interpolator in Figure 1.8(b) is given in 1.12(c). This uses the Type-II polyphase decomposition. This idea can be extended to arbitrary M. 16 ——> x0(n) '—> X101) w‘f —’ XM,[(II) “M-If '1) Figure 1.13. A DFT filter bank. 1.6 DFT filter bank One of the most commonly used filter banks is the DF T filter bank. This filter bank implements the transformation represented by the DFT matrix W. The DFT matrix W is defined as szkl, OSkJSM—l (1.15) where wk; = W“, and W = e‘m/M. Consider the transformation shown in Figure 1.13. The matrix W] represents the inverse DFT operator. The input signal x(n) generates M sequences u,-(n) such that u,-(n) = x(n — 2'). At every instant n, the system computes, M signals 93,-(12) as M-l :c,(n) = Z u;(n)W'ik. (1.16) k=0 Taking the z-transform of (1.16) we have M—l X;(z) = ZU,‘(Z)W-ik k=0 M—l _ . Z z”X(z)W"k k=0 ‘5. .. ,' wf' .‘ ‘1‘“ A .- Wh Wri the s F1311: l7 M—l _ = )3 (sz)“‘X(z). (1.17) k=0 Therefore we can write X.(Z) = H;(z)X(z) (1-18) where H;(z) é Ho(zWi) (1.19) with Ho(z)=1+z'1 + ...+z_(M"1). (1.20) Therefore the system in Figure 1.13 is identical to the analysis bank in Figure 1.9(a) where, the analysis filter Hk(z) is given by (1.19) and (1.20). Notice also that the the analysis filters are nothing but the z-transform of the corresponding row of the DFT matrix W]. Filter banks satisfying (1.19) are known as uniform filter banks. Now consider a M uniform DF T filterbank. The kth filter in the bank can be represented as 11,,(2) .—_ Ho(zw’=) M-1 2 (z-lw-krEle), (1.21) '=0 where E,(z) are the polyphase components of H(z) as per (1.11). The output Xk(z) can written as M-l Xk(z) = Z W-k‘ (z-‘E.(2M)X(z)). (1.22) i=0 Therefore a uniform DFT bank can now be represented as in Figure 1.14. Further if the sub-band signals zk(n) are decimated by a factor M, Figure 1.14 can be redrawn as in Figure 1.15. ct. DC «In I 18 E0(zM) —> »—> x001) 121(le > —> x101) w'i‘ l—> EM_,(zM) t] —> xM.1(n) Figure 1.14. A polyphase representation of an uniform DFT bank. 1.7 Statistical properties of multirate systems Multirate systems are generally time varying. The presence of interpolators and decimators make a time-invariant system into a periodically varying system [37]. Consider a generalized multirate filter shown in Figure 1.16. The output signal of this fractional sampling rate changer can be expressed as (see [11, 37]) y(n) = Z h(mL + InMIL):z: (Ly—15%] - m) , (1.23) mz—oo where |v| L denotes v modulo L and [u] denotes the greatest integer less than or equal to u. From (1.23) it is clear that a fractional sampling rate changer is a linear periodically time varying system (LPTV) with period M. It has been shown that for such a system the output is wide sense cyclostationary (CWSS) [18], if the input is wide sense stationary (W58) [29, 56]. We shall now make a few definitions and establish some results which are true for general multirate systems. These would be very important for analytical developments in the sections to come. Consider the decimator shown in Figure 1.17. Here, the autocorrelation of Hg, 2' = 1,2, 19 150(2) —> —>xo(n) E1(Z) F—D +3510!) wf L“ b M —> EM_1(Z) —> xM_l(n) Figure 1.15. An efficient implementation of a uniform DFT filter bank, when its outputs have been decimated by M. )1 n) x(m)_. *M h(k) AL —> V Figure 1.16. A general multirate system. is defined as ru,(kM) é E{u.-((n + k)M)u[ (72114)}. (1.24) Also, the cross-correlations between a: and Hg, denoted by pm.(k), and between ul and U2, denoted by pu1u2(kM), are respectively defined by A i pm'(k) = E{a:(nM + k)u3- (nM)}. (1.25) Fact 1.1 For the system shown in Figure 1.17 in accordance with above definitions, the power spectral density (psd) Su,(w), and cross p.s.d’s qu,(w) and Sn,“2 (w) are found to be 2 w—27rk Gi(w—21rk Sx( M ) ——M) 1 M—1 Shit") = HZ k=O 20 _. Gum) _VI_.[M 1'. Mm) 4’0") x(n) “2 Y2(m) —- 02m» ”all; > Am) Figure 1.17. A two sub-band block with a common post filter. saw) = alumna) 1 MTI w—21rk tw—27rk w—21rk 51111120”) = M 12:; G1( M )G2( M )5-1'( M ) (1'26) where Sx(w) is the p.s.d of x(n). Proof: Let g,(n) be the impulse response of G,(w). Then p...(k) E {x(nM + k)ul(nM)} = E{Zgl(j)$(nM + k)xl(nM -J')} J = :gi(j)r.(k +1) = ZgI(—j)r.(k -j). (127) Therefore smite) = GI(w)S,(w). (1.28) Now ru,(kM) E {u,-((n + k)M)u,I(nM)} = E {gunman + k)M — j)u!(nM)} = Zga(j)p.u.(kM -1)- (1°29) Th Therefore, we have [53] lM—l Sua(w) = Moreover, pxu2(k) Therefore 3| El 21 w— fifth) w — 27rk Gi(_ )S:ua(T) (w— M21rk) w— M21rk 25———x( ) k=0 E{:r(nM + lc)u:(nM)} 7.309) ® 92 [(5.16) qu2(w) = 0355.32). Now Pu1u2(kM) Therefore 3 EIH rm 5111112 = 3 l 1 M a. II o -1 1 E{ul<(n+ k)M)ul(nM)} E {291037401 + k)M - j)Ui(nM)} j 291(J')pxu2(kM - j). j w- M27rk w—M21rk Gl(-—— )qu2(——- ) w- M27rk w— M21rk )Gl(—— ——)S (w-M 21rk G1( ). (1.30) (1.31) (1.32) (1.33) (1.34) 22 Fact 1.2 The cross p.s.d Sxy,(w) and the p.s.d Sy,(w) are given by M 5.,(3) .-_ Ai(w)s,u,(w) (1.35) 1 M 1 w—27rk w- M21rk 5y.(w) = |A (‘0)le Z G.-(—————) 5x( )- (1.36) k: 0 M Proof: Let a(n) be the impulse response corresponding to A. Then p39,.(k) = E{IL‘(nM+ k)!!! (nM)} = E{Zal(j)x(nM + k)u,l(nM — j)} j = Dismal/cw) j = aR—k) spans). (137) Therefore, smog) = Al(w)5m,.w. (1.38) Also S3(a)) — |A(w)|25u.(w) = A(Mw)|2_ (w— M21rk) )I2:cS( (w— M27rk). (1.39) O Fact 1.3 For the system shown in Figure 1.17, it is true that 1""1 w—27rk [Lu—27F]: 2 w~21rk S... = H Z G.( M )G.( )|A(w)| Sx( ). (1.40) k=0 Proof: Note that 53,13,200) = iA(w)i25u1u2 (w) 23 Therefore, from Fact 1.1 it is concluded that 1 M 1 —2 k 2 k 2 k -H1(ZG(“’ M“ )Gi(“’M ” )|A(w W (“M ” ). (1.41) k=0 & It should be clear that the p.s.d of the output has aliased components and unless some care is take its effects on the system can be degrading. We shall begin the next chapter by first considering the need for a new sub-band adaptive scheme. Based on the multirate basics developed here we shall then design a sub-band adaptive filtering scheme. CHAPTER 2 The Polyphase Algorithm Sub-band techniques are an effective solution to identify very long impulse responses, spe- cially for echo cancellation and channel equalization. Their strength lies in their ability to break a given problem into weakly linked sub-problems of smaller complexity that can then be solved almost independently of one another at faster rates. The forerunner of the current sub-band algorithms has been the Transform-domain (TD) algorithms. TD algorithms were however considered for solving another limitation of adaptive algorithms, namely the convergence rate. Input signals whose autocorrelation matrix has a very large eigenvalue spread slows the convergence of adaptive algorithms [23, 38, 48, 70]. Such signals are know as ill-conditioned signals (strictly speaking these are signals with an ill-conditioned autocorrelation matrix). 2.1 Transform-domain adaptive filtering A host of TD algorithms [7, 8, 13, 46, 47, 48] has been studied to improve the convergence of adaptive process when the input is ill-conditioned. The premise of these TD algorithms is as follows. Suppose an orthogonal transform W is chosen such that the autocorrelation matrix of the input signal in the transform-domain is the identity matrix, then the adaptive filter in that domain will have the best convergence rate possible. The corresponding W is known as the Karhunen-Loéve Transform (KLT) [4]. The KLT is a signal-dependent transformation and requires a priori knowledge of the second-order statistics of the input. 24 25 In most practical applications such a priori information is not available. Also the signal might have time-varying statistics that make a fixed transformation impossible. Under such circumstances transforms such as the Discrete Fourier Transform (DFT), the Discrete Cosine Transform (DCT) have been observed to yield better convergence than the regular time domain algorithms [46, 48]. Also these transformations are computationally efficient, with a complexity of 0(N log2(N )) as compared to 0(N2) for any general transformation. Simulations show that DCT is better for speech and low-pass signals [46, 48]. 2.2 Sub-band adaptive filtering Sub-band algorithms are an improvement on TD algorithms. The orthonormal transfor- mation W is implemented using a bank of filters. The mth filter in the bank has as its coefficients the mth row of the transformation W. Typically the filter bank represents a bank of comb filters or bandpass filters. The transform-domain signals which are the outputs of the filters have a bandwidth which is at most equal to the bandwidth of the corresponding filter. In a M filter bank, if each filter has a bandwidth of 21r/M, then each of the transform-domain signals has a bandwidth of at most 271’ /M . Since the original signal occupies the full band of 21r, we can subsample each of the transform-domain signals by a factor of M (See Section 1.4.1). Thus a sub-band algorithm is essentially a multirate TD algorithm. While identifying very long impulse responses, the sub-band algorithms decouple the system into smaller subsystems distributed across the sub-bands. The adaptive process then adapts to these lower order suboband systems. The advantages of sub-band algorithms are 1. Faster convergence due to improvement in the condition number of the input correla- tion matrix; 2. More processing time in the transform-domain due to decimation; 3. Smaller order subsystems to identify, thereby reducing the computation complexity. h! OI CO Wit 26 +$M ADF x(n) —>+M ADF _., W +]M ADF DM-Ifm) Figure 2.1. A general sub-band adaptive filter. A typical sub-band system is shown in Figure 2.1, where D0,D1,...,DM_1 are the transform-domain desired signals, and Y0,Y1,. . .,YM-1 are the outputs of the sub-band adaptive filters which are of a much smaller order than the original system. The filters are adapted such that the cost function J = E{e2} M—l E{ Z €k(m)}2 (2-1) Ic=0 is minimized, where ek is the error in the kth sub-band. Ideally, we would like each of the sub-band adaptive filters to work independently of one another, i.e., in parallel. In this case each adaptive process minimizes its corresponding cost function J1. -_- E{ei(m)} k = 0,1,. ..,M — 1 (2.2) with the overall cost function 27 z E1 E{e§(m)} k = 0,1,. . .,M — 1. (2-3) k=0 Equations (2.1) and (2.3) are equal only when E{e,-(m)ek(m)} = 0 i 75 k, i,k = 0,1,. . .,M — 1. (2.4) Equation (2.4) is true if and only if the spectral overlap between the filters in the bank is zero. This implies that all the filters in the bank be ideal bandpass filters. But such filters are non-causal and hence unrealizable. Most sub-band schemes are based on the cost function in equation (2.3), in which case the solution is sub-optimal since e;(m) and ek(m) are not orthogonal in general. Next we shall consider some of the existing sub-band techniques and discuss their advantages and drawbacks, and then present a new structure that will overcome their limitations. 2.3 Prior work The techniques to be discussed in this section can be categorized under FIR and HR algo- rithms. We first discuss the FIR algorithms and then the HR algorithms. 2.3.1 FIR algorithms Gillore and Vetterli [19] present some of the pioneering work in sub-band echo cancellers. The main limitation observed by the authors was the aliasing between the sub-bands. This yielded a sub-optimal solution as seen in equation (2.4) and was observed in the form of peaks, in the spectrum of the residual signal, at frequencies corresponding to the overlap. These peaks indicated the failure of the canceller at those frequencies. Two solutions were suggested in this paper. The first method was to introduce band stop filters to suppress the peaks. These filters however degraded the quality of the speech signal and required additional care in processing the near-end speech. The second method was to oversample the sub-bands, .i.e., use a 28 decimation factor K which is less than M, the number of sub-bands. This reduced the aliasing between the filters in the bank. Improvement in the quality of the solution was at the expense of more computations compared to the savings of maximal decimation (decimation by a factor M). In their subsequent work [20] the authors present a scheme to overcome the limitations of their previous work. They propose a transformation which yields a tridiagonal system in the transform-domain which implies that cross filters maybe used just between adjacent filters to model the aliasing between them. Such a system yields an optimal solution and also uses critical sub-sampling. However the computational complexity is increased, as 2(M — 1) cross filters are needed. Once again M is the number of sub-bands used. In [72] the degrading effects of spectral overlap between the sub-bands is overcome by introducing filterbanks with no overlap. This technique yielded a much lower residual echo but the solution is suboptimal since some of the input information was lost due to the spectral gaps. In this case the output signal had spectral gaps at frequencies corresponding to the edges of the sub-bands. These problem is severe when the number of sub-bands was large and needed additional compensation. In [62] Somayajulu et al. present a technique in which they introduce auxiliary sub- bands which are not decimated, in addition to the main sub-bands in order to overcome the problem of spectral gaps. There are M non-overlapping sub-bands, and (M — 1) auxiliary sub-bands. The main sub-bands are critically decimated (by M) while the auxiliary bands are not. There are a total of (2M — 1) adaptive filters in the sub-band, (M — 1) of which are working at the input sampling rate. The auxiliary sub-bands have been introduced to reduce the problem of spectral gaps between the non-overlapping main sub-bands. While this structure does not completely avoid aliasing it also has the auxiliary sidebands as an added cost. A very efficient FF T based block processing scheme is the Fast LMS (FLMS) [36]. In this scheme a FFT’s are performed on non-overlapping blocks of data, thereby yielding great savings while compared to the frequency domain LMS (FDLMS) algorithm. However 29 to overcome the aliasing effects of circular convolution, overlap-and-save and a overlap-and- add methods are used. This effectively increases the length of the FFT to be taken to be 2N compared to the block size of N. It was observed in [2, 3] that while frequency domain algorithms performed much better than the time domain algorithms, the propagation delay grew with the size of the trans- formation. In order to reduce this delay a new structure called the frequency bin adaptive filter was proposed. This algorithm is a modification of the block LMS. The key point of the algorithm is as follows. Since the propagation delay is proportional to N, the size of the FFT, the frequency domain LMS is modified such that instead of taking an N -point FFT, M N ' FFTs are taken reducing where N = M N '. In practice, a 2N '-point FFT is taken increasing the computational complexity by a factor of 2. The algorithm proposed here is essentially a combination of Frequency Domain LMS and Block LMS. While it tries to exploit the features of both schemes, the effect on the autocorrelation is not clear. In addition the aliasing between adjacent blocks in this structure are not addressed and neither are the issues related to convergence. It was, however noticed that power normalization in each block yields an improved performance. 2.3.2 IIR filterbanks Another approach to the problem of aliasing between the sub-bands is to design filters with sharper cutoffs. In such filterbanks a very good solution can be obtained ignoring the very small aliasing. Since FIR filters with sharp cut-off have a very high order, IIR filterbanks have been considered. In [28, 49, 67] the authors present techniques that yield orthogonal IIR filterbanks. In [67] Tuncer and Sandri prove that Butterworth filters can be used to design and orthogonal filterbank, while in [28] Iyer et al. designed sharper cut-off filter banks using elliptic filters. It is shown in these papers that these filterbanks constitute a wavelet filter bank. These filterbanks yield a much lower aliasing and at the same time are orthogonal. The one problem posed by these filters is that the synthesis part is non-causal. In such filter banks even a delayed reconstruction is not possible. In [67] this problem is overcome 30 by truncating the impulse responses of the IIR filters to obtain an FIR filter bank. Recent work by Tuncer and Nguyen [66] has shown techniques for developing perfect reconstruction IIR filter banks. 2.3.3 IIR echo canceller IIR filters are very efficient while trying to represent long impulse responses. However they are severely limited by the speed of convergence and stability of the adaptive filter. In [16] Fan and Jenkins explore the issues relating to the use of IIR echo cancellers and the performance of a new class of IIR adaptive filtering algorithms developed in [14, 15] is investigated. In order to overcome the “strict positive realness” condition required for convergence by the well known SHARF algorithm [32, 34], Fan and Jenkins proposed a modification of the Steiglitz-McBride algorithm [64]. The authors conclude that although the IIR echo canceller has the potential to suppress echo better, the improvement depends on the environment in which it is operating. It was observed that the performance is degraded when 1. The length of the echo path is not exactly known; 2. The measurement noise is considerable. Even though computationally more efficient, the IIR filters converge more slowly than the FIR filters. The problem gets worse when the order of the IIR filter is greater than two. Having discussed the existing approaches to solving the problem of echo cancellation we are now ready to present a new technique that overcomes the limitations of the existing FIR sub-band filtering algorithms. But before we present the algorithm we shall introduce some basics of block processing schemes based on which will be the development of the new technique. 31 2.4 Equivalence of block processing and sub-band schemes Block processing of signals has been a common approach in many applications [10, 43]. Block digital filtering has been used to implement filters in a manner that enhances parallelism in computations. In many high speed applications block digital filtering is a preferred method. Consider a linear system h(n) with input x(n) and output y(n). Define vectors 33(n) and y3(n) as Pa:(nM+M—1)- -y(nM+M—1)1 ng+M—2 M+M_2 33(71): ( , ) ,me: 3“" , ) (2.5) _ x(nM) . _ y(nM) J 23(n) and y3(n) are known as the blocked versions of x(n) and output y(n), respectively. Let H 3(2) be a system that generates Y3(z) from X 3(2) according to YB(Z) = HB(Z)XB(Z) (2-6) where HB(z) is an M-input M-output system. It can be shown that 113(2) is an LTI system and can be represented by an M X M transfer matrix. H 3(2) is called the blocked version of H (2) Let us define the signals xk(n) and yk(n) by xk(n) x(nM + k) “D yk(n) y(nM + k). (2.7) x(m)o——> 32 Figure 2.2. (a) A block processing system. (b) Its multirate equivalent. Then 23(n) and yB(n) can be rewritten as a33(71) = xM_1(n) 231(72) xo(n) l l yam} Unblocking y(n) EH Mecahnism + (a) YM-1(n) ——> 4 M z-l YM-2(fl) _. 1 M HB(Z) -1 Z -1 Yo(n) z —» f M y(m) (b) yM-1(n) 9 31301) = (2.8) 3M") [ yo(n) Now X(z) and Y(z) can be rewritten in terms of the z-tmnsforms of xk(n) and yk(n), denoted by X k(z) and Yk(z), respectively, as X(z) Y(Z) M—1 2 z""X,,(zM) k=0 M-1 )3 24144.21"). k=0 (2.9) 923(n) and y3(n) are the polyphase components of x(n) and y(n) respectively. This implies that block filtering scheme in Figure 2.2(a) is equivalent to the multirate system in 33 Figure 2.2(b). The structure shown in 2.2(b) indicates that the system H 3(2) is operating at a rate M times lower than the input rate. So the sampling rate of the input signal x(n) can be M times larger than the speed of the basic computational unit. Therefore we have increased the computational rate (and parallelism) using a block processing (or its equivalent multirate) scheme by increasing the number of computational units. As a result we are able to process signals which arrive at a rate M times higher. We know from (1.4) that even though decimation produces aliasing in the analysis banks, we are able to recover the signal y(n) without any distortion at the synthesis end of the system. Consequently, the aliasing introduced in the analysis section is cancelled by the synthesis section. 2.5 Further properties of the uniform DFT filter bank As mentioned before the uniform DFT filter bank given by the (1.19) and (1.20) has many interesting properties. They form a perfect reconstruction system. Before we discuss the DFT filter we first consider the commutator switch models. This makes understanding of the filter bank structure more straightforward. Commutator models are useful tools to study the operation of polyphase implementa- tions. A delay chain followed by a set of decimators is equivalent to a commutator switch rotating counter clockwise. This equivalence is depicted in Figure 2.3(a). Figure 2.3(b) shows the equivalent commutator model for a set of interpolators followed by a delay chain. Therefore a DF T filter bank can be represented using a commutator model as in Figure 2.4. For such a filter bank it can be shown that :i:(n) : Mz(n — M + 1) (2.10) To further study the properties of an uniform DFT bank, let us consider the filter bank, whose filters are given according to the equations (1.19) and (1.20). The frequency response 34 Vz'l o—hh3 ———> a '———> Vz’ o———> (a) ] —>+3 -—>—o -I if v __.+3 _..._ (b) Figure 2.3. Commutator models for (a) decimators, (b) interpolators. of kth filter in the bank can be expressed as ° M . _ an) = S”? ,3, “L “5' I) (2.11) sm 2 where 27rk Figure 2.5 shows the frequency response of a typical M -band DFT filter bank. Notice that there are M points along the frequency axis, corresponding to the center frequency of each bin, where aliasing is zero, i.e., at those points only one of the bins has a non-zero response. The DFT filter bank can be thought of as a spectrum analyzer. The output zk(n) of the kth bin represents a smoothed spectrum of the input X (ejw) about the center frequency of 35 +—> —o o— —> ”do—D ——o .— —>‘>v(n) o W o w'[’ o O—D —00— —> Figure 2.4. Commutator based implementation of the analysis and synthesis section of a DFT filter bank. the corresponding bin (21rk/M). We shall exploit the following properties of the DFT bank to obtain a structure which would give us increased computational savings in terms of both speed and parallelism. l. The DF T filter bank offers computational savings because of the increased parallelism and decimation; 2. The DFT filter bank is alias free at M distinct frequencies; 3. The sub-band signals are an average of the spectrum of the input at frequencies corresponding to the center frequencies of each band. What we propose to achieve is the spectral estimate of the unknown system at fre- quencies corresponding to the center frequencies of each bin of the filter bank. Then in accordance with the frequency sampling theorem we can identify the impulse response of the unknown system, provided we have sufficient number of spectral estimates. In Section 2.6 we present an adaptive structure that attempts to exploit the features mentioned above. This structure known as the Polyphase structure (PPS) was presented by Iyer et al. in [24, 25, 26, 27]. 36 I I I I r I E Go(w) 01(0)) 029’) G300) \ 7‘ 1 _ 3 / \ " \ W: / \ .l \ Q3 / \ .l \ 8 I ‘\ I \ g 0.8 r [I \ .’ \ . 8 I ‘ -’ ‘ a I \ I”! \ \ I. Q) , . v 006 '- l \ I 2 \ - :3 I \ - ' I . ED , \ _’ \. g I ‘ .’ \ 0.4 ~ I ‘ .’ \ . "3 I ‘ I \ a) \ . . I q ' E ’ ’ \ Q; / '\_\\ I, l/ \\ , l’ \ \ 802’ I] \ I \ 1" \ .. .’ \\ _ Z ” \K I \ .’ \ ‘5/ \\ I, \' / [I \ . _-l \\ l . I \ / \ \ \ / .\ o 1 l l l l l x 0 1 2 3 4 5 6 Frequency w Figure 2.5. Normalized frequency response of a 4-band DF T filter bank. 2.6 Adaptive scheme based on frequency sampling theory Figure 2.6 shows the proposed adaptive structure, where the employed transformation is an M -band DFT filter bank. In this structure, x(n) denotes the input, y(n) the reconstructed signal, and d(n) the desired signal. In addition, X k(nM ) and Dk(nM) are the kth sub-band signals. The filters involved in the structure are, h(n) = Estimated Impulse Response hD(n) = Desired Impulse Response Gk(w) = kth DFT filter HD(w) = Fourier transform of h D(n) AN(w) = Fourier transform of Accumulator 37 A A X0 3f0 ”0 i __.. _ 4- A " ’ ’10 X1 X1 ”1 A —~@— + a w M - Point 5 5 M - Point XM l 7M.) — > hM-l 50 73: M - Point DFI‘ ) —> Commutator [E] —> Ratio _>Accumulator An Copy coefficients Figure 2.6. Polyphase adaptive structure. Note that the M -DFT in conjunction with the commutator is the polyphase decomposi- tion of a maximally decimated filter bank whose prototype filter has a flat impulse response of length M. The error signal for the kth bin, parameterized by the scalar If)“ is formed by ek(nM) = Dk(nM) — ffkl(nM)Xk(nM), (2.13) where Xk(nM) Xk((n—1)M)+Xk(nM) é Dk(nM) Dk((n — 1)M) + 1),.(nM) (2.14) 38 denote the accumulated quantities of the input and the desired (reference) signal, respec- tively. Having defined the critical quantities in the proposed structure, the statement of the problem is as follows. Problem Statement In reference to Figure 2.6, if the cost function for each bin is defined as Jk(nM) é iek(iM)e:(iM) (2.15) i=1 find [h(nM) such that Jk(nM) is minimized. Obviously, this optimization performed for each bin independently, will be a success- ful procedure if ek(nM), 0 S k S M - 1, are uncorrelated asymptotically. If we let If), 2 flied-1’ jfiimag, and assume that Jk(nM) is a differentiable function of these two variables [23], we can write 1" é ——6~Jk j—afl‘, . (2.16) 3H): 6H?“ 8H}: ‘5 Since by definition "‘ JunM) = 2(1),. — 11,1 2am. _ £1li i=1 = XXI—9m: - DkaX: - 5:1?le + lHk|2XkX,[), (2.17) i=1 it is found that Lil: = 21-21-7ng + 2XlefIZ). (2.18) 0H), {:1 Setting ii = 0 yields, 3”“ ° ( M) “l _ Pk n ‘We have temporarily dropped the time indices for simplicity. 39 where pk(nM) = £2 Dk(IM)X,I(iM) i=1 i‘k(nM) = $iX—k(iM)Xz(2M). (2.20) i=1 Notice that the quantities 15;,(nM ) and 1"),(nM) can be regarded as temporal cross cor- relations of Dk and Xk, and temporal auto-correlation of Xk, respectively. pDka(nM) and er(nM) are the corresponding statistical correlations. The computation of 15;,(nM) and rk(nM) is a simple task which is done recursively to compute f1 h(nM ) in (2.19). Table 2.1 summarizes the PPS algorithm. The asymptotic convergence properties of this algorithm are investigated next. 2.7 Convergence analysis We claim that [h(nM), given by (2.19), converges to 1113(3)?) as n —> 00. Note that the DFT filter bank is alias free at 3%, k 6 {O,1,...,M - 1}. It should also be noted that the transform-domain signals Dk and X], are stationary if the input x(n) is stationary. Even though the decimators form a time-varying system, a decimated signal is wide sense stationary (WSS) if the original signal is WSS. (See [29, 56].) We have from (2.19) that the estimated f1 1, are the ratios of the temporal correlations. The following theorem establishes that the limit of the ratio of the temporal correlations is the same as that of the statistical correlations. Theorem 2.1 Assume that the input x(n) is an ergodic random process. Define é . fikUlM) _ ”131010 fk(nM) . (2.21) Then it is true that Ira: ’N 0 7 : Mil (2.22) NP... 175(0) 40 Table 2.1. PPS algorithm. 0 Form the input blocks at instant nM as 33(71) z [z(nM),:c(nM "1)9' ° -,x((n - 1)M +1” d3(n) = [d(nM)9d(nM _ 1)9"'9d((n— 1)M +1)] where x(nM) is the input and d(nM) is the desired output 0 Find the FFT of the input blocks [X0(nM)9Xl(nM)9”'9XM-1(nM)] : FFT(:EB(TZ)) [Do(nM), Dl(nM), - - -, DM_1(nM)] = FFT(d3(n)) o fori= 0,1,...,M— 1 do 1. Form the accumulated outputs X,(nM) = X,((n — 1)M) + X,-(nM) D;(nM) D.((n -1)M)+ D;(nM) 2. Update the auto- and cross- correlations 72'.(nM) (1 _ 55,-((71 —1)M)+%|)f'i("1‘4)l2 2.02M) = <1 — int-((12 — 1)M) + %D1(nM)X.-l(nM) 3. Then H1 [3g(nM) 3' ("M) = m 0 Obtain the impulse response estimate h(n) as h(n) = IFFT{[fI0(nM),H1(nM),-~-,HM_1(nM)]} 41 where _ A N—l D,’,"(nM) = Z Dk((n — i)M) (2.23) i=0 _ A N-l XflnM) = Z Xk((n — 0M) (2.24) i=0 are partially accumulated quantities, and A -' ‘N pofxym) = E{DI.V(nM)X.‘ 0. Splitting the integral over [0,6] and [6, 71'] we have t n_ 1r 6 ‘ 2 nt 1 = _ 16] W11” i £1‘°’—',"—,—dt, (2.39) mr 0 sm 2 mr 5 s1n 2 where g t) = W — f(0) . Since f(t) is continuous at 0, then g(t) —> 0 ast —9 0. This 2 simply suggests that, for e > 0, there exists 6 > 0, such that g(t) < 5/2, for all t 6 (0,6). Hence, g(t)si2n: —2—n7dt 6 f0" sin2— 2 e —- —dt= 2.4 mr __/06 sin2 - 2n1r sin:2 ; 2’ ( 0) and on (6,71') "g(t) sinz— "‘ /" I — —dt < t dt Inlvr 5 sinzé _ nrsinzg 6 g() f" lg(t)ldt 727? $111 5 44 Also, since f E £([—1r, 7r]), there exists N E N, such that, for all n 2 N, _f____5 lg(t)ldt< 5 -. (2.42) n1r sin2 _—_%— <2 Consequently, for all n 2 N, we have that III <59(t)51n22"—t+1/"9(t)8in22"—‘dt ‘ mr o sinzM mr 5 sinzé e e _ 5+5 = e. (2.43) Therefore 1 7' Lgflsinznf L=u _/ ————-—dt= 0. 2.44 .12.. -. $.22 M < ) Notice that as n increases the integral collapses to the value of the function at t = 0. This - n_t can be interpreted as the zooming efi’ect of the accumulator function A"(t) = %i—. In fact, 2 as it tends to infinity this term converges to a delta function. As a consequence of Lemma 2.2 we have the following. Lemma 2.3 Assume that f1,f2 E £([—7r,7r]), are continuous at 0 and periodic with period 21r. Also assume that f2(0) ¢ 0. Then fur f1(:)Sin-2md 2 L, = “in 1 (pain—'22 2 :zfly (2'45) n oo 11' 2 2 ) Proof: L’ can be rewritten as _1_ 1r fi(:)8in“ 3d L' = lim (2.46) 11—900 _1_mr1r1rf2(t)sin:-MT NI 45 Then by Lemma 2.2 and the fact that f2(0) 75 0, we have the required result. ‘ The above lemmas are summarized in the following theorem to give us the points of convergence of the the estimates of the PPS algorithm given in (2.19). This theorem shows that the algorithm does indeed converge to the samples of the frequency response of the unknown system as intended. Theorem 2.2 Assume that the p.s.d. of input x(n), Sx(w), and the frequency response H D(w) of the unknown system are sufficiently smooth functions which belong to £([—1r, 7r]), then . I3k(nM) _ 1‘ fl _ 221530 i'k(nM) .. HD M ), k e {0,1,...,M 1} (2.47) 2754—?) 7e 0, k e {0,1,...,M — 1}. Proof: From Theorem 2.1 we have that ‘ p‘N' (0 11 WW") - ° ”*X" ) (2.48) n~oo fk(nM) — N—IPOO W Then from Lemma 2.1, and by substituting (2.35) and (2.36) in equations (2.31) and (2.32), respectively, we have that lim Pram”) _ If. f1"(w)SN(w)dw :(0) ‘~~ooff.f§(w)SN(w)dw‘ (2'49) Realizing that f1" and ff are periodic, smooth and E £([—7r,1r]), we have by Lemma 2.3 that . PENN) _ me) ~th W ‘ m ”'50) Substituting for fl"(w) and ff(w) from equations (2.35) and (2.36), respectively, yields — 1ri 1ri 2 7ri 12m PDMMO) _ hZf-ZolHM-grr Cid-217 5x(‘2714‘) (2 51) —>oo - — - 1ri 2 7 7ri ° ' N ”5(0) fizgol le(-'21vr 534—217) 46 But GA—fifii) = M6,}, where 1 if z' = k 6i}: = (2.52) 0 if 2' ¢ 1:. Therefore we have that I, 15k(nM) _ Meme?) m T—_ ‘ " 21rk "#00 rk(nM) Sal-T!- —2wk — HD( M ). (2.53) if Sx(—3i;7'5) 75 0, k E {0, 1,. ..,M — 1}. Then from (2.19) and (2.53) we have that . 2 k ”1&1;on = JIM—XI). (2.54) The resulting zooming on the center frequency of each band leads to perfect estimation of the parameters. Notice that the condition SA—gfi'i) ¢ 0 for k E {1,2,. . ., M — 1} forms a Persistency of Excitation (RE) [23] requirement on the input signal. Also, since we are evaluating the frequency response of the unknown system at M particular points in the interval from [—1r,1r], it is only natural that we require the input to excite those particular points. The impulse response can be obtained by taking the IDFT of the ff), obtained above. If the number of bins used is sufficient, then the estimated impulse response is identical to the impulse response of the unknown system. 2.8 Time-varying environment It will be shown in Section 3.1 that the PPS algorithm works very well in time-invariant setting. However, for time-varying environments we need to incorporate a forgetting factor in the accumulated quantities. As done in the classical RLS [38], we introduce a forgetting 47 Table 2.2. Number of computations per block (M samples) for PPS. Operation Additions Multiplications Divisions Adaptive algorithm %M + 11 7M + 14 M + 2 Filter Bank 4M log2 “—21- 4M log? % - Inverse FFT 2M log 521 2M log 521 - Total %M+11+6Mlog2—2- 7M+14+6Mlogzg M+2 factor A, where /\ 6 (0,1). The accumulated equations are modified as XkUIM) = AXk((n-1)M)+Xk(nM) Dk(nM) = ADk((n — 1)M) + Dk(nM). (2.55) The cost function is redefined as Jk(nM) = ihn-iekUMkkiM). (2.56) i=1 Proceeding as done in Section 2.6, we observe that the correlation quantities would be accordingly evaluated as Pk(nM) fiDunMWlmM) + A(1- 3pm —1)M) WM) = %X.(nM)X,I(nM)+A(1—gym—1)M) (2.57) and the estimate of the parameter is the same as in (2.19). The choice of the forgetting factor A depends on how fast the system parameters are changing. A rule of thumb would be to decrease A as the system parameters change faster. 2.9 Computational complexity The computational elements involved in the PPS algorithm are the F SF filter bank, and the accumulators. It should however be noted that most of the computations are done at different rates. The FSF filter bank is implemented using an FFT algorithm, thereby giving 48 us computations of order M 1092M , when an M -band filter bank is considered. The input sequence x(n) is partitioned into blocks of length M to compute the M -point FFT (see Table 2.1). Since the accumulator and the updating algorithm work at a decimated clock rate, great computational savings are rendered. Table 2.2 summarizes the number of real arithmetic operations [required per input—block of length M. 2.10 Recursive PPS PPS can be modified into a recursive form similar to the extension of least squares method to recursive least squares [38]. The non-recursive PPS always initializes the parameters to zero, where as the recursive version can initialize the parameters to any arbitrary value. This helps in proper initialization when some a priori knowledge of the unknown system is available. The recursive version of least squares method is summarized in Table 2.3. In the next chapter we evaluate the performance of the PPS algorithm and perform simulations to verify its convergence properties. tThe number of complex arithmetic operations have been converted to appropriate number of real operations. 49 Table 2.3. Recursive PPS algorithm. 0 Form the input blocks at instant nM as 233(7).) : [x(nM),:r(nM "1), ' ' 33((7‘ — 1)M + 1)] data) [d(nM),d(nM - 1),---,d((n- 1)M + 1)] where x(nM) is the input and d(nM) is the desired output 0 Find the FFT of the input blocks [Xo(nM), X1(nM), . - o,XM_1(nM)] = FFT(:cB(n)) [Do(nM), Dl(nM), . - -,DM_1(nM)] FFT(dB(n)) o fori=0,1,...,M— 1 do 1. Form the accumulated outputs XJ‘nM) = X.((n -1)M)+ X;(nM) D,(nM) = D;((n - 1)M) + D.(nM) 2. Update the inverse auto-correlation and cross-correlations 1r(nM) = X,l(nM)P((n_- 1)M) _ x(nM) = P((n — l)M)X.-(nM)/(z\ + 1r(n)X.-(nM)) d(nM) = 12,-(nM) — Il,l(nM)X.-(nM) P(nM) = %{P((n —1)M)- n(nM)1r(nM)} 3. Then . . H,i(nM) = H,l((n — 1)M) + n(nM)al(nM) 0 Obtain the impulse response estimate h(n) as h(n) = IFFT{[1§Io(nM), 111(nM), - . ., HM_1(nM)]} CHAPTER 3 Performance Evaluation of PPS In this chapter we present simulations to highlight the characteristics of the Polyphase Structure algorithm. Its convergence to the optimal solution along with its robustness to disturbance are demonstrated. In addition, it is compared with various other algorithms to show its better performance and improved computational efficiency. PPS has been introduced as an algorithm for identifying very long impulse responses. We shall first establish its convergence properties by comparing its performance with smaller order systems in the system identification mode. In this mode, depicted in Figure 3.1 we attempt to model the unknown system by a linear model such that the difference between its output and that of the unknown system is minimum according to some criteria such as the mean-square or the least square. The parameters of the model are given by one of the many adaptive algorithms that we shall be comparing. The unknown system hd and the input 3 are chosen so as to highlight the properties of the adaptive algorithms. Later on we shall compare the performance of the PPS algorithm in the inverse filtering mode, depicted in Figure 3.2 in order to evaluate its performance in equalizing inter-symbol interference (ISI) in communications channels. In particular, we shall use as our unknown system a room of dimensions 7mx8mx2.5m. The impulse response coefficients were obtained by the ”hammer-tap” method. The impulse response is about 4096 samples and represents a duration of 0.25 seconds at a sampling frequency of 16 kHz. The normalized room impulse response is shown in Figure 3.3. Other 50 51 v(n) 4. Unknown + System hd (in) x(n) 8(71 ) Adaptive ‘ Model h W") Figure 3.1. System identification configuration. x(n_)> Unknown d( n) Adaptive y(n) ‘ e(n) System hd *—’ Model I? + Figure 3.2. Inverse filtering configuration. scenarios are also considered to highlight various other features and shall be described later in the chapter. 3.1 Simulations We compare the PPS with popular algorithms currently in use. The ones considered are the Least Mean Square algorithm (LMS), Fast Transversal Filter (FTF) algorithm [9] which is a variation of the Recursive least squares algorithm, Frequency domain LMS (FDLMS), discrete cosine transform LMS (DCTLMS), and Fast LMS (FLMS). The strength of LMS is its simplicity but its weakness is its slow convergence while using severely colored inputs. The DCTLS and FDLMS are faster than LMS when the input is severely colored but are computationally expensive. FTF a variant of the Recursive Least Square algorithms is computationally more efficient that RLS, but is very complex and unstable. The FLMS algorithm offers both computational savings and better performance than LMS while using 52 E ‘c 4 8 3' ‘:- - :s‘ *w—~ 4 8 .3 3 -I E -0.6 - d -0.8 r * -1 l 1 l l l l l 1 0 500 1000 1 500 2000 2500 3000 3500 4000 4500 Time index n Figure 3.3. Impulse response of a room. colored inputs. In the sections to follow we compare the performance of PPS with these algorithms using both time varying and time invariant systems and white and colored inputs. 3.2 Time invariant systems In this case the unknown system is assumed to be time invariant. However no assumption is made about the stationarity of the input signal 2:. This signal would be non-stationary when signals such as speech are used. We first consider the case when the input is white and then consider colored signals and speech signals. 3.2.1 White noise The behavior of the LMS algorithm is nearly optimal when the input is white. The orthogo— nality, in a stochastic sense, of the input to the adaptive filter is the sought-after property by sub-band schemes. It has been argued in the literature that sub-band filtering is redundant 53 . L=32 Proposed Algonthm 0.3 l I I I I I 0.2 - - _ 0.1 - - 9. III 0 1 - -0.1 - - '02 b | 1 I l l l 0 500 1000 1500 2000 2500 3000 Samples LMS E w - 500 1000 1500 2000 2500 3000 Samples Figure 3.4. Comparison of learning curves for LMS and PPS when M232. for cases in which the input is already white. In fact, due to the non-ideal characteristics of the filter banks, small correlations will be introduced in each sub-band if the input is white. This is usually prevailed by a degradation in performance of the sub-band algorithms compared to that of the LMS. The same phenomenon has been observed in our experiments involving white noise. In out first simulation we consider a 32 tap unknown system with a white input 1:, uniformly distributed on [-.5, .5]. The learning curves in Figures 3.4 and 3.5 show that LMS outperforms the PPS when the input is white. When the length of the unknown system is increased to 128, the performance of the two algorithms is more comparable as shown in Fig. 3.5. The convergence is also governed by the zooming effect of the accumulator. It can be observed from (2.40) and (2.41) that the accumulator zooms in at a rate of 1 / n. Although the performance of the LMS algorithm and the PPS are crudely comparable, the 54 , L=128 Proposed Algonthm 0.5 I I I I I I g o l ‘ LU '0-5 I l I 1 l l 0 1000 2000 3000 4000 5000 6000 Samples LMS 0.5 I I I I I I g 0 Ln ALL A; 5: fi_ - LU .0.5 l L l l I l 0 1000 2000 3000 4000 5000 6000 Samples Figure 3.5. Comparison of learning curves for LMS and PPS when M=128. tradeoff in favor of the PPS improves when the length of the unknown system increases or the input signal becomes colored (i.e., the condition number of it correlation matrix gets large). 3.2.2 Colored noise The performance of PPS is dramatically better compared to the performance of LMS- type algorithms when colored inputs are involved. This is true even under severe coloring conditions as long as the input is persistently exciting, as the following simulations will show. In our first simulation the unknown system is 32 tap FIR filter, whose impulse response is shown in Figure 3.6. The input signal has a low pass spectrum with a normalized cut-off frequency 0.11r. Figure 3.6 shows the parameters of the adaptive model after 5000 samples 55 Impulse respsonse after 5000 iterations 0.12 I I I I I I bold: PPS 0.1 - - dots: LMS dashed: Original 0.08 - - 0.06 - a 0.04 - - 0.02 ~ ~ 0 ,_ ------ I / \ ~ _____ .4 .0020 5 10 15 20 25 30 35 Samples Figure 3.6. 32 FIR filter. Estimated impulse response. of the input have been processed. Notice that the estimates from PPS are much closer to the original compared to the estimates from LMS. The learning curve in Figure 3.7, which is a plot of how the output error e changes with iterations, shows the faster and better convergence of PPS when compared to LMS. As mentioned before PPS is a better choice when the length of the unknown system is long. In the next simulation we try to estimate the impulse response of a room. The input used is a low pass signal with a cut-off frequency 0.11r. Figure 3.8 compares the difference between the impulse response of the unknown system and the model after convergence. The difference called the impulse response difference coefficient (IRDC) is defined as 9 IRDC (13(2) — 11.42))2 (3.1) Error 56 Learning Curves of the Prop. Alg. vs. LMS 3 _ 1 r f I r 2.5;» bold: PPS ~ dots: LMS 2 r d 1.5 > ‘ L l 1000 2000 3000 4000 5000 Samples Figure 3.7. 32 tap FIR filter. Comparison of learning curves 6000 57 Comparison of estimated impulse responses I I I dotted: LMS bold: PPS 050010001500200025003000350040004500 Impulse response coefficients Figure 3.8. The unknown system is the impulse response of a room. IRDC for LMS and PPS are plotted here. 58 1 O- : I I I I I 1 10": m 2 §1o‘ ‘: w 1 c i m 4 O 2 bdd:PPS 10'5 ~ dotted: LMS 1 1043 ‘ ‘ ‘ ‘ L 0 0.5 1 1 .5 2 2.5 3 Samples x 10“ Figure 3.9. Comparison of learning curves when the unknown system is the impulse response of a room. Notice that the error in the coefficients in case of PPS is much smaller than that of LMS. The learning curve in Figure 3.9 also tells the same story. PPS outperforms LMS in the same time scale by more that 30dB. This behavior was observed to be typical for systems with long impulse responses under adverse conditions. As a further test of the performance of PPS we compare PPS to F LMS. Figure 3.10 shows how PPS outperforms FLMS. The order of the system used in this case was 256 and the input was a low pass signal with a cut-off frequency of 0.257r. Not only does PPS perform better, it is computationally very efficient compared to FLMS as can be noticed in Table 3.1. We next evaluate the ability of PPS to track time-varying systems. 59 PPS: L = 256 Error e ' 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Samples x 10‘ Fast LMS: L = 256 1.5 I I l I l I I I I 0.5 Error 9 O —o.5 ‘ -1.5 0 1 1.2 Samples x 10 Figure 3.10. Comparison of Performance of PPS and FLMS. N = 256 and input is a low pass signal with cut-off of 0.257r. 60 0.16 I T I I I I I I I I 0.15r PPS - 0.14 . 0.13— _ § . g xi /’ l 9012'- /\ [I \l I I V " a “ I, f, 0" I, l \f" \ I f/ I 0.11'— /// \v~ " I, ,’ UMS 01- ’ " - ' / I / o.oe~ ,’ ~ ‘I I i. I 0.08 11 l l l I l L l L 1 0 500 1000 1500 2000 2500 3000 3500 4000 4500 Samples Figure 3.11. Tracking behavior of LMS and PPS with slow variations in one parameter of a 16-tap FIR system (A = 0.98). 3.3 Time-varying systems With the introduction of the forgetting factor in equations (2.55) and (2.56) PPS has the ability to track time varying systems. To verify its ability to track time varying systems, we test its tracking properties first with slowly varying systems and then with fast variations. In the first case periodic ramp type variations were introduced into one parameter of a 16-tap FIR filter. Two cases are considered, one with a period of 350 samples and the other with a period of 700 samples. Figures 3.11 and 3.12 depict the tracking behavior of the LMS and PPS algorithms for these variations. The time varying estimates from both algorithms are plotted at each sample instant. Notice that in both cases PPS performs much better that LMS. A higher order system is tried out wherein sinusoidal variations are introduced into one 61 0.2 r WI I I WIT If F I I I PPS . . ' ‘ I I ‘ ‘ 0.15i' « i ' v‘ ‘ a I- I | 2 I i I I I g I r V ’ / ’ [I 1. 1‘ Al .2 It / ’ 'l ’ ._ .1 I. - I I l I l , . .’ I I § 4 Ii {I l .f l I. 4 ll . .J \ H 1 'Il '. i5] |.\- I, l, .. I! l' U I r . ' l ’ -' 'l l u " v ' ‘ ll l . _I I l- . 01- I I - I l.’ 1! l 1 ‘ ‘ ° . i I i." l,’ ' ."- 1" ‘1 ‘.I z 1'; l I1 'I 1' II | l LMS i ll ' h I 005 l l 1 L l l l l l l l 0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000 Samples Figure 3.12. Tracking behavior of LMS and PPS with fast variations in one parameter of a 16-tap FIR system (A = 0.95). 62 0.25 r I I I j I I T I J l l I / — PPS 0.14 .' —- - LMS - I ---- True parameter I i I l 0.054 I . l l l O 1 l l l l l 1 J_ 1 l 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 Samples x 104 Figure 3.13. Tracking behavior of LMS and PPS with slow variations in one parameter of a 512-tap FIR system (A = 0.45). parameter of a 512 tap FIR filter. The input in this case is a high pass signal with a cut-off frequency of 0.51r. Figure 3.13 shows how PPS outperforms LMS. In order to model systems with larger variations, ramp variations are introduced into 8 parameters of a 128 tap FIR filter. In Figure 3.14 only three of the parameters have been plotted for convenience. Notice that PPS shows a much faster and better tracking of the unknown parameters. We next compared the performance of PPS with FTF. FTF exhibits all the fast con- vergence properties of a Least Squares based algorithm, but is very complex and inherently unstable. In order to make it more stable numerous rescue and restart operations have been prescribed which add to the complexity of the algorithm. Nevertheless it is a yard stick for comparing both the speed of convergence and the computational efficiency. We shall address the issue of computations in a later section. Here we present the case where the 63 I I I I I 1" ‘. .’ / / . / I ’ d / I . 0.55 /' n I 0’ .' .1 ’\ 'I’l- I ’ .z' I I z .05 ‘\ I I I. ‘ l’ \ 'll \ I \ ’ -2.5 - a -3 ~ — pps - - - LMS -3.5 1' ..... Tme - .4 I l J L 1 0 0.5 1 1.5 2 2.5 Samples x10“ Figure 3.14. Tracking behavior of LMS and PPS with variations in 8 parameters of a 128-tap FIR system (A = 0.35). 64 0.5 0.45 0.4, .- o.35 0.3 'f . True — Parameter I I I I I‘ i 0.25:: ._.- pps ' I I i L 0.2 ‘ 0.15. . 0.05 l- - 0 1000 2000 30004000 5000 6000 7000 8000900010000 Figure 3.15. Tracking behavior of F TF and PPS with variations in 1 parameter of a 16-tap FIR system (App, = 0.95, Aft, = .985). instability of this algorithm in case of time varying systems is presented. In the first comparison shown in Figure 3.15, PPS performs marginally better that F TF. In this case the length of the unknown system was 16 and saw-tooth type variations were introduced into on the parameters. The input signal was low pass with a cut-off frequency of 0.17r. When the order of the unknown system is increased to 128 it can be seen from Figure 3.16 that FTF becomes unstable around the sample 2500. In spite of the presence of rescue and restart operations FTF, goes unstable. Further modifications have been proposed in [61] to overcome this problem It has been noticed that PPS exhibits excellent tracking capabilities. It exhibits a very fast transient state, and once it settles in a neighborhood of the true parameter, it achieves tracking even under adverse input conditions. The PPS elevates its performance beyond the reach of the LMS, for both slow and fast-varying systems. 65 0.5 0.45 - 0.4 5 4 0.35 9 _, -" . E .I E I 0.3 j '1' ' E .’ 0.25 4 : . i i Ema 0'2 9 i arameter _ ll ' - ' ' PPS 0.15 4! """ F": . il 31 0.1 7:, . {i 0.05 a - {l 0 l l l l L l m L 0 2000 4000 6000 8000 10000 12000 14000 16000 Figure 3.16. Tracking behavior of FTF and PPS with variations in 1 parameter of a 128-tap FIR system (App, = 0.9, A ft 1‘ = .9). Notice that FTF goes unstable around sample 2500. 66 70 I I I I I I I f I l\ \ 60- fl ‘ \ '1 ~ ~’ 1.11 \\ {.4 I. \ J" v“ 50- r *‘u. , DCTLMS ERLE 010002000300040005000600070008000900010000 Samples Figure 3.17. Comparison of learning curves for LMS, F DLMS, DCTLMS and PPS when the input is a low pass colored noise. The unknown system is a 512 tap FIR filter. 3.4 Echo cancellation In order to evaluate the performance of PPS while suppressing echoes, we compare it with LMS, Frequency Domain LMS (F DLMS) and discrete cosine transform-domain LMS (DCTLMS)[47]. A measure of echo suppression, the Echo Return Loss Enhancement (ERLE), defined as ERLE = IOIog%]—::E—:%, (3.2) is used in some experiments, where y(n) is the desired output, and e(n) is the output error. An ERLE greater that 20dB is expected for echo cancellation [17]. In the first scenario the unknown system is a 512 tap FIR filter. Two simulations were performed, one with a low pass input and the other with a high pass input, where the stop-band attenuation in both cases are around 40dB. Figures 3.17 and 3.18 portray the 67 50" FDLMS 4 40 p DCTLMS 30 b .. 5, 20 FPS . m LU 10 ‘ 'i 0 a / -< ,/ -10 , / LMS ~ _20 “I 1 1 1 1 1 1 1 1 1 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 Samples Figure 3.18. Comparison of learning curves for LMS, FDLMS, DCTLMS and PPS when the input is a high pass colored noise. The unknown system is a 512 tap FIR filter. results of this comparison. It is clear that the performance of LMS is the worst and that DCTLMS gives the best performance. PPS does a reasonable job, and gives an ERLE of more than 20dB. At the same time PPS is computationally more efficient, requiring a 512 point FFT every 512 samples, while DCTLMS and FDLMS require the corresponding 512 point transform for every sample. In the second scenario the impulse response of a real conference room (Figure 3.3) is the unknown system, and the input is a speech signal (a male voice reading a passage from a book). Figure 3.19 compares the echo suppression achieved by the different algorithms. The superiority of PPS over LMS is self-evident. Its performance is comparable with that of FDLMS and DCTLMS, but note that FDLMS and DCTLMS are much more compu- tationally intensive. After 50,000 samples while FDLMS and DCTLMS perform 50,000 of their corresponding 4096-point transform, PPS performs only 12 FF Ts. PPS, DCTLMS and FDLMS fare much better than LMS, when the input is ill- 68 ERLE r- _10 l l l l l 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 Samples x 104 Figure 3.19. Comparison of learning curves for LMS, FDLMS, DCTLMS, and PPS when the input is a speech signal. The unknown system is the 4096 tap impulse response of a conference room. conditioned. DCTLMS gives a very good performance with speech and low pass signals, while FDLMS fares well for high pass inputs. Notice that when the order of the system increases PPS performs at least as well as the other algorithms. This along with the fact that it is 0(M — 1) times more efficient than both DCTLMS and FDLMS, makes it the best choice for echo cancellation. PPS offers a steady performance, achieving the ERLE of 20dB after only 10 blocks of input have been processed, for almost all inputs and all orders of the system. Note that the memory allocation for the covariance matrix (4096 X 4096 in this case) make RLS sluggish and impractical as verified by our simulations. In the section to follow we shall consider enhancements that make PPS more robust and efficient. 69 e-jmw ._> _> x(n) Fractional | Polyphase — Decimator Algorithm 89°F” Estimates —’ L. Unknown d(n) Fractional System Decimator . e-jhan/N nonse Figure 3.20. Rotation and dilation with PPS. 3.5 Dilation and rotation It is observed in (2.53) that a weak sample of the input spectrum causes degradation in the convergence of the algorithm. Since PPS is a frequency domain algorithm, it is possible for PPS to zoom in on the portion of the input spectrum which is well conditioned and to provide better spectral estimates. Such operations can be performed using rotation (modulation) and dilation (decimation) of the spectra of the input signal. It is known that decimation in time produces a dilation in the frequency domain [11, 69]. Assume, for example, it is known a priori that the input signal has a low pass p.s.d. Such a signal would degrade the performance of PPS since there is insufficient excitation at higher frequencies. However, if the input signal is decimated, its p.s.d dilates to occupy a much larger spectral range. This input conditioning helps PPS to zoom in on the useful range of frequencies. Similarly if it is known that the signal is not sufficiently exciting at the desired frequencies, modulating the input signal with a complex sinusoid will rotate its p.s.d for more excitation at the desired frequencies. Figure 3.20 depicts how the two schemes of modulation and decimation can be incorpo- rated into PPS. The rotated and decimated input a: and the desired output d are given as 70 1.1 I I I I I I I 1 ~ . 0.9 ~ . 0.8 - - 75‘— o.7 - - E c, 0.6 - ~ (D c: O 5‘ 0.5 - - ad ,3 0.4 - . fl 0*: a 0.3 - N 53° 2 0.2 o 1 J k i l l J l -3 -2 —1 o 1 2 3 Frequency to Figure 3.21. Coloring filter with eight equally spaced nulls along the frequency axis. inputs to PPS. If M sub-bands are used then, PPS yields spectral estimates at M even spaced frequencies over the range of frequencies dictated by rotation and dilation. To recover the estimates in time domain, we need to undo the effects of rotation (i.e., demodulation) and dilation (i.e., interpolation). To show the effect of rotation, a signal is generated with nulls around 8 equidistant points on the frequency axis from 0 to 211'. The frequency response of the coloring filter used to generate this input is shown in Figure 3.21. The unknown system is an eight-tap FIR filter. Figure 3.22 shows estimates of the magnitude of the frequency response of the unknown system obtained by PPS without rotation in comparison to those obtained by rotated PPS. The estimates have been plotted after 500 iterations. Notice that rotated PPS exhibits a much improved convergence. In order to examine the effect of decimation we conducted the following experiment. The unknown system is a 512 tap FIR filter, and spectrum of the input signal is low pass with a 71 I I I x Est. from Hot. PPS -— Desired System 0 Est. from PPS 0.9 - 0.8 I 0.7 0.6 - I 0.5 0.4 I 0.3 r 0.2 - 0.1 *- Q 1 1 -3 —2 -1 0 1 2 3 4 Frequency LO Figure 3.22. Effect of rotation on the performance of PPS. cut-off frequency of .57r. Figure 3.23(a) shows the performance of the normal PPS with 512 sub-bands after 20,000 samples. Here the frequency response of the system estimated by normal PPS is plotted. When the signal is decimated by a factor of two, the performance of PPS improves dramatically. Using only 256 taps after decimation, PPS converges to the true estimates much faster as shown in 3.23(b), needing only 10,000 samples. The execution times for the decimated and undecimated PPS on a Sun Sparc 20 workstation consistently shows a ratio of 1:3. 3.6 Signal-to-noise ratio and computational complexity The strength of PPS has been its computational efficiency. Not only does it allow parallel processing but also improves efficiency by being a block processing algorithm. This feature makes it ideal for applications, involving the identification of very long impulse responses, 72 10 ’ I I I I I I I ‘ l — Oedred System 1 o - - Est. from PPS 10 r v vvvvv w 1 10-1? 1 ”"5 i i ; f 104 r g 104? 1. ‘0'55 1. 1o" . -4 o 4 Frequency (a) 10‘ E — Desired System 3 ‘ - - Est. from PPs 0 10" - 1 51:: . 10'2 - — 10.3? 1 10"4 l l l -1 0 1 4 Frequency Figure 3.23. Effect of decimation on the performance of PPS. (a) No decimation. (b) With decimation. 73 Table 3.1. Comparison of computational complexities of LMS, FTF and PPS. Algorithm ® per sample N = 128 N = 512 N = 1024 LMS 2N 256 1024 2048 FTF SN 1024 4096 8192 PPS 610g2(N/2)+ 8 44 56 62 FLMS 1010g2(2N) + 16 96 116 126 such as severe echo cancellation in telephones and teleconferences. The problem of echo cancellation as discussed in Section 1.1 involves the identification of a system whose impulse response is as long as the round trip time of the call [42, 70]. With increasing distances over which a call is made, the effectiveness of a canceller is limited by its computational efficiency. In addition, the non-stationary nature of the signals and their sample correlation inhibit the effectiveness of many algorithms. Table 3.1 compares the computational complexities of the LMS, FTF, PPS, and FLMS. Notice that PPS enjoys a decisive computational advantage over the other three algorithms. As a further test of the computational efficiency we compare the performance of the three algorithms (LMS, FTF and PPS) under various signal to noise ratios (SNR) when the input signal is a speech signal, and the unknown signal is the impulse response of a room of about 0.5 seconds long (or, 4096 taps). The disturbance term is simulated using an additive Gaussian white noise, with a zero mean. Its variance is varied to obtain different SN Rs. It can be seen from Figure 3.24 that PPS performs reasonably well in comparison to FTF, while both FTF and PPS are more preferable than LMS. These simulations were run on a SUN Sparc 20 workstation and it was observed that on the average the time required for PPS, LMS and FTF showed a ratio of 1:2:7, respectively, for the case under consideration. 3.7 Channel equalization The dispersive effect of a channel, known as the 181 was discussed in Section 1.2. Equal- ization of a channel involves the recovery of the original stream of data transmitted over the channel. This poses the problem of channel equalization under the realm of inverse filtering as depicted in Figure 3.2. Most telephone channels are FIR in nature [54] and their 74 Mean Square Error 0 O ‘ i: F’ 8 on _. T I I _o 3 .0 o N I 1 0 1 5 20 25 30 35 40 45 50 55 Figure 3.24. The performance of LMS, F TF and PPS under various SNR conditions. inverses are IIR. Thus exact equalization requires the use of IIR filtering algorithms. From our discussion in Section 2.3.3 on IIR algorithms we known that they are very slow and face stability problems [30]. The best solution for such problems then is using very long FIR filters for approximating the infinite impulse response of an ideal equalizer. This approach puts PPS as an ideal choice for equalization of FIR channels. 3.7.1 Blind equalization One of the many problems with equalization of channels is that not much is known about the signal being transmitted. One might just have some limited information about the statistics of the signal, but the signal itself is not known. This should be clear from the statement of the equalization problem in Section 1.2. In most practical cases the adaptive equalization process in initialized by transmitting a known pseudo random sequence of data across the channel. This sequence is used to asses the channel characteristics which are 75 U(n)_ Adaptive y(n) Zero-memory i E) 7 Filter nonlinear estimator ' — + Adaptive ’ Algorithm e(n) Figure 3.25. Blind equalization configuration. used to remove the 181 from the subsequent data. The ideal formulation would be for an algorithm to perform the equalization without any knowledge of the input signal. This process is known as blind equalization or deconvolution [31, 39, 54]. In blind equalization an estimate of the most likely transmitted input symbol is obtained from the output of the equalizer. This signal is then used as the desired signal for the adaptive process. For instance in the case of transmitting a binary sequence of ls and -ls, the estimator might just be the sign of the output of the equalizer. In general the estimator is a non linear, zero memory device. A blind equalization scheme is depicted in Figure 3.25. There are a whole class of cost functions associated with the nonlinear estimators. We shall briefly discuss a few. 3.7.2 Cost functions for blind equalization Consider the blind equalizer shown in Figure 3.25. The equalizer attempts to minimize the cost function 10!) = E{(i(n) - y(n))2} (3-3) Here the estimate of the input signal a”:(n) is obtained as the output of the zero-memory nonlinear estimator. One of the earliest proposed estimator was for the case of a binary source of data, where the estimator was described as 76 an) = vsgn{y(n)} (3.4) where sgn(.) is the sign function. The constant 7 sets the gain of the equalizer and is defined by _ Emu» 7 ‘ E{|x(n)l}' (3'5) The adaptive algorithm based on equations (3.3)-(3.4) was first proposed by Sato [57] in 1975. The Sato algorithm is robust, but has a slower convergence. Its convergence is guaranteed, when the input signal’s probability density function can be approximated by a sub-Gaussian distribution such as the uniform distribution [6]. A generalization of the Sato algorithm leads to a whole family of constant modulus blind equalization algorithms for use in two-dimensions digital communications systems (e.g., M- ary phase shift keying [58, 65]). These class of algorithms collectively known as the Godard algorithm [22] minimize the non-convex cost function «’00 = E{(|y(n)l” - 3:02} (3-6) where p is positive integer and Rp is a positive real constant defined by _ Emma} 12.- E{|:r(n)|”}° The Godard algorithm is designed to penalize deviation of the blind equalizer output (3.7) y(n) from a constant modulus. The constant RP is chosen in such a way that the gradient of the cost function J (n) is zero when perfect equalization is attained. Note that for p = 1 equations (3.3) and (3.6) are equal, i.e., the Godard algorithm is same as the Sato algorithm for p = 1. 77 1401) -M _ Adaptive y(n) Sign J’” 32) Z Filter 7 Function .1 V - 4- ‘ PPS V Algal-"hm g(n) Figure 3.26. PPS modified for blind equalization. Notice the delay introduced to account for the block processing. 3.7 .3 PPS for blind equalization Since PPS is a block processing algorithm, there is a delay of the order of the block size in the estimation of the model parameters. This delay has to be incorporated into the scheme for equalization of channels. Modification of PPS for blind equalization is depicted in Figure 3.26. In all our simulations we assume that the input signal is a binary sequence of ls and -ls. The nonlinear estimator in our case is the sign function. Under this setting PPS operates under a Sato class algorithm for blind equalization. Figure 3.27 depicts the performance of PPS while equalizing an all-pole channel with denominator polynomial D(z) = 1 + 0.82’1 + 0.42'2 + 0.52‘3. Notice that the estimates from PPS converge towards the negative of the true parameters. This is due to the nature of the non-linearity (the sign function) used. This problem is because of the fact that PPS initializes all the parameters to zero and the algorithm converges to the closest minima which in this case happens to be the set of negatives. It should be noted that the negative values also yield a valid solution because all that is needed to recover the transmitted signal is to invert the output. In practice equalization is not done completely blind. With each data frame, a known signal is transmitted and this is used to initialize the equalizer. This implies that recursive version of PPS needs to be used as this allows for initialization of the estimates to non-zero values. The PPS based blind equalizer initialized by a training signal can be expected to converge faster to the true solution. This observation is strengthened by the results in 78 0.1 I I I I I T I I I O 7 .4 -0.1 - -0.2 a 2 —0.3 - a 2 o E -0.4 r .l E as CL -0.5 - - —0.6 - - -0.7 - a —0.8 - ‘ _o.9 l l l l l l I l l 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 Samples Figure 3.27. Performance of PPS under blind equalization with all parameters initialized to zero. Figure 3.28. In this simulation the recursive PPS algorithm is used to equalize the all pole channel equalized in Figure 3.27. The algorithm initialized to non-zero values obtained by using a training sequence, converges to true values in a much shorter period. Finally we compare the performance of PPS, LMS and FTF while equalizing a channel. For this simulation we consider a binary input sequence of ls and -ls. The channel is a minimum-phase channel. The cost function for blind equalization is not unimodal because of the non-linear es- timator used. This makes the quality of the solution very much dependent on the initial conditions. For large number of parameters (like 500 or a 1000), the cost function can be very complex with many local minimas. Figure 3.29 shows the estimates of the equalizer obtained from different algorithms under consideration. The ideal equalizer (the inverse of the channel) has an impulse response which is 512 taps long. Notice that PPS performs at least as good as the other algorithms 79 1.1 I I I I I I I j T 1M - l— o.9 , 0'8 /\/ d 0.5 f9 / O \l Parameters O a: 0.4 0.3 0.2 - 0.1 500100015002000250030003500400045005000 Samples Figure 3.28. Performance of recursive PPS under blind equalization with parameters ini- tialized to non-zero values. in addition to its great computational savings. PPS offers the best choice available in terms of the tradeoffs between performance and computational efficiency for equalization on minimum-phase channels. This along with the observations in Section 3.4 on echo cancellation make PPS ideal for applications where long impulse responses need to be identified, or infinite impulse responses need to be approxi- mated. While the performance of other algorithms degrades with the increasing order of the system, PPS offers almost the same performance for systems of different lengths. This is due to its parallel, alias-free structure. The promise of computational savings along with good performance makes further study of PPS for blind-equalization very attractive. Further investigation on application of PPS for equalizing non-minimum-phase channels would be a fruitful endeavor. This and other issues for future research are discussed in Chapter 4. 80 1.5 . . 1.5 0.5 ' 0 200 400 0 200 400 Desired PPS 1.5 . . 1.5 0.5 ' 0 200 400 0 200 400 LMS FTF Figure 3.29. Comparison of the impulse response estimates from PPS, FTF and LMS while equalizing a minimum-phase channel. The ideal equalizer has length 512. CHAPTER 4 Discussion and Future Work The development of PPS opens a new avenue for the design of sub-band adaptive filtering algorithms. A whole new class of sub-band algorithms can be developed considering the structure of the system in the sub-bands. It was shown in Section 2.4 that sub-band schemes and block processing schemes are equivalent. Thus a scalar system, modeled by a block processing scheme, is equivalent to a sub-band scheme in which the system is now a square matrix. This sub-band matrix system has many interesting properties and can be exploited to achieve increased parallelism. PPS algorithm developed in previous chapters is one of such algorithms. In PPS only one parameter is evaluated per bin. However based on the structural decomposition method, algorithms evaluating more than one parameter per bin can be developed. One of the implications of this feature is that the block size can be reduced, thus decreasing block delays. This allows us to design algorithms that optimize the delay and the number of parameters to be evaluated in each bin. Some of the approaches to structural decomposition of sub-band schemes have been presented in [41, 44, 52]. In the section to follow a more general approach based on diagonal decomposition of systems is presented. Another modification would be the extension of PPS for identification of non-minimum- phase channels. This would require incorporation of higher order statistics and poses some very interesting challenges. In the sections to follow, leads for the implementation of the 81 82 above ideas are presented. 4.1 Diagonal decomposition of the system in the transform- domain An alternate approach to increased parallelism can be obtained by considering a decompo- sition of the system in Fig. 2.2(b). Let 33(2) be defined as [173(2) 2 leB(z)W (4.1) Then Fig. 2.2(b) can be redrawn as shown in Fig. 4.1. If the transformation W is such that FIB is diagonal, then we have achieved our goal. This can be stated as Problem Statement: Does there exist a unitary transformation W, that diago- nalizes H B; with the added condition that the W is independent of the elements of H B- We have the following fact. Fact 4.1 A circulant matrix is unitary similar to a diagonal matrix. The transformation that diagonalizes all circulant matrices is the DF T matrix. Proof: See [12, 51] In the subsequent section we develop the diagonal decomposition of a sub-band system based on the diagonalization of circulant matrices. 4.1.1 Generalized sub-band decomposition Consider the block processing, multirate equivalent of a $180 (scalar) system H (z) with input X (z) and output Y(z). If the size of the block is M, then H 3(2), the multirate equivalent of the scalar system H (z), is a M x M matrix. For the two systems to be equivalent, the multirate system must be an alias free system. The requirements on H 3(2) that yield an alias free system are summarized in the following theorem. 83 x(n) YM.l(m) 0—5- ——> —> —> + M Z-l YM.2(m) —-—> —> + M w 1213(2) w" z" m) 7:] M ] y(n) —> —> —” l M Figure 4.1. Diagonalized system function in the transform-domain. Theorem 4.1 Let 113(2) be the multirate equivalent of the scalar system H (z) If the two systems are to be equivalent, then H 3(2) should be pseudo-circulant (PC). A pseudo- circulant matrix is essentially a circulant matrix with all elements below the main diagonal being multiplied by z. (See Example 4.1. In addition the elements of the first row of H 3(2) one the M -polyphase components of H (2) Proof: See [69] for the proof. Consider a transformation W such that, a new multirate equivalent system IlB(z) is given by (4.1) if the transformation W diagonalizes H 3( 2), then we can represent the block / multirate system as shown in Figure 4.1. The diagonalizing transformation W must be unitary, and independent of the system function H 3(2), i.e., there should be only one transformation W that diagonalizes all H 3(2). The following theorem shows how one can obtain such a transformation. Theorem 4.2 Let C be any M x M pseudo-circulant matrix, and W a square matrix with elements ajk defined as k7j=0,---,M—1 andw:ej2"/M 84 Then W is unitary, diagonalizes C and wicw = D = diag{p(1),p(w).....p(wM-‘)} (4.2) when: p(/\) = Co + clAzl/M + . . . + cM_1(Azl/M)M‘1 Proof: This proof can be obtained just along the same lines as the proof for circulant matrices. See [69] for details. The transformations for the cases of n = 2 and n = 3, respectively, are 1 1 1 1 1 1 1 -1—\/§ —1 '\/§ W 1/2 1/2 , W5 21/3 z1/3l__21__2 zl/3l_+;__i (4.3) 2 2 l 22/3 zz/sg-ltg'x/fil z2/3g—1-21'x/5] An example of such decomposition is given below. Example 4.1 Consider the following scalar (or unblocked) system H(z) = a0 + alz + a1»?2 + a3z3 (4.4) Its four-component Type-I polyphase components are 530(2) = ao E1(Z) = al 192(2) = 02 E3(z) = a3 (4.5) Then the block (PC) system [13(2) is given as 85 203 (10 (11 02 113(2) = (4.6) 2 02 2 a3 a0 0.1 2a1 2a; 2a3 a0 for which we have Y3(2) = 113(2) 4: X3(2) (4.7) as in (2.6) The unitary transformation which diagonalizes H 3(2) given in (4.6) was found to be, 1 1 1 1 l 2 2 2 2 1 1/4 1 - 1/4 _1 1/4 _1 . 1/4 22 232 22 232 W: (4.8) .1. _l l _.1_ zfi 2 2 2 z 2 z 1 3/4 __1_ ' 3/4 _1 3/4 1 ' 3/4 _22 232 22 232 _ The diagonalized form 11 3(2) is - n 21/401 +ao+z3/4a3+\/'z'az, 0,0,0 . 0,a — '23/4a - 2a + '21/4a ,0,0 H3(2)=WlH3(2)W= 0 J 3 f 2 J 1 (4.9) 0,0,—21/“a1+ao—23/4a3+\/2a2,0 0,0,0,ao+j23/4a3—¢2a2—-j21/4a1 4.1.2 Adaptive structure We now consider an adaptive algorithm based on the above sub-band decomposition of a scalar system. The first problem to be considered is the fact that the transformation W has fraction powers of 2'1, the delay operator, and hence cannot be realized using a rational filter bank. This can however be avoided by using interpolation. Using this property the 86 x(n) Z w \S (101) Figure 4.2. Adaptive structure based on generalized sub-band decomposition. transformation W can then be rewritten as W = diag{1,2'1,...,2-(M-1)}U (4.10) where U is the M x M DFT matrix. Thus the transformation W can be implemented efficiently using the FFT algorithms. The implementation of an adaptive structure, using the above transformation is shown in Figure 4.2. A LMS type algorithm is used to adapt the coefficients. Each bin evaluates the coefficients of the corresponding polyphase compo- nent. It should be noted that the diagonal element of the unknown system 113(2) in the transform-domain is H (wjz), where j is the corresponding bin. In terms of the M -polyphase components Ej(Z),j = 0,1, . . ., M — 1, of H(z), we can rewrite this as M-l H(wjz) = Z 2'kEk(2M)wkj (4.11) k=0 87 With such as structure, adaptations can be performed in one of two ways. In the first method all the bins have a copy of the coefficient array and use that compute their outputs. But they only update the coefficients of the corresponding polyphase component keeping the others fixed. At the end of the iteration we can update the fixed polyphase components using the updated values from the other bins. Its advantage, however, lies in the fact that lesser number of coefficients are updated in each bin, and these updates can be done in parallel. In the second method, efficiency is increased by estimating just corresponding polyphase component in each bin, assuming that the rest of the components are zero. By estimating only the parameters of the polyphase component in each bin we are under-parameterizing, but by combining all the polyphase components we can reconstruct the original system function. This method gives an optimal strategy for utilizing the generalized sub-band structure. However gradient type algorithms have poor convergence when they are severely under parameterized. Lattice based least squares method [23] provides very good conver- gence regardless of the number of parameters used owing to its orthogonal and modular nature and seems to the be the best solution for this problem. 4.2 Equalization of non-minimum-phase channels One of the limitations of an equalizer based on second-order statistics is that when the channel is non-minimum-phase, exact identification is not possible. In this case the algo- rithms identifies the minimum-phase equivalent of the unknown system. Extension of PPS to higher order statistics (third-order or higher) is a subject that needs further investigation. While second-order statistics are phase blind, higher order statistics are phase sensitive. In particular, cumulants and their Fourier transforms, polyspectra are useful in the analysis and descriptions of non-Gaussian process, and non-minimum-phase and nonlinear systems [23]. A frequency domain approach based on polyspectra is a good avenue to pursue. An- other promising technique is the method based on fractionally spaced equalizers which is a relatively new development. CHAPTER 5 Conclusion We have presented an asymptotically alias-free parallel structure in which the DFT filter bank polyphase decomposition allows multi-rate processing. The zooming effect, as a result of using accumulators in each band, is the distinguishing feature of this scheme. This implies that the bands would become asymptotically uncorrelated, which accommodates independent adaptations in each sub-band. It was proven that 11k(nM), given by (2.19), converges to 113(2fik) as n —> 00 provided sufficient number of bands are used and the input is persistently exciting. Through several experiments it was shown that PPS completely outperforms the LMS and FDLMS algorithms by as much as 30 dB (ERLE) in some applications involving severe input coloring. The tracking capability is yet another strong attribute of PPS which was experimentally shown against the LMS and F TF. The great computational saving afforded by PPS as compared to LMS, FTF and FLMS is evident from table 3.1. This coupled with its good performance under different SN Rs, as compared to FTF and LMS, and its effectiveness in identifying long impulse responses, especially under adverse input coloring and changing environment, makes this structure a viable alternative to available methods. The modification of PPS for blind equalization opens a new area for its application. This represents one of the first frequency domain approaches to blind equalization. The recursive version of PPS extends very well for blind equalization with facility for choosing arbitrary initial values for the parameters. We can summarize our achievements as follows: 88 89 0 An alias free structure for sub-band adaptive filtering was developed. 0 A simple and modular least squares type algorithm was developed for updating the parameters. 0 The computational complexity of the algorithm is 0(l0g2N) per sample. 0 Analyses were conducted to establish the convergence of the parameters to the true values under developed the persistency of excitation conditions. 0 Simulations supporting the deductions from analysis were performed. 0 The algorithm was modified to incorporate a priori information about the input with an eye on improving performance. 0 A general frame work for developing sub-band adaptive filtering algorithms based on diagonal decomposition in the transform-domain was presented. The prospects of extending this research and applying it to other areas was discussed in the previous sections. These can be summarized as follows: 0 Modify PPS in a fashion similar to the work in [2] to reduce the block delay. 0 Develop a more general algorithm based on the diagonal decomposition in sub-bands. o Formulate an algorithm similar to PPS, based on Chirp z-transform. Such a structure would allow for non-uniform spacing of frequency domain estimates. 0 Extend PPS for equalization of non-minimum phase channels. BIBLIOGRAPHY BIBLIOGRAPHY [1] T.M. Apostol, Mathematical analysis, N arossa Publications, 1991. [2] M.R. Asharif and F. Amano, Acoustic echo-canceler using the FBAF algorithm, IEEE Trans. on Communications 42 (1994), no. 12, 3090-3094. [3] M.R. Asharif, F. Amano, S. Unagami, and K. Murano, Acoustic echo canceller based on frequency bin adaptive filters( F AF ), Proc. IEEE Int. Conf. Global Telecommn. (1987), 1940—1944. [4] K.G. Beauchamp, Transform of engineers, John Wiley, 1985. [5] M. Bellanger, G. Bonnerot, and M. Coudreuse, Digital filtering by polyphase network: application to sample rate alteration and filter banks, IEEE Trans. Acoustics, Speech and Signal Processing 24 (1976), 109-114. [6] A. Benveniste, M. Goursat, and G. Ruget, Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communications, IEEE Trans. on Automatic Control 25 (1980), no. 3, 385—399. [7] NJ. Bershad and PL. Feintuch, Analysis of frequency domain adaptive filter, Proc. IEEE 67 (1979), 1658—1659. [8] RP. Bitmead and B.D.O. Anderson, Adaptive frequency sampling filters, IEEE Trans. on Circuits and Systems 28 (1981), 610—615. [9] J .M. Cioffi and T. Kailath, Fast, recursive-least squares transversal filters for adaptive filtering, IEEE Trans. on Acoustics, Speech, and Signal Processing 32 (1984), 304-337. [10] G. Clark, S.K. Mitra, and S. Parker, Block implementation of adaptive digital filters, IEEE Trans. on Acoustics, Speech and Signal Processing 29 (1981), 744—752. [11] RE. Crochiere and L.R. Rabiner, Multirate digital signal processing, Prentice Hall, 1983. [12] P. Davis, Circulant matrices, John Wiley and Sons, 1979. [13] M. Dentino, J. McCool, and B. Widrow, Adaptive filtering in the frequency domain, Proc. IEEE 66 (1978), 1658—1659. 90 91 [14] H. Fan and W.K. Jenkins, A new adaptive IIR filter, IEEE Trans. on Circuits and Systems 33 (1986), 939—947. [15] , Adaptive IIR filtering, a new approach, Proc. ICASSP (1987), 2125-2128. [16] , An investigation of an adaptive IIR echo canceller: Advantages and problems, IEEE Trans. on Acoustics, Speech and Signal Processing 36 (1988), no. 12, 1819-1834. [17] Hong Fan, New adaptive IIR filtering algorithms, Ph.D. thesis, University of Illinois, Urbana, 1985. [18] WA. Gardner and LE. Franks, Characterization of cyclostationary random signal process, IEEE Trans. on Information Theory 21 (1975), 4-14. [19] A. Gillore, Experiments with subband acoustic echo cancellers for teleconferencing, Proc. ICASSP (1987), 2141-2144. [20] A. Gillore and M. Vetterli, Adaptive filtering in subbands with critical sampling: Anal- ysis, experiments and application to acoustic echo cancellation, IEEE Trans. on Signal Processing 40 (1992), no. 8, 1862—1874. [21] J. Glover, Adaptive noise cancelling of sinusoidal interferences, Ph.D. thesis, Stanford University, Stanford, California, 1975. [22] D.N. Godard, Self-recovering equalization and carrier tracking in a two-dimensional data communications system, IEEE Trans. on Communications 28 (1980), 1867-1875. [23] S. Haykin, Adaptive filter theory, Prentice Hall, 1990. [24] U. Iyer and M. Nayeri, Generalized adaptive polyphase algorithm, Proc. ICASSP (1996). [25] U. Iyer, M. N ayeri, and H. Ochi, Parameter tracking of the polyphase structure adaptive filtering algorithm, Proc. ISCAS (1994). [26] , A polyphase structure for system identification and adaptive filtering, Proc. ICASSP 3 (1994), 433-436. [27] , A polyphase based adaptive structure for adaptive filtering and tracking, IEEE Trans. on Circuits and Systems H 43 (1996), no. 3, 220—233. [28] U. Iyer, H. Ochi, and M. Nayeri, IIR subband adaptive filtering, Midwest Symp. on Sytems and Circuits 1 (1993), 125—128. [29] , Statistical analysis of multirate filters with stationary inputs, Tech. Report of IECE, Japan (1993). [30] C.R. Johnson, Adaptive IIR filtering: Current results, and open issues, IEEE Trans. on Information Theory 30 (1984), 237—250. 92 [31] , Admissibility in blind adaptive channel equalization, IEEE Control Systems (1991),3—15. [32] C.R. Johnson, M.G. Larimore, J .R. Treichler, and B.D.O. Anderson, SHARF conver- gence properties, IEEE Trans. on Acoustics, Speech, and Signal Processing 29 (1981), 659—670. [33] J.J. Kormylo and J.M. Mendel, Maximum-likelihood seismic deconvolution, IEEE Trans. on Geosciences and Remote Sensing 21 (1983), 72—82. [34] M.G. Larimore, J .R. Treichler, and C.R. Johnson, SHARF: An algorithm for adapting IIR digital filters, IEEE Trans. on Acoustics, Speech, and Signal Processing 28 (1980), 428-440. [35] B.P. Lathi, Modern digital and analog communications systems, Holt, Rinehart and Winston, 1983. [36] J .C. Lee and C.K. Un, Performance analysis of frequency-domain adaptive filters using overlap-add methods, IEEE Trans. on Circuits and Systems 36 (1989), no. 2, 173-189. [37] J .S. Lim and A.V. Oppenheim, Advanced topics in signal processing, Prentice Hall, 1988. [38] L. Ljung and T. Soderstrom, Theory and practice of recursive identification, MIT Press, 1983. [39] R.W. Lucky, Automatic equalization for digital communication, The Bell System Tech- nical Journal (1965), 547-588. [40] , Techniques for adaptive equalization of digital communications systems, The Bell System Technical Journal (1966), 255—286. [41] A. Mahalanobis, S. Song, S.K. Mitra, and M. Petraglia, Adaptive FIR filter structure based on structural subband decomposition for system identification problems, IEEE Trans. of Circuits and Systems 40 (1993), no. 6, 375-381. [42] G. Maral and M. Bousquet, Satellite communications systems, John Wiley and Sons, 1986. [43] S.K. Mitra and R. Gnanasekaran, Block implementation of recursive digital filters: new structures and properties, IEEE Trans. of Circuits and Systems 25 (1975), 762-769. [44] S.K. Mitra and A. Mahalanobis, A generalized structural subband decomposition of FIR filters and its application in efficient FIR filter design and implementation, IEEE Trans. of Circuits and Systems 40 (1993), no. 6, 363-374. 93 [45] B. Mulgrew and C.F.N. Cowan, Adaptive filters and equalisers, Kluwer Academic Pub~ lishers, 1988. [46] James Murphy, A comparison of transform domain adaptive filters, with emphasis on the Hartley transform, Ph.D. thesis, University of Illinois, Urbana, 1987. [47] SS. Narayan and A.M. Peterson, Frequency domain LMS algorithm, Proc. IEEE 69 (1981),124—126. [48] SS. Narayan, A.M. Peterson, and M.J. Narasimha, Transform domain LMS algorithm, IEEE Trans. on Acoustics, Speech and Signal Processing (1983), 609-615. [49] H. Ochi, U. Iyer, and M. Nayeri, A design method based for wavelet bases based on [IR filters, IEICE Trans. A J77-A (1994), no. 8, 1096—1099. [50] A.V. Oppenheim and R.W. Schafer, Digital signal processing, Prentice Hall, 1988. [51] J. Ortega, Matrix theory, Plenum, 1989. [52] M. Petraglia and S.K. Mitra, Adaptive FIR filter structure based on the generalized structural subband decomposition of FIR filters, IEEE Trans. of Circuits and Systems 40 (1993), no. 6, 354-362. [53] J .G. Proakis and D.G. Manolakis, Introduction to digital signal processing, Maxwell Macmillan, 1992. [54] S.U.H. Qureshi, Adaptive equalization, Proceedings of the IEEE 77 (1985), no. 9, 1349- 1387. [55] R. Riegler and R. Compton, Adaptive enhancement of multiple sinusoids in uncor- related noise, IEEE Trans. on Acoustics, Speech and Signal Processing 26 (1978), 240—250. [56] V. Sathe and RP. Vaidyanathan, Analysis of the effects of multirate filters on stationary random inputs, with application in adaptive filtering, Proc. ICASSP 3 (1991), 1681— 1684. [57] Y. Sato, A method for self-recovering equalization for multilevel amplitude-modulation systems, IEEE Trans. on Communications (1975), 679—682. [58] S. Shanmugam, Digital and analog communications systems, Wiley, 1979. [59] J .J . Shynk, Adaptive IIR filtering, IEEE ASSP Magazine 6 (1989), 4-21. [60] , Frequency-domain and multirate adaptive filtering, IEEE SP Magazine 6 (1992),14-37. 94 [61] D.T.M. Slock, C. Luigi, H. Lev-Ari, and T. Kailath, Modular and numerically stable fast transversal filters for multichannel and multiexperiment rls, IEEE Trans. on Signal Processing 40 (1992), no. 4, 784-793. [62] VS. Somayajulu, S.K. Mitra, and J .J . Shynk, Adaptive line enhancement using multi- rate techniques, Proc. ICASSP (1989), 928—931. [63] M.M. Sondhi, An adaptive echo canceller, Bell Systems Technical Journal 46 (1967), 497-511. [64] K. Steiglitz and LE. McBride, A technique for identification of linear systems, IEEE Trans. on Automatic Control 10 (1965), 461-464. [65] H. Taub and D. Schilling, Principles of communications systems, McGraw Hill, 1971. [66] ET. Tuncer and T.Q. Nguyen, Interpolated IIR mth-band filter design with allpass subfilters, IEEE Trans. on Signal Processing 43 (1995), no. 8, 1986-1990. [67] T.E. Tuncer and G.V.H. Sandri, Orthonormal wavelet representation using Butterworth filters, Proc. SPIE. Conf. on Adaptive and Learning Systems (1992). [68] RP. Vaidyanathan, Multirate digital filters, filter banks, polyphase networks, and ap- plications: A tutorial, Proc. of IEEE 78 (1990), 56-93. [69] , Multirate systems and filter banks, Prentice Hall, 1992. [70] B. Widrow and SD. Stearns, Adaptive signal processing, Prentice Hall, 1985. [71] B. Widrow and E. Walach, 0n the statistical efiiciency of the LMS algorithm with non-stationary inputs, IEEE Trans. on Information Theory 30 (1984), 211—221. [72] H. Yasukawa, S. Shimada, and l. Furukawa, Acoustic echo canceller with high speech quality, Proc. ICASSP (1987), 2125-2128. [73] J .R. Zeidler, An adaptive array for interference rejection, Proc. IEEE 61 (1973), 748— 758. "llllllllllllllllll“