Michigan State University

This is to certify that the dissertation entitled MULTIWEIGHT OPTIMIZATION IN OPTIMAL BOUNDING ELLIPSOID ALGORITHMS presented by Dale Joachim has been accepted towards fulfillment of the requirements for the Ph.D. degree in Electrical Engineering. Major Professor. MSU is an Affirmative Action/Equal Opportunity Institution.

MULTIWEIGHT OPTIMIZATION IN OPTIMAL BOUNDING ELLIPSOID ALGORITHMS

By Dale Joachim

A DISSERTATION submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, Department of Electrical Engineering, 1998.

ABSTRACT

MULTIWEIGHT OPTIMIZATION IN OPTIMAL BOUNDING ELLIPSOID ALGORITHMS

By Dale Joachim

Optimal Bounding Ellipsoid (OBE) algorithms offer an attractive alternative to traditional least squares methods for identifying linear-in-parameters signal and system models due to their computational efficiency, superior tracking ability, and selective updating that permits processor sharing among tasks. In existing OBE algorithms, optimization takes place pointwise; all conditions (in particular, previous weights) extant at the time of optimization remain fixed. The globally optimal solution at a given time would diminish the solution set in light of all observations.

This research introduces a new class of OBE algorithms with improved convergence speed and tracking capability, the multiple weight optimal bounding ellipsoid (MW-OBE) algorithms. Given a system of order m, a MW-OBE algorithm "revisits" K past weights when the observation set at time n is received and deemed innovative, so that the ellipsoid is optimally diminished with respect to the current and past K observations, conditioned upon information known at time n - K - 1. The simultaneous optimization over multiple weights in MW-OBE algorithms offers greater flexibility with respect to conventional methods in shaping the hyperellipsoid, thus potentially decreasing the solution set.

This dissertation derives a general MW-OBE algorithm form. This general algorithm is then optimized in the framework of two existing OBE algorithms, the quasi-OBE (QOBE) and set-membership weighted least squares (SM-WRLS). Simulation results are then presented, demonstrating the potential of the developed algorithms, MW-QOBE and MW-SM-WRLS.

Copyright by Dale Joachim 1998

Acknowledgments

First and foremost, thanks be to God, who probably watches me with great amusement and concern, for the gift of life and the opportunity to learn. I acknowledge my ancestors and the many who have sacrificed heavily to improve my world. I am greatly indebted to my dissertation advisor, Dr. John R. Deller, Jr., for his guidance, many hours of discussion and countless words of encouragement (some subliminal). Sincere thanks to Dr. Majid Nayeri, for his invaluable ideas and comments during the initial phases of my research, and to the other members of my Ph.D. guidance committee, Drs. Percy Pierre, Michael Frazier and Robert Nowak, I remain indebted. To Dr. Sonneborn in the Department of Mathematics and Dr.
Khalil in the Department of Electrical Engineering, who took the time to discuss my linear algebra problems; to the many graduate students who made themselves available to discuss difficult problems of absolutely no concern to them; to Dr. O’Kelly for proof-reading my dissertation; and to the department secretaries for their continued support, I remain indebted. My five. years at Michigan State University would not have been possible without the reliable funding of the National Science Foundation, the Department of Electrical Engineering, and the Office of the Associate Dean. To my family and relatives who observed my struggles through these not-so—easy times with great concern; to Neighbour for friendship and stimulating discussions; to the many comrades who made MSU enjoyable, I am indebted. To Dramey, who, with whimsical flair, disrupted my linear approach to life, I remember music. I remain indebted. Contents List of Tables List of Figures 1 Introduction and Background 1.1 Introduction ................................ 1.2 OBE in system identification ....................... 1.3 Review of OBE algorithms ........................ 1.4 Motivation for a new algorithm ..................... 1.5 Notation, variables and acronyms .................... 2 OBE with Multiple Weight Optimization 2.1 Introduction and overview ........................ 2.2 Recursions ................................. 2.3 A posteriori error vector and energy matrices .............. 2.3.1 A posteriori error, sum ...................... 2.3.2 A posteriori weighted energy matrix, Gnln ........... 2.4 Ellipsoid volume ............................. 2.5 Algorithm ................................. 2.6 Computational costs ........................... 2.7 Optimizations .............................. 3 Multiple Weight Quasi-OBE Algorithm 3.1 Introduction ................................ 3.2 Kn minimization .............................. 3.3 Existence and uniqueness of an optimal solution ............ 3.4 Incremental gain ............................. 3.5 Geometric interpretation of the K." minimization ............ 3.6 Algorithm ................................. 3.7 Computational cost ............................ 3.8 MW-QOBE with K = l ......................... 3.9 Illustrative examples ........................... 3.9.1 Decrease in ellipsoid volume ................... 3.9.2 Weight assignments ........................ 4 Multiple Weight Set-Membership Weighted RLS 4.1 Introduction ................................ 4.2 Volume minimization ........................... 4.3 Optimal solution existence and uniqueness ............... 4.4 MW-SW—WRLS with K = 1 ....................... 4.5 Algorithm ................................. 4.6 Illustrative examples ........................... 4.6.1 Decrease in ellipsoid volume ................... vi 4.6.2 Weight assignments ........................ 4.7 Volume minimization, alternative approach ............... Alternate MW-OBE Algorithms 5.1 Introduction ................................ 5.2 Forward—looking and backward-looking MW-OBE ........... 5.3 One-step MW—OBE ........................... 5.4 Proof of Theorem 3.5 ........................... 5.5 Example contrasting forward-looking, backward looking, and one-step MW—OBE algorithms ........................... Practical Benefits of MW-OBE Algorithms 6.1 Introduction ................................ 6.2 QOBE, SM-WRLS, RLS and LMS ................... 6.3 Signal cataloging ............................. 6.4 Parameter convergence and volume ................... 
6.4.1 MW-QOBE vs. QOBE ...................... 6.5 Tracking performance ........................... 6.5.1 MW—QOBE vs. QOBE ...................... 6.6 Selective updating ............................ 6.7 General findings, conclusions, recommendations for practice ..... Future Directions and Complementary Work 7.1 Introduction ................................ 7.2 Normalized MW-OBE algorithm ..................... 7.3 Hyperellipsoid volume .......................... 7.4 Application to linear—prediction analysis of speech signals ...... 7.5 Alternative volume minimization: Trace ................ 7.6 Approximations .............................. 7.7 MW-OBE in the Kalman-Bucy framework ............... 7.8 Conclusions, contributions and further work .............. Appendix A A.1 Definitions ................................. A.2 Matlab programs ............................. vii 59 6O 68 68 69 73 75 75 77 77 77 81 81 84 94 94 100 101 105 105 105 106 108 111 112 113 114 118 118 122 List of Tables 1.1 1.2 1.3 2.1 2.2 2.3 3.1 3.2 3.3 3.4 4.1 4.2 4.3 4.4 4.5 Vector and matrix notation. A, B represent matrices, a a vector and z, a scalar. ................................ MW-OBE vector and matrix variables at time n for a system of order m when considering K past weights. n is the “current” time and t represents the general sequence index ................... Acronyms and abbreviations ...................... The MW-OBE algorithm. ........................ Detailed computational costs of MW-OBE algorithms per update. The computation of enln-1 and Gnln-1 are included although they are usu- ally considered part of the innovation check. For simplicity, the nota- tion K is used to denote K + l. .................... Computational costs of MW-OBE algorithms per update for K = 0, 1, 2 and 3 as functions of m. ...................... The MW-QOBE algorithm (non-zero past weights). .......... Computational cost of the MW-QOBE per-innovation check. For sim- plicity, the notation K is used to denote K + 1. ............ Computational cost of the MW-QOBE per-observation innovation check when K = 1. ............................ First 30 weights assigned by QOBE and MW-QOBE (K = 1,2) al- gorithms in the identification of the AR(3) system 3;... = O.49y,._1 + 0.61yn_2+0.58yn_3+€n., where the “true” measurement error sequence {5...} is uniformly distributed over the interval (-1, +1). The QOBE and MW-QOBE (K = 1,2) algorithms selected 32, 10, 3 observations, respectively, from a total of 100 ...................... Notation used in Chapter 4. ....................... Function definitions used in Chapter 4. ................ The MW-SM-WRLS algorithm for K = 1. ............... The MW-SM—WRLS algorithm (non-zero past weights) ......... First 30 weights assigned by SM—WRLS and MW—SM-WRLS (K = 1) algorithms in the identification of the AR(3) system yn = 0.493),...1 + 0.61yn-2+0.58yn_3+em, where the “true” measurement error sequence {5...} is uniformly distributed over the interval (—1,+1). The SM~ WRLS and MW—SM-WRLS (K = l) algorithms selected 51 and 45 observations, respectively, from a total of 100. ............. viii 10 11 12 21 23 23 37 38 39 42 48 48 57 58 65 5.1 5.2 5.3 6.1 6.2 6.3 7.1 7.2 7.3 7.4 The forward MW-OBE algorithm ..................... 71 The one-step MW-OBE algorithm (overlapping weight adjustment blocks applied to non-zero past weights.) ................ Equivalence of the forward and backward MW-QOBE algorithms, and weight assignments in the one-step MW-QOBE. 
First eight weights assigned in the identification of the AR(2) system 3],. = —O.1Oy,._1 — 0.5631,.-2 + 6..., where the “true” measurement error sequence {5...} is uniformly distributed and bounded by 1. MW-QOBE (backward), MW-QOBE (forward) and one—step MW~QOBE used the same number of points. ................................. Parameter vector for the AR systems used in the simulations of Chap- ter 6, both time-invariant (TI) and time-varying. The operator ’\’ denotes the division remainder. ..................... 82 Excitation noises for the AR systems used in the simulations of Chapter 6. 83 Matrix of systems used in the simulations of Chapter 6. See Tables 6.1 74 and 6.2. .................................. 83 “Other” AR parameter estimators inside the ellipsoid (Figure 7.2). The coordinates and radii (“0“, — OCJJHZ) are in relation to the final SM- WRLS ellipsoid center (M = 14, volume = 1.4 x 1019 and 4.67 x 10"1 S semi-axes g 43.6). ........................... 109 SM-WRLS 14-dimensional hyperellipsoid semi-axes (see Figure 7.2). Volume 2 4.32 x 10”. .......................... 110 Poles of the 14th order system, based on the batch covariance estimate (see Figures 7.2 and 7.3) .......................... 110 Poles of the 14th order system (SM-WRLS), based on the ellipsoid llO center estimate (see Figure 7.2 and 7.3). ................ ix List of Figures 1.1 1.2 1.3 2.1 2.2 2.3 3.1 3.2 3.3 3.4 Hyperstrips w, and their intersection S1,, as described in (1.3) for a system of order 2 (m = 2) at time n = 3. ................ The ellipsoid superset f2... 2 (2,. corresponding to the system of Figure 1.1. 5 The polytope, ellipsoid and least squares surface for a system of order 2. The “true” and ellipsoid center estimators are denoted by a: and 0 respectively. ................................ Weight assignments in the MW—OBE algorithm. ........... 15 Rectangular error bound constraint (1.2) (dashed line) when K = l and m = 2. An error eniml originally violating the bound condition in light of new observations is mapped to a location inside the rectangle {u : In] 37} in its a posteriori En|n form. ............... 19 Computational costs (number of multiplications) of a single MW-OBE iteration for K = 0, 1, 2 and 3 for system orders m = 2, 4, 8, 12 and 16. K = 0 represents the conventional OBE algorithm case ......... 22 Error bound constraint (1.2) when K = 1 and m = 2. An error 6,..n_1 originally violating the bound condition is mapped to a location inside the rectangle {u : lul $7} in its a posteriori an... form. ....... 36 OBE ellipsoids resulting from the system identification of AR(2) sys- tem yn = -O.1Oy,._1 —— 0.56yn_2 + 5,... by QOBE (dashed line) and MW—QOBE (solid line, K21) at times it = 8 and 13. The star (*) represents the “true” parameter and the circles (o) the central estima- tors (superimposed). The underlying polytopes (exact feasible sets) are also shown. .............................. Convergence of the parameter estimator 0,, (2) to the “true” parameter 0,..(2) = 0.61 and associated ellipsoid volume in the system identifi- cation of yn = O.49y,._1 + 0.6131,.-2 + O.58y,._3 + 5,... by QOBE and MW-QOBE (Kr-1,2) as in Table 3.4 ................... 43 51,, and a posteriori error an," associated with the identification of the system of Figure 3.3 ............................ 44 4.1 4.2 4.3 5.1 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 OBE ellipsoids resulting from the system identification of AR(2) sys- tem y.. = -O.10y.._. — 0.56y.._2 + 5.... 
by SM—WRLS (dashed line) and MW—SM-WRLS (K21, solid line) at times n = 8 and 10. The star (4:) represents the “true” parameter and the circles (o) the central estima- tors (superimposed). The underlying polytopes (exact feasible sets) are also shown. .............................. Convergence of the parameter estimator 6..(2) to the “true” parameter 0...(2) = 0.61 and associated ellipsoid volume in the system identifica- tion of y.. = 0.4%.... + 0.61y.._2 + 0.58y.._3 + 5.... by SM-WRLS and MW-SM-WRLS (K=1) as in Table 4.5 .................. 66 K... and a posteriori error 5..,.. associated with the identification of the system of Figure 4.2 ............................ 67 Forward-looking and backward-looking MW—OBE algorithms’ duality. 69 Parameter 0..(2) = 0.562 convergence of QOBE and SM-WRLS vs. that of LMS and RLS in the identification of the AR(2) system y.. = —0.104y.._1 + 0.562y..._2 + 5..., where 5... is uniformly distributed over (-1,l). The QOBE and SM-WRLS algorithms used 22 and 121 points, 64 respectively, from a total of 1200 points. ................ 79 Parameter 0..(2) = 0.562 convergence and volumes of QOBE and SM-WRLS algorithms in the identification of the AR(2) system y.. = —0.104y.._1 + 0.562y.._2 + 5..., where 5.... is uniformly distributed over (-1,1). The QOBE and SM-WRLS algorithms used 22 and 121 points 80 respectively from a total of 1200 points .................. Parameter 9...(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 1 by MW-QOBE and QOBE algorithms, which used 94 and 108 points, respectively, from a total of 200 points. 86 Parameter 0...(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 2 by MW-QOBE and QOBE algorithms, which used 75 and 87 points, respectively, from a total of 700 points. 87 Parameter 9....(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 3 by MW-QOBE and QOBE algorithms, which used 100 and 82 points, respectively, from a total of 700 points. 88 Parameter 0...(1) = —0.10 convergence and volumes resulting from the identification of SYSTEM 4 by MW—QOBE and QOBE algorithms, which used 11 and 42 points, respectively, from a total of 100 points. 89 Parameter 9....(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 5 by MW-QOBE and QOBE algorithms, which used 117 and 116 points, respectively, from a total of 1500 points. 90 Parameter 0... (2) = 0.92 convergence and volumes resulting from the identification of SYSTEM 6 by MW-QOBE and QOBE algorithms, which used 174 and 166 points, respectively, from a total of 3000 points. 91 Parameter 9....(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 7 by MW-QOBE and QOBE algorithms, which used 214 and 203 points, respectively, from a total of 1500 points. 92 xi 6.10 Parameter 0....(1) = —0.10 convergence and volumes resulting from the 6.11 6.12 6.13 6.14 6.15 6.16 6.17 6.18 6.19 6.20 7.1 7.2 7.3 identification of SYSTEM 8 by MW—QOBE and QOBE algorithms, which used 9 and 43 points, respectively, from a total of 500 points. . Parameter 0....( 1) tracking and volumes resulting from the identifica- tion of SYSTEM 9 by MW-QOBE and QOBE algorithms, which used 109 and 110 points, respectively, from a total of 700 points ....... Parameter 0...(1) tracking and volumes resulting from the identifica- tion of SYSTEM 10 by MW-QOBE and QOBE algorithms, which used 70 and 72 points, respectively, from a total of 350 points. ....... 
Parameter 0.....(1) tracking and volumes resulting from the identifica- tion of SYSTEM 11 by MW—QOBE and QOBE algorithms, which used 52 and 95 points, respectively, from a total of 700 points. ....... Parameter 0....(1) tracking and volumes resulting from the identifica- tion of SYSTEM 12 by MW-QOBE and QOBE algorithms, which used 37 and 67 points, respectively, from a total of 350 points. ....... Ratio of points taken, averaged over 10 runs, in the identification of an AR(2) system with uniformly distributed error sequence by QOBE and MW-QOBE (adaptive K.) ...................... Ratio of points taken, averaged over 10 runs, in the identification of an AR(3) system with uniformly distributed error sequence by QOBE and MW-QOBE (adaptive K.) ...................... Ratio of points taken, averaged over 10 runs, in the identification of an AR(12) system with uniformly distributed error sequence by QOBE and MW-QOBE (adaptive K.) ...................... Ratio of points taken, averaged over 10 runs, in the identification of an AR(2) system with uniformly distributed error sequence by QOBE and MW—QOBE (fixed K.) ........................ Ratio of points taken, averaged over 10 runs, in the identification of an AR(3) system with uniformly distributed error sequence by QOBE and MW-QOBE (fixed K.) ........................ Ratio of points taken, averaged over 10 runs, in the identification of an AR(12) system with uniformly distributed error sequence by QOBE and MW-QOBE (fixed K.) ........................ Effect of the model order in the identification of the LP parameters in the voiced /I/ phoneme. We show the log of the final volumes 93 96 97 98 99 101 102 103 103 104 104 (normalized to the initial ellipsoid volume) using the model orders 7-14.108 Spectrum of the voiced /I/ phoneme (512 points). The F FT of the sequence is shown in (a), and the FF T of the impulse response of center (0d,), COV (9Lsa,L) and non-central estimators inside the ellipsoid (91-11,) are shown in (b). The speech segment is modeled by an AR(14) system. Ellipsoid volume = 1.4 ><1019 and 4.67x10‘1 3 axes _<_ 110.10. See Tables 7.1, 7.2, 7.3 and 7.4 for more information on estimates and ellipsoid ................................... Poles of the 14th order system based on the COV and OBE ellipsoid center estimates (see Figure 7.2, Tables 7.3 and 7.4). ......... xii 116 Chapter 1 Introduction and Background 1.1 Introduction System identification techniques are used in diverse fields, ranging from engineering, to natural and social sciences, to finance and economics. Indeed, many physical problems benefit from modeling and identification. Linear-in-parameters systems form a broad and important class of well-studied models for which extensive analysis tools have been developed. In particular, models that are linear in the data as well as in the parameters have been widely studied and applied in statistics [47], economics [38], biology [4, 42, 48], engineering [8, 19, 33] and other fields [24, 37]. In the autoregressiue (AR) model, for example, the output at a given time is a linear function of past outputs and an excitation sequence. This relatively simple model has been used to represent many physical systems [26, 45], a notable example of which is speech [9]. Several batch and recursive methods have been developed to identify linear-in— parameters system. Of these, set-membership (SM) algorithms are unique in providing a. set of feasible parameter vectors (a solution set) instead of a single point estimate. 
This is achieved through successive refinements of an initial solution set, consistent with a priori constraints on the signal or system model. Optimal bounding ellipsoid (OBE) algorithms belong to the class of recursive SM algorithms and iteratively assign a weight to each incoming data vector that reflects the current observation set’s potential to refine the solution set [14]. Each weight is determined by minimizing a measure of the size of an hyperellipsoidal feasibility set to which the “true” parameter vector must belong. In existing OBE algorithms, optimization takes place pointwise; all conditions (in particular, previous weights) extant at the time of Optimization remain fixed. The globally optimal solution at a given time would diminish the solution set in light of all observations. This research introduces a new class of OBE algorithms with improved conver- gence speed and tracking capability, the multiple weight optimal bounding ellipsoid (MW-OBE) algorithms. Given a system of order m, a MW-OBE algorithm “revisits” K past weights when the observation set at time n is deemed innovative, so that the ellipsoid is Optimally diminished with respect to the current and past K observations, conditioned upon information known at time n — K - 1. The number of revisited past weights, K, must be less than the system order m and may be time-Varying. The corresponding past data vectors are not required to be sequential. The simultaneous optimization over multiple weights in MW-OBE algorithms offers greater flexibility with respect to conventional methods in shaping the hyperellipsoid, thus potentially better fit to exact feasibility set (described later.) This dissertation begins with the derivation of a general MW-OBE algorithm form. This general algorithm is then optimized in the framework of two existing OBE algorithms, the quasi-OBE (QOBE) [39] and set membership-weighted least square (SM-WRLS) [l3]. Simulation results are then presented, demonstrating the potential of the developed algorithms, MW-QOBE and MW-SM-WRLS. 1.2 OBE in system identification Several batch methods are available to identify linear-in-parameters models system models, including minimum squared error (MMSE), maximum likelihood (ML) and least square error (LSE) [25, 26, 45]. The recursive least square (RLS) [26], least mean [square (LMS) [25], instrumental variable [45] and Optimal bounded ellipsoid (OBE) [8, 14, 17, 21] algorithms are recursive (in the estimates) techniques for use in on-line applications. RLS and LMS require the whiteness of the model distur- bance, and they fail to perform adequately in colored noise [26]. OBE algorithms 2 do not impose any statistical requirements on the disturbance, but require that the sequence squared, say {53"}, be pointwise bounded by a known sequence {’73,} [34]. OBE identification algorithms (e.g., [14, 15, 21]) have strong potential for application to signal-processing problems involving linear-in-parameters models. With respect to conventional least-square-error identification methods (e.g., [26]), OBE identifiers offer superior adaptation, improved accuracy, efficient use of innovation in the data, improved computational efficiency, robustness to measurement noise, robustness to deviation from the assumed input model, a set of feasible solutions rather than a sin- gle point estimate, and the ability to compute the solution recursively in time without block processing or windows (e.g., [2, 13, 15, 16]). 
OBE algorithms are used to identify linear-in-parameters models of the form

    y_n = θ_*^T x_n + ε_{*,n}    (1.1)

in which θ_* ∈ ℝ^m is the unknown "true" parameter vector to be identified; {x_n} is a sequence of measurable vectors of dimension m; and {ε_{*,n}} is a "true" but unknown error sequence. OBE algorithms are based on the premise that, for each n, the model error has a known pointwise energy bound,

    ε_{*,n}^2 ≤ γ_n^2.    (1.2)

This "true" model is posed only for analysis purposes and provides the background from which actual parameter vector estimates are derived. Given data on times t ∈ [1, n], an exact feasibility set, say Ω_n, of estimates for θ_* whose elements are consistent with these bounds is formally described by (Figure 1.1)

    Ω_n = ∩_{t=1}^{n} ω_t,  where  ω_t = {θ : ε_t^2 = |y_t - θ^T x_t|^2 ≤ γ_t^2}.    (1.3)

OBE algorithms work with a hyperellipsoidal set, say Ω̃_n, that is guaranteed to contain Ω_n, hence θ_*. The observations are scrutinized with respect to their ability to "shrink" Ω̃_n, hence to more tightly bound θ_*.

Figure 1.1. Hyperstrips ω_t and their intersection Ω_n as described in (1.3) for a system of order 2 (m = 2) at time n = 3.

At time n, the hyperellipsoid is given by (e.g., [14]) (Figure 1.2)

    Ω̃_n = {θ : (θ - θ_n)^T C_n (θ - θ_n) ≤ κ_n}    (1.4)

in which C_n is a weighted covariance matrix of the observations,

    C_n = Σ_{t=1}^{n} q_{t,n} x_t x_t^T,    (1.5)

κ_n is the scalar

    κ_n = θ_n^T C_n θ_n + Σ_{t=1}^{n} q_{t,n} (γ_t^2 - y_t^2),    (1.6)

and θ_n, the center of Ω̃_n, is a weighted least-square-error estimator of θ_* at time n (Figure 1.3),

    θ_n = P_n c_n,  with  P_n ≜ C_n^{-1}  and  c_n ≜ Σ_{t=1}^{n} q_{t,n} y_t x_t.    (1.7)

The weighting sequence in this process at time n, {q_{t,n}}_{t=1}^{n}, is chosen to optimally diminish some set measure of the hyperellipsoid.

Figure 1.2. The ellipsoid superset Ω̃_n ⊇ Ω_n corresponding to the system of Figure 1.1.

OBE algorithms make selective use of incoming data in updating the ellipsoid and central estimator. Frequently, the observations at time n contain no innovation in the sense that they cannot be used to reduce the size of Ω̃_{n-1}. This is manifest in the failure to find valid weights, and the effort of updating can be avoided at this time. Depending on the properties of the sequence {ε_{*,n}}, OBE algorithms often update only 10 percent of the time or less.

1.3 Review of OBE algorithms

Schweppe published one of the first OBE-type algorithms in early 1968 [44] in the context of estimating state parameters of linear dynamic systems using noisy observations. Assuming bounded inputs and bounded observation error, Schweppe's algorithm estimates the state of the system using bounding ellipsoids. However, as Schweppe notes [44], this novel algorithm is presented without convergence proof, processes all available data, and is computationally impractical.

Witsenhausen in 1968 [49], and Bertsekas and Rhodes in 1971 [5], tackled the state-estimation problem from a SM approach.

Figure 1.3. The polytope, ellipsoid and least squares surface for a system of order 2. The "true" and ellipsoid center estimators are denoted by * and o respectively.

Under similar assumptions as those of Schweppe, Bertsekas and Rhodes examined filtering, prediction and smoothing problems. The algorithms of Schweppe, and of Bertsekas and Rhodes, have a Kalman-Bucy filter structure which is optimal as a state estimator in Gaussian white noise. In 1979, Fogel [20] published an OBE identification algorithm for the ARX model (e.g., [36]) based on a priori knowledge of the cumulative error energy.
Fogel proves the convergence of the hyperellipsoid central estimator to the true parameter in a de— terministic setting, by demonstrating that the hyperellipsoid asymptotically reduces to a point set. In 1982, Fogel and Huang [21] published an OBE algorithm with selective updat— ing (F-H/OBE) which processes only relevant data, a key feature of modern OBE algorithms. This data—selection process is achieved by assigning weights to each in- coming data vector, with a zero weight indicating data rejection. A pre—processing information check 0(m2) determines the possibility of a non-zero weight, thereby po— tentially eliminating redundant computations. Fogel and Huang present a sufficient Condition for the convergence of the F-H/OBE hyperellipsoid to a point, provided the observation error is white noise. The validity of this proof remains controversial, and a more recent proof in a stochastic setting with a more general class of OBE algorithms is presented in the paper by Nayeri et al. [40]. The F-H/OBE data selection process was improved to 0(m) complexity in 1987 by Dasgupta and Huang [8] in their OBE-type algorithm (D-H/OBE). By minimizing an, a scalar not apparently related to the hyperellipsoid volume [see ( 1.4)], Dasgupta and Huang derive a simple but effective algorithm and prove the asymptotic convergence (at an exponential rate) of its central estimator to a region around the true parameter. Deller et al. introduced the set-membership weighted recursive least squares (SM- WRLS) algorithm in 1989. SM-WRLS is similar to F -H/OBE but is derived in a much different manner [13, 18]. A major difference between the F-H/OBE algorithm, derived from geometric considerations, and SM-WRLS, derived as an RLS algorithm with special weighting, is in the weighting strategy. With the introduction of SM- WRLS, the relationship between OBE and WRLS was formally established. In 1993, the set-membership stochastic approximation (SM-SA) was introduced in a paper that unifies all previously known versions of OBE algorithms [14, 15]. The first stochastic proof of convergence (in probability) of an OBE algorithm is achieved with the SM- SA algorithm in [14], and the unification in [14] implies that the convergence result is generally applicable to all published OBE algorithms. In 1991, Cheung [6, 7] published the optimal volume algorithm (OVE) based on an affine transformation which reduces the hyperellipsoid volume without imposing the condition that the hyperellipsoid center be equivalent to 9... This relaxation on the hyperellipsoid center improves reduction in the hyperellipsoid size with a minimal increase in computational cost. The compounded effect of (1.2) at each n restricts the “true” parameter vector estimate to an exact polytope. In the 1991 paper by Veres and Norton [46], it is proved that, under heavy-tailed conditions on {5...} (see Appendix A.1) and certain persistency of excitation [26] (PE) conditions on {:c..}, these polytOpes converge in probability to a point set. In spite of the potential benefits of OBE identification, a significant practical impediment precludes widespread application. The practical difficulty concerns the estimation of proper model error bounds which are theoretically necessary for reliable identification. Failure to prescribe accurate bounds in OBE processing is potentially catastrophic. Underestimated bounds may cause divergence to a parameter vector that is not even capable of generating the observed data (i.e. 
outside the set Q"), while overestimated bounds may cause the estimate to “freeze” at a biased estimate. Although improper bounds imply statistical inconsistency in theory, OBE algorithms have been successfully applied to a range of applications in which bounds can only be estimated. The OBE algorithm with automatic bound estimation (OBE—ABE) developed by Lin in 1996 [34] is the first OBE method to theoretically solve the difficult problem of accurately estimating “true” model error bounds in ARX models with “true” error {5...} and exogenous input {an} (included as part of the measurements {:c..}). In theory, OBE-ABE removes the practical roadblock to model identification using OBE algorithms. OBE—ABE converges consistently under conditions on the “true” model error sequence {5...} that are met by many practical signals, and additionally provides the customary OBE set of feasible solutions. The basis for the “ABE” in OBE-ABE is Lin’s proof [34] that, if the error bounds are overestimated, there exists an interval Z of length N over which no update takes place for any finite N. Thus, the need to adjust the bound is practically indicated by a sufficiently long period over which no update takes place. When such an interval is found, OBE-ABE invokes its bound re-estimation recursion which depends on N and an “adjustment constant” 5. The choices (and possible iterative refinements) of parameters N and 5 have a significant impact on OBE-ABE performance in practice [29]. One of Lin’s principal results [34] asserts that the OBE-ABE bound re—estimation recursion results in point-set convergence of $2.. in probability if, in addition to a PE requirement on {ten}, and a statistical infinite visitation (IV) criterion on {5...}, {5...} and {un} are asymptotically independent (see Definitions A.1 and A2 in Appendix A.1). However, the choice of N and 5 may still pose practical challenges in certain applications [29]. Lin’s convergence proof depends upon asymptotic analyses in which n—>oo, N-+oo,and5—>O. Huang and hisresearch group at the University of Notre Dame (UND) and Deller working at UN D and later at Michigan State University (MSU) recently focused on the quasi-OBE (QOBE) algorithm, an OBE algorithm featuring a very simple innovation data check, similar to that of D-H/OBE [22, 39] but with a weighting strategy similar 8 to SM-WRLS. Working together, these groups proved that in a deterministic setting the parameter estimator asymptotically converges to the “true” parameter vector given PE and disturbance sequence IV of any arbitrary neighborhoods of the true bounds [11, 22]. A convergence proof of the QOBE ellipsoid volume is forthcoming and requires joint conditions on {:c..} and { 5...} that would seem to be rare in practice [11, 39]. 1.4 Motivation for a new algorithm All published OBE algorithms can be manipulated into the formal framework de- scribed in Section 1.2, provided that we allow for time-varying ( “n-dependent”) weight sequences as we have done in (1.5) - (1.7) [14]. In all published cases, however, this time dependence (if any) has a simple structure arising from a either a generalized “forgetting factor” (e.g., c.f. [15] and [21]) or some heuristic measures to induce adap- tation (e.g., [17]). In no published case is there an attempt to reoptimize any of the “previous” weights at time n in light of the new measurements 23.. and y... 
That is, all optimization in existing OBE algorithms can be accomplished by manipulation of the current weight only except for possible inherent scaling of past weights. This can be inferred directly from the work in [14]. The globally optimal solution at time n would optimize all weights {q.,..}[‘:., in light of all known information {(x.,y.)}?=.. In this work, we develop an algorithm that can potentially approach the perfor- mance of a globally—optimized algorithm at each time, by “revisiting” K 2 0 past weights when data at time n are received and deemed innovative. The revision of past weights is made by additive adjustments to existing weight values, subject to the constraint that any revised weight remain nonnegative and that the number of revisited weights be less than the system order. 1.5 Notation, variables and acronyms The vector and matrix notation, acronyms and abbreviations employed throughout the remaining developments are defined in Tables 1.1 - 1.3. 9 Table 1.1. Vector and matrix notation. A, B represent matrices, a a vector and x, a scalar. Notation Definition I identity matrix with appropriate dimensions AT transpose of matrix A T A.T (A'l) when matrix A‘1 exists A adjoint of the matrix A, i.e. :4— : det (A)A"T A(*, i) N-vector comprising the ith column of A, an N x M matrix A(i, *) M -vector comprising the ith row of A, an N x M matrix A(i,j) (i,j) element of matrix A \(A) matrix formed by setting off-diagonal elements of N x N matrix A to zero A o B Hadamard product of two N x M matrices A and B defined to be the N x M matrix with (i,j) element A(i,j)B(i,j) [27] A > 0 each element of the matrix A is positive (similar notation applies to vectors) [A] matrix with (i,j) element [A(i,j)[ (similar notation applies to vectors) A > B A - B > 0 (similar notation applies to vectors) sign(A) matrix with (i, 3') element |A(:3)| (similar notation applies to vectors) a(i) ith element of vector 0 D(a) diagonal matrix with ith diagonal element a(i), where a is an N-vector d(A) N x 1 vector with ith element A(i, i), with A an N x N matrix E . . h . . 1 t 6x BA matrix Wit (2,]) e emen _8A(i,j)' lO Table 1.2. MW-OBE vector and matrix variables at time n for a system of order m when considering K past weights. n is the “current” time and t represents the general sequence index. Quantity Notation Dimension Note: KdéfK+1 Observation matrix X. dz-e-f [ x.._K -- - x.._. an m x K Output vector yn dz”! [ y.._K - -- y.-. y.. ] K x l e T ’ Weight adjustments A. ‘2 [ A.-. A... A... ] K x 1 vector Weight adjustments A. Cg D(A..) K x K matrix . . def T ~ Cumulative weight q”, : [q._K,.. q.-.” q... ] K x 1 vector Cumulative weight QM ”Er D(q.,,,) K x K matrix T - Error bound vector 7,, {if [ 7..-K -- - 7.._1 y.. ] K x 1 Error bound matrix 1",. ‘13-! Db") K x K T .. Conditional predic- 5.]. {if [ 5..-“. 5..” ] K x 1 tion error vector Conditional predic- En]. d:0f D(5..,.) K x K tion error matrix ~ - def T ~ Sign vector (Signs of 3.. = [ 21:1 i1 :l:1 ] K x1 elements are chosen in context) Sign matrix S.. (ire-f D(s..) Instantaneous G.” ”—3! XIHX. covariance-weighted observation energy matrix (Inverse) relative en- H. drgf I + A..G..,.._. K x K ergy gain kc >§i xix! X X 11 Table 1.3. Acronyms and abbreviations Definition ] Acronym ABE automatic bound estimation AR autoregressive (parametric model) ARX autoregressive with exogenous input (parametric model) i IV infinite visitation [ IID independent and identically distributed 7;. 
LMS least mean square (algorithm) ML maximum likelihood MW-OBE multiple weight optimal bounding ellipsoid (algorithm) MWnQOBE multiple weight strategy applied to QOBE (algorithm) MW-SM-WRLS multiple weight strategy applied to SM-WRLS (algorithm) MMSE minimum mean squared error OBE-ABE optimal bounding ellipsoid with automatic bound estima- tion (algorithm) OBE Optimal bounding ellipsoid (algorithm) OVE optimal volume ellipsoid (algorithm) PE persistently exciting, or persistency of excitation (as appro— priate in context) QOBE quasi-OBE (algorithm) SM set-membership SM-SA set-membership-stochastic-approximation (algorithm) SM-WRLS set—membership weighted least squares (algorithm) UCT uniformly conditionally tailed WRLS weighted recursive least squares (algorithm) Chapter 2 OBE with Multiple Weight Optimization 2.1 Introduction and overview This chapter introduces the general class of MW-OBE algorithms and presents back- ground formulation for the MW-QOBE and MW-SM-WRLS algorithms (presented in the following two chapters). At each time n, conventional OBE algorithms update the previous covariance matrix 0.-., by incorporating a weighted outer product of the current data vector, if this vector is deemed innovative (e.g., [14]). This process, although efficient, may result in large ellipsoid volumes [13, 34], often due to the shape and size of the underlying exact polytope [46]. However, when adequate PE and IV conditions are present (Appendix A.1), pointwise reduction in ellipsoid volume may be improved by a joint weight assignment [29, 30]. If (2.. is too large, adding weights to OBE will not help. A principal objective of this research is to develop methods for taking optimal advantage of the information in the data by optimizing a present and reoptimizing K past weights over a block of measurements. The block optimization process begins by separating the last K +1 outer products in the covariance matrix (1.5) n-K-l Cn : Qt,n$z$¢T+ Z qt,n$tmrtr (21) (:1 tzn—K 13 _{nafi‘ofi-fi I .-| ‘7;qu where the block of weights {q.,..}?:,,_K is reoptimized at each time n. To express (2.1) in matrix form, let X.. represent the observation matrix (the block of K + 1 data vectors beginning at index n — K), X. ‘2‘ [ In—K x.._K+. x. ], and q”, the vector containing the weights applied to the vectors in X.. at time t. In these terms, (2.1) is written n—K—l Cn : qtfllztx’tr + XnQn,nXI (2'2) t=l where QM, = ’D(q,,,,,). Let A. ”:33 qn’n — r1,,.,,_l represent the adjustments to the weights applied to observation matrix X. at time n, with corresponding diagonal matrix A. = D(/\..). The weight adjustments represent the difference between the a priori and newly computed weights, with the most recent “adjustment” being a modification to a zero (by definition) weight. The recursive expression (2.2) in terms of the weight adjustments becomes C. = C.-.+X.A.X3‘. (2.3) The (composite) weights at any time must be non-negative, gm, 2 A..+q,.,,.-. > 0, in order to retain proper meaning. The block weight assignment strategy is comparable to a sliding window over the sequence of data vectors, where the weights assigned to a data vector vary with time n, but only vary during the time interval [n,n + K]. Accordingly, at time n, the time—varying weights are computed by (Figure 2.1) ELK/My, 0‘_<_t 0 for any t and n. The weight adjustments A... are zero outside the time interval [72 -— K S t S n]. 
This formulation allows the covariance matrix to be updated at each n with a set of weight adjustments acting upon the present and past K observation vectors. In the next section, we derive the general MW-OBE recursions, beginning with (2.3).

Figure 2.1. Weight assignments in the MW-OBE algorithm: the weights at time n - 1, the adjustments at time n, and the resulting composite weights q_{t,n} = q_{t,n-1} + λ_{t,n} for t ∈ [n - K, n].

2.2 Recursions

As in other OBE algorithms, at each time t, the MW-OBE computes the inverse covariance matrix, P_t ≜ C_t^{-1}, as part of its recursion [15]. In the present formulation, the matrix P_n is similarly obtained by applying the matrix-inversion lemma (see Appendix A.1) to the covariance, (2.3), yielding

    P_n = P_{n-1} - P_{n-1} X_n H_n^{-1} Λ_n X_n^T P_{n-1},    (2.5)

where H_n ≜ I + Λ_n G_{n|n-1} and G_{n|n-1} ≜ X_n^T P_{n-1} X_n. The existence of the matrix H_n^{-1} Λ_n = [Λ_n^{-1} + G_{n|n-1}]^{-1} is contingent upon all weight adjustments in the matrix Λ_n being non-zero and [Λ_n^{-1} + G_{n|n-1}] being invertible. In practice this constraint is adhered to by omitting data vectors corresponding to zero weights [see (2.3)]. Another constraint, the reason for which is not yet obvious, is that K be no larger than m. The general reason for this inequality is the necessity to invert G_{n|n-1} in future developments.

As in conventional OBE algorithms, the MW-OBE algorithm recursively computes the ellipsoid center θ_n and the scalar κ_n. The ellipsoid center represents the parameter vector estimate at time n, and κ_n is a defining scalar for the ellipsoid [see (1.4)]. The equalities

    c_n = c_{n-1} + X_n Λ_n y_n    (2.6)
    P_{n-1} X_n H_n^{-1} = P_n X_n    (2.7)

are first noted from (1.7) and (2.5) in order to simplify further developments. The recursion for the ellipsoid center θ_n is derived in a manner similar to that used by Deller and Luk [13]. By employing (2.6) and (2.7) in (1.7), we obtain

    θ_n = P_n c_n = P_n c_{n-1} + P_n X_n Λ_n y_n
        = θ_{n-1} - P_{n-1} X_n H_n^{-1} Λ_n X_n^T θ_{n-1} + P_n X_n Λ_n y_n
        = θ_{n-1} + P_n X_n Λ_n (y_n - X_n^T θ_{n-1})
        = θ_{n-1} + P_n X_n Λ_n ε_{n|n-1}.    (2.8)

Recursion (2.8) is similar in form to its conventional OBE counterpart (it reduces to the conventional recursion when K = 0) but updates the ellipsoid center as a function of K past data vectors. By substituting the recursion

    θ_n^T C_n θ_n = (θ_{n-1} + P_n X_n Λ_n ε_{n|n-1})^T C_n (θ_{n-1} + P_n X_n Λ_n ε_{n|n-1})
        = θ_{n-1}^T C_n θ_{n-1} + θ_{n-1}^T C_n P_n (X_n Λ_n ε_{n|n-1}) + (X_n Λ_n ε_{n|n-1})^T P_n C_n θ_{n-1} + (X_n Λ_n ε_{n|n-1})^T P_n C_n P_n (X_n Λ_n ε_{n|n-1})
        = θ_{n-1}^T C_n θ_{n-1} + θ_{n-1}^T (X_n Λ_n ε_{n|n-1}) + (X_n Λ_n ε_{n|n-1})^T θ_{n-1} + (X_n Λ_n ε_{n|n-1})^T P_n (X_n Λ_n ε_{n|n-1})
        = θ_{n-1}^T C_{n-1} θ_{n-1} + (y_n - ε_{n|n-1})^T Λ_n (y_n - ε_{n|n-1}) + 2 (y_n - ε_{n|n-1})^T Λ_n ε_{n|n-1} + ε_{n|n-1}^T Λ_n G_{n|n} Λ_n ε_{n|n-1}
        = θ_{n-1}^T C_{n-1} θ_{n-1} + y_n^T Λ_n y_n - ε_{n|n-1}^T Λ_n ε_{n|n-1} + ε_{n|n-1}^T Λ_n G_{n|n} Λ_n ε_{n|n-1}
        = θ_{n-1}^T C_{n-1} θ_{n-1} + y_n^T Λ_n y_n + ε_{n|n-1}^T [Λ_n G_{n|n} Λ_n - Λ_n] ε_{n|n-1}
        = θ_{n-1}^T C_{n-1} θ_{n-1} + y_n^T Λ_n y_n - ε_{n|n-1}^T [Λ_n^{-1} + G_{n|n-1}]^{-1} ε_{n|n-1}

into (1.6), we obtain

    κ_n = κ_{n-1} + (γ_n - y_n)^T Λ_n (γ_n + y_n) + θ_n^T C_n θ_n - θ_{n-1}^T C_{n-1} θ_{n-1}
        = κ_{n-1} + γ_n^T Λ_n γ_n - ε_{n|n-1}^T H_n^{-1} Λ_n ε_{n|n-1}.    (2.9)

Expression (2.9) also yields the conventional SM-WRLS recursion for κ_n [13] when K = 0.

2.3 A posteriori error vector and energy matrices

The overall objective of any OBE algorithm is to define a set of solutions, or a single solution, that closely identifies the parameters of a system. This objective is achieved by seeking to reduce either the distance (in some sense) between the parameter estimate and the "true" parameter vector, or the ellipsoid size.
The latter in effect brings the estimator to a closer neighborhood of the true parameter vector when Ω̃_n becomes "small." The progress of OBE algorithms in achieving the given objective is observable in the effects of each recursion at time n on the error vector ε_{n|n-1} and energy matrix G_{n|n-1}. In the process of reducing the current ellipsoid size, OBE algorithms "re-map" the error (scalar or vector) to satisfy inequality (1.2). As a result, ε_{n|n} and G_{n|n} provide important insights into algorithm behavior. In this section we express the error vector and energy matrix in their a posteriori representations, to better illustrate MW-OBE behavior.

2.3.1 A posteriori error, ε_{n|n}

To satisfy the pointwise error bound constraint (1.2), the a posteriori error vector ε_{n|n} must belong to the hyper-box {u ∈ ℝ^{K+1} : |u(i)| ≤ γ_n(i)} (Fig. 2.2). (The hyper-box is defined with the inclusion of its boundaries.) The transformation which maps ε_{n|n-1} to this hyper-box at time n is found by expressing ε_{n|n} in terms of ε_{n|n-1} using (2.8),

    ε_{n|n} = y_n - X_n^T θ_n
            = y_n - X_n^T θ_{n-1} - G_{n|n} Λ_n ε_{n|n-1}
            = ε_{n|n-1} - G_{n|n} Λ_n ε_{n|n-1}
            = [I - G_{n|n} Λ_n] ε_{n|n-1}.    (2.10)

By Lemma A.1 (see Appendix A.1), the inverse of the matrix H_n = I + Λ_n G_{n|n-1} becomes

    H_n^{-1} = I - Λ_n X_n^T (C_{n-1} + X_n Λ_n X_n^T)^{-1} X_n = I - Λ_n G_{n|n},    (2.11)

which, combined with (2.10), yields

    ε_{n|n} = H_n^{-T} ε_{n|n-1}.    (2.12)

From (2.12) we note that the mapping of the error vector ε_{n|n-1} to its a posteriori image ε_{n|n} is achieved through the transformation H_n^{-T}. The expression for the scalar κ_n takes a simpler form through the use of ε_{n|n}; substituting (2.12) in (2.9), we have

    κ_n = κ_{n-1} + γ_n^T Λ_n γ_n - ε_{n|n-1}^T Λ_n ε_{n|n},    (2.13)

or, using the Hadamard product notation (see Table 1.1),

    κ_n = κ_{n-1} + λ_n^T (γ_n ∘ γ_n - ε_{n|n-1} ∘ ε_{n|n}).    (2.14)

2.3.2 A posteriori weighted energy matrix, G_{n|n}

The weighted energy matrix (divided by κ_n) represents the composite projection of the data vectors in X_n on the current ellipsoid axes. The a posteriori matrix G_{n|n} is found by substituting the right side of (2.5) into X_n^T P_n X_n, yielding

    G_{n|n} = X_n^T P_n X_n
            = X_n^T P_{n-1} X_n - X_n^T P_{n-1} X_n H_n^{-1} Λ_n X_n^T P_{n-1} X_n
            = G_{n|n-1} - G_{n|n-1} H_n^{-1} Λ_n G_{n|n-1}
            = H_n^{-T} G_{n|n-1}.    (2.15)

Figure 2.2. Rectangular error bound constraint (1.2) (dashed line) when K = 1 and m = 2. An error ε_{n|n-1} originally violating the bound condition in light of new observations is mapped to a location inside the rectangle {u : |u| ≤ γ} in its a posteriori ε_{n|n} form.

(Since G_{n|n} is symmetric, (2.15) is also equal to G_{n|n-1} H_n^{-1}.) The matrix H_n^{-1} may be expressed as

    H_n^{-1} = (G_{n|n-1})^{-1} (G_{n|n}).    (2.16)

Theorem 2.1 The matrices G_{n|n-1}, G_{n|n} are positive definite.

Proof: Since P_{n-1} is positive definite, a^T G_{n|n-1} a = (X_n a)^T P_{n-1} (X_n a) ≥ 0 for all a ∈ ℝ^{K+1}, or G_{n|n-1} is positive semi-definite [41]. The same is deduced for G_{n|n}. With both G_{n|n-1}, G_{n|n} non-singular (no zero eigenvalues), we conclude that they are positive definite. ∎

2.4 Ellipsoid volume

The volume of the ellipsoid Ω̃_n is proportional to the determinant of κ_n P_n and therefore the ratio

    det(κ_n P_n) / det(κ_{n-1} P_{n-1}) = (κ_n / κ_{n-1})^m · det(P_n) / det(P_{n-1})    (2.17)
(2.20) det (KnPn) Results (2.19) and (2.20) will facilitate future develOpments. 2 . 5 Algorithm M W—OBE algorithms are similar to other OBE algorithms in that the observation matrix is checked for innovation at each iteration. If the observation matrix is deemed useful, a new inverse covariance matrix and parameter vector estimate are computed. An advantage of MW-OBE over conventional OBE algorithms is the flexibility in selection and number of past weights to revisit. Indeed, MW-OBE does not specify Which past weights are to be revisited, but limits their number to (m —- 1), m being the system order, due to the necessity to invert the matrix Gn'n_1 (see Section 2.2 and Chapter 3). The simulations presented in Chapter 6 are generated using MW—OBE algol‘ithms that revisit the past K non-zero weights. Updating the past sequentially nun'lbered K weights often results in reconsideration of previously rejected observation vectors (zero weighted observation vectors) which may not offer any new information 20 represents an appropriate measure for the change in volume [13]. The ratio of deter- minants in (2.17) reduces to det (P11) —1 -1 T _ = I - Xn A + an n— X71 Pn— det (19,.-.) l " ‘ 1] ‘ det [A;1 + Gn]n_1 - Gnln—l] det (A;1 + Gn,,,_1) 1 “ det(I+A,.G,.,,,_,) = det(H;‘)=1/det(H,,), (2.18) or, det(H,,) = gig—"1)— (2.19) det (Pu) . Incorporating (2.19) into the inverse of (2.17), we obtain det (Km..1Pn_1) (Km—1)," :: d t Hn . 2.2 damp") e ( ) n. ( 0) Results (2.19) and (2.20) will facilitate future developments. 2.5 Algorithm MW-OBE algorithms are similar to other OBE algorithms in that the observation matrix is checked for innovation at each iteration. If the observation matrix is deemed useful, a new inverse covariance matrix and parameter vector estimate are computed. An advantage of MW-OBE over conventional OBE algorithms is the flexibility in selection and number of past weights to revisit. Indeed, MW—OBE does not specify which past weights are to be revisited, but limits their number to (m — 1), m being the system order, due to the necessity to invert the matrix Gn]n-1 (see Section 2.2 and Chapter 3). The simulations presented in Chapter 6 are generated using MW-OBE algorithms that revisit the past K non-zero weights. Updating the past sequentially numbered K weights often results in reconsideration of previously rejected observation vectors (zero weighted observation vectors) which may not offer any new information 20 Table 2.1. The MW-OBE algorithm. I. Initialization: 1. 0K = 0,16K =1 and PK : £1, 11 small. 2. qK : 0. II. Recursion: Forn=K+1,K+2,... Form Xn matrix from present and chosen past K data vectors along with corresponding ya and 7,, vectors. EnIn—l : yn — Xian-l Rn = Pn—IXn and Gnln—l : XIRn If current and past K observations are innovative, determine optimal weight vector, An (optimization criterion dependent). Otherwise, next 11. Kn 2 Hum;1 + G,¥H_1]“l Pn = Pn—l _ Kan 0n = on—l + Knenln-l Kn 2 En.) +7ZAn7n - e:|n_1[A;1 + Gn]n_1]’1e,,ln_1 (when necessary) Next n. on the system. This issue needs more study although empirically, observations taken in the past are more likely to be informative. The MW-OBE algorithm is described in Table 2.1. Intermediate matrices Rn and Kn are introduced there to simplify the recursions. 2.6 Computational costs The following discussion is restricted to the computational costs of the general form of the MW-OBE algorithm. These recursions, common to all MW-OBE algorithms, are only performed when a data matrix is deemed informative. 
MW-OBE algorithms compute the error vector ε_{n|n-1} and the energy matrix G_{n|n-1} in the process of checking for innovation in the observation matrix. Although these costs are included in the following considerations, costs due to particular optimization methods (actual innovation check and weight generation) are excluded.

Detailed per-update cost is shown in Table 2.2. When K̃ = 1 (K = 0), corresponding to the conventional OBE algorithm, this cost becomes (3/2)(m^2 + 3m + 4), comparable to that of the algorithm described in [15] at O(m^2). This O(m^2) performance is maintained for m > K. When K ≈ m, the per-update cost increases to O(m^4). In order to keep the cost of this algorithm compatible with real-time (on-line) operation, we limit this study to the revisitation of one or two past weights. In so doing, the computational costs are kept within a "reasonable" range (see Table 2.3 and Figure 2.3). Note that with such small values of K, the inversion of the matrix [Λ_n^{-1} + G_{n|n-1}] is relatively inexpensive. The expression for computational cost shown in Table 2.2 includes the computations of ε_{n|n-1} and G_{n|n-1}, which are used in the innovation check.

Figure 2.3. Computational costs (number of multiplications) of a single MW-OBE iteration for K = 0, 1, 2 and 3 for system orders m = 2, 4, 8, 12 and 16. K = 0 represents the conventional OBE algorithm case.

As pointed out in [10, 13, 15, 17], OBE algorithms update the parameter estimator by using a small percentage of observation vectors, often less than 5% (on simulated data with known bounds). MW-OBE algorithms use an even smaller number of observation vectors [30, 31] and thereby compensate for the higher per-update computational cost. This tradeoff makes MW-OBE algorithms attractive in applications where per-point convergence is the dominant concern.

Table 2.2. Detailed computational costs of MW-OBE algorithms per update. The computations of ε_{n|n-1} and G_{n|n-1} are included although they are usually considered part of the innovation check. For simplicity, the notation K̃ is used to denote K + 1.

    Expression                                | Dimension | No. of Multiplications
    ε_{n|n-1}                                 | K̃ x 1     | mK̃
    R_n = P_{n-1} X_n                         | m x K̃     | m^2 K̃
    G_{n|n-1}                                 | K̃ x K̃     | (mK̃/2)(K̃ + 1)
    K_n = R_n [Λ_n^{-1} + G_{n|n-1}]^{-1}     | m x K̃     | K̃^3 + mK̃^2 + K̃
    P_n = P_{n-1} - K_n R_n^T                 | m x m     | (mK̃/2)(m + 1)
    θ_n = θ_{n-1} + K_n ε_{n|n-1}             | m x 1     | mK̃
    κ_n                                       | 1 x 1     | 3K̃ + K̃^2
    Total                                     |           | (3mK̃/2)(m + K̃ + 2) + K̃(K̃^2 + K̃ + 4)

Table 2.3. Computational costs of MW-OBE algorithms per update for K = 0, 1, 2 and 3 as functions of m, with values for typical m.

    K | No. of Multiplications per Update | m = 10 | m = 100 | m = 1000
    0 | (3/2)(m^2 + 3m + 4)               | 201    | 15456   | 1504506
    1 | 3m^2 + 12m + 20                   | 440    | 31220   | 3012020
    2 | (3/2)(3m^2 + 15m + 32)            | 723    | 47298   | 4522548
    3 | 6(m^2 + 6m + 16)                  | 1056   | 63696   | 6036096

The computational costs in Table 2.2 exploit symmetries in the computations. Additional reduction in cost is achieved by noting that the expensive recursion for P_n may be evaluated at reduced cost [23, 28] using an LU decomposition (Choleski factorization) of H_n^{-1} Λ_n. This technique also simplifies the computation of κ_n.

2.7 Optimizations

The matrix defining the hyperellipsoid at time n is given by [see (1.4)] P_n^{-1}/κ_n.
The measure det(κ_n P_n) is proportional to the square of the volume of the ellipsoid and is most often minimized in the OBE optimization process. Minimization of the trace of κ_n P_n is also a meaningful measure of size (e.g., [14, 21]). Minimization of the parameter κ_n, first suggested in [8], had been controversial with respect to its interpretability [14] until recently, when Huang's research group developed the QOBE algorithm [22, 39]. QOBE, which minimizes κ_n in conjunction with a specific weighting strategy, provides interesting interpretations of this optimization process. Because of the relative algebraic simplicity of the algorithm, we use the QOBE-like approach of κ_n minimization to develop a specific instance of the MW-OBE algorithm, MW-QOBE. MW-QOBE is presented in Chapter 3. On the other hand, the SM-WRLS OBE algorithm optimizes the volume of the ellipsoid at each step. This is used to develop a second MW-OBE algorithm, MW-SM-WRLS. In Chapter 4, the MW-SM-WRLS is presented with simulations.

Chapter 3

Multiple Weight Quasi-OBE Algorithm

3.1 Introduction

In this chapter we develop the MW-QOBE, a MW-OBE algorithm based on minimizing the scalar κ_n with respect to the present weight and past K weights (the diagonal elements of Λ_n) if the new data admit further reduction. We compare this novel approach to the QOBE algorithm developed by Huang et al. [22]. We prove the uniqueness of the optimal solution for general K ≤ m, and experimentally study the case of K = 1.

3.2 κ_n minimization

Recent study of the QOBE algorithm has shown the merit of minimizing the scalar κ_n [22, 39]. This simple yet efficient algorithm offers good convergence of the parameter estimator to the true parameter vector [11]. When the prediction error ε_{n|n-1} generated by the current parameter estimator θ_{n-1} and observation vector x_n falls outside the error bound constraint (1.2), the QOBE generates a new parameter estimator θ_n which re-maps the prediction error to (exactly) the bound (|ε_{n|n}| = γ_n). Although developed from an "OBE" point of view, the condition for data acceptance (|ε_{n|n-1}| > γ_n) and the process of mapping ε_{n|n-1} into ε_{n|n} are found to be "decoupled" from the ellipsoid, a departure from other OBE algorithms (and the reason for the name "QOBE"). This independence from the ellipsoid makes QOBE particularly interesting in time-varying applications due to its robustness to a "true" parameter θ_* moving outside the ellipsoid. In this section we develop a specific MW-OBE algorithm by using as the optimization criterion the minimization of κ_n with respect to the weight vector λ_n. In the following theorem, we derive the optimal weight vector λ_n at time n which minimizes κ_n over the present and past K weight adjustments.

Theorem 3.1 The scalar κ_n is minimized by the weight adjustments

    λ_n = (G_{n|n-1} S_n Γ_n)^{-1} (ε_{n|n-1} - S_n γ_n)

where S_n is a diagonal matrix with diagonal elements ±1.

Proof: In minimizing κ_n (2.9), we encounter the term ε_{n|n-1}^T H_n^{-1} Λ_n ε_{n|n-1}, a quadratic expression in ε_{n|n-1} which involves the inverse matrix H_n^{-1} Λ_n = [Λ_n^{-1} + G_{n|n-1}]^{-1}. We use the notation of Table 1.1 in writing λ_n(i) to mean the ith element of the vector λ_n at time n; λ_n(i) is equivalent to Λ_n(i, i) since Λ_n = D(λ_n). Differentiating the term H_n^{-1} Λ_n with respect to an arbitrary (scalar) weight λ_n(i), we obtain

    ∂(H_n^{-1} Λ_n)/∂λ_n(i) = [∂H_n^{-1}/∂λ_n(i)] Λ_n + H_n^{-1} ∂Λ_n/∂λ_n(i)
        = -H_n^{-1} [∂Λ_n/∂λ_n(i)] G_{n|n-1} H_n^{-1} Λ_n + H_n^{-1} ∂Λ_n/∂λ_n(i)
        = H_n^{-1} [∂Λ_n/∂λ_n(i)] (I - G_{n|n-1} H_n^{-1} Λ_n)
        = H_n^{-1} [∂Λ_n/∂λ_n(i)] H_n^{-T}.    (3.1)

From (3.1) we conclude that

    ε_{n|n-1}^T [∂(H_n^{-1} Λ_n)/∂λ_n(i)] ε_{n|n-1} = ε_{n|n-1}^T H_n^{-1}(*, i) [H_n^{-1}(*, i)]^T ε_{n|n-1},    (3.2)

where the column vector H_n^{-1}(*, i) (see Table 1.1) is the ith column of matrix H_n^{-1}.
H,, a (“(11”) . (3.1) me (3.1) we conclude that T a(H;lA”) _ . _ . T ”ht—l 8A (2,) Eula—1 EIIn—lHfll(*l 1) [Hnl(*, 1)] Enln_1, (3.2) n E where the column vector H ,1" l(*, i) (see Table 1.1) is the ith column of matrix H ,‘,’ 1. 26 By incorporating (3.2) in the differentiation of (2.9) with respect to A,.(i) we obtain 6 ° 2 T _1 . _1 , 6A3” = 6111(2)) _enln‘lHfl (*,’t) [Hn (*iz)]Ten|n—l = (7.0-) + 55..-.H;1(*.i)) (7.6) — e1].-.H;‘(*. 2)). (33) Hence, 6K... _ 2 8A" = r3 - ['D(H,,Te,.|,._.)] , (3.4) an . . . . . . 6x where at: 18 the diagonal matrix whose ith diagonal element [S m. Using the Hadamard product notation (see Table 1.1), and recalling that A,. = D(A,.), (3.4) is expressed as 016,. 0A,. = (7,. + H;T€n|n—l) 0 (7n “ HgTenIn—l) ° (35) The optimal diagonal weight matrix A,. (or vector A,. ) is the solution of the equation an" : O, . 3A,. (3 6) Using [A] to denote matrix with (i, j) element [A(i, 3)], (3.5) and (3.6) imply that IHzTenln—II —'7.. = 0. (3.7) To solve (3.7), we define the column vector of signs, :l:1, 3,. (113-f sign{H;Tc-:,.l,._1} (see Table 1.1). By incorporating |H;Te,.,,._.| = (H;Te,.,,._.) o 3,. (3.8) into (3.7), and multiplying on the left by H I, the optimal weight vector is found to be the solution in A,. of Enln-l = HZhn o 3n) = (7n 0 8n) + GnIn—lAnhn o 3n). (39) 27 After some manipulation, we obtain A,. = [Grill—1 (5",..-) —7,, o s,.)] o (9,, o 3,.) (3.10) where 9,, denotes the column vector whose elements are the diagonal elements of I: 1. If we further define the matrix 5,. déf D(s,.), then (3.10) can be also expressed as An 2 (Gn]n—ISnFn)-1(En]n-l- Sn’Yn). (3.11) We verify that the solution point is a minimum by demonstrating a positive determi— nant of the Hessian, the matrix whose (i, j) element is 62’s,. aA.(z‘)6A.(j)° (3‘12) Using the differentiation vector notation of (3.5), the Hessian (3.12) is alternatively represented by the (symmetric) matrix whose ith column is a 85,. am) (3A.) (3.13) The derivative of (3.5) with respect to a scalar weight A,.(i) is expressed as 6 81s,. “T a ‘ ax.(i)(a,\n) z ‘2(Hn en.._1)o]m (HnTen).._.)] (3.14) which, by incorporating (3.1), becomes 6 0K... _ _ _‘T _T (9H: _T (9A,.(2) (6&1) - 2(Hn Eula—1)0[ Hn aAn(i)Hn Enln—l] - (3.15) Evaluating (3.15) at any root, say A; of the first derivative (3.5) and by substituting H;Te,.l,._. = 5,03. into (3.14), we obtain the columns of the Hessian matrix a 6"“211 _ —T 6H: axnu) (611,.) x ‘ ”WWII" ax.(1)5"7" 0A,. 2 n H—TGnn— ——. n 37n0 n [ laAn(z)S’7n 28 0A,, : 25n7n O H;TGn|n~lrnSn—— (”a(i) 0A : 2 n nn— —l———n._' n - S7no I lHn 6An(2)57n The determinant of the Hessian is then det fill—— 62K" - 2K+1det (S I‘ )det (G H‘IS F) 6A,.(l) 6A", ’6An(K+1)aAn _ 11 fl nln-l n n n 2K+l 2 =——-———t nFndthn_. damn)“ (5 ) e< . I) (3.16) A necessary and sufficient condition for a non-negative Hessian determinant (3.16) is det (Hn) Z 0. This condition is satisfied with valid weight adjustments, made evident by expressing H n in its a posteriori representation (2.16). I Remark: It is noteworthy that setting K = 0 reduces this expression to the optimal scalar weight in the QOBE algorithm [22, 39]. Corollary 3.1 Optimizing [in over several weights at each n results in a non- increasing sequence {Kn}. Proof: By substituting (3.7) into (2.9) we obtain K'n = M4 +7ZAnhn ‘ SnEnxn—il (3-17) = rem -7fAnSn(en.n-1 — 51:7,.) = fin—i - ArrnSn(€n|n—l — Sn‘Yn) = Hn-i — (Enln-I "‘ Sn7n)TG;[71._l(£nln—l “ 5:1an (3-18) Where Gn‘IL, is positive definite. Therefore, K." 
is a non—increasing function when evaluated at A" of (3.11). Equation (3.18) provides an algebraic proof of Corollary 3.1 along with the amount 0f decrease in K." at each step. To simply prove that Kn _<_ fin—1 [using weights (3.11)], we note that "nun-:0 2 final. I 29 3.3 Existence and uniqueness of an optimal solu- tion A major benefit of OBE algorithms is the avoidance of computationally costly recur- sions by a simple redundancy check that detects when observations supply no new information. Indeed, by pre-determining the absence of an optimal solution through a computationally inexpensive test, an OBE algorithm avoids laborious computations that ultimately yield zero or negative weights (indicating no optimal solution). The QOBE algorithm is particularly attractive due to very its simple test for innovation, notably lEnIn—ll > 7". In the previous section, the MW-QOBE algorithm optimal weights were derived as functions of a sign vector (or equivalent diagonal matrix) 3,, (at time n). This (K + 1) x 1 sign vector is formed from a set of 2K +1 possible permutations of :tl elements, with the constraint (see Section 2.1) that the (composite) weights at time n be element—wise positive, qn’n : An + qn,n—l > 0. (3.19) An efficient test for new information in the MW-OBE observation matrix presents new challenges. Although a sufficient test for the presence of an optimal solution is not yet found, we report a necessary condition in Theorem 3.1. The following lemma represents the principal algebraic result needed to prove a condition for existence of the weight adjustment vector A,.. Lemma 3.1 Let A represent an n x n symmetric positive definite matrix and let u and v be two n x 1 vectors with v(i) > 0, for all i E [1, n]. There exists a sign vector 8 with corresponding diagonal matrix S = ”0(3) (see Table 1.1) which satisfies the vector inequality SA(u — Sv) > 0, only if vTDz(v—|u|) S 0 where D is a diagonal matrix of eigenvalues of A. Pr00f: We multiply the vector inequality 3.41:. > SASv by the vector v > O and 30 obtain the scalar inequality vTSAu > vTSASv > 0 (3.20) which can be re-written sTVAu > sTVAVs or (3.21) sz > 2T2 (3.22) where v 2 73(1)), z ‘1—5‘ RDVs, w 42‘ RDu, A = RTD2R with D a diagonal matrix comprised of the positive square roots of the eigenvalues of A (all positive since A is positive definite) and R unitary. The functions f1(z) déf sz and f2(z) déf sz represent a hyperplane and a quadratic function in z, respectively. The inequality (3.22) is only achieved in the interior of the spheroid represented by sz — sz = 0. (3.23) In order to express (3.23) in the form Hz - 15c”2 - a = 0. where zc represents the center of the spheroid and a is a scalar, we choose 1 1 Cz- d =- 2. z 211) an a 4”w” The spheroid equation then becomes 4 1 l l (W) Hz - iwll2 = 1, or Hz — iwl)2 = Iliwllz. (3.24) Substituting the expressions for z and w, and recalling that R is unitary, (3.24) iInplies that IIRD(2V8 "11)“2 = HRDUII2 ||D(2VS - u)“2 = HDUH2 0f 31 vTDz(v—Su) = 0. Any feasible vector u therefore satisfies {u : vTD2 (v — Su) 3 0} (3.25) where the vector 3 is one of 2" +1 possible sign vectors. The smallest value on the left side of (3.25) occurs whenever a = sign(u). Therefore any n consistent with (3.25) must satisfy vTD2(v-|u|) g o. (3.26) Hence the existence of a solution requires that there be a vector u satisfying (3.26). Theorem 3.2 (Necessity) There exists a weight adjustment vector A,. (3.11) sat- isfying (3.19) only if 7:11:67... 
- lean—1|) S 0 (3.27) where Dn is the diagonal matrix of eigenvalues of Gn)n_1. Proof: We may assume without loss of generality that A,. > 0 as demonstrated in Chapter 5. Let A = dirt-1’ vn = 6,4,...) and an :7”. Apply Lemma 3.1 to prove the existence of an appropriate sign vector, hence, a weight adjustments vector. I Next we prove that at most one weight adjustment vector A satisfies (3.11) with constraint (3.19) beginning with the following lemma. Lemma 3.2 Let A represent an n x n symmetric positive definite matrix and let u and v be two n x 1 vectors with v(i) > 0, for alli E [1,n]. There exists at most one sign vector with corresponding diagonal matrix S = D(s) such that SA(u — Sv) > 0. (3.28) 32 Proof: Let 31 ;£ 32 be sign vectors with corresponding diagonal matrices SI and 52 both satisfying inequality (3.28). Without loss of generality, we assume that the mismatched elements of S) and 52 are consecutively arranged in the top left quadrants as S 1,, and 320, since we may re-order the basis and preserve the positive definitiveness of A. The partitioned matrices (including A appropriately partitioned) are Sla O —Sla 0 An Ac S] I , $2 = , A = T (3.29) 0 [51,, 0 IS» Ac [A,. We now add the inequalities corresponding to S 1 and 52, 0 < (31+ $2)Au — ($14431 + 321432)” and incorporate (3.29), to obtain l 0 0 SlaAaSla 0 v0 0 < - 2 _ o 23.. 0 l sit/1.5... v. i _ZslaAaSIava o < (3.30) _ 2(Slb — SlebSlb)vb where [ ya I vb ]T is an appropriately partitioned vector v. Multiplying each side of the top partition inequality by the vector v,, > O maintains the inequality, therefore 0 < “vaTSlaAaSla'vaa a contradiction since A0 is positive definite. Hence a sign vector 3 satisfying (3.28) is unique. I Theorem 3.3 (Uniqueness) At most one weight adjustment vector An solving (3.11) also satisfies (3.19). Proof: Define A, an and v" as in the proof Theorem 3.2. Apply Lemma 3.2. I 33 3.4 Incremental gain The merit of MW-QOBE with respect to QOBE is its ability to further reduce rcn and the ellipsoid volume at each update. In this section we explore the incremental gains in such improvements. Although “valid” weights are required to be non-zero for the non-singularity of H ; 1A,, (see Section 2.2), in practice a zero weight may be replaced by a small num- ber2. This substitution allows us to prove the following theorem. Theorem 3.4 If valid optimal weights ARK and A;,K_1 exist at time n for the opti- mization of an and 52mg-“ then Kn,K(/\;,K) < an_1(A;,K_1). Heuristic proof: We can always achieve KmK : anJ by taking A,.,K = [0 I An,K—1(2:K) ] a: [5 | An’K_1(2;K) ] where e is a small number. If this solution produces the smallest possible an, then the new (K th) observa- tion provides no innovation and therefore Kn,K(A;,K) = Kn,K—1(A;,K_.1)- Otherwise "n.K(’\;,K) < Kn,K-1(/\Z,K-1)- I Proof with exact incremental gain: The energy matrix at time n when optimizing over K weights may be partitioned in the following manner: T ,K Gnln—1.K : g" ng (331) 9n,K Gnln—l,K—l with ng ‘2‘ x:_KP,,_lx,._K, 9,7,}, déf xZ_KP,,_1X,,,K-1, Xn,K_1. We define the d r _ . . . . . - scalar AmK é 9n.K — 9:,KGnlh—l,K-lgn.K' Since Gn‘K is posrtive definite, An}, = G;}((1,1) is positive. Let us recall (3.18), Kn : Kn—l —' (Enln—l _ Sn7anGn—|:;_1(Enln—l — 31:7,.) and define umK : 5,”le — Sn.K7n,K- Then, by Lemma A.5, we rewrite an as 2 ——1 —1 T “3n,K = Kn,K-1— n,K (un,K(1) n,K_1Xn,K_1Pn—1$n,K-i “* “ax—1) \ 2Necessary for the existence of A; l 34 and conclude that an < nn,K_1. 
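To make the optimization of Theorem 3.1 concrete, the following sketch (Python/NumPy; not part of the original text, and all variable names are illustrative) enumerates the 2^(K+1) candidate sign vectors, evaluates the adjustment of (3.11) for each, and keeps the one whose composite weights satisfy the positivity constraint (3.19). By Theorem 3.3, at most one candidate can succeed, so returning the first feasible one is safe.

```python
import itertools
import numpy as np

def mwqobe_weight_adjustment(G, eps, gamma, q_prev):
    """Candidate MW-QOBE adjustment lambda_n of (3.11) under constraint (3.19).

    G      : (K+1, K+1) matrix G_{n|n-1} = X_n^T P_{n-1} X_n (positive definite)
    eps    : (K+1,) prediction errors eps_{n|n-1}
    gamma  : (K+1,) error bounds gamma_n (all positive)
    q_prev : (K+1,) accumulated weights q_{n,n-1} on the revisited window
    Returns lam with q_prev + lam > 0 elementwise, or None if no sign vector
    is feasible (the observation supplies no new information).
    """
    G_inv = np.linalg.inv(G)
    for signs in itertools.product((1.0, -1.0), repeat=len(gamma)):
        s = np.array(signs)
        # (3.11): lam = (G S Gamma)^{-1}(eps - S gamma) = Gamma^{-1} S G^{-1}(eps - S gamma)
        lam = (s / gamma) * (G_inv @ (eps - s * gamma))
        if np.all(q_prev + lam > 0):        # constraint (3.19)
            return lam                       # unique by Theorem 3.3
    return None
```

Setting K = 0 reduces the search to two sign choices and recovers the scalar QOBE weight, consistent with the remark following Theorem 3.1.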
Theorem 3.5 The sequence of ellipsoid volumes (over the times of updates) in MW- OBE algorithms is decreasing. Proof: Deferred until Chapter 5. I 3.5 Geometric interpretation of the Kn minimiza- tion The geometric interpretation of the MW-QOBE algorithm provides considerable in- sight into its behavior. The QOBE algorithm (MW-QOBE with K = 0) maps the absolute value of the (scalar) a posteriori error, |€n|n_1|, to the bound 7,. [11, 39]. We observe a similar behavior in the MW-QOBE algorithm by incorporating (3.7) into (2.12) to obtain 7n 077: — 6.71]?! O Enln = 0 (3.32) This reveals that the MW-QOBB algorithm maps the component-wise absolute value of a posteriori error vector enln to the error vector bound 7,, by requiring 7n : lenlnl- (3.33) This phenomenon is illustrated in Figure 3.1 for the case K = 1. At each iteration, the MW-QOBE algorithm attempts to map the error vector Enln-l to the unique Sn'yn vector (one of 2" +1) which satisfies condition (3.19) through the transformation (3.7). This condition imposes a more stringent requirement on the acceptance of observation vectors and as a result provides a more selective screening process. 3 .6 Algorithm The MW-QOBE algorithm appears in Table 3.1. The initial conditions mirror those in QOBE supplemented by the initial “accumulated” weight vector qwk1 which is 35 Enln—l Sn 27" 71(2) 5,; {7n 1 5 5 1 -’7..( k {M ) : S" 37" meal 371.477: Figure 3.1. Error bound constraint (1.2) when K = 1 and m = 2. An error 6","-1 originally violating the bound condition is mapped to a location inside the rectangle {u : In] 37} in its a posteriori an," form. appropriately assumed to contain all zero elements. This vector contains the accumu- lation of all adjustments made to the weights over the window of time that it currently represents, n - K, . . . ,n. At each iteration, the elements of q,,,,,_1 are “shifted” to reflect the new time window. The check for innovation at time n in the conventional QOBE algorithm (K = O) is l5nln—1l >7n- (3-34) With weight reoptimization (K > 0), satisfaction of this simple test is still necessary for any further computation to be required on the window 12 - K, . . . , n. Indeed, if there is no innovation in the observation at time n, then the past K weights are already optimal. The use of the QOBE check for innovation (3.34) also becomes the basis for a simple adaptive (in K) version of MW-QOBE algorithm. When an Optimal weight adjustment vector for K past weights cannot be found, the algorithm may opt to use the optimal weight for K = O. This adaptive process guarantees further decrease in Kn (per update, see Section 3.4) and allows use of QOBE algorithm convergence results [11, 39]. The computational cost of the method is drastically reduced by recourse to the simple check (3.34) prior to further optimization. 36 EnIn-l Figure 3.1. Error bound constraint (1.2) when K = 1 and m = 2. An error Enln...1 originally violating the bound condition is mapped to a location inside the rectangle {u : In] 37} in its a posteriori enln form. appropriately assumed to contain all zero elements. This vector contains the accumu- lation of all adjustments made to the weights over the window of time that it currently represents, n — K, . . . ,n. At each iteration, the elements of qn,n-l are “shifted” to reflect the new time window. 
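The per-update recursions collected in Table 3.1 below can be sketched in code as follows (Python/NumPy; a hedged illustration with illustrative names, not the author's implementation). It assumes that an acceptable sign vector S_n and adjustment λ_n have already been found as in Theorem 3.1 and, as noted in Section 3.4, that any zero adjustment has been replaced by a small positive number so that Λ_n is invertible.

```python
import numpy as np

def mwqobe_update(theta, P, kappa, X, y, gamma, lam, s):
    """One MW-QOBE recursion (cf. Table 3.1), given an accepted adjustment.

    theta : (m,)    central estimator theta_{n-1}
    P     : (m, m)  matrix P_{n-1}
    kappa : float   kappa_{n-1}
    X     : (m, K+1) observation matrix X_n (present + chosen past K vectors)
    y, gamma, lam, s : (K+1,) observations, bounds, adjustments, sign vector
    """
    eps = y - X.T @ theta                              # eps_{n|n-1}
    R = P @ X                                          # R_n = P_{n-1} X_n
    G = X.T @ R                                        # G_{n|n-1}
    # Lambda_n must be invertible: zero adjustments are assumed to have been
    # replaced by a small positive number (see Section 3.4).
    Kn = R @ np.linalg.inv(np.diag(1.0 / lam) + G)     # K_n = R_n [Lam^{-1} + G]^{-1}
    P_new = P - Kn @ R.T                               # P_n
    theta_new = theta + Kn @ eps                       # theta_n
    kappa_new = kappa - lam @ (s * (eps - s * gamma))  # kappa_n, eq. (3.18)
    return theta_new, P_new, kappa_new
```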
Table 3.1. The MW-QOBE algorithm (non-zero past weights).

I. Initialization:
1. θ_K = 0, κ_K = 1 and P_K = (1/μ)I, μ small.
2. q_K = 0.

II. Recursion: For n = K+1, K+2, ..., L (available data length)
  If |ε_{n|n−1}| > γ_n
    Form the X_n matrix from the present and chosen past K data vectors, along with the corresponding y_n and γ_n vectors.
    ε_{n|n−1} = y_n − X_n^T θ_{n−1}
    R_n = P_{n−1} X_n and G_{n|n−1} = X_n^T R_n
    If S_n exists such that S_n G_{n|n−1}^{−1}(ε_{n|n−1} − S_n γ_n) > 0
      λ_n = (G_{n|n−1} S_n Γ_n)^{−1}(ε_{n|n−1} − S_n γ_n)
    Otherwise, next n.
    K_n = R_n [Λ_n^{−1} + G_{n|n−1}]^{−1}
    P_n = P_{n−1} − K_n R_n^T
    θ_n = θ_{n−1} + K_n ε_{n|n−1}
    κ_n = κ_{n−1} − λ_n^T S_n(ε_{n|n−1} − S_n γ_n)   (if necessary)
  Next n.

The computational complexity of the reoptimized algorithm is significantly worse per update than that of QOBE without reconsideration of past data. However, these selective algorithms tend to incorporate so few data that, even with the additional burden at times of update, the overall complexity remains O(m), the complexity of the computations at a time for which no update occurs. Further, there is empirical evidence that reoptimization may result in a significant reduction in the number of updates over conventional OBE optimization.

3.7 Computational cost

The basic computational costs are described in Section 2.6. The initial data-acceptance check remains the same as in the QOBE algorithm. If a QOBE optimal weight exists, the algorithm attempts an optimization over the past K weights (3.19), which requires the computation of G_{n|n−1}^{−1} (K³ floating-point operations (flops)) as well as its multiplication by a vector (K² flops) (3.11). Note that the computation of det(G_{n|n−1}) is not necessary to check inequality (3.19), since det(G_{n|n−1}) > 0 and therefore does not change the inequality. Table 3.2 summarizes the computational cost of the MW-QOBE algorithm.

Table 3.2. Computational cost of the MW-QOBE per-innovation check. For simplicity, the notation K̄ is used to denote K + 1.

Expression | Dimension | No. of multiplications
ε_{n|n−1} | K̄ × 1 | mK̄
R_n = P_{n−1}X_n | m × K̄ | m²K̄
G_{n|n−1} | K̄ × K̄ | (mK̄/2)(K̄ + 1)
S_n G_{n|n−1}^{−1}(ε_{n|n−1} − S_n γ_n) | K̄ × K̄ | < K̄³
λ_n | K̄ × 1 | K̄² (×2^K̄)
Total | | (mK̄/2)(m + K̄ + 2) + K̄(K̄² + K̄ + 4)

The computational cost is decreased with increased K due to the reduction in observations deemed innovative (avoiding laborious recursions).

3.8 MW-QOBE with K = 1

Readjusting a single past weight offers important insight into the behavior of the general MW-QOBE algorithm. The innovation check requires the computation of the vector ε_{n|n−1} (2m flops), the 2 × 2 matrix G_{n|n−1} (2m² + 3m flops), and a maximum of four tests of condition (3.19), each requiring four multiplications (see Table 3.3).
Table 3.3. Computational cost of the MW-QOBE per-observation innovation check when K = 1.

Expression | No. of multiplications
ε_{n|n−1} | 2m
R_n = P_{n−1}X_n | 2m²
G_{n|n−1} | 3m
S_n G_{n|n−1}^{−1}(ε_{n|n−1} − S_n γ_n) | 4 (×4, max)
λ_n | 4
Total | 2m² + 5m + [8–20]

3.9 Illustrative examples

3.9.1 Decrease in ellipsoid volume

In order to illustrate the effect of increased K in MW-QOBE algorithms, we consider the AR(2) system

y_n = −0.10 y_{n−1} − 0.56 y_{n−2} + ε_{*n},   (3.35)

where ε_{*n} is uniformly distributed over the interval (−1, +1). We use the MW-QOBE with K = 0 (QOBE) and MW-QOBE (K = 1) to identify this AR(2) system of length 100. The QOBE algorithm found 21 points relevant to the optimization process, as compared with the five used by MW-QOBE with K = 1. Figure 3.2 shows the ellipsoids generated by the two techniques at times n = 8 and n = 13 with corresponding polytopes, ellipsoid centers and true parameters. The MW-QOBE ellipsoids show improved alignment with the major axis of the polytope, and reduced volumes. We note that QOBE does not focus on decreasing volume. In fact, it more-or-less "ignores" the ellipsoid altogether in the attempt to minimize κ_n.

3.9.2 Weight assignments

In this section, we illustrate the weight assignment mechanism in the MW-QOBE algorithms and their data screening behavior. We consider the AR(3) system

y_n = 0.49 y_{n−1} + 0.61 y_{n−2} + 0.58 y_{n−3} + ε_{*n},   (3.36)

where ε_{*n} is biased and confined to the interval [−0.5, +1]. Table 3.4 shows the first 30 weight assignments. As expected, the QOBE algorithm uses the largest number of observations in the optimization process, followed by the MW-QOBE (K = 1) and MW-QOBE (K = 2) algorithms. At time n = 17 (K = 2), the non-zero weight q_{17,17} = λ_17 = [9.34 × 10⁻⁴, 1.904 × 10⁻³, 4.148 × 10⁻³]^T is assigned to observations previously ignored. Therefore, observation vectors x_15 and x_16 became relevant in light of the new observation vector x_17. We also observe at times n = 15, 16 and 17 (K = 1) the compounding effects of weight adjustment as consecutive observation blocks are used. Between times 15 and 17, the weight applied to observation vector x_15 increased from 6.242 × 10⁻³ to 6.551 × 10⁻³ and the one applied to x_16 decreased from 1.906 × 10⁻³ to 5.443 × 10⁻⁴. This example illustrates the importance of allowing negative adjustments (within constraints) to past data vectors which may not convey as much information as previously computed in light of a current observation.

The MW-QOBE (K = 2) used only three observation vectors in identifying (3.36), compared with the 32 and 10 needed by QOBE and MW-QOBE (K = 1), respectively. Using this reduced number of observations, the MW-QOBE (K = 2) was able to identify system (3.36) in an amount of time comparable to that required by smaller K, as seen in Figure 3.3a. The associated volume is greater (Figure 3.3b) due to the smaller number of points taken (even though, per update, increasing K decreases volume). A similar remark is true of the plot of κ_n in Figure 3.4b. The a posteriori error in Figure 3.4a shows its mapping to the error bound (when an observation is accepted).

[Figure 3.2: two panels, (a) time n = 8 and (b) time n = 13; plotted ellipsoids omitted.]

Figure 3.2. OBE ellipsoids resulting from the system identification of the AR(2) system y_n = −0.10 y_{n−1} − 0.56 y_{n−2} + ε_{*n} by QOBE (dashed line) and MW-QOBE (solid line, K = 1) at times n = 8 and 13. The star (*) represents the "true" parameter and the circles (o) the central estimators (superimposed). The underlying polytopes (exact feasible sets) are also shown.

Table 3.4.
First 30 weights assigned by QOBE and MW-QOBE (K = 1,2) algorithms in the identification of the AR(3) system yn = 0.49yn_1 + 0.61yn_2 + 0.58y,,_3 + am, where the “true” measurement error sequence {5...} is uniformly distributed over the interval (-1, +1). The QOBE and MW—QOBE (K = 1,2) algorithms selected 32, 10, 3 observations, respectively, from a total of 100. QOBE MW-QOBE (K=1) MW-QOBE (K=2) Time)" qn." (111m qn,n—l gum gum—1 ([n,n—2 x1074r x10“‘1 x10-4 x10-4 x10‘ ><10-‘r 5 1.837 - - - _ - 7 6.203 12.002 7.117 - - - 8 - — 12.002 - - - 9 50.115 - - - - - 14 15.073 - - - .. - 15 16.917 62.422 25.913 - - - 16 20.392 19.061 65.508 - - - 17 6.189 21.261 5.443 9.340 19.039 41.481 18 2.025 - 21.261 —— 9.340 19.039 19 — - - - - 9.340 20 0.624 - - - _ - 24 1.798 - - - - _ 25 - 15.458 10.908 - - - 26 - — 15.458 - - _ 28 - 13.325 1.661 - - _ 29 1.546 - 13.325 - - - 30 0.475 - - - - - 32 0.612 - - - - _ 34 0.059 - - - _ - 36 0.034 - - - - - 39 0.357 3.191 4.099 - - _ 40 0.515 1.301 3.764 .. - - 41 - - 1.301 - - - 44 2.757 - - - - - 46 3.933 - - - - - 47 0.685 - - - - - 54 0.009 - - - - - 57 0.146 - - - - - 61 0.219 - - - _ - 62 0.200 - - - - - 42 0.7 0.6 0.5 0.4 LPpmmctarIz 0.2 Ellipsoid volume 10 Figure 3.3. Table 3.4. 0.3 l —— 0035 1 — — - MW-QOBE (K=1) 1 — — - MW-QOBE(K=2) J A A l L J L l 1 0 10 20 30 40 50 60 70 80 90 100 Time (71) (a) Parameter #2. T r T QOBE MW—QOBE (K.-_-1) ‘ MW—QOBE (K=2) ... ............................................................................................... c---‘._____._____.._._._..___q A I 1 L 0 1O 20 30 40 50 60 70 80 90 100 Time (n) (b) Volume. Convergence of the parameter estimator 0,,(2) to the “true” param— eter 9,,,(2) = 0.61 and associated ellipsoid volume in the system identification of y.. = 0.49yn_1 + 0.61yn_2 + 0.5831,.-3 + 5,... by QOBE and MW-QOBE (K=l,2) as in 43 ”'5 I ' T r 1— T v r r J |1 1 ———- QOBE ‘ — - - MW—QOBE (K=1) - - - MW—QOBE (K=2) _. -30 1b 2b 30% f0 :0 610 7b do 90 100 Time (n) (a) A posteriori error, 5",”. ‘ J - ' Y I I 1 T f Y Y I \1 ' , QOBE '. , - - — MW-QOBE K=1) 0.990» U ---—-- MW-QOBE K=2) ‘ ' i 0.996 ~ I 9 - ' I | I 0.994 — " ~ ‘1 \. 1 450.992» " “ ‘\ - l ____________________________ I ......................................... 0.99» _ 0.909 ~ . 0.986 ~ _ 0.9040 10 20 30 40 so so 70 no 90 100 Time (n) (b) Kn. Figure 3.4. 16,, and a posteriori error €n|n associated with the identification of the system of Figure 3.3. 44 Chapter 4 Multiple Weight Set-Membership Weighted RLS 4. 1 Introduction Reducing the ellipsoid volume in OBE algorithms shrinks the set of feasible solutions, and assuming convergence of (2,, to {0.}, asymptotically forces the central estimator to a closer neighborhood of the true parameter vector. This approach is successfully used in several OBE algorithms including the SM-WRLS [13, 14], which pointwise minimizes the ellipsoid volume in the context of an RLS-like framework. In this chapter we deve10p an OBE algorithm named after the SM-WRLS, the MW-SM- WRLS algorithm, which extends the pointwise SM-WRLS minimization of volume to an algorithm that optimizes the volume over several past weights. 4.2 Volume minimization The volume of ellipsoid (2,, is measured by either the trace or determinant of MP" [14]. The SM-WRLS algorithm minimizes the determinant of MP", which is proportional within a constant to the actual ellipsoid volume, in its derivation of optimal weights. Here, we use the same optimization approach, with the added flexibility of K past weights adjustments. 
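As a numerical illustration of the size measure discussed in this chapter (this sketch is not part of the original text; it assumes the standard bounding-ellipsoid construction with the initialization θ = 0, κ = 1, P = (1/μ)I used elsewhere in this work, and all names are illustrative), the following Python/NumPy fragment evaluates det(κ_n P_n), the quantity SM-WRLS minimizes, together with trace(κ_n P_n), for an arbitrary set of weighted observations.

```python
import numpy as np

def ellipsoid_size(X, y, gamma, q, mu=1e-4, kappa0=1.0):
    """Size measures det(kappa*P) and trace(kappa*P) of the bounding ellipsoid.

    X     : (m, N) regressor columns x_t          y : (N,) observations
    gamma : (N,) error bounds                     q : (N,) nonnegative weights
    Standard batch construction (a sketch, not the author's code):
      C     = mu*I + sum_t q_t x_t x_t^T,   P = C^{-1}
      theta = P (sum_t q_t y_t x_t)
      kappa = kappa0 + sum_t q_t (gamma_t^2 - y_t^2) + theta^T C theta
    """
    m = X.shape[0]
    C = mu * np.eye(m) + (X * q) @ X.T
    theta = np.linalg.solve(C, X @ (q * y))
    kappa = kappa0 + q @ (gamma ** 2 - y ** 2) + theta @ C @ theta
    P = np.linalg.inv(C)
    return np.linalg.det(kappa * P), np.trace(kappa * P), theta
```

A multiweight optimizer in the spirit of this chapter could, in principle, minimize the returned determinant numerically over the current and past K weights; the development that follows pursues the same objective analytically.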
Hereafter the term “ellipsoid volume” or “volume” at time n refers to the determinant of the matrix renPn ,. Pn Va (1:! det (KnPn) = Kgndet (Pu) :— Ic’” det( ) det (Pn_,). (4.1) n det (Pn_1) 45 Using (2.18), (4.1) can be written Vn = m—-———— = ” 4.2 “" det (Hn) h, ( l where hn — Erde t(H,, ). We minimize (4.2) over several weights, in a manner similar to that used in [13]. Minimizing the volume3 with respect to the weight matrix, alternatively represented by (see Section 2.2) :fl(H - I) nln— 1 - —nH n[n—1_Gn_]rlt-l : Gnln— IHT— 75111—1? (4'3) is equivalent to minimizing volume with respect to the matrix H n. The derivative of the volume function with respect to matrix H n, BVn _ 07s,, K'm—l 0h” m 6H,. — det(P,,_1)[maHnnn hn—aHnnnJ, (4.4) is minimized by determining a matrix H n that solves = ~31 Jh~1441 <44 09,?“ and det (Pn_1) are both strictly positive numbers.) In finding the derivative of 19,, with respect to Hn, we first write an as a function of Hn, Kn : K'n-l +711; [(Hfl - IlGnln-1]7n _ E:174]n— l [H—1(Hfl _ I):n|n- I] Enln l : M311 1 +7n [I17l G1 nln— l]7n + E:nTlnm l [G11lln—1(HnGn-[n—1)-IGn|n-l]Enln‘*1 = kn_1+7,, [HnG n‘ln_ [[79, +511” l[H,,G,,’I,1,_ 1] le'nln_1 (4.6) ~ (19‘ T —1 T 7! Using the formulae given by Athans [1] (see Lemma A.6), we then find 0h" 6H : “"HJT (4.7) an” _T 6H :: 7n7nGnln-1WHT—qun lenln— lenln— 1H" , (48) 3Minimizing the“volume ratio” as in [13] or “volume” lead to identical optimizations. 46 The volume minimizing equation (4.5) becomes 0 = mn‘ynGn—In— l ”-7711?! T__Gn'n lenln— lsnTln- [HgT—I‘CanT _—_ mHn7n7nGl;_ ,HT —_mG,,,,, 15m- 153.44 ,4an -_- mGIJ,_ ,HT (.mn)Gn,n 1H3," —m(e',.,,,_1§',',‘,,,_,)— nnGn-l; ,HZ (4.9) or, in matrix form, "rm: | “$4111 5.}.-fo 0=[ iii—111T | I] -———l ——————— --—- (4-10) — l 5711 l _mEnlfl- lenln- l I where G"1 n|n_ 1H T is the unknown, symmetric positive-definite, matrix. Remark: When K = O, the solution to (4.9), Hn Rn—l + ’23—] + 4(m2 _ ”7353.1"..1 7 Gnln—l — 2(m _ 1)7721 corresponds to the SM-WRLS weight Gnln—llHn/Gnln—ll - 1 Gnln—l Gnln—lkn-l — 2(m - 1)7r21+ GnIn—l\/ki21-l+ 4(m2 — 1)77216721|n—1 ) — 2(m —' 1)Gn|n—1712; A,.: ~ _ 2 2 ~ _ Where “Zn-1 “ K'n—l_5n|n_1/Gn|n—l_7nln-1/Gn|n-11EnIn—l — Enln—l/Gnln—ly and Where H, and Guin-1 are the scalar versions of H n and Gum—1- 4.3 Optimal solution existence and uniqueness In this section we provide a method for solving (4.9) for the matrix H n, describe a condition for the existence of a solution and show the uniqueness of this solution. Without loss of generality, we assume that the weight adjustment vector A" has all positive elements, as demonstrated in Chapter 5. Notation and definitions needed 47 Table 4.1. Notation used in Chapter 4. Notation Definition 71 x 1 vector u v n x 1 vector with v(i) > 0, for all 2' 6 [1,12] m m E N, with m 2 n Mn+ set of all real symmetric positive definite n x n matrices X X E Mn+ W . W E Mn+ with diagonal elements 211,- 919’ W(2', 2') D diagonal matrix of eigenvalues, (1,, of X [i.e., d; déf D(2, 2)] R orthogonal matrix such that X = RTDR a,- idéf u_(2) [mu(2)v(2) - uTv] for all 2' 6 [1,71] 8 ”63 fl, flid=f—— mu2v(2)+uTv foralli€1,n EU ) [ ( 1 ] l 1 Table 4.2. Function definitions used in Chapter 4. Function Definition b b=c-uTW“lu —vTW'1v wherec> 0 f f : Mn+ —+ 3?. f(Z) 2 12+ uTZu + vTZ‘lv F F: Mn+ —> Mn+ F(Z) = ZuuTZ — f(Z)Z —mva throughout the rest of this chapter given in Tables 4.1 and 4.2. The following lemma is necessary to find a condition for the existence of a valid weight adjustment vector A,. 
(or matrix Gn|n_1H:) which solves (4.9). Lemma 4.1 f(X) —+ 00 and ({etli) —+ 00 as trace (X) —> 00. Proof: We re-write the function f (X) as u _,v ”T f(X) = b+llull2i +llv H2— X _— H" H Xu“ ll “‘0 ll “’0” :2 b+||u||2 uTDu+||v||227TD 1" 48 where i 2: Ru/Hu“ and ii = Rv/Hv”, each having unity norm. Since (1min < fiTD‘ii < dmax and d",1 < 'vTD‘l v < (lgfiln, the function f (X) is bounded by b + ”21“?de + llvllzdmax< 00 implies trace (X) -) 00.) We rewrite f (X) as __ b " ‘ii 2 f(X) = d.[)lu))2u0) 2 +3: +23, 1.5)] so that f'"(X) _ d;-" H " W ‘ "‘ det(x) — I‘zld Hullu J'()+ £+§dd Since the term b " 472(4) [gig-F: didj] ->oo as dj—>oo, and since m > 72 implies that d;" > [1111 d,, we have m X d‘m f ( ) [||u||fi(j)]2m n] —>oo as d]- ->oo, det (X) —> 121 d1 from which we obtain f’"(X) det(X) -—>oo as dj——)oo. Hence, f’"(X) (let (X) —+ 00 as trace (X) —> oo. Theorem 4.1 (Existence) A suflicient condition for existence of a positive weight vector that solves (4.9) at time n is Kn_ (7,3: — enln_,e:In_I) Gn'In_ 1 < ml], (4.11) Proof: The volume (4.1) is a continuous function in the positive hyperquadrant of ER“ +1 (see Chapter 2). From the definition of H n, we have K-H trace( =2 [1+ An( Gnln—l(i1i)]- 2:1 Therefore “A,. H ——+ 00 implies that trace (Hn) —> 00. Let X: Hn n|n— I, b- - kn_1, u :7" and v = Enln-l with 2%,.-1 and 63'an, as in (4.6). By Lemma 4.1 we have m ran—>00 and {ii-+00 as ”Ann—+00, n. hence Vn(A,,) —> 00 as ”An” —+ 00 (see Equation 4.1). We also know that Vn(0) must be greater that Vn(An) if an optimal weight adjustment, An, exits. Therefore, the condition that all partial derivatives be negative at A,. = 0 is sufficient for the existence of a solution in the positive hyperquadrant. By substituting A,. = 0 in (4.7) and (4.8) we obtain Bhn = I 8H" An20 an?! : 77TG-l “—GnIn—lE-nln-IE-T 1 6H" Anzo n nln— l nln- ._ T —l _ 7n7nGn'n_1"5n|n—15n|n_1 n|n—l which, when incorporated into (4.4), produces the matrix 3V" T —1 T _1 6H" Anzo : K'n-l (7n7n0nln—l — Enln—IEnIn-IGnln—l) — I T T —1 “Zn—1 : (7717" — Enln-lenln-l) nln—-1 _ m I (4.12) with trace n — T ‘1 T —1 (K + 1) trace 6H" A4120 _771 Gnln—l7n - Enln-lGnln—IEnIn—l — Kn—l- (4.13) The check for existence of a valid weight is therefore the matrix inequality (4.11) I Remark: From (4.13) we obtain (K+1) [Cn_j ”71Gn-Irlh4‘7n > ET Grii_15n|n_1 > 0 (4.14) nln—l which implies Kn-1 —'7',’:Gn‘I,l,_I'7n > 55n-1Gra-1111-15nln-1 > 0, thus 222,.-1 > 0. 0 We next turn our attention to solving (4.9). At present, no closed form solution for K > 1 has been found, but we present a general approach for solving the problem, which simplifies the quest for solutions. The following lemma represents the principal algebraic result needed to derive the optimal weight adjustment vector A,.. Lemma 4.2 F(D) = O with constraint f(D) > 0 when d,- = 1(b+‘/b2+4a,fi,~). (4.15) 201' Proof: The (i,j) equation in F(D) = 0 when i aé j: m(d,dj)[u(i)u(j)]—m[v(i)v(j)] = 0 . . or did, 2 M. (4.16) The (i,j) equation in F(D) = 0 when i = j: 2 . 0 = de-2u(i)2 - 5dr — d1 Zu2(j)dj " (11': ”(10) — mv(i)2 ,- ,- j = (m —1)d§u(1)2 — bd. — (m +1)v(z')2 - Z (u2(j)d.dj + 122(1)?) J9“ = (m — l)d,~2u(i)2 — bd, -— (m +1)v(i)2 2 - 2 - d? _ I}; (u (fldz-dj +v (”(1:71) (4.17) Substituting (4.16) in (4.17), we obtain 0 : i-(7,n___ 1),“(02 _Zu(i)u(.7)v(j) di2__bdi_ (m+1)v(i)2—Zv(z)v(j_)u(]) _ #1 ”(2) #1 "(‘l = l.(m — l)u(i)2 — 1('(Z)Zu(j)v(j) d-2 —- bd- — (m + l)v(i) z.)}__:u(j)v(j) , ”(z ) J¢i :(z)1#i = [(m —1)u(i)2 —— 5%(uTv -— u(i)v(i))] d? 
—- bd, — (m +1)v(i)2 _v(1)' _ ,- -zu( ‘7‘)(1‘ ‘D — ”(011(0) - mu22 1—‘—(—ZZuT 2- -mvi2+ —(—vi)uTv ‘ (’ v0) T”‘ld “1 l () u0) l _ 2 - "(2') b . _ [mu(i)v(i) + uTv] — d‘ v(i) [mu(i)v(i) — uTv]d' [mu(i)v(i) — uTv] (4'18) .—_ d3 —- £711.. - g3} (4.19) I Equation (4.19) is quadratic in d,- with solutions (4.15). We now revisit the condition for existence of the MW-SM-WRLS optimal solutions through Lemma 4.3 by determining a condition for the existence for the roots of (4.19). Lemma 4.3 A matrix D satisfying F(D) = 0 with constraint f(D) > O and d,- > w,- erists if and only if a,([3.-+bw,-) > (01,111,)2. (4.20) Proof: From d,- > 11), we obtain 1 o (fl.+bw.-) > w? (4.21) 0431+th > (amt)? (422) Remark: When n z 1, we have a :2 212(m — 1), 6 :: v2(m +1), w z l/G and 2 = c — Z‘_ — v20. Equation (4.20) becomes G N ’U. u2v2(m —1)(m +1) > —G_'(m —1)[E§(m - 1) — b] which reduces to msz2 > mu2 - cG. (4.23) We obtain the SM-WRLS innovation check by u = '7, v = Enln-l/Gn and c = nn_,. 0 Theorem 4.2 When the weight adjustment vector A,. exists, the matrix H "GJILI solves (4.9), and A" solves (4.3). Proof: Let X = H“ ’1 b 2 it..-“ u :7" and v = EnIn_., where F6"..1 and Enln-l nln-l’ are as in (4.6). Apply Lemma 4.2 to find HnGn'ILI. I Next we prove that at most one weight adjustment vector A,. (or matrix ”1 lHZ) satisfies (4.9) with constraint (3.19). The following lemma is required. n|n- Lemma 4.4 There exists at most one X e Mn+ satisfying F(X) 2: 0. Proof: Let X1 and X2 6 Mn+ such that F(Xl) = 0 and F(Xz) = 0. Let M be a non-singular matrix such that MXIMT = I and MXgMT : D, where D is a diagonal matrix with positive elements. The existence of such a matrix D is proven in [41]. Let it“ = Mu and '17 = M‘Tv. We then have 0 = mDaaTD - (b + oTDfi + oTD“o)D — mot—F (4.24) 0 = mfi'fiT —(b+ uT u + v Tv)I — min-2T. (4.25) When i 75 j, the (i,j) equations of (4.24) and (4.25) become me’jNUfiWUN-m[V(i)"17(J')J = 0 mlU(i)fi(j)l-mlv(i)’fi(j)l = 0 implying that d,d,- : 1. From the diagonal equations (i,i) of (4.24) and (4.25) we have 0 = m‘u(i)2 - :—((:—))fiTvi d? — bd, — [mv(i)2+ :2—7; fiTiiJ _ , 2 a(i) . 2+ v( ) ._ 0 — mu(i) — ”(i)“ ”T 1)] —b- [mv(z)+ 11(7)"ij thus d,- = 1 for all i. The matrix D is therefore the identity matrix, indicating that X1 must equal X2. Therefore, a solution to F (X) = 0 must be unique. I Theorem 4.3 (Uniqueness) At most one weight adjustment vector An (or G" HZ) satisfies (4.9). nln— 1 Proof: Let X: Gn-ln— 1H1, b 2 kn.“ u 27,, and v = 6",,th with 12"-, and e",,,,,_, as in (4.6). Apply Lemma 4.4. I 4.4 MW-SW—WRLS with K = 1 Optimizing the ellipsoid volume by re-visiting a single past weight is the simplest form of MW—SM-WRLS. It offers significant insight into the behavior of the algo- rithm and presently is the only case with a closed form solution. We solve for the matrix Hn G'—l nln 1 from which we obtain the weight adjustments. In order to simplify 54 notation, the optimal solution to (4.9) when K = l is presented through the following lemma. Lemma 4.5 Let n = 2. The solution to F(X) = 0 with f(X) > 0 is X = RTDR, - ' d 0 with R 2 C081!) sm 1!) , D 2: 1 and X(1, 2) 2 gm, where Slut/J cosw 0 d2 __ <15 Pi 11’ ” 2 4 ¢ : arctan 1 mllvl|(ut - v.2)(u1 + ”2) + 2uTv(u1v1 -— "2112) 2 mllvllulug + uTv(u1v2 + 112111) d,‘ = ii:(b+vb2+4a,fl,~)fori=1,2. Proof: Let —— ' d 0 R = cow 5'” and D : 1 (4.26) Sim/2 c0511) 0 112 where w is a rotation angle. We then obtain 1 912 = ((12 — d1) coswsinw : §(d2 —- d|)sin 2w. 
(4.27) From (4.27) and (1le = 31:2 , we obtain the quadratic equation in (11, 11111.2 2912 5152 0 = 42 d - 4.28 1+ sin2z/J l ‘121'172 ( ) E b : ‘29” * (4 29) 7&1 TIM—£151 — UT'U sin 2d) . — -— -— — T _23‘ "13‘3” u v = 1 (4.30) 122111 mulvl — uTv E b = ————29” (4 31) fig mfigvg -— uTv sin 2112 ' — - — — T ulvg 77111202 "l’ 'u. 'U : 1 (432) Combining (4.29-4.32) we obtain the system of equations -—b mfilfigfil "1" fig (BTU) Z ——-‘D—2 Sill 211} (4.33) 2912 b m'filfigv‘g + fi1(uTv) : —'61 sin 2w (4.34) 2912 which simplifies to TRH’UHZ'l—Ilfig + UTV(E17IJ_2 + 5251) Z 0. (4.35) Incorporating the rotation matrix (4.26) into (4.35), we obtain 0 : [mHvH2u1u2 + uTv(u1v2 + u2v1)] cos 21/1 + 5 [mllvll2(ul - U2)("1+ “2) + 2uTv(u1vt — u2v2)] sin 2d) (4.36) which we solve for the angle #2. Let ¢ : arctan 1 m||v||(ul — u2)(u1 + u2) + 2uTv(u1v1 — U202). (4.37) 2 mHvllulug + uTv(u1v2 + u2v1) The solution w of (4.36) becomes 0 = cos ¢cos 2w + sin ¢sin 21,!) = cos(¢ — 21.0) (4.38) or,w 2 g—g-kn k=O,1,-~. (4.39) Since the addition of the term k7r k =2 0, 1, - ~ . in (4.39) does not change the solution of F(X), we have e H (\DIS~ | A): (4.40) Table 4.3. The MW-SM-WRLS algorithm for K := 1. I. Initialization: 1 1. 9K = 0, KK =1 and PK 2 ——I, a small. It 2. qK = 0. II. Recursion: Forn=K+1,K+2,... n _ G 2 2 n l nln I If EnIn—l > 7n '“" """"_""_' '-'""' m Form Xn matrix from present and chosen past K observation vectors along with corresponding y" and 7,, vectors. 5.n|n--l : yn — Xyrgn-l Mn = Pn-,x,, and 0....-. = Xan - _ -l ~ __ —1 Enln—l ‘— G _1Enln-117n “ Gn|n_17n nln -1 Kn—II If (7n7: _ Enin-IEZ‘In—l) nIn—l > m [78"-] = Kn..1 " EZIn-lé-nln‘l _7IIn—l‘7nln-l Compute the rotation matrix R" from (4.40) and (4.26) 1 Compute the diagonal matrix Dn from (4.15) An : RZDan _ 1511-1, Otherwise, next n. Kn = M,,[A,:1 + Grit-11-1 Pr: = Pn—l "' Kn n 9n = gn—l + KnEnIn—l —1 - T —l ~T - - K" : Kn_1 +713 [Hfl vulva—1],)“: + Enln-l [HnGnlrli—l] Enln—l 4.5 Algorithm To date, a closed form for the MW-SM-WRLS optimal weights hasn’t been found. We offer an outline of the general algorithmic steps on Table 4.4. An algorithm for the case K = 1 is presented in Table 4.3. When K = 1, we first determine an appropriate rotation angle with corresponding eigenvalues for the matrix Han—(7114: then generate the optimal weight vector. 57 Table 4.4. The MW-SM-WRLS algorithm (non-zero past weights). I. Initialization: 1 1. 0K = 0, wk =1 and PK 2 ——I, )1 small. a 2. qK = 0. II. Recursion: Forn=K+l,K+2,... n G 2 2 71-1 nln—l If Enln—l > 7n — _— Form Xn matrix ffgm present and chosen past K observation vectors along with corresponding y" and 7,, vectors. Enln— l " ”yr; XT 91:- 1 Mn =_P,,1X,, andGnln_1=X,,Mn Enln—l_ _ Gnln—lEnln-h 773— - Gum—177! If (7:31“ Enin-IEZIn-l) n-lrll—l > [in—“ll Rum] = K'n-l -' 53,)"- IEInIn—l —7:|n—17-n|n—l Compute optimal G“..(..-1HZ = RZDan, where Dul is diagonal (positive), and R", orthogonal. An HG n—ln 1H: _ n-lrlI—l’ Otherwise, next n. Kn =MnIA"l + 011,1- 11—1 R, : Pn_1— KnMn on : 011— 1+Knenln— l 1 Kn=Kn—1+7n[Hnn|n—1J7n‘l‘5nln 1[Hn n|n 1] 5'71!"—1 Next n. FIX I'V‘fl A 11 “I 4.6 Illustrative examples 4.6.1 Decrease in ellipsoid volume In order to illustrate the effect of increased K in MW-SM—WRLS algorithms, we consider the AR(2) system (3.35). We use the SM-WRLS and MW-SM-WRLS (K = l) to identify this AR(2) system of length 100. Figure 4.1 shows the ellipsoid generated by the two techniques at times 11. = 8 and n = 10 with corresponding polytopes, ellipsoid centers and true parameters. 
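For readers who wish to reproduce this example, the sketch below (Python/NumPy; illustrative only, not the original simulation code) generates data from the AR(2) system (3.35) with uniform noise on (−1, +1), so that γ_n = 1 is a valid error bound, and assembles the two-column observation matrix X_n used when K = 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# AR(2) test system (3.35): y_n = -0.10 y_{n-1} - 0.56 y_{n-2} + e_n,
# with e_n uniform on (-1, +1); 100 samples as in the text (zero initial conditions).
N = 100
e = rng.uniform(-1.0, 1.0, size=N)
y = np.zeros(N)
for n in range(2, N):
    y[n] = -0.10 * y[n - 1] - 0.56 * y[n - 2] + e[n]

# Regressor x_n = [y_{n-1}, y_{n-2}]^T; for K = 1 the observation matrix
# X_n = [x_{n-1}  x_n] is paired with the bound vector gamma_n = [1, 1]^T.
def regressor(y, n):
    return np.array([y[n - 1], y[n - 2]])

K, n = 1, 10
X_n = np.column_stack([regressor(y, t) for t in range(n - K, n + 1)])
gamma_n = np.ones(K + 1)
```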
The MW-SM-WRLS ellipsoids show reduced volumes. Note the significant size differences between the MW-QOBE ellipsoids of Figure 3.2 and the MW-SM-WRLS (K = l) ellipsoids of Figures 4.1. This difference reinforces the remark in section 3.9.1, that MW-QOBE algorithms do not focus on minimizing the ellipsoid volume and therefore may generate large ellipsoids. 4.6.2 Weight assignments In this section, we illustrate the weight assignment mechanism in the MW—SM-WRLS algorithms and their data screening behavior. We consider the AR(3) system (3.36). Table 4.5 shows the first 38 weight assignments. As expected, the SM-WRLS algo- rithms use more observations in the optimization process than the MW-SM-WRLS. At time n = 5 (K = 1), the non-zero weight Q5,5 = A5 = [0.857 x 101,1.437x1]T is assigned to an observation vector previously ignored. Therefore, observation vector 24 became relevant in light of the new observation vector 2:5. We also observe that, between times n=11 and 12 (K = l), the weight applied to observation vector 1:“ decreased from 21.24 x 101 to 15.31 x 10‘. This example illustrates the importance of allowing negative adjustments (within constraints) to past data vectors which may not convey as much information as previously computed in light of a current observation. The MW-SM-WRLS (K = 1) used 45 observation vectors in identifying (3.36), compared the 51 needed by SM-WRLS. Using this reduced number of observations, I the MW-SM-WRLS (K = 1) was able to identify system (3.36) in a comparable amount time to that required by SM-WRLS, as seen in Figure 4.2a. As expected, the MW-SM-WRLS (K = 1) ellipsoid volume is smaller than that of SM-WRLS (Figure 4.2). No particular conclusion is drawn about the effect of increased K on K,, 59 "I . (shown in Figure 4.3b.) The a posteriori error in Figure 4.3a shows its mapping to less than (or equal to) the error bound (when an observation is accepted.) 4.7 Volume minimization, alternative approach In this section we describe an alternative approach for solving for the optimal weight matrix that minimizes volume. —1 Multiplying (4.5) from the right by [gain—J we obtain an” oh, _l 16,, which, by substituting (3.4), becomes 0 = m [r3 — D(H;Te,.,,,_,)2] [\,(H;TG,,,,,_,)]‘l — nnI (4.42) o = m [hfirf - v(finen,n_,)2] [\,(1r?f,.G,,,,,_.)]’l — hnnnI. (4.43) Minimizing the ellipsoid volume in the MW-OBE context is seen to be equivalent to solving for the weight adjustments, roots of either (4.42) or (4.43). We re-write (3.4) 3.5 05,, _ _ 8A 2 r3 —\,(H,Te,,,,,_,ef,,_,H,‘) (4.44) to obtain the volume minimizing equation 0 = mrf — m\,(H;Tenln_le:,,,-1H;l -— n..\, (Gn,n_1H,‘,‘) 0 = \ (H;T (HZI‘an - en,n-,53,‘,,_, + %H30,,n_1) H;‘] . (4.45) Equation (4.45) may be re-written the following way: 0 : \[HgTGn(n-1(Tn)GnIn—1Hr:1i 60 with the symmetric matrix Tn defined by Tn : G"l (HZI‘ZHn _ EnIn—legn—l _ EHZGTIITI‘I) Gn|n-—l m nln—l _ __ R Z F3A121+ nlrli-lrrfAn + Aflrr? nlrlr—l - ERA" _1 2 T Kn -—l 'l' nIn—l (I), ‘* €n|n~15nln-1 _ ‘7‘; n|n—-l) nIn—l' (4.46) We re—write Tn in the following block form: (4.47) where t A, = r: K: _ -1 2 __ n B" — Gnln-lrn 2m _ —1 2 T "n —1 C" — nln—l (Pu - E’lln‘IEnIn—J — 7n- Gnlfl—l) nIn-l’ When A,. and 0,, are positive definite, Tn is positive-semi-definite. Therefore Tn must be zero in order for (4.45) to hold as shown in the following lemma. Lemma 4.6 : Let A and B be symmetric non-singular and positive-semi-definite matrices respectively. Then \, [ABA] = 0 implies that B = 0. 
Proof: By congruence, B positive semi-definite implies that the product ABA is also positive-semi-definite. The trace of ABA, sum of non-negative eigenvalues, being zero implies in turn that all the eigenvalues are zero. In addition, the matrix ABA with non-negative eigenvalues is symmetric therefore it must be zero. However, A is non-singular; therefore B must be zero. I We therefore solve Tn = 0 for a general Kn > 0. Note that 52,, — Kn_1 = 0 gives (4.47) a Ricatti equation form [3]. 61 Definition 4.1 : The operator r‘H (UT) creates a matrix identical to H except that its ith row is replaced by the vector uT. ( This is an adaptation of the notation in [43])- Focusing on [fin and hnnn, the following observations are made: fin(*ij)5n|n—l : detllJBn(E:|n-l)l 12.)..-1(t)A..(j)‘ii.(*,j)e...-. = det(at...(e...-1(j)A..(j)eI..-t)l detHnbiioM.) = det [T’n.(‘73.(j)/\an(jt*))[- (a Since the expression for has" sums these determinants (from similar matrices except for one row), we may combine their determinants (see [41]) as K . hn’cn : hnKn—l + :det [HER (7g(j)An(J)Hn(Ja *) _ Enln—l(j)An(]-)517f|n—l)] - u 1:1 By defining the matrix M n M, “é‘ r3113,a,,,,._, + A,. (r: — en,n-,ej,r,,,_,) (4.48) = An [Ian -— en,n_le;,rl,,_,] , h" Kn becomes K . hnnn = hnnn_1+;det [r’Hn(M,,(j,*))] (4.49) .7 which is converted into the matrix form 14,, — 14"-, = trace(Manl) = trace (H;TM:) (4.50) hnnn — h,,n,,_1 : trace (Mnfif) : trace ()7an (4.51) By expressing M n and J n in block form, [ 6.4-14. M. = [ A,. | I [ A..G.......1 — ~ —— 0....-. (4.52) _ C" + Lid—l Grflrli—lj 62 I and J" = trace (HgTMn) [ An | I] —- . (4.53) G—l nIn—l The initial steps provided in this section for finding the Optimum MW-SM-WRLS weights may provide important information for future research in MW-SM-WRLS algorithms. 63 05'- as, (b) Time it = 10. Figure 4.1. OBE ellipsoids resulting from the system identification of AR(2) system 31.. = —-0.10y,,_1 — 0.5631,...2 + 6,... by SM—WRLS (dashed line) and MW-SM-WRLS (K21, solid line) at times n = 8 and 10. The star (*) represents the “true” parameter and the circles (o) the central estimators (superimposed). The underlying polytopes (exact feasible sets) are also shown. Table 4.5. First 30 weights assigned by SM-WRLS and MW-SM-WRLS (K = l) algorithms in the identification of the AR(3) system yn = 0.49y,._1 + 0.6131,.-2 + 0.5831,.-3 + em, where the “true” measurement error sequence {6",} is uniformly dis- tributed over the interval (—1,+1). The SM-WRLS and MW-SM-WRLS (K = 1) algorithms selected 51 and 45 observations, respectively, from a total of 100. SM-WRLS MW-SM-WRLS (K=1) Time)" gum qn,n Qnm—l x10I x10l x10l x10I 4 9.994 - - 5 14.912 0.857 1.437 6 22.302 10.541 0.857 7 24.355 14.597 10.541 8 21.644 16.438 14.597 9 28.240 20.155 16.438 10 23.885 21.241 20.155 11 24.891 7.621 15.312 12 - - 7.621 14 46.303 25.753 - 15 26.187 17.355 25.753 16 19.828 18.332 17.355 17 12.770 5.007 18.332 18 24.471 4.723 5.007 19 2.471 - 4.723 20 3.639 - - 24 41.686 15.696 8.232 25 43.606 17.851 15.696 26 - - 17.851 27 4.092 6.023 - 28 0.332 - 6.023 29 47.938 14.348 - 30 13.976 7.815 14.348 31 - - 7.815 32 12.784 6.149 - 33 - - 6.149 34 12.027 3.715 - 35 0.607 - 3.715 39 32.757 8.178 - 40 22.735 6.752 8.178 '. 1"" 65 8 I fl 7 1’ Y I T T r —— SMLwan '1 - - - MW-SM-WRLS (K=1) I 6... I q ‘1 '1 '1 '1 4- '1 ~ ’1 a " _ ‘1 ‘3 'I 52.. I ‘ .. 8 1 1 '3: 1 o —4 -2- .. “40 1b 2b so 40 so so 70 so 90 100 Time(n) (a) Parameter #2. 102 " ‘T .: f. 
[Figure 4.2(b): ellipsoid volume versus time n for SM-WRLS and MW-SM-WRLS (K = 1); plotted curves omitted.]

Figure 4.2. Convergence of the parameter estimator θ_n(2) to the "true" parameter θ_*(2) = 0.61 and associated ellipsoid volume in the system identification of y_n = 0.49 y_{n−1} + 0.61 y_{n−2} + 0.58 y_{n−3} + ε_{*n} by SM-WRLS and MW-SM-WRLS (K = 1) as in Table 4.5.

[Figure 4.3: panels (a) a posteriori error ε_{n|n} and (b) κ_n, for SM-WRLS and MW-SM-WRLS (K = 1); plotted curves omitted.]

Figure 4.3. κ_n and a posteriori error ε_{n|n} associated with the identification of the system of Figure 4.2.

Chapter 5

Alternate MW-OBE Algorithms

5.1 Introduction

The general MW-OBE algorithm introduced in Chapter 2 does not place any constraints on the selection of observations targeted for weight adjustments (corresponding to distinct observation vectors), nor on how many times a weight may be adjusted. This algorithm was presented in the context of weight adjustments acting on past observation vectors because this is how the algorithm was originally conceived and developed. In fact, MW-OBE algorithms can be structured to target any set of observation vectors for weight adjustments, as long as the cumulative weight on each observation vector remains positive. In this chapter we identify and study two broad classes of MW-OBE algorithms. First, we make the distinction between the "forward-looking" MW-OBE algorithms, which act solely on "future" observation vectors, based on a "current" covariance matrix, and the "backward-looking" MW-OBE algorithms, which make adjustments only to past observation vectors. It is the latter category that has thus far been the subject of this work. We establish a duality between the two classes of MW-OBE algorithms and use this duality to simplify the proof of Theorem 3.5. Secondly, we focus on MW-OBE algorithms that only modify non-zero weights associated with successive overlapping blocks of observation vectors (with an overlap of exactly one observation vector) and generate a simplified MW-OBE algorithm.

5.2 Forward-looking and backward-looking MW-OBE

MW-OBE algorithms may be constructed as "forward-looking" or "backward-looking." At time n, backward-looking MW-OBE algorithms adjust K selected previous weights (in addition to the current weight), based on the current observation matrix X_n and the covariance matrix of time n − 1. The backward-looking algorithm, although not labeled as such when presented in Chapter 2, is restricted to the modification of past weights.

The forward-looking algorithm acts on the "same" observation matrix X_n, but considers it to represent the "future" based on a "current" covariance matrix at time n − K − 1. The relationship between the forward-looking and backward-looking algorithms (see Figure 5.1) has some similarities with the relationship introduced by Deller et al. in [15] concerning the weighting strategy in conventional OBE algorithms. The labels "forward-looking" and "backward-looking" refer to the time relationship between the covariance matrix and the "current" observation matrix.
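The duality just described can be checked numerically. The sketch below (Python/NumPy; illustrative names, not from the original text) verifies the covariance decomposition (5.1) derived after Figure 5.1: adding the adjustments Λ_n to the current covariance C_{n−1} (the backward view) and applying the composite weights Q_{n,n} to X_n on top of the covariance of time n − K − 1 (the forward view) produce the same matrix C_n, hence the same ellipsoid.

```python
import numpy as np

rng = np.random.default_rng(1)
m, K, mu = 3, 2, 1e-4

# Past regressors x_t with accumulated weights q_t; the last K+1 columns are
# the window n-K, ..., n that the MW-OBE update revisits.
X_all = rng.standard_normal((m, 20))
q_all = rng.uniform(0.0, 0.1, size=20)
q_all[-1] = 0.0                                # the newest observation is not yet weighted
X_n = X_all[:, -(K + 1):]
q_win = q_all[-(K + 1):]                       # accumulated weights q_{n,n-1} on the window

C_prev = mu * np.eye(m) + (X_all * q_all) @ X_all.T    # C_{n-1}
C_tilde = C_prev - (X_n * q_win) @ X_n.T               # covariance at time n-K-1, cf. (5.1)

lam = rng.uniform(-0.01, 0.05, size=K + 1)             # weight adjustments Lambda_n
C_backward = C_prev + (X_n * lam) @ X_n.T              # backward-looking update
C_forward = C_tilde + (X_n * (q_win + lam)) @ X_n.T    # forward-looking update

print(np.allclose(C_backward, C_forward))              # True: identical composite weights
```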
(1-x-t Ct_. 1\ /1 [Weight adjustments An] n — K — 1 [Observation matrix Xn] n — 1 Figure 5.1. Forward-looking and backward-looking MW—OBE algorithms’ duality. We now develop the different recursions associated with these two Mw-OBE algo- rithms. We shall see that these two approaches lead to formally different algorithms, but generate exactly the same sequence of permanent (composite) weights {071,71}- Recursions (2.5), (2.8) and (2.9) are expressed as functions of A,., but can be re-written as functions of QM, = A,. + Qn_,‘,,, where QM”, is a matrix of known past weights at time n. This is achieved by writing the covariance matrix CM] in 69 the form (2.2), Cn-l : é’lK + XnQnm-erT? (51) d r - d r . . . . . where nK :51 n -— K - 1 and CM é 221%,,wa IS a posxtive definite matrix“. Applying the matrix inversion lemma to (5.1), we obtain .. - - -1 .. P._I = p... — P... X. [Q;},_, + 0...] xfpnk (5.2) " --l " " . . . . . . where PM, (ii—f Cm, and 0,,an dr-e—f XanKXn (posntive definite Similar to 0,4,.-. in Section 2.3.2). Pre— and post-multiplying the left and right sides of (5.2) by X; and K. respectively yields GnIn-l : £3"an - G~nan [Qn—JI‘“! + énan]-l énInK '2 (3,1an (I "‘ [anrla—l + énlnxl-l CThink) = énlnx [01:711—1 + énanl-l "ZIP—1 —l -—- [G... +0.,....]". (5.3) . . - —1 . . an important relation between GM..-) and GM” which Will be needed later. The recursion for the inverse covariance matrix PM.“ is found similarly to (5.2), - .. - -1 - PM,“ = P... — PnKXn [on-,1, + G...,.] X,TP,,K. (5.4) The central parameter estimator 9,,K and the scalar an related to the covariance matrix (5.1) become énK+l : énK + PnK+anQy.,nE-nlriK (55) - _ .. _ - -1 - KnK-H : K’nK +7Z‘Qnm7n — €711an [ 71,711 + Gnan] Enan, (56) where Enlnx =2 3],, — XIO-nk. Recursions (5.4), (5.5) and (5.6) form the forward MW-OBE algorithm. These are the counterparts to (2.5), (2.8) and (2.9).in the 4Note the significant difference between Cm, and CM. 70 Table 5.1. The forward MW-OBE algorithm. I. Initialization: 1. o. = o, it. =1 and 15. :51,” small. 2. q0 = 0. II. Recursion: Fort: 1,2,... n = l + K Form X. = X11. K matrix from future K + 1 observation vectors along with corresponding 3],, and 7,, vectors. 5am =_ y. - XI 9'15 R. = PMX. and 0..., = X312. If the current and past K observations are innovative, determine optimal weight vector, (1,”, (optimization criterion dependent). Otherwise, next n; K. = 12.14.? + G....1-* an+l : Pug - KnRZ‘ aux-H z aux + [(71511an _ link.” = max +7Z‘Qflm7n —- éran[Q,,',l, + Gnan]“é'n[,.K (when necessary) Next 12. backward algorithm (Section 2.5). The forward MW-OBE algorithm is summarized in Table 5.1. At a given time n, the forward and backward forms of MW—OBE algorithms gen- erate an identical weight adjustment vector A,.,n. The forward-looking algorithm gen- erates the weights QM, as opposed to weight adjustments A,. generated by backward- looking MW-OBE. The quantities are related by Gym 2 qn,,,_, + A,., where q,,,,_l is known. Thus, the composite weight vector QM, applied to the observation matrix X. at time n is the same whether generated by the MW-OBE forward or backward algorithm. The forward and backward forms of MW-OBE algorithms are therefore equivalent. The following theorem, broadens the idea of MW-OBE algorithm equivalence by showing that any MW-OBE optimization (in particular MW-QOBE or MW-SM— WRLS developed in this research) may be implemented using positive weights. Theorem 5.1 Let C" 2 CM] + XnAnXI. 
There exists a covariance matrix 6..-, and a weight matrix An with positive elements such that 0,. = 6..-] + XnAnXI. 71 Proof: Let 8.. = {i., - - - , i K...} represent the indices of observation vectors included in X... We write C. as 011 : Cn—l + XnAnX: = Z (Inn-I‘IBt-“EtT + Z: (Its—13.x? + X..A.X,T 1415. £65.. where G. = Z q.,..x.x,r, a symmetric positive definite matrix, and A,. = tes. Z q...a:.x, + X. A,. XT, with A,. containing all positive weights (by definition). tes. I Note that q”.-. = qt," for t i? 8... These equivalent MW-OBE recursions are helpful in simplifying the analysis of MW-QOBE weight adjustments employed in the “backward” class. By inserting (5.3) into (3.11), the weight adjustment vector becomes A,. = 1),—15.. G"1 n..,_ .(En)n— 1 51:77.) 2 1),—IS. [Gum + QM.-.] (Enln—l — 5:17..) = I‘,‘,“S.. G ,.,:,K(en)n 1- 3.7..) +I‘JISnQnm—tenln—l " I‘;‘S.Q,,,_.S.7. = r; 1s. G" ,.,‘,K [(1 + G....Q.,.._.) 5....-. — 5.7.] - PJ‘Q...-17. -_- 1,..15 a ,,,, [(1 + a....o,,_.) 5....-. - 3.7.] - 4..-. (5.8) Substituting (5.8) into (3.19) we obtain the inequality 5. G .... [(I + G....o.,.-.) a... —— 3.7.] > 0. (5.9) It can be shown that 5..., = (I + é...,Q,,_,) e..._. (5.10) and therefore (3.19) and (5.9) are equivalent to 5.6," . (5....-. — 3.7,) > 0. (5.11) ilnh 72 Inequality (5.11) is used in the existence and uniqueness proofs of Section 3.3. 5.3 One-step MW-OBE At each iteration, a MW-OBE algorithm adjusts a set of K weights. These weights may correspond to successive data vectors in time, past non-zero weights, or any other criterion. When the indices of the weights targeted for revision at time n are the same those at time n — 1 with the n — Kth index discarded and a new index appended, we shall refer to this weight as “shifted” by one (from the weight at time n - 1.) The indices of the revised (scalar) weights at time n - 1 1.1—1 = {in-1,1.in—12.‘ ' ‘ vin-1,K+lain-I,K+l} are “shifted” to In = {In—1,2,Zn-1,3,"°,2n—1,K+1,3n,K+1}- When an algorithm shifts indices at each iteration, it computes the last adjustment for the weight indexed I..(1). Therefore the An,I..(1) becomes the final additive adjustment to the weight at index 1'..(1). This weight remains permanent but the other K weights will be recomputed. Therefore, the algorithm may be simplified by updating the covariance matrix based on the single permanent weight and discarding the others (since they will be recomputed in future iterations.) The update of the covariance takes on the form of a conventional OBE algorithm. We therefore use the much simpler OBE recursions (QOBE or SM-WRLS algo- rithms) [14], in conjunction with an MW-OBE weight strategy. Such a MW-OBE algorithm is shown Table 5.2. The corresponding MW—SM-WRLS and MW-QOBE algorithms are identical to the SM—WRLS and QOBE algorithms [15, 39] aside from the weight computations. We refer to this MW-OBE algorithm as “one-step” (re- stricted to the overlap described earlier.) 73 I] :- Table 5.2. The one-step MW-OBE algorithm (overlapping weight adjustment blocks applied to non-zero past weights.) I. Initialization: l. 60 = 0, k0 = land B0 2&1, a small. 2. q0 = 0. II. Recursion: Forn=K+1,K+2,... l = n — K Form X. matrix from present and chosen past K observation vectors along with corresponding y, and 7,, vectors. 5,,an z-yn — X59715 R. = P.,X. and 0..., = X,’{‘R. If the current and and (other) chosen K observations are innovative, determine optimal scalar weight, q... (optimization criterion dependent). Otherwise, next n. 
PnK-H : PnK - QnK+l,nK+l/(1+ QnK+_1,nK+IGnK+1)PnK$nK+1$ZK+1PnK anx+l = 071K +(QnK+l,nK+15nK+1|nK)Pn—lznx+l flux“ = Knx+qu+l,nK+17121K+l "(QnK+1,nK+1€nK+1|nK)/(1+QnK+1,nK+10nK+1) (when necessary) Next 12. 74 5.4 Proof of Theorem 3.5 In Chapter 3, Theorem 3.5 states that the sequence of ellipsoid generated by MW- OBE algorithms is decreasing. Proving that det (nnPn) _<_ det(rt.._1P.._1) is circum- vented by the following proof. Proof of Theorem 3.5: The backward MW-OBE algorithm described in Chapter 2 is equivalent to a one-step MW-OBE algorithm and therefore generates a single single- positive-weight per recursions. We apply results from the single weight case [11] to prove the theorem. Alternatively, we may apply (2.18) with K = 0 and a single positive weight to derive the same result. I 5.5 Example contrasting forward-looking, back- ward looking, and one-step MW-OBE algo- rithms The equivalence of the forward and backward MW-QOBE algorithm is demonstrated in Table 5.5. The AR(2) system, y.. = —0.10y.-. — 0.563.... + 4..., where e... is uniformly distributed in (—1,1) is identified by forward-looking and backward-looking MW-QOBE algorithms. As expected, the weights assigned are equal at each time in both algorithms. The same system is also identified by the one- step MW-QOBE algorithm of Table 5.2 and compared with the backward MW-QOBE algorithm. Table 5.5 shows that the sequence (q...) is the same in the backward, forward, and one-step MW-QOBE algorithm. \I CI] Table 5.3. Equivalence of the forward and backward MW-QOBE algorithms, and weight assignments in the one-step MW-QOBE. First eight weights assigned in the identification of the AR(2) system y.. = —0.10y.._1 —— 0.56y.._2 +5..., where the “true” measurement error sequence {5...} is uniformly distributed and bounded by 1. MW- QOBE (backward), MW—QOBE (forward) and one-step MW-QOBE used the same number of points. fl MW-QOBE (backward) [[ MW-QOBE (forward) [[ MW-QOBE (one—step) ‘ Time,n H q“, (1) ] q..,..(2) [I q... (1) q.,.(2) H q...(1) 9....(2) x10'4 x10‘4 x10“ x10‘4 x10" x10"4 12 15.671 10.908 15.671 10.908 - 10.908 13 18.316 11.931 18.316 11.931 — 11.931 14 20.143 23.546 20.143 23.546 - 23.546 15 - 20.143 — 20.143 - 20.143 20 0.246 0.253 0.246 0.253 - 0.253 21 - 0.246 - 0.246 - 0.246 40 - — — - - - 100 0.002 0.025 0.002 0.025 - 0.025 76 6.1 In this throngh algoritl.1 MW-QQ I Wgencra thnta. WNMQ ll] ”9310: 6.2 6 [)9 fifSt llf titular 14' The 53' .t...,,5ml H5 f “1 Par. 9. ' + ITO-m 31)-‘40.“. F9' 31111 AS a . Chapter 6 Practical enefits of MW-OBE lgorlthms 6. 1 Introduction In this chapter, we demonstrate the practical benefits of MW-QOBE algorithms through simulations. QOBE and SM-WRLS are briefly compared to LMS and RLS algorithms (for reference) and subsequently to their multiple weight upgrade versions, MW-QOBE and MW-SM-WRLS. The performance benchmarks are parameter con- vergence, tracking, and data usage. By “data usage,” we refer to the number (or percentage) of observations used by the algorithm to update its estimator. Whenever possible, the simulations are generated using observation data similar to that found in previous work ([13, 14, 34]) in order to facilitate comparison. 6.2 QOBE, SM—WRLS, RLS and LMS We first briefly compare QOBE and SM-WRLS algorithms to each other, and to the popular identification algorithms, LMS and RLS, in order to establish benchmarks. The SM-WRLS algorithm Optimization minimizes the volume of (2., without di- rect consideration of the “distance” between the “true” parameter vector 9. 
The SM-WRLS optimization minimizes the volume of Ω_n without direct consideration of the "distance" between the "true" parameter vector θ_* and the central parameter vector estimator θ_n. Under proper data conditions (see Chapter 1), θ_n → θ_* asymptotically as the set Ω_n converges to a point [15]. QOBE algorithms, however, reduce (but do not minimize) both the "distance" between θ_n and θ_* and the volume. As a result, the excellent parameter convergence of QOBE algorithms is often accompanied by large ellipsoid volumes. The large terminal volume in QOBE arises largely because the algorithm becomes extremely selective of data as its estimator approaches the true parameter vector [12]. Therefore, the QOBE feasible solution set Ω_n tends to be larger than its SM-WRLS counterpart.

Both QOBE and SM-WRLS exhibit performance superior to LMS and RLS in parameter convergence and adaptation when the error bound sequence is known (or closely approximated). This is partly due to the discriminatory selection of data points by QOBE and SM-WRLS, which does not exist in RLS and LMS. The choice of pertinent observation vectors in QOBE and SM-WRLS also leads to faster convergence to the "true" parameter vector, increases robustness to additive noise, and improves performance in colored noise [15].

We illustrate performance differences among QOBE, SM-WRLS, LMS and RLS through the following simulation. The AR(2) system

y_n = -0.104 y_{n-1} + 0.562 y_{n-2} + ε_{*n},    (6.1)

where ε_{*n} is uniformly distributed over (-1,1), is identified by QOBE, SM-WRLS, LMS and RLS. Figure 6.1 shows the superior performance of QOBE and SM-WRLS over RLS and LMS in parameter convergence. QOBE and SM-WRLS use small percentages of the available data (approximately 2 and 10 percent, respectively). In Figure 6.2, we illustrate important differences between the QOBE and SM-WRLS algorithms. SM-WRLS generates a much smaller ellipsoid with a less "stable" convergence, whereas QOBE retains a large volume, uses fewer points, and exhibits fast parameter estimator convergence to the "true" parameter.

Figure 6.1. Parameter θ_*(2) = 0.562 convergence of QOBE and SM-WRLS vs. that of LMS and RLS in the identification of the AR(2) system y_n = -0.104 y_{n-1} + 0.562 y_{n-2} + ε_{*n}, where ε_{*n} is uniformly distributed over (-1,1). The QOBE and SM-WRLS algorithms used 22 and 121 points, respectively, from a total of 1200 points. (a) QOBE vs. LMS and RLS. (b) SM-WRLS vs. LMS and RLS.

Figure 6.2. Parameter θ_*(2) = 0.562 convergence and volumes of the QOBE and SM-WRLS algorithms in the identification of the AR(2) system y_n = -0.104 y_{n-1} + 0.562 y_{n-2} + ε_{*n}, where ε_{*n} is uniformly distributed over (-1,1). The QOBE and SM-WRLS algorithms used 22 and 121 points, respectively, from a total of 1200 points. (a) Parameter convergence. (b) Volume.
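A minimal MATLAB sketch of the data generation used in this comparison is given below; it simulates the AR(2) system (6.1) with ε_{*n} uniform on (-1,1). The zero initialization is an illustrative choice, not taken from the original experiments.

% Generate data from the AR(2) system (6.1):
% y(n) = -0.104 y(n-1) + 0.562 y(n-2) + e(n)
N = 1200;                   % sequence length, as in Figure 6.1
e = 2*rand(N,1) - 1;        % "true" error, uniform on (-1,1)
y = zeros(N,1);
for n = 3:N
  y(n) = -0.104*y(n-1) + 0.562*y(n-2) + e(n);
end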
6.3 Signal cataloging

In this section, we catalog the signals used for the simulations. This cataloging is similar to Lin's [34] in order to facilitate comparison with the results of previous work (e.g., [14, 34, 35]). The systems to be identified are described by a parameter vector (or a sequence of parameter vectors in the time-varying case) (Table 6.1) and an input noise type (Table 6.2). The systems are cataloged in Table 6.3.

SYSTEM 1 through SYSTEM 8 have time-invariant parameter vectors with various types of input noise. SYSTEM 9 through SYSTEM 12 have time-varying parameter vectors; SYSTEM 9 and SYSTEM 10 have gradual variations in the parameter vector θ_{*n} (sine wave), while SYSTEM 11 and SYSTEM 12 have abruptly changing parameters. The excitation noises used are Bernoulli, uniform and colored, with zero and non-zero means.

6.4 Parameter convergence and volume

In this section, we show by simulation that increasing the number of weights revisited improves parameter convergence speed and further reduces the ellipsoid volume. We consider the case K = 1 and adjust a single past weight in addition to computing the current data vector weight. We choose to adjust non-zero past weights due to their obvious past relevance (although this choice does not guarantee current new information). At each iteration, we attempt to adjust one previous weight in addition to computing a present weight, but fall back on the single-weight algorithm (SM-WRLS or QOBE) when this attempt fails. Recall that K = 0 corresponds to "conventional" QOBE or SM-WRLS.

This first series of simulations compares the benefits of an extra weight adjustment in the identification of SYSTEM 1 through SYSTEM 8 in terms of parameter convergence speed and volume reduction. We consider the "true" parameter vectors unknown and identify them using MW-OBE algorithms. SYSTEM 1 and SYSTEM 3 have Bernoulli distributed noise sequences. The relationship between κ_n minimization and the volume of the hyperellipsoid is not well-defined. It can be inferred from recent work, however, that κ_n minimization will reduce the volume at each update [14, 22]. This assertion is verified through the following simulations.

Table 6.1. Parameter vectors for the AR systems used in the simulations of Chapter 6, both time-invariant (TI) and time-varying. The operator '\' denotes the division remainder.

Label  Order  "True parameter vector" θ_{*n}                                      Type
A      2      [-0.1, 0.56]^T                                                      TI
B      2      θ_{*n}(1) = 1.6 cos(2πn/600); θ_{*n}(2) = -0.68                     sine wave (slow)
C      2      θ_{*n}(1) = 1.6 cos(2πn/300); θ_{*n}(2) = -0.68                     sine wave (fast)
D      2      θ_{*n}(1) = 1.6 if (n \ 600) + 300 < 600, -1.6 otherwise;           square wave (slow)
              θ_{*n}(2) = -0.68
E      2      θ_{*n}(1) = 1.6 if (n \ 300) + 150 < 300, -1.6 otherwise;           square wave (fast)
              θ_{*n}(2) = -0.68
F      3      [2, -1.48, 0.34]^T                                                  TI
G      12     [0.1, 0.9175, -0.191, -0.2253, 0.2601, 0.0046, -0.00367, -0.0209,   TI
              -0.0082, 0.0095, -0.0052, -0.0041]^T

Table 6.2. Excitation noises for the AR systems used in the simulations of Chapter 6.

Label  Distribution                                        "True error" ε_{*n}                           Mean
A      Bernoulli, B(-1, 1)                                 1 with probability 0.7; -1 otherwise          zero
B      Bernoulli, B(-1, 0.5)                               1 with probability 0.7; -0.5 otherwise        non-zero
C      Uniform, U(-1, 1)                                                                                  zero
D      Colored: z_n ~ U(-1,1), w_n = -0.8 w_{n-1} + z_n    1 if w_n > -1; -1 otherwise                   non-zero
E      Colored: z_n ~ U(-1,1), w_n = -0.8 w_{n-1} + z_n    1 if w_n > -0.5; -0.5 otherwise               non-zero
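The excitation sequences of Table 6.2 are simple to synthesize. The MATLAB sketch below generates examples of the Bernoulli, uniform and colored types; the probabilities and thresholds follow the table as reconstructed above, and the sequence length is arbitrary.

N = 1000;
u  = rand(N,1);
eA = 2*(u < 0.7) - 1;                   % type A: 1 w.p. 0.7, -1 otherwise
eB = 1*(u < 0.7) - 0.5*(u >= 0.7);      % type B: 1 w.p. 0.7, -0.5 otherwise
eC = 2*rand(N,1) - 1;                   % type C: uniform on (-1,1)
z  = 2*rand(N,1) - 1;  w = zeros(N,1);  % types D/E: colored driving process
for n = 2:N, w(n) = -0.8*w(n-1) + z(n); end
eE = 1*(w > -0.5) - 0.5*(w <= -0.5);    % type E limiter (type D is analogous)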
Table 6.3. Matrix of systems used in the simulations of Chapter 6. See Tables 6.1 and 6.2. (Rows are the excitation-noise labels A-E, columns are the parameter-vector labels, and the entries are the SYSTEM numbers 1-12.)

6.4.1 MW-QOBE vs. QOBE

We compare the MW-QOBE (K = 1) and QOBE algorithms in terms of parameter convergence and ellipsoid volume reduction. We assume the true error magnitude to be bounded by γ_n = 1.1, when it is actually bounded by unity. In general, we observe a decrease in the number of updates and an improvement in parameter convergence when adjusting a single past weight.

Figure 6.3 shows the parameter convergence of SYSTEM 1, an AR(3) system driven by a non-zero-mean Bernoulli distributed sequence. Both QOBE and MW-QOBE perform equally well. This is due to the frequent visitation of the bound. Considering an extra weight slightly decreases the number of points taken as well as the ellipsoid volume.

The disturbance in SYSTEM 2 is uniformly distributed over (-1,1); therefore the MW-QOBE parameter estimator θ_n is not expected to converge as fast as in SYSTEM 1 (where ε_{*n} visited the bound more frequently). We notice an improvement in MW-QOBE performance when increasing K from zero to one, in both parameter tracking and volume reduction (Figure 6.4). The volumes remain large in all the QOBE experiments due to the algorithm's tendency to stop taking points (or very stringent requirements on accepting incoming data points) once the central estimator approaches the true parameter vector.

The disturbance in SYSTEM 3 is Bernoulli distributed as in SYSTEM 1 but has asymmetric error bounds (see the excitation noises labeled A and B in Table 6.2). The resulting overestimated bound (30% of the time) decreased the performance of SYSTEM 3 (as opposed to SYSTEM 1). As in the previous case, the volume is further decreased by adjusting one past weight (Figure 6.5). In this experiment we notice a significant reduction in the number of points taken.

SYSTEM 4 shows improved performance over SYSTEM 1 because of the smaller model order. The incremental improvements are minimal with an increased number of weight revisions (Figure 6.6). This improvement may be due to the equality between the number of weights revisited and the model order.

In SYSTEM 5, the noise sequence is a filtered AR(1) signal. This is usually a difficult case of the type that causes RLS to converge to a biased estimator. MW-QOBE's performance stands out in this experiment, with fast convergence, smaller volume, and significantly fewer points selected (Figure 6.7). This improvement may be due to the more stringent criteria for data acceptance in MW-QOBE, resulting in the selection of points with a higher energy concentration in the modes of the system being identified.

In SYSTEM 6, the AR(3) system of SYSTEM 5 is replaced by an AR(12) system. The result of the identification of θ_*(2) in this higher-order system by QOBE and MW-QOBE is shown in Figure 6.8. The MW-QOBE algorithm shows a modest improvement over the QOBE algorithm.

In SYSTEM 7, the mean of SYSTEM 5 is moved closer to zero and therefore induces some improvement in its identification, as seen in Figure 6.9.

SYSTEM 8 has the same noise sequence as SYSTEM 5, but with a smaller system order, AR(2), and therefore enjoys faster convergence and smaller volume (Figure 6.10).
In summary, the adjustment of a single past weight improves the convergence speed of the QOBE (MW-QOBE with K = 0) algorithm and further reduces the ellipsoid volume.

Figure 6.3. Parameter θ_{*n}(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 1 by the MW-QOBE and QOBE algorithms, which used 94 and 108 points, respectively, from a total of 200 points. (a) Parameter convergence. (b) Volume.

Figure 6.4. Parameter θ_{*n}(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 2 by the MW-QOBE and QOBE algorithms, which used 75 and 87 points, respectively, from a total of 700 points. (a) Parameter convergence. (b) Volume.

Figure 6.5. Parameter θ_{*n}(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 3 by the MW-QOBE and QOBE algorithms, which used 100 and 82 points, respectively, from a total of 700 points. (a) Parameter convergence. (b) Volume.

Figure 6.6. Parameter θ_{*n}(1) = -0.10 convergence and volumes resulting from the identification of SYSTEM 4 by the MW-QOBE and QOBE algorithms, which used 11 and 42 points, respectively, from a total of 100 points. (a) Parameter convergence. (b) Volume.
Figure 6.7. Parameter θ_{*n}(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 5 by the MW-QOBE and QOBE algorithms, which used 117 and 116 points, respectively, from a total of 1500 points. (a) Parameter convergence. (b) Volume.

Figure 6.8. Parameter θ_{*n}(2) = 0.92 convergence and volumes resulting from the identification of SYSTEM 6 by the MW-QOBE and QOBE algorithms, which used 174 and 166 points, respectively, from a total of 3000 points. (a) Parameter convergence. (b) Volume.

Figure 6.9. Parameter θ_{*n}(1) = 2.00 convergence and volumes resulting from the identification of SYSTEM 7 by the MW-QOBE and QOBE algorithms, which used 214 and 203 points, respectively, from a total of 1500 points. (a) Parameter convergence. (b) Volume.

Figure 6.10. Parameter θ_{*n}(1) = -0.10 convergence and volumes resulting from the identification of SYSTEM 8 by the MW-QOBE and QOBE algorithms, which used 9 and 43 points, respectively, from a total of 500 points. (a) Parameter convergence. (b) Volume.

6.5 Tracking performance

In this section we show the tracking benefits of increased K in MW-OBE algorithms.

6.5.1 MW-QOBE vs. QOBE

While based in the "OBE philosophy," QOBE algorithms may be interpreted as having an update rule that is essentially independent of the current hyperellipsoid [22, 39]. This result is counter-intuitive because the optimization criterion is to minimize the parameter κ_n, a scalar that is directly related to the ellipsoid and its size. Nevertheless, the alternative interpretation reveals why QOBE is strongly robust to parameter variation and highly proficient at adaptive estimation. In the present work, we use faster systems than those in [14, 34] in order to highlight both the superior tracking performance of the QOBE algorithm and the incremental benefits of higher-order weight adjustments in MW-QOBE.

SYSTEM 9 represents a time-varying system in which one of the parameters changes gradually from -1.6 to 1.6 every 400 points. The noise sequence is uniformly distributed in (-1,1). We observe good tracking by both MW-QOBE and QOBE (Figure 6.11).

In SYSTEM 10 the rate of change of parameter θ_{*n}(1) is increased compared with that of SYSTEM 9. The noise sequence is the same as in SYSTEM 9, but parameter θ_{*n}(1) is made to change twice as fast. Nevertheless, the parameter estimators of both QOBE and MW-QOBE stay on track most of the time (Figure 6.12). MW-QOBE uses fewer points to achieve the same (or better) tracking performance. The final ellipsoid (volume) is very small due to the relatively large number of points taken by MW-QOBE.
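As an illustration, a minimal MATLAB sketch of a time-varying AR(2) simulation of this kind follows. The parameter θ_{*n}(1) sweeps gradually between -1.6 and 1.6 every 400 points, as described for SYSTEM 9, with θ_{*n}(2) = -0.68 from Table 6.1; the cosine waveform and the 700-point length are illustrative stand-ins rather than the exact settings of the original runs.

N   = 700;
th1 = 1.6*cos(pi*(0:N-1)'/400);     % gradual variation between 1.6 and -1.6 every 400 points
th2 = -0.68*ones(N,1);
e   = 2*rand(N,1) - 1;              % uniform noise on (-1,1)
y   = zeros(N,1);
for n = 3:N
  y(n) = th1(n)*y(n-1) + th2(n)*y(n-2) + e(n);   % time-varying AR(2) output
end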
SYSTEM 11 represents an abruptly changing system, with parameter θ_{*n}(1) switching in one sample from -1.6 to 1.6 (see Tables 6.1-6.3). The tracking performance can be observed at the switching times. Again, the benefit of an additional weight in the optimization process is evident (Figure 6.13).

As in SYSTEM 11, SYSTEM 12 represents a fast-switching system with abrupt changes in the parameter vector. The superior tracking of MW-QOBE is shown in Figure 6.14.

The simulations presented in this section show that MW-QOBE (K = 1) has tracking performance at least as good as QOBE while using fewer observations.

Figure 6.11. Parameter θ_{*n}(1) tracking and volumes resulting from the identification of SYSTEM 9 by the MW-QOBE and QOBE algorithms, which used 109 and 110 points, respectively, from a total of 700 points. (a) Parameter convergence. (b) Volume.

Figure 6.12. Parameter θ_{*n}(1) tracking and volumes resulting from the identification of SYSTEM 10 by the MW-QOBE and QOBE algorithms, which used 70 and 72 points, respectively, from a total of 350 points. (a) Parameter convergence. (b) Volume.

Figure 6.13. Parameter θ_{*n}(1) tracking and volumes resulting from the identification of SYSTEM 11 by the MW-QOBE and QOBE algorithms, which used 52 and 95 points, respectively, from a total of 700 points. (a) Parameter convergence. (b) Volume.
Figure 6.14. Parameter θ_{*n}(1) tracking and volumes resulting from the identification of SYSTEM 12 by the MW-QOBE and QOBE algorithms, which used 37 and 67 points, respectively, from a total of 350 points. (a) Parameter convergence. (b) Volume.

6.6 Selective updating

MW-QOBE algorithms utilize fewer points than QOBE algorithms due to their more stringent data selection. The extra per-point-taken costs are often offset by the decrease in the number of observations processed. In this section, we chart the average number of data points accepted by QOBE and MW-QOBE in the identification of SYSTEM 2.

In the first experiment we identify SYSTEM 2 by MW-QOBE and QOBE using an MW-QOBE algorithm with an "adaptive" number of weights. That is, the algorithm first attempts to optimize by incorporating one past weight, but is allowed to revert to K = 0 (QOBE) if the attempt fails. We then compare the average number (over 10 runs) of points taken by MW-QOBE and QOBE per length of data. Figures 6.15-6.17 show plots of the results.

Figure 6.15. Ratio of points taken, averaged over 10 runs, in the identification of an AR(2) system with uniformly distributed error sequence by QOBE and MW-QOBE (adaptive K).

The average number of updates of the central estimator in both the MW-QOBE and QOBE algorithms decreases as the duration of the observation vector sequence increases. This is expected, since many of the observations are redundant after a certain time. However, MW-QOBE selects fewer points on average. Increasing the number of weights revisited results in a more stringent test for acceptance of observations, but increases the per-point convergence and decreases the ellipsoid volume. Figures 6.15-6.17 compare the average number of points used by MW-QOBE and QOBE in identifying AR(2), AR(3) and AR(12) systems [SYSTEM 4, SYSTEM 2 and SYSTEM 12] with the error sequence uniformly distributed over (-1,1). We observe performance gains in each case.

Next we consider an MW-QOBE algorithm which attempts to optimize κ_n over K weights but does not revert to the QOBE algorithm when the optimization over K weights fails. Figures 6.18-6.20 compare the average number of points utilized by MW-QOBE and QOBE in identifying the systems of the preceding experiments.

The per-update MW-QOBE computational cost (flops) for a system of order m is (3/2)(m^2 + 3m + 4) for K = 0 (QOBE), and 3m^2 + 12m + 20 for K = 1 (see Chapter 3). When m = 3 the computational cost per update becomes 33 and 83 flops. For a sequence of length 1000 (see Figure 6.19), this corresponds to an average of 34 and 10 updates for K = 0 and K = 1, or about 112 and 83 multiplications per 100 data points, respectively. These numbers support the assertion that the extra per-update cost is often offset by a decrease in the number of observations processed. Using a fixed K in MW-QOBE demonstrates its strongest asset, namely the drastic decrease in the number of observations taken.
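These cost figures can be checked with a short calculation. The sketch below evaluates the two per-update cost formulas for m = 3 and combines them with the average update counts quoted above for the 1000-point run (34 updates for K = 0, 10 for K = 1); everything else is taken directly from the text.

m  = 3;
c0 = (3/2)*(m^2 + 3*m + 4);   % per-update flops, K = 0 (QOBE): 33
c1 = 3*m^2 + 12*m + 20;       % per-update flops, K = 1:        83
r0 = 34/1000;  r1 = 10/1000;  % average update rates over the 1000-point run
per100_0 = 100*r0*c0;         % about 112 multiplications per 100 points (K = 0)
per100_1 = 100*r1*c1;         % about 83 multiplications per 100 points (K = 1)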
6.7 General findings, conclusions, recommendations for practice

Simulation results show improvements in parameter convergence speed, reduction in ellipsoid volume, and tracking ability when an extra weight is used in MW-OBE algorithms. Both QOBE and MW-QOBE have excellent tracking capabilities, with MW-QOBE performing at least as well as QOBE when tracking fast systems with abrupt parameter transitions. In practice, revisitation of a large number of weights becomes impractical. We recommend the use of a small number of weight revisions, typically one or two.

In the simulations presented, the MW-QOBE algorithm was given the option of reverting to the QOBE algorithm when no optimal solution could be found considering K weights. When this option is removed, as shown in Figures 6.18-6.20, the number of observations necessary to update the estimator is dramatically reduced. The drawback is the more stringent data acceptance criterion, which may prevent the algorithm from sufficiently updating its estimator in a short interval.

We suggest the use of MW-OBE with an adaptive number of adjustments when identifying parameters in short data segments and for tracking of time-varying parameters. This adaptive version (in K) guarantees maximum tracking performance. For long data sequences where the concern is to reduce the number of points processed (in updating the estimator), we recommend using an MW-OBE algorithm with fixed K.

Figure 6.16. Ratio of points taken, averaged over 10 runs, in the identification of an AR(3) system with uniformly distributed error sequence by QOBE and MW-QOBE (adaptive K).

Figure 6.17. Ratio of points taken, averaged over 10 runs, in the identification of an AR(12) system with uniformly distributed error sequence by QOBE and MW-QOBE (adaptive K).

Figure 6.18. Ratio of points taken, averaged over 10 runs, in the identification of an AR(2) system with uniformly distributed error sequence by QOBE and MW-QOBE (fixed K).

Figure 6.19. Ratio of points taken, averaged over 10 runs, in the identification of an AR(3) system with uniformly distributed error sequence by QOBE and MW-QOBE (fixed K).

Figure 6.20. Ratio of points taken, averaged over 10 runs, in the identification of an AR(12) system with uniformly distributed error sequence by QOBE and MW-QOBE (fixed K).

Chapter 7
Future Directions and Complementary Work

7.1 Introduction

This chapter contains a collection of topics that were explored as part of this research. These topics contain information and ideas potentially helpful for future research. We also summarize the results obtained throughout the course of this research and conclude the dissertation.
7.2 Normalized MW-OBE algorithm

Including a normalization strategy (as in [6]) offers new insights into the behavior of MW-OBE algorithms. This "normalization" is achieved by an (affine) transformation of the parameter estimator space which normalizes the ellipsoid semi-axis lengths to unity and shifts its center, θ_{n-1}, to the origin. We begin by expressing the positive definite matrix P_{n-1} using its normalized eigenvector representation

P_{n-1} = R_{n-1} R_{n-1}^T    (7.1)

where R_{n-1} is a unitary matrix. We define the change of variable in the parameter space (at each iteration) by

θ̄ = R_{n-1}^T (θ - θ_{n-1}),   or   θ = R_{n-1} θ̄ + θ_{n-1},    (7.2)

and the projection of the matrix of data vectors onto the parameter space spanned by R_{n-1} by

Z_{n-1} := R_{n-1}^T X_{n-1}.    (7.3)

It follows that G_{n,n-1} = Z_n^T Z_n and H_{n,n-1} = I + Λ_n Z_n^T Z_n. The a posteriori ellipsoid set at time n-1 (a priori at time n) in the new coordinate system becomes

Ω_{n-1} = { R_{n-1} θ̄ + θ_{n-1} : (R_{n-1} θ̄)^T C_{n-1} (R_{n-1} θ̄) < κ_{n-1} }

or, multiplying each side by R_{n-1}^{-1} (a rotation),

Ω̄_{n-1} = { θ̄ : θ̄^T (R_{n-1}^T P_{n-1} R_{n-1})^{-1} θ̄ < κ_{n-1} } = { θ̄ : θ̄^T θ̄ < κ_{n-1} }.

The MW-OBE recursions take on the new form

R_{n-1}^{-1} P_n R_{n-1}^{-T} = I - Z_n H_n^{-1} Λ_n Z_n^T    (7.4)
θ̄_n = θ̄_{n-1} + Z_n Λ_n ε_{n|n-1}    (7.5)
κ_n = κ_{n-1} + γ_n^T Λ_n γ_n - ε_{n|n-1}^T H_n^{-1} Λ_n ε_{n|n-1}.    (7.6)

Equations (7.4)-(7.6) may be used as alternatives to the MW-OBE recursions introduced in Chapter 2 and have a simpler form.

7.3 Hyperellipsoid volume

The determinant of the matrix representing a hyperellipsoid is proportional to its volume. The need to compare volumes in spaces of different dimension (models of different order) sometimes arises, leading to the need for a more exact representation of volume. Some of the following ideas are used in Section 7.4.

Lemma 7.1 Let Θ = [θ_1, θ_2, ..., θ_{m-1}] represent a vector of angles describing a direction in m-dimensional space, and let u = [u_1, u_2, ..., u_m] be the unit vector in that direction. Without loss of generality, we assume that 0 ≤ θ_1 < 2π and -π/2 ≤ θ_i < π/2 for i ≠ 1. The ith element of u is expressed as follows:

u_i = S_i \prod_{j=i}^{m} C_j    (7.7)

where S = [1, sin θ_1, ..., sin θ_{m-1}] and C = [cos θ_1, ..., cos θ_{m-1}, 1].

Proof: Start with the two-dimensional case and iterate. ∎

Lemma 7.2 Let M be a positive definite matrix of order m describing a hyperellipsoid. The distance from the center to the surface of the hyperellipsoid in any given direction Θ is given by

d_Θ = (u^T M u)^{-1/2}    (7.8)

where the vector u is derived as in Lemma 7.1.

Proof: Let v be the vector from the origin to the ellipsoid surface in the Θ direction. Then v may be expressed as a multiple of the unit vector u, v = a u, where a is a scalar and u the unit vector in the Θ direction. From the hyperellipsoid equation v^T M v = 1, we obtain u^T M u = 1/a^2. Equation (7.8) follows. ∎

Lemma 7.3 The volume of an m-dimensional hypersphere of radius r is given by

V = \int_{-π/2}^{π/2} \cdots \int_{-π/2}^{π/2} \int_{0}^{2π} \int_{0}^{r} l^{m-1} |\sin θ_1 \cos θ_2 \cdots \cos θ_{m-1}| \, dl \, dθ_1 \cdots dθ_{m-1} = \frac{2^{m-1} π r^m}{m}.    (7.9)

In Figure 7.1, we compare the effects of model order on the volume in OBE identification of a speech segment. The volume corresponding to each model order is normalized by scaling by an appropriate constant (see Lemma 7.3).

Figure 7.1. Effect of the model order in the identification of the LP parameters in the voiced /I/ phoneme. We show the log of the final volumes (normalized to the initial ellipsoid volume) using model orders 7-14.

Lemmas 7.1 and 7.2 are used for generating the "other" parameter vectors inside the ellipsoid in Section 7.4. These "other" points are generated by choosing a random direction inside the ellipsoid (an (m-1)-vector of angles), determining the length of the semi-axis in the chosen direction, and then choosing a random length.
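A minimal MATLAB sketch of this sampling procedure, following Lemmas 7.1 and 7.2, is given below. Here M is the positive definite matrix describing the hyperellipsoid {v : v^T M v ≤ 1}; the example matrix and the uniform draws are illustrative assumptions.

m = 4;  A = randn(m);  M = A*A' + eye(m);   % example ellipsoid matrix
ang = [2*pi*rand, pi*rand(1,m-2) - pi/2];   % random direction angles (Lemma 7.1)
S = [1, sin(ang)];  C = [cos(ang), 1];
u = zeros(m,1);
for i = 1:m
  u(i) = S(i)*prod(C(i:m));                 % unit vector elements, eq. (7.7)
end
d = 1/sqrt(u'*M*u);                         % center-to-surface distance, eq. (7.8)
v = rand*d*u;                               % random point inside the ellipsoid along u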
7.4 Application to linear-prediction analysis of speech signals

Speech signals are often modeled as AR signals with 10-14 linear prediction (LP) parameters [9]. These parameters represent the spectral properties of a speech segment and are usually identified by a least-squares method, often in batch form. However, the least-squares estimator does not necessarily provide the best perceptual representation of the original speech sequence. OBE algorithms have interesting properties with applications to speech processing. The fast convergence and good tracking abilities of MW-QOBE make it an ideal candidate for spectral estimation in very short time frames. However, what distinguishes OBE from most system identification methods is its data selection ability and its generation of a feasible set. These two properties offer great potential for improving perceptual qualities in speech analysis.

An objective in speech analysis may be to find a parameter estimator that enhances a certain perceptual quality. Such an estimator may be found within the ellipsoid, aside from the central estimator and the conventional LS estimators. OBE algorithms provide a set of feasible parameters from which a "better" estimator can be chosen. In Figure 7.2a we show the spectrum of a segment of length L (L = 512 points) of the vowel /I/, generated by fast Fourier transform (FFT), and assume that the speech segment is accurately modeled as an AR(14) signal with "true" parameter θ_*. In Figure 7.2b, we show the impulse response FFT of the "covariance method" estimator (θ_COV), the LSE estimator [9], the central estimator (θ_{c,L}) and three "other" non-central estimators (θ_{1-3,L}) of θ_* inside the ellipsoid (see Section 7.3). These alternative estimators, with spectra close to those of the least-squares estimators, may offer other desirable properties in terms of perceptual quality. We suggest the use of these alternative parameters in improving speech quality.

Table 7.1. "Other" AR parameter estimators inside the ellipsoid (Figure 7.2b). The coordinates and radii (||θ_{i,L} - θ_{c,L}||_2) are given in relation to the final SM-WRLS ellipsoid center (m = 14, volume ≈ 1.4 x 10^19 and 4.67 x 10^{-1} ≤ semi-axes ≤ 43.6).

Parameter        θ_{1,L}      θ_{2,L}      θ_{3,L}
 1                 0.07        -0.00         0.04
 2                 0.17         0.00        -0.07
 3                -0.04        -0.00        -0.06
 4                 0.27         0.00         0.05
 5                -0.16         0.03        -0.57
 6                 0.13        -0.25        -6.16
 7                -0.62        -0.15         5.24
 8                 0.57         2.00        15.31
 9                22.61         4.21        73.82
10                -1.24        23.52        17.00
11               141.98        79.60       336.44
12                16.86        66.57        75.48
13               329.19       547.10      -288.21
14              -147.43      -374.22        73.76
||θ_{i,L} - θ_{c,L}||_2   388.66    671.34    461.98

Certain challenges remain in applying OBE algorithms to speech processing. First, determining reasonable error bounds for speech signals poses a challenge, since the "true" bounds are unknown [9, 29]. Second, OBE algorithms generate large ellipsoid volumes [13]. A closer look at these ellipsoids shows a great disparity between axis lengths, as shown in Table 7.2. (Tables 7.3 and 7.4 are provided for reference.) It is conjectured that this is largely due to the lack of persistency of excitation in most voiced speech data.

Table 7.2. SM-WRLS 14-dimensional hyperellipsoid semi-axes (see Figure 7.2). Volume = 4.32 x 10^17.
Lengths: 43.62  3.81  35.51  0.47  17.33  0.91  2.47  13.85  13.47  2.62  5.10  1.86  4.07  1.82

Table 7.3. Poles of the 14th-order system, based on the batch covariance estimate (see Figures 7.2 and 7.3).
Magnitudes (x10^{-1}): 7.98  9.66  9.66  9.69  9.69  9.33  9.33  8.45  8.45  9.86  9.86  9.35  3.83  3.83
Angles (deg): 180.00  124.03  -124.03  80.38  -80.38  95.69  -95.69  113.81  -113.81  10.10  -10.10  0.00  27.59  -27.59

Table 7.4. Poles of the 14th-order system (SM-WRLS), based on the ellipsoid center estimate (see Figures 7.2 and 7.3).
Magnitudes (x10^{-1}): 10.1  10.1  9.10  9.62  9.62  9.27  9.27  9.58  9.58  8.22  7.44  7.44  2.57  1.99
Angles (deg): 9.93  -9.93  0.00  79.51  -79.51  96.54  -96.54  123.10  -123.10  180.00  113.14  -113.14  180.00  0.00
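Pole listings such as those in Tables 7.3 and 7.4 can be obtained directly from an estimated AR parameter vector. The short MATLAB sketch below shows one way to do this; the AR(14) coefficient vector a is a hypothetical placeholder for whichever estimator (covariance, ellipsoid center, or a non-central point) is being examined.

% Poles of an AR(m) model y(n) = a(1)y(n-1) + ... + a(m)y(n-m) + e(n)
a   = randn(14,1)/10;          % placeholder AR(14) coefficients (hypothetical)
p   = roots([1; -a]);          % poles: roots of 1 - a(1)z^-1 - ... - a(m)z^-m
mag = abs(p);                  % pole magnitudes (cf. Tables 7.3-7.4, x10^-1)
ang = angle(p)*180/pi;         % pole angles in degrees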
Magnitudes(x10‘1) 7.98 9.66 9.66 9.69 9.69 9.33 9.33 8.45 8.45 9.86 9.86 9.35 3.83 3.83 Angles (deg) 180.00 124.03 -12403 80.38 -80.38 95.69 -95.69 113.81 -113.81 10.10 -1010 0.00 27.59 -27.59 Table 7.3. Poles of the 14th order system, based on the batch covariance estimate (see Figures 7.2 and 7.3). pa. Magnitudes (X10'1) 10.1 10.1 9.10 9.62 9.62 9.27 9.27 9.58 9.58 8.22 7.44 7.44 2.57 1.99 Angles (deg) 9.93 —9.93 0.00 79.51 -79.51 96.54 -96.54 j 123.10 -123.10 180.00 113.14 —113.14 180.00 0.00 Table 7.4. Poles of the 14th order system (SM-WRLS), based on the ellipsoid center estimate (see Figure 7.2 and 7.3). 110 7 .5 Alternative volume minimization: Trace Minimizing the volume can also be done by minimizing the trace of the ellipsoid matrix [15]. In this section initial steps toward finding a solution with multiple- weight adjustment are described. More work is necessary to find an actual solution to the derived system of equations. Using the fact that trace AB = trace B A, and defining Rn (i=8:r [Pn-1XnH;lAnXIPn_1], we can then find the trace of traceRn = trace [P,_,X,,H;'A,,XZ'P,,-,] = trace [H;1A,,X,,TP,,-,P,,-,X,,] = trace [Hg‘Aan] where Bn déf XI Pf_1X,,. Using the formulae of Athans [1] and defining A,. - '. o —1 -o.5 o 0.5 1 Real (a) COV estimate. T O o 0.5 r 1 j o . ' O 2‘ g : E) O h 0 .0 ., _, ' .o t O .' —o.5 » Q o o g o 1 . _1 _ .; 0 L 1' L —1 -o 5 0 o 5 1 Real (b) OBE-ABE ellipsoid center. Figure 7.3. Poles of the 14th order system based on the COV and OBE ellipsoid center estimates (see Figure 7.2, Tables 7.3 and 7.4). 117 Appendix A A. 1 Definitions en, an, and yn are modeled as random variables defined on a probability Space (0, f, P) where Q iS a sample Space, .7 a o-field, and P a probability measure. Using Lin’s [34] notation [34], define c-neighborhoods of noise bounds as: D? = [fl - 6, W] and D." = {-777 ‘77” 61- Definition A.l [34] A random sequence {en} is called uniformly conditionally tailed (UCT) if given 6 > 0, there exist a 6 > 0 and an infinite subsequence {t,} C N, such that F(En E (De1L U 0:) [73.4) > 6 as V n E {t,-}. Definition A2 The sub-sequence of random vectors, {mm}, is called persistently exciting (PE) if there exits a > 0, 6 > 0 and Tpg < 00 such that Vnk, (subsequence in time) k+Tpg o] < 2 3mm: < 01 i=1: Lemma A.l fill/[Matrix Inversion Lemma]: Let the matrix H = A +BDC', where A, D are non—singular matrices of order m, n and B, C are m x n and n x m 118 matrices respectively. Then H“ .-= A"—A“B(D“+CA“B)“‘CA“ Lemma A.2 [41] det Adet (D + CA-IB) = det Ddet (A + BDC’) where A, D are non-singular matrices of order m, n and B, C are m x n and n x m matrices respectively. Lemma A.3 [32] If A’1 exists, —1 A D _ A“+EA”F —EA" 0 B —A-1F A“ where A = B —— CA—ID, E = A-ID, F = CA”. If B'1 exists, the block (1,1) can be written as [A — DB"IC]". A is known as the Shur complement of A. Also, A D -1 det = detAdet(B+CA D). CB Lemma AA As 0 extension of Lemma A3, assuming B"1 exists, -l a-1 -A“F —EA“ B" + EA‘1F AD CB where A = A — DB“C, E = B“C', F 2 DB". If A“ exists, the block (2,2) can be written as [B -— CA’ID]“. Proof: Use the development of [32] starting from the top left corner of the matrix. I 119 lea-e Lemma A.5 Following Lemma [4.4, let A; be the (K X K) symmetric and invertible matrix and u; the (K x 1) vector as described below: T A] = [ ul : [11.1 no] 01 A0 where A0 is invertible (K — 1 x K — 1), 111 is (K — 1 x 1), a1 and b, are scalars. Then, A,‘1 = A,“1 ________ I __________ ——A61a1 ] A1A61+(A6101)(A6101)T] where A1 = a1 — aTAo‘lal. 
It follows that 0 _ _ —1 l. Aflul = _l — A110“ — uoTAo 101) _1 2. uTAlul = unguo + 221107.110 + “Ufa, 2 3. Alu'erl‘lul = AlugAgluo + (ug'Ao—lal — 111) Proof: Apply Lemma A.4. Lemma A.6 [I] Let A, B and X be matrices of appropriate dimensions. In addi- tion, let X be invertable with elements independent of each other. Then, fitrachX) 2 AT gax-trace(AXB) = ATBT fitracMX‘l) = ”(X-IX”)T a—aXtrace(AX_lB) = -—(X-IBAX’1)T fideux’) = det(X)X-T fideMAXB) = det(AXB)X‘T fidefiXT) = 53(88th—elet(X)x-‘T 120 where the operator 0a def[ 6a 5:? :: BIC-J] for a scalar a (see Table 1.1.) 121 A.2 Matlab programs MWOBE, August 25, 1998 Z Multiple weight OBE algorithm (MWOBE). Z with volume and kappa minimization options (MW-SM-WRLS and MW-QDBE). Z Dale Joachim, MSU. 1996, 09/7/97 and 6/9/98. Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z 00101-50 function [thet,param,stat,Ellip] = mwobe(y,w,gam,pm,opt) y —> output sequence observable input sequence (when not present use w=y and q=-1) w -> gamma max: initial overestimated bound (scalar) gmax -> pm(l) -> p, order of ’y’ pm(2) -> q, order of ’x’ pm(3) -> N, non-update maximum window length pm(4) -> ep, ’small number’ pm(5) —> reset value of k (kappa) pm(6) -> forgetting factor -> digit(l) Algorithm type: O->volume (back), 1->kap (back) 2->volume (forw), 3->kap (forw) 4->volume (simp), 5->kap (simp) -> digit(2) Number of past weight revisions -> digit(3) Observation vector choice and adaptive K in [K,O] (adaptive K) —> non-zero weights 2 variable K in {0,K} -> non-zero weights & K fixed -> immediate weights & K fixed -> immediate weights 2 variable K in {0,K} -> digit(4) Ellipsoid matrix: 0 -> none, 1-> store Opt thet <- parameter updates (matrix MxT) param(1,1:T) <- epsilon, error sequence (vector le) (squared) param(2,1:T) <- vol, ellipsoid volume (vector le) param(3,1:T) <- kappa, kappa (min distance) (vector le) param(4,1:T) <- gamma, estimated error bound (vector le) param(5:7,1:T) <- last 3 weight updates (vector 3xT) Ellip <- ellipsoid matrices (MxMxT) stat(l) <- nup, number of data points used (scalar) stat(2) <- gup, number of gamma updates (scalar) stat(3) <- mup, number of mulpiple weight update points function [thet,param,Ellip,stat] = mwobe(y,w,gam,pm,opt); 122 mu = 10‘(-2); Z small number rem(fix(opt/10‘O),10); Z function, see next. rem(fix(opt/10‘2),10); Z choice of data. funct d-typ nK = rem(fix(opt/10‘1),10) + 1; Z TOTAL weight to revise Z INCLUDING current do_ell = rem(fix(0pt/10“3),10) == 1; Z 1->return each ellipsoid. if size(y,1)>size(y,2) y=y’; end; Z column vector if size(w,1)>size(w,2) w=w’; end; Z column vector p = pm(l); q = pm(2); N = pm(3); ep = pm(4); ff = pm(6); lY = 1ength(y); Z output sequence size Z ARX-type model order H = p+q+1; l i if length(gam) == , ‘. 
if length(gam) == 1,
  gam = gam*ones(size(y));
end;

% -------------- Initialization -----------------------
P0 = (1/mu^2)*eye(M);
P  = P0;                               % initial condition
param = zeros(7,lY);
param(2,:) = ones(size(param(2,:)));
thet = zeros(M,lY);                    % model parameters
t  = zeros(M,1);                       % initial parameters
k0 = 1.0; k = k0;
P  = P*mu;
v0 = sqrt(det(k0*P0));                 % initial volume
Ellip = inv(P0)/k;                     % initial ellipsoid
v  = sqrt(det(k*P));                   % volume
pWb = 0*ones(nK,1);                    % past (cumulated) weight revisions
nup = 0;                               % number of updates
mup = 0;                               % number of multiple weight updates
Y = [zeros(nK-1,1); y(M)]; U = [gam(1:nK)]';   % initial output and bound vectors
X = [y(M+1-(1:1:p)) w(M+1-(0:1:q))]';
for i = 1:nK-1,
  X = [zeros(M,1) X];
end;
E = Y - X'*t;                          % initial error
ofile = 1;                             % output device (0 = null)

% ----------------- Recursion -------------------------
fprintf(ofile,'\nMWOBE, K = %d, M = %d: ',nK-1,M);
for n = M+nK:lY,
  fprintf(ofile,'.');
  Y1 = [Y(2:end); y(n)];               % new candidates
  U1 = [U(2:end); gam(n)];             % same as gam(n-nK+1:n)'
  Um = diag(U1);                       % diagonal matrix
  x1 = [y(n-(1:1:p)) w(n-(0:1:q))]';
  X1 = [X(:,2:end) x1];
  if ( sum(d_typ == [1,2]) ) | (funct > 1)   % immediate K observations
    X = X1; Y = Y1; U = U1;                  % forward algorithms
  end;
  E = Y1 - X1'*t; Em = diag(E);        % error
  G = X1'*P*X1;                        % weighted energy matrix
  d_idx = nK;
  if (funct>1), d_idx = 1; end;
  x = X1(:,d_idx);
  e = Y1(d_idx) - x'*t;                % scalar error and energy
  g = x'*P*x;
  u = U1(d_idx);
  if sum(funct == [0,2,4])
    [L,chk,pWa,S] = mwv_weights(G,U1,E,pWb,[funct,d_typ,k,M]);
  else
    [L,chk,pWa,S] = qu_weights(G,U1,E,pWb,[funct,d_typ,k,M]);
  end;
  switch(chk)
  case 1
    fprintf(ofile,'\b[%d]',n);
    nup = nup + 1;                     % stat: points used (any adjustment)
    X = X1; Y = Y1; U = U1;
    l = L(d_idx);
    h = 1+l*g;
    P = P - (l/h)*P*x*x'*P;
    k = k + (l/h)*(h*u^2 - e^2);
    if k < 10e-8, k=0.1; end;
    t = t + l*e*P*x;
  case 2
    fprintf(ofile,'\b');
    nup = nup + 1;                     % stat: number of used points
    X = X1; Y = Y1; U = U1;
    Lm = diag(L);
    H = eye(nK)+Lm*G;
    P = P - P*X*inv(H)*Lm*X'*P;
    k = k + U'*Lm*U - E'*inv(H)*Lm*E;
    if k < 10e-8, k=0.1; end;
    t = t + P*X*Lm*E;
    mup = mup + 1;
  end;
  if k<0, k = 0.1; end;                % adaptation
  param(5,n) = pWa(1);
  if nK>1, param(6,n) = pWa(2); end;
  if nK>2, param(7,n) = pWa(3); end;
  v = sqrt(det(k*P));                  % volume
  thet(:,n) = t;                       % parameters
  param(1,n) = Y1(d_idx) - x'*t;
  param(2,n) = v;
  param(3,n) = k;
  param(4,n) = U(d_idx);
  pWb = [pWa(2:nK); 0];                % updated weights
  if ( sum(d_typ == [2,3]) ) & (chk==0)   % immediate K observations
    X = X1; Y = Y1; U = U1;               % forward algorithms
  end;
  if do_ell, Ellip = [Ellip inv(P)/k]; end;
end
param(1,1:M+nK-1) = param(1,M+nK)*ones(1,M+nK-1);   % initial error
param(2,1:M+nK-1) = v0*ones(1,M+nK-1);              % initial volume
param(3,1:M+nK-1) = k0*ones(1,M+nK-1);              % initial kappa
param(4,1:M+nK-1) = param(4,M+nK)*ones(1,M+nK-1);   % initial gamma
stat = [nup 0 mup];
(0:no, 1:qobe, 2:mwobe) Z pHa <- vector of (cumulated) weights (after) Z S <- sign vector function [L,chk,pWa,S] = qu_weights(G,U,E,pr,0pt) nK = size(G,1); Z Um = diag(U); Z diagonal matrix Em = diag(E); Z error pWa = pwb; ddir = opt(l) > 1; vK = sum (opt(2) == [0,3]); simpl = opt(l) > 3; k = opt(3): H opt(4): chk = 0; L = zeros(nK,1); S = 0; Z initialization if ddir, t 1; else t = nK; end; e = E(t); u = U(t); g = G(t.t); if (nK > 1) for i = 0:0 Z (2“nK)-1 Z for this study, do all! if rank(G) == nK 127 S1 = [((dec2bin(i,nK == dec2bin(0,nK)) ... ‘ (dec2bin(i,nK) == dec2bin(0,nK)))]’; Se = sign(E)+(E==O); Z sign E 81 = Se; Z note for now ! Sm = diag(Sl); B = G*Um*Sm; Z other: B = Em*Gm*Em; L1 = inv(B)*(E-(U.*Sl)); if prod(L1+pr>0) chk = 2; S = 51; L = L1; if simpl, Z only one weight logged chk = 1; pWa(t) = L(t); else pWa = pub + L; end; end; end; end; end; if ('chk) & (( abs(e) > u )) & ((nK==1)|((nK>1)&vK)) 3 Z last cond. forces nK fixed S = e/abs(e); chk = 1; L(t) = (abs(e) - u)/(u*g); pHa(t) = L(t); end; 128 Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z MWOBE, August 25, 1998 MH-SH-WRLS weights dale joachim, msu, 6/9/98 function [L,chk,pHa,S] = mwv_weights(G,U,E,pr,opt) G -> Weighted energy matrix (X’PX) U -> upper bound vector E -> error vector pUb -> vector of past (cummulated) weights opt -> opt(1): funct (see HHOBE.m) opt(2): d_typ (see HWOBE.m) 0pt(3): kappa from previous iteration opt(4): system order H L <- weight adjustments vector chk <- weight exists ? (Ozyes, 1:sm-wrls, 2:mw-sm-wrls) pHa <- vector of (cummulated) weights 8 <- sign vector function [L,chk,pHa,S] = mwv_weights(G,U,E,pwb,opt) nK = size(G,1); Z Um.= diag(U); Z diagonal matrix Em = diag(E); Z error pwa = pub; ddir = 0pt(1) > 1; vK = sum (0pt(2) == [0,3]); simpl = opt(l) > 3; k opt(3): m opt(4): chk = 0; L = zeros(nK,1); S = 0; Z initialization if ddir, t = 1; else t = nK; end; e = E(t); u = U(t); g = G(t.t); Gi = inv(G); if (e‘2 - u“2 + ktg/m > 0 ) & (nK>1) El = 01 t E; 01 = Gi * U; b = k - E’rEl - U’tUl; if prod(prod((E*E’ - U*U’)*Gi + (k/m) > 0)) 129 Z ------------ numerical solution ---------- a1 = 0; a2 = 2*pi; incr = (a2-a1)/SOO; tt = [incr:incr:a2]; uvec = zeros(length(tt),2); vvec = zeros(1ength(tt),2); dvec = zeros(1ength(tt),2); tvec = 10“9*ones(1ength(tt),1); uvec(:,1) = [0(1)*cos(tt) - U(2)*sin(tt)]’; uvec(:,2) [U(2)#cos(tt) + U(1)*sin(tt)]’; vvec(:,l) = [El(1)*cos(tt) - 81(2)*sin(tt)]’; vvec(:,2) = [E1(2)*cos(tt) + El(1)*sin(tt)]’; alfv = (uvec./vvec).*(m*uvec.*vvec - E1’*U); betv = (vvec./uvec).*(m*uvec.*vvec + El’tU); dvec = (b + sqrt((b‘2) + 4*(alfv).*(betv)))./(2*alfv); idx = isrea1(dvec); for i=1:1ength(tt) if (isrea1(dvec(i,:))) DD = diag(dvec(i,:)); uu = uvec(i,:)’; vv = vvec(i,:)’; f = b + uu’*DD*uu + vv’tDDtvv; F = mthtuu*uu’*DD - ftDD - mtvvtvv’; tvec(i) = trace(F); end; end; Z tvec = (tvec < O)*10‘50 + (tvec>0).*tvec; [val,idx] = min(tvec); angl = tt(idx); D = diag(dvec(idx,:)); R = [cos(ang1) -sin(ang1); sin(ang1) cos(angl)]; xx = R’tntfi; L1 = diag(XX - Gi); plot(tvec); for i = 1:1 zoom on; agl = ginput(1); angl = ag1(1); R = [cos(angl) -sin(angl); sin(angl) cos(angl)]; uvec = R*U; vvec = Rtfil; alfv = (uvec./vvec).*(m*uvec.*vvec - El’tU); 130 betv = (vvec./uvec).*(m*uvec.*vvec + Ei’*U); dvec = (b + sqrt((b‘2) + 4*(alfv).*(betv)))./(2*alfv); D = diag(dvec); XX = R’*D*R; L1 = diag(XX - Gi); val = prod(L1>0); end; if prod(L1+pr>0)&(va1>0)&isrea1(L1) chk = 2; L = L1; if simpl, Z only one weight logged chk = 1; pwa(t) = L(t); else pHa = pUb + L; end; end; end; end; if ('chk) & (( m*(u“2 - 8‘2) - k*g < 
if (~chk) & ((m*(u^2 - e^2) - k*g < 0)) & ((nK==1)|((nK>1)&vK))
  a = (m-1)*(u*g)^2;
  b = u^2*g*(2*m-g) + e^2*g - k*g^2;
  c = m*(u^2 - e^2) - k*g;
  L(t) = (-b + sqrt(b^2 - 4*a*c))/(2*a);
  pWa(t) = L(t);
  chk = 1;
end;

References

[1] Michael Athans. The matrix minimum principle. Information and Control, 11:592-606, January 1968.
[2] Cs. Bányász and L. Keviczky (editors). Proc. 9th IFAC/IFORS Symp. on Identification and System Parameter Estimation, volumes 1 & 2, Budapest, July 1991.
[3] Stephen Barnett. Matrices: Methods and Applications. Oxford University Press, New York, 1990.
[4] G. Belforte, B. Bona, and M. Milanese. Advanced modelling and identification techniques for metabolic processes. CRC J. Biomedical Engr., 10:275-316, 1983.
[5] D. P. Bertsekas and I. B. Rhodes. Recursive state estimation for a set-membership description of uncertainty. IEEE Trans. on Automatic Control, AC-16:117-128, April 1971.
[6] M. F. Cheung. On Optimal Algorithms for Parameter Set Estimation. PhD thesis, Ohio State University, Columbus, 1991.
[7] M. F. Cheung, S. Yurkovich, and K. M. Passino. An optimal volume ellipsoid algorithm for parameter set estimation. In Proc. 30th IEEE Conf. Decision and Control, pages 969-974, Brighton, December 1991.
[8] S. Dasgupta and Y.-F. Huang. Asymptotically convergent modified recursive least squares with data dependent updating and forgetting factor for systems with bounded noise. IEEE Trans. on Information Theory, 33:383-392, 1987.
[9] John R. Deller, Jr., J. G. Proakis, and J. H. L. Hansen. Discrete-Time Processing of Speech Signals. MacMillan, New York, 1993.
[10] J. R. Deller, Jr. Set-membership identification in digital signal processing. IEEE Acoustics, Speech, and Signal Processing Magazine, 6:4-22, 1989.
[11] J. R. Deller, Jr. Personal communication, 1997.
[12] J. R. Deller, Jr., S. Gollamudi, S. Nagaraj, Y.-F. Huang, and S. Kapoor. Convergence analysis of a new "Quasi-OBE" algorithm for real-time signal processing. In preparation.
[13] J. R. Deller, Jr. and T. C. Luk. Linear prediction analysis of speech based on set-membership theory. Computer Speech and Language, 3:301-327, October 1989.
[14] J. R. Deller, Jr., M. Nayeri, and M. S. Liu. Unifying the landmark developments in optimal bounding ellipsoid identification. Int. J. Adaptive Control and Signal Processing, 8:48-63, Jan.-Feb. 1994.
[15] J. R. Deller, Jr., Majid Nayeri, and S. F. Odeh. Least-square identification with error bounds for real-time signal processing and control. Proc. IEEE, 81(6):813-849, June 1993.
[16] J. R. Deller, Jr. and M. Nayeri (session chairs). In Proc. International Symp. on Circuits and Systems, volume 1, Chicago, May 1993. (Special session on set-membership-based identification for signal processing and control.)
[17] J. R. Deller, Jr. and S. F. Odeh. Adaptive set-membership identification in O(m) time for linear-in-parameters models. IEEE Trans. on Signal Processing, 41(5), May 1993.
[18] J. R. Deller, Jr. and G. P. Picaché. Advantages of a Givens rotation approach to temporally recursive linear prediction analysis of speech. IEEE Trans. on Acoustics, Speech, and Signal Processing, 37(3):429-431, 1989.
[19] G. Fiorio, S. Malan, and M. Milanese. Guaranteed specification control design of inertial platforms for a high accuracy calibration device. In Proc. 29th Annual IEEE Conf. Decision and Control, Honolulu, 1990.
[20] E. Fogel. System identification via membership set constraints with energy constrained noise. IEEE Trans. on Automatic Control, 24:752-758, 1979.
[21] E. Fogel and Y.-F. Huang. On the value of information in system identification - bounded noise case. Automatica, 18:229-238, 1982.
[22] S. Gollamudi, S. Nagaraj, and Y.-F. Huang. SMART: A toolbox for set-membership filtering. In Proc. 1997 European Conf. on Circuit Theory and Design, Budapest, September 1997.
[23] Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland, 1983.
[24] R. Gomeni, H. Piet-Lahanier, and E. Walter. Study of the pharmacokinetics of betaxolol using set-membership set estimation. In Proc. 3rd IMEKO Congress on Measurements in Clinical Medicine, Edinburgh, 1986.
[25] D. Graupe. Time Series Analysis, Identification, and Adaptive Systems. Krieger, Malabar, Florida, 1989.
[26] Simon Haykin. Adaptive Filter Theory. Prentice-Hall, Englewood Cliffs, New Jersey, 1995.
[27] Roger A. Horn and Charles R. Johnson. Topics in Matrix Analysis, volume 2. Cambridge University Press, 1991.
[28] Alan Jennings. Matrix Computations for Scientists and Engineers. John Wiley and Sons, 1977.
[29] D. Joachim, J. R. Deller, Jr., and Majid Nayeri. Practical considerations in the use of a new OBE algorithm that blindly estimates error bounds. In Proc. 40th Midwest Symposium on Circuits and Systems, pages 762-765, Sacramento, August 1997.
[30] D. Joachim, J. R. Deller, Jr., and Majid Nayeri. Weight reoptimization in OBE algorithms: An initial study. In Proc. 35th Allerton Conference on Communication, Control and Computing, pages 789-797, Monticello, Illinois, October 1997.
[31] D. Joachim, J. R. Deller, Jr., and Majid Nayeri. Multiweight optimization in OBE algorithms for improved tracking and adaptive identification. In Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, Seattle, May 1998.
[32] Thomas Kailath. Linear Systems. Prentice-Hall, Englewood Cliffs, New Jersey, 1980.
[33] K. Keesman. Robust identification and prediction for output-error models. In Proc. 9th IFAC/IFORS Symp. on Identification and System Parameter Estimation, volume 2, pages 878-882, Budapest, July 1991.
[34] T. M. Lin. Optimal bounding ellipsoid algorithms with automatic bound estimation. PhD thesis, Michigan State University, East Lansing, 1996.
[35] M. S. Liu. Development and analysis of an optimal bounding ellipsoid algorithm using stochastic approximation. PhD thesis, Michigan State University, East Lansing, August 1993.
[36] L. Ljung and T. Söderström. Theory and Practice of Recursive Identification. MIT Press, Cambridge, Mass., 1983.
[37] M. Milanese and G. Belforte. Estimation theory and uncertainty intervals evaluation in the presence of unknown but bounded errors: Linear families of models and estimators. IEEE Trans. on Automatic Control, AC-27:408-414, 1982.
[38] M. Milanese, R. Tempo, and A. Vicino. Optimal error prediction for economic models. Int. J. System Science, 19:1189-1200, 1988.
[39] S. Nagaraj, S. Gollamudi, J. R. Deller, Jr., Y.-F. Huang, and S. Kapoor. Performance studies on a "Quasi-OBE" algorithm for real-time signal processing. In Proc. 40th Annual Midwest Symposium on Circuits and Systems, volume 2, pages 770-773, Sacramento, August 1997.
[40] M. Nayeri, M. S. Liu, and J. R. Deller, Jr. Do interpretable optimal bounding ellipsoid algorithms converge? Parts I & II. In Proc. 10th IFAC/IFORS Symp. on System Identification (SYSID '94), volume 3, pages 389-400, Copenhagen, July 1994.
[41] Ben Noble. Applied Linear Algebra. Prentice-Hall, Englewood Cliffs, New Jersey, 1969.
[42] J. P. Norton. Problems in identifying the dynamics of biological systems from very short records. In Proc. 25th Annual IEEE Conf. Decision and Control, Athens, 1986.
[43] Richard E. Phillips. A Course in Matrix Theory (class notes). Michigan State University, 1992.
[44] F. C. Schweppe. Recursive state estimation: Unknown but bounded errors and system inputs. IEEE Trans. on Automatic Control, AC-13:22-28, February 1968.
[45] T. Söderström and P. Stoica. System Identification. Prentice-Hall, Englewood Cliffs, New Jersey, 1989.
[46] S. M. Veres and J. P. Norton. Structure selection for bounded-parameter models: Consistency conditions and selection criterion. IEEE Trans. on Automatic Control, 36:474-481, 1991.
[47] A. Vicino, R. Tempo, R. Genesio, and M. Milanese. Optimal error and GMDH predictors: A comparison with some statistical techniques. Int. J. Forecasting, 2:313-328, 1987.
[48] E. Walter, Y. Lecourtier, J. Happel, and J. Y. Kao. Identifiability and distinguishability of fundamental parameters in catalytic methanation. J. American Inst. Chemical Engineers, 32:1360-1366, 1988.
[49] H. S. Witsenhausen. Sets of possible states of linear systems given perturbed observations. IEEE Trans. on Automatic Control, AC-13:556-568, October 1968.