LIBRARY
Michigan State University

This is to certify that the dissertation entitled

RADAR TARGET DISCRIMINATION USING NEURAL NETWORKS

presented by

CHANG-YING TSAI

has been accepted towards fulfillment of the requirements for

Ph.D. degree in Electrical Eng.

Major professor

MSU is an Affirmative Action/Equal Opportunity Institution

RADAR TARGET DISCRIMINATION USING NEURAL NETWORKS

By

Tsai, Chang-Ying

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Electrical Engineering

1995

ABSTRACT

RADAR TARGET DISCRIMINATION USING NEURAL NETWORKS

By

Tsai, Chang-Ying

This study uses several different memory-based neural networks to discriminate radar targets based on their early-time, aspect-dependent response, and demonstrates that target discrimination can be accomplished in a high-noise environment with great reliability. The difficulty of locating the beginning response point in practice prompts the use of FFT frequency spectrum magnitudes as aspect process patterns, since a time shift affects only the phase of the spectrum. The effects of analog data and bipolar data with different quantization levels on network performance are investigated. Especially promising is the Recurrent Correlation Accumulation Adaptive Memory-Generalized Inverse (RCAAM-GI) cascade neural network. This network uses a dynamic memory structure to accumulate the converging information and has a stability criterion to allow us to define the final stable state.
It can be considered as a real-time adaptive learning network with contamination observability and a flexible decision strategy. From the simulation results, the network demonstrates computation space efficiency and high noise tolerance.

To My Family in Taiwan

ACKNOWLEDGMENTS

I would like to express my sincerest thanks to Dr. K. M. Chen, Dr. E. J. Rothwell and Dr. D. P. Nyquist. Your support, concern, advice and consideration are deeply appreciated. I also thank Dr. Byron Drachman for participating as a member of my guidance committee. I am indebted to Rober Bebermeyer for his esteemed and accurate measurements. I also want to thank Dr. J. E. Ross, III and Dr. Ponniah Ilavarasan for their encouragement, help and friendship. Finally, I would like to thank my colleague, Dr. Chi-wei Chang, and his lovely family for their hospitality, which relieved a traveller from loneliness. To my family, expression is too savage. Keeping in heart is forever.

TABLE OF CONTENTS

LIST OF TABLES ..... viii
LIST OF FIGURES ..... x
CHAPTER 1 Introduction ..... 1
CHAPTER 2 Measurements and Data Preprocessing ..... 6
2.1 Introduction ..... 6
2.2 Measurements ..... 6
CHAPTER 3 Multi-Layer Feedforward with Error Backpropagation and Generalized Inverse Networks ..... 19
3.1 Introduction ..... 19
3.2 Multi-layer Feedforward with Error Backpropagation Networks ..... 19
3.2.1 Gradient Descent Rule ..... 22
3.2.2 ML/BP Learning Mechanism ..... 23
3.3 Gaussian Noise Generation .....
26
3.4 ML/BP Network Trainings and Performances for Radar Target Discrimination ..... 27
3.4.1 ML/BP Without Hidden Layer ..... 31
3.4.2 ML/BP With One Hidden Layer ..... 41
3.5 Generalized Inverse Network ..... 48
3.5.1 Generalized Inverse Network Performance Using Bipolar Data ..... 52
CHAPTER 4 Recurrent Correlation Associative Memories ..... 58
4.1 Introduction ..... 58
4.2 Hopfield Network and Bidirectional Associative Memory (BAM) ..... 59
4.2.1 Simulations and Results ..... 66
4.3 High order Correlation Associative Memory (HCAM) and Exponential Correlation Associative Memory (ECAM) ..... 71
4.3.1 Simulations and Results ..... 76
4.4 RCAM-GI Cascade Networks ..... 82
4.4.1 Simulations and Results ..... 85
CHAPTER 5 Recurrent Correlation Accumulation Adaptive Memories ..... 91
5.1 Introduction ..... 91
5.2 Recurrent Correlation Accumulation Adaptive Memories (RCAAM) ..... 91
5.3 Performance of the Recurrent Correlation Accumulation Adaptive Memory with dynamic input (RCAAM/di) ..... 99
5.3.1 RCAAM/di performances with respect to stability criterion effect ..... 100
5.3.2 RCAAM/di performances with respect to initial order effect ..... 122
5.3.3 RCAAM/di performances for the same averaged order group ..... 141
5.4 Performance of the Recurrent Correlation Accumulation Adaptive Memory with fixed input (RCAAM/fi) ..... 158
5.5 Performance of the Recurrent Correlation Accumulation Adaptive Memory with analog input and digital output (RCAAM/ad) .....
178
5.6 Comparisons of Correlation-based Discrimination Resolutions Between Analog and Bipolar Patterns ..... 181
5.7 Performances of RCAAM-GI Cascade Networks ..... 201
5.7.1 Converging Efficiency Comparisons Between the RCAAM and the RCAAM-GI Cascade Network ..... 202
5.7.2 Noise Tolerance Comparisons Between the RCAAM and RCAAM-GI Cascade Network ..... 203
CHAPTER 6 Target Discrimination using Neural Network with Spectrum Magnitude Response ..... 224
6.1 Introduction ..... 224
6.2 Spectrum Magnitude Process and Normalization ..... 226
6.3 Manipulation of Spectrum Magnitude Data Before Network Process ..... 238
6.4 Comparisons of Correlation-based Discrimination Resolutions for Different Data Formats ..... 244
6.5 Network Simulations with Spectrum Magnitude Response ..... 261
6.6 Network Comparisons ..... 277
CHAPTER 7 Conclusions ..... 294
BIBLIOGRAPHY ..... 299

LIST OF TABLES

Table 3.1 Code assignment for 7 quantization levels coded by three bipolar bits. ..... 53
Table 5.1 Comparisons of normalized target correlation gains subject to an assumed active threshold of 30% between analog data and encoded bipolar data. ..... 198
Table 5.2 Comparisons of normalized target correlation gains subject to an assumed active threshold of 20% between analog data and encoded bipolar data. ..... 199
Table 5.3 Comparisons of normalized target correlation gains subject to an assumed active threshold of 10% between analog data and encoded bipolar data. ..... 200
Table 6.1 Code assignment of 5 quantization levels encoded by three bipolar bits. .....
229
Table 6.2 Estimated normalized target correlation gains subject to an active threshold of 40% for different spectral process data forms. ..... 255
Table 6.3 Estimated normalized target correlation gains subject to an active threshold of 50% for different spectral process data forms. ..... 257
Table 6.4 Estimated normalized target correlation gains subject to an active threshold of 60% for different spectral process data forms. ..... 258
Table 6.5 Comparisons of network discrimination iterations among different data forms by testing networks with contaminated stored patterns. ..... 259
Table 6.6 Architecture summary of the recurrent correlation associative networks used. ..... 288
Table 6.7 Network architectures and performances summary for time domain target discrimination with 7 quantization levels encoded by 3 bits. ..... 291
Table 6.8 Network architectures and performances summary for spectrum magnitude target discrimination with 5 quantization levels encoded by 3 bits. ..... 292
Table 6.9 Network architectures and performances summary for spectrum magnitude target discrimination with 7 quantization levels encoded by 3 bits. ..... 293

LIST OF FIGURES

Figure 2.1 Frequency domain transient measurement system ..... 7
Figure 2.2 Block diagram for the anechoic chamber measurement system ..... 8
Figure 2.3 Time gated backscattered responses of BS2, at aspect angle 0°, after 8192-point IFFT. ..... 15
Figure 2.4 Normalized time response of BS2, at aspect angle 0°, serves as a network analog pattern. ..... 16
Figure 2.5 68 100-sample aspect time responses, truncated from 68 820-sample time-gated IFFT responses, for 4 scale aircraft models.
..... 17
Figure 2.6 68 normalized 100-sample aspect time responses serving as aspect stored patterns for time domain networks. ..... 18
Figure 3.1 Structure of a multi-layer feedforward with error backpropagation network ..... 21
Figure 3.2 (a) An artificial neuron unit model. The O_i's are outputs from previous layer H_I, θ_j is the bias for neuron j and W_ij's are interconnecting weights between layer H_I and the neuron j of layer H_J. (b) Activation function, g(x), used for bipolar process. ..... 21
Figure 3.3 100 time responses of BS2 used for network process at azimuthal aspect 0°. Solid line denotes responses without noise, while the dash-dot denotes the one with added noises of SNR = 10 dB. ..... 28
Figure 3.4 100 time responses of BS2 used for network process at azimuthal aspect 0°. Solid line denotes responses without noise, while the dash-dot denotes the one with added noises of SNR = 0 dB. ..... 29
Figure 3.5 100 time responses of BS2 used for network process at azimuthal aspect 0°. Solid line denotes responses without noise, while the dash-dot denotes the one with added noises of SNR = -3 dB. ..... 30
Figure 3.6 Activation functions, G_β(x), with different activation parameters. Solid line denotes the one with activation parameter β=1, while the dash line denotes the one with parameter β=3. ..... 32
Figure 3.7 The first derivatives, d[G_β(x)]/dx, of activation function with different activation parameters. Solid line denotes the one with activation parameter β=1, while the dash line denotes the one with parameter β=3. ..... 33
Figure 3.8 Performance of ML/BP network w/o hidden layer and w/o extra trainings vs. SNR(dB).
(a). Network performance for the 17 training patterns of target BS8.
(b).
Overall performance for all 68 training patterns ..... 37
Figure 3.9 Performance of ML/BP network w/o hidden layer and with extra trainings vs. SNR(dB).
(a). Network performance for the 17 training patterns of target BS8.
(b). Overall performance for all 68 training patterns ..... 38
Figure 3.10 The generalization performance of ML/BP network w/o hidden layer and w/o extra trainings vs. SNR(dB).
(a). Network generalization performance for the 16 untrained patterns of target BS8.
(b). Overall generalization performance for the 64 untrained patterns. ..... 39
Figure 3.11 The generalization performance of ML/BP network w/o hidden layer and with extra trainings vs. SNR(dB).
(a). Network generalization performance for the 16 untrained patterns of target BS8.
(b). Overall generalization performance for the 64 untrained patterns. ..... 40
Figure 3.12 Normalized derivatives of error functions of three different orders w.r.t. weight W_ij, -dE(W)/dW_ij. The desired output is assumed to be 1 and an activation parameter β=2 is used for output neurons. The dash-dot line denotes the error function with order of 6, the solid line denotes the one with order of 4, and the dash line shows the one with order of 2. ..... 43
Figure 3.13 Normalized derivatives of error function w.r.t. weight W_ij, -dE(W)/dW_ij, with three different activation parameters. The desired output is assumed to be 1 and the error function with order of 4 is used here. The dash-dot line denotes the activation function with parameter β=3, the solid line denotes the one with β=2, and the dash line shows the one with β=1. ..... 44
Figure 3.14 Performance of ML/BP network with one hidden layer vs. SNR(dB).
(a). Network performance for the 17 training patterns of target BS8.
(b). Overall performance for all 68 training patterns .....
46
Figure 3.15 The generalization performance of ML/BP network with one hidden layer vs. SNR(dB).
(a). Network generalization performance for the 16 untrained patterns of target BS8.
(b). Overall generalization performance for the 64 untrained patterns. ..... 47
Figure 3.16 Generalized Inverse (GI) network performance vs. SNR(dB).
(a). Network performance for the 17 training patterns of target BS8.
(b). Overall performance for all 68 training patterns ..... 56
Figure 3.17 The generalization performance of Generalized Inverse (GI) network vs. SNR(dB).
(a). Network generalization performance for the 16 untrained patterns of target BS8.
(b). Overall generalization performance for the 64 untrained patterns. ..... 57
Figure 4.1 BAM network performances vs. SNR(dB) for the 68 stored patterns.
(a). Pattern recognition performance.
(b). Target discrimination performance by using target group code. ..... 69
Figure 4.2 BAM generalization performance by using target group code for the 64 unstored patterns. ..... 70
Figure 4.3 HCAM (order 3) network performances vs. SNR(dB) for the 68 stored patterns.
(a). Pattern recognition performance.
(b). Target discrimination performance. ..... 77
Figure 4.4 HCAM (order 3) generalization performance for the 64 unstored patterns. ..... 78
Figure 4.5 HCAM (order 5) network performances vs. SNR(dB) for the 68 stored patterns.
(a). Pattern recognition performance.
(b). Target discrimination performance. ..... 80
Figure 4.6 HCAM (order 5) generalization performance for the 64 unstored patterns. ..... 81
Figure 4.7 ECAM network performances vs. SNR(dB) for the 68 stored patterns.
(a). Pattern recognition performance.
(b). Target discrimination performance. ..... 83
Figure 4.8 ECAM generalization performance for the 64 unstored patterns. .....
84
Figure 4.9 HCAM (order 3)-GI cascade network performances vs. SNR(dB).
(a). Target discrimination performance for the 68 stored patterns.
(b). Generalization performance for the 64 unstored patterns ..... 88
Figure 4.10 HCAM (order 5)-GI cascade network performances vs. SNR(dB).
(a). Target discrimination performance for the 68 stored patterns.
(b). Generalization performance for the 64 unstored patterns ..... 89
Figure 4.11 ECAM-GI cascade network performances vs. SNR(dB).
(a). Target discrimination performance for the 68 stored patterns.
(b). Generalization performance for the 64 unstored patterns. ..... 90
Figure 5.1 Unrecognized discrimination of the RCAAM/di with an initial order of 2 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 68 stored patterns. ..... 104
Figure 5.2 Correct and wrong pattern discrimination performances of the RCAAM/di with an initial order of 2 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns.
(a) Correct pattern discrimination performances vs. stability criterion.
(b) Wrong pattern discrimination performances vs. stability criterion. ..... 105
Figure 5.3 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 2 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 106
Figure 5.4 Unrecognized discrimination of the RCAAM/di with an initial order of 2 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns. ..... 107
Figure 5.5 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 2 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 108
Figure 5.6 Unrecognized discrimination of the RCAAM/di with an initial order of 3 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 68 stored patterns. ..... 111
Figure 5.7 Correct and wrong pattern discrimination performances of the RCAAM/di with an initial order of 3 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns.
(a) Correct pattern discrimination performances vs. stability criterion.
(b) Wrong pattern discrimination performances vs. stability criterion. ..... 112
Figure 5.8 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 3 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 113
Figure 5.9 Unrecognized discrimination of the RCAAM/di with an initial order of 3 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns. ..... 114
Figure 5.10 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 3 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 115
Figure 5.11 Unrecognized discrimination of the RCAAM/di with an initial order of 6 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 68 stored patterns. ..... 117
Figure 5.12 Correct and wrong pattern discrimination performances of the RCAAM/di with an initial order of 6 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns.
(a) Correct pattern discrimination performances vs. stability criterion.
(b) Wrong pattern discrimination performances vs.
stability criterion. ..... 118
Figure 5.13 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 6 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 119
Figure 5.14 Unrecognized discrimination of the RCAAM/di with an initial order of 6 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns. ..... 120
Figure 5.15 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 6 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 121
Figure 5.16 Unrecognized discrimination of the RCAAM/di with a stability criterion of 2 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 68 stored patterns. ..... 124
Figure 5.17 Correct and wrong pattern discrimination performances of the RCAAM/di with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for 68 stored patterns.
(a) Correct pattern discrimination performances vs. initial order.
(b) Wrong pattern discrimination performances vs. initial order. ..... 125
Figure 5.18 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. initial order.
(b) Wrong target discrimination performances vs. initial order. ..... 126
Figure 5.19 Unrecognized discrimination of the RCAAM/di with a stability criterion of 2 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 64 unstored patterns. .....
127
Figure 5.20 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. initial order.
(b) Wrong target discrimination performances vs. initial order. ..... 128
Figure 5.21 Unrecognized discrimination of the RCAAM/di with a stability criterion of 5 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 68 stored patterns. ..... 131
Figure 5.22 Correct and wrong pattern discrimination performances of the RCAAM/di with a stability criterion of 5 under 40, 14 and 0 dB SNR vs. initial order for 68 stored patterns.
(a) Correct pattern discrimination performances vs. initial order.
(b) Wrong pattern discrimination performances vs. initial order. ..... 132
Figure 5.23 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 5 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. initial order.
(b) Wrong target discrimination performances vs. initial order. ..... 133
Figure 5.24 Unrecognized discrimination of the RCAAM/di with a stability criterion of 5 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 64 unstored patterns. ..... 134
Figure 5.25 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 5 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. initial order.
(b) Wrong target discrimination performances vs. initial order. ..... 135
Figure 5.26 Unrecognized discrimination of the RCAAM/di with a stability criterion of 10 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 68 stored patterns. .....
136
Figure 5.27 Correct and wrong pattern discrimination performances of the RCAAM/di with a stability criterion of 10 under 40, 14 and 0 dB SNR vs. initial order for 68 stored patterns.
(a) Correct pattern discrimination performances vs. initial order.
(b) Wrong pattern discrimination performances vs. initial order. ..... 137
Figure 5.28 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 10 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. initial order.
(b) Wrong target discrimination performances vs. initial order. ..... 138
Figure 5.29 Unrecognized discrimination of the RCAAM/di with a stability criterion of 10 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 64 unstored patterns. ..... 139
Figure 5.30 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 10 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. initial order.
(b) Wrong target discrimination performances vs. initial order. ..... 140
Figure 5.31 Averaged orders required for target discrimination of the RCAAM/di with contaminated stored pattern inputs vs. stability criterion. ..... 142
Figure 5.32 Averaged orders required for target discrimination of the RCAAM/di with contaminated unstored pattern inputs vs. stability criterion. ..... 143
Figure 5.33 Orders required for stability criterion of the RCAAM/di with an average order between 9 and 10 vs. SNR. ..... 145
Figure 5.34 Performances of the RCAAM/di with an average order between 9 and 10 vs. SNR for 68 stored patterns.
(a) Correct target discriminations vs. SNR for 68 stored patterns.
(b) Unrecognized and wrong target discriminations vs.
SNR for 68 stored patterns.
Figure 5.35 Orders required for unstored pattern discriminations of the RCAAM/di with an average order of stored patterns between 9 and 10 vs. SNR. ..... 147
Figure 5.36 Performances of the RCAAM/di with an average order of stored patterns between 9 and 10 vs. SNR for 64 unstored patterns.
(a) Correct target discriminations vs. SNR for 64 unstored patterns.
(b) Unrecognized and wrong target discriminations vs. SNR for 64 unstored patterns. ..... 148
Figure 5.37 Orders required for stability criterion of the RCAAM/di with an average order between 10 and 11 vs. SNR. ..... 149
Figure 5.38 Performances of the RCAAM/di with an average order between 10 and 11 vs. SNR for 68 stored patterns.
(a) Correct target discriminations vs. SNR for 68 stored patterns.
(b) Unrecognized and wrong target discriminations vs. SNR for 68 stored patterns.
Figure 5.39 Orders required for unstored pattern discriminations of the RCAAM/di with an average order of stored patterns between 10 and 11 vs. SNR. ..... 151
Figure 5.40 Performances of the RCAAM/di with an average order of stored patterns between 10 and 11 vs. SNR for 64 unstored patterns.
(a) Correct target discriminations vs. SNR for 64 unstored patterns.
(b) Unrecognized and wrong target discriminations vs. SNR for 64 unstored patterns. ..... 152
Figure 5.41 Orders required for stability criterion of the RCAAM/di with an average order between 11 and 12 vs. SNR. ..... 154
Figure 5.42 Performances of the RCAAM/di with an average order between 11 and 12 vs. SNR for 68 stored patterns.
(a) Correct target discriminations vs. SNR for 68 stored patterns.
(b) Unrecognized and wrong target discriminations vs.
SNR for 68 stored patterns.
Figure 5.43 Orders required for unstored pattern discriminations of the RCAAM/di with an average order of stored patterns between 11 and 12 vs. SNR. ..... 156
Figure 5.44 Performances of the RCAAM/di with an average order of stored patterns between 11 and 12 vs. SNR for 64 unstored patterns.
(a) Correct target discriminations vs. SNR for 64 unstored patterns.
(b) Unrecognized and wrong target discriminations vs. SNR for 64 unstored patterns. ..... 157
Figure 5.45 Orders required for stability criterion of the RCAAM/di with an average order between 13 and 14 vs. SNR. ..... 159
Figure 5.46 Performances of the RCAAM/di with an average order between 13 and 14 vs. SNR for 68 stored patterns.
(a) Correct target discriminations vs. SNR for 68 stored patterns.
(b) Unrecognized and wrong target discriminations vs. SNR for 68 stored patterns.
Figure 5.47 Orders required for unstored pattern discriminations of the RCAAM/di with an average order of stored patterns between 13 and 14 vs. SNR. ..... 161
Figure 5.48 Performances of the RCAAM/di with an average order of stored patterns between 13 and 14 vs. SNR for 64 unstored patterns.
(a) Correct target discriminations vs. SNR for 64 unstored patterns.
(b) Unrecognized and wrong target discriminations vs. SNR for 64 unstored patterns. ..... 162
Figure 5.49 Unrecognized discrimination of the RCAAM/fi with an initial order of 1 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 68 stored patterns. ..... 165
Figure 5.50 Correct and wrong pattern discrimination performances of the RCAAM/fi with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns.
(a) Correct pattern discrimination performances vs. stability criterion.
(b) Wrong pattern discrimination performances vs. stability criterion.
..... 166
Figure 5.51 Correct and wrong target discrimination performances of the RCAAM/fi with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 167
Figure 5.52 Unrecognized discrimination of the RCAAM/fi with an initial order of 1 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns. ..... 168
Figure 5.53 Correct and wrong target discrimination performances of the RCAAM/fi with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 169
Figure 5.54 Unrecognized discrimination of the RCAAM/fi with a stability criterion of 2 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 68 stored patterns. ..... 172
Figure 5.55 Correct and wrong pattern discrimination performances of the RCAAM/fi with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for 68 stored patterns.
(a) Correct pattern discrimination performances vs. initial order.
(b) Wrong pattern discrimination performances vs. initial order. ..... 173
Figure 5.56 Correct and wrong target discrimination performances of the RCAAM/fi with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. initial order.
(b) Wrong target discrimination performances vs. initial order. ..... 174
Figure 5.57 Unrecognized discrimination of the RCAAM/fi with a stability criterion of 2 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 64 unstored patterns. .....
175
Figure 5.58 Correct and wrong target discrimination performances of the RCAAM/fi with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. initial order.
(b) Wrong target discrimination performances vs. initial order. ..... 176
Figure 5.59 Averaged orders required for target discrimination of the RCAAM/fi with contaminated stored pattern inputs vs. stability criterion. ..... 177
Figure 5.60 Unrecognized discrimination of the RCAAM/ad with an initial order of 1 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 68 stored patterns. ..... 182
Figure 5.61 Correct and wrong pattern discrimination performances of the RCAAM/ad with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns.
(a) Correct pattern discrimination performances vs. stability criterion.
(b) Wrong pattern discrimination performances vs. stability criterion. ..... 183
Figure 5.62 Correct and wrong target discrimination performances of the RCAAM/ad with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. ..... 184
Figure 5.63 Unrecognized discrimination of the RCAAM/ad with an initial order of 1 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns. ..... 185
Figure 5.64 Correct and wrong target discrimination performances of the RCAAM/ad with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets.
(a) Correct target discrimination performances vs. stability criterion.
(b) Wrong target discrimination performances vs. stability criterion. .....
186
Figure 5.65 Correct and wrong target discrimination performances of the RCAAM/ad with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order. ........ 187
Figure 5.66 Correct and wrong target discrimination performances of the RCAAM/ad with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order. ........ 188
Figure 5.67 Averaged orders required for target discrimination of the RCAAM/ad with contaminated stored pattern inputs vs. stability criterion. ........ 189
Figure 5.68 Orders required for the discriminations of RCAAM/fi1 vs. SNR for contaminated stored patterns. ........ 194
Figure 5.69 Orders required for the discriminations of RCAAM/ad1 vs. SNR for contaminated stored patterns. ........ 195
Figure 5.70 Unrecognized discriminations of the RCAAM/di2 and RCAAM/di22-GI vs. initial order for 40 dB and 0 dB SNR. ........ 204
Figure 5.71 Unrecognized discriminations of the RCAAM/fi1 and RCAAM/fi12-GI vs. initial order for 40 dB and 0 dB SNR. ........ 205
Figure 5.72 Unrecognized discriminations of the RCAAM/ad1 and RCAAM/ad12-GI vs. initial order for 40 dB and 0 dB SNR. ........ 206
Figure 5.73 Performance comparisons of the RCAAM/di's with an initial order of 2 and RCAAM/di22-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination vs. SNR for the 68 stored patterns. (b) Wrong target discrimination vs. SNR for the 68 stored patterns. ........
208
Figure 5.74 Unrecognized discriminations and discrimination orders of the RCAAM/di's with an initial order of 2 and RCAAM/di22-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 68 stored patterns. (b) Orders required for discriminations vs. SNR for the 68 stored patterns. ........ 209
Figure 5.75 Performance comparisons of the RCAAM/di's with an initial order of 2 and RCAAM/di22-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 64 unstored patterns. (b) Wrong target discriminations vs. SNR for the 64 unstored patterns. ........ 210
Figure 5.76 Unrecognized discriminations and discrimination orders of the RCAAM/di's with an initial order of 2 and RCAAM/di22-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 64 unstored patterns. (b) Orders required for discriminations vs. SNR for the 64 unstored patterns. ........ 211
Figure 5.77 Performance comparisons of the RCAAM/fi's with an initial order of 1 and RCAAM/fi12-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 68 stored patterns. (b) Wrong target discriminations vs. SNR for the 68 stored patterns. ........ 213
Figure 5.78 Unrecognized discriminations and discrimination orders of the RCAAM/fi's with an initial order of 1 and RCAAM/di12-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 68 stored patterns. (b) Orders required for discriminations vs. SNR for the 68 stored patterns. ........ 214
Figure 5.79 Performance comparisons of the RCAAM/fi's with an initial order of 1 and RCAAM/fi12-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 64 unstored patterns.
(b) Wrong target discriminations vs. SNR for the 64 unstored patterns. ........ 215
Figure 5.80 Unrecognized discriminations and discrimination orders of the RCAAM/fi's with an initial order of 1 and RCAAM/fi12-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 64 unstored patterns. (b) Orders required for discriminations vs. SNR for the 64 unstored patterns. ........ 216
Figure 5.81 Performance comparisons of the RCAAM/ad's with an initial order of 1 and RCAAM/ad12-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 68 stored patterns. (b) Wrong target discriminations vs. SNR for the 68 stored patterns. ........ 220
Figure 5.82 Unrecognized discriminations and discrimination orders of the RCAAM/ad's with an initial order of 1 and RCAAM/di12-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 68 stored patterns. (b) Orders required for discriminations vs. SNR for the 68 stored patterns. ........ 221
Figure 5.83 Performance comparisons of the RCAAM/ad's with an initial order of 1 and RCAAM/ad12-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 64 unstored patterns. (b) Wrong target discriminations vs. SNR for the 64 unstored patterns. ........ 222
Figure 5.84 Unrecognized discriminations and discrimination orders of the RCAAM/ad's with an initial order of 1 and RCAAM/ad12-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 64 unstored patterns. (b) Orders required for discriminations vs. SNR for the 64 unstored patterns. ........
223
Figure 6.1 Two aspect (54° and 9°) stored patterns with an apparent time shift resulting from an inappropriate beginning response determination in the time domain simulations. ........ 225
Figure 6.2 68 aspect FFT spectrum magnitude patterns, with normalized energy of 1, used for spectrum process network training/storage. Spectrum process networks simulate 4 targets; each target has 17 trained/stored aspect spectral patterns. The effective frequency band is 1-7 GHz. ........ 228
Figure 6.3 FFT spectrum magnitude patterns of two time responses from target F14 with respective aspect angles of 54° and 9°. ........ 236
Figure 6.4 Normalized FFT spectrum magnitude patterns of two time responses from target F14 with respective aspect angles of 54° and 9°. ........ 237
Figure 6.5 820 time responses of target B52 at aspect angle 0° used as measured responses by a time domain radar. The solid line shows the true responses, while the dotted line denotes the contaminated signal with a SNR of 0 dB. ........ 240
Figure 6.6 1358 spectrum magnitudes transformed from the previous two time responses by using a 1358-point DFT. The solid line shows the true responses, while the dotted line denotes the contaminated signal with a SNR of 0 dB. The evaluation of 0 dB is based on the 820 time responses. ........ 241
Figure 6.7 100 time responses truncated from the previous 820 responses used as the time domain network process aspect pattern. The solid line shows the true responses, while the dash-dot line denotes the contaminated signal with a SNR of 0 dB. Here the evaluation of 0 dB is based on the 100 time responses. ........ 242
Figure 6.8 100 spectrum magnitudes truncated from the previous 1358 spectral responses used as the spectrum network process pattern.
The solid line shows the true responses, while the dash-dot line denotes the contaminated signal with a SNR of 0 dB. The evaluation of 0 dB is based on the 820 time responses. ........ 243
Figure 6.9 Correlation gain population distribution for 68 stored time domain patterns with different data forms. ........ 247
Figure 6.10 Correlation gain population distribution, only considering gain scale, for 68 stored time domain patterns with different data forms. ........ 248
Figure 6.11 Correlation gain population distribution for 68 stored spectral process patterns with different data forms. ........ 252
Figure 6.12 Correlation gain population distributions, only considering gain scale, for 68 stored spectral process patterns with different data forms. ........ 253
Figure 6.13 Spectrum magnitude GI network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns. ........ 263
Figure 6.14 Spectrum magnitude GI network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns. ........ 264
Figure 6.15 Spectrum magnitude HCAM and HCAM-GI cascade network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns. ........ 265
Figure 6.16 Spectrum magnitude HCAM and HCAM-GI cascade network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns. ........ 266
Figure 6.17 Spectrum magnitude ECAM and ECAM-GI cascade network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns. ........ 267
Figure 6.18 Spectrum magnitude ECAM and ECAM-GI cascade network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns.
........ 268
Figure 6.19 Spectrum magnitude RCAAM/di22 and RCAAM/di22-GI cascade network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns. ........ 271
Figure 6.20 Spectrum magnitude RCAAM/di22 and RCAAM/di22-GI cascade network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns. ........ 272
Figure 6.21 Spectrum magnitude RCAAM/fi12 and RCAAM/fi12-GI cascade network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns. ........ 273
Figure 6.22 Spectrum magnitude RCAAM/fi12 and RCAAM/fi12-GI cascade network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns. ........ 274
Figure 6.23 Spectrum magnitude RCAAM/ad12 and RCAAM/ad12-GI cascade network performances with bipolar output forms coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns. ........ 275
Figure 6.24 Spectrum magnitude RCAAM/ad12 and RCAAM/ad12-GI cascade network generalization performances with bipolar output forms coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns. ........ 276
Figure 6.25 Performance comparisons of different time domain networks (7-level bipolar data, if used) vs. SNR for 68 contaminated stored patterns. ........ 280
Figure 6.26 Generalization performance comparisons of different time domain networks (7-level bipolar data, if used) vs. SNR for 64 contaminated unstored patterns. ........ 281
Figure 6.27 Performance comparisons of different spectrum magnitude process networks (5-level bipolar data, if used) vs. SNR for 68 contaminated stored patterns. ........
282
Figure 6.28 Generalization performance comparisons of different spectrum magnitude process networks (5-level bipolar data, if used) vs. SNR for 64 contaminated unstored patterns. ........ 283
Figure 6.29 Performance comparisons of different spectrum magnitude process networks (7-level bipolar data, if used) vs. SNR for 68 contaminated stored patterns. ........ 284
Figure 6.30 Generalization performance comparisons of different spectrum magnitude process networks (7-level bipolar data, if used) vs. SNR for 64 contaminated unstored patterns. ........ 285

CHAPTER 1
Introduction

Many interesting schemes have been proposed for radar target discrimination [1-16]. Particularly fascinating are those which use the transient response of the target. These include methods based on the aspect-independent late-time response [1-9], time domain imaging techniques [10-12], correlation [13-15] and wavelet transforms [16]. Many of these schemes use only the early-time specular target response, or the late-time resonant portion. For those techniques that use the entire waveform, problems arise in the amount of computer storage required and the time needed to process the measured response of an unknown target.

Most estimation procedures, from mean-squared linear estimation to nonlinear regression, require a rough mathematical model expressing the relationship between input and output; they are therefore model-based estimations. Radar target discrimination based on the early-time transient scattered field response cannot be achieved with a model-based method, since an accurate mathematical formula for the aspect-dependent scattering transform is unavailable in practice. The task may then be turned over to model estimation using adaptive estimation procedures.
However, an adaptive estimation requires prior information about the model order, the number of estimated parameters, and the rough function type, and becomes very difficult for high orders under noisy conditions. Hence this kind of model estimation is almost impossible.

As for pattern classification, the stochastic pattern recognition algorithms [17]-[20] can be divided into two categories: density/distribution estimation and pattern clustering. The parameter estimations (discriminant function, Maximum Likelihood and Bayesian estimations) require the density model for each scattered response, while the clustering algorithms (supervised/unsupervised/differential competitive-learning or c-means algorithm) require consistent and clear pattern grouping. Unfortunately, the early-time radar target scattered responses usually depend more strongly on aspect than on target in some aspect ranges, so the target clustering among aspect responses is ambiguous and inconsistent.

A neural network learns the given (or experienced) input-output associations through a synapse-interconnected black box; thus neural learning is from experience and is model-free. A model-free estimation does not need to estimate the mathematical shape of the presented task; instead, the neural learning occurring in the black box implicitly represents the experienced input-output associations in its own way. Although mathematical models can analyze a task in detail, humans typically act by instinct (or neural learning) rather than by sequential mathematical calculations. For example, nobody picks up a cup on a desk by using the complicated mathematical algorithm that a robot manipulator [21] undertakes. A child can finish the task without difficulty, while an adult may proceed in a more efficient and graceful way or in motion.
Compared to conventional methods, artificial neural networks are well suited to complicated systems for which deterministic methods are incapable or inefficient. Consider a practical problem: can a nonprofessional driver or a deterministic mathematical control system continuously and smoothly back up, from an arbitrary initial position, a long trailer (cab+trailer) along a given line in real time? In this problem, a deterministic model cannot consistently and effectively carry out the task, since each action cumulatively affects future actions and errors, and the trailer goes out of control if jackknife angles occur. Neural networks can be applied to those fields in which stochastic and deterministic methods may suffer difficulties.

Recently, neural networks have been used to perform target discrimination with reasonable storage requirements and rapid processing times [22, 23]. Much of this effort was based on simple back-propagation networks. This thesis will examine more sophisticated networks and demonstrate that target discrimination can be accomplished in a high-noise environment with great reliability. We will concentrate our research on the noise tolerance of various neural networks and on the processing efficiency.

The transient scattered field response of a radar target is aspect sensitive, but for an interrogating pulse of a given bandwidth, a discretization of aspect angle can be found for which changes are gradual from angle to angle. Therefore, we can store some specified aspect responses as the reference patterns for each target, design a neural network to memorize the association among reference patterns, and expect the network to converge correctly to some reference pattern when it is triggered at the input by a pattern that is sufficiently close to one of the reference patterns.
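The store-and-recall idea in the preceding paragraph can be illustrated with a minimal sketch. This is a plain correlation lookup, used here only as a stand-in; it is not one of the networks studied in this thesis. The random reference patterns, the noise level, and the names `normalize_energy` and `recall` are all hypothetical.

```python
import numpy as np

def normalize_energy(x):
    """Scale a pattern so that its energy (sum of squares) is 1."""
    x = np.asarray(x, dtype=float)
    return x / np.sqrt(np.sum(x ** 2))

def recall(probe, references, labels):
    """Return the label and index of the stored pattern most correlated with the probe."""
    probe = normalize_energy(probe)
    scores = references @ probe          # one correlation value per stored pattern
    best = int(np.argmax(scores))
    return labels[best], best

rng = np.random.default_rng(0)

# Hypothetical reference set: 4 targets x 17 aspects, 100 samples per pattern.
references = np.stack(
    [normalize_energy(rng.standard_normal(100)) for _ in range(68)]
)
labels = [k // 17 for k in range(68)]    # target index 0..3 for each stored aspect

# A lightly contaminated copy of stored pattern 20 (an aspect of target 1).
probe = references[20] + 0.05 * rng.standard_normal(100)
target, index = recall(probe, references, labels)
```

A probe close to a stored pattern recalls that pattern and hence its target. The networks examined in later chapters replace this one-shot lookup with trained or recurrent structures.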
Using the Michigan State University anechoic chamber, an HP-8720B network analyzer is used to perform stepped frequency measurements of four scale aircraft models, B52 (1:72), B58 (1:48), F14 (1:48) and TR1 (1:48). The targets are measured from 0° to 288° with an azimuthal aspect increment of 9°, resulting in 33 aspect measurements for each target. The frequency response spectra are calibrated using a 14" sphere as in [24] and taken into the time domain using the inverse fast Fourier transform (IFFT). We then select 17 time-domain responses from 0° to 288° with aspect increment 18° for each target as training/stored patterns, and 16 responses from 9° to 279° with the same increment as untrained/unstored network generality test patterns. Therefore, every untrained/unstored test pattern lies midway between two training/stored patterns.

We have used both time-response patterns and FFT spectrum-magnitude patterns to simulate target discrimination for each neural network. In time-response processing, the beginning response time is assumed known, and we extract the next 100 response points from the assumed beginning point as the aspect prototype. Noise is later added to this assumed segment to test network noise tolerance. The difficulty of locating the beginning response point in practice prompts the use of FFT frequency spectrum magnitudes as aspect process patterns, since a time shift is carried by the phase of the spectrum and leaves the magnitude unchanged. In bipolar input simulations, we quantize each time response using 7 numerical levels and encode these 7 levels using 3 bits. For spectrum magnitude processes, we simulate bipolar data formats with two quantization settings, 5 quantization levels and 7 quantization levels, encoded by 3 bits. More levels are used for the time responses, since they exhibit a much wider oscillation range than the spectral magnitudes.
Five levels suffice to express response variations for the spectral magnitudes, since they have a small dynamic oscillation range.

In chapter 2, the target-aspect scattering measurement procedure is given and the data preprocessing is described. We then present the theory for several different neural networks for target discrimination. In chapter 3, the Multi-Layer Feedforward with Error Backpropagation (ML/BP) network and its backpropagation learning algorithm are presented. The Generalized Inverse (GI) algorithm and its iterative network learning procedure are also given in chapter 3. In chapter 4, we discuss Recurrent Correlation Associative Memories (RCAM), and analyze the High Order Recurrent Correlation Associative Memory (HCAM) and the Exponential Correlation Associative Memory (ECAM). The advantage of the RCAM-GI cascade network is investigated. In chapter 5, we propose a new network structure, the Recurrent Correlation Accumulation Adaptive Memory (RCAAM), which uses a dynamic memory structure to accumulate the converging information and has a stability criterion that allows spurious states either to stay as unknowns or to converge to one of the stored patterns. We describe it as a real-time adaptive learning network with contamination observability and flexible decision strategy. The RCAAM discriminates as well as the ECAM, always outperforms the HCAM with low orders, and requires much less processing space than the ECAM. An estimate formula for target crosscorrelation gain is developed, and a correlation-based discrimination resolution comparison between bipolar data and analog data is undertaken. In chapter 6, we introduce the FFT spectrum magnitude process to relieve the time-shift inconsistency occurring in the time responses. A modified process is constructed for analog spectrum process networks.
We summarize the neural networks used in this thesis and compare them to some popular ones, and also compare the estimated target discrimination resolutions among all data formats used in the time domain and spectrum magnitude processes. We briefly discuss implementation complexities and conclude the thesis in chapter 7.

CHAPTER 2
Measurements and Data Preprocessing

2.1 Introduction

To determine the transient backscattered field response of a practical radar target, the Michigan State University anechoic chamber and an HP-8720B vector network analyzer are used. Four scale aircraft models, B52 (1:72), B58 (1:48), F14 (1:48) and TR1 (1:48), are used to perform stepped frequency measurements. The measurement and calibration processes are given in the next section. The backscattered time responses obtained via the inverse Fourier transform from the measured spectral responses, and the data preprocessing for time domain networks, are then described in section 2.3.

2.2 Measurements [24]

Since an impulse response can help us to understand the target's structure, a synthesized short duration pulse, an approximation to an impulse, is used in the anechoic chamber measurements. A desired temporal impulse is difficult to generate consistently in the time domain, so we measure spectral responses and then synthesize temporal responses. From Fourier frequency spectrum analysis and synthesis, we know that an impulse in the time domain has a uniform frequency spectrum covering the whole frequency band. Therefore, a vector network analyzer capable of measuring both spectral magnitude and phase allows us to synthesize a short duration pulse through a wide frequency sweep. The frequency domain measurement system we use is shown in Figure 2.1.
The HP-8720B network analyzer is programmed to generate swept-frequency waves to emulate a temporal impulse. Since the measurement is conducted in an anechoic chamber, a system calibration is required for accurate results.

[Figure 2.1 Frequency domain transient measurement system. Labels: Transmit Antenna, Receive Antenna, Foam Pedestal, Anechoic Chamber, Vector Network Analyzer.]

[Figure 2.2 Block diagram for the anechoic chamber measurement system. Labels: Source (Pulse or CW), Transmitting Antenna, Transmitted Field, Antenna Coupling, Mutual Interactions, Chamber Clutter, Received Field, Receiving Antenna, Noise, Receiver.]

First the block diagram for the measurement system is presented in Figure 2.2. $H_a(f)$ denotes the transfer function of the direct coupling from the transmit antenna to the receive antenna, while $H_c(f)$ represents the transfer function of the coupling from the transmit antenna to the receive antenna via the anechoic chamber, antenna supports, and target mount. $H_r(f)$ and $H_t(f)$ are the transfer functions of the receive and transmit antennas from the transmission line into the free-field environment, while $E(f)$ represents the spectral content of the pulse or CW source. The background environment includes the paths $H_a(f)$ and $H_c(f)$. Thus we can model the background measurement by

$$R^{b}(f) = E(f)\,H_t(f)\,[H_a(f)+H_c(f)]\,H_r(f) + N^{b}(f) = S(f)\,[H_a(f)+H_c(f)] + N^{b}(f) \qquad (1)$$

where $S(f)$ is the system transfer function, $S(f) = H_r(f)\,H_t(f)\,E(f)$, and $N^{b}(f)$ is background random noise. If we measure some target, $t$, then we have

$$R^{tb}(f) = S(f)\,[H_a(f)+H_c(f)+H_s^{t}(f)+H_{sc}^{t}(f)] + N^{tb}(f) \qquad (2)$$

where $H_s^{t}(f)$ is the target scattering transfer function and $H_{sc}^{t}(f)$ is the multi-interaction term between the target and the chamber. Subtracting the background measurement gives

$$R^{t}(f) = R^{tb}(f) - R^{b}(f) = S(f)\,[H_s^{t}(f)+H_{sc}^{t}(f)] + N^{t}(f) \qquad (3)$$

where $N^{t}(f) = N^{tb}(f) - N^{b}(f)$. From the above equations, we cannot solve for the desired target scattering response, $H_s^{t}(f)$, even if the multi-interaction term $H_{sc}^{t}(f)$ is ignored, because the system transfer function is still not available. Thus a calibration measurement is required to find the system transfer function $S(f)$.
In the calibration procedure, we need to use an object which has a known transfer function, $H_s^{c}(f)$. Then the calibration measurement gives

$$R^{cb}(f) = S(f)\,[H_a(f)+H_c(f)+H_s^{c}(f)+H_{sc}^{c}(f)] + N^{cb}(f) \qquad (4)$$

where $H_s^{c}(f)$ is the known transfer function of the calibration object, $H_{sc}^{c}(f)$ is the transfer function of the multi-interaction between the calibration object and the anechoic chamber, and $N^{cb}(f)$ is the random noise spectrum. Again, the multi-interaction term $H_{sc}^{c}(f)$ is causal. We have used a 14" sphere for calibration, since sphere scattering has the well-known solution given by the Mie theory. The frequency response of a sphere can be obtained in both magnitude and phase over a wide frequency band. The sphere response is also aspect-independent, so we avoid a separate calibration over aspect angle. Then we have

$$R^{c}(f) = R^{cb}(f) - R^{b}(f) = S(f)\,[H_s^{c}(f)+H_{sc}^{c}(f)] + N^{c}(f) \qquad (5)$$

where $N^{c}(f) = N^{cb}(f) - N^{b}(f)$. Since $H_{sc}^{c}(f)$ represents the multi-path interaction between the calibration object and the chamber, the first temporal response resulting from $H_{sc}^{c}(f)$ lags clearly behind the end of the temporal response of $H_s^{c}(f)$. Therefore, a time gating window can be used to eliminate the multi-path interaction portion of the temporal response of $R^{c}(f)$.
First the time response of the calibration measurement can be calculated by

$$r^{c}(t) = \mathcal{F}^{-1}\{R^{c}(f)\} \qquad (6)$$

Since the temporal responses resulting from multi-path interaction between the calibration object and the chamber are sufficiently delayed beyond the end of the calibration object response, a time gate window function, $w(t)$, can be used to exclude the multi-path interaction portion as follows:

$$r^{cw}(t) = r^{c}(t)\,w(t) \qquad (7)$$

If $\mathcal{F}^{-1}\{S(f)\,H_s^{c}(f)\}$ is time limited and $w(t)$ is well designed, then $r^{cw}(t)$ will be the time response corresponding to the spectral response $S(f)\,H_s^{c}(f)$, that is

$$R^{cw}(f) = \mathcal{F}\{r^{cw}(t)\} = S(f)\,H_s^{c}(f) \qquad (8)$$

Since $H_s^{c}(f)$ is known and

$$R^{cw}(f) = \mathcal{F}\{r^{cw}(t)\} = \mathcal{F}\{w(t)\,\mathcal{F}^{-1}\{R^{c}(f)\}\} \qquad (9)$$

can be calculated, $S(f)$ can be represented as follows:

$$S(f) = \frac{R^{cw}(f)}{H_s^{c}(f)} \qquad (10)$$

After the system transfer function is found from the calibration procedure, $R^{t}(f) = S(f)\,[H_s^{t}(f)+H_{sc}^{t}(f)] + N^{t}(f)$ gives

$$H_s^{t}(f)+H_{sc}^{t}(f) = \frac{R^{t}(f)}{S(f)} \qquad (11)$$

if noise is ignored. Therefore,

$$h_s^{t}(t)+h_{sc}^{t}(t) = \mathcal{F}^{-1}\{H_s^{t}(f)+H_{sc}^{t}(f)\} = \mathcal{F}^{-1}\left\{\frac{R^{t}(f)}{S(f)}\right\} \qquad (12)$$

If the target impulse response is approximately time limited, and the multi-path interaction is delayed beyond the end of the target impulse response, then a time gate window, $w^{t}(t)$, can be used to isolate the target impulse response as follows:

$$h_s^{t}(t) = [h_s^{t}(t)+h_{sc}^{t}(t)]\,w^{t}(t) \qquad (13)$$

In our measurements, the effective operating band of the antennas is 1-7 GHz, and the sweep frequency step size is set to 0.01 GHz. Therefore, we obtain 601 measured spectral responses, including magnitude and phase, from 1 to 7 GHz. Since the antenna operating frequency band is restricted to 1-7 GHz in our measurement, we use a weighting function, the Gaussian Modulated Cosine, to window the effective measurement band and eliminate band edge discontinuities.

2.3 Data Preprocessing

To have a small time increment in the temporal responses after the inverse Fourier transform, the spectral responses are expanded to 4096 samples by attaching (4096-601) zeros to the measured and weighted spectral responses.
Therefore, 99 zeros are artificially assigned to the band 0.01 GHz to 0.99 GHz and 3396 zeros are artificially assigned to the band 7.01 GHz to 40.96 GHz. Since a real time waveform has a conjugate-symmetric Fourier spectrum, the 4096 spectral response points are expanded to 8192 samples by using their conjugate partners. Then we use an 8192-point Inverse Fast Fourier Transform (IFFT) to transform the 8192 spectral response points into 8192 temporal response points for each aspect measurement. After the IFFT, the temporal responses have a duration of 100 ns and a sample spacing of 0.01221 ns. As analyzed above, the responses from multi-path interaction between target and chamber are clearly delayed beyond the direct backscattered responses from the target, so we use a time gate window to extract the effective 820 temporal response points. Figure 2.3 shows the 820 backscattered response points after the IFFT for the B52 target at an aspect angle of 0°.

The scale aircraft models are around 40 cm long, and thus the early-time backscattered response duration is the time the wave takes to travel twice the aircraft length in the propagation direction, (40 cm x 2)/(30 cm/ns) = 2.67 ns if the aspect angle is 0°. This estimated early-time backscattered response duration for a 40 cm scale aircraft model covers about 219 temporal response points. Since we want to use 100 time samples as the network analog pattern, we double the time sample spacing so that the new time sample spacing is 0.02442 ns and the response duration of the 100 new time samples is 2.442 ns. At 0° aspect angle the 100 new temporal responses cover about 3/4 of the B52's early-time responses, while they cover the whole TR1 response. Since the beginning time of a target response is difficult to locate in practice, we design a simple detection algorithm to find the beginning of a response.
When the beginning response time is decided, we pick 100 time response points, starting from the detected beginning of the response, as the network aspect prototype. With changing aspect angle and noise contamination, our simple detection is not consistent among aspect angles. However, we still use these truncated patterns, with inconsistency for some aspect ranges, as stored aspect response prototypes to emulate the practical situation. An example of the inconsistencies in our time domain stored aspect patterns will be given in chapter 6, Target Discrimination using Neural Network with Spectrum Magnitude Response. To process target discrimination systematically, the 100 time response points are normalized to an energy of 1. Figure 2.4 shows the 100 normalized time response points acting as a network analog pattern for the B52 target at an aspect angle of 0°.

In this thesis, each target is measured at aspect angles from 0° to 288° with an azimuthal increment of 9°. Therefore each target has 33 aspect measurements. We then select 17 aspect responses from 0° to 288° with aspect increment 18° for each target as network training/stored aspect patterns, and 16 aspect responses from 9° to 279° with the same increment as untrained/unstored network generalization test patterns. Therefore, every untrained/unstored test pattern lies midway between two training/stored aspect patterns. Figure 2.5 shows the 68 100-sample aspect time responses, truncated from 68 820-sample time-gated IFFT responses, for 4 scale aircraft models. Figure 2.6 presents the 68 normalized aspect time responses serving as the 68 aspect stored patterns for time domain networks. The details of the bipolar coding schemes and the data processing of the spectrum magnitude networks will be discussed in later chapters.
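The preprocessing steps of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used for the thesis: the unmeasured DC bin is assumed to be zero, a plain Gaussian weight stands in for the Gaussian Modulated Cosine window, the input spectrum is synthetic (a pure 5 ns delay), and the onset is picked by hand rather than by the detection algorithm. All function names are hypothetical.

```python
import numpy as np

DF_GHZ = 0.01                      # sweep step from the text
N_MEAS = 601                       # measured points, 1.00 to 7.00 GHz
N_HALF = 4096                      # one-sided length after zero padding
N_FFT = 2 * N_HALF                 # 8192-point inverse transform
DT_NS = 1.0 / (N_FFT * DF_GHZ)     # time sample spacing in ns

def pad_spectrum(measured):
    """Embed the measured band in 4096 bins spanning 0.01 to 40.96 GHz."""
    padded = np.zeros(N_HALF, dtype=complex)
    padded[99:99 + N_MEAS] = measured   # 99 zeros below, 3396 zeros above
    return padded

def to_time_domain(padded):
    """Invert a one-sided spectrum; irfft supplies the conjugate half.

    Assumption: the DC bin, which is not measured, is set to zero.
    """
    one_sided = np.concatenate(([0.0], padded))   # DC bin + 4096 bins
    return np.fft.irfft(one_sided, n=N_FFT)

def extract_pattern(response, onset, n_samples=100, step=2):
    """Take every second sample from `onset` and scale to unit energy."""
    segment = response[onset:onset + step * n_samples:step]
    return segment / np.sqrt(np.sum(segment ** 2))

# Synthetic stand-in for one calibrated measurement (hypothetical values).
freqs = 1.0 + DF_GHZ * np.arange(N_MEAS)
weights = np.exp(-0.5 * ((freqs - 4.0) / 1.5) ** 2)
measured = weights * np.exp(-2j * np.pi * freqs * 5.0)   # 5 ns delay

response = to_time_domain(pad_spectrum(measured))
peak = int(np.argmax(np.abs(response)))
pattern = extract_pattern(response, onset=peak - 10)

# Aspect angles (degrees) of stored and unstored patterns for one target.
stored_aspects = np.arange(0, 289, 18)
unstored_aspects = np.arange(9, 280, 18)
```

With a 0.01 GHz step and an 8192-point transform, `DT_NS` reproduces the 0.01221 ns spacing quoted above, and taking every second sample gives the 0.02442 ns spacing of the 100-sample pattern.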
[Figure 2.3 Time gated backscattered responses of B52, at aspect angle 0°, after 8192-point IFFT. Axes: Time Response vs. Time Sample (with sample spacing 0.01221 ns).]

[Figure 2.4 Normalized time response of B52, at aspect angle 0°, serves as a network analog pattern. Axes: Normalized Time Response vs. Time Sample (with sample spacing 0.02442 ns).]

[Figure 2.5 68 100-sample aspect time responses, truncated from 68 820-sample time-gated IFFT responses, for 4 scale aircraft models. Axis: Time Sample (with spacing 0.02442 ns).]

CHAPTER 3
Multi-Layer Feedforward with Error Backpropagation and Generalized Inverse Networks

3.1 Introduction

In this chapter, we discuss two training-based artificial neural networks: Multi-layer Feedforward with Error Backpropagation (ML/BP) networks and the Generalized Inverse (GI) network. The limitations of the perceptron were pointed out by Minsky and Papert [25] in 1969. Perceptron learning can behave poorly with nonseparable data, and it stops learning at the point where a critical solution is reached. Therefore, perceptron learning will only barely work. Widrow and Hoff [26] developed a model for ADALINE (ADAptive LINear Elements), an important variation of perceptron learning. This model uses the Widrow-Hoff rule, or LMS (least mean square) algorithm, to eliminate the oscillation caused by nonseparable learning examples and to approximate the solution in the least mean square sense.
However, this solution is not necessarily the optimal one.

3.2 Multi-layer Feedforward with Error Backpropagation Networks

The Multi-layer Feedforward with Error Backpropagation (ML/BP) networks are among the best known, most widely applied, effective and flexible artificial neural networks. The backpropagation learning ideas were rediscovered independently by different workers (Werbos [27]; Parker [28]; Rumelhart, Hinton, and Williams [29]). Strictly speaking, backpropagation is a learning algorithm, not a type of network. Currently backpropagation is the most important and most widely used algorithm for hard connectionist learning.

Essentially the ML/BP network involves two phases. The first phase, the forward phase, occurs when the input is presented and propagated forward through the network to obtain a value for each processing element. Then all current outputs are compared with the desired outputs, and the differences, or errors, are computed. In the second phase, the backward phase, the recurring difference computation from the first phase is performed in the backward direction.

Generally the ML/BP net is composed of a hierarchy of processing units, organized in a series of two or more mutually exclusive sets of artificial neuron layers. Essentially this feedforward network uses a refinement of the Widrow-Hoff technique, which calculates the difference between the actual outputs and the desired outputs. The weights are changed in proportion to the error times the input. Therefore, every learning neuron requires its inputs, its outputs and its desired outputs. However, this is hard to do with hidden units, since we do not know how much responsibility a particular hidden unit should take. To solve this problem, the neuron learnings are run backward, so that we can tell how strongly a particular neuron is connected. Each hidden learning neuron's error is a weighted sum of the errors in the successive layer.
Therefore, the forward phase is used to estimate the error, and the backward phase then modifies the weights, based on the backpropagation algorithm, so that the error decreases. Figure 3.1 illustrates a typical ML/BP network structure. $U$ represents the input layer with $n$ inputs, while $H_I$ and $H_J$ represent hidden layers with $I$ neuron units in $H_I$ and $J$ neuron units in $H_J$, respectively, and $O_K$ denotes the output layer with $K$ outputs. $W_{ij}$ represents the interconnecting weights between the hidden layers $H_I$ and $H_J$, while $W_{jk}$ denotes the interconnecting weights between the hidden layer $H_J$ and the output layer $O_K$. In Figure 3.2, (a) shows the artificial neuron unit model used in Figure 3.1, and (b) illustrates the nonlinear activation function used in an artificial neuron for bipolar processing.

Figure 3.1 Structure of a multi-layer feedforward with error backpropagation network.

Figure 3.2 (a) An artificial neuron unit model. The $O_i$'s are outputs from the previous layer $H_I$, $B_j$ is the bias for neuron $j$ and the $W_{ij}$'s are interconnecting weights between layer $H_I$ and the neuron $j$ of layer $H_J$. (b) Activation function, $g(x)$, used for bipolar process.

3.2.1 Gradient Descent Rule

The gradient descent rule is an important part of most connectionist learning algorithms, especially mean squared error and backpropagation. It is a mathematical approach to minimizing the error between actual and desired outputs. The weights are modified by an amount proportional to the first derivative of the error function with respect to the weight. The Widrow-Hoff (or Delta) rule is one example of a gradient descent rule. Suppose we have a network weight $W = (w_1, \ldots, w_n)$ for a single cell problem, and it produces a differentiable measure of error $E(W)$. Since the error function is differentiable, we can compute its derivative (or gradient) vector

$$\nabla E = \left( \frac{\partial E}{\partial w_1}, \ldots, \frac{\partial E}{\partial w_n} \right) \tag{1}$$

at any point of weight space.
This gradient vector gives the direction in the weight space along which the error increases most rapidly when an infinitesimally small weight change is made. Therefore a sufficiently small step along $-\nabla E(W)$ reduces the error. This leads to a learning algorithm in which the weights, $W$, continue to converge by evaluating $-\nabla E(W)$ and taking small steps in that direction until some convergence criterion is reached. So the gradient descent learning rule can be formulated as follows, with a small positive learning rate $\eta$:

$$W_{\text{updated}} = W - \eta \nabla E(W) \tag{2}$$

3.2.2 ML/BP Learning Mechanism

The ML/BP network is a supervised net, so it needs a training (or example) set, consisting of an input pattern set and the corresponding desired output set, to begin learning. The network architecture learns to adapt to the training set by using the errors backpropagated layer by layer to adjust its interconnection weights. As we can observe from Figure 3.1, the network interconnects any two adjacent layers by synapses and each layer has its artificial neurons, but there are no interconnection synapses between any two neurons in the same layer. The learning adjusts the synapse weights connecting the neurons of any two adjacent layers, working in the backward direction. The network iteratively continues its learning pattern by pattern until the error criterion is reached; thus a well trained network will satisfy the given training pattern set with an error less than the required criterion.

Suppose an ML/BP network has an output layer $O_K$ with $K$ neurons, a first backward hidden layer $H_J$ with $J$ neurons and a second backward hidden layer $H_I$ with $I$ neurons. Let $T_k^r$ denote the $k$th target (or desired) output bit for the training pattern $r$, $W_{jk}$ denote the synapse weight connecting the $j$th neuron of hidden layer $H_J$ to the $k$th neuron of the output layer $O_K$, and $W_{ij}$ denote the synapse weight connecting the $i$th neuron of hidden layer $H_I$ to the $j$th neuron of hidden layer $H_J$.
Then the network has

$$O_k^r = G_\beta\Big( \sum_{j=1}^{J} W_{jk} H_j^r \Big), \qquad H_j^r = G_\beta\Big( \sum_{i=1}^{I} W_{ij} H_i^r \Big) \tag{3}$$

The activation function shown in Figure 3.2 is used here for bipolar $[-1, 1]$ processing,

$$G_\beta(x) = \frac{2}{1 + \exp(-\beta x)} - 1 \tag{4}$$

and this kind of nonlinear sigmoid function is frequently used to emulate the two states of a biological neuron, activated or inactive. We then define an error function for training pattern $r$ as follows,

$$E^r = \sum_{k=1}^{K} E_k^r = \frac{1}{2} \sum_{k=1}^{K} (T_k^r - O_k^r)^2 \tag{5}$$

We have now calculated the error between the actual network output, $O^r$, and the target output, $T^r$, for the training pattern $r$. Once the error is available, the gradient descent rule [20][29][30] for the output unit $k$ with respect to the weight $W_{jk}$ is given by

$$\Delta W_{jk} = -\eta \frac{\partial E^r}{\partial W_{jk}} = -\eta (T_k^r - O_k^r) \Big[ -\frac{\partial O_k^r}{\partial W_{jk}} \Big] = \eta H_j^r \Big[ G_\beta'\Big( \sum_{j=1}^{J} W_{jk} H_j^r \Big) (T_k^r - O_k^r) \Big] = \eta H_j^r \delta_k^r \tag{6}$$

where

$$\delta_k^r = G_\beta'\Big( \sum_{j=1}^{J} W_{jk} H_j^r \Big) (T_k^r - O_k^r) \tag{7}$$

Compared to the Widrow-Hoff rule, the delta, $\delta_k^r$, can be regarded as the error backpropagated through the nonlinear activation from output unit $k$. Since all neurons $j$, $j = 1, \ldots, J$, in hidden layer $H_J$ are connected to the output neuron $k$ in our model, all the hidden neurons in layer $H_J$ share responsibility for the error occurring at the output unit $k$. The backpropagation algorithm proposes that the error occurring at any output unit $k$ is broadcast to the neurons of the previous layer that are connected to unit $k$. Therefore the gradient descent update rule for the interconnecting weight $W_{ij}$ between hidden layers $H_I$ and $H_J$ is given by

$$\Delta W_{ij} = -\eta \frac{\partial E^r}{\partial W_{ij}} = -\eta \frac{\partial H_j^r}{\partial W_{ij}} \frac{\partial E^r}{\partial H_j^r} \tag{8}$$

where

$$\frac{\partial H_j^r}{\partial W_{ij}} = G_\beta'\Big( \sum_{i=1}^{I} W_{ij} H_i^r \Big) H_i^r \tag{9}$$

$$\frac{\partial E^r}{\partial H_j^r} = -\sum_{k=1}^{K} (T_k^r - O_k^r) \frac{\partial O_k^r}{\partial H_j^r} = -\sum_{k=1}^{K} (T_k^r - O_k^r)\, G_\beta'\Big( \sum_{j=1}^{J} W_{jk} H_j^r \Big) W_{jk} = -\sum_{k=1}^{K} W_{jk} \delta_k^r \tag{10}$$

Therefore we have

$$\Delta W_{ij} = \eta H_i^r\, G_\beta'\Big( \sum_{i=1}^{I} W_{ij} H_i^r \Big) \Big( \sum_{k=1}^{K} W_{jk} \delta_k^r \Big) = \eta H_i^r\, G_\beta'(\tilde{H}_j^r)\, E_j^r = \eta H_i^r \delta_j^r \tag{11}$$

where $\tilde{H}_j^r = \sum_{i=1}^{I} W_{ij} H_i^r$ is the weighted input to neuron $j$, so that

$$H_j^r = G_\beta(\tilde{H}_j^r), \qquad E_j^r = \sum_{k=1}^{K} W_{jk} \delta_k^r \tag{12}$$

$$\delta_j^r = G_\beta'(\tilde{H}_j^r) \Big( \sum_{k=1}^{K} W_{jk} \delta_k^r \Big) \tag{13}$$

In equation (11), $E_j^r$ denotes the errors backpropagated to neuron $j$ of hidden layer $H_J$ through the interconnecting synapse weights from all output neurons. Since there are no explicit desired outputs from which to calculate an error for the hidden layer, this error backpropagated from all neurons of the succeeding layer can be treated as a measure of the current error for this hidden layer. Again $\delta_j^r$ represents the error backpropagated through the nonlinear activation function.

3.3 Gaussian Noise Generation

In the previous chapter we obtained, for each target, 33 normalized azimuthal aspect responses of 100 analog sample points, from 0° to 28.8° with an aspect increment of 0.9°. Seventeen of them, from 0° to 28.8° with an aspect increment of 1.8°, are selected as the training pattern set, while the remaining 16, from 0.9° to 27.9° with the same increment, are kept untrained. In this thesis, we simulate all neural networks using the software package Matlab 4.0. To test network performance, we add noise to the test inputs before presenting them to the network. In order to have a basis for comparing network performances, we define a signal-to-noise ratio (SNR) in dB. Suppose we have a continuous input pattern $s = (s_1, s_2, \ldots, s_{100})$; then we calculate its average signal power by

$$\sigma_s^2 = \frac{1}{100} \sum_{m=1}^{100} s_m^2 \tag{15}$$

If the 100 added noise samples, $N = (N_1, N_2, \ldots, N_{100})$, have average noise power

$$\sigma_N^2 = \frac{1}{100} \sum_{m=1}^{100} N_m^2 \tag{16}$$

then we define the SNR and the SNR in dB as follows:

$$\mathrm{SNR} = \frac{\sigma_s^2}{\sigma_N^2}, \qquad \mathrm{SNR\ (dB)} = 10 \log_{10} \frac{\sigma_s^2}{\sigma_N^2} \tag{17}$$

We simulate a noise set with SNR = snr by using Matlab's random noise generator to generate a Gaussian noise set, $N(0, \sigma_G)$, with zero mean and variance $\sigma_G$ given by

$$\sigma_G = \frac{\sigma_s^2}{\mathrm{snr}} \tag{18}$$

Ten noisy circumstances with SNRs of [40, 30, 20, 14, 10, 6, 3, 0, -3, -6] dB have been simulated for each network to test the network noise tolerances. The solid line of Figure 3.3 shows the 100 time responses of B52 used for network processing at azimuthal aspect 0°, while the dash-dot line denotes the same response with added noise at an SNR of 10 dB.
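The noise generation of this section can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the Matlab code used in the thesis: it uses Python's `random.gauss` in place of Matlab's generator, takes the noise variance as the average signal power divided by the linear SNR, and the test signal and function names are hypothetical.

```python
import math
import random

def average_power(x):
    """Mean of the squared samples, as in the average power definitions."""
    return sum(v * v for v in x) / len(x)

def add_noise(signal, snr_db, rng):
    """Return signal plus zero-mean Gaussian noise at the requested SNR (dB)."""
    snr = 10.0 ** (snr_db / 10.0)                  # linear SNR from dB
    noise_variance = average_power(signal) / snr   # signal power over snr
    sigma = math.sqrt(noise_variance)              # gauss() expects a std. dev.
    return [s + rng.gauss(0.0, sigma) for s in signal]

def normalize_energy(x):
    """Scale a sequence to unit energy (sum of squares equal to 1)."""
    scale = 1.0 / math.sqrt(sum(v * v for v in x))
    return [v * scale for v in x]

rng = random.Random(0)                             # fixed seed for repeatability
clean = normalize_energy([math.sin(0.2 * n) for n in range(100)])
noisy = normalize_energy(add_noise(clean, 10.0, rng))
```

With only 100 samples, the SNR measured on any single noisy pattern fluctuates around the requested value, which is one reason to average network performance over repeated trials.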
A noisy response with an SNR of 0 dB is shown in Figure 3.4, while one with -3 dB is shown in Figure 3.5. After noise is added to the test signal, the noisy signal is normalized again before it is processed. Network performance in each case is simulated 10 times, and the average is used as the resultant performance.

Figure 3.3 100 time responses of B52 used for network process at azimuthal aspect 0°. Solid line denotes responses without noise, while the dash-dot line denotes the one with added noise of SNR = 10 dB.

Figure 3.4 100 time responses of B52 used for network process at azimuthal aspect 0°. Solid line denotes responses without noise, while the dash-dot line denotes the one with added noise of SNR = 0 dB.

Figure 3.5 100 time responses of B52 used for network process at azimuthal aspect 0°. Solid line denotes responses without noise, while the dash-dot line denotes the one with added noise of SNR = -3 dB.

3.4 ML/BP Network Trainings and Performances for Radar Target Discrimination

In this section, we implement two ML/BP nets, one without a hidden layer and the other with one hidden layer, and train the two networks by using 17 time-domain azimuthal aspect responses for each target. Four targets are used for network training, so a total of 68 time-domain azimuthal aspect responses are used. Here we use continuous inputs but bipolar outputs for network processing. Four output patterns serve as the desired target codes.
They are: [1 -1 -1 -1] for B52, [-1 1 -1 -1] for B58, [-1 -1 1 -1] for F14 and [-1 -1 -1 1] for TR1. We then test the networks' performance by adding Gaussian noise at 10 different SNRs to the training patterns, and test the networks' generalization capability by adding noise to the 16 untrained azimuthal aspect responses for each target.

3.4.1 ML/BP Without Hidden Layer

There is no hidden layer in this network. Since we use continuous input values and a bipolar output form, the network has 100 inputs in its input layer and four units at the output layer. The network is trained with the set of 68 training patterns for the four targets to learn the artificial synaptic interconnections between the input and output layers. This is equivalent to saying that a 100 x 4 matrix of network synaptic weights is to be learned. We randomly initialize the network weights by a Gaussian function with norm 1, and then multiply the initial weights by 0.2, since the activation function will saturate if the input value to the function is large. Figure 3.6 shows two activation functions, $G_\beta(x)$, with different activation parameters, while their first derivatives (or slopes), $dG_\beta(x)/dx$, are shown in Figure 3.7, with the solid line for $\beta = 1$ and the dashed line for $\beta = 3$. For example, if the nonlinear sigmoid function has activation parameter $\beta = 2$ and an input value of 2, then the activation function has output $G_2(2) = 0.964$ and slope $G_2'(x)|_{x=2} = 0.0707$. The gradient descent learning rule working on the backpropagated error becomes idle when the derivative is near 0, even though the error is as large as -2 or 2. Therefore, the initial weights must be small enough to keep the weighted sums presented to the neurons of the next layer out of saturation, if learning in this layer is expected from the beginning. We present the training patterns in random order to avoid being trapped in local minima.
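The two values quoted above can be checked numerically. The sketch below assumes that the bipolar sigmoid of equation (4) has the form $G_\beta(x) = 2/(1 + \exp(-\beta x)) - 1$, which reproduces the quoted figures; the function names are illustrative only.

```python
import math

def activation(x, beta):
    """Bipolar sigmoid G_beta(x) = 2 / (1 + exp(-beta * x)) - 1."""
    return 2.0 / (1.0 + math.exp(-beta * x)) - 1.0

def activation_slope(x, beta):
    """First derivative dG_beta/dx = (beta / 2) * (1 - G_beta(x)**2)."""
    g = activation(x, beta)
    return 0.5 * beta * (1.0 - g * g)

print(round(activation(2.0, 2.0), 3))        # 0.964
print(round(activation_slope(2.0, 2.0), 4))  # 0.0707
```

Writing the slope in terms of the output itself makes the saturation effect explicit: as the output approaches 1 or -1, the factor $1 - G_\beta^2$ and hence the weight update tend to zero.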
Momentum has also been used to increase learning efficiency and reduce the chance of being trapped in local minima.

Figure 3.6 Activation functions, $G_\beta(x)$, with different activation parameters. Solid line denotes the one with activation parameter $\beta = 1$, while the dash line denotes the one with parameter $\beta = 3$.

Figure 3.7 The first derivatives, $d[G_\beta(x)]/dx$, of the activation function with different activation parameters. Solid line denotes the one with activation parameter $\beta = 1$, while the dash line denotes the one with parameter $\beta = 3$.

The learnings of different patterns compete with one another. For example, if a training pattern, $p$, has 1 as its $k$th desired output bit, then the learning for this specific training pattern is expected to achieve

$$G_\beta\Big( \sum_{n=1}^{100} u_n^p W_{nk} \Big) \to 1 \tag{19}$$

At the same time the weights, $W_{nk}$, are adjusted by the other 67 training patterns to satisfy their respective desired outputs. Therefore, if the learnings of most training patterns adjust the weights in the direction opposite to that required by training pattern $p$, then the iterative learning epochs may drive the $k$th output of training pattern $p$ far from its desired value, say toward -1, before those dominant learning patterns converge to their respective desired outputs within the correct saturation regions (or convergence criteria). To avoid this phenomenon, we train the network by initially using a larger activation parameter, then using a smaller one after most training patterns converge. For example, if the $k$th output neuron initially uses an activation parameter of 1 and has an input value of -2 for the training pattern $p$, i.e.

$$\sum_{n=1}^{100} u_n^p W_{nk} = -2 \tag{20}$$

then the network currently has $G_1(-2) = -0.7616$ instead of the desired value, 1, at the $k$th output corresponding to the training pattern $p$.
According to the gradient descent rule, the network learning would have reduced the error $1 - (-0.7616)$ to a smaller value. But the dominant learnings coming from the other 67 training patterns may adjust the weights for output unit $k$, $W_{nk}$, in a direction opposite to that required by pattern $p$, and then force the input to output neuron $k$ for pattern $p$ deep into the wrong area, more negative than -2, instead of moving it toward the positive side. From equation (4) or Figure 3.7, we find that the first derivative of the activation function with parameter $\beta = 1$, $G_\beta'(x)$, is larger than 0.051 for inputs within the interval $[-3.6, 3.6]$. This means that the dominant learnings will not converge until their respective inputs to the output neurons lie farther than 3.6 from the origin. The adjusted weights might have pushed the input of output neuron $k$ for the training pattern $p$ deep into a more negative area, below -4, by the time the dominant learnings converge. The successive learnings for the training pattern $p$ then have no chance to pull its wrong output back to the correct side: according to the gradient descent learning rule, a learning step on the weights connected to output neuron $k$ for the training pattern $p$ has $\Delta W_{nk} = \eta\, u_n^p\, G_\beta'(x)\,(T_k^p - O_k^p)$ with $G_2'(x)|_{x=-4} = 0.0013 \ll 1$, so the learning is idle even though its error, $(T_k^p - O_k^p) = 1 - G_2(-4) = 1.9993$, is much larger than the errors of the converged patterns. Therefore we initially use a parameter $\beta$ around 2 or 3 to start the training, and then use a smaller one to ensure continued learning for the few nonconvergent training patterns after most training patterns converge.

The training consists of two sections. The first training section uses a hard limiter to determine whether all training patterns converge to their respective desired outputs. The hard limiter function is a simple Signum (Sign) function,

$$\mathrm{Sign}(v) = \begin{cases} 1, & v > 0 \\ -1, & v < 0 \end{cases} \tag{21}$$

for bipolar process.
When this convergence criterion is reached, all uncontaminated training patterns can be correctly discriminated by the network with a Sign threshold function. However, this does not indicate tolerance to contamination. For example, it may occur that the last convergent training pattern, which has a desired value of 1 for output neuron $k$, converges with an input value of 0.001 and stops. This critically convergent weight will not correctly serve a slightly distorted pattern whose weighted sum (or input) to neuron $k$ shifts toward the negative side by more than 0.001. Therefore, extra training is required to move all training patterns far away from the origin and thereby increase the network tolerance. However, it may affect only the last few convergent training patterns. The first training section takes 164 epochs to converge, and another 1012 extra training epochs follow.

Figure 3.8 (a) shows the performance of the ML/BP network without hidden layer and without extra trainings for the 17 training patterns of target B58, while Figure 3.8 (b) shows the overall network performance for all 68 training patterns. Figure 3.9 (a) presents the network performance with extra trainings for the 17 training patterns of target B58, while Figure 3.9 (b) shows the overall network performance for all 68 training patterns. It is apparent that the extra training gives the network better tolerance for the 17 training patterns of target B58. Another interesting phenomenon is that the network prefers an unknown result to a wrong discrimination.

The network generalization capability is tested by using the other 64 untrained patterns for the four targets. Figure 3.10 (a) shows the network generalization performance without extra trainings for the 16 untrained patterns of target B58, while Figure 3.10 (b) shows the overall network generalization performance for the 64 untrained patterns.
Figure 3.8 Performance of ML/BP network w/o hidden layer and w/o extra trainings vs. SNR (dB). (a) Network performance for the 17 training patterns of target B58. (b) Overall performance for all 68 training patterns. (Axes: time domain network performance vs. SNR in dB. Solid: correctly discriminated; dashed: unrecognized; dash-dot: wrongly discriminated.)

Figure 3.9 Performance of ML/BP network w/o hidden layer and with extra trainings vs. SNR (dB). (a) Network performance for the 17 training patterns of target B58. (b) Overall performance for all 68 training patterns. (Same axes and line styles as Figure 3.8.)

Figure 3.10 The generalization performance of ML/BP network w/o hidden layer and w/o extra trainings vs. SNR (dB). (a) Network generalization performance for the 16 untrained patterns of target B58. (b) Overall generalization performance for the 64 untrained patterns. (Same axes and line styles as Figure 3.8.)

Figure 3.11 The generalization performance of ML/BP network w/o hidden layer and with extra trainings vs. SNR (dB). (a) Network generalization performance for the 16 untrained patterns of target B58. (b) Overall generalization performance for the 64 untrained patterns. (Same axes and line styles as Figure 3.8.)

Figure 3.11 (a) presents the network generalization performance with extra trainings for the 16 untrained patterns of target B58, while Figure 3.11 (b) shows the overall network generalization performance for the 64 untrained patterns.
Again the network with extra training has larger generalization tolerance for the 16 untrained patterns of target B58.

3.4.2 ML/BP With One Hidden Layer

We add a hidden layer with 25 neurons to the previous ML/BP net to see whether the network noise tolerance and generalization ability increase. Training this network takes much more time and manipulation to learn all the training patterns without error, since the hidden layer has ambiguous desired outputs. Competitive learning usually occurs between the interconnection weights $W_{nj}$, with activation parameter $\beta_J$ on the hidden layer $H_J$, and $W_{jk}$, with $\beta_K$ on the output layer $O_K$. If the activation parameter $\beta_J$ for neurons in the hidden layer is larger than the one for the output layer, then the learning for $W_{nj}$ is greatly reduced and $W_{jk}$ dominates the network learning. Referring to Figure 3.6 and Figure 3.7, a larger activation parameter makes the learning region of the hidden layer (the transition region of the activation function) narrower, so most inputs to the hidden neurons enter the saturation area. The backpropagation learning is then inhibited by a derivative of about 0. This means that the outputs of the hidden layer are already mostly either 1 or -1 after the learning on the weights between the output layer and the adjacent hidden layer has proceeded for only a few iterations. The network thus degenerates to one without a hidden layer, and $W_{nj}$ works merely as a transform that maps the inputs to the dimension of the hidden layer. To make hidden layers useful, the activation parameter for hidden neurons should not be too large with respect to the neuron input scale. The hidden layer plays an important role by acting as a new representation of the inputs. This new representation helps to solve linearly nonseparable problems, for which perceptrons, ADALINE and networks without hidden layers cannot asymptotically converge to a solution.
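The forward and backward passes of Section 3.2.2, applied to a net with one hidden layer, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis implementation: it omits biases and momentum, uses a tiny 3-4-2 net with hypothetical random weights, and assumes the bipolar sigmoid $G_\beta(x) = 2/(1 + \exp(-\beta x)) - 1$ with the same $\beta$ in both layers.

```python
import math
import random

def g(x, beta):
    """Bipolar sigmoid activation."""
    return 2.0 / (1.0 + math.exp(-beta * x)) - 1.0

def g_prime(x, beta):
    """Slope of the bipolar sigmoid at input x."""
    y = g(x, beta)
    return 0.5 * beta * (1.0 - y * y)

def forward(u, w_in, w_out, beta):
    """Return hidden inputs, hidden outputs, output inputs and outputs."""
    h_in = [sum(u[n] * w_in[n][j] for n in range(len(u)))
            for j in range(len(w_in[0]))]
    h = [g(v, beta) for v in h_in]
    o_in = [sum(h[j] * w_out[j][k] for j in range(len(h)))
            for k in range(len(w_out[0]))]
    o = [g(v, beta) for v in o_in]
    return h_in, h, o_in, o

def error(u, t, w_in, w_out, beta):
    """Half the sum of squared output errors for one pattern."""
    o = forward(u, w_in, w_out, beta)[3]
    return 0.5 * sum((tk - ok) ** 2 for tk, ok in zip(t, o))

def weight_updates(u, t, w_in, w_out, beta, eta):
    """Gradient descent updates for both weight layers (one pattern)."""
    h_in, h, o_in, o = forward(u, w_in, w_out, beta)
    # Output deltas: activation slope times the output error.
    d_out = [g_prime(o_in[k], beta) * (t[k] - o[k]) for k in range(len(o))]
    # Hidden deltas: activation slope times the weighted sum of output deltas.
    d_hid = [g_prime(h_in[j], beta)
             * sum(w_out[j][k] * d_out[k] for k in range(len(o)))
             for j in range(len(h))]
    dw_out = [[eta * h[j] * d_out[k] for k in range(len(o))]
              for j in range(len(h))]
    dw_in = [[eta * u[n] * d_hid[j] for j in range(len(h))]
             for n in range(len(u))]
    return dw_in, dw_out

rng = random.Random(0)
u = [0.3, -0.7, 0.5]          # hypothetical input pattern
t = [1.0, -1.0]               # hypothetical bipolar target code
w_in = [[0.2 * rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
w_out = [[0.2 * rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(4)]
dw_in, dw_out = weight_updates(u, t, w_in, w_out, beta=2.0, eta=1.0)
```

With `eta=1.0` the returned updates equal the negative gradient of the error, so they can be compared directly against finite differences of `error` as a sanity check.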
So the learning on $W_{nj}$ between the input and hidden layers tends to find new representations for all the training patterns, which can make the network outputs converge to their respective desired outputs.

In the previous section, we found that if the input of a neuron is beyond the transition region of the neuron activation function, then the learning for this input is nearly idle even though the network output has a large error. This phenomenon is undesirable, since it prevents training patterns that sit in the wrong saturation region from learning. Although gradually relaxing the slope of the activation function may help to pull back those nonconvergent patterns, it is only effective for inputs not too far away from the transition area. One may need to reinitialize the network weights and retrain the network if the nonconvergent training patterns have been pushed deep into the saturation area of the wrong side after the dominant training patterns converge. So we design a new error function with a reinforced order of error to overcome the idle learning caused by a derivative near 0. The ML/BP net defines its error function as the sum of squared errors shown in equation (5). From the gradient descent rule, equation (6), the weight update is proportional to the derivatives of both the error function and the activation function. Therefore, we can define an error function of higher order, say 4; its first derivative then still has an order of 3 and will overcome the derivative of the activation function in its saturation area. This means that a reinforced order of the error function increases the width of the learning region, and makes it possible for the patterns that are idle with a derivative of order one to be pulled back from the saturation area of the wrong side.

Figure 3.12 Normalized derivatives of error functions of three different orders w.r.t. weight $W_{jk}$, $-dE(W)/dW_{jk}$, plotted against neuron input. The desired output is assumed to be 1 and an activation parameter $\beta = 2$ is used for output neurons. The dash-dot line denotes the error function with order of 6, the solid line denotes the one with order of 4, and the dash line shows the one with order of 2.

Figure 3.13 Normalized derivatives of the error function w.r.t. weight $W_{jk}$, $-dE(W)/dW_{jk}$, with three different activation parameters, plotted against neuron input. The desired output is assumed to be 1 and the error function with order of 4 is used here. The dash-dot line denotes the activation function with parameter $\beta = 3$, the solid line denotes the one with $\beta = 2$, and the dash line shows the one with $\beta = 1$.

Suppose, without loss of generality, that the desired output corresponding to the training pattern $p$ is 1. Figure 3.12 then shows the normalized derivatives of error functions of three different reinforced orders with respect to the learning weight $W_{jk}$, $-dE(W)/dW_{jk}$. The neuron activation parameter is assumed to be 2 for all three cases. The dash-dot line denotes the one with a reinforcement order of 6, the solid line denotes the one with order of 4, and the dash line shows the one with order of 2. Figure 3.13 presents the normalized derivatives of the error function with respect to the learning weight $W_{jk}$, $-dE(W)/dW_{jk}$, with three different activation parameters. The reinforcement order of the error function is assumed to be 4 for all three cases. The dash-dot line denotes the one with activation parameter $\beta = 3$, the solid line denotes the one with $\beta = 2$, and the dash line shows the one with $\beta = 1$. Both figures are normalized by the respective reinforcement orders of the error functions and by the network input $u_n$.
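The effect of the reinforced order can be illustrated numerically. The sketch below assumes an error function of the form $(T - O)^q$ for an even order $q$, so that $-dE/dW$ divided by $q$ and by the neuron input reduces to $(T - G_\beta(x))^{q-1} G_\beta'(x)$; this form, the grid search and all names are assumptions made for illustration rather than the thesis code.

```python
import math

def g(x, beta):
    """Bipolar sigmoid activation."""
    return 2.0 / (1.0 + math.exp(-beta * x)) - 1.0

def g_prime(x, beta):
    """Slope of the bipolar sigmoid at input x."""
    y = g(x, beta)
    return 0.5 * beta * (1.0 - y * y)

def normalized_gradient(x, beta, order, target=1.0):
    """(T - G(x))**(order - 1) * G'(x): -dE/dW over order and input."""
    return (target - g(x, beta)) ** (order - 1) * g_prime(x, beta)

def peak_location(beta, order):
    """Grid search for the neuron input where the learning signal peaks."""
    grid = [-10.0 + 0.01 * i for i in range(2001)]   # -10.00 ... 10.00
    return max(grid, key=lambda x: normalized_gradient(x, beta, order))

for order in (2, 4, 6):
    # Learning signal at a deeply saturated wrong input, and its peak position.
    print(order, normalized_gradient(-5.0, 2.0, order), peak_location(2.0, order))
```

Under these assumptions the peak of the learning signal moves to more negative inputs as the order grows, and moves back toward the origin as $\beta$ grows, which matches the two trends described for Figure 3.12 and Figure 3.13.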
Two interesting characteristics emerge from comparing Figure 3.12 and Figure 3.13: the order reinforcement of the error function increases the width of the learning area and shifts it toward the deeper saturation area, while enlarging the activation parameter shrinks the learning area and shifts it toward the transition region.

The network training takes 624 epochs to converge with a hard limiter, and another 414 extra training epochs follow. Without the order reinforcement of the error function, the training cannot escape from the deep saturation area and has to be reinitialized a couple of times. Monitoring of the network training shows that being trapped or idle usually occurs when the inputs to a neuron are very far from the transition region. For a training pattern with a desired output value of 1, a finally nonconvergent input value of -5 to an output neuron with activation parameter $\beta = 2$, i.e. $G_2(-5) = -0.9999$ and $G_2'(x)|_{x=-5} = 1.82 \times 10^{-4}$, is a common idle case. In our simulations, the reinforced order of the error function did help to pull those severely wrong patterns back.

Figure 3.14 (a) shows the performance of the ML/BP network with one hidden layer for the 17 training patterns of target B58, while Figure 3.14 (b) presents the overall performance for all 68 training patterns. Figure 3.15 (a) shows the generalization performance of the ML/BP network with one hidden layer for the 16 untrained patterns of target B58, while Figure 3.15 (b) presents the overall generalization performance for the 64 untrained patterns of the four targets. From the simulation results, the ML/BP network with one hidden layer does not perform better than the one without a hidden layer. The ambiguous desired outputs of the hidden layer degrade the capability of hidden layers, while the linear separability of the training patterns means that the ML/BP network with hidden layers has no apparent advantage over the one without hidden layers.

Figure 3.14 Performance of ML/BP network with one hidden layer vs. SNR (dB). (a) Network performance for the 17 training patterns of target B58. (b) Overall performance for all 68 training patterns. (Axes: time domain network performance vs. SNR in dB. Solid: correctly discriminated; dashed: unrecognized; dash-dot: wrongly discriminated.)

Figure 3.15 The generalization performance of ML/BP network with one hidden layer vs. SNR (dB). (a) Network generalization performance for the 16 untrained patterns of target B58. (b) Overall generalization performance for the 64 untrained patterns. (Same axes and line styles as Figure 3.14.)

3.5 Generalized Inverse Network

Radar target discrimination may be represented as a set of associative equations, and hence as a matrix equation. The number of response samples in each aspect pattern is the number of variables of the associative equation set, and the total number of available aspect patterns is the number of associative equations. However, there are two characteristics which may hinder the conventional matrix operation. One is that the aspect response does not change appreciably with the aspect increment in some specific aspect ranges.
The other is that the responses of different targets at the same aspect may be quite close. Both make it difficult for a direct matrix approach to deal with singularity, especially for quantized data. An exact numerical solution, which has no generalization capability, is not what we want, since it will not work under contamination. Neural network learning is good at relieving the singularity stress, and it offers tolerance and/or generalization. This suggests building a hybrid network which is initially constructed from the associative equation set and then learns to converge to a solution.

Assume we have P aspect patterns to be stored in memory, $X = [X^1\ X^2\ \cdots\ X^P]$, where $X^i$ is an m-dimension column vector, i.e. $X^i = [X_1^i\ X_2^i\ \cdots\ X_m^i]^T$. Suppose $Y = [Y^1\ Y^2\ \cdots\ Y^P]$ are the associative code patterns corresponding to X, where $Y^i$ is an n-dimension column vector, i.e. $Y^i = [Y_1^i\ Y_2^i\ \cdots\ Y_n^i]^T$. Then we can write an equation representing the above association as

$$ WX = Y \tag{22} $$

where W is an n by m matrix. For our application, $Y^i$ is the target associated with the stored aspect pattern $X^i$. Typically, X is not a square matrix and m > P is assumed. Thus a direct Generalized Inverse matrix computation can be used to solve for the interconnection weight matrix W [20][31][32] as

$$ W = Y(X^T X)^{-1} X^T = Y X^{+} \tag{23} $$

where $X^{+}$ is the generalized (or pseudo) inverse of X. The generalized inverse $X^{+}$ exists only if m > P, but its direct computation becomes difficult or impractical if m or P is too large. In addition, problems may occur when the target responses at two adjacent aspect angles are very close, which is not unusual for target aspect responses. After we quantize them into the binomial code form used in our network process, they may have the same quantization code sequences, causing the generalized inverse computation to become unstable or singular.
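As a small worked instance of the direct solution $W = YX^{+}$, consider a hypothetical case with m = 3, P = 2 and n = 1. The numbers are chosen only for illustration and are not taken from the measured data.

$$ X = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 1 \end{bmatrix}, \quad Y = \begin{bmatrix} 1 & -1 \end{bmatrix}, \quad X^T X = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}, \quad (X^T X)^{-1} = \frac{1}{8}\begin{bmatrix} 3 & -1 \\ -1 & 3 \end{bmatrix} $$

$$ X^{+} = (X^T X)^{-1} X^T = \frac{1}{4}\begin{bmatrix} 1 & 2 & 1 \\ 1 & -2 & 1 \end{bmatrix}, \quad W = Y X^{+} = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}, \quad WX = \begin{bmatrix} 1 & -1 \end{bmatrix} = Y $$

If the second column of X were identical to the first, as can happen when two close responses quantize to the same code, then $X^T X = \begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix}$ would have zero determinant and the inverse in this expression would not exist.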
Therefore, instead of computing the generalized inverse directly, we can use iterative training based on the gradient descent algorithm to gradually approach and then approximate a solution for the interconnection matrix W. We iteratively train the network weight W and expect the network output for each trained pattern $X^i$, $WX^i$, to retrieve its associative pattern $Y^i$. First, we construct an error (or cost) function J(W) for the network training as

$$ J(W) = \frac{1}{2}\|Y - WX\|^2 = \frac{1}{2}\sum_{j=1}^{n}\sum_{i=1}^{P} \left| Y_j^i - (WX^i)_j \right|^2 \tag{24} $$

where $\|\cdot\|$ denotes the Euclidean L2 norm. To minimize the error function J(W), the gradient descent learning rule [18][31][32] can be used:

$$ W[k+1] = W[k] - \eta \frac{\partial J(W[k])}{\partial W[k]} = W[k] + \eta (Y - W[k]X) X^T \tag{25} $$

where W[k] is the network weight matrix at learning epoch k, and $\eta$ is the positive "learning rate". Since

$$ \eta (Y - W[k]X) X^T = \sum_{i=1}^{P} \eta (Y^i - W[k]X^i) X^{iT} \tag{26} $$

the above training algorithm is a batch mode learning, in which each learning pattern adjusts the interconnection weights W without considering the adjustments made by the other learning patterns in the same learning epoch. Batch mode learning may cause over-adjustment (or over-learning) problems, since the gradient descent algorithm uses the same network weight to calculate the respective errors for all training patterns and then adjusts the network weight for all of them at the same time. The gradient descent rule, however, can effectively reduce the error only for the present training pattern with the current weight, and it does not promise that the current weight adjustment will also benefit the other training patterns. The learning algorithm can be modified to update asynchronously for each training pattern $X^i$ and each of its associative pattern components $Y_j^i$. The current training pattern then adjusts the weight W already updated by the previous learning pattern.
The asynchronous update rule gives

$$ W_j[k+1] = W_j[k] + \eta \left( Y_j^{(i)} - W_j[k] X^{(i)} \right) X^{(i)T} \tag{27} $$

where $W_j[k]$ denotes the j-th row of weight matrix W at update iteration k and $Y_j^{(i)}$ is the j-th component of the desired pattern $Y^i$ associated with $X^i$. To train each pattern and component fairly, and to avoid being trapped in a local minimum, the training pattern i is randomly selected.

In our application to radar target discrimination, the output is expected to present the discriminated target type clearly, so a continuous-valued form is not appropriate for the output representation. The binomial output form also increases the noise tolerance. Therefore, a nonlinear threshold (activation) function is introduced at the network output stage. Then we have

$$ G_\beta(WX) = Y \tag{28} $$

and

$$ G_\beta'(v) = \frac{\beta}{2}\left(1 + G_\beta(v)\right)\left(1 - G_\beta(v)\right) \tag{29} $$

from equation (4). The asynchronous update rule for pattern $X^i$ and its associative pattern component $Y_j^i$ is then given by

$$ W_j[k+1] = W_j[k] + \eta \, G_\beta'(v)\Big|_{v = W_j[k]X^{(i)}} \left( Y_j^{(i)} - W_j[k] X^{(i)} \right) X^{(i)T} \tag{30} $$

where $W_j[k]X^{(i)}$ is the input to output neuron j and $(Y_j^{(i)} - W_j[k]X^{(i)})$ is the error at the output of neuron j for the current training pattern $X^{(i)}$. Equation (28) comprises P associative equations, each with m variables. The weight matrix W maps P m-dimension vectors to P n-dimension vectors. Therefore, for m > P, there are multiple exact solutions. If the weight matrix W is randomly initialized, W may converge to some specific solution that has high noise tolerance for some trained patterns but low noise tolerance for others. To avoid solutions biased by some learning patterns, and also to speed up training, we may initialize the weight matrix by $W = YX^T$. That is, we use the correlation recording matrix, or Hopfield memory with nonzero diagonals. This initialization is very likely to approach the generalized inverse solution $YX^{+}$ if $X^{+}$ exists.
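The asynchronous rule (27) with the correlation-recording initialization can be sketched in a few lines. The sketch below is only illustrative: the function names `gi_train` and `recall`, the three 8-bit patterns, the 2-bit codes, the learning rate and the step count are all assumptions made for the example, not the measured aspect patterns or the settings used in the simulations. It uses the linear error term of (27) and applies the hard limiter only at recall.

```python
import random

def gi_train(X, Y, eta=0.05, steps=2000, seed=0):
    """Asynchronous training so that W x_i approaches y_i, as in rule (27).

    X: list of P bipolar patterns, each of length m (hypothetical data).
    Y: list of P bipolar codes, each of length n (hypothetical data).
    """
    rng = random.Random(seed)
    P, m, n = len(X), len(X[0]), len(Y[0])
    # Correlation-recording initialization W = Y X^T (n rows, m columns).
    W = [[float(sum(Y[i][j] * X[i][k] for i in range(P))) for k in range(m)]
         for j in range(n)]
    for _ in range(steps):
        i = rng.randrange(P)                 # randomly selected training pattern
        x, y = X[i], Y[i]
        for j in range(n):                   # update row j for code component j
            v = sum(W[j][k] * x[k] for k in range(m))   # input to output neuron j
            err = y[j] - v                   # linear error term of rule (27)
            for k in range(m):
                W[j][k] += eta * err * x[k]
    return W

def recall(W, x):
    """Hard-limited network output for input x."""
    return [1 if sum(w * xk for w, xk in zip(row, x)) > 0 else -1 for row in W]

# Hypothetical stored patterns (m = 8, P = 3) and 2-bit target codes (n = 2).
X = [[1, 1, 1, 1, 1, 1, 1, 1],
     [1, -1, 1, -1, 1, -1, 1, -1],
     [1, 1, -1, -1, 1, 1, -1, 1]]
Y = [[-1, -1], [-1, 1], [1, -1]]

W = gi_train(X, Y)
print([recall(W, x) for x in X])
```

Because every update adds a multiple of a stored pattern to a row that starts inside the span of the stored patterns, the rows stay in that span, which is why this initialization tends toward the minimum-norm solution $YX^{+}$ rather than an arbitrary exact solution.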
Therefore, if the network weight W is not initialized by the correlation recording matrix $YX^T$, the GI network becomes a multi-layer feedforward with error backpropagation network without a hidden layer. The GI network can thus be regarded as a hybrid network composed of a correlation associative memory and a learning-based backpropagation network. Equivalently, the GI network is a single-layer net with initial weight matrix $W = YX^T$.

3.5.1 Generalized Inverse Network Performance Using Bipolar Data

Since the GI network will be cascaded to a Recurrent Correlation Associative Memory (RCAM) to serve as a target code decoder and a refiner of spurious stable states, bipolar data are adopted for both the input and the output of this network. The advantages of using the binomial form will be discussed in the next chapter. As with the ML/BP nets, the GI network has 17 training azimuthal aspect patterns for each target, while the other 16 aspect patterns of every target are left untrained to test network generalization after training. Thus, we have a total of 68 training aspect patterns and 64 untrained patterns for the four targets. From the previous chapter, we know that each time domain aspect pattern contains 100 analog samples. First, we quantize each analog sample into 7 quantization levels for all patterns, and then use 3 bits to encode the 7 quantization levels. The 3-bit code assignment for the 7 quantization levels is shown in Table 3.1.

[Table 3.1 Code assignment for 7 quantization levels coded by three bipolar bits.]

The code that results from 7 levels coded by 3 bits is not exactly a linear code, in which two nearer levels have two more similar codes. But if we assign the code-to-level mapping properly, as shown in Table 3.1, the codes keep the linear characteristic for adjacent levels. This means that if the noise contamination range is smaller than ±1.5 level intervals, the code linearity still holds statistically.
Therefore the network discrimination will not be affected by the coding scheme. If the noise contamination is severe, the linearity may not hold for every response level, and some wrong or unrecognized discriminations probably result from this nonlinear coding scheme. For example, if the true response is in level 3 (represented by 1 -1 1) and the contamination range is within ±1.5 level intervals, then the contaminated signal will be in either level 2 (represented by -1 -1 1) or level 4 (represented by 1 -1 -1), and these two codes (-1 -1 1 and 1 -1 -1) are still more similar to the true one (1 -1 1) than the others are. If the number of quantization levels is fixed, the input resolution is the same for different coding schemes. The more digits we use, the more linear the code mapping becomes, but the more process space is required. This is a trade-off: either save process space with fewer code bits, and process quickly but poorly under severe noise; or spend more memory on more code bits, and process slowly but satisfactorily under severe contamination. In digital computer simulation this problem can be solved by processing the analog (numerical) valued responses instead of binomial codes, but it remains for hardware realization.

The training of the GI network takes only several epochs to converge to the desired target codes with a hard limiter, and 120 extra training epochs follow. Figure 3.16 (a) shows the GI network performance for the 17 training patterns of target B58, while Figure 3.16 (b) presents the overall network performance for all 68 training patterns. Figure 3.17 (a) shows the GI network generalization performance for the 16 untrained patterns of target B58, while Figure 3.17 (b) presents the overall network generalization performance for the 64 untrained patterns.
As expected, the GI network shares an interesting phenomenon with the analog-process ML/BP networks: it also prefers unknown to wrong discriminations. This characteristic allows us to put more confidence in the target type discriminated by the GI network. Since the wrong discrimination percentage is quite low, it is reasonable to expect that cascading the GI network to a Recurrent Correlation Associative Memory (RCAM) will only increase the correct discrimination rate and do no harm to the correctly associated outputs presented by the RCAM. Therefore the GI network can serve as a refiner of the spurious stable states that result from the cascaded RCAM. This issue will be taken up in the following chapters. The GI network performs better than the analog ML/BP nets under fair noise conditions, but worse under severe contamination. The nonlinear coding scheme described earlier may explain that deficiency. Since the ML/BP nets use continuous values for network inputs, the linearity is always kept, even under serious contamination.

[Figure 3.16 Generalized Inverse (GI) network performance vs. SNR (dB). (a) Network performance for the 17 training patterns of target B58. (b) Overall performance for all 68 training patterns. Solid: correctly discriminated; dashed: unrecognized; dashdot: wrongly discriminated.]
[Figure 3.17 The generalization performance of Generalized Inverse (GI) network vs. SNR (dB). (a) Network generalization performance for the 16 untrained patterns of target B58. (b) Overall generalization performance for the 64 untrained patterns. Solid: correctly discriminated; dashed: unrecognized; dashdot: wrongly discriminated.]

CHAPTER 4

Recurrent Correlation Associative Memories

4.1 Introduction

In this chapter we investigate some Recurrent Correlation Associative Memories (RCAM) [18][31]~[33][38]~[41]. These networks are designed to retrieve stored information from incomplete or degraded data, and they are "recursive" systems because the output units are fed back to the inputs at each update. The recurrent correlation associative memory is a correlation-based memory that reconstructs stored data. The RCAM retrieves stored data from the data themselves, unlike traditional address-based memories, which access stored data by their corresponding addresses. Therefore, the RCAM not only saves the resources that address-based memories spend on calculating the stored data addresses, but also processes data directly in parallel. It can be used as an efficient content-addressable memory [18] and hence as a pattern discriminator. In this application, all stored patterns are explored in parallel because data storage and retrieval are based on the pattern itself and not on its address.

Associative memories [31]~[33] can be classified in various ways, depending on the nature of the stored associations (autoassociative vs. heteroassociative) and the update mode (synchronous vs. asynchronous).
The autoassociative memory is a discrete memory in which the associative pattern of each stored pattern is the pattern itself. Therefore, the output has the same dimension as the input. It employs a single layer of perceptrons and has hard-limiter activations at the output stage. The perceptrons are fully interconnected and the outputs are fed directly back to the inputs with one iteration delay. The update can operate in a synchronous (parallel) or asynchronous (random) mode. A heteroassociative memory can be regarded as an extension of the autoassociative memory. Two pattern sets are stored in a heteroassociative memory, and each pattern in one set has its associative partner, no longer itself, in the other set. Therefore, the output dimension of a heteroassociative memory may differ from its input dimension. It can also be thought of as two separate single-layer feedforward networks, with each one's output connected to the other one's input.

4.2 Hopfield Network and Bidirectional Associative Memory (BAM)

The learning mechanisms in correlation-based associative memories can be broadly explained by Hebb's rule [34]. Hebb's rule suggests that when a neuron i repeatedly participates in activating another neuron j, the efficiency of neuron i in activating neuron j is increased, and that the efficiency should be decreased in the opposite case. In terms of the synaptic connection weight, this can be expressed quantitatively: the synaptic weight $w_{ij}$ should be enhanced if neurons i and j more often have the same activity state, and should be inhibited if neurons i and j more often have opposite activity states. Hebb's rule thus gives meaning to the correlation and the synaptic strength between two connected neurons. Suppose neurons have binomial states, 0 or 1 for the binary form and -1 or 1 for the bipolar form.
For a given input, if neuron i has state $S_i$ and neuron j has state $S_j$, then Hebb's rule indicates that the synaptic weight should be adjusted in response to the training input by $\Delta w_{ij} = \Delta w_{ji} = r S_i S_j$, where r is a positive learning rate. Assume that we want P associative pattern pairs $\{(x^q, y^q) \mid q = 1, 2, \ldots, P\}$ stored in memory, where $x^q$ is an m-dimension column vector and $y^q$ is an n-dimension column vector associated with $x^q$, so that $X = [x^1\ x^2\ \cdots\ x^P]$ and $Y = [y^1\ y^2\ \cdots\ y^P]$. The associative memory is an autoassociative memory if $x^q = y^q$ for all P patterns; otherwise it is a heteroassociative memory.

The Hopfield network [35][36] is an autoassociative memory, so the recorded associative patterns have $x^q = y^q$ for all P patterns. Suppose the network uses the binary data form and $w_{ij}$ denotes the interconnection weight between input neuron i and output neuron j. Then a Hopfield memory can be constructed by

$$ w_{ij} = \begin{cases} \sum_{q=1}^{P} (2x_i^q - 1)(2x_j^q - 1), & i \ne j \\ 0, & i = j \end{cases} \tag{1} $$

Except for $w_{ii}$, the synaptic weight between neurons i and j is increased if neurons i and j have the same binary state, and is decreased if the two neurons have opposite states. Therefore, the Hopfield memory recording algorithm is in accordance with the Hebbian rule. Since the memory is autoassociative, the interconnection weights satisfy $w_{ij} = w_{ji}$. In binary form, the relationship between a state b and its complementary state b' can be written as b' = (1 - b) or b = (1 - b'). Then we have $[2(1 - x_i) - 1][2(1 - x_j) - 1] = (2x_i - 1)(2x_j - 1)$, so the above Hopfield recording scheme also stores the complements of the desired stored patterns. For the binary form, given an input state s[k], the Hopfield network has the next state s[k+1] given by

$$ s_j[k+1] = \begin{cases} 1, & \sum_{i=1}^{m} w_{ij}\, s_i[k] > 0 \\ 0, & \sum_{i=1}^{m} w_{ij}\, s_i[k] \le 0 \end{cases} \tag{2} $$

In equation (1), the auto-feedback synaptic weight of each neuron is set to 0, so the weight matrix has zero diagonals.
This setting prevents the network from blindly amplifying the state of neuron j in an arbitrary input by the number of stored patterns when the network determines the output activity of neuron j. Without the diagonal annulling, a Hopfield network with a large number of stored patterns becomes insensitive to variable inputs and is inclined to leave an arbitrary input unchanged. Physically, setting the diagonals to 0 means that a neuron is not permitted to take part in determining its own next activity state. The storage capacity of the Hopfield memory has been found empirically to scale as about 0.15n [35] and theoretically to scale as less than n/(2 log n) [30][37]. The latter suggests that for large n, the ratio of stored patterns to pattern dimension, P/n, required for correct recall approaches zero, which reflects the poor capacity of the Hopfield memory.

Since the activation threshold is nonlinear, the Hopfield net is a nonlinear dynamic network, and the characteristics of a nonlinear system are generally difficult to analyze. The Russian mathematician Liapunov devised a stability test based on energy-like functions. For a network W and a state s, the energy function is defined as

$$ E(s) = -\frac{1}{2} s^T W s \tag{3} $$

or rewritten as

$$ E(s) = -\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} w_{ij}\, s_i s_j \tag{4} $$

Now we would like to know how the energy changes from state to state. For synchronous update we have [20]

$$ \frac{\partial E(s)}{\partial s} = -W s \tag{5} $$

and for asynchronous update

$$ \frac{\Delta E(s)}{\Delta s_j} = -\sum_{i=1}^{m} w_{ij}\, s_i \tag{6} $$

From equation (6), the network has $\sum_{i=1}^{m} w_{ij} s_i > 0$ and $\Delta s_j = 1 > 0$ if $s_j$ changes from 0 to 1, and it has $\sum_{i=1}^{m} w_{ij} s_i \le 0$ and $\Delta s_j = -1 < 0$ if $s_j$ changes from 1 to 0. Therefore, the energy change $\Delta E(s)$ cannot be positive whenever a neuron changes its state. Since the weight matrix W is deterministic for a finite number of stored patterns, and the stored patterns X are bounded in dimension m and in binary state {0, 1}, the energy E is bounded. Therefore, the network W should drive a given state to a local minimum of E.
But there may be more than one local minimum or stable state with the same energy, i.e. the energy may stay the same when a state changes. The network may then oscillate among states of equal energy. Let us consider a simple example. Suppose we need the network to store a binary pattern $x = (x_1\ x_2\ x_3\ x_4)^T = (1\ 0\ 1\ 0)^T$. Then, according to equation (1), the network weight matrix has $w_{ij} = (2x_i - 1)(2x_j - 1)$ for $i \ne j$ and $w_{ij} = 0$ for $i = j$. Since the network interconnection weights are symmetric, only 6 weights need to be calculated. They are $w_{12} = (2x_1 - 1)(2x_2 - 1) = (2 - 1)(0 - 1) = -1 = w_{21}$, $w_{13} = 1 = w_{31}$, $w_{14} = -1 = w_{41}$, $w_{23} = -1 = w_{32}$, $w_{24} = 1 = w_{42}$ and $w_{34} = -1 = w_{43}$. The Hopfield network is now

$$ W = \begin{bmatrix} 0 & -1 & 1 & -1 \\ -1 & 0 & -1 & 1 \\ 1 & -1 & 0 & -1 \\ -1 & 1 & -1 & 0 \end{bmatrix} \tag{7} $$

Let us see whether the stored state is stable. Assume the next state is s; then s = g(Wx), where g(·) is the binary activation function and Wx is given by

$$ Wx = \begin{bmatrix} 0 & -1 & 1 & -1 \\ -1 & 0 & -1 & 1 \\ 1 & -1 & 0 & -1 \\ -1 & 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \\ 1 \\ -2 \end{bmatrix} \tag{8} $$

The network then has the next state $s = g[(1\ {-2}\ 1\ {-2})^T] = (1\ 0\ 1\ 0)^T = x$, so the stored state x is a stable state. Let us examine the state trajectory and the corresponding energy change for two initial states, $s^1 = (1\ 0\ 0\ 1)^T$ and $s^2 = (0\ 0\ 0\ 1)^T$. For the initial state $s^1$, the network has the next state $s^1(1) = g(Ws^1) = g[(-1\ 0\ 0\ {-1})^T] = (0\ 0\ 0\ 0)^T$, and then the next state $s^1(2) = g[Ws^1(1)] = g[(0\ 0\ 0\ 0)^T] = (0\ 0\ 0\ 0)^T = s^1(1)$. The final stable state is $(0\ 0\ 0\ 0)^T$, which is not a desired stored state. From equation (3), the energy of the only stored state $x = (1\ 0\ 1\ 0)^T$ is $E(x) = -0.5\,x^T W x = -1$. The initial state $s^1$ has energy $E(s^1) = -0.5\,(s^1)^T W s^1 = 1$, and the stable state $s^1(1) = s^1(2)$ has energy $E[s^1(1)] = 0$. In this example the initial state truly converges to a stable state of lower energy. For the initial state $s^2$, the network has the next state $s^2(1) = g(Ws^2) = g[(-1\ 1\ {-1}\ 0)^T] = (0\ 1\ 0\ 0)^T$, and then the next state $s^2(2) = g[Ws^2(1)] = g[(-1\ 0\ {-1}\ 1)^T] = (0\ 0\ 0\ 1)^T = s^2$. Therefore, an oscillation cycle, $s^2 \to s^2(1) \to s^2 \to s^2(1)$, occurs.
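The worked example can be reproduced with a short script. This is a minimal sketch, assuming the binary recording rule of equation (1), the synchronous update of equation (2) and the energy of equation (3); the function names `hopfield_weights`, `step` and `energy` are introduced here only for illustration.

```python
def hopfield_weights(patterns):
    """Binary Hopfield recording, equation (1), with zero diagonals."""
    m = len(patterns[0])
    W = [[0] * m for _ in range(m)]
    for x in patterns:
        for i in range(m):
            for j in range(m):
                if i != j:
                    W[i][j] += (2 * x[i] - 1) * (2 * x[j] - 1)
    return W

def step(W, s):
    """One synchronous update, equation (2): 1 if the net input is > 0, else 0."""
    return [1 if sum(w * sj for w, sj in zip(row, s)) > 0 else 0 for row in W]

def energy(W, s):
    """Energy function, equation (3): E(s) = -0.5 * s^T W s."""
    m = len(s)
    return -0.5 * sum(s[i] * W[i][j] * s[j] for i in range(m) for j in range(m))

x = [1, 0, 1, 0]                 # the single stored pattern of the example
W = hopfield_weights([x])

s1 = [1, 0, 0, 1]                # first initial state of the example
s2 = [0, 0, 0, 1]                # second initial state of the example

print("stored state maps to", step(W, x), "with energy", energy(W, x))
print("s1 trajectory:", s1, step(W, s1), step(W, step(W, s1)))
print("s2 trajectory:", s2, step(W, s2), step(W, step(W, s2)))
```

The second trajectory returns to its starting state after two updates, which is the two-state oscillation between equal-energy states described in the text.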
The state $s^2$ has energy $E(s^2) = 0$ and the state $s^2(1)$ has the same energy $E[s^2(1)] = 0$; therefore the states oscillate between these two local minima, which share the same energy.

Another typical associative network, which we compare with the Recurrent Correlation Accumulation Adaptive Memories (RCAAM) discussed in the next chapter, is the Bidirectional Associative Memory (BAM) [38][39]. The BAM is a heteroassociative memory, i.e. $x^q \ne y^q$, and the retrieval operations proceed sequentially in alternating directions until two associative stable states are reached at the two sides of the network. Suppose the bipolar data form is used. Then, with the associative stored pattern pairs $\{(x^q, y^q) \mid q = 1, 2, \ldots, P\}$ and dimensions assumed above (m for column vector $x^q$ and n for column vector $y^q$), a BAM can be constructed as follows:

$$ w_{ij} = \sum_{q=1}^{P} x_i^q\, y_j^q, \quad i = 1, \ldots, m \ \text{and}\ j = 1, \ldots, n \tag{9} $$

or

$$ W = \sum_{q=1}^{P} x^q (y^q)^T = X Y^T \tag{10} $$

where $(y^q)^T$ is the transpose of $y^q$, $X = [x^1\ x^2\ \cdots\ x^P]$ and $Y = [y^1\ y^2\ \cdots\ y^P]$. Therefore, the BAM has an m by n weight matrix W, and it can take either an m-dimension input or an n-dimension input. Again the BAM weight construction is a correlation matrix based on the Hebbian learning rule. Given an input state u(0) of dimension m, the network has an n-dimension next state $v(1) = g[W^T u(0)]$, then an m-dimension next state $u(2) = g[W v(1)]$, then an n-dimension state $v(3) = g[W^T u(2)]$, and so on through u(4), v(5), ..., until u(k) produces v(k+1) and v(k+1) produces u(k+2) = u(k). Then u(k) and v(k+1) are the associative patterns retrieved by the BAM for the given input u(0). Similarly, the input of the BAM can be an n-dimension vector v(0), and the successive updates are the same as above. A similar energy function for the BAM system in associative state (x, y) is defined as

$$ E(x, y) = -x^T W y \tag{11} $$

Then the energy change with respect to a state change in asynchronous update is

$$ \frac{\Delta E}{\Delta x_i} = -W_i\, y = -\sum_{j=1}^{n} w_{ij}\, y_j, \qquad \frac{\Delta E}{\Delta y_j} = -x^T W^j = -\sum_{i=1}^{m} x_i\, w_{ij} \tag{12} $$

or in synchronous update,

$$ \Delta E(\Delta x, y) = -\Delta x^T W y = -\sum_{i=1}^{m} \Delta x_i\, W_i\, y, \qquad \Delta E(x, \Delta y) = -x^T W \Delta y = -\sum_{j=1}^{n} x^T W^j\, \Delta y_j \tag{13} $$

where $W_i$ is the i-th row of W and $W^j$ is the j-th column of W. Thus, the energy decreases whenever a state changes, in either asynchronous or synchronous update mode. Since, for a finite number of stored states, the BAM weight matrix W and the bipolar states {-1, 1} are bounded, the energy is bounded. From equations (12) and (13), for an arbitrary given initial state (x, y), the BAM drives the initial state to a stable state pair, and thus the BAM has no oscillation phenomenon of the kind that occurred in the Hopfield net.

Let us roughly estimate the BAM's storage capacity. Suppose the input is one of the stored m-dimension patterns, $x^r$. Then the next n-dimension output before the bipolar activation threshold is

$$ W^T x^r = \left( (x^r)^T x^r \right) y^r + \sum_{q \ne r}^{P} \left( (x^q)^T x^r \right) y^q = m\, y^r + \sum_{q \ne r}^{P} (x^r \cdot x^q)\, y^q \tag{14} $$

Since $y^r$ is the n-dimension stored pattern associated with the stored pattern $x^r$, the BAM is expected to retrieve pattern $y^r$ when presented with pattern $x^r$. In the above equation, the associative pattern $y^r$ is amplified by the dimension m of the stored pattern set X, and the cross-correlation gains between $x^r$ and $x^q$ with $r \ne q$ can be regarded as noise amplification gains. If the number of stored patterns, P, is greater than the dimension of X, m, the noise terms may prevail over the expected associative pattern. Similarly, unreliable retrieval may occur if the presented input is the n-dimension $y^r$ and P is greater than n. Thus the BAM storage capacity should be limited by P < min(m, n).

4.2.1 Simulations and Results

In the Hopfield network and BAM simulations, we use the same four targets as in the previous chapter. Each target has 17 aspect patterns stored in the network and another 16 aspect patterns left unstored to test network generalization. The scheme of encoding 7 quantization levels with 3 bits is used again here; therefore, each stored pattern in the Hopfield net, and each pattern in one stored pattern set of the BAM, has 300 bits.
Here the network presents two performances, pattern recognition and target discrimination, for the stored pattern tests, and only target discrimination for the unstored pattern test, i.e. the generalization performance. The pattern recognition performance shows how well the network can recognize the contaminated stored patterns, while the target discrimination performance shows how well the network can classify a contaminated pattern into its own target. All networks are simulated 10 times, each stored/unstored pattern is tested under 10 different SNRs, [40, 30, 20, 14, 10, 6, 3, 0, -3, -6] dB, in each simulation, and the average is used to present the network performance.

The autoassociative Hopfield net always brings inputs to some undefined states and leaves nothing discriminated. This network could do nothing for our radar target discrimination application. According to the storage capacity formula n/(2 log n) given earlier, the Hopfield net can store fewer than 26 300-bit patterns, since 300/(2 log 300) = 26.3. Comparing this with the simulation results, we realize that the radar target scattered responses are far from random signals, so the storage capacity derived for random signals is almost useless for our application.

Since the BAM is a heteroassociative memory, the network needs another stored pattern set, acting as network output codes, to associate with the lab-measured aspect response set. Therefore, we need to design another pattern code set to associate with the one we used. The BAM also performs poorly when the other stored pattern set uses orthogonal codes, i.e. bit i of pattern i is set to 1 and the other bits are set to -1. Since the BAM requires bidirectional retrieval, the other code set must have distinct codes, i.e. no two codes can be identical. This means that the two associative stored pattern sets must have a one-to-one mapping, and a many-to-one mapping is not allowed.
This is a disadvantage, since the one-to-one mapping prohibits the BAM from clustering or target grouping applications. However, our purpose is at least that the network can recognize the target to which a given aspect pattern belongs. Under the above constraint, we partition an output code into two sections, one for target identification and the other for BAM bidirectional operation. We design a set of 7-bit heteroassociative codes corresponding to the 68 300-bit aspect pattern codes. The first two bits of the 7-bit codes are designed as a target group code for the four different targets, i.e. [-1 -1] for B52, [-1 1] for B58, [1 -1] for F14 and [1 1] for TR1, and the next five bits are encoded to represent the 17 azimuthal responses of each target. We then alter the BAM process strategy by using the target group code in its heteroassociative partners: we discriminate only the target group code portion of the final stable state and ignore the rest of the output code. This greatly improved the correct recognition rate and also reduced the wrong rates in simulations.

Figure 4.1 (a) shows the BAM pattern recognition performance for the 68 contaminated stored patterns, while Figure 4.1 (b) presents the BAM target discrimination performance. The pattern recognition uses all 7 bits to discriminate stored patterns, and its performance is worse. The target discrimination uses only the target group code, the first two bits, to discriminate a given pattern, so the BAM either correctly or wrongly discriminates an input pattern and leaves none unknown. Compared with the correct pattern recognition rate, the correct rate of target discrimination rises by about 60% at 40 dB SNR and 55% at 0 dB SNR. Figure 4.2 shows the BAM generalization performance obtained by testing the 64 contaminated unstored patterns. The correct target discrimination rate of the BAM generalization performance is 83% at 40 dB.

[Figure 4.1 BAM network performances vs. SNR (dB) for the 68 stored patterns. (a) Pattern recognition performance. (b) Target discrimination performance by using target group code. Solid: correctly recognized/discriminated; dashed: unrecognized; dashdot: wrongly recognized/discriminated.]

[Figure 4.2 BAM generalization performance by using target group code for the 64 unstored patterns. Solid: correctly discriminated; dashed: unrecognized; dashdot: wrongly discriminated.]

4.3 High Order Correlation Associative Memory (HCAM) and Exponential Correlation Associative Memory (ECAM)

In the last section, we showed that the Hopfield net could do nothing for our application, while the BAM needed a carefully redesigned heteroassociative code set. Furthermore, the one-to-one mapping makes the coding assignment less flexible and also interferes with the consistency of bidirectional retrieval, especially for two distinct stored patterns with two close associative codes. This degrades the network performance and makes the BAM highly sensitive to coding. Both the Hopfield net and the BAM mechanisms give low discrimination resolution, so the stored states are usually confused with one another at each updated state.

The Recurrent Correlation Associative Memories (RCAM) are designed to recall the associative pattern $y^i$ by recurrent correlation operations when given an input u which is sufficiently close to $x^i$.
This type of neural network is applicable to our radar target discrimination. In this section, we discuss the High Order Correlation Associative Memory (HCAM) and the Exponential Correlation Associative Memory (ECAM) with autoassociative stored patterns, i.e. $x^i = y^i$ for all P stored patterns. Since the correlation of two normalized or bipolar signals is a measure of how close the two signals are, the RCAM can be applied to discriminate patterns based on this property. If $x^i$ has binomial (binary or bipolar) components and s is an input or the current state with dimension m, then we can write the evolutionary behavior [40][41] of an RCAM as

$$ s' = G\left\{ \sum_{i=1}^{P} f\left( \langle x^i, s \rangle \right) x^i \right\} \tag{15} $$

where s' is the next network state, f is a weighting function and G is a threshold (or activation) function. In bipolar processing, the Signum (Sign) function,

$$ \mathrm{Sign}(v) = \begin{cases} 1, & v > 0 \\ -1, & v < 0 \end{cases} \tag{16} $$

is used for the threshold function G. We can see that the Hopfield network is a special case of the RCAM with weighting function f(c) = c and annulled diagonals. An RCAM with the above evolution equation is asymptotically stable [9] in both synchronous and asynchronous update modes if its weighting function is continuous and monotone nondecreasing. The network first computes the correlations between the given state s and each stored pattern, then processes each correlation by the weighting function f to obtain the weighted correlation gain, and then multiplies each stored pattern by its weighted correlation gain. Finally, the network adds all amplified patterns together and applies the Sign threshold function to the sum to obtain the network output (next state). Since the correlation between two binomial vectors can be regarded as a similarity measure, we can classify a given input into its stored prototype by using the correlation gains appropriately. Generally, the larger the correlation value of two vectors, the closer they are.
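For bipolar vectors this closeness can be made exact. As a short worked relation (a standard identity, stated here for illustration), let $d_H(x, s)$ denote the number of bit positions in which two vectors $x, s \in \{-1, 1\}^m$ differ. Each agreeing position contributes +1 to the correlation and each differing position contributes -1, so

$$ \langle x, s \rangle = \sum_{k=1}^{m} x_k s_k = (m - d_H(x, s)) - d_H(x, s) = m - 2\, d_H(x, s) $$

For example, with the 300-bit patterns used in the simulations, a contaminated input that differs from a stored pattern in 30 bits has correlation 300 - 60 = 240 with that pattern.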
So the weighting function should be strictly increasing to assure the viability of the correlation-based retrieval algorithm. For a Hopfield net with nonzero diagonals, the network has the next state s'

$$ s' = \mathrm{Sign}\Big\{ \sum_{i=1}^{P} \big(s^{T} x^{(i)}\big)\, x^{(i)} \Big\} \qquad (17) $$

The network only considers a 1-dimensional correlation between the given state and the stored vector $x^{(i)}$. Since the next state is generated by the sum of the amplified patterns, similar to the above BAM analysis, the weighted sum can be separated into two terms: one is the associative pattern amplified by its correlation gain with the input s, and the other, the sum of the remaining weighted patterns, is a noisy term. Only if the weighted associative pattern prevails over the noisy term during the recurrent nonlinear updates can the correct recall be attained. Therefore, with a fair number of stored patterns, a pattern with a slightly larger correlation gain may be distorted by the sum of the others, and then it will not dominate the next state after the addition of all amplified patterns. One important reason is that the discrimination resolution of first-order correlation is not high enough for a stored-pattern input to survive the sum of weighted patterns. If we compare the relationship between $x_j^{(i)} x_k^{(i)}$ and $s_j s_k$, where j=1,...,m and k=1,...,m, then we can use more information to emphasize the correlation between the two vectors $x^{(i)}$ and s before adding them to provide the next state. The one-dimensional model (Hopfield net) compares two vector strings bit by bit to compute the number of identical bits, while the two-dimensional model constructs the individual autocorrelation plane for each stored pattern and for the given input, i.e. $x^{(i)}(x^{(i)})^{T}$ and $s\,s^{T}$, and then compares the input autocorrelation 2-d plane to each individual stored pattern 2-d plane to find the closest autocorrelation plane structure among all stored patterns.
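The comparison of autocorrelation planes described above can be checked numerically. The following is a minimal sketch, not taken from the dissertation: it assumes that "comparing" two planes means summing their element-wise product, and the names `plane_x`, `plane_s`, `first_order` and `second_order` are hypothetical. Under that assumption the 2-d comparison equals the square of the ordinary 1-d correlation, which is why it corresponds to a second-order correlation gain.

```python
import numpy as np

# Two random bipolar vectors stand in for a stored pattern x and a state s
# (illustrative data only, not the measured radar patterns).
rng = np.random.default_rng(0)
m = 16
x = rng.choice([-1, 1], size=m)
s = rng.choice([-1, 1], size=m)

# 1-d model: bit-by-bit correlation s^T x.
first_order = int(x @ s)

# 2-d model: autocorrelation planes x x^T and s s^T, each with m*m entries.
plane_x = np.outer(x, x)
plane_s = np.outer(s, s)

# Comparing the planes entry by entry (sum of element-wise products)
# gives sum_jk x_j x_k s_j s_k, which factors into (s^T x)^2.
second_order = int(np.sum(plane_x * plane_s))

print(first_order, second_order)
```

The plane comparison touches m*m entries instead of m, yet its value is fully determined by the first-order correlation, so a network can obtain the same gain by raising the correlation to a power rather than storing the planes.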
The autocorrelation plane not only offers similarity information about linear position but also examines the similarity of cross-correlations among bit positions between the given state and all stored states. So it is clear that the 2-d model uses m times the information of the 1-d model to discriminate patterns. A network that takes advantage of the high-dimensional autocorrelation hyperplane is called the High Order Correlation Associative Memory (HCAM). An HCAM [40][41] has the weighting function

$$ f(c) = (c + T_o)^{r} \qquad (18) $$

where r > 1. $T_o$ is an offset value designed to avoid amplifying negative correlation gains for even r. If $(c + T_o)$ is positive or r is a positive odd integer, the weighting function f is strictly increasing, as required for correct retrieval. Another RCAM used for our target discrimination simulations is the Exponential Correlation Associative Memory (ECAM), which has the weighting function

$$ f(c) = b^{c} \qquad (19) $$

where b > 1. Again, this weighting function f is strictly increasing. RCAMs with a continuous and strictly increasing weighting function f are asymptotically stable in both synchronous and asynchronous update modes. This means that the recurrent operations of the network will drive a given input state to some stable state, ensuring that no oscillation cycle occurs during the recurrent convergent iterations. Thus the RCAM will converge either to one of the stored patterns or to some spurious (unknown) state when triggered at the input by a given pattern. An HCAM needs a predetermined fixed order r, since once an input is presented to the network the sequential convergence continues until the stable state is reached. A network with a low order r cannot retrieve the correct associative patterns, while a high order r wastes computation space. When the ECAM exponentially amplifies the correlation gains, producing excellent discrimination resolution, it also exponentially expands the network computation space.
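The evolution equation (15) with the HCAM and ECAM weighting functions (18) and (19) can be sketched in a few lines. This is an illustrative sketch only, not the simulation code used in this study: the function names, the small 8-bit orthogonal patterns, the choice $T_o = 0$, and the mapping of Sign(0) to +1 (equation (16) leaves v = 0 undefined) are all assumptions.

```python
import numpy as np

def sign(v):
    # Bipolar threshold G. Mapping v == 0 to +1 is an assumption,
    # since the Sign function in the text is defined only for v != 0.
    return np.where(v >= 0, 1, -1)

def rcam_step(s, X, f):
    """One synchronous RCAM update: s' = Sign(sum_i f(s^T x_i) x_i).

    X holds the P stored bipolar patterns as rows, shape (P, m).
    """
    c = X @ s              # correlation of the state with each stored pattern
    return sign(f(c) @ X)  # weight each pattern by f(c_i), sum, threshold

def recall(s, X, f, max_iter=20):
    """Iterate until the state stops changing (or max_iter is reached)."""
    for _ in range(max_iter):
        s_next = rcam_step(s, X, f)
        if np.array_equal(s_next, s):
            break
        s = s_next
    return s

# Hypothetical weighting functions: HCAM of order 3 with T_o = 0, ECAM with b = 2.
hcam3 = lambda c: c.astype(float) ** 3
ecam2 = lambda c: 2.0 ** c

# Three mutually orthogonal 8-bit patterns (toy data, not radar responses).
X = np.array([
    [1,  1,  1,  1, 1,  1,  1,  1],
    [1, -1,  1, -1, 1, -1,  1, -1],
    [1,  1, -1, -1, 1,  1, -1, -1],
])

# Probe: the second stored pattern with its first bit flipped.
probe = X[1].copy()
probe[0] = -probe[0]

recalled_hcam = recall(probe, X, hcam3)
recalled_ecam = recall(probe, X, ecam2)
print(recalled_hcam, recalled_ecam)
```

In this toy case both weighting functions pull the probe back to the second stored pattern; the difference between them lies in how large the intermediate gains `f(c)` become, which is the computation-space issue discussed in the text.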
It is not possible to physically realize an ECAM for processing patterns with large dimension. For example, in our simulations each bipolar stored pattern has 300 bits, so the maximum weighted correlation gain is $2^{300}$ for the ECAM with the weighting function $f(c)=2^{c}$, i.e. b=2. Therefore, a huge processing space is required for a hardware realization at chip-design scale. When a given input is closer to one of the stored patterns, the ECAM requires a much larger processing space, and the larger scale computation needs a longer processing time. This behavior seems illogical and unreasonable.

The storage capacity of the ECAM does not provide a definite or meaningful measure for our application. Since radar target response signals are far from random signals, a storage capacity estimated from random data under many impractical probability assumptions cannot be a valuable operating guide for us. For reference, however, we still give it here. Consider an N-bit input pattern that is r bits ($r \le \rho N$ and $0 \le \rho < 1/2$) away from the nearest stored pattern. The storage capacity is then defined, as the pattern dimension (number of bits) $N \to \infty$, as the greatest rate of growth of M(N) such that after one iteration the bit-error probability (the probability that a bit in the next state is different from the corresponding bit in the nearest stored pattern) is less than $(4\pi T)^{-1/2} e^{-T}$, where T is a fixed and large number. Suppose an ECAM is loaded with [41]

$$ M(N) = \begin{cases} \dfrac{a^{4}}{4T}\, 2^{N(1-\mathcal{H}(\rho'))} + 1, & \text{if } \rho' \ge (1+a^{2})^{-1} \\[2ex] \text{[expression illegible in the source]} + 1, & \text{if } \rho' < (1+a^{2})^{-1} \end{cases} \qquad (20) $$

N-bit memory patterns, where $\rho' = \rho + (1/N)$, $0 \le \rho < 1/2$, and $\mathcal{H}(x) = -x \log_2 x - (1-x) \log_2 (1-x)$ is the binary information entropy. If the current state pattern x is $\rho N$ bits away from the nearest memory pattern, then as $N \to \infty$, the bit-error probability ($P_e$) is less than $(4\pi T)^{-1/2} e^{-T}$, where T is a fixed and large number.
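The size of the 300-bit example quoted earlier in this section can be made concrete with exact integer arithmetic. The sketch below is only an illustration of the dynamic range involved; the variable names are hypothetical, and the HCAM comparison assumes an order of 5 with $T_o = 0$, so that the largest gain is $m^{r}$.

```python
import sys

m = 300          # bits per bipolar stored pattern (from the text)
b = 2            # ECAM base (from the text)
r = 5            # HCAM order used for comparison (T_o = 0 is an assumption)

# Largest ECAM gain: a perfect match gives correlation c = m, so f(c) = b**m.
ecam_max_gain = b ** m
ecam_bits = ecam_max_gain.bit_length()      # bits needed to hold the gain exactly
ecam_digits = len(str(ecam_max_gain))       # decimal digits of the gain

# Largest HCAM gain for the same match: f(c) = m**r.
hcam_max_gain = m ** r
hcam_bits = hcam_max_gain.bit_length()

# The ECAM gain still fits in an IEEE double, but only as an approximation.
fits_in_double = ecam_max_gain < sys.float_info.max

print(ecam_bits, ecam_digits, hcam_bits, fits_in_double)
```

Under these assumptions the ECAM needs a word of about 300 bits for a single gain, while the order-5 HCAM needs a few tens of bits, which illustrates why the exponential weighting is hard to realize in hardware for long patterns.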
The proof of the above capacity is trivial and is omitted here. According to the above formula, the ECAM has a storage capacity that scales exponentially with the number of bits, N, in the memory patterns; however, the dynamic range required of an exponentiation hardware implementation also grows exponentially with N. Therefore, it is almost impossible to realize in hardware for processing patterns with large dimensions.

4.3.1 Simulations and Results

First of all, we simulate the HCAM with an order of 3 and find that quite a few stored patterns cannot be discriminated. Then the HCAM with an order of 5 is simulated and the performance is greatly improved. Figure 4.3 (a) shows the HCAM (order 3) pattern recognition performance for 68 contaminated stored patterns, while Figure 4.3 (b) presents the HCAM (order 3) target discrimination performance. The target discrimination rate is the same as the pattern recognition rate under low contamination, while the target discrimination rate is 9.1% greater than the pattern recognition rate at 0 dB SNR. Compared to the Hopfield net, the HCAM's performance is clearly superior, although both networks are recurrent autoassociative nets. In pattern recognition, the HCAM of order 3 is also much better than the BAM. Since the HCAM is an autoassociative net and the stored aspect pattern codes are encoded from the real lab-measured responses, the stored aspect patterns did not have a designed distinct target code for every 17 aspect patterns of each target. On the other hand,
we gave the BAM a postdesigned code set, target code + dummy code, with prior known information about the aspect-target relationship to associate with the real world measured aspect patterns. Therefore, a comparison between the BAM's target discrimination performance and the HCAM's is not appropriate. The target discrimination comparison between the BAM with target group code and the HCAM-GI cascade network, discussed later, will be meaningful. Figure 4.4 presents the HCAM (order 3) generalization performance for 64 contaminated unstored patterns, and the result shows that the number of unknowns increases. The HCAM performances for both stored and unstored patterns show that the HCAM prefers to leave a contaminated pattern unrecognized rather than to classify it into a wrong pattern or target. Figure 4.5 (a) shows the pattern recognition performance of the HCAM with an order of 5 for 68 contaminated stored patterns, while Figure 4.5 (b) presents the target discrimination performance.

Figure 4.3 HCAM (order 3) network performances vs. SNR (dB) for the 68 stored patterns. (a) Pattern recognition performance. (b) Target discrimination performance. [Plots not reproduced. Horizontal axis: SNR (dB), -10 to 40. Vertical axes: Network Pattern Recognition Performance (a), Network Target Discrimination Performance (b). Solid: correctly recognized or discriminated; dashed: unrecognized; dash-dot: wrongly recognized or discriminated.]

Figure 4.4 HCAM (order 3) generalization performance for the 64 unstored patterns. [Plot not reproduced. Horizontal axis: SNR (dB), -10 to 40. Solid: correctly discriminated; dashed: unrecognized; dash-dot: wrongly discriminated.]
Figure 4.5 HCAM (order 5) network performances vs. SNR (dB) for the 68 stored patterns. (a) Pattern recognition performance. (b) Target discrimination performance. [Plots not reproduced. Horizontal axis: SNR (dB), -10 to 40. Solid: correctly recognized or discriminated; dashed: unrecognized; dash-dot: wrongly recognized or discriminated.]

Compared to the HCAM with an order of 3, the HCAM with an order of 5 has greatly reduced unrecognized rates and correspondingly higher correct discrimination rates. Under serious contamination, some distorted aspect patterns converge to a wrong aspect pattern of their own target; this decreases only the correct pattern recognition performance and does not degrade the target discrimination. Figure 4.6 shows the generalization performance of the HCAM with an order of 5 for 64 contaminated unstored patterns. Compared to the HCAM of order 3, the HCAM of order 5 also has greatly improved generalization performance. Figure 4.7 (a) shows the ECAM pattern recognition performance for 68 contaminated stored patterns, while Figure 4.7 (b) presents the ECAM target discrimination performance. The ECAM performances for both pattern and target discrimination are better than those of the HCAM with an order of 5. It is apparent that the ECAM almost leaves no unknown
behind even under severe distortions. Since the ECAM exponentially amplifies the correlation gains and then multiplies each stored pattern by its individual gain, the resolution space is extremely large. Thus the ECAM will still blindly classify a nonsensically ambiguous input, or even arbitrary noise, into one of the stored patterns by calculating the exponentially exaggerated correlations between the input and each individual stored pattern. This characteristic is good for light or fair contamination but is not reliable for severe distortion. Compared to the HCAM, the ECAM converges all unknowns to correct patterns or targets for contamination lighter than 3 dB SNR, and divides the unknowns into correct and wrong discriminations for distortion heavier than 3 dB SNR. At -6 dB SNR, the risk becomes higher than one half. The wrong discrimination rates increase dramatically for severe distortions, since the coding scheme is not linear when the contamination range is statistically greater than 1.5 quantization levels, as analyzed in the previous chapter. But the ECAM's tendency to converge everything is a characteristic of the network itself and does not depend on the coding scheme. Figure 4.8 shows the ECAM generalization performance for 64 contaminated unstored patterns. Compared to the HCAM, it also has excellent performance for fairly contaminated unstored patterns. As predicted, the ECAM converges everything and leaves no unknown for contaminated unstored patterns.

Figure 4.6 HCAM (order 5) generalization performance for the 64 unstored patterns. [Plot not reproduced. Horizontal axis: SNR (dB), -10 to 40. Solid: correctly discriminated; dashed: unrecognized; dash-dot: wrongly discriminated.]

4.4 RCAM-GI Cascade Networks

From the previous section, the HCAM of order 5 still left quite a few unknowns behind even for light contamination. If the HCAM uses a larger order, the network performance will improve, but the computation space will increase.
The larger order HCAM will waste resources under light contamination. From the last chapter, we know that the Generalized Inverse (GI) network can converge most unknowns to the correct target under fair contamination, and prefers to leave an ambiguous state unrecognized rather than to wrongly discriminate it. Another advantage of the GI network is that it requires only a small implementation memory. Moreover, our purpose is to discriminate targets by using variable aspect inputs. Therefore, it is reasonable to expect that an RCAM-GI cascade network will save much memory and converge more unknowns. We have constructed the GI network by initializing a correlation memory and then learning to converge all desired associative pattern pairs. In our GI net, the desired associative patterns have a multiple-to-one mapping, so that multiple aspect responses map to the same target.

Figure 4.7 ECAM network performances vs. SNR (dB) for the 68 stored patterns. (a) Pattern recognition performance. (b) Target discrimination performance. [Plots not reproduced. Horizontal axis: SNR (dB), -10 to 40. Solid: correctly recognized or discriminated; dashed: unrecognized; dash-dot: wrongly recognized or discriminated.]

Figure 4.8 ECAM generalization performance for the 64 unstored patterns. [Plot not reproduced. Horizontal axis: SNR (dB), -10 to 40. Solid: correctly discriminated; dashed: unrecognized; dash-dot: wrongly discriminated.]
The GI net initially has a correlation-based record memory and then learns to enlarge the basin of attraction of each stored pattern. If a given input falls into one of those basins, the net converges to the target with which the basin is associated. By using this learned attractive potential, we can use an HCAM of moderate order to save space and leave unknowns behind. The cascaded GI net will then converge those unknowns attracted by its basins and still leave ambiguous ones unrecognized.

4.4.1 Simulations and Results

Figure 4.9 (a) shows the HCAM (order 3)-GI cascade network performance for 68 contaminated stored patterns, while Figure 4.9 (b) presents the HCAM (order 3)-GI cascade network generalization performance for 64 contaminated unstored patterns. As expected, compared to Figure 4.3 (b), the HCAM (order 3)-GI cascade network correctly converges all unknowns left by the HCAM (order 3) for SNRs greater than 6 dB. For serious distortions, the cascade network converges most unknowns and leaves some of them unrecognized, and the ratio of correct convergence to wrong convergence is greater than 1. Even compared to the HCAM (order 5), the HCAM (order 3)-GI cascade network still has better performance for SNRs greater than 0 dB. This result fulfills our expectation that the HCAM-GI cascade net not only saves processing memory but also increases discrimination performance. The cascade network has better generalization performance than the HCAM with an order of 3 or 5, except that it wrongly discriminates one slightly contaminated unstored pattern. Figure 4.10 (a) shows the HCAM (order 5)-GI cascade network performance for 68 contaminated stored patterns, while Figure 4.10 (b) presents the HCAM (order 5)-GI cascade network generalization performance for 64 contaminated unstored patterns. Again, the HCAM (order 5)-GI cascade network has better performance than the HCAM of order 5.
Compared to the HCAM (order 3)-GI cascade net, the HCAM (order 5)-GI cascade network also demonstrates further improved performance, due to the fact that the HCAM of order 5 performs better than the HCAM of order 3. This time the HCAM (order 5)-GI cascade network does not wrongly discriminate any slightly contaminated unstored pattern, and it shows better generalization performance than the HCAM of order 5. Figure 4.11 (a) shows the ECAM-GI cascade network performance for 68 contaminated stored patterns, while Figure 4.11 (b) presents the ECAM-GI cascade network generalization performance for 64 contaminated unstored patterns. Except that the unknown left by the ECAM at -3 dB SNR is converged, the ECAM-GI cascade network performance is the same as that of the ECAM. Since the extremely large discrimination space of the ECAM leaves no unrecognized states for the cascaded GI network to converge further, the ECAM-GI cascade net has no room to improve performance. Similarly, the ECAM-GI cascade network has the same generalization performance as the ECAM.

Figure 4.9 HCAM (order 3)-GI cascade network performances vs. SNR (dB). (a) Target discrimination performance for the 68 stored patterns. (b) Generalization performance for the 64 unstored patterns. [Plots not reproduced. Horizontal axis: SNR (dB), -10 to 40. Vertical axis: Network Target Discrimination Performance. Solid: correctly discriminated; dashed: unrecognized; dash-dot: wrongly discriminated.]
Figure 4.10 HCAM (order 5)-GI cascade network performances vs. SNR (dB). (a) Target discrimination performance for the 68 stored patterns. (b) Generalization performance for the 64 unstored patterns. [Plots not reproduced. Horizontal axis: SNR (dB), -10 to 40. Vertical axis: Network Target Discrimination Performance. Solid: correctly discriminated; dashed: unrecognized; dash-dot: wrongly discriminated.]

Figure 4.11 ECAM-GI cascade network performances vs. SNR (dB). (a) Target discrimination performance for the 68 stored patterns. (b) Generalization performance for the 64 unstored patterns. [Plots not reproduced. Horizontal axis: SNR (dB), -10 to 40. Vertical axis: Network Target Discrimination Performance. Solid: correctly discriminated; dashed: unrecognized; dash-dot: wrongly discriminated.]

CHAPTER 5
Recurrent Correlation Accumulation Adaptive Memories

5.1 Introduction

From the previous chapter, for an HCAM we need to guess, before processing the input, the order at which the network can discriminate well among all possible inputs. Some inputs may be easily discriminated, while some may be more difficult. We also know that it is nearly impossible to physically realize an ECAM for patterns of large dimension, since it exponentially expands the network computation space.
It is also unreasonable and wasteful that the ECAM blindly reserves a huge memory without considering what the variable inputs actually require. Such indiscriminate use gives no information about the input contamination when the output appears. If a network can use flexible and sufficient orders of correlation to reach similar performance, then it will be a better choice. Thus we wish to use a dynamic order, dependent on the input, to discriminate among the stored patterns.

5.2 Recurrent Correlation Accumulation Adaptive Memories (RCAAM)

When the recurrent update of the RCAM amplifies the individual correlations between the initial given input and the stored patterns, it also introduces noisy crosscorrelation terms between any two stored patterns i and j with $i \ne j$. Assume we have P patterns $\{x^{(i)} \mid i = 1, 2, \ldots, P\}$ stored in memory, where $x^{(i)}$ is an m-dimension column vector, so that $X = \{x^{(1)}, x^{(2)}, \ldots, x^{(P)}\}$. If $x^{(i)}$ has binomial (binary or bipolar) components and s is an initial input, then the HCAM of order r will have the output state s'' after two synchronous updates

$$ s'' = \mathrm{Sign}\Big\{ \sum_{i=1}^{P} \Big[ (x^{(i)})^{T}\, \mathrm{Sign}\Big( \sum_{j=1}^{P} \big[(x^{(j)})^{T} s\big]^{r} x^{(j)} \Big) \Big]^{r} x^{(i)} \Big\} \qquad (1) $$

Therefore it does not purely amplify the correlations between the initial given input and the stored patterns, and the nonlinear threshold function Sign prevents the recurrent iterations from linearly accumulating the respective pattern correlation gains, $[(x^{(i)})^{T} s]^{r}$, produced by the last iteration. The recurrent feedback introduces noisy crosscorrelation terms. If the network can remove the nonlinear interference caused by the threshold function Sign and accumulate the previous respective correlation gains for each stored pattern, then the network will function efficiently and stably. No nonlinear interference means that the respective (cross) correlation terms generated during the last iteration are amplified linearly, and the accumulation over previous recurrent iterations raises the order of correlation more quickly.
We propose a Recurrent Correlation Accumulation Adaptive Memory (RCAAM) which uses a dynamic memory structure to accumulate the correlation information between the input(s) and all stored patterns. The discrimination resolution (ability) of the network for an input then increases as the number of recurrent iterations increases. Compared to the ECAM and HCAM, the RCAAM uses a recurrent, accumulative and dynamic structure to gradually converge a given input to some (semi-)stable state(s). We may regard this network as a real-time learning network. As long as the recurrent operations continue, the network adjusts its real-time learning structure to converge the given input to the nearest stable state associated with the stored patterns. And, unlike Multi-layer Feedforward Error-Backpropagation learning, it can avoid being trapped in local minimum states.

Suppose the stored associative pattern pairs are $\{(\xi^{(i)}, \zeta^{(i)}) \mid i = 1, \ldots, P\}$, where $\xi^{(i)}$ is a column vector with length m and $\zeta^{(i)}$ is a column vector with length n. We intend to implement a neural network that can recall $\xi^{(i)}$ if given an input sufficiently close to $\zeta^{(i)}$. If U is an n-dimension column vector input and init is a positive initial order, then we can construct the initial RCAAM, $M_0$, as follows:

$$ M_0 = \sum_{i=1}^{P} \xi^{(i)} \Big[ \big( (\zeta^{(i)})^{T} U \big)^{init}\, \zeta^{(i)} \Big]^{T} = \sum_{i=1}^{P} \xi^{(i)} \big( w_0^{(i)} \zeta^{(i)} \big)^{T} \qquad (2) $$

where $w_0^{(i)}$ is the dynamic weighting for the i-th stored pattern at time 0. If init=0, then $M_0$ reduces to the Hopfield memory with nonzero diagonal. Suppose the stored patterns $\xi^{(i)}$ have bipolar form. Then, with this initial memory matrix, the network output and the updated memory are given by

$$ V_0 = \mathrm{Sign}\{ M_0 U \} = \mathrm{Sign}\Big\{ \sum_{i=1}^{P} \big[ (w_0^{(i)} \zeta^{(i)})^{T} U \big]\, \xi^{(i)} \Big\} \qquad (3) $$

$$ M_1 = \sum_{i=1}^{P} \xi^{(i)} \Big\{ \big[ (w_0^{(i)} \zeta^{(i)})^{T} U \big]\, \zeta^{(i)} \Big\}^{T} = \sum_{i=1}^{P} \xi^{(i)} \big( w_1^{(i)} \zeta^{(i)} \big)^{T}, \qquad V_1 = \mathrm{Sign}\{ M_1 U_1 \} = \mathrm{Sign}\Big\{ \sum_{i=1}^{P} \big[ (w_1^{(i)} \zeta^{(i)})^{T} U_1 \big]\, \xi^{(i)} \Big\} \qquad (4) $$

where the dynamic weighting for the stored pattern $\zeta^{(i)}$ at iteration 1 is $w_1^{(i)} = (w_0^{(i)} \zeta^{(i)})^{T} U$.
Then the adaptive memory and network output at time k are given by

$$ M_k = \sum_{i=1}^{P} \xi^{(i)} \Big\{ \big[ (w_{k-1}^{(i)} \zeta^{(i)})^{T} U_{k-1} \big]\, \zeta^{(i)} \Big\}^{T} = \sum_{i=1}^{P} \xi^{(i)} \big( w_k^{(i)} \zeta^{(i)} \big)^{T}, \qquad V_k = \mathrm{Sign}\{ M_k U_k \} = \mathrm{Sign}\Big\{ \sum_{i=1}^{P} \big[ (w_k^{(i)} \zeta^{(i)})^{T} U_k \big]\, \xi^{(i)} \Big\} \qquad (5) $$

where $U_k = U$ for both RCAAM/fi and RCAAM/ad, $U_k = V_{k-1}$ for RCAAM/di, and $w_k^{(i)} = (w_{k-1}^{(i)} \zeta^{(i)})^{T} U_{k-1}$ is the weighting for the stored pattern $\zeta^{(i)}$ at time k. The algorithm shows that the RCAAM has dynamic stored patterns weighted by $w_k^{(i)}$ at time k, and each pattern weighting reflects the correlation accumulated over the iterations between the recurrent input states of the network and the stored pattern itself. Therefore, as the recurrent iterations proceed, the weightings of those stored patterns that are closer to the given input become larger than the others. Equivalently, by real-time adjustment of the individual pattern weightings of the dynamic memory, the recurrent outputs gradually adapt to the closest stored pattern. Since the pattern weighting adjustments are made in parallel along each stored pattern vector direction at each iteration, there is no local minimum trap phenomenon in the RCAAM. With its dynamic memory structure, the accumulation eliminates the oscillation phenomenon which occurs in recurrent Hopfield nets. Although the network may have semi-stable states at which it stays for a finite number of iterations, its adaptive accumulation memory will act to leave those states, if the coding scheme used by the network allows it. Therefore, the network updates will not be trapped in an oscillation cycle. For RCAAM/fi and RCAAM/ad, the pattern weighting iteration, $w_k^{(i)} = (w_{k-1}^{(i)} \zeta^{(i)})^{T} U$, does not involve the nonlinear threshold function Sign, and thus crosscorrelation interference terms among stored patterns do not become a problem. Therefore, the correlation accumulation becomes linear. This useful processing structure is possible for RCAAM/fi and RCAAM/ad, but not for the RCAMs.
This advantage again stems from the dynamically accumulative memory structure. Compared to [39], which trains the associative memory off-line by Linear Programming or Sequential Multiple Training to guarantee the recall of stored patterns, the RCAAM learns on-line and adjusts the stored pattern weightings to converge the given input to the closest stored pattern. The RCAAM does not have a predetermined order or a fixed memory matrix, so the network is quite flexible and applicable to any kind of distorted input. The network takes only a few recurrent iterations to recognize a slightly distorted stored pattern, and requires more recurrent iterations to discriminate a more ambiguous input.

Although the RCAAM/di feeds its output back to the input, the previous correlations between the recurrent input states and each stored pattern have been sequentially accumulated in memory. The accumulated correlation gains then act as a reinforcement term that keeps the next output associated with the last state and, through it, with the original input. This reminds the network of its past converging trace. Equivalently, the accumulated correlation gains in the RCAAM/di play the part of an adaptive momentum which directs the network along the converging trace with less confusion and keeps the converging steps consistent. The momentum is enhanced if the stored patterns are close to it. If the stored patterns are not in accordance with it, the momentum is adjusted and the current one gradually decays. The recurrent feedback gives the RCAAM/di great adaptive elasticity and enhances the target group idea. If the patterns in a group are consistent and close to the input, then the recurrent output states will be clearly subject to the group attraction. Therefore, the recurrent feedback lets the RCAAM/di retain great adaptive elasticity, especially for networks with small initial orders, even under momentum guidance.
Then, compared to the RCAAM/fi and the RCAAM/ad, the cooperation of adaptive elasticity and momentum in the RCAAM/di speeds up its convergence for fairly or severely contaminated inputs. This adaptive elasticity is especially helpful for partially inconsistent pattern groups, in which some patterns may be closer to patterns of other groups than to those of their own group. For such groups, an input may be a little closer to some particular pattern belonging to another group than to the patterns of its own group. Under this circumstance, the adaptive elasticity favors the group attraction, since multiple consistent attractions, under low correlation order, prevail and iteratively adapt the input to its own group before any high order correlation manipulation takes place. Typically, the RCAAM/di requires less space than the RCAAM/fi and the RCAAM/ad. Aspect-sensitive radar target scattering usually exhibits this partial inconsistency.

5.2.1 Implementations of the RCAAM

For ease of implementation, the RCAAM can be expressed in matrix form. The initial memory $M_0$ is

$$ M_0 = \sum_{i=1}^{P} \xi^{(i)} \Big[ \big( (\zeta^{(i)})^{T} U \big)^{init}\, \zeta^{(i)} \Big]^{T} = \sum_{i=1}^{P} \xi^{(i)} \big( w_0^{(i)} \zeta^{(i)} \big)^{T} = \Xi \cdot \big\{ \mathrm{Diag}\big[ (Z^{T} U)\,.\wedge\,(init) \big] \big\} \cdot Z^{T} = \Xi \cdot [\mathrm{Diag}(W_0)] \cdot Z^{T} \qquad (6) $$

where $Z = [\zeta^{(1)}\ \zeta^{(2)}\ \cdots\ \zeta^{(P)}]$, $\Xi = [\xi^{(1)}\ \xi^{(2)}\ \cdots\ \xi^{(P)}]$, $W_0 \equiv [w_0^{(1)}\ w_0^{(2)}\ \cdots\ w_0^{(P)}]$, $A\,.\wedge\,q \equiv [A_1^{q}\ A_2^{q}\ \cdots\ A_P^{q}]^{T}$ and

$$ \mathrm{Diag}(A) \equiv \begin{bmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & A_P \end{bmatrix}, \quad \text{if } A = [A_1\ A_2\ \cdots\ A_P]. $$

Then the network output is

$$ V_0 = \mathrm{Sign}\{ M_0 U \} = \mathrm{Sign}\Big\{ \sum_{i=1}^{P} \big[ (w_0^{(i)} \zeta^{(i)})^{T} U \big]\, \xi^{(i)} \Big\} = \mathrm{Sign}\big\{ \Xi \cdot [\mathrm{Diag}(W_0) \cdot (Z^{T} U)] \big\} \qquad (7) $$

Thus the dynamic memory at recurrent iteration time k is

$$ M_k = \sum_{i=1}^{P} \xi^{(i)} \Big\{ \big[ (w_{k-1}^{(i)} \zeta^{(i)})^{T} U_{k-1} \big]\, \zeta^{(i)} \Big\}^{T} = \Xi \cdot [\mathrm{Diag}(W_{k-1}) \cdot \mathrm{Diag}(Z^{T} U_{k-1})] \cdot Z^{T} = \sum_{i=1}^{P} \xi^{(i)} \big( w_k^{(i)} \zeta^{(i)} \big)^{T} = \Xi \cdot [\mathrm{Diag}(W_k)] \cdot Z^{T} \qquad (8) $$

where $W_k \equiv [w_k^{(1)}\ w_k^{(2)}\ \cdots\ w_k^{(P)}]$ and $w_k^{(i)} = w_{k-1}^{(i)} \big( (\zeta^{(i)})^{T} U_{k-1} \big)$, and the output is

$$ V_k = \mathrm{Sign}\{ M_k U_k \} = \mathrm{Sign}\Big\{ \sum_{i=1}^{P} \big[ (w_k^{(i)} \zeta^{(i)})^{T} U_k \big]\, \xi^{(i)} \Big\} = \mathrm{Sign}\big\{ \Xi \cdot [\mathrm{Diag}(W_k)\, Z^{T}] \cdot U_k \big\} = \mathrm{Sign}\big\{ \Xi \cdot [\mathrm{Diag}(W_k) \cdot (Z^{T} U_k)] \big\} \qquad (9) $$

Again, RCAAM/fi and RCAAM/ad have $U_k = U$, while RCAAM/di has $U_k = V_{k-1}$. Therefore, the dynamic memories $M_k$ and the evolution outputs $V_k$ for RCAAM/fi and RCAAM/ad can be further simplified as

$$ M_k = \sum_{i=1}^{P} \xi^{(i)} \Big\{ \big[ (w_{k-1}^{(i)} \zeta^{(i)})^{T} U \big]\, \zeta^{(i)} \Big\}^{T} = \Xi \cdot [\mathrm{Diag}(W_{k-1}) \cdot \mathrm{Diag}(Z^{T} U)] \cdot Z^{T} = \Xi \cdot \big\{ \mathrm{Diag}\big[ (Z^{T} U)\,.\wedge\,(init+k) \big] \big\} \cdot Z^{T} $$

$$ V_k = \mathrm{Sign}\{ M_k U \} = \mathrm{Sign}\Big\{ \sum_{i=1}^{P} \big[ (w_k^{(i)} \zeta^{(i)})^{T} U \big]\, \xi^{(i)} \Big\} = \mathrm{Sign}\big\{ \Xi \cdot [\mathrm{Diag}(W_k) \cdot (Z^{T} U)] \big\} = \mathrm{Sign}\big\{ \Xi \cdot [ (Z^{T} U)\,.\wedge\,(init+k+1) ] \big\} \qquad (10) $$

For heteroassociative memory ($\zeta^{(i)}$ is not equal to $\xi^{(i)}$), we have 68 stored aspect response patterns belonging to four different targets, so Z has 68 columns. Suppose the B52 is encoded by [1 -1 -1 -1], the B58 by [-1 1 -1 -1], the F14 by [-1 -1 1 -1] and the TR1 by [-1 -1 -1 1]; then $\xi^{(1)} \sim \xi^{(17)} = [1\ {-1}\ {-1}\ {-1}]^{T}$, $\xi^{(18)} \sim \xi^{(34)} = [-1\ 1\ {-1}\ {-1}]^{T}$, $\xi^{(35)} \sim \xi^{(51)} = [-1\ {-1}\ 1\ {-1}]^{T}$ and $\xi^{(52)} \sim \xi^{(68)} = [-1\ {-1}\ {-1}\ 1]^{T}$. From the above realization algorithms, the network only requires storage for P=68 dynamically weighted stored patterns, 4 target group codes, one current input pattern and one output. Since the RCAAM has a dynamic memory structure, there may be semi-stable states at which the network stays until the correlation accumulation is high enough to escape the temporary spurious state and move toward other states. We define the stability criterion sc by $V_k = V_{k-1} = \cdots = V_{k-sc}$. Therefore, if the stability criterion sc is adopted, we regard the state $V_k$ which satisfies $V_k = V_{k-1} = \cdots = V_{k-sc}$ as the network discrimination pattern. This indicates that the RCAAM has a flexible decision strategy for an ambiguous input, allowing us either to leave it unknown or to force it to one of the stored patterns. To leave those spurious states as unknown, the stability criterion can be set to a low value (e.g., 1 or 2). If a definite discrimination is required, the stability criterion can be set to a high value (e.g., larger than 2). This allows convergence to one of the stored patterns.
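The matrix-form recursion and the stability criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code of this study: the function name `rcaam`, its arguments, the toy 8-bit patterns and 3-bit group codes, and the mapping of Sign(0) to +1 are all hypothetical. The mode "fi" keeps the input fixed ($U_k = U$), and the mode "di" feeds the previous output back ($U_k = V_{k-1}$, which requires m = n).

```python
import numpy as np

def sign(v):
    # Bipolar threshold; sending v == 0 to +1 is an assumption.
    return np.where(v >= 0, 1, -1)

def rcaam(U, Z, Xi, init=2, sc=2, mode="fi", max_iter=50):
    """Sketch of the RCAAM recursion in matrix form.

    Z  : (n, P) stored patterns zeta_i as columns.
    Xi : (m, P) associated patterns xi_i as columns.
    Returns (V_k, k) once V_k = V_{k-1} = ... = V_{k-sc}, else (None, max_iter).
    The weights grow geometrically, so floats are used throughout.
    """
    Uk = np.asarray(U, dtype=float)
    W = (Z.T @ Uk) ** init             # W_0 = (Z^T U) .^ init (integer init assumed)
    history = []
    for k in range(max_iter):
        corr = Z.T @ Uk                # correlations with the current input state
        V = sign(Xi @ (W * corr))      # V_k = Sign{Xi [Diag(W_k) (Z^T U_k)]}
        history.append(V)
        # Stability criterion: the last sc outputs all equal the current one.
        if len(history) > sc and all(
            np.array_equal(V, h) for h in history[-sc - 1:-1]
        ):
            return V, k
        W = W * corr                   # W_{k+1} = W_k .* (Z^T U_k), no Sign involved
        if mode == "di":
            Uk = V.astype(float)       # dynamic input: U_{k+1} = V_k
    return None, max_iter

# Toy data: three orthogonal 8-bit patterns (columns of Z) and 3-bit group codes.
Z = np.array([
    [1,  1,  1,  1, 1,  1,  1,  1],
    [1, -1,  1, -1, 1, -1,  1, -1],
    [1,  1, -1, -1, 1,  1, -1, -1],
]).T
codes = np.array([
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])

probe = Z[:, 1].copy()
probe[0] = -probe[0]                   # second pattern with one bit flipped

# Heteroassociative recall of the group code with a fixed input.
V_fi, k_fi = rcaam(probe, Z, codes, init=2, sc=2, mode="fi")
# Autoassociative recall with output feedback.
V_di, k_di = rcaam(probe, Z, Z, init=2, sc=2, mode="di")
print(V_fi, k_fi, V_di, k_di)
```

In this sketch a small `sc` returns as soon as a few consecutive outputs agree, while a larger `sc` demands a longer run of identical outputs before a decision is accepted; a `None` result after `max_iter` iterations can be treated as an unrecognized input.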
The RCAAM requires only the minimum computation scale space which the RCAM can possibly offer to discriminate an arbitrarily given input. Thus the RCAAM not only needs less processing space than the RCAM, but also performs the same or better.

5.3 Performance of the Recurrent Correlation Accumulation Adaptive Memory with dynamic input (RCAAM/di)

In this section, we simulate the RCAAM/di with several different initial orders and stability criteria. First we simulate the RCAAM/di performances with respect to stability criterion for three different initial orders, 2, 3 and 6. Then the performances of the RCAAM/di with respect to initial order are simulated for three different stability criteria, 2, 5 and 10. Finally, RCAAM/di performances for groups with different average orders are compared. If an RCAAM/di has an initial order p and a stability criterion q, then we abbreviate this RCAAM/di by dipq. The order required for the network to satisfy its stability criterion (i.e. for discrimination) is p+k, if the network takes k recurrent iterations to satisfy its stability criterion. Thus the average order is calculated by summing the orders required for 10 SNR's and 4 targets, and then taking the average value. There are four average order groups simulated; the first group has an average order between 9 and 10, i.e. 9 [...]

Figure 5.2 Correct and wrong pattern discrimination performances of the RCAAM/di with an initial order of 2 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns. (a) Correct pattern discrimination performances vs. stability criterion. (b) Wrong pattern discrimination performances vs. stability criterion.

Figure 5.3 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 2 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.

Figure 5.4 Unrecognized discrimination of the RCAAM/di with an initial order of 2 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns.

Figure 5.5 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 2 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.

[...] 6.5%. The wrong discriminations of the RCAAM/di3 hardly change, while its correct target discriminations increase along with sc.
To further analyze and derive a generalized rule for how sc affects RCAAM/di's with different initial orders, an RCAAM/di with an initial order of 6 (di6) is also simulated. Figure 5.11 to Figure 5.15 show the simulation performances of the RCAAM/di with an initial order of 6 vs. sc. Comparing the performances of these three RCAAM/di's (di2, di3 and di6), two interesting characteristics can be found. (1) The higher the initial order, the flatter the performance curve. (2) The flatter the performance curve, the higher the sc needed to converge unstored pattern inputs. Characteristic 1 means that an RCAAM/di with a higher initial order is less affected by sc than one with a lower initial order. Characteristic 2 means that an RCAAM/di with a higher initial order needs a larger sc to converge all unstored pattern inputs. Comparing the unrecognized performances of the three RCAAM/di's for both stored and unstored pattern inputs, the di2 has the steepest slopes, the di3 has moderate ones, and the di6 has the flattest ones. Characteristic 2 can be considered as an interpretation of network adaptive elasticity. For an RCAAM/di with a large initial order, the successive iterations after the first output have less convergent ability, since the high order correlation spans a large space and separates the states from each other by larger distances. This means the first iteration output of the network will be at some state which is far away from any other state. If the first output is at some ambiguous state, then it is hard to turn this deeply trapped state back to some stored state. Therefore, we can say the deeply trapped state has dramatically lost its convergent elasticity and has a great chance of staying around those ambiguous states for many iterations. Consider the case in which an RCAAM/di has stored pattern inputs with an SNR of 14 dB, i.e. the plus mark '+'.
Then the di2 needs an sc of 8 to converge unknowns in Figure 5.4, the di3 needs an sc of 7 in Figure 5.9, and the di6 only requires an sc of 3 in Figure 5.14. The reverse property appears when we consider the performances with unstored pattern inputs. With an SNR of 40 dB, the di2 requires only an sc of 4 to converge unknowns, the di3 needs an sc of 6, and the di6 needs an sc of 8 to complete the task. Under 14 dB, the di2 requires an sc of 8 to converge unknowns, the di3 needs an sc of 10, and the di6 needs an sc at least as large as 10. There are two observations here. For lightly contaminated stored inputs, the RCAAM/di with a high initial order requires a smaller sc to converge unknowns, while a low initial order RCAAM/di needs a larger sc to complete the task. For unstored pattern inputs, the RCAAM/di with a high initial order needs a larger sc to converge unknowns, if convergence is possible at all, while a low initial order RCAAM/di requires a smaller sc to finish the job. To explain these seemingly inconsistent simulation performances, let's consider two different factors that affect the convergence speed along with sc. First consider an input close to one of the stored patterns. The high order correlation will bring the input much closer to its true stored pattern. Here the high initial order thus speeds up the convergence. This factor explains why the di6 converges lightly contaminated stored inputs with a small sc, although it has the flattest performance curve. Next consider an input that is not clearly close to any one of the stored patterns, or is ambiguous among them. Such an input may be a linear combination of a couple of stored patterns, or a pattern which is close to one specific stored pattern but only a little less close to several stored patterns belonging to the same target.
Figure 5.6 Unrecognized discrimination of the RCAAM/di with an initial order of 3 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 68 stored patterns.

Figure 5.7 Correct and wrong pattern discrimination performances of the RCAAM/di with an initial order of 3 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns. (a) Correct pattern discrimination performances vs. stability criterion. (b) Wrong pattern discrimination performances vs. stability criterion.

Figure 5.8 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 3 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.
Figure 5.9 Unrecognized discrimination of the RCAAM/di with an initial order of 3 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns.

Figure 5.10 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 3 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.

Then the high order correlation may deeply force the input into a spurious state and trap it there for many iterations. The network will thus need several iterations to compensate for the initial overemphasis and then gradually turn it toward a stable state, if possible. Here the high initial order thus slows down the convergence. This factor can explain why the di6 needs a higher sc than the di3 and di2 to converge the unstored pattern inputs, if convergence is possible at all. Since the RCAAM/di has its output fed back to the network input, the current output state depends on the previous output state. Thus the RCAAM/di is a causal system and its convergence is successively improved iteration by iteration. The RCAAM/di also accumulates previous correlation gains in its memory. As the recurrent operation goes on, the converging continues but the adaptive elasticity gradually declines.
Since a high initial order acts similarly, but not identically, to a long-term iteration accumulation, a high initial order RCAAM/di will have low adaptive elasticity. Conversely, an RCAAM/di with a small initial order has high adaptive elasticity to converge unstored pattern inputs to stable states. We know the radar target responses become less consistent within some aspect ranges. Therefore, if an input is a little closer to one specific stored pattern than to its own target patterns, then a high initial order will force the input to initially approach the wrong pattern. The strong momentum introduced by the high initial order can then hardly be released by future recurrent feedback, since the updated input state might have become much closer to the inconsistent pattern. We mentioned that the recurrent feedback of the RCAAM/di will enhance the target group idea. For inconsistent aspect patterns, networks with small initial orders may have better discriminations than ones with high initial orders, since multiple consistent attractions from group patterns, under low correlation order, will prevail and then iteratively adapt the input to its own group before possible high order correlation manipulation.

Figure 5.11 Unrecognized discrimination of the RCAAM/di with an initial order of 6 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 68 stored patterns.

Figure 5.12 Correct and wrong pattern discrimination performances of the RCAAM/di with an initial order of 6 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns. (a) Correct pattern discrimination performances vs. stability criterion. (b) Wrong pattern discrimination performances vs. stability criterion.

Figure 5.13 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 6 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.

Figure 5.14 Unrecognized discrimination of the RCAAM/di with an initial order of 6 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns.

Figure 5.15 Correct and wrong target discrimination performances of the RCAAM/di with an initial order of 6 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.
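The argument that several consistent group attractions prevail at low correlation order, while a single slightly closer pattern from another group takes over at high order, can be checked with a small numerical sketch. The correlation values below (three own-group patterns at 48 and one intruding pattern at 54) are hypothetical, and comparing summed gains is a deliberate simplification of the full coded output of the network.

```python
def attraction(correlations, order):
    """Summed correlation gain of a set of patterns at a given order."""
    return sum(c ** order for c in correlations)


def prevailing_side(own_group, intruder, order):
    """Compare the own-group attraction with that of one intruding pattern."""
    own = attraction(own_group, order)
    other = attraction([intruder], order)
    if own == other:
        return "tie"
    return "own group" if own > other else "intruder"


# Hypothetical correlations of one input with stored patterns.
OWN_GROUP = [48, 48, 48]  # three consistent patterns of the input's own target
INTRUDER = 54             # one slightly closer pattern of another target

for order in (1, 2, 6, 9, 10, 12):
    print(order, prevailing_side(OWN_GROUP, INTRUDER, order))
```

With these numbers the group wins as long as (54/48)^order stays below 3, which holds up to order 9 and fails from order 10 on. A network that starts at a low order therefore gets several feedback iterations under group-dominated attraction before the accumulated order becomes high enough for a single pattern to dominate.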
5.3.2 RCAAM/di performances with respect to initial order effect

In this subsection, we analyze the performances of the RCAAM/di with fixed stability criteria vs. initial order. First let's abbreviate the RCAAM/di with a stability criterion of q by 'RCAAM/di_q' or 'di_q'. Figure 5.16 shows the unrecognized discrimination of the RCAAM/di with a stability criterion of 2 (i.e. RCAAM/di_2 or di_2) vs. initial order for contaminated stored pattern inputs, while Figure 5.17 presents the correct pattern discrimination in (a) and wrong pattern discrimination in (b). Figure 5.18 presents the correct target discrimination of di_2 vs. initial order in (a) and wrong target discrimination in (b). The correct pattern and target discriminations rise as the initial order increases for light contaminations, while they look like downward curves for the heavy noise case (0 dB). The wrong pattern and target discriminations fall as the initial order increases for light contaminations, while they look like upward curves for the heavy noise case. For light contamination, the increasing curves can be explained by factor 1 in the above subsection. Under heavy noise, increasing the initial order will greatly help an RCAAM/di with a small initial order to converge spurious states, and will appropriately stiffen the seriously soft elasticity so that the distorted pattern is attracted by target group correlation gains. Too soft an elasticity may not only have low efficiency but also induce a coding scheme problem. For a small initial order, an increment in the initial order will help to overcome the coding scheme problem described later. But if the initial order is too large, the adaptive elasticity will become too small to turn some spurious states back, as described by factor 2 in the above subsection.
This explains why the downward curve has a maximum occurring between the smallest and the largest initial order, and the upward curve has a minimum occurring between the smallest and the largest initial order. Figure 5.19 shows the unrecognized discrimination of RCAAM/di_2 for unstored pattern inputs, while Figure 5.20 presents the correct target discrimination in (a) and wrong target discrimination in (b). From the previous subsection, we know the high initial order RCAAM/di has higher correct performances at a low sc and flatter performance curves than the low initial order one. Since the sc is only set to 2, we only see the increasing portion.

Figure 5.16 Unrecognized discrimination of the RCAAM/di with a stability criterion of 2 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 68 stored patterns.

Figure 5.17 Correct and wrong pattern discrimination performances of the RCAAM/di with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for 68 stored patterns. (a) Correct pattern discrimination performances vs. initial order. (b) Wrong pattern discrimination performances vs. initial order.

Figure 5.18 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.

Figure 5.19 Unrecognized discrimination of the RCAAM/di with a stability criterion of 2 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 64 unstored patterns.

Figure 5.20 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.

To generalize a rule for the sc effect on the performances of the RCAAM/di along with initial order, we simulate another two sets with stability criteria of 5 and 10, RCAAM/di_5 and RCAAM/di_10. Figure 5.21 to Figure 5.25 show the simulation results of RCAAM/di_5, while Figure 5.26 to Figure 5.30 present the performances of RCAAM/di_10.

Let's analyze the unrecognized discriminations for the three stability criteria of 2, 5 and 10 with contaminated stored pattern inputs, i.e. Figure 5.16, Figure 5.21 and Figure 5.26. All three unrecognized discriminations decrease as expected when the initial order changes from 1 to 6. Figure 5.21 shows that the unrecognized discrimination of RCAAM/di15 is larger than that of RCAAM/di05 under light contamination (40 dB and 14 dB). This may be explained as a coding scheme effect. By definition, a lightly contaminated stored pattern should still be close to its true pattern. Consider three close patterns belonging to the same target. Suppose one of these three stored patterns is lightly contaminated; then the sum of weighted stored patterns is a vector closest to the weighted linear combination of those three patterns. Then the coding scheme (3-bit coding, 7 levels) and the nonlinear threshold function (Sign function) may result in a state code which is not particularly close to one of those three stored patterns but close to another stored pattern. Therefore, with a stability criterion of 5, an initial order of 0 might leave the lightly distorted pattern in a wrong stored pattern. An initial correlation one order higher may pull it to an intermediate state, which is neither a correct nor a wrong state; this intermediate state then contributes to the unrecognized discrimination, while a wrong pattern (possibly correct target) discrimination is eliminated. Therefore, the curves rise for 40 and 14 dB SNR's. Furthermore, another increase of the initial order may converge the intermediate state to the correct stored state. This time the correct discrimination will be credited while the intermediate state is eliminated.
The curves thus fall. This hypothesis can be verified by investigating Figure 5.22 and Figure 5.23. In Figure 5.22, when the initial order changes from 0 to 1, the wrong pattern discrimination decreases but the correct pattern discrimination does not increase by the same amount. This indicates that some portion of the eliminated wrong discriminations goes to intermediate states instead of correct stored states. From Figure 5.23 (a), you can see the correct target discrimination decreases when the initial order increases from 0 to 1 for the light contamination case (40 dB). This indicates that a wrong pattern discrimination can be a correct target discrimination, and that the wrong pattern discrimination can be converged to an intermediate state instead of a correct target when the initial order only increases by one. From the above subsection, we know a higher sc converges more states, i.e. it has a higher capability to turn a spurious state into a stable state. The coding scheme effect does not bother the RCAAM/di with an sc of 10, while it interferes with the RCAAM/di with an sc of 5. Compared to the RCAAM/di_10, the RCAAM/di_5 still leaves some states that could be converged. From the correct and wrong discrimination performances for the three cases, all the correct and wrong discriminations are pulled up when the stability criterion is increased. These are consistent with the previous performances and discussions. Let's investigate the unrecognized discriminations for unstored pattern inputs, i.e. Figure 5.19, Figure 5.24 and Figure 5.29. From the discussion in the previous subsection, we have two rules describing the initial order effect on the performance curve shapes and converging speeds along with sc. The higher initial order RCAAM/di has flatter performance curves but converges slowly vs. sc. And factor 2 told us a high initial order RCAAM/di won't converge unknowns for unstored pattern inputs until a high sc is given, since it has lost a lot of adaptive elasticity.
Under contaminations of 40 dB and 14 dB SNR, consider initial orders ranging from 1 to 6 with unstored pattern inputs. In Figure 5.19, the RCAAM/di with an initial order of 6 has the minimum unknowns and the RCAAM/di with an initial order of 1 has the maximum unknowns. When the sc is increased to 5, the above rules are invoked. In Figure 5.24, the RCAAM/di6's lose their places, while the RCAAM/di1's and RCAAM/di2's become the new minima. When the sc continues increasing to 10, factor 2 shows up. In Figure 5.29, the RCAAM/di6's not only lose their minima but also become the maxima, while the RCAAM/di3's and RCAAM/di4's continue converging their unknowns. The downward curves of correct target discriminations for stability criteria of 5 and 10, Figure 5.25 and Figure 5.30, again show the consistency.

Figure 5.21 Unrecognized discrimination of the RCAAM/di with a stability criterion of 5 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 68 stored patterns.

Figure 5.22 Correct and wrong pattern discrimination performances of the RCAAM/di with a stability criterion of 5 under 40, 14 and 0 dB SNR vs. initial order for 68 stored patterns. (a) Correct pattern discrimination performances vs. initial order. (b) Wrong pattern discrimination performances vs. initial order.

Figure 5.23 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 5 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.

Figure 5.24 Unrecognized discrimination of the RCAAM/di with a stability criterion of 5 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 64 unstored patterns.

Figure 5.25 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 5 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.

Figure 5.26 Unrecognized discrimination of the RCAAM/di with a stability criterion of 10 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 68 stored patterns.
Figure 5.27 Correct and wrong pattern discrimination performances of the RCAAM/di with a stability criterion of 10 under 40, 14 and 0 dB SNR vs. initial order for 68 stored patterns. (a) Correct pattern discrimination performances vs. initial order. (b) Wrong pattern discrimination performances vs. initial order.

Figure 5.28 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 10 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.

Figure 5.29 Unrecognized discrimination of the RCAAM/di with a stability criterion of 10 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 64 unstored patterns.

Figure 5.30 Correct and wrong target discrimination performances of the RCAAM/di with a stability criterion of 10 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.

5.3.3 RCAAM/di performances for the same averaged order group

In this subsection we evaluate the performances of RCAAM/di's with similar discrimination orders. Comparing RCAAM/di's that belong to the same discrimination order group can help us select and design an RCAAM/di which will satisfy our specifications under constrained available memory. Suppose an RCAAM/di with an initial order of p and a stability criterion of q (i.e. dipq) requires k iterations to satisfy its stability criterion for an input; then we say the order required by RCAAM/dipq for this input discrimination is p+k. Figure 5.31 shows the averaged orders required for the discriminations (aord) of RCAAM/di's with initial orders from 0 to 6 vs. stability criterion for stored pattern inputs, while Figure 5.32 presents the averaged order required for the discriminations of unstored pattern inputs. For a fixed initial order, the aord increases nearly linearly with sc; the slope is larger for small sc and approaches 1 as sc becomes large. This is especially apparent for the RCAAM/di with a small initial order, such as di0 and di1. With sc fixed to 5, the di0 has its aord larger than the di1's, di2's, di3's and di4's, while the di1 has its aord still larger than the di2's. These phenomena indicate that the soft elasticity associated with low initial orders will struggle longer to adapt to stored patterns under heavy contaminations.
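The averaged order (aord) defined above is simply the mean of p+k over all simulated SNR and target combinations. A minimal sketch of that bookkeeping follows; the function name and the iteration counts are hypothetical and are not taken from the simulations.

```python
def averaged_order(initial_order, iterations):
    """Mean of p + k over all cases.

    initial_order : the initial order p of the RCAAM/di
    iterations    : mapping (snr_db, target) -> iterations k needed to
                    satisfy the stability criterion for that case
    """
    orders = [initial_order + k for k in iterations.values()]
    return sum(orders) / len(orders)


# Hypothetical iteration counts for a network with initial order 2.
ITERATIONS = {
    (40, "B52"): 2,
    (40, "F14"): 3,
    (0, "B52"): 6,
    (0, "F14"): 9,
}

print(averaged_order(2, ITERATIONS))  # (4 + 5 + 8 + 11) / 4
```

Because every additional iteration raises the accumulated order by one, aord is also a direct proxy for the computation scale a given dipq configuration needs, which is why it is used here to group networks for comparison.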
Since a low initial order, 0 or 1, is unable to point out potentially efficient routes of convergence at the beginning, a slightly advantaged pattern might become more ambiguous after the feedback mechanism of the RCAAM/di, due to the linear combination of weighted patterns and the coding scheme problem. From Figure 5.31 and Figure 5.32, we can see that an initial order larger than 1 will greatly reduce (or eliminate) the inefficiency in aord resulting from the too soft elasticity problem.

Figure 5.31 Averaged orders required for target discrimination of the RCAAM/di with contaminated stored pattern inputs vs. stability criterion. (Legend: dotted = initial order of 0, point = 1, circle = 2, star = 3, plus = 4, x-mark = 6.)

Figure 5.32 Averaged orders required for target discrimination of the RCAAM/di with contaminated unstored pattern inputs vs. stability criterion. (Legend: dotted = initial order of 0, point = 1, circle = 2, star = 3, plus = 4, x-mark = 6.)

We categorize four RCAAM/di groups by using their aord of stored pattern inputs: group aord9, group aord10, group aord11 and group aord13. Group aord9 defines the RCAAM/di's with 9 [...]

Figure 5.50 Correct and wrong pattern discrimination performances of the RCAAM/fi with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns. (a) Correct pattern discrimination performances vs. stability criterion.
(b) Wrong pattern discrimination performances vs. stability criterion.

Figure 5.51 Correct and wrong target discrimination performances of the RCAAM/fi with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion. (Markers: circle, 40 dB; plus, 14 dB; x-mark, 0 dB.)

Figure 5.52 Unrecognized discrimination of the RCAAM/fi with an initial order of 1 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns.

Figure 5.53 Correct and wrong target discrimination performances of the RCAAM/fi with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.

RCAAM/fi_2 vs.
initial order for stored pattern inputs in (a) and the wrong pattern discrimination in (b), while Figure 5.56 presents the correct target discrimination in (a) and the wrong target discrimination in (b). From the latter two figures, one can see that the oscillation occurring in Figure 5.54 results from the oscillating wrong pattern or target discrimination. In our simulations, we set a maximum convergent cycle: the recurrent updating procedure continues until either the sc is satisfied or the number of recurrent iterations reaches the maximum convergent cycle. In the latter case, the recurrent updating procedure stops and the final state is given as the network output. We have set the maximum convergent cycle to 40, 60 or 120 iterations. Therefore, with an even initial order p, the final accumulation gain of a negative correlation gives a positive amplification, since an even p plus an even maximum convergent cycle is an even integer. This final state contributes to a wrong target discrimination if the pattern with the negative correlation gain has the largest magnitude among all correlation gains and belongs to a target different from the input's. This explains why the fi22, fi42 and fi62 have peaks in wrong discriminations. By definition, stored pattern inputs are only slightly distorted by light contamination, and a large negative correlation gain rarely exists between a slightly contaminated stored pattern and any stored pattern. For an odd initial order p, the final accumulation gain of a negative correlation gives a negative amplification, since an odd p plus an even maximum convergent cycle is an odd integer. The RCAAM/fi cannot recognize a negated stored pattern as one of the stored patterns, since it only memorizes and recognizes the stored patterns themselves. Therefore, the fi12, fi32 and fi52 leave those negatively amplified patterns unrecognized when the iteration reaches the maximum convergent cycle.
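The parity argument can be checked numerically. The sketch below assumes that the accumulated gain of a stored pattern is its correlation with the input raised to the accumulated order p + k; the function name and the sample correlation value are illustrative only and do not come from the simulations.

```python
def accumulated_gain(correlation, initial_order, iterations):
    """Assumed accumulated gain: correlation raised to the order p + k."""
    return correlation ** (initial_order + iterations)


MAX_CYCLE = 40  # one of the maximum convergent cycles used in the simulations
NEGATIVE_CORRELATION = -0.8  # hypothetical normalized correlation with the input

for p in range(1, 7):
    gain = accumulated_gain(NEGATIVE_CORRELATION, p, MAX_CYCLE)
    sign = "positive" if gain > 0 else "negative"
    print(f"initial order {p}: accumulated gain is {sign}")
```

Under this assumption the sign alternates with the parity of the initial order whenever the maximum convergent cycle is even, which is consistent with the alternation between wrong and unrecognized discriminations discussed above.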
Figure 5.57 shows the unrecognized discrimination of RCAAM/fi_2 vs. initial order for distorted unstored pattern inputs. Again the unrecognized performance oscillates with the initial order at 0 dB. Figure 5.58 shows the correct target discriminations of RCAAM/fi_2 vs. initial order for distorted unstored pattern inputs in (a), and the wrong target discriminations in (b). In (b), the oscillation of wrong discrimination becomes larger. The wrong discriminations for odd initial orders all have the same performance, as do those for even initial orders. The correct target discriminations for lightly contaminated unstored patterns are worse than those in Figure 5.53, since the RCAAM/fi_2's shown in Figure 5.58 only use an sc of 2. Figure 5.59 shows how the aord's of RCAAM/fi1 change with sc for contaminated stored and unstored pattern inputs. Under the same sc, the aord of unstored patterns is larger than that of stored patterns. Note that the RCAAM/fi's aord depends on the maximum convergent cycle, since the discrimination order equals the initial order plus the maximum convergent cycle if an oscillation occurs. Compared to Figure 5.31 and Figure 5.32, the RCAAM/fi has a higher aord than the RCAAM/di. Since the feedback mechanism allows RCAAM/di to fully employ the adaptive elasticity in accordance with an appropriate sc to speed up convergence, the RCAAM/di converges sooner than the RCAAM/fi under heavy contamination. Moreover, the RCAAM/di does not suffer the oscillation problem that occurs in RCAAM/fi, so its aord is always smaller than a fair maximum convergent cycle. In other words, the RCAAM/fi keeps the original input intact at the expense of slower convergence as the recurrent accumulative adaptations continue.
Figure 5.54 Unrecognized discrimination of the RCAAM/fi with a stability criterion of 2 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 68 stored patterns. (Markers: circle, 40 dB; plus, 14 dB; x-mark, 0 dB.)

Figure 5.55 Correct and wrong pattern discrimination performances of the RCAAM/fi with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for 68 stored patterns. (a) Correct pattern discrimination performances vs. initial order. (b) Wrong pattern discrimination performances vs. initial order.

Figure 5.56 Correct and wrong target discrimination performances of the RCAAM/fi with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.
Figure 5.57 Unrecognized discrimination of the RCAAM/fi with a stability criterion of 2 under 40 dB, 14 dB and 0 dB SNR vs. initial order for 64 unstored patterns.

Figure 5.58 Correct and wrong target discrimination performances of the RCAAM/fi with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.

Figure 5.59 Averaged orders required for target discrimination of the RCAAM/fi with contaminated stored pattern inputs vs. stability criterion. (Legend: circle, initial order of 1 with stored inputs; plus, initial order of 1 with unstored inputs.)

5.5 Performance of the Recurrent Correlation Accumulation Adaptive Memory with analog input and digital output (RCAAM/ad)

Since the 3-bit coding with 7 quantization levels suffers nonlinearity interference when contamination becomes serious, in this section we use analog inputs to eliminate the nonlinearity problem. We simulate the performances of RCAAM/ad with different initial orders and stability criteria.
From section 5.2 we know the only difference between RCAAM/fi and RCAAM/ad is the network input form: RCAAM/ad uses the analog input form instead of the digital (bipolar) form used by RCAAM/fi. We simulate RCAAM/ad with an initial order of 1 (RCAAM/ad1 or ad1) under different stability criteria, and RCAAM/ad with its sc fixed to 2 (RCAAM/ad_2 or ad_2) under different initial orders. For four targets using RCAAM/ad, we have 68 100-point aspect responses to store. First, we calculate the energy of each of the 68 100-point responses and then normalize each response to 1 by its respective energy. These 68 normalized patterns are then stored in RCAAM/ad. When an input (a 100-point response) is presented to the network, we calculate its energy and then add noise corresponding to the desired simulation SNR. Then we calculate the energy of the noisy input, and finally normalize the noisy input to 1. The normalized input is then presented at the network input as the network process input pattern. Figure 5.60 shows the unrecognized discrimination of RCAAM/ad with an initial order of 1 (RCAAM/ad1) vs. sc for contaminated stored pattern inputs for 40 dB, 14 dB and 0 dB SNR. Figure 5.61 shows the correct pattern discrimination of RCAAM/ad1 vs. sc in (a) and the wrong pattern discrimination in (b), while Figure 5.62 presents the correct target discrimination in (a) and the wrong target discrimination in (b). We find that RCAAM/ad1 has excellent, error-free target discrimination even under severe distortion (0 dB SNR). Therefore, for distorted stored pattern inputs, RCAAM/ad has the best converged discrimination performance among the three versions of RCAAM. Comparing (b) of Figure 5.61 with (b) of Figure 5.62, we realize that all wrong pattern discriminations result from the recognition of different aspect angles of the same target, and thus contribute to correct target discriminations. A strange phenomenon is observed: the correct discrimination of RCAAM/ad12 (i.e.
sc=2) with an SNR of 0 dB is larger than the one with 40 dB. This phenomenon did not occur in the RCAAM/fi. When the analog signal form is adopted, the pattern linearity is not distorted at all. Most aspect responses are consistent within a fair angle range, so some adjacent analog aspect patterns may be very close to each other. If this happens, we call these patterns with very close neighbors uneasy patterns. If the input is an uneasy pattern, the correlation gains of its neighbors are also large and close to its own. Therefore, the output may be a linear combination of the uneasy pattern and its adjacent stored patterns, since each analog stored pattern in RCAAM/ad has its own associated bipolar stored pattern. If the neighbor patterns are very close, their linear combination may become a spurious stable state which is stable only for a few iterations. Therefore, an sc of 2 might still leave the network in a spurious state, while a higher sc can force it to leave the spurious state. A large distortion may shift an uneasy pattern so as to favor either itself or one of its neighbors. Therefore, with a low sc of 2, the unrecognized discrimination of RCAAM/ad1 with 0 dB SNR is less than that for 40 dB, and a higher sc eliminates this strange phenomenon. Why does the same phenomenon not happen to the RCAAM/fi with an sc of 2? The 3-bit coding with 7 quantization levels is a nonlinear procedure, and the coding mapping from analog signal to digital form increases the correlation discrimination resolution. This means that the coding scheme interprets uneasy patterns with higher correlation discrimination resolution, and thus the RCAAM/fi with an sc of 2 can discriminate more easily based on correlation gains than the RCAAM/ad_2 under light contamination. The resolutions of correlation discrimination for analog and bipolar signals are formally analyzed in the next section.
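The input preparation described earlier in this section (energy normalization of the stored responses, additive noise at a desired SNR, and renormalization of the noisy input) can be sketched as follows. This is only an illustration under stated assumptions: "normalize to 1" is read as scaling to unit energy, the SNR is taken as signal energy over expected noise energy, the noise is white Gaussian, and the synthetic 100-point response stands in for measured data. All names are hypothetical.

```python
import math
import random


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


def normalize_energy(signal):
    """Scale a signal to unit energy (assumed meaning of 'normalize to 1')."""
    e = energy(signal)
    if e == 0.0:
        raise ValueError("cannot normalize an all-zero signal")
    scale = 1.0 / math.sqrt(e)
    return [v * scale for v in signal]


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise whose expected energy gives the requested SNR."""
    noise_energy = energy(signal) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_energy / len(signal))
    return [v + rng.gauss(0.0, sigma) for v in signal]


def prepare_input(response, snr_db, rng):
    """Contaminate a response at snr_db, then normalize the noisy result."""
    return normalize_energy(add_noise(response, snr_db, rng))


rng = random.Random(0)
# Synthetic damped sinusoid standing in for a 100-point aspect response.
response = [math.sin(0.2 * n) * math.exp(-0.03 * n) for n in range(100)]
stored = normalize_energy(response)
noisy = prepare_input(response, 0.0, rng)
print(round(energy(stored), 6), round(energy(noisy), 6))
```

With unit-energy patterns the autocorrelation of every stored pattern equals 1, so correlation gains of different stored patterns can be compared on a common scale.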
Figure 5.63 shows the unrecognized discrimination of RCAAM/ad1 vs. sc for contaminated unstored pattern inputs under 40 dB, 14 dB and 0 dB SNR. Figure 5.64 presents the correct target discrimination of RCAAM/ad1 vs. sc for contaminated unstored patterns in (a) and the wrong target discrimination in (b). Again the RCAAM/ad1 has excellent, error-free target discrimination for seriously distorted unstored patterns, and the unrecognized discrimination for 0 dB SNR is again less than that for 40 dB SNR. This time, however, the cause is different from that in the previous case. Since each unstored test pattern lies midway between two adjacent stored patterns in our simulations, a severely contaminated unstored pattern may have a better chance of shifting toward either adjacent stored pattern than an uncontaminated unstored pattern. Therefore, a heavily distorted unstored pattern may be closer to a true stored pattern than a slightly distorted unstored pattern. This seemingly unreasonable phenomenon results from the deterministic arrangement of the stored and unstored test patterns. It is not expected in practical situations. Figure 5.65 shows the correct target discrimination of RCAAM/ad with an sc of 2 (RCAAM/ad_2) vs. initial order for contaminated stored patterns in (a) and the wrong target discrimination in (b). Figure 5.66 presents the correct target discrimination of RCAAM/ad_2 vs. initial order for contaminated unstored patterns in (a) and the wrong target discrimination in (b). As explained in the previous section on RCAAM/fi, the performances of RCAAM/ad are independent of their initial orders. The phenomena that occurred in the RCAAM/ad12 with contaminated stored patterns and in the RCAAM/ad1 with distorted unstored patterns also exist in RCAAM/ad_2 with initial orders around 1.
We should notice that the phenomenon for stored patterns will be eliminated by larger initial orders, since a sufficiently large initial order is capable of discriminating the uneasy stored patterns. Figure 5.67 shows the aord of RCAAM/ad vs. sc for contaminated stored patterns and unstored patterns. From this figure, there are approximately linear relationships between aord and sc in both cases.

5.6 Comparisons of Correlation-based Discrimination Resolutions Between Analog and Bipolar Patterns

In the previous section, we found that the RCAAM/ad with a fair sc has higher unrecognized discriminations for lightly contaminated stored patterns than the RCAAM/fi, though it has excellent correct and error-free target discrimination performances for seriously distorted patterns. This indicates that the analog data used by RCAAM/ad may have lower correlation-based discrimination resolution than the digital data used by RCAAM/fi. Figure 5.68 shows the orders required for the discriminations of RCAAM/fi1 vs. SNR for contaminated stored patterns, while Figure 5.69 presents the orders required for the discriminations of RCAAM/ad1. With an SNR equal to or larger than 6 dB, the RCAAM/fi12 and RCAAM/fi15 always have smaller discrimination orders than the RCAAM/ad12 and RCAAM/ad15, respectively. When the SNR is less than 6 dB,

Figure 5.60 Unrecognized discrimination of the RCAAM/ad with an initial order of 1 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 68 stored patterns.
Figure 5.61 Correct and wrong pattern discrimination performances of the RCAAM/ad with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for 68 stored patterns. (a) Correct pattern discrimination performances vs. stability criterion. (b) Wrong pattern discrimination performances vs. stability criterion. (Markers: circle, 40 dB; plus, 14 dB; x-mark, 0 dB.)

Figure 5.62 Correct and wrong target discrimination performances of the RCAAM/ad with an initial order of 1 under 40, 14 and 0 dB SNR vs. stability criterion for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.

Figure 5.63 Unrecognized discrimination of the RCAAM/ad with an initial order of 1 under 40 dB, 14 dB and 0 dB SNR vs. stability criterion for 64 unstored patterns.

Figure 5.64 Correct and wrong target discrimination performances of the RCAAM/ad with an initial order of 1 under 40, 14 and 0 dB SNR vs.
stability criterion for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. stability criterion. (b) Wrong target discrimination performances vs. stability criterion.

Figure 5.65 Correct and wrong target discrimination performances of the RCAAM/ad with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.

Figure 5.66 Correct and wrong target discrimination performances of the RCAAM/ad with a stability criterion of 2 under 40, 14 and 0 dB SNR vs. initial order for the 64 unstored patterns belonging to 4 targets. (a) Correct target discrimination performances vs. initial order. (b) Wrong target discrimination performances vs. initial order.
Figure 5.67 Averaged orders required for target discrimination of the RCAAM/ad with contaminated stored pattern inputs vs. stability criterion. (Legend: point, initial order of 0 with stored inputs; circle, initial order of 1 with stored inputs; x-mark, initial order of 0 with unstored inputs; plus, initial order of 1 with unstored inputs.)

the RCAAM/fi1 has much larger discrimination orders than the RCAAM/ad. The latter phenomenon results from the code linearity breakdown under severe contamination. As described in Chapter 3, the coding scheme we use remains linear if the contamination amplitude is statistically less than 1.5 quantization levels. When the contamination level exceeds the linearity range, the nonlinearity results in many ambiguous states which are not apparently close to any stored pattern. Therefore, the discrimination order increases abruptly. The former observation indicates that the analog data carries lower discrimination resolution than the encoded bipolar data. Let us roughly prove this as follows. In our simulation, there are 4 targets and each target has 17 aspect response patterns stored in the network. Suppose the i-th stored pattern, $p_i$, is a row vector and $g_i$ is its autocorrelation; then $g_i = p_i * (p_i)^T$, where $*$ denotes matrix multiplication and $(p_i)^T$ stands for the transpose of the row vector $p_i$. Consider an RCAAM/fi or RCAAM/ad with an initial order of 0 and recurrent iteration k. Then, from section 2, the stored pattern $p_i$ has autocorrelation gain or weight $g_i$ at iteration 1, and autocorrelation gain or weight $g_i^k$ at iteration k, if the input is pattern $p_i$.

Assumption 1: The output bipolar codes, associated with the analog stored patterns in RCAAM/ad and with themselves in RCAAM/fi, are consistent within each target. Therefore, the output bipolar codes vary smoothly with aspect angle for each target.
Assumption 2: Only active patterns, whose correlation gains are greater than a threshold value, can participate in generating the next state in the output stage. This assumption is practical, since a small correlation gain becomes negligible compared to a large one after a few iterations.

Assume the target t has n active stored aspect patterns w.r.t. an input. Usually we set the active threshold to some percentage, such as 30% or 20%, of the maximum correlation gain of the data form. Then define a normalized correlation gain for this target t w.r.t. the input by $G_t$, where

[Equation (11): $G_t$ expressed through the summation of the correlation gains $g_i$ of the n active patterns for iteration 1, and of $g_i^k$ for iteration k; the exact expression is illegible in the source.]

From assumption 2, only active patterns can contribute to the normalized target correlation gain. Since assumption 1 assumes the output codes are consistent within each target, summation can be used to enhance the same target code. For finite positive real values, we know the mean of the sum is equal to or greater than the n-th root of the product. Therefore,

[Equation (12): the resulting inequality for the sum of $g_i^k$ over $i = 1, \dots, n$; the exact expression is illegible in the source.]

The equality holds only when all the correlation gains ($g_i$'s) have the same value. From the rightmost term, the normalized correlation gain $G_t$ decreases by a power of n as the recurrent iterations continue. Finally, this gives us a simplified way to evaluate and then compare the normalized target correlation gains between the analog data and the bipolar data. In the following simulations, we assume the active patterns in the output stage have correlation gains greater than a threshold value. Thus, under this assumption, only active patterns contribute to the normalized target correlation gain. To evaluate the correlation-based discrimination resolution, we calculate the normalized target crosscorrelation gain between different targets. Then, for each target, we compare its normalized target autocorrelation gain to the other three normalized target crosscorrelation gains.
If the ratios of the target autocorrelation gain (1 after normalization) to the target crosscorrelation gains are high for a target, then that target has high correlation-based recognition resolution. If the ratio of the target autocorrelation gain to the target crosscorrelation gains is around 1, the target has low correlation-based recognition resolution and will not be easily recognized. Since each target has 17 aspect patterns stored, the appearance probability of each aspect is 1/17 for a given target. In our simulations, the bipolar pattern has 300 bits and a maximum correlation gain of 300, while the normalized analog data has 100 values and a maximum correlation gain of 1. For example, if we set the active threshold to 30% of the maximum correlation gain, then, for a given input, a stored pattern in RCAAM/fi is regarded as active only if its correlation gain with the input is greater than 90 (300 × 30%), and a stored pattern in RCAAM/ad is considered active only if its correlation gain with the input is greater than 0.3 (1 × 30%). For each stored pattern, we calculate the correlations between itself and all 68 stored patterns and then evaluate the normalized target correlation gains for all 4 targets using only the active correlations. For example, a stored aspect pattern of target B52 should result in more active patterns for the B52 and fewer active patterns for the other three targets. After all 68 stored aspect patterns are processed, the final normalized target correlation (autocorrelation and crosscorrelation) gains are given by the value averaged over the 17 aspect patterns of each target. The following tables show the simulation results for iteration k=1; the results for other values of k are similar. Table 5.1 shows the comparisons of normalized target correlation gains subject to a threshold of 30% between analog data and the encoded bipolar data.
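The thresholded evaluation described above can be sketched for a single input as follows. Because the exact form of the gain definition is not legible here, the sketch assumes that a target's gain is the plain sum of its active correlation gains raised to the k-th power, divided by the same quantity for the input's own target. The 8-bit toy patterns, target labels and function names are hypothetical, and the averaging over the 17 aspects of each target is omitted.

```python
def correlation(a, b):
    """Inner product of two equal-length patterns."""
    return sum(x * y for x, y in zip(a, b))


def normalized_target_gains(input_pattern, input_target, stored,
                            threshold, max_gain, k=1):
    """Per-target sum of active gains**k, normalized by the input target's sum.

    stored is a list of (target_name, pattern) pairs; a stored pattern is
    active when its correlation with the input exceeds threshold * max_gain.
    """
    totals = {}
    for target, pattern in stored:
        gain = correlation(input_pattern, pattern)
        totals.setdefault(target, 0.0)
        if gain > threshold * max_gain:
            totals[target] += gain ** k
    reference = totals.get(input_target, 0.0)
    if reference == 0.0:
        raise ValueError("input target has no active stored pattern")
    return {target: total / reference for target, total in totals.items()}


# Toy memory: two targets with two 8-bit bipolar aspect patterns each.
STORED = [
    ("A", [1, 1, 1, 1, 1, 1, 1, 1]),
    ("A", [1, 1, 1, 1, 1, 1, 1, -1]),
    ("B", [1, 1, 1, 1, -1, -1, 1, 1]),
    ("B", [-1, -1, -1, -1, 1, 1, 1, 1]),
]
probe = STORED[0][1]
loose = normalized_target_gains(probe, "A", STORED, threshold=0.3, max_gain=8)
strict = normalized_target_gains(probe, "A", STORED, threshold=0.6, max_gain=8)
print(loose, strict)
```

In this toy memory, raising the threshold removes the only active pattern of the other target, so its normalized crosscorrelation gain drops to zero while the autocorrelation gain stays at 1.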
For the bipolar input form, an active threshold of 30% means a stored pattern can be active only if it has a correlation greater than 90 (300 × 30%) with the bipolar input. Equivalently, an active stored pattern has at least 195 (150 + 90/2) of its 300 bits the same as the bipolar input. With an active threshold of 30%, the analog input form brings 21.76% of the stored patterns into active operation, while the encoded bipolar input form involves only 8.69% of the stored patterns. Since the target correlation gains are finally normalized with respect to the input target, the normalized correlation gains in each row can be regarded as an estimate of the input target recognition resolution in this 4-target memory. The input target can be easily recognized within a few iterations if the normalized crosscorrelation gains of the other three targets w.r.t. the input target are much less than 1. For example, with a threshold of 30%, an analog aspect pattern of target F14 has a normalized crosscorrelation gain of 0.0409 with target B58 and a normalized crosscorrelation gain of 0.6048 with target TR1. Therefore, the RCAAM/ad recurrent operations are almost entirely spent distinguishing the input target, F14, from the target TR1, while the target B58 hardly interferes with the input pattern at all.

Figure 5.68 Orders required for the discriminations of RCAAM/fi1 vs. SNR for contaminated stored patterns. (Legend: circle, fi12; plus, fi15; x-mark, fi1X.)
Figure 5.69 Orders required for the discriminations of RCAAM/ad1 vs. SNR for contaminated stored patterns. (Legend: circle, ad12; plus, ad15; x-mark, ad1X.)

Comparing the analog portion with the bipolar portion in Table 5.1, we find that the analog input data has lower correlation-based discrimination resolution than the encoded bipolar data. This indicates that, for the same input, the encoded bipolar form is more easily discriminated than the analog form, and the bipolar input form requires a smaller discrimination order from the RCAAM/fi than the analog input form does from the RCAAM/ad. This comparison of correlation-based discrimination resolutions is consistent with the results of Figure 5.68 and Figure 5.69. For bipolar input targets, the input target B52 has the most 0's in the normalized crosscorrelation gains w.r.t. the other targets. Thus the target B52 is the most easily recognized target under the RCAAM/fi network subject to an assumed threshold of 30%. Table 5.2 shows the comparisons of normalized target correlation gains subject to a threshold of 20% between analog data and the encoded bipolar data. For the bipolar input form, an active threshold of 20% means a stored pattern can be active only if it has a correlation greater than 60 (300 × 20%) with the bipolar input. Equivalently, an active stored pattern has at least 180 (150 + 60/2) of its 300 bits the same as the bipolar input. Under this assumed active threshold, the analog input form brings 39.10% of the stored patterns into active operation, while the encoded bipolar input form makes 21.02% of the stored patterns active. In Table 5.2, in contrast to Table 5.1, two normalized target crosscorrelation gains of the bipolar form, the term of TR1 w.r.t. input target B58 and the term of B58 w.r.t. input target TR1, become greater than those of the analog form.
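The bit counts quoted for each threshold follow from a simple identity: for two N-bit bipolar patterns the correlation equals the number of matching bits minus the number of differing bits, so matches = (N + correlation) / 2. The short sketch below (function names are hypothetical) reproduces the quoted values.

```python
def matching_bits(correlation, n_bits):
    """Matching bits implied by a bipolar correlation: (N + c) / 2."""
    return (n_bits + correlation) / 2


def differing_bits(correlation, n_bits):
    """Differing bits implied by a bipolar correlation: (N - c) / 2."""
    return (n_bits - correlation) / 2


N_BITS = 300  # length of the encoded bipolar patterns
for percent in (30, 20, 10):
    threshold = N_BITS * percent / 100  # minimum correlation of an active pattern
    print(percent, matching_bits(threshold, N_BITS), differing_bits(threshold, N_BITS))
```

Lowering the threshold therefore admits stored patterns that agree with the input in little more than half of their bits, which is why the active population grows so quickly.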
Strictly speaking, for those two cross terms the analog form has higher discrimination resolution than the bipolar form. In general, however, the bipolar form still has much higher discrimination resolution than the analog form. Table 5.3 presents the comparisons of normalized target correlation gains subject to a threshold of 10% between analog data and the encoded bipolar data. This assumed active threshold makes 64.06% of the stored patterns active for the analog input form and 55.97% for the encoded bipolar input form. For the bipolar input form, being an active pattern only requires a correlation greater than 30 (300 × 10%) with the bipolar input. This means an active stored pattern may differ from the bipolar input in up to 135 (150 - 30/2) of its 300 bits. Besides the two cross terms appearing in Table 5.2, another two normalized target crosscorrelation gains of the bipolar form, the term of F14 w.r.t. input target B58 and the term of B58 w.r.t. input target F14, become greater than those of the analog form. Summation is used in (11) to enhance the same target code, since assumption 1 assumes the output codes are consistent within each target. The assumption implicitly indicates that a larger active threshold confines the active patterns within a more consistent region. The active correlation gains are then closer to each other, and the summation becomes more meaningful. The equality in (12) is approached when the correlation gains become closer. Therefore, a larger active threshold not only represents a more practical situation but also makes the estimates from the normalized target correlation gain formula more consistent and acceptable. Conversely, an active population of more than 50% is not only impractical but also violates the assumptions.
(%) of Input Normalized Normalized Normalized Normalized Data Active Target Correlation Correlation Correlation Correlation Form Patterns Gain of 852 Gain of 858 Gain of F14 Gain of TR] w.r.t. a w.r.t. Input w.r.t. Input w.r.t. Input w.r.t. Input Threshold Target Target Target Target of 30% Target 1.0000 0.2777 0.3623 0.2801 852 Target 0.2176 1.0000 0.0507 0.1214 858 Analog 14.79 (21.76%) Target 0.2289 0.0409 1.0000 0.6048 F14 Target 0.1741 0.0964 0.5951 1.0000 TRI Target 1.0000 0 0 0 852 Target 0 1.0000 0 0.0280 858 Bipolar 5.9] (869%) Target 0 0 1.0000 0.1402 F14 Target 0 0.0236 0.1518 1.0000 TR] Table 5.] Comparisons of normalized target correlation gains subject to an assumed active threshold of 30% between analog data and encoded bipolar data. 199 Input No. (%) of Input Normalized Normalized Normalized Normalized Data Active Target Correlation Correlation Correlation Correlation Form Patterns Gain of 852 Gain of 858 Gain ofF14 Gain of TR] w.r.t. a w.r.t. Input w.r.t. Input w.r.t. Input w.r.t. Input Threshold Target Target Target Target of 20% Target 1.0000 0.4431 0.5247 0.3753 852 Target 0.3392 1.0000 0.1927 0.2505 858 Analog 26.59 (39.10%) Target 0.3609 0.1740 1.0000 0.7168 F14 Target 0.2524 0.2196 0.6999 1.0000 TR] Target 1.0000 0.1020 0.0144 0.0909 852 Target 0.0608 1.0000 0.1115 0.2914 858 Bipolar 14.29 (21.02%) Target 0.0071 0.0917 1.0000 0.3398 F14 Target 0.0478 0.2567 0.3644 1.0000 TR] Table 5.2 Comparisons of normalized target correlation gains subject to an assumed active threshold of 20% between analog data and encoded bipolar data. 200 Input No. (%) of Input Normalized Normalized Normalized Normalized Data Active Target Correlation Correlation Correlation Correlation Form Patterns Gain of 852 Gain of 858 Gain of F 14 Gain of TR] w.r.t. a w.r.t. Input w.r.t. Input w.r.t. Input w.r.t. 
Input Threshold Target Target Target Target OfIOO/o Target 1.0000 0.5759 0.6286 0.4719 852 Target 0.4585 1.0000 0.3614 0.3617 858 Analog 43.56 (64.06%) Target 0.4655 0.3390 1.0000 0.7909 F14 Target 0.3360 0.3253 0.7720 1.0000 TR] Target 1.0000 0.4132 0.1662 0.2858 852 Target 0.2666 1.0000 0.3867 0.5235 858 Bipolar 38.06 (55.97%) Target 0.1007 0.3658 1.0000 0.5391 F14 Target 0.1794 0.5209 0.5624 1.0000 TR] Table 5.3 Comparisons of normalized target correlation gains subject to an assumed active threshold of 10% between analog data and encoded bipolar data. 20] 5.7 Performances of RCAAM-GI Cascade Networks From previous RCAAM/di, RCAAM/fl, and RCAAM/ad simulation performances, we see that the unrecognized discriminations reduce when the stability criterion increases. At the same time, the correct target discriminations of the RCAAM/di and RCAAM/fl are apparently improved while the wrong target discriminations rise only slightly. For the RCAAM/ad, the correct target discriminations increase without raising the wrong target discriminations. Although a higher sc will improve discrimination performances, especially for the networks with low initial orders, a higher sc will take more processing (calculation) time as well as larger computing memory (i.e. larger aord) to satisfy the higher stability criterion. And a high initial order not only has weakened adaptive elasticity and contamination observability but also raises the aord, i.e. computing space. A low initial order gives a high adaptive elasticity, and a low sc neither renders all convergeable correct target discriminations nor converges all possible wrong discriminations . Therefore, a low initial order and sc will quite often leave an ambiguous input in semi-stable states, which are stable only for a few iterations. The Generalized Inverse (GI) network is firrther trained after an initial correlation- based memory coding. 
It has the hybrid characteristic of being able to converge spurious states within attractive basins to their associated target codes. We should remember that the GI network has the notable advantage that it prefers to leave an ambiguous state unrecognized rather than converge it to a wrong target. Thus a combination network, in which the RCAAM is left with a low initial order and sc and is cascaded with a GI network, becomes more efficient without sacrificing possible convergence. This idea may seem unreasonable, since gaining an advantage without giving up something elsewhere seems contrary to nature.

5.7.1 Converging Efficiency Comparisons Between the RCAAM and the RCAAM-GI Cascade Network

Figure 5.70 shows the unrecognized discrimination comparisons between the RCAAM/di2 and the RCAAM/di22-GI cascade network versus stability criterion, for both contaminated stored and unstored patterns at 40 dB and 0 dB SNR. The RCAAM/di22-GI cascade network converges all unknowns in all cases, while the RCAAM/di2 with a high sc of 10 still struggles to converge both contaminated stored and unstored patterns at 0 dB SNR. Therefore, a low sc of 2 in the RCAAM/di22-GI cascade network saves at least the discrimination orders accumulated by the increment of 8 in sc used by the RCAAM/di2X, where X denotes 10. Thus the RCAAM/di22-GI cascade network is much more efficient in processing time and space than the RCAAM/di2.

Figure 5.71 presents the unrecognized discrimination comparisons between the RCAAM/fi1 and the RCAAM/fi12-GI cascade network versus stability criterion, for both contaminated stored and unstored patterns at 40 dB and 0 dB SNR. For the RCAAM/fi12-GI cascade network, the unknowns fail to converge completely only in the heavy contamination case, i.e. 0 dB SNR.
As discussed before, serious contamination results in coding linearity failure, and the RCAAM/fi mechanism then makes the output state fall into oscillation if the dominant correlation gain is negative. Therefore, the remaining unknowns should be attributed to the nonlinear coding scheme and the RCAAM/fi mechanism. It is notable that the RCAAM/fi12-GI cascade network leaves no unknowns for contaminated unstored patterns at 40 dB SNR, while the RCAAM/fi1 cannot converge all unknowns even when an sc as large as 40 is used. Here the efficiency benefit becomes huge: an sc as small as 2 in the RCAAM/fi12-GI replaces an sc as large as 40 in the RCAAM/fi1IVX, where IVX stands for 40. Later we will show that the performance of the RCAAM/fi12-GI network for contaminated unstored patterns at 40 dB does not drop, but is better than that of the RCAAM/fi1IVX.

Figure 5.72 shows the unrecognized discrimination comparisons between the RCAAM/ad1 and the RCAAM/ad12-GI cascade network versus stability criterion, for both contaminated stored and unstored patterns at 40 dB and 0 dB SNR. For the RCAAM/ad12-GI cascade network, the unknowns converge in all cases, since the analog data has no nonlinearity interference problem even under severe distortion. The RCAAM/ad1 still needs an sc of 15, larger than the sc of 2 used by the RCAAM/fi1, to converge the unknowns for slightly contaminated stored patterns. The reason, as given in the previous section, is that the analog data has lower correlation-based discrimination resolution than the encoded bipolar data. The RCAAM/ad1 converges the slightly contaminated unstored patterns only when a high sc of 30 is adopted. Compared to the RCAAM/di, the analog data form acts as a converging delay, which further emphasizes the efficiency of the RCAAM/ad12-GI cascade network.
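The role of the stability criterion in these comparisons can be illustrated with a small sketch. It assumes, as the text suggests, that sc counts the consecutive iterations for which the output state must remain unchanged before it is accepted as stable. The function names, the scripted toy update and the iteration cap are hypothetical, and the sketch does not reproduce the RCAAM update rule.

```python
def run_until_stable(step, state, sc, max_iter=50):
    """Iterate `step` until the state is unchanged for `sc` consecutive iterations.

    Returns (final_state, iterations_used, converged).
    """
    unchanged = 0
    for iteration in range(1, max_iter + 1):
        new_state = step(state)
        unchanged = unchanged + 1 if new_state == state else 0
        state = new_state
        if unchanged >= sc:
            return state, iteration, True
    return state, max_iter, False  # never met the criterion: left unrecognized


def scripted_step(script):
    """Hypothetical update rule that simply replays a fixed sequence of states."""
    remaining = list(script)

    def step(current):
        return remaining.pop(0) if remaining else current

    return step


# Toy trajectory: a semi-stable state "A" holds for two iterations,
# then the network moves on to the longer-lived state "B".
script = ["A", "A", "B", "B", "B", "B", "B", "B"]

low_sc = run_until_stable(scripted_step(script), "A", sc=2)
high_sc = run_until_stable(scripted_step(script), "A", sc=4)
print(low_sc)   # stops early on the semi-stable state
print(high_sc)  # needs more iterations, ends on the longer-lived state
```

In this toy run the low criterion stops after a few iterations on a state that would not have lasted, which is the situation the cascade addresses by handing the state to the GI network instead of iterating further.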
5.7.2 Noise Tolerance Comparisons Between the RCAAM and RCAAM-GI Cascade Network

From the last subsection, we know that the converging efficiency of the RCAAM-GI cascade network is much better than that of the RCAAM. But we do not yet know whether the RCAAM-GI cascade network can still discriminate well when its preceding stage, the RCAAM, uses only a low stability criterion. Here we simulate the RCAAM with three sc's under different contamination levels, and then compare their discrimination performances with those of the RCAAM-GI cascade network using the lowest sc.

[Figure 5.70 plot: Unrecognized (%) vs. Stable Criterion (2 to 10). Legend: solid, dashed, dashdot and dotted lines = di22-GI for stored/40 dB, stored/0 dB, unstored/40 dB and unstored/0 dB; circle, plus, star and x markers = di2 for the same four cases.]
Figure 5.70 Unrecognized discriminations of the RCAAM/di2 and RCAAM/di22-GI vs. initial order for 40 dB and 0 dB SNR.

[Figure 5.71 plot: Unrecognized (%) vs. Stable Criterion (5 to 40). Legend: solid, dashed, dashdot and dotted lines = fi12-GI for stored/40 dB, stored/0 dB, unstored/40 dB and unstored/0 dB; circle, plus, star and x markers = fi1 for the same four cases.]
Figure 5.71 Unrecognized discriminations of the RCAAM/fi1 and RCAAM/fi12-GI vs. initial order for 40 dB and 0 dB SNR.

[Figure 5.72 plot: Unrecognized (%) vs. Stable Criterion (5 to 30). Legend: solid, dashed, dashdot and dotted lines = ad12-GI for stored/40 dB, stored/0 dB, unstored/40 dB and unstored/0 dB; circle, plus, star and x markers = ad1 for the same four cases.]
Figure 5.72 Unrecognized discriminations of the RCAAM/ad1 and RCAAM/ad12-GI vs. initial order for 40 dB and 0 dB SNR.

Figure 5.73 shows the correct target discriminations of the RCAAM/di22-GI cascade network and the RCAAM/di2 for contaminated stored patterns vs. SNR in (a), and the wrong target discriminations in (b). Figure 5.74 presents the unrecognized discriminations of the RCAAM/di22-GI cascade network and the RCAAM/di2 for contaminated stored patterns vs. SNR in (a), and the discrimination orders in (b). The correct target discriminations always increase when the RCAAM/di2 changes its sc from 2 to 5 and then from 5 to 10. With this increase in sc, the wrong target discriminations show no change for moderate contamination and rise only slightly for serious noise. In both correct and wrong target discrimination performance, the RCAAM/di2X approximates the RCAAM/di22-GI. It seems that the RCAAM/di2's performance will converge to that of the RCAAM/di22-GI as the sc continues to increase. The unrecognized discrimination performances of the RCAAM/di2 and the RCAAM/di22-GI network also reveal this converging characteristic. Although the RCAAM/di22-GI cascade network sets an sc of only 2, it leaves the fewest unknowns for all contamination situations. The large differences between the discrimination orders needed by the RCAAM/di2 and those needed by the RCAAM/di22-GI network again demonstrate the high efficiency of the RCAAM/di22-GI.
Although the RCAAM/di2X expends a space about 8 orders higher than the RCAAM/di22-GI network, it does not give better discrimination performance.

Figure 5.75 shows the correct target discriminations of the RCAAM/di22-GI cascade network and the RCAAM/di2 for contaminated unstored patterns vs. SNR in (a), and the wrong target discriminations in (b). Figure 5.76 presents the unrecognized discriminations of the RCAAM/di22-GI cascade network and the RCAAM/di2 for contaminated unstored patterns vs. SNR in (a), and the discrimination orders in (b). For unstored patterns, the RCAAM/di22-GI network's advantage over the RCAAM/di2 is again apparent. Specifically, the RCAAM/di25's correct target discrimination begins falling at 30 dB. The RCAAM/di22-GI network has the fewest unknowns and the lowest discrimination orders for all contamination situations.

[Figure 5.73 plots vs. SNR (dB): (a) Correct Target Discriminations (%); (b) Wrong Target Discriminations (%). Legend: solid = di22-GI, dashdot = di22, dashed = di25, dotted = di2X.]
Figure 5.73 Performance comparisons of the RCAAM/di's with an initial order of 2 and RCAAM/di22-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Correct target discrimination vs. SNR for the 68 stored patterns. (b) Wrong target discrimination vs. SNR for the 68 stored patterns.

[Figure 5.74 plots vs. SNR (dB): (a) Unrecognized (%), same line styles as Figure 5.73; (b) Orders Required for Network Discriminations, markers: circle = di22-GI, star = di22, plus = di25, x = di2X.]
Figure 5.74 Unrecognized discriminations and discrimination orders of the RCAAM/di's with an initial order of 2 and RCAAM/di22-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 68 stored patterns. (b) Orders required for discriminations vs. SNR for the 68 stored patterns.

[Figure 5.75 plots vs. SNR (dB): (a) Correct Target Discriminations (%); (b) Wrong Target Discriminations (%). Legend: solid = di22-GI, dashdot = di22, dashed = di25, dotted = di2X.]
Figure 5.75 Performance comparisons of the RCAAM/di's with an initial order of 2 and RCAAM/di22-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 64 unstored patterns. (b) Wrong target discriminations vs. SNR for the 64 unstored patterns.

[Figure 5.76 plots vs. SNR (dB): (a) Unrecognized (%), same line styles as Figure 5.75; (b) Orders Required for Network Discriminations, markers: circle = di22-GI, star = di22, plus = di25, x = di2X.]
Figure 5.76 Unrecognized discriminations and discrimination orders of the RCAAM/di's with an initial order of 2 and RCAAM/di22-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 64 unstored patterns. (b) Orders required for discriminations vs. SNR for the 64 unstored patterns.

Figure 5.77 shows the correct target discriminations of the RCAAM/fi12-GI cascade network and the RCAAM/fi1 for contaminated stored patterns vs. SNR in (a), and the wrong target discriminations in (b). Figure 5.78 presents the unrecognized discriminations of the RCAAM/fi12-GI cascade network and the RCAAM/fi1 for contaminated stored patterns vs. SNR in (a), and the discrimination orders in (b). It can be observed that the RCAAM/fi1's discrimination performance tends to converge to that of the RCAAM/fi12-GI as the sc continues to increase. From the previous subsection, we noted that the RCAAM/fi12-GI cascade network is much more efficient in processing time and space than the RCAAM/fi1IVX, which sets an sc as large as 40. By checking the ratios of correct to wrong target discrimination increments, especially for severe contamination, we find that the RCAAM/fi12-GI network also has better discrimination performance than the RCAAM/fi1IVX. In Figure 5.78, the unrecognized rate is not noticeably lowered, while the discrimination orders used by the RCAAM/fi1IVX become huge.

[Figure 5.77 plots vs. SNR (dB): (a) Correct Target Discriminations (%); (b) Wrong Target Discriminations (%). Legend: solid = fi12-GI, dashdot = fi12, dashed = fi15, dotted = fi1IVX.]
Figure 5.77 Performance comparisons of the RCAAM/fi's with an initial order of 1 and RCAAM/fi12-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 68 stored patterns. (b) Wrong target discriminations vs. SNR for the 68 stored patterns.

[Figure 5.78 plots vs. SNR (dB): (a) Unrecognized (%), same line styles as Figure 5.77; (b) Orders Required for Network Discriminations, markers: circle = fi12-GI, star = fi12, plus = fi15, x = fi1IVX.]
Figure 5.78 Unrecognized discriminations and discrimination orders of the RCAAM/fi's with an initial order of 1 and RCAAM/fi12-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 68 stored patterns. (b) Orders required for discriminations vs. SNR for the 68 stored patterns.

[Figure 5.79 plots vs. SNR (dB): (a) Correct Target Discriminations (%); (b) Wrong Target Discriminations (%). Legend: solid = fi12-GI, dashdot = fi12, dashed = fi15, dotted = fi1IVX.]
Figure 5.79 Performance comparisons of the RCAAM/fi's with an initial order of 1 and RCAAM/fi12-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 64 unstored patterns. (b) Wrong target discriminations vs. SNR for the 64 unstored patterns.

[Figure 5.80 plots vs. SNR (dB): (a) Unrecognized (%), same line styles as Figure 5.79; (b) Orders Required for Network Discriminations, markers: circle = fi12-GI, star = fi12, plus = fi15, x = fi1IVX.]
Figure 5.80 Unrecognized discriminations and discrimination orders of the RCAAM/fi's with an initial order of 1 and RCAAM/fi12-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 64 unstored patterns. (b) Orders required for discriminations vs. SNR for the 64 unstored patterns.

Figure 5.79 shows the correct target discriminations of the RCAAM/fi12-GI cascade network and the RCAAM/fi1 for contaminated unstored patterns vs. SNR in (a), and the wrong target discriminations in (b). Figure 5.80 presents the unrecognized discriminations of the RCAAM/fi12-GI cascade network and the RCAAM/fi1 for contaminated unstored patterns vs. SNR in (a), and the discrimination orders in (b).
For unstored patterns, the RCAAM/fi12-GI cascade network's advantage over the RCAAM/fi1 is evident. None of the RCAAM/fi1 networks, including the one with an sc of 40, reaches 100% correct target discrimination even for slight contamination, while the cascade network with a low sc of 2 maintains 100% correct target discrimination without difficulty down to 3 dB. Although the unstable states that the RCAAM/fi1IVX is unable to converge are ambiguous and oscillating, the RCAAM/fi12-GI cascade network can still converge them through their slightly revealed inclinations toward the correct stored patterns. The RCAAM/fi12-GI cascade network's excellent efficiency and great discrimination capability become much clearer in this case.

Figure 5.81 shows the correct target discriminations of the RCAAM/ad12-GI cascade network and the RCAAM/ad1 for contaminated stored patterns vs. SNR in (a), and the wrong target discriminations in (b). Figure 5.82 presents the unrecognized discriminations of the RCAAM/ad12-GI cascade network and the RCAAM/ad1 for contaminated stored patterns vs. SNR in (a), and the discrimination orders in (b). From the correct target discrimination performance, the deficiency of the analog data, whose low correlation-based discrimination resolution requires a high discrimination order to converge, does not trouble the RCAAM/ad12-GI cascade network. The RCAAM/ad12-GI cascade network maintains 100% correct target discrimination down to -3 dB using only a low sc of 2, while the RCAAM/ad1 still struggles to converge many unknowns until it sets a high sc of 15. The cascade network still retains the RCAAM/ad's remarkably rare wrong target discriminations. Since the analog data has no nonlinearity inconsistency problem, the converging failures should be attributed to the low correlation-based discrimination resolution.
The analog data form was introduced to eliminate the bipolar coding inconsistency that occurs under severe distortion, but it also brings low correlation-based discrimination resolution. The low correlation-based discrimination resolution requires more orders for discrimination, while eliminating the inconsistency under severe contamination greatly reduces the discrimination order. Thus the analog data form raises the discrimination orders required for light contamination, but at the same time reduces those required for severe contamination. This is why the RCAAM/ad's discrimination order curves w.r.t. SNR are flatter than those of the RCAAM/fi and RCAAM/di. This is a fair and balanced trade: an advantage gained on one side is paid for on the other. As discussed before, a flat discrimination order curve has low contamination observability.

Figure 5.83 shows the correct target discriminations of the RCAAM/ad12-GI cascade network and the RCAAM/ad1 for contaminated unstored patterns vs. SNR in (a), and the wrong target discriminations in (b). Figure 5.84 presents the unrecognized discriminations of the RCAAM/ad12-GI cascade network and the RCAAM/ad1 for contaminated unstored patterns vs. SNR in (a), and the discrimination orders in (b). Compared to contaminated stored patterns, the RCAAM/ad12-GI's discrimination performance is almost unchanged, while the RCAAM/ad1's degrades considerably. Again, the cascade network discriminates without error down to -6 dB. It is striking that about 3% of the contaminated unstored patterns still stay in confused states for at least 15 iterations, when the GI network by itself correctly converges the approximately 24% of unrecognized states left by the RCAAM/ad12. For both contaminated stored and unstored patterns, the RCAAM/ad1's discrimination performance again converges toward that of the RCAAM/ad12-GI as the stability criterion increases.
From the above simulation comparisons, the excellent efficiency, high converging capability and powerful discrimination ability of the RCAAM-GI cascade network are very impressive. The performance comparisons between the RCAAM and the RCAAM-GI cascade network offer another way to appreciate the great converging capability of the GI network. As soon as an ambiguous state moves into the attractive basin of a stored pattern, the GI network converges the state, even if it is still spurious, to one of the stored patterns. With this in mind, a network does not need to converge a contaminated input all the way to a truly stable state. All it needs to do is associate an input with all stored patterns a few times. The input, if not severely distorted, will then reveal its inclination toward some stored states, and thus move from an ambiguous state, if any, into the realm of the attractive basins. The network can then leave to the GI network the task of converging the attracted state to one of the stored patterns.

[Figure 5.81 plots vs. SNR (dB): (a) Correct Target Discriminations (%); (b) Wrong Target Discriminations (%). Legend: solid = ad12-GI, dashdot = ad12, dashed = ad15, dotted = ad1IXV.]
Figure 5.81 Performance comparisons of the RCAAM/ad's with an initial order of 1 and RCAAM/ad12-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 68 stored patterns. (b) Wrong target discriminations vs. SNR for the 68 stored patterns.

[Figure 5.82 plots vs. SNR (dB): (a) Unrecognized (%), same line styles as Figure 5.81; (b) Orders Required for Network Discriminations, markers: circle = ad12-GI, star = ad12, plus = ad15, x = ad1IXV.]
Figure 5.82 Unrecognized discriminations and discrimination orders of the RCAAM/ad's with an initial order of 1 and RCAAM/ad12-GI cascade network vs. SNR for the 68 stored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 68 stored patterns. (b) Orders required for discriminations vs. SNR for the 68 stored patterns.

[Figure 5.83 plots vs. SNR (dB): (a) Correct Target Discriminations (%); (b) Wrong Target Discriminations (%). Legend: solid = ad12-GI, dashdot = ad12, dashed = ad15, dotted = ad1IXV.]
Figure 5.83 Performance comparisons of the RCAAM/ad's with an initial order of 1 and RCAAM/ad12-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Correct target discriminations vs. SNR for the 64 unstored patterns. (b) Wrong target discriminations vs. SNR for the 64 unstored patterns.

[Figure 5.84 plots vs. SNR (dB): (a) Unrecognized (%), same line styles as Figure 5.83; (b) Orders Required for Network Discriminations, markers: circle = ad12-GI, star = ad12, plus = ad15, x = ad1IXV.]
Figure 5.84 Unrecognized discriminations and discrimination orders of the RCAAM/ad's with an initial order of 1 and RCAAM/ad12-GI cascade network vs. SNR for the 64 unstored patterns belonging to 4 targets. (a) Unrecognized discriminations vs. SNR for the 64 unstored patterns. (b) Orders required for discriminations vs. SNR for the 64 unstored patterns.

CHAPTER 6
Target Discrimination using Neural Network with Spectrum Magnitude Response

6.1 Introduction

In the previous three chapters, we used the sampled backscatter time response as the network process information. We determined the beginning response time for each stored and unstored aspect response by a simple detection algorithm, and then picked the first 100 time samples, starting from the detected beginning time, as the time domain analog stored/unstored patterns. Noise was then added to the deterministic analog stored/unstored patterns to simulate the network's noise tolerance. We thus assumed that detection of the beginning response of the target aspect responses is available and consistent even under various noise levels. In practical noise-limited situations, finding the same beginning response time used in training is very difficult. Therefore, the network must also store or train on several time-shifted neighbors of the time segment pattern for each aspect angle, to increase tolerance to time-shifted patterns. This is impractical, since it dramatically reduces the network capacity.
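The sensitivity to the beginning time can be seen in a small numerical sketch. The pulse shape and the one-sample shift below are made-up illustration values, not measured data, and the shift is taken to be circular so that the standard DFT shift property applies exactly: the time-domain correlation changes sharply, while the DFT magnitudes are unchanged because a shift only alters the phase.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def normalized_correlation(a, b):
    """Inner product of two sequences after scaling each to unit energy."""
    norm_a = math.sqrt(sum(v * v for v in a))
    norm_b = math.sqrt(sum(v * v for v in b))
    return sum(p * q for p, q in zip(a, b)) / (norm_a * norm_b)


# Hypothetical early-time response: a short oscillating pulse in 16 samples.
pattern = [0.0, 0.0, 1.0, -0.8, 0.5, -0.2] + [0.0] * 10
shifted = pattern[-1:] + pattern[:-1]  # circular delay by one sample

time_corr = normalized_correlation(pattern, shifted)
mag_corr = normalized_correlation(dft_magnitudes(pattern), dft_magnitudes(shifted))

print("time-domain correlation:", round(time_corr, 4))
print("spectral-magnitude correlation:", round(mag_corr, 4))
```

For a time-gated segment the shift is not strictly circular, so in practice the magnitude patterns would be expected to agree only approximately.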
Let us show a time-shifted case that resulted from an inappropriate determination of the beginning response in our previous time domain simulations. In our simulations, there are four targets and each target has 17 aspect patterns stored in the networks. Figure 6.1 shows the stored time response segments of the fourth and sixth stored aspect patterns of target F14. The aspect angle spacing between any two adjacent stored (or unstored) patterns of a target is 1.8°; therefore the aspect angle spacing between the fourth and sixth stored aspect patterns of target F14 is 3.6°.

[Figure 6.1 plot: Normalized Time Response vs. Time Sample (10 to 100). Legend: solid = the fourth stored aspect (5.4 deg) pattern of target F14; dashdot = the sixth stored aspect (9.0 deg) pattern of target F14.]
Figure 6.1 Two aspect (5.4° and 9.0°) stored patterns with an apparent time shift resulting from an inappropriate beginning response determination in the time domain simulations.

There is an apparent time shift between these two stored aspect segments, since our detection of the beginning response failed. The analog correlation gain between these two stored aspect patterns is -0.2914, and the bipolar correlation gain is 14 after encoding with 3-bit coding of 7 levels. Since the stored aspect responses are normalized, the analog autocorrelation gain of any stored pattern is 1. The bipolar autocorrelation gain is 300, so a correlation gain of 14 indicates that the two stored patterns of target F14, whose aspect angles are 3.6° apart, differ in 143 of the 300 bits. Therefore, in either analog or digital data form, the correlation-based inconsistency caused by the confused allocation of the beginning responses is severe for these two stored aspect patterns.
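The bit counts quoted for bipolar patterns follow from a simple identity: for two vectors of +1/-1 values of length N, the correlation equals the number of agreeing bits minus the number of differing bits. A minimal sketch of that arithmetic, with hypothetical function names and assuming 100 samples encoded with 3 bits each (N = 300):

```python
def differing_bits(correlation, n_bits):
    """Differing positions between two +/-1 vectors with the given correlation."""
    # correlation = agreeing - differing, and agreeing + differing = n_bits,
    # so n_bits - correlation is always even for valid inputs.
    return (n_bits - correlation) // 2


def agreeing_bits(correlation, n_bits):
    """Agreeing positions between two +/-1 vectors with the given correlation."""
    return (n_bits + correlation) // 2


N_BITS = 100 * 3  # 100 samples, 3 bits per sample

print(differing_bits(14, N_BITS))                  # 143: the two F14 aspect patterns
print(agreeing_bits(N_BITS * 20 // 100, N_BITS))   # 180: correlation of 60 (20% threshold)
print(differing_bits(N_BITS * 10 // 100, N_BITS))  # 135: correlation of 30 (10% threshold)
```

The same relation underlies the 20% and 10% active-threshold figures quoted for the bipolar input form in Section 5.6.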
This correlation-based inconsistency weakens the target group idea, so that the network discrimination in our simulations depends only on individually distinct aspect patterns rather than on consistent target clustering. In this chapter, we use the spectral magnitude, which is time-shift invariant, as the network process information to overcome the difficulty of consistently allocating the beginning time responses. Unfortunately, we use less information here than in the time domain process, since the phase is ignored. Also, the sharp specular peaks characteristic of a typical backscatter time response do not occur in the corresponding frequency spectrum.

6.2 Spectrum Magnitude Process and Normalization

To simulate the target impulse response in the time domain, we measure the frequency responses of 4 targets in the frequency band 1-7 GHz with a frequency increment of 0.01 GHz. Thus, we have 601 measured spectral responses including both magnitudes and phases. Since an impulse in the time domain has a uniform spectral distribution in the frequency domain, the transmitted wave generated by an HP-8720B network analyzer sweeps from 1 GHz to 7 GHz to emulate the uniform spectral distribution. To obtain a narrow sampling time, (8192-601) zeros are attached to the measured spectral responses to give 8192 total spectral responses. Then an 8192-point inverse FFT is used to create the corresponding scattering time responses. We then treat these time responses as if they were measured by a time-domain radar system. To eliminate any spurious reflections within the measurement chamber and to save space, we time-gate the responses and keep only 820 responses for each aspect angle as the effective responses measured by a time-domain radar system. In the time domain simulations, we picked the first 100 samples from the 820 points as the aspect pattern, starting from the detected target beginning response time.
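The time-shift invariance of the spectral magnitude that motivates this chapter can be illustrated with a small sketch. It assumes a circular shift of a short toy pulse (so that no samples leave the window) and uses a direct DFT written with the standard library; the pulse values and the helper name `dft_magnitude` are illustrative and are not taken from the measured data.

```python
import cmath
import math


def dft_magnitude(x):
    """Magnitude of the N-point DFT of a real sequence (direct O(N^2) sum)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


# Toy pulse and the same pulse delayed by two samples (circular shift).
pulse = [0.0, 1.0, -0.5, 0.25, 0.0, 0.0, 0.0, 0.0]
shifted = pulse[-2:] + pulse[:-2]

# Time domain: the shift changes the correlation between the two patterns.
autocorrelation = sum(a * a for a in pulse)
crosscorrelation = sum(a * b for a, b in zip(pulse, shifted))

# Frequency domain: the shift only changes the phase, not the magnitude.
mag_pulse = dft_magnitude(pulse)
mag_shifted = dft_magnitude(shifted)
max_magnitude_gap = max(abs(a - b) for a, b in zip(mag_pulse, mag_shifted))

print(autocorrelation, crosscorrelation, max_magnitude_gap)
```

The time domain correlation drops from 1.3125 to 0.25 under the shift, while the two magnitude spectra agree to within rounding error.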
Here, however, we use the 820 time responses as the time domain aspect pattern, and then use the FFT to transform the time responses back into the frequency domain to obtain the frequency domain aspect pattern. Therefore, the simulation inputs are always 820 time responses in our spectral magnitude simulations. Since we measured the spectral responses in the frequency band 1-7 GHz, the FFT should yield effective spectral responses in the frequency band 1-7 GHz. To be consistent with the time domain simulations, we would like to use 100 spectral samples as the frequency domain aspect pattern. We also want to fully utilize the spectral responses within that bandwidth, so the transformed spectral responses should have 100 effective frequency samples in the frequency band 1-7 GHz and zeros outside the band. After calculation, a 1358-point FFT is used so that the band is covered by 100 frequency samples, 18 to 117. We then pick the 100 frequency samples from 18 to 117 and use only the spectral magnitude portion as network processing patterns. To process the data systematically, each aspect spectrum magnitude pattern is normalized to energy 1. Figure 6.2 shows the 68 training/stored spectral magnitude patterns obtained from the 68 aspect time responses, where sample point 1 corresponds to 1 GHz and point 100 to 7 GHz. Since only one half of the signal information is utilized, the spectrum magnitude patterns are not as distinct from each other as the time domain patterns.

[Figure 6.2 plot: normalized spectral magnitude of the 68 aspect patterns vs. spectrum sample.]

Figure 6.2 68 aspect FFT spectrum magnitude patterns, with normalized energy of 1, used for spectrum process network training/storage. Spectrum process networks simulate 4 targets, and each target has 17 trained/stored aspect spectral patterns. The effective frequency band is 1-7 GHz.

In the digital data process simulations, 7 quantization levels are again used, and 5 levels are also tried. Since the dynamic oscillation range of the spectrum magnitude patterns is much smaller than that of the time responses, 5 quantization intervals may be enough to represent it. We then use 3 bipolar bits to encode the 5 and 7 numerical intervals, so each bipolar pattern has 300 bits for both 5 and 7 quantization levels. The code assignment for 7 quantization levels encoded by 3 bipolar bits is the same as the time domain one presented in Chapter 3. The code assignment for 5 quantization levels encoded by 3 bipolar bits is shown below:

BitlBit2 -l -l -l l l l l -l Bit3 '1 2 3 4

Table 6.1 Code assignment of 5 quantization levels encoded by three bipolar bits.

The above code assignment has a wide linear range, and the statistical linear range is 3:35 levels. Therefore, the linear range covers almost all 5 levels. The linearity fails only in two specific situations: a signal in level 1 will appear closer to level 5 than to level 4, and a signal in level 5 will appear closer to level 1 than to level 2. Thus the 5-level code has higher linearity than the 7-level code, and this is the major reason that we also use the 5-level code in the spectrum magnitude simulations.
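The two linearity failures described above can be reproduced with a small sketch. The code assignment below is hypothetical: it is one 3-bit bipolar assignment for 5 levels that happens to show exactly the two stated failures, and it is not claimed to be the assignment of Table 6.1. A "failure" is counted when a level that is farther away in level number is strictly closer in Hamming distance.

```python
# Hypothetical 5-level, 3-bit bipolar code (an assumption, not Table 6.1).
CODE = {
    1: (-1, -1, -1),
    2: (-1, -1, 1),
    3: (-1, 1, 1),
    4: (1, 1, 1),
    5: (1, 1, -1),
}


def hamming(a, b):
    """Number of differing bits between two bipolar code words."""
    return sum(1 for x, y in zip(a, b) if x != y)


def linearity_failures(code):
    """List (a, b, c): level b is farther from a than c, yet looks closer in bits."""
    failures = []
    for a in code:
        for b in code:
            for c in code:
                farther_in_level = abs(a - b) > abs(a - c)
                closer_in_bits = hamming(code[a], code[b]) < hamming(code[a], code[c])
                if a != b and a != c and farther_in_level and closer_in_bits:
                    failures.append((a, b, c))
    return failures


failures = linearity_failures(CODE)
print(failures)
```

For this assumed code the only failures are (1, 5, 4) and (5, 1, 2): level 1 looks closer to level 5 than to level 4, and level 5 looks closer to level 1 than to level 2.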
This high linearity can prevent correlation-based networks from making wrong similarity recognitions under contamination. At the same time, a code with few quantization levels results in low correlation-based discrimination resolution, which will be analyzed later.

There are some problems with analog spectrum process networks, since the spectral magnitude carries less information than the time response signal. In the time responses, the location of the specular peaks is a good measure for discrimination. In contrast, the spectral magnitude neither oscillates around its mean value nor has many sharp peaks. The frequency magnitude changes much more slowly than the time domain signal, and its variance is much smaller than that of its corresponding time response. Therefore, the spectral magnitude distribution is more uniform and discrimination among these spectra becomes harder. Since the spectrum magnitude is positive, the correlation between any two stored spectrum patterns is always a positive value. Therefore, before processing the analog spectrum magnitude, we normalize the spectrum magnitude patterns to satisfy the following two conditions:

(1). Every stored pattern must be normalized to 0 mean value.
(2). The energy of each pattern must be normalized to some uniform value.

Condition (1) makes the recurrent correlation process with a nonlinear threshold function more efficient, while condition (2) enhances the linearity of the correlation operation. Suppose {A(i) | i = 1,...,n} is a stored spectrum magnitude pattern with energy 1. Then

$$\sum_{i=1}^{n} A(i)^2 = 1 \qquad (1)$$

Letting $\bar{A}$ denote the mean of A, i.e.,

$$\bar{A} = \frac{1}{n} \sum_{i=1}^{n} A(i) \qquad (2)$$

so that $E\{A(i) - \bar{A}\} = 0$, we have two processes by which to produce the normalized pattern $\underline{A}$.

(1). Prior Process: For condition (2), we have

$$\sum_{i=1}^{n} \left[ C \cdot (A(i) - \bar{A}) \right]^2 = 1 \qquad (3)$$

so that

$$C = \frac{1}{\sqrt{1 - n\bar{A}^2}} \qquad (4)$$

and $\underline{A}(i) = C \cdot (A(i) - \bar{A})$. If $A_M(i) = A(i) - \bar{A}$ has already been calculated, then we use the following process.

(2). Posterior Process: Condition (2) gives

$$\underline{A}(i) = \frac{A_M(i)}{\sqrt{\sum_{i=1}^{n} A_M(i)^2}} = D \cdot A_M(i) \qquad (5)$$

where

$$D = \frac{1}{\sqrt{\sum_{i=1}^{n} A_M(i)^2}} \qquad (6)$$

Due to

$$E\left\{ \frac{A_M(i)}{\sqrt{\sum_{i=1}^{n} A_M(i)^2}} \right\} = D \cdot E\{A_M(i)\} = 0 \qquad (7)$$

condition (1) is also satisfied. Since we have assumed A has energy 1, then

$$\sum_{i=1}^{n} A_M(i)^2 = 1 - n\bar{A}^2 \qquad (8)$$

and thus C = D. We now have the normalized spectrum magnitude pattern $\underline{A}$ satisfying both conditions (1) and (2):

$$E\{\underline{A}(i)\} = 0 \qquad (9)$$

$$\sum_{i=1}^{n} \underline{A}(i)^2 = 1 \qquad (10)$$

Although we have normalized the spectral magnitude to overcome the deficiency caused by disregarding phase, the analog correlation processing algorithm still requires further improvement. As analyzed previously, the frequency spectra are more similar to each other than the time responses, and thus the crosscorrelations between any two stored spectral patterns almost always have positive values, even after the above normalization. Therefore, we need an offset to further compensate for this deficiency. We statistically evaluate the crosscorrelations for all the stored spectral patterns to find the average and the minimum, and then design the offset. Since the spectral patterns have been normalized, the maximum correlation gain (i.e., the autocorrelation) among all stored patterns is 1, provided the input is equal to some stored pattern. Let $C_{min}$ denote the minimum of the crosscorrelations among all stored patterns. If a recurrent correlation associative network is to converge to the expected stored pattern, it needs to satisfy at least

$$1 - \mathit{Offset} > -(C_{min} - \mathit{Offset}) \qquad (11)$$

and thus

$$\mathit{Offset} < \frac{1 + C_{min}}{2} \qquad (12)$$

Typically, $C_{min}$ has a negative value. If $-(C_{min} - \mathit{Offset}) > 1 - \mathit{Offset}$ holds, there exist an input and a stored pattern that have a negative normalized correlation gain whose gain scale (magnitude) is larger than the normalized input autocorrelation. In that case, the stored pattern with the minimum correlation gain will overcome the others, including the input's true stored pattern, when the number of recurrent iterations is even.
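The normalization to zero mean and unit energy, and the upper bound on the Offset in inequality (12), can be sketched as follows. This is a minimal illustration under stated assumptions: the four-sample pattern is a made-up example, and the function names are hypothetical. The sketch also checks numerically that the prior-process scale C and the posterior-process scale D coincide for a unit-energy pattern.

```python
import math


def unit_energy(pattern):
    """Scale a pattern so that the sum of its squared samples is 1."""
    norm = math.sqrt(sum(x * x for x in pattern))
    return [x / norm for x in pattern]


def normalize_zero_mean_unit_energy(pattern):
    """Posterior process: subtract the mean, then scale to unit energy."""
    mean = sum(pattern) / len(pattern)
    centered = [x - mean for x in pattern]               # A_M(i)
    d = 1.0 / math.sqrt(sum(x * x for x in centered))    # D
    return [d * x for x in centered]


def offset_upper_bound(c_min):
    """Largest Offset allowed by inequality (12): Offset < (1 + C_min) / 2."""
    return (1.0 + c_min) / 2.0


# Made-up positive "spectral magnitude" pattern, first scaled to energy 1.
a = unit_energy([0.10, 0.20, 0.30, 0.40])
n = len(a)
mean_a = sum(a) / n

c_prior = 1.0 / math.sqrt(1.0 - n * mean_a ** 2)                    # C
d_posterior = 1.0 / math.sqrt(sum((x - mean_a) ** 2 for x in a))    # D

a_norm = normalize_zero_mean_unit_energy(a)
print(sum(a_norm), sum(x * x for x in a_norm), c_prior, d_posterior)
print(offset_upper_bound(-0.2))
```

For example, a minimum crosscorrelation of -0.2 would require an Offset below 0.4.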
If we use a value of Offset that only marginally satisfies the above inequality, a dominant negative gain may still occur when the inputs are contaminated. Such marginal satisfaction may also prevent the network from converging, or make it converge very slowly. We evaluate the mean of all crosscorrelations, and then find statistically and experimentally that 2/3 of the mean is a good choice for the Offset. Therefore, the network processing algorithm for RCAAM/ad requires some modification to process the analog frequency spectrum magnitude.

Suppose $\{(E^i, C^i) \mid i = 1,...,P\}$ are the associative pattern pairs stored in RCAAM/ad, where $E^i$ is a bipolar column vector of length m and $C^i$ is a normalized analog spectral magnitude vector of length n, given by $E = [E^1\ E^2\ \ldots\ E^P]$ and $C = [C^1\ C^2\ \ldots\ C^P]$, and let the offset vector $F_{offset}$ be a column vector of length P with each component equal to the Offset. If U is an n-dimensional normalized analog spectral magnitude input, then we can construct the initial Analog-Digital RCAAM as follows:

$$M_0 = E \cdot \{ \mathrm{Diag}[(C^T \cdot U - F_{offset}).^{\wedge}(init)] \cdot C^T \} \qquad (13)$$

$$OS_0 = E \cdot \{ \mathrm{Diag}[(C^T \cdot U - F_{offset}).^{\wedge}(init)] \cdot F_{offset} \} \qquad (14)$$

where $A.^{\wedge}q \equiv [A_1^q\ A_2^q\ \ldots\ A_P^q]^T$ and

$$\mathrm{Diag}(A) \equiv \begin{bmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_P \end{bmatrix}$$

if $A = [A_1\ A_2\ \ldots\ A_P]^T$, and $OS_0$ is an m by 1 column vector. Then the current dynamic memory output is

$$V_0 = \mathrm{Sign}(M_0 \cdot U - OS_0) \qquad (15)$$

The initial order init is usually set to 0. Since the input has analog form (the same as $C^i$) and the output has bipolar form (the same as $E^i$), we have a fixed input ($U_k = U$) for this analog-digital hybrid memory. Then the dynamic memories $M_k$, the dynamic offset vector $OS_k$, and the evolution outputs $V_k$ at the recurrent iteration time k are

$$M_k = E \cdot \{ \mathrm{Diag}[(C^T \cdot U - F_{offset}).^{\wedge}(init+k)] \cdot C^T \} \qquad (16)$$

$$OS_k = E \cdot \{ \mathrm{Diag}[(C^T \cdot U - F_{offset}).^{\wedge}(init+k)] \cdot F_{offset} \} \qquad (17)$$

$$V_k = \mathrm{Sign}(M_k \cdot U - OS_k) \qquad (18)$$

We previously showed the inconsistent correlation gain resulting from inconsistent beginning response detections. We now check whether the spectrum magnitude process can alleviate the problem.
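The offset recurrence for the analog-digital memory can be sketched in a few lines. This is a hedged illustration, not the thesis implementation: it uses the algebraic identity that the memory output minus the offset term equals the sum of the stored bipolar patterns weighted by the offset correlation gains raised to the power (order + 1), where order stands for init + k. The two stored pairs, the offset of 0.1, the choice Sign(0) = +1, and the name `rcaam_ad_output` are all assumptions made for the example.

```python
import math


def sign(x):
    """Bipolar threshold; mapping 0 to +1 is an assumption of this sketch."""
    return 1 if x >= 0 else -1


def rcaam_ad_output(E, C, U, offset, order):
    """Output V for stored pairs (E[i], C[i]), input U and order = init + k.

    gains[i]   = C[i] . U - Offset            (offset correlation gain)
    weights[i] = gains[i] ** order            (diagonal of the Diag[...] term)
    net        = sum_i E[i] * weights[i] * gains[i]   (memory output minus offset term)
    """
    gains = [sum(c * u for c, u in zip(Ci, U)) - offset for Ci in C]
    weights = [g ** order for g in gains]
    m = len(E[0])
    net = [
        sum(w * g * Ei[j] for w, g, Ei in zip(weights, gains, E))
        for j in range(m)
    ]
    return [sign(v) for v in net]


# Two made-up stored pairs: zero-mean, unit-energy analog patterns C[i]
# associated with bipolar patterns E[i].
r = 1.0 / math.sqrt(2.0)
C = [[r, -r, 0.0, 0.0], [0.0, 0.0, r, -r]]
E = [[1, -1, 1], [-1, -1, 1]]

U = C[0]                      # input equal to the first stored analog pattern
v0 = rcaam_ad_output(E, C, U, offset=0.1, order=0)
v1 = rcaam_ad_output(E, C, U, offset=0.1, order=1)
print(v0, v1)
```

In this toy case the offset gains are about 0.9 and -0.1, so the output equals the bipolar pattern associated with the first stored pair at both orders.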
Figure 6.3 shows the FFT spectrum magnitude patterns corresponding to the previous time-shifted responses, while Figure 6.4 presents their normalized spectrum magnitude patterns. We can see that the whole-waveform shifting effect has been eliminated for both types of spectrum magnitude patterns. The correlation gain is 0.897 for the two spectrum magnitude patterns and 0.639 for the two normalized spectrum magnitude patterns. Compared to the negative correlation gain that occurred in the time domain, the spectrum magnitude waveforms for these two aspect patterns are more consistent. After offsetting, the spectrum magnitude correlation gain is 0.349, which is still positive. In the time domain, the 7-level bipolar coding gave the two aspect patterns a correlation gain of 14, thus the two bipolar patterns had 157 bits in common and 143 bits different. In the spectrum magnitude process, the 5-level coding has a correlation gain of 132, therefore the two bipolar patterns have 216 bits in common and 84 bits different. The 7-level coding has a correlation gain of 62, therefore two bipolar patterns have 182 bits in common and 118 bits different. Therefore, the digitized data form also demonstrates that the spectrum magnitude process has better correlation-based consistency than the time domain process for target responses with close aspect angles.

[Figure 6.3 plot: spectrum magnitude vs. frequency (1 to 7 GHz). Solid: the fourth stored aspect (5.4 deg) pattern of target F14. Dash-dot: the sixth stored aspect (9.0 deg) pattern of target F14.]

Figure 6.3 FFT spectrum magnitude patterns of two time responses from target F14 with respective aspect angles of 5.4° and 9°.

[Figure 6.4 plot: normalized spectrum magnitude vs. frequency (1 to 7 GHz). Solid: the fourth stored aspect (5.4 deg) pattern of target F14. Dash-dot: the sixth stored aspect (9.0 deg) pattern of target F14.]

Figure 6.4 Normalized FFT spectrum magnitude patterns of two time responses from target F14 with respective aspect angles of 5.4° and 9°.

6.3 Manipulation of Spectrum Magnitude Data Before Network Process

In the spectrum magnitude simulations, we use the same 4 targets, 68 stored/trained and 64 unstored/untrained aspect angles as in the time domain case. In the time domain noise tolerance simulations, the desired noise, based on the energy of the simulated pattern, is added directly to the pattern, i.e. to its 100 samples. Since we have assumed that only the time responses measured by a time domain radar are available in practice, we used the IFFT to obtain time responses from the lab-measured spectral responses and then used the FFT to obtain the spectral responses as the network process information. To be consistent with this assumption, the noise in the spectral simulations should be added to the time domain signal, and the FFT then used to obtain the contaminated spectral responses. Recall that there are 8192 time responses after the IFFT of the measured spectral responses, from which we time-gated the 820 effective responses as the time domain aspect responses measured by a time domain radar. Also recall that the time domain network picks only the first 100 samples among the 820 responses as the network process pattern, starting from the detected beginning response time. To emulate the practical situation, the simulated noise is added to the 820 time responses in the spectral simulations.
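The noise-addition step described above can be sketched as follows. This is an illustration under explicit assumptions: SNR is taken as 10 log10 of signal energy over noise energy across the 820 samples, the noise is Gaussian and is rescaled so that the realized SNR matches the requested value exactly (a simplification made here so the result is reproducible), and the decaying sinusoid is a stand-in signal rather than measured data. The function names are hypothetical.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Return (noisy signal, noise) with noise energy set from the signal energy."""
    signal_energy = sum(x * x for x in signal)
    target_noise_energy = signal_energy / (10.0 ** (snr_db / 10.0))
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    raw_noise_energy = sum(v * v for v in noise)
    scale = math.sqrt(target_noise_energy / raw_noise_energy)
    noise = [scale * v for v in noise]
    return [x + v for x, v in zip(signal, noise)], noise


def zero_pad(samples, total_length):
    """Append zeros so the sequence has total_length samples."""
    return list(samples) + [0.0] * (total_length - len(samples))


# Stand-in for an 820-sample time-gated response (not measured data).
signal = [math.sin(0.05 * i) * math.exp(-i / 300.0) for i in range(820)]

noisy, noise = add_noise_at_snr(signal, snr_db=0.0, rng=random.Random(0))
padded = zero_pad(noisy, 1358)   # 820 samples plus 538 zeros

realized_snr_db = 10.0 * math.log10(
    sum(x * x for x in signal) / sum(v * v for v in noise)
)
print(len(padded), realized_snr_db)
```

The padded 1358-sample sequence is what would then be passed to the DFT.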
Given a time domain input with 820 responses, we calculate its energy, then add the desired simulated noise to the 820 responses. So, in our simulations, we calculate the energy of the 820 samples for each pattern, then generate the corresponding noise with the SNR required for the simulation. After the noise is added to the 820 samples, 538 extra zeros (1358-820) are attached to the 820 values to form a 1358-element signal. Then a 1358-point Discrete Fourier Transform (DFT) is used to obtain 1358 spectrum magnitudes. As described previously, the 1358-point length is chosen so that the available measurement band 1-7 GHz is covered by 100 frequency samples, with no response outside the band. Finally we pick the 100 spectrum magnitude samples, sample 18 to sample 117, as the spectral network process pattern. The further normalization is then computed on this spectral pattern before the pattern is presented to the network.

We have explained that the signal-to-noise ratio (SNR) definitions for the time domain and spectrum domain processes are different. We now show the noise contamination effects in the time domain responses and then in the spectral responses after the DFT. Figure 6.5 shows the 820 time responses used as the responses measured by a time domain radar. The solid line stands for the true responses, while the dotted line denotes the contaminated signal with a SNR of 0 dB. Figure 6.6 presents the 1358 spectrum magnitudes obtained from the above 820 time responses plus 538 zeros by using a 1358-point Fourier transform. It is clear that the true spectrum magnitude, the solid line, only has values in two sample intervals, samples 18 to 117 and samples 1242 to 1341, corresponding to the frequency band 1-7 GHz. After noise is added to the time responses, the transformed spectral responses have magnitudes spreading over the whole band. Figure 6.7 shows the 100 time responses used as the time domain network process aspect pattern and its contaminated signal, the dash-dot line, with a SNR of 0 dB.
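The choice of a 1358-point transform can be checked with integer arithmetic. The sketch below rests on an assumption that is not stated explicitly in the text: that the 8192-point inverse FFT works on a 0.01 GHz frequency grid, so the time step is 1/(8192 x 0.01 GHz) and bin k of an N-point DFT of those time samples sits at k x 81.92/N GHz. The function name `bins_in_band` is hypothetical.

```python
def bins_in_band(n_fft, n_ifft=8192, lo_steps=100, hi_steps=700):
    """Zero-based DFT bins whose frequency lies in the measured band.

    Frequencies are counted in units of the 0.01 GHz grid step, so the
    1-7 GHz band is 100..700 steps and bin k sits at k * n_ifft / n_fft
    steps. Only the lower half of the spectrum is scanned.
    """
    return [
        k
        for k in range(n_fft // 2)
        if lo_steps * n_fft <= k * n_ifft <= hi_steps * n_fft
    ]


bins = bins_in_band(1358)
first_sample = bins[0] + 1    # one-based sample numbering
last_sample = bins[-1] + 1

# Time step implied by the assumed grid, in picoseconds.
time_step_ps = 1.0e12 / (8192 * 0.01e9)

print(len(bins), first_sample, last_sample, round(time_step_ps, 3))
```

Under this assumption the band holds exactly 100 bins, and in one-based numbering they are samples 18 to 117, in agreement with the sample range quoted above.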
In the time domain simulations, the desired noise calculation is based on the 100 responses, i.e. on the aspect pattern itself. Finally, Figure 6.8 presents the 100 spectrum magnitudes used as the spectrum network process pattern. They are truncated from the previous 1358 spectrum magnitudes, so only the noise spectrum that falls in the measured band, 1-7 GHz, contaminates the true spectral pattern. Again the calculation of 0 dB is based on the 820 time responses. It is noted that the noise amplitudes become large, compared to the spectrum magnitude, if the spectrum magnitude pattern is normalized, i.e. when the magnitudes oscillate around the mean value.

[Figure 6.5 plot: time response vs. time sample (0 to 800). Solid: without noise. Dotted: with added noise.]

Figure 6.5 820 time responses of target B52 at aspect angle 0° used as measured responses by a time domain radar. The solid line shows the true responses, while the dotted line denotes the contaminated signal with a SNR of 0 dB.

[Figure 6.6 plot: spectrum magnitude vs. spectrum sample (0 to 1400). Solid: without noise. Dotted: with added noise.]

Figure 6.6 1358 spectrum magnitudes transformed from the previous two time responses by using a 1358-point DFT. The solid line shows the true responses, while the dotted line denotes the contaminated signal with a SNR of 0 dB. The evaluation of 0 dB is based on the 820 time responses.

[Figure 6.7 plot: time response vs. time sample (1 to 100). Solid: without noise. Dash-dot: with added noise.]

Figure 6.7 100 time responses truncated from the previous 820 responses used as the time domain network process aspect pattern. The solid line shows the true responses, while the dash-dot line denotes the contaminated signal with a SNR of 0 dB. Here the evaluation of 0 dB is based on the 100 time responses.

[Figure 6.8 plot: spectrum magnitude vs. spectrum sample (0 to 100). Solid: without noise. Dash-dot: with added noise.]

Figure 6.8 100 spectrum magnitudes truncated from the previous 1358 spectral responses used as the spectrum network process pattern. The solid line shows the true responses, while the dash-dot line denotes the contaminated signal with a SNR of 0 dB. The evaluation of 0 dB is based on the 820 time responses.

6.4 Comparisons of Correlation-based Discrimination Resolutions for Different Data Formats

We have used different information forms for the network processes: 7 levels encoded by 3 bits in the time domain bipolar networks, analog valued responses in the time domain RCAAM/ad, 5 and 7 levels encoded by 3 bits in the spectral process bipolar networks, and analog valued spectrum magnitude in the spectrum magnitude RCAAM/ad.
Given different information forms, an RCAAM network will have different numbers of convergent iterations and different performances. With the same network structure and stored aspect patterns, the number of convergent iterations indicates how easily a given form of information can be discriminated. We therefore recognize that the information form can affect the correlation-based discrimination resolution and hence the performance. We may then ask which information form has high correlation-based discrimination resolution. This section analyzes that question. Since the discrimination medium for most networks used in this thesis is based on correlation gains, we focus on the relationship between the information form and its correlation-based discrimination resolution.

What kind of correlation distribution indicates better discrimination resolution? To systematically analyze both analog and digital data forms, we normalize the maximum correlation gain to 1, so that a normalized correlation gain of 1 indicates a gain of 300 for our 300-bit bipolar patterns and a gain of 1 for the normalized analog patterns. Since the network output prior to the threshold function is the sum of the stored patterns weighted by their individual correlation gains, the dominant patterns ought to have clearly larger correlation gains than the others. Our output stages always use the bipolar form, i.e. a nonlinear threshold function. It is then difficult to find any dominant pattern after the threshold function if a large population has large correlation gains. Thus, for a stored pattern input, the better correlation gain distribution should have few gains (or patterns) at high values and most gains at small values, so that the stored pattern presented as the input is not interfered with by the small gain patterns and can dominate the network convergence. What kind of transition curve, expressing the change in gain population, is better?
If the population transition from small gain to high gain is smooth, then the dominant patterns are not clearly defined. Therefore, a better transition should be steep. Another concern is that high consistency among adjacent aspect patterns helps to enhance the target group clustering, so that the group-consistent dominant force becomes greater than that of a distinct pattern and more easily overcomes the interference from other patterns or groups. Therefore, it is better for the high gain side of the population transition boundary to have a small population of aspect-consistent patterns. There are 3 consistent patterns for one-aspect-spacing consistency (the pattern itself and its neighbors at plus and minus one aspect spacing), and 5 for two-aspect-spacing consistency (adding plus one, plus two, minus one and minus two aspect spacings). In this study, the aspect spacing for the stored aspect patterns is 1.8°. Given a stored pattern, we may say the adjacent aspect patterns are aspect consistent if the smallest correlation gain between the pattern and its adjacent aspect patterns is greater than an aspect consistency criterion. Therefore, if the aspect consistency criterion is assumed to be 40% of the maximum gain, then aspect-consistent bipolar patterns should have at least (50+40/2)% = 70%, or 300*70% = 210, bits in common, or at most 30%, or 90 bits, different. Thus if a bipolar gain population curve has a steep transition, and its high gain region with a distinctly small population starts around 0.4 and contains about 3 or 5 patterns, then we can expect this bipolar data form to have high correlation-based discrimination resolution. Although the number of identical response samples between two aspect-consistent patterns cannot be derived for analog patterns from an assumed aspect consistency criterion, the same gain population will affect the discrimination output in the same way.
Figure 6.9 shows the correlation gain population distribution for the 68 stored time domain patterns in bipolar and analog forms, while Figure 6.10 presents the gain scale population distribution, ignoring the signs of the correlations. Again the maximum correlation gains have been normalized to 1 for both analog and bipolar patterns. First we assume an active threshold gain; a stored pattern is then active if its correlation gain with the given stored pattern input is greater than the active threshold. An 'active pattern' is a pattern that has the capability to affect the output. The correlation gain population is statistically estimated over all stored patterns by counting the number of stored patterns that have correlation gains greater than the assumed active threshold. Comparison between the two figures shows that the analog patterns have many negative correlation gains with scales less than 30% of the maximum gain. This high negative gain population below the gain scale of 0.3 indicates that the negative correlations will not disturb dominant patterns for a high active threshold, but may become a problem if the active threshold falls below 30% of the maximum gain, as in heavy contamination situations. According to the previous analysis, the time domain bipolar patterns have a good gain population distribution, since the curve has a steep transition and a small population, about 4, at the high gain transition corner with a gain of 0.4. From the population distribution, when a stored pattern is presented as an input, the time domain bipolar data will have only 4 patterns with correlation gains greater than 40% of the maximum gain, 300. Therefore, at most 4 stored patterns can dominate the network convergence, if a dominant pattern must have a correlation gain greater than 40% of the maximum gain. The small population on the high gain side after the transition region may be regarded as an estimate of the number of aspect-consistent patterns. Thus about 4 adjacent aspect patterns are in aspect consistency for the time domain bipolar data, and the consistent aspect spacing is then (4-1)*1.8° = 5.4°. It is apparent that in the time domain the bipolar format has better correlation-based discrimination resolution than the analog data, which is consistent with the results of the previous chapter.

[Figure 6.9 plot: number of active patterns vs. active threshold of correlation gain (0 to 1). Solid: time domain bipolar data generated by 3-bit coding of 7 levels. Dashed: time domain analog data.]

Figure 6.9 Correlation gain population distribution for 68 stored time domain patterns with different data forms.

[Figure 6.10 plot: number of active patterns vs. active threshold of correlation gain (0 to 1). Solid: time domain bipolar data generated by 3-bit coding of 7 levels. Dashed: time domain analog data.]

Figure 6.10 Correlation gain population distribution, only considering gain scale, for 68 stored time domain patterns with different data forms.

Figure 6.11 shows the correlation gain population distribution for the 68 stored spectrum magnitude patterns with different data forms, while Figure 6.12 presents the gain scale population distribution, ignoring the signs of the correlations. Before normalization, the gain population distribution for the analog spectral patterns is very poor: 97% of the stored patterns have gains greater than 70% of the maximum gain. With most of the stored patterns at high gains, all stored patterns are confused together and yield no dominant winner, so the networks have no discrimination ability at all for these analog patterns. It is now evident why we need to normalize the analog spectrum magnitude data before network processing.
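The active-pattern population described above can be sketched as a simple count. This is an illustration with assumptions: the population is averaged over every stored pattern taken in turn as the input (the text says only that it is statistically estimated over all stored patterns), the four 4-bit patterns are made up, and the function name `active_population` is hypothetical. The `use_scale` flag switches to the gain scale variant that ignores the sign of the correlation.

```python
def active_population(patterns, threshold, use_scale=False):
    """Average number of stored patterns whose normalized gain exceeds threshold.

    Each stored pattern is used once as the input. Gains are normalized by
    the autocorrelation of the input so that the maximum gain is 1.
    """
    total = 0
    for probe in patterns:
        max_gain = sum(x * x for x in probe)
        for stored in patterns:
            gain = sum(x * y for x, y in zip(probe, stored)) / max_gain
            if use_scale:
                gain = abs(gain)
            if gain > threshold:
                total += 1
    return total / len(patterns)


# Made-up 4-bit bipolar patterns standing in for the 300-bit stored patterns.
stored_patterns = [
    [1, 1, 1, 1],
    [1, 1, 1, -1],
    [-1, -1, 1, 1],
    [-1, -1, -1, -1],
]

signed_population = active_population(stored_patterns, 0.4)
scale_population = active_population(stored_patterns, 0.4, use_scale=True)
print(signed_population, scale_population)
```

Sweeping the threshold from 0 to 1 and plotting the returned value would give a curve of the kind shown in the population distribution figures; the gap between the signed and the gain scale counts reflects the negative-gain population.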
After the normalization, the analog correlation population distribution is greatly improved: the population with gains greater than 70% of the maximum gain drops to 7%, or about 5 patterns. Although the gain population transition curve is smooth, the new distribution shows dominant patterns at gains greater than 0.7. The normalization greatly reduces the high gain population, and the aspect consistency population therefore greatly decreases. We may say that the normalization nonlinearly enlarges the pattern space and the differences among adjacent aspect patterns, so that the number of aspect-consistent patterns under the same consistency criterion decreases. Offsetting the normalized analog data shifts the gain population distribution curve toward the left, so the small population transition gain is shifted from 0.7 to 0.4. This offsetting has two deficiencies. The first is that the gain scale dynamic range is reduced from 1 to (1 - Offset), and the second is that the shift, or subtraction of the Offset, produces a large negative gain population in the low gain area. The first deficiency reduces the discrimination resolution, while the second can cause negative gain interference if the active gain drops to a low value, as in heavy contamination situations. Therefore, the correlation-based discrimination resolution for the normalized and offset analog data is worse than that of the 7-level bipolar data, and even worse than that of the 5-level bipolar data, although the shifted population distribution curve is close to the one for the 7-level bipolar data. To compare systematically with the other data forms, the normalized and offset correlation gains are again normalized to their own maximum gain, (1 - Offset).
The normalized and offset analog data format then still shows great improvement compared to the normalized analog data, although the transition curve is still not as steep as those of the 5-level and 7-level bipolar data. Compared to the normalized analog data, the small population transition gain is shifted to a lower value and the negative gain population is again increased for correlation gain scales less than 0.4. The smooth transition curve is characteristic of analog data, since the analog data format has full linearity and its aspect consistency is therefore high and smooth. This high aspect consistency, with more population at high gain, makes analog patterns struggle in self-competition in the high gain area. It delays convergence but does not affect the rate of correct target discrimination, since the aspect-consistent patterns belong to the same target.

The 7-level bipolar spectral data format has the best gain population distribution, and its negative gain population at low gains is small. The small population transition gain occurs at the gain of 0.4, where the population with gains greater than 0.4 is less than 1.5 patterns. The dominant pattern is almost unique when the considered active gain is larger than 0.4. Therefore, the 7-level bipolar data format has a small number of aspect-consistent patterns, and the target group idea becomes faint. Equivalently, the 7-level bipolar patterns are distinct from each other, and they express the distinct pattern idea more strongly than the target group idea. Thus discrimination with 7-level bipolar patterns depends almost entirely on distinguishing distinct patterns and shows little self-competition, which saves converging iterations.

The 5-level bipolar spectral data format also has a good gain population distribution, with a small negative gain population at low gains. Its small population transition gain occurs at the gain of 0.5, and its aspect consistency is higher than that of the 7-level bipolar data.
Since the 5-level bipolar coding has higher linearity than the 7-level bipolar coding, the 5-level bipolar data have more aspect consistent patterns than the 7-level bipolar data. It is interesting that the 5-level bipolar gain distribution curve is crossed by the distribution curve of the normalized and offset analog data around a gain of 0.43. The 5-level bipolar data format has a lower population than the normalized and offset data for gains larger than 0.43, and a higher population for gains lower than 0.43. This indicates that the 5-level bipolar form has better correlation-based discrimination resolution if dominant patterns need to have a correlation gain larger than 0.5.

Figure 6.11 Correlation gain population distribution for 68 stored spectral process patterns with different data forms. (Horizontal axis: active threshold of correlation gain, 0 to 1. Curves: frequency domain bipolar data generated by 3-bit coding with 5 levels (dash-dot) and with 7 levels (solid), frequency domain analog data (x marks), normalized and offset analog data, and normalized and offset analog data with the maximum correlation gain normalized to 1 (dashed).)

Figure 6.12 Correlation gain population distributions, only considering gain scale, for 68 stored spectral process patterns with different data forms. (Horizontal axis: active threshold of correlation gain, 0 to 1.)
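The population curves of Figures 6.11 and 6.12 can be pictured with a small sketch. It assumes, as a simplification, that the correlation gain of two bipolar patterns is their inner product divided by the pattern length, that gains are normalized to the self-correlation of the input, and that the curve value at a threshold is the number of stored patterns reaching that threshold, averaged over all stored patterns used as inputs. The function names and the three toy patterns are hypothetical and are not taken from the measured data.

```python
def correlation_gain(a, b):
    """Inner product of two equal-length bipolar patterns, divided by length."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def active_population(patterns, threshold):
    """Average number of stored patterns whose normalized gain reaches threshold.

    Each stored pattern is used once as the input; its self-correlation is
    taken as the maximum gain (an assumption of this sketch).
    """
    total = 0
    for probe in patterns:
        peak = correlation_gain(probe, probe)
        total += sum(
            1
            for stored in patterns
            if correlation_gain(probe, stored) / peak >= threshold
        )
    return total / len(patterns)


# Three toy 4-bit patterns; the first two are "aspect consistent" (gain 0.5).
patterns = [
    [1, 1, 1, 1],
    [1, 1, 1, -1],
    [-1, -1, 1, 1],
]

for threshold in (0.0, 0.4, 0.7, 1.0):
    print(threshold, active_population(patterns, threshold))
```

A steep drop of this count as the threshold rises corresponds to the sharp population transition described for the bipolar codes, while a slow decline corresponds to the smooth analog curve.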
In practice, the 5-level bipolar data format can be expected to have better correlation-based discrimination resolution and fewer converging iterations than the analog data, since the small population transition for the 5-level bipolar data clearly occurs at a gain of 0.5, while the analog data have a smooth population transition curve. Now let us use the normalized target (cross)correlation gain estimate formula, developed in the previous chapter, to estimate quantitatively the correlation-based discrimination resolutions for the different data forms. In the previous chapter, the estimate was based on the following two assumptions: (1) Only active patterns, whose correlation gains are greater than the assumed active threshold gain, participate in generating the next state in the output stage. (2) The associative output patterns of the active patterns are consistent. From the previous correlation gain population distributions for the different spectral data, the analog data format has its small population transition at high gain; we therefore use higher active threshold gains in the estimate formula to stay consistent with the two assumptions above. The following three tables show the estimated normalized target correlation gains for the different spectral data forms. The estimated normalized target correlation gains for time domain data were presented in the previous chapter. In Table 6.2, an active threshold gain of 0.4, i.e. 40% of the maximum gain, is assumed. The normalized analog data have normalized target crosscorrelation gains comparable to 1; their correlation-based discrimination will therefore be greatly interfered with by these undesired patterns, giving a low correlation-based discrimination resolution. After offsetting the normalized analog data, the normalized target crosscorrelation gains for the normalized and offset data are clearly decreased, and the number of active patterns also decreases. The normalized target crosscorrelation gains of the normalized and offset data and of the 5-level bipolar data are very close, and this quantitative similarity is consistent with the crossing gain around 0.4 for the two gain population distribution curves in the previous figure. The 7-level bipolar data have normalized target crosscorrelation gains near or equal to 0, so at this fair active threshold the 7-level bipolar data format greatly prevails over the others.

Table 6.2 Estimated normalized target correlation gains subject to an active threshold of 40% for different spectral process data forms.

Spectral process input data form     No. (%) of active Input   Estimated normalized target correlation gain
                                     patterns          target  B52      B58      F14      TR1
Normalized analog data               36.44 (53.59%)    B52     1.0000   1.0808   0.9665   1.1579
                                                       B58     0.5369   1.0000   0.7341   0.7888
                                                       F14     0.5569   0.8495   1.0000   0.7881
                                                       TR1     0.5552   0.7580   0.6558   1.0000
Normalized and offset analog data    13.79 (20.28%)    B52     1.0000   0.5595   0.3881   0.4797
                                                       B58     0.2172   1.0000   0.3418   0.3815
                                                       F14     0.1952   0.4433   1.0000   0.4513
                                                       TR1     0.2157   0.4422   0.4044   1.0000
Bipolar data, 5 quantization levels  15.24 (22.41%)    B52     1.0000   0.6480   0.3326   0.5785
                                                       B58     0.2446   1.0000   0.3748   0.4790
                                                       F14     0.1387   0.4142   1.0000   0.5901
                                                       TR1     0.1915   0.4192   0.4684   1.0000
Bipolar data, 7 quantization levels  2.59 (3.81%)      B52     1.0000   0        0        0
                                                       B58     0        1.0000   0.0412   0.1039
                                                       F14     0        0.0453   1.0000   0.0476
                                                       TR1     0        0.1107   0.0460   1.0000

In Table 6.3 an active threshold gain of 50% is assumed. The normalized analog data have slightly improved their normalized target crosscorrelation gains, while the normalized and offset analog data clearly reduce their normalized target crosscorrelation gains.
Compared to the normalized and offset analog data, the 5-level bipolar data have further noticeably improved their normalized target crosscorrelation gains, and again this quantitative variation is consistent with the previous figure. The 7-level bipolar data have all normalized target crosscorrelation gains equal to 0 except two terms. Finally, an active threshold of 60% of the maximum gain is assumed in Table 6.4. This time the normalized analog data have greatly reduced their normalized target crosscorrelation gains. The 5-level bipolar data have 0's in all their normalized target crosscorrelation terms except four, while the normalized analog data and the normalized and offset analog data continue improving their target crosscorrelation gains. The 7-level bipolar data finally have 0's in all their normalized target crosscorrelation terms.

Table 6.3 Estimated normalized target correlation gains subject to an active threshold of 50% for different spectral process data forms.

Spectral process input data form     No. (%) of active Input   Estimated normalized target correlation gain
                                     patterns          target  B52      B58      F14      TR1
Normalized analog data               22.53 (33.13%)    B52     1.0000   0.7998   0.6356   0.9655
                                                       B58     0.3372   1.0000   0.5211   0.6763
                                                       F14     0.3047   0.5923   1.0000   0.6368
                                                       TR1     0.3802   0.6310   0.5233   1.0000
Normalized and offset analog data    7.41 (10.90%)     B52     1.0000   0.3229   0.2058   0.1930
                                                       B58     0.1161   1.0000   0.1782   0.2693
                                                       F14     0.0991   0.2386   1.0000   0.2371
                                                       TR1     0.0818   0.3177   0.2087   1.0000
Bipolar data, 5 quantization levels  4.91 (7.22%)      B52     1.0000   0.1762   0.0239   0.0242
                                                       B58     0.0790   1.0000   0.0776   0.1954
                                                       F14     0.0125   0.0906   1.0000   0.2853
                                                       TR1     0.0095   0.1720   0.2153   1.0000
Bipolar data, 7 quantization levels  1.44 (2.12%)      B52     1.0000   0        0        0
                                                       B58     0        1.0000   0        0.0229
                                                       F14     0        0        1.0000   0
                                                       TR1     0        0.0238   0        1.0000

Table 6.4 Estimated normalized target correlation gains subject to an active threshold of 60% for different spectral process data forms.

Spectral process input data form     No. (%) of active Input   Estimated normalized target correlation gain
                                     patterns          target  B52      B58      F14      TR1
Normalized analog data               10.06 (14.79%)    B52     1.0000   0.4897   0.2111   0.4572
                                                       B58     0.1968   1.0000   0.2027   0.3782
                                                       F14     0.1048   0.2507   1.0000   0.4205
                                                       TR1     0.1993   0.4105   0.3689   1.0000
Normalized and offset analog data    4.12 (6.06%)      B52     1.0000   0.1274   0.0988   0.0317
                                                       B58     0.0455   1.0000   0.0718   0.1512
                                                       F14     0.0406   0.0822   1.0000   0.1060
                                                       TR1     0.0124   0.1649   0.1008   1.0000
Bipolar data, 5 quantization levels  2.41 (3.55%)      B52     1.0000   0        0        0
                                                       B58     0        1.0000   0        0.0900
                                                       F14     0        0        1.0000   0.0408
                                                       TR1     0        0.0867   0.0343   1.0000
Bipolar data, 7 quantization levels  1.03 (1.51%)      B52     1.0000   0        0        0
                                                       B58     0        1.0000   0        0
                                                       F14     0        0        1.0000   0
                                                       TR1     0        0        0        1.0000

Table 6.5 Comparisons of network discrimination iterations among different data forms by testing networks with contaminated stored patterns. Entries are RCAAM discrimination iterations for contaminated stored patterns, listed as: under 40 dB / under 0 dB ; average.

Information process form                             RCAAM/di22            RCAAM/fi12            RCAAM/ad12
Time domain process (bipolar data coding 7 levels)   3.17 / 5.33 ; 4.35    3.98 / 8.54 ; 6.76    5.54 / 6.04 ; 6.09
Spectral process (bipolar data coding 5 levels)      5.05 / 8.69 ; 7.12    5.49 / 7.94 ; 7.12    6.49 / 7.40 ; 9.71
Spectral process (bipolar data coding 7 levels)      2.61 / 4.71 ; 4.06    3.95 / 6.15 ; 5.61    6.41 / 7.51 ; 9.70

Let us check whether the above gain population distributions and the estimated normalized target correlation gains are consistent with the network discrimination iterations for the different data forms. To compare analog and bipolar data, the RCAAM/fi and RCAAM/ad should be used, since they have the same discrimination algorithm but different input data forms. In Table 6.5, p denotes the initial order and q represents the stability criterion in the networks RCAAM/dipq, RCAAM/fipq and RCAAM/adpq.
Comparing discrimination iterations between the time domain networks and the spectrum magnitude networks is inappropriate, since the time domain responses are less consistent as a result of the inconsistent detection of the beginning of the response. In Table 6.5, only the discrimination iterations for the 68 contaminated stored patterns under 40 dB and 0 dB, and the average over 10 SNR's, are listed. The 0 dB case may approximate the contaminated unstored pattern case. From Table 6.5, under 40 dB the order of discrimination iterations from low to high is: the 7-level bipolar spectral data < the 7-level bipolar time data < the 5-level bipolar spectral data ... the analog time data < the normalized and offset analog spectral data. This order is consistent with the correlation gain population distributions in the previous four figures and with the quantitatively estimated normalized target correlation gains in the previous three tables. Under 0 dB the order of discrimination iterations from low to high is: the 7-level bipolar spectral data < the analog time data < the 5-level bipolar spectral data < the 7-level bipolar time data < the normalized and offset analog spectral data. For the spectral data forms, the consistency is still kept. The 7-level bipolar time data format becomes worse than the analog time data, since the 7-level code suffers severe nonlinear distortion while the analog data keep their linearity when the contamination scale is statistically greater than 1.5 levels. Under serious contamination, the correlation gain population distribution curve should shift toward the left, since the dominant patterns with high correlation gain become confused by serious distortion and their high gains drop to low gains. The analog time data format has a smooth population distribution curve, while the 5-level bipolar spectral data format has an apparent population transition around a gain of 0.5.
Therefore, under severe contamination the high gains of the 5-level bipolar spectral data may drop below the transition gain, while the high gains of the analog time data decrease smoothly. The analog time data will then have better correlation-based discrimination than the 5-level bipolar spectral data. Since the normalized and offset analog spectral data have a large negative gain population at the low end of the gain scale as a result of the offsetting, heavy contamination can move their high gain population toward low gains, where it struggles with the negative gain population.

6.5 Network Simulations with Spectrum Magnitude Response

Here we simulate the major networks used in this study, the GI, HCAM, ECAM, RCAAM and GI cascaded networks, with bipolar data coded in 5 and 7 quantization levels. As described before, we use the bipolar data with 5 quantization levels because of their high linearity and the smaller oscillations in the spectrum magnitude response. The 7-level data format has a high correlation-based discrimination resolution and can better express the detailed response variations, as shown in the previous section, although it has worse linearity. Figure 6.13 shows the GI network performances vs. SNR for the 68 contaminated stored patterns, while Figure 6.14 presents the network generalization performances for the 64 contaminated unstored patterns. The 5-level bipolar data format demonstrates higher noise tolerance than the 7-level data format for both stored and unstored patterns. If a GI net trained with patterns of few quantization levels can converge with zero error, the well-trained GI net can discriminate all training patterns at that low resolution. Thus the GI net trained with low-resolution patterns can still work very well, while the GI net trained with high-resolution patterns suffers from noise sensitivity. The spectrum magnitude GI network still prefers to leave ambiguous inputs unknown rather than discriminate wrongly.
Figure 6.15 shows the HCAM and HCAM-GI cascade network performances vs. SNR for the 68 contaminated stored patterns, while Figure 6.16 presents the network generalization performances for the 64 contaminated unstored patterns. For the 5-level bipolar patterns, an HCAM of order 5 is used, since an order of 3 performs very poorly. For the 7-level bipolar patterns, an HCAM of order 3 is used, since it already performs better than the 5-level bipolar form. This again supports the analysis in the previous section that the 7-level data format has better correlation-based discrimination resolution. Both HCAM's, of order 5 for the 5-level code and of order 3 for the 7-level code, perform poorly for contaminated unstored patterns. With SNR higher than 0 dB, both HCAM-GI cascade networks improve the noise tolerance for the stored patterns. For the unstored patterns, the HCAM-GI cascade networks improve the correct discrimination rate, but they also raise the wrong discrimination rate by almost the same amount. Figure 6.17 shows the ECAM and ECAM-GI cascade network performances vs. SNR for the 68 contaminated stored patterns, while Figure 6.18 presents the network generalization performances for the 64 contaminated unstored patterns. Similar to the time domain ECAM, the spectrum magnitude ECAM performs excellently for both 5-level and 7-level bipolar data. Comparing the performances at SNR's below 0 dB, the 5-level bipolar patterns have a higher correct discrimination rate and a lower wrong discrimination rate than the 7-level bipolar patterns. This again supports our analysis that the 5-level data have better linearity than the 7-level data, so the 5-level data can keep their linearity under severe contamination where the linearity of the 7-level data no longer survives.
The correlation-based discrimination resolution becomes trivial and plays an unimportant role for the ECAM, since the ECAM already expands the discrimination space exponentially by using the exponentially weighted correlation gain. Therefore, only the linearity of the bipolar codes affects the ECAM performances. The ECAM still aggressively converges all inputs into two categories, correct or wrong patterns, and leaves no unknowns behind for both stored and unstored patterns. The cascaded GI network can therefore improve almost nothing.

Each of Figures 6.13 to 6.18 has two panels: (a) correct and wrong target discriminations (%) vs. SNR (dB), and (b) unrecognized (%) vs. SNR (dB).

Figure 6.13 Spectrum magnitude GI network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns.

Figure 6.14 Spectrum magnitude GI network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns.

Figure 6.15 Spectrum magnitude HCAM and HCAM-GI cascade network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns.

Figure 6.16 Spectrum magnitude HCAM and HCAM-GI cascade network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns.

Figure 6.17 Spectrum magnitude ECAM and ECAM-GI cascade network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns.

Figure 6.18 Spectrum magnitude ECAM and ECAM-GI cascade network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns.

Figure 6.19 shows the RCAAM/di22 and RCAAM/di22-GI cascade network performances vs. SNR for the 68 contaminated stored patterns, while Figure 6.20 presents the network generalization performances for the 64 contaminated unstored patterns. The first digit after 'di' stands for the initial order, and the second digit denotes the stability criterion (sc) used by the network.
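The role of the stability criterion can be pictured as a stopping rule. The following is a minimal sketch, assuming that an output counts as stable once the same state has been produced sc times in a row; the names run_until_stable and step are hypothetical, and the RCAAM correlation accumulation itself is not modeled, only the stopping logic around an arbitrary update function.

```python
def run_until_stable(step, state, sc, max_iter=100):
    """Apply `step` repeatedly until the same state is produced `sc` times in a row.

    Returns (final_state, iterations_used), or (None, max_iter) when no state
    persists for `sc` consecutive iterations within `max_iter` iterations.
    """
    streak = 0          # how many consecutive iterations produced `previous`
    previous = None
    for iteration in range(1, max_iter + 1):
        state = step(state)
        streak = streak + 1 if state == previous else 1
        previous = state
        if streak >= sc:
            return state, iteration
    return None, max_iter


def toy_step(s):
    """Toy update that climbs to 3 and then stays there."""
    return min(s + 1, 3)


for sc in (1, 2, 3):
    print("sc =", sc, "->", run_until_stable(toy_step, 0, sc))
```

In this toy run a larger sc postpones the decision until the state has settled, which costs more iterations; that is the trade-off between a quick output and a more reliable one.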
The 7-level bipolar data format prevails over the 5-level bipolar data format for the RCAAM/di22 because of its high correlation-based discrimination resolution. Here we use only a low initial order and sc to show the advantage over the HCAM with higher orders, 5 for the 5-level and 3 for the 7-level data. The low discrimination rates for contaminated unstored patterns can be improved by using a higher initial order. The RCAAM/di-GI cascade network has a chance to improve the performances for 5-level bipolar data, since the RCAAM/di22 using 5-level bipolar data leaves several unknowns behind because of its low discrimination resolution. Figure 6.21 shows the RCAAM/fi12 and RCAAM/fi12-GI cascade network performances vs. SNR for the 68 contaminated stored patterns, while Figure 6.22 presents the network generalization performances for the 64 contaminated unstored patterns. Comparing the RCAAM/fi12 performances between 5-level and 7-level bipolar data, the 7-level patterns prevail under fair and slight contamination, while the 5-level patterns perform better under serious distortion. This is a good example showing that the discrimination resolution factor leads network performances where the code linearity still functions, and the linearity factor dominates network performances where the code linearity fails. For the RCAAM/fi12-GI cascade networks, the 5-level bipolar data format always performs better than the 7-level bipolar data format for both stored and unstored patterns, since the 5-level GI net trained with lower-resolution data has higher noise tolerance. It is remarkable that the RCAAM/fi12-GI cascade network, using such a small discrimination space, performs the same as the ECAM for the contaminated stored patterns.
From the above observations, we can expect that the high correlation-based discrimination resolution advantage of the 7-level bipolar data will fade as the stability criterion (sc) becomes higher, since a higher sc delays the discrimination output until an output stays at the same state for sc correlation accumulations. During the sc correlation accumulations, the discrimination resolution is increased, but the code linearity stays the same. Thus lower-resolution data, 5-level, gain discrimination resolution from a high sc, but higher-resolution data, 7-level, cannot gain linearity from the sc. Hence, the linearity advantage gradually appears as the sc goes higher.

Each of Figures 6.19 to 6.24 has two panels: (a) correct and wrong target discriminations (%) vs. SNR (dB), and (b) unrecognized (%) vs. SNR (dB).

Figure 6.19 Spectrum magnitude RCAAM/di22 and RCAAM/di22-GI cascade network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns.

Figure 6.20 Spectrum magnitude RCAAM/di22 and RCAAM/di22-GI cascade network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns.

Figure 6.21 Spectrum magnitude RCAAM/fi12 and RCAAM/fi12-GI cascade network performances with bipolar data coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns.

Figure 6.22 Spectrum magnitude RCAAM/fi12 and RCAAM/fi12-GI cascade network generalization performances with bipolar data coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns.

Figure 6.23 Spectrum magnitude RCAAM/ad12 and RCAAM/ad12-GI cascade network performances with bipolar output forms coding 5 and 7 levels vs. SNR for 68 contaminated stored patterns.

Figure 6.24 Spectrum magnitude RCAAM/ad12 and RCAAM/ad12-GI cascade network generalization performances with bipolar output forms coding 5 and 7 levels vs. SNR for 64 contaminated unstored patterns.

Figure 6.23 shows the RCAAM/ad12 and RCAAM/ad12-GI cascade network performances vs. SNR for the 68 contaminated stored patterns, while Figure 6.24 presents the network generalization performances for the 64 contaminated unstored patterns. Compared to the RCAAM/fi12, especially for unstored patterns, the analog spectrum magnitude has low correlation-based discrimination resolution. Since a low sc, 2, is set, the patterns with low correlation-based discrimination resolution still remain confused together after residing at the same state for 2 times, and many inputs are then left unrecognized. From the RCAAM/ad12 performances for contaminated unstored patterns, we can see that the 7-level bipolar data format has better pattern distinction resolution than the 5-level data format, since the weighted gains, i.e. the correlation gains between an analog input and the stored analog patterns, applied to both the 5-level and 7-level bipolar data are the same. A better pattern distinction resolution indicates that the patterns (or codes) are highly distinct from each other, and thus have high correlation-based discrimination resolution.
The near-zero wrong discrimination of the RCAAM/ad12 indicates that the normalized and offset analog spectrum magnitude patterns have very high linearity, so that under noisy conditions they only become ambiguous and do not move close to any wrong pattern. The cascaded GI net converges most of those ambiguous states to their true corresponding targets, so the combination of the RCAAM/ad and GI networks is a very good scheme. Compared to the time domain RCAAM/ad12, the effect of the correlation-based discrimination resolution of the data form on network performance is clear.

6.6 Network Comparisons

Several bipolar/binary networks, the Hopfield recurrent net and BAM's, have been simulated for the purpose of comparison. To demonstrate the RCAAM's advantages over other popular networks, the following neural networks have been simulated: the Hopfield recurrent network, Bidirectional Associative Memories (BAM) and Multi-Layer feedforward and error-BackPropagation (ML/BP) networks.

The autoassociative Hopfield net always ends in undefined states, so no input is discriminated. The BAM also performs poorly and always converges to wrong heteroassociative partners. We have altered the BAM process strategy by using a target group code in its heteroassociative partners; we then discriminate only the target group code portion of the final stable state and ignore the rest of the code. This greatly improved the correct recognition rate and also reduced the wrong rate in simulations. In our simulations, we design a set of 7-bit heteroassociative codes corresponding to the 68 300-bit stored patterns. The first two bits are designed as a target group code for four different targets, i.e. [-1 -1] for B52, [-1 1] for B58, [1 -1] for F14 and [1 1] for TR1, and the next five bits are coded to represent the 17 azimuthal responses of each target. Therefore, this altered BAM will discriminate an input as a correct target or a wrong target, and leave none unknown.
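The group-coded readout described above can be sketched as follows. This is only an illustration: the two group bits per target are taken from the text, but the function names are hypothetical, and the five aspect bits are assumed here to be a plain binary count of the aspect index (the text does not say how they are coded).

```python
# Group bits per target as given in the text; everything else is illustrative.
GROUP_CODE = {
    "B52": (-1, -1),
    "B58": (-1, 1),
    "F14": (1, -1),
    "TR1": (1, 1),
}


def aspect_bits(index):
    """Encode an aspect index 0..16 as five bipolar bits.

    ASSUMPTION: plain binary, most significant bit first.
    """
    if not 0 <= index <= 16:
        raise ValueError("aspect index must be in 0..16")
    return tuple(1 if (index >> shift) & 1 else -1 for shift in range(4, -1, -1))


def partner_code(target, index):
    """Build the 7-bit heteroassociative partner: 2 group bits + 5 aspect bits."""
    return GROUP_CODE[target] + aspect_bits(index)


def decode_target(final_state):
    """Read only the two group bits of the final stable state; ignore the rest."""
    group = tuple(final_state[:2])
    for name, code in GROUP_CODE.items():
        if code == group:
            return name
    raise ValueError("group bits are not bipolar")


code = partner_code("F14", 5)
print(code)                 # 7-bit bipolar partner code
print(decode_target(code))  # target read from the group bits only
```

Because any bipolar pair of group bits maps to one of the four targets, a readout of this kind can only be correct or wrong, which matches the two-phase behaviour reported for the altered BAM.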
For example, if the BAM using group code has 36.2% and 8.7% correct discriminations at 40 and 0 dB respectively, then the remaining 63.8% and 91.3% are all wrong discriminations.

We also simulate two ML/BPs, one without a hidden layer and another with one hidden layer containing 25 neurons, to compare their performances to the networks we used. We use the 100 uncoded analog response samples as training inputs for each training pattern, and use the same output target set used by the GI net. Therefore, they have analog inputs and bipolar outputs. The initial weights are set randomly, and a momentum term has been used to reduce the chance of being trapped in local minima. Compared to the GI net, the ML/BPs require more training epochs and manipulations to converge all training patterns to their desired targets with zero error. Although a momentum term has been used, the training is still trapped in local minima a number of times. The ML/BP nets behave similarly to the GI net. As expected, they prefer to leave ambiguous inputs unknown rather than discriminate wrongly.

We have used two different numbers of discrete levels to quantize an analog spectrum magnitude, 5 levels and 7 levels. Fewer quantization levels give not only higher linearity but also lower noise sensitivity, although they may lose resolution. In a severely noisy condition, the coding's linearity and noise tolerance are sometimes more important. From figures 1 and 6, it is easy to see that the time responses oscillate over a much larger range than the frequency spectral magnitudes. Therefore 7 levels are used for the time responses to represent their large oscillation range accurately. Different factors prevail in different networks. The noise tolerance and linearity factors prevail in training-based convergent networks, i.e. the GI net, while the resolution factor prevails in correlation-based recurrent memories under moderately noisy conditions.
For correlation-based recurrent memories, more quantization levels will enlarge the correlation gain of two similar patterns and reduce the gain of two unlike patterns. This two-sided effect greatly increases the discrimination resolution, so the recurrent network performance will improve.

To illustrate in detail the performances of 9 different networks under 10 different SNR levels, six figures are plotted. Figure 6.25 shows the performance comparisons of time domain networks with 7-level bipolar data for 68 contaminated stored patterns, while Figure 6.26 presents the network generalization performance comparisons for 64 contaminated unstored patterns. Figure 6.27 shows the performance comparisons of spectrum magnitude networks with 5-level bipolar data for 68 contaminated stored patterns, while Figure 6.28 presents the network generalization performance comparisons for 64 contaminated unstored patterns. Figure 6.29 shows the performance comparisons of spectrum magnitude networks with 7-level bipolar data for 68 contaminated stored patterns, while Figure 6.30 presents the network generalization performance comparisons for 64 contaminated unstored patterns.

[Figures 6.25 to 6.30 each have three panels plotted against SNR (dB): (a) correct target discriminations (%), (b) wrong target discriminations (%), (c) unrecognized (%). Legend: Point = BAM using target group code; Plus = continuous ML/BP w/o hidden layer; Dotted = continuous ML/BP w/ one hidden layer; Star = GI Net; x-Mark = HCAM(order 5)-GI; Circle = ECAM-GI; Dashed = RCAAM/di22-GI; Solid = RCAAM/fi12-GI; Dashdot = RCAAM/ad12-GI.]

Figure 6.25 Performance comparisons of different time domain networks (7-level bipolar data, if used) vs. SNR for 68 contaminated stored patterns.

Figure 6.26 Generalization performance comparisons of different time domain networks (7-level bipolar data, if used) vs. SNR for 64 contaminated unstored patterns.

Figure 6.27 Performance comparisons of different spectrum magnitude process networks (5-level bipolar data, if used) vs. SNR for 68 contaminated stored patterns.

Figure 6.28 Generalization performance comparisons of different spectrum magnitude process networks (5-level bipolar data, if used) vs. SNR for 64 contaminated unstored patterns.

Figure 6.29 Performance comparisons of different spectrum magnitude process networks (7-level bipolar data, if used) vs. SNR for 68 contaminated stored patterns.

Figure 6.30 Generalization performance comparisons of different spectrum magnitude process networks (7-level bipolar data, if used) vs. SNR for 64 contaminated unstored patterns.

For correlation-based networks, we use their GI-cascaded networks for performance comparisons. All HCAM's use an order of 5 in the comparisons, since we want the best performance of the HCAM to compare with the RCAAM. In time domain networks, it is clear that the correlation-based networks, except the BAM, perform better than the training-based networks, regardless of the discrimination space they need. The RCAAM/ad12-GI cascade network has the best performance, and the ECAM-GI and the RCAAM/di22-GI have almost the same performances. The BAM has the worst results, since, to improve its discrimination, its discrimination philosophy is altered to the bi-state, correct or wrong.

In the spectrum magnitude networks, the RCAAM/ad12-GI cascade network no longer has the best performance, since the normalized and offset analog spectrum magnitude has lower correlation-based discrimination resolution than the 5-level and 7-level bipolar data. The RCAAM/fi12-GI and ECAM-GI cascade networks using 5-level bipolar data have the best performances, since the linearity of the 5-level bipolar code can survive severe noise. Again, the BAM has the worst performance. The HCAM-GI cascade network using 5-level bipolar data performs poorly for contaminated unstored patterns, but the HCAM-GI cascade network using 7-level bipolar data performs similarly to the RCAAM/di22-GI network.
But an HCAM with an order of 5 means that each recurrent operation needs 5 multiplications to obtain the correlation gain for one stored pattern. Therefore, compared to the RCAAM, the HCAM needs 5 times as many hardware operations, or clocks in a digital circuit, to finish the correlation evaluation.

We summarize the architectures of the recurrent correlation associative networks used in Table 6.6. The computation space for RCAAM's is iteratively adaptive. For lightly contaminated patterns, the computation space required for discrimination is small because few iterations are needed, while a larger space is required for highly distorted patterns. Row 5 presents the knowledge available from the network operations about contamination, or about the similarity between the input and the final stable output. The ECAM converges most distorted inputs to some stable state within 3 iterations, since it expands the discrimination space extremely. Therefore, we are unable to determine contamination from the ECAM. The HCAM's with small order are usually trapped in spurious stable states, while the RCAAM typically avoids that with its accumulatively dynamic memory and the adjustment of sc. The decisions are deterministic for both the HCAM and the ECAM once their outputs do not change for one iteration, since their memories are fixed.

The architectures and numerical performances are summarized in Table 6.7 for time domain process networks, in Table 6.8 for spectrum magnitude process networks with 5 quantization levels coded by 3 bits and in Table 6.9 for spectrum magnitude process networks with 7 quantization levels coded by 3 bits. Column 4 in each table presents the training epochs and final errors for the ML/BPs and the GI networks.
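The difference in computation scale between the HCAM and the ECAM discussed above can be made concrete with a small sketch of one fixed-memory recurrent correlation update. It assumes the usual form in which each stored bipolar pattern is weighted by a function of its correlation with the current state: the correlation raised to the order for the HCAM, and a base raised to the correlation for the ECAM. That assumption is consistent with the 300^5 and 2^300 scales listed in the tables, but the function names and the toy patterns are illustrative only, and the RCAAM's dynamic memory is not modelled here.

```python
def correlation(x, u):
    """Inner product of two bipolar vectors."""
    return sum(a * b for a, b in zip(x, u))


def hcam_weight(order):
    """High-order weighting: correlation ** order (ASSUMED form)."""
    return lambda c: c ** order


def ecam_weight(base=2):
    """Exponential weighting: base ** correlation (ASSUMED form)."""
    return lambda c: base ** c


def rcam_step(x, stored, weight):
    """One recurrent update: threshold the gain-weighted sum of stored patterns."""
    gains = [weight(correlation(x, u)) for u in stored]
    net = [sum(g * u[j] for g, u in zip(gains, stored)) for j in range(len(x))]
    return [1 if s >= 0 else -1 for s in net]


# Three mutually orthogonal 8-bit toy patterns (not radar data).
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
]
noisy = [-1] + stored[0][1:]  # first pattern with its first bit flipped

print(rcam_step(noisy, stored, hcam_weight(5)))
print(rcam_step(noisy, stored, ecam_weight(2)))
```

With 300-bit patterns the largest weight is 300**5 for an order-5 HCAM but 2**300 for the ECAM, which illustrates why the ECAM is described as nearly incapable of hardware realization.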
The first number in this column is the number of training epochs after which the network training converges to 0 error, while the second number is the number of extra training cycles made to ensure that all training pattern outputs enter deeply into the saturation regions of the sigmoid function. Theoretically, this will increase the noise tolerance and decrease biased learning.

Table 6.6 Architecture summary of the recurrent correlation associative networks used.

Network | HCAM | ECAM | RCAAM
Input form | Binomial | Binomial | Binomial or Analog
Process structure / Recurrent operation | Fixed memory / Adaptive | Fixed memory / Adaptive | Dynamic memory / Accumulatively adaptive
Computation space | Small for small orders | Extremely huge | Fit for discrimination
Observability about contamination or similarity between input and the final output | High for small orders; low for high orders | Very little | High
Possibility trapped in unknown stable states | Very high for small orders | Very little | Adjustable
Decision strategy | Deterministic | Deterministic | Flexible
Hardware realization | Capable for fair orders | Nearly incapable | Capable
Multiplication operation times (clocks) required to calculate a gain for each stored pattern at each update in a digital circuit | Times of the order used; i.e. 5 times for the HCAM with order of 5 | (1 + correlation gain) times; the correlation-gain times of weighting are spent on calculating the power of the exponential base | 1 time

Column 5 presents the average number of iterations which the recurrent networks require to reach the stable state adopted for target discrimination. The upper number in every row, corresponding to 'Trained', is the result obtained by testing the contaminated trained/stored patterns, while the lower number, corresponding to 'Untrained', is the result obtained by testing the contaminated untrained/unstored patterns. Column 6 presents the minimum SNR in dB at which the network still performs 95% correct discrimination.
Column 7 shows the correct and wrong discrimination rates in percent at 40 and 0 dB respectively. The number pair to the left of the ',' is the correct discrimination at 40 and 0 dB, while the number pair to the right of the ',' is the wrong discrimination. Finally, column 8 presents the maximum integer computation scale required for processing one stored pattern.

High resolution and linearity are the most important issues as far as the input data format is concerned in evaluating the network performance. High resolution expands the differences between any two trained/stored patterns and thus makes pattern discrimination easier, while the linearity between an input and the stored patterns can greatly increase the confidence in network performance. But analog data are hard to use in a recurrent associative update, so the analog networks can't iteratively adapt to the final stable state. Without using an analog signal, we may not confidently determine the contamination or the similarity between an input and the network output. Also, an analog network can't be cascaded to a high-performance, high-dimension recurrent autoassociative network as a group decoder. Unless an analog network is capable of hardware realization, it can't take advantage of today's digital computer technologies. Binomial data have only two states, by which an artificial neuron model emulates the bi-state, activated and inactive, of a biological neuron. Therefore, binomial data are well qualified for use in recurrent and cascade operations, like complicated biological neural nets. Binomial data use much less space than continuous data in a digital computer process, and they can be further compressed. Quantizing continuous data by finite discrete levels can also give some tolerance of light contamination. And binary data are easy and safe to store for long periods of time.

In Tables 6.7 to 6.9, paired entries a; b are for trained (stored); untrained (unstored) patterns, and X marks a column that does not apply.

Table 6.7 Network architectures and performances summary for time domain target discrimination with 7 quantization levels encoded by 3 bits.

Network | Input/Output form | Memory size | Training epochs/Error | Avg. recog. iterations | Min. dB w/ 95% correct recog. | Percent (%) of correct, wrong recog. at 40/0 dB | Max. integer comput. scale
Hopfield net | Bip/Bip | 300x300 | X | 9.68; 9.71 | None; None | 0/0, 0/0; 0/0, 0/0 | 300^2
BAM using group code | Bip/Bip | 300x7 | X | 4.28; 4.25 | 20 dB; None | 96.0/63.6, 4/36; 83/58.2, 17/42 | 300^2
ML/BP w/o hidden layer | Analog/Bip | 100x4 | 164+1012/0 | X | 10 dB; 14 dB | 100/59.3, 0/4; 98.0/56.6, 0/5 | Not integer
ML/BP w/ 1 hidden layer | Analog/Bip | 100x25+25x4 | 624+414/0 | X | 14 dB; 14 dB | 100/56.6, 0/3; 99.5/54.2, 0/4 | Not integer
GI Net | Bip/Bip | 300x4 | 3+120/0 | X | 10 dB; 14 dB | 100/42.4, 0/11; 100/42.3, 0/12 | Not integer
HCAM (order of 3) | Bip/Bip | 68x300 | X | 5.02; 6.12 | None; None | 80.6/52.5, 0/2; 63/47.2, 0/3 | 300^3
HCAM (order of 5) | Bip/Bip | 68x300 | X | 3.44; 3.76 | 20 dB; None | 97.5/84, 0/7; 92.0/85.2, 0/6 | 300^5
ECAM | Bip/Bip | 68x300 | X | 3.01; 3.03 | 3 dB; 3 dB | 100/90.7, 0/9; 100/90.9, 0/9 | 2^300
RCAAM/di22-GI | Bip/Bip | 68x300 | X | 4.35; 5.15 | 3 dB; 3 dB | 100/90.9, 0/9; 100/91.7, 0/8 | Adaptive
RCAAM/fi12-GI | Bip/Bip | 68x300 | X | 6.76; 7.88 | 3 dB; 3 dB | 100/89.6, 0/8; 100/89.4, 0/7 | Adaptive
RCAAM/ad12-GI | Analog/Bip | 68x100+68x300 | X | 6.09; 7.3 | -6 dB; -6 dB | 100/100, 0/0; 100/100, 0/0 | Adaptive

Table 6.8 Network architectures and performances summary for spectrum magnitude target discrimination with 5 quantization levels encoded by 3 bits.

Network | Input/Output form | Memory size | Training epochs/Error | Avg. recog. iterations | Min. dB w/ 95% correct recog. | Percent (%) of correct, wrong recog. at 40/0 dB | Max. integer comput. scale
Hopfield net | Bip/Bip | 300x300 | X | 5.40; 5.29 | None; None | 0/0, 0/0; 0/0, 0/0 | 300^2
BAM using group code | Bip/Bip | 300x7 | X | 4.73; 4.71 | None; None | 55.9/48.1, 44/52; 44.4/40.6, 56/59 | 300^2
ML/BP w/o hidden layer | Analog/Bip | 100x4 | 1981+864/0 | X | 14 dB; None | 100/65.2, 0/10; 87.5/64.4, 2/11 | Not integer
ML/BP w/ 1 hidden layer | Analog/Bip | 100x25+25x4 | 1337+678/0 | X | 10 dB; None | 100/62.5, 0/10; 90.6/60.2, 2/12 | Not integer
GI Net | Bip/Bip | 300x4 | 10+120/0 | X | 6 dB; 20 dB | 100/75.3, 0/5; 95.3/73.4, 3/6 | Not integer
HCAM (order of 3) | Bip/Bip | 68x300 | X | 6.56; 6.22 | None; None | 2.79/0, 0/0; 0/0, 0/0 | 300^3
HCAM (order of 5) | Bip/Bip | 68x300 | X | 4.51; 7.71 | None; None | 94.9/57.2, 0/0; 32.5/28, 0/0 | 300^5
ECAM | Bip/Bip | 68x300 | X | 2.95; 3.05 | -3 dB; -3 dB | 100/100, 0/0; 100/98.3, 0/2 | 2^300
RCAAM/di22-GI | Bip/Bip | 68x300 | X | 7.12; 9.42 | 3 dB; None | 100/87.5, 0/12.5; 87.5/79.4, 12/20 | Adaptive
RCAAM/fi12-GI | Bip/Bip | 68x300 | X | 7.12; 10.06 | -3 dB; -3 dB | 100/100, 0/0; 98.9/97.2, 0/2 | Adaptive
RCAAM/ad12-GI | Analog/Bip | 68x100+68x300 | X | 9.71; 12.85 | 0 dB; 3 dB | 100/99, 0/0; 98.8/93.1, 0/0 | Adaptive

Table 6.9 Network architectures and performances summary for spectrum magnitude target discrimination with 7 quantization levels encoded by 3 bits.

Network | Input/Output form | Memory size | Training epochs/Error | Avg. recog. iterations | Min. dB w/ 95% correct recog. | Percent (%) of correct, wrong recog. at 40/0 dB | Max. integer comput. scale
Hopfield net | Bip/Bip | 300x300 | X | 6.37; 6.19 | None; None | 0/0, 0/0; 0/0, 0/0 | 300^2
BAM using group code | Bip/Bip | 300x7 | X | 4.17; 4.13 | None; None | 83.4/70, 17/30; 77.7/59.5, 22/40 | 300^2
ML/BP w/o hidden layer | Analog/Bip | 100x4 | 1981+864/0 | X | 14 dB; None | 100/65.2, 0/10; 87.5/64.4, 2/11 | Not integer
ML/BP w/ 1 hidden layer | Analog/Bip | 100x25+25x4 | 1337+678/0 | X | 10 dB; None | 100/62.5, 0/10; 90.6/60.2, 2/12 | Not integer
GI Net | Bip/Bip | 300x4 | 10+120/0 | X | 10 dB; None | 100/66.3, 0/10; 89.5/63.4, 3/9 | Not integer
HCAM (order of 3) | Bip/Bip | 68x300 | X | 5.18; 9.38 | None; None | 96.5/59.9, 0/0; 48.6/33.1, 0/0 | 300^3
HCAM (order of 5) | Bip/Bip | 68x300 | X | 3.28; 3.4 | 0 dB; 3 dB | 100/99.1, 0/0; 96.6/92, 3/8 | 300^5
ECAM | Bip/Bip | 68x300 | X | 2.96; 3.05 | 0 dB; 0 dB | 100/99.9, 0/0; 97.5/95.3, 3/5 | 2^300
RCAAM/di22-GI | Bip/Bip | 68x300 | X | 4.06; 5.83 | 0 dB; 14 dB | 100/99.6, 0/0; 95.6/90.3, 4/10 | Adaptive
RCAAM/fi12-GI | Bip/Bip | 68x300 | X | 5.61; 8.08 | 0 dB; 0 dB | 100/99.7, 0/0; 98.4/95.2, 2/4 | Adaptive
RCAAM/ad12-GI | Analog/Bip | 68x100+68x300 | X | 9.7; 13.37 | 0 dB; 3 dB | 100/99, 0/0; 100/93.3, 0/0 | Adaptive

CHAPTER 7

Conclusions

We have used several different neural network architectures to discriminate among radar targets at a wide variety of aspect angles. From the simulations, it appears that correlation-based neural networks have powerful and effective problem solving abilities. Comparing the GI network to the recurrent correlation-based associative memories, we find that a well trained GI network has quite a good attraction basin, within which the GI net can correctly converge from a state to its associated stored pattern. But the noise tolerance of the GI network is inferior to that of the correlation-based associative memories. Typical RCAM's have an off-line, predetermined fixed memory, limiting the network flexibility and adaptability.
Therefore, some RCAM networks may blindly reserve a huge computation space to operate in regardless of the inputs, and some may not work well if used with an insufficient predetermined order.

We have proposed a flexible and highly adaptive real-time learning network, the RCAAM, which has a dynamic memory to allow the given input to adapt in a parallel fashion to all stored patterns, and an adjustable stability criterion to observe contamination and semi-stable states. The adaptive mechanism can be expressed from two points of view. We may say that the dynamic memory learns to adapt to the input; therefore the adaptive (or changing) direction along which the dynamic memory changes is subject to the input presented. The dynamic memory seemingly tries to find a way to compromise with the given input at a low energy-like state. In contrast, we may also say that, with the memory fixed, the input keeps changing (or updating) in order to adapt to the memory constructed from the stored patterns through recurrent learning. Hence the resources required for discrimination are dynamic and just fit to discriminate the input presented, and are no longer deterministic.

A flexible decision strategy should be considered whenever noise contamination is involved in network performance. The contamination observability of the RCAAM through semi-stable states allows us to estimate the input contamination and the corresponding network reliability (or confidence), so that we can decide to accept or discard the network discrimination output based on that information. The network may have several semi-stable states corresponding to the same input if the stability criterion is set to a low value, say 1. The spurious (unknown) semi-stable states will further converge to one of the stored patterns if the stability criterion is raised. Thus, the discrimination decision strategy is flexible: 3 phases, correct/wrong/unknown, or 2 phases, correct/wrong.
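The role of the stability criterion in this flexible decision strategy can be sketched as below. The sketch assumes that the criterion sc counts consecutive iterations in which the output does not change. The update used here is a stateful toy stand-in that simply replays a fixed sequence of states; it is not the RCAAM iteration, and all names and patterns are hypothetical.

```python
def run_until_stable(update, state, sc, max_iter=100):
    """Iterate until the state is unchanged for `sc` consecutive iterations.

    ASSUMPTION: this is one reading of the stability criterion sc.
    Returns the final state and the number of iterations used.
    """
    unchanged = 0
    for iteration in range(1, max_iter + 1):
        new_state = update(state)
        unchanged = unchanged + 1 if new_state == state else 0
        state = new_state
        if unchanged >= sc:
            return state, iteration
    return state, max_iter


def classify(state, stored):
    """Return the label of the matching stored pattern, else 'unknown'."""
    for label, pattern in stored.items():
        if pattern == state:
            return label
    return "unknown"


def make_toy_update(trajectory):
    """Stateful stand-in for a dynamic-memory network: replays `trajectory`,
    then keeps returning its last state."""
    steps = iter(trajectory)
    final = trajectory[-1]
    return lambda _state: next(steps, final)


STORED = {"F14": (1, -1, 1, 1), "TR1": (1, 1, -1, -1)}
NOISY = (-1, -1, -1, 1)
SPURIOUS = (1, -1, -1, 1)  # semi-stable state matching no stored pattern
PATH = [SPURIOUS, SPURIOUS, STORED["F14"]]

for sc in (1, 2):
    final, iterations = run_until_stable(make_toy_update(PATH), NOISY, sc)
    print(sc, iterations, classify(final, STORED))
```

With sc = 1 the toy run stops at the semi-stable state and the input is reported as unknown; with sc = 2 it continues until a stored pattern is reached, at the cost of more iterations.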
We have simplified the RCAAM iteration to an easily realizable implementation form, which speeds up the dynamic memory computations. The GI-cascaded network usually improves the performance under fair contamination, and saves a great deal of discrimination space. A GI network cascaded to an RCAAM with a low initial order and stability criterion is an effective and economical combination.

We have also used spectrum magnitudes as the network processing patterns. The spectrum magnitude network simulations produce excellent results for the ECAM-GI, the HCAM-GI with order higher than 5, and the RCAAM-GI. We find that linearity and resolution are the most important factors when the data format is considered in network performance. We have analyzed the correlation-based discrimination resolutions for different data formats, and find that the bipolar data format usually has a better correlation-based discrimination resolution for both time and spectral domain data. The analog data format always has better linearity than bipolar data. A nonlinear normalization with additional offsetting can convert the useless analog spectrum magnitude into useful analog data with a correlation-based discrimination resolution and a correlation gain population similar to those of the bipolar data expressing 5 quantization levels. We find that, with the number of coding bits fixed, the code with fewer quantization levels has better linearity. The RCAAM has the same or better performance, with great space savings compared to the ECAM and the same implementation complexity as the HCAM. The bipolar process network performances show that the spectral processes produce results comparable to those using time responses, although half the information has been ignored. Hence, the spectrum magnitude process is an alternative, effective and realizable technique.
Regarding hardware implementation, the ECAM processing high-dimension patterns can't be realized, since known physical materials can't represent the huge dynamic range needed for the ECAM's pattern discrimination. Although the HCAM with order higher than 5 can perform very well, the high order will reduce its updating speed. Consider an HCAM with order of 7. Since the hardware multiplication in the HCAM is repeated 7 times between the current input and each stored pattern on each update, the time spent on the correlation-gain calculation for each stored pattern is 7 times that of the RCAAM (6 times more) if a control clock is used. Therefore, the HCAM with higher order is less efficient and less flexible than the RCAAM. Typically, the RCAAM/fi requires one more pattern-dimension register than the RCAAM/di in hardware realization, if the input and output registers can be triggered in different directions.

Some improvements can be developed to relieve the interference resulting from negative correlation gains. This can be done simply by adding a positive-valued offset to the correlation gain between the input and each stored pattern. However, this offset will raise the correlation mean, so the correlation-based discrimination resolution will be decreased and the discrimination processing time will be increased. A better improvement in evaluating the correlation gain for the stored pattern i with bipolar format is given below:

\[ \mathrm{Gain}_i = \left( \mathrm{Input} \oplus \mathrm{Weighted\ Pattern}_i \right) \times \left[ \mathrm{Sum}\left( \mathrm{Input} \odot \mathrm{Pattern}_i \right) \right] \tag{1} \]

where $\oplus$ denotes the inner product and $\odot$ represents the Exclusive-NOR (or Equivalence) operation, which gives an active output only if two input bits are the same.
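A minimal numerical sketch of equation (1) follows. It assumes that the weighted pattern is the stored bipolar pattern scaled by a single scalar weight, which the text does not specify, and the function names and 8-bit vectors are illustrative only.

```python
def xnor_sum(x, u):
    """Sum of Exclusive-NOR outputs: number of positions where the bits agree."""
    return sum(1 for a, b in zip(x, u) if a == b)


def improved_gain(x, u, weight=1.0):
    """Equation (1): inner product with the weighted pattern times the XNOR sum.

    ASSUMPTION: the weighted pattern is `weight * u` with a scalar weight.
    """
    inner = sum(a * weight * b for a, b in zip(x, u))
    return inner * xnor_sum(x, u)


x = [1, 1, 1, 1, 1, 1, 1, 1]
u_close = [1, 1, 1, 1, 1, 1, 1, -1]    # agrees with x in 7 of 8 bits
u_far = [-1, -1, -1, -1, -1, -1, -1, 1]  # agrees with x in 1 of 8 bits

print(improved_gain(x, u_close))  # inner product 6, XNOR sum 7
print(improved_gain(x, u_far))    # inner product -6, XNOR sum 1
```

For unit weight and n-bit bipolar vectors, the XNOR sum equals (n + inner product) / 2, so the bracketed term grows with positive correlation and shrinks toward zero for strongly negative correlation.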
In this improvement, the negative gains will be inhibited by the bracketed term, since the correlation gain with a large negative value indicates that the different bits between the input and the evaluated pattern are more than the same bits, and then the sum between these two bipolar vectors should be small. In contrast, the bracket term will benefit the positive gains, since the positive gains indicate that more corresponding bits between the input and the evaluated pattern have the same signs, and then the sum of the Exclusive-NOR outputs should be large . The hardware implementation for an Exclusive-NOR is not difficult at all. To improve the converging consistency, a weighted previous network sum can be added to the current network sum before the nonlinear threshold fiinction. This weighted previous network sum can be considered as a momentum to keep the converging trace consistent, and the older momenta with weighting less than 1 will graduately decay when time goes on. Some target discrimination methods combined with other techniques, the discrete 298 wavelet transform and the late-time natural fi'equency integration filter scheme, have also been used during this research period and demonstrated very pleasing results. A perspective study will be devoted to a complicated neural network synthesis which is expected to be able to predict sequential actions and estimate its own performance confidence (or error) to a causal system. Since the updated desired outputs used for network training are temporarily generated by the synthesis network itself, the network training can converge only if the self- consistence criterion is satisfied. BIBLIOGRAPHY 299 BIBLIOGRAPHY [1]. P.11avarasan, J. E. Ross, E. J. Rothwell, K. M. Chen, and D. P. Nyquist, "Performance of an automated radar target discrimination scheme using E pulses and S pulses, "IEEE Trans. Antennas Propagat., vol. 41, pp. 582-588, May 1993. [2]. C. E. Baum, E. J. Rothwell, K. M. Chen , and D. P. 
Nyquist, "The singularity expansion method and its application to target identification, "Proc. IEEE, vol. 79, no. 10, pp.1481- 1492, October 1991. [3]. E. J. Rothwell, K. M. Chen, D. P. Nyquist, and W. M. Sun, "Frequency domain E-pulse synthesis and target discrimination, "IEEE Trans. Antennas Propagat., vol. AP-3 5, pp. 426- 434, April 1987. [4]. EM. Kennaugh, "The K-pulse concept, "IEEE Trans. Antennas Propagat., vol. AP-29, no. 2, pp. 329-331, March 1981. [5]. F. Y. S. Fok, "K-pulse estimation for a right-angle bent wire using more than one impulse response, "IEEE Trans. Antennas Propagat., vol. 38, pp. 1092-1098, July 1990. [6]. J. P. Bayard and D. H. Schaubert, "Target identification using optimization techniques, "IEEE Trans. Antenna Propagat., vol. 38, pp. 450-456, April 1990. [7]. M. A. Morgan, "Scatterer discrimination based upon natural resonance annihilation, "J. Electromagnetic Waves Appl, vol. 2, no. 5/6, pp. 481-502, 1988. [8]. D. G. Dudley and D. M. Goodman, "Transient identification and target classification" in Time-Domain Measurements in Electromagnectics, E. K. Miller, Ed. New York : Van Nostrnad-Reinhold, 1986, pp. 456-497. [9]. C. W. Chuang and D. L. Moffatt, "Natural resonances of radar targets via Prony's method and target discrimination, "IEEE Trans. Aerosp. Electron. Syst., vol. AES-12, pp. 583-589, Sept. 1986. [10]. E. J. Rothwell, K. M. Chen, D. P. Nyquist, and J. E. Ross, "Time-domain imaging of airborne targets using ultra-wideband or short-pulse radar, "IEEE Trans. Antennas Propagat., vol. 43, no. 3, March 1995. 300 [11]. C. L. Bennet, "Time-domain inverse scattering, "IEEE Trans. Antennas Propagat., vol. AP-29, no. 2, pp. 213-219, March 1981. [12]. J. D. Young, "Radar imaging from ramp response signatures, "IEEE Trans. Antennas Propagat., vol. AP-24, pp. 276-282, May 1976. [13]. M. C. Lin and Y. W. Kiang, "Target discrimination using multiple-frequency amplitude returns, "IEEE Trans. Antennas Propagat., vol. 38, pp. 1885-1889, Nov. 
1990. [14]. M. C. Lin, Y. W. Kiang and H. J. Li, "Experimental discrimination of wire stick targets using multiple-frequency amplitude returns, "IEEE Trans. Antennas Propagat., vol. 40, pp. 1036-1040, Sept. 1992. [15]. H. J. Li and S. H. Yang, "Using range profiles as feature vectors to identify aerospace objects, "IEEE Trans. Antennas Propagat., vol. 41, pp. 261-268, March 1993. [16]. E. J. Rothwell , K. M. Chen, D. P. Nyquist, J. E. Ross, and R. Bebermeyer, "A radar target discrimination scheme using the discrete wavelet transform for reduced data storage, "IEEE Trans. Antennas Propagat., vol. 42, no. 7, pp. 1033-1037, July 1994. [17]. H. V. Poor, "An Introduction to Signal Detection and Estimation", Springer-Verlay, New York, 1988. [18]. T. Kohonen, "Self-Organization and Associative Memory", Springer Verlog, Berlin, 1989. [19]. Bart Kosko, "Neural Networks for Signal Processing", Pretice-Hall, Inc. 1992. [20]. Robert Schalkoff, "Pattern Recognition : Statistical, Structural and Neural Approaches", John Wiley & Sons, Inc. , 1992. [21]. John J. Craig, "Introduction to Robotics -mechanics and control", Addison-Wesley, 1989. [22]. R. G. Atkins, R. T. Shin and J. A. Kong, "Chapter 7 : A Neural Network Method for High Range Resolution Target Classification", Progress in Electromagnetics Research 4, J. A. Kong, Ed., Elsevier, New York, 1991. [23]. H. J. Li and V. Chiou, "Aerospace Target Identification-Comparison between the Matching Score Approach and the Neural Network Approach", Journal of Electromagnetic Waves and Applications, Vol. 7, No. 6, 873-893, 1993. [24]. John Edwin Ross HI, "Chapter 5 Transient Measurements", PhD. Thesis : Application 301 of Transient Electromagnetic Fields to Radar Target Discrimination, 1992. [25]. Minsky, M., and Papert, S. Perceptrons : An Introduction to Computational Geometry. MIT Press, Cambridge, 1969. [26]. Widrow, B., and Hoff, ME. Adaptive Switching Circuits. 
Institute of Radio Engineers, Western Electronic Show and Convention, Convention Record, part 4, 1960, pp. 96-104.
[27]. Werbos, P. J. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. thesis, Harvard University, 1974.
[28]. Parker, D. B. Learning Logic. Invention Report, S81-64, File 1, Office of Technology Licensing, Stanford University, 1982.
[29]. Rumelhart, D. E., Hinton, G. E., and Williams, R. J. Learning Internal Representations by Error Propagation. In Rumelhart, D. E., and McClelland, J. L. (eds), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1. MIT Press, Cambridge, 1986.
[30]. J. Hertz, A. Krogh and R. G. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, 1991.
[31]. Y. Kamp and M. Hasler, "Recursive Neural Networks for Associative Memory", John Wiley & Sons Ltd., 1990.
[32]. M. H. Hassoun, "Dynamic Associative Memories," in Artificial Neural Networks and Statistical Pattern Recognition: Old and New Connections, pp. 195-218, 1991.
[33]. M. H. Hassoun, "Associative Neural Memories, Theory and Implementation", Oxford University Press, 1993.
[34]. D. Hebb, Organization of Behavior, John Wiley & Sons, N.Y., 1949.
[35]. J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proc. Nat. Acad. Sci. USA, vol. 79, pp. 2554-2558, 1982.
[36]. J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proc. Nat. Acad. Sci. USA, vol. 81, pp. 3088-3092, 1984.
[37]. McEliece, R. J., Posner, E. C., Rodemich, E. R., and Venkatesh, S. S. (1987). "The Capacity of the Hopfield Associative Memory," IEEE Trans. Info. Theory, IT-33, 461-482.
[38]. B. Kosko, "Bidirectional Associative Memories," IEEE Trans. Sys. Man Cybern., vol. SMC-18, pp. 49-60, 1988.
[39]. Y. F. Wang, J. B. Cruz, Jr. and J. H.
Mulligan, Jr., "Guaranteed Recall of All Training Pairs for Bidirectional Associative Memory," IEEE Trans. on Neural Networks, vol. 2, no. 6, pp. 559-567, 1991.
[40]. T. D. Chiueh and R. M. Goodman, "High-Capacity Exponential Associative Memory," in Proc. IEEE Int. Conf. Neural Networks (San Diego, CA), vol. 1, pp. 153-160, 1988.
[41]. T. D. Chiueh and R. M. Goodman, "Recurrent Correlation Associative Memories," IEEE Trans. on Neural Networks, vol. 2, no. 2, pp. 275-283, 1991.