

#### This is to certify that the

#### dissertation entitled

# AN INTELLIGENT MICROSENSOR FOR MONITORING ROLLING ELEMENT BEARINGS

presented by

Mark D. Chuey

has been accepted towards fulfillment of the requirements for

Ph.D. degree in Elect. Engr.

Michael Shanblatt

Date: 11/11/93

MSU is an Affirmative Action/Equal Opportunity Institution


# AN INTELLIGENT MICROSENSOR FOR MONITORING ROLLING ELEMENT BEARINGS

By

Mark D. Chuey

#### **A DISSERTATION**

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

**DOCTOR OF PHILOSOPHY** 

Department of Electrical Engineering

1993


#### **ABSTRACT**

## AN INTELLIGENT MICROSENSOR FOR MONITORING ROLLING ELEMENT BEARINGS

by

#### Mark D. Chuey

The ability to monitor rolling element bearings with an intelligent microsensor is examined by designing key elements of the device. The intelligent sensor contains a four-post suspended mass microaccelerometer to sense bearing housing acceleration vibrations. This is converted to an electrical signal through a half-active piezoresistive bridge driven by a temperature-compensated supply. The voltage output is amplified, high-pass filtered to remove any offset signal, and low-pass filtered. The result is digitized in a second-order, 256-times oversampling sigma-delta modulating analog-to-digital converter. A novel first-stage decimator, located on-chip, provides a minimum resolution of 9 bits for monitoring. The output of the decimator can be transmitted off-chip for second-stage decimation to a resolution suitable for diagnostic purposes, or it can be passed to on-chip feature extraction logic. Feature extraction is accomplished by filtering the signal in a programmable sixth-order IIR filter, then extracting the mean squared value. This value is compared to a programmable threshold value. An alarm signal is transmitted off-chip when the threshold is exceeded a programmable number of times. The sensor is to serve as one element in a plant-wide multisensor monitoring system.

A simulation of bearing housing vibrations is developed to examine monitoring parameters. The program simulates different types of point defects and random noise sources as well as the signal transmission path through the housing. Comparisons between peak value, mean squared, crest factor and kurtosis measures are made for various sampling frequencies. The effects of sampling resolution on mean squared and spectral monitoring features are also examined.

The on-chip sigma-delta modulating analog-to-digital converter employs a novel first-stage decimation filter. The sinc<sup>3</sup> FIR filter uses two coefficient generator/accumulator pairs, each alternately producing an output. The filter is shown to require less circuit area than previous designs at a cost of reduced noise rejection and antialiasing. A classification scheme for sinc<sup>3</sup> filters is also provided.


#### **ACKNOWLEDGMENTS**

No project of this size can be completed without the help and support of many people. A brief listing of some of those people is provided here.

I would first like to thank the members of my doctoral committee: Dr. P. D. Fisher, Dr. J. Hall, Dr. M. Nayeri, Dr. R. Tonda, Dr. C. L. Wey and the committee chair, Dr. M. Shanblatt.

I would also like to acknowledge the faculty of the Electrical & Computer Engineering Department at GMI Engineering and Management Institute for their support throughout the past eight years. Special thanks are extended to the department heads during this period, Fred Cribbins and Dr. Dave Leffen, for their cooperation in arranging my teaching schedule to accommodate the requirements of the doctoral program.

Many thanks to Pat Irish for producing some of the figures in Chapters 3 and 4, Dr. Jeff Abell for proofreading the technical chapters of this dissertation, Dr. Scott Burgett for his suggestions in the area of signal processing, and Gerald Vossler for invaluable strategic advice.

The support and understanding of my parents, Donald and Rosemary Chuey, my family and friends have been invaluable throughout this project. I would also like to thank Jean Furkioti for her patience.

A special thank you is extended to Dr. Richard Tonda for his help with the mechanical design of the microaccelerometer. Without Rick's assistance this project would not have been possible.


## **TABLE OF CONTENTS**

- LIST OF TABLES
- LIST OF FIGURES
- CHAPTER 1 INTRODUCTION
  - 1.1 MACHINERY MONITORING
  - 1.2 INTELLIGENT SENSOR MONITORING SYSTEM
  - 1.3 PROJECT SCOPE
  - 1.4 ASSUMPTIONS
  - 1.4.1 Knowledgeable User
  - 1.4.2 Temperature Range
  - 1.4.3 Vibration Signal Ergodicity
  - 1.5 OVERVIEW
- CHAPTER 2 BEARING VIBRATIONS
  - 2.1 BEARING LIFE AND FAILURE
  - 2.2 BEARING VIBRATIONS
  - 2.3 VIBRATION SIGNALS AND FEATURE EXTRACTION
  - 2.3.1 Transducer Type and Location
  - 2.3.2 Signal Frequency Range
  - 2.3.3 Signal Magnitude Range and Resolution
  - 2.3.4 Feature Extraction Methods
  - 2.4 BEARING VIBRATION SIMULATION
  - 2.4.1 Bearing Vibration Model
  - 2.4.2 Simulator Input

  - 2.4.3 Simulator Operation Details
  - 2.4.4 Simulator Example
  - 2.5 SIMULATION RESULTS
  - 2.5.1 Feature Extraction Simulation
  - 2.5.2 Quantization Simulation
  - 2.6 MONITOR SYSTEM SPECIFICATIONS
- CHAPTER 3 TRANSDUCER
  - 3.1 SUSPENDED MASS ACCELEROMETER THEORY
  - 3.1.1 Mass-Spring-Damper System
  - 3.1.2 Transducer Geometry
  - 3.1.3 Isotropic Beam Theory
  - 3.2 FINITE ELEMENT ANALYSIS
  - 3.2.1 Finite Element Model
  - 3.2.1.1 Element type
  - 3.2.1.2 Material properties
  - 3.2.1.3 Element size
  - 3.2.2 Static Analysis
  - 3.2.3 Eigenvalue Analysis
  - 3.2.4 Frequency Analysis
  - 3.3 PIEZORESISTOR
  - 3.3.1 Transduction Methods
  - 3.3.2 Piezoresistance
  - 3.3.3 Piezoresistor Type
  - 3.3.4 Piezoresistor Bridge
  - 3.4 PERFORMANCE CHARACTERISTICS
  - 3.4.1 Resistor Dimensional Effects
  - 3.4.2 Resistor Temperature Effects

  - 3.4.3 Beam Thickness Effects
  - 3.4.4 Nonlinearity, Hysteresis and Repeatability
  - 3.5 DEVICE CONSTRUCTION
  - 3.5.1 Beam Micromachining
  - 3.5.2 Process Compatibilities
  - 3.5.3 Packaging Benefits
  - 3.6 ELECTRONIC CONSIDERATIONS
  - 3.6.1 Temperature Compensation
  - 3.6.2 Noise Effects
  - 3.6.3 Amplification and Offset Reduction
- CHAPTER 4 OVERSAMPLING A/D CONVERTER
  - 4.1 SELECTION OF A/D CONVERTER TYPE
  - 4.2 BASIC SDM CONVERTER OPERATION
  - 4.2.1 Single-loop SDM
  - 4.2.2 Double-loop SDM
  - 4.2.3 Decimator
  - 4.2.4 Regions of Operation
  - 4.3 TRANSDUCER SIGNAL AMPLIFICATION AND OSR CALCULATION
  - 4.4 DECIMATOR DESIGN
  - 4.4.1 Sinc<sup>3</sup> Filter Classification and Performance
  - 4.4.1.1 In-band noise
  - 4.4.1.2 Antialiasing
  - 4.4.1.3 Droop
  - 4.4.2 Sinc<sup>3</sup> Filter Architecture
  - 4.4.2.1 Method I architecture
  - 4.4.2.2 Method II architecture
  - 4.4.2.3 Method III architecture

  - 4.4.2.3.1 H120 coefficients
  - 4.4.2.3.2 H120 second coefficient generator
  - 4.4.3 Filter Type and Architecture Selection
  - 4.5 ADDITIONAL ADC SYSTEM ISSUES
  - 4.5.1 ADC System Review
  - 4.5.2 Antialiasing filter
  - 4.5.3 Sigma-Delta Modulator
  - 4.5.4 System Simulation Results
- CHAPTER 5 FEATURE EXTRACTION AND DECISION LOGIC
  - 5.1 DIGITAL SYSTEM OVERVIEW
  - 5.2 DECIMATOR TIMING AND CONTROL
  - 5.3 ANTIALIASING FILTER
  - 5.4 IIR FILTER
  - 5.5 MS CALCULATION AND THRESHOLDING
  - 5.6 CONTROL UNIT
  - 5.7 SIMULATION
- CHAPTER 6 SYSTEM SIMULATION
  - 6.1 SIMULATOR COMPONENTS
  - 6.2 SIMULATION RESULTS
- CHAPTER 7 CONTRIBUTIONS AND FURTHER RESEARCH
  - 7.1 CONTRIBUTIONS
  - 7.1.1 Intelligent Microsensors and Machinery Monitoring
  - 7.1.2 Bearing Simulator and Simulation Results
  - 7.1.3 Microaccelerometer
  - 7.1.4 First-Stage Decimation Filter
  - 7.1.5 Differing Monitoring and Diagnostic Precisions
  - 7.2 FURTHER RESEARCH

- APPENDIX A BEARING MODEL PARAMETERS
- APPENDIX B
  - B.1 SUSPENDED MASS
  - B.2 ISOTROPIC BEAM THEORY
  - B.3 RAYLEIGH'S PRINCIPLE FUNDAMENTAL FREQUENCY
  - B.4 MATERIAL COEFFICIENTS
  - B.5 STRESS AVERAGE
  - B.6 PIEZORESISTANCE COEFFICIENTS
  - B.7 BEAM THICKNESS VARIATIONS
- APPENDIX C
  - C.1 DECIMATOR FILTER REGISTER WIDTHS
  - C.2 USE OF RIPPLE-CARRY ADDERS
  - C.3 ANTIALIASING FILTER CHARACTERISTICS
  - C.4 SWITCHED CAPACITOR FILTER
  - C.5 WORST-CASE ANTIALIASING
  - C.6 NOISE POWER
  - C.7 TSNR SIMULATION
- APPENDIX D DIGITAL BUTTERWORTH FILTERS
- BIBLIOGRAPHY


## **LIST OF TABLES**

- Table 1. Vibration signal ranges for different bearings
- Table 2. Actual vs. model bearing damage indexes for no defect
- Table 3. Actual vs. model bearing damage indexes for an OR defect
- Table 4. Natural frequencies and modal masses
- Table 5. Piezoresistance coefficients [49]
- Table 6. Offset sources
- Table 7. Input signal deviation factors
- Table 8. Hardware for Method I implementation
- Table 9. Method II H <sub>N</sub> 120 logic
- Table 10. Method II accumulator sizes
- Table 11. Method II hardware implementation
- Table 12. Method III H120 coefficient generation, $N_1 = 8$
- Table 13. Hardware for Method III implementation
- Table 14. Relative area figures, $A(n)$
- Table 15. SC filter characteristics
- Table 16. Programmable init register values
- Table 17. Decimator control signals
- Table 18. Antialiasing filter control signals
- Table 19. Antialiasing FIR coefficients
- Table 20. IIR filter timing
- Table 21. First multiply/accumulate operation timing

- Table 22. Accumulation cycle timing
- Table 23. MS and threshold timing
- Table 24. Effects of shft[7:4]
- Table 25. Effects of shft[3:0]
- Table 26. Maximum average MS values for band-pass filter
- Table 27. Bearing geometry parameters
- Table 28. Bearing housing resonance parameters
- Table 29. Bearing housing high-pass parameters
- Table 30. Bearing contact noise parameters
- Table 31. Data generation parameters
- Table 32. NISA material coefficients
- Table 33. <i>P</i>-type $\pi_{44}$ vs. temperature for $9 \times 10^{18}$ concentration
- Table 34. Method I register widths
- Table 35. Method II register widths
- Table 36. Band-pass filter coefficients


## LIST OF FIGURES

- Figure 1. The monitoring hierarchy [1]
- Figure 2. Intelligent monitoring system
- Figure 3. Bearing vibration monitor
- Figure 4. Bearing lifetime wear rate [13]
- Figure 5. Average bearing vibration level vs. time [10]
- Figure 6. Bearing vibration model
- Figure 7. NSK / NTN 30204 bearing section [27]
- Figure 8. Actual [27] and simulated zero-defect bearing signals and spectra
- Figure 9. Actual [27] and model defective bearing signals
- Figure 10. Defect indexes vs. frequency
- Figure 11. Comparison of normalized defect indexes
- Figure 12. Quantization effects on MS
- Figure 13. Quantization effects on spectrum
- Figure 14. Mass-spring-damper system [35]
- Figure 15. Overall transducer geometry
- Figure 16. Beam geometry
- Figure 17. Loaded beam in the four-beam accelerometer
- Figure 18. Test beam
- Figure 19. Element size effects
- Figure 20. Beam top stress distributions
- Figure 21. First six accelerometer modes
- Figure 22. Longitudinal stress frequency response

- Figure 23. Piezoresistor bridge
- Figure 24. Piezoresistor placement
- Figure 25. Bridge output for dimensional changes in R<sub>1</sub>
- Figure 26. Bridge output for dimensional changes in all resistors
- Figure 27. Dimensional effects on the span and offset
- Figure 28. $\pi_{44}$ vs. temperature
- Figure 29. Temperature effects on the bridge output
- Figure 30. Temperature effects on the span
- Figure 31. Beam thickness effects on stress and fundamental frequency
- Figure 32. Etch-stop apparatus [60]
- Figure 33. Accelerometer package cross-section [55]
- Figure 34. Temperature compensating bridge supply
- Figure 35. Temperature compensation of the span
- Figure 36. Approximate system noise effects
- Figure 37. Offset reduction and amplification
- Figure 38. Traditional and oversampling ADC models
- Figure 39. Single-loop SDM
- Figure 40. Conversion noise spectral density
- Figure 41. Effect of OSR on quantization noise
- Figure 42. Double-loop SDM
- Figure 43. Conversion of a rail-to-rail sinusoid
- Figure 44. Conceptualized TSNR vs. input power
- Figure 45. OSR effects on TSNR
- Figure 46. Signal gain profile
- Figure 47. OSR and gain determination
- Figure 48. Standard second-order SDM ADC
- Figure 49. H100, H010 and H001 spectra

- Figure 50. Six Hijk spectra
- Figure 51. In-band noise power vs. n
- Figure 52. Worst-case antialiasing
- Figure 53. Worst-case aliasing with low-pass filters
- Figure 54. Worst-case droop, N<sub>1</sub> = 64
- Figure 55. Hijk architectures
- Figure 56. Method I H120 architecture
- Figure 57. Method II H120 architecture
- Figure 58. Method III H120 architecture
- Figure 59. Dual coefficients vs. time
- Figure 60. Coefficient generator 1 circuit
- Figure 61. Improved coefficient generator 1 circuit
- Figure 62. A(n) vs. quantization noise power
- Figure 63. ADC system block diagram
- Figure 64. Antialiasing filter spectrum
- Figure 65. Continuous and SC filter frequency response
- Figure 66. Sampling rate effects on SC filter parameters
- Figure 67. Simulated frequency response
- Figure 68. TSNR vs. frequency
- Figure 69. TSNR vs. input power
- Figure 70. Digital logic block diagram
- Figure 71. Decimator logic diagram
- Figure 72. Decimator timing diagram
- Figure 73. Frequency response with and without antialiasing
- Figure 74. Spectral folding with and without the antialiasing filter
- Figure 75. Antialiasing filter logic diagram
- Figure 76. IIR filter block diagram

- Figure 77. MS calculation and thresholding unit circuit
- Figure 78. Control unit diagram
- Figure 79. Simulation model hierarchy
- Figure 80. Digital simulation output
- Figure 81. Simulation modules
- Figure 82. Comparison of feature extraction parameters
- Figure 83. Comparison of computed MS for various decimator gains
- Figure 84. Range of MS values for band-pass filter, N = 512
- Figure 85. Range of MS values for band-pass filter, N = 2048
- Figure 86. Accelerometer sensitivity vs. resonant frequency
- Figure 87. Suspended mass
- Figure 88. Clamped-sliding beam
- Figure 89. Crystallographic-to-model axes transformation
- Figure 90. Quarter beam longitudinal stress values
- Figure 91. Piezoresistance vs. dopant concentration [52]
- Figure 92. $\pi_{44}$ vs. temperature for p-type silicon [52]
- Figure 93. Beam thickness variations
- Figure 94. Sinc filter
- Figure 95. Register width simulation results
- Figure 96. Ripple-carry adder
- Figure 97. Ripple-carry adder/subtractor
- Figure 98. System components that affect frequency response
- Figure 99. Antialiasing filter parameter relationships
- Figure 100. System frequency response
- Figure 101. Second-order low-pass network
- Figure 102. Switched capacitor integrators
- Figure 103. Second-order switched capacitor filter

- Figure 104. First-stage spectral folding
- Figure 105. TSNR vs. simulation time
- Figure 106. Frequency response of BPF 1
- Figure 107. Frequency response of BPF 2


# CHAPTER 1 INTRODUCTION

The purpose of this project is to examine the feasibility of using intelligent microsensors to monitor the health of rolling element bearings. This is accomplished through the design of a device that incorporates a microaccelerometer, analog conditioning circuitry, a novel analog-to-digital converter, and logic for feature extraction, thresholding and alarm generation, all on the same substrate.

This chapter provides introductory and supporting material for a discussion of monitoring bearings using microsensors. Section 1.1 describes the general monitoring problem. Section 1.2 defines an intelligent microsensor and discusses the configuration of a system employing these devices for machinery monitoring. Section 1.3 describes the scope of this project within the general framework provided in the first sections. Section 1.4 lists the assumptions used throughout the design process. Finally, Section 1.5 presents an overview of this dissertation.

## 1.1 MACHINERY MONITORING

The purpose of monitoring is to obtain information. This information is used to influence future decisions concerning the monitored system. The amount and type of information required from monitoring depends on the kind of decisions to be made.

Machinery monitoring refers to the gathering of information required to make decisions about the operating conditions and relative health of mechanical systems. These decisions range from simple inquiries as to whether a device is operating or not to complex issues such as quantitative predictions about future operating parameters and failure possibilities. The precise definition of machinery monitoring, therefore, differs depending on the context. Figure 1 shows different levels of meaning based on the complexity of the decision under consideration. The figure is modified from Brawley [1].






Figure 1. The monitoring hierarchy [1].

Monitoring rolling element bearings involves gathering information about the condition of ball and roller bearings. Status monitoring checks for failed bearings, or those devices that have seized. This requires very simple data and little data reduction. However, on many pieces of machinery the sudden seizure of a bearing has catastrophic consequences due to the force created by the rapid deceleration of rolling masses. In addition to the cost of the bearing, there are the costs that may be incurred by the damage to other equipment and by the unscheduled equipment down time. This makes status monitoring a virtually useless technique for monitoring most bearings.

Condition monitoring determines if a bearing is performing at some level or within a set of specifications. Because of the monotonic nature in which damage is accumulated in a bearing, condition monitoring can be used to estimate the health of the device. Condition monitoring requires extracting a sufficient amount of information to obtain a relevant feature of the bearing, reducing the data to extract the feature, then comparing the feature level to a threshold value.
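The three steps named above (gather enough samples, reduce them to a feature, compare the feature to a threshold) can be illustrated with a minimal Python sketch. The mean squared value is used as the feature here; the function names, sample values and threshold are illustrative assumptions and are not taken from the design.

```python
def mean_square(samples):
    """Reduce a block of acceleration samples to one feature: the mean squared value."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(x * x for x in samples) / len(samples)


def condition_ok(samples, threshold):
    """Return True while the extracted feature stays at or below the threshold."""
    return mean_square(samples) <= threshold


# Hypothetical acceleration blocks (arbitrary units) and a hypothetical threshold.
healthy_block = [0.1, -0.2, 0.15, -0.05]
worn_block = [0.9, -1.1, 1.3, -0.8]
THRESHOLD = 0.5

print(condition_ok(healthy_block, THRESHOLD))
print(condition_ok(worn_block, THRESHOLD))
```

In practice the threshold would have to be chosen by a user who knows the machine, since an acceptable vibration level depends on the bearing and its loading.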


Performance monitoring is similar to condition monitoring in its data requirements. Performance monitoring differs in that it provides a quantitative measure of the bearing's health.

Diagnostic monitoring is used to determine the cause of any defect discovered in the bearing. This generally requires data that is more detailed than is necessary for performance monitoring. Further, research indicates that different bearing defects are best diagnosed through the use of various types of data reduced using a variety of techniques [2, 3].

Prognostic monitoring attempts to predict the future operating conditions of the bearing or, more specifically, exactly when the bearing will fail. Due to the wide variance in bearing life expectancy and the difficulty in obtaining direct information about the contacting surfaces inside the bearing, no prognostic monitoring systems are commercially available.

In addition to the level of monitoring, the frequency of monitoring is important. Continuous monitoring requires sensors permanently attached to the machine and dedicated data collection, reduction and analysis hardware. At the other extreme, handheld meters are periodically connected to machinery by maintenance personnel. The meters may perform condition monitoring on location or may record data for future analysis. A compromise technique uses permanently attached sensors connected to a central monitoring processor via multiplexed cables. This method removes the inaccuracies introduced by hand-held devices but may result in a delay in detecting deteriorating conditions if the degree of multiplexing is high.

An ideal solution would employ small processors at the sense point to perform low-level, low-resolution continuous condition monitoring. When a problem is detected and for periodic verification of operation, the sensor signal could be transmitted to a central computer for in-depth processing on high-resolution data. Intelligent sensors offer a mechanism for realizing such a monitoring system.




## 1.2 INTELLIGENT SENSOR MONITORING SYSTEM

As minimum requirements, an intelligent sensor is capable of measuring some parameter of its environment, reducing the effects of cross-parameters and transmitting information about the parameter. Additionally, it may perform signal conditioning and format conversion, extract parameter features, make decisions based on these features, and affect changes either directly or by communicating the results.

A monitoring system that employs intelligent sensors is pictured in Figure 2. Attached to the machinery being examined is a group of smart sensors. Depending on the type of machinery, the sensed parameters may include temperature, voltage, current, vibration, acoustic emissions, speed, torque, force, chemical makeup, radiation, pressure and particulate size. A group of sensors is connected to an interface unit through dedicated cables. The interface unit provides each sensor with power and clock signals as well as initialization parameters on startup.

Figure 2. Intelligent monitoring system.

Once initialized, the intelligent sensors are capable of operating autonomously, performing data gathering, reduction and condition monitoring functions. If a change in condition is detected, an alarm is transmitted to the interface unit. Additionally, each smart sensor can be placed in a transducer mode, allowing it to transmit conditioned, digitized data to the interface unit. The interface unit may collect this data, implement data reduction, perform multivariate condition monitoring or diagnostics, reinitialize the intelligent sensors with new parameters, and forward the data to the central processing computer for further analysis.

The central processing system provides advanced data manipulation, coordination between the monitoring of different machine subsystems, overall plant monitoring initialization, and report generation. The interface units are connected to the central processor via a plant-wide communication network, such as the MAP (Manufacturing Automation Protocol) network [5].

Many of the high-level components required for the intelligent sensor based monitoring system, such as the interface units and the central processor, have been implemented [6, 7]. A major set of components that have not been designed is the intelligent sensors.

In order to minimize the cost of the system, each intelligent sensor should contain common electronics for connecting to the interface unit. These electronics provide communication buffering, power supply regulation and signal separation if common wires are used to carry multiple signals. Since temperature is the major source of cross-sensitivity in silicon based microsensors [8, 9], a temperature sensing circuit should also be included.


## 1.3 PROJECT SCOPE

As a bearing ages, the contact between races and rolling elements produces localized metal fatigue. This fatigue produces small subsurface cracks that may grow into pits. The pitting of the surfaces increases contact stresses, further accelerating the accumulation of damage. This damage progresses until the bearing suffers a component failure or seizes due to excessive friction.

No technique has been developed for direct, in situ measurement of the accumulated contact surface damage. Several indirect measures have been used with varying success, including housing vibration, housing temperature, lubricant temperature, lubricant metal analysis, and acoustic emission. Of these, the most widely used is housing vibration. Three types of housing vibrations can be measured: displacement, velocity and acceleration. Acceleration is most often used due to the frequency range and cost of accelerometers. A detailed description of rolling element bearings and their monitoring is provided in Chapter 2.

This project concentrates on the design of critical elements unique to an intelligent microsensor for monitoring the vibrations of rolling element bearings. A block diagram of the sensor system is shown in Figure 3. The system is divided into two portions, signified by the bold boxes in the figure. The top portion represents electronics common to all intelligent sensors. The bottom portion is unique to the vibration monitor.

The common portion contains three subsystems: a temperature sensor, communication logic, and power and clock separation circuitry. The temperature sensor provides a voltage output used to adjust the bridge supply for cross-parameter correction. A possible scheme for connecting an intelligent sensor to the interface unit uses three wires, combining the clock and power supply on a single wire. This requires separation circuitry at the sensor. Communication logic passes initialization parameters from the interface unit to the monitor on startup and passes data from the monitor back to the interface unit during operation.




Figure 3. Bearing vibration monitor.


The sensor components unique to the bearing vibration monitor include a micromachined accelerometer incorporating a half-active piezoresistive bridge, amplification and conditioning electronics, a novel sigma-delta modulating analog-to-digital converter, programmable digital filters, feature extraction logic for calculating the mean squared value of the filtered acceleration signal, thresholding logic, and control logic. The device is capable of operating as either a smart accelerometer, transmitting compensated, digitized vibration acceleration values back to the interface unit, or as an autonomous condition monitor, comparing filtered mean squared vibration values to a programmable threshold.

## 1.4 ASSUMPTIONS

Three assumptions were used in the development of the intelligent vibration monitor. These assumptions concern the availability of a knowledgeable user, the range of operating temperatures, and the ergodicity of the bearing vibration signals.

## 1.4.1 Knowledgeable User

Although the bearing monitor is considered an intelligent device, it still requires a knowledgeable user. The user must first make decisions regarding the overall configuration of the monitor system. Once this is accomplished, the appropriate types of intelligent sensors must be chosen to match the needs of the machinery subsystem. Next, the correct placement of each sensor is critical to maximizing the sensing of the desired parameter and minimizing the effects of cross-parameters.

With regard to the intelligent vibration monitor, a knowledgeable user must determine appropriate initialization parameters for the sensor. This requires an understanding of the bearing configuration and potential failure modes as well as the limitations of the sensor. The user must also be able to interpret the results produced by the vibration monitor to determine an appropriate course of action.

In the future, it may be possible to replace the functions provided by the knowledgeable user with an expert system.

## 1.4.2 Temperature Range

The operating temperature range assumed for this device is  $20 \pm 20$  °C ( $68 \pm 36$  °F). This range is reasonable for bearings mounted on motors, fans, pumps or transmissions located in manufacturing facilities designed for human workers.

The reason for this choice for the range of temperatures is to allow for the use of a simple temperature compensation scheme in the intelligent bearing monitor. If a greater range is desired, a more complicated scheme may be employed.

## 1.4.3 Vibration Signal Ergodicity

The vibration signal over the life of the bearing is nonstationary. This is apparent since a change in the signal's statistical properties over time provides an indication of deteriorating health.

Despite the long-term nonstationary quality of the vibration signal, it is always assumed to be stationary over short intervals [10]. This is because the bearing normally deteriorates gradually over an extended period of time, often many years. This assumption allows the calculation of power spectral densities for frequency analysis.

In addition to assuming the vibration signal is locally stationary, it is common to assume the signal is locally ergodic as well. This assumption is reported to be valid due to the "correlation between machinery vibration and defects for similar pieces of equipment" [11]. Ergodicity allows signal statistics to be calculated from time averages instead of ensemble averages. Hence the major reason for the ergodic assumption is the impracticality of obtaining data from identical pieces of machinery under identical operating conditions.
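The practical meaning of the ergodic assumption can be illustrated with a small Python sketch. Stationary zero-mean Gaussian noise stands in for a short vibration record, which is an assumption made purely for illustration; the record lengths, seed and standard deviation are hypothetical. The sketch compares the mean squared value obtained from one long record (a time average) with the value obtained by sampling many independent records at a single instant (an ensemble average).

```python
import random


def mean_square(values):
    """Mean squared value of a sequence of samples."""
    return sum(v * v for v in values) / len(values)


def make_record(rng, length, sigma):
    """One hypothetical vibration record: zero-mean Gaussian noise."""
    return [rng.gauss(0.0, sigma) for _ in range(length)]


rng = random.Random(0)  # fixed seed so the run is repeatable
SIGMA = 2.0             # assumed standard deviation, arbitrary units

# Time average: one machine observed over many samples.
time_avg = mean_square(make_record(rng, 20000, SIGMA))

# Ensemble average: many "identical" machines, one sample from each.
ensemble = [make_record(rng, 1, SIGMA)[0] for _ in range(20000)]
ensemble_avg = mean_square(ensemble)

# For an ergodic process both estimates approach SIGMA**2.
print(time_avg, ensemble_avg)
```

Only the first kind of average is available to a sensor attached to a single bearing, which is why the assumption matters for on-chip feature extraction.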

## 1.5 OVERVIEW

The remainder of this dissertation is divided into seven chapters. Chapter 2 discusses the nature of bearing vibrations and presents a simulator for generating defective and defect-free housing acceleration signals. These signals are used to examine condition monitoring feature extraction methods and to determine frequency and resolution requirements for condition and diagnostic monitoring.

Chapter 3 describes the microaccelerometer. The design of a four-post suspended mass accelerometer is followed by finite element analysis results. The half-active piezoresistive bridge that transduces acceleration into a change in voltage is presented next. The chapter also includes discussions on transducer construction issues and support electronics.

A second-order sigma-delta modulating analog-to-digital converter is designed in Chapter 4. The converter incorporates a novel first-stage decimation filter requiring less chip area than conventional designs. The first-stage decimator provides sufficient quantization noise reduction for on-chip condition monitoring. Output from the first-stage may be routed off-chip for further diagnostic processing.

Chapter 5 discusses the design of the on-chip second-stage decimator, sixth-order programmable IIR digital filter, mean squared value calculation logic, thresholding logic, and control logic. A register transfer level simulation of the digital hardware is also discussed.

The results of a functional simulation of the intelligent vibration monitor are presented in Chapter 6. The simulator consists of 3 modules written in C representing the bearing vibration source, transducer and conditioning electronics, and digital logic.


Chapter 7 lists the contributions resulting from this project and presents areas for further research.

Four appendixes are provided. Appendix A lists the parameters used in the bearing simulation of Chapter 2. Appendixes B and C provide supporting material for Chapters 3 and 4 respectively. Appendix D contains the design of digital filters used in the programmable IIR section of the monitor.


# CHAPTER 2 BEARING VIBRATIONS

A bearing interfaces a rotating shaft to a stationary support or another rotating mechanism. The goal of a good bearing is to sufficiently distribute the load and minimize the friction between the two elements. Rolling element bearings consist of five basic parts: the inner race, outer race, rolling elements, cage and housing. The inner and outer races attach to the rotating shaft and the stationary support. The rolling elements fit into the races, separating the two races and providing the transfer of load between the stationary and rotating components. Rotating elements may be balls (spheres), or may be cylindrical, needle, tapered or barrel rollers. Most rolling element bearings contain a cage between the inner and outer races. The cage maintains proper spacing between the rolling elements, provides balance, and prohibits contact between the rotating elements. The housing surrounds and protects the bearing assembly. Throughout the remainder of this dissertation, the term bearings will imply rolling element bearings only.

Bearings can be classified using criteria based on their construction and use. The type of rolling element used in the bearing divides them into ball and roller bearings. Another method of classification considers the technique for guiding the shaft, yielding self-aligning bearings that allow for shaft misalignment and non self-aligning bearings that do not. A third method of classifying bearings uses the supported load direction. A radial bearing principally supports a load perpendicular to the shaft and an angular contact (thrust) bearing supports load along the axis of rotation as well as radially.

Regardless of the type, all bearings tend to exhibit similar deterioration patterns as they age and wear. This wear produces vibrations that can be detected on the outside surface of the bearing housing. A monitoring system detects these vibrations to determine the health of the bearing.

Bearing life and failure modes, and the vibrations these modes produce, are covered in Sections 2.1 and 2.2. Section 2.3 contains a discussion of the issues associated with obtaining vibration signals and the techniques for extracting salient features from these signals. Section 2.4 presents a bearing vibration simulator and the specific bearing model used throughout this dissertation. The results of the simulation are presented in Section 2.5. Section 2.6 develops a set of specifications for the monitor system based on the material presented.

## 2.1 BEARING LIFE AND FAILURE

One of the difficulties in using rolling element bearings is the large discrepancy in the effective life of identical bearings under similar loads. This is reflected in the order of magnitude difference between the  $B_{10}$  life (length of time during which 10% of all bearings will fail) and the  $B_{90}$  life. Therefore, it is important to monitor the bearing to avoid catastrophic failure and the subsequent costly unscheduled downtime, and to avoid the other extreme of replacing perfectly useful bearings [12].

The variance in expected bearing life is partially due to the number of possible failure modes and their varying effects. Under optimal conditions, a bearing will wear out due to material fatigue based on the load, speed and service time. The function describing the rate of wear for a bearing throughout its life has a bathtub shape, as shown in Figure 4. Initially, the bearing will accumulate wear rapidly during a short breaking-in period. The bearing should then experience a long period of low wear accumulation and low fatigue.

One principal failure mode begins as the bearing ages. Metal fatigue causes small cracks to form below the surface of the races and/or rolling elements. These cracks progress to the surface, breaking free small pieces of material and producing pits. Following the initial pitting, wear accumulates rapidly due to the damage caused by the freed particles and due to rolling surfaces impacting the pits.




Figure 4. Bearing lifetime wear rate [13].

This may create additional pitting on other surfaces and may enlarge the pits already present, both in depth and in surface area, further increasing the wear rate. The damage progresses until the bearing suffers a component failure or seizes due to excessive rubbing friction.

The formation of bearing surface damage and subsequent bearing failure can be accelerated due to several factors including lubrication failure, contamination, corrosion, electrical currents, excessive preloading, vibration, incorrect mounting and excessive temperatures. These factors tend to produce point or local defects [12, 14].

Under certain conditions, factors such as lubrication failure, brinelling (regularly distributed indentations in the races due to excessive vibration or shock, static loading or electric current), contamination and corrosion may cause somewhat even damage over one or more of the surfaces simultaneously [15]. This damage, classified as a distributed defect, also accelerates wear and greatly reduces bearing life.


Another set of difficulties includes distributed defects due to faulty or improperly installed bearings. These problems produce unusual loading distributed in a symmetrical pattern around the bearing. Factors producing these defects include misaligned races, eccentric races, off-size rolling elements and out-of-round components [14].

The detection of the damage and wear level suffered by a bearing has been a source of study since the invention of the bearing. Various bearing parameters, including temperature, acoustic emissions, lubricant debris and housing vibrations, have all been examined as potential sources for health monitoring. Currently, the majority of systems in commercial use are based on the analysis of housing vibrations due to the effectiveness, ease of implementation and relatively low cost of vibration-based systems.

## 2.2 BEARING VIBRATIONS

Contacts between elements in a bearing produce forces that result in vibrations. Vibrations from within the bearing are transmitted to the surface of the housing where they can be used to determine the type and extent of damage. This section examines the sources of vibrations in rolling element bearings and the characteristics of vibrations resulting from internal defects.

Vibrations from new bearings generally arise from three sources. Under optimal conditions, the main source of vibrations is the normal contact of the rotating elements with the races. Since this contact involves rolling surfaces, the vibrations produced are functions of the surface smoothness and, since these surfaces are quite smooth on new, properly lubricated bearings, the vibrations produced have the form of relatively low amplitude, zero mean Gaussian noise [12, 16, 17]. A second source of housing vibrations results from contacts in other machinery. The vibrations are transmitted to the bearing through the shaft or the mountings. These sources include fundamental and harmonic shaft rotational vibrations, gear mesh vibrations, cavitation noise from pumps, blade noise from turbines, and impacts from reciprocating equipment. A third source of vibrations results from improperly installed or defective bearings, with the type of vibration dependent on the specific problem.

As the bearing ages, the average level of vibration increases, as shown in Figure 5. Throughout most of the bearing's life, the vibration level growth is gradual. At some point, however, the bearing experiences a defect that accelerates the rate of increase. As it continues to deteriorate, the vibration level climbs until failure occurs.

The basic characteristics of these vibrations change, as does their magnitude, depending on the failure mode, loading and bearing usage. One method of examining the characteristics is to consider the magnitude frequency spectrum of the bearing vibrations. The magnitude at 0 Hz must be zero if the bearing is stationary. The range from above 0 to 1 kHz contains the repetition frequencies corresponding to defective components contacting other elements and most vibrations from connected machinery. The range from approximately 500 Hz to 20 kHz shows the random vibrations produced by the roughness in the contacting surfaces and harmonics of the defect frequencies. The range above approximately 10 kHz includes the vibrations resulting from excitation of the resonances in the bearing [12]. These ranges vary depending on bearing type and running conditions.



Figure 5. Average bearing vibration level vs. time [10].


The defect repetition rates for point defects on rolling elements or races can be calculated if the bearing geometry and shaft speed are known. If the defect is on the outer race or inner race, each rolling element will strike the defect once per cage revolution. The frequencies of the sum of the impacts due to all rolling elements are known as the ball pass frequency of the outer race $(f_o)$ and the ball pass frequency of the inner race $(f_i)$. If the defect is on a rolling element, it contacts both the inner and outer races alternately. The frequency of contact with one of the races (outside) is referred to as the ball spin frequency $(f_b)$. Below are the formulas for calculating the point defect repetition frequencies [13, 18].

$$f_{i} = \frac{n}{2} f_{r} \left( 1 + \frac{D_{b}}{D_{p}} \cos \phi \right),$$

$$f_{o} = \frac{n}{2} f_{r} \left( 1 - \frac{D_{b}}{D_{p}} \cos \phi \right),$$

$$f_{b} = \frac{D_{p}}{2D_{b}} f_{r} \left( 1 - \frac{D_{b}^{2}}{D_{p}^{2}} \cos^{2} \phi \right),$$

$$f_{t} = \frac{1}{2} f_{r} \left( 1 - \frac{D_{b}}{D_{p}} \cos \phi \right),$$

where

 $f_r$  = rotating unit frequency (shaft speed) (Hz),

 $f_i$  = inner race element pass frequency (Hz),

 $f_o$  = outer race element pass frequency (Hz),

 $f_b$  = rotating element spin frequency (Hz),

 $f_t$  = fundamental train frequency (Hz),

n = number of rotating elements,

 $D_b$  = rotating element diameter (mm),

 $D_p$  = pitch diameter (mm),

 $\phi$  = contact angle.
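As a worked example of the race and train frequency formulas, the following Python sketch evaluates them for a hypothetical bearing. The function name, the geometry (9 rolling elements, 8 mm element diameter, 40 mm pitch diameter, zero contact angle) and the 30 Hz shaft speed are assumptions chosen only to make the arithmetic easy to follow; the ball spin frequency is omitted for brevity.

```python
import math


def defect_frequencies(f_r, n, d_b, d_p, phi_deg=0.0):
    """Return the train, outer race and inner race frequencies in Hz.

    f_r: shaft speed (Hz); n: number of rolling elements;
    d_b: rolling element diameter; d_p: pitch diameter (same units as d_b);
    phi_deg: contact angle in degrees.
    """
    ratio = (d_b / d_p) * math.cos(math.radians(phi_deg))
    f_t = 0.5 * f_r * (1.0 - ratio)      # fundamental train frequency
    f_o = 0.5 * n * f_r * (1.0 - ratio)  # outer race element pass frequency
    f_i = 0.5 * n * f_r * (1.0 + ratio)  # inner race element pass frequency
    return {"f_t": f_t, "f_o": f_o, "f_i": f_i}


# Hypothetical bearing and shaft speed, for illustration only.
freqs = defect_frequencies(f_r=30.0, n=9, d_b=8.0, d_p=40.0, phi_deg=0.0)
for name, value in freqs.items():
    print(name, round(value, 3))
```

Two identities that follow directly from the formulas give a quick sanity check on any implementation: the inner and outer race frequencies sum to $n f_r$, and the outer race frequency equals $n$ times the fundamental train frequency.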




The fundamental train frequency is the speed at which the rotating elements and the cage revolve around the axis of rotation.

The magnitude and shape of the vibrations produced by the point defect are a function of the bearing geometry, loading, and defect location and size. When a point defect contacts another surface, the resulting shock can be approximated as an impulse. The detected vibration is dependent on the transmission path from the defect location to the housing surface. A good approximation of the resulting acceleration is an exponentially damped sinusoid [13].

If the defect occurs on the outer race, each rolling element impact will cause a decaying pulse of similar magnitude. If the defect is on the inner race, it may rotate through regions of varying load. This produces a series of decaying pulses, each with a period the inverse of the inner race pass frequency and with the magnitude modulated by the fundamental train frequency. If the defect is on a rolling element, it will alternately strike the inner and outer races, passing the inner race induced vibrations at a weaker level due to the attenuation caused by the signal passing through the rolling elements. Since the defective element is rotating through areas of different loading, the magnitudes will be modulated by the fundamental train frequency. If the rolling elements are balls, there is the additional possibility of having the defect spin partially or completely out of the race groove, causing the vibrations to intermittently fade or disappear.
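The pulse trains described above can be sketched in Python as a sum of exponentially damped sinusoids, one per impact. This is a minimal illustration and not the simulator developed later in this chapter: the resonance frequency, decay rate, sample rate, defect frequency and modulation depth are all hypothetical values, and the optional `f_mod` parameter simply stands in for the load-dependent amplitude modulation.

```python
import math


def damped_sinusoid(t, amplitude, decay, f_res):
    """Exponentially damped sinusoid that starts at t = 0 (zero before impact)."""
    if t < 0.0:
        return 0.0
    return amplitude * math.exp(-decay * t) * math.sin(2.0 * math.pi * f_res * t)


def defect_signal(duration, fs, f_defect, f_res, decay, f_mod=None, depth=0.0):
    """Sum of damped sinusoids repeating at f_defect, sampled at fs.

    If f_mod is given, the amplitude of each impact is modulated at that
    frequency with the given depth; otherwise all impacts are equal.
    """
    n_samples = int(duration * fs)
    period = 1.0 / f_defect
    n_impacts = int(duration * f_defect) + 1
    signal = []
    for k in range(n_samples):
        t = k / fs
        total = 0.0
        for m in range(n_impacts):
            t0 = m * period
            if t0 > t:
                break  # later impacts have not happened yet
            amp = 1.0
            if f_mod is not None:
                amp = 1.0 + depth * math.cos(2.0 * math.pi * f_mod * t0)
            total += damped_sinusoid(t - t0, amp, decay, f_res)
        signal.append(total)
    return signal


# Equal-magnitude pulses (outer race case) and modulated pulses.
outer = defect_signal(0.05, 20000.0, 100.0, 3000.0, 800.0)
modulated = defect_signal(0.05, 20000.0, 100.0, 3000.0, 800.0, f_mod=12.0, depth=0.5)
print(len(outer), len(modulated))
```

With equal amplitudes every pulse looks alike, while the modulated case produces pulses whose peak values rise and fall at the modulation frequency.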

## 2.3 VIBRATION SIGNALS AND FEATURE EXTRACTION

Due to the complex and random nature of bearing vibration signals, descriptive features are extracted for use in monitoring and diagnostics. Several issues are associated with obtaining a signal and extracting the salient features, including the type and location of the transducer, frequency range of the transduced signal, magnitude range and signal resolution, and the type and amount of data reduction to obtain an indication of the bearing health.

### 2.3.1 Transducer Type and Location

Three types of vibration signals have been used to monitor bearings: displacement, velocity, and acceleration. The selection is primarily dependent on the frequency range desired, with displacement showing the greatest sensitivity at low frequencies and acceleration at high frequencies.

Industrial displacement transducers are useful to about 1 kHz, below the minimum required for general bearing monitoring. They find use on large, slow machines where the housing mass is significantly greater than the rotating mass, resulting in vibrations at the housing surface that may not be representative of the internal condition [19].

Industrial velocity transducers tend to be bulky, to be sensitive to cross-axis vibrations and magnetic interference, and to have a useful frequency range up to 2 kHz [19]. Velocity transducers have been successfully used to monitor bearings despite their limited frequency ranges. They are reported to have a flatter frequency response to a sinusoidal contact force than displacement transducers, which accentuate lower frequencies, and acceleration sensors, which accentuate higher frequencies [20].

Industrial acceleration transducers have good insensitivity to cross-parameters and a frequency range from 0 to above 10 kHz [19]. Acceleration appears to be by far the most prevalent transduction method in the field, and has been selected as the method of choice in a number of research articles that compare velocity and acceleration [12, 21, 22].

The location of the transducer is important in detecting defects. One of the limitations required by most bearing designs is that the transducer must be non-intrusive. The transducer should be located on the housing surface at a point corresponding to the area of highest loading inside the bearing [20].


### 2.3.2 Signal Frequency Range

The optimum range of frequencies for bearing vibration acceleration signals depends on the type of bearing, application and feature extraction method used. Two basic ranges have been identified: low frequency monitoring and high frequency monitoring.

Low frequency monitoring extends up to between 2 and 10 kHz [23]. This range is designed to encompass the defect repetition frequencies and their significant harmonics. Various reports indicate that this range is suitable for extracting salient information about the health of a bearing [17, 24, 25].

High frequency monitoring extends to 20 kHz and beyond. This range includes the vibrations that arise when housing resonances are excited by point defect impacts. Several studies report greater success in the high frequency range, particularly if there is significant noise from external sources [19, 21, 22].

The currently available research is inconclusive as to the best range of frequencies to use in bearing monitoring. If the bearing is operating in a relatively quiet environment or filtering is used to remove extraneous signals, the low frequency range is appropriate. This also fits within the present technological constraints on microaccelerometer frequency ranges.

### 2.3.3 Signal Magnitude Range and Resolution

Xistris et al. claim that vibration monitors should ideally operate in the range of 0.001 g to 1000 g measured to 10 kHz [23]. This would require 20 bits of resolution, an amount beyond the capability of most industrial-grade analog-to-digital converters. In their own work, 12-bit accuracy was used [23].

A survey of research was conducted to obtain a range for RMS and peak acceleration values. Table 1 presents the starting and maximum values for different bearings under different conditions.

Table 1. Vibration signal ranges for different bearings.

| Bearing Load/ | Frequency Bearing | Frequency Bearing  | Peak (g) Frequency Bearing    | Frequency Bearing             |
|---------------|-------------------|--------------------|-------------------------------|-------------------------------|
| Type          | Kange (Kriz)      | aximum Kange (KTZ) | No defect Maximum Kange (KTZ) | No defect Maximum Kange (KTZ) |
| double row    | 0 - 20 double row | 500 0 - 20         | 0 - 20                        | 500 0 - 20                    |
| ball          | ball              | ball               | ball                          | ball                          |
| double row    | 0 - 20            | 50 - 500 0 - 20    | 0 - 20                        | 50 - 500 0 - 20               |
| ball          | ball              | ball               | ball                          | ball                          |
| tapered       | 0 - 20            | 28 - 69 0 - 20     | 1-2 28-69 0-20                | 28 - 69 0 - 20                |
| roller        | roller            | roller             | roller                        | roller                        |
| tapered       | 0 - 5 tapered     | 0-5                |                               | 0-5                           |
| roller        | roller            |                    | 0.63 roller                   |                               |
| 1100 - 2000   | 2.5 - 5           |                    |                               |                               |
|               |                   |                    |                               |                               |
| tapered       | 3 - 5 tapered     |                    |                               |                               |
| roller        | roller            | roller             | roller                        | roller                        |
| tapered       | 3 - 5 tapered     | 3.0 3-5            | 3-5                           | 3.0 3-5                       |
| roller        | roller            | roller             | roller                        | roller                        |
|               |                   |                    |                               |                               |


Two facts can be obtained from an examination of the table. First, there is a large variation between the no defect and maximum values for different bearings under different test conditions. This implies that it is highly unlikely that any one type of device will satisfy the needs for all monitoring situations.

The second item concerns the magnitude range required for a given bearing signal. Amongst the RMS values, there is no more than a 31 dB variation between no defect and maximum for any one type of bearing and test. This indicates that 6 bits are required for the RMS values (11 bits for mean squared values). The spread of peak amplitudes varies from 24 to 40 dB. Assuming the signal is symmetric, between 5 and 8 bits are necessary to cover this range.
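The bit counts quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the decibel figures are treated as amplitude ratios (the 20 log10 convention), and the "symmetric signal" assumption is read as one additional sign bit. Under those assumptions it reproduces the 20-bit figure for a 0.001 g to 1000 g range as well as the 6-bit, 11-bit and 5-to-8-bit estimates.

```python
import math


def bits_for_ratio(ratio):
    """Smallest bit count whose 2**bits levels span a max/min amplitude ratio."""
    return math.ceil(math.log2(ratio))


def bits_for_db(range_db, signed=False, squared=False):
    """Bits needed to span an amplitude range given in dB (20*log10 convention).

    signed=True adds one sign bit (assumed reading of "symmetric signal").
    squared=True sizes the word for the squared quantity (mean squared value).
    """
    ratio = 10.0 ** (range_db / 20.0)
    if squared:
        ratio = ratio ** 2
    bits = math.ceil(math.log2(ratio))
    return bits + 1 if signed else bits


print(bits_for_ratio(1000.0 / 0.001))      # 0.001 g to 1000 g
print(bits_for_db(31.0))                   # RMS spread of 31 dB
print(bits_for_db(31.0, squared=True))     # same spread for mean squared values
print(bits_for_db(24.0, signed=True), bits_for_db(40.0, signed=True))
```

These counts cover only the span between the no-defect and maximum levels; they say nothing about the resolution needed within that span.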

The range between no defect and maximum levels does not consider the resolution required to represent the signal. Unfortunately, this area is not covered in the literature. Appropriate quantization levels must be obtained through experimentation or simulation.

### 2.3.4 Feature Extraction Methods

Many techniques have been developed for monitoring the health of bearings. These include the areas of spectral analysis, statistical analysis, low frequency time-domain single-value indexes, high frequency time-domain single-value indexes, frequency-domain trending, and time-domain trending. Several reports that compare different methods indicate that more than one method may be required to detect impending failures [12, 15, 16].

Probably the most prevalent technique in industry is power spectrum windowing and trending. The person implementing this technique sets maximum limits over ranges of frequencies. An alarm is raised when a magnitude value at a frequency within a specified range exceeds the "window" threshold value. The maximum level within each window over a set period of time may be saved for trending analysis [15, 26].

Spectral techniques yield very good sensitivity to changes in the bearing's health, but possess a high degree of computational complexity and may produce very little data reduction. These disadvantages eliminate spectral methods from consideration for use in near-future smart sensor monitors due to the silicon area required to perform the computations and to store and process the large amounts of data produced.

In contrast, time-domain statistical indexes have simple algorithms and produce single-valued outputs for a given time range, making them better suited to a self-contained sensor. Their main drawback is that they are generally not as accurate at detecting defective bearings. Four techniques were considered for use in the monitor: peak, variance, crest and kurtosis values.

The peak value,  $A_p$ , is the maximum vibration acceleration value over the given time range. Expressed mathematically,

$$A_{p} = \max_{n} (a_{i}),$$

where  $a_i$  is the i<sup>th</sup> acceleration value over n samples. The peak value is the easiest to calculate, and has been studied extensively [4, 15, 16, 22, 26, 28]. Comprehensive limits on peak levels have been determined for use as guides when examining bearings. One set, developed at the Southwest Research Institute, partitions possible bearing conditions into five quality grades [28]:

$$\begin{array}{ll}
A_p < 0.9\ g & \text{No fault} \\
0.9 < A_p < 1.8\ g & \text{Acceptable} \\
1.8 < A_p < 3.6\ g & \text{Marginal} \\
3.6 < A_p < 7.2\ g & \text{Failure probable} \\
7.2 < A_p & \text{Danger of immediate failure.}
\end{array}$$

The precise acceleration levels for each grade depend on the type of bearing being monitored, equipment mounting, vibration frequencies and monitoring equipment. Corrective factors must be used to compensate for variations in bearing construction, usage and monitoring methodology [28]. Since the peak value is only one of the data points in a set, $A_p$ is arguably the worst representative feature of the vibration signal. It performs best under artificially introduced single-point defects [16]. The peak value is sensitive to the load and speed of the monitored bearing, and may not increase monotonically as the bearing ages.
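As an illustration of how the five quality grades might be applied, the following sketch maps a peak acceleration in g to a grade label. The function and table names are hypothetical. The published limits use strict inequalities on both sides, so assigning a value that falls exactly on a limit to the higher grade is an assumption made here, and no corrective factors are applied.

```python
# Upper limits in g for the first four grades, taken from the limits listed above.
GRADE_LIMITS_G = [
    (0.9, "No fault"),
    (1.8, "Acceptable"),
    (3.6, "Marginal"),
    (7.2, "Failure probable"),
]


def peak_grade(a_p_g):
    """Return the quality grade for a peak acceleration a_p_g given in g.

    Assumption: a value exactly on a limit is placed in the higher grade.
    """
    for upper_limit, label in GRADE_LIMITS_G:
        if a_p_g < upper_limit:
            return label
    return "Danger of immediate failure"


print(peak_grade(2.0))
```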

The variance,  $\sigma_a^2$ , is the second central moment of the acceleration. It can be found as the mean square minus the square of the mean, or

$$\sigma_a^2 = \overline{(\mathbf{a} - \overline{\mathbf{a}})^2} = \overline{\mathbf{a}^2} - (\overline{\mathbf{a}})^2,$$

where $\bar{a}$ is the mean of $a$. Since the bearing housing is fixed, it has zero average acceleration, eliminating the need to calculate the mean. This leaves only the first term in the second equality, the mean squared (MS) value, a measure of the signal power. The variance can then be calculated as the MS value,

$$MS = \frac{1}{n} \sum_{n} a_i^2.$$

The MS value is the square of the commonly used root mean square (RMS) value. It appears to be the most popular low-frequency single-value index [4, 12, 15, 16, 17, 21, 26, 27]. Of the four indexes considered, MS is the most monotonic with respect to increasing defect magnitude, with often a slight drop in value just prior to failure [4, 12, 16, 17]. Like the peak value, MS is sensitive to variations in the load and speed.

The kurtosis value, or coefficient of kurtosis, K, is the unitless ratio of the fourth central moment of a to the square of the variance,

$$K = \frac{\overline{(\mathbf{a} - \overline{\mathbf{a}})^4}}{\sigma_a^4}$$

Again, if the mean is zero this expression can be simplified as

$$K = \frac{n\sum_{n}a_{i}^{4}}{\left(\sum_{n}a_{i}^{2}\right)^{2}}.$$

The kurtosis value indicates the "peakedness" of the signal. A normally distributed random variable will have a kurtosis value of three. Higher values result from signals containing peaks. A great deal of laboratory and field work has been done on kurtosis with mixed results [12, 15, 16, 17, 27, 29, 30]. When a point defect is intentionally introduced, the kurtosis value can be quite large. Comparing the kurtosis values in different frequency bands may yield trending information. As the bearing defect increases, the kurtosis values appear to increase then decrease, with higher frequency bands increasing later in the damage progression [12, 17]. One major advantage of kurtosis is its relative insensitivity to load and speed variations. However, the kurtosis value may never increase in some defect modes, such as lubrication failure, and, even when the kurtosis value trends upward with an increasing defect, the value undergoes wild fluctuations between individual measurements [16, 30]. The product of the RMS and kurtosis values has also been examined with some success [16].

The crest value, C, is a unitless measure defined as the ratio of the peak value to the standard deviation, or

$$C = \frac{A_p}{\sigma_a}$$

Since the mean is zero, the standard deviation reduces to the RMS value. The theory behind the crest value is similar to that of kurtosis. By examining the ratio of the peak to the RMS, an indication of the peakedness of the signal can be obtained. Like kurtosis, the crest value is relatively insensitive to speed and load and, also like kurtosis, it sometimes does not indicate a significant change before failure [15, 16, 27, 30].
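The four indexes can be computed together in a single pass over a block of samples. The sketch below follows the zero-mean forms of the expressions given in this section; the function name and the returned dictionary layout are hypothetical, and the peak is taken as the maximum sample value exactly as the expression for $A_p$ is written (a monitor might instead use the maximum magnitude).

```python
import math


def vibration_indexes(a):
    """Compute peak, mean squared, kurtosis and crest values for samples a.

    Assumes the samples have zero mean, so central moments reduce to raw
    moments, and that at least one sample is nonzero.
    """
    n = len(a)
    peak = max(a)                        # A_p = max(a_i)
    sum_sq = sum(x * x for x in a)       # sum of a_i^2
    sum_4th = sum(x ** 4 for x in a)     # sum of a_i^4
    ms = sum_sq / n                      # MS = (1/n) * sum a_i^2
    kurtosis = n * sum_4th / sum_sq ** 2 # K = n * sum a_i^4 / (sum a_i^2)^2
    crest = peak / math.sqrt(ms)         # C = A_p / RMS
    return {"peak": peak, "ms": ms, "kurtosis": kurtosis, "crest": crest}


# Two isolated spikes in an otherwise quiet record raise both K and C.
print(vibration_indexes([0.0, 0.0, 2.0, 0.0, 0.0, -2.0, 0.0, 0.0]))
```

Only running sums of the squares and fourth powers and a running maximum are needed, which is what makes these indexes attractive for on-chip extraction.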


## 2.4 BEARING VIBRATION SIMULATION

One of the tools required for designing the vibration monitor is a source of vibration signals representing various stages of bearing deterioration and testing conditions. Traditionally, a test fixture is constructed and actual bearings are run to failure to provide the necessary signals. For this project, the test method was avoided for several reasons. First, this technique requires expensive test equipment and skilled personnel. Second, the test fixture and measuring equipment often introduce errors. Finally, it is often difficult to control the variance in testing parameters and defect levels.

An alternate method is to synthesize the necessary bearing vibration signals with a computer simulation. Such a simulator was written in C as part of this project because no suitable existing software could be located. The simulator provides signals for two purposes: comparing various feature extraction methods and simulating the operation of the vibration monitor. The output of the simulator is a set of data points representing the vibration acceleration at the surface of the bearing housing and a summary display of the data points including statistics, defect frequencies, a portion of the time signal and an optional periodogram magnitude spectrum. The remainder of this section discusses the model that forms the core of the simulator, describes the simulator input and operation, and provides the bearing example used during this project.

### 2.4.1 Bearing Vibration Model

There are three theoretical issues related to the development of a vibration model: the low-level background noise signal, the defect source signals, and the transmission path between the signal generation site and the sense point.

The acceleration vibrations detected at the housing surface of a healthy bearing are described as being "noise-like and low in level" [31]. Experimental evidence shows these vibrations to have a zero-mean Gaussian distribution [12, 16, 17] and, hence, the mean squared level of the noise is the variance. The measured level of the noise is dependent on the bearing type, application, measurement frequency range and operating conditions.

The defect source signal due to a point defect can be simulated by a periodic Kronecker delta impulse train [13]. One of the factors determining the repetition rate is the defect location: on the outer race, on the inner race, or on a rolling element. The magnitude of the impulse is affected by the severity of the defect as well as the loading at the defect location. As the defect and/or loading increases, the magnitude of the impulse also increases. Experimental evidence shows that a corresponding growth in the high-frequency background noise level may also occur [17, 27].

The transmission path between the defect generation site and the sensing point affects the signal in several ways. One effect is to excite resonant modes in the housing, adding significant components to the vibration signal. The modes may be represented by underdamped second-order systems, with parameters experimentally determined [13]. A second effect of the transmission path is to attenuate the signal. The amount of attenuation is frequency dependent and may be periodically time dependent if the generation point is moving with respect to the sense point.

A model that represents these effects is presented in Figure 6. The Gaussian background noise source is summed with any defect sources present. The resulting signal is passed through a set of parallel second-order underdamped low-pass and/or band-pass filters representing the significant resonant modes of the bearing. The outputs are summed and high-pass filtered. The high-pass filter represents the housing mounting base support and guarantees that the average value of the signal will be zero, a required condition if the bearing is stationary (i.e., no net acceleration).

### 2.4.2 Simulator Input

The program begins by requesting parameters describing the bearing, fault conditions and simulation environment. The bearing parameters can be entered by the user and stored in a file or can be loaded from a previously saved file.



Figure 6. Bearing vibration model.

Bearing variables include housing resonant mode information, bearing geometry and background noise parameters. The number of housing modes is a variable. For each mode, the type of second-order response (low-pass or band-pass), the resonant frequency, damping ratio and gain are entered. Also required are the cutoff frequency and gain of the first-order base support high-pass filter. Bearing geometry data includes the unitless pitch diameter and ball diameter, since only their ratio is required by the program, the number of rolling elements, pitch angle and thrust factor.

Three types of background contact noise are available: none, constant and proportional. Expressions determining the noise variance  $(\sigma_n^2)$  are

$$\sigma_{n}^{2} = \begin{cases} 0 & \text{none} \\ GMAG & \text{constant} \\ GMAG \times (1 + nprop \times Mag[0]) & \text{proportional,} \end{cases}$$

where GMAG = noise base level,

nprop = noise constant of proportionality,

Mag[0] = magnitude of the first defect.

The proportional noise option models the observed increase of background noise as the bearing deteriorates.
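The three noise options amount to a small selection function. The sketch below is written in Python for brevity and is not the original C code; the function name and the string mode labels are hypothetical, while `gmag`, `nprop` and `mag0` correspond to GMAG, nprop and Mag[0] in the expression above.

```python
def noise_variance(mode, gmag, nprop=0.0, mag0=0.0):
    """Return the background noise variance for the selected noise option.

    mode  : "none", "constant" or "proportional" (labels assumed here)
    gmag  : noise base level (GMAG)
    nprop : noise constant of proportionality
    mag0  : relative magnitude of the first defect (Mag[0])
    """
    if mode == "none":
        return 0.0
    if mode == "constant":
        return gmag
    if mode == "proportional":
        return gmag * (1.0 + nprop * mag0)
    raise ValueError("unknown noise mode: " + repr(mode))


# With GMAG = 2.93e-5, nprop = 300 and a first defect magnitude of 0.0022,
# the variance grows by a factor of 1 + 300 * 0.0022 = 1.66.
print(noise_variance("proportional", 2.93e-5, nprop=300.0, mag0=0.0022))
```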

Up to ten point defects can be introduced into the modeled bearing. Any of the three contact surfaces, outer race, inner race or rolling element, may be selected for each defect. For the race defects, an angle entry locates a defect relative to other defects. Rolling element defects are specified by a ball number and the ratio between outer race and inner race transmission strengths. A relative magnitude for each defect is also entered. Finally, the shaft speed is provided for calculating defect contact intervals.

The program allows for setting the simulation environmental parameters pertaining to sampling and output. The sampling frequency (inverse of the period between generated data points), number of samples and subsampling frequency are programmable. The subsampling frequency determines the sampling rate for summary statistics and the spectrum, and must be no greater than one-third the sampling rate. A fifth-order Butterworth filter with a cutoff frequency of one-third the subsampling frequency is used to reduce aliasing. The subsampling has no effect on the generated data points.

Program menu options allow the user to save the output data to a file, generate a periodogram spectrum, and print summary data.

### 2.4.3 Simulator Operation Details

The simulator is time driven, with each pass through the simulation loop representing one sampling period. If the background noise option is used, a Gaussian noise value is generated at the start of each pass. Each of the defect generators is then checked. When sufficient time has elapsed to trigger the defect, the impulse magnitude is added to the signal and the next defect trigger time is calculated by adding the current time to the repetition period. The signal is next passed, in parallel, through the second-order housing mode filters, summed, then high-pass filtered. If the data is to be stored, it is written to a file. The data is then subsampled, statistics are taken and an FFT is performed. The remainder of this section presents some of the pertinent simulator details.
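The defect trigger step of the loop can be sketched as follows. This is a Python illustration of the rule described above (fire when enough time has elapsed, then schedule the next trigger one repetition period after the current time), not the original C code. Time is counted in sample periods to keep the arithmetic exact, and the function name, argument names and the choice of firing the first impulse at time zero are assumptions.

```python
def defect_impulse_train(n_samples, fs_hz, f_rep_hz, magnitude, first_trigger=0.0):
    """Generate a periodic impulse train for one defect generator.

    Each loop pass represents one sampling period. Time is measured in
    sample periods, so the repetition period is fs_hz / f_rep_hz samples.
    """
    period_samples = fs_hz / f_rep_hz
    next_trigger = first_trigger
    signal = [0.0] * n_samples
    for i in range(n_samples):
        if i >= next_trigger:
            signal[i] += magnitude               # add the impulse to the signal
            next_trigger = i + period_samples    # current time + repetition period
    return signal


train = defect_impulse_train(100, fs_hz=1000.0, f_rep_hz=100.0, magnitude=0.5)
print([i for i, v in enumerate(train) if v != 0.0])
```

Because the next trigger is scheduled from the current sample time, the spacing between impulses is always a whole number of samples. At a 2.56 MHz sampling rate and a repetition frequency near 207 Hz the period is roughly 12,367 samples, so the rounding is negligible.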

A Gaussian noise value is generated from twelve uniformly distributed random variables using the Convolution Method [33]. If  $\mathbf{R}_i$  are independent, uniformly distributed random variables over [0,1], then the Gaussian random variables  $\mathbf{Z}$  and  $\mathbf{Y}$  can be formed as

$$N(0, 1): \qquad \mathbf{Z} = \sum_{i=1}^{12} \mathbf{R}_i - 6,$$

$$N(m_y, \sigma_y): \qquad \mathbf{Y} = m_y + \sigma_y \cdot \mathbf{Z}.$$
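A minimal sketch of this generator is given below in Python rather than the simulator's C; the function name is hypothetical and the standard library generator stands in for whatever uniform source the simulator used. Each uniform variable on [0, 1] has mean 1/2 and variance 1/12, so the sum of twelve has mean 6 and unit variance, which is why subtracting 6 yields an approximately standard normal value.

```python
import random


def gaussian_sum12(rng, mean=0.0, sigma=1.0):
    """Approximate a Gaussian sample from twelve uniform [0, 1) values.

    rng is a random.Random instance; mean and sigma correspond to m_y and
    sigma_y in the expressions above.
    """
    z = sum(rng.random() for _ in range(12)) - 6.0   # Z ~ N(0, 1), approximately
    return mean + sigma * z                          # Y = m_y + sigma_y * Z


rng = random.Random(1)
samples = [gaussian_sum12(rng) for _ in range(20000)]
m = sum(samples) / len(samples)
v = sum((s - m) ** 2 for s in samples) / len(samples)
print(m, v)
```

One consequence of the method is that the generated values can never exceed six standard deviations from the mean, so the extreme tails of the distribution are truncated.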


The housing mode filters are constructed by an impulse invariant conversion of the second-order underdamped transfer function. This method most closely represents the response due to the impulse-like point defects. The high-pass filter and the antialiasing filter are developed using the bilinear transformation. A design example of the bilinear transform is included in Appendix D.
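For a single low-pass housing mode, the impulse invariant conversion can be sketched as follows. The continuous-time prototype is assumed to be $H(s) = G\,\omega_n^2 / (s^2 + 2\zeta\omega_n s + \omega_n^2)$, whose impulse response is a damped sinusoid; sampling that response and scaling by the sampling period gives the difference-equation coefficients. The function names, the scaling by the sampling period and the damping ratio used in the example are assumptions made for illustration, and the band-pass case is not covered.

```python
import math


def impulse_invariant_lowpass(f0_hz, zeta, gain, fs_hz):
    """Return (b1, a1, a2) for y[n] = b1*x[n-1] - a1*y[n-1] - a2*y[n-2].

    Derived from h(t) = amp * exp(-sigma*t) * sin(wd*t), sampled at t = n*T
    and scaled by T (a common impulse invariance convention, assumed here).
    """
    wn = 2.0 * math.pi * f0_hz
    t_s = 1.0 / fs_hz
    sigma = zeta * wn                            # decay rate
    wd = wn * math.sqrt(1.0 - zeta ** 2)         # damped natural frequency
    amp = gain * wn / math.sqrt(1.0 - zeta ** 2)
    r = math.exp(-sigma * t_s)                   # pole radius in the z-plane
    b1 = t_s * amp * r * math.sin(wd * t_s)
    a1 = -2.0 * r * math.cos(wd * t_s)
    a2 = r * r
    return b1, a1, a2


def impulse_response(coeffs, n_samples):
    """Run the difference equation with a unit impulse at n = 0."""
    b1, a1, a2 = coeffs
    x_prev = y1 = y2 = 0.0
    out = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0
        y = b1 * x_prev - a1 * y1 - a2 * y2
        out.append(y)
        x_prev, y2, y1 = x, y1, y
    return out


# 2.2 kHz mode sampled at 2.56 MHz; the damping ratio 0.05 is a made-up value.
h = impulse_response(impulse_invariant_lowpass(2200.0, 0.05, 1.0, 2.56e6), 200)
print(h[:3])
```

By construction, the discrete impulse response coincides with the sampled continuous one, which is the property that makes this conversion a natural fit for impulse-like defect excitation.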

Before the beginning of the simulation loop, the loop count is increased to include an additional 0.1 seconds of simulation time, based on the sampling frequency. Data corresponding to the first 0.1 seconds of operation is neither saved nor used to calculate summary information, which allows the filters to initialize.

If the spectrum option is selected for inclusion in the display, the power spectral density is estimated by periodogram averaging [33]. Once initialization has occurred, an FFT is performed on each successive set of 1024 samples producing the complex spectrum coefficients A(k), for

$$f_{k} = \frac{k f_{sub}}{1024}, \qquad k = 0, 1, 2...1023,$$

where  $f_{sub}$  is the subsampling frequency. The periodogram average,  $\bar{I}(f_k)$ , calculated at the end of the simulation, is

$$\bar{I}(f_k) = \frac{1}{1024 L} \sum_{i=0}^{L-1} |A_i(k)|^2,$$

where L, the number of periodograms averaged, is determined by the number of data points generated and the ratio of the subsampling frequency to the sampling frequency. The first 512 coefficients of  $\bar{I}(f_k)$  are displayed as the spectrum.
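The periodogram average can be sketched with a small radix-2 FFT. The code below is a Python illustration using only the standard library, not the simulator's own routine; the function names are hypothetical, the segment length is a parameter (1024 in the text) that must be a power of two, and no window function is applied, matching the expression above.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def averaged_periodogram(samples, n_fft=1024):
    """Average |A_i(k)|^2 / n_fft over successive non-overlapping segments."""
    n_segments = len(samples) // n_fft           # L in the expression above
    if n_segments == 0:
        raise ValueError("need at least n_fft samples")
    acc = [0.0] * n_fft
    for i in range(n_segments):
        spectrum = fft(samples[i * n_fft:(i + 1) * n_fft])
        for k in range(n_fft):
            acc[k] += abs(spectrum[k]) ** 2
    return [v / (n_fft * n_segments) for v in acc]


# A cosine centred on bin 5 of a 64-point segment, two segments long.
sig = [math.cos(2.0 * math.pi * 5 * n / 64) for n in range(128)]
psd = averaged_periodogram(sig, n_fft=64)
print(max(range(32), key=lambda k: psd[k]))
```

Only the first half of the coefficients carries independent information for a real input, which corresponds to displaying the first 512 of the 1024 values.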

### 2.4.4 Simulator Example

The simulation of an actual bearing provides a means for comparing feature extraction methods and provides an input for simulating the monitor. The parameters used for the simulation are from a type NSK / NTN 30204 tapered roller bearing as reported by Howard and Stachowiak [27]. The 30204 is a general purpose, medium duty bearing that may find use, for example, as a motor shaft guide [34]. This type of bearing was chosen because it appears to lend itself well to the working range of the monitor as specified and because detailed information is available. Figure 7 shows a cutaway view of the bearing and relevant dimensions. Appendix A contains a complete listing of the bearing parameters used for the model.

Figure 7. NSK / NTN 30204 bearing section [27].

The bearing was tested at a shaft speed of 2000 rpm and an axial load of 295 N in a fixture that isolated the drive unit vibrations from the bearing. The radial housing vibrations were measured with a Bruel and Kjaer model 8309 piezoelectric accelerometer. The output was amplified and low-pass filtered with a B&K type 2635 charge amplifier before sampling at 116.2 kHz. High frequency tests were conducted with a cutoff frequency of 30 kHz. The resulting spectrum revealed housing resonances at 2.2, 11.3, 18.2, 26.8 and 35.2 kHz within the measured range.



Figure 8. Actual [27] and simulated zero-defect bearing signals and spectra.


Figure 8 presents a comparison of the time signals and spectra from the simulator and the actual bearing. The actual bearing graphs are from [27]. The model used a sampling rate of 2.56 MHz, 143,360 sampling points and a subsampling frequency of 174 kHz. The number of sampling points and subsampling frequency were chosen to match actual test conditions. The background noise variance (GMAG) was set at $2.93 \times 10^{-5}$. For the purpose of this comparison, a second-order, bilinear transformed Butterworth filter with a cutoff frequency of 30 kHz was added to the model to simulate the effects of the charge amplifier.

Table 2 compares three vibration acceleration feature indexes, RMS, crest factor and kurtosis factor, of the actual bearing as reported in [27] and five simulation model runs. The indexes are presented here for comparison purposes.

The effects of a point defect were created in the actual bearing by introducing a notch on the outer race (OR). This was modeled in the simulation by a relative defect magnitude of 0.0022 and by using varying noise with a constant of proportionality (*nprop*) of 300.0. Figure 9 shows the time signal for the actual and model bearing. Table 3 compares the damage indexes.

Table 2. Actual vs. model bearing damage indexes for no defect.

| Damage Index | Actual Test | Simulation run #1 | Simulation run #2 | Simulation run #3 | Simulation run #4 | Simulation run #5 |
|--------------|-------------|-------------------|-------------------|-------------------|-------------------|-------------------|
| RMS          | 4.91        | 5.77              | 5.93              | 5.66              | 5.66              | 5.88              |
| Crest Factor | 4.19        | 4.11              | 4.26              | 5.32              | 3.77              | 3.73              |
| Kurtosis     | 3.11        | 3.00              | 3.08              | 3.08              | 3.08              | 3.06              |

a) actual signal [27].

b) simulation signal.

Figure 9. Actual [27] and model defective bearing signals. Vertical axes: acceleration (m/s²).

Table 3. Actual vs. model bearing damage indexes for an OR defect.

| Damage Index | Actual Test | Simulation run #1 | Simulation run #2 | Simulation run #3 | Simulation run #4 | Simulation run #5 |
|--------------|-------------|-------------------|-------------------|-------------------|-------------------|-------------------|
| RMS          | 8.75        | 8.66              | 8.53              | 8.61              | 8.81              | 8.50              |
| Crest Factor | 6.45        | 6.69              | 7.84              | 6.74              | 7.08              | 7.50              |
| Kurtosis     | 6.15        | 5.14              | 6.20              | 5.04              | 5.32              | 5.50              |

## 2.5 SIMULATION RESULTS

This section summarizes the results obtained from the bearing simulator to compare defect indexes and to consider the effects of quantization. The model uses the NSK / NTN 30204 bearing parameters described in Section 2.4 and summarized in Appendix A. An outer race point defect and a shaft speed of 2000 rpm resulted in a defect repetition frequency of 207 Hz ( $f_o$ ). For each of the test cases, a sampling rate of 2.56 million samples per second was used, and a sufficient number of samples were generated to result in an output sample size of 640 for each subsampling frequency.

### 2.5.1 Feature Extraction Simulation

A series of simulations produced data useful for examining the effect of frequency on the damage indexes. A fifth-order Butterworth low-pass filter with a cutoff frequency one-third of the subsampling frequency limits the subsampled data to frequencies between 100 Hz and 40 kHz in various runs. Ten sets of results with no defect (0.000 defect magnitude) were analyzed to produce the standard deviation values expected for healthy bearings. One set was obtained at each of six relative defect magnitude levels from no defect to 0.005, the approximate border between the marginal and failure probable grades as estimated using the peak value comprehensive limits discussed in Section 2.3.4. The mean squared and peak values increase as the subsampling frequency increases and, therefore, the values are normalized to the average of the ten zero-defect runs. The results are summarized in Figure 10.
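The normalization and the $\pm 3\sigma$ band can be sketched as follows. The function names are hypothetical, the ten baseline values in the example are made-up numbers rather than simulation results, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics


def zero_defect_baseline(values):
    """Return (mean, standard deviation) of the zero-defect index values."""
    return statistics.mean(values), statistics.stdev(values)  # sample std assumed


def normalize(value, baseline_mean):
    """Express an index value relative to the average zero-defect level."""
    return value / baseline_mean


def three_sigma_band(baseline_mean, baseline_sigma):
    """Return the normalized (lower, upper) limits of the +/- 3 sigma range."""
    return ((baseline_mean - 3.0 * baseline_sigma) / baseline_mean,
            (baseline_mean + 3.0 * baseline_sigma) / baseline_mean)


# Hypothetical mean squared values from ten zero-defect runs.
runs = [9.0, 10.0, 11.0, 10.0, 10.0, 9.0, 11.0, 10.0, 10.0, 10.0]
mean, sigma = zero_defect_baseline(runs)
print(normalize(180.0, mean), three_sigma_band(mean, sigma))
```

An index is a useful indicator when its normalized value for a defective bearing falls well outside this band.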

Conclusions for each index type can be drawn by examining the simulation results. First, crest is a relatively poor index of the bearing's health. The range of values extends from between 3 and 4 for no defect to just under 7 for the best-case high defect, and the ±3 $\sigma$  range is about 2. Also, the measure is insensitive to increasing damage at frequencies below the housing resonant frequencies.

Kurtosis is a better indication of relative health than the crest factor. The kurtosis ranges from 3, indicating a Gaussian distribution of values, to a maximum of 11.5 at 30.5 kHz. As with the crest factor, the kurtosis value only indicates a deteriorating condition at frequencies containing housing resonances.

The normalized peak value is a better indicator than kurtosis over the entire range of frequencies simulated. A factor change of about 3.5 at low frequencies up to 6 at high frequencies occurs between the zero-defect and severely damaged bearing. Some nonmonotonicity with respect to an increase in defect magnitude can be observed.

The mean squared value is the best defect index. Normalized values for the 0.005 relative defect magnitude range from about 9 at high frequencies up to 18 at the defect repetition frequency. The normalized MS value is the best measure around the defect frequencies, has the smallest relative  $\pm 3\sigma$  range, and exhibits the best monotonicity with respect to increasing defect magnitude.
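The four indexes compared above can be illustrated with a minimal sketch. It is not the simulation used in this work: the test signal is hypothetical (unit-variance Gaussian noise with a periodic impulse standing in for a defect), the mean is removed before the indexes are computed (an assumption), and the names `defect_indexes` and `synthetic_vibration` are illustrative only.

```python
import math
import random


def defect_indexes(samples):
    """Return peak, mean squared, crest factor and kurtosis of a signal."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # assumption: indexes use zero-mean data
    peak = max(abs(s) for s in centered)
    mean_sq = sum(s * s for s in centered) / n
    rms = math.sqrt(mean_sq)
    fourth = sum(s ** 4 for s in centered) / n
    return {
        "peak": peak,
        "mean_squared": mean_sq,
        "crest": peak / rms,                # peak divided by rms
        "kurtosis": fourth / mean_sq ** 2,  # equals 3 for a Gaussian signal
    }


def synthetic_vibration(n, impulse_amplitude, period, seed=0):
    """Gaussian noise plus a hypothetical periodic impulse (stand-in for a defect)."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if impulse_amplitude > 0.0:
        for i in range(0, n, period):
            signal[i] += impulse_amplitude
    return signal


healthy = defect_indexes(synthetic_vibration(20000, 0.0, 100))
damaged = defect_indexes(synthetic_vibration(20000, 8.0, 100))

# Normalize to the zero-defect run, as done for the MS and peak values in the text.
normalized_ms = damaged["mean_squared"] / healthy["mean_squared"]
normalized_peak = damaged["peak"] / healthy["peak"]
print(healthy["kurtosis"], damaged["kurtosis"], normalized_ms, normalized_peak)
```

With a single zero-defect run the normalization is only indicative; the text averages ten zero-defect runs to obtain both the reference level and the $\pm 3\sigma$ band.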

As an indicator of bearing health, the most important factor for an index is the change from no defect to the current defect level. Figure 11 presents a comparison of the indexes with each normalized to the average zero-defect level. Two graphs are provided: one with a 0.002 relative defect magnitude, the comprehensive magnitude border between acceptable and marginal grades, and one with a 0.005 defect magnitude, the border between marginal and failure probable grades. The mean squared index provides the best indication of deterioration in all instances except at very high frequencies for the acceptable/marginal border where peak value is superior. Kurtosis and crest factor are inferior to peak value for every case.

Figure 10. Defect indexes vs. frequency.

a) 0.002 relative defect magnitude

b) 0.005 relative defect magnitude

Figure 11. Comparison of normalized defect indexes.

### 2.5.2 Quantization Simulation

The effects of quantization on the mean squared value and the vibration power spectrum were investigated using simulation. A 5 kHz cutoff frequency and a peak range of  $\pm 10$  g were used.

The effects of quantization on the MS value were investigated by generating data sets for no defect through a relative defect magnitude of 0.1, well beyond the failure imminent level predicted by the Southwest Research comprehensive limits. Each data set was then quantized at 4-bit through 10-bit resolution. The results for the 5-bit through 8-bit quantization, together with the floating point (flp.) values, are presented in Figure 12. The shaded region indicates the  $\pm 3\sigma$  range obtained by averaging ten zero-defect floating point runs.

The 5-bit quantization is too coarse to detect any values before a relative defect level of approximately 0.002. At quantization greater than five bits, the MS value follows the floating point value exactly, plus an additional amount equal to the mean squared value of the quantization noise. The 8-bit line is indistinguishable from the floating point results. At relative defect values above about 0.05, the peak value can exceed 10 g, causing quantization saturation and an underestimation of the MS value. This should be beyond the failure point of the bearing.
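The additive offset described above can be sketched numerically. The example below is an illustration under stated assumptions, not the simulation reported here: it uses a hypothetical Gaussian test signal of 0.5 g rms, a uniform quantizer over the ±10 g range, and the standard uniform-noise model in which the quantization noise has a mean squared value of $\Delta^2/12$ for a step size $\Delta$.

```python
import random


def quantize(x, bits, full_scale=10.0):
    """Uniform quantizer over +/-full_scale with saturation at the range limits."""
    step = 2.0 * full_scale / (2 ** bits)
    level = round(x / step)
    max_level = 2 ** (bits - 1) - 1
    level = max(-max_level - 1, min(max_level, level))  # clip (saturate)
    return level * step


def mean_squared(samples):
    return sum(s * s for s in samples) / len(samples)


rng = random.Random(1)
signal = [rng.gauss(0.0, 0.5) for _ in range(50000)]  # hypothetical 0.5 g rms signal
ms_float = mean_squared(signal)

results = {}
for bits in (5, 6, 7, 8):
    step = 20.0 / 2 ** bits
    ms_quantized = mean_squared([quantize(s, bits) for s in signal])
    predicted = ms_float + step ** 2 / 12.0  # floating point MS plus noise power
    results[bits] = (ms_quantized, predicted)
    print(bits, round(ms_quantized, 5), round(predicted, 5))
```

The noise term shrinks by a factor of four for each added bit, which is consistent with the 8-bit curve being indistinguishable from the floating point curve.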

Key to comprehensive vibration limits: A = no fault, B = acceptable, C = marginal, D = failure probable, E = failure imminent.

Figure 12. Quantization effects on MS.

The effects of quantization on the spectrum were examined by generating ten 1024-point FFTs. These were averaged to produce power spectrum periodograms. One data set representing no defect and one at a relative defect level of 0.002 were quantized at 6, 8, 10 and 12-bit levels. The resulting spectra plus the floating point spectrum are presented in Figure 13. The spectra are shown with a two decade vertical spacing for clarity.

All of the spectra follow the trends of the floating point spectrum. However, the 8-bit and 6-bit spectra have significant differences at lower magnitudes, particularly in the zero-defect graph. The 12-bit spectrum is virtually identical to the floating point spectrum.

### 2.6 MONITOR SYSTEM SPECIFICATIONS

This chapter concludes with the development of a set of specifications for the monitor system based on the previous findings. The feature extraction method will be the mean squared value. This defect index yielded the best results in simulation, having the largest variation between no defect and defect levels, a monotonic increase with increasing defect magnitude, and good sensitivity at low frequencies.

The accelerometer should have a flat response up to between 2 and 10 kHz and down to 100 Hz. A low-pass cutoff of 5 kHz was chosen for this project. The accelerometer should have a magnitude range of  $\pm 10~g$  and have the ability to resolve 0.1 g. These parameters are sufficient to cover the monitoring requirements of a medium duty bearing such as the NSK / NTN 30204.

The analog-to-digital converter should produce at least 8 bits of accuracy for the mean squared calculation. It should also support the capability for 12-bit quantization for off-chip spectrum generation.
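As a rough cross-check of these figures, the number of bits needed to resolve 0.1 g over a ±10 g range can be estimated as below. This is a back-of-the-envelope sketch assuming an ideal uniform quantizer; the helper name `bits_required` is illustrative.

```python
import math


def bits_required(full_scale_g, resolution_g):
    """Smallest bit count whose step size is no larger than the resolution."""
    levels = 2.0 * full_scale_g / resolution_g  # number of steps across the range
    return math.ceil(math.log2(levels))


bits = bits_required(10.0, 0.1)   # +/-10 g range, 0.1 g resolution
step_g = 20.0 / 2 ** bits         # resulting step size in g
print(bits, step_g)
```

Under this assumption the 0.1 g resolution requirement and the 8-bit accuracy requirement for the mean squared calculation are mutually consistent.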



a) no defect



b) 0.002 relative defect magnitude

Figure 13. Quantization effects on spectrum.

### CHAPTER 3 TRANSDUCER

The acceleration transducer converts bearing vibrations into a change in resistance. This is accomplished by using a mass suspended by four beams. As the device is accelerated, the suspended mass moves relative to the base material, causing stress in the beams. Two piezoresistors placed on one of the beams convert the stress into a change in resistance. The piezoresistors are arranged to form a half-active bridge to accentuate stress induced changes while minimizing the effects of resistor variations.

Section 3.1 presents theory associated with suspended mass accelerometers. In Section 3.2, the results of finite element modeling of the four-beam accelerometer are given. A discussion of the piezoresistor follows in Section 3.3. Section 3.4 then estimates the performance characteristics. Section 3.5 presents transducer construction issues. Finally, the electronics that convert the change in resistance to a voltage signal are considered in Section 3.6. Appendix B provides greater details in the development of relations used in this chapter.

### 3.1 SUSPENDED MASS ACCELEROMETER THEORY

This section theoretically analyzes a suspended mass accelerometer. It starts with a brief discussion of a mass-spring-damper system. Next, a basic description of the transducer design is presented. This is followed by design calculations developed from isotropic beam theory.

### 3.1.1 Mass-Spring-Damper System

Figure 14 shows a mass-spring-damper system with base excitation. The spring is assumed to be linear and elastic, with the stiffness k relating the applied force F to the deformation d by F = kd. The damper is assumed to be linear and viscous, with damping constant c relating the applied force to the velocity v by F = cv.

Figure 14. Mass-spring-damper system [35].

An expression for the displacement of the mass, x(t), in terms of the base displacement, y(t), is

$$m\ddot{x}+c(\dot{x}-\dot{y})+k(x-y)=0.$$

A more useful expression is obtained by reformulating the problem in terms of the relative motion between the base and the mass. Let z = x - y. For a sinusoidal base displacement $y(t) = Y \sin(\omega t)$, the equation of motion becomes

$$m\ddot{z} + c\dot{z} + kz = -m\ddot{y} = m\omega^2 Y \sin(\omega t).$$

The displacement and acceleration transfer functions can be obtained through frequency transformations. The magnitudes are

$$\left|\frac{Z}{Y}(\omega)\right| = \frac{m\omega^2}{\sqrt{(k-m\omega^2)^2+(c\omega)^2}} = \frac{r^2}{\sqrt{(1-r^2)^2+(2\zeta r)^2}},$$

where $\omega_0 = \sqrt{\frac{k}{m}}$ is the fundamental frequency, $r = \frac{\omega}{\omega_0}$ is the normalized frequency, and $\zeta = \frac{c}{2\sqrt{km}}$ is the damping ratio.

Figure 14 shows the relationships between the magnitude transfer function, normalized frequency and the damping ratio.
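The magnitude expression above is straightforward to evaluate. The sketch below implements $|Z/Y|$ as written and, as an additional assumption not stated in the text, divides out the $r^2$ factor to obtain the relative displacement per unit base acceleration normalized to its low-frequency value, which is the quantity that should stay flat in an accelerometer pass band. The function names are illustrative.

```python
import math


def relative_displacement_ratio(r, zeta):
    """|Z/Y| for normalized frequency r and damping ratio zeta."""
    return r ** 2 / math.sqrt((1.0 - r ** 2) ** 2 + (2.0 * zeta * r) ** 2)


def normalized_acceleration_response(r, zeta):
    """|Z| per unit base acceleration, scaled so the value at r = 0 is 1."""
    return 1.0 / math.sqrt((1.0 - r ** 2) ** 2 + (2.0 * zeta * r) ** 2)


# At resonance (r = 1) the displacement ratio reduces to 1 / (2 * zeta).
print(relative_displacement_ratio(1.0, 0.5))

# Pass-band flatness for a few damping ratios at r = 0.25 (illustrative values).
for zeta in (0.3, 0.7, 1.0):
    print(zeta, round(normalized_acceleration_response(0.25, zeta), 4))
```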

Parameters of importance in the mass-spring-damper model include the frequency response and the sensitivity. The sensitivity can be expressed as a function of the strain, or normalized change in length, of the spring. The primary issues of importance regarding frequency include the range of relatively flat response in the low frequency pass band, and the frequency and damping ratio of the fundamental response.

### 3.1.2 Transducer Geometry

A mass suspended by one or more beams is used to implement the mass-spring-damper arrangement in silicon. The spring action is supplied by the elasticity of the thin silicon beams supporting the mass. The principal damping is supplied by the fluid through which the mass and beams travel.

Single beam accelerometers are the simplest construction. They have relatively good sensitivity because of the freedom of movement allowed to the mass. This configuration, however, has two serious drawbacks. First, the resonant frequency is low because of the small spring constant (low stiffness) of a single beam. Second, the system is sensitive to off-axis accelerations due to the lack of support for the freely suspended mass.

The addition of more beams of the same size to the same mass has the effect of decreasing the sensitivity and increasing the bandwidth. If these beams are distributed on different sides of the mass, the motion of the device in the plane of the beams is severely restricted. This greatly reduces sensitivity to off-axis acceleration. Most recent commercial accelerometers are constructed with four beams.

The geometry of the transducer designed for this project is shown in Figures 15 and 16. The projected 45° angles result from the anisotropic micromachining of (100) silicon [36]. Four beams, each contained within dimensions of 90 $\mu$m × 60 $\mu$m × 10 $\mu$m, connect the base material to a mass cut approximately into the shape of the frustum of a pyramid with outer dimensions of 2320 $\mu$m × 2320 $\mu$m × 400 $\mu$m and a mass of 3.55 mg. The mass is calculated from the volume and density of silicon in Appendix B.1.

The beams are each located near a different corner of the mass to reduce sensitivity to off-axis acceleration. Placing them on each corner would minimize sensitivity to off-axis acceleration, but would also complicate the micromachining [37].

The geometry was chosen principally for acceptable bandwidth. It follows a different philosophy from many of the products currently available. The usual goal is to make the transducer as small as possible to allow electronics on the same chip. This produces devices that are more susceptible to manufacturing variations, operating condition variations, and small-scale effects. In contrast, the transducer proposed here allows the surface of the large mass to be used for the placement of conditioning electronics, isolating them from the other electronics.

### 3.1.3 Isotropic Beam Theory

A first estimate of the transducer's mechanical properties can be obtained from simple beam theory. Additional details of the results presented in this section are provided in Appendices B.2 and B.3.

Figure 15. Overall transducer geometry.

Figure 16. Beam geometry.

If a slender prismatic beam constructed of a linear, isotropic, homogeneous, elastic material is subjected to a load resulting in a small deflection, an approximate relationship between the loading and the deflection is [38]

$$EI\frac{d^4y}{dx^4}=w(x),$$

where

x = length along the beam,

y(x) = deflection of the beam,

w(x) = load function,

E = stiffness (Young's) modulus,

I = cross-section area moment of inertia.

Solutions to the differential equation for deflection, longitudinal stress, and bending moment along the beam require four boundary conditions. These are found by considering the ends of a clamped-sliding beam under end loading as shown in Figure 17.

The four-post suspension limits the free end of the beam to motion normal to the plane. It also results in a loading of the "sliding" end equivalent to one-quarter of the total mass times the applied acceleration.

Boundary conditions:

$$y(0) = 0 \qquad \frac{dy}{dx}\Big|_{x=0} = 0$$

$$M(L) = EI \frac{d^2y}{dx^2}\Big|_{x=L} = \frac{maL}{2} \qquad S(0) = EI \frac{d^3y}{dx^3}\Big|_{x=0} = -F = -ma$$

Figure 17. Loaded beam in the four-beam accelerometer.

The simplified beam model used assumes a constant cross-sectional area. However, the last 10  $\mu$ m on either end of the beam merge with the supporting material. The effective length of the beam lies somewhere between 70 and 90  $\mu$ m. The shorter value may be used for stress calculations and the longer value for displacement and fundamental frequency calculations to yield worst case values.

One important value obtained from the solution to the differential equation is the maximum outer fiber stress,  $\sigma_0$ . This stress is converted to a change in resistance by the piezoresistor. The maximum outer fiber stress is

$$\sigma_0 = \frac{M_0 c}{I} = -\frac{macL}{2I} = -396\,g \;\; \text{g}/\mu\text{m}\cdot\text{s}^2,$$

where

 $M_0$  = maximum moment,

c = maximum edge-to-neutral surface distance,

m = suspended mass,

a = acceleration,

L = beam length,

g = number of gravitational accelerations.

The maximum outer fiber stress occurs at the top of the beam at the point of attachment to either the mass or the base.

The maximum deflection in the y-direction, which occurs at the point where the beam connects to the mass, can be found as

$$y(L) = -\frac{maL^3}{12EI} = -7.61 \times 10^{-4}\,g \;\; \mu\text{m}.$$

This system of units, grams (g), micrometers ( $\mu$ m), and seconds (s), is used to accommodate the finite element analysis.


Several assumptions made in the development of the above equations contribute to their inaccuracies. First, the material is not isotropic. Second, the post attaching the mass to the base does not fit the beam definition that the length should be at least ten times as long as the largest cross-sectional dimension (width or thickness). Third, the standard simplifications that only pure bending exists along the beam and that only longitudinal stresses exist at points away from applied loads, while providing accurate results for beams with small deflections, may produce significant errors as the length dimension approaches the width dimension. Fourth, the material in the base and mass near the attachment to the beam will not remain rigid. Finally, an approximate beam length was used to yield a worst case solution.

The fundamental frequency,  $\omega_0$ , can be approximated using Rayleigh's Principle [35]. Details are provided in Appendix B.3. If the vibrating beam is conservative, the potential energy will be completely converted to kinetic and vice versa twice per cycle. Assuming the effect due to the suspended end mass is significantly greater than the contribution of the weight of the beam, the equations for the maximums are

maximum potential energy: $P = \frac{1}{2} mgd = \frac{1}{2} mg\,y(L)$,

maximum kinetic energy: $K = \frac{1}{2} mv^2 = \frac{1}{2} m\omega_0^2 y^2(L)$.

Since the system is assumed conservative, the maximum kinetic energy must equal the maximum potential energy. Equating the two and solving for  $\omega_0$  yields

$$\omega_0 = \sqrt{\frac{g}{y(L)}} = \sqrt{\frac{12EI}{L^3 m}} = 1.14 \times 10^5 \text{ rad/s},$$

or

$$f_0 = 18.1 \text{ kHz}.$$
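The numerical value can be reproduced from the static deflection quoted earlier in this section. The sketch below assumes standard gravity, $g = 9.80665 \times 10^6\ \mu\text{m/s}^2$, and takes the magnitude of the 1 g deflection, $7.61 \times 10^{-4}\ \mu$m, as its only input; the function name is illustrative.

```python
import math

G_UM_PER_S2 = 9.80665e6  # assumed standard gravity, expressed in um/s^2


def rayleigh_frequency_hz(static_deflection_um_per_g):
    """Fundamental frequency from omega_0 = sqrt(g / y(L)), returned in Hz."""
    omega0 = math.sqrt(G_UM_PER_S2 / static_deflection_um_per_g)
    return omega0 / (2.0 * math.pi)


f0_hz = rayleigh_frequency_hz(7.61e-4)  # deflection magnitude under 1 g
print(round(f0_hz))
```

Because the frequency depends only on the deflection under 1 g, the same helper can be applied to any other static deflection estimate of the structure.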


This estimation suffers from three inaccuracies. First, the deflection used is static, introducing stiffness. Second, the material in the base and mass near attachment to the beam is assumed rigid. Both of these simplifications cause the estimate to have a higher resonant frequency. The third inaccuracy arises from the use of 90  $\mu$ m as the beam length. This tends to underestimate the fundamental frequency.

### 3.2 FINITE ELEMENT ANALYSIS

The finite element method consists of three steps: dividing the problem into small elements, applying appropriate physical relations to the elements, then combining the results of each element to obtain a solution [39]. The program used to run the simulation was NISA II (Numerically Integrated Elements for System Analysis II) developed by Engineering Mechanics Research Corporation (EMRC) in Troy, MI. This program is widely used throughout the world for finite element analysis and is known for its ability to handle nonisotropic materials. The program was run on a 4/490 Sun Server using a Sun IPC.

This section opens by describing the model used. Next, a description of the static analysis yields stress and displacement data. This is followed by the eigenvalue analysis used for determining the modes and their relative contributions. Finally, the frequency response is presented.

### 3.2.1 Finite Element Model

Three of the decisions required in developing a finite element model include the type of element to be used, the material properties of the elements, and the partitioning of the structure into those elements. This segment will examine each of these issues in turn.

### 3.2.1.1 Element type

NISA II supports over 15 different three-dimensional element types and orders. The hybrid solid element (NKTP = 9) was chosen because it supports wide variations in thickness throughout the model, a necessary requirement for the accelerometer [40]. Each element is a hexahedron with six faces and eight nodes, each node having three degrees of freedom. The additional computational cost associated with using this element type is justified because the simpler tetrahedron and hexahedron elements were found to exhibit numerical stability problems with the accelerometer model.

### 3.2.1.2 Material properties

Part of the description for a structure is the set of constitutive equations. These are mathematical descriptions of the material's properties. The relevant constitutive equation for this finite element model relates applied loads or stresses to displacements or strains. The tensor stress-strain relationship is

$$\sigma_{ij} = C_{ijkl} \, \boldsymbol{\varepsilon}_{kl} \, ,$$

where

 $\sigma_{ij}$  = Cauchy stress tensor,

 $C_{ijkl}$  = fourth-order stiffness tensor,

 $\varepsilon_{kl}$  = small strain tensor.

NISA II uses a linear elastic orthotropic model for nonisotropic bulk material. A material is orthotropic if the stiffness coefficients, in matrix notation, have the form [41]

$$C_{ij} = \begin{bmatrix} C_{11} & C_{12} & C_{13} & 0 & 0 & 0 \\ C_{12} & C_{22} & C_{23} & 0 & 0 & 0 \\ C_{13} & C_{23} & C_{33} & 0 & 0 & 0 \\ 0 & 0 & 0 & C_{44} & 0 & 0 \\ 0 & 0 & 0 & 0 & C_{55} & 0 \\ 0 & 0 & 0 & 0 & 0 & C_{66} \end{bmatrix}.$$


The particular form of the constitutive equations used in NISA II is

$$\begin{bmatrix} \sigma_{11} \\ \sigma_{22} \\ \sigma_{33} \\ \tau_{12} \\ \tau_{23} \\ \tau_{31} \end{bmatrix} = \frac{1}{\alpha} \begin{bmatrix} E_{11} (1 - \nu_{23} \nu_{32}) & E_{22} (\nu_{12} + \nu_{13} \nu_{32}) & E_{33} (\nu_{13} + \nu_{12} \nu_{23}) & 0 & 0 & 0 \\ & E_{22} (1 - \nu_{13} \nu_{31}) & E_{33} (\nu_{23} + \nu_{21} \nu_{13}) & 0 & 0 & 0 \\ \text{sym.} & & E_{33} (1 - \nu_{12} \nu_{21}) & 0 & 0 & 0 \\ 0 & 0 & 0 & \alpha G_{12} & 0 & 0 \\ 0 & 0 & 0 & 0 & \alpha G_{23} & 0 \\ 0 & 0 & 0 & 0 & 0 & \alpha G_{31} \end{bmatrix} \begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{22} \\ \varepsilon_{33} \\ \gamma_{12} \\ \gamma_{23} \\ \gamma_{31} \end{bmatrix},$$

where

 $\alpha = 1 - \nu_{12} \nu_{21} - \nu_{23} \nu_{32} - \nu_{31} \nu_{13} - \nu_{12} \nu_{23} \nu_{31} - \nu_{21} \nu_{13} \nu_{32}$,

 $\nu_{ij}$  = Poisson's ratio,

 $E_{ii}$  = elastic modulus,

 $G_{ij}$  = shear modulus,

 $\tau_{ij} = \sigma_{ij}$  = shear stress,

 $\gamma_{ij} = 2\varepsilon_{ij}$  = "engineering" shear strain.

Thus, nine coefficients need to be supplied: three elastic moduli, three shear moduli and three Poisson's ratios. It should be noted that the indices 1, 2, 3 refer to the axes x, y, z used by the model.

Silicon is an orthotropic material. Its stiffness coefficients, referenced to crystallographic axes, have been experimentally determined to be

$$C_{11} = 1.658 \times 10^{8}\ \text{g}/\mu\text{m}\cdot\text{s}^2, \qquad C_{12} = 0.639 \times 10^{8}\ \text{g}/\mu\text{m}\cdot\text{s}^2, \qquad C_{44} = 0.796 \times 10^{8}\ \text{g}/\mu\text{m}\cdot\text{s}^2,$$

for 300 K temperature and 0 kg/cm<sup>2</sup> pressure [42].

Because the beams align with the (110) directions and the crystallographic axes lie along (100) directions, the coefficients must be converted to tensor notation and transformed to the new coordinate system. The stiffness constants must then be converted to the moduli format used in NISA II. Appendix B.4 provides details of the transform.


This results in the following model constitutive coefficients, written in the form used by the program:

 $EX = EZ = 1.692 \times 10^8$  g/µm·s²,  
 $EY = 1.302 \times 10^8$  g/µm·s²,  
 $GXY = GYZ = 0.398 \times 10^8$  g/µm·s²,  
 $GXZ = 0.225 \times 10^8$  g/µm·s²,  
 $NUXZ = 0.0626$ ,  
 $NUXY = 0.362$ ,  
 $NUYZ = 0.279$ .
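The elastic moduli and Poisson's ratios listed above can be cross-checked against the crystallographic stiffness coefficients. The sketch below is not the transform of Appendix B.4; it uses the standard direction-dependent compliance relations for a cubic crystal, and it assumes that the model x and z axes lie along [110] and [1-10] and the y axis along [001]. The shear moduli are deliberately left out, since their values depend on the shear strain convention adopted in the transform. All function names are illustrative.

```python
import math

# Stiffness coefficients from the text, in g/(um*s^2).
C11, C12, C44 = 1.658e8, 0.639e8, 0.796e8


def cubic_compliances(c11, c12, c44):
    """Invert the cubic stiffness coefficients to compliances s11, s12, s44."""
    denom = (c11 - c12) * (c11 + 2.0 * c12)
    return (c11 + c12) / denom, -c12 / denom, 1.0 / c44


def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def youngs_modulus(direction, s):
    """Young's modulus along an arbitrary direction of a cubic crystal."""
    s11, s12, s44 = s
    l, m, n = unit(direction)
    aniso = s11 - s12 - 0.5 * s44
    return 1.0 / (s11 - 2.0 * aniso * (l*l*m*m + m*m*n*n + n*n*l*l))


def poisson_ratio(load_dir, transverse_dir, s):
    """Poisson's ratio for loading along load_dir, contraction along transverse_dir."""
    s11, s12, s44 = s
    l1, m1, n1 = unit(load_dir)
    l2, m2, n2 = unit(transverse_dir)
    aniso = s11 - s12 - 0.5 * s44
    s12_rot = s12 + aniso * (l1*l1*l2*l2 + m1*m1*m2*m2 + n1*n1*n2*n2)
    return -s12_rot * youngs_modulus(load_dir, s)


S = cubic_compliances(C11, C12, C44)
X, Y, Z = (1, 1, 0), (0, 0, 1), (1, -1, 0)  # assumed model axes

EX, EY = youngs_modulus(X, S), youngs_modulus(Y, S)
NUXY, NUYZ, NUXZ = poisson_ratio(X, Y, S), poisson_ratio(Y, Z, S), poisson_ratio(X, Z, S)
print(EX, EY, NUXY, NUYZ, NUXZ)
```

Under these assumptions the computed moduli and Poisson's ratios agree with the listed values to within rounding in the last quoted digit.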

### 3.2.1.3 Element size

The decision regarding how to divide the structure into elements is similar to any numerical method in that the finer the divisions, the more accurately the discrete result will approach a continuous result. The goal of the finite element modeling is to estimate properties of the beams since they contain the stress-sensitive piezoresistors. This is also the region that will undergo the most change due to acceleration. Hence, the beams will require greater accuracy than the larger base and mass volumes.

No straightforward theoretical method exists to determine how a particular region should be divided into elements. Instead, the typical method employed is to partition the volume into successively finer elements, noting as the results asymptotically approach the continuous solution. The process is halted when the desired accuracy is obtained.

This technique was implemented by using a 70  $\mu$ m beam with the same cross-section and material properties as the beam in the full model. The base side was rigidly fixed. The mass was modeled by a rigid, dense block having one-quarter of the total mass and a length of 10  $\mu$ m, constrained to move only in the vertical (y) direction. Figure 18 shows an example test beam.




Figure 18. Test beam.

Static and eigenvalue analyses were conducted to obtain the stress at the top center of the fixed end (maximum stress), displacement at the top center of the mass end (maximum displacement), and the first harmonic frequency. Six trials were run with the cross-section divided into four elements (2 across the width by 2 through the height), and six trials with the cross-section divided into eight elements (4 across the width by 2 through the height). For each cross-section, the trials started with the length divided into 2 elements, doubling each time until 64 length divisions were obtained. The results of these tests are summarized in three graphs in Figure 19.

The theoretical values are the results of the isotropic beam theory calculations with a length of 70  $\mu$ m. The calculations are provided in Appendices B.2 and B.3. These limits do not represent the asymptotic limits for a continuous solution to the general anisotropic problem.

The results show a good correlation between the asymptotic result from the beam model and the isotropic theory. They also indicate that the model results are within about 2% of the asymptote at 256 elements for an eight element cross-section, and within 5% of the asymptote at 128 elements for an eight element cross-section. This resulted in the decision to use an eight element cross-section with 30 length divisions for a total of 240 elements per beam. This accuracy is carried into the supporting material, decreasing in fineness away from the beams.







Figure 19. Element size effects.


### 3.2.2 Static Analysis

Static analysis produces the steady-state result to an external load. In this case, it is used to obtain stress and displacement information. The technique used by NISA II is based on the principle of virtual displacements, which states that the total internal virtual work must equal the total external virtual work. The virtual work is expressed in terms of virtual displacements that are compatible with the conditions of equilibrium, kinematic constraints and boundary conditions.

The solution proceeds by first calculating the stiffness matrix for the system. The equilibrium equations are then solved to obtain element displacements. Next, the element stresses are calculated. Finally, the element results are combined to provide system results [43].

One of the results obtained from a static analysis is the distribution of stresses. Since the stress along a beam (longitudinal stress) is used to convert acceleration into a change in resistance, its values are critical. Figure 20 presents the distribution of stresses on the top of a beam aligned with the x-axis due to 1 g of downward acceleration. Part (a) of the figure shows the distribution of normal stresses in the x-direction (SXX). Darker regions indicate increased compressive stress and lighter regions increased tensile stress.

The maximum longitudinal stress on the top surface occurs at the outside edge of the beam above where the bottom of the beam attaches to the base or mass, and has a value of 428 g/ $\mu$ m·s<sup>2</sup>. The maximum longitudinal stress at the center of the beam (across the width) occurs at the same point along the length and is 344 g/ $\mu$ m·s<sup>2</sup>.

Several measures of stress are important in determining the potential for failure. One is the shear, or in-plane stress. Figure 20b shows shear stress (SZX) due to 1 g acceleration on the top of the beam, with darker regions showing greater stress. The maximum shear is 221 g/$\mu$m·s<sup>2</sup> per g, occurring on the top face above where the bottom of the beam attaches to the base material.



Figure 20. Beam top stress distributions.


Another stress measure is the principal stress. The normal and shear stresses are determined along the model axes. These may not be the directions of maximum normal stress. The stresses resolved in directions producing only normal stresses (no shear stresses) are known as principal stresses. Figure 20c shows the distribution of the first principal stress (S1P). The principal stress is also greatest above the lower attachment point, where its value is 432 g/$\mu$m·s<sup>2</sup> per g. The second principal stress is near zero and the third principal stress is a reflection of the first. This verifies the compressive-tensile nature expected on opposite ends of this beam.

A fourth stress measure is the magnitude of the octahedral stress. Octahedral stresses are normal to the octahedral, or  $\{1,1,1\}$  planes. This measure is significant due to the anisotropic cubic nature of silicon. Figure 20d presents the magnitude of the octahedral stresses due to 1 g acceleration on the top surface of the beam (SEQ3). The maximum octahedral stress occurs in the region above the lower beam attachment, with a value of 206 g/ $\mu$ m·s<sup>2</sup>.

Because it is an anisotropic material, the yield strength of silicon should be anisotropic. However, the anisotropic yield strength coefficients have not been experimentally determined. This is because the practical yield stress of a silicon mechanical device tends to be more sensitive to surface defects than to the direction of imposed stresses. Measured yield stresses vary between about  $10^5$  and  $10^6$  psi (about  $7\times10^5$  and  $7\times10^6$  g/ $\mu$ m·s²). Results from Nova Sensor have shown yields of 150,000 psi in the [110] direction for a (100) silicon wafer. The value was shown to increase to 950,000 psi with removal of surface defects [44].

The maximum stress seen by the accelerometer is below 500 g/$\mu$m·s<sup>2</sup> per g of acceleration. The maximum expected acceleration in normal use is 10 g. This implies that resulting stresses are over two orders of magnitude below the minimum listed yield stress. This also implies that the unprotected device should be able to withstand shocks resulting from nearly 1000 g before fracturing.
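The margin quoted above follows from simple arithmetic, sketched below. The inputs are the bounding values from the text: 500 g/µm·s² per g as the stress ceiling and $10^5$ psi as the lowest listed yield stress. The unit conversion uses 1 psi = 6894.757 Pa and 1 g/µm·s² = 1000 Pa. The names are illustrative, and the result is a rough bound rather than a fracture prediction.

```python
PSI_TO_G_PER_UM_S2 = 6.894757  # 1 psi = 6894.757 Pa; 1 g/(um*s^2) = 1000 Pa


def stress_margin(stress_per_g, acceleration_g, yield_psi):
    """Ratio of yield stress to the stress produced at a given acceleration."""
    yield_stress = yield_psi * PSI_TO_G_PER_UM_S2
    return yield_stress / (stress_per_g * acceleration_g)


STRESS_PER_G = 500.0    # upper bound on stress per g of acceleration, from the text
MIN_YIELD_PSI = 1.0e5   # lowest measured yield stress quoted in the text

margin_at_10g = stress_margin(STRESS_PER_G, 10.0, MIN_YIELD_PSI)
shock_limit_g = MIN_YIELD_PSI * PSI_TO_G_PER_UM_S2 / STRESS_PER_G
print(round(margin_at_10g), round(shock_limit_g))
```

With these bounding inputs the margin at 10 g exceeds two orders of magnitude, and the acceleration at which the stress ceiling reaches the minimum yield stress is of the same order as the 1000 g figure given above.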


Along with stress information, displacement data can be obtained from the static model. The displacement at the center of the suspended mass was found to be  $-7.9 \times 10^{-4} \, \mu \text{m/g}$  acceleration.

### 3.2.3 Eigenvalue Analysis

The purpose of performing an eigenvalue analysis is to obtain the device's natural frequencies (eigenvalues) and free vibrating modes (eigenvectors). This is accomplished by solving the following equation:

$$M\ddot{\mathbf{u}} + K\mathbf{u} = \mathbf{0}$$

where

M = mass matrix,

K = stiffness matrix,

 $\mathbf{u} = \overline{\varphi} e^{j(\omega t + \psi)}$  = sinusoidal forcing function.

Several methods are available in NISA II for extracting eigenvalues. The Accelerated Subspace iteration method was chosen for this analysis because of its ability to rapidly solve systems.

A primary result of the analysis is the set of resonant frequencies. These values for the first ten modes and their modal masses in the y-direction are listed in Table 4. The results show that the fundamental frequency will be 17.7 kHz. The modal mass provides an indication of the relative effects of that mode on the system. This reinforces the assumption that the only significant mode is the first.

Another useful result of the eigenvalue analysis is the mode of vibration at each natural frequency. The vibration modes for the first six harmonics are shown in Figure 21.


Table 4. Natural frequencies and modal masses.

| Mode | Frequency (kHz) | Modal Mass (UY)       |
|------|-----------------|-----------------------|
| 1    | 17.7            | 3.55×10<sup>-3</sup>  |
| 2    | 30.9            | 7.6×10<sup>-14</sup>  |
| 3    | 30.9            | 1.3×10<sup>-13</sup>  |
| 4    | 94.8            | 7.5×10<sup>-17</sup>  |
| 5    | 94.8            | 2.6×10<sup>-18</sup>  |
| 6    | 153.6           | 2.9×10<sup>-8</sup>   |
| 7    | 316.4           | 1.5×10<sup>-16</sup>  |
| 8    | 880.5           | 5.1×10<sup>-17</sup>  |
| 9    | 908.4           | 1.5×10<sup>-9</sup>   |
| 10   | 991.8           | 7.0×10<sup>-15</sup>  |

## 3.2.4 Frequency Analysis

Frequency analysis indicates the steady-state system response to a harmonic forcing function. The longitudinal stress was plotted as a function of 1 g swept frequency sinusoidal acceleration in the y-direction. The response is dependent on the type and magnitude of damping used. The device packaging will supply viscous squeeze-film damping with a critical ratio as described in Section 3.5.3. Figure 22 presents the simulated magnitude spectrum at the beam center (node 115) and outside edge (node 109). The response is flat beyond the 5 kHz limit required for the device.
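The flatness claim can be illustrated with a simplified model. The sketch below evaluates the normalized magnitude of a single second-order resonance at the 17.7 kHz fundamental frequency; reducing the device to one mode is an assumption made here, justified only loosely by the dominant first modal mass, and the function names are hypothetical. It is not a substitute for the finite element frequency sweep.

```python
import math

def sdof_magnitude(f_hz, fn_hz, zeta):
    """Magnitude of a second-order system, normalized to its low-frequency value."""
    r = f_hz / fn_hz
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (2.0 * zeta * r) ** 2)

def to_db(x):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(x)

FN_HZ = 17.7e3       # fundamental frequency from Table 4
F_LIMIT_HZ = 5.0e3   # upper frequency required for the device

# Damping ratios: 0.7 (the value cited in Section 3.5.3) and 1.0 (critical).
for zeta in (0.7, 1.0):
    droop_db = to_db(sdof_magnitude(F_LIMIT_HZ, FN_HZ, zeta))
    print(f"zeta = {zeta}: response at 5 kHz = {droop_db:+.2f} dB")
```

In this one-mode approximation the response at 5 kHz stays within 1 dB of the low-frequency value for both damping ratios.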



Figure 21. First six accelerometer modes.

[Plot: frequency response to 1 g base acceleration, critical damping; x-axis on a log scale.]

Figure 22. Longitudinal stress frequency response.


#### 3.3 PIEZORESISTOR

This section presents the transduction method used to convert mechanical stress to an electrically compatible parameter. A comparison of common techniques for transduction leads to the decision to use a diffused, single crystal piezoresistor. This is followed by a discussion of piezoresistance. Next, the type of piezoresistor is selected. Finally, the bridge containing the piezoresistors is described.

## 3.3.1 Transduction Methods

Three methods are currently used in mechanical microsensors to convert acceleration into an electrical signal: the variable capacitor, polysilicon piezoresistor, and crystalline silicon piezoresistor.

A variable capacitor uses the movement between mechanical elements, such as the suspended mass and the base material below it, to form a time-varying capacitance. This method has high sensitivity and low cross-sensitivity to temperature compared to resistor methods, but suffers from complex signal routing and severe nonlinearity. These drawbacks make the capacitive technique difficult to implement for smart sensors [45].

A polysilicon piezoresistor is constructed by depositing polysilicon onto an insulator, such as SiO<sub>2</sub>, that is placed over an area of high stress. This method allows for easy trimming during manufacturing and a higher operating temperature than single crystal resistors, but suffers from lower sensitivity, higher nonlinearities and an increased number of manufacturing steps [46].

Crystal silicon piezoresistors are constructed by diffusing or implanting dopants directly into the substrate material at areas that will experience stress. This method provides a high degree of linearity and sensitivity, and is the most conducive to current circuit manufacturing techniques. Its major drawback is a high sensitivity to temperature, which can be minimized with compensating circuitry. The limit in maximum temperature is due to p-n leakage currents, an effect that limits the temperature range of on-chip analog and digital circuitry as well.

The main criteria used for selecting a transduction method are linearity and BiCMOS process compatibility, both met best by the crystalline piezoresistor. The linearity is important because of the difficulty in compensating for the nonlinear distortion of the sensed parameter. Unlike cross-sensitivity parameters, nonlinear distortion cannot be directly measured and removed from the signal. Finally, process compatibility is important in minimizing the cost of the monitor.

#### 3.3.2 Piezoresistance

Piezoresistance is a bulk material property relating the change in the ratio of an applied electric field, E, to the resulting current density, J, due to a change in the applied stress,  $\sigma_{kl}$ . A tensor equation relating these variables is [47]

$$E_i = \rho J_i + \pi_{ijkl} J_j \sigma_{kl},$$

where, for crystalline silicon,  $\rho$  = isotropic unstrained resistivity,

 $\pi_{ijkl}$  = fourth-rank piezoresistivity tensor.

The expression can be further simplified using the symmetry of silicon and by switching from tensor to matrix notation [48]:

$$\rho\,\pi_{11} = \pi_{1111}, \qquad \rho\,\pi_{12} = \pi_{1122}, \qquad \rho\,\pi_{44} = 2\,\pi_{2323}.$$

This results in the following representation for the piezoresistance of silicon in matrix notation:

$$\pi_{ij} = \begin{bmatrix} \pi_{11} & \pi_{12} & \pi_{12} & 0 & 0 & 0 \\ \pi_{12} & \pi_{11} & \pi_{12} & 0 & 0 & 0 \\ \pi_{12} & \pi_{12} & \pi_{11} & 0 & 0 & 0 \\ 0 & 0 & 0 & \pi_{44} & 0 & 0 \\ 0 & 0 & 0 & 0 & \pi_{44} & 0 \\ 0 & 0 & 0 & 0 & 0 & \pi_{44} \end{bmatrix},$$

and the following set of equations:

$$\begin{split} \frac{E_1}{\rho} &= J_1 \Big[ 1 + \pi_{11} \sigma_1 + \pi_{12} (\sigma_2 + \sigma_3) \Big] + \pi_{44} (J_2 \sigma_6 + J_3 \sigma_5), \\ \frac{E_2}{\rho} &= J_2 \Big[ 1 + \pi_{11} \sigma_2 + \pi_{12} (\sigma_1 + \sigma_3) \Big] + \pi_{44} (J_1 \sigma_6 + J_3 \sigma_4), \\ \frac{E_3}{\rho} &= J_3 \Big[ 1 + \pi_{11} \sigma_3 + \pi_{12} (\sigma_1 + \sigma_2) \Big] + \pi_{44} (J_1 \sigma_5 + J_2 \sigma_4). \end{split}$$

The majority of practical applications have the electric field, current density and applied stress aligned in the same direction, such as the longitudinal stress of a beam. This results in

$$\frac{E_{1'}}{J_{1'}} = \rho(1+\pi_{11'}\sigma_{1'}) = \rho+\Delta\rho,$$

where $\pi_{11'}$ is known as the longitudinal piezoresistance coefficient, the 1' subscript denotes the common alignment direction, and $\Delta \rho$ is the change in resistivity resulting from the applied stress. The transformed piezoresistivity constant, $\pi_{11'}$, is [47]

$$\pi_{11'} = \pi_{11} + 2(\pi_{12} + \pi_{44} - \pi_{11})[a_{11}^2 a_{12}^2 + a_{11}^2 a_{13}^2 + a_{12}^2 a_{13}^2],$$

where $a_{1i}$ is the direction cosine between the new 1' direction and the old *i* direction. The sensitivity to stress will be maximized when the magnitude of $\pi_{11'}$ is maximized. The alignment quantity, in brackets, ranges between zero, if 1' is aligned with a crystallographic axis $\langle 100 \rangle$, and one-third, if 1' is aligned with a $\langle 111 \rangle$ direction. The alignment value is one-quarter for a $\langle 110 \rangle$ direction.


## 3.3.3 Piezoresistor Type

Two types of piezoresistors can be constructed by driving either p- or n-type dopants into the substrate. An example of the coefficients that result is presented in Table 5.

For p-type silicon, $\pi_{11}$ is small and $(\pi_{12} + \pi_{44} - \pi_{11})$ is large and positive, so $\pi_{11'}$ is maximized when the alignment quantity is maximized. This occurs when the piezoresistor is aligned with a $\langle 111 \rangle$ direction. For n-type silicon, $\pi_{11}$ is large and negative while $(\pi_{12} + \pi_{44} - \pi_{11})$ is positive, so the magnitude of $\pi_{11'}$ is maximized when the alignment quantity is minimized. This occurs when the piezoresistor is aligned with a $\langle 100 \rangle$ direction.

The anisotropic etching of silicon results in beams aligned with a  $\langle 110 \rangle$  direction. The longitudinal piezoresistance coefficients for both types are

$$\begin{aligned} n\text{-type:}\quad \pi_{11'} &= \left[-102.2 + 2(53.4 - 13.6 + 102.2)(0.25)\right] \times 10^{-8} \\ &= -3.12 \times 10^{-7} \quad \mu\text{m} \cdot \text{s}^2 / \text{g}, \\ p\text{-type:}\quad \pi_{11'} &= \left[6.6 + 2(-1.1 + 138.1 - 6.6)(0.25)\right] \times 10^{-8} \\ &= 7.18 \times 10^{-7} \quad \mu\text{m} \cdot \text{s}^2 / \text{g}. \end{aligned}$$

A p-type resistor yields over twice the sensitivity to stress as does the n-type for a longitudinal arrangement on a  $\langle 110 \rangle$  beam. For p-type silicon, the longitudinal piezoresistance is about one-half of  $\pi_{44}$ .
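The orientation dependence can be checked numerically. The following sketch evaluates the transformed coefficient $\pi_{11'}$ for the three principal directions using the Table 5 values; the function and variable names are hypothetical and chosen only for this illustration.

```python
import math

def longitudinal_pi(pi11, pi12, pi44, direction):
    """Longitudinal piezoresistance coefficient along `direction`.

    Evaluates pi_11' = pi11 + 2 (pi12 + pi44 - pi11) * S, where
    S = a11^2 a12^2 + a11^2 a13^2 + a12^2 a13^2 and a1i are the
    direction cosines of the resistor axis relative to the crystal axes.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    a1, a2, a3 = (c / norm for c in direction)
    s = a1**2 * a2**2 + a1**2 * a3**2 + a2**2 * a3**2
    return pi11 + 2.0 * (pi12 + pi44 - pi11) * s

# Table 5 coefficients (pi11, pi12, pi44), in units of 1e-8 um*s^2/g.
N_TYPE = (-102.2, 53.4, -13.6)
P_TYPE = (6.6, -1.1, 138.1)

for name, coeffs in (("n", N_TYPE), ("p", P_TYPE)):
    for d in ((1, 0, 0), (1, 1, 0), (1, 1, 1)):
        value = longitudinal_pi(*coeffs, d)
        print(f"{name}-type along {d}: {value:.1f} x 1e-8 um*s^2/g")
```

For the $\langle 110 \rangle$ direction this reproduces the two values computed above, and it shows the p-type magnitude growing toward $\langle 111 \rangle$ while the n-type magnitude is largest along $\langle 100 \rangle$.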

Table 5. Piezoresistance coefficients [49].

Coefficients are in units of 10<sup>-8</sup> μm·s<sup>2</sup>/g.

| Sample | Resistivity (Ω-cm) | $\pi_{11}$ | $\pi_{12}$ | $\pi_{44}$ |
|--------|--------------------|------------|------------|------------|
| n-type | 11.7               | -102.2     | 53.4       | -13.6      |
| p-type | 7.8                | 6.6        | -1.1       | 138.1      |

The resistors can be constructed by diffusing boron into the silicon base material until a surface concentration of about  $10^{19}$  cm<sup>-3</sup> is obtained, yielding a resistance of approximately  $200~\Omega/\Box$  [45] and a longitudinal piezoresistance of approximately  $50\times10^{-8}~\mu$ m-s<sup>2</sup>/g at room temperature. The piezoresistance coefficient derivation is presented in Appendix B.6. Each resistor is 3  $\mu$ m wide and 40.5  $\mu$ m long, for a resistance of 2.7 k $\Omega$ .

## 3.3.4 Piezoresistor Bridge

To take advantage of the equal magnitude but opposite sign of the developed stress, a resistor will be placed at each end of a beam. These resistors are then connected to two identical resistors off the beam to form a half-active bridge. This is shown in Figure 23, where  $R_1$  and  $R_3$  are the active piezoresistors.

The piezoresistors can be made using standard masking and diffusion processes. They are placed at the center of the beam along its length. A possible layout of the resistors on the beam is shown in Figure 24. The design uses scaled CMOS 1  $\mu$  rules except for the resistor diffusion, which follows scaled bipolar 1  $\mu$  rules [50].

The interconnects that run along the beam should be made of metal due to its lower sensitivity to stress and higher current density. The aluminum used for the interconnects is 8  $\mu$ m wide. The specification allows 0.8 mA/ $\mu$ m [50], for a total of 6.4 mA. This yields a maximum allowable supply voltage of 17.28 V across the 2.7 k $\Omega$  bridge. A bridge supply of 15 V is proposed.
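The resistor value and the supply limit follow from simple arithmetic, sketched below with the numbers quoted in Sections 3.3.3 and 3.3.4. The constant and function names are hypothetical.

```python
SHEET_RESISTANCE_OHM_PER_SQ = 200.0  # diffused resistor sheet resistance
RESISTOR_LENGTH_UM = 40.5
RESISTOR_WIDTH_UM = 3.0
LINE_WIDTH_UM = 8.0                  # aluminum interconnect width
CURRENT_LIMIT_MA_PER_UM = 0.8        # allowed current per unit line width

def resistor_ohms(sheet_ohm_per_sq, length_um, width_um):
    """Resistance equals sheet resistance times the number of squares (L / W)."""
    return sheet_ohm_per_sq * length_um / width_um

resistance_ohm = resistor_ohms(
    SHEET_RESISTANCE_OHM_PER_SQ, RESISTOR_LENGTH_UM, RESISTOR_WIDTH_UM
)
max_current_a = CURRENT_LIMIT_MA_PER_UM * LINE_WIDTH_UM * 1e-3
max_supply_v = max_current_a * resistance_ohm

print(f"resistor: {resistance_ohm:.0f} ohm")
print(f"interconnect current limit: {max_current_a * 1e3:.1f} mA")
print(f"maximum supply: {max_supply_v:.2f} V")
```

The proposed 15 V supply sits below the limit computed this way.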

#### 3.4 Performance Characteristics

This section develops performance characteristics of the piezoresistive transducer. A general bridge expression is developed to obtain the nominal bridge output. The next two segments examine how changes in bridge resistor dimensions and operating temperature affect span, offset and equivalent resistance. The following segment considers the impact of a construction variation in beam thickness on stress and fundamental frequency. The final segment discusses nonlinearity, hysteresis and repeatability by reviewing published results of related devices.

Figure 23. Piezoresistor bridge.

Figure 24. Piezoresistor placement.

The performance characteristics of each resistor include the piezoresistance, resistance and temperature sensitivity, and the time and cyclic stability of these properties. The value for each resistor in the bridge can be expressed to a first order approximation as

$$R_i = \rho(T) \left[ 1 + \pi(T) \overline{\sigma}_i \right] D \Delta_i,$$

where

 $R_i = resistor i$ 

 $\rho(T)$  = temperature dependent resistivity,

 $\pi(T) = \pi_{11'} =$  temperature dependent longitudinal piezoresistance,

 $\overline{\sigma}_i$  = average longitudinal stress in resistor i,

 $D = \frac{l}{A}$  = ratio of designed physical dimensions of resistors,

l = designed length of resistors,

A = designed cross-sectional area of resistors,

 $\Delta_i$  = fractional variations in ratio D for resistor i.

The first-order model was chosen principally due to the lack of general knowledge about higher-order coefficients [51].

In Appendix B.5, the finite element analysis results are used to develop the average stress along the resistor length as 206.3 g/ $\mu$m·s² (about 206 kPa) per g of acceleration, where the g in the stress unit denotes grams. A value of 200 g/ $\mu$m·s² is used throughout this section.

By considering the bridge structure, the properties of the resistors can be converted into the standard performance characteristics of output to applied acceleration, offset, span and equivalent resistance. The transfer function relating the output voltage, $V_O$, to the supply voltage, $V_S$, for the bridge shown in Figure 23 is

$$\frac{V_o}{V_s} = \frac{R_1 R_4 - R_2 R_3}{(R_1 + R_2)(R_3 + R_4)}.$$

The ideal transfer function is obtained by inserting the definition for the resistors and assuming that all resistors have ideal size and operate at ideal temperature. The transfer function can then be expressed as

$$\begin{split} \frac{V_o}{V_s} &= \frac{\rho(1+\pi\overline{\sigma})D \cdot \rho D - \rho D \cdot \rho(1-\pi\overline{\sigma})D}{\left[\rho(1+\pi\overline{\sigma})D + \rho D\right]\left[\rho D + \rho(1-\pi\overline{\sigma})D\right]} \\ &= \frac{2\pi\overline{\sigma}}{4-\pi^2\overline{\sigma}^2} \\ &\approx \frac{1}{2}\pi\overline{\sigma} \qquad \text{since} \quad \pi\overline{\sigma} \approx 1 \times 10^{-4} \text{ per } g, \end{split}$$

and

$$\overline{\sigma}_1 = -\overline{\sigma}_3 = \overline{\sigma}, \qquad \overline{\sigma}_2 = \overline{\sigma}_4 = 0.$$

The nominal output voltage per applied g is found using the bridge supply voltage. With a supply of 15 V,

$$V_O = \frac{1}{2}\pi\overline{\sigma}V_S = 750 \,\mu\text{V/g}.$$

Under ideal conditions, the equivalent resistances of the bridge as seen by the supply and the output amplifier are

$$\begin{split} R_{\text{eq source}} &= (R_1 + R_2) \,//\, (R_3 + R_4) = 2700\ \Omega, \\ R_{\text{eq output}} &= (R_1 + R_3) \,//\, (R_2 + R_4) = 2700\ \Omega. \end{split}$$
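The nominal figures can be reproduced with a short numerical sketch of the bridge. It uses the bridge transfer function and equivalent-resistance expressions given above with the section's nominal values (2.7 kΩ resistors, a longitudinal piezoresistance of $50\times10^{-8}$ μm·s²/g, 200 g/μm·s² of stress per g, and a 15 V supply). The helper names are hypothetical.

```python
def bridge_ratio(r1, r2, r3, r4):
    """V_O / V_S for the bridge of Figure 23."""
    return (r1 * r4 - r2 * r3) / ((r1 + r2) * (r3 + r4))

def parallel(a, b):
    """Equivalent resistance of two resistances in parallel."""
    return a * b / (a + b)

R0_OHM = 2700.0        # unstressed resistor value
PI_L = 50e-8           # longitudinal piezoresistance, um*s^2/g
STRESS_PER_G = 200.0   # average stress, g/(um*s^2) per g of acceleration
V_SUPPLY = 15.0

x = PI_L * STRESS_PER_G              # pi * sigma for 1 g of acceleration
r1, r3 = R0_OHM * (1 + x), R0_OHM * (1 - x)   # active resistors
r2 = r4 = R0_OHM                               # passive resistors

exact = bridge_ratio(r1, r2, r3, r4)  # equals 2x / (4 - x^2)
approx = 0.5 * x
v_out_uv = exact * V_SUPPLY * 1e6

print(f"pi*sigma per g      : {x:.1e}")
print(f"exact / approx ratio: {exact / approx:.9f}")
print(f"output              : {v_out_uv:.1f} uV per g")
print(f"R_eq source         : {parallel(r1 + r2, r3 + r4):.1f} ohm")
print(f"R_eq output         : {parallel(r1 + r3, r2 + r4):.1f} ohm")
```

Because $\pi\overline{\sigma}$ is so small, the exact and approximate transfer functions agree to many digits, which is why the linear form is used in the rest of the section.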

## 3.4.1 Resistor Dimensional Effects

Any dimensional changes common to all resistors will be canceled out. A change in any single resistor  $R_i$  can be represented by a percent deviation from the length/area quotient,  $\Delta_i$ . As an example, if  $R_1$  varies and the other resistors remain unchanged, the transfer function becomes

$$\frac{V_o}{V_s} = \frac{\left(1 + \pi \overline{\sigma}\right) \Delta_1 - \left(1 - \pi \overline{\sigma}\right)}{\left[\left(1 + \pi \overline{\sigma}\right) \Delta_1 + 1\right] \left[\left(1 - \pi \overline{\sigma}\right) + 1\right]}.$$

Figure 25 shows the change in output voltage per supply voltage per g due to different changes from nominal size in  $R_1$ .

A worst case occurs if the dimensions of $R_1$ and $R_4$ are increased while those of $R_2$ and $R_3$ are decreased on the same device. The transfer function becomes

$$\frac{V_o}{V_s} = \frac{\Delta^2 (1 + \pi \overline{\sigma}) - \frac{1}{\Delta^2} (1 - \pi \overline{\sigma})}{\left[\Delta (1 + \pi \overline{\sigma}) + \frac{1}{\Delta}\right] \left[\frac{1}{\Delta} (1 - \pi \overline{\sigma}) + \Delta\right]},$$

where

 $\Delta$  = fractional increase in  $R_1$  and  $R_4$ ,

 $\frac{1}{\Delta}$  = fractional decrease in  $R_2$  and  $R_3$ .

Figure 26 graphs the change in output due to acceleration for dimensional variations in  $\Delta$  from unity.

Figure 25. Bridge output for dimensional changes in R1.

Figure 26. Bridge output for dimensional changes in all resistors.

The offset, or zero, is defined as the unstressed bridge output. The span is defined as the change in output over the range of inputs from zero to maximum. Related measures are the slope, which is the slope of the line between the outputs at maximum and zero inputs, and the sensitivity, which is the span divided by the maximum operating input. The general functions for the half-active bridge are

$$\begin{split} \text{offset} &= \left.\frac{V_O}{V_S}\right|_{\overline{\sigma}=0} \\ &= \frac{\Delta_1 \Delta_4 - \Delta_2 \Delta_3}{(\Delta_1 + \Delta_2)(\Delta_3 + \Delta_4)}, \end{split}$$

$$\begin{split} \text{span} &= \left.\frac{V_O}{V_S}\right|_{\overline{\sigma}=\overline{\sigma}_m} - \left.\frac{V_O}{V_S}\right|_{\overline{\sigma}=0} \\ &= \frac{(1 + \pi \overline{\sigma}_m)\Delta_1 \Delta_4 - (1 - \pi \overline{\sigma}_m)\Delta_2 \Delta_3}{[(1 + \pi \overline{\sigma}_m)\Delta_1 + \Delta_2][(1 - \pi \overline{\sigma}_m)\Delta_3 + \Delta_4]} - \frac{\Delta_1 \Delta_4 - \Delta_2 \Delta_3}{(\Delta_1 + \Delta_2)(\Delta_3 + \Delta_4)}, \end{split}$$

where the maximum stress, $\overline{\sigma}_m$, is due to a 10 g acceleration. Expressions showing the effect on offset and span of increasing $R_1$ and $R_4$ and decreasing $R_2$ and $R_3$ are

$$\text{offset} = \frac{\Delta^{2} - \frac{1}{\Delta^{2}}}{2 + \Delta^{2} + \frac{1}{\Delta^{2}}},$$

$$\begin{split} \text{span} &= \pi \overline{\sigma}_{m} \frac{4 + 2\Delta^{2} + 2\frac{1}{\Delta^{2}} + \pi \overline{\sigma}_{m} \left(\Delta^{2} - \frac{1}{\Delta^{2}}\right)}{\left[2 + \Delta^{2} + \frac{1}{\Delta^{2}}\right]\left[2 - \pi^{2}\overline{\sigma}_{m}^{2} + \left(1 + \pi \overline{\sigma}_{m}\right)\Delta^{2} + \left(1 - \pi \overline{\sigma}_{m}\right)\frac{1}{\Delta^{2}}\right]} \\ &\approx \frac{2\pi \overline{\sigma}_{m}}{2 + \Delta^{2} + \frac{1}{\Delta^{2}}}. \end{split}$$

Figure 27 shows the effect of dimensional changes on the offset and span. The dependent axis of each graph represents fractional changes in the ratio of the resistor length to the resistor area. The graphs represent a worst-case condition that results from increasing  $R_1$  and  $R_4$  by the same factor by which  $R_2$  and  $R_3$  are decreased. The approximate formulation for span is used.

a) Dimensional effects on the span.

b) Dimensional effects on the offset.

Figure 27. Dimensional effects on the span and offset.

No more than a 3% deviation in dimensions is expected, with about 1% due to diffusion differences and 2% due to mask variations [45]. This is modeled by a dimensional increase in $R_1$ and $R_4$ of 1.5% and a decrease of $R_2$ and $R_3$ of 1.5%. The offset increases by approximately 0.5% of the supply for each 1% dimensional change in the worst case, producing a 1.5% offset. The span changes by 0.02% of the full scale due to a 3% dimensional change in the worst case.
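These worst-case figures can be checked by evaluating the general offset and span definitions directly. The sketch below uses the exact bridge expression with dimensional factors $\Delta_i$, a maximum input of 10 g, and $\pi\overline{\sigma} \approx 1\times10^{-4}$ per g; the function names and argument order are hypothetical.

```python
def bridge_ratio(x, d1=1.0, d2=1.0, d3=1.0, d4=1.0):
    """V_O / V_S for the half-active bridge.

    x is pi * sigma; d1..d4 are the dimensional factors of R1..R4.
    R1 and R3 are the active resistors with opposite stress signs.
    """
    r1, r2 = (1.0 + x) * d1, d2
    r3, r4 = (1.0 - x) * d3, d4
    return (r1 * r4 - r2 * r3) / ((r1 + r2) * (r3 + r4))

def offset(*deltas):
    """Unstressed bridge output as a fraction of the supply."""
    return bridge_ratio(0.0, *deltas)

def span(x_max, *deltas):
    """Change in output between zero and maximum input."""
    return bridge_ratio(x_max, *deltas) - bridge_ratio(0.0, *deltas)

X_MAX = 1e-4 * 10.0   # pi * sigma at the 10 g maximum
DELTA = 1.015         # 1.5% increase in R1, R4; 1.5% decrease in R2, R3
worst = (DELTA, 1.0 / DELTA, 1.0 / DELTA, DELTA)   # d1, d2, d3, d4

offset_worst = offset(*worst)
span_change = span(X_MAX, *worst) / span(X_MAX) - 1.0

print(f"worst-case offset: {100 * offset_worst:.2f}% of supply")
print(f"span change      : {100 * span_change:.3f}%")
```

Using the exact span expression instead of the approximate one changes the result only in the last digits, consistent with the approximation adopted for Figure 27.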

The steady state acceleration value must be removed for the feature extraction logic. This will be done by filtering the bridge output as discussed in Section 3.6.3. Therefore, systemic offset errors will not affect the overall operation. Also, each device will be tested for gain, with any deviation corrected by the filter gains or by adjusting thresholds. Therefore, systemic span errors will not affect the overall operation.

The variations in the equivalent resistance seen by the supply and the load are also affected by dimensional changes. The worst case occurs if all resistors change by a factor of  $\Delta$ , resulting in

$$\begin{split} R_{\text{eq source}} &= \left[ \rho D (1 + \pi \overline{\sigma}) \Delta_1 + \rho D \Delta_2 \right] \,//\, \left[ \rho D (1 - \pi \overline{\sigma}) \Delta_3 + \rho D \Delta_4 \right] \\ &= \rho D \frac{4\Delta^2 - \pi^2 \overline{\sigma}^2 \Delta^2}{4\Delta} \\ &\approx \rho D \Delta, \end{split}$$

$$R_{\text{eq load}} \approx \rho D\Delta.$$

The bridge input and output resistances vary linearly with dimensional changes.

### 3.4.2 Resistor Temperature Effects

This section examines the effects on the bridge characteristics due to changes in temperature. Throughout the following calculations, the stress and dimensions of the piezoresistors are assumed to be independent of temperature. Also, all piezoresistors are assumed to be at the same temperature, the ambient temperature of the chip.

Changes in temperature affect both the resistivity and the piezoresistivity. However, since no dimensional changes due to temperature are assumed, the resistivity change will cancel out of the bridge transfer function.



Figure 28.  $\pi_{44}$  vs. temperature.

The piezoresistance coefficient has a slightly nonlinear dependency on temperature at the proposed concentration and throughout the working temperature range. Figure 28 graphs the experimentally determined piezoresistance coefficients for p-type doping at  $9\times10^{18}$  cm<sup>-3</sup> concentration (developed from [52] in Appendix B.6).

The temperature dependent transfer function is

$$\frac{V_o}{V_s} = \frac{\left(1 + \pi(T)\overline{\sigma}\right) - \left(1 - \pi(T)\overline{\sigma}\right)}{\left[\left(1 + \pi(T)\overline{\sigma}\right) + 1\right]\left[\left(1 - \pi(T)\overline{\sigma}\right) + 1\right]}$$

$$= \frac{2\pi(T)\overline{\sigma}}{4 - \pi^2(T)\overline{\sigma}^2}.$$

Figure 29 shows the relation of bridge output to applied stress using temperature-dependent longitudinal piezoresistance approximated by taking one-half of  $\pi_{44}$  as developed in Section 3.3.3.



Figure 29. Temperature effects on the bridge output.

The zero offset is not affected by variations in temperature using the first-order model. The span is affected by temperature, resulting in the function

$$SPAN = \frac{2\pi(T)\overline{\sigma}_m}{4 - \pi^2(T)\overline{\sigma}_m^2}.$$

The temperature dependent span is graphed in Figure 30. As stated in Chapter 1, the assumed working temperature range for the monitored bearing is 20 °C ±20 °C (32 to 104 °F). Over this range, the span exhibits a 9% change, implying that some form of temperature compensation is required. Temperature compensation is discussed in Section 3.6.1.

## 3.4.3 Beam Thickness Effects

The principal manufacturing dimensional effect is the etching of the beam thickness. The process requires cutting through about 400  $\mu$m of material to leave the 10 µm thick posts. The depth of the cut determines not only the thickness of the beam, but also the width of the bottom of the beam, the length of the beam and the suspended mass. These affect both the natural frequency and the stress on the piezoresistors that, in turn, affect the span and offset. The changes in the stress per g acceleration and in the fundamental frequency due to beam thickness variations are shown in Figure 31. The graphed values of average stress and fundamental frequency were obtained using ideal beam theory calculation. They should be used to obtain an indication of the amount of variation to be expected, not of the exact values. The formulas used to obtain the graphed values are derived in Appendix B.7.

Figure 30. Temperature effects on the span.

The variation in average stress per g that results from etch thickness differences will change the sensitivity of the accelerometer. If the change is not excessive, it can be compensated for by increasing or decreasing a filter gain in the feature extraction section, or by changing the decision threshold level. Variations in fundamental frequency will not adversely affect operation as long as the frequency response remains flat to 5 kHz.
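To give a feel for the size of these variations, the sketch below applies textbook beam scaling with only the thickness varied. This is a deliberately simplified assumption: it holds the beam width, beam length and suspended mass fixed, whereas the text notes that the etch depth changes all of them, and it does not reproduce the Appendix B.7 formulas. Under that assumption, bending stress for a fixed load scales as $1/t^2$ and the natural frequency scales as $t^{3/2}$.

```python
T_NOMINAL_UM = 10.0       # nominal beam thickness
STRESS_NOMINAL = 200.0    # g/(um*s^2) per g, value used in this section
F_NOMINAL_KHZ = 17.7      # fundamental frequency from Table 4

def stress_scale(t_um, t0_um=T_NOMINAL_UM):
    """Bending stress for a fixed load scales as 1 / t^2 (sigma = M c / I)."""
    return (t0_um / t_um) ** 2

def frequency_scale(t_um, t0_um=T_NOMINAL_UM):
    """Stiffness scales as t^3, so f = sqrt(k / m) / (2 pi) scales as t^1.5."""
    return (t_um / t0_um) ** 1.5

# Thickness tolerance of +/- 1 um, as reported for the etch-stop in Section 3.5.1.
for t in (9.0, 10.0, 11.0):
    stress = STRESS_NOMINAL * stress_scale(t)
    freq = F_NOMINAL_KHZ * frequency_scale(t)
    print(f"t = {t:4.1f} um: stress ~ {stress:6.1f}, f1 ~ {freq:5.1f} kHz")
```

Even the thin case in this idealization keeps the fundamental frequency well above the 5 kHz band of interest, while the stress change is large enough to call for the gain or threshold adjustment described above.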



Figure 31. Beam thickness effects on stress and fundamental frequency.

## 3.4.4 Nonlinearity, Hysteresis and Repeatability

Nonlinearity (also known as linearity) expresses the difference between the actual output and a linear output, usually expressed as a percent of full scale output (FSO). The definition for the linear output varies. The two most common are a line between the zero input and the maximum input, and a least-squares, best fit line through the curve. Hysteresis represents deviations in the dependent output between increasing and decreasing independent variables. Often, this includes temperature cycling. Repeatability (also known as stability), represents a device's ability to resist parameter changes over time. Tests are sometimes performed while cycling independent parameters such as acceleration or temperature, sometimes done statically and sometimes as a combination of these methods. A final parameter, accuracy, is used to represent the combined error introduced by nonlinearities, hysteresis and repeatability.
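The two common nonlinearity definitions can give noticeably different numbers for the same device. The sketch below computes both for a made-up transfer curve; the data are hypothetical (a 750 μV/g line with a small quadratic term added purely for illustration) and do not represent measurements of any device.

```python
def endpoint_line(xs, ys):
    """Line through the outputs at zero input and maximum input."""
    slope = (ys[-1] - ys[0]) / (xs[-1] - xs[0])
    return slope, ys[0] - slope * xs[0]

def least_squares_line(xs, ys):
    """Best-fit straight line by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def nonlinearity_pct_fso(xs, ys, line):
    """Largest deviation from `line`, as a percent of full scale output."""
    slope, intercept = line
    fso = abs(ys[-1] - ys[0])
    worst = max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
    return 100.0 * worst / fso

# Hypothetical data: input in g, output in microvolts.
xs = [float(g) for g in range(0, 11)]
ys = [750.0 * g + 1.5 * g * g for g in xs]

end_pt = nonlinearity_pct_fso(xs, ys, endpoint_line(xs, ys))
best_fit = nonlinearity_pct_fso(xs, ys, least_squares_line(xs, ys))
print(f"end-point nonlinearity: {end_pt:.2f}% FSO")
print(f"best-fit nonlinearity : {best_fit:.2f}% FSO")
```

The best-fit definition always yields the smaller figure for a smoothly curved response, which is one reason the definition must be known before comparing published specifications.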

Each of these parameters is strongly dependent on the physical characteristics of the device under test, the test conditions, and the test procedure. Therefore, the results are difficult to predict theoretically and are usually determined experimentally. Since the device described in this paper has not been built, these parameters have not been found. The results of similar devices that have been tested will be presented to provide an indication of expected results. All of the devices discussed use implanted piezoresistors. The actual terms used in the reports have been reproduced here.

One of the first accelerometers reported has a 0.45 mg mass suspended by a cantilevered beam. This device exhibited  $\pm 1\%$  nonlinearity FSO [53].

Recent commercial accelerometers include a family of four-post accelerometers from ICSensors that exhibit an accuracy of  $\pm 0.2\%$  span typical and  $\pm 1.0\%$  span maximum. The literature defines accuracy as "repeatability, hysteresis and linearity (best fit straight line)" [54].

SenSym offers a series of four-post accelerometers that have 0.3% FSO linearity typical and 1.0% FSO linearity maximum. Linearity, for their pressure sensors, is defined as "the maximum deviation of measured output at constant temperature (25 °C) from best straight line determined by three points (offset pressure, full-scale pressure, and one-half full-scale pressure)" [55].

Micropressure sensors were developed before microaccelerometers and, therefore, a larger body of work exists to describe them. Since they are closely related to accelerometers, results should be similar.

Siemens reports a family of pressure sensors with  $\leq \pm 0.2\%$  nonlinearity,  $\leq \pm 0.2\%$  full scale temperature hysteresis, and  $\leq \pm 0.5\%$  full scale/annum long term stability. The stability was determined from temperature cycling (-60 to 150 °C, 1000 cycles), storage at temperature (150 °C for 5000 hours) and temperature-stress storage (150 °C and 15V inverse voltage for 5000 hours) [56].

Foxboro/ICT has developed a sensor with a temperature nonclosure, or "shift of zero or span after temperature cycling expressed as % of span," of below 0.1%. Also, the "long term stability of temperature slopes at zero and span" is 0.02% [57].

Finally, a recent experimental sensor from China showed the average of three units to have values of 0.15% FSO nonlinearity, 0.09% FSO hysteresis, 0.15% FSO repeatability and an overall accuracy of 0.32% FSO. The worst of the three had an accuracy of 0.43% FSO [58].

### 3.5 DEVICE CONSTRUCTION

This section considers three issues associated with the microaccelerometer fabrication. First, a technique is described for accurately micromachining the beams and mass. Second, the compatibility of BiCMOS and micromachining is briefly discussed. Third, the device packaging and its effects on damping ratio and over-acceleration protection are reviewed.

#### 3.5.1 Beam Micromachining

The basic shape of the device is determined by a wet chemical etch. This process, known as bulk micromachining, is accomplished after all electronics have been constructed and protected with a silicon dioxide layer. Traditionally, the etch has been one of the hardest fabrication procedures to control due to the ratio of starting thickness to ending thickness (approximately 40 to 1 in this application) and the dependence of the process on temperature, agitation and etchant variations.

The process can be converted from one of controlling etch rate to the more precise task of controlling epitaxial layer thickness by using an electrochemical passivation etch-stop. Electrochemical passivation involves exposing a *p-n* junction to the etchant. One side of the junction is connected to a voltage source. The other side of the source is attached to an electrode suspended in the etchant. A positive voltage applied to the junction causes the etch to proceed at a normal pace. When the junction is reached, the etch is greatly reduced and the current flow is reduced [59].

Figure 32. Etch-stop apparatus [60].

One particularly well suited method was reported by NEC, and is shown in Figure 32. A 20  $\mu$m thick, 4  $\Omega$-cm n-epitaxial layer was deposited on a 350  $\mu$m thick, 40  $\Omega$-cm p-type substrate. This was then suspended in a hydrazine-water solution at 90 °C with a platinum electrode to complete the circuit. After over two hours, the supply current dropped and etching ceased. The etching is thought to have stopped completely due to an oxide film formed at the surface of the n material. Diaphragms measuring 1 mm $\times$ 1 mm $\times$ 20  $\mu$m were constructed with a thickness variation of  $\pm 2 \mu$m due solely to epitaxial layer variation [60].

A similar process employed by the Electron Physics Laboratory at the University of Michigan produced 1 mm  $\times$  1 mm  $\times$  10  $\mu$m diaphragms using an ethylenediamine-pyrocatechol-water (EDP) etchant. The results showed 75% of the devices within  $\pm 1$   $\mu$m, the tolerance of the measurement device [45]. A similar technique yielded test membranes 1.5  $\mu$m thick [59].

## 3.5.2 Process Compatibilities

Key to manufacturing the monitor is the ability to integrate bulk micromachining, analog electronics and digital electronics on the same substrate. A BiCMOS process, blending the best characteristics of bipolar and CMOS devices, offers the best opportunity to meet the performance characteristics required by the monitor.

To date, no descriptions of BiCMOS bulk micromachined devices have been published. However, all the component technologies appear to be available. Bulk micromachined sensors compatible with either CMOS electronics [58, 61] or bipolar electronics [62, 63] have been reported. BiCMOS has been successfully combined with surface micromachining to create a high-g accelerometer [64].

Mixed analog and digital BiCMOS processes have been developed for the telecommunications industry [65-69]. BiCMOS digital circuits can offer the high density and low power dissipation of CMOS and the increased drive ability and subsequent reduced switching time of bipolar processes. In analog systems, the higher transconductance and lower noise levels achieved with bipolar transistors can be coupled with the ability to produce simple and accurate switches, high impedance inputs, charge storage nodes and complementary transistors [70]. One technology offers supply voltages to 18 V, typical digital gate delays of 4.5 ns, interface cells corresponding to MIL STD 883C, and ROM, PLA and switched capacitor filter compilers [67].

The key to combining BiCMOS and bulk micromachining is properly preparing the wafer for etching prior to electronic processing steps. The growth of a 10  $\mu$m n-type epitaxial layer on a *p* substrate is compatible with both bipolar [71] and CMOS circuitry [61]. This *p-n* junction forms the stop for the etch process described in Section 3.5.1.

#### 3.5.3 Packaging Benefits

This section examines how packaging can be used to set the damping ratio and prevent over-acceleration damage. The packaging consists of two additional micromachined wafers that form a cavity around the wafer containing the device. Figure 33 shows a conceptualized cross-section of such a device available from SenSym.

Proper damping of the device is required to limit travel at resonant frequency and to assure a flat vibration response to 5 kHz. In an experiment with one of the first microaccelerometers, various fluids were tried. The results showed a linear relationship between viscosity and damping ratio. In order to come close to the desired damping value of 0.7, a silicone oil was proposed [53]. Unfortunately, these oils have densities close to that of bulk silicon, reducing the effective mass and, hence, the sensitivity [72].

Figure 33. Accelerometer package cross-section [55].

Recent devices use squeeze-film damping. This technique limits the flow of air around the moving mass. Air is a good fluid for this purpose because of its low density and because its viscosity is relatively insensitive to temperature variations. A  $\pm 15\%$  variation in viscosity from -30 to 75 °C was reported to cause a  $\pm 1.2$  dB change in sensitivity at resonance for a critically damped part [73]. The viscosity can be adjusted by opening channels in or narrowing the base material near the mass.
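The reported sensitivity change is roughly what a simple second-order model predicts. The sketch below assumes a single-degree-of-freedom response, whose magnitude at the undamped natural frequency is $1/(2\zeta)$, and assumes the damping ratio is proportional to viscosity, in line with the linear relationship noted earlier in this section. Both are modeling assumptions made here, not details taken from [73].

```python
import math

def resonance_gain_db(zeta):
    """Magnitude at the undamped natural frequency of a second-order system, in dB."""
    return 20.0 * math.log10(1.0 / (2.0 * zeta))

def sensitivity_shift_db(viscosity_change, zeta_nominal=1.0):
    """Change in gain at resonance when viscosity changes by a fraction.

    Assumes the damping ratio scales linearly with viscosity.
    """
    zeta = zeta_nominal * (1.0 + viscosity_change)
    return resonance_gain_db(zeta) - resonance_gain_db(zeta_nominal)

for change in (-0.15, 0.15):
    shift = sensitivity_shift_db(change)
    print(f"viscosity {change:+.0%}: sensitivity at resonance {shift:+.2f} dB")
```

With these assumptions a 15% viscosity change moves the gain at resonance by a little over 1 dB in either direction, the same order as the ±1.2 dB figure quoted above.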

Over-acceleration is a problem with microaccelerometers. Any acceleration large enough to create a stress in excess of the yield will damage the device. Since this maximum stress is dependent on the mass displacement, limiting the mass travel will eliminate this mode of failure. This is accomplished by using stops placed on both the top and bottom support structures, as shown in Figure 33. ICSensors [73], SenSym [55] and Nova Sensors [74] all produce devices utilizing this technique. One Nova Sensors product has a working range of  $\pm 2 g$  with an over-acceleration range of 1500 g in any direction.

### 3.6 ELECTRONIC CONSIDERATIONS

This section covers issues related to converting the change in resistance to a voltage that represents the acceleration. It opens with a discussion of temperature compensation and the bridge supply. Next, the principal sources of noise and their effect on the system are considered. The specifications for the bridge output amplifier are then developed. The final segment describes the analog electronic subsystem for amplification and offset reduction.

## 3.6.1 Temperature Compensation

Analysis in Section 3.4.2 indicates that changes in device operating temperature strongly affect the piezoresistor bridge. To a first-order approximation, the offset is not affected by temperature. However, some form of compensation must be applied to correct for the 9% variation in span across the range of 0 to 40 °C.

Many techniques have been proposed to compensate for temperature variations. Passive compensation is the simplest method. Resistors are placed across portions of the bridge to modify the balance and output level. An ICSensors pressure sensor uses three external resistors to compensate for offset, temperature coefficient of offset and temperature coefficient of span [75]. The major drawbacks of this technique are the requirement for testing under operating conditions and the addition or adjustment of passive components after the device is constructed.

A second method involves a temperature sensitive gain element placed in the bridge output circuit. Devices use resistors with positive temperature coefficients in feedback to compensate for the negative temperature coefficients of the piezoresistance. Several pressure sensors have been built implementing this technique with bipolar [76] or MOS circuitry [77].

A third method introduces a reference signal for calibration or comparison. One proposed device uses electrostatic deflection to place a known stress on the piezoresistive bridge. This stress is used to calculate span compensation in a ratiometric analog-to-digital converter (ADC) [78]. A constructed accelerometer uses the difference between the output of half-active bridges placed on support beams and the outputs from similar bridges on dummy beams, removing temperature and construction induced offsets [73]. This technique requires either multiple matched amplifiers, or multiple or shared ADCs. Due to its comparative nature, this method has a greater impact on offset effects than on span effects.

A fourth method increases the supply voltage as a function of temperature to compensate for the negative temperature coefficient of the piezoresistance. One implementation uses two npn transistors with different current densities to generate a voltage proportional to absolute temperature. This signal is amplified and added to a temperature insensitive band-gap reference voltage. The sum is increased to provide the bridge supply [79]. Several integrated pressure sensors have been constructed using similar techniques [63, 80].

The bridge supply modification method is recommended for the bearing monitor because of its proven ability to reduce the temperature coefficient of span and because it is easily integrated into the monitor system. A block diagram of an implementation is shown in Figure 34.

The bridge supply voltage,  $V_s(T)$ , can be expressed as

$$V_{S}(T) = G \cdot (V_{ref} + kV_{temp}) = G \cdot (V_{ref} + \alpha T),$$

where

 $V_{ref}$  = temperature independent reference voltage,

 $V_{temp}$  = temperature dependent voltage,

G, k = amplifier gains,

 $\alpha$  = temperature gain factor (V/°C),

T = temperature (°C).

The values of G and  $\alpha$  were found using the least squares method [81] on the temperature dependent span values at three points, the nominal temperature (20 °C) and the operating limit temperatures (0 and 40 °C). With a reference voltage of 0.644 V [79], the resulting values are

$$G = 1.48 \times 15 = 22.2 \text{ V/V},$$

$$\alpha = 2.28 \text{ mV/}^{\circ}\text{C}.$$
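As a rough numerical check of the supply equation, the short Python sketch below evaluates $V_S(T)$ at the three fitting temperatures using the values quoted above. The function and constant names are illustrative and are not part of the original design.

```python
# Illustrative names; the numerical values are those quoted in the text.
V_REF = 0.644      # temperature-independent reference voltage (V) [79]
ALPHA = 2.28e-3    # temperature gain factor (V/°C)
GAIN = 22.2        # overall supply gain G (V/V)


def bridge_supply(temp_c):
    """Bridge supply voltage V_S(T) = G * (V_ref + alpha * T), with T in °C."""
    return GAIN * (V_REF + ALPHA * temp_c)


for temp_c in (0, 20, 40):
    print(f"T = {temp_c:2d} °C  ->  V_S = {bridge_supply(temp_c):.2f} V")
```

With these values the supply rises by roughly 1 V for each 20 °C, which is the positive slope intended to offset the falling span.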



Figure 34. Temperature compensating bridge supply.

Figure 35 presents two graphs demonstrating the effectiveness of this scheme. The first graph shows the temperature span with negative temperature sensitivity, the normalized linear bridge supply with positive temperature sensitivity and the resulting compensated span. The second graph plots the compensated and uncompensated spans as a percentage of the uncompensated span at 20 °C (% change in span). The variation in span over the desired temperature range is reduced from 9.09% to 0.34%.

These results represent a "best case" scenario. The voltage reference and electronics are assumed to be independent of temperature, the temperature sensitive voltage source is assumed to be linear with temperature, and the gain values G and  $\alpha$  are assumed to be precisely adjustable. One device, constructed with separate integrated circuits for the electronics and the piezoresistive pressure sensor and with external adjustable gain potentiometers for  $\alpha$ , reported a deviation in sensitivity of 0.78% in the range of -40 to  $100~^{\circ}$ C [79]. This demonstrates the effectiveness of the method, apart from the impact of gain variations. The second graph of Figure 35 includes the effects of increasing G by 5% while  $\alpha$  is increased and decreased by 5%. The average change in span is about 5%, representing an increase in overall sensitivity. The deviation in span across the temperature range of interest increases to 0.35% when  $\alpha$  is decreased by 5% and to 0.57% when  $\alpha$  is increased by 5%. The average span increase is fixed in a given device and can be compensated by adjusting the digital filter gains or mean squared threshold value in the feature extraction section. Decreases in gain values G and  $\alpha$  yield similar results.

This method is particularly suited to the sensor system described in Chapter 1. The temperature sensor, voltage reference and associated electronics can be located on the common interface integrated circuit chip. The final gain, G, can be set for each specific sensor type on its integrated circuit chip.





Figure 35. Temperature compensation of the span.

## 3.6.2 Noise Effects

The two major sources of electronic noise are expected to be the piezoresistors and the bridge output amplifier.

Three primary forms of noise are typical in electrical circuits: shot noise, thermal noise and 1/f noise. Work on pressure sensors indicates that the thermal noise is the dominant source in piezoresistor based sensors [82]. The effect of the thermal noise on the system can be determined by passing the power spectral density (PSD) of the noise produced by the resistors through a transfer function representing the monitor electronics up to the feature extraction stage.

Thermal noise has a flat power spectrum throughout the frequencies under consideration. Its PSD,  $S_n(\omega)$ , can be expressed as [83]

$$S_n(\omega) = 2kTR$$

where

 $k = 1.38 \times 10^{-23}$  J/K (Boltzmann's constant),

 $T$  = temperature in K,

 $R$  = resistance in  $\Omega$ .

The noise PSD at the feature extraction stage,  $S_R(\omega)$ , is

$$S_{R}(\omega) = S_{n}(\omega) |H(\omega)|^{2}$$

The noise power,  $P_R$ , is found by integrating  $S_R(\omega)$  over all frequencies, or

$$P_{R} = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_{R}(\omega)\, d\omega = 4kTR \int_{0}^{\infty} |H(f)|^{2}\, df.$$

The system transfer function, H(s), is developed in Section 4.5.2. The area of the magnitude squared of the transfer function is numerically integrated in Appendix C.6, and is found to be 1.32×10<sup>9</sup> Hz. The area includes the voltage gain through the amplifier of 388 calculated in Section 4.3. The resulting noise power at 20 °C is 5.77×10<sup>-8</sup> V<sup>2</sup>.

Due to the small signal level put out by the transducer, the first amplifier is critical to system operation. Principal among its characteristics must be a low level of noise. Several projects report amplifiers in BiCMOS with an equivalent input noise voltage density of 8 nV/ $\sqrt{\text{Hz}}$  RMS [67, 84]. One particularly quiet amplifier designed for an audio tape preamplifier has a measured input-referred noise level of 300 nV RMS CCIR/ARM over the range of 20 Hz to 20 kHz [85]. An input-referred density of 8 nV/ $\sqrt{\text{Hz}}$  will be used in noise calculations for this project.

Assuming a flat PSD and the same transfer function used to calculate the resistor noise power, the amplifier noise power,  $P_E$ , is

$$P_E = (8 \times 10^{-9})^2 (1.32 \times 10^9) = 8.46 \times 10^{-8} \text{ V}^2.$$

The combined noise power is the sum of the constituent powers, or

$$P_C = P_R + P_E = 1.42 \times 10^{-7} \text{ V}^2.$$

This results in 377 µV RMS output noise.

A temperature increase to 40 °C raises the resistor noise power about 7%. If temperature affects the amplifier in a similar fashion, the combined noise power becomes  $1.52 \times 10^{-7} \text{ V}^2$ .

An indication of the effect of the noise on the system can be obtained by comparing its amplitude to that of an expected signal. From the bearing model derived in Chapter 2, the minimum mean squared value in the range of 0 to 5 kHz is above  $0.01 g^2$ , or 8.47×10<sup>-4</sup> V<sup>2</sup> at the amplified sensitivity of 291 mV/g. This results in a signal-to-noise ratio (SNR) of 37.8 dB. The SNR decreases to 37.5 dB at 40 °C.
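The noise budget above can be retraced with a few lines of Python. This is a sketch under the values stated in the text: the resistor noise power is taken as given, since the bridge resistance is not restated here, and all names are illustrative. Small differences in the last digit relative to the quoted figures come from rounding of intermediate results.

```python
import math

AREA_HZ = 1.32e9        # integral of |H(f)|^2 from Appendix C.6 (Hz)
P_RESISTOR = 5.77e-8    # resistor thermal noise power at 20 °C (V^2), as quoted
AMP_DENSITY = 8e-9      # amplifier input-referred noise density (V/sqrt(Hz))
SENSITIVITY = 0.291     # overall sensitivity after amplification (V/g)
MIN_SIGNAL_G2 = 0.01    # minimum mean squared acceleration (g^2)


def noise_budget(p_resistor=P_RESISTOR):
    """Return (amplifier power, combined power, RMS noise, SNR in dB)."""
    p_amp = AMP_DENSITY ** 2 * AREA_HZ          # flat PSD through the same H(f)
    p_total = p_resistor + p_amp                # uncorrelated sources add in power
    rms = math.sqrt(p_total)
    p_signal = MIN_SIGNAL_G2 * SENSITIVITY ** 2
    snr_db = 10 * math.log10(p_signal / p_total)
    return p_amp, p_total, rms, snr_db


p_amp, p_total, rms, snr_db = noise_budget()
print(f"P_E = {p_amp:.3e} V^2, P_C = {p_total:.3e} V^2")
print(f"RMS noise = {rms * 1e6:.0f} uV, SNR = {snr_db:.1f} dB")
```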

Another useful indication of the noise effect is to compare the combined resistor and amplifier noise to the quantization noise of the ADC. Assuming ideal quantization, the quantization noise power,  $P_q$ , is

$$P_{q} = \frac{m_{p}^{2}}{3L^{2}},$$

where

 $m_p$  = peak value (10 g, 2.91 V),

L = number of quantization levels.
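For a sense of scale, the sketch below evaluates $P_q$ for several word widths, assuming $L = 2^b$ levels for a $b$-bit quantizer (an assumption made here for illustration; the text only defines $L$). The circuit noise figure used for comparison is the combined power quoted earlier in this section.

```python
M_PEAK = 2.91              # peak converter input m_p (V), corresponding to 10 g
P_CIRCUIT = 1.42e-7        # combined resistor and amplifier noise power (V^2)


def quantization_noise_power(bits, m_peak=M_PEAK):
    """Ideal quantization noise power P_q = m_p^2 / (3 L^2), with L = 2**bits."""
    levels = 2 ** bits
    return m_peak ** 2 / (3 * levels ** 2)


for bits in (8, 9, 10, 12):
    p_q = quantization_noise_power(bits)
    print(f"{bits:2d} bits: P_q = {p_q:.3e} V^2, P_q / P_circuit = {p_q / P_CIRCUIT:.1f}")
```

Each added bit lowers $P_q$ by a factor of four, so the quantization noise dominates at short word lengths and approaches the circuit noise only near 12 bits under these assumptions.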

Figure 36 shows the nominal combined resistor and amplifier SNR at 20 °C, quantization SNR and cumulative "total" SNR.



Figure 36. Approximate system noise effects.

## 3.6.3 Amplification and Offset Reduction

Offsets present problems in microsensors because their values are often several orders of magnitude greater than desired signal levels. An offset from the monitor accelerometer will appear as a large mean squared error in the feature extraction section. Therefore, some form of offset elimination must be employed. Table 6 lists offset sources for the monitor accelerometer.

The effect of resistor size variations on the offset is developed in Section 3.4.1. The four bridge resistors are expected to show no more than a 3% deviation in value between them due to size variations.

Table 6. Offset sources.

| Source              | Offset (mV) | Comment                         |
|---------------------|-------------|---------------------------------|
| bridge resistors    | ±225        | 3% variation between resistors  |
| temperature         | ±15         | estimated thermal stress effect |
| single-sided supply | 7.5         | 60 dB CMRR                      |
| gravity             | ±0.75       | 1 g × 750 μV/g                  |

Section 3.4.2 shows that, to a first-order approximation, temperature does not affect the bridge offset. However, stress on the beam results from the bimetal effect between the silicon and silicon dioxide layers due to the differences in thermal expansion coefficients [78]. Additional thermal stresses may be introduced by packaging. The maximum of the temperature dependent offset depends on how the transducer is constructed. Recent improvements in manufacturing have reduced uncompensated thermal offset to  $\pm 2\%$  of the full scale output between -40 and 85 °C in a  $\pm 2$  g commercial sensor. The sensitivity of the sensor is 2.5 mV/V/g [86]. A 15 V supply results in an offset of

$$V_{\text{offset}} = 2\% \times 2.5 \text{ mV/V/} g \times 15 \text{ V} \times 2 g = 1.5 \text{ mV}.$$

The commercial accelerometer has a modified cantilever design, significantly different from the four post design used in this project. Also, the commercial accelerometer has no integrated electronics, allowing the manufacturing process to be optimized for the sensor characteristics. A factor of 10 is introduced to compensate for these differences. This brings the thermal offset effect close to the values observed in experimental sensors [78].

The use of a single-sided bridge supply introduces an offset due to the common mode gain of the differential amplifier connected to the bridge output. Because the bridge is designed to be balanced, the common mode input will be one-half the bridge supply voltage, or 7.5 V. A conservative 60 dB common mode rejection ratio (CMRR) for the amplifier will result in a 7.5 mV offset.

A final source of "offset" is the effect of gravity. Since the sensor can be mounted in any orientation, a constant signal of up to  $\pm 750~\mu V$  may be combined with the vibration signal. Since this constant will add to the overall mean squared value, it should be removed before feature extraction.
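The individual offset estimates can be collected in a short Python sketch. The bridge resistor figure is taken directly from Table 6 rather than rederived, and adding the magnitudes into a single worst-case value is an assumption made here for illustration only.

```python
SUPPLY_V = 15.0            # bridge supply (V)
SENSOR_SENS = 750e-6       # transducer sensitivity (V/g), from Table 6

bridge_mismatch = 225e-3   # taken directly from Table 6 (V)
# Commercial-sensor thermal offset scaled by the factor of 10 used in the text.
thermal = 10 * 0.02 * 2.5e-3 * SUPPLY_V * 2
# Common-mode input of V_S / 2 attenuated by a 60 dB CMRR.
common_mode = (SUPPLY_V / 2) / 10 ** (60 / 20)
gravity = 1.0 * SENSOR_SENS

worst_case = bridge_mismatch + thermal + common_mode + gravity   # assumed simple sum
signal_rms = 0.1 * SENSOR_SENS                                   # 0.1 g RMS input

print(f"thermal = {thermal * 1e3:.1f} mV, common mode = {common_mode * 1e3:.1f} mV")
print(f"worst case = {worst_case * 1e3:.1f} mV, "
      f"ratio to 0.1 g signal = {worst_case / signal_rms:.0f}")
```

Under these assumptions the combined offset is a few thousand times the signal from a 0.1 g RMS input, which is why it must be removed ahead of the main gain stage.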

The offset voltage can be removed by analog or digital filtering. However, the offset voltage is almost four orders of magnitude greater than the signal produced by a 0.1 g RMS input. If the offset were not filtered out first, amplifying the vibration signal enough to obtain a sufficient signal-to-quantization noise level would also amplify the offset to a constant value that would saturate the amplifiers. Figure 37 presents a conceptual subsystem diagram for amplifying and high-pass filtering prior to the ADC.



Figure 37. Offset reduction and amplification.

The first amplifier gain of 20 is designed to increase the differential signal level without allowing the DC level to exceed 5 V. The resulting vibration signal level is 15 mV/g (20 × 750 μV/g). The coupling capacitor,  $C_c$ , forms a first-order high-pass filter with the transfer function

$$H(s) = \frac{-s\frac{R_f}{R_i}}{s + \frac{1}{R_iC_c}}.$$

The cut off frequency,  $f_c$ , is

$$f_{\rm c} = \frac{1}{2\pi R_{\rm i} C_{\rm c}}.$$

The bearing vibration signal contains relevant spectral components above 100 Hz, suggesting a cut off frequency no greater than 10 Hz. The input resistor value must be limited in magnitude due to construction and thermal noise limitations, forcing a large capacitance value. For example, a 16 k $\Omega$  resistor will require a 1  $\mu$F capacitor to produce a 9.9 Hz cut off. The size of this capacitor prohibits its integration on the sensor IC.
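The component values in this example can be checked with the sketch below, which evaluates the cut off frequency and the magnitude of the high-pass transfer function given above. The feedback resistor value is not stated in the text, so a unity ratio $R_f/R_i$ is assumed purely for illustration; the function names are likewise hypothetical.

```python
import math


def cutoff_hz(r_in, c_couple):
    """Cut off frequency f_c = 1 / (2 pi R_i C_c)."""
    return 1.0 / (2 * math.pi * r_in * c_couple)


def capacitor_for_cutoff(r_in, f_c):
    """Coupling capacitance needed for a given cut off frequency."""
    return 1.0 / (2 * math.pi * r_in * f_c)


def highpass_gain(freq_hz, r_in, r_f, c_couple):
    """|H(j 2 pi f)| for H(s) = -s (R_f / R_i) / (s + 1 / (R_i C_c))."""
    w = 2 * math.pi * freq_hz
    w_c = 1.0 / (r_in * c_couple)
    return (r_f / r_in) * w / math.hypot(w, w_c)


R_IN, C_C = 16e3, 1e-6
R_F = R_IN                      # assumed unity ratio, not from the text
print(f"f_c = {cutoff_hz(R_IN, C_C):.2f} Hz")
print(f"C for 10 Hz = {capacitor_for_cutoff(R_IN, 10.0) * 1e6:.2f} uF")
print(f"|H| at 100 Hz = {highpass_gain(100.0, R_IN, R_F, C_C):.4f}")
```

With a cut off near 10 Hz, components at 100 Hz and above pass with less than 1% attenuation.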

The second amplifier stage has a gain of 19.4, resulting in an overall amplifier gain of 388 V/V, or 291 mV/g. The amplifier gain is calculated in Section 4.3. The signal is then low-pass filtered and digitized. Chapter 4 presents the design of the ADC subsystem.

# CHAPTER 4 OVERSAMPLING A/D CONVERTER

The conversion of the analog transducer signal to a digital format is an integral part of a smart sensor. One reason is that the transducer signal must be digitized if digital logic is used for on-chip information processing. Additionally, a digital format provides greater immunity to noise.

Chapter 2 developed minimum specifications required by the monitor for digitized accelerometer signals. These constraints result in the following analog-to-digital converter (ADC) specifications:

- 5 kHz conversion frequency
- 9-bit accuracy on chip
- support for 12-bit accuracy off chip
- $\pm 10 g$  maximum input signal range
- 40 dB minimum dynamic range

In addition, the following design goals should be met to minimize the cost of the device and facilitate solid state manufacturing:

- No precision components
- Small size and low complexity
- BiCMOS processing compatibility.

This chapter opens by explaining the reasons for choosing an oversampling A/D converter over other architectures. A description of the basic operation of a sigma-delta modulating converter is given in Section 4.2. A calculation of the amplification of the transducer signal prior to conversion and the oversampling ratio is presented in Section 4.3. This is followed in Section 4.4 by the development of the decimation filter, including a new classification scheme and a novel implementation for minimum area. Section 4.5 discusses details of the final converter system. Section 4.5.1 provides a summary of the system and simulation results are in Section 4.5.4. Appendix C gives details of calculations used in Chapter 4.

### 4.1 SELECTION OF A/D CONVERTER TYPE

Many architectures are available for converting analog signals to digital. Most can be grouped into one of four classes: serial, successive approximation, parallel and oversampling.

Serial converters include single-slope and dual-slope devices. A single-slope converter compares the input voltage to an increasing ramp voltage that starts at the minimum level. A counter records the number of clock periods until the ramp voltage exceeds the input voltage. If the maximum clock value corresponds to the time required for the ramping voltage to reach the largest converter voltage and the minimum clock value corresponds to the smallest voltage, the clock count is the digitized output. A dual-slope converter uses a reference voltage in the count-up phase and the input voltage in the count-down phase. The ratio of the counts is equal to the ratio of the voltages. Serial converters require a reference voltage, but do not require precision components. The major drawback for this application is the slow operating speed, limiting conversion frequencies to less than 100 Hz [50].

Successive approximation converters use a comparator and a digital-to-analog converter to make a sequence of estimates about the input voltage, each estimate deciding the next significant bit. Different architectures result from different D/A converters, including voltage scaling, charge scaling, serial and algorithmic. These methods require a voltage reference. Both the charge and the voltage scaling methods also require ladder arrangements of resistors or capacitors with precise ratios. The serial method is based on charge distribution between precisely matched capacitors. The algorithmic method uses a pipelined architecture to sum the analog output due to each bit, requiring precise capacitor matching in the summers and low noise circuitry for the low-order bits.

Parallel, or flash, converters use resistor or capacitor ratios to develop all the output bits simultaneously. Although very fast, these devices require large amounts of hardware with precise ratios and a precise voltage or current reference.

Oversampling converters sample the input signal at a rate many times faster than the desired Nyquist rate, then exchange this precision in time for precision in amplitude using digital filtering. Hence, oversampling replaces the need for exact analog components with digital processing, a situation ideally suited for VLSI implementation. The only precise component required is a clock, necessary for the digital processing later in the monitoring algorithm.

### 4.2 BASIC SDM CONVERTER OPERATION

This section outlines the operation of a sigma-delta modulated (SDM) analog-to-digital converter. The description represents a summary of work in the area [87-95]. To be consistent with the literature, single-sided spectra will be used.

Standard A/D converters consist of a sharp antialiasing low-pass filter followed by sampling at the Nyquist frequency,  $f_N$ , and n-bit quantizing, as shown in Figure 38a. This process is known as pulse code modulation (PCM). Errors in this type of conversion result from having a non ideal low-pass filter, inaccuracies and noise in the electronics, and finite quantization. Considering only the latter, the output can be expressed as

$$y_i = y(nT_S) = x(nT_S) + e(nT_S),$$

where  $e(nT_S)$  is the quantization error. If the quantizer has L levels and a working range of  $\pm m_p$ , the error e will fall within the range of  $\pm \frac{1}{2} \Delta$ , where

$$\Delta = \frac{2m_p}{L}.$$

Figure 38. Traditional and oversampling ADC models.

The error is usually assumed to be uniformly distributed and uncorrelated with the input. If it is not, a dither signal can be added before quantization. Under this assumption, the error has an RMS value of

$$e_{\rm RMS}=\frac{\Delta}{\sqrt{12}},$$

with a spectral density of

$$E(f) = e_{\rm RMS} \sqrt{2T_{\rm S}}.$$

The effects of quantization can be reduced if the signal is oversampled before it is quantized, as shown in Figure 38b. The desired signal is recovered by low-pass filtering at the conversion frequency,  $f_0$ , followed by downsampling. This process is referred to as decimation. Oversampling uniformly distributes the quantization noise across the sampled frequency range, resulting in a total noise value in the signal band,  $n_0^2$ , of

$$n_0^2 = \int_0^{f_0} E^2(f) df = \frac{e_{RMS}^2}{N},$$

where N is the ratio of the oversampling frequency to the Nyquist frequency, known as the oversampling rate (OSR). Each doubling of the oversampling rate yields an increase of 3 dB in the signal-to-noise ratio (SNR), equivalent to an additional ½ bit of resolution in the converted word.

## 4.2.1 Single-loop SDM

Sigma-delta modulators are used to reshape the noise spectrum, pushing noise out of the conversion range. Figure 38c shows an A/D converter system using a modulator quantizer, and Figure 39 presents a block diagram of a single-loop SDM and a discrete model with a 1-bit quantizer.

The output of a single-loop, or first-order SDM, is the quantized accumulation of the error between the input and the converted input. The average value of successive output values will approximate the input. The output can be expressed in terms of the input and quantization error as

$$y_i = w_i + e_i = x_{i-1} + e_i - e_{i-1}.$$
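The difference equation can be illustrated with a small simulation. The sketch below assumes one common discrete arrangement, a delaying integrator followed by a 1-bit quantizer with output levels of ±1; the exact model of Figure 39 is not reproduced here, so the structure and the names are illustrative.

```python
def first_order_sdm(samples):
    """Simulate a single-loop SDM; returns (outputs y_i, quantization errors e_i)."""
    w = 0.0                 # integrator state w_i
    x_prev = y_prev = 0.0   # delayed input and delayed output
    outputs, errors = [], []
    for x in samples:
        w = w + x_prev - y_prev          # accumulate the error between input and output
        y = 1.0 if w >= 0 else -1.0      # 1-bit quantizer
        outputs.append(y)
        errors.append(y - w)             # e_i = y_i - w_i
        x_prev, y_prev = x, y
    return outputs, errors


x = [0.3] * 1000
y, e = first_order_sdm(x)
print(f"mean of output = {sum(y) / len(y):.3f}")   # approximates the constant input
```

For a constant input within the quantizer range, the average of the ±1 output stream approaches the input value, and each output sample satisfies $y_i = x_{i-1} + e_i - e_{i-1}$.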

Defining the noise  $n_i$  as the difference between the output and the delayed input gives

$$n_i = e_i - e_{i-1},$$

with a spectral density of

$$N_1(f) = E(f)\left(1 - e^{-j2\pi f T_S}\right),$$

$$|N_1(f)| = 2E(f) \sin(\pi f T_S) = 2e_{\text{RMS}} \sqrt{2T_S} \sin(\pi f T_S).$$

Figure 39. Single-loop SDM: a) block diagram; b) 1-bit discrete model.

Figure 40 shows how the noise is shaped in the frequency domain for a 64-times oversampled system. The subscript indicates the modulator order, with  $N_0(f)$  representing oversampling without modulation.

If a low-pass filter is used to decimate the modulator output, most of the quantization noise can be eliminated. The resulting in-band noise power,  $n_1^2$ , is

$$n_1^2 = \int_0^{f_0} |N_1(f)|^2\, df \approx e_{\text{RMS}}^2 \frac{\pi^2}{3} (2f_0 T_S)^3, \quad \text{since } f_S^2 \gg f_0^2.$$



Figure 40. Conversion noise spectral density.

Each doubling of the oversampling rate results in a 9 dB SNR gain, yielding a 1½ bit increase in output word width. The effect of increasing the oversampling rate on normalized SNR for various oversampling configurations is shown in Figure 41. Ideal decimation is assumed in the graph.

## 4.2.2 Double-loop SDM

Additional integrators and feedback paths can be added to further push the quantization noise out of the conversion band. A second-order SDM is constructed by placing a second loop around the first-order system. A block diagram and an equivalent discrete model are shown in Figure 42.

Using the discrete model, the output of the modulator is

$$y_i = x_{i-1} + e_i - 2e_{i-1} + e_{i-2} = x_{i-1} + n_i$$
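A simulation sketch of a double-loop modulator follows. It assumes a non-delaying first integrator and a delaying second integrator with a ±1 quantizer, one common arrangement that reproduces the output equation above; the exact model of Figure 42 is not reproduced here, and the names are illustrative.

```python
def second_order_sdm(samples):
    """Simulate a double-loop SDM; returns (outputs y_i, quantization errors e_i)."""
    u1 = u2 = 0.0      # first and second integrator states
    y_prev = 0.0       # delayed quantizer output
    outputs, errors = [], []
    for x in samples:
        u2 = u2 + u1 - y_prev            # delaying integrator: uses last u1 and last y
        y = 1.0 if u2 >= 0 else -1.0     # 1-bit quantizer
        outputs.append(y)
        errors.append(y - u2)            # e_i = y_i - u2_i
        u1 = u1 + x - y                  # non-delaying integrator, outer loop
        y_prev = y
    return outputs, errors


x = [0.3] * 2000
y, e = second_order_sdm(x)
print(f"mean of output = {sum(y) / len(y):.3f}")
```

Once the start-up samples have passed, every output satisfies $y_i = x_{i-1} + e_i - 2e_{i-1} + e_{i-2}$, and the output average again tracks the constant input.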



Figure 41. Effect of OSR on quantization noise.



Figure 42. Double-loop SDM.

The quantization noise spectrum  $N_2(f)$  becomes

$$N_2(f) = E(f)\left(1 - e^{-j2\pi f T_S}\right)^2,$$

$$|N_2(f)| = 4e_{\text{RMS}} \sqrt{2T_S} \sin^2(\pi f T_S).$$

Figure 40 includes the noise spectrum for the second-order SDM. The total in-band noise power,  $n_2^2$ , obtained by integrating to  $f_0$ , is

$$n_2^2 \approx e_{\text{RMS}}^2 \frac{\pi^4}{5} (2f_0 T_S)^5, \quad \text{since } f_S^2 \gg f_0^2.$$

This yields an increase of 15 dB SNR for each doubling of the OSR, an increase of 2½ bits in the A/D converter output word width. Figure 41 includes the relative effects of the OSR on the SNR for a double-loop system.
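The three in-band noise expressions share one pattern, which the sketch below uses to confirm the 3, 9 and 15 dB improvements per doubling of the OSR. It relies on the identity $2f_0T_S = 1/N$ implied by the definition of N; the function names are illustrative.

```python
import math


def inband_noise_power(order, osr, e_rms=1.0):
    """In-band quantization noise power.

    order 0: oversampling without modulation, e^2 / N
    order 1: single-loop SDM, e^2 (pi^2 / 3) / N^3
    order 2: double-loop SDM, e^2 (pi^4 / 5) / N^5
    """
    exponent = 2 * order + 1
    return e_rms ** 2 * math.pi ** (2 * order) / exponent / osr ** exponent


def gain_per_doubling_db(order, osr=64):
    """SNR improvement in dB when the OSR is doubled."""
    return 10 * math.log10(inband_noise_power(order, osr)
                           / inband_noise_power(order, 2 * osr))


for order in (0, 1, 2):
    print(f"order {order}: {gain_per_doubling_db(order):.2f} dB per doubling")
```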

Besides better noise spectral properties than single-loop or oversampled pulse code modulation systems, double-loop SDMs exhibit noise that is effectively uncorrelated with the input signal, even if the signal is constant [91].

More complex structures are possible. Additional loops can be added, but the system soon becomes unstable due to slight variations in circuit parameters and may settle into limit cycles if the quantizer saturates. Single and double-loop modulators can be concatenated, but precise component matching is necessary. Multiple bit quantizers can also be used, but this reintroduces the difficulties of flash A/D and D/A converters, since conversion must take place at the sampling rate [87].

## 4.2.3 Decimator

The decimation stage consists of low-pass filtering and downsampling. Downsampling is accomplished by sampling the filter output at the slower, desired rate. Implementing the low-pass filter presents more difficulties.

The low-pass filter, or decimating filter, performs three functions. First, it removes the quantization noise shifted out of the conversion band. Second, it attenuates out-of-band signal components before downsampling, reducing aliasing. Third, it reduces the effects of circuit noise by frequency limiting the noise power spectral density. To accomplish these goals, the filter's magnitude spectrum requires a flat pass band, a reasonably sharp roll off, and good rejection in the stop band. The filter performs calculations at the oversampling rate, requiring an efficient architecture to minimize delays. Additionally, the filter should occupy minimum surface area to allow sufficient room on the IC for other monitor subsystems.

Theoretically, since the output of the SDM represents the average of the input signal, an averaging filter could be used. Such filters are referred to as sinc filters due to their  $\sin(f)/f$  spectral shape. However, this filter does not sufficiently perform antialiasing or noise attenuation. In practice, the equivalent of several concatenated averaging filters is used to implement oversampling sigma-delta converters. It has been found that n+1 successive sinc filters are sufficient for filtering the quantization noise of an n<sup>th</sup>-order sigma-delta modulator [96]. A filter implementing the logical concatenation of n+1 sinc filters is referred to as a sinc<sup>n+1</sup> filter.

Two potential difficulties arise from using a sinc<sup>n+1</sup> filter: insufficient out-of-band frequency attenuation and magnitude droop in the high frequency portion of the pass band. Both problems are usually reduced to acceptable levels by partial downsampling after the sinc<sup>n+1</sup> filter, additional low-pass filtering, and then final downsampling.
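A textbook form of the sinc<sup>n+1</sup> decimator can be sketched as cascaded moving averages followed by downsampling. This is only a reference illustration of the concept, not the minimum-area implementation developed in Section 4.4, and the names are hypothetical.

```python
def moving_average(samples, length):
    """Running mean over the last `length` samples (zero initial state)."""
    out, acc = [], 0.0
    for i, value in enumerate(samples):
        acc += value
        if i >= length:
            acc -= samples[i - length]   # drop the sample leaving the window
        out.append(acc / length)
    return out


def sinc_decimate(samples, osr, order=3):
    """Apply `order` cascaded averaging (sinc) filters, then keep every osr-th sample."""
    filtered = list(samples)
    for _ in range(order):
        filtered = moving_average(filtered, osr)
    start = order * osr - 1              # skip the start-up transient of the cascade
    return filtered[start::osr]


# A constant +1 stream decimates to 1.0 (unity DC gain), while the
# alternating idle pattern {+1, -1, ...} is removed.
print(sinc_decimate([1.0] * 64, osr=8))
print(sinc_decimate([1.0, -1.0] * 32, osr=8))
```

For a second-order modulator, `order=3` corresponds to the sinc<sup>3</sup> case described above.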

## 4.2.4 Regions of Operation

The SDM quantizer gives the A/D converter nonlinear operation characteristics. These can be seen by considering the conversion of a rail-to-rail sinusoid with frequency within the conversion band, as shown in Figure 43. The top graph presents the converted output from the 64-times oversampled A/D converter designed in the next sections. The bottom graph presents the error between the converter output and the input, corrected for the delay of the modulator and decimator.

Figure 43. Conversion of a rail-to-rail sinusoid: a) converted sinusoidal signal; b) conversion error (ADC output - input signal).

The region exhibiting the severest difference between the input and converted output occurs near the extremes and is due to quantizer saturation. The quantized output is added to the input signal and, in the case of a two-loop system, the output of the first integrator. If the input is large and has the opposite sign of the quantization error, the sum will saturate the following integrator. This produces an input to the quantizer that is greater than the output level, causing the quantizer to function as a limiter. The quantization error will then contain frequency components of the input and, hence, will no longer be uncorrelated with the input. Since the quantizer output is fixed between two levels, the output power is fixed. The additional harmonic power increases the error produced by the SDM. It can be noted by observing Figure 43 that the quantizer saturation problem is not fatal, as the converter recovers from saturation when the input level is decreased.

In regions away from saturation, deviation is due to the SDM operating in the linear region. This is the expected quantization error, appearing as a smaller random variation with zero mean.

The magnitude-dependent operation of the converter can be described by examining a plot of the total signal-to-noise ratio (TSNR) of the output versus the input signal power. Figure 44 presents such an unscaled, conceptualized graph. The graph can be divided into three regions: overload (saturation), normal (linear) and idle channel [93].



Figure 44. Conceptualized TSNR vs. input power.

The idle channel region results from nonideal circuit operation and the high frequency limit cycles that result from near-zero inputs. An example of the so-called idle pattern is a  $\{+1, -1, +1, -1, ...\}$  cycle that could result from an input of zero to a single-loop SDM. Since the fundamental frequencies of the SDM outputs are one-half to one-eighth of the sampling frequency, the decimation filter will remove the noise. However, circuit inaccuracies, such as a linear region in the quantizer or a comparator offset, will disrupt the pattern. This produces low frequency harmonic noise that dominates the signal at low levels.

### 4.3 TRANSDUCER SIGNAL AMPLIFICATION AND OSR CALCULATION

In this section, the gain for the transducer amplifier and the oversampling ratio are calculated. These values are necessary for calculating decimation filter parameters. However, the design process is iterative and the gain and OSR cannot be estimated without an accurate picture of the hardware that follows. For this reason, and in an attempt to make the design process easier to understand, the final design is used in calculating the values.

Two related decisions are required in the design of this converter: the oversampling ratio and the gain of the input signal. With a two-loop architecture, each doubling of the OSR results in an increase in the SNR of 15 dB. Figure 45 shows the simulated output signal-to-noise ratios versus normalized input sinusoid power for OSRs of 32, 64 and 128.

To select the appropriate OSR, the specifications for the input signal must be considered. The bearing monitor accepts up to  $\pm 10 g$  and should be able to detect at least 0.1 g. This can be represented as a wedge, as seen by the heavy line in Figure 46. If a full scale input of 10 g is to have 10 bits of accuracy, the point of the wedge must lie at or above 60.2 dB, and at least 54.2 dB is required for 9 bits.

Due to construction variations and temperature changes, the electrical signals corresponding to a given acceleration will deviate from device to device and over time. Specific sources for these deviations include beam etch thickness, piezoresistor dimensions, piezoresistor temperature dependence, bridge supply gain variations and transducer output amplification variance. Derivations for the first four were discussed in Chapter 3. A variation of  $\pm 10\%$  will be used for the transducer amplifier and bridge supply electronics gains. Actual values for deviations depend on the specific process used to manufacture the device and, therefore, can only be obtained by experimental analysis. The deviation factors considered are summarized in Table 7. The deviations assume that any DC value from the transducer has been removed by the offset filter prior to the converter. It is also assumed that the various deviation causes are independent of each other, allowing their power effects to be summed.

The effect of the deviations is to widen the range required by the input signal for a given desired TSNR. The result is shown in Figure 46 as a modified wedge, where the



Figure 45. OSR effects on TSNR.



Figure 46. Signal gain profile.

Table 7. Input signal deviation factors.

| Cause                               | Deviation      | Range  | Normalized magnitude | Normalized power (dB) |
|-------------------------------------|----------------|--------|----------------------|-----------------------|
| Beam thickness (μm)                 | nominal        | 10     | 1.000                |                       |
|                                     | +10%           | 11     | 0.796                | -1.98                 |
|                                     | -10%           | 9      | 1.281                | 2.15                  |
| Resistor size $(\Delta l/\Delta A)$ | nominal        |        | 1.000                |                       |
|                                     | ±3%            |        | 0.9998               | -0.002                |
| Amplifier electronics (Δgain)       | nominal        |        | 1.000                |                       |
|                                     | -10%           |        | 0.900                | -0.92                 |
|                                     | +10%           |        | 1.100                | 0.83                  |
| Bridge supply electronics (ΔG & Δα) | nominal        | 495    | 1.000                |                       |
|                                     | -10%           | 447.24 | 0.904                | -0.88                 |
|                                     | +10%           | 545.89 | 1.103                | 0.85                  |
| Temperature (Δ°C)                   | nominal        | 20     | 1.000                |                       |
|                                     | ΔG & Δα @ -10% | 0      | 0.998                | -0.02                 |
|                                     | ΔG & Δα @ +10% | 40     | 1.008                | 0.07                  |

combined construction deviation at nominal temperature is shown as a dashed line, and the addition of temperature variation is shown as a light solid line. The temperature effects on the piezoresistors are minimal due to the temperature compensating bridge supply discussed in Section 3.6.1.

The calculations for amplifier gain and OSR can now be made. In order to accurately convert the input signal regardless of the variations in a device, its signal envelope must fit below the TSNR curve for the converter. Horizontal shifts of the envelope correspond to changes in the amplifier gain, while vertical shifts correspond to changes in output resolution.

If an accuracy of 10 bits at 10 g is desired, the envelope top must lie at 60.2 dB TSNR, and 54.2 dB is required for 9-bit resolution. Figure 47 shows the placement of the envelopes within TSNRs from OSRs of 32, 64 and 128, with 10 g nominal at -4.7 dB. This value is chosen to allow for space between the signal envelope and the declining TSNR curves near maximum input. From the figure, an OSR of 64 times yields sufficient clearance for both 9- and 10-bit resolutions. For resolution purposes, the worst case point



Figure 47. OSR and gain determination.

for the signal envelope occurs at the lowest possible gain for a 10 g input, -8.5 dB normalized input.

The amplifier gain can be calculated using the nominal transduction gain, the placement of the nominal point on the envelope (nominal voltage level corresponding to  $\pm 10 g$ ), and the converter input voltage range. The envelope must be placed so that it falls within the TSNR limits. Placing the nominal 10 g input at -4.7 dB (0.58 times full scale) meets this objective. Section 4.5.4 shows this placement is sufficient for all input frequencies within the pass band.

The nominal transduction gain, calculated in Section 3.4, is 750  $\mu$ V/g. This produces 7.5 mV at 10 g. The ADC will have an input range of ±5 V. In order to set a 10 g input to correspond with -4.7 dB normalized input, the nominal amplifier gain must be

$$\frac{(5 \text{ V}) \times (0.58)}{(750 \text{ } \mu\text{V/g}) \times (10 \text{ } g)} = 388.$$
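The placement and gain figures above can be rechecked numerically. The following Python sketch is illustrative only: the function and argument names are not part of the design, and the word-width targets are assumed to follow the usual rule of about 6.02 dB per bit.

```python
import math

def tsnr_for_bits(bits):
    """TSNR in dB needed for the given word width (about 6.02 dB per bit)."""
    return 20.0 * math.log10(2 ** bits)

def amplifier_gain(adc_full_scale_v, placement_db, transduction_v_per_g, accel_g):
    """Gain that places accel_g at placement_db relative to ADC full scale."""
    fraction = 10.0 ** (placement_db / 20.0)   # -4.7 dB is about 0.58
    return adc_full_scale_v * fraction / (transduction_v_per_g * accel_g)

# TSNR targets for 10-bit and 9-bit resolution
print(round(tsnr_for_bits(10), 1), round(tsnr_for_bits(9), 1))

# Nominal gain for 10 g at -4.7 dB with a 5 V range and 750 uV/g
print(round(amplifier_gain(5.0, -4.7, 750e-6, 10.0)))
```

Evaluating the fraction as $10^{-4.7/20}$ rather than the rounded 0.58 is what yields a gain that rounds to 388.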

## 4.4 DECIMATOR DESIGN

The decimator provides the filtering, downsampling and accumulation necessary to convert the 1-bit output of the sigma-delta modulator at the sampling rate into a multibit number at the Nyquist rate. The decimation filter must remove a sufficient amount of the shaped quantization noise to provide the desired word width, must provide attenuation of out-of-band signals, and must have a flat pass band region. For this project, additional design constraints are imposed by the need to minimize the required chip area.

The method used in a majority of second-order converter designs found in the literature involves a two-stage decimation process. In the first stage a sinc<sup>3</sup> filter, to be described in the next section, is followed by decimation by a factor of $N_1$ below the sampling rate. This stage accomplishes the bulk of the noise reduction. The second stage


uses a low-pass filter followed by downsampling by  $N_2$ . This stage provides additional antialiasing and corrects for the magnitude linear distortion, referred to as droop, caused by the sinc<sup>3</sup> filter. The total oversampling ratio, N, is the product  $N_1 \times N_2$ , with  $N_2$  usually at 4. Such a system is shown in Figure 48. The objective of the design presented in this chapter is to reduce the architectural complexity of the sinc<sup>3</sup> filter and to eliminate the need for placing the second stage on the monitor IC.

The remainder of this section opens with a classification of various sinc<sup>3</sup> filters, including a development of their transfer functions and comparisons of noise reduction, antialiasing and droop. This is followed in Section 4.4.2 by a discussion of three architectures to implement the sinc<sup>3</sup> filter and estimates of their relative chip areas. Finally, the selection of the filter type and architecture to be implemented is justified in Section 4.4.3.

### 4.4.1 Sinc<sup>3</sup> Filter Classification and Performance

The sinc<sup>3</sup> filter is created, conceptually, by concatenating three averaging filters.

The output of each filter is formed by summing the previous K equally weighted samples,



 $f_s$  = sampling frequency

 $f_{\rm D}$  = decimator (intermediate) frequency

 $f_{\rm N}$  = Nyquist frequency

 $f_0$  = pass band frequency

Figure 48. Standard second-order SDM ADC.

$$y(k) = \frac{1}{K} \sum_{i=0}^{K-1} x(k-i).$$

The frequency spectrum, obtained by using a z-transform, is

$$Y(z) = \frac{1}{K} \sum_{i=0}^{K-1} z^{-i} X(z),$$

$$H(z) = \frac{1}{K} \frac{1 - z^{-K}}{1 - z^{-1}},$$

$$H(f) = H(e^{j\omega T_{S}})\Big|_{\omega=2\pi f}$$

$$= \frac{1}{K} \frac{1 - \cos(2\pi f K T_{S}) + j \sin(2\pi f K T_{S})}{1 - \cos(2\pi f T_{S}) + j \sin(2\pi f T_{S})}$$

$$= \frac{1}{K} \frac{2 \sin^{2}(\pi f K T_{S}) + j 2 \sin(\pi f K T_{S}) \cos(\pi f K T_{S})}{2 \sin^{2}(\pi f T_{S}) + j 2 \sin(\pi f T_{S}) \cos(\pi f T_{S})}$$

$$= \frac{1}{K} \frac{\sin(\pi f K T_{S})}{\sin(\pi f T_{S})} e^{-j\pi f T_{S}(K-1)}$$

$$= \frac{\operatorname{sinc}(f K T_{S})}{\operatorname{sinc}(f T_{S})} e^{-j\pi f T_{S}(K-1)}, \qquad \operatorname{sinc}(x) = \frac{\sin(\pi x)}{\pi x}.$$
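The closed form of the averaging filter response can be checked against its defining sum. The sketch below is a minimal numerical check, assuming the normalized frequency $x = f T_S$; the function names are illustrative.

```python
import cmath
import math

def boxcar_response_direct(K, x):
    """H(f) from the definition: (1/K) * sum of z^-i with z = exp(j*2*pi*x)."""
    return sum(cmath.exp(-2j * math.pi * x * i) for i in range(K)) / K

def boxcar_response_closed(K, x):
    """Closed form: sin(pi*x*K) / (K*sin(pi*x)) with phase exp(-j*pi*x*(K-1))."""
    mag = math.sin(math.pi * x * K) / (K * math.sin(math.pi * x))
    return mag * cmath.exp(-1j * math.pi * x * (K - 1))

K = 32
for x in (0.001, 0.01, 0.1, 0.37):
    difference = abs(boxcar_response_direct(K, x) - boxcar_response_closed(K, x))
    print(x, difference)   # differences are at rounding-error level
```

The response also vanishes at multiples of $f_S/K$, which is the property exploited when the output is downsampled by $K$.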

The magnitude spectra for  $N_1 = 32$  and K values of  $N_1$ ,  $\frac{1}{2}N_1$  and  $\frac{1}{4}N_1$  (referred to as H100, H010 and H001) are pictured in Figure 49. The magnitude spectra are plotted as a function of the sampling frequency,  $f_S$ . The phase spectrum is not critical for this application due to the squaring operation that follows the converter, the decision threshold based only on the magnitude and the relative unimportance of filter delay.

Sinc<sup>3</sup> filters built with three averaging filters of length  $N_1$  (H100) are nearly optimal in their removal of quantization noise [96]. Virtually all SDM ADCs built to date use this type of sinc<sup>3</sup> filter for the first stage decimator. Combinations of three averaging stages with smaller values for K, such as  $\frac{1}{2}N_1$  and  $\frac{1}{4}N_1$ , produce filters with fewer zeros on the unit circle, resulting in degraded noise reduction performance but simpler architectures. These filters can be formed by multiplying combinations of the three first-



Figure 49. H100, H010 and H001 spectra.

order averaging filters H100, H010 and H001. The magnitude spectra that result are specified by the function

$$Hijk(f) = \frac{\left[\sin(\pi f N_1 T_S)\right]^i \left[2\sin(\frac{1}{2}\pi f N_1 T_S)\right]^j \left[4\sin(\frac{1}{4}\pi f N_1 T_S)\right]^k}{N_1^3 \sin^3(\pi f T_S)},$$

where

 $i = \text{number of length } N_1 \text{ stages},$ 

 $j = \text{number of length } \frac{1}{2}N_1 \text{ stages,}$ 

 $k = \text{number of length } \frac{1}{4}N_1 \text{ stages,}$ 

i+j+k=3.
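As an illustration of the Hijk magnitude function, the following sketch evaluates the closed form and compares it with the product of the individual averaging-filter magnitudes. It assumes the normalized frequency $x = f T_S$; the helper names are hypothetical.

```python
import math

def hijk_mag(x, n1, i, j, k):
    """Magnitude of Hijk at x = f*T_S from the closed-form expression."""
    if x == 0.0:
        return 1.0                      # unity gain at DC
    num = (math.sin(math.pi * x * n1) ** i
           * (2.0 * math.sin(math.pi * x * n1 / 2)) ** j
           * (4.0 * math.sin(math.pi * x * n1 / 4)) ** k)
    return abs(num / (n1 ** 3 * math.sin(math.pi * x) ** 3))

def cascade_mag(x, lengths):
    """Same magnitude formed by multiplying averaging filters of given lengths."""
    mag = 1.0
    for K in lengths:
        mag *= abs(math.sin(math.pi * x * K) / (K * math.sin(math.pi * x)))
    return mag

# H120 with N1 = 32 is one length-32 stage and two length-16 stages
for x in (0.003, 0.02, 0.2):
    print(x, hijk_mag(x, 32, 1, 2, 0), cascade_mag(x, [32, 16, 16]))
```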

Figure 50 shows the spectra for six filters, H300 (the standard filter), H210, H120, H030, H012 and H003, for  $N_1 = 32$ .

The length of the filter (number of coefficients in a convolution implementation) is

$$L = N_1 (i + \frac{1}{2} j + \frac{1}{4} k).$$




Figure 50. Six Hijk spectra.

L includes two zero coefficients for padding to equal a multiple of N<sub>1</sub>.

Henceforth, only the filters H300, H120 and H012 will be considered. The remaining filters have lengths that are not multiples of  $N_1$  and are, therefore, difficult to implement.

The three filter performance specifications of interest are the total in-band quantization noise, the worst-case attenuation of out-of-band signals before aliasing, and the worst-case magnitude droop.

### 4.4.1.1 In-band noise

The quantization noise at the output of the sinc<sup>3</sup> filter results from passing the noise function  $n_2(t)$  through the filter. Downsampling by  $N_1$  aliases the spectrum about  $f_D$ , copying the noise above  $f_D$  into the pass band. If Hijk contains an H100 term, its spectrum will go to zero at multiples of  $f_D$ . These regions of near-zero noise are folded to

the region around DC. If a second low-pass filter strongly attenuates any signal above  $f_0$  before the second decimation by  $N_2$ , very little noise is aliased into the pass band. This technique has been used to produce outputs with near theoretical SNRs, but requires excessive hardware.

Since only nine bits of output precision are required, no second stage low-pass filter will be used. A sufficient noise attenuation has been obtained by selecting an appropriate sinc<sup>3</sup> filter and OSR. The in-band noise that results will be the total noise obtained from passing the quantizer noise through the sinc<sup>3</sup> filter, because all of this noise will be aliased into the desired signal band. The total quantization noise,  $n_D$ , is analytically determined by integrating the power spectral density of the output of the sinc<sup>3</sup> filter driven by the shaped quantization noise input  $S_D(f)$  from DC to  $\frac{1}{2} f_S$ .

$$\begin{split} S_{D}(f) &= S_{Q}(f) \times |\text{H}ijk(f)|^{2} \\ &= N_{2}^{2}(f) \times \text{H}ijk^{2}(f), \\ n_{D}^{2} &= \int_{0}^{\frac{1}{2}f_{S}} S_{D}(f) \, df \\ &= \int_{0}^{\frac{1}{2}f_{S}} \frac{32e_{\text{RMS}}^{2} T_{S} \left[ \sin(\pi f N_{1} T_{S}) \right]^{2i} \left[ 2\sin(\frac{1}{2}\pi f N_{1} T_{S}) \right]^{2j} \left[ 4\sin(\frac{1}{4}\pi f N_{1} T_{S}) \right]^{2k}}{N_{1}^{6} \sin^{2}(\pi f T_{S})} \, df, \end{split}$$

where

$$\begin{split} S_{Q}(f) &= \text{quantization noise PSD} \\ &= 32e_{\text{RMS}}^{2} T_{S} \sin^{4}(\pi f T_{S}), \\ e_{\text{RMS}}^{2} &= P_{e} = \frac{m_{p}^{2}}{3L^{2}}. \end{split}$$

The calculation represents a system with inputs normalized to  $\pm 1$ .

A graph of the approximate in-band noise power for filters H300, H120 and H012 as a function of n, the  $log_2(N_1)$ , is presented in Figure 51. The integration was performed



Figure 51. In-band noise power vs. n.

numerically using Simpson's 1/3 rule. A discussion of the numerical integration is provided in Appendix C.6.
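The integration can be sketched as follows. This is a minimal illustration, not the program of Appendix C.6. It assumes second-order noise shaping of the form $32 e_{RMS}^2 T_S \sin^4(\pi f T_S)$, a normalized sampling period $T_S = 1$, and a default $e_{RMS}^2 = 1/12$ (that is, $m_p = 1$ and $L = 2$). The function names and the number of Simpson intervals are arbitrary choices.

```python
import math

def output_noise_psd(x, n1, i, j, k, e_rms2):
    """S_D at normalized frequency x = f*T_S, with T_S taken as 1."""
    if x == 0.0:
        return 0.0                       # the integrand vanishes at DC
    s = math.sin(math.pi * x)
    s_q = 32.0 * e_rms2 * s ** 4         # assumed second-order shaped noise PSD
    h = (math.sin(math.pi * x * n1) ** i
         * (2.0 * math.sin(math.pi * x * n1 / 2)) ** j
         * (4.0 * math.sin(math.pi * x * n1 / 4)) ** k) / (n1 ** 3 * s ** 3)
    return s_q * h * h

def simpson(f, a, b, intervals):
    """Composite Simpson's 1/3 rule; the number of intervals must be even."""
    if intervals % 2:
        raise ValueError("intervals must be even")
    step = (b - a) / intervals
    total = f(a) + f(b)
    for m in range(1, intervals):
        total += (4 if m % 2 else 2) * f(a + m * step)
    return total * step / 3.0

def in_band_noise_db(n1, ijk, e_rms2=1.0 / 12.0, intervals=20000):
    """Total output noise power in dB, integrated from DC to f_S/2."""
    i, j, k = ijk
    power = simpson(lambda x: output_noise_psd(x, n1, i, j, k, e_rms2),
                    0.0, 0.5, intervals)
    return 10.0 * math.log10(power)

for name, ijk in (("H300", (3, 0, 0)), ("H120", (1, 2, 0)), ("H012", (0, 1, 2))):
    print(name, [round(in_band_noise_db(2 ** n, ijk), 1) for n in (4, 5, 6, 7)])
```

Under these assumptions, H012 at a given $N_1$ has the same integrand as H120 at $\frac{1}{2}N_1$, so the two curves coincide when shifted by one octave.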

The graph indicates that H300 has the best noise attenuation at a given decimation rate, and that H012 performs the same as an H120 filter using half the decimation rate. The signal-to-noise ratio in dB can be obtained by subtracting the in-band quantization noise power from the input signal power. For example, consider a 0.5 normalized magnitude signal to be quantized into nine bits. The quantization noise must be 54 dB below the signal level for a 9-bit output. An input of 0.5 (-6 dB) therefore requires the noise to lie 60 dB below full scale. This results in a decimation rate of at least 32 if H300 or H120 is used and at least 64 if H012 is used. It should be noted that this noise is only due to the aliased quantization noise and, therefore, only estimates the SNR in the linear operating region of the quantizer. The performance is degraded at near maximum input amplitudes from nonlinear noise due to quantizer saturation and at very low amplitudes from noise due to circuit imperfections.

### 4.4.1.2 Antialiasing

Out-of-band signal components are aliased into the pass band in the same manner as out-of-band quantization noise. In a two-stage decimator, the second stage low-pass filter attenuates the out-of-band signal before the second downsampling aliases the signal into the pass band. Eliminating the low-pass filter in the second stage requires the reduction of higher frequency signal components by other means. The first method is to use a sharper filter in the first stage. The added architectural complexity for this solution defeats the purpose of eliminating the second filter.

A second method is to increase the order of the second downsampling. Hijk is monotonically decreasing up to  $f_D$ , implying that the worst case aliasing occurs at  $f_0$ . Therefore, increasing  $N_2$  decreases the aliasing. Figure 52 shows the worst case attenuation of an aliased signal for filters H300, H120 and H012. Note that, if  $f_0$  and  $N_1$  are fixed, each doubling of  $N_2$  requires a doubling of the sampling rate.

A third method for reducing the aliasing is to apply analog low-pass prefilters, the same technique conventional converters use. Since some filtering can be accomplished in both stages of the decimation, the order of this prefiltering need not be as high as would be required for non-oversampled converters. Figure 53 shows the worst case antialiasing for the three filters of interest using second-order and fourth-order Butterworth filters each with a cut-off frequency at $f_0$. In order to achieve at least 54 dB of worst-case attenuation when a second-order analog low-pass filter is used in addition to the natural attenuation of the transducer, $N_2$ must be at least 2 for H300 and H120 and at least 3 for H012.

### 4.4.1.3 Droop

Because Hijk is monotonically decreasing up to  $f_D$ , the quality of the magnitude spectrum's flatness can be expressed as the drop in gain from 0 dB at DC to the gain at  $f_0$ .



Figure 52. Worst-case antialiasing.



Figure 53. Worst-case aliasing with low-pass filters.

This is known as the droop of the filter. The droops for H300, H120 and H012 are plotted against  $N_2$  with  $N_1 = 64$  in Figure 54. The droop values are strongly dependent on  $N_2$ , as can be seen in Figure 54, but only weakly dependent on  $N_1$ . For example, the droop for H120 and  $N_2 = 2$  ranges from 1.32 dB at  $N_1 = 8$  to 1.36 dB at  $N_1 = 256$ , a change of 3% over 5 octaves. The effect decreases with increasing  $N_2$ .
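The quoted droop values can be reproduced with a few lines of Python. The sketch assumes that the pass band edge lies at $f_0 = f_S / (2 N_1 N_2)$, an assumption that is consistent with the 1.32 dB and 1.36 dB figures above; the function names are illustrative.

```python
import math

def hijk_mag(x, n1, i, j, k):
    """Magnitude of Hijk at normalized frequency x = f*T_S (x > 0)."""
    num = (math.sin(math.pi * x * n1) ** i
           * (2.0 * math.sin(math.pi * x * n1 / 2)) ** j
           * (4.0 * math.sin(math.pi * x * n1 / 4)) ** k)
    return abs(num / (n1 ** 3 * math.sin(math.pi * x) ** 3))

def droop_db(n1, n2, ijk):
    """Gain drop in dB from DC to f_0, assuming f_0 = f_S / (2*N1*N2)."""
    x0 = 1.0 / (2 * n1 * n2)
    return -20.0 * math.log10(hijk_mag(x0, n1, *ijk))

# H120 with N2 = 2 over five octaves of N1
for n1 in (8, 16, 32, 64, 128, 256):
    print(n1, round(droop_db(n1, 2, (1, 2, 0)), 2))
```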

Normally, the droop is corrected with the second stage filter [94]. Since this filter is not present in this design, an alternate method is employed. Before examining this alternative, it is useful to consider how the signal will be used after the converter. The digitized signal can be routed to one of two places: the feature extraction logic on the bearing monitor IC or the central monitoring computer. The feature extraction logic uses the signal to detect fairly coarse changes in vibration acceleration amplitude. Hence, small variations in the magnitude spectrum will have little effect on its operation.

If the signal is sent to the central monitor, precise measurements are to be expected, namely frequency spectral analysis. Hence, a high degree of accuracy is required. However, the central monitor need not make computations in real time and may



Figure 54. Worst-case droop,  $N_1 = 64$ .

have a full-scale floating point processor at its disposal. It could run the sensor output through a large FIR filter, compensating for the droop to the 12-bit accuracy available from the converter.

Two alternatives exist for correcting the droop on-chip. First, the magnitude spectrum can be corrected by the digital filter in the feature extraction section that follows the converter. This filter is used primarily for rough frequency selection and, hence, does not need to be architecturally complicated. Consequently, it will not have sufficient resolution to compensate for the droop.

A second method uses the second-order analog low-pass filter in front of the quantizer. By making the response slightly underdamped, the magnitude spectrum of the input is increased in the area where the decimator droop causes attenuation. This is the technique used in the monitor. The damping ratio and cut-off frequency values that produce the optimal filter depend on the Hijk type used and are discussed in Section 4.5.2.

### 4.4.2 Sinc<sup>3</sup> Filter Architecture

Since space is a critical resource on a smart sensor, the implementation of the $\mathbf{H}ijk$ decimator is an important concern. This section begins by examining the three architectures that can be used to implement the decimator. The key to the first two methods lies in factoring the sinc<sup>3</sup> filter into the product of a numerator function, $\mathbf{H}_{N}ijk$, and a denominator function, $\mathbf{H}_{D}$:

$$\begin{aligned} \mathbf{H}ijk(z) &= \mathbf{H}_{N}ijk(z) \times \mathbf{H}_{D}(z) \\ &= \frac{\mathbf{H}_{N}ijk(z)}{(1-z^{-1})^{3}}. \end{aligned}$$

The first architecture, referred to as Method I, implements the denominator as cascaded IIR accumulators, then intermediately downsamples, and finally implements the numerator as an FIR filter. The second architecture, Method II, places the numerator FIR

before the denominator IIR to take advantage of the 1-bit arithmetic in the FIR stage. Method III implements Hijk directly as an FIR filter, generating the coefficients on-the-fly to reduce storage requirements. A block diagram of each of the three methods is shown in Figure 55. Following the development of the architectures is a section that develops a relative area figure for each method and compares the size and quantization noise rejection qualities to determine the most appropriate filter type and architecture.

All of the architectures accept the  $\pm 1$  output of the sigma-delta modulator and produce a two's complement integer within the range [-1, 1). For simplicity, the filter  $\mathbf{H}ijk$  and the magnitude function  $\mathbf{H}ijk$  will both be represented by the symbol  $\mathbf{H}ijk$ .



Figure 55. Hijk architectures.

### 4.4.2.1 Method I architecture

Method I implements Hijk by using three accumulators at the sampling frequency for the denominator function and an FIR at an intermediate rate, dependent on the filter type, for the numerator function. Figure 56 presents a detailed diagram of the architecture for H120. All eight registers have the same width.

Regardless of which  $sinc^3$  type is used, the denominator has the form  $H_D(z)$ . This can be implemented as three successive IIR accumulators, each implementing the difference equation

$$y(k) = y(k-1) + x(k).$$

The size of each register depends on the filter type and decimation rate. Using  $n = \log_2(N_1)$ , H300 requires (3n+1)-bit registers, H120 requires (3n-1)-bit registers and H012 requires (3n-4)-bit registers. Detailed development of the register widths can be found in Appendix C.1. The accumulators each require an adder with the same output width as the registers. The first adder actually implements an increment/decrement function, since one input is the  $\pm 1$  output of the SDM.

The numerator function,  $H_N ijk$ , can be implemented with addition circuits following an intermediate downsampling. The three numerator functions are listed below:

$$H_{N}300 = (1-z^{-N_{1}})^{3} = 1-3z^{-N_{1}} + 3z^{-2N_{1}} - z^{-3N_{1}},$$

$$H_{N}120 = (1-z^{-N_{1}})(1-z^{-\frac{1}{2}N_{1}})^{2} = 1-2z^{-\frac{1}{2}N_{1}} + 2z^{-\frac{3}{2}N_{1}} - z^{-2N_{1}},$$

$$H_{N}012 = (1-z^{-\frac{1}{2}N_{1}})(1-z^{-\frac{1}{4}N_{1}})^{2} = 1-2z^{-\frac{1}{4}N_{1}} + 2z^{-\frac{3}{4}N_{1}} - z^{-N_{1}}.$$

The intermediate sampling rate, providing a decimation of  $N_1/l$ , allows for a minimal amount of storage to be used to implement the FIR filter,  $H_N ijk$ . H300 requires an intermediate decimation of  $N_1$  and four registers, producing delayed values at  $z^0$ ,  $z^{-N_1}$ ,



Figure 56. Method I H120 architecture.

 $z^{-2N_1}$ and $z^{-3N_1}$. H120 requires an intermediate decimation of $\frac{1}{2}N_1$ and five registers, producing delayed values at $z^0$, $z^{-\frac{1}{2}N_1}$, $z^{-N_1}$, $z^{-\frac{3}{2}N_1}$ and $z^{-2N_1}$. H012 requires an intermediate decimation of $\frac{1}{4}N_1$ and five registers, producing delayed values at $z^0$, $z^{-\frac{1}{4}N_1}$, $z^{-\frac{1}{2}N_1}$, $z^{-\frac{3}{4}N_1}$ and $z^{-N_1}$. These registers have the same widths as those in the IIR stage.

The actual implementation of  $H_N ijk$  can be done with a shift-and-add circuit. Following are the difference equations for implementing the shift-and-add, expressed at the intermediate decimation rate:

$$\begin{aligned} H_N 300&: \quad y(k) = x(k) - x(k-1) + x(k-2) - x(k-3) + 2[x(k-2) - x(k-1)], \\ H_N 120 \text{ and } H_N 012&: \quad y(k) = x(k) - x(k-4) + 2[x(k-3) - x(k-1)]. \end{aligned}$$

The addition is performed using modulo arithmetic with a modulus of $2^b$, where b is the register width in bits, allowing the sum to fit into the same width as the input registers. This results in 5b-2 full-adder cells being required for the $H_N300$ shift-and-add circuit, and 3b-2 full-adder cells required for $H_N120$ and $H_N012$.

It should be noted that the adders for the IIR accumulators must operate at the sampling rate whereas the shift-and-add circuit need only operate at the intermediate decimation frequency. The IIR adders will be referred to as fast adders and the FIR adders as slow adders. The number of full-adder cells is based on the assumption that ripple-carry adders (RCAs) can be used. This assumption is verified in Appendix C.2. Table 8 summarizes the hardware required for Method I, assuming all adder cells are full-adders for simplicity.
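A bit-level sketch of the Method I data path for H120 is given below and compared with a direct convolution. It is an illustration under stated assumptions rather than the circuit of Figure 56: the test stream comes from a hypothetical first-order modulator used only to generate a bounded $\pm 1$ sequence, the output is left as an unscaled integer, and all names are illustrative. The registers wrap modulo $2^b$ with $b = 3n - 1$.

```python
import math

def sdm_bits(samples):
    """Hypothetical first-order modulator, used only to make a +/-1 test stream."""
    integ, out = 0.0, []
    for u in samples:
        b = 1 if integ >= 0.0 else -1
        integ += u - b
        out.append(b)
    return out

def method1_h120(bits, n1, width):
    """Three wrapping accumulators, downsample by N1/2, then the H_N120 FIR."""
    mask = (1 << width) - 1
    acc1 = acc2 = acc3 = 0
    taps = [0, 0, 0, 0, 0]                 # x(k) ... x(k-4) at the intermediate rate
    out = []
    for t, b in enumerate(bits, start=1):
        acc1 = (acc1 + b) & mask           # fast adders, sampling rate
        acc2 = (acc2 + acc1) & mask
        acc3 = (acc3 + acc2) & mask
        if t % (n1 // 2) == 0:             # intermediate decimation
            taps = [acc3] + taps[:4]
            y = (taps[0] - taps[4] + 2 * (taps[3] - taps[1])) & mask
            if t % n1 == 0:                # final decimation by N1
                out.append(y - (1 << width) if y >> (width - 1) else y)
    return out

def direct_h120(bits, n1):
    """Reference: convolve with the cascade of boxcars of length N1, N1/2, N1/2."""
    h = [1]
    for K in (n1, n1 // 2, n1 // 2):
        h = [sum(h[m - s] for s in range(K) if 0 <= m - s < len(h))
             for m in range(len(h) + K - 1)]
    return [sum(c * bits[t - 1 - m] for m, c in enumerate(h) if t - 1 - m >= 0)
            for t in range(n1, len(bits) + 1, n1)]

n = 3
bits = sdm_bits([0.6 * math.sin(2 * math.pi * t / 200) for t in range(400)])
print(method1_h120(bits, 2 ** n, 3 * n - 1) == direct_h120(bits, 2 ** n))
```

The accumulators overflow repeatedly during the run, yet the wrapped result agrees with the direct convolution because the true output fits within the register width.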

### 4.4.2.2 Method II architecture

The second architecture for Hijk implements the numerator FIR section before the 3 accumulator IIR denominator section. This takes advantage of the  $\pm 1$  output of the

Table 8. Hardware for Method I implementation.

| Filter Type | Number of Storage Cells | Number of Fast Adder Cells | Number of Slow Adder Cells |
|-------------|-------------------------|----------------------------|----------------------------|
| H300        | 21n + 7                 | 9n + 3                     | 15n + 3                    |
| H120        | 24n - 8                 | 9n - 3                     | 9n - 5                     |
| H012        | 24n - 32                | 9n - 12                    | 9n - 14                    |

SDM to reduce the numerator FIR arithmetic computation at a cost of expanded storage requirements. Figure 57 shows the Method II architecture for H120.

In order to implement the FIR section first, the previous L SDM output values must be stored, where L is the number of bits in the filter (length of the filter). This requires  $2^n$  1-bit storage locations for  $H_N012$ ,  $2\times 2^n$  locations for  $H_N120$  and  $3\times 2^n$  locations for  $H_N300$ . The memory is arranged as a shift register.

An adder arrangement similar to that used for Method I could be used to implement the FIR, but the 1-bit word width allows for a direct implementation in combinational logic. The truth table and the resulting functions for  $H_N120$  are shown in Table 9. The capital letters A, B, C and D represent the 1-bit inputs delayed  $z^0$ ,  $z^{-\frac{1}{2}N_1}$ ,  $z^{-\frac{3}{2}N_1}$  and  $z^{-2N_1}$  respectively. The small letters a, b, c and d represent the 4 bits of the two's complement output, with most significant bit a. The same functions apply for H012, with  $N_1$  replaced by  $\frac{1}{2}N_1$ . It is noted that, for any of the filter types, the least significant bit of the output is always zero. This implies that only 3 bits are necessary for representing the output of  $H_N120$  and  $H_N012$ . Four bits are required for  $H_N300$  as its output consists of even values between -8 and 8.



Figure 57. Method II H120 architecture.

Table 9. Method II H<sub>N</sub>120 logic.

| ABCD | A – 2B + 2C – D | abcd |
|------|-----------------|------|
| 0000 | 0               | 0000 |
| 0001 | -2              | 1110 |
| 0010 | 4               | 0100 |
| 0011 | 2               | 0010 |
| 0100 | -4              | 1100 |
| 0101 | -6              | 1010 |
| 0110 | 0               | 0000 |
| 0111 | -2              | 1110 |
| 1000 | 2               | 0010 |
| 1001 | 0               | 0000 |
| 1010 | 6               | 0110 |
| 1011 | 4               | 0100 |
| 1100 | -2              | 1110 |
| 1101 | -4              | 1100 |
| 1110 | 2               | 0010 |
| 1111 | 0               | 0000 |

$$\begin{aligned} a &= B\overline{C} \vee \overline{A}BD \vee \overline{A}\,\overline{C}D, \\ b &= \overline{A}(B \oplus C \oplus D) \vee A(B \oplus C), \\ c &= A \oplus D, \\ d &= 0. \end{aligned}$$
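The logic functions can be checked exhaustively against the arithmetic they replace. The sketch below assumes that an input bit of 0 encodes $-1$ and a bit of 1 encodes $+1$, which is the encoding implied by the truth table; the function names are illustrative.

```python
def hn120_logic(A, B, C, D):
    """Output bits (a, b, c, d) from the minimized logic functions."""
    nA, nC = 1 - A, 1 - C
    a = (B & nC) | (nA & B & D) | (nA & nC & D)
    b = (nA & (B ^ C ^ D)) | (A & (B ^ C))
    c = A ^ D
    return a, b, c, 0

def hn120_arith(A, B, C, D):
    """A - 2B + 2C - D with each bit mapped to +1 (bit 1) or -1 (bit 0)."""
    sa, sb, sc, sd = (2 * bit - 1 for bit in (A, B, C, D))
    return sa - 2 * sb + 2 * sc - sd

mismatches = 0
for code in range(16):
    A, B, C, D = (code >> 3) & 1, (code >> 2) & 1, (code >> 1) & 1, code & 1
    a, b, c, d = hn120_logic(A, B, C, D)
    value = -8 * a + 4 * b + 2 * c + d     # 4-bit two's complement value
    mismatches += value != hn120_arith(A, B, C, D)
print(mismatches)
```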

Table 10. Method II accumulator sizes.

| Filter Type       | REG1 Size | REG2 Size | REG3 Size |
|-------------------|-----------|-----------|-----------|
| H<sub>D</sub>300  | n + 1     | 2n + 1    | 3n + 1    |
| H<sub>D</sub>120  | n + 1     | 2n        | 3n - 1    |
| H<sub>D</sub>012  | n         | 2n - 2    | 3n - 4    |

Table 11. Method II hardware implementation.

| Filter Type | Number of Storage Cells | Number of Adder Cells |
|-------------|-------------------------|-----------------------|
| H300        | $3\times 2^n + 6n + 3$  | 6n + 3                |
| H120        | $2\times 2^n + 6n$      | 6n                    |
| H012        | $2^n + 6n - 6$          | 6n - 6                |

The IIR accumulator section that follows the FIR section is similar to the IIR section in Method I. One difference is that each successive accumulator can be allowed to grow to the maximum size [94]. Table 10 lists the sizes required for each accumulator as a function of n, where REG1 is the first accumulator following the FIR, REG2 is next and REG3 is the final accumulator. The widths are developed in Appendix C.1. The number of full-adder cells for the IIR stage is the same as the register size.

The hardware requirements for Method II implementations are listed in Table 11. The combinational logic required for the 1-bit shift-and-add circuitry has been omitted since it is relatively small and independent of n.

### 4.4.2.3 Method III architecture

Method III realizes Hijk directly as a convolution filter. This takes advantage of the 1-bit arithmetic, since the filter input is the SDM output, without the need for L 1-bit storage locations.

Method III implements the function

$$y(k) = \frac{1}{L} \sum_{i=0}^{L-1} x(i)h(k-i),$$

where h is the impulse response of Hijk and x is the 1-bit output of the SDM. For a normalized full-scale ADC output in the range [-1, 1), the values for h also range between [-1, 1). Throughout the following discussion the values for h are scaled to be integers for simplicity.

Since the filter output is downsampled by  $N_1$  in the next operation, the y values can be accumulated until the next downsample. This requires  $L/N_1$  copies of the accumulation hardware, or 3 copies for H300, 2 for H120 and 1 for H012. Furthermore, since x is the  $\pm 1$  output of the SDM, the convolution term for each input involves only adding or subtracting the appropriate filter coefficient to the accumulated value.

Two techniques can be used to obtain the filter coefficients. First, they can be stored in memory. Hijk has linear phase, or

$$h(i) = h(L-1-i) \qquad 0 \le i \le \frac{1}{2}L,$$

implying that only half of the coefficients need to be held. However, since the coefficient size requirements are on the order of n bits, the storage necessary would be on the order of  $n \times N_1$  or  $O(n2^n)$  bits.

The second technique generates the coefficients as needed. A relation to generate the H300 coefficients is

$$h(i) = \begin{cases} \frac{i(i+1)}{2} & 0 \le i < N_1 \\ \frac{N_1(N_1+1)}{2} + (i-N_1)(2N_1-1-i) & N_1 \le i < 2N_1 \\ \frac{(3N_1-i-1)(3N_1-i)}{2} & 2N_1 \le i < 3N_1. \end{cases}$$

A circuit to generate the coefficients and perform the decimation has previously been designed [97]. The implementation uses a total of 16n+2 storage cells, 15n+6 adder cells and 9n+1 3-to-1 multiplexer cells, excluding the control circuitry.
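The H300 coefficient relation can be compared with a direct convolution of three length-$N_1$ averaging filters. In the sketch below the last interval is obtained from the linear-phase symmetry $h(i) = h(L-1-i)$, and the reference sequence is assumed to be padded with one zero at each end so that its length is $3N_1$. The function names are illustrative.

```python
def h300_coeff(i, n1):
    """Integer-scaled H300 coefficient for 0 <= i < 3*N1."""
    length = 3 * n1
    if i >= 2 * n1:
        i = length - 1 - i                 # linear phase: h(i) = h(L-1-i)
    if i < n1:
        return i * (i + 1) // 2
    return n1 * (n1 + 1) // 2 + (i - n1) * (2 * n1 - 1 - i)

def boxcar_cascade(lengths):
    """Impulse response of cascaded unit-weight averaging filters."""
    h = [1]
    for K in lengths:
        h = [sum(h[m - s] for s in range(K) if 0 <= m - s < len(h))
             for m in range(len(h) + K - 1)]
    return h

n1 = 4
print([h300_coeff(i, n1) for i in range(3 * n1)])
print([0] + boxcar_cascade([n1] * 3) + [0])
```

The coefficients sum to $N_1^3$, the scale factor that returns the output to the normalized range.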

#### 4.4.2.3.1 H120 coefficients

A relation to generate the H120 coefficients is

$$h(i) = \begin{cases} \frac{i(i+1)}{2} & 0 \le i < \frac{1}{2} N_1 \\ \frac{\frac{1}{2} N_1 (\frac{1}{2} N_1 + 1)}{2} + (\frac{1}{2} N_1 - 1)(i - \frac{1}{2} N_1) - \frac{(i - \frac{1}{2} N_1 - 1)(i - \frac{1}{2} N_1)}{2} & \frac{1}{2} N_1 \le i < N_1 \\ \frac{N_1^2}{4} - \frac{(i - N_1)(i - N_1 + 1)}{2} & N_1 \le i < \frac{3}{2} N_1 \\ \frac{\frac{1}{2} N_1 (\frac{1}{2} N_1 + 1)}{2} - \frac{1}{2} N_1 (i - \frac{3}{2} N_1 + 1) + \frac{(i - \frac{3}{2} N_1)(i - \frac{3}{2} N_1 + 1)}{2} & \frac{3}{2} N_1 \le i < 2N_1. \end{cases}$$

While this formula is capable of producing any coefficient for a given *i*, it is not particularly useful for developing an architecture. A simpler scheme is obtained by considering the incremental difference between adjacent coefficients,

$$\Delta(i) = h(i) - h(i-1).$$

This yields the set of first-order difference coefficients

$$\Delta(i) = \begin{cases} i & 0 \le i < \frac{1}{2} N_1 \\ N_1 - i & \frac{1}{2} N_1 \le i < \frac{3}{2} N_1 \\ i - 2N_1 & \frac{3}{2} N_1 \le i < 2N_1. \end{cases}$$

A further clue to an efficient implementation is found by considering a function for the difference between  $\Delta(i)$  coefficients,  $\Gamma(i)$ ,

$$\Gamma(i) = \Delta(i) - \Delta(i-1),$$

resulting in the set of second-order difference coefficients

$$\Gamma(i) = \begin{cases} +1 & 0 \le i < \frac{1}{2} N_1 \\ -1 & \frac{1}{2} N_1 \le i < \frac{3}{2} N_1 \\ +1 & \frac{3}{2} N_1 \le i < 2N_1. \end{cases}$$

The implementation of this algorithm uses an (n+1)-bit counter, the most significant bit indicating  $\Gamma(i)$ . This controls an increment/decrement (n+1)-bit register for generating the  $\Delta(i)$  coefficients. The  $\Delta(i)$  values feed a (2n-1)-bit accumulator that generates the h(i) coefficients. Table 12 provides an example of this process for  $N_1 = 8$ .

An architecture to implement this technique for generating the FIR coefficients and for performing the convolution is shown in Figure 58. Since, for H120 L =  $2N_1$ , two sets of coefficient generating logic are required. The outputs are alternately selected via 2-to-1 multiplexers. Control logic for this architecture is covered in Section 5.2.

Table 12. Method III H120 coefficient generation,  $N_1 = 8$ .

| i  | COUNT | $\Gamma(i)$ | $\Delta(i)$ | h(i) |
|----|-------|-------------|-------------|------|
| 0  | 4     | 0 (+1)      | 1           | 0    |
| 1  | 5     | 0 (+1)      | 2           | 1    |
| 2  | 6     | 0 (+1)      | 3           | 3    |
| 3  | 7     | 0 (+1)      | 4           | 6    |
| 4  | 8     | 1 (-1)      | 3           | 10   |
| 5  | 9     | 1 (-1)      | 2           | 13   |
| 6  | 10    | 1 (-1)      | 1           | 15   |
| 7  | 11    | 1 (-1)      | 0           | 16   |
| 8  | 12    | 1 (-1)      | -1          | 16   |
| 9  | 13    | 1 (-1)      | -2          | 15   |
| 10 | 14    | 1 (-1)      | -3          | 13   |
| 11 | 15    | 1 (-1)      | -4          | 10   |
| 12 | 0     | 0 (+1)      | -3          | 6    |
| 13 | 1     | 0 (+1)      | -2          | 3    |
| 14 | 2     | 0 (+1)      | -1          | 1    |
| 15 | 3     | 0 (+1)      | 0           | 0    |
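The counter-driven generation summarized in Table 12 can be sketched in software. The following is a minimal Python model, not the hardware description: the function name `h120_coefficients` and the ordering of the register updates within each step are assumptions, chosen so that the output reproduces the h(i) column of the table.

```python
def h120_coefficients(n):
    """Model the Method III H120 coefficient generator for N1 = 2**n."""
    n1 = 1 << n
    count_mask = (1 << (n + 1)) - 1   # (n+1)-bit counter wraps at 2*N1
    count = n1 // 2                   # counter preset (COUNT = 4 when N1 = 8)
    delta = 0                         # increment/decrement register
    h = 0                             # coefficient accumulator
    coefficients = []
    for _ in range(2 * n1):
        h += delta                    # accumulate the current first difference
        coefficients.append(h)
        msb = (count >> n) & 1        # counter MSB selects the second difference
        delta += -1 if msb else 1     # MSB = 1 means -1, MSB = 0 means +1
        count = (count + 1) & count_mask
    return coefficients


if __name__ == "__main__":
    print(h120_coefficients(3))
```

For n = 3 the model returns the sixteen values of the h(i) column, with a peak of $2^{2n-2} = 16$. No multiplier or coefficient memory is involved, only a counter, an up/down register and an accumulator.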



Figure 58. Method III H120 architecture.

## 4.4.2.3.2 H120 second coefficient generator

The second coefficient generator can be implemented with a duplicate copy of the generator described in Section 4.4.2.3.1, with a COUNT value differing by one-half of the total count  $(N_1)$ . However, a more area-efficient design can be developed by considering the relationships between the coefficient values in the two sections. Figure 59 shows the progression of values for  $N_1 = 8$ .

Throughout the discussion below, the following notation will be used:

$\boldsymbol{h}$ = multiple-bit value,

$h$ = 1-bit value,

${}^{k}\boldsymbol{h}$ = $k$-bit value,

${}^{k}h_i$ = 1-bit value $h_i$ repeated for $k$ bits,

$\langle \boldsymbol{h} \rangle_r$ = $\boldsymbol{h}$ modulo $2^r$,

$h_i \mid h_j$ = concatenation of $h_i$ and $h_j$.

Figure 59. Dual coefficients vs. time.

The coefficients for generator 1, h'(k), are related to those of generator 0 by

$$h'(k) = 2^{2n-2} - h(k),$$

where  $2^{2n-2}$  is the maximum value for h. An architecture for implementing this operation can be obtained by considering the subtraction performed with modulo arithmetic, or

$$\begin{aligned} {}^{2n-1}\boldsymbol{h'} &= \left\langle 2^{2n-2} - \boldsymbol{h} \right\rangle_{2n-1} \\ &= \left\langle 2^{2n-2} + \left( \overline{\boldsymbol{h}} + 1 \right) \right\rangle_{2n-1} \\ &= \left\langle \left\{ \boldsymbol{h}_{2n-2} \overline{\boldsymbol{h}}_{2n-3} \overline{\boldsymbol{h}}_{2n-4} \cdots \overline{\boldsymbol{h}}_{0} \right\} + 1 \right\rangle_{2n-1}. \end{aligned}$$

Before being put into the accumulator, the coefficient is sign extended from 2n-1 to 3n-1 bits. This can be accomplished either by sign extending  $^{2n-1}h'$  after the addition of 1 or by sign extending  $^{2n-1}h$  prior to the addition. The first technique is shown in Figure 60. As a precursor to the next improvement, the second technique is expressed as

$$\begin{aligned}
{}^{3n-1}\boldsymbol{h}' &= {}^{n}\boldsymbol{h}'_{2n-2} \mid \left\langle 2^{2n-2} + \left( {}^{2n-1}\overline{\boldsymbol{h}} + 1 \right) \right\rangle_{2n-1} \\
&= \left\langle 2^{2n-2} + \left\{ {}^{n}\overline{\boldsymbol{h}}_{2n-2}\, \overline{\boldsymbol{h}}_{2n-2}\overline{\boldsymbol{h}}_{2n-3}\overline{\boldsymbol{h}}_{2n-4} \cdots \overline{\boldsymbol{h}}_{0} \right\} + 1 \right\rangle_{3n-1} \\
&= \left\langle \left\{ {}^{n}\boldsymbol{h}_{2n-2}\, \boldsymbol{h}_{2n-2}\overline{\boldsymbol{h}}_{2n-3}\overline{\boldsymbol{h}}_{2n-4} \cdots \overline{\boldsymbol{h}}_{0} \right\} + 1 \right\rangle_{3n-1} \\
&= \left\langle \left\{ {}^{n+1}\boldsymbol{h}_{2n-2}\, \overline{\boldsymbol{h}}_{2n-3}\overline{\boldsymbol{h}}_{2n-4} \cdots \overline{\boldsymbol{h}}_{0} \right\} + 1 \right\rangle_{3n-1}.
\end{aligned}$$

The term  ${}^{n}h'_{2n-2}$  represents the *n*-bit sign extension of h'.
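The final form of the sign-extended expression can be checked numerically. The sketch below is an illustrative Python model (the function name `generator1_coefficient` is hypothetical). It builds the $(3n-1)$-bit word from the most significant bit of h repeated n+1 times, followed by the complemented lower $2n-2$ bits, then adds 1 modulo $2^{3n-1}$. It assumes h lies in the range 0 to $2^{2n-2}$, the range produced by generator 0.

```python
def generator1_coefficient(h, n):
    """Return the (3n-1)-bit word for h' = 2**(2n-2) - h using bit operations only."""
    low_width = 2 * n - 2                         # bits h_0 .. h_{2n-3}
    msb = (h >> low_width) & 1                    # h_{2n-2}
    inverted_low = ~h & ((1 << low_width) - 1)    # complemented lower bits
    upper = ((1 << (n + 1)) - 1) if msb else 0    # h_{2n-2} repeated n+1 times
    word = (upper << low_width) | inverted_low    # concatenation
    return (word + 1) & ((1 << (3 * n - 1)) - 1)  # add 1, modulo 2**(3n-1)


if __name__ == "__main__":
    n = 3
    for h in (0, 5, 16):
        print(h, generator1_coefficient(h, n))
```

Over the full coefficient range the result equals $2^{2n-2} - h$, so the second generator needs only inverters on the lower bits and a carry-in of 1 rather than a separate subtractor.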

This circuit can be further reduced by combining the adder in the coefficient generator with the adder/subtractor in the accumulator. The equations describing the resulting logic are



Figure 60. Coefficient generator 1 circuit.

Figure 61 uses these results to show a minimal area circuit for the second coefficient generator.



Figure 61. Improved coefficient generator 1 circuit.

The longest path in the coefficient generator is needed to verify that ripple-carry adders can be used in the implementation. The longest path is through the (2n-1)-bit adder for h in coefficient generator 0, through a 2-to-1 MUX, then through n+1 bits of the adder in coefficient generator 1. Appendix C.2 provides the calculations that verify the use of ripple-carry adders.

The coefficient generation architecture for H012 is identical to that of H120 with $N_1$ replaced by $\frac{1}{2}N_1$. Since L = $N_1$ for H012, only one copy of the coefficient generation logic is required and no output selection multiplexers are needed. The MUXs listed are for the add/subtract accumulator implementation. A description and verification of the use of ripple-carry adders (RCAs) in the adder/subtractor is presented in Appendix C.2.

Table 13 summarizes the hardware requirements for the Method III architecture components. The number of adder cells includes the increment/decrement circuitry in H120 and H012. The multiplexer counts are for 3-to-1 multiplexers in the H300 filter and 2-to-1 multiplexers for the H120 and H012 filters. The control hardware was not specified for the H300 filter [97]; hence, the table values represent an underestimation of the amount of memory required for this case. The following section indicates that the H120 filter requires significantly less area for the monitor application than the H300, so this omission serves to underestimate the superiority of the selected option.

Table 13. Hardware for Method III Implementation.

| Filter Type | Number of Storage Cells | Number of Adder Cells | Number of MUX Cells |
|-------------|-------------------------|-----------------------|---------------------|
| H300        | 16n + 4                 | 15n + 6               | 9n + 1              |
| H120        | 10n - 1                 | 9n - 2                | 7n - 3              |
| H012        | 7n - 7                  | 6n - 7                | 3n - 4              |

## 4.4.3 Filter Type and Architecture Selection

This section discusses the motivations for selecting H120 with a Method III architecture as the decimator filter for the analog-to-digital converter. The decision is based on the implementation that uses the smallest estimated area while accomplishing the necessary quantization noise limiting. The section begins by developing an area figure, A(n), to estimate the relative area required by each filter type and architecture. These area figures are then compared to noise limiting for different decimation rates.

The area required for a particular filter type, architecture and decimation rate is dependent on many issues, including: IC technology; type of memory, adder and multiplexer cells; control strategy; cell aspect ratios; clocking strategy; clock distribution; power distribution; relative location of input and output connections; design rules and interconnection efficiency.

If a comparative area is to be calculated without actually designing every combination, limiting assumptions must be made. First, assume that factors such as technology, interconnection, clocking, power and layout strategies affect each design proportionately, so that the relative area is only a function of the type and amount of hardware required to implement the design. Second, assume that the relative area required by each type of cell (memory, adder and multiplexer) is constant throughout a design and between different designs. Finally, assume that the relative cell sizes are 1 for a 2-to-1 MUX, 3 for a memory cell and 6 for an adder cell. These values are based on a 4-transistor 2-to-1 MUX (a 3-to-1 MUX is assumed to require 1½ times this area), a 14-transistor, clock-race-immune, static D flip-flop, and a 28-transistor full-adder cell [98]. The coarse estimate of the relative area required for each design is obtained by multiplying the number of each cell type by its relative size. Table 14 presents A(n) for the three filter types implemented with each of the three architecture methods. Note that, in all cases of interest, Method III has a smaller area figure than Method I.
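The weighting described above can be illustrated with a short calculation. The sketch below is a hypothetical Python helper (the names `METHOD_III_CELLS` and `area_figure` are not from the design). It applies the stated relative cell sizes to the Method III cell counts of Table 13.

```python
# Relative cell sizes stated in the text, normalized to a 2-to-1 MUX.
MUX2, MUX3, MEMORY, ADDER = 1.0, 1.5, 3.0, 6.0

# Method III cell counts from Table 13: (storage, adder, MUX, size of one MUX).
METHOD_III_CELLS = {
    "H300": (lambda n: 16 * n + 4, lambda n: 15 * n + 6, lambda n: 9 * n + 1, MUX3),
    "H120": (lambda n: 10 * n - 1, lambda n: 9 * n - 2, lambda n: 7 * n - 3, MUX2),
    "H012": (lambda n: 7 * n - 7, lambda n: 6 * n - 7, lambda n: 3 * n - 4, MUX2),
}


def area_figure(filter_type, n):
    """Coarse relative area A(n): each cell count times its relative size."""
    storage, adders, muxes, mux_size = METHOD_III_CELLS[filter_type]
    return MEMORY * storage(n) + ADDER * adders(n) + mux_size * muxes(n)


if __name__ == "__main__":
    for name in METHOD_III_CELLS:
        print(name, area_figure(name, 6))
```

For H120 and H012 this reproduces the Method III expressions 91n - 18 and 60n - 67 exactly. For H300 the weighted sum is 151.5n + 49.5, which rounds to the tabulated 152n + 50.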

In order to choose an appropriate filter type, the quantization noise attenuation for a given n must be considered. The development of an approximation of the total in-band noise was presented in Section 4.4.1.1. By comparing this information with the area required, an appropriate filter type, architecture and decimation rate can be obtained. This is done graphically in Figure 62 where the area figure, A(n), is plotted against the quantization noise. The numbers next to the data points are values of n.

A starting point in calculating an appropriate filter type and architecture is to determine the maximum quantization noise allowed. A 9-bit output requires a 54.2 dB SNR. As discussed in Section 4.3, in order for the signal gain profile to fit under the ADC TSNR curve, a signal input of -8.5 dB must satisfy the 54.2 dB SNR requirement. This implies that the noise level must fall below -62.7 dB. From the graph, H120 Method II with  $N_1 = 32$  and H012 Method III with  $N_1 = 64$  offer the smallest area together with an appropriate noise attenuation.
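The noise budget in this paragraph is simple arithmetic and can be reproduced as a check. In the Python sketch below, the function name `required_noise_floor_db` is hypothetical, and the SNR for a given word length is taken as $20\log_{10}(2^{\text{bits}})$, the approximation that yields the quoted 54.2 dB for 9 bits.

```python
import math


def required_noise_floor_db(bits, signal_level_db):
    """Return (noise_floor_db, snr_db) for a given resolution and input level."""
    snr_db = 20.0 * math.log10(2 ** bits)    # about 6.02 dB per bit
    return signal_level_db - snr_db, snr_db  # noise must sit this far below the signal


if __name__ == "__main__":
    floor_db, snr_db = required_noise_floor_db(9, -8.5)
    print(round(snr_db, 1), round(floor_db, 1))
```

Subtracting the required SNR from the -8.5 dB input level gives the noise ceiling against which the filter options are compared.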

The previous calculation will overestimate the minimum required noise for two reasons. First, the noise is approximated using an expression for the SDM quantization noise and the decimator frequency response. Second, the calculation does not include imperfections in the SDM and decimator. Electronic noise, integrator leakage, integrator nonlinearity, first integrator gain variation, comparator hysteresis, finite filter word width and filter distortion can all contribute to ADC inaccuracies [99]. The degree to which these inaccuracies affect the converter output depends on the technology and architecture of the SDM, the manufacturing tolerances and the desired resolution. As an example, a 12-bit 15 MHz sampling rate ADC showed an 8 dB difference between ideal simulation and the actual device signal-to-total harmonic distortion ratio [89]. To provide a safety margin, a designed noise level of -80 dB will be used.

Table 14. Relative area figures, A(n). (normalized to the size of a 2-to-1 MUX)

| Filter Type | Method I   | Method II                 | Method III |
|-------------|------------|---------------------------|------------|
| H300        | 216n + 27  | $9 \times 2^n + 54n + 18$ | 152n + 50  |
| H120        | 180n - 72  | $6 \times 2^n + 54n - 36$ | 91n - 18   |
| H012        | 180n - 252 | $3 \times 2^n + 54n - 72$ | 60n - 67   |

Figure 62. A(n) vs. quantization noise power.

According to Figure 62, the filter that best meets the -80 dB noise requirement is the H012 Method II with n = 7. A difficulty in implementing this filter arises when its antialiasing ability is considered. Section 4.4.1.2 discusses the situation where an H012 filter preceded by a second-order low-pass filter requires a second decimation of 16 to achieve at least 54 dB worst-case aliasing attenuation. This results in a total oversampling ratio of

$$N = N_1 \times N_2 = 128 \times 16 = 2048.$$

To achieve a Nyquist rate of 10 kHz, a sampling rate of over 20 MHz is required.

The sampling rate can be reduced by a factor of 8 if the first-stage decimator is implemented as an H120 filter of order n = 6. With the SDM preceded by a second-order low-pass filter, a second decimation of only 4 is required to meet the worst-case antialiasing requirement. This results in an oversampling ratio of

$$N = N_1 \times N_2 = 64 \times 4 = 256,$$

for a sampling frequency of 2.56 MHz. At n = 6, the Method III H120 filter has the lowest A(n) value.
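The two sampling-rate figures follow directly from the decimation factors. The Python sketch below uses a hypothetical helper name (`sampling_rate_hz`) and takes the 10 kHz Nyquist rate from the text.

```python
def sampling_rate_hz(n1, n2, nyquist_rate_hz=10_000):
    """Return (total oversampling ratio, SDM sampling rate in Hz)."""
    oversampling_ratio = n1 * n2
    return oversampling_ratio, oversampling_ratio * nyquist_rate_hz


if __name__ == "__main__":
    print(sampling_rate_hz(128, 16))  # H012 option
    print(sampling_rate_hz(64, 4))    # H120 option
```

The H012 option requires 20.48 MHz, whereas the H120 option requires 2.56 MHz, which is the factor of 8 noted above.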

An alternative implementation would be the H012 Method III with  $N_1 = 128$ ,  $N_2 = 4$  and a fourth-order low-pass prefilter. This would offer sufficient noise reduction and antialiasing with approximately two-thirds the hardware of the H120 filter, at the expense of twice the sampling rate, another second-order prefilter and an increased droop.

#### 4.5 ADDITIONAL ADC SYSTEM ISSUES

This section details elements of the final converter design. Section 4.5.1 reviews the complete system. Section 4.5.2 describes the switched capacitor (SC) implementation for the antialiasing filter. Section 4.5.3 briefly discusses the sigma-delta modulator. Section 4.5.4 presents graphs describing the ADC developed through simulation.

## 4.5.1 ADC System Review

This section reviews the overall converter system. Figure 63 presents the system block diagram.

The analog acceleration signal from the amplifier and offset removal filter is low-pass filtered in a switched capacitor (SC) filter designed to assist in antialiasing and to reduce the effects of decimator filter droop. The SC filter has a sampling rate of 160 kHz. Details of this filter are described in Section 4.5.2 and Appendices C.3 and C.4.

The SC filter output enters the second-order sigma-delta modulator. The signal is upsampled to 2.56 Msamples/sec in the SDM to provide 256 times oversampling for the ADC. The SDM is also implemented in SC technology. A brief discussion of the SDM is provided in Section 4.5.3.

The SDM $\pm 1$ (1, 0) output is filtered and downsampled in the decimator that follows. The decimator is composed of two parts, coefficient generation and filtering. The coefficient generators produce 11-bit $sinc^3$ FIR coefficients on-the-fly in a novel, reduced area design. The coefficients are sign-extended to 17 bits in the filter. The filter section is composed of two identical units. Each unit contains a 128-point FIR. Multiplication of the coefficient and the input amounts to a sign change due to the ±1 SDM output. A 17-bit accumulate-and-dump operation sums the products and provides the decimation. Outputs of the FIRs are alternately selected via 2-to-1 multiplexers to provide downsampling by a factor of 64. The design of the decimator is discussed in detail in Section 4.4. Because the digital decimator must be synchronized with the digital filtering that follows, the decimator control logic is described with the feature extraction logic in Chapter 5.

Figure 63. ADC system block diagram.

The 40 ksamples/sec output of the decimator is sent to the selectable antialiasing filters in the feature extraction section before final downsampling by 4. The signal may also be routed off-chip for further processing.

#### 4.5.2 Antialiasing filter

Section 4.4.1.2 described the necessity for preceding the ADC with a second-order low-pass filter to reduce the aliasing caused by downsampling. The filter cutoff frequency  $f_c$ and damping ratio  $\xi$ are chosen to satisfy two conflicting sets of constraints. First, the antialiasing is improved by lowering  $f_c$ because the filter gain at any frequency above the cutoff is reduced. This also improves the system's ability to attenuate the thermal noise produced by the transducer amplifier, as discussed in Section 3.6.2. Second, the filter can be used to reduce the droop of the decimator filter near  $f_0$ (5 kHz), as discussed in Section 4.4.1.3, by increasing  $f_c$ and making the filter slightly underdamped. This produces a gain slightly greater than 1 near  $f_0$, offsetting the droop.

Appendix C.3 presents the method used to calculate antialiasing filter parameters  $f_c$ and  $\xi$ that minimize the mean squared error between the actual filter and a target filter. The target filter was chosen to obtain acceptable antialiasing and electronic noise reduction. The optimal cutoff frequency was found to be 7000 Hz and the optimal damping ratio 0.56. Figure 64 shows the spectra of the resulting filter, the decimator, the system including the antialiasing filter, and the ideal.
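The effect of the slightly underdamped response can be illustrated numerically. The Python sketch below assumes the standard second-order low-pass form $H(s) = \omega_c^2 / (s^2 + 2\xi\omega_c s + \omega_c^2)$. The exact filter expression is given in Appendix C.3 and is not reproduced here, so both this form and the function name `lowpass_gain` are assumptions.

```python
import math


def lowpass_gain(f_hz, fc_hz=7000.0, xi=0.56):
    """Magnitude of an assumed standard second-order low-pass response."""
    r = f_hz / fc_hz
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (2.0 * xi * r) ** 2)


if __name__ == "__main__":
    for f in (0.0, 5000.0, 40000.0):
        print(f, lowpass_gain(f))
```

Under this assumed form, the gain at $f_0$ = 5 kHz is slightly greater than 1, consistent with the droop compensation described above, while components near 40 kHz are attenuated to a few percent of their input amplitude.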

The antialiasing filter and SDM are both implemented with a switched capacitor (SC) architecture. There are three main reasons for selecting switched capacitors over other techniques. First, filter parameters are determined by the ratio of capacitors, not by precise component values. Second, switched capacitor circuits require less area than other techniques because of their regular architecture and elimination of resistors. Third, due to the oversampling in the ADC, a high frequency clock is already available.

Because it is sampled, the SC filter frequency response will be different from that of the continuous filter. The response is a function not only of  $f_c$ and  $\xi$, but of the SC sampling frequency  $f_p$ as well. Appendix C.4 describes the design of the SC filter and the minimum mean squared error technique used to match the SC filter to the continuous filter for various sampling rates. The  $f_c$ and  $\xi$ that result in the best fit for SC sampling frequencies equal to the decimator sampling frequency divided by powers of 2, together with the continuous filter characteristics, are presented in Table 15.

Figure 64. Antialiasing filter spectrum.

The selection of the SC sampling frequency is, once again, a tradeoff between conflicting characteristics. A higher  $f_p$  results in a filter that more closely matches the continuous filter. However, as is shown in Appendix C.3, the area required for the capacitors,  $A_C$ , is proportional to the ratio of the sampling rate to the cutoff frequency, or

$$A_C \propto 3 + 2\xi + 2\frac{f_p}{f_c}.$$

Decreasing the sampling frequency reduces the area required. However, in addition to a poorer match of the analog filter, there is an increase in the effect of aliasing caused by sampling at the SDM rate. This is demonstrated in Figure 65, which shows the frequency responses after sampling at  $f_s$  for the continuous filter and the SC filter  $(f_p = 80 \text{ kHz})$ .

When the signal is downsampled, the higher peaks are attenuated by the decimation filter. However, enough gain remains to result in an increase in the worst-case antialiasing and the amplifier thermal noise power. Appendix C.5 discusses the calculations for antialiasing and Appendix C.6 for noise power. The results are included in Table 15 and summarized graphically in Figure 66. A sampling frequency of 160 kHz offers the best tradeoff between capacitor size and antialiasing and noise power.

Table 15. SC filter characteristics.

| SC Sampling Frequency $f_p$ (kHz) | Cutoff Frequency $f_c$ (Hz) | Damping Ratio $\xi$ | Worst-Case Antialiasing (dB) | Electronic Noise Power ($\times 10^{-8}$ V$^2$) | Relative SC Area (1 pF = 1) |
|-------------------|------------------|---------|--------------|-------------------------------------|-------------|
| analog            | 7000             | 0.56    | -64.5        | 8.43                                | _           |
| 2560              | 7040             | 0.563   | -64.6        | 8.43                                | 735.6       |
| 1280              | 7080             | 0.566   | -64.6        | 8.43                                | 369.8       |
| 640               | 7130             | 0.571   | -64.6        | 8.43                                | 187.0       |
| 320               | 7260             | 0.583   | -64.3        | 8.43                                | 95.6        |
| 160               | 7560             | 0.611   | -63.2        | 8.46                                | 49.9        |
| 80                | 8120             | 0.680   | -58.8        | 8.54                                | 27.2        |
| 40                | 9440             | 0.902   | -36.3        | 9.20                                | 16.2        |



Figure 65. Continuous and SC filter frequency response.



Figure 66. Sampling rate effects on SC filter parameters.

# 4.5.3 Sigma-Delta Modulator

This section presents the results of research on the construction of second-order sigma-delta modulators. The current state-of-the-art in SDMs produces systems that are more than adequate for use in the bearing monitor.

The SDM model used in this project was designed by Boser and Wooley [99]. The model is based on a device incorporating two fully differential SC integrators and a regenerative latch for the quantizer. A sampling rate of 4 MHz is used. The SDM is implemented in 3  $\mu$ m CMOS and operated with a single-sided 5 V supply. Reported simulation and test results are comparable to those obtained with the simulation of the device presented here.

Sufficient work has been reported to implement SDMs similar to the Boser and Wooley design with BiCMOS circuitry. This includes the construction of high frequency op-amps for SC circuits [100], the SC circuits [67, 101], and the incorporation of these SC circuits in SDM ADCs [66].

### 4.5.4 System Simulation Results

This section presents several graphs that summarize the performance of the monitor system from the transducer through the ADC. The graphs are for nominal system conditions and the gains are normalized to 1 (the amplifier gain of 388 V/V has been excluded). All the graphs were produced by a simulator written in C. Details of the TSNR simulation are provided in Appendix C.7.

Figure 67 shows the frequency response of the system from 10 Hz to 10 kHz. The sinusoidal input magnitude was 0.2. Figure 68 presents the TSNR as a function of frequency for the same input and frequency range. The TSNR drops off rapidly as a function of frequency because the quantization noise is attenuated only by the decimation $sinc^3$ filter. Figure 69 shows the TSNR as a function of magnitude for a 625 Hz sinusoidal input, the TSNR for a 5000 Hz sinusoidal input, the 9-bit signal gain envelope, and the nominal gain location for a 10 g input. The figure verifies a sufficient TSNR for 9-bit operation for all input frequencies and throughout all expected variations in system parameters.



Figure 67. Simulated frequency response.



Figure 68. TSNR vs. frequency.



Figure 69. TSNR vs. input power.

# CHAPTER 5 FEATURE EXTRACTION AND DECISION LOGIC

In this chapter, the logic for extracting a salient feature from the converted bearing signal and for using this value to determine the health of the bearing is presented. The results of Section 2.5 indicate that the mean squared (MS) value of the signal offers an appropriate single-value indication of increasing damage. Filtering enhances this value by increasing the effects due to defect sources while decreasing the effects due to external sources. The decision logic triggers an alarm when a programmable MS threshold is exceeded a programmable number of times.

The remainder of this chapter is divided into seven sections. Section 5.1 presents an overview of the hardware. Sections 5.2 through 5.6 cover the decimator, antialiasing filter, programmable IIR filter cells, MS and thresholding unit, and control unit, respectively. A simulation used to verify the hardware design is summarized in Section 5.7. Figure 70 presents a block diagram of the system. Throughout this chapter, the names of registers and control signals are printed in Courier type and module names are printed in Courier italic type for clarity.

#### 5.1 DIGITAL SYSTEM OVERVIEW

The first-stage decimator for the sigma-delta modulating analog-to-digital converter, antialiasing filter and second-stage decimation form the first functional unit, referred to as the conversion unit. The basic design of the decimator is presented in Section 4.4. Because of its close coupling to the antialiasing filter, the timing and control of the decimator will be covered in this chapter.

Feature extraction is accomplished by filtering the acceleration signal then calculating the MS value. Three second-order programmable IIR units form the filter. Each unit implements a biquad digital filter with five 12-bit coefficients.



Figure 70. Digital logic block diagram.

Hardware for calculating the MS value, comparing this value to a threshold, and counting threshold exceptions is the final functional unit. The MS value is found by accumulating the squared output of the last IIR cell. The number of accumulations is programmable in powers of two from 256 to 4096. The resulting sum is then divided by the number of samples accumulated to form the mean squared value. Each time the MS value exceeds a programmable threshold, an exception counter is incremented. When the count exceeds a programmable value, a system alarm is generated and the last MS value is shifted out for transmission off-chip.
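The mean squared and thresholding behavior described above can be sketched behaviorally. The Python model below is illustrative only. The function name `ms_alarm` is hypothetical, the division is modeled as a truncating right shift, and the exception counter is assumed never to be reset between blocks. None of these details is specified in this overview.

```python
def ms_alarm(samples, block_len, thresh, exnum):
    """Return (list of MS values, index of the block that raised the alarm or None)."""
    if block_len not in (256, 512, 1024, 2048, 4096):
        raise ValueError("block_len must be a power of two from 256 to 4096")
    shift = block_len.bit_length() - 1          # divide by block_len with a shift
    exceptions = 0
    ms_values = []
    alarm_at = None
    for start in range(0, len(samples) - block_len + 1, block_len):
        acc = sum(x * x for x in samples[start:start + block_len])
        ms = acc >> shift                       # mean squared value (truncated)
        ms_values.append(ms)
        if ms > thresh:
            exceptions += 1                     # threshold exception
            if exceptions > exnum and alarm_at is None:
                alarm_at = len(ms_values) - 1   # alarm once the count is exceeded
    return ms_values, alarm_at


if __name__ == "__main__":
    print(ms_alarm([10] * 768, 256, thresh=50, exnum=1))
```

Restricting the block length to powers of two is what allows the division to be carried out as a shift.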

In addition to MS threshold exceptions, a system alarm is generated if a numerical overflow is detected in the MS calculation, IIR filtering or, optionally, the gain in the conversion unit. The current MS value is also shifted out when an overflow exception occurs. The three least significant bits of the MS value are set or cleared to indicate the cause of the alarm. Once an overflow occurs, the overflow alarm flag remains set until the monitor system is reinitialized.

The functional units are interconnected via serial lines. This is possible due to the low data rate (10 kbits/second) after the second decimation. The use of serial connections reduces the wiring complexity and simplifies block placement and interconnection routing. Two lines are required, one for transmitting data to the next block and one for sending coefficients and option selection bits to the blocks during initialization. In addition, single-bit lines provide calculation overflow, reset and coefficient shift signals.

Upon initialization, the monitor accepts 216 bits of information. The allocation and purpose of these bits are summarized in Table 16 in the reverse order in which they are shifted into the functional blocks. The parameter value registers, referred to as init registers, are concatenated to form a shift register that extends across all of the functional blocks. The purposes of these registers are described in the following sections. At the end of 216 2.56 MHz clock periods, the coefficients are in place and the monitor begins operation.

Table 16. Programmable init register values.

| Functional unit | Init register    | Number of bits | Description                       |
|-----------------|------------------|----------------|-----------------------------------|
| control         | mode             | 1              | monitor or accelerometer mode?    |
| MS &            | shft             | 8              | MS gain factor (1 - 8)            |
| threshold       | exnum            | 9              | number of exceptions before alarm |
|                 | thresh           | 9              | MS threshold for exception        |
| IIR             | B2               | 12             | filter coefficient                |
| filter          | B1               | 12             | filter coefficient                |
| (×3)            | A2               | 12             | filter coefficient                |
|                 | A1               | 12             | filter coefficient                |
|                 | A0               | 12             | filter coefficient                |
| decimator &     | aa               | 1              | use antialiasing filter?          |
| antialiasing    | ovfsw            | 1              | decimator gain overflow check?    |
|                 | shval            | 7              | decimator gain factor (1 - 128)   |
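The 216-bit total can be cross-checked against the field widths in Table 16. The Python sketch below lists the fields in table order. The per-cell names such as `iir0_B2` are hypothetical labels for the three IIR coefficient sets, and the load time assumes one bit is shifted per 2.56 MHz clock period, as stated above.

```python
# Field widths from Table 16; the IIR coefficient set is repeated for three cells.
INIT_FIELDS = (
    [("mode", 1), ("shft", 8), ("exnum", 9), ("thresh", 9)]
    + [(f"iir{cell}_{name}", 12)
       for cell in range(3)
       for name in ("B2", "B1", "A2", "A1", "A0")]
    + [("aa", 1), ("ovfsw", 1), ("shval", 7)]
)


def init_chain_length(fields=INIT_FIELDS):
    """Total length in bits of the concatenated init shift register."""
    return sum(width for _, width in fields)


def init_load_time_s(clock_hz=2.56e6):
    """Time to shift in all init bits at one bit per clock period."""
    return init_chain_length() / clock_hz


if __name__ == "__main__":
    print(init_chain_length(), init_load_time_s())
```

The widths sum to 216 bits, so at 2.56 MHz the complete parameter set is loaded in roughly 84 microseconds.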

## 5.2 DECIMATOR TIMING AND CONTROL

The one-bit, 2.56 Mbits/sec output of the sigma-delta modulator represents the average value of its input. The purpose of the first-stage decimator is to convert this signal to at least 9 bits of accuracy at 40 ksamples/sec. This is accomplished by low-pass filtering and downsampling by 64. Details of the operation of the decimator are presented in Section 4.4. The hardware is shown in Figure 71. Table 17 presents the timing for control signals.

The 7-bit count register serves a dual purpose. First, its most significant bit (MSB) provides the ±1 second-order difference value for generating the low-pass filter FIR coefficients. Second, it serves as the timing source for the decimator and antialiasing filter. The count register is driven by the 2.56 MHz system clock.

The 7-bit delta and 11-bit h registers provide the mechanism for generating the FIR coefficients.

The 17-bit registers acc0 and acc1 accumulate the partial sum outputs for the FIR. For acc0, the output of h is either added or subtracted from the accumulated value depending on the  $\pm 1$  output from the SDM. For acc1, the output of h is modified before addition or subtraction to generate the second set of FIR coefficients.

Seventeen 2-to-1 multiplexers alternately select between acc0 and acc1 twice per cycling of count. By alternately choosing one of the 128-point FIR outputs every 64 clock cycles, a decimation factor of 64 is achieved.

Monitor system simulation, covered in Chapter 6, indicated the need for introducing a gain at the decimator output. This need arises when a low amplitude acceleration signal, as occurs in a defect-free bearing, is passed through a narrow-band IIR filter. The signal reduction due to the small pass-band and finite arithmetic results in zero-valued output from the filter. A gain is implemented by left-shifting acc by a programmable number of bits. The 8-bit shcnt register is programmed with a 1 for each required shift, resulting in gains that are powers of two from 1 to 128. shcnt is rotated its full width twice per cycling of count, after the next output has been latched into acc. During the rotation, the current MSB indicates whether or not a shift in acc should occur.

Figure 71. Decimator logic diagram.

Table 17. Decimator control signals.

| count (hex)  | signal    | description                                     |
|--------------|-----------|-------------------------------------------------|
| 20           | clr_acc0  | clear acc0                                      |
| 60           | clr_acc1  | clear acc1                                      |
| 20, 60       | lch_acc   | latch acc0 or acc1 into acc based on tog        |
| 20, 60       | cmp_tog   | complement toggle bit                           |
| 21-27, 61-67 | rot_shcnt | rotate shcnt                                    |
| 21-27, 61-67 | lsht_acc  | left-shift acc if MSB of shcnt is 1 for scaling |
| 04-0F, 44-4F | rsht_acc  | right shift data out for off-chip transmission  |

Shifting acc may result in an overflow. Since acc is a signed value, the overflow is detected as a difference between the current value of acc's MSB and the previous MSB. As discussed in Section 6.2, monitor system simulation indicates that, for narrow-band filtering, overflow rates as frequent as 25% did not adversely affect the MS calculation. A programmable bit (ovfsw) provides a switch for enabling or disabling the system exception on shift overflow.

When the monitor is reset, all of the data registers in the decimator unit are cleared except count, which is initialized to 32 (20h).

Figure 72 shows a timing diagram obtained by simulating the decimator hardware. The simulated output of the SDM has seven 0 values (-1) for every 1 value (+1), representing a normalized DC value of -0.75. The 17-bit output in acc, 14000h, corresponds to the -0.75. A value of 01 in shcnt causes a gain of 2. This results in an output value of 08000h, corresponding to a DC level of 0.5, which indicates an overflow. The simulation program is described in Section 5.7.
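The values quoted for the timing diagram can be reproduced with a small behavioral model. The Python sketch below is an illustration, not the simulation program of Section 5.7. All function names are hypothetical, the coefficient generator follows the counter scheme of Section 4.4, and a single 128-point FIR window is evaluated rather than the two interleaved accumulators.

```python
def h120_coefficients(n):
    """Counter-driven H120 coefficients for N1 = 2**n (see Section 4.4)."""
    n1, count, delta, h, out = 1 << n, (1 << n) // 2, 0, 0, []
    for _ in range(2 * n1):
        h += delta
        out.append(h)
        delta += -1 if (count >> n) & 1 else 1   # counter MSB gives the sign
        count = (count + 1) & ((1 << (n + 1)) - 1)
    return out


def decimate_once(sdm_bits, n=6):
    """Accumulate-and-dump over one FIR window of SDM bits (1 = +1, 0 = -1)."""
    acc = 0
    for bit, h in zip(sdm_bits, h120_coefficients(n)):
        acc += h if bit else -h                  # multiply by +/-1 is a sign change
    return acc


def to_word(value, width=17):
    """Two's complement representation of value in the given width."""
    return value & ((1 << width) - 1)


def shift_with_overflow(word, width=17):
    """Left-shift by one bit; overflow is flagged when the MSB changes."""
    msb_before = (word >> (width - 1)) & 1
    shifted = (word << 1) & ((1 << width) - 1)
    msb_after = (shifted >> (width - 1)) & 1
    return shifted, msb_before != msb_after


if __name__ == "__main__":
    bits = [1, 0, 0, 0, 0, 0, 0, 0] * 16         # one +1 for every seven -1
    word = to_word(decimate_once(bits))
    print(hex(word), shift_with_overflow(word))
```

With this input pattern the model yields 14000h in a 17-bit word, that is, -0.75 of the filter's DC gain of $2^{16}$. A single left shift gives 08000h with the overflow flag set, in agreement with the values quoted above.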

## 5.3 ANTIALIASING FILTER

The antialiasing filter reduces spectral components above 5 kHz before the final downsampling. This increases the accuracy of the output by reducing the aliased signal noise at the expense of additional harmonic distortion and circuit complexity. The antialiasing filter and subsequent filtering and feature extraction are applied only to the signal in the monitor mode. If the device is in accelerometer mode, the 40 kHz first-stage decimator output is sent directly off-chip for further filtering. This relaxes the requirements for the antialiasing filter since its output will be used only for comparative monitoring and not diagnostics.

Figure 72. Decimator timing diagram.

The selection of the type of antialiasing filter presents two significant design tradeoffs. The first is the exchange between control of the filter frequency response and filter complexity. Ideally, the antialiasing filter should be flat from DC to 5 kHz, have no transition width, and pass no signal components above 5 kHz. Such a filter would be noncausal and, hence, impossible to construct. The filter can be approximated to within a given tolerance provided there are no restrictions on the filter order and complexity. Because of the limited area on the monitor IC, the filter hardware complexity must be kept to a minimum.

The second design trade-off considers the exchange between filter characteristics for a given filter order. For example, maximizing the flatness in the pass-band decreases the roll-off in the transition region. The simulation in Chapter 2 showed that the best indicators of bearing defects are the defect repetition frequencies, which tend to be in the hundreds of hertz. Therefore, a significant amount of pass-band flatness can be exchanged for improved transition response.

Several filter architectures were examined before selecting a  $sinc^2$  filter due to its sufficient attenuation of high frequency components and simple design. The second-stage decimation factor is 4, requiring 8 coefficients for the  $sinc^2$  filter. The discrete transfer function for the FIR filter, H(z), is

$$H(z) = \frac{1}{16} \sum_{i=0}^{3} \left[ i\, z^{-i} + (4-i)\, z^{-(4+i)} \right]$$
$$= \frac{1}{16} \left\{ z^{-1} + 2z^{-2} + 3z^{-3} + 4z^{-4} + 3z^{-5} + 2z^{-6} + z^{-7} \right\}.$$

Each unit delay,  $z^{-1}$ , represents one output of the first-stage decimator.
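The coefficient set can be checked numerically. The sketch below builds the taps by convolving two length-4 boxcars, prepends the zero coefficient for $z^{0}$, and evaluates the magnitude response at the 40 kHz first-stage output rate. The function names are illustrative. It models only this FIR filter in isolation, not the full system response plotted in the figures.

```python
import cmath
import math

FS = 40_000.0  # first-stage decimator output rate in Hz (from the text)

def sinc2_taps(m=4):
    """Convolve two length-m boxcars and prepend the zero z^0 tap."""
    box = [1] * m
    tri = [0] * (2 * m - 1)
    for i, a in enumerate(box):
        for j, b in enumerate(box):
            tri[i + j] += a * b
    return [0] + tri

def gain(taps, f_hz, fs=FS):
    """Magnitude of H(z) on the unit circle, normalized by the tap sum."""
    w = 2.0 * math.pi * f_hz / fs
    h = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(taps))
    return abs(h) / sum(taps)

taps = sinc2_taps()
print(taps, sum(taps))
for f in (0.0, 500.0, 10_000.0, 20_000.0):
    print(f, gain(taps, f))
```

The taps sum to 16, which accounts for the 1/16 normalization, and the response has nulls at 10 kHz and 20 kHz, the frequencies that would otherwise fold onto DC after decimation by 4.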

Figure 73 shows the frequency response of the antialiasing filter, as well as the system frequency response with and without the antialiasing filter. The spectral folding that results after the second-stage decimation both with and without the antialiasing filter is shown in Figure 74. For low-frequency signals, the antialiasing filter provides sufficient out-of-band attenuation without severe in-band linear distortion. For signal frequencies below 500 Hz, the system response through the antialiasing filter has a worst-case out-of-band gain of -58 dB, which occurs for an input at 9.5 kHz, and a worst-case in-band gain of -0.06 dB.

Bypassing the antialiasing filter folds most of the signal below 40 kHz into the 0 - 5 kHz pass band. Obtaining the MS value of this signal yields an indication of the overall power of the bearing acceleration signal.

Figure 75 shows a logic diagram of the antialiasing filter implementation. Table 18 presents the timing for control signals.

The 3-bit register ccnt provides timing for the output functions of the antialiasing unit. It can be logically formed by adding 2 bits to count and including the MSB of count. ccnt has been provided as a separate register for clarity.

The coefficient multiplication in the antialiasing FIR filter is implemented with repetitive additions. This can easily be accomplished because all of the coefficients are positive integers no greater than 4 and because new data values enter the filter at the low rate of 40 kwords/sec. The FIR coefficients are functions of ccnt. Table 19 summarizes the coefficient values and signals A, B, C and D that indicate whether or not an addition should be performed in one of the addition slots. The signals for controlling filter addition are then

add\_aar0 =  $(A \wedge t0) \vee (B \wedge t1) \vee (C \wedge t2) \vee (D \wedge t3)$,

add\_aar1 =  $(\overline{A} \wedge t0) \vee (\overline{B} \wedge t1) \vee (\overline{C} \wedge t2) \vee (\overline{D} \wedge t3)$,



Figure 73. Frequency response with and without antialiasing.



Figure 74. Spectral folding with and without the antialiasing filter.



Figure 75. Antialiasing filter logic diagram.

Table 18. Antialiasing filter control signals.

| ccnt (hex) | count (hex)  | Signal   | Description of action                    |
|------------|--------------|----------|------------------------------------------|
| all        | 00, 40       | inc_ccnt | increment ccnt                           |
| see text   | 00-03, 40-43 | add_aar0 | add acc to aar0                          |
| see text   | 00-03, 40-43 | add_aar1 | add acc to aar1                          |
| 0, 4       | 04           | lch_out  | latch aar0 or aar1 into outreg           |
| 0, 4       | 04           | cmp_tog2 | complement tog2, selects output          |
| 0          | 20           | clr_aar0 | clear aar0                               |
| 4          | 20           | clr_aar1 | clear aar1                               |
| 0, 4       | 20-2B        | sht_out  | right shift outreg, serial output to IIR |

In the addition control equations, t0 = 1 when count = 00h or 40h, t1 = 1 when count = 01h or 41h, t2 = 1 when count = 02h or 42h, and t3 = 1 when count = 03h or 43h.

A set of 12 2-to-1 multiplexers is used to select between the 12 most significant bits of the antialiasing filter and the output of the first-stage decimator. This allows the antialiasing filter to be selectively bypassed. The data is latched into outreg once for every fourth output value from the first-stage decimator, yielding a second-stage decimation factor of 4. The 12-bit outreg is right-shifted to provide serial output from the ADC unit into the first programmable filter unit.

When the monitor is reset, all of the data registers in the antialiasing unit are cleared.

Table 19. Antialiasing FIR coefficients.

| ccnt value | aar0 coefficient | aar1 coefficient | A | B | C | D |
|------------|------------------|------------------|---|---|---|---|
| 0          | 0                | 4                | 0 | 0 | 0 | 0 |
| 1          | 1                | 3                | 0 | 0 | 0 | 1 |
| 2          | 2                | 2                | 0 | 1 | 1 | 0 |
| 3          | 3                | 1                | 0 | 1 | 1 | 1 |
| 4          | 4                | 0                | 1 | 1 | 1 | 1 |
| 5          | 3                | 1                | 1 | 1 | 1 | 0 |
| 6          | 2                | 2                | 1 | 0 | 0 | 1 |
| 7          | 1                | 3                | 1 | 0 | 0 | 0 |

Add subsignals: A = ccnt[2], B = C = ccnt[2] $\oplus$ ccnt[1], D = ccnt[2] $\oplus$ ccnt[0].
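The relationship between ccnt, the add subsignals and the FIR coefficients can be verified with a few lines of Python. The sketch assumes, as Table 19 suggests, that aar0 receives an addition in each slot whose subsignal is 1 and aar1 in each slot whose subsignal is 0. The function names are illustrative.

```python
def add_subsignals(ccnt):
    """Return (A, B, C, D) from the 3-bit ccnt using the XOR relations."""
    b2, b1, b0 = (ccnt >> 2) & 1, (ccnt >> 1) & 1, ccnt & 1
    a = b2
    b = c = b2 ^ b1
    d = b2 ^ b0
    return a, b, c, d

def coefficients(ccnt):
    """Number of additions into aar0 and aar1 over the four slots t0..t3."""
    ones = sum(add_subsignals(ccnt))
    return ones, 4 - ones   # aar0 adds on 1s, aar1 on 0s (assumed from Table 19)

for ccnt in range(8):
    print(ccnt, add_subsignals(ccnt), coefficients(ccnt))
```

Counting the additions per value of ccnt reproduces the two interleaved coefficient sequences, which are offset from each other by four inputs.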

## 5.4 IIR FILTER

The monitor includes 3 second-order, programmable IIR filter cells. This sixth-order filter can be used to focus on frequencies likely to signify damage, such as the bearing defect frequencies, or can be used to reject signals from sources external to the bearing. A cellular design approach was used to simplify the VLSI implementation and to allow for an easy increase or decrease in the filter order by adding or removing cells.

Each filter cell implements the transfer function

$$H(z) = \frac{A_0 + A_1 z^{-1} + A_2 z^{-2}}{1 - B_1 z^{-1} - B_2 z^{-2}}.$$

This results in the difference equation, expressed in terms of the IIR cell notation, as

$$Y0 = 2 \times \{A0 \cdot X0 + A1 \cdot X1 + A2 \cdot X2 + B1 \cdot Y1 + B2 \cdot Y2\}.$$

Each of the five coefficients, A0, A1, A2, B1 and B2, is implemented as a 12-bit sign-magnitude number in the range [-1, 1). The number range required for typical analog filter coefficients with unity-gain pass bands is [-2, 2). Therefore, the coefficients are divided by 2 prior to application in the monitor, and the partial products are multiplied by 2 during filtering. Sign-magnitude numbers are used to facilitate the implementation of the multiplication.
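The effect of halving the coefficients and doubling the accumulated result can be illustrated with a floating-point sketch of one cell. This is a behavioral model only. It ignores the 12-bit quantization and overflow checks of the hardware, and the class name is an assumption made for the illustration.

```python
class IIRCell:
    """Second-order cell using halved coefficients and a final doubling."""

    def __init__(self, a0, a1, a2, b1, b2):
        # Store coefficient / 2 so that each stored value lies in [-1, 1).
        self.c = [v / 2.0 for v in (a0, a1, a2, b1, b2)]
        if any(not -1.0 <= v < 1.0 for v in self.c):
            raise ValueError("coefficient outside [-2, 2)")
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def step(self, x0):
        a0, a1, a2, b1, b2 = self.c
        y0 = 2.0 * (a0 * x0 + a1 * self.x1 + a2 * self.x2
                    + b1 * self.y1 + b2 * self.y2)
        # Shift the delay line: X0 -> X1 -> X2 and Y0 -> Y1 -> Y2.
        self.x2, self.x1 = self.x1, x0
        self.y2, self.y1 = self.y1, y0
        return y0

cell = IIRCell(1.0, 0.0, 0.0, 0.5, 0.0)   # H(z) = 1 / (1 - 0.5 z^-1)
print([cell.step(x) for x in (1.0, 0.0, 0.0, 0.0)])
```

The impulse response of this one-pole example halves at every step, as expected from the unscaled transfer function.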

In addition to the five coefficients, the filter must hold the three most recent 12-bit inputs, X0, X1 and X2, and the two most recent 12-bit outputs, Y1 and Y2. Figure 76 shows a block diagram for the IIR cell.

The IIR unit produces filtered values at the rate of 10 kwords/sec, the same rate as the converter output. The timing is controlled by the 8-bit fcount counter operating at the system clock rate of 2.56 MHz. This yields 256 clock periods per filter output. Table 20 summarizes the events in a filter cycle.



Figure 76. IIR filter block diagram.

Table 20. IIR filter timing.

| fcount (hex) | Description of action                         |
|--------------|-----------------------------------------------|
| 00 - 0C      | shift data values between registers and units |
| 20 - 3D      | Y0 ← Y0 + X0·A0                               |
| 40 - 5D      | Y0 ← Y0 + X1·A1                               |
| 60 - 7D      | Y0 ← Y0 + X2·A2                               |
| 80 - 9D      | Y0 ← Y0 + Y1·B1                               |
| A0 - BD      | Y0 ← Y0 + Y2·B2                               |
| C0           | 12-bit error check                            |

At the start of each cycle, a new 12-bit data value is serially shifted into X0 from the conversion unit. At the same time, the old X0 is shifted into X1, X1 into X2, Y1 into Y2, Y0 is shifted into Y1 and the next functional unit, and 0 is loaded into Y0.

The filter output is calculated as five separate products of a coefficient with its corresponding data value, together with an accumulation of the partial result. The multiplication in the IIR filter is implemented using the indirect (shift-and-add) method. This architecture was chosen due to its minimal hardware requirements relative to other techniques. Table 21 shows the timing for the first multiply/accumulate operation; the remaining four are identical. The left shift of prod implements the multiplication by 2 to compensate for the ½ factor in the coefficients. The design and implementation of indirect multipliers are well known and will not be covered in detail [102].

Table 21. First multiply/accumulate operation timing.

| fcount (hex) | Description of action                    |
|--------------|------------------------------------------|
| 20           | load reg, clear prod, load rsgn & csgn   |
| 21           | complement reg if negative               |
| 22           | increment reg if negative                |
| 23 - 38      | 11-bit shift-and-add multiply, overflow? |
| 3A           | left shift prod, overflow?               |
| 3B           | complement prod if rsgn⊕csgn = 1         |
| 3C           | increment prod if rsgn⊕csgn = 1          |
| 3D           | add prod to Y0, overflow?                |

The shift-and-add multiplication requires a 22-bit addition. Appendix C.2 indicates that a 21-bit ripple-carry adder is the longest allowed for a unit gate delay of 9 ns and a clock rate of 2.56 MHz. Three solutions are possible. First, change the parameters of the system; *i.e.*, decrease the gate delay or reduce the clock frequency. Second, pay the increased hardware cost for a faster device, such as a carry-completion adder. Third, allow two clock cycles for the completion of the addition before latching the results into prod. Since there are ample clock periods in the 256 period cycle, the last option is used. This results in 22 clock cycles for the shift-and-add operations.

Since the coefficients are stored in sign-magnitude form, the MSB of each value indicates the sign and the remaining 11 bits are used to determine whether or not an add operation is performed for each shift in the multiplication algorithm. The five coefficient init registers are connected to form a circular shift register. At the start of each multiplication cycle, the MSB of register A0 is stored in csgn. As prod is shifted, the concatenation of the coefficient registers is also shifted so that the current MSB of register A0 provides the add control.

Once the multiplication is completed, a left-shift of prod is performed to implement a multiply-by-2 operation to compensate for the halving of the coefficients. The 11 most significant bits are then converted to a 12-bit two's complement value based on the signs of the coefficient (csgn) and corresponding data value (rsgn). The result is a 12-bit two's complement value in the range [-1, 1).
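The multiplication sequence can be sketched at the algorithmic level as follows. The sketch assumes an MSB-first shift-and-add over the 11 magnitude bits of the coefficient, data values expressed as integers in units of 1/2048, and truncation of the low-order product bits. The function name and the truncation rule are assumptions made for the illustration and are not taken from the hardware.

```python
def shift_add_multiply(coef_word, data):
    """Multiply a 12-bit sign-magnitude coefficient by a 12-bit signed sample.

    coef_word: bit 11 is the sign, bits 10..0 the magnitude of the halved
               coefficient in units of 1/2048.
    data:      integer in [-2048, 2047], in units of 1/2048.
    Returns 2 * (halved coefficient) * data in units of 1/2048.
    """
    csgn = (coef_word >> 11) & 1
    mag = coef_word & 0x7FF
    rsgn = 1 if data < 0 else 0
    reg = -data if rsgn else data            # magnitude of the data value

    prod = 0
    for bit in range(10, -1, -1):            # 11 shift-and-add steps, MSB first
        prod <<= 1
        if (mag >> bit) & 1:
            prod += reg
    prod <<= 1                               # compensate for the halved coefficient
    prod >>= 11                              # keep the upper bits (truncation assumed)
    return -prod if csgn ^ rsgn else prod    # sign from rsgn XOR csgn

print(shift_add_multiply(0x400, 100), shift_add_multiply(0xC00, 100))
```

A stored magnitude of 400h represents a halved coefficient of 0.5, so the doubled product returns the data value unchanged, and setting the sign bit negates it.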

Next, the product is added to the 13-bit contents of Y0. The extra bit in Y0 allows the accumulated partial sum to exceed the [-1, 1) range. The final filter output, however, must fit into 12 bits.

Error checking is done at three points in the multiplication/accumulation cycle. Unsigned addition overflow checks are made at the end of the indirect multiplication and after the subsequent multiplication-by-2 operation. A 13-bit signed addition error check is made after the accumulation operation. After the final multiplication/accumulation, a 12-bit signed error check is made on Y0. Any of these errors will set the overflow flag, which causes a system exception. The overflow flag is cleared on a monitor reset.

When the monitor is initialized, the five 12-bit coefficient registers are loaded by serially shifting in values as described in Section 5.1. All of the data registers are initialized to zero.

## 5.5 MS CALCULATION AND THRESHOLDING

The MS and thresholding unit performs the final feature extraction and decision functions. The output of the last IIR filter stage is squared and summed. When a programmable number of inputs have been accumulated, the total is divided by the number of values to obtain the mean squared value. This implements the equation

$$MS = \frac{1}{N} \sum_{i=0}^{N-1} x_i^2.$$

The MS value is compared to a programmable threshold value to indicate if an exception has occurred.

When the number of exceptions exceeds a programmable amount, an alarm signal is generated. An alarm may also be generated by numerical overflows in any of the previous units (converter, IIR) or in this unit. At each alarm, a 12-bit word is transmitted off-chip through the interface unit. The first 9 bits contain the most significant bits of the most recently calculated MS value. The remaining 3 bits indicate which of the possible sources caused the alarm.

Figure 77 shows a register diagram of the MS and thresholding unit, Table 22 shows the timing cycle for each data input, and Table 23 shows the timing for MS calculation and thresholding.

An 8-bit counter, cntr1, counts the 2.56 MHz clock periods to maintain control between each new data value. Overflows of cntr1 are counted in the 12-bit cntr2 register. This counter indicates the accumulation of values for the MS calculation.



Figure 77. MS calculation and thresholding unit circuit.

Table 22. Accumulation cycle timing.

| cntr1 (hex) | Description of action                            |
|-------------|--------------------------------------------------|
| 00 - 0B     | Shift in x from third IIR cell                   |
| 0C          | $x \leftarrow \overline{x} + 1$ if x is negative |
| 25          | $x^2 \leftarrow x \times x$                      |
| 28          | acc ← acc + x²                                   |

Table 23. MS and threshold timing.

| cntr2 (hex) | cntr1 (hex) | Action                                                                                        |
|-------------|-------------|-----------------------------------------------------------------------------------------------|
| FFF         | 80 - 8F     | rotate shft, shift acc based on shft[0]                                                       |
| FFF         | C0          | latch c_over                                                                                  |
| FFF         | C0 - D0     | threshold comparison, shift acc, rotate thresh                                                |
| FFF         | D1          | increment exnum on MS > thresh                                                                |
| FFF         | D2          | latch alarm causes into acc_ext[2:0]                                                          |
| 000         | 00 - 0B     | shift out {acc[6:0], acc_ext[4:0]}                                                            |
| 000         | 0C          | initialize cntr2, t_greater, a_greater and c_over; if e_over set, initialize excnt and e_over |

At the start of each data cycle (every 100  $\mu$ s), the 12-bit output from the last IIR cell is shifted into x. The absolute value of the contents of x is obtained next. The 12-bit result is then squared in a shift-and-add multiplier similar to the one discussed in Section 5.4 and placed in  $x^2$ . The most significant 12 bits of the product are added to the 24-bit accumulator acc. The size of acc allows for up to 4096 values of  $x^2$  to be accumulated without any chance of overflow.
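The accumulator sizing can be checked with a short sketch. It assumes that each 12-bit sample is an integer in [-2048, 2047] and that the upper 12 bits of a 24-bit square are obtained by a 12-bit right shift. Both the scaling and the function name are assumptions made for the illustration.

```python
ACC_BITS = 24
MAX_VALUES = 4096

def accumulate_squares(samples):
    """Sum the upper 12 bits of each squared 12-bit sample."""
    acc = 0
    for x in samples:
        mag = -x if x < 0 else x             # absolute value of x
        sq = (mag * mag) >> 12               # upper 12 bits of a 24-bit product (assumed)
        acc += sq
    return acc

# Worst case for a 12-bit addend: 4096 additions of FFFh still fit in 24 bits.
print(MAX_VALUES * 0xFFF < (1 << ACC_BITS))
# A constant half-scale input averaged over 256 values (division by a shift of 8).
print(accumulate_squares([1024] * 256) >> 8)
```

Because the largest 12-bit addend is FFFh, 4096 additions total at most 4096 × 4095, which is below 2^24.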

The number of values accumulated is set by cntr2. When cntr2 overflows, acc is right-shifted to implement the division by N required in the MS calculation. Both actions are influenced by the 4 most significant bits of the 8-bit register shft. After each MS calculation (acc shift sequence), shft[7:4] is transferred into the most significant bits of cntr2, preloading a count value. When cntr2 overflows, shft is rotated, with each 0 that appears in the MSB position indicating a right-shift of acc. Table 24 summarizes the effects due to the 5 allowed values of shft[7:4].

The 4 least significant bits of shft provide for optional right-shifting of acc. This scaling implements a trade-off between range and resolution in the 9-bit comparison operation that follows. The effects of shft[3:0] are shown in Table 25. From Section 4.3, an input of -10 g acceleration corresponds to -2.91 V, or 58.2% of the full-scale -5 V limit of the ADC. This implies that the maximum output of the ADC corresponds to -17.2 g. The maximum MS value is then  $295.22 g^2$  (25  $V^2$ ).

At the end of the acc shift operations controlled by shft, the least significant of the 9 bits of MS to be used for threshold comparison is located in acc[6], the MSB in acc[14]. It is possible to have bit acc[15] set (and bits acc[14:6] cleared) if all the input values for x were 800h. This can only occur if the electronics fail, since the offset reduction filter discussed in Section 3.6.3 eliminates any DC signal components prior to the sigma-delta modulator. Bit acc[19] is latched into the comparison overflow flag, c\_over, to indicate an error. Discarding the MSB and including the next least significant bit introduces an implicit gain of 2 in the resulting MS value shifted out of the monitor.

Table 24. Effects of shft[7:4].

| shft[7:4] | Number of shifts | Division factor (N) | cntr2 starting value |
|-----------|------------------|---------------------|----------------------|
| 1111      | 0                | 256                 | 3840                 |
| 1110      | 1                | 512                 | 3584                 |
| 1100      | 2                | 1024                | 3072                 |
| 1000      | 3                | 2048                | 2048                 |
| 0000      | 4                | 4096                | 0                    |
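The entries of Table 24 follow directly from preloading shft[7:4] into the top four bits of the 12-bit cntr2. A minimal sketch, with an illustrative function name, is:

```python
def ms_window(shft_hi):
    """Return (right_shifts, N, cntr2_start) for a 4-bit shft[7:4] value."""
    cntr2_start = shft_hi << 8                   # preload the top 4 bits of cntr2
    n = (1 << 12) - cntr2_start                  # counts left before cntr2 overflows
    right_shifts = 4 - bin(shft_hi).count("1")   # one right shift per 0 bit
    return right_shifts, n, cntr2_start

for value in (0b1111, 0b1110, 0b1100, 0b1000, 0b0000):
    print(format(value, "04b"), ms_window(value))
```

For the five allowed values, N equals 256 multiplied by two for each right shift, so the number of shifts and the number of accumulated values stay matched.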

Table 25. Effects of shft[3:0].

| shft[3:0] | Number of shifts | Gain factor | Maximum (g<sup>2</sup>) | Resolution (g<sup>2</sup>) |
|-----------|------------------|-------------|-------------------------|----------------------------|
| 1111      | 0                | 16          | 1.15                    | 2.26×10<sup>-3</sup>       |
| 1110      | 1                | 8           | 4.61                    | 9.03×10<sup>-3</sup>       |
| 1100      | 2                | 4           | 18.45                   | 3.61×10<sup>-2</sup>       |
| 1000      | 3                | 2           | 73.82                   | 1.44×10<sup>-1</sup>       |
| 0000      | 4                | 1           | 295.26                  | 5.78×10<sup>-1</sup>       |

For programmable gain factors above 1, (2, 4, 8 or 16) it is possible that the shifted acc has a 1 in bits acc[18:15]. These bits are ORed together with acc[19] to indicate a comparison overflow, an indication that the scaled MS value exceeds the limit of the 9-bit comparator.

A bitwise comparison of acc[14:6] with the 9-bit programmable thresh register is next performed. This is accomplished by shifting successively significant bits of acc through acc[6] while rotating corresponding bits of thresh through thresh[0]. As each bit pair is compared, two latches keep track of whether the MS value is greater (a\_greater = 1), the threshold value is greater (t\_greater = 1) or the values are equal (both latches are 0).
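The two-latch serial comparison can be modeled as below. The sketch assumes the bits are presented least significant first and that any unequal bit pair overwrites both latches, so the most significant difference decides the result. The update rule and function name are assumptions consistent with the description, not a transcription of the logic.

```python
def serial_compare(ms, thresh, bits=9):
    """Compare two unsigned words one bit pair at a time, LSB first."""
    a_greater = t_greater = 0
    for i in range(bits):
        a_bit = (ms >> i) & 1
        t_bit = (thresh >> i) & 1
        if a_bit != t_bit:                   # a later (more significant) difference wins
            a_greater, t_greater = a_bit, t_bit
    return a_greater, t_greater

print(serial_compare(5, 3), serial_compare(3, 5), serial_compare(7, 7))
```

Only one storage bit per outcome is needed, since no earlier bit can outweigh a difference found at a more significant position.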

At the end of the comparison, the 9-bit MS value is in acc[6:0] and the two most significant bits of the accumulator extension register, acc\_ext[4:3].

If a\_greater is 1 at the end of the comparison, the MS value is greater than the threshold and the 10-bit exception count, excnt, is incremented. If this causes excnt to overflow, an exception overflow (e\_over) is generated. The excnt register, therefore, is initialized with a 10-bit value that is n less than the overflow value, $2^{10}$, where n is the number of exceptions before overflow. This value can be found by taking the two's complement of n.
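The preload value for the exception counter can be computed as shown below. The sketch assumes the 10-bit width given for excnt, which agrees with the preload of 1023 used in the hardware simulation to raise an alarm on every exception. The function names are illustrative.

```python
EXCNT_BITS = 10  # width stated for excnt

def excnt_initial(n):
    """Preload value such that the n-th increment overflows the counter."""
    if not 1 <= n <= (1 << EXCNT_BITS):
        raise ValueError("n out of range")
    return (-n) & ((1 << EXCNT_BITS) - 1)    # two's complement of n

def increments_until_overflow(start):
    """Count increments until the carry out of the top bit appears."""
    count, value = 0, start
    while True:
        value += 1
        count += 1
        if value >> EXCNT_BITS:
            return count

print(excnt_initial(1), increments_until_overflow(excnt_initial(5)))
```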

Any of 3 sources of overflow may cause an alarm from this unit. The first alarm cause is an exception counter overflow. The second cause is a comparison overflow. The combination of the first two overflow types implements a dual-level thresholding. The e\_over value indicates if a lower level has been exceeded a programmable number of times. The c\_over value indicates if a higher level, outside the comparison limit, has been exceeded once. The third alarm cause is an overflow from the conversion unit or any of the IIR filter units, signified as m\_over.

When an alarm occurs, the values of m\_over, c\_over and e\_over are latched into acc\_ext[2:0] respectively. The 12-bit value {acc[6:0], acc\_ext[4:0]} is then serially sent to the interface unit for transmission off-chip. This signal contains the 9 most significant bits of the compared MS value together with the status of the possible causes for the alarm that initiated the data transfer.
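The layout of the alarm word can be summarized with a pack and unpack pair. The sketch treats the word as a 12-bit integer with the 9 MS bits in the upper positions and m\_over, c\_over and e\_over in bits 2, 1 and 0. The order in which the bits appear on the serial line is not modeled, and the function names are illustrative.

```python
def pack_alarm_word(ms9, m_over, c_over, e_over):
    """Build the 12-bit alarm word: 9 MS bits followed by three cause flags."""
    if not 0 <= ms9 < 512:
        raise ValueError("MS value must fit in 9 bits")
    flags = (m_over << 2) | (c_over << 1) | e_over   # acc_ext[2:0]
    return (ms9 << 3) | flags

def unpack_alarm_word(word):
    """Return (ms9, m_over, c_over, e_over) from a 12-bit alarm word."""
    return word >> 3, (word >> 2) & 1, (word >> 1) & 1, word & 1

word = pack_alarm_word(61, 0, 0, 1)
print(hex(word), unpack_alarm_word(word))
```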

Upon initialization, the values in the init registers shft, thresh and exnum are serially loaded as described in Section 5.1. All of the data registers are loaded with 0. This implies that the first MS calculation will require 4096 values regardless of the value in shft. The MS value resulting from this first calculation is discarded. This amount of time, 0.4 seconds, allows the transient responses of the filters to decay to an insignificant level.

## 5.6 CONTROL UNIT

The control unit performs three functions. First, it generates the reset and coefficient shift signals that control whether the logic is running in initialization or normal operation mode. Second, it determines which output will be transmitted off-chip based on whether the device is operating in monitor mode or in accelerometer mode. Finally, the control unit is responsible for interfacing the monitor system to the communication system that connects to the external interface unit.

The design of the logic for this last function depends heavily on the design of the communication logic and the transmission methodology. A simple buffered serial scheme was chosen to demonstrate a possible implementation. Figure 78 shows a diagram of the control unit.

When power is first applied to the IC, all of the registers are cleared. This sets the serial data and clock lines to input. The communication logic on the transmitting side synchronously sends 12 bits into the transfer register. Each bit causes an increment of the modulo-12 counter dcntr. When dcntr overflows, the contents of transfer are loaded in parallel into buffer, cdat is asserted to indicate the presence of a coefficient word, and cshift is asserted. This begins the serial shifting of the 12 bits through the init registers of the various units as described in Section 5.1. As each bit is shifted through mode, the modulo-12 counter bcntr is incremented.

Figure 78. Control unit diagram.

When bcntr overflows, cshift is cleared, cdat is cleared and the counter ccntr is incremented. The modulo-18 counter ccntr indicates the number of 12-bit coefficient words that have been shifted into init registers. When ccntr overflows, reset is set, causing the device to begin normal operation; in\_out is set, placing the serial data and clock lines in output mode; and scntr begins incrementing with each 2.56 MHz clock period.

The init register mode serves two functions. First, it indicates whether the output to be transmitted off-chip comes from the conversion unit (raw acceleration data) or from the MS calculation and threshold unit (MS value). Second, it tells the control hardware how often to expect data. In accelerometer mode, data is serially shifted into buffer when scntr has values of 24h - 2Fh, 64h - 6Fh, A4h - AFh, or E4h - EFh. This results in 12-bit values at 40 kwords/sec, the data rate before second-stage decimation. In monitor mode, data is serially shifted into buffer when scntr has values between 00h and 0Bh, and the alarm line from the thresholding unit is asserted.

Each time a bit is serially shifted into buffer from either the converter unit or the threshold unit, bcntr is incremented. When bcntr overflows, buffer is loaded in parallel into transfer and tdat is asserted. Each system clock period while tdat is asserted causes a clock pulse to be fed down the serial clock line and causes dcntr to be incremented. The transfer register is also left-shifted each period, sending the next bit down the serial data line. When dcntr overflows, tdat is cleared.

The purpose of the digital hardware simulation described here is to test the connection and control structure of the logic design. Due to the length of time required to perform a hardware simulation run, obtaining results with simulated bearing acceleration data was not feasible. Chapter 6 presents a system simulation, including an algorithmic simulation of the digital hardware in C, that produces outputs similar to actual applications of the monitor.

The logic described in this chapter was simulated using the Verilog-XL digital simulator from Cadence Design Systems. Verilog-XL uses the Verilog Hardware Description Language, which is capable of both behavioral and structural modeling. Verilog HDL constructs allow for algorithmic level, register transfer level (RTL), gate-level and switch-level simulation [103].

A hierarchical model of the logic was constructed to facilitate task partitioning and information flow. Figure 79 shows the simulation model hierarchy. Connections between the control unit module and the converter, IIR filter and thresholding leaf modules are shown in Figure 70.

The converter module, dec.v, is implemented at the register transfer level. It accepts the 1-bit simulated output of the sigma-delta modulator, supplied from the testsys.v module, and implements the decimator and antialiasing filter.

The IIR filter module, *iir.v*, is also an RTL model. Three instances of *iir.v* implement the sixth-order programmable filter.

The MS calculation and thresholding module, sqsm.v, is a multiple-level module. The squaring operation is implemented at the algorithmic level due to its similarity to the multiplier in iir.v. The remainder of the module is an RTL model.



Figure 79. Simulation model hierarchy.

The control module, control.v, is implemented as an intermediate level module because it must connect to each of the leaf modules. All of control.v is implemented at the register transfer level except the serial data and clock connections, which are implemented at the switch level due to their bi-directional information flow.

The top-level module, testsys.v, serves 3 functions. First, it simulates the systems connected to the digital logic. This includes the sigma-delta modulator connected to the decimator and the communication, clock and power-up logic connected to the control unit. Second, it collects output for display or storage. Verilog can generate timing diagrams, as in Figure 72, or numerical data, as used in Figure 80 later in this chapter. Third, testsys.v sets up simulation parameters such as the simulation clock rate. The clock rate was chosen so that one simulation clock tick corresponds to one gate delay, or 9 ns. This results in a system clock period of 43.4 simulation ticks for a frequency of 2.56 MHz.

Several input patterns, including constants, square waves and sinusoids, together with many different coefficient patterns were used to test the operation of the digital hardware. The results of one representative test are presented here.

The ADC simulator, employed to produce the simulation results in Chapter 4, was used to produce the SDM output for a 1 kHz sine-wave input with an amplitude of one-half of full scale (8.6 g, 2.5 V).

The converter unit was initialized to bypass the antialiasing filter. A sixth-order Butterworth band-pass filter with a center frequency of 1 kHz and a bandwidth of 400 Hz was designed. The analog filter was converted to a digital filter using the bilinear transform with frequency prewarping. The calculations are developed in Appendix D. The MS value was calculated every 512 samples (51.2 ms). A threshold of 0 and an exception count of 1023 were used to cause an alarm after every MS calculation. The control unit operated in the monitor mode.




Figure 80. Digital simulation output.

Partial results of the simulation are shown in Figure 80. The figure includes the input sinusoid, ADC output at outreg in dec.v, and filter output at Y0 of iir.v3. A phase shift differentiates the input sinusoid and the decimator output, as discussed in Appendix C.7. The filter output is attenuated by the 0.983 gain of the filter at 1 kHz. The phase shift results from the delay of the filter.

The simulation was run for a 1 second period following the initial 4096 samples discarded for initializing the system filters. At every overflow of cntr2, the 12-bit value 1E9h was shifted out of the control unit. The three least significant bits indicate that a threshold overflow was the cause of the alarm. The remaining 9 bits indicate the last MS value that exceeded the threshold. The decimal output value is found by converting the hex value into an integer value and dividing by 4096. This results in an MS value of 0.119 relative to full-scale  $(11.9 g^2, 2.98 V^2)$ . The normalized theoretical value is

$$\frac{\left[(0.5)(0.983)\right]^2}{2} = 0.121.$$

The principal reason for the 1.5% difference between the theoretical and actual values is that only 9 bits of accuracy are used in the output. This introduces a relative error of up to

$$\frac{2^{12-9}}{4096}=0.002,$$

sufficient to cover the discrepancy.
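The comparison between the simulated and theoretical MS values can be reproduced numerically. The sketch below decodes the 1E9h output word both ways: dividing the whole word by 4096, as in the text, and dividing the upper 9 bits by 512 after removing the three flag bits. The function names are illustrative.

```python
def ms_from_word(word):
    """Return (value including flag bits, value from the upper 9 bits only)."""
    return word / 4096, (word >> 3) / 512

def theoretical_ms(amplitude, filter_gain):
    """Mean squared value of a sinusoid of the given normalized amplitude."""
    return (amplitude * filter_gain) ** 2 / 2

with_flags, ms_only = ms_from_word(0x1E9)
theory = theoretical_ms(0.5, 0.983)
quantum = 2 ** (12 - 9) / 4096
print(round(with_flags, 3), round(ms_only, 3), round(theory, 3), round(quantum, 3))
```

Either decoding rounds to 0.119, and the gap to the theoretical 0.121 is smaller than one step of the 9-bit output.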

## **CHAPTER 6 SYSTEM SIMULATION**

Chapter 6 describes a simulation of the bearing vibration monitor system. The simulator is divided into 3 modules. Each module is written as a separate function in C. A supervisory top-level program calls each module, allowing for multiple feature extraction passes over the same data. Section 6.1 outlines the components of the simulator. Section 6.2 presents the simulation results.

### 6.1 SIMULATOR COMPONENTS

The system simulator is composed of three modules: a bearing vibration simulator, an analog and conversion electronics simulator, and a feature extraction logic simulator. The relationship between the modules is shown in Figure 81.

The first module simulates the vibration sources and transmission paths within the rotating element bearing. The module accepts as inputs physical parameters such as the bearing geometry, housing resonances, defect types and magnitudes, and shaft speed, as well as simulation parameters such as sample frequency and number of data points. The module produces a string of 32-bit floating-point values corresponding to the amplitude of the bearing housing acceleration. A sampling rate of 2.56 Msamples per second is used to correspond with the clock rate of the digital system. Details of the bearing vibration simulation are discussed in Section 2.4.

Figure 81. Simulation modules.

The second module simulates the transducer, amplification and offset reduction circuitry, switched capacitor filter, sigma-delta modulator, and first-stage decimator. This module accepts the simulated acceleration as input. The outputs are the 17-bit two's complement integers at a simulated rate of 40 ksamples/second from the first-stage decimator. The effects of the transducer, described in Section 3.2, are simulated as a second-order low-pass filter. The transducer and the high-pass offset reduction filter are implemented as difference equations, converted to discrete filters through the bilinear transformation. The switched capacitor filter is implemented as a difference equation directly from its z-domain transfer function, described in Appendix C.4. The sigma-delta modulator and first-stage decimator simulation is identical to that used to obtain the total signal-to-noise ratio results of Chapter 4.

The third module simulates the antialiasing filter, second-stage decimator, IIR filter and MS calculation as described in Chapter 5. The module accepts the output from the second module, converting the 17-bit values into scaled, 12-bit values according to the programmable gain factor. The optional antialiasing filter is applied as programmed. The sixth-order IIR filter is algorithmically simulated using 12-bit two's complement arithmetic. The MS value is calculated by summing the squared filter outputs for a programmed number of data values. The outputs of the third module are the MS values, converted to floating point and scaled by the programmable MS and decimator gain factors. The rate of the output depends on the programmable number of samples used to generate the MS value (256 to 4096), resulting in output at a simulated rate of 39 to 2.4 values per second.
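The MS calculation and the resulting output rates can be illustrated with a short sketch. The function names are hypothetical, and dividing each sum of squares by the block length stands in for the programmable gain scaling described above. The rate calculation assumes the 40 ksamples/second first-stage output is downsampled by 4 before feature extraction.

```python
def ms_output_rate(first_stage_rate_hz, second_stage_ratio, n_squares):
    """MS values produced per second for a block length of n_squares."""
    return first_stage_rate_hz / second_stage_ratio / n_squares

def mean_square_blocks(samples, n):
    """Return one mean squared value per complete block of n samples."""
    out = []
    for start in range(0, len(samples) - n + 1, n):
        block = samples[start:start + n]
        out.append(sum(v * v for v in block) / n)
    return out

fastest = ms_output_rate(40e3, 4, 256)    # shortest block length
slowest = ms_output_rate(40e3, 4, 4096)   # longest block length
```

Under these assumptions the two rates come out at about 39 and 2.4 values per second, in agreement with the range quoted above.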

### 6.2 SIMULATION RESULTS

A set of 14 simulation runs was performed using parameters from the type NSK / NTN 30204 tapered roller bearing described in Section 2.4.4. The first run simulates a defect-free bearing. The remaining 13 runs simulate an increasing outer race defect, taking the bearing through acceptable vibration limits and into imminent failure as estimated in Section 2.5.2. For each run, 4.2 million data points were generated in the bearing vibration module.

The converter output from each of the 14 runs was used in several tests with different programmable coefficient values. All of the tests used 512 squared filter outputs to calculate the MS value. This resulted in 16 MS values per test. Test variations include no filtering, antialiasing filter only, and band-pass filtering around the defect frequency (207 Hz) both with and without antialiasing filtering. The results of the tests are shown in Figure 82. The values for each curve represent the average of the 16 MS outputs, normalized to the minimum defect magnitude MS value for that set of coefficients. The comprehensive vibration limits, discussed in Section 2.3.4, are based on the magnitude of the bearing acceleration signal.

The heavy line shows the MS calculated for all data values in the bearing vibration module. These values represent the "actual" mean squared housing accelerations, low-pass filtered at 5 kHz.

The output values for the all-pass IIR filter both with and without the antialiasing filter closely follow the bearing vibration module MS signal. The test without antialiasing has a unity gain for both the decimator output and the MS value, and is shown as a dotted line on the graph. The test with antialiasing has a gain of 1 at the decimator and a gain of 8 for the MS value, and is shown as a dashed line on the graph.





Figure 82. Comparison of feature extraction parameters. Key to comprehensive vibration limits: A = no fault, B = acceptable, C = marginal, D = failure probable, E = failure imminent.

Tests were also conducted using a band-pass filter with a center frequency of 200 Hz and a bandwidth of 300 Hz. The center frequency was chosen to match the outer race defect frequency (207 Hz). The bandwidth covers the inner race and ball-pass defect frequencies (293 Hz and 189 Hz), since it is not possible to predict the type of failure before it occurs. Discrete coefficients were obtained for a sixth-order Butterworth band-pass filter through the use of the bilinear transform. Details of the filter design are in Appendix D. Eight tests were conducted, each with the antialiasing filter and an MS gain value of 16. The decimator gain factors ranged between 1 and 128. The results with a decimator gain of 16 are shown as a light solid line in Figure 82. This curve shows the sharpest increase in MS values through the acceptable range, with an 8-times increase in magnitude between the zero defect MS value and the marginal limit. By varying the decimator output gain, the region of sharp increase can be shifted with respect to the relative defect magnitude. This is shown in Figure 83, which graphs the MS outputs with the antialiasing and band-pass filters for various decimator gain factors.

From Figure 83, it is noted that the slope of the average MS curve decreases with increasing relative defect magnitude. This is due to an increase in the percentage of 12-bit first-stage decimator outputs that overflow. Since this overflow is a modulo operation (bits over 12 are discarded), excessive overflow makes the output behave like a uniformly distributed random source, hence the flattening of the MS curves. Table 26 lists the decimator gain value that produces the maximum average MS value out of each of the tests with band-pass and antialiasing filtering for a given defect level. The table shows that overflow rates as high as 25% produce average MS values greater than those produced by decreased gains. This indicates that decimator output overflow rates as high as 25% are acceptable.
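The modulo overflow can be illustrated with a short sketch. The function names are hypothetical, and the scaling is simplified to an integer multiplication by the gain. The actual mapping from 17-bit to 12-bit values is described in Chapter 5 and may differ.

```python
def wrap_to_bits(value, bits=12):
    """Discard all bits above `bits` and reinterpret the rest as two's complement."""
    mask = (1 << bits) - 1
    v = value & mask
    if v >= 1 << (bits - 1):
        v -= 1 << bits
    return v

def overflow_fraction(samples, gain, bits=12):
    """Fraction of scaled samples that fall outside the signed `bits`-bit range."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    n_over = sum(1 for s in samples if not lo <= s * gain <= hi)
    return n_over / len(samples)

# A value just above full scale wraps to the most negative code rather than saturating.
wrapped = wrap_to_bits(2048)
```

Because the wrapped value bears no simple relation to the true amplitude, a high overflow rate adds a noise-like component to the signal, which is consistent with the flattening described above.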



Figure 83. Comparison of computed MS for various decimator gains. Key to comprehensive vibration limits: A = no fault, B = acceptable, C = marginal, D = failure probable, E = failure imminent.



Table 26. Maximum average MS values for band-pass filter.

| Relative defect magnitude | Gain for maximum MS | Maximum average MS (g<sup>2</sup>) | % overflow of 12-bit decimator |
|---------------------------|---------------------|------------------------------------|--------------------------------|
| 0.0                       | 32                  | 7.86×10<sup>-5</sup>               | 17.4                           |
| 0.0001                    | 32                  | 8.88×10<sup>-5</sup>               | 18.6                           |
| 0.00018                   | 32                  | 9.07×10<sup>-5</sup>               | 19.6                           |
| 0.00032                   | 32                  | 9.55×10<sup>-5</sup>               | 21.4                           |
| 0.00056                   | 32                  | 9.64×10<sup>-5</sup>               | 24.6                           |
| 0.001                     | 16                  | 1.32×10<sup>-4</sup>               | 4.0                            |
| 0.0018                    | 16                  | 2.25×10<sup>-4</sup>               | 9.2                            |
| 0.0032                    | 16                  | 3.28×10<sup>-4</sup>               | 18.8                           |
| 0.0056                    | 8                   | 6.40×10<sup>-4</sup>               | 6.7                            |
| 0.01                      | 8                   | 1.35×10<sup>-3</sup>               | 21.1                           |
| 0.018                     | 4                   | 4.70×10<sup>-3</sup>               | 12.2                           |
| 0.032                     | 2                   | 1.72×10<sup>-2</sup>               | 0.0                            |
| 0.056                     | 2                   | 4.40×10<sup>-2</sup>               | 0.0                            |
| 0.1                       | 1                   | 9.99×10<sup>-2</sup>               | 0.0                            |

A final concern is the range of MS values for a given defect magnitude and set of monitor parameters. Figure 84 shows the range of values corresponding to each of the relative defect magnitudes for decimator gains of 4, 8 and 16. Ranges that extend to the bottom of the graph indicate MS values of 0.

The monitor configured with a gain of 16 shows best separation of ranges in the acceptable region. The monitor programmed with a decimator gain of 8 shows best separation in the marginal region of operation. However, in both cases, there is overlap in the range of MS values for closely spaced defect magnitudes. This range of MS values for a given defect level signifies the difficulty in selecting a threshold to use as an absolute indication of bearing deterioration. Two solutions may be used to compensate for the overlapping ranges.

First, the number of IIR outputs used to calculate the MS value, N, can be increased. One possible reason for the large range of MS values is the close relation between the defect period (4.8 ms) and the MS calculation period (5.1 ms), occasionally resulting in 2 defect impacts adding to a particular MS value. This effect can be reduced by taking the same data used in the previous runs and increasing N to 2048, resulting in 4 MS values per test. The results for a configuration with band-pass filtering, antialias filtering, and decimator gains of 4, 8 and 16 are shown in Figure 85.

A second solution can be obtained by setting the threshold at the average expected MS value for a given defect level, generating an alarm when a set number of threshold exceptions have occurred, and recording the period of time between alarms. A decrease in the period indicates a change in the health of the bearing. The counting and alarm generating logic is included in the monitor as described in Section 5.5. The timing mechanism would be located in the interface unit, discussed in Section 1.3.
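The second solution can be sketched as follows. This is a minimal illustration, not the logic of Section 5.5. The function names, the reset of the counter after each alarm, and the use of MS-value indices as a time base are all assumptions made for the example.

```python
def alarm_indices(ms_values, threshold, count_limit):
    """Return the indices at which an alarm is raised.

    An alarm is raised each time `count_limit` threshold exceptions
    have accumulated; the counter is then reset (assumed behavior).
    """
    alarms = []
    count = 0
    for i, ms in enumerate(ms_values):
        if ms > threshold:
            count += 1
            if count >= count_limit:
                alarms.append(i)
                count = 0
    return alarms

def alarm_periods(alarms):
    """Intervals between successive alarms, in units of MS values."""
    return [later - earlier for earlier, later in zip(alarms, alarms[1:])]

# Hypothetical MS sequence in which exceptions become more frequent over time.
alarms = alarm_indices([0, 2, 0, 0, 2, 0, 2, 2, 2, 2], threshold=1, count_limit=2)
periods = alarm_periods(alarms)
```

In this hypothetical sequence the interval between alarms shortens as threshold exceptions become more frequent, which is the trend the interface unit would watch for.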

Figure 84. Range of MS values for band-pass filter, N = 512. Key to comprehensive vibration limits: A = no fault, B = acceptable, C = marginal, D = failure probable, E = failure imminent.



Figure 85. Range of MS values for band-pass filter, N = 2048. Key to comprehensive vibration limits: A = no fault, B = acceptable, C = marginal, D = failure probable, E = failure imminent.

### CHAPTER 7 CONTRIBUTIONS AND FURTHER RESEARCH

The purpose of this project is to examine the feasibility of using an intelligent microsensor for monitoring the health of rolling element bearings through analysis of housing acceleration vibrations. To that end, a device is designed that uses a four-post suspended mass microaccelerometer to sense acceleration vibrations. This is converted to an electrical signal through a half-active piezoresistive bridge driven with a temperature-compensated supply. The voltage output is amplified, high-pass filtered to remove any offset signal, and low-pass filtered. The result is digitized in a second-order sigma-delta modulating analog-to-digital converter. A novel first-stage decimator, located on-chip, provides a minimum resolution of 9 bits for monitoring. The output of the decimator can be transmitted off-chip for further filtering before second-stage decimation to a resolution suitable for diagnostic purposes or it can be passed to the feature extraction section. Feature extraction is accomplished by first filtering the signal in a programmable sixth-order IIR filter, then extracting the mean squared value. This value is then compared to a programmable threshold value. An alarm signal is transmitted off-chip when the threshold is exceeded a programmable number of times.

The remainder of this chapter is divided into two sections. Section 7.1 lists the major contributions of this project. Section 7.2 discusses areas for future research related to the application of intelligent sensors to bearing monitoring.

#### 7.1 CONTRIBUTIONS

This section describes five major contributions of this project: the concept of applying intelligent microsensors to machinery monitoring; a simulator for bearing housing vibrations and the results obtained from the simulation; the design of a high-sensitivity, high-resonant-frequency microaccelerometer; the design of a reduced-area first-stage decimation filter for sigma-delta modulating ADCs; and an implementation of dual precisions for monitoring and diagnosis.

### 7.1.1 Intelligent Microsensors and Machinery Monitoring

The concept of using intelligent microsensors for monitoring bearing vibrations contributes to both the fields of microsensors and of machinery monitoring and diagnostics. Continuous machinery monitoring is currently implemented with transducers on the devices under inspection, connected via multiplexers to a central processing unit. The processor extracts salient features, usually using an FFT to obtain an estimate of the spectrum. The output is then compared to programmed limits or is analyzed by a human expert. This technique suffers from several inadequacies. The cabling and processing can be expensive if a large number of sense points are required or the sense points are widely distributed. Also, if the degree of multiplexing is significantly increased to reduce the amount of cabling and hardware, the monitoring ceases to be continuous.

The advent of inexpensive processors and application-specific integrated circuits (ASICs) has made possible the development of small, rugged data analysis units placed near the sense points. However, the cost per sense point of this technique makes it prohibitive for all but highly critical or dangerous applications.

An intelligent microsensor offers the opportunity for data reduction and analysis at the sense point for normal monitoring operations, eliminating the need for a dedicated external processor. Further, the microsensor can be placed in a sense-only mode, transmitting conditioned, digitized data back to a central processor for in-depth analysis when necessary. This offers the possibility for switching between continuous low-level, low-resolution monitoring at many points without external supervision and selective high-level, high-resolution diagnostics, all at a potential cost per sense point significantly below that currently available.

This reduction in cost should occur when microsensor manufacturers begin to mass produce intelligent devices. Currently, some commercial microsensors are packaged together with signal conditioning electronics, but no devices are available that combine a transducer, conditioning and conversion electronics, and signal reduction and decision logic. The reasons for this seem to be more economic than technological. No market has yet presented itself for mass produced intelligent microsensors and, hence, industrial research and development has been slow in this area.

During the past 30 years, the costs of analog, digital and hybrid electronic devices have plummeted as the demand has increased, further increasing demand and continuing the trend. It is reasonable to assume that the cost of devices that combine micromechanical transducers, analog and digital electronics will follow a similar pattern and decrease as more applications for these devices are developed. The intelligent bearing monitor is an excellent initial concept for this process since a market for relatively high-priced machinery monitoring and diagnostic equipment already exists.

### 7.1.2 Bearing Simulator and Simulation Results

A second contribution of this work is the development of a bearing vibration simulator and the results obtained from its application. Surprisingly, a review of research in the field of bearing vibration monitoring revealed a lack of simulation-based models for analyzing monitoring methods. Only one paper discussed a simulation model for defective bearings, and this work was concerned with calculating the defect frequency, not with examining a generalized strategy for failure detection [13]. Previous research has centered on constructing fixtures for testing actual bearings under various conditions. While this technique produces results that will more accurately represent field conditions, it does not lend itself to generalization to other bearing types nor has it been used to examine the effects of monitoring equipment frequency and quantization limitations.


The bearing vibration simulation described in Chapter 2 was used to examine the relationships between various feature extraction methods and frequency. The results show a strong frequency dependency for the two most popular single-value indicators, MS and kurtosis, with MS showing greater sensitivity at the lower defect frequencies while kurtosis has greater sensitivity around the higher housing resonant frequencies. This may explain the mixed results reported in the literature, which often fails to consider the relationships between the bearing housing and defect frequencies and the sampling rate of the data acquisition equipment.

### 7.1.3 Microaccelerometer

An examination of commercially available microaccelerometers, conducted at the start of this project, revealed no devices that met sensitivity and operating frequency range requirements for bearing condition monitoring. This prompted the design of the four-post, suspended mass, piezoresistive accelerometer detailed in Chapter 3. Figure 86 presents a comparison of the resonant frequency versus sensitivity for this and commercially available piezoresistive microaccelerometers. Lines connect devices belonging to the same model (from the same manufacturer). The numbers refer to bibliographic references for the specification sheets. For comparison, the resonant frequency is used instead of the maximum operating frequency as the latter specifications were not always available and because the definition of operating frequency range differs between manufacturers. The maximum operating frequency varies from about 30% to 60% of the resonant frequency, depending on the damping ratio of the device and the manufacturer.

Figure 86. Accelerometer sensitivity vs. resonant frequency.

The transducer developed for this project has a resonant frequency of 17.7 kHz and a sensitivity of 50 $\mu$V/V/g (750 $\mu$V/g with a 15 V supply). The complete intelligent sensor is designed for a range of $\pm 10$ g and an operational frequency range of 10 to 5000 Hz. This superior combination is achieved through several factors. First, the high resonant frequency results from suspending the mass from four posts placed near the corners of the mass. This produces greater stiffness than an arrangement of a cantilevered beam or a quad beam with two beams on opposing sides.

This relatively high resonant frequency is not without cost. The increased stiffness limits the mass movement, reducing the change in resistance and, hence, decreasing the potential sensitivity. This is compensated for by the sizing of the piezoresistors, their arrangement in a half-active bridge, and the integrated BiCMOS electronics. The latter allows for amplification of the bridge output with a minimum of added external noise as compared with the off-chip electronics required by other microaccelerometers.

### 7.1.4 First-Stage Decimation Filter

A fourth contribution of this project is the development of a reduced-area first-stage decimation filter for second-order sigma-delta modulating analog-to-digital converters. A classification scheme for these filters is also presented.

To optimize the quantization noise rejection without excessive hardware, second-order SDM ADCs use $\mathrm{sinc}^3$ filters in their first-stage decimators. These filters are logically created by concatenating 3 averaging filters, each with a $\sin(f)/f$ spectrum. Traditionally, each of these 3 filters has had a length equal to the first-stage decimation ratio, $N_1$. This project examined utilizing averaging stages with lengths $\frac{1}{2}$ and $\frac{1}{4}$ of $N_1$. A classification scheme, based on the length of the averaging filters used in the $\mathrm{sinc}^3$ filter, is summarized by the expression Hijk, where i is the number of length $N_1$ averaging stages, j is the number of $\frac{1}{2}N_1$ stages, and k is the number of $\frac{1}{4}N_1$ stages. In order for Hijk to remain a $\mathrm{sinc}^3$ filter, the sum of i, j and k must be 3. The filter typically used in second-order SDM ADCs is classified as H300. Of all other possible combinations of averaging stage lengths, the H120 filter possesses the best combination of filtering characteristics and implementability. A comparison of the H300 filter with the H120 filter indicates that the H120 filter requires less area to implement, but sacrifices quantization noise rejection and antialiasing ability.
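The Hijk classification can be made concrete by building the impulse response of each filter as a cascade of three unnormalized averaging (boxcar) stages. The sketch below is an illustration under that assumption and does not represent any of the hardware architectures considered in this work. The function names are hypothetical, and $N_1 = 64$ is the first-stage decimation ratio used in the monitor.

```python
def convolve(x, h):
    """Direct-form linear convolution of two integer sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def hijk_impulse_response(n1, i, j, k):
    """Impulse response of an Hijk filter: i stages of length n1,
    j stages of length n1/2 and k stages of length n1/4, with i + j + k == 3."""
    if i + j + k != 3:
        raise ValueError("a sinc^3 filter needs exactly three averaging stages")
    if n1 % 4 != 0:
        raise ValueError("n1 must be divisible by 4")
    lengths = [n1] * i + [n1 // 2] * j + [n1 // 4] * k
    h = [1]
    for length in lengths:
        h = convolve(h, [1] * length)
    return h

h300 = hijk_impulse_response(64, 3, 0, 0)
h120 = hijk_impulse_response(64, 1, 2, 0)
```

With unnormalized stages, the H300 response has 190 taps and a DC gain of $64^3$, while the H120 response has 126 taps and a DC gain of $64 \times 32 \times 32$. The shorter response and smaller word growth are consistent with the reduced area attributed to the H120 filter.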

Three architectures for implementing each type of $\mathrm{sinc}^3$ filter were considered. Method I uses an IIR filter, downsampling to an intermediate frequency, then an FIR filter. Method II uses an FIR filter followed by an IIR filter. Method III uses an FIR filter with coefficients generated in special hardware. Methods I and III are shown to require hardware with area of order $O(\log_2(N_1))$, with Method III requiring less hardware than Method I in all cases of interest. Method II is shown to require hardware with area $O(N_1)$. A Method III H120 filter requires less area than a Method II H120 filter for any $N_1$ over 32. A Method III H300 filter requires less area than a Method II H300 filter for any $N_1$ over 64. The filter proposed for application in the intelligent vibration monitor is the Method III H120 $\mathrm{sinc}^3$ filter.

### 7.1.5 Differing Monitoring and Diagnostic Precisions

A fifth contribution is the implementation of differing numerical precisions for on-chip monitoring and off-chip diagnostic calculations. Simulation of the bearing acceleration conducted in Chapter 2 indicates the need for differing precisions in the data used for monitoring and for diagnosis. This dual precision is implemented by splitting the 256 times downsampling decimator of the analog-to-digital conversion process. The first-stage, 64 times downsampling decimator is located on-chip. When coupled with a second-order switched capacitor antialiasing filter, the system provides sufficient noise attenuation for a minimum resolution of 9 bits. The output of the first stage is selectably routed to either the on-chip feature extraction logic or transmitted off-chip. In accelerometer mode, the first-stage output is transmitted to a central processor, where a floating-point filter may be used to further eliminate quantization noise before final downsampling by 4. In monitor mode, an optional simple antialiasing filter may be applied prior to downsampling by 4 and feature extraction.

#### 7.2 FURTHER RESEARCH

The design of an intelligent microsensor for bearing health monitoring encompasses many fields, including rolling element bearing design, vibration analysis, sensor design, solid-state processing, continuum mechanics, analog electronics, digital electronics, and digital signal processing. This section presents specific areas of further research that will directly impact the future implementation of an intelligent bearing microsensor.


A first area for continued effort is the improvement of the bearing vibration simulator. The program currently uses a linear model of the housing to modify signals produced by point defects on the outer race, inner race and rolling elements. Modifications should include modeling of area defects such as spalls and roughness; nonlinear effects such as brinelling and rolling element contact with the cage; and external noise sources such as pump cavitation, gear teeth mesh and reciprocating equipment impacts. The model output should be compared to a variety of actual bearing vibration signals to verify its accuracy.

Further research needs to be conducted into a BiCMOS process that is compatible with analog and digital electronics as well as micromachining. The process must produce analog components capable of handling high power and high voltage for the bridge supply, analog components with low noise characteristics for the amplifiers, and digital components with small area and low power dissipation. All devices must also withstand the micromachining operations that follow BiCMOS processing during manufacturing, as well as the rigorous industrial operating environment.

As discussed in Chapter 1, the work done for this project is one component of a plant-wide monitoring system. One item yet to be designed is the circuitry common to all intelligent sensors in the system. This circuitry includes the temperature sensor, power supply and clock separation, and communication electronics.

The design of an accurate, inexpensive, rugged, integrated temperature compensating system is still needed. This project referred to several experimental developments but, as yet, none of these designs have been implemented in a commercial device.

In order to reduce the cost of the monitor system, the number of wires connecting a sensing IC to the Interface Unit must be kept to a minimum. A 3-wire implementation using a common wire, a communication wire, and a combined power supply and clock wire is one possibility. Further research is necessary in developing a standard for multiplexing communication, power and clock signals for remote sensing in a manufacturing environment. This research will require cooperation between intelligent sensor manufacturers, network designers and industrial users of the monitor systems.

The Interface Unit also needs to be designed. This device initializes the intelligent sensors on startup, provides the power and clock signals, and performs data collection and reduction operations when the remote sensors operate in transducer mode. The reduction of data could include implementation of the second-stage decimation filter and subsequent downsampling, obtaining an FFT of the data, and extracting machine speed and defect frequency information.

Further research is needed into the area of coordinating the information produced by multiple sensors in the same environment, known as sensor fusion. A possible product of this study could be the development of an expert system for determining optimal initialization parameters and for performing diagnosis on the resulting data.


# **APPENDICES**


# APPENDIX A BEARING MODEL PARAMETERS

Appendix A presents the parameters used in the bearing housing vibration computer simulation in Chapter 2. The bearing geometry and housing resonant values are for an NSK / NTN 30204 tapered roller bearing with an outer race point defect as discussed in [27].

Table 27. Bearing geometry parameters.

| Parameter                       | Value    |
|---------------------------------|----------|
| pitch diameter $(D_n)$          | 34.00 mm |
| ball diameter $(D_b)$           | 6.00 mm  |
| number of rotating elements (n) | 15       |
| contact angle (φ)               | 12.96°   |
| thrust factor                   | 1.00     |

Table 28. Bearing housing resonance parameters.

| Type      | Resonant frequency (kHz) | Damping ratio | Pass band gain    |
|-----------|--------------------------|---------------|-------------------|
| band-pass | 2.2                      | 0.023         | 0.15              |
| low-pass  | 11.3                     | 0.066         | 0.10              |
| low-pass  | 18.2                     | 0.087         | 0.09              |
| low-pass  | 26.8                     | 0.089         | 0.09              |
| low-pass  | 35.2                     | 0.083         | 0.10              |

Table 29. Bearing housing high-pass parameters.

| Frequency (kHz) | High frequency gain |
|-----------------|---------------------|
| 8               | 1.00                |

Table 30. Bearing contact noise parameters.

| Parameter | Value                 |
|-----------|-----------------------|
| GMAG      | 2.90×10 <sup>-5</sup> |
| nprop     | 300                   |

Table 31. Data generation parameters.

| Parameter             | Value                   |
|-----------------------|-------------------------|
| sampling frequency    | 2.56×10<sup>6</sup> Hz  |
| number of data values | 2.2×10<sup>5</sup>      |
| subsampling frequency | 1.2×10<sup>5</sup> Hz   |
| shaft speed           | 2000 rpm                |
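As a consistency check on these parameters, the outer and inner race defect frequencies can be estimated with the standard kinematic relations for a bearing with a stationary outer race and no slip. The sketch below assumes these relations match those developed in Chapter 2. The function name is hypothetical.

```python
import math

def race_defect_frequencies(shaft_rpm, n_elements, element_diam_mm,
                            pitch_diam_mm, contact_angle_deg):
    """Return (outer, inner) race defect frequencies in Hz.

    Assumes a stationary outer race, a rotating inner race and pure rolling.
    """
    shaft_hz = shaft_rpm / 60.0
    ratio = (element_diam_mm / pitch_diam_mm) * math.cos(math.radians(contact_angle_deg))
    outer = 0.5 * n_elements * shaft_hz * (1.0 - ratio)
    inner = 0.5 * n_elements * shaft_hz * (1.0 + ratio)
    return outer, inner

# Geometry from Table 27 and shaft speed from Table 31.
outer_hz, inner_hz = race_defect_frequencies(2000.0, 15, 6.00, 34.00, 12.96)
```

Under these assumptions the results are approximately 207 Hz and 293 Hz, matching the outer and inner race defect frequencies quoted in Chapter 6.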

### **APPENDIX B**

This Appendix provides details of calculations summarized in Chapter 3.

### **B.1 SUSPENDED MASS**

The calculation of the suspended silicon mass involves multiplying the volume for the frusta of two pyramids by the density of silicon. Figure 87 gives the dimensions for the suspended mass.

The formula for the volume of a pyramid is

$$V = \frac{1}{3}(base\ area) \times (height)$$
$$= \frac{1}{3}l^2 \times h.$$

When silicon is anisotropically etched, {111} surfaces dissolve at a rate several orders of magnitude less than other surfaces. If a wafer with a top surface of (100) is subjected to anisotropic etching, the resulting structure will have surface normals parallel to lines formed by connecting opposite corners of a unit cube. This implies that a pyramid formed by such a process would have a height one-half the length of its base ($h = \frac{1}{2}l$).

Figure 87. Suspended mass (not to scale).

The volume of the frustum of such a pyramid with height t can then be found by removing the volume of a pyramid of height h - t, where the length of the base of the removed pyramid is l - 2t. This results in

$$\begin{aligned}
V &= \frac{1}{3}l^2h - \frac{1}{3}(l-2t)^2(h-t) \\
  &= \frac{1}{3}t(3l^2 - 6lt + 4t^2).
\end{aligned}$$

The volume of the suspended mass is the sum of two frusta of pyramids, the first having a base of 2320 $\mu m$ and height of 390 $\mu m$ and the second having a base of 2320 $\mu m$ and a height of 10 $\mu m$. This yields

$$\begin{aligned}
V &= 1.526 \times 10^9 \ \mu m^3, \\
\rho &= 2.328 \times 10^{-12} \ g/\mu m^3 \ [105], \\
m &= \rho V = 3.553 \times 10^{-3} \ g.
\end{aligned}$$
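The arithmetic above can be reproduced with a few lines. The sketch uses the frustum volume expression derived in this section and the silicon density quoted from [105]. The function name is hypothetical.

```python
def frustum_volume_um3(base_um, thickness_um):
    """Volume of a pyramid frustum with h = l/2: V = t(3l^2 - 6lt + 4t^2)/3."""
    l, t = float(base_um), float(thickness_um)
    return t * (3.0 * l * l - 6.0 * l * t + 4.0 * t * t) / 3.0

SILICON_DENSITY_G_PER_UM3 = 2.328e-12  # value quoted from [105]

volume_um3 = frustum_volume_um3(2320, 390) + frustum_volume_um3(2320, 10)
mass_g = SILICON_DENSITY_G_PER_UM3 * volume_um3
```

To the precision quoted, this gives a volume of about $1.526 \times 10^9\ \mu m^3$ and a mass within about 0.03% of the quoted $3.553 \times 10^{-3}$ g, the small difference coming from rounding the volume before multiplying by the density.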

### **B.2** ISOTROPIC BEAM THEORY

The general equation relating deflection of a beam, y(x), to the loading function, w(x), is [38]

$$w(x) = EI \frac{d^4y}{dx^4} = -F\delta(x - L)$$

for a point force -F applied at the mass end (x = L). Solving this equation yields

$$EIy(x) + \frac{1}{6}C_1x^3 + \frac{1}{2}C_2x^2 + C_3x + C_4 = -\frac{1}{6}F(x-L)^3U(x-L).$$

The four constants of integration can be found by applying four boundary conditions for a clamped-slider beam, as shown in Figure 88. The reaction force at the clamped end must oppose the applied force for the beam to be in static equilibrium. The deflection at the slider will create moments at both attached ends. The moments are found to equal FL/2 by summing moments about the z-axis.

The first coefficient is found by noting that there is no deflection at the base side (x = 0) because the beam is clamped, forcing  $C_4$  to be zero. The slope of the beam at the clamped end is also zero, forcing  $C_3$  to be zero. Solving for the remaining two constants requires expressions for shear force, S(x) and the moment, M(x). From the elementary theory of prismatic beams,

$$S(x) = EI \frac{d^3y}{dx^3}, \qquad M(x) = EI \frac{d^2y}{dx^2}.$$

Figure 88. Clamped-sliding beam. Boundary conditions:

$$\frac{dy}{dx}\Big|_{x=0} = 0, \qquad M(L) = EI \frac{d^2y}{dx^2}\Big|_{x=L} = \frac{maL}{2}, \qquad S(0) = EI \frac{d^3y}{dx^3}\Big|_{x=0} = -F = -ma.$$

Using the moment and shear information obtained from the beam,

$$S(0^+) = -F = C_1 \implies C_1 = -ma,$$

$$M(L^-) = -\frac{1}{2}FL = C_1L + C_2 \implies C_2 = \frac{1}{2}maL.$$

The relevant relations for the beam are then

$$\begin{aligned}
\text{Shear:} \quad & S(x) = ma, \\
\text{Moment:} \quad & M(x) = ma(x - \tfrac{1}{2}L), \\
\text{Deflection:} \quad & y(x) = \frac{ma}{EI}\left(\tfrac{1}{4}Lx^2 - \tfrac{1}{6}x^3\right),
\end{aligned}$$

for $0 \le x \le L$.

If numerical values are required, the area moment of the cross section, I, and the largest distance from an outside surface to the neutral surface, known as the extreme fiber distance c, must be specified. These values are available in reference books for different cross sections. For the trapezoidal beam cross section [106],

$$c = \frac{h(a+2b)}{3(a+b)},$$

$$I = h^3 \frac{(a^2 + 4ab + b^2)}{36(a+b)},$$

where $a$ = top width (40 µm), $b$ = bottom width (60 µm), and $h$ = thickness (10 µm).
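As a quick numerical check of the two cross-section formulas, the following minimal sketch evaluates them for the stated beam dimensions. The function name `trapezoid_section` is a hypothetical helper introduced only for this illustration.

```python
def trapezoid_section(a, b, h):
    """Return (c, I) for a trapezoid: top width a, bottom width b, thickness h."""
    # Extreme fiber distance from the neutral surface.
    c = h * (a + 2 * b) / (3 * (a + b))
    # Area moment of inertia about the centroidal axis.
    I = h**3 * (a**2 + 4 * a * b + b**2) / (36 * (a + b))
    return c, I


# Beam dimensions from the text, in micrometers.
c, I = trapezoid_section(a=40.0, b=60.0, h=10.0)
print(f"c = {c:.3f} um, I = {I:.0f} um^4")
```

The results agree with the values c = 5.333 µm and I = 4111 µm⁴ used in the stress and deflection calculations later in this section.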

A function for the maximum longitudinal stress at any point along the length of the beam can be derived from the stresses in a beam subjected to pure bending, and yields a good approximation if the deflection is small. The stress is proportional to the moment at the point, M(x), and the distance from the cross-sectional centroid along y. This is maximized at any point along the length at the farthest surface from the centroid, the extreme fiber distance. The maximum longitudinal stress along the beam is then

$$\sigma_0(x) = \frac{M(x)c}{I} = \frac{mac}{I}(x - \frac{1}{2}L).$$

The absolute maximum longitudinal stress ( $\sigma_0$ ) occurs where the moment is maximized, at either attachment point. Its value is

$$\sigma_0 = \frac{M(0)c}{I} = -\frac{macL}{2I}.$$

The maximum longitudinal stress at the base side (x = 0) will be tensile if the acceleration is in the -y (down) direction and compressive if in the +y (up) direction. The maximum stress at the mass end will be equal in magnitude but opposite in sign.

One difficulty in applying the simple beam theory developed here to the actual transducer is deciding what length to use for the beam. The derivation assumes a constant cross section. However, with the transducer, the 10 µm of material at each end gradually merge into the supporting material. The longer length (90 µm) will be used in deflection and fundamental frequency calculations and the shorter length (70 µm) will be used in stress calculations because these yield worst case solutions.

Two expressions of value for design and analysis are the maximum longitudinal stress,  $\sigma_0$ , and the maximum deflection, y(L). For both quantities, there is a linear relationship with acceleration, a. Therefore, they can be expressed in terms of the number of gravitational accelerations, g, as

$$\sigma_0 = -\frac{macL_{\sigma}}{2I} = -395.7\,g \;\; \text{g}/\mu\text{m}\cdot\text{s}^2$$

and

$$y(L) = -\frac{maL_y^3}{12EI} = -7.611 \times 10^{-4}\,g \;\; \mu\text{m},$$

where

$m = \frac{1}{4}$ total mass $= 8.83 \times 10^{-4}$ g,

$L_{\sigma} = 70\ \mu\text{m}$,

$L_{y} = 90\ \mu\text{m}$,

$c = 5.333\ \mu\text{m}$,

$I = 4111\ \mu\text{m}^4$,

$E = 1.692 \times 10^8\ \text{g}/\mu\text{m}\cdot\text{s}^2$,

$a = -9.81 \times 10^6\,g\ \ \mu\text{m/s}^2$.

Here the italic $g$ is the number of gravitational accelerations and the upright g is the gram.

The system of units, grams (g), micrometers ( $\mu$ m), and seconds (s), was chosen to facilitate the finite element analysis. The isotropic modulus of elasticity, E, is the anisotropic stiffness value along the length of the beam. This value is calculated in Section B.4.
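The per-g stress and deflection can be recomputed from the listed parameters. The sketch below works with magnitudes only and uses hypothetical variable names. With the rounded inputs given above it lands within about 1% of the quoted 395.7 and 7.611 × 10⁻⁴; the small residual presumably comes from rounding in the tabulated inputs.

```python
# All quantities in the g / um / s unit system used in the text.
m = 8.83e-4      # g, one quarter of the suspended mass
a = 9.81e6       # um/s^2, magnitude of one gravitational acceleration
c = 5.333        # um, extreme fiber distance
I = 4111.0       # um^4, area moment of the cross section
E = 1.692e8      # g/(um*s^2), modulus along the beam
L_sigma = 70.0   # um, shorter length used for stress
L_y = 90.0       # um, longer length used for deflection

# Magnitude of the maximum longitudinal stress per g of acceleration.
stress_per_g = m * a * c * L_sigma / (2 * I)

# Magnitude of the deflection at the mass end per g of acceleration.
deflection_per_g = m * a * L_y**3 / (12 * E * I)

print(f"stress     ~ {stress_per_g:.1f} g/(um*s^2) per g")
print(f"deflection ~ {deflection_per_g:.3e} um per g")
```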

## B.3 RAYLEIGH'S PRINCIPLE FUNDAMENTAL FREQUENCY

An approximation of the fundamental frequency for a conservative system (no damping) can be found by equating the maximum potential energy and the maximum kinetic energy [35]. A general formulation of the kinetic energy for a beam of constant cross-section A and length L attached to a mass m is

$$K = \frac{1}{2} \int \left( \frac{\partial y}{\partial t} \right)^2 dm = \frac{1}{2} \rho A \int_0^L \left( \frac{\partial y}{\partial t} \right)^2 dx + \frac{1}{2} m \left( \frac{\partial y}{\partial t} \right)_{x=L}^2.$$

If the beam is vibrating at its fundamental resonant frequency,  $\omega_0$ , the expression for the time varying natural response of the displacement along its length can be written as

$$y(x,t) = y(x)\cos(\omega_0 t).$$

Substitution into the kinetic energy expression yields

$$K = \frac{1}{2} \rho A \omega_0^2 \sin^2(\omega_0 t) \int_0^L y^2(x) dx + \frac{1}{2} m \omega_0^2 \sin^2(\omega_0 t) y^2(L).$$

Kinetic energy is maximized when $\sin^2(\omega_0 t)$ is 1.

When the beam deflection equation developed in Section B.2 is substituted for y(x), the maximum kinetic energy is

$$K = \left(\frac{ma}{EI}\right)^{2} \left[\frac{1}{2}\rho A L \omega_{0}^{2} \frac{7}{1728} L^{6} + \frac{1}{2}m \omega_{0}^{2} \frac{12}{1728} L^{6}\right]$$

$$= \frac{1}{3456} \left(\frac{\omega_{0} ma}{EI}\right)^{2} L^{6} \left[7\rho A L + 12m\right]$$

$$= \frac{1}{3456} \left(\frac{\omega_{0} ma}{EI}\right)^{2} L^{6} \left[5.70 \times 10^{-7} + 1.07 \times 10^{-2}\right].$$

From this expression, it can be seen that the contribution due to the beam is over four orders of magnitude less than that due to the mass.

Ignoring the effect of the beam, the maximum potential energy is

$$P = \frac{1}{2} mgy(L) \cos(\omega_0 t) \Big|_{t=0} = \frac{1}{2} mgy(L).$$

The potential energy will be maximized one-quarter cycle away from when the kinetic energy is maximized. If the system is conservative, these two maximum values will be the same. Equating the two expressions yields

$$K = P$$

$$\frac{1}{2}\omega_0^2 my^2(L) = \frac{1}{2}mgy(L).$$

Using a length of 90 µm, the fundamental frequency is then

$$\omega_0 = \sqrt{\frac{g}{y(L)}} = \sqrt{\frac{12EI}{L^3 m}} = 1.14 \times 10^5 \text{ rad/s},$$

or

$$f_0 = 18.1 \text{ kHz}.$$
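The closed-form result is easy to reproduce numerically. This is a minimal sketch using the parameter values listed in Section B.2; the variable names are chosen here for illustration.

```python
import math

E = 1.692e8   # g/(um*s^2), modulus along the beam
I = 4111.0    # um^4, area moment of the cross section
L = 90.0      # um, longer beam length
m = 8.83e-4   # g, one quarter of the suspended mass

# Rayleigh estimate with the beam mass neglected.
omega0 = math.sqrt(12 * E * I / (L**3 * m))   # rad/s
f0 = omega0 / (2 * math.pi)                   # Hz

print(f"omega0 ~ {omega0:.3e} rad/s, f0 ~ {f0 / 1e3:.1f} kHz")
```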

#### **B.4 MATERIAL COEFFICIENTS**

This section develops the orthotropic constants required for the finite element analysis program NISA II from the experimentally obtained stiffness coefficients for silicon. The development requires three steps. First, the stiffness coefficients will be inverted to obtain compliance coefficients. Second, the compliance coefficients must be transformed from the crystallographic axes to the model axes. Third, the coefficients must be converted to the form required by NISA II.

Since the formula used for the conversion requires compliance terms rather than stiffness terms, the stiffness matrix must be inverted. This can be seen by examining the matrix forms of the compliance and stiffness relations.

$$\begin{aligned} \text{stiffness relation:} \quad & \sigma_i = C_{ij}\,\varepsilon_j, \\ \text{compliance relation:} \quad & \varepsilon_k = S_{kl}\,\sigma_l, \\ \Rightarrow \quad & S_{kl}\,C_{lj} = \delta_{kj}, \quad \text{i.e. } \mathbf{S}\mathbf{C} = \mathbf{I}. \end{aligned}$$

The stiffness coefficients have been determined experimentally [42]. Solving the matrix inversion for individual compliance terms yields

$$S_{11} = \frac{C_{11} + C_{12}}{C_{11}(C_{11} + C_{12}) - 2C_{12}^{2}} = 0.768 \times 10^{-8} \,\mu\text{m} \cdot \text{s}^{2}/\text{g},$$

$$S_{12} = \frac{-C_{12}}{C_{11}(C_{11} + C_{12}) - 2C_{12}^{2}} = -0.214 \times 10^{-8} \,\mu\text{m} \cdot \text{s}^{2}/\text{g},$$

$$S_{44} = \frac{1}{C_{44}} = 1.256 \times 10^{-8} \,\mu\text{m} \cdot \text{s}^{2}/\text{g}.$$
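The three inversion formulas can be checked with a short calculation. The stiffness values below are an assumption: they are commonly quoted room-temperature constants for silicon, converted to the g/µm·s² unit system, and are not listed in this section (the text takes its values from [42]).

```python
# Assumed silicon stiffness constants, in units of 1e8 g/(um*s^2).
# (1 Pa = 1e-3 g/(um*s^2), so 1.657e11 Pa -> 1.657e8 g/(um*s^2).)
C11, C12, C44 = 1.657, 0.639, 0.796

# Common denominator of the cubic-crystal inversion formulas.
denom = C11 * (C11 + C12) - 2 * C12**2

# Compliances, in units of 1e-8 um*s^2/g.
S11 = (C11 + C12) / denom
S12 = -C12 / denom
S44 = 1.0 / C44

print(f"S11 = {S11:.3f}, S12 = {S12:.3f}, S44 = {S44:.3f}  (x 1e-8 um*s^2/g)")
```

With these assumed inputs the results round to the compliance values quoted above.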

In order to transform the coefficients, they must be in tensor notation, not the abbreviated matrix notation. The algorithm for converting between the two forms is [107]

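$$\begin{aligned} S_{ijkl} &\Leftrightarrow S_{mn} && \text{when } m \text{ and } n \text{ are 1, 2 or 3,} \\ 2S_{ijkl} &\Leftrightarrow S_{mn} && \text{when either } m \text{ or } n \text{ is 4, 5 or 6,} \\ 4S_{ijkl} &\Leftrightarrow S_{mn} && \text{when both } m \text{ and } n \text{ are 4, 5 or 6.} \end{aligned}$$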

The last two conditions contain the factor of ½ for converting between engineering and tensor strain.

By definition, the crystallographic axes correspond to the  $\langle 100 \rangle$  directions. These will be designated by the right-hand system 1, 2, 3. When the device is etched, the beams are formed along  $\langle 110 \rangle$  directions. To facilitate the model, an alternate coordinate system was developed so that the x- and z-axes are parallel to major device features. This requires transforming all material and field tensors as well. The new axes will be designated 1', 2' and 3' corresponding to the x-, y- and z-axes, respectively. Figure 89 shows the crystallographic and transformed axes, and the direction cosines required for the transformations.

The formula for transforming a fourth order tensor is

$$S'_{ijkl} = a_{im}a_{jn}a_{ko}a_{lp}S_{mnop},$$

where  $S'_{ijkl}$  is the tensor in the new coordinate system. A sample calculation is shown below with zero terms omitted.

$$a_{ij} = a_{new, old} = \begin{bmatrix} \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & 1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}$$

$$(0, 0, 1) \quad (1, 0, 0) \quad (1, 0, 1)$$

Figure 89. Crystallographic-to-model axes transformation.

$$\begin{split} S'_{1111} &= \sum_{m=1}^{3} \sum_{n=1}^{3} \sum_{o=1}^{3} \sum_{p=1}^{3} a_{1m} a_{1n} a_{1o} a_{1p} S_{mnop} \\ &= a_{11} a_{11} a_{11} a_{11} S_{1111} + a_{11} a_{11} a_{13} a_{13} S_{1133} + a_{11} a_{13} a_{11} a_{13} S_{1313} \\ &\quad + a_{11} a_{13} a_{13} a_{11} S_{1331} + a_{13} a_{11} a_{11} a_{13} S_{3113} + a_{13} a_{11} a_{13} a_{11} S_{3131} \\ &\quad + a_{13} a_{13} a_{11} a_{11} S_{3311} + a_{13} a_{13} a_{13} a_{13} S_{3333} \\ &= \frac{1}{4} S_{1111} + \frac{1}{4} S_{1133} + \frac{1}{4} S_{1313} + \frac{1}{4} S_{1331} + \frac{1}{4} S_{3113} + \frac{1}{4} S_{3131} + \frac{1}{4} S_{3311} + \frac{1}{4} S_{3333} \\ &= \frac{1}{2} S_{1111} + \frac{1}{2} S_{1122} + S_{1212}, \\ S'_{11} &= \frac{1}{2} S_{11} + \frac{1}{2} S_{12} + \frac{1}{4} S_{44}. \end{split}$$

Following is the set of coefficients that results, expressed in matrix notation:

$$S'_{11} = S'_{33} = \frac{1}{2}S_{11} + \frac{1}{2}S_{12} + \frac{1}{4}S_{44} = 0.591 \times 10^{-8} \, \mu \text{m·s}^2/\text{g},$$

$$S'_{22} = S_{11} = 0.768 \times 10^{-8} \, \mu \text{m·s}^2/\text{g},$$

$$S'_{13} = \frac{1}{2}S_{11} + \frac{1}{2}S_{12} - \frac{1}{4}S_{44} = -0.037 \times 10^{-8} \, \mu \text{m·s}^2/\text{g},$$

$$S'_{12} = S'_{23} = S_{12} = -0.214 \times 10^{-8} \, \mu \text{m·s}^2/\text{g},$$

$$S'_{44} = S'_{66} = S_{44} = 1.256 \times 10^{-8} \, \mu \text{m·s}^2/\text{g},$$

$$S'_{55} = 2(S_{11} - S_{12}) = 1.964 \times 10^{-8} \, \mu \text{m·s}^2/\text{g}.$$
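The full transformation can be carried out by brute force as a cross-check of the closed-form expressions. The sketch below is an illustration only: it expands the matrix compliances to tensor components, applies the four direction cosines, and contracts back to matrix notation. The helper names (`tensor`, `rotated`, `factor`) are hypothetical, and the 0-based index ordering 11, 22, 33, 23, 13, 12 for matrix indices 1 to 6 is the usual convention, assumed here.

```python
import math
from itertools import product

# Compliances on the crystallographic axes, in units of 1e-8 um*s^2/g.
S11, S12, S44 = 0.768, -0.214, 1.256

# Tensor index pairs (0-based) for matrix indices 1..6: 11, 22, 33, 23, 13, 12.
PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
VOIGT = {}
for idx, (i, j) in enumerate(PAIRS):
    VOIGT[(i, j)] = idx
    VOIGT[(j, i)] = idx


def factor(m, n):
    """1, 2 or 4 depending on how many of the matrix indices are 4, 5 or 6."""
    return (2 if m >= 3 else 1) * (2 if n >= 3 else 1)


def cubic_matrix():
    """6x6 compliance matrix of a cubic crystal on its own axes."""
    S = [[0.0] * 6 for _ in range(6)]
    for m in range(3):
        for n in range(3):
            S[m][n] = S11 if m == n else S12
    for m in range(3, 6):
        S[m][m] = S44
    return S


def tensor(S, i, j, k, l):
    """Tensor component S_ijkl recovered from the matrix form."""
    m, n = VOIGT[(i, j)], VOIGT[(k, l)]
    return S[m][n] / factor(m, n)


r = 1.0 / math.sqrt(2.0)
A = [[r, 0.0, r], [0.0, 1.0, 0.0], [-r, 0.0, r]]   # a[new][old]


def rotated(S, m, n):
    """Matrix component S'_mn (0-based) on the rotated axes."""
    (i, j), (k, l) = PAIRS[m], PAIRS[n]
    total = 0.0
    for p, q, s, t in product(range(3), repeat=4):
        total += A[i][p] * A[j][q] * A[k][s] * A[l][t] * tensor(S, p, q, s, t)
    return factor(m, n) * total


S = cubic_matrix()
wanted = {"11": (0, 0), "22": (1, 1), "13": (0, 2),
          "12": (0, 1), "44": (3, 3), "55": (4, 4)}
S_rot = {name: rotated(S, m, n) for name, (m, n) in wanted.items()}
for name, value in S_rot.items():
    print(f"S'{name} = {value:+.3f} x 1e-8 um*s^2/g")
```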

The formulation for converting the compliance coefficients to the notation required for NISA II is [41]

$$S'_{ij} = \begin{bmatrix} \frac{1}{E_{11}} & \frac{-\nu_{12}}{E_{11}} & \frac{-\nu_{13}}{E_{11}} & 0 & 0 & 0\\ \frac{-\nu_{21}}{E_{22}} & \frac{1}{E_{22}} & \frac{-\nu_{23}}{E_{22}} & 0 & 0 & 0\\ \frac{-\nu_{31}}{E_{33}} & \frac{-\nu_{32}}{E_{33}} & \frac{1}{E_{33}} & 0 & 0 & 0\\ 0 & 0 & 0 & \frac{1}{2G_{23}} & 0 & 0\\ 0 & 0 & 0 & 0 & \frac{1}{2G_{13}} & 0\\ 0 & 0 & 0 & 0 & 0 & \frac{1}{2G_{12}} \end{bmatrix}.$$

The resulting program coefficients are listed in Table 32.
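The conversion amounts to reading the engineering constants off the compliance matrix term by term. The following sketch does that for the transformed compliances given above; the Poisson ratios are taken as the negated off-diagonal ratio, following the sign in the matrix, and the dictionary keys mirror the NISA II names only as labels for this illustration.

```python
# Transformed compliances from the text, in um*s^2/g.
S = {
    "11": 0.591e-8, "22": 0.768e-8, "33": 0.591e-8,
    "12": -0.214e-8, "23": -0.214e-8, "13": -0.037e-8,
    "44": 1.256e-8, "55": 1.964e-8, "66": 1.256e-8,
}

nisa = {
    # Moduli along the model axes, g/(um*s^2).
    "EX": 1.0 / S["11"],
    "EY": 1.0 / S["22"],
    "EZ": 1.0 / S["33"],
    # Shear moduli, using the 1/(2S) form of the matrix in the text.
    "GYZ": 1.0 / (2.0 * S["44"]),
    "GXZ": 1.0 / (2.0 * S["55"]),
    "GXY": 1.0 / (2.0 * S["66"]),
    # Poisson ratios: off-diagonal term = -nu / E, so nu = -S_ij / S_ii.
    "NUXY": -S["12"] / S["11"],
    "NUYZ": -S["23"] / S["22"],
    "NUXZ": -S["13"] / S["33"],
}

for name, value in nisa.items():
    print(f"{name:5s} = {value:.4g}")
```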

### **B.5 STRESS AVERAGE**

This section calculates the effective stress over the length of the piezoresistor. This is done using two methods: averaging the theoretical isotropic beam stress and averaging the finite element normal (longitudinal) stresses.

Assuming a constant cross-sectional area for the resistor, A, the stress-dependent resistance distribution can be expressed as

$$\frac{dR}{dx} = \frac{\rho}{A} \left[ 1 + \pi \sigma_0(x) \right].$$

Table 32. NISA material coefficients.

| NISA constant | Symbol | Formula | Value |
|------|------|------|------|
| EX   | $E'_{11}$ | $\frac{1}{S'_{11}}$ | $1.692 \times 10^8$ g/µm·s² |
| EY   | $E'_{22}$ | $\frac{1}{S'_{22}}$ | $1.302 \times 10^8$ g/µm·s² |
| EZ   | $E'_{33}$ | $\frac{1}{S'_{33}}$ | $1.692 \times 10^8$ g/µm·s² |
| GYZ  | $G'_{23}$ | $\frac{1}{2S'_{44}}$ | $0.398 \times 10^8$ g/µm·s² |
| GXZ  | $G'_{13}$ | $\frac{1}{2S'_{55}}$ | $0.255 \times 10^8$ g/µm·s² |
| GXY  | $G'_{12}$ | $\frac{1}{2S'_{66}}$ | $0.398 \times 10^8$ g/µm·s² |
| NUXY | $\nu'_{12} = \nu'_{32}$ | $-\frac{S'_{12}}{S'_{11}}$ | 0.362 |
| NUYZ | $\nu'_{23} = \nu'_{21}$ | $-\frac{S'_{23}}{S'_{22}}$ | 0.279 |
| NUXZ | $\nu'_{13} = \nu'_{31}$ | $-\frac{S'_{31}}{S'_{33}}$ | 0.0626 |

Integrating over the length  $l = x_1 - x_0$  and substituting the stress relation developed in B.2 yields the resistance,

$$R = \int_{x_0}^{x_1} \frac{\rho}{A} \left[ 1 + \pi \sigma_0(x) \right] dx$$

$$= \frac{\rho l}{A} + \frac{\rho}{A} \pi \frac{mac}{I} \int_{x_0}^{x_1} (x - \frac{1}{2}L) dx$$

$$= \frac{\rho l}{A} \left[ 1 + \pi \frac{mac}{I} \frac{x_1 + x_0 - L}{2} \right]$$

$$= \frac{\rho l}{A} \left[ 1 + \pi \frac{\sigma_0(x_1) + \sigma_0(x_0)}{2} \right]$$

$$= \frac{\rho l}{A} \left[ 1 + \pi \overline{\sigma}_0 \right]$$

where  $\overline{\sigma}_0$  is the average stress over l.

This shows that the average stress can be used to calculate the total resistance. The development depends on having a cross-sectional area independent of position along the resistor. Although this may not be true, the approximation is sufficient for the first order analysis and any error introduced should be similar in all resistors.

The formula for calculating the average stress between points  $x_0$  and  $x_1$  (a resistor of length  $l = x_1 - x_0$ ) along a beam of length  $L$  is

$$\overline{\sigma}_0 = \frac{mac}{2I}(x_1 + x_0 - L).$$

The length of the proposed piezoresistors is 40.5  $\mu$ m. Applying this formula to a beam of length 90  $\mu$ m from 0 to 40.5  $\mu$ m yields an average stress of 280 g/ $\mu$ m·s² per g acceleration. A beam of length 70  $\mu$ m with a piezoresistor from -10 to 30.5  $\mu$ m results in an average stress of 82 g/ $\mu$ m·s² per g acceleration. The actual average should lie between these estimates.

Results of the finite element analysis can be used to obtain a more accurate measure of the average stress. Figure 90 shows a quarter model of the longitudinal stress distributed on the top surface of a beam. The piezoresistor placement is shown by the box on the left. By averaging the stress contribution due to each element, the average stress value seen by the beam piezoresistors is estimated to be 206  $\text{g}/\mu\text{m}\cdot\text{s}^2$  per g acceleration.

Figure 90. Quarter beam longitudinal stress values.

## **B.6 PIEZORESISTANCE COEFFICIENTS**

This section demonstrates how the piezoresistance coefficients used in resistor calculations were obtained. The results are from experimental work done on heavily doped diffused layers in silicon by Tufte and Stelzer [52].

The first result reported by Tufte and Stelzer is that the piezoresistance coefficients are independent of layer thickness for a given concentration.

The second result is the dependence of the principal coefficients on concentration. Figure 91 presents the results. At a boron surface concentration of  $10^{19}$  cm<sup>-3</sup> and a temperature of 27 °C, the *p*-type  $\pi_{44}$  coefficient is  $100 \times 10^{-8}$  µm·s<sup>2</sup>/g.

The third result is the dependence of the major coefficients on temperature. This is presented in Figure 92. The tested concentration closest to the proposed concentration of  $10^{19}$  cm<sup>-3</sup> is  $9\times10^{18}$  cm<sup>-3</sup>. Table 33 presents an estimate of the  $\pi_{44}$  values obtained from the graph.

Figure 91. Piezoresistance vs. dopant concentration [52].

Figure 92.  $\pi_{44}$  vs. temperature for p-type silicon [52].

Table 33. P-type  $\pi_{44}$  vs. temperature for a concentration of  $9 \times 10^{18}$  cm<sup>-3</sup>.

| Temperature (°C) | $\pi_{44}$ (×10<sup>-8</sup> µm·s²/g) |
|------------------|------------------------------------------------|
| -40              | 115                                            |
| -20              | 109                                            |
| 0                | 104                                            |
| 20               | 99                                             |
| 40               | 95                                             |
| 60               | 91                                             |
| 80               | 88                                             |
| 100              | 85                                             |

### **B.7 BEAM THICKNESS VARIATIONS**

This section develops the formulas used to examine the effects of beam thickness on stress and fundamental frequency. Figure 93 shows how a change of  $\Delta t$  due to etch variations modifies the device geometry. Only those dimensions affected by the change in thickness are shown. The change in beam thickness will cause a change in the maximum edge to neutral surface distance, c, the cross-section area moment of inertia, I, and the suspended mass, m.

$$c = \frac{10 + \Delta t}{3} \frac{160 + 4\Delta t}{100 + 2\Delta t},$$

$$I = (10 + \Delta t)^3 \frac{3700 + 140\Delta t + \Delta t^2}{18(50 + \Delta t)},$$

$$m = \frac{\rho}{12} \Big\{ (10 + \Delta t) \Big[ 3(2320 + 2\Delta t)^2 - 6(2320 + 2\Delta t)(10 + \Delta t) + 4(10 + \Delta t)^2 \Big] + (390 - \Delta t) \Big[ 3(2320 + 2\Delta t)^2 - 6(2320 + 2\Delta t)(390 - \Delta t) + 4(390 - \Delta t)^2 \Big] \Big\}.$$

These values can then be substituted into the formulas for  $\sigma_0$  and  $f_0$ .
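A short sketch of the thickness-dependent cross-section terms is given below. It assumes the constants 160 and 3700 in the expressions for c and I, which are the values that reproduce c = 5.333 µm and I = 4111 µm⁴ at Δt = 0. The cross-check against the general trapezoid formulas further assumes that the top width stays at 40 µm while the bottom width grows to 60 + 2Δt; that reading is inferred from the formulas, not stated in the text.

```python
def section_vs_thickness(dt):
    """c (um) and I (um^4) for a thickness change dt (um)."""
    h = 10.0 + dt
    c = h / 3.0 * (160.0 + 4.0 * dt) / (100.0 + 2.0 * dt)
    I = h**3 * (3700.0 + 140.0 * dt + dt**2) / (18.0 * (50.0 + dt))
    return c, I


def trapezoid_section(a, b, h):
    """General trapezoid formulas from Section B.2, used as a cross-check."""
    c = h * (a + 2 * b) / (3 * (a + b))
    I = h**3 * (a**2 + 4 * a * b + b**2) / (36 * (a + b))
    return c, I


nominal = section_vs_thickness(0.0)
thicker = section_vs_thickness(1.0)
# Inferred geometry: top width fixed, bottom width 60 + 2*dt, thickness 10 + dt.
reference = trapezoid_section(40.0, 62.0, 11.0)

print("dt = 0 um:", nominal)
print("dt = 1 um:", thicker, "vs trapezoid:", reference)
```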



Figure 93. Beam thickness variations (dimensions in micrometers; not to scale).

# **APPENDIX C**

Appendix C provides supporting material for Chapter 4. Section C.1 discusses the register widths required for the H300, H120 and H012 architectures. Section C.2 shows that ripple-carry adders will work for all decimator architectures. Section C.3 develops the frequency characteristics for the antialiasing filter that precedes the sigma-delta modulator. Section C.4 derives the switched capacitor functions used in the antialiasing filter and develops an expression for the relative capacitor area. Sections C.5 and C.6 discuss the worst-case antialiasing and noise power calculations. Section C.7 discusses the simulation used to obtain the total signal-to-noise ratio values for the converter.

#### C.1 DECIMATOR FILTER REGISTER WIDTHS

This section discusses the register widths for the H300, H120 and H012 registers. Regardless of the width, all data values are represented by two's complement numbers in the range [-1, 1). Two factors affect the width of the decimator registers: modulo arithmetic in the IIR stage and rounding due to the FIR stage.

The discussion of the modulo arithmetic is taken directly from Chu and Burrus [108]. All the filter types have three poles at z = 1. Because of these poles, the IIR stages are not asymptotically stable: depending on the input value, the accumulator output may overflow any finite number of bits. Despite this difficulty, the filter will still provide the correct output if modulo arithmetic is used. This arithmetic can be explained by considering the sinc filter in Figure 94.

Figure 94. Sinc filter.

Let the number range of the IIR accumulator be the integers in [-R/2, R/2). Let  $\langle N \rangle_R$  represent the residue of N modulo R. The output of the accumulator is

$$v(n) = \left\langle \sum_{i=0}^{\infty} x(n-i) \right\rangle_{R}.$$

The output of the filter, y(n), is then

$$y(n) = \langle v(n) - v(n-D) \rangle_{R}$$

$$= \left\langle \left\langle \sum_{i=0}^{\infty} x(n-i) \right\rangle_{R} - \left\langle \sum_{i=0}^{\infty} x(n-D-i) \right\rangle_{R} \right\rangle_{R}$$

$$= \left\langle \sum_{i=0}^{\infty} x(n-i) - \sum_{i=0}^{\infty} x(n-D-i) \right\rangle_{R}$$

$$= \left\langle \sum_{i=0}^{D-1} x(n-i) \right\rangle_{R},$$

where D is usually the decimation ratio [108].

The modulo of the FIR section must be the same as that of the IIR section for y(n) to recover information about the input, x(n) [108].

In order for the filter to put out the correct result,

$$\left\langle \sum_{i=0}^{D-1} x(n-i) \right\rangle_R = \sum_{i=0}^{D-1} x(n-i).$$

If  $R_i$  is the number range of the input, then a bound on the sum is

$$\left|\sum_{i=0}^{D-1}x(n-i)\right| \leq \sum_{i=0}^{D-1}\left|x(n-i)\right| \leq D\frac{R_i}{2}.$$

A necessary and sufficient condition for the filter to work for any input is that

$$R \geq DR_i.$$
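The effect of the modulo arithmetic can be demonstrated with a small simulation. The sketch below is illustrative only: it models a single accumulate-and-difference sinc stage with integer inputs in [-R_i/2, R_i/2), and the names `wrap`, `sinc_filter` and `moving_sum` are hypothetical. The accumulator overflows many times over the run, yet the output matches the exact moving sum when R = D·R_i and fails when R is halved.

```python
def wrap(value, R):
    """Map an integer onto its residue modulo R in the range [-R/2, R/2)."""
    return (value + R // 2) % R - R // 2


def sinc_filter(x, D, R):
    """Accumulate-and-difference sinc filter using modulo-R arithmetic."""
    acc = 0
    delayed = [0] * D                 # v(n-1), ..., v(n-D); zero initial state
    y = []
    for sample in x:
        acc = wrap(acc + sample, R)               # IIR accumulator v(n)
        y.append(wrap(acc - delayed[-1], R))      # FIR part: v(n) - v(n-D)
        delayed = [acc] + delayed[:-1]
    return y


def moving_sum(x, D):
    """Exact sum of the last D inputs, computed without any wrapping."""
    return [sum(x[max(0, n - D + 1): n + 1]) for n in range(len(x))]


D, R_i = 8, 4                          # inputs are integers in [-2, 2)
x = [-2, 1, 0, -1] * 50 + [-2] * 20    # running total drifts far past R/2
exact = moving_sum(x, D)

wide_enough = sinc_filter(x, D, R=D * R_i) == exact
too_narrow = sinc_filter(x, D, R=D * R_i // 2) == exact
print("R = D*R_i   correct:", wide_enough)
print("R = D*R_i/2 correct:", too_narrow)
```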

A concatenation of M sinc filters can be modeled by M IIR accumulators followed by M FIR cells. If the moduli of the stages are equal, then the output of the m<sup>th</sup> FIR cell is

$$y_{M-m}(n) = \left\langle \sum_{i_1=0}^{D-1} \sum_{i_2=0}^{D-1} \cdots \sum_{i_m=0}^{D-1} x_{M-m}(n-i_1-i_2-\cdots-i_m) \right\rangle_{R} \qquad 1 \leq m \leq M,$$

where  $y_0(n) = y(n)$  is the output of the final stage. A necessary and sufficient condition that y(n) not overflow for any input of size  $R_i$  is

$$R \geq D^{M} R_i.$$

To preserve meaning, the modulo of the data entering the FIR must be the same as the modulo of the data at the output of the FIR. The FIR stages of the Method I and II filters can be modeled as a concatenation of three stages, each with the form

$$H_n(z)=1-z^{-D}.$$

This sum could have a value one bit wider than the widest input value. Since the modulo does not increase, the additional bit at the output of the stage represents an arbitrary increase in precision. Also, since the coefficients represent fractional numbers this unnecessary increase can be eliminated by truncating the least-significant bit, an operation that does not affect the modulo arithmetic [108].

The Method I implementations are different from the model above in that the order of the FIR stages and the decimator are switched. This does not affect the calculation because the register width depends on the operation of the IIR stage and the following FIR stage must have the same modulo [109]. Since the input to the filter is the  $\pm 1$  output of the SDM,  $R_i$  is 2. The register widths for each filter type are listed in Table 34, where  $n = \log_2(N_1)$.

In the Method II architecture, the IIR stage follows the FIR stage. This allows each register in the IIR stage to be only as wide as is necessary to contain the result up to that register [95, 109]. The width of each register is determined by  $D_k R_k$, where  $D_k$  is the decimation rate of the k<sup>th</sup> stage and  $R_k$  is the data width into the k<sup>th</sup> stage.

As is discussed in Section 4.4.2.2, the outputs of the FIR stage for H300 can be expressed as the integers between and including -4 and 4, requiring 4 bits. The output of the FIR stage for both H120 and H012 can be expressed as the integers between and including -3 and 3, requiring 3 bits. Since the input to the FIR stage is a one bit value, the additional bits in the output represent arbitrary precision. These are eliminated by rounding at the output of the first IIR accumulator.

Table 35 shows the size of the three IIR registers, REG1, REG2 and REG3, for each of the three filter types. For H120 and H012, where D takes on two different values, the larger value is used with the first register to yield a worst-case size.

Table 34. Method I register widths.

| Filter Type | $D^{M}R_{i}$                            | Number of Bits |
|-------------|-----------------------------------------|----------------|
| H300        | $N_1^3(2)$                              | $3n + 1$       |
| H120        | $N_1(\frac{1}{2}N_1)^2(2)$              | $3n - 1$       |
| H012        | $(\frac{1}{2}N_1)(\frac{1}{4}N_1)^2(2)$ | $3n - 4$       |

Table 35. Method II register widths.

| Filter Type | Register | $D_k R_k$                          | Number of Bits |
|-------------|----------|------------------------------------|----------------|
| H300        | REG1     | $N_1(2)$                           | $n + 1$        |
|             | REG2     | $N_1(2N_1)$                        | $2n + 1$       |
|             | REG3     | $N_1(2N_1^2)$                      | $3n + 1$       |
| H120        | REG1     | $N_1(2)$                           | $n + 1$        |
|             | REG2     | $\frac{1}{2}N_1(2N_1)$             | $2n$           |
|             | REG3     | $\frac{1}{2}N_1(N_1^2)$            | $3n - 1$       |
| H012        | REG1     | $\frac{1}{2}N_1(2)$                | $n$            |
|             | REG2     | $\frac{1}{4}N_1(N_1)$              | $2n - 2$       |
|             | REG3     | $\frac{1}{4}N_1(\frac{1}{4}N_1^2)$ | $3n - 4$       |
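The entries in Tables 34 and 35 follow mechanically from the stage lengths. The sketch below reproduces them for a given N₁, assuming N₁ is a power of two and that the three stage lengths for each filter type are the ones implied by the table entries (for example N₁, N₁/2, N₁/2 for H120). The function names are hypothetical.

```python
def log2_exact(value):
    """Return log2(value) for a positive power of two."""
    assert value > 0 and value & (value - 1) == 0, "value must be a power of two"
    return value.bit_length() - 1


def stage_lengths(filter_type, n1):
    """Stage lengths D_k inferred from the table entries (an assumption)."""
    return {
        "H300": (n1, n1, n1),
        "H120": (n1, n1 // 2, n1 // 2),
        "H012": (n1 // 2, n1 // 4, n1 // 4),
    }[filter_type]


def method1_bits(filter_type, n1, r_i=2):
    """Single register width for Method I: log2(D^M * R_i)."""
    d1, d2, d3 = stage_lengths(filter_type, n1)
    return log2_exact(d1 * d2 * d3 * r_i)


def method2_bits(filter_type, n1, r_i=2):
    """Widths of REG1..REG3 for Method II: each stage multiplies the range by D_k."""
    widths, r = [], r_i
    for d in stage_lengths(filter_type, n1):
        r = d * r
        widths.append(log2_exact(r))
    return widths


N1 = 64   # n = 6
for ftype in ("H300", "H120", "H012"):
    print(ftype, "Method I:", method1_bits(ftype, N1),
          "bits; Method II:", method2_bits(ftype, N1))
```

For N₁ = 64 (n = 6) this gives 3n + 1 = 19, 3n - 1 = 17 and 3n - 4 = 14 bits for Method I, matching Table 34.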



Figure 95. Register width simulation results.

Simulations were used to verify the register width calculations. Figure 95 presents the results of one simulation for H120 Method II with  $N_1 = 64$ . The bold line represents a filter having registers with sufficient widths. The other three plots show the same filter with either REG1, REG2 or REG3 having insufficient width.

The Method III implementation consists of two parts, the coefficient generator and the accumulator. The width of the accumulator will be the same as the register widths in the Method I implementation, since it represents the output of the filter. The coefficient generator contains three registers, COUNT,  $\Delta(i)$  and h(i). Consider the H120 filter. The COUNT register provides the control by cycling through the length of the filter. The length of the H120 filter is

$$N_1 + \frac{1}{2}N_1 + \frac{1}{2}N_1 = 2N_1$$

requiring n + 1 bits.

The other register widths can be determined from the formulas for  $\Delta(i)$  and h(i). The coefficient formula is

$$h(i) = \begin{cases} \frac{i(i+1)}{2} & 0 \le i < \frac{1}{2}N_1 \\ \frac{\frac{1}{2}N_1(\frac{1}{2}N_1+1)}{2} + (\frac{1}{2}N_1-1)(i-\frac{1}{2}N_1) - \frac{(i-\frac{1}{2}N_1-1)(i-\frac{1}{2}N_1)}{2} & \frac{1}{2}N_1 \le i < N_1 \\ \frac{N_1^2}{4} - \frac{(i-N_1)(i-N_1+1)}{2} & N_1 \le i < \frac{3}{2}N_1 \\ \frac{\frac{1}{2}N_1(\frac{1}{2}N_1+1)}{2} - \frac{1}{2}N_1(i-\frac{3}{2}N_1+1) + \frac{(i-\frac{3}{2}N_1)(i-\frac{3}{2}N_1+1)}{2} & \frac{3}{2}N_1 \le i < 2N_1. \end{cases}$$

The maximum and minimum values for h occur at values that make the difference zero.  $\Delta(i)$  is defined as h(i) - h(i-1), or

$$\Delta(i) = \begin{cases} i & 0 \le i < \frac{1}{2} N_1 \\ N_1 - i & \frac{1}{2} N_1 \le i < \frac{3}{2} N_1 \\ i - 2N_1 & \frac{3}{2} N_1 \le i < 2N_1. \end{cases}$$

 $\Delta(i)$  is zero at i of 0 and  $N_1$ . The range of values for h is

$$0 \leq h(i) \leq \frac{N_1^2}{4},$$

requiring 2n-1 bits. The range for  $\Delta$ , determined by inspection, is

$$-\frac{N_1}{2} \leq \Delta(i) \leq \frac{N_1}{2},$$

requiring n+1 bits.
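The ranges quoted for h(i) and Δ(i) can be confirmed by generating the coefficients the way a running-sum generator would, accumulating Δ(i). The sketch below is an illustration under the assumption that h(-1) = 0. As a cross-check it also convolves three rectangular windows of lengths N₁, N₁/2 and N₁/2, which should reproduce the interior coefficients h(1) through h(2N₁ - 2); that cascade interpretation is an assumption made for this check.

```python
def delta(i, n1):
    """Coefficient difference for the H120 filter, as given piecewise in the text."""
    if i < n1 // 2:
        return i
    if i < 3 * n1 // 2:
        return n1 - i
    return i - 2 * n1


def coefficients(n1):
    """h(i) for 0 <= i < 2*N1, formed by accumulating delta(i) from h(-1) = 0."""
    h, value = [], 0
    for i in range(2 * n1):
        value += delta(i, n1)
        h.append(value)
    return h


def boxcar_cascade(n1):
    """Impulse response of three cascaded boxcars of lengths N1, N1/2, N1/2."""
    taps = [1]
    for length in (n1, n1 // 2, n1 // 2):
        out = [0] * (len(taps) + length - 1)
        for p, t in enumerate(taps):
            for q in range(length):
                out[p + q] += t
        taps = out
    return taps


N1 = 64
h = coefficients(N1)
deltas = [delta(i, N1) for i in range(2 * N1)]

print("h range:    ", min(h), "to", max(h))
print("delta range:", min(deltas), "to", max(deltas))
print("matches boxcar cascade:", h[1:-1] == boxcar_cascade(N1))
```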

#### C.2 USE OF RIPPLE-CARRY ADDERS

Ripple-carry adders (RCAs) are the simplest adder architecture, requiring the smallest area to implement for a given number of bits. This section verifies the assumption that RCAs are sufficient for performing the additions in all architecture types.

An RCA can be used in a particular implementation if the delay through the adder is less than the period at which values are put into the adder. An m-bit RCA can be built from m-1 full-adder cells and a half-adder cell. For simplicity, only full-adder cells will be used. The implementation is shown in Figure 96.

Figure 96. Ripple-carry adder.

The delay through each full-adder cell depends upon how the cell is constructed. A conservative estimate is derived from the equations for the outputs, or

$$S = \overline{A}B\overline{C}_i + A\overline{B}\overline{C}_i + \overline{A}\overline{B}C_i + ABC_i,$$

$$C_o = AB + BC_i + AC_i.$$

This results in 3 gate delays  $(\Delta_g)$  for the sum (S) and two for the carry out  $(C_o)$ .

The longest delay for the adder is through the first m-1 carry outs and the last sum. The delay of the adder is then

$$\Delta_{FA} = 2\Delta_{g}(m-1) + 3\Delta_{g} = (2m+1)\Delta_{g}.$$

The largest possible adder must meet the criterion

$$[2m+1]\Delta_{\mathsf{g}}<\frac{1}{f_{\mathsf{S}}},$$

where  $f_S$, the sampling frequency, is the frequency at which data is put into the adder. A mixed analog and digital BiCMOS process has typical gate delays of 4.5 ns [67]. A delay of 9 ns will be used to include a conservative estimate for interconnection delays and process variations. For a sampling frequency of 2.56 MHz, the longest adder is 21 bits.

Each of the H120 implementation methods requires at least one (3n-1)-bit adder, where  $n = \log_2(N_1)$. This indicates that ripple-carry adders can be used for first-stage downsampling ratios of up to 128.

An adder/subtractor circuit is required for Methods I and III. A ripple-carry adder/subtractor (RCAS) is shown in Figure 97. The circuit delay for the RCAS is the delay through an m-bit RCA plus the delay through a 2-to-1 MUX. Using 3  $\Delta_g$  for the MUX, the device must meet the criterion

$$\left[2m+4\right]\Delta_{g}<\frac{1}{f_{s}}.$$




Figure 97. Ripple-carry adder/subtractor.

Using calculations similar to those for the RCA, the longest RCAS is 19 bits. For the Method III H120 implementation, this results in a maximum first stage decimation ratio of 64.

The second coefficient generator (generator 1) for the reduced area Method III H120 architecture has a longest path through a (2n-1)-bit adder, a 2-to-1 MUX and a (n+1)-bit adder. In order to use ripple-carry adders with a clock rate of 2.56 MHz, n must satisfy

$$[6n+5]\Delta_{\mathsf{g}}<\frac{1}{f_{\mathsf{S}}},$$

using a delay of 3  $\Delta_g$  for the MUX. This results in a maximum downsampling ratio of 64 for ripple-carry adders.
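The three limits in this section come from the same inequality with different gate-delay counts. The sketch below evaluates them in integer picoseconds to avoid rounding at the boundary; `largest_size` is a hypothetical helper, and the 9 ns gate delay and 2.56 MHz rate are the values stated above.

```python
def largest_size(slope, offset, gate_delay_ps, period_ps):
    """Largest integer k with (slope*k + offset) * gate_delay < period."""
    k = 0
    while (slope * (k + 1) + offset) * gate_delay_ps < period_ps:
        k += 1
    return k


GATE_DELAY_PS = 9_000      # conservative 9 ns gate delay
PERIOD_PS = 390_625        # 1 / 2.56 MHz, in picoseconds

rca_bits = largest_size(2, 1, GATE_DELAY_PS, PERIOD_PS)    # (2m + 1) gate delays
rcas_bits = largest_size(2, 4, GATE_DELAY_PS, PERIOD_PS)   # (2m + 4) gate delays
gen_n = largest_size(6, 5, GATE_DELAY_PS, PERIOD_PS)       # (6n + 5) gate delays

# An H120 stage needs a (3n - 1)-bit adder, so n <= (bits + 1) // 3.
rca_ratio = 2 ** ((rca_bits + 1) // 3)
rcas_ratio = 2 ** ((rcas_bits + 1) // 3)
gen_ratio = 2 ** gen_n

print(f"RCA:  {rca_bits} bits, downsampling up to {rca_ratio}")
print(f"RCAS: {rcas_bits} bits, downsampling up to {rcas_ratio}")
print(f"Generator: n = {gen_n}, downsampling up to {gen_ratio}")
```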

## C.3 ANTIALIASING FILTER CHARACTERISTICS

This section develops the characteristics for the second-order antialiasing filter that precedes the sigma-delta modulator. The cutoff frequency and damping ratio can be adjusted to partially compensate for the droop in the decimation filter. Keeping the damping ratio and cutoff frequency as low as possible to minimize the electronic noise passed through the filter is of secondary importance. Figure 98 summarizes the relevant system components.

The frequency response of the system through the ADC, H(f), is the product of the responses of the transducer,  $H_{trans}(f)$ , the amplifier,  $H_{hpf}(f)$ , the prefilter,  $H_{filt}(f)$ , and the decimator,  $H_{dec}(f)$ . The transfer functions, with pass-band gain normalized to 1, are

$$H(f) = H_{trans}(f) \times H_{hpf}(f) \times H_{filt}(f) \times H_{dec}(f),$$

$$H_{trans}(f) = \frac{f_{trans}^{2}}{f_{trans}^{2} - f^{2} + j2\zeta_{trans}f_{trans}f},$$

$$H_{hpf}(f) = \frac{jf}{jf + f_{hpf}},$$

$$H_{filt}(f) = \frac{f_{filt}^{2}}{f_{filt}^{2} - f^{2} + j2\zeta_{filt}f_{filt}f},$$

$$H_{dec}(f) = \frac{\sin(64\pi fT_{S}) \cdot 4\sin^{2}(32\pi fT_{S})}{64^{3}\sin^{3}(\pi fT_{S})},$$

where

 $T_S$  = sampling period (391 ns),

 $f_{trans}$  = transducer cutoff frequency (17.7 kHz),

 $\zeta_{trans}$  = transducer damping ratio (1.0),

 $f_{hpf}$  = offset high-pass filter cutoff frequency (10 Hz),

 $f_{filt}$  = antialiasing filter cutoff frequency,

 $\zeta_{filt}$  = antialiasing filter damping ratio.

Figure 98. System components that affect frequency response.

The filter coefficients  $f_{\text{filt}}$  and  $\zeta_{\text{filt}}$  are chosen to make H(f) as flat as possible in the pass band without sacrificing antialiasing and electronic noise reduction characteristics. This is accomplished by choosing a target filter with acceptable characteristics and then finding the coefficients that minimize the mean square error between the magnitude of H(f) and the target filter. The target filter used has a gain of 1 up to 5 kHz, has zero gain above 10 kHz and a linear transition between 5 and 10 kHz. The gradual change minimizes the "ringing" in frequency that results from approximating a discontinuity with a finite number of sampled points (Gibbs phenomenon).

A C program was developed to find  $f_{\rm filt}$  and  $\zeta_{\rm filt}$  by exhaustive enumeration. For each candidate pair of values, the resulting |H(f)| is compared to the target filter between 100 Hz and 6 kHz in increments of 100 Hz to produce an error at each frequency. The errors are squared, summed together and averaged to produce a mean squared error. The  $f_{\rm filt}$  and  $\zeta_{\rm filt}$  that result in the minimum mean squared error were chosen. The result of the simulation is presented in Figure 99, where the inverse of the mean squared error is plotted against  $f_{\rm filt}$  and  $\zeta_{\rm filt}$.

The parameters yielding the closest fit to the target filter are  $f_{\rm filt} = 7$  kHz and  $\zeta_{\rm filt} = 0.56$ . The magnitude frequency response through the first stage decimator is shown in Figure 100. The heavy line shows the overall system magnitude response. The light solid line is the response of the antialiasing prefilter only. The dashed line is the response of the first stage decimator only.
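The exhaustive search described above can be sketched in a few lines. The following Python sketch is not the original C program. It assumes that H(f) is the product of the four component responses listed earlier, that both second-order sections use the standard damping term $2\zeta f_n f$, and that $T_S$ is 391 ns. All function names and grid values are illustrative.

```python
import math

T_S = 391e-9                     # sampling period (s), assumed 391 ns
F_TRANS, Z_TRANS = 17.7e3, 1.0   # transducer cutoff (Hz) and damping ratio
F_HPF = 10.0                     # offset high-pass cutoff (Hz)


def second_order(f, f_n, zeta):
    """Second-order low-pass response (standard form, an assumption)."""
    return f_n ** 2 / complex(f_n ** 2 - f ** 2, 2.0 * zeta * f_n * f)


def h_hpf(f):
    """First-order offset-removal high-pass filter."""
    return complex(0.0, f) / complex(F_HPF, f)


def h_dec(f):
    """First-stage decimator response H_dec(f); valid for f > 0."""
    x = math.pi * f * T_S
    return (math.sin(64 * x) * 4.0 * math.sin(32 * x) ** 2
            / (64 ** 3 * math.sin(x) ** 3))


def h_total(f, f_filt, z_filt):
    """Overall response, assumed to be the product of the components."""
    return (second_order(f, F_TRANS, Z_TRANS) * h_hpf(f)
            * second_order(f, f_filt, z_filt) * h_dec(f))


def target(f):
    """Target filter: unity to 5 kHz, linear roll-off to zero at 10 kHz."""
    if f <= 5e3:
        return 1.0
    if f >= 10e3:
        return 0.0
    return (10e3 - f) / 5e3


def mean_squared_error(f_filt, z_filt):
    """Mean squared magnitude error from 100 Hz to 6 kHz in 100 Hz steps."""
    freqs = range(100, 6001, 100)
    errs = [(abs(h_total(f, f_filt, z_filt)) - target(f)) ** 2 for f in freqs]
    return sum(errs) / len(errs)


def grid_search(f_grid, z_grid):
    """Return (error, f_filt, z_filt) for the best pair on the grid."""
    return min((mean_squared_error(f, z), f, z)
               for f in f_grid for z in z_grid)
```

A call such as `grid_search(range(5000, 10001, 100), [k / 100 for k in range(30, 101)])` scans a grid comparable to the one plotted in Figure 99. Because the component models here are assumptions, the minimum it finds need not coincide exactly with the values reported in the text.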

# C.4 SWITCHED CAPACITOR FILTER

This section derives the expressions for the second-order switched capacitor (SC) antialiasing filter. A function for comparing the capacitor areas of SC filters based on the filter sampling frequency is developed. The SC filter approximates a continuous filter that can be represented by the normalized RLC network shown in Figure 101. The transfer function for this circuit is



Figure 99. Antialiasing filter parameter relationships.



Figure 100. System frequency response.



Figure 101. Second-order low-pass network.

$$H(s) = \frac{\frac{1}{L_{n}C_{n}}}{s^{2} + \frac{R_{n}}{L_{n}}s + \frac{1}{L_{n}C_{n}}} = \frac{\omega_{n}^{2}}{s^{2} + 2\xi\omega_{n}s + \omega_{n}^{2}},$$

where

 $\omega_n$  = normalized analog cutoff frequency,

 $\xi$  = damping ratio,

$$R_{n} = 2\xi,$$

$$L_{n} = C_{n} = \frac{1}{\omega_{n}}.$$

The operation of the network can be described with two Laplace integral equations

$$I(s) = \frac{\omega_n}{s} [V_i(s) - V_o(s) - 2\xi I(s)],$$

$$V_o(s) = \frac{\omega_n}{s} I(s).$$

These equations can be implemented using two stray-insensitive switched capacitor integrators in inverting and noninverting configurations. Figure 102 shows the two configurations and their transfer functions [50]. The resulting pair of equations in the z-domain are



Figure 102. Switched capacitor integrators [50].



Figure 103. Second-order switched capacitor filter.

$$I(z) = \frac{1}{1-z^{-1}} \left[ \alpha_{11} z^{-1} V_{i}(z) - \alpha_{21} V_{o}(z) - \alpha_{31} I(z) \right],$$

$$V_{o}(z) = \frac{1}{1-z^{-1}} \alpha_{12} z^{-1} I(z),$$

where

$$\alpha_{11} = \frac{C_{11}}{C_1} = \omega_n = \frac{\omega_c}{f_p},$$

$$\alpha_{21} = \frac{C_{21}}{C_1} = \omega_n = \frac{\omega_c}{f_p},$$

$$\alpha_{31} = \frac{C_{31}}{C_1} = 2\xi\omega_n = \frac{2\xi\omega_c}{f_p},$$

$$\alpha_{12} = \frac{C_{12}}{C_2} = \omega_n = \frac{\omega_c}{f_p},$$

 $\omega_{\rm c}$  = filter cutoff frequency (rad/s),

 $f_p$  = SC sampling frequency (Hz).

The replacement of  $\omega_n$  with the ratio  $\omega_c/f_p$  results from mapping s into the z-domain. Figure 103 shows an implementation of these equations in a SC circuit. The resulting transfer function is

$$H(z) = \frac{\alpha_{11}\alpha_{12}z^{-2}}{\left(1-z^{-1}\right)^2 + \alpha_{31}\left(1-z^{-1}\right) + \alpha_{12}\alpha_{21}z^{-1}}.$$
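As a rough illustration, H(z) can be evaluated on the unit circle and compared with the continuous second-order response it approximates. The sketch below assumes $\omega_c = 2\pi f_c$ and the coefficient definitions given above. The function names and the example values (7 kHz, 0.56, 160 kHz) are illustrative choices, not the fitted SC parameters.

```python
import cmath
import math


def sc_coefficients(f_c, xi, f_p):
    """Capacitor ratios for cutoff f_c (Hz), damping xi, SC clock f_p (Hz)."""
    w_n = 2.0 * math.pi * f_c / f_p          # omega_c / f_p
    return {"a11": w_n, "a21": w_n, "a31": 2.0 * xi * w_n, "a12": w_n}


def sc_response(f, f_c, xi, f_p):
    """Evaluate H(z) at z = exp(j*2*pi*f/f_p)."""
    a = sc_coefficients(f_c, xi, f_p)
    zi = cmath.exp(-2j * math.pi * f / f_p)  # z^-1
    num = a["a11"] * a["a12"] * zi ** 2
    den = (1 - zi) ** 2 + a["a31"] * (1 - zi) + a["a12"] * a["a21"] * zi
    return num / den


def continuous_response(f, f_c, xi):
    """Continuous second-order low-pass response, for comparison."""
    return f_c ** 2 / complex(f_c ** 2 - f ** 2, 2.0 * xi * f_c * f)
```

At DC the numerator and denominator both reduce to $\alpha_{11}\alpha_{12}$, so the SC filter has unity gain. Well below the SC clock rate its magnitude tracks the continuous response closely.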

The goal of the SC filter design is to match the magnitude response of the continuous antialiasing filter. For each sampling rate, the SC filter parameters  $f_c$  and  $\xi$  were chosen to minimize the squared error between the SC filter and the continuous filter. Comparisons were made between 100 and 6000 Hz in 100 Hz steps for  $f_c$  in 10 Hz steps and  $\xi$  in 0.001 steps.

The filter coefficients are implemented through the ratio of capacitors. To simplify the design, a common value can be used for the denominator capacitor value, or

$$\mathbf{C}_1 = \mathbf{C}_2 = \mathbf{C}.$$

The SC filter capacitor values can be expressed as ratios of C. For example,

$$\mathbf{C}_{11} = \boldsymbol{\alpha}_{11}\mathbf{C}.$$

The total capacitor area required by the filter, A<sub>C</sub>, is

$$A_{C} = 2C + \alpha_{11}C + \alpha_{21}C + \alpha_{31}C + \alpha_{12}C.$$


For comparison, the area is normalized by  $\alpha_{11}C$ , resulting in

$$A_{CN} = \frac{2}{\alpha_{11}} + 3 + \frac{\alpha_{31}}{\alpha_{11}} = 2\frac{f_p}{\omega_c} + 3 + 2\xi.$$
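The normalized area makes the cost of a faster SC clock explicit: the first term grows linearly with $f_p$. A minimal sketch follows, writing that term as $2/\alpha_{11} = 2 f_p/\omega_c$ with $\omega_c = 2\pi f_c$. The function names are hypothetical.

```python
import math


def total_area_units(f_c, xi, f_p):
    """A_C expressed in units of the common capacitor C."""
    a = 2.0 * math.pi * f_c / f_p            # alpha_11 = alpha_21 = alpha_12
    return 2.0 + a + a + 2.0 * xi * a + a


def normalized_area(f_c, xi, f_p):
    """A_CN = 2/alpha_11 + 3 + 2*xi."""
    a11 = 2.0 * math.pi * f_c / f_p
    return 2.0 / a11 + 3.0 + 2.0 * xi
```

Doubling $f_p$ at a fixed cutoff doubles the $2/\alpha_{11}$ term while the remaining $3 + 2\xi$ stays constant.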

#### C.5 WORST-CASE ANTIALIASING

When the output of the first-stage decimator filter is downsampled, the spectrum is folded at intervals of 20 kHz ( $f_D/2$ ). The portion that lies within the desired signal band (0 to 5 kHz) introduces permanent distortion into the signal. The portion outside the signal band (5 to 20 kHz) will also become part of the signal band due to final downsampling by 4. The second-stage decimator filter, therefore, must remove the bulk of this signal before final downsampling.

The folding is demonstrated in Figure 104 for predownsampled frequencies up to 100 kHz. The graphed frequency response includes the first-stage decimator, SC low-pass filter and transducer responses. The 10 Hz high-pass offset filter effect has been removed for clarity. The frequencies before downsampling at the folding points are listed in the margins of the graph.



Figure 104. First-stage spectral folding.

The worst-case antialiasing value of 63.2 dB attenuation results from a predownsampled frequency of 35 kHz.
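The folding arithmetic can be checked with a short sketch. It assumes a first-stage output rate $f_D$ of 40 kHz, consistent with the 20 kHz folding interval stated above. The function names are hypothetical.

```python
def folded_frequency(f, f_d=40e3):
    """Frequency (Hz) at which f appears after downsampling to rate f_d."""
    r = f % f_d
    return min(r, f_d - r)


def aliases(f_base, f_max, f_d=40e3):
    """Predownsampled frequencies up to f_max that fold onto f_base."""
    found = []
    k = 1
    while k * f_d - f_base <= f_max:
        found.append(k * f_d - f_base)
        if k * f_d + f_base <= f_max:
            found.append(k * f_d + f_base)
        k += 1
    return found
```

With these definitions, 35 kHz folds onto 5 kHz, the upper edge of the signal band, which is why it sets the worst-case figure.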

#### C.6 NOISE POWER

The noise power represents the effects of thermal noise seen at the input of the feature extraction logic section. The two major sources of thermal noise are the piezoresistors in the transducer, producing resistor noise, and the amplifier immediately following the transducer, producing amplifier or electronic noise.

The effective resistor noise power is the area of the power spectral density (PSD) found by passing the flat thermal PSD produced at the resistor through the amplifier, including the offset reduction high-pass filter, antialiasing SC low-pass filter and first-stage decimation filter. The effective resistor noise power value, derived in section 3.6.2, is

$$P_{R} = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_{R}(\omega) d\omega$$

$$= 4kTR \int_{0}^{\infty} |H_{1}(f)|^{2} df$$

$$= 4.367 \times 10^{-17} \int_{0}^{\infty} |H_{1}(f)|^{2} df,$$

where  $H_1(f)$  is the single-sided transfer function from the amplifier through the decimation filter and the device is at room temperature (293 K).

The effective electronic noise power,  $P_E$ , is the area of the PSD found by passing the input-referred, flat thermal PSD produced by the amplifier through the amplifier, antialiasing SC low-pass filter and first-stage decimation filter. The effective electronic noise power value is

$$P_E = 6.4 \times 10^{-17} \int_0^\infty |H_1(f)|^2 df.$$

Both power values require the area of the square of  $H_1(f)$ . This area, G, is estimated with a C program implementing Simpson's  $\frac{1}{3}$  rule. The approximate integration is expressed as [110]

$$G = \frac{h}{3} [g_0 + 4(g_1 + g_3 + \dots + g_{N-2}) + 2(g_2 + g_4 + \dots + g_{N-3}) + g_{N-1}],$$

where

$$h = \text{step size (10 Hz)},$$

$$g(x) = \left| H_{hpf}(x) \times H_{SC filt}(x) \times H_{dec}(x) \right|^2,$$

$$g_i = g(10i), \quad i = 0, 1, 2, \ldots, 128000,$$

so that the number of samples is N = 128001.

The area for  $f_p = 160$  kHz is 8782 Hz.
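A minimal Python sketch of the Simpson's $\frac{1}{3}$ rule sum above is given here for illustration. It is not the original C program, and `simpson_third` is a hypothetical name. The rule needs an odd number of equally spaced samples, which the 128001-point grid satisfies.

```python
import math


def simpson_third(g, h):
    """Integrate equally spaced samples g[0..N-1] with step h (N odd)."""
    n = len(g)
    if n < 3 or n % 2 == 0:
        raise ValueError("Simpson's 1/3 rule needs an odd sample count >= 3")
    odd_sum = sum(g[1:n - 1:2])     # g1 + g3 + ... + g(N-2)
    even_sum = sum(g[2:n - 2:2])    # g2 + g4 + ... + g(N-3)
    return h / 3.0 * (g[0] + 4.0 * odd_sum + 2.0 * even_sum + g[n - 1])


def lowpass_power_gain(f, f_c):
    """|H|^2 of a first-order low-pass filter, used only as a check case."""
    return 1.0 / (1.0 + (f / f_c) ** 2)
```

A first-order low-pass filter is a convenient check because its integral from 0 to F is known in closed form, $f_c \arctan(F/f_c)$. The same routine applies to the sampled $|H_1(f)|^2$ once the component responses are available.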

#### C.7 TSNR SIMULATION

No direct analytical technique exists for calculating the TSNR of a sigma-delta modulating ADC due to the nonlinearity of the device. Hence, the TSNR is estimated by comparing a sinusoidal input to the resulting output. This section discusses two critical issues associated with using a simulation to find the "total" signal-to-noise ratio: the method for performing the comparison and the number of samples required.

The difficulty in performing the comparison arises from the phase delay between the input sinusoid, x(n), and the ADC output, y(n). The sinusoidal input is

$$x(n) = M_x \sin(2\pi T_s f_x n),$$

where

 $M_x$  = magnitude,

 $T_S$  = sampling period,

 $f_x$  = frequency.

The TSNR is derived from the error between y(n) and the delayed input, x'(n). This error, e(n), is

$$e(n) = y(n) - x'(n),$$

$$x'(n) = M_x \left| H(f_x) \right| \sin \left( 2\pi T_s f_x \left[ n - \left\{ 1 + \frac{1}{2} \left( \frac{f_x}{f_p} - 1 \right) \right\} \right] + \arg(H(f_x)) \right).$$

The transfer function,  $H(f_x)$ , can be the decimator if only the ADC is being considered or can be the product of the transducer, anti-offset filter, antialiasing filter and decimator transfer functions if the entire system is being considered. The term in braces,  $\{\}$ , accounts for digital circuit delays. The constant 1 accounts for the average delay of the second-order sigma-delta modulator. The remaining term is the average delay of the SC filter.

The TSNR can be calculated as

$$\text{TSNR} = \frac{\text{signal power}}{\text{error power}} = \frac{\overline{x'^2(n)} - \overline{x'(n)}^2}{\overline{e^2(n)} - \overline{e(n)}^2},$$

where

$$\overline{g(n)} = \text{time average of } g(n) = \frac{1}{N} \sum_{n=0}^{N-1} g(n).$$
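The TSNR ratio is a ratio of two sample variances, which the following sketch computes and reports in decibels. The function names are hypothetical, and the delayed reference x'(n) is assumed to have been generated already.

```python
import math


def time_average(g):
    """Time average of a sequence: (1/N) * sum of g(n)."""
    return sum(g) / len(g)


def tsnr_db(x_ref, y):
    """TSNR in dB from the delayed reference x'(n) and the output y(n)."""
    e = [yn - xn for xn, yn in zip(x_ref, y)]
    signal_power = time_average([v * v for v in x_ref]) - time_average(x_ref) ** 2
    error_power = time_average([v * v for v in e]) - time_average(e) ** 2
    return 10.0 * math.log10(signal_power / error_power)
```

Subtracting the squared means removes any DC offset from both the signal and the error, so a constant offset at the output does not count as noise.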

The second issue associated with obtaining the TSNR is the number of samples, N, used in calculating the averages. Figure 105 shows the TSNR average and a 3 standard deviation range for various values of N.

Figure 105. TSNR vs. simulation time.

Thirty simulation runs were used to develop the average and standard deviation values. The input is converted to a random process with the inclusion of a uniformly distributed angle random variable,

$$x_i(n) = M_x \sin(2\pi T_s f_x n + \theta_i), \qquad \theta_i \sim U[0, 2\pi].$$

The shifted input, x'(n), also has  $\theta_i$  added to its argument. For this simulation, a magnitude of 0.2 and a frequency of 625 Hz were used.

To initialize the simulated filters, 16,000 samples were generated before gathering data for statistics. Since the output of the ADC is 40,000 samples/second, this represents 0.4 seconds of time. The slowest responding simulated filter is the first-order offset reduction high-pass filter, with a time constant of 0.1 seconds.

On the basis of the results of the above simulation, a sample number of 50,000, or 5 seconds of simulated time, was chosen for the simulations in Section 4.5. This value results in a standard deviation of 0.02 dB in the TSNR.

# APPENDIX D DIGITAL BUTTERWORTH FILTERS

Appendix D describes the design of sixth-order Butterworth band-pass filters converted to digital filters using the bilinear transformation. The design begins by designing a third-order Butterworth filter. This is converted to a sixth-order band-pass filter using a frequency transformation. The bilinear transformation with frequency prewarping is next used to convert the continuous transfer function into the z domain. Finally, the coefficients are converted to 12-bit signed-magnitude numbers for use in the bearing monitor's IIR filter cells. Two examples used in Chapters 5 and 6 are provided.

A low-pass Butterworth filter with unity DC gain has a magnitude transfer function of

$$|H_{LPF}(j\omega)| = \frac{1}{\sqrt{1+(\omega/\omega_r)^{2K}}},$$

where K is the filter order and  $\omega_r$  is the cutoff frequency. The Butterworth filter has the flattest pass-band region for a given order because the first 2K - 1 derivatives of  $|H_{LPF}|$  are zero at DC [111]. The poles of  $H_{LPF}$  can be found by substituting  $-s^2$  for  $\omega^2$  in the squared magnitude. The 2K poles are spaced evenly on a circle of radius  $\omega_r$  centered at the origin of the s-plane. The K poles in the left half of the plane form the stable low-pass filter  $H_{LPF}(s)$ . For a third-order filter, the transfer function is

$$H_{LPF}(s) = \frac{\omega_{r}^{3}}{(s + \omega_{r})(s + \omega_{r} \angle 60^{\circ})(s + \omega_{r} \angle - 60^{\circ})} = \frac{\omega_{r}^{3}}{s^{3} + 2\omega_{r}s^{2} + 2\omega_{r}^{2}s + \omega_{r}^{3}}.$$
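The third-order denominator can be checked by placing the left-half-plane poles on the circle and multiplying out the factors. This sketch uses the standard Butterworth pole-angle formula. The function names are hypothetical.

```python
import cmath
import math


def butterworth_poles(order, w_r=1.0):
    """Left-half-plane poles of a Butterworth filter of the given order."""
    return [w_r * cmath.exp(1j * math.pi * (2 * k + order + 1) / (2 * order))
            for k in range(order)]


def poly_from_roots(roots):
    """Coefficients (highest power first) of the monic polynomial with these roots."""
    coeffs = [1.0 + 0j]
    for r in roots:
        # Multiply the current polynomial by (s - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs
```

For order 3 and $\omega_r = 1$ the poles lie at angles of 120°, 180° and 240°, and the expanded polynomial is $s^3 + 2s^2 + 2s + 1$, in agreement with the expression above.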

A low-pass filter of order K can be converted to a band-pass filter of order 2K using the well-known geometric transformation [112]

$$H_{BPF}(s) = H_{LPF}(s) \Big|_{s \leftarrow \frac{s^2 + \omega_c^2}{s}}.$$

The mapping results in a sixth-order band-pass filter with center frequency  $\omega_c$  and bandwidth  $\omega_r$ .

In order to implement the filter in 3 second-order cells, the transfer function must be factored into the product of 3 realizable second-order transfer functions. Algebraic techniques for accomplishing this have been developed [111]. The result can be written as

$$H_{BPF}(s) = \frac{\omega_r s}{s^2 + \omega_r s + \omega_c^2} \times \frac{\omega_r s}{s^2 + (D\omega_c/E)s + D^2\omega_c^2} \times \frac{\omega_r s}{s^2 + (\omega_c/DE)s + \omega_c^2/D^2},$$

where

$$Q = \omega_{c}/\omega_{r},$$

$$E = \sqrt{\frac{1 + 4Q^{2} + \sqrt{(1 + 4Q^{2})^{2} - 4Q^{2}}}{2}},$$

$$D = \frac{1}{2} \left\{ E/Q + \sqrt{(E/Q)^{2} - 4} \right\}.$$
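One way to gain confidence in the factored form is to compare it numerically with the directly transformed third-order denominator at an arbitrary test point. The sketch below does this. The function names are hypothetical and the test values are arbitrary.

```python
import math


def bandpass_factors(w_c, w_r):
    """(M, P) pairs for the three factors s^2 + M*s + P of H_BPF."""
    q = w_c / w_r
    t = 1.0 + 4.0 * q * q
    e = math.sqrt((t + math.sqrt(t * t - 4.0 * q * q)) / 2.0)
    d = 0.5 * (e / q + math.sqrt((e / q) ** 2 - 4.0))
    return [(w_r, w_c ** 2),
            (d * w_c / e, (d * w_c) ** 2),
            (w_c / (d * e), (w_c / d) ** 2)]


def factored_denominator(s, w_c, w_r):
    """Product of the three second-order denominators at the point s."""
    out = 1.0
    for m, p in bandpass_factors(w_c, w_r):
        out *= s * s + m * s + p
    return out


def transformed_denominator(s, w_c, w_r):
    """s^3 times the Butterworth cubic evaluated at (s^2 + w_c^2)/s."""
    p = (s * s + w_c ** 2) / s
    return s ** 3 * (p ** 3 + 2 * w_r * p ** 2 + 2 * w_r ** 2 * p + w_r ** 3)
```

The two functions agree to rounding error for any nonzero s, which confirms that E and D split the quartic produced by the complex pole pair into two real second-order factors.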

The continuous filter in s is converted to a discrete filter in z through the bilinear transformation. The transformation is accomplished by replacing s with the relation [112]

$$s \leftarrow C \frac{z-1}{z+1},$$

where

$$C = \omega_{c} \cot \left( \frac{\pi f_{c}}{f_{s}} \right),$$

 $f_s$  = sampling frequency.

The value C implements frequency prewarping. The bilinear transform maps the left half of the s-plane into the interior of the unit circle in the z-plane and the  $j\omega$  axis onto the unit circle itself, with  $\omega = \pm \infty$  mapping to a digital frequency of  $\pm \pi$  radians per sample. This creates a fairly linear relationship between a low steady-state frequency in s ( $j\omega$ ) and the corresponding frequency in z ( $e^{j\omega T_s}$ ). However, as the analog frequency being considered approaches  $\frac{1}{2} f_s$ , large variations in the analog frequency produce only slight changes in the corresponding digital frequencies. Frequency prewarping exactly matches any one analog frequency to a corresponding digital frequency, in this case the center frequency of the band-pass filter [112].

Applying the bilinear transform to a general second-order band-pass transfer function results in

$$H(z) = \frac{Rs}{s^2 + Ms + P}\bigg|_{s \leftarrow C\frac{z-1}{z+1}} = \frac{A_0 + A_1 z^{-1} + A_2 z^{-2}}{1 - B_1 z^{-1} - B_2 z^{-2}},$$

where

$$d = C^2 + MC + P,$$

$$A_0 = RC/d, \quad A_1 = 0, \quad A_2 = -RC/d,$$

$$B_1 = (2C^2 - 2P)/d, \quad B_2 = (MC - C^2 - P)/d.$$
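The coefficient formulas can be exercised on the first factor of the band-pass filter, for which $R = M = \omega_r$ and $P = \omega_c^2$. The sketch below assumes a filter sampling rate of 10 kHz. That rate is not stated in this appendix and is inferred here only because it yields values close to the term 3 entries for BPF 1 in Table 36 (0.105, 1.448, -0.790). The function names are hypothetical.

```python
import cmath
import math


def bilinear_bandpass_section(r, m, p, f_c, f_s):
    """Digital coefficients of R*s/(s^2 + M*s + P), prewarped at f_c."""
    c = 2.0 * math.pi * f_c / math.tan(math.pi * f_c / f_s)  # w_c * cot(pi*f_c/f_s)
    d = c * c + m * c + p
    return {"A0": r * c / d, "A1": 0.0, "A2": -r * c / d,
            "B1": (2.0 * c * c - 2.0 * p) / d,
            "B2": (m * c - c * c - p) / d}


def section_response(coef, f, f_s):
    """Evaluate the digital section at frequency f (Hz)."""
    zi = cmath.exp(-2j * math.pi * f / f_s)  # z^-1
    num = coef["A0"] + coef["A1"] * zi + coef["A2"] * zi ** 2
    den = 1.0 - coef["B1"] * zi - coef["B2"] * zi ** 2
    return num / den


# Example: first factor of BPF 1 (1 kHz center, 400 Hz bandwidth).
W_C, W_R = 2.0 * math.pi * 1000.0, 2.0 * math.pi * 400.0
BPF1_FIRST = bilinear_bandpass_section(W_R, W_R, W_C ** 2, 1000.0, 10e3)
```

Because of the prewarping, the digital section has exactly unity gain and zero phase at the center frequency, just as the analog factor does at $s = j\omega_c$.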

The coefficients M and P are different for each of the 3 second-order terms in the sixth-order filter. The IIR filter hardware, described in Section 5.4, implements each second-order term as the difference equation

$$y[nT_{s}] = 2\{A_{0}x[nT_{s}] + A_{1}x[(n-1)T_{s}] + A_{2}x[(n-2)T_{s}] + B_{1}y[(n-1)T_{s}] + B_{2}y[(n-2)T_{s}]\}.$$

The multiplicative factor of 2 is described below.
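A floating-point sketch of one second-order cell is shown here. It assumes that the values inside the braces are the stored coefficients, each equal to half of the corresponding design coefficient, so that the factor of 2 restores the intended gain. It ignores the finite word length of the hardware, and `iir_cell` is a hypothetical name.

```python
import math


def iir_cell(x, a0, a1, a2, b1, b2):
    """Run one second-order cell over the input sequence x.

    a0..b2 are the design coefficients; the cell works with halved
    copies and doubles the accumulated sum, as in the difference equation.
    """
    a0h, a1h, a2h, b1h, b2h = (0.5 * v for v in (a0, a1, a2, b1, b2))
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = 2.0 * (a0h * xn + a1h * x1 + a2h * x2 + b1h * y1 + b2h * y2)
        out.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out
```

Driving the cell with a sinusoid at the center frequency, for example `iir_cell(x, 0.105, 0.0, -0.105, 1.448, -0.790)` with ten samples per cycle, gives an output that settles to follow the input closely.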

Two example filters were implemented for this project. The first, BPF 1, has a center frequency of 1 kHz and a bandwidth of 400 Hz and is used in the hardware simulation discussed in Section 5.7. The second, BPF 2, has a center frequency of 200 Hz and a bandwidth of 300 Hz and is used in the system simulation of Section 6.2. The coefficient values derived for each of the filters are listed in Table 36.

Table 36. Band-pass filter coefficients.

| Term, coefficient | BPF 1 decimal | BPF 1 hex | BPF 2 decimal | BPF 2 hex |
|-------------------|---------------|-----------|---------------|-----------|
| 1, A0             | 0.115         | 076       | 0.086         | 058       |
| 1, A1             | 0.0           | 000       | 0.0           | 000       |
| 1, A2             | -0.115        | 876       | -0.086        | 858       |
| 1, B1             | 1.640         | 68F       | 1.814         | 741       |
| 1, B2             | -0.905        | B9E       | -0.828        | B50       |
| 2, A0             | 0.106         | 06C       | 0.087         | 059       |
| 2, A1             | 0.0           | 000       | 0.0           | 000       |
| 2, A2             | -0.106        | 86C       | -0.087        | 859       |
| 2, B1             | 1.388         | 58D       | 1.812         | 740       |
| 2, B2             | -0.876        | B80       | -0.864        | B75       |
| 3, A0             | 0.105         | 06B       | 0.092         | 05F       |
| 3, A1             | 0.0           | 000       | 0.0           | 000       |
| 3, A2             | -0.105        | 86B       | -0.092        | 85F       |
| 3, B1             | 1.448         | 5CA       | 1.955         | 7D2       |
| 3, B2             | -0.790        | B28       | -0.960        | BD7       |

The coefficients are stored as 12-bit sign-magnitude numbers in the IIR cells. Typical filters with unity gain have coefficients in the range [-2, 2). The coefficients are scaled by a factor of ½ prior to digitizing to facilitate the multiplication implementation. The partial products are left-shifted before accumulation to compensate for the scaling. The resulting 12-bit coefficients that will be shifted into the IIR cells are listed in Table 36 in the "hex" column.
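The scaling and encoding can be sketched as follows. The layout assumed here is one sign bit followed by an 11-bit magnitude holding half of the coefficient. Truncation of the magnitude is an assumption, since the text does not state the rounding rule. Both function names are hypothetical.

```python
def to_sign_magnitude_hex(value):
    """Encode a coefficient in (-2, 2) as a 12-bit sign-magnitude hex string."""
    if not -2.0 < value < 2.0:
        raise ValueError("coefficient must lie strictly between -2 and 2")
    magnitude = int(abs(value) / 2.0 * 2048)   # halve, then truncate to 11 bits
    sign = 0x800 if value < 0 else 0
    return format(sign | magnitude, "03X")


def from_sign_magnitude_hex(text):
    """Decode a 12-bit sign-magnitude hex string back to a coefficient."""
    word = int(text, 16)
    magnitude = (word & 0x7FF) / 2048.0 * 2.0  # undo the factor of 1/2
    return -magnitude if word & 0x800 else magnitude
```

Decoding shows the resolution available: one least significant bit corresponds to a coefficient step of 1/1024, which limits how closely a narrow filter at a low ratio of $f_c$ to $f_s$ can be realized.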

Figures 106 and 107 show the analog and resulting digital filter's frequency response curves for BPF 1 and BPF 2. The second filter shows a deviation in the passband region due to the finite bit length in the digitized coefficients and the small ratio of  $f_c$  to  $f_s$ .



Figure 106. Frequency response of BPF 1.



Figure 107. Frequency response of BPF 2.

## **BIBLIOGRAPHY**


- [1] G. Brawley, "Diagnostic health condition performance monitoring does this make sense?" Proceedings of the 1st International Machinery Monitoring and Diagnostics Conference, Las Vegas NV, pp. 215-221, Aug 1989.
- [2] N. Higbie, "Automatic fault detection in machinery," *Proceedings of the 1st International Machinery Monitoring and Diagnostics Conference*, Las Vegas NV, pp. 541-545, Aug 1989.
- [3] J. Xu, S. Shang and Y. Lin, "Fuzzy diagnosis method in machinery condition monitoring and failure diagnosis," *Proceedings of the 2nd International Machinery Monitoring & Diagnostics Conference*, Los Angeles CA, pp. 227-233, Sep 1990.
- [4] T. Koizumi, M. Kiso and R. Taniguchi, "Preventive maintenance for roller and journal bearings of induction motor based on the diagnostic signature analysis," *Transactions of the ASME*, vol. 108, pp. 26-31, Jan 1986.
- [5] P. Cleaveland, "Industrial LANs: Where's the action?" I&CS Industrial & Process Control Magazine, pp. 27-31, Nov 1989.
- [6] C. Kembossos and D. Bout, "Machine monitoring with a difference," *Process and Control Engineering*, vol. 44, no. 10, pp. 32-34, Oct 1991.
- [7] J. Renwick, "Condition monitoring of machinery using computerized vibration signature analysis," *IEEE Transactions on Industry Applications*, vol. IA-20, no. 3, pp. 519-527, May-Jun 1984.
- [8] K. Wise, "Circuit techniques for integrated solid-state sensors," *IEEE 1983 Custom Integrated Circuits Conference*, Rochester NY, pp. 23-25, May 1983.
- [9] J. Brignell, "Interfacing solid state sensors with digital systems," *Journal of Physics E*, vol. 18, no. 7, pp. 559-565, Jul 1985.
- [10] T. Sankar and G. Xistris, "Failure prediction through the theory of stochastic excursions of extreme vibration amplitudes," *ASME Journal of Engineering for Industry*, pp. 133-137, Feb 1972.
- [11] G. Xistris, T. Sankar and G. Ostiguy, "Reliability of machinery using fatigue damage accumulation due to random vibrations," *Journal of Mechanical Design*, ASME, vol. 100, pp. 619-625, Oct 1978.
- [12] T. Henry, "Monitoring rolling element bearing condition," *Bearings: Searching for a Longer Life*, Cheltenham Bearing Conference, Chartered Mechanical Engineer, pp. 63-69, Oct 1984.
- [13] S. Braun and B. Datner, "Analysis of roller / ball bearing vibrations," *Journal of Machine Design*, ASME, vol. 101, pp. 118-125, Jan 1979.
- [14] L. Meyer, F. Ahlgren and B. Weichbrodt, "An analytical model for ball bearing vibrations to predict vibration response to distributed defects," *Journal of Mechanical Design*, ASME, vol. 102, pp. 205-210, Apr 1980.
- [15] J. Tranter, "The fundamentals of and application of computers to condition monitoring and predictive maintenance," *Proceedings of the 1st Annual Machinery Monitoring and Diagnostics Conference*, Las Vegas NV, pp. 394-401, Aug 1989.
- [16] J. Mathew and R. Alfredson, "The condition monitoring of rolling element bearings using vibration analysis," *Journal of Vibration, Acoustics, Stress and Reliability in Design*, vol. 106, pp. 447-453, Jul 1984.
- [17] D. Dyer and P. Stewart, "Detection of rolling element bearing damage by statistical vibration analysis," *Journal of Mechanical Design*, ASME, vol. 100, pp. 229-235, Apr 1978.
- [18] J. Taylor, "Identification of bearing defects by spectral analysis," *Journal of Mechanical Design*, ASME, vol. 102, pp. 199-204, Apr 1980.
- [19] J. Reason, "Continuous vibration monitoring moves into diagnostics," *Power*, vol. 131, no. 1, pp. 47-51, Jan 1987.
- [20] J. Berry, "How to track rolling element bearing health with vibration signature analysis," *Sound and Vibration*, vol. 25, no. 11, pp. 24-35, Nov 1991.
- [21] Z. Reif and M. Lai, "Detection of developing failures by means of vibrations," Proceedings of the 12th Biennial Conference on Mechanical Vibration and Noise, Montreal Quebec, pp. 231-236, Sep 1989.
- [22] M. Lai and Z. Reif, "Prediction of ball bearing failures," Proceedings of the 1st Annual Machinery Monitoring and Diagnostics Conference, Las Vegas NV, pp. 122-126, Aug 1989.

- [23] G. Xistris, G. Boast and T. Sankar, "Time domain analysis of machinery vibration signals using digital techniques," *Journal of Mechanical Design*, ASME, vol. 102, pp. 211-216, Apr 1980.
- [24] G. Chaturvedi and D. Thomas, "Bearing fault detection using adaptive noise cancelling," *Journal of Mechanical Design*, ASME, vol. 104, pp. 280-289, Apr 1982.
- [25] P. Bradshaw and R. Randall, "Early detection and diagnosis of machine faults of the Trans Alaska pipeline," MSA Mechanical Signature Analysis, S. Braun ed., 9th Biennial Conference on Mechanical Vibration and Noise, ASME, Detroit MI, pp. 7-18, Sep 1983.
- [26] I. Martin, D. Pearce and A. Self, "The use of a distributed vibration monitoring system for on-line mechanical fault diagnosis," *Proceedings of the Software Engineering for Real Time Systems Conference*, Cirencester UK, pp. 277-283, Sep 1991.
- [27] I. Howard and G. Stachowiak, "Detection of surface defects in rolling contact bearings," *Mechanical Engineering*, Transactions of the Institution of Engineers, pp. 158-163, 1989.
- [28] A. Lifshits, H. Simmons and R. Smalley, "More comprehensive vibration limits for rotating machinery," *Journal of Engineering for Gas Turbines and Power*, ASME, vol. 108, pp. 583-560, Oct 1986.
- [29] H. Martin and D. Robinson, "Kurtosis measurements on bearings at low speeds," *Proceedings of the 1987 Sensors Expo*, Detroit MI, pp. 147-151, Sep 1987.
- [30] J. Mathew, "Monitoring the vibrations of rotating machine elements an overview," *Proceedings of the 12th Biennial Conference on Mechanical Vibration and Noise*, Montreal Quebec, ASME DE-vol 18-5, pp. 7-14, Sep 1989.
- [31] S. Dumbacher and G. Slater, "Dynamic modeling of bearing failures," *Proceedings of the 2nd Annual Machinery Monitoring and Diagnostics Conference*, 1990, Los Angeles CA, pp. 549-553, Sep 1990.
- [32] J. Banks and J. Carson, *Discrete-Event System Simulation*, Prentice-Hall Series in Industrial and Systems Engineering, W. Fabrycky and J. Mize eds., Englewood Cliffs NJ, pp. 318-319, 1984.
- [33] A. Oppenheim and R. Schafer, *Discrete-Time Signal Processing*, Prentice-Hall Signal Processing Series, A. Oppenheim ed., Englewood Cliffs NJ, pp. 158-163, 1989.

- [34] P. Behrendson, Supervisor of Development, Delco Chassis, General Motors Co., Personal conversation, January 14, 1993.
- [35] S. Rao, *Mechanical Vibrations*, Addison-Wesley Publishing, Reading MA, pp. 130-133, 1986.
- [36] K. Petersen, "Silicon as a mechanical material," *Proceedings of the IEEE*, vol. 70, no. 5, pp. 420-457, May 1982.
- [37] B. Puers et al., "A new uniaxial accelerometer in silicon based on the piezojunction effect," *IEEE Transactions on Electron Devices*, vol. 35, no. 6, pp. 764-769, June 1988.
- [38] Y. Fung, *A First Course in Continuum Mechanics*, Prentice-Hall Inc., Englewood Cliffs NJ, pp. 171-176, 1977.
- [39] R. Huston and C. Passerello, *Finite Element Analysis, an Introduction*, Marcel Dekker Inc., New York NY, p. 1, 1984.
- [40] NISA II User's Manual, Engineering Mechanics Research Co., PO Box 696, Troy MI, 48029.
- [41] R. Christensen, *Mechanics of Composite Materials*, John Wiley and Sons, New York NY, pp. 152-155, 1979.
- [42] H. McSkimin and P. Andreatch Jr., "Elastic moduli of silicon vs. hydrostatic pressure at 25.0 °C and -195.8 °C," *Journal of Applied Physics*, vol. 35, no. 7, pp. 2161-2165, Jul 1964.
- [43] K. Bathe, *Finite Element Procedures in Engineering Analysis*, Prentice-Hall Inc., NJ, pp. 114-121, 1982.
- [44] F. Pourahmadi, telephone conversation, Nova Sensors, 1055 Mission Court, Fremont CA, 94539, 1991, Sep 22.
- [45] S. Kim and K. Wise, "Temperature sensitivity in silicon piezoresistive pressure transducers," *IEEE Transactions on Electron Devices*, pp. 802-810, 1983.
- [46] E. Obermeier, "Polysilicon layers lead to a new generation of pressure sensors," International Conference on Solid-State Sensors and Actuators, *Transducers '85*, pp. 430-433.
- [47] W. Mason and R. Thurston, "Use of piezoresistive material in the measurement of displacement, force and torque," *Journal of the Acoustical Society of America*, vol. 29, no. 10, pp. 1096-1101, Oct 1957.

- [48] M. Chuey and T. Grotjohn, "An introduction to the effects of mechanical stress on the resistivity of silicon," Technical Report MSU-ENGR-89-007, Michigan State University, Apr 1989.
- [49] C. Smith, "Piezoresistance effect in germanium and silicon," *Physical Review*, vol. 94, no. 1, pp. 42-49, Apr 1954.
- [50] R. Geiger, P. Allen and N. Strader, *VLSI Design Techniques for Analog and Digital Circuits*, McGraw-Hill Publishing Co., pp. 108-126, 1990.
- [51] Y. Kanda, "Piezoresistance effect of silicon," *Sensors and Actuators A*, vol. 28, pp. 83-91, 1991.
- [52] O. Tufte and E. Stelzer, "Piezoresistive properties of silicon diffused layers," *Journal of Applied Physics*, vol. 34, no. 2, pp. 313-318, Feb 1963.
- [53] L. Roylance and J. Angell, "A batch-fabricated silicon accelerometer," *IEEE Transactions on Electron Devices*, vol. ED-26, no. 12, pp. 1911-1917, Dec 1979.
- [54] ICSensors product literature on models 3140, 3031, 3026 and 3021 accelerometers, ICSensors, 1701 McCarthy Blvd., Milpitas CA, 95035.
- [55] SenSym 1989 Solid-State Sensor Handbook, SenSym, Inc., 1244 Reamwood Avenue, Sunnyvale CA, 94089.
- [56] J. Binder, K. Becker and G. Ehrler, "Silicon pressure sensor for the range 2 kPa to 40 MPa," *Siemens Components* (English ed.), vol. 20, no. 2, Germany, pp. 64-67, Apr 1985.
- [57] J. Bryzek, "A new generation of high accuracy pressure transmitters employing a novel temperature compensation technique," WESCON Conference Record, vol. 25, pp. 15.4.1-15.4.10, 1981.
- [58] J. Wang et al., "A novel structure of pressure sensors," *IEEE Transactions on Electron Devices*, vol. 38, no. 8, pp. 1797-1802, Aug 1991.
- [59] T. Jackson, M. Tischler and K. Wise, "An electrochemical *p-n* junction etch-stop for the formation of silicon microstructures," *IEEE Electron Device Letters*, vol. EDL-2, no. 2, pp. 44-45, Feb 1981.
- [60] M. Hirata, S. Suwazono and H. Tanigawa, "A silicon diaphragm formation for pressure sensor by anodic oxidation etch-stop," Proceedings of the International Conference on Solid State Sensors and Actuators, Philadelphia PA, pp. 260-263, 1985.

- [61] W. Riethmüller et al., "Development of commercial CMOS process-based technologies for the fabrication of smart accelerometers," 1991 International Conference on Solid-State Sensors and Actuators, San Francisco CA, pp. 416-419, Jun 1991.
- [62] M. Lee, B. Lee and S. Jung, "A bipolar integrated silicon pressure sensor," Sensors and Actuators A, vol. 34, pp. 1-7, 1992.
- [63] I. Baskett, R. Frank and E. Ramsland, "The design of a monolithic, signal conditioned pressure sensor," *IEEE 1991 Custom Integrated Circuit Conference*, San Diego CA, p. 4p, May 1991.
- [64] R. Payne and A. Dinsmore, "Surface micromachined accelerometer: A technology update," SAE International Congress and Exposition, Detroit MI, pp. 574-582, Feb 1991.
- [65] R. Hadaway et al., "A sub-micron BiCMOS technology for telecommunications," *Microelectronic Engineering* (1991), vol. 15, pp. 513-516.
- [66] J. Kendall et al., "BANCMOS: A 25V mixed analog-digital BiCMOS process," *IEEE 1990 Bipolar Circuits and Technology Meeting*, Minneapolis MN, pp. 86-89, Sep 1990.
- [67] B. Graindourze and H. Casier, "Mixed analog/digital in a mixed bipolar/MOS technology," *Proceedings of the 3rd Annual IEEE ASIC Seminar and Exhibit*, Rochester NY, p. 4.4p, Sep 1990.
- [68] E. Berkcan, "MxSICO: A mixed analog digital compiler: Application to oversampled A/D converters," *IEEE 1990 Custom Integrated Circuits Conference*, Boston MA, p. 14.9.1, May 1990.
- [69] K. Soejima et al., "A BiCMOS technology with 660 MHz vertical PNP transistor for analog/digital ASICs," *IEEE Journal of Solid-State Circuits*, vol. 25, no. 2, pp. 410-416, Apr 1990.
- [70] B. Wooley, "BiCMOS analog circuit techniques," 1990 IEEE International Symposium on Circuits and Systems, New Orleans LA, vol. 3, pp. 1983-1986, May 1990.
- [71] K. Mahmoud and R. Wolffenbuttel, "Compatibility between bipolar readout electronics and microstructures in silicon," *Sensors and Actuators A*, vol. 31, pp. 188-189, 1992.

- [72] T. Tschan, N. De Rooij and A. Bezinge, "Oil-damped piezoresistive silicon accelerometers," 1991 International Conference on Solid-State Sensors and Actuators, San Francisco CA, pp. 112-114, Jun 1991.
- [73] H. Allen, S. Terry and D. DeBruin, "Accelerometer systems with self-testable features," *Sensors and Actuators*, vol. 20, pp. 153-161, 1989.
- [74] Nova Sensors product literature on NAS and NAH series accelerometers, 1055 Mission Court, Fremont CA, 94539, 1990.
- [75] "Temperature compensation IC pressure sensors," ICSensors Application Note, TN-002, Milpitas CA, Mar 1985.
- [76] J. Borky and K. Wise, "Integrated signal conditioning for silicon pressure sensors," *IEEE Transactions on Electron Devices*, vol. ED-26, no. 12, pp. 1906-1910, Dec 1979.
- [77] H. Tanigawa et al., "MOS integrated silicon pressure sensor," *IEEE Transactions on Electron Devices*, vol. ED-37, no. 7, pp. 1191-1195, Jul 1985.
- [78] H. Muro et al., "Stress analysis of SiO<sub>2</sub>/Si bi-metal effect in silicon accelerometers and its compensation," *Sensors and Actuators A*, vol. 34, pp. 43-49, 1992.
- [79] J. Gakkestad, H. Jakobsen and P. Ohlckers, "A front end CMOS circuit for a full-bridge piezoresistive pressure sensor," *Sensors and Actuators A*, vols. 25-27, pp. 859-863, 1991.
- [80] K. Suzuki et al., "Nonlinear analysis of a CMOS integrated silicon pressure sensor," *IEEE Transactions on Electron Devices*, vol. ED-34, no. 6, pp. 1360-1366, Jun 1987.
- [81] <u>Quattro Pro for Windows User's Guide</u>, Borland International Inc., Scotts Valley CA, pp. 309-316, 1992.
- [82] R. Spencer et al., "A theoretical study of transducer noise in piezoresistive and capacitive silicon pressure sensors," *IEEE Transactions on Electron Devices*, vol. 35, no. 8, pp. 1289-1297, Aug 1988.
- [83] A. Papoulis, <u>Probability, Random Variables, and Stochastic Processes</u>, McGraw-Hill Inc., New York NY, pp. 360-361, 1965.
- [84] A. Baschirotto, "An analog BiCMOS integrated circuit for front-end RDS decoder," *IEEE Transactions on Consumer Electronics*, vol. 37, no. 3, pp. 585-590, Aug 1991.

- [85] M. Monti, D. Rossi and M. Belloli, "Low-noise tape preamplifier with new self-biasing architecture," *IEEE Journal of Solid-State Circuits*, vol. 27, no. 7, pp. 966-972, Jul 1992.
- [86] J. Bryzek et al., "New technologies for silicon accelerometers enable automotive applications," SAE Technical Paper 920474, *Proceedings of the 1992 SAE International Congress and Exposition*, Detroit MI, Feb 1992.
- [87] J. Candy and G. Temes, "Oversampling methods for A/D and D/A conversion," <u>Oversampling Delta-Sigma Data Converters: Theory, Design and Simulation</u>, IEEE Press, Piscataway NJ, pp. 1-25, 1992.
- [88] J. Candy, B. Wooley and O. Benjamin, "A voiceband codec with digital filtering," *IEEE Transactions on Communications*, vol. COM-29, pp. 815-830, Jun 1981.
- [89] R. Koch et al., "A 12-bit sigma-delta analog-to-digital converter with a 15-MHz clock rate," *IEEE Journal of Solid-State Circuits*, vol. SC-21, no. 6, pp. 1003-1010, Dec 1986.
- [90] J. Candy, "A use of limit cycle oscillations to obtain robust analog-to-digital converters," *IEEE Transactions on Communications*, vol. COM-22, pp. 298-305, Mar 1974.
- [91] J. Candy, "A use of double integration in sigma delta modulation," *IEEE Transactions on Communications*, vol. COM-33, pp. 249-258, Mar 1985.
- [92] M. Hauser, P. Hurst and R. Brodersen, "MOS ADC-filter combination that does not require precision analog components," *ISSCC Digest of Technical Papers*, pp. 80-81, Feb 1985.
- [93] V. Friedman et al., "A dual-channel voice band PCM codec using ΣΔ modulation technique," *IEEE Journal of Solid-State Circuits*, vol. SC-24, pp. 274-280, Apr 1989.
- [94] A. Huber et al., "FIR lowpass filter for signal decimation with 15 MHz clock frequency," *IEEE Proceedings of the ICASSP '86*, Tokyo, pp. 1533-1536, Apr 1986.
- [95] B. Agrawal and K. Shenoi, "Design methodology for ΣΔM," *IEEE Transactions on Communications*, vol. COM-31, pp. 360-370, Mar 1983.
- [96] J. Candy, "Decimation for sigma delta modulation," *IEEE Transactions on Communications*, vol. COM-34, pp. 72-76, Jan 1986.

- [97] H. Meleis and P. LeFur, "A novel architecture design for VLSI implementation of an FIR decimation filter," *IEEE Proceedings of the ICASSP '85*, pp. 1380-1383, Mar 1985.
- [98] N. Weste and K. Eshraghian, <u>Principles of CMOS VLSI Design: A Systems Perspective</u>, Addison-Wesley Publishing Co., pp. 202-316, 1985.
- [99] B. Boser and B. Wooley, "The design of sigma delta modulation analog-to-digital converters," *IEEE Journal of Solid-State Circuits*, vol. SC-23, pp. 1293-1308, Dec 1988.
- [100] A. Karanicolas et al., "A high-frequency fully differential BiCMOS operational amplifier," *IEEE Journal of Solid-State Circuits*, vol. 26, no. 3, pp. 203-208, Mar 1991.
- [101] R. Castello and L. Tomasini, "1.5-V high-performance SC filters in BiCMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 26, no. 7, pp. 930-936, Jul 1991.
- [102] K. Hwang, <u>Computer Arithmetic: Principles, Architecture and Design</u>, John Wiley & Sons, New York, pp. 130-135, 1979.
- [103] <u>Verilog-XL Reference Manual</u>, Version 1.6, vol. 1, pp. 1-1 to 1-4, Mar 1991.
- [104] Silicon Microstructures Inc. product literature on models 7501 and 7502, rev. 1.0 15-02, Silicon Microstructures Inc., 46725 Fremont Boulevard, Fremont CA, 94538.
- [105] M. Neuberger and S. Welles, <u>Silicon</u>, Electronic Properties Information Center, Hughes Aircraft Co., Culver City CA, Oct 1969.
- [106] E. Oberg, F. Jones and H. Horton, <u>Machinery's Handbook</u>, 22nd Ed., Industrial Press Inc., New York NY, pp. 262-263, 1984.
- [107] J. Nye, <u>Physical Properties of Crystals</u>, Oxford University Press, London, 1957.
- [108] S. Chu and C. Burrus, "Multirate filter designs using comb filters," *IEEE Transactions on Circuits and Systems*, vol. CAS-31, pp. 913-924, Nov 1984.
- [109] E. Dijkstra et al., "On the use of modulo arithmetic comb filters in sigma delta modulators," *IEEE Proceedings of the ICASSP '88*, pp. 2001-2004, Apr 1988.
- [110] P. Davis and P. Rabinowitz, <u>Numerical Integration</u>, Blaisdell Publishing Co., Waltham MA, p. 20, 1967.

- [111] J. Johnson, <u>Introduction to Digital Signal Processing</u>, Prentice-Hall Inc., Englewood Cliffs NJ, pp. 211-212, 236-239, 1989.
- [112] W. Stanley, <u>Digital Signal Processing</u>, Reston Publishing Company, Inc., Reston VA, pp. 139-144, 168-174, 1975.

