ARIMA STATISTICAL MODELS OF TRANSIENT HEAT CONDUCTION ERRORS FOR AITKEN'S CONFIDENCE REGION

By

Irwin Perry Schisler

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Mechanical Engineering

1979

ABSTRACT

ARIMA STATISTICAL MODELS OF TRANSIENT HEAT CONDUCTION ERRORS FOR AITKEN'S CONFIDENCE REGION

By Irwin Perry Schisler

Parameter estimation is central to the scientific analysis of data. When thermal properties are found from transient temperature measurements, the thermal parameters are usually determined by a parameter estimation procedure based on standard least squares. However, at the high sampling rates associated with minicomputers, the model for the errors is important because the temperature residuals display serial correlation and possible signatures. Hence, the usual model for the errors of N(0, σ²I) is inadequate and the model N(μ, σ²V) is required.

The V matrix in the N(μ, σ²V) model was represented by an ARIMA model having zero mean. An ARIMA(1,0,1) model fit the residuals better than other ARIMA(p,d,q) models. The best fitting ARIMA model was determined by applying Akaike's information criterion to the standard least squares residuals. The coefficients required in Akaike's criterion were estimated from a rapid solution of the Yule-Walker equations. A small-dominant parameter analysis was used to bound the change in the estimated thermal parameters caused by the non-zero value of the μ vector. A one small parameter model that approximately fit the observed signatures in the residuals was used to predict changes in the thermal parameters that could be neglected.

The confidence region for the thermal parameters in the nonlinear parameter estimation problem was approximated by an ellipse typically used for linear estimation problems. The size of this ellipse is related to the fractile. The fractile is computed from the central F-distribution when the V matrix is known. It was found that the fractile should be computed from the noncentral F-distribution when the V matrix is estimated and the sample size is small. The presence of ARIMA(1,0,1) errors is important in the statement of the correct confidence ellipse. When the N(0, σ²V) model is correct, the incorrect N(0, σ²I) model yielded a 95 percent confidence ellipse that in reality is only a 15 percent confidence ellipse.

When the N(0, σ²V) model is true, the thermal parameters should be estimated by Aitken's least squares. An iterative Aitken's least squares estimator was implemented within an existing standard least squares computer program for the heat conduction problem. The accuracy of this modified computer code was verified in a simulation study. It was found that the effects of correlated errors for the available dynamic measurements and associated models can be treated without great difficulty. The parameter estimates using standard least squares tend to be very close to the values computed by the more powerful Aitken's least squares method. Hence the parameters required in the covariance matrix, V, can be estimated accurately and easily from the standard least squares residuals. The covariance matrix for the thermal parameters can be evaluated by a simple recursive difference scheme rather than using the covariance matrix V directly in the matrix product.
Thus, Aitken's confidence region and standard errors can be computed easily.

DEDICATION

My entry into a doctoral program evolved from my high school days where I was impressed by what proved to be the farsighted and sophisticated acquaintance with engineering attained by Mr. Wendell Martin, an omnibus and resourceful high school science teacher.

ACKNOWLEDGMENTS

Computer time for the Monte Carlo study was funded in part under National Science Foundation Grant GK-41495. Also used was subsidized computer time allocated at Michigan State University to doctoral research. An eight month graduate research assistantship was obtained through the Division of Engineering Research, J. W. Hoffman, Director.

I appreciate the efforts made by the original doctoral committee: K. J. Arnold, F. W. Bakker-Arkema, J. V. Beck (Chairman), and M. C. Potter. Professor Beck discussed and provided insights into his published research procedures for the correlated error problem. Professor Bakker-Arkema encouraged entry into a doctoral program by financial support to participate in a low-grade reject heat project. Professors Arnold and Potter made suggestions on the proposed research.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF IMPORTANT SYMBOLS

Chapter

I. DESCRIPTION OF THE PROBLEM
   1.1 Introduction
   1.2 Nomenclature for Correlated Error Studies
   1.3 Review of Correlated Error Studies
       1.3.1 Estimation Using a Covariance Matrix
       1.3.2 Developing ARIMA Error Models
       1.3.3 Generalized Standard Errors: Monte Carlo Studies
   1.4 Review of Simulation Studies
       1.4.1 Standard Errors in Heat Conduction
       1.4.2 Confidence Regions for Nonlinear Parameter Estimates
   1.5 Problem Statement
   1.6 Plan of Investigation

II. STATISTICAL ANALYSIS OF HEAT CONDUCTION DATA
   2.1 Description of Farnia's Transient Temperature Data
       2.1.1 Heat Conduction Model
       2.1.2 Experimental Conditions
       2.1.3 Farnia's Estimates of the Heat Flux
       2.1.4 Estimates of the Thermal Parameters
   2.2 Description of the Covariance Matrix Problem
       2.2.1 Computed Temperature Residuals
       2.2.2 General Structure of the Error Covariance Matrix
   2.3 Observed Structure of the Covariance Matrix
       2.3.1 Durbin-Watson and Schmidt Tests
       2.3.2 Cumulative Error Test
       2.3.3 Contemporaneous Correlation Test
       2.3.4 Normality Test
   2.4 Serial ARIMA Models for the Covariance Matrix
       2.4.1 ARIMA Model Specification by Akaike's Criterion
       2.4.2 Sampling Rate and ARIMA Model Specification
       2.4.3 Sampling Rate and ARIMA Coefficients
   2.5 Investigation of the ARIMA(1,0,1) Model and Residuals

III. ESTABLISHING THE CONFIDENCE REGION
   3.1 Estimators and Confidence Regions Considered
   3.2 Confidence Region When the V Matrix Is Known
   3.3 Confidence Region When the V Matrix Is Estimated
       3.3.1 Confidence Coefficient Estimator of the Noncentrality Coefficient
       3.3.2 Trial Statistic Estimator of the Noncentrality Coefficient
       3.3.3 Monte Carlo Study of a Linear Model
   3.4 Aitken's Least Squares Estimator
       3.4.1 Adjustment of SLS Computer Code to Obtain ALS Estimates
       3.4.2 Demonstration of ALS Estimation for ARIMA(1,0,0) Errors
       3.4.3 Aitken's Least Squares with ARIMA Errors
       3.4.4 Approximating Maximum Likelihood by Aitken's Estimator
   3.5 Aitken's Least Squares Simulation Study
       3.5.1 Estimates of Thermal and ARIMA Parameters
       3.5.2 Estimates of the Information Matrix
       3.5.3 Estimates of the Noncentrality Coefficient
   3.6 Calculation of Confidence Ellipses
       3.6.1 ALS and SLS Confidence Ellipses
       3.6.2 MLS and SLS Confidence Ellipses

IV. FURTHER INVESTIGATION OF THE COVARIANCE MATRIX
   4.1 Optimum Experiment Design
   4.2 Signature in the Residuals at the Heated Surface
   4.3 Signature in the Heat Flux at the Insulated Surface
   4.4 Same ARIMA Coefficients for All Sensors
   4.5 More Detailed ARIMA Coefficients
   4.6 Improved ARIMA Order Specification

V. CONCLUSIONS AND RECOMMENDATIONS
   5.1 Feasibility of Aitken's Least Squares Estimation
   5.2 Engineering Design Implications
   5.3 Parameter Estimation Implications

APPENDICES
A. COMPUTATION OF AKAIKE'S CRITERION
   A.1 Stationarity and Invertibility Tests
   A.2 Computation Method with MA Absent
   A.3 Computation Method with MA Present
   A.4 Verification of the Pagano-Chow Algorithm
   A.5 Autocovariance for Multisensor Data
B. DERIVATION OF DISTRIBUTION FOR PERCENTILES WHEN V IS KNOWN
   B.1 ALS Confidence Ellipse
   B.2 MLS Confidence Ellipse
   B.3 SLS Confidence Ellipse
   B.4 Computation of Percentiles
C. SIMULATION OF HEAT CONDUCTION DATA
   C.1 Analytical Solution for Temperature
   C.2 Generation of Normal Variates
REFERENCES

LIST OF TABLES

1.1  Linear Model Monte Carlo Study
1.2  One Parameter Heat Conduction Simulation Study
2.1  Parameters Estimated by Computer Program COND
2.2  Parameters Estimated by Computer Program PROPTY
2.3  Parameters Assigned by Studying the Residuals
2.4  Yule-Walker Estimates for Case-0 Heat Conduction Residuals
2.5  Hard Limit Estimates for Case-0 Heat Conduction Residuals
2.6  Estimate of Contemporaneous Correlation
2.7  Kuiper Statistics for Pre-Heat-Flux Residuals
2.8  Top Six ARIMA Models by Experiment
2.9  Top Six ARIMA Models at Insulated Surface
2.10 Top Six ARIMA Models at Heated Surface
2.11 Stability of ARIMA Coefficients by Experiment for 0.3 Second Data
2.12 Frequency of Best ARIMA Models for 0.3 Second Data
2.13 Frequency of Best ARIMA Models for 0.6 Second Data
2.14 Stability of ARIMA Coefficients by Experiment at 0.6 Seconds
2.15 Predicted ARIMA(1,0,1) Coefficients at 0.6 Seconds
3.1  Confidence Coefficients for ARIMA(0,0,0) Compared to Central F Values
3.2  Confidence Coefficients for ARIMA(2,0,0) Compared to Non-Central F Values
3.3  Sample Mean and Variance for Equation (3.17) by Monte Carlo Study
3.4  Confidence Coefficients for Equation (3.17)
3.5  Convergence of Aitken's Estimator for Case-0 Data
3.6  Estimates of ARIMA(1,0,1) Starting Values
3.7  Heat Conduction Simulation Study with ARIMA(1,0,1) Errors
3.8  Heat Conduction Study Information Matrix and Sensor Weights
3.9  Kuiper and Noncentrality Estimates
3.10 Aitken's ARIMA(1,0,1) Ellipse Statistics
3.11 Standard Least Squares Ellipse Statistics
3.12 Correct SLS Confidence Coefficients
3.13 Confidence Ellipse Data SLS and MLS
3.14 Ninety-Five Percent Ellipse Coefficients
3.15 Comparison of Ellipse Coefficients
4.1  Initial Estimates of the Non-Zero Mean
A.1  ARIMA Model Discrimination for Series A
C.1  Approximations for u(0,t) and u(L,t)

LIST OF FIGURES

2.1-2.10   Insulated Surface Residuals: Case-0 through Case-9
2.11-2.20  Heated Surface Residuals: Case-0 through Case-9
2.21       Insulated Surface, e and M e: Case-0 Sensor-3
2.22       Heated Surface, e and M e: Case-0 Sensor-7
3.1        Ninety-Five Percent Confidence Ellipse for SLS and ALS
3.2        Ninety-Five Percent Confidence Ellipse for SLS and MLS
4.1        Heat Flux at Insulated Surface: Case-1

LIST OF IMPORTANT SYMBOLS

a      Vector of normal random variates
       Axis length of ellipse
AIC    Akaike's Information Criterion
A      Covariance matrix of the a vector
b      Estimate of the parameter β vector
c      Product of specific heat and density
d      Number of difference operations
e      Vector of residuals, e = y − X b
F      Fisher-Snedecor distribution, F(k, n−k, 1−α)
F′     Noncentral F distribution, F′(k, n−k, λ, 1−α)
       Fractile for confidence coefficient
k      Thermal conductivity
k      Number of thermal parameters
M      Autoregressive transformation matrix
m      Number of sensors in multisensor data
n      Number of measurements per sensor
p      Number of autoregressive parameters, ARIMA(p,d,q)
P      Transformation matrix, w = P a
q      Number of moving-average parameters, ARIMA(p,d,q)
Q      Matrix for quadratic form
s²     Estimate of white noise variance, σ²
t      Student's t-distribution, t(n−k, 1−α)
       Confidence region statistic
W      Matrix determined by ARIMA model, W = σ² V
w      Vector of additive errors, y = X β + w
W      Covariance matrix of the w vector
y      Vector of measurements
α      Confidence coefficient
β      Parameter vector in linear model
θ      Moving-average coefficient
λ      Noncentrality coefficient
μ      Mean value of residuals
       Angle of ellipse with k-axis
σ²     White noise variance
φ      Autoregressive coefficient
~      Is distributed as

CHAPTER I

DESCRIPTION OF THE PROBLEM

In Section 1.1 the importance to parameter estimation of modeling the errors is discussed. In Section 1.2 three features are isolated in modeling errors and in using the correct covariance matrix.
The literature on these three factors is reviewed in Section 1.3 for simple physical models. This is extended in Section 1.4 to more complicated physical models that more closely resemble nonlinear estimation of the heat conduction problem. In Section 1.5 the problem investigated in this dissertation is defined as being only one aspect of ongoing parameter estimation studies at Michigan State University, and in Section 1.6 the topics presented in subsequent chapters are outlined.

1.1 Introduction

Mathematical modeling of physical phenomena is an important part of engineering and science. In many cases either the physical model is unknown or many of the associated constants are not known accurately enough. In such cases it is imperative to perform experiments and to take measurements. Parameter estimation is central to the scientific analysis of such data to determine parameter values and accuracies and also to provide insight for improving upon the mathematical model of the physical phenomena.

In addition to modeling the physical phenomena it is necessary to model the measurement errors to provide (1) the most accurate parameter estimates, (2) accurate confidence regions, and (3) optimal experiment designs. In dynamic experiments in which the variables are being repeatedly measured, the common assumption of uncorrelated measurements may not be valid. Heat conduction data can be used to test the development of these worthwhile parameter estimation techniques.

Currently optimal design in heat conduction is determined by selecting boundary conditions, by determining the duration of the experiments, and by choosing locations for the measurement sensors. In addition to these conditions the optimal design depends on the covariance matrix of the errors, denoted W. Different covariance matrices are obtained as the time between measurements is changed and there may be a best sampling rate. The W matrix can also be used in discriminating between rival mathematical models and in maximum a posteriori estimation.

Several examples can be given. Van Fossen (1973, Figure 4.2.9) found that the presence of correlated errors prevented effective discrimination between rival models describing the heating of a Bismuth-Lead alloy, because no technique was available to determine the error covariance matrix W. Beck (1975) cited as examples three heat conduction papers where correlated errors are undoubtedly present and modeling these errors could improve the parameter estimates or confidence regions. Clarke (1973) suggested using correlated error models to improve the fit of hydrologic groundwater equations.

1.2 Nomenclature for Correlated Error Studies

The specific techniques considered in this dissertation involve modeling errors in order to improve the estimation of thermal parameters and the statement of confidence regions for these parameters. Although finding thermal parameters requires nonlinear parameter estimation, it is easier to define the techniques by the following linear model having additive errors: namely,

    y = X β + w      (1.1)

where y is an n by 1 vector of measurements, X is a known matrix, β is the estimated k by 1 parameter vector, and w is an n by 1 vector of measurement errors. The standard assumption for w is

    w ~ N(0, σ² I)      (1.2)

which means that the errors have zero mean, constant identical variance, and a normal distribution. The variance σ² is assumed to be estimated throughout this dissertation.
A 100(1−α) percent confidence ellipse for the parameter vector b is

    (b − β)ᵗ Xᵗ X (b − β) ≤ k s² F(k, n−k, 1−α)      (1.3)

where b is the estimate of β, s² is an estimate of σ², and F is the fractile yielding probability α in the right tail of the F-distribution. The confidence interval for the j-th parameter is given by

    b_j ± t(n−k, 1−α) (s² (Xᵗ X)⁻¹_jj)^½      (1.4)

where t is the fractile for the Student's t-distribution, and (Xᵗ X)⁻¹_jj is the j-th diagonal entry in the inverse of the Xᵗ X matrix. Some researchers such as Draper and Smith (1966, p. 65) denote the estimated standard errors as

    est. s. e. (b_j) = (s² (Xᵗ X)⁻¹_jj)^½      (1.5)

The confidence interval in Equation (1.4) depends on two factors: (1) the estimated standard errors, and (2) the fractile of the Student's t-distribution.

Beck (1975) indicated that errors in heat conduction experiments conducted at Michigan State University were not independent. Thus, the model that should be used for the errors is

    w ~ N(0, σ² V)      (1.6)

where V is not equal to the identity matrix I. The presence of the correlated errors was noticed when a minicomputer was used for data acquisition. The correlation was noticed by plotting the residuals, e = y − X b, and testing whether the number of sign changes was abnormally small. In the statistical literature this sign test is also called a run test or a Geary test. The analysis by the sign test indicated that the errors are more correlated as the sampling rate on the minicomputer is increased.

Accurate data were collected by Farnia (1976) using an IBM-1800 minicomputer data acquisition system using a rate of three-tenths of a second between measurements. I used these data to study the problems associated with including the V matrix in the estimation and confidence region equations. The data are undoubtedly representative of data from dynamic experiments in other engineering problems so that the analysis techniques developed also apply to these engineering problems.

When the covariance matrix V in Equation (1.6) is known the data and model can be transformed so that the errors satisfy Equation (1.2). Assuming that the V matrix can be factored as

    V = L Lᵗ      (1.7)

then the transformations

    F = L⁻¹ y   and   Z = L⁻¹ X      (1.8)

make the linear physical model in Equation (1.1) become

    F = Z β + L⁻¹ w      (1.9)

where

    L⁻¹ w ~ N(0, σ² I)      (1.10)

Thus the transformed errors are distributed according to the error model given by Equation (1.2). Hence the confidence interval for the j-th parameter in Equation (1.9) is

    b_j ± t(n−k, 1−α) (s² (Zᵗ Z)⁻¹_jj)^½      (1.11)

where the estimated standard errors are given by

    est. s. e. (b_j) = (s² (Zᵗ Z)⁻¹_jj)^½      (1.12.A)

Using Equation (1.8) it is clear that Equation (1.12.A) can also be written as

    est. s. e. (b_j) = (s² (Xᵗ V⁻¹ X)⁻¹_jj)^½      (1.12.B)

which clearly shows that Equation (1.12) depends on the V matrix. When the V matrix is known, the standard errors are given by Equation (1.12) rather than the incorrect Equation (1.5).

When the V matrix is estimated, Theil (1971, p. 246) used the expression asymptotic standard errors since the asymptotic estimates of the standard errors are based on the asymptotic approximation to the sampling distribution. Thus the confidence interval given by Equation (1.11) should be written as

    b_j ± G(1−α) (est. s. e. (b_j))      (1.13)

where Equation (1.12) is used to estimate the standard errors. The more accurate the estimate of the V matrix, the better is the approximation that G(1−α) is t(n−k, 1−α).
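The transformation in Equations (1.7) through (1.12) is easy to exercise numerically. The following sketch, written in Python with NumPy (an editorial illustration, not one of the dissertation's computer programs), forms the Cholesky factor L of a given V, whitens y and X, and returns the Aitken (generalized least squares) estimates with the standard errors of Equation (1.12.B); the data and the AR(1)-type V used in the demonstration are placeholders.

    import numpy as np

    def aitken_ls(X, y, V):
        """Aitken (generalized) least squares for y = X b + w with w ~ N(0, sigma^2 V)."""
        L = np.linalg.cholesky(V)                 # V = L L^t, Equation (1.7)
        Z = np.linalg.solve(L, X)                 # Z = L^-1 X, Equation (1.8)
        F = np.linalg.solve(L, y)                 # F = L^-1 y
        b = np.linalg.solve(Z.T @ Z, Z.T @ F)     # minimizes the whitened sum of squares
        resid = F - Z @ b
        n, k = X.shape
        s2 = float(resid @ resid) / (n - k)       # estimate of sigma^2
        cov_b = s2 * np.linalg.inv(Z.T @ Z)       # s^2 (X^t V^-1 X)^-1, Equation (1.12.B)
        return b, np.sqrt(np.diag(cov_b))

    # hypothetical straight-line data with AR(1)-type correlation, V_ij = rho^|i-j|
    n, rho = 60, 0.7
    idx = np.arange(n)
    V = rho ** np.abs(np.subtract.outer(idx, idx))
    X = np.column_stack([np.ones(n), idx.astype(float)])
    rng = np.random.default_rng(0)
    y = X @ np.array([100.0, 0.1]) + np.linalg.cholesky(V) @ rng.standard_normal(n)
    print(aitken_ls(X, y, V))          # Aitken estimates and standard errors
    print(aitken_ls(X, y, np.eye(n)))  # standard least squares as the V = I special case

Setting V to the identity matrix recovers standard least squares, which is why the same routine serves for both estimators.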
In terms of the above presentation the need for an error model given by Equation (1.6) instead of by Equation (1.2) has three features. First, the coefficients in the V matrix in Equation (1.6) must be estimated. Second, although less obvious than the first feature, the structure of the V matrix in terms of the L matrix in Equation (1.10) must be modelled. Third, an adequate approximation of G(1−α) in Equation (1.13) must be proposed.

1.3 Review of Correlated Error Studies

Three features of the correlated error problems were isolated in Section 1.2. In this section these three features will be discussed for simple physical models.

1.3.1 Estimation Using a Covariance Matrix

The first feature of the correlated error problems is estimating the standard errors. In order to compute the standard errors a computation scheme must be available to estimate the ARIMA parameters in the V matrix in Equation (1.7).

Maximum likelihood estimators are preferred when the physical parameters are estimated and the errors are correlated. The maximum likelihood procedure uses as the cost function the logarithm of the probability density f_ML given by

    f_ML(β) = (2π)^(−n/2) (det W)^(−1/2) exp(−S/2),   where   S = (y − X β)ᵗ W⁻¹ (y − X β)      (1.14.A)

with the error covariance matrix given by W = σ² V. The statistical assumptions for using f_ML are normally distributed additive errors that have zero mean and are correlated. The covariance matrix W is estimated but X is errorless and there is no prior information. Beck and Arnold (1977) used a concise notation where a string of eight numbers referred to tabulated assumptions so that the above problem is denoted 11-01211 when the V matrix is estimated.

When the error covariance matrix is assumed to be known the maximum likelihood estimator reduces to the Aitken's least squares estimator which has the cost function

    f_ALS(β) = (y − X β)ᵗ W⁻¹ (y − X β)      (1.14.B)

Aitken (1935) proved that the cost function f_ALS yields the minimum variance estimator when W is known. The prefix A in ALS is used throughout this dissertation to denote Aitken, with LS used to denote least squares.

For data acquired by a minicomputer the sample size can be large, say n = 1000, or even larger. Thus, it could be difficult to find the matrix W⁻¹ even if the error covariance matrix W were known. In previous research at MSU by Van Fossen (1973) solving the problem of inverting the W matrix seemed to be impractical. Beck (1974) approached the problem of inverting W by assuming the errors could be modelled by certain classes of autoregressive integrated moving-average (ARIMA) models, and found that the inverse for these models was related to certain differences of the data and model values. This occurs because the inverse itself is not needed; what is needed is only the evaluation of a quadratic form in Equation (1.14) containing the inverse of the error covariance matrix W.

For an autoregressive process with known coefficients, Beck (1974) proposed using the differences

    Z_j(i) = e_j(i) − ρ_1j e_j(i−1) − . . . − ρ_pj e_j(i−p)

where the residuals are e_j(i) = y_j(i) − T_j(i) and the subscripts refer to the j-th thermocouple at the i-th time for the data vector y and the temperature T predicted from the model. The Aitken's least squares cost function for the differenced residuals is

    f_ALS(β) = Σ_{j=1}^{m} Σ_{i=1}^{n} Z_j(i) Z_j(i)      (1.14.C)

where the limits on the summation are for m sensors and n measurements per sensor.
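A minimal sketch of the differenced-residual cost function in Equation (1.14.C), assuming the autoregressive coefficients are known, is given below in Python with NumPy. The residual arrays and coefficient values are illustrative placeholders, not data from the dissertation.

    import numpy as np

    def als_cost(residuals, rho):
        """Aitken least squares cost from differenced residuals, Equation (1.14.C).

        residuals : list of 1-D arrays e_j(i) = y_j(i) - T_j(i), one per sensor
        rho       : list of autoregressive coefficient sequences, one per sensor
        """
        cost = 0.0
        for e, coeffs in zip(residuals, rho):
            p = len(coeffs)
            z = e[p:].copy()                    # Z_j(i) = e_j(i) - sum_k rho_kj e_j(i-k)
            for k, c in enumerate(coeffs, start=1):
                z -= c * e[p - k:-k]
            cost += float(z @ z)                # sum over times, accumulated over sensors
        return cost

    # two hypothetical sensors with first order coefficients 0.8 and 0.7
    rng = np.random.default_rng(1)
    e1, e2 = rng.standard_normal(100), rng.standard_normal(100)
    print(als_cost([e1, e2], [[0.8], [0.7]]))

Because only differences of adjacent residuals are formed, no n-by-n matrix is ever inverted, which is the point of the Beck (1974) approach.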
The implementation of parameter estimation using the cost function given by Equation (1.14.C) is considered in this dissertation when the ρ_j's are estimated.

Deutsch (1965, p. 64) stated the exact values for the elements of the error covariance matrix W are seldom known. When the error covariance matrix is estimated, Dhrymes (1971), extending the doctoral thesis at MSU by Ruble, showed that a particular computational scheme for iterative Aitken's least squares yielded the same estimates as from the more complicated maximum likelihood cost function. The iterative Aitken's least squares procedure uses a two step scheme. In the first step estimates for the physical parameters are found using an error covariance matrix that is regarded as the true error covariance matrix; in the second step estimates for the error covariance matrix are found using the residuals from the first step. The two step scheme is repeated until the parameter estimates converge.

1.3.2 Developing ARIMA Error Models

A second feature of the correlated error problem is finding an ARIMA model that is an accurate approximation to the errors. The ARIMA (AutoRegressive Integrated Moving-Average) model determines the V matrix in Equation (1.7) and Equation (1.6). Beck (1974) and Beck (1975) have suggested that the V matrix in heat conduction is not the identity matrix. Bard (1974, p. 248) stated that the estimation of the V matrix when serial correlation is unknown is relatively difficult. Apparently because replicated data are rarely available, Bard (1974, pp. 63-66) only considered contemporaneous correlation models, see Equation (2.10), when the V matrix is estimated.

It can be more effective to use a deterministic model rather than to find an autocorrelated model for the V matrix. Carr (1972) showed in a Cobb-Douglas model of Bell Telephone of Canada that the need for a nondiagonal V matrix can be eliminated by changing the specification of the physical model. The residuals in the initial specification of the model had one sign before the introduction of automatic switching equipment and had the other sign after that time. Thus, the residuals had a pattern or signature. By introducing a binary variable for the effect of the introduction of the switching equipment, Carr (1972) obtained residuals in the revised model that had more sign changes and passed tests for independent errors. For these independent errors the error covariance matrix depends on the identity matrix I rather than on the nondiagonal matrix V. In engineering problems the V matrix may be associated with small parameter physical processes. Bard (1974, p. 202) stated: "particularly in cases where data are very accurate neglected effects outweigh random errors in measurement, and consequently nonrandomness of residuals is the rule, rather than the exception when models are fitted to good data." Hence, it may be necessary to find a model for the V matrix.

Since the book by Box and Jenkins (1970) has become well-known, ARIMA(p,d,q) models for errors have become popular. The three order parameters indicate there are p autoregressive coefficients, d differences, and q moving-average coefficients that must be determined in order to find the ARIMA(p,d,q) model that is the best fit. Pandit (1973, pp. 17-49) gave an excellent historical review of the work in 1938 by Wold that is the basis of the ARIMA model, and Pandit (1973, pp. 170-184) interpreted ARIMA models in current mathematical terminology.
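As a concrete illustration of the ARIMA(p,d,q) notation, the short Python sketch below generates an ARIMA(1,0,1) series, the order later found in this dissertation to fit the heat conduction residuals. The coefficient values and series length are arbitrary placeholders, and the sign convention for the moving-average term follows Box and Jenkins.

    import numpy as np

    def arima_101(phi, theta, n, sigma=1.0, seed=0):
        """Generate w_i = phi*w_(i-1) + a_i - theta*a_(i-1), with a_i white noise."""
        rng = np.random.default_rng(seed)
        a = sigma * rng.standard_normal(n)
        w = np.zeros(n)
        for i in range(1, n):
            w[i] = phi * w[i - 1] + a[i] - theta * a[i - 1]
        return w

    # placeholder coefficients; a model with d > 0 would difference the series d times first
    w = arima_101(phi=0.8, theta=0.3, n=144)
    print(w[:5])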
A discrimination criterion is required to find the order parameters: p, d, and q. Beck and Arnold (1977, p. 473) discussed discrimination procedures for the physical model but did not indicate a discrimination procedure for the ARIMA order. Gallant and Goebel (1976) stated that the referees of their paper recommended the final prediction error procedure developed in 1969 by Akaike to specify the order parameters. An improved version of this criterion was developed by Akaike (1972) and seems to be the criterion accepted by investigators in the area for specifying the order parameters. The criterion by Akaike (1972) was used in this dissertation.

To study the situation expected in real problems where close but not exact identification is made for the orders p and q, Gallant and Goebel (1976) estimated an ARIMA(2,0,0) model when the simulated real model was either white noise, ARIMA(0,0,4), or ARIMA(1,0,0). Schmidt (1970) also considered the sensitivity of the assumed order to the actual order. Both a linear and a distributed lag model were considered by Schmidt (1970) with the linear model being

    y_i = α + β x_i + w_i      (1.15)

Schmidt (1970) was able to study the effect of assuming autocorrelation when it is absent, the effect of assuming independent errors (white noise) when autocorrelation is present, and the effect of assuming an insufficient number of terms in the autocorrelation model. The three error models used were: (1) independent errors or white noise

    w_i = a_i      (1.16)

(2) first order autoregressive errors

    w_i = ρ w_{i−1} + a_i      (1.17)

and (3) second order autoregressive errors

    w_i = ρ_1 w_{i−1} + ρ_2 w_{i−2} + a_i      (1.18)

The definition is introduced that the white noise vector satisfies

    a ~ N(0, σ² I)      (1.19)

The system of equations based on the physical model in Equation (1.15), with α set equal to zero, and the statistical model given by Equation (1.17) will be used in example problems throughout this dissertation. The sampling statistics reported by Schmidt (1971) were the variance of β and the bias of σ² because these quantities both occur in the confidence region. The best choice was to assume second order autoregressive errors when any of the three error models was true. Schmidt (1970, p. 18) stated this is true for infinite samples since any coefficient ρ_j that is zero is estimated consistently as being zero, but this need not be true in finite (small) samples.

1.3.3 Generalized Standard Errors: Monte Carlo Studies

This section discusses recent Monte Carlo studies that explored the estimation of the standard errors and the accuracy of using the usual t-distribution. Hence, these studies investigated two features indicated in Section 1.2 for the correlated error problem.

Beck (1974) conducted and Beck and Arnold (1977, p. 318) interpreted a Monte Carlo study for correlated errors in a particular linear model given by Equation (1.15) with errors given by Equation (1.17): parameter values used were α = 100, β = 0.1, σ² = 1, and a range of values of ρ = −1, −0.5, 0, 0.5, 1. The sample size was n = 60 and t = 34 trials were used. Their results are presented in Table 1.1 in terms of two sampling statistics: the bias (rows 3 and 5) and the estimated standard errors (rows 4 and 6). The results of the study indicated that for positive values of ρ the estimated standard error is underestimated because the average of the maximum likelihood estimate of ρ is biased toward zero. The estimates of the physical parameters (α and β) are good.
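The bias toward zero in the estimated autoregressive coefficient is easy to see in a small simulation. The Python sketch below repeats a Table 1.1 style experiment for the linear model of Equation (1.15) with AR(1) errors; it uses a simple lag-one estimate of ρ from the standard least squares residuals as a stand-in for the maximum likelihood estimate, and all parameter values are placeholders rather than the original study's settings.

    import numpy as np

    def monte_carlo(rho=0.5, n=60, trials=200, seed=3):
        """y_i = alpha + beta*x_i + w_i with w_i = rho*w_(i-1) + a_i, unit-variance a_i."""
        rng = np.random.default_rng(seed)
        x = np.arange(n, dtype=float)
        X = np.column_stack([np.ones(n), x])
        rho_hat, beta_hat = [], []
        for _ in range(trials):
            w = np.zeros(n)
            for i in range(1, n):
                w[i] = rho * w[i - 1] + rng.standard_normal()
            y = 100.0 + 0.1 * x + w
            b = np.linalg.lstsq(X, y, rcond=None)[0]       # standard least squares fit
            e = y - X @ b
            rho_hat.append(float(e[1:] @ e[:-1]) / float(e @ e))   # lag-one estimate of rho
            beta_hat.append(b[1])
        return np.mean(rho_hat), np.mean(beta_hat)

    # the average estimate of rho falls below the true value of 0.5, while beta is near 0.1
    print(monte_carlo())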
Goebel (1974) investigated the distribution G(1−α), where G(1−α) is contained in Equation (1.13). They considered the nonlinear physical model that can be written as

    y_i = β_1 exp(β_2 x_i) + w_i      (1.20)

and conducted a Monte Carlo study with sample size n = 60 and t = 200 trials. Three models were used for the errors: Equation (1.16) with σ² = 0.25, Equation (1.17) with σ² = 0.25 and ρ = 0.735, and the fourth order moving-average model denoted ARIMA(0,0,4)

    w_i = 1.5 a_i + a_{i−1} + 0.85 a_{i−2} + 0.33 a_{i−3} + 0.5 a_{i−4}

Goebel (1974) computed the t-distribution confidence coefficients from a fifteen cell histogram and applied a chi-square goodness of fit test to show that the distribution was not a t-distribution. This is evident from the empirical fractile points for the five (α = 0.95) and ninety-five (α = 0.05) percent confidence intervals that were −2.28 and 1.99 compared to the true values of −1.67 and 1.67, respectively.

Table 1.1 Linear Model Monte Carlo Study

This flatness in G(1−α) at nine fractile points was confirmed by Gallant and Goebel (1976) in a Monte Carlo study involving Equation (1.20) with errors described by Equation (1.18).

The Monte Carlo studies described above used small sample sizes, while it is conventional wisdom in econometrics that for large sample sizes the distribution is the same whether the statistical parameters are known or estimated consistently. There are several papers that allegedly show that the distribution is the same; e.g., Kmenta (1971, p. 507), Maddala (1971), Kmenta (1971, p. 529), and Schmidt (1970, p. 5). However, Schmidt (1976, p. 69) constructed a counter-example. This is for Equation (1.15) with α equal to zero and the particular V matrix given by

    V = diag(1, γ, γ², . . . , γ^(n−1))

and the resulting distribution for the physical parameter is

    n^(1/2) (b − β) ~ N(0, σ² δ)      (1.21)

From Equation (1.21) it is clear that the estimate of β is the same but the variance is increased by δ when V is estimated. When γ is unity, the asymptotic variance is increased by about eight percent in this counter-example. Schmidt (1976, p. 71) suggested that his experience indicated the large sample variance has δ equal to unity when the errors are described by ARIMA models.

The standard errors can also be increased if Equation (1.5) is used instead of Equation (1.12); i.e., if standard least squares is used but is not valid. Magness and McGuire (1962) showed that the increase in the standard errors depends on the particular combination of the X matrix from Equation (1.1) and the V matrix from Equation (1.6). This dependency affects conclusions from a Monte Carlo study since only specific combinations of X and V are used.

1.4 Review of Simulation Studies

The purpose of this section is to review the use of small trial procedures to understand complicated physical models in contrast to the procedures used for the simple physical models presented in Section 1.3.

1.4.1 Standard Errors in Heat Conduction

In a Monte Carlo study, Beck and Arnold (1977, p. 401) made an assessment of the effect of correlated errors on a heat conduction problem. The physical model was heat conduction in a semi-infinite solid subject to a step change in surface temperature.
The temperature in the solid was given as

    T(x,t) = T∞ + (T_0 − T∞) erfc(x (4 α t)^(−1/2))

where the physical parameter is α, the thermal diffusivity, the temperature at the surface is maintained at value T_0 and the temperature infinitely far into the solid is T∞, t is time, and T(x,t) is the temperature at position x and time t. In order to simulate actual heat conduction measurement conditions two sensors were used (m = 2) and they were located at 0.125 and 0.250 inches from the surface, and n = 28 measurements were generated for each sensor. The error model was ARIMA(1,0,0) given in Equation (1.17) and three values of the coefficient ρ were used: 0, 0.5 and 0.9. The estimated standard error was given as

    est. s. e.(α) = (s² (X_1ᵗ V⁻¹ X_1 + X_2ᵗ V⁻¹ X_2)⁻¹)^½

where the subscripts 1 and 2 on the X vector indicate the two measurement locations. The X vector is the partial derivative of T with respect to α for both locations and for all the experimental times. In Table 1.2, Zᵗ Z is defined by

    Zᵗ Z = X_1ᵗ V⁻¹ X_1 + X_2ᵗ V⁻¹ X_2

The quantities calculated for Table 1.2 are related to the problems for correlated errors isolated in Section 1.2. The results for estimating the V matrix are that the standard errors are sensitive to the true value of the autoregressive coefficient ρ, with the standard errors increasing by a factor of about seven as the true autoregressive coefficient increased from ρ = 0 to ρ = 0.9. The results for approximating G(1−α) are that in three out of five cases the estimated value is within the standard deviation so that the distribution of the standard errors is consistent with a t-distribution.

Table 1.2 One Parameter Heat Conduction Simulation Study

1.4.2 Confidence Regions for Nonlinear Parameter Estimates

In real engineering problems the models have nonlinear parameters and a simulation study using these models can be expensive. Although the distribution of G(1−α) in Equation (1.13) when the errors satisfy white noise as in Equation (1.2) was shown by Ivanov (1972) and Chambers and Ertel (1975) to be consistent and normal when the sample size is large, a procedure is required for small samples. This was investigated by Tierney (1971) for the standard least squares procedure where β is estimated from the equation

    (y − T)ᵗ T_β = 0

which Tierney (1971, p. 115) approximates by a seven term series

    Σ_{i=0}^{6} A_i (β − b)^i = 0      (1.22)

that is solved for (β − b) in terms of the A_i's by SNOBOL, a symbolic manipulation computer language. From (β − b) the moments of the distribution were found for the nonlinear parameters and hence the mean and variance of the probability distribution were calculated. It is beyond the scope of this dissertation to investigate the probability distribution in this much detail, and therefore normal and F distributions will be used.

Not only is it difficult to determine the probability distribution for nonlinear parameters using Equation (1.22) but also it is difficult to obtain moments from a simulation study.
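By contrast, the correlated-error standard error quoted in Section 1.4.1 can be evaluated directly once the sensitivity coefficients X = ∂T/∂α are computed. The Python sketch below shows the structure of that calculation for an AR(1) error covariance; the diffusivity, sensor depths, time step, and units are illustrative placeholders and are not the conditions of Table 1.2.

    import numpy as np
    from math import erfc, exp, pi, sqrt

    def sensitivity(x, t, alpha, T0=100.0, Tinf=0.0):
        """dT/d(alpha) for T = Tinf + (T0 - Tinf) erfc(x / sqrt(4 alpha t))."""
        u = x / sqrt(4.0 * alpha * t)
        # d(erfc(u))/d(alpha) = (2/sqrt(pi)) exp(-u^2) * u / (2 alpha)
        return (T0 - Tinf) * (2.0 / sqrt(pi)) * exp(-u * u) * u / (2.0 * alpha)

    def est_std_error(alpha=0.5, rho=0.5, s2=1.0, n=28, depths=(0.125, 0.250), dt=1.0):
        times = dt * np.arange(1, n + 1)
        idx = np.arange(n)
        V = rho ** np.abs(np.subtract.outer(idx, idx))     # AR(1) covariance shape
        Vinv = np.linalg.inv(V)
        ztz = 0.0
        for x in depths:
            Xs = np.array([sensitivity(x, t, alpha) for t in times])
            ztz += Xs @ Vinv @ Xs                          # X1' V^-1 X1 + X2' V^-1 X2
        return sqrt(s2 / ztz)                              # (s^2 (Z'Z)^-1)^(1/2)

    # the standard error grows as the true autoregressive coefficient increases
    print(est_std_error(rho=0.0), est_std_error(rho=0.9))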
A relevant example is the nonlinear solid state physics model investigated by both Pfeiffer and Lichtenwalner (1974) and by Chambers and Ertel (1975), which is

    y_i = β_1 + β_2 / (1 + 4 β_3² (x_i − β_4)²) + w_i

where the additive errors w_i are white noise as in Equation (1.2). Pfeiffer and Lichtenwalner (1974) used t = 400 trials to plot ordered estimates of β_3 on quantile paper to see whether the distribution is consistent with a normal distribution. The t = 400 trial study was "extremely expensive of computer time." Chambers and Ertel (1975) repeated this study with t = 100 trials in order to find an analysis procedure adequate for small trials and they investigated: the quantiles from the plotted ordered values, a linearized estimate as in Equation (1.5), and a quadratic estimate. Again it is beyond the scope of this dissertation to investigate the probability distribution in this much detail.

1.5 Problem Statement

The general emphasis of this research stems from Beck and Arnold (1977, p. 289) who stated: "In order to present meaningful confidence regions it is necessary that the underlying assumptions are valid. Two assumptions frequently violated in scientific work are that the errors have zero mean and that the errors are uncorrelated. Erroneously taking these assumptions to be true has led many to present overly small confidence regions." The implication of this statement is that the model considered for the errors should be

    w ~ N(μ, σ² V)      (1.23)

This dissertation concentrates on the error model given by Equation (1.6) which differs from Equation (1.23) by the assumption that μ = 0. The effect of a non-zero μ created by a signature in the residuals is not considered in detail.

Based on the three features of the correlated error problem outlined in Section 1.2 and the review of the literature, the topics that need to be investigated in the correlated error problem are as follows. First, it is necessary to find simultaneous estimates of both the physical and statistical (ARIMA) parameters; this will be implemented by modifying an existing computer code (called PROPTY) so that it computes iterative Aitken's least squares estimates. Second, the ARIMA model for the autocorrelated errors is found from real data; the residuals from standard least squares estimation with computer program PROPTY will be used to find the proper ARIMA model. Third, the Aitken's least squares and standard least squares confidence regions are constructed from real data and the accuracy of using Equation (1.12) can be compared to using the incorrect Equation (1.5); a Monte Carlo study is also conducted with t = 20 trials to gain insight into the effect of estimating the error covariance matrix V.

The originality in this dissertation is (1) to isolate relevant procedures primarily from the econometric literature in order to present confidence regions when autocorrelated errors are present, (2) to make improvements in the procedures so that they can be implemented in a problem where heat conduction parameters are estimated, and (3) to apply the procedures to available heat conduction data in order to show how the procedures can be used.

The illustrative heat conduction problem is the estimation of thermal conductivity and specific heat in an Armco iron disk using transient temperature and heat flux data. This analysis procedure is undoubtedly characteristic of many other related parameter estimation problems in mechanical, agricultural, civil, and chemical engineering.
In Section 1.1 and Section 1.2 the general importance of modeling the covariance matrix is discussed. A more specific listing of fac- tors in parameter estimation that are affected by the presence of a covariance matrix is outlined in National Science Foundation (NSF) 22 proposal GE-41495. This dissertation solves some but not all of the problems suggested in NSF proposal GE-4l495. In this proposal to NSF five modifications in computer program PROPTY were indicated: (1) to extremize a maximum likelihood cost function that contains an error covariance matrix, (2) to test the resulting computer routine for numerical stability since the additional ARIMA coefficients are estimated, (3) to incorporate sequential estimation, (4) to utilize prior information about the error covariance matrix from previous experiments, and (5) to replace the estimate of the standard errors from PROPTY with an estimate of Aitken's (or maximum likelihood) standard errors. These five aspects are motivated in part by the desire to have an estimator usable on a minicomputer data acquisition and analysis system. It is reasonable to expect researchers to try using ARIMA models for autocorrelated errors in data acquired by laboratory mini- computers, because procedures were developed in the early 1960's to handle ARIMA errors associated with tracking data processed by spacecraft minicomputers. An explicit algorithm for spacecraft minicomputer data was developed by Blum (1961) for sequential esti- mation when the coefficients in the autoregressive model are known. The third and fourth aspects are considered by Beck and Arnold (1977) in computer program NLINA, but are not considered in this dis- sertation for several reasons. Beck and Arnold (1977, p. 276) gave equations for sequential estimation when the correlated errors are known. However, there is a question of the usefulness of sequential estimates when the goal is to construct a confidence region with 23 unknown ARIMA parameters. Odell and Lewis (1971) found that although autoregressive parameters could be estimated as part of a recursive algorithm, the statistical properties of the estimated parameters could not. The fourth aspect, which is including prior information, creates two difficulties: as stated for aspect three the residuals may not have the same statistical properties, and the additional terms in the quadratic form for the standard errors alters the statistical distribution of the quadratic form. Hence aspects three and four are not considered in this dissertation. This dissertation was written to explore the fifth aspect of the research proposed in GK-41495 and indirectly the first and second aspects are considered. The first aspect is achieved by using iter- ated Aitken's least squares estimation which Dhrymes (1971) showed converged to the maximum likelihood estimates. The second aspect is not expected to be a problem because Goldstein and Swerling (1970) found that two iterations of iterated Aitken's least squares yielded estimates close to those obtained with a known covariance matrix V. The fifth aspect (in its simplest form) merely requires using the estimates from Aitken's least squares in Equation (1.12.8). 1.6 Plan of Investigation The solution of the problem outlined in Section 1.5 proceeds as follows. Chapter II describes some experimental data obtained at Michigan State University that is analyzed to identify the proper ARIMA model. 
Chapter III is a development of models for the confidence regions and also a Monte Carlo study of the confidence region when the V matrix is estimated. Also in Chapter III confidence regions for real heat conduction data are presented. Chapter IV discusses additional aspects of the fitted ARIMA model. Chapter V gives the major conclusions for the correlated error problem.

CHAPTER II

STATISTICAL ANALYSIS OF HEAT CONDUCTION DATA

This chapter develops autoregressive integrated moving-average (ARIMA) models for the errors. In Section 2.1 a procedure is outlined to obtain temperature residuals by estimating thermal parameters from the experimental transient temperature data using standard least squares. In Section 2.2 the residuals are plotted and a general model for the errors is outlined. In Section 2.3 it is shown that the residuals are not correlated contemporaneously, but are correlated serially. In Section 2.4 an ARIMA(1,0,1) model for the residuals is shown to have the best fit based on computed values of Akaike's Information Criterion, and by the agreement of estimated and predicted coefficients in the ARIMA(1,0,1) model when the sampling rate changes from 0.3 to 0.6 seconds between measurements. In Section 2.5 additional comments about specifying the ARIMA model as ARIMA(1,0,1) are given.

2.1 Description of Farnia's Transient Temperature Data

2.1.1 Heat Conduction Model

The conditions in the heat conduction experiments performed by Farnia (1976) are described in this section. The specimen was Armco Magnetic Ingot Iron for DC Applications. The shape of the specimen is a disk 3 inches in diameter and 1 inch in thickness. This disk is insulated on one face (and on the edge) and is heated on the other face. The total duration of a test was about 40 seconds. A known constant heat flux was applied during a nominal 15.3 second interval at the start of the experiment. The physical model to describe these experimental conditions is the heat conduction equation in one dimension

    c ∂T/∂t = ∂/∂x (k ∂T/∂x)      (2.1)

subject to the conditions

    T(x,0) = T_0      (2.2.A)

    k ∂T(L,t)/∂x = 0      (2.2.B)

    −k ∂T(0,t)/∂x = q_0 H(t) H(s − t)      (2.3)

where H denotes the unit step function. The thermal properties of Armco iron at T_0 for a small temperature rise are two parameters in the model: k, the thermal conductivity, and c, the product of the density and the specific heat at constant pressure. The reader is referred to Table 2.2 for typical numerical values of k and c for the range of conditions covered in the ten experiments conducted by Farnia (1976). The density is constant and is taken by Farnia (1976, p. 37) as 490.71 lbm ft⁻³.

The estimates of k and c in this section of the dissertation are obtained by minimizing the standard least squares cost function

    f(k,c) = Σ_{j=1}^{m} Σ_{i=1}^{n} (Te(x_j, t_i) − T(x_j, t_i))²      (2.4)

where Te is the experimental temperature, T is the calculated temperature, and there are m sensors at locations denoted by x_j, and there are n measurements per sensor taken at times denoted by t_i. The procedures used to establish Te and q_0 are discussed in the following sections. The method used by Farnia (1976) to obtain his estimates of k and c did not involve Equation (2.4).
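A compact way to see how Equations (2.1) through (2.4) fit together is to solve the conduction model numerically and evaluate the least squares cost for trial values of k and c. The Python sketch below uses an explicit finite-difference scheme with illustrative SI-like property values and grid sizes; it is not the dissertation's program PROPTY, only a demonstration of the structure of the calculation.

    import numpy as np

    def solve_temperature(k, c, q0, s, L=0.0254, T0=0.0, nx=21, dt=0.01, t_end=40.0):
        """Explicit finite differences for c dT/dt = d/dx(k dT/dx) on 0 <= x <= L,
        heat flux q0 applied at x = 0 for 0 <= t <= s, insulated at x = L."""
        dx = L / (nx - 1)
        T = np.full(nx, T0)
        fo = k * dt / (c * dx * dx)              # grid Fourier number (must be < 0.5)
        times, history = [], []
        t = 0.0
        while t < t_end:
            q = q0 if t <= s else 0.0
            Tn = T.copy()
            Tn[1:-1] = T[1:-1] + fo * (T[2:] - 2.0 * T[1:-1] + T[:-2])
            Tn[0] = T[0] + 2.0 * fo * (T[1] - T[0]) + 2.0 * dt * q / (c * dx)   # heated face
            Tn[-1] = T[-1] + 2.0 * fo * (T[-2] - T[-1])                         # insulated face
            T, t = Tn, t + dt
            times.append(t); history.append(T.copy())
        return np.array(times), np.array(history)

    def sls_cost(Te, Tmodel):
        """Equation (2.4): sum over sensors and times of (Te - T)^2."""
        return float(np.sum((Te - Tmodel) ** 2))

    # illustrative call (placeholder values, not Farnia's conditions)
    times, hist = solve_temperature(k=40.0, c=3.5e6, q0=3.0e4, s=15.3)
    print(hist[-1, 0], hist[-1, -1])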
2.1.2 Experimental Conditions

Thermocouple millivoltages were processed by Farnia (1976) in the following manner. The millivoltages from each thermocouple were amplified and sent to an IBM-1800 computer. The computer program to convert voltages to temperatures was developed by Van Fossen (1973), who removed 60 cycle noise by averaging a voltage at a given time with another voltage one one-hundred-twentieth (1/120) of a second later. This voltage and a similarly processed voltage from a thermocouple 180° away on the same face of the disk provided the average voltage of two thermocouples and is called the output from one sensor. The sensor voltage is converted to a temperature value using regression equations for a J-type thermocouple. Farnia (1976, p. 48) used four thermocouples per face on each of the two symmetrically placed disks; that is, four thermocouples on each of the four faces. By averaging these sixteen thermocouple readings, eight sensor readings were available for analysis: four from sensors on the heated surface and four from sensors on the insulated surface. The experiments by Farnia (1976) can be thought of as a single disk with four sensors on its heated face and four sensors on its insulated face. A three-tenths of a second sampling rate was selected for the IBM-1800, and 144 temperature values were recorded during the 43 seconds of the experiment and pre-experiment and post-experiment times. The temperature of the disk was approximately uniform at both the start and at the end of the 43 second time period.

The thermocouple response values were recorded by the IBM-1800 and these values were punched on cards by the IBM-1800. The data on these cards were processed by computer program COND to yield values that were aligned at the start and end of the experiment; these values were used as the experimental temperatures. Farnia (1976) found it necessary to correct the temperature for each sensor by computer program COND because the same temperature was not measured by thermocouples at the same surface despite the accuracy of the calibration. The regression model he used to line up the temperatures at the beginning and end of the experiment was:

    T*_I = α_1 + β_1 T_I
    T*_H = α_2 + β_2 T_H

where T* is the corrected temperature, T is the temperature recorded by the IBM-1800, and the subscripts are I for the insulated surface and H for the heated surface. The regression constants α_1, β_1, α_2, and β_2 are solutions of the system of four equations

    ½(T_IB + T_HB) = α_1 + β_1 T_IB
    ½(T_IB + T_HB) = α_2 + β_2 T_HB
    ½(T_IF + T_HF) = α_1 + β_1 T_IF
    ½(T_IF + T_HF) = α_2 + β_2 T_HF

where the subscript B denotes the initial time and F denotes the final time.

Computer program COND determines four sets of α's and β's with one set for each pair of thermocouples. Because they are in adjacent columns of the IBM-1800 punched data cards, the pairs are taken in the order with thermocouple numbers (1,5), (2,6), (3,7), and (4,8). Each pair has one thermocouple at the heated surface and one thermocouple at the insulated surface. This arrangement facilitates the estimation of k and c by the integral method to be described in the next section since one estimate of k and c is obtained from each pair of sensors with the average value used as the final estimate. This alignment of the sensors affects the pre-heat-flux residuals in Section 2.3.4.

The data obtained by Farnia (1976) are summarized in Table 2.1. There were ten cases with starting temperatures ranging from 80 to 360 degrees Fahrenheit, and two levels of the applied heat flux. These two levels of the applied heat flux produced respective temperature rises of 15 and 30 degrees Fahrenheit.
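The alignment performed by program COND amounts to solving, for each thermocouple pair, two small linear systems for (α_1, β_1) and (α_2, β_2). A Python sketch of that calculation follows; the temperature values in the example are hypothetical, not Farnia's data.

    import numpy as np

    def alignment_constants(TIB, THB, TIF, THF):
        """Solve the four alignment equations for one thermocouple pair.

        The common target at each end of the experiment is the average of the
        insulated- and heated-surface readings; B denotes the initial time and
        F the final time."""
        target_B = 0.5 * (TIB + THB)
        target_F = 0.5 * (TIF + THF)
        a1, b1 = np.linalg.solve([[1.0, TIB], [1.0, TIF]], [target_B, target_F])
        a2, b2 = np.linalg.solve([[1.0, THB], [1.0, THF]], [target_B, target_F])
        return (a1, b1), (a2, b2)

    def corrected(T, a, b):
        """T* = alpha + beta * T for a recorded temperature T."""
        return a + b * T

    # hypothetical pair: insulated channel reads 80.2/95.1 F, heated channel 80.6/95.5 F
    print(alignment_constants(80.2, 80.6, 95.1, 95.5))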
2.1.3 Farnia's Estimates of the Heat Flux

The heat flux q(t) was not recorded in detail by Farnia (1976) because the parameters were estimated with a method requiring only the total heat flux. The estimation method was developed by Beck and Al-Araji (1974). It is useful to briefly indicate the equations used in this method. Assume that the surface of the disk at x = L is insulated and the surface at x = 0 has a heat flux q(t). The total heat input Q, given by

    Q = ∫ q(t) dt

is assumed to be known, but the detailed flux q(t) is not necessarily known. The known value of Q is computed from a scalar equation written as

    Q = V² (2 R A)⁻¹

where V is the measured voltage drop and A is the known area of the face of the disk. The resistance of the heater is modelled as a linear regression on the average of the initial and final temperature

    R = α_3 + β_3 (T_IB + T_HB + T_IF + T_HF)

The conductivity and specific heat are estimated from

    k = L Q / (2 ∫₀^{t_F} (T(0,t) − T(L,t)) dt)      (2.5.A)

    c = Q / (L (T(x,t_F) − T(x,0)))      (2.5.B)

where t_F denotes the end of the experiment, when the disk temperature is again uniform.

Farnia (1976) applied the heat flux, q(t), by a thin electrical heater confined between the two identical Armco iron disks. Each disk was three inches in diameter and one inch thick. Half the heat generated by the heater went into each disk. After some initial sensor temperatures were recorded, a constant voltage was applied by a D.C. power supply to the heater for approximately 15.3 seconds. The resulting nearly constant heat flux can be described mathematically as

    q(t) = q_0 H(t) H(s − t)

Table 2.2 Parameters Estimated by Computer Program PROPTY

Figures 2.1 through 2.10 show the residuals at the insulated surface and Figures 2.11 through 2.20 show the residuals at the heated surface. Specific features of these plots are discussed when the residuals are modelled. Table 2.3 also indicates the quality of the residuals that are displayed in Figures 2.1 through 2.20. The magnitudes of the residuals are not the same in all ten cases; the spread in the magnitudes falls roughly into three groups: 0.8, 1.6, and 3.2 degrees Fahrenheit. These magnitudes are the basis of the quality index denoted A, B, and C. The residuals at the insulated surface are smaller in general than those at the heated surface. The smaller of the two heat fluxes is associated with the smaller residuals. The quality of the residuals is used as a factor in selecting the ARIMA model in a well-executed experiment in Table 2.8.

Table 2.3 Parameters Assigned by Studying the Residuals

    Case   B (sec)   E (sec)   F (sec)   Quality
    0      3.45      18.6      30.0      B A = A
    1      2.7       18.0      27.9      B B = B
    2      6.3       21.6      31.5      A A = A
    3      4.8       20.1      30.0      B C = C
    4      5.1       20.4      30.3      A A = A
    5      5.1       20.4      30.3      A C = C
    6      4.62      19.96     29.7      B C = C
    7      6.17      21.53     31.2      A B = A
    8      5.26      20.51     30.3      B C = C
    9      5.7       21.0      30.9      B B = B

Figures 2.1 through 2.10: Insulated surface residuals, Case-0 through Case-9.
Figures 2.11 through 2.20: Heated surface residuals, Case-0 through Case-9.
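The integral-method estimates in Equations (2.5.A) and (2.5.B) involve only a time integral of the face-temperature difference and the overall temperature rise. A Python sketch of that calculation, assuming sampled face temperatures and a trapezoidal rule, is shown below; the arrays in the example are made-up placeholders, not Farnia's measurements.

    import numpy as np

    def integral_estimates(times, T_heated, T_insulated, Q, L):
        """Estimate k and c from Equations (2.5.A) and (2.5.B).

        times        : sample times, s
        T_heated     : T(0, t) at the heated face
        T_insulated  : T(L, t) at the insulated face
        Q            : total heat input per unit area
        L            : disk thickness
        """
        area = np.trapz(T_heated - T_insulated, times)     # integral of the face difference
        k = L * Q / (2.0 * area)                           # Equation (2.5.A)
        rise = T_insulated[-1] - T_insulated[0]            # final minus initial temperature
        c = Q / (L * rise)                                 # Equation (2.5.B), c = rho * c_p
        return k, c

    # crude illustration with made-up face temperatures
    t = np.linspace(0.0, 40.0, 401)
    T_hot = 60.0 + 5.0 * (1.0 - np.exp(-t / 5.0))
    T_cold = 60.0 + 5.0 * (1.0 - np.exp(-t / 10.0))
    print(integral_estimates(t, T_hot, T_cold, Q=4.5e5, L=0.0254))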
2.2.2 General Structure of the Error Covariance Matrix

The purpose of this section is to identify the general form of the matrix V*, the covariance matrix for the errors. In real problems the error covariance is unknown. One way of proceeding involves using the residuals produced using standard least squares. Draper and Smith (1966, p. 79) have stated concisely that "in practical problems it is often difficult to obtain specific information on the form of V* at first, and for this reason it is sometimes necessary to make the (known to be erroneous) assumption V* = I and then attempt to discover something about the form of V* by examining the residuals."
The assumption V* = I implies that a standard least squares cost function, such as Equation (2.4), is used when the residuals are calculated. The structure of the V* matrix is related to a P_m matrix and an A* matrix defined below.

For the j-th sensor the additive error at time i is denoted w_i^(j), and for each sensor the errors can be written as an n-component column vector w^(j). The white noise consists of normal, identically and independently distributed errors denoted a_i^(j) for the j-th sensor at the i-th time, and the white noise for each sensor can be written as an n-component column vector a^(j). A practical ARIMA error model for serial correlation relates the additive error vector w^(j) to the white noise vector a^(j) as

    w^(j) = P^(j) a^(j)                                                   (2.7)

where P^(j) is a square matrix of dimension n. Equation (2.7) is an essential equation used elsewhere in the dissertation. Equation (1.10) given previously assumes that P^(j) can be defined, although the matrix P^(j) was denoted L. Equation (3.22) given later is a computational P^(j) matrix for ARIMA errors, although the matrix P^(j) will be denoted L M^-1.

The particular ARIMA(p,d,q) model appropriate for the heat conduction residuals determines the particular form of the P^(j) matrix. When there is more than one sensor there can be contemporaneous correlations. For multisensor data with m sensors the convention used is to stack the w^(j) vectors by sensor,

    w*^t = ( w^(1)t, w^(2)t, . . . , w^(m)t )   and   a*^t = ( a^(1)t, a^(2)t, . . . , a^(m)t ).

The model for serially correlated multisensor data is

    w* = P_m a*                                                           (2.8)

where P_m is a square matrix of dimension mn with the matrices P^(j) as its diagonal blocks,

    P_m = diag( P^(1), P^(2), . . . , P^(m) ).

The error covariance matrix is denoted W*, equals cov(w*), and satisfies

    W* = P_m A* P_m^t                                                     (2.9)

(A small numerical illustration of this construction is given at the end of this section.) The contemporaneous correlation, defined as the correlation between different sensors, is written as

    A* = E( a* a*^t )                                                     (2.10.A)

The matrix A* is square and symmetric and has dimension mn. The matrix A* has the general block form

    A* = [ A_11  A_12  . . .  A_1m ]
         [ A_21  A_22  . . .  A_2m ]                                      (2.10.B)
         [  .     .              . ]
         [ A_m1  A_m2  . . .  A_mm ]

with each component matrix A_ij being a square matrix of dimension n. Each component matrix A_ij is diagonal because, by definition, white noise is not serially correlated. It is convenient to assume that the contemporaneous correlation has homoscedasticity, defined as constant variance for A_ii; this is discussed further in Section 4.5. Thus, in a given A_ij the diagonal entries are all assumed to equal the same constant.

In order to be practical, the combined model for serial and contemporaneous correlation must involve only a few parameters. Assuming that each sensor obeys the same order ARIMA(p,d,q) process, there are a total of (p + q) m + m + (1/2)(m^2 - m) coefficients from the ARIMA process in the matrix W*. This count can be bounded by two considerations. First, Box and Jenkins (1970) suggested that usually the sum p + q is two or less. Second, Farnia's data have only eight sensors, m = 8, although sixteen thermocouples were used. Thus there can be as many as 52 different parameters in the matrix W*.
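To make the block structure of Equations (2.8) and (2.9) concrete, the sketch below assembles P_m and W* for a small hypothetical example with m = 2 sensors and n = 4 times, using an AR(1)-type P^(j) purely for illustration; the numbers are not taken from the dissertation.

    import numpy as np

    n, m = 4, 2                       # times per sensor, number of sensors
    phi, sigma2 = 0.9, 1.0            # illustrative AR(1) coefficient and white-noise variance

    # P(j): lower-triangular matrix mapping white noise to AR(1) errors, w_i = phi*w_{i-1} + a_i
    P_j = np.array([[phi ** (i - k) if i >= k else 0.0 for k in range(n)] for i in range(n)])

    # Equation (2.8): block-diagonal P_m with one P(j) block per sensor
    P_m = np.kron(np.eye(m), P_j)

    # Equation (2.10): A* with diagonal blocks sigma2*I (no contemporaneous correlation here)
    A_star = sigma2 * np.eye(m * n)

    # Equation (2.9): error covariance W* = P_m A* P_m^t
    W_star = P_m @ A_star @ P_m.T
    print(W_star.shape)               # (8, 8)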
2.3 Observed Structure of the Covariance Matrix

A reasonable approximation to the matrix W* was investigated as follows. Whether the serial correlation coefficients are zero is tested in Section 2.3.1 and whether they are unity is tested in Section 2.3.2. Contemporaneous correlations are represented and estimated for the pre-heat-flux data in Section 2.3.3. It is concluded that the matrix W* is dominated by serial correlation and that, instead of the 52 different coefficients suggested in Section 2.2.2, only 3 different coefficients are needed. In Section 2.3.4 a calibration problem related to the randomness of the white noise is discussed.

2.3.1 Durbin-Watson and Schmidt Tests

The Durbin-Watson test statistic for the presence of serial correlation discussed by Kmenta (1971) is

    d = 2 (1 - phi_1)

where phi_1 is the coefficient in the ARIMA(1,0,0) model computed from the Yule-Walker equations given as Equation (A.7). In a test of the hypothesis of independent errors against the alternative of positive first order autocorrelation (with n = 90 and the 0.01 level of significance) the rules are: (1) reject if d < d_L = 1.47; (2) do not reject if d > d_U = 1.56; (3) do neither when d_L < d < d_U. From Table 2.4 the average value of phi_1 was estimated to be 0.772. Thus the statistic is

    d = 2 (1 - 0.772) = 0.456.

Since this value is less than 1.47 the hypothesis of independent errors is rejected in favor of the hypothesis of first order autocorrelated errors. The hypothesis is also rejected for each of the individual sensors since the largest value of d is 0.82.

A Durbin-Watson type procedure for second order autocorrelation, i.e., ARIMA(2,0,0), was developed by Schmidt (1972). The test uses two statistics; one is

    d_1 = 2 (1 - phi_1)

where phi_1 is the Yule-Walker estimator of the coefficient in the ARIMA(1,0,0) process, and the other is

    d_2 = 2 (1 - phi_2)

where phi_2 is the Yule-Walker estimator of the second coefficient in the ARIMA(2,0,0) process. The test statistic used by Schmidt (1972) implicitly assumes that the coefficients have equal orders of magnitude and it is

    delta = d_1 + d_2.

In a test of the hypothesis of independent errors against the alternative of second order autocorrelation (with n = 90 and the 0.01 level of significance) the rules are: (1) reject if delta < d_L = 3.18; (2) do not reject if delta > d_U = 3.35; (3) do neither when d_L < delta < d_U. From Table 2.4 the estimates of phi_1 and phi_2 are 0.772 and 0.306, respectively. Hence the test statistic is

    delta = 2 (1 - 0.772) + 2 (1 - 0.306) = 1.844.

Thus the hypothesis of independent errors is rejected and the alternative hypothesis of second order autocorrelation is accepted. The hypothesis of independent errors is also rejected for each sensor because the largest value of delta is 2.192. (A computational sketch of both screening statistics follows Table 2.4.)

Table 2.4 Yule-Walker Estimates for Case-0 Heat Conduction Residuals

                ARIMA(1,0,0) Process        ARIMA(2,0,0) Process
    Sensor      phi_1      sigma^2          phi_1     phi_2     sigma^2
    1           0.613      0.0098           0.435     0.291     0.0089
    2           0.834      0.0161           0.498     0.403     0.0135
    3           0.918      0.0318           0.681     0.258     0.0297
    4           0.696      0.0210           0.446     0.359     0.0183
    5           0.782      0.0168           0.531     0.321     0.0151
    6           0.919      0.0157           0.774     0.157     0.0153
    7           0.590      0.0180           0.324     0.451     0.0144
    8           0.831      0.0282           0.659     0.208     0.0270
    Average     0.773                       0.544     0.306
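The following sketch reproduces the arithmetic of the Durbin-Watson and Schmidt screening tests from Yule-Walker estimates of the autoregressive coefficients. It is an illustrative implementation, not the dissertation's computer code; the sample autocorrelation used here is the usual biased estimator and may differ in detail from Equation (A.7).

    import numpy as np

    def yule_walker_coefficients(w):
        """Returns phi_1 from the AR(1) fit and phi_2 from the AR(2) fit."""
        w = np.asarray(w, dtype=float) - np.mean(w)
        r = [np.sum(w[k:] * w[:len(w) - k]) / np.sum(w * w) for k in (1, 2)]
        phi1_ar1 = r[0]
        # AR(2) Yule-Walker system: [[1, r1], [r1, 1]] [phi1, phi2]^t = [r1, r2]^t
        _, phi2_ar2 = np.linalg.solve([[1.0, r[0]], [r[0], 1.0]], r)
        return phi1_ar1, phi2_ar2

    def durbin_watson_and_schmidt(w):
        phi1, phi2 = yule_walker_coefficients(w)
        d = 2.0 * (1.0 - phi1)              # Durbin-Watson statistic, d = 2(1 - phi_1)
        delta = d + 2.0 * (1.0 - phi2)      # Schmidt statistic, delta = d_1 + d_2
        return d, delta

    # With the Case-0 averages phi_1 = 0.772 and phi_2 = 0.306 these formulas give
    # d = 0.456 < d_L = 1.47 and delta = 1.844 < d_L = 3.18, so independence is rejected.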
2.3.2 Cumulative Error Test

Cumulative errors are the special case of first order autoregressive errors with phi having the value of unity. When phi is close to unity the ARIMA(1,0,0) errors may be approximated as cumulative errors. Some authors advocate describing errors by cumulative error models from considerations of the nature of the physical process and/or the measuring device. Mandel (1957) postulated that an extent of reaction for sucrose measured on a fixed specimen has errors that are cumulative. Heuvel et al. (1976) postulated that the integral number of counts in an x-ray diffraction experiment is cumulative and used the procedure proposed by Mandel (1957) and improved by Beck (1974).

Mandel (1964) suggested there are two types of errors: cumulative errors associated with "process" errors and other "analytical" errors. An examination of the least squares residuals could be used to decide whether the process errors dominate so that a cumulative error model is appropriate. However, the analysis by Anderson (1975, p. 142) showed that when white noise analytical errors and cumulative process errors are both present from more than one source, the sum of these errors is an ARIMA(p,d,q) process. Thus, unless process errors are the only errors it is unlikely that the observed errors will be simply cumulative. This is discussed further in Section 2.5.

A simple screening method to determine whether cumulative errors dominate is to use a hard-limit estimator of the coefficient phi in an ARIMA(1,0,0) process. A hard-limited process Z has two states that are assigned the numerical values 0 and 1. These are related to a continuous-valued w_i-series by the rule: (1) if w_i is non-negative then Z_i = 1; (2) if w_i is negative then Z_i = 0. For the ARIMA(1,0,0) process in Equation (1.17), Kedem (1976) derived the hard-limit estimator of phi as

    phi = sin( pi (c - 1/2) )                                             (2.11)

where

    c = [ 2 * sum over i of Z_i Z_{i+1}  -  2 * sum over i of Z_i  +  Z_1 + Z_n + (n - 1) ] / (n - 1).

Let an excursion of the w-series above the axis be called a hump. The first two terms in the numerator of c sum to minus twice the number of humps; for example, there are two humps in the series (01110110) and for this series these terms sum to minus four. Thus the parameter c can be estimated by inspection of the plotted w-series. (A computational sketch follows Table 2.5.)

This evaluation procedure was applied to the heat conduction residuals in Figures 2.1 and 2.11 from Beck and Arnold (1977, p. 408). Because the sample size in these plots is relatively large (n = 85), the hard-limit estimator is in close agreement with the Yule-Walker estimates from a numerical version of the same data. The estimate of phi is about 0.86 for both estimators in Table 2.5. For this value of phi the residuals can be said to be neither independent nor cumulative. The estimates of phi in Table 2.5 are somewhat larger than those in Table 2.4 because the mean was assumed to be zero for the estimates in Table 2.5, while in Table 2.4 it was estimated.

Table 2.5 Hard Limit Estimates for Case-0 Heat Conduction Residuals

    Sensor    phi (Yule-Walker)    Number of Humps    phi (Hard-Limit)
    1         0.8602               4                  0.9488
    2         0.9291               6                  0.9096
    3         0.8033               11                 0.7071
    4         0.6748               12                 0.6548
    5         0.9421               8                  0.8005
    6         0.8987               7                  0.8600
    7         0.9425               2                  0.9841
    8         0.9230               3                  0.9689
    Average   0.8717                                  0.8543
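A compact sketch of the hard-limit screening estimator of Equation (2.11) is given below; it clips the residual series to a 0-1 sequence and applies the counting formula for c. This is an illustrative reading of the formula as reconstructed above, not Kedem's published code.

    import numpy as np

    def hard_limit_phi(w):
        """Hard-limit estimate of the ARIMA(1,0,0) coefficient, Equation (2.11)."""
        z = (np.asarray(w, dtype=float) >= 0.0).astype(int)   # hard-limited 0/1 series
        n = len(z)
        # c = [2*sum(z_i*z_{i+1}) - 2*sum(z_i) + z_1 + z_n + (n-1)] / (n-1)
        c = (2.0 * np.sum(z[:-1] * z[1:]) - 2.0 * np.sum(z)
             + z[0] + z[-1] + (n - 1)) / (n - 1)
        return np.sin(np.pi * (c - 0.5))

    # For the illustrative series (0,1,1,1,0,1,1,0) there are two humps, so the
    # first two terms of the numerator sum to -4, as noted in the text above.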
2.3.3 Contemporaneous Correlation Test

The general structure of the contemporaneous correlation matrix A* is defined in Equation (2.10). In order to simplify the identification process several assumptions are made. First, in order to obtain a simple model only the pre-heat-flux residuals are modelled; thus, the thermal model is the constant mean temperature described by Equation (1.15) with the slope equal to zero. Second, because the sensors are identical, a common set of coefficients is used in the contemporaneous correlation model. These common coefficients are in a matrix denoted R that permits the contemporaneous correlation matrix to be written as the Kronecker matrix product

    A* = sigma^2 R_m # I_n.                                               (2.12)

In Equation (2.12) the subscript on the matrix R indicates that there are m sensors, the subscript on the identity matrix I indicates that there are n measurements per sensor, and the special symbol # denotes the Kronecker matrix product, whose properties are discussed by Theil (1971).

Two forms of the R_m matrix will be considered. First, the matrix is assumed to depend on five coefficients and is written as

    R_8 = [ a                                ]
          [ c  a                 symmetric   ]
          [ c  c  a                          ]
          [ c  c  c  a                       ]
          [ d  d  d  d  b                    ]
          [ d  d  d  d  e  b                 ]
          [ d  d  d  d  e  e  b              ]
          [ d  d  d  d  e  e  e  b           ]                            (2.13.A)

For each experimental case, the R_8 matrix represents the correlation between the residuals of different sensors. The entries in the R_8 matrix are ordered so that the first four rows are for thermocouples at one face and the last four are at the other face. Thus the coefficients have the following expected values

    a = E( w_i^(j) w_i^(j) )   for j = 1,2,3,4
    b = E( w_i^(j) w_i^(j) )   for j = 5,6,7,8
    c = E( w_i^(j) w_i^(k) )   for j,k = 1,2,3,4 and j not equal to k     (2.13.B)
    d = E( w_i^(j) w_i^(k) )   for j = 1,2,3,4 and k = 5,6,7,8
    e = E( w_i^(j) w_i^(k) )   for j,k = 5,6,7,8 and j not equal to k

The sample means are used as estimators. Note that, unlike serial correlation, these expected values depend only on the current time index i. The ratios of these average values can be used to define the contemporaneous correlation coefficients, which in turn can be used to assess the importance of contemporaneous correlations. Average values can be found from the values displayed in Table 2.6. These values are a = 0.062, b = 0.064, c = -0.016, d = -0.006, and e = -0.017. Because r_d = d (a b)^-1/2, this yields r_d = -0.095 for the correlation between sensors on opposite surfaces. Because r_c = c a^-1, this yields r_c = -0.25 for sensors located at the insulated surface. Because r_e = e b^-1, this yields r_e = -0.27 for sensors located at the heated surface. If there were no contemporaneous correlation these r values would be zero; if there were significant contemporaneous correlation these r values would have absolute values near unity. Because the r values are relatively small, the contemporaneous correlation is not significant and can be ignored.

Table 2.6 Estimates of the Contemporaneous Correlation Coefficients by Case (estimates of a, b, c, d, e, and f for the ten cases, with their means and variances; the scanned table is not legible).

The estimates of c, d, and e in Equation (2.13.A) may be small because average values are computed and the elements are not necessarily of the same sign. The proper signs can be found by considering the cross product estimator

    S_ij = n^-1 * sum over k of  w_k^(i) w_k^(j)

where S_ij is the element in the i-th row and the j-th column of the matrix S. The matrix S has the same dimension as the matrix R_m. The matrix S is called a biased but good estimator by Theil (1971, p. 321). (A computational sketch of this estimator is given below.) Using this information about the signs, the second form of the R_8 matrix was written as

    R_8 = [ a                                 ]
          [ f   a                 symmetric   ]
          [ -f  -f  a                         ]
          [ f   f   -f  a                     ]
          [ 0   0   0   0   b                 ]
          [ 0   0   0   0   f   b             ]
          [ 0   0   0   0   -f  -f  b         ]
          [ 0   0   0   0   f   f   -f  b     ]

Table 2.6 displays the estimates of the element f for the ten cases.
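The sign-aware cross-product estimator S can be formed directly from the stacked pre-heat-flux residuals. The sketch below, which uses an invented residual array purely to fix the shapes, computes S_ij = n^-1 * sum_k w_k^(i) w_k^(j) and a normalized correlation such as r_d = d (a b)^-1/2; the grouping of sensors 1-4 and 5-8 by surface follows the text above.

    import numpy as np

    def cross_product_matrix(W):
        """W has shape (m, n): one row of n pre-heat-flux residuals per sensor.
        Returns the m-by-m matrix S with S_ij = (1/n) * sum_k W[i, k] * W[j, k]."""
        m, n = W.shape
        return (W @ W.T) / n

    # Illustrative use with m = 8 sensors and n = 10 pre-heat-flux residuals each.
    rng = np.random.default_rng(0)
    W = 0.1 * rng.standard_normal((8, 10))
    S = cross_product_matrix(W)
    a = np.mean(np.diag(S)[:4])            # average variance, first surface (sensors 1-4)
    b = np.mean(np.diag(S)[4:])            # average variance, second surface (sensors 5-8)
    d = np.mean(S[:4, 4:])                 # average cross-surface covariance
    r_d = d / np.sqrt(a * b)               # contemporaneous correlation between surfaces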
Because r_f = f (a b)^-1/2, this yields r_f = 0.486 as an upper bound on the contemporaneous correlation. Since r_f is small, the R_8 matrix in Equation (2.12) is assumed to be the identity matrix of dimension m; i.e.,

    R_m = I_m.                                                            (2.13.C)

2.3.4 Normality Test

Not all digits occur in the less significant places of the temperature measured by a given sensor. Thus the initial residuals do not appear to be from a normal distribution. It is customary for the vector a used in Equation (2.7) and in Equation (2.8) to have a normal distribution.

An examination of the pre-heat-flux residuals, for instance for the case-3 data, indicated a displacement and alignment by thermocouple pairs introduced by computer program COND, and a step-like pattern with occasional flat spots in the residuals. This step-like pattern probably occurs because the ten volt signal was digitized to an accuracy that yielded temperature values of approximately 0.02 degrees Fahrenheit. Thus, for a given sensor the processed data tend to have only either odd or even digits in the hundredths place. This truncation occurred in the analog to digital conversion of the measured temperature; i.e., as the thermocouple milli-voltage is converted to a numerical value by the IBM-1800 computer for storage in the disk memory. Barballa (1970) analyzed the mapping of an analog signal to a binary coded word using a ladder transfer curve. Because there are limited digits available in the ladder transfer curve, truncation occurs.

The residuals were tested for normality by the Kuiper test. The Kuiper test statistics for the sample size of twenty, n = 20, are given in Table 2.7. Using Louter and Koerts (1970), the critical point is 0.359 at the 0.01 level of significance. Thus the hypothesis of normality is rejected for four thermocouples and accepted for four thermocouples. The cases where the hypothesis is rejected seem to have flat spots in the residuals. Hence the non-normality might not exist if the temperatures were read more accurately than 0.02 degrees Fahrenheit. The non-normality will be considered as a calibration problem rather than as a problem of the distribution being non-normal.

Table 2.7 Kuiper Statistics for Pre-Heat-Flux Residuals (one statistic per sensor; the scanned values are not legible).

2.4 Serial ARIMA Models for the Covariance Matrix

In Section 2.3 it was shown that the model for the covariance matrix of the errors, W*, can be approximated by considering only the serial correlations. This section determines the values of the order parameters p, d, and q in the ARIMA(p,d,q) model for the serial correlation. These values are determined by a criterion denoted AIC that has emerged by general consensus of workers in the field as the accepted identification criterion. The AIC criterion has replaced the procedure of examining the pattern of the estimated autocorrelation and partial autocorrelation and the associated Q-statistic used by Box and Jenkins (1970, p. 291).

2.4.1 ARIMA Model Specification by Akaike's Criterion

The criterion AIC was introduced by Akaike (1972) and is commonly referred to as Akaike's Information Criterion. The AIC criterion is based on Kullback discrimination, which is advocated by Beck and Arnold (1977, p. 467). The form of the criterion depends on the number of parameters that are estimated in the ARIMA model.
When the mean is estimated the criterion is written as

    a+ = n ln(sigma^2) + 2 n (n - d)^-1 (p + q + 2)                       (2.14)

where n is the sample size, sigma^2 is the white noise variance, and the number of estimated parameters in the ARIMA(p,d,q) process is p + q + 2. The value two appears in the parameter count because both the mean and the variance are estimated. The criterion a+ was used successfully by Akaike (1972) to identify the order of four time series in the literature that are considered to be test problems, and was also used successfully by Ozaki (1975) for the test series given in Box and Jenkins (1970) as series-A through series-F.

Because the criterion a+ is applied to statistical time series, the correct order is not identified for every realization of a series generated from a known ARIMA(p,d,q) process. Shibata (1976) investigated the random behavior of the values estimated for a+ in finite samples by a Monte Carlo study. The results for a true ARIMA(1,0,0) process, with candidate processes having values of p from zero through ten, were that the order identified most often is the true order. Thus, for the heat conduction data it is reasonable to expect that the correct ARIMA process can be identified by analyzing several realizations but might not be correctly identified from one realization.

Two adjustments were made in Equation (2.14) so that it would perform well for the heat conduction residuals. First, the mean is zero in the ARIMA(p,d,q) model for the heat conduction residuals specified by matrix Equation (2.8); therefore, the quantity p + q + 2 is changed to the quantity p + q + 1. Second, Jones (1975) showed by a Monte Carlo study that unless the estimator of sigma^2 is unbiased, too high an order of the ARIMA(p,0,0) process is selected as best; therefore, an adjustment factor is included since the estimator of sigma^2 in Appendix A is biased. Hence the AIC criterion used in this dissertation is

    a++ = n ln( n c^-1 sigma^2 ) + 2 n (n - d)^-1 (p + q + 1)             (2.15)

    c = n - 1 - p - q

where sigma^2 is the biased estimator computed by the procedures in Appendix A. Appendix A also contains algorithms for computing the coefficients (the phi's and theta's) in the ARIMA models in order to insure that the models ranked using Equation (2.15) are both stationary and invertible. The tests for stationarity and invertibility are based on seldom-used equations developed by Wise (1956). When the candidate process did not have a moving-average component, the coefficients were solved with an algorithm developed by Pagano (1972) that reduces the round-off and truncation error. When the candidate process had a moving-average component, an analytical solution developed by the author as Equations (A.21) and (A.22) was used. An analytical solution is better than a numerical solution procedure because the nonlinear equations can have more than one root, and the roots are not necessarily real valued when the candidate process does not fit the data.

The AIC criterion was used on several groupings of the residuals: by experiment, by surface, and by sensor. Table 2.8 displays estimates of a++ computed to find the best ARIMA model by experiment; no model is consistently best and the values of a++ are approximately the same. Although the a++ criterion does not clearly identify the best model, it does eliminate from consideration most of the forty-five candidate models, the highest order process considered being ARIMA(4,2,2).
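A sketch of the adjusted criterion of Equation (2.15) is given below for ranking candidate orders. The fitting step is left abstract (a hypothetical fit_arima callback returning the biased white-noise variance), since the dissertation computes the coefficients by the Yule-Walker based algorithms of Appendix A rather than by a library routine.

    import math

    def aic_plus_plus(n, p, d, q, sigma2_biased):
        """Adjusted Akaike criterion, Equation (2.15):
        a++ = n*ln(n*sigma^2/c) + 2*n*(p + q + 1)/(n - d), with c = n - 1 - p - q."""
        c = n - 1 - p - q
        return n * math.log(n * sigma2_biased / c) + 2.0 * n * (p + q + 1) / (n - d)

    def rank_candidates(residuals, candidates, fit_arima):
        """candidates: iterable of (p, d, q); fit_arima(residuals, p, d, q) is assumed
        to return the biased estimate of the white-noise variance for that model."""
        n = len(residuals)
        scores = [(aic_plus_plus(n, p, d, q, fit_arima(residuals, p, d, q)), (p, d, q))
                  for (p, d, q) in candidates]
        return sorted(scores)            # smallest a++ first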
The elimination of most of the models from consideration is illustrated by comparing the estimates of a++ in Table 2.8 with the typical value for white noise of 600, and noting that several candidate models had a value of the a++ criterion less than that for white noise. A best model does not emerge from the models displayed in Table 2.9 for the 336 residuals at the insulated surface, nor does one emerge from the models displayed in Table 2.10 for the heated surface. In the data gathered by Farnia (1976) a single predetermined sampling rate is used, and for this single sampling rate the lack of uniqueness of the best model in Tables 2.8 through 2.10 is advantageous, since the most easily computed model among the top six can be used as the correct model. The estimates of the coefficients in the top six models are not equally stable, as shown in Table 2.11. The standard deviations (s.d.) of the coefficients phi and theta are smallest for the ARIMA(1,0,1) and ARIMA(0,1,1) models.

Table 2.8 Top Six ARIMA Models (p,d,q) by Experiment, with the value of a++ for each

    Case 1:  (2,0,2) -2348.13;  (1,0,1) -2233.82;  (1,0,2) -2228.29;  (2,0,0) -2218.82;  (3,1,0) -2188.26;  (0,1,1) -2180.33
    Case 3:  (3,1,1) -2595.86;  (1,1,1) -2534.87;  (0,1,2) -2534.36;  (2,1,0) -2530.40;  (0,1,1) -2528.53;  (3,1,0) -2527.40
    Case 5:  (0,1,1) -2422.12;  (2,1,1) -2418.25;  (1,1,1) -2414.03;  (0,1,2) -2413.95;  (2,1,0) -2412.57;  (3,1,0) -2409.84
    Case 7:  (2,1,1) -2361.11;  (0,1,1) -2290.40;  (1,1,1) -2276.13;  (0,1,2) -2276.02;  (2,1,0) -2276.00;  (1,1,0) -2275.86
    Case 9:  (2,1,1) -2546.83;  (4,1,1) -2531.60;  (2,1,2) -2511.28;  (0,1,1) -2503.65;  (3,1,1) -2475.27;  (1,1,1) -2460.07
    (The entries for Cases 0, 2, 4, 6, and 8 are not legible in the source scan.)

Table 2.9 Top Six ARIMA Models (p,d,q) at the Insulated Surface, with the value of a++ for each

    Case 0:  (0,1,1) -1458.21;  (1,0,1) -1453.33;  (1,0,2) -1443.21;  (2,0,0) -1437.30;  (1,1,1) -1435.36;  (0,1,2) -1434.75
    Case 1:  (2,1,1) -1362.03;  (0,1,1) -1333.33;  (1,1,1) -1305.08;  (0,1,2) -1304.71;  (1,1,2) -1294.40;  (3,1,0) -1285.85
    Case 2:  (1,0,2) -1507.39;  (1,0,1) -1505.95;  (2,0,0) -1450.74;  (1,1,1) -1446.96;  (0,1,2) -1446.29;  (3,1,0) -1430.47
    Case 3:  (0,1,1) -1338.37;  (3,1,0) -1330.85;  (1,1,1) -1326.76;  (0,1,2) -1326.54;  (2,1,0) -1324.61;  (1,1,0) -1317.72
    Case 4:  (1,1,1) -1467.84;  (0,1,2) -1467.76;  (0,1,1) -1460.96;  (2,1,1) -1458.55;  (3,1,0) -1433.80;  (4,1,1) -1429.38
    Case 5:  (1,1,2) -1416.34;  (0,1,2) -1399.50;  (3,1,0) -1372.19;  (2,1,0) -1357.93;  (0,1,1) -1350.26;  (1,0,2) -1339.49
    Case 6:  (1,1,1) -1413.91;  (0,1,2) -1410.75;  (3,1,0) -1396.49;  (0,1,1) -1386.11;  (2,1,0) -1381.93;  (4,1,1) -1366.69
    Case 7:  (1,1,2) -1472.11;  (1,1,1) -1452.45;  (0,1,2) -1439.72;  (3,1,0) -1429.16;  (2,1,0) -1410.68;  (1,0,2) -1399.76
    Case 8:  (1,1,2) -1404.26;  (2,1,2) -1399.99;  (3,1,0) -1390.18;  (2,1,0) -1383.12;  (0,1,1) -1374.64;  (1,1,0) -1350.80
    Case 9:  (0,1,1) -1447.61;  (2,1,1) -1428.80;  (1,1,2) -1405.34;  (1,1,1) -1403.15;  (0,1,2) -1401.98;  (2,1,0) -1400.83

Table 2.10 Top Six ARIMA Models (p,d,q) at the Heated Surface, with the value of a++ for each

    Case 0:  (1,1,1) -1481.96;  (0,1,2) -1469.60;  (1,1,2) -1462.10;  (3,1,0) -1456.11;  (2,1,0) -1461.04;  (2,1,2) -1450.51
    Case 2:  (1,1,1) -1369.05;  (0,1,2) -1367.61;  (2,1,1) -1357.00;  (1,0,1) -1340.66;  (3,1,0) -1338.24;  (2,1,0) -1336.93
    Case 4:  (1,1,1) -1447.07;  (0,1,2) -1440.01;  (1,1,2) -1420.67;  (3,1,0) -1412.25;  (0,1,1) -1407.41;  (2,1,0) -1407.10
    Case 6:  (0,1,1) -981.27;  (1,1,0) -980.45;  (1,1,1) -977.65;  (0,1,2) -977.65;  (2,1,0) -977.65;  (1,1,2) -974.87
    Case 8:  (2,1,2) -1013.41;  (0,1,2) -997.73;  (2,1,0) -997.16;  (1,1,2) -994.38;  (0,1,1) -991.30;  (1,1,0) -991.27
    (The entries for Cases 1, 3, 5, 7, and 9 are not legible in the source scan.)
Table 2.11 Stability of ARIMA Coefficients by Experiment for 0.3 Second Data (estimates of phi, theta, and sigma^2 for Cases 0 through 9, with their averages and standard deviations, for the ARIMA(1,0,1), ARIMA(1,1,1), ARIMA(0,1,2), ARIMA(0,1,1), ARIMA(1,1,0), and ARIMA(1,0,0) models; most of the scanned entries are not legible. For the ARIMA(1,0,1) model the average of phi over the ten cases is 0.981 with a standard deviation of 0.008.)

When the residuals are considered by sensor, the number of trials is increased from ten to eighty but the sample size per trial is reduced to eighty-four. For the ten experiments with eight sensors per experiment, the number of trials is large enough to gain information by tabulating the frequency with which a particular ARIMA model is identified as best. In Table 2.12 the ARIMA(0,1,1) model occurs most frequently and is the best model, although the estimates of the coefficients are somewhat inaccurate because there are only eighty-four residuals per sensor.

Thus, selecting the ARIMA(1,0,1) model as best involved not only the AIC criterion but also the stability of the estimated coefficients and the frequency with which the models are identified as best when there are many time series. Combining the results from Tables 2.8 through 2.12 yields the best model as ARIMA(1,0,1) with the following values for the coefficients: phi = 0.981, theta = 0.429, and sigma^2 = 0.031.

2.4.2 Sampling Rate and ARIMA Model Specification

The order of an ARIMA process varies with the sampling rate. Wei (1977) and MacGregor (1977) showed that an ARIMA(p,d,q) model becomes an ARIMA(p,d,q*) model when only every h-th value of the original ARIMA(p,d,q) process is used. The change in the order of the moving-average component is given by the integer part of a combination of the orders in the original model. This combination is written as

    q* = int( p + d + (q - p - d) h^-1 )                                  (2.16)

If the original process, for example, is ARIMA(1,0,1), then the process obtained by using every h-th point in the series remains ARIMA(1,0,1). It is important to investigate the change in order of the ARIMA(p,d,q) model with sampling rate because sampling rates other than 0.3 seconds between measurements can be used on the IBM-1800 computer and data acquisition system. The best sampling rate depends on the covariance matrix for the physical parameters, which contains the error covariance matrix W*. The matrix W* is determined by the ARIMA(p,d,q) model and its coefficients.

The ARIMA models identified as best at a sampling rate of 0.3 seconds between measurements are shown in Table 2.12 and are based on eighty-four measurements per sensor. The ARIMA models identified as best at a sampling rate of 0.6 seconds between measurements are shown in Table 2.13 and are based on forty-two measurements per sensor. In both cases the residuals are from the standard least squares estimates of the physical parameters k and c. The same time period was analyzed in both studies, with every other residual point used in the 0.6 second study. As expected, when the sampling interval increases from 0.3 seconds to 0.6 seconds, more white noise models are identified by Equation (2.15) as best. In both tables the most frequently selected model is ARIMA(0,1,1), which occurs nineteen times out of eighty cases in Table 2.12 and thirty-one times out of eighty cases in Table 2.13. Because an ARIMA(0,1,1) model remains an ARIMA(0,1,1) model when the sampling rate changes, the ARIMA(0,1,1) model is identified as having the best representation of the data. Because the sample size is small for the analysis in both Table 2.12 and Table 2.13
The best sampling rate depends on the covariance matrix for the physical parameters which contains the error covariance matrix w*. The matrix w* is determined by the ARIMA(p,d,q) model and its coefficients. The ARIMA models identified as best at a sampling rate of 0.3 seconds between measurements are shown in Table 2.12, and are based on eighty-four measurements per sensor. The ARIMA models identified as best at a sampling rate of 0.6 seconds between measurements are shown in Table 2.13, and are based on forty-two measurements per sensor. In both cases the residuals are from the standard least squares estimates of the physical parameters k and c. The same time period was analyzed in both studies with every other residual point used in the 0.6 second study. As expected when the sampling rate increases from 0.3 seconds to 0.6 seconds, more white noise models are identified by Equation (2.15) as being best. In both tables the most frequently selected model is ARIMA(0,1,1), which occurs nineteen times out of eighty cases in Table 2.12 and thirty-one times out of eighty cases in Table 2.13. Because an ARIMA(0,1,1) model remains an ARIMA(0,1,1) model when the sampling rate changes, the ARIMA(0,1,1) model is identified as having the best representation of the data. Because the sample size is small for the analysis in both Table 2.12 80 o o o o o o o o o o o o o _ N Pm m e m o m _ o o N p o N e m. _ oua m N _ oua m N _ onQ a a o c mama ucoumm m.o so; mpmuoz use cum: mpasmm 101 processor unit seconds on the Control Data Corporation 6500 computer system. The Monte Carlo study reported in Table 3.3 required nonlinear parameter estimation because 9 was estimated simultaneously with B using maximum likelihood estimation. The low cost in terms of cen— tral processor time for this study was achieved by the simplicity of Equation (3.17) and the effectiveness of the Newton-Raphson esti- mator. The Newton-Raphson estimator used in the study is u(1+1) = u(1) _ h R 9 where i is the iteration index, u is the vector of parameters, h is the step size, R is a positive definite approximation to the inverse of the Hessian matrix, and g is the gradient vector. The matrix R was obtained by finding the eigenvalues and vectors of the Hessian matrix by the QL method in Martin et al. (1968) and Dubrulle (1970), and a modified R matrix formed by scaled decomposition and Newton- Greenstadt replacement of the eigenvalues as suggested by Bard (1974, p. 307). The QL method solved the eigenproblem in Rosser et a1. (1951) while the Jacobi method in Boothroyd (1968) that was used initially did not. The step size adjustment used was one additional try version-b in Bard (1970). Bard (1974, p. 114) stated a conver- gence criterion as abs(u(1+]) - u(‘)) s y abs(u(‘) + a) (3.18) which insures that the search terminates when each parameter ceases to change value. The convergence constants are Y = 10'5 and a = 10'4. 102 The estimation of the noncentrality parameter associated with Equation (3.17) was considered for the parameter values 8 = 0, p = 0.94, 02 = l, n = 180, and t = 1000 trials. The estimator used is Equation (3.16.A) and the estimate is A = 0.3. Although only 1000 trials were available instead of the 10,000 trials suggested by Dickey and Fuller (1976), empirical percentiles were computed by ordering the estimates of ”i computed by Equation (3.15). These percentiles yielded good agreement with the confidence coefficients when X = 0.985 as shown in Table 3.4. 
Perhaps an estimator of A given in Pandey and Rahman (l97l) valid for small sample sizes should have been used. Table 3.4 Confidence Coefficients for Equation (3.17) Predicted Confidence Confidence Coefficient Empirical . . . Coeff1c1ent at of Interest Percentile Empirical Percentile l-a F(1-a) X = 0.985 X = 0 0.1 0.0329 0.079 0.168 0.5 0.662 0.367 0.578 0.9 4.304 0.868 0.963 0.95 6.879 0.950 0.991 0.975 9.68 0.981 0.998 0.99 14.19 0.995 0.9997 3.4 Aitken's Least Squares Estimator The purpose of this section is to demonstrate that Aitken's least squares can be implemented within a computer code written to compute standard least squares estimates. 103 There are advantages in using an existing standard least squares (denoted SLS) computer code rather than programming a new computer code for Aitken's least squares (denoted ALS). First, an existing SLS called PROPTY was available at Michigan State University that estimates thermal parameters in the heat conduction problem. Second, PROPTY was already programmed to provide information that is useful to understand the parameter estimation problem; namely, at each measurement time it prints the two sensitivity coefficients evaluated at the estimated physical parameters, the running difference between the estimates of the physical parameters based on all the measurements and their value using only data up to the current measurement time, and the value of a design criterion in Equation (4.1). Third, PROPTY uses a solution technique and minimization routine that is effective for the heat conduction problem; e.g., it uses special rules for the Gauss step size. 3.4.1 Adjustment of SLS Computer Code to Obtain ALS Estimates The essential features of computer code PROPTY are the SLS cost function and the Gauss procedure for minimizing the cost function. The SLS cost function for computer program PROPTY is given in Beck (1964) as m n 2 film) = j; 12:3] Aj (T(xj,t1.) - Te(xj,t1.) ) (3.19.11) In Equation (3.19.A), the temperature computed by the Crank-Nicolson finite difference scheme is denoted T(x,t), the measured temperature 104 discussed in Section 2.1.2 is denoted Te(x,t), and the temperatures are both computed and measured at locations Xi and at times t1. The residuals are defined by egj) = T(xj,ti) - Te(xj,ti) (3.20.A) where the vector e is obtained by stacking as was done for Equa- tion (2.8). An abbreviation is introduced for the double summation used in Equation (3.19.A) so that Equation (3.19.A) can be written compactly as f(k,c) = (e,e) (3.19.B) which implies that the product of the two quantities enclosed in the parentheses on the right hand side of Equation (3.19.B) are multiplied at corresponding times and sensors and the sum computed. The Gauss procedure used in PROPTY is well-known. In the Gauss procedure Equation (3.19) is minimized by calculating the thermal parameters by the iterative scheme written as (1+1) (1) -1 ._ k k T(Tk.1k) (1k.1c) (1k.e) = + h (3.21) c c (Tka) (TCJC) (Tc.e) In Equation (3.20) the notation defined in Equation (3.19.B) is used. The superscripts denote the i-th and (i+l)-th iteration. The two partial derivatives of the temperature are called sensitivity coef- ficients. The rules for the step size h are as follows. Rules on the step size h were introduced to improve the convergence rate of Equation (3.21). The step size is unity unless the percentage change 105 in the parameter value is too large. 
The two alternate step sizes when the change in the parameter values in an iteration is too large are: (1) h = 0.6 k (abs dk)^-1 when either k^-1 dk exceeds 1.6 with dk positive or k^-1 dk exceeds 0.6 with dk negative, and (2) h = 0.6 c (abs dc)^-1 when c^-1 dc exceeds 1.6 and dc is positive. The starting values of k and c are computed from a table of temperature versus thermal properties supplied by the user of computer code PROPTY. The convergence criterion is Equation (3.18) with gamma = 0.01 and delta = 0. After the convergence criterion is satisfied one more iteration is performed so that the useful information discussed in Section 3.2 can be printed.

In order to use the ALS procedure two computational procedures are required: (1) a procedure to convert estimation with the SLS cost function given by Equation (3.19.A) to estimation with the ALS cost function given by Equation (1.14.B), and (2) a procedure to estimate the transformation matrix P. Estimation of the P matrix is discussed in Section 3.4.2 for a test problem. The conversion of the SLS code in PROPTY to an ALS code is considered next.

For convenience in notation consider a linear model written as

    y = X beta + P a                                                      (3.22)

where y is the vector of measurements, X is a known matrix, beta is the estimated parameter vector, a is a vector of white noise, and P is a matrix of the form presented in Equation (2.7). Comparing Equation (3.22) with Equation (3.1.A) we make the association

    w = P a.                                                              (3.23)

By definition

    a ~ N(0, sigma^2 I)                                                   (3.24)

which implies

    cov(w) = E( P a a^t P^t ) = sigma^2 P P^t.                            (3.25.A)

By defining a matrix V, Equation (3.25.A) can be written as

    cov(w) = sigma^2 V                                                    (3.25.B)

where

    V = P P^t.                                                            (3.26)

Hence, the covariance matrix of the additive errors can be written as

    W = sigma^2 P P^t.                                                    (3.25.C)

Equation (3.25.C) contains the covariance matrix that should be used in Equation (1.14.B) to yield Aitken's least squares estimates. Hence the quadratic form that should be minimized is

    f_ALS(beta) = sigma^-2 ( y - X beta )^t ( P P^t )^-1 ( y - X beta ).  (3.26.A)

But Equation (3.26.A) can be factored and written as

    f_ALS(beta) = sigma^-2 ( P^-1 (y - X beta) )^t ( P^-1 (y - X beta) ). (3.26.B)

For the linear model the quantity corresponding to the residuals in Equation (3.20) is

    e = y - X b.                                                          (3.20.B)

Hence Equation (3.26.B) can be written as

    f_ALS(beta) = sigma^-2 ( P^-1 e )^t ( P^-1 e ),                       (3.26.C)

while it is clear that for a linear model like Equation (3.22) it is possible to write Equation (3.19.A) as

    f(beta) = A_1 (e, e).                                                 (3.19.C)

Thus, the cost function in computer code PROPTY, given by Equation (3.19.C), can be used to compute the Aitken's least squares cost function given by Equation (3.26.C) by merely replacing the residual vector e in PROPTY by the transformed residual P^-1 e. This transformation was indicated but not stated explicitly in obtaining Equation (1.14.C).

In converting computer code PROPTY into a procedure that obtains Aitken's least squares estimates, the changes in the computer code are made in the Gauss minimization procedure given as Equation (3.21) rather than in the cost function given as Equation (3.19). Using Equation (3.20.B), the transformed residuals can be written as

    P^-1 e = P^-1 y - P^-1 X b.                                           (3.27)

For the linear model in Equation (3.22) the sensitivity coefficient is the X matrix. In Equation (3.27) three quantities are transformed: the residuals (e), the measurements (y), and the sensitivity coefficients (X). However, for the Gauss scheme in Equation (3.21) that is used by computer code PROPTY it is only necessary to adjust two quantities: the residuals (e) and the sensitivity coefficients for each parameter (T_k and T_c).
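The conversion just described amounts to whitening: the residuals and the sensitivity coefficients are premultiplied by P^-1, after which the ordinary (standard least squares) machinery is reused. The sketch below shows this for the linear model of Equation (3.22); the use of a linear solve rather than an explicit inverse is a routine numerical choice here and is not a claim about how PROPTY is coded.

    import numpy as np

    def aitken_via_whitening(y, X, P):
        """Aitken (generalized) least squares for y = X*beta + P*a with a ~ N(0, sigma^2 I).
        Whitens with P^-1 and then applies ordinary least squares, as in Eq. (3.26.C)."""
        y_w = np.linalg.solve(P, y)          # P^-1 y
        X_w = np.linalg.solve(P, X)          # P^-1 X  (transformed sensitivity coefficients)
        beta, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
        e_w = y_w - X_w @ beta               # whitened residuals, P^-1 e
        sigma2 = (e_w @ e_w) / (len(y) - X.shape[1])
        return beta, sigma2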
However, for the Gauss scheme in Equation (3.21) that is used by computer code PROPTY it is only necessary to adjust two quantities: the residuals (e) and the sensitivity coefficients for each parameter (Tk and Tc)‘ 108 3.4.2 Demonstration of ALS Estimation for ARIMA(1,0,0) Errors For real problems the error covariance matrix is unknown; thus, a procedure is needed for estimating the P matrix used in Section 3.4.1. This section considers Aitken's estimation with a particu- lar P matrix. The P matrix for ARIMA(1,0,0) errors can be written as P“ = I - p K (3.28) where the non-zero entries of the matrix K satisfy U013. =1 1f 1=j+19 (3.29) where 9 is the autoregressive coefficient, and I is the identity matrix. For this P matrix the Yule-Walker estimator given as Equation (A.23) can be written as ( (K e)t K e)-] et K e (3.30.A) 02 (m n)'] (P'1 e)t P'1 e. (3.30.B) 0 Because the matrix P depends on the coefficient 0, in each itera- tion of the ALS procedure the estimate of P is updated. The feasibility of converting computer code PROPTY to a code yielding ALS estimates was investigated for ARIMA(1,0,0) errors. A subroutine was written to estimate the coefficients in the ARIMA (1,0,0) model using Equations (3.30). Within the subroutine a con- vergence criterion based on Equation (3.18) was used so that overall convergence in PROPTY can only occur after the estimates of the ARIMA(1,0,0) coefficients 0 and 02 have converged. Call statements were inserted within PROPTY so that this subroutine was called 109 immediately after either the residuals (e) or the sensitivity coeffi- cients (Tk) and (Tc) are computed. The subroutine resets these quan- tities according to Equation (3.27) and returns the reset value to PROPTY; the subroutine also uses the residuals before they are reset to estimate the ARIMA(1,0,0) coefficients using Equation (3.30). The estimates obtained by the above scheme for case-0 data are shown in Table 3.5. In three iterations the thermal parameters k and c satisfied the convergence criterion set at y = 0.01 in Equa- tion (3.18). However, there were additional iterations because the parameter 0 has not converged. The convergence criterion for D was set at Y = 0.0001 in Equation (3.18) and this relatively tight cri- terion was used in order to examine the parameter estimation routine. Clearly, a larger value of Y can be used for D in the estimation routine. The convergence shown in Table 3.5 clearly demonstrated the feasibility of an ALS estimation procedure. Table 3.5 Convergence of Aitken's Estimator for Case-0 Data Iteration E] _] E3 _1 9 a: Btu hr ft F Btu ft F F 0 42.00000 55.00000 0.0000 N.A. l 43.31882 55.65817 0.9233 0.01962 2 42.97964 55.73655 0.9057 0.01958 3 43.05785 55.79758 0.9061 0.01959 4 43.05245 55.84707 0.9058 0.01959 5 43.05034 55.88694 0.9059 0.01959 6 43.04751 55.91910 0.9060 0.01955 SLS 43.35722 55.66701 0.8982 0.02022 110 The last row of Table 3.5 gives the standard least squares estimates for this same case-0 data. The parameter values are in close agreement: the difference in estimates divided by the ALS estimates are k'] 6k = 0.0072 and c'] 6c = 0.0045. Also the estimates for the ARIMA(1,0,0) coefficient are given. These were calculated from the SLS residuals and also are in close agreement with 0-] 60 = 0.0086. 3.4.3 Aitken's Least Sguares with ARIMA Errors The estimation procedure demonstrated in Section 3.4.2 for an ARIMA(1,0,0) model can be generalized to estimation with an ARIMA (p,d,q) model. 
The basic requirement is to define the P matrix in the representation of the errors in Equation (2.7); i.e., w = P 8. Basically the same equations for this representation were obtained by Beck and Arnold (1974) and Pagan (1974). The equations in Pagan (1974) were attributed to an unpublished paper in 1966 by Phillips. The general form of the transformation matrix P will be con- sidered for an ARIMA(s,0,0) and ARIMA(0,0,s) process in order to show the symmetry between the autoregressive and the moving-average pro- cess. Also by considering both models together the space required to present the notation is also shortened. After the notation is defined the transformation matrix P for the ARIMA(p,0,q) model will be stated. Consider the ARIMA(s,0,0) process which can be written as w. = y] w. + .+...+ .+. 1-1 Y2 w1-2 Ys w a and also consider the ARIMA(0,0,s) process that can be written as 111 Wo=ao-Y]ai_-l-... 1 1 ' Ys ai-s ' For a sample of size n, the following vectors are defined as t w (wi.w2....,wn). a = (3]: 62, . . - a a that have the initial values -* 'k * *t ) w = (wo, w], . . . , wS , and *- * 'k *t a - (a0, a], . . . , as) where in the initial values * w_] w] for 0 S i, and IA * o a_] a1 for 0 1. In the matrix notation the ARIMA(s,0,0) process can be written _'I 15* 0 w - D w = a, and for the ARIMA(0,0,s) process the model can be written -1 * * w = D a - D a . In both equations the matrix D'] is an n by n lower band matrix written as 1 .Y1 'Y2 “Y1 1 o" = , 'Ys 'Ys-l . . . 1 0 'Ys . . . 1 J__p ’YS 'Y] 1— 112 * and D is an n by 5 matrix containing a band triangular structure written as *1 12 Y3 Y5 Y2 Y3 Y5 0 Y3 * D = Ys 0 0 . . . 0 L— —J In indicial notation the n by n matrix D'] has non-zero elements given by -1 = _ < . _ . < (0 )ij Yi-j for 0 - 1 j - s, where Y0 = -19 * while the n by 5 matrix D has non-zero elements given by = o o- < (0 )-- Yi+j-l for j + 1 l - s. The notation for the D and 0* matrices displays the symmetry between the ARIMA(s,0,0) and ARIMA(0,0,s) models. A notation where the difference between the ARIMA(s,0,0) model and the ARIMA(0,0,s) models is as follows. For the ARIMA(p,0,0) process replace y by D, D by L, and 0* by L* so that the process is written as -1 ** L w = L w + a. 113 * For the ARIMA(0,0,q) process replace y by e, D by M, and 0* by M so that the process is written as -1 * 'k w = M a - M a . It is easy to show that the ARIMA(p,0,q) process is written as -1 * - * L w - L* w = M 1 a - M* a (3.31) Equation (3.31) is the matrix definition of the ARIMA(p,0,q) pro- cess. Two forms of Equation (3.31) will be considered. The remainder of this section will discuss the case when both w* and a* are set to zero. Section 3.4.4 will discuss the form when both w* and a* are estimated. * * When w and a are zero, Equation (3.31) can be written as L" w = u" a (3.32) Equation (3.32) can be rearranged and written as w = L M" a (3.33) By introducing the definition P = L M“1 (3.34) Equation (3.33) has the form of Equation (3.23) as required for adjustment of PROPTY to obtain ALS estimates as was shown in Section 3.4.1. Equation (3.33) was stated in both Pagan (1974) and Beck and Arnold (1974). In order to obtain a numerical algorithm, recall that both transformed residuals and sensitivity coefficients are used in the ALS computer code. Consider the residuals being transformed into a vector denoted 2 according to the relation 114 Z = P e (3.35.A) Using Equation (3.34), Equation (3.35.A) can be written as 2 = M L’1 e. 
For evaluation by a computer it was convenient to convert Equation (3.35) to a two-stage recursive relation. This relation can be written as

    (M^-1 Z)_i = e_i - phi_1 e_{i-1} - . . . - phi_p e_{i-p}              (3.36.A)

    Z_i = (M^-1 Z)_i + theta_1 Z_{i-1} + . . . + theta_q Z_{i-q}.         (3.36.B)

Equation (3.36) clearly reduces to the transformation used in Equation (1.14.C) for ARIMA(p,0,0) errors when M becomes I and all the theta's are zero.

3.4.4 Approximating Maximum Likelihood by Aitken's Estimator

It is more difficult to implement the maximum likelihood estimator, for which w* and a* are estimated rather than being set to zero. Newbold (1974) presented a clear computational procedure for the ARIMA(1,0,1) process and maximum likelihood estimation. The positive log-likelihood function can be written as

    F(rho, theta, sigma^2, w_0, a_0) = (1/2) n ln(sigma^2) + (1/2) ln det(I + B^t B) + epsilon

where epsilon is the quadratic form epsilon = (1/2) sigma^-2 ( a^t a + delta ). The initial value of the w_i-series is w_0 and the initial value of the a_i-series is a_0. The matrix B is an n by 2 matrix and the vector a has n components; both are built from rho, theta, and the transformed series Z_i, with a_i equal to Z_i plus a term in theta^(i-1) arising from the starting values. (The detailed expressions for the elements of B, for a_i, and for the scalar delta arising from the initial conditions are not fully legible in the source scan; they are given by Newbold (1974).) The series Z_i is computed from a two-stage recursive relation of the form of Equations (3.36.A) and (3.36.B). For the ARIMA(1,0,1) process this relation can be written as

    (M^-1 Z)_i = w_i - rho w_{i-1}   for 2 <= i                           (3.37.A)

    Z_i = (M^-1 Z)_i + theta Z_{i-1}                                      (3.37.B)

with the appropriate initial values at i = 1.

The values of a_0 and w_0 are estimated in terms of a two-component vector c that can be written as

    c = ( I + B^t B )^-1 B^t M L^-1 w                                     (3.38)

with the estimates of a_0 and w_0 written as linear combinations of the components c_1 and c_2 in Equations (3.39.A) and (3.39.B). This model reduces correctly, since a_0 and w_0 equal to zero implies that both delta and c are zero.

Equation (3.39) was computed for the case-0 data in order to gain insight into the magnitudes of a_0 and w_0. The vector c in Equation (3.38) used w from the vector of standard least squares residuals, and used matrices B, M, and L evaluated at the parameter estimates obtained in Table 2.11. The use of the standard least squares residuals for the w vector is a reasonable first approximation. The resulting estimates for the eight sensors in the case-0 heat conduction data are shown in Table 3.6. The initial values a_0 and w_0 can be set to zero, because each w_0 is close to the first value of w_i for the corresponding sensor plotted in Figures 2.1 and 2.11, and because the value of a_0 is small with respect to the estimate of 0.13 given in Table 2.13. Hence Aitken's least squares can be used instead of maximum likelihood.

Table 3.6 Estimates of ARIMA(1,0,1) Starting Values

    Sensor   w_0        a_0
    1        -0.0032    -0.000291
    2        -0.3996    -0.036076
    3        -0.1319    -0.011910
    4        -0.0029    -0.000266
    5         0.1329    -0.000291
    6         0.0644     0.005811
    7         0.2163     0.019524
    8         0.4605     0.041574

3.5 Aitken's Least Squares Simulation Study

The purpose of this section is to display and interpret the results of a twenty-trial Monte Carlo study made by the author on Aitken's estimates for simulated heat conduction data. This study investigates the first and fifth aspects of the correlated error problem stated in Section 1.5; namely, estimation of the ARIMA coefficients and the modeling of the probability distribution of the confidence ellipse.

With good starting values for k and c, computer program PROPTY converges in usually two iterations and at most three iterations.
The central processor time per iteration is quite large because the heat conduction equation is solved numerically. A close approximation to the central processor time in seconds as a function of the number of iterations I is

    t = 13.9 + 6.3 I.

The minimum central processor time, for one iteration, is twenty seconds per trial instead of the 0.18 seconds per trial required in the simple linear study reported in Section 3.3.2. Hence it is impractical to conduct a detailed study of the statistical properties, because there is a one hundred-fold increase in the computer time.

The simulation study that was conducted is as follows. The simulation of the experimental data is discussed in Appendix C. The conditions used in the study approximate the conditions estimated for the case-0 data in Table 2.1. The conditions are: k = 43 Btu hr^-1 ft^-1 F^-1, c = 55 Btu ft^-3 F^-1, rho = 0.98, theta = 0.45, sigma^2 = 0.016 F^2, n = 89, m = 8, q = 2.67 Btu ft^-2 sec^-1, and t_E - t_B = 15.3 seconds. (An illustrative simulation sketch is given at the end of this section.)

Although only twenty trials were used, approximately 727 central processor seconds were required to estimate the parameters on the Control Data Corporation 6500 computer system. A Monte Carlo study with ten times as many trials would have been prohibitively expensive. Yet the state of the art is such that Chambers and Ertel (1975) used one hundred trials and Pfeiffer and Lichtenwalner (1974) used four hundred trials. Hence the essence of the Monte Carlo problem is to determine whether small-trial procedures can yield information on the estimates of the coefficients and on the probability distribution for the fractile that determines the confidence ellipse.

3.5.1 Estimates of Thermal and ARIMA Parameters

The quantities computed for Table 3.7 are straightforward. In Table 3.7 the averages ...

(Table 3.7, which reports these averages together with a Kuiper test of the normality of the normalized estimates at the 0.01 level of significance, is not legible in the source scan.)
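To indicate how simulated data of the kind described above can be produced, the sketch below generates ARIMA(1,0,1) measurement errors with the study's coefficients (rho = 0.98, theta = 0.45, sigma^2 = 0.016 F^2) for m = 8 sensors and n = 89 times, starting from zero initial values as concluded in Section 3.4.4. The exact-temperature routine is a hypothetical placeholder, since the dissertation's simulation (Appendix C) uses the finite-difference heat conduction model.

    import numpy as np

    def simulate_arma11_errors(n, m, rho=0.98, theta=0.45, sigma2=0.016, seed=None):
        """ARIMA(1,0,1) errors w_i = rho*w_{i-1} + a_i - theta*a_{i-1}, one series per sensor."""
        rng = np.random.default_rng(seed)
        a = np.sqrt(sigma2) * rng.standard_normal((m, n))
        w = np.zeros((m, n))
        for i in range(1, n):
            w[:, i] = rho * w[:, i - 1] + a[:, i] - theta * a[:, i - 1]
        return w

    def simulated_measurements(exact_temperature, n=89, m=8, seed=0):
        """exact_temperature(j, i) is a hypothetical placeholder for the computed
        temperature at sensor j and time i; the ARIMA errors are added to it."""
        T = np.array([[exact_temperature(j, i) for i in range(n)] for j in range(m)])
        return T + simulate_arma11_errors(n, m, seed=seed)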