Michigan State University

This is to certify that the dissertation entitled

Estimating the Covariance Components of an Unbalanced Multivariate Latent Random Model Via the EM Algorithm

presented by Leonard Joseph Bianchi has been accepted towards fulfillment of the requirements for the Ph.D. degree in Counseling, Educational Psychology & Special Education (Statistics & Research Design).

Major professor
Date: May 16, 1987

ESTIMATING THE COVARIANCE COMPONENTS OF AN UNBALANCED MULTIVARIATE LATENT RANDOM MODEL VIA THE EM ALGORITHM

by Leonard Joseph Bianchi

A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology and Special Education

1987

ABSTRACT

Although statistical procedures are available for estimating treatment effects for students taught in classrooms, these procedures are applicable only when every class has the same number of students. The present study investigated a procedure that was originally established to handle missing data (the EM algorithm) but which also provides a solution to the problem of estimating parameters in multivariate analysis when samples contain unequal group sizes. The focus of the present dissertation was on the estimation of latent group and individual level variances and covariances, with measurement error removed, when group sizes varied in a sample.
Previous methods could only find maximum likelihood estimates for this problem if the data set contained groups of equal size. The EM algorithm offers a method for finding maximum likelihood estimates of parameters in situations where classical maximum likelihood procedures fail. Estimates from balanced and unbalanced samples were both studied while varying two factors, namely the number of groups in the sample (the size) and the particular model being estimated (that is, the unrestricted model, the correctly specified model and the incorrectly specified model). Only a small number of replications were used in this demonstration of the algorithm under different circumstances. Tests of the model based on the criteria of convergence showed this estimation procedure to be a satisfactory and effective method in theory. However, problems in the use of this algorithm appeared in the form of the large number of iterations needed for convergence and the lack of a universally accepted criterion for convergence.

This dissertation is dedicated to Robert Garden, Chris Mooney and the small group of IEA workers in New Zealand from 1981 to 1983 from whom I learned so much.

ACKNOWLEDGMENTS

I can never fully express the appreciation and thanks I have for Dr. William Schmidt who, as both a friend and my chairman, never gave up on me. His support was invaluable. To Dr. Richard Houang I owe much for both his time and effort. He forced me to think, a pesky thing to do, and attempted to open my eyes and mind to look beyond the obvious meaning of the research. The other two members of my committee also have my thanks. Dr. Stephen Raudenbush gave freely of both his time and knowledge while Dr. Robert Floden contributed suggestions and ideas. A very special thanks is due to JoAnn Green who, as a friend and university employee, helped steer me through the sea of regulations and red tape which, by necessity, exist in a university.
Without her kindness and help, I know all would have been lost. Thanks are also given to Janice Benjamin for the use of her equipment and to Laurence Bates who helped me with the logistics which must be faced by every doctoral student. My gratitude goes to Dr. Arthur Tabachneck for his long hours spent in an attempt to teach me the lost art of technical writing. His kindness is more than appreciated. My special appreciation goes to Peri who put up with my tirades, depressions and periods of paranoia with patience and care. Lastly my appreciation goes to my parents who never faltered in their belief in me, who helped me out during the tough times and who never once asked "When are you going to graduate and get a job?". (Well, hardly ever......)

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter
I. INTRODUCTION
II. LATENT COVARIANCE STRUCTURAL MODELS
    1. Single Level Covariance Structural Model
VIII. DISCUSSION
    1. Summary and Conclusions
    2. Future Explorations

APPENDICES
A. Equations for the Estimation of the Covariance Components of the Two-Parameter Model Using the EM Algorithm
B. Equations for the Estimation of the Covariance Components of the Four-Parameter Model Using the EM Algorithm
C. Computer Program for the Estimation of the Three Parameter Latent Model through the SAS Package
D. Computer Program for the Estimation of the Four Parameter Latent Model through the SAS Package

BIBLIOGRAPHY

LIST OF TABLES

Table
1. Parameter Values Used in Data Generation
2. Summary Statistics of the Latent Models for Balanced and Unbalanced Data Sets with 100 Groups
3. Summary Statistics of the Latent Models for Balanced and Unbalanced Data Sets with 25 Groups
4. Summary Statistics of the Unrestricted Model for Balanced and Unbalanced Data Sets for both 25 and 100 Groups
5. Average B/E of the Latent Models for Balanced and Unbalanced Data Sets for both 25 and 100 Groups
6. Average Ratio(1) of the Ph, Om and Ps Matrices for the Latent Models in Data Sets for both 25 and 100 Groups
7. Average Ratio(2) of the Ph, Om and Ps Matrices for Balanced and Unbalanced Data Sets for both 25 and 100 Groups
8. Maximum Likelihood Ratio Test of the Fit of the Correct and Incorrect Models
9. Average B/E of the Unrestricted Model for Balanced and Unbalanced Data Sets for both 25 and 100 Groups
10. Average Ratio(1) of the Phi and Psi Matrices of the Unrestricted Model for Balanced and Unbalanced Data Sets for both 25 and 100 Groups
11. Iterations Required by Algorithm for Convergence
12. Summary Statistics of the Four Parameter Latent Model for Balanced and Unbalanced Data Sets with 100 Groups

LIST OF FIGURES

Figure
1. Design of Study

CHAPTER I: INTRODUCTION

Although statistical procedures are available for estimating treatment effects for students taught in classroom groups, these procedures are only applicable when every class has the same number of students.
Since equal class size is rare in schools, however, it is important to develop practical methods that extend current procedures to cover all patterns of class size. The present study investigates a procedure that was originally established to handle missing data (the EM algorithm), but which also provides a solution to the problem of unequal sample sizes in multivariate analyses. After presenting a review of existing procedures, this dissertation will: (i) show how the EM algorithm can be applied to this case; (ii) exhibit a computer program that uses this procedure to analyze such data; and (iii) illustrate the procedure and program with an analysis that estimates parameters from a sample data set generated from a known distribution.

Many educational researchers have attempted to identify the various factors which affect student achievement. Laboratory studies have identified how individuals respond to different educational treatments, but most formal education occurs in classroom settings in which students receive treatments as a group. Two effects are introduced in the latter situation, however, which cannot be overlooked. First, there may be some process which affects the class as a whole. A teacher with a class having a mean IQ of 120 may decide to cover more material than one with a class having a mean of 100. Such an effect, unless accounted for, could obscure the relationship being studied. Second, similarly, there may be an interactive effect between individuals and the class. A person with an IQ of 110 may do much differently in a class with a mean of 100 than in one with a mean of 120. That is, class level processes can affect between class analyses and can influence within class analyses, and individuals can have an effect on both. The analysis of such data must be interpreted carefully.
Estimates of relationships between variables at the class and individual levels can be either too low or too high, as the two effects can either combine to indicate a spuriously high relationship or work against each other to reduce it.

A number of models and strategies have recently been developed to analyze this type of data. One line of work developed regression models to study the individual and group effects. Others developed models to estimate underlying latent variances at each level.

Keesling and Wiley (1974) used the relationship between two student level variables to adjust class level scores. The aggregated values of the variables were used with the estimated student level regression coefficients to compute an expected group score. This score was then subtracted from the aggregated class scores in order to obtain residual scores which, in turn, were then used within regular linear regression models with class level variables. Keesling and Wiley's model was based on the assumption that the relationship between two variables would be identical for all levels of the model.

Cronbach (1976) proposed an analytic approach that focused on processes going on both between and within groups. He felt regression effects were composed of two components, a between and a within effect. His model allowed for the relationship between two variables to vary between the individual and classroom levels.

Burstein, Linn and Capell (1980) developed a model allowing regression coefficients between two variables to change from class to class. These coefficients were then used as dependent variables in regression analyses at the group level. Raudenbush (1986) applied empirical Bayes theory to develop a procedure for producing Maximum Likelihood (ML) estimates of the regression coefficients in Burstein's model.

None of the univariate regression models, however, offered methods for estimating measurement error.
Tests and other instruments used in education, virtually by definition, contain measurement errors which can inflate the error in the analysis model.

Schmidt (1971) addressed this problem by developing a multivariate structural model. By fitting an a priori structure to the covariance matrix of the students' tests, the variance and covariance of the latent dimensions and measurement errors could be estimated for both levels. This was especially significant as both the variance and covariance of latent dimensions are frequently the items of importance to researchers.

An example of the applicability of this notion was evidenced in the International Association for the Study of Educational Achievement's (IEA) "Second International Mathematics Study", where items within academic tests were systematically constructed from a number of dimensions. In one case, one dimension was word problems vs. numerical examples while another was arithmetic vs. algebra. All items containing the two dimensions, in turn, could be combined to make four subtests. The subtests, theoretically, were assumed to have served as congeneric tests (see Chapter 8, Lord and Novick, 1976). The four subtests contained the following dimensions:

Subscore      Type of Problem    Type of Mathematics
Subscore A    Word Problems      Arithmetic
Subscore B    Word Problems      Algebra
Subscore C    Numerical          Arithmetic
Subscore D    Numerical          Algebra

Of interest to the IEA study was not the four subscores but the variance and covariance of the two underlying dimensions. Schmidt's model could have been used to provide estimates of the latent variance and covariance at both the class and individual levels. These covariance matrices can then serve as the basis for numerous models.
Wisenbaker (1981), following the same logic, further developed the estimation procedures necessary to estimate parameters of a causal model for latent covariance structures. The structural parameters of the between and within levels, according to Wisenbaker's model, are simultaneously estimated, yielding ML estimators.

Schmidt's and Wisenbaker's models, however, both require groups (classes) to be of equal size. The underlying multivariate normal distribution upon which the ML equations are based is defined on a vector of length np, where n is the number of students in each class and p is the number of observed measures (tests) taken by each student. The ML procedure requires that each group contain the same number of subjects, n. In educational research, however, the number of subjects usually varies from classroom to classroom. The present dissertation concentrates on the problem of estimating the latent between and within covariance matrices when the number of subjects (students) varies between groups (classes).

The early chapters contain information on the background of the problem. Chapter Two, Latent Structural Models, describes the development and background of those models. Chapter Three describes the background of the specific model used in this study and the development of different techniques for estimating variance components in the unbalanced random model. Chapter Four contains a statement of the problem. The last chapters contain the derivation of the procedure and an example of its use. Chapter Five contains the derivation of the equations needed in the estimation procedure. Chapter Six details the design of a Monte Carlo study for illustrating the use of the EM algorithm under the current model. The results of the study are described in Chapter Seven. Chapter Eight, finally, presents a discussion of the results and conclusions.

CHAPTER II: LATENT COVARIANCE STRUCTURE MODELS

1. Single Level Covariance Structure Model

Latent covariance structure analysis was developed along two different lines of inquiry. The first approach, factor analysis, was derived explicitly for the purpose of finding latent structures (Spearman, 1904). The second line of inquiry applied the existing random analysis of variance model toward solving the same problem (e.g., Bock, 1960).

Factor analysis was developed by Spearman as a method for confirming his theory on ability. Spearman sought to show that IQ tests measured two components, a "general or G factor" common to all IQ tests and a second factor specific only to the test. Through the application of factor analysis, he was able to isolate the variance component of each test attributable to the G factor, as well as the variance component specific to the individual tests. As the mathematics for factor analysis were expanded and refined, however, its use changed from confirmatory to exploratory and it became a method for reducing a set of items or measures to a smaller number of underlying latent dimensions. These dimensions, in turn, were used to form factor scores for discriminating between subjects. These later exploratory methods of factor analysis lacked a firm theoretical basis and were simply algebraic manipulations of the data.

A confirmatory approach to factor analysis did not resurface until the 1950's, when such an approach was considered by Howe (1955), Anderson and Rubin (1956), and Lawley (1958). The advantage of confirmatory approaches lies in their use of statistical theory and their ability to test the fit of the latent models, but early efforts went largely ignored because of computational difficulties. Interest gradually rose after Joreskog (1966) developed more efficient estimation procedures.

Joreskog (1969) developed a general approach to confirmatory maximum likelihood factor analysis.
Unlike the prior models, this model gave researchers the flexibility to select different structures from among the possible solutions: orthogonal, oblique and various mixtures of the two. The factor analysis model is based on the fundamental equation

(2.1a)  $y = \mu + \Lambda x + z$

where y is a p x 1 vector of observed variables, $\mu$ is a vector of grand means, $\Lambda$ is a p x q matrix connecting the p observed values to the q latent factors (with $q \le p$), x is a q x 1 vector of the latent factors, and z is a p x 1 vector of the error or unique parts of the tests. It is assumed that $E(x) = E(z) = 0$, $E(xx') = \Phi$, $E(zz') = \Psi$, and $E[(y-\mu)(y-\mu)'] = \Sigma_y$. The dispersion matrix for y is

$\Sigma_y = \Lambda \Phi \Lambda' + \Psi$

Assuming y has a multivariate normal distribution, the likelihood function is

(2.3a)  $L = \prod_{i=1}^{n} (2\pi)^{-p/2} |\Sigma_y|^{-1/2} \exp\{ -\tfrac{1}{2} (y_i - \mu)' \Sigma_y^{-1} (y_i - \mu) \}$

The effective part of log(L) is

(2.4a)  $\log(L) = -\tfrac{1}{2} n \{ \log|\Sigma| + \mathrm{tr}(S\Sigma^{-1}) \}$

Minimizing the function

(2.5a)  $F(\Lambda, \Phi, \Psi) = \log|\Sigma| + \mathrm{tr}(S\Sigma^{-1}) - \log|S| - p$

yields the maximum likelihood estimates, and the minimized value provides the likelihood ratio test statistic of goodness of fit.

A second approach was developed through the use of the random analysis of variance model. Burt (1947) was the first to point out the analogy between the analysis of variance (ANOVA) and factor analysis. This was further elaborated by Creasy (1957). Bock (1960) showed that a formal relationship exists between the two approaches. This relationship only becomes clear if a distinction is made between factor analysis used as a "structural" versus a "discriminal" analysis. According to Bock (1960, p. 153):

"By 'structural' analysis is meant a measure which attempts to make causal statements about test performances by assigning to definite sources the covariation which arises between certain psychological tests: this was the original use of factor analysis.
In its subsequent application to the construction of test batteries, factor analysis was also used to assess whether tests of known measurement error yield reliable distinctions between individuals, and, if so, in how many dimensions: it seems appropriate to designate this 'discriminal' analysis."

Factor analysis does not separate these two uses or give clear answers for either. Bock showed that a Model II (Random) ANOVA model can be applied to tests in light of specific hypotheses about their composition and after suitably adjusting their psychometric characteristics. The analysis could be used to study structural and discriminal properties of the tests, free of difficult statistical and interpretation problems. The purposes of this dissertation deal only with the structural analysis and shall concentrate on it, using an example to facilitate the discussion.

Consider the design of four tests from two dichotomous dimensions, as referred to in Chapter 1, namely Type of Problem ((1) Word Problems vs (2) Numerical) and Type of Mathematics ((1) Arithmetic vs (2) Algebra). In our example, the four tests may be identified by the following ordered pairs:

                B(k)
                1     2
A(j)    1      11    12
        2      21    22

Test(jk)    Type of Problem    Type of Mathematics
Test 11     Word Problems      Arithmetic
Test 12     Word Problems      Algebra
Test 21     Numerical          Arithmetic
Test 22     Numerical          Algebra

A model for the structural analysis of this design is

(2.6a)  $x_{ijkt} = \alpha_i + \beta_{ij} + \gamma_{ik} + \delta_{ijk} + \epsilon_{ijkt}$

where $x_{ijkt}$ is the score of individual i on test jk on occasion t, $\alpha_i$ is a component of score specific to individual i on all tests, $\beta_{ij}$ and $\gamma_{ik}$ are components of score specific to individual i on dimensions B and C respectively, $\delta_{ijk}$ is a component of score specific to individual i and the test jk (with the dimension effects excluded) and $\epsilon_{ijkt}$ is a replication error specific to individual, test, and occasion. These components are considered random effects and are assumed normal and independent:

$\alpha \sim N(0, \sigma_\alpha^2)$;  $\beta \sim N(0, \sigma_\beta^2)$;  $\gamma \sim N(0, \sigma_\gamma^2)$;  $\delta \sim N(0, \sigma_\delta^2)$;  $\epsilon \sim N(0, \sigma_\epsilon^2)$

Because the number of components with a distribution over individuals is equal to the number of tests in the dichotomous factorial design, the covariance structure may be fully estimated. This will not be true when there are more than 2 levels to a dimension. The design for our example can be represented in matrix form as the Hadamard design matrix

(2.7a)  $P = \tfrac{1}{2} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$

The purpose of the structural analysis is to test whether the sample covariances between tests fit the model. The covariance matrix of the data from our example is

(2.8a)  $S_r = (X'X - M)/(n-1)$

where X is the N x 4 matrix of means of r replicate scores and M is the matrix of corrections to the sample data. Pre and post multiplying the population covariance matrix by P will reduce it to its canonical or diagonal form ($P \Sigma P'$). If the sample covariance matrix is treated in this way, the off diagonal elements will not necessarily be equal to zero, but if the model fits, any non zero value can be attributed simply to sampling variance. Therefore a statistical test of the off diagonals being equal to zero will be a test of the fit of the model. A maximum likelihood ratio test given by Wilks' criterion and a chi-squared approximation provided for moderate to large samples by Bartlett can be used to test that hypothesis (Anderson, 1984). This is

(2.9a)  $\chi^2 = -\left( N - (2p + 11)/6 \right) \log|R_r|$

where p is the number of variates, $|R_r|$ is the determinant of $A R A'$ and R is the correlation matrix corresponding to S.
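The diagonalizing property of the Hadamard design matrix can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the tests are ordered 11, 12, 21, 22, builds the covariance matrix implied by the random model (2.6a) from hypothetical variance components, and applies P on both sides. The function names, the variance values and the use of the correlation matrix of the transformed covariance in the Bartlett statistic are illustrative assumptions.

```python
import numpy as np

def design_covariance(var_alpha, var_beta, var_gamma, var_resid):
    """Covariance of tests 11, 12, 21, 22 implied by model (2.6a).

    var_resid stands for the test-specific variance plus replication
    error (an assumption made to keep the sketch short).
    """
    all_tests = np.ones((4, 4))                     # alpha is shared by every test
    same_j = np.kron(np.eye(2), np.ones((2, 2)))    # tests sharing dimension B level
    same_k = np.kron(np.ones((2, 2)), np.eye(2))    # tests sharing dimension C level
    return (var_alpha * all_tests + var_beta * same_j
            + var_gamma * same_k + var_resid * np.eye(4))

def bartlett_statistic(cov, n_subjects):
    """Chi-squared approximation of (2.9a) for a covariance matrix."""
    scale = np.sqrt(np.diag(cov))
    corr = cov / np.outer(scale, scale)
    p = cov.shape[0]
    _, logdet = np.linalg.slogdet(corr)
    return -(n_subjects - (2 * p + 11) / 6) * logdet

# Hadamard design matrix of (2.7a)
P = 0.5 * np.array([[1, 1, 1, 1],
                    [1, 1, -1, -1],
                    [1, -1, 1, -1],
                    [1, -1, -1, 1]])

# Hypothetical variance components, chosen only for illustration
sigma = design_covariance(4.0, 2.0, 1.0, 0.5)
canonical = P @ sigma @ P.T
chi2_population = bartlett_statistic(canonical, n_subjects=100)
```

With these values the transformed matrix is diagonal, so the statistic computed from the population matrix is zero; with a sample covariance matrix the off diagonal elements would be nonzero and the statistic positive.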
Bock's work on reformulating factor analysis in the form of a random model has spurred the development of more complicated models and situations. Bock and Bargmann (1966) presented a method for analyzing a sample covariance matrix in order to assess the latent sources of variance and covariance within multivariate normal data. This "structural" analysis of the sample covariance matrix has a two fold purpose. The first purpose is to statistically test the feasibility of a hypothesized model and the second is to provide estimates of variance components associated with the latent variables of this model. This analysis is an alternative to the Type III (Mixed) ANOVA Model. The method of maximum likelihood estimation is used to test the model and estimate the latent variance-covariance components. The model for the observed score vector of p tests is given as

(2.10a)  $y_i = \mu + \Lambda \theta_i + e_i$

where $\mu$ is the vector mean of the p tests, $\Lambda$ is a p x m matrix of coefficients connecting the observed and the latent variables, $\theta_i$ is an m x 1 vector of latent scores for subject i having an m x m covariance matrix $\Phi$ and $e_i$ is a p x 1 vector of measurement errors with a p x p covariance matrix $\Psi$. The model implies that vector $y_i$ has a multivariate normal distribution with mean vector $\mu$ and covariance matrix $\Sigma_y$ where

(2.11a)  $\Sigma_y = \Lambda \Phi \Lambda' + \Psi$

In this model the latent variables are considered independent of each
other, with $\Phi$ considered to be a diagonal matrix.

The likelihood function of the general model proposed by Bock and Bargmann for p measures on N individuals sampled randomly from a multivariate normal population is

(2.12a)  $L = \prod_{i=1}^{N} (2\pi)^{-p/2} |\Sigma_y|^{-1/2} \exp\{ -\tfrac{1}{2} (y_i - \mu)' \Sigma_y^{-1} (y_i - \mu) \}$

Taking the natural log of the function, differentiating and setting the derivative equal to zero will yield a maximum likelihood estimate.

(2.13a)  $\log(L) = -(Np/2) \log 2\pi - (N/2) \log|\Sigma_y| - (N/2)\, \mathrm{tr}(\Sigma_y^{-1} S)$

Assuming the elements of $\Sigma_y$ are functions of a scalar variable x, the first derivative with respect to x is

(2.14a)  $\partial \log(L) / \partial x = -(N/2)\, \mathrm{tr}(\Sigma_y^{-1}\, \partial\Sigma_y/\partial x) + (N/2)\, \mathrm{tr}(\Sigma_y^{-1} S \Sigma_y^{-1}\, \partial\Sigma_y/\partial x)$

(2.2b)  $\Sigma_y = \Sigma + \Theta$

where $\Sigma_y$ is the variance-covariance matrix of the p measures for y, $\Sigma$ is the variance-covariance matrix due to class effects and $\Theta$ is the variance-covariance matrix due to individual effects. The two variance-covariance matrices contain information about the class level effects and the individual level effects. These two matrices can be expressed as a function of matrices relating observed to latent variables and to errors of measurement.

In order to estimate the underlying latent covariance structure and error that compose the two matrices, Schmidt (1971) applied Bock's model to multilevel data. The model included class effects.

(2.3b)  $y_{ijkn} = \mu_k + a_i + b_{ij} + c_{ik} + d_{ijk} + e_{ijkn}$

where i = 1,...,m groups, j = 1,...,n(i) students/group, k = 1,...,p measures, n = 1,...,N students, $\mu_k$ is the overall mean of the kth variable, $a_i$ is the effect of group i, $b_{ij}$ is the effect of person j in group i, $c_{ik}$ is an interaction between measure k and class i, $d_{ijk}$ is the interaction between student j in group i and measure k, and $e_{ijkn}$ is measurement error for person j in group i on test k for this occasion. Notice that the effect at the class level occurs in two terms

(2.4b)  $\delta_{ik} = a_i + c_{ik}$

and the effect at individual level is found in another two terms.
(2.5b)  $\xi_{ijk} = b_{ij} + d_{ijk}$

Substituting these variables in the model gives the following equation.

(2.6b)  $y_{ijkn} = \mu_k + \delta_{ik} + \xi_{ijk} + e_{ijkn}$

Assuming that $\mu$, $\delta$, $\xi$ and e are uncorrelated, the covariance matrix of the y's is given by

(2.7b)  $\Sigma_y = \Omega + \Gamma + \Psi$

where $\Omega$ is the covariance matrix of $\delta$, $\Gamma$ is the covariance matrix of $\xi$ and $\Psi$ is the covariance matrix of e. The model now has three components: $\Omega$, which contains the effects at class level, $\Gamma$, which contains the effects at individual level, and $\Psi$, which contains the measurement errors. The random variables $\delta_{ik}$ and $\xi_{ijk}$ could be visualized as combinations of some latent random vectors $\theta_a$ and $\theta$.

(2.8b)  $\delta = \Lambda_a \theta_a$

(2.9b)  $\xi = \Lambda \theta$

The error matrix $\Psi$ can be rewritten as the linear combination of two components, a within ($\Psi_w$) and a between group matrix ($\Psi_a$). From these two assumptions the covariance matrix of y is

(2.10b)  $\Sigma_y = \Lambda_a \Phi_a \Lambda_a' + \Lambda \Phi \Lambda' + \Psi_a + \Psi_w$

where $\Lambda_a$ is a p x r matrix of weights relating the observed mean level variables to the vector of r latent variables. This implies that the basic model for the structural analysis of covariance component matrices of the multivariate random model is given by

(2.11b)  $\Sigma_a = \Lambda_a \Phi_a \Lambda_a' + \Psi_a$

(2.12b)  $\Sigma_e = \Lambda \Phi \Lambda' + \Psi_w$

where $\Lambda_a$ is the matrix of weights relating the observed mean-level variables to the latent variables $\theta_a$, $\Lambda$ is the matrix of weights relating the observed individual variables to the latent variables $\theta$, $\Phi_a$ and $\Phi$ are the covariance matrices of the between and within latent effects $\theta_a$ and $\theta$, and $\Psi_a$ and $\Psi_w$ are the diagonal covariance matrices of the two error components. Each of these variance-covariance matrices, $\Sigma_a$ and $\Sigma_e$, corresponds to those considered by Joreskog (1967). The primary difference is that these models, which represent a set of equations, are themselves intended to be simultaneously estimated. A class of models can be generated by varying the restrictions on the six parameter matrices of this model, $\Lambda_a$, $\Lambda$, $\Phi_a$, $\Phi$, $\Psi_a$ and $\Psi_w$.
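As an illustration of how the six parameter matrices determine the between and within covariance matrices in (2.11b) and (2.12b), the following minimal sketch assembles them for four tests and two latent dimensions. The loading pattern (contrast coding of the two dimensions), all numeric values and the function name are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def implied_covariances(lam_a, phi_a, psi_a, lam, phi, psi_w):
    """Return (between, within, total) covariance matrices implied by
    the loading, latent covariance and error matrices."""
    sigma_between = lam_a @ phi_a @ lam_a.T + psi_a   # form of (2.11b)
    sigma_within = lam @ phi @ lam.T + psi_w          # form of (2.12b)
    return sigma_between, sigma_within, sigma_between + sigma_within

# Hypothetical loading pattern: rows are the four tests, columns are the
# two latent dimensions (type of problem, type of mathematics).
loadings = np.array([[1.0, 1.0],
                     [1.0, -1.0],
                     [-1.0, 1.0],
                     [-1.0, -1.0]])

phi_between = np.array([[2.0, 0.5], [0.5, 1.0]])   # correlated latent variables
phi_within = np.array([[1.0, 0.3], [0.3, 1.5]])
psi_between = 0.2 * np.eye(4)                      # homogeneous errors
psi_within = 0.4 * np.eye(4)

sigma_between, sigma_within, sigma_total = implied_covariances(
    loadings, phi_between, psi_between, loadings, phi_within, psi_within)
```

Restricting a latent covariance matrix to be diagonal or an error matrix to be a multiple of the identity changes only the inputs, which is one way to see how a class of models arises from the same two equations.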
The classifications proposed by Wiley (see last section) can be fit to these parameters. The two forms that the latent variance-covariance matrices, $\Phi_a$ and $\Phi$, can assume are:

(1) The latent variables are uncorrelated, i.e. $\Phi$ is a diagonal matrix.
(2) The latent variables are correlated, i.e. $\Phi$ is a symmetric matrix.

The two error matrices, $\Psi_a$ and $\Psi_w$, can have one of the following two structures:

(1) The errors are heterogeneous, a general diagonal matrix.
(2) The errors are homogeneous ($\sigma^2 I$).

The four forms of restriction on $\Lambda$ are

(1) General ($\Lambda$) - all elements are to be estimated.
(2) General ($\Lambda$) - most elements are to be estimated except for certain specified ones.

If $\Lambda$ is reparameterized into $\Gamma \Lambda$, where $\Gamma$ is a matrix of scaling factors, two different sets of restriction can be applied.

(3) Completely specified ($\Lambda$) and scaled by an unknown but estimable matrix of scaling weights ($\Gamma$).
(4) Completely specified ($\Lambda$) and unscaled.

There are (4 x 4 x 2 x 2 x 2 x 2) = 256 possible models which can be formed from them.

3. Extension of $\Phi_a$ and $\Phi$ to causal models.

Once the latent covariance matrices, $\Phi_a$ and $\Phi$, are estimated, the matrices themselves can be used to test causal models. Specifically, once scaling factors have been specified and measurement error removed, these residual covariance components contain all of the relevant information necessary to analyze a given data structure, and hypothesized causal models can be tested.

Joreskog (1971) has developed a procedure for estimating the parameters of causal models using maximum likelihood estimation (LISREL). His procedure estimates the parameters for two components of causal models, namely the measurement model (based on his work mentioned in section A) and the structural model. The measurement model estimates the underlying latent constructs of the model while the structural model specifies the causal relationships among the latent variables.
These two components, in turn, are used to describe the causal effects and the amount of unexplained variance among the observed variables. Wisenbaker (1980) extended Joreskog's model to multilevel situations. His work simultaneously estimated parameters of causal models at both the between and within levels. The focus in this dissertation is on the estimation of $\Phi_a$ and $\Phi$. One natural extension of this work is to develop the algorithm necessary for directly estimating the parameters of Wisenbaker's causal model when groups are unbalanced.

CHAPTER III: REVIEW OF RELEVANT WORK

1. Schmidt's Structural Model

The focus of this dissertation is the estimation of the latent covariance matrices $\Phi_a$ and $\Phi$. Schmidt (1969) developed a general procedure for estimating these latent covariance matrices for multivariate normal data. Assuming classes to be drawn at random, the random multivariate model is:

(3.1a)  $y_{ij} = \mu + a_i + e_{ij}$

where $y_{ij}$ is the observed set of individual level variables for p values and $\mu$ is a p x 1 vector of general means. The term $a_i$ is a random vector of schools and $e_{ij}$ is a random vector of errors. Both of these are considered to be distributed multivariate normal with zero mean vectors and covariance matrices $\Sigma_a$ and $\Sigma_e$. This would imply that the covariance structure for this model would be:

(3.2a)  $\Sigma_y = \Sigma_a + \Sigma_e$

Usually $\Sigma_a$ and $\Sigma_e$ are estimated by using the expectations of the mean squares of a Multivariate Analysis of Variance (MANOVA). In the random multivariate model it can be shown that

(3.3a)  $E( S(w)/[kn-k] ) = \Sigma_e$

and

(3.4a)  $E( S(b)/[k-1] ) = \Sigma_e + n \Sigma_a$.

By using these formulae, $\Sigma_e$ and $\Sigma_a$ can be estimated from the following equations

(3.5a)  $\hat{\Sigma}_e = S(w)/[kn-k]$

(3.6a)  $\hat{\Sigma}_a = (1/n) \{ S(b)/[k-1] - S(w)/[kn-k] \}$.

Unfortunately this method can yield non-positive definite estimates of the matrix $\Sigma_a$. Schmidt used the principle of maximum likelihood to estimate these two matrices.
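The mean-square estimators (3.5a) and (3.6a), and the way they can fail to be positive definite, can be illustrated with a small sketch. It assumes balanced data supplied as a list of k arrays of shape n x p; the function name and the toy data set (two groups with identical means) are hypothetical and chosen only so that the result can be followed by hand.

```python
import numpy as np

def manova_moment_estimates(groups):
    """Estimate the between and within covariance matrices from the
    MANOVA mean squares, following the form of (3.5a) and (3.6a).

    groups: list of k arrays, each n x p, with equal n (balanced data).
    """
    k = len(groups)
    n, _ = groups[0].shape
    group_means = np.array([g.mean(axis=0) for g in groups])
    grand_mean = group_means.mean(axis=0)

    # Within-group sums of squares and cross products, S(w)
    s_within = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    # Between-group sums of squares and cross products, S(b)
    deviations = group_means - grand_mean
    s_between = n * deviations.T @ deviations

    sigma_e = s_within / (k * n - k)
    sigma_a = (s_between / (k - 1) - sigma_e) / n
    return sigma_a, sigma_e

# Toy data: both groups have mean (1, 1), so S(b) is zero.
group_1 = np.array([[0.0, 0.0], [2.0, 1.0], [1.0, 2.0]])
group_2 = np.array([[1.0, 0.0], [0.0, 2.0], [2.0, 1.0]])

sigma_a_hat, sigma_e_hat = manova_moment_estimates([group_1, group_2])
smallest_eigenvalue = np.linalg.eigvalsh(sigma_a_hat).min()
```

Because the group means coincide, the between estimate equals minus one third of the within estimate, so its eigenvalues are negative. This is an extreme case, but it shows why the moment estimates are not guaranteed to be admissible covariance matrices.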
The data in a random model would consist of m factor levels each containing n subjects with p measures on each subject.

[Data layout: m factor levels (rows) by n subjects (columns), with p dependent variables (measures) recorded for each subject.]

Basing the likelihood function on the general notions of Tiao and Tan (1965), the data can be visualized as m independent observations from an np-dimensional multivariate normal distribution. The general linear model for any y is given by

(3.7a)  $y = 1 \otimes \mu + 1 \otimes a + e$

where $1 \otimes \mu$ is a vector of pn means (p means repeated n times), $1 \otimes a$ is a vector of pn effects (p effects repeated n times) and e is a vector of pn errors. Both a and e are considered to have come from multivariate normal distributions

(3.8a)  $a \sim N(0, \Sigma_a)$,  $e \sim N(0, \Sigma_e)$

The covariance matrix for this model is

(3.9a)  $\Sigma_{np} = 11' \otimes \Sigma_a + I_n \otimes \Sigma_e$

This appears as

$\Sigma_{np} = \begin{pmatrix} \Sigma_a + \Sigma_e & \Sigma_a & \cdots & \Sigma_a \\ \Sigma_a & \Sigma_a + \Sigma_e & \cdots & \Sigma_a \\ \vdots & & \ddots & \vdots \\ \Sigma_a & \Sigma_a & \cdots & \Sigma_a + \Sigma_e \end{pmatrix}$

The covariance between observations within a factor level is given by

(3.10a)  $\mathrm{Cov}(y_j, y_{j'}) = \Sigma_a, \quad j \neq j'$

The density function of y is then

(3.11a)  $f(y) = (2\pi)^{-np/2} |\Sigma_{np}|^{-1/2} \exp\{ -\tfrac{1}{2} (y - 1 \otimes \mu)' \Sigma_{np}^{-1} (y - 1 \otimes \mu) \}$

from which the likelihood function follows

(3.12a)  $L(\mu, \Sigma_a, \Sigma_e) = (2\pi)^{-mnp/2} |\Sigma_{np}|^{-m/2} \exp\{ -\tfrac{1}{2} \sum_{i=1}^{m} (y_i - 1 \otimes \mu)' \Sigma_{np}^{-1} (y_i - 1 \otimes \mu) \}$

The matrix $\Sigma_{np}$ must be expressed in terms of $\Sigma_a$ and $\Sigma_e$. The relationship is given in (3.9a), from which the following determinant and inverse follow.

(3.13a)  $|\Sigma_{np}| = |\Sigma_e|^{n-1} \, |\Sigma_e + n\Sigma_a|$

(3.14a)  $\Sigma_{np}^{-1} = I \otimes \Sigma_e^{-1} - 11' \otimes (\Sigma_e + n\Sigma_a)^{-1} \Sigma_a \Sigma_e^{-1}$

The likelihood can be simplified as

(3.14a)  $L(\mu, \Sigma_a, \Sigma_e) = (2\pi)^{-mnp/2} |\Sigma_e|^{-m(n-1)/2} |\Sigma_e + n\Sigma_a|^{-m/2} \exp\{ -\tfrac{1}{2} [ mn \,\mathrm{tr}(\Sigma_e^{-1} S_e) + m \,\mathrm{tr}((\Sigma_e + n\Sigma_a)^{-1} S_a) + mn \,\mathrm{tr}((\Sigma_e + n\Sigma_a)^{-1} (\bar{y} - \mu)(\bar{y} - \mu)') ] \}$

where

$S_e = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{ij} - \bar{y}_i)(y_{ij} - \bar{y}_i)'$

$S_a = \frac{n}{m} \sum_{i=1}^{m} (\bar{y}_i - \bar{y})(\bar{y}_i - \bar{y})'$

and $y_{ij}$ is a p x 1 observation vector for the jth person in the ith group. The log of the likelihood is

(3.15a)  $\log L(\mu, \Sigma_a, \Sigma_e) = -\frac{mnp}{2} \log(2\pi) - \frac{m(n-1)}{2} \log|\Sigma_e| - \frac{m}{2} \log|\Sigma_e + n\Sigma_a| - \frac{1}{2} [ mn \,\mathrm{tr}(\Sigma_e^{-1} S_e) + m \,\mathrm{tr}((\Sigma_e + n\Sigma_a)^{-1} S_a) + mn \,\mathrm{tr}((\Sigma_e + n\Sigma_a)^{-1} (\bar{y} - \mu)(\bar{y} - \mu)') ]$

The effective part of the log likelihood function for the estimation of $\Sigma_a$ and $\Sigma_e$ is given by

(3.16a)  $\log L(\mu, \Sigma_a, \Sigma_e) = -\frac{m(n-1)}{2} \log|\Sigma_e| - \frac{m}{2} \log|\Sigma_e + n\Sigma_a| - \frac{mn}{2} \mathrm{tr}(\Sigma_e^{-1} S_e) - \frac{m}{2} \mathrm{tr}((\Sigma_e + n\Sigma_a)^{-1} S_a)$

By expressing the likelihood in this manner, Schmidt was able to obtain the following maximum likelihood estimates for $\Sigma_e$ and $\Sigma_a$.

(3.17a)  $\hat{\Sigma}_e = [n/(n-1)] S_e$

(3.18a)  $\hat{\Sigma}_a = \frac{1}{n} ( S_a - [n/(n-1)] S_e )$

This gives estimates of the between and within covariance matrices but says nothing about the latent constructs or the measurement error associated with the observed values. The equation for a single observation with latent constructs is

(3.19a)  $y_{ijkn} = \mu_k + a_i + b_{ij} + c_{ik} + d_{ijk} + e_{ijkn}$

i = 1, ..., m groups
j = 1, ..., n(i) students/group
k = 1, ..., p measures
n = 1, ..., N students

where $\mu_k$ is the mean of the kth variable, $a_i$ is the effect of group i, $b_{ij}$ is the effect of person j in group i, $c_{ik}$ is an interaction between measure k and class i, $d_{ijk}$ is the interaction between student j in group i and measure k, and $e_{ijkn}$ is measurement error for person j in group i on test k for this occasion. Notice that the class effect occurs in two terms.

(3.20a)  $\alpha_{ik} = a_i + c_{ik}$

The effect due to individuals exists in two terms.

(3.21a)  $\beta_{ijk} = b_{ij} + d_{ijk}$

The model can be written in terms of the effects at each level.

(3.22a)  $y_{ijkn} = \mu_k + \alpha_{ik} + \beta_{ijk} + e_{ijkn}$

Assuming that $\mu$, $\alpha$, $\beta$ and e are uncorrelated, the covariance matrix of the y's is

(3.23a)  $\Sigma_y = \Theta + \Gamma + \Psi$

where $\Theta$ is the covariance matrix of $\alpha$, $\Gamma$ is the covariance matrix of $\beta$ and $\Psi$ is the covariance matrix of e. The model now has three components: $\Theta$, which contains the effects at class level, $\Gamma$, which contains the effects at individual level, and $\Psi$, which contains the measurement errors. The vectors $\alpha$ and $\beta$ could be visualized as combinations of latent random vectors $\theta_a$ and $\theta$.

(3.24a)  $\alpha_i = \Lambda_a \theta_a$

(3.25a)  $\beta_{ij} = \Lambda \theta$

The error matrix $\Psi$ can be rewritten as the linear combination of two components, a within ($\Psi$) and a between group matrix ($\Psi_a$).
From these two assumptions the covariance matrix of y is

(3.26a)  $\Sigma_y = \Lambda_a \Phi_a \Lambda_a' + \Lambda \Phi \Lambda' + \Psi_a + \Psi$

where $\Lambda$ is a p x r matrix of weights relating the observed mean level variables to the vector of r latent variables. This implies that the basic model for the structural analysis of covariance component matrices of the multivariate random model is

(3.27a)  $\Sigma_a = \Lambda_a \Phi_a \Lambda_a' + \Psi_a$

(3.28a)  $\Sigma_e = \Lambda \Phi \Lambda' + \Psi$

where $\Lambda_a$ is the matrix of weights relating the observed mean-level variables to the latent variables $\theta_a$, $\Lambda$ is the matrix of weights relating the observed individual variables to the latent variables $\theta$, $\Phi_a$ and $\Phi$ are the covariance matrices of the between and within latent effects $\theta_a$ and $\theta$, and $\Psi_a$ and $\Psi$ are the diagonal covariance matrices of the two sets of errors.

Substituting the structural model for $\Sigma_a$ and $\Sigma_e$ into the likelihood function in (3.16a) gives the likelihood appropriate for the structural analysis. Taking partial derivatives of the log likelihood with respect to $\Phi_a$, $\Phi$, $\Lambda_a$, $\Lambda$, $\Psi_a$ and $\Psi$ and setting them equal to zero will yield maximum likelihood estimates of those parameter matrices. These equations proved to be too complicated to be solved algebraically, and Schmidt used the modified method of Davidson as an algorithm to estimate the matrices.

Formulating the maximum likelihood equation by considering the data as m random vectors from a multivariate normal distribution of np measures constrained the model to have the same number of individuals in each group (i.e. there must be p measures for each of n students). When groups have unequal numbers of students, the likelihood function developed in (3.16a) will no longer hold true. In education, researchers are often in the position of collecting data for groups of unequal sizes. To obtain maximum likelihood estimates of the structural matrices in this situation requires either a new analytic strategy or the development of an alternative likelihood function.
However, finding maximum likelihood estimates of the covariance matrices of an unbalanced design has proved to be difficult.

2. Unbalanced Designs

In multivariate analysis very little has been done to examine the effects of unequal group size on the estimation of the covariance matrix, although there has been much exploration of the estimation of variance components in this design for the univariate case. Anderson (1984) feels that a number of statistical problems arising in multivariate populations are straightforward analogs of problems arising in univariate populations, and that the suitable methods for handling these problems are similar. Parsimony would therefore suggest looking at previous developments regarding the univariate case.

Searle (1971) points out the following problems which must be faced when dealing with unbalanced designs:

"The property of unbiasedness itself merits questioning in the case of variance component estimators. This is so because with unbalanced data from random models the concept of repetitions of similarly structured data and associated repetitions of estimators is often not appropriate --- more data, maybe, but not necessarily with the same pattern of unbiasedness. Replications of data can not be thought of as mere resamplings of the data already available."

and

"even in the simplest of cases the effect of the n-pattern on properties of estimators is apparently itself a function of the variance components being estimated. The effects of unbalancedness therefore appear to differ according to the values of the true variance components."

This last statement refers to the fact that the MINQUE procedure and those based on it rely on the researcher choosing the "true" ratio of the between variance component to the within variance component of the variable.

Welsh (1937) was the first to point out how unequal numbers of subjects in each group can affect the estimation and testing of statistical hypotheses.
Henderson (1953) proposed three methods for estimating variance components for the unbalanced random design, using the expectations of the Random ANOVA Model. Rao (1971) advanced a new method for estimating variance components called MINQUE, a minimum quadratic unbiased estimator. Ahrens, Kleffe and Tenzler (1981) state "this procedure provides some kind of optimality and does not refer to the normal assumption" and "MINQUE ... has been justified by heuristic arguments without reference to the normal distribution". Formulas for the MINQUE have been developed with increasing explicitness by Lamotte (1973, 1976) and Ahrens (1978). MINQUE has also been developed for more difficult designs (e.g. see Kleffe (1977)). MINQUE can at times give negative estimates of the variance components. Rao (1972), in turn, developed MINQE, which gives variance estimates that are always positive but can be biased. It may be noted that no properties are yet known about this estimator.

Searle (1972) devotes an entire review to the methods of variance estimation in unbalanced random designs. The estimators reviewed are all unbiased; their other properties are unknown. Most of these estimation procedures can lead to negative estimates of the variance component.

Chatterjee and Das (1983) developed a simple estimator of variance components in the random model based on Weighted Least Squares (WLS). They found that as the number of classes increases the proposed estimator is seen not only to be best asymptotically normal but also to be asymptotically equivalent to the maximum likelihood estimates. A review of recent developments in WLS can be found in Williams, Radcliffe and Speed (1975).

There is no agreement on what constitutes a good estimator of the covariance when groups are unbalanced. As shown above, there are many different estimators, each with its own strengths and weaknesses.
CHAPTER IV: STATEMENT OF PROBLEM

The interest of this dissertation lies in the latent covariance structure implied by the simple true score model. Based on the Simple Multivariate Random Effects Model, the two variance components, between ($\Sigma_a$) and within ($\Sigma$), are expressed as linear combinations of a set of latent variables.

(4.1)  $\Sigma_a = \Lambda_a \Phi_a \Lambda_a' + \Psi_a$

(4.2)  $\Sigma = \Lambda \Phi \Lambda' + \Psi$

It is the latent covariance matrices $\Phi_a$ and $\Phi$ that are of primary interest.

In Chapter 3, a maximum likelihood procedure developed by Schmidt was presented for estimating the error matrices $\Psi_a$ and $\Psi$ and the latent covariance matrices $\Phi_a$ and $\Phi$ when $\Lambda_a$ and $\Lambda$ are known. However, this procedure can only be used when groups in the sample contain the same number of individuals. If the number of individuals in each group is different, this estimation procedure would not be directly appropriate.

The focus of this dissertation is upon the estimation of the group and individual level variances, with measurement error removed, when group sizes vary in a sample. A promising approach is the EM Algorithm. Developed as an estimation procedure for handling data sets with missing data, it offers a method of finding maximum likelihood estimates of parameters in situations where classical maximum likelihood procedures fail. The applicability of the EM Algorithm to latent structure models is demonstrated in the next chapter.

CHAPTER V: ESTIMATION PROCEDURE

1. Expectation-Maximization (EM) Algorithm

The EM algorithm has gradually evolved as a method for estimating the parameters of a model when a sample contains missing data. Early works by Hartley (1958), Healy and Westmacott (1956), Baum et al (1970), and Brown (1974) among others contained specific uses of the EM algorithm under different names.
Dempster, Laird and Rubin (1975) developed a more general form for the algorithm and provided a formal proof that if the algorithm did converge, it would result in maximum likelihood estimates.

Missing data cannot be directly measured but exist as a function of the observed data. This could be censored or truncated data, where the value of the data is not the direct value of interest, or the observed data could be viewed as being comprised of combinations of latent constructs (Hartley and Hocking, 1971).

Assuming a sample, y, is drawn from a population of a known distribution with unknown parameters $\phi$, then y (incomplete data) can be pictured as a subsample of x (complete data) determined by the equation y = y(x). The complete data situation has a family of sampling densities $f(x|\phi)$ depending on $\phi$, from which the corresponding family of sampling distributions for the incomplete data, $g(y|\phi)$, can be derived. The EM algorithm is aimed at finding the $\phi$ which maximizes $g(y|\phi)$ given an observed y, but making essential use of the family $f(x|\phi)$. There are many possible $f(x|\phi)$ that will generate a given $g(y|\phi)$, making the choice of $f(x|\phi)$ a major problem.

Each iteration of the EM algorithm goes through two steps, the expectation step (E-Step) and the maximization step (M-Step). If the complete data, x, comes from a distribution with parameter $\phi$, the steps can be stated as follows:

1. E-step: Estimate the complete data sufficient statistics conditional upon the incomplete data, y, and the parameter $\phi$. This step provides the connection between the complete data, x, and the incomplete data, y.

2. M-step: Determine the parameter $\phi$ that maximizes the likelihood given the conditional complete data sufficient statistics. This requires writing the Maximum Likelihood equation for $\phi$ in terms of the complete data.

The sufficient statistics for the complete data are calculated using the incomplete data and estimates of the parameters.
(For the first iteration these values of the parameters are given by the user.) The sufficient statistics are then used to estimate the parameters. This value is used to recalculate the sufficient statistics, which in turn are used to recalculate the parameter $\phi$. The iterations continue until some chosen criterion for convergence is met.

2. Theory for the Restricted Model

It is the application of the EM algorithm to the estimation of the variance-covariance matrices of a latent multivariate model that is the thrust of this dissertation. The model chosen was based on Schmidt's latent multivariate model with two modifications. First, the groups may or may not contain different numbers of subjects; Schmidt's model allowed only groups of equal size. This modification, however, makes Schmidt's estimation procedure inapplicable. Second, the group level error term in Schmidt's model is not included in the present model.

To estimate the parameters of this unbalanced model, the EM algorithm was employed. The E-step requires the derivation of the conditional sufficient statistics and the M-step requires the Maximum Likelihood Equations of the parameters for the complete data.

The model of interest has the following structure

(5.1)  $Y_{ij} = \mu + \Lambda_a \theta_i + \Lambda \omega_{ij} + \varepsilon_{ij}$

where
$Y_{ij}$ is a p x 1 vector of observed variables for subject j in group i (incomplete data),
$\mu$ is a p x 1 vector of grand means for p variables (complete data),
$\Lambda_a$ is a p x q matrix connecting the p observed measures for individuals to the q underlying group latent values,
$\theta_i$ is a q x 1 vector of q (where $q \le p$) latent group effects for group i (complete data),
$\Lambda$ is a p x r matrix connecting the p observed measures for individuals to the r underlying individual latent values,
$\omega_{ij}$ is an r x 1 vector of r (where $r \le p$) latent individual effects for person j in group i (complete data),
$\varepsilon_{ij}$ is a p x 1 vector of random error.
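For purposes of the derivation of the conditional equations necessary for this EM Algorithm, $\mu$ will be considered to be a random vector from a multivariate normal distribution with a mean vector of zero and covariance matrix $\Sigma_\mu$. Later in the derivation, $\Sigma_\mu^{-1}$ will be defined as a zero matrix, yielding posterior estimates of the grand means. This procedure is mentioned in Dempster et al (1976) and further elaborated in Raudenbush (1986). The latent effects and the error are assumed to come from the multivariate normal distributions

(5.2)  $\mu \sim N(0, \Sigma_\mu)$,  $\theta \sim N(0, \Phi_a)$,  $\omega \sim N(0, \Omega)$,  $\varepsilon \sim N(0, \Psi)$

It is the estimation of $\Phi_a$, $\Omega$ and $\Psi$ that is of interest. Assuming the latent effects are independent, then:

(5.3)  $\mathrm{Cov}(\mu, \theta) = 0$,  $\mathrm{Cov}(\mu, \omega) = 0$,  $\mathrm{Cov}(\mu, \varepsilon) = 0$,  $\mathrm{Cov}(\theta, \omega) = 0$,  $\mathrm{Cov}(\theta, \varepsilon) = 0$,  $\mathrm{Cov}(\omega, \varepsilon) = 0$.

Before finding the conditional sufficient statistics for the maximum likelihood equations, it is important to have a clear understanding of which variables comprise the incomplete data, the complete data and the parameters of interest. The incomplete data is our observed dataset Y. The complete data consists of the three latent variables $\mu$, $\theta$ and $\omega$. The parameters of interest are the covariance matrices $\Phi_a$, $\Omega$ and $\Psi$.

E-Step

Development of the conditional expectations and dispersions for $\mu$, $\theta$ and $\omega$ is delineated in this section. These expectations are conditional on the observed data Y and the three parameter matrices $\Phi_a$, $\Omega$ and $\Psi$. The observed dataset has the following expression:

(5.4)  $Y = 1_N \otimes \mu + (X \otimes \Lambda_a)\theta + (I_N \otimes \Lambda)\omega + \varepsilon$

where $1_N$ is an N x 1 vector of ones and X is an N x k pattern matrix containing 1's and 0's (k being the number of groups). This matrix connects each person with his or her group.

The variance of the observed dataset, Y, is:

(5.5)  $\Sigma_Y = 1_N 1_N' \otimes \Sigma_\mu + XX' \otimes \Lambda_a \Phi_a \Lambda_a' + I_N \otimes \Lambda \Omega \Lambda' + I_N \otimes \Psi$

The joint distribution of Y, $\mu$, $\theta$ and $\omega$ is:

(5.6)  $\begin{pmatrix} Y \\ \mu \\ \theta \\ \omega \end{pmatrix} \sim N \left( 0, \begin{pmatrix} \Sigma_Y & & & \text{Symmetric} \\ 1_N' \otimes \Sigma_\mu & \Sigma_\mu & & \\ X' \otimes \Phi_a \Lambda_a' & 0 & I_k \otimes \Phi_a & \\ I_N \otimes \Omega \Lambda' & 0 & 0 & I_N \otimes \Omega \end{pmatrix} \right)$

with

$\mathrm{Cov}(Y, \mu) = 1_N \otimes \Sigma_\mu$
$\mathrm{Cov}(Y, \theta) = X \otimes \Lambda_a \Phi_a$
$\mathrm{Cov}(Y, \omega) = I_N \otimes \Lambda \Omega$
$\mathrm{Cov}(\mu, \theta) = 0$
$\mathrm{Cov}(\mu, \omega) = 0$
$\mathrm{Cov}(\theta, \omega) = 0$.

By defining the matrices and vectors as

(5.7)  $Z = [\, 1_N \otimes I_p \quad X \otimes \Lambda_a \quad I_N \otimes \Lambda \,]$,  $T = \begin{pmatrix} \mu \\ \theta \\ \omega \end{pmatrix}$,  $\Phi = \begin{pmatrix} \Sigma_\mu & & \text{Symmetric} \\ 0 & I_k \otimes \Phi_a & \\ 0 & 0 & I_N \otimes \Omega \end{pmatrix}$

the joint normal prior distribution can be written as

(5.8)  $\begin{pmatrix} Y \\ T \end{pmatrix} \sim N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} Z \Phi Z' + I \otimes \Psi & Z \Phi \\ \Phi Z' & \Phi \end{pmatrix} \right)$

Raudenbush (1986) derived the formulas for the conditional expectation and dispersion of matrices written in this form. The conditional expectation and dispersion of T given Y are:

(5.9)  $E(T|Y) = ( Z'(I \otimes \Psi)^{-1} Z + \Phi^{-1} )^{-1} Z'(I \otimes \Psi)^{-1} Y$

(5.10)  $D(T|Y) = ( Z'(I \otimes \Psi)^{-1} Z + \Phi^{-1} )^{-1}$

By substituting the original values of Z, T and $\Phi$ in (5.7) and allowing $\Sigma_\mu^{-1}$ to become a matrix of zeros as previously mentioned, the dispersion matrix becomes

(5.11)  $D(T|Y) = \begin{pmatrix} N \Psi^{-1} & & \text{Symmetric} \\ 1_k \otimes n_i \Lambda_a' \Psi^{-1} & I_k \otimes ( n_i \Lambda_a' \Psi^{-1} \Lambda_a + \Phi_a^{-1} ) & \\ 1_N \otimes \Lambda' \Psi^{-1} & X \otimes \Lambda' \Psi^{-1} \Lambda_a & I_N \otimes ( \Lambda' \Psi^{-1} \Lambda + \Omega^{-1} ) \end{pmatrix}^{-1}$

Partitioning this matrix into a 2 x 2 matrix and applying the procedure in Morrison (1972) for inverting such a matrix leads to the following values for the elements

$D(T|Y) = \begin{pmatrix} 1 & & \\ 2 & 3 & \\ 4 & 5 & 6 \end{pmatrix}$

(5.12)  $1 = D(\mu|Y) = W$

(5.13)  $2 = D(\theta, \mu|Y) = -1_k \otimes \Phi_a \Lambda_a' Q_i W$

(5.14)  $3 = D(\theta|Y) = I_k \otimes ( \Phi_a - \Phi_a \Lambda_a' Q_i \Lambda_a \Phi_a ) + 1_k 1_k' \otimes \Phi_a \Lambda_a' Q_i W Q_j \Lambda_a \Phi_a$

(5.15)  $4 = D(\omega, \mu|Y) = -1_N \otimes (1/n_i) \Omega \Lambda' Q_i W$

(5.16)  $5 = D(\omega, \theta|Y) = 1_N 1_k' \otimes (1/n_i) \Omega \Lambda' Q_i W Q_j \Lambda_a \Phi_a - X \otimes (1/n_i) \Omega \Lambda' Q_i \Lambda_a \Phi_a$

(5.17)  $6 = D(\omega|Y) = 1_N 1_N' \otimes (1/n_i)(1/n_j) \Omega \Lambda' Q_i W Q_j \Lambda \Omega + I_N \otimes ( \Omega - \Omega \Lambda' M \Lambda \Omega ) + XX' \otimes (1/n_i) \Omega \Lambda' ( M - (1/n_i) Q_i ) \Lambda \Omega$

where

$M = ( \Lambda \Omega \Lambda' + \Psi )^{-1}$,
$Q_i = ( \Lambda_a \Phi_a \Lambda_a' + (1/n_i) M^{-1} )^{-1}$,
$W = [ \sum_{i=1}^{k} Q_i ]^{-1}$

and the subscripts i and j denote the groups to which the row and column elements belong.

The conditional expectations of $\mu$, $\theta$ and $\omega$ can be formulated by substituting (5.7) into (5.9). This yields:

(5.18)  $E \begin{pmatrix} \mu|Y \\ \theta|Y \\ \omega|Y \end{pmatrix} = \begin{pmatrix} \mu^* \\ \theta^* \\ \omega^* \end{pmatrix} = \begin{pmatrix} W \sum_{i} Q_i \bar{Y}_i \\ 1_k \otimes \Phi_a \Lambda_a' Q_i ( \bar{Y}_i - \mu^* ) \\ 1_N \otimes \Omega \Lambda' M ( Y_{ij} - \Lambda_a \theta_i^* - \mu^* ) \end{pmatrix}$

M-Step

The second step of the EM Algorithm requires expressing the maximum likelihood equations of the parameters $\Phi_a$, $\Omega$ and $\Psi$ in terms of the complete data. Assuming the values $\theta$, $\omega$ and $\varepsilon$ are known, the expression of the likelihoods for our three parameters can be directly stated as:

(5.19)  $L(Y, \Phi_a) = \prod_{i=1}^{k} (2\pi)^{-q/2} |\Phi_a|^{-1/2} \exp\{ -\tfrac{1}{2} \theta_i' \Phi_a^{-1} \theta_i \}$

(5.20)  $L(Y, \Omega) = \prod_{i=1}^{k} \prod_{j=1}^{n_i} (2\pi)^{-r/2} |\Omega|^{-1/2} \exp\{ -\tfrac{1}{2} \omega_{ij}' \Omega^{-1} \omega_{ij} \}$

(5.21)  $L(Y, \Psi) = \prod_{i=1}^{k} \prod_{j=1}^{n_i} (2\pi)^{-p/2} |\Psi|^{-1/2} \exp\{ -\tfrac{1}{2} \varepsilon_{ij}' \Psi^{-1} \varepsilon_{ij} \}$

The maximum likelihood estimate of $\Phi_a$ is derived by first finding the log of (5.19).

(5.22)  $\log L(Y, \Phi_a) = -\frac{kq}{2} \log(2\pi) - \frac{k}{2} \log|\Phi_a| - \frac{1}{2} \sum_{i=1}^{k} \theta_i' \Phi_a^{-1} \theta_i$

Taking the derivative with respect to $\Phi_a$ and setting it equal to zero yields:

(5.23)  $\hat{\Phi}_a = \frac{1}{k} \sum_{i=1}^{k} \theta_i \theta_i'$

In reality, only the conditional expectations of $\mu$, $\theta$ and $\omega$ are known ($\mu^*$, $\theta^*$, $\omega^*$). By rewriting (5.23) as

$\hat{\Phi}_a = \frac{1}{k} \sum_{i=1}^{k} ( \theta_i^* + \theta_i - \theta_i^* )( \theta_i^* + \theta_i - \theta_i^* )'$
$\quad = \frac{1}{k} \sum_{i=1}^{k} [ \theta_i^* \theta_i^{*\prime} + \theta_i^* ( \theta_i - \theta_i^* )' + ( \theta_i - \theta_i^* ) \theta_i^{*\prime} + ( \theta_i - \theta_i^* )( \theta_i - \theta_i^* )' ]$

and taking conditional expectations, the cross-product terms vanish and the last term becomes the conditional dispersion of $\theta_i$:

(5.24)  $\hat{\Phi}_a = \frac{1}{k} \sum_{i=1}^{k} [ \theta_i^* \theta_i^{*\prime} + D(\theta_i) ]$

The equation can be clarified by substituting (5.18) for $\theta^*$ and (5.14) for $D(\theta)$. The maximum likelihood estimate becomes:

(5.25)  $\hat{\Phi}_a = \Phi_a - \frac{1}{k} \sum_{i=1}^{k} [ \Phi_a \Lambda_a' ( Q_i - Q_i ( A_i + W ) Q_i ) \Lambda_a \Phi_a ]$

where $A_i = ( \bar{Y}_i - \mu^* )( \bar{Y}_i - \mu^* )'$.

Following the same likelihood procedure from steps (5.22) through (5.24) above gives the following maximum likelihood estimate for $\Omega$:

(5.26)  $\hat{\Omega} = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n_i} [ \omega_{ij}^* \omega_{ij}^{*\prime} + D(\omega_{ij}) ]$

Equation (5.26) can be clarified by substituting (5.18) for $\omega^*$ and (5.17) for $D(\omega)$. The maximum likelihood estimate becomes:

(5.27)  $\hat{\Omega} = \Omega - \frac{1}{N} \sum_{i=1}^{k} \Omega \Lambda' ( (1/n_i) [ Q_i - Q_i W Q_i ] - M [ B_i - (n_i - 1) M^{-1} ] M ) \Lambda \Omega$

where $B_i = \sum_{j=1}^{n_i} ( Y_{ij} - \Lambda_a \theta_i^* - \mu^* )( Y_{ij} - \Lambda_a \theta_i^* - \mu^* )'$.

The maximum likelihood estimate for $\Psi$ followed steps (5.22) and (5.23). The maximum likelihood estimate is:

(5.28)  $\hat{\Psi} = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \varepsilon_{ij} \varepsilon_{ij}'$

Replacing $\varepsilon$ by $Y - (\mu + \Lambda_a \theta + \Lambda \omega)$ permits (5.28) to be rewritten as:

(5.29)  $\hat{\Psi} = \frac{1}{N} \sum_{i} \sum_{j} ( Y_{ij} - (\mu + \Lambda_a \theta_i + \Lambda \omega_{ij}) )( Y_{ij} - (\mu + \Lambda_a \theta_i + \Lambda \omega_{ij}) )'$

Replacing the unknown values by their conditional expectations and expanding the equation yields

(5.30)  $\hat{\Psi} = \frac{1}{N} \sum_{i} \sum_{j} [ ( Y_{ij} - (\mu^* + \Lambda_a \theta_i^* + \Lambda \omega_{ij}^*) )( Y_{ij} - (\mu^* + \Lambda_a \theta_i^* + \Lambda \omega_{ij}^*) )' + D( \mu^* + \Lambda_a \theta^* + \Lambda \omega^* ) ]$

(5.31)  $D( \mu^* + \Lambda_a \theta^* + \Lambda \omega^* ) = D(\mu^*) + 2 D(\mu^*, \Lambda_a \theta^*) + 2 D(\mu^*, \Lambda \omega^*) + D(\Lambda_a \theta^*) + 2 D(\Lambda_a \theta^*, \Lambda \omega^*) + D(\Lambda \omega^*)$

By substituting (5.12), (5.13), (5.14), (5.15), (5.16), (5.17) and (5.18) into (5.30), the estimate of $\Psi$ can be written as:

(5.32)  $\hat{\Psi} = \Psi - \frac{1}{N} \sum_{i=1}^{k} \Psi ( (1/n_i) [ Q_i - Q_i W Q_i ] - M [ B_i - (n_i - 1) M^{-1} ] M ) \Psi$

where $B_i = \sum_{j=1}^{n_i} ( Y_{ij} - \Lambda_a \theta_i^* - \mu^* )( Y_{ij} - \Lambda_a \theta_i^* - \mu^* )'$.

3. The E-M Algorithm.

The implementation of this algorithm is now complete for this latent model. The E-step uses the observed dataset Y and starting values for the parameters $\Phi_a$, $\Omega$ and $\Psi$ to find estimates of the sufficient statistics in (5.12) through (5.18). The M-step finds estimates of the three parameters using the values from the first step in (5.25), (5.27) and (5.32). These estimates are used in step 1 to reestimate the sufficient statistics in (5.12) through (5.18). Then in the M-step, $\Phi_a$, $\Omega$ and $\Psi$ are estimated again using the new values from the E-step. The algorithm iterates between these two steps until some criterion is reached.

Chapter VI: Design of Study

1. Design.

By applying the estimation procedure (described in Chapter Five) to a set of data sampled from a population of known parameters, a check was provided for the solution of the EM algorithm together with the identification of its properties. Although the underlying parameters of the sample data were known, this was not intended to be a simulation study but, rather, an example of the algorithm's ability to estimate the parameters of a latent model.

The EM algorithm, in operational terms, was used to estimate covariance components from both unbalanced and balanced samples drawn from the same multivariate normal distribution with known parameters. The balanced case contained 30 subjects for each group, while the unbalanced case averaged 30 subjects per group.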
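The alternation between the E-step and the M-step summarized in Section 3 of Chapter Five can be sketched for the simplest case, the unrestricted model with only a between and a within covariance matrix and groups of unequal size. This is a minimal sketch under stated assumptions, not the program used in this study: the function name and arguments are hypothetical, the grand mean is treated as an ordinary parameter updated in the M-step rather than as a random vector with a flat prior, identity matrices serve as starting values, and the largest absolute change in any element is used as the stopping rule.

```python
import numpy as np

def em_unrestricted(y, groups, max_iter=500, tol=0.01):
    """EM sketch for y_ij = mu + a_i + e_ij with unequal group sizes.

    y: (N, p) observations; groups: (N,) integer labels 0..k-1, none empty.
    Returns mu, the between matrix, the within matrix, and iterations used.
    """
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    N, p = y.shape
    n = np.bincount(groups)                       # group sizes n_i
    k = n.size
    ybar = np.vstack([y[groups == i].mean(axis=0) for i in range(k)])
    mu = y.mean(axis=0)
    sigma_a = np.eye(p)                           # assumed starting values
    sigma_e = np.eye(p)
    for iteration in range(1, max_iter + 1):
        inv_a = np.linalg.inv(sigma_a)
        inv_e = np.linalg.inv(sigma_e)
        # E-step: conditional dispersion and expectation of each group effect
        v = np.array([np.linalg.inv(inv_a + n_i * inv_e) for n_i in n])
        a_hat = np.array([v[i] @ (n[i] * inv_e @ (ybar[i] - mu))
                          for i in range(k)])
        # M-step: complete-data maximum likelihood equations
        new_mu = (n[:, None] * (ybar - a_hat)).sum(axis=0) / N
        new_a = (a_hat.T @ a_hat + v.sum(axis=0)) / k
        resid = y - new_mu - a_hat[groups]
        new_e = (resid.T @ resid + np.einsum("i,ipq->pq", n, v)) / N
        change = max(np.abs(new_a - sigma_a).max(),
                     np.abs(new_e - sigma_e).max())
        mu, sigma_a, sigma_e = new_mu, new_a, new_e
        if change < tol:                          # assumed convergence rule
            break
    return mu, sigma_a, sigma_e, iteration
```

Because each update is an average of outer products plus positive definite dispersion matrices, the estimates in this sketch stay positive definite at every iteration, in contrast to the moment estimators of Chapter Three.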
The distribution of subjects across the groups in the unbalanced case was as follows: 20% of the groups included 10 subjects within each group, 20% had 20 subjects, 20% had 30 subjects, 20% had 40 subjects and, finally, the last 20% had 50 subjects.

The estimates of the balanced and unbalanced samples were both studied while varying two factors, namely the number of groups in the sample (the size) and the particular model being estimated (i.e. the unrestricted model, the correctly restricted model and the incorrectly restricted model).

The size of the sample consisted of two levels. The small sample consisted of 25 groups and the large sample of 100 groups. The difference in the number of classes gave an indication of the properties of the EM Algorithm when the sample size varied from 25 groups with 750 subjects to 100 groups with 3000 subjects.

Each of the three models studied contained different sets of parameters to be estimated. The first set involved the two covariance components for the simple multivariate random model (Unrestricted model): $\Sigma_a$, the between groups covariance matrix, and $\Sigma$, the within groups covariance matrix. The Unrestricted model gave estimates of the between and within covariance matrices of the multivariate random model.

The second set consisted of $\Phi_a$, the latent group covariance matrix, $\Omega$, the latent individual covariance matrix, and $\Psi$, the error covariance matrix from the latent multivariate model. These parameters were derived from a latent measurement model based on the multivariate random model. The latent group covariance matrix, $\Phi_a$, and the latent individual covariance matrix, $\Omega$, were allowed to be full rank (i.e. covariances were not constrained to zero) while the error covariance matrix was constrained to a diagonal matrix.

The last set of parameters was from an incorrectly specified latent multivariate random model.
The parameters included were $\Phi_a$, the latent group covariance matrix, $\Omega$, the latent individual covariance matrix, and $\Psi$, the error covariance matrix. All three matrices were restricted to diagonal matrices. Applied to data from a population in which the parameter matrices contained non-zero covariance terms, this model demonstrated the reaction of the EM algorithm to incorrectly specified models.

Figure 1 is a diagram of the design. There were 12 cells, each containing 10 replications.

FIGURE 1
Design of Study

                              25 Classes               100 Classes
Model                    Balanced   Unbalanced     Balanced   Unbalanced
Unrestricted                 a          b              c          d
Correctly Specified          e          f              g          h
Incorrectly Specified        i          j              k          l

1. Each cell contains 10 repetitions (different sets of data).
2. Cells a through f contain comparable datasets; the same can be said for datasets g through l.

2. Generation of Data.

Implementation of the experimental design required a method of creating samples drawn from a population of known parameters. The data had to fit the assumptions for the multivariate random latent model specified in Chapter Five. The values of the parameters chosen for this example are listed in Table 1.

Each subject's four observed scores, $Y_{ij}$, were a combination of three latent group effects, $\theta_i$, three latent individual effects, $\omega_{ij}$, four measurement errors, $\varepsilon_{ij}$, and four grand means, $\mu$. The most direct way to generate a dataset of observed values containing these characteristics is to create four separate vectors, one for each effect, for each subject and then to create the observed values through the equation

(6.1)  $Y_{ij} = \mu + \Lambda_a \theta_i + \Lambda \omega_{ij} + \varepsilon_{ij}$

This is the equation from the random latent model in Chapter Five. Unlike the other three vectors, the grand mean, $\mu$, will be identical for all subjects.
Each vector is representative of a sample vector from a normal distribution with mean zero and variance-covariance matrix as shown in Table 1.

The SAS package contains a subroutine which generates independent values from a univariate normal distribution with mean of zero and variance of one. By repeated applications three vectors of dimensions 3 x 1, 3 x 1 and 4 x 1 were created for each subject. The vectors $X(\theta)$, $X(\omega)$ and $X(\varepsilon)$ each constitute a random sample of values drawn from multivariate normal distributions having means of 0 and identity covariance matrices.

Table 1
Parameter Values Used in Data Generation

The dimension of the observed variables is four (p = 4). The dimension of the latent group variables is three (r = 3) and the dimension of the latent individual variables is three (s = 3).

The pattern matrices connecting the latent to the observed variables are $\Lambda_a$ and $\Lambda$. [The entries of the two pattern matrices are not legible in the source.]

The latent error, individual and group matrices are:

PH = $\Phi_a = \begin{pmatrix} 64 & 8 & 40 \\ 8 & 5 & 7 \\ 40 & 7 & 107 \end{pmatrix}$

PS = $\Psi$ [The entries of the error matrix are not legible in the source.]

OM = $\Omega = \begin{pmatrix} 25 & 10 & 15 \\ 10 & 20 & 10 \\ 15 & 10 & 35 \end{pmatrix}$

The expected values of the grand means of the four observed variables were set to zero ($\mu = 0$).

Sample data from a population with a given parameter matrix were obtained by multiplying each vector by the Cholesky factor of the known parameter matrix. The final sample vectors were

$X^*(\theta) = T(\Phi_a) X(\theta)$
$X^*(\omega) = T(\Omega) X(\omega)$
$X^*(\varepsilon) = T(\Psi) X(\varepsilon)$

where $T(\Phi_a)$ is the Cholesky factor of $\Phi_a$, $T(\Omega)$ is the Cholesky factor of $\Omega$ and $T(\Psi)$ is the Cholesky factor of $\Psi$.

By using $X^*(\theta)$, $X^*(\omega)$, $X^*(\varepsilon)$ and $\mu$ together as shown in equation 6.1, an observed sample data set from a population of known latent parameters was created. Each data set was created through the SAS normal random generation procedure using different seed numbers.
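The generation scheme just described (independent standard normal draws, multiplication by a Cholesky factor, then combination through equation 6.1) can be sketched as follows. The study itself used SAS; this NumPy version is only an illustration, and the function name, argument order and every numerical value in the usage lines are placeholders rather than the values of Table 1.

```python
import numpy as np

def generate_dataset(group_sizes, mu, lam_a, lam, phi_a, omega, psi, seed=0):
    """Draw observed scores Y_ij = mu + lam_a theta_i + lam omega_ij + eps_ij."""
    rng = np.random.default_rng(seed)
    # Lower triangular Cholesky factors T with T T' equal to the parameter matrix
    t_phi = np.linalg.cholesky(phi_a)
    t_omega = np.linalg.cholesky(omega)
    t_psi = np.linalg.cholesky(psi)
    groups = np.repeat(np.arange(len(group_sizes)), group_sizes)
    n_total = groups.size
    # Rows of standard normal draws times T' have covariance T T'
    theta = rng.standard_normal((len(group_sizes), phi_a.shape[0])) @ t_phi.T
    omega_ij = rng.standard_normal((n_total, omega.shape[0])) @ t_omega.T
    eps = rng.standard_normal((n_total, psi.shape[0])) @ t_psi.T
    y = mu + theta[groups] @ lam_a.T + omega_ij @ lam.T + eps
    return y, groups

# Placeholder parameter values for illustration only (not those of Table 1).
lam_demo = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0],
                     [1.0, 1.0, 1.0]])
phi_demo = np.array([[4.0, 1.0, 0.0],
                     [1.0, 3.0, 1.0],
                     [0.0, 1.0, 2.0]])
y_demo, groups_demo = generate_dataset(
    group_sizes=[10, 20, 30, 40, 50],
    mu=np.zeros(4),
    lam_a=lam_demo,
    lam=lam_demo,
    phi_a=phi_demo,
    omega=2.0 * np.eye(3),
    psi=np.diag([1.0, 2.0, 3.0, 4.0]),
    seed=123,
)
```

Changing the seed argument plays the role of the different seed numbers used for each replication.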
The data sets were then used by a computer program to estimate the values of the parameters.

3. Starting Values of Parameter Estimates

The EM algorithm requires starting values for each parameter being estimated. The computer program written for this study estimates two sets of parameters: first the unrestricted between and within covariance matrices for the observed data, then, in turn, the parameters of a more restricted latent model. The calculations of the starting values for the parameters of the latent model are based on the final estimates of the unrestricted between and within covariance matrices.

Starting values for the between and within variance-covariance matrices were estimated from equations 6.2 and 6.3 as developed by Schmidt (1971).

(6.2)  $\hat{\Sigma} = [n/(n-1)] S$

(6.3)  $\hat{\Sigma}_a = \frac{1}{n} ( S_a - [n/(n-1)] S )$

These estimates are maximum likelihood estimators under the case of equal group size. These equations are used as starting values for an unbalanced design with one modification, replacing n by its harmonic mean. In a balanced design the use of the harmonic n will give the maximum likelihood estimates. When the design is unbalanced, the harmonic n will give weighted starting values for the parameters.

After $\Sigma$ and $\Sigma_a$ have been estimated, starting values for the parameters of the latent model, $\Phi_a$, $\Omega$ and $\Psi$, are found. Assuming $\Sigma_a = \Lambda_a \Phi_a \Lambda_a'$ and $\Sigma = \Lambda \Omega \Lambda' + \Psi$, then

(6.4)  $\hat{\Phi}_a = ( \Lambda_a' \Lambda_a )^{-1} \Lambda_a' \hat{\Sigma}_a \Lambda_a ( \Lambda_a' \Lambda_a )^{-1}$

(6.5)  $\hat{\Omega} = ( \Lambda' \Lambda )^{-1} \Lambda' \hat{\Sigma} \Lambda ( \Lambda' \Lambda )^{-1}$

(6.6)  $\hat{\Psi} = \hat{\Sigma}_a - \Lambda_a \hat{\Phi}_a \Lambda_a' + \hat{\Sigma} - \Lambda \hat{\Omega} \Lambda'$

These values form the starting estimates of the parameters for the latent model.

4. The Computer Program.

The computer program was written using the SAS procedure "Proc Matrix".

1. Necessary Input: Y, an N x p matrix for p measures on N individuals; $\Lambda_a$ and $\Lambda$, which are pattern matrices connecting the underlying latent variables with the observed-level variables; and a K x 2 matrix which specifies the number of students in each group.

2. Next the program creates two new matrices: YM, a K x p matrix of means for each group, and SS, a Kp x p matrix containing the sum of squares for each group.

Estimate the parameters of a simple random model.

3. Using Schmidt's Maximum Likelihood Equations 6.2 and 6.3, the program then estimates starting values for $\Sigma_a$ and $\Sigma$.

E-step

4. Using the equations given above, the program next estimates the conditional variables W, $Q_i$, $\mu^*$ and $\theta^*$.

M-step

5. Using these values, the program then recalculates estimates for the parameters $\Sigma_a$ and $\Sigma$.

6. A reiteration then occurs between steps 4 and 5 until the changes in $\Sigma_a$ and $\Sigma$ are less than 0.01.

Estimate the parameters of the latent Restricted Model.

7. The next step in the program computes the starting estimates of $\Phi_a$, $\Omega$ and $\Psi$ from the two parameter matrices $\Sigma_a$ and $\Sigma$ in step 5 and estimates the sufficient statistics W, $Q_i$, M, $\theta^*$ and $\mu^*$ from these values (E-step).

M-step

8. Reestimated values for the parameters $\Phi_a$, $\Omega$ and $\Psi$ are then obtained and, finally, the program iterates between steps 7 and 8 until the changes in the parameters $\Phi_a$, $\Omega$ and $\Psi$ are less than 0.01.

CHAPTER VII: RESULTS

1. Design and Measures.

The EM Algorithm's ability to estimate covariance components in both balanced and unbalanced latent multivariate random effects models was demonstrated by estimating the parameters of independent samples generated from a population with a known distribution. The balanced samples contained 30 subjects for each group, and the unbalanced samples averaged 30 subjects per group.

The estimates of the balanced and unbalanced samples were studied across two dimensions, namely the number of groups in the sample and the type of model being estimated (i.e. the unrestricted model, the correctly specified latent model and an incorrectly specified latent model). Twenty elements were estimated in the unrestricted model, 10 for the Phi matrix and 10 for the Psi matrix.
Sixteen elements were estimated in the correctly specified model: six for the Ph matrix, six for the Om matrix and four for the Ps matrix. The incorrectly specified model differed from the correctly specified model only in the number of matrix elements being estimated. Only the 10 diagonal elements were estimated: three for the Ph matrix, three for the Om matrix and four for the Ps matrix.

Tables 2, 3 and 4 contain descriptive statistics of the estimates of the individual elements of the covariance matrices for the three models over 10 repetitions under the different conditions. These tables include the Expected Value of the parameter (E), the Mean, the Standard Deviation (SD), the Bias, the Mean Square Error (MSE), the Bias divided by the parameter's Expected Value (B/E), Ratio(1) and Ratio(2). The Mean and Standard Deviation are self-explanatory. The Bias is the sample mean of the estimates minus the parameter's Expected Value. The Mean Square Error (MSE) is the average squared deviation of the parameter estimates around their Expected Value. The ratio of the Bias to the Expected Value (B/E) of a variance or covariance term expresses the Bias as a proportion of the parameter's Expected Value, giving it a relative value. Ratio(1) is the MSE of the unbalanced sample estimate minus the MSE of the corresponding balanced sample estimate, divided by the MSE of the balanced sample estimate; it compares the MSEs of the two types of datasets. Ratio(2) is the MSE of an element of the incorrectly specified model minus the MSE of the same element of the correctly specified model, divided by the MSE of the element of the correctly specified model; it compares the precision of the two models. The three tables contain the lower triangular elements of the different covariance matrices.
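The summary statistics defined above can be sketched in a few lines of Python. This is only an illustrative sketch, not the SAS program used in the study: the function names `summarize` and `relative_mse_change` are hypothetical, and dividing the squared deviations by the number of replications is an assumption, since the text does not state the divisor. Both ratios are computed with the balanced (or correctly specified) MSE as the reference, which is the form that reproduces the Ph(1) entries of Table 2.

```python
from statistics import mean, stdev


def summarize(estimates, expected):
    """Summary statistics for replicated estimates of a single parameter."""
    m = mean(estimates)
    bias = m - expected  # Bias: mean of the estimates minus the Expected Value
    # Assumption: squared deviations about the Expected Value are averaged
    # over the number of replications; the source does not give the divisor.
    mse = sum((e - expected) ** 2 for e in estimates) / len(estimates)
    return {
        "mean": m,
        "sd": stdev(estimates),       # sample SD (n - 1 divisor)
        "bias": bias,
        "mse": mse,
        "b_over_e": bias / expected,  # B/E: bias relative to the Expected Value
    }


def relative_mse_change(mse_reference, mse_other):
    """(MSE(other) - MSE(reference)) / MSE(reference); the form of both ratios."""
    return (mse_other - mse_reference) / mse_reference


# MSEs of Ph(1) from Table 2 (100 groups).
ratio_1 = relative_mse_change(87.449, 115.906)  # balanced vs. unbalanced
ratio_2 = relative_mse_change(87.449, 86.653)   # correct vs. incorrect model
print(round(ratio_1, 3), round(ratio_2, 3))     # prints 0.325 -0.009
```

A positive ratio means that the second condition (unbalanced data, or the incorrectly specified model) has the larger MSE.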
In Tables 2 and 3, the elements of the latent matrix at the group level, $\Phi_\alpha$, are labeled Ph(X), the elements of the latent individual level matrix, $\Phi$, are labeled Om(X), and the elements of the error matrix from the latent models, $\Psi$, are labeled Ps(X). The (X) corresponds to the element's position in the lower triangle. The latent covariance matrices would be

Ph( ) =  1            Om( ) =  1            Ps( ) =  1
         2 3                   2 3                     2
         4 5 6                 4 5 6                     3
                                                           4

Ph(3) is the variance of the second latent measure and Ph(5) is the covariance between the second and third elements of the Ph matrix. Ps(2) would be the variance of the second observed measure.

The elements of the unrestricted model are similarly labeled. The elements of the between-group covariance matrix, $\Sigma_\alpha$, are labeled Phi(X), while the elements of the within-group covariance matrix, $\Sigma$, are labeled Psi(X). They have the same dimension as the error matrix of the latent model, 4 x 4, but include the six covariance terms in their estimates.

Tables 5 through 7 contain aggregated statistics for each matrix in the latent models under the different conditions. Table 8 lists the Maximum Likelihood Ratio test of the estimates of the correctly and incorrectly specified latent models. Tables 9 and 10 list aggregated statistics for each matrix in the unrestricted models under the different conditions. Table 11 has information about the iterations necessary for the algorithm to converge.

With only 10 repetitions per cell, the power of any statistical test would be low. Although some characteristics of the estimation procedure may appear with a sample of this size, it should be recalled that this was just a demonstration of the use of the EM algorithm for an unbalanced latent model under different circumstances and not a statistical study.

2. Results of Estimation Procedure.

Table 2 contains information about the estimates of the parameters for four different conditions when the datasets comprise 100 groups.
The four conditions are: (1) applying the correctly specified model to the balanced samples; (2) applying the correctly specified model to the unbalanced samples; (3) applying the incorrectly specified model to the balanced samples; and (4) applying the incorrectly specified model to the unbalanced samples. The items in this table represent the statistics for cells g, h, k and l in Figure 1.

The B/E in Table 2 indicates the relative bias of the estimates. The correctly specified latent model had values of B/E ranging from -0.9% to -13.7% for the estimates of the elements of the Ph matrix for the balanced data and 15.1% to -11.5% for the unbalanced data. The B/E of the estimates of the elements of the Om matrix ranged from 8.7% to -1.3% for the balanced data and from 10.9% to -9.8% for the unbalanced data, and the Ps matrix had B/E values ranging from 8.7% to -27.4% for the balanced data and 5.2% to -41.5% for the unbalanced data. In the incorrectly specified model, the estimates of the elements of the Ph matrix, the variance components, had B/E values almost identical to those of the corresponding elements in the correctly specified model.
The B/E of the estimates of the elements of the Om matrix ranged from 6.4% to -10.4% for the balanced data and 6.6% to -10.2% for the unbalanced data, and the Ps matrix had B/E values ranging from 142% to -66% for the balanced data and 141% to -66% for the unbalanced data.

TABLE 2
Summary Statistics of the Latent Models for Balanced and Unbalanced Data Sets with 100 Groups

                   Expected   Balance Unbalance   Balance Unbalance
                                                 Diagonal  Diagonal
                                  (g)       (h)       (k)       (l)
Ph Matrix
Ph(1)   Mean         64.000    63.400    62.960    63.440    63.240
        SD                      9.330    10.710     9.290    10.530
        MSE                    87.449   115.906    86.653   111.523
        Bias                   -0.600    -1.040    -0.560    -0.760
        B/E                    -0.009    -0.016    -0.009    -0.012
        Ratio 1                           0.325               0.287
        Ratio 2                                    -0.009    -0.038
Ph(2)   Mean          8.000     7.600     9.210     0.000     0.000
        SD                      1.880     2.090
        MSE                     3.712     5.995
        Bias                   -0.400     1.210
        B/E                    -0.050     0.151
        Ratio 1                           0.615
        Ratio 2
Ph(3)   Mean          5.000     4.550     4.450     4.550     4.450
        SD                      0.640     0.600     0.650     0.560
        MSE                     0.635     0.696     0.648     0.650
        Bias                   -0.450    -0.550    -0.450    -0.550
        B/E                    -0.090    -0.110    -0.090    -0.110
        Ratio 1                           0.097               0.003
        Ratio 2                                     0.020    -0.067
Ph(4)   Mean         40.000    39.060    38.720     0.000     0.000
        SD                      6.270     5.610
        MSE                    40.295    33.293
        Bias                   -0.940    -1.280
        B/E                    -0.023    -0.032
        Ratio 1                          -0.174
        Ratio 2
Ph(5)   Mean          7.000     6.040     6.560     0.000     0.000
        SD                      3.210     3.360
        MSE                    11.328    11.505
        Bias                   -0.960    -0.440
        B/E                    -0.137    -0.063
        Ratio 1                           0.016
        Ratio 2
Ph(6)   Mean        107.000    98.420    99.560    98.190    99.210
        SD                     14.980    12.760    15.470    13.060
        MSE                   306.196   224.322   325.561   237.990
        Bias                   -8.580    -7.440    -8.810    -7.790
        B/E                    -0.080    -0.070    -0.082    -0.073
        Ratio 1                          -0.267              -0.269
        Ratio 2                                     0.063     0.061
Om Matrix
Om(1)   Mean         25.000    25.580    25.900    26.600    26.640
        SD                      0.880     0.990     1.000     0.960
        MSE                     1.148     1.880     3.844     3.910
        Bias                    0.580     0.900     1.600     1.640
        B/E                     0.023     0.036     0.064     0.066
        Ratio 1                           0.637               0.017
        Ratio 2                                     2.348     1.080
Om(2)   Mean         10.000    10.870    11.090     0.000     0.000
        SD                      0.830     0.900
        MSE                     1.530     2.130
        Bias                    0.870     1.090
        B/E                     0.087     0.109
        Ratio 1                           0.392
        Ratio 2
Om(3)   Mean         20.000    19.740    19.990    18.780    18.750
        SD                      0.830     0.830     0.710     0.650
        MSE                     0.764     0.689     2.158     2.159
        Bias                   -0.260    -0.010    -1.220    -1.250
        B/E                    -0.013    -0.001    -0.061    -0.063
        Ratio 1                          -0.098               0.000
        Ratio 2                                     1.824     2.133
Om(4)   Mean         15.000    15.020    14.840     0.000     0.000
        SD                      1.300     1.410
        MSE                     1.690     2.017
        Bias                    0.020    -0.160
        B/E                     0.001    -0.011
        Ratio 1                           0.193
        Ratio 2
Om(5)   Mean         10.000    10.390     9.020     0.000     0.000
        SD                      3.710     4.140
        MSE                    13.933    18.207
        Bias                    0.390    -0.980
        B/E                     0.039    -0.098
        Ratio 1                           0.307
        Ratio 2
Om(6)   Mean         35.000    36.300    36.370    31.370    31.440
        SD                      2.900     3.180     0.670     0.690
        MSE                    10.288    12.198    15.090    14.558
        Bias                    1.300     1.370    -3.630    -3.560
        B/E                     0.037     0.039    -0.104    -0.102
        Ratio 1                           0.186              -0.035
        Ratio 2                                     0.467     0.193
Ps Matrix
Ps(1)   Mean          5.000     3.630     5.260    12.120    12.070
        SD                      4.410     5.460     0.460     0.470
        MSE                    21.534    29.887    56.539    55.760
        Bias                   -1.370     0.260     7.120     7.070
        B/E                    -0.274     0.052     1.424     1.414
        Ratio 1                           0.388              -0.014
        Ratio 2                                     1.626     0.866
Ps(2)   Mean          6.000     5.250     3.510     2.070     2.050
        SD                      3.650     4.170     0.260     0.250
        MSE                    13.948    24.278    17.229    17.399
        Bias                   -0.750    -2.490    -3.930    -3.950
        B/E                    -0.125    -0.415    -0.655    -0.658
        Ratio 1                           0.741               0.010
        Ratio 2                                     0.235    -0.283
Ps(3)   Mean         11.000    12.280    11.530     6.260     6.250
        SD                      3.060     3.540     0.570     0.550
        MSE                    11.184    12.844    25.289    25.372
        Bias                    1.280     0.530    -4.740    -4.750
        B/E                     0.116     0.048    -0.431    -0.432
        Ratio 1                           0.148               0.003
        Ratio 2                                     1.261     0.975
Ps(4)   Mean         12.000    13.040    13.890    15.390    15.480
        SD                      2.120     1.960     0.610     0.610
        MSE                     5.696     7.811    13.141    13.828
        Bias                    1.040     1.890     3.390     3.480
        B/E                     0.087     0.158     0.283     0.290
        Ratio 1                           0.371               0.052
        Ratio 2                                     1.307     0.770

Table 3 contains information about the estimates of the parameters for the four conditions in Table 2 when the datasets comprise 25 groups. Statistics for cells e, f, i and j in Figure 1 are presented in this table.
The B/E values of the corresponding items in Table 3 show higher percentages of bias than those in Table 2. The correctly specified latent model had values of B/E ranging from -6.6% to -33.9% for the estimates of the elements of the Ph matrix for the balanced data and -0.6% to -33.9% for the unbalanced data. The B/E of the estimates of the elements of the Om matrix ranged from 7.9% to 1.1% for the balanced data and 5.7% to -1.7% for the unbalanced data, and the Ps matrix had B/E values ranging from 14.3% to -43.6% for the balanced data and 3.7% to -13.0% for the unbalanced data. In the incorrectly specified model, the estimates of the elements of the Ph matrix, the variance components, had B/E values almost identical to those of the corresponding elements in the correctly specified model. The B/E of the estimates of the elements of the Om matrix ranged from 5.1% to -8.0% for the balanced data and 4.9% to -8.4% for the unbalanced data, and the Ps matrix had B/E values ranging from 136% to -65% for the balanced data and 136% to -64% for the unbalanced data.

Table 4 contains information about the estimates of the parameters of the unrestricted model under the following conditions: (1) for balanced samples containing 100 groups; (2) for unbalanced samples containing 100 groups; (3) for balanced samples containing 25 groups; and (4) for unbalanced samples containing 25 groups.
Although the unrestricted model was a linear combination of the latent variance and covariance components of the correctly specified model, it was directly estimated from the data. For the estimates from data containing 100 groups, B/E ranged from 9.6% to -1.6% for the balanced data and 7.1% to -12.7% for the unbalanced data. The estimates from the datasets containing 25 groups had B/E ranging from 23.3% to -11.0% for the balanced data and 36.4% to -2.9% for the unbalanced data.

TABLE 3
Summary Statistics of the Latent Models for Balanced and Unbalanced Data Sets with 25 Groups

                   Expected   Balance Unbalance   Balance Unbalance
                                                 Diagonal  Diagonal
                                  (e)       (f)       (i)       (j)
Ph Matrix
Ph(1)   Mean         64.000    59.750    63.620    59.890    64.070
        SD                     16.730    20.280    16.790    16.830
        MSE                   299.962   411.439   300.673   283.254
        Bias                   -4.250    -0.380    -4.110     0.070
        B/E                    -0.066    -0.006    -0.064     0.001
        Ratio 1                           0.372              -0.058
        Ratio 2                                     0.002    -0.312
Ph(2)   Mean          8.000     6.260     7.290     0.000     0.000
        SD                      3.050     3.520
        MSE                    12.667    12.951
        Bias                   -1.740    -0.710
        B/E                    -0.218    -0.089
        Ratio 1                           0.022
        Ratio 2
Ph(3)   Mean          5.000     4.040     4.400     4.060     4.130
        SD                      1.580     1.530     1.580     2.080
        MSE                     3.520     2.741     3.478     5.167
        Bias                   -0.960    -0.600    -0.940    -0.870
        B/E                    -0.192    -0.120    -0.188    -0.174
        Ratio 1                          -0.221               0.486
        Ratio 2                                    -0.012     0.885
Ph(4)   Mean         40.000    28.940    33.070     0.000     0.000
        SD                     14.760    18.120
        MSE                   353.773   381.695
        Bias                  -11.060    -6.930
        B/E                    -0.277    -0.173
        Ratio 1                           0.079
        Ratio 2
Ph(5)   Mean          7.000     4.660     4.630     0.000     0.000
        SD                      4.160     6.170
        MSE                    23.390    44.310
        Bias                   -2.340    -2.370
        B/E                    -0.334    -0.339
        Ratio 1                           0.894
        Ratio 2
Ph(6)   Mean        107.000    77.450    78.720    77.910    78.770
        SD                     22.380    26.990    23.190    27.490
        MSE                  1471.089  1617.081  1478.030  1641.181
        Bias                  -29.550   -28.280   -29.090   -28.230
        B/E                    -0.276    -0.264    -0.272    -0.264
        Ratio 1                           0.099               0.110
        Ratio 2                                     0.005     0.015
Om Matrix
Om(1)   Mean         25.000    25.380    25.580    26.270    26.220
        SD                      1.850     1.680     2.130     2.160
        MSE                     3.583     3.196     6.329     6.319
        Bias                    0.380     0.580     1.270     1.220
        B/E                     0.015     0.023     0.051     0.049
        Ratio 1                          -0.108              -0.002
        Ratio 2                                     0.766     0.977
Om(2)   Mean         10.000    10.770    10.540     0.000     0.000
        SD                      1.790     1.540
        MSE                     3.863     2.696
        Bias                    0.770     0.540
        B/E                     0.077     0.054
        Ratio 1                          -0.302
        Ratio 2
Om(3)   Mean         20.000    21.010    21.150    20.060    20.430
        SD                      2.400     2.310     2.230     2.400
        MSE                     6.893     6.806     4.977     5.965
        Bias                    1.010     1.150     0.060     0.430
        B/E                     0.051     0.057     0.003     0.021
        Ratio 1                          -0.013               0.199
        Ratio 2                                    -0.278    -0.123
Om(4)   Mean         15.000    15.170    14.970     0.000     0.000
        SD                      1.230     1.610
        MSE                     1.545     2.593
        Bias                    0.170    -0.030
        B/E                     0.011    -0.002
        Ratio 1                           0.678
        Ratio 2
Om(5)   Mean         10.000    10.500     9.830     0.000     0.000
        SD                      2.830     3.900
        MSE                     8.287    15.242
        Bias                    0.500    -0.170
        B/E                     0.050    -0.017
        B²/MSE                  0.034     0.002
        Ratio 1                           0.839
        Ratio 2
Om(6)   Mean         35.000    37.780    35.910    32.190    32.070
        SD                      5.340     2.010     2.420     2.370
        MSE                    37.103     4.960    14.630    15.156
        Bias                    2.780     0.910    -2.810    -2.930
        B/E                     0.079     0.026    -0.080    -0.084
        Ratio 1                          -0.866               0.036
        Ratio 2                                    -0.606     2.055
Ps Matrix
Ps(1)   Mean          5.000     2.820     4.350    11.810    11.820
        SD                      4.250     4.700     1.450     1.430
        MSE                    23.343    22.559    53.632    53.725
        Bias                   -2.180    -0.650     6.810     6.820
        B/E                    -0.436    -0.130     1.362     1.364
        Ratio 1                          -0.034               0.002
        Ratio 2                                     1.298     1.382
Ps(2)   Mean          6.000     5.260     5.380     2.100     2.190
        SD                      3.560     3.560     1.190     1.230
        MSE                    13.282    13.101    18.316    17.642
        Bias                   -0.740    -0.620    -3.900    -3.810
        B/E                    -0.123    -0.103    -0.650    -0.635
        Ratio 1                          -0.014              -0.037
        Ratio 2                                     0.379     0.347
Ps(3)   Mean         11.000    12.570    11.410     6.430     6.520
        SD                      1.960     2.810     1.020     0.980
        MSE                     6.580     8.083    24.246    23.261
        Bias                    1.570     0.410    -4.570    -4.480
        B/E                     0.143     0.037    -0.415    -0.407
        Ratio 1                           0.228              -0.041
        Ratio 2                                     2.685     1.878
Ps(4)   Mean         12.000    12.280    11.830    14.060    13.880
        SD                      2.330     1.290     1.220     0.430
        MSE                     5.516     1.696     6.204     4.112
        Bias                    0.280    -0.170     2.060     1.880
        B/E                     0.023    -0.014     0.172     0.157
        Ratio 1                          -0.692              -0.337
        Ratio 2                                     0.125     1.424

TABLE 4
Summary Statistics of the Unrestricted Models for Balanced and Unbalanced Data Sets for Both 25 and 100 Groups

                                  100       100        25        25
                               Groups    Groups    Groups    Groups
                   Expected   Balance Unbalance   Balance Unbalance
                                  (c)       (d)       (a)       (b)
Phi Matrix
Phi(1)  Mean        143.500   141.962   141.230   127.691   139.317
        SD                     16.359    16.865    34.203    44.422
        MSE                   270.245   290.154  1447.539  1992.756
        Bias                   -1.538    -2.270   -15.809    -4.183
        B/E                    -0.011    -0.016    -0.110    -0.029
        Ratio 1                           0.074               0.377
        Ratio 2                                     4.356     5.868
Phi(2)  Mean         46.500    48.521    47.472    51.845    56.904
        SD                     12.099    13.831    20.467    23.513
        MSE                   150.924   192.346   450.641   673.131
        Bias                    2.021     0.972     5.345    10.404
        B/E                     0.043     0.021     0.115     0.224
        Ratio 1                           0.274               0.494
        Ratio 2                                     1.986     2.500
Phi(3)  Mean         32.500    35.633    34.445    39.816    44.156
        SD                     11.939    12.684    19.156    21.152
        MSE                   153.446   165.087   426.423   598.365
        Bias                    3.133     1.945     7.316    11.656
        B/E                     0.096     0.060     0.225     0.359
        Ratio 1                           0.076               0.403
        Ratio 2                                     1.779     2.625
Phi(4)  Mean         49.500    49.980    49.060    53.453    53.344
        SD                      7.474     7.330     8.875     9.751
        MSE                    56.117    53.944    96.128   111.500
        Bias                    0.480    -0.440     3.953     3.844
        B/E                     0.010    -0.009     0.080     0.078
        Ratio 1                          -0.039               0.160
        Ratio 2                                     0.713     1.067
Phi(5)  Mean        129.500   128.490   128.139   116.448   126.424
        SD                     14.652    15.129    33.778    42.182
        MSE                   215.815   230.945  1330.236  1789.834
        Bias                   -1.010    -1.361   -13.052    -3.076
        B/E                    -0.008    -0.011    -0.101    -0.024
        Ratio 1                           0.070               0.346
        Ratio 2                                     5.164     6.750
Phi(6)  Mean         56.500    55.834    55.339    59.803    60.973
        SD                      9.087     8.879     9.528    10.791
        MSE                    83.066    80.334   102.905   138.676
        Bias                   -0.666    -1.161     3.303     4.473
        B/E                    -0.012    -0.021     0.058     0.079
        Ratio 1                          -0.033               0.348
        Ratio 2                                     0.239     0.726
Phi(7)  Mean        120.500   119.725   119.638   109.674   118.120
        SD                     13.156    13.843    33.908    40.686
        MSE                   173.748   192.454  1279.977  1661.644
        Bias                   -0.775    -0.862   -10.826    -2.380
        B/E                    -0.006    -0.007    -0.090    -0.020
        Ratio 1                           0.108               0.298
        Ratio 2                                     6.367     7.634
Phi(8)  Mean         39.500    41.359    41.109    45.143    49.719
        SD                     10.461    12.173    21.005    23.745
        MSE                   113.272   151.058   476.592   679.856
        Bias                    1.859     1.609     5.643    10.219
        B/E                     0.047     0.041     0.143     0.259
        Ratio 1                           0.334               0.426
        Ratio 2                                     3.207     3.501
Phi(9)  Mean         30.500    33.114    32.664    37.596    41.590
        SD                     10.240    10.916    19.657    21.313
        MSE                   112.450   124.362   442.346   590.897
        Bias                    2.614     2.164     7.096    11.090
        B/E                     0.086     0.071     0.233     0.364
        Ratio 1                           0.106               0.336
        Ratio 2                                     2.934     3.751
Phi(10) Mean         47.500    46.748    47.373    51.063    50.635
        SD                      6.523     6.079     8.626     9.364
        MSE                    43.178    36.972    88.513    98.605
        Bias                   -0.752    -0.127     3.563     3.135
        B/E                    -0.016    -0.003     0.075     0.066
        Ratio 1                          -0.144               0.114
        Ratio 2                                     1.050     1.667
Psi Matrix
Psi(1)  Mean         73.750    73.985    74.058    73.822    73.404
        SD                      1.682     1.659     4.438     4.283
        MSE                     2.890     2.858    19.702    18.477
        Bias                    0.235     0.308     0.072    -0.346
        B/E                     0.003     0.004     0.001    -0.005
        Ratio 1                          -0.011              -0.062
        Ratio 2                                     5.816     5.466
Psi(2)  Mean         31.250    31.613    31.616    31.392    31.609
        SD                      1.118     1.095     3.445     3.387
        MSE                     1.396     1.348    11.890    11.615
        Bias                    0.363     0.366     0.142     0.359
        B/E                     0.012     0.012     0.005     0.011
        Ratio 1                          -0.035              -0.023
        Ratio 2                                     7.515     7.617
Psi(3)  Mean          6.250     6.533     6.513     6.341     6.817
        SD                      0.609     0.576     2.235     2.399
        MSE                     0.460     0.409     5.004     6.112
        Bias                    0.283     0.263     0.091     0.567
        B/E                     0.045     0.042     0.015     0.091
        Ratio 1                          -0.111               0.221
        Ratio 2                                     9.882    13.958
Psi(4)  Mean         13.750    13.925    13.894    14.022    14.362
        SD                      0.767     0.728     1.601     1.545
        MSE                     0.622     0.553     2.645     2.803
        Bias                    0.175     0.144     0.272     0.612
        B/E                     0.013     0.010     0.020     0.045
        Ratio 1                          -0.111               0.060
        Ratio 2                                     3.251     4.069
Psi(5)  Mean         43.750    44.095    44.216    43.565    43.741
        SD                      1.396     1.323     2.512     2.201
        MSE                     2.081     1.992     6.348     4.844
        Bias                    0.345     0.466    -0.185    -0.009
        B/E                     0.008     0.011    -0.004     0.000
        Ratio 1                          -0.043              -0.237
        Ratio 2                                     2.050     1.432
Psi(6)  Mean         34.750    34.987    34.936    35.350    35.831
        SD                      0.857     0.864     2.306     1.922
        MSE                     0.797     0.785     5.718     4.992
        Bias                    0.237     0.186     0.600     1.081
        B/E                     0.007     0.005     0.017     0.031
        Ratio 1                          -0.015              -0.127
        Ratio 2                                     6.175     5.360
Psi(7)  Mean         49.750    49.935    50.034    49.971    50.328
        SD                      1.306     1.237     2.009     2.072
        MSE                     1.744     1.620     4.090     4.664
        Bias                    0.185     0.284     0.221     0.578
        B/E                     0.004     0.006     0.004     0.012
        Ratio 1                          -0.071               0.140
        Ratio 2                                     1.346     1.880
Psi(8)  Mean         16.250    16.690    16.710    15.915    16.095
        SD                      1.122     1.084     2.278     2.303
        MSE                     1.474     1.410     5.314     5.331
        Bias                    0.440     0.460    -0.335    -0.155
        B/E                     0.027     0.028    -0.021    -0.010
        Ratio 1                          -0.043               0.003
        Ratio 2                                     2.605     2.780
Psi(9)  Mean         11.250    11.354    11.308    11.681    12.121
        SD                      0.660     0.628     1.812     1.836
        MSE                     0.448     0.398     3.490     4.214
        Bias                    0.104     0.058     0.431     0.871
        B/E                     0.009     0.005     0.038     0.077
        Ratio 1                          -0.111               0.207
        Ratio 2                                     6.796     9.584
Psi(10) Mean         30.750    30.860    30.861    30.662    30.770
        SD                      0.905     0.878     1.210     1.141
        MSE                     0.832     0.785     1.473     1.302
        Bias                    0.110     0.111    -0.088     0.020
        B/E                     0.004     0.004    -0.003     0.001
        Ratio 1                          -0.058              -0.116
        Ratio 2                                     0.769     0.660

Table 5 contains the average B/E value for each matrix, for the balanced and unbalanced data, for both the correctly and incorrectly specified latent models. In the correctly specified latent model, the matrices from the large-sample data (n = 100 groups) showed lower average B/E than the matrices from the small-sample data (n = 25 groups). The Om and Ps matrices had average B/E percentages ranging between 2.9% and -4.9% versus 4.7% and -9.8%. The Ph matrices had values of -2.3% to -6.5% versus -16.5% to -22.7%. The findings regarding the incorrectly specified model were not as consistent. The B/E for the Om matrix was lower for the 25-group data in both the balanced and unbalanced datasets. The Ph and Ps matrices, on the other hand, had lower B/E for the 100-group data.

The data from the balanced design would be expected to have the best estimates with the smallest MSE. If this procedure can obtain estimates for the unbalanced design with only a small increase in the MSE, then the estimation procedure would be practical. Table 6 contains the average values of Ratio(1), the ratio of the difference between the MSE of the balanced and unbalanced samples to the MSE of the balanced sample, of each matrix of the two latent models for both sample sizes.
Table 5
Average B/E of the Latent Models for Balanced and Unbalanced Data Sets for Both 25 and 100 Groups

25 Classes
                (e)         (f)         (i)         (j)
                                  Incorrect   Incorrect
            Balance   Unbalance     Balance   Unbalance
Ph           -0.227      -0.165      -0.175      -0.146
            (0.093)     (0.121)     (0.130)     (0.130)
Om            0.047       0.024      -0.009      -0.004
            (0.029)     (0.030)     (0.047)     (0.050)
Ps           -0.098      -0.053       0.115       0.120
            (0.250)     (0.078)     (0.899)     (0.894)

100 Classes
                (g)         (h)         (k)         (l)
                                  Incorrect   Incorrect
            Balance   Unbalance     Balance   Unbalance
Ph           -0.065      -0.023      -0.060      -0.065
            (0.047)     (0.091)     (0.045)     (0.050)
Om            0.029       0.012      -0.034      -0.033
            (0.035)     (0.069)     (0.087)     (0.088)
Ps           -0.049      -0.026       0.103       0.102
            (0.185)     (0.256)     (0.936)     (0.933)

(a), ... indicates that the column of figures under the letter refers to that cell in Figure 1.

Table 6
Average Ratio(1) of the Ph, Om and Ps Matrices for the Latent Models in Data Sets for Both 25 and 100 Groups

25 Classes
                          Incorrect
          Unbalance       Unbalance
Ph            0.208           0.179
            (0.386)         (0.227)
Om            0.038           0.078
            (0.634)         (0.089)
Ps           -0.128          -0.103
            (0.395)         (0.113)

100 Classes
                          Incorrect
          Unbalance       Unbalance
Ph            0.102           0.007
            (0.326)         (0.197)
Om            0.269          -0.006
            (0.245)         (0.019)
Ps            0.412           0.013
            (0.245)         (0.028)

Ratio 1 = ( MSE(Unbalanced) - MSE(Balanced) ) / ( MSE(Balanced) )

In the correctly specified model for the samples containing 100 groups, the unbalanced samples had a greater MSE than the balanced samples in all three matrices, averaging from 10% to 41% more. For samples containing 25 groups, the two individual level matrices, Om and Ps, showed little or negative increase on average. The Ps matrix had an average drop of 13% in the MSE, while the Om matrix had only a slight increase of 3%. The Ph matrix had an average increase of 20%. The incorrectly specified model showed the same results for samples containing 25 groups. With 100 groups, its balanced samples showed very little difference in the MSE from the unbalanced samples for all three matrices.
The estimates from the balanced samples improved more than those from the unbalanced samples (in terms of MSE) as the number of groups increased.

The imposition of a structure on the data, in turn, permits the specification of incorrect models. Table 7 contains the average values of Ratio(2), the difference between the MSE of the incorrectly and correctly specified latent models divided by the MSE of the correctly specified model, of each matrix for the balanced and unbalanced samples for both sample sizes. In the samples containing 25 groups, the balanced data show little difference between the MSEs of the incorrectly and correctly specified models for the Ph and Om matrices, 0% and -4%. The Ps matrix had an increase in the MSE of 112%. The unbalanced samples had rises in the MSE of 19% for the Ph matrix, 97% for the Om matrix and 125% for the Ps matrix. For the balanced and unbalanced samples containing 100 groups, the Ph matrix showed little increase in the MSE between the correctly and incorrectly specified models, while the Om matrix increased 155% and 114%, respectively, and the Ps matrix showed increases of 111% and 58%.

Table 7
Average Ratio(2) of the Ph, Om and Ps Matrices for Balanced and Unbalanced Data Sets for Both 25 and 100 Groups

25 Classes
          Incorrect       Incorrect
            Balance       Unbalance
Ph           -0.002           0.196
            (0.006)         (0.619)
Om           -0.039           0.970
            (0.507)         (1.093)
Ps            1.121           1.258
            (1.157)         (0.647)

100 Classes
          Incorrect       Incorrect
            Balance       Unbalance
Ph            0.025          -0.015
            (0.036)         (0.067)
Om            1.546           1.135
            (0.971)         (0.971)
Ps            1.107           0.582
            (0.604)         (0.583)

Ratio 2 = ( MSE(Incorrect) - MSE(Correct) ) / ( MSE(Correct) )

The large rises in the MSE of the Ps matrices for the incorrectly specified model can be explained by reviewing Tables 2 and 3. The Ps matrix of the incorrectly specified model is very biased with a small sampling variance. It is this bias that causes the MSE to increase so greatly.
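The role of the bias can be illustrated with the textbook decomposition of mean square error into sampling variance plus squared bias. The sketch below is an illustration under stated assumptions rather than the computation used for the tables: it treats the squared SD as the variance term, and the helper name `bias_share` is hypothetical. The inputs are the Ps(1) entries for the balanced 100-group samples in Table 2.

```python
def bias_share(sd, bias):
    """Share of (SD**2 + bias**2) that is contributed by the squared bias."""
    return bias ** 2 / (sd ** 2 + bias ** 2)


# Ps(1), balanced samples with 100 groups (SD and Bias taken from Table 2).
incorrect = bias_share(sd=0.460, bias=7.120)   # incorrectly specified model
correct = bias_share(sd=4.410, bias=-1.370)    # correctly specified model
print(round(incorrect, 3), round(correct, 3))  # prints 0.996 0.088
```

Under these assumptions nearly all of the error of the incorrectly specified model comes from bias, whereas most of the error of the correctly specified model comes from sampling variability.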
Without knowing the true values of the variances and covariances, the incorrect model would be tempting to accept because of the small sampling variance that accompanies it. The incorrectly specified model does well in estimating the variances of Ph, but Om and Ps show problems with their estimates. In Tables 2 and 3 the standard deviations of the Ps elements in the correctly specified model are between 2 and 10 times as large as those of the corresponding elements in the incorrectly specified model. On the other hand, the biases of the estimates of the elements of the Ps matrix in the incorrectly specified model are between 2 and 10 times as large as the biases of the corresponding elements in the correctly specified model. The percentage of the MSE that was due to bias in the Ps matrix of the incorrectly specified model (all sizes) ranged from 91% to 99.6%. The low standard deviation combined with badly off-target means indicates a very stable but extremely biased estimate. The direction of the bias was not consistent across elements.

It is also important to test the model for fit. By using the Maximum Likelihood Ratio (MLR), the correctly and incorrectly specified models can be tested for fit. If the MLR is significant, it is an indication that the model does not fit the data. Table 8 contains the statistics of the maximum likelihood ratio for the correctly and incorrectly specified models.

Table 8
Maximum Likelihood Ratio Test of the Fit of the Correct and Incorrect Models

25 Classes
                            * (b)       * (e)      ** (c)      ** (f)
                                                Incorrect   Incorrect
                          Balance   Unbalance     Balance   Unbalance
Mean                         2.77        4.21      129.77      144.53
                           (1.76)      (3.77)     (39.16)     (54.53)
Minimum                      1.29        1.58       81.12       68.36
Maximum                      7.41       13.13      210.99      231.57
No. of samples                (1)         (2)        (10)        (10)
significant at p < .05

100 Classes
                            * (h)       * (k)      ** (i)      ** (l)
                                                Incorrect   Incorrect
                          Balance   Unbalance     Balance   Unbalance
Mean                         7.41       15.16      618.59      603.88
                           (7.05)     (12.49)     (99.09)     (89.86)
Minimum                      0.82        2.46      475.21      443.02
Maximum                     22.72       34.25      769.72      783.49
No. of samples                (3)         (6)        (10)        (10)
significant at p < .05

* df = 4
** df = 10

The MLRs of the correctly specified model are lower than those of the incorrectly specified model by a factor of more than 10. The tests of fit for all forty samples for the incorrectly specified model had significant MLRs. Only 12 of those samples were significant for the correctly specified model, nine of which were from samples containing 100 groups. The datasets containing 100 groups had MLRs four times the size of those from datasets with 25 groups. If the dataset is very large, the fit may be acceptable but the MLR significant. This is a common problem in covariance structure analysis. The same problem occurs in LISREL when using a very large sample.

The unrestricted model estimated only one covariance matrix for the group level and one matrix for the individual level. Neither of these matrices, Phi or Psi, was structured or constrained. This particular model was estimated separately from the latent models. The starting values of the unrestricted model were maximum likelihood estimates when the data were balanced. Schmidt's maximum likelihood equations for the between and within covariance matrices of a multivariate random model were used as starting points. This algorithm always converged at the end of the first iteration for balanced datasets.

Table 9 contains the average B/E of Phi and Psi in the unrestricted model. Both the balanced and unbalanced samples showed little difference in the average B/E of either matrix when the data contained 100 groups. The B/E averaged less than 2.4% of the expected value.
When the data comprised 25 groups, the average B/E of the Psi matrix was less than 2.5% for both the balanced and unbalanced samples. The Phi matrix, on the other hand, had values of 13% and 17% for the balanced and unbalanced datasets, respectively. The increase in the sample size, from 750 to 3000, had little effect on the B/E for the Psi matrix. The increase from 25 to 100 groups, however, reduced the average B/E for the estimates of the elements of the Phi matrix in the balanced and unbalanced samples from 13% and 17% to 2% and 1%.

Table 9
Average B/E of the Unrestricted Model for Balanced and Unbalanced Data Sets for Both 25 and 100 Groups

                (c)         (d)         (a)         (b)
                100         100          25          25
            Balance   Unbalance     Balance   Unbalance
Phi           0.023       0.013       0.063       0.135
            (0.042)     (0.033)     (0.127)     (0.154)
Psi           0.013       0.013       0.007       0.025
            (0.013)     (0.013)     (0.016)     (0.035)

Table 10 summarizes the values of Ratio(1) for the unrestricted model. The Phi matrix showed a higher average MSE for the unbalanced samples than for the balanced samples in both the 25-group and the 100-group data (33% and 8%, respectively). The Psi matrix showed little difference between the MSEs of the balanced and unbalanced samples for samples of either size. As the number of groups in a sample increases, the MSE of the unbalanced data evidently approaches that of the balanced data.

Table 10
Average Ratio(1) of the Phi and Psi Matrices for Balanced and Unbalanced Data Sets for Both 25 and 100 Groups

                100              25
          Unbalance       Unbalance
Phi           0.083           0.330
            (0.142)         (0.116)
Psi          -0.061           0.007
            (0.039)         (0.151)

One last note: statistical theory states that as the sample size increases, the sampling variance of an estimate decreases. Because the number of groups increases fourfold, the SD should be about twice as large for the 25-group samples as for the 100-group samples. This is borne out for the Ph and Om matrices, but not for the Ps matrix. The Ps matrix has approximately the same SD for samples of both 25 and 100 groups.

3. Results of the Process.
Some of the findings of this study apply to the process of the EM algorithm itself. Problems encountered in running the algorithm in this setting may apply to other situations.

Table 11 contains information on the number of iterations the estimation procedure took to converge for each of the cells in Figure 1. The incorrectly specified model needed the most iterations to converge for both the data containing 100 groups and the data containing 25 groups. The means of the balanced and unbalanced samples were very close, 82 and 86 for the data with 25 groups and 99 and 100 for the data with 100 groups. The correctly specified model averaged 74 and 76 iterations for the balanced and unbalanced samples of 25 groups. For data containing 100 groups, the unbalanced sample averaged 38 fewer iterations than the balanced sample, 61 vs 99.

The unrestricted model averaged fewer iterations than either of the latent models under all conditions. The latent models, at the least, averaged over 50 more iterations than the unrestricted model. For the unrestricted model, the unbalanced samples containing 100 groups averaged fewer iterations than the unbalanced samples containing 25 groups, 5.8 against 10.3. The unrestricted model for the balanced samples always stopped after the first iteration in both cases. The starting values for the algorithm were Schmidt's maximum likelihood estimators for the between and within covariance matrices. This confirmed that the algorithm was capable of stopping at a maximum likelihood estimate.

TABLE 11
Iterations Required by the Algorithm for Convergence

                                 Balanced Design                 Unbalanced Design
                                    Standard                         Standard
                            Mean   Deviation     Range       Mean   Deviation     Range
25 Groups
  Unrestricted               1.0        -        1 - 1       10.3       7.79      1 - 28
  Correctly Specified       74.6      53.43     23 - 198     76        39.40     37 - 159
  Incorrectly Specified     82.2      19.42     50 - 105     86        16.83     53 - 102
100 Groups
  Unrestricted               1.0        -        1 - 1        5.8                 5 - 6
  Correctly Specified       98.8      62.84     27 - 192                         27 - 117
  Incorrectly Specified                3.62     92 - 104                         96 - 104

Ph and Om were estimated as full matrices for the correctly specified model and as diagonal matrices for the incorrectly specified model.

Convergence was found to be a problem for two datasets, and they were not used in the final analysis. In one dataset under the unbalanced case, the unrestricted model moved toward convergence for seven iterations until only one element of the three covariance matrices was slightly larger than the criterion. At iteration eight, the estimates of the parameters of the matrix began to diverge from the expected values, and they continued to do so until the program automatically stopped at the 250th iteration. The estimates of the parameters had diverged markedly from the maximum likelihood values. A slight change in the values of the starting matrix of this dataset caused the algorithm to converge in seven iterations.

In the unrestricted model the Psi matrix converged very quickly. The convergence of the model came only after the elements in the Phi matrix reached the criterion. In the correctly specified latent model the Ph matrix was the first matrix in which all of the elements reached the convergence criterion. The Ps matrix was the last in which all elements reached the criterion.

Finally, the starting values of the parameters affected the final estimated values at which the algorithm converged. Specifically, the proximity of the starting values to the true values appeared to be positively related to how close the final estimated parameters were to the maximum likelihood estimates when the criterion was reached. The criterion was reached when all elements in the covariance matrices changed by less than .01. Using one set of data, an initial computer run was done with values close to the expected values as starting values. These starting values caused the final estimates of the parameters to be close to the Expected value.
A second run of the program was then made on the same data, using Schmidt's formula to find starting values for the parameters. This run produced parameter estimates that were not as close to the expected values as those from the first run. The criterion was then ignored and the second run was allowed to continue for 95 more iterations, after which it reached values nearly identical to those from the first run.

CHAPTER VIII: DISCUSSION

1. Summary and Conclusions

Although statistical procedures are available for estimating treatment effects for students taught in classrooms, these procedures are applicable only when every class has the same number of students. The present study investigated a procedure that was originally established to handle missing data (the EM Algorithm) but which also provides a solution to the problem of estimating parameters in multivariate analysis when samples contain unequal group sizes. The focus of the present dissertation was on the estimation of latent group and individual level variances and covariances, with measurement error removed, when group sizes varied in a sample. Previous methods could find maximum likelihood estimates for this problem only if the dataset contained groups of equal size. The EM Algorithm offers a method for finding maximum likelihood estimates of parameters in situations where classical maximum likelihood procedures fail.

To estimate a set of parameters, the EM Algorithm requires two steps, an expectation step (E-step) and a maximization step (M-step). The E-step is characterized by the formulation of the sufficient statistics in terms of the observed data and the parameters. The M-step consists of developing the maximum likelihood equations for the parameters in terms of the conditional statistics. Given starting values for the parameters, the algorithm calculates the sufficient statistics in the E-step. These values are used to estimate the parameters in the M-step.
The algorithm returns to the E-step to recalculate the sufficient statistics based on the new values of the parameters. The parameters are reestimated using these new values of the sufficient statistics. The algorithm iterates between the E-step and the M-step until a specified criterion is reached.

The estimation of balanced and unbalanced samples was studied while varying two factors: the number of groups in the sample (the size) and the particular model being estimated (the unrestricted model, the correctly specified model, or the incorrectly specified model). The unrestricted model was

(8.1)  $Y_{ij} = \mu + \gamma_i + \varepsilon_{ij}$

where
$Y_{ij}$ is a p x 1 vector of observed data,
$\mu$ is a p x 1 vector of grand means,
$\gamma_i$ is a p x 1 vector of group effects, and
$\varepsilon_{ij}$ is a p x 1 vector of individual effects.

These variables were considered to have come from multivariate normal distributions:

(8.2)  $Y \sim N(0, \Sigma_Y)$, $\gamma \sim N(0, \Sigma_\gamma)$, $\mu \sim N(0, \Sigma_\mu)$, $\varepsilon \sim N(0, \Sigma_\varepsilon)$

The parameters of interest for this model were $\Sigma_\gamma$ and $\Sigma_\varepsilon$.

The correctly specified model was visualized as the application of a structure to the unrestricted model. By assuming $\gamma_i = \Lambda_\theta \theta_i$ and $\varepsilon_{ij} = \Lambda_\alpha \alpha_{ij} + \varepsilon_{1,ij}$, (8.1) becomes:

(8.3)  $Y_{ij} = \mu + \Lambda_\theta \theta_i + \Lambda_\alpha \alpha_{ij} + \varepsilon_{1,ij}$

where
$Y_{ij}$ is a p x 1 vector of observed data,
$\mu$ is a p x 1 vector of grand means,
$\Lambda_\theta$ is a p x q matrix connecting the p observed variables with q latent group variables,
$\theta_i$ is a q x 1 vector ($q \le p$) of the latent group effects,
$\Lambda_\alpha$ is a p x r matrix connecting the p observed variables with r latent individual variables,
$\alpha_{ij}$ is an r x 1 vector ($r \le p$) of the latent individual effects, and
$\varepsilon_{1,ij}$ is a p x 1 vector of the latent individual errors,

with

$Y \sim N(0, \Sigma_Y)$, $\theta \sim N(0, \Phi)$, $\mu \sim N(0, \Sigma_\mu)$, $\alpha \sim N(0, \Omega)$, $\varepsilon_1 \sim N(0, \Psi)$.

The parameters of interest in this model are $\Phi$, $\Omega$ and $\Psi$.

The incorrectly specified model differs from the correctly specified model in the constraints placed on the two latent covariance matrices. In the incorrect model, the latent covariance matrices $\Phi$ and $\Omega$ are considered to be diagonal matrices.
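The E-step and M-step cycle described earlier in this section can be sketched for the simplest case of the unrestricted model (8.1): a single observed variable (p = 1), so that the between and within covariance matrices reduce to the scalars phi and psi. This is a minimal illustration under stated assumptions, not the program used in the study. The function name, the starting values of 1.0 and the list-of-lists data layout are hypothetical; the update formulas are the scalar form of the Section 2 steps in the Appendix C listing, and the .01 criterion and 250-iteration cap are the values reported in the study.

```python
def em_one_way(groups, phi=1.0, psi=1.0, tol=0.01, max_iter=250):
    """EM sketch for y_ij = mu + gamma_i + eps_ij with scalar variances.

    phi is the between-group variance, psi the within-group variance.
    Starting values of 1.0 are an assumption made for this illustration.
    """
    k = len(groups)
    n = [len(g) for g in groups]
    total_n = sum(n)
    ybar = [sum(g) / len(g) for g in groups]                # group means
    ss = [sum(y * y for y in g) / len(g) for g in groups]   # sum of squares / n_i

    for iteration in range(1, max_iter + 1):
        # E-step: group weights Q_i, conditional variance W and mean u of mu,
        # and conditional group effects theta_i.
        q = [1.0 / (phi + psi / ni) for ni in n]
        w = 1.0 / sum(q)
        u = w * sum(qi * yb for qi, yb in zip(q, ybar))
        theta = [phi * qi * (yb - u) for qi, yb in zip(q, ybar)]

        # M-step: re-estimate phi and psi from the conditional statistics.
        f = sum(qi - qi * ((yb - u) ** 2 + w) * qi for qi, yb in zip(q, ybar))
        e = 0.0
        for ni, yb, s, qi, th in zip(n, ybar, ss, q, theta):
            a = th + u
            e += ni * (s - 2.0 * yb * a + a * a
                       + phi * qi * (psi / ni + w * qi * phi - 2.0 * w))
        new_phi = phi - phi * (f / k) * phi
        new_psi = e / total_n + w

        # Criterion: largest absolute change in any estimate.
        change = max(abs(new_phi - phi), abs(new_psi - psi))
        phi, psi = new_phi, new_psi
        if change < tol:
            break
    return phi, psi, u, iteration
```

Calling `em_one_way([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [10.0, 11.0]])` on three groups of unequal size returns the two variance estimates, the conditional grand mean and the number of iterations used. The same loop serves balanced and unbalanced data because each group carries its own weight Q_i.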
No such constraints are placed on the latent matrices in the correct model.

Tests of the model based on the criteria of convergence showed this estimation procedure to be a satisfactory and effective method in theory. However, once the study had been completed, it was recognized that an error term at the group level would be needed in the model for practical applications. This model can be described as follows:

(8.5)  $Y_{ij} = \mu + \Lambda_\theta \theta_i + \Lambda_\alpha \alpha_{ij} + \varepsilon_{1,ij} + \varepsilon_{2,i}$

where
$Y_{ij}$ is a p x 1 vector of observed data,
$\mu$ is a p x 1 vector of grand means,
$\Lambda_\theta$ is a p x q matrix connecting the p observed variables with q latent group variables,
$\theta_i$ is a q x 1 vector ($q \le p$) of the latent group effects,
$\Lambda_\alpha$ is a p x r matrix connecting the p observed variables with r latent individual variables,
$\alpha_{ij}$ is an r x 1 vector ($r \le p$) of the latent individual effects,
$\varepsilon_{1,ij}$ is a p x 1 vector of the latent individual errors, and
$\varepsilon_{2,i}$ is a p x 1 vector of the latent group errors,

with

$Y \sim N(0, \Sigma_Y)$, $\theta \sim N(0, \Phi)$, $\mu \sim N(0, \Sigma_\mu)$, $\alpha \sim N(0, \Omega)$, $\varepsilon_1 \sim N(0, \Psi_1)$, $\varepsilon_2 \sim N(0, \Psi_2)$.

The parameters of interest in this model are $\Phi$, $\Omega$, $\Psi_1$ and $\Psi_2$. The EM Algorithm was developed for this purpose and run on a trial set of data. The results were similar to those obtained in the old model (see Table 12).

                                TABLE 12
    Summary Statistics of the Four Parameter Latent Model for Balanced
               and Unbalanced Data Sets with 100 Groups

Parameter  Expected  Design         Mean       SD       MSE  Bias(B)     B/E  Ratio 1

Ph Matrix
Ph(1)        64.000  Balance      64.040    3.710    13.766    0.040   0.001
                     Unbalance    64.280    3.470    12.128    0.280   0.004   -0.119
Ph(2)         8.000  Balance       7.160    3.990    16.704   -0.840  -0.105
                     Unbalance     7.130    4.780    23.689   -0.870  -0.109    0.418
Ph(3)         5.000  Balance       4.380    1.600     2.987   -0.620  -0.124
                     Unbalance     4.340    1.480     2.674   -0.660  -0.132   -0.105
Ph(4)        40.000  Balance      39.430   10.440   109.355   -0.570  -0.014
                     Unbalance    39.720   10.860   118.027   -0.280  -0.007    0.079
Ph(5)         7.000  Balance       7.040    6.880    47.336    0.040   0.006
                     Unbalance     7.000    6.900    47.610    0.000   0.000    0.006
Ph(6)       107.000  Balance     102.600   15.970   276.552   -4.400  -0.041
                     Unbalance   100.940   18.540   384.536   -6.060  -0.057    0.390

Summary Statistics - Ph Matrix     Balance   Unbalance
  Mean of B/E                       -0.046      -0.050
  SD of B/E                          0.056       0.059
  Mean of Ratio 1                                0.112
  SD of Ratio 1                                  0.238

Om Matrix
Om(1)        25.000  Balance      24.490    1.020     1.329   -0.510  -0.020
                     Unbalance    24.420    0.900     1.184   -0.580  -0.023   -0.110
Om(2)        10.000  Balance      11.780    1.360     5.370    1.780   0.178
                     Unbalance    11.670    1.410     5.087    1.670   0.167   -0.053
Om(3)        20.000  Balance      18.270    1.080     4.492   -1.730  -0.087
                     Unbalance    18.360    1.510     5.269   -1.640  -0.082    0.173
Om(4)        15.000  Balance      16.920    1.540     6.468    1.920   0.128
                     Unbalance    16.940    1.370     6.059    1.940   0.129   -0.063
Om(5)        10.000  Balance      12.950    4.200    27.309    2.950   0.295
                     Unbalance    13.220    4.060    28.004    3.220   0.322    0.025
Om(6)        35.000  Balance      32.320    2.710    15.325   -2.680  -0.077
                     Unbalance    32.510    3.380    18.313   -2.490  -0.071    0.195

Summary Statistics - Om Matrix     Balance   Unbalance
  Mean of B/E                        0.070       0.074
  SD of B/E                          0.155       0.160
  Mean of Ratio 1                                0.028
  SD of Ratio 1                                  0.129

Ps1 Matrix
Ps1(1)        5.000  Balance       6.710   10.380   110.993    1.710   0.342
                     Unbalance     6.700   10.270   108.684    1.700   0.340   -0.021
Ps1(2)        6.000  Balance      15.380    6.740   143.188    9.380   1.563
                     Unbalance    15.670    6.180   142.091    9.670   1.612   -0.008
Ps1(3)       11.000  Balance      25.330    8.410   298.894   14.330   1.303
                     Unbalance    26.330    8.910   340.509   15.330   1.394    0.139
Ps1(4)       12.000  Balance      23.910    7.540   214.461   11.910   0.993
                     Unbalance    22.970    6.250   172.775   10.970   0.914   -0.194

Summary Statistics - Ps1 Matrix    Balance   Unbalance
  Mean of B/E                        0.700       0.710
  SD of B/E                          0.527       0.564
  Mean of Ratio 1                               -0.021
  SD of Ratio 1                                  0.137

Ps2 Matrix
Ps2(1)        7.000  Balance       5.160    8.170    70.511   -1.840  -0.263
                     Unbalance     5.330    8.170    69.848   -1.670  -0.239   -0.009
Ps2(2)        6.000  Balance       7.130    5.700    33.909    1.130   0.188
                     Unbalance     7.370    5.620    33.670    1.370   0.228   -0.007
Ps2(3)       10.000  Balance      10.490    7.470    56.068    0.490   0.049
                     Unbalance    11.090    7.860    63.100    1.090   0.109    0.125
Ps2(4)       11.000  Balance      10.700    6.380    40.804   -0.300  -0.027
                     Unbalance     9.760    5.620    33.293   -1.240  -0.113   -0.184

Summary Statistics - Ps2 Matrix    Balance   Unbalance
  Mean of B/E                       -0.009      -0.002
  SD of B/E                          0.189       0.211
  Mean of Ratio 1                               -0.019
  SD of Ratio 1                                  0.127

Three issues with which users of the EM algorithm must contend are the convergence criterion, the convergence behavior of the algorithm, and the restriction of the model.

The present study used the absolute change of the estimates as the convergence criterion. Raudenbush (1986) used the change in the likelihood as the criterion, and the gradient of the likelihood has also been suggested (see [author name illegible] (1987)). However, each has its problems. The first criterion may not reach the maximum likelihood solution, since the starting point definitely affects the finishing point in such a situation. Using the likelihood or its gradient can fail if there is a chance that the matrix being estimated is singular. The algorithm, using estimate differences, will converge even with a singular matrix.

The pattern of estimation in the two sets of data that failed to converge exposes a problem. The data first converged toward the criterion; then they diverged. A slight change of one value in the starting matrices caused these data sets to converge. Some articles have been written on the convergence of the EM algorithm (most notably Wu (1983)), but these address univariate cases. The multivariate case becomes much more complex. Being a linear method, the EM Algorithm moves slowly toward a convergence point. It is much more susceptible to local maxima along the way than Newton-Raphson or any other method as it moves toward the maximum likelihood estimate.

The third issue is the restriction of the model. The less restriction placed on the model, the better the estimation appears to be. If the model is wrongly restricted, however, the EM Algorithm will still converge, yielding bad (but attractive) results with no indication that a problem exists. It becomes imperative that the maximum likelihood ratio test or a similar test be used to test the fit of the model.

2. Future Exploration

There is no clear choice for the best convergence criterion for this method. The convergence criterion used in this study was the absolute change of the estimates. Alternative choices are the change in the model's likelihood or the likelihood's gradient. Each has its advantages. Since the final criterion used in this study was affected by the starting values of the estimated matrices, the other choices of criterion might yield closer, more consistent estimates. However, if any of the matrices is singular, the likelihood will approach infinity and the algorithm will likely fail to converge.

Future models can be expanded to larger, more restrictive and more complicated models. The only problem facing these models is the number of iterations necessary for convergence. As the models become more complicated, more iterations are required to reach convergence. New developments arising in the work on the EM algorithm might shorten this process. The EM algorithm, however, must be derived separately for each model to which it is applied, limiting the generalization from one model to another.

Another factor not examined here, but of importance, is the unbalancedness of the sample. The degree to which the data contain unbalanced groups may or may not affect the estimation procedure. Using the unbalanced design, a lower and an upper limit of the MSE could be found for the sample. The literature indicates that the relative size of the matrices of the random model can affect the reliability of the estimates.

Lastly, the expansion of this model into the unbalanced design is important for educational research.
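The alternative stopping rules discussed in this chapter can be compared in a small sketch. It is written for the scalar case of the unrestricted model (one observed variable, between-group variance phi, within-group variance psi) and is only an illustration under stated assumptions: the function names and the likelihood tolerance of 1e-4 are hypothetical, while the .01 tolerance on the estimates is the one used in the study. The guard at the top of the likelihood function marks the degenerate (singular) case in which a likelihood-based rule breaks down.

```python
import math


def log_likelihood(groups, mu, phi, psi):
    """Log-likelihood of y_ij = mu + gamma_i + eps_ij in the scalar case."""
    if psi <= 0.0 or phi < 0.0:
        # Degenerate (singular) covariance: the likelihood cannot be evaluated.
        raise ValueError("singular or invalid covariance")
    total = 0.0
    for g in groups:
        n = len(g)
        dev = [y - mu for y in g]
        s1 = sum(dev)
        s2 = sum(d * d for d in dev)
        # A group's covariance is psi*I + phi*11'; its determinant and
        # inverse have the closed forms used in the next two lines.
        log_det = (n - 1) * math.log(psi) + math.log(psi + n * phi)
        quad = (s2 - phi * s1 * s1 / (psi + n * phi)) / psi
        total += -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)
    return total


def stop_on_estimates(old, new, tol=0.01):
    """Rule used in the study: every estimate changed by less than tol."""
    return max(abs(a - b) for a, b in zip(old, new)) < tol


def stop_on_likelihood(old_ll, new_ll, tol=1e-4):
    """Alternative rule: the log-likelihood changed by less than tol."""
    return abs(new_ll - old_ll) < tol
```

The estimate-based rule needs only the previous and current parameter values, so it can always be evaluated. The likelihood-based rule needs a valid covariance at every iteration, which is the practical cost noted above.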
This procedure opens the way for more complicated multilevel analysis such as the causal modeling of Joreskog. APPENDICES APPENDIX A EQUATIONS FOR THE ESTIMATION OF THE COVARIANCE COMPONENTS OF THE TWO-PARAMETER MODEL USING THE EM ALGORITHM The model of the two-parameter (unrestricted) model is: - + I u+11 51 13 J I is a p x 1 vector of observed data. g is a p x 1 vector of grand means. 1 is a p x 1 vector of group effects. is a p x 1 vector of individual effects. In The EM algorithm is used to estimate the two covariance matrices of this model, 2% and 22. Both matrices are assumed to come from multivariate normal distributions: 1~N(Q,2) 7 £~N(QREE) E-Step The conditional expectations of 7 and E are: 101 MY 7" 2:11 W Q1? E - = - 1|Y 1 1k 9 231(17- u) where Q1 - (Z1 + (l/n1)2‘)-1 w - [2" 0,1" i-l M-Step ° The maximum likelihood equations of the parameters for the M-step are I A 2:1 - 21 - é):_1[27(Q1 - 01 3~N(0.9) 11~N(0.91) 3.2~N<0.92). 103 104 E-Step The conditional expectations of Q, g, gl_and g2 are: ulY\ u' 2:111 Q1? QIY L 9' - 12° 93525; - 8*) 1le 7' 32} 11‘ s 9201(3- 11) aIY/ 3' 1 some: -Ae*-a‘-32‘) 1! i.) a (5.18) E where a - (A o 1' + 1:1)" '1 '1 Q1 - (he’s); + *2 + (l/m)M ) -1 w - [2:101] M-Step The equation can be clarified by substituting By substituting (5.18) for and (5.12) for D(8.). The maximum likelihood estimate becomes: ‘ i . 0 - o. - k :11 [3. 1.821 - Qi (A - W) (21)] " 1 w - 92 - k 2:, [‘FZ(Q1 - 01(4 - 1009921 2 9 - 9 - 9w} lewvul/nincz1 - 05A - mi] - ans-(n1 - 1)a‘1]a) A o 105 ~11 - 1'1 - «‘XLIRIMI/mptcz1 - 0104 - 10011 - ans-(n1 - 1)a'1]a)i«1 where a - (j:1 - abet1 - u‘)’ 8 - (x1d - A e' - in: - A‘s." - a) APPENDIX C COMPUTER PROGRAM FOR THE ESTIMATION OF THE THREE PARAMETER LATENT MODEL IN SAS SECTION 1 - PART 1 THIS SECTION CREATES THE SAMPLE DATASET FOR USE IN THE E-M ALGORITHM STEPS. EACH DATA POINT CONSISTS OF THREE COMPONENTS, LATENT WITHIN (OM), LATENT BETWEEN (PH) AND ERROR (PS). 
THE OBJECT IS TO USE PATTERN MATRICES L AND LA TO CONVERT THE 3 X 3 LATENT MATRICES INTO 4 X 4 MATRICES OF OBSERVED VALUES. THE ERROR MATRIX IS ALWAYS A 4 X 4 MATRIX OF MEASUREMENT ERRORS OF THE OBSERVED VALUES. 1. SEED IS ANY RANDOM NUMBER USED TO CREATE RANDOM VALUES FROM A RANDOM GENERATOR (NORMAL). USED IN STUDY 22.939u25 199.939n25 10199 - 50199 80199 100199 110199 2. CIRCLE IS A COUNTER USED TO LOOP THROUGH THE PROGRAM CREATING DIFFERENT DATA SETS FOR ANALYSIS. 3. {AI IS A Z X 2 MATRIX OF THE NUMBER OF STUDENTS IN THE GROUPS. NOl HAS THE NUMBER OF SUBJECTS IN GROUPS - N02 HAS THE NO OF GROUPS OF THAT SIZE. FOR UNBALANCED 25 GROUPS: | FOR BALANCED 25 GROUPS: PAT-10 5/20 5/30 5/40 5/50 5; I PAT-30 25; FOR UNBALANCED 100 GROUPS: | FOR BALANCED 100 GROUPS: PAT-10 20/20 20/30 20/40 20/50 20; I PAT-30 100; 4. QM IS THE PARAMETER OF THE WITHIN COVARIANCE MATRIX. OF THE POPULATION . 5. EH IS THE PARAMETER OF THE BETWEEN COVARIANCE MATRIX OF THE POPULATION. 6. 23 IS THE PARAMETER OF THE ERROR COVARIANCE MATRIX OF THE POPULATION. 7. L IS A PATTERN MATRIX CREATING LINEAR COMBINATIONS OF THE LATENT VARIABLES IN PH. 8. LA IS A PATTERN MATRIX CREATING LINEAR COMBINATIONS OF THE LATENT VARIABLES IN OM. I'M-Sbfl-l-X'al-l-fl-l-M-l-fi-fl-fl-il-3"I'M.*M-M-l-fl-fl-i-fl-fi-abfl-II-Ifib’l-fi-fifl-I-fl-fl-X-fl- PROC MATRIX ; SEED-10199; CIRCLE-O; 106 -0 ‘0 -0 .0 .0 -0 -0 .0 ‘0 .0 .0 .0 .0 .0 ‘0 -0 ‘0 .0 .0 .0 .0 .0 .0 ‘0 .0 .0 -0 .0 .0 .0 .0 .0 .0 -0 -0 .0 .0 .0 -0 .0 -0 107 PAT-30 25; OM - 25 10 15/ 10 20 10/ 15 10 35; PH - 64 8 40/ 8 5 7/ 40 7 107; PS - 5 0 0 0/ 0 6 0 0/ 0 0 ll 0/ 0 0 0 12; L - 1 0.5 0.5/ l 0.5 -0.5/ 1 -0.5 0.5/ 1 -0.5 -0.5; LA- 1 0.5 0.5/ l 0.5 -0.5/ 1 -0.5 0.5/ 1 -0.5 -0.5; NOTE 'THESE ARE THE PARAMETER VALUES AND PATTERN OF SIZES' ; PRINT PAT WITHIN BETWN ERR; * * SECTION 1 - PART 2 * * THREE DIFFERENT VECTORS OF DATA ARE NEEDED, ONE FOR PH, ONE FOR OM * AND ONE FOR PS. THESE ARE INDEPENDENT RANDOM VARIABLES. 
* FOR LATER USE THE CHOLESKYS OF OUR PARAMETER MATRICES ARE NEEDED. * * 9. QHQLQM IS THE CHOLESKY OF DH. *10. QHQLPH IS THE CHOLESKY 0F PH. *11. QHQLPS IS THE CHOLESKY 0F PS. *12. A 18 VECTOR OF 21330 VALUES GENERATED AT RANDOM FROM A POPULATION * OF VALUES WITH A MEAN OF 0 AND A VARIANCE OF 1 FROM SAS * SUBROUTINE NORMAL. *13. 2 IS A VECTOR OF 3000 VALUES EQUAL TO THE INDIVIDUAL VALUES FOR * OM. *14. 21 IS A VECTOR OF 100 VALUES EQUAL TO THE GROUP VALUES FOR PH. *15. 22 IS A VECTOR OF 3000 VALUES EQUAL TO THE INDIVIDUAL VALUES FOR * PS. *16. g, 91 AND g2 ARE THE VARIANCE-COVARIANCES FOR THE THREE 2 VECTORS. * THESE MATRICES SHOULD BE IDENTITY MATRICES. * CHOLOM - HALF(OM); CHOLPH - HALF(PH); CHOLPS - HALF(PS); BEGIN: CIRCLE-CIRCLE+1; A - J.(21300,1,0); I - 1; L: A(I,1)-NORMAL(SEED); I-I+1; IF I<- 21300 THEN GO TO L; 2 - a(1:3000,1)||a(3001:6000,l)||a(6001:9000,1); Zl- a(21001:21100,1)||a(21101:21200,1)||a(21201:21300,1); 22-a(9001:12000,1)||a(12001:15000,1)|| a(15001:18000,1)||a(18001:21000,1) TOTMI-NROW(Z)-1 TOTMIG-NROW(Zl)-l C - (Z'*Z)#/TOTMI Cl- (Zl'*Zl)#/TOTMIG 02- (22'*22)#/TOTMI 108 NOTE 'THESE ARE THE VAR-GOV OF THE RANDOM DATA (NO TRANS)’ ; PRINT C C1 C2; * e * SECTION 1 - PART 3 ; * e * BY MULTIPLYING RANDOM DATA FROM A POPULATION WITH MEAN 0 AND VARIANCE; * 0F 1 BY THE CHOLESKY OF A MATRIX, A VECTOR IS CREATED WHICH WILL ; * RECREATE THAT MATRIX. - 'k *17. 1 IS THE PRODUCT 0F 2 AND CHOLW. , *18. 11 Is THE PRODUCT 0F 21 AND CHOLB. , *19. 12 IS THE PRODUCT OF 22 AND CHOLERR. ; *20. n, 21 AND 02 ARE THE VARIANCE-COVARIANCES FOR THE THREE Y VECTORS-; * THESE MATRICES SHOULD BE CLOSE TO THE PARAMETER MATRICES. , * 8 Y - z * CHOLW; o - (Y'*Y)#/TOTMI; Yl- 21 * CHOLB; D1 - (Y1'*Yl)#/TOTMIG; Y2- 22 * CHOLERR; D2- (Y2'*Y2)#/TOTMI; NOTE 'THESE ARE THE VAR-COV OF THE TRANSFORMED DATA’ ; PRINT o D1 D2 ; * ; * SECTION 1 - PART 4 ; * ; * BY MULTIPLYING VECTORS 2 AND 21 To L AND LA, THE OBSERVED VALUES FOR ; * FOR EACH INDIVIDUAL ARE CREATED. 
INSTEAD OF THREE MEASURES PER ; * INDIVIDUAL THERE WILL BE FOUR. (THE ERROR MATRIX WAS CREATED IN ; * TERMS OF ERRORS FOR EACH OBSERVED VARIABLES AND IS ALREADY 4 x 4.); * ; *21. x IS THE PRODUCT OF Y AND L. ; *22. x1 IS THE PRODUCT OF Y1 AND LA. ; * ; x - Y * L' X1- Y1 * LA' ; * * SECTION 1 - PART 5 * * BY ADDING VECTORS X, YYl AND Y2 TOGETHER, A TOTAL SCORE IS ACHIEVED FOR EACH INDIVIDUAL. THESE SCORES OBVIOUSLY CONTAIN THE THREE VARIANCE COMPONENTS. ALL 30 INDIVIDUALS IN EACH GROUP RECEIVE THE SAME GROUP VECTOR(X1). OTHERWISE EACH RECEIVES A DIFFERENT VALUE FROM BOTH X AND Y2. 36361-363!- *23. 112 BECOMES A 3000 x 4 VECTOR WHICH REPEATS THE SAME X1 VALUE FOR * N(I) TIMES FOR EACH GROUP. *24. X BECOMES THE SUM OF X AND YYl AND Y2. *25. FIN IS THE VARIANCE-COVARIANCE MATRIX FOR THE FINAL SET OF DATA- * ITS A 4 X 4 MATRIX BASED ON 3000 OBSERVATIONS. *26. HE IS A VECTOR OF SIZE K CONTAINING THE GROUP SIZE FOR EACH GROUP. *27. ED REPLACES X AS THE MATRIX OF DATA. THIS IS USED IS OTHER * SECTIONS. 109 MMhO; II—l; NN-l; JJ: MHFMM+1; CC-J.(PAT(II,1),1,1); DD-(CC @ X1(NN.)) ; YYliYYl//DD ; NN-NN+1 ; NKFNK//PAT(II,1) ; IF MM LT PAT(II,2) THEN GO TO JJ; "MFG; II-II+1 ; IF NN LT PAT(+, 2) THEN GO TO JJ ; Fl-(YY1'*YY1)#/NROW(YY1) ° NOTE 'THIS IS THE VAR- COV MATRIX OF GROUP DATA FOR ALL IND' ; PRINT Fl ; RD-X+YY1+Y2 ; FIN- (RD'*RD)#/TOTMI NOTE 'THIS IS THE VAR- COV MATRIX OF THE DATA TO BE USED' ; PRINT FIN ; FREE MM NN II X X1 Y Y1 Y2 D 01 D2 E E1 FIN I YYl Z 21 22 CC DD ; FREE F1 C C1 C2 A EE EEl EE2 TOTE WITHIN BETWN ERR TOTMI TOTMIG ; END OF SECTION 1 AT THIS POINT IT BECOMES IMPORTANT TO REALIZE THAT ALL THE LINES ABOVE DEAL ONLY WITH CREATING THE DATA FOR THIS ANALYSIS. THEY CAN BE DROPPED IN USING THE EM ALGORITHM. 
TO USE THE REST OF THE PROGRAM WITHOUT THE PRIOR LINES, THE FOLLOWING LINES MUST BE PLACED AT THE TOP OF THE PROGRAM (REMOVING THE * FROM THE FRONT - SEE SAS FOR THE FETCH COMMAND) Ski-*I-I-I-l-l-fl-X-H-l- *PROC MATRIX *FETCH RD *FETCH LA SECTION 2 - PART 1 THIS SECTION USES THE EM ALGORITHM TO GET ESTIMATES OF THE UNRESTRICTED MODEL. THE BETWEEN AND WITHIN VARIANCE-COVARIANCE MATRICES ARE ESTIMATED WITH NO STRUCTURE APPLIED. THIS FIRST PART TURNS OUT THE SUFFICIENT STATISTICS FOR THE SAMPLE DATA NEEDED IN PART 2 AND IN PART 3. E NUMBER OF GROUPS IN THE SAMPLE. E NUMBER OF OBSERVED VARIABLES IN THE SAMPLE. E TOTAL NUMBER OF OBSERVATIONS IN THE SAMPLE. KP KP I I I VECTOR OF THE GROUP MEANS. _ X KP MATRIX OF EACH GROUP‘S SUM OF SQUARE/NK. *fifl-fil’I-I-fi’fl-Il-M'fl-H-fi-I- U|§UNH K S TH E S TH H 8 TH . 13 IS A SS IS SA .0 .0 .0 .0 ‘0 .0 .0 .0 .0 .0 .0 .0 .0 .0 .0 ‘0 .0 .0 .0 .0 .0 '0 .0 .0 .0 .0 '0 .0 .0 .0 .0 .0 110 ZEROl-J.(NROW(L),NROW(L),O) 3 ZEROZ-J.(NCOL(L),NCOL(L),O) ; ZERO3-J.(NCOL(LA),NCOL(LA),0) ; NH-(l#/SUM(INV(DIAG(NK))))*K PSI-((RD'*RD)-B1)*1#/(N-K) PHI-(1#/NH)*((Bl-(RD(+.)'*RD(+.)*(1#/N)))*1#/(K-1)-PSI) NOTE 'HERE ARE THE STARTING MATRICES' PRINT PHI PSI FREE Bl B s NH A ; KFNROW(NK) , *NO OF CLASSES - K ; P-NCOL(RD) ; *NO OF OBSERVED VARIABLES - P ; NiNK(+,) ; *NO OF TOTAL INDIVIDUALS - N ; GRP-J.(NROW(NK),1,1) ; *VECTOR OF 1'8, X X 1 ; D-O ; Bl-O , DO I-l TO K ; C -D+l ; D -C+NK(I,)-1 ; A -RD(C:D,) ; B -A(+.)*(1#/NK(I.)) ; Ya -YM//B' ; *GROUP MEANS ; E -(A'*A)*(1#/NK(I.)) ; SS -SS//E ; *ss FOR EACH GROUP ; Bl-((B'*B)*NK(I,))+B1 ; *SS/K OF THE GROUP MEANS ; END : * ; * SECTION 2 - PART 2 ; * e * STARTING VALUES ARE NEEDED FOR THE BETWEEN GROUP COVARIANCE MATRIX ; * PHI AND THE WITHIN COVARIANCE MATRIX PSI. THE MLE FOR EQUAL N'S ; * WILL BE USED WITH NK (NUMBER OF STUDENTS IN A GROUP) REPLACED BY THE ; * HARMONIC MEAN OF NK. ; * ; * 6. NH IS THE HARMONIC NK OF THE GROUPS. ; * 7. PHI IS THE BETWEEN GROUPS COVARIANCE MATRIX. ; * 8. 
PSI IS THE WITHIN GROUPS COVARIANCE MATRIX. ; * ; SECTION 2 - PART 3 THIS PART CREATES THE CONDITIONAL VALUES FOR THE MEAN AND GROUP EFFECT FOR PHI AND PSI ESTIMATES. THERE ARE FOUR IMPORTANT VARIABLES' CREATED HERE. THEY ARE CREATED IN SUBROUTINES ALPHAB AND ALPHAU. ' THIS IS PART OF THE INTERATIVE LOOP, THE E STEP. 10. TH IS A K X P MATRIX OF GROUP EFFECTS. 11. Q IS A KP X P WEIGHTING FACTORS FOR THE GROUPS. 12. W IS A P X P WEIGHTING FACTOR CALCULATED FROM THE Q'S. *fl-l-fl-fl-fl-fl-fl-l'I-fl'fl-ll’ 9. U IS A P X 1 VECTOR OF CONDITIONAL MEANS. ; BUDDYEO BUD:BUDDY-BUDDY+1 IF NROW(PAT) EQ 1 THEN LINK ALPHAB ELSE LINK ALPHAU 111 D0 I-I To K C -(P*I)-P+I D -P*I A -PHI*Q(C:D,)*(YM(C:D,)-U) TH-TH//A END FREE A at- SECTION 2 - PART 4 THIS PART CALGULATES THE MAXIMUM LIKELIHOOD VALUES FOR PHI AND PSI USING THE DATA AND THE CONDITIONAL VARIABLES. (M-STEP). THIS PROGRAM WILL KEEP LOOPING TO THE LAST PART UNTIL THE DIFFERENCES IN PHI AND PSI, AND THE NEW ESTIMATES OF PHI AND PSI ARE LESS THAN .01. 13. ONEl IS THE DIFFERNECE BETWEEN PHI ON THE LAST ITERATION AND THE NEW ESTIMATES OF PHI. l4. TWOl IS THE DIFFERNECE BETWEEN PSI ON THE LAST ITERATION AND THE NEW ESTIMATES OF PSI. 
036*3636‘36301-01" E-J(P,P,0) F—J(P,P,0) DO I-l TO K C -(P*I)-P+1 D -P*I A -TH(C:D,)+U E -E+NK(I,)*((SS(C:D,))-(YM(C:D,)*A') -(A*YM(C:D,)')+(A*A')+(PHI*Q(C:D,)* (PSI*(1#/NK(I.))+(W*Q(C:D.)*PHI-2*W)))) ; F -F+Q(C:D.)-(Q(C:D.)*((YM(C:D.)-U)*(YM(C:D.)-U)'+W)*Q(C:D.)) ; END ; FREE A PHI -PHI-(PHI*((I#/K)*F)*PHI) ONEI-PHI-PHI PSI -((1#/N)*E)+W TWOI-PSl-PSI PHID-DIAG(PHI) PSID-DIAG(PSI) PHID-PHID<>ZEROI PSID-PSID<>ZEROI PHI - PHI-DIAG(PHI)+PH1D PSI - PSI-DIAG(PSI)+PSID PHI-PHI PSI-PSI FREE PSI Q TH PHI PHID PSID IF BUDDY GT 250 THEN GO TO FINAL; IF MAX(ABS(ONE1)) LT 0.01 AND MAX(ABS(TWOI)) LT 0.01 THEN GO TO FINAL ; ELSE GO TO BUD ; .0 .0 .0 ‘0 .0 .0 .0 .0 .0 .0 .0 .0 .0 .0 FINAL:PRINT BUDDY PHI PSI ONEl TWOl U FREE PSl ONEl TWOl U BUDDY -0 -0 .0 .0 .0 .0 .0 .0 ‘0 .0 .0 -0 -0 112 END OF SECTION 2 SECTION 3 - PART 1 THIS SECTION USES THE E-M ALGORITHM TO GET ESTIMATES OF THE RESTRICTED MODEL. PH, OM AND PS ARE ESTIMATED WITH STRUCTURE APPLIED TO THE MODEL. THIS FIRST PART MAKES USE OF PHI AND PSI FROM THE LAST SECTION TO GET OPENING ESTIMATES OF PH, OM AND PSI. PH IS THE LATENT GROUP LEVEL VAR-COV. OM IS THE LATENT IND LEVEL VAR-COV. PS IS THE OBSERVED VARIABLES ERROR MATRIX. 8 IS THE DIMENSION OF PH. . R IS THE DIMENSION OF OM. 
3F$$$$$$$$$$3$$$$$$ UI§UNH Y1-INV(L'*L) ; Y2-INV(LA'*LA) ; PHiY1*L'*PHI*L*Yl ; OM-Y2*LA ' *PS I*LA*Y2 ; PS-PSI+PHI-L*PH*L'-LA*OM*LA' ; NOTE 'THESE ARE THE STARTING VALUES IN THIS STEP' , PRINT PH OM PS ; S-NCOL(L) ; *NO OF LATENT CLASS VARIABLES - S ; RPNCOLZERO2 OMD-DIAG(OM)<>ZERO3 PSD-DIAG(PS)<>ZEROI PH-PH-DIAG(PH)+PHD OM-OM-DIAG(OM)+OMD PS-PS-DIAG(PS)+PSD FREE TH Q W B VE XX B II IF BUDDYI GT 250 THEN GO TO FINALI IF MAX(ABS(ONE)) LT 0.01 AND MAX(ABS(TWO)) LT 0.01 AND MAX(ABS(THREE)) LT 0.01 THEN GO TO FINALI ; ELSE GO TO BUDl ; FINALI: PRINT BUDDYI PH OM PS ONE TWO THREE SEED ; PRINT U ; .0 .0 -0 .0 .0 .0 .0 -0 .0 .0 .0 .0 ‘0 ‘0 FREE NK RD YM 88 N K R S '114 IF CIRCLE LT 2 THEN GO TO BEGIN STOP * * HERE ARE THE SUBROUTINES * * ALPHAU * ALPHAU: TOT -J(P,P,O) DO I-l TO K Tl -INV((PSI*(1#/NK(I,)))+PHI) TOT -TOT+T1 Q -Q//T1 END W -INV(TOT) U -W*Q'*YM FREE T1 TOT RETURN * * ALPHAB * ALPHAB: Tl -INV((PSI*(l#/NK(1,)))+PHI) Q -GRP @ Tl W -INV(T1*K) U -W*Q'*YM FREE Tl RETURN * * BETAU * BETAU: W - J(P,P,0) DO I-l TO K A-INV( (L*PH*L') + (MM*(l#/NK(I.))) ) Q-Ql/A WhW+A END FREE A WhINV(W) U-W*Q'*YM RETURN * * BETAB * BETAB: A!INV( (LRPH*L') O -GRP @ A W -INV(A*K) U-W*Q'*YM RETURN * + (MM*(1#/NK(1.))) ' *MATRIX OF Q *COND VAR FOR U ' *COND U ' *COND U ) -I(PXP; "U'U $3!-*$**$****$*$**fl-363636363636361-1-1-3636000001-0 PROC MATRIX SEED - 101997 CIRCLE - O PAT - 30 100 APPENDIX D COMPUTER PROGRAM FOR THE ESTIMATION OF THE FOUR PARAMETER LATENT MODEL IN SAS SECTION 1 - PART 1 THIS SECTION CREATES THE SAMPLE DATASET FOR USE IN THE E-M ALGORITHM STEPS. EACH DATA POINT CONSISTS OF THREE COMPONENTS, LATENT WITHIN (OM), LATENT BETWEEN (PH) AND ERROR (PS). THE OBJECT IS TO USE PATTERN MATRICES L AND LA TO CONVERT THE 3 X.3 LATENT MATRICES INTO 4 X 4 MATRICES OF OBSERVED VALUES. THE ERROR MATRICES ARE ALWAYS 4 X 4 MATRICES OF MEASUREMENT ERRORS OF THE OBSERVED VALUES. THE NOMENCLATURE USED IN THIS PROGRAM IS THE SAME AS THAT IN THE PROGRAM IN APPENDIX 3. 
REFER TO APPENDIX 3 FOR DEFININTIONS. l. SEED IS ANY RANDOM NUMBER USED TO CREATE RANDOM VALUES FROM A RANDOM GENERATOR (NORMAL). USED IN STUDY lQQ_§EQy£S - 26298, 27309, 49329, 93369, AND 181449 3. RAT IS A Z X 2 MATRIX OF THE NUMBER OF STUDENTS IN THE GROUPS. NOl HAS THE NUMBER OF SUBJECTS IN GROUPS - N02 HAS THE NO OF GROUPS OF THAT SIZE. FOR UNBALANCED 100 GROUPS: FOR BALANCED 100 GROUPS: PAT-10 20/20 20/30 20/40 20/50 20; | PAT-3O 100; 4. QM IS THE PARAMETER OF THE WITHIN COVARIANCE MATRIX OF THE POPULATION. 5. 23 IS THE PARAMETER OF THE BETWEEN COVARIANCE MATRIX OF THE POPULATION. 6. PSI IS THE PARAMETER OF THE WITHIN ERROR COVARIANCE MATRIX OF THE POPULATION. 6. 2S2 IS THE PARAMETER OF THE BETWEEN ERROR COVARIANCE MATRIX OF THE POPULATION . .0 .0 .0 .0 0a - 25 10 15/ 10 20 10/ 15 10 35; PH - 64 8 40/ 8 5 7/ 40 7 107; Ps1 - 5 0 0 0/ 0 6 0 0/ 0 0 11 0/ 0 0 Ps2 - 7 0 0 0/ 0 8 0 0/ 0 0 10 0/ 0 0 0 0 F‘P‘ ran: 115 .0 -0 -0 -0 U0 .0 .0 -0 -0 -0 -0 ‘0 .0 .0 .0 .0 .0 .0 .0 .0 .0 .0 .0 .0 .0 ‘0 .0 .0 ‘0 .0 .0 '0 .0 .0 ‘0 -0 116 PRINT OM PH P81 P82 ; L - 1 0.5 0.5/ 1 0.5 -0.5/ 1 -0.5 0.5/ 1 -0.5 -0.5; LA- 1 0.5 0.5/ 1 0.5 -0.5/ 1 -0.5 0.5/ 1 -0.5 -0.5- EOM - L#0M*L'+PSl ; EPH - LA*PH*LA'+PS2 ; PRINT EOM EPH ; 'A' * SECTION 1 - PART 2 * * FOUR DIFFERENT VECTORS OF DATA ARE NEEDED, ONE FOR PH, ONE FOR OM * ONE FOR PS1 AND ONE FOR P82. THESE ARE INDEPENDENT RANDOM VARIABLES. * FOR LATER USE THE CHOLESKYS OF OUR PARAMETER MATRICES ARE NEEDED. 
* CHOLOM - HALF(OM) ; CHOLPH - HALF(PH) ; CHOLPSI - HALF(PSl) ; CHOLPS2 - HALF(P52) ; BEGIN: CIRCLE-CIRCLE+1 ; A - J.(21700,1,0); I-l; L: A(I,l)-NORMAL(SEED); I-I+1; IF I<- 21700 THEN GO TO L; z - a(l:3000,1)||a(3001:6000,1)||a(6001:9000,l) ; 21- a(21001:21100,1)||a(21101:21200,1)||a(21201:21300,1) ; 22-a(9001:12000,l)||a(12001:15000,l) ||a(15001:18000,1)||a(18001:21000,1) ; Z3-a(21301:21400,l)l|a(21401:21500,1) ||a(21501:21600,1)||a(21601:21700,1) TOTMI-NROW(Z)-l TOTMIG-NROW(Zl)-l * * SECTION 1 - PART 3 * * BY MULTIPLYING RANDOM DATA FROM A POPULATION WITH MEAN 0 AND VARIANCE; * OF 1 BY THE CHOLESKY OF A MATRIX, A VECTOR IS CREATED WHICH WILL ; * RECREATE THAT MATRIX. ; * ; Y - Z * CHOLOM , Yl- Zl * CHOLPH , Y2- 22 * CHOLPSl ; Y3- 23 * CHOLPS2 ; * * SECTION 1 - PART 4 'k * BY MULTIPLYING VECTORS Z AND 21 TO L AND LA, THE OBSERVED VALUES FOR * FOR EACH INDIVIDUAL ARE CREATED. INSTEAD OF THREE MEASURES PER 117 * INDIVIDUAL THERE WILL BE FOUR. (THE ERROR MATRIX WAS CREATED IN * X - Y * L' , X1- Y1 * LA' , Xl-XI+Y3 ; X2-X +Y2 ; FOR EACH INDIVIDUAL. VARIANCE COMPONENTS. I'M-#0363639!!- MMhO; II-l; NN-I; JJ: HHFHH+1; CC-J.(PAT(II,1),1,1); DD-(CC @ X1(NN,)) ; YYldYYl//DD ; NN-NN+1 ; NKFNK//PAT(II,1) ; IF MM LT PAT(II,2) THEN GO TO JJ “MFG; II-II+1 IF NN LT PAT(+,2) THEN GO TO JJ FREE MM NN II X1 Y Y1 FREE A OM PH P81 P82 TOTMIG RD—X+YY1+Y2 FIN— (RD ' *RD)#/TOTMI FREE X Y2 FIN YYl TOTMI Z 21 Z2 Z3 END OF SECTION 1 FOR THE FETCH COMMAND) 03000360000003!- *PROC MATRIX *FETCH RD *FETCH LA *FETCH L *FETCH NK * * SECTION 1 - PART 5 BY ADDING VECTORS X1 AND X2 TOGETHER, A TOTAL SCORE IS ACHIEVED THESE SCORES OBVIOUSLY CONTAIN THE FOUR ALL INDIVIDUALS IN EACH GROUP RECEIVE THE SAME GROUP VECTOR (X1) AND A DIFFERENT VALUE FROM X2. I Z 21 Z2 CC DD AT THIS POINT IT BECOMES IMPORTANT TO REALIZE THAT ALL THE LINES ABOVE DEAL ONLY WITH CREATING THE DATA FOR THIS ANALYSIS. THEY CAN BE DROPPED IN USING THE EM ALGORITHM. 
TO USE THE REST OF THE PROGRAM WITHOUT THE PRIOR LINES, THE FOLLOWING LINES MUST BE PLACED AT THE TOP OF THE PROGRAM (REMOVING THE * FROM THE FRONT - SEE SAS * TERMS OF ERRORS FOR EACH OBSERVED VARIABLES AND IS ALREADY 4 X 4.); .0 .0 .0 .0 .0 -0 ‘0 .0 .0 ‘0 '0 .0 .0 .0 .0 .0 -0 .0 118 SECTION 2 - PART 1 THIS SECTION USES THE EM ALGORITHM TO GET ESTIMATES OF THE UNRESTRICTED MODEL. THE BETWEEN AND WITHIN VARIANCE-COVARIANCE MATRICES ARE ESTIMATED WITH NO STRUCTURE APPLIED. THIS FIRST PART * TURNS OUT THE SUFFICIENT STATISTICS FOR THE SAMPLE DATA NEEDED * IN PART 2 AND IN PART 3. * ;ZEROI - J.(NROW(L),NROW(L),0) ; ZERO2 - J.(NCOL(L),NCOL(L),0) ; ZER03 - J.(NCOL(LA),NCOL(LA),0) ; ZEROl - DIAG(ZEROI) ; ZER02 - DIAG(ZER02) ; ZER03 - DIAG(ZERO3) ; KPNROW(NK) ° *NO OF CLASSES - K P-NCOL(RD) *NO OF OBSERVED VARIABLES - P NhNK(+,) *NO OF TOTAL INDIVIDUALS - N GRP-J.(NROW(NR),1,1) *VECTOR OF 1'8, X X 1 S-NCOL(L) *NO OF LATENT CLASS VARIABLES - S RFNCOLZEROl , PSlD-PSlD<>ZEROl , PHI - PHl-DIAG(PH1)+PH1D , PSI - PSl-DIAG(PSl)+PSlD , FREE PSl Q W PHl PHlD PSlD G E F EE , IF BUDDY GT 25 THEN GO TO FINAL; IF MAX(ABS(ONE1)) LT 0.01 AND MAX(ABS(TWOl)) LT 0.0l THEN GO TO FINAL ELSE GO TO BUD FINALzPRINT BUDDY PHI PSI FREE PSl ONEl TWOl U BUDDY * END OF SECTION 2 SECTION 3 - PART 1 I'M-0*!- * THIS SECTION USES THE EM ALGORITHM TO GET ESTIMATES OF THE RESTRICTED MODEL. PH, OM, PSl AND P52 ARE ESTIMATED WITH STRUCTURE * APPLIED TO THE MODEL. THIS FIRST PART MAKES USE OF PHI AND PSI FROM * THE LAST SECTION TO GET OPENING ESTIMATES OF PH, OM, PSl AND PS2. 3'. .0 .0 ‘0 -0 .0 -0 .0 .0 -0 -0 -0 120 Y1-INV(L'*L) Y2-INV(LA'*LA) PHin*L'*PHI*L*Yl OM-Y2*LA'*PSI*LA*Y2 PSl-PSI-LA*OM*LA' PS2-PHI-L*PH*L' *NOTE 'THESE ARE THE STARTING VALUES IN THIS STEP' ; *PRINT PH OM PSl ; * * SECTION 3 - PART 2 'A' * THIS PART CREATES THE CONDITIONAL VALUES FOR THE MEAN AND GROUP * EFFECT FOR PH, OM, PSI AND PS2. THERE ARE FOUR IMPORTANT VARIABLES * CREATED HERE. THEY ARE CREATED IN SUBROUTINES BETAB AND BETAU. 
* THIS Is PART OF THE INTERATIVE LOOP, THE E STEP. * BUDDYI-O BUD1: MMP(LA*OM*LA'+PSI) AA-(L*PH*L'+P82) BUDDYl-BUDDY1+1 MPINV(MM) ; *INV OF WITHIN VARIANCES IF NROW(PAT) EQ 1 THEN LINK BETAB ELSE LINK BETAU .0 .0 .0 '0 .0 .0 P X P * SECTION 3 - PART 3 * * THIS PART CALCULATES THE MAXIMUM LIKELIHOOD VALUES FOR PH OM PSI P82 * USING THE DATA AND THE CONDITIONAL VARIABLES (M-STEP). THIS PROGRAM * WILL KEEP LOOPING TO THE LAST PART UNTIL THE DIFFERENCES IN PH, OM, * PSI AND PS2 AND THEIR NEW ESTIMATES ARE LESS THAN .01. * DO I-l TO K ; C-(P*I)-P+1 ; D-P*I LV-PH*L' *Q(C: D, )*(YM(C: D, ) -U) ; 'I'H-T'W/Lv ; END ; FREE LV ; CVE- J(P,P,0) ; BVE- J(P,P,0) ; DO I-l TO K ; C-(P*I)-P+l ; D-P*I ; CC-(S*I)-S+l ; DD-S*I ;. Z-LiTH(CC: DD, )+U AVE-NR(I, )*M*(SS(C: D, )- (YM(C: D, )*Z' )- (Z*YM(C: D, )' )+(Z*Z' ))*M ; BVE-BVE+(Q(C: D, )*( (YM(C: D, ) -U) * (YM(C: D, ) -U)' +W)*Q(C: D, )- -Q(C:D,) ); CVE-I#/NK(I,)*( Q(C:D,)*W*Q(C:D,)-Q(C:D,) ) +CVE +AVE ; END ; E-(N-K)*M ; ONE-((1#/K)*(PH*L'*BVE*L*PH)) ; TWO-((1#/N)*OM*LA'*(CVE-E)*LA*OM) ; THREE-DIAG((l#/K)*(PSZ*BVE*PSZ)) ; 121 FOURPDIAG((l#/N)*(PSl*(CVE-E)*PSl)) PH-PH+ONE OMhOM+TWO PSl-DIAG(PSl+FOUR) PSZ-DIAG(P82+THREE) PHD-DIAG(PH)<>ZER02 OMD-DIAG(OM)<>ZERO3 PSAFDIAG(PSl)<>ZEROl PSB—DIAG(P82)<>ZEROI PH-PH-DIAG(PH)+PHD OMFOM-DIAG(OM)+OMD PSl-PSl-DIAG(PSl)+PSA PSZ-PSZ-DIAG(PS2)+PSB FREE Q W CVE AVE BVE TH IF BUDDYl GT 250 THEN GO TO FINALl IF MAX(ABS(ONE)) LT 0.01 AND MAX(ABS(TWO)) LT 0.01 AND MAX(ABS(THREE)) LT 0.01 AND MAX(ABS(FOUR)) LT 0.01 THEN GO TO FINALl ELSE GO TO BUDl FINALl: PRINT BUDDYl PH OM P81 P82 FREE NK RD YM SS N K R 8 IF CIRCLE LT 3 THEN GO TO BEGIN PRINT SEED STOP * * HERE ARE THE SUBROUTINES * * ALPHAU * ALPHAU: TOT -J(P,P,0) DO I-1 TO K T1 -INV((PSI*(1#/NR(I,)))+PHI) TOT -TOT+T1 Q -Q//'1'1 END W -INV(TOT) U -W*Q'*YM FREE T1 TOT RETURN * * ALPHAB * ALPHAB: T1 -INV((PSI*(1#/NK(1,)))+PHI) Q -GRP @ T1 W -INV(T1*K) U -W*Q'*YM FREE T1 RETURN * * BETAU * BETAU: W - J(P,P,O) DO I-l TO K ‘0 -0 .0 .0 .0 .0 ‘0 -0 -0 .0 .0 
   A=INV(AA+(MM*(1#/NK(I,)))) ;
   Q=Q//A ;                        *MATRIX OF Q - KP X P ;
   W=W+A ;
END ;
FREE A ;
W=INV(W) ;                         *COND VAR FOR U ;
U=W*Q'*YM ;                        *COND U ;
RETURN ;

* BETAB ;
BETAB: A=INV(AA+(MM*(1#/NK(1,)))) ;
Q=GRP@A ;
W=INV(A*K) ;
U=W*Q'*YM ;                        *COND U - P X P ;
RETURN ;

BIBLIOGRAPHY

Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis. John Wiley and Sons, New York.

Anderson, T. W. and Rubin, D. B. (1956). Statistical inference in factor analysis. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability (Neyman, ed.), Vol. V, University of California, Berkeley, California.

Ahrens, H. (1978). MINQUE and ANOVA estimator for one-way classification - a risk comparison. Biometrical Journal, 19, 535-556.

Ahrens, H., Kleffe, J. and Tenzler, R. (1981). Mean square error comparison for MINQUE, ANOVA and two alternative estimators under the unbalanced one-way random model. Biometrical Journal, 23(4), 323-342.

Baum, L. E., Petrie, T., Soules, G. and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Annals of Mathematical Statistics, 41, 146-147.

Bock, R. D. (1960). Components of variance analysis as a structural and a discriminal analysis for psychological tests. British Journal of Statistical Psychology, 13, 151-163.

Bock, R. D. and Bargmann, R. B. (1966). Analysis of covariance structures. Psychometrika, 31, 507-534.

Brown, M. L. (1974). Identification of the sources of significance in two-way tables. Applied Statistics, 23, 405-413.

Burstein, L., Linn, R. L. and Capell, F. J. (1978). Analyzing multilevel data in the presence of heterogeneous within-class regressions. Journal of Educational Statistics, 3(4), 347-383.

Burt, Cyril (1947). Factor analysis and analysis of variance. British Journal of Psychology, Statistical Section, 1, 3-26.

Chatterjee, S. K. and Das, K. (1983). Estimation of variance components for one-way classification with heteroscedastic error. Calcutta Statistical Association Bulletin, 32, 57-78.

Cronbach, L. J., with the assistance of J. E. Deken and N. Webb (1976). Research on classrooms and schools: formulation of questions, design, and analysis. Occasional paper, Stanford Evaluation Consortium.

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1), 1-38.

Hartley, H. O. (1958). Maximum likelihood estimation from incomplete data. Biometrics, 14, 174-194.

Hartley, H. O. and Hocking, R. R. (1971). The analysis of incomplete data. Biometrics, 27, 783-808.

Healy, M. and Westmacott, M. (1956). Missing values in experiments analysed on automatic computers. Applied Statistics, 5, 203-206.

Howe, W. G. (1955). Some contributions to factor analysis. Report No. ORNL-1919. Oak Ridge National Laboratory, Oak Ridge, Tennessee.

Henderson, C. R. (1953). Estimation of variance and covariance components. Biometrics, 9, 226-252.

Joreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32, 443-482.

Joreskog, K. G. (1966). Testing a simple structure hypothesis in factor analysis. Psychometrika, 31(2), 165-178.

Joreskog, K. G. (1969). Efficient estimation in image factor analysis. Psychometrika, 34, 51-75.

Joreskog, K. G. (1970). A general method for analysis of covariance structures. Biometrika, 57, 239-251.

Joreskog, K. G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36(2), 109-133.

Keesling, J. W. and Wiley, D. E. (1974). Regression models for hierarchical data. Paper presented at the Annual Meeting of the Psychometric Society, Stanford University.

Kleffe, J. (1977). Best unbiased estimators for variance components with application to the unbalanced one-way classification. [Journal title illegible], 32, 793-804.

La Motte, L. R. (1973). Quadratic estimation of variance components. Biometrics, 29, 311-330.

La Motte, L. R. (1976). Invariant quadratic estimators in the random one-way ANOVA model. Biometrics, 32, 793-804.

Lawley, D. N. (1958). Estimation in factor analysis under various initial assumptions. British Journal of Statistical Psychology, 11, 1-12.

Lord, F. M. and Novick, M. R. (1968). Statistical Theories of Mental Test Scores. Addison-Wesley Publishing Company Inc., Massachusetts.

Morrison, D. F. (1967). Multivariate Statistical Methods. McGraw-Hill, New York.

Rao, C. R. (1971). Estimation of variance and covariance components - MINQUE theory. Journal of Multivariate Analysis, 1, 257-275.

Rao, C. R. (1972). Estimation of variance and covariance components in linear models. Journal of the American Statistical Association, 67, 112-115.

Raudenbush, S. W. (1986). Applications of a Hierarchical Linear Model in Educational Research. Unpublished doctoral dissertation, Graduate School of Education, Harvard.

Schmidt, W. H. (1969). Covariance structure analysis of the multivariate random effects model. Unpublished doctoral dissertation, University of Chicago.

Searle, S. R. (1971). Topics in variance components estimation. Biometrics, 27, 1-76.

Spearman, C. (1904). General intelligence, objectively determined and measured. American Journal of Psychology, 15, 201-293.

Tiao, G. C. and Tan, W. Y. (1965). Bayesian analysis of random-effect models in the analysis of variance, I. Posterior distributions of variance components. Biometrika, 52, 37-53.

Welch, B. L. (1937). The significance of the difference between two means when the population variances are unequal. Biometrika, 29, 350-362.

Wiley, D. E. (1967). Analysis of covariance structures. N.S.F. Research Grant Application.

Wiley, D. E., Schmidt, W. H. and Bramble, W. J. (1973). Studies of a class of covariance structure models. Journal of the American Statistical Association, 68, 317-323.

Williams, E. R., Radcliffe, D. and Speed, T. P. (1981). Estimating missing values in multi-stratum experiments. Applied Statistics, 30, 71-72.

Wisenbaker, J. (1980). Structural equation model applied to hierarchical data. Unpublished doctoral dissertation, Department of Education, Michigan State University.

Wu, C. F. (1983). On the convergence properties of the EM algorithm. Annals of Statistics, 11, 95-103.