This is to certify that the dissertation entitled

Two Level Nested Hierarchical Linear Model with Random Intercepts via the Bootstrap

presented by Joshua Gisemba Bagaka's has been accepted towards fulfillment of the requirements for the Ph.D. degree in Counseling, Educational Psychology & Special Education (Statistics & Research Design).

Major professor
Date: March 1992


TWO LEVEL NESTED HIERARCHICAL LINEAR MODEL WITH RANDOM INTERCEPTS VIA THE BOOTSTRAP

by

Joshua Gisemba Bagaka's

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology and Special Education

1992


ABSTRACT

TWO LEVEL NESTED HIERARCHICAL LINEAR MODEL WITH RANDOM INTERCEPTS VIA THE BOOTSTRAP

by

Joshua Gisemba Bagaka's

In statistical linear models, most procedures available for estimating the variance components of the mixed model are based on the assumption that the error terms and each set of random effects in the model are normally distributed with zero means and some variance-covariance structure. However, in certain research situations, there is little doubt that the error terms and each set of random effects in the mixed model can be characterized as moderately or even distinctly non-normal, with heavy tails or badly skewed distributions.
Efron (1979) discussed the use of a technique called the bootstrap to generate sampling distributions of statistics and thereby to draw inferences about parameters without requiring any distributional properties. Besides the fact that the bootstrap liberates statisticians from over-reliance on distributional assumptions, the method makes it possible to attack more complicated problems which may not have closed-form expressions. This study utilized the bootstrap procedure to estimate the sampling distributions of estimators and their standard errors, and thereby to set confidence intervals about the parameters of a mixed HLM under a variety of conditions. Applicability of the bootstrap to data originating from real research situations was demonstrated through the estimation of the effects of school, classroom, and teacher variables on teachers' self-efficacy. Based on the usual MINQUE and bootstrap estimators, the study showed that the success of estimation is typically affected by the nature and size of the tails of the distribution of the errors and sets of random effects parameters of the model. The bootstrap generally followed MINQUE quite closely in estimating the fixed and random effects of the model under both the normal and double exponential distributions. Particularly in estimating the population inter-class variance τ² at the 0.01 level of the intra-class correlation, the bootstrap was surprisingly closer to the parameter value than the MINQUE. Because the bootstrap procedure is highly dependent on the computer, the study recommended that software to implement the bootstrap algorithm be developed to make the method available to research practitioners. Availability of the method to research practitioners will provide an important and flexible tool, typically unavailable through classical methods, for estimating the sampling distributions of statistics and their standard errors, and thereby setting confidence intervals about parameters.
Copyright by
JOSHUA GISEMBA BAGAKA'S
1992

This dissertation is dedicated to my parents Ludiah and Andaraniko Bagaka and to my sister Milcah Bosibori, who has been fighting hard for her life during the period of this dissertation.

ACKNOWLEDGEMENT

This dissertation would not have been realized without the direct and indirect assistance and encouragement of many people. Very deep appreciation goes to Dr. Stephen Raudenbush, my advisor, friend, and chair of my committee, for his support, direction, and encouragement throughout my doctoral studies. Very sincere gratitude goes to Dr. James Stapleton, my master's program advisor and member of my doctoral committee, whose special friendship, encouragement, endless patience, and advice fostered the congenial atmosphere in which my entire graduate studies were undertaken. Very grateful acknowledgement also goes to the other members of my committee, Dr. Betsy Becker and Dr. James Gaveleck, for their support, prudent guidance, and encouragement. The writer also wishes to extend a special acknowledgement to Cathy Sparks for her tireless service in typing this dissertation. I wish to express my deep gratitude to the members of my family: to my children Cliff Onserio, Kathrine Kerubo, Edward Makoyo, and my wife Hellen Moraa, who sacrificed their time and enjoyment; to my parents, Ludiah and Andaraniko Bagaka, for the manner in which they raised me, their belief in me, and their endless patience, love, and support; to my parents-in-law, Bathseba and Sospeter Mariera, for their continuous love; to my brothers and sisters, Musa Nyakeri, Jeliah Mongina, Yunuke Nyamunsi, Milcah Bosibori, Sarah Makori, Anna Kwamboka, Dorika Tabitha, Isabela Kemunto, and Henry Nyabuto, for their special love and encouragement; to all my brothers- and sisters-in-law; and to the entire extended Bagaka clan and family.
Very special acknowledgment goes to my family friends, Ogega and Phoebe Mokogi Omete and their children; my nephew and friend, Thomas Ogega Orenge; Linda and Richard Solomon and their children; my sincere friends, David Wafula Makanda, the Mwakikotis, the Saginis, Zablon Oonge, Roselyne and Zipporah Chisnell, Samuel Muga, the Mapatis, Annie Woo, Zora Ziazi, the Navarros, Vicky Lutherford, Sohed, Kamuyu-wa-Kangethe, John Lelei, and the entire Kenyan family in the greater Lansing area for their family-like support and friendship. I also wish to extend a special acknowledgment to my cousin Charles Nyakeri; my nephew and role model, Dr. Benson Mochoge; and all members of the Greater Gionseri community for their encouragement and support. To Chris Obure, Nyaundi, Oigara, Paul Sikini, my brother-in-law Peter Omwoyo, the 1983 NHS and NASS staff, and all friends and relatives who rendered their encouragement as well as moral and financial support: their support is deeply appreciated by the writer, his family, and the people of Kenya. May God bless everyone. ASHANTE SANA.

TABLE OF CONTENTS

List of Tables
List of Figures

CHAPTER
I. INTRODUCTION
    Dependence on the Normal Assumptions
    Negative Variance Estimates
    Purpose of the Study
    Objectives of the Study
    Summary
II. THE MULTILEVEL MODEL AND ESTIMATION
    Introduction
    The Multilevel Model
    The Two-level Hierarchical Linear Model with Random Intercepts
    Model Assumptions
    Variance Component Estimation
        Balanced Design
        Unbalanced Design
    MINQUE for Two-level HLM with Random Intercepts
    Choice of Weights
    Summary
III. THE BOOTSTRAP METHOD
    Introduction
    Nonparametric Bootstrap
    The Bootstrap Estimate of Bias
    The Bootstrap Confidence Intervals
        The t-method
        Percentile Method
    Correction for Bias in Bootstrap Estimation
IV. DESIGN OF THE STUDY
    Introduction
    Generation of Data
    Study Design and Parameter Values
    Implementation of the Bootstrap using MINQUE
    Computer Programs
V. APPLICATION OF BOOTSTRAP AND MINQUE: HIGHER ORDER TEACHING (HOT)
    Introduction
    Description of Data and Variables
    The Model Statements
    Estimation Procedures
    Results of Estimation
VI. SIMULATIONS AND BOOTSTRAP RESULTS
    Overview
    Results of Estimation Procedures
    Results of Bootstrap Confidence Intervals
    Accuracy of Bootstrap Confidence Intervals
VII. SUMMARY, TECHNICAL DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS
    Overview
    Summary
        Teachers' Self-Efficacy Model
        Simulated Models
            Inter-class variance (τ²)
            Intra-class variance (σ²)
            Intra-class correlation (ρ)
            Fixed effects parameters (α₁, α₂, α₃)
            Coefficient of the covariate (β)
    Technical Discussions
    Conclusions
    Recommendations
        Recommendations for Further Research
APPENDICES
    A. Summary of Computational Formulae
    B. SAS/IML Computer Programs
BIBLIOGRAPHY

LIST OF TABLES

Table
4.1 Design Factor Combination Trials
5.1 Bootstrap and MINQUE Estimates of the Effects of Type of Subject and School Climate Variables on the Teachers' Perceived Self-Efficacy
6.1 Average and Standard Deviation of the Functions of the Estimates τ̂² and/or τ̂²* under the Normal and Double Exponential Errors and Sets of Random Effects for ρ = 0.01, 0.05, and 0.20
6.2 Average and Standard Deviation of the Functions of the Estimates σ̂² and/or σ̂²* under the Normal and Double Exponential Errors and Sets of Random Effects for ρ = 0.01, 0.05, and 0.20
6.3 Average and Standard Deviation of the Functions of the Estimates ρ̂ and/or ρ̂* under the Normal and Double Exponential Errors and Sets of Random Effects for ρ = 0.01, 0.05, and 0.20
6.4 Average and Standard Deviation of the Functions of the Estimates α̂₁ and/or α̂₁* under the Normal and Double Exponential Errors and Sets of Random Effects for ρ = 0.01, 0.05, and 0.20
6.5 Average and Standard Deviation of the Functions of the Estimates α̂₃ and/or α̂₃* under the Normal and Double Exponential Errors and Sets of Random Effects for ρ = 0.01, 0.05, and 0.20
6.6 Average and Standard Deviation of the Functions of the Estimates β̂ and/or β̂* under the Normal and Double Exponential Errors and Sets of Random Effects for ρ = 0.01, 0.05, and 0.20
6.7 Average and Standard Deviations of the Bootstrap Confidence Limits and the Width of the Confidence Intervals about the Six Parameters of the Model under the Normal and Double Exponential for ρ = 0.01
6.8 Average and Standard Deviations of the Bootstrap Confidence Limits and the Width of the Confidence Intervals about the Six Parameters of the Model under the Normal and Double Exponential for ρ = 0.05
6.9 Average and Standard Deviations of the Bootstrap Confidence Limits and the Width of the Confidence Intervals about the Six Parameters of the Model under the Normal and Double Exponential for ρ = 0.20
6.10 Percentage of Times that the True Population Parameter Fell Within the Confidence Intervals Formed Using the Bootstrap Procedure at the Three Levels of the Intra-class Correlation

LIST OF FIGURES

Figure
5.1 Percentage polygons for the bootstrap estimates of the inter- and intra-teacher variances and the intra-teacher correlation for the teachers' self-efficacy prediction model
5.2 Percentage polygons for the bootstrap estimates of the effects of Mathematics, Science, English, and Social Science on the teachers' self-efficacy
6.1 Percentage polygons for the MINQUE and bootstrap estimates of τ² over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20
6.2 Percentage polygons for the MINQUE and bootstrap estimates of σ² over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20
6.3 Percentage polygons for the MINQUE and bootstrap estimates of ρ over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20
6.4 Percentage polygons for the MINQUE and bootstrap estimates of α₁ over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20
6.5 Percentage polygons for the MINQUE and bootstrap estimates of α₃ over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20
6.6 Percentage polygons for the MINQUE and bootstrap estimates of β over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20
6.7 Percentage polygons for the relationship between the distributions of the functions D = τ̂² − τ² and D* = τ̂²* − τ̂² for ρ = 0.01, 0.05, and 0.20
7.1 Percentage polygons for the distributions of R and R* representing the ratios of the estimates of the random parameters τ², σ², and ρ
7.2 Percentage polygons for the distributions of R and R* representing the ratios of the estimates of the fixed parameters α₁, α₃, and β

CHAPTER I
INTRODUCTION

Estimation is frequently based on subpopulations which can be combined collectively into one underlying population. For instance, educational performance can be examined through a sample from several schools in a nation or state. It is quite natural to estimate the mean achievement and the spread of students' achievement in each school.
Yet groups such as school district personnel may be interested in knowing the achievement of students in their school district relative to the national average, while the school principal may be interested in the performance of the school relative to the statewide or national average performance. Several other interest groups (parents, teachers, education ministers) may have interest in different "levels" of the data, making it necessary to examine the data in stages. Recently, methodologists (Aitkin and Longford, 1986; Burstein, Linn, & Campbell, 1978; Burstein & Miller, 1980; Goldstein, 1986; Raudenbush & Bryk, 1986) have developed techniques to address studies involving data which have a hierarchical character. Most of these studies have been conducted under the assumption that the observations are independent and normally distributed. Mason, Wong, and Entwisle (1983) and Raudenbush (1984) formulated similar mathematical models for hierarchical data, with, say, students in a school as the "micro units" of analysis and schools as the "macro units" of analysis. The resulting within- and between-macro-unit models were based on the usual independence and normality assumptions. Since the manner of obtaining data typically affects the inferences that can be drawn from such models, we consider a sampling process in which the "macro" units are randomly drawn from a population before a random sample of "micro" units is drawn from each "macro" unit. The resulting data are thus associated with two random components (the between- and within-macro-unit variance components), and the model is correspondingly called the random effects model. Many situations arise where "macro" units are nested within some fixed factors (not drawn randomly from a population), together with other micro fixed effects and covariate(s), resulting in a mixed model with both fixed and random effects.
Analysis of variance is traditionally employed in situations involving fixed effects models to estimate the fixed effects parameters. Although statistical procedures are available for estimating the variance components of the random parts of the mixed model, these procedures have several limitations, which take one or more of the following forms. First is the problem of unbalanced designs (unequal subclass numbers). Estimating variance components from unbalanced data is not as straightforward as from balanced data (Searle, 1971). Second, estimating variance components often involves relatively cumbersome algebra, which makes it difficult for most methods to estimate model parameters when covariate(s) are involved as part of the fixed factors. Third is the problem of negative estimates of the variance components, and last but not least is the problem that most variance component estimation procedures are based on the assumption that the random error terms and sets of random effects are normally distributed. The limitation posed by unbalancedness is certainly clear, since balanced designs are rare in research situations. Thus, procedures for estimating variance components that are limited to balanced designs may not be very useful. As for cumbersome algebra and negative estimates, no one method has yet been clearly established as superior, either in minimizing the amount of computation required to estimate the variance components or in obtaining non-negative estimates of the variance components. These are situations in which attempts can be made to minimize, but not necessarily to overcome, the problem. Most procedures available for estimating the variance components of the mixed model are based on the assumption that the error terms and each set of random effects in the model are normally distributed with zero means and some variance-covariance structure.
Then, for the balanced random component model, it can be shown that the sums of squares in the analysis of variance are distributed independently of each other, and that each sum of squares divided by the expected value of its mean square has a central chi-square distribution with the corresponding degrees of freedom (Searle, 1971). This holds true only for the random component model. For the mixed model, this distributional property holds only for those sums of squares whose expected values are not functions of fixed effects; the corresponding ratios for sums of squares that do involve fixed effects will have non-central chi-square distributions. Thus, the normality assumption for the error term and each set of random effects in the model is the basis of the distributional properties of variance component estimators, on which most variance component estimation procedures are based. However, experience has shown that in certain research situations there is little doubt that the error terms and each set of random effects in the mixed model can be characterized as moderately or even distinctly non-normal. For example, educationally oriented variables such as the number of days absent from school, the number of times a student answers a question (or talks in class), and many other variables are likely to produce non-normal data that are heavy-tailed or badly skewed. Thus the results of statistical methods based on the Gaussian assumptions may not always be reliable. Approaches are available for dealing with non-normality. Most involve transforming the original data to a form more closely resembling a normal distribution such that normal theory methods can be applied (Box and Cox, 1964). Efron (1982) examined a family of six transformations and cautioned against uncritical use of normality as a criterion for successful transformation. Perhaps variance stabilizing transformations may be preferable.
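The Box-Cox approach mentioned above can be made concrete. The following is a minimal numpy sketch of selecting a Box-Cox power λ by maximizing the profile log-likelihood over a grid; the skewed sample, the grid, and the seed are all illustrative and not taken from this study:

```python
import numpy as np

def boxcox_transform(x, lam):
    # Box-Cox power transform; lambda = 0 is the log transform
    return np.log(x) if abs(lam) < 1e-8 else (x**lam - 1.0) / lam

def profile_loglik(x, lam):
    # Profile log-likelihood of the Box-Cox model at a given lambda
    y = boxcox_transform(x, lam)
    n = len(x)
    return -n / 2.0 * np.log(y.var()) + (lam - 1.0) * np.log(x).sum()

rng = np.random.default_rng(0)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=200)  # positive, right-skewed

# Grid search for the lambda that maximizes the profile likelihood
grid = np.linspace(-2.0, 2.0, 401)
lam = float(grid[np.argmax([profile_loglik(skewed, g) for g in grid])])
transformed = boxcox_transform(skewed, lam)
print("selected lambda:", round(lam, 2))
```

For a lognormal sample such as this one, the selected λ should land near 0, i.e. close to a log transformation; the point of the sketch is that the choice of transformation depends entirely on the unknown underlying distribution, which is exactly the difficulty discussed in the text.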
Efron (1982) discusses situations in which it is better to transform to homoscedasticity and ignore non-normality than vice versa. Otherwise, the alternative could be to do a complete analysis to recover the information lost during transformation. However, the practical motivation of transformation theory is to avoid complicated analysis, especially in already complicated situations (Efron, 1982). What complicates the use of transformations even more is the fact that the underlying distribution of the original variable must be known before one decides on the appropriate transformation. In many situations, the underlying distribution of the original data may not be known, and thus appropriate transformation of the data becomes difficult. Rao (1971) proposed the Minimum Norm Quadratic Unbiased Estimator (MINQUE) for variance components, which does not require the normal distribution properties of the error term and each of the sets of random effects. The method is quite general, applicable to most experimental situations, and the computations are relatively simple (Rao, 1971). The MINQUE approach involves estimating a linear function of the variance components using a quadratic function of the observations, with pre-assigned weights in the norm. The MINQUE estimates therefore may vary with the choice of weights. In addressing the problem of dependency on the weights when using MINQUE, Brown (1976) suggested a procedure in which, after calculating a MINQUE estimate as usual, the values therein are used as weights and the MINQUE is calculated again. The process is repeated iteratively until two successive estimates are equal, to some degree of approximation. The method has been named iterative MINQUE, or I-MINQUE (Brown, 1976), and it has been shown that MINQUE and I-MINQUE estimators are asymptotically normal.
However, because the I-MINQUE estimators are obtained iteratively, they do not have the properties used in deriving MINQUE (unbiasedness, translation invariance, and minimum norm), and thus they are not necessarily unbiased or "best" in any sense (Searle, 1979). This study adopted the MINQUE procedure as a useful method of estimating the variance components, since the procedure does not require the normal distributional properties. In addition, perhaps one of the most useful features of the study was the specific manner in which MINQUE was implemented. The study used a crude ANOVA-type estimate of the variance components of the mixed model, as in Hanushek (1974). The values from this prior estimator were used to determine the weights, which were then used in the computation of the MINQUE estimators. However, this does not in any way constitute the focus of the present dissertation. The primary focus is an attempt to liberate statisticians from over-reliance on the normal assumptions in estimating the variance components of a mixed model. Efron (1979) has discussed the use of a technique called the bootstrap to generate sampling distributions of statistics and thereby to draw inferences about parameters without requiring any distributional properties. Although Efron avoids making any general claim for the origin of the name "bootstrap," Efron's examples may suggest to some that it is indeed a technique of "pulling ourselves up by our bootstraps" in a data analysis, that is, for obtaining inferences insensitive to model assumptions (Rubin, 1981). Indeed, the name reflects the fact that one available sample gives rise to many others (Diaconis & Efron, 1983). The development of the bootstrap starts with a sample X = {X₁,...,Xₙ} of n observations. From this sample, a random sample of size n is drawn with replacement, from which a first bootstrap estimate is calculated.
The replicated sample is denoted by X* = {X₁*,...,Xₙ*}, and the bootstrap replicated estimate computed from X* is denoted by θ̂*. The process is repeated a large number B of times, resulting in a sequence θ̂*(b) of estimates for b = 1,...,B. If F designates the unknown distribution of X, then Efron (1979, 1982) argues that the empirical bootstrap distribution F* of X* can provide a very good approximation of F for a wide variety of interesting statistics. The bootstrap, therefore, which is an elaboration of the jackknife invented by Quenouille (1949), provides a general method which can be applied to complicated situations where theoretical analysis is not possible. Under quite general conditions, the bootstrap gives asymptotically consistent results, and for some simple problems which can be analyzed completely, for example ordinary linear regression, it automatically produces results which are comparable to standard solutions (Efron, 1981b). Through a series of examples, Efron has shown that the bootstrap method works reasonably well under a variety of situations. A more detailed discussion of the bootstrap method is offered in Chapter III.

Dependence on the Normal Assumptions

The distributional assumptions imputed to the random error terms and each set of random effects in the mixed model are that they are independent and normally distributed with mean 0 and some variance-covariance structure. But in order to realize the increased flexibility of hierarchical linear models, careful attention needs to be paid to these statistical assumptions (Bryk and Raudenbush, 1987). Though methods are available to assess the degree to which these assumptions are realized in research situations, many researchers proceed with computations under the normal assumptions regardless of whether or not the normality condition is met. However, there are several situations in educational research where hierarchical models may be applicable but the normal assumptions may not be guaranteed.
For example, in a model involving student achievement scores or the number of days absent, there is doubt that the error terms are normally distributed in certain situations. The most common macro unit of analysis for the between-group hierarchical model in education is the school. Often, a random sample of schools is drawn, from which a sample of students is also drawn at random. Schools with different characteristics may be in the sample, resulting in a hierarchical data set in which certain response variables have different distributions for each school. Certain educationally oriented variables, whether at the student, school, or classroom level, may be observed. Yet as mentioned earlier, for some of these variables, under the assumption of random sampling from these subpopulations, there may be doubt about the normality of the population distribution. Some schools may have data sets whose underlying distribution is negatively skewed, others positively skewed, others heavy-tailed, and others may even be normally distributed. Under these conditions, using the standard methods to calculate parameter estimates may not provide good estimates. Attempts to transform data to a form more closely resembling a normal distribution would involve first identifying the underlying original distribution for the variables in each context (e.g., school) before deciding on the most appropriate transformation strategy for each subpopulation. Even if the underlying distributions of the subpopulations were known, transforming variables differently for each "macro" unit may deteriorate into a welter of calculations. In such a situation, therefore, the bootstrap algorithm becomes handy and appropriate, not only to determine the expected values of the estimates without worrying about the distributional properties, but also to estimate the standard errors of the estimates and the empirical distributions of the estimators, thereby setting confidence intervals about the parameters.
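The heavy-tailed case used throughout this study, the double exponential (Laplace) distribution, can be contrasted with the normal by simulation. The sketch below (sample size and seed are arbitrary) scales both distributions to unit variance and compares their kurtosis, which is about 3 for the normal and about 6 for the Laplace:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Both samples are scaled to mean 0 and variance 1:
# a Laplace with scale b has variance 2*b^2, so b = 1/sqrt(2)
normal = rng.normal(loc=0.0, scale=1.0, size=n)
laplace = rng.laplace(loc=0.0, scale=1.0 / np.sqrt(2.0), size=n)

def kurtosis(x):
    # (Non-excess) kurtosis: fourth central moment over squared variance
    return float(np.mean((x - x.mean()) ** 4) / np.var(x) ** 2)

print("normal kurtosis:", round(kurtosis(normal), 2))
print("laplace kurtosis:", round(kurtosis(laplace), 2))
```

Even at equal variance, the Laplace places far more mass in its tails, which is exactly the kind of departure from normality the study investigates.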
Negative Variance Estimates

The usefulness of variance component techniques is frequently limited by the occurrence of negative estimates of essentially positive parameters (Thompson, 1962). Though methods like Restricted Maximum Likelihood (REML) were primarily designed to remove this objectionable characteristic for certain experimental designs, the problem still remains unsolved. Thompson (1962) described an algorithm for solving the problem of negative estimates of variance components for all-random-effects models by considering that their expected mean square column forms a mathematical tree in a certain sense. The algorithm was described as follows: "Consider the maximum mean square in the entire array; if this mean square is the root of the tree then equate it to its expectation. If the minimum mean square is not the root then pool it with its predecessor" (Thompson, 1962, p. 273). In either case the problem is reduced to an identical one having estimates of the variance components. The estimates are non-negative and have a maximum likelihood property. Other methods, like the method of moments, have ways of controlling for the occurrence of negative estimates by simply equating any negative estimate to zero. It is anticipated that the bootstrap method used in this study will provide another useful way of obtaining non-negative estimates of the variance components, particularly when the population inter-class variance is small but positive. For the bootstrap method, the estimate θ̂*(b) of the variance component is computed at each replication b, for b = 1, 2,...,B, where B is a large number. The expected value of the estimate is then estimated by the average over all B replicated values of the estimator. It is anticipated that if the parameter value of the variance component is non-negative, then the sum and average over all the replicated values will be non-negative.
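The replication scheme just described can be sketched in a few lines. The sketch below substitutes an ordinary sample variance for the MINQUE variance-component estimator, and the sample itself is artificial; only the resampling-and-averaging logic mirrors the procedure in the text:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(loc=5.0, scale=2.0, size=50)  # original sample, n = 50

B = 2000  # number of bootstrap replications
theta_star = np.empty(B)
for b in range(B):
    # draw n observations with replacement from the original sample
    resample = rng.choice(x, size=len(x), replace=True)
    theta_star[b] = resample.var(ddof=1)  # replicated estimate for trial b

boot_estimate = theta_star.mean()  # average over all B replicated values
boot_se = theta_star.std(ddof=1)   # bootstrap standard error
# A variance estimate is non-negative in every replicate,
# so the bootstrap average is non-negative as well
print(round(float(boot_estimate), 2), round(float(boot_se), 2))
```

The same loop also yields the empirical distribution of the B replicates, from which percentile confidence limits can be read off directly, which is the use the study makes of it in later chapters.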
In this case, therefore, we view the bootstrap as a means of providing the MINQUE method with B opportunities to prove the positiveness of the estimate of a parameter value which is essentially positive.

Purpose of the Study

The interest of the study lies in a two-level mixed and nested hierarchical linear model (HLM) with random intercepts. The general problem is one of estimating the fixed effects and the variance components (group- and individual-level variances) under several situations, including conditions under which the normality assumption may not be guaranteed. Besides the non-normality problem, the problem of negative estimates of the between-"macro"-unit variance component (especially in the case of boundary situations, when the true variance component is small) is not new to statisticians. In the present study, two different estimators of the model parameters were obtained and compared against each other. The first estimator was the MINQUE based on the original sample. The other was the bootstrap estimate computed through resampling from the sample with replacement.

Objectives of the Study

The study demonstrates the use of the bootstrap in providing estimates of the parameters (fixed and random) of a general two-level mixed and nested hierarchical linear model, determining the standard errors of the estimators and their empirical bootstrap distributions. In addition to demonstrating that the bootstrap algorithm liberates statisticians from over-reliance on the Gaussian assumptions (Diaconis and Efron, 1983), through Monte Carlo simulations this study also
(1) Determines the bootstrap standard errors of the variance components and thereby allows construction of bootstrap confidence intervals about the parameters.
(2) Assesses the performance of the bootstrap method in determining the estimates of the sampling distributions, the standard errors, and the interval estimates of the variance component estimates of the model when the response variable
    a) is normally distributed;
    b) has a distribution with fairly heavy tails (e.g., the double exponential distribution).
(3) Evaluates the bootstrap estimates against the usual MINQUE estimates of the variance components.
(4) Examines the relative accuracy of the bootstrap method in estimating the variance components of the model in the case of boundary situations, particularly when the population inter-class variance is small but positive.
(5) Determines the bootstrap estimates of the fixed effects parameters and their standard errors.

Summary

The present dissertation concentrates on the problem of estimating the parameters of a mixed and nested hierarchical linear model. Chapter II will describe the hierarchical model and the estimation of variance components in balanced and unbalanced designs for models with and without covariate(s). Chapter III will concentrate on the discussion of the bootstrap method. The design of the study will be provided in Chapter IV, and an application of the bootstrap method to estimating model parameters in higher order teaching (HOT) research will be presented in Chapter V. MINQUE and bootstrap Monte Carlo simulation results will be presented in Chapter VI, and conclusions and recommendations set out in Chapter VII.

CHAPTER II
THE MULTILEVEL MODEL AND ESTIMATION

Introduction

It is common in educational research to study the effects of characteristics of the educational group (e.g., school, school district, classroom, province). These group-oriented variables (e.g., school policies, teacher/student ratio, per-pupil spending) may form part of a set of independent variables hypothesized to have an effect on some individual-level outcome variable(s).
For example, student learning activities occur within organizations to which the individual students belong (Burstein, 1980). It is therefore necessary for educational researchers to understand and be able to explain the complex influence of not only individual-level variables but also group-oriented variables on individual student outcome variables. Data in this class of research are typically available at two levels of observation, the individual (or micro) and group (or macro) levels, giving rise to a hierarchical structure of data. Similar arguments apply as well to more complicated nesting situations (students within classrooms, classrooms within schools, schools within school districts, and school districts within states or provinces) without loss of generality (Burstein, 1980). The problem, then, is that of analyzing such multilevel data when certain key independent variables are measured at different levels of an organizational hierarchy. Traditional statistical methods of data analysis like multiple regression and analysis of variance have been found to be ineffective in studies involving such data of hierarchical structure. Methodologists (Burstein, 1980; Burstein & Lin, 1978; Mason, Wong, & Entwisle, 1984; Raudenbush & Bryk, 1986; Raudenbush, 1984) have not only warned against the use of such classical linear models but have also provided general statistical models of investigation when data exist in hierarchical structures. These models are commonly referred to as hierarchical linear models (HLM). Studies of school effectiveness (e.g., Brookover, et al., 1982) have been interested not only in student achievement levels as measured by the average achievement scores but also in overall group achievement or "equity" as measured by the variability (or spread) of achievement scores.
From this viewpoint, more effective schools, for example, not only produce high average achievement scores but also help students of varied backgrounds to achieve mastery (Raudenbush, 1984). The notion of evaluating the effectiveness of schools in achieving "equity" by observing the within-school variability of scores can also be extended to examining the effectiveness of the state, province, or country by observing between-school variability in student achievement scores. Coupled with the fact that the "macro" units (e.g., schools or classrooms) in the study may constitute a random sample of such units from a population, the mixed model conceptualization is certainly appealing. Thus, one important class of investigation in such situations would involve estimation of the variance components (or equity) in addition to examining the effects of other fixed factors in the mixed hierarchical linear models.

The Multilevel Model

The structure of data considered in this multilevel framework is assumed to involve two levels of observations: the individual (micro) level and some higher (macro) level. The structure can be characterized by contexts such as schools or countries as "macro" units of analysis and individual subjects as the "micro" units of analysis. The fundamental assumption underlying this multilevel hierarchical framework is that the micro values of the response variable depend in some way on context and that the effect of the micro determinants may vary as a function of context (Mason, et al., 1983). At the lowest level, some measure of outcome for individual subjects and other individual characteristics may be appropriate. Suppose we begin by posing a within-context model that defines a micro equation with one micro response variable Y and one micro regressor X, identical for each macro unit j, as

(2.1) Y_ij = β_0j + β_1j X_1ij + ε_ij

where j = 1, 2,...,J macro units and i = 1, 2,...,n_j micro units within the macro units.
In this case Y_ij is the response and X_1ij the regressor value for subject i in macro unit j, and ε_ij is the random error term. The usual assumption is that ε_ij is distributed normally with mean zero and variance σ_ε². The micro parameters β_0j and β_1j are assumed to vary across context as a function of some macro regressor variable W. Since β_0j and β_1j are defined for each context, we pose the between-context models using β_0j and β_1j as response variables as

(2.2) β_0j = γ_00 + γ_01 W_1j + e_0j
(2.3) β_1j = γ_10 + γ_11 W_1j + e_1j

where β_0j and β_1j are the intercept and regression slope, respectively, for context j, as defined in Equation 2.1. Both the intercept and the slope are assumed to be random, with e_0j and e_1j as their respective random error terms. It is most common to assume that the error terms e_0j and e_1j are normally distributed with mean zero and variances τ_00 and τ_11, respectively, with the covariance of e_0j and e_1j denoted by τ_01. A single equation for this simple case of a multilevel model can be obtained by substituting Equations 2.2 and 2.3 into Equation 2.1 as

(2.4) Y_ij = γ_00 + γ_01 W_1j + γ_10 X_1ij + γ_11 W_1j X_1ij + (e_0j + X_1ij e_1j + ε_ij).

Although Equation 2.4 involves one micro regressor and one macro regressor, it is quite general in the sense that other models of potential interest can evolve from it (Mason, et al., 1983). For example, a random effects one-way analysis of variance (ANOVA) model can be derived from it by setting to zero all of the coefficients of X_1 and W_1, i.e., γ_01 = γ_10 = γ_11 = e_1j = 0, resulting in the model equation

(2.5) Y_ij = γ_00 + e_0j + ε_ij.

Similarly, a fixed effects regression model is obtained by setting e_0j and e_1j to zero to obtain the model equation

(2.6) Y_ij = γ_00 + γ_01 W_1j + γ_10 X_1ij + γ_11 W_1j X_1ij + ε_ij.

For this study, a hierarchical linear model with random intercepts is considered. Consequently, the random error e_1j associated with the random slope model given in Equation 2.3 is set to zero.
Model Equation 2.4 then reduces to the variance component model given by

(2.7) Y_ij = γ_00 + γ_01 W_1j + γ_10 X_1ij + γ_11 W_1j X_1ij + (e_0j + ε_ij)

which is a mixed model with the term (e_0j + ε_ij) as the random part and (γ_00 + γ_01 W_1j + γ_10 X_1ij + γ_11 W_1j X_1ij) as the fixed part of the model. The fixed part of the model in Equation 2.7 may take a more general form with multiple X's (i.e., X_2ij, X_3ij,...) and W's (W_2j, W_3j,...), which may also include interactions. One of the fixed factors of the model, for example, may be the sector in which the random factor of context is nested. The fixed effects factor (sector) is taken to consist of K levels, bringing the total number of fixed effects parameters (including covariates) to P. In terms of the general linear model matrix notation, and if we allow for any number L of regressor variables, Equation 2.7 can be expressed for the jth context as

(2.8) Y_j = X_j β + Z_j b_j + e_j,  j = 1, 2,...,J

where Y_j is an (n_j x 1) vector of response values; X_j is an (n_j x P) matrix of known constants; β is a (P x 1) vector of unknown fixed effects parameters; Z_j is an (n_j x 1) vector of 1's; b_j is a (q x 1) vector of unknown random effects parameters (here q = 1, a random intercept); and e_j is an (n_j x 1) vector of random error terms.

The Two-level Hierarchical Linear Model with Random Intercepts

The model illustrated thus far reduces to simpler models under specific conditions. The present study involves two factor levels, a fixed and a random factor, where the random effects are nested within fixed effects. Application of this model can be seen in education research with the school background or sector (public, private, or religious) as the fixed factor and individual schools as the random factor. At the lowest level, some measure of outcome, for example academic achievement, may be of interest.
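To make the random-intercept structure concrete, the simplest submodel (Equation 2.5, Y_ij = γ_00 + e_0j + ε_ij) can be simulated directly. The sketch below is illustrative Python rather than the dissertation's SAS/IML code; the values of γ_00, τ², σ_ε², J, and n are chosen arbitrarily, and the design is balanced only for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative values, not the study's design settings.
gamma00 = 50.0    # grand mean
tau2 = 25.0       # between-context variance of e0j
sigma2 = 100.0    # within-context error variance of eps_ij
J, n = 50, 20     # J macro units, n micro units each (balanced only for brevity)

e0 = rng.normal(0.0, np.sqrt(tau2), size=J)          # one e0j per macro unit
eps = rng.normal(0.0, np.sqrt(sigma2), size=(J, n))  # micro-level errors
Y = gamma00 + e0[:, None] + eps                      # Equation 2.5

# Every micro unit in a context shares that context's e0j, so responses
# within a context are correlated (rho = tau2 / (tau2 + sigma2) = 0.2 here)
# even though all error draws are independent.
```

Because e_0j is shared within a context, the simulated responses exhibit exactly the within-group dependence that motivates the multilevel treatment discussed later in the chapter.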
Other school characteristics (teacher-student ratio, school financial means, school policy, inner city or suburban location) together with student characteristics (socioeconomic status, IQ) may be included in the model as covariates. An example of the use of "micro" and "macro" variables can be found in Mason et al. (1983), who used a model for a multilevel analysis of the determinants of children born in fifteen less-developed countries. In this study, which was part of the Michigan Comparative Fertility project, Mason et al. (1983) used countries as the macro units of analysis, while married respondents served as the micro units of analysis. Some of the macro variables used to define the context within which individual childbearing took place included socioeconomic development, family planning program effort, and per capita gross national product. The micro specifications used included contraceptive use patterns and the wife's education level.

Model Assumptions

Two levels of distributional assumptions can be specified for the multilevel hierarchical linear model described above. First is the assumption related to the micro specification model shown in Equation 2.1. For this model, the error terms ε_ij are assumed to be independently and normally distributed with mean 0 and variance σ_ε², for j = 1,...,J. With Equation 2.2, which describes the random intercept part of the model, the following assumptions are made: (i) the error terms e_0j associated with the intercept β_0j are assumed to be distributed normally with mean 0 and some variance τ_00; (ii) the micro errors ε_ij are independent of the macro errors e_0j. While the attainability of the distributional assumptions related to the micro error terms ε_ij can be easily assessed by observing the distribution of the response values Y_ij, assessing the distributional assumptions of the macro error terms e_0j is more difficult since β_0j is not directly observable.
This worsens the situation in dealing with methods which are overly dependent on distributional assumptions. The assumption of independence in multilevel models also takes two forms: within- and between-group independence. Robustness to within-group dependence (dependence among observations) is of primary interest in the statistical literature (Burstein, 1980). The statistical consequences of ignoring the intraclass correlation structure that results from ignoring group membership can be quite serious (Burstein, 1980). In educational research involving student achievement, we realize that instruction is primarily group-based. Instruction of students within the same class is likely to be more similar than that for students from different classes. Under these and similar circumstances, the between- and within-group error terms are likely to be correlated. In general, standard statistical estimation techniques like ordinary least squares are ineffective in the presence of within-group dependencies. Yet in several educational research situations, depending on the nature of the outcome and effect variables under study, dependence among observations may be an inevitable phenomenon. Thus it may be reasonable for researchers to assume that dependencies among observations exist, such that more effort may be spent on ways to identify and adjust for these dependencies rather than assuming independence when dependence may exist.

Variance Components Estimation

The problem of estimating the variance components in mixed linear models, containing both fixed and random effects, is not new to research methodologists. Several methods of estimation have been suggested (Henderson, 1953; Hartley, 1967; Searle, 1970; Henderson, et al., 1959; Rao, 1970, 1971a, 1971b, 1972; Thompson, 1962). The deficiencies and/or difficulties in the application of these methods are also well known (Searle, 1978). Estimates could be negative.
Computational problems could arise, particularly when covariates are involved as part of the fixed factors of the mixed model. There is no general method to cover all situations and problems. The problem of variance component estimation also varies with the design, that is, whether the data are balanced (equal subclass numbers) or unbalanced.

Balanced Designs

Balanced designs are those in which there are equal numbers of observations in all the macro units. The analysis of variance method (or the method of moments) is traditionally employed in estimating the variance components of mixed (or random) balanced designs. The method involves equating statistics to their expected values and solving the resulting equations for the parameters (Hocking, 1985). But due to the infrequency of balanced designs in real world research, methods which are limited to balanced designs are not at all useful.

Unbalanced Designs

Unbalanced designs are those in which the numbers of observations in the subclasses or macro groups are not all the same. Besides the problems of cumbersome algebra and a confusion of symbols in variance components estimation in unbalanced designs, other problems arise. Whereas with balanced data there is only one set of quadratic forms to use (the analysis of variance mean squares), there are many sets of quadratic forms that can be used for unbalanced data. And unlike in balanced data, most quadratic forms in unbalanced designs lead to estimates that have few optimal properties. As Searle (1971) indicated, none of the earlier methods were clearly established as superior in variance component estimation. Efforts to adapt variance component estimation methods to unbalanced data were led by Henderson (1953), who described a method analogous to the analysis of variance method used with balanced data, but designed to correct that deficiency.
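For a balanced one-way random design, the analysis of variance method just described amounts to equating the within- and between-group mean squares to their expectations, E(MSW) = σ_ε² and E(MSB) = σ_ε² + n τ², and solving for the components. The sketch below is illustrative Python (not the dissertation's SAS/IML code) and also shows why the estimate of τ² can come out negative when the true τ² is near the zero boundary.

```python
import numpy as np

def anova_components(Y):
    """Method-of-moments (ANOVA) estimates for a balanced one-way random model.

    Y is a (J, n) array: J groups with n observations each.
    Solves E(MSW) = sigma2 and E(MSB) = sigma2 + n * tau2.
    """
    J, n = Y.shape
    gm = Y.mean(axis=1)                                   # group means
    msw = ((Y - gm[:, None]) ** 2).sum() / (J * (n - 1))  # within mean square
    msb = n * ((gm - Y.mean()) ** 2).sum() / (J - 1)      # between mean square
    return msw, (msb - msw) / n   # sigma2_hat, tau2_hat (may be negative)

rng = np.random.default_rng(1)
J, n = 50, 20
# True components: tau2 = 1 (near the zero boundary), sigma2 = 100.
Y = 50.0 + rng.normal(0.0, 1.0, (J, 1)) + rng.normal(0.0, 10.0, (J, n))
sigma2_hat, tau2_hat = anova_components(Y)
# Nothing constrains tau2_hat to be positive: whenever MSB < MSW it is negative.
```

Nothing in the moment equations keeps (MSB − MSW)/n inside the parameter space, which is the negative-estimate problem the study returns to in its boundary conditions.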
Other methods evolved thereafter (see Searle, 1968), but Searle (1971) indicated that most of these methods reduced in some way to the method of moments for balanced data. The methods involve relatively cumbersome algebra, such that a discussion of unbalanced data easily deteriorates into a welter of symbols. Other, more recent methods of variance components estimation have evolved which are not necessarily allergic to unbalancedness. The maximum likelihood (ML) estimator of the variance components is one such method. The ML estimators of the variance components are those values of the components which maximize the likelihood over the positive space of the variance components parameters (Corbeil and Searle, 1976). Application of the ML method therefore requires assuming a probability density function for the random variables, and then writing down the likelihood function of the sample data. Though the general ML procedure can be used for almost any probability density function, for variance component estimation it is customary to assume normality (Searle, 1979). Then maximizing the logarithm of the likelihood function is fairly straightforward. However, as indicated earlier, requiring the normality assumptions in certain research situations may be expecting too much. An alternative to the ML estimator of the variance components is restricted maximum likelihood (REML), which was first suggested by Thompson (1962) and later formally described by Patterson and Thompson (1971) and Corbeil and Searle (1976). The method is based on a transformation that partitions the likelihood under normality into two parts, one being free of the fixed effects and the other involving the fixed effects. Maximizing the part that is free of fixed effects yields the REML estimators (Corbeil and Searle, 1976). The REML estimators are translation invariant, but because maximum likelihood restricts the estimator to the allowable parameter space (positive), REML estimators are biased.
Thus, in terms of assumption requirements, neither ML nor REML offers a solution to the estimation of variance components without assuming certain distributional properties. From the late 1960's to the early 1970's, statisticians were involved in seeking methods of variance components estimation that possess more desirable properties than just being unbiased and translation invariant. LaMotte (1973) and Rao (1970), though working independently, derived the minimum variance quadratic unbiased estimators (MIVQUE) from theoretical viewpoints without offering ways of applying the method to actual data analysis. Henderson (1973) derived computational formulae for MIVQUE based on the mixed model equations (MME) and indeed showed that LaMotte's and Rao's methods were identical. MIVQUE assumes normality and that V, the variance-covariance matrix of the observations, is known. Then the variance of quadratic forms is minimized for this V. Since V is not known in reality, the procedure requires utilizing some prior information about V, and as a result, the variance of quadratic forms is only minimized if this prior V is the true population value. However, MIVQUE is unbiased and translation invariant. In addition, if the prior V is the same as the true V, then MIVQUE is also minimum variance. Thus, MIVQUE in general is not minimum variance in practice, but only as good as the prior V. For the present study, MIVQUE too does not offer a solution to variance component estimation, since the procedure requires the normality assumption. Rao (1970, 1971a, 1971b, 1972) proposed a minimum norm quadratic unbiased estimator (MINQUE) of the variance components to estimate a linear function p′σ of the variance components (for known p′). The method utilizes a quadratic function Y′AY of the observations, for Y a vector of observations and A a symmetric matrix.
The quadratic function Y′AY used to estimate p′σ is taken to possess the properties of translation invariance, unbiasedness, and minimum norm (Searle, 1979). More importantly, the MINQUE theory is developed without reference to normality or the variance of the estimator, and the method is highly flexible in the choice of norm while at the same time preserving the desirable properties of the estimator (Rao, 1971). Since their invention these estimators have gained much recognition. See in particular Seely (1971), Hartley et al. (1978), and Searle (1979). Also, the naive form of the MINQUE which corresponds to the rather uninformative prior value V = w_0 I_n (MINQUE0) is provided by the Statistical Analysis System (SAS). Due to its desirable properties, particularly the fact that the procedure does not require the normality assumptions, the present study adopted the MINQUE technique in estimating the parameters of the mixed model via the bootstrap method.

MINQUE for Two-level HLM with Random Intercepts

The minimum norm quadratic unbiased estimator (MINQUE) for the variance components of a mixed model is based on the statistical linear model whose general form is represented by

(2.9) Y = Xβ + Zb + e

with the following definitions:
Y is an (n x 1) vector of n observations;
X is an (n x P) known matrix of rank r(X) < n;
β is a (P x 1) vector of P fixed effects parameters;
Z is an (n x J) known matrix, often consisting of 1's and 0's;
b is a (J x 1) vector of J unobservable random effects parameters; and
e is an (n x 1) vector of random error terms.

In order to identify the variance components corresponding to the random effects in b, this vector b is partitioned as

(2.10) b′ = [b_1′ ... b_k′ ... b_c′]

for k = 1,...,c, where the vector b_k contains j_k effects for the levels of the kth random factor.
Corresponding to b_k of 2.10, the incidence matrix Z is accordingly partitioned as

(2.11) Z = [Z_1 ... Z_k ... Z_c]

for k = 1,...,c, such that 2.9 can be written as

(2.12) Y = Xβ + Σ_k Z_k b_k + e

with the model elements defined as before. Equations 2.10 and 2.11 are similar to 5.3 in Rao (1971b). A compact way of writing (2.12) is to define e as another b_k, namely b_0, and the corresponding Z_0 as I_n. The model Equation 2.12 becomes

(2.13) Y = Xβ + Σ_{k=0}^{c} Z_k b_k

with the following distributional properties:

(2.14) E(b_k) = 0, Var(b_k) = σ_k² I_{j_k}, cov(b_k, b_k′) = 0, for k, k′ = 0, 1,...,c

where cov(b_k, b_k′) is the matrix of covariances of the elements of b_k with those of b_k′ for k ≠ k′. The variance of b is given by

(2.15) Var(b) = D = diag{σ_k² I_{j_k}} for k = 0,...,c.

With this formulation, we notice that Equation 2.7 is a special case of 2.13 with c = 1, whose compact form may be given by

(2.16) Y = Xβ + Z_0 b_0 + Z_1 b_1

where
Y is an (n x 1) vector of n observations;
X is an (n x P) known matrix of rank r(X) < n representing the fixed part of the model;
β is a (P x 1) vector of P fixed effects parameters;
Z_0 is an (n x n) identity matrix;
b_0 is an (n x 1) vector of residual error terms;
Z_1 is an (n x J) known matrix, often consisting of 1's and 0's; and
b_1 is a (J x 1) vector of J unobservable random effects parameters.

The distributional properties imputed to 2.16 are, according to 2.14, given by

(2.17) E(b_k) = 0, cov(b_0, b_1) = 0
(2.18) D = Var(b) = diag{σ_ε² I_n, τ² I_J} for k = 0, 1

where σ_ε² is the variance component of the residual errors and τ² is the variance component of the random effects of the model. For the two-level mixed model of the form given in Equation 2.16, the MINQUE estimate σ̂ of the variance components of b_0 and b_1, using weights w_0 and w_1 in the norm, is given by

(2.19) σ̂ = {tr(P_w Z_k Z_k′ P_w Z_l Z_l′)}⁻¹ {Y′ P_w Z_k Z_k′ P_w Y}, for k, l = 0, 1

where P_w, which is given by

(2.20) P_w = V_w⁻¹ − V_w⁻¹ X (X′ V_w⁻¹ X)⁻¹ X′ V_w⁻¹,

is the projection operator on the space generated by the columns of X, similar to 1.2 in Rao (1971b), and V_w = Z D_w Z′
for D_w = diag{w_0 I_n, w_1 I_J} a dispersion matrix of b, where w_0 = 1 − ρ and w_1 = ρ, for ρ the intraclass correlation coefficient. In practice, the weights w_0 and w_1 are pre-assigned numbers, hence V_w and P_w are matrices which can be calculated easily. To advance the MINQUE estimates associated with the weights w_0 and w_1 for this special case, define F_w and U_w as follows:

(2.21) F_w = {tr(P_w Z_k Z_k′ P_w Z_l Z_l′)}
(2.22) U_w = {Y′ P_w Z_k Z_k′ P_w Y}

for k, l = 0, 1. F_w is a (2 x 2) matrix and U_w is a 2-dimensional vector, both originating from 2.19, such that the MINQUE estimator σ̂ is given by

(2.23) σ̂ = F_w⁻¹ U_w.

Define the matrices K and A_w appearing in the projection operator 2.20 as

(2.24) K = (X′ V_w⁻¹ X)⁻¹
(2.25) A_w = X K X′ V_w⁻¹

so that P_w = V_w⁻¹ (I_n − A_w). If we let f_kl denote the elements of the (2 x 2) matrix F_w for k, l = 0, 1, then, using Z_0 Z_0′ = I_n, the following definitions can be given:

(2.26) f_00 = tr(P_w P_w)
(2.27) f_01 = f_10 = tr(P_w P_w Z_1 Z_1′)
(2.28) f_11 = tr(P_w Z_1 Z_1′ P_w Z_1 Z_1′).

In order to simplify the vector U_w of quadratic forms, we notice that

P_w Y = (V_w⁻¹ − V_w⁻¹ X (X′ V_w⁻¹ X)⁻¹ X′ V_w⁻¹) Y = V_w⁻¹ Y − V_w⁻¹ X β̂

such that

(2.29) P_w Y = V_w⁻¹ (Y − X β̂)

where β̂ is the estimate of the fixed effects parameters of the model given by

(2.30) β̂ = K X′ V_w⁻¹ Y.

With this simplification, if we let u_0 and u_1 be the elements of the two-dimensional vector U_w of quadratic forms, then the following definitions can be given:

(2.31) u_0 = Y′ P_w P_w Y = (Y − X β̂)′ V_w⁻² (Y − X β̂)
(2.32) u_1 = Y′ P_w Z_1 Z_1′ P_w Y = (Y − X β̂)′ V_w⁻¹ Z_1 Z_1′ V_w⁻¹ (Y − X β̂).

We notice that Z_1 Z_1′, which occurs extensively in 2.27, 2.28, and 2.32, is block diagonal with submatrices m_j of size (n_j x n_j) whose elements are all 1's, and V_w⁻¹ is block diagonal with submatrices V_j⁻¹, also of size (n_j x n_j), given by

(2.33) V_j⁻¹ = w (I_{n_j} − c_j m_j)

for w = 1/w_0 and c_j = w_1/(1 + (n_j − 1) w_1), for j = 1, 2,...,J. Let X_j be the n_j rows of the matrix X associated with the fixed effects in context j. Let m_j = Z_1j Z_1j′ for Z_1j an (n_j x 1) vector of 1's.
Then K and the elements of F_w and U_w will simplify significantly as follows:

(2.34) K⁻¹ = w Σ_j (X_j′ X_j − c_j S_j S_j′)

for S_j = X_j′ Z_1j a (P x 1) vector of column sums of X_j;

(2.35) tr(V_w⁻²) = w² Σ_j n_j {(1 − c_j)² + c_j² (n_j − 1)}

(2.36) tr(V_w⁻¹ A_w) = w² Σ_j [tr(t_j) − a_j c_j (2 − c_j n_j)]

for t_j = X_j′ X_j K and a_j = S_j′ K S_j ...

... if U > 0.5, then set X_s = 1. Then the variates Y defined by the equation

(4.2) Y = E X_s

are distributed as double exponential with mean zero and variance 2. SAS/IML code segments used to generate the normal and double exponential distributions are given in Appendix B.

Study Design Parameter Values

The structure of data in the present study is assumed to involve a random factor consisting of J levels, nested within some fixed factor levels. The random factor may be characterized by contexts such as schools or countries, and the fixed factor characterized by sector (e.g., public, private, or religious) in the case of schools as context. In the case of countries as context, the fixed factor levels may be taken to be levels of economic or industrial development (e.g., developed, less developed, developing, or underdeveloped) or may be world regions. As noted earlier, two design factors in this study are expected to influence the success of the estimation of model parameters. These are the population distribution of the random components and the population intraclass correlation. The intraclass correlation, denoted by ρ, is given by

(4.3) ρ = τ²/(τ² + σ_ε²)

where σ_ε² and τ² are the intra- and inter-class variances of the model, respectively. As Raudenbush and Bryk (1988) indicated, the intraclass correlation has two useful and mathematically equivalent interpretations. First, it is the correlation between pairs of values within the J contexts, such that it measures the degree of dependence among observations sharing a context. Second, as a ratio, it represents the proportion of the total variation in the response values which is between contexts.
Estimation of variance components is often difficult when ρ is quite small, sometimes resulting in negative estimates of the variance components. Due to this feature, three levels of the intraclass correlation for each of the two distributional models were introduced in the study as part of the design factors. In order to vary the intraclass correlation, σ_ε² was fixed at 100 while τ² was allowed to take the values 1, 5.26, and 25, resulting in ρ taking values of 0.01, 0.05, and 0.20, respectively. Table 4.1 presents design factor combination trials.

Table 4.1
Design Factor Combination Trials*

Intraclass            Distribution Model
Correlation (ρ)    Normal    Double Exponential    All
0.01                 a              b               i
0.05                 c              d               j
0.20                 e              f               k
All                  g              h               l

* 400 trials (different sets of data) were specified for each cell (a through f).

The design factor specification shown in Table 4.1 provided for a total of 2400 Monte Carlo simulation trials, each consisting of a different data set. As a result, 1200 trials were performed for each of the two distributional models (normal and double exponential) and 800 trials for each of the three levels of the intraclass correlation, such that g = h = 1200, i = j = k = 800, and g + h = i + j + k = 2400. The specific mixed model used in the study has two factors, a random factor with J levels nested within a fixed factor with three levels, and a micro-level covariate variable. However, it should be noted that it is possible to extend this model by including additional covariates (at the micro or macro level). For the purpose of the present study, all data sets used in the study were unbalanced (unequal numbers of subjects in each context), consisting of 50 macro units. Fixing the parameter value of τ² at unity (near the boundary value of zero) provided an additional advantage to the study. This is due to the interest of the study in estimating the random effect variance component τ² near the boundary conditions.
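The three levels of ρ follow directly from Equation 4.3 with σ_ε² fixed at 100; a quick arithmetic check (illustrative Python, not part of the dissertation's SAS/IML programs):

```python
# Equation 4.3: rho = tau2 / (tau2 + sigma2), with sigma2 fixed at 100.
sigma2 = 100.0
rhos = {tau2: tau2 / (tau2 + sigma2) for tau2 in (1.0, 5.26, 25.0)}
# tau2 = 1    -> rho = 1/101   ~ 0.0099 (reported as 0.01)
# tau2 = 5.26 -> rho ~ 0.0500
# tau2 = 25   -> rho = 25/125  = 0.20
```

The smallest setting, τ² = 1, places ρ near zero, that is, near the boundary of the parameter space.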
It is in these situations that most variance component estimation procedures experience problems of giving negative estimates of τ² when the parameter value they are estimating is essentially positive. Thus, in an attempt to understand the performance of the bootstrap procedure in estimating τ² near boundary conditions, out of the total 2400 trials, 800 (or 33.3%) were performed for τ² = 1 (ρ = 0.01), 800 (or 33.3%) for τ² = 5.26 (ρ = 0.05), and 800 (or 33.3%) for τ² = 25 (ρ = 0.20). It should be emphasized that, in using the bootstrap algorithm to estimate the distribution of the parameters of the mixed model described in the study, estimation is done at each of the b bootstrap replications, for b = 1, 2,...,B, where B is a large number. For the present study, B was set at 200 bootstrap replications for each trial shown in Table 4.1.

Implementation of the Bootstrap using MINQUE

The MINQUE method of estimating the variance components requires using weights w_k associated with b_k, for k = 0, 1. Ordinarily, arbitrary weights are chosen, provided one ensures that F_w⁻¹ exists. According to Rao (1972), regardless of the choice of weights w_k, the MINQUE estimators will still possess the properties of unbiasedness, translation invariance, and minimum norm. However, though the MINQUE estimators may generally possess the properties used in deriving the estimators (unbiasedness, translation invariance, and minimum norm), one would expect that, in practice, these estimators may be only as good as the prior weights that were utilized. In other words, the MINQUE estimators depend to a certain extent on the prior weights used in the norm. Indeed, this condition was the motivation behind Brown (1976), who suggested iterative MINQUE (I-MINQUE). But since I-MINQUE estimators are obtained iteratively, they do not possess the properties used in deriving MINQUE. Thus, I-MINQUE estimators are not necessarily unbiased or "best" in any sense (Searle, 1979).
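Whatever weights are chosen, they enter the computations only through the per-context blocks V_j = w_0 I + w_1 m_j of the block diagonal matrix V_w, whose closed-form inverse is given in Equation 2.33. A quick numerical check of that identity (illustrative Python; the closed form assumes w_0 + w_1 = 1, as holds for w_0 = 1 − ρ and w_1 = ρ):

```python
import numpy as np

def vj_inverse(n_j, w0, w1):
    """Closed-form inverse of V_j = w0*I + w1*m_j (Equation 2.33).

    Assumes w0 + w1 = 1, so that w0 + n_j*w1 = 1 + (n_j - 1)*w1.
    """
    c_j = w1 / (1.0 + (n_j - 1.0) * w1)
    return (np.eye(n_j) - c_j * np.ones((n_j, n_j))) / w0

rho = 0.05                     # prior intraclass correlation (example value)
w0, w1 = 1.0 - rho, rho        # weights used in the norm
n_j = 7                        # size of one context (contexts may be unbalanced)
V_j = w0 * np.eye(n_j) + w1 * np.ones((n_j, n_j))
err = np.abs(V_j @ vj_inverse(n_j, w0, w1) - np.eye(n_j)).max()
# err is at the level of floating-point round-off.
```

Because each block inverts in closed form, no full (n x n) inversion of V_w is ever needed, which is what makes the MINQUE computations tractable context by context.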
Instead of using arbitrary weights in implementing MINQUE, the present study employed an ANOVA-type method of independently estimating the variance components of the mixed model, as in Hanushek (1974). The values of these prior estimates are used to derive the weights used in MINQUE. Using Hanushek's method, we fit an ordinary regression model with all independent variables as predictors. The prior estimator σ̂_ε² of the random error variance σ_ε² is taken as the usual MSE in the multiple regression model. However, the Hanushek estimator τ̂² for the variance τ² of the random effects of the model is given by

(4.4) τ̂² = (w − (N − P) σ̂_ε²) / (N − T)

where
w is the residual sum of squares in the regression model;
T = Σ tr{S_j′ (X′X)⁻¹ S_j}, for S_j the (P x 1) vector of column sums of X_j for context j;
N = sample size; and
P = number of fixed effects parameters in the model.

In order to use the Hanushek estimators σ̂_ε² and τ̂² to derive the weights w_0 and w_1, define the ratio R̂ = τ̂²/σ̂_ε². The weights w_0 and w_1 can then be obtained by

(4.5) w_0 = 1/(1 + R̂) and w_1 = R̂/(1 + R̂).

We notice that w_0 = 1 − ρ̂ and w_1 = ρ̂, where ρ̂ = τ̂²/(τ̂² + σ̂_ε²). The value ρ̂ is the intraclass correlation based on the Hanushek estimates τ̂² and σ̂_ε² of τ² and σ_ε², respectively. The weights w_0 and w_1 obtained through 4.5 are the values used in the MINQUE procedure. It is reasonable to expect that the MINQUE based on weights established from some prior estimates of σ_ε² and τ² could be an improvement over the conventional MINQUE based on arbitrary weights.

Implementation of the bootstrap algorithm to estimate the parameters of the mixed hierarchical linear model in the present study requires a random sampling procedure with replacement. First, a random sample of J macro units (e.g., countries, schools) is drawn with replacement from the available sample of J macro units. From each of the selected macro units, a random sample of size n_j micro units is selected with replacement, for j = 1, 2,...,J.
The resulting data set is termed the bootstrap replication sample (Efron, 1981). Based on the bootstrap replicated sample, the MINQUE procedure is used to determine the estimates of the parameters of the model. The process is repeated a large number B of times, yielding B MINQUE estimates. This technique may be presented as a sequence of steps as follows:

Step 1. Construct the distribution F̂ by assigning mass 1/J to each of the macro units.
Step 2. From the J macro units, select a random sample of size J with replacement.
Step 3. For each of the J selected macro units containing n_j micro units, construct a distribution F̂_j by assigning mass 1/n_j to each micro unit in the jth macro unit, for j = 1, 2,...,J.
Step 4. From each of the J macro units whose distributions were constructed at Step 3 above, draw a random sample of size n_j with replacement, for j = 1, 2,...,J. At the end of Step 4, the resulting data set is termed the bootstrap replicated sample. The vector of observations at this stage is denoted by Y*.
Step 5. From the bootstrap replicated data set generated at Step 4, determine the MINQUE estimates of the parameters of the model, given by σ̂* and β̂*.
Step 6. Independently repeat Steps 2, 4, and 5 a large number B of times to obtain a sequence of MINQUE estimates of the parameters of the model, σ̂*_b and β̂*_b, for b = 1, 2,...,B.
Step 7. Observe the distribution of the values σ̂*_b and β̂*_b as the empirical bootstrap distribution of the estimates of the variance components and the fixed effects of the model.

The bootstrap standard error of each component of the estimates is given by

(4.6) s.e.(θ̂*) = [(B − 1)⁻¹ Σ_{b=1}^{B} (θ̂*_b − θ̂*_·)²]^{1/2}

where θ̂*_· = B⁻¹ Σ_{b=1}^{B} θ̂*_b, for θ̂*_b any one of the components of σ̂*_b or β̂*_b.

The Computer Programs

Three main tasks in this study required the use of a computer program. These were: generating data sets from a population with known parameters and distribution; Monte Carlo simulations; and bootstrapping.
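The two-stage resampling of Steps 1 through 7 can be sketched in a few lines. The version below is illustrative Python (the dissertation implemented the algorithm in SAS/IML), with a simple grand-mean statistic standing in for the MINQUE fit of Step 5; it also computes the bootstrap standard error of Equation 4.6.

```python
import numpy as np

def two_stage_bootstrap(groups, estimator, B=200, seed=0):
    """Two-stage bootstrap for grouped data (Steps 1-7 above).

    groups: list of 1-D arrays, one per macro unit.
    estimator: maps a list of resampled groups to a scalar statistic
               (in the dissertation this was the MINQUE fit).
    Returns the B replicate estimates and their bootstrap s.e. (Equation 4.6).
    """
    rng = np.random.default_rng(seed)
    J = len(groups)
    reps = np.empty(B)
    for b in range(B):
        # Step 2: sample J macro units with replacement.
        chosen = rng.integers(0, J, size=J)
        # Steps 3-4: within each chosen unit, resample its n_j micro units
        # with replacement (each micro unit carries mass 1/n_j).
        star = [rng.choice(groups[j], size=len(groups[j]), replace=True)
                for j in chosen]
        # Step 5: estimate on the bootstrap replicated sample.
        reps[b] = estimator(star)
    se = np.sqrt(((reps - reps.mean()) ** 2).sum() / (B - 1))  # Equation 4.6
    return reps, se

# Toy usage: bootstrap the grand mean of 10 unbalanced simulated groups.
rng = np.random.default_rng(42)
groups = [50.0 + rng.normal(0.0, 1.0) + rng.normal(0.0, 10.0, size=rng.integers(5, 15))
          for _ in range(10)]
reps, se = two_stage_bootstrap(groups, lambda gs: np.concatenate(gs).mean())
```

Resampling macro units before micro units preserves the hierarchical dependence structure in each replicated sample, which is the point of the two-stage scheme.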
Independent computer programs were coded for each task using the SAS/IML package. SAS/IML, available on the MSU IBM 3090 VF mainframe computer system, is a double-precision, multilevel, interactive programming language. SAS/IML software is both flexible and powerful since it combines the advantages of high-level and low-level languages (SAS/IML User's Guide, 1985, p. xi). Though SAS provides a procedure which computes the MINQUE corresponding to the rather uninformative prior of zero weights as an option to PROC VARCOMP, this procedure does not handle models that involve covariates. The independent variables handled by PROC VARCOMP are limited to main effects, interactions, and nested effects; no covariate effects are allowed in the PROC VARCOMP statement (SAS User's Guide: Statistics, 1985, p. 819). However, the present study is not limited to models without covariates. Consequently, the more flexible SAS/IML software was utilized in the study, not only to estimate the parameters of the model but also to generate data.

As indicated earlier in this chapter, the first computer program generates the sample observations Y and covariate X and passes them to the program that implements the bootstrap algorithm. The bootstrap estimates at each replication are written to a standard SAS file for further analysis. The Monte Carlo simulation computer program is implemented like the bootstrap program, except that while the bootstrap resamples data from a sample generated from the population, the Monte Carlo simulation program samples data directly from the population. The bootstrap SAS/IML code used in this study is thus flexible and available to estimate the parameters of a model using data obtained from real-world research. Applicability of the bootstrap method using the present SAS/IML code is demonstrated in the present study.
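The distinction between the two sampling schemes can be made concrete (an illustrative Python sketch; the standard normal merely stands in for the known population):

```python
import numpy as np

rng = np.random.default_rng(42)

def draw_from_population(n, rng):
    """Stand-in for generating data from the known population,
    which is what each Monte Carlo trial does afresh."""
    return rng.normal(0.0, 1.0, n)

# Monte Carlo simulation: every trial gets fresh population data.
mc_trials = [draw_from_population(100, rng) for _ in range(3)]

# Bootstrap: one sample is drawn once, then repeatedly resampled
# with replacement.
sample = draw_from_population(100, rng)
boot_replications = [rng.choice(sample, size=sample.size, replace=True)
                     for _ in range(3)]
```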
The computer code and method are applied to actual field research data to estimate the parameters of the model and the sampling distributions of the statistics, and to set bootstrap confidence intervals about the parameters of the teachers' self-efficacy prediction model. Estimation results for the fixed and random effects of the teachers' self-efficacy model are presented in Chapter V of this dissertation.

CHAPTER V

APPLICATION OF BOOTSTRAP AND MINQUE: HIGHER ORDER TEACHING

Introduction

The bootstrap is a new method whose time has come with the advent of modern computers. Though its applicability in generating sampling distributions of statistics and in constructing confidence intervals about parameters is highly promising, the method has not been widely used in educational and social science research. Strengths of the method are often demonstrated in situations where parametric modeling is difficult and/or normality assumptions are not tenable. These situations are not uncommon in educational and social science research. The interest of the present study was to demonstrate the operation of the bootstrap in a two-level hierarchical linear model. The focus of the study was upon the estimation of the group- and individual-level variances and the fixed effects parameters of the mixed model. A highly promising approach offered by the method in this study was that of estimating the sampling distributions of the statistics and thereby setting confidence intervals about the parameters. The study used computer-simulated data to extensively assess the distributional behavior of parameter estimates under varying distributional assumptions about the errors and sets of random effects parameters. In this chapter, applicability of the bootstrap algorithm to data originating from a real research situation is demonstrated.
The method is applied to the actual field research data to estimate the parameters of the model and the sampling distributions of the estimators, and to set bootstrap confidence intervals about the parameters. The data used in this demonstration of the applicability of the bootstrap method were part of data gathered earlier to investigate contextual effects on the self-efficacy of high school teachers.

Description of Data and Variables

The data were obtained through a survey of teachers in sixteen schools who taught Mathematics, Science, English, or Social Science. Each teacher was assigned to teach one or more classes in the school. Though the individual teacher was viewed as the basic unit of analysis, each teacher provided information on several classes. As a result, we view the teachers as the "macro" units of analysis, with the classes they taught as the "micro" units of analysis. The teacher effects, therefore, constitute the random factor of the model. The dependent variable in the study was teachers' perception of self-efficacy, which was measured at the class level. A measure of teachers' self-efficacy represents a person's perceived expectancy of enacting a desired level or type of performance through personal effort (Bandura, 1986). For instance, a teacher who possesses a high level of self-efficacy will be of the view that, no matter the nature of the students or facilities he or she is provided with, he or she will produce an excellent level of performance. On the other hand, a teacher with low self-efficacy will feel paralyzed if he or she is given "poor" children. The phenomenon has been identified as having an effect on both students' and teachers' performance (Fuller et al., 1982). In the present study, the extent to which teachers' self-efficacy is influenced by institutional, classroom, and individual teacher characteristics is examined.
Academic subject taught (Mathematics, Science, English, or Social Science) represented the primary fixed factor of the model used to predict teachers' self-efficacy. Other independent variables of the model, which were viewed as covariates, fell into two categories, namely, between- and within-teacher variables. The between-teacher variables included: STAFCOOP, cooperation of staff; TCONTROL, teacher control; and PLEADER, principal leadership. The within-teacher (or classroom-level) independent variables included: STUDACH, class average student achievement level; LVLPREP, class level of preparation; and SIZE, class size. Selection of valid data for the variables of interest resulted in a sample of 244 teachers who provided information on 1634 classes taught. The breakdown of the number of teachers and classes by academic subject area was as follows: Mathematics had 63 teachers with 370 classrooms; Science had 59 teachers with 391 classrooms; English had 69 teachers with 509 classrooms; and Social Science had 53 teachers with 364 classrooms. The average number of classes for which each teacher provided information was about 6.7.

The Model Statements

We begin by posing a within-teacher model that defines a "micro" equation with EFFICACY as the response variable and LVLPREP, SIZE, and STUDACH as "micro" regressors, identical for each teacher j, as

(5.1)    (EFFICACY)ᵢⱼ = β₀ⱼ + Σₕ αₕ(SUBJECTₕ)ᵢⱼ + β₁ⱼ(LVLPREP)ᵢⱼ + β₂ⱼ(SIZE)ᵢⱼ + β₃ⱼ(STUDACH)ᵢⱼ + εᵢⱼ

where j = 1, ..., J teachers and i = 1, ..., nⱼ classes for each teacher j. Since β₀ⱼ, β₁ⱼ, β₂ⱼ, and β₃ⱼ are defined for each teacher, we can pose the between-teacher model using these coefficients as responses, similar to Equations 2.2 and 2.3 in Chapter II of this dissertation.
Specifically, we consider the intercept β₀ⱼ to be random and dependent on the between-teacher independent variables, such that the associated "macro" model is given by

(5.2)    β₀ⱼ = γ₀₀ + γ₀₁(STAFCOOP)ⱼ + γ₀₂(TCONTROL)ⱼ + γ₀₃(PLEADER)ⱼ + e₀ⱼ

where j = 1, 2, ..., J teachers. Combining Equations 5.1 and 5.2 yields

(5.3)    (EFFICACY)ᵢⱼ = [γ₀₀ + γ₀₁(STAFCOOP)ⱼ + γ₀₂(TCONTROL)ⱼ + γ₀₃(PLEADER)ⱼ + Σₕ αₕ(SUBJECTₕ)ᵢⱼ + β₁ⱼ(LVLPREP)ᵢⱼ + β₂ⱼ(SIZE)ᵢⱼ + β₃ⱼ(STUDACH)ᵢⱼ] + [e₀ⱼ + εᵢⱼ]

similar to Equation 2.7 in Chapter II. Model Equation 5.3 can be written in the general linear matrix notation, as in Equation 2.8 in Chapter II, for teacher j with

(5.4)    Yⱼ = (EFFICACY)ᵢⱼ

(5.5)    Xⱼα = γ₀₀ + γ₀₁(STAFCOOP)ⱼ + γ₀₂(TCONTROL)ⱼ + γ₀₃(PLEADER)ⱼ + Σₕ αₕ(SUBJECTₕ)ᵢⱼ + β₁(LVLPREP)ᵢⱼ + β₂(SIZE)ᵢⱼ + β₃(STUDACH)ᵢⱼ

(5.6)    Zⱼbⱼ = e₀ⱼ   for bⱼ = e₀ⱼ and Zⱼ = (1, ..., 1)′

(5.7)    ε = εᵢⱼ.

Equations 5.5 and 5.6 represent the fixed and random effects of the model respectively, while Equation 5.7 is the expression for the random errors of the model. The intent, then, is to estimate both the fixed and random effects of the model on the measure of teachers' self-efficacy.

Estimation Procedure

The ability of the bootstrap to estimate the parameters of the model given in Equation 5.3 was demonstrated through the use of MINQUE. For each parameter, the usual MINQUE estimate was provided based on the original sample. Bootstrap estimates based on B = 1000 repeated resamplings with replacement were also obtained. Owing to the bootstrap's ability to generate sampling distributions through resampling, 95% bootstrap confidence intervals about each of the parameters were also provided. Estimates were provided for a total of 14 parameters of the model. There were four effect levels of the factor SUBJECT, denoted by α₁, α₂, α₃, and α₄, corresponding to Mathematics, Science, English, and Social Science respectively.
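Under Equations 5.3 through 5.6, the data for one teacher can be laid out in matrix form. The following Python sketch (the function name and the column ordering are the writer's own, not the study's) shows one way to assemble Xⱼ and Zⱼ for the random-intercept model:

```python
import numpy as np

def assemble_teacher_design(subject_dummies, lvlprep, size, studach,
                            stafcoop, tcontrol, pleader):
    """Build X_j and Z_j for one teacher j as implied by Eqs. 5.3-5.6.

    Class-level inputs are length-n_j arrays; teacher-level inputs
    (stafcoop, tcontrol, pleader) are scalars broadcast down the n_j
    rows. Column order: intercept (gamma_00), STAFCOOP, TCONTROL,
    PLEADER, the four SUBJECT dummies, LVLPREP, SIZE, STUDACH.
    """
    n_j = len(lvlprep)
    ones = np.ones(n_j)
    X_j = np.column_stack([
        ones,                     # constant gamma_00
        stafcoop * ones,          # between-teacher covariates
        tcontrol * ones,
        pleader * ones,
        subject_dummies,          # n_j x 4 indicator block
        lvlprep, size, studach,   # within-teacher covariates
    ])
    Z_j = ones.reshape(-1, 1)     # random-intercept design (Eq. 5.6)
    return X_j, Z_j
```

Stacking the Xⱼ and Zⱼ blocks over the J teachers yields the full design of the mixed model.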
Parameters for other fixed factors (or covariates) were denoted by (β₁, β₂, β₃), corresponding to the within-teacher (or classroom-level) effects (LVLPREP, SIZE, STUDACH), and (γ₁, γ₂, γ₃), corresponding to the between-teacher effects (STAFCOOP, TCONTROL, PLEADER). Besides SUBJECT, all other fixed factors were viewed as covariates in the model. The inter-teacher variance of the model was denoted by τ², while σ²ₑ denoted the variance of the random errors (or intra-teacher variance). In addition, the intra-teacher correlation, denoted by ρ and computed as in Equation 4.3 in Chapter IV, and the constant common to all observations, denoted by γ₀₀, were estimated through both MINQUE and the bootstrap.

Results of Estimation

Table 5.1 presents the MINQUE and bootstrap results for the estimation of the fourteen parameters of the teacher self-efficacy prediction model. The results provide the usual MINQUE estimate, the bootstrap estimate (the average over all B = 1000 bootstrap replications), the bootstrap standard error, and the 95% bootstrap confidence interval about each parameter. The bootstrap estimate of bias, given by θ̂* − θ̂, where θ̂* is the average of the estimator over the B bootstrap replications and θ̂ is the usual estimator based on the original sample, is also presented. For purposes of consistency with the notation given by Efron (1979), the bootstrap estimate of bias is denoted by BIAS.

[Table 5.1: Bootstrap and MINQUE estimates of the fourteen parameters of the teachers' self-efficacy prediction model (B = 1000 replications). The individual table entries are not recoverable from the source.]
From Table 5.1, it is shown that the averages of the bootstrap estimates of the parameters over B = 1000 replications did not differ much from the usual MINQUE estimates. In addition, a bootstrap feature not available through the MINQUE procedure was the estimation of the standard error of each estimate. The results showed low bootstrap standard errors for all fourteen parameter estimates of the model. The lowest values of the bootstrap standard error were observed for the estimate of the effect of class size, SIZE, and the estimate of the intra-teacher variance, σ̂²ₑ. For these two parameter estimates, the bootstrap estimate of bias was less than 0.002. Except for the estimator of the constant γ₀₀, whose bootstrap estimate of bias was 0.1124, the bootstrap estimate of bias for every other statistic was no more than 0.04. An accomplishment of the bootstrap method not readily available through the usual MINQUE was the construction of confidence intervals about each of the parameters of the model. The 95% bootstrap confidence intervals were used as a means of testing the significance of both the fixed and random effects on teachers' self-efficacy. Based on the 95% confidence intervals, the results showed that all factors, with the exception of W, had a statistically significant effect on teachers' self-efficacy.
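The quantities reported in Table 5.1 for each parameter, namely the bootstrap estimate, BIAS, the standard error, and a percentile confidence interval, can all be computed from the B replicated values. A Python sketch (the function name is illustrative):

```python
import numpy as np

def summarize_bootstrap(theta_hat, replicates, level=0.95):
    """Bootstrap summaries for one parameter.

    theta_hat  : the usual (MINQUE) estimate from the original sample;
    replicates : the B bootstrap replicated estimates.
    Returns (bootstrap estimate, BIAS, standard error, percentile CI).
    """
    theta = np.asarray(replicates, dtype=float)
    B = len(theta)
    boot_est = theta.mean()
    bias = boot_est - theta_hat                 # BIAS = theta-bar* - theta-hat
    se = np.sqrt(np.sum((theta - boot_est) ** 2) / (B - 1))
    alpha = 1.0 - level
    lo, hi = np.quantile(theta, [alpha / 2, 1 - alpha / 2])
    return boot_est, bias, se, (lo, hi)
```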
The intra-class correlation, denoted by ρ, was significantly different from zero, with 0.3204 and 0.3314 being the MINQUE and bootstrap estimates of ρ respectively. Estimates of ρ through both methods indicated that approximately a third of the total variance in teachers' self-efficacy is between teachers. The MINQUE and bootstrap estimates of the inter-teacher variance, denoted by τ², were 0.1320 and 0.1397 respectively.

In some problems of practical interest, we may wish to observe the behavior of the statistic used to estimate a parameter. This requires knowledge of the sampling distribution of the statistic, often based on Gaussian theory. In situations where this theory is not available, it is often difficult to draw conclusions about the sampling distribution of the statistic. In such situations, the bootstrap offers perhaps one of its most significant contributions to statistics. The method's applicability even to complicated problems involving statistics which may not have closed-form expressions may be a major promise of the bootstrap. But the bootstrap can be applied to simple problems as well. In the present study, the bootstrap method was used to generate the sampling distributions of the statistics used to estimate the parameters of the teachers' self-efficacy prediction model. The distributions were based on 1000 bootstrap replications. Figure 5.1 presents three percentage polygons of the estimators of the inter-teacher variance, τ², the intra-teacher variance, σ²ₑ, and the intra-teacher correlation, ρ, based on B = 1000 bootstrap replications. Though the distribution of τ̂²* appeared to be slightly positively skewed, the distributions of all three estimators are fairly symmetric with very low dispersion. Figure 5.2 presents four percentage polygons of the estimators of the effects of Mathematics, α₁, Science, α₂, English, α₃, and Social Science, α₄, on teachers' self-efficacy, based on B = 1000 bootstrap replications.
All four charts, representing the empirical bootstrap distributions of the estimators of the SUBJECT effects on teachers' perceived self-efficacy, appear to be fairly symmetric with moderate variability. However, the estimates of the effects of Mathematics, Social Science, and Science seem to be slightly negatively skewed. Most importantly, all the charts show that the replicated estimates are centered extremely close to the MINQUE estimates of the effects of Mathematics, Science, English, and Social Science.

[Figure 5.1: Percentage polygons for the bootstrap estimates of the inter-teacher variance (τ̂²), the intra-teacher variance (σ̂²ₑ), and the intra-teacher correlation (ρ̂) of the teachers' self-efficacy prediction model (B = 1000 replications).]

[Figure 5.2: Percentage polygons for the bootstrap estimates of the effects of Mathematics (α₁), Science (α₂), English (α₃), and Social Science (α₄) on the teachers' self-efficacy (B = 1000 replications).]

CHAPTER VI

SIMULATIONS AND BOOTSTRAP RESULTS

Overview

The purpose of the study was to demonstrate the use of the bootstrap in providing estimates of the parameters of a general two-level mixed hierarchical linear model, determining the standard errors of the estimates, and obtaining their empirical bootstrap distributions. The objective was to observe the behavior of the bootstrap and MINQUE estimates of the fixed effects and the variance components of the model under several conditions, including situations where the normal distributional assumptions may be violated.
The study examined the influence of the magnitude of the population intraclass correlation and the tail size of the distribution on the estimation of the parameters of the mixed model. The double exponential (or Laplace) distribution represented a distribution with fairly long and thick tails. The abilities of MINQUE and the bootstrap to estimate the parameters of the mixed HLM were demonstrated by estimating the parameters from a large number of independent samples generated from populations of known distributions and parameter values. Applying the estimation procedures to sets of data generated from a population of known parameters provided a means of evaluating the relative effectiveness of the methods of estimation. The independent samples consisted of 50 groups, with each group containing 25 to 45 observations. Estimation of parameters was studied for two underlying population distributions, namely the normal and the double exponential (or Laplace), and three levels of the intraclass correlation. These two design factors provided a total of six design factor combinations (or cells). A total of 400 trials (based on independent samples) were performed for each design factor combination. As a result, 2400 Monte Carlo simulation trials, each based on a different data set, were performed for the study. MINQUE and bootstrap point estimates, the 95% and 90% bootstrap confidence intervals, empirical bootstrap distributions, and standard errors were provided for each trial. The MINQUE and bootstrap summary results are presented in the remaining part of this chapter.

Results of Estimation Procedures

Simulated data represented observations from two population distributions of random errors and sets of random effects, characterized by three levels of the intraclass correlation. The mixed model contained seven parameters, of which three were random effects parameters and four fixed.
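The data-generating scheme just described can be sketched as follows (Python rather than the study's SAS/IML; to keep the sketch short, the model here contains only a random intercept and error, without the fixed-effect structure, and the Laplace draws are rescaled so their variance matches the target, since Var = 2b² for scale b):

```python
import numpy as np

def simulate_sample(J, n_range, tau2, sigma2, dist, rng):
    """Generate one simulated two-level sample.

    J groups; each group size is drawn uniformly from n_range
    (e.g. 25..45); random intercepts have variance tau2, errors have
    variance sigma2; dist is "normal" or "laplace".
    """
    def draw(var, size):
        if dist == "normal":
            return rng.normal(0.0, np.sqrt(var), size)
        return rng.laplace(0.0, np.sqrt(var / 2.0), size)

    y, group = [], []
    for j in range(J):
        n_j = rng.integers(n_range[0], n_range[1] + 1)
        u_j = draw(tau2, 1)[0]             # group-level random effect
        y.append(u_j + draw(sigma2, n_j))  # errors around the group mean
        group.append(np.full(n_j, j))
    return np.concatenate(y), np.concatenate(group)
```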
The random effects parameters were the within- and between-group variances, denoted by σ²ₑ and τ² respectively, and the intraclass correlation, denoted by ρ. The fixed effects parameters included α₁, α₂, and α₃ for the levels of the fixed factor, and β, the coefficient of the covariate. MINQUE and bootstrap estimates were obtained for each of the seven parameters of the model. While one MINQUE estimate was obtained at each trial (based on the original sample), the bootstrap estimate at each trial was the average over 200 bootstrap replicated values. Thus, the average of the bootstrap estimates over 400 trials is the average of the 400 averages, each computed from 200 bootstrap replications. Ten functions of the MINQUE and/or bootstrap estimates were computed for the six non-redundant estimates for both models, under normal and double exponential error terms and sets of random effects, at each trial. The average and standard deviation of these functions over the 400 trials for each design factor combination (cells denoted by a through f in Table 4.1 of Chapter IV) are presented in Tables 6.1 through 6.6.

The ten functions of the estimates consisted of: the MINQUE and bootstrap estimates; the bootstrap estimate of bias, denoted by BIAS; the MINQUE bias and the bootstrap bias, denoted by D₁ and D₂ respectively; the bootstrap ratio, denoted by R₁, and its corresponding MINQUE ratio, denoted by R₂; the MINQUE and bootstrap mean square errors, denoted by MSE1 and MSE2 respectively; and the bootstrap/MINQUE measure of relative efficiency. Table 6.1 presents the average and standard deviation of the ten functions of τ̂² and/or τ̂²* over the 400 trials under the normal and double exponential error terms and sets of random effects for the three levels of the intraclass correlation.

From Table 6.1 it is shown that the bootstrap overestimated τ², with a bias of 0.3432 and 0.3765 under the normal and double exponential respectively, for ρ = 0.01.
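For one variance component, the per-trial functions listed above reduce to simple arithmetic. A Python sketch (the relative efficiency, MSE2/MSE1, is computed from trial averages rather than per trial; the function name is illustrative):

```python
def trial_functions(tau2, tau2_minque, tau2_boot):
    """Per-trial evaluation functions for one variance component.

    tau2 is the true parameter value; tau2_minque the MINQUE estimate;
    tau2_boot the bootstrap estimate (mean over the B replications).
    """
    return {
        "BIAS": tau2_boot - tau2_minque,   # bootstrap estimate of bias
        "D1":   tau2_minque - tau2,        # MINQUE bias
        "D2":   tau2_boot - tau2,          # bootstrap bias
        "R1":   tau2_boot / tau2_minque,   # bootstrap ratio
        "R2":   tau2_minque / tau2,        # MINQUE ratio
        "MSE1": (tau2_minque - tau2) ** 2,
        "MSE2": (tau2_boot - tau2) ** 2,
    }
```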
For this low value of the intraclass correlation, the bootstrap estimate of bias, denoted by BIAS and given by τ̂²* − τ̂², was also high and positive, indicating that the bootstrap method on average overestimated the value of τ² under both normal and double exponential error terms and sets of random effects of the model. At this level of the intraclass correlation (ρ = 0.01), though MINQUE on average also overestimated τ², its bias was relatively low, at 0.0292 under the normal and 0.0597 under the double exponential distribution. The ratio R₂, expected to be 1.00, was observed at 1.0292, while the bootstrap ratio R₁ was 0.7811 under the normal distribution. The same ratios were 1.0597 and 2.3032 respectively under the double exponential. At this level of the intraclass correlation condition, the bootstrap estimate seemed to be more efficient under both the normal and the double exponential. From these results, it is apparent that the MINQUE and bootstrap estimates were fairly close under both the normal and double exponential distributions.

Table 6.1
Average and standard deviation of the functions of the estimates τ̂² and/or τ̂²* under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

                                          Normal                Double Exponential
  ρ     Estimate             Par.Value   Average     S.D.      Average     S.D.
 0.01   Bootstrap, τ̂²*         1.00      1.3432    0.7495      1.3765    0.7298
        MINQUE, τ̂²             1.00      1.0292    0.9010      1.0597    0.8879
        BIAS = τ̂²* − τ̂²                  0.3140    0.2113      0.3168    0.2207
        D₁ = τ̂² − τ²                     0.0292    0.9010      0.0597    0.8879
        D₂ = τ̂²* − τ²                    0.3432    0.7495      0.3765    0.7298
        R₁ = τ̂²*/τ̂²                      0.7811   11.0436      2.3032   14.4144
        R₂ = τ̂²/τ²                       1.0292    0.9010      1.0597    0.8879
        MSE1 = (τ̂² − τ²)²                0.8105    1.2806      0.7899    1.2086
        MSE2 = (τ̂²* − τ²)²               0.6781    1.3690      0.6729    1.2438
        Rel. Efficiency°                 0.8366                0.8519

 0.05   Bootstrap, τ̂²*         5.26      5.2900    1.6634      5.5309    2.1743
        MINQUE, τ̂²             5.26      5.1755    1.6621      5.3996    2.1786
        BIAS = τ̂²* − τ̂²                  0.1144    0.1201      0.1313    0.1316
        D₁ = τ̂² − τ²                    −0.0845    1.6621      0.1396    2.1786
        D₂ = τ̂²* − τ²                    0.0299    1.6634      0.2709    2.1743
        R₁ = τ̂²*/τ̂²                      1.0261    0.0314      1.0354    0.0972
        R₂ = τ̂²/τ²                       0.9839    0.3160      1.0265    0.4142
        MSE1 = (τ̂² − τ²)²                2.7627    3.5337      4.7539    8.4574
        MSE2 = (τ̂²* − τ²)²               2.7610    3.6216      4.7891    8.7037
        Rel. Efficiency°                 0.9994                1.0074

 0.20   Bootstrap, τ̂²*        25.00     24.9553    5.8896     25.5167    8.8370
        MINQUE, τ̂²            25.00     24.8520    5.8766     25.4018    8.8398
        BIAS = τ̂²* − τ̂²                  0.1032    0.2050      0.1149    0.2077
        D₁ = τ̂² − τ²                    −0.1480    5.8766      0.4018    8.8398
        D₂ = τ̂²* − τ²                   −0.0447    5.8896      0.5167    8.8370
        R₁ = τ̂²*/τ̂²                      1.0043    0.0085      1.0052    0.0089
        R₂ = τ̂²/τ²                       0.9941    0.2351      1.0161    0.3536
        MSE1 = (τ̂² − τ²)²               34.4700   49.1902     78.1073  153.6137
        MSE2 = (τ̂²* − τ²)²              34.6030   49.6024     78.1651  154.7723
        Rel. Efficiency°                 1.0039                1.0007

° Rel. Efficiency = MSE2/MSE1

At the second level of the intraclass correlation (ρ = 0.05), the average values of τ̂² and τ̂²* were 5.1755 and 5.2900 in the normal case, compared to the true parameter value set at 5.26. At this level of the intraclass correlation, both the bootstrap and MINQUE estimates were very close to the parameter τ², with 0.0299 and −0.0845 as their respective biases under normality. The estimates were slightly off under the double exponential, with τ̂² = 5.3996 (a bias of 0.1396) and τ̂²* = 5.5309 (a bias of 0.2709). Under this condition of the population intraclass correlation, however, the strength of the bootstrap was demonstrated in the estimates R₁ and R₂. The average value of R₁ was 1.0261 under the normal and 1.0354 under the double exponential, compared to the average value of R₂, which was observed at 0.9839 under the normal and 1.0265 under the double exponential.

Perhaps the most successful estimation of τ² was attained in the situation where the population intraclass correlation was 0.20, particularly under the normal distribution. Compared to the true parameter value of τ² = 25, τ̂²* was observed at 24.9553 and τ̂² at 24.8520 under the normal.
The average values of τ̂² and τ̂²* were 25.4018 and 25.5167 respectively under the double exponential distribution. Based on the biases of these estimators, the results show that the bootstrap, with a bias of −0.0447, and the MINQUE, with a bias of −0.1480, were very close under the normal distribution. The bootstrap estimate of bias was observed at 0.1032 under the normal. The biases of the MINQUE and bootstrap estimates were observed at 0.4018 and 0.5167 respectively under the double exponential, with the bootstrap estimate of bias equal to 0.1149.

It may be important to note that, under both the normal and double exponential distributions, the average values of R₁ and R₂ were quite close to 1.00. In particular, the ratio R₁ was surprisingly close to 1.00, indicating a very successful bootstrap estimation process. The bootstrap replicated values of R₁ were not only centered near 1.00 but also less variable under both the normal and the double exponential. The measure of relative efficiency for the two estimators was also extremely close to 1.00 under both the normal and the double exponential.

Figure 6.1 displays the percentage polygons of the 400 bootstrap and MINQUE estimates of τ² under the normal and double exponential errors and sets of random effects at each of the three levels of the population intraclass correlation. At ρ = 0.01, both the MINQUE and bootstrap estimates at each trial were centered near the true parameter value, set at 1.00, under both the normal and double exponential. However, the percentage polygon for the bootstrap was positively skewed, while that of the MINQUE was nearly symmetrical, under both the normal and double exponential distributions. This is mainly due to the fact that the bootstrap was protected from giving negative estimates of τ² while MINQUE was not. From Figure 6.1 it is apparent that a greater mass of observations was around 1.00 for the bootstrap percentage polygon than for the MINQUE polygon.
It can therefore be argued that, at this level of the population intraclass correlation, the bootstrap seemed to be a good complement to the MINQUE estimator of τ². The percentage polygons for the 400 MINQUE and bootstrap estimates under the normal and double exponential distributions at ρ = 0.05 show that both MINQUE and the bootstrap were free of negative estimates and both were centered near the true parameter value of τ², which was set at 5.26.

[Figure 6.1: Percentage polygons for the MINQUE and bootstrap estimates of τ² over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.]

However, for both MINQUE and the bootstrap, the estimates were more variable under the double exponential than under the normal distribution. The differences in the variability of both the MINQUE and bootstrap estimators between the normal and double exponential were more apparent at ρ = 0.20 (see Figure 6.1). The percentage polygons for both estimators showed more variability under the double exponential than under the normal. Estimation results at the three levels of the population intraclass correlation show that, though the bootstrap seems to be a more stable estimator of τ², particularly at the low level of the intraclass correlation, the characteristics of the tails of the distribution seem to affect the bootstrap and MINQUE equally in estimating τ².
Table 6.2 presents the average and standard deviation of the ten estimable functions of σ̂²ₑ and/or σ̂²ₑ* over the 400 trials under the normal and double exponential error terms and sets of random effects for the three levels of the population intraclass correlation. From Table 6.2 it is shown that the bootstrap slightly underestimated σ²ₑ under both the normal and double exponential distributions for ρ = 0.01. MINQUE slightly underestimated σ²ₑ under normality but slightly overestimated σ²ₑ under the double exponential for ρ = 0.01. At this level of the population intraclass correlation, the bias for MINQUE was −0.0680 under the normal and 0.0698 under the double exponential. The bias for the bootstrap estimate was −0.2025 under the normal and −0.0730 under the double exponential. The bootstrap estimate of bias was observed at −0.1344 under the normal and −0.1428 under the double exponential. The average results for R₁ and R₂ demonstrated a very successful estimation process at this level of the population intraclass correlation. R₁ was observed at 0.9987 for the normal and at 0.9986 for the double exponential, while R₂ was observed at 0.9993 under the normal and at 1.0007 under the double exponential.

Table 6.2
Average and standard deviation of the functions of the estimates σ̂²ₑ and/or σ̂²ₑ* under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

                                            Normal                Double Exponential
  ρ     Estimate              Par.Value    Average     S.D.      Average     S.D.
 0.01   Bootstrap, σ̂²ₑ*        100.00     99.7975    3.6352     99.9270    5.9080
        MINQUE, σ̂²ₑ            100.00     99.9320    3.6223    100.0698    5.9100
        BIAS = σ̂²ₑ* − σ̂²ₑ                 −0.1344    0.2331     −0.1428    0.3939
        D₁ = σ̂²ₑ − σ²ₑ                    −0.0680    3.6223      0.0698    5.9100
        D₂ = σ̂²ₑ* − σ²ₑ                   −0.2025    3.6352     −0.0730    5.9080
        R₁ = σ̂²ₑ*/σ̂²ₑ                      0.9987    0.0023      0.9986    0.0039
        R₂ = σ̂²ₑ/σ²ₑ                       0.9993    0.0362      1.0007    0.0591
        MSE1 = (σ̂²ₑ − σ²ₑ)²               13.0927   17.0420     34.8462   50.6894
        MSE2 = (σ̂²ₑ* − σ²ₑ)²              13.2225   17.1911     34.8222   51.0716
        Rel. Efficiency°                   1.0099                0.9993

 0.05   Bootstrap, σ̂²ₑ*        100.00     99.6360    3.4799     99.9349    5.8466
        MINQUE, σ̂²ₑ            100.00     99.7693    3.4733    100.0755    5.8629
        BIAS = σ̂²ₑ* − σ̂²ₑ                 −0.1333    0.2398     −0.1406    0.3942
        D₁ = σ̂²ₑ − σ²ₑ                    −0.2307    3.4733      0.0755    5.8629
        D₂ = σ̂²ₑ* − σ²ₑ                   −0.3640    3.4799     −0.0651    5.8466
        R₁ = σ̂²ₑ*/σ̂²ₑ                      0.9987    0.0024      0.9987    0.0039
        R₂ = σ̂²ₑ/σ²ₑ                       0.9977    0.0347      1.0008    0.0586
        MSE1 = (σ̂²ₑ − σ²ₑ)²               12.0867   17.0071     34.3938   49.9322
        MSE2 = (σ̂²ₑ* − σ²ₑ)²              12.2122   17.2576     34.1013   50.2230
        Rel. Efficiency°                   1.0104                0.9915

 0.20   Bootstrap, σ̂²ₑ*        100.00    100.0437    3.7669     99.7328    5.7180
        MINQUE, σ̂²ₑ            100.00    100.1660    3.7717     99.8545    5.7151
        BIAS = σ̂²ₑ* − σ̂²ₑ                 −0.1223    0.2361     −0.1216    0.3878
        D₁ = σ̂²ₑ − σ²ₑ                     0.1660    3.7718     −0.1455    5.7151
        D₂ = σ̂²ₑ* − σ²ₑ                    0.0437    3.7669     −0.2672    5.7180
        R₁ = σ̂²ₑ*/σ̂²ₑ                      0.9988    0.0024      0.9988    0.0039
        R₂ = σ̂²ₑ/σ²ₑ                       1.0017    0.0377      0.9985    0.0572
        MSE1 = (σ̂²ₑ − σ²ₑ)²               14.2177   19.9419     32.6019   44.6796
        MSE2 = (σ̂²ₑ* − σ²ₑ)²              14.1557   19.9124     32.6852   45.4206
        Rel. Efficiency°                   0.9956                1.0026

° Rel. Efficiency = MSE2/MSE1

Similarly, surprisingly accurate results were observed at the 0.05 level of the intraclass correlation. At this level, the average bootstrap estimate of σ²ₑ was observed at 99.6360, with a bias of −0.3640, under the normal and at 99.9349, with a bias of −0.0651, under the double exponential. The average MINQUE estimate of σ²ₑ was observed at 99.7693, with a bias of −0.2307, under the normal and at 100.0755, with a bias of 0.0755, under the double exponential. Compared to the expected value of 1.00, both MINQUE and the bootstrap estimated the ratios very closely, with R₁ = 0.9987 under both the normal and double exponential. R₂ was observed at 0.9977 and 1.0008 under the normal and double exponential respectively.
At the first two levels of the intraclass correlation condition (ρ = 0.01 and ρ = 0.05), the bootstrap and MINQUE were very close under both the normal and double exponential distributions. At the 0.20 level of the intraclass correlation, both MINQUE and the bootstrap slightly overestimated σ² under the normal and slightly underestimated σ² under the double exponential. The bootstrap average was closer to the true value of the parameter than the MINQUE average, with a bias of 0.0437 under the normal distribution. On the other hand, MINQUE was closer to the parameter than the bootstrap, with a bias of −0.1455, under the double exponential distribution. Average values of R₁ and R₂ were very close to their expected value (1.00) at this level of the intraclass correlation under both the normal and double exponential distributions. On average, therefore, the results in Table 6.2 show that both MINQUE and the bootstrap very closely estimated the parameter σ² under both the normal and double exponential errors and sets of random effects at all three levels of the intraclass correlation. However, at all three levels of the intraclass correlation, the standard deviation of the functions of the estimates was relatively higher under the double exponential than under the normal distribution. Regardless of the underlying distribution of the errors and sets of random effects of the model, the ratios of the estimators, R₁ and R₂, were quite close to 1.00 at all levels of the intraclass correlation. Figure 6.2 displays the percentage polygons of the 400 bootstrap and MINQUE estimates of σ² under the normal and double exponential errors and sets of random effects at each of the three levels of the population intraclass correlation. From Figure 6.2 we see that, at all levels of the population intraclass correlation, the bootstrap estimator followed the MINQUE quite closely. Percentage polygons for both estimators were centered near the true parameter value set at 100.
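The ten summary functions reported in Table 6.2 (and reused in the later tables) are simple per-trial transformations of the MINQUE and bootstrap estimates. A minimal sketch of how they might be computed over the trials is given below; the function name and array arguments are illustrative assumptions, not code from the dissertation.

```python
import numpy as np

def summary_functions(minque, boot, par):
    """Average and SD of the Table 6.2 summary functions for a parameter
    with true value `par`, given per-trial MINQUE and bootstrap
    estimates (illustrative sketch, not the original code)."""
    minque = np.asarray(minque, dtype=float)
    boot = np.asarray(boot, dtype=float)
    funcs = {
        "BIAS": boot - minque,       # bootstrap estimate of bias
        "D1": minque - par,          # MINQUE deviation from the parameter
        "D2": boot - par,            # bootstrap deviation from the parameter
        "R1": boot / minque,         # ratio of bootstrap to MINQUE estimate
        "R2": minque / par,          # ratio of MINQUE estimate to parameter
        "SE1": (minque - par) ** 2,  # squared error, MINQUE
        "SE2": (boot - par) ** 2,    # squared error, bootstrap
    }
    out = {k: (v.mean(), v.std(ddof=1)) for k, v in funcs.items()}
    # relative efficiency = MSE2/MSE1 (bootstrap MSE over MINQUE MSE)
    out["RelEff"] = funcs["SE2"].mean() / funcs["SE1"].mean()
    return out
```

With the ρ = 0.01 column of Table 6.2, for example, the relative efficiency is 13.2225/13.0927 ≈ 1.0099, the value printed in the table.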
However, differences in the variation of the estimates by distribution were quite obvious. The spread of both the MINQUE and bootstrap percentage polygons was clearly higher under the double exponential than under the normal distribution. For the estimation of σ², therefore, it can be argued that while both MINQUE and the bootstrap fairly closely estimated σ², their efficiency was severely affected by the nature and size of the tails of the distribution of the errors and sets of random effects. Both estimators were less efficient under a distribution with long and thick tails (like that of the double exponential) than under a distribution with short and lighter tails.

Figure 6.2 Percentage polygons for the MINQUE and bootstrap estimates of σ² over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

The intraclass correlation is given as a function of τ² and σ², whose formula is shown in Equation 4.3. It is the index which measures the degree of dependence among observations sharing a context, as well as the proportion of the total variation in the response values that lies between contexts (Raudenbush and Bryk, 1988). Success in estimating the model parameters often depends on this measure, with less success when ρ is quite small. For this reason, the population intraclass correlation was used as an important design factor in the present study. The MINQUE estimator ρ̂ of ρ is obtained by substituting τ̂² for τ² and σ̂² for σ² in Equation 4.3. Likewise, the bootstrap estimator ρ̂* is obtained by substituting τ̂²* and σ̂²* for τ² and σ² respectively in Equation 4.3.
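Equation 4.3, ρ = τ²/(τ² + σ²), and its plug-in estimators can be sketched directly. The helper names below are assumptions for illustration; the inversion shows how a design value of ρ fixes τ² once σ² is set at 100, as in these simulations.

```python
def intraclass_correlation(tau2, sigma2):
    """Equation 4.3: the proportion of total variation between contexts."""
    return tau2 / (tau2 + sigma2)

def tau2_for_design(rho, sigma2=100.0):
    """Invert Equation 4.3: the between-context variance implied by a
    design value of rho when the within variance sigma2 is fixed."""
    return rho * sigma2 / (1.0 - rho)

# The MINQUE estimator of rho plugs tau2_hat and sigma2_hat into the
# same formula; the bootstrap estimator plugs in their starred versions.
rho_hat = intraclass_correlation(5.0, 95.0)  # e.g. tau2_hat = 5, sigma2_hat = 95
```

For the design levels used here, with σ² = 100 the implied between-context variances are τ² ≈ 1.0101, 5.2632, and 25.0 for ρ = 0.01, 0.05, and 0.20 respectively.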
Bootstrap and Monte Carlo results for the estimation of the ten estimable functions of ρ̂ and/or ρ̂* under the normal and double exponential errors and sets of random effects of the model for the three levels of the population intraclass correlation are presented in Table 6.3. The summary statistics in Table 6.3 show that both MINQUE and the bootstrap slightly overestimated ρ under both the normal and double exponential distributions at the 0.01 level of the intraclass correlation. The biases for MINQUE and the bootstrap were correspondingly 0.00013 and 0.0032 under the normal and 0.0005 and 0.0035 under the double exponential. The bootstrap estimate of bias was 0.0031 under the normal and 0.0030 under the double exponential. The results show that though R₁ was poorly estimated at ρ = 0.01, the other nine functions of ρ̂ and/or ρ̂* were estimated near their expected values at this level of the population intraclass correlation. Mean square errors for both MINQUE and the bootstrap were surprisingly close to zero under both the normal and double exponential distributions for ρ = 0.01. At the 0.05 level of the intraclass correlation, all ten estimable functions of ρ̂ and/or ρ̂* were estimated very close to their expected values.

Table 6.3
Average and standard deviation of the functions of the estimates ρ̂ and/or ρ̂* under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

                                                Normal               Double Exponential
Value of ρ  Estimate              Par.Value   Average     S.D.      Average     S.D.
0.01   Bootstrap, ρ̂*                0.01      0.0132    0.0072      0.0134    0.0070
       MINQUE, ρ̂                    0.01      0.0101    0.0088      0.0105    0.0087
       BIAS = ρ̂* − ρ̂                          0.0031    0.0021      0.0030    0.0022
       D₁ = ρ̂ − ρ                             0.00013   0.0088      0.0005    0.0088
       D₂ = ρ̂* − ρ                            0.0032    0.0072      0.0035    0.0070
       R₁ = ρ̂*/ρ̂                              1.1407    3.4598      1.3958    3.3220
       R₂ = ρ̂/ρ                               1.0132    0.8798      1.0460    0.8672
       SE1 = (ρ̂ − ρ)²                         0.0001    0.0001      0.0001    0.0001
       SE2 = (ρ̂* − ρ)²                        0.0001    0.0001      0.0001    0.0001
       Rel. Efficiency†                       1.0000                1.0000

0.05   Bootstrap, ρ̂*                0.05      0.0501    0.0151      0.0521    0.0193
       MINQUE, ρ̂                    0.05      0.0491    0.0151      0.0509    0.0193
       BIAS = ρ̂* − ρ̂                          0.0010    0.0011      0.0012    0.0012
       D₁ = ρ̂ − ρ                            −0.0009    0.0151      0.0009    0.0193
       D₂ = ρ̂* − ρ                            0.0001    0.0151      0.0021    0.0193
       R₁ = ρ̂*/ρ̂                              1.0237    0.0306      1.0306    0.0962
       R₂ = ρ̂/ρ                               0.9829    0.3023      1.0182    0.3866
       SE1 = (ρ̂ − ρ)²                         0.0002    0.0003      0.0004    0.0006
       SE2 = (ρ̂* − ρ)²                        0.0002    0.0003      0.0004    0.0006
       Rel. Efficiency†                       1.0000                1.0000

0.20   Bootstrap, ρ̂*                0.20      0.1978    0.0381      0.2002    0.0534
       MINQUE, ρ̂                    0.20      0.1972    0.0381      0.1992    0.0534
       BIAS = ρ̂* − ρ̂                          0.0006    0.0014      0.0009    0.0014
       D₁ = ρ̂ − ρ                            −0.0028    0.0381     −0.0008    0.0534
       D₂ = ρ̂* − ρ                           −0.0022    0.0381      0.0002    0.0534
       R₁ = ρ̂*/ρ̂                              1.0033    0.0074      1.0051    0.0080
       R₂ = ρ̂/ρ                               0.9860    0.1906      0.9962    0.2671
       SE1 = (ρ̂ − ρ)²                         0.0015    0.0021      0.0028    0.0043
       SE2 = (ρ̂* − ρ)²                        0.0015    0.0021      0.0028    0.0044
       Rel. Efficiency†                       1.0000                1.0000

† Rel. Efficiency = MSE2/MSE1.

The average values of ρ̂ and ρ̂* were 0.0491 and 0.0501 respectively under the normal and 0.0509 and 0.0521 respectively under the double exponential. Accordingly, the biases for the bootstrap and MINQUE were respectively 0.0001 and −0.0009 under the normal and 0.0021 and 0.0009 under the double exponential. The bootstrap estimate of bias was 0.0010 under the normal and 0.0012 under the double exponential. Under this condition of the intraclass correlation, R₁ and R₂ were fairly close to 1.00, with R₁ = 1.0237 and 1.0336 under the normal and double exponential respectively, and R₂ = 0.9829 and 1.0182 under the normal and double exponential respectively. The mean square errors for both MINQUE and the bootstrap were quite low under both the normal and double exponential at this level of the intraclass correlation. The bootstrap slightly underestimated ρ under the normal but very accurately estimated ρ under the double exponential at the 0.20 condition of the intraclass correlation.
On the other hand, on average MINQUE slightly underestimated ρ under both the normal and double exponential at the 0.20 condition of the intraclass correlation. The MINQUE and bootstrap biases were observed at −0.0028 and −0.0022 respectively under the normal and −0.0008 and 0.0002 respectively under the double exponential. The bootstrap estimate of bias was 0.0006 under the normal and 0.0009 under the double exponential. R₁ was surprisingly close to 1.00 under both the normal and double exponential, but R₂ was slightly less than 1.00 under both distributions. At this level of the intraclass correlation condition, both the MINQUE and bootstrap mean square errors were quite low, both observed at 0.0015 under the normal and 0.0028 under the double exponential. Based on these results, therefore, it is apparent that while the estimation of functions of (τ̂², τ²) and (σ̂², σ²) may not have been very successful, the estimation of functions of (ρ̂*, ρ̂), which in turn depend on τ̂²*, τ̂², σ̂²*, and σ̂², seems to have been fairly successful for both MINQUE and the bootstrap. However, at this condition of the intraclass correlation, the bootstrap ratio R₁ was closer to 1.00 than R₂ under both normal and double exponential errors and sets of random effects of the model. It may also be important to note that, for the estimation of the parameter ρ, the bootstrap/MINQUE measure of relative efficiency at all three levels of the intraclass correlation was extremely close to 1.00 under both the normal and double exponential distributions. Figure 6.3 shows the percentage polygons of the 400 bootstrap and MINQUE estimates of ρ under the normal and double exponential distributions at each of the three levels of the population intraclass correlation.
At the 0.01 level of the intraclass correlation condition, though both the MINQUE and bootstrap percentage polygons were centered near the population parameter value of ρ, it is evident that a greater mass of observations lay around the parameter value under the bootstrap polygon than under the MINQUE polygon, for both the normal and double exponential distributions. Thus, once again, the bootstrap method has been shown to be a more efficient estimator of ρ than MINQUE at the 0.01 level of the intraclass correlation condition. Percentage polygons for the 400 MINQUE and bootstrap estimates under the normal and double exponential distributions at the 0.05 level of the intraclass correlation condition show that MINQUE and the bootstrap followed each other very closely. However, the percentage polygons under this condition of the intraclass correlation indicated that the values of both estimates were more variable under the double exponential than under the normal. The percentage polygons under the 0.20 level of the intraclass condition show that the bootstrap and MINQUE followed each other even more closely than at the 0.05 intraclass correlation condition.

Figure 6.3 Percentage polygons for the MINQUE and bootstrap estimates of ρ over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.
Likewise, values of both estimates were more variable under the double exponential than under the normal distribution. The estimation results at the three levels of the intraclass correlation conditions indicate that the bootstrap is a more stable estimator of ρ, particularly at the 0.01 level of the intraclass correlation condition. However, the nature and size of the tails of the distribution of the errors and sets of random effects influence the bootstrap and MINQUE equally in estimating ρ. Estimation tends to be less successful under a distribution with long and thick tails (like that of the double exponential) than under a shorter- and lighter-tailed distribution.

The fixed effects parameters of the model, which included α₁, α₂, and α₃ for the three levels of the fixed factor and β, the coefficient of the covariate, were estimated at the three levels of the intraclass correlation conditions under the normal and double exponential errors and sets of random effects of the model. Because the αⱼ for j = 1, 2, 3 are linearly dependent, estimation is only required for any two of the αⱼ. For the purposes of the present dissertation, estimation results for α₁, α₂, and β are presented for each of the six design factor combinations. Table 6.4 presents the summary statistics over the 400 trials for the ten estimable functions of α̂₁ and/or α̂₁* under the three levels of the intraclass correlation condition for the normal and double exponential distributions. The means and standard deviations presented in Table 6.4 show that both MINQUE and the bootstrap very closely estimated the fixed effect parameter α₁ at all three levels of the intraclass correlation condition under both normal and double exponential distributions.
At the 0.01 level of the intraclass correlation, the biases for MINQUE and the bootstrap were 0.0261 and 0.0264 respectively under the normal and −0.0243 and −0.0251 respectively under the double exponential. The bootstrap estimate of bias at this level was 0.0003 under the normal and −0.0008 under the double exponential.

Table 6.4
Average and standard deviation of the functions of the estimates α̂₁ and/or α̂₁* under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

                                                Normal               Double Exponential
Value of ρ  Estimate              Par.Value   Average     S.D.      Average     S.D.
0.01   Bootstrap, α̂₁*               −5.00    −4.9736    0.9302     −5.0251    0.9073
       MINQUE, α̂₁                   −5.00    −4.9739    0.9264     −5.0243    0.9032
       BIAS = α̂₁* − α̂₁                        0.0003    0.0587     −0.0008    0.0619
       D₁ = α̂₁ − α₁                           0.0261    0.9264     −0.0243    0.9032
       D₂ = α̂₁* − α₁                          0.0264    0.9302     −0.0251    0.9073
       R₁ = α̂₁*/α̂₁                            0.9999    0.0128      1.0000    0.0133
       R₂ = α̂₁/α₁                             0.9948    0.1853      1.0049    0.1806
       SE1 = (α̂₁ − α₁)²                       0.8568    1.3327      0.8143    1.1918
       SE2 = (α̂₁* − α₁)²                      0.8638    1.3334      0.8217    1.2026
       Rel. Efficiency†                       1.0082                1.0091

0.05   Bootstrap, α̂₁*               −5.00    −5.0139    1.0867     −5.0211    1.0484
       MINQUE, α̂₁                   −5.00    −5.0143    1.0889     −5.0221    1.0438
       BIAS = α̂₁* − α̂₁                        0.0004    0.0606      0.0010    0.0632
       D₁ = α̂₁ − α₁                          −0.0143    1.0889     −0.0221    1.0438
       D₂ = α̂₁* − α₁                         −0.0139    1.0867     −0.0211    1.0484
       R₁ = α̂₁*/α̂₁                            1.0001    0.0128      0.9996    0.0138
       R₂ = α̂₁/α₁                             1.0029    0.2178      1.0044    0.2088
       SE1 = (α̂₁ − α₁)²                       1.1830    1.9324      1.0873    1.5678
       SE2 = (α̂₁* − α₁)²                      1.1781    1.9236      1.0968    1.5833
       Rel. Efficiency†                       0.9959                1.0087

0.20   Bootstrap, α̂₁*               −5.00    −4.9529    1.6260     −5.0502    1.5745
       MINQUE, α̂₁                   −5.00    −4.9522    1.6186     −5.0488    1.5736
       BIAS = α̂₁* − α̂₁                       −0.0007    0.0613     −0.0014    0.0660
       D₁ = α̂₁ − α₁                           0.0478    1.6186     −0.0488    1.5736
       D₂ = α̂₁* − α₁                          0.0471    1.6260     −0.0502    1.5745
       R₁ = α̂₁*/α̂₁                            1.0000    0.0183      1.0001    0.0172
       R₂ = α̂₁/α₁                             0.9904    0.3237      1.0098    0.3147
       SE1 = (α̂₁ − α₁)²                       2.6156    3.9830      2.4724    3.2204
       SE2 = (α̂₁* − α₁)²                      2.6394    4.0097      2.4754    3.2304
       Rel. Efficiency†                       1.0091                1.0012

† Rel. Efficiency = MSE2/MSE1.
At this level of the intraclass correlation, R₁ was observed at 0.9999 under the normal and 1.0000 under the double exponential, compared to R₂, which was 0.9948 under the normal and 1.0049 under the double exponential. The relative efficiency of the two estimators was extremely close to 1.00 under both the normal and double exponential. For ρ = 0.05, the averages of the bootstrap and MINQUE estimates over the 400 trials were −5.0139 and −5.0143 respectively under the normal and −5.0211 and −5.0221 respectively under the double exponential. Their respective biases were −0.0139 and −0.0143 under the normal and −0.0211 and −0.0221 under the double exponential. Compared to the expected ratio of the estimates at 1.00, both MINQUE and the bootstrap estimated the ratio very closely, with R₁ = 1.0001 under the normal and R₁ = 0.9996 under the double exponential, while R₂ = 1.0029 under the normal and R₂ = 1.0044 under the double exponential. Estimation of α₁ at the 0.20 level of the intraclass correlation was equally successful, with R₁ being closer to 1.00 than R₂ under both the normal and double exponential distributions. The bootstrap estimate of bias was lower under the normal than under the double exponential. However, the bootstrap and MINQUE biases differed by no more than 0.007 under either the normal or the double exponential distribution, and the measure of efficiency was extremely close to 1.00 under both. The effect of the intraclass correlation condition on the estimation of the functions of α̂₁ and/or α̂₁* was apparent in the MINQUE and bootstrap mean square errors. Both mean square errors tended to increase with increasing ρ under both normal and double exponential distributions. For instance, the bootstrap mean square errors were 0.8638, 1.1781, and 2.6394 for ρ = 0.01, 0.05, and 0.20 respectively under the normal, and 0.8217, 1.0968, and 2.4754 for ρ = 0.01, 0.05, and 0.20 respectively under the double exponential.
The mean square errors for the MINQUE estimates were quite close to those of the bootstrap at all levels of the intraclass correlation under both normal and double exponential. Under both distributions, the bootstrap/MINQUE measure of relative efficiency was extremely close to 1.00, indicating that the bootstrap and MINQUE estimators estimated the parameter α₁ about equally well. In general, therefore, the results in Table 6.4 show that both MINQUE and the bootstrap very closely estimated the parameter α₁ at all levels of the intraclass correlation conditions under both normal and double exponential distributions. The ratio R₁ was consistently closer to 1.00 than R₂ at all six design factor combinations, indicating a great deal of promise for the bootstrap method. Figure 6.4 presents six percentage polygons of the 400 bootstrap and MINQUE estimates of α₁ under the normal and double exponential errors and sets of random effects at each of the three levels of the population intraclass correlation condition. From these charts it is clear that the bootstrap generally followed MINQUE very closely at all levels of the intraclass correlation. The figures also showed no obvious difference in estimation between the normal and double exponential distributions. However, the highest spread of the estimates for both the bootstrap and MINQUE was observed at the 0.20 level of the intraclass correlation, followed by the 0.05 level; the spread of both estimates was lowest at the 0.01 level. The percentage polygons shown in Figure 6.4 therefore indicate that, though MINQUE and the bootstrap do not differ in estimating α₁, the ability of both to produce efficient (less variable) estimates depends on the level of the population intraclass correlation.

Figure 6.4 Percentage polygons for the MINQUE and bootstrap estimates of α₁ over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.
Both methods yield less efficient estimates when the population intraclass correlation is high, and their mean square errors increased at roughly the same rate with increasing intraclass correlation. Despite differences in the variability of the estimates at different levels of the intraclass correlation, the percentage polygons indicate that the estimates were centered at nearly the same point. The estimates were expected to be centered at the true population parameter value, which was set at −5.00. The results showed that all six percentage polygons were centered no more than 0.05 away from the true parameter value.

Summary results for the bootstrap and MINQUE estimates of the parameter α₂, based on the ten estimable functions of α̂₂ and/or α̂₂* over the 400 trials, are presented in Table 6.5. Summary results are presented for each of the three levels of the intraclass correlation conditions under both the normal and double exponential distributions of the errors and sets of random effects of the model. The true population parameter α₂ was set at 3.00. The MINQUE and bootstrap estimates are compared against the true population parameter value.
Averages and standard deviations over the 400 trials presented in Table 6.5 show that both MINQUE and the bootstrap very closely estimated the parameter α₂. At the 0.01 level of the intraclass correlation, both MINQUE and the bootstrap slightly overestimated the parameter α₂ under the normal and slightly underestimated α₂ under the double exponential errors and sets of random effects of the model. The biases for MINQUE and the bootstrap were 0.0366 and 0.0362 respectively under the normal and −0.0123 and −0.0101 under the double exponential. The bootstrap estimate of bias was observed at −0.0004 under the normal and 0.0021 under the double exponential. The ratio R₁ was surprisingly close to 1.00 under both the normal and double exponential errors and sets of random effects of the model.

Table 6.5
Average and standard deviation of the functions of the estimates α̂₂ and/or α̂₂* under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

                                                Normal               Double Exponential
Value of ρ  Estimate              Par.Value   Average     S.D.      Average     S.D.
0.01   Bootstrap, α̂₂*                3.00      3.0362    0.9030      2.9878    0.9053
       MINQUE, α̂₂                    3.00      3.0366    0.9029      2.9899    0.9109
       BIAS = α̂₂* − α̂₂                        −0.0004    0.0556      0.0021    0.0625
       D₁ = α̂₂ − α₂                            0.0366    0.9029     −0.0123    0.9053
       D₂ = α̂₂* − α₂                           0.0362    0.9030     −0.0101    0.9109
       R₁ = α̂₂*/α̂₂                             1.0005    0.0228      1.0016    0.0357
       R₂ = α̂₂/α₂                              1.0122    0.3010      0.9959    0.3018
       SE1 = (α̂₂ − α₂)²                        0.8145    1.1035      0.8176    1.2312
       SE2 = (α̂₂* − α₂)²                       0.8146    1.0961      0.8277    1.2530
       Rel. Efficiency†                        1.0001                1.0124

0.05   Bootstrap, α̂₂*                3.00      3.0380    1.0368      3.0045    1.0326
       MINQUE, α̂₂                    3.00      3.0349    1.0367      3.0002    1.0278
       BIAS = α̂₂* − α̂₂                         0.0031    0.0582      0.0043    0.0630
       D₁ = α̂₂ − α₂                            0.0349    1.0366      0.0002    1.0278
       D₂ = α̂₂* − α₂                           0.0380    1.0368      0.0045    1.0326
       R₁ = α̂₂*/α̂₂                             1.0015    0.0264      1.0003    0.0415
       R₂ = α̂₂/α₂                              1.0116    0.3455      1.0001    0.3426
       SE1 = (α̂₂ − α₂)²                        1.0732    1.4720      1.0537    1.5478
       SE2 = (α̂₂* − α₂)²                       1.0738    1.4677      1.0635    1.5753
       Rel. Efficiency†                        1.0006                1.0093

0.20   Bootstrap, α̂₂*                3.00      3.0379    1.4153      3.0711    1.4837
       MINQUE, α̂₂                    3.00      3.0380    1.4134      3.0688    1.4777
       BIAS = α̂₂* − α̂₂                        −0.0002    0.0061      0.0023    0.0646
       D₁ = α̂₂ − α₂                            0.0380    1.4134      0.0688    1.4777
       D₂ = α̂₂* − α₂                           0.0379    1.4153      0.0711    1.4837
       R₁ = α̂₂*/α̂₂                             0.9979    0.0529      1.0005    0.1067
       R₂ = α̂₂/α₂                              1.0127    0.4711      1.0229    0.4926
       SE1 = (α̂₂ − α₂)²                        1.9940    2.9561      2.1828    3.0902
       SE2 = (α̂₂* − α₂)²                       1.9995    2.9688      2.2009    3.1362
       Rel. Efficiency†                        1.0028                1.0083

† Rel. Efficiency = MSE2/MSE1.

The usual MINQUE ratio, R₂, was not as close to 1.00 as R₁, indicating that the bootstrap estimated the ratio more successfully than MINQUE. A more successful estimation of the parameter α₂ was achieved by both MINQUE and the bootstrap under the double exponential errors at the 0.05 level of the intraclass correlation condition. At this level, the average values of the MINQUE and bootstrap estimates were 3.0349 and 3.0380 respectively under the normal and 3.0002 and 3.0045 respectively under the double exponential distributions. The bootstrap and MINQUE biases were 0.0380 and 0.0349 respectively under the normal and 0.0045 and 0.0002 respectively under the double exponential distributions. The bootstrap estimate of bias was 0.0031 under the normal and 0.0043 under the double exponential. At this level of the intraclass correlation also, the ratio R₁ was closer to 1.00 than R₂ under both the normal and double exponential errors and sets of random effects of the model. However, at the 0.05 level of the intraclass correlation condition, both the MINQUE and bootstrap mean square errors were greater than at the 0.01 level of the intraclass correlation. At the 0.20 level of the intraclass correlation, both MINQUE and the bootstrap slightly overestimated the parameter α₂ under both the normal and double exponential distributions of the errors and sets of random effects of the model. However, both biases were greater under the double exponential distribution than under the normal.
The bootstrap estimate of bias was no more than 0.0025 under both distributions. On average, R₁ was closer to 1.00 than R₂ under both the normal and double exponential. The bootstrap and MINQUE mean square errors were near 2.00 at this level of the intraclass correlation, compared to about 1.05 at the 0.05 level and about 0.80 at the 0.01 level. Thus, both mean square errors seemed to increase with increasing level of the intraclass correlation condition. The bootstrap/MINQUE measure of relative efficiency at all three levels of the intraclass correlation was very close to 1.00 under both the normal and double exponential distributions.

Figure 6.5 shows the percentage polygons of the 400 bootstrap and MINQUE estimates of α₂ for each of the six design factor combinations. Separate percentage polygons are presented for the normal and double exponential distributions at each level of the intraclass correlation condition. These charts show that the bootstrap followed MINQUE so closely that it was difficult to distinguish the two at certain points. All six charts were centered near the true parameter value of α₂, which was set at 3.00. Though the percentage polygons showed no variation by the distribution of the errors and sets of random effects, the spread of the estimates varied by level of the intraclass correlation condition. The highest spread was observed at the 0.20 level of the intraclass correlation, while the lowest was seen at the 0.01 level.

The other fixed effects parameter of the model examined in the study was β, the coefficient of the covariate. The true parameter value was set at 1.00. As with the other parameters of the model, the bootstrap and MINQUE estimates of β were calculated at each of the three levels of the intraclass correlation under both normal and double exponential errors and sets of random effects of the model.
Summary results for the bootstrap and MINQUE estimates over the 400 trials for the six design factor combinations are presented in Table 6.6. Here, averages and standard deviations of the ten functions of β̂ and/or β̂* are presented at each of the three levels of the intraclass correlation for each distribution of the errors and sets of random effects of the model.

Figure 6.5 Percentage polygons for the MINQUE and bootstrap estimates of α₂ over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

Table 6.6
Average and standard deviation of the functions of the estimates β̂ and/or β̂* under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

                                                Normal               Double Exponential
Value of ρ  Estimate              Par.Value   Average     S.D.      Average     S.D.
0.01   Bootstrap, β̂*                 1.00      1.0000    0.0117      1.0005    0.0123
       MINQUE, β̂                     1.00      1.0000    0.0117      1.0005    0.0122
       BIAS = β̂* − β̂                          −0.0000    0.0008     −0.0000    0.0009
       D₁ = β̂ − β                             −0.0000    0.0117      0.0005    0.0122
       D₂ = β̂* − β                            −0.0000    0.0117      0.0005    0.0122
       R₁ = β̂*/β̂                               1.0000    0.0008      1.0000    0.0009
       R₂ = β̂/β                                1.0000    0.0117      1.0005    0.0122
       SE1 = (β̂ − β)²                          0.0001    0.0002      0.0001    0.0002
       SE2 = (β̂* − β)²                         0.0001    0.0002      0.0002    0.0002
       Rel. Efficiency†                        1.0000                2.0000

0.05   Bootstrap, β̂*                 1.00      0.9999    0.0125      1.0005    0.0123
       MINQUE, β̂                     1.00      0.9999    0.0124      1.0005    0.0123
       BIAS = β̂* − β̂                           0.0000    0.0009     −0.0001    0.0009
       D₁ = β̂ − β                             −0.0001    0.0124      0.0005    0.0123
       D₂ = β̂* − β                            −0.0001    0.0125      0.0005    0.0123
       R₁ = β̂*/β̂                               1.0000    0.0009      0.9999    0.0009
       R₂ = β̂/β                                0.9999    0.0124      1.0005    0.0123
       SE1 = (β̂ − β)²                          0.0002    0.0002      0.0002    0.0002
       SE2 = (β̂* − β)²                         0.0002    0.0002      0.0002    0.0002
       Rel. Efficiency†                        1.0000                1.0000

0.20   Bootstrap, β̂*                 1.00      0.9997    0.0121      1.0005    0.0128
       MINQUE, β̂                     1.00      0.9997    0.0120      1.0005    0.0127
       BIAS = β̂* − β̂                           0.0000    0.0009      0.0000    0.0009
       D₁ = β̂ − β                             −0.0003    0.0120      0.0005    0.0127
       D₂ = β̂* − β                            −0.0003    0.0121      0.0005    0.0128
       R₁ = β̂*/β̂                               1.0000    0.0009      1.0000    0.0009
       R₂ = β̂/β                                0.9997    0.0120      1.0005    0.0127
       SE1 = (β̂ − β)²                          0.0001    0.0002      0.0002    0.0002
       SE2 = (β̂* − β)²                         0.0001    0.0002      0.0002    0.0003
       Rel. Efficiency†                        1.0000                1.0000

† Rel. Efficiency = MSE2/MSE1.

Table 6.6 shows that the average values of the bootstrap and MINQUE estimates were extremely close to the true parameter value regardless of the level of the intraclass correlation or the distribution of the errors and sets of random effects of the model. At all levels of the intraclass correlation condition, the bootstrap and MINQUE biases were never greater than 0.0005, and the bootstrap estimate of bias was practically nil under both normal and double exponential distributions. The average of the ratio R₁ was almost always equal to 1.00 at all levels of the intraclass correlation for both normal and double exponential errors and sets of random effects of the model. However, the average of the ratio R₂ differed slightly from 1.00 for some design factor combinations. The bootstrap and MINQUE mean square errors were surprisingly small at all levels of the intraclass correlation for both normal and double exponential distributions.
Thus, based on these summary statistics, it is clear that the parameter β was very successfully estimated by both MINQUE and the bootstrap regardless of the level of the intraclass correlation condition and the distribution of the errors and sets of random effects of the model. In terms of relative accuracy, neither method (MINQUE or bootstrap) was superior to the other: their measure of relative efficiency was extremely close to 1.00 at all levels of the intraclass correlation, particularly under the normal distribution. Figure 6.6 displays the percentage polygons of the 400 bootstrap and MINQUE estimates of β under the normal and double exponential errors and sets of random effects at each of the three levels of the population intraclass correlation condition.

Figure 6.6 Percentage polygons for the MINQUE and bootstrap estimates of β over 400 trials under the normal and double exponential errors and sets of random effects for ρ = 0.01, 0.05, and 0.20.

From these charts, it is apparent that the bootstrap on average followed MINQUE quite closely at all levels of the intraclass correlation condition. The percentage polygons also showed no obvious difference in estimation between the normal and double exponential distributions. All six figures were centered extremely close to 1.00, as expected.
Results of Bootstrap Confidence Intervals

The bootstrap procedure for constructing confidence intervals is perhaps one of the most significant accomplishments of the bootstrap method. The procedure can be applied to even more complicated problems involving statistics whose sampling distributions cannot be determined analytically. Derivation of the bootstrap method for the confidence interval is based on the following assumption. For an estimator θ̂ of the parameter θ, let D = θ̂ - θ. Define D* = θ̂* - θ̂ to be the bootstrap quantity observed at each bootstrap replication. The bootstrap distribution of D* estimates the unknown distribution of D. As an illustration, percentage polygons based on 1000 repetitions, demonstrating the relationship between the distributions of the functions D and D* of τ², τ̂², and/or τ̂²* at the 0.01, 0.05, and 0.20 intraclass correlation conditions, are presented in Figure 6.7. It is important for readers to be reminded that, while D was derived from 1000 independent samples drawn from a population with predetermined parameters, D* was derived from 1000 repeated resamplings drawn with replacement from one such sample.

Figure 6.7 Percentage polygons for the relationship between the distributions of the functions D = τ̂² - τ² and D* = τ̂²* - τ̂² for ρ = 0.01, 0.05, and 0.20. [Panels for the normal and double exponential distributions at ρ = 0.01, 0.05, and 0.20; horizontal axis: value of D or D*; legend: D* bootstrap, D MINQUE.]

If the distribution of D were known, then the (1 - α)100% confidence interval could be defined using real values D_L and D_U by the probability statement,
P(D_L ≤ D ≤ D_U) ≈ 1 - α, or P(D_L ≤ θ̂ - θ ≤ D_U) ≈ 1 - α,

which can be written as

(6.1) P(θ̂ - D_U ≤ θ ≤ θ̂ - D_L) ≈ 1 - α.

Since D_U and D_L are not observable, the probability statement in Equation 6.1 is estimated by the bootstrap probability statement

(6.2) P(θ̂ - D*_U ≤ θ ≤ θ̂ - D*_L) ≈ 1 - α,

where D*_U and D*_L are bootstrap versions of D_U and D_L, respectively, computed from bootstrap samples. Equation 6.2 gives the bootstrap (1 - α)100% confidence interval via the percentile method. The procedure is highly flexible and can be applied to complicated problems in a wide range of situations where classical methods may fail to be useful. Indeed, this was one of the aspects of the present study where the bootstrap delivers while the MINQUE does not. In the present study, 90 and 95 percent bootstrap confidence intervals were constructed for each of the six parameters of the mixed model. The confidence intervals were constructed for all six design factor combinations (cells a through f in Table 4.1). Table 6.7 presents the averages and standard deviations for the 90 and 95 percent confidence limits based on the bootstrap over the 400 trials at the 0.01 level of the intraclass correlation condition. The summary statistics for the lower confidence limit (L.C.L.), upper confidence limit (U.C.L.), and the width of the confidence interval are presented under both the normal and double exponential distributions.
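The percentile-method interval of Equation 6.2 is straightforward to implement. A minimal Python sketch follows (the study's own programs were coded in SAS/IML; the function names here, and the use of the sample mean as the estimator θ̂, are illustrative assumptions, not the study's code):

```python
import random
import statistics

def percentile_ci(data, estimator, b=1000, alpha=0.05, seed=1):
    """Bootstrap (1 - alpha)100% confidence interval via the percentile
    method: form D* = theta*_hat - theta_hat for each resample, take the
    empirical alpha/2 and 1 - alpha/2 quantiles of D* as D*_L and D*_U,
    and return (theta_hat - D*_U, theta_hat - D*_L) as in Equation 6.2."""
    rng = random.Random(seed)
    theta_hat = estimator(data)
    d_star = []
    for _ in range(b):
        resample = [rng.choice(data) for _ in data]  # n draws with replacement
        d_star.append(estimator(resample) - theta_hat)
    d_star.sort()
    d_lo = d_star[int((alpha / 2) * b)]           # D*_L
    d_hi = d_star[int((1 - alpha / 2) * b) - 1]   # D*_U
    return theta_hat - d_hi, theta_hat - d_lo

# Illustrative use on a simulated sample of 50 standard normal draws.
rng0 = random.Random(7)
sample = [rng0.gauss(0.0, 1.0) for _ in range(50)]
low, high = percentile_ci(sample, statistics.mean)
```

Because the quantiles of D* are subtracted from θ̂ in reversed order, the interval is centered on the original estimate rather than on the bootstrap replicates themselves.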
Table 6.7 Averages and standard deviations of the bootstrap confidence limits and the widths of the confidence intervals about the six parameters of the model for ρ = 0.01 under the normal and double exponential distributions.

From Table 6.7 it is shown that the bootstrap 95% confidence intervals about the parameter τ² at the 0.01 level of the intraclass correlation had average widths of 5.1793 and 5.3868 under the normal and double exponential respectively. The average width of the 90% confidence interval was 4.3137 and 4.4735 under the normal and double exponential respectively. The standard deviations corresponding to these averages show clearly how precise these intervals were. Averages and standard deviations of the confidence limits and widths of the confidence intervals about the parameter σ_e² also showed a fairly precise bootstrap interval estimation process. The average widths of the confidence intervals were fairly low, particularly under the normal distribution. Standard deviations corresponding to these averages were 0.9369 under the normal and 2.2998 under the double exponential.
Since the bootstrap interval estimation process about the parameters τ² and σ_e², both of which are components of ρ (see Equation 4.3), was rather successful, the results showed an equally successful bootstrap interval estimation process about the parameter ρ. The average width of the 95% confidence interval was 0.0508 under the normal and 0.0523 under the double exponential distribution. At this level of the intraclass correlation condition, the highest success of the bootstrap confidence interval estimation procedure was achieved for the parameter β, the coefficient of the covariate of the model. For this parameter, very precise bootstrap confidence intervals were obtained under both the normal and double exponential distributions. For instance, the average width of the 95% bootstrap confidence interval was 0.0467 under the normal and 0.0467 under the double exponential. The standard deviations corresponding to these average widths were 0.0034 and 0.0037 under the normal and double exponential respectively. With these results, it is apparent that, regardless of the distribution of the errors and sets of random effects of the model, the bootstrap confidence intervals about the parameter β were extremely precise.

Table 6.8 shows the summary statistics for the 90% and 95% bootstrap confidence limits over the 400 trials at the 0.05 level of the intraclass correlation condition. Averages and standard deviations for the lower (L.C.L.) and upper (U.C.L.) confidence limits and the width of the confidence interval are presented under the normal and double exponential distributions. The summary statistics in Table 6.8 show that the bootstrap confidence intervals about the parameter τ² at the 0.05 level of the intraclass correlation were by far more successful than the same intervals at the 0.01 level of the intraclass correlation condition. The average widths were much smaller and less variable.
Summary statistics for the bootstrap confidence intervals about the parameters σ_e², α1, α3, and β at the 0.05 and 0.01 levels of the intraclass correlation showed a more precise interval estimation under both the normal and double exponential. For the parameters τ² and ρ, the bootstrap confidence interval estimation procedure at the 0.05 level of the intraclass correlation was as successful as at the 0.01 level of the intraclass correlation condition. For instance, compared to the average width of the 95% confidence interval about ρ of 0.0508 when ρ = 0.01, the same average was 0.0615 when ρ = 0.05 under the normal distribution. Similarly small differences between the two levels of the intraclass correlation in the widths of the bootstrap confidence intervals about the parameter ρ were observed under the double exponential distribution.

Table 6.8 Averages and standard deviations of the bootstrap confidence limits and the widths of the confidence intervals about the six parameters of the model for ρ = 0.05 under the normal and double exponential distributions.
Summary results for the 90% and 95% confidence intervals over the 400 trials at the 0.20 level of the intraclass correlation are presented in Table 6.9. Averages for the lower (L.C.L.) and upper (U.C.L.) confidence limits and the widths of the confidence intervals are presented under both the normal and double exponential errors and sets of random effects of the model. At this level of the intraclass correlation, these summary statistics showed slightly wider bootstrap confidence intervals about the parameters τ² and σ_e² than at the other levels of the intraclass correlation. No differences in the level of success of the method were noticed in estimating the confidence intervals about the other parameters of the model among the different levels of the intraclass correlation condition. For the fixed effects parameters, the bootstrap procedure for confidence intervals was also always successful at all levels of the intraclass correlation condition. These results revealed an important feature of the bootstrap procedure for confidence intervals about the parameters τ² and σ_e²: the success of the procedure depends on the level of the intraclass correlation. When the population intraclass correlation is small, the bootstrap procedure for confidence intervals using the percentile method about the fixed and random effects parameters of the model is quite precise. At high values of the intraclass correlation condition, the bootstrap interval estimation procedure about the random effects parameters is slightly less precise.
However, regardless of the level of the intraclass correlation, the bootstrap method for the confidence interval about the fixed effects parameters of the model (α1, α3, β) seemed to be a remarkable success.

Table 6.9 Averages and standard deviations of the bootstrap confidence limits and the widths of the confidence intervals about the six parameters of the model for ρ = 0.20 under the normal and double exponential distributions.

The reader should be reminded that the bootstrap procedure for the confidence interval demonstrated above represents perhaps the greatest charm of the bootstrap technique. Using the technique, confidence intervals which may be difficult to obtain through the usual MINQUE become possible.
Accuracy of Bootstrap Confidence Intervals

In this simulation study, 90% and 95% bootstrap confidence intervals were constructed at each of the 400 trials. Table 6.10 shows the percentage of times each of the six population parameters fell within the 90% and 95% bootstrap confidence intervals for all six design factor combinations (cells a through f). Ideal percentages are expected to be near (1 - α)100, for α = 0.10 or 0.05. From these results, it is shown that near perfect percentages were observed for all six parameters at the 0.05 level of the intraclass correlation under the normal distribution. At this level of the population intraclass correlation condition, percentages under the double exponential, though not as good as those under the normal, were not far off from the expected quantity (1 - α)100. For most of the parameters, disappointingly low percentages were observed at the 0.20 level of the intraclass correlation, particularly under the double exponential errors and sets of random effects of the model. Even at the other two levels of the intraclass correlation condition, there were more coverage probabilities below the expected quantity (1 - α) than above it. This finding perhaps sends a cautionary message to research practitioners to aim at higher nominal coverage probabilities when setting confidence intervals rather than investing high hopes in the conventional 0.90 or 0.95 coverage probabilities. However, for the parameters σ_e² and β, percentages extremely close to (1 - α)100 were observed for all six design factor combinations.

Table 6.10 Percentage of times that the true population parameters fell within the confidence intervals formed using the bootstrap procedure at the three levels of the intraclass correlation.
                                Normal              Double Exponential
ρ      Parameter   Expected   90% C.I.  95% C.I.   90% C.I.  95% C.I.
0.01   τ²          1.00        97.5      99.0       97.8      98.8
0.05   τ²          5.26        89.5      94.0       81.0      88.3
0.20   τ²          25.00       58.8      70.3       42.3      52.0
0.01   σ_e²        100.00      89.8      94.8       85.0      91.8
0.05   σ_e²        100.00      88.3      94.0       85.5      92.0
0.20   σ_e²        100.00      98.5      92.8       85.5      92.3
0.01   ρ           0.01        98.0      99.0       97.5      99.0
0.05   ρ           0.05        89.3      93.8       82.0      88.0
0.20   ρ           0.20        59.8      68.8       45.0      52.5
0.01   α1          -5.00       87.8      93.0       88.0      93.0
0.05   α1          -5.00       81.8      88.3       81.8      88.3
0.20   α1          -5.00       64.5      74.0       62.8      71.3
0.01   α3          3.00        87.5      93.5       86.8      92.8
0.05   α3          3.00        82.3      88.3       84.0      89.3
0.20   α3          3.00        70.3      76.8       66.5      76.8
0.01   β           1.00        89.0      95.0       87.5      93.5
0.05   β           1.00        86.5      93.3       87.8      93.5
0.20   β           1.00        88.5      95.0       87.5      93.5

CHAPTER VII
SUMMARY, TECHNICAL DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS

Overview

The primary purpose of the study was to demonstrate the operation of the bootstrap in estimating parameters of a mixed hierarchical linear model with random intercepts. The study demonstrated the ability of the bootstrap algorithm to provide the estimates of the fixed and random effects of the model, to generate bootstrap empirical distributions and standard errors of the statistics, and thereby to construct confidence intervals about the parameters. The design of the study utilized samples generated from populations of known parameters. Computer programs used to generate independent samples and perform Monte Carlo simulations were coded in the Statistical Analysis System (SAS), mostly using the Interactive Matrix Language (SAS/IML). Generation of data from populations of known parameter values provided a check on the performance of the estimation procedures. The population distributions from which samples were drawn were the normal and double exponential (or Laplace) distributions. The double exponential distribution (an example of a distribution with fairly long and thick tails) represented a distribution with some departure from normality, a situation in which most classical statistical methods are typically not usable.
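The data-generating step for one Monte Carlo trial can be sketched in Python as an analogue of the SAS/IML programs (the group sizes, covariate distribution, and function names here are illustrative assumptions, not the study's design). A double exponential (Laplace) variate with a given standard deviation can be drawn as the difference of two independent exponentials:

```python
import random

def laplace(rng, scale):
    # The difference of two independent Exp(1/scale) variates is Laplace
    # distributed with variance 2 * scale**2.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def generate_sample(n_groups, n_per_group, tau2, sigma2, beta, dist, seed=0):
    """One simulated data set for a two-level random-intercept model
    y_ij = beta * x_ij + u_j + e_ij, with Var(u_j) = tau2 (inter-class)
    and Var(e_ij) = sigma2 (intra-class); dist selects normal or
    double exponential random effects and errors."""
    rng = random.Random(seed)
    # scale = sd / sqrt(2) makes the Laplace draw have variance sd**2
    draw = (rng.gauss if dist == "normal"
            else lambda mu, sd: mu + laplace(rng, sd / 2 ** 0.5))
    data = []
    for j in range(n_groups):
        u_j = draw(0, tau2 ** 0.5)        # random intercept for group j
        for _ in range(n_per_group):
            x = rng.gauss(50, 10)          # illustrative covariate
            e = draw(0, sigma2 ** 0.5)     # within-group error
            data.append((j, x, beta * x + u_j + e))
    return data

# One double exponential data set at tau2 = 5.26, sigma2 = 100 (i.e. rho = 0.05).
data = generate_sample(30, 10, tau2=5.26, sigma2=100.0, beta=1.0,
                       dist="double_exponential")
```

Repeating this for each seed gives the independent samples on which the estimators were evaluated.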
The Minimum Norm Quadratic Unbiased Estimation (MINQUE) procedure was adopted as a useful method of estimating the parameters of the model at each bootstrap replication. The method provided a comparable partnership with the bootstrap since neither requires the normal distributional properties. Thus, for each parameter of the model, two estimators were provided. These estimators represented the MINQUE estimator based on the original sample and the bootstrap estimator based on the resampled data. Though the derivation of the MINQUE is based on arbitrary weights in the norm, the present study adapted an ANOVA-type method of independently estimating the variance components of the model as in Hanushek (1974). Those prior estimates were then used to determine the weights used in MINQUE. In order to extensively assess the behavior of the bootstrap and MINQUE estimators, a total of 2400 Monte Carlo simulation trials, each consisting of a different data set, were performed for the six design factor combinations. The six design factor combinations represented the three levels of the intraclass correlation by the two distributional models (normal and double exponential). In addition to simulated data, the bootstrap method and MINQUE were also applied to actual field research data to estimate both the fixed and random effects of a model involving the effect of institutional, classroom, and individual teacher variables on the self-efficacy of high school teachers. For each parameter of this specific model, the two estimators, the MINQUE and the bootstrap estimates, were provided side by side. However, the bootstrap's additional estimation advantage was demonstrated by providing the bootstrap standard errors, empirical bootstrap sampling distributions of estimators, and the 95% bootstrap confidence intervals about each of the parameters of the teachers' self-efficacy prediction model.
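For a balanced one-way random-effects layout, ANOVA-type (method-of-moments) prior estimates of the kind used to set the MINQUE weights equate the within- and between-group mean squares to their expectations, E[MSW] = σ² and E[MSB] = σ² + nτ². A minimal sketch in Python (illustrative only; the study's code was SAS/IML, and the truncation at zero is a common convention rather than the study's stated rule):

```python
def anova_variance_components(groups):
    """Balanced one-way random model y_ij = mu + u_j + e_ij.
    Returns (tau2_hat, sigma2_hat) from the ANOVA identities
    sigma2_hat = MSW and tau2_hat = (MSB - MSW) / n."""
    k = len(groups)             # number of groups
    n = len(groups[0])          # common group size (balanced design)
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    # between-group mean square on k - 1 degrees of freedom
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    # within-group mean square on k * (n - 1) degrees of freedom
    msw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (k * (n - 1))
    sigma2_hat = msw
    tau2_hat = max((msb - msw) / n, 0.0)  # truncate negative estimates at zero
    return tau2_hat, sigma2_hat
```

These prior values can then be plugged in as the weights of the MINQUE norm.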
The bootstrap estimate of bias for each estimator was also provided.

The Teachers' Self-Efficacy Model

Three parameters of the random part of the model, representing the intra- and inter-teacher variances, denoted by σ_e² and τ² respectively, and the intra-teacher correlation, denoted by ρ, were estimated using both MINQUE and the bootstrap. In addition, eleven fixed effects parameters of the fixed part of the model were studied, representing the effects of Mathematics (α1), Science (α2), English (α3), Social Science (α4), class level of preparation (β1), class size (β2), average student achievement level (β3), staff cooperation (γ1), teacher control (γ2), principal leadership (γ3), and the constant common to all classrooms, denoted by γ00. The bootstrap estimates of the intra-teacher variance, the inter-teacher variance, the effect of class size, staff cooperation, and principal leadership were close to the MINQUE estimates. For these estimators, the bootstrap estimate of bias was no more than 0.008. Except for the estimate of the constant γ00, whose bootstrap estimate of bias was 0.1124, the bootstrap estimate of bias for the remaining nine estimators was no more than 0.04. The bootstrap provided additional estimation information which was not available through the MINQUE. This included the bootstrap standard error of each estimator, the 95% confidence intervals about the parameters, and the empirical bootstrap distribution of the statistics. Extremely low values of the bootstrap standard errors were observed, particularly for the inter- and intra-teacher variances and the effects of the class level of preparation, class size, teacher control, and principal leadership. The bootstrap standard errors for these estimators were all close to 0.01. As a means of testing hypotheses about the parameters of the teachers' self-efficacy prediction model, the 95% bootstrap confidence intervals about the parameters were constructed.
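The bootstrap standard error and the bootstrap estimate of bias are simple functionals of the B = 1000 replicates: the standard deviation of the replicates and the mean replicate minus the original estimate, respectively. A hedged Python sketch (the study's code was SAS/IML; these names and the interval-based zero test are illustrative):

```python
import random
import statistics

def bootstrap_se_and_bias(data, estimator, b=1000, seed=2):
    """Return (theta_hat, se_boot, bias_boot) where
    se_boot   = standard deviation of the B bootstrap replicates and
    bias_boot = mean(theta*_hat) - theta_hat."""
    rng = random.Random(seed)
    theta_hat = estimator(data)
    reps = [estimator([rng.choice(data) for _ in data]) for _ in range(b)]
    return theta_hat, statistics.stdev(reps), statistics.fmean(reps) - theta_hat

def excludes_zero(low, high):
    """Interval-based test: reject H0 (parameter = 0) when the bootstrap
    confidence interval excludes zero."""
    return low > 0.0 or high < 0.0
```

Applied to each parameter estimate of the self-efficacy model, these quantities summarize the empirical bootstrap distribution without any distributional assumptions.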
Based on these intervals, hypotheses of whether each of these parameters was different from zero were tested. Based on this bootstrap fashion of testing hypotheses, all factors, with the exception of principal leadership, were found to have a statistically significant effect on teachers' self-efficacy. Seven percentage polygons based on B = 1000 bootstrap replications of the estimators of the inter-teacher variance (τ²), the intra-teacher variance (σ_e²), the intra-teacher correlation (ρ), and the fixed effects of Mathematics (α1), Science (α2), English (α3), and Social Science (α4) were presented. Though the estimate of the sampling distribution of the inter-teacher variance (τ²) was slightly positively skewed, and those of the effects of Mathematics (α1), Social Science (α4), and Science (α2) were slightly negatively skewed, the estimated sampling distributions of all other estimators were fairly symmetric. Perhaps more importantly, the percentage polygons for all seven estimators were centered extremely close to their corresponding usual MINQUE point estimators. By applying the bootstrap method to actual field research data, the study demonstrated three features which research practitioners may find useful. First is the bootstrap's ability to provide the standard errors and empirical bootstrap sampling distributions of estimators, and to set confidence intervals about each of the parameters. This feature is typically not available through classical methods in the absence of certain distributional assumptions. The second feature was the flexibility of the design in accommodating a wide range of independent variables (both continuous and discrete) in the model. The ability of a design to accommodate all types of independent effects is important given the limitations of most available statistical packages. For instance, the procedure VARCOMP in SAS allows only for independent effects limited to main effects, interactions, and nested effects, but not continuous effects.
The third feature, and perhaps the least expected, was the efficiency of the bootstrap computer code. Though the bootstrap is typically perceived as computer intensive, the present study utilized a simple program coded in SAS/IML through the MSU IBM 3090 VF mainframe computer. With this program, one bootstrap trial on the full model (seven independent variables) of B = 1000 replications took approximately 18:31.16 of CPU time, which was not very expensive.

The Simulation Models

Different simulated models corresponding to each of the six design factor combinations (see Table 4.1) were studied. The six models represented the two distributional models (normal and double exponential) by the three levels of the population intraclass correlation condition (ρ = 0.01, 0.05, and 0.20). The six design factor combinations are denoted by cells a through f. Each of the six models specified according to the design factor combination contained seven parameters, of which three were random and four fixed. The three random effects parameters represented the inter-class variance (τ²), the intra-class variance (σ_e²), and the intraclass correlation (ρ). The four fixed effects parameters, α1, α2, α3, and β, represented the three levels of the fixed factor and the coefficient of the covariates. Estimation of non-redundant effects was based on α1, α3, and β. A total of 400 Monte Carlo simulation trials (based on independent samples) was performed for each of the six design factor combinations (cells a through f), resulting in a grand total of 2400 Monte Carlo simulation trials, each based on a different data set drawn according to the specified design factor combination parameters. Ten estimable functions expressed in terms of the usual MINQUE and/or bootstrap estimates were used to assess the estimation of both the usual MINQUE and the bootstrap estimators.
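Estimable functions of this kind, summarizing the MINQUE estimate θ̂ and the bootstrap estimate θ̂* over the Monte Carlo trials, can be sketched as follows (illustrative Python; the function and key names are assumptions, not the study's own, and the ratio definitions follow the pattern of Table 6.6):

```python
import statistics

def evaluation_summary(theta, minque_estimates, bootstrap_estimates):
    """Summaries over the Monte Carlo trials for a parameter theta:
    BIAS = mean(theta*_hat - theta_hat), MSE1/MSE2 are the mean squared
    errors of the MINQUE and bootstrap estimators, and relative
    efficiency = MSE2 / MSE1 (as defined beneath Table 6.6)."""
    pairs = list(zip(minque_estimates, bootstrap_estimates))
    bias = statistics.fmean(b - m for m, b in pairs)
    mse1 = statistics.fmean((m - theta) ** 2 for m, _ in pairs)
    mse2 = statistics.fmean((b - theta) ** 2 for _, b in pairs)
    r1 = statistics.fmean(b / m for m, b in pairs)      # ratio theta*_hat / theta_hat
    r2 = statistics.fmean(m / theta for m, _ in pairs)  # ratio theta_hat / theta
    return {"BIAS": bias, "MSE1": mse1, "MSE2": mse2,
            "R1": r1, "R2": r2, "RelEff": mse2 / mse1}
```

Because the true parameter value θ is known by design, each function can be checked directly against its expected value.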
The ten estimable functions were carefully chosen to provide meaningful statistics like the mean square errors, MSE1 and MSE2, for the usual MINQUE and bootstrap estimators respectively, and the bootstrap estimate of bias, denoted by BIAS. Since data were drawn from populations of known parameter values, the estimable functions were checked against their expected parameter values. The following is a presentation of the summary of estimation results, organized by the population parameters of the models.

Inter-class Variance (τ²)

At the 0.01 intraclass correlation condition, both MINQUE and the bootstrap overestimated the parameter τ², with biases equal to 0.0292 and 0.0597 under the normal and double exponential respectively for MINQUE, and 0.3432 and 0.3765 under the normal and double exponential respectively for the bootstrap. Bootstrap estimates clearly improved for ρ = 0.05 under both the normal and double exponential, with biases of 0.0299 and 0.2709 under the normal and double exponential respectively. Corresponding biases for the MINQUE estimate were -0.0845 under the normal and 0.1396 under the double exponential. The bootstrap estimate of bias was observed at 0.1144 for ρ = 0.05, compared to 0.3140 for ρ = 0.01 under the normal distribution. Particularly successful estimation results for the parameter τ² were attained at the ρ = 0.20 intraclass correlation condition. The bootstrap, with a bias of -0.0447, was clearly close to the usual MINQUE, with a bias of -0.1480, under the normal distribution. The bootstrap was also fairly close to the usual MINQUE under the double exponential, with the former registering a bias of 0.4018 and the latter a bias of 0.5167. The bootstrap estimate of bias at this level was 0.1149, and the ratio R1 was surprisingly close to 1.00. On average, therefore, the MINQUE was closer to the parameter than the bootstrap only at the 0.01 level of the intraclass correlation condition under the normal.
At ρ = 0.05 and 0.20, the bootstrap was closer to the parameter value than the MINQUE under the normal distribution. Both methods failed to produce good estimates of τ² at all levels of the intraclass correlation under the double exponential distribution. Percentage polygons for the 400 MINQUE and bootstrap estimates were centered near the true population parameter value of τ² at all levels of the intraclass correlation condition. However, though the bootstrap percentage polygon appeared to be positively skewed while the MINQUE polygon was fairly symmetric, at the 0.01 level of the intraclass correlation condition a greater mass of observations was around 1.00 for the bootstrap percentage polygon than for the MINQUE polygon. The bootstrap confidence intervals about the parameter τ² were extremely wide under the double exponential as well as under the normal distribution at the 0.01 level of the intraclass correlation condition. Bootstrap confidence intervals about τ² were fairly short under both the normal and double exponential at the 0.05 level of the intraclass correlation condition, but were wider at the 0.20 level of the intraclass correlation condition. The percentages of times the true parameter value of τ² fell within the 90 or 95 percent bootstrap confidence intervals were close to either 90 or 95 at the 0.05 level of the intraclass correlation condition. The percentages of times the parameter value was captured by the bootstrap confidence intervals were furthest from the expected confidence coefficient at the 0.20 level of the intraclass correlation condition (see Table 6.10).

Intra-class Variance (σ_e²)

MINQUE and the bootstrap fairly accurately estimated the population intra-class variance under both the normal and double exponential distributions of the errors and sets of random effects parameters, at all levels of the intraclass correlation condition.
However, at the 0.20 level of the intraclass correlation, the bootstrap was closer to the parameter value than the MINQUE, with a bias of 0.0437 compared to the MINQUE bias of 0.1660 under the normal distribution. At the 0.01 level of the intraclass correlation, the statistic R2 was extremely close to unity, as expected. At all three levels of the intraclass correlation, the standard deviations of the functions of the estimates were higher under the double exponential than under the normal distribution. Percentage polygons for the MINQUE and bootstrap estimates at all levels of the intraclass correlation showed the bootstrap following the usual MINQUE quite closely. Percentage polygons for both estimators were centered extremely close to the true parameter value of σ_e², which was set at 100. Thus it can be argued that, while both MINQUE and the bootstrap fairly accurately estimate σ_e², the efficiency of these estimates is severely affected by the nature and size of the tails of the distribution of the errors and sets of random effects parameters. Both estimators are less efficient under a distribution with fairly long and/or thick tails than under a distribution with short and/or thin tails. But the measure of their relative efficiency was extremely close to unity. The bootstrap confidence intervals about the parameter σ_e² showed a very successful bootstrap interval estimation process. The average widths of the confidence intervals were quite low, particularly under the normal distribution. At all levels of the population intraclass correlation condition, the percentages of times the true parameter value of σ_e² fell within the 90 or 95 percent bootstrap confidence intervals were extremely close to either 90 or 95 under both the normal and double exponential distributions.

Intraclass Correlation (ρ)

At the 0.01 level of the population intraclass correlation condition, both MINQUE and the bootstrap very slightly overestimated ρ under both the normal and double exponential.
The biases were extremely close to zero under both distributions. With the exception of one ratio function, R, which was poorly estimated, all the other nine functions of ρ̂ and/or ρ̂* were fairly accurately estimated. At this level of the intraclass correlation condition, the mean square errors for both MINQUE and the bootstrap were particularly close to zero under both distributions. At the 0.05 level of the intraclass correlation condition, all ten estimable functions of ρ̂ and/or ρ̂* were very successfully estimated by both the MINQUE and the bootstrap. The biases for the bootstrap and MINQUE estimators were extremely close to zero under both the normal and double exponential distributions. The bootstrap slightly underestimated ρ under the normal, but very accurately estimated ρ under the double exponential, at the 0.20 level of the intraclass correlation condition. The MINQUE, on the other hand, slightly underestimated ρ under both the normal and double exponential at this level of the population intraclass correlation condition. Under both distributions, the statistics R1 and R2 were extremely close to unity. At this condition of the intraclass correlation, the bootstrap performed as well as the MINQUE in estimating the ratio of the estimate to the parameter ρ under both the normal and double exponential distributions. Percentage polygons for the 400 MINQUE and bootstrap estimates of ρ under the normal and double exponential distributions showed that the two methods followed each other very closely. For both methods, however, the estimates of ρ were more variable under the double exponential than under the normal distribution at the 0.05 and 0.20 levels of the intraclass correlation condition.
The bootstrap interval estimation about the parameter τ², as a component of ρ (see Equation 4.3), was successful, particularly at the 0.01 level of the intraclass correlation; the 90 and 95 percent confidence intervals about ρ were fairly successful under both the normal and double exponential. However, the percentage of times the parameter value of ρ fell within the bootstrap 90 and 95 percent confidence intervals was furthest from the expected confidence coefficient at the 0.20 level of the intraclass correlation.

Fixed effects parameters (α's)

Since α₁, α₂, and α₃ are linearly dependent, estimation was only required for any two of them. Estimation results for α₁ and α₃ were presented. At the 0.01 level of the intraclass correlation condition, both MINQUE and the bootstrap fairly accurately estimated both α₁ and α₃, with biases of no more than 0.027 for α₁ and 0.037 for α₃. The statistics R₁ and R₂ at this level of the intraclass correlation were extremely close to 1.00 under both the normal and double exponential distributions. Mean square errors for both MINQUE and bootstrap estimates of α₁ and α₃ were no more than 0.87 under both the normal and double exponential distributions. The measure of their relative efficiency was quite close to one.

At the 0.05 level of the intraclass correlation, all ten estimable functions of α̂ and/or α̂* were very accurately estimated by both MINQUE and the bootstrap under the normal. Surprisingly, however, the bootstrap and MINQUE estimates of α₁ were more accurate under the double exponential than under the normal. The average values of the functions R₁ and R₂ for both MINQUE and bootstrap estimates of α₁ and α₃ were extremely close to 1.00 under both the normal and double exponential for ρ = 0.05. At the 0.20 level of the intraclass correlation, though the bootstrap was closer to the parameter α₁ under the normal than under the double exponential, the biases for both MINQUE and the bootstrap were no more than 0.05.
Both biases for the bootstrap and MINQUE estimators of α₃ were close to 0.04 under the normal but near 0.07 under the double exponential. In general, therefore, both MINQUE and the bootstrap very successfully estimated the parameters α₁ and α₃ at all levels of the intraclass correlation under both the normal and double exponential distributions. However, the mean square errors for both MINQUE and bootstrap estimates of α₁ and α₃ tended to increase with the intraclass correlation under both distributions. The bootstrap confidence intervals about the parameters α₁ and α₃ at the 0.05 and 0.01 levels of the intraclass correlation showed a more precise bootstrap interval estimation process under both the normal and double exponential distributions. Except for ρ = 0.20, the percentage of times the 90 or 95 percent bootstrap confidence intervals captured the parameters α₁ and α₃ was extremely close to the expected confidence coefficient, (1 - α)100%.

Coefficient of the covariates

Perhaps the most accurate bootstrap and MINQUE estimation results were obtained for the parameter β, the coefficient of the covariates. For this parameter, the bootstrap and MINQUE average estimates over 400 trials were extremely close to the true parameter value regardless of the level of the intraclass correlation or the distribution of the errors and sets of random effects parameters of the model. At all levels of the intraclass correlation condition, the bootstrap and MINQUE biases were never greater than 0.0005, and the bootstrap estimate of bias was exactly nil under both the normal and double exponential distributions. Average values for the functions R₁ and R₂ of β̂ and/or β̂* were either extremely close to unity or exactly equal to unity at all levels of the intraclass correlation condition.
The mean square errors for the bootstrap and MINQUE estimators of β were no more than 0.0002 under the normal and 0.0003 under the double exponential at all three levels of the intraclass correlation condition. Thus, based on these results, it was evident that the parameter β was extremely accurately estimated by both MINQUE and the bootstrap, regardless of the level of the intraclass correlation condition and the nature and size of the tails of the distribution. Percentage polygons for the MINQUE and bootstrap estimates of β showed no obvious differences between MINQUE and the bootstrap, nor between their estimation ability under the normal and the double exponential. At all levels of the intraclass correlation, the percentage polygons showed that a large mass of the bootstrap and MINQUE estimates fell within 0.015 points of the true parameter value of β, which was set at 1.00. Also, regardless of the level of the intraclass correlation, the bootstrap method for the confidence intervals about the parameter β was a remarkable success. Even the percentage of times the 90 or 95 percent confidence intervals captured the true parameter value of β was extremely close to the expected confidence coefficient at all levels of the intraclass correlation for both the normal and double exponential distributions.

Technical Discussion

Much of statistical inference amounts to describing the relationship between a sample and the population from which the sample was drawn. Consider, for instance, the statistic θ̂ used to estimate an unknown parameter θ. Suppose we define a function R given by R = θ̂/θ. Since the behavior of R is unobservable, we may wish to approximate its distribution. The main principle of the bootstrap is to estimate the unknown distribution of a function such as R by the distribution of R* = θ̂*/θ̂, where θ̂* is the bootstrap version of θ̂, computed from repeated resampling.
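This ratio principle can be made concrete with a short sketch. The example below is illustrative only: it is written in Python rather than the dissertation's SAS/IML, and it uses the sample mean as a stand-in for θ̂, with an arbitrary population and seed. Quantiles of the observable R* are used in place of the unobservable quantiles of R to solve for θ.

```python
import random

rng = random.Random(11)
theta = 50.0                               # true parameter (a mean), unknown in practice
sample = [rng.gauss(theta, 8.0) for _ in range(80)]
theta_hat = sum(sample) / len(sample)      # statistic from the original sample

# R* = theta_hat*/theta_hat, with theta_hat* computed from resampled data
n, b = len(sample), 2000
r_star = sorted(
    (sum(rng.choice(sample) for _ in range(n)) / n) / theta_hat
    for _ in range(b)
)
q_lo, q_hi = r_star[int(0.05 * b)], r_star[int(0.95 * b) - 1]

# Equating quantiles of R = theta_hat/theta with those of R*:
# theta_hat/theta ~ q  implies  theta ~ theta_hat/q
ci = (theta_hat / q_hi, theta_hat / q_lo)
print(ci)
```

The replicated ratios cluster tightly around 1.00, and inverting their quantiles yields an interval for θ without any reference to a distributional form for θ̂.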
The key feature of this argument is the hypothesis that the relationship between θ̂* and θ̂ should closely resemble that between θ̂ and θ. Under the assumption that the relationships are identical, we equate the two ratios R and R* and obtain an estimate of θ which is a function of the data. Similar arguments can be made for other functions, say D* = θ̂* - θ̂, whose distribution will resemble that of D = θ̂ - θ. Bootstrap confidence intervals are then constructed based on this approximation, as demonstrated in Equations 6.1 through 6.3 in Chapter VI of this dissertation.

In the present study, through Monte Carlo simulations, the distributions of R and R* were observed through two types of resampling. The distribution of R was examined by drawing a random sample from a population having known parameters, computing the statistic θ̂, and repeating the process a large number of times. On the other hand, the distribution of R* was observed by drawing one sample from a population similar to the one used in resampling for R. From this sample, a random sample of the same size was drawn with replacement, the statistic θ̂* computed, and the process repeated a large number of times. The statistic θ̂ based on the original sample was also computed. The distributions of R and R* were then derived from this system. The purpose was then to empirically examine the resemblance of the distribution of R* to that of R.

Figures 7.1 and 7.2 present the percentage polygons for the distributions of R and R* representing the ratios of the estimators of the random and fixed parameters of a mixed hierarchical linear model discussed in Chapter II of this dissertation.

Figure 7.1  Percentage polygons for the distributions of R and R* representing the ratios of the estimates of the random parameters τ², σ², and ρ. [Normal and double exponential panels; plots not reproduced.]

Figure 7.2  Percentage polygons for the distributions of R and R* representing the ratios of the estimates of the fixed parameters α₁, α₃, and β. [Normal and double exponential panels; plots not reproduced.]

The estimators represented in Figure 7.1 correspond to the random parameters τ², σ², and ρ, while those represented in Figure 7.2 correspond to the fixed parameters α₁, α₃, and β. Estimation of these parameters was done at the 0.05 level of the intraclass correlation condition. The expected value of both R and R* is 1.00. Consequently, both percentage polygons derived from the resampled data should be centered near 1.00. Indeed, Figures 7.1 and 7.2 show that both percentage polygons were centered extremely close to 1.00.

It is important to emphasize that the distribution of R represents a sampling distribution of a statistic which is unobservable in actual research situations. Properties of this distribution can only be viewed theoretically for certain statistics, typically via normal theory. On the other hand, the distribution of R* represents an approximation of the distribution of R. More importantly, the distribution of R* is almost always observable via the bootstrap algorithm. If the distribution of R* fairly accurately approximates the distribution of R, then the bootstrap proves itself as a highly promising method in statistics.
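The two sampling schemes just described can be sketched in a few lines. The stand-in below is illustrative only: Python rather than SAS/IML, with the sample mean as the statistic and an arbitrary normal population, whereas the dissertation's statistics were the MINQUE and bootstrap estimators of the HLM parameters.

```python
import random
import statistics

rng = random.Random(3)
theta, n, trials = 100.0, 50, 1000

def mean(xs):
    return sum(xs) / len(xs)

def draw_sample():
    # Repeated sampling from a population with KNOWN theta: feasible
    # only in a simulation, never in an actual research situation.
    return [rng.gauss(theta, 10.0) for _ in range(n)]

# Distribution of R: a new sample from the population on every trial
r_dist = [mean(draw_sample()) / theta for _ in range(trials)]

# Distribution of R*: one sample, then repeated resampling with replacement
sample = draw_sample()
theta_hat = mean(sample)
r_star_dist = [
    mean([rng.choice(sample) for _ in range(n)]) / theta_hat
    for _ in range(trials)
]

# Both should be centered extremely close to 1.00, with similar spread
print(round(mean(r_dist), 3), round(mean(r_star_dist), 3))
print(round(statistics.pstdev(r_dist), 4), round(statistics.pstdev(r_star_dist), 4))
```

The first loop is the unobservable Monte Carlo benchmark; the second is the bootstrap approximation that a practitioner can always compute from a single sample.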
From Figures 7.1 and 7.2, it is apparent that the distributions of R and R* are fairly similar, particularly in terms of their location (or central tendency). They differ slightly in variability. However, the distribution of R* surprisingly appears to be even "better" than that of R, in the sense that a greater mass of observations lies near 1.00 under the R* curve than under the R curve. These variations were more clearly marked under the double exponential than under the normal distribution. Such variations in the distributions of R and R*, though slight and dependent on the underlying distribution of the errors and random effects of the model, were demonstrated to be consistent for all estimators of the six parameters of the mixed hierarchical linear model considered in the study (see Figures 7.1 and 7.2).

Conclusions

The following conclusions were drawn from the results of the Monte Carlo simulation study and the results of the application of the bootstrap and MINQUE to the estimation of the teachers' self-efficacy prediction model.

1. Though the main mission of the bootstrap is not point estimation, the average of the bootstrap estimates over B bootstrap replications can sometimes be closer to the parameter value than the estimator based on the original sample. Thus, the bootstrap may be viewed as both a point and interval estimation technique.

2. Efficiency of the usual MINQUE and bootstrap estimators of the parameters of a model is typically affected by the nature and size of the tails of the distribution of the errors and sets of random effects of the model. Both estimators are less efficient under a distribution with fairly long, thick tails than under a distribution with short, thin tails. In addition, the effect of the nature and size of the tails of the distribution tends to be more severe in estimating random effects than fixed effects of the model.

3.
The bootstrap percentile method for the confidence intervals about the parameters τ², ρ, α₁, and α₃ was successful at low intraclass correlation conditions. At the 0.20 level of the intraclass correlation, the coverage probabilities of the confidence intervals about these parameters were quite low. However, at and below the 0.05 level of the intraclass correlation condition, the bootstrap percentile method for the confidence intervals was shown to be highly promising.

4. The bootstrap's ability to estimate the standard error of a statistic, generate the empirical sampling distribution of an estimator, and thereby set confidence intervals about parameters, all without reference to any distributional properties, is its single most promising feature. This ability was very successfully demonstrated in the present study. Most importantly, the success of the bootstrap point and interval estimation abilities was proved by comparing the bootstrap estimates against the pre-determined true values of the model parameters.

5. The MINQUE and bootstrap estimates of the coefficient of the covariates of the model were surprisingly accurate. The bootstrap standard errors were extremely low and the bias was minimal. Even the bootstrap confidence intervals about the parameter β were extremely precise.

6. Applying the bootstrap and MINQUE methods to the teachers' self-efficacy prediction model, which contained several predictors, demonstrated the promising ability of both the bootstrap and MINQUE. The MINQUE, once considered computationally prohibitive, can be used on such a large model with ease, even via the bootstrap, which involves repeated computation. The bootstrap algorithm can be implemented on a large model of seven independent variables at a cost of no more than 20 CPU time for one trial of 1000 replications.

7. For a statistic θ̂ used to estimate a parameter θ, the function R, defined by R = θ̂/θ, was used to represent the relationship between θ̂ and θ.
Given θ̂* as the bootstrap version of θ̂ computed from repeated resampling, we define the function R* = θ̂*/θ̂ as an approximation to R. Through Monte Carlo simulations, the distributions of R and R* were found to be fairly similar, particularly in terms of central tendency. The distributions differed slightly in variability; the distribution of R* was slightly less variable than the distribution of R.

Recommendations

Through Monte Carlo simulations, the bootstrap was demonstrated to be a promising approach to estimating the standard error of a statistic, generating its sampling distribution, and thereby setting confidence intervals about a parameter. This approach was empirically shown to work very well in estimating the parameters of a mixed hierarchical model whose errors and random effects parameters are either normally or double exponentially distributed. Applicability of the bootstrap approach was further demonstrated in estimating the parameters of the teachers' self-efficacy prediction model.

Implementation of the bootstrap method requires a great deal of computer usage. Though modern, fast, and relatively inexpensive computers are readily available, software to implement the bootstrap algorithm is currently unavailable. Development of such software is highly recommended to make the bootstrap available to research practitioners.

Implications for further research

Results of a simulation study are typically limited in their generalization to the conditions examined in the study. The present study examined the operation of the bootstrap via MINQUE in estimating parameters of a mixed hierarchical model when the errors and random effects are either normally or double exponentially distributed. The study was done under three levels of the intraclass correlation condition. The effectiveness of the bootstrap approach under severely skewed or heavily tailed distributions remains to be seen.
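Coverage statements like those in conclusion 3 are verified in simulation by tallying how often the percentile interval captures a known parameter over many generated samples. The sketch below is a minimal stand-in (Python, not one of the dissertation's SAS/IML programs; the statistic is an ordinary sample variance, and the sample sizes, replication counts, seed, and the target value σ² = 100 are arbitrary choices echoing the simulation design).

```python
import random

def variance(xs):
    """Population variance of a list of floats."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def percentile_ci(data, stat, b, alpha, rng):
    """Percentile bootstrap interval: resample with replacement b times
    and take the alpha/2 and 1 - alpha/2 quantiles of the replicates."""
    n = len(data)
    reps = sorted(stat([rng.choice(data) for _ in range(n)]) for _ in range(b))
    return reps[int(alpha / 2 * b)], reps[int((1 - alpha / 2) * b) - 1]

rng = random.Random(7)
true_var = 100.0            # the known parameter, as in a simulation design
trials, hits = 200, 0
for _ in range(trials):
    sample = [rng.gauss(0.0, 10.0) for _ in range(60)]
    lo, hi = percentile_ci(sample, variance, b=300, alpha=0.10, rng=rng)
    hits += lo <= true_var <= hi
coverage = hits / trials    # proportion of intervals capturing the parameter
print(coverage)             # compare against the nominal 0.90
```

At small n the percentile method typically covers somewhat below the nominal level; quantifying that shortfall under skewed or heavy-tailed errors is exactly the kind of follow-up study suggested here.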
Studies to implement the bootstrap method in examining the sampling distribution of estimators of parameters whose underlying distributions are badly skewed, like the gamma, or heavily tailed, like the Cauchy, are deemed necessary to fully understand the abilities and limitations of the bootstrap approach. The present study considered a hierarchical model consisting of "micro" and "macro" models with the assumption that only the intercepts were random. Fixing the other coefficients of the "micro" models simplified the study to one of examining variance components without covariance components. A study to examine the operation of the bootstrap in models involving not only variance components but also covariance components will shed more light on the understanding of the bootstrap in the hierarchical context.

The use of the bootstrap percentile method for the confidence interval at ρ = 0.20 was not very successful in estimating certain parameters. A more promising bootstrap t-method for the confidence interval was not used because the standard error of the MINQUE estimator was not known. Further research geared toward determining the standard error of MINQUE is deemed necessary in order for bootstrap users to utilize the t-method for the confidence intervals.

APPENDICES

APPENDIX A

SUMMARY OF COMPUTATIONAL FORMULAE

The object of the study was to find the MINQUE estimate θ̂ of the variance components θ of the two-level mixed model Y = Xα + Zb. The estimate of θ, using weights w₀ and w₁ in the norm, is given by

    θ̂ = F_w⁻¹ u_w,

where F_w = {f_kk'} for k, k' = 0, 1, with f_kk' = tr(P_w Z_k Z_k' P_w Z_k' Z_k''), and u_w = {u_k}, with u_k = Y'P_w Z_k Z_k' P_w Y, for

    P_w = V_w⁻¹ - V_w⁻¹X(X'V_w⁻¹X)⁻¹X'V_w⁻¹.

Let K = (X'V_w⁻¹X)⁻¹ and A_w = V_w⁻¹X(X'V_w⁻¹X)⁻¹X'V_w⁻¹, such that P_w = V_w⁻¹ - A_w.
If we define w = 1/(1 - w₁) and c_j = w₁/(1 + (n_j - 1)w₁), where n_j is the number of micro units in macro group j, then the following is a summary of the computational formulae.

1. The matrix K:
(a) K = (X'V_w⁻¹X)⁻¹ = {w Σ_j (X_j'X_j - c_j s_j s_j')}⁻¹, where s_j = X_j'1_j.
(b) A_w = V_w⁻¹X K X'V_w⁻¹
       = w² Σ_j (I_nj - c_j 1_j 1_j') X_j K X_j' (I_nj - c_j 1_j 1_j')
       = w² Σ_j (X_j K X_j' - c_j 1_j 1_j'X_j K X_j' - c_j X_j K X_j'1_j 1_j' + c_j² 1_j 1_j'X_j K X_j'1_j 1_j').

2. The F_w matrix:
(a) f₀₀ = tr(V_w⁻²) - tr(V_w⁻¹A_w), where
    tr(V_w⁻²) = w² Σ_j n_j{(1 - c_j)² + c_j²(n_j - 1)}, and
    tr(V_w⁻¹A_w) = w³ Σ_j {tr(t_j) - c_j a_j[(1 - c_j n_j)² + (2 - c_j n_j)]},
    with t_j = X_j K X_j' and the scalar a_j = tr(s_j'K s_j) = s_j'K s_j.
(b) f₀₁ = f₁₀ = tr(V_w⁻²Z₁Z₁') - tr(V_w⁻¹A_w Z₁Z₁'), where
    tr(V_w⁻²Z₁Z₁') = w² Σ_j n_j(1 - c_j n_j)², and tr(V_w⁻¹A_w Z₁Z₁') = w³ Σ_j a_j(1 - c_j n_j)³.
(c) f₁₁ = tr(V_w⁻¹Z₁Z₁'V_w⁻¹Z₁Z₁') - tr(V_w⁻¹A_w Z₁Z₁'V_w⁻¹Z₁Z₁'), where
    tr(V_w⁻¹Z₁Z₁'V_w⁻¹Z₁Z₁') = w² Σ_j n_j²(1 - c_j n_j)², and
    tr(V_w⁻¹A_w Z₁Z₁'V_w⁻¹Z₁Z₁') = w³ Σ_j a_j n_j(1 - c_j n_j)³.

3. α̂ and u_w:
(a) α̂ = K X'V_w⁻¹Y = wK Σ_j (X_j'Y_j - c_j r_j s_j), where r_j = 1_j'Y_j is the sum of the Y elements in context j.
(b) With d_j = Y_j - X_j α̂ and h_j = 1_j'd_j,
    u₀ = w² Σ_j d_j'(I_nj - 2c_j 1_j 1_j' + c_j²n_j 1_j 1_j')d_j = w² Σ_j {d_j'd_j - c_j h_j²(2 - c_j n_j)},
    u₁ = w² Σ_j h_j²(1 - c_j n_j)².

4. Consider, for example, the simplest and naive model given by Y_ij = μ + e_ij. In the notation of the form Y = Xα + Zb, X is an (n×1) vector of 1's, α = μ is a scalar, Z = I_n is the (n×n) identity matrix, and b is the (n×1) vector of residual error terms. In this specific case, p = 1 and α̂ = (X'X)⁻¹X'Y = Ȳ.. (the grand mean), so that the MINQUE estimator reduces to

    σ̂² = F⁻¹u = (1/(n - 1)) Σ (Y_ij - Ȳ..)² = s²,

which is the moment estimator and is independent of the weights w₀ and w₁.

5. MINQUE for the one-way random effects balanced model, where Y is the (N×1) vector of N observations for N = nJ, J = number of levels.
X is the (N×1) vector of 1's, α = μ is a scalar, and Z₁ is (N×J) block diagonal, each block being a column of n 1's. Further, b = [b₀' b₁']', where b₀ is the (N×1) vector of residual error terms and b₁ is the (J×1) vector of J unobservable random effects parameters.

(a) K = (X'V_w⁻¹X)⁻¹ = {w Σ_j (n_j - c_j n_j²)}⁻¹ = {wnJ(1 - λ)}⁻¹, since n_j = n and c_j = c for all j; thus K = {wnJ(1 - λ)}⁻¹ for λ = nc.

(b) A_w = V_w⁻¹X K X'V_w⁻¹ = w²K(1 - λ)² 1_N 1_N'.

(c) f₀₀ = tr(V_w⁻²) - tr(V_w⁻¹A_w), where
  (i) tr(V_w⁻²) = w² Σ_j n_j{(1 - c_j)² + c_j²(n_j - 1)} = w²nJ{(1 - c)² + c²(n - 1)} = w²nJ(1 - 2c + nc²) = w²J(1 - λ)² + (n - 1)w²J;
  (ii) tr(V_w⁻¹A_w) = w³K Σ_j n_j(1 - c_j n_j)³ = w³KnJ(1 - λ)³ = w²(1 - λ)².
  Thus f₀₀ = w²J(1 - λ)² + (n - 1)w²J - w²(1 - λ)² = w²{(J - 1)(1 - λ)² + J(n - 1)}.

(d) f₀₁ = f₁₀ = tr(V_w⁻²Z₁Z₁') - tr(V_w⁻¹A_w Z₁Z₁'), where
  (i) tr(V_w⁻²Z₁Z₁') = w² Σ_j n_j(1 - c_j n_j)² = w²nJ(1 - λ)²;
  (ii) tr(V_w⁻¹A_w Z₁Z₁') = w³K Σ_j n_j²(1 - c_j n_j)³ = w³Kn²J(1 - λ)³ = w²n(1 - λ)².
  Thus f₀₁ = f₁₀ = w²nJ(1 - λ)² - w²n(1 - λ)² = w²n(J - 1)(1 - λ)².

(e) f₁₁ = tr(V_w⁻¹Z₁Z₁'V_w⁻¹Z₁Z₁') - tr(V_w⁻¹A_w Z₁Z₁'V_w⁻¹Z₁Z₁'), where
  (i) tr(V_w⁻¹Z₁Z₁'V_w⁻¹Z₁Z₁') = w² Σ_j n_j²(1 - c_j n_j)² = w²n²J(1 - λ)²;
  (ii) tr(V_w⁻¹A_w Z₁Z₁'V_w⁻¹Z₁Z₁') = w³K Σ_j n_j³(1 - c_j n_j)³ = w³Kn³J(1 - λ)³ = w²n²(1 - λ)².
  Thus f₁₁ = w²n²J(1 - λ)² - w²n²(1 - λ)² = w²n²(J - 1)(1 - λ)².

(f) α̂ = wK Σ_j (1 - λ)r_j (for r_j = the sum of the Y_ij in context j) = wKnJ(1 - λ)Ȳ.. = Ȳ..

(g) With d_j = Y_j - 1_j Ȳ.., h_j = 1_j'd_j = n(Ȳ.j - Ȳ..), and g_j = d_j'd_j:
  (i) Σ_j h_j² = Σ_j (nȲ.j - nȲ..)² = n² Σ_j (Ȳ.j - Ȳ..)² = n(SSB);
  (ii) Σ_j g_j = Σ_j Σ_i (Y_ij - Ȳ..)² = SST.

(h) The quadratic forms:
  (i) u₀ = w² Σ_j (g_j - 2c_j h_j² + c_j²n_j h_j²) = w²{SST - (2cn)SSB + (c²n²)SSB} = w²{SSB(1 - 2λ + λ²) + SSW} = w²{SSB(1 - λ)² + SSW};
  (ii) u₁ = w² Σ_j (1 - c_j n_j)²h_j² = w²(1 - λ)²n(SSB) = w²n(1 - λ)²SSB.

(i) D = det(F_w) = f₀₀f₁₁ - f₀₁f₁₀ = f₀₀f₁₁ - f₀₁², where
  (i) f₀₀f₁₁ = w⁴{(J - 1)(1 - λ)² + J(n - 1)}{n²(J - 1)(1 - λ)²} = w⁴n²(J - 1)²(1 - λ)⁴ + w⁴n²J(J - 1)(n - 1)(1 - λ)²;
  (ii) f₀₁² = [w²n(J - 1)(1 - λ)²]² = w⁴n²(J - 1)²(1 - λ)⁴.
  Thus D = f₀₀f₁₁ - f₀₁² = w⁴n²J(J - 1)(n - 1)(1 - λ)².

(j) τ̂² = (f₀₀u₁ - f₀₁u₀)/D, where
  (i) f₀₀u₁ = w⁴{(J - 1)(1 - λ)² + J(n - 1)}{n(1 - λ)²SSB} = w⁴n(J - 1)(1 - λ)⁴SSB + w⁴nJ(n - 1)(1 - λ)²SSB;
  (ii) f₀₁u₀ = w⁴n(J - 1)(1 - λ)²{SSB(1 - λ)² + SSW} = w⁴n(J - 1)(1 - λ)⁴SSB + w⁴n(J - 1)(1 - λ)²SSW.
  Thus f₀₀u₁ - f₀₁u₀ = w⁴n(1 - λ)²[J(n - 1)SSB - (J - 1)SSW], and

    τ̂² = [J(n - 1)SSB - (J - 1)SSW] / [nJ(J - 1)(n - 1)]
        = (1/n)[SSB/(J - 1) - SSW/(J(n - 1))]
        = (MSB - MSW)/n,

which is the same as the method of moments.

(k) σ̂² = (f₁₁u₀ - f₀₁u₁)/D, where
  (i) f₁₁u₀ = w⁴n²(J - 1)(1 - λ)²{SSB(1 - λ)² + SSW} = w⁴n²(J - 1)(1 - λ)⁴SSB + w⁴n²(J - 1)(1 - λ)²SSW;
  (ii) f₀₁u₁ = {w²n(J - 1)(1 - λ)²}{w²n(1 - λ)²SSB} = w⁴n²(J - 1)(1 - λ)⁴SSB,
which implies that f₁₁u₀ - f₀₁u₁ = w⁴n²(J - 1)(1 - λ)²SSW, such that

    σ̂² = SSW/[J(n - 1)] = MSW

(which is the same as in the method of moments).

APPENDIX E

SAS/IML COMPUTER PROGRAMS

PART 1

COMPUTER PROGRAM TO IMPLEMENT THE BOOTSTRAP ALGORITHM ON A SAMPLE OF HIERARCHICAL DATA DRAWN FROM A NORMAL POPULATION OF KNOWN PARAMETERS. THE PROGRAM FIRST SETS UP THE Z AND THE FIRST PART OF THE X MATRIX, EXCLUDING THE COVARIATES. THE CONSTRUCTION OF THESE MATRICES IS BASED ON THE NUMBER OF OBSERVATIONS IN EACH CELL TO SATISFY THE REQUIREMENTS AS IN EQUATION 2.9 IN CHAPTER II. THE PROGRAM IDENTIFIES THE COMPONENTS ON EACH AS DEMONSTRATED BY EQUATION 2.11 IN CHAPTER II. THE WEIGHT w₁ WAS DETERMINED SEPARATELY USING THE HANUSHEK (1974) METHOD.
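The closing identities of Appendix A for the balanced one-way model, τ̂² = (MSB - MSW)/n and σ̂² = MSW, can be spot-checked numerically. The sketch below is an illustrative Python stand-in, not one of the dissertation's SAS/IML programs; the number of groups, group size, grand mean, and seed are arbitrary choices (the variance components echo the τ² = 5.26, σ² = 100 design).

```python
import random

rng = random.Random(42)
J, n = 40, 25                       # J groups ("contexts"), n observations each
tau2, sigma2 = 5.26, 100.0          # variance components, as in the simulation design

# Balanced one-way random effects data: Y_ij = mu + b_j + e_ij
data = [
    [25.0 + b_j + rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    for b_j in (rng.gauss(0.0, tau2 ** 0.5) for _ in range(J))
]

grand = sum(sum(group) for group in data) / (J * n)
means = [sum(group) / n for group in data]

ssb = n * sum((m - grand) ** 2 for m in means)                   # between groups
ssw = sum((y - m) ** 2 for g, m in zip(data, means) for y in g)  # within groups
msb, msw = ssb / (J - 1), ssw / (J * (n - 1))

tau2_hat = (msb - msw) / n          # MINQUE = method-of-moments for tau^2
sigma2_hat = msw                    # MINQUE = method-of-moments for sigma^2
print(round(tau2_hat, 2), round(sigma2_hat, 2))
```

Across seeds the two estimates fluctuate around 5.26 and 100; the agreement with the ANOVA method-of-moments estimators is exact by construction, since the derivation above shows the MINQUE weights cancel in the balanced case.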
PROC INL; START; I=1/(1-l1); GROUPS=50; NV11=REPEAT(20,2,1); NV21=REPEAT(25,5,1); NV31=REPEAT(30,IO,1); NV1=NV11//NV21//NV31; NV12=REPEAT(35,5,1); NV22=REPEAT(40,3,1); NV32=REPEAT(20,3,1); NV42=REPEAT(25,5,1); NV2=NV12I/NV22//NV32//NV42; NVI3=REPEAT(30,10,1); NV33=REPEAT(40,2,1); NV3=NV13IINV23IINV33; NV=NV1//NV2//NV3; cv=I1/(1+(uv-1)ow1); 111=REPEAT(1,465,1); 212=REPEAT(1,4SO,1); 113=REPEAT(1,555,1); 101=REPEAT(O,465,1); XOZ=REPEAT(O,4SO,1); 803=REPEAT(0,555,1); xr=x11//xoz//x03; xz=xo1//x12//xoa; xa=x01//xoz//xra; x4=xr||x2||x3; OODGOGOGOQDOO Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. 143 A: 0......09. 144 PROGRAM SEGMENT TO GENERATE DATA FROM A NQRMAL POPULATION OF SPECIFIED PARAMETER VALUES. THE PROGRAM FIRST DETERMINES THE FIXED EFFECTS PARAMETERS AND THE COVARIATE WHICH IN ARE USED IN TURN TO GENERATE THE OBSERVATIONS 1 THROUGH THE EQUATION GIVEN BY: Y = (X*ALPHA) + B + E gggg: IS ANY NUMBER USED TO CREATE A RANDOM NUMBER OF OBSERVATIONS FROM SOME POPULATION. SEED = 10199; *TIS THE INDEX COUNTER FOR THE NUMBER OF SIMULATION TRIALS DO T = 1 TO 400; REPPECTS = 2.2935 * NORMAL(REPEAT(SEED,GROUPS,1)); D1 = narrncrs[1,1]; N1 = NV[1,1]; 8J1 = REPEAT(SI,N1,1); DO I = 2 TO GROUPS; SJ = REFPECTS[I,1]; N = NV[I,1]; EJI=BJIIIREPEAT(BJ,N,1): END; SOSOOSDSOOOOOOO S=BJI; n = 10 . NORMAL(REPEAT(SEED,1500,1)); :41 = REPEAT(25,1500,1); :5 = INT(75 . UNIFORM(REPEAT(SEED,1500,1))) + :41; x=X4||xs; ALPHA = {-s,2,3,1.0}; Y = (X:ALPHA) + B + 3; AT THIS POINT A SPECIFIC DATA an! HAS BEEN GENERATED IITH THE FIXED nrrncrs ALPHA AND H AND E AS THE RANDOM ; PARTS or THE MODEL. IHILE THE FIXED nrrncrs REMAINED AT Tunas VALUES, THE RANDOM arrears PARAMETERS TOOK THE VALUES AS SHOIN SELLOI: INTRA-CLASS DATA SET CORRELATION TAU SQUARE SIGMA SQUARE 1 0.01 1.00 100 2 0.05 5.26 100 3 0.20 25.00 100 Q. Q. Q. Q. Q. Q. Q. Q. Q. Q0 Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. 145 * 5 IS A MATRIX WHICH IS PART OF THE PROJECTION MATRIX PW * GIVEN IN EQUATION 2.20 IN CHAPTER II. 
THE MATRIX 5 IS GIVEN BY: ‘0 X=INV(X’VWIX) THE FOLLOWING PROGRAM SEGMENT COMPUTES THE ELEMENTS OF THE MATRIX 5. Q-----------------------_--- ..... ———--- ----------- .. ------- ......--- x=o; X1=0; x2=o; M=1; N1=0; DO J=1 TO GROUPS; NJ=NV[J,I]; N1=N1+NJ; XJ=X[M:N1,]; YJ=Y[M:N1,]; EIJ=REPEAT(1,NJ,1); CJ=CV[J,1]; SJ=XJ‘*ZIJ; X1=X1+(XJ‘*XJ); X2=X2+(CJ*SJ*SJ‘); M=M+NU; END; X:W*(X1-X2); X=INV(X); S... St S Q. Q. Q. Q. Q. Q. Q. Q. Q. a------ ---------------------- ---- ----------------- - ------------- ; * DETERMINATION OF THE MATRIX FI: ; * I! IS A (282) MATRIX ASSOCIATED WITH WEIGHTS Wk SHOWN ; * IN EQUATION 2.21, AND WHOSE ELEMENTS ARE DETERMINED THROUGH ; * EQUATION 2.26 THROUGH 2.28. ; O ALPHAH IS A VECTOR OF THE ESTIMATES OF THE FIXED EFFECTS ; * PARAMETERS OF THE MODEL BASED ON THE ORIGINAL DATA SET. ; 0 THUS, THE FOLLOWING PROGRAM SEGMENT DETERMINES THE MATRICES ; * USED TO COMPUTE THE USUAL MINQUE ESTIMATES THAT ARE BASED ; * ON THE ORIGINAL DATA SET. ; a----------------- -------- ---------------------------------—-—--; F001=o; F002=0; F011=0; F012=0; F111=0; F112=0; ALPHA1={0,0,0,0}; ALPHA2={0,0,0,0}; M=1; N1=0; 146 DO J=1 TO GROUPS; NJ=NV[J.1] ; N1=N1+NJ; XJ=X[M:N1,]; YJ=Y[M:N1,]; CJ=CV[J,1]; 21J=REPEAT(1,NJ,1); CN=CJ*NJ; CJ2=CJ*CJ; NJ:=NJ*NJ; C2=(1-CJ)*(1-CJ); TJ=TRACE(XJ‘*XJ*X); SJ=XJ‘*EIJ; AJ=SJ‘*X*SJ; AC=AJOCJ; CN1=1-CN; CN12=CN1OCN1; CN13=CN1*CN12; AN=AJ*NJ; RJ=EIJ‘*YJ; F001=F001+(NJ*(C2+(CJZ*(NJ-1)) P002=P002+(TJ-AC*(CN12+(2-CN)) FOII=P011+(NJ*CN12); F012=P012+(AJOCN13); F111=F111+(NJ2*CN12); F112=F112+(AN*CN13); ALPHA1=ALPHA1+(X*XJ‘*YJ); ALPHA2=ALPHA2+(CJ*RJ*X*SJ); ALPHAH=ALPHA1-ALPHA2; M=M+NU; END; I2=I*U; I3=I2*U; F001=I2*F001; rooz=watrooz; F011=I2*P011; F012=l3*F012; F111=I2*P111; F112=I3*F112; Foo=F001-F002; F01=F011-F012; F11=F111-F112; ALPHAH:W*ALPHAH; ALPHAHT=ALPHAH‘; H: H 147 ‘0 ‘0 Q0 Q0 Q0 Q0 ‘0 Q0 Q0 ‘0 Q0 Q0 Q0 Q0 Q0 Q0 DETERMINATION OF THE MATRIX UW: Q! 
IS A 2 DIMENSIONAL VECTOR OF QUADRATIC FORMS WHOSE ELEMENTS ARE DENOTED BY u0 AND u1 (SEE EQUATION 2.22, 2.31 AND 2.32). QETF IS THE DETERMINANT OF THE MATRIX FW USED TO OBTAIN THE INVERSE OF THE (282) MATRIX FW. SIGMAH IS THE INTRA-CLASS VARIANCE COMPONENT BASED ON THE ORIGINAL HIERARCHICAL DATA SET. TAUH IS THE INTER-CLASS VARIANCE COMPONENT ESTIMATE BASED ON THE ORIGINAL DATA SET. LAMDA IS THE INTRA-CLASS CORRELATION BASED ON THE ORIGINAL SAMPLE AND COMPUTED BY THE FORMULA, LAMDA = TAUH/(TAUH+SIGMAH) 001:0; 011:0; M=1; N1=o; DO 3:1 TO GROUPS; UJ=NVIJrllf N1=N1+NJ; XJ=X[M:N1,]; YJ=Y[M:N1,]; CJ=CV[J,1]; HIJ=REPEAT(1,NJ,1); CN=CJ¢NJ; N2=NJONJ; =YJ-(XJ*ALPHAH); HJ=ZIJ‘*DJ; GJ=DJ‘*DJ; HJ:=HJ*HJ; CH2=CJ*H32; CN12=(1-CN)*(1-CN): =YJ-(XJfiALPHAH); 001:001+(GJ-(an*(2-CN))): U11=U11+(HJZ*CN12); M=M+NU; S.D...OOQOOOOOOO * THIS MARKS THE END OF THE COMPUTATION OF THE USUAL MINQUE * ESTIMATES BASED ON THE ORIGINAL SAMPLE. THE USUAL MINQUE * ARE PRINTED AT THE FIRST LINE. ESTIMATES PRINTED AT THE * PROCEEDING LINES ARE THE BOOTSTRAP REPLICATED ESTIMATES * BASED ON THE RESAMPLED DATA FROM THE ORIGINAL SAMPLE . _______ __ __ __ _- Q. Q. Q. Q. ‘0 ‘0 ‘0 U0=W2*U01; u1=wztn11; DETF=(FOO*F11)-(FOI*F01); SIGMAH=((F11*UO)-(F01*Ul))/DETF; TAUH=((FOO*U1)-(F01*UO))IDETF; 148 *=__:=::___=__ _ _ == ______ __ ::_ _ __ * A PROCEDURE TO BOOTSTRAP THE PARAMETER ESTIMATES * BY COMPUTING THE ESTIMATE B TIMES THROUGH RESAMPLING. * 3 IS THE INDEX COUNTER WHICH COUNTS THE BOOTSTRAP REPLICATED * SAMPLES. SEED1 IS THE RANDOM GENERATOR FOR THE BOOTSTRAP. Q. Q. Q0 Q0 Q0 ‘0 SEED1 = 10199; DO B=1 TO 200; * A PROCEDURE USED TO RESAMPLE DATA FROM THE ORIGINAL DATA SET * BY FIRST CREATING AN INDEX FOR EACH OBSERVATION. THIS * PROCESS IS REPEATED B TIMES FOR SOME LARGE B REPRESENTING * THE NUMBER OF BOOTSTRAP REPLICATIONS. O Q. Q. Q. Q. Q. 
‘0 NT=1; CONSTANT=REPEAT(NT,NV[1,1,1); INDEX=NV[1,1*(UNIFORM(REPEAT(SEED1,NV[1,1,1)))+CONSTANT; DO S=2 TO GROUPS; NT=NT+NV[S-1,]; CONSTANT=REPEAT(NT,NV[S,1,1); INDEX1=NV[S,1*(UNIFORM(REPEAT(SEED1,NV[S,1,1)))+CONSTANT; INDEX:INDEX//INDEX1; END; INDEX=INT(INDEX); YSTAR=Y[INDEX]; XSSTAR=X5[INDEX]; XSTAR=X4||XSSTAR; YSTART=YSTAR‘; a--—- ------------------------- - ----- ---- ----- -------------------; * DETERMINE X=INV(X'VIIX) BASED ON THE REPLICATED COVARIATE ; t VALUES ; *-------------------- -------------- ----- ------ ------------------; X=o; x1=o; 32:0; M=1; N1=o; DO 3:1 TO GROUPS; NU=NV[J,1]; N1=N1+NJ; XJ=XSTAR[M:N1,]; YJ=YSTAR[M:N1,]; EIJ=REPEAT(1,NJ,1); CJ=CV[J,1]; SJ=XJ‘*ZIJ; X1=X1+(XJ‘*XJ); X2=X2+(CJ*SJ*SJ‘); M=M+NU; END; X:I*(R1-R2); X=INV(R) ; 149 * DETERMINATION OF THE MATRIX FM BASED ON THE REPLICATED R * MATRIX. * ALPHAH; IS THE ESTIMATE OF THE FIXED EFFECTS PARAMETERS BASED * ON THE REPLICATED DATA SET. .....................-............1..............-.............. roo1=o; F002=0; F011=o; ro12=o; F111=o; r112=o; ALPHA1={0,0,0,0}; ALPHA2={o,o,o,O}; M=1; N1=O; DO J=1 TO GROUPS; NJ=NV[J,1]; N1=N1+NJ; XJ=XSTAR[M:N1,]; YJ=YSTAR[M:N1,]; CJ=CV[J,1]; SIJ=REPEAT(1,NJ,1); =CJ‘NJ; CJ2=CJ*CJ; NJ2=NJ*NJ; C2=(1-CJ)*(1-CJ); TJ=TRACE(XJ‘*XJ*R); SJ=XJ‘*EIJ; AJ=SJ‘*X*SJ; AC=AJ*CJ; CN1=1-CN; CN12=CN1*CN1; CN13=CN10CN12; AN=AJ0NJ; RJ=EIJ‘*YJ; F001=F001+(NJ*(C2+(CJZ*(NJ-1)))); F002=F002+(TJ-AC*(CN12+(2-CN))); F011=F011+(NJ*CN12); F012=F012+(AJ*CN13); F111=F111+(N32*CN12); F112=F112+(AN*CN13); ALPHA1=ALPHA1+(R*XJ‘*YJ); ALPHA2=ALPHA2+(CJ*RJ*X*SJ); ALPHAH1=ALPHA1-ALPHA2; M=M+NJ; END; W2=N*l; I3=W2*I; F001=U2*F001; F002=U3*F002; F011=W2*F011; F012=l3*F012; Q. Q. Q. Q. 
‘0 ‘0 150 F111=W2*F111: F112=W3*F112; F00=F001-F002; F01=F011-F012; F11=F111~F1123 ALPHAH1=W¢ALPHAH13 ALPHAH1T=ALPHAH1‘; I * DETERMINATION OF THE VECTOR UM BASED ON THE RESAMPLED Y ; 0 AND THE REPLICATED X MATRIX ; * SIGMAHI IS THE INTRA-CLASS VARIANCE COMPONENT ESTIMATE ; t BASED ON THE REPLICATED SAMPLE ; * TAUHI IS THE INTER-CLASS VARIANCE COMPONENT ESTIMATE BASED ; * ON THE REPLICATED SAMPLE. . ; * LAMDA; IS THE INTRA-CLASS CORRELATION ESTIMATE BASED ON ; * THE REPLICATED SAMPLE. ; a — — - —=—=— — ___ _ _ _ _ _= ——==—; U01=0; U11=0; M=1; N1=0; DO J=1 TO GROUPS; NJ=NV[J,1]; NI=N1+NJ; XJ=XSTAR[M:N1,]; YJ=YSTAR[M:N1,]; CJ=CV[J,1]; EIJ=REPEAT(1,NJ,1); CN=CJ*NJ; N2=NJ*NJ; DJ=YJ-(XJ*ALPHAH1); HJ=ZIJ‘*DJ; GJ=DJ‘*DJ; H32=HJ*HJ; CH2=CJ*HJZ; CN12=(1-CN)*(1-CN); DJ=YJ-(XJ*ALPHAH1); U01=UOI+(GJ-(CH2*(2-CN))); UII=U11+(HJ2*CN12); N=N+NU; END; U0=w20001; 01=w2t011; DETF=(F00*F11)-(FOI*FOI); TAUH1=((FOO*U1)-(FOI*UO))/DETF; SIGMAHI=((F11*U0)-(F01*Ul))/DETF; O--—------------------—--------—-----------------—-------------—--; a THE NEXT PROGRAM SEGMENT PRINTS THE VALUE OF THE BOOTSTRAP ; a AT EACH OF THE B BOOTSTRAP REPLICATION. ; I 151 PRINT T (|PORNAT=4.o|) B (|PORNAT=4.0|) TAUH (IFORMAT=S.3|) TAUHI (IFORMAT=S.3|) SIGMAH (IFORMAT=S.3|) SIGMAHI (IFORMAT=S.3|); PRINT ALPHAHT (IFORMAT=S.3|) ALPHAHIT (IFORMAT=S.3|); SEED1 = SEED1 + 100; END; .__ _. _____ __ __ _ _= ___ * THE END OF THE BOOTSTRAP TRIAL BASED ON THE RESAMPLED DATA. ANOTHER TRIAL WILL BE PERFORMED AFTER CHANGING THE SEED FOR THE RANDOM SAMPLING ALGORITH. SEED = SEED + 100; END; * THIS MARKS THE END OF THE SIMULATION TRIAL. EACH SUCH TRIAL * RESULTS IN ONE SET OF THE USUAL MINQUE ESTIMATES AND B SETS * OF THE BOOTSTRAP REPLICATED ESTIMATES. THE SUMMARY 0 STATISTICS FOR THE BOOTSTRAP REPLICATED ESTIMATES ARE ALSO * COMPUTED. FINISH; RUN; PART 2 COMPUTER PROGRAM TO IMPLEMENT THE BOOTSTRAP ALGORITHM ON A SAMPLE HIERARCHICAL DATA DRAIN FROM A DQQELE EXPONENTIAL POPULATION OF KNOWN PARAMETERS. 
X MATRIX EXCLUDING THE COVARIATES. THE CONSTRUCTION OF THESE MATRICES ARE BASED ON THE NUMBER OF OBSERVATIONS IN CELL TO SATISFY THE REQUIREMENTS AS IN EQUATION 2.9 IN CHAPTER II. THE PROGRAM IDENTIFIES THE COMPONENTS ON EACH AS DEMONSTRATED BY EQUATION 2.11 IN CHAPTER II. THE WEIGHT 1; WAS DETERMINED SEPARATELY USING THE HANUSHEK (1974) METHOD. PROC IML; START; =1/(1-I1); GROUPS=50; NV11=REPEAT(20,2,1); NV21=REPEAT(25,5,1); NV31=REPEAT(30,10,1); NV1=NV11//NV21//NV31: NV12=REPEAT(35,5,1); NV22=REPEAT(40,3,1); NV32=REPEAT(20,3,1); NV42=REPEAT(25,5,1); NV2=NV12//NV22//NV32//NV42; NV13=REPEAT(30,10,1); .D....I‘OOOO. TEE PROGRAM FIRST SETS UP THE E ANDVTHE FIRST PART OF TEE Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. 152 NV23=REPEAT(3S,5,1); NV33=REPEAT(40,2,1); NV3=NV13//NV23//NV33; NV=NV1//NV2//NV3; cV=I1/(1+(NV-1)*W1); XI1=REPEAT(1,465,1); X12=REPEAT(1,480,1); x13=REPEAT(1.555,1); XOI=REPEAT(0,465,1); xoz=REPEAT(o,ASo,1); X03=REPEAT(0,555,1); :1=x11//xoz//xoa; xz=x01//312//xoa; Ia=x01//xoz//113; X4=x1||x2||xa; PROGRAM SEGMENT TO GENERATE DATA FROM A DOUBLE EXPONENTIAL POPULATION OF SPECIFIED PARAMETER VALUES. THE PROGRAM FIRST DETERMINES THE FIXED EFFECTS PARAMETERS TOGETHER WITH THE RANDOM EFFECTS PARAMETERS AND THE COVARIATE WHICH ARE USED IN TURN TO GENERATE THE OBSERVATIONS 1 THROUGH THE EQUATION GIVEN BY: Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Y = (X*ALPHA) + B + E fifigfll AND SEED2: ARE ANY NUMBERS USED TO CREATE A RANDOM NUMBER OF OBSERVATIONS FROM A DOUBLE EXPONENTIAL POPULATION. SIES'O1DS'OIDI'OIDS'. 
SEED1 = 100999; SEED2 = 12399;
*----------------------------------------------------------------;
* T IS THE INDEX COUNTER FOR THE NUMBER OF SIMULATION TRIALS.    ;
*----------------------------------------------------------------;
DO T = 1 TO 1;
UR1=UNIFORM(REPEAT(SEED1,GROUPS,1));
UR2=UNIFORM(REPEAT(SEED2,GROUPS,1));
LR = -1 * LOG(UR1);
TR1 = REPEAT(1,GROUPS,1);
TR2 = TR1 # (UR2 >= 0.5);
TR3 = -1*(TR1 # (UR2 < 0.5));
TR4 = TR2 + TR3;
REFFECTS = 2.2935 * ((LR#TR4)/SQRT(2));
B1 = REFFECTS[1,1]; N1 = NV[1,1];
BJ1 = REPEAT(B1,N1,1);
DO I = 2 TO GROUPS;
  BJ = REFFECTS[I,1]; N = NV[I,1];
  BJ1=BJ1//REPEAT(BJ,N,1);
END;
B=BJ1;
E = 10 * NORMAL(REPEAT(SEED,1500,1));
X41 = REPEAT(25,1500,1);
X5 = INT(75 * UNIFORM(REPEAT(SEED,1500,1))) + X41;
X=X4||X5;
ALPHA = {-5,2,3,1.0};
Y = (X*ALPHA) + B + E;
*----------------------------------------------------------------;
* AT THIS POINT A SPECIFIC DATA SET HAS BEEN GENERATED WITH      ;
* THE FIXED EFFECTS ALPHA, AND B AND E AS THE RANDOM PARTS OF    ;
* THE MODEL. WHILE THE FIXED EFFECTS REMAINED AT THESE VALUES,   ;
* THE RANDOM EFFECTS PARAMETERS TOOK THE VALUES SHOWN BELOW:     ;
*                                                                ;
*               INTRA-CLASS                                      ;
*   DATA SET    CORRELATION    TAU SQUARE    SIGMA SQUARE        ;
*      1           0.01           1.00           100             ;
*      2           0.05           5.26           100             ;
*      3           0.20          25.00           100             ;
*                                                                ;
* K IS A MATRIX WHICH IS PART OF THE PROJECTION MATRIX PW GIVEN  ;
* IN EQUATION 2.20 IN CHAPTER II. THE MATRIX K IS GIVEN BY:      ;
*                  K=INV(X`*VWI*X)                               ;
*----------------------------------------------------------------;
* THE FOLLOWING PROGRAM SEGMENT COMPUTES THE ELEMENTS OF THE     ;
* MATRIX K.                                                      ;
*----------------------------------------------------------------;
K=0; K1=0; K2=0; M=1; N1=0;
DO J=1 TO GROUPS;
  NJ=NV[J,1]; N1=N1+NJ;
  XJ=X[M:N1,]; YJ=Y[M:N1,];
  ZIJ=REPEAT(1,NJ,1); CJ=CV[J,1];
  SJ=XJ`*ZIJ;
  K1=K1+(XJ`*XJ); K2=K2+(CJ*SJ*SJ`);
  M=M+NJ;
END;
K=W*(K1-K2); K=INV(K);
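The segment above draws the group random effects from a double-exponential (Laplace) population by inverse transform: LR = -LOG(UR1) gives an Exp(1) variate, UR2 supplies a random sign, and the result is scaled by 2.2935/SQRT(2). Since the dissertation's programs are SAS PROC IML, the following is only a Python restatement of that construction; the function and variable names are mine, and the constant 2.2935 is taken from the listing (2.2935 squared is about 5.26, the tau-square of data set 2 in the table):

```python
import math
import random

def double_exp_effects(n_groups, scale=2.2935, seed=100999):
    """Draw n_groups double-exponential (Laplace) variates with
    variance scale**2, mirroring REFFECTS = scale*((LR#TR4)/SQRT(2))."""
    rng = random.Random(seed)
    effects = []
    for _ in range(n_groups):
        u1, u2 = rng.random(), rng.random()
        lr = -math.log(u1)               # Exp(1) variate, as LR
        sign = 1 if u2 >= 0.5 else -1    # random sign, as TR2 + TR3
        effects.append(scale * (lr * sign) / math.sqrt(2))
    return effects

# sign*Exp(1) has variance 2, so dividing by sqrt(2) standardizes it;
# with sigma-square = 100 this tau-square reproduces the table's
# intra-class correlation for data set 2.
tau2 = 2.2935 ** 2                 # about 5.26
lamda = tau2 / (tau2 + 100.0)      # about 0.05
```

The same division by sqrt(2) explains why a single scale constant controls tau-square: the bracketed variate is standardized to unit variance before scaling.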
*----------------------------------------------------------------;
* DETERMINATION OF THE MATRIX FW:                                ;
* FW IS A (2X2) MATRIX ASSOCIATED WITH THE WEIGHTS WK SHOWN      ;
* IN EQUATION 2.21, AND WHOSE ELEMENTS ARE DETERMINED THROUGH    ;
* EQUATIONS 2.26 THROUGH 2.28.                                   ;
* ALPHAH IS A VECTOR OF THE ESTIMATES OF THE FIXED EFFECTS       ;
* PARAMETERS OF THE MODEL BASED ON THE ORIGINAL DATA SET.        ;
* THUS, THE FOLLOWING PROGRAM SEGMENT DETERMINES THE MATRICES    ;
* USED TO COMPUTE THE USUAL MINQUE ESTIMATES THAT ARE BASED      ;
* ON THE ORIGINAL DATA SET.                                      ;
*----------------------------------------------------------------;
F001=0; F002=0; F011=0; F012=0; F111=0; F112=0;
ALPHA1={0,0,0,0}; ALPHA2={0,0,0,0};
M=1; N1=0;
DO J=1 TO GROUPS;
  NJ=NV[J,1]; N1=N1+NJ;
  XJ=X[M:N1,]; YJ=Y[M:N1,]; CJ=CV[J,1];
  ZIJ=REPEAT(1,NJ,1);
  CN=CJ*NJ; CJ2=CJ*CJ; NJ2=NJ*NJ;
  C2=(1-CJ)*(1-CJ);
  TJ=TRACE(XJ`*XJ*K);
  SJ=XJ`*ZIJ; AJ=SJ`*K*SJ; AC=AJ*CJ;
  CN1=1-CN; CN12=CN1*CN1; CN13=CN1*CN12;
  AN=AJ*NJ; RJ=ZIJ`*YJ;
  F001=F001+(NJ*(C2+(CJ2*(NJ-1))));
  F002=F002+(TJ-AC*(CN12+(2-CN)));
  F011=F011+(NJ*CN12);
  F012=F012+(AJ*CN13);
  F111=F111+(NJ2*CN12);
  F112=F112+(AN*CN13);
  ALPHA1=ALPHA1+(K*XJ`*YJ);
  ALPHA2=ALPHA2+(CJ*RJ*K*SJ);
  ALPHAH=ALPHA1-ALPHA2;
  M=M+NJ;
END;
W2=W*W; W3=W2*W;
F001=W2*F001; F002=W3*F002; F011=W2*F011;
F012=W3*F012; F111=W2*F111; F112=W3*F112;
F00=F001-F002; F01=F011-F012; F11=F111-F112;
ALPHAH=W*ALPHAH; ALPHAHT=ALPHAH`;
*----------------------------------------------------------------;
* DETERMINATION OF THE VECTOR UW:                                ;
* UW IS A 2-DIMENSIONAL VECTOR OF QUADRATIC FORMS WHOSE          ;
* ELEMENTS ARE DENOTED BY U0 AND U1 (SEE EQUATIONS 2.22, 2.31    ;
* AND 2.32).                                                     ;
* DETF IS THE DETERMINANT OF THE MATRIX FW USED TO OBTAIN THE    ;
* INVERSE OF THE (2X2) MATRIX FW.                                ;
* SIGMAH IS THE INTRA-CLASS VARIANCE COMPONENT ESTIMATE BASED    ;
* ON THE ORIGINAL HIERARCHICAL DATA SET.                         ;
* TAUH IS THE INTER-CLASS VARIANCE COMPONENT ESTIMATE BASED ON   ;
* THE ORIGINAL DATA SET.                                         ;
* LAMDA IS THE INTRA-CLASS CORRELATION BASED ON THE ORIGINAL     ;
* SAMPLE AND COMPUTED BY THE FORMULA,                            ;
*                  LAMDA = TAUH/(TAUH+SIGMAH)                    ;
*----------------------------------------------------------------;
U01=0; U11=0; M=1; N1=0;
DO J=1 TO GROUPS;
  NJ=NV[J,1]; N1=N1+NJ;
  XJ=X[M:N1,]; YJ=Y[M:N1,]; CJ=CV[J,1];
  ZIJ=REPEAT(1,NJ,1);
  CN=CJ*NJ; NJ2=NJ*NJ;
  DJ=YJ-(XJ*ALPHAH);
  HJ=ZIJ`*DJ; GJ=DJ`*DJ;
  HJ2=HJ*HJ; CH2=CJ*HJ2;
  CN12=(1-CN)*(1-CN);
  U01=U01+(GJ-(CH2*(2-CN)));
  U11=U11+(HJ2*CN12);
  M=M+NJ;
END;
*----------------------------------------------------------------;
* THIS MARKS THE END OF THE COMPUTATION OF THE USUAL MINQUE      ;
* ESTIMATES BASED ON THE ORIGINAL SAMPLE. THE USUAL MINQUE       ;
* ESTIMATES ARE PRINTED AT THE FIRST LINE. ESTIMATES PRINTED     ;
* AT THE SUCCEEDING LINES ARE THE BOOTSTRAP REPLICATED           ;
* ESTIMATES BASED ON THE DATA RESAMPLED FROM THE ORIGINAL        ;
* SAMPLE.                                                        ;
*----------------------------------------------------------------;
U0=W2*U01; U1=W2*U11;
DETF=(F00*F11)-(F01*F01);
SIGMAH=((F11*U0)-(F01*U1))/DETF;
TAUH=((F00*U1)-(F01*U0))/DETF;
*----------------------------------------------------------------;
* A PROCEDURE TO BOOTSTRAP THE PARAMETER ESTIMATES               ;
* BY COMPUTING THE ESTIMATE B TIMES THROUGH RESAMPLING.          ;
* B IS THE INDEX COUNTER WHICH COUNTS THE BOOTSTRAP REPLICATED   ;
* SAMPLES. SEED1 IS THE RANDOM GENERATOR FOR THE BOOTSTRAP.      ;
*----------------------------------------------------------------;
SEED1 = 10199;
DO B=1 TO 200;
*----------------------------------------------------------------;
* A PROCEDURE USED TO RESAMPLE DATA FROM THE ORIGINAL DATA SET   ;
* BY FIRST CREATING AN INDEX FOR EACH OBSERVATION. THIS          ;
* PROCESS IS REPEATED B TIMES FOR SOME LARGE B REPRESENTING      ;
* THE NUMBER OF BOOTSTRAP REPLICATIONS.                          ;
*----------------------------------------------------------------;
NT=1;
CONSTANT=REPEAT(NT,NV[1,],1);
INDEX=NV[1,]*(UNIFORM(REPEAT(SEED1,NV[1,],1)))+CONSTANT;
DO S=2 TO GROUPS;
  NT=NT+NV[S-1,];
  CONSTANT=REPEAT(NT,NV[S,],1);
  INDEX1=NV[S,]*(UNIFORM(REPEAT(SEED1,NV[S,],1)))+CONSTANT;
  INDEX=INDEX//INDEX1;
END;
INDEX=INT(INDEX);
YSTAR=Y[INDEX];
X5STAR=X5[INDEX];
XSTAR=X4||X5STAR;
YSTART=YSTAR`;
*----------------------------------------------------------------;
* DETERMINE K=INV(X`VWIX) BASED ON THE REPLICATED COVARIATE      ;
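The resampling segment above draws bootstrap observations group by group: for group s the uniform draws are scaled by the group size NV[s] and shifted by the running start index NT, so every replicated observation is drawn only from its own group and the nesting structure is preserved. A Python sketch of that index construction (the original is SAS PROC IML; names here are mine):

```python
import random

def group_bootstrap_indices(group_sizes, seed=10199):
    """Resample 0-based observation indices within each group, as in
    the INDEX = NV[s]*UNIFORM(...) + CONSTANT segment of the listing."""
    rng = random.Random(seed)
    index = []
    start = 0                        # running group offset, as NT
    for size in group_sizes:
        for _ in range(size):
            index.append(start + int(size * rng.random()))
        start += size
    return index
```

Applying such an index to Y and to the covariate column X5 yields YSTAR and XSTAR while the design columns in X4 stay fixed, which is how the listing keeps the hierarchical layout of the original sample intact across replications.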
* VALUES.                                                        ;
*----------------------------------------------------------------;
K=0; K1=0; K2=0; M=1; N1=0;
DO J=1 TO GROUPS;
  NJ=NV[J,1]; N1=N1+NJ;
  XJ=XSTAR[M:N1,]; YJ=YSTAR[M:N1,];
  ZIJ=REPEAT(1,NJ,1); CJ=CV[J,1];
  SJ=XJ`*ZIJ;
  K1=K1+(XJ`*XJ); K2=K2+(CJ*SJ*SJ`);
  M=M+NJ;
END;
K=W*(K1-K2); K=INV(K);
*----------------------------------------------------------------;
* DETERMINATION OF THE MATRIX FW BASED ON THE REPLICATED X       ;
* MATRIX.                                                        ;
* ALPHAH1 IS THE ESTIMATE OF THE FIXED EFFECTS PARAMETERS BASED  ;
* ON THE REPLICATED DATA SET.                                    ;
*----------------------------------------------------------------;
F001=0; F002=0; F011=0; F012=0; F111=0; F112=0;
ALPHA1={0,0,0,0}; ALPHA2={0,0,0,0};
M=1; N1=0;
DO J=1 TO GROUPS;
  NJ=NV[J,1]; N1=N1+NJ;
  XJ=XSTAR[M:N1,]; YJ=YSTAR[M:N1,]; CJ=CV[J,1];
  ZIJ=REPEAT(1,NJ,1);
  CN=CJ*NJ; CJ2=CJ*CJ; NJ2=NJ*NJ;
  C2=(1-CJ)*(1-CJ);
  TJ=TRACE(XJ`*XJ*K);
  SJ=XJ`*ZIJ; AJ=SJ`*K*SJ; AC=AJ*CJ;
  CN1=1-CN; CN12=CN1*CN1; CN13=CN1*CN12;
  AN=AJ*NJ; RJ=ZIJ`*YJ;
  F001=F001+(NJ*(C2+(CJ2*(NJ-1))));
  F002=F002+(TJ-AC*(CN12+(2-CN)));
  F011=F011+(NJ*CN12);
  F012=F012+(AJ*CN13);
  F111=F111+(NJ2*CN12);
  F112=F112+(AN*CN13);
  ALPHA1=ALPHA1+(K*XJ`*YJ);
  ALPHA2=ALPHA2+(CJ*RJ*K*SJ);
  ALPHAH1=ALPHA1-ALPHA2;
  M=M+NJ;
END;
W2=W*W; W3=W2*W;
F001=W2*F001; F002=W3*F002; F011=W2*F011;
F012=W3*F012; F111=W2*F111; F112=W3*F112;
F00=F001-F002; F01=F011-F012; F11=F111-F112;
ALPHAH1=W*ALPHAH1; ALPHAH1T=ALPHAH1`;
*----------------------------------------------------------------;
* DETERMINATION OF THE VECTOR UW BASED ON THE RESAMPLED Y        ;
* AND THE REPLICATED X MATRIX.                                   ;
* SIGMAH1 IS THE INTRA-CLASS VARIANCE COMPONENT ESTIMATE         ;
* BASED ON THE REPLICATED SAMPLE.                                ;
* TAUH1 IS THE INTER-CLASS VARIANCE COMPONENT ESTIMATE BASED     ;
* ON THE REPLICATED SAMPLE.                                      ;
* LAMDA1 IS THE INTRA-CLASS CORRELATION ESTIMATE BASED ON        ;
* THE REPLICATED SAMPLE.                                         ;
*----------------------------------------------------------------;
U01=0; U11=0; M=1; N1=0;
DO J=1 TO GROUPS;
  NJ=NV[J,1]; N1=N1+NJ;
  XJ=XSTAR[M:N1,]; YJ=YSTAR[M:N1,]; CJ=CV[J,1];
  ZIJ=REPEAT(1,NJ,1);
  CN=CJ*NJ; NJ2=NJ*NJ;
  DJ=YJ-(XJ*ALPHAH1);
  HJ=ZIJ`*DJ; GJ=DJ`*DJ;
  HJ2=HJ*HJ; CH2=CJ*HJ2;
  CN12=(1-CN)*(1-CN);
  U01=U01+(GJ-(CH2*(2-CN)));
  U11=U11+(HJ2*CN12);
  M=M+NJ;
END;
U0=W2*U01; U1=W2*U11;
DETF=(F00*F11)-(F01*F01);
TAUH1=((F00*U1)-(F01*U0))/DETF;
SIGMAH1=((F11*U0)-(F01*U1))/DETF;
*----------------------------------------------------------------;
* THIS PROGRAM SEGMENT PRINTS THE VALUE OF THE BOOTSTRAP         ;
* AT EACH OF THE B BOOTSTRAP REPLICATIONS.                       ;
*----------------------------------------------------------------;
PRINT T (|FORMAT=4.0|) B (|FORMAT=4.0|) TAUH (|FORMAT=5.3|)
      TAUH1 (|FORMAT=5.3|) SIGMAH (|FORMAT=5.3|) SIGMAH1 (|FORMAT=5.3|);
PRINT ALPHAHT (|FORMAT=5.3|) ALPHAH1T (|FORMAT=5.3|);
SEED1 = SEED1 + 100; SEED2 = SEED2 + 100;
END;
*----------------------------------------------------------------;
* THE END OF THE BOOTSTRAP TRIAL BASED ON THE RESAMPLED DATA.    ;
* ANOTHER TRIAL WILL BE PERFORMED AFTER CHANGING THE SEED FOR    ;
* THE RANDOM SAMPLING ALGORITHM.                                 ;
*----------------------------------------------------------------;
END;
*----------------------------------------------------------------;
* THIS MARKS THE END OF THE SIMULATION TRIAL. EACH SUCH TRIAL    ;
* RESULTS IN ONE SET OF THE USUAL MINQUE ESTIMATES AND B SETS    ;
* OF THE BOOTSTRAP REPLICATED ESTIMATES. THE SUMMARY             ;
* STATISTICS FOR THE BOOTSTRAP REPLICATED ESTIMATES ARE ALSO     ;
* COMPUTED.                                                      ;
*----------------------------------------------------------------;
FINISH;
RUN;

PART 3

COMPUTER PROGRAM TO SIMULATE THE SAMPLING DISTRIBUTION OF THE MINQUE
ESTIMATE FOR A SAMPLE DRAWN FROM A NORMAL POPULATION OF KNOWN
PARAMETERS.

*----------------------------------------------------------------;
* THE PROGRAM FIRST SETS UP THE WEIGHTS AND THE FIRST PART OF    ;
* THE X MATRIX EXCLUDING THE COVARIATES.                         ;
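Each bootstrap replication above ends by solving a 2x2 linear system: reading the DETF, TAUH1, and SIGMAH1 statements together, they apply Cramer's rule to the symmetric matrix with entries F00, F01, F11 and the vector (U0, U1). A Python restatement of just that solve, with illustrative numbers only (the helper name is mine, not from the listing):

```python
def minque_2x2(f00, f01, f11, u0, u1):
    """Solve [[f00, f01], [f01, f11]] @ (sigma2, tau2)' = (u0, u1)'
    by Cramer's rule, as the DETF/TAUH1/SIGMAH1 statements do."""
    detf = f00 * f11 - f01 * f01
    sigma2 = (f11 * u0 - f01 * u1) / detf   # SIGMAH1 in the listing
    tau2 = (f00 * u1 - f01 * u0) / detf     # TAUH1 in the listing
    return tau2, sigma2
```

Given these two components, the replicated intra-class correlation follows as LAMDA1 = tau2/(tau2 + sigma2), matching the formula stated earlier for the original sample.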
* THE CONSTRUCTION OF THESE MATRICES IS BASED ON THE NUMBER OF   ;
* OBSERVATIONS IN EACH CELL TO SATISFY THE REQUIREMENTS AS IN    ;
* EQUATION 2.9 IN CHAPTER II. THE PROGRAM IDENTIFIES THE         ;
* COMPONENTS OF EACH AS DEMONSTRATED BY EQUATION 2.11 IN         ;
* CHAPTER II. THE WEIGHT W1 WAS DETERMINED SEPARATELY USING THE  ;
* HANUSHEK (1974) METHOD.                                        ;
*----------------------------------------------------------------;
PROC IML;
START;
W=1/(1-W1); GROUPS=50;
NV11=REPEAT(20,2,1); NV21=REPEAT(25,5,1); NV31=REPEAT(30,10,1);
NV1=NV11//NV21//NV31;
NV12=REPEAT(35,5,1); NV22=REPEAT(40,3,1); NV32=REPEAT(20,3,1);
NV42=REPEAT(25,5,1);
NV2=NV12//NV22//NV32//NV42;
NV13=REPEAT(30,10,1); NV23=REPEAT(35,5,1); NV33=REPEAT(40,2,1);
NV3=NV13//NV23//NV33;
NV=NV1//NV2//NV3;
CV=W1/(1+(NV-1)*W1);
X11=REPEAT(1,465,1); X12=REPEAT(1,480,1); X13=REPEAT(1,555,1);
X01=REPEAT(0,465,1); X02=REPEAT(0,480,1); X03=REPEAT(0,555,1);
X1=X11//X02//X03; X2=X01//X12//X03; X3=X01//X02//X13;
*----------------------------------------------------------------;
* PROGRAM SEGMENT TO GENERATE DATA FROM A NORMAL POPULATION OF   ;
* SPECIFIED PARAMETER VALUES. THE PROGRAM FIRST DETERMINES THE   ;
* FIXED EFFECTS PARAMETERS AND THE COVARIATE WHICH ARE USED IN   ;
* TURN TO GENERATE THE OBSERVATIONS Y THROUGH THE EQUATION       ;
* GIVEN BY:                                                      ;
*                  Y = (X*ALPHA) + B + E                         ;
* SEED IS ANY NUMBER USED TO CREATE A RANDOM SAMPLE OF           ;
* OBSERVATIONS FROM SOME POPULATION.                             ;
* S IS THE INDEX COUNTER FOR THE NUMBER OF SIMULATION TRIALS.    ;
*----------------------------------------------------------------;
SEED = 10199;
DO S = 1 TO 1000;
REFFECTS = 2.2935 * NORMAL(REPEAT(SEED,GROUPS,1));
B1 = REFFECTS[1,1]; N1 = NV[1,1];
BJ1 = REPEAT(B1,N1,1);
DO I = 2 TO GROUPS;
  BJ = REFFECTS[I,1]; N = NV[I,1];
  BJ1=BJ1//REPEAT(BJ,N,1);
END;
B=BJ1;
E = 10 * NORMAL(REPEAT(SEED,1500,1));
X41 = REPEAT(25,1500,1);
X4 = INT(75 * UNIFORM(REPEAT(SEED,1500,1))) + X41;
X=X1||X2||X3||X4;
ALPHA = {-5,2,3,1.0};
Y = (X*ALPHA) + B + E;
*----------------------------------------------------------------;
* DETERMINE K=INV(X`VWIX)                                        ;
*----------------------------------------------------------------;
K=0; K1=0; K2=0; M=1; N1=NV[1,1];
DO J=1 TO GROUPS;
  XJ=X[M:N1,]; YJ=Y[M:N1,];
  NJ=NROW(YJ); ZIJ=REPEAT(1,NJ,1); CJ=CV[J,1];
  SJ=XJ`*ZIJ;
  K1=K1+(XJ`*XJ); K2=K2+(CJ*SJ*SJ`);
  M=M+NV[J,1]; N1=N1+NV[J,1];
END;
K=W*(K1-K2); K=INV(K);
*----------------------------------------------------------------;
* DETERMINATION OF THE MATRIX FW                                 ;
*----------------------------------------------------------------;
F001=0; F002=0; F011=0; F012=0; F111=0; F112=0;
ALPHA1={0,0,0,0}; ALPHA2={0,0,0,0};
M=1; N1=NV[1,1];
DO J=1 TO GROUPS;
  XJ=X[M:N1,]; YJ=Y[M:N1,]; CJ=CV[J,1];
  NJ=NROW(YJ); ZIJ=REPEAT(1,NJ,1);
  CN=CJ*NJ; CJ2=CJ*CJ; NJ2=NJ*NJ;
  C2=(1-CJ)*(1-CJ);
  TJ=TRACE(XJ`*XJ*K);
  SJ=XJ`*ZIJ; AJ=SJ`*K*SJ; AC=AJ*CJ;
  CN1=1-CN; CN12=CN1*CN1; CN13=CN1*CN12;
  AN=AJ*NJ; RJ=ZIJ`*YJ;
  F001=F001+(NJ*(C2+(CJ2*(NJ-1))));
  F002=F002+(TJ-AC*(CN12+(2-CN)));
  F011=F011+(NJ*CN12);
  F012=F012+(AJ*CN13);
  F111=F111+(NJ2*CN12);
  F112=F112+(AN*CN13);
  ALPHA1=ALPHA1+(K*XJ`*YJ);
  ALPHA2=ALPHA2+(CJ*RJ*K*SJ);
  ALPHAH=ALPHA1-ALPHA2;
  M=M+NV[J,1]; N1=N1+NV[J,1];
END;
W2=W*W; W3=W2*W;
F001=W2*F001; F002=W3*F002; F011=W2*F011;
F012=W3*F012; F111=W2*F111; F112=W3*F112;
F00=F001-F002; F01=F011-F012; F11=F111-F112;
ALPHAH=W*ALPHAH; ALPHAHT=ALPHAH`;
*----------------------------------------------------------------;
* DETERMINATION OF THE VECTOR UW                                 ;
*----------------------------------------------------------------;
U01=0; U11=0; M=1; N1=NV[1,1];
DO J=1 TO GROUPS;
  XJ=X[M:N1,]; YJ=Y[M:N1,]; CJ=CV[J,1];
  NJ=NROW(YJ); ZIJ=REPEAT(1,NJ,1);
  CN=CJ*NJ; NJ2=NJ*NJ;
  DJ=YJ-(XJ*ALPHAH);
  HJ=ZIJ`*DJ; GJ=DJ`*DJ;
  HJ2=HJ*HJ; CH2=CJ*HJ2;
  CN12=(1-CN)*(1-CN);
  U01=U01+(GJ-(CH2*(2-CN)));
  U11=U11+(HJ2*CN12);
  M=M+NV[J,1]; N1=N1+NV[J,1];
END;
U0=W2*U01; U1=W2*U11;
DETF=(F00*F11)-(F01*F01);
SIGMAH=((F11*U0)-(F01*U1))/DETF;
TAUH=((F00*U1)-(F01*U0))/DETF;
LAMDA=TAUH/(TAUH+SIGMAH);
PRINT S TAUH SIGMAH LAMDA ALPHAHT;
SEED = SEED + 10;
END;
PRINT SEED;
FINISH;
RUN;

PART 4

COMPUTER PROGRAM TO SIMULATE THE SAMPLING DISTRIBUTION OF THE MINQUE
ESTIMATE FOR A SAMPLE DRAWN FROM A DOUBLE EXPONENTIAL POPULATION OF
KNOWN PARAMETERS.

*----------------------------------------------------------------;
* THE PROGRAM FIRST SETS UP THE WEIGHTS AND THE FIRST PART OF    ;
* THE X MATRIX EXCLUDING THE COVARIATES. THE CONSTRUCTION OF     ;
* THESE MATRICES IS BASED ON THE NUMBER OF OBSERVATIONS IN EACH  ;
* CELL TO SATISFY THE REQUIREMENTS AS IN EQUATION 2.9 IN         ;
* CHAPTER II. THE PROGRAM IDENTIFIES THE COMPONENTS OF EACH AS   ;
* DEMONSTRATED BY EQUATION 2.11 IN CHAPTER II. THE WEIGHT W1     ;
* WAS DETERMINED SEPARATELY USING THE HANUSHEK (1974) METHOD.    ;
*----------------------------------------------------------------;
PROC IML;
START;
W=1/(1-W1); GROUPS=50;
NV11=REPEAT(20,2,1); NV21=REPEAT(25,5,1); NV31=REPEAT(30,10,1);
NV1=NV11//NV21//NV31;
NV12=REPEAT(35,5,1); NV22=REPEAT(40,3,1); NV32=REPEAT(20,3,1);
NV42=REPEAT(25,5,1);
NV2=NV12//NV22//NV32//NV42;
NV13=REPEAT(30,10,1); NV23=REPEAT(35,5,1); NV33=REPEAT(40,2,1);
NV3=NV13//NV23//NV33;
NV=NV1//NV2//NV3;
CV=W1/(1+(NV-1)*W1);
X11=REPEAT(1,465,1); X12=REPEAT(1,480,1); X13=REPEAT(1,555,1);
X01=REPEAT(0,465,1); X02=REPEAT(0,480,1); X03=REPEAT(0,555,1);
X1=X11//X02//X03; X2=X01//X12//X03; X3=X01//X02//X13;
*----------------------------------------------------------------;
* PROGRAM SEGMENT TO GENERATE DATA FROM A DOUBLE EXPONENTIAL     ;
* POPULATION OF SPECIFIED PARAMETER VALUES.                      ;
*----------------------------------------------------------------;
SEED1 = 10199; SEED2 = 11099;
DO S = 1 TO 1000;
UR1=UNIFORM(REPEAT(SEED1,GROUPS,1));
UR2=UNIFORM(REPEAT(SEED2,GROUPS,1));
LR = -1 * LOG(UR1);
TR1 = REPEAT(1,GROUPS,1);
TR2 = TR1 # (UR2 >= 0.5);
TR3 = -1*(TR1 # (UR2 < 0.5));
TR4 = TR2 + TR3;
REFFECTS = 2.2935 * ((LR#TR4)/SQRT(2));
B1 = REFFECTS[1,1]; N1 = NV[1,1];
BJ1 = REPEAT(B1,N1,1);
DO I = 2 TO GROUPS;
  BJ = REFFECTS[I,1]; N = NV[I,1];
  BJ1=BJ1//REPEAT(BJ,N,1);
END;
B=BJ1;
UE1=UNIFORM(REPEAT(SEED1,1500,1));
UE2=UNIFORM(REPEAT(SEED2,1500,1));
LE = -1 * LOG(UE1);
TE1 = REPEAT(1,1500,1);
TE2 = TE1 # (UE2 >= 0.5);
TE3 = -1*(TE1 # (UE2 < 0.5));
TE4 = TE2 + TE3;
E = 10 * ((LE#TE4)/SQRT(2));
X41 = REPEAT(25,1500,1);
X4 = INT(75 * UNIFORM(REPEAT(SEED1,1500,1))) + X41;
X=X1||X2||X3||X4;
ALPHA = {-5,2,3,1.0};
Y = (X*ALPHA) + B + E;
*----------------------------------------------------------------;
* DETERMINE K=INV(X`VWIX)                                        ;
*----------------------------------------------------------------;
K=0; K1=0; K2=0; M=1; N1=NV[1,1];
DO J=1 TO GROUPS;
  XJ=X[M:N1,]; YJ=Y[M:N1,];
  NJ=NROW(YJ); ZIJ=REPEAT(1,NJ,1); CJ=CV[J,1];
  SJ=XJ`*ZIJ;
  K1=K1+(XJ`*XJ); K2=K2+(CJ*SJ*SJ`);
  M=M+NV[J,1]; N1=N1+NV[J,1];
END;
K=W*(K1-K2); K=INV(K);
*----------------------------------------------------------------;
* DETERMINATION OF THE MATRIX FW                                 ;
*----------------------------------------------------------------;
F001=0; F002=0; F011=0; F012=0; F111=0; F112=0;
ALPHA1={0,0,0,0}; ALPHA2={0,0,0,0};
M=1; N1=NV[1,1];
DO J=1 TO GROUPS;
  XJ=X[M:N1,]; YJ=Y[M:N1,]; CJ=CV[J,1];
  NJ=NROW(YJ); ZIJ=REPEAT(1,NJ,1);
  CN=CJ*NJ; CJ2=CJ*CJ; NJ2=NJ*NJ;
  C2=(1-CJ)*(1-CJ);
  TJ=TRACE(XJ`*XJ*K);
  SJ=XJ`*ZIJ; AJ=SJ`*K*SJ; AC=AJ*CJ;
  CN1=1-CN; CN12=CN1*CN1; CN13=CN1*CN12;
  AN=AJ*NJ; RJ=ZIJ`*YJ;
  F001=F001+(NJ*(C2+(CJ2*(NJ-1))));
  F002=F002+(TJ-AC*(CN12+(2-CN)));
  F011=F011+(NJ*CN12);
  F012=F012+(AJ*CN13);
  F111=F111+(NJ2*CN12);
  F112=F112+(AN*CN13);
  ALPHA1=ALPHA1+(K*XJ`*YJ);
  ALPHA2=ALPHA2+(CJ*RJ*K*SJ);
  ALPHAH=ALPHA1-ALPHA2;
  M=M+NV[J,1]; N1=N1+NV[J,1];
END;
W2=W*W; W3=W2*W;
F001=W2*F001; F002=W3*F002; F011=W2*F011;
F012=W3*F012; F111=W2*F111; F112=W3*F112;
F00=F001-F002; F01=F011-F012; F11=F111-F112;
ALPHAH=W*ALPHAH; ALPHAHT=ALPHAH`;
*----------------------------------------------------------------;
* DETERMINATION OF THE VECTOR UW                                 ;
*----------------------------------------------------------------;
U01=0; U11=0; M=1; N1=NV[1,1];
DO J=1 TO GROUPS;
  XJ=X[M:N1,]; YJ=Y[M:N1,]; CJ=CV[J,1];
  NJ=NROW(YJ); ZIJ=REPEAT(1,NJ,1);
  CN=CJ*NJ; NJ2=NJ*NJ;
  DJ=YJ-(XJ*ALPHAH);
  HJ=ZIJ`*DJ; GJ=DJ`*DJ;
  HJ2=HJ*HJ; CH2=CJ*HJ2;
  CN12=(1-CN)*(1-CN);
  U01=U01+(GJ-(CH2*(2-CN)));
  U11=U11+(HJ2*CN12);
  M=M+NV[J,1]; N1=N1+NV[J,1];
END;
SEED1 = SEED1 + 100; SEED2 = SEED2 + 100;
U0=W2*U01; U1=W2*U11;
DETF=(F00*F11)-(F01*F01);
SIGMAH=((F11*U0)-(F01*U1))/DETF;
TAUH=((F00*U1)-(F01*U0))/DETF;
LAMDA=TAUH/(TAUH+SIGMAH);
PRINT S TAUH SIGMAH LAMDA ALPHAHT;
END;
FINISH;
RUN;

BIBLIOGRAPHY

Abramovitch, L., and Singh, K. (1985). Edgeworth corrected pivotal
statistics and the bootstrap. The Annals of Statistics, 13(1), 116-132.

Aitkin, M., and Longford, N. (1986). Statistical modelling issues in
school effectiveness studies. Journal of the Royal Statistical Society
(Series A), 149, 1-43.

Arlin, M. (1984b). Time, equality, and mastery learning. Review of
Educational Research, 54(1), 65-86.

Arlin, M., & Webster, J. (1983). Time costs of mastery learning.
Journal of Educational Psychology, 75, 187-195.

Ashton, P., & Webb, R. (1986). Making a Difference: Teachers' Sense of
Efficacy and Student Achievement. New York: Longman.

Bagaka's, J. G. (1989). Empirical Bayes Bootstrap: Estimation of
parameter distribution through the bootstrap in hierarchical data.
Paper presented at the Annual Meeting of the American Educational
Research Association, San Francisco.

Bandura, A. (1986). Social Foundations of Thought and Action: A Social
Cognitive Theory. Englewood Cliffs, New Jersey: Prentice-Hall.

Bangert-Drowns, R. L. (1986). A review of developments in meta-analytic
method. Psychological Bulletin, 99, 388-399.

Beran, R. (1984). Jackknife approximations to bootstrap estimates. The
Annals of Statistics, 12(1), 101-118.

Bickel, P. J., & Freedman, D. A. (1981).
Some asymptotic theory for the bootstrap. The Annals of Statistics,
9(6), 1196-1217.

Block, J. H. (Ed.) (1971). Mastery Learning: Theory and Practice. New
York: Holt, Rinehart and Winston.

Bloom, B. S. (1982). Human Characteristics and School Learning. New
York: McGraw-Hill Book Company.

Bloom, B. S. (1984a). The 2 Sigma Problem: The search for methods of
instruction as effective as one-to-one tutoring. Educational
Researcher, 13(6), 4-16.

Bloom, B. S. (1984b). The search for methods of group instruction as
effective as one-to-one tutoring. Educational Leadership, 41(8), 4-17.

Box, G. E. P., and Cox, D. R. (1964). An analysis of transformations.
Journal of the Royal Statistical Society, Series B, 26, 211-246.

Brookover, W., Beady, C., Flood, P., Schweitzer, J., & Wisenbaker, J.
(1979). School Social Systems and Student Achievement: Schools Can
Make a Difference. New York: Praeger.

Brookover, W., Beamer, L., Efthim, H., Hathaway, P., Lezotte, L.,
Miller, S., and Passalacqua, J. (1982). Creating Effective Schools.
Learning Publications.

Brown, K. G. (1976). Asymptotic behavior of MINQUE-type estimators of
variance components. Annals of Statistics, 4, 746-754.

Burstein, L., Linn, R. L., & Capell, F. (1978). Analyzing multi-level
data in the presence of heterogeneous within-class regressions.
Journal of Educational Statistics, 3(4), 347-383.

Burstein, L., & Miller, M. D. (1980). Regression-based analyses of
multi-level educational data. New Directions for Methodology of Social
and Behavioral Sciences, 6, 194-211.

Cochran, W. G. (1977). Sampling Techniques. New York: John Wiley & Sons.

Corbeil, R. R., and Searle, S. R. (1976). A comparison of variance
component estimators. Biometrics, 32, 779-791.

Corno, L., & Snow, R. E. (1986). Adapting teaching to individual
differences among learners. In M. Wittrock (Ed.), Handbook of Research
on Teaching, 3rd ed. Chicago, IL: Rand McNally.

Carroll, J. B. (1963). A model of school learning. Teachers College
Record, 64, 723-733.

Carroll, R. (1979). On estimating variances of robust estimates when
the errors are asymmetric. Journal of the American Statistical
Association, 74, 674-679.
Dembo, M. H., & Gibson, S. (1985). Teachers' sense of efficacy: An
important factor in school improvement. The Elementary School Journal,
86(2), 173-184.

Diaconis, P., & Efron, B. (1983). Computer-intensive methods in
statistics. Scientific American, 248, 116-130.

DiCiccio, T., and Tibshirani, R. (1987). Bootstrap confidence
intervals and bootstrap approximations. Journal of the American
Statistical Association, 82(397), 163-170.

Dolker, M., Halperin, S., & Divgi, D. R. (1982). Problems with
bootstrapping Pearson correlations in very small bivariate samples.
Psychometrika, 47(4), 529-530.

Efron, B. (1979). Bootstrap methods: Another look at the jackknife.
The Annals of Statistics, 7, 1-26.

Efron, B. (1981a). Nonparametric standard errors and confidence
intervals. The Canadian Journal of Statistics, 9(2), 139-172.

Efron, B. (1981b). Censored data and the bootstrap. Journal of the
American Statistical Association, 76(374), 312-319.

Efron, B. (1982). The Jackknife, the Bootstrap, and Other Resampling
Plans. Philadelphia: Society for Industrial and Applied Mathematics.

Efron, B. (1982). Transformation theory: How normal is a family of
distributions? The Annals of Statistics, 10(2), 323-339.

Efron, B., & Gong, G. (1983). A leisurely look at the bootstrap, the
jackknife, and cross-validation. The American Statistician, 37(1),
36-48.

Efron, B., and Tibshirani, R. (1986). Bootstrap methods for standard
errors, confidence intervals, and other measures of statistical
accuracy. Statistical Science, 1(1), 54-77.

Freedman, D. A. (1981). Bootstrapping regression models. Annals of
Statistics, 9(6), 1218-1228.

Fuller, B., Wood, K., Rapoport, T., & Dornbusch, S. M. (1982). The
organizational context of individual efficacy. Review of Educational
Research, 52(1), 7-30.

Goldstein, H. (1986). Efficient statistical modelling of longitudinal
data. Annals of Human Biology, 13, 129-142.

Gower, J. C. (1962). Variance component estimation for unbalanced
hierarchical classifications. Biometrics, 18, 537-542.

Hall, P. (1986a).
On the bootstrap and confidence intervals. The Annals of Statistics,
14(4), 1431-1452.

Hall, P. (1986b). On the number of bootstrap simulations required to
construct a confidence interval. The Annals of Statistics, 14(4),
1453-1462.

Hall, P. (1988). Theoretical comparison of bootstrap confidence
intervals. The Annals of Statistics, 16(3), 927-953.

Hall, P. (1989). Bootstrap confidence regions for directional data.
Journal of the American Statistical Association, 84(408), 996-1002.

Hall, P. (1990). The Bootstrap and Edgeworth Expansion. (In press.)

Hanushek, E. A. (1974). Efficient estimators for regressing regression
coefficients. The American Statistician, 28(2), 66-67.

Hartley, H. O. (1967). Expectation, variances and covariances of ANOVA
mean squares by 'synthesis'. Biometrics, 23, 105-114.

Hartley, H. O., Rao, J. N. K., and LaMotte, L. R. (1978). A simple
'synthesis'-based method of variance component estimation. Biometrics,
34, 233-242.

Harville, D. A. (1977). Maximum likelihood approaches to variance
component estimation and to related problems. Journal of the American
Statistical Association, 72(358), 320-340.

Henderson, C. R. (1953). Estimation of variance and covariance
components. Biometrics, 9, 226-252.

Henderson, C. R. (1959). Design and analysis of animal husbandry
experiments. In Techniques and Procedures in Animal Production
Research. American Society of Animal Production Monograph.

Hinkley, D., and Wei, B. (1984). Improvements of jackknife confidence
limit methods. Biometrika, 71(2), 331-339.

Hocking, R. R. (1985). The Analysis of Linear Models. Monterey:
Brooks/Cole Publishing Company.

Kennedy, W. J., & Gentle, J. E. (1980). Statistical Computing. New
York: Marcel Dekker, Inc.

Laird, N. M., & Louis, T. A. (1987). Empirical Bayes confidence
intervals based on bootstrap samples. Journal of the American
Statistical Association, 82(399), 739-750.

LaMotte, L. R. (1973). Quadratic estimation of variance components.
Biometrics, 29, 311-330.

Leestma, S., & Nyhoff, L. (1987).
PASCAL Programming and Problem Solving. New York: Macmillan Publishing
Company.

Lunneborg, C. E. (1985). Estimating the correlation coefficient: The
bootstrap approach. Psychological Bulletin, 98(1), 209-215.

Mason, W. M., Wong, G. Y., & Entwisle, B. (1983). Contextual analysis
through the multi-level linear model. In Sociological Methodology. San
Francisco: Jossey-Bass.

Mood, A. M., Graybill, F. A., & Boes, D. C. (1974). Introduction to
the Theory of Statistics. New York: McGraw-Hill Book Company.

Nash, S. N. (1981). Comments on nonparametric standard errors and
confidence intervals. The Canadian Journal of Statistics, 9(2),
163-164.

Newmann, F. M., Rutter, R. A., & Smith, M. S. (1989). Organizational
factors that affect school sense of efficacy, community, and
expectations. Sociology of Education, 62, 221-238.

Njunji, A. (1974). Transformation of education in Kenya since
independence. Education in Eastern Africa, 4, 107-125.

Patterson, H. D., and Thompson, R. (1971). Recovery of inter-block
information when block sizes are unequal. Biometrika, 58, 545-554.

Peterson, P. (1972). A Review of the Research on Mastery Learning
Strategies. Unpublished manuscript, International Association for the
Evaluation of Educational Achievement.

Quenouille, M. (1949). Approximate tests of correlation in time
series. Journal of the Royal Statistical Society, Series B, 11, 68-84.

Rao, J. N. K. (1968). On expectations, variances and covariances of
ANOVA mean squares by 'synthesis'. Biometrics, 24, 963-978.

Rao, C. R. (1972). Estimation of variance and covariance components in
linear models. Journal of the American Statistical Association,
67(337), 112-115.

Rao, C. R. (1970). Estimation of heteroscedastic variances in linear
models. Journal of the American Statistical Association, 65, 161-172.

Rao, C. R. (1971a). Estimation of variance and covariance components -
MINQUE theory. Journal of Multivariate Analysis, 1, 257-275.

Rao, C. R. (1971b). Minimum variance quadratic unbiased estimation of
variance components. Journal of Multivariate Analysis, 1, 445-456.
Rasmussen, J. L. (1987). Estimating correlation coefficients:
Bootstrap and parametric approaches. Psychological Bulletin, 101(1),
136-139.

Raudenbush, S. W. (1984). Application of a Hierarchical Linear Model
in Educational Research. Unpublished doctoral dissertation, Harvard
University.

Raudenbush, S. W., & Bryk, A. S. (1986). A hierarchical model for
studying school effects. Sociology of Education, 59, 1-17.

Raudenbush, S. W., & Bryk, A. S. (1987). Application of hierarchical
linear models to assessing change. Psychological Bulletin, 101(1),
147-158.

Raudenbush, S. W., & Bryk, A. S. (1988). Methodological advances in
analyzing the effects of schools and classrooms on student learning.
In Ernst Z. Rothkopf (Ed.), Review of Research in Education 1988-89.
Washington, DC: American Educational Research Association.

Rubin, D. B. (1981). The Bayesian bootstrap. The Annals of Statistics,
9(1), 130-134.

SAS Institute, Inc. (1985). SAS User's Guide: Statistics. Cary, NC:
SAS Institute, Inc.

SAS Institute, Inc. (1985). SAS/IML User's Guide. Cary, NC: SAS
Institute, Inc.

Schenker, N. (1985). Qualms about bootstrap confidence intervals.
Journal of the American Statistical Association, 80(390), 360-361.

Searle, S. R. (1971). Topics in variance component estimation.
Biometrics, 27, 1-76.

Searle, S. R. (1979). Notes on Variance Component Estimation: A
detailed account of maximum likelihood and kindred methodology.
Biometrics Unit, Cornell University, New York.

Seely, J. (1971). Quadratic subspaces and completeness. Annals of
Mathematical Statistics, 42, 710-721.

Singh, K. (1981). On the asymptotic accuracy of Efron's bootstrap.
The Annals of Statistics, 9(6), 1187-1195.

Slavin, R. E. (1987). Mastery learning reconsidered. Review of
Educational Research, 57(2), 175-213.

Thompson, W. A. (1962). The problem of negative estimates of variance
components. Annals of Mathematical Statistics, 33, 273-289.