'1...

‘3‘:

‘v

35
v of):
v.

tilt. I “|. .Aﬁ. .
1 o

‘.
..
.-
.nvot'
r-

 

... .. :«s

. .ovo‘vc

v .
.u:s ._s ¢ .‘
‘. 3...... :IU: 140

 

.51I\.-.t... .
‘ .A l.- . ..

15. . .15“: b

‘au his»:

UNIVERSITY LIBRA ARIES

IIIIIIIIIIIIIIIIIIIIIIIII IIIII IIII II I

3 129300

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

This is to certify that the

dissertation entitled

Two Level Nested Hierarchical Linear Model with
Random Intercepts via the Bootstrap

presented by

Joshua Gisemba Bagaka's

has been accepted towards fulﬁllment
of the requirements for

Ph . D degree in Counse ling ,
Educational Psychology & Special
Education (Statistics & Research Design)

 

s/éf 1%— ﬂow

Major professor

Date Marchl 1 2

.MSU is an Afﬁrmative Action/Equal Opportunity Institution 042771

 

 

 

 

 

 

LIBRARY
Michigan State
; Unlverslty

L

 

PLACE IN RETURN BOX to remove this checkout from your record.
TO AVOID FINES return on or before date due.

DATE DUE DATE DUE DATE DUE ll

1-"? ‘44 55:1? _:
’ - r3331 1133b,

 

 

 

 

 

 

 

A

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

WI I

MSU Is An Afﬁrmative Action/Equal Opportunity Inditution .
cmmn

 

_-_—.~—“-

TWO LEVEL NESTED HIERARCHICAL LINEAR
MODEL WITH RANDOM INTERCEPTS
VIA THE BOOTSTRAP

by

Joshua Gisemba Bagaka’s

A DISSERTATION

Submitted to
Michi an State University
in partial fulﬁllment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Counselin , Educational
Psychology and Speci Education

1992

ABSTRACT

TWO LEVEL NESTED HIERARCHICAL LINEAR MODEL
WITH RANDOM INTERCEPTS VIA
THE BOOTSTRAP

by

Joshua Gisemba Bagaka’s

In statistical linear models, most procedures available for estimating the
variance components of the mixed model are usually based on the assumption that
the error terms and each set of random effects in the model are normally distributed
with zero means and some variance—covariance structure. However, in certain
research situations, there is little doubt that the error terms and each set of random
effects in the mixed model can be characterized as moderately or even distinctly
non—normal with heavy tails or badly skewed distributions.

Efron (1979) discussed the use of a technique called the bootstrap to generate
sampling distributions of statistics and thereby to draw inferences about parameters
without requiring any distributional prOperties. Besides the fact that the bootstrap
liberates statisticians from over-reliance on distributional assumptions, the method
makes it possible to attack more complicated problems which may not have
Closed—form expressions.

This study utilized the bootstrap procedure to estimate the sampling
distribution of estimators, their standard errors and thereby setting conﬁdence

intervals about the parameters of a mixed HLM under a variety of conditions.

Applicability of the bootstrap on data originating from real research situations was
demonstrated through the estimation of the effects of school, classroom, and teacher
variables on the teachers’ self—eﬁcacy.

Based on the usual MINQUE and bootstrap estimators, the study showed
that the success of estimation is typically affected by the nature and size of the tails
of the distribution of the errors and sets of random effects parameters of the model.
The bootstrap generally followed MINQUE quite closely in estimating the ﬁxed and
random effects of the model under both the normal and double exponential
distributions. Particularly in estimating the pOpulation inter—class variance, 1'2 at
the 0.01 level of the intra—class correlation, the bootstrap was surprisingly closer to
the parameter value than the MINQUE.

Due to the fact that the bootstrap procedure is highly dependent on the
computer, the study recommended that software to implement the bootstrap
algorithm be deve10ped to make the method available to research practitioners.
Availability of the method to research practitioners will provide an important and
ﬂexible tool, typically unavailable through classical methods, of estimating the
sampling distributions of statistics, their standard errors, and thereby setting

conﬁdence intervals about parameters.

Copyright
by
JOSHUA GISEMBA BAGAKA’S
1992

This dissertation is dedicated
TO
my parents
Ludiah and Andaraniko Bagaka
and to
my sister Milcah Bosibori who has
been ﬁghting hard for her
life during the period
of this dissertation.

ACKNOWLEDGEMENT

This dissertation would not have been realized without the direct and
indirect assistance and encouragement of many pe0ple. Very deep appreciation goes
to Dr. Stephen Raudenbush, my advisor, friend and chair of my committee for his
support, direction, and encouragement throughout my doctoral studies. Very
sincere gratitude goes to Dr. James Stapleton, my masters program advisor and
member of my doctoral committee, whose special friendship, encouragement, endless
patience, and advise fostered the congenial atmOSphere in which my entire graduate
studies were undertaken. Very grateful acknowledgement also goes to the other
members of my committee, Dr. Betsy Becker and Dr. James Gaveleck, for their
support, prudent guidance, and encouragement. The writer also wishes to extend a
special acknowledgement to Cathy Sparks for her tireless service of typing this
dissertation.

I wish to express my deep gratitude to the members of my family: To my
children Cliff Onserio, Kathrine Kerubo, Edward Makoyo, and my wife Hellen
Moraa, who sacriﬁced their time and enjoyment; to my parents, Ludiah and
Andaraniko Bagaka for the manner they raised me, their belief in me, and their
endless patience, love and support; to my parent in—laws, Bathseba and Sospeter
Mariera for their continuous love; to my brothers and sisters, Musa Nyakeri, Jeliah
Mongina, Yunuke Nyamunsi, Milcah Bosibori, Sarah Makori, Anna Kwamboka,
Dorika Tabitha, Isabela Kemunto, and Henry Nyabuto for their special love and
encouragement; to all my brother— and sister—in—laws; and to the entire extended

Bagaka clan and family.

Very special acknowledgment goes to my family friends, Ogega and Phoebe
Mokogi Omete and their children; my nephew and friend, Thomas Ogega Orenge;
Linda and Richard Solomon and their children; my sincere friends, David Wafula
Makanda, the Mwakikotis, the Saginis, Zablon Oonge, Roselyne and Zipporah
Chisnell, Samuel Muga, the Mapatis, Annie Woo, Zora Ziazi, the Navarros, Vicky
Lutherford, Sohed, Kamuyu—wa-Kangethe, John Lelei, and the entire Kenyan
family in the greater Lansing Area for their family—like support and friendship.

I also wish to extend a special acknowledge to my cousin Charles Nyakeri;
my nephew and role model, Dr. Benson Mochoge; and all members of the Greater
Gionseri community for their encouragement and support. To Chris Obure,
Nyaundi, Oigara, Paul Sikini, my brother in—law Peter Omwoyo, the 1983 NHS and
N ASS staff and all friends and relatives who rendered their encouragement as well
as moral and ﬁnancial support. Their support is deeply appreciated by the writer,
his family and the pe0ple of Kenya. May God bless everyone. ASHANTE SANA.

vii

TABLE OF CONTENT

List of Tables
List of Figures

CHAPTER
I. INTRODUCTION

Dependence on the Normal Assumptions
Negative Variance Estimates. .
Purpose of the Study

Objectives of the Study

Summary . .

II. THE MULTILEVEL MODEL AND ESTIMATION

Introduction
The Multilevel Model
The Two—level Hierarchical Linear Model with
Random Intercepts
Model Assumptions .
Variance Component Estimation
Balanced Design
Unbalanced Design .
MINQUE for Two—level HLM with Random
Intercepts .
Choice of Weights
Summary .

III. THE BOOTSTRAP METHOD .

Introduction
Nonparametric Bootstrap .
The Bootstrap Estimate of Bias
The Bootstrap Conﬁdence Intervals
The t—method .
Percentile Method
Correction for Bias 1n Bootstrap Estimation

viii

Page

xii

H

589000:

IV. DESIGN OF THE STUDY . . . . . . . . 38

Introduction . . . . . . . . . . 38
Generation of Data . . . . . . 39
Study Design and Parameter Values . . 40
Implementation of the Bootstrap using MINQUE 43
Computer Programs . . . . . 46

V. APPLICATION OF BOOTSTRAP AND MINQUE:

HIGHER ORDER TEACHING (HOT) . . . . . 48
Introduction . . . . . . 48
Description of Data and Variables . . . 49
The Model Statements . . . . . . . . 50
Estimation Procedures . . . . . . . 51
Results of Estimation . . . . . . . 52
VI. SIMULATIONS AND BOOTSTRAP RESULTS . . . 58
Overview . . . . . . . 58
Results of Estimation Procedures . . . . 59
Results of Bootstrap Conﬁdence Intervals . 94
Accuracy of Bootstrap Conﬁdence Intervals 109

VII. SUMMARY, TECHNICAL DISCUSSION, CONCLUSIONS,

AND RECOMMENDATIONS . . . . 112
Overview . . . . . . . . . . . . 112
Summary . . . . . . 114
Teachers’ Self—Efﬁcacy Model . . . . 114

Simulated Models . . . . . . . 116

Inter—class variance :3 . . . . 117

Intra-class variance . . . . 118

Intra—class correlationa (p . . . 120

Fixed eﬂ’ects parameters 011,012,113) . 121

Coefﬁcient of the covariate (ﬂ) . . 122

Technical Discussions . . . . . 123

Conclusions . . . . . . . . . . 128
Recommendations . . . . 130
Recommendations for Further Research . 130
APPENDICES . . . . . . . . . . . . . 132
A. Summary of Computational Formulae . . . 132

B. SAS/IML Computer Programs . . . . . . 143
BIBLIOGRAPHY . . . . . . . . . . . . 168

ix

Table
4.1
5.1

6.1

6.2

6.3

6.4

6.5

6.6

6.7

LIST OF TABLES

Design Factor Combination Trials

Bootstrap and MIN QUE Estimates of the Effects of
Type of Subject, School Climate Variables on the
Teachers’ Perceived Self—Efﬁcacy

Average and Standard Deviation of the Functions

of the Estimates 'r2 and/or 1.1 under the Normal
and Double Exponential Errors and Sets of
Random Eﬁects for p = 0.01, 0.05, and 0.20

Average and Stande Deviation of the Functions

of the Estimates 02 and/ /or a’1 under the Normal
and Double Exponential Errors and Sets of
Random Effects for p= 0. 01, 0. 05, and 0. 20

Average and Standard Deviation of the Functions

of the Estimates p and or p* under the normal
and Double Exponenti Errors and Sets of
Random Effects for p = 0.01, 0.05, and 0.20

Average and Standard Deviation of the Functions
of the Estimates a:1 and/or a: under the normal

and Double Exponential Errors and Sets of
Random Effects for p = 0.01, 0.05, and 0.20

Average and Standard Deviation of the Functions
of the Estimates a:3 and/or a3 under the Normal

and Double Exponential Errors and Sets of
Random Effects for p = 0.01, 0.05, and 0.20

Average and Standard Deviation of the Functions

of the Estimates 6 and or 19 under the Normal
and Double Exponenti Errors and Sets of
Random Effects for p = 0.01, 0.05, and 0.20

Aver e and Standard Deviations of the Bootstrap
ence Limits and the Width of the Conﬁdence

Intervals about the Six Parameters of the Model

Under the Normal and Double Exponential

for p = 0.01

Page
42

53

61

67

73

79

85

90

97

6.8

6.9

6.10

Aver e and Standard Deviations of the Bootstrap

Conﬁ ence Limits and the Width of the Conﬁdence

Intervals about the Six Parameters of the Model

Under the Normal and Double Exponential

for p = 0.05 102

Aver e and Standard Deviations of the Bootstrap

Conﬁ ence Limits and the Width of the Conﬁdence

Intervals about the Six Parameters of the Model

Under the Normal and Double Exponential

for p = 0.20 106

Percentage of Times that the True P0pulation

Parameter Fell Within the Conﬁdence Intervals

Formed Using the Bootstrap Procedure at the

Three Levels of the Intra—class Correlation 111

Figure

5.1

5.2

6.1

6.2

6.3

6.4

6.5

6.6

LIST OF FIGURES

Percentage polygons for the bootstrap estimate
the inter— and intra—teacher variances and the
intra—teacher correlationfor the teachers’
self—eﬁcacy prediction model

Percentage polygons for the bootstrap estimate of
the effects of Mathematics, Science, English,

and Social Science on the teachers’

self—efﬁcacy

Percentage polygons for the MINQUE and bootstrap

estimate of r2 over 400 trials under the normal
and double exponential errors and sets of random
effects for p = 0.01, 0.05, and 0.20

Percentage polygons for the MINQUE and bootstrap
estimate of or,”3 over 400 trials under the normal

and double exponential errors and sets of random
effects for p = 0.01, 0.05, and 0.20

Percentage polygons for the MINQUE and bootstrap
estimate of p over 400 trials under the normal

and double exponential errors and sets of random
effects for p = 0.01, 0.05, and 0.20

Percentage polygons for the MINQUE and bootstrap
estimate of a:1 over 400 trials under the normal

and double exponential errors and sets of random
effects for p = 0.01, 0.05, and 0.20

Percentage polygons for the MINQUE and bootstrap
estimate of (13 over 400 trials under the normal

and double exponential errors and sets of random
effects for p = 0.01, 0.05, and 0.20

Percentage polygons for the MINQUE and bootstrap
estimate of 13 over 400 trials under the normal

and double exponential errors and sets of random
effects for p = 0.01, 0.05, and 0.20

xii

Page

56

57

65

71

77

83

89

93

6.7

7.1

7.2

Page

Percentage polygons for the relationship between
the distribution of the function D = 1-2 — r2 and
D* = r“ - r’ for p = 0.01, 0.05, and 0.20 95

Percentage polygons for the distribution of
R and R* representing the ratios of the

estimates of the random parameters 7’, 0:, p 125

Percentage polygons for the distribution of
R and R“ representing the ratios of the
estimates of the ﬁxed parameters a” a3,

and ﬁ 126

xiii

CHAPTER I

INTRODUCTION

Estimation is frequently based on subpopulations which can be combined
collectively into one underlying p0pulat'ion. For instance, educational performance
can be examined through a sample from several schools in a nation or state. It is
quite natural to estimate the mean achievement and the spread of students’
achievement in each school. Yet groups such as the school district personnel may be
interested in knowing the achievement of students in their school district relative to
the national average, while the school principal may be interested in the
performance of the school relative to the statewide or national average performance.
Several other interest groups (parents, teachers, education ministers) may have
interest in diﬁerent "levels" of data, making it necessary to examine the data in
stages.

Recently methodologists (Aitking and Longford, 1986; Burstein, Linn, &
Campell, 1978; Burstein & Miller, 1930; Goldstein, 1986; Raudenbush & Bryk,
1986) have deve10ped techniques to address studies involving data which have a
hierarchical character. Most of these studies have been conducted under the
assumption that the observations are independent and normally distributed.

Mason, Wong, and Entwistle (1983) and Raudenbush (1984) formulated similar
mathematical models for hierarchical data within—macro units with, say, students
in a school as "micro units" of analysis and schools as the "macro units" of analysis.
The resulting within— and between-macro units models were based on the usual

independence and normality assumptions.

Since the manner of obtaining data typically affects inferences that can be
drawn from such models, we consider a sampling process in which the "macro" units
are randomly drawn from a p0pulation before a random sample of "micro" units are
drawn from each "macro" unit. The resulting data are thus associated with two
random components (the between and within macro variance components) and the
model is correspondingly called the random effects model. Many situations arise
where "macro" units are nested within some ﬁxed factors (not drawn randomly from
a population), together with other micro ﬁxed effects and covariate(s), resulting in a
mixed model with both ﬁxed and random effects. Analysis of variance is
traditionally employed in situations involving ﬁxed effects models, to estimate the
ﬁxed effects parameters.

Although statistical procedures are available for estimating the variance
components of the random parts of the mixed model, these procedures have several
limitations, which take one or more of the following forms. First is the problem of
unbalanced designs (unequal subclass numbers). Estimating variance components
from unbalanced data is not as straightforward as from balanced data (Searle,

1971). Secondly, estimating variance components often involves relatively
cumbersome algebra which makes it difﬁcult for most methods to estimate model
parameters when covariate(s) are involved as part of the ﬁxed factors. Third is the
problem of negative estimates of the variance components and last but not least, the
problem that most variance component estimation procedures are based on the
assumption that the random error terms and sets of random effects are normally
distributed.

The limitation of unbalancedness is certainly clear since balanced designs are
rare in research situations. Thus, procedures of estimating variance components
limited to balanced designs may not be at all useful. On the situation of

cumbersome algebra and negative estimates, no one method has yet been clearly

established as superior either in minimizing the amount of computation required to
estimate the variance components or in obtaining non—negative estimates of the
variance components. These are the situations in which attempts can be made to
minimize but not necessarily to overcome the problem.

' Most procedures available for estimating the variance components of the
mixed model are based on the assumption that the error terms and each set of
random effects in the model are normally distributed with zero means and some
variance—covariance structure. Then for the balanced random component model,
it can be shown that the sum of squares in the analysis of variance are distributed
independently of each other; and each sums of squares divided by the expected
values of its mean square has a central chi—square distribution with the
corresponding degrees of freedom (Searle, 1971). This holds true only for the
random component model. For the mixed model, this distributional pr0perty only
holds for those sums of squares whose expected values are not functions of ﬁxed
effects; otherwise the same ratio of sums of squares that do involve ﬁxed effects, will
have a non-central chi—square distribution. Thus, the normality assumption for the
error term and each set of random effects in the model is the basis of the
distributional pr0perties of variance component estimators, on which most variance
component estimation procedures are based.

However experience has shown that in certain research situations, there is
little doubt that the error terms and each set of random effects in the mixed model
can be characterized as moderately or even distinctly non—normal. For example,
educationally oriented variables such as number of days absent from school, number
of times a student answers a question (or talks in class) and many other variables

are likely to produce non—normal data that are heavy tailed or badly skewed. Thus

the results of statistical methods based on the Gaussian assumptions may not
always be reliable.

Approaches are available for dealing with non—normality. Most involve
transforming the original data to a form more closely resembling a normal
distribution such that normal theory methods can be applied (Box and Cox, 1964).
Efron (1982) examined a family of six transformations and cautioned against
uncritical use of normality as a criterion for successful transformation. Perhaps
variance stabilizing transformations may be preferable. Efron (1982) discusses
situations in which it is better to transform to homoscedasity and ignore
non—normality than vicehversa. Otherwise the alternative could be to do a
complete analysis to recover the lost information during transformation. However,
the practical motivation of transformation theory is to avoid complicated analysis,
especially in already complicated situations (Efron, 1982).

What complicates the issue of using transformations even more is the fact
that the underlying distribution of the original variable must be known before one
decides on the apprOpriate transformation. In many situations, the underlying
distribution of the original data may not be known and thus apprOpriate
transformation of the data becomes difﬁcult.

Rao (1971) proposed the Minimum Norm Quadratic Unbiased Estimator
(MINQUE) for variance components, which does not require the normal distribution
properties of the error term and each of the sets of random effects. The method is
quite general, applicable most experimental situations, and the computations are
relatively simple (Rao, 1971). The approach of the MINQUE involves estimating a
linear function of the variance components using a quadratic function of the
observations, using pre—assigned weights in the norm. The MIN QUE estimates

therefore may vary with the choice of weights.

In addressing the problem of dependency on the weights when using
MINQUE, Brown (1976) suggested a procedure in which, after calculating a
MINQUE estimate as usual, the values therein are used as weights and the
MINQUE is calculated again. The process is repeated iteratively until two
successive estimates are equal, to some degree of apprordmation. The method has
been named iterative MINQUE or I—MIN QUE (Brown, 1976) and it has been
shown that MIN QUE and I—MINQUE estimators are asymptotically normal.
However, because the I—MIN QUE estimators are obtained iteratively, they do not
have the pr0perties used in deriving MIN QUE (unbiasedness, translation invariant
and minimum norm), and thus they are not necessarily unbiased or "best" in any
sense (Searle, 1979).

This study ad0pted the MINQUE procedure as a useful method of estimating
the variance component since the procedure does not require the normal
distributional pr0perties. In addition, perhaps one of the most useful features of the
study was in the speciﬁc manner in which MIN QUE was implemented. The study
used a crude ANOVA—type estimate of the variance components of the mixed
model as in Hanushek (1974). The values from this prior estimator are used to
determine the weights which are then used in the computation of the MINQUE
estimators. However, this does not in any way constitute the focus of the present
dissertation. The primary focus stands to be an attempt to liberate statisticians
from over—reliance on the normal assumptions in estimating the variance
components of a mixed model.

Efron (1979) has discussed the use of a technique called the bootstrap to
generate sampling distributions of statistics and thereby to draw inferences about
parameters without requiring any distributional pr0perties. Although Efron avoids
making any general claim for the origin of the name "bootstrap," Efron’s examples
may suggest to some that it is indeed a technique of "pulling ourselves up by our

bootstraps" in a data analysis, that is, for obtaining inferences insensitive to model
assumptions (Rubin, 1981). Indeed, the name reflects the fact that one available
sample gives rise to many others (Diaconis & Efron, 1983).

The deve10pment of the bootstrap starts with a sample X = {X,,...,Xn} of n
observations. From this sample, a random sample of size n is drawn with
rcplacement from which a ﬁrst bootstrap estimate is calculated. The replicated
sample is denoted by X“ = {X},...,X;} and the bootstrap replicated estimate
computed from X“ is denoted by it". The process is repeated a large number B
times resulting in a sequence 6?; of estimates for b = 1,...,B. If F designates the
unknown distribution of X, then Efron (1979,1982) argues that the empirical
bootstrap distribution F* of X* can provide a very good approximation of F for
a wide variety of interesting statistics. The bootstrap, therefore, which is an
elaboration of the jackknife invented by Quenouille (1949), provides a general
method which can be applied to complicated situations where theoretical analysis is
not possible. Under quite general conditions, the bootstrap gives asymptotically
consistent results and for some simple problems which can be analyzed completely,
for example, ordinary linear regression, it automatically produces results which are
comparable to standard solutions (Efron, 1981b). Through a series of examples,
Efron has shown that the bootstrap method works reasonably well under a variety
of situations. A more detailed discussion of the bootstrap method is offered in
Chapter III.

Dependence on the Normal Assumptions
The distributional assumptions imputed to the random error terms and each
set of random effects in the mixed model are that they are independent and

normally distributed with mean 0 and some variance—covariance structure. But

in order to realize the increased ﬂexibility of hierarchical linear models, careful
attention needs to be paid to these statistical assumptions (Bryk and Raudenbush,
1987). Though methods are available to assess the degree to which these
assumptions are realized in research situations, many researchers proceed with
computations under the normal assumptions regardless of whether or not the
normality condition is met. However, there are several situations in educational
research where hierarchical models may be applicable, but the normal assumptions
may not be guaranteed. For example, in the model involving student achievement
scores, or number of days absent there is doubt that the error terms are normally
distributed in certain situations.

The most common macro unit of analysis for the between group hierarchical
model in education is the school. Often, a random sample of schools is drawn from
which a sample of students is also drawn at random. Schools with different
characteristics may be in the sample resulting in an hierarchical data set with
certain response variables with different distributions for each school. Certain
educationally oriented variables, either at the student or school or classroom level
may be observed. Yet as mentioned earlier, for some of the variables, under the
assumption of random sampling from these suprpulations, there may be doubt
about the normality of the p0pulation distribution. Some schools may have data
sets whose underlying distribution is negatively skewed, others positively skewed,
others heavy-tailed and others may even be normally distributed. Under these
conditions using the standard methods to calculate parameter estimates may not
provide better estimates. Attempts to transform data to a form more closely
resembling a normal distribution will involve identifying the underlying original
distributions for variables in each context (e.g., school) ﬁrst before deciding on the

most apprOpriate transformation strategies for each suprpulation. Even if the

underlying distributions of the suprpulations were known, transforming variables
diﬁ'erently for each "macro" unit may deteriorate into a welter of calculations. In
such a situation, therefore, the bootstrap algorithm becomes handy and apprOpriate
not only to determine the expected values of the estimates without worrying about
the distributional properties but also to estimate the standard errors of the
estimates and the emmrical distributions of the estimators, thereby setting

conﬁdence intervals about the parameters.

Negative Variance Estimates
The usefulness of variance component techniques is frequently limited by the

occurrence of negative estimates of essentially positive parameters (Thompson,
1962). Though methods like the Restricted Maximum Likelihood (REML) were
primarily designed to remove this objectionable characteristic for certain
experimental designs, the problem still remains unsolved. Thompson (1962)
described an algorithm for solving the problem of negative estimates of variance
components for all random effects models by considering that their expected mean
square column forms a mathematical tree in a certain sense. The algorithm was

described as follows:

"Consider the maximum mean square in the entire array; if this mean
square is the root of the tree then equate it to its expectation. If the
minimum mean square is not the root then pool it with its predecessor."
Thompson, 1962, p. 273.

In either case the problem is reduced to an identical one having estimates of the

variance components. The estimates are non—negative and have a maximum

likelihood pr0perty.

Other methods like the method of moments have ways of controlling for the
occurrence of negative estimates by simply equating any negative estimate to zero.
It is anticipated that the bootstrap method used in this study will provide another
useful way of controlling for non—negativity of estimates of the variance components
particularly when the p0pulation interclass variance is small but positive.

For the bootstrap method, the estimate 21,; of the variance component is
computed at each replication b for b = 1,2,...,B where B is a large number. The
expected value of the estimate is then the average over all B replicated values of
the estimator. It is anticipated that if the parameter value of the variance
component is non—negative, then the sum and average over all the replicated values
will be non—negative. In this case therefore, we view the bootstrap as a means of
providing the MINQUE method with B opportunities to prove the positiveness of

the estimate of the parameter value which is essentially positive.

Pu_rpgse of the Study
The interest of the study lies in a two—level mixed and nested hierarchical

linear model (HLM) with random intercepts. The general problem is one of
estimating the ﬁxed effects and the variance components (group and individual level
variances) under several situations including conditions under which the normality
assumption may not be guaranteed. Besides the non—normality problem, the
problem of negative estimates of the between "macro" unit variance component
(especially in the case of boundary situations when the true variance component is

small) is not new to statisticians.

10

In the present study two different estimators of the model parameters were
obtained and compared against each other. The ﬁrst estimator was the MIN QUE
based on the original sample. The other was the bootstrap estimate computed

though resampling from a sample with replacement.

prg’ tives of the Study

The study demonstrates the use of the bootstrap in providing estimates of
the parameters (ﬁxed and random) of a general two—level mixed and nested
hierarchical linear model, determining the standard errors of the estimators and
their empirical bootstrap distributions. In addition to demonstrating that the
bootstrap algorithm liberates statisticians from over—reliance on the Gaussian
assumptions (Diaconis and Efron, 1983), through Monte Carlo simulations, this
study also

( 1) Determines the bootstrap standard errors of the variance components
and thereby allow construction of bootstrap conﬁdence intervals about
the parameters.

(2) Assesses the performance of the bootstrap method in determining the
estimates of the sampling distributions, the standard errors, and the
interval estimates of the variance component estimates of the model
when the response variable,

a) is normally distributed.
b) has a distribution with fairly heavy tails (e.g., the double
exponential distribution)

(3) Evaluate the bootstrap estimates and usual MINQUE estimates of the

variance components.

ll

(4) Examines the relative accuracy of the bootstrap method in estimating
the variance components of the model in the case of boundary
situations, particularly when the population interclass variance is
small but positive.

(5) Determines the bootstrap estimate of the ﬁxed effects

parameters and their standard errors.

Summgy

The present dissertation concentrates on the problem of estimating the
parameters of a mixed and nested hierarchical linear model. Chapter II will
describe the hierarchical model and the estimation of variance components in
balanced and unbalanced designs when the models are with and without
covariate(s). Chapter HI will concentrate on the discussion of the bootstrap
method. The design of the study will be provided in Chapter IV and an application
of the bootstrap method in estimating model parameters in higher order teaching
(HOT) research will be presented in Chapter V. MINQUE and bootstrap Monte
Carlo simulations results will be presented in Chapter VI and conclusions and

recommendations set out in Chapter VII.

CHAPTER II

THE MULTILEVEL MODEL AND ESTIMATION
Intr ction

It is common in educational research to study the effect of character of the
educational group (e.g., school, school district, classroom, province). These
group—oriented variables (e.g., school policies, teacher/student ratio, per—pupil
spending) may form part of a set of independent variables hypothesized to have an
effect on some individual—level outcome variable(s). For example, student learning
activities occur within organizations in which the individual students belong
(Burstein, 1980). It is therefore necessary for educational researchers to understand
and be able to explain the complex inﬂuence of not only individual level variables
but also group—oriented variables on some individual student outcome variables.

Data in this class of research is typically available at two levels of
observation, individual (or micro) and group (or macro) levels, giving rise to a
hierarchical structure of data. Similar arguments can apply as well to more
complicated nesting situations (students within classrooms, classrooms within
schools, schools within school districts, and school districts within states or
provinces) without loss of generality (Burstein, 1980). The problem then, is that of
analyzing such multilevel data when certain key independent variables are measured
at different levels of an organizational hierarchy.

Traditional statistical methods of data analysis like multiple regression and
analysis of variance have been found to be ineffective in studies involving such data

of hierarchical structure. Methodologists (Burstein, 1980; Burstein & Lin, 1978;

12

13

Mason, Wong, & Entwisle, 1984; Raudenbush Sr Bryk, 1986; Raudenbush, 1984)
have not only warned against the use of such classical linear models but have also
provided general statistical models of investigation when data exists in hierarchical
structures. These models are commonly referred to as hierarchical linear models
(HLM).

Studies of school effectiveness (e.g., Brookover, et.al., 1982) have been
interested not only in student achievement levels as measured by the average
achievement scores but also in overall group achievement or "equity" as measured
by the variability (or spread) of achievement scores. From this viewpoint, more
effective schools for example, not only produce high average achievement scores but
also help students of varied backgrounds to achieve mastery (Raudenbush, 1984).
The notion of evaluating the effectiveness of schools in achieving "equity" by
observing the within—school variability of scores can also be extended to examining
effectiveness of the state, province or country by observing between—school
variability in student achievement scores. Coupled with the fact that the "macro"
units (e.g., schools or classrooms) in the study may constitute a random sample of
such units from a p0pulation, then the mixed model conceptualization is certainly
appealing. Thus, one important class of investigation in such situations would
involve estimation of the variance components (or equity) in addition to examining

the eﬂ'ects of other ﬁxed factors in the mixed hierarchical linear models.

Thg Multilevel Model

The structure of data considered in this multilevel framework is assumed to
involve two levels of observations; the individual (micro) level and some higher
(macro) level. The structure can be characterized by contexts such as schools or
countries as "macro" units of analysis and individual subjects as the "micro" units

of analysis. The fundamental assumption underlying this multi-level hierarchical

14

framework is that the micro values of the response variable depend in some way on
context and that the effect of the micro determinants may vary as a function of
context (Mason, et. al., 1983). At the lowest level, some measure of outcome for
individual subjects and other individual characteristics may be apprOpriate.

Suppose we begin by posing a within—context model that deﬁnes a micro
equation with one micro response variable Y and one micro regressor X, which is
identical for each macro unit j as,

(2-1) Yij = 503' + ﬂrjxrij + eij
where j = 1, 2,...,J macro units and i = 1, 2,...,nj micro units within the macro
units. In this case Yij is the response and Xﬁj the regressor value for subject i
in macro unit j and 6,, is the random error term. The usual assumption is that
Eij is distributed normally with mean zero and variance 0:. The micro parameters
ﬁoj and 311 are assumed to vary across context as a function of some macro
regressor variable W.

Since [90,- and ﬂli are deﬁned for each context, we pose the
between-context models using 190,- and ﬂu as re3ponse variables as

(2-2) ﬁOj = 700 + Vorwij + er

(2-3) 5;; = 710 + 711w1j ‘l' erj
where 60) and 511 are the intercept and regression slape respectively for context
j, as deﬁned in Equation 2.1. Both the intercept and the slape are assumed to be
random with eoj and e“ as their respective random error terms. It is most
common to assume that the error terms eoj and elj are normally distributed with
mean zero and variances 1100 and "11: respectively, with the covariance of eoj and
e“ denoted by 110,.

A single equation for this simple case of a multilevel model can be obtained

by substituting Equations 2.2 and 2.3 into Equation 2.1 as

15

(2-4) Yij = 700 ‘I' 7orwrj + 'Yroan + 711W1jxrij +
(er + Xmeu + eij).
Although Equation 1.4 involves one micro regressor and one macro regressor, it is
quite general in the sense that other models of potential interest can evolve from it
(Mason, et. al., 1983). For example, a random effects one way analysis of variance
(AN OVA) model can be derived from it by setting to zero all of the coefﬁcients of
X1 and W1, i.e., 70, = 1,0 = 7“ = e“ = 0 resulting in the model equation,

(2.5) Yij = 700 + eoj +.Eij.

Similarly, a ﬁxed effects regression model is obtained by setting eoj and e,,- to
zero to obtain the model equation

(2-6) Yij="Yoo + 7orwij + 7roxrij + 711erxrij + 613”

For this study, a hierarchical linear model with random intercepts is
considered. Consequently, the random error elj associated with the random 310pe
model given in Equation 2.3 is set to zero. Model Equation 2.4 then reduces to the
variance component model given by

(2-7) Yij = 700 + Vorwrj + 7roxrij + 711erxrij + (er + eij)
which is a mixed model with the term (er + eij) as the random part and
(70° + 701W1+ 7,0Xﬁj + 7,,Wu-Xlij) as the ﬁxed part of the model. The ﬁxed
part of the model in Equation 2.7 may take a more general form with multiple X’s
(i.e. Xm,X3ij,...) and W’s (W,,W3,...) which may also include interactions.

One of the ﬁxed factors of the model, for example, may be the sector in
which the random factor of context is nested. The ﬁxed effects factor (sector) is
taken to consist of K levels, bringing the total number of ﬁxed effects parameters
(including covariates) to P.

In terms of the general linear model matrix notation, and if we allow for any

number L of regressor variables, Equation 2.7 can be expressed for the jth context as

16

(2'8) Y1 = {‘1‘}: +¥1§1 + $1
j= 1, 2,...J. where Yj is a (nj 1: 1) vector of response values; Xj is a (nj 1: p)

matrix of known constants; 95 is a (p x 1) vector of unknown ﬁxed effects

parameters; 21 is (nj x 1) vector of 1’3; 13,- is (qx 1) vector of unknown

random eﬁects parameters and e, is an (11j x 1) vector of random error terms.

The ngﬂevel Hierarchical Linear Model
with Random Intercepts

The model illustrated thus far reduces to simpler models under speciﬁc
conditions. The present study involves two factor levels, a ﬁxed and a random
factor where the random effects are nested within ﬁxed effects. Application of this
model can be seen in education research with the school background or sector
(public, private or religious) as the ﬁxed factor and individual schools as the random
factor. At the lowest level, some measure of outcome, for example academic
achievement may be of interest. Other school characteristics (teacher—student
ratio, school ﬁnancial means, school policy, inner city or suburban location) together
with student characteristics (social economic status, IQ) may be included in the
model as covariates. An example of the use of "Micro" and "Macro" variables can
be found in Mason et. a1. ( 1983) who used a model for a multilevel analysis of the
determinants of children born in ﬁfteen less—deve10ped countries. In this study
which was part of the Michigan Comparative Fertility project, Mason et. al. (1983)
used countries as the macro units of analysis, while married respondents served as
the micro units of analysis. Some of the macro variables used to deﬁne the context

within which individual childbearing took place included socioeconomic

l7

deve10pment, family planning program eﬁort, and per capita gross national product.
The micro speciﬁcations used included contraceptive use patterns, and the wife’s

education level.

Warm
Two levels of distributional assumptions can be speciﬁed for the multilevel

hierarchical linear model described above. First is the assumption related to the
micro speciﬁcation model shown in Equation 2.1. For this model, the error terms,

eij are assumed to be independently and normally distributed with mean vector 0
and variance 031,, for j = 1,...,J. With Equations 2.2 which describe the random
1

intercept part of the model, the following assumptions are made:
(i) the error terms eoj associated with the intercept 60,- are assumed to
be distributed normally with mean 0 and some variance 1100.

(ii) the micro errors, ‘1' are independent of the macro errors eoj.

While attainability of the distributional assumptions related to the micro
error terms cij can be easily accessed by observing the distribution of the response
values Yij, accessing the distributional assumptions of the macro error terms eoj is
more difﬁcult since [90,- is not directly observable. This worsens the situation in
dealing with methods which are overly dependent on distributional assumptions.

The assumption of independence in multilevel models also takes two forms,
within— and between—group independence. Robustness to within—group dependence
(dependence among observations) is of primary interest in the statistical literature
(Burstein, 1980). The statistical consequences of ignoring the intraclass correlation
structure that results by ignoring group membership can be quite serious (Burstein,
1980). In educational research involving student achievement, we realize that

instruction is primarily group—based. Instruction of students within the same class

18

is likely to be more similar than that for students from different classes. Under
these and similar circumstances, the between— and within—group error terms are
likely to be correlated.

In general, standard statistical estimation techniques like ordinary least
squares are ineffective in the presence of within—group dependencies. Yet in several
educational research situations, depending on the nature of the outcome and effect
variables under study, dependence among observations may be an inevitable
phenomena. Thus it may be reasonable for researchers to assume that dependencies
among observations exist, such that more effort may be spent on ways to identify
and adjust for these dependencies rather than assuming independence when

dependence may erdst.

Variance Commnents Estimation

The problem of estimating the variance components in mixed linear models,
containing both ﬁxed and random effects is not new to research methodologists.
Several methods of estimation have been suggested (Henderson, 1953; Hartley, 1967;
Searle, 1970; Henderson, et.al., 1959; Rao, 1970; 1971a, 1971b, 1972; Thompson,
1962). The deﬁciencies and/or difﬁculties in the application of these methods are
also well known (Searle, 1978). Estimates could be negative. Compuatational
problems could arise, particularly when covariates are involved as part of ﬁxed
factors of the mixed model. There is no general method to cover all situations and
problems. The problem of variance component estimation also varies with the
design, whether the data is balanced (equal subclass numbers) or unbalanced.
We:

Balanced designs are those in which there are equal numbers of observations
in all the macro units. The analysis of variance method (or the method of

moments) is traditionally employed in estimating the variance components of mixed

19

(or random) balanced designs. The method involves equating statistics to their
expected values and solving the resulting equations for the parameters (Hocking,
1985). But due to the infrequency of balanced designs in real world research,
methods which are limited to balanced designs are not at all useful.
Was

Unbalanced designs are to those in which the number of observations in the
sub—classes or macro groups are not all the same. Besides the problem of
cumbersome algebra and a confusion of symbols in variance components estimation
in unbalanced designs, other problems arise. Whereas with balanced data there is
only one set of quadratic forms to use (the analysis of variance mean squares), there
are many sets of quadratic forms that can be used for unbalanced data. And unlike
in balanced data, most quadratic forms in unbalanced designs lead to estimates that
have few optimal pr0perties. As Searle (1971) indicated, none of the earlier methods
were clearly established as superior in variance component estimation.

Efforts to adapt variance component estimation methods to unbalanced data
were led by Henderson (1953) who described an analogous to the analysis of
variance method used with balanced data, but designed to correct that deﬁciency.
Other methods evolved thereafter (see Searle, 1968), but Searle ( 1971) indicated
that most of these methods reduced in some way to the method of moments for
balanced data. The methods involve relatively cumbersome algebra such that a
discussion of unbalanced data easily deteriorate into a welter of symbols.

Other more recent methods of variance components estimation evolved which
are not necessarily allergic to unbalancedness. The maximum likelihood (ML)
estimator of the variance components is one such method. The ML estimators of
the variance components are those values of the components which maximize the
likelihood over the positive space of the variance components parameters (Corbeil

and Searle, 1976). Application of the ML method therefore requires assuming a

20

probability density function for the random variables, and than writing down the
likelihood function of the sample data. Though the general ML procedure can be
used for almost any probability density function, for variance component
estimation, it is customary to assume normality (Searle, 1979). Then maximizing
the logarithm of the likelihood function is fairly straightforward. However, as
indicated earlier, requiring the normality assumptions in certain research situations
may be expecting too much.

An alternative to the ML estimator of the variance components is the
restricted maximum likelihood (REML) which was ﬁrst suggested by Thompson
(1962) and later formally described by Patterson and Thompson (1971) and Corbeil
and Searle (1976). The method is based on a transformation that partitions the
likelihood under normality into two parts, one being free of the ﬁxed effects and the
other involving ﬁxed effects. Maximizing the part that is free of ﬁxed effects yield
the REML estimators (Corbeil and Searle, 1976). The REML estimators are
translation invariant, but because maximum likelihood restricts the estimator to the
allowable parameter space (positive), then REML estimators are biased. Thus, in
terms of assumption requirements, neither ML nor REML offers no solutions to the
estimation of variance components without assuming certain distributional
properties.

From the late 1960’s till the early 1970’s, statisticians were involved in
seeking methods of variance components estimation that posess more desirable
properties than just being unbiased and translation invariant. LaMette (1973) and
Rao (1970) though working independently, derived the minimum variance quadratic
unbiased estimators (MIVQUE) from the theoretical viewpoints without offering
ways of applying the method to actual data analysis. Henderson (1973) derived
computational formulae for MIVQUE based on the mixed model equations (MME)
and indeed showed that LaMotte’s and Rao’s methods were identical.

21

MIVQUE assumes normality and that V, the variance—covariance matrix of
the observations is known. Then the variance of quadratic forms is minimized for
this V. Since V is not known in reality, the procedure requires utilizing some
prior information about V, and as a result, the variance of quadratic forms is only
minimized if this prior V is the true p0pulation value. However, MIVQUE is
unbiased and translation invariant. In addition, if the prior V is the same as the
true V, then MIVQUE is also minimum variance. Thus, MIV QUE in general, is
not minimum variance in practice, but only as good as the prior V. For the present
study, the MIV QUE too does not oﬁer solutions to variance component estimation
since the procedure requires the normality assumption.

Rao (1970, 1971a,b, 1972) prOposed a minimum norm, quadratic, unbiased

estimator (MINQUE) of the variance component to estimate a linear function P I g
of the variance component (for known P ’ ). The method utilizes a quadratic

function Y’AY of the observations for Y, a vector of observations and A, a

symmetric matrix. The quadratic function Y'AY used to estimate P I g is taken

to possess the properties of translation invariance, unbiasedness and minimum norm
(Searle, 1979). More importantly, the MINQUE theory is developed without
reference to normality or the variance of the estimator and the method is highly
ﬂexible in the choice of norm while at the same time preserving the desirable
prOperties of the estimator (Rao, 1971). Since their invention these estimators have
gained much recognition. See in particular, Seely (1971), Hartley et. al. (1978), and
Searle (1979). Also the naive form of the MIN QUE which corresponds to the rather
uninformative prior value V = woln (MIN QUEO) is provided by Statistical

Analysis Systems (SAS). Due to its desirable properties, particularly the fact that

the procedure does not require the normal assumptions, the present study adOpted

22

the MINQUE technique in estimating the parameters of the mixed model via the

bootstrap method.

MINQUE for Two—level HLM with
Random Intercepts

The minimum norm quadratic unbiased estimator (MINQUE) for the
variance components of a mixed model is based on the statistical linear model whose

general form is represented by

(29) ¥=z<e+a>+s

with the following deﬁnitions:

Y is a (nxl) vector of n observations

X is an (nxP) known matrix of rank r()_() < n

g is a (le) vector of P ﬁxed eﬁ'ects parameters

2 is an (nxJ) known matrix, often consisting of 1’s and 0’s
1) is a (J x1) vector of J unobservable random effects

parameters, and

e is a (nxl) vector of random error terms.

In order to identify the variance components corresponding to the random

effects in 13, this vector 11 is partitioned as
(2.10) 9’ = [bi...bk...b;] for k = 1,...,c,

where the vector bk contains jk effects for the levels of the k‘11 random factor.

Corresponding to bk of 2.10 the incidence matrix 2 is accordingly partitioned as
(2.11) 2: [argyle] for k = 1,...,c,

such that 2.9 can be written as,

23

(2.12) 1; = 159 + 3312,13, + 5

with the model elements deﬁned as before. Equation 2.10 and 2.11 are similar to 5.3
in Rao (1971b).
A compact way of writing (2.12) is to deﬁne 5 as another 131: namely, 13,,

and the corresponding 20 as In. The model Equation 2.12 becomes

(2.13) Y = X9 +

with the following distribution pr0perties:
(2-14) 13031) = 9, VaIQk) = Jilin COVQkiPkI) = 9

for 1,1' = 0,1,...,c

where cov(bk,bk,) is the matrix of covariance of the elements of bk with those of
13,, for k s 1’. The variance of y is given by
(2.15) Var(b) = D = diag{a§ljk} for 1 = 0,...,c.

With this formulation, we notice that Equation 2.7 is a special case of 2.13
with c = 1 whose compact form may be given by

(2'16) Y = )5? 'l' @0130 + @181

where

Y is a (nxl) vector of n observations,
X is an (nxP) known matrix of rank r(x) < n

representing ﬁxed effects parameters,
a is a (le) vector of P ﬁxed effects parameters,

20 is an (nun) identity matrix,

130 is a (had) vector of residual error terms,

24

2, is a (nxJ) known matrix, often consisting of 1’s and 0’s and
b is a (J 111) vector of J unobservable random effects

parameters.
The distributional pr0perties imputed to 2.16 are according to 2.14 given by
(2.17) may.) =9.cov(yo.1>.) =9
(2.18) 13 = var(bk) = diag{a:1n, 7’11} for 1 = 0,1
where a: is the variance component of the residual errors and 'r2 is the variance
component of the random effects of the model.

For the two—level mixed model of the form given in Equation 2.16, the

MINQUE estimate 5 of the variance components of 130 and b, using weights w0
and w1 in the norm is given by

(2.19) §= {tr(13.azilj.zezé.i-l {familiar}.

for k,k' =0,1

where P, which is given by

(2.20) 13.. = y;1—Y.;05(z<’y.;is)-Iis’y;l
is the projection operator on the space generated by the columns of X similar to
1.2 in Rao (1971b). v, = 213,? for D, = diag{onn, wgJ} is a dispersion

matrix of b where w0 = 1- p and “’1: p for p the intraclass correlation
coeﬁcient. In practice, the weights w0 and w1 are pre—assigned numbers hence

V, and P, are matrices which can be calculated easily.

To advance the MINQUE estimates associated with the weights W0 and w1

for this special case, deﬁne F, and U, as follows:

(2'21) Fw = {tr(P'ZkszWZkIZkI)}
(2.22) U, = {Y P,ZkaP,Y}

25

for k, k, = 0,1. F, is a (2x2) matrix and U, is a 2—dimensional vector both
originating from 2.19 such that the MINQUE estimator (:7 is given by
(2-23) g= 13:11.
Deﬁne the matrices If and A, as part of the projection Operator 2.20 as
(2.24) I; = (3’11:ch
9-25) 4. = waist/r.

If we let f“, to be elements of a (2x2) matrix F, for k,k’ = 0,1, then the

following deﬁnitions can be given:

(2.26) to, = tr(1_’,l_’,,) = tr(Y.;’) - "(113%)
(2-27) for = f10 = "(91:59?”
= swash -tr(y;14.z@1)

(233) f“ = (“(13 .?1? 1131119? 1)

In order to simplify the vector U, of quadratic forms, we notice that
13.? = (Y: - Yise’yisr ’S'Yv’ﬁ!
= we! - sir/231‘ Um
such that
(2-29) 13.1! = 11:0! - as?)
where (:1 is the estimate of the ﬁxed effects parameters of the model given by
(2-30) Er= 1952:?-

With this simpliﬁcation, if we let 110 and 111 be the elements of the two

26

dimensional vector U, of quadratic forms, then the following deﬁnitions can be
given:

(231) no = $13.13.? = (Y - sift/m - 293)

(2-32) =1! 132%? ¥= (¥-¥§)'Y&‘?1?1Yé‘(¥-¥§)-

We notice that uZJZJ which occurs extensively in 2.27, 2.28, and 2.32 is

block diagonal with submatrices mJ of size (annJ) whose elements are all 1’s and
VJ? is block diagonal with submatrices V71 also of size (nJ-an) given by
(2.33) = w(IJJj cJ- mJ)

_ 1 ._ "I -_
for WTT—WTJ and cJ- — (1+(nj-UW1 for j— 1,2,...,J. Let XJ be the nJ- rows of

 

the matrix X associated with ﬁxed effects in context j. Let mJ = ZJJZ JJ- for ZJJ
a (an1) vector of 1’s. Then IQ; and the elements of F, and U, will simplify

signiﬁcantly as follows:
(2.34) i_<={w2(xeJ —chJs J)}

for SJ = XJZJJ a (le) vector of column sums of XJ;

(2.35) tr(V,?) = W3 E nJ.{(1-cJ)2 + c3(nJ—l)}
(2.36) tr(V,1A,)= w3 2 tr(tJ) - a JJcJ{(1—c nJ)2 + (2—c JnJ)}

for tJ =xe1_( and a J=tr(x'JJr_<xJzJJ z’JJ)=tr(sJ r_<sJ)=sJ KsJ;

(2.37) tr(V,;3ZJZ J) = w2 E nJ(1-anJ)3
(2.3s) tr(V'lA, z Jz’J)= w3 2 a J(1—e nJJ)a
(2.39) tr(V,;TZJZ ’Jsz'J) = w: s nJ2(1-anJ)2
(2.40) tr(V,1A,ZJZZJZJ)=w3 EnJaJ(1-cJ

(2:41) 9‘ = W 2 (195119 ’ “19191)

27
for rJ = ZJJYJ is the sum of Y elements in context j. If we deﬁne d = Y —X a,

.. -J -J-

hJ- = ZJJdJ = dJZ JJ— -B(YJ- -XJa); and gJ= d JdJ, then the elements of the vector

U, can be computed as
(2.43) uJ = w2 E hJ(1—anJ)3.
As a result of adopting the formulae to the speciﬁc model of this dissertation

shown in Equation 2.16, the MINQUE estimators g of the variance components

associated with the weights w0 and wJ are given by

32
(2.44) {7: J f]

T:

A A

where the components a3 and ‘r2 are deﬁned by

(2-45) ”3 = (frruo " 01111”D 311d
(2-46) 72 = (foour " f10110” D

where D = fJJfoJJ - fJJJfJ0 is the determinant of the (2x2) matrix F, given in 2.21.

A special case of MIN QUE, namely MIN QUEO is obtained by using zero for
all wk’s except wo. With such weights, V

1
_, reduces to onJJ and P, to t—VEM
for M given by

9-47) enema-1'1

where L is the Cholesky decomposition of Y,. This special MINQUEO estimator

denoted by 3,, is given by
(2.48) ={tr(MZJJZJJMZJJ,ZJJ,)}1 {Y MZJJZJJMY}

for k,k’ = 0,1 in the place of 2.19. The estimators given in 2.48 were the ﬁrst
estimators suggested by Rao (1970) and they are the estimators provided by
Statistical Analysis System (SAS). They also appeared in Hartley et.al. (1978) and
in Seely (1971). However, Searle ( 1979) has indicated that, other than the fact that

28

these MINQUEO estimators correspond to the rather uninformative prior value

Y, = 'oIn: they have no particular merit, besides being relatively easy to compute.

Qheice of Initig Weights

One feature which is perhaps one of the most important features of the
MINQUE estimator is in the choice of the weights Wk: k = 0,1,...,c. Searle (1979)
indicated that regardless of the choice of the weights the MIN QUE estimators will
possess the pr0perties of unbiasedness, translation invariance and minimum norm,

provided the wk’s are chosen such that 13",! exists. A version of MINQUE

(Brown, 1976) known as Iterative MINQUE (I-MIN QUE) is obtained as follows:
After calculating a MIN QUE estimate Er using arbitrary weights as in 2.19, the

values therein are used as weights wk and g is calculated again. The process in
repeated iteratively until two successive values of (:1 are equal (to some degree of

approximation). However, because I—MINQUE estimates are obtained iteratively,
they do not have the pr0perties used in deriving 2.19; and as such they are not
necessarily unbiased or "best" in any sense (Searle, 1979).

Instead of using arbitrary weights, the present study uses the method of

estimating g as in Hanushek (1974). These prior estimates are then utilized to

determine the weights wk. It is then these prior weights which are used in the

computation of the MINQUE estimator o in 2.19.

It is expected that, by determining MINQUE estimates through weights

established from some prior estimate of 3* we may obtain more efﬁcient MINQUE

estimates of the variance components than through estimation based on arbitrary

weights.

29

EM!

Chapter II discussed the mathematical model for the hierarchical data
together with the model assumption. Methods of variance component estimation
were reviewed and their limitations, weaknesses and strengths discussed, for both
balanced and unbalanced designs. Since the minimum norm quadratic unbiased
estimation (MIN QUE) procedure was ad0pted for the present study to be used via
the bootstrap a more detailed discussion of the MIN QUE was offered, particularly
for the two—level hierarchical linear model. An improved method of choosing the
initial weights for MINQUE was presented. A detailed review and discussion of the
bootstrap method is presented in Chapter III.

CHAPTER III

THE BOOTSTRAP METHOD
Imam
A typical problem in statistics involves estimation of an unknown p0pulation
parameter 0. Two main questions arise in connection with the problem:
ngtien 1: Perhaps among several possible estimators of the
parameter 0, what estimator 9 should be used
to estimate 0 ?
Queﬁion 2: Having chosen some estimator :0, how accurate
is it as an estimator of 0 ?
This situation is easily adopted to most research problems in the real world.

For the problem of estimating the parameters of a mixed hierarchical linear
model, the issue of Question 1 was discussed in the greater part of Chapter II. The
minimum norm quadratic unbiased estimator (MINQUE) of the variance
component was ad0pted as the method of estimation.

The bootstrap method is generally concerned with the issues of Question 2.
As mentioned earlier, the bootstrap is a computer—intensive method, which
substitutes considerable amounts of computation in place of theoretical analysis
(Efron and Tibshirani, 1986). The method can routinely answer questions which are
far too complicated for traditional statistical analysis. Even for relatively simple
problems, the computer—based bootstrap is an increasingly good data analytic
bargain in an era of exponentially declining computational costs on extremely fast
computers (Efron and Tibshirani, 1986).

30

31

Somewhat unfortunately, the name "bootstrap" conveys the wrong
impression of "something for nothing"—of statisticians idly resampling from their
samples, presumably having about as much success as they would if they tried to
pull themselves up by their bootstraps (Hall, 1990). This is by no means the case.
The bootstrap is a technique with a sound and promising theoretical basis (Hall,
1990).

As indicated in Chapter I (and later in this chapter), the bootstrap approach
involves computing an estimate of the parameter of interest and repeating the
process a large number of times by resampling with replacement. Thus, in utilizing
the technique in estimating the variance component of the mixed hierarchical model
described in this study, MINQUE will be used because of its desirable properties
and the fact that it does not require the normality assumptions.

The primary objective of the dissertation is to demonstrate the estimation of
parameters of a mixed hierarchical linear model in the total absence of
distributional assumptions of the model. It was due to this fact that MINQUE was
adapted since it does not require to normality (Rao, 1971). MINQUE then provides
a comparable partnership with the bootstrap which is a method designed to liberate
statisticians from over-reliance on the normal assumptions in data analysis.

In the literature, a distinction is made among three types of bootstrap
method namely, nonparametric, smoothed and parametric bootstrap. Since it is
the backbone of the study, a detailed discussion of the nonparametric bootstrap is

deemed necessary. It is this discussion to which this Chapter is committed.

Wan

In the nonparametric problem, we deﬁne a resample X* as an unordered

collection of 11 items drawn from X with replacement, so that each X‘J‘ has

32

probability 1/n of being equal to any given one of the XJ’s. In other words,
(3.1) P(X* = XJIX) = 1/n, 1 5 i, j g 11.

Of course, (3.1) means that X” is likely to contain repeats, all of which must be

listed in the collection X*. We will here formally present the nonparametric

bootstrap with its associated terminologies and notation.

Let XJ, X,, ..... ,XJJ be independent random variables with an unknown
common distribution function F. Suppose 3 is chosen as a statistic to estimate a
parameter 0 of the distribution F. The bootstrap distribution of 3 is generated
by taking repeated samples from XJ, X, , ..... , XJJ. One such bootstrap sample is a
simple random sample X'IJ‘, X3, ..... , X3 of size n drawn from XJ, X, , ..... , XJJ
with replacement. One bootstrap replication of the statistic 3 is then the value 3*
of 3 computed on the bootstrap replicated sample (X"J‘, X3, ..... , X3). Thus, the
bootstrap distribution of 3 is generated by considering a large number B of
bootstrap replications 3* of 3.

In this process of "resampling," the 11 data points XJ, X, , ..... , XJJ are
treated as a p0pulation with distribution function F. Let F be the empirical
distribution of XJ, X, , ..... , XJJ which puts mass l/n on each XJ, for i = 1, 2,...,n.
When we resample the data with replacement X‘IJ‘, X",‘, ..... , x; are independent,
with common distribution function F. The idea, then, is that the behavior of the
bootstrap quantity 3* mimics the behavior of 3. Thus, the distribution of 3*
could be generated from the data and used to approximate the unknown sampling
distribution of 3.

Another way of looking at the bootstrap estimation is by supposing that
X J, X, , ..... , XJJ are independently and identically distributed random variables
from a population with unknown cumulative density function, F, and suppose the
objective is to draw inferences about some parameter 0 of the population. If 3 is

an estimator of 0 and F is the sample cdf that assigns mass l/n to each XJ,

33

i = 1, 2,..., n, then Schenker (1985) has clariﬁed that the bootstrap approximates
the sampling distribution of 3 under F by the sampling distribution of 3* under
17‘.

Several studies (Efron & Gong, 1983; Bickel & Freedman, 1981; Efron,
1981a; Schenker, 1985) have presented more straightforward procedure for
performing Monte Carlo simulation through steps. As a simple example, suppose
our goal is to estimate the standard error of the sample mean X. Let
3: X = 2 XJ/n (the sample mean). The bootstrap algorithm will approach this
problem in the following steps:

Step 1 Given the sample XJ, X, , ..... , XJJ, construct F by assigning
mass l/n to each of the XJ, i = 1, 2,....,n.

(3.2) F: mass 1/n at XJ, i = 1,2,...,n

Step 2 Draw a bootstrap sample X"J‘, X",‘, ..... , x; from F with

replacement and calculate
. = J11- % X";

Step 3 Independently do Step 2 some number B times, obtaining
bootstrap replications 3"J‘, 3;, ..... , 313.

Step 4 Calculate the standard error a( 3*) of 3" by

J 1/2
0(3") = [3.1.1 .335"; 4:12]

where
s

3': = ﬁg; 91':-
However, the question of how well the empirical distribution of 3* under F
approximate the sampling distribution of 3 under F is certainly crucial.
Freedman and Bickel (1981) applied the bootstrap algorithm to estimate the

standard error of the sample mean X assuming they did not know the formula

34

0N5. With a pOpulation Of 6,672 Americans aged 18—79 in Cycle I Of the Health
Examination Survey, Freedman and Bickel were interested in the estimate Of the
mean systolic blood pressure in millimeters Of mercury. Using B = 100, the
bootstrap algorithm yielded a mean systolic blood pressure Of 129.6 with a standard
deviation Of 21.4 compared to the sample mean Of 130.3 and a standard deviation Of
23.2 millimeters of mercury. On plotting the bootstrap and the theoretical
distribution of the mean, the bootstrap distribution followed the theoretical
distribution rather closely.

In estimating the distribution of the 5%, 10%, and 25% trimmed means,
Efron (1986) used B = 200 bootstrap replications for several trials along with the
jackknife and the theoretical Optimization. The simulation results showed that the
bootstrap clearly out performed the jackknife and its results were surprisingly close
to the theoretical Optimum for a scale—invariant standard deviation estimate
assuming full knowledge Of the parametric family.

In spite Of a great deal of encouragement from several studies (Beran, 1984;
Lunneborg, 1985; Rasmussen, 1987; Singh, 1981) in assessing the accuracy Of Efron’s
bootstrap, others like Dolker, Halperin, and Divgi (1982) have expressed doubt.
The key question about the bootstrap technique is its accuracy in a situation where
the sample size is small. What can one say in the case Of small samples, especially if
the true distribution has long, thick tails like in the case Of the Cauchy distribution
(Nash, 1981)? On the other hand, the advocates Of the bootstrap may question
whether the problem Of small sample sizes and long, fat tails is unique to the

bootstrap or whether it is a problem affecting most statistical methods.

The Bmtetrep Eetimete ef Big;
The bootstrap technique of estimating bias is based on the idea that for a

statistic 3, which estimates a parameter 0, 3" is the bootstrap estimate Of 3.

35

Thus, the bootstrap estimate Of bias, denoted by BIAS is given by

(3.3) BIAS = 11, bids; — 3) = 3" — i),
where 3 is the usual estimate Of 3 and 3'" is the average Of 3; Over all
bootstrap replicated values Of 3'". Efron (1982) used this principle to compare the
relative accuracy Of the bootstrap with other methods Of estimation. By using 1000
bootstrap replications for the law school data which yielded 21‘" = 0.779 compared
to Z) = 0.776, Efron found that the bootstrap BIAS = 0.003, compared to —0.007 for
the jackknife and —0.011 for normal theory. In this study, a similar principle will be
utilized to demonstrate the relative accuracy Of the bootstrap, and the usual

estimates based on this principle Of the estimate Of bias.

The Bmtetrep Conﬁdence Intervals

Approximate conﬁdence limits for the parameter can be found via the
bootstrap using either the t—method or the bootstrap percentile method.

(i) The g—method: Suppose, for example, that 3 is the usual estimator of
the parameter 0. The usual conﬁdence interval are based On the assumption that,

the statistic T, given by

(3,4) T = _9‘____",
3
is approximately distributed N(0, 1). Then the (1—a)100% conﬁdence interval is
given by

Using the same principle, we can set the bootstrap conﬁdence intervals by utilizing
the fact that the bootstrap estimator, 3" is the estimator Of 3. Let the statistic
Tg at replication b be deﬁned by

36

(3.6) T3 = u
0.

A r A
where 3" is based on the bootstrap sample and 03* is the estimator Of Var( 3)
based on the bootstrap sample. If we Observe T'J'; a large number B times, for
b = 1,2,...,B, we will estimate the distribution Of T by the empirical bootstrap
distribution Of T*’s. Then the bootstrap conﬁdence interval about the parameter 3
are determined by the following process: ﬁnd a number Tc such that the
proportion Of T’"’s between — T,J and Tc is (l—a). The bootstrap (1—a)100%
conﬁdence interval is then given by

(3.7) 3 a T035

(ii) Percentile method: The bootstrap percentile method for constructing
conﬁdence intervals about the parameter 3 may be presented as follows.

Let 3 be the usual statistic estimating a parameter 3 based on the original
sample. Deﬁne the statistic d = 3- 3. We can apprordmate the distribution Of d
by the bootstrap distribution of d'" = 3g-3, for b = 1,2,...,B, where 3;; are the
estimates Of 3 based on repeated samples from the original sample. If we ﬁnd dc
such that (l—a)100% Of all d‘ﬁ’s are between —dc and dc, then the (l—a)100%
conﬁdence interval about 3 is given by

(3.8) 3 a dc.’

Certaien for Big in Bmtetrep Eetimetion

Perhaps the most important application Of the bootstrap is in estimating the
distributions and standard errors of statistics based on independent Observations.
But in many problems of practical interests, the bootstrap is employed to estimate

an expected value (Hall, 1990). For example, if 3 is an estimate Of an unknown

37

parameter 3, then we might want to estimate bias, E( 3—3), or the distribution
function of d = 0— a.
In the case of bias, using Efron’s notation, we have,
BIAS = 13(9) — 0 = 11(0-0)
Using a = (0—3) + 3

(3.9) 3=3—(3—3)=3-d for d=3—3,
we can approximate the distribution Of d by the distribution Of d* = 3" — 3
where 3'" is the version Of 3 computed from bootstrap sample {X"J‘,X‘;,...,X;},
rather than from the original sample {XJ,X,,...,XJJ}. Thus we have an estimate of
bias, BIAS given by

BIAS = 3" —3
and by 3.9 3 can be estimated by
"BOOT = 3" ("f " 3)
= 3- 3‘" + 3

(3.10) 3300.1. .= 23- 31‘

Equation 3.10 is thus the bootstrap point estimate Of the parameter 3 which
is corrected for bias. This method which is referred to as "pivoting" has been
stressed in many studies Of the bootstrap method for conﬁdence intervals
(Abramovitch and Singh, 1985; DiCicciO and Tibshirani, 1987; Hinkley and Wei,
1984; Hall, 1990; and Schenker, 1985). In most instances, pivoting amounts to
"studentizing" or correcting for scale. However, in certain situations, pivoting is
difﬁcult to sustain in problems where scale cannot be estimated in a stable way
(Hall, 1990). But if we perceive that the major advantage Of "standard" bootstrap
methods is the ability to construct conﬁdence intervals Of particularly difﬁcult
problems, and if we see respect Of transformation as a major prOperty, then pivoting

is advocated (Hall, 1990).

CHAPTER IV

DESIGN OF THE STUDY
Intro i n

The primary purpose Of the present study was to estimate the parameters Of
a two level nested hierarchical linear mixed model under a variety Of conditions.
One such condition which forms the focus Of the study is in estimating the model
parameters (ﬁxed and random) in situations where the random error terms and sets
of random effects are not normally distributed. The also demonstrated the ability Of
the bootstrap algorithm in providing the estimates Of the ﬁxed and random effects
of the model, generating bootstrap empirical distributions and standard errors of the
statistics and thereby setting conﬁdence intervals about the parameters. This
chapter presents the methodology and design employed in the study.

Implementation of the study design required a method Of generating samples
drawn from pOpulations Of known parameters. Data generation and analysis were
performed on an IBM 3090 VF mainframe computer at Michigan State University.
Programs used in generating data and Monte Carlo simulations were coded in
Statistical Analysis Systems (SAS), mostly using Interactive Matrix Language
(SAS/IML). Data were generated in such a way so that sets Of data were sampled
from populations of known parameter values in order to provide a check for the
performance Of estimation procedures and their properties. Data were sampled were
the normal and double exponential (or Laplace) pOpulation distributions.

The normal distribution represented the situation where the classical

estimation methods are usable. The double exponential or Laplace distribution (an

38

39

example of a distribution with long and fat tails) represented a departure from
normality. In order to assess the relative usefulness Of the bootstrap method,
estimation Of the mixed model parameters were studied under the two distributional
models (normal and double exponential) and three speciﬁcations of the population
intraclass correlation condition. Speciﬁcation Of the intraclass correlation, the study

design, and parameter values are presented in the later part Of this chapter.

Generetion Of Det_a_

The uniform random number generator (UNIFORM) function in SAS/IML
was used to generate random uniform deviates in the interval (0,1). In certain
instances, this uniform random number generating function was used as a basis for
generating other random numbers by applying some standard linear transformations
to these uniform deviates. The SAS/IML software provides a procedure which
generates independent values from a standard univariate normal distribution. The
package also provides a procedure (REPEAT) which creates a matrix or vector by
repeating the values Of the argument. These three features Of SAS/IML were used
in generating data sets for the study.

In generating samples drawn from a pOpulation of known parameters, the
data created had to ﬁt certain assumptions. Each subject’s Observed score YJJ was
assumed to have been inﬂuenced by a combination Of factors and effects. The pOOl
of factors and effects included the ﬁxed effects factor or, a covariate X whose
coeﬁcient is denoted by 3; the random eﬁects factor bJ which are assumed to be
distributed with mean 0 and variance 7?; and the random error term 6 also
distributed with mean 0 and variance 03. A typical Observed value YJJ-
containing all these features is generated through the equation.

(4-1) Yij = I‘ + “k + ﬁxij + bj ‘l' ‘ij

where

40

YJJ is the Observed value of subject i in context j;

a, is the effect Of level k Of the ﬁxed factor;

XJJ is the covariate value Of subject i in context j whose coefﬁcient is
denoted by 3;

bJ is the effect Of level j (context j) of the random factor;

is the random error term associated with subject i in context j.

With k = 1,2,...,P—2; j = 1,2,...,J; and i = 1,2,...,nJ, and by using the
matrix notation, it can be shown that Equation 4.1 is similar to Equation 2.9 in

Chapter II where the term X9: in 2.9 represents the ﬁrst three terms of 4.1.

While the SAS/IML procedure NORMAL was utilized to generate
independent values from a standard univariate normal distribution, SAS/IML
program segments were coded tO generate double exponential variates. The most
direct way to generate double exponential (or Laplace) variates involves ﬁrst
generating two uniform random variates, U, and U, in the interval (0,1). Set
XJ = -1n(UJ) and X, = —1 if U, < 0.5. If U, _>_ 0.5, then set X, = 1. Then the
variates Y deﬁned by the equation

(4.2) Y = 1; xe,

are distributed as double exponential with mean zero and variance 2.
SAS/IML code segments used to generate the normal and double exponential
distributions are given in Appendix B.

St Desi n P eter Values
The structure Of data in the present study is assumed to involve a random

factor consisting Of J levels, nested within some ﬁxed factor levels. The random

41

factor may be characterized by contexts such as schools or countries and the ﬁxed
factor characterized by sector (e.g. public, private, or religious) in the case of
schools as context. In the case Of countries as context, the ﬁxed factor levels may
be taken to be levels Of economic or industrial deve10pment (e.g. developed, less
deve10ped, develOping or underdevelOped) or may be world regions.

As noted earlier, two design factors in this study are expected to inﬂuence
the success Of the estimation of model parameters. These are the pOpulation
distribution Of the random components and the population intraclass correlation.
The intraclass correlation denoted by p is given by

(4-3) p = T 2
73+a§

 

where 03 and r3 are the intra— and inter—class variances Of the model
respectively. As Raudenbush and Bryk (1988) indicated, the intraclass correlation
has two useful and mathematically equivalent interpretations. First, it is the
correlation between pairs Of values within the J contexts such that it measures the
degree Of dependence among Observations sharing a context. Secondly, as a ratio, it
represents the prOportion Of the total variation in the response values which is
between contexts. Estimation Of variance components is Often dimcult when p is
quite small, sometimes resulting in negative estimates of the variance components.
Due to this feature, three levels of the intraclass correlation for each of the two
distributional models were introduced in the study as part Of the design factors. In
order to vary the intraclass correlation, a3, was ﬁxed at 100 while 73 was allowed
to take values 1, 5.26, and 25 resulting in p taking values of 0.01, 0.05, and 0.20
respectively. Table 4.1 presents design factor combination trials.

42

Table 4.1

Design Factor Combination Trials*

Distribution Model

 

 

 

Intraclass Double All

Correlations (p) Normal Exponential
0.01 a b i
0.05 c d
0.20 e f k
All g h l
’" 400 trials (different sets Of data) were speciﬁed

for each cell (a through f).

The design factor speciﬁcation shown in Table 4.1 provided for a total of
2400 Monte Carlo simulation trials, each consisting Of a different data set. As a
result, 1200 trials were performed for each of the two distributional models (normal
and double exponential) and 800 trials for each Of the three levels Of the intraclass
correlation, such that, g = h = 1200; i = j = k = 800 and g + h = i + j + k = 2400.

The speciﬁc mixed model used in the study has two factors, a random factor
with J levels, nested within a ﬁxed factor with three levels and a micro level
covariate variable. However, it should be noted that it is possible to extend this
model by including additional covariates (at micro or macro level). For the purpose
of the present study, all data sets used in the study were unbalanced (unequal
number of subjects in each context) consisting Of 50 macro units.

Fixing the parameter value Of r9 at unit (near boundary value Of zero)
provided an additional advantage to the study. This is due to the interest Of the
study in estimating the random effect variance component 73 near the boundary

conditions. It is in these situations where most variance component estimation

43

procedures experience problems Of giving negative estimates Of 1’3 when the
parameter value they are estimating is essentially positive. Thus, in an attempt to
understand the performance of the bootstrap procedure in estimating 73 near
boundary conditions, out Of the total 2400 trials 800 (or 33.3%) were performed for
‘r2 = 1 (p = 0.01), 800 (or 33.3%) for r3 = 5.26 (or p = 0.05), and 800 (or 33.3%)
for :r’ = 25 (or p = 0.20).

It should be emphasized that, in using the bootstrap algorithm to estimate
the distribution Of the parameters Of the mixed model described in the study,
estimation is done at each Of b bootstrap replication, for b = 1,2,...,B, where B is
a large number. For the present study, B was set at 200 bootstrap replications for

each trial shown in Table 4.1.

1m lem nt tion f the B ts ra usin MIN UE

The MIN QUE method Of estimating the variance components requires using
weights wk associated with b, for k = 0,1. Ordinarily, arbitrary weights are
chosen provided one ensures that F,1 exists. According to Rao (1972), regardless
Of the choice Of weight, wk’s, the MINQUE estimators will still possess the
properties Of unbiasedness, translation invariance and minimum norm. However,
though the MINQUE estimators may generally possess the prOperties used in
deriving the estimators (unbiasedness, translation invariant, and minimum norm),
One would expect that, in practice, these estimators may be as good as the prior
weights that were utilized. In other words, the MIN QUE estimators depend to a
certain extent on the prior weights used in the norm. Indeed, this condition was the
motivation behind Brown (1976) who suggested iterative MINQUE (I-MIN QUE).
But since I—MINQUE estimators are Obtained iteratively, they do not possess the
properties used in deriving MIN QUE. Thus, I—MINQUE estimators are not

necessarily unbiased or "best" in any sense (Searle, 1979).

 

stud;
com]
9111
meil
pied
1511a

{01 I

the:

able

 

Instead of using arbitrary weights in implementing MINQUE, the present
study employed an AN OVA—type method Of independently estimating the variance
components Of the mixed model as in Hanushek (1974). The values Of this prior
estimates are used to derive the weights used in MINQUE. Using Hanushek’s
method, we ﬁt ordinary regression model with all independent variables as
predictors. The prior estimator, 3J3; Of the random error variance 03 is taken as
usual MSE in the multiple regression model. However, the Hanushek estimator 3J3;
for the variance, 73 Of the random eﬂects Of the model is given by

. w—( N—P)33
_ II

where w is the sums of squares of residual in the regression model

 

T = E tr{SJ<(XJXJ)'ISJ-} for SJ : (le) vector of column sums Of XJ
for context j
N = sample size, and
P = number of ﬁxed effects parameters in the model.
In order to use the Hanushek estimators 3J3; and 3J1; to derive the weight

w, and w,, deﬁne the ratio RJJ = dJaJ/iJzJ. The weights w, and w, can then be

Obtained by
R

 

1
(4.5) w = H and w = .
0 I+R, 1 1+1}; ,
1'2
We notice that wo= l—p and wJ=p where p = u . The value p is

the intraclass correlation based on the Hanushek estimates, “Ar; and 3J1 Of r? and
03 respectively. The weights w0 and w, Obtained through 4.5 are the values used
in the MINQUE procedure. It is reasonable to expect that the MINQUE based on
weights established from some prior estimates of 03 and 'r2 could be an

improvement over the conventional MINQUE based on arbitrary weights.

45

Implementation Of the bootstrap algorithm to estimate the parameters of the
mixed, hierarchical linear model in the present study requires a random sampling
procedure with replacement. First a random sample of J macro units (e.g.
countries, schools) with replacement from the available sample Of J macro units is
drawn. From each of the selected macro units, a random sample Of size nj micro
units are selected with replacement for j = 1,2,...,J. The resulting data set is
termed the bootstrap replication sample (Efron, 1981). Based on the bootstrap
replicated sample, the MINQUE procedure is used to determine the estimate Of the
parameters Of the model. The process is repeated a large number B times yielding
B MIN QUE estimates. This technique may be presented in a sequence of steps as

follows:

Step 1. Construct the distributions FJ by assigning mass 1/J to each
Of the macro units.

Step 2. From the J macro units, select a random sample of size J
with replacement

Step 3. For each Of the J selected macro units containing nJ micro
units, construct distributions FJJj by assigning mass 1/nJ- to
the jth macro unit, for j= 1,2,...,J.

Step 4. From each of the J macro units whose distributions were

constructed at Step 3 above, draw a random sample size nJ
with replacement for j = 1,2,...,J.
At the end Of Step 4, the resulting data is termed the bootstrap replicated samme.

The vector of Observations at this stage is denoted by Y*.

Step 5. From the bootstrap replicated data set generated at Step 4,
determine the MINQUE estimate Of the parameters Of the
model given by 6’" and (1*.

46

Step 6. Independently, repeat 2, 4 and 5 a large number B times to
Obtain a sequence of MIN QUE estimates Of the parameters Of
the model

53;; and {1; for b = 1,2,....B

A

Step 7: Observe the distribution Of the values of; and Er; as the

empirical bootstrap distribution Of the estimates Of the
variance component and the ﬁxed effects Of the model.

The bootstrap standard error Of each of the component of the estimates is given by

(4.6) s.e.(3'") = [(3.1)-él (g), _ pmi/z

. B .. .. .. ..
where 3.* = B—le 3: for 3'" being any one Of the components Of a or a.
81 :- ~

The Computer Programs

Three main tasks in this study required the use Of a computer program.
These were: Generating data sets from population with known parameter and
distribution; Monte Carlo simulations; and bootstrapping. Independent computer
programs were coded for each task using SAS/IML package.

SAS/IML available in the MSU IBM 3090 VF mainframe computer system is
a double precision and multilevel, interactive programming language. SAS/IML
software is both ﬂexible and powerful since it combines the advantages Of high—level
and low—level languages (SAS/IML User’s Guide, 1985, p. xi).

Though SAS provides a procedure which computes the MINQUE that
corresponds to the rather uninformative prior by using zero weights as an Option to
PROC VARCOMP, this procedure does not handle models that involve covariates.
The independent variables handled by the procedure PROC VARCOMP are limited

to main effects, interaction and nested effects; but no covariate effects are allowed in

47

the PROC VARCOMP Statement (SAS User’s Guide: Statistics, 1985, p. 819).
However, the present study is not limited to models which do not involve
covariates. Consequently, the more ﬂerdble software SAS/IML was utilized in the
study, not only to estimate parameters of the model but also tO generate data.

As indicated earlier in this chapter, the ﬁrst computer program generates

sample Observations, Y and covariate X, and passes them over to the program

that implements the bootstrap algorithm. The bootstrap estimates at each
replication are written to a standard SAS ﬁle for further analysis. The Monte Carlo
simulation computer program is implemented like the bootstrap program except
that while the bootstrap samples data from a sample generated from the pOpulation,
the Monte Carlo simulation program samples data directly from the population.
The bootstrap SAS/IML code used in this study is thus ﬂexible and available to be
used to estimate parameters of a model using data obtained from real world
research.

Applicability of the bootstrap method using the present SAS/IML code is
demonstrated in the present study. The computer code and method are applied on
actual ﬁeld research data to estimate the parmeters of the model, the sampling
distribution Of the statistics and to set bootstrap conﬁdence intervals about the
parameters Of teachers’ self—efﬁcacy prediction model. Estimation results for the
ﬁxed and random effects of the teachers’ self—eﬁcacy model are presented in
Chapter V of this dissertation.

CHAPTER V

APPLICATION OF BOOTSTRAP AND MINQUE:
HIGHER ORDER TEACHING
Intr ction

The bootstrap is a new method whose time has come with the advent Of
modern computers. Though its applicability in generating sampling distributions Of
statistics and in construction of conﬁdence intervals about parameters is highly
promising, the method has not been widely used in educational and social science
research. Strengths Of the method are Often demonstrated in situations where
parametric modeling is difﬁcult and/or normal assumptions are not possible. These
situations are not uncommon in educational and social science research.

The interest Of the present study was to demonstrate the Operation Of the
bootstrap in a two—level hierarchical linear model. The focus Of the study was upon
the estimation Of the group and individual level variances and ﬁxed eﬁects
parameters Of the mixed model. A highly promising approach Offered by the
method in this study was that Of estimating the sampling distribution of the
statistics and thereby setting conﬁdence intervals about the parameters. The study
used computer-simulated data tO extensively assess the distributional behavior Of
parameter estimates under varying distributional assumptions Of the errors and sets
of random effects parameters.

In this chapter, applicability Of the bootstrap algorithm on data originating
from a real research situation is demonstrated. The method is applied on the actual
ﬁeld research data to estimate the parameters of the model, the sampling

distribution of the estimators and to set the bootstrap conﬁdence intervals about

48

 

 

 

 

49

the parameters. Data used in this demonstration Of the applicability Of the
bootstrap method was part Of the data gathered earlier to investigate the contextual
eﬁects on the self-efﬁcacy Of high school teachers.

Dﬁeription Of data and variables

The data was Obtained through a survey Of teachers in sixteen schools who
taught Mathematics, Science, English, or Social Science. Each teacher was assigned
to teach one or more classes in the school. Though the individual teacher was
viewed as the basic unit Of analysis, each teacher provided information on several
classes. As a result, we view the teachers as the "macro" units Of analysis with the
classes they taught as the "micro" units Of analysis. The teacher effects therefore,
constitute the random factor of the model.

The dependent variable in the study was teachers’ perception Of self—efﬁcacy
which was measured at the class level. A measure of teachers’ self—efﬁcacy
represents a person’s perceived expectancy Of enacting a desired level or type of
performance through personal effort (Bandura, 1986). For instance, a teacher who
possess a high level Of self-efﬁcacy will be of the view that, no matter the nature Of
students or facilities he or she is provided with, he or she will produce an excellent
level Of performance. On the other hand, a teacher with low self—efﬁcacy will feel
paralyzed if he or she is given "poor" children. The phenomenon has been identiﬁed
to have an effect on both students’ and teachers’ performance (Fuller, et.al., 1982).

In the present study, the extent to which teachers’ self—efﬁcacy is inﬂuenced
by institutional, classroom and individual teacher characteristics is examined.
Academic subject taught (Mathematics, Science, English, or Social Science)
represented the primary ﬁxed factor Of the model used to predict teachers’
self—efﬁcacy. Other independent variables Of the model which were viewed as

covariates fell into two categories, namely, between— and within—teacher variables.

50

The between teacher variables included: STAFCOOP, COOperation Of staff;
TCONTROL, Teacher control; and PLEADER, Principal leadership. The within
teacher (or classroom) level independent variables included: STUDACH, class
average student achievement level; LVLPREP, class level Of preparation; and
SIZE, class size.

Selection of valid data for the variables of interest resulted in a sample Of 244
teachers who provided information on 1634 classes taught. Breakdown of the
number of teachers and number Of classes by academic subject areas were as follows:
Mathematics had 63 teachers with 370 classrooms; Science had 59 teachers with 391
classrooms; (English had 69 teachers with 509 classrooms; and Social Science had 53
teachers with 364 classrooms. The average number Of classes for which each teacher

provided information was about 6.6.

The Model Stetements
We begin by posing a within-teacher model that deﬁnes a "micro" equation

with EFFICACY as the response variable and LVLPREP, SIZE and STUDACH as

"micro" regressors which are identical for each teacher j as,

(5.1) (EFFICACY)JJ = ﬂOJ +1131 (SUBJECT) hJ + ﬂJJ(LULPREPJ) j

+ ,6,J(SIZEJ)J- + 33J(STUDACHJ)J + cJJ-
where j= 1,...,J teachers and i= 1,...,nJ classes for each teacher j. Since 1%,
ﬁJJ, ﬁ,J, and BJJ are deﬁned for each teacher, we can pose the between—teacher
model using these coefﬁcients as responses similar to Equations 2.2 and 2.3 in
Chapter H Of this dissertation. Speciﬁcally, we consider the intercept ﬁOJ to be
random and dependent on the between—teacher independent variables such that the

associated "macro" model is given by

51

(5.2) ﬁoJ- = 700 + 70, (STAFCOOP)J + 70,(TCONTROL)J +
70, (PLEADER)J- + eJJJ
where j = 1,2,...,J teachers. Combining Equation 5.1 and 5.2 yields,
(5.3) (EFFICACY)JJ = [70° + 70J(STAFCOOP)J- + 70,(TCONTROL)J-

+ 7,, (PLEADER)J + h}: (SUBJECT) hJ + ﬁJJ-(LVLPREPJ)J
+ ﬂ,J-(SIZEJ)J + 33J(STUDACHJ)J-] + [eoJ- + eJJ]
similar to Equation 2.7 in Chapter H. Model Equation 5.3 can be written in the
general linear matrix notation as in Equation 2.8 in Chapter II for teacher j with,

(5.4) Y1 = (EFFICACY)JJ-

4
+ 7,, (PLEADER)J + 1231 (SUBJECT) hJ + ﬂJJ(LVLPREPJ)J-
+ s,J(SIZEJ)J + ﬂ,J(STUDACHJ)J-
(5.6) ngj = eoj for bj = eoj and Zj = (1,...,1)’

(5.7) s

= eJJ. Equations 5.5 and 5.6 represent the ﬁxed and random eﬁects
Of the model respectively, while Equation 5.7 is the expression for the random errors
Of the model. The intent then is to estimate both the ﬁxed and random effects Of

the model on the measure Of teachers’ self—efﬁcacy.

Estimation Precedme
The ability Of the bootstrap to estimate parameters Of the model given in

Equation 5.3 was demonstrated through the used of MINQUE. For each parameter,
the usual MINQUE estimates were provided based on the original sample. The
bootstrap estimates based on B = 1000 repeated resampling with replacement were

also obtained. Due to the bootstrap’s ability to generate sampling distributions

 

 

52

through resampling, 95% bootstrap conﬁdence intervals about each Of the
parameters were also provided.

Estimates were provided for a total Of 14 parameters Of the model. There
were four effect levels of the factor, SUBJECT, denoted by aJ, 01,, 01,, and at,J
corresponding to Mathematics, Science, English and Social Science respectively.
Parameters for other ﬁxed factors (or covariates) were denoted by (3,,ﬁ,,3,)
correSponding tO the within teacher (or class—room) effects (LVLPREP, SIZE,
STUDACH) and (7J,7,,7,) corresponding tO the effects between teacher
(LVLPREP, SIZE, STUDACH, STAFCOOP, TCONTROL, and PLEADER).
Besides SUBJECT, all other ﬁxed factor were viewed as covariates in the model.
The inter—teacher variance Of the model was denoted by r2 while 0% denoted the
variance Of the random errors (or intra—teacher variance). In addition, the
intra—teacher correlation denoted by p, and computed as in Equation 4.3 in
Chapter IV and the constant common to all Observations denoted by 700 were

estimated through both MINQUE and bootstrap.

Results Of Estimation
Table 5.1 presents the MIN QUE and bootstrap results for the estimation Of

the fourteen parameters of the teacher self-efﬁcacy prediction model. The results
provides the usual MINQUE estimate, the bootstrap estimate which is the average
over all B = 1000 bootstrap replications, the bootstrap standard error, and 95%
bootstrap conﬁdence intervals about each parameter. The bootstrap estimate Of
bias given by 3* — 3, where 3* is the average Of the estimator over the B
bootstrap replications and 3 is the usual estimator based on the original sample is
also represented. For purposes Of consistency with notation given by Efron (1979),
the bootstrap estimate of bias is denoted by BIAS.

F".

 

53

 

 

88... 8.... 88... 88...- 88... 8.....- 8.....- as 82.83.. .88....
88... 8.... 8...... «8.... 88... .8... 88.... as H.28.... sees...
88...- 88... 8.... 8...... 88... 8...... 2.8... e. 8.8880188
88...- 88... 88... 88... 8...... 88... 28... a... ”see.
.88....520... 2.33m $825.
88...- 88... 8...... 88... 8...... 88... 88... am can as...
8.....- .8... 88.... 88... 8...... 88.... 8...... .8. 8.893... .o .26.. as...
:8... 8...... 88... 8...... 88... 88... 28... .e seesaw .68
88... 3...... 88... 88... .88... 38... 88... .8 £88...
8.8... 8...... 88... 88... 88... 2.8... 88... as .883
88... 8.... 8.... 8...... 88... 88... 88... .e 8.8888...
.338 oases...
8.... 88... 8...... 88.. 88... 88.. 88.. 0.. 35.88
2...... 88... 88... 88... 8.8... :8... 88... a science 5.013....
:8... 88... 88... 88... 8...... :8... 88... «a 8§E> 983.78....
E... 88... 8.... 8...... I88... I82... 8...... as scans. 883912....
a... 5...? Ho... 4.0... .3 855. b.5052 8853..
...o 8.2.8.. :8 88."... 8:38..

 

 

.gﬁclauo 3:88.. .5483 o... .3 833..»
608-3.... was .335". .85. 803.... no on». we Scene 2: .0 8.2.58 H.052 v.3 mezzoom

H6 03...“.

54

From Table 5.1, it is shown that the average of the bootstrap estimates of
the parameters over B = 1000 replications did not differ much from the usual
MINQUE estimates. In addition, the bootstrap feature which was not available
through the MINQUE procedure was the estimation of the standard error of the
estimate. The statistic showed low bootstrap standard errors of the estimate for all
fourteen parameters of the model. The lowest value of the bootstrap estimate of the
standard error was observed for the estimates of the effect of class m and the
estimate of the intra—teacher variance, (3;. For these two parameter estimates, the
bootstrap estimate of bias was less than 0.002. Except for the estimator of the
constant, 700 whose bootstrap estimate of bias was 0.1124, the bootstrap estimate
of bias for all other statistic was no more than 0.04.

An accomplishment of the bootstrap method which is not readily available
through the usual MIN QUE was the construction of conﬁdence intervals about each
of the parameters of the model. The 95% bootstrap conﬁdence intervals were used
as a means of testing for the signiﬁcance of both the ﬁxed and random effects on the
teachers’ self—efﬁcacy. Based on the 95% conﬁdence intervals, the results showed
that all factors with the exception of W have statistically
signiﬁcant effect on teachers’ Self—Efﬁcacy.

The intra—class correlation denoted by p was signiﬁcantly diﬁerent from
zero, with 0.3204 and 0.3314 being the MINQUE and bootstrap estimates of p
respectively. Estimates of p through both methods indicated that, approximately
30% of the total variance in teachers’ gen—Efﬁcacy is between teachers. The
MIN QUE and bootstrap estimate of the inter—teacher variance denoted by 1" was
0.1320 and 0.1397 respectively.

In some problems of practical interest, we may wish to observe the behavior
of the statistic used to estimate a parameter. This requires knowledge of the

sampling distribution of the statistic, often based on the Gaussian theory. In

55

situations where this theory is not available, it is often difﬁcult to draw conclusions
about the sampling distribution of the statistic. In such situations, the bootstrap
oﬂers perhaps one of the most signiﬁcant contributions to statistics. The method’s
applicability to even complicated problems involving statistics which may not have
closed form expressions may be a major promise of the bootstrap. But the
bootstrap can be applied to simple problems as well.

In the present study, the bootstrap method was used to generate the
sampling distributions of the statistics used to estimate the parameters of the
teachers self-efﬁcacy prediction model. The distributions were based on 1000
bootstrap replications.

Figure 5.1 presents three percentage polygons of the estimators of the
inter—teacher variance, 7’, the intra—teacher variance, 0:, and the intra—teacher

correlation, p, based on B = 1000 bootstrap replications. Though the distribution

. 2
of 1* appeared to be slightly positively skewed, the distributions of all the three

estimators are fairly symmetric with very low dispersion.

Figure 5.2 presents four percentage polygons of the estimators of the effects
of Mathematics, al, Science, a,, English, as, and Social Science, a:4 on teachers’
self—efﬁcacy based on B = 1000 bootstrap replications. All four charts represent
the empirical bootstrap distribution of the estimator of SQBJ EQT effects on the
teachers’ perception of self—efﬁggy appears to be fairly symmetric with moderate
variability. However, the estimates of the effect of Mathematics, Social Science,
and Science seem to be slightly negatively skewed. But most importantly, all the
charts show that the replicated estimates are centered extremely close to the
MINQUE estimates of the effects of Mathematics, Science, English, and Social

Science.

56

 

  
   

 

 
 

Figure 5.1
° te of the inter and intra-teacher
mgﬁmﬂégmtggmgfof‘mm' self-efﬁcacy prediction
(B = 1000 replications).
14 DUO!“
12 1-
10 '-
8 +-
6 l-
4 .
2 u-
o A“ . 1

 

 

0.108 0.118 0.128 0. 138 0.148 0.158 0.38 0.178

Inter—Teacher variance (3)

 

 
  

 

 

1 1 l 1

0.234 0.243 0252 0.201 0.27 0.270 0.288 0.207 0300 0.315 0.324

IRiki‘s-slur:variance (a?)

10

 

 

 

 

 

0.250 0.272 0.288 0.304 0.32 0.33 03& 0.38 0.384 0.4

Ina-ruse correlation (in)

57

Figure 5.2

Percentage polygons for the bootstrap estimate of the eﬁects of Mathematics,
Science, English, and Social Science on the teachers’ self-eﬁcacy
(B = 1000 replications).

 

 

 

N,

A!

0424 0.4“ 0.472 0.4“ 052 0.5“ 0.” cm

eﬁects of Mathematics, a1 ‘

//\
//
/. . .. -

r
(14 0424 one 0472 0.4“ 081 0.6“ can

A A

 

om

 

 

 

 

eﬁects of Science, a,

 

 

/ L r 1 1 1 1 A
031 0.39 0.“ 043 0.45 047 0.40 0.81 06:! on

“aim“:

 

 

 

 

r- .
.'
d ‘

0.2. 030‘ 032' 0.382 0.370 0.4 0.424 0.44.

m d 30611 SW, 04

 

 

CHAPTER VI

SIMULATIONS AND BOOTSTRAP RESULTS
Overview

The purpose of the study was to demonstrate the use of the bootstrap in
providing estimates of the parameters of a general two level mixed hierarchical
linear model, determining the standard error of the estimates and their empirical
bootstrap distributions. The objective was to observe the behavior of the bootstrap
and MINQUE estimates of the ﬁxed effects and the variance components of the
model under several conditions including situations where the normal distributional
assumptions may be violated. The study examined the influence of the magnitude
of the population intraclass correlation and the tail size of the distribution on the
estimation of the parameters of the mixed model. Double exponential (or LaPlace)
represented a distribution with fairly long and thick tails.

MIN QUE and bootstrap abilities to estimate parameters of the mixed HLM
model were demonstrated by estimating the parameters from a large number of
independent samples generated from pOpulations of known distributions and
parameter values. Applying the estimation procedures on sets of data generated
from a pOpulation of known parameters provided a means of evaluating the relative
effectiveness of the methods of estimation. The independent samples consisted of 50
groups with each group containing 25 to 45 observations.

Estimation of parameters was studied for two underlying population
distributions, namely the normal and double exponential (or Laplace), and three
levels of the intraclass correlation. These two design factors provided for a total of

six design factor combinations (or cells). A total of 400 trials (based on independent

58

59

samples) were performed for each design factor combination. As a result, 2400
Monte Carlo simulation trials, each based on a different data set, were performed
for the study. MINQUE and bootstrap point estimates, the 95% and 90% bootstrap
conﬁdence intervals, empirical bootstrap distribution and standard errors were
provided for each trial. The MINQUE and bootstrap summary results are presented

in the remaining part of this chapter.

Results of Estimation Procedures

Simulated data represented observations from two pOpulation distributions of
random errors and sets of random effects characterized by three levels of the
intraclass correlation. The mixed model contained seven parameters of which three
were random effects parameters and four ﬁxed. The random effects parameters were
the within and between group variances denoted by a: and 72 respectively and
the intraclass correlation denoted by p. The ﬁxed effects parameters included 0:1,
0:2 and a:3 for levels of the ﬁxed factor and )9, the coefﬁcient of the covariate. The
MINQUE and bootstrap estimates were obtained for each of the seven parameters of
the model. While one MIN QUE estimate was obtained at each trial, (based on the
original sample) the bootstrap estimate at each trial was the average over 200
bootstrap replicated values. Thus, the average of the bootstrap estimate over 400
trials is the average of the 400 averages each computed from 200 bootstrap
replications. Ten functions of the MINQUE and/or bootstrap estimates were
computed for the six non—redundant estimates for both models under normal and
double exponential error terms and sets of random effects at each trial. The average
and standard deviation of the estimates of these functions over 400 trials for each
design factor combination (cells denoted by a through f of Table 4.1 in Chapter
IV) are presented in Tables 6.1 through 6.6.

60

Ten functions of the estimates consisted of: MIN QUE and bootstrap
estimates, the bootstrap estimate of bias denoted by BIAS, the MINQUE bias and
bootstrap estimate of bias denoted by D1 and D, respectively, the MIN QUE
ratio, R, and its correSponding bootstrap ratio denoted by R1, the MINQUE and
bootstrap mean square error denoted by MSEl and MSE2 respectively, and the
bootstrap] MIN QUE measure of relative efﬁciency.

Table 6.1 presents the average and standard deviation of ten functions of :r’
and/or "r: over the 400 trials under the normal and double exponential error terms
and sets of random effects for three levels of the intraclass correlation. From Table
6.1 it is shown that the bootstrap overestimated 13 with a bias of 0.3432 and
0.3765 under normal and double exponential respectively for p = 0.01. For this low
value of the intraclass correlation, the bootstrap estimate of bias, denoted by BIAS
and given by fr} — "r“ was also high and positive indicating that the bootstrap
method on average overestimated the value of 1” under both normal and double
exponential error terms and sets of random effects of the model. At this level of the
intraclass correlation (p = 0.01), though MIN QUE on average also overestimated
1’, its bias was relatively low, at 0.0292 under normal and 0.0597 under the double
exponential distribution. The ratio, R2 expected to be 1.00 was observed at
1.0292 while its bootstrap estimate, R1 was 0.7811 under the normal distribution.
The same ratios were 1.0597 and 2.3032 respectively under the double exponential.
At this level of the intraclass correlation condition, the bootstrap estimate seemed
to be more efﬁcient both under the normal and double exponential. From these
results, it is apparent that MINQUE and bootstrap estimates were fairly close both

under the normal and double exponential distributions.

61

Table 6.1

Average and standard deviation of the functions of the estimates
1’ and/or 73 under the normal and double exponential
error and sets of random eﬁ’ects for p = 0.01, 0.05, and 0.20.

 

 

 

Nor-a1 Double Exponential

Value of p Estimate Par.Value Average S .0. Average S .D .

0.01 Bootstrap, 73 1 .00 1 .3432 0. 7495 1.3765 0. 7298
MINQUE, r’ 1.00 1 .0292 0 .9010 1 .0597 0 .8879
0115:1342 0.3140 0.2113 0.3168 0.2207
0513—12 0.0292 0.9010 0.0597 0.8879
0,434 0.3432 0.7495 0.3765 0.7298
111:;2/7’ 0.7811 11.0436 2.3032 14.4144
R2=;"/1'2 1.0292 0 .9010 1.0597 0 .8879
ISE1=(:r’—1-’)’ 0.8105 1.2806 0.7899 1.2086
l8E2=(1'3—1")’ 0 .6781 1 .3690 0.6729 1 .2438
Rel. Efﬁciencyo 0.8366 0.8519

 

 

 

 

62

Table 6.1 (continued)

 

 

 

 

 

 

 

Nor-a1 Double Exponential

Value of p Estimate Par.Value Average 5 .0. Average S .D.

0.05 Bootstrap, 7'} 5.26 5.2900 1.6634 5.5309 2.1743
MINQUE, 7" 5.26 5.1755 1.6621 5.3996 2.1786
311358? 0.1144 0.1201 0.1313 0.1316
n,=;’—13 —0.0845 1.6621 0.1396 2.1786
0,=}:—r’ 0.0299 1.6634 0.2709 2.1743
3,91%? 1.0261 0 .0314 1.0354 0.0972
R2=;"/1’2 0 .9839 0 .3160 1.0265 0.4142
Iss1=(:r’—r’)’ 2.7627 3.5337 4.7539 8.4574
Iss2=(r:—r’)’ 2.7610 3.6216 4.7891 8.7037
Rel EfﬁciencyQ 0.9994 1.0074

0.20 Bootstrap,72 25.00 24.9553 5.8896 25.5167 8.8370
MINQUE, }’ 25.00 24.8520 5.8766 25 .4018 8.8398
BIAS=72—;’ 0.1032 0.2050 0.1149 0.2077
0,=}’—1’ —0.1480 5.8766 0.4018 8.8398
02:72—72 —0.0447 5.8896 0.5167 8.8370
111:;2/7’ 1 .0043 0 .0085 1 .0052 0.0089
It,=;’/r’ 0.9941 0.2351 1.0161 0.3536
lssiz(:r’—r’)’ 34.4700 49.1902 78.1073 153.6137
Iss2=(rz—r’)’ 34.6030 49.6024 78.1651 154.7723
Rel. Efﬁciencyﬂ 1.0039 1.0007

 

 

 

 

o Rel. Efﬁciency = MSE2/MSE1

63

At the second level of the intraclass correlation (p = 0.05), the average
values of 7" and 72 were 5.1755 and 5.2900 in the normal case compared to the
true parameter value set at 5.26. At this level of the intraclass correlation, both the
bootstrap and MINQUE estimates were very close to the parameter 1" with 0.0299
and —0.0845 as their respective biases under normality. The estimates were slightly
off under double exponential with 7’ = 5.3996 with a bias of 0.1396 and
£2 = 5.5309 with a bias of 0.2709.

Under this condition of the pOpulation intraclass correlation however, the
strength of the bootstrap was demonstrated in the estimates R, and R2. The
average value of R, was 1.0261 under the normal and 1.0354 under the double
exponential compared to the average values of R2 which was observed at 0.9839
under the normal and 1.0265 under the double exponential.

Perhaps the most successful estimation of ‘r’ was attained in the situation
where the pOpulation intraclass correlation was 0.20, particularly under the normal
distribution. Compared to the true parameter value of r3 = 25, 73 was observed
at 24.9553 and 1” at 24.8520 under the normal. The average values of 7’ and 73
were 25.4018 and 25.5167 respectively under the double exponential distribution.
Based on the bias of these estimators, the results shows that the bootstrap with a
bias of -0.0447 and the MINQUE with a bias of -0.1480 were very close under the
normal distribution. The bootstrap estimate of bias was also observed at 0.1032
under the normal. The bias for the MIN QUE and the bootstrap estimates were
observed at 0.4018 and 0.5167 respectively under the double exponential, with the
bootstrap estimate of bias equal to 0.1149.

64

It may be important to note that, in both the normal and double exponential
distributions, the average values of R, and R2 were quite close to 1.00. In
particular, the ratio R, was surprisingly close to 1.00 indicating a very successful
bootstrap estimation process. The bootstrap replicated values of R, were not only
centered near 1.00 but also were less variable under both the normal and double
exponential. The measure of relative efﬁciency for the two estimators was also
extremely close to 1.00 both under the normal and double exponential.

Figure 6.1 displays the percentage polygons of the 400 bootstrap and
MINQUE estimates of 7’ under the normal and double exponential errors and sets
of random effects at each of the three levels of the pOpulation intraclass correlation.
At p = 0.01, both MINQUE and the bootstrap estimates at each trial were centered
near the true parameter value set at 1.00 under both the normal and double
exponential. However, the percentage polygon for the bootstrap was positively
skewed while that of the MINQUE was nearly symmetrical under both the normal
and double exponential distributions. This is mainly due to the fact that the
bootstrap was protected from giving negative estimates of r’ while MIN QUE was
not.

From Figure 6.1 it is apparent that a greater mass of observations were
around 1.00 for the bootstrap frequent polygon than for the MINQUE polygon. It
can therefore be argued that, at this level of the pOpulation intraclass correlation,
the bootstrap seemed to be a good complement to the MINQUE estimator of 7’.

Percentage polygons for the 400 MINQUE and bootstrap estimates under the
normal and double exponential distributions at p = 0.05 shows that, both
MINQUE and the bootstrap were free of giving negative estimates and both were

65

Figure 6.1

Percenuse polygons for the MIN QUE and bootstrap estimate of 13 over 400
trials under the normal and double exponential errors and sets of random effects for

p = 0.01, 0.05, and 0.20.

 

 

 

.-

'2'15'1‘050031‘32253159‘55

van-or}? and P

— :oomrao -‘- mums

 

 

 

'2

p a 0.01

 

4.5

 

 

44350051132153334455

Valneof r3 and }’

- booms-o + name

 

 

 

 

 

l
6
4
2
O

 

 

gt...
vuuotr‘mtr’

- muse dis-um:

1011Q13 o

p = 0.05

 

 

 

 

 

A A

50320253035404.”

Valued 1.1 and ;-'

— consume -0- name

sssossro

9.0.20

 

A A

15202530354046ﬂuuﬂm

venous? anti;1

—eoomno +4»th

 

66

centered near the true parameter value of 1" which was set at 5.26. However, for
both MINQUE and the bootstrap, the estimates were more variable under the
double exponential than under the normal distribution.

The difference in the variability of both the MINQUE and bootstrap
estimators between the normal and double exponential were more apparent at
p = 0.20 (see Figure 6.1). The percentage polygons for both estimators showed
more variability under the double exponential than under the normal. Estimation
results at the three levels of the pOpulation intraclass correlation show that, though
the bootstrap seems to be a more stable estimator of 72, particularly at the low
level of the intraclass correlation, the characteristic of the tails of the distribution
seem to be equally affecting the bootstrap and MIN QUE in estimating 7’.

Table 6.2 presents the average and stande deviation of the ten estimable
functions of 6: and/or :72 over 400 trials under the normal and double
exponential error terms and sets of random effects for three levels of the population
intraclass correlation. From Table 6.2 it is shown that the bootstrap slightly
underestimated a: under both the normal and double exponential distributions for
p = 0.01. MIN QUE slightly underestimated a: under normality but slightly
overestimated a: under the double exponential for p = 0.01.

At this level of the pOpulation intraclass correlation, the bias for MIN QUE
was -0.0680 under the normal and 0.0698 under the double exponential. The bias
for the bootstrap estimate was —0.2025 under the normal and —0.0730 under the
double exponential. The bootstrap estimate of bias was observed at -0.1344 under
the normal and —0.1428 under the double exponential. Average results for R, and
R2 demonstrated a very successful estimation process at this level of the population

intraclass correlation. R, was observed at 0.9987 for the normal and at 0.9986 for

67

Table 6.2

Average and ,standard deviation of the functions of the estimates
a2 and/ or a} under the normal and double exponential errors
and sets of random effects for p = 0.01, 0.05, and 0.20.

 

 

 

 

 

 

Normal Double Exponential

Value of p Estimate Par.Value Average S .0. Average S .D .

0.01 Bootstrap62 100.00 99.7975 3.6352 99.9270 5.9080
MINQUE, 3’ 100.00 99.9320 3.6223 100.0698 5.9100
silsziz—E’ —0.1344 0.2331 —0.1428 0.3939
0,:a’—a’ —0.0680 3.6223 0.0698 5.9100
0,863.72 —0.2025 3.6352 —0.0730 5.9080
2562/33 0.9987 0.0023 0.9986 0.0039
It,=2r’/a3 0.9993 0.0362 1.0007 0.0591
Iss1=(§3—a’)’ 13.0927 17.0420 34.8462 50.6894
lssz=(az—a’) ’ 13.2225 17.1911 34.8222 51.0716
Rel. EfﬁciencyQ 1.0099 0.9993

 

68

Table 6.2 (continued)

 

 

 

 

 

 

 

 

 

 

Normal Double Exponential
Value ofp Estimate Par.Value Average S.D. Average S.D.
0.05 Bootstrapp} 100.00 99.6360 3.4799 99.9349 5.8466
MINQUE, a’ 100.00 99.7693 3.4733 100.0755 5.8629
3115:3242 -0.1333 0.2398 -0.1406 0.3942
0,:a’—a’ —0.2307 3.4733 0.0755 5.8629
0,:oz—a’ —0.3640 3.4799 —0.0651 5.8466
[503/62 0.9987 0.0024 0.9987 0.0039
1.9/52 0.9977 0.0347 1.0008 0.0586
ISEl=(§’—a’)’ 12.0867 17.0071 34.3938 49.9322
lSE2=(aZ—e’)’ 12.2122 17.2576 34.1013 50.2230
Rel. EfﬁciencyQ 1.0104 0.9915
0.20 Bootstrap,03 100.00 100.0437 3.7669 99.7328 5.7180
MINQUE, 3’ 100.00 100.1660 3.7717 99.8545 5.7151
siAS=}z—3’ —0.1223 0.2361 —0.1216 0.3878
D,=0’2—02 0.1660 3.7718 —0.1455 5.7151
0,=2-2—a’ 0.0437 3.7669 —0.2672 5.7180
8563/63 0.9988 0.0024 0.9988 0.0039
' s,=}’/a’ 1.0017 0.0377 0.9985 0.0572
Issl=(§3—a’)’ 14.2177 19.9419 32.6019 44.6796
131:2:(63—6’)’ 14.1557 19.9124 32.6852 45.4206
Rel. EfﬁciencyO 0.9956 1.0026

 

o Rel. Efﬁciency = MSE2/MSE1

 

 

...

8 . 11+. . 111.1.
..,. .L . 7.2.1... ; . . ..$L. «wt-M . . .

  

. .. . oh... 44.4...
.17... IA: ......

 

ll

 

69

the double exponential while R, was observed at 0.9993 under the normal and at
1.0007 under double exponential.

Similarly, surprisingly accurate results were observed at the 0.05 level of the
intraclass correlation. At this level, the average of bootstrap estimate of a: was
observed at 99.6360 with a bias of —0.3640 under the normal and at 99.9349 with a
bias of -0.00651 under double exponential. The average of the MINQUE estimate
of a: was observed at 99.7693 with a bias of -0.2307 under the normal and at
100.0755 with a bias of 0.0755 under double exponential.

Compared to the expected ratio of the estimates at 1.00, both MINQUE and
the bootstrap very closely estimated the ratio with R, = 0.9987 both under the
normal and double exponential. R, was observed at 0.9977 and 1.0008 under the
normal and double exponential respectively. At the two levels of the intraclass
correlation condition (p = 0.01 and p = 0.05), the bootstrap and MINQUE were
very close both under the normal and double exponential distribution.

At the 0.20 level of the intraclass correlation, both MINQUE and the
bootstrap slightly overestimated a: under the normal and slightly underestimated
a: under double exponential. The bootstrap average was closer to the true value of
the parameter than the MINQUE with a bias of 0.0437 under the normal
distribution. On the other hand, the MIN QUE was closer to the parameter than
the bootstrap with a bias of —0.1455 under the double exponential distribution.
Average values of R, and R, were very close to their expected value (1.00) at this
level of the intraclass correlation under both the normal and double exponential
distributions.

On average therefore, results in Table 6.2 shows that both MIN QUE and the
bootstrap very closely estimated the parameter a: under both the normal and

70

double exponential errors and sets of random effects at all three levels of the
intraclass correlation. However, at all three levels of the intraclass correlation, the
standard deviation of the functions of the estimates was relatively higher under
double exponential that under the normal distribution. Regardless of the underlying
distribution of the errors and sets of random effects of the model, the estimation of
the ratio of the estimators, R, and R, was quite close to 1.00 at all levels of the
intraclass correlation.

Figure 6.2 displays the percentage polygons of the 400 bootstrap and
MINQUE estimates of a: under the normal and double exponential errors and sets
of random eﬁects at each of the three levels of the pOpulation intraclass correlation.
From Figure 6.2 we see that, at all levels of the population intraclass correlation,
the bootstrap estimator followed the MIN QUE quite closely. Percentage polygons
for both estimators were centered near the true parameter value set at 100.
However, differences in variation of the estimates by distribution was quite obvious.
The spread of both MIN QUE and bootstrap frequent polygons was clearly higher
under the double exponential than under the normal distribution. For the
estimation of 0:, therefore, it can be argued that while both MIN QUE and the
bootstrap fairly closely estimated 0:, their efficiency was severely affected by the
nature and size of the tails of the distribution of the errors and sets of random
effects. Both estimators were less efﬁcient under a distribution with long and thick
tails (like that of the double exponential) than under a distribution with short and
lighter tails.

The intraclass correlation is given as a function of 7’ and 0?, whose

formula is shown in Equation 4.3. It is the index which measures the degree of

 

71

Figure 6.2

Percentage polygons for the MIN QUE and bootstrap estimate of a: over 40
tdﬂsundathenomaluddoublecxponentiderronandsetsofrandomeffecufcr
p = 0.01, 0.05, and 0.20.

 

 

Normal Double Emu,“

0m

 

 

 

 

 

 

I: 07 92 37 '03 ",‘7
VII” Of 0’, 9 '

— mutton -¢- mums

 

 

 

 

 

 

 

 

 

NW N MO!!!

12 ,2

‘0 10 >

0

6

d

2

o l A
02 87 92 97 112 r

 

 

u 07 92 07 .’ .
p-o.05 V“"°‘a.61‘

—eoe¢sme #2411401:

1“ 107 1! 117

'9 . ‘07
Value of a”, 01°

‘Wﬂ *W

 

 

 

 

 

 

 

I! or 02 97 111., .167 m ' I! a, 02 w
Wu“ 0.61 722020 Value are ,7

—Wo +um

 

 

 

a. 707 m m
2 .

72

dependence among observations sharing a context as well as providing the
proportion of the total variation in the response values that is between contexts
(Randenbush and Bryk, 1988). Success of estimation of model parameters often
depends on this measure, with less success when p is quite small. Due to this
feature, the population intraclass correlation was used as an important design factor
in the present study.

The MINQUE estimator p of p is obtained by substituting :r’ for r2 and
:1: for 0,3, in Equation 4.3. Likewise, the bootstrap estimator p* is obtained by
substituting 73 and 6:... for 7’ and a: respectively in Equation 4.3.

Bootstrap and Monte Carlo results for the estimation of ten estimable
functions of p and/or p... under the normal and double exponential errors and sets
of random effects of the model for the three levels of the pOpulation intraclass
correlation are presented in Table 6.3. Summary statistics in Table 6.3 show that
both MIN QUE and the bootstrap slightly overestimated p under both the normal
and double exponential distributions at the 0.01 level of the intraclass correlation.
The bias for the MIN QUE and bootstrap were correspondingly 0.00013 and 0.0032
under the normal and 0.0005 and 0.0035 under double exponential. The bootstrap
estimate of bias was 0.0031 under the normal and 0.003 under double exponential.
The results show that though R, was poorly estimated at p = 0.01, estimation of
the other nine functions of p and/or p... near their expected value at this level of
the population intraclass correlation. Mean square errors for both MINQUE and
the bootstrap were surprisingly close to zero, both under the normal and double
exponential distributions for p = 0.01.

At the 0.05 level of the intraclass correlation, all the ten estimable functions

of p and/ or ,3... were very close to their expected values estimated. The average

73

Table 6.3

,Average and standard deviation of the functions of the estimates
p and] or p... under the normal and double exponential errors and

sets of random effects for p = 0.01, 0.05, and 0.20.

 

 

 

Normal Double Exponential

Value of p Estimate Par.Value Average S .D. Average S .D.

0.01 Bootstrap, p... 0.01 0.0132 0.0072 0.0134 0.0070
MINQUE, ,3 0.01 0.0101 0.0088 0.0105 0.0087
silszprp 0.0031 0.0021 0.0030 0.0022
D,=p—p 0.00013 0.0088 0.0005 0.0088
0,=}.—p 0.0032 0.0072 0.0035 0.0070
R,=;*/p 1.1407 3.4598 1.3958 3.3220
pr/p 1.0132 0.8798 1.0460 0.8672
lSl:1=(,E,—p)2 0.0001 0.0001 0.0001 0.0001
[532: (gr—p)? 0.0001 0.0001 0.0001 0.0001
Rel. E ciency© 1.0000 1.0000

 

 

 

 

74

 

 

 

 

 

 

 

 

Table 6.3 (continued)
Normal Double Exponential

Value of p Estimate Par.Value Average S .D . Average S .D.

0.05 Bootstrap, L... 0.05 0.0501 0.0151 0.0521 0.0193
MINQUE, }; 0.05 0.0491 0.0151 0.0509 0.0193
rhea—i; 0.0010 0.0011 0.0012 0.0012
D,=p—p —0.0009 0.0151 0.0009 0.0193
n,=p.—p 0.0001 0.0151 0.0021 0.0193
11,414}; 1.0237 0.0306 1.0306 0.0962
lt,=p/p 0.9829 0.3023 1.0182 0.3866
ISE1=(p—p)’ 0.0002 0.0003 0.0004 0.0006
Issz=(p,.,_p)2 0.0002 0.0003 0.0004 0.0006
Rel. EfﬁciencyO 1.0000 1.0000

0.20 Bootstrap, )3... 0.20 0.1978 0.0381 0.2002 0.0534
MINQUE, } 0.20 0.1972 0.0381 0.1992 0.0534
shag...) 0.0006 0.0014 0.0009 0.0014
D,=p—p —0.0028 0.0381 —0.0008 0.0534
D,=p*—p —0.0022 0.0381 0.0002 0.0534
new}; 1.0033 0.0074 1.0051 0.0080
R,=p/p 0.9860 0.1906 0.9962 0.2671
151:1:(p—p) ’ 0.0015 0.0021 0.0028 0.0043
Iss2=(p.—p) ’ 0.0015 0.0021 0.0028 0.0044
Rel. Efﬁciencyo 1.0000 1.0000

 

 

 

0 Rel. Eﬁciency = MSE2/MSE1

75

values of p and ,3... were 0.0491 and 0.0501 respectively under the normal and
0.0509 and 0.0521 respectively under double exponential. Accordingly, the biases
for the bootstrap and MIN QUE were respectively 0.0001 and —0.0009 under the
normal and 0.0021 and 0.0009 respectively under double exponential. The bootstrap
estimate of bias was 0.0010 under the normal and 0.0012 under double exponential.
Under this condition of the intraclass correlation, R, and R, were fairly close to
1.00 with R, = 1.0237 and 1.0336 under the normal and double exponential
respectively and R, = 0.9829 and 1.0182 under normal and double exponential
respectively. The mean square error for both MIN QUE and bootstrap was quite low
under both normal and double exponential at this level of the intraclass correlation.

The bootstrap slightly underestimated p under the normal but very
accurately estimated p under double exponential at the 0.20 condition of the
intraclass correlation. On the other hand, on average MINQUE slightly
underestimated p both under the normal and double exponential at the 0.20
condition of the intraclass correlation. The MIN QUE and bootstrap biases were
observed at —0.0028 and -0.0022 respectively under the normal and —0.0008 and
0.0002 respectively under double exponential. The bootstrap estimate of bias was
0.0006 under the normal and 0.0009 under double exponential. R, was surprisingly
close to 1.00 under both the normal and double exponential but R, was slightly
less than 1.00 under both the normal and double exponential.

At this level of the intraclass correlation condition, both MIN QUE and
bootstrap mean square errors were quite low, both observed at 0.0015 under the
normal and 0.0028 under double exponential. Based on these results therefore, it is
apparent that, while estimation of functions of (72,73) and (63, 6’) may not have
been very successful, estimation of functions of (12*, p) which in turn depends on
73, 7", :73, and 6’ seemed to have been fairly successful for both MINQUE and the

76

bootstrap. However, at this conditions of the intraclass correlation, the bootstrap
ration R, was closer to 1.00 than R, under both normal and double exponential
errors and sets of random effects of the model. It may also be important to note
that, for the estimation of the parameter p, the bootstrap / MINQUE measure of
relative efﬁciency at all three levels of the intraclass correlation was extremely close
to 1.00 under both the normal and double exponential distribution.

Figure 6.3 shows the percentage polygons of the 400 bootstrap and MINQUE
estimates of p under the normal and double exponential distributions at each of the
three levels of the pOpulation intraclass correlation. At the 0.01 level of the
intraclass correlation condition, though both MINQUE and the bootstrap
percentage polygons were centered near the population parameter value of p, it is
evident that a greater mass of observations were around the parameters value under
the bootstrap frequent polygon than under the MINQUE polygon, for both normal
and double exponential distributions. Thus, once again, the bootstrap method has
been shown to be a more efﬁcient estimator of p than MIN QUE at the 0.01 level of
the intraclass correlation condition.

Percentage polygons for the 400 MINQUE and bootstrap estimates under the
normal and double exponential distributions at the 0.05 level of the intraclass
correlation condition show that MIN QUE and the bootstrap followed each other
very closely. However, the percentage polygons under this condition of the
intraclass correlation indicated that the values of both estimates were more variable
under the double exponential than under the normal. The percentage polygons
under the 0.20 level of the intraclass condition shows that the bootstrap and
MINQUE followed each other even more closely than at the 0.05 intraclass

 

 

77

Figure 6.3

Percentage polygons for the MINQUE and bootstrap estimate of p over 400

trialsunderthenormalanddouhle

p = 0.01, 0.05, and 0.20.

Nona].

exponentialerrorsandsetsofrandomeﬂectsfor

Double Exponential

 

 

*

estimate of p
‘- u- "" m

0.5“!“ 0 murutsmmwmcu

 

   

 

 

 

| gufutmcmututsammumm.
”30,01 ”main

-I- *m

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1‘ ,‘W
'2 '2?
10? 1°,
8* 0+
0? 0*
a» u-
2» 2.

c ‘ A ‘ A A ‘ ‘ ‘ ‘ o‘ . * - . L . .
° coma: cosmonaut» ‘1‘” “"1“ ‘1' °-" on an on 0 cm no: nos um cos one nor can cm m on at:
estimatedp p=o,05 estimateofp
—mno +uume _ m -0-u

W W
M ‘4
12r- t2!-
IOL l0?
.[ Ul‘
9 6r-
.r ‘L
2r- 2
o ‘ ‘ ‘ ‘ ‘ ‘ ‘ 0' ‘ ‘ a ‘
0 cos u: as as or: o: ass no nos as 0 can or our c2 oz: on can a
We estimate
°‘ F n-ozo 0‘ 9

- mun -t-umoue

-— may +IINQJE

78

correlation condition. Likewise, values of both estimates were more variable under
double exponential than under the normal distributions.

The estimation results at the three levels of the intraclass correlation
conditions indicate that the bootstrap is a more stable estimator of p, particularly
at the 0.01 level of the intraclass correlation condition. However, nature and size of
the tail of the distribution of the errors and sets of random effects equally inﬂuence
the bootstrap and MIN QUE in estimating p. Estimation tends to be less successful
under a distribution with long and thick tails (like that of the double exponential)
than under a less thick and short tailed distribution.

Fixed effects parameters of the model which included a,, a,, and a, for the
three levels of the ﬁxed factor and ,6, the coefﬁcient of the covariate was estimated
at the three levels of the intraclass correlation conditions under the normal and
double exponential errors and sets of random effects of the model. Due to the fact
that a, for j = 1,2,3 are linearly dependent, estimation is only required for any
two of (1,. For the purpose of the present dissertation, estimation results for a,,
a,, and ﬂ are presented for each of the six design factor combinations.

Table 6.4 presents the summary statistics over the 400 trials for the ten
estimable functions of 3., and/or 2?; under the three levels of the intraclass
correlation condition for the normal and double exponential distributions. Means
and standard deviations presented in Table 6.4 shows that both MINQUE and the
bootstrap very closely estimated the ﬁxed effect parameter a, at all the three
levels of the intraclass correlation condition under both normal and double

exponential distributions. At the 0.01 level of the intraclass correlation, the bias for
MINQUE and bootstrap were 0.0261 and 0.0264 respectively under the normal and

 

79

Table 6.4

Average and standard deviation of the functions of the estimates

(1, and/or a‘,‘ under the normal and double exponential

error and sets of random effects for p = 0.01, 0.05, and 0.20.

 

 

 

Normal Double Exponential

Value of p Estilate Par.Value Average 3 .D . Average S .D.

0.01 Bootstrap 21*; -5.00 —4.9736 0.9302 —5.0251 0.9073
MINQUE, a, —5.00 -4.9739 0.9264 —5.0243 0.9032
BiAS=&1-6, 0.0003 0.0587 —0.0008 0.0619
D,=Zz,—a, 0.0261 0.9264 —0.0243 0.9032
0,:61—07, 0.0264 0.9302 —0.0251 0.9073
115247;, 0.9999 0.0128 1.0000 0.0133
n,=&,/a, 0.9948 0.1853 1.0049 0.1806
18111=(}Sz,—az,)2 0.8568 1.3327 0.8143 1.1918
[SE2=(:Y',‘—a,)% 0.8638 1.3334 0.8217 1.2026
Rel. Efﬁciencyo 1.0082 1.0091

 

 

 

 

80

 

 

 

 

 

 

 

Table 6.4 (continued)
Nor-a1 Double Exponential

Value of p Estimate Par.Value Average S .D. Average S .D.

0.05 Bootstrap or; —5.00 —5.0139 1.0867 -5.0211 1.0484
MINQUE, a, -5.00 4.0143 1.0889 -5.0221 1.0438
mast—2., 0.0004 0.0606 0.0010 0.0632
n,=o,—a, —0.0143 1.0889 —0.0221 1.0438
D,=a’{—a, —0.0139 1.0867 —0.0211 1.0484
ugh/Zr, 1.0001 0.0128 0.9996 0.0138
r,=;,/a, 1.0029 0.2178 1.0044 0.2088
[SE=(;,—a,)2 1.1830 1.9324 1.0873 1.5678
lSE=(0r',‘—a,)’ 1.1781 1.9236 1.0968 1.5833
Rel. Efﬁciencyé 0.9959 1.0087

0.20 Bootstrap a: -5.00 —4.9529 1.6260 —5.0502 1.5745
MINQUE, a, —5.00 -4.9522 1.6186 —5.0488 1.5736
ri1s=;,.;, —0.0007 0.0613 —0.0014 0.0660
n,=o,—a, 0.0478 1.6186 -0.0488 1.5736
0,=o*;—o, 0.0471 1.6260 -0.0502 1.5745
R,=or‘,/a, 1.0000 0.0183 1.0001 0.0172
n,=a,/a, 0.9904 0.3237 1.0098 0.3147
ISE1=(a,—a,)’ 2.6156 3.9830 2.4724 3.2204
Iss2=(oq—a,)’ 2.6394 4.0097 2.4754 3.2304
Rel. EfﬁciencyG 1.0091 1.0012

 

 

 

 

81

—0.0243 and -0.0251 under double exponential. The bootstrap estimate of bias at
this level was 0.0003 under the normal and —0.0008 under double exponential. At
this level of the intraclass correlation, R, was observed at 0.9999 under the normal
and 1.000 under the double exponential compared to R, which was 0.9948 under
the normal and 1.0049 under double exponential. Relative efﬁciency for the two
estimators was extremely close to 1.00 both under normal and double exponential.

For p = 0.05, the average of bootstrap and MIN QUE estimates over the 400
trials were —5.0139 and -5.0221 under double exponential. Their respective biases
were —0.0139 and —0.0143 under the normal and —0.0211 and —0.0221 under double
exponential. Compared to the expected ratio of the estimates at 1.00, both
MINQUE and the bootstrap were very close in estimated the ratio with
R, = 1.0001 under the normal and R, = 0.9996 under double exponential while
R, = 1.0029 under normal and R, = 1.0044 under double exponential.

Estimation of a, at the 0.20 level of the intraclass correlation was equally
successful with R, being closer to 1.00 than R, under both normal and double
exponential distributions. The bootstrap estimate of bias was lower under the
normal than under double exponential. However, the bootstrap and MINQUE
biases differed by no more than 0.007 under normal or double exponential
distributions, and the measure of efﬁciency was extremely close to 1.00 under the
normal and double exponential.

The effect of the intraclass correlation condition on the estimation of the
functions of 62, and/or Zr; was apparent in the MIN QUE and bootstrap mean
square errors. Both mean square errors tended to increase with increasing p under
both normal and double exponential distributions. For instance, the bootstrap
mean square errors were 0.8638, 1.1781, and 2.6394 for p = 0.01, 0.05 and 0.20

82

respectively under the normal and 0.8217, 1.0968, and 2.4754 for p = 0.01, 0.05,
and 0.20 respectively under double exponential. The mean square errors for the
MINQUE estimates were quite close to those of the bootstrap at all levels of the
intraclass correlation under both normal and double exponential. Under both the
normal and double exponential, the bootstrap/MINQUE measure of relative
emciency was extremely close to 1.00 indicating that the bootstrap and MIN QUE
estimators very closely estimated the parameter a,.

In general, therefore, results in Table 6.4 show that, both MINQUE and
bootstrap very closely estimated the parameter a, at all levels of the intraclass
correlation conditions under both normal and double exponential distributions. The
ratio R, was consistently closer to 1.00 compared to R, at all the six design factor
combinations; indicating a great deal of promise through the bootstrap method.

Figure 6.4 presents six percentage polygons of the 400 bootstrap and
MINQUE estimates of a, under the normal and double exponential errors and sets
of random effects at each of the three levels of the pOpulation intraclass correlation
condition. From these charts it is clear that the bootstrap generally followed the
MINQUE very closely at all levels of the intraclass correlation. The ﬁgures also
showed no obvious diﬁerence in estimation between the normal and double
exponential distributions. However, the highest spread of the estimates for both
bootstrap and MINQUE were observed at the 0.20 level of the intraclass correlation
followed by 0.05 level. The spread of both estimates was lowest at 0.01. Percentage
polygons shown in Figure 6.4 therefore indicate that, though MINQUE and
bootstrap do not differ in estimating a,, both their ability to produce efﬁcient (less
variable) estimates depends on the level of the population intraclass correlation.

83

Figure 6.4
Percentage polygons for the MIN QUE and bootstrap estimate of a, over 40
trialsunderthenormaland donbleexponentialerrorsand sets ofrandomeﬁectsfor

 

 

 

 

 

 

 

 

 

 

 

 

 

     

 

 

 

 

 

 

 

 

 

  
    

 

   

 

 

 

 

 

p = 0.01, 0.05, and 0.20.
Normal
double exponential
omens mom
to u
12 12F
10[ ‘ IO?
5 8"
6L ah
“I" u-
:‘r
o, x 2+
-ns «0.: -e.s -ss #5 -ss ~55 ~45 ~32 -2.s -15 ~05 Yus no.5 res -ss 45 -ss -s.s «s -3.s -as -15 -os
estimate of a, p = 0,01 estimate of a,
—bootstno +vmoUE —oootstreo d—wms
08cm mom
to u
12" 12)
‘0? 101'
9" M-
OP 0p
4 F ‘ 1-
2 b 2 b
O ‘ A A ‘ - c A—A - - a . A . A
'113 416 '06 16 -?.5 '16 6.5 «5 <15 '15 -15 ~08 416 '05 '95 '15 4.5 '66 '55 «.5 '15 '1‘ 45 '05
-eootsusn -2-~n~oue —bootsqu *9.“
percent cam
16 14
12 12)
10? '01.
8" .1.
6" ob
‘ l' ‘ p
2" 2»
we «as -s.s -e.s -r.s -es as «.s as -2.s -ts -o.s 418 «as -o.s -es -r.s -es -ss «.6 -18 4.: -1s N
estimated a, p-010 . estimated at

_°°°""‘° *wucue —-sosums -o-muous

84

Both methods yield less efﬁcient estimates when the pOpulation intraclass
correlation is high. Their mean square errors increased at the same rate with
increasing intraclass correlation.

Despite differences in variability of the estimates at different levels of the
intraclass correlation, the percentage polygons indicate that the estimates were
centered nearly at the same point. Estimates were expected to be centered at the
true pOpulation parameters value which was set at -5.00. The results showed that
all the six percentage polygons were centered no more than 0.05 away from the true
parameter value.

Summary results for the bootstrap and MINQUE estimates of the parameter
0, based on the ten estimable functions of Ir, and/or of; over the 400 trials are
presented in Table 6.5. Summary results are presented for each of the three levels
of the intraclass correlation conditions under both the normal and double
exponential distribution of the errors and sets of random effects of the model. The
true population parameter, a, was set at 3.00. MIN QUE and bootstrap estimates
are compared against the true pOpulation parameter value.

Averages and standard deviations over 400 trials presented in Table 6.5
shows that, both MIN QUE and the bootstrap very closely estimated the parameter
a,. At the 0.01 level of the intraclass correlation, both MINQUE and bootstrap
slightly overestimated the parameter a, under the normal and slightly
underestimated a, under the double exponential errors and sets of random effects
of the model. The biases for MIN QUE and bootstrap were 0.0366 and 0.0362
respectively under the normal and —0.0123 and -0.0101 under double exponential.
The bootstrap estimate of bias was observed at —0.0004 under the normal and
0.0021 under double exponential. The ratio R, was surprisingly close to 1.00 under

85

Table 6.5

, Average and standard deviation of the functions of the estimates
a, and /or a“; under the normal and double exponential errors and
sets of random eﬁ'ects for p = 0.01, 0.05, and 0.20.

 

 

 

 

 

 

Nornal Double Exponential

Value of p Est inate Par.Value Average S .0. Average S .D .
0.01 Bootstrap, Zr; 3.00 3.0362 0.9030 2.9878 0.9053
MINQUE, ;, 3.00 3.0366 0.9029 2.9899 0.9109
ri1s=;;-;, -0.0004 0.05566 0.0021 0.0625
0,=o,—o, 0.0366 0.9029 —0.0123 0.9053
n,=og—a, 0.0362 0.9030 —0.0101 0.9109
R,=a",‘/a, 1.0005 0.0228 1.0016 0.0357
r,=;,/a, 1.0122 0.3010 0.9959 0.3018
Iss1=(o,—a,) 3 0.8145 1.1035 0.8176 1.2312
Iss2=(oag—a,) 2 0.8146 1.0961 0.8277 1.2530

Rel. Emoiencyo 1.0001 1.0124

 

86

 

 

 

 

 

 

 

Table 6.5 (continued)
Normal Double Exponential
Value of p Estinate Par.Value Average S .D. Average S .D.
0.05 Bootstrap, o; 3.00 3.0380 1.0368 3.0045 1.0326
MINQUE, Zr, 3.00 3.0349 1.0367 3.0002 1.0278
BlAS=o§—£r, 0.0031 0.0582 0.0043 0.0630
D,=a",‘—a, 0.0349 1.0366 0.0002 1.0278
0,=oz;—a, 0.0380 1.0368 0.0045 1.0326
r,.-.;;/;, 1.0015 0.0264 1.0003 0.0415
11,:og/a, 1.0116 0.3455 1.0001 0.3426
181:1=(2},—s,)2 1.0732 1.4720 1.0537 1.5478
Iss2=(ng;—s,)2 1.0738 1.4677 1.0635 1.5753
Rel. Efﬁciencyo 1.0006 1.0093
0.20 Bootstrap, Er; 3.00 3.0379 1.4153 3.0711 1.4837
MINQUE, o, 3.00 3.0380 1.4134 3.0688 1.4777
811843—31, —0.0002 0.00610 0.0023 0.0646
n,=o,—a, 0.0380 1.4134 0.0688 1.4777
0,:285—6, 0.0379 1.4153 0.0711 1.4837
8583/61, 0.9979 0.0529 1.0005 0.1067
n,.-.Zr,/a, 1.0127 0.4711 1.0229 0.4926
ISE1=( 613—61,)” 1.9940 2.9561 2.1828 3.0902
ISE2=(01’§-a,)’ 1.9995 2.9688 2.2009 3.1362
Rel. Efﬁciencyo 1.0028 1.0083

 

 

 

 

o Rel. Efﬁciency = MSE2/MSE1

87

both the normal and double exponential errors and sets of random effects of the
model. The usual MINQUE ratio, R, was not as close to 1.00 as R,, indicating
that the bootstrap more successfully estimated the ratio than MINQUE.

A more successful estimation of the parameter a, was achieved by both the
MINQUE and bootstrap under the double exponential errors at the 0.05 level of the
intraclass correlation condition. At this level, the average values of MINQUE and
bootstrap were 3.0349 and 3.0380 respectively under the normal and 3.0002 and
3.0045 respectively under double exponential distributions. The bootstrap and
MIN QUE bias were 0.0380 and 0.0349 under the normal and 0.0045 and 0.0002
respectively under double exponential distributions. The bootstrap estimate of bias
was 0.0031 under the normal and 0.0043 under the double exponential. At this level
of the intraclass correlation also, the ratio R, was closer to 1.00 than R, both
under the normal and double exponential errors and sets of random effects of the
model. However, at the 0.05 level of the intraclass correlation condition, both
MIN QUE and bootstrap mean square errors were greater than at the 0.01 level of
the intraclass correlation.

At the 0.20 level of the intraclass correlation, both MINQUE and bootstrap
slightly overestimated the parameter a, under the normal and double exponential
distribution of errors and sets of random effects of the model. However, both biases
were greater under double exponential distribution than under the normal. The
bootstrap estimate of bias was no more than 0.0025 under both distributions. On
average, R, was closer to 1.00 than R, under both the normal and double
exponential. The bootstrap and MINQUE mean square errors were near 2.00 at this
level of the intraclass correlation compared to about 1.05 at the 0.05 level and about

0.80 at the 0.01 level of the intraclass correlation. Thus, both mean square errors

88

seemed to increase with increasing level of the intraclass correlation condition. The
bootstrap/MINQUE measure of relative efﬁciency at all three levels of the intraclass
correlation were very close to 1.00 under both the normal and double exponential
distribution.

Figure 6.5 shows the percentage polygons of the 400 bootstrap and
MINQUE estimates of a, for each of the six design factor combinations. Separate
frequent polygons are presented for the normal and double exponential distributions
at each level of the intraclass correlation condition. From these charts, it is shown
that the bootstrap followed the MIN QUE quite closely such that it was difﬁculty to
distinguish the two at certain points. All the six charts were centered near the true
parameter value of a, which was set at 3.00.

Though the frequency polygons showed no variation by the distribution of
errors and sets of random effects, the spread of the estimates seems to vary by level
of the intraclass correlation condition. The highest spread was observed at the 0.20
level of the intraclass correlation while the lowest spread was seen at the 0.01 level.

The other ﬁxed effects parameter of the model which was examined in the
study was 6, the coefﬁcient of the covariate. The true parameter value was set at
1.00. As in the other parameters of the model, the bootstrap and MINQUE
estimates of 6 were calculated at each of the three levels of the intraclass
correlation under both normal and double exponential errors and sets of random
eﬂ’ects of the model. Summary results for the bootstrap and MINQUE estimates
over the 400 trials for the six design factor combinations are presented in Table 6.6.

Here, averages and standard deviations of ten function of [9 and 6‘" are presented

 

89

Figure 6.5

Percentage polygons for the MIN QUE and bootstrap estimate of a, over 40
trials under the normal and double exponential errors and sets of random eﬁ’ects for

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

p = 0.01, 0.05, and 0.20.
Normal Double Exponential
00m mom
1‘ 1‘
12 12
10 to t-
8 e >- f
6.)- g 1-
‘,L 4 1-
2} 2 >-
o% o - _
-0 -s -s a -2 o 2 4 6 s 1O -to -s -e a -2 o 2 a 0
estimate of a, p a 0.01 estima 0‘ a,
*nootstrso +‘.”NOUE —nootsuso +1911“
DONG!" DOVCOM
H 16
12) 121-
10 r 10 r
8 P a p
6 l‘ O P
‘ 1' s r-
2 l‘ 2 t.
o 0 ‘ ‘ 8
~10 -s -s -o -z o 2 4 s s to ~10 s .6 a -2 o 2 s 0
estimateof a, 71-005 estimateof a,
—oootsmo “Home -—oootsmo *mm
16 D ,‘Clllllll
12 12
10 t 101-
5 ‘ st-
° " o b
s r ‘ ..
2 b 2 1-
O .A L ' o A
to .. -o o z o 2 a s s 10 to s -s -4 2 o 2 ‘ 0
estimate of p _ 0.20 estimate of a,

 

90

Table 6.6

Average and standard deviation of the functions of the estimates

6 and / or 6“ under the normal and double exponential
errors and sets of random effects for p = 0.01, 0.05, and 0.20.

 

 

 

 

 

 

Normal Double Exponential
Value of p Estimate Par.Value Average S .D. Average S .D.
0.01 Bootstrap, 29* 1.00 1.0000 0.0117 1.0005 0.0123
MINQUE, p 1.00 1.000 0.0117 1.0005 0.0122
sisszfr—b —0.0000 0.0008 —0.0000 -.0009
0,:p—p —0.0000 0.0117 0.0005 0.0122
0,=p‘*—6 —0.0000 0.0117 0.0005 0.0122
11,=p‘*/p 1.0000 0.0008 1.0000 0.0009
t,=ﬁ/p 1.0000 0.0117 1.0005 0.0122
ISE1=(@—ﬁ)’ 0.0001 0.0002 0.0001 0.0002
ISE2= law—p)2 0.0001 0.0002 0.0002 0.0002
Rel. E ciencyo 1.0000 2.0000

 

91

 

 

 

 

 

 

 

 

 

 

Table 6.6 (continued)
Normal Double Exponential
Value ofp Estimate Par.Value Average S.D. Average S.D.
0.05 Bootstrap, p" 1.00 0.9999 0.0125 1.0005 0.0123
MINQUE, p 1.00 0.9999 0.0124 1.0005 0.0123
BlASzIS—ﬁ 0.0000 0.0009 —0.0001 0.0009
0,:p—p —0.0001 0.0124 0.0005 0.0123
0,; —p —0.0001 0.0125 0.0005 0.0123
spa/[9 1.0000 0.0009 0.9999 0.0009
1139/5 0.9999 0.0124 1.0005 0.0123
lSEleS—ﬂ)’ 0.0002 0.0002 0.0002 0.0002
I882: ﬁ‘Lﬂ)’ 0.0002 0.0002 0.0002 0.0002
Rel. E oiencyo 1.0000 1.0000
0.20 Bootstrap, :6 1.00 0.9997 0.0121 1.0005 0.0128
MINQUE, p 1.00 0.9997 0.0120 1.0005 0.0127
8118:1343 0.0000 0.0009 0.0000 0.0009
0,:p—p —0.0003 0.0120 0.0005 0.0127
offs—p —0.0003 0.0121 0.0005 0.0128
8,437}? 1.0000 0.0009 1.0000 0.0009
s,=EJ/p 0.9997 0.0120 1.0005 0.0127
lSEle—ﬂ)’ 0.0001 0.0002 0.0002 0.0002
ISE2= ﬂ—ﬁ)’ 0.0001 0.0002 0.0002 0.0003
Rel. E ciencyo 1.0000 1.0000

 

G Rel. Eﬁciency = MSE2/MSE1

92

at each of the three levels of the intraclass correlation by each of the distribution of
the errors and sets of random effects of the model.

From Table 6.6 it is shown that the average values of the bootstrap and
MINQUE were extremely close to the true parameter value regardless of the level of
the intraclass correlation or distribution of the errors and sets of random effects of
the model. At all levels of the intraclass correlation condition, the bootstrap and
MINQUE biases were never greater than 0.0005 and the bootstrap estimate of bias
was perfectly nil under both normal and double exponential distributions.

The average of the ratios R, was almost always equal to 1.00 at all levels of
the intraclass correlation for both normal and double exponential errors and sets of
random effects of the model. However, the average of the ratios, R, slightly
differed from 1.00 for some design factor combinations. The bootstrap and
MINQUE mean square errors were surprisingly small at all levels of the intraclass
correlation for both normal and double exponential distributions. Thus, based on
these summary statistics, it is clear that the parameter 6 was very successfully
estimated by both MINQUE and the bootstrap regardless of the level of the
intraclass correlation condition and the distribution of the errors and sets of random
effects of the model. In terms of their relative accuracy, neither method (MINQUE
or Bootstrap) was superior to the other. Their measure of relative efﬁciency was
extremely close to 1.00 at all levels of the intraclass correlation particularly under
the normal distribution.

Figure 6.6 is a display of the percentage polygons of the 400 bootstrap and
MINQUE estimates of 6 under the normal and double exponential errors and sets
of random effects at each of the three levels of the population intraclass correlation

condition. From these charts, it is apparent that the bootstrap on average followed

 

93

Figure 6.6
Percentage polygons for the MIN QUE and bootstrap estimate of 6 over 400

trials under the normal and double exponential errors and sets of random eﬁects for
p = 0.01, 0.05, and 0.20.

Normal Double Exponential

,‘ 0.0M 00'0""
ol

 

to}
at
°.*
*r

:1»

l ' t 0

CL A AAA ‘4
0935 ones can osss uses too: 1.015 1025 1035 1045

'55 0.006 0076 0005 0.0“ 1006 1015 1028 1.000 1040
estimate of [3

estimate of 6

—" bootstrap + utNOUE

    

 

 

— aooutno - MINQUE

 

 

 

 

 

 

 

 

A

0.986 0.9“ 0978 uses 09“ 10“ 101$ 1025 1.035 1045

A AA A

56 0.” 0.078 0.906 09“ 1” 1.018 1025 m 1066

estimate of 6 estimate of 6

—oootsmo +umms —oootsmo +804“

 

 

 

O
TVT—fTT

 

 
    

 

 

 

 

 

 

1.
t

0 l-
0 or-
s 4 t-
2 2 a .
o4 - . . . _ 0 4+4. . . . . . A
0900 uses om uses uses 10” 1.015 1020 1030 l040 0“.st 0075 0.“. uses 1.” 1.08 t” m ‘0“

estimate of 6 estimate of B

""MIU +m£ —Wn +£0.01!

94

the MINQUE quite closely at all levels of the intraclass correlation condition. The
percentage polygons also showed no obvious difference in estimation between the
normal and double exponential distributions. All the six ﬁgures were centered

extremely close to 1.00 as expected.

Results of Bootstrap Conﬁdence Intervals

The bootstrap procedure for constructing conﬁdence intervals is perhaps one
of the most signiﬁcant accomplishments of the bootstrap method. The procedure
can be applied to even more complicated problems involving statistics whose
sampling distributions cannot be determine analytically. Derivation of the
bootstrap method for the conﬁdence interval is based on the following assumption.
For an estimator, 8 of the parameter 0, let D = 6—0. Deﬁne D* = FL; to be
the bootstrap quantity observed at each bootstrap replication. The bootstrap
distribution of D* estimates the unknown distribution of D. As an illustration,
percentage polygons based on 1000 repetitions to demonstrate the relationship
between the distribution of functions D and D* of T2, 7’2 and/or 72* at 0.01,
0.05, and 0.20 intraclass correlation conditions are presented in Figure 6.7. But it is
important for readers to be reminded that, while D was derived from 1000
independent samples drawn from a population with predetermined parameters, D*
was derived from 1000 repeated resampling drawn from one such sample with
replacement.

If the distribution of D were known, then the (l — a)100% conﬁdence

interval can be deﬁned using real values DL and DU by the probability statement,

95

Figure 6.7

Percentage polygons for the relationship between the distribtion of the
function D = 13 - 73 and D“ = 1" - 13 for p = 0.01, 0.05, and 0.20.

Normal

ﬂ

Double Exponential

 

 

 

 

 

 

 

 

 

 

 

 

 

 

-s -3 '2 -l O ‘ 2 3 l 5 6 7 3

Value cl 0 or 0°

—0° occlsttso +0 MINQUE

 

 

 

 

value 01 0 or D'

- 0" ocean» --o MINQUE

 

 

netcsnl oevcsm
lO 10
8r 8 l-
o r o T
. P ‘ p
2 t 2 t
O A 0
-e 5 s 3 -2 1 o 1 2 3 l 5 6 8 -o 5
Value 01 D or D'
"" 0' 000131100 + 0 MINQUE P - 0.01
percent percent
0 10
8 > a r
6 ‘ a y»
.. .l
2 t 2 1-
o A a A. o
-6 '5 '9 '3 '2 'l 0 l 2 3 l 5 6 7 8 -6

F '0-05

 

 

 

value 01 D or 0‘

— 0* 600mm: *0 MINQUE

 

 

 

 

 

 

   

 

  

\

L A A A ﬂ

 

oucenl percent
12 12
101' 10)-
B 7 8 b
O) 6 l-
0 i’ ‘ p
2 b 2 r
o 1.. ‘ o
‘20 2‘ 28 .20 16
Value 01 D or D'
—D" bootstrap +0 MINQUE P 30.20

"2 '8 '4 0 4 8 12 15

 

20 2‘ 28

Value 01 O or 0'

--0' ooomrso +0 MINQUE

 

 

96

P(DLngDU).~.l—a
or
P(DL$ 0—05 DU): 1—a
which can be written as,

(6.1) P(b—Dugogb—DL)si—o.

Since D0 and D1. are not observable the probability statement in Equation 6.1 is
estimated by the bootstrap probability statement,

(6.2) P(6—D;gogb—D;)sl—o
where D; and D; are bootstrap versions of DU and DL respectively computed
from bootstrap samples. Equation 6.2 gives the bootstrap (l—a)100% conﬁdence
interval via the percentile method. The procedure is highly ﬂexible and can be
applied to complicated problems in a wide range of situations, where classical
methods may fail to be useful. Indeed, this was one of the aspects of the present
study where the bootstrap delivers while the MIN QUE does not.

In the present study, 90 and 95 percent bootstrap conﬁdence intervals were
constructed for each of the six parameters of the mixed model. The conﬁdence
intervals were constructed for all six design factor combinations (cells a through f in
Table 4.1).

Table 6.7 presents the averages and standard deviations for the 90 and 95
percent conﬁdence limits based on the bootstrap over the 400 trials at the 0.01 level
of the intraclass correlation condition. The summary statistics for the lower
conﬁdence limit (L.C.L.), upper conﬁdence limit (U.C.L.) and the width of the
conﬁdence interval are presented both under the normal and double exponential

distributions.

97

 

 

 

 

 

 

88.” 88.: 88.: 82.: 52>»
:36 82.63 88.” 32.2: .D.:
838 28.8 $8." 88.8 4.0.4 ”HO $8
28...“ 88.3 83; 82.2 523
~82. 888: 3.8.“. 38.2: 40.:
886 88.8 83.” 32.8 Add ”HO 088 8.2: 3.3%
238 823. 2.85 .33.”. €23
884 $3.” 884 3.2.” 4.0.:
88.: 88.7 885 83.7 Add ”Ho .88
88.: 886 88.9 826 523
2.24 88.” 8:; 8.53 4.0.:
823 83.7. ”82 22.7. Add ”5 383 2: 3.31
.Q.m owﬁo>< .Q.m owﬁm>< 3.25qu m=_.m>.8m ~225an
3:85
2:58qu oanQ 3882

 

338 on. .8 3.08883 x3 2: Some 258:: 853:8 2: Ho

.88 u a H8 33.2.3.8 2:58 v5 388.— oﬁ 82:.

£23 2: ES 385 8825.8 8.5389 2: .«o 823mg“. @328: 93 888:.

he 033.

98

 

 

 

 

 

 

83.: 23.: 82.: $8.: £23
2.8.: :83: 28.: 2.3.7 4.0.:
3%.: 83¢. 3.8.: :23: 4.0.: ”3 0%:
88.: 2%.: 83.: 8a.: .323
28.: $3.? 88.: 85...? 4.0.:
88.: 83.? 23.: 2.8.? 4.0.: ”no :3: ::...r 3.3:.
2.8.: 83.: $8.: 33.: £23
88.: 88.: 88.: :8: 4.0.:
$8.: 38:: 2.8.: 28.? 4.0.: ”5 °5::
:m::.: 88.: $8.: 88.: £23
28.: 38.: 88.: $8.: 4.0.:
$8.: $8.? $8.: 88:: .40.: "5 5: 8.: 3.3:
dd 380:. dd $803.. «885%: o=_d>.3m 5888.3
3.535
32.88% 2:89 1882

 

@2588: Z 2%.:

99

 

 

 

 

 

 

885 88¢ 32:. 38.: £23
«23. 8:: 2:; 3:: Add
88.: :8: 2:; 835 Add ”5 $8
:85 $35 $8.0 $35 £23
2:; $84 885 3.8." .60.:
8:; :35 2:; 33¢ .15.; ”5 $8 2: 3%
8o: «33 8mg :22 £23
:8... 3»? 8:3 33 4.0.:
88.: «£3 8:; 8%; do: ”5 x8
:32 38.” 3.35 33.” £23
:85 28+ 285 55. 4.0.:
$86 823 3:3 as? 4.0.; ”3 $8 8.” 3.3:.
.D.m mmﬁm>< dd $295 c3833 33>.an “225:5
3335
33:80qu 0325 38.82

 

€33.53 2 23

100

From Table 6.7 it is shown that bootstrap 95% conﬁdence intervals about
the parameter 1" at the 0.01 level of the intraclass correlation had a width of
5.1793 and 5.3868 under the normal and double exponential respectively. The
average width of the 90% conﬁdence interval was 4.3137 and 4.4735 under the
normal and double exponential respectively. Corresponding standard deviation to
these averages show clearly how precise these intervals were.

Averages and standard deviations of the conﬁdence limits and widths of
conﬁdence intervals about the parameter of, also showed a fairly precise bootstrap
interval estimation process. The average width of the conﬁdence intervals were
fairly low, particularly under the normal distribution. Standard deviations
corresponding to these averages were 0.9369 under the normal and 2.2998 under
double exponential.

Since the bootstrap interval estimation process about the parameters 72 and
a: both of which are component of p (see Equation 4.3) was rather unsuccessful,
the results showed an equally successful bootstrap interval estimation process about
the parameter p. The average width of the 95% conﬁdence interval was 0.0508
under the normal and 0.0523 under the double exponential distribution.

At this level of the intraclass correlation condition, the highest success of the
bootstrap conﬁdence interval estimation procedure was achieved about the
parameter [3, the ooeﬁcient of the covariate of the model. For this parameter, very
precise bootstrap conﬁdence intervals were obtained both under the normal and
double exponential distributions. For instance, the average width of the 95%
bootstrap conﬁdence interval was 0.0467 under the normal and 0.0467 under the

double exponential. The standard deviations corresponding to these average widths

101

were 0.0034 and 0.0037 under the normal and double exponential respectively. With
these results, it is apparent that, regardless of the distribution of the errors and sets
of random eﬂ'ects of the model, the bootstrap conﬁdence intervals about the
parameter ﬂ were extremely precise.

Table 6.8 shows the summary statistics for 90% and 95% bootstrap
conﬁdence limits over the 400 trials at the 0.05 level of the intraclass correlation
condition. Averages and standard deviations for the lower (L.C.L.) and upper
(U.C.L.) conﬁdence limits and the width of the conﬁdence interval are presented
under the normal and double exponential distributions.

Summary statistics in Table 6.8 shows that, the bootstrap conﬁdence
interval about the parameter 1" at the 0.05 level of the intraclass correlation were
by far more successful than the same intervals at the 0.01 level of the intraclass
correlation condition. The average widths were much smaller and less variable.
Summary statistics for bootstrap conﬁdence intervals about the parameters a”, a,,
a,, and 19 at the 0.05 and 0.01 levels of the intraclass correlation showed a more
precise interval estimation under both the normal and double exponential. For the
parameters 1’2 and p, the bootstrap conﬁdence interval estimation procedure at
the 0.05 level of the intraclass correlation was equally successful as at the 0.01 level
of the intraclass correlation condition. For instance, compared to the average width
of the 95% conﬁdence interval about p of 0.0508 when p = 0.01, the same average
was 0.0615 when p = 0.05 under the normal distribution. Similar low differences
between the two levels of the intraclass correlation in the width of the bootstrap
conﬁdence intervals about the parameter p were observed under the double

exponential distribution.

102

 

 

 

 

 

 

ovum." 38.2 38.: $3.: 523
836 35.2: ”can.” wounded ddd
$3.6 53.8 33..” 3.3.3 ddd dd $8
Sons 33....“ vmomd $3.3 £23
8:: 8.3.52 38." 38.2: ddd
2.36. 2.8.3 2.3.» 83.3 ddd dd $3 8.2: 3.3%
32.6 manna 385 83a £23
83....“ 82.» naoad 8:: ddd
nnmad «an: Emvd 33.“ ddd dd $8
nomad Rood mmmﬁo 3:8 523
33d 2.3.x $8; 83.x ddd
$3; 82; 82.." 23: Add ”5 $3 $3 €31
dd $395. dd omSo>< 32.5mm o2¢>§£ SESSm
22:35
Eaaoqoaxm «3.59 3882

 

358 2: do 3888938" x: 2: Song 2863.: 853:8 2: do
533 05 23 £8: 88330 3:38; 2: do 80:25“. 335: ES 8w30><

.35 u q 3.. 3358?... 03:2. 23 188: 2: 82:.

ad 053.

103

 

 

 

 

 

 

88... $3." 88... 38.... £23
«3.3 88.? 88d 88.? Add
38A 33.? :53 2.3.? Add dd $8
223 82..” 3.3.: 2.8.” £23
8:: 38.”: $2: 38.”... Add
22: 33.? 32d 32.? Add dd x3 8d: :3?
$85 :3: 3.85 22:. £23
53. 85° 8:; 33 Add
$8... 2.8: 3:; £85 Add dd x8
88.: 82:. so: 38.: £23
885 :86 2:; 2:3. Add
88.: as: 8:; 2.8.: ddd dd x3 2:. 3%
dd 0383‘ dd $225 demand add—«>55 addendum
3‘56““:
3.5596 0325 38.82

 

Aumaasasv 3 2.3.

104

 

 

 

 

 

 

good 385 good 88d 53?
$85 8:: 38.0 8:: 4.0.:
38d :36 $85 835 Add ”HO $8
good 32:. «mood 2.36 523
vﬁod ammo." 88d mag; 4.0.:
53. at: 385 23.0 4.3 ”5 x3 2: 3%
Sand 39: $2.0 38." 523
82: £23. 32: $34. 4.0.:
384 82: ammo; was; 4.0.4 ”HO $8
v2.2. 3cm.” «86¢ 33.” £23
«one; «83. 82: 32.6 4.0.:
can: 8:: $3; 2...? 4.0.4 ”3 x8 8.” Gina
.Q.m omﬂo>< .Q.m omﬁm>< Baazmm Biz/cam Sﬁﬁﬂdm
12:25
Baggage 03.59 3882

 

soésaos 3 2&9

105

Summary results for the 90% and 95% conﬁdence intervals for the 400 trials
at the 0.20 level of the intraclass correlation are presented in Table 6.9. Averages
for the lower (L.C.L.), upper (U.C.L.) conﬁdence limits and the width of the
conﬁdence intervals are presented under both the normal and double exponential
errors and sets of random effects of the model.

At this level of the intraclass correlation, these summary statistics showed
slightly wider bootstrap conﬁdence procedure about the parameters 1’2 and 0:
than at the other levels of the intraclass correlation. No diﬂerences in the level of
success of the method were noticed in estimating the conﬁdence intervals about the
other parameters of the model among different levels of the intraclass correlation
condition. For the ﬁxed effects parameters, the bootstrap procedure for conﬁdence
interval was also always successful at all levels of the intraclass correlation
condition.

These results revealed an important feature of the bootstrap procedure for
conﬁdence intervals about the parameters 1" and 0:. The success of the procedure
depends on the level of the intraclass correlation. When the pOpulation intraclass
correlation is small, the bootstrap procedure for conﬁdence intervals using the
percentile method about the ﬁxed and random effects parameters of the model is
quite precise. At high values of the intraclass correlation condition, the bootstrap
interval estimation procedure about the random effects parameters is slightly less
precise. However, regardless of the level of the intraclass correlation, the bootstrap
method for the conﬁdence interval about the ﬁxed effects parameters of the model

(a,,a3ﬂ) seemed to be a remarkable success.

106

 

 

 

 

 

 

:8.“ 2.8.: as... 83.: £23
:8... 33.8" 82¢ 38.8“ .D.:
83.» 38.8 32.” 33.3 Add ad *8
$2..“ 8:: 82.“ 88.: £23
22... «$8: 58.“. 2.8.2: 45.:
3.5.“ 32.8 83..» 53.8 .34 ”8 :3 8.2: acre
88.“ as: 53 2:3 523
2&2 $2.8 $5.0 33.2. 4.0.:
83... $2.8 was.” 5.ch Add ad x8
3:: 25.: as; 38.: 523
88... «5.8 «a; 88.8 40.:
33... 82.2 as.» 88.2 43..“ "no :3 8.3 Q31
..Q.m 0383‘ dd omauo>< 325.5 2.3/gm .8883!—
€235
easing «3:8 352

 

ad 039—.

.86 u a he 158.895 03:8 and 18.8: 2: 823
.308 on. we 56883 Mu. 2: 33¢ 55:: 883:8 2: mo
53? 2: 23 £8: 883:8 9538.— 2: no Sonata". 3353. 2:. 85¢

107

 

 

 

 

 

 

Send cmmmd $36 was.“ 523

$34 «8ch $84 23.7 ddd

33d 23.? «Sod $3..va ddd dd $8

«wand 35.” 3mm... 33.” 523

mmmnd SEAT 38d namﬁml ddé

884 888: 8a: 88.? Add ”5 $8 8.? Q35

32:. good good 385 523

3.36 338 88.8 836 ddd

83.: 83... 2.8.: Soﬁe ddd dd $8

«Sod $3.: mmood :86 523

$36 $36 886 commd ddd

owned 336 886 $2.: ddd ad $3 88 9.qu

.Q.m omﬁo>< .Q.m $803.. 38853“ 2.3/dam 3638a
.8335

ESQ—awn 2309 3882

 

ABSESV 8.8 28“.

108

 

 

 

 

 

 

88.: 88.: 88.: 83.: 523

as: 884 38.: 884 4.0.:

88.: 88.: as... 83.: 4.04 :0 x8

58.: 2.3.: 38.: 83.: 523

was... $84 $8.: 824 4.0.:

88.: $8.: $8.: 33.: 4.04 4.0 $3 84 :3:
88.: 824 88.: as: 523

3.3.4 :83 2:; 3.3 4.0.:

3:: 834 2.3.4 384 4.04 4.0 x8

«.22. 33.." 28.: as.” 523

83.4 82.4 33 «83 4.0.:

$54 88.: 8:} 33.4 4.04 4.0 $8 8.» ca?
d.m ouﬁo>< d.m 03854 3:85am 2.3/4.3 8682:“—

330::
38:85 2:8: .332

 

€25.83 2. 2:94.

109

The reader should be reminded that, the bootstrap procedure for the
conﬁdence interval demonstrated above, represents perhaps the greatest charm of
the bootstrap technique. Using the technique, conﬁdence intervals which may be
difﬁcult to obtain through the usual MIN QUE are possible.

Accuracy of Bootstrap Qonﬁdence Intervals
In this simulation study, 90% and 95% bootstrap conﬁdence interval were

constructed at each of the 400 trials. Table 6.10 shows the percentage of the
number of times each of the six population parameter fell within the 90% and 95%
bootstrap conﬁdence interval for all the six design factor combinations (cells a
through 1'). Ideal percentages are expected to be near (1—a)100, for a = 0.01 or
0.05.

From these results, it is shown that, near perfect percentages were observed
for all the six parameters at the 0.05 level of the intraclass correlation under the
normal distribution. At this level of the pOpulation intraclass correlation condition,
percentages under the double exponential, though not as good as those under the
normal, were not far off from the expected quantity (1—a)100. For most of the
parameters, disappointingly low percentages were observed at the 0.20 level of the
intraclass correlation, particularly under the double exponential errors and sets of
random effects of the model. Even at the other two levels of the intraclass
correlation condition, there were more coverage probabilities which were below the
expected quantity (1—a) than those above the expected quantity. This ﬁnding

perhaps sends a precautionally message to research practitioners to aim higher

110

coverage probabilities when setting conﬁdence intervals rather than investing high
hopes at the conventional 0.9 or 0.95 coverage probabilities. However, for the
parameters a’ and ﬂ, percentages extremely close to (1—a)100% were observed

for all the six design factor combinations.

111

Table 6.10

Percentage of times that the true population parameters fell within the
conﬁdence intervals formed using the bootstrap procedure at the three levels of
the intraclass correlation.

 

 

 

 

 

 

 

 

Normal Double Exponential
Value p Expected 90% CI. 95% 0.1. 90% 0.1. 95% C.I.
0.01 1" 1.00 97.5 99.0 97.8 98.8
0.05 5.26 89.5 94.0 81.0 88.3
0.20 25.00 58.8 70.3 42.3 52.0
0.01 0’ 100.00 89.8 94.8 85.0 91.8
0.05 100.00 88.3 94.0 85.5 92.0
0.20 100.00 98.5 92.8 85.5 92.3
0.01 p 0.01 98.0 99.0 97.5 99.0
0.05 0.05 89.3 93.8 82.0 88.0
0.20 0.20 59.8 68.8 45.0 52.5
0.01 (11 —5.00 87.8 93.0 88.0 93.0
0.05 —5.00 81.8 88.3 81.8 88.3
0.20 —5.00 64.5 74.0 62.8 71.3
0.01 a3 3.00 87.5 93.5 86.8 92.8
0.05 3.00 82.3 88.3 84.0 89.3
0.20 3.00 70.3 76.8 66.5 76.8
0.01 ﬂ 1.00 89.0 95.0 87.5 93.5
0.05 1.00 86.5 93.3 87.8 93.5

0.20 1.00 88.5 95.0 87.5 93.5

CHAPTER VII

SUMMARY, TECHNICAL DISCUSSION, CONCLUSIONS,
AND RECOMMENDATIONS
vervi

The primary purpose of the study was to demonstrate the Operation of the
bootstrap in estimating parameters of a mixed hierarchical linear model with
random intercepts. The study demonstrated the ability of the bootstrap algorithm
in providing the estimates of the ﬁxed and random effects of the model, generating
bootstrap empirical distributions and standard errors of the statistics and thereby
constructing conﬁdence intervals about the parameters.

The design of the study utilized samples generated from pOpulations of
known parameters. Computer programs used to generate independent samples and
perform Monte Carlo simulations were coded in Statistical Analysis Systems (SAS),
mostly using Interactive Matrix Language (SAS/IML). Generation of data from
pOpulations of known parameter values provided a check for the performance of the
estimation procedures.

P0pulation distributions from which samples were drawn from represented
the normal and double exponential (or Laplace) distributions. The double
exponential distribution (an example of a distribution with fairly long and thick
tails) represented a distribution with some departure from normality, a situation
which most classical statistical methods are typically not usable.

The Minimum Norm Quadratic Unbiased Estimation (MINQUE) procedure
was adopted as a useful method of estimating the parameters of the model at each
bootstrap replication. The method provided a comparable partnership with the

112

113

bootstrap since they both do not require the normal distributional prOperties. Thus,
for each parameter of the model, two estimators were provided. These estimators
represented the MINQUE estimator based on the original sample and the bootstrap
estimator based on the resampled data.

Though the derivation of the MINQUE is based on arbitrary weights in the
norm, the present study adapted an AN OVA—type method of independently
estimating the variance components of the model as in Hanushek ( 1974). The
values therein those prior estimates were used to determine the weights to be used
in MINQUE.

In order to extensively assess the behavior of the bootstrap and MIN QUE
estimators, a total of 2400 Monte Carlo simulation trials, each consisting of a
diﬂ’erent data set were performed for six design factor combinations. The six design
factor combinations represented the three levels of the intraclass correlation by the
two distributional models (normal and double exponential).

In addition to simulated data, the bootstrap method and MIN QUE were also
applied on actual ﬁeld research data to estimate both the ﬁxed and random effects
of the model involving the effect of institutional, classroom and individual teacher
variables on self-efﬁcacy of high school teachers. For each parameter of this
speciﬁc model, the two estimators, the MIN QUE and the bootstrap estimates were
provided side by side. However, the bootstrap’s additional estimation advantage
was demonstrated by providing the bootstrap standard errors, empirical bootstrap
sampling distributions of estimators, and the 95% bootstrap conﬁdence intervals
about each of the parameters of the teachers’ self-efﬁcacy prediction model. The

bootstrap estimate of bias for each estimator were also provided.

114

mg!
’ —Efﬁ M e1

Three parameters of the random part of the model, representing the intra—
and inter—teacher variances, denoted by of, and r’ respectively and the
intra—teacher correlation denoted by p were estimated using both MIN QUE and
the bootstrap. In addition, eleven ﬁxed effects parameters of the ﬁxed part of the
model representing the eﬂects of Mathematics (a,), Science ((1,), English (as),
Social Science (a,), class level of preparation (6,), class size ([3,), average student
achievement level (53): staff c00peration (71), teacher control (7,), principal
leadership (7,) and the constant common to all classrooms denoted by 700 were
studied.

The bootstrap estimates of the effects of intra—teacher variance,
inter—teacher variance, the effect of class size, staff c00peration, and principal
leadership were close to the MINQUE estimates. For these estimators, the
bootstrap estimate of bias was no more than 0.008. However, except for the
estimate of the constant 700, whose bootstrap estimate of bias was 0.1124, the
bootstrap estimate of bias for the remaining nine estimators was no more than 0.04.

The bootstrap provided additional estimation informations which was not
available through the MIN QUE. These included the bootstrap standard error of
each estimator, the 95% conﬁdence intervals about the parameters, and the
empirical bootstrap distribution of the statistics. Extremely low values of the
bootstrap standard errors were observed particularly for the inter— and
intra—teacher variances, the effects of the class level of preparation, class size,
teacher control, and principal leadership. The bootstrap standard errors for these
estimators were all close to 0.01.

As a means of testing hypothesis about the parameters of the teachers’
self-emcacy prediction model, the 95% bootstrap conﬁdence intervals about the

115

parameters were constructed. Based on these intervals, hypotheses of whether each
of these parameters was different from zero were tested. Based on this bootstrap
fashion of testing hypothesis, all factors, with an exception of pring’pg leadership
were found to have a statistically signiﬁcant effect on teachers’ self—efﬁca_cy.

Seven percentage polygons based on B = 1000 bootstrap replications of the
estimators of the inter—teacher variance (1”), intra—teacher variance (cg), the
intra-teacher correlation (p), the ﬁxed effects of Mathematics (al), Science (a,),
English (a,), and Social Science (a,) were presented. Though the estimate of the
sampling distribution of the inter-teacher variance (7’) was slightly positively
skewed, and that of the effects of Mathematics (a,), Social Science (a,), and Science
(0,) were slightly negatively skewed, estimates of sampling distributions for all
other estimators were fairly symmetric. But perhaps more importantly, percentage
polygons for all seven estimators were centered extremely close to their
corresponding usual MINQUE point estimators.

By applying the bootstrap method on actual ﬁeld research data, the study
demonstrated three features which research practitioners may ﬁnd useful. First is
the bootstrap’s ability to provide the standard errors, empirical bootstrap sampling
distributions of estimators, and setting conﬁdence intervals about each of the
parameters. This feature is typically not available through classical methods in the
absence of certain distributional assumptions. The second feature was the
ﬂexability of the design in accommondating a wide range of independent variables
(both continuous and discrete) in the model. The ability of a design to
accommodate all types of independent effects is important given the limitations of
most available statistical packages. For instance, the procedure VARCOM in SAS
allows only for independent effects limited to main effects, interactions, and nested
effects but not continuous eﬁects. The third feature and perhaps the least expected
was the efﬁciency of the bootstrap computer code. Though the bootstrap is

116

typically perceived as computer intensive, the present study utilized a simple
program coded in SAS/IML through the MSU IBM 3090 VF mainframe computer.
With this program, one bootstrap trial on the full model (seven independent
variables) of B = 1000 replications took approximately 18:31.16 CPU time, which

was not very expensive.

Th im M e]

Diﬁ'erent simulated models corresponding to each of the six design factor
combinations (see Table 4.1) were studied. The six models represented the two
distributional models (normal and double exponential) by three levels of the
pOpulation intraclass correlation condition (p = 0.01, 0.05, and 0.20). The six
design factor combinations are denoted by cells a through f. Each of the six models
speciﬁed according to the design factor combination contained seven parameters of
which three were random and four ﬁxed. The three random effects parameters
represented the inter—class variance (1’), the intra-class variance (0:), and the
intraclass correlation (p). The four ﬁxed effects parameters, a,, a,, a3, and 6
represented the three levels of the ﬁxed factor and the coefﬁcient of the covariates.
Estimation of non—redundant effects were based on a,, 013, and B.

A total of 400 Monte Carlo simulation trials (based on independent samples)
were performed for each of the six design factor combinations (cells a through f),
resulting in a grand total of 2400 Monte Carlo simulation trials, each based on a
different data set drawn according to the speciﬁed design factor combination
parameters. Ten estimable functions expressed in terms of the usual MINQUE
and/or bootstrap estimates were used to assess the estimation of both the usual
MINQUE and the bootstrap estimators. The ten estimable functions were carequy
chosen to provide meaningful statistics like the Mean Square Errors, MSEl and

117

MSE2, for the usual MINQUE and bootstrap estimator respectively, and the
bootstrap estimate of bias denoted by BIAS.

Since data were drawn from populations of known parameter values,
estimable functions were checked against their expected parameters values. The
following is a presentation of the summary of estimation results organized by the

population parameters of the models.

Wt”)

At the 0.01 intra—class correlation conditions, both MINQUE and
bootstrap overestimated the parameter ‘r’ with biases equal to 0.0292 and 0.0597
under normal and double exponential respectively for MINQUE, and 0.3432 and
0.3765 under normal and double exponential respectively for the bootstrap.
Bootstrap estimates clearly improved for p = 0.05 both under the normal and
double exponential with biases of 0.0299 and 0.2709 under the normal and double
exponential respectively. Corresponding biases for the MINQUE estimate were
-0.0845 under normal and 0.1396 under the double exponential. The bootstrap
estimate of bias was observed at 0.1144 for p = 0.05 compared to 0.3140 for
p = 0.01 under the normal distribution.

Particularly successful estimation results for the parameter 1'2 were attained
at the p = 0.20 intraclass correlation condition. The bootstrap with a bias of
—0.0447 was clearly close to the usual MIN QUE with a bias of —0.1480 under the
normal distribution. The bootstrap was also fairly close to the usual MIN QUE
under the double exponential with the former registering a bias of 0.4018 and the
later having a bias of 0.5167. The estimate of bias at this level was 0.1149, and the
ratio R1 was surprisingly close to 1.00.

On average therefore, the MINQUE was closer to the parameter than the
bootstrap only at the 0.01 level of the intraclass correlation condition under the

118

normal. At p = 0.05 and 0.20, the bootstrap was close to the parameter value
compared to the MIN QUE under the normal distribution. Both methods failed to
produce better estimates for 1" at all levels of the intraclass correlation under the
double exponential distribution.

Percentage polygons for the 400 MINQUE and bootstrap estimates were
centered near the true pOpulation parameter value of r’ at all levels of this
intraclass correlation condition. However, though the bootstrap percentage polygon
appeared to be positively skewed while the MINQUE polygon was fairly symmetric,
at the 0.01 level of the intraclass correlation condition it was observed that a greater
mass of observations were around 1.00 for the bootstrap percentage polygon than for
the MIN QUE polygon.

The bootstrap conﬁdence intervals about the parameter 1’ were extremely
tight under the double exponential as well as under the normal distribution at the
0.01 level of the intraclass correlation condition. Bootstrap conﬁdence intervals
about 7’ were fairly short, both under the normal and double exponential at the
0.05 level of the intraclass correlation condition, but were wider at the 0.20 level of
the intraclass correlation condition. The percentage of times the true parameter
value of 1" fell within the 90 or 95 percentage bootstrap conﬁdence intervals were
close to either 90 or 95 at the 0.05 level of the intraclass correlation condition. The
percentage of times the parameter value was captured by the bootstrap conﬁdence
intervals were furthest from the expected conﬁdence coefﬁcient at the 0.20 level of

the intraclass correlation condition (see Table 6.10).

Wing)
MINQUE and bootstrap fairly accurately estimated the population

inter—class variance both under the normal and double exponential distribution of

errors and sets of random effects parameters, at all levels of the intraclass

119

correlation condition. However, at the 0.20 level of the intraclass correlation, the
bootstrap was closer to the parameter value than the MIN QUE with a bias of
0.0437 compared to the MIN QUE bias of 0.1660 under the normal distribution.

At the 0.01 level of the intraclass correlation the statistic R2 was extremely
close to unity, as expected. At all three levels of the intraclass correlation, the
standard deviation of the functions of the estimates were relatively high under
double exponential than under the normal distribution.

Percentage polygons for the MIN QUE and bootstrap estimates at all levels of
the intraclass correlation, showed the bootstrap following the usual MINQUE quite
closely. Percentage polygons for both estimators were centered extremely close to
the true parameter value of 03, which was set at 100. Thus it can be argued that,
while both MINQUE and the bootstrap fairly accurately estimate 03,; eﬁciency of
these estimates is severely affected by the nature and size of the tails of the
distribution of the errors and sets of random effects parameters. Both estimators
are less efﬁcient under a distribution with fairly long and/or thick tails than under a
distribution with short and/ or thin tails. But the measure of their relative
eﬁciency was extremely close to unity.

The bootstrap conﬁdence intervals about the parameter 036 showed a very
successful bootstrap interval estimation process. The average widths of the
conﬁdence intervals were quite low, particularly under the normal distribution. At
all levels of the population intraclass correlation condition, the percentage of times
the true parameter value of a: fell within the 90 or 95 percentage bootstrap
conﬁdence intervals were extremely close to either 90 or 95 under both normal and

double exponential distributions.

120

In r rr i 11

At the 0.01 level of the population intraclass correlation condition, both
MIN QUE and bootstrap very slightly overestimated p under both the normal and
double exponential. The biases were extremely close to zero under both
distributions. With an exception of R, which was poorly estimated, all other nine
functions of ,3 and/or 5* were fairly accurately estimated. At this level of the
intraclass correlation conditions the mean square errors for both MINQUE and the
bootstrap were particularly close to zero, under both distributions.

At the 0.05 level of the intraclass correlation condition, all ten estimable
functions of }3 and/or 5* were very successfully estimated, both by the MINQUE
and the bootstrap. The biases for the bootstrap and MIN QUE estimators were
extremely close to zero under both the normal and double exponential distributions.

The bootstrap slightly underestimated p under the normal, but very
accurately estimated p under the double exponential at the 0.20 level of the
intraclass correlation condition. The MINQUE, on the other hand, slightly
underestimated ,0 both under the normal and double exponential at this level of the
population intraclass correlation condition. Under both distributions, the statistics
R, and R, were extremely close to unity. However, at this condition of the
intraclass correlation, the bootstrap performed as well as the MINQUE in
estimating the ratio of the estimate to the parameter, p under both the normal and
double exponential distributions.

Percentage polygons for the 400 MINQUE and bootstrap estimates of p
under the normal and double exponential distributions showed that the two
methods followed each other very closely. For both methods however,the estimates
of p were more variable under the double exponential than under the normal

distribution, at the 0.05 and 0.20 levels of the intraclass correlation condition.

121

The bootstrap interval estimation about the parameter 1”, as a component
of p (see Equation 4.3) was successful, particularly at the 0.01 level of the intraclass
correlation; 90 and 95 percentage conﬁdence intervals about p were fairly
successful under both normal and double exponential. However, the percentage of
times, the parameter value of p fell within the bootstrap 90 and 95 percent
conﬁdence intervals were furthest from the expected conﬁdence coefﬁcient at the
0.20 levels of the intraclass correlation.

Fixﬂ effects pgameters (a,,gpgsl

Since a,, a,, and a, are linearly dependent, estimation was only required
for any two of them. Estimation results for a, and 0:3 were presented.

At the 0.01 level of the intraclass correlation condition, both MINQUE and
bootstrap fairly accurately estimated both a, and a, with biases of no more than
0.027 for a, and 0.037 for 0,. The statistics R, and R, at this level of the
intraclass correlation were extremely close to 1.00 under both the normal and double
exponential distributions. Mean square errors for both MIN QUE and bootstrap
estimates of a, and a:3 were no more than 0.87 under both normal and double
exponential distributions. The measure of their relative efﬁciency was quite clsoe to
one. ,

At the 0.05 level of the intraclass correlation, all ten estimable functions of
8, and/or 8: were very accurately estimated by both MINQUE and bootstrap,
under the normal. However, the bootstrap and MIN QUE estimates of a, were
surprisingly accurate under the double exponential than under the normal. The
average values of the functions R, and R, for both MINQUE and bootstrap
estimates of a, and a, were extremely close to 1.00 under both normal and

double exponential for p = 0.05.

122

At the 0.20 level of the intraclass correlation, though the bootstrap was
closest to the parameter a, under the normal than under double exponential, the
biases for both MINQUE and bootstrap were no more than 0.05. Both biases for
bootstrap and MIN QUE estimators of a, were close to 0.04 under the normal but
near 0.07 under double exponential.

In general therefore, both MINQUE and bootstrap very successfully
estimated the parameter a, and a, at all levels of the intraclass correlation under
both normal and double exponential distributions. However, the mean square error
for both MINQUE and bootstrap estimates of a, and a, tended to increase with
intraclass correlation under both normal and double exponential distributions.

The bootstrap conﬁdence intervals about the parameters a, and a, at the
0.05 and 0.01 levels of the intraclass correlation showed a more precise bootstrap
interval estimation process under both normal and double exponential distributions.
Except for p = 0.20, the percentage of times the 90 or 95 percent bootstrap
conﬁdence intervals captured the parameters a, and a, were extremely close to

the expected conﬁdence coefﬁcient, (1 — a) 100%.

C ' n of th variates

Perhaps the most accurate bootstrap and MINQUE estimation results were
obtained for the parameter [9, the coeﬁcient of the covariates. For this parameter,
the bootstrap and MINQUE average estimates over 400 trials were extremely close
to the true parameter value regardless of the level of the intraclass correlation or
distribution of the errors and sets of random effects parameters of the model. At all
levels of the intraclass correlation condition, the bootstrap and MIN QUE biases
were never greater than 0.0005 and the bootstrap estimate of bias was perfectly nil

under both normal and double exponential distribution.

123

Average values for the functions R, and R, of [9 and/or 8* were either
extremely close to unit or exactly equal to unit at all levels of the intraclass
correlation condition. The mean squares errors for the bootstrap and MIN QUE
estimators of ,6 were no more than 0.0002 under the normal and 0.0003 under the
double exponential at all three levels of the intraclass correlation condition. Thus,
based on these results it was evident that the parameter 6 was extremely
accurately estimated by both MINQUE and the bootstrap, regardless of the level of
the intraclass correlation condition and the nature and size of the tail of the
distribution.

Percentage polygons for the MIN QUE and bootstrap estimates of )3 showed
no obvious differences between MINQUE and the bootstrap nor between their
estimation ability under the normal or the double exponential. At all levels of the
intraclass correlation, the percentage polygons showed that a large mass of the
bootstrap and MIN QUE estimates were within 0.015 points from the true parameter
value of [9 which was set at 1.00. Also, regardless of the level of the intraclass
correlation, the bootstrap method for the conﬁdence intervals about the parameter
5 was a remarkable success. Even the percentage of times the 90 or 95 percent
conﬁdence intervals captured the true parameter value of 6 were extremely close to
the expected conﬁdence coefﬁcient at all levels of the intraclass correlation for both
normal and double exponential distributions.

Techni Di 8 ion
Much of statistical inference amounts to describing the relationship between
a sample and the population from which the sample was drawn. Consider for
instance, the statistic 0 used to estimate an unknown parameter 0. Suppose we
deﬁne a function R given by R = 8] 0. Since the behavior of R is unobservable,

we may wish to approximate its distribution. The main principle of the bootstrap is

124

to estimate the unknown distribution of a function, such as R by the distribution
of R* = 2"] 8, where 70" is the bootstrap version of d, computed from repeated
resampling.

The key feature of this argument is the hypothesis that the relationship
between 3 and 27* should closely resemble that between 3 and 0. Under the
assumption that the relationships are identical, we equate the two ratios, R and
R“ and obtain the estimate of 0 which is a function of data. Similar arguments
can be made for other functions like say, D* = 8* — 8 whose distribution will
resemble that of D = d— 0. Bootstrap conﬁdence intervals are then constructed
based on this approximation as demonstrated in Equation 6.1 through 6.3 in
Chapter VI of this dissertation.

In the present study, through Monte Carlo simulations, the distributions of
R and R“ were observed through two types of resampling. The distribution of R
was examined by drawing a random sample from a population having known
parameters, computing the statistic 8 and repeating the process a large number of
times. On the other hand, the distribution of R* was observed by drawing one
sample from the pOpulation similar to the one used in resampling for R. From this
sample, a random sample of the same size is drawn with replacement, the statistic
8" computed, and the process repeated a large number of times. The statistic 8
based on the original sample was also computed. The distributions of R and R“
were then derived from this systems. The purpose was then to empirically examine
the resemblence of the distribution of R“ and that of R.

Figures 7.1 and 7.2 presents the percentage polygons for the distributions of
R and R* representing the ratios of the estimators of the random and ﬁxed

parameters of a mixed hierarchical linear model discussed in Chapter II of this

125

Figure 7.1

Percentage polygons for the distributions of R and R“
representing the ratios of the estimates of the

random parameters 1’, 0:, and p.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Normal. Double Exponential

“m

t2

‘0

I

. .. \ .

2 j 2

us as l t s 2 ”:5 J as on o s . '5 2 2-3 3 3‘
nlnecstndR’ ‘m? vahnofRsndR'
—Qm *Qﬁmuugo —QW +Q°°Iootsttso“'

um 10””

12 I!

10

I

I

2

0°II OIFI uses 09” 105. H! H7I '23. 2.. OIVI cuss class I“. H. U" 13”
ninsofRsndR' ulna; "bacillus?
—Qm -0-II-Isotsmo -—n,-m *“W'I

"m “on"

t: '2

to to

I I

I I

2 2

°‘oe...;zu?.fo‘en.uzua
nlneofRsndR‘ rm, manner

126

Figure 7.2

Percentage polygons for the distribution of R and R*
representing the ratios of the estimates of the
ﬁxed parameters a,, a,, and ﬂ.

 

Normal Double W

 

 

 

    

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

I e

s s

z :

8.13 as 0.3: 07: cos ll: :35 is: '2‘ 3.1!: 033 0:: "33 no: no, .35 is. If!

nhnchandH‘ Gianna, ulnschsndR'
—OW *“Ioosssno —sm *vau

“W “M"

t2 '2

to to

s s

s s

z ‘ z

0' c ‘ ‘

as o as t .3 3 . . as o as 1 Is 1
nlnsafRsndB.‘ «gander, valnaofRsndH’
-"m +ﬁmsu _.m *rm

“m “cm

:2 12

to 10

s s

s s

2 2
:07 m Q87 loss tort ms 101' «far can asst I“ ma tn too?

manner puns, meanness

 

-8“: +"’m —am +l~m

 

127

dissertation. The estimators represented in Figure 7.1 correspond to the random
parameters r3, oi, and p while those represented in Figure 7.2 correspond to the
ﬁxed parameters a,, a,, and 6. Estimation of these parameters was done at the
0.05 level of the intraclass correlation condition.

The expected value of both R and R* is 1.00. Consequently, both
percentage polygons derived from the resampled data should be centered near 1.00.
Indeed, Figure 7 .1 and 7.2 shows that both percentage polygons were centered
extremely close to 1.00.

It is important to emphasize that the distribution of R represents a
sampling distribution of a statistic which is unobservable in actual research
situations. Properties of this distribution can only be viewed theoretically for
certain statistics, typically via the normal theory. On the other hand, the
distribution of R* represents an approximation of the distribution of R. More
importantly, the distribution of R“ is almost always observable via the bootstrap
algorithm. If the distribution of R* fairly accurately approximates the distribution
of R, then the bootstrap proves itself as a highly promising method in statistics.

From Figures 7.1 and 7.2, it is apparent that the distributions of R and R“
are fairly similar, particularly in terms of their location (or central tendency). They
diﬂ'er slightly in variability. However, the distribution of R* surprisingly appear
to be even "better" than that of R in the sense that, a greater mass of observations
are near 1.00 under the R* curve than under the R curve. This variations were
clearly marked under the double exponential than under the normal distribution.
Such variations in the distributions of R and R*, though slight, by underlying
distribution of errors and the random effects of the model were demonstrated to be

consistent for all estimators of the six parameters of the mixed hierarchical linear

model considered in the study (see Figures 7.1 and 7.2).

128

92mm
The following conclusions were drawn from the results of the Monte Carlo

simulation study and the results of the application the bootstrap and MINQUE on
the estimation of the teachers’ self—efﬁcacy prediction model.

1. Though the main mission of the bootstrap is not point estimation, the average of
the bootstrap estimates over B bootstrap replications can sometimes be closer to
the parameter value that the estimator based on the original sample. Thus, the

bootstrap may be viewed as both a point and interval estimation technique.

2. Efﬁciency of the usual MIN QUE and the bootstrap estimators of the parameters
of a model are typically affected by the nature and size of the tails of the
distribution of the errors and sets of random effects of the model. Both estimators
are less efﬁcient under a distribution with fairly long tick tails than under a
distribution with short think tails. In addition, the effect of the nature and size of
the tails of distribution tends to be more severe in estimating random eﬁ'ects than

ﬁxed effects of the model.

3. The bootstrap percentile method for the conﬁdence intervals about the
parameters 1", p, a,, and a, were successful at low intraclass correlation
conditions. At the 0.20 level of the intraclass correlation, the coverage probabilities
of the conﬁdence intervals about these parameters was quite low. However, at and
below the 0.05 level of the intraclass correlation condition, the bootstrap percentile

method of the conﬁdence intervals was shown to be highly promising.

129

4. The bootstrap’s ability to estimate the standard error of the statistics,
generating empirical sampling distribution of the estimators and thereby setting
conﬁdence intervals about parameters, without reference to any distributional
properties is the single most promising feature of the bootstrap. This ability was
very successfully demonstrated in the present study. Most importantly, the success
of the bootstrap point and interval estimation abilities were proved by comparing
the bootstrap estimates against the pre—determined true values of the model

parameters.

5. The MIN QUE and bootstrap estimate of the coefﬁcient of the covariates of the
model was surprisingly accurate. The bootstrap standard errors were extremely low
and bias was minimal. Even the bootstrap conﬁdence intervals about the parameter

[9 were extremely precise.

6. In applying the bootstrap and MIN QUE methods on the teachers’ self—efﬁcacy
prediction model, which contained several predictors, showed the promising ability
of the bootstrap and MIN QUE. The MINQUE which was once considered
computationally prohibitive can be used on such a large model with easy; even via
the bootstrap which involves repeated computation. The bootstrap algorithm can
be implemented on a large model of seven independent variables at a cost of no

more than 20 CPU time for one trial of 1000 replications.

7. For a statistic 8 used to estimate a parameter 0, the flmction R, deﬁned by
R = 8/0 was used to represent the relationship between 8 and 0. Given 8" as

the bootstrap version of 8 computed from repeated resampling, we deﬁne the

130

function R“ = 87 8 as an approximation to R. Through Monte Carlo simulations,
the distributions of R and R“ were found to be fairly similar, particularly in
terms of central tendency. The distributions diﬁered slightly in variability. The
distribution of R“ was slightly less variable than the distribution of R.

Recommen tions

Through Monte Carlo simulations, the bootstrap was demonstrated as a
promising approach to estimating the standard error of the statistic, generating its
sampling distribution and thereby setting conﬁdence intervals about a parameter.
This approach was empirically shown to work very well in estimating the
parameters of a mixed hierarchical model whose errors and random effects
parameters are either normally or double exponentially distributed. Applicability of
the bootstrap approach was further demonstrated in estimating the parameters of
the teachers’ self—efﬁcacy prediction model.

Implementation of the bootstrap method requires a great deal of computer
usage. Though modern fast and relatively inexpensive computers are readily
available, software to implement the bootstrap algorithm are currently unavailable.
Development of such software is highly recommended to make the bootstrap

available to research practitioners.

m i ns f r her rese h
Results of a simulation study are typically limited in their generalization to
the conditions examined in the study. The present study examined the Operation of
the bootstrap via MINQUE in estimating parameters of a mixed hierarchical model
when the errors and random effects are either normally or double exponentially
distributed. The study was done under three levels of the intraclass correlation

conditions.

131

The effectiveness of the bootstrap approach under severely skewed or heavily
tailed distributions remain to be seen. Studies to implement the bootstrap method
in examining the sampling distribution of estimators of parameters whose
underlying distributions are badly skewed like the gamma or heavily tailed like the
Cauchy are deemed necessary to fully understand the abilities and limitations of the
bootstrap approach.

The present study considered an hierarchical model consisting of "micro" and
"Macro" models with the assumption that only the intercepts were random. By
ﬁxing other coefﬁcients of the "micro" models simpliﬁed the study to one of
examining the variance components without covariates. A study to examine the
operation of the bootstrap in models involving not only variance components but
also covariance component will shed more light on the understanding of the
bootstrap in the hierarchical context.

The use of the bootstrap percentile method for the conﬁdence interval at
p = 0.20 was not very successful in estimating certain parameters. A more
promising bootstrap t—method for the conﬁdence interval was not used due to the
fact that the standard error of the MINQUE estimator was not known. Further
research geared to determining the standard error of MINQUE is deemed necessary

in order for the bootstrap users to utilize the t—method for the conﬁdence intervals.

APPENDICES

APPENDIX A

SUMMARY OF COMPUTATIONAL FORMULAE

The object of MINQUE the study was to ﬁnd the estimate {I of the variance
compont g of the two—level mixed model Y = 159 + Eb. The estimate g using

weights w0 and w, in the norm is given by

 

where
foo for

I" = {tr(lfwﬁﬁl’éeﬁJ} = [ J for k, k' = 0,1

10 11

and
I I no
I]. = {Y 13.591.13.10 = u,
for 13. = Y's. — y;‘¥(l5’y;‘)5)'¥'Y&‘ -
Let 1g = (H.125)— and 4.= Y;‘¥'Y;‘¥)‘¥'Y;‘

such that P, = Y? — A

a. w .
“’1
for n. = number of micro units in

 

_ 1 _
Ifwe deﬁne w — Til and cj — 1+(nj'1)w1

macro group j, then the following is the summary of the computational formulae:

132

 

133

———m—1- 1S 4.;
(a) Let If = We = gains.-
= {w 3 (X131 " °j¥i¥11¥h¥ﬂr
= {w 2 (x131 ‘ °j§j§i)}_
(b) 4.. = Y"¥I.<¥'Y;‘

= “’2 ’3 {Inj " “1%? 1i) 3231(5). ‘ 8‘?le 11)}

— 2 I I I I I
— W ”(19195: ‘ 8391951ng ‘ “1%? 1131531

I

2 I I
+ Cl?11?1j¥j’5¥j?lj?ll)-

2. F, matrix;

(a) foo = tr(‘1‘?) 4415316..)

where
tr(Y;’) = w’ 2 n,-{(1--cj)2 + c}(nj—1)}
tr(Y;‘A') = w3 2 {tr(tj) — cjaj[(1—c,-n,-)2 + (2—cjnj)]}
Where 3' = {(135ij and 31 = “(351155191919

which is a scalar simpliﬁed by a,- = tr(S;I~(§j) =

§jlf§r

(b) fox = f10 = "($299) - "(35216940
where
tr(Yfgg ,) = w’ )3 n,(1-cjnj)2

“(Memes = w’ 8 ail-sh?

134

(c) h. = as}; z z’- Z all - “(W z a a.)
where
tr(V 3; ,Z, Z ,Z',) = w2 E n}(1-cJ-n,)’
tr(V 2'29? ,,,ZZ )=w vlr"J-j-}3na(1-an-)3

3. g and UV;

(.) €v= (strains

= KX V'1Y

=w2KxJ. (1,,j-J..cz,J.z',J.)YJ

=w21_(XJ-— —J.Jch. z ,2}, YJ.)

= w )3 (13X, — cjrjlij) where ’1 = Z’uVJ- is the sum of V

elements in context j

(b)n,=w’22dJ7(Inj —J..'2ezz +cJ.’nJ.zz

”,,,,- )dJ fordj=YJ —X,-a

-llz 1J
w’ 2 (dJTdJ. —2cJ.<lJ.z,Jz',,clj + cfnjdEZuZ1jdj)
W2 ’3 (£119: ' 2°jhi + cjnjhi)

for h, = 21,9, = d'z,J. = 30!,- —x.§)

lid:

(c) u,=w2}3(1—c, -nj)2dZ,J-Z
= W2 2 h-2(1-C jnj)2
4. MINQUE forth e Fixed Effects:

Model: Y = X9 + Z!)

 

where V is (nxl) vector of n observations
X is an (nxp) matrix of known constants

g is a (pxl) vector of ﬁxed effects parameters

135

Z is an (nun) identify matrix (usually denoted by Z0

in the random or mixed models).

13 is (nxl) vector of residual error terms.

1,)" onn

Y.

ll
ta

12.? = 12. = ml.

which implies that

- 1
‘1'1 = a; In and
V? = 1.1,,
w: '
K = (s’yrzo‘ = w.( ’29-
s. = V;‘z<05'Y;‘z<)‘z<’Y" = Y;‘2$I..<¥'Y;‘
= ‘-, We as???
wo
_ 1 I _ I
—ﬁN¥p¥
F. = tr(ssz’w ) =..(13,p,)

where 13. = v: -Y;"5(Z"Y;‘¥)_ W:
= my? —Y;‘I}.)
= my?) - tr(Y;‘é.)
=_n -2.

2 2
W0 W0

136

~<

= (y.1 — y;‘zt(¥'Y;‘¥)‘ X'Yﬂ‘!
= 3,120; - ¥(¥'Y;‘Z‘)— 2"er
= Yé‘ (Y - 2&3)
for §= (x’wzcr in“!
thus,
= 01- {(9)332 (Y 45%)

=L(Y—x&)’(Y—x&
w2 ' H ' --
0

where £1: (x'v;IX)‘x'vJ;lY
= X X _X—
wo( ) “TOY _

= (is)?!

Thus, for the general ﬁxed effects model, the MINQUE estimator 83 = $1;

is given by
s}: = 2:19.
2
W0 1 “ I “
—.—.p--;,<¥—z<e> (if-2‘9)
=n—1:5(Y— Xa) (Y- Xa)

which is independent of the weights wo and w, . Consider for
example the simplist and naive model given by,
Yij = l‘ + ‘ij
In the notation of the form
Y = ’59 + @‘3 »
X is a (nxl) vector of 1’s

a=p isascalar

 

137

Z = In is an (nun) indentity matrix
b is a(nx1) vector of residual error terms.

Inthisspeciﬁccase, P=l and
a =(X X)- X Y: Y" (Grand mean).

I},=1—,(Y— Xa) (Y— X01)

1
= —2 2 (Yij ""Y..)2
W0
W3
- -1 _
811C}! that, Wlth F' — n—_p ,
the MINQUE estimator a: is given by
(73- — a: =F'1U'
2
w0 1
= — —2'1‘(Y--—Y..)2
11.1) W: 1]
1
= “Fr 2 (Yij - Y..)2
= s2

which is the moment estimator.

5. MINQQE for the one way rgdom eggs bﬂgced model:

where Y is (le) vector of N observations for N = nJ,
J = # of levels.

X is (le) vector of 1’s

9
II

p
[o ZJ
andZ is (NJ) block diagonal, each block

'N
H
LN

being a column of 1’s.

138

13 = [Po 9:]:
b0 = (le) vector of residual error terms
13, = (J x1) vector of J unobservable random
effects parameters.
(a) If = (¥'Y;1¥)— = {W “31:an ' eastw-
= {w 2 (n —cn2)}— since 11. = n c- = c

J ’ J
= {wnJ(1 — A)}'1 .
Thus

K = w_nJ('I:X)' for A = ﬁnd—2,;
(b) 4.. = .22(HJ.,)215,195;
= w’Ku - 1,2 2 29-253
(c) foo = tr(Y;’) — tr(yre.)
(i) tr(Y;’) = W” 2 1n,{(1-<=,-)2 + c,?(n,- - 1)}

= w’nJ((l-c)2 + c2(n—l)}

= w’nJ(1—2c + nc’)

= w’J(l—A)2 + (n—1)w2J
(ii) tr(V,;léw) = WK )3 nJ-(l—cjnj)3

= w’KnJ(l—A)3

= W3 mid-11')- nJ(l—A)3
= w"‘(1~-—A)2 .
Thus foo = w"‘J(l—A)2 + (n—l)w’J — w2(1—z\)2

= w’{(J—1)(1—»\)’ + (n—1)J}
= w’ {(J-l)(1—A)2 + J(n—1)}

(d)

Thus,

(6)

(f)

139

f0: = f10 = tr(Yé’?1?'1) -tr(V;1§wZ,Z’,)
(i) tr(szzlg’l) = “'2 2 113(1"Cj11j)2
= w2nJ(1 — A)
(ii) tr(Y;‘1},;,g',) = WK 2 nJe‘(1—cJ.nJ.)3
= w’Kn"'J(l--A)3
= W3 m n2J(l—/\)3
= w’n(l—A)2
f0, = {,0 = w’nJ(l—/\)2 — w’n(1—A)2
= w’n(l-A)’ (J—l)
= w2n(J—1) (l—A)”

f11 = ”052%? 1%? 1) - t(V;‘A'Z,Z 1%? 1)
(1) tr(Y;’?@ @1? 1) = W2 2 “JO—(3111).}
= w’n’J(1—A)’

(ii) tr(y;‘4.z.z1@@1) = w’ 2 Knju—cjnj)’

= waKn"’J(l--A)’I
= W3 m n3J(l-—/\)3
= w’n’(1—z\)2
=1 f,, = w’n’J(l—A)’ — w’n’(1—A)’
= w’n’(l—A)’ (J—l)

= w’n’(J—1) (l—A)’

= wK 2(1—A)r, (for rj = sum of Yij in context j)
= wKnJ(1—A) Y..

(h)

(i)

Thus,

ah?=,i3[§J(Y.J. —Y..=]’ 23(an .J.—nJ.Y..)2

=j§,n’(Y.J. — Y..)’ = n’2(Y.j — Y..)’
= n(SSB)

(iii) g,- = ngJ. = §j(Y,J. —Y..)2

:1 s gi=j231 :91“, --Y..)2 = SST
(i) u0 = w2 E (g,— 2c, h? + cJ-znjh}
= w’ {2gi—2c2 h} + c’n EhJ?}
= w’ {SST — (2cn)SSB + (c’n’)SSB}
= w2 {SSB(1—2A+A’) + SSW}
= w’ (SSB(1—A)’ + SSW}
(ii) 11, = w’ 2(1-cJ. n J)’h?
= w’(1—A)’ n.SSB
= w’n(l—A)’ SSB
D=det(F,) = foof11"f01f10 = foofu " f3,

(i)foofn = w"‘{(J-1)(1--)\)’ + J(II-1)}{VIV’II"’(J-1)(140”}

= w 112(1- mm -A)‘ + w‘n ’J(J—1)(n—1)(1—A)’
(ii) f3.=[w2n(J—1)(1—A)’1*=w nH—(J 112(1 —A)‘

D = f00,11 " {31

= w 1121(1—1)(n—1)(1—1\)2

141

(J) P = (foou1‘ 01%” D
(i) r 0,11,: w H{(J—l)(1—/\) + J(N—l)}{w’n(l—A)’SSB}
= w‘n(J—1)(1—A)‘ SSB + w‘nJ(n—1)(1—A)’SSB
(ii) £0,110: {w’n(J—l)(1—A)’}{w(SSB(1—A)2 + SSW)}
= w‘n(J--—1)(l—)\)4 SSB + w n(J—-1)(l—/\)2 SSW
Thus foou, — {,,,u0 = wn(l—A)2 [J(n-1)SSB - (J—l)SSW].

~

Thus
fr: = w‘n(l-A) 2 [J (n—l)SSB—(J—l ) SSW] g»
w‘n’J(J—1)(n—1)( l—A) ’

IF”...-
1 . .

_[J(n—l)SSB] — [(J-1)SSW]

 

 

 

n J(J— 1)(n-1)
_ J n—l SSB _ J—l SSW
_ — n— — n—
SBB SSW _MSB _MSW
=n( J—I) -nJ( n—l) n 11
As a result,
1;: = MSB-MSW

 

n
(which is the same as the method of moments).

(k) 3: = (fuuo - fo1“1)/D
(i) f,,u°= {w’n’(J—l)(l—A)’}{w’SB(l—A)2 + SSW}
= w n’(J—1)(l—A)‘SSB+w‘n’(J—1)(1—A)’SSW
(ii) fo,u,= {w’n(J—1)(1—A)’}{w2 n(1—/\)2SSB}
= w n’(J— —-1)(1 -/\)‘SSB
which implies that,
{,,uo — 0,u, = w‘n’(J-1)(l—A)’ ssw

such that

142

)

a: = “11% " {ONO/D
= w‘n’(J—l)(1-/\)’SSW
w‘n’J(J-l)(n-l)( l—A)’

 

_ SSW
_ III—17n-

= MSW

(which is the same as in the method of moments).

APPENDIX E
SAS/IML COMPUTER PROGRAMS
PART 1
COMPUTER PROGRAM TO IMPLEMENT TEE BOOTSTRAP ALGORITHM

ON A SAMPLE EIERARCEICAL DATA DRAWN FROM A NORMAL
POPULATION OP KNOWN PARAMETERS.

 

THE PROGRAM PIRST SETS UP THE 5 AND THE FIRST PART OP THE
; MATRIX EXCLUDING THE COVARIATES. THE CONSTRUCTION OF
THESE MATRICES ARE BASED ON THE NUMBER OF OBSERVATIONS IN
CELL TO SATISFY THE REQUIREMENTS AS IN EQUATION 2.9 IN
CHAPTER II. THE PROGRAM IDENTIFIES THE COMPONENTS ON EACH
AS DEMONSTRATED HY EQUATION 2.11 IN CHAPTER II.
THE WEIGHT :1 WAS DETERMINED SEPARATELY USING THE
HANUSHER (1974) METHOD.
PROC INL;
START;
I=1/(1-l1);
GROUPS=50;
NV11=REPEAT(20,2,1);
NV21=REPEAT(25,5,1);
NV31=REPEAT(30,IO,1);
NV1=NV11//NV21//NV31;
NV12=REPEAT(35,5,1);
NV22=REPEAT(40,3,1);
NV32=REPEAT(20,3,1);
NV42=REPEAT(25,5,1);
NV2=NV12I/NV22//NV32//NV42;
NVI3=REPEAT(30,10,1);
NV33=REPEAT(40,2,1);
NV3=NV13IINV23IINV33;
NV=NV1//NV2//NV3;
cv=I1/(1+(uv-1)ow1);
111=REPEAT(1,465,1);
212=REPEAT(1,4SO,1);
113=REPEAT(1,555,1);
101=REPEAT(O,465,1);
XOZ=REPEAT(O,4SO,1);
803=REPEAT(0,555,1);
xr=x11//xoz//x03;
xz=xo1//x12//xoa;
xa=x01//xoz//xra;
x4=xr||x2||x3;

OODGOGOGOQDOO
Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

143

A:

0......09.

144

PROGRAM SEGMENT TO GENERATE DATA FROM A NQRMAL POPULATION
OF SPECIFIED PARAMETER VALUES. THE PROGRAM FIRST DETERMINES
THE FIXED EFFECTS PARAMETERS AND THE COVARIATE WHICH IN

ARE USED IN TURN TO GENERATE THE OBSERVATIONS 1 THROUGH

THE EQUATION GIVEN BY:

Y = (X*ALPHA) + B + E

gggg: IS ANY NUMBER USED TO CREATE A RANDOM NUMBER OF
OBSERVATIONS FROM SOME POPULATION.

SEED = 10199;

*TIS THE INDEX COUNTER FOR THE NUMBER OF SIMULATION TRIALS

DO T = 1 TO 400;

 

REPPECTS = 2.2935 * NORMAL(REPEAT(SEED,GROUPS,1));
D1 = narrncrs[1,1];

N1 = NV[1,1];

8J1 = REPEAT(SI,N1,1);

DO I = 2 TO GROUPS;

SJ = REFPECTS[I,1];
N = NV[I,1];
EJI=BJIIIREPEAT(BJ,N,1):

END;

SOSOOSDSOOOOOOO

S=BJI;

n = 10 . NORMAL(REPEAT(SEED,1500,1));

:41 = REPEAT(25,1500,1);

:5 = INT(75 . UNIFORM(REPEAT(SEED,1500,1))) + :41;

x=X4||xs;

ALPHA = {-s,2,3,1.0};

Y = (X:ALPHA) + B + 3;
AT THIS POINT A SPECIFIC DATA an! HAS BEEN GENERATED IITH
THE FIXED nrrncrs ALPHA AND H AND E AS THE RANDOM ;
PARTS or THE MODEL. IHILE THE FIXED nrrncrs REMAINED AT
Tunas VALUES, THE RANDOM arrears PARAMETERS TOOK THE VALUES
AS SHOIN SELLOI:

 

 

INTRA-CLASS
DATA SET CORRELATION TAU SQUARE SIGMA SQUARE

 

1 0.01 1.00 100
2 0.05 5.26 100
3 0.20 25.00 100

 

Q. Q. Q. Q. Q. Q. Q. Q. Q. Q0

Q. Q. Q.

Q. Q.

Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

145

* 5 IS A MATRIX WHICH IS PART OF THE PROJECTION MATRIX PW
* GIVEN IN EQUATION 2.20 IN CHAPTER II. THE MATRIX 5 IS
GIVEN BY:

‘0

X=INV(X’VWIX)

THE FOLLOWING PROGRAM SEGMENT COMPUTES THE ELEMENTS OF THE

MATRIX 5.

Q-----------------------_--- ..... ———--- ----------- .. ------- ......---
x=o;

X1=0;
x2=o;
M=1;

N1=0;

DO J=1 TO GROUPS;
NJ=NV[J,I];
N1=N1+NJ;
XJ=X[M:N1,];
YJ=Y[M:N1,];
EIJ=REPEAT(1,NJ,1);
CJ=CV[J,1];
SJ=XJ‘*ZIJ;
X1=X1+(XJ‘*XJ);
X2=X2+(CJ*SJ*SJ‘);
M=M+NU;

END;

X:W*(X1-X2);

X=INV(X);

S... St S
Q. Q. Q. Q. Q. Q. Q. Q. Q.

a------ ---------------------- ---- ----------------- - ------------- ;
* DETERMINATION OF THE MATRIX FI: ;
* I! IS A (282) MATRIX ASSOCIATED WITH WEIGHTS Wk SHOWN ;
* IN EQUATION 2.21, AND WHOSE ELEMENTS ARE DETERMINED THROUGH ;
* EQUATION 2.26 THROUGH 2.28. ;
O ALPHAH IS A VECTOR OF THE ESTIMATES OF THE FIXED EFFECTS ;
* PARAMETERS OF THE MODEL BASED ON THE ORIGINAL DATA SET. ;
0 THUS, THE FOLLOWING PROGRAM SEGMENT DETERMINES THE MATRICES ;
* USED TO COMPUTE THE USUAL MINQUE ESTIMATES THAT ARE BASED ;
* ON THE ORIGINAL DATA SET. ;
a----------------- -------- ---------------------------------—-—--;

F001=o;

F002=0;

F011=0;

F012=0;

F111=0;

F112=0;

ALPHA1={0,0,0,0};

ALPHA2={0,0,0,0};

M=1;

N1=0;

 

146

DO J=1 TO GROUPS;
NJ=NV[J.1] ;
N1=N1+NJ;
XJ=X[M:N1,];
YJ=Y[M:N1,];
CJ=CV[J,1];
21J=REPEAT(1,NJ,1);
CN=CJ*NJ;
CJ2=CJ*CJ;
NJ:=NJ*NJ;
C2=(1-CJ)*(1-CJ);
TJ=TRACE(XJ‘*XJ*X);
SJ=XJ‘*EIJ;
AJ=SJ‘*X*SJ;
AC=AJOCJ;
CN1=1-CN;
CN12=CN1OCN1;
CN13=CN1*CN12;
AN=AJ*NJ;
RJ=EIJ‘*YJ;
F001=F001+(NJ*(C2+(CJZ*(NJ-1))
P002=P002+(TJ-AC*(CN12+(2-CN))
FOII=P011+(NJ*CN12);
F012=P012+(AJOCN13);
F111=F111+(NJ2*CN12);
F112=F112+(AN*CN13);
ALPHA1=ALPHA1+(X*XJ‘*YJ);
ALPHA2=ALPHA2+(CJ*RJ*X*SJ);
ALPHAH=ALPHA1-ALPHA2;
M=M+NU;

END;
I2=I*U;
I3=I2*U;
F001=I2*F001;
rooz=watrooz;
F011=I2*P011;
F012=l3*F012;
F111=I2*P111;
F112=I3*F112;
Foo=F001-F002;
F01=F011-F012;
F11=F111-F112;
ALPHAH:W*ALPHAH;
ALPHAHT=ALPHAH‘;

H:
H

147

‘0 ‘0 Q0 Q0 Q0 Q0 ‘0 Q0 Q0 ‘0 Q0 Q0 Q0 Q0 Q0 Q0

DETERMINATION OF THE MATRIX UW:
Q! IS A 2 DIMENSIONAL VECTOR OF QUADRATIC FORMS WHOSE
ELEMENTS ARE DENOTED BY u0 AND u1 (SEE EQUATION 2.22, 2.31
AND 2.32).
QETF IS THE DETERMINANT OF THE MATRIX FW USED TO OBTAIN
THE INVERSE OF THE (282) MATRIX FW.
SIGMAH IS THE INTRA-CLASS VARIANCE COMPONENT BASED ON THE
ORIGINAL HIERARCHICAL DATA SET.
TAUH IS THE INTER-CLASS VARIANCE COMPONENT ESTIMATE BASED
ON THE ORIGINAL DATA SET.
LAMDA IS THE INTRA-CLASS CORRELATION BASED ON THE ORIGINAL
SAMPLE AND COMPUTED BY THE FORMULA,

LAMDA = TAUH/(TAUH+SIGMAH)
001:0;
011:0;
M=1;
N1=o;
DO 3:1 TO GROUPS;
UJ=NVIJrllf
N1=N1+NJ;
XJ=X[M:N1,];
YJ=Y[M:N1,];
CJ=CV[J,1];
HIJ=REPEAT(1,NJ,1);
CN=CJ¢NJ;
N2=NJONJ;
=YJ-(XJ*ALPHAH);
HJ=ZIJ‘*DJ;
GJ=DJ‘*DJ;
HJ:=HJ*HJ;
CH2=CJ*H32;
CN12=(1-CN)*(1-CN):
=YJ-(XJﬁALPHAH);
001:001+(GJ-(an*(2-CN))):
U11=U11+(HJZ*CN12);
M=M+NU;

S.D...OOQOOOOOOO

 

* THIS MARKS THE END OF THE COMPUTATION OF THE USUAL MINQUE
* ESTIMATES BASED ON THE ORIGINAL SAMPLE. THE USUAL MINQUE
* ARE PRINTED AT THE FIRST LINE. ESTIMATES PRINTED AT THE
* PROCEEDING LINES ARE THE BOOTSTRAP REPLICATED ESTIMATES

* BASED ON THE RESAMPLED DATA FROM THE ORIGINAL SAMPLE

. _______ __ __ __ _-

Q. Q. Q. Q. ‘0 ‘0 ‘0

U0=W2*U01;
u1=wztn11;
DETF=(FOO*F11)-(FOI*F01);
SIGMAH=((F11*UO)-(F01*Ul))/DETF;
TAUH=((FOO*U1)-(F01*UO))IDETF;

148

*=__:=::___=__ _ _ == ______ __ ::_ _ __
* A PROCEDURE TO BOOTSTRAP THE PARAMETER ESTIMATES

* BY COMPUTING THE ESTIMATE B TIMES THROUGH RESAMPLING.

* 3 IS THE INDEX COUNTER WHICH COUNTS THE BOOTSTRAP REPLICATED

* SAMPLES. SEED1 IS THE RANDOM GENERATOR FOR THE BOOTSTRAP.

 

 

Q. Q. Q0 Q0 Q0 ‘0

SEED1 = 10199;
DO B=1 TO 200;

* A PROCEDURE USED TO RESAMPLE DATA FROM THE ORIGINAL DATA SET
* BY FIRST CREATING AN INDEX FOR EACH OBSERVATION. THIS

* PROCESS IS REPEATED B TIMES FOR SOME LARGE B REPRESENTING

* THE NUMBER OF BOOTSTRAP REPLICATIONS.

O

Q. Q. Q. Q. Q. ‘0

NT=1;
CONSTANT=REPEAT(NT,NV[1,1,1);
INDEX=NV[1,1*(UNIFORM(REPEAT(SEED1,NV[1,1,1)))+CONSTANT;
DO S=2 TO GROUPS;
NT=NT+NV[S-1,];
CONSTANT=REPEAT(NT,NV[S,1,1);
INDEX1=NV[S,1*(UNIFORM(REPEAT(SEED1,NV[S,1,1)))+CONSTANT;
INDEX:INDEX//INDEX1;
END;
INDEX=INT(INDEX);
YSTAR=Y[INDEX];
XSSTAR=X5[INDEX];
XSTAR=X4||XSSTAR;
YSTART=YSTAR‘;
a--—- ------------------------- - ----- ---- ----- -------------------;
* DETERMINE X=INV(X'VIIX) BASED ON THE REPLICATED COVARIATE ;
t VALUES ;
*-------------------- -------------- ----- ------ ------------------;
X=o;
x1=o;
32:0;
M=1;
N1=o;
DO 3:1 TO GROUPS;
NU=NV[J,1];
N1=N1+NJ;
XJ=XSTAR[M:N1,];
YJ=YSTAR[M:N1,];
EIJ=REPEAT(1,NJ,1);
CJ=CV[J,1];
SJ=XJ‘*ZIJ;
X1=X1+(XJ‘*XJ);
X2=X2+(CJ*SJ*SJ‘);
M=M+NU;
END;
X:I*(R1-R2);
X=INV(R) ;

 

149

* DETERMINATION OF THE MATRIX FM BASED ON THE REPLICATED R
* MATRIX.
* ALPHAH; IS THE ESTIMATE OF THE FIXED EFFECTS PARAMETERS BASED
* ON THE REPLICATED DATA SET.
.....................-............1..............-..............
roo1=o;
F002=0;
F011=o;
ro12=o;
F111=o;
r112=o;
ALPHA1={0,0,0,0};
ALPHA2={o,o,o,O};
M=1;
N1=O;
DO J=1 TO GROUPS;
NJ=NV[J,1];
N1=N1+NJ;
XJ=XSTAR[M:N1,];
YJ=YSTAR[M:N1,];
CJ=CV[J,1];
SIJ=REPEAT(1,NJ,1);
=CJ‘NJ;
CJ2=CJ*CJ;
NJ2=NJ*NJ;
C2=(1-CJ)*(1-CJ);
TJ=TRACE(XJ‘*XJ*R);
SJ=XJ‘*EIJ;
AJ=SJ‘*X*SJ;
AC=AJ*CJ;
CN1=1-CN;
CN12=CN1*CN1;
CN13=CN10CN12;
AN=AJ0NJ;
RJ=EIJ‘*YJ;
F001=F001+(NJ*(C2+(CJZ*(NJ-1))));
F002=F002+(TJ-AC*(CN12+(2-CN)));
F011=F011+(NJ*CN12);
F012=F012+(AJ*CN13);
F111=F111+(N32*CN12);
F112=F112+(AN*CN13);
ALPHA1=ALPHA1+(R*XJ‘*YJ);
ALPHA2=ALPHA2+(CJ*RJ*X*SJ);
ALPHAH1=ALPHA1-ALPHA2;
M=M+NJ;
END;
W2=N*l;
I3=W2*I;
F001=U2*F001;
F002=U3*F002;
F011=W2*F011;
F012=l3*F012;

Q. Q. Q. Q. ‘0 ‘0

150

F111=W2*F111:
F112=W3*F112;
F00=F001-F002;
F01=F011-F012;
F11=F111~F1123
ALPHAH1=W¢ALPHAH13
ALPHAH1T=ALPHAH1‘;

 

I
* DETERMINATION OF THE VECTOR UM BASED ON THE RESAMPLED Y ;
0 AND THE REPLICATED X MATRIX ;
* SIGMAHI IS THE INTRA-CLASS VARIANCE COMPONENT ESTIMATE ;
t BASED ON THE REPLICATED SAMPLE ;
* TAUHI IS THE INTER-CLASS VARIANCE COMPONENT ESTIMATE BASED ;
* ON THE REPLICATED SAMPLE. . ;
* LAMDA; IS THE INTRA-CLASS CORRELATION ESTIMATE BASED ON ;
* THE REPLICATED SAMPLE. ;
a — — - —=—=— — ___ _ _ _ _ _= ——==—;
U01=0;
U11=0;
M=1;
N1=0;

DO J=1 TO GROUPS;

NJ=NV[J,1];

NI=N1+NJ;

XJ=XSTAR[M:N1,];

YJ=YSTAR[M:N1,];

CJ=CV[J,1];

EIJ=REPEAT(1,NJ,1);

CN=CJ*NJ;

N2=NJ*NJ;

DJ=YJ-(XJ*ALPHAH1);

HJ=ZIJ‘*DJ;

GJ=DJ‘*DJ;

H32=HJ*HJ;

CH2=CJ*HJZ;

CN12=(1-CN)*(1-CN);

DJ=YJ-(XJ*ALPHAH1);

U01=UOI+(GJ-(CH2*(2-CN)));

UII=U11+(HJ2*CN12);

N=N+NU;
END;

U0=w20001;

01=w2t011;

DETF=(F00*F11)-(FOI*FOI);

TAUH1=((FOO*U1)-(FOI*UO))/DETF;

SIGMAHI=((F11*U0)-(F01*Ul))/DETF;
O--—------------------—--------—-----------------—-------------—--;
a THE NEXT PROGRAM SEGMENT PRINTS THE VALUE OF THE BOOTSTRAP ;
a AT EACH OF THE B BOOTSTRAP REPLICATION. ;
I

 

 

151

PRINT T (|PORNAT=4.o|) B (|PORNAT=4.0|) TAUH (IFORMAT=S.3|)
TAUHI (IFORMAT=S.3|) SIGMAH (IFORMAT=S.3|)
SIGMAHI (IFORMAT=S.3|);
PRINT ALPHAHT (IFORMAT=S.3|) ALPHAHIT (IFORMAT=S.3|);
SEED1 = SEED1 + 100;
END;
.__ _. _____ __ __ _ _= ___

* THE END OF THE BOOTSTRAP TRIAL BASED ON THE RESAMPLED DATA.

 

ANOTHER TRIAL WILL BE PERFORMED AFTER CHANGING THE SEED FOR
THE RANDOM SAMPLING ALGORITH.

SEED = SEED + 100;
END;

* THIS MARKS THE END OF THE SIMULATION TRIAL. EACH SUCH TRIAL

* RESULTS IN ONE SET OF THE USUAL MINQUE ESTIMATES AND B SETS
* OF THE BOOTSTRAP REPLICATED ESTIMATES. THE SUMMARY

0 STATISTICS FOR THE BOOTSTRAP REPLICATED ESTIMATES ARE ALSO
* COMPUTED.

FINISH;
RUN;

PART 2

COMPUTER PROGRAM TO IMPLEMENT THE BOOTSTRAP ALGORITHM
ON A SAMPLE HIERARCHICAL DATA DRAIN FROM A DQQELE
EXPONENTIAL POPULATION OF KNOWN PARAMETERS.

 

X MATRIX EXCLUDING THE COVARIATES. THE CONSTRUCTION OF
THESE MATRICES ARE BASED ON THE NUMBER OF OBSERVATIONS IN
CELL TO SATISFY THE REQUIREMENTS AS IN EQUATION 2.9 IN
CHAPTER II. THE PROGRAM IDENTIFIES THE COMPONENTS ON EACH
AS DEMONSTRATED BY EQUATION 2.11 IN CHAPTER II.
THE WEIGHT 1; WAS DETERMINED SEPARATELY USING THE
HANUSHEK (1974) METHOD.
PROC IML;
START;

=1/(1-I1);

GROUPS=50;

NV11=REPEAT(20,2,1);

NV21=REPEAT(25,5,1);

NV31=REPEAT(30,10,1);

NV1=NV11//NV21//NV31:

NV12=REPEAT(35,5,1);

NV22=REPEAT(40,3,1);

NV32=REPEAT(20,3,1);

NV42=REPEAT(25,5,1);

NV2=NV12//NV22//NV32//NV42;

NV13=REPEAT(30,10,1);

.D....I‘OOOO.

TEE PROGRAM FIRST SETS UP THE E ANDVTHE FIRST PART OF TEE

Q. Q. Q. Q. Q.

Q. Q. Q. Q. Q. Q. Q.

Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

 

152

NV23=REPEAT(3S,5,1);
NV33=REPEAT(40,2,1);
NV3=NV13//NV23//NV33;
NV=NV1//NV2//NV3;
cV=I1/(1+(NV-1)*W1);
XI1=REPEAT(1,465,1);
X12=REPEAT(1,480,1);
x13=REPEAT(1.555,1);
XOI=REPEAT(0,465,1);
xoz=REPEAT(o,ASo,1);
X03=REPEAT(0,555,1);
:1=x11//xoz//xoa;
xz=x01//312//xoa;
Ia=x01//xoz//113;
X4=x1||x2||xa;

PROGRAM SEGMENT TO GENERATE DATA FROM A DOUBLE EXPONENTIAL
POPULATION OF SPECIFIED PARAMETER VALUES.

THE PROGRAM FIRST DETERMINES THE FIXED EFFECTS PARAMETERS
TOGETHER WITH THE RANDOM EFFECTS PARAMETERS

AND THE COVARIATE WHICH ARE USED IN TURN TO GENERATE THE
OBSERVATIONS 1 THROUGH THE EQUATION GIVEN BY:

Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

Y = (X*ALPHA) + B + E

ﬁﬁgﬂl AND SEED2: ARE ANY NUMBERS USED TO CREATE A RANDOM
NUMBER OF OBSERVATIONS FROM A DOUBLE EXPONENTIAL POPULATION.

SIES'O1DS'OIDI'OIDS'.

1 IS THE INDEX COUNTER FOR THE NUMBER OF SIMULATION TRIALS
‘------------------- ----- --------- ----- -------------------------
DO T = 1 TO 400;

SEED1 = 100999;

SEED: = 12399;

I
0 T IS THE INDEX COUNTER FOR THE NUMBER OF SIMULATION TRIALS ;
0-—---------------—-—--—-——---------------—----—--—-------------;
DO T = 1 TO 1;
UR1=UNIFORM(REPEAT(SEED1,GROUPS,1));
UR2=UNIFORM(REPEAT(SEED2,GROUPS,1));
LR= -1 * LOG(UR1);
TR1 = REPEAT(1,GROUPS,1);
TR2 = TR1 f (UR2 >= 0.5);
TR3 = -1*(TR1 I (UR: < 0.5));
TR4 = TR2 + TR3;
REFFECTS = 2.2935 * ((LR#TR4)/SQRT(2));
O—n- ————————————— -— --------------------- -- ---------------------- ;
31 = REFFECTS[1,1];
"1 = NV[1.1]:
3.11 = REPEAT(B1,N1,1);
DO I = 2 TO GROUPS;
B3 = REFFECTS[I,1];
N = xvlllll;
BJI=BJIIIREPEAT(BJ,N,1);
END;

 

153

B=BJI;
E = 10 * NORMAL(REPEAT(SEED,1500,1));
301 = REPEAT‘25,1500,1);
H5 3 INT‘75 * UNIFORM(REPEAT(SEED,1500,1’)) + X41;
x=x4||xs;
ALPEA = {-s,2,3,1.0};
Y = (H‘ALPHA’ + D + E;
‘u...------.--------------- --------- -----------------------------
. AT THIS POINT A SPECIFIC DATA SET HAS BEEN CENERATED 'ITH
* THE FIXED EFFECTS ALPHA AND E AND E AS THE RANDOM ;
* PARTS OF THE MODEL. WHILE THE FIXED EFFECTS REMAINED AT
. THESE VALUES, THE RANDOM EFFECTS PARAMETERS TOOK THE VALUES
* AS SHO'N HELLO":

 

 

INTRA-CLASS
DATA SET CORRELATION TAU SQUARE SIGMA SQUARE
1 0.01 1.00 100
2 0.05 5.26 100
3 0.20 25.00 100

 

5 IS A MATRIX WHICH IS PART OF THE PROJECTION MATRIX PW
GIVEN IN EQUATION 2.20 IN CHAPTER II. THE MATRIX L IS
GIVEN BY:

K=INV(X’VNIX)

OSOSQSOSSOOSISSI’

* THE FOLLOWING PROGRAM SEGMENT COMPUTES THE ELEMENTS OF THE

* MATRIX A.

.-----------—- -------------------------- —---—----- ------ -_-.........-
X=0;

K1=0;
X2=0;
M=1;

N1=0;

DO J=1 TO GROUPS;
NJ=NV[J,1];
N1=N1+NJ;
XJ=X[M:N1,];
YJ=Y[M:N1,];
21J=REPEAT(1,NU,1);
CJ=CV[J,1];
SJ=XJ‘*ZIJ;

K1: K1+(XJ‘*XJ);
X2=X2+(CJ*SJ*SJ‘);
M=M+NJ;

END;

X=W*(K1-X2);
K=INV(K);

Q. Q.

Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

 

154

I

* DETERMINATION OF THE MATRIX FI: ;
t I! IS A (282) MATRIX ASSOCIATED WITH HEIGHTS Wk SHOWN ;
* IN EQUATION 2.21, AND WHOSE ELEMENTS ARE DETERMINED THROUGH ;
* EQUATION 2.26 THROUGH 2.28. ;
* ALPHAH IS A VECTOR OF THE ESTIMATES OF THE FIXED EFFECTS ;
* PARAMETERS OF THE MODEL BASED ON THE ORIGINAL DATA SET. ;
* THUS, THE FOLLOWING PROGRAM SEGMENT DETERMINES THE MATRICES ;
0 USED TO COMPUTE THE USUAL MINQUE ESTIMATES THAT ARE BASED ;
* ON THE ORIGINAL DATA SET. ;
*---------------------------------------------------------------;

F001=0;

F002=0;

F011=0;

F012=0;

F111=0;

F112=0;

ALPHA1={0,0,0,0};

ALPHA2={0,0,0,0};

M=1;

N1=O;

DO J=1 TO GROUPS;
NJ=NV[J,1];
N1=N1+NJ;
XJ=X[M:N1,];
YJ=Y[M:N1,];
CJ=CV[J,1];
E1J=REPEAT(1,NJ,1);
CN=CJ¢NJ;
C32=CJ*CJ;
N32=NJ*NJ;
C2=(1-CJ)*(1-CJ);
TJ=TRACE(XJ‘*XJ*X);
SJ=XJ‘*ZIJ;
AJ=SJ‘*X*SJ;
AC=AJOCJ;
CN1=1-CN;
CN12=CN1*CN1;
CN13=CN1*CN12;
AN=AJ*NJ;
RJ=ZIJ‘*YJ;
F001=F001+(NJ*(C2+(CJ2*(NJ-1))));
F002=F002+(TJ-AC*(CN12+(2-CN)));
F011=F011+(NJ*CN12);
F012=F012+(AJ*CN13);
F111=F111+(NJ2*CN12);
F112=F112+(AN*CN13);
ALPHA1=ALPHA1+(K*XJ‘*YJ);
ALPHA2=ALPHA2+(CJ*R3*X*SJ);
ALPHAH=ALPHA1-ALPHA2;
M=M+NJ;

END;

 

155

W2=N*W;

W3=W2*';
F001=W2*F001;
F002=W3*F002;
F011=W2*F011;
F012=W3*F012;
F111=N2*F111;
F112=W3*F112;
F00=F001-F002;
F01=F011-F012;
F11=F111-F112;
ALPHAH=W*ALPHAH;
ALPHAHT=ALPHAH‘;

DETERMINATION OF THE MATRIX UW:

9! IS A 2 DIMENSIONAL VECTOR OF QUADRATIC FORMS WHOSE
ELEMENTS ARE DENOTED BY u0 AND U1 (SEE EQUATION 2.22, 2.31
AND 2.32).

QETF IS THE DETERMINANT OF THE MATRIX FW USED TO OBTAIN
THE INVERSE OF THE (282) MATRIX FM.

SIGMAH IS THE INTRA-CLASS VARIANCE COMPONENT BASED ON THE
ORIGINAL HIERARCHICAL DATA SET.

TAUH IS THE INTER-CLASS VARIANCE COMPONENT ESTIMATE BASED
ON THE ORIGINAL DATA SET.

LAAQA IS THE INTRA-CLASS CORRELATION BASED ON THE ORIGINAL
SAMPLE AND COMPUTED BY THE FORMULA,

 

OOSSSOOSSSSOOOS
Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

LAMDA = TAUH/(TAUH+SIGMAH)
*--------------== ____ _-— __ _
U01=0;
011:0;
M=1;
N1=0;
DO J=1 TO GROUPS;
NJ=NV[J,1];
N1=N1+NJ;
XJ=X[M:N1,];
YJ=Y[M:N1,];
CJ=CV[J,1];
EIJ=REPEAT(1,NJ,1);
CN=CJ¢NJ;
N2=NJ*NJ;
DJ=YJ-(XJ*ALPHAH);
HJ=21J‘*DJ;
GJ=DJ‘*DJ;
HJ2=HJ¢HJ;
CH2=CJ¢H32;
CN12=(1-CN)*(1-CN);
DJ=YJ-(XJ*ALPHAH);
U01=UOI+(GJ-(CH2*(2-CN)));
U11=U11+(HJ2*CN12);
M=M+NJ;
END;

 

156

TEIS MARMS TEE END OF TEE COMPUTATION OF TEE USUAL MINQUE
ESTIMATES BASED ON TEE ORIGINAL SAMPLE. TEE USUAL MINQUE
ARE PRINTED AT THE FIRST LINE. ESTIMATES PRINTED AT TEE
PROCEEDING LINES ARE TEE BOOTSTRAP REPLICATED ESTIMATES
BASED ON TEE RESAMPLED DATA FROM THE ORIGINAL SAMPLE
.---——------——-—-------- ----- ..-—----——-—--------—-—--------..---
U0=W2*U01;
U1=W2*U11;
DETF=(F00*F11)-(F01*FO1);
SIGMAH=((F11*U0)-(F01*Ul))/DETF;
TAUE=((PooaU1)-(P01aU0))/DETP;
.-_-—---- ----------------------------------- ---—--------- .......
a A PROCEDURE TO BOOTSTRAP TEE PARAMETER ESTIMATES
9 BY COMPUTING TEE ESTIMATE B TIMES THROUGH RESAMPLING.
. ﬁ_IS TEE INDEX COUNTER NEICE COUNTS TEE BOOTSTRAP REPLICATED
a SAMPLES. SEED1 IS TEE RANDOM GENERATOR FOR TEE BOOTSTRAP.
.----—--- ............................ ——------—--------- ...... ---
SEED1 = 10199;
DO B=1 TO 200;
.----—---—-——- ----- - -------- .- --------- -----------_----_---_-----
a A PROCEDURE USED TO RESAMPLE DATA FROM TEE ORIGINAL DATA SET
9 BY FIRST CREATING AN INDEX FOR EACH OBSERVATION. TEIS
. PROCESS IS REPEATED B TIMES FOR SOME LARGE B REPRESENTING
. TEE NUMBER OF BOOTSTRAP REPLICATIONS.
. -----—-—--—-- ------------------------ ------------------------
NT=1;
CONSTANT=REPEAT(NT,NV[1,],1);
INDEX=NV[1,]*(UNIFORM(REPEAT(SEED1,NV[1,1,1)))+CONSTANT;
DO S=2 TO GROUPS;
=NT+NV[S-1,];
CONSTANT=REPEAT(NT,NV[S,1,1);
INDEX1=NV[S,]*(UNIFORM(REPEAT(SEED1,NV[S,1,1)))+CONSTANT;
INDEx=INDEx//INDEx1;
END;
INDEX=INT(INDEX);
YSTAR=Y[INDEX];
XSSTAR=X5[INDEX];
XSTAR=X4||XSSTAR;
YSTART=YSTAR‘;

3
* DETERMINE K=INV(X’VWIX) BASED ON THE REPLICATED COVARIATE ;
3

S’SIDS'.
Q. Q. Q. Q. Q. Q. ‘0

Q. Q. Q. Q. Q. Q.

Q. Q. Q. Q. Q. ‘0

* VALUES

.---------------- -------- ----- ---------- --------—----- ----- --—--,
X=0;

X1=0;
X2=0;
M=1;

N1=0;

DO J=1 TO GROUPS;
NJ=NV[J,1];
N1=N1+NJ;

=XSTAR[M:N1,];
YJ=YSTAR[M:N1,];

157

21J=REPEAT(1,NJ,1);
CJ=CV[J,1];
SJ=XJ‘*ZIJ;
X1=X1+(XJ‘*XJ);
X2=X2+(CJ*SJ*SJ‘);
M=M+NJ;
END;
X=l*(R1-R2);
K=INV(K);
.-------——----- --------------- .- ----------- - .................. ---
* DETERMINATION OF THE MATRIX Fl BASED ON THE REPLICATED X
* MATRIX.
* ALPHAH1 IS THE ESTIMATE OF THE FIXED EFFECTS PARAMETERS BASED
* ON THE REPLICATED DATA SET.
j-------——-—-------—- ....... - .......... ... ...... - --------------- .-
F001=0;
F002=0;
F011=0;
F012=0;
F111=0;
F112=0;
ALPHA1={0,0,0,0};
ALPHA2={0,0,0,0};
M=1;
N1=0;
DO 3:1 TO GROUPS;
NJ=NV[J,1];
N1=N1+NJ;
XJ=XSTAR[M:N1,];
TJ=YSTAR[M:N1,];
CJ=CV[J,1];
21J=REPEAT(1,NJ,1);
CN=CJ¢NJ;
CJz=CJOCJ;
N32=NJ*NJ;
C2=(1-CJ)*(1-CJ);
TJ=TRACE(XJ‘*XJ*X);
SJ=XJ‘*ZIJ;
AJ=SJ‘*X*SJ;
AC=AJ*CJ;
CN1=1-CN;
CN12=CN1¢CN1;
CN13=CN1*CN12;
AN=AJ*NJ;
RJ=Z1J‘*YJ;
F001=F001+(NJ*(C2+(CJ2*(NJ-1)))
F002=F002+(TJ-AC*(CN12+(2-CN)))
F011=F011+(NJ*CN12);
F012=F012+(AJ*CN13);
F111=F111+(N32*CN12);
F112=F112+(AN*CN13);
ALPHA1=ALPHA1+(X*XJ‘*YJ);
ALPHA2=ALPHA2+(CJORJ*R*SJ);

Q. Q. Q. Q. Q. ‘0

 

)3

158

ALPHAH1=ALPHA1-ALPHA2;
M=M+NJ;

END;
W2=I*I;
I3=W2*I;
F001=I2*F001;
F002=I3*F002;
F011=W2*F011;
F012=W3*F012;
F111=W2*F111;
F112=I3*F112;
F00=F001-F002;
F01=F011-F012;
F11=F111-F112;
ALPHAH1=WtALPHAH1;
ALPHAH1T=ALPHAH1‘;

a DETERMINATION OF TEE VECTOR UW BASED ON TEE RESAMPLED Y
9 AND TEE REPLICATED X MATRIX
a SIGMAHI IS TEE INTRA-CLASS VARIANCE COMPONENT ESTIMATE
a BASED ON TEE REPLICATED SAMPLE
a TAUH1 IS TEE INTER-CLASS VARIANCE COMPONENT ESTIMATE BASED
0 ON TEE REPLICATED SAMPLE.
. LAAQA; IS TEE INTRA-CLASS CORRELATION ESTIMATE BASED ON
a TEE REPLICATED SAMPLE. ‘
.------------------------------------------------------- --------
U01=0;
U11=0;
M=1;
N1=o;

DO 3:1 TO GROUPS;

NJ=NV[J,1];

N1=N1+NJ;

XJ=XSTAR[M:N1,];

YJ=YSTAR[M:N1,];

CJ=¢V[J.1];

21J=REPEAT(1,NJ,1);
=CJ¢NJ;

N2=NJ*NJ;
=YJ-(XJ*ALPHAH1);

HJ=E1J‘*DJ;

GJ=DJ‘*DJ;

BU:=EJ~EJ;

CH2=CJ*HJ2;

CN12=(1-CN)*(1-CN);

DJ=YJ-(XJ*ALPHAH1);

U01=UO1+(GJ-(CH2*(2-CN)));

U11=UII+(HJ2*CN12);

M=M+NJ;

END;

Q. Q. Q. Q. Q. Q. Q. Q. Q.

‘0

159

UO=U2*UOI;
U1=W2*U11;
DETP=(P009P11)-(P019P01);
TAUE1=((Poo9U1)- (P019U0))/DETP;
SIGMAE1=((P119UU)-(Po1901))/DETP;
.---—---— ------ --- ----------------- ------------ ---------- --_---
9 THIS PROGRAM SEGMENT PRINTS TEE VALUE or TEE BOOTSTRAP
9 AT EACH OF TEE B BOOTSTRAP REPLICATION.
. --------- .- ----------------------------- -----—- ----- u... ------- .—
PRINT T (IFORMAT=4.0|) B (IFORMAT=4. 0|) TAUH (IFORMAT=S. 3|)
TAUE1 (IFORMAT=S. 3|) SIGMAE (IFORMAT=S. 3|)
SIGMAEI (IFORMAT=S. 3|);
PRINT ALPEAET (IFORMAT=S. 3|) ALPHAH1T (IFORMAT=S.3|);
SEED1 = SEED1 + 100;
SEED2 = SEED2 + 100;
END;
.--—---——-- --------- ---— ----------------- —-----------——--------_
9 TEE END OF TEE BOOTSTRAP TRIAL BASED ON TEE RESAMPLED DATA.
9 ANOTHER TRIAL WILL BE PERFORMED AFTER CEANGING TEE SEED FOR
9 TEE RANDOM SAMPLING ALGORITH.
.------——-—----- ------------- - ------ ---------———--——--..-—-__----
END;
.---------—-----—---- ------------ ---——----——-------- -------- ----
9 THIS MARKS TEE END OF TEE SIMULATION TRIAL. EACE SUCH TRIAL
9 RESULTS IN ONE SET OF TEE USUAL MINQUE ESTIMATES AND B SETS
9 OF TEE BOOTSTRAP REPLICATED ESTIMATES. TEE SUMMARY
9 STATISTICS FOR TEE BOOTSTRAP REPLICATED ESTIMATES ARE ALSO
9 COMPUTED.
Q—-----—-—-----———--—-—-—-——- ------- —-—-----—----—-------........-_..
FINISH;
RUN;

PART 3
COMPUTER PROGRAM TO SIMULATE THE SAMPLING DISTRIBUTION

OF THE MINQUE ESTIMATE FOR A SAMPLE DRAIN FROM A
HORMAL POPULATION OF KNOWN PARAMETERS.

 

THE PROGRAM FIRST SETS UP THE A AND THEAFIRST PART OF THE
A MATRIX EXCLUDING THE COVARIATES. THE CONSTRUCTION OF
THESE MATRICES ARE BASED ON THE NUMBER OF OBSERVATIONS IN
CELL TO SATISFY THE REQUIREMENTS AS IN EQUATION 2.9 IN
CHAPTER II. THE PROGRAM IDENTIFIES THE COMPONENTS ON EACH
AS DEMONSTRATED BY EQUATION 2.11 IN CHAPTER II.

THE WEIGHT 1; HAS DETERMINED SEPARATELY USING THE
HANUSHEK (1974) METHOD.

.-----------------------------------------------------. ------ ---

ISOOOSSSSOOS

‘0

Q. Q. Q.

Q. Q. Q. Q. Q.

Q. Q. Q. Q. Q. Q. Q.

Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

PROC IML;

.D....OSSOO

*IIS THE INDEX COUNTER FOR THE NUMBER OF SIMULATION TRIALS

START;

W=1/(1-W1);
GROUPS: so;
NV11-REPEAT(20,2,1);
NV21=REPEAT(25,5,1);
NV31=REPEAT(30,10,1);
NV1=NV11//NV21//NV31;
NV12=REPEAT(35,5,1);
NV22=REPEAT(40,3,1);
NV32=REPEAT(20,3,1);
NV42=REPEAT(25,5,1);

NV2=NV12IINV22I/NV32//NV42;

NV13=REPEAT(30,10,1);
NV23=REPEAT(35,5,1);
NV33=REPEAT(40,2,1);
NV3=NV13//NV23//NV33;
NV=NV1//NV2//NV3;
CV=w1/(1+(NV-1)9l1);
X11=REPEAT(1,465,1);
X12=REPEAT(1,480,1);
213=REPEAT(1,555,1);
X01=REPEAT(O,465,1);
X02=REPEAT(0,480,1);
X03=REPEAT(0,555,1);
X1=X11//X02//X03;
X2=X01//X12//X03;
X3=X01//X02//X13;

160

PROGRAM SEGMENT TO GENERATE DATA FROM A NORMAL EOPQEATEO!

OF SPECIFIED PARAMETER VALUES.

THE EQUATION GIVEN BY:

Y = (X*ALPHA) + B + E

SEED: IS ANY NUMBER USED TO CREATE A RANDOM NUMBER OF

OBSERVATIONS FROM SOME POPULATION.

SEED = 10199;

THE PROGRAM FIRST DETERMINES
THE FIXED EFFECTS PARAMETERS AND THE COVARIATE WHICH IN
ARE USED IN TURN TO GENERATE THE OBSERVATIONS 1 THROUGH

SEED = 199;

D0 T = 1 TO 1000;

REFFECTS = 2.2935 * NORMAL(REPEAT(SEED,GROUPS,1));

B1 = REFFECTS[1,1];
N1 = NV[1,1];
B31 = REPEAT(B1,N1,1);

Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

Q. Q. Q.

161

DO I = 2 TO GROUPS;
33 = REFFECT8[I,1];
N = NV[1.1]:
BJ1=BJllIREPEAT(BJ,N,1);
END;
8:831;
3 = 10 t NORMAL(REPEAT(BEED,1500,1));
:41 = RBPEAT(25,1SOO,1);
X4 = INT(75 * UNIFORM(REPEAT(BEED,1500,1))) + 241;
x:x1||x2||xa||x.;
ALPHA = {-512030100};
Y = (8*ALPHR) + B + B;

 

I
* DETERMINE E=INV(X'VIIX) ;
.----- ----------- - ---------- ----- ----------- -------------------;
x=°. '
31:0;
H2=O;
M=1;

N1=NV[1,1];

DO 3:1 TO GROUPS;
XJ=X[M:N1,];
YJ=Y[M:N1,];
N3=NRO'(YJ);
EIJ=REPEAT(1,NJ,1);
CJ=CV[J11]3
SJ=XJ‘*ZIJ;
H1=K1+(XU‘*XJ);
E2=K2+(CJ*SJ*SJ‘);
M=M+NV[J,1];
N1=N1+NV[J,1];

END;
x='*(EI-K2);
E=INV(K)3

3
* DETERMINATION OF THE MATRIX FI ;
3

 

roo1=o;
rooz=o;
PO11=0;
ro12=o;
r111=o;
[112:0;
ALPHA1={0,0,0,0};
ALPHAZ={0,0,0,0};
8:1;
31=HVI10113

DO J=1 To GROUPS;
XJ=X[M:N1,];
YJ=Y[H:N1,];
CJ=CVIJ0113
NJ=NROI(YJ);

162

21J=RBPBAT(1,NJ,1);
CN=CJ*NJ;

CJ2=CJ*CJ;

NJZ=NJ¢NJ;
cz=(1-CJ)*(1-CJ);
TJ=TRACB(XJ‘*XJ*R);
8J=XJ‘*21J;

AJ=BJ‘*K*SJ;

AC=AJOCJ;

CN1=1-CN;

cu12=cu1tcn1;
cu13=cu1tcx12;

AN=AJONJ;

RJ=Z1J‘*YJ;
1001=P001+(NJ*(C2+(CJZ*(NJ-1))
1002=r002+(TJ-Ac*(CN12+(2-CN))
ro11=r011+(NJ*CN12);
r012=ro12+(AJ*cR13);
r111=r111+(naztcn12);
r112=r112+(AN*CN13);
ALPHA1=ALPHA1+(K*XJ‘*YJ);
ALPHAZ=ALPHR2+(CJtnatxtsJ);
ALPHAH=ALPnh1-ALPHAZ;
K=M+NV[J,1];
N1=81+NV[J,1];

H;
)3

END;

'2='*';

N3=I2*';
F001=U2*F001;
F002='3*F002;
F011='2*F011;
F012=I3*FO12;
F111='2*F111;
F112=U3*F112;
FOO=FOO1-F002;
FO1=FO11-F012;
F11=F111-F112;
ALPHAH=I*ALPHAH;
ALPHAHT=ALPHAH‘;

* ----------------------------------------------- - --------------- ;

DETERMINATION OF THE

MATRIX UN

.
I

.----------------- --------- - --------- ---------------------------'.

001:0;
011:0;

8:1;
N1=NV[1,1];

DO J=1 TO GROUPS;

xJ=X[R:R1,];
YJ=Y[M:N1,];
CJ=CV[J,1];
RJ=RRow(!J);
21J=REPEAT(1,NJ,1);
CN=CJ*NJ;

163

N2=NUONJ;
DJ=YJ-(XJ*ALPHAH);
HJ=21J‘*DJ;
GJ=DJ‘*DJ;
RJ2=RJ*RJ;
cnz=csanaz;
CN12=(1-CN)*(1-CN);
DJ=YJ-(XJ*ALPHAH);
UO1=UOl+(GJ-(CHZ*(2-CN)));
011:011+(HJZ*CN12);
M=M+NV[J,1];
81=N1+NV[J,1];
SEED = SEED + 10;

nun;
uo=wztuoz;
01=W2*011;
DETF=(F00*F11)-(F01*F01);
SIGMAH=((F11*00)-(F01*01))/DETF;
TAUR=((200*01)-(ro1auo))/DBTR;
manna] (mumsmnm ;
PRINT S TAUH SIGMAH LAMDA ALPEAET;

END;

PRINT SRRU;

FINISH;

RUN;

PART 4
COMPUTER PROGRAM TO SIMULATE THE SAMPLING DISTRIBUTION

OF THE MINQUE ESTIMATE FOR A SAMPLE DRAIN FROM A
DOUBLE EXPONENTIAL POPULATION OF KNOWN PARAMETERS.

 

THE PROGRAM FIRST SETS UP THE A AND THE FIRST PART OF THE
A MATRIX EXCLUDING THE COVARIATES. THE CONSTRUCTION OF
THESE MATRICES ARE BASED ON THE NUMBER OF OBSERVATIONS IN
CELL TO SATISFY THE REQUIREMENTS AS IN EQUATION 2.9 IN
CHAPTER II. THE PROGRAM IDENTIFIES THE COMPONENTS ON EACH
AS DEMONSTRATED BY EQUATION 2.11 IN CHAPTER II.

THE WEIGHT :1 WAS DETERMINED SEPARATELY USING THE
HANUSHEK (1974) METHOD.

PROC IML;
START;
I=1/(1-'1);
GROUPS=50;
NV11=REPEAT(20,2,1);
NV21=REPEAT(25,5,1);
NV31=REPEATC30,10,1);
NV1=NV11IINV21IINV31;
NV12=REPEAT(35,5,1);
NV22=REPEAT(40,3,1);
NV32=REPEAT(20,3,1);

..ODOOOOI’...

Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q. Q.

164

NV42=RSPSAT(25,5,1);
RV2=RV12//RV22//NV32//RV42;
NV13=REPEAT(30,10,1);
NV23=REPEAT(3S,5,1);
RV33=RSRRAT(40,2,1);
NV3=NV13IINV23IlNV33;
NV:NV1//NV2//NV3;
CV=W1/(1+(NV-1)*I1)3
X11=REPEAT(1,465,1);
X12=REPEAT(1,480,1);
x13=RRpSAT(1,sss,1);
XU1=RSRRAT(0,465,1);
xoz=RszAT(o,4so,1);
xoa=RSpRAT(o,sss,1);
11=X11//xoz//xoa;
xz=x01//x12//xoa;
xa=xo1//xoz//x13;
*--- ------- .- ------- --- --------------- --------------- ....... ----—
0 PROGRAM SEGMENT To GENERATE DATA FROM A DOUBLE EXPONENTIAL
a POPULATION or SPECIFIED PARAMETER VALURS.

= 10199;
SEED2 = 11099;

DO S = 1 TO 1000;
UR1=UNIFORM(REPEAT(SEED1,GROUPS,1)
UR2=UNIFORM(REPEAT(SEED2,GROUPS,1)
LR: '1 * LOG(UR1);

vv
‘0 ‘0

TR1 = REPEAT(1,GROUPS, 1);

TR2 = T31 I (0R2 >= 0.5);

TR3 = -1*(TR1 I (URZ < 0.5));
TR. = TR2 + TR3;

REFFECTS = 2.2935 * ((LR#TRA)/BQRT(2));
B1 = REFFECT8[1,1];
N1 = NV[1,1];
331 = REPEAT(B1,N1,1);
DO I = 2 TO GROUPS;
DJ = REFFECT8[I,1];
N = uvllolli
BJ1=BJ1IIRBPEAT(BJ,N,1);
END;
3:831;
UB1=UNIPORM(RBPBAT(SEED1,1500.1))
032=UNIFORM(RBPBAT(83302,1500.1))
LB: -1 0 LOG(UB1);

o
I
o
I

TE1 = REPEAT(1,1500,1);

TE2 = TE1 I (UE2 >= 0.5);

TE3 = -1*(TE1 # (UEZ < 0.5));
TE‘ = TEZ + TE3;

= 1a a ((LE#T34)ISQRT(2));

:41 = REPEAT(25,1500,1);

x4 = INT(75 a URIRORR(RSRRAT(SSSU1,1500,1))) + :41;
x=x1||x2||xa||x¢;

Q. Q. Q. Q.

165

ALPHA = {-5'2'3'100};
Y = (X*ALPHA) + S + E;

.------------------------------------------ ----- ----------------

DETERMINE X=INV(X’VUIX)

:0;
31:0;
R2=0;
M=1;
R1=NV[1,1];

DO J=1 TO GROUPS;

XJ=X[M:N1,];
YJ=Y[X:N1,];
NJ=NROW(YJ);
21J=REPBAT(1,NJ,1);
CJ=CV[J,1];
BJ=XJ‘*Z1J;
R1=X1+(XJ‘*XJ);
xz=x2+(CJOSJ*SJ‘);
H=M+NV[J,1];
N1=N1+NV[J,1];

END;

R=I*(R1-KZ);
X=INV1K);

DETERMINATION OF THE MATRIX F'

*-------- --------------------------------- ---------- ----- -------;

1001=0;

F002=0;

1011=0;

1012=0;

r111=o;

r112=o;
ALPEA1={0,0,0,0}
ALPEA:={0,0,0,0}
H=1;

81=NV[1,1];

o
I
o
I

DO J=1 TO GROUPS;

ZJ=X[M:N1,];
YJ=Y[M:N1,];
OJ=CV[J,1];
NJ=NROI(YJ);
81J=REPEAT(1,NJ,1);
CN=CJ*NJ;
OJZ=CJ*CJ;
302=NJ*NJ;
cz=(1-CJ)*(1-CJ);
TJ=TRACE(XJ‘*XJ*K);
BJ=XJ‘*Z1J;
AJ=BJ‘*K*SJ;
AC=AJOOJ;

o
I

o
I

166

CN1:1-CN;
CN12:CN1*CN1;
CR13:CN1*CN12;
AN:AJ*NJ;
RJ:21J‘*YJ;
3001:F001+(NJ*(02+(OJZ*(NJ-1))));
1002:F002+(TJ-Ac*(CN12+(2-CN)));
P011:P011+(NJ*CR12);
1012:P012+(AJ*CN13);
P111:P111+(NJZ*CN12);
P112:P112+(AN*ON13);
ALPHA1=ALPHA1+(KOXJ‘*YJ);
ALPEAZ:ALPHAZ+(OJtRJ*R*SJ);
ALPHAS=ALPHA1-ALPnhz;
M:M+NV[J,1];
l1:N1+NV[J,1];
SUD;
U2=w*w;
03:02*I;
P001:02*P001;
1002:03*P002;
[011:02*P011;
l012:l3*P012;
P111:IZ*P111;
P112:03*P112;
100:P001-P002;
101:P011-P012;
111:2111-P112;
ALPHAH:0*ALPEAS;
ALPEAET:ALPSAS‘;
.---- ...........................................................
* DBTBRHINATIOR OP THE MATRIX 00
. ...............................................................
001:0;
011:0;
u=1;
n1=uvlllll;
DO 0:1 TO GROUPS;
IJ:X[H:N1,];
YJ:Y[M:R1,];
OJ:CV[J,1];
SJ:RROI(YJ);
S1J:RBPBAT(1,NJ,1);
CN:CJ*NJ;
82:NJ*NJ;
DJ:YJ-(XJ*ALPEAH);
SJ=21J‘*DJ;
GJ:DJ‘*DJ;
SJZ:SJ*SJ;
€32:CJ*302;
CN12=(1-CN)*(1-CN);
DJ:YJ-(XJ*ALPHAH);

Q. Q. Q.

167

001=001+(GJ-(CHZ*(2-CN)));
011=011+(HJZ*CN12);
M=M+NV[J,1];
N1:N1+NV[J, 1];
SEED1 = SEED1 + 100;
SEEDZ = 83302 + 100;
END;
00=W2*001;
01=Wz*011;
DETF=(F00*F11)-(F01*F01);
SIGNAN=¢(P11tUO)-(Po1*01))/USTP;
TAUH=((F00*01)-(P01*UO))/DETF;
LANDA=TAUR/(TAUN+SISNAR);
PRINT T TAUH SIGMAH LAMDA ALPEAET;
END;
FINISH;
RUN;

 

BIBLIOGRAPHY

Abramovitch, L., and Singh, K. {1985). Edgeworth corrected pivital statistics and
the bootstrap. The Ann I i ti , 13, 1, 116-132.
Aitkin, M., and Longford, N. (1986). tatistical modeling issues in school

effectiveness studies. W. (Series A).
143, 1—43.

Arlin, M. (1984b). Time, equality, and mastery learning. Leg}! of Edgcatignal
mm 54.1. 65-86-

Arlin, M., & Webster, J. (1983). Time cost of mastery learning. M
Egggtigngl Psycholggx, 15;, 187—195.

Ashton, P., 8: Webb, R. (1986). Making a Diffggng: Teaghers’ Sense of Efﬁcacy
Q51 Student Aghievemgnt. New York: Longman.

Bagaka’s J. G. (1989). Empirical Bayes Bootstrap: Estimation of parameter
distribution through the bootstrap in hierarchical data. Paper presented at
the Annual Meeting of the American Educational Research Association,
San Francisco.

Bandura, A. (1986 . Social F ti ns f Th ht A ti n: A Social
ngg’tive heory. Englewood Cliffs, New Jersey: PrenticchHall.

Bangert—Drowns, R. L. (1986 . A review of developments in meta—analytic
method. Ps ch lo ' B e in, 39, 388-399.

Beran, R. (1984). Jackknife approximations to bootstrap estimates. The Anngs 9f
51mm 12(1), 101—118.

Bickel, P. J ., & Freedman, D. A. £1981). Some asymptotic theory for the
bootstrap. Th Ann f i i , 9(6), 1196—1217.

Block, J. H. Ed.) (1971). Mgtgy Lgarning: Thegg Qd Prggticg. New York:
Holt, 'nehart and Winston.

Bloom, B. S. (1982). 11an thggtgristig £151 Schggl Learning. New York:

McGraw—Hill Book Company.

Bloom, B. S. (1984a). The 2 Sigma Problem: The search for methods of instruction
as eﬁective as one—to—one tutoring. W, 13, 6, 4—16.

Bloom, B. S. (1984b). The search for methods of cup instruction as effective as

one—to—one tutoring. W1, :1_1(8), 4—17.

Box, G.E.P., and Cox, DR. (1964). An analysis of transformations. Journal of
39W. 326. 211-246-

168

169

Brookover, W., Beady, 0., Flood, P., Schweitzer, J ., Wisenbaker, J. (1979 . §chggl

WWW: ELM—Ms M A Di erence.
New York: Praeger.

Brookover, W., Beamer, L., Efthim, H., Hathaway, P., Lezotte, L., Miller, S., and
Passalacqua, J. (1982). Creating Effgtive Schools. Learning Publication.

Brown, KG. (1976). Asymptotic behavior of MINQUE—type estimators of variance
components. Annﬂs 9f §t§ti§tigs, 4, 746-754.

Burstein, L., Linn, R. L., & Campell, F. (1978). Analyzing multi—level data in the
presence of heterogeneous within—class regressions. Jomﬂ 9f Egngational

We a (4), 347-389.

Burstein, L., & Miller, MD. (1980). Regression—based anal sis of multi—level

educational data. New Directioy for Mgthgglggy Q Sggigl and Beﬂvioral
Sg’enges, Q, 194—211.

Cochran, W. G. (1977). Sampling Technigue. New York: John Wiley 8: Sons.

Corbeil, R. R., and Searle, R. S. (1976). A comparison of of variance component
estimators. Biometrics, 32, 779—791.

Corno, L., 8: Snow, R. E. (198561! Adaptingi teachin to individual differences
amo learners. In M. ittloclr (e .), Haggﬂgook of Research on Teaching.
3rd ., Chicago, IL: Rand NcNally.

Carrol, J. B. (1963). A model for school learning. Tachers Cgllggg Rmord, 64,
723—733.

Carroll, R. (1979). On estimating variances of robust estimates when the errors are

asymmetnc. Journgl 9f Ameriggg Stgtistigs, L4, 674—679.

Dembo, M.H., & Gibson, S. (1985). Teachers’ Sense of Eﬁcacy: An important

factor in school improvement. Thg Elmentm Schml Jgurnal, 86, 2,
173-184.

Diaoonis, P., 8: Efron, B. (1983). Computer—intensive methods in statistics.
Sg'entiﬁg Americgg, 218, 116-130.

Dicoccio, T., and Tibshirani, R. (1987). Bootstrap conﬁdence intervals and

bootstrap approximations. ngg of Amerigg Statisticg Association, 82,
397, 163—170.

Dolker, M., Halperin, S., 8: Divgi, D. R. (1982). Problems with bootstrapping
Pearson correlations in very small bivariate samples. Psychometrics, _41 (4),
529—530.

Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Ann
ati ti , 1, 1—26.

Efron, B. (1981a). Nonparametric standard errors and conﬁdence intervals. 1h;

andian Journg! 9f Stﬁistig, 2 (2), 139—172.

170

Efron, B. (1981b). Censored data and the bootstrap. Jgurngl 9f Amerigg
i i A ' i n, 16, 374, 312-319.

Efron, B. 1982). Thg ES. kknifg, the Sggtgtrgp, Q51 gther sampligg plans.
P ' adelphia: Society for Industrial and Appliedl Mathematics.

 

Efron, B. (1982). Transformation theory: How normal is a family of distributions?
Thg .4an; of Stgtigtigg, 19, 2, 323—339.

Efron, B., & Gon , C. (1983). A leisurely look at the bootstrap, the jackknife, and
cross—vali ation. Thg Amgriggg Statigtig’gn, 31 (1), 36—48.

Efron, B., and Tibshirani, R. (1986). Bootstrap methods for standard errors,
conﬁdence intervals, and other measures of statistical accuracy. Statistical

Scigngg, 1, 1, 54—77.

Freedman D. A. (1981). Bootstrapping regression models. Annals of StStistics,
2(6 , 1218—1228.

Fuller, E, Wood, K., Rapoport, T., 8: Dombusch, SM. (1982). The organizational
context of individual efﬁcacy. Review of Ed cation earch, ﬂ, 1, 7—30.

Goldstein, HI. 1986). Efﬁcient statistical modeling of logitudinal data. Annals of
Hggg iology, LS, 129—142.

Gover, J. C. (1962). Variance component estimation for unbalanced hierarchical
classiﬁcations. Biometrigs, L8, 537—542.

Hall, P. (1986a). On the bootstrap and conﬁdence intervals. The Annals 9f
Statisticg, g, 4, 1431—1452.

Hall, P. (1986b). On the number of bootstrap simulations required to construct a
conﬁdence interval. Thg Annals 9f Stgtistics, _13, 4, 1453-1462.

Hall, P. (1988). Theoretical coomparison of bootstrap conﬁdence intervals. The
Anngg 9f Statisticg, 1S, 3, 927-953.

Hall, P. ( 1989). Bootstrap conﬁdence regions for directional data. Journal of
Ameriggg Statigtigal Agggg'gtign, L4. 408, 996-1002.

Hall, P. (1990). Th B t r E h E i n. (in Print).

Hanushek, E. A. (1974 . Efﬁcient estimators for regressing regression coefﬁcients.
Thg America; tgtigtigLn, gs, 2, 66—67.

Hartley, HQ. (1967). Expectation, variances and covariances of AN OVA mean
squares y ’synthesis’. Bigmgtrigg, 23, 105—114.

Hartley, H. O., Rao, J. N. K., and LaMotte, L. R. (1978). A simple 'synthesis’
based method of variance component estimation. Biomgtrigg, 34, 233—242.

Harville, D. A. (1977 . Maximum Likelihood approaches to variance component

estimation an to related problems. Jgnrnﬂ 9f Amerign Statigticﬂ
M20. 12.. 358. 320-340-

171

Henderson, C. R. (1953). Estimation of variance and covariance components.

219mm. 2. 226-252-

Henderson, C. R. (1959); Design and analysis of animal husbandry experiments.
r

Igghnigpgg Qt) pgggpgg in Animg Productipn Rgearg.
American Society of Animal Production Monograph.

Hinkley, D., and Wei, B. (1984). Improvements of jackknife conﬁdence limit
method. Biometrik , 7_1_, 2, 331—339.

Hocking), R. R. (1985). Thg apalysig pf lineag models. Monterey: Brooks/ Cole
ublishing Company.

Kennedy, W. J ., &. Gentle, J. E. (1980). Stgtigtical Qomppting. New York:
Marcell Dekker, Inc.

Laird, N. M., 81. Louis, T. A. (1987). Empirical Bayes conﬁdence interval based on

Bootstrap samples. Spurnal pf Amgrigag Statigticﬂ Agﬁpgattion, S2, 399,
739—750.

LaMotte, L. R., (1973). Quadratic estimation of variance components. Biometrics,
23, 311—330.

Leestma, S., 8: Nyhoff, I[1.].‘Ll(l987). PASQAL prpggagming and problem solyp'ng.
New York: Mac ' an Publishing Company.

Lunneborg, C. E. (3985). Estimating the correlation coefﬁcient: The bootstrap
approach. Sychologigal Bpﬂgtin, 9S (1), 209—215.

Mason, W. M., Wong, G. Y., & Entwistle, B..(1983). Contextual analysis throng:

the multi—level linear model. 1 Meth 10 . San Francisco,
Jossey—Bass.

Mood, A. M., Graybill, F. A., & Boes, D. C. £1974). Intrpgpgtion to the Theory of
StatAtjgg. New York: McGraw—Hill ook Company.

Nash, S. N. (1981). Comments on nonparametric stande errors and conﬁdence

intervals. WW. 2(2). 163-164-

Newman, F.M., Rutter, R.A., & Smith, MS. (1989). Organizational factors that
affect school sense of efﬁcacy, community, and expectations. Sgp‘plogy of
Education, Q, 221-238.

Njunji, A. (1974). Transformation of education in Kenya since independence.
Edppgtipn Eggtern Afrig, _4_, 107—125.

Patterson, ED. and Thompson, R. (1971). Recovery of interblock information
when block sizes are unequal. Bipmgtg’kg, SS, 545—554.

Peterson, P. (1972). A R§_v_i§g pf the ﬁgmch on Mgtggy gﬂ'ng Strgtggigs.
Unpublished Manuscript, International Association for the Evaluation of
Educational Achievement.

172

Quenouille, M. 1949). Approximate tests of correlation in time series. qurnal pf
the Rani Statiatiaal ngety, Series B, A, 18—84.

Rao, J. N. K. (1968). "On expectations, variances and covariances of ANOVA
Mean squares by ’synthesis'." Bipmgripa, A, 963-978.

Rao, C. R. (1972:. Estimation of variance and covariance components in linear
models. al f Am ri tatisti al A ciation, 51, 337, 112—115.
Rao, C. R. (1970). "Estimation of Heteroscedastic Variances in Linear models.

WW. 6i. 161-172-

Rao, C. R. (1971 a . Estimation of variance and covariance components -

MIN QUE t eory. qurnal of Mtﬂtivag'ata Angysis, 1, 257—275.

Rao, C. R. (1971 b). Minimum variance quadratic unbiased estimation of variance

components. Journal of Multivariata Angya'a, 1, 445—456.

Rao, CR. (1972). Estimation of variance and covariance components in linear
models ournal of Am ric ati ti A ciation, ﬂ, 112-115.

Rasmussen, J. L. (1987). Estimati correlation coefﬁcients: Bootstrap and
parametric approaches. Paya ologicg Btﬂetin, Lu (1) 136-139.

Raudenbush, S. W. (1984). Applipatipn of a hiaratahigﬂ lineg male] in ﬁagatipnﬂ
19.831911 Unpublished doctoral dissertation, Harvard University.

Raudenbush, S. W., 8: Bryk, A. S. (1986). A hierarchical model for studying school
effects. Sociology of Egpcation, SS, 1—17.

Raudenbush, S. W., & Bryk, A. S. (1987 ). Application of hierarchical linear models
to assessing change. Payphologiag Ballatin, _1_Q;(1) 147—158.

Raudenbush, S. W. 82 Bryk, A. S. (1988). Methodological advances in analyzing
the effects of schools and classrooms on student learning. In Ernst Z.
Rothkopf (ed.), Review of Research in Education 1988—89. Washington,
DC: American Educational Research Association.

Rubin, D. B. (1981). The Bayesian bootstrap. Tha Annals of Statistics, 2(1),
130—134.

SAS Institute, Inc. (1985). SAS User’s Guide: Statistics. Cary, NC: SAS Institute,
Inc.
SAS Ingltlitute, Inc. (1985). SAS/IML User’s Guide. Cary, NC: SAS Institute,
c.

Schenker, N. (1985). Qualms about bootstrap conﬁdence intervals. Jomal pf
Amaricap Statiatical Aaspg’ation, SS, 390, 360—361.

Searle, S. R. (1971). Topics in variance component estimation. Biometripa, 21,
6

173

Searle, S. R. (1979). Ma A an VEQQQ Qommngt Estimation: A detailed account
of maximum likelihood and kindred methodology. Biometrics Unit, Cornell

University, New York.

Seely,J (.1971) Quadratic subspaces and completeness. Annals pf Mathematical
W 4.2.. 710-721

Singh, J. (1981). On the asymptotic accuracy of Efron’ s bootstrap. Tha Annals of
5111111191 12(6), 1187—1195

Slavin, R. E. (1987). Mastery learning reconsidered. Review pf Eglugationg
m 51. 2 175—213

Thompson, W. A. (1962). The problem of n ative estimates of variance
components. Ann I M hem i tati ics, A}, 273—289.

MICHIGAN STATE UNIV. LIBRnRIEs
m‘Illlllllllll“NWlllll‘lllllWlWIWI
31293007914447