
This is to certify that the
thesis entitled
Correction for Dependence

in Two Level Nested Designs

presented by

Suwatana Sookpokakit

has been accepted towards fulfillment
of the requirements for

Ph.D. degree in Counseling,
Personnel Services and
Educational Psychology

  

Major professor

Date %/;Z 3// 8/

CORRECTION FOR DEPENDENCE

IN TWO LEVEL NESTED DESIGNS

By

Suwatana Sookpokakit

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services
and Educational Psychology

1981

ABSTRACT

CORRECTION FOR DEPENDENCE
IN TWO LEVEL NESTED DESIGNS

By

Suwatana Sookpokakit

A commonly used design in educational research involves hier-
archically nested data. Classrooms of students are randomly assigned
to receive one of two or more alternative educational treatments.

Since dependent variables in educational research are typically defined
on students, however, the design results in students nested within
classrooms and classrooms nested with treatments.

A fully specified model for the design includes sources of
variation for treatments, classrooms and students. Given the fully
specified model, the null hypothesis about treatments can be tested
using $F_{C:T} = MS_T/MS_{C:T}$. There has been resistance on the part
of educational researchers to use the $F_{C:T}$ test statistic because
for these studies the test has few degrees of freedom for error and so
limited statistical power. As a result, researchers have sometimes
turned to a pooled model which ignores classroom variance. By
ignoring classroom variance the sources of variation become treatments
and students. The apparent test statistic for the treatment
null hypothesis is then $F_{S:T} = MS_T/MS_{S:T}$.

The test statistic, $F_{S:T}$, for the pooled model requires that
observations on students be independent of each other. Violation of
the independence assumption when using $F_{S:T}$ has been shown to yield
a test which can be either too liberal or too conservative (Glendening,
1977; Paull, 1950). What is needed, then, are analysis strategies
which have greater degrees of freedom for error than the $F_{C:T}$ test
statistic and which are valid when there is dependence at the level of
individuals among observations on the dependent variable.

Glendening and Porter (1974) suggested the possibility of using
ANCOVA to adjust for positive dependence. They pointed out that the
effects of positive dependence could be conceptualized as similar in
form to the problem created by confounding in quasi experiments.

Index of response is another adjustment strategy which is closely tied
to ANCOVA. Thus, index of response was also considered in this
investigation.

Four possible situations of dependence were classified for an
experimental study that involves two level hierarchically nested data.
Dependence could arise because students were not randomly assigned to
classrooms (initial dependence) and/or class effects which occur
during the study (during-experiment dependence). Crossing these two
dichotomous possibilities defined the four situations, one of which
was independence.

Investigation of the utility of index of response and ANCOVA was

restricted to use of a pre test to adjust for initial dependence.

Further, classroom populations were assumed to be normally distributed
on the dependent variable with a common variance but different means.
Results from the investigation indicated that index of response
is an appropriate analysis strategy when class effects on the
covariate are perfectly correlated with class effects on the dependent
variable. Under this condition, the pooled index of response model
provides a valid test statistic for the treatment hypothesis with
higher power than the F-test from the full index of response model.
With an additional stipulation of equality of functional regression
slopes at the class level and at the individual level, the pooled
ANCOVA model also provides correct adjustment for dependence and a
valid test with higher power than the F-test of the full ANCOVA model.
The gain in the power through the use of index of response and ANCOVA
models was primarily a function of larger degrees of freedom error.
Thus, both analysis of variance of index of response and analysis
of covariance can be used with a pooled model to provide a more
powerful test of the null hypothesis about treatments even when
initial dependence is present in post test results. For designs having
few classrooms per treatment condition, the increase in power is

substantial.

DEDICATED TO
the memory of my father,
Yuan Sookpokakit,
and my brother,

Teeranit Sookpokakit.


ACKNOWLEDGEMENTS

I would like to acknowledge the members of my dissertation
committee for their assistance. I am deeply grateful to Professor
Andrew C. Porter, my committee chairman, advisor, teacher and friend
for his generous and endless advice, encouragement, support and
patience. Working with him has greatly increased my knowledge and
professional development. Special thanks go to Professors Richard
Houang, Dennis Gilliland, Robert Floden and William H. Schmidt for
their help and valuable comments.

Working in the Office of Research Consultation (ORC) and with the
Institute for Research on Teaching provided me with invaluable
experience. Many thanks to my friends and colleagues in the ORC and
in the Institute who were very kind to me, especially Professors
Joe L. Byers and John H. Schweitzer, who hired me in the ORC, and
Professor Andrew C. Porter, Professor Jere Brophy and Mary Rohrkemper,
who hired me at the Institute.

I acknowledge with appreciation the support of the American
Association of University Women who gave me a fellowship during the
1977-78 academic year.

Finally, I extend heartfelt gratitude to my mother, Sugunya
Sookpokakit, to my husband, Rungsit Suwanketnikom, to my sisters,
Puangpet Gunyabarn, Benjalug and Raywadee Sookpokakit, and to Mrs.
Martha Ward for their sympathy, encouragement and patient support.


TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

Chapter

I. INTRODUCTION

II. REVIEW OF LITERATURE

    An Operational Definition of Independence
    Positive Dependence and Negative Dependence
    Preliminary Testing on the Full Model and Conditional Pooling
    A Quasi-F Statistic
    Using a Covariate to Adjust for Positive Dependence

III. SITUATIONS OF INDEPENDENCE AND DEPENDENCE

    Initial Dependence and During-Experiment Dependence
    Four Possible Situations of Dependence and Independence
        Situation I: Independence
        Situation II: Initial Dependence
        Situation III: During-Experiment Dependence
        Situation IV: Initial and During-Experiment Dependence

IV. CORRECTIONS FOR INITIAL DEPENDENCE USING INDEX OF
    RESPONSE AND ANALYSIS OF COVARIANCE STRATEGIES

    Modeling Situation II Dependence
    Index of Response Strategy
        The Covariance Matrix of Z
        Test Statistic
    Analysis of Covariance
        The Covariance Matrix of Ew
        Test Statistic

V. SUMMARY AND CONCLUSIONS

Appendix

A. ANOVA TABLE OF THE FULL MODEL GIVEN STUDENTS ARE FIXED

B. THE REGRESSION SLOPE OF THE POOLED ANCOVA MODEL

C. VARIANCE COMPONENTS OF DEVIATED SCORES

D. COVARIANCE COMPONENTS OF DEVIATED SCORES

E. ANOVA TABLE OF THE INDEX OF RESPONSE MODEL USING
   CLASSROOMS AS THE UNITS OF ANALYSIS

REFERENCES

LIST OF TABLES

Table

1.1 Power Computations Using Groups and Individuals as
    Units of Analysis, Given Individuals are Independent

LIST OF FIGURES

Figure

1.1 A Data Matrix of a Two Level Balanced Nested Design

1.2 ANOVA Table of the Full Model

1.3 ANOVA Table of the Pooled Model

3.1 Four Possible Situations of Independence and Dependence
    in a Two Level Hierarchically Nested Design

4.1 ANOVA Table of the Index of Response Model Using
    Individual Students as the Units of Analysis

INTRODUCTION

In experimental studies concerned with classroom learning and
classroom teaching, the sampling frame typically involves at least
two levels of nested data. This is a common characteristic of experi-
mental design in education. For example, in many studies individual
students are nested within classrooms, and classrooms in turn are
nested within treatments. One question often raised when dealing with
data from such hierarchically nested designs is what should be the
appropriate analysis procedure (Glendening & Porter, 1976; Hannan
& Young, 1976). This question is sometimes viewed as the problem
of selecting the appropriate unit of analysis (Cronbach, 1976;
Peckham, Glass, & Hopkins, 1969; Porter & Chibucos, 1975).

Consider a two-level balanced nested design with fixed treatment
effects and one dependent variable, as shown in Figure 1.1. In this
design, an equal number of students are nested within each classroom
and an equal number of classrooms are nested within each treatment.
One concern of the researcher is to test the hypothesis of no treat-
ment effects.

                T1                          T2
        C1      C2      C3          C4      C5      C6
        S11     S21     S31         S41     S51     S61
        S12     S22     S32         S42     S52     S62
        S13     S23     S33         S43     S53     S63
        S14     S24     S34         S44     S54     S64
        S15     S25     S35         S45     S55     S65

        T = treatment
        C = class
        S = student

Figure 1.1: A Data Matrix of a Two Level Balanced Nested Design

Within the context of analysis of variance, the appropriate
linear model for this two-level nested data is:

$$Y_{ijk} = \mu + \alpha_i + A^*_{ij} + E_{ijk}, \qquad i = 1, 2, \ldots, t; \quad j = 1, 2, \ldots, c; \quad k = 1, 2, \ldots, s$$

where

$Y_{ijk}$ is an observation of the outcome variable Y on
student k in class j receiving treatment i,

$\mu$ is the grand mean,

$\alpha_i$ is the effect of being in treatment i,

$A^*_{ij}$ is the effect of being in class j which is nested
within treatment i, and

$E_{ijk}$ is an individual error of student k.

In this model, $\mu$ and $\alpha_i$ are unknown constants, and $A^*_{ij}$ and
$E_{ijk}$ are random variables which are assumed to be independently,
identically and normally distributed with zero means and variances
$\sigma^2_{A^*}$ and $\sigma^2_E$, respectively.
This model, which will be referred to as the "full" model, is
considered fully specified because it accounts for both classroom
and student sources of variation in the two-level nested data. The

analysis of variance table under the "full" model is shown in

Figure 1.2. The hypothesis of no treatment effects can be stated as:

$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$$

Under the hypothesis of no treatment effects, the expected mean square
of treatments is equal to the expected mean square of classes nested
within treatment, or

$$E(MS_T) = E(MS_{C:T}) = s\sigma^2_{A^*} + \sigma^2_E$$

Thus, from Figure 1.2, the ratio $MS_T/MS_{C:T}$ (i.e., the ratio of mean
square for treatments over mean square for classrooms nested within
treatments) will be the appropriate test statistic for the no treatment
hypothesis given the assumptions of the model.
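
As an illustration only, the mean squares of the full model and the resulting test statistic can be computed directly from a balanced data matrix laid out as in Figure 1.1. The following Python sketch is not part of the original study; the function name `full_model_anova` and the nested-list layout `data[i][j][k]` (treatment i, class j, student k) are assumptions made for the example, and the small data set is invented.

```python
from statistics import mean


def full_model_anova(data):
    """Mean squares for a balanced two-level nested design.

    data[i][j][k] is the score of student k in class j within treatment i
    (hypothetical layout chosen for this sketch).
    """
    t = len(data)          # number of treatments
    c = len(data[0])       # classes per treatment
    s = len(data[0][0])    # students per class

    class_means = [[mean(cls) for cls in trt] for trt in data]
    trt_means = [mean(row) for row in class_means]
    grand_mean = mean(trt_means)   # valid because the design is balanced

    ss_t = c * s * sum((m - grand_mean) ** 2 for m in trt_means)
    ss_ct = s * sum((class_means[i][j] - trt_means[i]) ** 2
                    for i in range(t) for j in range(c))
    ss_sct = sum((y - class_means[i][j]) ** 2
                 for i in range(t) for j in range(c) for y in data[i][j])

    ms_t = ss_t / (t - 1)
    ms_ct = ss_ct / (t * (c - 1))
    ms_sct = ss_sct / (t * c * (s - 1))
    return {"MS_T": ms_t, "MS_C:T": ms_ct, "MS_S:CT": ms_sct,
            "F_C:T": ms_t / ms_ct}


# Invented toy data: 2 treatments, 2 classes each, 2 students per class.
toy = [[[1, 3], [5, 7]],
       [[2, 4], [10, 12]]]
print(full_model_anova(toy))
```

The statistic returned under the key `"F_C:T"` would be referred to an F distribution with t-1 and t(c-1) degrees of freedom, as given in Figure 1.2.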

Model: $Y_{ijk} = \mu + \alpha_i + A^*_{ij} + E_{ijk}$

Assumptions: $A^*_{ij} \sim NID(0, \sigma^2_{A^*})$ and $E_{ijk} \sim NID(0, \sigma^2_E)$

Source of Variation    d.f.        E(MS)
Treatments (T)         t-1         $\sigma^2_E + s\sigma^2_{A^*} + cs\sum_{i=1}^{t}\alpha_i^2/(t-1)$
Classes (C:T)          t(c-1)      $\sigma^2_E + s\sigma^2_{A^*}$
Students (S:CT)        tc(s-1)     $\sigma^2_E$

Treatments are fixed;
Classrooms are random; and
Students are random.

Figure 1.2: ANOVA Table of the Full Model

Although the full model is the appropriate model for the nested
data, in practice, researchers often assume a model that ignores the
classroom grouping variable. That kind of model is underspecified
and gives misleading results. An example is Anderson's large scale
study (1941) of aptitude by treatment interaction. Two teaching
methods were compared: drills and meaningful emphasis. Though the
methods were delivered in classrooms, Anderson's analysis disregarded
class membership and pooled all students within a treatment. Having
done so, Anderson found a significant interaction of the investigated
teaching methods with the aptitude variable. The interaction
was interpreted at the individual level. Cronbach and Webb (1978)
reanalyzed Anderson's data by adding the class membership variable.
They found that after controlling for the aptitude by treatment
interaction at the class level, the interaction at the individual
level was no longer significant. This inconsistency of conclusions
when analyzing the same set of data by different models or units of
analysis is a well known issue in educational evaluation. Discussion
of this issue can also be found in publications related to the
Follow Through Project (Porter, 1972; Porter & Chibucos, 1975).

Even when the appropriate unit of analysis has been recognized,
there can be circumstances which prevent use of that unit. Porter
(1973) evaluated two teaching strategies, TABA and BASICS, that were
delivered in two different schools. Although he recognized that the
school should be the unit of analysis, Porter chose to use students as
the unit of analysis. Porter explained:

When identifying the unit of analysis for a study, a
crucial consideration is one of independence. This is
because all tests of significance are based on the
assumption that the units of analysis are independent
of each other. Since TABA and BASICS teacher training
took place in groups, and since group discussion is one
of the most important aspects of the two programs, it
follows directly that children in a school were not
exposed to the programs in a way such that their exposure
represented independent replications of the programs.
It could, however, be argued that multiple program
schools would have represented independent replications
of the program. Unfortunately, with only one program
school and one control school, using school as the

unit of analysis would have resulted in no tests of
significance, i.e., there would have been zero degrees
of freedom. (p. 25)

In both the Anderson and Porter examples, students and class-
rooms were grouped together within treatment levels. By ignoring the
natural grouping of students the model assumes that the treatment
effects are the only source of systematic variation among the data.

This model, referred to as the "pooled" model, is

$$Y_{ijk} = \mu + \alpha_i + E^*_{ijk}, \qquad i = 1, \ldots, t; \quad j = 1, \ldots, c; \quad k = 1, \ldots, s$$

where

$Y_{ijk}$, $\mu$ and $\alpha_i$ are defined as before, and

$E^*_{ijk}$ is an individual error on the outcome variable Y for
student k.

In this "pooled" model, $E^*_{ijk}$ is a random variable which is assumed to
be independently, identically and normally distributed with zero mean
and variance $\sigma^2_{E^*}$.

The analysis of variance table for the "pooled" model is shown
in Figure 1.3. If the "pooled" model is assumed for the data, the
null hypothesis of no treatment effects is tested using
$F_{S:T} = MS_T/MS_{S:T}$. Given that the distributional assumptions for
the "pooled" model hold, that is, the observations on the students are
independently, identically and normally distributed, the test statistic
$F_{S:T}$ will have a central F distribution with t-1 and t(cs-1) degrees
of freedom under the null hypothesis.

Model: $Y_{ijk} = \mu + \alpha_i + E^*_{ijk}$

Assumption: $E^*_{ijk} \sim NID(0, \sigma^2_{E^*})$

Source of Variation    d.f.        E(MS)
Treatments (T)         t-1         $\sigma^2_{E^*} + cs\sum_{i=1}^{t}\alpha_i^2/(t-1)$
Students (S:T)         t(cs-1)     $\sigma^2_{E^*}$

Treatments are fixed, and
Students are random.

Figure 1.3: ANOVA Table of the Pooled Model

To test the hypothesis of no treatment effects, the choice of
the "full" model is analogous to choosing classrooms as the units
of analysis while the choice of the "pooled" model is analogous to
choosing students as the units of analysis (Glendening & Porter,
1976). For either conceptualization, the assumption of independence
of observations for the students is the central issue. If this
assumption can be met, the choice of the "pooled" model or students
as the units of the analysis will be the appropriate one. Otherwise,
the "full" model using classrooms as the units of analysis will be
the appropriate choice.

Independence of observations is one of three assumptions for
analysis of variance models. The other two assumptions are normality
and homogeneity of variances. Since assumptions are rarely met

exactly in real world situations, researchers must be aware of the

consequences of violating assumptions. Violations of the assumptions
of a model may affect both the significance level and the sensitivity
of a test (Cochran & Cox, 1957, p. 91). Though the F tests under-
lying analysis of variance may be robust with respect to violations
of the assumptions of normality and homogeneity of variances under

certain circumstances (Glass & Stanley, 1970), they are not robust

with respect to violation of the independence assumption (Glendening,
1977; Paull, 1950).
For the "pooled" model, the assumption of independence at the
individual level is equivalent to the assumption of no grouping
effects (i.e., $\sigma^2_{A^*} = 0$). That is, the individuals ($E^*_{ijk}$) will be
independent if $\sigma^2_{A^*}$ is zero. When class effects are present, however,
disturbances for each individual are correlated within a class. The
degree of correlation among individual units within a class can be
measured by an intraclass correlation coefficient. Similar to $\sigma^2_{A^*}$,
when the intraclass correlation coefficient is zero, the condition
of independence is met. Analytic and empirical results from
Glendening's (1977) work show that dependence affects the validity
of the $F_{S:T}$ test, leaving the researchers no choice but the full
model or the $F_{C:T}$ test.
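
The text does not give a formula for the intraclass correlation coefficient. As a hedged illustration, the sketch below uses the standard variance-components form for a random class effect, $\rho = \sigma^2_{A^*}/(\sigma^2_{A^*} + \sigma^2_E)$, which is the correlation between two students in the same class under the full model. The function name and the numerical values are assumptions introduced for the example.

```python
def intraclass_correlation(sigma2_a, sigma2_e):
    """Correlation between two students in the same class under the full
    model, using the standard variance-components definition (an assumption
    of this sketch; the formula is not stated in the text)."""
    return sigma2_a / (sigma2_a + sigma2_e)


# No class effects: students within a class are uncorrelated.
print(intraclass_correlation(0.0, 1.0))

# Invented variance components with a modest class effect.
print(intraclass_correlation(0.2, 1.0))
```

When the class variance component is zero the coefficient is zero, which matches the condition of independence described above.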

In comparing the test statistics from the full and pooled models,
a distinction in degrees of freedom can be made. While the degrees
of freedom for the numerators of both statistics are the same (i.e.,
t-1), the degrees of freedom for the denominators are different. For
the $F_{C:T}$ ratio, the degrees of freedom for the denominator are
t(c-1), which is dependent on the number of treatment levels and
the number of classrooms nested within treatments. For the $F_{S:T}$
ratio, the degrees of freedom for the denominator are t(cs-1), which
also depends on the number of students within each classroom. Except
for the extreme situation where there is only one student in each
classroom, t(cs-1) will always be greater than t(c-1).
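
The difference in denominator degrees of freedom can be checked with a short calculation. The sketch below is illustrative only; the function name is hypothetical, and the design sizes (2 treatments, 4 classes per treatment, 25 students per class) are assumed values chosen for the example.

```python
def denominator_df(t, c, s):
    """Denominator degrees of freedom for the full and pooled F tests.

    t: number of treatments, c: classes per treatment,
    s: students per class (balanced design).
    """
    df_full = t * (c - 1)        # F_{C:T}: classes within treatments
    df_pooled = t * (c * s - 1)  # F_{S:T}: students within treatments
    return df_full, df_pooled


# Assumed design sizes for illustration.
print(denominator_df(t=2, c=4, s=25))

# With one student per class the two denominators coincide.
print(denominator_df(t=2, c=4, s=1))
```

With only a few classrooms per treatment, the pooled denominator is larger by roughly a factor of the class size.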

Peckham et al. (1969) computed power estimates of the
$F_{C:T} = MS_T/MS_{C:T}$ test and the $F_{S:T} = MS_T/MS_{S:T}$ test under
the condition of individual independence. Their results, reproduced in
Table 1.1, showed that the power of the $F_{S:T}$ test is higher than that
of the $F_{C:T}$ test. The gain in power by using the $F_{S:T}$ test is
largest when the treatment effect is small and/or the number of levels
of the grouping variable (e.g., classrooms) is small. Under
independence, the $F_{S:T}$ test has higher power than the $F_{C:T}$ test,
so the pooled model or using individuals as the unit is the better choice.

Table 1.1: Power Computations Using Groups and Individuals
as Unit of Analysis, Given Individuals are Independent

                                             Power ($\alpha = .05$)
                                             Treatment Effect
                             d.f. of         ($\mu_1 - \mu_2$ in sigma units)
Analysis Unit    F-Test      Denominator     .25      .50      .75
Individuals      $F_{S:T}$   198             .42      .94      .991
Groups           $F_{C:T}$   6               .25      .82      .987

Since many experimental studies can afford only a few classrooms,
the difference in power between the full and pooled models is
typically substantial. Further, small treatment effects, if any, are
also not uncommon in educational research. Peckham et al. (1969)
called researchers' attention to the importance of power when
testing educational effects:

Most studies in education assume that the individual is

the unit of statistical analysis. If this assumption is
seriously in error, one would find an abundance of signifi-
cant effects. (Indeed, there would be many instances of
contradictory significant effects if random differences
were being made "significant" through the utilization of
illegitimate power.) History suggests that this is not the
typical case. Significant effects related to different
methods of instruction are relatively rare, even when we
utilize the power afforded by treating the pupil as the
unit. All too frequently, rigor in statistical analysis

is defined as the avoidance of inaccurate probability
statements concerning a Type I error. A more comprehen-
sive notion of rigor would include a similar concern for
the avoidance of Type II errors and for the maximum use

of available data. (p. 345)

In conclusion, the two available test statistics, the $F_{C:T}$ test
and the $F_{S:T}$ test, which underlie the full and the pooled models, are
not always satisfactory. When treatment effects are small and there
are few classrooms, $F_{C:T}$ lacks power. While $F_{S:T}$ has greater
degrees of freedom and so greater power, the test requires independence
at the student level.

The purpose of this study, then, is to consider alternative
models which use individuals as the units of analysis and at the
same time account for the dependence that may exist in the data.
For an alternative model to have utility, the model must yield a
test statistic for the null hypothesis about treatments that has
a known sampling distribution and greater power than the FC:T test
from the full model. The investigation will proceed within the
context of a two-level balanced nested design where the available
number of classrooms is small. Independence among classrooms will
be assumed throughout. For example, the kinds of experimental

situations that are considered here are those where intact class-

rooms are assigned randomly to treatment levels.


Before examining alternative models, however, one needs to
understand the conditions under which independence and dependence

may occur in an experimental study. Chapter II provides (1) an
operational definition of independence among individuals in the
two-level balanced nested design, (2) a discussion of an approach for
empirically testing the independence assumption, (3) a discussion

of an unsuccessful procedure for taking dependence into account, and
(4) discussions of index of response and analysis of covariance as
alternative models which might allow individuals to be the units of
analysis.

Chapter III presents a classification of independence and types
of dependence in experimental studies. A discussion of ways in which
dependence can occur is also provided. Finally, a particular type
of dependence is selected for investigating the utility of index of
response and analysis of covariance as alternative models which allow
individuals to be the units of analysis.

Chapter IV provides an examination of the index of response and
the analysis of covariance strategies. Chapter V presents summary

and conclusions.

CHAPTER II

REVIEW OF LITERATURE

An Operational Definition of Independence

 

Glendening (1977) investigated the independence assumption at
the individual level for two-level nested designs using an analysis
of variance model. Given independence at the group (classroom)
level, Glendening defined independence at the individual level to be
a condition where the variance of the group units could be predicted
from group size and the variance of the individual units.
Consequently, within the context of analysis of variance Glendening
stated the operational definition of independence as a condition where
the expected mean square of classrooms nested within treatments
($E(MS_{C:T})$) equals the expected mean square of students nested
within classrooms and treatments ($E(MS_{S:CT})$). Given classrooms as a
random factor, this operational definition applies regardless of whether
individuals are a random factor (as shown in Figure 1.2) or a fixed
factor (as shown in Appendix A). Only when the individual effects are
random, however, is the condition of independence as defined by
Glendening equivalent to $\sigma^2_{A^*}$ being zero. It is this latter case,
random individual (student) effects, that is the focus of the present
study. Thus, a non-zero value of $\sigma^2_{A^*}$ implies dependence and a zero
value of $\sigma^2_{A^*}$ implies independence.
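
For the random-effects case, the operational definition can be illustrated numerically from the expected mean squares in Figure 1.2. The sketch below is an illustration only; the function names and the variance component values are assumptions introduced for the example.

```python
def expected_mean_squares(sigma2_a, sigma2_e, s):
    """Expected mean squares from Figure 1.2 (students random).

    Returns (E(MS_C:T), E(MS_S:CT)) for class size s.
    """
    e_ms_ct = s * sigma2_a + sigma2_e
    e_ms_sct = sigma2_e
    return e_ms_ct, e_ms_sct


def satisfies_independence(sigma2_a, sigma2_e, s):
    """True when E(MS_C:T) equals E(MS_S:CT), i.e. the operational
    definition of independence holds."""
    e_ms_ct, e_ms_sct = expected_mean_squares(sigma2_a, sigma2_e, s)
    return e_ms_ct == e_ms_sct


# Invented values: no class variance versus a modest class variance.
print(expected_mean_squares(0.0, 1.0, 25), satisfies_independence(0.0, 1.0, 25))
print(expected_mean_squares(0.2, 1.0, 25), satisfies_independence(0.2, 1.0, 25))
```

Because the class variance component is multiplied by the class size, even a small non-zero component separates the two expected mean squares noticeably when classes are large.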


Positive Dependence and Negative Dependence

 

Using the operational definition of independence that she
defined, Glendening (1977) classified two types of dependence:
positive and negative dependence. She defined positive dependence
to be a condition where the expected mean square of classrooms nested
within treatments is greater than the expected mean square of students
nested within classrooms and treatments (i.e.,
$E(MS_{C:T}) > E(MS_{S:CT})$). Negative dependence, on the other hand, was
defined as a condition where the expected mean square of classrooms
nested within treatments is smaller than the expected mean square of
students nested within classrooms and treatments (i.e.,
$E(MS_{C:T}) < E(MS_{S:CT})$). Glendening pointed out that positive
dependence was possible whenever individual effects were random. This
can be seen from Figure 1.2 where $E(MS_{C:T}) = s\sigma^2_{A^*} + \sigma^2_E$, which
is equal to or larger than $E(MS_{S:CT}) = \sigma^2_E$. Negative dependence,
however, can only occur when the individual effects are fixed. That is,
from Appendix A, $E(MS_{C:T}) = s\sigma^2_{A^*}$, which may be less than
$E(MS_{S:CT}) = \sigma^2_E$.

Glendening studied, analytically and empirically, the effects of
dependence on the sampling distributions of the $F_{C:T}$ and $F_{S:T}$ test
statistics for the full and the pooled models respectively. Her
results showed that neither type of dependence among individuals
affected the validity of the $F_{C:T}$ test. However, both kinds of
dependence affected the validity of the $F_{S:T}$ test. Specifically, within
the context of analysis of variance, Glendening found that positive
dependence made the $F_{S:T}$ test liberal and resulted in spuriously high
power. Negative dependence, on the other hand, made the $F_{S:T}$ test
conservative and yielded spuriously low power. Since the sampling
distribution of the $F_{S:T}$ test was affected by degrees of dependence,
Glendening recommended the use of the full model when dependence was
suspected.
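
The liberal behavior of the pooled test under positive dependence can be illustrated with a small Monte Carlo sketch. This is not Glendening's procedure; it is a minimal simulation written for this illustration, assuming 2 treatments, 4 classes per treatment, 25 students per class, no treatment effect, unit student variance, and approximate tabled 5% critical values for F(1, 6) and F(1, 198). All names and numerical settings are assumptions.

```python
import random

T, C, S = 2, 4, 25        # assumed design sizes for this sketch
F_CRIT_FULL = 5.987       # approx. 95th percentile of F(1, 6), from tables
F_CRIT_POOLED = 3.889     # approx. 95th percentile of F(1, 198), from tables


def rejection_rates(sigma2_a, reps=2000, seed=1):
    """Estimate Type I error rates of F_{C:T} and F_{S:T} when there is
    no treatment effect and the class variance component is sigma2_a."""
    rng = random.Random(seed)
    sd_a = sigma2_a ** 0.5
    reject_full = 0
    reject_pooled = 0
    for _ in range(reps):
        class_means = []
        ss_sct = 0.0
        for _i in range(T):
            row = []
            for _j in range(C):
                class_effect = rng.gauss(0.0, sd_a)
                scores = [class_effect + rng.gauss(0.0, 1.0) for _k in range(S)]
                m = sum(scores) / S
                ss_sct += sum((y - m) ** 2 for y in scores)
                row.append(m)
            class_means.append(row)
        trt_means = [sum(row) / C for row in class_means]
        grand = sum(trt_means) / T
        ss_t = C * S * sum((m - grand) ** 2 for m in trt_means)
        ss_ct = S * sum((class_means[i][j] - trt_means[i]) ** 2
                        for i in range(T) for j in range(C))
        ms_t = ss_t / (T - 1)
        ms_ct = ss_ct / (T * (C - 1))
        # Pooled error: class and student sums of squares combined.
        ms_st = (ss_ct + ss_sct) / (T * (C * S - 1))
        if ms_t / ms_ct > F_CRIT_FULL:
            reject_full += 1
        if ms_t / ms_st > F_CRIT_POOLED:
            reject_pooled += 1
    return reject_full / reps, reject_pooled / reps


print("independence:       ", rejection_rates(0.0))
print("positive dependence:", rejection_rates(0.2))
```

Under these assumptions the rejection rate of the full-model test should stay near the nominal level in both runs, while the pooled test should reject far more often than the nominal level once the class variance component is positive. The critical values are tied to the assumed design sizes and would need to be replaced if `T`, `C` or `S` were changed.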

The definition of independence in terms of an equality of expected
mean squares is helpful in two respects. First, it is an operational
definition for the assumption of independence in analysis of variance
models. The degree of dependence in a study can be easily estimated
from an analysis of variance table, using the $MS_{C:T}/MS_{S:CT}$ ratio.
Second, the definition allows for the possibility of either positive
or negative dependence. Negative dependence is rarely discussed in
the literature.

Preliminary Testing on the Full Model and Conditional Pooling

 

As has been said, the test statistic $F_{S:T}$ is appropriate only
when the assumption of independence is met at the level of individuals.
The test statistic $F_{C:T}$ is correct even when individuals are
dependent, but the test suffers from low power. If a researcher could
decide when individuals are independent, for those situations the
best strategy would be to use $F_{S:T}$. When dependence is present,
however, the $F_{C:T}$ test must be used. Glendening's operational
definition of independence provides a test for dependence and so
might be used to guide the researcher in deciding between the pooled
and full models.

Preliminary testing for dependence and conditional pooling
results in a two-stage testing procedure. For a two-level
nested design, as shown in Figure 1.1, the procedure starts with the
preliminary test (i.e., $F = MS_{C:T}/MS_{S:CT}$) for independence at the
individual level. The null hypothesis of the preliminary test is
$H_0: E(MS_{C:T}) = E(MS_{S:CT})$, which is the same as the condition of
independence defined by Glendening (1977). If the preliminary test
results in rejecting the independence hypothesis, the treatment
hypothesis is tested as if the independence assumption were not valid
using $F_{C:T}$. If the preliminary test fails to reject the independence
hypothesis, the treatment hypothesis is tested as if the independence
assumption were valid using $F_{S:T}$ (i.e., the pooled model). Because the
choice of test statistic for the null hypothesis about treatments
depends on the decision made at the preliminary stage, the consequent
test of this two-stage procedure is conditional.
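
The two-stage logic can be sketched as a small decision function. This is an illustration under stated assumptions, not an implementation taken from the sources cited: the function name, argument names and example values are hypothetical, the caller is assumed to supply critical values from F tables, and the preliminary test is treated as a one-sided (upper-tail) test of positive dependence.

```python
def conditional_treatment_test(ms_t, ms_ct, ms_sct, t, c, s,
                               crit_prelim, crit_full, crit_pooled):
    """Preliminary test followed by conditional pooling.

    ms_t, ms_ct, ms_sct: mean squares for treatments, classes within
    treatments, and students within classes (balanced design).
    crit_*: critical values supplied by the caller (assumed to come
    from F tables at the chosen alpha levels).
    Returns (model_used, F_statistic, reject_treatment_null).
    """
    f_prelim = ms_ct / ms_sct
    if f_prelim > crit_prelim:
        # Independence rejected: keep the full model.
        f_stat = ms_t / ms_ct
        return "full", f_stat, f_stat > crit_full
    # Independence not rejected: pool class and student sums of squares.
    ss_pooled = t * (c - 1) * ms_ct + t * c * (s - 1) * ms_sct
    ms_st = ss_pooled / (t * (c * s - 1))
    f_stat = ms_t / ms_st
    return "pooled", f_stat, f_stat > crit_pooled


# Invented mean squares and critical values, for illustration only.
print(conditional_treatment_test(ms_t=18.0, ms_ct=40.0, ms_sct=2.0,
                                 t=2, c=2, s=2,
                                 crit_prelim=6.94, crit_full=18.51,
                                 crit_pooled=5.99))
```

Because the statistic actually used depends on the outcome of the first stage, the function returns the chosen model together with the test result, which makes the conditional nature of the final test explicit.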

The preliminary test represents an attempt to avoid using a
pooled model when dependence is present. Peckham et al. (1969)
warned researchers that the preliminary test is not an infallible
indication of whether or not independence exists (i.e., either Type I
or Type II errors are possible). Thus, they recommended that the
testing and pooling procedure be used only when a researcher has an
a priori notion that independence among individual observations
exists.

To be successful, the preliminary testing and conditional pooling
procedure must keep the actual alpha level of the conditional test
close to the nominal alpha level. Further, the power of the
conditional test must be greater than the power of the unconditional,
always correct, $F_{C:T}$ test.


Paull (1950) studied factors that affect the distributional
properties of the conditional test when individuals are a random
factor. He showed that the distribution of the conditional test
involved three main contingencies. The contingencies were (1)
magnitude of the dependence (which he defined as
$E(MS_{C:T})/E(MS_{S:CT})$), (2) probability of Type I error at the
preliminary test and at the consequent test, and (3) number of classes
per treatment and number of students per class.

Paull found that when $\sigma^2_{A^*} = 0$, the preliminary test was effective
in making the power of the conditional test greater than the power of
the unconditional test. However, as $\sigma^2_{A^*}$ increased from zero, Paull
found that the observed alpha level of the conditional test increased
to a maximum and then decreased slowly to being equal to the nominal
alpha level. Thus, given positive dependence, Paull found that for a
fixed probability of a Type I error, the conditional test was generally
more liberal than the unconditional test, $F_{C:T}$.

Paull also found that, given dependence, the number of classes
per treatment and the number of students per class affected the
discrepancy between the distribution of the conditional test and its
reference distribution. Paull found that a large number of classes
per treatment was desirable in two respects. First, as the number of
classes per treatment increased, the power of the preliminary test
increased, and pooling inappropriately happened less often. Second,
when pooling was prescribed, the pooled mean square was weighted in
favor of the correct mean square error, $MS_{C:T}$. As the number of
students per class increased, again the power of the preliminary test
increased. But when pooling was prescribed under the dependence
condition, the wrong mean square, $MS_{S:CT}$, received greater weight.
Thus, the effect of class size on the distribution of the consequent
test was not simple and largely dependent on the value of positive
dependence.

Lastly, Paull examined the effect of increasing the nominal alpha
of the preliminary test. Increasing the nominal alpha level of the
preliminary test will increase its power and so decrease the frequency
of pooling. .Less frequent pooling should in turn result in less
liberalness of the conditional test under positive dependence. Paull,
however, found that increasing the alpha level of the preliminary
test did not always result in decreasing the liberalness of the con-
ditional test. From his finding, there was a critical alpha level
above which increasing the alpha level of the preliminary test
resulted in increasing the liberalness of the consequent test.

To stabilize the disturbances between the distribution of the
conditional test and its reference distribution for a given amount of
positive dependence, Paull finally recommended that $2F_{50}$ be used as the
critical value for the preliminary test ($F_{50}$ is the 50th percentile
of the central F distribution with (c-1) and tc(s-1) degrees of
freedom). However, it is not clear how $2F_{50}$ stabilizes the
disturbances between the distribution of the conditional test and its
reference distribution under positive dependence. If the goal is to
increase the power of the preliminary test, Glendening (1977) pointed
out that "taking twice the critical value given a large alpha of .50
had the same effect as selecting a small alpha level in the first
place."

Results from Paull's study indicated that positive dependence is
an important threat to the validity of the conditional test. Only
under the condition of independence and under extreme positive
dependence was the conditional test valid. Unfortunately, given an
intermediate value of positive dependence, the preliminary test can
mistakenly prescribe pooling and make the validity of the conditional
test questionable.

Glendening (1977) examined the utility of preliminary testing and
conditional pooling under both positive and negative dependence.
Analytically and empirically, Glendening's findings opposed the use
of the procedure. Similar to Paull (1950), Glendening concluded that
given an intermediate value of dependence, the preliminary F test was
not sensitive enough to help a researcher guard against having an
undesirably distorted probability of Type I error for the conditional
F test.

The preliminary testing and conditional pooling procedure is not
likely to be useful in experimental studies in education since
effectiveness of the procedure is limited by its insensitivity to
moderate degrees of dependence. Degrees of dependence that occur
within educational research studies usually range from small to
moderate. Glendening (1977) investigated two research studies in
elementary schools. She found that on achievement scores, the degree
to which classroom variation accounted for total variation among

students ranged consistently from 20 to 50 percent.
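
The proportion of total variation attributable to classrooms can be estimated from the mean squares of the fully specified model. The sketch below is illustrative only: it assumes $E(MS_{C:T}) = \sigma^2_E + s\sigma^2_{A^*}$ and $E(MS_{S:CT}) = \sigma^2_E$, uses the usual method-of-moments estimates, and the function name and the numerical values are hypothetical rather than taken from Glendening (1977).

```python
def intraclass_correlation(ms_classes: float, ms_students: float, s: int) -> float:
    """ANOVA estimate of the share of total variance due to classrooms.

    Assumes E(MS_C:T) = var_E + s * var_A* and E(MS_S:CT) = var_E.
    """
    # Estimate of var_A*; negative estimates are truncated at zero.
    var_a_star = max((ms_classes - ms_students) / s, 0.0)
    var_e = ms_students  # estimate of var_E
    return var_a_star / (var_a_star + var_e)


# Hypothetical mean squares for classes of 25 students.
print(intraclass_correlation(ms_classes=7.25, ms_students=1.0, s=25))
print(intraclass_correlation(ms_classes=26.0, ms_students=1.0, s=25))
```

With these hypothetical inputs the two calls land near the lower and upper ends of the 20 to 50 percent range cited above.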


Analysis strategies that can account for moderate degrees of
dependence would be useful for analyzing nested data. The following
sections present discussions of such strategies. First the use of
a quasi-F ratio to correct for dependence is considered. The chapter
concludes by considering the possibility of using adjustment strategies

which rely on information provided by a covariate.

A Quasi-F Statistic

 

There is a class of factorial designs for which analysis of
variance does not provide direct tests of certain hypotheses even
when all assumptions of the model have been met (e.g., Kirk, 1968;
Winer, 1972). For example, the fixed main effect is not directly
testable in a completely crossed factorial design having one fixed
and two random factors with random replication in each cell of the
design. In such situations a quasi-F statistic is sometimes con-
structed to provide an appropriate test of the fixed main effect. As
will be seen, there was some reason to believe that a quasi-F test
might hold potential for providing a valid test statistic with greater
power than $F_{C:T}$ in situations of positive dependence.

A linear combination of independent chi-square statistics is
distributed approximately as a chi-square distribution with degrees of
freedom estimated from a function of mean squares and degrees of
freedom (Satterthwaite, 1941, 1946). For example, let $\chi^2_{\nu_1}$,
$\chi^2_{\nu_2}$ and $\chi^2_{\nu_3}$ be chi-square statistics which are
independently distributed as central chi-square distributions with
$\nu_1$, $\nu_2$ and $\nu_3$ degrees of freedom respectively. Also, let
$MS_1$, $MS_2$ and $MS_3$ be mean squares associated with
$\chi^2_{\nu_1}$, $\chi^2_{\nu_2}$ and $\chi^2_{\nu_3}$. Then,

$$\chi^2_{\nu_1} + \chi^2_{\nu_2} - \chi^2_{\nu_3} \approx \chi^2_{\nu_4} ,$$

where $\nu_4$ is estimated by

$$\hat{\nu}_4 = \frac{(MS_1 + MS_2 - MS_3)^2}{\dfrac{MS_1^2}{\nu_1} + \dfrac{MS_2^2}{\nu_2} + \dfrac{MS_3^2}{\nu_3}} .$$


A quasi-F statistic is simply the ratio of two estimated variances,
at least one of which has been formed through a linear combination
of independent mean squares (Hudson & Krutchkoff, 1968; Gaylor &
Hopper, 1969). The potential utility of the quasi-F statistic in
situations of positive dependence can be seen by returning to the $F_{S:T}$
statistic and its inadequacies. Given positive dependence and the
null hypothesis for treatment,

$$E(MS_T) = \sigma^2_E + s\sigma^2_{A^*} ,$$

while

$$E(MS_{S:T}) = \sigma^2_E + \frac{(c-1)}{(cs-1)} \sigma^2_{A^*}$$

(Glendening, 1977). Applying the strategy of constructing a quasi-F
test, the following ratio can be formed:

$$F' = \frac{MS_T - MS_{C:T} + MS_{S:CT}}{MS_{S:CT}} .$$


The expected values of the numerator and the denominator of $F'$ are
equal under the null hypothesis of no treatment effects. This
equality can be seen by recalling that

$$E(MS_T) = \sigma^2_E + s\sigma^2_{A^*} + sc \sum_{i=1}^{t} \frac{\alpha_i^2}{t-1} ,$$

$$E(MS_{C:T}) = \sigma^2_E + s\sigma^2_{A^*} ,$$

and

$$E(MS_{S:CT}) = \sigma^2_E .$$

Thus,

$$E(MS_T - MS_{C:T} + MS_{S:CT}) = \sigma^2_E + sc \sum_{i=1}^{t} \frac{\alpha_i^2}{t-1} ,$$

and so, under the null hypothesis of no treatment effects,
$MS_T - MS_{C:T} + MS_{S:CT}$ and $MS_{S:CT}$ estimate the same parameter,
$\sigma^2_E$.


The apparent reference distribution for $F'$ is a central F
distribution with first and second degrees of freedom, $\nu_1$ and $\nu_2$,
provided by

$$\hat{\nu}_1 = \frac{(MS_T - MS_{C:T} + MS_{S:CT})^2}{\dfrac{(MS_T)^2}{(t-1)} + \dfrac{(-MS_{C:T})^2}{t(c-1)} + \dfrac{(MS_{S:CT})^2}{tc(s-1)}}$$

and $\nu_2 = tc(s-1)$.
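
The numerator degrees of freedom for $F'$ follow the Satterthwaite form for a linear combination of mean squares. The following Python sketch shows one way the calculation might be carried out; the function names and the mean square values are hypothetical, and the coefficients (1, -1, 1) are those of the numerator $MS_T - MS_{C:T} + MS_{S:CT}$.

```python
def satterthwaite_df(mean_squares, coefficients, dfs):
    """Approximate degrees of freedom for the combination sum(a_i * MS_i)."""
    combo = sum(a * ms for a, ms in zip(coefficients, mean_squares))
    denom = sum((a * ms) ** 2 / v
                for a, ms, v in zip(coefficients, mean_squares, dfs))
    return combo ** 2 / denom


def f_prime_numerator_df(ms_t, ms_ct, ms_sct, t, c, s):
    """Estimated numerator df for F' = (MS_T - MS_C:T + MS_S:CT) / MS_S:CT."""
    return satterthwaite_df(
        [ms_t, ms_ct, ms_sct],
        [1, -1, 1],
        [t - 1, t * (c - 1), t * c * (s - 1)],
    )


# Hypothetical mean squares for t = 2 treatments, c = 3 classes, s = 5 students.
print(f_prime_numerator_df(ms_t=12.0, ms_ct=5.0, ms_sct=2.0, t=2, c=3, s=5))
```

Because the squared terms in the denominator are all positive while the numerator involves a subtraction, the estimated degrees of freedom can be small, and can fall below one, when $MS_{C:T}$ is close to $MS_T$.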

Unfortunately, $F'$ is not a legitimate quasi-F ratio. As can be
seen, the complex variance of the numerator and the simple variance of
the denominator are not independent; $MS_{S:CT}$ is used in both places.
The $F'$ statistic, which appeared to hold promise as a test with greater
power than $F_{C:T}$ in situations of dependence and with no cost of
additional information, has been found to lack a known distribution.
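
The two properties of $F'$ discussed above, equal expectations under the null hypothesis and a numerator that is not independent of the denominator, can be checked numerically. The sketch below is a hedged illustration, not part of the original analysis: it assumes normality, so that a mean square with expectation E and df degrees of freedom has variance 2E²/df, and it assumes that $MS_T$, $MS_{C:T}$ and $MS_{S:CT}$ are mutually independent in the balanced design. All function names and parameter values are hypothetical.

```python
import math


def expected_mean_squares(t, c, s, var_e, var_a_star, alphas):
    """Expected mean squares for the fully specified two level nested model."""
    treatment_term = s * c * sum(a ** 2 for a in alphas) / (t - 1)
    e_ms_t = var_e + s * var_a_star + treatment_term
    e_ms_ct = var_e + s * var_a_star
    e_ms_sct = var_e
    return e_ms_t, e_ms_ct, e_ms_sct


def numerator_denominator_correlation(t, c, s, var_e, var_a_star):
    """Correlation between the numerator and denominator of F' under H0."""
    e_t, e_ct, e_sct = expected_mean_squares(t, c, s, var_e, var_a_star, [0.0] * t)
    var_t = 2 * e_t ** 2 / (t - 1)
    var_ct = 2 * e_ct ** 2 / (t * (c - 1))
    var_sct = 2 * e_sct ** 2 / (t * c * (s - 1))
    # Cov(MS_T - MS_C:T + MS_S:CT, MS_S:CT) reduces to Var(MS_S:CT).
    return var_sct / math.sqrt((var_t + var_ct + var_sct) * var_sct)


e_t, e_ct, e_sct = expected_mean_squares(2, 3, 5, 1.0, 0.25, [0.0, 0.0])
print(e_t - e_ct + e_sct, e_sct)  # numerator and denominator expectations
print(numerator_denominator_correlation(2, 3, 5, 1.0, 0.25))
```

The correlation is strictly positive for any design, which is the sense in which the numerator and denominator fail to be independent.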
In the search for a more powerful test of treatment effects in
situations of dependence, two approaches have been considered. Both
the procedure of preliminary testing for dependence and conditional
pooling and the building of a quasi-F statistic have been seen to
yield test statistics with unknown distributions. As an alternative
to these two approaches, perhaps there exist ways to adjust the
individual observations for the dependence they reflect. This
possibility is considered in the following section. For the approach
to be successful in correcting dependence, it must meet two
criteria: (1) provide adjustment that leads to the condition of
independence on the adjusted observations; and (2) provide a
statistical test that has a known sampling distribution and has higher
power than the $F_{C:T}$ test.

Using a Covariate to Adjust for Positive Dependence

 

Glendening and Porter (1974) suggested the possibility of using
analysis of covariance (ANCOVA) to adjust for positive dependence.
The effects of positive dependence, as they pointed out, can be
conceptualized as similar in form to the problem of confounding in quasi
experiments. Given positive dependence, the problematic variance of
the class effects ($\sigma^2_{A^*}$) exists in the expected mean square for
treatments, i.e.,

$$E(MS_T) = sc \sum_{i=1}^{t} \frac{\alpha_i^2}{t-1} + s\sigma^2_{A^*} + \sigma^2_E .$$

Thus $\sigma^2_{A^*}$ might be removed from $E(MS_T)$ by ANCOVA procedures,
conceptually leaving the adjusted observations free from positive
dependence.

Glendening and Porter conjectured that removing positive dependence
from the individual observations was possible if (1) the
covariate, X, and the dependent variable, Y, have equal degrees of
dependence (i.e., $E(MS^X_{C:T})/E(MS^X_{S:CT}) = E(MS^Y_{C:T})/E(MS^Y_{S:CT})$), and (2)
the correlation of X and Y within classrooms was equal to one.
Following their lead, the present study investigated the possibility
of ANCOVA as an analysis strategy in situations of positive
dependence. Their rationale for ANCOVA, however, applies equally
well to an index of response strategy (i.e., Z = Y - KX where Z is
the index of response). Conceptually, ANCOVA and index of response
are closely tied, the main difference being that ANCOVA estimates the
value of K while index of response requires that K be set a priori.
The investigation in this study started with an examination of the
potential of an index of response model. Only given a reasonable
solution using the index of response strategy would the utility of
ANCOVA be investigated. If no solution for the dependence problem
is found using an index of response model, it is unlikely that a
solution exists in the corresponding but more complex ANCOVA model.
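
The distinction between the two strategies can be illustrated with a short Python sketch. It is illustrative only: the function names are hypothetical, the data are invented, and the slope shown is the ordinary least-squares slope pooled within groups, which is assumed here as the ANCOVA estimate of K.

```python
def index_of_response(y, x, k):
    """Z = Y - K*X, with K fixed a priori by the researcher."""
    return [yi - k * xi for yi, xi in zip(y, x)]


def pooled_within_slope(groups):
    """Least-squares slope of Y on X pooled within groups.

    groups: list of (x_values, y_values) pairs, one pair per group.
    Assumed here to play the role of the K that ANCOVA estimates from data.
    """
    sxy = 0.0
    sxx = 0.0
    for xs, ys in groups:
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        sxy += sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
        sxx += sum((xi - mean_x) ** 2 for xi in xs)
    return sxy / sxx


# Invented pre test (x) and post test (y) scores for two groups.
groups = [([1, 2, 3], [2, 4, 6]), ([2, 3, 4], [10, 12, 14])]
k_hat = pooled_within_slope(groups)
print(k_hat)
print(index_of_response(groups[0][1], groups[0][0], k=k_hat))
```

In the index of response strategy the value passed as k would be chosen before the data are seen, whereas the second function computes it from the sample.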

Unlike the preliminary testing and conditional pooling strategy
or the quasi-F approach, the index of response and the ANCOVA
strategies require information in addition to observations on the
dependent variable. To examine the index of response and ANCOVA
strategies, it is important to distinguish between when and how
dependency may arise in a study. These distinctions facilitate
understanding the causes of dependence and so may help to inform
the selection of an appropriate covariable.

In summary, there were two main related tasks that this study
intended to accomplish:

(1) to classify situations of independence and dependence in
experimental studies that assume independence at the group level,
and

(2) to investigate the possibility of using an index of response
and its corresponding ANCOVA models to correct for positive dependence
among the individual units.

CHAPTER III

SITUATIONS OF INDEPENDENCE AND DEPENDENCE

As has been stated, the problems created by positive dependence
are analogous to the problem created by confounding variables in quasi
experiments. If there are no classroom effects in the two level
nested design under consideration, there is independence at the level
of individuals. Thus, an attempt to remove dependence from the data
can be viewed as an attempt to adjust classroom effects to zero. If
classrooms were confounded with treatments in a quasi experiment, the
same adjustment procedure which removed dependence would also remove
the confounding effect of classrooms in that quasi experiment.

Two essential assumptions to the success of an "adjustment"
strategy in removing confounding effects in a quasi-experiment are
correct specification of the covariable and proper specification of
the analytic model (Olejnik, 1977). Therefore, correct specification
of the covariate and appropriate specification of the analytic model
are also required in this study. The problems inherent in the general
assessment of these two assumptions are formidable (Cronbach, Rogosa,
Floden, & Price, 1977; Olejnik, 1977).

Importantly, these two assumptions place the responsibility for a
successful adjustment on the researcher. Besides being
knowledgeable in the substantive aspect of his experiment, a
researcher must understand design problems that lead to situations of
dependence. When dealing with hierarchically nested data, it is
helpful for a researcher to understand "when" and "how" dependence
problems can arise in an experiment. Given this knowledge, the
researcher may be able to avoid problems of dependence through careful
design of his study. Further, when experimental control is not
possible for a certain type of dependence, the researcher will be in
a better position to specify and measure the dependence (i.e.,
correctly specify the covariate and analysis model).

The intention of this chapter is to provide better understanding
of correct specification of a covariate. The discussion starts with
a classification of independence and dependence situations in an
experimental study that involves two level hierarchically nested data.
As stated previously, independence at the class level is assumed. At
the outset of the study, two conditions are identified: with and
without random assignment of students to classrooms. During the
experimental period, an additional two conditions are identified:
no class effects and class effects. Together, these two dichotomous
dimensions classify four possible situations in an experimental study.
Each situation will be discussed to generate potential sources of
dependence (which in turn will be used to inform selection of
covariates).

Initial Dependence and During-Experiment Dependence

The following classification of situations of independence and
dependence in an experimental study is similar to how Porter (1972)
classifies situations of confounding variables in a quasi-experimental
study. Two important questions are "when" and "how" the dependence
arises. Answers to these two questions can serve to inform the
design of an experiment involving hierarchically nested data.

When measurement of an outcome variable is taken immediately
after the intervention, there are two places where dependence can
arise: (1) at the outset of the experiment and (2) during the
experiment. The first kind of dependence, initial dependence, is likely
to occur from the lack of random assignment of analysis units to
classrooms. To protect against initial dependence in a nested design,
random assignment to classrooms is essential.

The second type of dependence, occurring during the experiment,
may arise from interactions among analysis units while they receive
treatments (Cox, 1958). For example, when a treatment is delivered
to intact classrooms, common class experiences may reduce the
variability of students within the same class. Cronbach (1976)
suggested that unless a researcher is prepared to assume that
students within an intact classroom are treated independently by a
treatment and respond independently from each other, students are not
independent. While Cronbach's definition of dependence is in terms
of process, the process he identifies may also result in dependence
as it has been defined here in terms of observations on the dependent
variable. Since during-experiment dependence occurs after the start
of an experiment, it can exist even in a completely randomized
experiment.


Four Possible Situations of Dependence and Independence

In summary, dependence can arise because students were not
randomly assigned to classrooms (initial dependence) and/or because
of classroom effects which occur during the study (during-experiment
dependence). Crossing these two dichotomous possibilities defines
four situations (Figure 3.1). Situation I is the only situation that
does not violate the assumption of independence. Situations II and
III suffer from initial and during-experiment dependence respectively.

In Situation IV, both types of dependence exist.

 

                                        During Experiment
                               ------------------------------------
Before Experiment              No Class Effects   Has Class Effects
-------------------------------------------------------------------
With random assignment
of students to classes                I                  III

Without random assignment
of students to classes                II                 IV
-------------------------------------------------------------------

I   - independence situation
II  - initial dependence situation
III - during-experiment dependence situation
IV  - initial plus during-experiment dependence situation

Figure 3.1: Four Possible Situations of Independence and
Dependence in a Two Level Hierarchically Nested Design
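
The two-by-two classification in Figure 3.1 can also be written as a simple lookup. The Python sketch below is only a restatement of the figure; the function name and the boolean argument names are hypothetical.

```python
def classify_situation(random_assignment: bool, class_effects: bool) -> str:
    """Return the Figure 3.1 situation (I-IV) for a two level nested design.

    random_assignment: students were randomly assigned to classes.
    class_effects: class effects arise during the experiment.
    """
    table = {
        (True, False): "I",    # independence
        (False, False): "II",  # initial dependence
        (True, True): "III",   # during-experiment dependence
        (False, True): "IV",   # initial plus during-experiment dependence
    }
    return table[(random_assignment, class_effects)]


print(classify_situation(random_assignment=False, class_effects=False))
```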


Situation I: Independence. Situation I, the independence
situation, is possible when at the outset of an experiment students
are randomly assigned to classes and during the course of intervention
treatments are delivered independently to individual students. An
example of this kind of treatment might be a study of different types
of individualized instruction. While an experiment dealing with
individualized instruction might be conducted in classroom settings,
students might still be required to react individually to the
instructional packets. Under Situation I, $E(MS_{C:T})$ and $E(MS_{S:CT})$ are
equal. An analysis of variance using the pooled model ($F_{S:T}$) is the best
strategy for testing the no treatment effect hypothesis (Glendening,
1977; Cronbach, 1976).

Situation II: Initial Dependence. Situation II includes initial
dependence only. In this situation, common experience effects (i.e.,
class effects) during the experiment are controlled but students are
not randomly assigned to classrooms. When students are not randomly
assigned to classrooms, the samples of students within classes are
best thought of as coming from distinct populations (Cronbach et al.,
1977). In general, the class populations will have different
distributions on the dependent variable regardless of treatment effects.
Consequently, $E(MS_{C:T})$ is greater than $E(MS_{S:CT})$, and one has the
problem of positive dependence.

For Situation II, the value of $\sigma^2_{A^*}$ is solely a function of
initial dependence. If dependence is to be removed from the data
through index of response or ANCOVA strategies, the covariable must
reflect initial class differences that are predictive of class
differences in the dependent variable. For experiments which fit
Situation II, a pre test seems potentially the best covariable for
removing dependence. Nevertheless, the utility of a pre test for
removing dependence will be a function of the relationships between
the type of natural growth on the dimension measured by the pre and
post tests and the relationship between pre and post tests reflected
in the analytic model (Bryk & Weisberg, 1977).

Situation III: During-Experiment Dependence. Situation III
suffers from dependence which occurs during the experiment. In this
situation, a researcher is able to randomly assign students to classes,
creating initial independence. However, the researcher may not be
able to eliminate, through design, effects of common class experiences
(i.e., class effects) that occur during the course of the intervention.
Webb (1977) perceived class effects as group process effects
that cause dependence on the outcome variable dimension. He explained
that knowledge of group processes in a particular class is crucial
for understanding and estimating the degrees of dependence in a
classroom. Since the knowledge of group processes would guide a researcher
as to where to look for potential covariates, Webb concluded that
studying group process may be the only way to get at this dependence.
Group process effects, however, are a function of complex and
global variables including effects due to subject matter, teacher
effectiveness, teaching strategies, student interactions and classroom
milieu which are not part of the treatment effects. For
example, differences in teacher effectiveness may result in classroom
effects and so dependence at the level of individuals. But predicting
differential teacher effectiveness is a substantive problem
in its own right (Dunkin & Biddle, 1974). In the literature of
research in classroom teaching and classroom learning, other group
process variables identified above have been placed in the black box
of classroom setting about which little is known (e.g., Bloom, 1976;
Dunkin & Biddle, 1974).

Confusion in specifying the class effects to be removed to
create independence increases when one considers the possibility that
some class effects may also be part of treatment effects. For example,
a researcher investigating effects of teaching methods on achievement
of elementary students may include differences in instructional skills
(one kind of class effects) of teachers as a part of the treatment
effects. However, differences in management skills (another kind
of class effects) of which he may not be aware could cause dependence
among students within the same classrooms. Further, since during-
experiment dependence occurs within the same period as the treatments,
potential interactions between dependence and treatments must be
considered. Thus, in Situation III, it is important for a researcher
to understand and carefully describe what constitutes the treatment
effects and what may be nuisance variables that would induce
dependence among units of analysis. If clear distinctions are not made
between treatment effects and nuisance variables, an adjustment to
create independence might at the same time remove part of the
treatment effects from the data as well.


Clearly, the criteria for selecting covariates to remove depen-
dence in Situation III are quite different from the criteria for
Situation II. In Situation II, the covariate must predict initial
differences while, in Situation III, the covariate must predict
effects of classrooms that are not part of treatment effects. The
use of a pre test as an adjustment variable in Situation III is
unlikely to be helpful. As Porter (1972) stated:

     Although I believe pre tests to be the best predictors
     of initial differences it does not necessarily follow
     that they are also the best predictors of differences
     that occur in the dependent variable dimension during
     program participation which are not a function of program
     participation. My reasoning is that initial differences
     are a function of all that has preceded the study in the
     life of the child, while differences that occur during
     the study other than due to program most likely are
     primarily a function of the child's environment at that
     time. (p. 19)

Situation IV: Initial and During-Experiment Dependence. The
last situation, Situation IV, is the most complicated. This situation
suffers from initial dependence and during-experiment dependence. An
experimental study that falls into this category has two design
problems. First, it lacks random assignment of students to classes,
and, second, it deals with treatments that are delivered in group
settings. Thus, both the magnitude of initial dependence and the
magnitude of during-experiment dependence are contained in $\sigma^2_{A^*}$.
Specification of an appropriate covariable or covariables to adjust
$\sigma^2_{A^*}$ to zero (independence) is extremely complicated. Application of
structural equation strategies to classify important causal variables
in longitudinal data (Schmidt, 1975) may be helpful in identifying
appropriate covariables in Situation IV. In addition to all the
difficulties identified for Situations II and III, a researcher in
Situation IV must not ignore the possibility of interactions between
initial differences and class process differences.

Having described three types of situations in which there is
dependence at the level of individuals, the present study limits its
focus to the use of the index of response and ANCOVA models in
Situation II. There are two reasons for focusing on Situation II.
First, the problem of modeling dependence in a design is not well
understood in the literature. To be able to understand this modeling
problem, one should start with a simple case. Second, the initial
dependence problem, underlying Situation II, is comparable to the
initial confounding problem in a quasi-experimental study. The
literature contains a great deal of discussion about the utility of
index of response and ANCOVA for controlling the effects of con-

founding in quasi-experiments.

CHAPTER IV

CORRECTIONS FOR INITIAL DEPENDENCE USING INDEX OF RESPONSE
AND ANALYSIS OF COVARIANCE STRATEGIES

Given dependence among individuals, the $F_{C:T}$ statistic provides
a valid test of the null hypothesis about treatments, but for most
educational research its power is low. The goal of this study is
to explore alternative tests for treatment effects when dependence
is present. These alternatives will be evaluated against the
criteria of an actual Type I error rate in agreement with the nominal
value and power that exceeds that of the $F_{C:T}$ test.

The review of literature in Chapter II provided two helpful
conceptions for the investigation. The first conception was an
operational definition of independence and dependence. Given a
two level balanced nested design where both the group effect ($A^*_{ij}$)
and the individual effect ($E_{ijk}$) are random, independence is defined
as the condition when $\sigma^2_{A^*}$ is equal to zero. Dependence is defined as
the condition when $\sigma^2_{A^*}$ is greater than zero. The second conception
was to recast dependence as equivalent to the effect of confounding
at the class level. The reconceptualization suggests approaches to
analysis that use adjustment strategies comparable to those used to
remove the effects of confounding in quasi-experimental studies.
Two such adjustment strategies that are investigated here are index
of response and analysis of covariance.

In dealing with adjustment strategies, both correct specification
of adjustment variables (covariables) and proper specification of the
analytic model are necessary. Chapter III provided a classification
of independence and dependence situations which could facilitate
proper specification of covariates. In this chapter the parameters
of the two analytic models are specified for the situation of initial

dependence.

Modeling Situation II Dependence

 

Situation II represents designs in which students were not
randomly assigned to classrooms. Thus, as has been noted, each
classroom must be considered a separate population. Given normality,
these populations may differ in both means and variances. The set
of Situation II designs considered here is restricted, however, to
classroom populations which differ only in terms of means. A linear
system of structural equations consistent with the above restriction
can be specified as follows:

Model:

$$Y_{ijk} = \mu + \alpha_i + A^*_{ij} + E_{ijk}$$

$$X_{ijk} = \mu^X + A_{ij} + V_{ijk}$$

$$A^*_{ij} = B_1 A_{ij} + H_{ij}$$

and

$$E_{ijk} = B_2 V_{ijk} + G_{ijk}$$

for i = 1, 2, ..., t; j = 1, 2, ..., c; k = 1, 2, ..., s,

where

$Y_{ijk}$ is a post test score of individual k in class j receiving
treatment i;

$\mu$ is the grand mean of Y;

$\alpha_i$ is the effect of treatment i;

$A^*_{ij}$ is the effect (as measured by post test) of being in class
j which is nested in treatment i;

$E_{ijk}$ is the specification error at post test;

$X_{ijk}$ is a pre test score of individual k in class j receiving
treatment i;

$\mu^X$ is the grand mean of X;

$A_{ij}$ is the initial effect (as measured by pre test) of being
in class j which is nested in treatment i;

$V_{ijk}$ is the specification error at pre test;

$H_{ij}$ is the residual of $A^*_{ij}$ given $A_{ij}$;

$G_{ijk}$ is the residual of $E_{ijk}$ given $V_{ijk}$;

$B_1$ is the structural regression coefficient that predicts $A^*_{ij}$
from $A_{ij}$;

and $B_2$ is the structural regression coefficient that predicts $E_{ijk}$
from $V_{ijk}$.

This structural model can be represented by a causal diagram as
follows:

[Causal diagram not reproduced]

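
To make the structural model concrete, the following Python sketch generates pre test and post test scores from the four equations above. It is an illustration under stated assumptions rather than a procedure used in this study: the function name, argument names, default seed and all parameter values are hypothetical, and the four random components are drawn as independent normal deviates.

```python
import random


def simulate_situation_ii(t, c, s, mu, mu_x, alphas, b1, b2,
                          sd_a, sd_h, sd_v, sd_g, seed=0):
    """Generate (i, j, k, x, y) rows from the Situation II structural model."""
    rng = random.Random(seed)
    rows = []
    for i in range(t):
        for j in range(c):
            a_ij = rng.gauss(0.0, sd_a)                # initial class effect on X
            a_star = b1 * a_ij + rng.gauss(0.0, sd_h)  # A* = B1*A + H
            for k in range(s):
                v = rng.gauss(0.0, sd_v)               # pre test specification error
                e = b2 * v + rng.gauss(0.0, sd_g)      # E = B2*V + G
                x = mu_x + a_ij + v
                y = mu + alphas[i] + a_star + e
                rows.append((i, j, k, x, y))
    return rows


# Hypothetical design: 2 treatments, 3 classes each, 4 students per class.
data = simulate_situation_ii(t=2, c=3, s=4, mu=50.0, mu_x=40.0,
                             alphas=[1.0, -1.0], b1=0.5, b2=0.5,
                             sd_a=2.0, sd_h=1.0, sd_v=4.0, sd_g=3.0)
print(len(data), data[0])
```

Setting sd_a and sd_h to zero removes the class effects and so produces data that satisfy independence at the level of individuals.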

Additional distributional assumptions of the model are

$$A_{ij} \sim NID(0, \sigma^2_A)$$

$$H_{ij} \sim NID(0, \sigma^2_H)$$

$$V_{ijk} \sim NID(0, \sigma^2_V)$$

$$G_{ijk} \sim NID(0, \sigma^2_G)$$

and $\sigma_{AH} = \sigma_{VG} = \sigma_{AV} = \sigma_{AG} = \sigma_{HV} = \sigma_{HG} = 0$,

where $\sigma^2_p$ denotes the variance of variable p and $\sigma_{pq}$ denotes the
covariance of variables p and q.
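From the above assumptions of the model, the covariance structures
of Y, X and XY are in the form of super diagonal matrices.
Specifically,

$$\Sigma^Y = \begin{bmatrix} M^Y & \phi & \cdots & \phi \\ \phi & M^Y & \cdots & \phi \\ \vdots & \vdots & \ddots & \vdots \\ \phi & \phi & \cdots & M^Y \end{bmatrix}$$

where

$\Sigma^Y$ is the covariance matrix of Y with dimensions of tcs x tcs;

$M^Y$ is the covariance matrix of students within a classroom on Y
with dimensions of s x s;

$$M^Y = \sigma^2_{A^*} \mathbf{1}\mathbf{1}' + \sigma^2_E I = \begin{bmatrix} (\sigma^2_E + \sigma^2_{A^*}) & \sigma^2_{A^*} & \cdots & \sigma^2_{A^*} \\ \sigma^2_{A^*} & (\sigma^2_E + \sigma^2_{A^*}) & \cdots & \sigma^2_{A^*} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma^2_{A^*} & \sigma^2_{A^*} & \cdots & (\sigma^2_E + \sigma^2_{A^*}) \end{bmatrix}$$

and $\phi$ is a null matrix of s x s dimensions.
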

 

 

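$$\Sigma^X = \begin{bmatrix} M^X & \phi & \cdots & \phi \\ \phi & M^X & \cdots & \phi \\ \vdots & \vdots & \ddots & \vdots \\ \phi & \phi & \cdots & M^X \end{bmatrix}$$

where

$\Sigma^X$ is the covariance matrix of X with dimensions of tcs x tcs;

$M^X$ is the covariance matrix of students within a class on X with
s x s dimensions;

$$M^X = \sigma^2_A \mathbf{1}\mathbf{1}' + \sigma^2_V I = \begin{bmatrix} (\sigma^2_A + \sigma^2_V) & \sigma^2_A & \cdots & \sigma^2_A \\ \sigma^2_A & (\sigma^2_A + \sigma^2_V) & \cdots & \sigma^2_A \\ \vdots & \vdots & \ddots & \vdots \\ \sigma^2_A & \sigma^2_A & \cdots & (\sigma^2_A + \sigma^2_V) \end{bmatrix}$$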
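and

$$\Sigma^{XY} = \begin{bmatrix} M^{XY} & \phi & \cdots & \phi \\ \phi & M^{XY} & \cdots & \phi \\ \vdots & \vdots & \ddots & \vdots \\ \phi & \phi & \cdots & M^{XY} \end{bmatrix}$$

where

$\Sigma^{XY}$ is the covariance matrix of the cross product XY with
dimensions tcs x tcs;

$M^{XY}$ is the covariance matrix of students within a classroom on
XY with dimensions s x s;

$$M^{XY} = \sigma_{AA^*} \mathbf{1}\mathbf{1}' + \sigma_{VE} I = \begin{bmatrix} (\sigma_{AA^*} + \sigma_{VE}) & \sigma_{AA^*} & \cdots & \sigma_{AA^*} \\ \sigma_{AA^*} & (\sigma_{AA^*} + \sigma_{VE}) & \cdots & \sigma_{AA^*} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{AA^*} & \sigma_{AA^*} & \cdots & (\sigma_{AA^*} + \sigma_{VE}) \end{bmatrix} .$$
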

Dependence is reflected in non zero covariances among students
within a class. Given the structural models under consideration,
these non zero covariances are seen to equal $\sigma^2_{A^*}$, the variance of
the class effects. For simplicity, the structural model made the
restriction that every class be characterized by the same covariance
structure. Since classrooms are assumed to be independent, the covariance of
any two individuals from different classrooms is zero. Consequently,
the covariance matrix of Y, $\Sigma^Y$, is a super diagonal matrix. Further,
given initial dependence, where X is a pre test, the covariance
matrices of X and XY ($\Sigma^X$ and $\Sigma^{XY}$) are also super diagonal matrices.

Given independence, $\sigma^2_{A^*}$ is zero and $M^Y$ is a diagonal matrix.
Consequently, the $\Sigma^Y$ matrix becomes diagonal. Thus, to appropriately
adjust for dependence an analysis strategy must have a linear model
with a residual term that has a subject by subject diagonal
covariance matrix.
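
The super diagonal structure can be built explicitly for a small design. The sketch below uses plain Python lists; the function names and the variance values are hypothetical, and the within-class block follows the form (class variance) x 11' + (student variance) x I used for $M^Y$, $M^X$ and $M^{XY}$.

```python
def within_class_cov(s, var_class, var_student):
    """Return the s x s block var_class * 11' + var_student * I."""
    return [[var_class + (var_student if r == q else 0.0) for q in range(s)]
            for r in range(s)]


def block_diagonal(block, n_blocks):
    """Place n_blocks copies of block on the diagonal; null blocks elsewhere."""
    s = len(block)
    n = s * n_blocks
    sigma = [[0.0] * n for _ in range(n)]
    for b in range(n_blocks):
        for r in range(s):
            for q in range(s):
                sigma[b * s + r][b * s + q] = block[r][q]
    return sigma


# Hypothetical values: var_A* = 0.5, var_E = 1.0, s = 3 students, 2 classes.
m_y = within_class_cov(3, 0.5, 1.0)
sigma_y = block_diagonal(m_y, 2)
for row in sigma_y:
    print(row)
```

When var_class is set to zero every off diagonal entry vanishes, which is the independence condition described above.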

Index of Response Strategy

One alternative to using the full model for analysis of
variance (ANOVA) of the two level nested design, is analysis of
variance of index of response using the pooled model. An index of
response is defined by:

Z = Y - KX

where

Z is the index of response;

Y is the post test observation;

X is the pre test observation;

and K is a known constant.

Using Z as the dependent variable in ANOVA the linear model is

$$Z_{ijk} = \mu^Z + \alpha^Z_i + E^Z_{ijk}$$

where

$\mu^Z$ is the grand mean of index of response,

$\alpha^Z_i$ is the treatment i effect,

and $E^Z_{ijk}$ is the specification error.

Since for the set of designs under investigation classrooms are
randomly assigned to treatments, a treatment effect on Z is equal to
a treatment effect on Y. A treatment effect on Z is defined

$$\alpha^Z_i = \mu^Z_i - \mu^Z .$$

Since Z is a linear composite of X and Y,

$$\mu^Z_i = \mu_i - K\mu^X_i$$

and

$$\mu^Z = \mu - K\mu^X$$

so that by substitution

$$\alpha^Z_i = \mu_i - \mu - K(\mu^X_i - \mu^X) .$$

But, given random assignment of classrooms to treatments

$$\mu^X_1 = \mu^X_2 = \cdots = \mu^X_t = \mu^X$$

and

$$\alpha^Z_i = \mu_i - \mu = \alpha_i$$

regardless of the value of K.
Thus, the null hypothesis for treatment effects can be stated

$$\sum_{i=1}^{t} \alpha_i^2 = 0 .$$


Given the assumptions of the model, $F = MS_T/MS_{S:T}$ can be used to test
the null hypothesis. Of particular concern here is the assumption
of independence which can be stated

$$\mathbf{E}^Z \sim NID(\mathbf{0}, \sigma^2_{E^Z} I) ,$$

where $\mathbf{E}^Z$ and $\mathbf{0}$ are tcs x 1 and I is tcs x tcs.

Returning to the linear model for Z and restating in terms of
parameters of X and Y,

$$Z_{ijk} = \mu - K\mu^X + \alpha_i + A^*_{ij} - KA_{ij} + E_{ijk} - KV_{ijk} .$$

Using the relationships in the structural model under consideration
this becomes

$$Z_{ijk} = \mu - K\mu^X + \alpha_i + (B_1 - K)A_{ij} + H_{ij} + (B_2 - K)V_{ijk} + G_{ijk} .$$

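The Covariance Matrix of Z

The covariance structure of Z has the form of a super diagonal
matrix

$$\Sigma^Z = \begin{bmatrix} M^Z & \phi & \cdots & \phi \\ \phi & M^Z & \cdots & \phi \\ \vdots & \vdots & \ddots & \vdots \\ \phi & \phi & \cdots & M^Z \end{bmatrix}$$

of dimensions tcs x tcs, where the within-class covariance matrix
$M^Z$ (s x s) is

$$M^Z = M^Y + K^2 M^X - 2K M^{XY} .$$

In terms of the parameters of $M^Y$, $M^X$ and $M^{XY}$, $M^Z$ can be expressed
as

$$M^Z = (\sigma^2_{A^*} + K^2\sigma^2_A - 2K\sigma_{AA^*}) \mathbf{1}\mathbf{1}' + (\sigma^2_E + K^2\sigma^2_V - 2K\sigma_{VE}) I .$$

The diagonal element of $M^Z$ is

$$(\sigma^2_{A^*} + K^2\sigma^2_A - 2K\sigma_{AA^*}) + (\sigma^2_E + K^2\sigma^2_V - 2K\sigma_{EV}) ,$$

and the off diagonal element is

$$(\sigma^2_{A^*} + K^2\sigma^2_A - 2K\sigma_{AA^*}) .$$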
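
The diagonal and off diagonal elements of the within-class covariance matrix of Z can be evaluated for trial values of K. The Python sketch below is a hedged illustration: the function names and numerical values are hypothetical, and the moments $\sigma^2_{A^*} = B_1^2\sigma^2_A + \sigma^2_H$, $\sigma_{AA^*} = B_1\sigma^2_A$, $\sigma^2_E = B_2^2\sigma^2_V + \sigma^2_G$ and $\sigma_{VE} = B_2\sigma^2_V$ are derived here from the structural equations and the zero-covariance assumptions rather than quoted from the text.

```python
from fractions import Fraction


def implied_moments(b1, b2, var_a, var_h, var_v, var_g):
    """Variances and covariances implied by the structural model (assumed derivation)."""
    return {
        "var_a_star": b1 ** 2 * var_a + var_h,
        "cov_aa_star": b1 * var_a,
        "var_e": b2 ** 2 * var_v + var_g,
        "cov_ve": b2 * var_v,
    }


def mz_elements(k, var_a, var_v, var_a_star, cov_aa_star, var_e, cov_ve):
    """Return (diagonal, off-diagonal) elements of the covariance block of Z."""
    class_part = var_a_star + k ** 2 * var_a - 2 * k * cov_aa_star
    student_part = var_e + k ** 2 * var_v - 2 * k * cov_ve
    return class_part + student_part, class_part


half = Fraction(1, 2)
# Hypothetical parameters with no class-level residual (var_H = 0).
m = implied_moments(b1=half, b2=Fraction(3, 4), var_a=4, var_h=0, var_v=16, var_g=2)
diag, off = mz_elements(k=half, var_a=4, var_v=16, **m)
print(diag, off)
```

With no class-level residual and K set equal to $B_1$, the off diagonal element is zero; giving var_h a positive value leaves a positive off diagonal element for every K.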

For independence to exist at the level of individuals for Z,
22 must be diagonal. The 22 matrix will be a diagonal if MZ is

diagonal. To make Mz diagonal,

22

2
_ * *=
K oA 2KoAA + oA O ,

(a gradratic equation in K).

To solve for K,

2
* _/ * *
2oAA (2oAA )2 --4o2 AoA

 

 

 

 

 

K:
20A
Define
0' *
0* = AA
*
A A oA oA
Then,
* = * A
0AA 9 °A°A ’
J/fz 2 2 2*?
K_2°AA AA 4‘)AA AA 4AA
20A
* 2 2 2
+ * * *
- * o - pAA oA oA
‘0AA ““0 T-T
A o o

42

+ *—
o pAA ‘ pAA 1

The absolute value of the correlation coefficient, $|\rho_{AA^*}|$, ranges
from zero to one. If $|\rho_{AA^*}|$ is less than one, K will be a complex
number. Thus, the only real solution of K is when $|\rho_{AA^*}|$ is one and
$K = \sigma_{A^*}/\sigma_{A}$. But $|\rho_{AA^*}| = 1$ only when the class effects on X, $A_{ij}$,
are perfectly correlated with the class effects on Y, $A^{*}_{ij}$. This
perfect relationship between $A_{ij}$ and $A^{*}_{ij}$ dictates that the specification
errors at the class level in the structural model, i.e., $H_{ij}$, be
zero. Thus, the revised structural model that is appropriate for
the index of response strategy must be

$$Y_{ijk} = \mu + \alpha_i + A^{*}_{ij} + E_{ijk}$$
$$X_{ijk} = \mu^{X} + A_{ij} + V_{ijk}$$
$$A^{*}_{ij} = \beta_1 A_{ij}$$
$$E_{ijk} = \beta_2 V_{ijk} + G_{ijk} .$$

It is this structural model that is used through the rest of this
study.
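
The behavior of the roots of the quadratic in K can be checked numerically. This is a small sketch under assumed values; the function name `adjustment_roots` and the standard deviations 2 and 3 are hypothetical and serve only to show when a real solution exists.

```python
import cmath


def adjustment_roots(sd_a, sd_a_star, rho):
    """Both roots of K^2*var_A - 2*K*cov_AA* + var_A* = 0, as complex numbers."""
    var_a = sd_a ** 2
    cov = rho * sd_a * sd_a_star              # sigma_AA* = rho * sigma_A * sigma_A*
    disc = (2 * cov) ** 2 - 4 * var_a * sd_a_star ** 2
    root = cmath.sqrt(disc)                   # complex sqrt handles disc < 0
    return (2 * cov + root) / (2 * var_a), (2 * cov - root) / (2 * var_a)


perfect = adjustment_roots(2.0, 3.0, 1.0)     # rho = 1: a single real root 3/2
partial = adjustment_roots(2.0, 3.0, 0.5)     # rho = 0.5: complex conjugate roots
print(perfect, partial)
```

With a correlation of one the discriminant vanishes and both roots equal the ratio of the standard deviations; with a smaller correlation the roots have a nonzero imaginary part.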

In conclusion, when there is perfect correlation between class
effects for X and class effects for Y and $K = \sigma_{A^*}/\sigma_{A}$, ANOVA of
index of response correctly adjusts for dependence among individuals.
Since K must be known a priori, it is useful to explore different
ways of thinking about the ratio $\sigma_{A^*}/\sigma_{A}$. From the revised
structural model, $\beta_1$ is the regression coefficient of $A^{*}_{ij}$ on $A_{ij}$.
It can also be shown that under certain conditions the common
within treatment regression coefficient, $\beta_{S:T}$, for Y on X is equal
to $\sigma_{A^*}/\sigma_{A}$.

$$\beta_{S:T} = \frac{Cov(X_{ijk} - \mu^{X}_{i},\ Y_{ijk} - \mu_{i})}{Var(X_{ijk} - \mu^{X}_{i})}$$

Under the structural model $\beta_{S:T}$ can be expressed as

$$\beta_{S:T} = \frac{\beta_1 \sigma_{A}^{2} + \beta_2 \sigma_{V}^{2}}{\sigma_{A}^{2} + \sigma_{V}^{2}} \quad \text{(See Appendix B).}$$

If $\beta_1 = \beta_2$, then

$$\beta_{S:T} = \beta_1 .$$

Since

$$\beta_1 = \sigma_{A^*}/\sigma_{A} ,$$

$$\beta_{S:T} = \sigma_{A^*}/\sigma_{A} .$$

Thus, in addition to perfect correlation of the class effects, when
the structural regression coefficients at the class level, $\beta_1$, and
at the individual level, $\beta_2$, are equal, the common within treatment
regression coefficient, $\beta_{S:T}$, is a correct adjustment coefficient to
create independence.
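
The pooled within treatment slope is a variance-weighted average of the two structural coefficients, which a short computation makes concrete. The sketch below assumes hypothetical values; `pooled_within_slope` is an illustrative name, not a routine from the study.

```python
def pooled_within_slope(beta1, beta2, var_a, var_v):
    """beta_S:T = (beta1*var_A + beta2*var_V) / (var_A + var_V)."""
    return (beta1 * var_a + beta2 * var_v) / (var_a + var_v)


# Equal structural coefficients: the pooled slope reproduces beta1.
equal_case = pooled_within_slope(beta1=1.5, beta2=1.5, var_a=4.0, var_v=4.0)

# Unequal coefficients: the pooled slope falls between beta1 and beta2.
unequal_case = pooled_within_slope(beta1=1.5, beta2=0.5, var_a=4.0, var_v=4.0)
print(equal_case, unequal_case)
```

In the unequal case the pooled slope differs from $\beta_1$, so it would no longer remove the class-level component.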

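Further, it should be noted that for two level nested data,
if $\beta_1 = \beta_2$ then

$$\beta_{S:CT} = \beta_{C:T} = \beta_{S:T} = \beta_{Tot} = \beta_1 .$$

Define

$$\beta_{S:CT} = \frac{Cov(X_{ijk} - \mu^{X}_{ij},\ Y_{ijk} - \mu_{ij})}{Var(X_{ijk} - \mu^{X}_{ij})} .$$

Then,

$$\beta_{S:CT} = \frac{\beta_2 \sigma_{V}^{2}}{\sigma_{V}^{2}} = \beta_2 .$$

Given $\beta_1 = \beta_2$, $\beta_{S:CT} = \beta_1$.

With $\beta_{C:T}$ and $\beta_{Tot}$ defined analogously, given $\beta_1 = \beta_2$,
$\beta_{C:T} = \beta_1$ and $\beta_{Tot} = \beta_1$.

(The above illustrations use information from Appendices C and D.)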


Test Statistic

 

The test statistic for testing the treatment hypothesis can be
developed from the analysis of variance table of Z as shown in
Figure 4.1. From Figure 4.1, the expected mean square of treatments,
$E(MS^{Z}_{T})$, and the expected mean square of students pooled within
treatments, $E(MS^{Z}_{S:T})$, estimate the same parameter,
$(\beta_2 - \beta_1)^{2}\sigma_{V}^{2} + \sigma_{G}^{2}$,
under the null hypothesis of no treatment effects. Given the
assumptions about $V_{ijk}$ and $G_{ijk}$, the $F^{Z}_{S:T} = MS^{Z}_{T} / MS^{Z}_{S:T}$ ratio is
distributed as a central F distribution with t-1 and t(cs-1) degrees
of freedom under the null hypothesis of no treatment effects.

The power of the $F^{Z}_{S:T}$ test is higher than that of the $F_{C:T}$ test. The
power of the $F^{Z}_{S:T}$ test is a function of the probability of Type I
error, the size of the treatment effects ($\alpha_i$), $E(MS^{Z}_{S:T})$ and degrees
of freedom (t-1 and t(cs-1)). Since treatment effects are identical
for the two tests and the probability of Type I error can be held
constant, differences in power are only a function of the last two
factors.

The degrees of freedom for the error terms of $F^{Z}_{S:T}$ and $F_{C:T}$ are
t(cs-1) and t(c-1) respectively. Thus, everything else held constant,
$F^{Z}_{S:T}$ has higher power than $F_{C:T}$ because of higher degrees of freedom.

The effect on power of the two error terms is not straightforward.
Recall that

$$E(MS^{Z}_{S:T}) = (\beta_2 - \beta_1)^{2}\sigma_{V}^{2} + \sigma_{G}^{2}$$

and

$$E(MS_{C:T}) = s\sigma_{A^*}^{2} + \sigma_{E}^{2} .$$

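Model

$$Z_{ijk} = Y_{ijk} - K X_{ijk} , \qquad i = 1, 2, \ldots, t; \quad j = 1, 2, \ldots, c; \quad k = 1, 2, \ldots, s$$

or

$$Z_{ijk} = (\mu - \beta_1 \mu^{X}) + \alpha_i + (\beta_2 - \beta_1) V_{ijk} + G_{ijk}$$

Assumptions

$$V_{ijk} \sim NID(0, \sigma_{V}^{2}) , \qquad G_{ijk} \sim NID(0, \sigma_{G}^{2}) , \qquad \sigma_{VG} = 0$$

Source of
Variation        d.f.       Sum of Squares                                      Mean Square     E(MS)

Treatment (T)    t-1        $sc \sum_i (\bar{Z}_{i..} - \bar{Z}_{...})^{2}$     $MS^{Z}_{T}$    $\frac{sc \sum_i \alpha_i^{2}}{t-1} + (\beta_2 - \beta_1)^{2}\sigma_{V}^{2} + \sigma_{G}^{2}$

Student (S:T)    t(cs-1)    $\sum_{ijk} (Z_{ijk} - \bar{Z}_{i..})^{2}$          $MS^{Z}_{S:T}$  $(\beta_2 - \beta_1)^{2}\sigma_{V}^{2} + \sigma_{G}^{2}$

Given $H_0: \alpha_i = 0$, $\; MS^{Z}_{T} / MS^{Z}_{S:T} \sim F_{t-1,\ t(cs-1)}$

Figure 4.1: ANOVA Table for the Index of Response Model
Using Individual Students as the Units of Analysis
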

$E(MS_{C:T})$ can be expressed as

$$E(MS_{C:T}) = s\sigma_{A^*}^{2} + \beta_2^{2}\sigma_{V}^{2} + \sigma_{G}^{2} .$$

When $\beta_2 = \beta_1$, $E(MS^{Z}_{S:T})$ is clearly smaller than $E(MS_{C:T})$. Also, if $\beta_1$
and $\beta_2$ have the same sign, $E(MS^{Z}_{S:T})$ is smaller than $E(MS_{C:T})$. But,
when $\beta_1$ and $\beta_2$ have different signs, the magnitude of dependence and
class size must be taken into consideration. Thus, generally the
$F^{Z}_{S:T}$ test has smaller error variance than the $F_{C:T}$ test.
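
The comparison of the two error terms can be made concrete with a small computation. This is a sketch under assumed parameter values; the function names are hypothetical, and the class-effect variance on Y is taken as $\beta_1^{2}\sigma_{A}^{2}$, which follows from the revised structural model.

```python
def ems_student_index(beta1, beta2, var_v, var_g):
    """E(MS^Z_S:T) = (beta2 - beta1)^2 * var_V + var_G."""
    return (beta2 - beta1) ** 2 * var_v + var_g


def ems_class_anova(s, beta1, beta2, var_a, var_v, var_g):
    """E(MS_C:T) = s*var_A* + beta2^2 * var_V + var_G, with var_A* = beta1^2 * var_A."""
    var_a_star = beta1 ** 2 * var_a      # assumption: A* = beta1 * A
    return s * var_a_star + beta2 ** 2 * var_v + var_g


# Hypothetical values: 18 students per class, same-sign coefficients.
pooled_error = ems_student_index(beta1=1.5, beta2=0.5, var_v=4.0, var_g=1.0)
class_error = ems_class_anova(s=18, beta1=1.5, beta2=0.5,
                              var_a=4.0, var_v=4.0, var_g=1.0)
print(pooled_error, class_error)
```

For these assumed values the class-level error term is much larger because it carries the class variance multiplied by the class size.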

Given that data on a covariate exist for building an index of
response, the power of the $F^{Z}_{S:T}$ test might more appropriately be
compared to the power of $F^{Z}_{C:T}$. Given that $\rho_{A^*A} = 1.0$ and
$K = \sigma_{A^*}/\sigma_{A}$,

$$E(MS^{Z}_{C:T}) = E(MS^{Z}_{S:T}) .$$

(The analysis of variance table for index of response using the full
model is reported in Appendix E.) Since the two test statistics,
$F^{Z}_{S:T}$ and $F^{Z}_{C:T}$, have equal expected mean square errors, their
difference in power is solely a function of their difference in
degrees of freedom. Thus, because of greater degrees of freedom
$F^{Z}_{S:T}$ has greater power than $F^{Z}_{C:T}$.

In conclusion, index of response is a useful method of analysis
for two level nested data when the class effects on the covariate
are perfectly correlated with the class effects on the dependent
variable; i.e., $A^{*}_{ij} = \beta_1 A_{ij}$. Given that $A^{*}_{ij} = \beta_1 A_{ij}$, the
adjustment coefficient K, which must be known a priori, is equal to
the ratio of $\sigma_{A^*}$ over $\sigma_{A}$. Under this condition, another test can
be developed by using classroom as the unit of analysis, i.e.,
$F^{Z}_{C:T}$. The $F^{Z}_{S:T}$ and the $F^{Z}_{C:T}$ statistics test the correct treatment
hypothesis. The power of $F^{Z}_{S:T}$ is higher than that of $F_{C:T}$ because of larger
degrees of freedom and a generally smaller error term. When
comparing $F^{Z}_{S:T}$ to $F^{Z}_{C:T}$, the gain in power by using $F^{Z}_{S:T}$ is a
function of the gain in the degrees of freedom alone.

Analysis of Covariance

 

When the adjustment coefficient $K = \sigma_{A^*}/\sigma_{A}$ is not known a priori,
an alternative strategy that allows the coefficient to be estimated
from data would be helpful. One such strategy is analysis of
covariance (ANCOVA).

Consider a one way ANCOVA. The linear model involving a random
covariable is

$$Y_{ijk} = \mu + \alpha^{W}_{i} + \beta_{S:T}(X_{ijk} - \mu^{X}) + E^{W}_{ijk}$$

where
Y is the post test,
$\beta_{S:T}$ is the common bivariate regression slope of Y on X,
$\mu$ is the grand mean of Y,
W is used to denote adjustment by covariate X,
$\alpha^{W}_{i}$ is the adjusted treatment i effect,
X is the pre test,
$\mu^{X}$ is the grand mean on the pre test, and
$E^{W}_{ijk}$ is the error term.

The null hypothesis is stated

$$H_0: \alpha^{W}_{i} = 0 , \qquad i = 1, 2, \ldots, t .$$

For the model, a treatment effect is defined

$$\alpha^{W}_{i} = \mu_i - \mu - \beta_{S:T}(\mu^{X}_{i} - \mu^{X}) .$$

But given random assignment

$$\mu^{X}_{1} = \mu^{X}_{2} = \cdots = \mu^{X}_{t} = \mu^{X} ,$$

and

$$\alpha^{W}_{i} = \mu_i - \mu = \alpha_i$$

regardless of the value of $\beta_{S:T}$. Therefore, when the pooled model
ANCOVA is applied to data considered in this study, treatment effects
are unbiased and the correct treatment hypothesis is tested.
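
The role of random assignment in this argument can be seen by evaluating the definition of the adjusted effect directly. The following sketch uses hypothetical means; `adjusted_effect` is an illustrative name.

```python
def adjusted_effect(mu_i, mu, mu_x_i, mu_x, slope):
    """alpha^W_i = mu_i - mu - slope * (mu^X_i - mu^X)."""
    return mu_i - mu - slope * (mu_x_i - mu_x)


# Random assignment: the treatment covariate mean equals the grand covariate mean,
# so the adjusted effect does not depend on the slope.
randomized = [adjusted_effect(52.0, 50.0, 30.0, 30.0, slope) for slope in (0.3, 1.5)]

# Unequal covariate means (no random assignment): the slope now matters.
confounded = adjusted_effect(52.0, 50.0, 32.0, 30.0, 0.5)
print(randomized, confounded)
```

With equal covariate means the adjusted effect equals the unadjusted effect for any slope, which is the point made in the text.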

An F test of the null hypothesis about treatments can be provided
given the assumptions that

$$\underset{tcs \times 1}{E^{W}} \sim NID\left(\underset{tcs \times 1}{0},\ \sigma_{E^{W}}^{2} \underset{tcs \times tcs}{I}\right) .$$

In addition, ANCOVA assumes that the within treatment slopes are
equal across all values of i and that Y and X are bivariate normal.
These assumptions are the same as those for classical ANCOVA except
that the covariable is random. In using ANCOVA when X is random,
one can still obtain unbiased estimators and valid confidence intervals
and tests from the usual analysis. The only difference from
the classical result is that the variances of the estimators are
larger. Discussion of a random covariate in ANCOVA can be found

in DeGracie (1968), Winer (1971), and Huitema (1980).

The Covariance Matrix of $E^{W}$

 

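The pooled model ANCOVA can be expressed in terms of parameters
in the structural model as follows:

$$Y_{ijk} = \mu - \beta_{S:T}\mu^{X} + \alpha_i + \beta_{S:T} X_{ijk} + (\beta_1 - \beta_{S:T}) A_{ij} + (\beta_2 - \beta_{S:T}) V_{ijk} + G_{ijk}$$

where

$$E^{W}_{ijk} = (\beta_1 - \beta_{S:T}) A_{ij} + (\beta_2 - \beta_{S:T}) V_{ijk} + G_{ijk} .$$
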
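Under the independence assumption, the covariance matrix of $E^{W}_{ijk}$ must
be diagonal. From the assumptions of the structural model

$$\sigma_{AV} = \sigma_{AG} = \sigma_{VG} = 0 .$$

Therefore,

$$\underset{tcs \times tcs}{\Sigma^{E^{W}}} = (\beta_1 - \beta_{S:T})^{2} \underset{tcs \times tcs}{\Sigma^{A}} + (\beta_2 - \beta_{S:T})^{2} \underset{tcs \times tcs}{\Sigma^{V}} + \underset{tcs \times tcs}{\Sigma^{G}} ,$$

$$\underset{tcs \times tcs}{\Sigma^{A}} = \begin{bmatrix} M^{A} & \phi & \cdots & \phi \\ \phi & M^{A} & \cdots & \phi \\ \vdots & \vdots & \ddots & \vdots \\ \phi & \phi & \cdots & M^{A} \end{bmatrix}$$

where

$$\underset{s \times s}{M^{A}} = \sigma_{A}^{2} \underset{s \times 1}{\mathbf{1}}\ \underset{1 \times s}{\mathbf{1}'} .$$

But $\Sigma^{V}$ and $\Sigma^{G}$ are diagonal, i.e.,

$$\Sigma^{V} = \sigma_{V}^{2} I$$

and

$$\Sigma^{G} = \sigma_{G}^{2} I .$$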

 

 

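Thus,

$$\underset{tcs \times tcs}{\Sigma^{E^{W}}} = \begin{bmatrix} M^{E^{W}} & \phi & \cdots & \phi \\ \phi & M^{E^{W}} & \cdots & \phi \\ \vdots & \vdots & \ddots & \vdots \\ \phi & \phi & \cdots & M^{E^{W}} \end{bmatrix}$$

where the diagonal matrix of $\Sigma^{E^{W}}$ is

$$\begin{aligned}
\underset{s \times s}{M^{E^{W}}} &= (\beta_1 - \beta_{S:T})^{2} M^{A} + (\beta_2 - \beta_{S:T})^{2} M^{V} + M^{G} \\
 &= (\beta_1 - \beta_{S:T})^{2} \sigma_{A}^{2}\, \mathbf{1}\mathbf{1}' + (\beta_2 - \beta_{S:T})^{2} \sigma_{V}^{2} I + \sigma_{G}^{2} I .
\end{aligned}$$

The diagonal element of $M^{E^{W}}$ is

$$(\beta_1 - \beta_{S:T})^{2} \sigma_{A}^{2} + (\beta_2 - \beta_{S:T})^{2} \sigma_{V}^{2} + \sigma_{G}^{2} ,$$

and the off diagonal element is

$$(\beta_1 - \beta_{S:T})^{2} \sigma_{A}^{2} .$$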
For independence to exist at the level of individuals, $M^{E^{W}}$ must be
diagonal.

Given the structural model considered in this study and given
$\beta_1 = \beta_2$, it was shown that

$$\beta_{S:T} = \frac{\sigma_{A^*}}{\sigma_{A}} = \beta_1 .$$

When $\beta_1 = \beta_{S:T}$, the off diagonal elements of $M^{E^{W}}$ are zero, and $\Sigma^{E^{W}}$
becomes a diagonal matrix.
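
The diagonality condition for the ANCOVA error can be checked with a short computation. This is a sketch with hypothetical values; `ancova_error_cov` is an illustrative name, and the pooled slope is computed from the Appendix B expression.

```python
def ancova_error_cov(s, beta1, beta2, var_a, var_v, var_g):
    """Within-class covariance matrix of the pooled ANCOVA error, as nested lists."""
    # Pooled within treatment slope (Appendix B expression).
    beta_st = (beta1 * var_a + beta2 * var_v) / (var_a + var_v)
    off = (beta1 - beta_st) ** 2 * var_a                 # off diagonal element
    diag = off + (beta2 - beta_st) ** 2 * var_v + var_g  # diagonal element
    return [[diag if r == c else off for c in range(s)] for r in range(s)]


# Equal structural coefficients: off diagonal vanishes, diagonal is var_G.
m_equal = ancova_error_cov(s=3, beta1=1.5, beta2=1.5, var_a=4.0, var_v=4.0, var_g=1.0)

# Unequal coefficients: residual class-level covariance remains.
m_unequal = ancova_error_cov(s=3, beta1=1.5, beta2=0.5, var_a=4.0, var_v=4.0, var_g=1.0)
print(m_equal[0], m_unequal[0])
```

Only the equal-coefficient case yields a diagonal matrix, so the adjusted observations are independent within classes only under that restriction.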

Test Statistic

 

Given the structural model and the further restrictions that
$\rho_{A^*A} = 1$ and $\beta_1 = \beta_2$, the pooled model ANCOVA provides a valid
test of the null hypothesis about treatments. The test statistic is

$$F^{W}_{S:T} = MS^{W}_{T} / MS^{W}_{S:T} \sim F_{t-1,\ t(cs-1)} .$$

To be a useful analysis strategy, however, $F^{W}_{S:T}$ must also have
greater power than the test statistic for the full model. The most
appropriate comparison is to the test statistic for the full model
ANCOVA, $F^{W}_{C:T}$, since the use of a covariate will in general provide
a more powerful test statistic than the full model ANOVA.

$$F^{W}_{C:T} = \frac{MS^{W'}_{T}}{MS^{W}_{C:T}} \sim F_{t-1,\ t(c-1)}$$

where the prime is used to distinguish the treatment mean square of
the full model ANCOVA from the treatment mean square of the pooled
model ANCOVA.

A treatment effect defined by the pooled ANCOVA model is

$$\alpha^{W}_{i} = \mu_i - \mu - \beta_{S:T}(\mu^{X}_{i} - \mu^{X}) .$$

A treatment effect defined by the full ANCOVA model is

$$\alpha^{W'}_{i} = \mu_i - \mu - \beta_{C:T}(\mu^{X}_{i} - \mu^{X}) .$$

Treatment effects for the pooled and the full ANCOVA models are
identical under the condition that $\beta_1 = \beta_2$ since $\beta_{S:T} = \beta_{C:T}$. The
estimates of treatment effects for the two models differ only in
the estimates of slopes. Under the null hypothesis for treatments,

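
$$E(MS^{W}_{T}) = E(MS^{W}_{S:T}) = \sigma_{G}^{2} \left( 1 + \frac{1}{t(cs-1) - 2} \right)$$

and

$$E(MS^{W'}_{T}) = E(MS^{W}_{C:T}) = s\, \frac{\sigma_{G}^{2}}{s} \left( 1 + \frac{1}{t(c-1) - 2} \right) = \sigma_{G}^{2} \left( 1 + \frac{1}{t(c-1) - 2} \right) .$$

Then,

$$E(MS^{W'}_{T}) = \frac{1 + \dfrac{1}{t(c-1) - 2}}{1 + \dfrac{1}{t(cs-1) - 2}}\ E(MS^{W}_{T}) .$$

Therefore, $E(MS^{W'}_{T}) > E(MS^{W}_{T})$. Similarly, $E(MS^{W}_{C:T}) > E(MS^{W}_{S:T})$. The
gain in power by using the pooled ANCOVA model instead of the full
ANCOVA model is a function of increased degrees of freedom and a
slightly smaller mean square error.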
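
The size of the difference between the two expected mean squares can be gauged from the multipliers of $\sigma_{G}^{2}$. The sketch below assumes the multipliers have the form 1 + 1/(f - 2), with f = t(cs-1) for the pooled model and f = t(c-1) for the full model; the function names and the design sizes are hypothetical.

```python
def pooled_multiplier(t, c, s):
    """Assumed multiplier of var_G for the pooled ANCOVA: 1 + 1/(t(cs-1) - 2)."""
    return 1 + 1 / (t * (c * s - 1) - 2)


def full_multiplier(t, c):
    """Assumed multiplier of var_G for the full ANCOVA: 1 + 1/(t(c-1) - 2)."""
    return 1 + 1 / (t * (c - 1) - 2)


# Hypothetical design: 2 treatments, 3 classes per treatment, 18 students per class.
pooled = pooled_multiplier(t=2, c=3, s=18)
full = full_multiplier(t=2, c=3)
print(pooled, full, full / pooled)
```

For a design of this assumed size the pooled multiplier is close to one, while the full model multiplier is 1.5, so most of the difference comes from the small number of classes.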

CHAPTER V

SUMMARY AND CONCLUSIONS

A commonly used design in educational research involves hier-
archically nested data. Classrooms of students are randomly assigned
to receive one of two or more alternative educational treatments.

Since dependent variables in educational research are typically defined
on students, however, the design results in students nested within
classrooms and classrooms nested within treatments.

A fully specified model for the design includes sources of
variation for treatments, classrooms, and students. Given the fully
specified model, the null hypothesis about treatments can be tested
using $F_{C:T} = MS_{T}/MS_{C:T}$. There has been resistance on the part of
educational researchers to use the $F_{C:T}$ test statistic because for
these studies the test has few degrees of freedom for error and so
limited statistical power. For example, a design comparing two
treatments and having three classrooms per treatment might include
over 100 students and still have only four degrees of freedom error
for the $F_{C:T}$ test statistic.

As a result, researchers have sometimes turned to a pooled model
which ignores classroom variance. By ignoring classroom variance
the sources of variation become treatments and students. The
apparent test statistic for the treatment null hypothesis is then
$F_{S:T} = MS_{T}/MS_{S:T}$. The motivation for the pooled model can be
illustrated with the previous example. What had been four degrees of
freedom error for $F_{C:T}$ becomes more than 100 degrees of freedom
error for $F_{S:T}$.
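
The degrees of freedom in this example follow directly from the design sizes. The sketch below is illustrative only: the function names are hypothetical, and the class size of 18 is an assumed value chosen so that the design has over 100 students.

```python
def error_df_full(t, c):
    """Error degrees of freedom for F_C:T: t(c - 1)."""
    return t * (c - 1)


def error_df_pooled(t, c, s):
    """Error degrees of freedom for F_S:T: t(cs - 1)."""
    return t * (c * s - 1)


# Two treatments, three classrooms per treatment, an assumed 18 students per class.
t, c, s = 2, 3, 18
n_students = t * c * s
df_full = error_df_full(t, c)
df_pooled = error_df_pooled(t, c, s)
print(n_students, df_full, df_pooled)
```

Under these assumed sizes the design has 108 students, four degrees of freedom error for the full model, and 106 for the pooled model.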

The test statistic, $F_{S:T}$, for the pooled model requires that
observations on students be independent of each other. Violation of
the independence assumption when using $F_{S:T}$ has been shown to yield
a test which can be either too liberal or too conservative (Glendening,
1977; Paull, 1950). What is needed, then, are analysis strategies
which have greater degrees of freedom error than the $F_{C:T}$ test
statistic and which are valid when there is dependence at the level of
individuals among observations on the dependent variable.

The search for more powerful tests of treatment effects in
hierarchically nested data began with an operational definition of
independence. Given balanced designs with two levels of nested
data, Glendening (1977) defined independence as equivalent to when
the expected values of classrooms and students mean squares are
equal. If students are considered a random factor in the design,
as they were in the present investigation, dependence occurs when
there are classroom effects which make the expected value of the
mean square for classroom exceed the expected value of the mean
square for students. Glendening labeled this situation as positive
dependence. Glendening also considered the possibility of negative

dependence which can result when students are a fixed factor in the

design. Under positive dependence, the $F_{S:T}$ test is too liberal and
so has spurious power (Glendening, 1977; Paull, 1950). The first
analysis strategy posed as a solution to the problem which dependence
creates for the FS:T statistic, involves a preliminary test for
independence. Based on the outcome of this preliminary test, either
the full model or the pooled model is used to test the null hypothesis
about treatments (Glendening, 1977; Paull, 1950; Peckham, et al.,
1969).

To be successful, the preliminary test and conditional pooling
procedure must keep the actual alpha level of the consequent test
(the conditional test) close to the nominal alpha level. Further,
in order that the strategy be useful, the power of the conditional
test must be greater than the power of the unconditional FC:T test.

Glendening (1977) examined the validity of the preliminary
test and conditional pooling strategy. Analytically and empirically,
Glendening's findings opposed the use of the procedure. Glendening
concluded that given a moderate value of dependence, which is often
the case, the preliminary F test is not sensitive enough to help a
researcher guard against having a distorted probability of Type I
error for the conditional F test. This conclusion was consistent
with Paull (1950).

Since a preliminary test for independence and conditional
pooling does not result in a valid test of the treatment hypothesis,
the present investigation considered additional alternatives. First,

the use of a quasi-F ratio to correct for dependence was considered

(Satterthwaite, 1941, 1946). The potential utility of the quasi-F

statistic in situations of positive dependence can be seen in the
following ratio.

$$F' = \frac{MS_{T} - MS_{C:T} + MS_{S:CT}}{MS_{S:CT}}$$

Under the null hypothesis of no treatment effects, $E(MS_{T} - MS_{C:T} + MS_{S:CT})$
and $E(MS_{S:CT})$ estimate the same parameter, $\sigma_{E}^{2}$. The apparent
reference distribution for $F'$ is $F_{v_1, v_2}$, where $v_2$ was t(cs-1) and $v_1$
could be computed from a formula provided by Satterthwaite. But
the complex variance of the numerator and the simple variance of the
denominator are not independent. Thus, the $F'$ statistic, which appeared
to hold promise as a test with greater power than $F_{C:T}$ in situations
of dependence and with no cost of additional information, lacks a
known distribution.
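
The arithmetic of the quasi-F ratio and of Satterthwaite's approximate degrees of freedom for a linear combination of mean squares can be sketched as follows. The function names and the mean square values are hypothetical, and the sketch does not resolve the dependence between numerator and denominator noted above; it only shows how the quantities would be computed.

```python
def quasi_f(ms_t, ms_ct, ms_sct):
    """F' = (MS_T - MS_C:T + MS_S:CT) / MS_S:CT."""
    return (ms_t - ms_ct + ms_sct) / ms_sct


def satterthwaite_df(terms):
    """Approximate df for sum(a_i * MS_i); terms is a list of (a_i, MS_i, df_i)."""
    numerator = sum(a * ms for a, ms, _ in terms) ** 2
    denominator = sum((a * ms) ** 2 / df for a, ms, df in terms)
    return numerator / denominator


# Hypothetical mean squares for t = 2, c = 3, s = 18:
# df are t-1 = 1, t(c-1) = 4, and tc(s-1) = 102.
f_prime = quasi_f(ms_t=40.0, ms_ct=12.0, ms_sct=4.0)
v1 = satterthwaite_df([(1, 40.0, 1), (-1, 12.0, 4), (1, 4.0, 102)])
print(f_prime, v1)
```

The approximate numerator degrees of freedom are generally not an integer and, with a subtracted term as here, can be quite small.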

Glendening and Porter (1974) suggested the possibility of using
ANCOVA to adjust for positive dependence. They pointed out that the
effects of positive dependence could be conceptualized as similar in
form to the problem created by confounding in quasi experiments. They
speculated that ANCOVA would remove the problematic variance of the
class effects ($\sigma_{A^*}^{2}$) from both $E(MS_{C:T})$ and $E(MS_{T})$. Conceptually,
this would be equivalent to the ANCOVA procedures creating adjusted
observations that are free from positive dependence.

Index of response is another adjustment strategy which is closely

tied to ANCOVA. Thus, index of response was also considered in this

investigation.


To consider the utility of index of response and ANCOVA
strategies, it became important to understand the possible causes of
dependence. Distinctions among when and how dependence can arise in
an experimental study could help to inform the selection of an
appropriate covariable.

Two main related tasks, then, were set for the investigation
of this study: (1) to classify situations of independence and depen-
dence in experimental studies; and (2) to investigate the possibility
of using an index of response and ANCOVA models to correct for positive
dependence among individual units.

As a result of the first task, four possible situations were
classified for an experimental study that involves two level hierarchically
nested data. Dependence could arise because students were
not randomly assigned to classrooms (initial dependence) and/or
because of class effects which occur during the study (during-experiment
dependence). Crossing these two dichotomous possibilities defined the four
situations (Figure 3.1).
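
The crossing of the two dichotomies can be written as a small lookup. This is only an illustrative sketch; the function name `classify_situation` is hypothetical, and the labels follow the four situations described in this chapter.

```python
def classify_situation(initial_dependence, during_experiment_dependence):
    """Map the two sources of dependence onto Situations I through IV."""
    if not initial_dependence and not during_experiment_dependence:
        return "I"    # independence
    if initial_dependence and not during_experiment_dependence:
        return "II"   # students not randomly assigned to classrooms
    if not initial_dependence and during_experiment_dependence:
        return "III"  # class effects arising during the experiment
    return "IV"       # both sources of dependence together


labels = [classify_situation(a, b) for a in (False, True) for b in (False, True)]
print(labels)
```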

Situation I, independence, occurs when students are randomly
assigned to classrooms and there are no class effects. Analysis of
variance using the pooled model is the best analysis strategy
for testing the no treatment effect hypothesis (Glendening, 1977;
Cronbach, 1976).

Situation II includes dependence due solely to students not
being randomly assigned to classrooms. When students are not
randomly assigned to classrooms, each classroom represents a population
that could have a different distribution on the dependent
variable. To the extent that classroom populations have different
means, there will be a classroom effect and so dependence among
observations taken on individual students. If index of response or
ANCOVA are to have potential for creating an adjustment which
eliminates dependence, the covariate must reflect the dependence
present at the outset of the experiment. A pretest would seem to
hold the greatest potential.

Situation III includes dependence which occurs during the
experiment. Common class experiences or class effects are the causes
of this dependence. Possible class effects are complex and global:
they might include effects due to subject matter, teacher effectiveness,
teaching strategies, student interactions and classroom milieu
which are not part of the definitions of treatments. Defining
covariables which reflect during-experiment dependence and which
might be used in index of response or ANCOVA is a substantive issue
worthy of study in its own right.

In Situation IV, both initial and during-experiment dependence
occur together. Under the simplest case when initial dependence and
during-experiment dependence do not interact with each other or with
treatments, the resulting classroom effects could be thought of as
having two additive parts, one for each type of dependence. To adjust
for Situation IV dependence the researcher would need to identify a
set of covariables reflecting initial dependence and a second set
of covariables reflecting during-experiment dependence.

In this investigation the utility of index of response and
ANCOVA for adjusting for dependence was restricted to use of a
pre test in Situation II conditions. Further, classroom populations
were assumed to be normally distributed on the dependent variable
with a common variance but different means. In order for either
approach to be judged as having utility, its resulting test
statistic for the null hypothesis about treatments must have: 1)
a known sampling distribution, and 2) greater power than the test
statistic which results from the full model.

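The structural model, assumed for the dependent variable, Y,
the covariable, X, and their interrelationship is given

$$Y_{ijk} = \mu + \alpha_i + A^{*}_{ij} + E_{ijk}$$
$$X_{ijk} = \mu^{X} + A_{ij} + V_{ijk}$$
$$A^{*}_{ij} = \beta_1 A_{ij} + H_{ij}$$
$$E_{ijk} = \beta_2 V_{ijk} + G_{ijk}$$

Assumptions

$$A_{ij} \sim NID(0, \sigma_{A}^{2})$$
$$H_{ij} \sim NID(0, \sigma_{H}^{2})$$
$$V_{ijk} \sim NID(0, \sigma_{V}^{2})$$
$$G_{ijk} \sim NID(0, \sigma_{G}^{2})$$

and

$$\sigma_{AH} = \sigma_{AV} = \sigma_{AG} = \sigma_{HV} = \sigma_{HG} = \sigma_{VG} = 0$$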

Under dependence, the covariance structures of Y, X and XY (i.e.,
$\Sigma^{X}$, $\Sigma^{Y}$ and $\Sigma^{XY}$) are of the form of super diagonal matrices. The diagonal
matrices of Y, X, and XY (i.e., $M^{Y}$, $M^{X}$ and $M^{XY}$) are within class
covariance matrices and are expressed as

$$M^{Y} = \sigma_{A^*}^{2}\, \mathbf{1}\mathbf{1}' + \sigma_{E}^{2} I$$
$$M^{X} = \sigma_{A}^{2}\, \mathbf{1}\mathbf{1}' + \sigma_{V}^{2} I$$
$$M^{XY} = \sigma_{AA^*}\, \mathbf{1}\mathbf{1}' + \sigma_{EV} I$$

Under independence, $\Sigma^{X}$, $\Sigma^{Y}$ and $\Sigma^{XY}$ are diagonal matrices. Thus, an
analysis strategy that correctly adjusts for dependence will yield
a linear model with specification errors that have a diagonal
covariance matrix.

Given the structural model, it was discovered that an index of
response would only adjust for dependence when class effects on X
and Y are perfectly correlated (i.e., $|\rho_{AA^*}| = 1.0$). Given this
condition, an index of response

$$Z = Y - \frac{\sigma_{A^*}}{\sigma_{A}} X ,$$

was seen to yield observations that are independent at the level of
individuals. The covariance matrix of Z was found to be

$$\Sigma^{Z} = ((\beta_2 - K)^{2} \sigma_{V}^{2} + \sigma_{G}^{2}) I .$$

Thus, a pooled model which uses Z as the dependent variable provides
a valid test statistic for the treatment null hypothesis,
$F^{Z}_{S:T} = MS^{Z}_{T}/MS^{Z}_{S:T}$. The $F^{Z}_{S:T}$ test has greater power than the $F^{Z}_{C:T}$ test
statistic from the full model using the same index of response because
of greater degrees of freedom. The $F^{Z}_{S:T}$ test provides greater power
than the $F_{C:T}$ test from the full model using the dependent variable
both because of greater degrees of freedom and because the error
term for the $F^{Z}_{S:T}$ test will generally be smaller than the error term
for the $F_{C:T}$ test.

Investigation of the pooled model ANCOVA revealed a valid test
for the treatment null hypothesis given the structural model for
X, Y and their interrelationship only when $\rho_{A^*A} = 1.0$ and $\beta_1 = \beta_2$.
When $\beta_1 = \beta_2$, it was seen that $\beta_{S:T}$, the pooled within slope of
Y on X, was equal to the constant used to define the correct index
of response (i.e., $\beta_{S:T} = \sigma_{A^*}/\sigma_{A}$). Under this condition the ANCOVA
pooled model yielded specification errors which have a diagonal
covariance matrix

$$\Sigma^{W} = \sigma_{G}^{2} I ,$$

and the test statistic $F^{W}_{S:T} = MS^{W}_{T}/MS^{W}_{S:T}$. $F^{W}_{S:T}$ was seen to have
greater power than the full model ANCOVA test statistic, $F^{W}_{C:T}$,
because of larger degrees of freedom and a smaller mean square error.
The key to the success of index of response and ANCOVA strategies
was the structural model and the further condition that class effects
on Y be completely accounted for by the class effects on X; i.e.,
$A^{*}_{ij} = \beta_1 A_{ij}$. Index of response did not require any correlation
between X and Y for individuals within classrooms. ANCOVA, however,
required that the constant specifying the relationship between X

and Y at the level of individuals be equal to the constant that

specified the relation between class effects. Nevertheless, it was
seen that the correlation between X and Y for individuals within
classrooms need not be perfect. ANCOVA was seen to adjust for
dependence when there was specification error in Y that was not
accounted for by specification error in X.

In general, then, the results of this investigation supported
and expanded upon the conjecture by Glendening and Porter (1974)
about the utility of ANCOVA for adjusting for dependence. However,
their suggestion that X and Y must be perfectly correlated can be
relaxed to a requirement that the classroom effects be perfectly

correlated.

APPENDIX A

ANOVA TABLE OF THE FULL MODEL

GIVEN STUDENTS ARE FIXED


 

Model: $Y_{ijk} = \mu + \alpha_i + A^{*}_{ij} + E_{ijk}$

Assumptions: $A^{*}_{ij} \sim NID(0, \sigma_{A^*}^{2})$

Source of Variation    d.f.       E(MS)

Treatments (T)         (t-1)      $s\sigma_{A^*}^{2} + \frac{cs \sum_i \alpha_i^{2}}{t-1}$

Classes (C:T)          t(c-1)     $s\sigma_{A^*}^{2}$

Students (S:CT)        tc(s-1)    $\frac{\sum_i \sum_j \sum_k E_{ijk}^{2}}{tc(s-1)}$

Treatments are fixed;
Classrooms are random; and
Students are fixed.

APPENDIX B

THE REGRESSION SLOPE OF THE POOLED ANCOVA MODEL


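Define $\beta_{S:T}$ to be:

$$\begin{aligned}
\beta_{S:T} &= \frac{Cov(X_{ijk} - \mu^{X}_{i},\ Y_{ijk} - \mu_{i})}{Var(X_{ijk} - \mu^{X}_{i})} \\
 &= \frac{Cov(A_{ij} + V_{ijk},\ A^{*}_{ij} + E_{ijk})}{Var(A_{ij} + V_{ijk})} \\
 &= \frac{\sigma_{AA^*} + \sigma_{VE}}{\sigma_{A}^{2} + \sigma_{V}^{2}}
\end{aligned}$$

From

$$A^{*}_{ij} = \beta_1 A_{ij} ,$$
$$E_{ijk} = \beta_2 V_{ijk} + G_{ijk}$$

and $\sigma_{VG} = 0$,

$$\sigma_{AA^*} = \beta_1 \sigma_{A}^{2}$$

and

$$\sigma_{VE} = \beta_2 \sigma_{V}^{2} .$$

Then,

$$\beta_{S:T} = \frac{\beta_1 \sigma_{A}^{2} + \beta_2 \sigma_{V}^{2}}{\sigma_{A}^{2} + \sigma_{V}^{2}} .$$

If $\beta_1 = \beta_2$, $\beta_{S:T} = \beta_1$.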

APPENDIX C

VARIANCE COMPONENTS OF DEVIATED SCORES


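Model:

$$Y_{ijk} = \mu + \alpha_i + A^{*}_{ij} + E_{ijk}$$
$$X_{ijk} = \mu^{X} + A_{ij} + V_{ijk}$$

Assumptions:

$$A^{*}_{ij} \sim NID(0, \sigma_{A^*}^{2}) ,$$
$$E_{ijk} \sim NID(0, \sigma_{E}^{2}) ,$$
$$A_{ij} \sim NID(0, \sigma_{A}^{2}) ,$$
$$V_{ijk} \sim NID(0, \sigma_{V}^{2}) ,$$

and $\sigma_{A^*E} = \sigma_{AV} = \sigma_{A^*V} = \sigma_{AE} = 0$

Define:

$$\mu_{i} = \mu + \alpha_i$$
$$\mu_{ij} = \mu + \alpha_i + A^{*}_{ij}$$
$$\mu^{X}_{i} = \mu^{X}$$
$$\mu^{X}_{ij} = \mu^{X} + A_{ij}$$

                                                                    Expected
Variable                        Model                               Value        Variance

$Y_{ijk} - \mu_{i}$             $A^{*}_{ij} + E_{ijk}$              0            $\sigma_{A^*}^{2} + \sigma_{E}^{2}$

$Y_{ijk} - \mu_{ij}$            $E_{ijk}$                           0            $\sigma_{E}^{2}$

$\bar{Y}_{ij.} - \mu_{i}$       $A^{*}_{ij} + \bar{E}_{ij.}$        0            $\sigma_{A^*}^{2} + \sigma_{E}^{2}/s$

$Y_{ijk} - \mu$                 $\alpha_i + A^{*}_{ij} + E_{ijk}$   $\alpha_i$   $\sigma_{A^*}^{2} + \sigma_{E}^{2}$

$X_{ijk} - \mu^{X}_{i}$         $A_{ij} + V_{ijk}$                  0            $\sigma_{A}^{2} + \sigma_{V}^{2}$

$X_{ijk} - \mu^{X}_{ij}$        $V_{ijk}$                           0            $\sigma_{V}^{2}$

$\bar{X}_{ij.} - \mu^{X}_{i}$   $A_{ij} + \bar{V}_{ij.}$            0            $\sigma_{A}^{2} + \sigma_{V}^{2}/s$

$X_{ijk} - \mu^{X}$             $A_{ij} + V_{ijk}$                  0            $\sigma_{A}^{2} + \sigma_{V}^{2}$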
APPENDIX D

COVARIANCE COMPONENTS OF DEVIATED SCORES


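Model:

$$Y_{ijk} = \mu + \alpha_i + A^{*}_{ij} + E_{ijk}$$
$$X_{ijk} = \mu^{X} + A_{ij} + V_{ijk}$$

Assumptions:

$$A^{*}_{ij} \sim NID(0, \sigma_{A^*}^{2}) ,$$
$$E_{ijk} \sim NID(0, \sigma_{E}^{2}) ,$$
$$A_{ij} \sim NID(0, \sigma_{A}^{2}) ,$$
$$V_{ijk} \sim NID(0, \sigma_{V}^{2}) ,$$

and $\sigma_{A^*E} = \sigma_{AV} = \sigma_{A^*V} = \sigma_{AE} = 0$

Define:

$$\mu_{i} = \mu + \alpha_i$$
$$\mu_{ij} = \mu + \alpha_i + A^{*}_{ij}$$
$$\mu^{X}_{i} = \mu^{X}$$

and

$$\mu^{X}_{ij} = \mu^{X} + A_{ij}$$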

[Table: covariance components of the deviated scores of X and Y; the entries are not recoverable from the source.]

APPENDIX E

ANOVA TABLE FOR THE INDEX OF RESPONSE MODEL

USING CLASSROOMS AS THE UNITS OF ANALYSIS


 

 

 

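Model:

$$\bar{Z}_{ij.} = \bar{Y}_{ij.} - K \bar{X}_{ij.} , \qquad i = 1, 2, \ldots, t; \quad j = 1, 2, \ldots, c; \quad k = 1, 2, \ldots, s$$

or

$$\bar{Z}_{ij.} = (\mu - \beta_1 \mu^{X}) + \alpha_i + (\beta_2 - \beta_1) \bar{V}_{ij.} + \bar{G}_{ij.}$$

Assumptions:

$$\bar{V}_{ij.} \sim NID(0, \sigma_{V}^{2}/s)$$
$$\bar{G}_{ij.} \sim NID(0, \sigma_{G}^{2}/s)$$
$$\sigma_{VG} = 0$$

Source of
Variation          d.f.      Sum of Squares                                          Mean Square      E(MS)

Treatment (T)      t-1       $sc \sum_i (\bar{Z}_{i..} - \bar{Z}_{...})^{2}$         $MS^{Z}_{T}$     $\frac{sc \sum_i \alpha_i^{2}}{t-1} + (\beta_2 - \beta_1)^{2}\sigma_{V}^{2} + \sigma_{G}^{2}$

Classroom (C:T)    t(c-1)    $s \sum_{i,j} (\bar{Z}_{ij.} - \bar{Z}_{i..})^{2}$      $MS^{Z}_{C:T}$   $(\beta_2 - \beta_1)^{2}\sigma_{V}^{2} + \sigma_{G}^{2}$

Given $H_0: \alpha_i = 0$, $\; MS^{Z}_{T} / MS^{Z}_{C:T} \sim F_{t-1,\ t(c-1)}$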

REFERENCES


Anderson, G.L. A comparison of the outcomes of instruction under two
theories of learning. Unpublished doctoral dissertation,
University of Minnesota, 1941.

Bloom, B.S. Human Characteristics and School Learning. New York:
McGraw-Hill, 1976.

Bryk, A.S., & Weisberg, H.I. Use of the nonequivalent control group
design when subjects are growing. Psychological Bulletin, 1977,
84, 950-962.

Cronbach, L.J. Research on classrooms and schools: Formulation of
questions, design, and analysis. Occasional papers of the
Stanford Evaluation Consortium, Stanford University, California,
July, 1976.

Cronbach, L.J., & Webb, N. Between-class and within-class effects in
a reported aptitude X treatment interaction: Reanalysis of a
study by G.L. Anderson. Journal of Educational Psychology, 1975,
67, 717-724.

Cronbach, L.J., Rogosa, D.R., Floden, R.E., & Price, G.G. Analysis
of covariance in nonrandomized experiments: Parameters
affecting bias. Occasional paper, the Stanford Evaluation
Consortium, Stanford, California, 1977.

Cochran, W.G., & Cox, G.M. Experimental Designs. New York: Wiley &
Sons, 1957.

Cox, D.R. Planning of Experiments. New York: Wiley & Sons, 1958.

DeGracie, J.S. Analysis of covariance when the concomitant variable
is measured with error. Unpublished Ph.D. thesis, Library,
Iowa State University of Science and Technology, Ames, Iowa,
1968.

Dunkin, M.J., & Biddle, B.J. The Study of Teaching. New York: Holt,
Rinehart, and Winston, Inc., 1974.

Gaylor, D.W., & Hopper, F.N. Estimating the degrees of freedom for
linear combinations of mean squares by Satterthwaite's formula.
Technometrics, 1969, 11, 691-706.

Glass, G.V., & Stanley, J.C. Statistical Methods in Education and
Psychology. Englewood Cliffs, N.J.: Prentice-Hall, 1970.

Kirk, R.E. Experimental Design: Procedures for the Behavioral
Sciences. Belmont, California: Brooks/Cole, 1968.

Hannan, M.T., & Young, A.A. On certain similarities in the estimation
of multi-wave panels and multi-level cross-sections. Paper
presented at the Conference on Methodology for Aggregating Data
in Educational Research, Stanford University, October 23-24, 1976.

Hudson, J.D., & Krutchkoff, R.G. A Monte Carlo investigation of tests
employing Satterthwaite's synthetic mean squares. Biometrika,
1968, 55, 431-433.

Huitema, B.E. The Analysis of Covariance and Alternatives. New York:
Wiley & Sons, 1980.

Olejnik, S.F. Data analysis strategies for quasi-experimental studies
where differential group and individual growth rates are
assumed. Unpublished Ph.D. dissertation, East Lansing:
Michigan State University, 1977.

Paull, A.E. On a preliminary test for pooling mean squares in the
analysis of variance. Annals of Mathematical Statistics,
1950, 21, 539-556.

Peckham, P.D., Glass, G.V., & Hopkins, K.D. The experimental unit in
statistical analysis. The Journal of Special Education, 1969,
3, 337-349.

Porter, A.C. Some design and analysis concerns for quasi-experiments
such as Follow Through. Paper presented at the meeting of the
American Psychological Association, Hawaii, August, 1972.

Porter, A.C. An Evaluation of the TABA and BASICS Teaching Strategies
Programs in the Lansing Public Schools. Paper presented for the
Michigan Department of Education, College of Education, Michigan
State University, November, 1973.

Porter, A.C., & Chibucos, T.R. Selecting an analysis strategy. In
G. Borich (ed.), Evaluating Educational Programs and Products.
Educational Technology Press, 1974.

Porter, A.C., & Chibucos, T.R. Common problems of design and analysis
in evaluative research. Sociological Methods & Research, 1975,
3, 235-257.

Satterthwaite, F.W. Synthesis of variance. Psychometrika, 1941,
6, 309-316.

Satterthwaite, F.W. An approximate distribution of estimates of
variance components. Biometrics Bulletin, 1946, 2, 110-114.

Schmidt, W.H. Structural equation models and their application for
longitudinal data. Paper prepared for the Conference on
Longitudinal Statistical Analysis, Boston, 1975.

Winer, B.J. Statistical principles in experimental design. New York:
McGraw-Hill, 1962.