This is to certify that the thesis entitled

DATA ANALYSIS STRATEGIES FOR QUASI-EXPERIMENTAL STUDIES WHERE DIFFERENTIAL GROUP AND INDIVIDUAL GROWTH RATES ARE ASSUMED

presented by STEPHEN F. OLEJNIK has been accepted towards fulfillment of the requirements for the Ph.D. degree in Counseling, Personnel Services & Educational Psychology.

Date: October 28, 1977

DATA ANALYSIS STRATEGIES FOR QUASI-EXPERIMENTAL STUDIES WHERE DIFFERENTIAL GROUP AND INDIVIDUAL GROWTH RATES ARE ASSUMED

By Stephen F. Olejnik

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services and Educational Psychology

1977

DATA ANALYSIS STRATEGIES FOR QUASI-EXPERIMENTAL STUDIES WHERE DIFFERENTIAL GROUP AND INDIVIDUAL GROWTH RATES ARE ASSUMED

By Stephen F. Olejnik

Selecting an appropriate analysis strategy for a study based on a quasi-experimental research design has been a topic of considerable controversy. Recently, the discussion has focused on the issue of academic growth rates. Some authorities have suggested that the initial difference between comparison groups on a pretest measure implies that comparison groups are growing at different rates academically. The differential growth rate problem has been labeled the fan spread hypothesis. This theory suggests that along with an increasing mean difference between groups there is a proportional increase in the within-group variability. Under this model traditional analysis techniques have been challenged as inappropriate on the basis that they underadjust for group differences.

The purpose of the study was to compare four analytic strategies for quasi-experimental studies where differential group and individual growth rates are assumed. Two models of growth were considered. One assumed that both individual and group growth were linear. The second model assumed linear group growth but an individual's growth rate within the group was assumed to vary over time. Under the second model of within-group growth, an individual's relative academic position within the group was likely to change. For each of these models four analytic strategies were considered: (1) gains in standard scores; (2) single covariable analysis of covariance with estimated true scores; (3) gain scores adjusted for differential growth rates; and (4) multiple fallible covariable analysis of covariance. The appropriateness of each procedure under the two models was based on two criteria: (1) the effect estimated, and (2) the precision with which each effect was estimated.

The results of the study showed that when individuals and groups grow at different rates but in a linear fashion, gains in standard scores, estimated true score analysis of covariance, and gain scores adjusted for differential growth rates all provide the correct estimate of group differences. For the second model of growth considered, gains in standard scores and gain scores adjusted for differential growth rates were both shown to estimate the desired effect. Only the multiple fallible covariate analysis of covariance procedure was shown to be an inappropriate technique for both models of differential growth.

The standard error associated with each competing analytic strategy was then compared for sample sizes ranging from 20 to 120 and the correlation between the pretest and posttest measures ranging from .1 to .9. The results indicated that when the sample is large and the correlation between the measures high, all three procedures provide approximately equal precision.
As the sample size and the relationship decrease, greater precision was provided by the gain score technique adjusted for differential growth rates. The analysis also indicated that gains in standard scores provided a spuriously low standard error as a result of the two-stage process of first calculating the adjusted variable and then estimating the treatment effect. As the sample size and relationship decreased, the underestimation of the standard error increased.

It was concluded that when data are available from two points in time prior to the period investigated, adjusted gain scores provide the best analytic strategy of those considered where group growth is linear. If data are available from a single point in time and individuals and groups are growing differentially but linearly, then true score analysis of covariance provides the preferred analytic approach. In situations where only the results of a single pretest are available and individual growth rates vary across time, gains in standard scores may be appropriate but the sample size must be large.

To demonstrate these results a data set was obtained and analyzed using the three competing strategies. The findings of this analysis were consistent with those predicted above.

ACKNOWLEDGMENTS

I am especially grateful to Dr. Andrew C. Porter, my friend and the chairman of my dissertation committee, for his support and guidance throughout this research project and my graduate program. By giving generously of his time and knowledge, Andy has enriched my educational experiences and made them happy ones.

I would also like to thank Dr. Kenneth Arnold for his assistance, especially in the mathematical aspects of this study. Sincere appreciation is extended to Drs. John Schweitzer, Robert Flooden, and William Schmidt for their constructive criticisms and suggestions in earlier drafts of this dissertation.

I am also grateful to Ms.
Janice Cave for her encouragement throughout my doctoral program and her assistance in the typing of the initial drafts of this paper. Finally, I am indebted to the Institute for Research on Teaching for their financial support during the past year while this research was conducted.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter

1. INTRODUCTION
    Statement of the Problem
    Purpose of the Study
2. THE ANALYTIC STRATEGIES
    Gains in Standard Scores
    Analysis of Covariance With Estimated True Scores
    Adjusted Gain Scores
    Analysis of Covariance With Multiple Covariates
    Summary
3. REVIEW OF RELEVANT LITERATURE
    Data Analysis for Quasi-Experiments
    Raw Gain Scores
    Gains in Standard Scores
    Estimated True Score Analysis of Covariance
    Analysis of Covariance With Multiple Fallible Covariates
    Summary
4. EFFECTS ESTIMATED AND THEIR STANDARD ERRORS
    Estimation of Treatment Effects
    Estimation With Gains in Standard Scores
    Estimation With True Score Analysis of Covariance
    Estimation With Adjusted Gain Scores
    Estimation With Multiple Fallible Covariates
    Precision
    Standard Error For an Index of Response
    The Standard Error For Kenny's Procedure
    The Standard Error For Porter's Procedure
    The Standard Error For Adjusted Gain Scores
    The Standard Error For the Keesling-Wiley Procedure
5. DISCUSSION
    A Data Example
    General Linear Models of Group Growth
    One Group vs. Two Group Research Designs
    Limitations

Appendix

A. THE VARIANCE OF THE PRODUCT OF TWO RANDOM VARIABLES
B. THE COVARIANCE OF A RANDOM VARIABLE AND THE PRODUCT OF TWO RANDOM VARIABLES
C. THE COVARIANCE OF TWO ADJUSTED VARIABLES
D. THE EXPECTED VALUE OF THE RATIO OF TWO NON-INDEPENDENT SAMPLE STANDARD DEVIATIONS SQUARED
E. THE EXPECTED VALUE OF THE RATIO OF TWO NON-INDEPENDENT SAMPLE STANDARD DEVIATIONS
F. THE EXPECTED VALUE OF A SAMPLE REGRESSION COEFFICIENT
G. THE EXPECTED VALUE OF A SAMPLE REGRESSION COEFFICIENT SQUARED
H. THE VARIANCE OF THE ADJUSTED GAIN SCORE VARIABLE

BIBLIOGRAPHY

LIST OF TABLES

Table

1. A Hypothetical Data Example of the Fan Spread Hypothesis With the Linear Model of Within-Group Growth
2. A Hypothetical Data Example of the Fan Spread Hypothesis With the Non-Linear Model of Within-Group Growth
3. The Expected Value of the Ratio of Two Non-Independent Sample Standard Deviations When the Population Standard Deviations Are Equal
4. The Variance of the Ratio of Two Non-Independent Sample Standard Deviations When the Population Standard Deviations Are Equal
5. The Variance of a Sample Regression Coefficient When the Population Standard Deviations of the Two Variables Are Equal
6. The Variance of the Adjustment Coefficient Suggested in the True Score Analysis of Covariance Procedure
7. The Standard Errors Associated With Gains in Standard Scores, True Score Analysis of Covariance, and Adjusted Gain Scores
8. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .9 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal
9. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .8 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal
10. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .7 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal
11. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .6 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal
12. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .5 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal
13. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .4 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal
14. Group Means in the Metric of Grade Equivalent Scores and Standard Deviations on the Paragraph Meaning Subtest of the Stanford Achievement Test Battery
15. Results of the Data Analyses Using the Gains in Standard Score Strategy, True Score Analysis of Covariance, Adjusted Gain Scores With the Analysis of Variance Model, Adjusted Gain Scores With the Derived Standard Error, Traditional Analysis of Covariance, and Analysis of Variance With Residualized Gain Scores
16. The Effect Estimated by the Gains in Standard Scores, True Score Analysis of Covariance and Adjusted Gain Scores, and Traditional Analysis of Covariance

LIST OF FIGURES

Figure

1. The selection by maturation interaction: increasing mean differences in achievement between comparison groups across time
2. The fan spread hypothesis: increasing mean difference in achievement between comparison groups with a proportional increase in the within-group variability across time
3. The fan spread hypothesis with the linear model of within-group growth
4. The fan spread hypothesis with a non-linear model of within-group growth
5. Underadjusting for group differences with the raw gain score procedure in a situation conforming to the differential growth rate theory
6. Differential growth rates considered over three points in time
7. Group achievement means plotted across three points in time

CHAPTER 1

INTRODUCTION

Individuals are dynamic; attitudes, perceptions, and knowledge continually change as a result of interactions among individuals and the communications media. The understanding of this dynamic nature, and at times its modification, is a major concern of social science research. Interest in naturally changing entities has raised, however, difficult problems in measurement and analysis. Although considerable discussion has been devoted to the topic of measuring change (McNemar, 1958; Lord, 1956, 1958, 1963; Bereiter, 1963; Cronbach & Furby, 1970; Linn & Slinde, 1977), the problem is far from resolved.

The problems related to measuring change exist to varying degrees in all research designs. These issues are less troublesome in experimental studies where the investigator manipulates the variables of interest and the effects on other variables are observed.
Measuring change is more difficult in quasi-experimental studies where the investigator lacks the freedom to manipulate the variables of concern. The present study focuses on the issues of change associated with the latter design. Specifically, this study considered the non-equivalent control group design (Campbell & Stanley, 1963) where the results of one or two pretests are available prior to the period under investigation.

Another popular research design frequently adopted by social scientists is based on natural variation. In these studies the investigator identifies a group of individuals and observes them in their "natural environment" on a set of variables of interest. The relationships among these variables are then assessed with correlational techniques. Measuring change in these studies is more difficult than in quasi-experiments and is not considered here. The findings of the present study may, however, be indirectly relevant to studies based on natural variation.

The evaluation of an educational program frequently results in a quasi-experiment of the type considered here. Compensatory education programs such as Head Start and Follow Through, in particular, have been the focal point of much discussion on growth assessment in quasi-experimental studies (Campbell & Erlebacher, 1970). In these programs students are not randomly assigned to treatments, but rather those in greatest need are given the treatment, that is, additional assistance. Treatment effects are estimated by comparing the academic achievement of those students receiving the additional assistance with that of a group of students who did not receive it.

Evaluation in quasi-experimental settings raises a number of issues and problems not generally encountered in true-experimental designs. Campbell and Boruch (1975) discuss in detail several concerns which may arise in the analysis and interpretation of quasi-experimental data.
The overriding theme of their work revolves around the issue of bias in the estimation of treatment effects. An estimate of the treatment effect is biased if the estimate indicates that the effect of the program was either positive or negative when there were no true treatment effects, or vice versa. Although there are several factors which contribute to a biased estimate, the entire problem originates from the fact that without randomization there are likely to be substantial differences between the individuals in their initial status on the outcomes to be assessed. Estimating the treatment effects solely on the differences in post-treatment test scores may be biased since, depending on the direction of the initial differences, a program may appear effective or harmful even though there were no true treatment effects.

Several strategies have been suggested which take initial differences into consideration when estimating a program's effectiveness. Campbell and Boruch (1975) argue, however, that these adjustment procedures frequently fall short of eliminating all of the bias that can result from initial differences. The magnitude of the bias is related to two issues: (1) specifying the appropriate variables on which the adjustment is made, and (2) identifying the appropriate model to use the variables in predicting change. These are difficult problems to solve.

Specifying the appropriate variables means that the investigator can identify those variables which are predictive of all confounding variables that affect the dependent variable. Knowledge of those variables on which the assignment of individuals to groups was based would provide one possible solution. Another solution is the random assignment of individuals to groups. This solution is not possible for the studies being considered here. The present study does not pursue this aspect of the specification problem.

The second issue, that of specifying the appropriate model, is a major concern here.
This aspect of the specification problem includes the question of measuring the adjustment variables reliably. The unreliability issue has been considered in detail and several solutions have been suggested (Lord, 1960; Porter, 1967; De Gracie, 1968; Stroud, 1972).

Specifying the appropriate analytic model is dependent on how individuals change over time. The issue of growth models has recently been considered explicitly by several researchers (Campbell, 1971; Kenny, 1975; Bryk & Weisberg, 1977). Campbell in particular has been concerned with the relationship between growth rates and estimates of treatment effects. Although his interest has centered on the evaluation of compensatory education programs, his work applies equally to other quasi-experimental investigations.

The issue which Campbell has raised involves the implications of initial differences on the outcome dimension for the prospect of differential growth rates. His reasoning is based on the belief that groups which differ in their initial average performance also differ in their rate of development on the outcome dimension. Thus, in the evaluation of compensatory education programs, a control group may have a higher average pretest score because as a group these individuals have grown more quickly than the group who will receive the treatment, and this development is likely to continue without program intervention. Pictorially this selection by maturation interaction can be presented as in Figure 1, where the lines represent group average performance over time.

[Figure 1 omitted. Axes: Achievement, Time. Lines: Control, Treatment.]

Figure 1. The selection by maturation interaction: increasing mean differences in achievement between comparison groups across time.

Campbell further develops this idea of differential growth rates into a theory which he has labeled the "fan spread hypothesis." It states that along with the increasing mean difference between the compared groups, a proportional increase in the variance within the groups occurs.
Figure 1 can be modified to reflect the changing variance as in Figure 2. The labels for treatment-control are arbitrary.

[Figure 2 omitted. Axes: Achievement, Time. Lines: Control, Treatment.]

Figure 2. The fan spread hypothesis: increasing mean difference in achievement between comparison groups with a proportional increase in the within-group variability across time.

The dashed lines represent the increasing range of achievement scores within the treatment and control groups over time. This relationship between the increasing mean difference and the within-group variance can be represented in the following formula:

$$\frac{\mu_{xpt} - \mu_{xct}}{\sigma_t} = K$$

where: $\mu_{xpt}$, $\mu_{xct}$ are the population means on measure (X) for the program and control groups, respectively, at time t; $\sigma_t$ is the pooled within-group standard deviation of the outcome measure at time t; and K is a constant.

Thus the difference between group means relative to the pooled within-group standard deviation remains constant over time. It might be noted here that parallel growth patterns between groups may also conform to Campbell's fan spread model if the within-group variance remains constant across time.

Evidence supporting the fan spread model of growth has been provided by both cross-sectional and longitudinal studies (Osborne, 1966; Baugham & Dahlstron, 1968; Fennessy, 1974). Although this is a relatively small sample of studies on which to base a theory, Campbell is confident that additional findings will support the model (Campbell & Boruch, 1975). Kenny (1975) has argued for the reasonableness of the theory and has provided additional data conforming to the fan spread hypothesis.

This model of growth suggested by Campbell and supported by Kenny presents a special case of a more general issue involving linear-growth models. Bryk and Weisberg (1977) have gone beyond Campbell's fan spread hypothesis and have considered the problem of differential linear-growth patterns.
By varying the initial starting points of growth and the average rate of group growth, several linear models were considered.

The discussions of growth models presented by Campbell, Kenny, Bryk, and Weisberg have concentrated on differential growth rates between comparison groups and have ignored the issue of differential growth rates within groups. The question of growth rates within groups can be conceptualized in at least two ways as presented in Figures 3 and 4.

[Figure 3 omitted. Axes: Achievement, Time.]

Figure 3. The fan spread hypothesis with the linear model of within-group growth.

[Figure 4 omitted. Axis: Achievement.]

Figure 4. The fan spread hypothesis with a non-linear model of within-group growth.

In each diagram the solid line represents the average growth rate for the group, while the dotted lines represent individual rates of growth. Figure 3 presents within-group growth rates that are generally associated with the fan spread hypothesis. It conceptualizes within-group growth as having a common starting point and different linear rates of individual growth across time. Thus, at any two subsequent points in time, individuals maintain their relative position within the group. Figure 4, on the other hand, represents the situation in which the group's mean growth is linear but individual growth is not linear. Under this model an individual's growth rate may vary over time, i.e., growth may occur in spurts, but group growth may be constant. Both of these models can result in data conforming to Campbell's fan spread hypothesis. The implications these models of within-group growth have for data analysis and estimation of treatment effects are substantially different.

Given that the fan spread model represents a valid conceptualization of how individuals and groups change over time in quasi-experimental studies, Campbell (1971) has argued that current analytic strategies are inadequate in adjusting for the differential nature of growth.
As a result, recent efforts to evaluate compensatory education programs may be misleading since the differential growth patterns have not been considered. The conclusion that these programs have been ineffective or even harmful may be a statistical artifact rather than actuality.

In response to Campbell's argument that current analytic strategies inadequately adjust for the fan spread model, several researchers have proposed new or modified techniques to resolve the differential growth problem. Kenny (1975) has argued that given the fan spread model, an appropriate analytic strategy is what he calls standardized gain scores (also referred to as gains in standard scores). The fan spread hypothesis suggests increasing variability within groups across time. Kenny's approach counters this increasing variability by standardizing the pretest and posttest scores using the pooled within-group standard deviation at time 1 and time 2, respectively. The difference between the standardized scores is then computed and used as the dependent measure with the analysis of variance model.

Another solution to the fan spread model was proposed by Porter and Chibucos (1974). They suggested that the analysis of covariance model was appropriate for the differential growth rate situation if the covariate was perfectly reliable. Given that the covariate was fallible, analysis of covariance with the estimated true score of the covariate would adequately adjust for the fan spread model. Estimated true score analysis of covariance was originally developed by Porter (1967) as a solution to the single fallible covariate problem.

Still another solution which might be considered to adjust for the fan spread effect is the use of gain scores adjusted for differential group growth.
The raw-gain-score strategy assumes that groups are changing at relatively equal rates, and that the only difference between the groups, other than that caused by treatments, is the initial status at the point of intervention. The fan spread model allows that not only do the groups differ in their pre-treatment performance levels, but also that the groups are changing at different rates. Therefore simple gain scores could be inappropriate in light of the fan spread model. If, on the other hand, the gain scores themselves were adjusted for the differences between the groups' growth rates, an appropriate estimate of the treatment effect might be obtained. An adjustment of the type just described could be made if data from two pretests over time rather than one pretest were available. Given that two pretests were available for each group, each group's growth rate could be estimated by the difference between the group's mean performance on the first and second pretests.

Given that multiple pretest data are available, a fourth procedure which might be considered to analyze data conforming to the fan spread hypothesis is the analysis of covariance model with multiple covariates. This technique might be used by researchers in the field where the tendency is to use all available information on a group of subjects in the hope of increasing precision and adjusting for all initial differences. Furthermore, in light of the earlier discussion on analysis of covariance, multiple covariates may also adjust for the differential growth rate problem if corrected for their unreliabilities. Recently, Keesling and Wiley (1976) have suggested a procedure which they argue solves the multiple fallible covariate problem. Their procedure may therefore also provide an appropriate solution to the fan spread problem.

Statement of the Problem

A great number of educational research efforts are based on designs that are quasi-experimental.
As a result, researchers in the field frequently encounter difficult problems in measuring change and estimating treatment effects. Campbell has argued that a difficulty which has not always been explicitly recognized in quasi-experimental studies is a question of differential growth rates. In particular, the evaluation of compensatory education programs, where the children with the slowest academic growth receive the treatment, may be especially vulnerable to this problem. Campbell has suggested that traditional analysis strategies fail to take the differential growth patterns into consideration in estimating treatment effects; thus traditional strategies result in biased estimates of program effectiveness. These programs have therefore appeared less beneficial than was actually the case. In response to Campbell's arguments, several analytic strategies have been suggested which may provide the appropriate adjustment under the fan spread condition.

Purpose of the Study

The purpose of the study was to compare four procedures in terms of their appropriateness as strategies for data analysis in quasi-experimental studies given that individuals and groups may grow differentially. The four strategies considered were these: (1) gains in standard scores, (2) single covariable analysis of covariance with estimated true scores, (3) gain scores adjusted for differential growth rates, and (4) multiple fallible covariable analysis of covariance. Specifically, two types of individual growth were considered: first, the situation in which the correlation between the pre-intervention and post-intervention measures was unity, ρ = 1, except for measurement errors; and second, the situation in which the relationship between the two measures was not perfect, ρ ≠ 1, regardless of measurement errors. This second case results when individuals begin to grow at different points in time and grow at different rates, or when individuals grow academically in spurts.
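
The distinction between the two types of individual growth can be illustrated with a small numerical sketch. The scores below are hypothetical values chosen only for illustration, not data from this study, and measurement error is ignored. Under the first model each individual keeps a constant rate of growth from a common starting point, so pretest and posttest scores correlate perfectly. Under the second, growth occurs in spurts, so relative positions within the group change and the correlation falls below unity even though the group mean still grows linearly.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical error-free first-period gains for five individuals
# (illustrative values only, not taken from the study).
first_period_gain = [1, 2, 3, 4, 5]

# Model 1: every individual keeps the same linear rate in both periods.
pre_linear = first_period_gain
post_linear = [2 * g for g in first_period_gain]

# Model 2: growth in spurts. Second-period gains are reshuffled across
# individuals, so the group mean gain is unchanged but ranks can shift.
second_period_gain = [3, 1, 5, 2, 4]
pre_spurts = first_period_gain
post_spurts = [g1 + g2 for g1, g2 in zip(first_period_gain, second_period_gain)]

rho_linear = pearson_r(pre_linear, post_linear)  # perfect correlation
rho_spurts = pearson_r(pre_spurts, post_spurts)  # below unity

print(round(rho_linear, 3), round(rho_spurts, 3))  # prints: 1.0 0.806
```

In both hypothetical cases the group mean rises from 3 to 6, so an analysis based on group means alone could not tell the two models apart; only the pretest-posttest correlation differs.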

The appropriateness of the four strategies was based on two considerations: first, the effect estimated by each technique, and second, the precision with which each effect was estimated. Furthermore, the effects and implications of violating either of two assumptions were discussed. They were the homogeneity of regression assumption and the assumption that individuals and groups grow in a linear fashion.

In conducting this study two approaches were used. First, the four analysis strategies were considered analytically to determine whether the procedures estimated the effect of interest. Standard errors were then derived for those strategies estimating the appropriate effect to identify those procedures which offered the greatest precision. The second approach in comparing the strategies was to analyze a set of data with the procedures that estimated the appropriate effect. The conclusions drawn by each strategy were then compared as were their respective error terms.

CHAPTER 2

THE ANALYTIC STRATEGIES

The previous chapter has raised explicitly the question of differential growth rates in quasi-experimental studies. While some detail was given to the presentation of this problem, only general statements as to possible solutions to the fan spread model were provided. In this chapter the analytic strategies suggested are considered in detail. Before studying these solutions, however, a brief discussion is presented concerning the inadequacies of raw gain scores and analysis of covariance without correction for an unreliable covariate in quasi-experiments.

In experimental studies where individuals are randomly assigned to a treatment or a control group, program effectiveness can be estimated by the difference between the group means on some measure of interest following the treatment:

$$\alpha = \mu_{yp} - \mu_{yc}$$

where $\alpha$ is the estimate of the treatment effectiveness; and $\mu_{yp}$, $\mu_{yc}$ are the population means on the post-treatment measure (Y) for the program and control groups, respectively.

Since individuals were randomly assigned to the two groups, it can be assumed that prior to the implementation of the treatment, the groups differ only by chance factors on the outcome dimension and their average growth rates were in the long run the same. Furthermore, if the treatment has no true effect, the groups would remain equivalent after the period of program experience and the above estimate would equal zero. For the situation described, the analysis of variance model provides an appropriate analytic strategy to compare the group means in that it estimates the effect of interest. The dependent measure using this strategy is simply the performance on a post-treatment measure on a dimension of interest.

With quasi-experimental studies initial differences are likely on the outcome dimension even before the treatment is implemented. If the above procedure was used to estimate the treatment effect and the program had no true effect, then the initial difference would have been treated as a program effect and erroneous conclusions would have been drawn. Thus in quasi-experimental studies some type of an adjustment is necessary to take into consideration those initial differences which may influence the final outcome measure.

Two strategies frequently chosen by researchers in the field for data analysis in quasi-experimental studies have been the analysis of variance model with raw gain scores as the dependent measure and analysis of covariance.
Though the former strategy has often been criticized (Cronbach & Furby, 1970; Campbell & Erlebacher, 1970), it still remains a popular approach for researchers in the field (Richards, 1975). Raw gain scores are calculated by taking the difference of the post-treatment and pre-treatment scores (post minus pre) and creating a new variable. For example, if Y represents the post-treatment score on some variable of interest and X represents the pre-treatment score, a new variable W is created by the simple difference between the two scores; i.e., $W = Y - X$. Since the new variable is created by taking the difference between scores, this procedure logically requires that the same or an equivalent form of the measure be administered at the two points in time. The analysis of variance model is then used with W as the dependent measure. The program effectiveness when the raw gain score strategy is used can be written as:

$$\alpha_{GS} = \mu_{yp} - \mu_{yc} - (\mu_{xp} - \mu_{xc})$$

Where:
$\alpha_{GS}$ is the treatment difference estimated by using the gain score strategy;
$\mu_{yp}$, $\mu_{yc}$ are the population means on the post-treatment measure (Y) for the program and control groups, respectively; and
$\mu_{xp}$, $\mu_{xc}$ are the population means on the pre-treatment measure (X) for the program and control groups, respectively.

The adjustment which is made using this technique in a situation conforming to a fan spread model when there are no true treatment effects is presented in Figure 5.

[Figure 5. Underadjusting for group differences with the raw gain score procedure in a situation conforming to the differential growth rate theory. Achievement is plotted against time. The solid lines represent actual mean growth of individuals in the two groups and the dashed line represents the adjusted growth pattern for $G_2$. The difference at $t_2$ between the solid line for $G_1$ and the dashed line for $G_2$ represents the bias remaining after adjustment in estimating the treatment effect.]
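The underadjustment described for the raw gain score strategy can be illustrated with a small numerical sketch. The population means and the common proportional growth factor below are hypothetical values chosen only for illustration; they are not taken from the study. With no treatment effect, each group's mean simply grows in proportion to its starting level, so the gap between groups widens and the raw gain score estimate is not zero.

```python
def raw_gain_effect(mu_yp, mu_yc, mu_xp, mu_xc):
    """Raw gain score estimate: (mu_yp - mu_yc) - (mu_xp - mu_xc)."""
    return (mu_yp - mu_yc) - (mu_xp - mu_xc)

# Hypothetical fan spread means with NO treatment effect (assumed values).
growth_factor = 1.5            # assumed proportional growth, pretest to posttest
mu_xp, mu_xc = 40.0, 50.0      # pretest means: program group, control group
mu_yp = growth_factor * mu_xp  # posttest mean, program group (60.0)
mu_yc = growth_factor * mu_xc  # posttest mean, control group (75.0)

alpha_gs = raw_gain_effect(mu_yp, mu_yc, mu_xp, mu_xc)
print(alpha_gs)  # -5.0: a spurious "harmful" effect although none exists

# With parallel growth (both groups gain 20 points) the estimate is zero.
alpha_parallel = raw_gain_effect(60.0, 70.0, 40.0, 50.0)
print(alpha_parallel)  # 0.0
```

The nonzero value in the first case is the bias that remains after the raw gain adjustment when growth rates differ between groups.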
It is clear that the gain score strategy requires the assumption of equal growth rates between comparison groups in the absence of treatment effects. Violation of this assumption results in an underadjustment in the situation depicted and therefore in a biased estimate of the program effect.

A second popular analytic strategy frequently adopted by researchers in the field is the analysis of covariance model. This procedure is similar to the raw gain score strategy in the nature of the adjustment made, but it is not restricted to the use of the same or a parallel form of the outcome measure in order to make the adjustment. Rather than subtracting the whole pretest score from the posttest score, the analysis of covariance model subtracts only a portion of the pretest score. That portion is equal to the pooled within-group linear regression slope of the line predicting the outcome measure from the adjustment variable. To facilitate the understanding of the effect estimated, the dependent variable can conceptually be thought of as $W = Y - b_{y \cdot x} X$ for the analysis of variance model. Although the adjusted variable is not actually computed, it can be thought of as such to facilitate comparisons across other similar analytic strategies. The treatment difference can then be written as:

$$\alpha_{AC} = \mu_{yp} - \mu_{yc} - \beta_{y \cdot x}(\mu_{xp} - \mu_{xc}) \qquad (1)$$

Where:
$\alpha_{AC}$ is the estimate of the treatment difference using the analysis of covariance strategy;
$\beta_{y \cdot x}$ is the pooled within-group linear regression slope of Y on X; and
$\mu_{yp}$, $\mu_{yc}$, $\mu_{xp}$, $\mu_{xc}$ are as defined previously.

Campbell and Erlebacher (1970) and Kenny (1975) have argued that in situations conforming to the fan spread hypothesis this strategy also underadjusts for initial differences. As a result, the strategy provides a biased estimate of the treatment effect. Their discussion, however, focused on analysis of covariance with a fallible covariate.
Furthermore, the distinction between the two models of within-group growth that was proposed in Chapter 1 was not made in either presentation. The importance of this distinction is examined in Chapter 4. Bryk and Weisberg (1977), on the other hand, have suggested that the analysis of covariance with a reliable covariate was an appropriate strategy for the fan spread condition. Their conclusion was based on the assumption of linear growth for individuals within treatment groups. The same conclusion was drawn by Porter and Chibucos (1974).

Analysis of covariance has also been criticized as inappropriate in situations conforming to the fan spread model because the assumption of homogeneity of regression slopes between treatment groups may be violated. Bryk and Weisberg (1977) and Campbell and Boruch (1975) have suggested that differential growth rates imply a differential relationship between the covariate and the dependent measure for the comparison groups. These authors, however, may have confused the regression of achievement on time as presented in Figure 1 with the regression of posttest achievement on pretest achievement. These two regressions are not the same, and a violation of the homogeneity of regression slopes does not have to occur for the latter case. The regression slope for each comparison group is defined as:

$$b_{y \cdot x} = r_{xy} \frac{S_y}{S_x}$$

Where:
$b_{y \cdot x}$ is the regression slope;
$r_{xy}$ is the correlation of the pretest and posttest measures; and
$S_y$, $S_x$ are the standard deviations of the posttest and pretest, respectively.

The fan spread model suggests only an increasing variance across time such that $S_y > S_x$. This increase in variance occurs in both groups. Nothing in the differential growth rate situation suggests that the correlation between the two measures should differ from one group to the other. It does not follow then that a violation of the homogeneity of regression slopes assumption is likely.
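The argument about homogeneous slopes can be checked with a brief sketch of the slope formula $b_{y \cdot x} = r_{xy} S_y / S_x$. The correlation and standard deviations below are hypothetical, and the sketch assumes that the within-group standard deviation grows by the same proportion in both groups; under that assumption the two within-group slopes coincide even though the groups differ in spread and in growth.

```python
def regression_slope(r_xy, s_y, s_x):
    """Within-group slope of posttest on pretest: r_xy * S_y / S_x."""
    return r_xy * s_y / s_x

# Hypothetical values: the same pretest-posttest correlation in both groups,
# and within-group standard deviations that both grow by a factor of 1.5.
r_xy = 0.75
slope_program = regression_slope(r_xy, s_y=12.0, s_x=8.0)
slope_control = regression_slope(r_xy, s_y=15.0, s_x=10.0)

print(slope_program, slope_control)  # 1.125 1.125
```

Only a difference between groups in the correlation or in the ratio $S_y / S_x$ would produce heterogeneous slopes in this formula.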
Chapter 4 presents two hypothetical examples conforming to the fan spread model where the assumption is not violated.

Finally, the analysis of covariance strategy has been criticized as inappropriate in quasi-experiments since the slope, reflecting growth over time, is underestimated due to measurement errors in the covariate. This is a legitimate criticism which has received considerable attention as noted earlier. Several solutions to this problem have been suggested. Later in this chapter one of those suggested solutions will be considered in detail.

Thus several methodologists have argued that data analysis for a study based on a quasi-experimental design may be extremely troublesome as a result of differences in growth rates between comparison groups. The evaluation of compensatory education programs, in particular, has been singled out as being especially vulnerable to this problem. Traditional strategies for estimating treatment effects like gain score analysis of variance and analysis of covariance have been criticized as inappropriate since they can underadjust for initial as well as growth related differences between groups. Figure 5 presented a hypothetical example of the bias remaining after using the gain score strategy. It will be shown that a similar underadjustment could arise with the analysis of covariance strategy depending on the nature of growth and the value of $\beta_{y \cdot x}$ in equation 1.

Gains in Standard Scores

There have been several proposals as to how data might be analyzed in light of the fan spread hypothesis. One approach to the problem suggested by Campbell (1971) and later by Kenny (1975) was the use of gains in standardized scores. This approach counters the increasing variability within the treatment and control groups across time by standardizing the pretest and posttest scores separately ("given unit variance and a mean of zero," Kenny, 1975, p. 347).
To standardize the scores obtained at each test administration, Kenny suggested the following procedure: from each score, subtract the grand mean (across groups) and divide this difference by the pooled within-group standard deviation. (Since subtracting a constant has no effect on further analysis, it is not considered in the subsequent discussion of the strategy. Rather, standardization is achieved by simply dividing each score by the pooled within-group standard deviation for the pretest and posttest data considered separately.) Conceptually, this results in a new dependent variable W, created by subtracting from the posttest score the product of the ratio of the pooled within-group standard deviations (posttest to pretest) and the pretest score, i.e., $W = Y - \frac{S_y}{S_x} X$. The variable W is then taken as the dependent measure in the analysis of variance model. The treatment difference can be presented as:

$$\alpha_{GSS} = \mu_{yp} - \mu_{yc} - \frac{\sigma_y}{\sigma_x}(\mu_{xp} - \mu_{xc})$$

Where:
$\alpha_{GSS}$ is the estimate of the treatment difference estimated by the gains in standard score strategy;
$\sigma_x$, $\sigma_y$ are the pooled within-group standard deviations of the pre-treatment and post-treatment measures, respectively; and
$\mu_{yp}$, $\mu_{yc}$, $\mu_{xp}$, $\mu_{xc}$ are as defined previously.

The above statement of the treatment difference is only an approximation since $E(S) \neq \sigma$. The use of this strategy, like that of raw gain scores, logically requires that the pre-treatment measure be identical to or a parallel form of the post-treatment measure.

In defending the use of this technique, Kenny (1975) presented several examples where standardized gain scores provided an unbiased estimate of program effects while raw gain scores and analysis of covariance were shown to be biased. He concluded that in certain situations (i.e., where individuals were assigned to a program based on sociological or demographic variables) gains in standardized scores provide the best analytic strategy.
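A minimal sketch of the gains in standard scores computation, $W = Y - (S_y / S_x) X$, is given here. The scores are hypothetical and constructed so that every posttest score is exactly 1.5 times the pretest score, which is an idealized fan spread with no treatment effect and no measurement error; the function names are illustrative and not from the source.

```python
from math import sqrt
from statistics import mean, variance

def pooled_within_sd(groups):
    """Pooled within-group standard deviation over a list of score lists."""
    sum_sq = sum((len(g) - 1) * variance(g) for g in groups)
    dof = sum(len(g) - 1 for g in groups)
    return sqrt(sum_sq / dof)

def gain_in_standard_scores_effect(x_p, y_p, x_c, y_c):
    """Difference in group means of W = Y - (S_y / S_x) * X."""
    ratio = pooled_within_sd([y_p, y_c]) / pooled_within_sd([x_p, x_c])
    w_p = [y - ratio * x for x, y in zip(x_p, y_p)]
    w_c = [y - ratio * x for x, y in zip(x_c, y_c)]
    return mean(w_p) - mean(w_c)

# Hypothetical data: posttest = 1.5 * pretest for every individual.
x_program = [30.0, 40.0, 50.0]
x_control = [40.0, 50.0, 60.0]
y_program = [1.5 * x for x in x_program]
y_control = [1.5 * x for x in x_control]

effect = gain_in_standard_scores_effect(x_program, y_program, x_control, y_control)
raw_gain = (mean(y_program) - mean(x_program)) - (mean(y_control) - mean(x_control))
print(effect)    # approximately 0: no spurious effect
print(raw_gain)  # -5.0: the raw gain score strategy underadjusts
```

In this constructed case the ratio of pooled standard deviations equals the growth factor, so the adjustment removes the widening gap that the raw gain score leaves behind.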
Bryk and Weisberg (1977) provided further evidence showing that this strategy is appropriate in situations conforming to the fan spread hypothesis.

Analysis of Covariance With Estimated True Scores

Another approach which has been suggested as appropriate in a situation conforming to the fan spread hypothesis is the use of estimated true scores in the analysis of covariance. This procedure was originally developed (Porter, 1967) to eliminate the bias introduced by measurement errors when analysis of covariance is used to estimate treatment effects in quasi-experimental studies. Porter and Chibucos (1974) showed that this procedure corrects for the fan spread theory when the pre-treatment measure predicts the post-treatment outcome perfectly except for errors of measurement. To use the procedure suggested by Porter, the estimated true scores of the covariate must be computed. This can be achieved with the following formula:

$$\hat{T} = \bar{X} + \rho_{xx}(X - \bar{X})$$

Where:
$\hat{T}$ is the estimated true score of the covariate;
$\bar{X}$ is the group mean on the covariate;
$X$ is the covariate's observed score; and
$\rho_{xx}$ is the reliability of the covariate.

This approach requires knowledge of the covariate's reliability. The question as to which reliability coefficient should be used in the above formula has not been completely answered, but Porter and Chibucos (1974) have suggested that if possible a test-retest reliability coefficient over a relatively short period of time should be the first choice.

Having estimated the covariate's true score, this variable is then used as the covariate in the analysis of covariance model to estimate group differences. Using Porter's procedure the program effect can be written as:

$$\alpha_{TS} = \mu_{yp} - \mu_{yc} - \frac{\beta_{y \cdot x}}{\rho_{xx}}(\mu_{xp} - \mu_{xc})$$

Where:
$\alpha_{TS}$ is the estimate of the treatment difference computed by the true score analysis of covariance strategy; and
$\mu_{yp}$, $\mu_{yc}$, $\mu_{xp}$, $\mu_{xc}$, $\beta_{y \cdot x}$ and $\rho_{xx}$ are as defined previously.
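The true score formula $\hat{T} = \bar{X} + \rho_{xx}(X - \bar{X})$ can be sketched in a few lines. The observed scores and the reliability value are hypothetical, and the function name is illustrative. The sketch shows that the group mean is preserved while deviations from it are shrunk by the reliability, which is why the slope on the estimated true scores becomes $\beta_{y \cdot x} / \rho_{xx}$.

```python
from statistics import mean

def estimated_true_scores(scores, reliability):
    """Regress each observed score toward its own group mean."""
    group_mean = mean(scores)
    return [group_mean + reliability * (x - group_mean) for x in scores]

# Hypothetical observed covariate scores for one group and an assumed
# reliability coefficient (for example, a short-interval test-retest value).
observed = [40.0, 50.0, 60.0]
rho_xx = 0.75

true_hat = estimated_true_scores(observed, rho_xx)
print(true_hat)  # [42.5, 50.0, 57.5]
```

Because the shrinkage is applied within each group separately, the difference between group means on the covariate is unchanged; only the within-group spread, and hence the pooled slope, is affected.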
The similarity of this approach to that of gains in standard scores presented earlier is clearly shown with the following substitution:

$$\beta_{y \cdot x} = \rho_{xy} \frac{\sigma_y}{\sigma_x}$$

The estimate of the treatment difference can now be written as the following:

$$\alpha_{TS} = \mu_{yp} - \mu_{yc} - \frac{\rho_{xy}}{\rho_{xx}} \frac{\sigma_y}{\sigma_x}(\mu_{xp} - \mu_{xc})$$

Given that individuals conform to the fan spread model, pre-treatment scores should predict post-treatment scores perfectly except for measurement errors. Thus the ratio of the correlation between measures and the pretest reliability is equal to unity: $\rho_{xy} / \rho_{xx} = 1$ if $\rho_{xx} = \rho_{xy}$. The estimate of the program effect provided by true score analysis of covariance and gains in standard scores is the same for fan spread data conforming to the first model of within group growth.

This similarity is only true for the linear growth model for individuals within comparison groups. When individuals within groups are growing non-linearly, the ratio of the correlation between measures and the reliability coefficient of the pretest does not equal unity. The effect estimated by the gains in standard scores and analysis of covariance with estimated true scores is therefore different. The two procedures also differ in that the gains in standard scores approach assumes that the correct ratio of the standard deviations is known for the population, while estimated true score analysis of covariance estimates the parameter on the sample data.

Adjusted Gain Scores

The inadequacies of the raw gain score strategy in situations conforming to the fan spread hypothesis were discussed in some detail earlier. It was shown that for the general fan spread model of growth, gain scores were inappropriate since they adjust only for initial differences at the point of intervention and not for differences in the rate of growth as the model predicts. These latter differences, if uncontrolled, would result in a biased estimate of program effectiveness.
The gain score strategy may still be appropriate, however, if modified to reflect not only initial differences but also growth rate differences. Such a modification could be made if additional data collected at some time prior to the point of intervention were available. This modified gain score approach then could provide a third alternative solution to the fan spread hypothesis. To facilitate a discussion on the development of the modified gain score procedure, Figure 6 presents in greater detail differential achievement growth over time for a hypothetical program and control group without a treatment effect.

[Figure 6. Differential growth rates considered over three points in time. Achievement (W) is plotted against time (T) for the control group and the program group at $t_1$, $t_2$ (point of intervention), and $t_3$ (end of treatment).]

The horizontal axis T denotes time while the vertical axis W represents achievement. On the time dimension three points are identified: $t_1$, $t_2$, and $t_3$. The vertical dotted line at $t_2$ indicates the point of intervention while the dotted lines at $t_1$ and $t_3$ represent points in time prior to and at termination of the intervention, respectively. The solid lines represent the linear regression of achievement on time for the populations of program and control groups. In this figure the control group is shown to have a higher achievement rate (growing faster) than the program group. The points at which these growth curves intersect the dotted vertical lines represent the average achievement level on the measure administered at time t. For example, $(t_2, \mu_{xp})$ represents the population mean on measure X for the program group at the time of intervention. These solid lines can be defined in regression equations and used to predict the average group performance at any point in time. If, for example, group performance at $t_3$ was of interest, the following equations may be used:

$$\mu_{yp} = a_p + b_p(t_3 - t_2)$$

and

$$\mu_{yc} = a_c + b_c(t_3 - t_2)$$

Where:
$\mu_{yp}$, $\mu_{yc}$ are the population mean performance on measure (Y) at time $t_3$ for the treatment and control groups, respectively;
$a_p$, $a_c$ are the intercept constants of the growth curves for the treatment and control groups, respectively, at $t_2$;
$b_p$, $b_c$ are the slopes (rate of growth per unit time) of the regression line predicting achievement from time for the program and control groups, respectively; and
$t_3 - t_2$ is the period of intervention.

The difference in average performance of the program and control groups at the termination of the intervention can be determined as

$$\alpha_{AGS} = \mu_{yp} - \mu_{yc} - (a_p - a_c) - [(b_p - b_c)(t_3 - t_2)].$$

When the intervention has no effect, the equation is as follows:

$$0 = \mu_{yp} - \mu_{yc} - (a_p - a_c) - [(b_p - b_c)(t_3 - t_2)].$$

Since the intercepts $a_p$ and $a_c$ of the growth curves are the achievement levels at the point of intervention, then

$$a_p - a_c = \mu_{xp} - \mu_{xc},$$

the difference in the mean pretest scores of the two groups. With this substitution the expression becomes the following:

$$\mu_{yp} - \mu_{yc} - (\mu_{xp} - \mu_{xc}) - [(b_p - b_c)(t_3 - t_2)]. \qquad (2)$$

The first two terms of the expression are identical to the raw gain score adjustment for initial differences in test performance, while the bracketed third term adjusts for the differential growth rates. If the slopes are equal, that is, the rate of growth is the same for both groups, the bracketed term equals 0 and raw gain scores provide the appropriate adjustment procedure. The fan spread model, however, states that the growth rates are not equivalent, and therefore an additional adjustment is needed.

The slope of a regression line is defined as the ratio of the change in the vertical axis to the change in the horizontal axis, i.e., $b = \Delta W / \Delta T$. By using the information available before the start of the intervention, the growth rate for each group can be estimated.
For the program group, the regression slope can be written as the following:

$$b_p = \frac{\mu_{xp} - \mu_{zp}}{t_2 - t_1}$$

This equation is the ratio of the change in population mean achievement between two points in time prior to the intervention to the period of time between testings. Similarly for the control group the regression slope can be written as the following:

$$b_c = \frac{\mu_{xc} - \mu_{zc}}{t_2 - t_1}$$

With these estimates of growth rates, the third term of expression 2 can be written as:

$$[(b_p - b_c)(t_3 - t_2)] = \left[\left(\frac{\mu_{xp} - \mu_{zp}}{t_2 - t_1} - \frac{\mu_{xc} - \mu_{zc}}{t_2 - t_1}\right)(t_3 - t_2)\right]$$

If the period of time between the first and second testing equals the period of intervention $t_2$ to $t_3$, the above equation can be simplified as the following:

$$[(b_p - b_c)(t_3 - t_2)] = [(\mu_{xp} - \mu_{zp}) - (\mu_{xc} - \mu_{zc})]$$

Thus, the difference in group mean gains prior to the period of intervention can be an appropriate estimate of the difference in growth rates between program and control groups. The combination of this adjustment for differential growth rates and the adjustment for differences in initial performance levels results in estimating the treatment effects as:

$$\alpha_{AGS} = \mu_{yp} - \mu_{yc} - (\mu_{xp} - \mu_{xc}) - \left[\frac{(\mu_{xp} - \mu_{zp}) - (\mu_{xc} - \mu_{zc})}{t_2 - t_1}\right](t_3 - t_2)$$

To achieve this type of adjustment requires the creation of a new variable that takes into consideration all three test results. A variable, W, can be defined as the following:

$$W = Y - X - (\bar{X} - \bar{Z})\frac{t_3 - t_2}{t_2 - t_1}$$

Where:
$X$, $Y$, $t_3 - t_2$, $t_2 - t_1$ are as defined earlier;
$\bar{X}$ is the group mean on the pretest administered at $t_2$, the point of intervention; and
$\bar{Z}$ is the group mean on the pretest administered at $t_1$, some point in time prior to the intervention.

If the time between the first two test administrations equals the period of intervention, the new variable is simply a gain score minus the difference of the group's means at two points in time prior to the intervention. This second factor adjusts for the growth rate of the group.
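The adjusted gain score variable $W = Y - X - (\bar{X} - \bar{Z})(t_3 - t_2)/(t_2 - t_1)$ can be sketched as follows. The three waves of scores are hypothetical and were constructed so that each group grows linearly at its own rate with no treatment effect; the function name and the time values are illustrative assumptions, not part of the source.

```python
from statistics import mean

def adjusted_gain_scores(z, x, y, t1, t2, t3):
    """W = Y - X - (mean(X) - mean(Z)) * (t3 - t2) / (t2 - t1) for one group."""
    growth_adjustment = (mean(x) - mean(z)) * (t3 - t2) / (t2 - t1)
    return [yi - xi - growth_adjustment for xi, yi in zip(x, y)]

t1, t2, t3 = 0.0, 1.0, 2.0  # assumed equally spaced test administrations

# Hypothetical program group: mean grows 10 points per period.
z_p, x_p, y_p = [25.0, 30.0, 35.0], [34.0, 40.0, 46.0], [43.0, 50.0, 57.0]
# Hypothetical control group: mean grows 15 points per period.
z_c, x_c, y_c = [35.0, 40.0, 45.0], [48.0, 55.0, 62.0], [61.0, 70.0, 79.0]

w_p = adjusted_gain_scores(z_p, x_p, y_p, t1, t2, t3)
w_c = adjusted_gain_scores(z_c, x_c, y_c, t1, t2, t3)

adjusted_effect = mean(w_p) - mean(w_c)
raw_gain_effect = (mean(y_p) - mean(x_p)) - (mean(y_c) - mean(x_c))
print(adjusted_effect)  # 0.0: no effect, as constructed
print(raw_gain_effect)  # -5.0: raw gains mistake slower growth for harm
```

The adjustment is computed from each group's own pre-intervention means, so it requires only that group growth be linear across the three testing points.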
Since this technique makes no assumptions as to the rate of growth or the initial starting point, it is appropriate for any situation where growth is linear for groups.

Analysis of Covariance With Multiple Covariates

The strategy presented above required that two assessments be made prior to the start of the intervention. Given the availability of this pre-treatment information, a fourth procedure which has been suggested to analyze data in a quasi-experimental setting conforming to the fan spread model is analysis of covariance with the two pretests as covariates. It was suggested when this technique was proposed that the covariates should be corrected for their unreliability.

Keesling and Wiley (1977) have recently suggested a new approach to the problem which may provide a reasonable solution to the question of multiple fallible covariates. Their approach estimates the treatment effects within groups separately and then compares the magnitude of those effects across groups. The general model for estimating the adjusted posttest mean for a group can be written as:

$$\tau_G = \mu_{yG} - \Gamma \mu_x - B \mu_I$$

Where:
$\tau_G$ is the estimate of the effect for group G;
$\mu_{yG}$ is the population mean on the posttest measure Y for group G;
$\mu_x$, $\mu_I$ are the vectors of population means on the fallible and error free covariates, respectively, for group G; and
$\Gamma$, $B$ are the row vectors of structural regression coefficients for the fallible and error free covariates, respectively, for group G.

In the present study the adjustment procedure using fallible covariates is of primary interest, and therefore a discussion of the error free covariates and their structural coefficients will not be presented here.
In the situation under consideration the posttest mean adjusted by the two pretest means under the Keesling-Wiley model can be written as:

$$\tau_G = \mu_{yG} - \gamma_1 \mu_x - \gamma_2 \mu_z$$

Where:
$\mu_y$, $\mu_x$ and $\mu_z$ are the population means of the three tests as defined earlier; and
$\gamma_1$, $\gamma_2$ are the structural regression coefficients for the two fallible covariates.

The estimates of the structural regression coefficients are defined as the following:

$$\gamma_1 = \frac{\sigma^2_{\xi_2}\sigma_{\xi_1 \eta} - \sigma_{\xi_1 \xi_2}\sigma_{\xi_2 \eta}}{\sigma^2_{\xi_1}\sigma^2_{\xi_2} - \sigma^2_{\xi_1 \xi_2}}$$

$$\gamma_2 = \frac{\sigma^2_{\xi_1}\sigma_{\xi_2 \eta} - \sigma_{\xi_1 \xi_2}\sigma_{\xi_1 \eta}}{\sigma^2_{\xi_1}\sigma^2_{\xi_2} - \sigma^2_{\xi_1 \xi_2}}$$

Where:
$\xi_1$, $\xi_2$, $\eta$ represent the true scores of the first covariate Z, the second covariate X, and the posttest measure Y; and
$\sigma^2$, $\sigma$ represent the variance and covariance of the subscripted variables, respectively.

In terms of the notation used elsewhere in this study, the structural regression coefficients can be written as the following:

$$\gamma_1 = \frac{\beta_{T_y \cdot T_x} - \beta_{T_z \cdot T_x}\beta_{T_y \cdot T_z}}{1 - \rho^2_{T_x T_z}}; \qquad \gamma_2 = \frac{\beta_{T_y \cdot T_z} - \beta_{T_x \cdot T_z}\beta_{T_y \cdot T_x}}{1 - \rho^2_{T_x T_z}}.$$

(For a discussion of the distinction between structural equation models and regression models, see Goldberger and Duncan, 1973.) It might be noted at this point that the above regression coefficients are similar to those used by Pravalpruk (1974) in one of the two methods he considered for solving the multiple fallible covariate problem. His estimates were:

$$\gamma_1 = \frac{\beta_{y \cdot T_x} - \beta_{T_z \cdot T_x}\beta_{y \cdot T_z}}{1 - \rho^2_{T_x T_z}}; \qquad \gamma_2 = \frac{\beta_{y \cdot T_z} - \beta_{T_x \cdot T_z}\beta_{y \cdot T_x}}{1 - \rho^2_{T_x T_z}}.$$

In his study, however, Pravalpruk only considered situations where the reliabilities of the covariates were equal. Furthermore, since $\beta_{y \cdot T_x} = \beta_{T_y \cdot T_x}$ and $\beta_{y \cdot T_z} = \beta_{T_y \cdot T_z}$, the adjustment coefficients considered by Pravalpruk and those considered by Keesling and Wiley are identical. To obtain his regression coefficients, Pravalpruk first computed the estimated true scores for his covariates and then corrected for the attenuated relationship between the two covariates.
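The structural regression coefficients defined above in terms of true score variances and covariances can be sketched directly. The true score variances and covariances below are hypothetical; they were generated from an assumed relation $\eta = 0.5\,\xi_1 + 1.0\,\xi_2$ so that the formulas can be seen to recover the two coefficients. This is only the closed-form two-covariate calculation, not the Keesling-Wiley estimation procedure itself, which estimates the true score covariance matrix from replicate measures.

```python
def structural_coefficients(var_1, var_2, cov_12, cov_1y, cov_2y):
    """Two-covariate structural coefficients from true score (co)variances."""
    determinant = var_1 * var_2 - cov_12 ** 2
    gamma_1 = (var_2 * cov_1y - cov_12 * cov_2y) / determinant
    gamma_2 = (var_1 * cov_2y - cov_12 * cov_1y) / determinant
    return gamma_1, gamma_2

# Hypothetical true score variances and covariance of the two covariates.
var_1, var_2, cov_12 = 4.0, 9.0, 3.0

# Covariances with the posttest true score implied by the assumed relation
# eta = 0.5 * xi_1 + 1.0 * xi_2 (illustrative only).
cov_1y = 0.5 * var_1 + 1.0 * cov_12   # 5.0
cov_2y = 0.5 * cov_12 + 1.0 * var_2   # 10.5

gamma_1, gamma_2 = structural_coefficients(var_1, var_2, cov_12, cov_1y, cov_2y)
print(gamma_1, gamma_2)  # 0.5 1.0
```

The denominator is the determinant of the covariates' true score covariance matrix, so the coefficients are undefined when the two true scores are perfectly correlated.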
Concerning this approach, Pravalpruk concluded that while the correct effect was estimated, the appropriate probability of a type I error was only obtained in a two-group design but not in a four-group design. When a four-group design was considered, the test statistic was shown to be too liberal for practical purposes. Thus Pravalpruk concluded that his method did not provide a satisfactory solution to the multiple fallible covariate problem.

In the approach suggested by Keesling and Wiley the true score variance-covariance matrix is estimated using the replicate measures of the covariates. Using the true score relationship the structural regression coefficients are then estimated. The estimation of these parameters is provided in a computer program by Jöreskog and van Thillo (1972).

To facilitate comparisons across the analytic strategies proposed for solving the fan spread problem the estimate of a treatment effect can be written as the following:

$$\alpha_{MAC} = \mu_{yp} - \mu_{yc} - \left(\frac{\rho_{T_x T_y} - \rho_{T_x T_z}\rho_{T_z T_y}}{1 - \rho^2_{T_x T_z}}\right)\frac{\sigma_{T_y}}{\sigma_{T_x}}(\mu_{xp} - \mu_{xc}) - \left(\frac{\rho_{T_z T_y} - \rho_{T_x T_z}\rho_{T_x T_y}}{1 - \rho^2_{T_x T_z}}\right)\frac{\sigma_{T_y}}{\sigma_{T_z}}(\mu_{zp} - \mu_{zc}).$$

While the above procedure has been demonstrated on a data set, there have not as yet been any investigations considering the properties of the distribution of the test statistic in small samples. Although further study of the Keesling-Wiley procedure is needed before it can be adopted as a competing analytic strategy, it was considered in the present study because it appeared to be a promising technique for the future.

Summary

The four analytic strategies proposed as solutions to the fan spread model have been considered in some detail in this chapter. The following chapter reviews the literature concerning data analysis in quasi-experiments, focusing on discussions directly relevant to the strategies considered in the present chapter.

CHAPTER 3

REVIEW OF RELEVANT LITERATURE

The first chapter began by identifying the measurement of change as a difficult problem in a study based on a quasi-experimental research design.
As an example of this type of study, the evaluation of compensatory education programs was suggested. These investigations are characterized by the fact that comparison groups frequently differ in their initial status on the outcome of interest, which generally is some measure of achievement. It was then pointed out that some authorities have suggested that differences in initial achievement levels were an indication that the groups were growing academically at different rates. Campbell has labeled this differential growth rate issue the fan spread hypothesis. Given this model, traditional analytic strategies have been questioned. Several procedures were then introduced as potential solutions for differences in growth rates. These solutions were described in detail in Chapter 2. The present chapter reviews the literature as it relates to the proposed solutions.

Data Analysis for Quasi-Experiments

The debate on the analysis of quasi-experiments has ranged from pessimistic arguments stating that appropriate analytic procedures do not currently exist for these studies (Lord, 1967; Cronbach & Furby, 1970) to the optimistic view that a cautious interpretation of carefully analyzed data is useful (Elashoff, 1969; Harnquist, 1968; Porter, 1973).

Campbell and Erlebacher (1970) reviewed, and explicitly demonstrated using simulated data, many of the problems raised when treatment effects are estimated from quasi-experiments. The purpose of their presentation was to illustrate the inappropriateness of gain scores and analysis of covariance for estimating the effectiveness of compensatory education programs. Campbell and Erlebacher observed that pretest data often showed control groups having a higher average performance than those individuals involved in the compensatory program. "That difference no doubt is there because of previously more rapid rate of growth on the part of the Control Group, which would be expected to continue during the period of the experimental treatment" (p. 198). Thus in addition to all of the other problems the literature has identified as being present in quasi-experimental studies, differential growth rates of treatment groups must also be considered in the selection of an analytic strategy. Campbell and Erlebacher suggested that as a result of all of these issues, traditional analysis techniques were inadequate to estimate the effectiveness of compensatory education programs. When these procedures were used, biased estimates resulted, and programs appeared to be ineffective or even harmful.

In discussing the fan spread hypothesis, Campbell and Erlebacher focused on differences in growth rates between comparison groups and neglected the question of within group growth. Their argument that analysis of covariance is inappropriate for the fan spread model is dependent on the nature of the within group growth. This issue is examined in detail in the next chapter. The authors also confused the problem of differential growth rates with the problem of an unreliable covariate. Given the linear model of within group growth, analysis of covariance provides an appropriate adjustment if the covariate is perfectly reliable. Finally, Campbell and Erlebacher do not point out that parallel growth for comparison groups is a special case of the fan spread model in which the variance of the measure remains constant across time.

Building on many of the same arguments raised by Campbell and Erlebacher, Campbell and Boruch (1975) identified six ways in which quasi-experimental research designs in evaluations of compensatory education programs underestimate program effectiveness.
The issue of differential growth rates among treatment groups was cited as a major contributing factor to a biased estimate of a program effect. Although the authors attempted to clarify the question of differential growth rates, they fell short of addressing those issues ignored in the Campbell and Erlebacher paper. The authors did provide, however, several examples of studies from the literature which supported their contention that cognitive test scores do follow fan spread patterns. As a possible solution to this problem, the authors suggested that "standardizing scores (to mean zero and variance one) at each age level would produce a metric which eliminates differential growth rates" (p. 37). While this procedure may be appropriate, Campbell and Boruch argued that actual growth patterns in achievement were not very well understood and deserved further study.

Raw Gain Scores

Several authorities (Lord, 1956; McNemar, 1958; Cronbach & Furby, 1970; Marks & Martin, 1973) have considered the question of measuring true change with gain scores. Their concern, however, has been with individuals rather than groups. Although related, a distinction should be made between the two topics. The literature has not always done this, with the result of conflicting statements and confusion as to the appropriateness of analysis strategies.

The use of gain scores as a strategy to measure group change has been questioned by Cronbach and Furby (1970). These authors pointed out that in true experiments only differences in the post-treatment tests were needed to determine a treatment effect. If a pre-treatment measure was available, then other techniques such as analysis of covariance provided more powerful estimates of treatment effects than gain scores. In quasi-experimental studies Cronbach and Furby argued that the distribution of true pre-treatment measures was different across the subpopulations being compared.
As a result, these authors agreed with Lord (1967) and concluded that no analysis strategy was appropriate to compare treatment effects in quasi-experiments.

Not all authorities share this pessimistic view concerning gain scores as an analytic strategy in quasi-experimental studies. Porter (1973), who has argued that the use of gain scores does not provide the best strategy in true experimental studies, has suggested that under certain assumptions gain scores may provide the best technique for quasi-experimental investigations. The appropriateness of the technique depends on the reasonableness of the assumption that the relationship between the true pre and true post measures is actually unity ($\beta_{T_y \cdot T_x} = 1$). Porter argued that given the assumption that the treatment effects are additive and that pretest and posttest measure the same variable in a common metric, it can be shown that the relationship between the true parts of the measures does equal unity. He concluded that given these assumptions, the gain score strategy does provide a reasonable approach to data analysis in quasi-experimental studies.

The situation in which Porter has suggested that gain scores would be appropriate is the special case of the fan spread model in which the growth patterns are parallel and the variance of the measures remains constant across time. The additivity assumption implies the linear within group growth model.

Gains in Standard Scores

As noted earlier, a solution to the more general fan spread model was proposed by Campbell and Boruch (1975). Their solution was to standardize the observations at each point in time. Although they indicated that this procedure may be appropriate they did not recommend its use, but rather suggested that further study of academic growth patterns is needed.
Kenny (1975) took a much stronger position on Campbell's proposed solution and argued that under certain conditions standardized gain scores provided the best analytic strategy for quasi-experiments. Kenny pointed out that in quasi-experiments the procedure for assigning groups to a treatment was the determining factor in selecting the appropriate analysis strategy. For example, if subjects were assigned to the treatment strictly on test scores to be used as the covariate, then analysis of covariance without adjustments for the unreliability of the covariate was the appropriate analysis technique. On the other hand, if the subjects themselves determined which treatment they received, then analysis of covariance corrected for a fallible covariable was the appropriate approach. Finally, if the treatment was assigned to groups based on sociological or demographic characteristics of the group, then the use of standardized gain scores was the appropriate procedure.

An example of a study based on sociological or demographic variables is one in which members of a particular social group are eligible for treatment as a matter of legislation. Another example is a study in which treatment is given to members of a particular organization such as a school district. A current example where a compensatory education project was assigned by legislation is the Response to Educational Needs Project in the Anacostia region of Washington, D.C. This program was initiated by the President and Congress specifically for this area in order to increase community involvement and the quality of education in the schools of the region. (This project is currently being evaluated by the National Institute of Education.)

Kenny suggested that the assigning of groups to treatments based on sociological and demographic factors resulted in situations conforming to the fan spread model. That is, treatment and control groups were likely to have different growth rates.
Kenny judged the use of raw gain scores, analysis of covariance, and adjusted analysis of covariance to be inappropriate strategies for these types of studies. Standardized gain scores, however, were appropriate since the technique adjusts for the increasing variability across time. To support his contention as to the appropriateness of the procedure, Kenny cited two examples intended to demonstrate that raw gain scores and analysis of covariance, with or without correction for the fallible covariate, led to incorrect conclusions, while the standardized gain score strategy drew the correct conclusion. He thus concluded that the use of standardized gain scores was an appropriate technique for many quasi-experimental studies.

Kenny's presentation on the appropriateness of the procedure he advocates has several weaknesses. First, the label of standardized gain scores is misleading. Gains in standard scores is a more accurate description of the approach. A second weakness is that the author fails to distinguish different models of within group growth. As a result his conclusion that adjusted covariate analysis of covariance does not estimate the treatment effect correctly is in error. This issue is addressed in Chapters 4 and 5. A third problem with Kenny's paper is that after showing that gains in standard scores appropriately estimates the treatment effect of interest, the author fails to consider the standard error of the approach. As a result of this oversight, Kenny failed to recognize that the procedure he recommended results in a spuriously low standard error. The standard error issue is considered in detail in Chapters 4 and 5.

Estimated True Score Analysis of Covariance

The second solution to the general fan spread model that was proposed in the first two chapters was the estimated true score analysis of covariance.
The technique differs from traditional analysis of covariance in that the estimated true score rather than the observed score on the pre-treatment measure is used as the covariate in the analytic model.

Analysis of covariance with an observed score covariate has been criticized as inappropriate as a result of measurement errors. The problem arises in the calculation of the slope of the regression of the posttest on the pretest used in computing the residuals. The treatment effect which is of interest is based on the latent true variables, but only the observed scores are available. If the regression slope estimated on the observed scores is not the same as that which would be obtained if the latent true scores were available, the regression estimate is said to be biased, and the effect which is estimated is not likely to be the desired one.

In the classical analysis of covariance model the covariate is assumed to be fixed rather than random. That is, the levels of the covariates are chosen by the researcher. Berkson (1950) considered using a controlled observation (fixed levels) as the independent variable in estimating the regression line. He concluded the regression slope in this situation to be unbiased. On the other hand, when the independent variable is uncontrolled (a random sample from an existent population), the slope is biased. "The distinction between the two situations resides not in what the variate represents but how the observations are acquired. If the values are obtained by taking a sample from an existent population, we have the biased situation. If they are obtained by making them as controlled observations, we have the unbiased situation" (Berkson, 1950, p. 179).

In education and psychology, it is rare to find a situation where the covariate is fixed. Rather, for most cases the covariate is a random variable and it is usually measured with error.
Thus the slope which is estimated on the observed scores is biased, and the effect which is estimated is not the one of prime interest when treatment groups differ on their mean covariate score.

The nature of the bias in using the observed scores rather than the latent true variables can be shown in a number of ways. From a measurement perspective, it is most easily seen through the following set of identities:

$$\beta_{y \cdot x} = \rho_{xy}\,\frac{\sigma_y}{\sigma_x} \tag{3}$$

$$\rho_{xy} = \rho_{yT_x}\sqrt{\rho_{xx}} \tag{4}$$

$$\sigma_x = \frac{\sigma_{T_x}}{\sqrt{\rho_{xx}}} \tag{5}$$

$$\beta_{y \cdot x} = \rho_{yT_x}\sqrt{\rho_{xx}}\;\frac{\sigma_y\sqrt{\rho_{xx}}}{\sigma_{T_x}} \tag{6}$$

$$\beta_{y \cdot x} = \rho_{xx}\,\beta_{y \cdot T_x} \tag{7}$$

Equation 7 shows that when the observed scores are used to estimate the slope of the regression of the dependent measures on the covariate, it underestimates the desired relationship based on the latent true variables. The observed score regression is biased by a factor equalling the reliability of the covariate.

The treatment effect estimated by the observed score analysis of covariance can be written as

$$\alpha_{AC} = \mu_{yp} - \mu_{yc} - \beta_{y \cdot x}\,(\mu_{xp} - \mu_{xc}).$$

The treatment effect estimated on the latent true variables, on the other hand, can be written as

$$\alpha_{TS} = \mu_{yp} - \mu_{yc} - \frac{\beta_{y \cdot x}}{\rho_{xx}}\,(\mu_{xp} - \mu_{xc}).$$

Since the latent slope is always larger than the observed score slope, the difference between the two group means on the posttest is adjusted to a lesser extent with the unadjusted regression slope. In the case of evaluating compensatory education programs, analysis of covariance with observed scores underadjusts for initial differences between the comparison groups' means on the covariate.

To obtain an unbiased estimate of the regression slope, Lord (1960) suggested a large sample approach which required two independent measures of the covariate. The resulting test statistic, U, was shown to follow the normal distribution and provide an unbiased estimate of treatment effects. This procedure is limited to studies comparing two groups with large samples and two measures on the covariate.
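The attenuation identity in Equation 7 and the resulting underadjustment can be illustrated with a minimal sketch. It assumes the classical measurement model (observed score equals true score plus an error uncorrelated with everything else) and works from population moments rather than sample data. The function names and all numerical values below are hypothetical, chosen only for illustration.

```python
def slopes_from_moments(cov_true_y, var_true, var_error):
    """Return (latent slope, observed slope, reliability) from population moments.

    Assumes X = T + E with E uncorrelated with T and Y, so that
    cov(X, Y) = cov(T, Y) and var(X) = var(T) + var(E).
    """
    var_observed = var_true + var_error
    reliability = var_true / var_observed
    beta_latent = cov_true_y / var_true
    beta_observed = cov_true_y / var_observed
    return beta_latent, beta_observed, reliability


def adjusted_difference(post_diff, pre_diff, slope):
    """Posttest mean difference adjusted for the pretest mean difference."""
    return post_diff - slope * pre_diff


# Hypothetical population values, chosen only for illustration.
beta_latent, beta_observed, reliability = slopes_from_moments(
    cov_true_y=5.6, var_true=4.0, var_error=1.0
)

# Hypothetical mean differences (program minus control) on posttest and pretest.
effect_latent = adjusted_difference(-7.0, -5.0, beta_latent)
effect_observed = adjusted_difference(-7.0, -5.0, beta_observed)

print(beta_latent, beta_observed, reliability)
print(effect_latent, effect_observed)
```

Under these assumed values the observed-score slope is the reliability (0.8) times the latent slope, and the adjusted difference moves from approximately zero to approximately -1.4, which is the kind of underadjustment described above for compensatory programs.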
Porter (1967) studied the distributional properties of the U statistic when samples were small. His results indicated that a sample size larger than 20 was necessary for the U statistic to converge to the normal distribution when the reliability of the covariate and/or the correlation of the covariate and dependent measure were low.

Another approach to this issue of bias was suggested by Harnquist (1958). He proposed that the problem of an unreliable covariate could be solved simply by multiplying the sum of squares of the covariable by the estimate of the covariable's reliability. Porter (1967), in further developing this idea, suggested that the estimated true score of the variable be used as the covariate. Porter showed that this technique provides an unbiased estimate of the regression slope. Furthermore, in a simulation study using this technique, he showed that the analysis of covariance strategy using true scores followed very closely the theoretical F-distribution.

Stroud (1972) provided still another solution to the problem of an unreliable covariate, but his procedure required knowledge of the error variance of the covariate. This approach is also a large sample solution and as yet no small sample distributional investigations have been done with this procedure.

The fan spread model as suggested by Campbell is a special conceptualization of a more general linear growth model. Several models of linear group growth, including the fan spread, were considered in detail by Bryk and Weisberg (1977) for the non-equivalent control group design. These authors suggested that a simple representation of individual linear growth could be provided by considering the product of growth rate ($\pi$) and time, i.e., the difference between ($t_j$) the point of assessment and ($T$) the initial starting point of growth; growth $= \pi(t_j - T)$.
Similarly, the average performance of a group at time $t_j$ was defined as $\mu_\pi(t_j - \mu_T)$, where $\mu_\pi$ was the average growth rate for individuals within the group and $\mu_T$ was the average starting point within the group. By changing the initial starting point for the groups and their rate of growth, Bryk and Weisberg considered five linear growth models. To produce the fan spread condition, a common starting point was assumed for all individuals and the average rate of growth was assumed constant for the group but varied between the groups. Growth within the groups was assumed constant for the individual but varied across individuals.

Using this model of growth, the authors compared four analysis strategies on the basis of the effect estimated by each technique. The procedures which were considered were the following: (1) raw gain scores, (2) gains in standard scores, (3) analysis of covariance, and (4) Belson's approach to analysis of covariance. Belson's method, in contrast with traditional analysis of covariance, estimates the regression slope using only the data on the control group. In contrast to Kenny's conclusions, Bryk and Weisberg's analysis indicated that both gains in standard scores and analysis of covariance adequately adjusted for the fan spread model. This conclusion was based on the assumption that the covariate was perfectly reliable. When considering the other models of linear growth that were studied, the authors concluded that these approaches were inadequate in estimating treatment effects. They therefore suggested that pretest-posttest data were inadequate to adjust for initial differences that frequently accompany the non-equivalent control group design. Bryk and Weisberg further recommended that researchers develop new procedures which might take into consideration more data on growth collected prior to an intervention.

In their study, Bryk and Weisberg made a basic assumption that both groups and individuals within groups grow linearly.
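The linear growth assumption can be made concrete with a short sketch. It assumes a common starting point of zero for every individual, individual growth rates that are constant over time, and no measurement error. The growth rates, assessment times, and function names are hypothetical and are not taken from Bryk and Weisberg.

```python
from statistics import mean, stdev


def score(rate, t, start=0.0):
    """Linear individual growth: rate times the elapsed time since the start."""
    return rate * (t - start)


# Hypothetical individual growth rates for two groups of three individuals.
rates = {"group_1": [0.6, 1.0, 1.4], "group_2": [1.6, 2.0, 2.4]}
times = [3, 5, 7]  # hypothetical assessment points


def group_scores(group, t):
    return [score(rate, t) for rate in rates[group]]


def raw_gap(t):
    """Difference between the two group means at time t."""
    return mean(group_scores("group_2", t)) - mean(group_scores("group_1", t))


def standardized_gap(t):
    """Mean difference divided by the pooled within-group standard deviation."""
    g1, g2 = group_scores("group_1", t), group_scores("group_2", t)
    pooled_sd = ((stdev(g1) ** 2 + stdev(g2) ** 2) / 2) ** 0.5  # equal group sizes
    return raw_gap(t) / pooled_sd


raw_gaps = [raw_gap(t) for t in times]
standardized_gaps = [standardized_gap(t) for t in times]
print(raw_gaps)
print(standardized_gaps)
```

In this sketch the raw mean difference widens over time while the difference in pooled standard deviation units stays the same, which is the defining property of the fan spread pattern.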
As a result they showed that analysis of covariance with a reliable covariate correctly adjusted for the differential growth rates. The assumption that individuals within groups grow linearly, however, is not necessary for the fan spread situation, as was pointed out in the first chapter. Without this assumption and the assumption of a perfectly reliable pretest, however, the procedure would not provide the appropriate adjustment. These assumptions were not explicitly stated or discussed in the authors' presentation. Furthermore, Bryk and Weisberg, in concluding that both gains in standard scores and analysis of covariance provide the appropriate adjustment, implied that the two procedures are equally appropriate. This is not completely true. While both procedures estimate the same effect, the precision with which the estimate is made differs. The precision of the strategy suggested by Kenny is spuriously high. The issue of precision is considered in detail in the next two chapters.

Analysis of Covariance With Multiple Fallible Covariates

The fourth analytic strategy which was proposed as a solution to the fan spread model was the analysis of covariance with multiple covariates. This procedure is an extension of the single covariate analytic technique previously discussed. As in the single covariate approach, multiple covariate analysis of covariance has been challenged as inappropriate because of the unreliability of the measures. Unfortunately, the solution to this problem appears more complex than the single fallible covariate case.

Pravalpruk (1974) considered two approaches to this problem when two fallible covariates were available. The first method he considered was an extension of Porter's (1967) procedure of using estimated true scores as covariates. In addition, however, he corrected the correlation between the two covariates for attenuation before calculating the beta weights.
In the second approach considered, Pravalpruk transformed the second fallible covariate to be independent of the first fallible covariate and then used the estimated true score procedure. The first method was shown to estimate the desired effect, while the second method did not. In simulating the distributions of the F-statistic computed by the two methods, he found that in the case of two-group designs the type I error rate was satisfactory. In the case of a four-group design, however, the type I error rates were found to be too liberal for practical use. Pravalpruk concluded that the problem of multiple fallible covariates remains unsolved.

Stroud (1974) extended the procedure he suggested for the single fallible covariate problem to include multiple covariates. His solution, however, requires knowledge of the error variances associated with each covariate. As yet no investigations have been conducted studying the distributional properties of his test statistic when small samples are available.

More recently, Keesling and Wiley (1976) have suggested a procedure to adjust for initial differences that may accompany the non-random assignment of experimental units to treatment groups. Their technique is an extension of Lord's (1960) large sample covariance analysis (discussed earlier) to include multiple covariates, some of which may be error-free and others fallible. The two approaches are similar in that the Keesling-Wiley procedure is also a large sample solution; it requires replicate measures of the fallible covariates, and it is useful in the analysis of two-group designs. Unlike Lord's technique, however, the Keesling-Wiley procedure requires replicate measures of the dependent variable as well as the covariates. It does not require that the replicate measures be parallel, nor is it limited to two replicates per variable.
Briefly, the Keesling-Wiley strategy is to take the within-cell observed variance-covariance matrix of all replicate measures to estimate the within-cell true score covariance matrix. Using these estimates of true score relationships, the parameters of the structural regression system of true dependent variables on the true covariates can then be computed. These structural regression coefficients are maximum-likelihood estimates of the parameters computed following the procedure suggested by Jöreskog (1973) and implemented in a computer program by Jöreskog and Van Thillo (1972). The true adjusted dependent variable is then calculated for each treatment group and compared to assess the treatment differences. Further discussion of this approach was presented in the previous chapter.

Summary

The review of the literature has indicated that the measurement of change and the estimation of treatment effects in quasi-experimental studies has been a focal point of a great deal of discussion. In discussing these issues, researchers have shown considerable disagreement as to the appropriateness of various analytic strategies suggested for quasi-experiments. The lack of explicit recognition of differential growth rates as a contributing factor to biased estimates of program effectiveness has contributed to that debate.

The literature review has also identified a serious weakness in the previous work in the area of analytic strategies for quasi-experiments. That deficit concerns the almost total neglect of precision as an issue in selecting among competing analytic strategies. The next two chapters attempt to shed some light on the issue of precision as it relates to the four proposed solutions for the fan spread condition.

CHAPTER 4

EFFECTS ESTIMATED AND THEIR STANDARD ERRORS

The four strategies considered in the second chapter have been suggested as possible solutions to the problem of differential growth rates between comparison groups in quasi-experimental studies.
The present chapter compares and evaluates these strategies on the basis of two criteria: (1) the appropriateness of the effects estimated, and (2) the precision with which the effects are estimated. While the first criterion must be met before the issue of precision can be sensibly addressed, the second criterion provides a useful basis on which to decide among appropriate competing analytic strategies.

Two conditions of within-group growth are considered in evaluating the effect estimated by each strategy. The first assumes growth to be linear at the individual level across time such that an individual's relative position within the group remains constant over time. This is the "traditional" model of the fan spread condition examined by Campbell and others and represented pictorially in Figure 4 found in Chapter 1. This model of growth suggests that individuals began to grow at some common point in time, and that the rate of growth varies among individuals but is constant for an individual within a group. As a result, the relationship between test performances across time is perfect except for measurement errors, $\rho = 1$. Table 1 depicts this model of within-group growth for two hypothetical groups, each consisting of three individuals at three points in time. For this data set, measurement errors are assumed to equal zero.

Table 1
A Hypothetical Data Example of the Fan Spread Hypothesis With the Linear Model of Within-Group Growth

                                  Time 0    Time 1    Time 2
          Individual                 Z         X         Y
Group 1   1                         1.8        3        4.2
          2                         3.0        5        7.0
          3                         4.2        7        9.8
          Mean                      3.0        5        7.0
          Standard deviation        1.2        2        2.8
Group 2   4                         4.8        8       11.2
          5                         6.0       10       14.0
          6                         7.2       12       16.8
          Mean                      6.0       10       14.0
          Standard deviation        1.2        2        2.8

The second condition of within group growth which is considered assumes that individuals within a group do not grow linearly but rather in "spurts." The rate of growth for an individual may therefore vary across time, resulting in some variability in the relative position an individual holds within the group.
This view of the fan spread model has not previously been considered in the literature. Pictorially it was represented in Figure 5 found in Chapter 1. Following this model of within-group growth, the relationship between test performances across time is no longer perfect even without measurement errors, $\rho \neq 1$. Table 2 presents this model of within-group growth for two hypothetical groups, each consisting of five individuals at three points in time. For these data it is assumed that there are no errors of measurement.

Table 2
A Hypothetical Data Example of the Fan Spread Hypothesis With the Non-Linear Model of Within-Group Growth

                                  Time 0    Time 1    Time 2
          Individual                 Z         X         Y
Group 3   1                          2        11        22
          2                          3        17        31
          3                          4        15        25
          4                          5        18        28
          5                          6        19        34
          Mean                       4        16        28
          Standard deviation        1.56      3.16      4.74
Group 4   6                          4        15        28
          7                          5        21        37
          8                          6        19        31
          9                          7        22        34
          10                         8        23        40
          Mean                       6        20        34
          Standard deviation        1.56      3.16      4.74

While the two conditions described above differ in terms of within-group growth rates, both have average group growth that is linear. In considering both the appropriateness of the effect estimated and the question of precision, it has been assumed that the measures available are the same or parallel forms of the same instrument.

Estimation of Treatment Effects

The fan spread model of growth, as discussed earlier, suggests that concomitant with an increase in mean difference between comparison groups is a proportional increase in within-group variability. Furthermore, this relationship between the mean differences and pooled standard deviation remains constant across time. Algebraically this relationship is presented as

$$\frac{\mu_{xp} - \mu_{xc}}{\sigma_x} = \frac{\mu_{yp} - \mu_{yc}}{\sigma_y}.$$

The terms are as defined previously. This representation of the differential growth rate problem indicates that the appropriate adjustment strategy should take the following form:

$$\alpha = (\mu_{yp} - \mu_{yc}) - \frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc}).$$

An analytic strategy having the above form would provide an unbiased estimate of group differences in situations conforming to the fan spread model of growth.
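The adjustment implied by the fan spread definition can be checked directly against the Time 1 (X) and Time 2 (Y) scores in Tables 1 and 2. The sketch below is illustrative only: the function names are hypothetical, the pooled standard deviation assumes equal group sizes, and Groups 1 and 3 are arbitrarily treated as the program groups.

```python
from statistics import mean, stdev


def pooled_sd(group_a, group_b):
    """Pooled within-group standard deviation, assuming equal group sizes."""
    return ((stdev(group_a) ** 2 + stdev(group_b) ** 2) / 2) ** 0.5


def fan_spread_effect(pre_p, post_p, pre_c, post_c):
    """Posttest mean difference minus (sd_post / sd_pre) times the pretest mean difference."""
    ratio = pooled_sd(post_p, post_c) / pooled_sd(pre_p, pre_c)
    return (mean(post_p) - mean(post_c)) - ratio * (mean(pre_p) - mean(pre_c))


# Table 1 (condition 1): Group 1 as program group, Group 2 as control.
effect_table_1 = fan_spread_effect(
    pre_p=[3, 5, 7], post_p=[4.2, 7.0, 9.8],
    pre_c=[8, 10, 12], post_c=[11.2, 14.0, 16.8],
)

# Table 2 (condition 2): Group 3 as program group, Group 4 as control.
effect_table_2 = fan_spread_effect(
    pre_p=[11, 17, 15, 18, 19], post_p=[22, 31, 25, 28, 34],
    pre_c=[15, 21, 19, 22, 23], post_c=[28, 37, 31, 34, 40],
)

print(effect_table_1, effect_table_2)
```

Both estimates come out at zero up to floating point rounding, as expected for data in which no treatment effect was built in.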
Since the definition of the fan spread hypothesis does not include a reference to the nature of the within-group growth pattern, the above approach is appropriate for both condition 1, $\rho = 1$, and condition 2, $\rho \neq 1$. Assuming that group 1 represents a program group and group 2 the control, the estimate of a non-existent treatment effect for the data from Table 1 depicting condition 1 is as follows:

$$\alpha = (7 - 14) - \frac{2.8}{2}\,(5 - 10) = -7 - 1.4\,(-5) = 0.$$

Similarly, assuming that group 3 represents a program group and group 4 its control, the estimate of the group differences for the data in Table 2 representing condition 2 is as follows:

$$\alpha = (28 - 34) - \frac{4.74}{3.16}\,(16 - 20) = (-6) - 1.5\,(-4) = 0.$$

The above examples demonstrate the appropriateness of the adjustment technique suggested by the definition of the fan spread model, or alternatively that the data conform to the fan spread model.

Estimation With Gains in Standard Scores

The nature of the hypothesis tested by each of the four analytic strategies is reflected in their respective estimates of group differences presented in Chapter 2. The gains in standard scores approach suggested by Kenny was shown to estimate group differences as

$$\alpha_{GSS} = (\mu_{yp} - \mu_{yc}) - \frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc}).$$

This equation is identical to the adjustment strategy suggested above based on the definition of the fan spread model of growth. In Campbell's definition of the fan spread model, however, it was not clear whether the hypothesis was based on manifest or latent variables. If the fan spread hypothesis is defined on the latent true variables, then gains in standard scores uses the ratio of the standard deviations on the observed variables, $\sigma_y / \sigma_x$, when the ratio of the standard deviations of the true variables, $\sigma_{T_y} / \sigma_{T_x}$, is desired. The relationship of the variance of the manifest variables to the variance of the latent true variables is shown in the following expressions:

$$\sigma_x^2 = \frac{\sigma_{T_x}^2}{\rho_{xx}} \quad \text{and} \quad \sigma_y^2 = \frac{\sigma_{T_y}^2}{\rho_{yy}}.$$

The ratio of the standard deviations on the manifest variables in terms of the latent true variables can be written as the following:

$$\frac{\sigma_y}{\sigma_x} = \frac{\sigma_{T_y} / \sqrt{\rho_{yy}}}{\sigma_{T_x} / \sqrt{\rho_{xx}}} = \frac{\sigma_{T_y}}{\sigma_{T_x}}\,\sqrt{\frac{\rho_{xx}}{\rho_{yy}}}.$$

If the reliability of the pretest equals the reliability of the posttest, $\rho_{xx} = \rho_{yy}$, then the ratio of the observed score standard deviations is appropriate for the latent fan spread model. On the other hand, for the latent fan spread model, the ratio of observed standard deviations is an inappropriate adjustment coefficient when the reliabilities of the measures are not equal. Under the manifest fan spread model, the gains in standard scores strategy as proposed by Kenny is appropriate regardless.

While the discussion of the fan spread model indicated that the appropriate adjustment coefficient was the ratio of the population standard deviations for the posttest to the pretest, Kenny's procedure estimates that ratio using the sample standard deviations. The expected value of the ratio of the sample standard deviations, however, does not equal the ratio of the population standard deviations, $E(S_y / S_x) \neq \sigma_y / \sigma_x$, when the samples are small. The effect estimated by Kenny's technique, therefore, is not the desired one when the sample size is small. Further discussion of this point is presented later in the chapter.

A final consideration concerning the gains in standard scores approach is the fact that this procedure is not affected by the relationship among the individuals within the comparison groups. For large samples, then, the technique estimates the appropriate effect for both models of within-group growth.

Estimation With True Score Analysis of Covariance

The second solution to the fan spread hypothesis proposed earlier was the analysis of covariance model with estimated true scores as the covariate.
This approach estimates the group differences as

$$\alpha_{ACTS} = (\mu_{yp} - \mu_{yc}) - \frac{\rho_{xy}}{\rho_{xx}}\,\frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc}).$$

This statement of the estimate of group differences indicates that this strategy is identical to the adjustment strategy suggested by the fan spread definition except for the ratio $\rho_{xy} / \rho_{xx}$. This ratio provides a correction for errors of measurement. Therefore, if the true relationship between the two measures is perfect as proposed by condition 1, the ratio of the correlation to the reliability of the covariate will also equal unity. Thus for the first condition being considered, the analysis of covariance model with estimated true scores provides the appropriate adjustment for the fan spread situation. This result contradicts Kenny's (1975) conclusion that analysis of covariance adjusted for an unreliable covariate does not appropriately adjust for the differential growth rate problem. Since Kenny did not specify the model of within group growth in his study, it must be assumed that his examples illustrated situations where individual growth rates were not linear across time.

Previously a distinction was drawn between the manifest and latent fan spread models. Considering the latent model, the relationship between the adjustment coefficient provided by estimated true score analysis of covariance and the ratio of the latent variable standard deviations can be written as

$$\frac{\rho_{xy}}{\rho_{xx}}\,\frac{\sigma_y}{\sigma_x} = \frac{\sigma_{T_y}}{\sigma_{T_x}}.$$

The above expression is true when the reliabilities of the pretest and posttest are equal for the linear model of within group growth. If the reliabilities are not equal, then the following set of relationships, which use $\rho_{xy} = \sqrt{\rho_{xx}\rho_{yy}}$ under the linear model, shows that the appropriate adjustment is still provided by the procedure:

$$\frac{\rho_{xy}}{\rho_{xx}}\,\frac{\sigma_y}{\sigma_x} = \frac{\sqrt{\rho_{xx}\rho_{yy}}}{\rho_{xx}} \cdot \frac{\sigma_{T_y} / \sqrt{\rho_{yy}}}{\sigma_{T_x} / \sqrt{\rho_{xx}}} = \frac{\sqrt{\rho_{yy}}}{\sqrt{\rho_{xx}}} \cdot \frac{\sigma_{T_y}\sqrt{\rho_{xx}}}{\sigma_{T_x}\sqrt{\rho_{yy}}} = \frac{\sigma_{T_y}}{\sigma_{T_x}}.$$

The last expression equals the ratio of the latent standard deviations when the linear model of within group growth is true.
If the manifest fan spread model is assumed, then the effect estimated by the true score analysis of covariance strategy is appropriate only when the reliabilities of the pretest and posttest are equal.

The data example presented earlier from Table 1, where no errors of measurement were assumed and $\rho = 1$, demonstrated the appropriateness of the true score analysis of covariance technique. The example also demonstrated that the assumption of the homogeneity of regression slopes between comparison groups need not be violated in a fan spread situation. For both groups the relationship between the X and Y measures equaled one and the standard deviation for each variable was equal across the two groups. The regression slopes are therefore equal between comparison groups.

Under condition 2, however, the true relationship between the pretest and posttest does not equal unity even after correcting for errors of measurement. The data in Table 2 represent this type of situation. In this example the true relationship between the X and Y measures equals .90. The estimate of the adjusted group difference determined by the analysis of covariance model with estimated true scores of the covariate is as follows:

$$\alpha_{ACTS} = (28 - 34) - (.90)\left(\frac{4.74}{3.16}\right)(16 - 20) = (-6) - (1.35)(-4) = -.60.$$

Thus under condition 2 this strategy underadjusts for initial group differences. The data in Table 2 also demonstrate that the assumption of homogeneity of regression slopes does not have to be violated in situations conforming to the fan spread model.

Estimation With Adjusted Gain Scores

The third analytic strategy proposed in Chapter 2 to provide an appropriate adjustment for the differential growth rate problem was the adjusted gain score strategy.
This technique suggested that given the results of two test administrations prior to the introduction of the treatment, the average rate of growth within each group could be estimated and used to adjust the raw gain scores for differences in the growth rates. Estimating group differences with the adjusted gain score strategy was represented as

$$\alpha_{AGS} = (\mu_{yp} - \mu_{yc}) - (\mu_{xp} - \mu_{xc}) - \left[(\mu_{xp} - \mu_{zp}) - (\mu_{xc} - \mu_{zc})\right]\frac{t_2 - t_1}{t_1 - t_0}.$$

The terms are as previously defined. In situations where the period of time between the first and second pretests, $(t_1 - t_0)$, equals the period of intervention, $(t_2 - t_1)$, the above estimate can be simplified as the following:

$$\alpha_{AGS} = (\mu_{yp} - \mu_{yc}) - 2(\mu_{xp} - \mu_{xc}) + (\mu_{zp} - \mu_{zc}).$$

The utility of the effect estimated is demonstrated for the fan spread data by showing that $\alpha_{AGS} = 0$. Assume that the difference between the two group means on the X variable equals some constant $a$, $\mu_{xp} - \mu_{xc} = a$, and that the difference between the group means on the Y measure equals $a + b$, where $b$ is any constant, $\mu_{yp} - \mu_{yc} = a + b$. Then for fan spread data and equally distant time points the difference between the group means on the Z measure would equal $a - b$; $\mu_{zp} - \mu_{zc} = a - b$. The effect estimated by the adjusted gain score strategy can be written as the following:

$$\alpha_{AGS} = (\mu_{yp} - \mu_{yc}) - 2(\mu_{xp} - \mu_{xc}) + (\mu_{zp} - \mu_{zc}) = (a + b) - 2a + (a - b) = 0.$$

The adjusted gain score procedure does not require equal time periods between the administration of the tests. The group's growth rate can be adjusted for differences in time periods by the ratio of the time under investigation to the time between the first and second pretests. Since the adjusted gain scores are only a function of means and thus unaffected by errors of measurement, the procedure is appropriate for both the manifest and latent fan spread models. Finally, the adjusted gain score strategy is not influenced by the model of within-group growth; it is therefore appropriate for both condition 1, $\rho = 1$, and condition 2, $\rho \neq 1$, of the fan spread hypothesis.
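The adjusted gain score estimate depends only on group means and testing times, so it can be sketched in a few lines. The function and argument names below are hypothetical. The first two calls use the group means from Tables 1 and 2 with equally spaced testing times, and the third uses made-up means with an intervention period twice as long as the pretest interval.

```python
def adjusted_gain_effect(z_p, x_p, y_p, z_c, x_c, y_c, t0=0.0, t1=1.0, t2=2.0):
    """Adjusted gain score estimate from group means.

    z, x, y are the means at times t0, t1, t2; the suffix _p marks the
    program group and _c the control group.
    """
    time_ratio = (t2 - t1) / (t1 - t0)
    raw_gain_difference = (y_p - y_c) - (x_p - x_c)
    pretest_growth_difference = (x_p - z_p) - (x_c - z_c)
    return raw_gain_difference - pretest_growth_difference * time_ratio


# Group means from Table 1 (Group 1 as program group, Group 2 as control).
effect_table_1 = adjusted_gain_effect(3.0, 5.0, 7.0, 6.0, 10.0, 14.0)

# Group means from Table 2 (Group 3 as program group, Group 4 as control).
effect_table_2 = adjusted_gain_effect(4.0, 16.0, 28.0, 6.0, 20.0, 34.0)

# Hypothetical means with linear group growth and unequal time periods.
effect_unequal = adjusted_gain_effect(
    3.0, 5.0, 9.0, 6.0, 10.0, 18.0, t0=0.0, t1=1.0, t2=3.0
)

print(effect_table_1, effect_table_2, effect_unequal)
```

All three estimates are zero, including the unequal-interval case, where the pretest growth difference is scaled by the time ratio before being removed from the raw gain difference.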
The estimate of the group difference provided by the adjusted gain score approach using the data for Table 1 is as follows:

$$\alpha_{AGS} = (\mu_{yp} - \mu_{yc}) - (\mu_{xp} - \mu_{xc}) - \left[(\mu_{xp} - \mu_{zp}) - (\mu_{xc} - \mu_{zc})\right]$$
$$= (7 - 14) - (5 - 10) - \left[(5 - 3) - (10 - 6)\right] = -7 + 5 - [2 - 4] = 0.$$

For the data found in Table 2 the estimate of the group difference is as follows:

$$\alpha_{AGS} = (\mu_{yp} - \mu_{yc}) - (\mu_{xp} - \mu_{xc}) - \left[(\mu_{xp} - \mu_{zp}) - (\mu_{xc} - \mu_{zc})\right]$$
$$= (28 - 34) - (16 - 20) - \left[(16 - 4) - (20 - 6)\right] = -6 + 4 - [12 - 14] = 0.$$

Both of these examples assume that the period of time between $t_0$ and $t_1$ is equal to the period from $t_1$ to $t_2$.

Estimation With Multiple Fallible Covariates

The final analytic strategy suggested to adjust for the problem of differential growth rates between comparison groups was the analysis of covariance model with multiple covariates. The multiple covariates are the double pretest data collected prior to the period of intervention. In the review of the literature it was pointed out that because of measurement errors in the covariates, the analysis of covariance model has been judged as inappropriate in quasi-experimental studies. It was also pointed out, however, that Keesling and Wiley (1976) have suggested an approach which adjusts for the fallible covariates. As a result their technique was suggested as a possible solution to the fan spread problem. Following the Keesling-Wiley procedure, an estimate of the group differences is stated as the following:

$$\alpha_{MAC} = \mu_{yp} - \mu_{yc} - \frac{\rho_{T_xT_y} - \rho_{T_xT_z}\rho_{T_zT_y}}{1 - \rho_{T_xT_z}^2}\,\frac{\sigma_{T_y}}{\sigma_{T_x}}\,(\mu_{xp} - \mu_{xc}) - \frac{\rho_{T_zT_y} - \rho_{T_xT_z}\rho_{T_xT_y}}{1 - \rho_{T_xT_z}^2}\,\frac{\sigma_{T_y}}{\sigma_{T_z}}\,(\mu_{zp} - \mu_{zc}).$$

On the surface this estimate differs considerably from the one suggested by the definition of the fan spread model. The nature of the coefficients must therefore be examined. For condition 1 the relationship between the test performances was said to be perfect, $\rho = 1$. Under this assumption, then, the following is true: $\rho_{T_xT_y} = \rho_{T_xT_z} = \rho_{T_zT_y} = 1$. In the above adjustment coefficients, there appears the following factor in the denominator: $1 - \rho_{T_xT_z}^2$.
If, as condition 1 suggests, the true relationship between test performances is perfect, then the denominators in these coefficients are zero, and the coefficients are undefined. The Keesling-Wiley approach is therefore inappropriate for condition 1 when the covariates are repeated administrations of the same or parallel forms of the posttest measure.

Condition 2, on the other hand, suggests that the relationship between test scores across time is not perfect, $\rho \neq 1$. Under this condition the adjustment coefficients suggested by the analysis of covariance strategy are defined. It is necessary then to determine whether or not the coefficients provide the appropriate adjustment. For the Keesling-Wiley procedure to be appropriate under condition 2 of the fan spread model, the following equality must be true:

$$\frac{\rho_{xy} - \rho_{xz}\rho_{zy}}{1 - \rho^2_{xz}}\,\frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc}) + \frac{\rho_{zy} - \rho_{xz}\rho_{xy}}{1 - \rho^2_{xz}}\,\frac{\sigma_y}{\sigma_z}\,(\mu_{zp} - \mu_{zc}) = \frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc}),$$

where the correlation coefficients ($\rho$) and the standard deviations ($\sigma$) are in terms of the latent true variables. The fan spread model defines the difference between the group means on the Z variable as the following:

$$\mu_{zp} - \mu_{zc} = \frac{\sigma_z}{\sigma_x}\,(\mu_{xp} - \mu_{xc}).$$

The Keesling-Wiley adjustment can then be written as

$$\frac{\rho_{xy} - \rho_{xz}\rho_{zy}}{1 - \rho^2_{xz}}\,\frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc}) + \frac{\rho_{zy} - \rho_{xz}\rho_{xy}}{1 - \rho^2_{xz}}\,\frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc}) = \frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc})\left[\frac{\rho_{xy} - \rho_{xz}\rho_{zy}}{1 - \rho^2_{xz}} + \frac{\rho_{zy} - \rho_{xz}\rho_{xy}}{1 - \rho^2_{xz}}\right].$$

For the Keesling-Wiley procedure to be appropriate, then, the factor in brackets must equal unity:

$$\frac{\rho_{xy} - \rho_{xz}\rho_{zy}}{1 - \rho^2_{xz}} + \frac{\rho_{zy} - \rho_{xz}\rho_{xy}}{1 - \rho^2_{xz}} = 1.$$

This may be rewritten as the following:

$$\rho_{xy} + \rho_{zy} - \rho_{xz}(\rho_{zy} + \rho_{xy}) = 1 - \rho^2_{xz},$$

or

$$(\rho_{xy} + \rho_{zy})(1 - \rho_{xz}) = 1 - \rho^2_{xz}.$$

Further simplification of the equation is provided by noting that:
$$1 - \rho^2_{xz} = (1 - \rho_{xz})(1 + \rho_{xz}).$$

Thus for the Keesling-Wiley procedure to appropriately adjust for the fan spread hypothesis under the second model of within group growth,

$$\rho_{xy} + \rho_{zy} = 1 + \rho_{xz}.$$

This can only be true if both $\rho_{xy}$ and $\rho_{zy}$ are greater than $\rho_{xz}$. Since $\rho_{zy}$ involves variables measured at two points farther apart in time than $\rho_{xz}$, the above is highly unlikely. Given the more reasonable assumption that the correlations between any two equally distant points in time are equal, the statement cannot be true. Thus the Keesling-Wiley procedure is inappropriate as a solution to the fan spread model.

The data presented in Table 2 demonstrate the inappropriateness of the Keesling-Wiley procedure for condition 2 of the fan spread model. For these data the true relationships between the measures are as follows: $\rho_{zx} = .85$, $\rho_{zy} = .70$, and $\rho_{xy} = .90$. With these values, the estimate of the group difference is as in the following:

$$\alpha_{MAC} = (\mu_{yp} - \mu_{yc}) - \frac{\rho_{xy} - \rho_{xz}\rho_{zy}}{1 - \rho^2_{xz}}\,\frac{\sigma_y}{\sigma_x}\,(\mu_{xp} - \mu_{xc}) - \frac{\rho_{zy} - \rho_{xz}\rho_{xy}}{1 - \rho^2_{xz}}\,\frac{\sigma_y}{\sigma_z}\,(\mu_{zp} - \mu_{zc})$$
$$= (28 - 34) - \frac{.90 - (.85)(.70)}{1 - (.85)^2}\,\frac{\sigma_y}{\sigma_x}\,(16 - 20) - \frac{.70 - (.85)(.90)}{1 - (.85)^2}\,\frac{4.74}{1.56}\,(4 - 6)$$
$$= -6 - (1.099)(1.5)(-4) - (-.2342)(3.038)(-2)$$
$$= -6 + 6.595 - 1.4229 = -.8279.$$

The analysis of covariance with the two pretests as covariates does not therefore estimate the desired effect under the fan spread model for either condition 1, $\rho = 1$, or condition 2, $\rho \neq 1$.

The examination of the estimates of group differences by the four analytic strategies has shown that for condition 1, $\rho = 1$, gains in standard scores, analysis of covariance with estimated true scores of the covariate, and adjusted gain scores all estimate the effect of interest. Only the Keesling-Wiley analysis with the double pretests as covariates was shown to estimate the wrong effect. For condition 2, $\rho \neq 1$, only gains in standard scores and adjusted gain scores were shown to estimate the desired effect.
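The Table 2 calculation for the two-covariate analysis can be retraced with a short Python sketch. The function and argument names are hypothetical; the standard deviation ratios are passed in directly (1.5 and 4.74/1.56, the values appearing in the text), and the correlations are the latent values quoted for Table 2.

```python
def keesling_wiley_effect(dy, dx, dz, r_xy, r_xz, r_zy, sd_ratio_yx, sd_ratio_yz):
    """Group difference after adjusting the posttest for two pretests.

    dy, dx, dz are program-minus-control mean differences on the posttest (Y),
    second pretest (X) and first pretest (Z); sd_ratio_yx = sigma_y / sigma_x
    and sd_ratio_yz = sigma_y / sigma_z.
    """
    denom = 1.0 - r_xz ** 2  # zero when r_xz = 1, i.e. undefined under condition 1
    b_x = (r_xy - r_xz * r_zy) / denom * sd_ratio_yx
    b_z = (r_zy - r_xz * r_xy) / denom * sd_ratio_yz
    return dy - b_x * dx - b_z * dz


estimate = keesling_wiley_effect(
    dy=28 - 34, dx=16 - 20, dz=4 - 6,
    r_xy=0.90, r_xz=0.85, r_zy=0.70,
    sd_ratio_yx=1.5, sd_ratio_yz=4.74 / 1.56,
)
# The adjustment is correct only if r_xy + r_zy equals 1 + r_xz
unity_gap = (0.90 + 0.70) - (1 + 0.85)
print(round(estimate, 3), round(unity_gap, 2))
```

Carrying full precision gives an estimate of about -.83, which agrees with the -.8279 reported in the text up to rounding of the intermediate values. The nonzero gap in the last condition shows why the estimate is not zero.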
Thus, researchers have a choice in selecting an analytic strategy in both condition 1 and condition 2 of the fan spread model. Given these findings, the selection of one technique over another might be based on precision.

Precision

The precision of an analytic strategy is defined in terms of the standard error of a simple contrast, which in turn is determined by the variability of the adjusted variable. Therefore, a comparison of the standard errors associated with each analytic strategy provides a way by which the precision of each technique can be assessed.

It was pointed out in Chapter 2 that each analysis strategy being considered could be conceptualized in terms of an adjusted dependent measure. In comparing a program group with a control group, the contrast of interest is the difference between the means of the two groups on the adjusted variable. The standard error is therefore the square root of the variance of this contrast, $\sqrt{\mathrm{Var}(\bar{W}_p - \bar{W}_c)}$. The variance of the contrast is defined as

$$\mathrm{Var}(\bar{W}_p - \bar{W}_c) = \mathrm{Var}(\bar{W}_p) + \mathrm{Var}(\bar{W}_c) - 2\,\mathrm{Cov}(\bar{W}_p, \bar{W}_c). \qquad (8)$$

To determine the standard error of the contrast, both the variance and the covariance of the adjusted means are needed. All of the analytic strategies considered take this general form and differ only in the approach used to define the adjusted variable, $W$.

Standard Error For an Index of Response

Since the gains in standard scores and the analysis of covariance model with estimated true scores as the covariate use only a single pretest and the posttest, the two approaches are similar in how the adjusted variable is created. They both take the general form of an index of response, $W = Y - KX$, where $X$ and $Y$ are the pretest and posttest scores, respectively, and $K$ is the adjustment coefficient. The contrast of interest, however, is stated in terms of the group means on the adjusted variable, which can be written as $\bar{W} = \bar{Y} - K\bar{X}$.
The variance of the adjusted mean is the variance of the linear combination of the posttest mean minus the product of the adjustment coefficient and the pretest mean,

$$\mathrm{Var}(\bar{W}) = \mathrm{Var}(\bar{Y} - K\bar{X}) = \mathrm{Var}(\bar{Y}) + \mathrm{Var}(K\bar{X}) - 2\,\mathrm{Cov}(\bar{Y}, K\bar{X}). \qquad (9)$$

If $K$ is known to the researcher independently of his data (that is, the population parameter $K$ is a known constant), the variance of the mean of the adjusted variable could be further simplified to:

$$\mathrm{Var}(\bar{W}) = \mathrm{Var}(\bar{Y}) + K^2\,\mathrm{Var}(\bar{X}) - 2K\,\mathrm{Cov}(\bar{Y}, \bar{X}).$$

The strategies suggested by Kenny (gains in standard scores) and Porter (analysis of covariance with estimated true score of the covariate) do not assume knowledge of the correct adjustment coefficient; rather, they require that the coefficient be estimated from the data. As a result of fluctuation from sample to sample, the adjustment coefficient must be considered a random variable. In determining the variance of the adjusted mean, the adjustment coefficient appears only in combination with the pretest mean. Thus, to determine the variability of the adjusted variable, the variance of the product of two random variables and the covariance of a random variable and the product of two random variables are needed. Keesling and Wiley (1976) have derived statements for both of these terms assuming that $K$ is independent of both $\bar{X}$ and $\bar{Y}$. The variance of the product of two independent random variables can be written:

$$\mathrm{Var}(K\bar{X}) = E(K)^2\,\mathrm{Var}(\bar{X}) + E(\bar{X})^2\,\mathrm{Var}(K) + \mathrm{Var}(K)\,\mathrm{Var}(\bar{X}). \qquad (10)$$

The covariance of a random variable with the product of two independent random variables can be written:

$$\mathrm{Cov}(\bar{Y}, K\bar{X}) = E(K)\,\mathrm{Cov}(\bar{Y}, \bar{X}),$$

where again, $K$ is assumed to be independent of the pretest and posttest means. The proof of these two statements is provided in Appendices A and B. In summary, then, the variance of the adjusted variable is:

$$\mathrm{Var}(\bar{W}) = \mathrm{Var}(\bar{Y}) + E(K)^2\,\mathrm{Var}(\bar{X}) + E(\bar{X})^2\,\mathrm{Var}(K) + \mathrm{Var}(K)\,\mathrm{Var}(\bar{X}) - 2E(K)\,\mathrm{Cov}(\bar{Y}, \bar{X}).$$
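The product-variance statement in equation 10 can be checked exactly on a small example. The sketch below uses two hypothetical discrete distributions (chosen only for illustration, not taken from the study), enumerates the distribution of the product under independence, and compares its variance with the right-hand side of equation 10 using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def moments(dist):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    var = sum((v - mean) ** 2 * p for v, p in dist.items())
    return mean, var


# Hypothetical distributions for the coefficient K and the pretest mean
k_dist = {Fraction(1): Fraction(1, 2), Fraction(3, 2): Fraction(1, 2)}
x_dist = {Fraction(4): Fraction(1, 4), Fraction(5): Fraction(1, 2), Fraction(7): Fraction(1, 4)}

# Distribution of the product K * X when K and X are independent
prod_dist = {}
for (k, pk), (x, px) in product(k_dist.items(), x_dist.items()):
    prod_dist[k * x] = prod_dist.get(k * x, 0) + pk * px

ek, vk = moments(k_dist)
ex, vx = moments(x_dist)
_, direct = moments(prod_dist)                      # variance computed directly
formula = ek ** 2 * vx + ex ** 2 * vk + vk * vx     # right-hand side of equation 10
print(direct, formula)
```

The two printed values are identical, as the identity requires whenever the two variables are independent.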
Finally, to determine the variance of the contrast of interest (equation 8), the covariance of the two adjusted means is needed; i.e., $\mathrm{Cov}(\bar{W}_p, \bar{W}_c)$. Since only the adjustment coefficient is common to the group means on the adjusted variable, Keesling and Wiley state this covariance term as:

$$\mathrm{Cov}(\bar{W}_p, \bar{W}_c) = E(\bar{X}_p)\,E(\bar{X}_c)\,\mathrm{Var}(K). \qquad (11)$$

The proof of this statement is provided in Appendix C.

Substituting equations 9, 10, and 11 in equation 8, and assuming a balanced design with the program and control groups having equal variance on the pretest and posttest measures, the variance of the contrast between two group means on an adjusted variable is:

$$\mathrm{Var}(\bar{W}_p - \bar{W}_c) = 2\left[\mathrm{Var}(\bar{Y}) + E(K)^2\,\mathrm{Var}(\bar{X}) + \mathrm{Var}(K)\,\mathrm{Var}(\bar{X}) - 2E(K)\,\mathrm{Cov}(\bar{X}, \bar{Y})\right] + E(\bar{X}_p)^2\,\mathrm{Var}(K) + E(\bar{X}_c)^2\,\mathrm{Var}(K) - 2E(\bar{X}_p)\,E(\bar{X}_c)\,\mathrm{Var}(K).$$

This can be simplified to:

$$\mathrm{Var}(\bar{W}_p - \bar{W}_c) = 2\left[\mathrm{Var}(\bar{Y}) + \left(E(K)^2 + \mathrm{Var}(K)\right)\mathrm{Var}(\bar{X}) - 2E(K)\,\mathrm{Cov}(\bar{X}, \bar{Y})\right] + \left[E(\bar{X}_p) - E(\bar{X}_c)\right]^2\,\mathrm{Var}(K).$$

Furthermore, $E(\bar{X}_p) = \mu_{xp}$ and $E(\bar{X}_c) = \mu_{xc}$, so that

$$\mathrm{Var}(\bar{W}_p - \bar{W}_c) = 2\left[\mathrm{Var}(\bar{Y}) + \left(E(K)^2 + \mathrm{Var}(K)\right)\mathrm{Var}(\bar{X}) - 2E(K)\,\mathrm{Cov}(\bar{X}, \bar{Y})\right] + (\mu_{xp} - \mu_{xc})^2\,\mathrm{Var}(K). \qquad (12)$$

Equation 12 is a general statement of the variance of the contrast of interest.

The Standard Error For Kenny's Procedure

The strategy suggested by Kenny defined the adjustment coefficient as the ratio of the pooled standard deviations of the posttest to the pretest: $K = S_y / S_x$. To determine the expected value and the variance of this coefficient, the density function of the ratio of two correlated standard deviations is needed. Bose (1935) and Finney (1938) have derived this function from the bivariate normal density function. The density function of the ratio of two correlated standard deviations is written as the following:

$$dF = \frac{2(1 - \rho^2)^{\frac{n-1}{2}}}{B\left(\frac{n-1}{2}, \frac{n-1}{2}\right)}\;\frac{v^{\,n-2}}{(1 + v^2)^{n-1}}\left[1 - \frac{4\rho^2 v^2}{(1 + v^2)^2}\right]^{-\frac{n}{2}} dv,$$

where:
$\rho$ represents the relationship between the pretest and posttest for the population on the manifest variables;
$B$ is the beta function;
$v = \dfrac{S_y / \sigma_y}{S_x / \sigma_x}$; and
$n$ is the sample size.
The expected value of the ratio of the standard deviations is determined by integrating, over the interval from 0 to $\infty$, the product of the ratio and the density function, as in the following:

$$E(v) = \int_0^\infty v\, dF_v.$$

To determine the variance of the coefficient, the expected value of the squared ratio of the standard deviations is needed. This value is calculated by integrating from 0 to $\infty$ the product of the ratio of the variances and the density function:

$$E(v^2) = \int_0^\infty v^2\, dF_v.$$

Solving the above equation results in:

$$E\left(\frac{S_y^2}{S_x^2}\right) = \frac{n - 1 - 2\rho^2}{n - 3}.$$

Appendix D provides the calculation leading to the above solution. The solution to determining the expected value of the ratio of the standard deviations was more difficult and required the numerical integration of the following function:

$$E\left(\frac{S_y}{S_x}\right) = \frac{\Gamma(n - 1)}{\Gamma^2\left(\frac{n-1}{2}\right)}\int_0^1 y\,(1 - \rho^2)^{\frac{n-1}{2}}\left[1 - 4\rho^2\, y(1 - y)\right]^{-\frac{n}{2}}\left(y(1 - y)\right)^{\frac{n-4}{2}} dy.$$

The derivation of this solution is provided in Appendix E. To integrate this function, a "canned" computer program called DCADRE (de Boor, 1971) was used in conjunction with the CDC 6500 computer at Michigan State University. The solutions of the equation for values of $n$ ranging from 20 to 120 in multiples of 10 and values of $\rho$ for the manifest variables ranging from .1 to .9 in increments of .1 are presented in Table 3.

Earlier in this chapter it was stated that the ratio of sample standard deviations was a biased estimator of the ratio of the population standard deviations. The results found in Table 3 confirm that statement. If the ratio of the sample standard deviations were an unbiased estimator of the ratio of the population standard deviations, then all entries in Table 3 would have been one. For a sample of size 30 and a correlation of .1 between the measures, the expected value of the ratio of the standard deviations equals 1.0178. As the sample size becomes large and the correlation between measures high, the magnitude of the bias is reduced. For example, with a sample of 50 and a correlation of .8, the expected value of the ratio of the sample standard deviations equals 1.0038.
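The numerical integration described above can be approximated with a short, self-contained Python sketch. It is not the DCADRE routine used in the study: as an assumption for illustration it substitutes a composite Simpson's rule on the unit interval, takes the population standard deviations as equal, and uses the closed form for the second moment given in the text. The function name and the number of integration steps are hypothetical.

```python
import math


def ratio_sd_moments(n, rho, steps=4000):
    """Return (E[v], Var[v]) for v = S_y / S_x when the population SDs are equal.

    E[v] comes from Simpson integration over y in (0, 1);
    E[v^2] uses the closed form (n - 1 - 2 rho^2) / (n - 3).
    """
    m = n - 1
    # log of (1 - rho^2)^(m/2) / B(m/2, m/2), with B written through gamma functions
    log_c = 0.5 * m * math.log(1.0 - rho ** 2) + math.lgamma(m) - 2.0 * math.lgamma(m / 2.0)

    def integrand(y):
        if y <= 0.0 or y >= 1.0:
            return 0.0  # the integrand vanishes at both end points for n > 4
        log_f = (log_c
                 + 0.5 * (n - 2) * math.log(y)
                 + 0.5 * (n - 4) * math.log(1.0 - y)
                 - 0.5 * n * math.log(1.0 - 4.0 * rho ** 2 * y * (1.0 - y)))
        return math.exp(log_f)

    h = 1.0 / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    mean = total * h / 3.0
    second_moment = (n - 1 - 2.0 * rho ** 2) / (n - 3)
    return mean, second_moment - mean ** 2


for n, rho in [(30, 0.1), (50, 0.8)]:
    mean, var = ratio_sd_moments(n, rho)
    print(n, rho, round(mean, 4), round(var, 5))
```

Under these assumptions the two cases should land close to the values quoted in the text for the same sample sizes and correlations (expected values of 1.0178 and 1.0038, variances of .03733 and .00769), with small differences attributable to rounding and to the integration method.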
Thus the procedure suggested by Kenny provides the appropriate estimate of the group differences in a fan spread situation only when the sample is large and the correlation between measures is high.

The variance of the adjustment coefficient suggested by Kenny is determined by:

$$\mathrm{Var}\left(\frac{S_y}{S_x}\right) = E\left(\frac{S_y^2}{S_x^2}\right) - \left[E\left(\frac{S_y}{S_x}\right)\right]^2.$$

These results are presented in Table 4, assuming the population variances of the two measures are equal. For a sample of 30 and a correlation of .1, the variance of the ratio of the standard deviations equals .03733. As the sample size and the correlation between the measures increase, the variance of the ratio decreases. For a sample of 50 and a correlation of .8 the variance of the ratio of the standard deviations equals .00769.

[Table 3. The Expected Value of the Ratio of Two Non-Independent Sample Standard Deviations When the Population Standard Deviations Are Equal]

[Table 4. The Variance of the Ratio of Two Non-Independent Sample Standard Deviations When the Population Standard Deviations Are Equal]

The Standard Error For Porter's Procedure

The analytic strategy suggested by Porter, using the analysis of covariance model with estimated true score of the covariate, defines the adjustment coefficient as the ratio of the linear regression slope of posttest on pretest to the reliability of the covariate: $K = b_{y \cdot x} / \rho_{xx}$.

One way of calculating the reliability coefficient is to use the sample data and estimate a measure of internal consistency for the pretest. It was suggested earlier, however, that a test-retest reliability was a more desirable coefficient. If one of the two administrations is also used as the covariate, there would be a lack of independence between the regression slope and the reliability coefficient, thus complicating the estimation of the expected value and the variance of the adjustment. A second approach to estimating the reliability coefficient would be to select an independent sample of subjects and administer the instrument twice. This estimate of the test-retest reliability could be obtained in the pilot testing of the instrumentation. In both of these methods the resulting estimate of the reliability coefficient is a random variable whose fluctuation from sample to sample must be taken into consideration in estimating the variance of the adjustment coefficient. A third way of determining the reliability coefficient is to use the published test-retest reliability coefficient if one is available.
The researcher, however, must be careful that the reported reliability is appropriate for the population he is studying. This latter approach was taken in the present study because it seemed likely that a standardized instrument would be used in an evaluation study and because it simplified calculations, since the coefficient would be constant across replications.

Given that the reliability estimate is known, the expected value of the adjustment coefficient is the product of the reciprocal of the reliability coefficient and the expected value of the regression slope:

$$E\left(\frac{b_{y \cdot x}}{\rho_{xx}}\right) = \frac{1}{\rho_{xx}}\,E(b_{y \cdot x}),$$

and the variance is:

$$\mathrm{Var}\left(\frac{b_{y \cdot x}}{\rho_{xx}}\right) = \frac{1}{\rho_{xx}^2}\,\mathrm{Var}(b_{y \cdot x}).$$

Thus the problem of determining the expected value and variance of the adjustment coefficient is simplified to determining the expected value and variance of the regression slope. To obtain these values, the density function of a regression coefficient is needed. This function is well known and appears in several forms. Kendall and Stewart (1952) present the density function of the regression coefficient as:

$$dF = \frac{\Gamma\left(\frac{n}{2}\right)}{\sqrt{\pi}\;\Gamma\left(\frac{n-1}{2}\right)}\;\frac{\sigma_y^{\,n-1}(1 - \rho^2)^{\frac{n-1}{2}}}{\sigma_x^{\,n-1}}\left[\frac{\sigma_y^2}{\sigma_x^2}(1 - \rho^2) + \left(b - \rho\,\frac{\sigma_y}{\sigma_x}\right)^2\right]^{-\frac{n}{2}} db,$$

where $\rho$, $\sigma_y$, $\sigma_x$, and $n$ are as defined previously, and $\Gamma$ represents the gamma function. The expected value of the regression slope is determined by integrating over the range, -

[Table 5. The Variance of a Sample Regression Coefficient When the Population Standard Deviations ...]

The variance of the adjustment coefficient suggested by Porter can then be calculated by multiplying the variance of the regression slope by the reciprocal of the reliability coefficient squared. The reliability coefficient for each entry in Table 5 equaled the respective correlation between the measures. These results are presented in Table 6. For a sample of size 30 and a correlation between the measures equaling .1 with a reliability coefficient of .1, the variance of the adjustment coefficient equals 3.6667. As the sample size increases and the correlation and reliability increase, the variance of the adjustment coefficient decreases.
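The step from the variance of the regression slope to the variance of Porter's adjustment coefficient can be sketched as follows. The sketch rests on an assumption not spelled out in the surviving text: under bivariate normal sampling the slope has variance $(\sigma_y^2/\sigma_x^2)(1 - \rho^2)/(n - 3)$, a standard result that happens to reproduce the two Table 6 entries quoted in the text. The function name and arguments are hypothetical.

```python
def porter_coefficient_variance(n, rho, reliability, sd_ratio=1.0):
    """Variance of K = b_yx / rho_xx when rho_xx is treated as a known constant.

    Assumes bivariate normal sampling, for which
    Var(b_yx) = (sigma_y / sigma_x)^2 * (1 - rho^2) / (n - 3).
    sd_ratio is sigma_y / sigma_x (1.0 when the population SDs are equal).
    """
    var_slope = sd_ratio ** 2 * (1.0 - rho ** 2) / (n - 3)
    # Dividing the slope by a constant divides its variance by the constant squared
    return var_slope / reliability ** 2


print(round(porter_coefficient_variance(30, 0.1, 0.1), 4))
print(round(porter_coefficient_variance(50, 0.8, 0.8), 5))
```

With the reliability set equal to the correlation, as in the text, the two calls correspond to the cases of a sample of 30 with a correlation of .1 and a sample of 50 with a correlation of .8.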
For a sample of 50 and a correlation of .8 with a reliability coefficient of .8, the variance of the adjustment coefficient equals .01197.

[Table 6. The Variance of the Adjustment Coefficient Suggested in the True Score Analysis of Covariance Procedure]

The Standard Error For Adjusted Gain Scores

The derivation of the standard error for the adjusted gain score strategy involves a slightly different approach. An individual's adjusted score is determined by the following formula:

$$W = Y - X - (\bar{X} - \bar{Z}),$$

or

$$W = Y - X - \bar{X} + \bar{Z},$$

where $Z$, $X$, and $Y$ are the first pretest, the second pretest (administered just prior to the introduction of the treatment), and the posttest, respectively. The variance of this variable can be written as the following:

$$\mathrm{Var}(W) = \mathrm{Var}(Y) + \mathrm{Var}(X) + \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Z}) - 2\,\mathrm{Cov}(Y, X) - 2\,\mathrm{Cov}(Y, \bar{X}) + 2\,\mathrm{Cov}(Y, \bar{Z}) + 2\,\mathrm{Cov}(X, \bar{X}) - 2\,\mathrm{Cov}(X, \bar{Z}) - 2\,\mathrm{Cov}(\bar{X}, \bar{Z}).$$

The above can be simplified to:

$$\mathrm{Var}(W) = \mathrm{Var}(Y) + \left(1 + \frac{3}{n}\right)\mathrm{Var}(X) - \left(2 + \frac{2}{n}\right)\mathrm{Cov}(X, Y) + \frac{1}{n}\left[\mathrm{Var}(Z) + 2\,\mathrm{Cov}(Y, Z) - 4\,\mathrm{Cov}(X, Z)\right]$$

(see Appendix H). To facilitate comparisons with the two previous standard errors, the variance of the contrast of adjusted group means can be written as the following:

$$\mathrm{Var}(\bar{W}_p - \bar{W}_c) = \frac{2}{n}\left\{\mathrm{Var}(Y) + \left(1 + \frac{3}{n}\right)\mathrm{Var}(X) - \left(2 + \frac{2}{n}\right)\mathrm{Cov}(X, Y) + \frac{1}{n}\left[\mathrm{Var}(Z) + 2\,\mathrm{Cov}(Y, Z) - 4\,\mathrm{Cov}(X, Z)\right]\right\}$$
$$= \frac{2}{n}\left[\mathrm{Var}(Y) + \left(1 + \frac{3}{n}\right)\mathrm{Var}(X) - \left(2 + \frac{2}{n}\right)\mathrm{Cov}(X, Y)\right] + \frac{2}{n^2}\left[\mathrm{Var}(Z) + 2\,\mathrm{Cov}(Y, Z) - 4\,\mathrm{Cov}(X, Z)\right].$$

The Standard Error For the Keesling-Wiley Procedure

Since the analysis of covariance with multiple covariates was shown to be inappropriate in fan spread situations for either condition 1 or 2, the standard error associated with that technique is not considered.

To facilitate comparisons of statistical precision, Table 7 presents in summary the standard errors associated with the competing analytic strategies. The formulas presented in Table 7 indicate that the standard errors of the analytic strategies being considered are determined by a combination of four components. The first three components, involving the variance of the posttest, the variance of the pretest, and the covariance of the pretest and posttest, are included in all three standard errors. Standardized gains and analysis of covariance with estimated true scores consider the squared difference between the population means in the fourth component, while the fourth component of the adjusted gain score strategy involves the variance of the first pretest and the covariance of the first pretest with the second pretest and with the posttest. Since the three standard errors are determined by basically the same components, the differences in the precision of the analytic strategies can be explained by differences in the coefficients of the four components.

The first component of each standard error is identical for all three analytic strategies, with 1 as the coefficient of the posttest variance term. The coefficients of the last three components, however, differ considerably.
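Because the adjusted gain score coefficients depend on the data only through the sample size, they are easy to tabulate. The sketch below is an illustration with hypothetical function names; it assumes the coefficients $1 + 3/n$, $2 + 2/n$, and $2/n^2$ that follow from the adjusted gain score variance, and these agree with the values 1.050, 2.033, and .00056 that the text reports for a sample of 60. The variances and covariances passed to the standard error function are made-up inputs.

```python
def adjusted_gain_coefficients(n):
    """Coefficients of the second, third and fourth components for a sample of size n."""
    second = 1 + 3 / n      # multiplies Var(X)
    third = 2 + 2 / n       # multiplies Cov(X, Y)
    fourth = 2 / n ** 2     # multiplies Var(Z) + 2 Cov(Y, Z) - 4 Cov(X, Z)
    return second, third, fourth


def adjusted_gain_standard_error(n, var_y, var_x, var_z, cov_xy, cov_yz, cov_xz):
    """Standard error of the contrast between two adjusted gain score means."""
    second, third, fourth = adjusted_gain_coefficients(n)
    core = var_y + second * var_x - third * cov_xy
    tail = var_z + 2 * cov_yz - 4 * cov_xz
    return (2 / n * core + fourth * tail) ** 0.5


for n in (20, 40, 60, 80, 100):
    print(n, [round(c, 5) for c in adjusted_gain_coefficients(n)])

# Hypothetical unit variances with all covariances equal to .8
print(round(adjusted_gain_standard_error(60, 1.0, 1.0, 1.0, 0.8, 0.8, 0.8), 4))
```

Since none of the three coefficients involves the correlation between the measures, the same values apply whatever the pretest-posttest relationship.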
Both gains in standard scores and estimated true score analysis of covariance strategies determine these coefficients using the expected value and variance of their respective adjustment coefficients. These expected values and variance estimates were provided in Tables 3, 4, and 6. The adjusted gain score approach, on the other hand, determines the coefficients of the last three components solely on the sample size.

[Table 7. The Standard Errors Associated With Gains in Standard Scores, True Score Analysis of Covariance, and Adjusted Gain Scores]

Tables 8 through 13 present the coefficients associated with the second, third, and fourth components of the standard errors for the three analytic strategies being considered. For example, the standard error for each competing analytic strategy when the sample size equals 60 and the correlation between the measures equals .8 (see Table 9) can be computed as the following:

for gains in standard scores:

$$\sqrt{\frac{2}{60}\left[\mathrm{Var}(Y) + (1.0194)\,\mathrm{Var}(X) - 2.0096\,\mathrm{Cov}(X, Y)\right] + .00534\,(\mu_{xp} - \mu_{xc})^2}$$

for true score analysis of covariance:

$$\sqrt{\frac{2}{60}\left[\mathrm{Var}(Y) + (1.00968)\,\mathrm{Var}(X) - 2.0\,\mathrm{Cov}(X, Y)\right] + .00968\,(\mu_{xp} - \mu_{xc})^2}$$

for adjusted gain scores:

$$\sqrt{\frac{2}{60}\left[\mathrm{Var}(Y) + (1.050)\,\mathrm{Var}(X) - 2.033\,\mathrm{Cov}(X, Y)\right] + .00056\left[\mathrm{Var}(Z) + 2\,\mathrm{Cov}(Y, Z) - 4\,\mathrm{Cov}(X, Z)\right]}$$

Tables 8 through 13 were compiled using the previously derived expected values and variances of the adjustment coefficients when the variances of the pretest and posttest are equal. The fan spread model, however, suggests increasing variability from pretest to posttest. This assumption has the effect of increasing the expected value and variance of the adjustment coefficients (see Tables 3, 4, and 6) by a factor of the population posttest to pretest variance ratio, $\sigma_y^2 / \sigma_x^2$. In general, the greater the difference between the two variances, the larger the standard error. This results from the fact that the second component is increased by the ratio of the variances while the third component is increased by the ratio of the standard deviations. However, the magnitude of the increased standard error is the same for the gains in standard score approach as that for the estimated true score analysis of covariance technique. Thus, in comparing the precision associated with these two strategies, the coefficients found in Tables 8 through 13 provide a reasonable basis on which judgments can be made. The adjusted gain score approach is not affected by the fan spread assumption. Thus, the standard errors presented in Tables 8 through 13 remain the same regardless of the difference between the variance of the pretest and posttest. The implications of this result when comparing the standard error associated with the adjusted gain score approach with either gains in standard scores or the estimated true score analysis of covariance techniques are discussed in the following chapter.

[Table 8. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .9 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal]

[Table 9. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .8 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal]

[Table 10. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .7 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal]

[Table 11. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .6 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal]

[Table 12. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .5 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal]

[Table 13. Coefficients for the Second, Third, and Fourth Components of the Standard Error Associated With the Three Competing Analytic Strategies When ρ = .4 for the Manifest Variables and the Population Variances of the X and Y Variables Are Equal]

CHAPTER 5

DISCUSSION

The previous chapter has considered the fan spread model from two perspectives. One approach, the traditional perspective, viewed within group growth as linear with a common starting point and a constant rate of growth for an individual but varying rates of growth among individuals. This view of growth suggested a perfect relationship, $\rho = 1$, between test performances except for measurement errors.
The second perspective of the fan spread model viewed within group growth as non-linear for an individual, but linear for the average of the group. This approach suggested a less than perfect relationship, $\rho \neq 1$, between test performances even with perfectly reliable measures.

In considering the traditional conceptualization of the fan spread model, gains in standard scores, estimated true score analysis of covariance, and adjusted gain score strategies were all shown to be appropriate techniques. Choosing one of these strategies in preference to the others must therefore be based on some criterion other than the effect estimated. A second criterion was suggested involving the precision provided by each procedure in testing the common hypothesis. To evaluate the precision, the standard error associated with each approach was derived. Examining these standard errors showed that they were determined by the same basic components. The three competing strategies have the same first three components consisting of the posttest variance, the pretest variance and the pretest-posttest covariance. They differ, however, in the coefficients associated with the second and third components. The coefficients for the gains in standard scores and true score analysis of covariance are determined by the expected value and variance of their respective adjustment coefficient. The second and third coefficients for the adjusted gain score procedure, on the other hand, are determined solely on the basis of sample size.

There are considerable differences among the three strategies in the fourth component of the standard errors. The fourth component for the gains in standard scores and true score analysis of covariance is determined by the square of the population mean difference between the comparison groups on the pretest measure. They differ only in the coefficient for this component, which is determined by the variance of their respective adjustment coefficients.
The fourth component for the adjusted gain score procedure, on the other hand, is determined by the variance of the first pretest, the covariance of the first pretest and second pretest, and the covariance of the first pretest and the posttest. The coefficient for this component is determined by the sample size.

Although there were differences in the fourth component, it was suggested that a comparison of the component coefficients could determine differences in the precision provided by the three analytic strategies. The coefficients of those components were presented in Tables 8 through 13. These coefficients were appropriate for the special case where the variance of the pretest measure and the variance of the posttest measure were equal. Conclusions based on these data, however, can be extended to situations where the variances are unequal.

The first three tables, Tables 8, 9, and 10, present the coefficients for the situations when the correlation between measures is high (ρ ≥ .7), which is likely for condition 1. Under this condition, there appear to be only minor differences between the coefficients defined by the three strategies for the three components. Furthermore, this result is consistent for both small and large samples. Special attention might be given to the coefficients for the fourth component. When the relationship between the measures is high, the coefficients associated with the fourth component appear to be very small. For practical purposes these coefficients could be judged to be essentially zero.

The second three tables, Tables 11, 12, and 13, present the coefficients for the situations when the correlation between measures is low (ρ < .7). Under this condition, typical for condition 2 of the fan spread model, the coefficients associated with the second and third components again appear similar for the three strategies.
As the sample size increases, the magnitude of the coefficients decreases, thus reducing the standard error and increasing the precision of the test. The coefficients associated with the fourth component, however, can no longer be judged as being equal across the three competing strategies. The coefficients for the fourth component of the adjusted gain score strategy are unaffected by the relationship between the measures and thus remain essentially zero. The coefficients for gains in standard scores and estimated true score analysis of covariance, on the other hand, are inversely related to the relationship between the measures. The effect of this relationship is greatest in small samples. Thus, in comparing the precision associated with the three strategies, the adjusted gain score procedure must be judged as providing the smallest standard error when the relationship between the measures is low. It should also be remembered that the adjusted gain score procedure is the only one of the three to provide unbiased estimates of effects for data with low correlations.

The above discussion was based on the coefficients presented in Tables 8 through 13. These values were determined for the situation when the variance of the pretest equals the variance of the posttest. The fan spread model suggests, however, that the variance increases with time. This assumption of the fan spread model has no effect on the standard error associated with the adjusted gain score strategy. The standard errors of the gains in standard scores and true score analysis of covariance are, however, affected by increasing variance. As variance increases, the expected value and variance of the adjustment coefficients increase, which in turn increases the coefficients of the three components previously discussed.
In comparing the three competing analytic strategies under increasing variance, greater precision is achieved through the adjusted gain score procedure than through either gains in standard scores or true score analysis of covariance. Furthermore, the difference in precision increases as the relationship between the pretest and posttest measures decreases, as the sample size decreases, and as the fan spread becomes greater.

The adjusted gain score procedure requires the availability of two pretests prior to the intervention. When these data are not obtainable, the decision must be made between gains in standard scores and estimated true score analysis of covariance. Concentrating on the coefficients for the gains in standard scores and estimated true score analysis of covariance, a review of Tables 8 through 13 indicates almost no differences when the relationship between the pretest and posttest measures is high. Sample size has very little effect on the magnitude of these coefficients in the first three tables. As the relationship between the measures decreases, the coefficients associated with the estimated true score analysis of covariance procedure become slightly larger than those for the gains in standard scores approach. This difference is more prominent in small than in large samples. The dissimilarity between the coefficients is especially salient in comparisons considering the fourth component. These results indicate that in situations where the relationship between the pretest and posttest measures is low, gains in standard scores are likely to have a smaller standard error than estimated true score analysis of covariance. The observations on which this conclusion was based were derived for the special case when the variances of the pretest and posttest measures were equal. In situations conforming to the fan spread model, the increase in variability affects both strategies equally.
Thus, the above conclusion applies equally to situations conforming to the general fan spread hypothesis.

It should be noted that the standard error given for the gains in standard scores procedure differs from that suggested by Kenny. Kenny has suggested a two-stage process: first, determine the adjusted variable; and second, use it as the dependent variable in the analysis of variance model. This procedure assumes that the adjustment coefficient determined in step 1 is a constant across replications. Based on this assumption the standard error takes the following form:

\[
\frac{2}{n}\left[\operatorname{Var}(Y) + \frac{S_y^2}{S_x^2}\operatorname{Var}(X) - 2\,\frac{S_y}{S_x}\operatorname{Cov}(X,Y)\right]
\]

Kenny's standard error, therefore, differs from the correct standard error presented in Table 7 by eliminating all factors involving the variability of the adjustment coefficient. This reduced form of the standard error produces spurious precision, which leads to a liberal test of the hypothesis under investigation. The degree to which Kenny's procedure is too liberal depends on the variability of the adjustment coefficient. Tables 8 through 13 provide some information on this question. When the relationship between the pretest measure and the posttest measure is high and the sample size is large, Table 6 indicates that the variability of the adjustment coefficient is essentially zero. As shown earlier, these are the only conditions under which the procedure estimates the correct effect. Thus, under those conditions the procedure suggested by Kenny is likely to be appropriate. As the sample becomes small and the relationship between measures weakens, the probability of error associated with Kenny's technique increases, as does the bias in estimating the effect. These observations were based on the situation when the variances of the measures are equal across time.
In situations conforming to the fan spread model, the variability associated with the adjustment coefficient increases and with it the inappropriateness of the gains in standard scores technique. Thus, unless the sample is large and the relationship between the measures is high, the use of gains in standard scores as proposed by Kenny should be avoided. On the other hand, the estimated true score analysis of covariance technique does take the variability of the adjustment coefficient into consideration in determining its standard error. This procedure is, therefore, appropriate when only a single pretest measure is available and the data conform to condition 1.

In summary, for the fan spread hypothesis under both condition 1, ρ = 1, and condition 2, ρ < 1, the above findings have indicated that the most desirable analytic strategy of those considered is that of adjusted gains. This approach tested the correct hypothesis under both models of the fan spread condition, and with precision equal to or greater than that of the competing analytic strategies. When only a single pretest was available, estimated true score analysis of covariance was shown to be a more desirable strategy than gains in standard scores for condition 1. This conclusion, however, was limited to the traditional conceptualization of the fan spread model. The finding was further qualified by suggesting that when (a) the variances of the measures are equal, (b) the relationship between pretest and posttest is high, and (c) the sample is large, the two procedures estimate the desired effect with equal precision. Finally, when only a single pretest is available and the second model of the fan spread hypothesis is appropriate, no strategy was appropriate. The multiple covariate analysis of covariance as suggested by Keesling and Wiley (1976) was rejected as an inappropriate technique for any condition of the fan spread hypothesis.
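The argument above about Kenny's standard error can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that are not part of the study: bivariate normal scores with unit variances, 20 cases per group, a pretest-posttest correlation of .3, a pretest gap of two standard deviations, and no treatment effect. The function names `within_cov` and `simulate_gss` are hypothetical. It compares the variance of the gains-in-standard-scores estimate across replications with the average variance obtained when the ratio of standard deviations is treated as a constant.

```python
import numpy as np


def within_cov(a, b):
    """Per-replication sample covariance of a and b (arrays of shape reps x n)."""
    a_dev = a - a.mean(axis=1, keepdims=True)
    b_dev = b - b.mean(axis=1, keepdims=True)
    return (a_dev * b_dev).sum(axis=1) / (a.shape[1] - 1)


def simulate_gss(n=20, rho=0.3, pretest_gap=2.0, reps=4000, seed=0):
    """Return (empirical variance of the estimate, mean constant-coefficient variance).

    All parameter values are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]  # unit variances, correlation rho
    # Arrays of shape (reps, n, 2); the last axis holds (pretest X, posttest Y).
    prog = rng.multivariate_normal([-pretest_gap] * 2, cov, size=(reps, n))
    ctrl = rng.multivariate_normal([0.0, 0.0], cov, size=(reps, n))
    xp, yp = prog[..., 0], prog[..., 1]
    xc, yc = ctrl[..., 0], ctrl[..., 1]

    # Pooled within-group variances and covariance (equal group sizes).
    sx2 = (within_cov(xp, xp) + within_cov(xc, xc)) / 2
    sy2 = (within_cov(yp, yp) + within_cov(yc, yc)) / 2
    sxy = (within_cov(xp, yp) + within_cov(xc, yc)) / 2
    ratio = np.sqrt(sy2 / sx2)  # estimated adjustment coefficient S_y / S_x

    # Gains-in-standard-scores estimate of the group difference.
    estimate = (yp.mean(axis=1) - yc.mean(axis=1)) - ratio * (
        xp.mean(axis=1) - xc.mean(axis=1)
    )
    # Variance of the contrast when the ratio is treated as a known constant.
    constant_coef_var = (2 / n) * (sy2 + ratio**2 * sx2 - 2 * ratio * sxy)
    return estimate.var(ddof=1), constant_coef_var.mean()


empirical_var, constant_var = simulate_gss()
print(f"variance across replications:  {empirical_var:.3f}")
print(f"constant-coefficient variance: {constant_var:.3f}")
```

Under these assumed settings the variance across replications should exceed the constant-coefficient figure, because sampling variation in the ratio of standard deviations is multiplied by the pretest difference between the groups. The gap should shrink as the sample size or the correlation is increased.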
A Data Example

To demonstrate the above results, which were derived analytically, a data set was obtained and analyzed. These data were collected on students from seven different elementary schools over a three-year period. Each spring these students were given the Stanford Achievement Test battery, and scores were recorded in the metric of grade equivalents for each subscale of the test as well as for total scores. In order to simulate a situation similar to the evaluation of a compensatory education program, random samples of 30 students from the first quartile and 30 students from the second quartile, based on fourth grade total reading scores, were chosen for comparative purposes. Since the students involved were not part of any special program, the only difference between the two groups was their growth rate. The dependent measure chosen to compare the three analytic strategies was the paragraph meaning subtest. Table 14 presents the means in grade equivalents for the hypothetical treatment group (first quartile) and the control group (second quartile) over the three-year period. The values in parentheses are the respective standard deviations.

Table 14
Group Means in the Metric of Grade Equivalent Scores and Standard Deviations on the Paragraph Meaning Subtest of the Stanford Achievement Test Battery

                              Spring 1973    Spring 1974    Spring 1975
First quartile (treatment)    3.01 (.530)    4.03 (.723)    4.44 (1.218)
Second quartile (control)     3.97 (.453)    4.94 (.939)    5.54 (.939)

These group means are plotted on a time by achievement graph in Figure 7. For the paragraph meaning subtest, students in the first quartile were approximately one grade equivalent behind the students in the second quartile. This difference remained constant over the three-year period. The ratios of the group mean differences to the pooled standard deviations at each point in time were 1.935, 1.089, and 1.012.

[Plot not reproduced: group means in grade equivalents (vertical axis, 3.0 to 5.5) against testing occasion (Spring '73, Spring '74, Spring '75).]

Figure 7.
Group achievement means plotted across three points in time.

In addition to the three competing analytic strategies considered in this study, traditional analysis of covariance and analysis of variance with residualized gains were also computed. The residualized gains procedure creates a new variable, W, by subtracting from the posttest the product of the regression slope and the pretest measure. The new variable is then used as the dependent variable in the analysis of variance model. The results of these analyses are presented in Table 15. Table 16 presents the effects estimated by each strategy.

These results illustrate several points raised in the earlier discussion. Since there were no differences between the groups except for natural growth rates, the correct conclusion from the analyses should have been that the groups do not differ. The adjusted gain score and gains in standard scores procedures indicated this conclusion. Estimated true score analysis of covariance, on the other hand, indicated a significant difference between the two groups. Examining the relationship between the covariate (Spring 1974 data) and the dependent measure (Spring 1975 data) indicated a correlation of .33. To correct for measurement errors, the Kuder-Richardson reliability for internal consistency was used, since the more desirable test-retest reliability coefficient was not available. This reliability coefficient equaled .92. The ratio of the correlation coefficient to the reliability coefficient was, therefore, significantly less than unity, and little adjustment was made for initial differences.

Table 15
Results of the Data Analyses Using the Gains in Standard Score Strategy, True Score Analysis of Covariance, Adjusted Gain Scores With the Analysis of Variance Model, Adjusted Gain Scores With the Derived Standard Error, Traditional Analysis of Covariance and Analysis of Variance With Residualized Gain Scores

Source                                            d.f.    MS         F        F prob.

Gains in standard scores
  Between                                         1       .0835      .063     .802
  Within                                          58      76.4297

True score analysis of covariance
  Between                                         1       4.905      4.605    .036
  Within                                          57      1.065

Adjusted gain scores (using the analysis of variance model)
  Between                                         1       .8449      .667     .417
  Within                                          58      1.2661

Adjusted gain scores (using the derived standard error in a t-test)

\[
t = \frac{\bar{W}_p - \bar{W}_c}{\sqrt{\operatorname{Var}(\bar{W}_p - \bar{W}_c)}}
  = \frac{\bar{W}_p - \bar{W}_c}{\sqrt{\tfrac{2}{30}(1.3157)}}
  = \frac{\bar{W}_p - \bar{W}_c}{.2962} = .8002, \qquad F = t^2 = .64037
\]

(The group means printed in the numerator, -.504 and -.367, are only partly legible in the source.)

Traditional analysis of covariance
  Between                                         1       5.690      5.343    .024
  Within                                          57      1.065

Analysis of variance with residualized gains
  Between                                         1       7.439      7.108    .010
  Within                                          58      1.047

Table 16
The Effect Estimated by the Gains in Standard Scores, True Score Analysis of Covariance, Adjusted Gain Scores, and Traditional Analysis of Covariance

Gains in standard scores:

\[
\alpha_{GSS} = \bar{Y}_p - \bar{Y}_c - \frac{S_y}{S_x}(\bar{X}_p - \bar{X}_c)
             = 4.44 - 5.54 - \frac{1.0785}{.831}(4.03 - 4.94) = .08
\]

Estimated true score analysis of covariance:

\[
\alpha_{TS} = \bar{Y}_p - \bar{Y}_c - \frac{b}{\rho_{xx}}(\bar{X}_p - \bar{X}_c)
            = 4.44 - 5.54 - \frac{.44}{.92}(4.03 - 4.94) = -.665
\]

Adjusted gain scores:

\[
\alpha_{AGS} = \bar{Y}_p - \bar{Y}_c - (\bar{X}_p - \bar{X}_c) - [(\bar{X}_p - \bar{Z}_p) - (\bar{X}_c - \bar{Z}_c)]
             = 4.44 - 5.54 - (4.03 - 4.94) - [(4.03 - 3.01) - (4.94 - 3.97)] = -.24
\]

Traditional analysis of covariance and residualized gains:

\[
\alpha_{AC} = \bar{Y}_p - \bar{Y}_c - b_{y \cdot x}(\bar{X}_p - \bar{X}_c)
            = 4.44 - 5.54 - .44(4.03 - 4.94) = -.70
\]

Kenny's procedure indicated no statistically significant differences between the comparison groups when the posttest was the Spring 1975 data and the pretest was the Spring 1974 data. The conclusion would have been different, however, if the period under investigation had been the 1973-74 school year. The effect estimated by the strategy would have been the following:

\[
\alpha_{GSS} = \bar{Y}_p - \bar{Y}_c - \frac{S_y}{S_x}(\bar{X}_p - \bar{X}_c)
             = 4.03 - 4.94 - \frac{.831}{.492}(3.01 - 3.97)
             = -.91 - 1.689(-.96) = .71,
\]

thus indicating a positive treatment effect when there actually was no treatment.

The adjusted gain score approach was computed treating the adjustment as a constant (like Kenny's technique) and as a variable.
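The effect estimates in Table 16 follow directly from the group means in Table 14 and the adjustment coefficients quoted in the text. The sketch below simply repeats that arithmetic; the names `MEANS`, `adjusted_difference`, and `adjusted_gain` are hypothetical helpers introduced for illustration, and the coefficients (the pooled standard deviations, the slope .44, and the reliability .92) are taken as given rather than recomputed from raw data.

```python
# Group means from Table 14 (paragraph meaning subtest, grade equivalents).
# z = Spring 1973, x = Spring 1974, y = Spring 1975.
MEANS = {
    "p": {"z": 3.01, "x": 4.03, "y": 4.44},  # first quartile ("treatment")
    "c": {"z": 3.97, "x": 4.94, "y": 5.54},  # second quartile (control)
}


def adjusted_difference(coef, pre="x", post="y"):
    """Posttest mean difference minus coef times the pretest mean difference."""
    post_diff = MEANS["p"][post] - MEANS["c"][post]
    pre_diff = MEANS["p"][pre] - MEANS["c"][pre]
    return post_diff - coef * pre_diff


def adjusted_gain():
    """Adjusted gain score estimate based on two pretests (z, x) and a posttest (y)."""
    p, c = MEANS["p"], MEANS["c"]
    gain_diff = (p["y"] - c["y"]) - (p["x"] - c["x"])
    prior_gain_diff = (p["x"] - p["z"]) - (c["x"] - c["z"])
    return gain_diff - prior_gain_diff


gss = adjusted_difference(1.0785 / 0.831)      # ratio of pooled standard deviations
true_score = adjusted_difference(0.44 / 0.92)  # slope divided by reliability
ancova = adjusted_difference(0.44)             # observed regression slope
ags = adjusted_gain()
gss_1973_74 = adjusted_difference(0.831 / 0.492, pre="z", post="x")

for name, value in [
    ("gains in standard scores", gss),
    ("true score ANCOVA", true_score),
    ("adjusted gain scores", ags),
    ("traditional ANCOVA", ancova),
    ("gains in standard scores, 1973-74", gss_1973_74),
]:
    print(f"{name}: {value:.3f}")
```

Changing the `pre` and `post` arguments shows how strongly the gains in standard scores estimate depends on which pair of occasions is analyzed, which is the point made above about the 1973-74 school year.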
The degree to which the analysis treating the adjustment as a constant is spurious is illustrated by a comparison of the two F-ratios. The F-ratio when the adjustment was treated as a constant was slightly larger, indicating a slightly more powerful test.

The data were also analyzed using the traditional analysis of covariance and the analysis of variance with residualized gains as the dependent variable. The purpose of doing these analyses was to illustrate the inappropriateness of both procedures. The adjustment coefficient used by both procedures is the same, i.e., the observed [...]

The average achievement for a particular group was then defined as

\[
\text{Average Achievement} = \mu_{\pi}(t_j - \mu_{\tau}),
\]

where \(\mu_{\pi}\) is the average growth rate of the group; \(t_j\) is as defined above; and \(\mu_{\tau}\) is the average point in time when the individuals in the group began to achieve. Based on this simple model of linear growth, the authors showed that the theoretically correct adjustment coefficient for all linear growth patterns should take the form

\[
\frac{\mu_{\pi p}(t_y - \mu_{\tau p}) - \mu_{\pi c}(t_y - \mu_{\tau c})}
     {\mu_{\pi p}(t_x - \mu_{\tau p}) - \mu_{\pi c}(t_x - \mu_{\tau c})},
\]

where the subscripts p and c indicate the program and control groups, respectively. This coefficient, multiplied by the difference between the two groups on the pretest, results in the adjustment factor

\[
\mu_{yp} - \mu_{yc} - [\mu_{\pi p}(t_y - \mu_{\tau p}) - \mu_{\pi c}(t_y - \mu_{\tau c})].
\]

From this formulation of individual growth, Bryk and Weisberg argued that the gains in standard scores and analysis of covariance with a reliable covariate provided an unbiased estimate of group differences in situations conforming to the traditional fan spread model. Bryk and Weisberg suggested, however, that in situations other than the traditional fan spread, neither of these procedures adequately adjusts for differential linear growth patterns. Their conclusion, therefore, concurs with the findings observed in the present study.

The present study introduced an adjusted gain score strategy.
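The estimate of group differences determined by this procedure was written as

\[
\alpha_{AGS} = \mu_{yp} - \mu_{yc} - (\mu_{xp} - \mu_{xc}) - [(\mu_{xp} - \mu_{zp}) - (\mu_{xc} - \mu_{zc})].
\]

In terms of Bryk and Weisberg's model, this estimate of the group difference can be written as

\[
\alpha_{AGS} = \mu_{yp} - \mu_{yc} - (\mu_{xp} - \mu_{xc})
 - \{[\mu_{\pi p}(t_x - \mu_{\tau p}) - \mu_{\pi p}(t_z - \mu_{\tau p})]
 - [\mu_{\pi c}(t_x - \mu_{\tau c}) - \mu_{\pi c}(t_z - \mu_{\tau c})]\},
\]

where

\[
\mu_{xp} = \mu_{\pi p}(t_x - \mu_{\tau p}); \quad
\mu_{zp} = \mu_{\pi p}(t_z - \mu_{\tau p}); \quad
\mu_{xc} = \mu_{\pi c}(t_x - \mu_{\tau c}); \quad
\mu_{zc} = \mu_{\pi c}(t_z - \mu_{\tau c}).
\]

Since the average growth rate for each group is assumed to be constant across time, the above estimate of group differences can be simplified to

\[
\alpha_{AGS} = \mu_{yp} - \mu_{yc} - (\mu_{xp} - \mu_{xc})
 - \{\mu_{\pi p}[(t_x - \mu_{\tau p}) - (t_z - \mu_{\tau p})]
 - \mu_{\pi c}[(t_x - \mu_{\tau c}) - (t_z - \mu_{\tau c})]\}.
\]

Further simplification is obtained by the following substitution:

\[
t_x - \mu_{\tau p} = (t_x - t_z) + (t_z - \mu_{\tau p}).
\]

Therefore, the adjusted group difference can be written as

\[
\alpha_{AGS} = \mu_{yp} - \mu_{yc} - (\mu_{xp} - \mu_{xc}) - [\mu_{\pi p}(t_x - t_z) - \mu_{\pi c}(t_x - t_z)].
\]

If the period of time between the first and second pretests equals the period of intervention, so that \(t_x - t_z = t_y - t_x\), and \(\mu_{xp}\) and \(\mu_{xc}\) are changed to \(\mu_{\pi p}(t_x - \mu_{\tau p})\) and \(\mu_{\pi c}(t_x - \mu_{\tau c})\), then

\[
\alpha_{AGS} = \mu_{yp} - \mu_{yc} - [\mu_{\pi p}(t_x - \mu_{\tau p}) - \mu_{\pi c}(t_x - \mu_{\tau c})]
 - \mu_{\pi p}(t_y - t_x) + \mu_{\pi c}(t_y - t_x).
\]

Combining similar terms:

\[
\alpha_{AGS} = \mu_{yp} - \mu_{yc} - \mu_{\pi p}[(t_x - \mu_{\tau p}) + (t_y - t_x)] + \mu_{\pi c}[(t_x - \mu_{\tau c}) + (t_y - t_x)],
\]

which can be further simplified to

\[
\alpha_{AGS} = \mu_{yp} - \mu_{yc} - [\mu_{\pi p}(t_y - \mu_{\tau p}) - \mu_{\pi c}(t_y - \mu_{\tau c})].
\]

This is the adjustment factor identified by Bryk and Weisberg for all linear growth patterns. Thus, the adjusted gain score strategy provides the appropriate adjustment not only for the fan spread model but also for any situation in which group growth is linear.

One Group vs. Two Group Research Designs

The adjusted gain score strategy has been shown to be an appropriate technique in situations where groups are changing linearly. Given the assumptions of linear growth and the availability of data from two pretests, the necessity of having an independent control group may be questioned. Under these conditions it might be argued that a treatment group could serve as its own control.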
This could be achieved by defining a regression line based on the pre-intervention data and predicting posttest scores under a no-treatment-effect condition. Differences between the obtained posttest performances and the predicted performances might then be attributed to the treatment.

While this approach might be used, it would be difficult to argue that differences between the observed and expected posttest performances were due solely to the treatment. History, maturation, testing, and regression effects are all reasonable threats to internal validity for this approach. The first two threats represent those alternative explanations which can be attributed to changes either outside or within the individual that occur concurrently with the treatment. The last two threats, on the other hand, are factors which could distort the predictive regression line. Testing refers to changes in test performance from the first pretest to the second pretest that result from familiarity with the test items. Regression effects are those distortions attributed to positive or negative errors of measurement.

The use of an independent control group provides a means by which the effects of these threats to internal validity can be controlled, as long as those effects are equal across the two groups. For example, if there are testing effects equaling a units on the X measure and this distortion is the same for both groups, then the estimate of the group difference using the adjusted gain score procedure could be written as

\[
\begin{aligned}
&\mu_{yp} - \mu_{yc} - [(\mu_{xp} + a) - (\mu_{xc} + a)] - [(\mu_{xp} + a - \mu_{zp}) - (\mu_{xc} + a - \mu_{zc})] \\
&\quad = \mu_{yp} - \mu_{yc} - [(\mu_{xp} - \mu_{xc}) + (a - a)] - [(\mu_{xp} - \mu_{zp}) - (\mu_{xc} - \mu_{zc}) + (a - a)].
\end{aligned}
\]

Thus, as long as the distortions affect both groups equally, estimates of group differences are still appropriate. The selection by regression and selection by testing interactions refer to distortions which affect one group to a greater extent than the other group.
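The cancellation of an equal testing effect, and the adequacy of the adjusted gain score under linear group growth, can be checked numerically. The sketch below uses made-up growth rates, starting points, and distortions that are assumptions for illustration only; `group_mean` and `ags_estimate` are hypothetical helper names. It assumes equally spaced testing occasions, as required by the adjusted gain score procedure.

```python
def group_mean(rate, origin, t):
    """Mean achievement at time t for a group growing linearly from time `origin`."""
    return rate * (t - origin)


def ags_estimate(rate_p, origin_p, rate_c, origin_c, tz, tx, ty,
                 effect=0.0, testing_p=0.0, testing_c=0.0):
    """Adjusted gain score estimate from group means at times tz, tx, ty.

    `effect` is added to the program group's posttest mean; `testing_p` and
    `testing_c` are distortions added to each group's second pretest mean.
    """
    zp = group_mean(rate_p, origin_p, tz)
    zc = group_mean(rate_c, origin_c, tz)
    xp = group_mean(rate_p, origin_p, tx) + testing_p
    xc = group_mean(rate_c, origin_c, tx) + testing_c
    yp = group_mean(rate_p, origin_p, ty) + effect
    yc = group_mean(rate_c, origin_c, ty)
    return (yp - yc) - (xp - xc) - ((xp - zp) - (xc - zc))


# Hypothetical groups with different growth rates and different starting points.
growth = dict(rate_p=0.8, origin_p=-1.0, rate_c=1.1, origin_c=0.5,
              tz=3.0, tx=4.0, ty=5.0)

no_effect = ags_estimate(**growth)
with_effect = ags_estimate(**growth, effect=0.5)
equal_testing = ags_estimate(**growth, testing_p=0.3, testing_c=0.3)
unequal_testing = ags_estimate(**growth, testing_p=0.3, testing_c=0.0)

print(f"no treatment effect:      {no_effect:.3f}")
print(f"treatment effect of 0.5:  {with_effect:.3f}")
print(f"equal testing effect:     {equal_testing:.3f}")
print(f"unequal testing effect:   {unequal_testing:.3f}")
```

In this illustration the estimate is zero when there is no treatment effect, recovers the added effect, and is unchanged by a testing effect common to both groups. A testing effect confined to one group enters the estimate twice, once through the pretest difference and once through the prior gain, so the bias is twice the size of the differential distortion.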
If the assumption of equal distortion across groups is violated, then the estimated group difference is biased.

In addition to the issue of internal validity discussed above, a further consideration is the question of precision. With an independent control group, the contrast of interest is the difference between the adjusted group means. The standard error associated with that contrast can be written as

\[
\sqrt{\tfrac{2}{n}\operatorname{Var}(W)}.
\]

The degrees of freedom associated with this test are the sum of the two sample sizes minus 2. In the single group design the contrast of interest is the adjusted group mean minus zero. Its standard error can be written as

\[
\sqrt{\tfrac{1}{n}\operatorname{Var}(W)}.
\]

The degrees of freedom associated with this test are the number of individuals in the group minus one. Comparing the one group and two group designs in terms of precision would indicate smaller error variance associated with the former. The two group design, however, has more degrees of freedom. The difference between the degrees of freedom becomes negligible, however, when the sample is large. Thus, the one group design is likely to provide greater precision than the two group design. The problems related to threats to internal validity in the one group design, however, discourage the use of this approach.

Limitations

The adjusted gain score procedure has been presented as an appropriate analytic strategy for situations conforming to the fan spread model. In order to focus attention on what were considered to be the central points for comparison, some assumptions about the circumstances of application have been made. A basic assumption was that it is possible to measure the same individuals repeatedly on the same variable. This may be difficult in a real world setting. The example data examined in the present study to illustrate the analytically derived results, however, have indicated that it is not impossible to obtain such measurements. In schools both the administration and teachers require repeated testing to monitor student progress.
These tests, however, may not be appropriate for the adjusted gain score approach, since they are unlikely to be the same test or a parallel form of the test. With careful planning, repeated testing of individuals with parallel forms of a test may be possible.

A second assumption has been that there are no selection by regression or selection by testing interactions. Regression effects and testing effects were briefly discussed earlier. If these distortions of the group's growth rate affect both groups equally, then an estimate of group differences is not affected. On the other hand, when there are differential effects associated with these threats to internal validity, the estimated group difference is biased.

Finally, the adjusted gain score strategy has been based on the assumption that groups grow in a linear fashion. This assumption is likely to be met in situations involving short periods of time. Over extended time periods it seems less likely that a linear model would adequately characterize group changes. The estimated true score analysis of covariance and gains in standard scores strategies also make the same assumption about linear growth. These procedures, however, use data obtained over a shorter period of time than adjusted gain scores and are thus less likely to violate the assumption. When the assumption is violated, the adjustment provided by the adjusted gain score strategy can be totally inappropriate. Thus, in situations where the intervention period is extensive, the use of adjusted gain scores may not be appropriate.

APPENDICES

APPENDIX A

THE VARIANCE OF THE PRODUCT OF TWO RANDOM VARIABLES

Given: X and Y stochastically independent such that

\[
E(X) = \xi, \quad E(Y) = \eta, \quad \operatorname{Var}(X) = \Sigma_x, \quad \operatorname{Var}(Y) = \Sigma_y .
\]

Let \(X = \xi + \varepsilon\) and \(Y = \eta + \delta\), where \(E(\varepsilon) = 0\) and \(E(\delta) = 0\). Then

\[
\begin{aligned}
\operatorname{Var}[X'Y] &= \operatorname{Var}[\xi'\eta + \xi'\delta + \varepsilon'\eta + \varepsilon'\delta] \\
 &= \operatorname{Var}(\xi'\delta) + \operatorname{Var}(\varepsilon'\eta) + \operatorname{Var}(\varepsilon'\delta)
    + 2\operatorname{Cov}(\xi'\delta, \varepsilon'\eta) + 2\operatorname{Cov}(\xi'\delta, \varepsilon'\delta) + 2\operatorname{Cov}(\varepsilon'\eta, \varepsilon'\delta) \\
 &= \xi'\Sigma_y\xi + \eta'\Sigma_x\eta + E[\operatorname{tr}\{(\varepsilon\varepsilon')(\delta\delta')\}] \\
 &= \xi'\Sigma_y\xi + \eta'\Sigma_x\eta + \operatorname{tr}\{E(\varepsilon\varepsilon')\,E(\delta\delta')\} \\
 &= \xi'\Sigma_y\xi + \eta'\Sigma_x\eta + \operatorname{tr}\{\Sigma_x\Sigma_y\},
\end{aligned}
\]

where the three covariance terms vanish because \(\varepsilon\) and \(\delta\) are independent with zero means.

If X and Y are scalars rather than vectors:

\[
\operatorname{Var}(XY) = [E(X)]^2\sigma_y^2 + [E(Y)]^2\sigma_x^2 + \sigma_x^2\sigma_y^2 .
\]

APPENDIX B

THE COVARIANCE OF A RANDOM VARIABLE AND THE PRODUCT OF TWO RANDOM VARIABLES

Given: \(W = Z - X'Y\), where Z and Y are stochastically independent of X and such that

\[
E(X) = \xi, \quad E(Y) = \eta, \quad E(Z) = \zeta, \quad \operatorname{Var}(X) = \Sigma_x, \quad \operatorname{Var}(Y) = \Sigma_y, \quad \operatorname{Var}(Z) = \sigma_z^2, \quad \operatorname{Cov}(Z, Y) = \sigma_{zy} .
\]

\[
\begin{aligned}
\operatorname{Cov}(Z, X'Y) &= E(Z\,X'Y) - E(Z)\,E(X'Y) \\
 &= E(X)'E(ZY) - \zeta\,\xi'\eta \\
 &= \xi'[E(ZY) - \zeta\eta] \\
 &= \xi'\operatorname{Cov}(Z, Y) = \xi'\sigma_{zy} .
\end{aligned}
\]

If X and Y are scalars rather than vectors:

\[
\operatorname{Cov}(Z, XY) = E(X)\operatorname{Cov}(Z, Y).
\]

APPENDIX C

THE COVARIANCE OF TWO ADJUSTED VARIABLES

If \(W_1 = Z_1 - X'Y_1\) and \(W_2 = Z_2 - X'Y_2\), with \(Z_1\) and \(Y_1\) independent of \(Z_2\) and \(Y_2\), and X independent of Y and Z, then

\[
\operatorname{Cov}(W_1, W_2) = \operatorname{Cov}(X'Y_1, X'Y_2) = \eta_1'\Sigma_x\eta_2 ,
\]

where \(\eta_1 = E(Y_1)\) and \(\eta_2 = E(Y_2)\).

If X and Y are scalars rather than vectors:

\[
\operatorname{Cov}(W_1, W_2) = E(Y_1)\,E(Y_2)\operatorname{Var}(X).
\]

APPENDIX D

THE EXPECTED VALUE OF THE RATIO OF TWO NON-INDEPENDENT SAMPLE STANDARD DEVIATIONS SQUARED

The density function of the ratio of two correlated standard deviations was derived by Bose (1935) and Finney (1938) as the following:

\[
dF = \frac{2(1-\rho^2)^{\frac{n-1}{2}}}{B\!\left(\frac{n-1}{2}, \frac{n-1}{2}\right)}
     \cdot \frac{w^{\,n-2}}{(1+w^2)^{n-1}}
     \left[1 - \frac{4\rho^2 w^2}{(1+w^2)^2}\right]^{-\frac{n}{2}} dw ,
\]

where

\[
w = \frac{S_2/\sigma_2}{S_1/\sigma_1}.
\]
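Let \(x = w^2\) and \(b = x + x^{-1}\), so that \(x^2 - bx + 1 = 0\) and

\[
x = \tfrac{1}{2}\left\{b \pm \sqrt{b^2 - 4}\right\}.
\]

Choosing the + sign for x,

\[
x = \tfrac{1}{2}\left\{b + \sqrt{b^2-4}\right\}, \quad
x^{-1} = \tfrac{1}{2}\left\{b - \sqrt{b^2-4}\right\}, \quad
x + x^{-1} = b, \quad
x - x^{-1} = \sqrt{b^2 - 4}.
\]

If \(x = w^2\), let \(b = (1-\rho^2)(t+t^{-1})^2 + 4\rho^2 - 2\). Then

\[
\begin{aligned}
(w + w^{-1})^2 &= w^2 + w^{-2} + 2 = b + 2 = (1-\rho^2)(t+t^{-1})^2 + 4\rho^2, \\
1 - \frac{4\rho^2}{(w+w^{-1})^2} &= \frac{(1-\rho^2)(t+t^{-1})^2}{(1-\rho^2)(t+t^{-1})^2 + 4\rho^2}, \\
w^2 - w^{-2} &= \sqrt{b^2-4} = \sqrt{(b-2)(b+2)}
  = (t - t^{-1})(1-\rho^2)^{1/2}\left\{(1-\rho^2)(t+t^{-1})^2 + 4\rho^2\right\}^{1/2}, \\
(w+w^{-1})(w-w^{-1})\,\frac{dw}{w} &= (1-\rho^2)(t+t^{-1})(t-t^{-1})\,\frac{dt}{t}.
\end{aligned}
\]

Substituting into the density gives

\[
dF = \frac{2}{B\!\left(\frac{n-1}{2}, \frac{n-1}{2}\right)}\,(t+t^{-1})^{-(n-1)}\,\frac{dt}{t}.
\]

Let

\[
y = \frac{1}{1+t^2}, \quad 1 - y = \frac{t^2}{1+t^2}, \quad dy = -\frac{2t}{(1+t^2)^2}\,dt,
\]

so that

\[
dF = \frac{y^{\frac{n-3}{2}}(1-y)^{\frac{n-3}{2}}}{B\!\left(\frac{n-1}{2}, \frac{n-1}{2}\right)}\,dy, \quad 0 < y < 1,
\]

where

\[
(w + w^{-1})^2 = \frac{1-\rho^2}{y(1-y)} + 4\rho^2 .
\]

Therefore

\[
E(w+w^{-1})^2 = (1-\rho^2)\,E\!\left[\frac{1}{y(1-y)}\right] + 4\rho^2,
\qquad
E(w+w^{-1})^2 = E(w^2) + E(w^{-2}) + 2 = 2\,[E(w^2) + 1],
\]

\[
E\!\left[\frac{1}{y(1-y)}\right]
 = \frac{1}{B\!\left(\frac{n-1}{2}, \frac{n-1}{2}\right)}\int_0^1 y^{\frac{n-5}{2}}(1-y)^{\frac{n-5}{2}}\,dy
 = \frac{4(n-2)}{n-3},
\]

\[
E(w^2) = \frac{1}{2}\left[(1-\rho^2)\frac{4(n-2)}{n-3} + 4\rho^2\right] - 1
       = \frac{2(n-2) - (n-3) - 2\rho^2}{n-3}
       = \frac{(n-1) - 2\rho^2}{n-3},
\]

so that

\[
E\!\left(\frac{S_2^2}{S_1^2}\right) = \frac{\sigma_2^2}{\sigma_1^2}\cdot\frac{(n-1) - 2\rho^2}{n-3}.
\]

APPENDIX E

THE EXPECTED VALUE OF THE RATIO OF TWO NON-INDEPENDENT SAMPLE STANDARD DEVIATIONS

From Appendix D:

\[
w + w^{-1} = \sqrt{\frac{1-\rho^2}{y(1-y)} + 4\rho^2}, \qquad E(w + w^{-1}) = 2E(w),
\]

\[
E(w) = \frac{1}{2\,B\!\left(\frac{n-1}{2}, \frac{n-1}{2}\right)}
       \int_0^1 \sqrt{1 - \rho^2 + 4\rho^2\,y(1-y)}\;[y(1-y)]^{\frac{n-4}{2}}\,dy .
\]

The above integral was evaluated with the DCADRE program on the CDC 6500 computer. The expected value of the ratio of the sample standard deviations is then

\[
E\!\left(\frac{S_2}{S_1}\right) = \frac{\sigma_2}{\sigma_1}\,E(w).
\]

APPENDIX F

THE EXPECTED VALUE OF A SAMPLE REGRESSION COEFFICIENT

Density function for \(b_{y \cdot x}\):

\[
dF = \frac{\Gamma\!\left(\frac{n}{2}\right)}{\sqrt{\pi}\,\Gamma\!\left(\frac{n-1}{2}\right)}
     \cdot \frac{(1-\rho^2)^{\frac{n-1}{2}}\left(\frac{\sigma_y}{\sigma_x}\right)^{n-1}}
                {\left[\frac{\sigma_y^2}{\sigma_x^2}(1-\rho^2) + \left(b - \rho\frac{\sigma_y}{\sigma_x}\right)^2\right]^{n/2}}\,db .
\]

Let \(\sigma_y/\sigma_x = k\). Then

\[
E(b) = \frac{\Gamma\!\left(\frac{n}{2}\right)(1-\rho^2)^{\frac{n-1}{2}}k^{\,n-1}}{\sqrt{\pi}\,\Gamma\!\left(\frac{n-1}{2}\right)}
       \int_{-\infty}^{+\infty} \frac{b\,db}{\left[k^2(1-\rho^2) + (b-\rho k)^2\right]^{n/2}} .
\]

Let \(t = b - \rho k\), \(b = t + \rho k\), \(db = dt\), and \(v = k^2(1-\rho^2)\):

\[
E(b) = \frac{\Gamma\!\left(\frac{n}{2}\right)}{\sqrt{\pi}\,\Gamma\!\left(\frac{n-1}{2}\right)v^{1/2}}
       \left[\int_{-\infty}^{+\infty} \frac{t\,dt}{\left(1+\frac{t^2}{v}\right)^{n/2}}
           + \rho k\int_{-\infty}^{+\infty} \frac{dt}{\left(1+\frac{t^2}{v}\right)^{n/2}}\right].
\]

The first integral is zero because its integrand is an odd function of t. For the second, let

\[
y = \frac{v}{v+t^2}, \quad 1 - y = \frac{t^2}{v+t^2}, \quad t = \frac{v^{1/2}(1-y)^{1/2}}{y^{1/2}}, \quad
dt = -\frac{v^{1/2}}{2\,(1-y)^{1/2}\,y^{3/2}}\,dy ,
\]

so that

\[
\int_{-\infty}^{+\infty} \frac{dt}{\left(1+\frac{t^2}{v}\right)^{n/2}}
 = 2\int_0^1 y^{n/2}\,\frac{v^{1/2}}{2\,(1-y)^{1/2}\,y^{3/2}}\,dy
 = v^{1/2}\,B\!\left(\tfrac{n-1}{2}, \tfrac{1}{2}\right)
 = v^{1/2}\,\frac{\sqrt{\pi}\,\Gamma\!\left(\frac{n-1}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)} .
\]

Therefore

\[
E(b) = \rho k = \rho\,\frac{\sigma_y}{\sigma_x}.
\]

APPENDIX G

THE EXPECTED VALUE OF A SAMPLE REGRESSION COEFFICIENT SQUARED

\[
dF = \frac{\Gamma\!\left(\frac{n}{2}\right)}{\sqrt{\pi}\,\Gamma\!\left(\frac{n-1}{2}\right)}
     \cdot \frac{(1-\rho^2)^{\frac{n-1}{2}}\left(\frac{\sigma_y}{\sigma_x}\right)^{n-1}}
                {\left[\frac{\sigma_y^2}{\sigma_x^2}(1-\rho^2) + \left(b - \rho\frac{\sigma_y}{\sigma_x}\right)^2\right]^{n/2}}\,db .
\]

Let \(\sigma_y/\sigma_x = k\). Then

\[
E(b^2) = \frac{\Gamma\!\left(\frac{n}{2}\right)(1-\rho^2)^{\frac{n-1}{2}}k^{\,n-1}}{\sqrt{\pi}\,\Gamma\!\left(\frac{n-1}{2}\right)}
         \int_{-\infty}^{+\infty} \frac{b^2\,db}{\left[k^2(1-\rho^2) + (b-\rho k)^2\right]^{n/2}} .
\]

Let \(t = b - \rho k\), \(b = t + \rho k\), \(db = dt\), and \(v = k^2(1-\rho^2)\):

\[
E(b^2) = \frac{\Gamma\!\left(\frac{n}{2}\right)}{\sqrt{\pi}\,\Gamma\!\left(\frac{n-1}{2}\right)v^{1/2}}
         \int_{-\infty}^{+\infty} \frac{\left[t^2 + 2t\rho k + (\rho k)^2\right]dt}{\left(1+\frac{t^2}{v}\right)^{n/2}} .
\]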