Michigan State University

This is to certify that the dissertation entitled "Computation of power in the nested random effects models" presented by Xiaofeng Liu has been accepted towards fulfillment of the requirements for the Ph.D. degree in Measurement & Quantitative Methods.

Computation of Power in the Nested Random Effects Models

By

Xiaofeng Liu

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology, and Special Education

1999

ABSTRACT

Nested random effects models contain random effects due to the nested sampling units. Such models used to be framed as mixed analysis of variance (ANOVA).
Nested data are now often analyzed by means of more flexible models such as hierarchical linear models (HLM) because HLMs accommodate continuous covariates and unbalanced designs. However, few people understand how to compute power in HLMs. This dissertation uses the relation between ANOVA and HLM as a basis for the computation of power in HLM. In essence, hierarchical linear models without continuous covariates are mixed ANOVA models. Power functions in ANOVA can be derived through ANOVA tables, though it is typically difficult to obtain ANOVA tables for complex designs. The dissertation simplifies the derivation of the ANOVA table through the structural representation of the models in HLM. The derived ANOVA tables can then be translated into similar HLM tables for deriving power functions in HLM. Knowledge of those power functions allows us to choose appropriate sample sizes for prospective studies using HLM. Two hypothetical examples are provided to illustrate the application of power functions in planning educational studies.

To My Parents

ACKNOWLEDGMENTS

The dissertation is the culmination of my intellectual pursuit over the past three years. I am grateful to all the people who helped me finish it. I thank Dr. Stephen Raudenbush for his long-term academic guidance and unlimited support in my doctoral studies. Dr. Raudenbush is a very knowledgeable and generous person. He first introduced me to the computation of power two years ago, and I have been working very closely with him in this area. His insight and wisdom have greatly shaped my thinking and have tremendously helped to lay the foundation of the dissertation. It is hard to enumerate all his ideas in my dissertation, and I am sure that I will inadvertently fail to give him due credit in many places. I am thankful to Dr. Kenneth Frank, Dr. Mark Reckase, and Dr.
Aaron Pallas for agreeing to serve on my dissertation committee and for giving me valuable feedback. I am also grateful to my parents, who provided me with a good education and supported me in all of my endeavors. Last but not least, I thank all my relatives for their unfailing belief in my intellectual ability since my childhood, and for their help, which made my first trip to the United States possible.

TABLE OF CONTENTS

Introduction
Chapter 1: statistical power and its computation
Chapter 2: computation of power in the hierarchical linear model through the ANOVA table
Chapter 3: computation of power in the hierarchical linear model through the HLM table
Chapter 4: four examples
Chapter 5: numerical application of power functions in planning educational studies
Conclusion

LIST OF TABLES

Table 1: HLM formulation of the example
Table 2: translation between HLM and ANOVA terms
Table 3: ANOVA table of the example
Table 4: derivation of EMS, step 1
Table 5: derivation of EMS, step 2
Table 6: derivation of EMS, step 3
Table 7: final ANOVA table of the example
Table 8: HLM table of the example
Table 9: HLM table of 3 level CRT
Table 10: HLM table of MST with a covariate at the site level
Table 11: HLM table of a combination of CRT and MST
Table 12: power in cluster randomized trial
Table 13: power in multi-site trial

LIST OF ABBREVIATIONS

HLM: Hierarchical Linear Model
CRT: Cluster Randomized Trial
MST: Multi-site Clinical Trial
ANOVA: Analysis of Variance

INTRODUCTION

A random effects model specifies more than one random term. The random factors contain levels that are randomly selected from a population of possible levels. As the number of possible levels of some factors may be very large, it is impossible to assess all of them. Random inclusion of some levels in the model becomes economical and convenient. Suppose that we are studying the effect of schools on implementing one instruction method. There are hundreds of thousands of schools.
It would be impossible to examine all the schools. A sensible strategy is to identify all the schools and randomly sample a few of them to assess school effects (Littell et al., 1997).

A complete random effects model contains only a mean as its fixed term; the rest of the terms are random factors. The key interest lies in the estimation of the grand mean and the variance of each random component. The random effects may be crossed in some cases, but nesting of random effects occurs in most real sampling schemes. Large experimental units are first selected, and then smaller units are randomly selected from each large experimental unit. In education, school districts are the natural sampling units from which individual schools may be randomly chosen. The institutional hierarchical structure often determines the design of our studies, be they experimental or survey in nature. In fact, the hierarchical structure of experimental units leads to the nesting of random effects, which correspond to units of various sizes.

The random effects models can be analyzed through two frameworks. In the past they were treated as mixed analysis of variance (ANOVA) models. The mixed ANOVA approach assumes a balanced design and no continuous covariates, which restricts its application in many real data analyses: data tend to come with some missing values, and continuous covariates are common. Researchers now often analyze such data with more flexible models such as hierarchical linear models (HLM). HLMs can include continuous covariates and can accommodate missing data. In HLM the models are arranged in a few levels based on the hierarchy of the sampling units. Corresponding to the experimental units, the hierarchical linear model may be represented by a sub-linear model at each level. For simplicity we assume one fixed effect at each level. The generic presentation at each level is as follows:

Outcome = mean + coefficient*fixed effect + random effect.
The “mean” and “coefficient” may be the outcome variables for the next level. Each level must contain a “mean”, while the “fixed effect” and “random effect” may be optional at the higher levels. To generalize the model further we may include a continuous covariate at each level. The formula for each level then becomes

Outcome = mean + coefficient*fixed effect + coefficient*covariate + random effect.

The hierarchical formulation reflects the structure of the design. It appeals to intuition, and the hierarchical linear model has therefore gained popularity among researchers in various disciplines.

Much research on hierarchical linear models so far has focused on estimation theory and the algorithmic implementation of parameter estimation. Much is yet to be learned about the performance of the tests of the parameters in the model. The power of these tests is rarely computed, for two reasons. First, the complexity of the model itself deters people from computing the power of the tests; some mathematical sophistication is required to carry out the computation. Second, it is hard to derive the power function of key parameters in HLM.

Power functions are usually required for determining sample size in planning a study. Many researchers who plan a study using HLM want to choose an appropriate sample size. For example, a researcher may want to compare two types of counseling practice in schools. It is logistically easy to have one school practice one type of counseling, so the researcher decides to randomly assign half of the schools to one type of counseling and the other half to the other type. At the end of the study, students in the schools will be examined on certain criterion outcomes, and the data will be analyzed as a cluster randomized trial by HLM. The researcher is interested in knowing how many students from each school should be recruited in the study.
The question can be easily answered if we have the power function for the test of the main effect of treatment in the corresponding hierarchical linear model. In this case the power function may be derived analytically if we know the variance of the estimate of the treatment effect (see Raudenbush, 1997). However, there is no general way to derive power functions for the general HLM when the design of the study becomes complex.

Stroup (1998) provides a general way to compute the power for mixed linear models. Since HLM is a subset of mixed linear models, his approach is applicable to the power calculation of HLM. If we formulate the hierarchical linear model in the framework of a linear mixed model, it can be expressed as

$$y = X\beta + Zu + e \tag{1}$$

$$\begin{bmatrix} u \\ e \end{bmatrix} \sim \mathrm{MVN}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix} \right)$$

The vector $\beta$ contains the parameters of the fixed effects; the vector $u$ represents the random effects, whose covariance matrix is $G$; and $e$ is a vector of individual random errors whose covariance matrix $R$ is a diagonal matrix with diagonal elements $\sigma^2$. Note that $E(y) = X\beta$ and $\mathrm{Var}(y) = V = ZGZ' + R$. Test $H_0: K'\beta = 0$, where $K$ is a matrix containing contrast constants. $(K'\hat\beta)'[K'$ [...]

[...] Here $\sigma^2$ is assumed to be unknown, and the test is a T statistic,

$$T = \frac{\bar{y}_E - \bar{y}_C}{\sqrt{\dfrac{2}{n}\cdot\dfrac{\sum_{i=1}^{n}(y_{iE}-\bar{y}_E)^2 + \sum_{i=1}^{n}(y_{iC}-\bar{y}_C)^2}{2n-2}}} \tag{4}$$

The power function of the test at the 5 percent significance level is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ 2n-2),\ 2n-2,\ \delta\sqrt{n/2}\right) \tag{5}$$

where probt is the cumulative distribution function for the non-central T; tinv is the quantile function for the central T (see Appendix A for the definition of the functions); 0.95 is 1 minus the alpha level of the test (we assume a 0.05 alpha level from here on); $2n-2$ is the degrees of freedom for the central and non-central T distributions; $\delta$ is the standardized effect size, i.e. the mean difference divided by $\sigma$ (Cohen, 1988); and $\delta\sqrt{n/2}$ is the non-centrality parameter of the non-central T distribution.

F test

The multi-site clinical trial is a popular two-factor design.
The treatment factor is a fixed effect, and its power function can be expressed in terms of the cumulative distribution function of a non-central F. The random factor, i.e. the site, is a random effect, and its power function can be formulated in terms of the cumulative distribution function of a central F. The model can be written in ANOVA notation:

$$y_{ijk} = \mu + \alpha_k + \pi_j + (\alpha\pi)_{jk} + \varepsilon_{ijk} \tag{6}$$

$$\pi_j \sim N(0, \sigma_\pi^2), \quad (\alpha\pi)_{jk} \sim N(0, \sigma_{\alpha\pi}^2), \quad \varepsilon_{ijk} \sim N(0, \sigma^2)$$

where $y_{ijk}$ is the outcome for the ith participant nested within the jth site and receiving treatment k (i = 1,...,n; j = 1,...,J; k = 1,...,K). Here $\mu$ is the grand mean, $\alpha_k$ is the main effect of treatment k, $\pi_j$ is the main effect of site j, and $(\alpha\pi)_{jk}$ is the interaction effect between site j and treatment k. Note that $\pi_j$ and $(\alpha\pi)_{jk}$ are viewed as random effects, and that they are independent.

Note: For simplicity we only consider the one-sided test and an $\alpha$ level of 0.05 throughout the dissertation, though the two-sided test can be derived similarly. All the cumulative distribution functions and their inverse functions from here on use the same notation as SAS. Their definitions are provided in Appendix A.

The test of the treatment effect assumes the null hypothesis that $\alpha_1 = \alpha_2 = \cdots = \alpha_K$. The test statistic is the ratio of the mean square for the treatment over the mean square for the treatment-by-site interaction. It assumes an F distribution with numerator df $K-1$, denominator df $J-1$, and non-centrality parameter $nJ\sum_k \alpha_k^2 / (\sigma^2 + n\sigma_{\alpha\pi}^2)$. The power function for the test at $\alpha$
= 0.05 is

$$1 - \mathrm{probf}\left(\mathrm{finv}(0.95,\ K-1,\ J-1),\ K-1,\ J-1,\ \frac{nJ\sum_k \alpha_k^2}{\sigma^2 + n\sigma_{\alpha\pi}^2}\right) \tag{7}$$

where probf is the cumulative distribution function for the non-central F; finv is the quantile function for the central F; 0.95 is equal to 1 minus the alpha level of the test; $K-1$ is the df for the numerator of the central and non-central F distributions; $J-1$ is the df for the denominator of the central and non-central F distributions; and $nJ\sum_k \alpha_k^2 / (\sigma^2 + n\sigma_{\alpha\pi}^2)$ is the non-centrality parameter of the non-central F distribution.

The test of the random site effect assumes a central F distribution after transformation (see Raudenbush and Liu, 1999). The power function can be expressed as follows:

$$1 - \mathrm{probf}\left(\mathrm{finv}(0.95,\ J-1,\ (n-1)KJ)\cdot\frac{\sigma^2}{\sigma^2 + nK\sigma_\pi^2},\ J-1,\ (n-1)KJ\right) \tag{8}$$

where probf is the cumulative distribution function for the central F; finv is the quantile function for the central F; 0.95 is equal to 1 minus the alpha level of the test; $J-1$ is the df for the numerator of the central F distribution; $(n-1)KJ$ is the df for the denominator of the central F distribution; and $\sigma^2/(\sigma^2 + nK\sigma_\pi^2)$ times the finv function is the quantile argument of probf.

$\chi^2$ test

The Wald test is a $\chi^2$ test, although it is seldom used. Suppose that

$$Y = X\beta + e \tag{9}$$

where $e \sim N(0, \sigma^2 I)$. The null hypothesis can usually be set as $H_0: A\beta = 0$. $A\hat\beta$ assumes a normal distribution, i.e. $A\hat\beta \sim N(A\beta,\ A(X'X)^{-1}A'\sigma^2)$, and the test statistic is

$$Q = (A\hat\beta)'\,[A(X'X)^{-1}A'\sigma^2]^{-1}\,(A\hat\beta) \sim \chi^2_{\mathrm{rank}(A)}(\delta) \tag{10}$$

where the non-centrality parameter $\delta$ is $(A\beta)'[A(X'X)^{-1}A'\sigma^2]^{-1}(A\beta)$. If we do not know $\sigma^2$, we usually substitute its estimate $\hat\sigma^2$. The estimate $\hat\sigma^2$ has a chi-square distribution times a constant. Therefore, the new statistic $Q/\mathrm{rank}(A)$ follows an F distribution with non-centrality parameter $\delta = (A\beta)'[A(X'X)^{-1}A'\sigma^2]^{-1}(A\beta)$. It can be proved that the new statistic is a monotonic function of the likelihood ratio statistic (see p. 110, Stapleton, 1995).
It is also noted that $\hat\beta$ can be the maximum likelihood estimate or the least squares estimate, and that they are identical if $e$ follows the normality and independence assumptions. The power function can be expressed as:

$$1 - \mathrm{probchi}\left(\mathrm{cinv}(0.95,\ \mathrm{rank}(A)),\ \mathrm{rank}(A),\ (A\beta)'[A(X'X)^{-1}A'\sigma^2]^{-1}(A\beta)\right) \tag{11}$$

where probchi is the cumulative distribution function for the non-central chi-square, and cinv is the quantile function for the central chi-square.

The test of parameters in the mixed model provides another example. We can formulate the general model as:

$$y = X\beta + Zu + e \tag{12}$$

$$E(y) = X\beta; \quad \mathrm{Var}(y) = V = ZGZ' + R$$

Test $H_0: K'\beta = 0$. $(K'\hat\beta)'[K'(X'V^{-1}X)^{-}K$ [...]

[...] (30)

that is, TSS = SSB + SSE, where SSB is the between sum of squares and SSE is the within sum of squares as in the one-way ANOVA.

$$E\left(\frac{SSE}{J-2}\right) = \sigma^2_{\bar Y} = \frac{\sigma^2}{n} + \tau, \quad \text{and} \quad SSE \sim \left(\frac{\sigma^2}{n} + \tau\right)\chi^2_{J-2}. \tag{31}$$

$$E\left(\frac{SSB}{2-1}\right) = \frac{\sigma^2}{n} + \tau + \frac{J\gamma_{01}^2}{4}, \quad \text{and} \quad \frac{SSB}{2-1} \sim \left(\frac{\sigma^2}{n} + \tau\right)\chi^2_{1}\left(\delta = \frac{J\gamma_{01}^2}{4\left(\frac{\sigma^2}{n} + \tau\right)}\right). \tag{32}$$

The test of $\gamma_{01}$ is $F = \dfrac{SSB}{SSE/(J-2)}$. It follows an F distribution. The power function is

$$1 - \mathrm{probf}\left(\mathrm{finv}(0.95,\ 1,\ J-2),\ 1,\ J-2,\ \frac{J\gamma_{01}^2}{4\left(\frac{\sigma^2}{n} + \tau\right)}\right). \tag{33}$$

It is equivalent to the previous power function (see Appendix B). Therefore, the first principle is true for the 2nd level in the first case.

From textbooks on experimental design (see Montgomery, 1997), the estimate $\hat\sigma^2$ always has the expectation $\sigma^2$ and follows a $\sigma^2\chi^2$ distribution when the design is balanced. We know from the above that

$$E\left(\frac{n\,SSE}{J-2}\right) = \sigma^2 + n\tau. \tag{34}$$

The test statistic for $\tau$ is $\dfrac{n\,SSE/(J-2)}{\hat\sigma^2}$. After transformation, it follows a central F distribution. The power function is

$$1 - \mathrm{probf}\left(\mathrm{finv}(0.95,\ J-2,\ Jn-J)\cdot\frac{\sigma^2}{\sigma^2 + n\tau},\ J-2,\ Jn-J\right). \tag{35}$$

Therefore, the second principle holds true too for the 2nd level.

Now we can generalize the proof to the n+1 level.
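The model can be formulated as follows:

Level n+1:
$$\beta_j^{n+1} = \gamma_0^{n+1} + \gamma_1^{n+1} w_j^{n+1} + u_j^{n+1}, \quad u_j^{n+1} \sim N(0, \tau^{n+1}), \quad j = 1,\ldots,J \tag{36}$$

Level n:
$$Y_{ij}^{n} = \beta_j^{n+1} + r_{ij}^{n}, \quad \mathrm{Var}(Y_{ij}^{n}) = \sigma_n^2, \quad i = 1,\ldots,n \tag{37}$$

$w_j^{n+1}$ takes -1/2 or 1/2 for the control and experimental conditions. $\hat\beta_j^{n+1} = \bar Y_{\cdot j}^{n}$, and $\bar Y_{\cdot 1}^{n}, \ldots, \bar Y_{\cdot J}^{n}$ are independent since the nth-level units are nested in the n+1 level. The test of $\gamma_1^{n+1}$ is the same as a two-sample independent t test. The test statistic has a T distribution with $J-2$ degrees of freedom and a non-centrality parameter

$$\frac{\gamma_1^{n+1}}{\sqrt{\dfrac{4}{J}\left(\dfrac{\sigma_n^2}{n} + \tau^{n+1}\right)}}.$$

The same results duplicate in the n+1 level. Therefore the first and second principles are proved.

Case 2: The level does not have a fixed effect.

Level 2:
$$\beta_{0j} = \gamma_{00} + u_{0j} \tag{38}$$
$$\beta_{1j} = \gamma_{10} + u_{1j} \tag{39}$$
$$\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{bmatrix} \right)$$

Level 1:
$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}, \quad r_{ij} \sim \text{i.i.d. } N(0, \sigma^2) \tag{40}$$

$\hat\beta_{0j} = \bar Y_{\cdot j}$, and $\bar Y_{\cdot 1}, \ldots, \bar Y_{\cdot J}$ are independent and have variance $\sigma^2/n + \tau_{00}$.

$\hat\beta_{1j} = \bar Y_{Ej} - \bar Y_{Cj} = \Delta\bar Y_j$, where $\bar Y_{Ej}$ is the mean for the experimental group at the jth site and $\bar Y_{Cj}$ is the mean for the control group at the jth site. $\Delta\bar Y_1, \ldots, \Delta\bar Y_J$ are independent and have variance $4\sigma^2/n + \tau_{11}$.

The tests of $\gamma_{00}$ and $\gamma_{10}$ are the same as one-sample t tests, with $\hat\beta_{0j}$ and $\hat\beta_{1j}$ as the corresponding observed scores. The estimates are:

$$\hat\gamma_{00} = \frac{\sum_j \bar Y_{\cdot j}}{J}, \quad \widehat{\mathrm{Var}}(\bar Y_{\cdot j}) = \frac{\sum_j (\bar Y_{\cdot j} - \bar Y_{\cdot\cdot})^2}{J-1}, \quad E\big(\widehat{\mathrm{Var}}(\bar Y_{\cdot j})\big) = \sigma^2_{\bar Y_{\cdot j}} = \frac{\sigma^2}{n} + \tau_{00} \tag{41}$$

$$\hat\gamma_{10} = \frac{\sum_j \Delta\bar Y_j}{J}, \quad \widehat{\mathrm{Var}}(\Delta\bar Y_j) = \frac{\sum_j (\Delta\bar Y_j - \Delta\bar Y)^2}{J-1}, \quad E\big(\widehat{\mathrm{Var}}(\Delta\bar Y_j)\big) = \sigma^2_{\Delta\bar Y} = \frac{4\sigma^2}{n} + \tau_{11} \tag{42}$$

The power function for the test of $\gamma_{10}$ is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ J-1),\ J-1,\ \frac{\gamma_{10}}{\sqrt{\left(\frac{4\sigma^2}{n} + \tau_{11}\right)/J}}\right). \tag{43}$$

Therefore the first principle holds true in this case. It should be noted that the estimates

$$\widehat{\mathrm{Var}}(\bar Y_{\cdot j}) = \frac{\sum_j (\bar Y_{\cdot j} - \bar Y_{\cdot\cdot})^2}{J-1}, \quad \hat\gamma_{10} = \frac{\sum_j \Delta\bar Y_j}{J}, \quad \widehat{\mathrm{Var}}(\Delta\bar Y_j) = \frac{\sum_j (\Delta\bar Y_j - \Delta\bar Y)^2}{J-1}$$

are algebraically related to the mean squares for the terms in the ANOVA, which correspond to the parameters $u_{0j}$, $\gamma_{10}$, $u_{1j}$ in the HLM model.

Now we prove that the second principle is true in this case. The estimate $\hat\sigma^2$ again has the expectation $\sigma^2$ and follows a $\sigma^2\chi^2$ distribution.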
$\widehat{\mathrm{Var}}(\bar Y_{\cdot j}) = \dfrac{\sum_j (\bar Y_{\cdot j} - \bar Y_{\cdot\cdot})^2}{J-1}$ is distributed as $\left(\dfrac{\sigma^2}{n} + \tau_{00}\right)\chi^2_{J-1}/(J-1)$. The test statistic for the parameter $u_{0j}$ is $\dfrac{n\,\widehat{\mathrm{Var}}(\bar Y_{\cdot j})}{\hat\sigma^2}$. The numerator is equivalent to the mean square for the site effect in the ANOVA model. After transformation, the statistic has a central F distribution. The power function for the test of $u_{0j}$ is

$$1 - \mathrm{probf}\left(\mathrm{finv}(0.95,\ J-1,\ nJ-2J)\cdot\frac{\sigma^2}{\sigma^2 + n\tau_{00}},\ J-1,\ nJ-2J\right). \tag{44}$$

$\widehat{\mathrm{Var}}(\Delta\bar Y_j) = \dfrac{\sum_j (\Delta\bar Y_j - \Delta\bar Y)^2}{J-1}$ is distributed as $\left(\dfrac{4\sigma^2}{n} + \tau_{11}\right)\chi^2_{J-1}/(J-1)$. The test statistic for the parameter $u_{1j}$ is $\dfrac{(n/4)\,\widehat{\mathrm{Var}}(\Delta\bar Y_j)}{\hat\sigma^2}$. It is the same as the mean square ratio of the treatment-by-site interaction over the within-cell error. After transformation, the test statistic has a central F distribution. The power function is

$$1 - \mathrm{probf}\left(\mathrm{finv}(0.95,\ J-1,\ Jn-2J)\cdot\frac{\sigma^2}{\sigma^2 + n\tau_{11}/4},\ J-1,\ Jn-2J\right). \tag{45}$$

For the n+1 level, we add n+1 superscripts to the parameters in the model. Essentially the proof is the same as the one provided in the previous case.

Creating the HLM table

With the aid of the three principles we may construct an HLM table similar to the ANOVA table and derive the power functions directly. We use the same example from the previous chapter to illustrate the construction of an HLM table.

First, we put all the terms in the combined model into the table and write the subscripts and their corresponding total number of levels in the leftmost column.

Second, we use the same rules to construct the subscript for each term and derive the degrees of freedom as we have done in the ANOVA table.

Third, we fill in the parameters for the random and fixed effects. There are three rules to do so (for simplicity, we restrict all the covariates to take two values):
1. If the outcome at one level is the coefficient of the covariate at the next lower level, divide the variance of the random component of that level by 2. Otherwise, we write down the variance of the random component.
2. For the fixed effect we write down the corresponding coefficient, square it, and then divide it by 2.
3.
For the fixed interaction effect we write down the corresponding coefficient, square it, and then divide it by 4.

Fourth, we refer to rule 8 in the previous chapter to write down the numerator or denominator of the non-centrality parameter for the power functions.

With the aid of the HLM table, we can derive power functions for the tests of the fixed and random effects. For the test of a random component the construction of the power function is the same as in the previous chapter. For the fixed effects all the power functions use a non-central t distribution. The non-centrality parameter is the square root of the ratio of the entry for the fixed effect in the last column over the expectation of the random term against which the fixed effect is tested. In general,

non-centrality parameter = sqrt[ (numerator of non-centrality parameter for the fixed effect) / (denominator of non-centrality parameter for the random effect) ].

The HLM table for the example in the previous chapter is provided below. The subscripts are kept the same to highlight the translation between ANOVA and HLM. In the following chapter all the HLM tables adopt the HLM subscripts and notation. Here h denotes the site characteristic; k the site; i the treatment; j the cluster; and l the individual. a is the number of levels for the treatment factor; b for the cluster factor; c for the site factor; d for the site characteristic factor; and n for the within-cell error.

Factor | Term | Subscript | df | Parameter | Numerator or denominator of non-centrality parameter
h: d | $\gamma_{001} W_h$ | h | $d-1$ | $\gamma_{001}^2/2$ | $abcn\,\gamma_{001}^2/2$
k: c | $u_{00k}$ | k(h) | $(c-1)d$ | $\tau_{\beta_{00}}$ | $\sigma^2 + n\tau_\pi + abn\,\tau_{\beta_{00}}$
i: a | $\gamma_{010} X_i$ | i | $a-1$ | $\gamma_{010}^2/2$ | $bcdn\,\gamma_{010}^2/2$
 | $\gamma_{011} W_h X_i$ | ih | $(a-1)(d-1)$ | $\gamma_{011}^2/4$ | $bcn\,\gamma_{011}^2/4$
 | $u_{01k} X_i$ | ik(h) | $(a-1)(c-1)d$ | $\tau_{\beta_{01}}/2$ | $\sigma^2 + n\tau_\pi + bn\,\tau_{\beta_{01}}/2$
j: b | $r_{0jk}$ | j(ikh) | $a(b-1)cd$ | $\tau_\pi$ | $\sigma^2 + n\tau_\pi$
l: n | $e_{ijk}$ | l(ijkh) | $abcd(n-1)$ | $\sigma^2$ | $\sigma^2$

Table 8: HLM table of the example

The power function for testing the treatment is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ (a-1)(c-1)d),\ (a-1)(c-1)d,\ \sqrt{\frac{bcdn\,\gamma_{010}^2/2}{\sigma^2 + n\tau_\pi + bn\,\tau_{\beta_{01}}/2}}\right). \tag{46}$$

The power function for testing the site characteristic is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ (c-1)d),\ (c-1)d,\ \sqrt{\frac{abcn\,\gamma_{001}^2/2}{\sigma^2 + n\tau_\pi + abn\,\tau_{\beta_{00}}}}\right). \tag{47}$$

The power function for testing the treatment-by-site characteristic interaction is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ (a-1)(c-1)d),\ (a-1)(c-1)d,\ \sqrt{\frac{bcn\,\gamma_{011}^2/4}{\sigma^2 + n\tau_\pi + bn\,\tau_{\beta_{01}}/2}}\right). \tag{48}$$

Chapter 4

FOUR EXAMPLES

We provide the HLM tables for three different designs: a 3-level cluster randomized trial, a multi-site clinical trial with a site covariate, and a combination of cluster randomized trial and multi-site clinical trial (a cluster randomized trial replicated across multiple sites). For each design the power functions are given for the tests of the parameters in the model. Finally we provide the power function for a potential HLM analysis based on the Tennessee classroom size study (Finn & Achilles, 1990).

3-level cluster randomized trial

In school-based intervention studies schools are randomly assigned to the treatment or control condition. Classrooms are nested within each school. The design can be formulated as a 3-level HLM model:

Level 3:
$$\beta_{00k} = \gamma_{000} + \gamma_{001} W_k + u_{00k}, \quad u_{00k} \sim N(0, \tau_\beta), \quad k = 1,\ldots,K \tag{49}$$

Level 2:
$$\pi_{0jk} = \beta_{00k} + r_{0jk}, \quad r_{0jk} \sim N(0, \tau_\pi), \quad j = 1,\ldots,J \tag{50}$$

Level 1:
$$Y_{ijk} = \pi_{0jk} + e_{ijk}, \quad e_{ijk} \sim N(0, \sigma^2), \quad i = 1,\ldots,n \tag{51}$$

$W_k$ takes -1/2 or 1/2 for the control and experimental conditions.

Factor | Term | Subscript | df | Parameter | Numerator or denominator of non-centrality parameter
l: 2 | $\gamma_{001} W_k$ | l | $2-1$ | $\gamma_{001}^2/2$ | $JKn\,\gamma_{001}^2/2$
k: K | $u_{00k}$ | k(l) | $2(K-1)$ | $\tau_\beta$ | $\sigma^2 + n\tau_\pi + nJ\tau_\beta$
j: J | $r_{0jk}$ | j(kl) | $2K(J-1)$ | $\tau_\pi$ | $\sigma^2 + n\tau_\pi$
i: n | $e_{ijk}$ | i(jkl) | $2JK(n-1)$ | $\sigma^2$ | $\sigma^2$

Table 9: HLM table of 3 level CRT

The power function for the test of $\gamma_{001}$ is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ 2(K-1)),\ 2(K-1),\ \sqrt{\frac{JKn\,\gamma_{001}^2/2}{\sigma^2 + n\tau_\pi + nJ\tau_\beta}}\right). \tag{52}$$

The power function for the test of $\tau_\beta$ is

$$1 - \mathrm{probf}\left(\mathrm{finv}(0.95,\ 2(K-1),\ 2K(J-1))\cdot\frac{\sigma^2 + n\tau_\pi}{\sigma^2 + n\tau_\pi + nJ\tau_\beta},\ 2(K-1),\ 2K(J-1)\right). \tag{53}$$

Note to (52): SAS programs can be used to compute the power value of any of the listed functions and to plot a power curve.
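The function inputs are exactly the same as those given here (see Appendix C, the SAS programs).

The power function for the test of $\tau_\pi$ is

$$1 - \mathrm{probf}\left(\mathrm{finv}(0.95,\ 2K(J-1),\ 2JK(n-1))\cdot\frac{\sigma^2}{\sigma^2 + n\tau_\pi},\ 2K(J-1),\ 2JK(n-1)\right). \tag{54}$$

Multi-site clinical trial with a site covariate

The multi-site clinical trial is widely used in mental health research. Patients are randomly assigned to the treatment or control condition at each clinical site. The same study is replicated across a number of clinical sites. The key interests are the average treatment effect across the sites and the variability of the treatment effect among the sites (Raudenbush & Liu, 1999). When the treatment effects vary significantly across sites, it usually implies that the fluctuation of the treatment effect is not simply random but is related to some characteristic of the sites, i.e. a site covariate. The model may be formulated as a two-level HLM:

Level 2:
$$\beta_{0j} = \theta_{00} + \theta_{01} W_j + u_{0j}, \quad u_{0j} \sim N(0, \tau_{00}), \quad j = 1,\ldots,J \tag{55}$$
$$\beta_{1j} = \theta_{10} + \theta_{11} W_j + u_{1j}, \quad u_{1j} \sim N(0, \tau_{11}) \tag{56}$$

Level 1:
$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}, \quad r_{ij} \sim \text{i.i.d. } N(0, \sigma^2), \quad i = 1,\ldots,n \tag{57}$$

where $X_{ij} = 1/2$ if the subject is in the treatment condition and $X_{ij} = -1/2$ if the subject is in the control condition.

Factor | Term | Subscript | df | Parameter | Numerator or denominator of non-centrality parameter
l: 2 | $\theta_{01} W_j$ | l | $2-1$ | $\theta_{01}^2/2$ | $2\cdot\frac{J}{2}\cdot\frac{n}{2}\cdot\frac{\theta_{01}^2}{2}$
j: J/2 | $u_{0j}$ | j(l) | $2(\frac{J}{2}-1)$ | $\tau_{00}$ | $\sigma^2 + 2\cdot\frac{n}{2}\,\tau_{00}$
k: 2 | $\theta_{10} X_{ij}$ | k | $2-1$ | $\theta_{10}^2/2$ | $2\cdot\frac{J}{2}\cdot\frac{n}{2}\cdot\frac{\theta_{10}^2}{2}$
 | $\theta_{11} W_j X_{ij}$ | kl | $(2-1)(2-1)$ | $\theta_{11}^2/4$ | $\frac{J}{2}\cdot\frac{n}{2}\cdot\frac{\theta_{11}^2}{4}$
 | $u_{1j} X_{ij}$ | jk(l) | $2(\frac{J}{2}-1)(2-1)$ | $\tau_{11}/2$ | $\sigma^2 + \frac{n}{2}\cdot\frac{\tau_{11}}{2}$
i: n/2 | $r_{ij}$ | i(jkl) | $2J(\frac{n}{2}-1)$ | $\sigma^2$ | $\sigma^2$

Table 10: HLM table of MST with a covariate at the site level

The power function for the test of $\theta_{01}$ is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ J-2),\ J-2,\ \sqrt{\frac{Jn\,\theta_{01}^2/4}{\sigma^2 + n\tau_{00}}}\right). \tag{58}$$

The power function for the test of $\tau_{00}$ is

$$1 - \mathrm{probf}\left(\mathrm{finv}\big(0.95,\ J-2,\ 2J(\tfrac{n}{2}-1)\big)\cdot\frac{\sigma^2}{\sigma^2 + n\tau_{00}},\ J-2,\ 2J(\tfrac{n}{2}-1)\right). \tag{59}$$

The power function for the test of $\theta_{10}$ is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ J-2),\ J-2,\ \sqrt{\frac{Jn\,\theta_{10}^2/4}{\sigma^2 + \frac{n}{2}\cdot\frac{\tau_{11}}{2}}}\right). \tag{60}$$

The power function for the test of $\theta_{11}$ is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ J-2),\ J-2,\ \sqrt{\frac{\frac{J}{2}\cdot\frac{n}{2}\cdot\frac{\theta_{11}^2}{4}}{\sigma^2 + \frac{n}{2}\cdot\frac{\tau_{11}}{2}}}\right). \tag{61}$$

The power function for the test of $\tau_{11}$ is

$$1 - \mathrm{probf}\left(\mathrm{finv}\big(0.95,\ J-2,\ 2J(\tfrac{n}{2}-1)\big)\cdot\frac{\sigma^2}{\sigma^2 + \frac{n}{2}\cdot\frac{\tau_{11}}{2}},\ J-2,\ 2J(\tfrac{n}{2}-1)\right). \tag{62}$$

A combination of cluster randomized trial (CRT) and multi-site trial (MST)

This design has the features of both CRT and MST. At each site there is a CRT, and the same CRT is replicated across a number of sites. For example, a school-based intervention CRT can be conducted across a number of different school districts. It then becomes a 3-level HLM, and the model is as follows:

Level 3:
$$\beta_{00k} = \gamma_{000} + u_{00k}, \quad u_{00k} \sim N(0, \tau_{\beta_{00}}), \quad k = 1,\ldots,K \tag{63}$$
$$\beta_{01k} = \gamma_{010} + u_{01k}, \quad u_{01k} \sim N(0, \tau_{\beta_{01}}) \tag{64}$$

Level 2:
$$\pi_{0jk} = \beta_{00k} + \beta_{01k} X + r_{0jk}, \quad r_{0jk} \sim N(0, \tau_\pi), \quad j = 1,\ldots,J \tag{65}$$

Level 1:
$$Y_{ijk} = \pi_{0jk} + e_{ijk}, \quad e_{ijk} \sim N(0, \sigma^2), \quad i = 1,\ldots,n \tag{66}$$

Factor | Term | Subscript | df | Parameter | Numerator or denominator of non-centrality parameter
k: K | $u_{00k}$ | k | $K-1$ | $\tau_{\beta_{00}}$ | $\sigma^2 + n\tau_\pi + 2\cdot\frac{J}{2}\,n\,\tau_{\beta_{00}}$
l: 2 | $\gamma_{010} X$ | l | $2-1$ | $\gamma_{010}^2/2$ | $K\cdot\frac{J}{2}\,n\cdot\frac{\gamma_{010}^2}{2}$
 | $u_{01k} X$ | kl | $(K-1)(2-1)$ | $\tau_{\beta_{01}}/2$ | $\sigma^2 + n\tau_\pi + \frac{J}{2}\,n\cdot\frac{\tau_{\beta_{01}}}{2}$
j: J/2 | $r_{0jk}$ | j(kl) | $2K(\frac{J}{2}-1)$ | $\tau_\pi$ | $\sigma^2 + n\tau_\pi$
i: n | $e_{ijk}$ | i(jkl) | $2\cdot\frac{J}{2}\,K(n-1)$ | $\sigma^2$ | $\sigma^2$

Table 11: HLM table of a combination of CRT and MST

The power function for the test of $\tau_{\beta_{00}}$ is

$$1 - \mathrm{probf}\left(\mathrm{finv}\big(0.95,\ K-1,\ 2K(\tfrac{J}{2}-1)\big)\cdot\frac{\sigma^2 + n\tau_\pi}{\sigma^2 + n\tau_\pi + nJ\tau_{\beta_{00}}},\ K-1,\ 2K(\tfrac{J}{2}-1)\right). \tag{67}$$

The power function for the test of $\gamma_{010}$ is

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ (K-1)(2-1)),\ (K-1)(2-1),\ \sqrt{\frac{K\cdot\frac{J}{2}\,n\cdot\frac{\gamma_{010}^2}{2}}{\sigma^2 + n\tau_\pi + \frac{J}{2}\,n\cdot\frac{\tau_{\beta_{01}}}{2}}}\right). \tag{68}$$

The power function for the test of $\tau_{\beta_{01}}$ is

$$1 - \mathrm{probf}\left(\mathrm{finv}\big(0.95,\ (K-1)(2-1),\ 2K(\tfrac{J}{2}-1)\big)\cdot\frac{\sigma^2 + n\tau_\pi}{\sigma^2 + n\tau_\pi + \frac{J}{2}\,n\cdot\frac{\tau_{\beta_{01}}}{2}},\ (K-1)(2-1),\ 2K(\tfrac{J}{2}-1)\right). \tag{69}$$

The power function for the test of $\tau_\pi$ is

$$1 - \mathrm{probf}\left(\mathrm{finv}\big(0.95,\ 2K(\tfrac{J}{2}-1),\ JK(n-1)\big)\cdot\frac{\sigma^2}{\sigma^2 + n\tau_\pi},\ 2K(\tfrac{J}{2}-1),\ JK(n-1)\right). \tag{70}$$

Multi-site trial with a continuous covariate and a site characteristic

Studies of school effectiveness relate different types of school policy to students' achievement. Data are often collected on students from a number of schools, which can be classified by their policy types.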
The students are nested in individual schools, and schools are nested in different policy types. HLM is often used to analyze such nested data. The school policy types are considered as a school-level categorical covariate; the students' background information is modeled as student-level variables, which can be either categorical or continuous. For example, students' gender is a categorical variable, and their scores on achievement tests are continuous variables. A generic model may be constructed as follows:

School-level:
$$\beta_{0j} = \theta_{00} + \theta_{01} W_j + u_{0j}, \quad u_{0j} \sim N(0, \tau_{00}), \quad j = 1,\ldots,J \tag{71}$$
$$\beta_{1j} = \theta_{10} + \theta_{11} W_j + u_{1j}, \quad u_{1j} \sim N(0, \tau_{11}) \tag{72}$$
$$\beta_{2j} = \theta_{20} + \theta_{21} W_j + u_{2j}, \quad u_{2j} \sim N(0, \tau_{22}) \tag{73}$$

where $W_j$ is a school-level categorical variable, assumed to be dichotomous for simplicity.

Student-level:
$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{1ij} + \beta_{2j} X_{2ij} + r_{ij}, \quad r_{ij} \sim \text{i.i.d. } N(0, \sigma_c^2), \quad i = 1,\ldots,n \tag{74}$$

where $X_1$ is a categorical variable that takes 1/2 and -1/2 for a student-level dichotomous characteristic; $X_2$ is a continuous variable, i.e. a continuous covariate; and $\sigma_c^2$ is the student-level error variance with the inclusion of a continuous covariate.

Kreft (1993) used the same type of HLM model to study the effect of selective school recruitment on students' success. The sample contains 70 secondary schools in Amsterdam. Some schools selectively admit students based on their scores on achievement tests, and the other schools admit all students regardless of their scores. So the selective policy of schools is the school-level covariate, and it is represented by $W_j$ in the model. The student-level variables contain gender, the score on an achievement test, and their interaction. Gender corresponds to $X_1$ in the model and the test score to $X_2$. The interaction can be deemed an additional continuous covariate like $X_2$ (we limit the number of continuous covariates in the model to one for simplicity, though the results generalize). If we plan a similar study, we can use the same model.
Assuming the model is balanced, we may derive the power function for the test of school types, i.e. the test of parameter $\theta_{01}$. The power function is based on equation (58) except that $\sigma^2$ in (58) is replaced by $\sigma_c^2$, that is,

$$1 - \mathrm{probt}\left(\mathrm{tinv}(0.95,\ J-2),\ J-2,\ \sqrt{\frac{Jn\,\theta_{01}^2/4}{\sigma_c^2 + n\tau_{00}}}\right). \tag{75}$$

This is because we reduce the above-mentioned model to a multi-site trial with a site characteristic. If we move the continuous covariate $X_2$ to the left side of equation (74), (74) changes into (76), and (76) is equivalent to equation (57), which is the level-1 model in the multi-site trial with a site characteristic.

$$Y_{ij}^{*} = Y_{ij} - \beta_{2j} X_{2ij} = \beta_{0j} + \beta_{1j} X_{1ij} + r_{ij} \tag{76}$$

After changing $Y_{ij}$ into the adjusted $Y_{ij}^{*}$, we can apply the power functions of the multi-site trial with a site characteristic to the above-mentioned generic model.

Chapter 5

NUMERICAL APPLICATION OF POWER FUNCTIONS IN PLANNING EDUCATIONAL STUDIES

The power function evaluates the probability of rejecting the null hypothesis in our study. Since most studies aim to reject the null hypothesis, statistical power becomes a natural criterion for evaluating the soundness of a research plan. In the following we examine statistical power in two designs using HLM, i.e. the cluster randomized trial and the multi-site trial. In each design we pose a research question. Appropriate power functions are then chosen to determine the sample sizes. At the end the two designs are compared in terms of power.

Cluster randomized trial

The cluster randomized trial is widely used in educational research. For example, schools are randomly assigned to the treatment or control condition. Students in the same school tend to share common characteristics, and their responses to the treatment may not be independent of each other. The nested nature of the design requires HLM analysis (see Raudenbush, 1997). The model may be formulated as follows:

Level 1:
$$Y_{ij} = \beta_{0j} + r_{ij}, \quad r_{ij} \sim N(0, \sigma^2), \quad i \in (1, 2, \ldots, n), \quad j \in (1, 2, \ldots, J) \tag{77}$$

where $Y_{ij}$
is the individual score; $\beta_{0j}$ is the mean of the $j$th cluster; $r_{ij}$ is the individual error; $n$ is the number of subjects in each cluster; and $J$ is the total number of clusters.

Level 2:

$$\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00}), \tag{78}$$

where $\gamma_{00}$ is the grand mean; $\gamma_{01}$ is the treatment effect; $W_j$ takes 1/2 for the treatment condition and -1/2 for the control condition; and $u_{0j}$ is the cluster effect.

The combined model is therefore:

$$Y_{ij} = \gamma_{00} + \gamma_{01} W_j + u_{0j} + r_{ij}. \tag{79}$$

The derived HLM table is as follows:

| Term | Subscript (index range) | df | Parameter | Numerator/denominator of non-centrality parameter |
| $\gamma_{01} W_j$ | $k$ ($k$: 2) | $1$ | $\gamma_{01}$ | $J n \gamma_{01}^2 / 4$ |
| $u_{0j}$ | $j(k)$ ($j$: $J/2$) | $J-2$ | $\tau_{00}$ | $\sigma^2 + n\tau_{00}$ |
| $r_{ij}$ | $i(jk)$ ($i$: $n$) | $J(n-1)$ | $\sigma^2$ | $\sigma^2$ |

The power function for the test of the main treatment effect is

$$1 - \text{probt}\!\left(\text{tinv}(0.95,\, J-2),\; J-2,\; \sqrt{\frac{J n \gamma_{01}^2}{4(\sigma^2 + n\tau_{00})}}\right). \tag{80}$$

The power function for the test of the cluster effect is

$$1 - \text{probf}\!\left(\text{finv}(0.95,\, J-2,\, J(n-1)) \cdot \frac{\sigma^2}{\sigma^2 + n\tau_{00}},\; J-2,\; J(n-1)\right). \tag{81}$$

The variance components and the effect size $\gamma_{01}$ are real-valued parameters and depend on the measurement scale. In planning a specific study we rarely know those parameters. However, functions of those parameters are available from previous studies of a similar nature. In the cluster randomized trial, $\rho = \tau_{00} / (\sigma^2 + \tau_{00})$ is reported as the intraclass correlation coefficient in most previous studies using the same design. It varies from 0 to 1.0. $\delta = \gamma_{01} / \sqrt{\sigma^2 + \tau_{00}}$ is the standardized effect size, whose magnitude is easy to evaluate. $\delta$ may be assumed to take the values 0.2, 0.5, and 0.8 for small, medium, and large effects (Cohen, 1988). It is therefore natural to translate the variance components and effect size into these functional forms, whose values we can get from previous studies. After reparameterization the power function for the test of the main effect becomes

$$1 - \text{probt}\!\left(\text{tinv}(0.95,\, J-2),\; J-2,\; \frac{\delta}{\sqrt{\dfrac{4}{J}\left(\dfrac{1-\rho}{n} + \rho\right)}}\right). \tag{82}$$

The power function for the test of the cluster effect becomes

$$1 - \text{probf}\!\left(\text{finv}(0.95,\, J-2,\, J(n-1)) \cdot \frac{1-\rho}{1-\rho+n\rho},\; J-2,\; J(n-1)\right). \tag{83}$$

We may substitute the hypothesized parameter values into the power function and plot the power against a sample size variable, e.g. $J$, the number of clusters, or $n$, the number of subjects in each cluster. An appropriate sample size that attains a desired power level may then be read from the power curve. Depending on our research question we may use different power functions in planning the study. In the following we present a typical research problem.

An educational researcher wants to design a school-based intervention study. The researcher is interested in comparing the effects of two counseling programs on students' morale and academic aspiration. The outcome will be a composite score of morale and academic aspiration on a continuous scale. It is logistically feasible to administer only one counseling program in a school at a time, so the researcher decides to use the cluster randomized trial. The schools, as clusters, are randomly assigned to one counseling program or the other. The researcher has 10 participating schools and wants to know how many students should be recruited in each school. Since the effect of the counseling programs corresponds to the treatment effect in the model, the power function for the test of the main effect of treatment should be used to choose the sample size. Assume that the researcher gets an intraclass correlation coefficient from previous school-based studies, e.g. $\rho = 0.05$, and a standardized effect size of 0.5 from a preliminary study. The power function can be plotted over the sample size $n$ as in figure 1 (see table 12 in appendix D for numerical values). If the cluster size $n$ is set to 20, then the power will be 0.75. The choice of a sample size of 20 at each school is therefore justified.

[Figure 1: power in the cluster randomized trial. Vertical axis: power (0 to 1.0); horizontal axis: n (0 to 50). Upper curve for fixed effect; lower curve for random effect.]

Observing the power function, we may notice how $n$ and $J$ influence the power given the parameters $\rho$ and $\delta$. If $\rho$ is small, like 0.05 in the previous case, then most of the variation among students' scores occurs within schools. If it is more costly to sample clusters than to sample people within clusters, then increasing $n$ is a more efficient way to raise the power than increasing $J$. Increasing $n$ greatly reduces the denominator of the non-centrality parameter in the power function and thus increases the power. This is exactly what figure 1 shows. On the contrary, if $\rho$ is large, then increasing $J$ is a more efficient way to get high power than increasing $n$ (see Raudenbush, 1997).

Note that achieving high power for one test may leave the power of the other tests low. In the cluster randomized trial, obtaining the desired power for the test of the treatment effect does not guarantee high power for the test of the cluster effect. In figure 1 the lower curve represents the power for the test of the cluster effect. Its power is clearly much lower than the power for the test of the treatment. Such a conflict is easily resolved if the researcher ranks the individual tests by the importance of the research questions they answer. The power function of the test that answers the key research question is used to choose the sample size. In the current example the main effect of treatment is of primary interest. The test of the treatment effect outweighs the test of the cluster effect, and the choice of sample size should be based on the power of the test of the main effect of treatment.
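The treatment-effect power in (82) can also be checked outside SAS. The following is a minimal Python sketch that uses only the standard library. The helper names (`nct_cdf`, `t_quantile`, `power_crt_treatment`) and the midpoint-rule integration are illustrative assumptions made here; they are not the algorithms behind the SAS functions `probt` and `tinv`.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def nct_cdf(t, df, nc=0.0, steps=4000):
    """P(T' <= t) for a noncentral t with df degrees of freedom.

    Writes T' = (Z + nc) / S with S = sqrt(chi2_df / df), so that
    P(T' <= t) = E[Phi(t * S - nc)], and integrates over the density
    of S with a midpoint rule (an illustrative numerical choice).
    """
    log_const = (math.log(2.0) + 0.5 * df * math.log(df)
                 - 0.5 * df * math.log(2.0) - math.lgamma(0.5 * df))
    upper = 1.0 + 10.0 / math.sqrt(2.0 * df)  # S has mean near 1
    h = upper / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        log_pdf = log_const + (df - 1.0) * math.log(s) - 0.5 * df * s * s
        total += norm_cdf(t * s - nc) * math.exp(log_pdf)
    return total * h


def t_quantile(p, df):
    """Central t quantile, found by bisection on nct_cdf with nc = 0."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if nct_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def power_crt_treatment(J, n, rho, delta, alpha=0.05):
    """Power of the treatment test in a balanced cluster randomized
    trial, following (82): one-sided level alpha, df = J - 2."""
    nc = delta / math.sqrt(4.0 * ((1.0 - rho) / n + rho) / J)
    crit = t_quantile(1.0 - alpha, J - 2)
    return 1.0 - nct_cdf(crit, J - 2, nc)


if __name__ == "__main__":
    # Values from the counseling example: J = 10, rho = 0.05, delta = 0.5.
    print(round(power_crt_treatment(J=10, n=20, rho=0.05, delta=0.5), 2))
```

With the values of the counseling example, the result should agree with the entry for n = 20 in table 12 up to rounding. Looping the last call over a range of n gives the points of the upper curve in figure 1.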
Multi-site trial

The multi-site trial is a popular design because it is easy to administer. At each site there is an independent randomized experiment, and the same experiment is replicated across a number of sites (see Raudenbush and Liu, 1999). The model may be formulated as a 2-level HLM:

Level 1:

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^2), \qquad i = 1, 2, \ldots, n; \quad j = 1, 2, \ldots, J \tag{84}$$

where $Y_{ij}$ is the individual score; $\beta_{0j}$ is the mean at the $j$th site; $\beta_{1j}$ is the treatment effect at the $j$th site; and $r_{ij}$ is the within-cell error.

Level 2:

$$\beta_{0j} = \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00}); \tag{85}$$

$$\beta_{1j} = \gamma_{10} + u_{1j}, \qquad u_{1j} \sim N(0, \tau_{11}); \tag{86}$$

where $\gamma_{00}$ is the grand mean; $\gamma_{10}$ is the main effect of treatment; and $\text{Cov}(u_{0j}, u_{1j}) = \tau_{01}$ is the covariance between the site mean and the treatment effect.

The combined model is

$$Y_{ij} = \gamma_{00} + \gamma_{10} X_{ij} + u_{0j} + u_{1j} X_{ij} + r_{ij}. \tag{87}$$

If we express the combined model in ANOVA notation, it becomes equation (6). The terms in both models are arranged in the same order. The HLM table for the multi-site trial is as follows:

| Term | Subscript (index range) | df | Parameter | Numerator or denominator of non-centrality parameter |
| $\gamma_{10} X_{ij}$ | $k$ ($k$: 2) | $1$ | $\gamma_{10}$ | $J n \gamma_{10}^2 / 4$ |
| $u_{1j} X_{ij}$ | $jk$ | $J-1$ | $\tau_{11}$ | $\sigma^2 + n\tau_{11}/4$ |
| $u_{0j}$ | $j$ ($j$: $J$) | $J-1$ | $\tau_{00}$ | $\sigma^2 + n\tau_{00}$ |
| $r_{ij}$ | $i(jk)$ ($i$: $n/2$) | $2J(n/2-1)$ | $\sigma^2$ | $\sigma^2$ |

The power function for the test of the main effect of treatment is

$$1 - \text{probt}\!\left(\text{tinv}(0.95,\, J-1),\; J-1,\; \sqrt{\frac{J n \gamma_{10}^2}{4\sigma^2 + n\tau_{11}}}\right). \tag{88}$$

The power function for the test of the treatment-by-site interaction is

$$1 - \text{probf}\!\left(\text{finv}(0.95,\, J-1,\, J(n-2)) \cdot \frac{\sigma^2}{\sigma^2 + n\tau_{11}/4},\; J-1,\; J(n-2)\right). \tag{89}$$

Observing the HLM model, we may notice that $\gamma_{10}$ is the unstandardized treatment effect, and that $\tau_{11}$ is the variance of the unstandardized treatment effects across individual sites. As in the cluster randomized trial, we translate those parameters into functional forms whose values can be conjectured. $\gamma_{10}$ is transformed into $\delta = \gamma_{10} / \sigma$, which is a standardized effect size. Similarly, $\tau_{11}$ becomes $\sigma_\delta^2 = \tau_{11} / \sigma^2$, i.e. the variance of the standardized treatment effects across sites; its value may be set at 0.05, 0.10, and 0.15 for small, medium, and large variability (Raudenbush & Liu, 1999). The power functions with the new parameterization are as follows:

$$1 - \text{probt}\!\left(\text{tinv}(0.95,\, J-1),\; J-1,\; \frac{\delta}{\sqrt{\dfrac{4}{Jn} + \dfrac{\sigma_\delta^2}{J}}}\right); \tag{90}$$

and

$$1 - \text{probf}\!\left(\text{finv}(0.95,\, J-1,\, J(n-2)) \cdot \frac{1}{1 + n\sigma_\delta^2/4},\; J-1,\; J(n-2)\right). \tag{91}$$

We may use either of the power functions to choose the sample size. The choice depends on the research question in the study. If the researcher tries to find out whether an innovative instruction program is better than the routine program, then the main effect of instruction is of primary importance, and the power function for the test of the treatment effect should be used to choose the sample size. If, on the contrary, the researcher is concerned about whether the treatment effect varies with the administration of the treatments at individual sites, the power function for the treatment-by-site effect should be used to select a sample size.

Suppose that a researcher is interested in the differential effect of two tutoring methods, and that he or she conjectures a medium effect size of 0.5 and a medium effect size variability across sites of 0.10, and that there are 10 participating schools. He or she wants to know how many students at each school should be recruited to maintain the power of the test of the treatment at 0.75. The power can be plotted over a range of possible sample sizes $n$ (see figure 2 and table 13 in appendix D). The power of the test for the treatment-by-site interaction (random effect) is also plotted over the same range of $n$. The power rises very quickly as $n$ increases, and it reaches 0.76 when $n$ is 14. So a sample size of 14 gives the researcher a good chance of detecting a medium treatment effect. It is easy to see that the power for the treatment is much higher than the power for the interaction. This does not affect the adequacy of the research design.
Although the interaction is included in the model, it is not of primary interest here. Its inclusion allows us to trace the sources of the variance and get a good estimate of each variance component.

Observing the power function (90), we can also see the effect of $J$ on power. For a constant $n$, increasing $J$ will raise the power because it increases the non-centrality parameter in the power function. This is especially true when the effect size variability is large.

[Figure 2: power in the multi-site trial. Vertical axis: power (0 to 1.0); horizontal axis: n (0 to 50). Upper curve for fixed effect; lower curve for random effect.]

In the multi-site trial the treatment conditions are crossed with the sites, and the design allows the estimation of the treatment-by-site interaction in addition to the estimation of the site random effect. In the cluster randomized trial, the cluster-by-treatment interaction is not estimable and is swept under the cluster random effect. This increases our uncertainty about the source of the variation in subjects' responses to the treatments, and it in turn enlarges the variance of our estimate of the treatment effect. As the variance of the estimate of the treatment effect increases, we are less likely to reject the null hypothesis, and statistical power falls.

Comparing the two designs in terms of power, we can see that the multi-site trial is superior to the cluster randomized trial. The same sample size $n$ returns higher power in the multi-site trial than in the cluster randomized trial. For example, when $n$ is set at 14, power is 0.76 for the multi-site trial and 0.68 for the cluster randomized trial (see tables 12 and 13 in appendix D). When $n$ is chosen to be 20, power is 0.85 for the multi-site trial and 0.75 for the cluster randomized trial.
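The gap between the two designs at n = 14 can be traced to the non-centrality parameters of (82) and (90). The following worked arithmetic uses the values of the two examples (J = 10, δ = 0.5, ρ = 0.05 for the cluster randomized trial, and effect size variability 0.10 for the multi-site trial). The symbol λ for the non-centrality parameter is introduced here only for this illustration.

```latex
\begin{align*}
\lambda_{\text{CRT}} &= \frac{\delta}{\sqrt{\dfrac{4}{J}\left(\dfrac{1-\rho}{n}+\rho\right)}}
  = \frac{0.5}{\sqrt{\dfrac{4}{10}\left(\dfrac{0.95}{14}+0.05\right)}}
  = \frac{0.5}{\sqrt{0.04714}} \approx 2.30, \\[4pt]
\lambda_{\text{MST}} &= \frac{\delta}{\sqrt{\dfrac{4}{Jn}+\dfrac{\sigma_\delta^2}{J}}}
  = \frac{0.5}{\sqrt{\dfrac{4}{140}+\dfrac{0.10}{10}}}
  = \frac{0.5}{\sqrt{0.03857}} \approx 2.55.
\end{align*}
```

The multi-site trial has both the larger non-centrality parameter and one more degree of freedom for the t test (J - 1 = 9 against J - 2 = 8), which is consistent with the tabled powers of 0.76 and 0.68.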
In addition, site and cluster variability both work against the power of the test of the main effect of treatment. In the examples the multi-site trial accommodates a higher site variability than the cluster randomized trial. In the multi-site trial the site variability is set at a moderate level, i.e. an effect size variability of 0.10, whereas in the cluster randomized trial the proportion of the total variance due to clusters is 0.05, which is considered low. In short, the multi-site trial outperforms the cluster randomized trial in terms of power even under unfavorable conditions. However, the choice of design may depend on other logistical issues. If the schools cannot give different treatments to their students at the same time, the cluster randomized trial may become the preferable design, since it does not require that subjects receive different treatments at one place.

In sum, the power functions may be used to assess the statistical adequacy of the sample size in a given design. They can also be used to compare different designs in terms of power. The key is to come up with a reasonable set of parameters that are meaningful to researchers. Once the parameter values are chosen, the power function can be plotted over a plausible range of sample sizes. It is then easy to determine a desirable power level and sample size for the design.

Conclusion

Sample size plays an important role in educational and social research. Prediction studies need a large enough sample to make sound generalizations. The larger the sample, the more stable the estimates become. Sample size is related to the extent to which the model can make an accurate prediction in the general case (Brooks et al., 1996). Other studies, which test a research hypothesis, also involve a choice of sample size. The test of the parameter needs a large enough sample to make the final inference defensible.
The larger the sample the study uses, the more information it can generate, and the more confidently a conclusion can be drawn about the detected treatment effect. Such confidence in the conclusion is related to the probability with which we can reject the null hypothesis and confirm our belief in the alternative hypothesis. The probability of rejecting the null hypothesis is the statistical power of the test, and the power is related to the sample size of the study. The larger the sample, the higher the statistical power that can be achieved.

Sample size determination depends on the power function of the relevant test. When the model becomes complicated, it is hard to derive the power functions. This is especially true in multilevel modeling. The model is complex because the coefficients at the lower level are treated as random at the higher level. The estimation of parameters follows sophisticated algorithms, e.g. iterative generalized least squares or the expectation maximization (EM) procedure. It is very hard to estimate the power of all the tests in all cases. However, we can simplify the derivation of the power functions by placing some reasonable constraints on our model.

We may impose a balanced design requirement on the power analysis. Since studies are usually planned with a balanced design, it is quite practical to apply this constraint in the power analysis. Once we limit our investigation to balanced designs, we effectively eliminate the differences among the many estimation methods for the parameters. They converge on the same estimate when the design is balanced. This gives us a unique solution to the power analysis of multilevel models.

However, the power will vary under an unbalanced design. The unbalanced design may be the result of either missing values or an unbalanced sampling plan. In the first case the power should be lower because of the information loss in the data. It may spuriously be higher or lower than it should be.
This may be true if the data are not missing at random and the imputation methods are not properly used. We discuss below the logic of power attrition when data are missing. Assume that we use the multiple imputation method. Distributions are first hypothesized for the missing values, and then multiple values are generated for each missing value from those distributions to yield multiple complete data sets. The routine analysis is then performed on those data sets to produce multiple estimates of the same parameter, and the multiple estimates are averaged to give the final estimate of the parameter. The variance of the final estimate consists of two components: the first is the average of the variances of the imputed estimates; the second is the sample variance of those estimates. When the data are complete, only the first component exists. Therefore, the variance of the final estimate from the multiple imputation method is larger than it would be if no data were missing (Schafer and Olsen, 1998; Rubin, 1987). The larger the variance of the estimate, the less likely the test is to reject the null hypothesis. The power of the test therefore decreases.

The unbalanced design may also arise from a sampling plan. Some sampling units may naturally have more subjects than other units. In general the power will be lower than in a balanced design with the same total sample size. It is difficult to assess the change in power without real cases. There are many procedures for adjusting for unbalanced designs in the data analysis, and those procedures may vary in their power. In addition, the distribution of the test statistic often depends on the specific procedure used. If the departure from a balanced design is not severe, we may treat it as a balanced design and calculate power by substituting the average sample sizes or their harmonic means into the power functions.
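As a small illustration of this last suggestion, the following Python sketch computes the harmonic mean and the arithmetic mean of a set of cluster sizes. The cluster sizes are hypothetical, and the function name `effective_cluster_size` is introduced here only for illustration; either value could then stand in for n in the balanced-design power functions.

```python
from statistics import harmonic_mean, mean


def effective_cluster_size(sizes, use_harmonic=True):
    """Return a single cluster size to stand in for n in a
    balanced-design power function: the harmonic mean (default)
    or the arithmetic mean of the observed cluster sizes."""
    return harmonic_mean(sizes) if use_harmonic else mean(sizes)


if __name__ == "__main__":
    sizes = [10, 20, 40]  # hypothetical unbalanced cluster sizes
    print(effective_cluster_size(sizes))                      # harmonic mean
    print(effective_cluster_size(sizes, use_harmonic=False))  # arithmetic mean
```

The harmonic mean never exceeds the arithmetic mean, so substituting it gives the more conservative power value of the two.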
Under the balanced design, multilevel modeling can be carried out with two approaches: mixed ANOVA and hierarchical linear models. They are essentially the same in the planning stage of a study. The dissertation points out the connection between the two approaches. Each can be translated directly into the other. In the former approach it is easier to do the power analysis for tests of fixed effects at more than one level. The second approach (HLM) gives flexibility and advantages in the data analysis stage because it accommodates missing values and the unbalanced designs of real data.

The dissertation shows the power analysis for both approaches. With the HLM approach the dissertation introduces a handy HLM table for deriving the power functions of the parameters in the model. In HLM the power analysis uses the estimates of the parameters at the lower level as the outcome for the parameters at the higher level. At each level the model is simplified to a linear regression, which takes the form of either a one-sample t test or a two-sample independent t test. The power functions of the relevant parameters are derived as in the case of the one-sample or two-sample t test. The expectation of the estimated variance of a random component follows a pattern from the lower level to the higher level. Through algebraic transformations we may use the estimates of variance to test each random component at each level. The estimates of the treatment and variance components are algebraically related to the mean squares of their counterparts in ANOVA. The ANOVA tables provide a scaffold for systematically developing power functions for the key parameters in HLM.

With slight modification the power analysis can be extended to the case of continuous covariates at each level in HLM. We use the CRT as an example to illustrate the approach and then generalize it to any level.
We may assume some covariates at the 2nd level and modify equation (27) as follows:

$$\beta_{0j} = \gamma_0 + \gamma_1 W_j + \gamma_2 X_{1j} + \gamma_3 X_{2j} + \cdots + u_j, \qquad u_j \sim N(0, \tau), \qquad j = 1, \ldots, J. \tag{92}$$

To simplify the computation, we assume that the population coefficients of those covariates are known, that the percentage of variation in $\beta_{0j}$ due to the covariates is known (we may substitute empirical estimates from previous studies), and that the covariates are not collinear with the fixed effects (randomization or matching subjects on the covariates can help to achieve that). If we leave those covariates out of the analysis, we force $\tau$ to be larger than it should be: the variation due to the covariates is swept under the random error at that level. To assess the change in power due to the inclusion of covariates, we may adjust the $\tau$ parameters in the power function by a percentage score, that is

$$\eta = \frac{\tau_c}{\tau}, \tag{93}$$

where $\eta$ is the ratio of the reduced variance $\tau_c$, obtained after the inclusion of covariates, to the original $\tau$. For example, the adjusted power function for testing the treatment effect becomes

$$1 - \text{probt}\!\left(\text{tinv}(0.95,\, J-2),\; J-2,\; \frac{\gamma_1}{\sqrt{\dfrac{4}{J}\left(\dfrac{\sigma^2}{n} + \eta\tau\right)}}\right). \tag{94}$$

Of course, the computed power value will be approximate. It should be higher than the real one because it does not account for the estimation of the covariate coefficients. Estimating the covariate coefficients consumes some information in the data, which might otherwise be used to gain precision in estimating the treatment effect. If we take into account the collinearity between the covariates and the fixed effect, then the variance estimate for the treatment effect will be increased correspondingly (see Raudenbush, 1997) and power will decrease correspondingly. In short, the real power value falls between the unadjusted power function and the adjusted power function. To generalize the approach to any level, we may hypothesize a percentage score $\eta = \tau_c / \tau$ for each level, where $\tau$ is the random variance at that level and $\tau_c$ is the reduced random variance due to the inclusion of covariates. We may obtain those percentage scores from previous studies and then adjust the random error parameters in the power function by their corresponding percentage scores.

When planning a study, researchers can standardize the parameters in the power functions and bypass the assumption of full knowledge of the key parameters. This makes it easy to plan a study. Raudenbush (1997) and Raudenbush & Liu (1999) have proposed standardization schemes for the cluster randomized trial and the multi-site clinical trial. They can be adapted to general cases, because every two adjacent levels of an HLM essentially form a CRT or an MST.

The rules in the dissertation may form the basis of computer software that computes the power of the tests of key parameters in HLM. Hopefully the dissertation will become a stepping stone to a serious investigation of the power analysis of general HLM, e.g. with categorical and multivariate outcomes.

APPENDICES

APPENDIX A

DEFINITIONS OF THE USED PROBABILITY FUNCTIONS

Noncentral T cumulative distribution function: probt(x, degrees of freedom, non-centrality parameter)

Quantile function for the central T distribution: tinv(cumulated probability, degrees of freedom)

Central F cumulative distribution function: probf(x, df for the numerator, df for the denominator)

Noncentral F cumulative distribution function: probf(x, df for the numerator, df for the denominator, non-centrality parameter)

Quantile function for the central F: finv(cumulated probability, df for the numerator, df for the denominator)

Noncentral Chi cumulative distribution function: probchi(x, df, non-centrality parameter)

Quantile function for the central Chi: cinv(cumulated probability, df)

APPENDIX B

CONVERSION BETWEEN NON-CENTRAL T' AND F'

Definition of non-central T' and F':

$$T'_v(\delta) \sim \frac{U + \delta}{\sqrt{\chi^2_v / v}},$$

where $U$ is a standard normal random variable.
$$F'_{1,v}(\delta^2) \sim \frac{\chi'^2_1(\delta^2)}{\chi^2_v / v} = \frac{(U+\delta)^2}{\chi^2_v / v} = \left(T'_v(\delta)\right)^2.$$

Therefore

$$T'_v(\delta) = \begin{cases} \sqrt{F'_{1,v}(\delta^2)}, & T'_v(\delta) \ge 0, \\ -\sqrt{F'_{1,v}(\delta^2)}, & T'_v(\delta) < 0. \end{cases}$$

Also we state the following results without proof, since the proof uses the same logic as the following derivation:

II. Conversion in power of the two-sided test between non-central T' and F'

$$\begin{aligned}
\text{power} &= P\!\left[T'_v(\delta) \ge t_{1-\alpha/2,\,v}\right] + P\!\left[T'_v(\delta) \le t_{\alpha/2,\,v}\right] \\
&= P\!\left[T'_v(\delta) \ge \sqrt{F_{1-\alpha,\,1,\,v}}\right] + P\!\left[T'_v(\delta) \le -\sqrt{F_{1-\alpha,\,1,\,v}}\right] \\
&= P\!\left[\left(T'_v(\delta)\right)^2 \ge F_{1-\alpha,\,1,\,v}\right] \\
&= P\!\left[F'_{1,v}(\delta^2) \ge F_{1-\alpha,\,1,\,v}\right]
\end{aligned}$$

APPENDIX C

SAS PROGRAM TO COMPUTE POWER

    /*=================================================================
      THIS SAS PROGRAM IS USED TO COMPUTE THE VALUES OF POWER FUNCTIONS
      IN THE DISSERTATION. THE FUNCTIONS SHOULD BE ENTERED AS THEY
      APPEAR IN THE DISSERTATION; AND ALL THE PARAMETERS SHOULD BE
      REAL VALUES.
    =================================================================*/
    %KEYDEF F1 'END; PGM; REC; SUB';
    %LET P=;
    %LET FUN=;
    %WINDOW FUNCTION COLOR=CYAN ROWS=30 COLUMNS=70
      GROUP=FIRST
        #5  @4 "INPUT THE POWER FUNCTION"
        #6  @4 "ALL THE PARAMETER INPUTS SHOULD BE REAL VALUES"
        #10 @4 "ENTER POWER FUNCTION BELOW"
        #12 @4 FUN 60 ATTR=UNDERLINE REQUIRED=YES
      GROUP=SECOND
        #5  @4 FUN 60
        #7  @4 'THE ABOVE FUNCTION IS EQUAL TO ' @36 P 3 ATTR=UNDERLINE
        #12 @4 'PRESS' @10 'ENTER' A=UNDERLINE @16 'TO END'
        #13 @4 'OR PRESS FUNCTION KEY' @28 'F1' A=UNDERLINE @32 'TO CONTINUE';
    %DISPLAY FUNCTION.FIRST;
    DATA DSN1;
      POWER=&FUN;
    RUN;
    DATA _NULL_;
      SET DSN1;
      CALL SYMPUT('P',TRIM(LEFT(POWER)));
    RUN;
    %DISPLAY FUNCTION.SECOND;

    /* THIS PROGRAM TAKES A VARIABLE NAME, ITS RANGE, AND A POWER FUNCTION. */
    /* IT THEN PLOTS THE POWER FUNCTION AGAINST THE VARIABLE OVER THE PROVIDED RANGE. */
    %KEYDEF F1 'END; PGM; REC; SUB';
    %LET X=;
    %LET UPBOUND=;
    %LET LOWBOUND=;
    %LET FUN=;
    %WINDOW PWPLOT COLOR=CYAN ROWS=30 COLUMNS=70
      GROUP=FIRST
        #5  @4 "INPUT THE VARIABLE NAME" @36 X 8 ATTR=UNDERLINE
        #6  @4 "AGAINST WHICH POWER SHOULD BE PLOTTED"
      GROUP=SECOND
        #5  @4 "INPUT THE VARIABLE NAME" @36 X 8 ATTR=UNDERLINE
        #6  @4 "AGAINST WHICH POWER SHOULD BE PLOTTED"
        #8  @4 "ENTER THE UPBOUND" @29 UPBOUND 8 ATTR=UNDERLINE REQUIRED=YES
            @43 "FOR" @48 X PROTECT=YES
        #10 @4 "ENTER THE LOWBOUND" @29 LOWBOUND 8 ATTR=UNDERLINE REQUIRED=YES
            @43 "FOR" @48 X PROTECT=YES
        #13 @4 "ENTER POWER FUNCTION BELOW"
        #14 @4 FUN 60 ATTR=UNDERLINE REQUIRED=YES;
    %DISPLAY PWPLOT.FIRST;
    %DISPLAY PWPLOT.SECOND;
    DATA PW (KEEP=POWER &X);
      LOW=SYMGET('LOWBOUND');
      UP =SYMGET('UPBOUND');
      INC=(UP-LOW)/100;
      DO &X=LOW TO UP BY INC;
        POWER=&FUN;
        OUTPUT;
      END;
    RUN;
    GOPTION HORIGIN=2 VORIGIN=2 VSIZE=5 HSIZE=4;
    SYMBOL1 INTERPOL=JOIN WIDTH=2;
    AXIS1 ORDER=(0 TO 1.0 BY 0.1);
    PROC GPLOT DATA=PW;
      PLOT POWER*&X / VAXIS=AXIS1 FRAME;
    RUN;

APPENDIX D

SAS PROGRAM FOR FIGURE 1 AND 2 AND TABLE 12 AND 13

    /*=================================================================
      THIS PROGRAM PRODUCES FIGURE 1 AND 2, TABLE 12 AND 13 IN THE
      DISSERTATION. FIGURE 1 AND TABLE 12 ARE FOR CRT; FIGURE 2 AND
      TABLE 13 ARE FOR MST;
    =================================================================*/
    FILENAME TABLE1 'C:\liu\dissertation\table1.rtf';
    FILENAME TABLE2 'C:\liu\dissertation\table2.rtf';
    DATA CRT (KEEP=N POWER_F POWER_R);
      FILE TABLE1;
      /*================== PARAMETERS FOR CRT ==================*/
      ALPHA=0.05;  * SIGNIFICANCE LEVEL;
      DELTA=0.5;   * DELTA STANDS FOR STANDARDIZED EFFECT SIZE;
      RHO=0.05;    * RHO IS THE INTRACLASS CORRELATION;
      J=10;        * J IS # OF CLUSTERS;
      PUT @10 'n' @20 'fixed effect' @40 'random effect' //;
      DO N=5 TO 50;
      /* POWER FUNCTION IS THE SAME AS (82, 83) IN CHAPTER 5. */
      /* NC IS THE 4TH PARAMETER IN THE POWER FUNCTION FOR FIXED EFFECT;
         OMEGA IS THE SCALE IN THE POWER FUNCTION FOR RANDOM EFFECT */
        NC=DELTA/SQRT(4*((1-RHO)/N + RHO)/J);
        POWER_F=1-PROBT(TINV(1-ALPHA,J-2),J-2,NC);
        OMEGA=(1-RHO)/(1-RHO + N*RHO);
        POWER_R=1-PROBF(FINV(1-ALPHA,J-2,J*(N-1))*OMEGA,J-2,J*(N-1));
        FORMAT POWER_F 8.2 POWER_R 8.2;
        PUT @10 N @20 POWER_F @40 POWER_R;
        OUTPUT;
      END;
      PUT // @10 'Table 12: power in cluster randomized trial';
    RUN;
    *PROC PRINT DATA=CRT; RUN;

    DATA MST (KEEP=N POWER_F POWER_R);
      FILE TABLE2;
      ALPHA=0.05;
      DELTA=0.5;
      SIG_DELT=0.10;  * VARIABILITY OF DELTA ACROSS SITES;
      J=10;
      PUT @10 'n' @20 'fixed effect' @40 'random effect' //;
      DO N=4 TO 50 BY 2;  * ASSUME A BALANCED DESIGN;
      /* POWER FUNCTIONS ARE THE SAME AS (90, 91) IN CHAPTER 5. */
        NC=DELTA/SQRT(4/(N*J)+SIG_DELT/J);
        POWER_F=1-PROBT(TINV(1-ALPHA,J-1), J-1, NC);
        OMEGA=1/(1+N*SIG_DELT/4);
        POWER_R=1-PROBF(FINV(1-ALPHA, J-1, J*(N-2))*OMEGA, J-1, J*(N-2));
        FORMAT POWER_F 8.2 POWER_R 8.2;
        PUT @10 N @20 POWER_F @40 POWER_R;
        OUTPUT;
      END;
      PUT // @10 'Table 13: power in multi-site trial';
    RUN;

    %MACRO PWPLOT(DSN);
      GOPTION HORIGIN=2 VORIGIN=2 VSIZE=5 HSIZE=4;
      SYMBOL1 INTERPOL=JOIN LINE=1 WIDTH=2;
      SYMBOL2 INTERPOL=JOIN LINE=2 WIDTH=1;
      FOOTNOTE1 J=C H=1 'upper curve for fixed effect';
      FOOTNOTE2 J=C H=1 'lower curve for random effect';
      AXIS1 ORDER=(0 TO 1.0 BY 0.1) LABEL=(FONT=SWISS 'POWER');
      PROC GPLOT DATA=&DSN;
        PLOT POWER_F*N POWER_R*N / OVERLAY VAXIS=AXIS1 FRAME;
      RUN;
    %MEND PWPLOT;
    %PWPLOT(CRT)
    %PWPLOT(MST)

     n    treatment effect    cluster effect
     5         0.43               0.12
     6         0.48               0.14
     7         0.51               0.17
     8         0.55               0.19
     9         0.57               0.21
    10         0.60               0.23
    11         0.62               0.26
    12         0.64               0.28
    13         0.66               0.31
    14         0.68               0.33
    15         0.69               0.35
    16         0.70               0.38
    17         0.72               0.40
    18         0.73               0.42
    19         0.74               0.44
    20         0.75               0.46
    21         0.76               0.48
    22         0.76               0.50
    23         0.77               0.52
    24         0.78               0.54
    25         0.78               0.56
    26         0.79               0.57
    27         0.80               0.59
    28         0.80               0.61
    29         0.81               0.62
    30         0.81               0.63
    31         0.81               0.65
    32         0.82               0.66
    33         0.82               0.67
    34         0.83               0.69
    35         0.83               0.70
    36         0.83               0.71
    37         0.84               0.72
    38         0.84               0.73
    39         0.84               0.74
    40         0.84               0.75
    41         0.85               0.76
    42         0.85               0.77
    43         0.85               0.78
    44         0.85               0.79
    45         0.85               0.79
    46         0.86               0.80
    47         0.86               0.81
    48         0.86               0.81
    49         0.86               0.82
    50         0.86               0.83

Table 12: power in cluster randomized trial

     n    treatment effect    treatment*site
     4         0.40               0.07
     6         0.51               0.09
     8         0.59               0.11
    10         0.66               0.13
    12         0.72               0.15
    14         0.76               0.17
    16         0.79               0.20
    18         0.82               0.22
    20         0.85               0.25
    22         0.86               0.27
    24         0.88               0.30
    26         0.89               0.32
    28         0.90               0.34
    30         0.91               0.37
    32         0.92               0.39
    34         0.93               0.41
    36         0.94               0.44
    38         0.94               0.46
    40         0.95               0.48
    42         0.95               0.50
    44         0.95               0.52
    46         0.96               0.54
    48         0.96               0.56
    50         0.96               0.58

Table 13: power in multi-site trial

BIBLIOGRAPHY

Bennett, C. A. and N. L. Franklin (1954). Statistical analysis in chemistry and the chemical industry. Wiley, New York.

Brooks, Gordon et al. (1996). Precision power and its application to the selection of regression sample sizes. Mid-Western Educational Researcher, 9, 10-17.

Kreft, Ita (1993). Using multilevel analysis to assess school effectiveness: a study of Dutch secondary schools. Sociology of Education, 66, 104-129.

Littell, et al. (1997). SAS System for Mixed Models. SAS, North Carolina.

Montgomery, D. C. (1996). Design and analysis of experiments. Wiley, New York.

Raudenbush, S. (1993). Hierarchical linear models and experimental design. In Edwards, L. (Ed.), Applied analysis of variance in behavioral science. New York: Marcel Dekker, Inc.

Raudenbush, Stephen & Liu, Xiaofeng (1999). Statistical power and optimal design in multi-site clinical trial, revised and resubmitted to Psychological Methods.

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. J. Wiley & Sons, New York.

Schafer, J. L. and Olsen, M. K. (1998). Multiple imputation for multivariate missing-data problems: a data analyst's perspective. Multivariate Behavioral Research, 33 (4), 545-571.

Scheffe, H. (1959). The analysis of variance. Wiley, New York.

Searle, S. R. (1971). Linear models. Wiley, New York.

Stapleton, James.
(1995). Linear statistical models. New York: John Wiley.

Stroup, Walter. (1998). An introduction to mixed model analysis. Course notes.