This is to certify that the dissertation entitled

METHODS OF META-ANALYZING REGRESSION STUDIES: APPLICATIONS OF GENERALIZED LEAST SQUARES AND FACTORED LIKELIHOODS

presented by Meng-Jia Wu has been accepted towards fulfillment of the requirements for the PhD degree in Measurement & Quantitative Methods, Michigan State University.

METHODS OF META-ANALYZING REGRESSION STUDIES: APPLICATIONS OF GENERALIZED LEAST SQUARES AND FACTORED LIKELIHOODS

By

Meng-Jia Wu

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology and Special Education

2006

ABSTRACT

METHODS OF META-ANALYZING REGRESSION STUDIES: APPLICATIONS OF GENERALIZED LEAST SQUARES AND FACTORED LIKELIHOODS

By

Meng-Jia Wu

Regression is one of the most commonly used quantitative methods for exploring the relationship between predictor(s) and an outcome of interest. One of the challenges meta-analysts face when combining results from regression studies is that the predictors usually differ from study to study, even though the primary researchers may have been studying the same outcome. In the current study, two methods for combining results, generalized least squares (GLS) and factored likelihoods through the sweep operator (SWP), were examined for their ability to reduce the problems arising when the regression models in a meta-analysis contain different predictors. Both methods utilize the zero-order correlations among the variables in the regression studies. The GLS method treats the correlations from each study as a subset of multivariate outcomes, and combines the results with consideration of the dependencies of the correlations across studies (Raudenbush, Becker, & Kalaian, 1988; Becker, 1992). The SWP method in this study applies the concept of missing data to the regression models that contained different predictors in the synthesis. An empirical study was conducted by creating a set of regression studies using a subset of the NELS:88 dataset. The correlations among the created studies were combined, and a final regression model with standardized slopes was calculated for each of the predictors using each of the two methods. The results from this empirical study showed that SWP produced less biased estimates of slopes in most situations. The precision of the results from the two methods could be impacted by the features of the studies included in the meta-analysis. Therefore, a simulation was conducted to investigate the impacts of missing-data patterns, intercorrelations among the predictors and the outcome, and sample size.
The results indicated that the difference between the two methods was not large. SWP consistently performed slightly better at estimating the slope of the predictor that was fully observed in all studies in the synthesis. Generally, SWP performed well when the sample sizes were equal and small across all studies, and GLS performed better when the sample sizes were equal and large.

ACKNOWLEDGEMENTS

Working on this dissertation has been an exciting and satisfying journey. Without the support and encouragement of several people, I would never have been able to finish this work. First of all, I must express my deep gratitude and appreciation to Betsy Jane Becker, a dream advisor and a friend who accepted me at my lowest and brought out the best in me throughout my graduate study. She has taught me innumerable lessons on the workings of quantitative research, and on the insights of life as well, during the past years. Her technical and editorial advice was essential to the completion of this dissertation. I would not have achieved as much as I did without her encouragement and guidance. My thanks also go to the co-chair of my dissertation, Richard Houang, who challenged my thinking and helped me to make this dissertation complete. I would also like to thank two other committee members, Fred Oswald and Mathew Reeves, for providing many valuable comments that greatly improved this dissertation. I am very grateful to have had such a great committee to work with.

Continuing financial support was provided by the TQ->QT project during the past six years. I would especially like to give my thanks to Mary Kennedy, one of the PIs of this project, for being the role model of a true and witty scholar. Thanks also go to Soyeon Ahn and Jinyoung Choi, with whom I worked on this project from the very beginning, and to all my other colleagues during the wonderful six years. Without their emotional and intellectual support, the process of writing the dissertation would not have been this enjoyable.

The friendship of long-time friends, Ping Nieh and Charles Shih, is much appreciated. Traveling with them and dining out at our favorite restaurants kept me from being an overworked student. Entertaining their daughter Natalie, who was born when I started to work more intensely on the dissertation, was the best stress reliever. I especially have to thank Ping, who helped me through the hectic checking process at the Graduate School for submitting this dissertation after I moved to Chicago to start my job. Without her help, I would not have been able to get my degree. Last, but not least, I would like to thank my parents, who gave me life in the first place. They always put my interests ahead of their own and never doubted my ability to finish my studies, which made this dissertation possible.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER 1 INTRODUCTION
1.1 Absence of Methods for Meta-analyzing Regression Results
1.2 Potential Effect Sizes from Regression Studies
1.3 Potential Problems of Synthesizing Regression Studies
1.4 Purpose of This Research

CHAPTER 2 LITERATURE REVIEW
2.1 In Psychology
2.2 In Epidemiology
2.3 In Economics
2.4 In Ecology
2.5 In Political Science
2.6 In Education
2.7 The Most Recent Study

CHAPTER 3 METHODOLOGIES
3.1 Focusing on the Zero-order Correlations
3.2 Constructing the Standardized Regression Model
3.3 Proposed Methods
3.3.1 Multivariate Generalized Least Squares
3.3.2 Factored Likelihoods through Sweep Operators
3.4 Data Generation
3.4.1 Choice of Parameters
3.4.2 Missing Rate
3.4.3 Replications in the Simulation

CHAPTER 4 EMPIRICAL EXAMINATION
4.1 Sample Creation
4.2 Application of Multivariate Generalized Least Squares
4.3 Application of Factored Likelihood Method through the Sweep Operators
4.4 Results from the Empirical Examination

CHAPTER 5 SIMULATION RESULTS ............ 67
5.1 Fixed-effects Model (Condition 1 through Condition 4) ............ 67
5.1.1 Correlation Matrix R1 ............ 67
5.1.2 Correlation Matrix R2 ............ 76
5.1.3 Correlation Matrix R3 ............ 83
5.1.4 Correlation Matrix R4 ............ 90
5.2 ANOVA Results ............ 99
5.3 Mixed-effects Model (Condition 5 through Condition 8) ............ 112
5.3.1 Pattern I ............ 113
5.3.2 Pattern II ............ 113
5.3.3 Pattern III ............ 114
5.3.4 Pattern IV ............ 114
5.3.5 Pattern V ............ 115

CHAPTER 6 DISCUSSION ............ 126
APPENDIX A ............ 132
APPENDIX B ............ 139
APPENDIX C ............ 146
APPENDIX D ............ 154
APPENDIX E ............ 159
REFERENCES ............ 163
LIST OF TABLES

Table 3.1 Percentages of Missingness for Each Pattern with Each Sample Size ............ 47
Table 4.1 Sample Sizes and Standardized Regression Coefficients for Four Studies ............ 52
Table 4.2 Correlations among Five Variables ............ 52
Table 4.3 Estimated Standardized Regression Coefficients from Both Methods ............ 66
Table 5.1 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern I and Correlation Matrix R1 ............ 71
Table 5.2 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern II and Correlation Matrix R1 ............ 72
Table 5.3 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern III and Correlation Matrix R1 ............ 73
Table 5.4 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern IV and Correlation Matrix R1 ............ 74
Table 5.5 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern V and Correlation Matrix R1 ............ 75
Table 5.6 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern I and Correlation Matrix R2 ............ 78
Table 5.7 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern II and Correlation Matrix R2 ............ 79
Table 5.8 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern III and Correlation Matrix R2 ............ 80
Table 5.9 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern IV and Correlation Matrix R2 ............ 81
Table 5.10 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern V and Correlation Matrix R2 ............ 82
Table 5.11 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern I and Correlation Matrix R3 ............ 85
Table 5.12 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern II and Correlation Matrix R3 ............ 86
Table 5.13 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern III and Correlation Matrix R3 ............ 87
Table 5.14 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern IV and Correlation Matrix R3 ............ 88
Table 5.15 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern V and Correlation Matrix R3 ............ 89
Table 5.16 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern I and Correlation Matrix R4 ............ 93
Table 5.17 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern II and Correlation Matrix R4 ............ 94
Table 5.18 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern III and Correlation Matrix R4 ............ 95
Table 5.19 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern IV and Correlation Matrix R4 ............ 96
Table 5.20 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern V and Correlation Matrix R4 ............ 97
Table 5.21 Ranges of Percentage Relative Bias Produced by GLS and SWP ............ 98
Table 5.22 Analysis of Variance for the Differences in Estimates of the Slope of X1 ............ 100
Table 5.23 Analysis of Variance for the Differences in Estimates of the Slope of X2 ............ 101
Table 5.24 Analysis of Variance for the Differences in Estimates of the Slope of X3 ............ 101
Table 5.25 Analysis of Variance for the Differences in Estimates of the Slope of X4 ............ 102
Table 5.26 Parameters, Estimated Mean Slopes and Standard Errors for Each Predictor under Mixed-effect Models with Different Sample Sizes in Pattern I ............ 116
Table 5.27 Parameters, Estimated Mean Slopes and Standard Errors for Each Predictor under Mixed-effect Models with Different Sample Sizes in Pattern II ............ 118
Table 5.28 Parameters, Estimated Mean Slopes and Standard Errors for Each Predictor under Mixed-effect Models with Different Sample Sizes in Pattern III ............ 120
Table 5.29 Parameters, Estimated Mean Slopes and Standard Errors for Each Predictor under Mixed-effect Models with Different Sample Sizes in Pattern IV ............ 122
Table 5.30 Parameters, Estimated Mean Slopes and Standard Errors for Each Predictor under Mixed-effect Models with Different Sample Sizes in Pattern V ............ 124

LIST OF FIGURES

Figure 3.1 The Models for Four Created Studies and the Structure of the Data ............ 26
Figure 3.2 Five Sets of Regression Models with Different Numbers of Predictors Missing from Studies ............ 42
Figure 3.3 Eight Combinations of the Four Intercorrelation Matrices for Four Studies ............ 46
Figure 5.1 Interactions of Sample Size Sets and Correlation Matrices for Five Patterns for Differences in Slopes of X1 ............ 102
Figure 5.2 Interactions of Sample Size Sets and Correlation Matrices for Five Patterns for Differences in Slopes of X2 ............ 105
Figure 5.3 Interactions of Sample Size Sets and Correlation Matrices for Five Patterns for Differences in Slopes of X3 ............ 107
Figure 5.4 Interactions of Sample Size Sets and Correlation Matrices for Five Patterns for Differences in Slopes of X4 ............ 110

CHAPTER 1
INTRODUCTION

Meta-analysis is a quantitative procedure that allows researchers to summarize a myriad of studies focusing on one topic. The technique helps to address the challenges introduced by the existence of multiple answers to a given research question. The essential feature of meta-analysis is adopting the same type of effect size across studies, so that the results from different studies are comparable.
As Lipsey and Wilson (2001) summarized:

The various effect size statistics used to code different forms of quantitative study findings in meta-analysis are based on the concept of standardization. The effect statistic produces a statistical standardization of the study findings such that the resulting numerical values are interpretable in a consistent fashion across all the variables and measures involved. (p. 4)

The commonly used effect sizes in meta-analysis fall into one of two families: the d family and the r family (Rosenthal, 1994). Generally speaking, the d family includes proportions or mean differences between groups; the r family includes the Pearson product-moment correlation (r), as well as Fisher's transformation of r.

1.1 Absence of Methods for Meta-analyzing Regression Results

For more than two decades, methods for synthesizing mean differences and correlations have been broadly studied and clearly documented in several major publications (see Cooper, 1998; Cooper & Hedges, 1994; Hunter & Schmidt, 2001; Lipsey & Wilson, 2001; Sutton, Abrams, Jones, Sheldon, & Song, 2000). Methods for synthesizing cumulative evidence from studies using regression, however, have not yet been well studied. Regression has been widely used by researchers in different fields for predicting and explaining the variation in outcomes of interest. Regression can also be considered a more sophisticated method than correlation, because it involves more statistical controls when studying relationships among variables. Without appropriate methods for combining regression results, a great deal of evidence cannot be used, and excluding regression studies sabotages the thorough understanding of a research question when conducting a meta-analysis.

1.2 Potential Effect Sizes from Regression Studies

Statistics that can be found in regression studies are the raw regression coefficient or slope and sometimes its standard error, the t statistic for testing the significance of the slope, the standardized slope, and the R2, which is the proportion of variance explained by the model. A raw regression coefficient (slope) represents the expected increment in the dependent variable when the focal independent variable increases by one unit, while controlling for the other independent variables in the model. The magnitude of the raw coefficient changes when the scales of the dependent and independent variables change. This characteristic means the slope cannot be compared directly across models, unless all the models use the same scales to measure both the dependent and independent variables.

The t statistic associated with the slope is more like a standardized estimate, because each t is the raw regression coefficient scaled by its own standard error. The t statistic itself has little power to convey the magnitude of the slope, and it depends on the sample size. The standardized slope is the raw regression coefficient standardized by the standard deviations of the predictor and the outcome. Because it is scaled in standardized units, it can be compared directly across models. The explained variance (R2) represents the proportion of the variance in the outcome accounted for by all the predictors combined in a regression model. If we wish to focus on the variance explained by a certain predictor, the partial R2 value for that predictor needs to be computed by removing the effects of the other independent variables.
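For reference, the relations among these candidate effect sizes can be written compactly; these are standard identities, stated here in the notation used later:

$$\beta_j = b_j \frac{s_{X_j}}{s_Y}, \qquad t_j = \frac{b_j}{SE(b_j)},$$

where $b_j$ is the raw slope of predictor $X_j$, $s_{X_j}$ and $s_Y$ are the sample standard deviations of the predictor and the outcome, and $\beta_j$ is the standardized slope. In a model with a single predictor, the standardized slope equals the zero-order Pearson correlation between that predictor and $Y$.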
Hunter and Schmidt (1990) argued that using R2 to represent the magnitude of an effect loses the direction of the effect. They also stated that "variance-based indices of effect size make [variables that account for small percentages of the variance which might be] important effects appear much less important than they actually are, misleading both researchers and consumers of research" (p. 190).

1.3 Potential Problems of Synthesizing Regression Studies

One of the major problems arising when synthesizing regression studies is that the potential effect sizes discussed above are not comparable if the models included in the synthesis do not all use the same predictors. That is, the effects of different variables are partialed out, or held constant, when computing the effect of a focal predictor. Therefore, the focal slope, whether standardized or raw, has different meanings across studies. This problem becomes complicated quickly when models contain many predictors. Unless the extra predictors in some models are absolutely independent of the focal predictor, which is never true, comparing the slopes or other effect sizes from unparallel models is comparing apples to oranges. One solution to this issue might be to include only models that contain the same variables. However, it is unrealistic to expect to find parallel models created for the same research question, especially in education, where large numbers of variables are typically used to investigate one phenomenon.

Another problem that arises when meta-analyzing raw slopes from a set of regression studies is that the magnitude of the raw slope can change when the scales of the outcome and the predictors change. This implies that slopes can be compared directly only when all variables are measured using the same scales and all models contain the same predictors.

1.4 Purpose of This Research

Since the solution of including only models that contain the same variables measured on the same scales (so that the raw slopes are comparable) is impractical, the current study focused on investigating methods that reduce the impact of unparallel regression models by synthesizing the scale-free correlations among the variables in the model, on which the standardized slopes are based. The synthesized correlations can then be used to create a final regression model with standardized slopes as the synthesis result. Two methods were examined in this study. One uses a non-model-based multivariate generalized least squares (GLS) approach; the other uses model-based factored likelihood estimation. These two methods were first examined empirically by creating and analyzing four pseudo-studies based on samples drawn from a selected subsample of a large national dataset. Then a Monte Carlo simulation was conducted to test the precision and stability of the two methods under different scenarios.

CHAPTER 2
LITERATURE REVIEW

Researchers in several fields have been trying to include regression studies in their meta-analyses. Most of these syntheses have either oversimplified the situation, or the methods proposed were tailored to particular fields and may not be applicable in education. Among those methods, a more universal technique, proposed in the early 1970s for investigating regression coefficients jointly, was to create a hierarchical-linear-model-like model for the variance among the coefficients (Hanushek, 1974).
However, that method focused on quantifying the variance among the slopes and required raw data along with some infrequently reported summary statistics, which may not be available in the meta-analysis context.

2.1 In Psychology

Raju, Fralicx, and Steinhaus (1986) proposed a "regression slope model" to adjust for the variability of the slopes found among studies that originates from the use of unreliable measures. The model presented by the authors is

$$b_{yx} = \beta_{yx} r_{xx} + e,$$

where $b_{yx}$ is the observed regression coefficient for predicting y from x, $\beta_{yx}$ is the unattenuated and unrestricted population regression coefficient, $r_{xx}$ is the unrestricted population reliability of predictor x, and e is the sampling error associated with $b_{yx}$ (p. 197). The ultimate goal in assessing validity generalization (VG) is to estimate the mean and the variance of the regression slope parameter ($\beta_{yx}$) using the mean of the observed $b_{yx}$ values and their variance. As Raju et al. pointed out, regression slope models "should theoretically be affected by scale differences in either or both of the predictor or criterion instruments used across the separate validity studies. ... The use of the new models for studying validity generalization, therefore, required that the scales for the predictor and criterion variables be comparable across studies" (p. 199). As Raju, Pappas, and Williams (1989) also pointed out, "[w]ithout the common metrics for the criterion and predictor variables, it is almost impossible to interpret credibility intervals of the type used with the correlation model. The use of the new models for studying VG, therefore, requires that scales for the predictor and criterion variables be comparable across studies" (p. 903). In addition to requiring scale comparability, the other requirement implied by Raju and colleagues' model is that only one predictor is involved in the model. This condition might easily be achieved when studying validity generalization, yet it is usually not the case in education studies.

2.2 In Epidemiology

Several reviews have been done that synthesize the slopes from dose-response models, which are widely used for evaluating the relationship between dose (e.g., of a drug or other treatment) and response. To create a dose-response model, researchers assign values for different dose levels and use those values as a predictor of a targeted response in the form of an odds ratio. Greenland and Longnecker (1992) combined the slopes from dose-response models based on 10 published datasets. They used techniques analogous to the standard inverse-variance weighting techniques that are used with contingency data to analyze the differences among the slopes. The same approach was adopted to study the relationship between individual consumption of chlorinated drinking water and bladder cancer (Villanueva, Fernandez, Malats, Grimalt, & Kogevinas, 2003). In both meta-analyses, the dose levels in different studies were relabeled with new values according to the same standards, and the outcomes were all odds ratios. Therefore, the slopes are comparable.
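The weighting scheme referred to here is, in essence, the standard fixed-effects inverse-variance estimator. A minimal sketch in Python (the slope values below are hypothetical, purely for illustration):

```python
import numpy as np

def inverse_variance_pool(slopes, ses):
    """Fixed-effects inverse-variance pooling of study slopes.

    Each slope is weighted by the inverse of its squared standard
    error; the pooled SE is the square root of 1 / (sum of weights).
    """
    slopes = np.asarray(slopes, dtype=float)
    weights = 1.0 / np.asarray(ses, dtype=float) ** 2
    pooled = np.sum(weights * slopes) / np.sum(weights)
    pooled_se = np.sqrt(1.0 / np.sum(weights))
    return pooled, pooled_se

# Hypothetical dose-response slopes (e.g., log odds ratio per unit
# dose) from three studies, with their standard errors.
pooled, se = inverse_variance_pool([0.12, 0.08, 0.15], [0.04, 0.03, 0.06])
print(f"pooled slope = {pooled:.4f}, SE = {se:.4f}")
```

Because the relabeled dose scales and odds-ratio outcomes put all the slopes on one metric, this kind of direct pooling is defensible in the dose-response setting in a way it is not for the unparallel models discussed below.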
2.3 In Economics

Meta-analyses of regression studies can be found in syntheses of demand studies. The characteristic of demand studies that facilitates conducting a meta-analysis is that the demand elasticities from different studies are typically all on the same scale, because a demand elasticity, which is a regression slope, expresses the relationship between demand and one of its determinants as the percentage change in demand caused by a 1% change in that determinant. Crouch (1995) conducted a meta-analysis to synthesize 80 studies of international tourism demand. Those studies produced 1,964 observations (i.e., regression equations) and 10,078 regression coefficients. The majority of the included demand elasticities concerned income, price, exchange rates, transportation cost, and marketing expenditures. The author adopted the synthesis method proposed by Raju et al. (1986), mentioned in the previous section. However, the regression coefficients were obtained from international tourism demand models that were not parallel and contained more than one independent variable, which violated the requirements of Raju and his colleagues' method. The author was actually aware of the violation and stated that "the value of b may be affected by the inclusion of other explanatory variables" (p. 109), but he did not justify his decision to include unparallel models. A series of articles pertinent to meta-analyzing regression studies focusing on elasticity in economics can be found in a special issue of the Journal of Economic Surveys published in 2005 (Vol. 19, Issue 3).

2.4 In Ecology

A recent review combining regression results focused on summarizing the relationship between population density and body size for mammals and birds (Bini, Coelho, & Diniz-Filho, 2001). The authors used a conventional weighting scheme, weighting each slope by its standard error. It is not clear whether "body size" and "population density" were measured on the same scales, though they could have been. It might be safe to assume that population density was measured on one scale across studies. However, body size could be measured in terms of length, weight, body mass, or some other measure, and the authors did not mention how they dealt with the different units for the predictor. Moreover, it is not clear whether all 74 regression models included in this meta-analysis used "body size" as the only predictor.

2.5 In Political Science

Lau, Sigelman, Heldman, and Babbitt (1999) tried to combine the results from both group-comparison studies and regression studies that focused on the effect of negative political advertisements on political campaigns. They found that about one-quarter of their data points "come from ordinary least squares (OLS) or logistic regression equations, and there is no universally accepted method for handling such data in a meta-analysis" (p. 855). To avoid losing data, they decided to use the t statistics associated with the regression coefficients to represent the treatment (exposure to negative advertisements) versus control (exposure to no advertisements or positive advertisements) mean-difference effect, converting each t value into d by using $d = 2t/\sqrt{df}$, so that the converted ds could be combined with the other ds from group comparisons.
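In code, that conversion is a one-liner (a sketch; the example values are hypothetical):

```python
import math

def t_to_d(t: float, df: int) -> float:
    """Convert a t statistic to a standardized mean difference d via
    d = 2t / sqrt(df), the conversion Lau et al. applied to the
    t statistics of regression coefficients."""
    return 2.0 * t / math.sqrt(df)

# Example: t = 2.5 with 98 degrees of freedom
print(t_to_d(2.5, 98))  # about 0.505
```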
The authors cited Stanley and Jarrell (1989), who claim that the t statistic has no dimensionality and can therefore be combined directly when the units of the independent and dependent variables are not the same across studies, to justify synthesizing the t statistics for the slopes from the regression models in their synthesis. However, the impact of different independent variables being used in different models still exists.

2.6 In Education

Hanushek (1989) summarized 187 analyses of the impact of differential expenditures on school performance, published in 38 separate articles or books, using the "vote-count" method, which simply ignores the magnitudes of effects and counts the numbers of studies with significant positive estimates, significant negative estimates, nonsignificant positive estimates, or nonsignificant negative estimates. To avoid the poor statistical properties of the vote-count method (Hedges & Olkin, 1980), Greenwald, Hedges, and Laine (1996) tried to summarize half-standardized slopes in a review of educational production functions examining the same topic as Hanushek. However, fundamental problems in synthesizing results from production functions still exist: The models usually do not involve the same predictors; different outcomes might be used in different studies; and the scales of the variables may not be identical across studies.

2.7 The Most Recent Study

In a recent article, Peterson and Brown (2005) conducted an empirical study and derived a formula for converting the standardized slopes reported in regression studies (often denoted as βs even though they are sample estimates) into Pearson correlations (rs), in order to include slopes and analyze them together with other correlations using conventional methods designed for synthesizing correlations. The authors searched 35 journals from disciplines including psychology, consumer behavior, management, marketing, and sociology for the period 1975-2001. They included only studies with both βs and rs reported at the individual level. A total of 1,504 corresponding βs and rs were identified from 143 articles containing 160 data sets and 270 regression models. Given the relationships shown in the βs and rs they collected, the authors derived the equation r = .98β + .05λ, where λ is an indicator variable that equals 1 when β is nonnegative and 0 when β is negative. Peterson and Brown's research is the first published study mainly focused on incorporating the estimates from regression studies with those from correlational studies in the meta-analysis context. The authors did notice that the relationship between βs and rs can be affected by features such as sample size and the number of predictors in the regression model. However, they oversimplified the situation and did not really utilize those features in creating their formula for converting βs to rs.
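Their conversion is mechanical to apply; a minimal sketch:

```python
def beta_to_r(beta: float) -> float:
    """Peterson and Brown's (2005) empirical approximation for
    recovering a Pearson r from a reported standardized slope:
        r = .98*beta + .05*lambda,
    where lambda = 1 if beta is nonnegative and 0 otherwise."""
    lam = 1.0 if beta >= 0 else 0.0
    return 0.98 * beta + 0.05 * lam

print(beta_to_r(0.30))   # 0.344
print(beta_to_r(-0.20))  # -0.196
```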
CHAPTER 3
METHODOLOGIES

As mentioned in the introduction, two major problems in incorporating regression studies in meta-analysis are that 1) different predictors may be used in different primary studies studying the same topic, and 2) the predictors and the outcome are often measured on different scales across studies. As presented in the literature review, most of the meta-analyses that have been done in different fields either focused solely on simple regression studies in which the scales of the predictor and the outcome are comparable across studies, or simply ignored the fact that the slopes may have different meanings because different predictors are used across studies. In order to combine the results from regression models in a general way, and to estimate the effects of the predictors more precisely by considering the impact of unparallel models, the current research focuses on utilizing the zero-order correlation matrix from each study included in the meta-analysis to calculate summarized standardized slopes for a final regression model, which is the result of synthesizing the regression studies.

3.1 Focusing on the Zero-order Correlations

Instead of synthesizing slopes directly, the two methods examined in this study both start by summarizing the zero-order correlations among the variables in the regression models. As Hunter and Schmidt (2001) pointed out:

A multiple regression analysis of a primary study is based on the full zero-order correlation (or covariance) matrix for the set of predictor variables and the criterion variable. Similarly, a cumulation of multiple regression analyses must be based on a cumulative zero-order correlation matrix. (p. 475)

Three major reasons make combining correlations beneficial. First, focusing on the correlations among the predictors and the outcome across studies, rather than trying to combine the slopes directly, disposes of the problem that slopes have different meanings when the models contain different predictors. This is because the correlations among the variables used in a regression model are "zero-order measures" of the relationships, which means that the correlation between two variables will not change when other predictors are added to the model. Beyond this advantage of stability, focusing on correlations also allows us to get around the problem of different scales being used to measure the same predictors in different models, because correlations are metric-free and can be combined directly (under certain assumptions, which will be discussed later). Moreover, with the focus on correlations among the variables in the regression model, the results from correlational studies can easily be combined with the results from regression studies. This expands the set of studies that can be synthesized, because studies reporting the relationship between any pair of variables of interest can be included.

3.2 Constructing the Standardized Regression Model

Once the selected correlations on which the regression models are based are combined appropriately, the summarized correlations are used to create a final regression model with standardized slopes, because standardized slopes are functions of the associated correlations. The relative importance of the predictors can then be appraised. Also, the variance explained by each predictor (e.g., the partial R2) based on the final model can be calculated if it is of interest.
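Concretely, once a pooled correlation matrix is in hand, the standardized slopes of the final model solve the usual normal equations, beta = Rxx^{-1} rxy. A small sketch of this step (the pooled matrix below is an illustrative placeholder, not an estimate from this study's data):

```python
import numpy as np

# Pooled correlation matrix among (X1, X2, X3, Y); the values are
# illustrative only.
R = np.array([
    [1.00, 0.30, 0.20, 0.50],
    [0.30, 1.00, 0.25, 0.40],
    [0.20, 0.25, 1.00, 0.35],
    [0.50, 0.40, 0.35, 1.00],
])

R_xx = R[:3, :3]   # intercorrelations among the predictors
r_xy = R[:3, 3]    # correlations of each predictor with the outcome

beta_std = np.linalg.solve(R_xx, r_xy)  # standardized slopes
r_squared = r_xy @ beta_std             # variance explained by the model
print(beta_std, r_squared)
```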
Before the methods are presented, it should be noted that, as with all parametric statistical methodologies, the methods proposed here require certain assumptions. A general assumption that is required for each model included in the meta-analysis is that all the predictors and the outcome are measured appropriately, and they are related to each other approximately linearly, except for the presence of dummy variables. Those are the major assumptions for any regression study. In addition, we have to assume that multicollinearity is not a problem for each of the regression models. That is, the predictors are not highly correlated with each other in one model. In the primary studies, the authors may or may not report checking these assumptions. Yet we have to assume l4 the condition of linearity is not violated to work on the correlations in the meta-analysis, and we have to assume the absence of multicollinearity to build a meaningful final model and estimate the synthesized standardized slopes based on individual ones. Other specific assumptions for each method will be discussed in the presentation of each method. 3.3.1 Multivariate Generalized Least Squares If we think about the zero-order correlations between variables (predictors and outcomes) from regression models as the effect sizes in a synthesis, the problem of meta-analyzing those correlations is similar to meta-analyzing multivariate effect sizes from studies. Each study may contain some similar predictors and some different ones, that makes the correlations produced in each study a subset of the correlations from the final model, which need to be determined. Several methods for synthesizing correlations, in terms of the correlation matrices, have been investigated and discussed (e. g., Becker & Schram, 1994; Furlow & Beretvas, 2005). To combine subsets of multivariate outcomes in order to calculate the standardized Slopes for the predictors in the final model, the method first proposed by Raudenbush, Becker, and Kalaian (1988) based on generalized least squares (GLS) is adopted in the current research. To illustrate the application of GLS to synthesize regression results, an auxiliary example is used. The same example will be used to illustrate the next method as well. Suppose four regression studies are to be included in a synthesis. All of them studied the same outcomes Y“, where k is study number (k = 1 to 4) and 1 represents subject I in study k. Study 1 contains only predictor X1; Study 2 contains both X1 and X2; 15 Study 3 contains X1, X2, and X3; Study 4 contains X1, X2, X3,and X4. The estimated regression models with standardized slopes (B s) are as Shown below for the four studies. Study 1: Y1, = BHX”, for I =1 to n1 Study 2: Y2, = B21X21,+822X22, forl =1 to n2 Study 3: Y3, = B31X311+B32X321 +B33X33, for 1 =1 to n3 Study 4: IQ, = B41X411+ B42X421+ B43X431+ B44X44, for 1 =1 to n4 where, X H , is the value of variable X1 for subject I in study k, X 1.21 is the value of variable X2 for subject I in study k, X B , is the value of variable X3 for subject I in study k, X k 4, is the value of variable X4 for subject I in study k, and nk is the sample size of study k. Following the example above, the vectors of zero-order correlations of the four studies (rk, k=1, 2, 3, or 4) with elements rmmablc 1 variable 2) in the vectors are as follows. In the following expressions, for simplicity, only the numerical part of the variable label is used inside Of the parentheses (i.e., 4 indicates of X4). 
$$r_1 = \begin{bmatrix} r_{1(Y1)} \end{bmatrix}, \quad
r_2 = \begin{bmatrix} r_{2(Y1)} \\ r_{2(Y2)} \\ r_{2(12)} \end{bmatrix}, \quad
r_3 = \begin{bmatrix} r_{3(Y1)} \\ r_{3(Y2)} \\ r_{3(Y3)} \\ r_{3(12)} \\ r_{3(13)} \\ r_{3(23)} \end{bmatrix}, \quad
r_4 = \begin{bmatrix} r_{4(Y1)} \\ r_{4(Y2)} \\ r_{4(Y3)} \\ r_{4(Y4)} \\ r_{4(12)} \\ r_{4(13)} \\ r_{4(14)} \\ r_{4(23)} \\ r_{4(24)} \\ r_{4(34)} \end{bmatrix}.$$

To use the GLS method to summarize the multivariate outcomes, we need an indicator matrix, W, to identify which correlations are estimated in each study. The relationship among the stacked correlation vector, the indicator matrix, and the population correlation vector (ρ) is

$$r = W\rho + e, \qquad (1)$$

where $r = (r_1', r_2', r_3', r_4')'$ is the 20 x 1 vector of all observed correlations, $\rho = (\rho_{(Y1)}, \rho_{(Y2)}, \rho_{(Y3)}, \rho_{(Y4)}, \rho_{(12)}, \rho_{(13)}, \rho_{(14)}, \rho_{(23)}, \rho_{(24)}, \rho_{(34)})'$ is the 10 x 1 vector of population correlations for the full model, e is the corresponding vector of sampling errors, and W is the 20 x 10 indicator matrix in which each row contains a single 1 in the column of the population correlation estimated by the corresponding element of r, and 0s elsewhere. For instance, the row for $r_{1(Y1)}$ is (1, 0, 0, 0, 0, 0, 0, 0, 0, 0), and the row for $r_{2(12)}$ is (0, 0, 0, 0, 1, 0, 0, 0, 0, 0).

The estimated population correlation vector ($\hat{\rho}$) contains the synthesized correlations for the full model, if we assume the final model contains the outcome Y and all four predictors. It can be computed as

$$\hat{\rho} = (W'\hat{\Sigma}^{-1}W)^{-1}W'\hat{\Sigma}^{-1}r, \qquad (2)$$

where $\hat{\Sigma}$ is the large variance-covariance matrix containing the variance-covariance matrices ($\hat{\Sigma}_k$s) of all the studies included in the meta-analysis on the diagonal, and zeros in the upper and lower triangles. That is,

$$\hat{\Sigma} = \begin{bmatrix} \hat{\Sigma}_1 & 0 & 0 & 0 \\ 0 & \hat{\Sigma}_2 & 0 & 0 \\ 0 & 0 & \hat{\Sigma}_3 & 0 \\ 0 & 0 & 0 & \hat{\Sigma}_4 \end{bmatrix}. \qquad (3)$$

The components of $\hat{\Sigma}_k$ depend on the intercorrelations of the predictors and the outcome, as well as the sample size in study k. The variance of each correlation in each study can be obtained from the second-order and fourth-order moments of the samples based on large-sample theory (Pearson & Filon, 1898; Olkin & Siotani, 1976), which can be simplified to

$$\hat{\sigma}_k^2(r_{k(ij)}) = \frac{(1 - r_{k(ij)}^2)^2}{n_k} \qquad (4)$$

for k = 1, 2, 3, and 4; i, j = Y, X1, X2, X3, and X4; i ≠ j. The covariance between any two correlations that have one variable in common is

$$\hat{\sigma}_k(r_{k(ij)}, r_{k(ij')}) = \left[ \tfrac{1}{2}\left(2r_{k(jj')} - r_{k(ij)}r_{k(ij')}\right)\left(1 - r_{k(ij)}^2 - r_{k(ij')}^2 - r_{k(jj')}^2\right) + r_{k(jj')}^3 \right] \Big/ n_k. \qquad (5)$$

The covariance between any two correlations that do not involve any variable in common is

$$\hat{\sigma}_k(r_{k(ij)}, r_{k(i'j')}) = \Big[ \tfrac{1}{2} r_{k(ij)} r_{k(i'j')} \left( r_{k(ii')}^2 + r_{k(ij')}^2 + r_{k(ji')}^2 + r_{k(jj')}^2 \right) + r_{k(ii')} r_{k(jj')} + r_{k(ij')} r_{k(ji')} - \left( r_{k(ij)} r_{k(ii')} r_{k(ij')} + r_{k(ij)} r_{k(ji')} r_{k(jj')} + r_{k(ii')} r_{k(ji')} r_{k(i'j')} + r_{k(ij')} r_{k(jj')} r_{k(i'j')} \right) \Big] \Big/ n_k. \qquad (6)$$

Therefore, to fit on a page, the full variance-covariance matrices for the first two studies, stacked to form the large matrix for GLS, look like [the matrix displayed at this point is not legible in the source].
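The GLS machinery above is mechanical enough to sketch in code. The following is a minimal illustration, not the dissertation's actual program: corr_vector_cov builds a study's covariance matrix of correlations from the general Olkin-Siotani fourth-moment expression (which reduces to Eqs. 4 through 6 when the two index pairs coincide, share one variable, or are all distinct), and gls_pool assembles W, the block-diagonal Sigma, and r, and applies Eq. 2. The function names and toy inputs are mine.

```python
import numpy as np
from itertools import combinations
from scipy.linalg import block_diag

def corr_vector_cov(R, n):
    """Large-sample covariance matrix (Olkin & Siotani, 1976) of the
    unique correlations in R, in combinations(range(p), 2) order."""
    pairs = list(combinations(range(R.shape[0]), 2))

    def cov(i, j, k, l):
        # General expression; reduces to Eqs. (4)-(6) as special cases.
        return (0.5 * R[i, j] * R[k, l]
                * (R[i, k]**2 + R[i, l]**2 + R[j, k]**2 + R[j, l]**2)
                + R[i, k] * R[j, l] + R[i, l] * R[j, k]
                - (R[i, j] * R[i, k] * R[i, l]
                   + R[j, i] * R[j, k] * R[j, l]
                   + R[k, i] * R[k, j] * R[k, l]
                   + R[l, i] * R[l, j] * R[l, k]))

    S = np.array([[cov(i, j, k, l) for (k, l) in pairs]
                  for (i, j) in pairs]) / n
    return S, pairs

def gls_pool(study_Rs, study_ns, study_vars, p_total):
    """Pool correlations from studies observing subsets of variables:
    rho_hat = (W' Sigma^-1 W)^-1 W' Sigma^-1 r   (Eq. 2).
    study_vars maps each study's local variables onto 0..p_total-1."""
    full_pairs = list(combinations(range(p_total), 2))
    col = {pr: c for c, pr in enumerate(full_pairs)}

    r_parts, W_parts, S_parts = [], [], []
    for R, n, vars_k in zip(study_Rs, study_ns, study_vars):
        S_k, pairs_k = corr_vector_cov(R, n)
        r_parts.append([R[i, j] for i, j in pairs_k])
        W_k = np.zeros((len(pairs_k), len(full_pairs)))
        for row, (i, j) in enumerate(pairs_k):
            W_k[row, col[tuple(sorted((vars_k[i], vars_k[j])))]] = 1.0
        W_parts.append(W_k)
        S_parts.append(S_k)

    r = np.concatenate(r_parts)
    W = np.vstack(W_parts)
    S_inv = np.linalg.inv(block_diag(*S_parts))  # Eq. (3), inverted
    rho = np.linalg.solve(W.T @ S_inv @ W, W.T @ S_inv @ r)
    return dict(zip(full_pairs, rho))

# Toy use: variable 0 is Y; study A observes (Y, X1), study B (Y, X1, X2).
R_B = np.array([[1.0, 0.5, 0.4],
                [0.5, 1.0, 0.3],
                [0.4, 0.3, 1.0]])
print(gls_pool([R_B[:2, :2], R_B], [100, 400], [(0, 1), (0, 1, 2)], 3))
```

The pooled correlations returned by gls_pool would then feed the normal-equations step sketched in Section 3.2 to produce the final standardized slopes.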
Table 5.1 (partial)
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern I and Correlation Matrix R1

Method      B1       B2       B3       B4       SE1       SE2       SE3       SE4
Population  0.5161   0.2253   0.1886   0.1734
N2 missing: 0%  25%  50%  75%
SWP         0.5164   0.2248   0.1884   0.1734a  0.000274  0.000323  0.000362  0.000493
N3 missing: 0%  4%  18%  45%
GLS         0.5170   0.2255   0.1888   0.1728   0.000364  0.000425  0.000431  0.000485
SWP         0.5160   0.2252   0.1888   0.1734a  0.000362  0.000423  0.000430  0.000487
N4 missing: 0%  55%  82%  96%
GLS         0.5167   0.2260   0.1899   0.1744   0.000649  0.000841  0.001043  0.001883
SWP         0.5159   0.2249   0.1881   0.1767   0.000653  0.000843  0.001038  0.001928
Note. The Population row gives the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.2
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern II and Correlation Matrix R1

Method      B1       B2       B3       B4       SE1       SE2       SE3       SE4
Population  0.5161   0.2253   0.1886   0.1734
N1 missing: 0%  0%  0%  25%
GLS         0.5230   0.2269   0.1894   0.1737   0.000945  0.001050  0.001008  0.001109
SWP         0.5156   0.2247   0.1876   0.1729   0.000887  0.000998  0.000950  0.001064
N2 missing: 0%  0%  0%  25%
GLS         0.5171   0.2250   0.1887   0.1733   0.000240  0.000267  0.000266  0.000282
SWP         0.5165   0.2248   0.1885   0.1733   0.000239  0.000266  0.000265  0.000281
N3 missing: 0%  0%  0%  4%
GLS         0.5180   0.2252   0.1889   0.1732   0.000359  0.000401  0.000386  0.000369
SWP         0.5167   0.2249   0.1885   0.1733   0.000358  0.000396  0.000382  0.000366
N4 missing: 0%  0%  0%  55%
GLS         0.5171   0.2250   0.1889   0.1743   0.000373  0.000427  0.000394  0.000550
SWP         0.5158   0.2246   0.1889   0.1735   0.000369  0.000424  0.000389  0.000542
Note. The Population row gives the population slopes for the predictors.

Table 5.3
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern III and Correlation Matrix R1

Method      B1       B2       B3       B4       SE1       SE2       SE3       SE4
Population  0.5161   0.2253   0.1886   0.1734
N1 missing: 0%  75%  75%  75%
GLS         0.5243   0.2237   0.1875   0.1704   0.001248  0.001925  0.001810  0.001795
SWP         0.5170   0.2263   0.1896   0.1723   0.001242  0.001947  0.001826  0.001818
N2 missing: 0%  75%  75%  75%
GLS         0.5171   0.2249   0.1891   0.1742   0.000340  0.000519  0.000511  0.000491
SWP         0.5165   0.2251   0.1892   0.1744   0.000340  0.000520  0.000513  0.000492
N3 missing: 0%  45%  45%  45%
GLS         0.5169   0.2251   0.1890   0.1729   0.000398  0.000521  0.000506  0.000476
SWP         0.5158   0.2254   0.1893   0.1732   0.000398  0.000523  0.000508  0.000477
N4 missing: 0%  96%  96%  96%
GLS         0.5191   0.2243   0.1860   0.1724   0.001099  0.001930  0.001902  0.001786
SWP         0.5173   0.2250   0.1864   0.1728   0.001104  0.001940  0.001907  0.001797
Note. The Population row gives the population slopes for the predictors.

Table 5.4
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern IV and Correlation Matrix R1

Method      B1       B2       B3       B4       SE1       SE2       SE3       SE4
Population  0.5161   0.2253   0.1886   0.1734
N1 missing: 0%  0%  0%  75%
GLS         0.5218   0.2283   0.1915   0.1688   0.001009  0.001096  0.001098  0.001799
SWP         0.5151   0.2254   0.1882   0.1742   0.000986  0.001067  0.001063  0.001862
N2 missing: 0%  0%  0%  75%
GLS         0.5169   0.2247   0.1886a  0.1737   0.000251  0.000277  0.000280  0.000478
SWP         0.5164   0.2245   0.1884   0.1741   0.000251  0.000275  0.000278  0.000479
N3 missing: 0%  0%  0%  45%
GLS         0.5173   0.2255   0.1895   0.1724   0.000356  0.000413  0.000401  0.000478
SWP         0.5161a  0.2251   0.1890   0.1731   0.000356  0.000411  0.000399  0.000480
N4 missing: 0%  0%  0%  96%
GLS         0.5171   0.2257   0.1895   0.1742   0.000584  0.000622  0.000633  0.001833
SWP         0.5158   0.2251   0.1886a  0.1763   0.000593  0.000627  0.000638  0.001855
Note. The Population row gives the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.5
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern V and Correlation Matrix R1

Method      B1       B2       B3       B4       SE1       SE2       SE3       SE4
Population  0.5161   0.2253   0.1886   0.1734
N1 missing: 0%  0%  0%  0%
GLS         0.5240   0.2272   0.1892   0.1737   0.000927  0.001030  0.000988  0.000982
SWP         0.5166   0.2247   0.1878   0.1719   0.000879  0.000979  0.000935  0.000931
N2 missing: 0%  0%  0%  0%
GLS         0.5170   0.2253a  0.1887   0.1734a  0.000249  0.000269  0.000260  0.000248
SWP         0.5165   0.2251   0.1885   0.1733   0.000248  0.000267  0.000259  0.000248
N3 missing: 0%  0%  0%  0%
GLS         0.5179   0.2252   0.1889   0.1736   0.000367  0.000404  0.000374  0.000367
SWP         0.5165   0.2248   0.1885   0.1734a  0.000362  0.000397  0.000369  0.000362
N4 missing: 0%  0%  0%  0%
GLS         0.5177   0.2252   0.1890   0.1736   0.000366  0.000401  0.000372  0.000366
SWP         0.5166   0.2248   0.1886a  0.1734a  0.000361  0.000397  0.000369  0.000362
Note. The Population row gives the population slopes for the predictors. a. Mean estimated slope is equal to the population value.
5.1.2 Correlation Matrix R2

Pattern I. The combination of Pattern I with R2 had more missing data as the relationship between the predictor and the outcome became weaker, and with R2 there was no correlation among the predictors. As shown in Table 5.6, SWP estimated the slope for X1 better than GLS under the different sample sizes; GLS always overestimated the slope for X1. Unlike the results for this pattern with R1, GLS estimated the slope of X3 precisely when the sample sizes were small and equal across studies (N1). As was true for correlation matrix R1, GLS was superior when much data was missing (96% on X4 in N4). Compared to the scenario where the correlation matrix was R1, the results from GLS were more stable, with smaller SEs, than those from SWP, yet the differences in SEs between the two methods were small.

Pattern II. The combination of Pattern II with R2 had missingness only on the last variable, X4, the weakest predictor, in one study included in the meta-analysis, and there was no correlation among the predictors. As shown in Table 5.7, SWP gave better estimates of the slope for X1, and GLS still overestimated the slope for this variable. GLS usually did better in estimating the slopes for X2 and X3, while SWP did well in estimating the slope for X4. SWP produced more stable estimates of the slopes for all variables, except for X4 with the sample sizes defined by N2, where GLS produced a slightly smaller SE than SWP.

Pattern III. In the combination of Pattern III with R2, the predictors that were weakly related to the outcome (X2, X3, and X4) were present in only the last study in the synthesis, and there was no correlation among the predictors. As shown in Table 5.8, SWP still worked better than GLS in estimating the slope for the fully observed variable X1, and GLS still overestimated the slope for this variable. SWP also performed better in estimating the slopes for the four variables when the sample sizes were small and equal across the four studies (N1). The variables that were less strongly related to the outcome (X3 and X4) were more often missing, and GLS started to show better estimates. Similar to the condition where the correlation matrix was R1, GLS produced slightly more stable estimates than SWP.

Pattern IV. The combination of Pattern IV with correlation matrix R2 had predictors X1 to X3 present in all four studies. Predictor X4, which had the weakest relation to the outcome, was present only in the last study included in the synthesis. As shown in Table 5.9, SWP performed better in estimating the slope for X1, while GLS kept overestimating the slope for this variable. SWP still worked better than GLS with small, equal sample sizes (N1) in this pattern. When sample sizes were large with much missing data (e.g., X4 with 75% missing in N2 and 96% in N4), GLS tended to work better than SWP. GLS also produced more stable estimates of the slope of X4 when sample sizes varied, as well as for all four variables in N4.

Pattern V. In Pattern V with correlation matrix R2, all the studies in the synthesis included all four predictors; there was no missing data for any of the predictors and no correlation among them. In this scenario, as shown in Table 5.10, SWP produced estimates closer to the population value for X1, while GLS continued to overestimate the slope for this variable. However, compared to the results from correlation matrix R1, the mean slopes from GLS were closer to the population values for most of the predictors when sample sizes varied. As in the conditions where the correlation matrix was R1, SWP produced more stable estimates than GLS.
However, comparing to the results from the correlation matrix R1, the mean slopes from GLS were closer to the population values for most of the predictors than those from GLS, when sample sizes varied. As in the conditions where the correlation matrix was R1, SWP produced more stable estimates 77 than GLS. Table 5.6 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern I and Correlation Matrix R2 Method E, E, E, E, SE, SE2 SE3 SE. 0.6 0.4 0.3 0.25 N1 0% 25% 50% 75% a GLS 0.6041 0.4013 03000 0.2442 0.001040 0.001082 0.001123 0.001419 sw1> 0.5998 0.3992 0.3003 0.2514 0.001041 0.001079 0.001123 0.001467 N2 0% 25% 50% 75% GLS 0.6005 0.3997 0.2999 0.2497 0.000277 0.000288 0.000299 0.000378 0.2999 0.2502 0.000279 0.000288 0.000300 0.000384 SWP 0.6002 0.3995 18% 45% a 0-4000 0.2999 0.2492 0.000333 0.000359 0.000354 0.000393 GLS 0.6006 0.2501 0.000331 0.000358 0.000354 0.000397 N3 0% 4% SWP 0.5999 0.3998 0.3001 N4 0% 55% 82% 96% GLS 0.6006 0.4010 0.3016 0.2513 0.000753 0.000842 0.000902 0.001411 0.2546 0.000775 0.000850 0.000915 0.001487 SWP 0.5996 0.3997 0.3001 Note. The bolded values are the population slopes for predictors. a. Mean estimated slope is equal to the population value. 78 Table 5.7 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern II and Correlation Matrix R2 A A Method B, 32 D, B, SEl SE, SE, SE4 0.6 0.4 03 0.25 N1 0% 0% 0% 25% GLS 0.6043 0.4015 0.3005 0.2497 0.000823 0.000863 0.000847 0.000890 SWP 0.5991 0.3990 0.2987 0.2499 0.000772 0.000824 0.000799 0.000854 N2 0% 0% 0% 25% 8 8 GLS 0.6006 0.3997 03000 112500 0.000211 0.000223 0.000221 0.000230 8 SWP 0.6002 0.3995 0.2999 112500 0.000209 0.000221 0.000220 0.000233 N3 0% 0% 0% 4% GLS 0.6012 0.3998 0.3001 0.2496 0.000312 0.000329 0.000306 0.000306 SWP 0.6003 0.3994 0.2998 0.2499 0.000310 0.000326 0.000305 0.000305 N4 0% 0% 0% 55% GLS 0.6005 0.3999 0.3005 0.2514 0.000335 0.000370 0.000341 0.000436 SWP 0.5996 0.3994 0.3003 0.2504 0.000332 0.000366 0.000337 0.000435 Note. The bolded values are the population slopes for predictors. a. Mean estimated slope is equal to the population value. 79 Table 5.8 Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern III and Correlation Matrix R2 Method 5, 132 B, 3, SE, SE, SE, SE4 0.6 0.4 0.3 0.25 N1 0% 75% 75% 75% GLS 0.6067 0.3977 0.2981 0.2472 0.001321 0.001464 0.001446 0.001450 swr 0.6021 0.4014 0.3005 0.2491 0.001328 0.001499 0.001452 0.001464 N2 0% 75% 75% 75% GLS 0.6006 0.3996 0.3005 0.2506 0.000351 0.000397 0.000401 0.000398 SWP 0.6002 0.3999 0.3006 0.2508 0.000351 0.000402 0.000404 0.000399 N3 0% 45% 45% 45% GLS 0.6007 0.3997 0.3002 0.2497 0.000375 0.000408 0.000406 0.000394 SWP 0500011 0.4003 0.3006 035008 0.000375 0.000412 0.000408 0.000394 N4 0% 96% 96% 96% GLS 0.6023 0.4006 0.2990 0.2507 0.001283 0.001434 0.001498 0.001452 SWP 0.6007 0.4008 0.2992 0.2508 0.001281 0.001464 0.001502 0.001459 Note. The bolded values are the population slopes for predictors. a. Mean estimated slope is equal to the population value. 
Table 5.9
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern IV and Correlation Matrix R2

Method      B1       B2       B3       B4       SE1       SE2       SE3       SE4
Population  0.6      0.4      0.3      0.25
N1 missing: 0%  0%  0%  75%
GLS         0.6034   0.4020   0.3014   0.2440   0.000968  0.000968  0.000977  0.001367
SWP         0.5987   0.3994   0.2993   0.2532   0.000967  0.000965  0.000954  0.001435
N2 missing: 0%  0%  0%  75%
GLS         0.6006   0.3995   0.3002   0.2501   0.000244  0.000253  0.000253  0.000371
SWP         0.6002   0.3993   0.3000a  0.2508   0.000246  0.000245  0.000253  0.000376
N3 missing: 0%  0%  0%  45%
GLS         0.6008   0.4003   0.3006   0.2488   0.000318  0.000352  0.000337  0.000389
SWP         0.6000a  0.3999   0.3003   0.2501   0.000320  0.000352  0.000335  0.000393
N4 missing: 0%  0%  0%  96%
GLS         0.6013   0.4010   0.3008   0.2526   0.000680  0.000698  0.000669  0.001365
SWP         0.6001   0.4003   0.3002   0.2550   0.000705  0.000708  0.000675  0.001407
Note. The Population row gives the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.10
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern V and Correlation Matrix R2

Method      B1       B2       B3       B4       SE1       SE2       SE3       SE4
Population  0.6      0.4      0.3      0.25
N1 missing: 0%  0%  0%  0%
GLS         0.6053   0.4017   0.3002   0.2500a  0.000798  0.000825  0.000803  0.000806
SWP         0.6000a  0.3989   0.2987   0.2487   0.000758  0.000790  0.000760  0.000771
N2 missing: 0%  0%  0%  0%
GLS         0.6006   0.3999   0.3001   0.2500a  0.000210  0.000221  0.000220  0.000209
SWP         0.6002   0.3997   0.2999   0.2499   0.000209  0.000210  0.000206  0.000207
N3 missing: 0%  0%  0%  0%
GLS         0.6011   0.3998   0.3002   0.2502   0.000313  0.000331  0.000302  0.000304
SWP         0.6002   0.3994   0.2998   0.2499   0.000309  0.000326  0.000297  0.000300
N4 missing: 0%  0%  0%  0%
GLS         0.6010   0.3999   0.3003   0.2501   0.000311  0.000330  0.000299  0.000303
SWP         0.6002   0.3994   0.2999   0.2499   0.000308  0.000327  0.000298  0.000300
Note. The Population row gives the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

5.1.3 Correlation Matrix R3

Pattern I. The combination of Pattern I with R3 had more missing data as the relationships between the predictors and the outcome became stronger. As shown in Table 5.11, GLS outperformed SWP in estimating the slope of X1 in N4, where a large portion of the data were missing on X2, X3, and X4. Consistent with previous results, GLS tended to slightly overestimate the slope of X1 when sample sizes varied in this pattern, and so did SWP. When the sample sizes were small and equal across studies in the synthesis (N1), SWP performed better. When the sample sizes were large and equal (N2), GLS tended to do better. When a large portion of the data were missing on X4 (e.g., in N4), which had a high correlation with the outcome, GLS generally performed better. SWP tended to be more stable when the sample sizes were small and equal across studies (N1) and when missingness occurred less often (N3); GLS seemed to be more stable when the sample sizes were large (N2) or when more data were missing (N4).

Pattern II. The combination of Pattern II with correlation matrix R3 had missing data only on the last variable, X4, which had the strongest relation to the outcome, in only one study included in the meta-analysis. As shown in Table 5.12, SWP produced mean slopes for X1 that were closer to the population value (B1 = 0.1734) than the GLS means, except in N4, where there were more missing values on the last predictor; that was the same finding as in Pattern I with correlation matrix R3. For all the different sample sizes, SWP produced better slopes for X4 than did GLS. When the overall sample size was large and the data were more complete (e.g.,
N2 and N3), SWP precisely reproduced the population value for the slope for X4. Also, SWP resulted in more stable estimates.

Pattern III. In Pattern III with the correlation matrix R3, the predictors that were more strongly related to the outcome (X2, X3, and X4) were present in only the last study in the synthesis. As shown in Table 5.13, SWP tended to perform better than GLS in estimating the slope for X1 when sample sizes varied, except in N3, where the sole information on X2, X3, and X4 was based on a study with a large sample size. Generally speaking, GLS and SWP produced very similar mean slopes for several variables under different sample size sets (e.g., X2 in N2 and N3; X4 in N2) in this pattern. GLS and SWP produced similar SEs, yet GLS was slightly more stable than SWP in most of the conditions.

Pattern IV. The combination of Pattern IV with correlation matrix R3 had predictors X1 to X3 present in all four studies, and X4, which had the strongest relation to the outcome, was present only in the last study included in the synthesis. As shown in Table 5.14, SWP consistently resulted in better estimates of the slope of X1 than GLS, which tended to overestimate the slope of X1 as well as the slopes of the other variables. SWP also performed better than GLS in N1, N3, and N4 at estimating the slopes of X2 and X3. For X4, GLS tended to perform better than SWP, and GLS consistently came up with more stable estimates of the slope for X4.

Pattern V. In Pattern V, all the studies in the synthesis included all four predictors, and there were no missing data for any of the predictors. Under this scenario, as shown in Table 5.15, SWP produced better estimates of the X1 slope most of the time. In contrast to previous findings in this pattern, GLS produced a mean slope for X1 that was the same as the population value when the sample size was large and equal across studies (N2). When the sample size was small and equal across studies (N1), SWP tended to perform better than GLS at estimating the slopes of all variables. SWP also better estimated the slopes for X3 and X4, which were more strongly related to the outcome in this pattern. Moreover, SWP produced more stable estimates than GLS for the slopes of all variables when sample sizes varied.

Table 5.11
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern I and Correlation Matrix R3

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.1734   0.1886   0.2253   0.5161
N1      0%       25%      50%      75%
GLS     0.1758   0.1914   0.2285   0.5121   0.001387  0.001460  0.001599  0.001640
SWP     0.1736   0.1882   0.2240   0.5184   0.001386  0.001434  0.001584  0.001656
N2      0%       25%      50%      75%
GLS     0.1737   0.1882   0.2253a  0.5161a  0.000383  0.000391  0.000425  0.000438
SWP     0.1736   0.1880   0.2250   0.5165   0.000383  0.000391  0.000426  0.000444
N3      0%       4%       18%      45%
GLS     0.1736   0.1887   0.2258   0.5157   0.000438  0.000467  0.000472  0.000456
SWP     0.1732   0.1883   0.2254   0.5163   0.000437  0.000465  0.000470  0.000457
N4      0%       55%      82%      96%
GLS     0.1735   0.1892   0.2277   0.5177   0.001146  0.001316  0.001389  0.001637
SWP     0.1731   0.1879   0.2242   0.5226   0.001160  0.001327  0.001396  0.001695
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.
Table 5.12
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern II and Correlation Matrix R3

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.1734   0.1886   0.2253   0.5161
N1      0%       0%       0%       25%
GLS     0.1750   0.1901   0.2262   0.5215   0.001052  0.001073  0.001089  0.001031
SWP     0.1729   0.1878   0.2238   0.5164   0.000998  0.001020  0.001024  0.000968
N2      0%       0%       0%       25%
GLS     0.1738   0.1883   0.2254   0.5166   0.000276  0.000282  0.000276  0.000260
SWP     0.1737   0.1881   0.2252   0.5161a  0.000274  0.000280  0.000275  0.000260
N3      0%       0%       0%       4%
GLS     0.1743   0.1885   0.2259   0.5166   0.000394  0.000398  0.000397  0.000348
SWP     0.1739   0.1881   0.2252   0.5161a  0.000393  0.000395  0.000394  0.000344
N4      0%       0%       0%       55%
GLS     0.1733   0.1880   0.2254   0.5188   0.000455  0.000479  0.000472  0.000495
SWP     0.1730   0.1879   0.2256   0.5165   0.000448  0.000473  0.000467  0.000493
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.13
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern III and Correlation Matrix R3

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.1734   0.1886   0.2253   0.5161
N1      0%       75%      75%      75%
GLS     0.1789   0.1887   0.2269   0.5123   0.001531  0.001930  0.001866  0.001695
SWP     0.1757   0.1891   0.2272   0.5138   0.001521  0.001933  0.001870  0.001706
N2      0%       75%      75%      75%
GLS     0.1741   0.1886a  0.2256   0.5170   0.000408  0.000533  0.000540  0.000457
SWP     0.1739   0.1886a  0.2257   0.5170   0.000408  0.000534  0.000540  0.000459
N3      0%       45%      45%      45%
GLS     0.1735   0.1883   0.2259   0.5159   0.000440  0.000505  0.000521  0.000452
SWP     0.1731   0.1883   0.2260   0.5161a  0.000440  0.000505  0.000521  0.000451
N4      0%       96%      96%      96%
GLS     0.1746   0.1893   0.2221   0.5170   0.001423  0.001923  0.001987  0.001705
SWP     0.1736   0.1894   0.2223   0.5176   0.001424  0.001925  0.001994  0.001728
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.14
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern IV and Correlation Matrix R3

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.1734   0.1886   0.2253   0.5161
N1      0%       0%       0%       75%
GLS     0.1747   0.1920   0.2311   0.5124   0.001381  0.001366  0.001433  0.001623
SWP     0.1723   0.1883   0.2236   0.5205   0.001372  0.001361  0.001423  0.001655
N2      0%       0%       0%       75%
GLS     0.1740   0.1880   0.2257   0.5163   0.000353  0.000361  0.000374  0.000425
SWP     0.1738   0.1878   0.2252   0.5169   0.000345  0.000362  0.000375  0.000429
N3      0%       0%       0%       45%
GLS     0.1738   0.1889   0.2267   0.5150   0.000430  0.000450  0.000448  0.000440
SWP     0.1735   0.1885   0.2257   0.5159   0.000430  0.000449  0.000446  0.000442
N4      0%       0%       0%       96%
GLS     0.1741   0.1899   0.2268   0.5180   0.001110  0.001158  0.001202  0.001601
SWP     0.1733   0.1888   0.2240   0.5226   0.001127  0.001169  0.001215  0.001631
Note. The bolded values are the population slopes for the predictors.

Table 5.15
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern V and Correlation Matrix R3

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.1734   0.1886   0.2253   0.5161
N1      0%       0%       0%       0%
GLS     0.1760   0.1902   0.2264   0.5219   0.000987  0.001013  0.001014  0.000936
SWP     0.1740   0.1879   0.2246   0.5147   0.000952  0.000961  0.000949  0.000887
N2      0%       0%       0%       0%
GLS     0.1734a  0.1886a  0.2255   0.5166   0.000264  0.000262  0.000268  0.000231
SWP     0.1737   0.1885   0.2253a  0.5160   0.000263  0.000261  0.000268  0.000229
N3      0%       0%       0%       0%
GLS     0.1740   0.1883   0.2258   0.5174   0.000401  0.000400  0.000388  0.000347
SWP     0.1737   0.1880   0.2253a  0.5162   0.000395  0.000393  0.000382  0.000340
N4      0%       0%       0%       0%
GLS     0.1739   0.1884   0.2259   0.5174   0.000399  0.000398  0.000384  0.000343
SWP     0.1737   0.1880   0.2253a  0.5162   0.000394  0.000394  0.000382  0.000340
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.
5.1.4 Correlation Matrix R4

Pattern I. The combination of Pattern I with correlation matrix R4 led to more missing data as the relationships between the predictor variables and the outcome became stronger. Also, there was no correlation among the predictors in R4. As shown in Table 5.16, both GLS and SWP performed well in estimating the slope of X1. SWP estimated the slope of X1 precisely when the sample sizes were equal across studies (N1 and N2). As was true for other patterns and correlations, GLS tended to overestimate the slope of X1 all the time. SWP did not estimate the slope well when large amounts of data were missing on the variable that related strongly to the outcome (e.g., the slope for X4 in N4). Similar to earlier findings, GLS always produced more stable estimates of the slope for X4 and resulted in a smaller SE. The differences in SEs between the two methods were similar to those found in Pattern I with correlation matrix R2.

Pattern II. The combination of Pattern II with correlation matrix R4 had missing data only on the last variable X4, which had the strongest relation to the outcome and appeared in only one study included in the meta-analysis. There was no correlation among the predictors. As shown in Table 5.17, SWP produced better estimates of the slope for X1 most of the time, and GLS still tended to overestimate the slope of X1. GLS performed better than SWP at estimating the slope of X2. GLS also produced very precise estimates of the slope of X4 when there was little missing data (4%) in N3. SWP produced more stable estimates than GLS most of the time, except that the SE4 values calculated via GLS in N2 and N4 were smaller than those produced by SWP.

Pattern III. When Pattern III was combined with correlation matrix R4, the predictors that were more strongly related to the outcome (X2, X3, and X4) were present in only the last study in the synthesis, and there was no correlation among the predictors. As shown in Table 5.18, SWP gave better estimates of the slope for X1 most of the time, and GLS tended to overestimate the slope of X1. Compared with the results from other scenarios, neither method performed particularly well at estimating the slope of X1 in N1. When the sample size was larger and equal across studies (N2), GLS and SWP overestimated the slopes for all four variables. Both methods produced good estimates of the slopes when less missing data occurred (N3); when more missing data occurred (N4), SWP produced better estimates for X1, which was the only variable that was fully observed. Both methods produced equally stable estimates in most situations.

Pattern IV. In the combination of Pattern IV with correlation matrix R4, predictors X1 to X3 were present in all four studies, whereas X4, which was related to the outcome most strongly, was present only in the last study included in the synthesis. As shown in Table 5.19, SWP estimated the slope of X1 better across all the sample size sets, except in N1, where the GLS estimate was less biased. Both methods tended to overestimate the slope of X1. SWP gave better estimates of the slope of X3, which was fully observed and, among the observed predictors, the most strongly related to the outcome. When sample sizes were equal and large across studies (N2), GLS produced a precise estimate of the slope of X4.
When many values were missing on X4 (96% in N4), GLS also better estimated the slope of X4. When the proportion of missingness was smaller (45% in N3), SWP tended to do better. GLS generally produced more stable estimates than SWP.

Pattern V. In Pattern V with the correlation matrix R4, all the studies in the synthesis included all four predictors. No missing data occurred for any of the predictors, and there was no correlation among those predictors. As shown in Table 5.20, SWP consistently produced estimates of the slope for X1 that were closer to the population value, while GLS always resulted in overestimation of the slope of this variable when sample sizes varied. SWP worked well when the sample sizes were small and equal (N1). SWP produced less stable estimates of the slopes for X4. SWP also produced less stable estimates when variables highly related to the outcome were based on smaller sample sizes (N4).

Table 5.16
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern I and Correlation Matrix R4

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.25     0.3      0.4      0.6
N1      0%       25%      50%      75%
GLS     0.2522   0.3023   0.4022   0.5954   0.001271  0.001312  0.001370  0.001240
SWP     0.2500a  0.2996   0.3998   0.6034   0.001283  0.001308  0.001371  0.001312
N2      0%       25%      50%      75%
GLS     0.2502   0.2997   0.4000   0.5999   0.000356  0.000361  0.000369  0.000330
SWP     0.2500a  0.2995   0.3998   0.6004   0.000356  0.000362  0.000370  0.000354
N3      0%       4%       18%      45%
GLS     0.2501   0.3001   0.4000a  0.5994   0.000388  0.000406  0.000399  0.000371
SWP     0.2498   0.2998   0.4000a  0.6002   0.000387  0.000405  0.000398  0.000387
N4      0%       55%      82%      96%
GLS     0.2506   0.3015   0.4029   0.6015   0.001145  0.001283  0.001287  0.001130
SWP     0.2497   0.3001   0.4009   0.6066   0.001161  0.001293  0.001298  0.001321
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.17
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern II and Correlation Matrix R4

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.25     0.3      0.4      0.6
N1      0%       0%       0%       25%
GLS     0.2509   0.3015   0.4009   0.6027   0.000883  0.000894  0.000897  0.000844
SWP     0.2491   0.2993   0.3980   0.6002   0.000843  0.000852  0.000849  0.000807
N2      0%       0%       0%       25%
GLS     0.2502   0.2998   0.4001   0.6003   0.000234  0.000239  0.000234  0.000216
SWP     0.2502   0.2996   0.3999   0.6001   0.000232  0.000238  0.000233  0.000222
N3      0%       0%       0%       4%
GLS     0.2505   0.2999   0.4003   0.6000a  0.000326  0.000329  0.000320  0.000303
SWP     0.2502   0.2996   0.3998   0.5999   0.000325  0.000327  0.000318  0.000302
N4      0%       0%       0%       55%
GLS     0.2499   0.2999   0.4007   0.6021   0.000399  0.000428  0.000425  0.000374
SWP     0.2495   0.2995   0.4002   0.6004   0.000395  0.000422  0.000421  0.000392
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.
Table 5.18
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern III and Correlation Matrix R4

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.25     0.3      0.4      0.6
N1      0%       75%      75%      75%
GLS     0.2545   0.3010   0.4013   0.5976   0.001407  0.001568  0.001519  0.001438
SWP     0.2523   0.3012   0.4017   0.5986   0.001402  0.001572  0.001522  0.001449
N2      0%       75%      75%      75%
GLS     0.2505   0.3001   0.4004   0.6005   0.000369  0.000439  0.000434  0.000390
SWP     0.2503   0.3001   0.4004   0.6006   0.000369  0.000439  0.000434  0.000393
N3      0%       45%      45%      45%
GLS     0.2501   0.2998   0.4006   0.5999   0.000383  0.000413  0.000422  0.000394
SWP     0.2498   0.2999   0.4007   0.6000a  0.000383  0.000413  0.000422  0.000393
N4      0%       96%      96%      96%
GLS     0.2511   0.3021   0.3987   0.6026   0.001375  0.001555  0.001589  0.001463
SWP     0.2502   0.3018   0.3985   0.6026   0.001375  0.001557  0.001591  0.001486
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.19
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern IV and Correlation Matrix R4

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.25     0.3      0.4      0.6
N1      0%       0%       0%       75%
GLS     0.2511   0.3029   0.4037   0.5948   0.001295  0.001275  0.001314  0.001146
SWP     0.2487   0.2999   0.3993   0.6052   0.001305  0.001282  0.001316  0.001272
N2      0%       0%       0%       75%
GLS     0.2505   0.2996   0.4004   0.6000a  0.000333  0.000343  0.000349  0.000304
SWP     0.2503   0.2994   0.4001   0.6007   0.000335  0.000345  0.000351  0.000337
N3      0%       0%       0%       45%
GLS     0.2503   0.3004   0.4008   0.5987   0.000378  0.000397  0.000383  0.000356
SWP     0.2500a  0.3000a  0.4002   0.6001   0.000380  0.000397  0.000382  0.000373
N4      0%       0%       0%       96%
GLS     0.2516   0.3023   0.4020   0.6029   0.001129  0.001183  0.001181  0.001033
SWP     0.2505   0.3010   0.4002   0.6067   0.001145  0.001194  0.001186  0.001215
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.20
Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor for Pattern V and Correlation Matrix R4

Method  B̂1       B̂2       B̂3       B̂4       SE1       SE2       SE3       SE4
        0.25     0.3      0.4      0.6
N1      0%       0%       0%       0%
GLS     0.2516   0.3015   0.4008   0.6036   0.000808  0.000825  0.000797  0.000803
SWP     0.2499   0.2993   0.3986   0.5987   0.000784  0.000787  0.000750  0.000763
N2      0%       0%       0%       0%
GLS     0.2502   0.3001   0.4001   0.6003   0.000215  0.000217  0.000218  0.000203
SWP     0.2502   0.3000a  0.3999   0.5999   0.000214  0.000216  0.000217  0.000202
N3      0%       0%       0%       0%
GLS     0.2503   0.2998   0.4003   0.6008   0.000329  0.000323  0.000316  0.000305
SWP     0.2500a  0.2995   0.3998   0.5999   0.000324  0.000318  0.000311  0.000299
N4      0%       0%       0%       0%
GLS     0.2502   0.2998   0.4004   0.6007   0.000327  0.000321  0.000313  0.000302
SWP     0.2500a  0.2995   0.3999   0.5999   0.000323  0.000318  0.000311  0.000299
Note. The bolded values are the population slopes for the predictors. a. Mean estimated slope is equal to the population value.

Table 5.21 presents the bias ranges for each slope for each method across scenarios. When important variables tended to be missing from the model (R3 and R4), the estimates for X1, X2, and X3 from both methods departed relatively far from the population values. The ranges of the bias for the slope of X4 were largest among the four predictor slopes for both methods. Since X4 was missing the most across studies, it was more difficult to estimate precisely.
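Assuming the usual definition, the percentage relative bias reported in Table 5.21 for predictor j is

\[ \text{Bias}_j(\%) = 100 \times \frac{\bar{\hat{\beta}}_j - \beta_j}{\beta_j}, \]

where \(\bar{\hat{\beta}}_j\) is the mean estimated slope across replications and \(\beta_j\) is the population slope. For example, the SWP entry of 2.00 for X4 corresponds to the mean estimate of 0.2550 against the population value 0.25 under Pattern IV with N4 and R2 (Table 5.9).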
Table 5.21
Ranges of Percentage Relative Bias Produced by GLS and SWP

               SWP                                     GLS
               Largest negative  Largest positive     Largest negative  Largest positive
X1  Value      -0.6229           1.3324               -0.06             3.1782
    Scenario   Pat4N1R3          Pat3N1R3             Pat2N4R4          Pat3N1R3
X2  Value      -0.4375           0.6067               -0.71             1.7976
    Scenario   Pat4N2R3          Pat3N4R4             Pat3N1R1          Pat4N1R3
X3  Value      -1.3313           0.8432               -1.4334           1.5561
    Scenario   Pat3N4R3          Pat3N1R3             Pat3N4R3          Pat4N1R3
X4  Value      -0.8479           2.00                 -2.6648           1.1121
    Scenario   Pat5N1R1          Pat4N4R2             Pat4N1R1          Pat5N1R3
Note. Eight characters denote a scenario. The first four characters indicate the pattern (Pat1 through Pat5); the following two characters indicate the sample size set (N1 through N4); the last two characters indicate the correlation matrix (R1 through R4).

5.2 ANOVA Results

The ANOVA results for each predictor are summarized in Tables 5.22 through 5.25. Note that the scale of the marginal means for Pattern IV differs from that of the other patterns, in order to present the large negative differences between the two methods in estimating the X4 slope. The small adjusted R-squared values for the models of the difference between the two methods for each predictor (ranging from .029 for X2 to .088 for X1) indicate that only a small portion of the differences between the two methods was attributable to the missing-data patterns, correlation matrices, sample size sets, and their interactions. Because of the large amount of data generated for this research, the significance values were all less than .0001, which indicates the significance of all factors. The largest η² estimate among the four ANOVAs for the four predictors was .06, for sample sizes (N) in the model for the slope of X1. Different missing patterns explained less than 0.01% of the variance of the differences in estimates between the GLS and SWP methods for X1. Correlation matrices (Rs) also explained less than 0.01% of the variance of the differences between the two methods for X2. For X4, missing-data patterns explained the largest amount of variance (η² = .034), while the pattern-by-sample-size interaction also contributed 2.8% of the variance. The interactions were also significant at the .0001 level. To show the nature of the interactions, the correlation matrix (R) by sample size (N) interactions were plotted for the five missing-data patterns for each of the four predictors in Figures 5.1 through 5.4. In most of the plots for the X1 slope, large discrepancies between the methods arose in the sample size set N1. Across all five patterns, the largest differences between the methods in estimating the X1 slope were present with the matrix R1. The differences between the two methods in estimating the slope of X2 were smaller. For the predictors X3 and X4, where more data were missing, the two methods were similar in estimating the slopes under N2 and N3. The two methods differed in estimating the slopes of these two predictors when important correlations tended to be missing more frequently (R3 and R4).
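The partial eta-squared values in Tables 5.22 through 5.25 presumably follow the standard definition

\[ \eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}; \]

for instance, the entry for N in Table 5.22 is reproduced by .186/(.186 + 2.922) ≈ .060.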
Table 5.22
Analysis of Variance for the Differences in Estimates of the Slope of X1
Dependent Variable: B̂1(GLS) - B̂1(SWP)

Source            Type III SS  df     Mean Square  F        Sig.  Partial Eta Squared
Corrected Model   .287a        79     .004         99.23    .000  .089
Intercept         .177         1      .177         4848.50  .000  .057
Pattern           .001         4      .000         6.18     .000  .000
R                 .041         3      .014         370.09   .000  .014
N                 .186         3      .062         1696.83  .000  .060
Pattern * R       .003         12     .000         6.34     .000  .001
Pattern * N       .002         12     .000         4.97     .000  .001
R * N             .051         9      .006         153.87   .000  .017
Pattern * R * N   .003         36     9.44E-005    2.58     .000  .001
Error             2.922        79920  3.66E-005
Total             3.386        80000
Corrected Total   3.208        79999
a. R Squared = .089 (Adjusted R Squared = .088)

Table 5.23
Analysis of Variance for the Differences in Estimates of the Slope of X2
Dependent Variable: B̂2(GLS) - B̂2(SWP)

Source            Type III SS  df     Mean Square  F        Sig.  Partial Eta Squared
Corrected Model   .110a        79     .001         30.88    .000  .030
Intercept         .034         1      .034         750.78   .000  .009
Pattern           .032         4      .008         175.87   .000  .009
R                 .002         3      .001         12.19    .000  .000
N                 .031         3      .010         230.63   .000  .009
Pattern * R       .004         12     .000         6.68     .000  .001
Pattern * N       .035         12     .003         64.42    .000  .010
R * N             .002         9      .000         5.30     .000  .001
Pattern * R * N   .005         36     .000         2.96     .000  .001
Error             3.602        79920  4.51E-005
Total             3.745        80000
Corrected Total   3.712        79999
a. R Squared = .030 (Adjusted R Squared = .029)

Table 5.24
Analysis of Variance for the Differences in Estimates of the Slope of X3
Dependent Variable: B̂3(GLS) - B̂3(SWP)

Source            Type III SS  df     Mean Square  F        Sig.  Partial Eta Squared
Corrected Model   .156a        79     .002         43.75    .000  .041
Intercept         .046         1      .046         1014.26  .000  .013
Pattern           .039         4      .010         213.81   .000  .011
R                 .013         3      .004         93.08    .000  .003
N                 .033         3      .011         243.15   .000  .009
Pattern * R       .009         12     .001         16.07    .000  .002
Pattern * N       .044         12     .004         81.41    .000  .012
R * N             .012         9      .001         28.90    .000  .003
Pattern * R * N   .007         36     .000         4.52     .000  .002
Error             3.604        79920  4.51E-005
Total             3.806        80000
Corrected Total   3.760        79999
a. R Squared = .041 (Adjusted R Squared = .041)

Table 5.25
Analysis of Variance for the Differences in Estimates of the Slope of X4
Dependent Variable: B̂4(GLS) - B̂4(SWP)

Source            Type III SS  df     Mean Square  F        Sig.  Partial Eta Squared
Corrected Model   .652a        79     .008         82.690   .000  .076
Intercept         .067         1      .067         669.80   .000  .008
Pattern           .278         4      .070         696.66   .000  .034
R                 .007         3      .002         22.01    .000  .001
N                 .055         3      .018         182.15   .000  .007
Pattern * R       .033         12     .003         27.78    .000  .004
Pattern * N       .231         12     .019         193.06   .000  .028
R * N             .015         9      .002         16.91    .000  .002
Pattern * R * N   .033         36     .001         9.20     .000  .004
Error             7.982        79920  9.99E-005
Total             8.702        80000
Corrected Total   8.635        79999
a. R Squared = .076 (Adjusted R Squared = .075)

[Figure 5.1. Interactions of Sample Size Sets and Correlation Matrices for Five Patterns for Differences in Slopes of X1. Panels plot the estimated marginal means of the GLS - SWP difference at Patterns I through V against the sample size sets N1 through N4, with separate lines for correlation matrices R1 through R4.]
[Figure 5.2. Interactions of Sample Size Sets and Correlation Matrices for Five Patterns for Differences in Slopes of X2.]

[Figure 5.3. Interactions of Sample Size Sets and Correlation Matrices in Five Patterns for Differences in Slopes of X3.]
[Figure 5.4. Interactions of Sample Size Sets and Correlation Matrices in Five Patterns for Differences in Slopes of X4.]

5.3 Mixed-effects Model (Condition 5 through Condition 8)

In this part of the simulation, I made the models more complex by choosing different correlation matrices for the first two and the last two studies. This represents a more complex fixed-effects model, with two groups of studies. The results based on the different matrices (R5 through R8) under the mixed-effects model, for each of the five missing patterns, are shown in Tables 5.26 through 5.30. Within each pattern, the mean slope for each predictor and the standard errors based on the GLS and SWP methods were reported for each sample size set (N1 through N4) for each of the four conditions. Note that the population values for the slopes of the variables were always the same for N1 and N2. This is because N1 and N2 both had equal sample sizes across the four studies included in the synthesis, and the summarized correlation matrix used for calculating the slopes was weighted by sample size, as shown in the methods section.

5.3.1 Pattern I

In this pattern, the relative bias of the estimated slopes was less than 5% most of the time for both methods. However, the relative bias was much greater in some conditions. SWP generally performed better than GLS, and produced slopes closer to the population values. The worst estimation from SWP in this pattern was in Condition 5 when the sample size set was N4. Here correlations among the predictors (X1 and X2) existed in only one study (study 2), based on a somewhat large sample (the sample size for the second study was 1000 in N4). The relative biases of the slopes for X2, X3, and X4 were 9.55%, 11.48%, and 12.64%, respectively. Both GLS and SWP produced smaller relative bias when the sample sizes were from N2 and N3. They performed especially well in Condition 8 when the sample size set was N3: the relative biases of the estimates of all the slopes produced by both methods were less than 1% in that condition. The stability of the estimates of both methods was similar to that based on the fixed-effects model.

5.3.2 Pattern II

In this pattern, SWP produced much closer estimates than did the GLS procedure. Most of the time, SWP resulted in less than 1% relative bias in estimating the slopes. For GLS, with sample size sets N1 and N2, the relative biases were greater than 5% all the time for all variables, while SWP produced bias values under 1% most of the time. With sample size sets N3 and N4, GLS performed only slightly better, with the relative bias less than 5% for a few slopes.
The bias for those slopes with relative bias less than 5% by the GLS method ranged from 2.35% (the slope for X4 with sample size set N3 in Condition 6) to 4.98% (X3 with sample size set N4 in Condition 8). SWP did not perform as well as in other scenarios when the sample size set was N4 in Condition 8: the relative biases of the slopes for all four predictors were above 1%. However, they were still smaller than the values for estimates from the GLS method. The stability of the estimates of both methods was similar to that based on the simple fixed-effects model.

5.3.3 Pattern III

The results for each condition presented in Table 5.28 are identical to those presented in Table 5.8 (same as Condition 5), Table 5.18 (same as Condition 6), Table 5.3 (same as Condition 7), and Table 5.13 (same as Condition 8). The identity arose because, in this pattern, the intercorrelations among X2, X3, and X4 were provided by only the last study in the synthesis, which made it the same as Pattern III under the simple fixed-effects model. The comparisons between GLS and SWP under each condition for the different sample size sets can be found in the previous sections.

5.3.4 Pattern IV

In this pattern, GLS produced large relative percentage bias values in most conditions. The largest bias produced by GLS was in estimating the slope of X1 (bias = 18.33%) with the sample size set N4 in Condition 6. SWP also produced its largest bias in the same scenario (bias = 17.86%). In fact, when the sample size set was N4 in this pattern in Condition 5 and Condition 6, where X4 had zero correlation with the other variables, the estimated slopes for all four variables from both methods had rather large relative biases. SWP consistently resulted in large bias, ranging from 8.35% (in N3) to 35.95% (in N4), for β̂4 in Condition 5. Contrary to the large biases found in those situations, SWP consistently produced small biases for all four variables across the sample size sets in Condition 7.

5.3.5 Pattern V

No missing data occurred in this pattern, and the results were similar to those found for Pattern II, where the missingness occurred in only one variable in the first study. Most of the time, the SWP slopes showed less than 1% relative percentage bias. In Condition 7 and Condition 8, when sample sizes were equal across studies (N1 and N2), the GLS estimates had more than 10% relative bias for the slopes of X1, X2, and X3. Large biases in estimating the slopes of X2, X3, and X4 from GLS could be found in Condition 5 and Condition 6 when the sample sizes were equal, as well as in Condition 6 with sample size set N3 and in Condition 5 with sample size set N4.
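As noted at the beginning of Section 5.3, the population slopes for each mixed-effects scenario were derived from correlations pooled across studies with sample-size weights. In the notation of the appendix code (the AVE* quantities), the pooled correlation between variables i and j is

\[ \bar{r}_{ij} = \frac{\sum_k n_k\, r_{ij}^{(k)}}{\sum_k n_k}, \]

where the sum runs over the studies in which both variables are observed. This is why N1 and N2, which both have equal sample sizes across the four studies, always share the same population slope values.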
[Tables 5.26 through 5.30. Missing Percentage, Estimated Mean Slopes and Standard Errors for Each Predictor with Different Sample Sizes under the Mixed-Effects Models, Patterns I through V.]

CHAPTER 6
DISCUSSION

This chapter summarizes the major findings based on the two methods investigated in this research, and compares the two methods in a more general way. Suggestions for choosing between the two methods are provided, along with the limitations of the current study and directions for further investigation.

This research extends the factored likelihood method through the sweep operator (SWP), which was originally designed for handling missing data, to the meta-analysis context. The results from the SWP method were compared to the results from the GLS method, which is a typical procedure for synthesizing multivariate data in meta-analysis. The major difference between the two methods is that SWP utilizes the concept of maximum likelihood, while GLS is not a likelihood-based approach and focuses on weighting the correlations by their variability. Exploring the SWP method provides another point of view and a possible way to deal with the missing information that often occurs in meta-analyses.

In the current study, the correlation matrices from regression studies were combined in order to obtain synthesized standardized slopes as a summary of the regression models included in the synthesis. The two methods investigated in this study allow the information from regression studies to be combined with correlational studies, which can be considered simple regression studies. Being able to incorporate regression studies with correlational studies helps to improve the understanding of the relationships found in correlational studies, because more variables are held constant in regression studies than in correlational studies while exploring the relationship between the outcome and the predictor.

As the results presented in Chapter 5 show, each of the GLS and SWP methods has its own strengths in synthesizing regression studies with different patterns of missing data, different missing rates, and differences in what was missing (in terms of the strength of the correlations remaining in the matrix). The methods were first examined assuming fixed effects (Condition 1 through Condition 4), where all four studies included in a meta-analysis were based on the same population correlation matrix. The major finding assuming fixed effects across studies was that SWP consistently performed slightly better than GLS when estimating the slope of the variable that was present in all regression models (X1), while GLS consistently overestimated it in all five missing patterns with different sample sizes. The empirical examination using the pseudo-studies in Chapter 4 also confirmed this finding. This result makes SWP a more desirable method especially when a researcher's focus is on the relationship between the outcome and one specific predictor.
In that case, when using SWP, the bivariate relationship of interest can be adjusted appropriately by the other variables that were controlled in the regression studies.

The estimated slopes obtained from the two methods were very close to the population values, indicating that both methods produced good estimates of the slopes for the final model. There were a few tendencies for the two methods in terms of the impact of the study patterns, sample sizes, and the strength of the correlations that were missing. For example, SWP tended to perform better in estimating β2 when the sample size was small and equal across studies (N1) in all patterns; GLS tended to perform better in estimating β2 when the sample size was large and equal across studies (N2). When more missing data occurred (N4), SWP tended to produce more precise estimates of the slope for X3 in all patterns, no matter what correlation matrix the studies were based on; GLS tended to perform better on the same variable when the sample size was large and equal across studies (N2). When there were less missing data within studies (N3), SWP tended to estimate the slope for X4 better, while GLS tended to do better with more data missing (N4) on this variable.

The percentage relative biases were calculated to quantify the differences between the estimated slopes and the population slopes. In all the simulation scenarios, both methods produced bias under 5%, which makes both good estimation methods by the criterion of Hoogland and Boomsma (1998). Yet the ranges of bias from GLS were consistently larger than the ranges from SWP, which makes GLS less desirable. The largest positive bias values for estimating the slope for X1 produced by the two methods both arose under Pattern III with the sample size set N1 and correlation matrix R3. This indicates that when the sample size was small and equal across studies, and when important variables were missing more often than less important variables, both methods did not do as well as in other scenarios at estimating the X1 slope. On the other hand, SWP did very well (zero bias) in estimating β1 in this pattern with the sample size set N3 and the correlation matrix R2. This implies that when a meta-analysis includes mostly correlational studies but only a few regression studies with big sample sizes, SWP can estimate the slope of the most fully observed variable very well, especially when the predictors that are missing from the correlational studies are less strongly related to each other and to the outcome, and when there are no intercorrelations among the predictors. Another summary and comparison of the results from the two methods can be found in Appendix E.

In the factorial ANOVAs, relatively little of the variance of the differences between the slopes estimated by the two methods was explained by the patterns, the correlation matrices, and the sample size sets. In all cases, less than 10 percent of the variability was explained by all the factors. For X1, which is of the most interest, the sample sizes seemed to be the most important factor for explaining the variation of the differences, and the patterns seemed to be the least important factor. The implication of this finding is that when a researcher is deciding which method to use, the most important thing to keep in mind is the sample sizes of the primary studies included in the synthesis. Ordinal interactions existed in the current analyses for X1.
The interaction plots showed that, when separated by pattern, GLS and SWP produced more divergent estimates of the X1 slope in sample size set N1 than in the other sample size sets. Combining this result with the previous finding that SWP consistently produced closer slopes for X1, the SWP method is especially preferred when the sample sizes are small and equal across studies, no matter what the correlation matrix is.

The two methods were also examined under mixed-effects models (Condition 5 through Condition 8). By assuming mixed effects, the relationships among the variables in the four studies in the synthesis were not all based on the same population correlation matrix. Since methods for meta-analyzing multivariate data under this model have not been well developed (due to the difficulty of estimating between-study variances appropriately with multivariate data), the population slopes calculated for each scenario were based on the weighted mean correlations computed from the existing correlations in the current research. As a consequence, the estimates from SWP showed lower relative percentage bias most of the time, because SWP depends more heavily than the GLS procedure on the weighted mean correlations at the beginning of the calculations.

When comparing the results for the fully observed predictor X1 under the mixed-effects model to the results for the same scenarios under a fixed-effects model, both methods showed the largest negative differences (mixed-effects results minus fixed-effects results) in Condition 8 with sample sizes equal to N1, and the largest positive differences in Condition 7 with the sample sizes defined by N3, in Pattern III. This finding indicates that when the important variables (e.g., X4 under the correlation matrices R3 and R4 in this research) were more likely to be missing (e.g., X4 is missing in study 1 through study 3 in Pattern III), the estimate of the slope of the fully observed variable can be very different (using either method) under fixed- and mixed-effects conditions. More investigation of producing appropriate estimates under non-fixed-effects models will be needed.

As all research has limitations, this study is constrained in several ways. First, using factored likelihood estimation through the sweep operator requires the predictors included in the models in the synthesis to be arranged in a monotone pattern somewhat like those shown in Figure 3.2. Those patterns make it possible to obtain the maximum likelihood estimates of the correlations without an iterative process, which makes SWP an easy method to use. The desired pattern sometimes can be achieved by rearranging the order of the predictors in the models, or some variables may have to be excluded from models in order to obtain the desired pattern. The GLS method, on the contrary, is more flexible in this matter, and can be used with any correlations that are available in the studies. Other methods for handling missing data that might be useful for synthesizing regression studies in the meta-analysis context, such as multiple imputation, might be worthwhile to investigate, since combining regression studies is somewhat similar to dealing with missing information from primary studies.

Second, the correlations used in the synthesis from the primary studies were assumed to be perfectly measured in the current study. That is, the errors from the instruments used to measure the variables in the regression models were not taken into account.
I made this simplification because I wanted to focus on eliminating the impact of the nonparallel situations that occur when regression models do not contain all the same predictors. Further research should investigate possible solutions, such as utilizing the concepts of structural equation modeling, to incorporate measurement errors in meta-analysis.

Third, the correlations among the variables in the regression studies are required in order to use the methods investigated in this research. Unfortunately, it is very likely that the information about the zero-order correlations may not be reported, or may be only partially reported. In this matter, Bayesian perspectives might provide a possible direction for obtaining the correlations needed for synthesizing regression studies based on other information, such as the slopes reported in the regression studies. A possible method is to use the Gibbs sampler (Casella & George, 1992; Gelfand & Smith, 1990), which is based on elementary properties of Markov chains, to generate possible correlations based on the observed distribution of the slopes of regression models.

APPENDIX A: SAS Macro for GLS

Below is an example SAS macro for generating the data under Pattern I for the four sample size sets and four correlation matrices, and for calculating the standardized slopes and standard errors using GLS.

/************************************************************************
 N   = sample size set for the four studies (N1 through N4);
 R   = correlation matrix (R1 through R5);
 nk  = sample size for study k in a synthesis, k = 1 to 4;
 rij = correlation between variables i and j, i, j = y, 1, 2, 3, or 4;
************************************************************************/
Libname GLSp1 'C:\';

%Macro GLSp1(N,R,n1,n2,n3,n4,ry1,ry2,ry3,ry4,r12,r13,r14,r23,r24,r34);
Title GLS PATTERN1 &N &R;
Proc IML;
nseed=125; nrep=1000;
Pat1=j(nrep,8,0);
do sim=1 to nrep;

/* Study 1: y and X1 */
s1=j(&n1,2,0);
do i=1 to &n1;
  s1[i,1]=rannor(nseed); s1[i,2]=rannor(nseed);
end;
s1r={1 &ry1, &ry1 1};
co1=root(s1r);
z1=s1*co1;
r1=corr(z1);

/* Study 2: y, X1, X2 */
s2=j(&n2,3,0);
do i=1 to &n2;
  s2[i,1]=rannor(nseed); s2[i,2]=rannor(nseed); s2[i,3]=rannor(nseed);
end;
s2r={1 &ry1 &ry2, &ry1 1 &r12, &ry2 &r12 1};
co2=root(s2r);
z2=s2*co2;
r2=corr(z2);

/* Study 3: y, X1, X2, X3 */
s3=j(&n3,4,0);
do i=1 to &n3;
  s3[i,1]=rannor(nseed); s3[i,2]=rannor(nseed);
  s3[i,3]=rannor(nseed); s3[i,4]=rannor(nseed);
end;
s3r={1 &ry1 &ry2 &ry3,
     &ry1 1 &r12 &r13,
     &ry2 &r12 1 &r23,
     &ry3 &r13 &r23 1};
co3=root(s3r);
z3=s3*co3;
r3=corr(z3);

/* Study 4: y, X1, X2, X3, X4 */
s4=j(&n4,5,0);
do i=1 to &n4;
  s4[i,1]=rannor(nseed); s4[i,2]=rannor(nseed); s4[i,3]=rannor(nseed);
  s4[i,4]=rannor(nseed); s4[i,5]=rannor(nseed);
end;
s4r={1 &ry1 &ry2 &ry3 &ry4,
     &ry1 1 &r12 &r13 &r14,
     &ry2 &r12 1 &r23 &r24,
     &ry3 &r13 &r23 1 &r34,
     &ry4 &r14 &r24 &r34 1};
co4=root(s4r);
z4=s4*co4;
r4=corr(z4);

r1_y1=r1[1,2];
r2_y1=r2[1,2]; r2_y2=r2[1,3]; r2_12=r2[2,3];
r3_y1=r3[1,2]; r3_y2=r3[1,3]; r3_y3=r3[1,4];
r3_12=r3[2,3]; r3_13=r3[2,4]; r3_23=r3[3,4];
r4_y1=r4[1,2]; r4_y2=r4[1,3]; r4_y3=r4[1,4]; r4_y4=r4[1,5];
r4_12=r4[2,3]; r4_13=r4[2,4]; r4_14=r4[2,5];
r4_23=r4[3,4]; r4_24=r4[3,5]; r4_34=r4[4,5];

/* Covariance matrix of the correlations from study 1 */
p1=ncol(r1);
dim1=p1*(p1-1)/2;
cov1=j(dim1,dim1,0);
mat=j(dim1,2,0);
k=1;
do i=2 to p1;
  do j=1 to (i-1);
    cov1[k,k]=(1-r1[i,j]##2)##2/&n1;
    mat[k,1]=i; mat[k,2]=j;
    k=k+1;
  end;
end;
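/* The diagonal entries just filled in are the large-sample variances of
   single correlations, Var(r_ij) = (1 - r_ij##2)##2 / n. The loops that
   follow appear to fill the off-diagonal entries with the standard
   large-sample covariance between two correlations from the same sample
   (the Olkin-Siotani expression), evaluated at the observed r values. */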
,s]*rl[t,u]*rl[t,v]+rl[u,s]*rl[u,t]*rl[u,v]+r1[v,s]*rl[v,t]*rl[V,U]))/& n1; end; end; do i=2 to diml; do j=1 to (i-l); covl[j,i]=covl[i,j]; end; end; p2=ncol(r2); dim2=p2*(p2-1)/2; cov2=j(dim2,dim2,0); mat=j(dim2,2,0); k=1; do i=2 to p2; do j=1 to (i—l); cov2[k,k]=(1-r2[i,j]##2)##2/&n2; mat[k,1]=i; matlk,2]=j; k=k+1; end; end; do i=2 to dim2; do j=1 to (i—l); s=mat[i,1]; t=mat[i,2]; u=mat[j.1]; v=mat[j.2]; cov2[i,j]=(0.5*r2[s,t]*r2[u,v]*(r2[s,u]##2+r2[s,v]##2+r2[t,u]##2+r2 [t,v]##2)+r2[s,u]*r2[t,v]+r2[s,v]*r2[t,u]-(r2[s,t]*r2[s,u]*r2[s,v]+r2[t ,s]*r2[t,u]*r2[t,v]+r2[u,s]*r2[u,t]*r2[u,v]+r2[v,s]*r2[v,t]*r2[v,u]))/& n2; end; end; do i=2 to dim2; do j=1 to (i-l); cov2[j,i]=cov2[i,j]; end; end; p3=ncol(r3); dim3=p3*(p3-1)/2; cov3=j(dim3,dim3,0); mat=j(dim3,2,0); k=1; do i=2 to p3; do j=1 to (i—l); cov3[k,k]=(1-r3[i,j]##2)##2/&n3; mat[k,1]=i; matlk,2]=j; k=k+1; end; end; do i=2 to dim3; 134 do j=1 to (i-l); s=mat[i,1]; t=mat[i,2]; u=mat[j,1]; v=mat[j,2]; cov3[i,j]=(0.5*r3[s,t]*r3[u,v]*(r3[s,u]##2+r3[s,v]##2+r3[t,u]##2+r3 [t,v]##2)+r3[s,u]*r3[t,v]+r3[s,v]*r3[t,u]-(r3[s,t]*r3[s,u]*r3[s,v]+r3[t ,s]*r3[t,u]*r3[t,v]+r3[u,s]*r3[u,t]*r3[u,v]+r3[v,s]*r3[v,t]*r3[V,U]))/& n3; end; end; do i=2 to dim3; do j=1 to (i-l); cov3[j,i]=cov3[i,j]; end; end; p4=ncol(r4); dim4=p4*(p4-1)/2; cov4=j(dim4,dim4,0); mat=j(dim4,2,0); k=1; do i=2 to p4; do j=1 to (i—l); cov4[k,k]=(1—r4[i,j]##2)##2/&n4; mat[k,1]=i; mattk,2]=j; k=k+1; end; end; do i=2 to dim4; do j=1 to (i—l); s=mat[i,1]; t=mat[i,2]; u=mat[j,1] v=mat[j,2]; I cov4[i,j]=(0.5*r4[s,t]*r4[u,v]*(r4[s,u]##2+r4[s,v]##2+r4[t,u]##2+r4 [t,v]##2)+r4[s,u]*r4[t,v]+r4[s,v]*r4[t,u]-(r4[s,t]*r4[s,u]*r4[s,v]+r4[t ,s]*r4[t,u]*r4[t,v]+r4[u,s]*r4[u,t]*r4[u,v]+r4[v,s]*r4[v,t]*r4[v,U]))/& n4; end; end; do i=2 to dim4; do j=1 to (i—l); cov4[j,i]=cov4[i,j]; end; end; p=dim1+dim2+dim3+dim4; bigmtx=j (p; pr 0); bigmtx[1:diml,1:diml]=covl; bigmtx[diml+1:dim1+dim2,diml+1zdiml+dim2]=cov2; bigmtx[diml+dim2+1:dim1+dim2+dim3,diml+dim2+1:diml+dim2+dim3]=cov3; bigmtx[dim1+dim2+dim3+1:dim1+dim2+dim3+dim4,diml+dim2+dim3+1:diml+dim2+ dim3+dim4]=cov4; 135 rvecl=r1_yl; Start rvec2; k=1; rvec2=j(dim2,1,0); do i=2 to p2; do j=1 to (i-l); rvec2[k]=r2[i,j]; k=k+l; end; end; finish; run rvecZ; Start rvec3; k=1; rvec3=j(dim3,1,0); do i=2 to p3; do j=1 to (i-l); rvec3[k]=r3[i,j]; k=k+1; end; end; finish; run rvec3; Start rvec4; k=1; rvec4=j(dim4,1,0); do i=2 to p4; do j=1 to (i-l); rvec4[k]=r4[i,j]; k=k+1; end; end; finish; run rvec4; outcome=rvecl//rvec2//rvec3//rvec4; w=j(p.10,0); w[1,1]=1; w[2,1]=1; w[3,2]=1; w[4,3]=1; w[5,1]=1; w[6,2]=1; w[7,3]=1; w[8,4]=1; w[9,5]=1; w[10,6]=1; w[11,1]=1; w[12,2]=1; w[13,3]=1; w[14,4]=1; w[15,5]=1; 136 w)*inv(bigmtx)*w)*t(w)*inv(bigmtx)*outcome; Start backmtx; syncor=j(5,5,1); k=1; do i=2 to 5; do j=1 to (i-l); syncor[i,j]=rho[k]; syncor[j,i]=rho[k]; k=k+l; end; end; finish; run backmtx; Rll=syncor[2:5,2:5]; R12=syncor[2:5,1]; SLOPE=inV(Rll)*R12; Patl[sim,1]=&n1; Pat1[sim,2]=&n2; Patl[sim,3]=&n3; Pat1[sim,4]=&n4; Pat1[sim,5]=SLOPE[1,l]; Patl[sim,6]=SLOPE[2,1]; Patl[sim,7]=SLOPE[3,1]; Pat1[sim,8]=SLOPE[4,1]; end; Create GLSp1.GLSPAT1&N&R from Patl [colname={n1 n2 n3 n4 x1 x2 x3 x4 } ]; Append from Patl; run; quit; %Mend GLSpl; * 1* éGiSp;(Nl,Rl,150,150,150,150,0.6,0.4,0.3,0.25,0.25,0.1,0.05,0.15,0.1,0. :Z;SPI(N2,R1,2000,2000,2000,2000,0.6,0.4,0.3,0.25,0.25,0.1,0.05,0.15,0. 
APPENDIX B: SAS Macro for SWP

Below is an example of the SAS macro that generates the data under Pattern I for four sample-size sets and four correlation matrices, and calculates the standardized slopes and standard errors using SWP. Among the five patterns studied in this research, Pattern I requires the most complicated code because of the number of steps the calculation must carry out.

/************************************************************************
 N   = sample-size set for the four studies (N1 through N4);
 R   = correlation matrix (R1 through R5);
 nk  = sample size for study k in a synthesis, k = 1 to 4;
 rij = correlation between variables i and j, i,j = y, 1, 2, 3, or 4;
************************************************************************/

Libname SWPp1 'C:\';

%Macro pattern1(N,R,n1,n2,n3,n4,ry1,ry2,ry3,ry4,r12,r13,r14,r23,r24,r34);
Title PATTERN1 &N &R;
Proc IML;
nseed=125; nrep=1000;
Pat1=j(nrep,8,0);
do sim=1 to nrep;

  /* Generate the four studies exactly as in Appendix A */
  s1=j(&n1,2,0);
  do i=1 to &n1;
    s1[i,1]=rannor(nseed); s1[i,2]=rannor(nseed);
  end;
  s1r={1 &ry1, &ry1 1};
  co1=root(s1r);
  z1=s1*co1;
  r1=corr(z1);
  /*print r1;*/

  s2=j(&n2,3,0);
  do i=1 to &n2;
    s2[i,1]=rannor(nseed); s2[i,2]=rannor(nseed); s2[i,3]=rannor(nseed);
  end;
  s2r={1 &ry1 &ry2, &ry1 1 &r12, &ry2 &r12 1};
  co2=root(s2r);
  z2=s2*co2;
  r2=corr(z2);

  s3=j(&n3,4,0);
  do i=1 to &n3;
    s3[i,1]=rannor(nseed); s3[i,2]=rannor(nseed);
    s3[i,3]=rannor(nseed); s3[i,4]=rannor(nseed);
  end;
  s3r={1 &ry1 &ry2 &ry3,
       &ry1 1 &r12 &r13,
       &ry2 &r12 1 &r23,
       &ry3 &r13 &r23 1};
  co3=root(s3r);
  z3=s3*co3;
  r3=corr(z3);

  s4=j(&n4,5,0);
  do i=1 to &n4;
    s4[i,1]=rannor(nseed); s4[i,2]=rannor(nseed); s4[i,3]=rannor(nseed);
    s4[i,4]=rannor(nseed); s4[i,5]=rannor(nseed);
  end;
  s4r={1 &ry1 &ry2 &ry3 &ry4,
       &ry1 1 &r12 &r13 &r14,
       &ry2 &r12 1 &r23 &r24,
       &ry3 &r13 &r23 1 &r34,
       &ry4 &r14 &r24 &r34 1};
  co4=root(s4r);
  z4=s4*co4;
  r4=corr(z4);

  /* Extract the observed correlations from each study */
  r1_y1=r1[1,2];
  r2_y1=r2[1,2]; r2_y2=r2[1,3]; r2_12=r2[2,3];
  r3_y1=r3[1,2]; r3_y2=r3[1,3]; r3_y3=r3[1,4];
  r3_12=r3[2,3]; r3_13=r3[2,4]; r3_23=r3[3,4];
  r4_y1=r4[1,2]; r4_y2=r4[1,3]; r4_y3=r4[1,4]; r4_y4=r4[1,5];
  r4_12=r4[2,3]; r4_13=r4[2,4]; r4_14=r4[2,5];
  r4_23=r4[3,4]; r4_24=r4[3,5]; r4_34=r4[4,5];

  /*********************************************************************/
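  /*-------------------------------------------------------------------
   A sketch of the reasoning behind the synthesis steps below, added
   as a reading aid (the step labels are not part of the original
   macro):
   1. r_y1 is observed in every study, so it is pooled across all four
      studies, weighted by sample size, and stored in the 2x2 matrix O.
   2. The correlations involving X2 are observed only in studies 2-4,
      so the conditional regression of X2 on (y, X1) -- slope234, with
      residual variance var234 -- is estimated from those studies
      alone.
   3. O is swept on y and then on X1, and the swept matrix is
      augmented with the conditional slopes, following the
      factored-likelihood logic of combining marginal and conditional
      estimates.
  -------------------------------------------------------------------*/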
  /* Pool r_y1 across all four studies, weighted by sample size */
  AVEy1=(r1_y1*&n1+r2_y1*&n2+r3_y1*&n3+r4_y1*&n4)/(&n1+&n2+&n3+&n4);
  O=j(2,2,1);
  O[1,2]=AVEy1;
  O[2,1]=AVEy1;

  /* Average the (y, X1, X2) correlations over studies 2-4, the
     studies in which X2 is observed */
  AVEy1=(r2_y1*&n2+r3_y1*&n3+r4_y1*&n4)/(&n2+&n3+&n4);
  AVEy2=(r2_y2*&n2+r3_y2*&n3+r4_y2*&n4)/(&n2+&n3+&n4);
  AVE12=(r2_12*&n2+r3_12*&n3+r4_12*&n4)/(&n2+&n3+&n4);
  S234=j(3,3,1);
  S234[1,2]=AVEy1; S234[2,1]=S234[1,2];
  S234[1,3]=AVEy2; S234[3,1]=S234[1,3];
  S234[2,3]=AVE12; S234[3,2]=S234[2,3];

  /* Conditional regression of X2 on y and X1 from studies 2-4 */
  R11=S234[1:2,1:2];
  R12=S234[1:2,3];
  slope234=inv(R11)*R12;
  var234=1-t(slope234)*R12;

  /* Sweep O on y (giving A_y), then on X1 (giving A) */
  A_y=j(2,2,1);
  A_y[1,1]=-1/O[1,1];
  A_y[1,2]=O[1,2]/O[1,1];
  A_y[2,2]=O[2,2]-O[1,2]*O[1,2]/O[1,1];
  A_y[2,1]=A_y[1,2];
  A=j(2,2,1);
  A[2,2]=-1/A_y[2,2];
  A[1,2]=A_y[1,2]/A_y[2,2];
  A[1,1]=A_y[1,1]-A_y[1,2]*A_y[1,2]/A_y[2,2];
  A[2,1]=A[1,2];

  /* Augment the swept matrix with the conditional slopes and the
     residual variance for X2 */
  B=j(3,3,1);
  B[1:2,1:2]=A;
  B[1:2,3]=slope234[1:2,1];
  B[3,1:2]=T(slope234[1:2,1]);
  B[3,3]=var234[1];