. a
33:2

u.'w' .1
4-.

........
..
'd.
«m-
m.w,~
..-.

' 1:5.
NW. ~ m

 

. '
it" it”
ggﬁ: 3

Fig“ ‘53
- g s H -
2r?!“
igas’r- a
’ 3

E3 531.3,
«a s-

€233.
“331-3311”.
4..., "w. m
‘M'Jk$“
. ‘31.
b" a:

m

 

2- LIBRARY
2 008 Michigan State
University

This is to certify that the
dissertation entitled

APPLICATION OF MODEL-DRIVEN META-ANALYSIS
AND LATENT VARIABLE FRAMEWORK IN SYNTHESIZING
STUDIES USING DIVERSE MEASURES

presented by

Soyeon Ahn

has been accepted towards fulfillment
of the requirements for the

Ph. D degree in Department of Counseling,
Educational Psychology and
Special Education

 

 

WWW/law

Major @ofessor’ 3 Signature

ngIU/l’kz/ $008

Date

MSU is an afﬁrmative-action, equal-opportunity employer

PLACE IN RETURN BOX to remove this checkout from your record.
TO AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

 

DATE DUE

DATE DUE DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

5/08 KilProlechreleIRClDaIeDue indd

APPLICATION OF MODEL-DRIVEN META-ANALYSIS
AND LATENT VARIABLE FRAMEWORK IN SYNTHESIZING STUDIES
USING DIVERSE MEASURES
By

Soyeon Ahn

A DISSERTATION

Submitted to
Michigan State University
In partial fulﬁllment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology and Special Education

2008

ABSTRACT
APPLICATION OF MODEL-DRIVEN META-ANALYSIS
AND LATENT VARIABLE FRAMEWORK IN SYNTHESIZING STUDIES
USING DIVERSE MEASURES
By
Soyeon Ahn

In Spite of a growing interest in meta-analysis, the application of existing
methodology faces numerous difﬁculties and limitations. In particular, the use of diverse
measures in primary studies introduces two methodological concerns in the application of
meta-analytic techniques. First, individual study effects can vary signiﬁcantly depending
on differences in measures employed. Second, the existing methodologies are limited in
dealing with very sparse data structures, where effect size has its unique measurement
characteristics.

In support of resolving these concerns, the current research proposes a method for
handling a very sparse data structure of effect Sizes that arises from variations in
measures used in primary studies. The proposed model is based on model-driven meta-
analysis, structural equation modeling with latent variables, and method-of-moments
estimation technique. This study presents the model speciﬁcation in which the true
population relationship between two latent variables is estimated. A method to extract
unknowns in estimating the relationship between two underlying constructs (Equation 3.
13) is discussed.

First, several Monte Carlo Simulations are performed in order to examine the

performance of the proposed estimator under different conditions. Results from

simulations indicate that the proposed approach correctly estimates the desired population

parameter. MANOVA results Show that the factor loadings and reliabilities of indicators
have the largest effect on the bias and MSE values of the estimators.

Second, the application of the proposed approach is demonstrated by re-analyzing
a sub-set of studies reviewed by Ahn and Choi (2004). The estimated strength of the
relationship between teachers’ subject matter knowledge and student achievement
included in Ahn and Choi using the proposed method was smaller than the weighted
mean correlation corrected for artifacts proposed by Hunter and Schmidt (1990, 1994)
and the z-transformed variance-weighted mean correlation proposed by Shadish and
Haddock (1994), but leads to the same inference.

Lastly, four practical considerations of the proposed approach were discussed,
followed by a list of potential future research to resolve those limitations. In this section,
I demonstrate how well the proposed approach estimates the strength of the relationship

between two underlying constructs when it is based on a misspeciﬁed population model.

Copyright by
SOYEON AHN
2008

ACKNOWLEDGEMENTS

Though only my name appears on the cover of this dissertation, a number of great
people have contributed to its production. I owe my gratitude to all who have made this
dissertation possible and because of whom my graduate training experience has been one
that I will never forget.

My deepest gratitude goes to my two Spiritual mentors, Drs. Betsy J. Becker and
Mary M. Kennedy. I have been exceptionally fortunate to have these two great mentors
during my six-year graduate training at MSU. Their mentorship was paramount in
providing a well—rounded experience consistent with my long-term career goals. There
are only a few graduate students who are given the opportunity to develop their own
individuality and self-sufﬁciency by being allowed to work with such independence. For
everything you’ve done for me, Dr. Betsy J. Becker and Dr. Mary M. Kennedy, I thank
both of you and I will pay back to my students. I hope that one day I would become as
good an advisor to my students as the two have been to me.

I think a decision to have Dr. Becker as my academic advisor was the best choice
I have ever made in my entire life. She gave me the freedom to explore on my own and at
the same time the guidance to recover when my steps faltered. Her patience and support
helped me overcome many crisis Situations and ﬁnish this dissertation. She always read
my terrible drafts from one day to another; she edited my grammar and APA disasters
without a complaint. But, I am not yet perfect for APA format. I would also like to
convey many special thanks to Dr. King Beach, who welcomed me whenever I came to
Florida for a visit, and provided me very warm and practical advice when I was in

difﬁcult situations (in particular, one year ago).

Dr. Mary M. Kennedy! We, all TQ-QT girls (you should remember all three girls
in TQ-QT), love your sense of humor (i.e., “piece of cake”), your thoughtful insights,
practical and generous advice, never-ending support toward us, and values on research
and teaching. Dr. Kennedy, everything is “a piece of cake”, isn’t it?

My co-chair, Dr. Kimberly S. Maier, has been always there to encourage me and
give me practical advice to go through such a long and lonely journey. Dr. Maier had
always been overwhelmingly generous with graduate students———but it was only after she
became my advisor that I realized how committed she was to training, supporting and
encouraging her students. I will never forget her warm and cheering note whenever I felt
miserable.

I would like to thank Dr. Richard T. Houang for his insights and enormous
assistance to solve the technical part of this dissertation. Dr. Houang was always open to
my call for meetings and welcomed me with his great smile. When I was deadly
frustrated, he was the one who shed light on problems and shared his knowledge and
insights. Without his help, I doubt I would be able to complete the dissertation and
graduate in time.

My thanks also go to Dr. Frederick L. Oswald, the other member of my
dissertation committee. Dr. Oswald went well beyond his duty in reading and copiously
commenting on my research.

I would like to acknowledge Dr. Ralph Putnam’s help on developing the expert
judgment assessment. He nicely Shared his experience and knowledge to develop the
assessment tool for expert judgments. In addition, I want to acknowledge my appreciation

to ﬁve graduate students at Michigan State University.

vi

I am also grateful to the following former or current TQ-QT staff members.
Among them, special thanks go to Dr. Meng—jia Wu (at Loyola University, Chicago), Dr.
J inyoung Choi (at Ewha Womans University, Korea), and Rae-Seon (Sunny) Kim (at
Florida State University). I am also thankful to Steve J. Pierce in Community Psychology
at Michigan State University for his encouragement and practical advice on research.

Many friends have helped me stay sane through these difﬁcult years. Their
support and care helped me overcome setbacks and stay focused on my graduate study. I
greatly value their ﬁiendship and deeply appreciate their conﬁdence in me. I am also
grateful to the Korean Buddhism Organization with which I shared numerous happy
moments.

I also want to acknowledge Dr. Andrew C. Shin, who has shared ﬁve years of my
last twenty. I owe him a lot and I will not forget everything he has done for me during the
ﬁve years. When I was up till 3 am. in the main library, he was always there by my Side.

Last, but not the least, I would like to convey my sincere appreciation to my
parents and my younger brother in Korea. In particular, I want to say a very special
thanks to my dad, Young Gil Ahn, who truly believes in me and supports my success
during the entire 30 years of my life. I strongly believe that most of who I am has been
built on my dad’s very extraordinary support. Also, I really believe that my mom’s
sincere prayer to Buddha at 5 am. every morning truly worked for my success. Thank

you very much, my lovely mom Okhee Shim!

vii

PREFACE

Nearly Six years of research experience in the Teacher Qualiﬁcations and Quality
of Teaching (TQ-QT) project1 under the direction of principal investigators Drs. Betsy J.
Becker and Mary M. Kennedy at Michigan State University provided me a solid
theoretical and practical background for completion of this dissertation. Approximately
500 studies that examine the relationship between teacher qualiﬁcations and quality of
teaching vary tremendously and introduce several interesting methodological questions in
research synthesis.

This dissertation focuses on how to combine studies when the original studies use
diverse measures with different measurement characteristics such as reliability and
validity, even though researchers intend these to represent the same underlying constructs.
In this research, I have tried to develop an approach whereby we can combine the very
sparse data structure that arises from large variations across studies in measures. The
proposed method is based on the assumption that all measures are attempting to represent
the same underlying construct even though their measurement characteristics are quite
different.

The proposed approach is developed based on three existing ideas in statistics and
measurement — model-driven meta-analysis, structural equation modeling (SEM) with
latent variables, and a method-of-moments estimation technique. Even though the
proposed method is built on a simple one-factor model, it is possible to expand this model
to solve more complicated issues in meta-analysis. As presented in the section on

practical considerations, more attention should be paid to developing a method that can

 

I For more detailed information, please see the website http://www.msu.edu/'user/'rnkennedy/TOOT/

viii

handle missing data in research synthesis. In addition, the robustness of the proposed
model should be examined before applying the proposed model in practice.

It is customary to list a long series of acknowledgements somewhere in the
preface of a dissertation. I have gained enormous personal and scientiﬁc beneﬁts during
my time Spent on the TQ-QT project at MSU, both from the people with whom I have
worked and the environment that they have created. I am only going to personally thank
four people, my mentors Drs. Betsy J. Becker and Mary M. Kennedy (we often call them

“Spiritual Mentors (SM)”), to whom I owe so much that it would be pointless to try to

encapsulate it, Dr. Meng-Jia Wu (at Loyola University at Chicago), and Rae-Seon
(Sunny) Kim (at Florida State University), who have played multiple roles as colleague,
friend, and big sister. Their academic and emotional support helped me go through a long

and sometimes lonely journey toward the completion of this dissertation.

ix

TABLE OF CONTENTS

LIST OF TABLES ........................................................................... xii
LIST OF FIGURES .......................................................................... xiv
CHAPTER 1 INTRODUCTION .......................................................... l
1.1. Challenges of Research Synthesis in Education and Social Science ............ l
1.2. Empirical Example ..................................................................... 5
1.3. Purpose of Research .................................................................... 5
CHAPTER 2 LITERATURE REVIEW .................................................. 7
2.1. Meta-analytic Methods For Synthesizing Studies using Various Indicators 8
2.1.1. Univariate Method .......................................................... 8

2.1.2. Artifact Corrections ......................................................... 9

2.1.3. Multivariate Method ........................................................ 12

2.2. Model-driven Meta-analysis .......................................................... 14
2.3. Structural Equation Modeling with Latent Variables .............................. 17
2.3.1. Structural Equation Modeling in Meta-analysis .......................... 17

2.3.2. Latent Variable Framework in Meta-analysis ............................ 18

2.4. Method of Moments Estimation Technique ......................................... 20
CHAPTER 3 METHODOLOGIES ....................................................... 22
3.1. Model Speciﬁcation ................................................................... 22
3.2. Structural Equation Modeling with Latent Variables ............................. 23
3.3. Estimation .............................................................................. 25
3.4. Information for Estimating p5.” ..................................................... 27
3.4.1. Population Correlation Coefﬁcients pg. Between xs and ys ........... 28

3.4.2. Factor Loadings (Validity Coefﬁcients) .................................. 30

3.5. Extracting Unknowns in the Model ................................................ 33
3.5.1. Use of Reliability Information .............................................. 31

3.5.2. Use of Expert Judgments ................................................... 32

CHAPTER 4 SIMULATION ............................................................. 37
4.1. Data Generation ....................................................................... 37
4.1.1. Choice of Parameters ....................................................... 38
4.1.2. Replications .................................................................. 40

4.2. Data Evaluation ........................................................................ 41
4.3. Simulation Results ..................................................................... 42
4.3.1. Estimators ..................................................................... 42

4.3.2. Bias and MSE of Estimators ................................................ 42

4.3.3. Factors Affecting Estimators of the Strength of Relationship Between

Two Constructs .............................................................. 45

4.4. Conclusions ............................................................................ 51
CHAPTER 5 APPLICATION ............................................................ 53
5.1. Study Description ...................................................................... 54
5.2. Method .................................................................................. 56
5.3. Expert Judgments ...................................................................... 58
5.4. Results ................................................................................... 60
CHAPTER 6 PRACTICAL CONSIDERATIONS .................................... 62
CHAPTER 7 DISCUSSION .............................................................. 66
APPENDIX A ................................................................................ 69
APPENDIX B ................................................................................ 73
APPENDIX C ................................................................................ 84
BIBLIOGRAPHY ........................................................................... 141

xi

Table 2.1

Table 2.2

Table 4.1

Table 4.2

Table 4.3

Table 4.4

Table 4.5

Table 4.6

Table 4.7

Table 4.8

Table 4.9

Table 4.10

Table 4.11

Table 4.12

Table 4.13

Table 4.14

LIST OF TABLES
Attenuation Artifacts and the Corresponding Multiplier ................
Comparisons of Correction Formulas ......................................
Bias and MSE of Estimators ................................................

Bias and MSE of Sample-size Weighted and Z-transfomred Variance
Weighted Estimators Under Different Conditions (7 = 0) .............

Bias and MSE of Sample-size Weighted and Z-transformed Variance
Weighted Estimators Under Different Conditions (7 = .5) ............

Mean bias of r from Field (2001) ..........................................

Results from Multivariate Analysis of Variance (MANOVA) on the
Bias of Estimators for 7 = 0 ...............................................

Tests of Between-F actor Effects for Bias of Overall Effect-Size
Estimators (7 = O) ..........................................................

Results from Multivariate Analysis of Variance (MANOVA) on the
MSES of Estimators for 7 = 0 .............................................

Tests of Between-F actor Effects for MSES of Overall Effect-size
Estimators (7 = 0) ..........................................................

Results from Multivariate Analysis of Variance (MANOVA) on the
Bias of Estimators for 7 = .5 .............................................

Tests of Between-Factor Effects for Bias of Overall Effect-size
Estimators (7 = .5) ........................................................

Results from Multivariate Analysis of Variance (MANOVA) on the
MSES of Estimators for 7 = .5 ...........................................

Tests of Between-F actor Effects for MSES of Overall Effect-size
Estimators (7 = .5) .........................................................

Correlation Matrix of Six Indicators .....................................

ANOVAS Comparing Bias of ESI Across Which r is Included

xii

85
86

87

88

91

94

95

96

97

98

99

100

101

102

103

104

Table 4.15

Table 5.1

Table 5.2

Pairwise Comparisons Comparing Bias of ESl Depending On Which

r is Included ................................................................... 105
Measures used to Represent Teachers’ and Students’ Knowledge in 8

Studies .......................................................................... 106
Description of 8 Studies ....................................................................... 107

xiii

Figure 1.1

Figure 2.1

Figure 3.1
Figure 3.2

Figure 3.3

Figure 4.1
Figure 4.2

Figure 4.3
Figure 4.4
Figure 4.5
Figure 4.6
Figure 4.7
Figure 4.8

Figure 4.9

Figure 4.10

Figure 4.11

LIST OF FIGURES
An empirical example from Ahn and Choi (2004) .....................

An underlying model used in a meta-analysis by Whiteside and
Becker (2000) ...............................................................

A hypothetical meta-analysis with k studies ............................
A population model for a hypothetical meta-analysis .................

Covariance structure model for ranking data with p = 4
alternatives ..................................................................

Histograms of estimators when 7 is set to 0 ............................
Histograms of estimators when 7 is set to .5 ...........................

Biases of two estimators depending on the true population
relationship between two underlying constructs (7) ..................

MSES of two estimators depending on the true population
relationship between two underlying constructs (7) ..................

Biases of two estimators depending on the reliabilities of indicators
when 7 is set to O ..........................................................

Biases of two estimators depending on the reliabilities of indicators
when 7 is set to .5 .........................................................

MSES of two estimators depending on the reliabilities of indicators
when 7 is set to 0 ..........................................................

MSES of two estimators depending on the reliabilities of indicators
when 7 is set to .5 .........................................................

Biases of two estimators depending on the factor loadings of
indicators when 7 is set to 0 .............................................

Biases of two estimators depending on the factor loadings of
indicators when 7 is set to .5 ............................................

MSES of two estimators depending on the factor loadings of
indicators when 7 is set to 0 .............................................

xiv

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

Figure 4.12 MSES of two estimators depending on the factor loadings of

indicators when 7 is set to 0 ............................................. 126
Figure 4.13 Biases oftwo estimators depending on k when 7 is set to O 127
Figure 4.14 Biases of two estimators depending on k when 7 is set to .5 ....... 128
Figure 4.15 MSES oftwo estimators depending on k when 7 is set to 0 129
Figure 4.16 MSES of two estimators depending on k when 7 is set to .5 ....... 130

Figure 4.17 Biases of two estimators depending on the number of missing rs
when 7 is set to 0 ......................................................... 131

Figure 4.18 Biases of two estimators depending on the number of missing rs
when 7 is set to .5 ........................................................ 132

Figure 4.19 MSES of two estimators depending on the number of missing rs
when 7 is set to 0 ........................................................ 133

Figure 4.20 MSES of two estimators depending on the number of missing rs
when 7 is set to .5 ........................................................ 134

Figure 4.21 Biases of ESl depending on which correlation is included with 7

of .5 ......................................................................... 135
Figure 5.1 A model for meta-analysis investigating teachers’ subject matter

knowledge (SMK) and student learning in mathematics ............ 135
Figure 6.1 Biases of two estimators depending on speciﬁc variances of

indicators when 7 is set to O ............................................ 137
Figure 6.2 Biases of two estimators depending on speciﬁc variances of

indicators when 7 is set to .5 ........................................... 138
Figure 6.3 MSES of two estimators depending on speciﬁc variances of

indicators when 7 is set to O ............................................ 139
Figure 6.4 MSES of two estimators depending on speciﬁc variances of

indicators when 7 is set to .5 .......................................... 140

X V'

CHAPTER 1
INTRODUCTION
From its ﬁrst appearance, meta-analysis has been widely used in various
disciplines including medicine, economics, psychology, epidemiology, and education
(Chalmers, Hedges, & Cooper, 2002; Hedges, 1983; Slavin, 2008; Vanhonacker,
Lehmann, & Sultan, 1990). In spite of a growing interest in meta-analytic techniques as a
means of providing rigorous evidence in many ﬁelds (Borrnan, 2002; Slavin, 2008;
Towne, Wise, & Winters, 2005), the application of existing methodology in research
synthesis faces numerous difﬁculties and limitations due to the inherent nature of
research in education and social sciences (Berk, 2006; Rubin, 1992; Slavin, 1984; Thum

& Ahn, 2007).

1.1. Challenges of Research Synthesis in Education and Social Science

As Kennedy (2007) has pointed out, multiple factors simultaneously inﬂuence
outcomes within naturally occurring settings in education and social sciences. Many
researchers have thus used multiple regressions or hierarchical linear models to eliminate
numerous confounding variables in the primary research (Kennedy, Ahn, & Choi, 2008).
However, their study ﬁndings have been often excluded from meta-analyses (e. g., Ahn &
Choi, 2004; Qu & Becker, 2003) because no generally accepted methods exist for
integrating results of multiple regressions or hierarchical linear models (Becker &
Schram, 1994; Becker & Wu, 2007; Wu, 2006a, 2006b).. AS discussed in Becker and
Schram (1994), regression analyses, path analyses, canonical correlations, and factor

analyses are not easily synthesized. This is because partial correlations provided in such

analyses seldom represent the same parameters, which vary depending on other variables
included in the model.

For example, Ahn and Choi (2004) found that among 49 studies examining the
relationship between teacher subject matter knowledge and student achievement in
mathematics, 11 used regression analysis and 4 used more advanced data-analytic
techniques such as hierarchical linear modeling (HLM) or structural equation modeling
(SEM). Consequently, Ahn and Choi (2004) excluded those 15 studies from their meta-
analysis, and synthesized only the remaining 34 studies that provided correlation
coefﬁcients between teacher knowledge and student achievement.

In addition, in the social sciences and education, no natural scales of
measurements exist (Hedges & Olkin, 1985). Consequently, studies employ a variety of
measures. While these may represent “the same” underlying construct (e. g., student
learning, depression, or other broad constructs), meta-analysts often encounter difﬁculties
in putting effects on a common outcome metric across studies using various measures
(Rubin, 1992). As Bollen (1989) demonstrated, study ﬁndings (e.g., correlations or
regression coefﬁcients) differ if the measurement errors of indicators (with variations in
the reliabilities of indicators) are introduced or if the factor loadings (i.e., validity) of
indicators are not equal to one.

Choi, Ahn, and Kennedy (under review) discovered that 15 different measures of
teacher knowledge in mathematics (e. g., Glennon Test of Mathematical Understanding,
Test of Understanding of the Real Number System (TURNS), etc) were used across the
16 studies included in their meta-analysis on teachers’ subject matter knowledge in

mathematics. Similarly, Becker and Wu (2007) identiﬁed 79 unique measures of student

IQ

learning used to represent the quality of teaching across 65 studies that investigate the
relationship between teacher qualiﬁcations and quality of teaching. Such use of diverse
measures in the primary studies often introduces the following two methodological
concerns in the application of meta-analytic techniques.

First, individual study effects can vary signiﬁcantly depending on measurement
differences in the variables employed (Baugh, 2002; Lipsey & Wilson, 2001; Nugent,
2006; Oswald & Converse, 2005; Oswald & Johnson, 1998; Rubin, 1992; Slavin, 1984).
Thus, many researchers (Hunter & Schmidt, 1990; Oswald & Converse, 2005; Oswald &
Johnson, 1998; Raju, Anselmi, Goodman, & Thomas, 1998; Raju, Burke, Normand, &
Langlois, 1991; Raju, Fralicx, & Steinhaus, 1986) have proposed methods for correcting
study effects for differences in measurement. Hunter and Schmidt’s (1990) approach,
which adjusts correlation coefﬁcients for potential measurement artifacts including
sampling error, measurement unreliability, and range restriction, has been widely adopted
in social sciences, particularly in applied psychology.

However, some researchers (Lambert & Curlette, 1995; Oswald & Johnson, 1998)

have demonstrated that the estimate of the population correlation (,5) obtained via the

Hunter and Schmidt’s approach does not always estimate the true value ( p) and its

associated variance (0'33) is also somewhat inaccurate. For example, based on Monte

Carlo simulations, Oswald and Johnson (1998) demonstrated that discrepancies between
7") and ,0 get larger with small within-study sample sizes and with smaller numbers of
effect sizes included in the meta-analysis.

Recently, Thum and Ahn (2007) have applied a latent variable framework in

research synthesis and proposed to adjust for differences in regression coefﬁcients due to

the factor loadings, the measurement errors, and the variances of latent variables before
combining the coefﬁcients. On the other hand, a number of limitations stand in the way
of practical application of Thum and Ahn’s approach. In particular, many components of
the model are unreported, including the factor loadings of both criterion and predictor
variables, an index of the true relationship between two constructs, and information on
measurement errors. Even if reasonable priors on the unknowns can be selected, the
estimation process outlined by Thurn and Ahn requires information not easily available
and thus practical applications may be limited.

The second concern is that the existing univariate or multivariate statistical
modeling approaches for meta-analysis (e. g., the Generalized Least Squares (GLS)
method presented by Raudenbush, Becker, & Kalaian, 1988) are limited in dealing with
very Sparse data structures, which occur when each effect size (e. g., correlation or
regression coefﬁcient) has its unique measurement characteristics for predictor and
outcome variables.

For example, in the meta-analysis by Choi, Ahn, and Kennedy (under review),
none of the correlation coefﬁcients from 16 studies uses the same measures of both
teacher’s knowledge and student achievement in mathematics. In such a case, the GLS
method, which is frequently used to combine non-independent effect-sizes in the meta-
analysis, is inapplicable due to a singular design matrix for estimating the true population

correlation coefﬁcient and its variance.

1.2. Empirical Example

Figure 1.1 in the Appendix C displays studies included in the meta-analysis by
Ahn and Choi (2004) that focuses on the effect of how much math teachers know on
student learning in mathematics. In Figure 1.1 in the Appendix C, three aforementioned
challenges in synthesizing studies are well delineated: 1) Studies provide results from
diverse data-analytic techniques (e. g., correlation coefﬁcients in Brown, 1988; regression
coefﬁcients in Chaney, 1995; HLM coefﬁcients in Chiang, 1996). 2) Different sets of
predictors (i.e., coursework, degree level, major, GPA, and test scores for teacher
knowledge in mathematics) and outcome variables (i.e., California Achievement Test
(CAT), National Assessment of Educational Progress (NAEP), and Iowa Test of Basic
Skills (ITBS) for student achievement in mathematics) are used across studies. 3) Only
two studies (i.e., Teddlie, Falk, & Falkowski, 1983 and Hill, Rowan, & Ball, 2005)
provide exactly identical links between the same sets of predictors and criterion variables,

leading to a very Sparse data structure for ﬁrrther analyses.

1.3. Purpose of Research

The current research proposes a new methodology for handling a very sparse data
structure of the effect sizes (i.e., correlations or regression coefﬁcients) that mostly arises
from the variations in the measures used in the primary studies. To accomplish this, I use
a Structural Equation Modeling (SEM) approach with latent variables (Bollen, 1989), the
ideas of model-driven meta-analysis (Becker & Schram, 1994), and a method-of-

moments estimation technique (Casella & Berger, 1990; Gelman, 1995). This method

quantiﬁes the relationship between two underlying constructs measured by different sets
of indicators with unique measurement characteristics such as reliability and validity.

As Messick (1993) indicated, there are several ways of conceptualizing validity
(e. g., content validity, criterion validity, predictive validity, etc). I use the term validity to
refer to the structural relationship (correlation) between the indicator and its underlying
construct, which can be understood based on a structural equations approach (Bollen,
1989). AS Bollen (1989) pointed out, the validity of a measure is deﬁned as the
magnitude of the direct structural relation between the indicator and its associated
construct.

In this dissertation, I ﬁrst present the speciﬁcation of a population model using
model-driven meta-analysis and SEM with latent variables. Based on the speciﬁed
population model, the true population relationship between two latent variables is
quantiﬁed by applying the method-of-moments estimation technique. Moreover, three
approaches are discussed for obtaining the unknown values needed to compute the
method-of-moments estimator of the strength of the relationship between two underlying
constructs. Then a series of Monte Carlo simulations is conducted to test the performance
of the proposed approach under different conditions. Last, its practical application is
demonstrated by synthesizing a set of studies that are reviewed by Ahn and Choi (2004),
in which the relationship between teachers’ subject matter knowledge and student

achievement in mathematics was investigated.

6

CHAPTER 2
LITERATURE REVIEW

In many disciplines a variety of measures with different measurement
characteristics are often used to represent “the same” underlying construct in the primary
studies (Farley, Lehmann, & Ryan, 1981). For instance, Crowl, Ahn, and Baker (in press)
reported that the parent-child relationship quality is measured in several ways across the
19 studies included in their meta-analysis. These measures include standard observational
techniques, structured interviews, and several standardized assessments such as the
Family Relations Test, the Parenting Stress Index, and the Dyadic Adjustment Scale.

Also, many studies no longer focus on only a few simple bivariate relationships
(e.g., zero-order correlations), or differences (main effects) on a few outcomes (Becker,
2001; Becker & Schram, 1994). An example from an ongoing synthesis of the studies
examining the relationship between teacher qualiﬁcations and the quality of teaching
(TQ-QT)2 indicates that only 55 out of 461 coded studies used a bivariate correlation
analysis, and 17 others reported simple t tests. Most other studies examined the effect of
teacher qualiﬁcations on the quality of teaching based on more advanced data analytic
techniques such as multiple regression, Multivariate Analysis of Variance (MANOVA),
and Analysis of Covariance (ANCOVA). In this section, I ﬁrst review how these

challenges have been handled in research synthesis.

 

2 More details about the TQ-QT project can be found in http://tnvwmsu.edit/user/mkenrredw’TQQTL

2.1. Meta-analytic Methods For Synthesizing Studies using Various Indicators
In the literature, three methods are often used to synthesize studies using various
measures of the predictor and outcome variables. These are a univariate method, an

artifact- correction approach, and a multivariate method.

2.1.1. Univariate Method

The ﬁrst approach involves creating collections of studies that use the same
measures and then performing a series of separate univariate analyses of effect sizes on
each relationship. This is accomplished by calculating an average effect for each category
(see Hedges & Olkin, 1985; Hunter & Schmidt, 1990; Shadish & Haddock, 1994) based
on traditional research-synthesis techniques (e. g., the z-transformed variance-weighted
average proposed by Hedges and Olkin (1985) or Rosenthal and Rubin (1991)).

For instance, Choi, Ahn, and Kennedy (under review) categorized 51 correlation
coefﬁcients extracted from 19 studies into 8 categories in terms of the content domain
(i.e., arithmetic, algebra, and geometry) and the cognitive demands of the student
mathematics achievement measure (i.e., computation, concepts, and applications). Then
they obtained the z-transformed variance-weighted average estimates for 8 categories by
performing a series of separate univariate analyses, one for each subgroup of studies.

A univariate data-analysis is often used due to its ease of application. However, it
is limited when the interest is in an overall picture of interrelationships among all
variables included in the model as a whole. Moreover, when individual studies contribute

multiple measures of relationships, the univariate method ignores possible dependence in

the data, and thus might lead to inaccurate conclusions (Becker & Schram, 1994; Gleser

& Olkin, 1994).

2.1.2. Artifact Correction

Some methodologists (e.g., Bollen, 1989; Nugent, 2006) have argued that effect
sizes (i.e., standardized mean differences, correlation coefﬁcients) based on variables
with different measurement characteristics are not directly comparable. For instance,
Nugent (2006) demonstrated that the distribution of the standardized mean difference,
which is the most widely used scale invariant effect-size measure in the current practice '
of meta-analysis, varies depending on the reliabilities of measures used in the comparison
groups. It has been also known that the correlation coefﬁcient varies depending on the
reliability of one or both measures (Baugh, 2002; Bollen, 1989; Hancock, 1997; Hunter
& Schmidt, 1990, 1994).

Although most discussions have been limited to correlation coefﬁcients,
particularly in applied psychology, a number of researchers have suggested using the
correction formulas with other effect-size measures such as regression coefﬁcients and
standardized mean differences attenuated due to measurement characteristics such as
reliability and range restriction (Hunter & Schmidt, 1990; Oswald & Converse, 2005;
Oswald & Johnson, 1998; Raju et al., 1986; Raju et al., 1991; Raju et al., 1998).

In fact, corrections for correlation coefﬁcients are heavily used in the meta-
analytic procedures proposed by Hunter, Schmidt, and Jackson (1982) and elaborated by

Hunter and Schmidt (1990, 2004). Hunter and Schmidt (1990, 1994) have indicated that

the study population correlation p0 is always lower than the actual correlation p. This is

because we cannot do any study perfectly, and study imperfections produce the artifacts
that systematically reduce the actual correlation parameter. Therefore, they have

identiﬁed 10 possible sources of artifacts, and propose to correct the attenuated sample
correlation by multiplying it by appropriate “artifact multipliers” (1,- Shown in Table 2.1

in the Appendix C.

After disattenuating each sample correlation using appropriate artifact multipliers

ai , the weighted mean correlation F is obtained by

F=Zwsrs /Zws (2.1)
where rs is the 3th study correlation; the weight for study 5 suggested by Hunter and
Schmidt is

w, = NSASZ, (2.2)
where N s is the sample size for study 5, and AS is the compound artifact multiplier for

study 5.

More elaborations of Hunter and Schmidt’s method have been developed by a
number of researchers (e. g., Le, 2003; Sackett & Yang, 2000 for correcting range
restriction; Hancock, 1997; Raju & Brand, 2003; Raju, Burke, Normand, & Langlois,
1991 for correcting reliability and range restriction; Oswald & Converse, 2005 for
correcting the unrestricted predictor reliability, the range-restricted criterion reliability,
and the restricted validity coefﬁcient). The focus of recent studies (Raju, Burke,
Normand, & Langlois, 1991) has been on how to correct correlation coefﬁcients for study
artifacts when not all the included studies provide information related to study artifacts.

Some researchers (Baugh, 2002; Bollen, 1989; Raju et al., 1986; Raju et al., 1991; Raju

10

et al., 1998) have also expanded their discussions to include attenuation in either
unstandardized or standardized regression coefﬁcients. More details can be found in
Table 2.2 in the Appendix C.

However, some research has indicated that some of the correction formulas
frequently used in research synthesis fail to fully eliminate the effects of study artifacts.
Based on Monte Carlo simulations, Oswald and Johnson (1998) found that the Hunter
and Schmidt’s method, which corrects study artifacts, yields estimates of the population
parameter that do not estimate properly the true value under some conditions, even for
bivariate normal data. In addition, Lambert and Curlette (1995) have Shown that the
variance of the corresponding mean correlation coefﬁcient can be greatly underestimated
when some measures have skewed distributions of the predictor and criterion scores.

Such ﬁndings suggest that the existing methods for correcting the attenuation of
correlation coefﬁcients might not fully eliminate the consequences of study artifacts on
effect-size measures. Moreover, no one has suggested how information on some of the
artifacts can be obtained from primary studies. In particular, the construct validitieS of
both predictor and outcome variables, which are brieﬂy mentioned in Hunter and
Schmidt (1990, 1994), are seldom reported in the primary studies. Considering that these
artifact multipliers are not often reported, the application of this correction will be limited

in practice unless methods are developed for obtaining the unreported values.

2.1.3. Multivariate Method

The third approach for combining dependent effect Sizes from multiple measures

is to use multivariate methods. By using multivariate methods, intercorrelations

ll

(dependencies) among several effects can be taken into account. This should lead to a
more accurate error rate and ensure that samples with more data do not over-inﬂuence the
results (Becker & Schram, 1994).

The most frequently used multivariate approach is a Generalized Least Squares
(GLS) method suggested by Raudenbush, Becker, and Kalain (1988). The GLS method is
a feasible and ﬂexible approach for analyzing multivariate data (Becker & Schram, 1994).
Depending on how the covariances between correlations for the variance-covariance
matrix S are computed, several variations of the GLS method have been proposed by
Becker and Fahrbach (1994), Cheung (2000), Furlow (2003), and Furlow and Beretvas
(2005).

In this section, a general overview of the GLS method is presented with a special
focus on pooling correlation matrices. I begin by considering that the goal is to estimate
the pooled m x m correlation matrix from the correlation coefﬁcients which are reported
in k studies using m variables. To accomplish the GLS analysis, the correlation
coefﬁcients should be stacked in a vectorr. The ﬁxed-effects model for the correlation

rsj (s=1tok andj= 1 to m', m*=m(m—l)/2)canbewrittenas

rsj =pj +esj, fors=l to k, and j=1 to m*. _ (2-3)

This model can be re-written as a multiple regression in matrix form, in which the

product of a matrix X and a set of population correlations p j predict a set of sample

correlations. Speciﬁcally

r = Xp. +e, (2-4)

. . * * . . . . . .
where the matrix X IS a stack of m x m identity matrices for k studrcs, and Identrﬁes
which correlations are estimated in each study and p. 2 (p1 ,..., pm )' contains the

population correlations. The pooled correlations and their standard errors are estimated

by the following GLS formula shown in Becker (1992)
a. =(X'C'1X)'1X'C'lr (2.5)
and

vtﬁ.)=(x'C"X)". (2.6)
where C is the variance-covariance matrix among the correlations within studies
included in the meta-analysis on the diagonal, with blocks of zeros in the upper and lower
triangles. See Olkin and Siotani (1967) for formulas forC. Also, other ways of estimating
C can be found in Becker and Fahrbach (1994); S. Cheung (2000); Cheung and Chan
(2005); Furlow (2003); and Furlow and Beretvas (2005).

However, the application of the GLS method might be problematic for very
Sparse datasets, in which few studies use the same measures of variables of interest. This
is because the design matrix X in equation 2.5 and equation 2.6 may become Singular,
and GLS analysis would be impossible when estimating the true population correlation

coefﬁcient and its variance.

2.2. Model-driven Meta-analysis
Becker (e. g., Becker, 2001; Becker & Schram, 1994; Whiteside & Becker, 2000)
described model-driven meta-analysis as an efﬁcient tool to deal with the growing

complexity of primary studies in research synthesis. Becker (2001) refers to the model-

13

driven meta—analysis as a review that incorporates models from the substantive theory
and informs us about the strength of relations posited by a population model. In a model-
driven meta-analysis, the interrelationships among multiple constructs or measures that
are explicit in the model are individually as well as simultaneously examined. Eventually,
a model-driven meta-analysis can delineate a more complete system of relationships
among constructs or variables than a traditional synthesis and provide a model for
making further predictions based on real or hypothetical predictor values.

Becker and Schram (1994) discuss the rationale for employing models in
synthesizing studies. First, they emphasize the importance of theory and theoretical
models in primary studies, which are useful to verify or refute competing models.
Similarly, a model-driven meta-analysis can help the reviewer build a stronger basis of
explanation for the mechanisms behind a phenomenon of interest. Second, a model-based
research synthesis can provide an overall picture of patterns among variables across the
existing studies, by piecing together parts of a process that has been studied by different
researchers or studied using different samples. Last, they point out that theoretical models
can also guide reviewers in the conduct of the review process, much as they can help the
conduct of primary research.

In a model-driven meta-analysis, models can arise empirically or be derived from
theory (Becker, 1997). Figure 2.1 in the Appendix C Shows one example of a model used
in the meta-analysis conducted by Whiteside and Becker (2000), in which multiple
factors affecting child outcomes including externalizing symptoms, internalizing
symptoms, social skills, and cognitive skills are investigated. As seen in Figure 2.1,

models are often illustrated using ﬂowcharts or path diagrams. Such a diagram has two

14

components — boxes representing a construct or a set of constructs, and arrows
representing paths indicating interrelationships among a set of constructs. In Figure 2.1,
Whiteside and Becker have used 14 boxes representing variables or constructs (10 for
predictors, and 4 for outcomes), and 19 arrows representing paths for interrelationships
(including bidirectional relationships) among 14 variables or constructs. Due to the
limited number of studies, a Slightly reduced model was ﬁnally estimated in their meta-
analysis. More details can be found in Whiteside and Becker (2000).

Based on Cooper’s ﬁve stages of the review (1982), Becker (1992, 1997, 2001)
drew parallels for incorporating models in conducting a model-driven meta-analysis. At
the ﬁrst stage of problem formulation, the models can guide reviewers to conceptualize
the problem, deﬁne the constructs, and determine study relevance, even though they
could also limit the generalization from the review by limiting variables and underlying
constructs. At the data collection stage, researchers can easily establish explicit inclusion
rules. This can occur because researchers who set up their models are fully informed
about the research related to their own model and the research on competing models. The
next stage is data evaluation, in which reviewers judge the procedural adequacy of
studies in the review. At this stage, models can be used to identify and code aspects of
study features, extract outcomes, and determine the type of data that will be used in data
analysis. At the data analysis stage, models allow reviewers to test not only individual
paths, but also interrelationships among several constructs or variables in the models.
Furthermore, researchers can examine the extent to which the relationships posited in the

models are observed in the data. At the public presentation stage, reviewers are expected

15

to describe explicitly the use of models in each stage. This helps readers evaluate the
generalizability of the ﬁndings from the proposed model in a model-driven meta-analysis.
As Becker (2001) mentioned, the major beneﬁt of employing a model—driven
meta-analysis is its capacity to provide information about different theoretical and
empirical models. Moreover, researchers can obtain the overall picture of a complicated
system reﬂected in the primary studies, by estimating interrelationships among constructs
or variables Speciﬁed in the models. Consequently, the synthesized models can be useful
to establish the validity of proposed models against other competing models and to help
further formulate stronger explanations for the mechanisms of the phenomenon.
However, several statistical and practical problems in synthesizing models have
been identiﬁed. One of the most prominent issues is the missing data problem, which can
occur as the result of several causes (e. g., researchers may contribute to publication bias
by failing to report nonsigniﬁcant results (the ﬁle-drawer problem), or all the variables of
interest for the meta-analysis may not be included in any Speciﬁc study). Missing data at
the synthesis level can make estimation impossible or difﬁcult. Also, a sufﬁciently large
sample Size is required for performing a model-driven meta-analysis. Other practical, but
less technical issues concern 1) variations in deﬁning the constructs across studies, 2)
between-studies and within-study variation in synthetic models, 3) sources of artifactual

variation, and 4) model misspeciﬁcation.

l6

2.3. Structural Equation Modeling with Latent Variables
Bollen (1989) argues that structural equation models with latent variables
encompass two general model types. One is a latent variable model that summarizes the
structural relationship between latent variables as
n=Bn+F§+C. (2.7)
where I] is the vector of latent endogenous random variables; i represents the latent
exogenous random variables; B is the coefﬁcient matrix showing the effect of the latent
endogenous variables on each other; and F is the coefﬁcient matrix for the effects of I;
on I].
The second component is a measurement model that speciﬁes the structural
relation of observed to latent variables as
x = Ax§+5, (2.8)
and

y=Ayn+s. (2.9)
where y and x are vectors of observed variables; Ax and Ay are the factor-loading

matrices that show the relations of x to g and y to 1], respectively; and 8 and 5 are

the errors of measurement for y and x.

2.3.1. Structural Equation Modeling in Meta-Analysis
Although other statistical methods (e.g., a standardized regression equation from
the pooled correlation matrix) can be used to obtain an empirical synthesized model,

many researchers (e. g., Becker, 1992; S. Cheung, 2000; Cheung & Chan, 2005; Furlow,

l7

2003) have applied structural equation modeling (SEM) to model-driven meta-analysis.
In general, the application of structural equation modeling in the meta-analysis involves
two steps. The two-step approach in meta-analytic SEM entails ﬁrst pooling a correlation
matrix across studies included in the meta-analysis, and then performing the SEM by
inputting the pooled correlation matrix into standard SEM software such as LISREL or
EQS. The meta-analytic SEM has been widely employed in literature (e.g.,Brown &
Peterson, 1993; Hom, Caranikas-Walker, Prussia, Griffeth, 1992; Premack & Hunter,
1988; Schmidt, Hunter, & Outerbridge, 1986), focusing on a path analytic method (e. g.,
Cheung & Chan, 2005; Furlow, 2003).

However, a few researchers (i.e., Cheung & Chan, 2005) have recently applied
meta-analytic SEM to estimate a conﬁrmatory factor analysis (CFA) model (Furlow,
2003). Cheung and Chan (2005) have proposed a slightly different technique, which is
called the 2-stage structural equation modeling (TTSEM) method. In their TTSEM
method, the correlation matrices are ﬁrst pooled using the technique of multiple-group
analysis in SEM, and then the pooled correlation matrices are used to ﬁt the CFA model.
Advances in Cheung and Chan’s method are 1) to introduce observed variables and their
corresponding constructs in the model, and 2) to estimate factor loadings and
measurement errors of observed variables for measuring their constructs in the

synthesized model.

2.3.2. Latent Variable Framework in Meta-analysis

Recently, Thum and Ahn (2007) have introduced a latent variable framework for

synthesizing studies. The latent variable model consists of a measurement model that

18

speciﬁes the relation of observed to latent variables and a latent variable model that
Shows the inﬂuence of latent variables on each other. Thum and Ahn (2007) suggested
the application of the latent variable model to reach the ultimate goal of research
synthesis -- to understand the true relationship among constructs represented by the latent
variables, which are measured using various indicators across the included studies.

If the objective in each study i is to reveal the underlying relationship among

speciﬁc unobserved constructs say, y, the relationship between y and each study-speciﬁc
estimate, say, (31-) based on the observable indicators employed has a predictable

functional relationship that ties the observable indicators to their respective constructs.

F urtherrnore, Thum and Ahn analytically showed that the study-speciﬁc estimates (i.e.,
ordinary least square (OLS) regression coefﬁcients, ,8,-) can be related to the underlying

relationship among unobserved constructs y based on validity and reliability, the
covariance among constructs, sampling factors, and misspeciﬁcations of the structural
model. Therefore, Thum and Ahn proposed to ﬁrst adjust study-speciﬁc estimates using
their respective measurement and structural parameters, and then obtain an average effect.
A Simulation by Thum and Ahn indicates that the average estimate of the OLS regression
coefﬁcients corrected by the reliabilities of predictors and validities of predictors and
outcomes is the least unbiased of several estimates.

However, a number of limitations stand in the way of practical application of
Thum and Ahn’s approach in the real world. In particular, many components for
correcting OLS regression coefﬁcients are seldom reported, including factor loadings of

both criterion and predictor variables, and information on measurement errors. Even if

19

reasonable priors on unknowns can be selected, the estimation process outlined by Thum

and Ahn is quite complicated and thus practical applications are limited.

2.4. Method of Moments Estimation Technique

The method of moments is the oldest method of ﬁnding point estimators (Gelman,
1995), which is to estimate the population parameters such as mean, variance, median
and etc. of a probability distribution by matching theoretical moments to speciﬁed values
(Casella & Berger, 1990). This method is preferable to other approaches because it is
simple in that it always provides some sort of estimate.

Let X1 , ..., X n be a sample from a population with probability density function

f (x 0 ,...,0 ) with ﬁnite moments E[xk ]. Methods-of—moments estimators are
1 k

obtained by equating the ﬁrst k sample moments to the corresponding k population
moments, and solving the resulting system of simultaneous equations. The sample

consists of n observations, x1,..., x” . The kth raw or uncentered moments are

n
”11 :1. Z Xi], H :Exl,
”i=1
1 n
m2=— Z xi2, u2=Ex2,
"i=1
n
mk =1 ); xik, pk =Exk. (2.10)
"i=1

20

The population moments 71,- will typically be a function of 91 ,...,6k , say
,uj (01 ,...,Ok) . The method-of—moments estimator (él ,..., 5k) of (61 ,..., 6k) is obtained

by solving the following system of equations for (621 ,...,ék) in terms of (ml ,...,mk):
M1 = #1(91,..-.¢9k),

mz =#2(91,-~,9k).

mk =,uk(91,...,6k). (2.11)

The method-of-moments estimation technique is preferable to other estimation
techniques such as Fisher’s maximum likelihood estimation technique, if the family of
probability models is not known or when estimating parameters of a known family of
probability distributions (Gelman, 1995). It also provides consistent estimators of
parameters (Greene, 1997). However, the method-of moments estimators are not
necessarily efﬁcient and sufﬁcient. Therefore, the method-of-moments estimators are
often used as the ﬁrst approximation to the solutions of the likelihood equations or a

Bayes prior (Gelman, 1995).

CHAPTER 3
METHODOLOGIES

AS discussed in the previous sections, meta-analysts face challenges and
difﬁculties in synthesizing studies when the original studies use diverse measures. Study
effects vary considerably depending on the differences in measures employed and thus
they are not directly comparable. In addition, data can be too Sparse to apply the existing
univariate or multivariate meta-analytic methods. Therefore, the existing methods are
unable to fully resolve these challenges in research synthesis.

As a result, I propose a methodology in which the strength of the relationship
between two latent variables is estimated. In this proposed approach, the underlying
population model that is applied to all included studies is ﬁrst formulated based on two
perspectives; one is based on model-driven meta-analysis, and the other is structural
equation modeling with latent variables. Then the ﬁnal estimator, in which the strength of
the relationship between two constructs that are measured differently across studies is

quantiﬁed, is obtained by applying the method-of-moments estimation technique.

3.1 Model Specification

Suppose that the primary goal in the meta-analysis is to understand the strength of
relationship between two latent variables, the exogenous (4" ) and endogenous (77)
variables, which is represented by 7 . All k studies in the meta-analysis provide study-

speciﬁc effects (i.e., correlations or regression coefﬁcients) estimating 7 from a set of

predictors x = [x1,x2 ,...,xp_1,xp] and different outcome variables

fromy = [H,)’2....,yq_1,yq ] . As shown in Figure 3.1 in the Appendix C, for instance,
the ﬁrst study may provide a zero-order correlation coefﬁcient between x1 and y., and the
kth study reports regression coefﬁcients predicting y2 using x3 and x p'

Figure 3.2 in the Appendix C speciﬁes the population model that underlies the k
included studies in the hypothetical meta-analysis. Our primary goal in the meta-analysis

is to estimate 7 from the study-Speciﬁc effects linking observed predictors
x = [x1,x2,...,xp_1,xp] and criterion variables y = [y1,y2,...,yq_1,yq] . Each of

these represents its corresponding underlying constructs, E, and I] , with different

accuracy.

3.2. Structural Equation Modeling with Latent Variables
The underlying measurement model delineated in Figure 3.2 implies that the

indicator variables and their corresponding latent variable are related. Speciﬁcally,
x=Ax§+5, 0-”
y =Ayﬂ+8, . (3.2)

where x (p x 1 )and y (q x 1) are vectors ofobserved variables; Ax (p x c, c is the

total number of g) and Ay ( q x d, , d is the total number of n) are the factor-loading

matrices that Show the relations of y to r] and x to E, respectively; and 6 (p x l )and
a (q x 1) are the errors of measurement for y and x . The errors ins are assumed to be

uncorrelated with I], F, and 5 , and 6 is in turn uncorrelated with r], g and s.

Let T be the p + q dimensional column vector of both indicators x and

X r
yT=[Y:l=[x1 x2 xp_1 xp y1 y2 yq_1 yq].ThecorreSponding

population covariance matrix of T is schematized as

r. 2 w
2 = 2(T) = [2“ 2“]. (3.3)
yx yy

The covariance matrix 2(T) consists of four submatrices: (1) the covariance

matrix among the yS, 2y), , (2) the covariance matrix of x with y, ny , (3) the transpose
of the covariance matrix of x with y, ny , and (4) the covariance matrix among the

xs, Xxx .

Let us consider the implied covariance matrix of y, Eyy (T). It is

zyy (T) = E(yy') = E[(Ayn + sxAyn + an

i (3.4)
= AyE(II'I')A y + 93:

where @a is the q x q variance-covariance matrix of a.

The covariance matrix of x with y, Exy (T) , and its transpose, Zyx (T), are equal

to
2,, (T) = E(xy') = EKAxs + ﬁlmy" + 8)" (3.5)
= AxEttn'M'y.
and
Zyx (T) = E010: EKAy" + ”(Axé + 5).] (3.6)

= AyE(IIS'lA;('

24

Finally, the covariance matrix of x, Xxx (T) , is written as

Xxx (T) = E(xx') = E[(Ax§ + 5)(Ax§ + 5)']

, (3.7)
= AxE(§§')Ax + (95,

where (95 is the p x p variance-covariance matrix of 6.
If I assemble equations (3.4) - (3.7) into a single matrix 2(T) , the population

covariance matrix for the sets of indicator variables is

fzxx ny
E y W, , . (3.8)
AxE(§§)Ax+@5 AernrAy

AyEms'MSr AyEmn'M'y +9.2

HT):

 

3.3. Estimation

From the population covariance matrix shown in Equation 3.8, let us focus on the

covariance matrix of x with y, zxy (T) . If I assume all x and y indicators are

standardized with mean of O and variance of 1, the covariance matrix of y with x ,

zxy (T), becomes a matrix of population correlations,

 

 

I ”xv/1 pxzyr po-m pxpyr
pxryz pxzyz pxpnyz pxpyz
Exy(T)=E(xy')= E 3 E E . (3.9)
pqua pxzyq—r pxp-ryq-1 pxpyq-I
_ leyq ”qu pxp-lyq-l pxpyq 3

25

Applying Equation 3.5, this correlation matrix can be written as

nym=E(xy)
=Afoéﬂ'lA'y
r .0ny1 pxzyl pxp_1y1 pxpyl 1
p11}? pxzyz pxp_1y2 pxpyz
pleq-l pxzyq—r pxp—ryq—r pxpyq-1
_ pxryq pxzyq po—ryq-I pxpyq .

 

F AXI’IyI 5(5’7') ’Ixz’lyrEW') 41p-1/lylE(§rt') apatite) _
11152 5(9'5’7') 112 472 “5’72 411,4 525677) ixp/iyzﬂén')

1371 qu_1E(§77') 1x2 XCVq-l EQWI) ... ’1qu quq Eg’l'l ’I'xp qu-l 51:77,)

_‘xri:qu<€'7'> ﬁzz-20.11562) ap_,tqu<rn') Aymara);

 

 

(3.10)

Equation 3.10 suggests that the correlation between x,- and y j , pxi'J’j , can be
written as a function of the factor loadings of x,- and y j , 21x1. and ’57 , where i = l to p,

and j = l to q. Applying the method of moments by equating the sample moments to the

corresponding population moments, and solving the resulting system of Simultaneous

equations (Casella & Berger 1990) leads to

1'E(xy')1=[1'AXIIA'yIIE(§'I')

q p , (3.11)
j=1i=l

Recall that p5,, is equal to E(§I}') , where the means and variances of g and t]

are 0 and 1, respectively. Then Equation 3.11 becomes

26

Efﬁ'l'nxi lyj

II MS
I M'U

j 11—

q
=p§n 2 xx, 1y]. (3.12)
j=1i=l
q p
=pgnl z xyj( 2: xx, )1.
j=l i=1

From Equation 3.12, the correlation between two constructs 4‘ and 77 , pg, is

written as

f f
pry.
Eﬂizl ' J

(313)

II
V)

 

,0 ‘=
in q p
.Z ’I'yj ('2 439-)
j =1 l=I
Therefore, if I know all of the population correlations between x,-
and y j(pxl. y]. ), and the factor loadings of x,- and y j , xixl. and 21y]. , the correlation p5,,
between two latent variables can be estimated. In general, however, I will estimate each

correlation using the sample value of x,- and y j , and I will also need estimates of the

factor loadings. I discuss this issue in the next sections.

3.4. Information for Estimating p5,,

Two components are required to estimate pg, using the method-of-moments

estimator shown in Equation 3.13. One is the set of estimates of the population

correlation coefﬁcients between xi and y j (i.e., the estimates of pxl. y]. ), and the other is

27

the factor loadings of x,- and y j for measuring the exogenous (é ) and endogenous (77)

variables, ’le- and ﬂy]. , respectively.

3.4.1. Population Correlation Coefﬁcients Between xs And ys, pxiyj

The population correlation coefﬁcients can be estimated if studies provide zero-

order correlation coefﬁcients among x,- and y j' Several methods for estimating mean

population correlation coefﬁcients from studies have been widely investigated and
discussed (Becker, 1992; Becker & Schram, 1994; Raudenbush, Becker, & Kalaian,

1988; Wu, 2006a, 2006b). In the current research, two methods are used to estimate the

population correlation coefﬁcients 10x; y}. . First, I average sample-size weighted

observed correlation coefﬁcients for each x and y pair (as in Hunter & Schmidt, 1990). A
second method is to combine z-transforrned variance weighted correlations (Shadish &

Haddock, 1994).

First, the correlation coefﬁcient estimates of pxiyj are obtained from the sample-

size weighted mean of the observed correlation coefﬁcients between the xs and ys:

k
2 n5r(xiyjls
~ 2 5:1
pxl'yj k 9
Z "5

5:]

(3.14)

 

where r(x1'J’i )s is the sth reported correlation coefﬁcient between x,- (i = l to p ) and y j

(j = 1 to q) and k is the number of studies.

28

Second, the most often used univariate method in meta-analysis, which is to
combine z-transformed correlations (Shadish & Haddock, 1994) will provide a second
estimator. Z-transforrned variance-weighted correlations are obtained by converting the

correlation coefﬁcients r( Xi. Yi )s S by Fisher’s variance stabilizing z transform:

Z[,.(xiyj >13 =.5{ln[((1+ My]. )5 ) / (1 — any]. )3 )1), (3.15)

where In is the natural logarithm. If the underlying data are bivariate normal, the

 

 

condrtronal vanance of Z[ r(x1'J’j ) 15 rs

v - 1 (3 16)

S (n. — 3) ’ '
where n5 is the within-study sample size of the 3th study.
The z-transformed weighted average correlation coefﬁcient is
k
2 SE1 WSZ[r(xiyj)]S 3 l7
[’Ixiy 1)] ‘ k . ( - >
2 HS
5:1
where wS is a weight assigned to the 5‘11 study. The weights are calculated by
1
W5 = —. (3.18)

Vs
The estimate in the 2 metric shown in Equation 3.17 is then back-transformed to
obtain [2 via

exp(23[ —l

«my-)1)
])+1'

 

[2 = _ (3.19)
exp(22[r(xiyj)

3.4.2. Factor loadings (Validity coefﬁcients)

Factor loadings or validity coefﬁcients of observed variables are rarely reported in
primary studies. For instance, only one study included in a meta-analysis by Choi, Ahn,
and Kennedy (under review) provided a validity coefﬁcient of the indicators representing
how teacher tests measured teachers know math knowledge. Therefore, these values need
to be estimated using other information provided in the studies or by other means.

In the case where all studies provide the correlation matrix among all variables
used in studies, factor loadings of all variables are easily computed. Consider a simple

one-factor, three-indicator model:

I— —-

X1 2“7111 51
x2 = 1x2 I§I+ 82 . (3.20)
x3 1x3 53

 

 

where g is uncorrelated with 5,- ( i = 1, 2, 3). This leads to the following relationship:

— _

 

 

xii (251 +mr(5x1)
varCrI) 1
cov(x1,x2) var(x2) = 1112x295] 4:52 ¢] +var(5x3 )
cov(x1,X3 ) covﬂr 2,)1‘ 3 ) var(r3 ) 2
’le Ax3 ¢I 1x2 4x3 ¢1 Ax3¢1+var(dx3)j

(3.21)

To ensure the model is identiﬁed, 1 set (.61 to l (Bollen, 1999). Then, the
covariances among the x,- S are computed as

cov(x1,x2) = ’le 21x2 ,cov(x1,x3) = xix] 1x3 ,cov(x2,x3) 2 21x22 4x3 . (3.22)

30

Likewise, if I have correlations among all variables used in all studies, their factor
loadings are easily computed based on Equation 3.22. The same logic can be applied to

obtain the factor loadings of y 1" However, if no information is provided (e. g., no study

provides correlations among x1,x2 , and x3 ), they must be approximated based on other

information provided in each study. More details for obtaining factor loadings of

variables used in the studies are discussed below.

3.5. Extracting Unknowns in the Model

If no correlation coefﬁcients among the x,- or y 1- exists, the following two

methods can be used to estimate factor loadings of variables used in the studies. One is
based on the reliabilities of the observed variables, which are fairly frequently reported in

primary studies. The other uses expert judgments about the validities of variables.

3.5.1. Use of Reliability Information

Considering that the reliabilities of measures are likely to be reported, it would be
reasonable to use them to estimate the factor loadings of the indicators. Bollen (1989)
introduced an alternative way to deﬁne the reliability based on classical test theory as
well as the measurement model. Based on classical test theory (Allen & Yen, 1979;
Crocker & Algina, 1986), the observed score (x) can be written as

x = ‘I.’ + e , (3.23)

where t is the true score, e is the measurement error score or error of measurement, and

the expected value of measurement error is assumed to equal to 0. Thus the expected

value of x is Ti.

31

In addition, the true scores I depend on the latent variables E, such that
t=Ax<§+s, (3.24)
where A xi is the coefﬁcient that Speciﬁes the structural relationship between t and Q ,

and s represents speciﬁc variance unrelated to i and to e. Substituting equation 3.24 into
equation 3.23 leads to
x=Ax§+s+e. (3.25)
Since the reliability is the ratio of true score variance to the observed score

variance, I can write the reliability of xi as

2
_ V3I(Ti) _ ’lxl- ¢1 +Var(si)
pxixi — var(xi) — var(xi) I (3.26)

 

 

From equation 3.26, if the variance of the latent variable ( ¢,-) is set to 1, the
speciﬁc variance equals 0, and the variance of x,- iS known or can be estimated, the x,-

factor loading can be written as

 

2x]. = \[pxixi *var(x,-). (3.27)

The same logic can be applied to estimate the y j factor loading as

 

2y]. = prjyj *var(yj). (3.28)

3.5.2. Use of Expert Judgments

The second method is to use expert judgments about the factor loadings of

indicator variables xi and y 1" Each content expert as an independent rater would be

asked to provide information regarding how well each of the indicators used in the

studies represents the corresponding underlying constructs. When judging the validity of
each indicator, experts are expected to read the individual studies carefully, and then rank
order all indicators in terms of each one’s relation to its corresponding construct’. Experts
would also be asked to provide an approximate value for the validity coefﬁcient of each
indicator.

According to Thurstone’s (1927) discrete utility model, raters rank indicators
based on their utilities, in this case their validities, which are unobserved and vary across

respondents (Maydeu-Olivares & B6ckenholt, 2005). I shall denote by ’1' the latent

random variable associated with the validity for an indicator xiv. If a respondent prefers
an indicator x, over an indicator x0 his or her perception of the validity of an indicator

x,- - , u xi' should be larger than that n indicator x0. This can be Speciﬁed as

liftx th
ux.,= , 0 0. (3.29)
1 01ftx0 <tx0

The response process shown in equation 3.29 can also be written in terms of

differences between the latent utilities,u: = txi —th , where u: is the latent
1 1

comparative response. Then, u x1. can be re-written as a function of u: :
z

lifud Z 0
xi
u = . (3.30)
0 if ud < 0

Xi

 

3 A possible protocol for obtaining expert judgments is shown in Appendix A.

33

Then, the latent comparative response as a linear function of latent random

variable ux. is
1

Ud = At, (3.31)
where A is an design matrix, consisting of p choice alternatives in the columns and the

p paired comparisons in the rows. Note that there are p predictors and q outcomes.

For instance, suppose that individual experts rank order 4 measures [A, B, C, D]
in terms of their validity. In the design matrix A , each column corresponds to one of the
4 choice alternatives [A, B, C, D]. The corresponding rows give the Six paired

comparisons [A, B], [A, C], [A, D], [B, C] [B, D], [C, D]. Thus the design matrix A is

1-100
10-10
100-1
A= (3.32)
01-10
010-1
_001-1_

 

 

A different ordering, say [A, C, B, D] would lead to a different A design

matrix.

Assuming that the vector of latent utilities txl. is normally distributed in the

population of respondents, the mean and covariance structure of u: are
1
= A , 3.33
uud Ptxl. ( )
x,-
and
2 d = AZt A'. (3.34)
11 xi

AS Maydeu-Olivares and Bockenholt (2005) pointed out, the Thurstonian ranking
model can be estimated using the SEM framework. Figure 3.3 in the Appendix C depicts

the covariance structure shown in 3.34 as a SEM model for a ranking model with four

choice alternatives. In ﬁgure 3.3, there are six observed variables 11:. , and four latent
1

validitiestxl. . In fact, a: are not actually observed, but their dichotomizations “xi are
1

observed.

In SEM, parameters in the structured multivariate normal distributions of u: that
1

have been dichotomized according to a set of thresholds are estimated in several stages
(Muthen, 1978). First, the thresholds and the tetrachoric correlations among the
underlying normal variables are estimated. Then, the parameters are estimated from the
thresholds and tetrachoric correlations. The thresholds and tetrachoric correlations are

d

x- . The standardized latent response
1

obtained ﬁrst by standardizing the latent responses u

Z (u: )is computed as
1

d _ d
Z(11,“)- DfuXi wugi ), (3.35)

where D is a diagonal matrix with the reciprocals of the standard deviations of u: on
1

the diagonal:

D = [Diagal ud )1-1/2. (3.36)

x,

35

Then, the standardized latent responses Z (11:: ) are multivariate normal with
. 1

mean 0 and correlation matrix P d , where
u
xi

P2016) ) = D(Eud )D = D(A£xt A')D. (3.37)
31' xi

Also, the standardized latent difference responses Z (a: ) are related to the
1

observed u xi by a threshold relationship as

1 if 2(“3. ) 2 Txi
ux_ = a; . (3.38)
l .
0 1fz(uxi ) < 7x1.

Since there are p paired comparisons, there will be 13 thresholds Txi . The vector

of Txi values has the following structure, which is proven in Maydeu-Olivares and

Béckenholt (2005)

r = -DA,u,x. . (3.39)
1
Thus, the parameters of interest ’u’x- and 2%. are estimated using equations 3.33
1 '1

— 3.34. The estimation process, which is a SEM with categorical indicators, can be

performed using standard SEM software such as LISREL, EQS or MPLUS.

CHAPTER 4
SIMULATION
A number of Monte Carlo simulations are conducted to test the performance of
the proposed approach under different conditions. In these simulations, I estimate the

relationship between exogenous variables 5 and endogenous variables I] , each of which
rs measured usrng 3 1nd1cators x,- and y j , wrth Y = [.r1,x2 ,.r3 , y] , y2 , y3] . It IS also

assumed that these indicators are each standardized with a mean of 0 and a standard
deviation of 1.
In each hypothetical meta-analysis, the data to be combined are from a series of k

independent studies, in which the 5’” study reports zero-order correlation coefﬁcients

rxin , with population correlation coefﬁcients pxiyj , where i = 1, 2,3, j = 1,2,3 , and
s = l,2,...,k — l,k . The sample correlation coefﬁcients (rxiyj ) are obtained ﬁom a ﬁxed

sample size of 30 in each study.

4.1. Data Generation

R (R Development Core Team, 2008) version 2.6.2 is used to generate data and
examine the performance of the proposed approach for estimating the correlation
between two latent constructs.

The method-of-moments estimator given in Equation 3.13 may be affected by the

features of the population model underlying the meta-analysis, including how well x, and
y]. represent the underlying constructs (i.e., the validities of indicators), the total number

of sample correlation coefﬁcients included in the hypothetical meta-analysis (i.e., sample

37

Size, the number of studies), the number of sample correlation coefﬁcients r, ,_ to be

included for each study, and how reliable the predictors and outcome variables are. Also,
estimates are likely to be affected by the amount of missing data (i.e., the number of X—Y
correlations that are not reported) and the quality of missing data, so several missing-data
conditions are investigated.

For a series of hypothetical meta-analyses, the sample correlation coefﬁcients

rxiyj are generated from a multivariate normal distribution of n = 30 cases per pseudo

X
study, for the vector containing x,- and y j , with [y] = [x1,x2 ,x3 , y] , y2 , y3 ]', assumed

to have mean vector of 0 and variance-covariance matrix of 2. 2 , shown in Equation

3.10, is determined from the factor loadings of xi and yj (Axi ,i = 1,2,3 and ij ,

j = 1, 2,3 ), the true relationship between 4‘ and I] , and the measurement errors of x,- and

yj (6i, i=1,2,3 and ej, j=1,2,3).

4.1.1. Choice of Parameters

The parameters to be varied in the simulations are the index of true relationship
between 5 and 77 (7 ), the reliabilities of the predictor and outcome variables, the number
of studies (k) included in the hypothetical meta-analysis, and the quantities of missing

5‘in values. The ﬁrst two simulation parameters are used to create the variance-

covariance matrix 2 that generates the zero—order correlations for each study. The next
two sets of simulation parameters represent the characteristics of studies included in each

hypothetical meta-analysis.

True relationship between 5 and 77. Two values are used to characterize the true

relationship between the two underlying constructs. These values are selected to
investigate how well the proposed model estimates its true relationship 1) when there is

no relationship between 5 and 77 (i.e., 7 = 0), or 2) when there is a medium and positive
relationship between {5 and 77 (i.e., 7= .5).

Reliabilities of xs and ys. Four sets of the reliabilities of indicators, xi and y j ,

(i'e"{px1xl ’px2x2 ’px3x3 ’pYI Y1 ’pY2Y2 ’pY3Y3 } = {'9"9"9"9”9"9}’
{.5,.5,.5,.5,.5,.5}, {.2,.2,.2,.2,.2,.2}, {.9,.5,.2,.9,.5,.2}) are used to compute the factor
loadings of x,- and y j based on Equations 3.27 — 3.28. Three values of reliabilities -
.9, .5, and .2 - are chosen to represent high, medium, and low reliabilities of the

indicators, respectively.

Once the factor loadings of x,- and y j are determined, the variances of the

measurement errors are obtained from var(xi) = Xi _ (it + var(8Xi ), and
1

var(yj) = 7.in (l) + var(eyj ). Since the variances of the indicators and underlying

construct (75) are set to l in this simulation, the variances of the measurement errors are

obtained by
__ _ 2
var(SXi)—l AXi, (4.1)
and
2
vars . =l—7t . 4.2
(yp yj ( 1

39

Total number ofstudies included. In the published research syntheses in Review of
Educational Research from 1990 to 2004 and Psychological Bulletin from 1995 to 2004,
the number of independent studies, k, varied from 12 to 180 (Ahn & Becker, 2005). Ahn
and Becker (2005) indicate that approximately 75% of meta-analyses were based on
fewer than 40 studies. Therefore, two values (i.e., k = 9 and 36), which are multiples of

nine, are used in this simulation, considering that 9 pairs of zero-order correlations using
3 xs and 3 ys can be generated from the population model. For each of k studies, as many

as 9 sample correlation coefﬁcients rxiyj are generated.
Number of missing rxl. J’j . In practice, the reported correlation coefﬁcients

between x. and y] vary considerably. Since the population model is established under the

assumption that 3 xs and 3 ys are observed in the primary studies, at least three pairs of

zero-order correlatron coefﬁcrents (1.e., rx1 yl , rxzyz , and "x3y3 ) should be provrded.
Thus, the quantities of missing rxiyj values manipulated in the simulation varied from O
(1.e., all nrne rxiyj values are provrded) to 6 (1.e., all other rxiJ’j except rx1y1 , rxzyz ,

and rx3y3 are missing). Therefore, 7 variations (i.e., the number of missing rs equals 0, l,

2, 3, 4, 5, and 6) are used in this simulation.

4.1.2. Replications

From 8 population variance-covariance matrices (i.e., 2 values of 7 X 4 sets of

reliabilities of xs and ys), a total of 112 meta-analyses (i.e., 2 values of k X 7 variations

regarding the qualities of missing variables) are generated in this simulation. These 112

40

different conditions are replicated 1,000 times, leading to 1 12,000 hypothetical meta-

analyses in the Simulation.

4.2. Data Evaluation
The index of the relationship between the two constructs, which is the method-of-

moments estimator based on Equation 3.13, is obtained as follows: 1) Nine estimated

population correlation coefﬁcients (73x1. y}. s) are obtained from k rxiyj s by pooling the
values of the rxiJ’j based on a sample-size weighted average (ESl) and a z-transformed

variance-weighted average (E82), 2) Sums of factor loadings of x,- s and sums of factor

loadings of y j s are computed, and 3) Two values of the index of relationship between

the two constructs, which is shown in Equation 3.13, are computed from two sums of
nine estimated population correlation coefﬁcients in step 1, and the sums of the factor
loadings from step 2.

These estimates E81 and ES2 are compared to the strength of the true relationship

between two latent variables (i.e., 7 = p5,] = O and 7 = pi” = .5). In particular, the bias
and mean-squared error (MSE) of the estimators are evaluated. Denoting each of the two
effect-Size estimators as d and the population effect Size as 6, I computed

Bias(5) = E(d) - 5, and

MSE (6") = [Bias((§‘ )]2 + Var(s" ),
where E( d ) is computed as the mean 5 value and Var( d ) is the empirical variance of

the 5 values across the 1,000 replications for each combination.

41

Then, multivariate analysis of variance (MANOVA) was performed on the bias
and MSE values of the estimators in order to examine the relative performance of the
proposed methods for estimating the strength of the true relationship between the two
underlying constructs. The simulation features used as factors in the MANOVAS are 1)
the factor loadings of predictor and outcome variables, 2) the number of studies included
in the hypothetical meta-analysis, and 3) the number of missing sample correlation

coefﬁcients.

4.3. Simulation Results

4.3.1. Estimators

Figure 4.1 and Figure 4.2 in the Appendix C display the distributions of the
estimators (i.e., using a sample-size weighted average (BSD and a z-transformed variance
weighted average (ES2)). They are summarized for two 7 values of 0 and .5. Figure 4.1
shows that both E81 and ES2 are normally distributed with the mean of 0 when 7 value is
set to 0. And, Figure 4.2 displays that the strength of the relationship between two

underlying constructs is underestimated with y value of .5.

4.3.2. Bias and MSE of Estimators
Table 4.1 in the Appendix C presents the average bias and MSE values of ESl

and E82 that represent the strength of the relationship between the two underlying

constructs (7 ). They are also summarized for two 7 values of 0 and .5.
The biases and MSES of the estimators when 7: 0 are .0001 and .008,

respectively. This indicates that the strength of the relationship between the two

underlying constructs seems correctly and accurately estimated. No noticeable
differences are found between the two estimators based on sample-size weighted average
rs (E81) and that computed from z-transforrned variance-weighted rs (ES2) in terms of
their bias and MSE values.

When yiS equal to .5, the bias values of E81 and ES2 are .008 and .023 and the
MSE values of ESl and ES2 are .008 and .009, respectively. This Shows that the
proposed approach slightly overestimates the true relationship between two underlying

constructs for y = .5, and does so with less accuracy. Similar to the case with y = 0, no

noticeable differences between ESl and ES2 are found in terms of the MSE. However,
the bias value of ES2 is bigger than that of BS 1.

Table 4.2 and Table 4.3 in the Appendix C Show the average bias and MSE values
of ESl and ES2 according to three factors manipulated in this simulation — the
reliabilities and factor loadings of indicators, the number of missing rs, and the number of
studies (k). Table 4.2 shows that when 7 = 0 the bias and MSE values of both estimators
are not affected by the three factors used in the simulation.

In Table 4.3, however, when y = .5, the bias and MSE values of the estimators
depend on some factors (i.e., reliabilities and factor loadings of indicators and the number
of studies included in the meta—analysis). In particular, as fewer studies are included or
indicators with smaller reliabilities and factor loadings are included in the meta-analysis,

the bias and MSE values of estimators get bigger. AS is true for the estimators when 7 is

set to 0, no noticeable differences between ES] and ES2 are found in terms of their bias

and MSE.

43

The quality of this Simulation is evaluated by comparing the average biases shown
in Table 4.2 and Table 4.3 to those presented in Field (2001). Field (2001) reports mean
effect sizes using two well-known methods of synthesizing correlation coefﬁcients. They
are 1) Hedges and Olkin (1985) or Rosenthal and Rubin (1991) (i.e., ES2 in this
dissertation) and 2) Hunter and Schmidt (1990) (i.e., ES] in this dissertation). He reports
results for different average sample sizes, different numbers of studies in the meta-
analysis, and different levels of population effect size for the homogeneous case (i.e.,
Table 1 p. 170 in Field (2001)) and the heterogeneous case (i.e., Table 4 p. 174 in Field
(2001)). See Field (2001) for the simulation design in more detail.

Table 4.4 in the Appendix C displays the mean bias of r obtained from the
simulation conducted by Field (2001). As shown in Table 4.4, when the population
correlation between two underlying constructs is set to O, the average mean bias obtained
in this simulation is Similar to those obtained by Field. For example, the mean biases of
nearly 0 in this simulation With 7 of 0 for both E81 and ES2 based on 9 and 36 studies
with 30 sample size (see Table 4.2) are equal to O in Field’s results. No bias is found
regardless of the values of the factor loadings and reliability of indicators and the number
of missing correlations.

When the population correlation (7) is set to .5, the mean bias values of both ES]
and ES2 without any missing rs (see Table 4.3) are similar to the values reported in Field
(2001 ). However, the mean bias in this Simulation with missing rs gets larger as the
number of missing rs increases. With 3 missing rs, the mean bias of ES2 is closer to the

mean bias for heterogeneous case shown in Field (2001).

44

4.3.2. Factors Affecting Estimators of the Strength of Relationship Between
Two Underlying Constructs

The two indicators of the quality of the estimated strength of the relationship
between two true constructs, Bias and MSE, are evaluated in relation to the following
factors: 1) the number of studies (k), 2) the number of missing rs, and 3) the reliabilities
(factor loadings) of the indicators. The MANOVAS examine the effects of these
characteristics on the bias and MSE of two estimates.

Factors affecting bias and MSE of estimators when 7 = 0. Table 4.5 and Table

4.6 in the Appendix C display results from MANOVAS for the bias of the estimators,

when 7 is equal to 0. As shown in Table 4.5, statistically signiﬁcant differences are

found in the bias values of the overall estimates across the levels of all factors. In
particular, the signiﬁcant Wilks’ Lambdas indicate that the MSES of estimators differ
depending on the number of studies (k), the number of missing rs, the reliabilities and the
factor loadings of the indicators. In addition, the univariate Analysis of Variance
(ANOVA) for all factors shown in Table 4.6 in the Appendix C indicates negligible
impact of these factors on the bias of the effect-size estimators. Also, the partial Eta-
squares for all the study features manipulated in this simulation equal zero.

Table 4.7 in the Appendix C indicates that the three factors signiﬁcantly affect the
MSES of the both estimators with 7 = 0. In particular, the signiﬁcant Wilks’ Lambdas
indicate that the MSES of estimators differ depending on the number of studies (k), the
number of missing rs, the reliabilities and the factor loadings of the indicators. In addition,
the univariate Analysis of Variance (ANOVA) for all factors displayed in Table 4.8 in the

Appendix C shows a statistically signiﬁcant impact of these factors on the MSES of the

effect—size estimators. The factor loadings of the indicators have the largest effect on the
MSES of the estimators with an Eta-square of .68. Also, the Eta-square of .53 for the
number of studies included in the meta-analysis implies that this factor has a medium
impact on their MSES. Finally, the impact of the number of missing rs on the MSES is
negligible (e.g., the 772 of the number of missing rs is .07).

Factors affecting bias and MSE of estimators when 7 = .5. Table 4.9 in the
Appendix C displays results from MANOVAS when y is set to .5. Table 4.9 and Table
4.11 in the Appendix C Show that the bias and MSES of estimators differ depending on
all factors used in this simulation. In particular, the signiﬁcant Wilk’s Lambdas suggest
that the bias and MSES of both E81 and E82 signiﬁcantly differ depending on the
number of studies (k), the number of missing rs, and the factor loadings and reliabilities
of indicators. In addition, the univariate Analysis of Variance (ANOVA) presented in
Table 4.10 in the Appendix C indicates that the reliabilities of indicators have the largest
effect on the biases of estimators based on an Eta-square of .71.

Also, the univariate Analysis of Variance (ANOVA) for all factors displayed in
Table 4.12 in the Appendix C indicates the statistically signiﬁcant impacts of these
factors on the MSES of the effect-size estimators. While the number of missing variables
has the smallest effect on the estimators’ MSES (772 = .12), the factor loadings and
reliabilities of indicators and the number of studies included in the meta-analysis have
medium effects on the estimators’ MSES with Eta-square values of .62 and .40,

respectively.

Below, the inﬂuence 0f each study factor on the bias and MSE values of both

estimators is discussed.

46

True relationship between 5 and 77. In Figure 4.3 and Figure 4.4 in the Appendix

C, the average biases values of two estimates are displayed according to the true

population relationship betweenthe two underlying constructs (7 ). In particular, the bias
values of estimators with 7 = 0 differ from those with 7 = .5, indicating that when 7 = 0

the proposed model correctly estimates the strength of the relationship between two
underlying constructs (i.e., Bias is nearly 0 with 7 = 0). However, no differences on the
MSE values of estimators are found according to the true population relationship between
the two underlying constructs (7 ).

Reliabilities of indicators. Figure 4.5 in the Appendix C displays the mean biases
of two estimators according to the reliabilities of indicators when 7 is set to 0.
Regardless of which set of the reliabilities of the indicators is used, the mean biases of the
estimators are not far off from O and they do not noticeably differ from one another.

As shown in Figure 4.6 in the Appendix C, both E81 and E82 have higher bias
values when the reliabilities of three indicators vary (i.e., .2, .5 , and .9). However, no
obvious differences are observed among the mean bias values in terms of the reliabilities
of indicators.

Figure 4.7 and Figure 4.8 in the Appendix C compare the MSES for different
magnitudes of the indicators’ reliabilities with 7 of O and .5, respectively. Although no
signiﬁcant differences in the mean MSES are found, the estimators based on indicators
with the reliabilities of .2 have bigger mean square errors. This indicates that the accuracy
of estimating the strength of the relationship between two underlying constructs is lower

when rs arise from indicators with lower reliabilities.

47

Factor loadings of indicators. As shown in Figure 4.9 in the Appendix C, when 7

is set to 0, mean bias values of both E81 and E82 almost equal 0.

As displayed in Figure 4.10 in the Appendix C, when rs based on the indicators
with different factor loadings (i.e., .45, .71, and .95) are combined, the estimators have
the highest mean bias values of estimators (i.e., .07).

Figure 4.1 l in the Appendix C compares the MSES of the estimators according to

the factor loadings of indicators with 7 = 0. Figure 4.10 shows that the MSE values of

the estimators are highest when the factor loadings of indicators are .45. The MSES are
nearly zero when rs are based on factor loadings greater than .70, implying that the
strength of relationship between two underlying constructs is accurately estimated when
indicators with high factor loadings are combined.

Figure 4.12 in the Appendix C displays the MSES of estimators according to the

factor loadings of indicators with 7 of .5 . With 7 of .5, the MSE values of indicators are

nearly zero when rs from the indicators with the factor loadings of .95 are combined,
while they get bigger when indicators with factor loadings of .45 are combined.

Number of studies (k). As shown in Figure 4.13 and Figure 4.14 in the Appendix
C, there is no signiﬁcant relationship between the number of studies included in the meta-
analysis and the bias values of estimators.

However, k is negatively related to the MSES of estimators, which is shown in
Figure 4.15 and Figure 4.16 in the Appendix C. For example, the mean MSES are bigger
with k of 9, while they get smaller with k of 36. This makes sense since less information

is available with fewer studies.

48

Number of missing rs. Figure 4.17 and Figure 4.18 in the Appendix C compare
the biases of estimators depending on the number of missing zero-order correlations.
Figure 4.17 shows the mean biases with 7 of 0 do not noticeably differ depending on
how many correlation coefﬁcients are missing.

On the other hand, when 7 is set to .5, the mean biases slightly increases when
the number of missing rs increases, but they are close to 0. In particular, the mean bias
values of estimators are nearly zero without any missing rs, while it is approximately .04

off from zero with only three rs (i.e., rx1 yl ,rxzyz ,rx3 y3 ) included. However, regardless

of the number of missing rs, the mean bias values are not far off from 0.

As Shown in Figure 4.19 in the Appendix C, no relationship between the MSES
and the number of missing rs is found when 7 is set to 0. However, when 7 is set to .5,
there is a slightly positive relationship between MSES and the number of missing rs
increases (see Figure 4.20 in the Appendix C).

Quality of missing rs. Lastly, the effect of which rs are included on the
performance of the proposed approach is examined. This is accomplished by looking at
the bias and MSE values of estimators when correlation coefﬁcients from indicators with
different reliabilities are included.

Table 4.13 in the Appendix C shows the correlation matrix of the six indicators in
terms of their reliabilities. At least three zero-order correlations (i.e.,

rx”,1 ’rxzyz ’rx3y3 ) on the diagonal (i.e., shaded in Table 4.12 in the Appendix C)

Should be always included. The quality of rs is determined by the reliability of each
indicator. For instance, the quality of rxlﬂ equals to that of ’31)"! because they are based

on two indicators with high reliability (i.e., .9) and medium reliability (i.e., .5).

49

For a Simple demonstration here, I only present how the strength of the
relationship between two underlying constructs is estimated by not including each one of
six correlations, which are not shaded in Table 4.13 in the Appendix C.

Table 4.14 in the Appendix C shows the results from the analysis of variance
(ANOVA) on the effect of which r is not included in the meta-analysis on the bias value
of E81. Since there is no signiﬁcant difference between E81 and E82 in terms of their
bias and MSE, I here present an ANOVA result for E81.

As shown in Table 4.14, which r is not included in meta—analysis does not have an
signiﬁcant impact on the bias of the E81 with 7 of 0 (F5, 24000) = 2.070, p = .07).
However, ANOVA result indicates that the bias of estimators depend on the quality of
missing r when7 is equal to .5 (F6, 24000) = 12.60, p < .05, 772 = .003 ).

Figure 4.20 in the Appendix C compares the bias values of E81 depending on
which correlation coefﬁcient is included in a meta-analysis, in addition to three zero-

order correlations (i.e., rx1 J’I , rxzyz , rx3 y3 ) on the diagonal (i.e., three Shaded areas in
Table 4.12). As shown in Figure 4.20, E81 includes correlation coefﬁcients from
indicators with high (i.e., .9) and medium (i.e. .5) reliabilities (i.e., rx1 y2 and rxzyl )
and it has smallest bias (i.e., |-.25|). However, E81 including correlations from indicators

with medium (i.e., .5) and low (i.e. .2) reliabilities (i.e., rx2y3 and rX3y1 ) has the biggest

bias (i.e., |-.27|).
In addition, statistically Signiﬁcant results from pairwise comparisons shown in
Table 4.15 in the Appendix C indicate that the bias of E81 differ according to the quality

of r. For example, the bias of E81 including correlations from indicators with high

50

(i.e., .9) and medium (i.e. .5) reliabilities (i.e., r1.1 v") and ".12 V1 ) is signiﬁcantly

different from that based on correlations from indicators with medium (i.e., .5) and low

(i.e. .2) reliabilities (i.e., 5(2),3 and 5.3).] ).

4.4. Conclusions

Findings from this simulation in which the performance of the proposed approach
shown in Equation 3.13 for estimating the strength of the relationship between the two
underlying constructs is evaluated in relation to the different factors in this simulation are
as follows:

First, the average bias and MSE values of the estimators are approximately zero
when the true relationship between the two underlying constructs is set to 0 and .5,.
Overall, no distinguishable difference between the bias and MSE values of E81 and E82
is found. The mean bias values of E81 and E82 from this simulation are comparable to
those presented by Field (2001) with 7 of 0 and .5.

Second, the bias values of the estimators differ according to the number of studies,
the number of the missing rs, the factor loadings and the reliabilities of indicators with 7
of both 0 and .5. Among them, the factor loadings and the reliabilities of indicators have
the biggest effect on the bias of the estimators, while the number of the missing rs has the
smallest effect on the bias of the estimators

Third, the MSE values of the estimators differ according to the number of studies,
the number of missing rs, the factor loadings and the reliabilities of indicators. This is

found in the MSE of the estimator with 7 of both 0 and .5. The factor loadings and

reliabilities of indicators have the largest effect on the MSE values of the estimators. The

51

number of studies included in the meta-analysis has the next largest effect on the MSES
of estimators.

F ifth, when no rs are missing, the mean bias and MSE values of the estimators
become nearly zero with 7 of 0 and .5, indicating that the strength of the relationship
between the two underlying constructs is correctly estimated using the proposed approach.
When all 6 rs are not included and 7 = .5, the mean bias values of estimators are
about .03. In addition, under these same conditions, the mean MSE values of estimators
are about .01.

Lastly, a statistically Si gniﬁcant effect on the bias of which correlation is included

is found for 7 of .5 . This indicates that the reliabilities of the missing rs have a

Si grriﬁcant inﬂuence on the accuracy of the estimators.

CHAPTER 5
APPLICATION

The application of the proposed method is demonstrated by re—analyzing a subset
of studies reviewed by Ahn and Choi (2004). In their meta-analysis, Ahn and Choi
investigated the relationship between teachers’ subject matter knowledge (SMK) and
student learning (8L) in mathematics. In order to deal with considerable variation in
measures, Ahn and Choi ﬁrst categorized the included studies in terms of how teachers’
SMK was measured (e. g., teachers’ test scores and teachers’ coursework in mathematics).
Then they conducted a series of univariate analyses, one for each subgroup of studies.

However, their meta—analysis, based on a univariate method, is limited in its
ability to portray the overall picture of the relationship between what teachers know and
how much students learn in mathematics. For instance, one of the conclusions that Ahn
and Choi drew in their meta-analysis was that the relationship between teachers’ subject
matter knowledge and student learning is stronger when teachers ’ SMK is measured by
teacher test scores. Such a ﬁnding can be limited if I want to make an inference about the
overall picture of the relationship between two underlying constructs - teachers’
knowledge and student achievement in mathematics.

Therefore, 1 demonstrate here how to use my proposed method to assess the
strength of the relationship between teachers’ knowledge in math and students’
achievement, each of which is measured differently across studies. I expect that the
proposed method will deal with the sparse data structure that mostly comes from

variation in measures and further provide an assessment of the strength of the relationship

53

between the two underlying constructs (i.e., teachers’ knowledge and student

achievement in mathematics).

5.1. Study Description

The main purpose of this meta—analysis is to understand the relationship between
teachers’ subject matter knowledge (SMK) and student learning (8L) in mathematics.
Studies included in this research synthesis were drawn from a larger literature database
gathered as part of the Teacher Qualiﬁcations and the Quality of Teaching (TQ-QT)
study. The TQ-QT project aims to synthesize studies investigating the relationship
between indicator(s) of teacher qualiﬁcations (TO) and the quality of teaching (QT),
which have been conducted in the United States since 1960 (Wu, Becker, & Kennedy,
2002). As of winter 2005, the data base included about 480 studies. More details about
search criteria and selected studies in the TQ-QT study can be found at
http://www.msu.edu/~mkennedv/TOOT.

From the 480 studies in the TQ-QT database, 27 studies including 18 dissertations,
4 journal articles, 1 conference paper, and 4 reports were included in Ahn and Choi’s
meta-analysis. Details about inclusion rules can be found in Ahn and Choi (2004).
Among 27 studies, only 8 studies based on 6th grade-level students are used in the
demonstration of the proposed method. These 8 studies are a relatively homogenous
group in terms of statistical analysis, grade level, whether reliability is reported, and
content domain (i.e., arithmetic) of students’ mathematical knowledge. For instance, all 8
studies provide correlation coefﬁcients between teachers’ subject matter knowledge as

measured by tests and student learning.

However, these 8 studies vary in terms of the measures used to represent teachers’
and students’ knowledge; by test type — researcher-made local, researcher-made large-
scale, or commercial; by whether the gain score metric is utilized in analyzing the student
achievement test and in terms of the unit of analysis and the time interval between pre-
and post-test in year. For instance, 7 studies used commercial student measures (i.e., CAT,
SAT, SRA, CTBS, and ITBS), while 1 study used a researcher-made large-scale
assessment of student learning. In addition, all except two studies (i.e., Turgoose, 1996
and Lampela, 1966) provide correlation coefﬁcients based on gain scores. However,
some (such as Caezza, 1969) are based on 2 year gains, while others (e.g., Cox, 1970) are
based on 1 year gain scores. There are also differences in the unit of analysis across the 8
studies (i.e., student level is used in Caezza, 1969 and Cox, 1970 vs. classroom level in
Bassham, 1962; Koch, 1972; Lampela, 1966; Moore, 1964; Prekeges, 1973, and
Turgoose, 1996).

Let us closely look at the various measures that are used to represent both teachers
and students’ knowledge in mathematics in these 8 studies displayed in Table 5.1 in the
Appendix C. Two studies (i.e., Bassham, 1962; Moore, 1964) used the same test (i.e.,
Glennon test of basic mathematical understanding) as a measure of teacher’s subject
matter knowledge. Bassham (1962) and Lampela (1966) used California Achievement
Test (CAT) and Cox (1970) and Moore (1964) used the SRA Achievement Series as a
measure of student’s knowledge. However, no pair of studies provide correlation
coefﬁcients based on the same measures of teachers’ and students’ knowledge. Figure 5.1

in the Appendix C shows the empirically driven population model.

55

5.2. Method

Since the proposed approach in this research makes use of correlation coefﬁcients
among the predictor and outcome variables, correlation coefﬁcients that estimate the
strength of the relationship between teachers’ subject matter knowledge and student
achievement in mathematics are extracted from the 8 studies. Table 5.2 in the Appendix
C displays the study characteristics of the 8 studies in more detail.

Four studies (i.e., Caezza, 1969; Cox, 1970; Moore, 1964; Prekeges, 1970)
provide more than one correlation coefﬁcient, which are from subtests of their student
achievement tests. For example, Caezza provided three correlation coefﬁcients from three
subtests of the Stanford Achievement Test (SAT), which are SAT: Concept, SAT:
Computation, and SAT: Application. When several correlation coefﬁcients are provided
for the same sample, the following two rules are applied to extract one independent study
effect.

First, if available, a correlation coefﬁcient from the total score is used. For
instance, a correlation coefﬁcient of .16 based on a total score on the Achievement Series
(SRA) is obtained from Cox (1970). Second, if a study provides several correlation
coefﬁcients from subtests of the student achievement test, the average correlation
coefﬁcient is obtained. This rule was applied to three studies (Caezza, 1969; Moore,
1964, and Prekeges, 1970).

Among the 8 studies, only one study (i.e., Turgoose, 1996) provides the validity
coefﬁcients for the variables. Turgoose (1996) reports the concurrent validity for the
Tests of Achievement and Proﬁciency (TAP) ranged from .69 to .79. Seven studies (i.e.,

Caezza, 1969; Cox, 1970; Lampela, 1966; Moore, 1964; Prekeges, 1973; Turgoose,

1996) present reliability information related to teachers’ subject matter knowledge
measures. And four studies (i.e., Caezza, 1969; Moore, 1964; Prekeges, 1973; Turgoose,
1996) provide the reliability of student learning measures in mathematics. Different types
of reliability information are reported for the indicators, including Cronbach’s alpha, test-
retest, and KR-20 reliabilities.

For studies that do not report the reliability of indicators (e. g., Bassham, 1962),
whenever possible reliability is obtained from other studies that use the same measure.
For example, the reliability for the Glennon test of mathematical understanding (an
indicator of teachers’ math knowledge) is obtained from Moore (1964). Similarly, the
reliability of the Achievement Series (SRA) test used in Cox (1970) is obtained from
Darakjian and Michele (1982). Therefore, the reported reliabilities of .51 to .67 provided
by Darakjian and Michele (1982) are used for the reliability of the SRA used in Cox
(1970)

The speciﬁc procedure to estimate the relationship between teachers’ SMK and
student learning is as follows. First, the average population correlation coefﬁcients
between the two sets of observed indicators are computed by averaging sample—size
weighted correlation coefﬁcients and averaging z-transformed variance-weighted
correlation coefﬁcients. Second, the validity information for the indicators is extracted
from three sources of available information. 1) The validity of indicators that are
provided in studies or borrowed from other studies (i.e., Cox, 1980; Moore, 1964;
Turgoose, 1996) is directly used, 2) If the reliability information is available, the validity
is extracted using Equation 3.27 and Equation 3.28, and 3) When the reliability or the

validity of indicators is not available from the individual study (e. g., the California

57

Achievement Test (CAT) used in Bassham (1962) and Lampela (1966)), the validity of
the indicator is obtained from expert judgments. The detailed procedure for gathering the
expert judgments about the validity of indicators is later discussed in more detail. Lastly,
the strength of the relationship between SMK and students’ learning is computed using

Equation 3.13.

5.3. Expert judgments

When no information about the reliabilities and the validities of indicators is
available, validity information of indicators is obtained from content experts in the
domain. Here, I deﬁne content expert as a person who is fairly familiar with research on
teachers’ subject matter knowledge with mathematical teaching experience. Based on this
deﬁnition of content expert, ﬁve graduate students focusing on mathematics education,
from the Department of Counseling, Educational Psychology and Special Education and
the Department of Teacher Education at Michigan State University, provided their
judgments on the validity information of indicators used in 8 studies.

Those ﬁve content experts approximated the validities of indicators based on the
protocol Shown in Appendix A. The protocol was developed to help content experts make
better judgments about the indicators’ validities. Based on the concept of concurrent
validity, experts were ﬁrst asked to compare all measures that are present in the
population model (see Figure 5.1) in terms of how closely each indicator represents what
constitutes teachers’ subject matter knowledge in mathematics. In the process, either the
actual test items (for the Glennon test of basic mathematical understanding) or

descriptions of the assessments (i.e., CAT, SAT, and ITBS) were provided.

58

Second, content experts are asked to rate how well each indicator measures the
conceptual dimension of teachers’ subject matter knowledge in mathematics. The
conceptual dimension of teachers’ subject matter knowledge is derived by integrating
various researchers’ deﬁnition of teachers’ knowledge in mathematics. Many researchers
(e.g., Ball, 1990a; Ball, 1990b; Hill et al., 2004; Shulman, 1986) have different but
similar conceptual deﬁnitions of teachers’ knowledge in mathematics. Therefore, I have
adopted a working deﬁnition based on the number and operations standards for grade 6-8
that are suggested by the National Council of Teachers of Mathematics (N CTM)’.

After rating how closely each indicator represents three dimensions of teachers’
subject matter knowledge, experts are asked to rank order the indicators from the one
most likely to measure teachers’ subject matter knowledge to the least likely measure.
Then, they are ﬁnally asked to provide the approximate value of a factor loading, which
is essentially a correlation coefﬁcient between what is measured in each indicator and
what constitutes teachers’ subject matter knowledge in arithmetic. It took an average of 1
hour for each content expert to produce their judgments on the validity of indicators.

Since I need the validity of CAT used by Bassham (1962) and Lampela (1966), I
took the average of the validity values provided by ﬁve content experts and used it as an
approximate factor loading of CAT. The validity of CAT obtained from the ratings of the

ﬁve content experts was .329.

 

4 Number and operations standards for grade 6-8 that are suggested by NCTM are available at
http://standards.nctrn.org/document/chapter6lnumb.htm.

59

5.4. Results

The estimated strength of the relationship between teachers’ subject matter
knowledge and student achievement using Equation 3.13 was .0005 computed using the
sample-size weighted average correlation coefﬁcient (E81), and .0006 using 2-
transforrned weighted correlation coefﬁcients (E82). These nearly zero estimates based
on the pr0posed method-of—moments estimator given in Equation 3.13 indicate that there
is no relationship between how much teachers know and 6th grade student learning in
mathematics.

E81 = .0005 and E82 = .0006 were also compared to the weighted mean
correlation corrected for artifacts (i.e., reliabilities and construct validity of indicators)
proposed by Hunter and Schmidt (1990, 1994) and to the z-transformed variance-
weighted mean correlation proposed by Shadish and Haddock (1994). The weighted
mean correlation corrected for artifacts is .007 and the z—transformed variance-weighted
mean correlation is .008, all indicating that there is no relationship between how much
teachers know and 6th grade student learning in mathematics. Although the same
inference is drawn, the estimated strength of the relationship between teachers’ subject
matter knowledge ands student achievement using Equation 3.13 is much smaller than
the estimators based on the methods proposed by Hunter and Schmidt (1990, 1994) and
Shadish and Haddock (1994).

However, a nearly zero correlation between teachers’ subject matter knowledge
and student achievement should not be over-interpreted. As shown in the previous section,
the method-of-moments estimators based on Equation 3.13 might be affected by several

factors in the population model. Above all, none of the 8 studies provided all possible

60

pairs of correlation coefﬁcients that are present in Figure 5.1 in the Appendix C. Since
the effect of missing rs on the bias and MSES of estimators is large, the estimators
computed in the presence of missing correlation coefﬁcients might be biased.

In addition, the estimated validity coefﬁcients of measures might be incorrect.
First of all, 8 studies reported the reliabilities of measures based on different methods
such as test-retest and Cronbach alpha. No acceptable methods exist to put the different
types of reliabilities of measures in a common scale (though all range between 0 and 1).
Therefore, the obtained validities could be either over-estimates or under-estimates,
because each type of coefﬁcient may tap different sources of error.

In fact, if the true population correlation coefﬁcient is 0 and thus study factors do
not affect the bias values of estimators as found in the simulation, the estimators based on
Equation 3.13 might not be far off. However, there is no certainty that the true correlation
between teachers’ subject matter knowledge and student achievement is 0, thus it would
be unwise to ignore the effect of study factors on computing the strength of the

relationship between two constructs.

61

CHAPT ER 6
PRACTICAL CONSIDERATIONS

The proposed method provides an innovative way to deal with one of the
challenges in research synthesis that comes from variations in measures. By using the
proposed approach, a reviewer can combine the sparse data that arise from the large
variations in measures. Thus, the strength of the relationship between the two underlying
constructs can be estimated. However, the method’s application in practice raises several
methodological questions. In this section, some practical considerations of the proposed
approach are discussed, followed by an outline of potential future research to examine
those limitations.

First, the most critical issue in the proposed approach is that some components
needed to compute the index of the relationship between the two underlying constructs
given in Equation 3.13 are not often reported in primary studies. In particular, researchers
virtually never report the factor loadings (aka. the validity coefﬁcients) of variables
employed in the studies. Even though three alternative methods (i.e., use of correlation
matrices, use of expert judgments, and use of reliability information) are suggested for
obtaining validity information, these all might introduce errors in estimating the true
relationship between two underlying constructs. One potential solution is to use other
available sources to obtain information for estimating the factor loadings of variables. For
instance, reliability information used for obtaining the factor loadings of variables can be
acquired from test manuals, other similar studies using the same variables, or personal

contact with study authors.

Second, the other issue in using the proposed approach is related to the missing
data. As described above, the studies included in the meta-analysis do not always report
all of the relationships or paths included in the population model. AS discussed in the
simulation, studies often also do not use exactly parallel models, instead using different
predictors and outcome variables. Results from the simulation Show that the number of
missing rs and the quality of the rs have statistically Signiﬁcant impacts on the bias and
MSE of estimators. Therefore, more attention should be paid to developing an approach
that can deal with missing data in research synthesis.

Third, the proposed approach presumes that the included studies are based on the
same population, which indicates the ﬁxed-effect model. Without this assumption, study-
speciﬁc effects (i.e., correlation coefﬁcients) should not be combined. For instance, if
there are signiﬁcant differences in the correlations among teachers’ subject matter
knowledge and student achievement in terms of grade level (See Ahn & Choi, 2004),
these correlations should not be pooled. In fact, the test for the homogeneity of the
correlation matrices (Becker & Schram, 1994) can be used to conﬁrm this assumption.

Fourth, the proposed method assumes that the theoretically or empirically driven
model is correctly speciﬁed. In fact, the development of a population model that is as
comprehensive as possible is required. However, it is highly probable that the empirically
or theoretically driven model may be misspeciﬁed. For example, a meta-analyst may
derive a one-factor model even though the underlying population model is really a two-
factor model. Therefore, understanding the robustness of the proposed model can be

helpful for the application of this model, so I here investigate how well the proposed

63

model estimates the strength of the relationship between two underlying constructs when
it is based on a misspeciﬁed population model.
Let us suppose that the observed scores of indicators include one additional

component, which is the Speciﬁc variance (51') introduced in Equation 3.23. AS discussed

in section 3.5.1 , the speciﬁc variance is unrelated to the underlying constructs or to the
measurement errors. For instance, if teachers’ test scores that represent how much they
know in a subject depend on teachers’ age, teachers’ age introduces variation in their test
scores that is not related to the underlying construct of teachers’ subject matter

knowledge or to measurement error. When the speciﬁc variance (3,) of an indicator is

nonzero, its factor loading can be smaller than its reliability (see the relationship between
factor loadings and reliabilities described in Equation 3.25).

In order to examine the robustness of the proposed approach when speciﬁc
variances of indicators are introduced, the bias and MSE values of the estimators are
compared for different values of the Speciﬁc variances of the indicators (i.e., SV =
0, .15, .45, or .85). As shown in Figure 6.1 through Figure 6.4 in the Appendix C, the bias
and MSE values of the estimators are compared under the following four conditions
depending on the values of the speciﬁc variances: 1) All six indicators have zero speciﬁc
variances (i.e., 8V(x, or y,) = SV2(x2 or y;) = SV3(x3 or y3) = O), 2) All six indicators
have speciﬁc variances of.15 (i.e., SV(x, oryl) = SV2(x2 oryg) = SV3(x3 or y3) = .15),
3) All six indicators have speciﬁc variances of .45 (i.e., SV(x1 or y,) = 8V2(x2 or y2) =
8V3(x3 or y 3) = .45), and 4) All six indicators have speciﬁc variances of .85 (i.e., SV(x1

or y,) = 8V2(x3 or y;) = SV3(x3 or y 3) = .85), and 5) The three xs and three ys have

64

different speciﬁc variances (i.e., SV(x, or y,) = .85, 8V (x; or y;) = .45, 8V (X3 or y;)
= .15).

Figure 6.1 and Figure 6.2 in the Appendix C Show that when the Speciﬁc
variances of the six indicators are zero, the bias values of both estimators are the smallest
regardless of the values of the index of the true relationship between two underlying

constructs (7 ). In fact, regardless of 7 , the biases of estimators are nearly zero.

However, the effect on the MSES of estimators of having nonzero speciﬁc
variances in the indicators seems to be greater and more obvious. As Shown in Figure 6.3
and Figure 6.4 in the Appendix C, the MSES of estimators based on indicators with
nonzero speciﬁc variances are bigger by .2 or more, compared to MSES of estimator
using indicators with zero speciﬁc variance. This pattern is shown regardless of the
population values of the true relationship between two underlying constructs (7 ). This
indicates that the strength of the relationship is accurately estimated when combining
correlations that are generated from indicators without speciﬁc variance introduced.
Therefore, it should be fully understood that the true relationship between two constructs
might be underestimated when combining correlations using indicators with speciﬁc

variances.

CHAPTER 7
DISCUSSION

With the recent movement toward evidence-based policy and practice in
education (Whitehurst, 2002), a growing interest has been devoted to meta-analytic
techniques as a means of providing rigorous educational evidence (Slavin, 2008). As
attractive and useful as these may seem in providing critical determinants for policy
decisions, the application of existing methodology in research synthesis faces numerous
difﬁculties and limitations due to the inherent nature of research in education and social
science.

A particular problem in the social sciences and education is that studies employ
diverse measures, even though researchers intend these to represent the same underlying
constructs. Due to the use of diverse measures in primary studies, meta-analysts
encounter three important problems. One is the variety of measures that are used to
measure the same underlying construct. Another is the variety of statistical methods that
are used to estimate the relationship between the two constructs. Third, there are not
many pairs of the studies that use the same combinations of measures or statistical
approaches, which leads to call as a sparse data structure. There may be many studies but
no easy way to form a collection of studies that use the same measures or the same
statistical methods to generate their estimates of effect sizes.

Therefore, I have proposed a new method for quantifying the strength of the
relationship between two constructs that are measured in many different ways across
studies. In this research, I have developed an approach that can handle the very sparse

data structure that arises from variations in measures and statistical techniques. I also

66

want an approach that recognizes the fact that, even though a set of measurements can be
quite different in their characteristics, they are all attempting to measure the same
underlying construct.

One advantage of using the proposed approach given in Equation 3.13 is that it
can estimate the strength of the relationship between two underlying constructs that are
measured in different ways. By using the proposed method, variation in measures, which
may lead to considerable heterogeneity across studies, is taken into account. Thus, the
proposed method can provide more precise estimates by combining the corrected study
effects. Contrary to Hunter and Schmidt’s approach, the proposed method also suggests
practical ways to adjust measure differences (i.e., how to obtain the validities of the
indicators) that are derived from the population model. In addition,

In spite of the advantages of using the proposed method mentioned above, further
research is required to resolve some practical issues. First of all, although the approach
focuses on a simple bivariate relationship between two underlying constructs, each of
which is measured using various indicators, it is certainly plausible to expand the
proposed method for application to more complicated models. For instance, the partial
correlation between two constructs controlling for a 3rd construct can be obtained from
pooled correlations of control variables with both outcome and predictor variables.

As demonstrated in the simulation, the proposed approach correctly estimates the
desired population parameters (i.e., the true relationship between two underlying
constructs) if no missing rs exist. However, the bias and MSE values of the estimators get
bigger as the number of missing rs is slightly increased. Since estimators differ

depending on the number of missing rs, further investigations are needed to deal with the

67

missing rs. AS discussed in the previous sections, one potential solution would be to
impute missing information based on the available information. However, more attention
should be paid to developing an approach that can deal with the missing data, which
might be missing by “design”.

One potential study would be to look into the robustness of the proposed model
when the population model is not correctly speciﬁed. Knowing how robust the proposed
approach is would deﬁnitely offer useful insights for applying the proposed approach in
practice. Therefore, the generalizability of the proposed approach should be investigated
under different scenarios. For example, the performance of the proposed method based on
a one-factor model can be examined, although the true population model is a two-factor
model.

In addition, the proposed approach has been examined for the case in which all
indicators are assumed to be continuous variables. Future research could also examine

how to combine relations involving both continuous and categorical variables.

68

APPENDIX A:

Protocol for Obtaining Expert Judgments

69

Expert Judgments on Measure Validity
: How teacher knowledge and student learning in math have been measured?

 

 

Many researchers have long been interested in the effect of how much teachers know in a subject
they taught on improving student learning. However, studies and reviews have shown mixed
ﬁndings for the relationship between teachers’ subject matter knowledge and student learning (Ahn
8 Choi, 2005; Darling-Hammond, 2000; Wilson, Floden, 8 Ferrini-Mundy, 2001). This research
posits that variation in measures to represent both teachers’ subject matter knowledge and student
learning might lead inconsistent ﬁndings. For instance, in literature, different indicators have been
used to represent both teachers’ subject matter knowledge and student learning in mathematics;
some researchers use the indicator, number of teachers’ courses in math as the best
representation of teachers’ knowledge (the construct) and others use teachers’ test scores (as
indicator) for measuring teachers’ subject matter knowledge (the construct). Therefore, this
assessment aims to obtain your judgments about how the indicators used in 8 primary studies
represent their corresponding constructs (i.e., teachers' subject matter knowledge and student
learning in arithmetic). These 8 studies focus on the relationship between teachers’ knowledge and
5m grade students’ test scores in arithmetic. Please read the following instruction and provide your

judgments on the attached sheet.

 

[Instruction]

You as an independent rater are expected to work individually. First, you should be familiar
with the instrument(s) used in each study. You want to look at the provided information regarding to
instrument (e.g., sample item(s) of the instrument, the instrument, and method section in the study).
If needed, you can also use any accessible resources (e.g., intemet, test manual, other study using
the same instrument). For more resources, please contact Soyeon Ahn (ahnso@msu.edu or 517-
256-1891).

Second, you should evaluate how well each indicator represents each dimension of math
knowledge in arithmetic.

Finally, based on your evaluation, you want to rank order measures used in studies for
which you think would be most likely to measure each dimension of teachers’ subject matter
knowledge and students’ understanding of arithmetic knowledge. Also, you want to provide the

approximate value of validities of each indicator in terms of correlation value.

70

 

 

.EE.nEonBEEubcoEsooEEo.EFo:.mEmc:Sm\\d§_ E oﬁmzmg N6 33% SF 2353 205830 USN 23:5: EPUZ Eoc 3:850 2m mEaFESm m

 

 

”839559:
E .592 8283 :03: Bo: was FcoEmmommm some E 3532: E F23 5953 scum—88o oFmEFxoamm 2: 0235 335 .m

 

”035—322
SFFNE Be .33 .mhozomﬁ 85936 9 38F: 32: on 2:95 x55 so» £033 9 95:88 3532: F0 mm: 05 BUS VEE 835 .m

”@8388 08 59:8 m E ego—>65— FmosmEoﬁmE .Fo 8:38:80 8:: FEE; OF 02on 2: F0 2E8 E 2338 :03 8mm .F

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

a m N F m a m N F m Fa m N F m w m N F 96:282.;56.85068 Fa
0529235

v m N F m FN m N F o F‘ m N F m Fa m N F 5268262663 m.>o$._.5 m
mange/8x

v m N F m FV m N F m Fa m N F m w m N F 533262663 5228 N
mcﬁcﬁemocs ﬁgmegme

v m N F o F» m N F m w m N F m w m N F 062698232856 F

Freon Es been 3: NEE Es been E:
o o m < 8.332 9
550d

83838 059882 312: was 3625 82580 .0

5598 0:0 8 828 >05 Bo: new mcosﬁoqo .Fo $5588 953823 .m

mEomem 598:: can £385: macaw mQEmcocﬂe £385: wcccomoaou mo $83 £5285: 9835ch .<
“0:085:ro E 36230ch 3288232 ,Fo mcoFFEcoD

 

7238332 5 3323—2 auto—320— .5332 30.21% .3339:

71

 

 

.EE.5::5000000205008:0000.80.85000800000538“; E 0E0=0>0 wé 800% 00F 0000005 80500000 000 8008:: 2.002 805 008050 000 0800005 0

 

 

”005080508
8 305 8050000 £008 30: F080 808000000 £000 8 5080008 8 0023 8003009 00503800 000808800 05 053000 00005 .m

 

”03007505
00:08 000 .300 28080000 080008 00 3005 5008 00 2003 x85 00% £053 00 90500000 880008 00 mm: 05 00000 0800 00005 .N

5050008 000 80800 0 8 0305505 005080508 («0 080800800 0005 A053 00 0000000 05 F0 08000 8 080008 £000 000m

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

.F
m 0 m N F w v N F w v m N F m 0 m N F F<~Fmv m0C0m 8080>0E0< o
m 0 m N F m 0 N F m 0 m N F m 0 m N F 005080508 8 53000 m
m 0 m N F m 0 N F m 0 m N F m 0 m N _ AmmPFv :Em 0Fm0mF 000% 0.32 Ft
m 0 m N F m 0 N F m 0 m N F m 0 m N F 582:5 m
F200
m 0 m N F m 0 N F m 0 m N F m 0 m N F 000% 8080>0Eo< 800005 N
F8000
m F0 m N F m .0 N F w v m N F m .0 m N F 000% 8080>0Eo< 0800.500 F
580% 2000 099$ 20% Riga. F3: 098% 20% 0080002 OF
Q m <
8050 .D

0000850 030800000 0x08 000 3800: 080800 .0
005080 080 00 000F00 >05 30: 000 80500000 .8 358008 00008000: .m
$30320ch 03000008 m80§$ 0008:: 080 “80588 m0080 328050—00 “800800 8500000008 F0 903 “8008:: 000080003 .<

0050E5F0< 8 0000—399 F0 805850Q

 

705080502 8 00.80002 $5230.0— .$:0F05m _

APPEDIX B:

R Code for the Proposed Model

73

APPEDIX B:
R Code for the Proposed Model

#
# Author: Soyeon Ahn

# Date: 2008-02-01/ 2008-02-03: Fifth revision

# Simulation in detail: This is a simulation set-up for the dissertation. In this simulation, the k
independent studies with sample size of 30 are generated from the population model. In the
population model, the relationship between two underlying factors (ksi & eta), each of which is
measured using three indicators (three xs p=3 & three ys q=3), is of our interest. In the population,
data with 30 sample size are generated from the multivariate distribution with mean vector of 0 and
variance-covariance matrix, which is obtained from the population parameters. The zero-order
correlation coefﬁcients between xs and ys are computed for meta-analysis.

# 1000 replications per conditions.

# 0 missing case - all the included studies provide all 9 possible zero-order correlation coefﬁcients
between 3x3 and 3ys.

#

 

 

# Set-up the directory.

setwd (”Oz/Documents and Settings/Soyeon Ahn/Desktop/Dissertation/Simulation/sim__data/0
missing")

getwd()

library(MASS)

# .
# Population parameters: Reliability of x, reliability of y, and gamma. For all simulation, it is
assumed that ksi and eta are standardized with mean of 0 and variance of 1. This indicates that
phi and psi are set to 1.

# Gamma is set to .5 & 0.

# Reliabilities of xs: [.9, .9, .9]; [.5, .5, .5]; [0,5,2]; [
# Reliabilities of ys: [.9, .9, .9]; [.5, .5, .5]; [.9,.5,.2] [
# Speciﬁc variances: 0 or (Reliability - .05).

# Factor loading: sqrt(Reliability-speciﬁc variance)

# Delta=speciﬁc variance+(1-factor |oadidng"2).; Epsilon=speciﬁc variance+(1-factor loading"2).

# Assuming that we have correlation coefﬁcients for diagonal, # of missingness will be manipulated
(0/6, 1/6, 2/6, 3/6, 4/6, 5/6, 6/6). Choice of what should be unavailable will be based on random
selection.

#

 

2,2,2].
; 2,2,2].

 

# Detailed procedures.

# 1. make var-cov; 2. generate multivariate normal distribution using mean vector & var-cov
produced in 1; 3. generate correlation coefﬁcients; 4. generate random number for choosing
missingness; 5. get the estimator.

74

MV<-matrix(c(0), nr0w=1, ncol=6, byrow=TRUE) #MV is mean-vector of variables (3x3 +3ys)

# Create hypothetical meta-analysis for each condition.
# Function called "hypothetical meta" creates 1. variance-covariance matrix for generating zero-
order correlations for each study in a hypothetical meta-analysis.

hypothetical.meta<-function(GAMMA, V, Re|1,Rel2, Rel3, study){
# For creating variance-covariance matrix, we need the following information - Gamma, Phi, Psi,
Lambda_X (from reliabiiity of X3), Lambda_Y (from reliability of Ys), Theta_delta, and
Theta_epsilon.
GA<-matrix(c(GAMMA), c(1 ,1 ))

phi<-matrix(1,c(1,1))

psi<-matrix(1,c(1,1))

rel<-matrix(c(Rel1, Rel2, Rel3), c(3,1))
# Here, one additional condition is added for this simulation. If V=1, no speciﬁc variances exist on
any side of exogenous and endogenous variables.

if (V==1) {sv <-matrix(c(0,0,0), c(3,1))} else {sv<-matrix(c(Rel1-.05,Re|2-

.05,Rel3-.05), c(3,1))}

LX<-sqrt(rel-sv)

LY<-sqrt(rel-sv)

TD<-matrix(c((1-LX[1,1]“2),0,0,0,(1-LX[2,1]“2),0,0,0, (1-LX[3,1]"2)),nrow=3,
ncol=3,byrow=TRUE)

TE<-matrix(c((1-LY[1,1]"2),0,0,0,(1-LY[2,1]"2),0,0,0, (1-LY[3,1]"2)),nrow=3,
ncol=3,byrow=TRUE)
# Now, using seven parameters above, variance-covariance matrix (called sigma-XX, sigma_YY,
sigma_XY) will be established.
# Sigma_XX= LX*phi*LX'+TD; Sigma_YY=LY*psi*LY'+TE; Sigma_XY=LX*GA*LY' (Use these
equasﬁons)

Sigma_XX<-LX %*% phi %*% t(LX)+TD

Sigma_YY<-LY °/o*°/o psi °/o*% t(LY)+TE

Sigma_XY<-LX%*°/oGA°/o*%t(LY)

Sigma_YX<-LY%*°/oGA%*°/ot(LX)

Sigma<-rbind(cbind(Sigma_XX,Sigma_XY),cbind(Sigma_YX, Sigma_YY))

# Second, I'm generating multivariate normal distribution and creating correlation coefﬁcients
among all indicators.

# x1 x2 x3

# y1 [1,1] [2,1] [3,1]
# y2 [1,2] [2,2] [3,2]
# y3 [1,3] [2,3] [3,3]

# For missing 4 cases.
es.meta<-matrix(0,study*9,4)
ES<-matrix(0,1,17)

#change the 1: (1,6,15, 10, 6, 3, 1)

75

for(j in 1:1){

for (i in 1:study){

cor.data<-mvmorm(30, MV, Sigma, empirical=FALSE)
a<-cor(cor.data)

cmatrix<-cor(cor.data)[1:3,4:6]

attributes(cmatrix)

# .
# For missing information, indicate elements in the correlation matrix that is generated from
multivariate normal distribution called "cor.data”. It is named as C1-C15.
C1<-cbind(1,2,cmatrix[1,2])
C2<-cbind(1,3,cmatrix[1,3])
C3<-cbind(2,3,cmatrix[2,3])
C4<-cbind(2,1,cmatrix[2,1])
C5<-cbind(3,1,cmatrix[3,1])

)

)

 

C6<—cbind(3,2,cmatrix[3,2]

C7<-cbind(1,1,cmatrix[1,1] #C7 is on diagonal.
C8<-cbind(2,2,cmatrix[2,2]) #C8 is on diagonal.
CQ<~cbind(3,3,cmatrix[2,2]) #C9 is on diagonal.

# By having random #, we can assgin the same # of Rs across all three elements on the diagonal.
#random<-runif(1 ,0,1)
# ind<-ifelse(random<1/3,1,ifelse(random>=1/3 & random<2/3,2,3))

# 0 missing
es.meta[9*i-8,]<cbind ,
es.meta[9*i-7,]<-cbind

#C1, C2, C3, C4, C5, C6, C7, C8, C9

(i C
(i, C
es.meta[9*i-6,]<—cbind(i, C
es.meta[9*i-5,]<-cbind(i, C
es.meta[9*i-4,]<-cbind(i, C
es.meta[9*i-3,]<-cbind(i, C
es.meta[9*i-2,]<-cbind(i, C
es.meta[9*i-1,]<-cbind(i, C
es.meta[9*i,]<-cbind(i, C6)
#

7)
8)
9)
1)
2)
3)
4)
5)
}

 

#
#1 missing case

# For missing information, indicate elements in the correlation matrix that is generated from
multivariate normal distribution called "cor.data". It is named as C1-C15.
C1<-cbind(1,2,cmatrix[1,2])

C2<-cbind(1,3,cmatrix[1,3])

C3<~cbind(2,3,cmatrix[2,3])

C4<-cbind(2,1,cmatrix[2,1])

C5<-cbind(3,1,cmatrix[3,1])

C6<~cbind(3,2,cmatrix[3,2])

C7<-cbind(1,1,cmatrix[1,1]) #C7 is on diagonal.

 

76

C8<~cbind(2,2,cmatrix[2,2]) #C8 is on diagonal.
C9<-cbind(3,3.cmatrix[2,2]) #C9 is on diagonal.
all.e|ement1<-rbind(C2, C1 ,C1 ,C1, C1, C1)
all.e|ement2<-rbind(C3, C3,C2,C2, C2. C2)
all.e|ement3<-rbind(C4, C4.C4,C3, C3, C3)
all.e|ement4<-rbind(C5, C5,C5,C5, C4, C4)
all.e|ement5<-rbind(C6, C6,C6,C6, C6, C5)

#(2,3,4,5,6), (1,3,4,5,6), (1.2.4.5,6), (1.2.3.5,6), (1,2,3,4,6). (1,2,3,4,5).
# By having random #, we can assgin the same # of Rs across all three elements on the diagonal.
#random<-runif(1.0,1)

# ind<-ife|se(random<1/3,1,ifelse(random>=1/3 & random<2/3,2,3))

es.meta[8*i-7,]<-cbind(i, C7) #C1, C2, C3. C4, C5, C6, C7. C8, C9
es.meta[8*i-6,]<-cbind(i, C8)

es.meta[8*i-5,]<-cbind(i. C9)

es.meta[8*i-4,]<-cbind(i, matrix(all.element1[j,],c(1:3)))
es.meta[8*i-3,]<-cbind(i, matrix(all.element2[j,],c(1:3)))
es.meta[8*i-2,]<-cbind(i, matrix(all.element3[j,].c(1:3)))
es.meta[8*i-1,]<-cbind(i, matrix(all.element4[j,],c(1:3)))
es.meta[8*i,]<-cbind(i, matrix(all.element5[j,],c(1:3)))}
#

 

#
# 2 missing case
# For missing information, indicate elements in the correlation matrix that is generated from
multivariate normal distribution called "cor.data“. It is named as C1-C15.
C1<-cbind(1,2,cmatrix[1,2])
02<-cbind(1.3,cmatrix[1,3])
C3<-cbind(2,3,cmatrix[2,3])
C4<-cbind(2,1,cmatrix[2,1])
CS<-cbind(3,1,cmatrix[3,1])
CG<-cbind(3,2,cmatrix[3,2])
C7<-cbind(1,1.cmatrix[1,1]) #C7 is on diagonal.
C8<-cbind(2,2,cmatrix[2,2]) #C8 is on diagonal.
C9<-cbind(3.3,cmatrix[2,2]) #C9 is on diagonal.
all.e|ement1<-rbind(C3.C2,C2,C2,C2.C1,C1,C1,C1,C1,C1,C1,C1,C1,C1)
all.e|ement2<-rbind(C4,C4.C3,C3,C3,C4.C3.C3,C3,C2.C2,C2,C2,C2,C2)
all.element3<-rbind(C5,C5.C5,C4,C4,C5,C5,C4,C4.C5.C4,C4,C3,C3,C3)
all.element4<-rbind(C6,C6,C6,C6,C5.C6,C6,C6,C5,C6.C6,CS,C6,C5,C4)

)

)

 

# (3.4.5.6). (2.4.5.6 , (2.3.5.6), (2.3.4.6),
# (2.3.4.5). (1.4.5.6 , (1.3.5.6). (1.3.4.6), (1,345).
#(1.2,5,6). (1.2.4.6), (1.24.5), (1.23.6),

11 (1.2.3.5), (1.2.34).

# 2 missing

77

es.meta[7*i-6,]<-cbind(i, C7) #C1, C2, C3, C4, C5, C6, C7, C8, C9
es.meta[7*i-5,]<-cbind(i, C8)
es.meta[7*i~4,]<-cbind(i, C9)
es.meta[7*i-3,]<—cbind(i, matrix(all.element1 [j,],c(1 :3)
es.meta[7*i-2,]<—cbind(i, matrix(all.element2[j,],c(1:3)))
es.meta[7*i-1,]<-cbind(i, matrix(all.element3[j,],c(1:3)))
es.meta[7*i,]<-cbind(i, matrix(all.element4[j,],c(1:3)))}

))

# (1,2,3,4), (1,2,3,5), (123,6), (2,3,4,5), (2,3,4,6), (3,4,5,6)
#

 

 

#
# 3 missing case

# For missing information, indicate elements in the correlation matrix that is generated from
multivariate normal distribution called "cor.data". it is named as C1-C15.
C1<-cbind(1,2,cmatrix[1,2])

CZ<-cbind(1,3,cmatrix[1,3])

C3<—cbind(2,3,cmatrix[2,3])

C4<-cbind(2,1 ,cmatrix[2,1])

C5<—cbind(3,1 ,cmatrix[3,1])

CG<—cbind(3,2,cmatrix[3,2])

CT<—cbind(1,1,cmatrix[1,1]) #C7 is on diagonal.

C8<-cbind(2,2,cmatrix[2,2]) #C8 is on diagonal.

C9<—cbind(3,3,cmatrix[2,2]) #C9 is on diagonal.

all.e|ement1<-rbind(c1,c1,c1,c1,c2,c2,c2,c3,c3,C4)
all.element2<-rbind(C2,C2,C2,CZ,C3,C3,C3,C4,C4,C5)
all.e|emen13<-rbind(C3,C4,C5,C6,C4,C5,C6,C5,C6,06)
# (1,2,3), (1,2,4), (1, 2,5), (1,2, 6), (2,3,4), (2,3,5), (2,3,6), (3, 4, 5), (3,4,5), (4, 5, 6)
#

 

#
# 4 missing case

 

# For missing information, indicate elements in the correlation matrix that is generated from
multivariate normal distribution called "cor.data". It is named as C1-C15.
C1<-cbind(1,2,cmatrix[1,2])

C2<-cbind(1,3,cmatrix[1,3])

C3<-cbind(2,3,cmatrix[2,3])

C4<-cbind(2,1,cmatrix[2,1])

CS<—cbind(3,1,cmatrix[3,1])

C6<-cbind(3,2,cmatrix[3,2])

C7<-cbind(1,1,cmatrix[1,1]) #C7 is on diagonal.

CB<-cbind(2,2,cmatrix[2,2]) #C8 is on diagonal.

C9<~cbind(3,3,cmatrix[2,2]) #C9 is on diagonal.
all.e|ement1<-rbind(C1,C1,C1,C1,C1,C1,C2,C2,C2,C2, C3,C3,C3,C4,C4,C5)
all.element2<-rbind(C2,C3,C4,C5,C6,C3,C4,C5,C6,C4, C5,C6,C5,C6,C6,C6)

78

# By having random #, we can assgin the same # of Rs across all three elements on the diagonal.
#random<-runif(1,0,1)
#ind<-ifelse(random<1/3,1,ifelse(random>=1/3 & random<2/3,2,3))

# 4 missing

es.meta[5*i-4,]<-cbind(i, C7) #C1, C2, C3, C4, C5, C6, C7, C8, C9
es.meta[5*i-3,]<-cbind(i, C8)

es.meta[5*i-2,]<-cbind(i, C9)

es.meta[5*i-1,]<-cbind(i, matrix(all.element1[j,],c(1:3)))
es.meta[5*i,]<-cbind(i, matrix(all.element2[j,],c(1:3)))}

# (c1, c2), (c1, c3), (c1, C4), (c1, C5), (C1, C5), (c2, C3), (C2, c4), (c2, C5), (c2, cc), (c3,
c4),(c3, C5), (C3, c5), (c4, cs), (c4, cc), (c5, cc).
#

 

#
# 5 missing case

# For missing information, indicate elements in the correlation matrix that is generated from
multivariate normal distribution called "cor.data". It is named as C1-C6.
C1<-cbind(1,2,cmatrix[1,2])

CZ<-cbind(1,3,cmatrix[1,3])

CB<-cbind(2,3,cmatrix[2,3])

C4<cbind(2,1 ,cmatrix[2,1])

CS<-cbind(3,1,cmatrix[3,1])

C6<-cbind(3,2,cmatrix[3,2])

C7<-cbind(1,1,cmatrix[1,1]) #C7 is on diagonal.

C8<-cbind(2,2,cmatrix[2,2]) #C8 is on diagonal.

C9<-cbind(3,3,cmatrix[2,2]) #C9 is on diagonal.

all.e|ement<-rbind(C1,C2,C3,C4,C5,C6)

 

# By having random #, we can assgin the same # of Rs across all three elements on the diagonal.
#random<-runif(1,0,1)
#ind<-ifelse(random<1/3,1,ifelse(random>=1/3 & random<2/3,2,3))

# 5 missing

es.meta[4*i-3,]<-cbind(i, C7) #C1, C2, C3, C4, C5, C6, C7, C8, C9
es.meta[4*i-2,]<-cbind(i, C8)

es.meta[4*i-1,]<-cbind(i, C9)

es.meta[4*i,]<-cbind(i, matnx(all.element[j,],c(1:3)))}

#

 

#
# 6 missing

C7<-cbind(1,1,cmatrix[1,1]) #C7 is on diagonal.
C8<-cbind(2,2,cmatrix[2,2]) #C8 is on diagonal.
C9<-cbind(3,3,cmatrix[2,2]) #C9 is on diagonal.

 

79

# By having random #, we can assgin the same # of Rs across all three elements on the diagonal.
#random<-runif(1,0,1)
# ind<-ifelse(random<1/3,1,ifelse(random>=1/3 & random<2/3,2,3))

es.meta[3*i-2,]<-cbind(i, C7) #C1, CZ, C3, C4, C5, C6, C7, C8, C9

es.meta[3*i-1,]<-cbind(i, C8)

es.meta[3*i,]<-cbind(i, C9)}

attributes(esmeta)

# After creating a hypothetical studies with # of studies in it, next step is to compute the ﬁnal
estimates based on sample-size weighted average Rs & z-transformed variance weighted Rs.

#

 

# E81 is the ﬁnal ES based on sample-size weighted Rs & E82 is the ﬁnal ES based on 2-
transfonned weighted Rs.

meta<-cbind(es.meta, es.meta[,4]*30, 27*(.5*log((1+es.meta[,4])/(1-es.meta[,4]))))
E81.sum<—data.matrix(aggregate(meta[,5], list(x=meta[,2], y=meta[,3]),sum))
ES1<-cbind(ES1 .sum[,1], ES1.sum[,2],ES1.sum[,3]/(30*study))
E82.sum<-data.matrix(aggregate(meta[,6], list(x=meta[,2], y=meta[,3]), sum))
E82<-cbind(E32.sum[,1], E82.sum[,2],E82.sum[,3]/(27*study))

E82<-cbind(ES2.sum[,1], E82.sum[,2],(exp(2*E82[,3])-1)/(exp(2*ESZ[,3])+1))

meta_ES1<-as.matrix((apply(ES1,2,sum))/((apply(LX,2,sum))*(apply(LY,2,sum))))
meta_E82<-as.matrix((app|y(E82,2,sum))/((apply(LX,2,sum))*(apply(LY,2,sum))))

# ES includes both the ﬁnal ES based on sample-size weighted Rs(ES1) & the ﬁnal ES based on
z-transforme weighted Rs.

ESU,]<-cbind(j, LX[1,1], LX[2,1], LX[3,1], LY[1,1], LY[2,1], LY[3,1], sv[1,1], sv[2,1], sv[3,1], study,
GA, Rel1, Re12, Rel3, meta_ES1[3,1], meta_E82[3,1])}

result<-return(ES)}

# Make hypothetical meta-analyses depending on different conditions in accordance with different
parameters.

 

#
# Deﬁnine function depending for getting ﬁnal ES.
#

 

matrix.ga<-matrix(c(.5,0), c(1,2))

matrixspeciﬁc.variance<-matn'x(c(1,0), nrow=1, ncol=2,byrow=TRUE)
matrix.reliability<~matrix(c(.9,.9,.9,.5,.5,.5,.2,.2,.2,.9,.5,.2) nrow=4, ncol=3, byrow=TRUE)
n.study<-matrix(c(9, 36), ncol=1, nrow=2, byrow=FALSE) # k is # of studies included in meta:
analysis;

# 32 conditions by 2 (Gamma) * 4(Reliability sets) * 2(# of study) * 2(# of speciﬁc variance) =32
#(0.5,0,.9,.9,.9,9) (O.5,1,.9,.9,.9,9) (0,0,.9,.9,.9,9) (O,1,.9,.9,.9,9)

80

(0.,,51 9, 9, .935)(0,

0.,,51.,5.,5 590)( ..0

(05,155 5.,35)(0
0,
(

0 .35) (0,1..9..9,.9,35)
5
0
0.,,512, 229) (0 ..2
0
.,9
0,

.,.,.999

55 )9(01,5,5.5,9)
5,...55,,,35)(015..5,.,535)
W22mp122wzm
(05,.12...,,2235)0 ..,..22
0.,,519.,.,0.529)(0, ...52.
(05......195235)(0 9,...25

.,235) (0.1.2.2 .,235)
9) (0,1..9,5..2,9)
.35) (0,1,.9,5,.2,35)

VAVAVAV

replication<-1000 #write # of replications here
M1 <-lapply(1 :replication, function(x) hypothetical.meta(O. 5,0
M2<-Iapply(1:replication, function(x) hypothetical.meta(O. 5.1
M3<-lapply(1:replication, function(x) hypothetical.meta(0,0,9,
M4<-lapp|y(1:replication, function(x) hypothetical.meta(0,1,9 ,.
M5<-Iapply(1:replication, function(x) hypothetical.meta(O. 5,0
M6<-lapply(1:replication, function(x) hypothetical.meta(O. 5,1
M7<~|apply(1:replication, function(x) hypothetical.meta(0,0. 9,.
M8<-Iapply(1:replication, function(x) hypothetical.meta(0,1,. 9..
M9<-lapply(1:replication, function(x) hypothetical.meta(O. 5,0,. ,.
M10<-lapply(1:replication, function(x) hypothetical.meta(O. ,1. .5.. 5,.
M11<-|app|y 1:replication, function(x) hypothetical.meta(O, ,.
M12<-lapply 1:replication, function(x) hypothetical.meta(O,
0.
0.

c"cc:cc>.“3“3c::1no“)

(
(
M13<-Iapply(1:replication, function(x) hypothetical.meta(
M14<—lapply(1 r:e,p|ication function(x) hypothetical. meta(
M15<-Iapply(1 :replication, function(x) hypothetical. meta(0, ..
M16<-lapply(1: replication, function(x) hypothetical. meta(O, ,.
M17<-lapply(1: replication, function(x) hypothetical. meta(O. 5,
M18<- |app|y(1 :replication, function(x) hypothetical. meta(O 5,
M19<-Iapply(1: replication, function(x) hypothetical. meta(O, 0,.
M20<-Iapply(1:replication, function(xxhypothetical.)meta(O,1,.
M21<-lapply(1:replication, function(x) hypothetical. meta
M22<-Iapply(1:replication, function(x) hypothetical. meta
M23<-Iapply(1:replication, function(x) hypothetical. meta

(

(

(

(

(X

070)

(0
(
(
M24<-lapply 1:replication, function x) hypothetical.meta(
M25<-|apply 1:replication, function x) hypothetical. meta(
M26<-lapply 1:replication, function x) hypothetical. meta(
(
(
(
l
(
l

V
vv
VVVV
vv

(
(
(
M27<-Iapp|y(1:replication, function x) hypothetical. meta
M28<-lapply(1:replication, functionx) hypothetical. meta

(

(

(

(

M29<-lapply 1:replication, function(x) hypothetical.meta
M30<-lapply 1:replication, function(x ) hypothetical. meta
M31<-lapply 1:replication, function(x) hypothetical. meta
M32<-lapply 1:replication, function(x) hypothetical. meta
#

wwpgmcocoppwwmwcocowbwwmmcocom

coco—‘Pcocor‘ommfpmmf‘
mm

.,5
0.5,
0,.0
0,..1
0.,5
05,
0,0,.
0,.1,
0.,5
0.5
0,.0.
01,.

VV

 

# Creating dataset by combining

 

#

#class(META1[[1]])
#replication

81

#Iength(META1)
# replication == length(META1)

library(gdata)

create.data<-function(META1.META2){

META2<-matrix(0.1000,17)

for (i in 1:replication){

META2[i,]<-META1[[i]]}

colnames(META2)<-c( "Ind", "LX1", "LX2", "LX3". "LY1", "LY2". "LY3". "svi", ”sv2", "sv3", "study",
"GA". “Rel1”. "Rel2", "Rel3", "E81", "E82")

result<-return(META2)}

R1 <-create.data(M1 .D1)

R2<-create.data(M2,DZ)

R3<-create.data(M3,D3)

R4<-create.data(M4.D4)

R5<-create.data(M5.05)

R6<-create.data(M6.D6)

R7<-create.data(M7,D7)

R8<-create.data(M8,D8)

R9<create.data(M9.D9)

R10<-create.data(M10,D10)
R11<-create.data(M11.D11)
R12<-create.data(M12,D12)
R13<-create.data(M13.D13)
R14<create.data(M14.D14)
R15<-create.data(M15.D15)
R16<-create.data(M16.D16)
R17<-create.data(M17.D17)
R18<-create.data(M18.D18)
R19<-create.data(M19.D19)
R20<-create.data(M20.D20)
R21 <-create.data(M21 .D21)
R22<-create.data(M22.D22)
R23<-create.data(M23.D23)
R24<-create.data(M24,D24)
R25<-create.data(M25,D25)
R26<-create.data(M26.D26)
R27<-create.data(M27.D27)
R28<-create.data(M28,D28)
R29<-create.data(M29,D29)
R30<-create.data(M30,D30)
R31 <-create.data(M31 ,D31)
R32<-create.data(M32.D32)

big.data<-combine(R1, R2. R3, R4, R5, R6. R7. R8, R9, R10, R11. R12. R13, R14, R15, R16. R17.
R18. R19. R20. R21, R22, R23. R24. R25. R26. R27. R28. R29. R30, R31, R32)
write.matrix(big.data, ﬁle="D_0 missing.xls", sep=" ")

83

APPENDIX C:

Tables and Figures

84

Table 2.1

Attenuated Artifacts and the Corresponding Multiplier

 

Attenuation Artifacts

The corresponding multiplier

 

Random error of measurement in
dependent variable Y

01:5,

ryv is the reliability of the Y measure

 

Random error of measurement in
independent variable X

a2: ”xx:

’10: is the reliability of the X measure

 

Artiﬁcial dichotomization of
continuous dependent variable
split into proportions p and q

Artiﬁcial dichotomization of
continuous independent variable
split into proportions p and q

Imperfect construct validity of the
dependent variable Y

a3 = biserial constant = ¢(c)/ t/pq

2
where ¢(x) = e—x / J27r is the unit normal
density function and c is unit normal distribution
cut point corresponding to a split of p

a4 = biserial constant = ¢(c)/ Jpq

where ¢(x) = e_x2 / m is the unit normal
density function and c is unit normal distribution
cutioint corresgonding to a split of p

a5 = the construct validity of Y

 

Imperfect construct variable of the
independent variable X

a6 = the construct validity of X

 

Range restriction on the dependent
variable Y

 

a7 = \/(uy2 +,02 “'uyzpz),

where u = (SDy study population)/
(SD), reference population)

 

Range restriction on the
independent variable X

 

”8 = \/(ux2 + p2 _ ux2p2) ,
where u = (SDx study population)/

( SDx reference population)

 

Bias in the correlation coefﬁcients
due to small sample sizes

agzl-(l—p2)/(2N—2)

 

Study-caused variation

 

Partial correlation to remove the effects of
unwanted variation in experience

 

85

Table 2.2.

Comparisons of Correction formulas

 

  

 

  

 

d Correlation Regression
Oswald Raju & Brand
Le (2003); Hunter & & (2003);
Correcting Nugent Sackett & Schmidt Hancock Converse Raju et
factors (2009) Yang (2000) (1990) (1997) (2005) al.(l991)
Reliability (X) x x x x x
Reliability (Y) x x x
Validity (X) x
Validity (Y) x
Range restriction x x x
Sampling error x x

 

 

 

 

86

Table 4.1.

Bias and MSE ofEstimators

 

 

 

 

 

 

 

7
Quality of MSE Bias MSE
Estimators E81 E82 ES 1 E82 E81 E82 ES 1 E82
Min -.01 -.01 0 0 -.02 0 0 0
Max .01 .01 .06 .06 .15 .17 .05 .06
M .0001 .0001 .0082 .0082 .008 .023 .008 .009
SD .003 .003 .01 .01 .03 .03 .01 .01

 

 

 

 

 

 

 

 

 

87

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

moo. ooo. moo. ooo. o 3

:o. ooo.- ooo. ooo.- o

o8. moo. ooo. moo. m

ooo. ooo. woo. ooo. e

woo. so; woo. ooo.- m

woo. ooo. 8o. ooo. N

woo. ooo. ooo. ooo. _

ooo. ooo.- ooo. ooo.- o o S. E. E. m. m. m.

so. ooo.- so. ooo.- o

so. ooo. so. ooo. m

ooo. ooo. ooo. ooo. a

so. ooo. ooo. ooo. m

ooo. ooo. so. ooo. N

ooo. So. So. so. _

so. ooo. ooo. ooo. o 3

woo. woo. woo. ooo.- o

ooo. ooo.- ooo. ooo.- m

woo. ooo. ooo. ooo. 5

ooo. ooo. ooo. ooo. m

ooo. _oo. woo. ooo. N

ooo. ooo. ooo. ooo. _

ooo. moo. ooo. moo. o o woo woo woo oo oo oo

mm: Em mm: mam woaez. « Enos @2me 525 5<me ahmxo :55

use 53685 $685 $685 3:523 3:353 5:553

Nmm ﬂmm coooom Sooom 56$

 

 

 

 

 

 

 

 

 

 

S H \C 283380 28.6be $35 ESQEEM heEMExx octets; EELQQETN 3E SEES: mNE.-mEEum.\o MME ES 3%

NV 2an

88

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

moo. ooo. moo. ooo. _

moo. ooo. moo. ooo. o om
moo. moo. moo. moo. o

:o. ooo. ooo. ooo. m

ooo. ooo. ooo. ooo. w

ooo. ooo.- woo. ooo.- m

woo. ooo. woo. ooo. m

woo. ooo. ooo. ooo. o

moo. woo. woo. woo. o o mw. K. mo. m o.
2o. ooo.- woo. ooo. o

:o. ooo. :o. ooo. m

ooo. moo. ooo. So. w

woo. ooo. woo. ooo. m

ooo. ooo. moo. ooo. m

ooo. ooo. ooo. ooo. ﬂ

ooo. moo.. moo. moo.- o om
omo. ooo. mmo. ooo. o

owo. .oo. mwo. Hoo. m

mmo. ooo. mmo. ooo. w

mmo. woo. omo. moo. m

mmo. ooo.- hmo. ooo.- m

omo. moo. mmo. moo. _

wmo. woo. mmo. moo. o o mw. mw. mw. m m.
moo. moo. moo. moo. o

moo. ooo. moo. ooo. m

moo. ooo. moo. ooo. w

moo. ooo. moo. ooo. m

moo. ooo. moo. ooo. m

moo. ooo. moo. ooo. ﬂ

 

 

 

 

 

 

 

 

 

 

 

 

82950 2. 29¢

89

 

 

 

 

 

moo. So. moo. Moo. o
moo. _oo.- moo. ooo.- m
moo. ooo. moo. ooo. w
moo. ooo. moo. ooo. m
moo. _oo. moo. ooo. m

 

 

 

 

 

 

 

 

 

 

 

 

3258 2. 29¢

90

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Noo. ooo. Noo. woo; _

Noo. ooo. Noo. ooo. o om

o_o. ooo. ooo. ooo. o

ooo. ooo. ooo. ooo. m

ooo. Noo. Noo. ooo. e

ooo. ooo. Noo. ooo; m

ooo. ooo. ooo. woo; N

Noo. ooo. ooo. woo; :

ooo. ooo. ooo. ooo; o o :N. _N. :N. m. m. n

_oo. ooo. :oo. woo; o

_oo. Noo. _oo. Noo; m

:oo. ooo. :oo. ooo. o

_oo. ooo. _oo. Noo; m

:oo. Noo. :oo. ooo. N

:oo. ooo. :oo. Noo; _

:oo. ooo. _oo. ooo; o om

moo. ooo. moo. ooo; o

moo. ooo. moo. ooo. m

moo. moo. moo. ooog o

moo. ooo. moo. ooo. m

moo. ooo. moo. ooo; N

moo. moo. moo. Noor :

moo. ooo. moo. ooo; o o mo. mo. mo. o. o. o.

mm: 55 mm: 55 meme: 3: :9.me :Nocsc Ease 5:.me :Ntho Eng

55 m5:53 5:33 mega: 5:525: 5:553: 5:523

mmm :mm :ooomm 86$ SSE

 

mm. H \C €82}on 55.:ng .535 EQEEEM EmeExx 822.255 hmEwaSﬁTN 55o mesons: NontrmNSEomko mm: ~53 .5.on

mw 035%

91

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

ooo. owo. woo. mwo. m

moo. omo. moo. mmo. _

woo. mwo. moo. mmo. o cm
mmo. mm_. omo. owﬁ. o

mmo. oo_. w_o. moo. m

o_o. owo. woo. mmo. w

moo. woo. ooo. mwo. m

:_o. owo. ooo. _wo. m

ooo. mwo. moo. mmo. _

woo. owo. ooo. mmo. o o mw. :5. mo. m o.
m_o. woo. w_o. m_or o

__o. ooo. ooo. woor m

ooo. ooo. ooo. woo; w

woo. ooo. woo. moor m

ooo. ooo. moo. moor m

ooo. moo. ooo. oﬂor :

ooo. ooo. ooo. _ﬁor o om
mmo. ooo. mmo. ooor o

mwo. moo. _wo. moor m

wmo. ooo. wmo. ooor w

mmo. moo. omo. moor m

mmo. ooo. wmo. ooor m

mmo. moo. wmo. oﬁor _

mmo. woo. mmo. moor o o mw. mw. mw. m m.
moo. ooo. moo. moor o

moo. woo. moo. moor m

moo. woo. moo. woo: w

moo. woo. moo. moor m

moo. woo. moo. moor m

 

 

 

 

 

 

 

 

 

 

 

 

8:558 3 55$

 

 

 

 

mo. 03. mo. wmo. o
o5. m2. moo. woo. m
:o. ooo. woo. mmo. w
moo. woo. woo. owo. m

 

 

 

 

 

 

 

 

 

 

 

 

8:558 3 55S

()3

Table 4.4

Mean bias of rfrom Field (2001)

 

 

 

 

 

 

 

 

 

 

 

 

Number
Homogeneous case Heterogeneous case
of
studies H/O H/S H/O H/S
10 -.0001 -.0001 0 0
p = 0
30 -.00005 -.00005 0 0
10 .006 -.007 .1785 -.024
p = 5
30 .007 .007 .205 -.024
Note.

H/O is the method suggested by Hedges and Olkin (1985) or Rosenthal and Rubin (1991).

This is equivalent to ESZ in this dissertation; H/S is the method suggested by Hunter and

Schmidt (1990). This is equivalent to ESl in this dissertation

94

 

 

333?...» 5

 

 

wmo. _o.v mwoﬁmw m oohfwwow wo.mm ooom “Swag 38m
wmo. :o.v wwi mw m oo.mw_wwow wodm oowﬁ. m.w==_BOI
wmo. Ho.v wwoﬁ mw m oohwowwow wo. @3qu .353
wmo. gov wwo _ mw m 093 :wwow oo. mom; £32m C: @2535 bosom .3 .5352
$0. _o.v owimw m mwmomoomw 2 .mm “com “mambo: 38m
woo. _o.v whomow o meCmmmw owo moo; Ewes—20$
omo. :ov omomow o oomwowow— oo. $5qu .853
www. _o.v wmomow o wwowmo: ow; 88H 98:5 coo Amazons— :oouomv moo—SEEM
wmo. _o.v owoomw o wowmoo o _. ooom “mowomq mkom
owo. _o.v wmomow m_ ww.woww m _. womb. musigoz
wmo. _o.v chamow m_ wwomww ow. $5qu .323
wmo. :ov whomow m: oomww mo. moat. 96:5 ma @622 .3 55:52
.5 a woo so u 35> 55::

 

Q“

m .ExoxScEEQKQ 3.5 2: to Q; QZVEQ 355.235 \o ooobctw BEEPS§< EExuézomk

3.. Book

 

ooommv

mmé

 

 

 

 

 

mmmumam Eﬁ

ooommv wo..“ ﬁmmlmmmm
0 $2 mv 00v Nmmlmﬁm Stm

o owoﬁmv ~04V Emlmﬁm

C 5v 0mm So. _ oo. mmmlmﬁm
C 5v mdw So. _ oo. Emlmﬂm Cb 39:2: 33% we EDS: Z
moo. _o.v N502 m5. m wo. NmMImﬂm mozmtg who
ooo. 5v 5:2 NS. m wo. ~mmlmﬂm 3:69: 569: bzﬁwzum

moo. 5V VNQ So. 0 S. mmmlmﬂm
moo. 5v was So. 0 S. Emlmﬁm wk @632 LS $9832

”9 a u a: 5.3 mm Maugham 85%

 

S H \C mEEEwM wwwiuwkm QEMSQKQ Sum §~\..29m\g~w .4.533\..§§:mm,\o mask

9v 2an

9 6

 

 

 

 

5mm. av 32? N wmsmmomm 3.: 805 “wows:— 9x85
nmm. _o.v wwo: mv m wmsmmomm 3.: can; 995—20:
5mm. :o.v wwemv m wmsmmomm ow. 35qu .353
5mm. av 33? N meNMOmN wm. 88H 98:5 Ct 39:05 33.50 89652
won. _o.v 33? m 5.3%: KM. Hood Emma: 9505
mmo. :o.v Emmow o wmdvmwmm :m.m 88% 95:38:
w 5 m. :o.v cummow o oodmovf mm. 35:84 .353 83$“? o
33.. 8v whomcw o 2.3wa R. 3th 96:5 50 32:52 565v 5:53:05
om: SV 32? o omﬁmmg 3. “com “mama: 9:05
woo. :o.v 3.on N: madam 3. 8a; 99::80:
30. 5v @3me m: Omﬁvow 5w. «3qu .353
80. av wmmmow 5 90¢an m _. 88H 96:5 wk wﬁmﬂz mo 598:2
a a go as a 33> swam

(‘1

Q U x foSBECQxG 9mm: 2: :9 «VS 0236 823.8; \o Muibsﬁx 2922.335 EPKENEQQ

5v 2an

()7

 

ooommv wows mlimmE

 

 

 

 

 

ooommv nwwo 5mmIMm2 130%

00. owofmv om: NmMIme
oo. owimv Cd _mMImm2 88m

:mm. :ov wonooov m5: _. m5: mmMImmE
mmm. _o.v wo. :moov 3.: 5 :2 _mMImmE C: 3:208 583% («o 838:2
mwo. _o.v mnéwmm: m 91. m wvdm mmmlmmE 838:3; 0
Q3. _o.v omﬁmmo; 00.0 m owd: EmlmmE 50 3,885: 8925.: 5:58—05

owo. 5v KSwmm 2. o 2.. NmmlmmE
ooo. EV cm.mmm I. c we. _mmlmm2 9‘ .8882 5.0 838:2

s q u ,3 ,8 mm 3235‘“: 850m

(‘3

8 H \C 838.5% 33:8.»ka 23m>©x~v 8mm: 3&8.»th 3m.§~m-:mm.x.$m\e 8.?er

w: 29.?

()8

 

 

 

 

mmo. 3v wwﬁ m? m 8.3233.“ woam 80m “mews; 38¢
wmo. _.o.v wwoﬁmv m coﬂwgwmv wodm 83H ﬁgs—BEE
wmo. 5v mwgmv m 8.33on we. 25qu .353
m3. 5v mmoﬁmv m coax—gov oo. 8th mag—E O: 39:2: 3330 898:2
mwo. _o.v owmﬁmv m mVNONSQ Sam “com “wows; 33m
$0. _o.v vnmmow o wn.m:mmmv 3.0 mow; mwﬁzgom
owo. _o.v 033w w SNEvaf oo. 35qu .353 333:3
3w. 5v whomow o vwdwmot $4 38% magi 0 go @532 SSEV bzﬁazom
wmo. 5V 33$ 0 woémmc 3. “com Emma; 33¢
0.8. 5v £33 9 Egg m a 85 £5.50:
wmo. _o.v 03mg .9 3.03% ow. «3an .823
mmo. _o.v $.on NM 8va m _. 88% 96:5 mg @532 .«o SpEsz
b a § § a 86> 3mm

 

.n. H A S\EQEE.:mm~\Q ”3% 2t :0 «WA 923%» 985.23; \Q (4.233% 23x95~§< Epﬁﬂt‘amx

oé £me

99

 

ooommv

3.x:

 

 

 

 

 

mmmimam Eﬁ
ooommv «mdmm #mMImEm
oo. $2 Q @052 Nmmlmwmm Sam
oo. 93 ﬂ 9 00.03 Eulmﬂm
80. av 3.8: 3. ﬂ 3. mmmumgm E
ooo. _o.v mmﬁ mo. _ mo. Emlﬁﬁm @2535 35% mo 595:2
m2. 5v 3.33% 982 m SS». mmmtmem 832:; o .8
won. 3V 9::va 523 # m 2.9% Emlmﬂm 3532 56mm: bzﬁmzom
35. av 8.88 RN 0 a: Nmmumgm
23. 5v @330 gm 0 3.: amlmam r 33:2 (6 “38:2
mt Q. R m: xv mm £08835 venom

 

G. H \C EQEEM 32-8»ka Engxc 3E, .Exéuﬁﬁ \Bxgmrsmpémmkc big

0:” 28g

100

 

 

 

 

0mm. 5V wwoﬁmv N mobmmmmm we; Hood :8qu 38m
omm. 5V mwoﬁmv m 3.83% we; 83H muczgom
0mm. SV 39 Q m mmémmmmm wv. @3qu .8252
0mm. 3V wwo H 3 m m0©~mmm~ mm. mom; 96:5 G: 89:2: 35% .«o 55:32
0:. SV 33 Q m mmwmowg mm.m wood 36qu ﬁxed
m K. SV :33 c madmmcmm 3%. 88% m.wE:BOI
woo. 5V onmmow o ofmgmmm oo. 858mg .953 3325.» o
33. av mummcw o mmwmoo _ m mm; 88% 98:5 mo 3532 56mm: bzﬁmzom
m3. SV 32 Q o 00$ 2 2. 80m Hmvwaq mahom
to. _o.v 3.98m NH mwhooo n g. 88% wwcngom
3b. 5v 03mg 2 onéwnm ow. 25:54 .333
mwo. 5V whomow 9 360mm 3. 8th mug—E m.‘ 9522 Mo Snacsz
s a £0 36 u 02$ 6&5

(*1

 

.n. H R SKEQEEEQKQ “MM: 3: to QAQZESC 823.23; \3 .nwbszvx 33:33:: Schmgzgm

2% 2me

101

 

 

 

 

 

 

ooommv mmdw mmMImmE
ooommv an: #mmlmmE ESP
mod—Va owemv m5: mmmlmmZ
modmﬁm awamv oo.: #mmlmmE “ohm
mwm. _o.v nmwoog wZKNNd # mmd mmmlmmZ
Ev. _o.v mmdwwSm Nmmﬁwvw ﬂ wvw ﬁmmlmmZ 0: 89:2: .3330 59:52
ooo. 5v oo©$vmm 238.5 m modm mmMImmZ 332:? o
wmo. 5V ~980me oommhmd m ~52 _mmlmm2 ,3 353.2 .585: .§_5m:om
mm _. 5v :uNmo: owoonm. o omd mmmlmmE
mm ﬂ. _o.v 8.5:: ovvwwm. 0 mm; #mmlmmE mg wEmﬂZ ,3 88:32
-s a K m: \n mm msaémm

Q. H \C BQEEM 33-3%3 25m>©xc “QM: Exéumkw Bm.§.~m-§3:mqu .235

mﬁv 033%

103

Table 4.13

Correlation Matrix of Six Indicators

 

X1

(reliability = .9)

x2

(reliability =.5)

X3

(reliability = .2)

 

 

 

 

y: rx2,yl. rx3,yl
(reliability = .9) (medium, high) (low, high)
yz rxl,y2 rx3,y_7
(reliability = .5) (high, medium) (low, medium)
Y3 rxl.y3 ery]
(reliability = .2) (high, low) (medium, low)

 

 

 

 

103

 

Table 4.14

ANO VA S Comparing Bias of ES 1 Across H711'C/z r is Included

 

 

 

7 Source SS df MS F p ’7

0 Intercept .02 1 .02 .65 .42 0
which r is included .38 5 .08 2.07 .07 0
Error 88.31 23,994 .04
Total 8 8.71 24,000

.5 Intercept 1,589.6 1 1,589.60 41,2615 <.01 .632
which r is included 2.4 5 .49 12.69 <.01 .003
Error 924.4 23,994 .04
Total 2,516.42 24,000

 

 

 

104

Table 4.15

Pairwise Comparison Comparing Bias of ES 1 Depending 0n Which r is Included

 

 

 

 

 

 

 

 

95% Cl
Ind Ind Mean Difference SE p Upper Lower
1 2 .012* .004 .008 .003 .020
3 .023* .004 .000 .014 .032
4 -.006 .004 .206 -.014 .003
5 .014* .004 .002 .005 .023
6 .019* .004 .000 .011 .028
2 1 -.012* .004 .008 -.020 -.003
3 .011* .004 .010 .003 .020
4 -.017* .004 .000 -.026 -.009
5 .002 .004 .614 -.006 .011
6 .007 .004 .090 -.001 .016
3 1 -.023* .004 .000 -.032 -.014
2 -.011* .004 .010 -.020 -.003
4 -.029* .004 .000 -.O37 -.020
5 -.009* .004 .040 -.018 .000
6 -.004 .004 .387 -.012 .005
4 1 .006 .004 .206 -.003 .014
2 .01 7* .004 .000 .009 .026
3 029* .004 .000 .020 .037
5 .019* .004 .000 .011 .028
6 .025* .004 .000 .016 .033
5 1 -.Ol4* .004 .002 -.023 -.005
2 -.002 .004 .614 -.01 1 .006
3 .009* .004 .040 .000 .018
4 -.019* .004 .000 -.028 -.011
6 .005 .004 .233 4.003 .014
6 1 -.019* .004 .000 -.028 -.011
2 -.007 .004 .090 -.016 .001
3 .004 .004 .3 87 -.005 .012
4 -.025* .004 .000 -.O33 -.016
5 -.005 .004 .233 -.014 .003

 

 

 

 

 

 

 

Nate- 1 : r(xl.y2 ; 2 = r(xl.y3); 3 : r(x2, y3); 4 : r(.r2,yl); 5 : r(x3,yl); 6 : r(x3.y2)

105

Table 5.1

Measures used to represent teachers ' and students '

knowledge in 8 studies

 

 

 

 

 

 

 

 

 

Studies Measures of Teachers’ Knowledge Measures of Students’ Knowledge
Bassham Glennon test score of basic California Achievement Test (CAT)
(1962) mathematical understanding
Caezza Callahan Test of Mathematical Stanford Achievement Test (SAT)
(1969) Knowledge
Cox Dr. Leroy's Test of Mathematical Achievement Series (SRA)
(1970) Understanding
Koch Test of Understandings of the Real Grade Equivalent Scores from
(1972) Number System (TURNS) CTBS
Lampela Stoneking Test of Basic California Achievement Test (CAT)
(1966) Arithmetical Principles and

Generalizations
Moore Glennon test score of basic Achievement Series (SRA)
(l 964) mathematical understanding
Turgoose Tests of Achievement and Iowa Test Basic Skill (ITBS)
(1996) Proﬁciency (TAP)
Prekeges Test of Teacher understanding Growth in mathematics
(1973)

 

 

 

106

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

cesaozaa<
“Eta
mo. m3. Em _Z _Z m “we...
EoEo>oEo<
28.8%
. commasanU 032382
8 - 3:8
. - he: . om- 32350522 ..
2 Sq Rm 2 a _Z _2 m 98% cw ow £22 _ o
. E m av. do m3. w.
it EoEo>qu< me E: a
Eotaum . F :_ U
aoocou
:38
o «.2. Em _Z _Z m “3%
EoEo>oEo< Soc C
9855 3N3 u
59 me. 32938:: 35:
mm. mm ‘5 _Z _Z .Z E m EoEo>uEu< 3. 23¢. _2 2:2 _ EEBEEEZ _2 mm _ Emﬁaemw
EEoEmU c9220 : .
x25 o:_m> 09p :5: 02m> 09$ :5: Em 20H
m .8 ::D («o agronomic. oEmZ do EESLQEH oEaZ do do 850m
2 5:523 2 5:533. 2 ..,... 2 Sam
$9292 and. EoEo>oEo< EothoEm—z Eoeam 56,—. 032305. 5:22 Suﬁsm 553% cocagmoo

 

 

 

 

 

 

 

 

335% ms? :QCSaBQQ

.N.m Bank

107

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

38:00
omv 2 40 SW. HZ cm :4 -Aémvmotom
EoEo>oEo<
:otﬁszoU $65320ch $60 3
8.- : d MK. Hz 3 _z -Aémvmocom mm. .232 _z :32 5:22.32 _2 : N 282
EoEo>oEo< SEED
chOmmom
8.- : d a. _2 av .z -Aémvscom
EoEo>oEo<
2555:8050
was 3.3055 Ac v
E. To. 15 _Z :4 Hz _2 $525 No; so: 3 £22 .mucoEEC< _Z wm m swam?
-ooi 06mm? _ 4
~89 wcimcoum
mmHU Eob ﬁzz PC
Amohoom 633m 59:32
no. cm 40 ow. #2 mo HZ Eoﬁzam mm. 2 om £22 .momocio :4 cm m
889 $523323 83. :
050E525 :38. .3 amok cuox
chOmmom
8. EN 3 uﬁ§mvmocom
EoEo>oEo<
5:829:00
3. SN Em Aémvmmcom as :93 =
EoEo>oEo< c .n U D 83:
_Z _Z _2 Hz 6820 mm. 2 m? E22 628E252? Em m_ . .
U . you
. . “8% $884 5
3 2m Em .Aémvmocom
EoEo>oEo<
:38.
.2. EN 3 A§mv§sm
EoEo>oEo<

 

8:528 mm 035

108

$2? coca—8.80 um Agog—MES 2: E 8383 no 89:2:6 Z
AEoEmmm—U n 40 35me H Emv :cD A80: noccaéogO n 825-:qu ”Eu:

36529::2 M £36 Huston “3% 58808800 H m 688 owSA ”owmﬁLonoSomom u m :80; 638-8:88mmm H C 09$ 36%

.5me 8:80:00 u v new “mtomom H m 628:3me n m .20th 8533 u _ ”mooSom

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

6:2
E<C
1. o . a Q< ”mm—WNW No.2 “36¢ :: q 5:20coi can .. 3033
we w AU am a _ _Z :4 m :gm . m mm. 28% xv ._ 2 r Equ>mEo< .2 cm o 333:?
amok Pso— .
go 33%
. xx. 2 “mesa em I cosﬂszoU
- H A z z .
mo :2 c m mm. 53% mm ._ 2 E 5320
. ow. 2 H881 2 z wE>_om E0305 ‘ . 380$ _ : wEucgmbncs Ammo:
no :2 Em mv. bmorr 9 £32 _ 55380 on bmuh oo .: E _ Luzoawkuoumok :2 E m 33me
. 5.1 aged 31 wEwcﬁmBncb
- a s:
S :2 w m 3.. 53% NV ._ 2 5530.5

 

wosnﬁcou mtm 035.

100

@oomv 6:0 25 52 E8“ 07535 33:95 :<

O .1

Sea. 235 mat H40 mm<z H42 mama/4 “mom. .650 «.di x53
-.BAunaumum meoU
“nah Hmoncusvm

#882623» «5.35m “5:80 $5.338.

 

922E;

momma?“
Maya—Um

avg—vcoh
H
332300

86 HO 0..>m .H m

@8358
“.830“
$355

@963“
$85on

9538

.N.\ 3%:

i

\
x

)
3
IA
0
L:
5
I
0
U

L10 1 IL’IOJ 10;)

39:59.
«ED

110

 

 

 

 

.283 Boom 50¢ 832;“. g 883 58m 25 02

8:55 E 3m: ~52me £3 WEE b

 

.h 8008 hoxoom 95 028255 3 m_mbm§-ﬁoE a S now: 1:58 wcrrbncz «EH

E
\

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

N .N 333%

 

 

 

 

 

 

 

Jain-3.8.0 W AI . £0.13!!—
33 A! E.— gir: A gin A
in;
i ii:
DIG-Ill 4—“
In}!!! V
A \. Clan-10009 ‘ \Sﬂlou
31‘ Ell ‘
£55 .
d A j
{I w tales-.21; \; #33:
A AL tad
~ Vin-l :33. m w
E
_ nil-m in.» u
Hausa-pa 2.6 3 ugssmhlnﬂfogih :Bimﬂlh 13.3:0

 

 

 

111

 

’7

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Study 3‘! 3‘2 X3 ‘p-I X Y! y2 y3 yq-l yq
1 ‘J \J
2 \/ \l \J
3 v
k-2
k-l \J \l \l
k \l t ‘1
Figure 3.1.

A hypothetical meta-analysis with k studies

 

 

y]

 

 

 

 

 

 

 

 

 

 

 

 

 

7
y2 A
yq—l
3’61
Figure 3.2.

A population model for a hypothetical meta-analysis

 

x,

 

 

 

x2

 

 

 

 

 

 

 

 

 

 

\ _ .-_'r . ‘
3‘. ‘k __ :r' _ \i
‘ K M H-
- ‘ ~ \ ./-- --‘\‘
“a _ - :x‘ 0' . I.
I — ~ — - - If u _v.
: _—~-- “ \~ _ ----’
— -' P. .l I D/-
f__ " ~. I
c. ...A ' I i - ,’f
‘5 -"
o . -
‘0. _/-.
.- K
IN. -” ..-
- .3 -._
f N _
“'32,." 5.. 1: '— .L‘
-1 ‘ K \ l' l
\ ‘5. - M.
f - p, ‘\
a" ‘ f- -
- a a. ' ‘0 .
- i - "’ I g: .- . Hg.
r” w ' ‘ N ' f? ‘- - I.
I ' \ .l' .‘ ‘.x -/ .' -'\. --- —-__/
l '- ‘..__.___ .- ./ .-
3... _ l 5... -’-, V’ J. .-
5. I'll-x". c'
/ .
a. J. a I
's 3'.
‘-. i,/ 3" x a
.J‘/ .0 .- n .
/ ~n . I ‘5
.- \ .' .
‘-
. [II "n ‘l .n f-— id‘- '
ﬂ ‘ 3" ._ / \._ _:-/
i - t . " '- ' l '5
U i_: . r-~l -— - Y‘ —- 4" I“ I‘-
. g”. \ _ .ll-
1" '. .- \\- ‘j I.
. ' " —- " o
.. x . '4’
.I. . ..-
.\ I"
'. ..r'
I.- .’
"q" l?“ ‘ I 4 '1 I
I ,
. . . ___" x '1; I
I". r w ‘ , . - . f". ‘3 -‘ a. ‘0
k _‘A ‘ .' - |' ~ I . iat' 'l
— if”; __ -— _ N" .r, ‘3! ’ "_/.
1’ I. : ‘ ._-
f- .- I .I r-. ' -‘ ‘— -
.1- " X x”
r". x‘_ _',.r'
x xi
/ . / v, ,
.- I‘

Figure 3.3

Covariance structure model for ranking data with p = 4 alternatives

 

8 Example is from Atlaydeu-Olivarcs & Bockenholt (2005).

114

47', .

 

 

 

 

 

 

 

 

r5

 

 

 

ES2

Figure 4.1

Histograms of estimators when 7 is set to O

115

 

 

 

 

TU

 

 

E31

 

 

1.5

 

 

 

 

Figure 4.2

Histograms of estimators when y is set to .5

116

 

 

 

 

 

— Bias of E81
003* —- Bias of 552
O Bias of E8]
>< Bias 0fE82
X
./
0.02— '
,/
/
'/
/
0.01‘
0.00-
T I
.0 .5
True Relationship Between Two Constructs
Figure 4.3

Bias of two estimators depending on the true population relationship between two

underlying constructs (y)

117

 

—-- MSE of E81
0.10“ --- MSE of E82
0 MSE ofESl
>< MSE of E82

 

 

 

 

0.05-
0.00“
l l
.0 .5
True Relationship Between Two Constructs
Figure 4.4

MSES of two estimators depending on the true population relationship between two

underlying constructs ( y)

118

True Relationship Between Two Constructs: .0

 

 

 

 

 

-— Bias of E81
010“ —-- Bias ofESZ
O Bias ofES'l
>< Bias ofE82
0.05—
0.004 e: —e e— 4
005-
-0.10-<
l r
Rel1=ReI2= e|3=.9 l Rel1=Rel2=Rel3=2
Rel1=Rel2=Rel3=.5 Rel1=.9&Rel2=.5&Rel3=.2
Reliability of Indicators
Figure 4.5

Bias oftwo estimators depending on the reliabilities ofindicators when y is set to O

119

True Relationship Between Two Constructs: .5

 

 

 

—Bias of E81

0.10“ _- Bias of E82
O Bias ofESl
>< Bias ofE82

005‘

0.00“

Gﬁ G
005-“
-0.10-

 

 

 

l T
ReIl=Rel2=Rel3=2 l Rel1=Rel2=Rel3=B _
Rel1=Rel2=Rel3=5 Rel1=.9&Rel2=.5&Rel3=.2

Reliability of Indicators

Figure 4.6

Bias of two estimators depending on the reliabilities of indicators when 7 is set to .5

True Relationship Between Two Constructs: .0

 

 

 

 

 

0.10—
0.05—
\ A A
000-“ V
-0.05-*
-0.10-
l l
Rel1=Rel2=Rel3=.2 l Rel1=Rel2=Rel3=9
Rel1=Rel2=Rel3=.5 Rel1=.9&Rel2=.5&Rel3=.2
Reliability of Indicators
Figure 4. 7

—lvlSE of ESI
--- MSE of E82
O MSE ofESl
>< MSE ofESZ

MSES of two estimators depending on the reliabilities of indicators when 7 is set to O

121

True Relationship Between Two Constructs: .5

 

— MSE ofES'l
010* —- MSE ofE82
O MSE ofESl
>< MSE ofES2

0.00— \ M

 

 

 

 

005-
010—
l r '
Rel1=Rel2=Rel3=2 T Rel1=Re|2=Rel3=9 l
Rel1=Rel2=Rel3=5 Rel1=.9&Rel2=.5&Rel3=.2
Reliability of Indicators
Figure 4.8

MSES oftwo estimators depending on the reliabilities ofindicators when 7 is set to .5

True Relationship Between Two Constructs: .0

 

 

 

 

 

— Bias of E81
010“ —- Bias ofES2
O Bias ofESl
>< Bias ofE82
0.05—
000- 0— —& e e
0.06-
—0.10-
I r
LA.I=LA;=LA3=.45 l LA1=LA2=LA3=95
LA'l=LA2=LA3=.71 LA1=.95&LA2=.71&LA3=.45
Factor Loadings of Indicators
Figure 4.9

Biases of two estimators depending on the factor loadings of indicators when 7 is set to O

True Relationship Between Two Constructs: .5

 

 

 

— Bias of E81
0110‘ —- Bias of E82
O Bias ofESI
>< Bias of E82
0.05—
0.oo—«
Gﬁ C
p.05—
-010—

 

 

 

I I
LA1=LA4=LA3=.45 l LA1=LA2=LA3=95
LA1=LA4=LA3=71 LA1=.95&LA2=.71&LA3=.45
Factor Loadings of Indicators

Figure 4.10

Biases of two estimators depending on the factor loadings of indicators when 7 is set

to .5.

124

True Relationship Between Two Constructs: .0

 

 

 

 

 

--- MSE ofES'l
010‘ —-- MSE ofE82
O MSE ofESl
>< MSE ofE82
0.05—
\ ‘4
0.00— 0
005-
0.10-
l * T
LA1=LA2=LA3=.45 l LA1=LA2=LA3=95
LA1=LA¢=LA3=.71 LA1=.95&LA2=.71&LA3=.45
Factor Loadings of Indicators
Figure 4.11

MSES of two estimators depending on the factor loadings of indicators when 7 is set to O.

True Relationship Between Two Constructs: .5

 

 

 

—- Bias of E81
010‘ --' Bias of E82
0 Bias of ESl
>< Bias of E82
0.05—
n—- -—- -—u -—
0.00“
of G
005-“
-0.10—

 

 

 

I I
LA1=LA2=LA3=.45 l LA1=LA2=LA3=95
LA1=LA2=LA3=.71 LA1=.95&LA2=.71&LA3=.45
Factor Loadings of Indicators

Figure 4. .12

MSES of two estimators depending on the factor loadings of indicators when 7 is set

to .5.

126

True Relationship Between Two Constructs: .0

 

 

 

 

 

— Bias of ES‘I
010‘ --- Bias ofESB
O Bias of E81
>< Bias ofESZ
0.0%
0.00-4 8— 9
005-
010‘
l I
9 36
# of Studies Included
Figure 4.13

Biases of two estimators depending on k when 7 is set to O

127

True Relationship Between Two Constructs: .5

 

 

 

—Bias of E81
0.10" --- Bias of E82
0 Bias ofESl
>< Bias ofE82
005-4
C} O
0.00—
005-
010—
T T
9

 

 

# of Studies Included

Figure 4.14

Biases of two estimators depending on k when 7 is set to .5

True Relationship Between Two Constructs: .0

 

-- MSE of E81
0.10-1 --- MSE ofESZ
O MSE ofESl
>< MSE ofES2

005"

010—

 

 

 

—

# of Studies Included

Figure 4.15

MSES of two estimators depending on k when 7 is set to 0

True Relationship Between Two Constructs: .5

 

 

 

 

— MSE of E81
0.10‘ --- MSE of E82
0 MSE of E81
>< MSE ofES2
0.05-t
005‘
p.104
1 T
9 36
# of Studies Included
Figure 4.16

MSES of two estimators depending on k when 7 is set to .5

I30

True Relationship Between Two Constructs: .0

 

 

 

 

 

—-— Bias of E81
0.104 --- Bias ofES2
O Bias ofESi
>< Bias of E82
0.05“
000- O——+ #2 :G—ﬁ #0 H
-0.05-+
-0.10-4
l l l T l l T
0 1 2 3 4 5 5
# of Missing rs
Figure 4.17

Biases of two estimators depending on the number of missing rs when 7 is set to 0

131

True Relationship Between Two Constructs: .5

 

 

— Bias of E81
010‘ -——- Bias of E82
0 Bias ofESI
>< Bias ofESf.‘
005—:
0.00-
0.05—
010‘4

 

 

 

# of Missing rs

Figure 4.18

Biases of two estimators depending on the number of missing rs when 7 is set to .5

True Relationship Between Two Constructs: .0

 

 

 

 

—- MSE ofES'l
0.10"i --- MSE ofESZ
O MSE ofESl
>< MSE ofES2
0.051
0.00—
005-
-0.10J
l T T T T T T
0 1 2 3 4 5 6
# oi Missing rs
Figure 4.19

MSES of two estimators depending on the number of missing rs when 7 is set to O

True Relationship Between Two Constructs: .5

 

 

 

 

0.10“

0.05—

000—

005-7

010-
I I I I I I I
0 1 2 3 4 5 5

# of Missing Is
Figure 4.20

—1v18E ofESI
--" MSE ofESL‘
O MSE ofES'I
>< MSE MES?

MSES of two estimators depending on the number of missing rs when 7 is set to .5

134

 

0.035“

0.025“

0.02‘

Mean Bias of E81

0015‘

0.01“

0.005“

 

 

 

wl T I l I I
r(x '1 ,y2) r(x1 ,y3) r(x2 ,y3) r(x2.y1) r(x3 .y2) r(x3,y1)

Correlation Included in Meta-analysis

Figure 4.21

Bias of ESl depending on which correlation is included with 7 of .5

 

Glennon test

 

 

Callahan
Test

 

 

 

 

Number of
coursework

 

 

 

 

TAP

 

 

 

Figure 5.1.

 

 

CAT

 

 

 

 

SAT

 

CTBS

 

 

 

lTBS

 

 

SRA

 

A model for meta-analysis investigating teachers’ subject matter knowledge (SMK) and

student learning in mathematics

.3

6

 

 

 

 

True Relationship Between Two Constructs: .0

 

 

 

 

 

-— Bias of E81

0.10- —- Bias of E32
0 Bias ofESI
>< Bias of E82

0.054

-0.05-

010—

l i l l
sv1=sv;=sv3=0 sv1= sv2= sv3= . 45 lsv1=.85&sv2=.45&sv'3=.15
sv‘l =sv2=sv3=. 15 sv1= sv2= sv3=.85

Specific Variances of Indicators

Figure 6.1

Biases of two estimators depending on specific variances of indicators when 7 is set to 0

True Relationship Between Two Constructs: .5

 

 

 

 

 

 

 

—Bias of E81
010" --' Bias of E82
0 Bias ofESl
>< Bias of ES2
0.05~
"\.
.\.
.”_. ____. «-—-—- .,.... x—-—. ...,. x
0.00- \
-005—
-010-
I I
sv1= sv2=sv3=0 sv1=sv2= sv3=. 45 lsvl =.85 &sv2=. 45 &sv3=. 15
sv1= sv2= sv3=.15 sv1=sv2= sv3= . 85
Speciﬁc Variances of Indicators
Figure 6.2

Biases of two estimators depending on speciﬁc variances of indicators when 7 is set to .5

138

True Relationship Between Two Constructs: .0

 

 

 

 

 

 

—-- MSE of E81
0.30“ -- MSE of E82
0 MSE ofESI
,._ __ *_ ,__,x,___, _ x X MSE ofESZ’
025“ cr 3 ﬁt)
0.20“
0.15-
0.10“
0.05-4
0.00“
I r I
sv1=svs=sv3=0 sv’l=sv2=sv3=.45 lsv1=.85&sva=.45&sv3=.15
sv1=sv2=sv3=. 15 svl = sv2= sv3=.85
Speciﬁc Variances of Indicators
Figure 6.3

MSES of two estimators depending on speciﬁc variances of indicators when 7 is set to O

True Relationship Between Two Constructs: .5

 

 

 

 

 

 

n ._ _ “MSE ofESl
-..5'- --- MSE of E82
0 MSE ofESl
X— - —- ““4" -- --—x XMSEofESE
0.25“ 3 ‘3 ﬁ’)
020—
0.15“
0.10“
0.05“
0.00“
I F I
sv1=sv2=sv3=0 svl =sv2= sv3=. 45 lsv1=.85&sv2=.45&sv3=. 15
sv1=sv2=sv3=.15 sv1=sv2=sv3=.85
Speciﬁc Variances of Indicators
Figure 6.4

MSES of two estimators depending on speciﬁc variances of indicators when 7 is set to .5

I40

BIBLIOGRAPHY

I41

BIBLIOGRAPHY

Ahn, S., & Choi, J. (2004). T eachers' subject matter knowledge as a teacher
qualiﬁcation: A synthesis of the quantitative literature on students' mathematics
achievement. Paper presented at the annual meeting of American Educational
Research Association, San Diego, CA.

Ahn, S. & Becker, B. J. (2005, February). Incorporating quality scores in 0 meta-
analysis: A simulation study. Poster presented at the 5th Annual Campbell
Collaboration Colloquium, Lisbon, Portugal.

Allen, M. J ., & Yen, W. M. (1979). Introduction to measurement theory. Monterey:
Brooks/Cole. 40-48.

Ball, D. L. (1990a). The mathematical understandings that prospective teachers bring to
teacher education. Elementary School Journal, 90, 449-466.

Ball, D. L. (1990b). Prospective elementary and secondary teachers' understanding of
division. Journal for Research in Mathematics Education, 21(2), 132-144.

Baugh, F. (2002). Correcting effect sizes for score reliability: A reminder that
measurement and substantive issues are lined inextricably. Educational and
Psychological Measurement, 62(2), 254-263.

Becker, B. J. (1992). Using results from replicated studies to estimate linear models.
Journal of Educational Statistics, 1 7, 341-362.

Becker, B. J. (1997). Meta-analysis and models of substance abuse prevention. NIDA
Research Monograph, 1 70, 96-119.

Becker, B. J. (2001). Examining theoretical models through research synthesis: the
beneﬁts of model-driven meta-analysis. Evaluation & The Health Professions,
24(2), 190-217.

Becker, B. J ., & Schram, C. M. (1994). Examining explanatory models through research
synthesis. In H. M. Cooper & L. V. Hedges (Eds). The handbook of research
synthesis (pp. 357 - 381). New York: Russell Sage Foundation.

Becker, B. J ., & Wang, Q. (2006, April 8). Study indices based on slopes. Paper
presented at the annual meeting of American Educational Research Association,

San Francisco, CA.

Becker, B. J ., & Wu, M. (2007). The synthesis of regression slopes in meta-analysis.
Statistical Science, 22(3), 414-429.

142

Becker, B. J ., & Fahrbach, K. (1994, April). A comparison of approaches to the synthesis
of correlation matrices. Paper presented at the annual meeting of the American
Educational Research Association, New Orleans, LA.

Berk, R. (2006). Statistical inference and meta-analysis. Retrieved paper on May, 2006
from htﬁtp://repositories.cdlib.org/uclastat/papers/Z006051601

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley-
Interscience.

Borman, G. D. (2002). Experiments for educational evaluation and improvement.
Peabody Journal of Education, 77(4), 7-27.

Brown, M. A. (1988). The relationship between levels of mathematics anxiety in
elementary classroom teachers, selected teacher variables, and student
achievement in grades two through six. Unpublished doctoral dissertation, The
American University.

Brown, S. P., & Peterson, R. A. (1993). Antecedents and consequences of salesperson job
satisfaction: Meta-analysis and assessment of causal effects. Journal of Marketing
Research, 30, 63-77.

Casella, G., & Berger, R. L. (1990). Statistical inference. Duxbury Press, Belmont, CA.

Chalmers, I., Hedges, L. V., & Cooper, H. (2002). A brief history of research synthesis.
Evaluation & the Health Professions, 25(1), 12-37.

Chaney, B. (1995). Student outcomes and the professional preparation of eighth grade
teachers in science and mathematics: NSF/NELS.'88 Teacher transcript analysis
(regression). Rockford MD: Westat, Inc.

Chiang, F. -S. (1996). Ability, motivation, and performance: A quantitative study of
teacher effects on student mathematics achievement using NELS:88 Data.
Unpublished doctoral dissertation, University of Michigan, Ann Arbor.

Choi, J ., & Ahn, S. (2003) Measuring teachers ' subject-matter knowledge. Presented at
the annual meeting of the American Educational Research Association, Chicago,
IL.

Choi, J ., Ahn, S., & Kennedy, M. (Under review). Role of teacher's subject matter
knowledge. Manuscript submitted to the Journal on June, 2007

Cheung, M. W. -L. & Chan, W. (2005). Meta-analytic structural equation modeling: A
two-stage approach. Psychological Methods, 10(1), 40-64.

Cheung, S. F. (2000). Examining solutions to two practical issues in meta-analysis:
Dependent correlations and missing data in correlation matrices. Dissertation
Abstracts International, 6 I , SB.

Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory.
New York: Holt.

Crowl, A., Ahn, S., & Baker, J. (In press). A meta-analysis of developmental outcomes
for children of same sex and heterosexual parents. Journal of GLBT Family
Studies.

Dominici, F ., Pannigiani, G., Reckhow, K. H., & Wolpert, R. L. (1997). Combining
information from related regressions. Journal of A gricultural, Biological, and
Environmental Statistics, 2(3), 313-332.

Farley, J. U., Lehman, D. R., & Ryan, M. J. (1981). Generalizing from “imperfect”
replication. The Journal of Business, 54(4), 597-610.

Field, A. P. (2001). Meta-analysis of correlation coefﬁcients: A monte carlo comparison
of ﬁxed- and random-effects methods, Psychological Methods, 6(2), 161-180.

F urlow, C. F. (2003). Meta-analytic methods of pooling correlation matrices for
structural equation modeling under different patterns of missing data.
Unpublished doctoral dissertation, University of Texas, Austin.

Furlow, C. F ., & Beretvas, S. N. (2005). Meta-analytic methods of pooling correlation
matrices for structural equation modeling under different patterns of missing data.
Psychological Methods, 10(2), 227-254.

Gelman, A. (1995). Methods of moments using monte carlo simulation. Journal of
Computational and Graphical Statistics, 4(1), 36-54.

Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational
Researcher, 5, 3-8.

Gleser, L. J ., & Olkin, I. ( 1994). Stochastically dependent effect sizes. In H. M. Cooper
& L. V. Hedges (Eds). The handbook of research synthesis (pp. 339 - 356). New
York: Russell Sage Foundation.

Greene, W. H. (1997). Econometric analysis. New York, NY: Macmillian.

Hancock, G. R. (1997). Disattenuated for score reliability: A structural equation
modeling approach. Educational and Psychological Measurement, 5 7(4), 598-606.

Hedges, L. V. (1983). Combining independent estimators in research synthesis. The
British Journal of Mathematical and Statistical Psychology, 36, 123—131.

144

Hedges, L. V., & Olkin, l. (1985). Statistical methods/or meta-analysis. Orlando:
Academic Press.

Hill, H., Rowan, R., & Ball, D. (2005) Effects of teachers' mathematical knowledge for
teaching on student achievement. American Educational Research Journal, 42
(2), 371-406.

Hill, H. C., Schilling, S. G., & Ball, D. L. (2004). Developing measures of teachers'
mathematics knowledge for teaching. The Elementary School Journal, 105(1), 1 1-
30.

Hom, P. W., Caranikas-Walker, F ., Prussia, G. E., & Griffeth, R. W. (1992). A meta
analytical structural equations analysis in a model of employee turnover. Journal
of Applied Psychology, 77, 890-909.

Hunter, J .E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis: Cumulating
research findings across studies. Beverly Hills, CA: Sage Publication.

Hunter, J. E., & Schmidt, F. L. (1990). Methods of meta-analysis: Correcting error and
bias in research findings. New York: Russell Sage Foundation.

Hunter, J. E., & Schmidt, F. L. (1994). Correcting for sources of artiﬁcial variation across
studies. In H. M. Cooper & L. V. Hedges (Eds). The handbook of research
synthesis (pp. 323 - 336). New York: Russell Sage Foundation.

Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and
bias in research ﬁndings (2“ ed. ). New York: Russell Sage Foundation.

Kennedy, M. (2007). Finding cause in a self-propelled world. Unpublished manuscript.
Retrieved on August, 2007 from http://www.msu.edu/user/mkennedy/TQQTL

Kennedy, M., Ahn, S., & Choi, J. (2008). The value added by teacher education. In M.
Cochran-Smith, S. Feiman-Nemser & J. McIntyre (Eds), Handbook of research
on teacher education: Enduring issues in changing contexts (3rd ed.). Mahwah,
NJ: Lawrence Erlbaum Associates, Inc.

Lambert, R. G., & Curlette, W. L. (1995, April). The robustness of the standard error of
summarized, corrected validity coeﬂicients to non-independence and non-
normality of primary data. Paper presented at the annual meeting of the American
Educational Research Association, San Francisco, CA.

Le, H. A. (2003). Correcting for indirect range restriction in meta-analysis: Testing a
new meta-analytic method. Unpublished doctoral dissertation, University of Iowa,

Iowa City, IA.

Le, H., & Schmidt, F .L. (2006) Correcting for indirect range restriction in meta-analysis:

145

Testing a new meta-analytic procedure. Psychological Methods, 11, 416-438.
Lipsey, M., & Wilson, D. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.

Maydeu-Olivares, A., & Bockenholt, U. (2005). Structural equation modeling of paired
comparison and ranking data. Psychological Methods, 10(3), 285-304.

Messick, S. (1993). Validity. In Linn, R. L. (Ed.). Educational measurement (pp. 13-104).
Phoenix: The Oryx Press.

Nugent, W. R. (2006). The comparability of the standardized mean difference effect size
across different measures of the same construct: Measurement considerations.
Psychological Measurement, 66(4), 612-623.

National Council of Teachers of Mathematics. (n.d.). Retrieved January 8, 2008, from
http://standards.nctrn.org/document/chapter6/index.htm.

Olkin, I., & Siotani, M. (1976). Asymptotic distribution of functions of a correlation
matrix. In S. Ikeda (Ed), Essays in probability and statistics (pp. 235-251).
Tokyo: Shinko Tsusho.

Oswald, F. L., & Converse, P. D. (2005). Correcting for reliability and range-restriction
in meta-analysis. Paper presented at the 20th annual conference of the Society for
Industrial and Organizational Psychology, Los Angeles, CA.

Oswald, F. L., & Johnson, J. W. (1998). On the robustness, bias, and stability of statistics
from meta-analysis of correlation coefﬁcients: Some initial monte carlo ﬁndings.
Journal of Applied Psychology, 83(2), 164-178.

Peterson, R. A., & Brown, S. P. (2005). On the use of beta coefﬁcients in meta-analysis.
Journal of Applied Psychology, 90(1), 175-181.

Premack, S. L., & Hunter, J. E. (1988). Individual unionization decisions. Psychological
Bulletin, 97, 274—285.

Qu, Y., & Becker, B. J. (2003). Does traditional teacher certification imply quality? A
meta-analysis. Paper presented at the annual meeting of American Educational
Research Association, Chicago, IL.

R Development Core Team (2008). R: A language and environment for statistical
computing. R Foundation for Statistical Computing, Vienna, Austria.

Raju, N. S., Anselmi, T. V., Goodman, J. S., & Thomas, A. (1998). The effect of

correlated artifacts and true validity on the accuracy of parameter estimation in
validity generalization. Personnel Psychology, 51, 453-465.

146

Raju, N. S., & Brand, P. A. (2003). Determining the signiﬁcance of correlations corrected
for unreliability and range restriction. Applied Psychological Measurement, 27,
52-71.

Raju, N. S., Burke, M. J., Normand, J., & Langlois, G. M. (1991). A new meta-analytic
approach. Journal of Applied Psychology, 76(3), 432-446.

Raju, N. S., Fralicx, R., & Steinhaus, S. D. (1986). Covariance and regression slope
models for studying validity generalization. Applied Psychological Measurement,
10(2), 195-211.

Raudenbush, S. W., Becker, B. J ., & Kalaian, K. (1988). Modeling multivariate effect
sizes. Psychological Bulletin, 103(1), 111-120.

Rubin, D. B. (1992). Meta-analysis: Literature synthesis or effect-size surface estimation?
Journal of Educational Statistics, 1 7(4), 363-3 74.

Sackett, P. R., & Yang, H. (2000). Correction for range restriction: An expanded
typology. Journal of Applied Psychology, 85, 112-118.

Schmidt, F. L., Hunter, J. E., & Outerbridge, A. N. (1986). Impact of job experience and
ability on job knowledge, work sample performance, and supervisory ratings of
job performance. Journal of Applied Psychology, 71(3), 432-439.

Shadish, W. R., & Haddock, C. K. (1994). Combining estimates of effect size. In H. M.
Cooper & L. V. Hedges (Eds). The handbook of research synthesis (pp. 261 -
282). New York: Russell Sage Foundation.

Shulman, L. S. (1986). Those who understand: Knowledge growth in teaching.
Educational Researcher, 15(2), 4-14.

Slavin, R. E. (1984). Meta-analysis in education: How has it been used? Educational
Researcher, 13(8), 6-15.

Slavin, R. E. (2008). What works? Issues in synthesizing educational program
evaluations. Educational Researcher, 3 7(1), 5-14.

Teddlie, C., Falk, W., & Falkowski, C. (1983, April). The contribution of principal and
teacher inputs to student achievement. Paper presented at the annual meeting of
the American Educational Research Association, Montreal, Quebec, Canada.

Thum, Y. M., & Ahn, S. (2007). Challenges of meta-analysis from the standpoint of a

Latent variable framework. Paper presented at the 7th Campbell collaboration
colloquium, London, England.

147

Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 79, 281—
299.

Timrn, N. H. (2004). Estimating effect sizes in exploratory experimental studies when
using a linear model. The American Statistician, 58(3), 213-217.

Towne, L., Wise, L. L., & Winters, T. M. (Eds). (2005). Advancing scientific research in
education. Washington, DC: National Academic Press. Available from
http://www.NAP.edu.

Vanhonacker, W. R., Lehman, D. R., & Sultan, F. (1990). Combining related and sparse
data in linear regression models. Journal of Business & Economic Statistics, 8(3),
327-335.

Whiteside, M. F., & Becker, B. J. (2000). Parental factors and the young child's
postdivorce adjustment: a meta-analysis with implications for parenting
arrangements. Journal of Family Psychology, 14(1), 5-26.

Wu, M. (2006a, April). Applications of Generalized Least Squares and F actored
Likelihood in Synthesizing Regression Studies. Paper presented at the American
Educational Research Association, San Francisco, CA.

Wu, M. (2006b). Methods of meta-analyzing regression studies: Applications of

Generalized Least Squares and Factored Likelihoods. Unpublished Doctoral
dissertation, Michigan State University.

148

       

11111111111111 '

56 7793

       

11111111111111le

930