EXPLORING THE ESTIMATION OF EXAMINEE LOCATIONS
USING MULTIDIMENSIONAL LATENT TRAIT MODELS
UNDER DIFFERENT DISTRIBUTIONAL ASSUMPTIONS
By
HYESUK JANG

A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
Measurement and Quantitative Methods - Doctor of Philosophy
2014

ABSTRACT
This study evaluates a multidimensional latent trait model to determine how well
the model works in various empirical contexts. Although these latent trait models assume
that the traits are normally distributed, situations in which the latent trait does not follow
a normal distribution may occur (Sass et al., 2008; Woods & Thissen, 2006). As a result,
when studies construct evaluations or comparisons in order to identify appropriate
estimation methods and to avoid inefficient ones, the distribution or distributional statistics of the
latent trait are treated as a key assumption. This study explores the performance of parameter
estimation using a bifactor model, a type of multidimensional latent trait model, in order to
provide information about the effects of violations of the distributional assumptions.
The effects of the distributional assumptions are evaluated using simulation studies. A
two-parameter logistic bifactor model with three factors (one general and two specific) is
used as the basic multidimensional latent trait model. The simulation studies construct eight
distributional conditions based on the degree of skewedness of the general factor, the directions
of skewedness of the specific factors, and the correlation between specific factors, together with
four types of item parameter conditions.
The results showed that item parameter estimation was affected by the degree of
skewedness of the general factor, the directions of skewedness of the specific factors, and the
correlation between specific factors. These conditions of the latent trait distributions had
different effects on item parameter estimation depending on the type of item parameter. Based on
the variances of the mean biases and correlations between generated and estimated parameters,
the most important condition of the latent trait distribution for θ parameter estimation was the
correlation between the specific factors. With the increasing number of studies and practical need
for multidimensional structures of latent traits, this research provides useful guidelines for
constructing appropriate multidimensional models.

Key words: Multidimensional latent trait model, bifactor model, latent trait distribution,
simulation study

ACKNOWLEDGMENTS

I would like to deeply thank my advisor and committee members for supporting me
throughout my doctoral study. Dr. Mark Reckase always encouraged and supported me with not
only his professional knowledge and skills but also sincere words and emotional care. With his
advice and care, I completed my doctoral study and dissertation. Dr. Kenneth Frank gave me
unstinting support all the time. He motivated me to go forward and guided me with valuable
directions and warm messages. I really appreciate Dr. Kimberly Maier. With thoughtful advice
and support, she helped and guided me about both academic and personal issues during the
doctoral study. I am very grateful to Dr. Chae Young Lim for helping and guiding me with my
doctoral study and personal plan. She always helped me with her professional knowledge and
kind directions.
My acknowledgement goes to my mentors, friends and colleagues. Without their
generous sharing and hearty support, it would have been impossible for me to complete my
doctoral studies. All of the things that I have shared with them helped me to go through my
doctoral study and they are valuable memories for me.
Lastly, I am very grateful to my parents Daeyong Jang and Jinsook Jeong, my brother
Hyungoo Jang, my sister-in-law Soyeon Han, and my beloved nephews Youwhan Jang and
Youhyun Jang. I was able to accomplish my doctoral study with their unconditional love, trust
and support. From the bottom of my heart, I would like to thank my entire family.


TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
1. Introduction
1.1 Multidimensional Latent Trait
1.2 Multidimensional Latent Trait Models
1.3 Latent Trait Distributions
1.4 Simulation Study
1.5 Research Question
2. Literature Review
2.1 Bifactor Model
2.2 Previous Research on Latent Trait Distributions
2.3 Simulation Study
2.3.1 Simulation as a Research Methodology
2.3.2 Simulation Studies of Latent Trait Models
a. Distribution of Latent Trait
b. Intercorrelation between Latent Traits
c. Discrimination Parameter
d. Difficulty Parameter
e. Replication
f. Number of Items
g. Number of Examinees
h. Estimation Methods
i. Computer Programs for Parameter Estimation
2.4 Empirical Data Analysis
2.4.1 Distribution of Latent Traits in PISA 2003, 2006, and 2009
2.4.2 Sub-domain Proficiency Levels in PISA
3. Method
3.1 Data Generation
3.1.1 Data Generating of Latent Trait Distributions
3.1.2 Data Generating of Item Parameters
3.1.3 Data Generating of Examinees’ Responses
3.2 Evaluation Methods
4. Results
4.1 Data Generation
4.1.1 Item Parameter Generation
4.1.2 θ Parameter Generation
4.2 Bifactor Analysis
4.2.1 Item Parameters
4.2.2 θ Parameters
a. Mean of Mean Biases
b. Variances of Mean Biases
c. Correlations between Generated and Estimated Parameter Distributions
d. Kolmogorov-Smirnov Test (KS test)
5. Discussion
5.1 Summary of the Results
5.2 Implications
APPENDICES
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
BIBLIOGRAPHY

LIST OF TABLES

Table 2-1. Comparison of Three Research Methodologies
Table 2-2. Cut Scores of Reading Literacy Proficiency Levels in PISA 2009
Table 2-3. Percentage Distribution of Proficiency Level Scores in PISA 2000, 2003 and 2009
Table 3-1. Simulation Conditions for Data Generation
Table 3-2. Simulation Combinations of Latent Trait Distributions
Table 3-3. Parameters for Generating Distributions of Simulation Combinations of Item Parameters
Table 4-1. Descriptive Statistics of Generated Item Parameters
Table 4-2. Descriptive Statistics of General θ Parameters Generated
Table 4-3. Descriptive Statistics of First Specific θ Parameters Generated
Table 4-4. Descriptive Statistics of Second Specific θ Parameters Generated
Table 4-5. Correlations Between Two Specific θ Parameters Generated
Table 4-6. Means of Item Parameter Mean Bias
Table 4-7. Variances of Item Parameter Mean Bias
Table 4-8. Means of θ Parameter Mean Bias
Table 4-9. Variances of θ Parameter Mean Bias
Table 4-10. Mean of the Correlations of the General Factors
Table 4-11. Correlation Means of the First Specific Factors
Table 4-12. Correlation Means of the Second Specific Factors
Table 4-13. Frequency of Significant Differences between Distributions of Generating and Estimated General Factor θ Parameters
Table A-1. Parameter Estimates of Quadratic Regression Function for Positively Skewed Distribution with 2,000 examinees
Table A-2. Parameter Estimates of Cubic Regression Function for Positively Skewed Distribution with 2,000 examinees
Table A-3. Parameter Estimates of Quadratic Regression Function for Positively Skewed Distribution with 10,000 examinees
Table A-4. Parameter Estimates of Cubic Regression Function for Positively Skewed Distribution with 10,000 examinees
Table A-5. Parameter Estimates of Quadratic Regression Function for Negatively Skewed Distribution with 2,000 examinees
Table A-6. Parameter Estimates of Cubic Regression Function for Negatively Skewed Distribution with 2,000 examinees
Table A-7. Parameter Estimates of Quadratic Regression Function for Negatively Skewed Distribution with 10,000 examinees
Table A-8. Parameter Estimates of Cubic Regression Function for Negatively Skewed Distribution with 10,000 examinees
Table B-1. Mean and Variance of Item Parameter Biases under Disc of 1.3 and Diff of -0.5
Table B-2. Mean and Variance of Item Parameter Biases under Disc of 1.3 and Diff of 0.5
Table B-3. Mean and Variance of Item Parameter Biases under Disc of 1.8 and Diff of -0.5
Table B-4. Mean and Variance of Item Parameter Biases under Disc of 1.8 and Diff of 0.5
Table C-1. Mean and Variance of θ Parameter Biases under Disc. of 1.3 and Diff. of -0.5
Table C-2. Mean and Variance of θ Parameter Biases under Disc. of 1.3 and Diff. of 0.5
Table C-3. Mean and Variance of θ Parameter Biases under Disc. of 1.8 and Diff. of -0.5
Table C-4. Mean and Variance of θ Parameter Biases under Disc. of 1.8 and Diff. of 0.5
Table D-1. Correlations between Generated and Estimated Factors with Discrimination Parameters from mean of 1.3
Table D-2. Correlations between Generated and Estimated Factors with Discrimination Parameters from mean of 1.8
Table E-1. Numbers of Frequencies Significant by KS Test under Condition 1
Table E-2. Numbers of Frequencies Significant by KS Test under Condition 2
Table E-3. Numbers of Frequencies Significant by KS Test under Condition 3
Table E-4. Numbers of Frequencies Significant by KS Test under Condition 4
Table E-5. Numbers of Frequencies Significant by KS Test under Condition 5
Table E-6. Numbers of Frequencies Significant by KS Test under Condition 6
Table E-7. Numbers of Frequencies Significant by KS Test under Condition 7
Table E-8. Numbers of Frequencies Significant by KS Test under Condition 8

LIST OF FIGURES

Figure 1-1. Examples of Multidimensional Latent Variable Models (Reise et al., 2007; Reise et al., 2010)
Figure 2-1. Percentage Distribution of Proficiency Level in PISA 2000, 2003 and 2006
Figure 2-2. Percentage Distribution of Proficiency Level in PISA 2009
Figure 2-3. Percentage Distribution of Sub-domains in PISA 2000, 2003, 2006, and 2009

1. Introduction
1.1 Multidimensional Latent Trait
The work on latent traits started in the 1950s. According to Gifford (1978), the term
‘latent trait’ appeared in Lazarsfeld (1950), and latent trait theory was first developed by
Lord (1952, 1953a, 1953b). Latent traits are unobservable and cannot be measured directly. In
latent trait theory, the latent trait is portrayed as underlying participants’ performance on sets of
test items, which is why it is called a “latent” trait or ability (Gifford, 1978). Test items are used
to collect participants’ responses to particular stimuli, and based on the response features in
the collected data, the characteristics of the participants and items may be estimated by using a
latent trait model.
To give measurement a basis in scientific method, item response theory (IRT), a latent trait
theory, has been developed to describe the relationship between participants’ responses and their
ability levels by means of a mathematical function (Lord, 1980). The models used in item response
theory can be classified as unidimensional item response theory (UIRT) or multidimensional
item response theory (MIRT) models, depending on whether the number of latent traits modeled
is one or more than one. According to Reckase (2009), work in fields such as
education, psychology, and statistics suggests that the structure of human knowledge is
complicated, and that the processes that produce observed responses to test items are often
complex and varied. As a result, MIRT has been
developed to better fit reality. Chalmers (2012) also suggested that even though unidimensional
models can be useful, it is essential to consider dimensionality in order to adequately specify the
nature of measures with complicated structures.
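
As an illustration of such a mathematical function, the two-parameter logistic model is a standard form in the IRT literature. The notation below is chosen here for illustration and is not taken from this chapter: $\theta_j$ is the latent trait of examinee $j$, $a_i$ and $b_i$ are the discrimination and difficulty of item $i$, and in the compensatory multidimensional version $\boldsymbol{\theta}_j$ and $\mathbf{a}_i$ are vectors and $d_i$ is an intercept.

```latex
\begin{align}
P(u_{ij} = 1 \mid \theta_j) &= \frac{1}{1 + \exp\left[-a_i(\theta_j - b_i)\right]}, \\
P(u_{ij} = 1 \mid \boldsymbol{\theta}_j) &= \frac{1}{1 + \exp\left[-(\mathbf{a}_i^{\top}\boldsymbol{\theta}_j + d_i)\right]}.
\end{align}
```

When only one trait is modeled, the second expression reduces to the first with $d_i = -a_i b_i$.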


Many researchers have also considered multidimensionality in measuring particular
constructs of interest. To produce test items that follow an expected factor structure, item
construction studies have constructed and analyzed test item data using multidimensional latent
trait models, for example in educational assessment (OECD, 2007a, 2007b; von Davier, 2008;
Hichendorff, 2013) and for psychological or sociological constructs (Capella & Turner,
2004; Yoshida & James, 2010; Eboli & Mazzulla, 2007; Martin, 2007; Duncan-Jones, 1981a,
1981b). In the assessment of science literacy in PISA 2006, for example, the test
covered three content areas: earth and space systems, living systems, and physical systems.
According to OECD (2007a, 2007b), a particular country’s average scores on science questions
from different content areas tend to vary. This suggests that even though the test
examines science literacy as a general latent trait, the pattern of the students’ ability
distribution can differ across the sub-latent traits (OECD, 2007a). In this case, modeling
only the total latent trait and ignoring the sub-domains could result in scores that are not easily
interpretable, or in erroneous policy decisions, because of the lack of information about the
student latent trait.
Capella and Turner (2004) developed an instrument to measure customer satisfaction with
vocational rehabilitation services. In this research, the customer satisfaction survey considered
four components of satisfaction: counselor interpersonal factors, counselor job effectiveness,
satisfaction with the services, and satisfaction with the agency. Confirmatory factor analysis
indicated that the satisfaction instrument consisted of three dimensions, reflecting the
counselor that the customers interacted with, the services that the customers received, and the
agency that provided the services. The research showed that customer satisfaction can be
described as consisting of multiple latent traits.

[Figure 1-1 shows three path diagrams. Model A: Multiple Correlated Traits; Model B: Second Order; Model C: Bifactor.]
Figure 1-1. Examples of Multidimensional Latent Variable Models (Reise et al., 2007; Reise et al.,
2010)
Multidimensional measurement and analytical methods have also been used in social network
analysis of human interactions. One effort to measure interactions that has
attracted interest is the Interview Schedule for Social Interaction (ISSI),
which was developed by the Social Psychiatry Research Unit at the Australian National
University (Duncan-Jones, 1981a, 1981b). The survey consisted of 50 items asking about the
availability or adequacy of social interaction and attachment, and about acquaintances, friends,
attachment, opportunities for nurturance and reassurance of worth, and reliable alliances.
Duncan-Jones (1981) evaluated and characterized the structure of the ISSI according to
subdomains of social relationships by using confirmatory factor analysis.
1.2 Multidimensional Latent Trait Models
To describe various item content, formats, and relationships between multiple factors,
various latent trait models have been developed and used. Reise et al. (2007) provide examples
of multidimensional latent models, three of which are shown in Figure 1-1. Circles represent
dimensions or latent factors, and rectangles represent items used for measuring the factors.
Model A is a typical multidimensional correlated traits model. Each latent trait is related to some
of the items, and it is assumed that there is a correlation between the factors. Model B shows a
multidimensional model with a higher order structure of latent traits, which is often referred to as
a second-order factor model. As in Model A, the latent traits at the lower level are measured by
some of the items. The difference is the presence of a second-order trait explaining the
correlation between the first-level traits. Although Models A and B measure certain factors
related to common parts of the items and show relationships among the latent traits, they do not
include a general factor directly related to all of the items. Model C shows the structure of a
bifactor model. The bifactor model has two kinds of factors: a ‘general’ factor connected to
all items, accounting for the item intercorrelations, and several ‘group’ or ‘specific’ factors,
each connected to some of the items and representing additional covariance unexplained by the
general factor.
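
The bifactor structure described above can be sketched as a 0/1 loading pattern. This is a minimal illustration, not the dissertation's code: the six-item layout and the function names `bifactor_pattern` and `is_bifactor` are hypothetical. Column 0 stands for the general factor and the remaining columns for the specific factors.

```python
def bifactor_pattern(item_groups):
    """Build a 0/1 loading pattern for a bifactor structure.

    item_groups[i] is the index (0, 1, ...) of the specific factor
    that item i belongs to. Column 0 of each row is the general factor.
    """
    n_specific = max(item_groups) + 1
    pattern = []
    for group in item_groups:
        row = [1] + [0] * n_specific   # every item loads on the general factor
        row[1 + group] = 1             # and on exactly one specific factor
        pattern.append(row)
    return pattern


def is_bifactor(pattern):
    """Check that each item loads on the general factor and on at most one specific factor."""
    return all(row[0] == 1 and sum(row[1:]) <= 1 for row in pattern)


# Hypothetical layout: items 0-2 form the first cluster, items 3-5 the second.
pattern = bifactor_pattern([0, 0, 0, 1, 1, 1])
for row in pattern:
    print(row)
```

A correlated-traits or second-order structure (Models A and B) would have no column of ones, which is the feature that distinguishes Model C.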
The bifactor model has a mathematical relationship with other models specifying
multidimensional structures of test items. Rijmen (2010) compared the bifactor model to
two other multidimensional IRT models: the testlet model and the second-order model. The
research demonstrated that while all three models take account of item clusters, they differ in
how they treat the specific factor loadings. The testlet model constrains
the loadings on the specific factors to be proportional to the general factor loadings
within each testlet. Under the assumption that the second-order model also has
proportional specific factor loadings, the second-order model, like the testlet model, can be
described as a restricted form of the bifactor model. Therefore, research using the bifactor model
can be viewed as relevant to multidimensional models in general, including those described above
that take item clusters into account.

1.3 Latent Trait Distributions
Many studies using latent trait models focus on the latent trait estimate. To estimate the
latent trait means to locate each examinee on each continuous scale, allowing us to
investigate an examinee’s status on each latent trait, or to compare examinees’ relative statuses
(Reckase, 2009). Hambleton et al. (1991) emphasized the importance of the latent trait by
noting that IRT models are based on two postulates: first, that participants’
performance on test items can be explained by their latent traits, and second, that the
relationship between their item performance and the traits can be modeled by a particular family
of item characteristic functions. Among the parameters in latent trait models, which include item
difficulty, item discrimination, item guessing, and person latent trait parameters,
Sass et al. (2008) suggested that the latent trait parameters be considered the most important to
estimate, because the latent trait estimates can be used to determine an examinee’s proficiency
classification or standing on a psychological construct.
The estimation of the latent traits is important not only for providing examinees’
proficiency classifications or measurements of psychological constructs but also because it relates
to a central assumption of latent trait models. Most factor models assume a normal distribution
of the latent trait; however, situations in which the latent trait is not normally distributed may
occur in reality (Sass et al., 2008; Woods & Thissen, 2006). Violation of the assumption that the
latent trait is normally distributed is a critical issue because it affects confidence in the estimates
from statistical models, which have desirable regularity properties only under conditions
consistent with the assumptions. Researchers who study parameter
estimation for latent trait models have used various estimation methods, such as maximum
likelihood, least squares, and Bayesian estimators. All of these estimation methods use the
distribution or distributional statistics of the latent trait, so assumptions about the latent trait
distribution are an important issue. Many studies have evaluated or compared
estimation methods in order to identify the most appropriate methods and to avoid
inefficient ones (Finch, 2010; Cai et al., 2011; Li & Lissitz, 2012; Woods & Thissen, 2006). As
an extension of these studies, my research evaluates the estimation performance of a
multidimensional model under conditions characterized by various distributional assumptions
about the latent traits.
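
To make the idea of a non-normal latent trait concrete, the sketch below draws standardized samples that are normal, positively skewed, and negatively skewed. It is only an illustration under stated assumptions: it uses a rescaled gamma distribution, which is not necessarily the generating method used in this dissertation, and the function names, sample size, seed, and shape value are all hypothetical.

```python
import random
import statistics


def sample_skewness(xs):
    """Moment-based sample skewness: mean of cubed standardized values."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def standardized_gamma(n, shape, rng, negative=False):
    """Draw n values from Gamma(shape, scale=1), rescaled to mean 0 and variance 1.

    Gamma(shape, 1) has mean = shape, variance = shape, and skewness 2 / sqrt(shape).
    Setting negative=True mirrors the distribution to obtain negative skew.
    """
    sign = -1.0 if negative else 1.0
    return [sign * (rng.gammavariate(shape, 1.0) - shape) / shape ** 0.5
            for _ in range(n)]


rng = random.Random(2014)                      # fixed seed for reproducibility
n = 10000                                      # hypothetical number of examinees
normal_theta = [rng.gauss(0.0, 1.0) for _ in range(n)]
positive_theta = standardized_gamma(n, shape=4.0, rng=rng)             # theoretical skewness 1.0
negative_theta = standardized_gamma(n, shape=4.0, rng=rng, negative=True)

for name, theta in [("normal", normal_theta),
                    ("positive skew", positive_theta),
                    ("negative skew", negative_theta)]:
    print(name, round(statistics.fmean(theta), 3), round(sample_skewness(theta), 3))
```

All three samples share approximately the same mean and variance, so any difference in estimation quality among them would be attributable to distributional shape rather than to location or scale.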
1.4 Simulation Study
The study of latent trait distributions is important for evaluating the performance of
latent trait models and their estimation processes. Many of the studies that evaluate estimation
quality use simulation, and this study, which explores the estimation performance of
multidimensional latent trait models, does so as well. Simulation studies have been popular in
various fields of research, and the large number of simulation studies in certain fields or on
particular topics shows that many researchers continue to rely on them (Axelrod, 2005). One reason
for using a simulation study is a lack of appropriate empirical data. In practice,
collecting data requires a lot of time and effort, and sometimes it is hard to obtain the data that we
really want to analyze. When research conditions do not permit researchers to collect
appropriate data to address a particular question, such data can be simulated. However, leaving
aside the practical limitations that motivate simulation, simulation studies also provide
significant benefits in their own right.
First, by using a simulation study, we can reuse existing information derived from
previous research to conduct deeper research or develop a sequential line of research. If we have
parameter estimates from an original dataset related to our research interest, using them to generate
additional data samples may be a better use of resources than collecting data again.
Simulation studies provide not only the tools to use available information, but
also the opportunity to study unobserved research conditions. Empirical
analysis is not possible without collecting data, unless the analysis uses secondhand or
published data. In simulation studies, by contrast, conditions that cannot easily be created or
may never have occurred can be produced. The results from these studies can also help predict and
prepare for empirical situations that might occur in the future.
Simulation methods also allow us to replicate statistical analyses very efficiently.
Replication allows us to confirm that the results from a simulation study are reliable, or to
test whether the inferences from various models are robust (Axelrod, 2005). For example, in order to
compare estimation methods, we need to examine the amount of estimation error and other statistical
error that is unrelated to our research interests but inevitably occurs. In this case, one set of
empirical data is not enough to compare the differences among all conditions of interest. Finally,
the results from a simulation study provide methodological information that can be discussed
in relation to empirical applications, as I will do in this study.
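
The role of replication can be illustrated with a small sketch that summarizes estimation error across replications using the mean of the mean biases, the variance of the mean biases, and the correlation between generated and estimated parameters. This is an assumption-laden illustration, not the study's procedure: the generating values, the number of replications, and the helper `stand_in_estimates` (which simply adds random noise in place of a real calibration run) are hypothetical.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mean_bias(generated, estimated):
    """Average of (estimate - generating value) over all parameters."""
    return statistics.fmean([e - g for g, e in zip(generated, estimated)])


rng = random.Random(7)
# Hypothetical generating discrimination values for 30 items.
generated = [rng.uniform(0.8, 2.0) for _ in range(30)]


def stand_in_estimates(values, noise_sd=0.1):
    """Stand-in for one calibration run: generating value plus random error."""
    return [v + rng.gauss(0.0, noise_sd) for v in values]


# One set of estimates per replication.
replications = [stand_in_estimates(generated) for _ in range(100)]
biases = [mean_bias(generated, est) for est in replications]
correlations = [pearson(generated, est) for est in replications]

mean_of_mean_biases = statistics.fmean(biases)
variance_of_mean_biases = statistics.variance(biases)
mean_correlation = statistics.fmean(correlations)
print(mean_of_mean_biases, variance_of_mean_biases, mean_correlation)
```

A single replication yields one bias value and one correlation; only across many replications can their average and spread be separated from chance variation.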
1.5 Research Question
This research evaluates the estimation performance of a multidimensional latent trait model
for latent trait and item parameters under various latent trait distribution conditions. The
multidimensional latent trait model used for this research is a two-parameter bifactor
model with one general factor and two specific factors. In order to answer the research questions,
a simulation study is constructed with conditions representing different latent trait distributions
and sets of item parameters. The latent trait distribution conditions are characterized
by the degree of skewedness of the distribution, the direction of the
skewedness, and the intercorrelation between specific factors. The latent trait distribution
conditions include four combinations of general and specific latent trait distributions with
particular skewedness: 1) normal general factor and non-normal specific factors skewed in the
same direction; 2) normal general factor and non-normal specific factors skewed in different
directions; 3) non-normal general factor and non-normal specific factors skewed in the same
direction; and 4) non-normal general factor and non-normal specific factors skewed in different
directions. Parameter estimation is evaluated under two levels of intercorrelation between specific
factors. Further, four different sets of item difficulty and discrimination parameters are
considered. In total, eight conditions of latent trait distributions and four conditions of item
parameters are constructed through simulation. Specific descriptions of the simulation study
design are provided in Chapter 3.
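
The design described above can be enumerated as a small grid. The sketch below is illustrative only: the labels for the two correlation levels are placeholders because their values are not stated in this chapter, the discrimination and difficulty values (1.3, 1.8, -0.5, 0.5) are taken from the appendix table titles in the List of Tables, and fully crossing the distribution and item conditions is an assumption.

```python
from itertools import product

# Factors defining the latent trait distribution conditions.
general_shapes = ["normal", "non-normal"]
specific_directions = ["same direction", "different directions"]
specific_correlations = ["lower", "higher"]    # placeholder labels, values not given here

# Factors defining the item parameter conditions (means named in the appendix table titles).
discriminations = [1.3, 1.8]
difficulties = [-0.5, 0.5]

distribution_conditions = list(product(general_shapes,
                                       specific_directions,
                                       specific_correlations))
item_conditions = list(product(discriminations, difficulties))

# Assumption: every distribution condition is paired with every item condition.
design = list(product(distribution_conditions, item_conditions))

print(len(distribution_conditions), len(item_conditions), len(design))
for (shape, direction, corr), (disc, diff) in design[:4]:
    print(shape, "|", direction, "|", corr, "| disc", disc, "| diff", diff)
```

Writing the design this way makes it easy to loop over cells when generating data and to confirm that the counts match the eight distribution conditions and four item parameter conditions stated in the text.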
This research explores the estimation performance of the bifactor model under different
distributional and item parameter conditions. My specific research questions are as follows:
How precise is the parameter estimation of the bifactor model
1) under each combination of latent trait distributions, in terms of (a) the
normality of the general factor distribution, (b) the direction of the skewed distributions
of the general or specific factors, and (c) the correlation between the specific
factors; and
2) under different levels of the item difficulty and discrimination parameter values?


The literature on the bifactor model, latent trait distributions, and related studies is
reviewed in Chapter 2, and the procedure of the simulation study and the methods used to generate and
analyze the data are provided in Chapter 3.


2. Literature Review
2.1 Bifactor Model
Collected data are not a direct measure of the unobservable latent trait, but rather a proxy
for the latent trait. Therefore, before an analyst uses collected data, their measurement properties
should be evaluated to determine whether they represent the construct validly and measure the construct
consistently across participants. Also, in order to provide precise information for construct
analysis, it is important to consider the suitability of the method that we use for the analysis.
In order to measure the sub-structure of a general latent trait, Gibbons and Hedeker
(1992) introduced the bifactor model for binary items, which is derived from the ‘bifactor’
solution named by Holzinger and Swineford (1937). The model has the constraints that each item
has a) a nonzero loading on the general factor, which is the primary dimension; and b) a second
loading on no more than one of the specific factors. Also, each specific factor is orthogonal to
the general factor and other specific factors. The pattern matrix for a bifactor model of five items,
for example, could be shown as

$$
\begin{bmatrix}
\alpha_{10} & \alpha_{11} & 0 \\
\alpha_{20} & \alpha_{21} & 0 \\
\alpha_{30} & \alpha_{31} & 0 \\
\alpha_{40} & 0 & \alpha_{42} \\
\alpha_{50} & 0 & \alpha_{52}
\end{bmatrix},
$$

where $\alpha_{ij}$ is the factor loading of item $i$ on factor $j$. The factor loadings indicate the item slope,
or “discrimination,” parameters (Cai, Yang, & Hansen, 2011). In the matrix above, the five items
are measures for one general factor and two group factors. The loadings on the general factor in
the first column are $\alpha_{i0}$, which should be nonzero, and the loadings of items 1-3 in the second
column and items 4-5 in the third column are related to specific factors 1 and 2 respectively (Gibbons &
Hedeker, 1992; Li & Lissitz, 2012).
The function of the bifactor model can be explained as an IRT model. Compared to the
unidimensional two-parameter IRT model, in the bifactor model general and specific latent traits
are divided into separate parts with corresponding item parameters.
The functions of the unidimensional two-parameter IRT model and two-parameter
bifactor model are as follows (Reckase, 2009; Cai et al., 2011; Li & Lissitz, 2012):
Unidimensional two-parameter IRT model:

$$P(u_i = 1 \mid \theta_0, a_i, d_i) = \frac{1}{1 + \exp\{-[d_i + a_i \theta_0]\}},$$ and

Two-parameter bifactor model:

$$P(u_i = 1 \mid \theta_0, \theta_s, a_0, a_s, d_i) = \frac{1}{1 + \exp\{-[d_i + a_0 \theta_0 + a_s \theta_s]\}}$$

The left-hand side of each equation represents the probability that an examinee answers a
question (or an item) correctly. $\theta$ is an individual examinee’s ability related to the latent trait, and
the value shows the location on the $\theta$ continuum. ‘a’ and ‘d’ are the item parameters for the ith
item. ‘a’ indicates an item discrimination, and ‘d’ is calculated from the item difficulty and item
discrimination parameters as shown in the equation below:

$$d_i = -b \sqrt{\sum_{v=0}^{m} a_v^2}$$

where ‘b’ is an item difficulty parameter and m is the number of dimensions (Reckase, 2009).
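The two response functions above can be sketched in a few lines of Python. This is a minimal illustration, not the estimation code used in this study: the function names and the example parameter values are hypothetical, and the intercept is assumed to follow d = -b * sqrt(sum of squared slopes), the multidimensional difficulty relation in Reckase (2009).

```python
import math


def intercept_from_difficulty(b, slopes):
    """Return d = -b * sqrt(sum of squared slopes) (assumed sign convention)."""
    return -b * math.sqrt(sum(a * a for a in slopes))


def p_correct_2pl(theta, a, d):
    """Unidimensional two-parameter logistic model in slope-intercept form."""
    return 1.0 / (1.0 + math.exp(-(d + a * theta)))


def p_correct_bifactor(theta_g, theta_s, a_g, a_s, d):
    """Two-parameter bifactor model: one general and one specific trait per item."""
    return 1.0 / (1.0 + math.exp(-(d + a_g * theta_g + a_s * theta_s)))


# Hypothetical item: general slope 1.3, specific slope 0.8, difficulty 0.5.
d_item = intercept_from_difficulty(0.5, [1.3, 0.8])
p = p_correct_bifactor(theta_g=0.0, theta_s=0.0, a_g=1.3, a_s=0.8, d=d_item)
print(round(p, 3))
```

Setting the specific slope to zero reduces the bifactor function to the unidimensional one, which is a quick way to check an implementation.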


The bifactor model has been used for multidimensional item analyses with various
purposes. Reise et al. (2007) demonstrated the utility of the bifactor model, and according to the
research, the bifactor model can inform decisions about the dimensionality of the data and what
type of models are appropriate for analysis: a) the bifactor model can be used to check the
assumptions of unidimensional IRT models and test the fit of these models to possibly
multidimensional data; b) it can be used, like non-hierarchical MIRT models, to form subscales;
and c) it can be an alternative to using non-hierarchical multidimensional models for measuring
individual differences. As a representative multidimensional latent trait model, the bifactor
model is investigated under various distributional conditions of the latent traits. The next section
discusses previous research on latent trait distributions.
2.2 Previous Research on Latent Trait Distributions
The estimation of the latent traits is important not only for providing examinees’
proficiency classification or measurement of psychological constructs but also because it relates
to a central assumption of latent trait models. Gibbons and Hedeker (1992) explain the
assumption of the latent distribution in factor models by using Thurstone’s multiple factor model
(1947). The multiple factor model is as follows:

$$y = \alpha_1 \theta_1 + \alpha_2 \theta_2 + \alpha_3 \theta_3 + \cdots + \varepsilon,$$

where $y$ is a latent variable, $\theta$ is an underlying ability, $\alpha$ is a factor loading, and $\varepsilon$ is a residual.
In the multiple factor model, the underlying abilities $\theta$ and the latent trait $y$ are
assumed to follow normal distributions. It implies that the underlying abilities ($\theta$) are
orthogonal, which is an assumption of any bifactor model, and that the residuals $\varepsilon$ are normally
distributed. The assumption that the relations between factors are orthogonal reduces the
complexity of the integration involved in estimating the parameters. The estimation efficiency
produced by the assumption of orthogonal latent trait distributions is a strength of the bifactor
model.
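To make the reduction in integration concrete, the marginal probability of a response pattern under a bifactor model with S orthogonal specific factors can be written as below. This is a sketch in the spirit of Gibbons and Hedeker (1992), not a formula quoted from this study; the notation is introduced here for illustration, with g denoting the trait densities, I_s the set of items loading on specific factor s, and P_i the item response function.

```latex
P(\mathbf{u}) = \int \left[ \prod_{s=1}^{S} \int \prod_{i \in I_s}
  P_i(\theta_0, \theta_s)^{u_i} \, \bigl(1 - P_i(\theta_0, \theta_s)\bigr)^{1 - u_i}
  \, g(\theta_s) \, d\theta_s \right] g(\theta_0) \, d\theta_0
```

Because each inner integral involves only one specific trait, the computation requires a series of two-dimensional integrations rather than a single integration over all S + 1 dimensions.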
As mentioned above, most factor models assume a normal distribution of the latent trait;
however, situations in which the latent trait is not normally distributed may occur in reality. Sass
et al. (2008) suggested two cases that could result in a non-normal distribution of a latent trait: (a) a
non-normal sampled distribution, and (b) a non-normal original distribution. A non-normal
sampled distribution is derived from a non-randomly sampled population distribution. For
example, when the sample is collected from a limited range of the population distribution, for
instance, collected only from the low level class or the high level class, the latent trait
distribution can be skewed. Also, it may occur that the original latent trait follows a non-normal
distribution when a test is very difficult or very easy, or when psychological constructs that have
skewed response distributions are observed.
The research on latent trait distributions has been conducted on both unidimensional
and multidimensional latent trait models. Sass et al.’s (2008) research using unidimensional IRT
models showed that (a) a positively skewed distribution produces greater latent trait estimation
error than a normal distribution does; (b) for extreme examinees, item difficulty estimates
produce a larger amount of estimation error; and (c) the best latent trait estimation procedure
depends on whether a researcher is primarily interested in extreme or non-extreme examinees.
Woods and Thissen (2006) introduced the non-parametric estimation of the IRT latent distribution
using spline-based densities, which they refer to as Ramsay-Curve IRT (RC-IRT). They showed
its capability by applying it to normally distributed and skewed latent distributions in a
simulation study.

Finch (2010) compared the estimation methods implemented by NOHARM (unweighted
least squares estimation) and Mplus (robust weighted least squares estimation) software using
multidimensional confirmatory factor analysis models. The results showed that the estimation
methods of NOHARM and Mplus were affected by the distribution of the latent traits, and that
item difficulty and discrimination parameters estimated from responses of examinees with a skewed
latent trait distribution have a larger amount of standard error than those estimated from
examinee groups with a normal latent trait distribution. This research added to results that show
IRT parameter estimation is affected by the latent trait distribution shape, and can be explained
by the fact that item response theory models express the functional relationship between the
latent trait and observed score distributions as a normal ogive (McDonald, 1997). Batley and
Boss (1993) studied the estimation of latent trait distributions with three levels of
intercorrelations between two latent traits in a multidimensional two-parameter logistic model.
They showed that both the best estimation of the first latent trait and the worst estimation of the
second latent trait occurred in the ‘0’ correlation condition. According to their discussion,
estimation of the second latent trait was influenced by rescaling of the estimates; as the
correlation between the latent traits increases, the model with two latent traits has features closer
to those of a unidimensional model. Cai et al. (2011) studied estimation efficiency using full
information bifactor analysis. The research was designed to study conditions involving a
multigroup bifactor model with normally distributed latent factors, and various types of items,
such as dichotomous, ordinal, and nominal items. This study was constructed under the normal
distribution assumption; however, the research discussed the possibility of a non-normal
distribution of the latent trait and suggested it as a topic for future research.


2.3 Simulation Study
2.3.1 Simulation as a Research Methodology
In Axelrod (2005), simulation study is identified as one of three major research
methodologies: induction, deduction, and simulation; research using induction methodology
discovers the patterns that the research is interested in by analyzing empirical data, while
research with deduction methodology suggests a set of axioms and proves the consequences
from the logical connections between the assumptions. Indicating simulation as a “third way of
doing science,” Axelrod (2005) explained its process and theme as different from induction and
deduction (see Table 2-1). First, in a simulation study, assumptions of theory are used for data
generation, whereas research with the deduction methodology uses assumptions in order to
affirm or reject a theorem that the research focuses on. Second, the data set in a simulation study
is not only collected empirically as in research with induction methodology, but is also generated
with specified conditions based on the assumptions. In the analysis of research conducted by the
deduction methodology, consequences can be drawn from logical relationships between
assumptions, while in research conducted by the induction methodology, significant patterns in
the empirical results can be found. On the other hand, simulation methodology provides a tool to
support creation of study designs precisely representing research conditions of theoretical or
practical interest.
Table 2-1. Comparison of Three Research Methodologies

| | Deduction | Induction | Simulation |
|---|---|---|---|
| Using assumptions | To affirm or reject a theorem | To ground the theorem | To generate data sets |
| Data | Used for construction of the theorem | Collected empirically | Specified and generated with the assumptions |
| Analysis and results | Drawing consequences from the relationships between assumptions | Finding the significance in data | Providing a tool to be able to use intuitional methods |

* This is tabulated by using Axelrod (2005).

Küppers and Lenhard (2005) mentioned that computer simulation based on a theoretical
or experimental framework might rarely be successful because reality is too complicated to be
explained only by the theorem or by experiments. Computer simulation consists of numerical
solutions and imitations of empirical situations. The quality of a numerical solution depends on knowing
how to control inevitable statistical or calculation errors, and validating an imitation of an
empirical situation in order to reproduce the results from empirical analysis. If a theorem to
specify phenomena is established, the validity of computer simulation is related to whether the
study represents the empirical situation or reality accurately. As a result, Küppers and Lenhard
(2005) argued that simulation modeling can be considered an attempt to imitate reality, and can
be validated not by theoretical arguments but by using experience or existing data because the
simulation study is an “experiment with theories.”


The fact that a simulation study is an imitation and representation of reality means that
judgments about its validity can depend on epistemology. Schumid (2005) discussed the truth of
simulation as connected to three philosophical theories. First, every simulation study should have
a corresponding counterpart in reality, which is called a property of correspondence. Once the
property of correspondence is met, further questions arise about how to demonstrate that the
relation between the statements described as assumptions and reality exists, and how to
define “reality” itself. Second, another philosophical theory related to simulation study is
consensus, which means that simulation studies should be accepted from a community perspective.
This implies that in addition to having an objective connection to reality, simulation studies
should have subjective rationales in their context. Last, simulation studies should have coherence,
which means their design is believed to be consistent with other theorems. However, a coherent
situation does not guarantee a true relationship between reality and the simulation study.
Referring to “sufficient accuracy and specific purpose” as the important points in evaluating the
validity of simulation studies (Robinson, 2004, p. 210), Schumid (2005) delineated a validation
process that determines the sufficient level of accuracy of a simulation, and constructs the
simulation model to represent the real world system for a specific purpose.
To sum up, constructing a simulation study should be based on theory and empirical
evidence to support the validity of the design. Review of previous studies and empirical analyses
for determining the conditions of this simulation study will be provided subsequently.


2.3.2 Simulation Studies of Latent Trait Models
In order to answer questions about estimation performance, many studies have been
constructed using simulation. Based on previous simulation studies, simulation conditions and
parameters were reviewed for selecting the simulation conditions of this study.
a. Distribution of Latent Trait
Finch (2010) constructed a simulation study to compare unweighted least squares (ULS)
and robust weighted least squares (RWLS) estimation. Latent trait distributions were generated
as normal, or skewed with skewness of -1.5 and kurtosis of 3.0. Cai et al. (2011) generated a
general factor and specific dimensions, which were set to be jointly normally distributed and
mutually orthogonal. Li and Lissitz (2012) also generated their general latent traits from normal
distributions with means of -0.5, 0, and 1, and a variance of 1, and specific latent trait values from
a standard normal distribution. In Woods and Thissen (2006), three kinds of latent trait
distributions were constructed from a) a normal distribution with skewness of 0 and kurtosis of 3,
b) a platykurtic distribution with skewness of 0 and kurtosis of 2.53, and c) a positively skewed
distribution with skewness of 1.57 and kurtosis of 6.52.
b. Intercorrelation between Latent Traits
In Batley and Boss’s (1993) study, three levels of intercorrelations (0, 0.25 and 0.5) were
constructed between two latent traits. Gosz and Walker (2002) used three intercorrelations (0.5,
0.75, and 0.9), two of which were higher than those of Batley and Boss. Finch’s (2010) research
used four levels of intercorrelations in order to evaluate the accuracy of item parameter
estimation: a ‘0’ correlation as no correlation, a 0.3 correlation as a low level of intercorrelation,
0.5 as a medium level, and 0.8 as a fairly large correlation. The research concluded that with a
high level of correlation between the latent traits, there is great bias in item parameter estimation,
regardless of the estimation method used.
c. Discrimination Parameter
In order to generate the item parameters for a simulation, previous research used either a
population distribution or a specific value of each parameter. The study of Finch (2010)
generated discrimination parameters from a normal distribution with an estimated mean of
0.9657 and a standard deviation of 0.3161; the simulated discrimination parameters ranged
between 0.3736 and 2.0158. Woods and Thissen (2006) also generated their discrimination
parameters from a normal distribution, but with a mean of 1.7 and a standard deviation of 0.3,
based on analysis of existing psychological scales. Cai et al. (2011) and Li and Lissitz (2012)
used specific parameter values for the simulation data, which were values ranging from 1 to 2.
d. Difficulty Parameter
Difficulty parameters are usually generated from a normal distribution. Finch (2010)
generated difficulty parameters from the standard normal distribution. Li and Lissitz (2012), who
studied the bifactor model in vertical scaling, drew difficulty parameters for non-common items
from normal distributions with means between -0.5 and 0.5 and variance of 1, and for common
items from a uniform distribution with a range of 1.5, for example, -1 to 0.5 or -0.5 to 1. Woods
and Thissen (2006) generated difficulty parameters from a truncated standard normal distribution
ranging from -2 to 2. Cai et al. (2011) selected specific values for difficulty parameters, for
example, -1, -.25, .25, 1.


e. Replication
Various simulation studies with IRT models have used between 100 and 1000
replications. Finch (2010) and Woods and Thissen (2006) completed their study with 1000
replications of the simulation. The number of replications in Cai et al. (2011) was 500, and Li
and Lissitz (2012) and Sass et al. (2008) replicated their simulations 100 times.
An appropriate number of replications depends on which parameter properties
are of primary research interest, because each property needs a different number of
replications to obtain stable estimation results. Once the parameter property of interest is
determined, we can evaluate the precision (stability) of the simulation as a function of the number of
replications. If the estimates are stable above a certain minimum number of
replications, there is no need to replicate the study further. On the other hand,
if a large number of replications is necessary in order to get stable results, the appropriate
number of replications should be determined and conducted. At this point, it is important to
consider simulation efficiency and to construct an efficient simulation algorithm, because when
a large volume of simulated data is necessary, it is crucial to carry out the simulation study in a fast
and simple way.
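One way to judge whether a given number of replications is stable enough is to report the Monte Carlo standard error alongside the bias and root mean squared error of an estimate. The sketch below is an illustration under that assumption; the function name and the four example estimates are hypothetical and are not results from this study.

```python
import math
import statistics


def replication_summary(estimates, true_value):
    """Summarize one parameter's estimates across simulation replications."""
    n = len(estimates)
    errors = [est - true_value for est in estimates]
    bias = statistics.fmean(errors)
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    # Monte Carlo standard error of the bias: SD of the estimates / sqrt(n).
    mcse_bias = statistics.stdev(estimates) / math.sqrt(n)
    return {"bias": bias, "rmse": rmse, "mcse_bias": mcse_bias}


# Hypothetical discrimination estimates from four replications (true value 1.0).
summary = replication_summary([1.0, 1.2, 0.8, 1.0], true_value=1.0)
print(summary)
```

Because the Monte Carlo standard error shrinks with the square root of the number of replications, halving it requires roughly four times as many replications, which helps weigh precision against computing time.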
f. Number of Items
With a small number of items, less than 30, the precision of item parameter estimation is
influenced mainly by the latent trait distribution, whereas the impact of the number of items on
estimation is small (Finch, 2010). This result is consistent with Stone (1992), who reported that
the calibration results from at least 40 items are robust.


g. Number of Examinees
For ULS and RWLS estimation methods, even though the estimation precision slightly
increases when the number of examinees is increased, there is no significant effect of the number
of examinees on the estimation results if the number is greater than 250 examinees (Finch, 2010).
Studying estimation of the bifactor model in vertical scaling, Li and Lissitz (2012) generated
1,000, 2,000, and 4,000 examinees. The research noted that estimation results showed that the
accuracy and stability of estimation increased with the sample size, and that especially the results
from the sample sizes of 2,000 and 4,000 had lower root mean squared error and standard errors
than those from the sample size of 1,000.
h. Estimation Methods
Finch’s (2010) results show that using ULS estimation in NOHARM software provided
better precision of item parameters than RWLS estimation in Mplus, but he points out that ULS
with NOHARM should not be used when the models have pseudo-guessing parameters and high
correlations between latent traits. Also, both of the estimation methods tended to underestimate
the item parameters when latent trait distributions were skewed. In order to conduct multigroup
concurrent calibration during vertical scaling, Li and Lissitz (2012) implemented marginal
maximum likelihood by the EM algorithm using IRTPRO software. Woods and Thissen (2006)
also used the EM algorithm for marginal maximum likelihood to compute spline-based densities.
i. Computer Programs for Parameter Estimation
As interest in parameter estimation of latent trait models has increased, various
computer programs, programming languages, and packages have been developed for latent
trait analysis. Chalmers (2012) introduced the Multidimensional IRT package for R programming,
and Sheng (2010) studied MATLAB programming in order to estimate MIRT
models with general and specific latent traits using Bayesian methods. IRTPRO is equipped with
bifactor model analysis, and Mplus also provides bifactor model analysis with maximum
likelihood estimators. Seo (2011) estimated the parameters of latent traits for the bifactor model
with maximum likelihood and Bayesian estimation methods using the MBICAT algorithm in R.
WLS estimators can be utilized by using NOHARM and Mplus with limited-information
algorithms, and BMIRT has a Bayesian MCMC estimator (Chalmers, 2012).
2.4 Empirical Data Analysis
In order to show the importance of and provide the rationale for study of skewed latent
trait distributions, the proficiency scales of the Program for International Student Assessment
(PISA) mathematics, reading, and science tests were investigated. The Organization for
Economic Co-operation and Development provides technical and supplementary reports to
describe the test construction and report key findings of the assessment, and the National Center
for Education Statistics has also published analysis of the PISA results focusing on US students
from PISA 2000 to 2009. The data and information used in this part are collected from those
reports and modified for this research.
2.4.1 Distribution of Latent Traits in PISA 2003, 2006, and 2009
The Program for International Student Assessment uses proficiency levels to describe
student performance. In order to reach a particular level, a student must be able to answer a
majority of items correctly at that level. Students are classified into one of the levels according to
their scores (OECD, 2001). For example, the reading literacy scale in PISA 2009 has eight cut
point scores from Level 1b to Level 6, and students’ scores are located on a scale from 0 to 1,000
(NCES, 2010). An example of specific cut scores is shown in Table 2-2.


Table 2-2. Cut scores of Reading Literacy Proficiency Levels in PISA 2009

| Proficiency level | Greater than | Less than or equal to |
|---|---|---|
| Below level 1b | - | 262.04 |
| Level 1b | 262.04 | 334.75 |
| Level 1a | 334.75 | 407.47 |
| Level 2 | 407.47 | 480.18 |
| Level 3 | 480.18 | 552.89 |
| Level 4 | 552.89 | 625.61 |
| Level 5 | 625.61 | 698.32 |
| Level 6 | 698.32 | - |
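The classification rule in Table 2-2 can be sketched as a simple lookup. The cut scores below are taken from the table; the function and constant names are hypothetical, and the sketch only illustrates the "greater than / less than or equal to" boundaries, not the scaling procedure that PISA itself uses.

```python
from bisect import bisect_left

# Upper bounds ("less than or equal to") of each level below Level 6, from Table 2-2.
CUT_SCORES = [262.04, 334.75, 407.47, 480.18, 552.89, 625.61, 698.32]
LEVELS = ["Below level 1b", "Level 1b", "Level 1a", "Level 2",
          "Level 3", "Level 4", "Level 5", "Level 6"]


def reading_level(score):
    """Map a PISA 2009 reading score to its proficiency level."""
    # bisect_left counts the cut scores strictly below `score`, so a score
    # equal to a cut score stays in the lower level.
    return LEVELS[bisect_left(CUT_SCORES, score)]


print(reading_level(500.0))
```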

Table 2-3 and Figures 2-1 and 2-2 show the percentage distributions of 15-year-old
students in the United States on combined reading, mathematics, and science literacy scales by
proficiency level. In the 2000, 2003, and 2006 PISA results, the distribution of reading
proficiency followed a negatively skewed distribution, whereas the mathematics and science
literacy scales had positively skewed distributions. In PISA 2009, compared to the results in
PISA 2000, the reading proficiency scale, which was modified from 6 levels to 8 levels, was
generally normally distributed. In PISA 2009, the distribution of mathematics literacy had heavy
left tails similar to the distribution from PISA 2003. Although the science literacy distribution
has been getting closer to a symmetric distribution over time, compared to the results from PISA
2006, it still is a little skewed with a heavy left tail.

Table 2-3. Percentage Distribution of Proficiency Level Scores in PISA 2000, 2003 and 2009

| Level | Reading 2009 (%) | Reading 2000 (%) | Mathematics 2009 (%) | Mathematics 2003 (%) | Science 2009 (%) | Science 2003 (%) |
|---|---|---|---|---|---|---|
| Below level 1b | 1 | - | - | - | - | - |
| Level 1b | 4 | - | - | - | - | - |
| Level 1a | 13 | - | - | - | - | - |
| Below level 1 | - | 4 | 8 | 10 | 4 | 8 |
| Level 1 | - | 9 | 15 | 16 | 14 | 17 |
| Level 2 | 24 | 20 | 24 | 24 | 25 | 24 |
| Level 3 | 28 | 27 | 25 | 24 | 28 | 24 |
| Level 4 | 21 | 24 | 17 | 17 | 20 | 18 |
| Level 5 | 8 | 16 | 8 | 8 | 8 | 8 |
| Level 6 | 2 | - | 2 | 2 | 1 | 2 |

* The table is based on information from NCES, 2001, 2004, 2007, & 2010.
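A rough check on the direction of skew in a proficiency-level distribution is to compute the skewness of the level percentages directly. The sketch below uses two columns of Table 2-3 as read from the table and codes the levels as equally spaced integers; that coding is an assumption made only for illustration, so the sign of the result is more informative than its size.

```python
# Percentages by proficiency level, lowest to highest, as read from Table 2-3.
READING_2000 = [4, 9, 20, 27, 24, 16]
MATH_2009 = [8, 15, 24, 25, 17, 8, 2]


def ordinal_skewness(percentages):
    """Skewness of a level distribution, coding levels as 0, 1, 2, ... (an assumption)."""
    total = sum(percentages)
    weights = [p / total for p in percentages]
    mean = sum(w * k for k, w in enumerate(weights))
    m2 = sum(w * (k - mean) ** 2 for k, w in enumerate(weights))
    m3 = sum(w * (k - mean) ** 3 for k, w in enumerate(weights))
    return m3 / m2 ** 1.5


print(round(ordinal_skewness(READING_2000), 2))
print(round(ordinal_skewness(MATH_2009), 2))
```

Under this coding the reading distribution has negative skewness and the mathematics distribution has positive skewness.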


[Figure 2-1 panels: Reading in PISA 2000; Mathematics in PISA 2003; Science in PISA 2006]
Figure 2-1. Percentage distribution of proficiency level in PISA 2000, 2003 and 2006
(Modified from NCES, 2000, Table A3.7; NCES, 2004, Figure 4; NCES, 2007, Figure 4)

[Figure 2-2 panels: Reading in PISA 2009; Mathematics in PISA 2009; Science in PISA 2009]
Figure 2-2. Percentage Distribution of Proficiency Level in PISA 2009
(Modified from NCES, 2010, Figure 3, 5, & 7)

2.4.2 Sub-domain Proficiency Levels in PISA
From 2000 to 2009, PISA has measured students’ Mathematics and Science literacy with
three kinds of sub-domains. The distributions of sub-domain performance by proficiency level
are shown in Figure 2-3.


[Figure 2-3 panels:
2000 Reading Literacy (six levels, Below Level 1 to Level 5): Retrieving information; Interpreting texts; Reflecting on texts
2003 Mathematics Literacy (seven levels, Below Level 1 to Level 6): Quantity; Space and Shape; Change and relationship
2006 Science Literacy (seven levels, Below Level 1 to Level 6): Identifying scientific issues; Explaining phenomena scientifically; Using scientific evidence
2009 Reading Literacy (eight levels, Below Level 1b to Level 6): Access and retrieve; Integrate and interpret; Reflect and evaluate]
Figure 2-3. Percentage Distribution of Sub-domains in PISA 2000, 2003, 2006, and 2009

The analysis results show that the distributions of sub-domain proficiency levels for each
subject show different patterns. Reading literacy in PISA 2000 had negatively skewed
distributions with long tails to the left for all three sub-domains. Mathematics literacy in PISA
2003 had a positively skewed distribution with a heavy tail in the lower level of proficiency for
the three domains, although the distributions peaked at different proficiency levels. Science literacy
in PISA 2006 showed similar positively skewed distributions for the three domains, but with
slightly different kurtosis. Reading literacy in PISA 2009, with eight levels of proficiency,
showed an almost symmetric distribution, quite different from the distribution of reading
literacy in PISA 2000, which had six levels of proficiency.
In summary, in PISA 2003, 2006, and 2009 the distributions of reading are negatively
skewed, with a heavy tail on the high levels of proficiency. The distributions of mathematics and
science are positively skewed, with the heavy tails toward the low levels of proficiency.
In this analysis of sub-domain proficiency levels in PISA, the distributions of the sub-domains
show different patterns by subject, and the three domains in each subject show slightly
different distributional properties. While reading in 2000 had negatively skewed distributions for
the three domains, each distribution had different levels of skewedness and thickness of its tails.
The sub-domain distributions for math in 2003 had common heavy tails on the lower level;
however, each distribution had different points with the highest frequencies. Science in 2006 had
a similar pattern for the three domains, but with different levels of kurtosis, and reading in 2009
also had a similar pattern of symmetric distribution for the three domains, with different levels of
kurtosis.


The results of this PISA score distribution analysis show that the distributions of the
latent trait scores can have different shapes for each subject matter, and distributions within the
sub-domains of each subject can have different properties. Within the same general construct,
such as math, science or reading, their sub-domain proficiency level scores had different
distributional properties, especially related to the skewedness and kurtosis of the distributions.


3. Method
3.1 Data Generation
In order to study the effect of different distributional assumptions when a
multidimensional latent trait model is estimated, I generated a set of latent trait distributions and
item parameter sets corresponding to the simulation conditions. Previous literature and empirical
analyses were used for selecting the true item and person parameters in order to allow the
simulation data to be representative of reality. Table 3-1 shows the simulation conditions for
data generation. The model used for estimating the item and latent trait parameters is a two-parameter
logistic bifactor model. There are three latent traits in each model, one general factor
and two specific factors. For the latent trait distributions, normal and skewed distributions are
generated. Simulation conditions related to the item parameters consist of two levels of item
difficulty and two levels of item discrimination.
3.1.2 Data Generating of Latent Trait Distributions
The bifactor model designed for this research has three latent factors, one
general factor and two specific factors, so three latent trait distributions are generated in every replication of
the simulation study. The trait distributions are generated according to combinations
characterized by (a) the normality (shape) of the general factor distribution, (b) the direction of
skewedness of the skewed distributions included in the general or specific factors, and (c) the
correlations between the specific factors. The general factor distribution in each condition is one
of two shape types: normal or positively-skewed. Each general factor distribution is paired with
two specific factor distributions, which are two positively skewed distributions or one positively
skewed distribution and one negatively skewed distribution.


The bifactor model assumes that the specific factors are not only uncorrelated with the
general factor, but also uncorrelated with each other. In order to evaluate the estimation quality
when this assumption is violated, two levels of intercorrelation between latent traits of 0.2
(barely correlated) or 0.8 (correlated) are given for each paired condition. With the two
conditions of normality of the general factor, two conditions of direction of skewedness of the
specific factor distributions, and two levels of intercorrelations, a total of eight simulation
conditions are assigned to the latent trait distributions.
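The pairing of skewed specific factors with a target intercorrelation can be sketched as follows. This passage does not state how the skewed and correlated traits were produced, so the method here is an assumption chosen for illustration: correlated normal scores supply the rank order, and mirrored, standardized gamma draws supply skewed marginals with skewness of 0.8 or -0.8 and means of 0.3 or -0.3, as listed in Table 3-1. All function names are hypothetical.

```python
import math
import random


def skewed_sample(n, skewness, mean, rng):
    """Draw n unit-variance values with the requested skewness from a shifted gamma."""
    shape = 4.0 / skewness ** 2            # a gamma variable has skewness 2 / sqrt(shape)
    sign = 1.0 if skewness > 0 else -1.0   # mirror the gamma for negative skewness
    draws = (rng.gammavariate(shape, 1.0) for _ in range(n))
    return [mean + sign * (g - shape) / math.sqrt(shape) for g in draws]


def reorder_to_ranks(values, reference):
    """Rearrange `values` so that their rank order matches the rank order of `reference`."""
    order = sorted(range(len(reference)), key=reference.__getitem__)
    result = [0.0] * len(values)
    for index, value in zip(order, sorted(values)):
        result[index] = value
    return result


def specific_factors(n, rho, skew_1=0.8, skew_2=0.8, seed=0):
    """Generate two skewed specific factors whose underlying normal scores correlate at rho."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z2 = [rho * z + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for z in z1]
    s1 = reorder_to_ranks(skewed_sample(n, skew_1, math.copysign(0.3, skew_1), rng), z1)
    s2 = reorder_to_ranks(skewed_sample(n, skew_2, math.copysign(0.3, skew_2), rng), z2)
    return s1, s2


# Specific 1 skewed (+), Specific 2 skewed (-), correlation 0.8, 2,000 examinees.
s1, s2 = specific_factors(n=2000, rho=0.8, skew_1=0.8, skew_2=-0.8)
```

With this approach the Pearson correlation of the skewed factors falls somewhat below rho, most noticeably when the two factors are skewed in opposite directions, so rho would need a small upward adjustment to hit 0.2 or 0.8 exactly.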
Table 3-2 shows the specific simulation conditions of the latent traits. Based on the
simulation conditions, in order to look at the effects of general factor distribution, the results
from conditions 1 to 4 and the results from corresponding conditions 5 to 8 are compared.
Similarly, the direction effects of the skewed distributions of specific factors are evaluated by the
comparison of Condition 1 vs. Condition 3, Condition 2 vs. Condition 4, Condition 5 vs.
Condition 7, and Condition 6 vs. Condition 8. The effect of the extent of correlation between the
specific factors on estimation is evaluated with the comparison of Condition 1 vs. Condition 2, Condition
3 vs. Condition 4, Condition 5 vs. Condition 6, and Condition 7 vs. Condition 8.
Because the model has item difficulty and discrimination parameters, but no guessing
parameter, the effects found from the negatively skewed distribution will be the same as those from the
positively skewed distribution except opposite in sign. For example, the results from the
combination of negatively skewed specific factors with a normal distribution of the general factor are
implied by the results from the combination of positively skewed specific factors with a normal
distribution in Conditions 1 and 2.

Table 3-1. Simulation Conditions for Data Generation
Simulation factors              Condition
Model                           Two-parameter multidimensional latent trait model: bifactor
                                model with three factors (one general factor and two specific
                                factors)
Distribution of Latent Traits   Normal: standard normal distribution with a mean of 0 and a
with Directions                 standard deviation of 1. Positively or negatively skewed: a
                                mean of -0.3 or 0.3 and skewedness of 0.8 or -0.8, respectively
Discrimination Parameter        2 conditions generated from lognormal distributions with the
                                range from 0.5 to 2.5: mean of 1.3 with standard deviation of
                                0.15, and mean of 1.8 with standard deviation of 0.15
Difficulty Parameter            2 conditions generated from normal distributions with the
                                range from -2 to 2: mean of -0.5 with standard deviation of
                                0.4, and mean of 0.5 with standard deviation of 0.4
Number of Items                 Total 60 items with 30 items for each specific factor
Number of Examinees             2000
Estimation method               Full Information Marginal Maximum Likelihood in IRTPRO
Replications                    50

Table 3-2. Simulation Combinations of Latent Trait Distributions
Condition Number   General      Specific 1   Specific 2   Correlation
1                  Normal       Skewed (+)   Skewed (+)   0.2
2                  Normal       Skewed (+)   Skewed (+)   0.8
3                  Normal       Skewed (+)   Skewed (-)   0.2
4                  Normal       Skewed (+)   Skewed (-)   0.8
5                  Skewed (+)   Skewed (+)   Skewed (+)   0.2
6                  Skewed (+)   Skewed (+)   Skewed (+)   0.8
7                  Skewed (+)   Skewed (+)   Skewed (-)   0.2
8                  Skewed (+)   Skewed (+)   Skewed (-)   0.8
* (+) or (-) means a positively or negatively skewed distribution respectively.

The simulation conditions of the latent trait distributions combine the degree of skewedness of
the distributions and the intercorrelations needed to describe multivariate latent trait distributions.
For example, assume that one general factor and two positively skewed specific factors need to
be generated. The procedure of the data generation is as follows: 1) a distribution for the general
factor is generated either from a normal distribution with a mean of 0 and a standard deviation of 1
or, for a positively skewed distribution, with a mean of -0.3 and skewedness of 0.8, 2) in order to
generate skewed distributions for the specific factors with a particular level of correlation between
them, two distributions are generated from a multivariate normal distribution with a correlation of
0.2 or 0.8, and 3) the correlated distributions generated in 2) are non-linearly transformed
into skewed distributions. In this case, the transformed distributions are skewed
compared to the normal distribution, while the correlation between the two specific factors remains
close to its target value.
As mentioned above, the generated distributions considering intercorrelations are
transformed into skewed distributions. In order to transform the correlated distributions into
skewed distributions, I applied the idea of the Copula method (Nelsen, 1999). Relational
functions between normal and targeted skewed distributions are estimated by using their
cumulative probability distributions, and are used in order to transform the multivariate normal
distributions with the targeted intercorrelation into skewed distributions. The steps to generate
the distributions are as follows:
Step 1. Generating the targeted skewed distribution
Based on conditions in previous studies, positively skewed distributions with their first
four moments having values of: mean of -0.3, variance of 1, skewedness of 0.8, and kurtosis of
3.5, are created. For negatively skewed distributions, a mean of 0.3, variance of 1, skewedness of
-0.8 and kurtosis of 3.5 are used.
Step 2. Calculating the cumulative probability distribution of the skewed distribution
generated in Step 1
In order to calculate a cumulative probability distribution, the R function ‘ecdf’, which
calculates an empirical cumulative distribution function, is used.
Step 3. Estimating the regression function between the normal and the skewed probability
distributions.
For each positively skewed or negatively skewed distribution, regression coefficients
between two distributions were estimated. In this research, the coefficients were estimated fifty
times through simulation analysis estimating the function needed to transform from a normal
distribution to a skewed distribution (see Tables in Appendix A). The fifty data sets of 2,000 or
10,000 cases generated from normal and skewed distributions showed similar coefficients, and
quadratic and cubic functions were identified as the best functions to use for the transformation
based on the variance explained in the model fit. Each coefficient of the polynomial functions
had a very small amount of variance, which shows that estimated coefficients for each trial were
very similar. Finally, the cubic transformation function was selected. The R-squared values of the
regression functions between the data of normal distributions and the data of skewed
distributions were over .999. The means of the fifty coefficients were used, and by using a
regression function, the normal distributions were approximately transformed into the skewed
distributions. In order to transform the normal distribution values (X) into skewed distribution
values (Y), two regression functions are used:
for positively skewed distributions,
$$ Y = -0.4508 + 1.0167X + 0.1461X^{2} - 0.0136X^{3}, \text{ and} $$
for negatively skewed distributions,
$$ Y = 0.4516 + 1.0135X - 0.1500X^{2} - 0.0115X^{3} $$
Step 4. Generating multivariate normal distributions with correlations of 0.2 or 0.8
By using the function to generate a multivariate normal distribution, sets of distributions
are generated with correlations of 0.2 or 0.8 between the specific factor distributions.
Step 5. Calculating the skewed distribution by using the function estimated in Step 3
The values of the multivariate distributions generated in Step 4 are transformed into
skewed distributions by using one of the formulas estimated in Step 3.
Step 6. Checking the descriptive statistics and correlations of the generated distributions
If the descriptive statistics and correlations are not acceptable, the parameters of the
normal distributions are modified to reach the targeted values for the distributions.
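Steps 4 to 6 can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study, assuming the cubic function for positively skewed distributions with squared and cubed terms; the helper names and the seed are hypothetical.

```python
import math
import random

def cubic_positive(x):
    """Cubic transformation for positively skewed values (coefficients from Step 3)."""
    return -0.4508 + 1.0167 * x + 0.1461 * x**2 - 0.0136 * x**3

def correlated_normal_pairs(n, rho, rng):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho**2) * z2))
    return pairs

def describe(values):
    """Return the mean, SD, and skewedness (standardized third moment)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    skew = sum(((v - mean) / sd) ** 3 for v in values) / n
    return mean, sd, skew

def pearson(xs, ys):
    """Pearson correlation between two equally long lists."""
    mx, sx, _ = describe(xs)
    my, sy, _ = describe(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) * sx * sy)

rng = random.Random(2014)  # arbitrary seed, used only for reproducibility
normal_pairs = correlated_normal_pairs(2000, 0.8, rng)    # Step 4
specific1 = [cubic_positive(a) for a, _ in normal_pairs]  # Step 5
specific2 = [cubic_positive(b) for _, b in normal_pairs]
mean1, sd1, skew1 = describe(specific1)                   # Step 6
correlation = pearson(specific1, specific2)
```

In such a sketch the checks in Step 6 amount to comparing `mean1`, `sd1`, `skew1`, and `correlation` with the targets of -0.3, 1, 0.8, and the chosen correlation.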
3.1.2 Data Generating of Item Parameters
For the purpose of stability, but allowing some comparison, the discrimination parameter
and difficulty parameters are each generated with two levels. Two sets of discrimination
parameters are generated to represent “high” and “low” levels of discrimination in an item set. In
order to avoid negative values for discrimination parameters, the parameters are generated from
lognormal distributions. A variable follows a lognormal distribution when its natural logarithm
follows a normal distribution (Hogg & Tanis, 1997), and the probability density function is

$$ f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left[-\frac{(\ln x - \mu)^{2}}{2\sigma^{2}}\right], \quad x > 0, $$

with two parameters of μ and σ, and the mean and variance of a lognormal distribution are
calculated by

Table 3-3. Parameters for Generating Distributions of Simulation Combinations of Item
Parameters
             Discrimination from 0.5 to 2.5     Difficulty from -2 to 2
Condition    Mean       SD                      Mean       SD
1            1.3        0.15                    -0.5       0.4
2            1.3        0.15                    0.5        0.4
3            1.8        0.15                    -0.5       0.4
4            1.8        0.15                    0.5        0.4

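$$ E(X) = e^{\mu + \sigma^{2}/2} \quad \text{and} \quad Var(X) = (e^{\sigma^{2}} - 1)\, e^{2\mu + \sigma^{2}}. $$

Therefore, with the specific target value of mean, E(X), and variance (squared standard
deviation), Var(X), the parameters of μ and σ are calculated by using the two formulas:

$$ \mu = \ln(E(X)) - \ln\left(\sqrt{\frac{Var(X)}{E(X)^{2}} + 1}\right), \text{ and} \quad \sigma = \sqrt{\ln\left(\frac{Var(X)}{E(X)^{2}} + 1\right)} $$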

For example, in order to generate a distribution of discrimination parameters with a mean
of 1.3 and a standard deviation of 0.15 from a lognormal distribution, the μ of -0.13118 and σ of
0.51222 should be used. Similarly, the discrimination parameters in this study are generated from
a lognormal distribution with a mean of 1.3 or 1.8 and a standard deviation of 0.15.
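The conversion between a target mean and standard deviation and the two lognormal parameters can be written as a pair of small functions. This is a sketch under the standard lognormal moment formulas, with hypothetical function names, and it checks itself by converting back to the mean and standard deviation.

```python
import math

def lognormal_parameters(mean, sd):
    """Convert a target mean and SD of X into (mu, sigma) of ln X."""
    ratio = sd**2 / mean**2 + 1.0
    sigma = math.sqrt(math.log(ratio))
    mu = math.log(mean) - math.log(math.sqrt(ratio))
    return mu, sigma

def lognormal_moments(mu, sigma):
    """Return the mean and SD of a lognormal variable with parameters mu and sigma."""
    mean = math.exp(mu + sigma**2 / 2.0)
    variance = (math.exp(sigma**2) - 1.0) * math.exp(2.0 * mu + sigma**2)
    return mean, math.sqrt(variance)

# Round trip for the "high" discrimination target (mean 1.8, SD 0.15).
mu_high, sigma_high = lognormal_parameters(1.8, 0.15)
mean_back, sd_back = lognormal_moments(mu_high, sigma_high)
```

Passing the resulting pair to a lognormal random number generator, for example `random.lognormvariate(mu_high, sigma_high)`, would then yield values whose mean and standard deviation approach the targets.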
For the difficulty parameter, normal distributions with a mean of -0.5 or 0.5 and a
standard deviation of 0.4 are used. To avoid violating the latent trait model assumption that the
function between the latent trait and the probability of the correct answer is monotonically
increasing, the variances of the distributions for generating item parameters are controlled by
restricting the ranges of the parameter distributions. The distributions of the created
discrimination parameters ranged from 0.5 to 2.5, and those of the created difficulty parameters
ranged from -2 to 2. The combinations of item parameters are shown in Table 3-3.
3.1.3 Data Generating of Examinees’ Responses
The estimation performance of the bifactor model is evaluated via the comparison
between the parameters generated under the simulation conditions and the parameters estimated
from the bifactor model. In other words, the comparison investigates how close the estimated
parameters from the bifactor model are to the parameters generated under the simulation
conditions. In order to estimate parameters by using the bifactor model, examinees’ responses
are generated based on the item and θ parameters generated under the simulation conditions.
Item parameters and θ parameters are plugged in to the function of the bifactor model in
order to calculate the probability of answering correctly:

$$ P(u = 1 \mid \theta_{0}, \theta_{1}, \theta_{2}, a_{0}, a_{1}, a_{2}, d_{i}) = \frac{1}{1 + \exp[-(d_{i} + a_{0}\theta_{0} + a_{1}\theta_{1} + a_{2}\theta_{2})]}, \quad \text{where } d_{i} = -b\sqrt{\sum_{v=0}^{2} a_{v}^{2}} $$

As a result, each examinee has a probability to answer correctly for each item. To add
randomness to each value, random values from a uniform distribution ranging from 0 to 1 are
generated and assigned to each response probability value. If the probability is higher than the
random value, the corresponding response is assigned as 1, which means that the examinee
answers that question correctly. If the probability is lower than the random value, the response is
assigned as 0, which means that the examinee responds with the wrong answer.
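The response generation just described can be sketched in a few lines. The sketch assumes the common logistic form with an intercept d = -b·sqrt(sum of squared discriminations); the function names, the example item, and the seed are illustrative only.

```python
import math
import random

def d_parameter(b, a_values):
    """Intercept d = -b * sqrt(sum of squared discriminations) (assumed form)."""
    return -b * math.sqrt(sum(a * a for a in a_values))

def probability_correct(thetas, a_values, d):
    """Logistic response probability for one examinee and one item."""
    logit = d + sum(a * t for a, t in zip(a_values, thetas))
    return 1.0 / (1.0 + math.exp(-logit))

def simulate_response(probability, rng):
    """Score 1 when the probability exceeds a Uniform(0, 1) draw, otherwise 0."""
    return 1 if probability > rng.random() else 0

rng = random.Random(1)               # arbitrary seed
a_item = (1.3, 1.3, 0.0)             # item loads on the general and first specific factor
d_item = d_parameter(-0.5, a_item)   # an easy item (b = -0.5) gives a positive d
p = probability_correct((0.0, 0.0, 0.0), a_item, d_item)
responses = [simulate_response(p, rng) for _ in range(10000)]
observed_rate = sum(responses) / len(responses)
```

Comparing the probability with a uniform draw is equivalent to a Bernoulli trial, so over many draws the proportion of correct responses approaches `p`.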
The total number of items is 60, which are divided so that 30 items are indicators for each
of the two specific factors, and the number of examinees in each data set is 2000. The data sets
are generated by using R, and full information marginal maximum likelihood (MML)
estimation implemented in IRTPRO is used for estimating the parameters. Fifty
replications of the simulation study are conducted to achieve stable estimation of the results.
3.2 Evaluation Methods
The generated data are compared to the true parameters in order to confirm if the
generated data sets represent the planned simulation design. Descriptive statistics for the
generated data, such as the mean, standard deviation, minimum, maximum, skewedness and
kurtosis, will be provided.
In order to evaluate the estimation precision of the model under the designed conditions,
the estimated θ and item parameters are compared to the true parameters assigned. Means and
variances of mean bias were used to judge the precision of parameter estimation. The formula for
mean bias is as follows.

$$ \text{Mean Bias} = \frac{1}{n}\sum_{i=1}^{n} (\hat{\xi}_{i} - \xi_{i}), $$

where $\xi_{i}$ is the given parameter, $\hat{\xi}_{i}$ is the estimate of the parameter, and $n$ is the
number of parameters, for example, $n = 30$ for discrimination item parameters of each specific
factor. The mean and variance of mean bias were calculated across the replications. Bias is the
index showing the difference between a parameter and its estimate. To judge overall bias across
the parameters estimated, mean bias is calculated as the average of the differences.
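As an illustration of how the mean and variance of mean bias are obtained across replications, consider the following sketch. The numbers are made up for the example, and the use of the sample variance (dividing by the number of replications minus one) is an assumption.

```python
from statistics import mean, variance

def mean_bias(estimates, true_values):
    """Average of (estimate - true value) over the parameters of one replication."""
    return sum(e - t for e, t in zip(estimates, true_values)) / len(true_values)

def summarize_replications(replications, true_values):
    """Mean and sample variance of the mean bias across replications."""
    biases = [mean_bias(estimates, true_values) for estimates in replications]
    return mean(biases), variance(biases)

# Hypothetical example: three parameters, two replications.
true_values = [1.2, 1.3, 1.4]
replications = [
    [1.3, 1.4, 1.5],   # every estimate 0.1 too high -> mean bias of 0.1
    [1.1, 1.2, 1.3],   # every estimate 0.1 too low  -> mean bias of -0.1
]
bias_mean, bias_variance = summarize_replications(replications, true_values)
```

A positive mean bias indicates overestimation and a negative one underestimation; here the two replications cancel, so the mean of the mean bias is zero while its variance is not.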

For the investigation of the distributional difference between the estimated distributions
and the generated distribution, the Kolmogorov-Smirnov (KS) test was utilized. The KS test is
used to compare two empirical distributions by using their cumulative functions (Stapleton, 2008;
Hogg & Tanis, 1997). In order to compare the generated and estimated distribution, the KS test
was utilized with the entire parameter set, and for specific value ranges of the parameters. For
the KS test with the entire data set, a total of 2000 parameters for each analysis were tested. For
the specific ranges, 2,000 parameters were sorted by their locations, and sets of 200 parameters
were sequentially assigned to each category. Thereby, ten specific value categories were
constructed for the KS tests. All categories had the same frequency of parameters; however,
that does not mean that their continuums had the same width, because depending on the
generated distribution, the frequencies in a certain fixed range can be different. Every
simulation condition was replicated fifty times, and the number of replications out of fifty in
which the KS test was statistically significant at the 0.05 level will be reported.
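A minimal sketch of the two-sample KS comparison, both for a whole set and for ten consecutive groups of 200 values, is given below. It assumes that examinees are sorted by their generated locations and uses the large-sample critical value for the 0.05 level; the function names and the simulated data are illustrative only.

```python
import math
import random

def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest = max(largest, abs(i / len(a) - j / len(b)))
    return largest

def rejects_at_05(sample_a, sample_b):
    """Large-sample two-sided decision at the 0.05 level (critical constant 1.358)."""
    n, m = len(sample_a), len(sample_b)
    return ks_statistic(sample_a, sample_b) > 1.358 * math.sqrt((n + m) / (n * m))

def location_groups(true_values, estimates, size=200):
    """Sort (true, estimate) pairs by true value and cut them into consecutive groups."""
    ordered = sorted(zip(true_values, estimates))
    return [ordered[k:k + size] for k in range(0, len(ordered), size)]

rng = random.Random(7)  # arbitrary seed
true_theta = [rng.gauss(0.0, 1.0) for _ in range(2000)]
estimated_theta = [t + rng.gauss(0.0, 0.3) for t in true_theta]  # hypothetical estimates
groups = location_groups(true_theta, estimated_theta)
flags = [rejects_at_05([t for t, _ in g], [e for _, e in g]) for g in groups]
```

Counting how often `flags` is true for a given group across the fifty replications gives the kind of frequency that is reported for each value category.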

4. Results
4.1. Data Generation
For the simulation study, the item and θ parameters were generated, and those values
were used to generate response strings through the bifactor models. The descriptive statistics of
the generated item and θ parameters are provided.
4.1.1 Item Parameter Generation
The bifactor model used in this study includes three item discrimination parameters
related to general or specific factors, and one d parameter calculated using the discrimination and
difficulty parameters. Table 4-1 shows the descriptive statistics of the generated discrimination
and difficulty parameters. In order to check the range of the generated data, and to avoid extreme
values, minimum, mean, and maximum values were calculated.
The discrimination parameters were generated within the range from 0.5 to 2.5 from a
lognormal distribution with a mean of 1.3 or 1.8, and a standard deviation of 0.15. The mean
and standard deviation statistics showed that the generated discrimination parameters of the
general and the two specific factors had means and standard deviations very close to those of the
generating distribution in each simulation condition. For the discrimination parameters generated
from a lognormal distribution with a mean of 1.3, the means of three discrimination parameter
sets were 1.298, 1.309, and 1.304, which were very close to the simulation condition of 1.3; and
for the parameters from a distribution with a mean of 1.8, the three means were 1.802, 1.804, and
1.796. All of the discrimination parameter sets also had standard deviations very close to the
parameter of 0.15. No values were out of the range from 0.5 to 2.5. The difficulty parameters
were generated from normal distributions with a mean of 0.5 or -0.5, and a standard deviation of
0.4 within the range of -2 to 2. The generated difficulty parameters showed the proper level of
mean and standard deviation for the data sets. The means of the difficulty parameter sets were
0.497 and 0.495 for the mean parameter of 0.5, and they were -0.515 and -0.497 for the mean
parameter of -0.5. Their standard deviations were close to 0.4, and there were no extreme values
outside the range of -2 to 2.
4.1.2 θ Parameter Generation

There were eight types of ability distributions under the conditions characterized by the
degree of skewedness of the distributions, the direction of skewedness, and the correlation
between specific factors. The normal distributions were generated from a standard normal
distribution with a mean of 0 and a standard deviation of 1. The parameter values for the skewed
distributions were means of 0.3 for the negatively skewed distributions and -0.3 for the
positively skewed distributions, standard deviations of 1, skewedness of -0.8 or 0.8, respectively,
and correlations between the specific factors of 0.2 or 0.8.
The descriptive statistics of the generated θ parameters are provided in Tables 4-2, 4-3,
and 4-4. The generated ability distributions showed means, standard deviations, skewedness, and
correlations near their anticipated values. The general factor distributions were generated from
normal distributions or positively skewed distributions (See Table 4-2). For the normal
distributions, the means of the distribution means were from -0.002 to 0.010, and the means of
their standard deviations were from 0.997 to 1.003. The positively skewed distributions had means
that ranged from -0.308 to -0.302, and means of skewedness that ranged from 0.798 to 0.807.

Table 4-1. Descriptive Statistics of Generated Item Parameters
Parameters         Mean (SD)     Factors      Stat    Mean      SD       Min       Max
Disc. Parameters   1.3 (0.15)    General      Min     1.245     0.128    0.841     1.480
                                              Mean    1.298     0.151    0.994     1.689
                                              Max     1.331     0.188    1.106     1.957
                                 Specific1    Min     1.224     0.096    0.858     1.471
                                              Mean    1.309     0.151    1.028     1.631
                                              Max     1.372     0.196    1.174     1.791
                                 Specific2    Min     1.235     0.114    0.870     1.481
                                              Mean    1.304     0.149    1.018     1.639
                                              Max     1.369     0.217    1.132     1.953
                   1.8 (0.15)    General      Min     1.764     0.106    1.349     2.039
                                              Mean    1.802     0.148    1.485     2.178
                                              Max     1.840     0.179    1.583     2.490
                                 Specific1    Min     1.739     0.111    1.411     1.945
                                              Mean    1.804     0.151    1.518     2.134
                                              Max     1.880     0.192    1.617     2.349
                                 Specific2    Min     1.734     0.112    1.344     1.976
                                              Mean    1.796     0.156    1.493     2.155
                                              Max     1.888     0.211    1.621     2.350
Diff. Parameters   -0.5 (0.4)    Specific1    Min     -0.625    0.281    -1.993    -0.025
                                              Mean    -0.515    0.405    -1.330    0.338
                                              Max     -0.382    0.534    -0.901    0.946
                                 Specific2    Min     -0.698    0.323    -1.719    -0.205
                                              Mean    -0.497    0.395    -1.315    0.283
                                              Max     -0.287    0.491    -0.937    0.641
                   0.5 (0.4)     Specific1    Min     0.340     0.301    -0.700    1.065
                                              Mean    0.497     0.399    -0.293    1.297
                                              Max     0.601     0.501    -0.013    1.743
                                 Specific2    Min     0.294     0.289    -0.637    0.952
                                              Mean    0.495     0.386    -0.281    1.291
                                              Max     0.710     0.519    0.044     1.828
* Disc. Parameters: Discrimination parameters; Diff. Parameters: Difficulty parameters

Although this study did not consider kurtosis as a simulation factor, the skewedness of the
distributions affected the kurtosis of the distributions. Standard normal distributions with a mean
of 0 and standard deviation of 1 have skewedness of 0 and kurtosis of 3 when the standardized
fourth moment is used for the formula of the kurtosis (DeCarlo, 1997). Therefore the kurtosis
values of the normal distributions from conditions 1 to 4 in Table 4-2 show values acceptable to
be regarded as a standard normal distribution. From conditions 5 to 8, the means of kurtosis
values range from 3.551 to 3.616, which are higher than the values for the non-skewed
distribution, showing that the kurtosis values are related to the skewedness of the distribution.
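For reference, skewedness and kurtosis defined as standardized third and fourth moments can be computed as in the following sketch. The function names are illustrative, and the population form of the standard deviation (dividing by n) is an assumption.

```python
import math
import random

def standardized_moment(values, order):
    """Mean of ((x - mean) / sd) ** order, with sd computed by dividing by n."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** order for v in values) / n

def skewedness(values):
    """Standardized third moment; 0 for a symmetric distribution."""
    return standardized_moment(values, 3)

def kurtosis(values):
    """Standardized fourth moment; about 3 for a normal distribution."""
    return standardized_moment(values, 4)

rng = random.Random(0)  # arbitrary seed
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]
normal_kurtosis = kurtosis(normal_sample)
normal_skewedness = skewedness(normal_sample)
```

Under this definition no constant is subtracted, so a normal sample yields a kurtosis near 3 rather than near 0.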
Based on the simulation conditions, all of the first specific distributions were positively
skewed (See Table 4-3). The generated factor distributions had means of means that ranged from
-0.305 to -0.297, means of the standard deviations that ranged from 0.995 to 1.004, and the
means of skewedness that ranged from 0.788 to 0.805.

The data sets of second specific factors were generated from positively skewed
distributions and negatively skewed distributions (See Table 4-4). For the positively skewed
distributions, the means of the distributions were from -0.309 to -0.300, and the means of
skewedness were from 0.783 to 0.807. For the negatively skewed distributions, the means were
from 0.296 to 0.308, and the means of skewedness were -0.834 to -0.812.
The descriptive statistics for the correlations between specific factors in each of the eight
distributional simulation conditions of θ parameters showed that those distributions had
correlations close to 0.2 or 0.8 (See Table 4-5). The mean correlations of distributions generated
with the correlation of 0.2 were 0.192, 0.199, 0.194, and 0.197, and those generated with the
correlation of 0.8 were 0.792, 0.794, 0.791, and 0.793.

Table 4-2. Descriptive Statistics of General θ Parameters Generated
Cond.   Distribution        Stat    Mean      SD       Min       Max      Skew      Kurtosis
1       Normal              Min     -0.043    0.960    -4.371    2.845    -0.151    2.841
                            Mean    -0.002    0.997    -3.431    3.397    0.006     2.988
                            Max     0.048     1.041    -2.889    4.131    0.105     3.335
2       Normal              Min     -0.047    0.962    -4.430    2.775    -0.097    2.733
                            Mean    -0.002    1.003    -3.405    3.440    0.005     2.967
                            Max     0.046     1.037    -2.807    4.333    0.151     3.239
3       Normal              Min     -0.057    0.958    -4.587    2.878    -0.121    2.751
                            Mean    -0.001    0.998    -3.478    3.439    -0.005    3.005
                            Max     0.034     1.027    -2.952    3.979    0.111     3.461
4       Normal              Min     -0.035    0.954    -4.149    2.968    -0.094    2.789
                            Mean    0.010     1.002    -3.387    3.481    0.004     2.997
                            Max     0.063     1.045    -2.929    5.050    0.094     3.264
5       Positively Skewed   Min     -0.347    0.961    -1.868    3.266    0.706     3.288
                            Mean    -0.305    0.999    -1.868    4.192    0.807     3.616
                            Max     -0.253    1.045    -1.866    5.656    0.938     4.202
6       Positively Skewed   Min     -0.361    0.970    -1.868    3.192    0.683     3.091
                            Mean    -0.308    0.996    -1.868    4.243    0.798     3.579
                            Max     -0.261    1.036    -1.866    5.838    0.914     4.075
7       Positively Skewed   Min     -0.347    0.964    -1.868    3.427    0.698     3.176
                            Mean    -0.302    0.999    -1.868    4.267    0.802     3.602
                            Max     -0.260    1.040    -1.864    5.781    0.978     4.425
8       Positively Skewed   Min     -0.350    0.957    -1.868    3.352    0.681     3.119
                            Mean    -0.305    0.999    -1.868    4.151    0.798     3.551
                            Max     -0.249    1.036    -1.867    5.378    0.923     4.020
* Cond.: Numbers of simulation conditions

Table 4-3. Descriptive Statistics of First Specific θ Parameters Generated
Cond.   Distribution        Stat    Mean      SD       Min       Max      Skew      Kurtosis
1       Positively Skewed   Min     -0.363    0.949    -1.868    3.133    0.670     3.028
                            Mean    -0.300    0.995    -1.868    4.185    0.788     3.546
                            Max     -0.259    1.033    -1.866    5.981    0.939     4.119
2       Positively Skewed   Min     -0.341    0.969    -1.868    3.395    0.669     3.056
                            Mean    -0.297    1.002    -1.868    4.262    0.794     3.578
                            Max     -0.259    1.035    -1.865    5.544    0.973     4.468
3       Positively Skewed   Min     -0.343    0.967    -1.868    3.182    0.679     3.060
                            Mean    -0.303    1.004    -1.868    4.190    0.805     3.591
                            Max     -0.247    1.044    -1.867    5.572    0.985     4.569
4       Positively Skewed   Min     -0.364    0.967    -1.868    3.408    0.698     3.177
                            Mean    -0.298    0.999    -1.868    4.242    0.803     3.607
                            Max     -0.251    1.051    -1.867    5.568    0.939     4.339
5       Positively Skewed   Min     -0.350    0.965    -1.868    3.228    0.613     2.960
                            Mean    -0.300    1.000    -1.868    4.172    0.790     3.526
                            Max     -0.243    1.041    -1.867    5.467    0.897     3.932
6       Positively Skewed   Min     -0.352    0.970    -1.868    3.461    0.691     3.219
                            Mean    -0.301    1.001    -1.868    4.150    0.801     3.563
                            Max     -0.254    1.050    -1.867    5.174    0.893     3.969
7       Positively Skewed   Min     -0.352    0.955    -1.868    3.310    0.681     3.097
                            Mean    -0.305    0.996    -1.868    4.268    0.799     3.609
                            Max     -0.258    1.044    -1.866    5.717    0.912     4.165
8       Positively Skewed   Min     -0.358    0.962    -1.868    3.418    0.727     3.235
                            Mean    -0.305    1.000    -1.868    4.266    0.803     3.601
                            Max     -0.230    1.041    -1.865    5.572    0.957     4.473
* Cond.: Numbers of simulation conditions

Table 4-4. Descriptive Statistics of Second Specific θ Parameters Generated
Cond.   Distribution        Stat    Mean      SD       Min       Max      Skew      Kurtosis
1       Positively Skewed   Min     -0.359    0.966    -1.868    3.262    0.660     3.129
                            Mean    -0.309    0.997    -1.868    4.170    0.785     3.518
                            Max     -0.270    1.031    -1.866    5.279    0.928     4.313
2       Positively Skewed   Min     -0.348    0.965    -1.868    3.540    0.623     3.127
                            Mean    -0.300    1.001    -1.868    4.188    0.788     3.543
                            Max     -0.257    1.042    -1.866    5.323    0.929     4.107
3       Negatively Skewed   Min     0.269     0.967    -5.797    1.870    -0.980    3.240
                            Mean    0.305     1.000    -4.391    1.871    -0.821    3.664
                            Max     0.365     1.035    -3.324    1.871    -0.721    4.501
4       Negatively Skewed   Min     0.253     0.963    -6.221    1.870    -1.012    3.074
                            Mean    0.308     0.996    -4.358    1.871    -0.819    3.690
                            Max     0.364     1.030    -3.245    1.871    -0.663    4.922
5       Positively Skewed   Min     -0.353    0.966    -1.868    3.127    0.643     3.021
                            Mean    -0.303    0.999    -1.868    4.107    0.783     3.534
                            Max     -0.252    1.034    -1.868    5.850    0.951     4.200
6       Positively Skewed   Min     -0.354    0.960    -1.868    3.458    0.630     2.979
                            Mean    -0.305    0.998    -1.868    4.356    0.807     3.618
                            Max     -0.250    1.047    -1.867    5.655    0.939     4.297
7       Negatively Skewed   Min     0.250     0.967    -5.539    1.870    -0.958    3.035
                            Mean    0.296     1.004    -4.295    1.871    -0.812    3.607
                            Max     0.347     1.040    -3.358    1.871    -0.702    4.140
8       Negatively Skewed   Min     0.242     0.942    -5.740    1.870    -0.965    3.181
                            Mean    0.301     1.002    -4.442    1.871    -0.834    3.720
                            Max     0.368     1.041    -3.516    1.871    -0.680    4.688
* Cond.: Numbers of simulation conditions

Table 4-5. Correlations Between Two Specific θ Parameters Generated
Condition     1        2        3        4        5        6        7        8
Correlation   0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Min           0.133    0.772    0.149    0.781    0.151    0.771    0.158    0.776
Mean          0.192    0.792    0.199    0.794    0.194    0.791    0.197    0.793
Max           0.231    0.828    0.258    0.806    0.241    0.806    0.227    0.807

4.2. Bifactor Analysis
For each condition of the latent trait distributions and item parameters, item and θ
parameter estimation was evaluated. The tables in the body text show the mean and variance of
the mean bias of the parameter estimates. More details of the descriptive statistics, such as
minimum and maximum values of the mean and variance, are provided in Appendices B, C, and D.
4.2.1 Item Parameters
For evaluating the bifactor model under the different distributional conditions, the mean
and variance of the mean bias of the estimated parameters were calculated. Tables 4-6 and 4-7
show means and variances of item parameter mean bias under each of the eight θ simulation
conditions, and more detailed statistics are provided in Appendix B. Among the simulation
conditions, there were four noticeable patterns of item parameter estimation.

Table 4-6. Means of Item Parameter Mean Bias
              Cond.    1         2         3         4         5         6         7         8
              G        (0)       (0)       (0)       (0)       (+)       (+)       (+)       (+)
              S1       (+)       (+)       (+)       (+)       (+)       (+)       (+)       (+)
              S2       (+)       (+)       (-)       (-)       (+)       (+)       (-)       (-)
              Corr.    0.2       0.8       0.2       0.8       0.2       0.8       0.2       0.8
Disc: 1.3     a0       0.121     0.391     0.137     0.407     0.118     0.382     0.110     0.395
Diff: -0.5    a1       -0.207    -0.714    -0.190    -0.750    -0.144    -0.715    -0.170    -0.736
              a2       -0.166    -0.715    -0.116    -0.689    -0.188    -0.681    -0.146    -0.717
              d        -0.433    -0.431    0.005     0.023     -0.840    -0.845    -0.420    -0.426
Disc: 1.3     a0       0.143     0.444     0.134     0.413     0.206     0.488     0.171     0.449
Diff: 0.5     a1       -0.097    -0.678    -0.123    -0.713    -0.109    -0.629    -0.088    -0.671
              a2       -0.125    -0.634    -0.171    -0.728    -0.056    -0.659    -0.210    -0.754
              d        -0.443    -0.438    -0.003    0.008     -0.872    -0.860    -0.428    -0.430
Disc: 1.8     a0       0.216     0.494     0.282     0.538     0.202     0.502     0.185     0.508
Diff: -0.5    a1       -0.308    -0.978    -0.289    -1.007    -0.284    -0.954    -0.235    -1.000
              a2       -0.321    -0.979    -0.215    -0.957    -0.274    -0.955    -0.290    -0.993
              d        -0.611    -0.611    -0.001    0.054     -1.204    -1.195    -0.602    -0.586
Disc: 1.8     a0       0.305     0.615     0.266     0.542     0.421     0.709     0.334     0.617
Diff: 0.5     a1       -0.208    -0.891    -0.214    -0.961    -0.145    -0.854    -0.177    -0.921
              a2       -0.208    -0.891    -0.214    -0.961    -0.145    -0.854    -0.177    -0.921
              d        -0.635    -0.615    -0.015    0.009     -1.278    -1.249    -0.623    -0.620
* Cond.: Numbers of simulation conditions
* G, S1, & S2: Distributions of general, first specific, and second specific factors
* a0, a1, and a2: Discrimination parameter for general, first and second specific traits; d: d-parameter
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* Corr.: Correlation between specific factor distributions

Table 4-7. Variances of Item Parameter Mean Bias
              Cond.    1        2        3        4        5        6        7        8
              G        (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
              S1       (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
              S2       (+)      (+)      (-)      (-)      (+)      (+)      (-)      (-)
              Corr.    0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Disc: 1.3     a0       0.019    0.018    0.025    0.021    0.020    0.019    0.025    0.022
Diff: -0.5    a1       0.013    0.020    0.013    0.026    0.011    0.020    0.011    0.021
              a2       0.012    0.020    0.013    0.020    0.012    0.019    0.013    0.020
              d        0.008    0.008    0.204    0.210    0.009    0.009    0.196    0.209
Disc: 1.3     a0       0.031    0.024    0.022    0.022    0.030    0.021    0.022    0.023
Diff: 0.5     a1       0.014    0.022    0.014    0.022    0.017    0.020    0.014    0.022
              a2       0.015    0.021    0.012    0.025    0.014    0.023    0.014    0.030
              d        0.011    0.011    0.207    0.213    0.016    0.015    0.209    0.220
Disc: 1.8     a0       0.020    0.023    0.026    0.034    0.028    0.026    0.045    0.030
Diff: -0.5    a1       0.017    0.023    0.016    0.027    0.015    0.020    0.016    0.025
              a2       0.014    0.023    0.018    0.025    0.015    0.022    0.018    0.022
              d        0.011    0.011    0.415    0.423    0.012    0.011    0.393    0.412
Disc: 1.8     a0       0.031    0.032    0.025    0.033    0.027    0.034    0.036    0.043
Diff: 0.5     a1       0.016    0.020    0.018    0.025    0.019    0.026    0.019    0.027
              a2       0.017    0.023    0.016    0.031    0.020    0.027    0.017    0.035
              d        0.018    0.018    0.410    0.425    0.028    0.028    0.437    0.449
* Cond.: Numbers of simulation conditions
* G, S1, & S2: Distributions of general, first specific, and second specific factors
* a0, a1, and a2: Discrimination parameter for general, first and second specific traits; d: d-parameter
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* Corr.: Correlation between specific factor distributions

First, the degree of skewedness of the general factor was influential in the estimation of
d-parameters. Table 4-6 shows the mean of the mean bias related to the item parameters. Under
conditions 5, 6, 7, and 8, which have the skewed general factor distribution, the d-parameter
estimates show a larger amount of mean bias than under the corresponding conditions 1, 2, 3, and
4, which have the normal distribution of the general factor. Table 4-7 includes the variances of
the mean bias of item parameter estimates. When the variances of d-parameter estimates are
compared between these two sets of conditions, there is no noticeable pattern, and this
result shows that the degree of skewedness of the general factor is influential not in the variances
of d-parameter biases but in the means of d-parameter biases.
Second, the combination of skewedness directions of the specific factor distributions was
influential in d-parameter estimation. When the two specific factors had distributions skewed in
the same direction, for example, two positively skewed distributions or two negatively skewed
distributions, the d-parameters had larger biases, as shown under conditions 1, 2, 5, and 6 in
Table 4-6, whereas the d-parameters had smaller amounts of bias when the directions of the
skewed distributions were different, as under conditions 3, 4, 7, and 8. The amounts of bias
increased when the general factor distributions were also skewed: the results for the d parameters
under conditions 1 to 4 had smaller amounts of bias than the results under the corresponding
conditions 5 to 8.
Also, the direction of skewedness of the skewed distributions affects the variance of the d
parameters. As shown in Table 4-7, the d parameters had large variances under conditions 3, 4, 7,
and 8, when the two specific trait distributions had different directions of skewedness. On the
other hand, the variances of the discriminations (a0, a1 and a2) related to the general, first and
second specific factors had no noticeable patterns depending on the directions of skewedness of
the distributions. These patterns were shown across all four item parameter conditions.
Third, the strength of the correlation between the specific factors had a noticeable effect
on the estimation of discrimination parameters. The distributional conditions with a high
correlation of 0.8 between the specific factors had larger amounts of mean bias in item
discrimination parameter estimation than the conditions with a smaller correlation of 0.2 between
the specific factors. Table 4-6 shows that the discrimination parameter estimates under
conditions 2, 4, 6, and 8 with a high correlation of 0.8 had larger mean biases than the results
under conditions 1, 3, 5, and 7 with a lower correlation of 0.2. For example, under the mean
discrimination parameter of 1.3 and difficulty parameter of -0.5 condition in Table 4-6, with a
low correlation of 0.2 between the specific factors, mean biases of the discrimination parameters
related to the general factor range from 0.110 to 0.137 for conditions 1, 3, 5, and 7. However,
the corresponding range of the mean biases with the high correlation is from 0.382 to 0.407 under
conditions 2, 4, 6, and 8. This pattern was found regardless of the item parameter combination.
The high correlation also affects the variance of discrimination parameter estimates for the
specific factors. The discrimination parameters had larger amounts of variance in the mean bias
when there was a high correlation between the two specific factors, under conditions 2, 4, 6,
and 8 in Table 4-7.
Lastly, the item discrimination parameters related to the general factor were generally
overestimated, whereas the discrimination parameters related to the specific factors and the d
parameters were generally underestimated. Negative and positive values of bias indicate
underestimation and overestimation, respectively, because bias is the result of subtracting a
parameter from its estimated value. In Table 4-6, all mean biases of the discrimination
parameters related to the general factor were positive, which means that these parameters tended
to be overestimated. Most of the discrimination parameters related to the specific factors and
most of the d parameters had negative mean biases, except for some d parameters, especially
under conditions 3 and 4.
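The sign convention for bias described above can be illustrated with a minimal sketch. The estimates below are hypothetical values for a single discrimination parameter, not results from this study, and the use of the sample variance is an assumption.

```python
from statistics import mean, variance

def bias_summary(true_value, estimates):
    """Mean and sample variance of bias, where bias = estimate - true value."""
    biases = [estimate - true_value for estimate in estimates]
    return mean(biases), variance(biases)

# Hypothetical estimates of one discrimination parameter (true value 1.3)
# from five replications; these are illustrative numbers only.
estimates = [1.42, 1.38, 1.45, 1.40, 1.35]
mean_bias, var_bias = bias_summary(1.3, estimates)

# A positive mean bias indicates overestimation, a negative one underestimation.
direction = "overestimated" if mean_bias > 0 else "underestimated"
print(round(mean_bias, 3), round(var_bias, 5), direction)
```

Because every hypothetical estimate exceeds the true value, the mean bias is positive, which corresponds to overestimation in the convention used here.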
Based on the results, it was demonstrated that the degree of skewedness of the general
factor distributions, the skewedness directions of the specific factor distributions, and the
correlation between the specific factor distributions are influential in estimating the item
parameters. No noticeable pattern of item parameter estimates across the four item parameter
conditions was found.
4.2.2 θ Parameters
Similar to the item parameter estimation, θ parameter estimation was evaluated under the eight
distributional conditions across four item parameter conditions. In this section, the mean of
the mean biases, the variance of the mean biases, and the correlation between the generated and
estimated trait distributions for the general, first specific, and second specific factors are
investigated.
a. Mean of Mean Biases
The results for the mean bias of the θ parameters are shown in Table 4-8. The θ parameters
were generated from a standard normal distribution, from a negatively skewed distribution with a
mean of 0.3, or from a positively skewed distribution with a mean of -0.3. The mean values of
the mean biases in Table 4-8 are very close to 0, -0.3, or 0.3. They are the discrepancies
between 0, the mean of a standard normal distribution, and the means of the generating θ
distributions. Item response functions are determined by the item and θ parameters, but the θ
continuum is not a fixed scale; it is an arbitrary one. Because of this indeterminacy, the
estimation procedure should select the method

Table 4-8. Means of θ Parameter Mean Bias

Cond.                              1       2       3       4       5       6       7       8
G                                (0)     (0)     (0)     (0)     (+)     (+)     (+)     (+)
S1                               (+)     (+)     (+)     (+)     (+)     (+)     (+)     (+)
S2                               (+)     (+)     (-)     (-)     (+)     (+)     (-)     (-)
Corr.                            0.2     0.8     0.2     0.8     0.2     0.8     0.2     0.8

Disc: 1.3, Diff: -0.5   G      0.003   0.006   0.002  -0.010   0.305   0.308   0.303   0.308
                        S1     0.301   0.297   0.304   0.298   0.299   0.301   0.305   0.305
                        S2     0.309   0.300  -0.305  -0.308   0.304   0.305  -0.295  -0.301

Disc: 1.3, Diff: 0.5    G      0.001   0.004   0.000  -0.010   0.303   0.303   0.301   0.302
                        S1     0.301   0.297   0.303   0.298   0.299   0.301   0.303   0.305
                        S2     0.307   0.300  -0.306  -0.308   0.302   0.305  -0.296  -0.302

Disc: 1.8, Diff: -0.5   G     -0.001   0.006   0.006  -0.016   0.308   0.309   0.306   0.301
                        S1     0.301   0.297   0.306   0.298   0.302   0.301   0.308   0.305
                        S2     0.305   0.300  -0.302  -0.308   0.304   0.305  -0.296  -0.301

Disc: 1.8, Diff: 0.5    G      0.001   0.002   0.001  -0.010   0.311   0.308   0.301   0.300
                        S1     0.300   0.296   0.304   0.297   0.303   0.301   0.304   0.305
                        S2     0.308   0.300  -0.306  -0.308   0.305   0.305  -0.297  -0.302

* Cond.: Numbers of simulation conditions
* G: General factor distribution; S1: First specific factor distributions; S2: Second specific
distributions
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* Corr.: Correlation between specific factor distributions

Table 4-9. Variances of θ Parameter Mean Bias

Cond.                              1       2       3       4       5       6       7       8
G                                (0)     (0)     (0)     (0)     (+)     (+)     (+)     (+)
S1                               (+)     (+)     (+)     (+)     (+)     (+)     (+)     (+)
S2                               (+)     (+)     (-)     (-)     (+)     (+)     (-)     (-)
Corr.                            0.2     0.8     0.2     0.8     0.2     0.8     0.2     0.8

Disc: 1.3, Diff: -0.5   G      0.383   0.546   0.408   0.570   0.399   0.559   0.413   0.579
                        S1     0.471   0.851   0.491   0.889   0.457   0.832   0.471   0.874
                        S2     0.469   0.856   0.455   0.816   0.470   0.845   0.454   0.833

Disc: 1.3, Diff: 0.5    G      0.419   0.574   0.403   0.570   0.412   0.558   0.395   0.558
                        S1     0.454   0.823   0.448   0.808   0.468   0.803   0.450   0.802
                        S2     0.463   0.832   0.490   0.903   0.466   0.813   0.509   0.922

Disc: 1.8, Diff: -0.5   G      0.374   0.542   0.399   0.577   0.396   0.559   0.423   0.591
                        S1     0.479   0.852   0.506   0.909   0.470   0.841   0.483   0.894
                        S2     0.484   0.862   0.461   0.812   0.476   0.850   0.468   0.835

Disc: 1.8, Diff: 0.5    G      0.415   0.578   0.398   0.575   0.411   0.560   0.390   0.559
                        S1     0.461   0.817   0.451   0.800   0.472   0.809   0.455   0.796
                        S2     0.466   0.830   0.515   0.927   0.479   0.820   0.531   0.948

* Cond.: Numbers of simulation conditions
* G: General factor distribution; S1: First specific factor distributions; S2: Second specific
distributions
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* Corr.: Correlation between specific factor distributions

to set up the mean and variance of the θ distribution in order to estimate unique parameters
(Lord, 1980; Reckase, 2009). The most frequently used method is to fix the mean at 0 and the
variance at 1. The IRTPRO software used for this research sets the mean and variance of the θ
distribution to 0 and 1, respectively, as default values, and the results showed the average
discrepancies between the generated θ parameters and 0. Therefore, the mean biases close to 0,
-0.3, or 0.3 indicate that the means of the estimated θ parameters were at ‘0’. This result shows
that in order to evaluate the mean of the mean bias of the θ distribution, some alternative
method for setting the mean value needs to be used instead of the fixed value of 0 for the mean
of the θ distribution. The θ estimates were also consistently centered at ‘0’ regardless of the
item condition.
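The consequence of fixing the mean of the estimated scale at 0 can be sketched with a small simulation. This is only an illustration under stated assumptions: the noise level, the seed, and the re-centering step are stand-ins for a real estimation run and are not taken from IRTPRO or from this study.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical generating trait values with a nonzero mean (0.3 is used here
# because the text reports that value for the negatively skewed condition).
true_theta = [random.gauss(0.3, 1.0) for _ in range(2000)]

# Stand-in for estimation: noisy values that are re-centered so their mean is 0,
# mimicking a procedure that fixes the mean of the trait scale at 0.
noisy = [t + random.gauss(0.0, 0.5) for t in true_theta]
center = mean(noisy)
est_theta = [x - center for x in noisy]

# Mean bias = mean(estimate - generating value) = 0 - mean(generating values).
mean_bias = mean([e - t for e, t in zip(est_theta, true_theta)])
print(round(mean_bias, 3), round(-mean(true_theta), 3))
```

Under these assumptions the mean bias equals the negative of the generating mean, so it reflects the location of the generating distribution rather than the accuracy of estimation.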
b. Variances of Mean Biases
The condition of the latent trait distributions with the most important effect on the
variance of the mean biases was the correlation between specific factors. The variances of the
θ parameter mean biases are shown in Table 4-9. Unlike the results for the means of mean
biases, the variance results showed a specific pattern depending on the correlation between the
specific factors. The amount of variance in the mean bias increased under conditions 2, 4, 6, and
8 with a high correlation between the specific factors (correlation = 0.8), compared to the
amount of variance in mean bias under conditions with a low correlation between the specific
factors. While the general, first specific, and second specific factors all had a large amount of
variance in mean bias with the high correlation, the specific factor distributions showed more
variance in mean bias than the general factor distributions across all item conditions. In order
to investigate estimation precision, as a first insight, the correlations between the generated
θ parameters and the estimated θ parameters were calculated.
c. Correlations between Generated and Estimated θ Parameter Distributions
Tables 4-10, 4-11, and 4-12 show the mean correlations between the generated and estimated
θ parameters, and more detailed information is provided in Appendix D. Under the various θ and
item conditions, no noticeable patterns related to the correlation between the generated and
estimated general factor distributions were found. The correlations between the generated and
estimated distributions for the general factor are shown in Table 4-10. The estimated general
factor scores showed consistently high mean correlations regardless of the θ and item
conditions, although the correlations were slightly lower when the correlation between the
specific factors was high. Most of the mean correlations were greater than .77 under
conditions 1, 3, 5, and 7, with a low correlation of 0.2 between specific factors, and the mean
correlations were greater than .70 under conditions 2, 4, 6, and 8 with a high correlation of 0.8
between the specific factors.
For the first and second specific factors, estimation precision was sensitive to the level of
correlation between the specific factors. Under the low correlation of 0.2 between the specific
factors, the mean correlations of the generated and estimated first specific factors, shown in
Table 4-11, and of the generated and estimated second specific factors, shown in Table 4-12,
were over 0.7, although those correlations were slightly lower than the correlations between the
generated and estimated θ parameters for the general factor. Whereas the level of correlation
between the specific factors was only slightly influential on the observed correlation between
the generated and estimated general factor θ parameters, the mean correlations between the
generated and estimated θ parameters for the first and second specific factors were below 0.5
under conditions 2, 4, 6, and 8 with the high correlation between the specific factors.
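As a rough illustration of how the correlation between generated and estimated values reflects estimation precision, the following sketch adds different amounts of noise to the same generated values. The noise standard deviations (0.8 and 2.0), the seed, and the helper names are hypothetical choices for illustration; they are not the study's estimates.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
generated = [random.gauss(0.0, 1.0) for _ in range(2000)]

def noisy_estimates(theta, noise_sd):
    """Stand-in for estimated values: generated value plus independent error."""
    return [t + random.gauss(0.0, noise_sd) for t in theta]

# For unit-variance generated values the expected correlation is
# 1 / sqrt(1 + noise_sd**2), so larger error lowers the correlation.
r_precise = pearson(generated, noisy_estimates(generated, 0.8))
r_imprecise = pearson(generated, noisy_estimates(generated, 2.0))
print(round(r_precise, 3), round(r_imprecise, 3))
```

A lower correlation in this sketch corresponds to a larger error component in the estimates, which is the sense in which the correlations in Tables 4-10 to 4-12 are read as a first indication of precision.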

The correlation between the generated and estimated θ parameters of the specific factors
also showed noticeable patterns according to item condition (see Tables 4-11 and 4-12). While
the mean correlations between the generated and estimated θ parameters of the specific factors
did not show a distinguishable difference depending on the level of the item discrimination
parameters (mean discrimination parameters of 1.3 vs. 1.8), they showed a clear pattern
depending on the level of the item difficulty parameters (mean difficulty parameters of 0.5 and
-0.5), especially under conditions 4 and 8, which had specific factors with a high correlation
and distributions skewed in opposite directions. The first specific factors had lower
correlations between the generated and estimated θ parameters under conditions 4 and 8 when the
item difficulty parameters had a mean of -0.5, whereas the second specific factors had lower
correlations under conditions 4 and 8 when the item difficulty parameters had a mean of 0.5.
This result implies that the effect of the correlation between the specific factor distributions
on the correlation between the generated and estimated θ parameters for the specific factors is
related not only to the direction of skewedness of the distributions but also to the item
parameter conditions.
d. Kolmogorov-Smirnov Test (KS test)
In order to compare the generated and estimated θ distributions, the KS test was utilized
with the entire θ parameter set, and with specific ranges of the θ parameters. Table 4-13 shows
summary results of the KS test; complete results of the KS tests are included in Appendix E.
Every simulation condition was replicated fifty times, and the values in the tables show the
number of replications, out of fifty, in which the test was statistically significant at the
significance level of 0.05. For example, in Table 4-13, under Condition 1 with the mean of
discrimination parameters equal to 1.3 and mean of difficulty parameters equal to -0.5, 18 of the
estimated θ distributions among fifty replications were shown to be significantly different from

Table 4-10. Mean of the Correlations of the General Factors

Condition              1       2       3       4       5       6       7       8
G                    (0)     (0)     (0)     (0)     (+)     (+)     (+)     (+)
S1                   (+)     (+)     (+)     (+)     (+)     (+)     (+)     (+)
S2                   (+)     (+)     (-)     (-)     (+)     (+)     (-)     (-)
Correlation          0.2     0.8     0.2     0.8     0.2     0.8     0.2     0.8

General
(1.3, -0.5)        0.789   0.720   0.775   0.708   0.780   0.711   0.772   0.703
(1.8, -0.5)        0.767   0.702   0.777   0.709   0.771   0.707   0.782   0.712
(1.3, 0.5)         0.794   0.724   0.779   0.707   0.782   0.713   0.766   0.699
(1.8, 0.5)         0.768   0.700   0.779   0.708   0.771   0.704   0.784   0.713

* Condition: Numbers of simulation conditions
* G: General factor distribution; S1: First specific factor distributions; S2: Second specific
distributions
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* (1.3, -0.5): Discrimination parameters with mean of 1.3 and Difficulty parameters with mean of
-0.5
* Correlation: Correlation between specific factor distributions
the generated θ distribution, with the p-value for the KS test statistic being less than the
significance level of 0.05.
Most of the ten specific categories of the estimated general factor θ distributions were not
significantly different from the generated distributions when they were generated from a standard

Table 4-11. Correlation Means of the First Specific Factors

Cond.                  1       2       3       4       5       6       7       8
G                    (0)     (0)     (0)     (0)     (+)     (+)     (+)     (+)
S1                   (+)     (+)     (+)     (+)     (+)     (+)     (+)     (+)
S2                   (+)     (+)     (-)     (-)     (+)     (+)     (-)     (-)
Corr.                0.2     0.8     0.2     0.8     0.2     0.8     0.2     0.8

S1
(1.3, -0.5)        0.726   0.431   0.719   0.386   0.740   0.442   0.728   0.403
(1.8, -0.5)        0.722   0.434   0.708   0.378   0.730   0.443   0.722   0.395
(1.3, 0.5)         0.739   0.448   0.747   0.457   0.730   0.465   0.741   0.463
(1.8, 0.5)         0.732   0.457   0.744   0.465   0.728   0.459   0.736   0.467

* Cond.: Numbers of simulation conditions
* G: General factor distribution; S1: First specific factor distributions; S2: Second specific
distributions
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* (1.3, -0.5): Discrimination parameters with mean of 1.3 and Difficulty parameters with mean of
-0.5
* Corr.: Correlation between specific factor distributions

normal distribution under conditions 1, 2, 3, and 4, whereas the KS tests on the entire set of
θ parameters more often showed significant differences between the generated and estimated
distributions. For example, under Condition 2 with the mean discrimination parameter equal to

Table 4-12. Correlation Means of the Second Specific Factors

Cond.                  1       2       3       4       5       6       7       8
G                    (0)     (0)     (0)     (0)     (+)     (+)     (+)     (+)
S1                   (+)     (+)     (+)     (+)     (+)     (+)     (+)     (+)
S2                   (+)     (+)     (-)     (-)     (+)     (+)     (-)     (-)
Corr.                0.2     0.8     0.2     0.8     0.2     0.8     0.2     0.8

S2
(1.3, -0.5)        0.729   0.425   0.741   0.448   0.729   0.433   0.743   0.441
(1.8, -0.5)        0.718   0.423   0.735   0.451   0.726   0.431   0.733   0.442
(1.3, 0.5)         0.734   0.447   0.718   0.374   0.732   0.449   0.705   0.355
(1.8, 0.5)         0.730   0.447   0.701   0.359   0.721   0.443   0.692   0.341

* Cond.: Numbers of simulation conditions
* G: General factor distribution; S1: First specific factor distributions; S2: Second specific
distributions
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* (1.3, -0.5): Discrimination parameters with mean of 1.3 and Difficulty parameters with mean of
-0.5
* Corr.: Correlation between specific factor distributions

1.3 and mean difficulty parameter equal to -0.5 in Table 4-13, all of the replications were
significant when the entire data set was tested, but few estimated θ parameter distribution
replications (between one and three) were significantly different from the generated true θ
parameter distribution when KS tests were conducted on ten specific categories of the θ values.

When the correlation between the specific factor distributions was high, the frequencies
of significant test results for differences between the estimated and generated general factor
distributions increased. Compared to conditions 1 and 3, significant results under the KS test
were found much more frequently under conditions 2 and 4. For example, in Table 4-13, with the
mean discrimination parameter equal to 1.3 and mean difficulty parameter equal to -0.5, KS test
results from all of the fifty replications of the entire data sets showed significant differences
between the generated and estimated θ parameters under the high correlation between the
specific factors (Conditions 2 and 4), whereas only eighteen or twenty-two replications were
significant under the lower correlation.
All of the specific factor distributions were positively or negatively skewed, and the
results of the KS test showed that the estimated distributions were significantly different from
the generated distributions. According to Stapleton (2008), the KS test is powerful when the
tested distributions depart from normality, as long as the sample size is sufficient. This means
that the sensitivity of the KS test increases as the sample size increases. The tests on the
entire θ parameter distributions, which included 2,000 values, could have been more sensitive
than the tests on the specific categories, which included 200 θ parameter values. For example,
Table 4-13 shows that the KS tests for the entire data set were more frequently significant than
the tests within the sub-categories.
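The dependence of the KS test on sample size can be sketched as follows. This is not the analysis reported in Table 4-13: the two samples are simply normal draws that differ by a hypothetical location shift of 0.2, the p-value uses the asymptotic Kolmogorov distribution, and only the sample sizes (2,000 and 200) and the fifty replications are taken from the text.

```python
import bisect
import math
import random

def ks_two_sample(x, y):
    """Two-sample KS statistic D and its asymptotic two-sided p-value."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    for v in xs + ys:
        # Difference between the two empirical CDFs at each observed value.
        gap = abs(bisect.bisect_right(xs, v) / n - bisect.bisect_right(ys, v) / m)
        d = max(d, gap)
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The limiting p-value is 1 to many decimal places for such small values.
        return d, 1.0
    # Kolmogorov limiting distribution: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2).
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))

def count_significant(sample_size, shift, replications=50, alpha=0.05, seed=0):
    """Number of replications in which the KS test rejects at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        generated = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        shifted = [rng.gauss(shift, 1.0) for _ in range(sample_size)]
        _, p_value = ks_two_sample(generated, shifted)
        if p_value < alpha:
            hits += 1
    return hits

full_set = count_significant(2000, shift=0.2)
sub_range = count_significant(200, shift=0.2)
print(full_set, sub_range)
```

With the same underlying difference, the larger samples lead to rejection in more replications than the smaller ones, which is consistent with the explanation that the tests on 2,000 values were more sensitive than the tests on 200 values.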

Table 4-13. Frequency of Significant Differences between Distributions of Generating and
Estimated General Factor Parameters

Discrimination      Mean: 1.8 / SD: 0.15
Item Difficulty     Mean: -0.5 / SD: 0.4    Mean: 0.5 / SD: 0.4
Cond.                  1     2     3     4     1     2     3     4
G                    (0)   (0)   (0)   (0)   (+)   (+)   (+)   (+)
S1                   (+)   (+)   (+)   (+)   (+)   (+)   (+)   (+)
S2                   (+)   (+)   (-)   (-)   (+)   (+)   (-)   (-)
Correlation          0.2   0.8   0.2   0.8   0.2   0.8   0.2   0.8

1 to 2000             18    50    22    50    34    50    21    49
1 to 200               0     1     0     2     0     5     0     4
201 to 400             0     3     0     3     0     4     0     3
401 to 600             0     3     0     4     0     2     0     4
601 to 800             0     2     0     0     0     4     0     0
801 to 1000            0     2     0     0     0     4     0     0
1001 to 1200           0     2     0     2     0     1     0     4
1201 to 1400           0     2     0     1     0     0     0     2
1401 to 1600           1     2     0     1     1     3     0     2
1601 to 1800           0     2     1     6     0     1     0     6
1801 to 2000           1     3     0     1     0     1     0     3

* Cond.: Numbers of simulation conditions
* G: General factor distribution; S1: First specific factor distributions; S2: Second specific
distributions
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* Corr.: Correlation between specific factor distributions

Table 4-13 (cont’d)

Discrimination      Mean: 1.8 / SD: 0.15
Item Difficulty     Mean: -0.5 / SD: 0.4    Mean: 0.5 / SD: 0.4
Condition              1     2     3     4     1     2     3     4
G                    (0)   (0)   (0)   (0)   (+)   (+)   (+)   (+)
S1                   (+)   (+)   (+)   (+)   (+)   (+)   (+)   (+)
S2                   (+)   (+)   (-)   (-)   (+)   (+)   (-)   (-)
Correlation          0.2   0.8   0.2   0.8   0.2   0.8   0.2   0.8

1 to 2000             28    50    46    50    50    50    46    50
1 to 200               0     1     1     4     1     1     0     5
201 to 400             1     1     0     4     0     3     0     5
401 to 600             2     3     1     6     0     2     1     6
601 to 800             0     2     1     2     0     3     1     0
801 to 1000            0     2     1     2     0     3     1     0
1001 to 1200           0     1     0     5     0     1     0     3
1201 to 1400           0     3     0     1     0     2     0     1
1401 to 1600           1     2     0     1     1     4     1     2
1601 to 1800           1     2     1     6     1     1     0     6
1801 to 2000           0     3     0     0     0     3     0     5

* Cond.: Numbers of simulation conditions
* G: General factor distribution; S1: First specific factor distributions; S2: Second specific
distributions
* (0): Standard normal distribution; (+): Positively skewed distributions; (-): Negatively skewed
distributions
* Corr.: Correlation between specific factor distributions

5. Discussion
5.1 Summary of the Results
Item parameter estimation was affected by the degree of skewedness of the general factor,
the directions of skewedness of the specific factors, and the correlation between specific factors.
These influential conditions of the latent trait distributions had different effects on item
parameter estimation depending on the type of item parameter. First, the degree of skewedness of
the general factor was influential in the estimation of the d parameters. Second, the direction of
skewedness of the specific factor distributions was also influential in d parameter estimation. The
skewedness direction affected both the mean and variance of the d parameter mean biases, and
the effect on estimation increased when the general factor distribution was also skewed. Third,
the correlation between the specific factors had a noticeable effect on the estimation of
discrimination parameters. While estimation of discrimination parameters related to both the
general and specific factors was affected by the size of the correlation between the specific
factors, the discrimination parameters of the specific factors exhibited much more variance in
their mean biases as a result than the discrimination parameters corresponding to the general
factor. Lastly, the item discrimination parameters related to the general factor were generally
overestimated, whereas the discrimination parameters related to the specific factors and the d
parameters were generally underestimated.
The estimated θ distributions had means of 0, and so the mean biases of the θ
distributions had values close to 0, -0.3, and 0.3, depending on the direction of the generated
distribution. Based on the variances of the mean biases and the correlations between generated
and estimated θ parameters, the most significant condition of the latent trait distribution in
θ parameter estimation was the correlation between the specific factors. The amount of variance
in the mean bias increased under conditions with a high correlation of 0.8 between the specific
factors. While all three factors (general, first specific, and second specific) had large
amounts of variance in mean bias with the high correlation, the specific factor distributions
showed much more variance than the general factor distributions across the item conditions.
Whereas only a slightly noticeable pattern was found related to the correlation between
the generated and estimated distributions for the general factor, the correlations between the
generating and estimated distributions for the first and second specific factors were markedly
lower when the correlation between the specific factors was high (0.8) than when it was low
(0.2). Also the effect of the correlation between the specific factors depended on the item
condition, and this result implies that the effect of the correlation between the specific
distributions is related to not only the direction of skewedness of the distributions but also to the
item parameter conditions.
By the Kolmogorov-Smirnov test, most of the ten specific categories of the estimated
general factor distributions were not found to be significantly different from the generated
distributions when the parameters were generated from a standard normal distribution. When the
correlation between the specific factor distributions was high, the frequencies of significant test
results for the general factor distribution increased. All of the specific factor distributions were
positively or negatively skewed distributions, and the results of the KS test showed that the
estimated distributions were significantly different from the generated distributions.
5.2 Implications
The use of measurements based on the concepts of multi-dimensional and non-normal
distributions has been increasing in various fields. Latent trait models have been developed in
order to represent these complicated measurement properties. Researchers have studied
appropriate estimation methods for each model, and the recommended methods have been
evaluated in empirical situations. As an extension of these studies, this research examines the
estimation performance of a bifactor model under various distributional conditions of the general
and specific factors.
In many cases, the distributions of latent traits that represent particular participant
characteristics are non-normal. For example, it is not unusual to find that satisfaction
measurements from a program evaluation or interaction frequencies in a social networking
analysis have a skewed distribution with a long tail or with high kurtosis. When measurement
models are used to estimate the parameters from data that do not follow a normal distribution,
the normal distribution assumption of the estimation method may be violated. Therefore, new
models and estimation methods should be developed in order to solve these problems: how the
estimation of the model can be made robust when the normal distribution assumption is violated,
or how the empirical data distribution can be substituted for a normal distribution in the
estimation procedure. Woods and Thissen (2006) introduced Ramsay-Curve IRT, which is a
nonparametric estimation procedure for the IRT latent distribution, and showed the capability of
the method with normal and non-normal latent distributions. Also, a complex model that allows
correlations between the latent trait factors has been studied (Fujimoto, 2014; Cai, 2010). For
these newly-developed methods, it is necessary to evaluate their capability in different empirical
situations to determine their limitations and produce further developments. This research
evaluated the estimation quality of the bifactor model, and the results showed how conditions of
the item and θ parameter distributions affect item and θ parameter estimation under particular
estimation assumptions. As previous research has done, this research is expected to provide
information about estimation performance and guidelines for future research.
One of the most important conditions studied in this research was the non-normality of
the distribution of the latent traits being measured. When the amount and direction of
skewedness of the distributions and the correlation between the specific factors were varied, the
results showed that in the analysis of data generated from skewed distributions, both the item and
θ conditions influenced the quality of estimation. Also, the conditions had different effects
depending on the type of item parameter estimated. The results from this research showing the
effect of skewed latent trait distributions are consistent with the results of previous studies. Sass
et al. (2008) demonstrated the effect of skewed distributions on estimating the distribution and
item parameters using a unidimensional latent trait model. In that study, difficulty parameter
estimates were particularly affected by the presence of a skewed latent trait distribution.
Similarly, in my research the amount and direction of skewedness of the latent trait distributions
had a significant influence on the mean and variance of d parameters’ mean biases, which relates
to the estimation of difficulty parameters.
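For orientation, the role of the d parameter can be seen in a slope-intercept form of a two-parameter logistic bifactor item response function. The sketch below assumes the common parameterization in which the logit is a0·θ0 + as·θs + d, with each item loading on the general factor and one specific factor; the model equation is not restated in this section, so this form, the absence of a scaling constant, and the function and argument names are assumptions.

```python
import math

def bifactor_2pl_prob(theta_general, theta_specific, a_general, a_specific, d):
    """Probability of a correct response under an assumed slope-intercept
    two-parameter logistic bifactor model (hypothetical helper)."""
    logit = a_general * theta_general + a_specific * theta_specific + d
    return 1.0 / (1.0 + math.exp(-logit))

# An examinee at the origin of both traits, with two hypothetical intercepts.
p_harder = bifactor_2pl_prob(0.0, 0.0, a_general=1.3, a_specific=1.3, d=-0.65)
p_easier = bifactor_2pl_prob(0.0, 0.0, a_general=1.3, a_specific=1.3, d=0.65)
print(round(p_harder, 3), round(p_easier, 3))
```

In this form a larger d raises the response probability at every trait level, so a biased d estimate shifts the apparent difficulty of the item.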
The most significant condition of the latent trait distributions for estimation was the
correlation between the specific factors. The correlation of the specific factors had a remarkable
impact on estimation not only by itself but also in conjunction with particular item parameter
conditions. This result shows that the combination of the item and θ conditions and the
distributional assumptions should be considered simultaneously when the model and estimation
method are evaluated.

The skewed distributions were transformed from normal distributions via the Copula
method, and Kolmogorov-Smirnov tests were used to evaluate the distributional differences
between the generated and estimated θ parameter distributions. My application of those methods
suggests some implications for future research. The Copula method requires identification
of a transformation function, and in this research two polynomial functions were used for
transformation to produce negatively and positively skewed latent trait distributions. Even
though the R² values of the functions were very close to 1 (.999), it should be noted that the
extreme values were particularly sensitive to the polynomial transformation function selected.
Also, Kolmogorov-Smirnov tests showed the differences between the generated and estimated
θ parameters in specific ranges; however, this method had very high power to detect differences
when entire distributions were compared. Especially for the skewed distributions, all cases in
each category were significant, using a significance level of 0.05. This shows that the skewed
distributional condition tended to have significant differences between the generated and
estimated θ parameter distributions; however, the test could not provide specific information
and details for each range. Therefore, more sophisticated and alternative methods are required
for the transformation and the evaluation procedures.
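A polynomial transformation of this kind can be sketched with the mean cubic coefficients reported in Table A-2 for the positively skewed distribution. The sketch assumes that the polynomial is applied to standard normal draws, which is one plausible reading of the procedure rather than a confirmed description of it; the function names, seed, and sample size are hypothetical.

```python
import random
from statistics import mean, pstdev

# Mean coefficients of the cubic fit in Table A-2 (positively skewed, 2,000 examinees).
CONSTANT, B1, B2, B3 = -0.4508, 1.0167, 0.1461, -0.0136

def to_skewed(z):
    """Map a standard normal value to the skewed scale with the cubic polynomial."""
    return CONSTANT + B1 * z + B2 * z ** 2 + B3 * z ** 3

def skewness(values):
    """Moment-based sample skewness."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

random.seed(2)
normal_draws = [random.gauss(0.0, 1.0) for _ in range(10000)]
skewed_draws = [to_skewed(z) for z in normal_draws]

# For a standard normal input the expected mean is CONSTANT + B2 = -0.3047,
# in line with the mean of about -0.3 reported for the positively skewed condition.
print(round(mean(skewed_draws), 3), round(skewness(skewed_draws), 3))
print(round(to_skewed(-3.0), 3), round(to_skewed(-4.0), 3))
```

In this sketch `to_skewed(-4.0)` is larger than `to_skewed(-3.0)`, because the negative cubic coefficient bends the curve back in the far lower tail. This is one concrete way in which extreme values can be sensitive to the polynomial chosen, even when the overall fit is very close.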
In an effort to measure the structure of complex constructs, multidimensional latent trait
models have been developed. The bifactor model is one of those multidimensional models, and is
connected mathematically to other major classes of multidimensional measurement models. This
research evaluates the bifactor model to determine how well it works in various empirical
contexts. While the distributions of latent traits are often assumed to be normal, the distributions
observed in empirical data are not always normal. Also, despite the advantages of the bifactor
model, it restricts the latent traits to be orthogonal.
The results from this research provided information about the estimation properties of
bifactor models under conditions when their distributional and relational assumptions are not met.
Also, the influence of item parameters was shown. Based on this information, the results can be
applied to analyses using models of multidimensional latent traits. The study of the effect of the
latent trait distribution on parameter estimation is also significant in terms of providing
information about measurement error for data analysis. With the increasing number of studies
and practical need for multidimensional structures of latent traits, this research is expected to
provide useful guidelines for investigating appropriate multidimensional models.

APPENDICES

Appendix A
Table A-1. Parameter Estimates of Quadratic Regression Function for Positively Skewed
Distribution with 2,000 examinees

          R Square  Constant        b1        b2        b3
Mean        0.9980   -0.4508    0.9770    0.1461         -
Var         0.0000    0.0006    0.0002    0.0001         -
Min         0.9960   -0.4948    0.9418    0.1282         -
Max         0.9993   -0.3915    1.0128    0.1727         -

*b1, b2, and b3: regression coefficients of linear, quadratic, and cubic terms

Table A-2. Parameter Estimates of Cubic Regression Function for Positively Skewed
Distribution with 2,000 examinees

          R Square  Constant        b1        b2        b3
Mean        0.9990   -0.4508    1.0167    0.1461   -0.0136
Var         0.0000    0.0006    0.0003    0.0001    0.0000
Min         0.9978   -0.4948    0.9772    0.1282   -0.0236
Max         0.9997   -0.3915    1.0652    0.1727   -0.0060

*b1, b2, and b3: regression coefficients of linear, quadratic, and cubic terms

Table A-3. Parameter Estimates of Quadratic Regression Function for Positively Skewed
Distribution with 10,000 examinees

          R Square  Constant        b1        b2        b3
Mean        0.9986   -0.4434    0.9793    0.1454         -
Var         0.0000    0.0001    0.0001    0.0000         -
Min         0.9978   -0.4701    0.9676    0.1337         -
Max         0.9993   -0.4261    0.9970    0.1557         -

*b1, b2, and b3: regression coefficients of linear, quadratic, and cubic terms

Table A-4. Parameter Estimates of Cubic Regression Function for Positively Skewed
Distribution with 10,000 examinees

          R Square  Constant        b1        b2        b3
Mean        0.9995   -0.4434    1.0166    0.1454   -0.0125
Var         0.0000    0.0001    0.0001    0.0000    0.0000
Min         0.9993   -0.4701    0.9989    0.1337   -0.0175
Max         0.9998   -0.4261    1.0366    0.1557   -0.0086

*b1, b2, and b3: regression coefficients of linear, quadratic, and cubic terms

Table A-5. Parameter Estimates of Quadratic Regression Function for Negatively Skewed
Distribution with 2,000 examinees
R Square

Constant

b1

b2

b3

Mean

0.9984

0.4516

0.9801

-0.1500

-

Var

0.0000

0.0006

0.0002

0.0001

-

Min

0.9964

0.3921

0.9453

-0.1785

-

Max

0.9995

0.4950

1.0173

-0.1308

-

*b1, b2, and b3: regression coefficients of linear, quadratic, and cubic terms

Table A-6. Parameter Estimates of Cubic Regression Function for Negatively Skewed
Distribution with 2,000 examinees
      R Square  Constant        b1        b2        b3
Mean    0.9991    0.4516    1.0135   -0.1500   -0.0115
Var     0.0000    0.0006    0.0003    0.0001    0.0000
Min     0.9981    0.3921    0.9720   -0.1785   -0.0225
Max     0.9997    0.4950    1.0625   -0.1308   -0.0023
*b1, b2, and b3: regression coefficients of linear, quadratic, and cubic terms

Table A-7. Parameter Estimates of Quadratic Regression Function for Negatively Skewed
Distribution with 10,000 examinees
      R Square  Constant        b1        b2        b3
Mean    0.9988    0.4441    0.9798   -0.1462         -
Var     0.0000    0.0001    0.0001    0.0000         -
Min     0.9980    0.4223    0.9646   -0.1566         -
Max     0.9995    0.4705    0.9977   -0.1351         -
*b1, b2, and b3: regression coefficients of linear, quadratic, and cubic terms

Table A-8. Parameter Estimates of Cubic Regression Function for Negatively Skewed
Distribution with 10,000 examinees
      R Square  Constant        b1        b2        b3
Mean    0.9996    0.4441    1.0151   -0.1462   -0.0119
Var     0.0000    0.0001    0.0001    0.0000    0.0000
Min     0.9993    0.4223    0.9976   -0.1566   -0.0164
Max     0.9998    0.4705    1.0360   -0.1351   -0.0079
*b1, b2, and b3: regression coefficients of linear, quadratic, and cubic terms
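
As a reading aid for Tables A-1 to A-8, the short Python sketch below evaluates the polynomial implied by one row of coefficients. It assumes the usual form y = Constant + b1*x + b2*x^2 + b3*x^3, with b3 set to 0 for the quadratic fits. The function name fitted_value and the evaluation point x = 1 are illustrative choices, not taken from the study, and this appendix does not state which variables play the roles of x and y.

```python
def fitted_value(x, constant, b1, b2, b3=0.0):
    """Evaluate constant + b1*x + b2*x**2 + b3*x**3 (assumed functional form)."""
    return constant + b1 * x + b2 * x ** 2 + b3 * x ** 3


# Mean coefficients copied from Table A-1 (quadratic) and Table A-2 (cubic):
# positively skewed distribution with 2,000 examinees.
quadratic_mean = {"constant": -0.4508, "b1": 0.9770, "b2": 0.1461}
cubic_mean = {"constant": -0.4508, "b1": 1.0167, "b2": 0.1461, "b3": -0.0136}

# x = 1.0 is an arbitrary illustrative input.
quad_at_1 = fitted_value(1.0, **quadratic_mean)
cubic_at_1 = fitted_value(1.0, **cubic_mean)
print(round(quad_at_1, 4), round(cubic_at_1, 4))
```

The two fits share the same Constant and b2 means in these tables, so at any x they differ only through b1 and the added b3 term.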

Appendix B
Table B-1. Mean and Variance of Item Parameter Biases under Disc. of 1.3 and Diff. of -0.5
Cond.                  1        2        3        4        5        6        7        8
G                    (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1                   (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2                   (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.                0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Mean
G Disc.   min     0.0518   0.3207   0.0576   0.3393   0.0374   0.3084   0.0478   0.3176
G Disc.   mean    0.1214   0.3910   0.1372   0.4072   0.1178   0.3824   0.1098   0.3952
G Disc.   max     0.2168   0.4773   0.2348   0.5125   0.1995   0.4422   0.1665   0.4613
S1 Disc.  min    -0.5501  -0.8914  -0.5463  -0.9447  -0.3600  -0.8853  -0.5512  -0.9216
S1 Disc.  mean   -0.2066  -0.7139  -0.1895  -0.7500  -0.1442  -0.7150  -0.1697  -0.7357
S1 Disc.  max    -0.0482  -0.5252   0.0691  -0.6273   0.0698  -0.5660   0.0899  -0.5897
S2 Disc.  min    -0.3516  -0.9029  -0.4893  -0.8175  -0.4804  -0.8331  -0.7114  -0.8767
S2 Disc.  mean   -0.1661  -0.7150  -0.1157  -0.6894  -0.1879  -0.6814  -0.1456  -0.7172
S2 Disc.  max     0.0331  -0.5773   0.1494  -0.5508   0.0080  -0.5188   0.0821  -0.5423
D         min    -0.5161  -0.5358  -0.0651  -0.1145  -0.9444  -0.9847  -0.4985  -0.5359
D         mean   -0.4329  -0.4310   0.0054   0.0234  -0.8401  -0.8448  -0.4201  -0.4264
D         max    -0.3546  -0.3103   0.1018   0.1433  -0.7695  -0.7345  -0.3000  -0.2852
Variance
G Disc.   min     0.0067   0.0107   0.0070   0.0113   0.0071   0.0115   0.0079   0.0147
G Disc.   mean    0.0188   0.0177   0.0247   0.0212   0.0196   0.0193   0.0245   0.0221
G Disc.   max     0.0727   0.0296   0.0615   0.0336   0.0686   0.0294   0.1181   0.0313
S1 Disc.  min     0.0060   0.0075   0.0054   0.0125   0.0049   0.0067   0.0049   0.0105
S1 Disc.  mean    0.0126   0.0203   0.0129   0.0259   0.0108   0.0203   0.0114   0.0213
S1 Disc.  max     0.0369   0.0468   0.0274   0.0457   0.0196   0.0401   0.0249   0.0525
S2 Disc.  min     0.0042   0.0073   0.0056   0.0077   0.0054   0.0094   0.0062   0.0080
S2 Disc.  mean    0.0117   0.0195   0.0133   0.0199   0.0122   0.0186   0.0128   0.0198
S2 Disc.  max     0.0256   0.0435   0.0343   0.0342   0.0271   0.0339   0.0345   0.0365
D         min     0.0039   0.0043   0.1592   0.1787   0.0059   0.0056   0.1528   0.1778
D         mean    0.0077   0.0075   0.2043   0.2096   0.0092   0.0089   0.1964   0.2092
D         max     0.0106   0.0115   0.2470   0.2478   0.0147   0.0137   0.2662   0.2436
* Disc.: Discrimination item parameter; D / Diff.: Difficulty parameter
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Table B-2. Mean and Variance of Item Parameter Biases under Disc. of 1.3 and Diff. of 0.5
Cond.                  1        2        3        4        5        6        7        8
G                    (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1                   (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2                   (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.                0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Mean
G Disc.   min     0.0464   0.3541   0.0555   0.3390   0.1280   0.3927   0.0635   0.3637
G Disc.   mean    0.1432   0.4437   0.1344   0.4133   0.2057   0.4875   0.1711   0.4491
G Disc.   max     0.2692   0.5308   0.2398   0.5052   0.3134   0.5833   0.2394   0.5404
S1 Disc.  min    -0.5176  -0.8712  -0.4041  -0.8545  -0.6482  -0.8471  -0.3627  -0.8388
S1 Disc.  mean   -0.0974  -0.6775  -0.1226  -0.7126  -0.1087  -0.6289  -0.0878  -0.6714
S1 Disc.  max     0.1969  -0.4658   0.0996  -0.4984   0.2257  -0.4316   0.1478  -0.4996
S2 Disc.  min    -0.5220  -0.9141  -0.4374  -0.9474  -0.3851  -0.8688  -0.7276  -1.0232
S2 Disc.  mean   -0.1249  -0.6341  -0.1708  -0.7281  -0.0564  -0.6590  -0.2103  -0.7537
S2 Disc.  max     0.1200  -0.4482   0.0475  -0.5688   0.1772  -0.4432   0.0244  -0.6011
D         min    -0.5524  -0.5218  -0.0893  -0.0972  -0.9769  -0.9906  -0.5214  -0.5450
D         mean   -0.4432  -0.4382  -0.0034   0.0077  -0.8721  -0.8598  -0.4282  -0.4304
D         max    -0.3359  -0.3474   0.0783   0.2179  -0.7575  -0.7560  -0.3112  -0.2353
Variance
G Disc.   min     0.0084   0.0152   0.0079   0.0109   0.0098   0.0116   0.0088   0.0132
G Disc.   mean    0.0306   0.0236   0.0218   0.0223   0.0295   0.0213   0.0217   0.0228
G Disc.   max     0.0985   0.0353   0.1008   0.0341   0.1263   0.0300   0.0619   0.0398
S1 Disc.  min     0.0065   0.0120   0.0050   0.0057   0.0063   0.0077   0.0057   0.0101
S1 Disc.  mean    0.0139   0.0224   0.0142   0.0220   0.0172   0.0200   0.0143   0.0223
S1 Disc.  max     0.0350   0.0628   0.0336   0.0604   0.0761   0.0388   0.0310   0.0486
S2 Disc.  min     0.0048   0.0081   0.0059   0.0127   0.0059   0.0082   0.0059   0.0115
S2 Disc.  mean    0.0146   0.0211   0.0117   0.0246   0.0143   0.0230   0.0143   0.0304
S2 Disc.  max     0.0315   0.0430   0.0196   0.0733   0.0309   0.0631   0.0644   0.0617
D         min     0.0065   0.0065   0.1561   0.1804   0.0086   0.0096   0.1767   0.1825
D         mean    0.0107   0.0108   0.2068   0.2130   0.0155   0.0147   0.2092   0.2201
D         max     0.0152   0.0187   0.2699   0.2423   0.0234   0.0304   0.2582   0.2694
* Disc.: Discrimination item parameter; D / Diff.: Difficulty parameter
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Table B-3. Mean and Variance of Item Parameter Biases under Disc. of 1.8 and Diff. of -0.5
Cond.                  1        2        3        4        5        6        7        8
G                    (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1                   (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2                   (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.                0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Mean
G Disc.   min     0.0679   0.3911   0.1290   0.4211  -0.0440   0.4045  -0.0789   0.4002
G Disc.   mean    0.2163   0.4935   0.2820   0.5378   0.2016   0.5022   0.1850   0.5080
G Disc.   max     0.3628   0.5746   0.4263   0.6272   0.3604   0.5831   0.3220   0.6128
S1 Disc.  min    -0.4473  -1.0972  -0.3875  -1.1511  -0.6284  -1.1186  -0.5386  -1.1185
S1 Disc.  mean   -0.3077  -0.9783  -0.2885  -1.0068  -0.2842  -0.9541  -0.2349  -0.9995
S1 Disc.  max    -0.1196  -0.8678  -0.0026  -0.8202   0.0696  -0.7873   0.2381  -0.8592
S2 Disc.  min    -0.5096  -1.1444  -0.3255  -1.0831  -0.6628  -1.1417  -0.6950  -1.1150
S2 Disc.  mean   -0.3206  -0.9786  -0.2154  -0.9570  -0.2740  -0.9550  -0.2901  -0.9927
S2 Disc.  max    -0.1832  -0.8407  -0.1340  -0.8116   0.0605  -0.7702   0.0484  -0.9018
D         min    -0.7969  -0.7648  -0.1830  -0.1192  -1.4206  -1.3482  -0.8585  -0.8027
D         mean   -0.6106  -0.6111  -0.0009   0.0544  -1.2043  -1.1949  -0.6017  -0.5864
D         max    -0.4454  -0.4274   0.1481   0.2478  -1.0508  -1.0203  -0.4262  -0.3206
Variance
G Disc.   min     0.0114   0.0136   0.0139   0.0203   0.0096   0.0136   0.0142   0.0199
G Disc.   mean    0.0201   0.0225   0.0259   0.0336   0.0283   0.0260   0.0451   0.0300
G Disc.   max     0.0640   0.0356   0.1055   0.0496   0.1399   0.0418   0.2533   0.0434
S1 Disc.  min     0.0058   0.0109   0.0074   0.0138   0.0077   0.0104   0.0067   0.0122
S1 Disc.  mean    0.0168   0.0228   0.0163   0.0270   0.0154   0.0204   0.0157   0.0250
S1 Disc.  max     0.0373   0.0416   0.0286   0.0469   0.0336   0.0341   0.0289   0.0547
S2 Disc.  min     0.0068   0.0098   0.0074   0.0132   0.0052   0.0114   0.0069   0.0106
S2 Disc.  mean    0.0142   0.0227   0.0183   0.0246   0.0146   0.0215   0.0180   0.0216
S2 Disc.  max     0.0263   0.0383   0.0339   0.0429   0.0340   0.0395   0.0751   0.0403
D         min     0.0072   0.0065   0.3004   0.3682   0.0076   0.0075   0.3244   0.3592
D         mean    0.0112   0.0110   0.4151   0.4230   0.0119   0.0107   0.3932   0.4115
D         max     0.0158   0.0181   0.4952   0.4753   0.0206   0.0150   0.5369   0.5031
* Disc.: Discrimination item parameter; D / Diff.: Difficulty parameter
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Table B-4. Mean and Variance of Item Parameter Biases under Disc. of 1.8 and Diff. of 0.5
Cond.                  1        2        3        4        5        6        7        8
G                    (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1                   (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2                   (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.                0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Mean
G Disc.   min     0.0205   0.5126   0.0211   0.4275   0.2486   0.5698   0.1853   0.5014
G Disc.   mean    0.3053   0.6149   0.2661   0.5419   0.4206   0.7092   0.3344   0.6165
G Disc.   max     0.4789   0.7022   0.3815   0.6671   0.5687   0.8086   0.4428   0.7829
S1 Disc.  min    -0.4631  -1.0605  -0.3136  -1.0915  -0.2541  -1.0893  -0.3178  -1.0873
S1 Disc.  mean   -0.2075  -0.8910  -0.2135  -0.9612  -0.1446  -0.8536  -0.1768  -0.9212
S1 Disc.  max    -0.0386  -0.6841   0.0053  -0.8276   0.0221  -0.6149  -0.0555  -0.7876
S2 Disc.  min    -0.3654  -1.0131  -0.4576  -1.1091  -0.2997  -1.0280  -0.4166  -1.1505
S2 Disc.  mean   -0.1884  -0.8777  -0.3010  -0.9951  -0.1591  -0.8628  -0.3035  -1.0090
S2 Disc.  max     0.1765  -0.7680  -0.2001  -0.8612  -0.0055  -0.6371  -0.1684  -0.9012
D         min    -0.7658  -0.7621  -0.2331  -0.1928  -1.5176  -1.4091  -0.7523  -0.7552
D         mean   -0.6353  -0.6148  -0.0147   0.0086  -1.2776  -1.2486  -0.6227  -0.6196
D         max    -0.4744  -0.4390   0.2370   0.2027  -1.1127  -1.1132  -0.4510  -0.4308
Variance
G Disc.   min     0.0132   0.0217   0.0135   0.0193   0.0142   0.0212   0.0084   0.0235
G Disc.   mean    0.0306   0.0321   0.0253   0.0325   0.0270   0.0335   0.0363   0.0432
G Disc.   max     0.1981   0.0562   0.0770   0.0599   0.0599   0.0566   0.0779   0.0705
S1 Disc.  min     0.0052   0.0088   0.0092   0.0113   0.0068   0.0112   0.0100   0.0148
S1 Disc.  mean    0.0162   0.0204   0.0177   0.0245   0.0187   0.0263   0.0189   0.0273
S1 Disc.  max     0.0361   0.0411   0.0328   0.0438   0.0300   0.0574   0.0370   0.0591
S2 Disc.  min     0.0082   0.0117   0.0074   0.0154   0.0099   0.0119   0.0078   0.0138
S2 Disc.  mean    0.0174   0.0228   0.0157   0.0308   0.0195   0.0270   0.0170   0.0348
S2 Disc.  max     0.0314   0.0487   0.0282   0.0523   0.0300   0.0475   0.0304   0.0666
D         min     0.0098   0.0094   0.3407   0.3794   0.0172   0.0137   0.3464   0.3523
D         mean    0.0183   0.0181   0.4104   0.4245   0.0283   0.0283   0.4371   0.4490
D         max     0.0358   0.0321   0.4890   0.4770   0.0520   0.0521   0.5564   0.5055
* Disc.: Discrimination item parameter; D / Diff.: Difficulty parameter
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Appendix C
Table C-1. Mean and Variance of Parameter Biases under Disc. of 1.3 and Diff. of -0.5
Cond.             1        2        3        4        5        6        7        8
G               (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1              (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2              (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.           0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Mean
G   Min     -0.0488  -0.0625  -0.0352  -0.0793   0.2690   0.2224   0.2388   0.2542
G   Mean     0.0028   0.0064   0.0021  -0.0096   0.3048   0.3084   0.3029   0.3078
G   Max      0.0479   0.0818   0.0513   0.0568   0.3582   0.3697   0.3516   0.3638
S1  Min      0.2641   0.2586   0.2431   0.2510   0.2430   0.2538   0.2513   0.2301
S1  Mean     0.3011   0.2966   0.3042   0.2976   0.2989   0.3012   0.3046   0.3050
S1  Max      0.3671   0.3412   0.3503   0.3636   0.3568   0.3521   0.3598   0.3577
S2  Min      0.2681   0.2570  -0.3532  -0.3643   0.2483   0.2501  -0.3484  -0.3677
S2  Mean     0.3086   0.3003  -0.3051  -0.3079   0.3039   0.3047  -0.2950  -0.3014
S2  Max      0.3578   0.3479  -0.2684  -0.2527   0.3531   0.3538  -0.2346  -0.2418
Variance
G   Min      0.3345   0.4924   0.3597   0.5294   0.3627   0.5197   0.3705   0.5306
G   Mean     0.3827   0.5464   0.4077   0.5702   0.3990   0.5589   0.4129   0.5786
G   Max      0.4680   0.5952   0.4981   0.6341   0.4601   0.6262   0.5272   0.6369
S1  Min      0.4169   0.7023   0.4116   0.7582   0.3874   0.7176   0.4160   0.7748
S1  Mean     0.4714   0.8506   0.4905   0.8886   0.4569   0.8322   0.4708   0.8738
S1  Max      0.6123   0.9790   0.6497   1.0351   0.5355   0.9342   0.5497   1.0342
S2  Min      0.4151   0.7733   0.4076   0.7341   0.4108   0.6922   0.3743   0.7218
S2  Mean     0.4688   0.8557   0.4546   0.8158   0.4698   0.8449   0.4537   0.8330
S2  Max      0.5209   0.9521   0.5657   0.9319   0.5523   0.9613   0.6878   0.9304
* Disc.: Discrimination item parameter / Diff.: Difficulty parameter
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Table C-2. Mean and Variance of Parameter Biases under Disc. of 1.3 and Diff. of 0.5
Cond.             1        2        3        4        5        6        7        8
G               (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1              (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2              (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.           0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Mean
G   Min     -0.0579  -0.0538  -0.0442  -0.1161   0.2325   0.2271   0.2487   0.2191
G   Mean     0.0007   0.0043   0.0000  -0.0104   0.3033   0.3034   0.3006   0.3023
G   Max      0.0463   0.0719   0.0656   0.0462   0.3606   0.3632   0.3483   0.3538
S1  Min      0.2534   0.2585   0.2469   0.2508   0.2404   0.2539   0.2572   0.2299
S1  Mean     0.3006   0.2965   0.3032   0.2975   0.2993   0.3011   0.3027   0.3048
S1  Max      0.3661   0.3411   0.3538   0.3635   0.3506   0.3515   0.3445   0.3577
S2  Min      0.2548   0.2569  -0.3568  -0.3644   0.2453   0.2500  -0.3389  -0.3678
S2  Mean     0.3070   0.3002  -0.3060  -0.3080   0.3019   0.3046  -0.2956  -0.3015
S2  Max      0.3724   0.3483  -0.2682  -0.2528   0.3524   0.3534  -0.2487  -0.2417
Variance
G   Min      0.3616   0.5200   0.3526   0.5297   0.3571   0.5115   0.3553   0.5122
G   Mean     0.4194   0.5741   0.4030   0.5698   0.4123   0.5578   0.3949   0.5581
G   Max      0.5173   0.6330   0.4736   0.6306   0.5163   0.6333   0.4665   0.6049
S1  Min      0.3798   0.7138   0.3569   0.7004   0.3959   0.6914   0.3853   0.7350
S1  Mean     0.4540   0.8229   0.4479   0.8076   0.4676   0.8034   0.4496   0.8016
S1  Max      0.5882   0.9599   0.5479   0.9231   0.7469   0.9582   0.5219   0.9080
S2  Min      0.3922   0.7491   0.4265   0.8308   0.4104   0.6833   0.4231   0.7808
S2  Mean     0.4626   0.8316   0.4897   0.9025   0.4658   0.8125   0.5092   0.9221
S2  Max      0.5590   0.9164   0.5512   0.9843   0.5341   0.9619   0.7431   1.0380
* Disc.: Discrimination item parameter / Diff.: Difficulty parameter
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Table C-3. Mean and Variance of Parameter Biases under Disc. of 1.8 and Diff. of -0.5
Cond.             1        2        3        4        5        6        7        8
G               (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1              (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2              (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.           0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Mean
G   Min     -0.0752  -0.0751  -0.0515  -0.0815   0.2449   0.2489   0.2355   0.2266
G   Mean    -0.0006   0.0064   0.0062  -0.0158   0.3080   0.3092   0.3055   0.3014
G   Max      0.0770   0.0755   0.0793   0.0325   0.3767   0.3796   0.3938   0.3713
S1  Min      0.2484   0.2587   0.2542   0.2512   0.2558   0.2539   0.2558   0.2299
S1  Mean     0.3009   0.2967   0.3057   0.2977   0.3018   0.3012   0.3080   0.3050
S1  Max      0.4024   0.3414   0.3610   0.3636   0.3512   0.3510   0.3857   0.3576
S2  Min      0.2614   0.2571  -0.3461  -0.3644   0.2461   0.2499  -0.3475  -0.3677
S2  Mean     0.3053   0.3003  -0.3022  -0.3079   0.3040   0.3047  -0.2957  -0.3013
S2  Max      0.3521   0.3481  -0.2544  -0.2528   0.3740   0.3533  -0.2395  -0.2421
Variance
G   Min      0.3391   0.4948   0.3606   0.5362   0.3365   0.5172   0.3635   0.5561
G   Mean     0.3737   0.5422   0.3989   0.5765   0.3959   0.5593   0.4227   0.5911
G   Max      0.4285   0.5954   0.4342   0.6204   0.4531   0.6102   0.5729   0.6315
S1  Min      0.4215   0.7403   0.4629   0.8165   0.4059   0.7850   0.4283   0.8126
S1  Mean     0.4791   0.8520   0.5061   0.9091   0.4699   0.8405   0.4826   0.8936
S1  Max      0.5247   0.9537   0.5536   1.0262   0.5151   0.9308   0.5409   0.9735
S2  Min      0.4347   0.7870   0.4223   0.7219   0.4195   0.7551   0.4111   0.6987
S2  Mean     0.4844   0.8619   0.4608   0.8119   0.4761   0.8497   0.4679   0.8347
S2  Max      0.5302   0.9652   0.5055   0.8846   0.5481   0.9195   0.6768   0.9165
* Disc.: Discrimination item parameter / Diff.: Difficulty parameter
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Table C-4. Mean and Variance of Parameter Biases under Disc. of 1.8 and Diff. of 0.5
Cond.             1        2        3        4        5        6        7        8
G               (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1              (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2              (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.           0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Mean
G   Min     -0.0565  -0.0658  -0.0730  -0.0768   0.2499   0.2409   0.2534   0.2200
G   Mean     0.0005   0.0022   0.0006  -0.0104   0.3108   0.3076   0.3010   0.3001
G   Max      0.0640   0.0651   0.0742   0.0379   0.3715   0.3713   0.3475   0.3487
S1  Min      0.2505   0.2576   0.2435   0.2511   0.2358   0.2552   0.2502   0.2299
S1  Mean     0.2998   0.2963   0.3038   0.2973   0.3033   0.3014   0.3043   0.3048
S1  Max      0.3546   0.3408   0.3695   0.3636   0.3824   0.3526   0.3547   0.3576
S2  Min      0.2521   0.2559  -0.3656  -0.3646   0.2455   0.2499  -0.3638  -0.3678
S2  Mean     0.3078   0.3001  -0.3058  -0.3080   0.3052   0.3045  -0.2967  -0.3015
S2  Max      0.3620   0.3471  -0.2578  -0.2526   0.3549   0.3546  -0.2475  -0.2419
Variance
G   Min      0.3821   0.5198   0.3686   0.5382   0.3768   0.5213   0.3580   0.4981
G   Mean     0.4154   0.5780   0.3983   0.5753   0.4105   0.5599   0.3897   0.5585
G   Max      0.5126   0.6286   0.4257   0.6249   0.4413   0.6044   0.4188   0.6138
S1  Min      0.4073   0.7006   0.4196   0.7288   0.4313   0.7346   0.4150   0.6650
S1  Mean     0.4614   0.8169   0.4509   0.8000   0.4715   0.8088   0.4554   0.7957
S1  Max      0.5468   0.9201   0.4836   0.8611   0.5162   0.8903   0.5117   0.8618
S2  Min      0.4155   0.7575   0.4742   0.8381   0.4219   0.7322   0.4887   0.8186
S2  Mean     0.4658   0.8296   0.5145   0.9271   0.4794   0.8197   0.5306   0.9475
S2  Max      0.4983   0.9384   0.5620   1.0190   0.5316   0.9070   0.5711   1.0542
* Disc.: Discrimination item parameter / Diff.: Difficulty parameter
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Appendix D
Table D-1. Correlations between Generated and Estimated Factors with Discrimination
Parameters from mean of 1.3
Cond.             1        2        3        4        5        6        7        8
G               (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1              (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2              (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.           0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Difficulty parameters with mean of -0.5 and standard deviation of 0.4
G   Mean     0.7890   0.7200   0.7749   0.7079   0.7800   0.7106   0.7719   0.7033
G   Var      0.0002   0.0002   0.0003   0.0002   0.0002   0.0001   0.0002   0.0002
G   Min      0.7440   0.6867   0.7219   0.6737   0.7355   0.6736   0.7166   0.6698
G   Max      0.8253   0.7504   0.8030   0.7347   0.8046   0.7364   0.7952   0.7329
S1  Mean     0.7260   0.4310   0.7186   0.3859   0.7396   0.4416   0.7276   0.4032
S1  Var      0.0007   0.0018   0.0007   0.0016   0.0004   0.0011   0.0003   0.0017
S1  Min      0.6280   0.3317   0.6346   0.2993   0.6904   0.3624   0.6905   0.2815
S1  Max      0.7670   0.5216   0.7649   0.5044   0.7768   0.5251   0.7674   0.4973
S2  Mean     0.7295   0.4248   0.7414   0.4482   0.7289   0.4329   0.7431   0.4414
S2  Var      0.0003   0.0011   0.0003   0.0014   0.0004   0.0011   0.0011   0.0013
S2  Min      0.6969   0.3178   0.6718   0.3558   0.6685   0.3690   0.5572   0.3504
S2  Max      0.7631   0.4932   0.7702   0.5161   0.7656   0.5059   0.7839   0.5397
Difficulty parameters with mean of 0.5 and standard deviation of 0.4
G   Mean     0.7668   0.7024   0.7771   0.7085   0.7713   0.7066   0.7819   0.7119
G   Var      0.0003   0.0002   0.0002   0.0002   0.0003   0.0002   0.0002   0.0002
G   Min      0.7031   0.6677   0.7479   0.6711   0.7182   0.6664   0.7468   0.6867
G   Max      0.8047   0.7352   0.8046   0.7312   0.7980   0.7357   0.8037   0.7407
S1  Mean     0.7387   0.4480   0.7466   0.4572   0.7296   0.4650   0.7410   0.4625
S1  Var      0.0007   0.0019   0.0004   0.0017   0.0023   0.0016   0.0003   0.0015
S1  Min      0.6492   0.2882   0.7044   0.3705   0.4561   0.3751   0.6983   0.3786
S1  Max      0.7742   0.5267   0.8015   0.5427   0.7702   0.5451   0.7850   0.5311
S2  Mean     0.7339   0.4470   0.7178   0.3736   0.7324   0.4487   0.7049   0.3549
S2  Var      0.0004   0.0010   0.0003   0.0015   0.0005   0.0020   0.0012   0.0024
S2  Min      0.6733   0.3722   0.6874   0.2322   0.6688   0.3681   0.5076   0.2216
S2  Max      0.7746   0.5015   0.7508   0.4258   0.7721   0.5467   0.7505   0.4712
* Cond.: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions

Table D-2. Correlations between Generated and Estimated Factors with Discrimination
Parameters from mean of 1.8
Condition         1        2        3        4        5        6        7        8
G               (0)      (0)      (0)      (0)      (+)      (+)      (+)      (+)
S1              (+)      (+)      (+)      (+)      (+)      (+)      (+)      (+)
S2              (+)      (+)      (−)      (−)      (+)      (+)      (−)      (−)
Corr.           0.2      0.8      0.2      0.8      0.2      0.8      0.2      0.8
Difficulty parameters with mean of -0.5 and standard deviation of 0.4
G   Mean     0.7937   0.7243   0.7785   0.7072   0.7817   0.7125   0.7663   0.6992
G   Var      0.0001   0.0002   0.0001   0.0001   0.0001   0.0002   0.0004   0.0001
G   Min      0.7658   0.6891   0.7569   0.6724   0.7535   0.6867   0.6900   0.6771
G   Max      0.8177   0.7471   0.7993   0.7244   0.8072   0.7399   0.7945   0.7217
S1  Mean     0.7215   0.4336   0.7083   0.3783   0.7304   0.4433   0.7219   0.3947
S1  Var      0.0002   0.0007   0.0002   0.0009   0.0002   0.0005   0.0002   0.0007
S1  Min      0.6959   0.3464   0.6830   0.2991   0.6950   0.3829   0.6914   0.3248
S1  Max      0.7442   0.4907   0.7341   0.4266   0.7731   0.4833   0.7556   0.4463
S2  Mean     0.7184   0.4226   0.7351   0.4508   0.7255   0.4314   0.7333   0.4418
S2  Var      0.0002   0.0007   0.0002   0.0008   0.0003   0.0007   0.0008   0.0007
S2  Min      0.6863   0.3561   0.7057   0.3838   0.6624   0.3703   0.5923   0.3486
S2  Max      0.7569   0.4847   0.7604   0.5123   0.7740   0.4996   0.7745   0.4911
Difficulty parameters with mean of 0.5 and standard deviation of 0.4
G   Mean     0.7677   0.6997   0.7790   0.7075   0.7710   0.7041   0.7839   0.7129
G   Var      0.0002   0.0002   0.0001   0.0002   0.0001   0.0002   0.0001   0.0002
G   Min      0.7055   0.6628   0.7563   0.6799   0.7490   0.6830   0.7648   0.6805
G   Max      0.7911   0.7315   0.8001   0.7348   0.7939   0.7398   0.8086   0.7482
S1  Mean     0.7323   0.4566   0.7439   0.4649   0.7276   0.4595   0.7362   0.4667
S1  Var      0.0002   0.0009   0.0001   0.0007   0.0001   0.0007   0.0001   0.0009
S1  Min      0.6942   0.3662   0.7172   0.4064   0.7069   0.4089   0.7027   0.4047
S1  Max      0.7568   0.5181   0.7617   0.5211   0.7516   0.5106   0.7577   0.5344
S2  Mean     0.7305   0.4471   0.7007   0.3591   0.7211   0.4433   0.6917   0.3414
S2  Var      0.0002   0.0006   0.0002   0.0012   0.0002   0.0012   0.0002   0.0010
S2  Min      0.7053   0.3885   0.6766   0.2989   0.6995   0.3356   0.6651   0.2687
S2  Max      0.7577   0.4987   0.7279   0.4371   0.7576   0.5178   0.7218   0.4105
* Condition: Simulation condition number
* G: General factor; S1: First specific factor; S2: Second specific factor
* (0): Standard normal distribution; (+): Positively skewed distribution; (−): Negatively skewed distribution
* Corr.: Correlation between specific factor distributions
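
The Mean, Var, Min, and Max rows in Tables D-1 and D-2 appear to summarize one correlation per replication between generated and estimated factor scores. The Python sketch below illustrates that bookkeeping under stated assumptions: the Pearson coefficient is assumed, the sample variance (n - 1 denominator) is an assumption because this appendix does not say which variance was used, and the function names and toy numbers are hypothetical.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def summarize(correlations):
    """Mean, sample variance, minimum, and maximum across replications."""
    return {
        "Mean": statistics.fmean(correlations),
        "Var": statistics.variance(correlations),  # assumed n - 1 denominator
        "Min": min(correlations),
        "Max": max(correlations),
    }


# Toy (generated, estimated) score pairs for three hypothetical replications.
replications = [
    ([-1.0, 0.0, 1.0, 2.0], [-0.8, 0.1, 0.7, 1.5]),
    ([-2.0, -1.0, 0.5, 1.0], [-1.1, -0.9, 0.2, 1.2]),
    ([-0.5, 0.0, 0.5, 1.5], [-0.2, -0.3, 0.6, 0.9]),
]
per_replication = [pearson(generated, estimated)
                   for generated, estimated in replications]
summary = summarize(per_replication)
print(summary)
```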

Appendix E
Table E-1. Numbers of Frequencies Significant by KS Test under Condition 1
Discrimination  Mean: 1.3 / SD: 0.15        Mean: 1.8 / SD: 0.15
Difficulty      Mean: -0.5    Mean: 0.5     Mean: -0.5    Mean: 0.5
                SD: 0.4       SD: 0.4       SD: 0.4       SD: 0.4
                   G  S1  S2     G  S1  S2     G  S1  S2     G  S1  S2
1 to 2000         18  50  50    34  50  50    28  50  50    50  50  50
1 to 200           0  50  50     0  50  50     0  50  50     1  50  50
201 to 400         0  50  50     0  50  50     1  50  50     0  50  50
401 to 600         0  50  50     0  50  50     2  50  50     0  50  50
601 to 800         0  50  50     0  50  50     0  50  50     0  50  50
801 to 1000        0  50  50     0  50  50     0  50  50     0  50  50
1001 to 1200       0  50  50     0  50  50     0  50  50     0  50  50
1201 to 1400       0  50  50     0  50  50     0  50  50     0  50  50
1401 to 1600       1  50  50     1  50  50     1  50  50     1  50  50
1601 to 1800       0  50  50     0  50  50     1  50  50     1  50  50
1801 to 2000       1  50  50     0  49  50     0  50  50     0  50  50
* Condition: Numbers of simulation conditions
* G: General factor; S1: First specific factor; S2: Second specific factor
* The frequencies under the significance level of .05 were counted.

Table E-2. Numbers of Frequencies Significant by KS Test under Condition 2
Discrimination  Mean: 1.3 / SD: 0.15        Mean: 1.8 / SD: 0.15
Difficulty      Mean: -0.5    Mean: 0.5     Mean: -0.5    Mean: 0.5
                SD: 0.4       SD: 0.4       SD: 0.4       SD: 0.4
                   G  S1  S2     G  S1  S2     G  S1  S2     G  S1  S2
1 to 2000         50  50  50    50  50  50    50  50  50    50  50  50
1 to 200           1  50  50     5  50  50     1  50  50     1  50  50
201 to 400         3  50  50     4  50  50     1  50  50     3  50  50
401 to 600         3  50  50     2  50  50     3  50  50     2  50  50
601 to 800         2  50  50     4  50  50     2  50  50     3  50  50
801 to 1000        2  50  50     4  50  50     2  50  50     3  50  50
1001 to 1200       2  50  50     1  50  50     1  50  50     1  50  50
1201 to 1400       2  50  50     0  50  50     3  50  50     2  50  50
1401 to 1600       2  50  50     3  50  50     2  50  50     4  50  50
1601 to 1800       2  50  50     1  50  50     2  50  50     1  50  50
1801 to 2000       3  50  50     1  50  50     3  50  50     3  50  50
* Condition: Numbers of simulation conditions
* G: General factor; S1: First specific factor; S2: Second specific factor
* The frequencies under the significance level of .05 were counted.

Table E-3. Numbers of Frequencies Significant by KS Test under Condition 3
Discrimination  Mean: 1.3 / SD: 0.15        Mean: 1.8 / SD: 0.15
Difficulty      Mean: -0.5    Mean: 0.5     Mean: -0.5    Mean: 0.5
                SD: 0.4       SD: 0.4       SD: 0.4       SD: 0.4
                   G  S1  S2     G  S1  S2     G  S1  S2     G  S1  S2
1 to 2000         22  50  50    21  50  50    46  50  50    46  50  50
1 to 200           0  50  50     0  50  50     1  50  50     0  50  50
201 to 400         0  50  50     0  50  50     0  50  50     0  50  50
401 to 600         0  50  50     0  50  50     1  50  50     1  50  50
601 to 800         0  50  50     0  50  50     1  50  50     1  50  50
801 to 1000        0  50  50     0  50  50     1  50  50     1  50  50
1001 to 1200       0  50  50     0  50  50     0  50  50     0  49  50
1201 to 1400       0  50  50     0  50  50     0  50  50     0  50  50
1401 to 1600       0  50  50     0  50  50     0  50  50     1  50  50
1601 to 1800       1  50  50     0  50  50     1  50  50     0  50  50
1801 to 2000       0  50  50     0  50  50     0  50  50     0  50  50
* Condition: Numbers of simulation conditions
* G: General factor; S1: First specific factor; S2: Second specific factor
* The frequencies under the significance level of .05 were counted.

Table E-4. Numbers of Frequencies Significant by KS Test under Condition 4
Discrimination  Mean: 1.3 / SD: 0.15        Mean: 1.8 / SD: 0.15
Difficulty      Mean: -0.5    Mean: 0.5     Mean: -0.5    Mean: 0.5
                SD: 0.4       SD: 0.4       SD: 0.4       SD: 0.4
                   G  S1  S2     G  S1  S2     G  S1  S2     G  S1  S2
1 to 2000         50  50  50    49  50  50    50  50  50    50  50  50
1 to 200           2  50  50     4  50  50     4  50  50     5  50  50
201 to 400         3  50  50     3  50  50     4  50  50     5  50  50
401 to 600         4  50  50     4  50  50     6  50  50     6  50  50
601 to 800         0  50  50     0  50  50     2  50  50     0  50  50
801 to 1000        0  50  50     0  50  50     2  50  50     0  50  50
1001 to 1200       2  50  50     4  50  50     5  50  50     3  50  50
1201 to 1400       1  50  50     2  50  50     1  50  50     1  50  50
1401 to 1600       1  50  50     2  50  50     1  50  50     2  50  50
1601 to 1800       6  50  50     6  50  50     6  50  50     6  50  50
1801 to 2000       1  50  50     3  50  50     0  50  50     5  50  50
* Condition: Numbers of simulation conditions
* G: General factor; S1: First specific factor; S2: Second specific factor
* The frequencies under the significance level of .05 were counted.

Table E-5. Numbers of Frequencies Significant by KS Test under Condition 5
Discrimination  Mean: 1.3 / SD: 0.15        Mean: 1.8 / SD: 0.15
Difficulty      Mean: -0.5    Mean: 0.5     Mean: -0.5    Mean: 0.5
                SD: 0.4       SD: 0.4       SD: 0.4       SD: 0.4
                   G  S1  S2     G  S1  S2     G  S1  S2     G  S1  S2
1 to 2000         50  50  50    50  50  50    50  50  50    50  50  50
1 to 200          50  50  50    50  50  50    50  50  50    50  50  50
201 to 400        50  49  50    50  50  50    50  50  50    50  50  50
401 to 600        49  50  50    50  50  50    50  50  50    50  50  50
601 to 800        49  50  50    49  50  50    50  50  50    50  50  50
801 to 1000       49  50  50    49  50  50    50  50  50    50  50  50
1001 to 1200      49  50  50    49  50  50    50  50  50    50  50  50
1201 to 1400      50  50  50    49  50  50    50  50  50    50  50  50
1401 to 1600      50  50  50    50  50  50    50  50  50    50  50  50
1601 to 1800      50  50  50    50  50  50    50  50  50    50  50  50
1801 to 2000      50  50  50    50  50  50    50  50  50    50  50  50
* Condition: Numbers of simulation conditions
* G: General factor; S1: First specific factor; S2: Second specific factor
* The frequencies under the significance level of .05 were counted.

Table E-6. Numbers of Frequencies Significant by KS Test under Condition 6
Discrimination  Mean: 1.3 / SD: 0.15        Mean: 1.8 / SD: 0.15
Difficulty      Mean: -0.5    Mean: 0.5     Mean: -0.5    Mean: 0.5
                SD: 0.4       SD: 0.4       SD: 0.4       SD: 0.4
                   G  S1  S2     G  S1  S2     G  S1  S2     G  S1  S2
1 to 2000         50  50  50    50  50  50    50  50  50    50  50  50
1 to 200          50  50  50    48  50  50    50  50  50    50  50  50
201 to 400        50  50  50    50  50  50    50  50  50    49  50  50
401 to 600        49  50  50    49  50  50    50  50  50    50  50  50
601 to 800        50  50  50    50  50  50    49  50  50    49  50  50
801 to 1000       50  50  50    50  50  50    49  50  50    49  50  50
1001 to 1200      50  50  50    50  50  50    50  50  50    50  50  50
1201 to 1400      49  50  50    48  50  50    50  50  50    49  50  50
1401 to 1600      50  50  50    50  50  50    50  50  50    50  50  50
1601 to 1800      50  50  50    50  50  50    50  50  50    50  50  50
1801 to 2000      50  50  50    50  50  50    50  50  50    50  50  50
* Condition: Numbers of simulation conditions
* G: General factor; S1: First specific factor; S2: Second specific factor
* The frequencies under the significance level of .05 were counted.

Table E-7. Numbers of Frequencies Significant by KS Test under Condition 7
Discrimination  Mean: 1.3 / SD: 0.15        Mean: 1.8 / SD: 0.15
Difficulty      Mean: -0.5    Mean: 0.5     Mean: -0.5    Mean: 0.5
                SD: 0.4       SD: 0.4       SD: 0.4       SD: 0.4
                   G  S1  S2     G  S1  S2     G  S1  S2     G  S1  S2
1 to 2000         50  50  50    50  50  50    50  50  50    50  50  50
1 to 200          50  50  50    49  50  50    50  50  50    50  50  50
201 to 400        50  50  50    49  50  50    50  50  50    50  50  50
401 to 600        48  50  50    50  50  50    49  50  50    50  50  50
601 to 800        50  50  50    49  50  50    50  50  50    49  50  50
801 to 1000       50  50  50    49  50  50    50  50  50    49  50  50
1001 to 1200      50  50  50    50  50  50    50  50  50    50  50  50
1201 to 1400      50  50  50    50  50  50    50  50  50    50  50  50
1401 to 1600      50  50  50    49  49  50    49  50  50    50  50  50
1601 to 1800      50  50  50    49  50  50    50  50  50    50  50  50
1801 to 2000      50  50  50    50  50  50    50  50  50    50  50  50
* Condition: Numbers of simulation conditions
* G: General factor; S1: First specific factor; S2: Second specific factor
* The frequencies under the significance level of .05 were counted.

Table E-8. Numbers of Frequencies Significant by KS Test under Condition 8
Discrimination  Mean: 1.3 / SD: 0.15        Mean: 1.8 / SD: 0.15
Difficulty      Mean: -0.5    Mean: 0.5     Mean: -0.5    Mean: 0.5
                SD: 0.4       SD: 0.4       SD: 0.4       SD: 0.4
                   G  S1  S2     G  S1  S2     G  S1  S2     G  S1  S2
1 to 2000         50  50  50    50  50  50    50  50  50    50  50  50
1 to 200          50  50  50    50  50  50    50  50  50    49  50  50
201 to 400        50  50  50    50  50  50    50  50  50    50  50  50
401 to 600        50  50  50    48  50  50    50  50  50    49  50  50
601 to 800        50  50  50    50  50  50    50  50  50    50  50  50
801 to 1000       50  50  50    50  50  50    50  50  50    50  50  50
1001 to 1200      50  50  50    49  50  50    50  50  50    50  50  50
1201 to 1400      50  50  50    48  50  50    49  50  50    49  50  50
1401 to 1600      50  50  50    50  50  50    50  50  50    50  50  50
1601 to 1800      49  50  50    50  50  50    50  50  50    49  50  50
1801 to 2000      49  50  50    48  50  50    49  50  50    49  50  50
* Condition: Numbers of simulation conditions
* G: General factor; S1: First specific factor; S2: Second specific factor
* The frequencies under the significance level of .05 were counted.
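
Each cell in Tables E-1 to E-8 counts how many replications gave a KS test significant at the .05 level; the maximum of 50 suggests 50 replications per cell. The Python sketch below shows one way such a count could be produced. It is an illustration under assumptions, not the study's procedure: it uses a one-sample KS test against the standard normal distribution with the asymptotic p-value, whereas this appendix does not state the KS variant, the reference distribution, or the software used. The function names and the toy samples are hypothetical.

```python
import math
from statistics import NormalDist


def ks_pvalue_vs_standard_normal(sample):
    """One-sample KS test against N(0, 1), asymptotic p-value (assumed variant)."""
    xs = sorted(sample)
    n = len(xs)
    cdf = NormalDist().cdf
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Largest gap between the empirical and the reference CDF.
        d = max(d, i / n - f, f - (i - 1) / n)
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The Kolmogorov series below is numerically 1 in this range.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def count_significant(replications, alpha=0.05):
    """Number of replications whose KS p-value falls below alpha."""
    return sum(ks_pvalue_vs_standard_normal(r) < alpha for r in replications)


# Toy samples of 200 values each (the block size used in the table rows).
n = 200
normal_like = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [-math.log(1.0 - (i - 0.5) / n) - 1.0 for i in range(1, n + 1)]

n_significant = count_significant([normal_like, skewed, skewed])
print(n_significant)
```

Running the same count on the full 2,000 values and on each block of 200 would give one table row per range.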

BIBLIOGRAPHY

Axelrod, R. (2005). Advancing the art of simulation in the social sciences. In J.-P. Rennard (Ed.),
Handbook of research on nature inspired computing for economy and management (pp. 90-100).
Hershey, PA: Idea Group.
Batley, R.-M., & Boss, M. W. (1993). The effects on parameter estimation of correlated
dimensions and a distribution-restricted trait in a multidimensional item response model. Applied
Psychological Measurement, 17(2), 131-141.
Bratley, P., Fox, B., & Schrage, L. (1987). A guide to simulation. Second Edition. New York:
Springer-Verlag.
Cai, L. (2010). A two-tier full-information item factor analysis model with applications.
Psychometrika, 75(4), 581-612.
Cai, L., Yang, J.S., & Hansen, M. (2011). Generalized full-information item bifactor analysis.
Psychological Methods, 16(2), 221-248.
Capella, M. E., & Turner, R. C. (2004). Development of an instrument to measure consumer
satisfaction in vocational rehabilitation. Rehabilitation Counseling Bulletin, 47(2), 76-85.
Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R
environment. Journal of Statistical Software, 48(6), 1-29.
Chen, F. F., West, S. G., & Sousa, K. H. (2006). A comparison of bifactor and second-order
models of quality of life. Multivariate Behavioral Research, 41(2), 189-225.
DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological Methods, 2(3), 292-307.
DeMars, C. E. (2006). Application of the bi-factor multidimensional item response theory model
to testlet-based tests. Journal of Educational Measurement, 43(2), 145-168.
Duncan-Jones, P. (1981a). The structure of social relationships: Analysis of a survey instrument:
I. Social Psychiatry, 16(2), 55-61.
Duncan-Jones, P. (1981b). The structure of social relationships: Analysis of a survey instrument:
II. Social Psychiatry, 16(3), 143-149.
Eboli, L., & Mazzulla, G. (2007). Service quality attributes affecting customer satisfaction for
bus transit. Journal of Public Transportation, 10(3), 21-34.
Fahrmeir, L., & Tutz, G. (2001). Multivariate statistical modeling based on generalized linear
models (2nd ed.). New York, NY: Springer.

Finch, H. (2010). Item parameter estimation for the MIRT model: Bias and precision of
confirmatory factor analysis-based models. Applied Psychological Measurement, 34(1), 10-26.
Frank, K. (1998). The social context of schooling: Quantitative methods. Review of Research in
Education, 23, 171-216.
Fujimoto, K. A. (2014). Bayesian Extended Two-Tier Full-information Item Factor Analysis
Model. Paper presented at the 76th Annual conference of the National Council on Measurement
in Education, Philadelphia, PA.
Gibbons, R. D., & Hedeker, D. R. (1992). Full-information item bi-factor analysis.
Psychometrika, 57(3), 423-436.
Gifford, J. A. (1978). Developments in latent trait theory: Models, technical issues, and
applications. Review of Educational Research, 48(4), 467-510.
Gosz, J. K., & Walker, C. M. (2002). An empirical comparison of multidimensional item
response data using TESTFACT and NOHARM. Paper presented at the annual meeting of the
National Council on Measurement in Education, New Orleans, LA.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response
theory. Newbury Park, CA: Sage Publications, Inc.
Hickendorff, M. (2013). The language factor in elementary mathematics assessments:
Computational skills and applied problem solving in a multidimensional IRT framework.
Applied Measurement in Education, 26(4), 253-278.
Hogg, R. V., & Tanis, E. A. (1997). Probability and statistical inference. Upper Saddle River, NJ:
Prentice Hall.
Holzinger, K. J., & Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1), 41-54.
Jung, I., Choi, S., Lim, C., & Leem, J. (1994). Effects of different types of interaction on learning
achievement, satisfaction and participation in web-based instruction. Innovations in Education
and Teaching International, 39(2), 153-162.
Küppers, G., & Lenhard, J. (2005). Validation of simulation: Patterns in the social and natural
sciences. Journal of Artificial Societies and Social Simulation, 8(4), 3.
(http://jasss.soc.surrey.ac.uk/8/4/3.html)
Lazarsfeld, P. F. (1950). The logical and mathematical foundation of latent structure analysis. In
S. A. Stouffer, L. Guttman, E. A. Suchman, P. F. Lazarsfeld, S. A. Star, & J. A. Clausen (Eds.),
Studies in social psychology in World War II: Vol. 4. Measurement and prediction (pp. 362-412).
Princeton, NJ: Princeton University Press.
Li, Y., Bolt, D. M., & Fu, J. (2006). A comparison of alternative models for testlets. Applied
Psychological Measurement, 30(1), 3-21.

Li, Y., & Lissitz, R. W. (2012). Exploring the full-information bifactor model in vertical scaling
with construct shift. Applied Psychological Measurement, 36(1), 3-20.
Lord, F. M. (1952). A theory of test scores. Psychometric Monograph, No. 7.
Lord, F. M. (1953a). An application of confidence intervals and of maximum likelihood to the
estimation of an examinee's ability. Psychometrika, 18, 57-75.
Lord, F. M. (1953b). The relation of test score to the trait underlying the test. Educational and
Psychological Measurement, 13, 517-548.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale,
NJ: Lawrence Erlbaum Associates.
Marsden, P. V. (2005). Recent developments in network measurement. In P. J. Carrington, J.
Scott, & S. Wasserman (Eds.), Models and methods in social network analysis (pp. 8-30). New
York: Cambridge University Press.
Martin, A. J. (2007). Examining a multidimensional model of student motivation and
engagement using a construct validation approach. British Journal of Educational Psychology,
77(2), 412-440.
McDonald, R. P. (1997). Normal-ogive multidimensional model. In W. J. van der Linden & R. K.
Hambleton (Eds.), Handbook of modern item response theory (pp. 257-269). New York: Springer.
Murphy, K. R., Cronin, B. E., & Tam, A. P. (2003). Controversy and consensus regarding the
use of cognitive ability testing in organizations. Journal of Applied Psychology, 88(4), 660-671.
National Center for Education Statistics (2010). Highlights from PISA 2009. (NCES 2011-004).
U.S. Department of Education. Retrieved from NCES
(http://nces.ed.gov/pubs2011/2011004.pdf).
National Center for Education Statistics (2007). Highlights from PISA 2006. (NCES 2008-016).
U.S. Department of Education. Retrieved from NCES
(http://nces.ed.gov/pubs2008/2008016.pdf).
National Center for Education Statistics (2004). PISA 2003 results from the U.S. perspective
highlights. (NCES 2005-003). U.S. Department of Education. Retrieved from NCES
(http://nces.ed.gov/pubs2005/2005003.pdf)
National Center for Education Statistics (2001). Outcomes of learning: Results from the 2000
Program for International Student Assessment of 15-year-olds in reading, mathematics, and
science literacy (NCES 2002-115). U.S. Department of Education. Retrieved from NCES
(http://nces.ed.gov/pubs2002/2002115.pdf)
Nelsen, R. B. (1999). An introduction to copulas. New York: Springer.
Organization for Economic Cooperation and Development. (2001). Knowledge and skills
for life: First results from the OECD Programme for International Student Assessment.
Paris: Author.
Organization for Economic Cooperation and Development. (2007a). PISA 2006: Science
competencies for tomorrow’s world. Executive summary. Paris: Author.
Organization for Economic Cooperation and Development. (2007b). PISA 2006: Science
competencies for tomorrow’s world (Volume I: Analysis). Paris: Author.
Pommerich, M., & Segall, D. O. (2008). Local dependence in an operational CAT: Diagnosis
and implications. Journal of Educational Measurement, 45(4), 201-223.
Reckase, M. D. (2009). Multidimensional item response theory. New York: Springer-Verlag.
Reise, S. P., Moore, T. M., & Haviland, M. G. (2010). Bifactor models and rotations: Exploring
the extent to which multidimensional data yield univocal scale scores. Journal of Personality
Assessment, 92(6), 544-559.
Reise, S. P., Morizot, J., & Hays, R. D. (2007). The role of the bifactor model in resolving
dimensionality issues in health outcomes measures. Quality of Life Research, 16, 19-31.
Rijmen, F. (2010). Formal relations and an empirical comparison among the bi-factor, the testlet,
and a second-order multidimensional IRT model. Journal of Educational Measurement, 47(3),
361-372.
Robinson, S. (2004). Simulation: The practice of model development and use. Chichester,
UK: Wiley.
Sass, D. A., Schmitt, T. A., & Walker, C. M. (2008). Estimating non-normal latent trait
distributions within item response theory using true and estimated item parameters. Applied
Measurement in Education, 21(1), 65-88.
Schmid, A. (2005). What is the truth of simulation? Journal of Artificial Societies and Social
Simulation, 8(4), 5. (http://jasss.soc.surrey.ac.uk/8/4/5.html)
Seo, D. G. (2011). Application of the bifactor model to computerized adaptive testing.
Unpublished doctoral dissertation, University of Minnesota. (ERIC Document Reproduction
Service No. ED526366). Retrieved July 27, 2012, from ERIC database.
Sheng, Y. (2010). Bayesian estimation of MIRT models with general and specific latent traits in
MATLAB. Journal of Statistical Software, 34(3), 1-27.
Singelis, T. M. (1994). The measurement of independent and interdependent self-construals.
Personality and Social Psychology Bulletin, 20(5), 580-591.
Stapleton, J. H. (2008). Models for probability and statistical inference: Theory and applications.
Hoboken, NJ: John Wiley & Sons.

Stone, C. A. (1992). Recovery of marginal maximum likelihood estimates in the two-parameter
logistic response model: An evaluation of MULTILOG. Applied Psychological Measurement,
16(1), 1-16.
Stouffer, S. A. (1950). An overview of the contributions to scaling and scale theory. In S. A.
Stouffer, L. Guttman, E. A. Suchman, P. F. Lazarsfeld, S. A. Star, & J. A. Clausen (Eds.),
Studies in social psychology in World War II: Vol. 4. Measurement and prediction (pp. 3-45).
Princeton, NJ: Princeton University Press.
Thurstone, L. L. (1947). Multiple factor analysis. Chicago: University of Chicago Press.
von Davier, M. (2008). A general diagnostic model applied to language testing data. British
Journal of Mathematical and Statistical Psychology, 61(2), 287-307.
Woods, C. M., & Thissen, D. (2006). Item response theory with estimation of the latent
population distribution using spline-based densities. Psychometrika, 71(2), 281-301.
Yoshida, M., & James, J. D. (2010). Customer satisfaction with game and service experiences:
Antecedents and consequences. Journal of Sport Management, 24, 338-361.
