IT’S BOTH WHO YOU ARE AND WHERE YOU’RE FROM: RELATING VOCATIONAL INTERESTS AND SOCIOECONOMIC STATUS TO BIAS IN BIODATA AND SJTS By Joshua Prasad A THESIS Submitted to Michigan State University In partial fulfillment of the requirements for the degree of Psychology-Master of Arts 2017 ABSTRACT IT’S BOTH WHO YOU ARE AND WHERE YOU’RE FROM: RELATING VOCATIONAL INTERESTS AND SOCIOECONOMIC STATUS TO BIAS IN BIODATA AND SJTS By Joshua Prasad Differences in responding to biodata and situational judgement tests (SJTs) based on gender and racial minority group status were evaluated. It was hypothesized that vocational interests and socioeconomic status (SES) could be used to help characterize the differences in experience between groups (e.g. Cottrell, Newman, Roisman, 2015; Nye, Su, Rounds, & Drasgow, 2012). As a result, interests and SES may help explain differences in both the constructs assessed by biodata and SJTs as well as differences in item functioning (DIF; Drasgow, 1987). Hypotheses were evaluated using multiple-indicator multiple-cause models to simultaneously model latent constructs and item responses (MIMIC; Muthén, 1989). Findings indicate that interests helped explain differences across gender in both the constructs assessed as well as DIF. Interests explained few differences based on minority group status and SES did not seem to meaningfully explain differences in either of the demographic group comparisons. Many items still exhibited DIF as a function of gender or minority group status after accounting for vocational interests and SES, suggesting that further work is needed to identify additional substantive explanations of DIF. Overall, the present work constitutes a thorough examination of differential functioning in noncognitive assessments and establishes a meaningful relationship between the noncognitive constructs assessed here and vocational interests. TABLE OF CONTENTS LIST OF TABLES ........................................................................................................................ iv LIST OF FIGURES ....................................................................................................................... vi INTRODUCTION ...........................................................................................................................1 Biodata. ........................................................................................................................................3 Situational Judgment Tests. .........................................................................................................4 Measurement Bias ..........................................................................................................................5 Potential for Bias in Biodata and SJTs ..........................................................................................7 Frame of Reference. ...................................................................................................................11 Item Accessibility. 
.....................................................................................................................13 Socioeconomic Status ..................................................................................................................14 Vocational Interests .....................................................................................................................18 METHOD ......................................................................................................................................24 Participants and Procedures .........................................................................................................24 Measures ......................................................................................................................................24 Demographics. ...........................................................................................................................24 Biodata. ......................................................................................................................................24 SJT. ............................................................................................................................................27 Interests. .....................................................................................................................................28 Median Local Income. ...............................................................................................................28 Data Analysis ...............................................................................................................................29 Testing for Uniform and Nonuniform DIF. ..............................................................................34 Explanation of DIF, a Model Building Approach. ...................................................................36 RESULTS ......................................................................................................................................39 Assessment of Differential Item Functioning ..............................................................................42 Evaluation of Hypotheses ............................................................................................................43 DISCUSSION ................................................................................................................................54 Limitations ...................................................................................................................................63 Practical Implications...................................................................................................................65 Conclusion ...................................................................................................................................68 APPENDICES ...............................................................................................................................70 APPENDIX A: Configural model estimation and DIF analyses for studied scales ....................71 APPENDIX B: MIMIC model analyses for studied scales .......................................................103 REFERENCES ............................................................................................................................151 iii LIST OF TABLES Table 1. Dimensions assessed with the biodata and SJT measures ..............................................25 Table 2. 
Descriptive statistics and intercorrelations of studied variables ......................................40 Table 3. Correspondence of RIASEC dimensions with biodata and SJT ......................................45 Table 4. Summary of regressions of biodata latent factor scale scores on vocational interests ....46 Table 5. Summary of degree of support and relevant results for hypotheses posed in the present study ...............................................................................................................................................56 Table A1. Configural model estimation and DIF analyses for the Behavioral Leadership scale................................................................................................................................................72 Table A2. Configural model estimation and DIF analyses for the Leadership Positions scale................................................................................................................................................74 Table A3. Configural model estimation and DIF analyses for the Knowledge scale ....................75 Table A4. Configural model estimation and DIF analyses for the Continuous Learning scale................................................................................................................................................78 Table A5. Configural model estimation and DIF analyses for the Values scale ...........................81 Table A6. Configural model estimation and DIF analyses for the Social Responsibility scale................................................................................................................................................83 Table A7. Configural model estimation and DIF analyses for the Perseverance scale .................85 Table A8. Configural model estimation and DIF analyses for the Discrete Adaptability scale................................................................................................................................................87 Table A9. Configural model estimation and DIF analyses for the Routine Adaptability scale................................................................................................................................................89 Table A10. Configural model estimation and DIF analyses for the situational judgment scale................................................................................................................................................90 Table B1. MIMIC model of the Behavioral Leadership scale ....................................................104 Table B2. MIMIC model of the Leadership Positions scale ........................................................108 iv Table B3. MIMIC model of the Knowledge scale ......................................................................110 Table B4. MIMIC model of Continuous Learning scale .............................................................118 Table B5. MIMIC model of the Perseverance scale ...................................................................123 Table B6. MIMIC model of the Discrete Adaptability scale .......................................................131 Table B7. MIMIC model of the Routine Adaptability scale .......................................................137 Table B8. MIMIC model of the Social Responsibility scale .......................................................138 Table B9. 
MIMIC model of the Values scale ..............................................................................144 Table B10. MIMIC model of the Situational Judgment scale .....................................................149 v LIST OF FIGURES Figure 1. Example measurement model of a latent factor with scale items serving as observed indicators ..........................................................................................................................................7 Figure 2. Proposed analytic approach of testing how interests and SES influence individual responses to biodata and SJT items ...............................................................................................11 Figure 3. Depiction of individuals with equivalent standings on a latent trait but nonequivalent comparison of groups .....................................................................................................................13 vi INTRODUCTION As Industrial/Organizational (IO) psychology grapples with issues related to adverse impact and cognitive testing, organizations have been increasingly reliant on biodata and situational judgment tests (SJTs), due to the fact that they show less potential for adverse impact while maintaining reasonable validities for predicting important outcomes like training or job performance (Robertson & Smith, 2008; Ployhart, 2006; Schmitt, Keeney, Oswald, Pleskac, Billington, Sinha, & Zorzie, 2009). Though the potential for adverse impact may be lessened, recent work has demonstrated that there may still be room for concern with biodata (Imus, Schmitt, Kim, Oswald, Merritt, & Westring, 2010) and SJTs due to measurement bias (Kim, Schmitt, Friede, Oswald, Ramsay, et al., 2004), which can contribute to adverse impact (Nye & Drasgow, 2011). Measurement bias occurs when individuals with the same standing on the latent trait assessed by the test, but sampled from different subgroups, have unequal observed scores on the scale (Drasgow, 1987). In other words, bias represents differences in the way that individuals respond to a measure rather than actual differences in the latent trait. In addition to its potential effects on adverse impact, measurement bias can also influence comparisons across groups. Therefore, bias in the measure is an important concern that needs to be addressed. The explanations for the bias on biodata measures and SJTs provided by both Imus et al. (2010) and Kim et al. (2004) relied solely on group differences in access to experiences relevant to these measures. Although these studies identified group differences due to measurement bias, simply identifying these differences does little to explain the psychological mechanisms underlying them. There are likely to be several factors that differ across groups and that may cause bias in the measurement of psychological constructs and understanding these factors will provide additional information about how issues of measurement bias in biodata and SJTs can be 1 addressed in future research. Therefore, the present work attempts to broaden the investigation of how and why bias might occur in biodata and SJTs to address this gap in the literature. Biodata and SJTs can be designed to measure social and motivational attributes (e.g. perseverance, adaptability, leadership) that may be relevant to a particular position, but are not typically captured by assessments focused on cognitive abilities. 
In other words, biodata and SJTs provide the methodology for assessing a broad range of attributes in a systematic manner. Biodata assessments involve asking respondents the frequency with which they engage in behaviors that are thought to be job relevant, on the assumption that respondents are likely to continue these behaviors on the job (Hough & Oswald, 2000; Whitney & Schmitt, 1997). SJTs, on the other hand, consist of presenting the respondent with job-relevant dilemmas and asking him or her to evaluate the appropriateness of a set of potential responses (McDaniel, Morgeson, Finnegan, Campion, & Braverman, 2001). The utility of these assessments has been established by demonstrating the unique validity these assessments hold alongside other predictors, such as cognitive assessments, when predicting criteria like job performance (McDaniel et al., 2001), training (Robertson & Smith, 2008), or early college success (Schmitt et al., 2009). Given the flexibility of these assessments, their potential relationships with key workplace outcomes, and their low to moderate subgroup differences, biodata and SJTs present a powerful pair of assessment methods. Further, the ability to create items that are clearly connected to the workplace tends to generate favorable reactions from both job applicants and HR professionals (Ployhart, 2006). However, these benefits can only be realized if the latent constructs assessed by these methods are measured appropriately. Additionally, due to the differences in how these measures are constructed and the way they measure latent constructs, examining bias in each of these techniques will provide a more general understanding of bias, rather than the current norm of merely demonstrating that bias occurs without explaining why it occurs (Gierl, 2005). As a result, biodata and SJTs are the focus of this investigation, and will be introduced in turn.
Biodata. The key principle of biodata assessments is that past behavior is predictive of future behavior. By assessing the frequency or quality of behaviors that may be job relevant, the continuation of these behaviors on the job may prove beneficial to performance (Hough & Oswald, 2000; Whitney & Schmitt, 1997). In addition to their job relevance, the behaviors targeted in a biodata assessment are selected based on whether they are thought to causally influence the construct of interest (Dean, 2013). A strength of biodata assessments is that they can be tailored to a specific job in order to increase their validity with important outcomes (Oswald, Schmitt, Kim, Ramsay, & Gillespie, 2004). This strength comes with a drawback, however, because the term biodata is often used in an unsystematic way. Mael (1991) attempted to address this criticism by establishing a framework to clarify what constitutes a biodata assessment. This framework describes biodata assessments broadly as tapping into behaviors that an individual has enacted in order to adapt to his or her environment, as well as behaviors that are consistent with his or her personal and social identity. Despite the variety of biodata assessments, reasonable validities have been observed when using biodata as a predictor across a number of jobs (Hunter & Hunter, 1984) and criteria, such as performance, salary progress, and person-organization fit (Mael & Ashforth, 1995; Schneider & Schmitt, 1986). These benefits also come with frequent observations of low adverse impact (Bliesener, 1996; Shackleton & Newell, 1997).
It is common practice to screen biodata items for differential functioning across groups. However, little work has been produced that can inform evidence-based principles for item screening (Whitney & Schmitt, 1997). In applied settings, biodata assessments are most commonly used as a selection tool (Dean, 2013; Whitney & Schmitt, 1997). They have been used successfully for a wide range of jobs such as managers, hotel staff, and equipment distributors, among many others, though they are still less common than conventional selection methods such as interviews (Robertson & Smith, 2008). Biodata assessments are typically scored using either rational scoring or empirical keying. Rational scoring involves subject matter experts weighting each item based on how job relevant they view the item to be (Hough & Paullin, 1994). Empirical keying, on the other hand, refers to collecting data on strong and weak employees and using their responses to weight items based on how well the item discriminates between those employees (Mumford & Owens, 1987).
Situational Judgment Tests. Rather than assess attributes by having individuals endorse statements about themselves and their experiences, SJTs present a series of job-relevant dilemmas, as well as several potential responses, and participants must evaluate the appropriateness of the responses (McDaniel et al., 2001). Contemporary versions of these tests are developed using subject matter experts, who generate the dilemmas, provide realistic responses, and identify the best and worst options (McDaniel et al., 2001). Participants are given scores based on whether their answers align with what subject matter experts (SMEs) deemed to be the best and worst responses to each dilemma (Schmitt et al., 2009). As is the case with biodata, SJTs are typically used in selection contexts and their popularity has been increasing, particularly in the United States and Europe (McDaniel, Hartman, Whetzel, & Grubb, 2007). SJTs can also be helpful as a training and development tool, given the ability to practice analyzing situations and choosing a response (Ployhart, 2006). SJTs were originally criticized for being a reflection of intelligence rather than a unique ability, which appeared to be the case given initial correlations with tests of mental ability (McDaniel et al., 2001). Further refinement of these tests, however, began to differentiate them from cognitive ability measures. A key advance in the development of SJTs involved asking respondents to rate both the best and worst responses to a situation. This modification resulted in moderate validities with job performance criteria and weak relationships with mental ability (Motowidlo, Dunnette, & Carter, 1990). However, McDaniel et al. (2001) caution that this was often found to be the case in samples where range restriction was a possibility. An additional point of investigation regarding the construction of SJTs has been related to the instructions provided for these measures. The two broad categories of instructions are referred to as behavioral tendency and knowledge instruction. Behavioral tendency instructions ask the participant to rate which actions they would be most and least likely to take, whereas knowledge instructions have participants report which actions should be taken (McDaniel et al., 2007). Though a seemingly subtle difference, meta-analytic work has suggested that SJT instruction types are related to important outcomes of the assessment process.
Whetzel, McDaniel, and Nguyen (2008) showed that SJTs administered with knowledge instructions produced the greatest racial subgroup differences, which was explained by the fact that SJTs with this type of instruction correlated more highly with measures of cognitive ability. McDaniel et al. (2007) point out that the weaker relationships between the behavioral tendency instructions and cognitive ability may be due to the production of scores that reflect typical performance, whereas cognitive ability describes maximal performance. Behavioral tendency instructions come with their own drawbacks, however, as responses may be more prone to distortion. Respondents engaging in impression management or faking may select responses that are socially desirable rather than those that reflect typical performance (Ployhart, 2006).
Measurement Bias
Measurement bias is problematic for selection settings because it can result in the disproportionate selection of one group over another. This can increase errors in the selection process and result in either selecting unqualified individuals or rejecting those who are qualified (Aguinis & Smith, 2007). Further, use of a biased test can contribute to adverse impact, or the selection of minority group members at a significantly lower rate than those of a majority group (Zedeck, 2010). An important contributor to bias at the test level is bias at the item level (i.e. differential item functioning or DIF; Nye & Drasgow, 2011). DIF is generally characterized as either uniform or nonuniform. Uniform DIF refers to differences across groups that are consistent throughout the entire range of the latent trait. Nonuniform DIF, on the other hand, refers to differences that are not consistent across the range of the latent trait, such that group differences vary in size depending on standing on the latent trait (Mellenbergh, 1989). Identifying either form of DIF and its contribution to scale scores may help to decompose why score differences may exist between groups. It should be noted, though, that removal of items flagged for DIF may only result in a modest reduction in group differences (e.g. reduction in standardized group difference of roughly .05; Schmitt & Quinn, 2010).
Figure 1 illustrates how items relate to a latent factor in a measurement model and serves as the starting point for further testing of DIF. In the figure, a latent factor represents the trait assessed by the measure (i.e., either biodata or SJT). Each item within the measure has a linear relationship with the latent factor that is defined by an intercept and a factor loading. The intercept describes the predicted value of item responses when the latent trait is at a value of zero. In the analyses conducted here, zero represents an average standing on the latent trait, so an item's intercept describes how an individual with an average standing on the latent trait would respond to that item. A uniform DIF effect can be thought of as differences in an item intercept across groups because this difference would be constant across the entire range of the latent trait (Woods & Grimm, 2011). The factor loading of an item describes how a difference in standing on the latent trait corresponds to a difference in the predicted item response. For example, if an individual has a latent score of one (latent standing is one standard deviation above the mean), the factor loading would describe how much higher that individual's predicted item response would be when compared to an individual with an average standing on the latent trait. A nonuniform DIF effect can be thought of as differences in the factor loading of an item across groups. A significantly different factor loading across groups would result in predicted item response differences across groups as well. In an ideal situation when no DIF exists, neither the intercept nor the factor loading of an item would differ as a function of group membership. In contrast, DIF occurs when the factor loadings and/or intercepts are influenced by non-random factors that differ across groups.
Figure 1. Example measurement model of a latent factor with scale items serving as observed indicators
Note. Items 1 through 3 represent the items of a particular noncognitive scale, measuring a Latent Factor. Paths originating from the latent factor leading to each item represent that item's factor loading. Item intercepts are not depicted graphically but are estimated in the measurement models analyzed in this study.
Potential for Bias in Biodata and SJTs
As mentioned above, perhaps the chief appeal of using biodata and SJTs to measure constructs of interest is their purported ability to avoid adverse impact while being of use in predicting performance (Robertson & Smith, 2008; Ployhart, 2006; Schmitt et al., 2009). Tests that avoid adverse impact are important because of their ability to reduce legal liability and result in a more ethical employee selection process (Aguinis & Smith, 2007). The claim that biodata and SJTs demonstrate little risk for adverse impact is not without contest. Bobko and Roth (2013) reviewed meta-analytic evidence of black-white subgroup differences and found differences of d = .38 for SJTs (Whetzel, McDaniel, & Nguyen, 2008) and d = .33 for biodata (DeCorte, Lievens, & Sackett, 2007), favoring whites in both cases. Characterizing these effect sizes as small, or as minimal risk, is somewhat misleading as Bobko and Roth (2013) point out, but also potentially underestimates the effects in actual selection contexts. This is because a majority of the studies contributing to these meta-analytic effect size estimates come from incumbent, rather than applicant, samples. Weekley, Ployhart, and Harold (2004) provide one of the few studies where applicant samples were available, and conversion of their results into effect sizes produced differences in SJT scores of d = .39 for incumbents and d = .79 for applicants when comparing whites to non-whites (Bobko & Roth, 2013). Whetzel, McDaniel, and Nguyen (2008) found similar results for Hispanic (d = .24) and Asian (d = .29) respondents, with the extent to which the test was associated with cognitive ability being the strongest explanatory variable. Here, cognitive ability appears to be a moderator of these subgroup differences (Whetzel et al., 2008). In addition, black-white differences in biodata scores also appear to be explained in part by cognitive ability (DeCorte, Lievens, & Sackett, 2007). It should be noted that with regard to gender, overall SJT differences favored females (d = -.11), with conscientiousness and agreeableness, rather than cognitive ability, serving as meaningful meta-analytic moderators (Whetzel et al., 2008). Additional work examining subgroup differences in biodata and SJTs has explored differences across cultural groups. For example, recent work by Prasad, Schmitt, Ryan, Showler, and Nye (2016) examined differences in the operation of biodata and SJTs when comparing American and Chinese student samples.
Though not wholly supported, hypotheses were based on differences in experience based on differing educational systems, as well as differences in cultural values as measured by the GLOBE study (House & Hanges, 2004). Prasad et al. (2016) found that score differences on biodata scales assessing leadership, knowledge, adaptability, and perseverance aligned with differences in the educational systems and cultural values of the two groups. On the other hand, scales assessing continuous learning, social responsibility, and academic values (i.e. behaving in accordance with a well-developed set of values), did not align with hypothesized differences, revealing the possibility of other influences on group mean differences in biodata assessments. Additionally, Prasad et al. (2016) conducted MACS analyses, as outlined by Nye and Drasgow (2011), to try to separate latent trait differences from bias. The role of bias in influencing observed mean differences became quite apparent as differences between observed and corrected effect sizes ranged from d = .05 to d = .51 across biodata scales and the SJT. Further, any observed differences between groups on the SJT were eliminated after accounting for bias, suggesting that raw score differences, at least in the comparison of American and Chinese students, reflect bias in measurement more so than meaningful differences in situational judgment. At the scale level, attempts to understand why group differences exist have been limited to the prediction of group differences or the use of moderators in meta-analysis. Other work examines item-level responses in efforts to better understand how different groups use biodata and SJTs. Multiple studies have shown that bias at the item-level is an important concern with both biodata and SJTs (Imus et al., 2010; Kim et al., 2004; Whitney & Schmitt, 1997). Whitney and Schmitt (1997) explored the potential influence of differences in cultural values on item use across black and white respondents. Though they found that culture was a general predictor of 9 response selection, they did not find that culture explained DIF. Imus et al. (2010) argued that the cause of DIF was due to members of certain demographic groups having reduced access to the experiences probed in some of the biodata items. Upon examination, Imus et al. (2010) found that the degree to which an item operated differently across sexes was negatively correlated with how much more accessible the item was judged by females (when compared to male judgments of accessibility; r = -.51, p < .05). The researchers described accessibility as the degree to which an individual felt they had ample opportunity to experience the event or situation described in the item stem. Though DIF was also observed across races, accessibility was not shown to explain these differences. Based on the work at both the item- and scale- levels, it is clear that bias can play a role in certain biodata assessments and SJTs. The explanation of bias, however, must be further developed. Substantive explanations of bias are important for understanding why bias occurs, but have progressed slower than our ability to statistically model bias (Gierl, 2005). In other words, a substantial amount of work has been conducted to identify bias but much less work has examined potential explanations for the bias that is identified. As Whitney and Schmitt (1997) point out, test developers have little evidence based guidance about how to write items in a way that avoids bias. 
Consequently, DIF can only be detected post-hoc and may require researchers to drop items after the data have already been collected (Imus et al., 2010). Dean (2013) demonstrated that the identification and removal of biased items substantially improved the measurement qualities of a biodata measure. Specifically, a substantial reduction of subgroup differences was observed with little influence on the predictive validity of the assessment. Further understanding of why items function differently may help to engender the improvements observed by Dean (2013) in other biodata measures and SJT prior to conducting a costly test validation study. Therefore, the goal of the present study was to propose a conceptual model 10 (depicted in Figure 2) that will help to build upon past developments to understand the sources of bias in biodata measures and SJTs. Two potential mechanisms behind these sources of bias are explained next. Figure 2. Proposed analytic approach of testing how interests and SES influence individual responses to biodata and SJT items Note. Models 1 through 3 denote sequential MIMIC models incorporating additional explanatory variables (SES and Interests). Items 1 through 3 represent the items of a particular noncognitive scale, measuring a Latent Factor. Item 3 represents an item that has been flagged for DIF. Paths originating from the latent factor leading to each item represents that item’s factor loading. Paths originating from Demographic Group, SES, and Interests leading to the Latent Factor represent the regression of the Latent Factor onto each variable. Paths originating from Demographic Group, SES, and Interests leading to the factor loading of Item 3 and Item 3 itself represent tests of nonuniform and uniform DIF, respectively. Curved, dotted lines between Demographic Group and both SES and Interests represent the observed biserial correlations between those variables. Frame of Reference. When responding to biodata or SJT items, participants must often make an evaluation in reference to some other person or group. This can occur explicitly in either the question stem (e.g. “How often do others tend to compliment you on your determination to continue with a project under difficult circumstances?”) or the response options 11 (e.g. “Much more than most people”). Even if not done explicitly, items may ask respondents to evaluate an abstract amount, whereby making some sort of social comparison may help participants respond to the item. This can be problematic, as argued by Robert, Lee, and Chan (2006), since an assumption that many test creators make is that participants are sampled from the same population and make evaluations against that population. This may not be the case as individuals of different backgrounds may evaluate against a reference group closer to themselves and not against the population intended by the researcher. Robert, Lee, and Chan (2006) refer to this as the frame of reference effect, whereby the participant evaluates against a local comparison group rather than a global one (i.e. population of interest). The frame of reference effect is thought to exert its influence by producing nonequivalent intercepts on items when comparing different groups. Robert, Lee, and Chan (2006) describe this as the product of individuals responding to an assessment based on the perceived differences between themselves and their comparison group. This problem has been represented graphically using Figure 3. 
Two individuals ("A" and "B") are shown to have equivalent standings on some latent trait. However, both individuals are making comparisons against local comparison groups with different standings on that same trait. When responding to an item related to this latent trait, individual A's response may be inflated due to the relatively large perceived difference between individual A and his/her local comparison group. Other participants with a similar background to individual A (i.e. those who use a similar local comparison group) will respond systematically higher to items reflecting this latent trait than would participants similar to individual B. As a result, responses would not be comparable between the groups that individuals A and B come from.
Figure 3. Depiction of individuals with equivalent standings on a latent trait but nonequivalent comparison groups
[Figure: bar chart on a 0-5 scale comparing Individual A and Individual B on two quantities: Standing on Latent Trait and Perceived Standing of Local Comparison Group.]
Item Accessibility. In addition to the frame of reference effect, Robert, Lee, and Chan (2006) suggest that the relevance of an item to the construct being assessed by a scale may differ as a function of group membership, which may bias responding. In other words, the degree to which item content accurately reflects a construct may vary between groups. When responding to biodata and SJT items, individuals are often presented with a specific behavior or situation meant to serve as an example of a broader construct. Problems may arise when the item content is more construct-relevant for one group than another. For example, consider the following biodata item stem: "In the past six months, how often did you read a book just to learn something?" This item uses the specific behavior of voluntarily reading a book as an indicator of the broader construct of continuous learning. Should an evaluator use this item to compare continuous learning between younger and older adults, the underlying assumption would be that books are an equally applicable means of seeking new information for both groups of respondents. If this is not the case, the item will have a weaker relationship with the construct of interest for the group that is less likely to read books irrespective of continuous learning. Put more generally, if item content differs in relevance between groups, the item will have a weaker loading on the latent construct for the group that finds the content less relevant. Item accessibility, as investigated by Imus et al. (2010), can be thought of as a specific case of why items may vary in construct relevance between groups. Imus et al. (2010) define item accessibility as differences between groups in the opportunity to have specific experiences due to social barriers, resulting in those experiences being differentially construct relevant. For example, Imus et al. (2010) found that Black respondents felt as though they had less opportunity to take "a leadership role in High School and/or organized activity" than White respondents. Further, Imus et al. (2010) go on to describe that items that are differentially accessible would also differ in how informative they are as indicators of the underlying construct. The proposed study seeks to apply the frame of reference effect and item accessibility to potential DIF in biodata and SJT items. Based on previous research, the current work proposes that SES and vocational interests will influence an individual's frame of reference when responding to biodata and SJT items.
Further, the accessibility of the item content in biodata and SJT items may vary across groups due to SES (i.e. restricted access). The conceptual model that is proposed here is illustrated in Figure 2, and the rationale behind these proposed mechanisms is described below.
Socioeconomic Status
The socioeconomic status (SES) of the community an individual comes from may strongly influence the experiences he or she has had and the factors that come to mind when evaluating items related to academic pursuits. The broad argument presented here is that the content assessed by biodata and situational judgment items may be influenced by SES (e.g. Kim et al., 2004; Imus et al., 2010). As reviewed by Cottrell, Newman, and Roisman (2015), low SES communities can suffer a number of setbacks. These researchers also connect SES differences to race via census data. Updated 2014 estimates of household income indicate that Black families across the United States have a median income of $35,398 and Hispanic families have a median income of $42,491. Both figures are substantially lower than the median incomes of White ($60,256) and Asian ($74,297) families (United States Census Bureau, 2015). Cottrell et al. (2015) argue that these race differences in SES can provide a partial explanation for subgroup differences on some psychological characteristics. As such, differences in SES are also likely to be a potential source of both DIF and true score differences in biodata and SJTs. It is important to note that observed score differences on a latent construct are a function of both bias and true score differences across groups. These true score differences are commonly referred to as impact, and despite being a valid representation of a particular difference between groups, can still contribute to adverse impact in selection contexts (Cottrell, Newman, & Roisman, 2015). Using an appropriate methodology, it is possible to differentiate bias and impact in group differences. As described below, SES is likely to be one source of bias on biodata and SJT items. However, SES may also produce score differences on these types of measures by influencing the latent constructs that are assessed. Because biodata and SJT items assess past experiences and the procedural knowledge developed as a result (Lievens & Motowidlo, 2016; Mael, 1991), individuals from low SES communities may have had fewer relevant experiences, resulting in lower levels of the latent trait being assessed after accounting for bias in the measure. Should this be the case, it is likely that economically disadvantaged minorities have been denied access to the opportunities necessary to develop the latent abilities measured by biodata and SJTs.
H1: The effects of minority status on the standings of the latent traits measured by biodata and SJTs will be partially explained by SES such that minority status will initially predict lower standings on the latent traits measured by biodata and SJTs and this effect will be weakened upon inclusion of measures of SES.
Though the effects of SES are likely broad, SES may specifically influence the characterization of a local comparison group. SES may be particularly characteristic of the comparison group for those in the current study given that they are just transitioning out of high school and that the conditions of their school are likely linked to the SES of the surrounding area. Poor communities may have reduced access to educational resources, providing fewer opportunities for students to engage in academic pursuits than in more affluent areas (Duncan & Magnuson, 2005). Additionally, parents in these communities may have less leisure time to spend with their children, resulting in fewer occasions to convey educational aspirations (Cottrell et al., 2015). Consequently, when individuals from low SES communities consider how often they engage in the academic activities assessed in the biodata scales of knowledge, continuous learning, and perseverance, the norm for their local comparison group may be relatively lower due to the reduced academic resources of those around them. This norm may lead them to use a different frame of reference than other individuals from more affluent areas. In turn, this may lead individuals from low SES communities to overestimate their actual standing on these items. Further, Cottrell et al. (2015) also point out that poorer communities are relatively more dangerous, serving as a less stable environment for individuals residing in these areas. Again, when evaluating questions related to academic values and social responsibility, individuals from low SES communities may have a different frame of reference. This can result in DIF across groups in the form of different item intercepts, due to the potentially lower standing of their local comparison group on these dimensions (e.g. Robert, Lee, & Chan, 2006), such that individuals from low SES communities are more likely to endorse higher response options than individuals from more affluent communities. However, it should be highlighted that though SES will likely lead to intercept differences as a function of race, factor loading differences will be unlikely among biodata items. This is due to the relatively general nature of biodata item content facilitating construct relevance regardless of SES. Such a prediction aligns with the findings of Imus et al. (2010), whereby perceptual differences in accessibility between Black and White participants did not correlate with biodata item slope parameters.
H2: The effects of minority status on DIF in biodata items will be partially explained by SES such that minority status will initially predict higher item intercepts and this effect will be weakened upon inclusion of SES.
Bias may also be observed in SJTs when they assess content that may have a relationship with SES. However, this may not be related to the frame of reference effect since responding to SJT items requires the identification of a response to a specific situation, rather than self-evaluation against a reference group. This feature of SJT items makes the likelihood of item intercept differences as a result of the frame of reference effect relatively low. Instead, bias in SJTs may be more related to item accessibility. Kim et al. (2004) related differences in SES to the differential opportunities hypothesis, which describes how disadvantaged minority group members may not have access to specific opportunities required to demonstrate their standing on a particular ability (Deutsch & Brown, 1964; Jachuck & Mohanty, 1974). This lack of access may hinder the ability of a minority group member to respond to items related to a specific context, in this case academic situations, when in reality the majority and minority group members have similar standings on the latent trait.
For example, if an SJT item asks about activities that are associated with SES, that item may not be as relevant to disadvantaged minority individuals. Due to this difference in item accessibility, the item in question may load 17 poorly onto the latent construct of situational judgment (e.g. Imus et al., 2010; Robert, Lee, & Chan, 2006). This should result in smaller item factor loadings among members of groups that are of lower SES. H3: The effects of minority status on DIF in SJT items will be partially explained by SES such that minority status will initially predict smaller item factor loadings and this effect will be weakened upon inclusion of SES. Vocational Interests As mentioned by Imus et al., (2010), interests may be relevant to biodata assessments given the role interests play in shaping the experiences of an individual. It is possible that this logic may also be extended to SJTs given the relationship between experience and performance on such measures (Lievens & Motowidlo, 2016; McDaniel et al., 2001). Interests, as theorized by Holland (1997), can be described using a six-dimensional structure, where each dimension represents a distinct domain of behaviors an individual may be interested in. The domains are as follows: 1) Realistic – working with objects, also related to working outdoors, 2) Investigative – working with ideas, particularly in the sciences, 3) Artistic – following creative pursuits, such as writing and visual arts, 4) Social – working with and helping others, 5) Enterprising – taking on leadership or persuasive positions, often associated with pursuits related to economic growth, and 6) Conventional – preferring well-structured or traditional roles or environments. Nye, Su, Rounds, and Drasgow (2012) demonstrated that having a strong interest in a particular domain serves as a precursor to being motivated to engage in behaviors relevant to that domain. Interests help to motivate behavior by directing individuals toward particular goals, influencing the amount of effort expended on certain activities, and promoting perseverance on these activities over time. The direction of behavior has been shown in the past through studies demonstrating that interests can predict choice of an academic major or occupation (Eccles-Parsons, 1983; 18 Fouad, 1999; Holland, Fritzsche, & Powell, 1994). In terms of effort and persistence, Van Iddekinge, Putka, and Campbell (2011) found that interests moderately predicted effort and intentions to continue. Given the link between interests and motivation, interests should play a determining role in the experiences an individual pursues. Further, experiences resulting from interests may lead to the development of the constructs assessed by biodata and SJTs. It stands to reason that if an individual consistently directs his or her attention towards behaviors and experiences as a function of interests, their responses to biodata and SJT items may reflect their interests to some extent. Longitudinal meta-analytic work by Low, Yoon, Roberts, and Rounds (2005) shows that vocational interests are quite stable during adolescence and early adulthood. Thus, the role of interests as a precursor to motivation of specific behaviors (e.g. Nye et al., 2012) should be stable during the period leading up to the assessment of biodata and SJT responses in the current study. 
If individuals are relying on vocational interests to direct behaviors during the period of time that biodata and SJTs focus on, then it is plausible that these noncognitive assessments may overlap with vocational interests. Further, Su and Nye (in press) describe that declarative and procedural knowledge can be developed through engagement in activities that align with an individual's vocational interests. This relationship between interests and knowledge bears some similarity to the theoretical account of SJT responding provided by Lievens and Motowidlo (2016). These researchers argue that selecting an appropriate behavior in response to a particular situation presented in an SJT item is partially determined by procedural knowledge gained through experience. Regarding biodata, Mael (1991) argues that responses to biodata items can reflect behaviors enacted as an adaptive response, which oftentimes coincides with the acquisition of knowledge. Given that vocational interests may shape experiences during early adulthood and that the noncognitive assessments studied here are intended to capture past experience and knowledge, it is likely that the constructs assessed by interests and noncognitive measures are related. Though there are few empirical examples relating biodata and SJT assessment methods with vocational interests, several connections between them can be made based on theory and content. Situational judgment, as assessed in this study, may be related to investigative interests, since thoughtful analysis is an attribute both constructs hold in common. Beyond investigative interests, social interests may also be related to situational judgment. Most of the situations described in SJTs are embedded within a social context, and individuals who are motivated to pursue more social experiences may be more adept at choosing effective responses in a dilemma. In addition to having an interest in being around other people, social interests also describe being concerned for the welfare of others (Holland, 1997). This quality may also be found in biodata assessments that measure social responsibility and academic values, since behaviors related to these constructs could also be motivated by preserving the welfare of others. Enterprising interests are likely related to leadership given characteristics related to leading and persuading others. The pursuit of economic growth may also relate to persistence, given the overlap in exercising commitment over a period of time. Adaptability may also be related to enterprising interests since economically favorable opportunities may be associated with capitalizing on changes in one's environment. Given the role of interests in guiding behavior, I suggest that:
H4: High social interests should predict higher levels of the latent traits of social responsibility, academic values, and situational judgment
H5: High investigative interests should predict higher levels of the latent traits of knowledge, continuous learning, and situational judgment
H6: High enterprising interests should predict higher levels of the latent traits of leadership, adaptability, and perseverance
H7: High conventional interests should predict higher levels of the latent trait of knowledge
If these propositions hold, past work investigating differences in interests between demographic groups may suggest demographic differences in responses to biodata and SJT items.
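The hypothesized correspondences in H4 through H7 can be condensed into a simple mapping from each studied construct to the RIASEC dimensions expected to predict it. The sketch below is illustrative only and is not part of the original analyses; the construct and dimension labels are taken directly from the hypotheses above, and the dictionary name is hypothetical.

```python
# Hypothesized RIASEC predictors of each biodata/SJT construct (per H4-H7).
# Illustrative summary only; for example, it could be used to decide which
# interest scores enter the explanatory model for a given scale.
HYPOTHESIZED_INTEREST_PREDICTORS = {
    "Social Responsibility": ["Social"],                        # H4
    "Academic Values":       ["Social"],                        # H4
    "Situational Judgment":  ["Social", "Investigative"],       # H4, H5
    "Knowledge":             ["Investigative", "Conventional"], # H5, H7
    "Continuous Learning":   ["Investigative"],                 # H5
    "Leadership":            ["Enterprising"],                  # H6
    "Adaptability":          ["Enterprising"],                  # H6
    "Perseverance":          ["Enterprising"],                  # H6
}
```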
Su, Rounds, and Armstrong (2009) meta-analyzed over 40 technical manuals of vocational interest measures and found substantial differences across genders. Specifically, they found that men tended to have more interest in realistic (d = 0.84) and investigative (d = 0.26) domains, whereas women had stronger artistic (d = -0.35), social (d = -0.68), and conventional (d = -0.33) interests. In contrast, enterprising interests were not substantially different between men and women. Tracey and Robbins (2005) found similar results regarding mean differences favoring males for realistic interests and females for social interests. As mentioned above, interests affect motivation and direct individuals towards certain activities, which can translate into life choices such as choice of academic major or occupation (Eccles-Parsons, 1983; Fouad, 1999; Holland, Fritzsche, & Powell, 1994). Lending credence to the idea that interests may lead to seeking experiences relevant to particular academic domains, such as the ones evaluated by biodata and SJTs, Su, Rounds, and Armstrong (2009) also analyzed differences in interest measures related to engineering, science, and mathematics and found that men favored all three. The researchers argued that these gender differences may have contributed to some of the gender disparities observed in specific occupations in STEM fields. Recent meta-analytic work has also found racial differences in vocational interests between Whites and African Americans (Jones, Newman, Su, & Rounds, under review). On average, White respondents tended to score higher on realistic (d = -.22) and investigative (d = .16) scales whereas African Americans scored higher on the social (d = .26), enterprising (d = .18), and conventional (d = .28) interest scales. Scores on the artistic scales were not 21 substantially different across these groups (Jones et al., under review). As discussed with gender, differences in interests between Whites and Blacks have manifested themselves in terms of occupational choice. African Americans tend to be represented more in conventional, social, and enterprising jobs and underrepresented in STEM fields, which correspond to investigative interests (Walker & Tracey, 2012). Given the gender and race differences on vocational interests, it is also likely that different subgroups will be motivated to pursue different sets of activities, which will then manifest in group differences on biodata and SJTs: H8: The effect of minority and gender status on the latent traits assessed by biodata and SJTs will be partially explained by differences in vocational interests such that minority and gender status will predict standing on the latent traits assessed by biodata and SJTs, and this effect will be weakened by the inclusion of vocational interests in the model. In addition to generating race and gender differences on the latent construct, group differences on vocational interests are also likely to result in bias on biodata. Specifically, it may be the case that interests relate to the frame of reference effect in their production of DIF on biodata items (Robert, Lee, & Chan 2006). Schneider’s (1987) attraction-selection-attrition (ASA) model, specifically the attraction process, may help inform how interests can shape the local reference group an individual uses when responding. Schneider’s (1987) model describes how individuals are attracted to organizations where they may find similar others. 
Though organizations are not the focus of this investigation, the underlying process of being attracted to similar others may still be informative, and is in fact based on Holland's (1959, 1997) work arguing that individuals choose professions that match their interests. In other words, individuals may choose to participate in experiences that match their interests, and the other individuals in these environments are also likely to share those interests. If this is the case, then individuals may respond to questions about their experiences, like those used in biodata measures, using others with similar interests as their frame of reference. Given the race and gender differences in vocational interests, this suggests that interests may mediate the effects of race or gender on DIF in biodata assessments.
H9: The effect of minority and gender status on DIF in biodata items will be partially explained by vocational interests such that group status will initially predict biodata item intercepts, and this effect will be weakened by the inclusion of vocational interests in the model.
Evaluation of the hypotheses expressed above was accomplished through the methods and proposed analyses described below, using the general approach depicted in Figure 2.
METHOD
Participants and Procedures
Participants consisted of college students who were admitted and chose to attend a large, Midwestern university. During the application process, 11,637 students completed a biodata assessment and SJT along with other common admissions requirements, including providing demographic information. Admitted students who chose to enroll attended one of several orientation sessions, during which further survey data were collected in paper-and-pencil format. A subset of these admitted students, whose selection was based on orientation scheduling, completed a survey containing a vocational interests inventory based on the RIASEC model (Holland, 1997), as well as a parallel version of the SJT they took during the application process. The final sample consisted of 1,486 students, of which 616 were male and 827 were female (43 did not provide a response). The racial composition of the sample consisted of 1,070 White, 158 Black, 106 Asian, 78 Hispanic, 58 Multiracial, 5 Native American, and 11 participants who did not specify their race. White, Black, and Asian participants were analyzed separately, whereas all other participants were included in an "Other" category.
Measures
Demographics. Demographic data included in this study come from data collected during the application process. Of note, race, gender, high school zip code, and Pell grant eligibility status were obtained.
Biodata. The biodata assessment used in the present study was developed to measure 12 dimensions identified via content analysis of university websites describing the attributes they hoped to develop in students (Oswald et al., 2004). Seven of these attributes, defined and presented with example items in Table 1, have been retained for the current study due to their high reliabilities and past work demonstrating their validity for predicting criteria like college GPA. This version of the biodata assessment comes from the Student Behavior and Experiences Inventory, which also includes the SJT measure below (Oswald et al., 2004). Items were designed to ask about interests, hobbies, experiences, and relevant background information related to the construct being assessed (Imus et al., 2010).
Assessing such a range of content translates into a broad assessment of the construct that includes attitudes, beliefs, and past behaviors. Each of the seven scales used 10 multiple-choice items (except Social Responsibility, which had 9 items) designed to assess aspects of the individual's past experiences thought to be indicative of capabilities suited for a university context.

Table 1. Dimensions assessed with the biodata and SJT measures

Knowledge: Gaining knowledge and mastering facts, ideas, and theories and how they interrelate, and the relevant contexts in which knowledge is developed and applied.
Sample item: For class work, how often do you tend to skim the material, reading only the important points?
a. Almost all the time  b. Most of the time  c. Sometimes  d. Rarely  e. Never

Continuous Learning: Being intellectually curious and interested in continuous learning. Actively seeking new ideas and new skills, both in core areas of study as well as in peripheral or novel areas.
Sample item: In the past month, how many times have you looked for more information about something that you found interesting?
a. Never  b. Once or twice  c. 3 to 5 times  d. 6 to 10 times  e. More than 10 times

Social Responsibility: Being responsible to society and the community, and demonstrating good citizenship. Being actively involved in the events in one's surrounding community, which can be at the neighborhood, town/city, state, national, or college/university level.
Sample item: How many hours of volunteer work did you do in high school?
a. 0  b. Between 1 and 10  c. Between 11 and 30  d. Between 31 and 75  e. More than 75

Leadership: Demonstrating skills in a group, such as motivating others, coordinating groups and tasks, serving as a representative for the group, or otherwise performing a managing role in a group.
Sample item: When asked to do a class project with other students, how often do you take the lead and assign tasks or roles to people in the group?
a. I am usually the one who assigns tasks or roles to get the work done  b. More than half the time I end up assigning the tasks and roles  c. About half the time I take the lead in assigning tasks and roles  d. I rarely take the lead in assigning tasks and roles  e. I never take the lead unless I have been assigned to do so

Perseverance: Committing oneself to goals and priorities set, regardless of the difficulties that stand in the way.
Sample item: When encountering problems that take a long time to solve, how impatient do you tend to become?
a. Extremely impatient  b. Very impatient  c. Somewhat impatient  d. Slightly impatient  e. Not at all impatient

Adaptability: Adapting to a changing environment (at school or home), dealing well with gradual or sudden and expected or unexpected changes. Being effective in planning one's everyday activities and dealing with novel problems and challenges in life.
Sample item: In the past, how difficult have you found it to adjust to major changes in your life (e.g. moving, a new school, a new job)?
a. Extremely difficult  b. Very difficult  c. Difficult  d. Not very difficult  e. Not at all difficult

Academic Values: Having a well-developed set of values, and behaving in ways consistent with those values. In everyday life, this could mean being honest, not cheating (on exams or in committed relationships), and having respect for others.
Sample item: In your first three years of high school, how often did you skip classes without a legitimate reason?
a. Most of the time  b. A lot  c. Sometimes  d. Once or twice  e. Never
Situational Judgment: Making good decisions in various academic and social situations related to each of the above areas. Analyzing and choosing from among various alternative possible actions in problem situations.
Sample item: You are part of a three-person group working on a class project with a quickly approaching deadline. One member of the team is not pulling his weight. He avoids assignments, complains about the amount of work that has to be done, and says the project doesn't really matter anyway. While you are all classmates, you seem to be the group leader. What would you do?
a. Divide the workload among members of the group, making sure everyone knows they are responsible for their share. If the group member still does not pull his own weight, bring it up with the instructor.
b. Speak with him in private and offer him moral encouragement to complete his portion of the project. If the group member still does not pull his own weight, bring it up with the instructor.
c. Try to get the team member motivated to do his work. If that doesn't help the situation, just put more effort into the project yourself in order to complete it.
d. Just do the group member's portion of the assignment in addition to your own, and tell the instructor about the situation.
e. See if the person could be removed from your group.
f. Consult with the non-problematic group about the most appropriate course of action, and then act on whatever you jointly decide.

Note. Table reproduced with permission from Prasad, Showler, Schmitt, Ryan and Nye (2016).

SJT. The SJT included in the current study was developed as a predictor of college performance and is part of the Student Behavior and Experiences Inventory (Oswald et al., 2004) described above. This measure was also intended to reflect academically related capabilities, but did so by presenting scenarios with a list of possible actions. Specifically, each item contained a dilemma that is commonly faced by college students, along with several possible responses to that dilemma. The same twelve dimensions originally assessed by the biodata measure are also assessed by the 25 scenarios presented in the SJT. Previous findings, however, indicated that a unidimensional model of situational judgment best represented responses to this kind of assessment despite agreement among researchers regarding the sorting of items into different dimensions (Oswald et al., 2004). As a result, analyses treated all items as belonging to the same scale, as has been done in the past (e.g., Schmitt et al., 2009).

Interests. The vocational interest measure used during orientation activities was the brief public domain RIASEC markers scale, developed and validated by Armstrong, Allison, and Rounds (2008). This measure assessed the six RIASEC vocational interest dimensions originally proposed by Holland (1997). Six items per dimension were used, which asked about activities related to a particular dimension, such as "Set up and operate machines to make products" or "Sing in a band," for a total of 36 items. Participants used a five-point Likert scale to indicate their level of interest in an activity, from "Dislike very much" to "Like very much." Scores consisted of the sum of item scores for each RIASEC dimension, individually.
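To illustrate the scoring rule just described, the following is a minimal sketch (not code from the thesis); the data frame column names are hypothetical placeholders for however the 36 items are actually labelled in the data set.

```python
import pandas as pd

# Hypothetical item columns: six 1-5 Likert items per RIASEC dimension,
# e.g. "real_1" ... "real_6" for the Realistic scale.
RIASEC_ITEMS = {
    "Realistic":     [f"real_{i}" for i in range(1, 7)],
    "Investigative": [f"inv_{i}" for i in range(1, 7)],
    "Artistic":      [f"art_{i}" for i in range(1, 7)],
    "Social":        [f"soc_{i}" for i in range(1, 7)],
    "Enterprising":  [f"ent_{i}" for i in range(1, 7)],
    "Conventional":  [f"conv_{i}" for i in range(1, 7)],
}

def score_riasec(responses: pd.DataFrame) -> pd.DataFrame:
    """Sum the six Likert items within each RIASEC dimension, as described above."""
    scores = pd.DataFrame(index=responses.index)
    for dimension, items in RIASEC_ITEMS.items():
        scores[dimension] = responses[items].sum(axis=1)
    return scores
```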
Median Local Income. In addition to information gathered from subjects during their application and orientation processes, the 2010 median household income of the zip code for their high school was used to describe their socioeconomic status. Localized income data were retrieved from the U.S. Census Bureau website (United States Census Bureau, 2010). Income data were divided by a factor of 1,000 before inclusion in SEM analyses. Mplus documentation indicates that model estimation may be hindered in instances where the variances of some variables are much larger than those of other variables in the model (Muthén & Muthén, 2011). By dividing income values by 1,000, the resulting variance of the Median Local Income variable was closer in magnitude to those of the other studied variables, facilitating model estimation.

Data Analysis

The hypotheses posed in the present research were evaluated using a multiple indicator multiple cause (MIMIC) model approach (Jöreskog & Goldberger, 1975; Muthén, 1989). In addition to the explanation that follows, Figure 2 serves as an illustration of this approach. Before such analyses were conducted, two prerequisite steps were required. First, an adequate measurement model for all scales had to be estimated to promote subsequent structural model fit. Second, scale items had to be assessed for DIF because the MIMIC approach used here would not be identified if all items, including those that did not demonstrate DIF, were regressed onto explanatory covariates. An excessive number of estimated paths would be required to assess all items simultaneously for the substantive factors contributing to DIF (Woods, Oltmanns, & Turkheimer, 2009). This section describes these prerequisite steps, as well as the implementation and interpretation of the final structural models. A measurement model that fit well for each scale was important to promote overall model fit and the interpretability of results in subsequent analyses. The adequacy of model fit for these and subsequent analyses was assessed using the following rules of thumb: SRMR < .05, NNFI > .90, CFI > .90, and RMSEA < .08. However, these rules of thumb were used as guidelines rather than strict cutoffs, and model fit was examined holistically because conditions may arise where an individual fit index signals misfit unnecessarily (Nye & Drasgow, 2011; Schermelleh-Engel, Moosbrugger, & Müller, 2003). White males were used as the subgroup for which measurement models were tested and modified because these individuals served as the referent group for DIF analyses across both race and gender. For each scale, an initial measurement model was fit to the data whereby all items loaded onto a single latent factor. Initial fit of the Continuous Learning scale (χ2(35) = 122.78, RMSEA = .07, CFI = .92, NNFI = .90, SRMR = .04) was acceptable and the unidimensional model was used for further analyses. All other scales required some form of modification to identify an appropriate measurement model for analyses. In cases of unsatisfactory initial model fit, standardized residuals and modification indices were assessed to identify where model fit could be improved. In many cases, misfit was attributable to additional content that items shared after accounting for the latent factor. This issue is common in many types of noncognitive assessments (Nye, Allemand, Gosling, Potter, & Roberts, 2016) and was addressed by correlating the errors of these items when this constraint seemed theoretically justified.
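As a concrete illustration of how the fit guidelines described above can be tallied without being treated as hard cutoffs, the following is a minimal sketch (not code from the thesis; the function and dictionary names are illustrative only).

```python
# Apply the SRMR/NNFI/CFI/RMSEA rules of thumb described above as guidelines,
# reporting which indices merit a closer look rather than a single pass/fail verdict.
GUIDELINES = {
    "SRMR":  lambda v: v < .05,
    "NNFI":  lambda v: v > .90,
    "CFI":   lambda v: v > .90,
    "RMSEA": lambda v: v < .08,
}

def fit_summary(indices: dict) -> dict:
    """indices: e.g. {"SRMR": .04, "NNFI": .90, "CFI": .92, "RMSEA": .07}"""
    return {name: ("ok" if rule(indices[name]) else "check")
            for name, rule in GUIDELINES.items()}

# Example using the initial Continuous Learning fit reported above: NNFI sits
# exactly at the guideline, illustrating why the indices are read holistically
# rather than as strict cutoffs.
print(fit_summary({"SRMR": .04, "NNFI": .90, "CFI": .92, "RMSEA": .07}))
```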
For the Academic Values scale, initial fit was unsatisfactory (χ2(35) = 107.06, RMSEA = .07, CFI = .88, NNFI = .85, SRMR = .05), and was improved through the incorporation of correlated errors between the items "In the past year, how many times have you copied someone else's work and submitted it as your own (at school or at work)?" and "In high school, how many times have you cheated on a school project, assignment, or test?", both of which related to the frequency of cheating on academic assignments (χ2(34) = 70.04, RMSEA = .05, CFI = .94, NNFI = .92, SRMR = .04). Initial fit of the Social Responsibility scale (χ2(27) = 129.33, RMSEA = .09, CFI = .92, NNFI = .89, SRMR = .05) was marginal, and the subsequent configural models for race (χ2(108) = 494.76, RMSEA = .10, CFI = .90, NNFI = .86, SRMR = .06) and gender (χ2(54) = 427.40, RMSEA = .10, CFI = .89, NNFI = .86, SRMR = .05) did not fit well. Correlated errors between the items "How many hours of volunteer work did you do while in high school?" and "In the past year, how many hours were you engaged in community service or volunteer activities?" were therefore estimated, and the resulting model yielded improved measurement model fit (χ2(26) = 89.43, RMSEA = .07, CFI = .95, NNFI = .93, SRMR = .04). Further, this model fit well when estimated across both gender (χ2(52) = 248.78, RMSEA = .07, CFI = .94, NNFI = .92, SRMR = .04) and race (χ2(104) = 314.60, RMSEA = .07, CFI = .94, NNFI = .92, SRMR = .05). Again, these additional model constraints were justified given the shared content of these items. Past research has demonstrated that such constraints are necessary when justified by the content of the items and/or their theoretical relationship (Cole, Ciesla, & Steiger, 2007). All correlated residuals included to improve measurement model fit were retained in subsequent analyses as well. Beyond the addition of correlated residuals, some scales required the removal of a problematic item to achieve appropriate measurement model fit. Initial fit for the Knowledge scale was poor (χ2(35) = 116.39, RMSEA = .07, CFI = .88, NNFI = .84, SRMR = .05) and both modification indices and residuals indicated that numerous aspects of the model were misspecified. Inspection of item descriptive statistics revealed a strong ceiling effect for the following item: "In your high school courses, how effective would you say you were at learning knowledge and mastering general concepts?" The ceiling effect was reflected in the relatively weak loading of this item onto the latent Knowledge construct (λ = .41). Removal of this item resulted in acceptable model fit (χ2(27) = 72.22, RMSEA = .06, CFI = .92, NNFI = .90, SRMR = .04). Thus, this item was excluded from further analyses. The Perseverance scale also did not fit well with a single factor (χ2(35) = 167.98, RMSEA = .09, CFI = .83, NNFI = .78, SRMR = .05). Examination of standardized residual covariances revealed that the item "How often have you achieved a personal goal that seemed unattainable at first?" related to several other items in a way that was not captured by the latent factor. Removal of this item meaningfully improved the fit of the Perseverance scale (χ2(27) = 95.84, RMSEA = .07, CFI = .90, NNFI = .87, SRMR = .04).
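The correlated-error modifications described above can be read directly from the model-implied covariance matrix of a single-factor model. The sketch below is illustrative only: the loadings, residual variances, and the residual covariance are hypothetical values, not estimates from these data.

```python
import numpy as np

# Single-factor CFA: Sigma = Lambda * Phi * Lambda' + Theta
lam = np.array([[.6], [.7], [.5], [.6]])   # hypothetical loadings for four items
phi = np.array([[1.0]])                     # factor variance fixed to 1
theta = np.diag([.64, .51, .75, .64])       # residual variances

# Allow a residual covariance between items 1 and 2 (e.g., two items that both
# ask about cheating frequency), analogous to the Academic Values modification.
theta_mod = theta.copy()
theta_mod[0, 1] = theta_mod[1, 0] = .15     # hypothetical residual covariance

sigma_base = lam @ phi @ lam.T + theta
sigma_mod = lam @ phi @ lam.T + theta_mod

# The modification raises the implied covariance of items 1 and 2 only,
# absorbing shared content that the single factor cannot account for.
print(sigma_base[0, 1], sigma_mod[0, 1])    # ~0.42 vs ~0.57
```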
Based on the findings of Oswald et al. (2004), we modelled the SJT with a single latent factor, which did not appear to fit well (χ2(275) = 392.61, RMSEA = .03, CFI = .79, NNFI = .77, SRMR = .05). Improved fit was achieved by the removal of the item "You have very much wanted to be a teacher, but you failed the entrance exam into the College of Education. This exam is not given again for a year. What would you do?" due to excessive residual correlation with four other items. Additionally, the error terms of two pairs of items were correlated based on similarity in content. The first pair of correlated items were "You are searching for a major that interests you and think you might be interested in psychology. You do not know much about preparation to be a psychologist or what kinds of opportunities exist for careers in this area. What action would you take?" and "You are interested in several different classes/disciplines, but don't know anything about future educational or career opportunities in these areas. What steps would you take to get informed?" The second pair of correlated items were "You are part of a committee to reduce cross-cultural tension in your dorm. A group of students in your dorm complain to you that people always wish them 'Merry Christmas' or 'Happy Easter' when these holidays are not meaningful to them. They request that their differences be respected. How would you address this problem?" and "A friend on your floor is always organizing 'social' activities including trips to local bars. Aside from the fact that this person is underage and failing some classes, you realize that the individual is drinking half a dozen or more drinks at least three or four times a week. No one else seems to know or be concerned about the person. What would you do?" As a result of these modifications, model fit was closer to acceptable levels (χ2(250) = 306.36, RMSEA = .02, CFI = .89, NNFI = .88, SRMR = .04). Results also indicated that a two-factor model fit the Adaptability and Leadership scales best. The initial fit of a unidimensional model for the Adaptability scale was poor (χ2(35) = 166.08, RMSEA = .09, CFI = .80, NNFI = .74, SRMR = .06), and examination of residual correlations appeared to indicate distinct subsets of items. Four items that related to changes to an individual's normal routine, like "In the past, how difficult have you found it to adjust to major changes in your life (e.g., moving, a new school, a new job)?", appeared to relate highly in a way not captured by a single latent construct. A two-factor model appeared to fit the Adaptability scale well, where the four routine-related items loaded onto one factor and all other items loaded onto the other factor, with a correlation included between the factors (χ2(34) = 69.25, RMSEA = .05, CFI = .95, NNFI = .93, SRMR = .04). It should be noted that the correlation between the latent factors in the two-factor model of Adaptability was quite high (r = .57), suggesting that these factors may not be meaningfully distinct. Nonetheless, in subsequent analyses these factors were modelled individually; both single-factor models resulted in good fit and will be referred to as Routine Adaptability (the four routine-related items; χ2(2) = 4.02, RMSEA = .05, CFI = .99, NNFI = .98, SRMR = .02) and Discrete Adaptability (all other items; χ2(9) = 20.04, RMSEA = .05, CFI = .96, NNFI = .94, SRMR = .03) in further analyses. In addition, a single-factor model of Leadership also did not fit well (χ2(35) = 266.98, RMSEA = .12, CFI = .83, NNFI = .78, SRMR = .07).
Examination of residual correlations revealed the possibility of one factor representing experience with past leadership positions (e.g., "The number of high school clubs and organized activities (such as band, sports, newspapers, etc.) in which you took a leadership role was:") whereas the other seemed to relate to leadership behaviors (e.g., "During the past year, how often have you taken charge of a group that you were in, without being asked?"). Modelling Leadership with two factors (χ2(34) = 148.17, RMSEA = .09, CFI = .92, NNFI = .89, SRMR = .05) yielded acceptable model fit. However, the factor that appeared more related to leadership behaviors did not fit well on its own (χ2(9) = 95.43, RMSEA = .14, CFI = .88, NNFI = .80, SRMR = .06). Two items within this factor still demonstrated evidence of a strong residual correlation, and both related to tasks focused on organizing the group: "In the past year, how many times have you been responsible for assigning tasks and setting deadlines for other people?" and "How many times in the past year have you set the schedule (time and/or tasks) for groups in which you have worked?" A two-factor model of the Leadership scale with this residual correlation included yielded satisfactory model fit (χ2(33) = 88.91, RMSEA = .06, CFI = .96, NNFI = .95, SRMR = .04), and the factors were highly correlated (r = .70). Independent models of these factors fit well, and the factors are treated independently in further analyses as Leadership Positions (χ2(2) = 0.41, RMSEA = .00, CFI = 1.00, NNFI = 1.00, SRMR = .00) and Behavioral Leadership (χ2(8) = 24.98, RMSEA = .07, CFI = .98, NNFI = .96, SRMR = .03). With suitable measurement models identified, scale items were assessed for DIF. DIF analyses were conducted to identify which items displayed DIF that could then be explained via a MIMIC model. As mentioned previously, testing all items for DIF using the MIMIC approach would require an excessive number of estimated paths (Woods, Oltmanns, & Turkheimer, 2009). As such, multiple group analyses for race and gender were conducted separately to flag items for DIF, and items flagged as functioning differently across any of the groups were examined further using the MIMIC model. Before flagging items for DIF, a suitable referent item had to be found for each scale. This involved estimating a constrained baseline model (i.e., all item loadings and intercepts constrained to equality across groups) followed by models in which an individual item was freely estimated across groups (Stark, Chernyshenko, & Drasgow, 2006). Models were estimated until an item was found that produced a change in CFI of less than .002 when freely estimated (Meade, Johnson, & Braddy, 2008), signifying an item that could serve as a suitable referent. If an item met this condition for analyses of both race and gender, then this item was used as a referent in further analyses. A free baseline model (i.e., all item loadings and intercepts freely estimated) was then estimated for each scale, followed by models in which an individual item was constrained to equality across groups. When constraining an item resulted in a decrease in CFI of more than .002 (i.e., constraining an item to equality across groups significantly worsened model fit), that item was flagged for DIF (for further description and rationale, see Stark, Chernyshenko, & Drasgow, 2006; Nye & Drasgow, 2011). The items flagged for DIF based on race and gender can be found in Appendices A1-A10, with each table corresponding to a specific scale.
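A minimal sketch of the flagging logic just described, operating on CFI values assumed to have already been obtained from the constrained and free multiple-group models; the helper names, item labels, and fit values here are hypothetical and not from the thesis.

```python
# Referent search: free one item at a time from a fully constrained baseline.
# An item whose freeing changes CFI by less than .002 can serve as the referent.
def find_referent(constrained_cfi: float, freed_cfi_by_item: dict, tol: float = .002):
    for item, cfi in freed_cfi_by_item.items():
        if abs(cfi - constrained_cfi) < tol:
            return item
    return None

# DIF flagging: constrain one item at a time in a free baseline model.
# A CFI drop of more than .002 flags that item for DIF.
def flag_dif_items(free_cfi: float, constrained_cfi_by_item: dict, tol: float = .002):
    return [item for item, cfi in constrained_cfi_by_item.items()
            if (free_cfi - cfi) > tol]

# Hypothetical values for a four-item scale with a gender grouping.
referent = find_referent(.921, {"item1": .9225, "item2": .930, "item3": .9218})
flagged = flag_dif_items(.944, {"item1": .943, "item2": .938, "item4": .9435})
print(referent, flagged)   # item1 ['item2']
```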
Testing for Uniform and Nonuniform DIF. The main analyses in the current study modelled responses to biodata and SJT items using MIMIC structural equation models (Muthén, 1989). MIMIC models are useful tools for assessing DIF, particularly when the goal is to explain DIF rather than just detect it. As with CFA more generally, the MIMIC model estimates the latent trait underlying a measure and then models participants' standings on the latent trait. This makes it possible to differentiate true differences in the latent trait from bias in the measure. The application of a MIMIC model as well as the model building approach described below are depicted in Figure 2. All of the tests described below were conducted within the Mplus version 7.4 software package (Muthén & Muthén, 2011). As described above, uniform DIF implies that there are consistent differences between groups across the entire response scale for a particular item. Specifically, uniform DIF is reflected in differences in the intercepts of the items. Uniform DIF can be detected if a grouping variable (e.g., race, gender) significantly predicts a response to an item while also controlling for its relationship with the latent trait (Woods, 2009). Nonuniform DIF, on the other hand, describes how the response scale may differ across groups; this refers to differences in the factor loadings of the items on a latent trait. The detection of nonuniform DIF in the MIMIC model required the computation of an interaction term between the latent trait and the grouping variable, as described by Woods and Grimm (2011). This interaction term is then used to predict responses to a particular item, and a significant path suggests that the response scale for that item is dissimilar across groups. In their work, Woods and Grimm (2011) computed interaction terms using the XWITH command in Mplus, as this is the approach recommended in the Mplus documentation for computing an interaction between a latent continuous variable and an observed categorical variable (Muthén & Muthén, 2011). However, the XWITH command in Mplus only allows interaction terms to be used as predictors, so these terms could only be used to predict item responses (i.e., they cannot be used to model correlations with other predictor variables). Further, interaction terms estimated in this way require specification of random slopes and intercepts, which limits the available model fit information to AIC and BIC values (Muthén & Muthén, 2011).
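To make the distinction concrete, the following simulation sketch writes out the MIMIC representation of a single item directly rather than estimating it in Mplus; the coefficients and variable names are hypothetical. A nonzero group effect produces uniform DIF (an intercept shift) and a nonzero latent-by-group interaction produces nonuniform DIF (a loading difference).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

eta = rng.normal(size=n)              # latent trait (e.g., Perseverance)
group = rng.integers(0, 2, size=n)    # dummy-coded grouping variable (e.g., female = 1)

# MIMIC form for one item:
#   y = intercept + loading*eta + b_uniform*group + b_nonuniform*(eta*group) + error
intercept, loading = 3.0, 0.7
b_uniform = 0.30       # shifts the item intercept for group = 1 (uniform DIF)
b_nonuniform = 0.20    # changes the item's loading for group = 1 (nonuniform DIF)

y = (intercept + loading * eta
     + b_uniform * group
     + b_nonuniform * eta * group
     + rng.normal(scale=0.5, size=n))

# At equal standings on the latent trait, group 1 still scores higher on the item
# (uniform DIF), and the item relates more strongly to the trait in group 1
# (nonuniform DIF).
print(np.polyfit(eta[group == 0], y[group == 0], 1))  # slope ~0.7, intercept ~3.0
print(np.polyfit(eta[group == 1], y[group == 1], 1))  # slope ~0.9, intercept ~3.3
```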
Explanation of DIF, a Model Building Approach. A model building approach was employed here to help explain instances where either uniform or nonuniform DIF was detected. After all scale items were individually evaluated for DIF, a baseline DIF model was estimated whereby all items flagged for DIF were regressed onto the grouping variables and interaction terms to model uniform and nonuniform DIF, respectively. Regressions of the latent factor onto the dichotomous grouping variables were also included. This baseline model was used to identify the specific DIF effects as described above. A second model was then estimated in which the SES variables (i.e., Pell grant eligibility and median local income) were added to the baseline model. Specifically, the items flagged for DIF as well as the latent factor of the target scale were regressed onto these SES variables. Finally, a third model was estimated in which the vocational interest scale scores were added to the model, with regressions of the DIF items and the latent factor of the scale onto each interest scale. No correlations among the explanatory covariates (i.e., dichotomous grouping variables, interactions, SES variables, and vocational interest scale scores) were specified in any of the three explanatory models. Muthén (1989) explains that MIMIC models should be conditioned on exogenous explanatory variables (Jöreskog & Goldberger, 1975). Specifying correlations between predictors within the MIMIC model would treat these variables as endogenous, with their variances and errors estimated as model parameters (Muthén & Muthén, 2011; Muthén, 2012), violating the original suggestion by Muthén (1989). Further, inclusion of these correlations may be a source of model over-identification (Hauser & Goldberger, 1971). However, correlations between predictors were important for evaluating the hypotheses, as explained further below. These correlations were obtained from the observed correlations between predictors computed outside of the estimated model (Muthén, 2012). Most relevant to the hypotheses posed here, biserial correlations were reported between the dichotomous grouping variables and both the SES variables and the vocational interest scales. Additionally, interaction terms from the third model were saved following model estimation, and these saved terms were correlated with both SES variables and vocational interest scales as well. In instances where the third model revealed what appeared to be a meaningful change in a demographic variable predicting a latent factor, follow-up model comparisons were conducted to evaluate the strength of the explanatory effect. When a regression of the latent factor on a demographic variable appeared meaningful, this model was compared to an alternative model in which the path from the demographic variable to the latent factor was fixed to the corresponding value observed in the first MIMIC model (i.e., the model that did not include SES variables or vocational interests). Given that the estimation of the interaction between the latent score and demographic variables limits model fit information to AIC and BIC values (Muthén & Muthén, 2011), rules of thumb from the literature describing those fit statistics were used to assess whether an explanation of an effect was significant. Raftery (1995) suggests that a BIC decrease of less than 2 constitutes weak evidence of model improvement, a decrease of 2 to 6 is good evidence, a decrease of 6 to 10 is strong evidence, and a decrease of greater than 10 is very strong evidence. For AIC, Burnham and Anderson (2004) suggest that a decrease of less than 2 is weak evidence, a decrease of 4 to 7 is strong evidence, and a decrease of greater than 10 is very strong evidence. The model building approach proposed here is a novel way to tie observed demographic DIF effects to potential explanatory variables. Broadly speaking, an observed demographic DIF effect was considered somewhat explained if two criteria were met. First, the inclusion of explanatory variables must have led to an observable decrement in the uniform or nonuniform DIF effect from the baseline model. Second, the explanatory variable must have both predicted responses to the DIF items and correlated with the variable signifying the uniform or nonuniform DIF effect. Additionally, given the number of explanatory variables, the incremental approach of including SES variables followed by vocational interest variables was proposed to help distinguish the effects of these two predictors.

RESULTS

Descriptive statistics, intercorrelations, and Cronbach's alphas are presented in Table 2.
Reliabilities for the Values (α = .57) and SJT (α = .65) scales were somewhat low, but all scales seem reliable enough to include in subsequent analyses. Small to moderate intercorrelations suggest that biodata, SJT, and the RIASEC interest scales measure distinct, but related constructs. Importantly, several of the categorical demographic variables correlated with some of the explanatory covariates. As expected, Black participants (coded as Black = 1, White = 0) generally came from areas with lower median income (r = -.41, p < .001) and were more likely to be Pell grant eligible (r = .39, p < .001). In addition, female participants (coded as female = 1, male = 0) were slightly more likely to come from areas with lower median income (r = -.08, p = .018) and Asian participants (coded as Asian = 1, White = 0) generally came from wealthier areas (r = .18, p < .001). Consistent with previous research (Su, Rounds, & Armstrong, 2009), gender was negatively associated with Realistic (r = -.53, p < .001), Investigative (r = -.17, p < .001), Enterprising (r = -.34, p < .001), and Conventional interests (r = -.24, p < .001). Gender was positively associated with Artistic (r = .14, p < .001) and Social Interests (r = .42, p < .001). Black respondents tended to have slightly higher Artistic (r = .10, p = .019) and Social (r = .12, p = .005) interests. Asian respondents had stronger Conventional interests (r = .13, p = .009). Nevertheless, the effect sizes of the correlations between race and vocational interests were generally small. Biserial correlations could also be examined to assess the differences on biodata and SJT scales between demographic groups. Females scored significantly higher on Leadership (r = .08, p = .013), Values (r = .12, p = <.001), Social Responsibility (r = .28, p = <.001), Perseverance (r = .20, p = <.001), and SJT (r = .28, p = <.001). Lower scores were obtained by females on the 39 Table 2. Descriptive statistics and intercorrelations of studied variables Variable M SD 1 2 3 4 5 6 7 8 9 3.43 0.71 (.85) 1) Leadership 3.55 0.58 .38** (.82) 2) Continuous Learning 3.54 0.44 .30** .55** (.72) 3) Knowledge ** ** 4.27 0.36 .17 .26 .44** (.57) 4) Values 3.72 0.67 .50** .24** .22** .17** (.77) 5) Social Responsibility 4.00 0.43 .42** .46** .56** .41** .32** (.77) 6) Perseverance ** ** ** 3.56 0.43 .36 .35 .45 .30** .27** .52** (.71) 7) Adaptability ** ** ** ** ** ** 3.94 0.26 .19 .23 .34 .43 .21 .36 .24** (.65) 8) Situational Judgment * ** ** ** ** 0.57 0.49 .08 -.13 .00 .12 .28 .20 .00 .28** 9) Female 0.11 0.31 -.05 .05 -.10* -.11* -.07 .15** -.06 .02 .09** 10) Black 0.07 0.26 -.06 .01 -.10* -.12* .15** -.26** -.19** -.13** -.02 11) Asian * * 0.10 0.30 -.09 -.02 -.07 -.04 -.09 .02 -.05 -.03 -.03 12) Other ** ** * ** 0.34 0.47 -.01 .08 .02 -.08 -.05 .11 .03 .01 .06 13) Pell Eligibility * ** 66628.40 27525.11 -.01 -.04 -.06 .02 .00 -.08 -.05 -.05 -.08* 14) Med. Local Income 2.20 0.82 -.06* .09** .00 -.11** -.11** -.12** -.03 -.14** -.53** 15) Realistic 3.12 0.96 -.01 .14** .17** .01 -.01 .00 .04 .03 -.17** 16) Investigative ** ** ** * 2.79 0.92 .07 .20 .04 .00 .05 .02 -.07 .06 .14** 17) Artistic 3.16 0.80 .16** .07** .11** .12** .24** .18** .11** .20** .42** 18) Social ** ** ** 3.00 0.86 .11 -.01 -.03 -.08 .00 .04 .08 -.05 -.34** 19) Enterprising 2.54 0.81 .02 .04 .08** -.03 -.03 .00 .05* -.04 -.24** 20) Conventional Note. Reliabilities presented along the diagonal in parentheses. Female denotes the dummy coded variable representing Gender (coded as Female = 1, Male = 0). 
Black, Asian, Other represent dummy coded variables representing Race categories (all coded as Minority = 1, White = 0). ** p < .01; * p < .05. 40 Table 2. (Cont’d) Variable 10 11 12 13 14 15 16 17 18 19 20 1) Leadership 2) Continuous Learning 3) Knowledge 4) Values 5) Social Responsibility 6) Perseverance 7) Adaptability 8) Situational Judgment 9) Female 10) Black -.10** 11) Asian -.12** -.09 12) Other .39** -.02 .10** 13) Pell Eligibility ** ** .18 -.41 -.07 -.22** 14) Med. Local Income .09 -.04 .00 -.03 -.01 (.85) 15) Realistic * .07 -.08 .03 -.04 -.05 .36** 16) Investigative (.84) * * * .02 .10 .10 .01 -.06 .08** .13** (.78) 17) Artistic ** * ** ** -.00 .12 -.01 .06 -.04 -.09 .08 .28** (.77) 18) Social ** ** -.03 -.06 .01 -.03 .08 .27 -.01 -.03 .07** (.83) 19) Enterprising .13** -.06 -.07 -.04 .03 .39** .11** -.10** .00 .52** (.82) 20) Conventional Note. Reliabilities presented along the diagonal in parentheses. Female denotes the dummy coded variable representing Gender (coded as Female = 1, Male = 0). Black, Asian, Other represent dummy coded variables representing Race categories (all coded as Minority = 1, White = 0). ** p < .01; * p < .05. 41 Continuous Learning scale (r = -.13, p = <.001), and no score differences were observed for Knowledge and Adaptability. Black participants scored significantly lower on Knowledge (r = .10, p = .018), and Values (r = -.11, p = .011), but scored higher on Perseverance (r = .15, p = <.001). No differences were observed on the Leadership, Continuous Learning, Social Responsibility, Adaptability, and SJT scales between Black and non-Black participants. Finally, Asian participants scored significantly lower on Knowledge (r = -.10, p = .038), Values (r = -.12, p = .013), Adaptability (r = -.19, p = <.001), and SJT (r = -.13, p = .009), but scored higher on Social Responsibility (r = .15, p = .002). Even though significant score differences were observed, most were quite small and in the case of gender almost all favored females. Assessment of Differential Item Functioning Tables A1-A10 display item content, configural model fit for race and gender, as well as fit for each model used to flag items for DIF across the scales included in this study. Two scales, Routine Adaptability and SJT, could not be fully analyzed as planned. For Routine Adaptability, a suitable referent item could not be identified for analyses based on Race, as all items produced a change in CFI of greater than .005, which is too far beyond the cutoff of .002 to consider using as a referent. As a result, further analyses of this scale only examined differences across gender. A suitable referent was found for the SJT, but an acceptable configural model fit could not be achieved for race (χ2(1000) = 1311.69, RMSEA = .03, CFI = .83, NNFI = .81, SRMR = .05). This suggests that although a well-fitting measurement model could be found with white males, this model did not fit well when applied across races3. However, configural model fit of the SJT Further analyses were conducted to examine the inability to find configural invariance across races. First, the measurement model found with white males was estimated within each racial group. 
Acceptable model fit was found among White participants (χ2(250) = 395.98, RMSEA = .02, CFI = .89, NNFI = .88, SRMR = .03), but not among any other racial group (Black: χ2(250) = 326.21, RMSEA = .04, CFI = .44, NNFI = .38, SRMR = .07; Asian: χ2(250) = 294.72, RMSEA = .04, CFI = .77, NNFI = .75, SRMR = .08; Other: χ2(250) = 294.78, RMSEA = .03, CFI = .78, NNFI = .75, SRMR = .07). Examination of modification indices and standardized residuals did not indicate clear causes of model misfit in the measurement models of the minority groups. However, configural invariance across races was also tested using the full applicant sample (N = 11,637) and yielded acceptable model fit (χ2(1000) = 2159.85, RMSEA = .02, CFI = .90, NNFI = .89, SRMR = .02), suggesting that the inability to find evidence for configural invariance is a function of using a reduced sample (for whom vocational interest data were available) more so than substantive differences in situational judgment across groups. The issue of sample size is discussed further in the limitations section of the discussion. Configural model fit of the SJT was, however, acceptable for gender (χ2(502) = 692.88, RMSEA = .02, CFI = .88, NNFI = .87, SRMR = .04). Thus, further analyses of the SJT focused only on gender. Due to the inability to establish configural invariance in the SJT based on race, H3, which suggested that factor loading differences across racial groups could be accounted for by SES, could not be tested. For all other scales, DIF was identified in at least one item and the MIMIC model analyses were conducted as planned to evaluate the hypotheses that account for these effects.

Evaluation of Hypotheses

Broadly, the present work sought to identify instances of differential item functioning and latent factor score differences between demographic groups, and attempted to explain why such differences occur. H1 stated that minority respondents may have lower standings on the latent factors measured by biodata and SJT scales and that these differences would be attenuated after accounting for SES. Across all scales, being Pell grant eligible only significantly predicted a higher standing on the latent Continuous Learning factor (β = .21, p = .003), and median local income only predicted a lower standing on the Leadership Positions factor (β = -.07, p = .031). In neither case did these effects coincide with meaningful differences in the latent factors across demographic groups. It should also be noted that the inclusion of SES variables rendered some demographic differences in the latent factors nonsignificant. However, in each of these cases the SES variables themselves did not significantly predict the latent factor, suggesting that the change in significance was likely due to error in the estimate of demographic group effects more so than a meaningful explanation of an effect by SES. With all of this in mind, there was minimal support for H1. H2 stated that item intercepts on the biodata scales would vary across racial minority groups and that this variation would be partially explained by SES. Of the 37 biodata items flagged for DIF, nine were significantly predicted by SES variables after accounting for the latent factor.
However, only one item ("How often do you ask a teacher or classmate questions that go beyond the material but are still relevant to the topic (either in or out of class)?", Table B4) displayed the expected effect of attenuating an initially high item intercept among Black respondents (β = .27, p = .001), though the intercept difference remained significant (β = .20, p = .023) after inclusion of median local income (β = -.06, p = .022). Other intercept differences varied across demographic groups and were partially explained by SES, but were not consistent with the hypothesized effects. A relatively low item intercept among Asian respondents for the aforementioned item (β = -.20, p = .044) was attenuated (β = -.19, p = .079) by inclusion of median local income (β = -.06, p = .022), but the change in statistical significance was associated with a minimal change in effect size. Similarly, for the item "In your first three years of high school, how often did you skip classes without a legitimate reason?" (Table B9), the inclusion of Pell grant eligibility (β = -.18, p = .002) decreased an initially significant intercept difference across male and female respondents (initial model: β = -.13, p = .016; secondary model: β = -.08, p = .130), but the actual change in effect size was small, suggesting that the explanatory effect of Pell grant eligibility was not meaningful in this case. For the item "When asked to do a class project with other students, how often do you take the lead and assign tasks or roles to people in the group?" (Table B5), inclusion of Pell grant status seemed to clarify an initially nonsignificant intercept among Asian respondents (initial model: β = -.21, p = .080; secondary model: β = -.25, p = .045). However, this effect may not be practically important given the small change in effect size and borderline statistical significance. Overall, H2 received little support, as instances of SES variables predicting item responses did not have substantial effects on intercept differences across demographic groups. Several hypotheses also posited an association between vocational interests and the latent traits measured by the biodata and SJT scales (H4-H7; summarized in Table 3).

Table 3. Correspondence of RIASEC dimensions with biodata and SJT dimensions
Realistic: (no corresponding dimension hypothesized)
Investigative: Continuous Learning, Knowledge, Situational Judgment
Artistic: (no corresponding dimension hypothesized)
Social: Social Responsibility, Academic Values, Situational Judgment
Enterprising: Leadership, Adaptability, Perseverance
Conventional: Knowledge

These hypotheses were evaluated using the third MIMIC model in Figure 2 for each scale, part of which included regressions of the scale latent factor onto each vocational interest scale. The results of these analyses are summarized in Table 4 and the full results are provided in Appendix B for each scale. Overall, the vocational interest scales predicted each noncognitive measure largely as expected, with some notable exceptions. Social interests predicted Social Responsibility (β = .13, p < .001), Academic Values (β = .12, p = .001), and Situational Judgment (β = .19, p < .001) as expected. Additionally, Social interests predicted Leadership Behaviors (β = .16, p < .001), Leadership Positions (β = .15, p < .001), Knowledge (β = .13, p = .006), Perseverance (β = .13, p = .003), Discrete Adaptability (β = .14, p = .008), and Routine Adaptability (β = .13, p = .003). Given that the expected relationships were observed, these results support H4.
Investigative interests predicted Situational Judgment (β = .07, p = .022), Continuous Learning (β = .10, p = .004), and Knowledge (β = .13, p = .005) as hypothesized, providing support for H5. Enterprising interests were hypothesized and found to predict Behavioral Leadership (β = .21, p < .001), Leadership Positions (β = .12, p = .002), Discrete Adaptability (β = .12, p = .047), Routine Adaptability (β = .11, p = .013), and Perseverance (β = .13, p = .006). Unexpectedly, Enterprising interests also negatively related to Continuous Learning (β = -.08, p = .03). Overall, these results support H6. Finally, Conventional interests were found to predict Knowledge (β = .10, p = .049) as expected. Conventional interests also predicted Continuous Learning (β = .09, p = .020) and Discrete Adaptability (β = .15, p = .010) and were negatively related to Behavioral Leadership (β = -.12, p = .002) and Routine Adaptability (β = -.11, p = .010). In spite of the preponderance of unexpected relationships, H7 received support. Beyond these hypothesized relationships, Realistic and Artistic interests also demonstrated some predictive utility. Realistic interests negatively predicted Values (β = -.11, p = .010), Knowledge (β = -.12, p = .038), Perseverance (β = -.15, p = .001), Discrete Adaptability (β = -.21, p < .001), and Situational Judgment (β = -.12, p = .002). Artistic interests only predicted Continuous Learning (β = .19, p < .001). Though in some cases the specific relationships between vocational interests and both biodata and the SJT did not appear exactly as expected, this set of results as a whole suggests a meaningful relationship between interests and the constructs measured here using noncognitive assessments.

Table 4. Summary of regressions of biodata latent factor scale scores on vocational interests
(Columns: Behavioral Leadership, Leadership Positions, Knowledge, Continuous Learning, Values)
Realistic:      -.08 (.04)   -.01 (.04)   -.12* (.06)   -.02 (.04)   -.11* (.04)
Investigative:  -.01 (.04)   -.03 (.03)    .13** (.05)   .10** (.03)   .05 (.04)
Artistic:        .05 (.03)    .04 (.03)   -.03 (.05)     .19** (.03)   .00 (.04)
Social:          .16** (.04)  .15** (.03)  .13** (.05)   .05 (.04)     .12** (.04)
Enterprising:    .21** (.04)  .12** (.04) -.04 (.05)    -.08* (.04)   -.07 (.04)
Conventional:   -.12** (.04) -.01 (.04)    .10* (.05)    .09* (.04)    .07 (.04)

Table 4 (cont'd)
(Columns: Perseverance, Discrete Adaptability, Routine Adaptability, Social Responsibility)
Realistic:      -.15** (.05) -.21** (.06) -.02 (.05)   -.03 (.04)
Investigative:   .08 (.04)    .04 (.06)    .00 (.04)    .05 (.03)
Artistic:       -.04 (.04)    .03 (.05)   -.06 (.04)   -.05 (.03)
Social:          .13** (.04)  .14** (.05)  .13** (.04)  .13* (.03)
Enterprising:    .13** (.05)  .12* (.06)   .11* (.04)   .06 (.04)
Conventional:   -.01 (.05)    .15* (.06)  -.11* (.04)  -.02 (.04)

Note. ** p < .01, * p < .05. Effects presented are standardized regression weights with standard errors in parentheses. Each set of regression weights corresponding to a particular biodata latent factor is from the final model for that biodata scale. Please see Appendix B for the full set of results for each scale.
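As a reminder of how the follow-up model comparisons reported below were judged, the sketch restates the Raftery (1995) and Burnham and Anderson (2004) rules of thumb from the Data Analysis section as small helpers; the function names are illustrative only and the gaps in the published AIC bands are labelled rather than filled in.

```python
def bic_evidence(delta: float) -> str:
    """Raftery (1995): interpret a BIC decrease (old minus new, positive favors new model)."""
    if delta < 2:
        return "weak"
    if delta <= 6:
        return "good"
    if delta <= 10:
        return "strong"
    return "very strong"

def aic_evidence(delta: float) -> str:
    """Burnham and Anderson (2004): interpret an AIC decrease.
    Only the bands stated in the text are coded; other values fall between guidelines."""
    if delta < 2:
        return "weak"
    if 4 <= delta <= 7:
        return "strong"
    if delta > 10:
        return "very strong"
    return "between stated guidelines"

# Example: a BIC drop of 5.7 from constrained to freely estimated model counts as good evidence.
print(bic_evidence(5.7), aic_evidence(4.6))
```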
H8 posited that the relationships between vocational interests and the biodata and SJT scales would help explain initially observed differences between demographic groups. This appeared to be the case for female respondents who initially demonstrated a higher standing on the latent Values factor (β = .23, p = .001). After inclusion of vocational interests, a higher standing on the Values factor among females was eliminated (β = .02, p = .779). Further, this model fit substantially better than a model where the relationship between the gender variable and the Values factor was fixed to the value observed in the first MIMIC model, which did not include SES variables or vocational interests (constrained model: AIC = 29293.07, BIC = 29851.18; unconstrained model: AIC = 29297.65, BIC = 29850.54). Of note, the change in AIC signified strong evidence. As mentioned previously, both Realistic and Social values predicted the latent Values factor, and both were associated with being female. A similar pattern was seen for the Discrete Adaptability factor, whereby females initially had a higher standing on this factor (β = .36, p < .001) but this higher standing was weakened (β = .21, p = .071) after accounting for vocational interests. Constraining this regression parameter to the value of the 47 first MIMIC model appeared to worsen model fit compared to when the regression was freely estimated (constrained model: AIC = 18485.55, BIC = 19054.09; unconstrained model: AIC = 18486.43, BIC = 19060.19), with the change in BIC signifying strong evidence. This reduction is likely due primarily to Social and Realistic interests because gender differences in these interest dimensions mirrored the gender differences in Discrete Adaptability (e.g. females tend to have higher Social Interests; Social Interests predict higher Discrete Adaptability; β = .14, p = .008). Further, though females ultimately demonstrated higher Perseverance (β = .27, p = .005), this effect was meaningfully larger (β = .39, p < .001) before the addition of vocational interests. However, freely estimating this regression did not clearly lead to better model fit than when the regression parameter was constrained (constrained model: AIC = 25538.47, BIC = 26320.87; unconstrained model: AIC = 25538.77, BIC = 26326.38). This suggests that the change in the regression weight of gender predicting perseverance is not significant. Situational Judgment demonstrated a similar set of findings whereby females had a higher standing on the latent trait (β = .33, p < .001) even after including vocational interests, but this effect was meaningfully larger in the initial model (β = .49, p < .001). However, freely estimating the parameter of the SJT factor regressed onto the gender dummy coded variable also did not clearly lead to better model fit than when it was constrained (constrained model: AIC = 73665.14, BIC = 74186.74; unconstrained model: AIC = 73663.56, BIC = 74190.37), suggesting that the observed change in regression weights between models is not significant. It should also be noted that after incorporation of SES and vocational interests, females still had a moderately lower standing on Routine Adaptability (β = -.43, p < .001) and a moderately higher standing on Social Responsibility (β = .41, p < .001). Overall, it does appear as though vocational interests help explain some of the true score gender differences across the biodata scales. 
48 Though the explanatory power of vocational interests was demonstrated for some of the latent score differences based on gender, the effects of minority group status were largely independent of vocational interests. Across all scales, there were no observed differences in latent scores across minority groups that were meaningfully decreased after including vocational interests in the model. This is likely due to the relatively small differences in vocational interests observed across the racial subgroups. However, significant differences in latent scores as a function of minority group status should be highlighted. In the final MIMIC models that included SES and vocational interests, Black participants still had a moderately lower standing on Leadership Behaviors (β = -.40, p = .002), Knowledge (β = -.46, p = .002), and Discrete Adaptability (β = -.53, p = .003). Asian participants had a moderately higher standing on Social Responsibility (β = .40, p < .001), a moderately lower standing on Knowledge (β = -.41, p = .007) and Leadership Behaviors (β = -.46, p = .002), and a significantly lower standing on Discrete Adaptability (β = -.74, p < .001). Given some evidence of vocational interests explaining differences based on gender but not across minority groups, H8 received moderate support. Lastly, H9 stated that observed differences in the item intercepts across demographic groups would be partially explained by vocational interests. Three items from the Discrete Adaptability scale (Table B4) demonstrated a similar pattern of intercept differences based on gender. Initially, the item “How often have you failed to meet responsibilities because you had taken on too much?” demonstrated a significantly lower item intercept for females (β = -.15, p = .018). This effect was diminished (β = -.06, p = .543) upon inclusion of vocational interests, with Artistic interests being the primary contributor (β = -.11, p = .001). It should be noted that though the overall change in effect size was small, proportionally the effect was roughly halved, which signifies that the intercept difference being explained by Artistic interests may be meaningful. 49 The item “How difficult has it been for you to continue with something after being interrupted and having to take care of something else?” also initially demonstrated a lower intercept for females (β = -.17, p = .013) that was attenuated (β = .00, p = .976) by vocational interests. The relevant interests here were Realistic (β = .15, p = .001), Enterprising (β = -.09, p = .041), and Conventional (β = -.09, p = .031) interests. Finally, the item “In the past, how difficult has it been for you to change your study habits to improve on a skill or to do better in a class” exhibited a lower intercept for females (β = -.13, p = .046) that was attenuated (β = -.09, p = .347) by the inclusion of Investigative (β = .08, p = .014) and Artistic (β = -.13, p < .001) interests. For this item a small observed change in effect size was also proportionally a meaningful one (roughly a reduction of 30%), which may be meaningful in terms of item functioning. Gender-based DIF in two Continuous Learning items also seemed to be explained by interests (see Table B4). Responses to “When learning new things, some people tend to feel stressed or tired, while others tend to feel inspired or refreshed. 
How do you tend to feel when you learn new things?”, demonstrated a significantly lower intercept for females (β = -.20, p < .001) that decreased in the final model (β = -.13, p = .11). This effect may have been due to Enterprising interests predicting responses to this item (β = -.06, p = .047). “How often do you ask a teacher or classmate questions that go beyond the material but are still relevant to the topic (either in or out of class)?” also eliminated statistically significant intercept differences across men and women (Initial model: β = -.12, p = .013; Final model: β = -.12, p = .141). However, none of the interest dimensions were related to responses for this item so the change in significance was likely due to the lower statistical power of the refined model because the actual effect size was unchanged. Lastly, “How important has it been in the past for you to be involved in community or volunteer work?” (Social Responsibility, Table B8) was initially found to have 50 a higher intercept for females (β = .20, p <.001), an effect that was eliminated in the final model (β = .08, p = .144) due to the effects of Realistic (β = -.06, p = .025) and Social interests (β = .08, p = .002). For the most part, the relationship between interests and an item response corresponded to the association between gender and interests. For example, an initially lower intercept for an item among females would ultimately be revealed to be a lower intercept among those with low Realistic interests, which was the case for most females. This pattern signals meaningful explanation of these effects. Several intercept differences across minority groups also appeared to be related to vocational interests, but the explanatory pattern was less clear than for gender. For the item “To what extent would your friends describe you as someone who goes after what you want?” (Perseverance, Table B5), an intercept difference was found for respondents categorized as Other (β = -.21, p = .03) but this effect was mitigated in the final model (β = -.13, p = .377) after including interests. This was most likely due to Artistic interests predicting the item response (β = -.09, p = .002) given the relatively high Artistic interests among those in the Other race category. All other observed intercept differences based on minority status that appeared to be partially explained using vocational interests were likely not practically meaningful. Some methodological issues should be brought up before these remaining effects are summarized. The methodological issues that likely yielded several unexpected effects were the changes in the variables that were included in each model and the sample size for each minority group. The model building approach used here involved systematically adding new variables to the model to test the various effects (i.e. SES in the second model and vocational interests in the third), which may have influenced estimates of the variance of the endogenous variables (Muthén, 1989). Though these changes likely influenced some of the effects across gender, the relatively small sample sizes of minority groups likely exacerbated the consequences of adding 51 variables to the models and decreased the statistical power of these models. As a result, the following effects are likely not practically meaningful but are still discussed. 
The aforementioned Perseverance item, “To what extent would your friends describe you as someone who goes after what you want?” (Perseverance, Table B5) yielded a significant intercept difference among Asian respondents (β = -.26, p = .031). This effect was weakened (β = -.23, p = .11) after the inclusion of interests, and may be due to Conventional interests predicting the response to this item (β = -.09, p = .015) given the association between being Asian and Conventional interests. However, the minor change in effect size also suggests that this change in significance may not be practically meaningful. This item was also predicted by Enterprising (β = .10, p = .004) and Investigative interests (β = .06, p = .041), but Asian respondents did not show a meaningful association with either interest domain. Black respondents were predicted to have a lower item intercept for “Over the past year, how many times were you given detention (or a similar punishment)?” (Table B9; β = -.30, S.E. = .11, p = .006) but after the inclusion of interests this effect became nonsignificant (β = -.41, S.E. = .24, p = .095). This change in significance is primarily due to a larger standard error, as the magnitude of the effect itself increased. Conventional interests predicted the item response to this item as well (β = .08, p = .015), but were not higher among Black respondents. The item “How often do you ask a teacher or classmate questions that go beyond the material but are still relevant to the topic (either in or out of class)?” (Table B4) demonstrated intercept differences across Black and White respondents but, again, this effect decreased (Initial model: β = .27, p = .001, Final model: β = .013, p = .926) after including vocational interests in the model. Interestingly, none of the vocational interest dimensions appeared to predict responses to this item and, therefore, could not have explained this reduction. The inclusion of interests also seemed to clarify an intercept difference in “To what extent would your friends describe you as someone who goes after what 52 you want?” (Table B5; β = .31, p = .030) and “How many times in the past year have you set the schedule (time and/or tasks) for groups in which you have worked?” (Behavioral Leadership. Table B1; β = .23, p = .032) for Black respondents, but these effects may be due to error given the nonsignificant initial effect. Overall, vocational interests appeared to explain intercept differences across men and women in a predictable way. The role of interests in explaining intercept differences across races was somewhat more tenuous and many of the results for these models were likely influenced by the methodological limitations of the models examined here. These issues will be further discussed in the limitations section below. These findings, alongside the fact that intercept differences were not explained for 23 other biodata items and one SJT item suggests that H9 received moderate support. 53 DISCUSSION The goal of the present study was to help extend the examination of how and why bias may occur in biodata and SJTs. By looking at multiple measurement methods, as well as multiple constructs in the case of biodata, this study sought a more general understanding of the explanatory mechanisms underlying measurement bias. At a broad level, this work serves to help address the criticism of bias research that demonstration of bias too often takes precedence over its explanation (Griel, 2005). 
The primary explanatory factors included in this study were indicators of SES and vocational interests. MIMIC modelling was the primary analytic approach used to incorporate explanatory variables into the assessment of DIF. This analytic approach also allowed DIF to be distinguished from true demographic score differences on the latent trait. A summary of the results for each hypothesis can be found in Table 5. Overall, SES did little to explain either differences in latent scores or the preponderance of DIF. Vocational interests, on the other hand, helped explain both differences in latent scores and DIF, but more so for gender than for race. Further, vocational interests were consistently related to the constructs measured by biodata scales. This suggests that vocational interests are important to consider for both the construction of biodata scales and their interpretation. Specifically, vocational interests explained variance in some item responses originally attributed to differences between males and females, suggesting that item content that is highly related to interests may be more likely to exhibit DIF. Additionally, given that some latent differences between males and females were also explained by interests, it is important to be mindful that differences between males and females may reflect differences in interests. Several score differences between groups were found to be meaningful at the latent level of analysis. MIMIC analyses indicated that females had a moderately higher standing on the latent factors of Perseverance, Values, Discrete Adaptability, Social Responsibility, and Situational Judgment. Vocational interests appeared to explain a meaningful amount of the gender differences for Values and Discrete Adaptability, but a significantly higher standing on Perseverance, Social Responsibility, and Situational Judgment remained after interests were included in the model. Despite a higher standing on Discrete Adaptability, females demonstrated a moderately lower standing on the Routine Adaptability latent factor. For racial minorities, Black respondents had a slightly lower standing on the Leadership Behaviors latent factor and Asian respondents had a much lower standing. Interestingly, despite the conceptual similarity between leadership positions and behaviors, the Positions factor was largely the same across demographic groups. The Knowledge and Discrete Adaptability scales demonstrated a pattern similar to Leadership Behaviors, with both Black and Asian test-takers scoring moderately to substantially lower than White test-takers. Social Responsibility analyses demonstrated a moderate effect for Asian respondents such that they had a higher standing on this trait. Across scales, several latent factor differences between groups were found and appeared independent of vocational interests and SES. At the item level, meaningful patterns of DIF emerged in only two instances. Examining the pattern of DIF is important to assess the extent to which items consistently favor one group over another. The Leadership Behaviors scale contained three items flagged for DIF, two of which demonstrated significantly higher intercepts for Asian participants in the first MIMIC model controlling for demographic variables (the third item was nonsignificant). It is possible that these two items may serve to obscure the lower standing of Asian participants on the latent factor by artificially increasing their observed scores.
The other instance was in the case of Discrete Adaptability, where four of five items flagged for DIF demonstrated significantly lower item intercepts for females. This pattern of DIF may undermine equitable use of Discrete Adaptability scale scores across gender.

Table 5. Summary of degree of support and relevant results for hypotheses posed in the present study

H1: The effects of minority status on the standings of the latent traits measured by biodata and SJTs will be partially explained by SES such that minority status will initially predict lower standings on the latent traits measured by biodata and SJTs and this effect will be weakened upon inclusion of measures of SES.
Degree of support: Not supported.
Summary of relevant results: Pell grant eligibility and median local income did not explain latent score differences.

H2: The effects of minority status on DIF in biodata items will be partially explained by SES such that minority status will initially predict higher item intercepts and this effect will be weakened upon inclusion of SES.
Degree of support: Not supported.
Summary of relevant results: Only one item demonstrated the expected effect of SES variables attenuating an inflated intercept difference.

H3: The effects of minority status on DIF in SJT items will be partially explained by SES such that minority status will initially predict smaller item factor loadings and this effect will be weakened upon inclusion of SES.
Degree of support: Not evaluated.
Summary of relevant results: An acceptable configural model for the SJT across race could not be estimated.

H4: High social interests should predict higher levels of the latent traits of social responsibility, academic values, and situational judgment.
Degree of support: Supported.
Summary of relevant results: Social interests predicted all expected noncognitive constructs.

H5: High investigative interests should predict higher levels of the latent traits of knowledge, continuous learning, and situational judgment.
Degree of support: Supported.
Summary of relevant results: Investigative interests predicted all expected noncognitive constructs.

H6: High enterprising interests should predict higher levels of the latent traits of leadership, adaptability, and perseverance.
Degree of support: Supported.
Summary of relevant results: Enterprising interests predicted all expected noncognitive constructs.

H7: High conventional interests should predict higher levels of the latent trait of knowledge.
Degree of support: Supported.
Summary of relevant results: Conventional interests predicted the latent knowledge construct.

H8: The effect of minority and gender status on the latent traits assessed by biodata and SJTs will be partially explained by differences in vocational interests such that minority and gender status will predict standing on the latent traits assessed by biodata and SJTs, and this effect will be weakened by the inclusion of vocational interests in the model.
Degree of support: Moderately supported.
Summary of relevant results: Interests explained a meaningful amount of the difference in the Values and Discrete Adaptability factors across gender, but not race.

H9: The effect of minority and gender status on DIF in biodata items will be partially explained by vocational interests such that group status will initially predict biodata item intercepts, and this effect will be weakened by the inclusion of vocational interests in the model.
Degree of support: Moderately supported.
Summary of relevant results: Interests explained uniform DIF effects across gender for three Discrete Adaptability and two Continuous Learning items. Uniform DIF effects across race did not appear to be meaningfully explained by interests.
Given that DIF was found for many of the items studied here yet few consistent patterns were observed, future research may seek larger samples with more power to detect DIF effects via MIMIC modelling, or use other approaches (e.g., mean and covariance structure analyses; Nye & Drasgow, 2011) to distinguish bias from true score differences. One of the stronger sets of findings in the current work was that vocational interests were related to the constructs measured using biodata and situational judgment assessments. This finding has several potential implications. First, one of the main arguments posed by Oswald et al. (2004) for the utility of biodata and situational judgment assessments was that these assessments predicted incremental variance in academic performance over personality and indicators of cognitive ability. Given that the results here demonstrate that biodata scales bear some relationship with interests, and that past work shows that interests predict academic performance (Nye et al., 2012), the extent to which the incremental validity observed by Oswald et al. (2004) reflects vocational interests should be examined. Second, if interests do direct behaviors and influence the development of the procedural knowledge that is assessed by biodata and SJTs, the results found here may imply a more nuanced origin for the constructs assessed. Past work suggests that biodata and situational judgment assessments measure constructs that are the product of past experiences (Mael, 1991; Lievens & Motowidlo, 2016). If vocational interests help determine the experiences an individual chooses to pursue, then it should follow that constructs that are the product of such experiences are also indirectly related to interests. It should be noted that though a causal account is provided to justify the link between vocational interests and the biodata and situational judgment scales here, the evidence provided is quite limited with respect to causality. Future work should examine the relationship between interests and the constructs assessed using biodata and SJTs with a longitudinal design to at least establish temporal precedence between these constructs. Should such evidence be provided, future work may be able to further examine the relationship between these constructs, as well as the long-term consequences of vocational interest change. Though an overall connection between interests and the latent constructs assessed by biodata and SJTs was supported, several unanticipated relationships were observed that should be discussed. Of note, Social and Conventional interests predicted several biodata constructs they were not expected to predict, and Realistic and Artistic interests were not anticipated to predict any noncognitive construct. For Social, Conventional, and Artistic interests, some unanticipated relationships make sense theoretically and could reasonably be expected in future studies. For example, Social interests predicting Leadership Positions and Behaviors is plausible given the shared content related to working with other people. A similar argument could be made for those with Conventional interests not engaging in Behavioral Leadership or Routine Adaptability. Further, those with Artistic interests may be quite engaged in pursuing new ideas, as is captured by Continuous Learning (Holland, 1997; Oswald et al., 2004). For Realistic interests, rather than specific relationships based on content, it may be the case that Realistic interests conflict with qualities that fit well in an academic context.
This possibility is supported by the broad negative relationships observed between Realistic interests and several of the noncognitive constructs assessed by biodata and SJTs. Other relationships, such as Social interests predicting Knowledge or Conventional interests predicting Discrete Adaptability, may need to be further evaluated. Given that these effects were not hypothesized and do not seem to align based on construct content, it is difficult to comment on whether such relationships should be expected in future studies. Although certain relationships were not hypothesized, the general connection between interests and the noncognitive constructs assessed here, based on construct content, appears to hold. Vocational interests also played a meaningful role in explaining uniform DIF effects. Five items exhibited DIF as a function of gender, and differences in vocational interests partially explained these effects. In these instances, an observed uniform DIF effect was attenuated by vocational interests in a way that corresponded to the relationship between gender and interests. For example, the incorporation of Social interests into the MIMIC model attenuated intercept differences favoring females because Social interests were positively related both to the item and to being female. Though interests helped explain uniform DIF effects, the observed pattern does not align with the frame of reference effect proposed by Robert et al. (2006). The frame of reference effect suggests that individuals respond to items that ask for a social comparison by perceiving their own unique comparison group. Thus, comparison group differences may explain item response differences as well. Specifically, it was thought that an individual may use their demographic group as a referent and that their responses to biodata and SJT items would be made in contrast to their referent group's standing on vocational interests. In other words, individuals may view themselves as particularly high on a trait when their demographic group's standing is low. An individual's demographic group was thought to serve as a comparison group given that individuals tend to be attracted to similar others (e.g., Holland, 1997; Schneider, 1987). However, the results indicate that individuals' item responses aligned with their demographic group's standing on vocational interests. This suggests that, in instances where DIF was explained, demographic group membership served as a proxy for the individual's vocational interests more so than as a description of that individual's perceived comparison group. Future research examining the frame of reference effect may benefit from two considerations. First, demographic variables may serve as a poor indicator of one's comparison group. Robert et al. (2006) discuss local comparison groups more in terms of culture, which may be a more salient indicator of a comparison group than demographic group membership. Use of psychological variables that describe an individual's comparison group may be more likely to reveal a frame of reference effect than demographics. Second, it may be the case that biodata assessments are constructed in a way that reduces reliance on comparison groups. Though some items rely on social comparison or require evaluating some abstract quantity, characteristics thought to produce the frame of reference effect (Robert et al., 2006), these qualities may not influence responding as much as they would in other assessment methods.
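Returning to how interests attenuated uniform DIF effects, a rough intuition for that pattern can be stated as a heuristic based on standard omitted-variable reasoning under a linear approximation (it is not a formula used in the analyses): the change in a group effect when an explanatory covariate is added is approximately the covariate's effect on the item multiplied by the mean group difference on that covariate,

  ω_group(initial) − ω_group(final) ≈ β_interest × (x̄_focal − x̄_referent).

In other words, an interest dimension can produce attenuation only when it both predicts the item response and differs, on average, between the groups being compared, which is the pattern observed for the gender-based uniform DIF effects explained here.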
Though a clear pattern of statistical results was found for DIF based on gender, item content is harder to link to vocational interests and DIF. In some instances this relationship is quite clear. For example, responses to the item “How important has it been in the past for you to be involved in community or volunteer work?” are clearly aligned with Social and Realistic vocational interests, as engaging in community or volunteer work likely involves working with others and may also involve working outdoors or on primarily manual tasks. However, most other items explained by vocational interests were less clearly linked based on content. For example, the item “How often have you failed to meet responsibilities because you had taken on too much?” was explained by Artistic interests, but the relationship between excessive responsibilities and being interested in creative pursuits is more ambiguous. Further, the small effect sizes for many of these effects, including this item in particular, invite the possibility that some of these findings were observed by chance and do not represent theoretically meaningful results. Future efforts may seek to home in on key experiences that differ across males and females, using vocational interests and differential accessibility (e.g., Imus et al., 2010) as a guide, to help provide more specific suggestions as to how to write biodata items in a way that reduces the risk of DIF. As was the case with some of the relationships between interests and the constructs assessed here by biodata and SJTs, interests would often explain uniform DIF in a way that was not immediately apparent. As with the example item above, a uniform DIF effect would be found for an item whose content did not seem to relate to the particular vocational interest that was statistically relevant. Some of these unexpected effects may be due to Type I error. Given that many of the standardized uniform DIF effects were small, it is possible that some were observed by chance. However, it may also be the case that such unexpected effects are due to one or more unaccounted-for mechanisms of DIF. Even though many uniform DIF effects appeared as though demographics were serving as a proxy for interests, other mechanisms of DIF may also be at play. Given that little evidence was found that the frame of reference effect produced uniform DIF effects (Robert et al., 2006), future research should consider other mechanisms by which demographic DIF may occur in biodata and SJT assessments. Contrary to expectation, the role of SES in accounting for differences across demographic groups was quite minimal at both the item and scale levels. The biodata and SJT measures studied here capture noncognitive attributes that were in part the product of past experiences (Oswald et al., 2004), and these experiences were thought to be shaped by the environment of the individual being assessed. Because SES may shape one's environment and because differences in race coincide with differences in SES (e.g., Cottrell et al., 2015), it was thought that SES might be reflected in the measures assessed here. Only one item demonstrated the expected pattern of effects where SES partially explained DIF. In addition, no latent mean differences across groups were explained by SES. However, this is not to say that SES was not relevant whatsoever. Items assessed for DIF via MIMIC modelling were selected based on exhibiting evidence of DIF across demographic groups.
Of the items selected, nine items were predicted by SES variables after accounting for the latent factor. It is possible that other items that did not demonstrate DIF across demographic groups may still demonstrate some form of bias related to SES. Further, both the Continuous Learning and Leadership scales were predicted by SES, though SES was negatively related to standing on Continuous Learning and the effect on Leadership was quite small. Though the effects of SES at the item and scale levels appeared minimal, the potentially broad and high-stakes use of noncognitive assessments may warrant further examination of the influence of SES. Finally, it is important to take stock of both the nature and scope of current explanations of measurement bias, as well as substantive group differences, in biodata assessments and SJTs. Imus et al. (2010) provide good evidence for the role of accessibility in explaining differences in item responses based on gender, but accessibility was less related to response differences based on race. Kim et al. (2004) used socioeconomic explanations to predict DIF in SJTs across racial groups, but their results indicated that other major sources of DIF existed. Cultural values seem to relate to construct differences in biodata and SJT scales across racial groups (Prasad et al., 2017) but do not explain why biodata item response differences arise as a function of race (Whitney & Schmitt, 1997). The present study adds to this body of research by again evaluating SES as well as introducing the role of vocational interests. At the item level, both vocational interests and perceptions of accessibility seem important to item response differences across males and females, but only limited evidence of the role of SES exists for race differences in survey responding.

Limitations

One limitation of the present research is the use of median local income as an indicator of SES. Median local income was included as a way to characterize the economic resources an individual may have experienced during high school. However, the MIMIC model, as used here, may not have been able to adequately incorporate the prediction of an individual-level outcome using a group-level variable (e.g., Kozlowski & Klein, 2000). Future research exploring the impacts of SES should use statistical methods that can model multilevel relationships, or assess individual-level perceptions of environmental variables, to examine these relationships. In spite of efforts to ensure otherwise, sample size limitations may have had a number of negative effects on the present study. First, adequate modelling of the SJT was hindered by the small sample sizes for some minority groups. Specifically, when estimating a configural model for the SJT, unique item factor loadings and intercepts as well as latent means and variances would constitute 52 unique estimated parameters per group. For the Asian and Black groups, this would have resulted in roughly two to three participants per estimated parameter. Though the fit indices used here should be relatively robust to samples of this size, having so few participants per estimated parameter may increase the error in the parameter estimates that are used for comparisons across groups. The sample sizes and the number of estimated parameters were also a concern for the MIMIC model and its ability to detect uniform and nonuniform DIF.
Though the subgroup sample sizes used here met or exceeded the recommendations of Woods and Grimm (2011), the models estimated were substantially more complex than those simulated in their research. First, Woods and Grimm (2011) examined conditions with only a single focal group and a single referent group, whereas the present research employed four focal groups. Further, the incorporation of eight explanatory variables may have also constituted a meaningful increase in model complexity. In terms of DIF detection, there were a few instances of discrepancies between the DIF items identified by mean and covariance structure (MACS) analyses and those identified by MIMIC modelling (Woods & Grimm, 2011). This happened more frequently for DIF based on race, where an item would be flagged for DIF using MACS analyses but no significant effect would be found within the MIMIC model. It may be the case that the free baseline approach is more strongly affected by the small sample sizes in some groups than a MIMIC model, but it may also be the case that the significant differences detected by MACS analyses reflected differences between two focal groups. Using the MIMIC model, DIF effects are detected when a particular focal group is significantly different from the referent, but not necessarily when focal groups differ from each other. Using the present analyses it is difficult to compare the relative likelihood of either explanation, but future research may wish to keep such considerations in mind. Finally, in some instances, the effects varied between models in unexpected ways. For example, an initially significant intercept difference for Black participants increased in size between the first and third MIMIC models but was ultimately nonsignificant. This coincided with a relatively large increase in the estimated standard error for that effect. Across all models, there were several other instances where a particular DIF effect would fluctuate in terms of statistical significance across models even though the actual size of the effect did not change substantially. This may be due to the prescription by Muthén (1989) that explanatory variables be exogenous, combined with the model building approach used here. Specifically, if the estimated variances of endogenous variables are conditioned on the exogenous variables in the model, then it is likely that the endogenous variable variances, as well as the standard errors of effects involving endogenous variables, would fluctuate based on the changing set of exogenous variables. Though this did not seem to meaningfully influence the core results of this study, future research on this topic will need to address this methodological limitation.

Practical Implications

The present research demonstrated that latent score differences may be large enough to raise concern about measurement invariance when using these assessments across demographic groups. The implication is that in spite of the many benefits biodata and SJT assessments hold (e.g., Ployhart, 2006; Mitchell et al., 2001), additional measurement equivalence research may be advisable before expanding the use of these measures across different demographic groups. Given some of the moderate to large latent differences between demographic groups, there may be meaningful differences in the experiences individuals from different groups have, beyond item-level idiosyncrasies.
These differences should be further understood to promote the fair and effective use of noncognitive assessments in selection procedures. Why these behaviors differ based on race, and whether other construct-relevant behaviors could be assessed, are important questions for further refining the use of biodata and situational judgment assessments. Beyond a deeper understanding of biodata and SJT assessment methods, a failure to take measurement invariance into account may negatively impact the accuracy of selection procedures. Nye and Drasgow (2011) argue that the presence of DIF may artificially influence how individuals are rank ordered based on a selection instrument. Inaccuracy in the rank ordering of candidates due to bias is problematic not only for accurately and fairly hiring individuals but also because it can increase the risk of adverse impact. A failure to account for bias as a function of group membership can cause individuals from a particular demographic group to be disproportionately selected over other groups. Further, Nye and Drasgow (2011) found that the risk of adverse impact due to bias increases as higher cut scores are used. Given the size of the latent score differences (particularly those that favor White males) and the number of items flagged for DIF, the results of this study indicate that bias does play a meaningful role in interpreting biodata and SJT score differences across groups. Combined with the findings of Nye and Drasgow (2011), this suggests that the risk of adverse impact may also increase as practitioners turn to noncognitive assessments to complement cognitive ability testing for highly competitive positions. It is also important to highlight the practical implications of possible compensatory DIF (e.g., Raju, Van der Linden, & Fleer, 1995). For both the Leadership Positions and Continuous Learning scales, DIF was identified even though scale-level demographic differences were modest. Even in instances where significant demographic differences in scale scores were observed, they were often quite modest despite the presence of DIF. It is possible that many of the DIF effects observed here were either small overall or did not systematically favor one group over another. Although items within a scale may show evidence of DIF, DIF in opposite directions can cancel out at the scale level and result in scale scores that are not biased across groups (Raju, Van der Linden, & Fleer, 1995). This presents a dilemma regarding the practical use of these scales. As in past studies (e.g., Schmitt et al., 2009), observed score differences between groups were relatively minor, suggesting low risk of adverse impact when these scales are used in a selection context. However, if these scales were used operationally and items that exhibited DIF were removed, the scale-level similarity between groups might change as well. Thus, the practical note is to evaluate whether item removal due to DIF truly improves the equity of scale scores. Beyond concerns about the comparability of scores, some practical guidance can be gained from this research. Following the call for explanatory mechanisms of DIF by Griel (2005), this research shows that biodata items that tap into experiences but also relate to vocational interest domains may be likely to function differently across males and females.
Though items flagged for DIF may still warrant removal whether or not the mechanism for DIF is understood, it may behoove test creators to consider the content of the experiences covered by items and whether those experiences may differ based on gender. Further, it does not appear as though socioeconomic differences dramatically influenced DIF. As a result, test makers may not be at much risk when incorporating content that may vary as a function of SES (e.g., relationships with teachers and other school-related activities). Though these suggestions are relatively intuitive, such guidance may be helpful given the flexibility of constructing biodata assessments and the onus placed on test makers when creating new assessments for different constructs and contexts. Beyond the implications for assessment, the present research also has implications for prospective college students. High Social, Investigative, and Enterprising interests seemed to be positively related to the constructs assessed in the biodata measure used here, whereas Realistic interests yielded mostly negative relationships. This suggests that individuals who have high Social, Investigative, and Enterprising interests, as well as low Realistic interests, may be more likely to engage in the experiences that yield the noncognitive qualities needed to do well in college (Oswald et al., 2004). Though this interest signature, so to speak, may help shape efforts to help prospective college students, what exactly those efforts should be is still a broad question. The stability of interests during the high school years (Low et al., 2005) suggests that individuals with certain interests may not naturally direct themselves towards experiences that may prepare them for college. This would imply that these individuals may benefit from external influence, such as being directed towards more volunteer opportunities or incorporating more class activities that require independently exploring a topic. However, with the aid of future research, changing a high school student's interests may be an option to consider for encouraging more self-directed engagement in experiences that promote academic success. Low et al. (2005) found that reported interest levels increased during the college years, and Morris (2016) found that interest differences between males and females were larger among younger participants than among older participants. These findings suggest that interest change can occur, possibly in a way that reduces demographic differences. Future work in vocational interest development may reveal strategies that lead to high school students being self-motivated to pursue experiences that may promote academic success.

Conclusion

Differences in the use of biodata and SJT assessments were evaluated both between males and females and across minority groups. The present research found that vocational interests were important to an individual's overall standing on the noncognitive attributes assessed here. Further, interests may also help explain why some latent score and item response differences exist across males and females. Further work is needed to identify additional mechanisms of DIF, as many items still exhibited DIF as a function of gender after accounting for interests. Additionally, vocational interests did not seem to play a substantial role in explaining item functioning or latent score differences across racial groups. SES was also evaluated as an explanation of DIF and latent score differences across groups, but for the most part such effects were not observed.
Overall, the present work constitutes a thorough examination of differential 68 functioning in noncognitive assessments and establishes a meaningful relationship between the noncognitive constructs assessed here and vocational interests. 69 APPENDICES 70 APPENDIX A: Configural model estimation and DIF analyses for studied scales 71 Table A1. Configural model estimation and DIF analyses for the Behavioral Leadership scale Item Content Item Responses Fit Statistic Gender Race 2 *During the past year, Never χ (df) 123.75 (16) 127.04 (32) how often have you Once or twice RMSEA .095 .090 taken charge of a Between three and five CFI .957 .962 group that you were times NNFI .919 .928 in, without being Between six and ten SRMR .037 .037 asked? times More than ten times (Reversed) How often Much more often than χ2 (df) 124.37 (18) 132.07 (38) in the past year have most people RMSEA .089 .082 you volunteered to be Somewhat more often CFI .957 .962 the spokesperson for a than most people NNFI .929 .940 group project you did About as often as most SRMR .037 .039 at school or work? people Somewhat less often than most people A good bit less often than most people In the past year, how many times have you been responsible for assigning tasks and setting deadlines for other people? How many times in the past year have you set the schedule (time and/or tasks) for groups in which you have worked? Never Once Twice Three or four times Five times or more χ2 (df) RMSEA CFI NNFI SRMR 123.76 (18) .089 .958 .929 .037 151.64 (38) .090 .954 .928 .045 Never Once Twice Three or four times Five times or more χ2 (df) RMSEA CFI NNFI SRMR 134.21 (18) .093 .954 .923 .041 141.05 (38) .085 .959 .935 .044 72 Table A1 (cont’d) (Reversed) When asked to do a class project with other students, how often do you take the lead and assign tasks or roles to people in the group? I am usually the one who assigns tasks or roles to get the work done More than half the time I end up assigning the tasks and roles About half the time I take the lead in assigning tasks and roles I rarely take the lead in assigning tasks and roles I never take the lead unless I have been assigned to do so χ2 (df) RMSEA CFI NNFI SRMR 130.08 (18) .092 .955 .925 .039 140.45 (38) .085 .959 .935 .045 You are quiet χ2 (df) 126.30 (18) 138.23 (38) You follow others RMSEA .090 .084 You are sometimes more CFI .957 .960 of a leader and NNFI .928 .936 sometimes more of a SRMR .041 .043 follower You lead others Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. When you are in a meeting for a project or activity, how do you tend to be? 73 Table A2. 
Configural model estimation and DIF analyses for the Leadership Positions scale Item Content Item Responses Fit Statistic Gender Race 2 *How many times in Never χ (df) 5.52 (4) 15.41 (8) the past year have you Once RMSEA .023 .050 tried to get someone to Twice CFI .999 .996 join an activity in Three or four times NNFI .997 .987 which you were Five times or more SRMR .009 .015 involved or leading The number of high school clubs and organized activities (such as band, sports, newspapers, etc.) in which you took a leadership role was: I did not take a leadership role 1 2 3 4 or more χ2 (df) RMSEA CFI NNFI SRMR 15.94 (6) .047 .994 .988 .021 22.42 (14) .040 .995 .992 .031 During the last two years, how many leadership positions were you offered (even if you didn't take them)? None One Two or three Four or five More than five χ2 (df) RMSEA CFI NNFI SRMR 6.42 (6) .010 1.000 1.000 .014 17.77 (14) .027 .998 .996 .018 In the past year, how often have you been selected by a group or club to serve as an official or representative? Never Once Twice Three or four times Five times or more χ2 (df) RMSEA CFI NNFI SRMR 11.05 (6) .034 .997 .994 .012 22.74 (14) .041 .995 .991 .029 Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. 74 Table A3. Configural model estimation and DIF analyses for the Knowledge scale Item Content Item Responses Fit Statistic Gender 2 *How often have you Very often χ (df) 179.63 (54) studied for tests by Often RMSEA .050 trying to memorize Sometimes CFI .919 just the basic facts and Rarely NNFI .892 not much more? Never SRMR .038 Race 234.53 (108) .056 .919 .892 .044 For classwork, how often do you tend to skim the material, reading only the important points? Almost all the time Most of the time Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 192.71 (56) .057 .912 .887 .039 240.49 (114) .055 .919 .898 .045 (Reversed) In general, what is the lowest grade that you find acceptable for yourself? A or equivalent B or equivalent C or equivalent D or equivalent F or equivalent χ2 (df) RMSEA CFI NNFI SRMR 184.54 (56) .056 .917 .894 .042 254.70 (114) .058 .910 .887 .050 (Reversed) How often do you spend extra time on school assignments, even after they are turned in, so that you can gain a better understanding of the material or principles? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 185.94 (56) .056 .917 .893 .039 252.04 (114) .057 .912 .889 .046 Generally, whenever you learn about a topic or how to perform a task, how often do you learn all the details as well as the general principles? Hardly ever Not very often Sometimes Often Almost always χ2 (df) RMSEA CFI NNFI SRMR 179.66 (56) .055 .921 .898 .038 261.75 (114) .059 .906 .881 .047 75 Table A3 (cont’d) (Reversed) When you took classes that you thought were easy, how important was it for you still to understand the concepts underlying the class material? Extremely important Very important Rather important Sort of important Not important χ2 (df) RMSEA CFI NNFI SRMR 182.00 (56) .055 .919 .896 .038 253.09 (114) .057 .911 .888 .046 (Reversed) In your last year of high school, on how many tests did you "settle" for a passing grade, rather than spend significant amounts of time learning material well? 
Never Once Twice Three or four times Five times or more χ2 (df) RMSEA CFI NNFI SRMR 182.12 (56) .055 .919 .896 .039 239.1 (114) .054 .920 .899 .044 A year after completing a class, how much can you typically remember about what you were taught? I tend to forget most of what was taught in class I remember the general ideas that were taught in class I remember some of the details that were taught in class I remember a lot of the details that were taught in class χ2 (df) RMSEA CFI NNFI SRMR 196.27 (56) .058 .910 .884 .039 238.36 (114) .054 .921 .900 .045 76 Table A3 (cont’d) How do you compare your standards for learning to those of your high school teachers? Much lower than my teachers' standards Lower than my teachers' standards About the same than my teachers' standards Higher than my teachers' standards Much higher than my teachers' standards χ2 (df) RMSEA CFI NNFI SRMR 182.11 (56) .055 .919 .896 .039 245.50 (114) .056 .916 .894 .046 Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. 77 Table A4. Configural model estimation and DIF analyses for the Continuous Learning scale Item Content Item Responses Fit Statistic Gender Race 2 *(Reversed) When it Very often χ (df) 333.72 (70) 436.06 (140) is not required to do Often RMSEA .071 .075 so, how often do you Sometimes CFI .923 .915 read materials (e.g. Rarely NNFI .901 .891 books, magazines, Never SRMR .040 .045 web sites) that pertain to subjects that you are learning about in class? (Reversed) When curious about a particular subject, how likely were you to go out and research the subject on your own? Extremely likely Very likely Rather likely Sort of likely Not likely χ2 (df) RMSEA CFI NNFI SRMR 334.15 (72) .070 .924 .904 .040 445.47 (146) .074 .914 .894 .050 In the past month, how many times have you looked for more information about something that you found interesting? Never Once or twice 3 to 5 times 6 to 10 times More than 10 times χ2 (df) RMSEA CFI NNFI SRMR 385.12 (72) .077 .909 .886 .047 429.21 (146) .074 .916 .897 .045 (Reversed) How often do you ask a teacher or classmate questions that go beyond the material but are still relevant to the topic (either in or out of class)? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 336.99 (72) .070 .923 .903 .041 456.42 (146) .076 .911 .891 .047 In the past 6 months, how many times have you been so absorbed when learning something that you didn't realize how much time passed? Almost Never Once Twice Three or four times Five times or more χ2 (df) RMSEA CFI NNFI SRMR 334.15 (72) .070 .924 .904 .040 440.22 (146) .074 .916 .896 .045 78 Table A4 (cont’d) In the past month, how many times did you go out and learn more about something simply because it seemed interesting? Never Once Twice Three or four times Five times or more χ2 (df) RMSEA CFI NNFI SRMR 361.32 (72) .074 .916 .895 .044 443.76 (146) .074 .915 .895 .046 (Reversed) When a textbook or instructor mentions another source of information about a topic, how likely are you to find it and learn more on your own? 
Extremely Likely Very Likely Somewhat Likely Not very likely Not at all likely χ2 (df) RMSEA CFI NNFI SRMR 335.65 (72) .070 .923 .904 .041 446.26 (146) .074 .914 .894 .046 (Reversed) How likely were you to take a class or find an instructor so that you could learn more about a hobby or skill? Much more likely than most people Somewhat more likely than most people About as likely as others Somewhat less likely than most people Much less likely than most people χ2 (df) RMSEA CFI NNFI SRMR 334.00 (72) .070 .924 .905 .041 446.79 (146) .074 .914 .894 .047 (Reversed) How often have you become involved in something just for the sake of learning? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 333.88 (72) .070 .924 .905 .041 438.06 (146) .073 .916 .897 .045 79 Table A4 (cont’d) When learning new things, some people tend to feel stressed or tired, while others tend to feel inspired or refreshed. How do you tend to feel when you learn new things? Very stressed/tired χ2 (df) 347.03 (72) 444.15 (146) Somewhat RMSEA .072 .074 stressed/tired CFI .920 .915 Something in NNFI .900 .895 between stressed/tired SRMR .041 .047 and inspired/refreshed Somewhat inspired/refreshed Very inspired/refreshed Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. 80 Table A5. Configural model estimation and DIF analyses for the Values scale Item Content Item Responses Fit Statistic Gender 2 *(Reversed) In high 0 χ (df) 186.27 (68) school, how many 1 RMSEA .048 times have you 2 or 3 CFI .931 cheated on a school 4 to 10 NNFI .908 project, assignment, or More than 10 SRMR .038 test? (Reversed) In the past, Much more likely χ2 (df) 186.68 (70) how likely were you to than most people RMSEA .047 return money that you Somewhat more CFI .932 received by accident likely than most NNFI .912 (e.g., received extra people SRMR .039 change after buying About as likely as something)? others Somewhat less likely than most people Much less likely than most people Race 305.23 (136) .058 .907 .877 .048 311.38 (142) .057 .907 .882 .050 During high school, how many times have you expressed disapproval or anger at a friend for behaving in a manner that you considered to be unethical or wrong? Never Once Twice Three or four times Five times or more χ2 (df) RMSEA CFI NNFI SRMR 194.72 (70) .049 .927 .906 .040 316.03 (142) .057 .904 .879 .050 (Reversed) In the past year, how many times have you copied someone else’s work and submitted it as your own (at school or at work)? Never Once Twice Three or four times More than five times χ2 (df) RMSEA CFI NNFI SRMR 190.24 (70) .048 .930 .909 .039 309.42 (142) .056 .908 .883 .050 (Reversed) When you have found someone else's belongings, how often have you attempted to return them? Always Most of the time Half of the time Less than half of the time I have never found someone's belongings χ2 (df) RMSEA CFI NNFI SRMR 187.50 (70) .048 .931 .911 .039 313.22 (142) .057 .906 .881 .051 81 Table A5 (cont’d) (Reversed) Over the past year, how many times were you given detention (or a similar punishment)? 
In your first three years of high school, how often did you skip classes without a legitimate reason? Never Once Twice Three or four times Five times or more Most of the time A lot Sometimes Once or twice Never χ2 (df) RMSEA CFI NNFI SRMR χ2 (df) RMSEA CFI NNFI SRMR 201.54 (70) .050 .923 .901 .044 196.47 (70) .049 .926 .905 .043 316.72 (142) .058 .904 .878 .052 316.93 (142) .058 .904 .878 .053 If a fellow student offered you a copy of upcoming exam questions that he had retrieved from the teacher’s recycling bin, how likely would you be to accept a copy? Extremely likely Quite likely Somewhat unlikely Not at all likely χ2 (df) RMSEA CFI NNFI SRMR 187.45 (70) .048 .931 .912 .039 315.29 (142) .057 .905 .879 .052 If you were struggling with a school assignment, and a fellow student with more expertise offered to finish it for you, how likely is it that you would accept the offer? Extremely likely Quite likely Somewhat likely Not at all likely χ2 (df) RMSEA CFI NNFI SRMR 190.48 (70) .048 .929 .909 .039 314.84 (142) .057 .905 .879 .052 How many times have you been accused of acting unethically? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 208.60 (70) .052 .919 .896 .044 323.11 (142) .059 .900 .874 .058 Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. 82 Table A6. Configural model estimation and DIF analyses for the Social Responsibility scale Item Content Item Responses Fit Statistic Gender Race 2 *How many hours of 0 χ (df) 248.78 (52) 314.6 (104) volunteer work did Between 1 and 10 RMSEA .071 .074 you do while in high Between 11 and 30 CFI .944 .943 school? Between 31 and 75 NNFI .922 .922 More than 75 SRMR .043 .049 How many times in the past year have you volunteered in social service or charity organizations? Never Once Twice Three Four times or more χ2 (df) RMSEA CFI NNFI SRMR 251.4 (54) .070 .943 .925 .044 317.98 (110) .071 .944 .927 .049 During the past two years, how many times did you work with notfor-profit groups? 0 1 2 3 or 4 5 or more χ2 (df) RMSEA CFI NNFI SRMR 250.91 (54) .070 .944 .925 .044 323.08 (110) .072 .943 .925 .050 During the last year, how many times have you given money, food, or clothes to a charity or a poor person in need? 0 1 2 3 More than 3 χ2 (df) RMSEA CFI NNFI SRMR 251.22 (54) .070 .943 .925 .044 333.06 (110) .074 .940 .922 .055 In the past year, how many hours were you engaged in community service or volunteer activities? None Less than 10 hours 11 - 40 hours 41 - 80 hours More than 80 hours χ2 (df) RMSEA CFI NNFI SRMR 251.8 (54) .070 .943 .924 .044 336.09 (110) .074 .939 .921 .051 (Reversed) How important has it been in the past for you to be involved in community or volunteer work? Extremely important Very important Important Not very important Not at all important χ2 (df) RMSEA CFI NNFI SRMR 267.71 (54) .073 .939 .918 .044 326.44 (110) .073 .942 .924 .053 83 Table A6 (cont’d) (Reversed) In the past, how likely were you to stop to help a stranger in need (e.g., giving directions to a lost person)? In the past year, in how many fundraisers have you participated? 
During the past year, how often have you recycled? Extremely Likely Very Likely Somewhat Likely Not very likely Not at all likely χ2 (df) RMSEA CFI NNFI SRMR 248.84 (54) .070 .944 .926 .043 324.78 (110) .072 .942 .925 .051 None 1 2 3 4 or more χ2 (df) RMSEA CFI NNFI SRMR 254.89 (54) .071 .942 .923 .045 334.26 (110) .074 .940 .921 .055 Never Not very often Sometimes Often Always χ2 (df) RMSEA CFI NNFI SRMR 250.9 (54) .070 .944 .925 .044 376.5 (110) .081 .928 .906 .054 Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. 84 Table A7. Configural model estimation and DIF analyses for the Perseverance scale Item Content Item Responses Fit Statistic Gender 2 (Reversed) How Extremely important χ (df) 270.65 (54) important is it to you Very important RMSEA .073 to succeed in whatever Important CFI .889 task you are engaged Not very important NNFI .852 in? Not at all important SRMR .044 Race 323.8 (108) .073 .891 .855 .048 To what extent would your friends describe you as someone who goes after what you want? Not at all A slight extent A moderate extent A large extent A great extent χ2 (df) RMSEA CFI NNFI SRMR 274.07 (56) .072 .888 .856 .045 335.13 (114) .072 .888 .859 .055 How frequently do you fail to get what you want because you did not put in enough effort? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 277.88 (56) .073 .886 .854 .049 373.04 (114) .078 .869 .835 .053 (Reversed) To what extent has it been important to you to do your very best whenever you take on a project? Extremely important Very important Important Not very important Not at all important χ2 (df) RMSEA CFI NNFI SRMR 276.89 (56) .073 .887 .854 .046 329.62 (114) .071 .891 .862 .050 (Reversed) How often have you accomplished something you initially thought was very difficult or almost impossible? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 277.95 (56) .073 .886 .854 .045 331.98 (114) .072 .890 .861 .050 (Reversed) How often have you finished a project when faced with difficult circumstances? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 272.37 (56) .072 .889 .857 .044 336.77 (114) .073 .888 .858 .052 85 Table A7 (cont’d) (Reversed) How often do others tend to compliment you on your determination to continue with a project under difficult circumstances? How often do you tend to give up on a task after being told that you were not doing well? When encountering problems that take a long time to solve, how impatient do you tend to become? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 283.94 (56) .074 .883 .850 .045 339.48 (114) .073 .886 .856 .054 Almost all the time Most of the time Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 286.15 (56) .074 .882 .848 .054 333.77 (114) .072 .889 .860 .051 Extremely impatient Very impatient Somewhat impatient Slightly impatient Not at all impatient χ2 (df) RMSEA CFI NNFI SRMR 289.19 (56) .075 .881 .846 .050 325.02 (114) .071 .893 .865 .049 Note. * denotes item identified as referent item. 
Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. 86 Table A8. Configural model estimation and DIF analyses for the Discrete Adaptability scale Item Content Item Responses Fit Statistic Gender Race 2 (Reversed) How Much more effective χ (df) 89.99 (18) 115.44 (36) effective would others than most people RMSEA .073 .077 say you are at Somewhat more CFI .924 .914 handling multiple effective than most NNFI .873 .856 projects people SRMR .037 .042 simultaneously? About as effective as most people Somewhat less effective than most people Much less effective than most people How often have you failed to meet responsibilities because you had taken on too much? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 96.36 (20) .072 .919 .879 .041 121.88 (42) .072 .913 .876 .047 (Reversed) How difficult has it been for you to continue with something after being interrupted and having to take care of something else? Very easy Easy Not easy but not difficult Difficult Very difficult χ2 (df) RMSEA CFI NNFI SRMR 96.64 (20) .072 .919 .878 .037 119.51 (42) .070 .916 .880 .046 (Reversed) How often do you plan ahead and make a specific schedule of things you need or want to do? Very often Often Sometimes Rarely Never χ2 (df) RMSEA CFI NNFI SRMR 130.46 (20) .086 .883 .824 .052 118.37 (42) .070 .917 .882 .043 In the past, how difficult has it been for you to change your study habits to improve on a skill or to do better in a class Very difficult Difficult Not easy but not difficult Easy Very easy χ2 (df) RMSEA CFI NNFI SRMR 94.27 (20) .071 .921 .882 .038 124.38 (42) .073 .911 .872 .048 87 Table A8 (cont’d) When you are working on a serious and relatively difficult task and something or someone interrupts you, how do you usually react? With a great deal of annoyance - it is hard to get back to the original task You are irritated - it's hard to stay on task when you are interrupted It bothers you just a little - you'd really prefer not to be interrupted It doesn't bother you you feel one of the challenges of any job is the ability to “juggle" several things at a time χ2 (df) RMSEA CFI NNFI SRMR 104.31 (20) .075 .911 .866 .040 119.84 (42) .071 .916 .879 .043 Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. 88 Table A9. Configural model estimation and DIF analyses for the Routine Adaptability scale Item Content Item Responses Fit Statistic Gender 2 *Compared with A very long time χ (df) 4.083 (4) others, how long does A long time RMSEA .005 it take you to feel Neither a short nor a long CFI 1.000 comfortable in new time NNFI 1.000 situations or places? 
A short time SRMR .009 A very short time In the past, how difficult have you found it to adjust to major changes in your life (e.g., moving, a new school, a new job)? Extremely difficult Very difficult Difficult Not very difficult Not at all difficult χ2 (df) RMSEA CFI NNFI SRMR 9.47 (6) .028 .996 .992 .016 How difficult has it been for you to deal with situations that forced you to make adjustments in your daily life (e.g., a broken leg, illness, or family crisis)? Very difficult Difficult Not easy but not difficult Easy Very Easy χ2 (df) RMSEA CFI NNFI SRMR 8.01 (6) .021 .998 .996 .015 To what extent have you been bothered by sudden changes in your schedule? To a great extent To a large extent To a moderate extent To a slight extent Not at all χ2 (df) RMSEA CFI NNFI SRMR 4.19 (6) .000 1.000 1.000 .010 Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. 89 Table A10. Configural model estimation and DIF analyses for the situational judgment scale Item Content Item Responses *You have been standing Politely inform the person that there is a line and hopefully he/she in line for the restroom for will move to the back. (Best) some time after a campus Say aloud to someone near you how rude it is that people cut in event, and someone cuts line. into the line ahead of you. Give them dirty looks, and try to squeeze them out of line. What would you do? Scold the person for not respecting other people. (Worst) Be annoyed but not do anything. It’s just one more person. Calmly cut back in front of them. (Best) You are part of a threeperson group working on a class project with a quickly approaching deadline. One member of the team is not pulling his weight. He avoids assignments, complains about the amount of work that has to be done, and says the project doesn’t really matter anyway. While you are all classmates, you seem to be the group leader. What would you do? Divide the workload evenly among members of the group, making sure everyone knows they are responsible for their share. If the group member still does not pull his own weight, bring it up with the instructor. (Best) Speak with him in private and offer him moral encouragement to complete his portion of the project. If the group member still does not pull his own weight, bring it up with the instructor. Try to get the team member motivated to do his work. If that doesn’t help the situation, just put more effort into the project yourself in order to complete it. Just do the group member’s portion of the assignment in addition to your own, and tell the instructor about the situation. (Worst) See if the person could be removed from your group. Consult with the non-problematic group member about the most appropriate course of action, and then act on whatever you jointly decide. 90 Fit Statistic χ (df) RMSEA CFI NNFI SRMR Gender 691.43 (500) .023 .881 .861 .035 χ2 (df) RMSEA CFI NNFI SRMR 693.63 (502) .023 .881 .869 .035 2 Table A10 (cont’d) A fellow student allows you to listen to threatening phone messages that have been placed on the person’s voicemail by another student. 
The student does not want you to tell anyone, but thinks the caller may be capable of causing physical harm. What would you do? As a leader of a student organization, you asked a committee member to track the use of important and costly supplies. In response, she developed forms requiring the organization’s committee members to indicate when and how they used various supplies. The coordinating individual now complains that no committee members are complying with her request for information on the use of supplies. How would you handle this situation? Try to talk them into calling the police and warn them not to walk around alone. (Best) Talk to the resident assistant about it. Contact the police yourself if you think there is any real threat of physical harm. Find out who is making the calls, if it is another student, confront them – singly or jointly. (Worst) Unless the friend knows something that they’re not saying, there is no reason not to call the police – so call them if your friend won’t. Have the friend change their phone number. χ2 (df) RMSEA CFI NNFI SRMR 703.34 (502) .023 .875 .862 .036 Explain the importance of tracking to the committee, and request that everyone comply with the request. (Best) Ask everyone to respect the coordinating individual’s hard work and effort by cooperating. Limit access to the supplies until people start filling out the forms, or have penalties for not complying. Designate someone else to be in charge of tracking and enforcing the information requests. (Worst) Ask the committee if there is a misunderstanding about the forms and for suggestions on improving them. χ2 (df) RMSEA CFI NNFI SRMR 693.91 (502) .023 .881 .869 .035 91 Table A10 (cont’d) Your roommate, usually a tidy person, has recently experienced some personal difficulties. As a result, he/she has become quite distracted and has left much of the household responsibilities to you. You have talked to him/her about your concerns, and empathetically requested that he/she resume his/her share of the responsibilities as soon as possible. A month passes and you are still doing too much of his/her work. What would you do? Find out more about his/her problem and try to deal with that first. (Best) Stop doing all of the household responsibilities to show him/her what it’s like. Talk with him/her again and explain that you are suffering as a result of his behavior. (Best) Tell him/her that if he/she doesn’t help, you will move out. (Worst) Do your share of the work, and put anything of his/hers that affects you in his/her area of the room. 92 χ2 (df) RMSEA CFI NNFI SRMR 694.88 (502) .023 .880 .868 .035 Table A10 (cont’d) After you arrive on campus, you begin to socialize with a group of students who drink regularly even though all are underage. By the end of the term, you realize that you are drinking several drinks at least three nights a week, but you don’t know how to withdraw from the group in which this is normal routine behavior. What action would you take? You have been having trouble with a class in which everyone else seems to be doing well. Your homework comes back with unsatisfactory grades week after week, and your test scores have been marginally passing. How would you proceed? Ask a close friend to help watch out for your best interests, and pursue other activities with other people. As long as you keep your grades up it is not a problem. (Worst) Explain to the group that you are concerned about falling behind if you continue the behavior and concentrate more on your studies instead. 
Join alternative groups such as campus clubs and sports, or maybe even take an evening or early morning job. (Best) Just socialize with the group less frequently. Continue socializing with the group, but don’t always drink when they do. χ2 (df) RMSEA CFI NNFI SRMR 691.91 (502) .023 .882 .870 .035 Find a study group to work with you. Talk to the professor, and to friends in the class, and read more. Get tutoring, and study more frequently for this class. Seek help from someone in the class who is doing well. Talk to the professor or TA to find out what you are doing wrong, compare notes with others and seek out tutoring. (Best) Stay calm and continue to do the best you can. (Worst) χ2 (df) RMSEA CFI NNFI SRMR 691.85 (502) .023 .882 .870 .035 93 Table A10 (cont’d) There is a seminar being held on campus that would expand your understanding of a class topic, but the seminar time conflicts with the class schedule. What would you do? You are the student coordinator for the gym, and it’s 4:30 P.M. You have just been informed that there is no heat in the gym. As it is the middle of winter and very cold, you know this will be a problem. There is a student dance being held in the gym at 7:00 P.M., and there are no alternative facilities in which to hold the number of people expected at this event. What would you do? Skip the class, and go to the seminar because it is related to the class. (Worst) Go to class because it might cover what the seminar would cover. Go to class and talk to someone that went to the seminar. Get advice from the professor and then decide what to do. (Best) χ2 (df) RMSEA CFI NNFI SRMR 695.58 (502) .023 .880 .868 .035 Let everyone know that it’s postponed or called off. Call maintenance, and see if they can fix it. (Best) Look for small heaters to fill the room. Call people and check the consensus opinion about what to do. Find a group of rooms as an alternative location. Inform the students to dress warmly. (Worst) χ2 (df) RMSEA CFI NNFI SRMR 694.09 (502) .023 .881 .869 .035 94 Table A10 (cont’d) You and five other students must have a report ready within 48 hours. The last time the six of you worked together, you became the leader. You know that one of the group members did no work whatsoever on the last occasion, yet she is in your group again. This time it is necessary that all members pull their own weight. What would you do? You are collaborating with other classmates on a project. The group of you keeps running into a variety of problems that threaten to cause the project to be late. The other group members want to just plan to submit it late. Another option would be to devote much more time than planned to the project and possibly get it in on time. What would you do? Let her know that you are aware that she did not do any work last time, and that this time it is necessary that she fully contribute. Do all of your end of the work and ensure that the instructor is aware that you did your share, regardless of what the other members do. Explain to the group that the professor will be made aware of who contributed what to the project, and ensure that this happens. (Best) Stress the importance that everyone fully contributes his or her share to the project. Work as closely with her as possible (e.g. assign both of you a related task) so as to offer encouragement and ensure that her work gets done. Assign her a specific task with a specific timeframe. If she does not do the work, ask to have her re-assigned, and have the group pick up her work. 
χ2 (df) RMSEA CFI NNFI SRMR 694.11 (502) .023 .881 .869 .035 Try to get it done, but plan to submit it late. (Worst) Ask the instructor for help or for an extension. If that doesn’t work, just try your best and do what you can or turn it in late. Motivate the group to devote more time and work together to get it done. (Best) Have the group decide what to do. (Worst) Work hard to finish it because there are consequences for being late and meeting deadlines is important to you. (Best) Tell the instructor your situation, and ask for advice. χ2 (df) RMSEA CFI NNFI SRMR 691.65 (502) .023 .882 .870 .035 95 Table A10 (cont’d.) You know that a group of students in your class cheats on exams by putting formulas into scientific calculators, cell phones, or some electronic device. The professor has clearly warned against such activity, but you are not sure what she would do if she knew what these students were doing. What action would you take? Because of family problems, you realize that your parents can no longer support you financially at the same level as they have and you do not have enough money to continue in school. What plans would you make? Try doing the same thing until people start getting caught. (Worst) Study the way you know best, don’t cheat, but don’t turn in the other students either. (Best) You would do nothing; it’s none of your business. You would mention it to the professor so she can deal with the problems in the class. Don’t tell the professor, but make sure it is clear you are not involved in case they get caught. Send the professor an anonymous message about what is going on. (Best) χ2 (df) RMSEA CFI NNFI SRMR 692.42 (502) .023 .882 .870 .035 Apply for student financial aid or get a part-time job. (Best) Ask other family members for money to finish school. Drop out of school and save money for going back. (Best) Take fewer classes because of the lower level of finances. χ2 (df) RMSEA CFI NNFI SRMR 694.40 (502) .023 .880 .869 .035 96 Table A10 (cont’d) An event in the news makes you wonder about the history behind the news incident. What would you do? Do some research, looking up all the facts for yourself. Do a quick Internet search to see if you could find any information. (Best) Think about it briefly, then move on. (Worst) Ask others what they know about the topic. (Best) Resolve to read the newspaper more often. χ2 (df) RMSEA CFI NNFI SRMR 692.43 (502) .023 .882 .870 .035 You are finding a particular class dull and boring, and are having difficulty staying awake. What would you do? Do what you can to stay awake, such as drinking caffeine or sitting toward the front of the class. (Best) Read the class material beforehand to make the lecture more interesting. During the lecture, do some studying that is required for the course. Make sure you are getting enough sleep every school night. (Best) Skip the class if it is that dull and boring to you. (Worst) χ2 (df) RMSEA CFI NNFI SRMR 698.38 (502) .023 .878 .866 .035 Your grade for a particular class is based on three exams, with no class attendance requirement. All of the homework requirements for the class are posted on the professor’s web site. What would you do? Attend class for as long as you feel that it is helping your grades. Do all the homework but only go to some of the lectures. It’s the exams that count. Go to all the classes anyway. The professor may say something important. (Best) Skip classes, but if you did poorly on the first exam, start going to classes. There is no need to go to classes. 
Just get the homework done, and pass the exams. (Worst) χ2 (df) RMSEA CFI NNFI SRMR 692.18 (502) .023 .882 .870 .035 97 Table A10 (cont’d) You share a dorm room with three other students. One half-hour before you are expecting a guest, you get home to find the place completely trashed. There is no sign of any of your roommates. What would you do? One of your friends’ roommates frequently parties until late at night, often returning to the room after drinking, engaging in loud and obnoxious behavior. Your friend finds that she cannot study or sleep well, but also feels reluctant or afraid to talk with the dorm authorities. What action would you take? Clean up the mess as much as possible before the guest arrives. Then speak with your roommates immediately upon their return, so your guest knows how concerned you were about the mess. Leave the mess and explain the situation to your guest. (Worst) Leave the mess and take the guest somewhere else. Clean up the mess as much as possible before the guest arrives. Then, without the guest around, ask the roommates why the place was trashed so badly and what can be done in the future to avoid this situation. (Best) χ2 (df) RMSEA CFI NNFI SRMR 692.46 (502) .023 .882 .870 .035 Approach the dorm authorities on behalf of your friend. Talk to the roommate yourself, and explain that her behavior bothers your friend. (Worst) Tell your friend to talk with her roommate and let her know that the behavior is not acceptable. Offer to let your friend stay with you when necessary. Suggest to your friend that she talk it out with the roommate, and offer to be available as a neutral third party when the two have the conversation. (Best) χ2 (df) RMSEA CFI NNFI SRMR 693.66 (502) .023 .881 .869 .035 98 Table A10 (cont’d) You are searching for a major that interests you and think you might be interested in psychology. You do not know much about preparation to be a psychologist or what kinds of opportunities exist for careers in this area. What action would you take? You are interested in several different classes/disciplines, but don’t know anything about future educational or career opportunities in these areas. What steps would you take to get informed? Talk to an advisor in psychology to see what career options are available. (Best) Talk with a friend who is a psychology major to see what it is about. Take an introductory psychology course to see what areas in psychology there are. Look up job listings for psychologists on the Internet. (Worst) χ2 (df) RMSEA CFI NNFI SRMR 691.93 (502) .023 .882 .870 .035 Go to an advisor or knowledgeable professional who might tell you more and answer your questions. (Best) Research topics using available resources like relevant books and Internet web sites. Attempt to obtain some hands-on experience, like internships. (Best) Use the school career services and career counselors. Take some introductory classes in the area of interest to see if you want to pursue that area further. Think about your interests and try to figure out which of them fit with the different disciplines. Ask friends and family for advice and information. If possible ask a friend who is familiar with the area. χ2 (df) RMSEA CFI NNFI SRMR 696.83 (502) .023 .879 .867 .035 99 Table A10 (cont’d) In a class of 50 students, you discover that a group of your friends have worked out a scheme to share answers on an exam. The professor has vision problems and will likely never notice. You are not doing very well in the course. What would you do in these circumstances? 
Your professor has just given you a project that will obviously require the whole semester to complete. She gave you all the details you need to get started, but you are not sure how the project should proceed from there. She does not appear to intend to give you any more information in class. What would you do? Avoid being around these friends. It is not exactly honest but under the circumstances, the scheme is OK. You would join them. Do your own work and not tell the professor about the scheme because it is not your problem. (Best) Cheat and get a good grade. (Worst) Tell the professor about the scheme. Study for the exam, but join the scheme as a backup strategy for the test. χ2 (df) RMSEA CFI NNFI SRMR 692.85 (502) .023 .881 .870 .035 Work out the project to the best of your ability and approach the professor if you get stuck. (Worst) Generate some ideas, and then go to office hours to see how the professor responds to them. Ask the professor about the project after class. Visit the professor or a teaching assistant during office hours to discuss the project. (Best) Talk to other students to get an idea of what they are doing. Try to get an idea of whether or not other students seem confused. If so, bring the issue up with the professor during class. (Worst) χ2 (df) RMSEA CFI NNFI SRMR 692.97 (502) .023 .881 .869 .035 100 Table A10 (cont’d) You are part of a committee to reduce cross-cultural tension in your dorm. A group of students in your dorm complain to you that people always wish them “Merry Christmas” or “Happy Easter” when these holidays are not meaningful to them. They request that their differences be respected. How would you address this problem? A friend on your floor is always organizing “social” activities including trips to local bars. Aside from the fact that this person is underage and failing some classes, you realize that the individual is drinking half a dozen or more drinks at least three or four times a week. No one else seems to know or be concerned about the person. What would you do? Ask the group to politely ignore the greetings with the realization that the people had good intentions. (Worst) Tell the well-wishers to please respectfully refrain from making specific holiday greetings. (Worst) Have a meeting at which people can discuss their differences and hopefully work out an understanding. (Best) As part of the committee, make all cultural holidays visible so that people can be aware of diversity. (Best) Tell them to respond with a meaningful greeting of their own. χ2 (df) RMSEA CFI NNFI SRMR 692.58 (502) .023 .882 .870 .035 Talk to him/her about easing up on the alcohol, explaining that it will not help with his/her classes, which should be the main reason why he/she is in college. Use humor to broach the topic and offer alternatives to his/her usual “social” activities. Bring up the situation with the floor’s resident assistant. Try to get him/her involved in other activities. (Best) Talk to the person to subtly determine if there are other issues that need to be addressed, and refer him/her to help if appropriate. (Best) Talk to other people on the floor, and discuss ways to address the situation. Ask him/her once about this behavior and see where the discussion leads, then leave him/her to his/her own course of action. (Worst) χ2 (df) RMSEA CFI NNFI SRMR 692.88 (502) .023 .881 .870 .035 101 Table A10 (cont’d) Note. * denotes item identified as referent item. Fit statistics displayed by referent item are for the estimation of the configural model. 
Gender and Race denote multiple groups models comparing gender and race demographic groups, respectively. Fit statistics presented in bold denote indication of DIF as determined by a CFI decrease of > .002. (Reversed) denotes that lower item responses relate to a higher standing on the target scale, and that item responses were reversed prior to analyses. (Best) and (Worst) denote the responses rated as best and worst by subject matter experts. For further scoring information see Oswald et al. (2004). 102 APPENDIX B: MIMIC model analyses for studied scales 103 Table B1. MIMIC model of the Behavioral Leadership scale Model 1 Outcome Behavioral Leadership Factor Item Responses Predictor Gender Black Asian Other Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional β (S.E.) p .05 (.06) .400 -.32 (.11) .003 -.53 (.13) <.001 -.30 (.10) .004 104 Model 2 β (S.E.) .06 (.07) -.36 (.12) -.53 (.14) -.35 (.11) .08 (.07) -.02 (.04) p .345 .003 <.001 .001 .296 .598 Model 3 β (S.E.) -.06 (.08) -.40 (.13) -.46 (.13) -.37 (.10) .04 (.07) -.04 (.04) -.08 (.04) -.01 (.04) .05 (.03) .16 (.04) .21 (.04) -.12 (.04) p .409 .002 <.001 <.001 .560 .286 .052 .836 .162 <.001 <.001 .002 Table B1 (cont’d) Effect on Items How many times in the past year have you set the schedule (time and/or tasks) for groups in which you have worked? Never Once Twice Three or four times Five times or more Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .04 (.05) .390 -.08 (.08) .350 .45 (.07) <.001 .11 (.08) .193 -.01 (.02) .795 -.04 (.03) .154 .01 (.02) .667 .03 (.02) .216 105 .04 (.05) -.04 (.09) .44 (.08) .11 (.09) -.01 (.03) -.03 (.03) .01 (.02) .04 (.02) .00 (.06) .06 (.03) .391 .638 <.001 .213 .569 .218 .614 .062 .997 .031 .03 (.07) .06 (.11) .40 (.10) .03 (.10) -.01 (.03) -.03 (.03) .01 (.02) .04 (.02) .00 (.06) .06 (.03) -.01 (.03) .00 (.03) .00 (.03) -.01 (.03) -.06 (.03) .10 (.03) .650 .555 <.001 .791 .829 .253 .742 .078 .944 .024 .767 .879 .885 .758 .042 .001 Table B1 (cont’d) In the past year, how many times have you been responsible for assigning tasks and setting deadlines for other people? Never Once Twice Three or four times Five times or more Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .16 (.05) <.001 .05 (.07) .491 .33 (.08) <.001 .04 (.08) .598 -.02 (.02) .343 -.04 (.02) .073 -.01 (.02) .670 -.01 (.03) .655 106 .18 (.05) .14 (.08) .33 (.08) .03 (.08) -.04 (.02) -.03 (.02) -.01 (.02) .00 (.03) -.07 (.05) .09 (.03) <.001 .071 <.001 .694 .110 .169 .632 .895 .196 <.001 .24 (.07) .23 (.11) .33 (.09) .05 (.10) -.03 (.02) -.03 (.02) -.01 (.02) .00 (.03) -.07 (.05) .10 (.03) .03 (.03) -.02 (.03) .01 (.03) -.07 (.03) -.11 (.03) .06 (.03) .001 .032 <.001 .586 .192 .169 .741 .913 .196 <.001 .295 .370 .584 .010 <.001 .034 Table B1 (cont’d) In the past year, how many times have you been responsible for assigning tasks and setting deadlines for other people? 
I am usually the one who assigns tasks or roles to get the work done More than half the time I end up assigning the tasks and roles About half the time I take the lead in assigning tasks and roles I rarely take the lead in assigning tasks and roles I never take the lead unless I have been assigned to do so Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising .13 (.04) .09 (.07) -.02 (.08) .07 (.07) .02 (.02) -.03 (.02) -.01 (.02) .00 (.02) .004 .233 .821 .314 .344 .265 .618 .959 .12 (.04) .09 (.08) -.03 (.09) .09 (.08) .02 (.02) -.02 (.02) .01 (.02) .01 (.02) -.02 (.05) -.01 (.02) .010 .272 .712 .210 .417 .459 .757 .677 .639 .774 .15 (.06) .17 (.11) -.07 (.10) .08 (.10) .01 (.02) -.01 (.03) .01 (.02) .01 (.02) -.02 (.05) -.01 (.02) -.02 (.03) .02 (.02) -.03 (.02) -.07 (.03) -.02 (.03) .017 .134 .466 .394 .720 .575 .531 .577 .758 .606 .369 .398 .196 .008 .485 Conventional .08 (.03) .005 Note. Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 107 Table B2. MIMIC model of the Leadership Positions scale Outcome Item Responses Leadership Positions Factor Predictor Gender Black Asian Other Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional Model 1 β (S.E.) p .06 (.06) .352 .02 (.10) .872 .02 (.12) .887 -.21 (.10) .031 108 Model 2 β (S.E.) p .06 (.06) .325 -.01 (.12) .928 .03 (.13) .797 -.26 (.10) .009 -.03 (.07) .636 -.07 (.03) .031 Model 3 β (S.E.) p -.01 (.07) .888 -.02 (.12) .838 .05 (.13) .712 -.27 (.10) .006 -.05 (.07) .485 -.08 (.03) .009 -.01 (.04) .788 -.03 (.03) .333 .04 (.03) .200 .15 (.03) <.001 .12 (.04) .002 -.01 (.04) .735 Table B2 (cont’d) Effect on Items The number I did not take a of high school leadership role clubs and 1 organized 2 activities 3 (such as band, 4 or more sports, newspapers, etc.) in which you took a leadership role was: Gender .15 (.04) .001 .14 (.05) .001 .18 (.06) .005 Black .03 (.07) .729 -.02 (.08) .852 .08 (.13) .540 Asian -.06 (.08) .490 -.05 (.08) .549 -.02 (.10) .840 Other .04 (.07) .587 .00 (.08) .977 .07 (.10) .461 Gender X Factor -.01 (.02) .715 -.01 (.02) .607 -.01 (.02) .517 Black X Factor -.04 (.02) .071 -.04 (.02) .073 -.04 (.02) .132 Asian X Factor .00 (.02) .909 .00 (.02) .839 -.01 (.02) .756 Other X Factor -.01 (.02) .549 -.02 (.02) .305 -.03 (.02) .241 Pell Eligibility .02 (.05) .758 .01 (.05) .914 High School .00 (.02) .963 -.01 (.02) .667 Realistic -.03 (.03) .324 Investigative .02 (.02) .490 Artistic -.01 (.02) .551 Social -.02 (.03) .542 Enterprising .07 (.03) .009 Conventional -.04 (.03) .206 Note. Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 109 Table B3. MIMIC model of the Knowledge scale Outcome Item Responses Knowledge Factor Predictor Gender Black Asian Other Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional Model 1 Model 2 Model 3 β (S.E.) p β (S.E.) p β (S.E.) 
p .00 (.08) .983 -.05 (.09) .547 -.16 (.10) .131 -.42 (.13) .001 -.50 (.15) .001 -.46 (.15) .002 -.35 (.14) .013 -.38 (.15) .012 -.41 (.15) .007 -.24 (.14) .072 -.22 (.14) .134 -.20 (.14) .157 .06 (.10) .562 .07 (.10) .449 -.06 (.05) .210 -.06 (.05) .246 -.12 (.06) .038 .13 (.05) .005 -.03 (.05) .507 .13 (.05) .006 -.04 (.05) .500 .10 (.05) .049 110 Table B3 (cont’d) Effect on Items For classwork, how often do you tend to skim the material, reading only the important points? Almost all the time Most of the time Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .20 (.05) <.001 -.28 (.10) .004 -.05 (.10) .600 -.09 (.08) .243 .04 (.03) .255 -.04 (.03) .244 -.02 (.03) .451 -.04 (.03) .275 111 .22 (.06) -.29 (.11) -.06 (.11) -.13 (.09) .03 (.03) -.04 (.04) -.01 (.03) -.04 (.04) .08 (.06) .06 (.03) <.001 .011 .563 .133 .433 .219 .690 .293 .175 .052 .16 (.07) -.15 (.11) -.03 (.10) -.05 (.12) .02 (.03) -.05 (.04) -.02 (.04) -.04 (.04) .09 (.06) .06 (.03) .00 (.03) .03 (.03) .00 (.03) .00 (.03) -.04 (.03) -.04 (.03) .032 .167 .743 .664 .562 .208 .619 .274 .161 .036 .916 .346 .928 .952 .270 .244 Table B3 (cont’d) (Reversed) In general, what is the lowest grade that you find acceptable for yourself? A or equivalent B or equivalent C or equivalent D or equivalent F or equivalent Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .00 (.05) .981 -.42 (.09) <.001 -.12 (.13) .357 -.16 (.08) .058 -.04 (.03) .153 -.06 (.03) .070 -.07 (.04) .093 -.08 (.03) .006 112 -.01 (.06) -.50 (.11) -.20 (.16) -.25 (.10) -.03 (.03) -.04 (.03) -.08 (.04) -.07 (.03) -.07 (.06) -.07 (.03) .812 <.001 .193 .008 .267 .203 .039 .027 .291 .011 .11 (.07) -.38 (.10) -.05 (.17) -.07 (.12) -.04 (.03) -.02 (.04) -.10 (.04) -.07 (.03) -.04 (.06) -.06 (.03) -.03 (.03) .12 (.03) -.04 (.03) -.04 (.03) -.06 (.03) .15 (.03) .121 <.001 .766 .574 .156 .507 .015 .025 .458 .018 .358 <.001 .225 .191 .102 <.001 Table B3 (cont’d) (Reversed) How often do you spend extra time on school assignments, even after they are turned in, so that you can gain a better understanding of the material or principles? Very often Often Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .13 (.06) .19 (.11) .28 (.10) .15 (.09) .03 (.03) -.03 (.03) -.02 (.03) -.03 (.03) 113 .029 .070 .007 .109 .234 .320 .511 .324 .15 (.06) .13 (.12) .27 (.12) .11 (.11) .03 (.03) -.02 (.03) -.02 (.03) -.01 (.03) .13 (.07) -.03 (.03) .019 .275 .023 .315 .356 .531 .530 .802 .053 .373 .17 (.08) .19 (.11) .28 (.10) .12 (.10) .01 (.03) -.04 (.03) -.02 (.03) -.02 (.03) .14 (.07) -.03 (.03) .05 (.04) .04 (.03) .03 (.03) -.01 (.03) -.02 (.04) .02 (.04) .026 .100 .006 .246 .622 .163 .390 .597 .041 .364 .222 .266 .351 .779 .541 .671 Table B3 (cont’d) Generally, whenever you learn about a topic or how to perform a task, how often do you learn all the details as well as the general principles? 
Hardly ever Not very often Sometimes Often Almost always Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.03 (.06) .602 .46 (.09) <.001 .07 (.12) .550 .07 (.09) .430 .03 (.03) .353 -.02 (.04) .596 -.03 (.04) .334 .00 (.03) .925 114 -.01 (.06) .47 (.11) .04 (.14) .08 (.10) .02 (.03) -.03 (.05) -.03 (.04) .00 (.03) .04 (.07) .02 (.03) .857 <.001 .800 .438 .427 .545 .386 .937 .496 .638 -.04 (.08) .56 (.12) .11 (.12) .05 (.11) .02 (.03) -.05 (.03) -.04 (.04) .00 (.03) .05 (.07) .01 (.03) .01 (.04) .04 (.03) .03 (.03) .01 (.03) .03 (.04) -.03 (.03) .665 <.001 .364 .613 .559 .144 .277 .909 .469 .756 .724 .215 .327 .693 .483 .451 Table B3 (cont’d) (Reversed) When you took classes that you thought were easy, how important was it for you still to understand the concepts underlying the class material? Extremely important Very important Rather important Sort of important Not important Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .05 (.05) .31 (.11) .15 (.10) .16 (.08) .00 (.03) .00 (.04) .00 (.03) -.02 (.03) 115 .327 .004 .142 .057 .936 .904 .866 .426 .07 (.06) .33 (.12) .13 (.11) .17 (.09) -.01 (.03) .00 (.04) .00 (.03) .00 (.03) .05 (.06) .00 (.03) .185 .004 .226 .068 .705 .934 .961 .907 .431 .886 .12 (.08) .32 (.11) .13 (.12) .16 (.10) -.01 (.03) -.02 (.03) .01 (.03) -.01 (.03) .04 (.06) .01 (.03) .07 (.04) -.04 (.03) .05 (.03) -.02 (.03) -.01 (.03) -.02 (.04) .123 .004 .259 .113 .677 .613 .854 .811 .523 .866 .060 .273 .114 .535 .706 .602 Table B3 (cont’d) A year after completing a class, how much can you typically remember about what you were taught? I tend to forget most of what was taught in class I remember the general ideas that were taught in class I remember some of the details that were taught in class I remember a lot of the details that were taught in class Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.25 (.05) <.001 -.07 (.11) .505 -.12 (.12) .306 .02 (.10) .816 -.04 (.03) .145 -.01 (.04) .836 .00 (.03) .960 .01 (.03) .703 116 -.24 (.06) -.10 (.13) -.18 (.13) .02 (.11) -.02 (.03) -.02 (.05) -.01 (.03) .00 (.04) .07 (.06) .02 (.03) <.001 .457 .172 .844 .463 .724 .880 .988 .290 .541 -.23 (.08) -.07 (.11) -.20 (.11) .02 (.10) -.03 (.03) -.01 (.05) -.01 (.03) -.01 (.04) .07 (.06) .03 (.03) .02 (.04) .02 (.03) .06 (.03) .02 (.03) -.05 (.04) .05 (.04) .005 .562 .060 .869 .391 .859 .858 .890 .246 .366 .678 .640 .046 .624 .171 .140 Table B3 (cont’d) How do you compare your standards for learning to those of your high school teachers? 
Much lower than my teachers' standards Lower than my teachers' standards About the same as my teachers' standards Higher than my teachers' standards Much higher than my teachers' standards
Gender -.09 (.06) .117 -.11 (.06) .053 -.04 (.08) .581 Black .17 (.10) .088 .11 (.12) .381 .19 (.10) .063 Asian .07 (.11) .541 .04 (.14) .763 .08 (.12) .511 Other -.09 (.09) .321 -.09 (.09) .307 -.02 (.10) .849 Gender X Factor -.03 (.03) .407 -.03 (.03) .436 -.03 (.03) .323 Black X Factor -.03 (.04) .480 -.02 (.04) .597 -.03 (.04) .413 Asian X Factor -.04 (.04) .288 -.03 (.04) .412 -.03 (.04) .436 Other X Factor -.04 (.03) .161 -.03 (.03) .276 -.03 (.03) .251 Pell Eligibility .00 (.06) .974 .02 (.06) .797 High School -.05 (.03) .119 -.04 (.03) .213 Realistic -.03 (.03) .437 Investigative .15 (.03) <.001 Artistic .04 (.03) .211 Social .00 (.03) .949 Enterprising .01 (.03) .881 Conventional .07 (.03) .052
Note. Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 117
Table B4. MIMIC model of the Continuous Learning scale
Outcome Continuous Learning Factor Item Responses Predictor Gender Black Asian Other Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional Model 1 β (S.E.) p -.05 (.06) .457 .14 (.09) .145 .10 (.13) .431 .04 (.11) .706 118 Model 2 β (S.E.) p -.05 (.06) .395 .03 (.10) .798 .04 (.13) .736 .01 (.11) .925 .21 (.07) .003 -.02 (.03) .500 Model 3 β (S.E.) p -.13 (.08) .079 .01 (.11) .923 -.03 (.13) .791 -.03 (.11) .764 .23 (.07) .001 -.01 (.03) .662 -.02 (.04) .695 .10 (.03) .004 .19 (.03) <.001 .05 (.04) .149 -.08 (.04) .030 .09 (.04) .020
Table B4 (cont’d) Effect on Items In the past month, how many times have you looked for more information about something that you found interesting? Never Once or twice 3 to 5 times 6 to 10 times More than 10 times Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.42 (.05) -.10 (.08) .04 (.09) .00 (.08) .02 (.02) .00 (.03) -.01 (.02) .00 (.02) 119 <.001 .227 .686 .993 .306 .994 .801 .841 -.43 (.05) <.001 -.07 (.09) .431 .06 (.09) .518 -.01 (.08) .918 .03 (.02) .249 -.02 (.03) .553 -.01 (.02) .586 .01 (.02) .708 -.08 (.06) .170 .00 (.03) .882 -.54 (.08) <.001 -.08 (.15) .627 .09 (.12) .467 -.04 (.11) .755 .02 (.02) .310 .00 (.03) .962 -.01 (.02) .765 .01 (.02) .823 -.08 (.05) .157 .00 (.03) .881 -.02 (.03) .425 -.02 (.03) .421 .07 (.03) .011 .02 (.03) .520 -.03 (.03) .377 .02 (.03) .439
Table B4 (cont’d) (Reversed) How often do you ask a teacher or classmate questions that go beyond the material but are still relevant to the topic (either in or out of class)?
Very often Often Sometimes Rarely Never Almost Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.12 (.05) .27 (.08) -.20 (.10) -.11 (.08) .02 (.02) .00 (.03) -.01 (.02) .00 (.02) 120 .013 .001 .044 .172 .306 .994 .801 .841 -.13 (.05) .20 (.09) -.18 (.10) -.16 (.09) -.01 (.03) .03 (.03) -.05 (.02) .04 (.03) .00 (.06) -.06 (.03) .008 .023 .079 .059 .569 .299 .060 .155 .955 .022 -.12 (.08) .01 (.14) .01 (.16) -.27 (.13) -.01 (.03) .04 (.03) -.05 (.03) .03 (.03) -.01 (.06) -.06 (.03) .01 (.03) .01 (.03) .00 (.03) .05 (.03) .04 (.03) -.03 (.03) .141 .926 .974 .043 .682 .132 .069 .203 .827 .024 .680 .773 .987 .063 .155 .297 Table B4 (cont’d) In the past month, how many times did you go out and learn more about something simply because it seemed interesting? Never Once Twice Three or four times Five times or more Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.31 (.04) -.14 (.08) -.04 (.09) -.19 (.07) .03 (.02) -.01 (.03) .01 (.02) .04 (.02) 121 <.001 .079 .699 .010 .098 .838 .642 .018 -.30 (.05) <.001 -.08 (.09) .364 -.02 (.09) .870 -.21 (.08) .011 .03 (.02) .172 -.01 (.03) .619 .01 (.02) .660 .04 (.02) .036 -.04 (.05) .433 .02 (.02) .350 -.36 (.08) <.001 -.04 (.15) .781 -.05 (.15) .732 -.33 (.11) .002 .03 (.02) .195 -.01 (.03) .668 .01 (.02) .527 .03 (.02) .081 -.04 (.05) .438 .02 (.02) .390 .02 (.03) .395 -.01 (.03) .578 .04 (.03) .120 .03 (.03) .328 .04 (.03) .168 -.05 (.03) .061 Table B4 (cont’d) When learning new things, some people tend to feel stressed or tired, while others tend to feel inspired or refreshed. How do you tend to feel when you learn new things? Very stressed/tired Somewhat stressed/tired Something in between stressed/tired and inspired/refreshed Somewhat inspired/refreshed Very inspired/refreshed Gender -.21 (.05) <.001 -.20 (.05) <.001 -.13 (.08) .105 Black .12 (.08) .153 .07 (.09) .413 .24 (.17) .173 Asian -.13 (.08) .113 -.19 (.09) .029 -.04 (.12) .731 Other .02 (.08) .829 .02 (.08) .773 .13 (.11) .213 Gender X Factor -.01 (.02) .751 -.02 (.03) .494 -.02 (.03) .385 Black X Factor -.04 (.03) .222 -.05 (.04) .197 -.05 (.04) .229 Asian X Factor -.03 (.02) .219 -.04 (.02) .120 -.05 (.02) .048 Other X Factor -.02 (.02) .288 -.02 (.02) .337 -.03 (.02) .195 Pell Eligibility .02 (.05) .746 .02 (.05) .782 High School -.03 (.02) .194 -.03 (.03) .205 Realistic .06 (.03) .055 Investigative .04 (.03) .137 Artistic -.04 (.03) .138 Social .02 (.03) .486 Enterprising -.06 (.03) .047 Conventional .01 (.03) .697 Note. Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 122 Table B5. MIMIC model of the Perseverance scale Outcome Perseverance Factor Item Responses Predictor Gender Black Asian Other Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional Model 1 Model 2 Model 3 β (S.E.) p β (S.E.) p β (S.E.) 
p .39 (.07) <.001 .38 (.08) <.001 .27 (.10) .005 .23 (.11) .037 .18 (.12) .143 .23 (.13) .071 -.29 (.14) .037 -.28 (.15) .058 -.26 (.15) .076 .04 (.12) .771 -.03 (.13) .845 -.02 (.13) .888 .13 (.09) .129 .12 (.09) .178 -.05 (.04) .280 -.06 (.04) .185 -.15 (.05) .001 .08 (.04) .062 -.04 (.04) .270 .13 (.04) .003 .13 (.05) .006 -.01 (.05) .816 123 Table B5 (cont’d) Effect on Items To what extent would your friends describe you as someone who goes after what you want? Not at all A slight extent A moderate extent A large extent A great extent Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .03 (.06) -.12 (.10) -.26 (.12) .01 (.09) -.02 (.03) .05 (.02) -.01 (.03) -.03 (.03) 124 .672 .209 .031 .955 .602 .025 .884 .331 .02 (.06) -.18 (.10) -.28 (.13) -.03 (.09) -.02 (.03) .04 (.02) -.01 (.03) -.04 (.03) .13 (.06) .01 (.03) .784 .088 .026 .753 .523 .064 .715 .221 .031 .817 .04 (.08) -.31 (.14) -.23 (.14) .00 (.12) -.01 (.03) .05 (.02) -.01 (.03) -.04 (.03) .13 (.06) -.01 (.03) .01 (.04) -.05 (.03) .01 (.03) -.01 (.03) .10 (.04) -.09 (.04) .677 .030 .107 .975 .786 .064 .734 .236 .040 .845 .897 .082 .753 .827 .004 .015 Table B5 (cont’d) How frequently do you fail to get what you want because you did not put in enough effort? Very often Often Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.01 (.06) .830 -.28 (.10) .004 -.75 (.10) <.001 -.21 (.10) .030 .03 (.03) .358 -.03 (.03) .290 -.02 (.04) .568 -.03 (.04) .432 125 -.02 (.06) -.37 (.11) -.77 (.11) -.21 (.1) .03 (.04) -.04 (.04) -.02 (.04) -.04 (.05) .08 (.06) -.04 (.03) .746 .001 <.001 .040 .390 .306 .665 .321 .184 .184 .01 (.09) -.33 (.17) -.75 (.13) -.13 (.15) .02 (.04) -.01 (.04) -.02 (.04) -.04 (.05) .08 (.06) -.04 (.03) -.01 (.04) .06 (.03) -.09 (.03) -.01 (.03) -.01 (.03) .06 (.03) .944 .049 <.001 .377 .613 .738 .569 .404 .209 .108 .765 .041 .002 .879 .695 .070 Table B5 (cont’d) (Reversed) How often have you accomplished something you initially thought was very difficult or almost impossible? Very often Often Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .08 (.06) .19 (.11) .07 (.10) .03 (.10) -.01 (.03) .01 (.03) .03 (.03) .00 (.03) 126 .158 .089 .483 .786 .681 .818 .253 .914 .10 (.06) .17 (.12) .01 (.1) .00 (.1) -.01 (.03) .02 (.04) .03 (.03) -.02 (.04) .10 (.06) .06 (.03) .105 .157 .903 .969 .815 .642 .229 .675 .086 .044 .08 (.08) .07 (.17) -.05 (.12) .04 (.13) -.01 (.03) .01 (.04) .04 (.03) -.03 (.04) .10 (.06) .06 (.03) .06 (.04) -.06 (.03) .06 (.03) .05 (.03) -.03 (.03) -.01 (.03) .307 .710 .682 .763 .848 .798 .205 .508 .105 .046 .098 .058 .034 .133 .334 .775 Table B5 (cont’d) (Reversed) How often have you finished a project when faced with difficult circumstances? 
Very often Often Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .02 (.06) -.29 (.10) -.12 (.11) .06 (.11) -.01 (.03) .03 (.03) -.01 (.03) .00 (.04) 127 .703 .006 .262 .583 .676 .239 .671 .922 .01 (.06) -.26 (.12) -.15 (.12) .01 (.11) -.03 (.03) .04 (.03) .00 (.04) .03 (.04) .01 (.06) .03 (.03) .832 .025 .214 .901 .334 .221 .918 .490 .867 .268 .01 (.08) -.40 (.17) -.15 (.12) -.02 (.15) -.03 (.03) .05 (.03) -.01 (.04) .02 (.04) .01 (.06) .03 (.03) .02 (.04) .02 (.03) .05 (.03) .06 (.03) -.03 (.03) .04 (.03) .866 .016 .210 .875 .286 .134 .756 .617 .828 .226 .617 .446 .062 .089 .410 .228 Table B5 (cont’d) (Reversed) How often do others tend to compliment you on your determination to continue with a project under difficult circumstances? Very often Often Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .12 (.06) .14 (.11) .04 (.11) .09 (.08) -.01 (.03) .05 (.03) -.03 (.03) -.03 (.03) 128 .035 .188 .679 .306 .836 .056 .311 .182 .09 (.06) .15 (.12) .02 (.11) .09 (.09) .00 (.03) .06 (.03) -.01 (.03) -.04 (.03) .05 (.06) .02 (.03) .138 .216 .872 .282 .968 .057 .633 .169 .459 .373 .09 (.08) .08 (.16) .05 (.12) .16 (.11) -.01 (.03) .04 (.03) -.01 (.03) -.04 (.03) .04 (.06) .02 (.03) .01 (.04) -.06 (.03) .03 (.03) .04 (.03) .02 (.03) .05 (.03) .264 .603 .684 .141 .672 .172 .652 .168 .503 .525 .801 .064 .267 .206 .646 .167 Table B5 (cont’d) How often do you tend to give up on a task after being told that you were not doing well? Almost all the time Most of the time Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.27 (.06) <.001 .17 (.09) .063 -.21 (.12) .080 .08 (.10) .423 .06 (.03) .036 -.02 (.03) .485 .05 (.03) .123 .03 (.03) .375 129 -.23 (.06) .08 (.10) -.25 (.13) .08 (.10) .04 (.03) -.02 (.03) .05 (.03) .03 (.03) .13 (.06) -.04 (.03) <.001 .396 .045 .443 .248 .420 .110 .397 .034 .207 -.24 (.08) .10 (.14) -.35 (.14) .04 (.13) .04 (.03) -.02 (.03) .04 (.03) .03 (.03) .12 (.06) -.04 (.03) .07 (.03) -.03 (.03) -.03 (.03) .03 (.03) -.02 (.04) -.01 (.03) .004 .447 .010 .773 .198 .436 .234 .398 .044 .235 .044 .363 .373 .393 .677 .754 Table B5 (cont’d) When encountering problems that take a long time to solve, how impatient do you tend to become? Extremely impatient Very impatient Somewhat impatient Slightly impatient Not at all impatient Gender -.28 (.06) <.001 -.27 (.06) <.001 -.33 (.08) <.001 Black -.04 (.10) .689 -.07 (.11) .554 -.09 (.15) .533 Asian -.08 (.10) .416 -.14 (.10) .182 -.19 (.12) .126 Other -.07 (.11) .526 -.03 (.10) .759 .02 (.13) .860 Gender X Factor .03 (.03) .328 .05 (.03) .130 .03 (.03) .392 Black X Factor .02 (.03) .559 .01 (.03) .650 .02 (.03) .604 Asian X Factor .01 (.03) .878 .01 (.03) .799 -.01 (.03) .872 Other X Factor .01 (.04) .818 -.03 (.04) .461 -.03 (.03) .412 Pell Eligibility .07 (.06) .289 .07 (.06) .237 High School .01 (.03) .705 .01 (.03) .758 Realistic .05 (.04) .187 Investigative .00 (.03) .902 Artistic -.03 (.03) .362 Social .07 (.03) .028 Enterprising -.13 (.03) <.001 Conventional .08 (.03) .015 Note. 
Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 130 Table B6. MIMIC model of the Discrete Adaptability scale Outcome Discrete Adaptability Factor Item Responses Predictor Gender Black Asian Other Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional Model 1 Model 2 Model 3 β (S.E.) p β (S.E.) p β (S.E.) p .36 (.10) <.001 .36 (.10) <.001 .21 (.12) .071 -.42 (.16) .007 -.47 (.17) .006 -.53 (.18) .003 -.68 (.20) .001 -.73 (.21) .001 -.74 (.21) <.001 -.27 (.15) .069 -.38 (.15) .014 -.37 (.15) .014 .05 (.11) .677 .03 (.11) .808 -.04 (.05) .441 -.07 (.05) .164 -.21 (.06) <.001 .04 (.06) .445 .03 (.05) .512 .14 (.05) .008 .12 (.06) .047 .15 (.06) .010 131 Table B6 (cont’d) Effect on Items How often have you failed to meet responsibilities because you had taken on too much? Very often Often Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.15 (.06) .03 (.11) -.14 (.13) -.02 (.11) .02 (.03) .04 (.03) -.03 (.04) .07 (.04) 132 .018 .780 .287 .893 .564 .191 .404 .072 -.15 (.06) -.01 (.13) -.11 (.16) .06 (.12) .01 (.04) .05 (.03) -.02 (.04) .05 (.04) .03 (.07) -.04 (.03) .015 .919 .492 .636 .804 .121 .542 .237 .621 .269 -.06 (.10) -.15 (.17) -.02 (.14) -.07 (.17) -.01 (.04) .06 (.03) -.02 (.04) .04 (.04) .05 (.07) -.03 (.03) .07 (.04) .03 (.03) -.11 (.03) -.01 (.04) -.03 (.04) -.07 (.04) .543 .375 .879 .688 .790 .081 .595 .347 .486 .369 .055 .440 .001 .689 .448 .072 Table B6 (cont’d) (Reversed) How difficult has it been for you to continue with something after being interrupted and having to take care of something else? Very easy Easy Not easy but not difficult Difficult Very difficult Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.17 (.07) .06 (.11) .01 (.12) .11 (.10) -.04 (.03) -.03 (.03) -.05 (.03) -.04 (.03) 133 .013 .573 .936 .253 .155 .328 .055 .196 -.18 (.07) -.06 (.12) -.05 (.14) .12 (.11) -.03 (.03) -.02 (.03) -.05 (.03) -.02 (.03) .12 (.08) -.03 (.04) .011 .617 .741 .279 .201 .512 .064 .422 .108 .472 .00 (.10) .08 (.15) .16 (.16) .23 (.12) -.05 (.03) -.02 (.03) -.05 (.03) -.03 (.03) .14 (.08) -.01 (.04) .15 (.05) .01 (.04) -.07 (.04) -.04 (.04) -.09 (.04) -.09 (.04) .976 .600 .296 .056 .049 .549 .038 .310 .076 .840 .001 .891 .068 .261 .041 .031 Table B6 (cont’d) (Reversed) How often do you plan ahead and make a specific schedule of things you need or want to do? 
Very often Often Sometimes Rarely Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .38 (.06) <.001 .06 (.09) .494 .06 (.12) .618 .07 (.08) .404 -.01 (.03) .824 -.02 (.03) .578 -.01 (.04) .774 .01 (.03) .723 134 .41 (.06) .06 (.10) .04 (.14) .10 (.09) .00 (.03) -.03 (.03) -.01 (.05) .00 (.03) -.01 (.06) .01 (.03) <.001 .525 .794 .251 .978 .455 .905 .923 .898 .644 .48 (.09) .13 (.14) .10 (.15) .11 (.13) .00 (.03) -.02 (.03) -.02 (.05) .00 (.03) -.02 (.06) .01 (.03) .07 (.03) -.03 (.03) -.08 (.03) .01 (.03) .02 (.04) -.03 (.03) <.001 .375 .524 .380 .978 .522 .630 .897 .761 .685 .033 .259 .011 .811 .481 .459
Table B6 (cont’d) In the past, how difficult has it been for you to change your study habits to improve on a skill or to do better in a class? Very difficult Difficult Not easy but not difficult Easy Very easy Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.13 (.06) .00 (.11) -.20 (.11) -.05 (.10) -.01 (.03) .01 (.03) -.07 (.03) -.05 (.04) 135 .046 .988 .077 .624 .671 .641 .035 .187 -.14 (.07) -.04 (.12) -.20 (.15) -.07 (.11) .00 (.03) .01 (.03) -.07 (.04) -.09 (.04) .04 (.07) -.02 (.03) .041 .718 .176 .549 .912 .793 .060 .014 .543 .468 -.09 (.09) -.06 (.15) .02 (.14) .23 (.16) -.02 (.03) .02 (.03) -.06 (.03) -.08 (.04) .05 (.07) -.02 (.03) -.01 (.04) .08 (.03) -.13 (.03) .04 (.04) -.05 (.04) .03 (.04) .347 .695 .907 .148 .468 .418 .062 .030 .504 .548 .717 .014 <.001 .323 .180 .449
Table B6 (cont’d) When you are working on a serious and relatively difficult task and something or someone interrupts you, how do you usually react? With a great deal of annoyance - it is hard to get back to the original task You are irritated - it's hard to stay on task when you are interrupted It bothers you just a little - you'd really prefer not to be interrupted It doesn't bother you - you feel one of the challenges of any job is the ability to "juggle" several things at a time
Gender -.24 (.06) <.001 -.23 (.07) .001 -.24 (.10) .016 Black .22 (.10) .037 .11 (.12) .354 .21 (.14) .149 Asian .14 (.15) .326 .12 (.17) .486 .09 (.16) .592 Other -.01 (.09) .877 -.04 (.10) .675 .10 (.14) .457 Gender X Factor .02 (.03) .570 .03 (.03) .390 .01 (.03) .693 Black X Factor -.01 (.03) .764 -.01 (.03) .809 -.01 (.03) .755 Asian X Factor .01 (.04) .761 .01 (.04) .857 .02 (.04) .664 Other X Factor -.02 (.03) .578 -.04 (.03) .264 -.05 (.03) .141 Pell Eligibility .11 (.07) .119 .12 (.07) .081 High School -.04 (.03) .235 -.03 (.03) .288 Realistic .11 (.04) .007 Investigative .00 (.04) .937 Artistic -.05 (.03) .123 Social .03 (.03) .331 Enterprising -.06 (.04) .112 Conventional -.07 (.04) .077
Note. Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 136
Table B7. MIMIC model of the Routine Adaptability scale
Outcome Routine Adaptability Factor Effect on Items How often have you failed to meet responsibilities because you had taken on too much? Item Responses Very often Often Sometimes Rarely Never Predictor Gender Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional Model 1 β (S.E.) p -.41 (.07) <.001 Model 2 β (S.E.) p -.36 (.07) <.001 .02 (.08) .829 .00 (.04) .968 Model 3 β (S.E.)
p -.43 (.08) <.001 .01 (.08) .868 -.02 (.04) .684 -.02 (.05) .690 .00 (.04) .946 -.06 (.04) .117 .13 (.04) .003 .11 (.04) .013 -.11 (.04) .010 Gender .19 (.05) <.001 .16 (.05) .002 .16 (.07) .020 Gender X Factor .04 (.03) .194 .03 (.03) .286 .03 (.03) .317 Pell Eligibility .09 (.05) .075 .09 (.05) .075 High School -.05 (.03) .080 -.05 (.03) .065 Realistic -.02 (.03) .498 Investigative -.01 (.03) .709 Artistic .01 (.03) .690 Social -.05 (.03) .143 Enterprising -.02 (.03) .537 Conventional .04 (.03) .245 Note. Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 137 Table B8. MIMIC model of the Social Responsibility scale Outcome Social Responsibility Factor Item Responses Predictor Gender Black Asian Other Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional Model 1 Model 2 Model 3 β (S.E.) p β (S.E.) p β (S.E.) p .45 (.06) <.001 .48 (.06) <.001 .41 (.07) <.001 -.20 (.10) .045 -.21 (.11) .066 -.18 (.12) .135 .33 (.09) <.001 .38 (.10) <.001 .40 (.10) <.001 -.19 (.10) .048 -.16 (.10) .112 -.17 (.10) .111 -.09 (.07) .192 -.11 (.07) .123 -.05 (.03) .130 -.05 (.03) .072 -.03 (.04) .379 .05 (.03) .150 -.05 (.03) .127 .13 (.03) <.001 .06 (.04) .131 -.02 (.04) .613 138 Table B8 (cont’d) Effect on Items During the last year, how many times have you given money, food, or clothes to a charity or a poor person in need? 0 1 2 3 More than 3 Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .08 (.06) .28 (.08) .04 (.12) .13 (.09) -.02 (.03) -.05 (.03) .03 (.03) -.05 (.03) 139 .140 .001 .739 .131 .507 .115 .215 .110 .06 (.06) .28 (.09) .00 (.12) .08 (.09) -.01 (.03) -.04 (.03) .04 (.03) -.05 (.03) -.02 (.06) .01 (.03) .318 .002 .977 .419 .639 .157 .162 .118 .727 .757 .03 (.08) .35 (.12) -.05 (.16) .14 (.12) -.01 (.03) -.05 (.03) .04 (.03) -.05 (.03) -.04 (.06) .00 (.03) .04 (.03) -.09 (.03) -.01 (.03) .10 (.03) .06 (.03) -.05 (.03) .727 .004 .757 .222 .706 .141 .136 .127 .473 .891 .294 .002 .857 .001 .084 .109 Table B8 (cont’d) In the past year, how many hours were you engaged in community service or volunteer activities? None Less than 10 hours 11 - 40 hours 41 - 80 hours More than 80 hours Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.07 (.04) .25 (.08) .13 (.09) .08 (.06) .04 (.02) .05 (.02) .04 (.02) .01 (.02) 140 .085 .001 .157 .223 .013 .002 .026 .450 -.07 (.04) .29 (.09) .15 (.10) .10 (.07) .04 (.02) .06 (.02) .04 (.02) .01 (.02) .08 (.05) .03 (.02) .090 .001 .120 .159 .034 .001 .024 .613 .081 .141 -.14 (.06) .010 .15 (.10) .130 .06 (.12) .627 .09 (.07) .207 .04 (.02) .039 .06 (.02) <.001 .04 (.02) .041 .01 (.02) .508 .08 (.05) .109 .04 (.02) .090 .01 (.03) .779 -.01 (.02) .754 .02 (.02) .335 .04 (.02) .132 -.04 (.03) .140 .03 (.03) .201 Table B8 (cont’d) (Reversed) How important has it been in the past for you to be involved in community or volunteer work? 
Extremely important Very important Important Not very important Not at all important Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .20 (.04) .23 (.07) -.02 (.08) .03 (.02) -.01 (.02) -.01 (.02) -.03 (.02) 141 <.001 .001 .842 .164 .768 .598 .105 .21 (.05) .22 (.08) -.04 (.08) .02 (.02) -.01 (.02) -.01 (.02) -.02 (.02) .07 (.05) .01 (.02) <.001 .004 .658 .305 .510 .739 .235 .170 .795 .08 (.06) .23 (.08) .00 (.10) .01 (.02) .00 (.02) -.01 (.02) -.01 (.02) .05 (.05) .00 (.02) -.06 (.03) -.02 (.02) -.02 (.02) .08 (.02) -.04 (.03) .01 (.03) .06 (.07) .166 .004 .996 .433 .871 .502 .490 .286 .896 .025 .513 .395 .002 .111 .605 .380 Table B8 (cont’d) In the past year, in None how many 1 fundraisers have 2 you participated? 3 4 or more Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .08 (.05) -.02 (.09) -.21 (.10) -.18 (.08) .04 (.02) -.01 (.03) .06 (.02) .01 (.03) 142 .140 .836 .038 .030 .118 .613 .002 .669 .05 (.05) -.06 (.09) -.19 (.11) -.20 (.09) .04 (.03) .00 (.03) .07 (.02) .01 (.03) -.04 (.06) -.04 (.03) .377 .539 .067 .020 .144 .929 .001 .788 .458 .123 -.05 (.07) .481 -.09 (.11) .398 -.29 (.14) .031 -.22 (.09) .018 .04 (.03) .155 .00 (.03) .986 .07 (.02) .001 .01 (.03) .702 -.07 (.06) .182 -.06 (.03) .021 .00 (.03) .915 -.06 (.03) .033 -.01 (.03) .630 .12 (.03) <.001 .06 (.03) .047 -.01 (.03) .708 Table B8 (cont’d) During the past year, how often have you recycled? Never Not very often Sometimes Often Always Gender -.01 (.05) .819 .01 (.06) .804 .01 (.08) .869 Black -.82 (.10) <.001 -.62 (.11) <.001 -.62 (.13) <.001 Asian -.21 (.12) .095 -.18 (.12) .140 -.26 (.16) .113 Other -.13 (.09) .126 -.05 (.09) .551 -.08 (.1) .421 Gender X Factor .02 (.03) .522 .02 (.03) .489 .03 (.03) .395 Black X Factor .03 (.04) .505 .02 (.04) .593 .01 (.04) .866 Asian X Factor .04 (.03) .125 .03 (.03) .305 .03 (.03) .307 Other X Factor .01 (.03) .755 .01 (.03) .775 .00 (.03) .948 Pell Eligibility -.12 (.06) .044 -.10 (.06) .087 High School .14 (.03) <.001 .14 (.03) <.001 Realistic .05 (.03) .124 Investigative .00 (.03) .880 Artistic .05 (.03) .076 Social -.03 (.03) .393 Enterprising -.01 (.03) .681 Conventional -.01 (.03) .702 Note. Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 143 Table B9. MIMIC model of the Values scale Outcome Values Factor Item Responses Predictor Gender Black Asian Other Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional Model 1 Model 2 Model 3 β (S.E.) p β (S.E.) p β (S.E.) p .23 (.07) .001 .20 (.07) .004 .02 (.08) .779 -.14 (.10) .181 -.11 (.11) .328 -.06 (.12) .619 -.29 (.14) .031 -.27 (.15) .066 -.28 (.15) .052 -.01 (.11) .930 .02 (.11) .830 .03 (.11) .784 -.03 (.08) .740 -.03 (.08) .732 -.01 (.04) .855 .00 (.04) .997 -.11 (.04) .010 .05 (.04) .204 .00 (.04) .929 .12 (.04) .001 -.07 (.04) .099 .07 (.04) .131 144 Table B9 (cont’d) Effect on Items During high school, how many times have you expressed disapproval or anger at a friend for behaving in a manner that you considered to be unethical or wrong? 
Never Once Twice Three or four times Five times or more Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.12 (.05) -.15 (.09) -.21 (.11) -.10 (.09) -.04 (.03) -.02 (.04) -.05 (.03) -.02 (.03) 145 .029 .098 .056 .250 .223 .566 .089 .580 -.12 (.06) -.15 (.10) -.28 (.12) -.09 (.09) -.03 (.03) -.02 (.04) -.03 (.03) -.01 (.03) -.01 (.06) .05 (.03) .025 .142 .016 .322 .374 .648 .265 .806 .858 .069 -.24 (.07) -.11 (.11) -.24 (.11) -.09 (.10) -.03 (.03) -.03 (.04) -.02 (.03) -.01 (.04) -.02 (.06) .05 (.03) -.08 (.03) .00 (.03) .05 (.03) .08 (.03) -.04 (.03) .01 (.03) <.001 .332 .037 .095 .314 .375 .522 .757 .813 .070 .018 .950 .072 .012 .209 .718 Tables B9 (cont’d) (Reversed) Over the past year, how many times were you given detention (or a similar punishment)? Never Once Twice Three or four times Five times or more Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional .25 (.05) <.001 -.30 (.11) .006 .11 (.09) .248 -.06 (.10) .520 -.11 (.05) .031 .03 (.08) .750 -.05 (.03) .072 .01 (.05) .924 146 .26 (.06) -.32 (.13) .18 (.09) -.12 (.11) -.11 (.05) .08 (.10) -.07 (.02) .05 (.06) -.06 (.06) .00 (.03) <.001 .010 .038 .282 .033 .460 <.001 .418 .269 .995 .33 (.10) -.41 (.24) .22 (.11) -.14 (.15) -.12 (.05) .12 (.12) -.07 (.02) .05 (.06) -.06 (.06) .00 (.03) -.02 (.03) .02 (.03) -.02 (.03) -.03 (.03) -.04 (.03) .08 (.03) .001 .095 .044 .337 .033 .320 <.001 .390 .293 .957 .514 .426 .490 .336 .200 .015 Table B9 (cont’d) In your first three years of high school, how often did you skip classes without a legitimate reason? Most of the time A lot Sometimes Once or twice Never Gender Black Asian Other Gender X Factor Black X Factor Asian X Factor Other X Factor Pell Eligibility High School Realistic Investigative Artistic Social Enterprising Conventional -.13 (.06) -.06 (.09) -.10 (.10) -.21 (.11) .06 (.04) .06 (.04) -.04 (.03) .07 (.07) 147 .016 .521 .349 .057 .173 .207 .288 .267 -.08 (.06) .02 (.10) -.07 (.11) -.21 (.12) .04 (.04) .07 (.05) -.06 (.04) .09 (.07) -.18 (.06) -.02 (.03) .130 .854 .519 .095 .389 .177 .124 .191 .002 .551 -.07 (.08) -.04 (.15) -.03 (.13) -.29 (.17) .05 (.04) .07 (.06) -.06 (.04) .09 (.07) -.18 (.06) -.02 (.03) .03 (.03) .00 (.03) .00 (.03) -.04 (.03) .01 (.03) -.02 (.03) .413 .800 .796 .084 .295 .240 .123 .179 .004 .605 .357 .968 .985 .236 .856 .624 Table B9 (cont’d) How many times have you been accused of acting unethically? Very often Often Sometimes Rarely Never Gender .33 (.06) <.001 .36 (.06) <.001 .40 (.08) <.001 Black -.04 (.09) .644 -.01 (.10) .909 -.04 (.13) .747 Asian -.04 (.10) .647 -.03 (.10) .741 -.06 (.12) .634 Other -.13 (.10) .187 -.07 (.10) .487 -.13 (.13) .333 Gender X Factor -.06 (.04) .143 -.07 (.04) .084 -.07 (.04) .104 Black X Factor -.03 (.04) .508 -.03 (.04) .551 .00 (.05) .961 Asian X Factor .04 (.04) .361 .03 (.05) .562 .02 (.05) .629 Other X Factor .08 (.04) .039 .07 (.04) .090 .07 (.04) .082 Pell Eligibility -.06 (.06) .321 -.05 (.06) .345 High School .04 (.03) .141 .04 (.03) .158 Realistic .01 (.03) .744 Investigative -.01 (.03) .762 Artistic -.05 (.03) .070 Social -.01 (.03) .724 Enterprising -.01 (.03) .736 Conventional -.03 (.03) .443 Note. Effects listed with “X Factor” denote an interaction between particular demographic grouping variable and the standing on the latent factor score. 
Table B10. MIMIC model of the Situational Judgment scale

Outcome: Situational Judgment Factor
Predictor          Model 1 β (S.E.) p    Model 2 β (S.E.) p    Model 3 β (S.E.) p
Gender .49 (.07) <.001 .52 (.07) <.001 .33 (.08) <.001
Pell Eligibility .03 (.07) .638 .04 (.07) .583
High School -.02 (.04) .538 -.03 (.03) .357
Realistic -.12 (.04) .002
Investigative .07 (.03) .022
Artistic .01 (.04) .815
Social .19 (.04) <.001
Enterprising -.01 (.04) .793
Conventional .05 (.04) .211

Effect on Items
A fellow student allows you to listen to threatening phone messages that have been placed on the person’s voicemail by another student. The student does not want you to tell anyone, but thinks the caller may be capable of causing physical harm. What would you do?
Gender .47 (.06) <.001 .45 (.06) <.001 .49 (.09) <.001
Gender X Factor -.02 (.03) .484 -.01 (.04) .756 -.02 (.04) .512
Pell Eligibility .04 (.06) .504 .04 (.06) .397
High School -.05 (.03) .063 -.04 (.03) .110
Realistic -.03 (.03) .385
Investigative .05 (.03) .076
Artistic .02 (.03) .489
Social -.01 (.03) .643
Enterprising -.03 (.03) .369
Conventional .05 (.03) .106

Table B10 (cont’d)

You are finding a particular class dull and boring, and are having difficulty staying awake. What would you do?
Gender -.06 (.07) .370 -.07 (.07) .304 .11 (.13) .419
Gender X Factor -.11 (.05) .034 -.12 (.06) .030 .03 (.03) .317
Pell Eligibility .07 (.06) .205 .09 (.06) .100
High School -.01 (.03) .744 -.01 (.03) .757
Realistic -.05 (.03) .113
Investigative .05 (.03) .092
Artistic .00 (.03) .971
Social -.06 (.03) .047
Enterprising -.04 (.03) .266
Conventional .05 (.03) .125
Note. Effects listed with “X Factor” denote an interaction between a particular demographic grouping variable and standing on the latent factor. For item responses, please see Table A10.

REFERENCES

Aguinis, H., & Smith, M. A. (2007). Understanding the impact of test validity and bias on selection errors and adverse impact in human resource selection. Personnel Psychology, 60(1), 165-199.
Armstrong, P. I., Allison, W., & Rounds, J. (2008). Development and initial validation of brief public domain RIASEC marker scales. Journal of Vocational Behavior, 73, 287-299.
Bliesener, T. (1996). Methodological moderators in validating biographical data in personnel selection. Journal of Occupational and Organizational Psychology, 69(1), 107-120.
Bobko, P., & Roth, P. L. (2013). Reviewing, categorizing, and analyzing the literature on Black–White mean differences for predictors of job performance: Verifying some perceptions and updating/correcting others. Personnel Psychology, 66(1), 91-126.
Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261-304.
Cole, D. A., Ciesla, J. A., & Steiger, J. H. (2007). The insidious effects of failing to include design-driven correlated residuals in latent-variable covariance structure analysis. Psychological Methods, 12(4), 381-398.
Cottrell, J. M., Newman, D. A., & Roisman, G. I. (2015). Explaining the Black–White gap in cognitive test scores: Toward a theory of adverse impact. Journal of Applied Psychology, 100(6), 1713-1736.
Dean, M. A. (2013). Examination of ethnic group differential responding on a biodata instrument. Journal of Applied Social Psychology, 43(9), 1905-1917.
De Corte, W., Lievens, F., & Sackett, P. R. (2007). Combining predictors to achieve optimal trade-offs between selection quality and adverse impact. Journal of Applied Psychology, 92(5), 1380-1393.
Deutsch, M., & Brown, B. (1964). Social influences in Negro-White intelligence differences. Journal of Social Issues, 20(2), 24-35.
Drasgow, F. (1987). Study of the measurement bias of two standardized psychological tests. Journal of Applied Psychology, 72(1), 19-29.
Duncan, G. J., & Magnuson, K. A. (2005). Can family socioeconomic resources account for racial and ethnic test score gaps? The Future of Children, 15(1), 35-54.
Eccles-Parsons, J. (1983). Expectancies, values, and academic behaviors. In J. T. Spence (Ed.), Achievement and achievement motivations (pp. 75-121). San Francisco, CA: Freeman.
Fouad, N. A. (1999). Validity evidence for interest inventories. In M. L. Savickas & R. L. Spokane (Eds.), Vocational interests: Meaning, measurement, and counseling use (pp. 193-209). Palo Alto, CA: Davis-Black.
Gierl, M. J. (2005). Using a dimensionality-based DIF analysis paradigm to identify and interpret constructs that elicit group differences. Educational Measurement: Issues and Practices, 24, 3-14.
Hauser, R. M., & Goldberger, A. S. (1971). The treatment of unobservable variables in path analysis. Sociological Methodology, 3, 81-117.
Holland, J. L. (1959). A theory of vocational choice. Journal of Counseling Psychology, 6, 35-45.
Holland, J. L. (1997). Making vocational choices: A theory of vocational personalities and work environments (3rd ed.). Odessa, FL: Psychological Assessment Resources.
Holland, J. L., Fritzsche, B., & Powell, A. (1994). Self-directed search: Technical manual. Odessa, FL: Psychological Assessment Resources.
Hough, L. M., & Oswald, F. L. (2000). Personnel selection: Looking toward the future–remembering the past. Annual Review of Psychology, 51(1), 631-664.
Hough, L., & Paullin, C. (1994). Construct-oriented scale construction: The rational approach. In G. S. Stokes, M. D. Mumford, & W. A. Owens (Eds.), Biodata handbook: Theory, research, and use of biographical information in selection and performance prediction (pp. 109-145). Palo Alto, CA: CPP Books.
House, R. J., Hanges, P. J., Javidan, M., Dorfman, P. W., & Gupta, V. (Eds.). (2004). Culture, leadership, and organizations: The GLOBE study of 62 societies. Sage Publications.
Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96(1), 72.
Imus, A., Schmitt, N., Kim, B., Oswald, F. L., Merritt, S., & Westring, A. F. (2010). Differential item functioning in biodata: Opportunity access as an explanation of gender- and race-related DIF. Applied Measurement in Education, 24(1), 71-94.
Jachuck, K., & Mohanty, A. K. (1974). Low socio-economic status and progressive retardation in cognitive skills: A test of cumulative deficit hypothesis. Indian Journal of Mental Retardation, 7(1), 36-45.
Jones, K. S., Newman, D. A., Su, R., & Rounds, J. (under review). Vocational interests and adverse impact: A meta-analysis of Black-White differences in vocational interests.
Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351a), 631-639.
Kim, B. H., Schmitt, N., Friede, A., Oswald, F. L., Ramsay, L. J., & Gillespie, M. A. (2004). Differential item functioning in situational judgment tests: Is it a function of the scoring procedure? Paper presented at the annual meeting of the Society for Industrial and Organizational Psychology, Chicago, IL.
Kozlowski, S. W. J., & Klein, K. J. (2000). A multilevel approach to theory and research in organizations: Contextual, temporal, and emergent processes. In K. J. Klein & S. W. J. Kozlowski (Eds.), Multilevel theory, research, and methods in organizations: Foundations, extensions, and new directions (pp. 3-90). San Francisco: Jossey-Bass.
Lievens, F., & Motowidlo, S. J. (2016). Situational judgment tests: From measures of situational judgment to measures of general domain knowledge. Industrial and Organizational Psychology, 9(1), 3-22.
Low, K. D., Yoon, M., Roberts, B. W., & Rounds, J. (2005). The stability of vocational interests from early adolescence to middle adulthood: A quantitative review of longitudinal studies. Psychological Bulletin, 131(5), 713-737.
Mael, F. A. (1991). A conceptual rationale for the domain and attributes of biodata items. Personnel Psychology, 44(4), 763-792.
Mael, F. A., & Ashforth, B. E. (1995). Loyal from day one: Biodata, organizational identification, and turnover among newcomers. Personnel Psychology, 48(2), 309-333.
McDaniel, M. A., Hartman, N. S., Whetzel, D. L., & Grubb, W. (2007). Situational judgment tests, response instructions, and validity: A meta-analysis. Personnel Psychology, 60(1), 63-91.
McDaniel, M. A., Morgeson, F. P., Finnegan, E. B., Campion, M. A., & Braverman, E. P. (2001). Use of situational judgment tests to predict job performance: A clarification of the literature. Journal of Applied Psychology, 86(4), 730-740.
Meade, A. W., Johnson, E. C., & Braddy, P. W. (2008). Power and sensitivity of alternative fit indices in tests of measurement invariance. Journal of Applied Psychology, 93(3), 568.
Mellenbergh, G. J. (1989). Item bias and item response theory. International Journal of Educational Research, 13(2), 127-143.
Morris, M. L. (2016). Vocational interests in the United States: Sex, age, ethnicity, and year effects. Journal of Counseling Psychology, 63(5), 604.
Motowidlo, S. J., Dunnette, M. D., & Carter, G. W. (1990). An alternative selection procedure: The low-fidelity simulation. Journal of Applied Psychology, 75(6), 640.
Mumford, M. D., & Owens, W. A. (1987). Methodology review: Principles, procedures, and findings in the application of background data measures. Applied Psychological Measurement, 11(1), 1-31.
Muthén, B. O. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54(4), 557-585.
Muthén, L. K. (2012, August 30). MIMIC Modeling [Msg 17]. Message posted to http://www.statmodel.com/discussion/messages/11/650.html?1454806417
Muthén, L. K., & Muthén, B. O. (2010). Mplus user's guide (6th ed.). Los Angeles, CA: Muthén & Muthén.
Nye, C. D., & Drasgow, F. (2011). Effect size indices for analyses of measurement equivalence: Understanding the practical importance of differences between groups. Journal of Applied Psychology, 96(5), 966.
Nye, C. D., Allemand, M., Gosling, S. D., Potter, J., & Roberts, B. W. (2016). Personality trait differences between young and middle-aged adults: Measurement artifacts or actual trends? Journal of Personality, 84(4), 473-492.
Nye, C. D., Su, R., Rounds, J., & Drasgow, F. (2012). Vocational interests and performance: A quantitative summary of over 60 years of research. Perspectives on Psychological Science, 7(4), 384-403.
Oswald, F. L., Schmitt, N., Kim, B. H., Ramsay, L. J., & Gillespie, M. A. (2004). Developing a biodata measure and situational judgment inventory as predictors of college student performance. Journal of Applied Psychology, 89(2), 187.
Ployhart, R. E. (2006). Staffing in the 21st century: New challenges and strategic opportunities. Journal of Management, 32(6), 868-897.
Prasad, J. J., Showler, M. B., Schmitt, N., Ryan, A. M., & Nye, C. D. (2016). Using biodata and situational judgment inventories across cultural groups. International Journal of Testing, 1-24.
Raju, N. S., Van der Linden, W. J., & Fleer, P. F. (1995). IRT-based internal measures of differential functioning of items and tests. Applied Psychological Measurement, 19(4), 353-368.
Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 111-163.
Robertson, I. T., & Smith, M. (2001). Personnel selection. Journal of Occupational and Organizational Psychology, 74(4), 441-472.
Robert, C., Lee, W. C., & Chan, K. Y. (2006). An empirical analysis of measurement equivalence with the INDCOL measure of individualism and collectivism: Implications for valid cross-cultural inference. Personnel Psychology, 59(1), 65-99.
Schermelleh-Engel, K., Moosbrugger, H., & Müller, H. (2003). Evaluating the fit of structural equation models: Tests of significance and descriptive goodness-of-fit measures. Methods of Psychological Research Online, 8(2), 23-74.
Schmitt, N., Keeney, J., Oswald, F. L., Pleskac, T. J., Billington, A. Q., Sinha, R., & Zorzie, M. (2009). Prediction of 4-year college student performance using cognitive and noncognitive predictors and the impact on demographic status of admitted students. Journal of Applied Psychology, 94(6), 1479.
Schmitt, N., & Quinn, A. (2010). Reductions in measured subgroup mean differences: What is possible. Adverse impact: Implications for organizational staffing and high stakes selection, 425-451.
Schneider, B., & Schmitt, N. (1986). Staffing organizations. Glenview, IL: Scott, Foresman.
Schneider, B. (1987). The people make the place. Personnel Psychology, 40(3), 437-453.
Shackleton, V., & Newell, S. (1997). International assessment and selection. In N. Anderson & P. Herriot (Eds.), International handbook of selection and assessment. Chichester, UK: Wiley.
Stark, S., Chernyshenko, O. S., Drasgow, F., & Williams, B. A. (2006). Examining assumptions about item responding in personality assessment: Should ideal point methods be considered for scale development and scoring? Journal of Applied Psychology, 91(1), 25.
Su, R., & Nye, C. D. (in press). Interests and person-environment fit: A new perspective on workforce readiness and success. In J. Burrus, K. D. Mattern, B. Naemi, & R. D. Roberts (Eds.), Building better students: Preparation for the workforce.
Su, R., Rounds, J., & Armstrong, P. I. (2009). Men and things, women and people: A meta-analysis of sex differences in interests. Psychological Bulletin, 135(6), 859.
Tracey, T. J., & Robbins, S. B. (2005). Stability of interests across ethnicity and gender: A longitudinal examination of grades 8 through 12. Journal of Vocational Behavior, 67(3), 335-364.
United States Census Bureau (2015). Income and poverty in the United States: 2014. Retrieved from: https://www.census.gov/content/dam/Census/library/publications/2015/demo/p60252.pdf
United States Census Bureau (2010). 2010 Demographic Profile Data. Retrieved from: https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml
Van Iddekinge, C. H., Putka, D. J., & Campbell, J. P. (2011). Reconsidering vocational interests for personnel selection: The validity of an interest-based selection test in relation to job knowledge, job performance, and continuance intentions. Journal of Applied Psychology, 96(1), 13.
Walker, T. L., & Tracey, T. J. G. (2012). Perceptions of occupational prestige: Differences between African American and White college students. Journal of Vocational Behavior, 80, 76-81.
Weekley, J. A., Ployhart, R. E., & Harold, C. M. (2004). Personality and situational judgment tests across applicant and incumbent settings: An examination of validity, measurement, and subgroup differences. Human Performance, 17(4), 433-461.
Whetzel, D. L., McDaniel, M. A., & Nguyen, N. T. (2008). Subgroup differences in situational judgment test performance: A meta-analysis. Human Performance, 21(3), 291-309.
Whitney, D. J., & Schmitt, N. (1997). Relationship between culture and responses to biodata employment items. Journal of Applied Psychology, 82(1), 113.
Woods, C. M. (2009). Evaluation of MIMIC-model methods for DIF testing with comparison to two-group analysis. Multivariate Behavioral Research, 44(1), 1-27.
Woods, C. M., & Grimm, K. J. (2011). Testing for nonuniform differential item functioning with multiple indicator multiple cause models. Applied Psychological Measurement, 35(5), 339-361.
Woods, C. M., Oltmanns, T. F., & Turkheimer, E. (2009). Illustration of MIMIC-model DIF testing with the Schedule for Nonadaptive and Adaptive Personality. Journal of Psychopathology and Behavioral Assessment, 31(4), 320-330.
Zedeck, S. (2010). Adverse impact: History and evolution. Adverse impact: Implications for organizational staffing and high stakes selection, 3-27.