LIBRARY '
Michigan State
University

 

 

 

PLACE IN RETURN BOX to remove this checkout from your record.
TO AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

 

DATE DUE DATE DUE DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

6/01 c:/C|RC/DateDue.p65-p.15

SCORE INFLATION ON BIODATA AND SITUATIONAL JUDGMENT
INVENTORY ITEMS: A COLLEGE ADMISSIONS QUANDARY

By

Lauren Jill Ramsay

A THESIS

Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of

MASTER OF ARTS
Department of Psychology

2003

 

I

ABSTRACT

SCORE INFLATION ON BIODATA AND SITUATIONAL JUDGMENT
INVENTORY ITEMS: A COLLEGE ADMISSIONS QUANDARY

By

Lauren Jill Ramsay

Tests used in a selection context must be carefully examined to ensure that if they are
susceptible to inﬂation, that vulnerability does not affect their utility. Biodata and
situational judgment measures could be used to support college admissions decisions.
This study explores the relationship between score inﬂation on these two tests, and
situational factors inﬂuencing performance: motivation to perform well, coaching on how
to perform well, and a warning statement not to respond dishonestly. Both motivation
and coaching were found to predict performance. Conscientiousness and emotional
stability were associated with social desirability measured by self deception and
impression management, and extreme inﬂation appeared to be driven more by situation
than by personality. Item-level analyses showed that biodata items that did not require
elaboration were more susceptible to inﬂation, as were items that were less objective, less
controllable, and more college-relevant Criterion-related validity was not negatively
affected by inﬂation in this study, and corrections to selection decisions based on three
indices of inﬂation (bogus items, an inﬂation index, and high scores on impression
management) did not result in any change in the quality of candidates selected, in terms

of performance outcomes. Implications of the ﬁndings are discussed.

ACKNOWLEDGEMENTS

This research was conducted with support from the College Board, New York.

iii

TABLE OF CONTENTS

LIST OF TABLES ............................................................................................................. vi
LIST OF FIGURES ......................................................................................................... viii
INTRODUCTION ............................................................................................................... 1
LITERATURE REVIEW .................................................................................................... 5
Deﬁnitional Issues ................................................................................................... 5
Biodata ..................................................................................................................... 8
Situational Judgment Inventory ............................................................................. 12
Coaching ............................................................................................................... 13
Individual Differences in Faking ........................................................................... 19
Effects on Validity ................................................................................................. 21
Inﬂation Controls ................................................................................................... 25
Warning ........................................................................................................ 26
Composition of Items ................................................................................... 27

Statistical Control using Social Desirability or Bogus Items ....................... 30

Study Development ................................................................................................ 34
METHOD .......................................................................................................................... 45
Sample .................................................................................................................... 45

Study Design .......................................................................................................... 46
Procedure ............................................................................................................... 47
Manipulations ........................................................................................................ 49
Measures ................................................................................................................ 50
RESULTS ............................................... - ........................................................................... 57
Situational Differences ........................................................................................... 57
Individual Differences ........................................................................................... 67

Item Differences ..................................................................................................... 73
Validity and Selection Decisions ........................................................................... 82
CONCLUSION .................................................................................................................. 91
Discussion .............................................................................................................. 91
Limitations ............................................................................................................. 97
Practical Implications ............................................................................................. 98

iv

APPENDICES ................................................................................................................. 1 04

Appendix A -— Mael’s Taxonomy ........................................................................ 104
Appendix B - Sample Questionnaire ................................................................... 106
Appendix C - Sample Biodata Items ................................................................... 120
Appendix D — Sample Situational Judgment Items ............................................. 122
Appendix E — Wording of Instruction Sets .......................................................... 124
Appendix F - Informed Consent Form — Motivated Group ................................ 129
Appendix G - Informed Consent Form — Not Motivated Group ......................... 132
Appendix H — Faking Study Protocol .................................................................. 135
Appendix I — Coaching Directions ...................................................................... 141
Appendix J - Coaching Handout ......................................................................... 143
Appendix K — College GPA Release Form ........................................................ 156
Appendix L — High School GPA and SAT/ACT Score Release Form ............... 158
Appendix M - Twelve Dimensions of College Performance .............................. 160
Appendix N - Bogus Biodata Items .................................................................... 164
Appendix 0 — Inﬂation Index Items .................................................................... 166
Appendix P — Calculation for Inﬂation Index Selection ..... ' ................................. 169
Appendix Q — Biodata Item Types Coding Instructions ...................................... 172
Appendix R — SJ I Item Types Coding Instructions ............................................. 174
REFERENCES ................................................................................................................ 176

LIST OF TABLES

Table 1 - Sample Elaborated Biodata Item ....................................................................... 29
Table 2 — Expected Score Levels ....................................................................................... 39
Table 3 —- Study Manipulations and Participants per Cell .................................................. 46

Table 4 — Coefficient Alpha and Descriptives for Ratings of Biodata Item
Characteristics .................................................................................................................... 55

Table 5 - Coefﬁcient Alpha and Descriptives for Ratings of SJ I Item
Characteristics .................................................................................................................... 56

Table 6 — Analysis of Variance Results for Biodata with and without the Inclusion

of Covariates ...................................................................................................................... 59
Table 7 — Means and Standard Deviations of Biodata across Conditions ......................... 61
Table 8 — Analysis of Variance for SJ I with and without Covariates ................................ 63

Table 9 — Means and Standard Deviations of SJ I Responses for Various Study
Conditions .......................................................................................................................... 64

Table 10 - Correlation Matrix ........................................................................................... 69

Table 11 - Overall Means and Standard Deviations of Four Item Characteristics
Fisher 2 for Biodata ............................................................................................................ 73

Table 12 — Analysis of Variance Results for Fisher 2 for Four Item Characteristics
for Biodata ......................................................................................................................... 74

Table 13 — Means and Standard Deviations of Four Item Characteristic Fisher 2 for
Various Study Conditions for Biodata ............................................................................... 75

Table 14 — Overall Means and Standard Deviations of Four Item Characteristics
Fisher 2 for SJ I ................................................................................................................... 77

Table 15 - Analysis of Variance Results for Fisher 2 for Four Item Characteristics
for SJ I ................................................................................................................................. 78

Table 16 — Means and Standard Deviations of Four Item Characteristic Fisher 2 for
Various Study Conditions for SJ I ...................................................................................... 79

vi

Table 17 - Correlations and Descriptive Statistics for Predictors and Criteria ................. 82

Table 18 — Zero Order and Partial Correlations between Situational Judgment and
Biodata and Two Criteria Controlling for Measures of Faking ......................................... 83

Table 19 — Coefﬁcient Alpha of Biodata and SJ I Scales for Various Study
Conditions .......................................................................................................................... 84

Table 20 — Adjusted r for Biodata and SJ I Scales in Predicting GPA and Absenteeism

Across Various Study Conditions ...................................................................................... 85
Table 21 — Descriptives for High Performance and Reference Groups ............................. 87
Table 22 - Descriptive Statistics for Selection Ratio of .10 .............................................. 89
Table 23 — Descriptive Statistics for Selection Ratio of .25 .............................................. 89
Table 24 - Descriptive Statistics for Selection Ratio of .50 .............................................. 9O

vii

LIST OF FIGURES

Figure 1 — Conceptual Model ............................................................................................ 36

Figure 2 — Expected Interaction of Warning, Coaching, and Motivation on
Test Performance ............................................................................................................... 40

Figure 3 -— Interactions between Coaching, Motivation, and Warning for Biodata
Performance ....................................................................................................................... 66

viii

 

 

 

 

,.
»
.5.
l“'
«w
\
A.‘
.—
ll "

s....
I
I....
D
n
-c

21' f“:

{I .'

V- v...-.

..‘.w
\

h...‘

- t
.

A.-. ..
.3. -
.
...__
"~ ~..
\ "A.
.5”.

 

 

 

INTRODUCTION

Colleges have historically considered traditional academic measures, such as high
school GPA and SAT/ACT scores, in predicting academic performance in college, which
informs admission decisions. However, there is more to the college experience than
academic performance alone, and as colleges are pressed to measure student performance
along broader criteria, they need to consider a similarly broad range of predictors.
Biodata questions and situational judgment inventories have shown promise as two such
examples of predictors of college student performance (Oswald, Schmitt, Ramsay, Kim,
& Gillespie, in press), and in general, the prospect of using a more inclusive range of
instruments is attractive for gathering a wealth of information on applicants, given the
aim of increasing the diversity of applicants accepted to college. Nevertheless, such
measures require thorough development and validation efforts because applicants to
college may be motivated to distort their responses (e. g., “fake good”) and inﬂate their
scores on these tests. Faking can be deﬁned as a conscious effort to manipulate
responses to make a positive impression (see Zickar & Robie, 1999), and people can and
do fake on certain tests when motivated to do so (McFarland & Ryan, 2000; Viswesvaran
& Ones, 1999). It is reasonable to assume that applicants may manipulate their responses
to a test in a way that enhances the likelihood of their success in a selection process
(Hough, Eaton, Dunnette, Kamp, & McCloy, 1990), and such manipulation may affect
the utility of the tests (Rosse, Stecher, Miller, & Levin, 1998). This manipulation may be
motivated by a desire to appear socially desirable or job desirable (Ones, Viswesvaran, &
Reiss, 1996). As Corr and Gray (1994, p. 433) succinctly put it, “to some extent it is de

rigueur for successful job applicants/incumbents to engage in some form of impression

management.” Nevertheless, some inﬂation goes beyond low-key impression
management, and may include outright lying and deliberately providing false
information. As it is difﬁcult to establish the boundaries of what is honest and what is
dishonest responding, I will use the term inﬂation, rather than faking, unless I am
referring to literature that speciﬁcally uses the term faking. I seek to understand the
extent to which individual difference characteristics affect test responses, how responses
might be distorted under different situational constraints, and how both these factors
affect the utility of biodata and situational judgment inventories designed for college
admissions.

While we know that individuals are capable of inﬂation on some items on biodata : h...
tests and situational judgment inventories, we do not know enough about situational
factors that contribute to inﬂation, or methods of identifying it and limiting its effects. If
eventually biodata and situational judgment measures might be used to contribute to the
information collected by colleges in making admissions decisions and in student
development contexts (Oswald, et al., in press), then applicants will be motivated to
achieve high test scores, and many will likely demand coaching programs to help prepare
for the test. In other words, if there is a move to supplement ability measures with the
noncognitive measures considered in this paper in making college admissions decisions,
the evaluation process may become more vulnerable to coaching and other forms of
manipulation.

Some researchers regard faking studies to be of little value, however, when
conducted outside a real selection experience. I argue that there are important reasons to

examine faking in tests to be used in a selection context, before they are actually used for

Ix)

 

decision-making. Inﬂation and coaching may be problematic in some contexts, and
considering the motivation to perform well on selection tests in a college-admissions
context, these issues must be addressed when considering tests such as biodata and
situational judgment inventories for college admissions. Past research makes a similar
recommendation: With the development of the Assessment of Background and Life
Experiences (ABLE), a test containing personality and biodata content, that test was not
regarded as ready for implementation until research had been conducted on the effects of
coaching and faking (White, Young, & Rumsey, 2001). In the present context, without
studying the causes, contexts, and outcomes of test-takers’ manipulating their responses
to these new tests, and instead simply implementing a new test in a college admissions
context, universities would be opening themselves up to intense criticism and probable
litigation for using a test the appropriateness of which has not yet been supported. It is
essential that test users understand the extent to which a test’s validity may be affected by
coaching or inﬂation, and how warning statements, biodata elaboration requirements and
correction scales might be useful if such tests were used in a college admissions context.
To that end, this study tests a sample of freshman students, most of whom are in their
ﬁrst semester of college. With this sample I will be capturing information on individuals
who have not yet had college experiences and are very similar in age and experience to
those actually applying to college.

Much of the research on faking has focused on personality tests, allowing enough
studies for Viswesvaran and Ones (1999) to meta-analyze the extent to which personality
trait scales can have their scores inﬂated. They conclude that in between-subjects design

studies of the Big Five factors and social desirability scales, faking good instructions in

 

laboratory settings result in response inﬂation of about half a standard deviation across
the Big Five, and more than one standard deviation in social desirability scales, compared
to those with instructions to respond honestly. As is the case in employment selection, by
inﬂating scores, some applicants to college may be able to distort and thereby improve
their ranking in comparison to others, consequently affecting who is selected, under a
top-down selection system (e.g., Rosse, et al., 1998).

While identifying inﬂation is one avenue toward ensuring a reliable selection
process, another is discouraging responses that are dishonest or untrue. Following is a
review of literature related to inﬂation that is relevant to biodata measures and situational
jUdgment tests. I discuss the effectiveness of various methods to reduce inﬂation, such as
Warning statements and biodata elaboration requirements, and the use of items with
Characteristics that make them more resistant to inﬂation.

I address identifying and correcting for score inﬂation using various scales. I also
e"(a-thine the impact of test coaching, as coaching will almost certainly occur in the event

that any noncognitive measure is used in high-stakes decision-making. In many cases I
haVe tapped the literature on personality and integrity tests when it is likely to apply to

responses to biodata and situational judgment items.

 

LITERATURE REVIEW
Deﬁnitional Issues
The literature covers a range of studies that describe personality test score
inﬂation somewhat generally as faking (see Viswesvaran & Ones, 1999). Score inﬂation
could be partially explained by a number of effects that are not always differentiated.
Effects sometimes classiﬁed under the umbrella of response bias are socially desirable
responding, job-desirable responding, self-deception, impression management, and lying.
Considering all these effects as one thing does little to further our understanding of how
and why individuals inﬂate their scores on certain tests. These different precipitators of
score inﬂation are each described below. = «the.
Socially desirable responding may occur when the characteristics being captured
by a test or test item are transparent to the respondent, and those characteristics are
regarded as attractive by the respondent or more generally attractive in the respondent’s
CUIture. Personality tests are criticized for being susceptible to socially desirable
r esPending, where conscientiousness and emotional stability factors of the Big Five, for
e"EEIII‘rple, are regarded as socially desirable and adaptive. This transparency potentially
IIla-kes personality measures, such as those capturing the Big Five, susceptible to
inﬂation. On average, the Big Five measures have been shown to be inﬂated by about
hal f a standard deviation under instructions to fake good (Viswesvaran & Ones, 1999).
To be able to control for this tendency, social desirability scales have been created to
gather information about the tendency of individuals to respond in a socially desirable

allrler. However, these scales themselves are fakeable, with Viswesvaran and Ones

 

 

(1 999) showing in their meta-analysis a faking effect size of more than one standard
deviation on social desirability scales.
Paulhus (1984) presents evidence of the two-component model of socially
desirable responding; self—deception and impression management. Self-deception is the
unconscious inclination that an individual has towards claiming that desirable
characteristics apply to them. Impression management is the conscious dissembling that
an individual engages in to present a favorable impression. Paulhus’ Balanced Inventory
of Desirable Responding (BIDR) is an example of a social desirability measure that
Captures both dimensions. Paulhus demonstrated that impression management tended to
be more susceptible to situational change than self-deception. It is this impression
management component that is regarded as being most closely linked to faking. While
these two factors of social desirability are conceptually useful, they have not consistently
been empirically supported by other researchers, and there are more nuanced ways to
100k at them, including the positive and negative attributes of speciﬁc items (see Paulhus

& Reid, 1991).

Job—desirable responding reﬂects the recognition that individuals understand that,
for particular jobs, different job behaviors or characteristics will be appropriate and
desirable. If the test being taken is capturing traits that are transparent to the respondent,
then respondents who understand which characteristics are desirable in the particular
Work context can respond in a job-desirable way. Dwight and Alliger (1997) show that
j Qb‘relatedness of integrity items was positively related to their fakability, and this may
b 6 because job-relevant items are more transparent, which makes job-desirable

r
asponding easier. Miller (2001) describes how coaching for a test can focus on traits

 

l'k

 

desirable for a particular job, where job applicants can modify their responses to a test (or
interview) to suit the target audience (e. g., an HR recruiter, a direct supervisor).

Lying is a term that should be used with caution, because it implies an intent to
deceive. Nonetheless, the term has been used to refer to people with scores on “lie
Scales” that in some cases simply comprise unusual responses rather than responses that
are necessarily lies. Scales that are made up of bogus items reﬂecting ﬁctitious
experiences that the individual could not possibly have had (e.g., Anderson, Warner, &

Spencer, 1984), are the closest that research in this area gets to capturing outright lying.
By identifying these deﬁnitional issues I seek to point out that being too quick to
simplify the issue of faking may result in missing separate and important factors in score
inﬂation. The present study examines score inﬂation in two tests whose data show initial
promise for use in a college admissions context: biodata and situational judgment
inventories. Inﬂation on these two tests may be a result of any number of the different
factors described previously. Social desirability is evident in certain biodata and
Situational judgment items, where some responses will be regarded as more socially
acceptable than others. For example, it may be apparent that an item addressing
mmtiCUItural tolerance has certain response options that reﬂect socially desirable
behaViOI'. Similarly, there are biodata and situational judgment items that reﬂect the
typical expectations of a college student, and those with information about those
expectations should not ﬁnd it difﬁcult to identify questions and response options that
reﬂect deSirable behavior, such as attending class. Some individuals may have a natural
inelination to present themselves in a positive light, and this tendency toward impression

management is expected to vary across individuals, even when they are faced with the

a‘l‘ai.‘

L.‘~"H

 

Same questions and situational factors as others. These various situational and individual
facets of social desirability all have a place in research on non-cognitive predictors such
as biodata and situational judgment inventories.

Having provided some background with the deﬁnitional issues that relate to this
reseaI‘Ch, I will now address the literature on two tests that are of interest in a college
admiSSion context: biodata and situational judgment inventories.

Biodata
Biodata measures comprise items related to the examinee’s background and

experiences, and they have long been used as a tool in personnel selection (see discussion

 

in Stokes, 1994, in The Biodata Handbook). While biodata items have demonstrated = ‘4‘”
utility in adding to the information that we have about a candidate, they are not
impervious to inﬂation. Lautenschlager (1994) discusses a selection of twelve biodata
studies that support the notion that individuals may distort or dissimulate on biodata
items, even though they are designed as measures of actual background and factual
experience. Studies have used a range of participant groups, including job incumbents,
jOb applicants, and students playing various roles. Accuracy in biodata has been
Operational ized in different ways: correlational accuracy, level of mean differences, and
absolute accuracy. Correlational accuracy, according to Lautenschlager, refers to
consistency in correlation with an external criterion. Level of mean difference refers to
the: variation in the mean score between participant groups in studies where there are
diﬁ‘erent Conditions for different groups. Absolute accuracy refers to consistency in

responses at the individual subject level.

Lautenschlager discusses twelve studies using biodata items that relate to faking,
going back as far as 1950, providing a map of the development of research in this area.
Early studies, such as that of Keating, Patterson and Stone (1950), capture accuracy of
biodata by calculating the correlation between the report of the individual and report of
the supervisor. Correlations were very high (as high as .98 for duration of employment).
while it is possible that correlations would be high if everyone was inﬂating their
responses equally, I suspect that in this case such strong relationships probably occur
because of the clear veriﬁability of the information requested. Today’s biodata items

tend to cover a broader content area, they may be clearly job-relevant, and at the same
time be more difﬁcult to verify as many experiences take place independently of the
knowledge of others, making them more susceptible to inﬂation. Regarding the type of
biodata items that may be susceptible to inﬂation, several studies are illuminating. Klein
and Owens (1965) conducted a faking manipulation of a biodata questionnaire under
instructions to respond as would a typical, creative research scientist. Under this
maniplﬂation, this study found item-type differences, where objective items were less
susceptible to inﬂation than subjective items. Doll (1971) extended the idea of item type
by including continuous (Likert scale) and noncontinuous (multiple choice) items, and
found that continuous item responses were more susceptible to faking. Also included in
this study was a warning of the presence of an existing lie scale, which also reduced
faking. A study conducted by Cohen and Leﬂ(owitz (1974) concluded that the MMPI’s
K- scale could identify people who may be responding in a socially desirable fashion.
Thornton and Gierasch (1980) experimented with an empirically keyed biodata

inVentory, and they concluded that empirical keying did not result in the inventory being

 

any less vulnerable to faking. Empirical keying, however, has since been demonstrated
to be one effective method of limiting faking (Kluger, Reilly, & Russell, 1991). Pannone
(1 984) attempted a different approach to identify faking — he included a bogus item that
asked about experience on a nonexistent piece of equipment. This method continues to
generate some interest as a means of identifying faking.

Graham, McDaniel, Douglas and Snell (2002) provide another useful review of
biodata studies that relate to faking and validity issues. They cite the study by McManus
and Masztal (1993) who used Mael’s (1991) taxonomy to investigate the relationship of
item characteristics and validity, concluding that historical, external, objective, and

veriﬁable items are less susceptible to faking. Mael’s taxonomy has also prompted other = ..,...
research to be discussed in more detail later (see Composition of items).

Stanush (1997) conducted a meta-analysis to establish the susceptibility of self-
report measures to faking. She concluded that self-report measures are susceptible to
distortion under two experimental conditions: under instruction on how to self-present, or
under a condition of honest responding (d = .64). However, inﬂation as a result of
instructional set was larger than that explained by motivation, which Stanush uses to

supp“t the argument that the validity of self-report measures is not likely to be

negatively affected by real-life motivation. I question this conclusion, because it is

difﬁcult to capture all aspects of the effects of real-life motivation in a study, and
evidence about mean differences is different than evidence for similar criterion—related
validity. Stanush (1997) does ﬁnd that biodata inventories tended to be more susceptible
to inﬂation than personality inventories (d = .94 vs. d = .45, respectively; see also

MCFal‘land, 2000). Studies of biodata accuracy have provided valuable information on

10

 

1'" L—rx

the usefulness of these tests; however, the present study addresses the unresolved issue on
the effects of inaccuracy in biodata on the predictive validity of the test.

Miller (2001) demonstrates that the Conscientiousness Biodata Questionnaire
(cBDQ), a biodata measure designed to capture the personality trait of conscientiousness,
was readily faked in a study where participants were provided with a coaching session
that provided trait-related information. While understanding the usefulness of biodata as
accurate or inaccurate is important, it is also valuable to attempt to reduce inaccuracy.

Kluger, et al. (1991) note that ﬁndings of response bias in biodata items have been mixed.
They argue that empirically keying the item response options creates items that are less
susceptible to faking, and also have statistical advantages. With empirical keying, it can
be difficult for fakers to guess or know where they will lose or gain points with faking,
whereas with item-keying, fakers can operate effectively once the direction of the
desirability of responses on the continuum is determined. They note that faking good
may also include presenting behaviors that are not socially desirable, but that are
perceived as desirable in candidates for a particular position (i.e., job desirable). The
anthers found that, generally, participants asked to respond as if they were real applicants
for ajob yielded higher scores than the honest group, had lower empirically keyed scores,
and more extreme empirically keyed scores. Gore (2001) also explores the fakability of
biodata questions and notes empirical keying as a potential way to reduce response
inﬂation in biodata questions. Considering that the biodata questions generated by
Oswald, et al., (in press) are not empirically keyed and have response options that are

largely Continuous Likert-type scales, similar to most personality self-report measures, it

ll

- ij‘th

 

is worth investigating whether these items are more susceptible to inﬂation, as the present

Study does.

Situational Judgment Inventory

Situational judgment inventories (SJ ls) typically comprise sample problem
Scenarios for which the respondent must choose a response representing an appropriate
course of action. A review addressing situational judgment items can be found in
McDaniel and Nguyen (2001). Motowidlo, Dunnette, and Carter (1990) are cited for
their primary study in developing situational judgment items. Items tap a broad range of
content areas and dimensions, and they can vary tremendously in complexity. While they
are most often presented written in paper-and-pencil format, administration can vary, and
may even include the presentation of video vignettes (e.g., Lievens, Coetsier, &
Decaesteker, 2000). Key characteristics identified by McDaniel and Nguyen are that
situational judgment inventories are typically correlated with ability and experience, as
well as conscientiousness, emotional stability, and agreeableness.

While situational judgment measures are useful in that they can address a variety
0f SkillS, and they add to the prediction of performance, the method could still permit
some level of inﬂation, depending on how the items are written. If SJ Is are found to be
less vulnerable to inﬂation than personality and biodata measures yet also show adequate
predictive validity, they may be more likely to be used in a selection context. Nguyen
(2002) fOund that respondents to a situational judgment test for customer service
selection were able to raise their scores when instructed to score favorably, depending on
the thaSing of the test questions. When the items were phrased in a fashion that required

a MOSt Likely/Least Likely response instruction, participants were able to raise their

 

scores. However, when the items required a Best/Worst response choice, respondents
were not able to raise their scores. Vasilopoulos, Reilly and Leaman (2000) examined a
sample of job applicants, and administered a battery of tests that includes situational
judgment items. All participants also responded to the BIDR impression management
scale, and based on their total score, they were then dichotomously categorized as being
high or low on impression management. Although not the focus of their study, f"
Vasilopoulos, et al. found that in comparing groups with low, moderate and high job
familiarity, the difference in mean situational judgment scores between those low on

impression management and high on impression management grew as job familiarity

 

grew. Mean differences in standard deviation terms ((1) were .18, .22, and .73, ‘ ""'
respectively. These results support the idea that situational judgment items are
susceptible to inﬂation due to socially desirable responding, and that knowledge about
the .lOb for which the situational judgment test is written is valuable in gaining a higher
score. However, this social or job desirable responding could actually be adaptive
behavior that reﬂects useful knowledge.
Now that I have described the characteristics of biodata and situational judgment
inventories, I will discuss coaching as a means of improving performance.
Coaching
Where there is a strong motivation to do well on a test, such as gaining admission to
COllege, people are likely to use whatever resources are available to them to do the best
that they can, including test coaching programs. Coaching can be defined as getting
outside guidance (White, et al., 2001), and such external interventions can improve scores

0
n Selection tests (see discussion in Sackett, Burris, & Ryan, 1989). Kulik, Bangert-

l3

Downs and Kulik (1984), in their meta-analysis of coaching, conclude that studies of
improvements in SAT scores reﬂect a d = .15 improvement as a result of coaching. For a
range of other ability tests, they found a d = .51 coaching effect size for test-retest study
designs, and a d = .27 coaching effect in studies without a pre-test. Ability tests are less
susceptible to coaching than are some non-cognitive tests. Alliger and Dwight (2000),
through their meta-analysis, show that coaching instructions and instructions to fake good
both tended to improve scores on integrity tests (d = .90 with faking instructions, and d =
1.32 with coaching for overt tests, and d = .38 with faking instructions, and d = .36 with
coaching for personality-based measures). As integrity tests are non-cognitive in nature,
these results may generalize to those for biodata measures. Miller (2001) provided a
brief coaching session to participants in a study of their responses to personality and
biodata items. Miller’s coaching consisted of explaining and deﬁning the Big Five traits,
describing how the traits are measured, and providing sample items. The trainer also
speciﬁed that the most important trait in predicting job performance was
conSCientiousness, although other traits were also covered so that the participants were
able t0 discriminate between personality characteristics. That study demonstrated that
brief coaching resulted in signiﬁcantly improved performance on the personality
dinlension of interest in the study - conscientiousness — as well as biodata items tapping
consCientiousness, along with post-training knowledge tests that were created for the
Study.
While tests that are pure g measures are stable, because of the stable nature of g,
DeI‘Sonality-related tests are not as reliable. As biodata and situational judgment tests are

Tl . . . .
0t Phre ability measures, but are instead based on personal experience and a broad range

14

of competencies, it is highly likely that coaching would produce greatly improved
performance on these measures, as coaching could make salient the most desirable
response pattern. This would allow the participants to modify their self-presentation
strategy to maximize their score.
Messick and Jungeblut (1981) found that score improvement was positively
related to the amount of time spent in coaching, though there is also a point of
diminishing returns, and there can be effective brief coaching. Messick and Jungeblut
acknowledge the variety of training that is included under the “coaching” umbrella; some
authors restrict their usage of the coaching term to mean practice on sample items and
last-minute cramming, while others use it to mean special full-time instruction that could
extend for months. The authors were examining coaching for the SAT, a cognitive
ability test that requires knowledge and skills typically acquired over a long period of
time. The effectiveness of brief coaching has been variable, dependent on the type of
coaChing provided. The most brief of the SAT coaching programs evaluated in the
MeSSiCk and Jungeblut (1981) study provided 30-60 minutes each of coaching for Verbal
and Math sections of the SAT, resulting in a signiﬁcant improvement in scores, even
though the underlying skills required for high scores on the SAT are supposed to be
acquired over long periods of time. Similarly non-cognitive measures are susceptible to
brief coaching. Klubeck and Bass (1954) found that 30 minutes of training to improve
perfoI‘rnance in leaderless group discussion was effective for those who were already high
in leadership skills, but less effective for other individuals, while Petty (1974)
demonstrated that 15 minutes of training could be effective in improving performance in

1e . . .
aderless group discussions. Sackett, et al. (1989) note that exercise-spec1f1c training,

15

which is effectively training to the test, rather than more general skill development
training, increases the effectiveness of coaching. The same authors also comment that
there has been an absence of research on coaching on personality tests. Some eleven
years later, Miller (2001) states that coaching on personality selection tests has received

little attention, and found that an exercise-speciﬁc training program that included sample

.w I- -I.‘

itezrns was effective in improving scores on biodata and personality tests. A 15-minute

coaching session that included time to complete a learning outcome measure was

L ‘I'AJ'..

suﬂcient to generate an effect size of d = 1.66 on the conscientiousness scale of the

 

NE O-F Fl when comparing the scores of those given coaching along with instructions to
1'.
1 ° 9!.“ '

fake to those receiving control group training and instructions to be honest (coaching and
faking instruction effect). The effect size dropped to d = .48 when looking at just the
coaching effect, by comparing those who received coaching along with faking
instructions to those with control group coaching along with faking instructions, to hold
the faking instruction condition equal across the groups. Effect sizes for differences on
the CBDQ varied by subscale, with the subscales measuring organization (d = 1.23) and
attention to detail (d = 1.14) showing the greatest effect sizes for the coaching and faking
instruction effect. The subscales of planfulness and deliberate/rational showed a
Coaching effect size of d = .60.

Cunningham, Wong, and Barbee (1994) conducted a study that included a
Speciﬁc explanation of the content of the test, effectively providing very brief coaching
on the underlying rationale of an overt integrity test. This coaching was provided simply
by providing written statements about the nature of the test prior to the test

admini stration. For example, the following statement was used to provide information

r

.0
{13" :wil5‘

'KL’q

‘V' ” \flhidadc

C... Is. v.)

1 _, l

|..| L‘. NLK‘AA‘A
o

 

,.: «<1 ‘1 "‘
r. 5)»..‘8‘ 465‘
. - ‘

:‘C'. if 311:1

b

 

q I
I)‘ ~ ~« ~ ~ -0
.L.v1.>t'>CC UL

‘u—

:kl-m;} “uni: s

'l"-" i

.- i in) ii ﬂ

I ..c _v "‘

 

s s -
.i . '\ Pity‘l‘: 0L

"“u “M 58.

u

I
'- ¢ 4 -.,._ +‘
3&1ch if

C+

: "lama-7?" '
:l‘. \iAALij‘ in

NE... '
V .
‘Ntii .mwrl)\ 7)
l' ' \'
1
s.
.3" ~. ‘
‘N‘ \ I. :.
ubtk Lu) DCVI'
E'H'I-v ;.' .
'WL

 

.V..

“in ‘

.,~. ~ ‘v-

.“-\. 4') - . .

m... x -“

‘5' .
a d

‘1

‘ u

 

 

about punitive questions: “For example, honest individuals tend to have relatively
punitive attitudes towards themselves and to those who commit crimes. They are more
likely to recommend punishment than forgiveness for those who commit crimes. You
may want to keep these ideas in mind as you take the test” (p. 650). Similarly, for
proj ective questions, participants were coached, “For example, honest individuals tend to
pro j ect the image that they are honest, and deny any temptations towards dishonesty.
They also see other people, both ﬁ'iends and strangers, as being as honest as they are.
You may want to keep these ideas in mind as you take the test” (p. 650).

This written form of coaching was effective in raising scores by about 10%, and

 

demonstrated that speciﬁc information about the test led to improved performance in ‘ "*""
related areas of the test (in this case, information about projective tests leading to greater
improvement in projective questions, and information about punitiveness leading to
gr eater improvement in punitive questions). Considering that coaching could make the
dim'ensions being examined in a biodata or situational judgment test more overt (as
aCCOI'nplished by Miller (2001) for biodata), and overt tests of integrity have been shown
to be more easily faked than covert tests (Alliger & Dwight, 2000), relatively brief
(:0 acT'hing on the dimensions captured by biodata and situational judgment items is likely
to be effective in raising scores on these tests.

Another issue that needs to be examined is whether there are differential effects of
coaching for high and low ability applicants. Kulik, Kulik, and Bangert’s meta-analysis
( 1 984), for example, found that effect sizes of practice on aptitude tests were greater for
higher ability (d = .82) and middle ability (d = .40) samples than for low ability groups (d

\

\ ‘ 1 7) on identical tests in a test-retest study. Also, ethnicity may have an effect on

17

 

 

 

 

‘0'.

\

u
b
'1

' O
-.-‘,"'"‘L‘ J 5‘“

V“
.1-
$3....

-I l'
”I "u . ‘ .
jbunu‘“\

v“"""ﬂ REF)"
- mu
'

D
“‘blnd“"
w

.l
. qu'c - ,l
1"”? .‘AEJD '

. I
p...”
v

‘ 9
'r‘ '.'§"".?J-* "" .
.h...~.~\.ulb\o “as:

O - --O‘ 1" ‘L‘V‘
l
kvé.‘ tL‘ SK :5.”

v 1 .

liﬂ”\ - "a1 vb

‘..b.\-db .- “in.“
V

l-lill C03;

. Q .
' ;"'P.v in .
iL-y.....: _“ D‘

‘ also ' . ‘
‘ g" \h P‘
.....5 .33“. En in:

. 5n"; - .
u‘udl' Ad”. '
:~ ‘IIibblk‘

-“'—“ win. a \

, .
K ‘uv... v
4,. . yv- .
funk“; ll 5‘.
v

‘ '\ I u

-5 . II 1
u“. y \

C l- km

a 4“n9
~\ . ~URHC\lj

t Lu.
l

 

co aching effectiveness, namely non-Whites may be more likely than Whites to be in the
lower and middle-ability groups, and may be less likely to improve as a result of
coaching. Ryan, Ployhart, Gregarus, and Schmit (1998), in discussing test preparation
programs in a selection context cite a study presented by Holden (1996) that
demonstrated that coached Caucasians improved more than minorities. Differential
access to coaching along with differential results based on ability or other individual
di fTerences might exacerbate the problem of adverse impact in college admissions.

That coaching can be effective is cause for closer examination of the effects of
coaching on biodata and situational judgment tests for use in college admissions. While
some research has been conducted on the coaching of children for educational selection
(e-g-, Bunting & Mooney, 2001 ), and there has been an examination of coaching for
college admissions tests such as the SAT, as l have discussed, and MCAT (e.g., Jones,
1 98 7 ; Jones, & Vanyur, 1987), there is a dearth of work on the coaching of students
responding to biodata and situational judgment tests in a college admissions context. As
coaching would probably become widely available if such a test was to be used in a
SeleCtion context, it is important to consider changes in its predictive validity, especially
as there appear to be differences in students who undertake coaching. Those who seek
coaching on the SAT I tend to be from more afﬂuent families, have parents with more

f<>tTrial education, and are more likely to be Asian American (Powers & Rock, 1998).

one would expect that as coaching became available, scores of those coached would tend

to be elevated, and the predictive validity of the test may be reduced over time.

18

 

 

 

 

 

.. \i )0"

CORN.“

mt" '9’". '4'” if
3.2-)»- we

_;. . ~ n‘“
QLL Q’s LOAA)\1C

. ‘17
ac; '13: WEI; ca:

wan"

Q. '
”'5. ‘ 9"" \ i
”A ~0..\£\‘ - U
U

mgr». h 11 h 1
w‘...>\.erlnil'».\A-L‘

exp-Ejulipl‘yc ”,4
,s..,..i\.lu‘.. .s\
r v

‘5

1"") as, l:,‘.
.J . v .. '
:bhilhri.hlrl

r71 Hi. T

int and r

a .

-.

I
l

. .,' ‘
\.-L.‘.lu or; DC’V
. ..

J -
«T

.m
.x_,

t.

t “ C

\.
RE! 3
I»;\‘ ‘4 Q.

‘.
V»u A

“‘J

'2“-
-~\ P

 

 

Individual Differences in Faking
Conscientiousness is linked to achievement-strivin 7, and extraversion is linked to

status-striving (Barrick, Stewart, & Piotrowski, 2002), which suggests that individuals
who are conscientious and extraverted may tend to be more motivated to perform well on
tests that will earn them a symbol of achievement and status, such as getting accepted
into college. Not only are there possible individual differences in motivation, there may
a1 so be individual differences in ability to inﬂate responses, as emotional stability and
co nsci entiousness have been found to be positively related to socially desirable
responding (Ones, Viswesvaran, & Reiss, 1996). Studies that have been conducted on
faking in biodata have been criticized for making the assumption that everyone in a
faking condition inﬂates their scores to the same extent (e. g., Becker & Colquitt, 1992).
MCFarland and Ryan (2000) argue that there are individual differences in response
inﬂation on personality-related measures, and they conducted a study examining
i1"ICIiVidual variability in faking across non-cognitive measures. They conducted a study
Using a sample of students with honest vs. fake instructional sets as a within-subjects
factOr. The order of instructions was manipulated as a between-subjects factor. The
respondents completed an integrity test and biodata inventory under the manipulated
irzlStruction conditions, and then all respondents completed a self-monitoring scale.

P articipants in the faking condition were offered a ﬁnancial incentive whereby the top

1 5% in test scores would receive $15. McFarland and Ryan concluded that the
I>erSOnaIity characteristics found to be positively associated with faking on noncognitive

leaSures include integrity and conscientiousness, while neuroticism was negatively

19

.-
- oqnxﬂ’? \""
3. lm ‘
yam-w ~
,.

T

o

01’1"”
K".'\\

I ~03“ .x-.l...

1“ I
‘1‘. 0-1,) \‘r‘
'.4\. .JE \k-l‘u

Q

 

. I
u‘.. «a o. 4 _'-
’ \H. ...'..'l

.5; ~

 

, ,‘ 1 1'
met ".v ,4 .- ‘,.
‘5 .‘Ji...|.L.«14 9..
I»:
l
f
: Doggy-r-léi‘
; ..v}\l..-Aul\ n.
J
I.
P'ta‘i» ‘
uC.r. .U

 

 

"o ~o . . ~
«.33. {'03 if
I

‘ i

l" a. . “I“
4 . ‘

“midn‘ 11

p., ..
’ “.-

«.1,» \""\'

»-H-u.b§

I

Mersman and Schultz (1998) suggested that the ability to fake is an independent
construct. They conducted a study with a sample of students who completed personality,
gel f—monitoring and social desirability scales, and the tendency to fake good was captured
by establishing consistency with Saucier’s (1994) Big Five Mini-Markers under honest
arnd fake conditions. Mersman and Schultz found that faking good was not correlated
with self-monitoring, impression management or self-deception, and conclude that
individual differences in faking capacity are unrelated to those constructs of self-
presentation. I view this conclusion with caution as in this study, the participants who

were told to fake did not have any real motivation to do so. As with many studies, the

honest respondents may not have been providing responses that were entirely without = NH"

inflation. The combination of these two factors in this study may have minimized the
relationships between faking and individual differences because the gap between fakers
and non-fakers was restricted.

It is apparent that people do not all inﬂate their responses to the same extent,
When required to fake, and there is at least preliminary evidence that conscientiousness
and emotional stability are related to faking (McFarland & Ryan, 2000; Ones,
Viswesvaran, & Reiss, 1996). However, further study contributions on what particular
trait or pattern of traits make some people better able to fake than others would be
helpﬁal.

Having discussed the relevant literature on biodata, SJ Is, coaching, and individual

dlﬂ‘erences, I will now consider inﬂation as it relates to validity.

20

43.1 -‘H‘. ,-

 

'1; i'

l '.
1....\...

1 A. V
.
n...b¢a‘- sL l

:1; - ‘1' n1

, .
Qt ~LIV~IA AAAU

 

 

 

' I
~o . _

’_ "ii‘ 0".
ﬁb~n.u‘\

I.
ut‘u

 

T“. r
0; t? , ‘
“U‘ 1‘: ‘.u
‘4.
VOTE 7.
l" ,'

 

Effects on Validity
It is apparent that the high value that applicants to college place on college
admissions is likely to precipitate motivated responding on college admissions tests.
Regarding ability, self-reports of ability have generally been found to be quite accurate
across a number of research studies (Mabe & West, 1982), and with g—loaded tests it is
di fﬁcult to inﬂate one’s scores beyond one’s ability level. Noncognitive tests, however,
are more malleable as the best response can in many cases be guessed. Zickar and
Drasgow (1996) acknowledge that as tests are fakeable, it may be useful to focus on how
to recognize the fakers after they have taken the test. One way to do this is with detection
methods that identify patterns of inconsistent responding. They used two samples in the
AB LE dataset from Project A. Polytomous item response models were used to score
responses, and IRT model fit was examined. They then computed person-fit indices to
idel‘lti fy possible fakers. They found that fakers who had been coached were easier to
deteCt than those who were simply ad lib faking. They examine a theta-shift model to
idellti fy fakers, conducting statistical manipulations to test for the effects of faking on
Vali<1i ty. Their results show theta differences between honest responders, ad lib fakers,
and Coached fakers, and they found larger effect sizes than shown by Ones, et al.,
Snggﬁesting that faking may have more negative implications for validity than has been
demonstrated. However, they did not report correlations between personality scales and
cute()mes under the different faking conditions, so the issue of criterion-related validity
Was not addressed directly.
Some authors claim that construct validity is not changed by score inﬂation.

Smith and Ellingson (2002), in comparing a sample of applicants and a sample of

21

.. u...”

 

 

 

 

ixqo'l'ﬁt‘i.‘ 15 :r

UN.“ wk 5“.
r

“.34...” J v 0"
' ‘ 9' :M .1.

outnuunbu
2" "V‘.7" ‘ ‘1
£9.10“ urpnkh
Jig. ' O.
‘39.‘ f) '
‘¥.EMAA1 tint A
”Cr‘hi
.AnJl‘re.

:‘Jdt’féi'

I ‘l ‘4
“:- 3‘2“. W‘ ”‘4‘
u.» mum Ll 11.1

train, 1
' i '31 ‘l .W“
.Nmuui .
I 'Uiu..

Ti."
‘ .
u.“ .I‘ 1!” 0‘“
‘9Ai’ p
c -U. :

1

"”9...
molut§ “a, I
- \ L

1'. ‘- ‘ ‘ .

¥4~\::|‘-‘I' _‘.
a“, '-

3535. .

.
.h. .
‘~'.‘-
N. 1
'hi

in“-

‘|\.§i .

 

students (non-applicants) found that applicants did not simply inﬂate their responses
across all scales, and the relationship between social desirability scales and personality
measures did not change signiﬁcantly across samples. They concluded that social
desirability is trait-based rather than situation-driven, and that construct validity is not
attenuated by inﬂation. However, as there were slightly different patterns for applicants
and non-applicants, where applicants scored higher on some positive dimensions that

were different from the dimension for the non-applicant group, that conclusion may be

premature.
Anderson, Warner, and Spencer (1984) note that there are mixed opinions about
t '- lir‘

the effect of inﬂation due to socially desirable responding. Inﬂation can be regarded as a
healthy adjustment to a speciﬁc situation, and some researchers have found that inﬂated
Scores due to socially desirable responding do predict performance in certain jobs (e. g.,
sales roles). Some researchers provide evidence that faking has little effect on criterion-
rel ated validity. Barrick and Mount (1996) conducted a study examining the effects of
sel f‘ecieception and impression management on the predictive validity of the Big Five.
USing regression analyses and latent-variable modeling, they concluded that, in two
8311113 les of truck drivers, conscientiousness and emotional stability were valid predictors
of Voluntary turnover and supervisory ratings of performance, and that the predictive
Vali(lity of the personality measures was not signiﬁcantly negatively affected by
cornirolling for self-deception and impression management. They do note that adjusted
Va‘1i<1ities were generally slightly smaller. Ones, Viswesvaran and Reiss (1996), in their
metEl~analysiis, conclude that correcting for social desirability does little to increase the

effe(:tiveness of Big Five measures as predictors, however, it would be inappropriate to

22

 

 

'
n

1.. ‘1'“ l 3‘
3‘.) 5mm "‘

w‘
A.”

. . ‘
in“ at a narrk‘

. I i "H
. l 9 9' -
"LC?” Ara-116‘ U‘

V"
j 5")“ 3
\AAQ'JMJJ.

. i l‘. .
3-7:: at :ha. I

Some rc

:. as next j:

-— 1;. .... ~
#:1313503 DC'

‘ t . ,
and _ i
Eada~1& £11.1r‘ \

'V
N.
5
A .

i’”.*;f- \ ‘ ..
VAN-ﬁ‘l. . ‘ ‘1

2‘, i__ .

‘ u

("7; 1 -
«u tornam
‘ “rwtot

;“i:q"l'n~ V ‘ .1
“NW‘JAD. “is:

n “CO-fin:

1 ‘ !
.‘U‘i-ﬁ ‘-
“i\a,\y ‘1
u

?.
AIL”.

l
‘ l

 

 

broadly claim as a result of their ﬁndings that faking does not affect validity. Their study
1 ooks at a narrow perspective — the use of speciﬁc social desirability measures in their
model, rather than measures that are more situation- rather than personality-driven. Also,
I suspect that social desirability scales do not fully capture the extent of inﬂation,
partially because these measures, too, are susceptible to socially desirable responding
(Vi swesvaran & Ones, 1999). They are also correlated with conscientiousness and

emotional stability (Ones, Viswesvaran, & Reiss, 1996), and they do not appear to

capture all that explains score inﬂation.

Some researchers are more suspicious of inﬂation in self-reports of skills and
abi liti es, viewing the inaccuracies as a real problem that may make the reports invalid
Predictors of performance, and less useful as differentiators between applicants (e.g.,
Smith. & Ellingson, 2002; Topping & Gorman, 1997). According to a recent paper by
Graham, McDaniel, Douglas and Snell (2002), “For biodata, the degree of prediction is
likely enhanced by the accuracy of the self-report information” (p. 574). They conduct a
Study comparing responses to biodata items under honest-responding and faking-good
conditions, with job performance as the criterion of interest. They categorize biodata
items according to various item characteristics, and demonstrate that criterion-related
Validity of items with different characteristics varies across honest and faking conditions.
For example, they show that for items rated as veriﬁable through hard records, validity

for tl'le honest condition was .16, but .02 for the faking group. Items rated as veriﬁable
through supervisors or coworkers showed validity coefﬁcients of .12 for honest

resDQnders, and .03 for fakers. It is apparent from this study that the item characteristics

as Well as the situational instruction set are related to validity.

23

 

 

 

(
_A\

o
Q‘I—-s§)ﬂ '

. ‘».s.;..‘

5. ...

, ,0 !‘
n- ‘\ ‘7'

6 J, L...
..

.r-D' \~,

U
bob ..¥u‘.
.

I].

.‘ -»---«~c '
l'.._' '“l

4....-.
v

«- --.--ii0'
ﬂash.

.3 .4. y

‘M r‘.1¢.
' \
”vs-Euru.
..

. . n0~

4
‘Ipyukmasb
..

.

F‘ﬁ ‘ \jno .

~>J¥m46 t
. .

.
‘.\ g i V
,. h -.
p. 4
' ‘Asn.

 

‘1le
h.
hug». -‘
“a‘..
u

 

9.
’-A\0\
’b}\ C
l.‘ .
\
lg“,
lu‘ a..x,
‘¥.l 'lr
‘—

McFarland (2000), who found that the criterion-related validity of
conscientiousness was not signiﬁcantly affected from a statistical perspective, stated that
the .05 difference in validity between the honest and applicant conditions might be of
practical signiﬁcance in a real selection context. Stark, Chemyshenko, Chan, Lee, and
Drasgow (2001) cite disagreement about the effects of faking on the validity and utility of
personality tests. They look at both trait and situational perspectives in faking, and using
the Sixteen Personality Factor Questionnaire (16PF), in a sample of applicants and non-
applicants, they showed signiﬁcant differential item functioning (DIF) across groups of
applicants and non-applicants, where the items appeared to operate differently for
applicant and non-applicant groups. They conclude that the construct validity of the
1 6PF is negatively affected by faking among applicants.
Ones, Viswesvaran, and Reiss (1996) note that job applicants may be faking in a
j 0b desirable manner rather than a socially desirable manner. Presenting oneself in the
best possible light in a college admissions context could be in a job desirable fashion, as
Well as in a socially desirable direction. In the college context, the issue of job-desirable
bel‘lawior is limited by the individual conception of what is valued as effective college
performance. While some may consider job-desirable college performance as
academically focused behavior, others may recognize the broader goals that include
leadership behavior, interpersonal skills, multicultural tolerance, artistic appreciation, and
so 011. Such differences in perceptions of the criterion space may affect individuals’
per(heived job-desirable behavior, and possibly also the criterion-related validity of the
tests- Coaching is one way to provide information about what is job-desirable behavior.

1\’1iller(2001) demonstrates that those who receive coaching are far better equipped to

 

 

 

 

. I '

. 0h ' F

9‘.“ i" :C L
“5‘9 “A

. . r
-v. -?-\

.‘.b _C\ .

.
' -. l
3 ~. .,.,_

.. “jag“

LAI-
b

. j ‘

{ix-“N ,.

~5.....\ l' a
' H

i'” 'gns‘. _
-."“‘.tlll l\
‘ .4

‘31.:‘hv‘. . I
“'W 4'. \dl:

 

‘(alk'
“-14. ‘Hl'b‘
A‘ .

"‘k

Hal 1"

 

perform well on personality and biodata items, with those who received coaching rising
to the top of the distribution of performance scores, and those without coaching falling
toward the bottom. In a top-down selection system based on test performance, coaching
would have had a real effect on who was selected in this context, by ﬁlling the upper
bands of performers with individuals whose performance has been enhanced by coaching,
while those who have not had the opportunity to improve their performance through

coaching might fall into lower performance bands, and ﬁnd themselves not getting

selected under a top-down system.
Faking is apparent in testing in many selection contexts, and if faking as a result

0 f coaching affects test validity, it is essential that steps are taken to discourage faking,
identify it, and correct for it. However, score improvements after coaching could be a
I’eSLIIt of an artiﬁcial increase in the score, or a real improvement as a result of task
familiarity (see Sackett, et al., 1989). Our goal is to have admissions tests that have both
c0flstruct validity and predictive validity. Students are motivated to perform well to gain
admission to college, and if biodata and situational judgment tests are to be used in an

adnli ssions context, we need to understand better how they perform under coached

conditions.
Having noted important areas of the literature on inﬂation and possible related

proIDIemS, I will now discuss ways in which researchers have attempted to control

inﬂ ation.
Inﬂation Controls

As I expect that faking is possible with non-cognitive measures, and likely in a

Slhlation where there is a motivation to perform well on those measures, it is pragmatic to

25

 

 

 

 

* .

’ «A? I
"7,” mo ‘ ‘1th
ﬁtwn- --

'

’ .0, \‘
”aﬁfl‘.r
1‘ .5.\..->ub)

\\ 1""
Will;

. l

.1 "“7 ‘f‘ . r
“Jung ‘klll “Um
V

'9 an 39".
9‘ 51.n3

.i.‘.

c
>7)
"

(I

‘ I . 3
... ,, , ‘ , ..
Jud“, ,3: n

‘ V

"v 4,... a _.
,;‘ y
m ul)..L‘..C§. '.

: lII :~ ‘
'h'“ All GUT

.
8;.

nM‘r' .J .
Mutt, l: t"

try to prevent faking in the ﬁrst place. Three possibilities of faking control are the use of

warning statements, the requirement of item elaboration, and the consideration of certain

characteristics in item writing.
Warning
Warning statements may be a practical and effective way to manage inﬂation on

biodata and situational judgment inventories. Dwight and Donovan (2003), in examining
faking on noncognitive selection measures, found that warning statements were effective
in lowering predictor scores. They demonstrated that by providing a warning that
included the risk of veriﬁcation of dishonest responses as well as a negative consequence
for di shonest responding, the least faking on the California Psychological Inventory (CPI)
Would result. They showed an effect size of d = .75 on the CPI scale of well-being, and d

5

c - 6 l for dominance, when comparing those who were not warned with those who
I'eceﬁved an optimally effective warning. Hough et al. (1990) also emphasize the
importance of the use of warning statements, stating that the use of warning statements
‘Warrants greater attention as a method of reducing the amount and effect of intentional
distortion” (p. 582). In a sample of employees, Becker and Colquitt (1992) included a
WEI—thing statement that responses to biodata questions may be veriﬁed with other sources.
They found signiﬁcant mean differences between a test-taking group with no warning, a
fakin g group with no warning, and an applicant group who was warned. Vasilopoulos
(1 999) used a warning of response veriﬁcation in a study of a selection system that
inclllded personality and situational judgment measures, with a resultant mean drop in
SCOFes on three of the personality scales for warned respondents, but not on the situational

judgment scale. Warning statements are likely to be most effective in limiting faking

 

 

 

,3. ."> . \ '
. :f ‘b\ {33\

ﬂ "5.. u.
1

1
\
R 70"? \“L.
-,...A.- '1"
v

:«oa RY m 'I

3-.....8- .AAU
v

v
.o .1 I Q .
u§na in 4‘ 9

‘r t ”a.
.ub...‘ yOAAI‘
A

9 s
«“m .
u .t' “m

n..‘, v
I

~ -.

“"4-,‘. ' ‘
, .

tutu...“ A“
v "

ha‘ '
i f 0....

’ . V

-‘ -.c I: I‘:
v I

:‘f'dr' -L.
nuy, 3C ‘iLV‘F
.

93‘

when they include both a warning about the potential identiﬁcation of faking, and a
warning about the potential consequences of faking (Dwight & Donovan, 2003). While a
warning may be effective in reducing score inﬂation, the effect may vary from item to
item due to varying item transparency (Kluger & Colella, 1993); hence consideration of
item composition is important.
Composition of Items
Zickar and Robie (1999) conducted an item-level analysis of faking good on

personality items. They criticize the focus of the research literature on scale-level
analysis, and identify the need for item-speciﬁc examination as people “respond to (and
fake) individual items, not scales” (p. 552). Whether or not a test is fakeable depends on
the composition of the items. Becker and Colquitt (1992) note the discrepancies in
f‘ltlciings with regard to biodata items being faked and posit that the differences have to do
With the type of items. They examined a group of employees who were instructed to
either be honest, or fake good on a biodata instrument, and job applicants, who completed
the instrument as part of the application process. The authors concluded that the form
Was fakeable, and that those who were faking good or were real applicants did have

inf] ated scores compared to the honest condition. Items were examined for particular
characteristics that may make them more or less fakeable, and using the framework of
M331 (1991), they found that items that are more likely to be faked were less historical,
obj ective, discrete, veriﬁable, and external than other items. They were also more
releV'ant to the job. (Mael’s taxonomy is shown in Appendix A.) Similarly, Elliot,
LaVVty-Jones, and Jackson (1996) found that responses to objective tests of personality

Were relatively unaffected by instructions to fake. Dwight and Alliger (1997) conducted

27

 

nae: .

 

 

 

”1‘;
.g- ‘\ ‘\
lo...-

1,?
n)

i O-
-;_._‘q u.
. §\~.

o

o.‘.. 9' “f
h..*._.l,;,

 

‘~.s.)

a.

3?!

'.~M.‘_.
r

 

1“
”El“ “
3"; Wu.
ov‘u.“;
V
‘ t
.‘h
I‘Z‘il
‘ *u
‘7" ‘ 1..
\\i u
N» 4‘.“
. .
P‘!“ V “
“‘9‘. ‘
t“
V s ‘
‘ l
l.x 4‘“
.(‘\
n‘._ '
s‘;\ H..-
“—‘
»
fa.
“‘ \sA
_ _ \
'\

a study of ratings of individual integrity test items, ﬁnding that the perception that an
i tern would be easy to fake was related to the job relatedness (r = .50, p < .001) and
i nvasiveness of the item (r=.25, p < .05). In their meta-analysis of the susceptibility of
integrity tests to coaching and faking, Alli ger and Dwight (2000) conclude that the overt
tests were more susceptible to faking and coaching than were the covert tests. Biodata
and situational judgment items should be less fakeable where the items are more
obj ective, veriﬁable, and not clearly related to college performance. Mael’s (1991)
taxonomy of biodata items provides further ideas for item characteristics that may be
used sensibly in biodata items, including equal access and being controllable.
Respondents without perceived access to the biographical experiences to which an item
re fers might be less likely to inﬂate their scores than individuals who have had such
experiences. They may view such experiences as completely beyond the realm of
possibility. Similarly, individuals may be unlikely to inﬂate responses when the issue is
one in which they have little control. Graham, et al., (2002) examine biodata validity as
explained by item characteristics using Mael’s (1991) attributes. They concluded that
item attributes associated with validity are different for faking and honest respondents,
leaV’ing the authors skeptical about the possibility of writing biodata items that are valid
for both fakers and honest responders. For honest responders, item attributes most highly
as’Sociated with validity were those that were controllable and concerned with the
indi\/idual’s feelings or attitudes (r = .22), veriﬁability through hard records (r =.16), and
Veﬁ flable through supervisors or coworkers (.12). For fakers, items that were veriﬁable
throUgh friends (r =.11), controllable and concerned with actions that the individual

chooses to perform (r = .07) were associated with validity. Schmitt, Gillespie, Kim,

:2”-'7'\
‘. “a-“ ‘
I, 1
ll
~r‘ r )“

£4; ts$..

’1 ‘1

t ,L -‘w. .
“laud“- \.

Q
\- ‘s‘. ‘~
"\
3&1”...
A

3f‘,a)",3"
u..\u.t\lr

 

n 7...,
b

-
«aka... a
.

(1:

 

 

 

Ramsay, Oswald, and Y00 (in press) found that biodata items that were more objective

and veriﬁable were less correlated with the participants’ BIDR self deception and

impression management scores.

Requiring elaboration within biodata items is another method that may reduce the
l ikelihood of individuals inﬂating their responses. In requiring elaboration, the test item
speciﬁes that the respondent should provide examples that reﬂect evidence of the level of
experience that they indicated, rather than simply indicating the level of experience. An
example of a biodata item with an elaboration requirement is shown in Table 1.
Table 1

Sample Elaborated Biodata Item

The number of high school clubs and organized activities (such as band, sports,
newspapers, etc.) in which you took a leadership role was:

a. I did not take a leadership role

b. 1
c 2
d. 3

e. 4 or more

If you answered b, c, d, or e, brieﬂy indicate up to 4 clubs or activities and the

nature of your role.

 

 

 

29

 

 

 

 

.3. H...

A»-.. r.

" 'A‘ .
"' Abhn

 

 

in." _‘
n ’P‘
.

x

5:1"
I

 

Schmitt and Kunce (2002) found that by requiring item elaboration in a biodata
test, they reduced mean scores by .7 to .8 standard deviation units. They also found
carry-over effects of score reduction in nonelaborated items in the same instrument. By
requiring elaboration on biodata items, it should make the items less likely to be faked.
This effect of elaboration on biodata items was found again in a more recent study by
S chmitt, et al. (in press); however, the carryover effect was not found. Schmitt et al. (in
press) showed a difference in mean scores of .8 standard deviations between elaborated
and non-elaborated items, where elaborated items had lower scores. It may be valuable
to require elaboration to biodata items as a means of limiting faking in elaborated items
and possibly, through a carryover effect, in non-elaborated items. However, one should
bear in mind one possible explanation; the lower scores as a result of elaboration could be
because the individual involved simply can’t remember the details of the biographical
eVent, and they may limit their response to examples on which they can recall enough
detail to elaborate.

While attending to speciﬁc item characteristics when generating items is an
important way of discouraging inﬂation, it is likely that some individuals will still inﬂate
their responses, and I will now discuss the identiﬁcation of inﬂation.

Statistical Control using Social Desirability 0r Bogus Items
While I expect some score inﬂation to take place, it would be useful to be able
identify those who tend to manipulate their responses. Inﬂation identiﬁcation and control
DOSsibilities include the use of bogus items as a lie detection system, the use of scales that
capture impression management, and the use of indices of improbable responding. Once

faking has been identiﬁed, it is possible to statistically control for it by partialing out the

30

 

 

 

 

“ 0' ‘1.
,. .
5.3-3 V‘

‘a I

' - '2 s
.W‘aL. ' ‘
I.

.
wio'Hx';

' H‘»\,|-~n

..7.‘ 7 ‘.'J
\

.- ’V'H [.3
' S

>I-L H. h
\—

flil. A

..

‘~-.s..

33.1...1'
.

. “
.-'}l. 10o,,
si¥u shall.

3"?!” I.
“t.'\. .6.“

"L .3 .- « - .
K3,. 3\\.\.

- ‘
v.3, 'nhn.
. . ,1
b‘ 4.1-“;
.1 .
“Paw-1‘4
"" duct
5..

..
ﬁrm»,
.L\.l.) “6..
"‘ﬂ.:‘-“‘
My .15)
x.) .;

.-
. “i" \‘F. ‘
‘L‘N. Q.-

effects using the various faking indices as covariates in estimating the validity of biodata,
SJ 1, or personality measures (see Barrick & Mount, 1996).
Bogus items
It is apparent that individuals do inﬂate responses on certain tests in certain
contexts, and that if one expects that faking will happen, it is practical to plan for it.
Using items that capture the claim of impossible experiences is one way to identify
faking (e.g., Alliger & Dwight, 2000; Anderson, Warner, & Spencer, 1984; Pannone,
1 984). Anderson, et al. used such bogus items interspersed among real job task items in a
test of a job applicant sample to establish whether applicants were faking. The level of
their afﬁrmations of experience in bogus items was used to adjust downward their
experience score on real items. This was justiﬁed by the correlation between the inﬂation
Scale scores and examinations for the particular occupational classes. They concluded
that inﬂated scores are pervasive. In their sample, 45% of the participants indicated that
they had observed or performed at least one bogus task. Anderson, et al. also concluded
that the inﬂation scales had high reliabilities (average alpha of .86) and that the bogus
items were useful. In a secondary component of this study a sub-group of the applicants
Were also examined on a typing test after being asked how many words per minute they
c0111(1 type. The authors found that applicants did inﬂate their self-report of typing skill,
and that this inﬂation correlated with the scores on the bogus items in the more general
S'kills measure. Correcting for inﬂation increased the usefulness of the test in predicting
the criterion. Alliger and Dwight (2000) generated bogus items similar to those used by
Anderson, et al. (1984). They state that the bogus scale is “a measure of fabrication, the

demonstrated willingness of the person to create information about themselves that has

31

 

 

 

 

q. \ﬁ
.~ «.v‘f‘») \ l-
J. LUJ‘W“

”:14J"..lrll ‘.

*“$:~. near : ‘I
.9 Sue grew
A as u “‘
b
r — h "‘
is“: He Laps
... "v V
35»; 3L3'ru
<- «w
.UMJ.

“.I{;.1‘t 1m v‘v!

.evly Aliyn‘
. .

‘12- 44.
. um. Lu. or.
J’R‘Vm‘ ) 4'.

llbt Lu

-. .5..,'"
A

’3') ‘ l- «. " -
hire-53. SOLE;

cl... ~
"' 1'10.
Dakx 3;“: 'b\

I», . I
1 'Al““\
“1 suuL

2’13
H.."

 

 

no connection to actual experience, rather than a measure of subtle faking (e.g.,

exaggerating positive attributes)” p. 10. They found that the optimal warning condition
had the greatest impact on scores on the bogus item scale (d = .41) and suggest that bogus
items are tapping a speciﬁc aspect of faking — the tendency to fabricate information.
Social Desirability and Improbable Response Indices

Social desirability scales have been used to identify those more likely to present a

positive impression. Hough, Eaton, Dunnette, Kamp, and McCloy (1990) agree that
people can distort their responses, having examined personality constructs and the effect
Of response distortion on their validity, with four response validity scales that they
created: social desirability, poor impression, self-knowledge and nonrandom response.
The social desirability scale was patterned after other unlikely virtues scales, to detect
those trying to appear more attractive, where the areas of experience being examined
Were education, training, job involvement, job proﬁciency, delinquency and substance
abuse. They concluded that people can fake when instructed to do so, and that the
reSlxmse validity scales detected the distortion.

Ellingson, Sackett, and Hough (1999) attempted a correction using a social
desirability scale score and found that the corrected score was ineffective at improving
validity. In their within-subjects design across faking/non-faking conditions, they used
the SD scale to correct faking conditions to see whether scores approximated non-faked

Scores. However, Paulhus (1984) emphasizes that the situation determines whether both
c0Il'lponents of social desirability should be controlled. As self-deception is viewed as a
Stable trait, there may be little value in controlling for it in examining the effectiveness of

a test that is being used under speciﬁc situational constraints that are affecting the

inclination to provide socially desirable responses. In such a case, it may be more
relevant to control for impression management to reduce the effect of conscious faking.
Controlling for social desirability with a scale such as the Marlowe-Crowne may not
provide useful information as the scale is made up of both factors in social desirability;
impression management and self-deception (Paulhus, 1984).

Barrick and Mount (1996) examine the effect of self-deception and impression
management on the predictive validity of the Big Five when looking at turnover and
supervisor ratings. They found that distortion of personality constructs occurred through
sel f—deception and impression management, but that the distortion did not negatively
aff‘ect the validity. Christiansen, Gofﬁn, Johnston and Rothstein (1994) attempted to = W"

correct for inﬂation on personality scales by partialing out the effects of the inﬂation
scale, however, they were not able to demonstrate that such a correction affected the
Val idity of the personality measure as a predictor of performance ratings.

Other researchers, however, have found that criterion-related validity and
selection decisions can be improved by correcting for faking. Hough (1998) used results
0 f at) “Unlikely Virtues” scale to either adjust the individual’s score or to remove them
from the applicant pool, ﬁnding both techniques to be effective in reducing the impact of

See re inﬂation on hiring decisions and increasing the predictive validity. Anderson,
Warner, and Spencer (1984) attempted corrections after using bogus items that are true
lie scores and found that corrections improved the test’s usefulness as a predictor.

Becker and Colquitt (1992) corrected the scores of their motivated group (the

app] icants) by using an inﬂation index. The index was created by calculating the mean

d1 ff‘erence between scores for applicants and non-applicants. This index was then used to

33

 

 

 

"‘ 'm ‘7‘ "‘V ‘WIV‘

 

. I
‘ .. ~ on )
4" l ' 1|
Burl-Oil). “It l

stormed sc-
rim \in . 5
it sane cucr
scores to test v
.‘..,..,' 'hch «. .2

I
”luv “A"; b ‘H

_l

.3?“ ”I, ‘9 no
“l“kl-chﬁd

.'\i:h&._

 

 

 

adjust the distribution of scores. When corrected scores were used to make hiring
decisions, the correction changed the rank order of candidates to the extent that 17% of
the candidates were faced with a reversal of the hiring decision made under the
uncorrected scores. This correction effectively neutralized the inﬂation, however, one
concern with such an approach is that it assumes that everyone has inﬂated their scores to
the same extent. Christiansen, et al. (1994) used the Krug (1978) approach to correcting
scores to test whether there would be different hiring decisions after correction. They
found that without correction, discrepant hiring decisions were made in 16% of the cases
at a selection ratio of .15, and they also note that overall, candidates moved an average of
2 3 positions in rank (SD = 36) when corrections were used. Rosse, et a1. (1998) also

d emonstrated that by inﬂating scores, applicants may be able to affect who is hired.

Although there are mixed findings about the effectiveness of corrections, they
may be a practical way for test users to examine the effect of score inﬂation on criterion-
rel ated validity. Such examination will be important if biodata and situational judgment

items are to be used in a college admissions context in predicting student performance.
Study Development
To summarize, traditional measures such as high school GPA and SAT/ACT
Scores are used to make college admissions decisions, and tests such as biodata and
Situational judgment could also be used (Oswald, et al., in press). However, it is apparent
that biodata scores can be inﬂated, although they may not be as susceptible to
manipulation as are personality scales (Sisco, 1999). Situational judgment scores can
also be improved by faking (Nguyen, 2002). Performance on these measures is

moderated by the situational factors of the conditions under which the tests are taken.

34

:1 Hi?“

N

 

“it," f“,

\ ..- u-
v

. . «H
'3 i‘
a» ‘

gas-
-

.v.’ .33..
u»...‘\..»..

- a 4
'1'. “\u u

.
‘ —v "--
JML'L t“;

.
ii’j- g“,

V“‘-\~.A.
-‘
~“‘n 1
u\"' Q-
:‘uw.
~ ‘
'l ._V
‘ﬁl 3: r
V
.,.
- h.
“l ‘1’”—

u‘ ‘CEH

\‘W- .
«m, 4 w
b L
‘ l.
.A‘l “

 

 

 

Being motivated to perform well, experiencing coaching to be able to perform well, and
receiving a warning statement about not exaggerating responses in an attempt to perform
well all may contribute systematically to the score that an individual achieves. These
factors are expected to interact, with some combinations providing a stronger likelihood
o f high scores than others. There are also individual differences such as
conscientiousness and emotional stability that are expected to relate to performance on
biodata and situational judgment. While cognitive ability may also affect faking, ability
is not the focus of this study. Rather, I consider the individual differences of personality
traits as they relate to performance and inﬂation. Item characteristics, such as biodata
e1 aboration requirements and item response verifiability, also play an important role,
Where certain characteristics may make some items more vulnerable to inﬂation.

While there are mixed views on the effects of faking on validity, recent work by
Graham, et al. (2002) demonstrates that criterion—related validity of biodata items can be
negatively affected by faking. Assuming, therefore, that inﬂation could be problematic, it
may be helpful to control for score inﬂation. This could be achieved by identifying those
Who respond to bogus items, those who score highly on an inﬂation index developed for
SuCh a purpose, and those who score unusually highly on impression management, and
c=<>rrecting for the inﬂation. Inﬂated scores may result in a change in the rank order of
reSI:>(>ndents (e. g., Frei, 1998), and subsequently, a change in who would be selected in a
toI>~down selection system in admissions; a critical issue in a college admissions context.
By Selecting based on performance on these tests, we hope to be able to choose the best

cal1didates, i.e., those with good actual college performance (college GPA and

att erldance).

35

 

r..._.w~.—._:_ - . a v..a.a.-.e.~.:.:a~
~==-~>~n~=n

:3-7-93A-
-uaui nugv-uaunnninJ-anu

 

 

 

<m0
owgoo

 

 

 

oocaecotm
owe—BU

 

 

.3392—
..o 5:55»

Ht?"

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

meats?
wcﬁomoo
$2.“
cougcoE
woo—33:5
Quota—Em
v:
@8235 +
E0833“ w$I
owozoo 3:03.256 guaca—
cocoosmm Ego—m coogoéq
5389 88h.
2.03 + .8 ocean—Ston— mm
03
$38 A bsgoﬁom
8:85:82 89:95 _
coca—*5 Est—2“.“—
Ezﬂuon . .

.8“ 558585

338% Nemioozob

_ Saw?—

 

3558820
Eu: “mop.

 

 

 

828.525
:3:

36

  
  

 

u'SVW-‘rr v

   
 

. .--¢—.\-.— ...--..

firm:
{3: tailor q‘.
firs

inﬂuence 1

V .
a‘uI ~ ivy-I.
’=5 4 L
3.... ’b‘ll

’ ‘
V739 '"I . ‘V;
And .111.“

Set

most effcc

”11.,4'3 “ .-
.--. llu~ kt}.

‘I
a}...
u? "\f‘?' 9
“can‘t“: L

”:36 1"‘7‘ 9?
:Mw“

t n‘i‘igm ..
“‘an -1
”L, 5

I
p

fi‘wl ,‘ .
nn.$u‘ .) CI

5\-~,..11

V

(I . '. ‘, ‘
”W \ .

- “A, it.

1"‘Yplvr,"
u
a“ C“
"N
~ ‘rylh
"‘m‘l ‘. .‘
s. “ ' L.

Figure 1 provides a model linking the various components of the study, to address
the major questions deﬁned below.

First, how do different situational differences such as testing conditions
inﬂuence how an individual performs on biodata and 8.115? Situational factors
addressed in the preceding literature review included: incentive to perform well so as to
gain admission to college, coaching as a way to know how to achieve the best score, and
warnings to prevent score inﬂation. By examining these factors and their interactions,
this study will enhance our understanding of the susceptibility of biodata and SJ Is to
score inﬂation. These factors are considered in examining hypotheses 1 — 4.

Second, what are the individual difference characteristics of people who are
most effective in inﬂating their responses? We do know that those who perform well
in general are conscientious and emotionally stable. The existing literature does not
provide conclusive evidence on who is most likely to fake, and so this study will
contribute to the development of theory on individual differences and inﬂation. Certainly
some general predictions can be made. For one, it is expected that conscientiousness and
emotional stability will be positively related to inﬂation. Self-deception, which is trait-
related, is expected to be associated with inﬂation. While impression management is
typically identiﬁed as a trait, it is also inﬂuenced by situation, and is expected to have a
StI‘Onger relationship with inﬂation than self-deception. These questions are addressed in
testing hypotheses 5 — 6.

Third, what are characteristics of the test items that seem most susceptible to
inflation? As reﬂected in the work of many researchers, item characteristics can play an

important role is developing a test invulnerable to faking. By examining the

37

u‘. -h.f'

O .

 

3 W5?"

 

 

' 1
.'~ «
\F '.

A’s

,., .,

“
4.
win.‘ a.
r

:"‘_-,-91
sum“

l““" {7

«mu;
5

V n
3"... W o
.J‘5EQ ‘

H)?
"‘ “NA.

4

 

characteristics of biodata and SJ I items under varying testing circumstances, I will be
able to contribute to the theory of test construction, speciﬁcally for improved biodata and
SH items that are less vulnerable to faking. This question is addressed in examining
hypotheses 7 - 8.

Fourth, does the susceptibility of biodata and SJ Is to faking affect their
predictive validity if they were to be used in college admissions decision-making?
This issue is of critical importance if the tests used here are put into practice for selection
in an applied setting. I seek to identify differences in predictive validity and establish the
extent to which inﬂation affects whom would be selected. This question is addressed in
testing hypotheses 9 — 10.

To address the ﬁrst question of the study, regarding the situational factors that
affect faking, I propose several hypotheses based on past research. Different conditions
inﬂuence how an individual performs on biodata and situational judgment tests. We
know that people can and do inﬂate their scores when motivated to do so (Viswesvaran &
Ones, 1999), that coaching is effective in improving scores (e. g., Cunningham, Wong, &
Barbee, 1994), and that warning statements effectively suppress inﬂation (e. g., Dwight
and Donovan, 2003). Speciﬁcally, our hypotheses are as follows:

H1: Motivation will improve scores on biodata and situational judgment.

H2: Coaching will improve scores on biodata and situational judgment.

H3: Warning will reduce scores on biodata and situational judgment.

Interactions between these situational factors as well as the main effects
aSSOciated with each factor are also explored. For example, someone who is highly

mOtivated to perform well, is provided with coaching, and is not given any warning about

38

 

 

.A'

~¢
-5.-"

 

- >~71
...b\

w- ' ‘V'
.6; ..

u
r l
g...”

\U. --

 

. '
1’ ”“ 1

....1L

>-..

[u

4
.Ah' 5

".$

‘ “"Ic.

7-0
-..'.,:L;
It .
,. ,
IN“: '
'I ‘75.
... l

l“"
‘O
w‘ .4.

inﬂating responses is most likely to score highest on the tests. Next highest would be
someone who has had coaching and no warning, but no motivation. Individuals in these
two groups would be more cautious in inﬂating responses when provided with a warning,
but the warning is likely to have a more powerful effect for those who are motivated,
resulting in those without an incentive having a higher score than those with an incentive.
Those who are highly motivated to perform well have more to lose by being caught
faking than those who are not particularly concerned about the test outcome. This effect
of the interaction between warning and valence was demonstrated by McFarland (2000)
when looking at the Big Five factor of openness. Performance is expected to be lowest in
the group that has no coaching, no motivation. The warning statement is not expected to
have any effect for this group, as they have nothing to lose nor gain by manipulating their

scores. Table 2 shows these general expected score levels within the different situational

 

conditions.
Table 2
Expected Score Levels
Coached Not coached
Warned Not warned Warned Not warned
Motivated Middle High Low Middle
NOt motivated High High Low Low

 

 

. The following graph (Figure 2) provides a visual depiction of the expected
lnteraction.

39

In.

.I'ﬂ

. . *f.--

L‘.

 

Toast l—‘c—‘a rfc ”11 It) not")

i

H4:

Figure 2

Expected Interaction of Warning, Coaching, and Motivation on Test Performance

 

 

 

 

+ Coached, Motivated

 

 

 

 

(D

0

C

E +Coached, Not

36:5 _ Motivated

an . Not Coached,

7i; Motivated

.23 PT —i_ '7’ m er“. Not Coached, Not
Motivated

 

 

 

No Warning Warning

H4: Motivation, coaching, and warning will interact in affecting scores on biodata

and situational judgment.

The second question of the study addresses individual differences in faking. We
know that conscientiousness and emotional stability are generally related to performance,
and thus expect these characteristics to relate to biodata and SJ I performance.

H5: Conscientiousness and emotional stability will be related to performance on

biodata and situational judgment.
Based on the ﬁndings that personality traits have been found to be associated with

Social desirability (Ones, et al., 1996) and faking (McFarland & Ryan, 2000), I posit that

40

 

 

 

won't ‘
“(‘byiS‘Asl\ '
a b T‘ I;

w“? .

“rust-ju-

'- in" 9

rinse .0

ﬁlm 2 \
. lik'l mini-1L

liti:

k» il'
A $1
‘. , i‘
'\ "‘15”
14-: um.» I

1
I

r ‘ ‘o .
it.) LET“:

I
it“ are:

< ‘v
Kw.“ ii u
LNLlw-ch i .

‘,

mix "v
‘Mq"' r.

‘5 “he v

u

.
‘i

Pu"
't . W

conscientiousness and emotional stability will also be important correlates of inﬂated
scores, and self-deception and impression management are also expected to be related to
inﬂation. Inﬂation will be identiﬁed in three ways: by participants showing a positive
response to bogus items, showing high scores on the inﬂation index or showing high

performance on the BIDR impression management scale.

«r7

H6: Conscientiousness, emotional stability, and social desirability will be
associated with inﬂation captured by performance on a bogus item scale, the

inﬂation index, and impression management.

 

In this model, impression management is considered from two perspectives: ﬁrst,
as an individual difference that predicts biodata and SJ I performance and inﬂation on
bogus items and the inﬂation index, and second, as an outcome when used as an indicator
0f possible faking. (The empirical generation of the inﬂation index, where items that
Show extreme score differences between two manipulation groups in the study, is

described in the Method section below.)

The third question of this study addresses characteristics of items that relate to
Whether the items are more or less likely to be susceptible to inﬂation. This research
unstion is related to the work of Mael (1991), Becker and Colquitt (1992), and others,
Who provide insight into item characteristics that make items more vulnerable to score
inﬂation. Items on which it is easier to provide an inﬂated response are items that are not
Obj ective and are difﬁcult to verify (Becker & Colquitt, 1992). Items that are more
relevant to the job (Becker & Colquitt, 1992) and overt (Alliger & Dwight, 2000) are
InOre likely to be faked. I expect that items that are not viewed as relevant to the criterion

of interest (here college performance) are less likely to be inﬂated as the link to desirable

41

‘1‘")

E

‘ - D xvi-1"]!

 

 

 

1
,‘..
1

‘ s...
54"!
O

-"i
"v" ‘\ vi
ll...)

‘.

" ”ad. A
iCﬁE. ‘ 5

 

"

“.v0...

.1.

'..L.v\ 1i.

‘ ﬁd-
‘3’._}l ‘
iii-i“ ‘
v...A '

academic behavior is difﬁcult to make. Considering the examples of invasive and non-
invasive items provided in Mael’s (1991) taxonomy (p. 773), invasiveness is expected to
reduce inﬂation. Consider Mael’s noninvasive item “Were you on the tennis team in
college?” Inﬂation is likely in an item such as this as there is little vulnerability created
by answering positively. Now consider Mael’s invasive item, “How many young
children do you have at home?” This item, and others that are invasive, are likely to
receive muted rather than extreme responses, and so are less likely to be inﬂated. Items
that are perceived as unequal in access are less likely to be inﬂated as the individual may
see inﬂated options as so far beyond the realm of possibility that they would not choose
them. This argument is supported by the example of a nonequal access item provided by
Mael, “Were you captain of the football team?” For most women, this item would be
beyond what they would regard as possible. Similarly, an item that is considered
Controllable is more susceptible to inﬂation. Mael’s example, “How many tries did it
tElke you to pass the CPA exam?” is an example of a controllable item that is likely to be
faked, while an uncontrollable item such as, “How many brothers and sisters do you
have?” is less likely to be inﬂated. Logically, any item that is judged by people to be
uIlfakeable is unlikely to prompt inﬂation, although this is a speculative hypothesis. The
following hypotheses were tested.

H7: Biodata items that are rated as less objective, less veriﬁable, and more
relevant to school work will show greater inﬂation than items that are rated as
more objective, more veriﬁable, and less relevant to school work. Also,
items that are rated as invasive, are outside the individual’s control, are

unequal in terms of access on the part of students, as well as items that are

42

 

 

9;“ ~vJ~
whenurr.
l

. Van-19f

, | '
asunuks.

Ktlm’l".

. “HAM.“
:3“er
"rt-‘50} 3.

"V! O l v
Ni . 1
““Dtk'lel D
r .

‘x
‘v
1 i V -
~~.’v\ ~ .
1. .A. .1

\ ‘\\

rated as unfakeable are less likely to show inﬂation that might be the result of
some form of faking. (Greater inﬂation will be demonstrated by differences
in the correlations between ratings of item types and mean responses to the
items across study manipulation conditions.)

Another item type that has been shown to be relevant in reducing inﬂation is the
requirement to elaborate on one’s response to biodata items (Schmitt & Kunce, 2002;
Schmitt, et al., in press).

H8: Elaborated items will be less likely to be inﬂated that non-elaborated items.

The fourth question of the study addresses the validity of biodata and SJIs as tools
used in selection. As inﬂation may not be an adaptive behavior, but rather an artiﬁcial
self-presentation, it is likely that inﬂated responses are less effective predictors of
Performance than non-inﬂated responses (see Graham, et al., 2002). If non-inﬂated
reSponses show criterion-related validity, then validity of inﬂated responses may be
improved by correction. This possibility can be addressed by using regression analyses
t0 predict performance, partialing out the effects of faking as measured by one or more
faking indices (see White, et al., 2001). As mentioned earlier, I attempt to identify
inﬂation through bogus items, an empirically constructed inﬂation index and the BIDR
impression management scale. Some conditions are more likely than others to permit or
enCourage faking, and that will effect who would appear highest in a ranking in a top-
dOWn selection system (e.g. Dwight & Donovan, 2003; Frei, 1998).

H9: Controlling for responses to the three measures of faking will lead to a
suppressor effect, with faking correlating with biodata and SJ I but not with

performance criteria. Statistically controlling for faking should thereby

43

increase the amount of variance in college performance outcomes (GPA and
attendance) predicted by biodata and situational judgment.

More potential fakers (those marking bogus items, scoring high on impression
management, and scoring high on the inﬂation index) will be identiﬁed among the top
candidates if the faking goes uncorrected in top-down selection.

H10: Correction by removing those candidates who are identiﬁed as having

inﬂated scores, and replacing them with the next-highest scoring individual

 

l.“-

not identiﬁed as having inﬂated scores, will result in a better choice of

candidates, based on actual college performance (GPA and absenteeism).

44

In?
.mﬂ

 

.g- ~Jv
unob‘l I
n

-9... a 2
iota-ow

";'H ““1 ‘

. .

w
.,._
A b:

 

 

2“

“.5

i

-.
L .
~ ‘- N

METHOD
Sample
Michigan State University has a large student population (about 35,000
undergraduate students), and at the undergraduate level has relatively open admissions
standards when compared with other universities. Because of this, I am provided the
opportunity to sample a group of students in a situation in which I am not faced with
signiﬁcant range restriction in cognitive ability (ACT scores in this sample range from 8
to 35 with a mean of 23 (N= 341 ), and SAT scores range from 560 to 1480 with a mean
of 1,100 (N = 92). National mean scores for 1999 were 21 for ACT and 1016 for SAT.
'I‘l'lis sample will allow us to consider more accurately the effects of general cognitive
ability as they relate to our constructs of interest. Not only is there heterogeneity in
ability in the student population, but also a diversity of ethnicities is represented on this
campus. The ethnic breakdown of the student population is roughly comparable to that
of the United States college applicant pool, although this campus underrepresents
minority groups relative to the population of the US. At the university, 77.3 % are White,
9 - 8% are Aﬁ'ican American, 1.9 % are Hispanic American, 5.4 % are Asian American,
and 5.6% Other. In the sample, 79.28 % are White, 6.08% are African American, 2.49 %
are Hispanic American, 8.56 % are Asian American, and 5.6% Other. Of the sample
94% identiﬁed themselves as US citizens. Of the 5% who were non-citizens, 1% were
Canadian, and 4% were of other citizenship. English was the primary language of 96%
of the sample. Women account for 58.84% of the sample, and the campus has a student
population that is 55% female. To ensure that we collected data from individuals who are

C:IOSe to the typical age of students applying for college, we recruited only a subgroup of

45

 

 

 

(\-

.4
1.;

wi-‘

the student population; to be able to participate, students were required to have been in
their ﬁrst year of college. The age of the participants ranged from 18 to 22 years, with
71% of the sample being 18, 20% being 19, and 6% being 20 years old. The mean age of
the sample was 18. Participants were recruited from the university’s Psychology
Department Subject Pool, had not participated in other studies using similar measures,
and received extra course credit for their participation.
Study Design

This study is designed as a 2 X 2 X 2 orthogonal design, both for situational
judgment and independently for biodata. The biodata part of the study has an additional
factor with two levels: repeated measures on elaborated vs. non-elaborated items. The
study manipulations are identiﬁed in Table 3 below, where there are eight possible
variations in experience during the study, excluding elaboration, with equal numbers of
participants in each cell.
Table 3

Study Manipulations and Participants per Cell

 

Coached Not coached
Warned Not warned Warned Not warned
Motivated 43 45 45 45
Not motivated 44 48 46 46

 

 

There were approximately the same number of participants per cell in the research
design, as shown in Table 3 above, and a total of 362 participants overall which provided

sufﬁcient statistical power to test for the main effects and interactive hypotheses

46

 

 

 

 

l

‘J
b\

\

u.

u,

u...

‘1‘?

 

 

 

(Hypotheses 1 - 4). I am assuming a medium effect size (d = .50), and an alpha level of
.05. With 40 participants per cell there will be a power level of .88. Samples within the
cells will be too small to provide sufﬁcient power to reject the null hypothesis for
analysis of validity and reliability across conditions.
Procedure

The questionnaire for this study was split into two booklets, and four forms of the
questionnaire were distributed. The four forms were essentially the same, apart from the
instructional set that was used at the beginning of the second booklet. The instructional
set provided the warning and motivation conditions. The ﬁrst booklet of each form
contained a Big Five personality measure, social desirability and impression management
scales, college GPA, absenteeism and demographic questions. After completing the ﬁrst
booklet, those groups receiving coaching experienced a brief coaching session relating to
the questions in the upcoming booklet, while those groups not targeted to receive
coaching got nothing. There was no placebo treatment for the no coaching group; they
were moved directly to the next phase of the experiment, which for both coached and
non-coached groups was the second booklet. Instructional sets at the beginning of the
second booklet varied across the four forms, providing the study manipulations. A
sample of the questionnaire form is show in Appendix B, although biodata and situational
judgment items have been removed, due to the proprietary nature of the measures.
Sample biodata and situational judgment items are shown in Appendix C and D,
respectively. Apart from the instructional sets, forms were identical. The wording of all

four of the instructional sets is shown in Appendix E.

47

 

a-fh'W}
“.11; u)...

 

.
.4. I
‘ ‘
5.x.
3.... .,.

1
."\ 'P‘
P-v ~in‘x
l: .

 

\
‘..-

u
“dun 5
tr-

11:".
A..'. g

.

 

This study was approved by the University Committee on Research Involving
Human Subjects (UCRIHS). Samples of the two informed consent forms, one for
motivated participants and one for participants who were not motivated, are shown in
Appendix F and G respectively. The forms for this project were pilot tested, and data
collection was completed in Fall 2002 and Spring 2003 semesters. The study was
advertised through the web page of the Psychology Department subject pool.
Participation was restricted to freshman students who had not participated in other studies
we have conducted using similar measures. Participants were offered extra credit in
psychology for their participation, and the data collection sessions lasted 90 minutes.
Participants, who signed up via the subject pool web site, were randomly assigned to
different coaching conditions, and were randomly assigned to different forms. It is
important in attempting to understand the effects of coaching through an experimental
design that subjects be randomly assigned to the coaching condition (Messick &
Jungeblut, 1981). Sessions were designed to seat up to 50 participants in a classroom
setting, and were administered by research assistants working according to a written
protocol (Appendix H). Informed consent forms, data release forms, questionnaire forms
and scantrons were placed in envelopes for each participant and distributed according to
the protocol. Half of the sessions were provided with a ten-minute coaching component
aﬁer participants had completed the ﬁrst booklet, containing all measures apart from
three measures: biodata items, bogus items and situational judgment questions. These
three components were completed immediately after coaching for the coached group, and

the presentation of the manipulated instruction sets.

48

 

“’51
g. \
n

0’ 3‘

g... h

-0

.u»~‘

a ’3

a..5

 

{

. .

~..

“a.“

P}
I /

xv“
“‘4
r
.,
.A-.
h‘
\
»
\Sa-v-

Manipulations

Rather than conducting a directed faking study, where individuals are directed to
provide the best response even if that means lying, I have preferred to create situational
motivators in the form of an external cash incentive, provide constraints by using warning
statements, and present ideas on what the tests are capturing through a coaching session.
These are a more realistic and subsequently useful approach than directed faking in
examining inﬂation on biodata and 8115.
Motivation

To encourage students in the motivated conditions to do their very best on the
test, and to make them most similar to motivated candidates in a college admission
setting, a ﬁnancial incentive was offered. Those who scored above the 50th percentile on
the tests administered were promised and have been mailed $10. Those in the non-
motivated conditions received no incentive.
Warnings

To create a similar effect to warnings that may appear on college application
materials, the materials for those in the warning conditions included warning statements
as shown in the instruction sets in Appendix E. An example of a warning statement is:
“Note that we may verify a subset of your responses, and if you respond dishonestly, that
may invalidate this test as well as your chance to receive $10 for high performance.”
Coaching

A ten-minute coaching component reviewed sample biodata and situational
judgment items along with deﬁnitions of the performance dimensions that the questions

were designed to measure. This coaching session was designed as an orientation to these

49

. 1 . v
.4 :s
a w: .\. a.

,..

.4\.

 

4...

1‘.

 

o..

nah

 

\1. r

5

‘l

34.

L1}

particular selection devices. This form of orientation is common in formal coaching for
selection tests (Sackett, et al., 1989), and is similar in length to the coaching for biodata
provided in Miller (2001). Also, it is far more comprehensive than the written coaching
statements found by Cunningham, et al. (1994) to be effective. We expected that this
brief coaching would have an effect because it was exercise-speciﬁc (Sackett, et al.), and
was provided immediately prior to the completion of the biodata and situational judgment
inventory questionnaire.

By reading aloud the directions (Appendix I) and handout material (Appendix J),
the proctor of the session provided coaching for the participants. To ensure uniformity in
coaching, the same proctor administered all coaching sessions, reading from a prepared
script. Once the coaching component was complete, the proctor led immediately into
administering the second form in the study, to avoid any discussion and questions during
the testing session regarding the coaching.

Measures
College Grade Point Average

To be able to evaluate the predictive validity of the measures being tested in this
study, two outcomes were collected. Participants were asked to release their actual
college GPA. A sample of this information release form provided to the university
registrar is shown in Appendix K.

Absenteeism
Participants were also asked to identify how frequently they have missed class,

and the reasons for their absence.

50

 

 

 

".11-
Neil

...
my». I

~
3.44\

 

 

,.
“w.
~_,~1,_

5
5w
4

c.“

 

Gender and Ethnicity

To be able to examine subgroup differences in responses, participants were asked
to indicate their ethnicity and gender.
High School GPA and SA T/A CT Scores

Participants were asked to provide access to their high school GPA and SAT/ACT
scores as provided by the university admissions ofﬁce. A sample of the information
release form is shown in Appendix L. SAT and ACT scores were converted to new
variables through linear transformation based on national normative information on
means and standard deviations, and then combined to create a composite cognitive ability
index consisting of an average of all available test scores for each person.
Personality

A Big Five personality inventory based on the International Personality Item Pool
(IPIP) was used to measure personality (Goldberg, 1999). Scale alpha levels are
Conscientiousness (.81), Openness (.77), Agreeableness (.82), Emotional Stability (.89),
Extraversion (.87). I presented hypotheses only for the Conscientiousness and Emotional
Stability; other measures were examined on an exploratory basis.
Social Desirability

Self-deception and impression management dimensions of social desirability were
measured using the Balanced Inventory of Desirable Responding (BIDR) scale (Paulhus,
1988). Because of concerns about the intrusive nature of one item in each scale (i.e., “I
have sometimes doubted my ability as a lover” and “I never read sexy books or
magazines”), these two items were not used. Scale alphas are .67 for self-deception and

.80 for impression management.

51

Biodata

Biodata items generated by Oswald, et al., (in press) were reviewed, and those
that were empirically determined in their sample of ﬁrst-year students to be the best
predictors of college performance outcomes (GPA, absenteeism, and a self-assessment on
a behaviorally anchored rating scale) were selected for this battery. While Oswald, et al.
had written their items to tap 12 dimensions of student performance (see Appendix M), as
the items were intercorrelated, they regarded their biodata scale as unidimensional.
Nevertheless, to capture the breadth of student performance, items for this study were
selected to ensure that the content of all twelve performance dimensions was addressed in
the scale. Also included were all elaborated items used by Oswald et al., so that half of
the items for this study were elaborated and half were not. Overall biodata scale alpha
was .88. Alpha for elaborated items was .78 and .80 for non-elaborated items. Due to the
proprietary nature of these items, a sample of the biodata items is provided in
Appendix C.
Bogus Items

The biodata questions include four bogus items to assess faking. These items
were based on bogus items used by Anderson, et al. (1984), and were interspersed with
the real biodata items. The bogus items are identiﬁed in Appendix N. The bogus items
scale alpha was .37. This was not unexpected as these items produced very little
variance. If respondents were paying attention and were honest, I did not expect them to

indicate any activity on these four items when there is no incentive to inﬂate responses.

 

~"H

 

x'n‘ ‘
kn \

 

~6.‘,

.
ru~
‘Cyt
.2.

”‘4“

~.

5.
x
I

 

Is

:4;

"c

U»

Situational Judgment

Situational judgment items generated by Oswald, et al., (in press) were reviewed,
and those that were empirically determined to be the best predictors of college
performance outcomes (GPA, absenteeism, and a self-assessment on a behaviorally
anchored rating scale) were identiﬁed for this battery. On dimensions where the
predictive validity of the SI I items was low, the best items were rationally selected based
on content. Two items per dimension were selected to ensure that the content of all
twelve performance dimensions was addressed in the overall scale. Scale alpha was .77.
Due to the proprietary nature of these items, a sample of situational judgment items is
shown in Appendix D. ’“*'
Inflation Index

To be able to identify probable fakers, the mean score differences for items
between the individuals in the manipulation condition where people are most apt to
inﬂate their scores (coached, motivated, not warned) and in the manipulation condition
where people would be least apt to inﬂate their scores (no coaching, no motivation,
warned) were exarrrined. Eight items with the largest mean difference in responses
comprised the inﬂation index. These items are shown in Appendix 0, with the difference
scores in Appendix P.
Item Type

To generate assignments of item type to biodata items, two professors, four
graduate students, and three undergraduate research assistants on the project provided
ratings indicating the degree to which each biodata item was: objective, veriﬁable,

controllable, equally accessible, relevant to college, invasive, and fakeable. Ratings were

53

 

 

1
«1‘: .1
.uuv "

a»: C

“’JL‘ ‘

.
.10»—
A4..\-\..r

tact

“if",lr-
‘~\.\"
W C
.
K-
‘ll 7’ p
A.“‘ ..
’3‘...)
lbs““'\
‘
u
.‘\|
Ls ‘ ”‘1';
““‘~\
\.

,.
1. ~31"
b.““'i.‘
s... u ,
I.‘ 1"“
“~43
ﬂ.

..

1."

w 2.
\-

I.

”5;, T
G. ‘

l

5.7

.4.

v. 1‘,‘

'—V

made on a 5-point scale ranging from “Not at all” to “Completely,” one dimension at a
time. See Appendix Q for the rating form, which includes deﬁnitions of the item-type
dimensions. The 126 items were distributed evenly among the raters so that each rater
assessed a unique set of 56 different items. In the end, this resulted in 4 ratings per item.
Additionally, four sets of 14 items were rated by the same set of 4 raters. (Items 1, 10,
19... were rated by the same people; items 2, 11... were rated by another set of the same
4 people, etc.)

In order to index the amount of agreement between these ratings, the internal
consistency of the ratings across items was calculated, treating each rater as
interchangeable. This analysis was conducted for each item-type dimension by
aggregating ratings for the 126 items into four groups that consisted of ratings from the
four, randomly assigned raters. The coefﬁcient alpha estimates for the four groups of
ratings were then examined for each dimension. Alpha coefﬁcients were highest for
college relevance, veriﬁability, objectivity, and controllability (see Table 4), and these
dimensions were retained as relevant for further analysis. Low reliability on some
dimensions was either a result of all the judges rating all the items the same way (6. g,
fakeable, where all items were regarded as highly fakeable, with no variability across
items), or inconsistency in how judges rated items (e. g., invasive).

For objectivity, veriﬁability, relevance, and controllability the ratings across the
four groups of ratings were averaged to compute an overall dimension value for each
item. These values were then used in item analyses, the results of which are described

below.

54

T151 4

Imap.

..u\ru

p

CI,” ‘9‘
". v.1

r' "
(2'5":-
biAcsh

I!"
[1‘1in

 

Table 4

Coefﬁcient Alpha and Descriptives for Ratings of Biodata Item Characteristics

 

Dimension Alpha N Min. Max. Mean SD
Objective 0.70 126 1.00 5.00 2.89 0.93
Veriﬁable 0.75 126 1.00 4.75 2.49 0.91
Controllable 0.67 126 1.75 5.00 3.97 0.77
Equal access 0.37 126 2.5 5.00 4.1 1 0.62

College relevant 0.80 126 1.25 5.00 3.28 0.94
Invasive 0.06 126 1.00 4.00 2.75 0.73

Fakeable 0.06 126 2.50 5.00 4.26 0.49

Similarly, analyses of S11 item type were conducted, and rating instructions and
item-type deﬁnitions are shown in Appendix R. Two professors and ﬁve graduate
students provided ratings indicating the degree to which each situational judgment item
response option was: objective, veriﬁable, controllable, equally accessible, relevant to
college, invasive, and fakeable. Ratings were made on a 5-point scale ranging from “Not
at all” to “Completely,” one dimension at a time. The 24 situational judgment items were
distributed to all raters, with each rater rating all items. To assess agreement between
these raters, we measured the internal consistency of the ratings across items, for each
dimension. The alpha coefﬁcients were highest for college relevance, veriﬁability,
objectivity, and controllability (see Table 5), and we retained the data for these
dimensions that demonstrated traditionally accepted levels of internal consistency.

Again, we discarded ratings on categories where the raters provided little variance across

55

 

tens (6%-
r“: tier»
stress the s:
111168 \\ crcl
men.

labia 5

.:‘\ .
’ " "‘7 7 -,
(Ml/shtl.

ﬂmensior

 

 

Olajective
l'eriﬁablc
Controllal
E94211 acc
College It
151‘ Bite

Editable

 

 

items (e.g., fakeable, equal access) or were inconsistent in rating (e. g., invasive). The
judges viewed biodata items as more fakeable than SJ 1 items. Next, we averaged ratings
across the seven individual raters’ values to compute an index for each dimension. These
values were then used in item analyses to identify items viewed as most vulnerable to
inﬂation.

Table 5

Coeﬂicient Alpha and Descriptives for Ratings of SJI Item Characteristics

 

Dimension Alpha N Min. Max. Mean SD
Objective 0.81 155 1.43 4.71 3.09 0.70
Veriﬁable 0.83 155 1.29 4.71 2.86 0.76
Controllable 0.78 155 1.71 4.86 3.67 0.67
Equal access 0.69 155 2.14 5.00 4.30 0.49
College relevant 0.86 155 1.86 5.00 3.77 0.91
Invasive -0.50 155 2.00 3.43 2.76 0.31
Fakeable 0.17 155 2.86 4.71 3.66 0.37

The results of the use of these measures are described in the Item Differences

section of the results below.

56

 

 

 

To

. .. : ,, t
«tutorial

.r

130:?» 3310‘:

mm m I
mews p0:
I Y i

11”."

. 1
Em»). )C:

mun. - .
‘lACQ‘reb L

 

353013 \r ,

r..«
.x ,
‘V‘k‘hie a.

 

L'v._id.~‘t;—! ,
‘1‘
\

1Y-

,.
1- w. ,
.' "
.g‘eL

 

 

RESULTS
Situational Differences

To address the ﬁrst question of our study, and test Hypothesis 1-4, regarding the
situational factors that affect faking, I conducted a 2 (coaching vs. no coaching) X 2
(motivation vs. no motivation) X 2 (warning vs. no warning) ANOVA separately for the
biodata and SJ I measures. The biodata ANOVA had an additional factor (elaboration vs.
no elaboration on the items), which was a within-subjects factor. Results for biodata are
shown in Table 6 for an analysis that included no covariates and an analysis that included
various possible covariates of biodata responses including sex, race (minority versus
white), self deception, impression management, cognitive ability, high school GPA, and
measures of the Big Five constructs. Table 7 contains the means and standard deviations
of responses to the biodata for all conditions.

As can be seen in Table 6, the motivational effect and the coaching effect are both
statistically signiﬁcant and the means in Table 7 indicate that the effect was in the
predicted direction thus conﬁrming our ﬁrst two hypotheses. The Warning effect was
nonsigniﬁcant indicating lack of support for Hypothesis 3. I hypothesized that these
factors would interact, but did not observe the three way interaction depicted in Figure 2.
For the analysis that did not include covariates, the interaction between Motivation and
Coaching was marginally signiﬁcant. Examination of the means for these conditions
indicated that the combination of both Motivation and Coaching without Warning
produced the largest biodata scores (Mean =3.41) as would be expected. Neither
Coaching nor Motivation alone (Means =3.09 and 3.06) produced as large an increment

in performance when neither Motivation nor Coaching were provided. Variance did

57

 

 

[£136 3UP >
:0: \l'amr‘d
is: people 7
“In
facets inzer
for btth an.

“C‘Lid be \t‘

35mm Ex

33),... -

datum
‘ '1.

2'12...

‘I-ﬁiili) -

 

change across manipulation conditions, with the greatest variability in the Coached and
not Warned groups, suggesting that coaching does not necessarily standardize the way
that people respond.

When covariates were included in the analyses, the Motivation and Warning
factors interaction was also statistically signiﬁcant. Examination of the pattern of means
for both analyses (with and without covariates) indicated that a warning that responses
would be veriﬁed did appear to erase the inﬂation of responses that occurred when a
monetary motivation to get good scores was provided. With a warning, the means of the
motivated (Mean=3.1 1) and nonmotivated groups (Mean=3.01) were not very different
compared to the two conditions in which no warning was present (Means: 3.24 for
motivated, and 2.99 for nonmotivated).

One of the covariates (Extraversion) interacted with the elaboration factor.
Examination of this interaction indicated that the correlation between responses to
elaborated biodata questions and Extraversion (.29) was higher than the correlation
between Extraversion and the responses to nonelaborated items (.21). The impact of
elaboration also appeared much smaller when all covariates were included in the

analyses.

58

 

 

lab-Es 6
.lmirsis of

Berreen SU

Coaching ((
limitation

timing (\

 

fill
Crll'
llrll’
fill-1M1

Emir

 

 

Table 6

Analysis of Variance Results for Biodata with and without the Inclusion of Covariates

Between Subjects Source

Coaching (C)
Motivation (M)
Warning (W)

C x M

C x W

M x W

C x M x W
Error

Within Subiects
Elaboration (E)
E x C

E x M

E x W
ExCxM

E x C x W

E x M x W

ExCxMxW

Without Covariates

With Covariates

 

 

df F

1 24.95**

1 16.73**

1 .39

1 3.69

1 .02

1 2.11

1 .03
354 (.42)8

1 726.11**

1 .60

1 .07

1 .03

1 2.32

1 .72

1 .08

1 .63

 

59

 

df F

1 26.96**

1 13.62**

1 .54

1 1.52

1 1.92

1 628* *’

1 1.22
312 (.32)3

1 3.38

1 .60

1 .25

1 .07

1 1.62

1 .01

1 .27

1 1.33

 

 

 

lab-C 0 C01

“BESSEP

E1E1on
:1 Ages:

EXCURSLI

at -1

\.| .
‘1 lol‘ 0.
‘1‘“15 ..

t o
190:.

 

Table 6 Cont.

Within Subiects
§Q~LC€

E x Self Deception

E x Impression Management

E x Extroversion

E x Agreeableness

E x Conscientiousness

E x Emotional Stability

E x Openness

E x High School GPA

E x Cognitive Ability

E x Race

ExSex

Error

alEquals the Mean Square Error.

*p < .05. **p < .01.

Without C ovariates

With Covariates

 

 

df

354

60

 

F

(0.05)21

 

df
1

l

312

 

F
.89
.09

4.81*
3.80
.95
.13
.81
2.14
.56
.21
.34

(0.05)8

 

m 24.2.

 

docs—5358 e: mm? 805 8:865 c 68580

8:22:38 2: 8:86.: 0:0 .wcﬁoaoo can .wEEmB 523382 833?: £250 .585 .82..

 

 

 

 

me ovd 0.0.6. mmd ohm med mod
we ovd 3N two _m.m med m5m
me 86 :16. mod 09m 50.0 36
me nmd cod ovd wm.m med ww.m
.3 2.0 mod omd Nm.m 93 2:
we med eqm ovd omd med :8
we Ed mod mmd wm.m mmd vwd
cw nmd ww.m med mmm nmd SN.
2 QM :32 QM :32 QM. :32
:38. Sega mumnomm
3:5:80 Baconﬂocoz 882m weakens—m

 

 

 

 

E 580 5 EB .3 52
§ :88 E Ema A c 82
3 :88 8V 583 .3 32
2: £80 8V 683 .5 82
8 580 E 683 .§ 52
2: £86 3 Es,» .2: .22
3 :88 av em? .2: 82

.8188 2: ES,» 5 82

“£62.5on .8953 8335\6 meoteémQ Samurai has .383

b 2an

61

 

 

was 51:17.11.

effects of

1.18169) 1

 

 

In Table 8, I present the ANOVA results for the SJ I measure both with and
without the inclusion of the covariates. Table 9 provides the corresponding means and
standard deviations for the conditions in our study, and the variance of the SJ 1 responses
was similar across all manipulation conditions. As was true for biodata responses, the
effects of Motivation and Coaching were statistically signiﬁcant and the means (see
Table 9) were in the expected direction. The Warning effect was not statistically
signiﬁcant. One interaction, that of Motivation and Coaching, was statistically
signiﬁcant but the results were not consistent with expectations in that the presence of
warning produced larger SJI scores than no warning, the difference being larger in the
two conditions that did not receive the motivational manipulation. The means presented
in Table 9 are consistent with our expectations in that the Motivated, Coached, No
Warning condition produced the best SJ I responses, over a standard deviation higher than
the warned condition that was not coached or motivated. However, several other
conditions did not ﬁt our expected pattern of results. The Coached, Motivated and
Warned group performed as well as did a similar group with no warning. Results that
included the covariates did not change the impact of the manipulations in any substantial
way even though a number of covariates (self deception, impression management,
agreeableness, conscientiousness, high school GPA, sex and race) were related

signiﬁcantly to SJ 1 responses. I will address these relationships in more detail later in the

paper.

 

 

Coach:
\lcwti‘»:
\Virnf:
C 1 .\l
C 111'

iii K \V

 

 

 

Table 8

Analyses of Variance for SJI with and without Covariates

Between Subjects

Coaching (C)
Motivation (M)
Warning (W)

C x M

C x W

M x W

C x M x W

Error

Self Deception
Impression Management
Extraversion
Agreeableness
Conscientiousness
Emotional Stability
Openness

High School GPA

Without Covariates

With Covariates

 

 

df F df F
1 40.60** 1 56.39**
1 14.65** 1 10.06*
1 1.41 1 2.41
1 988* 1 625*
1 .22 1 1.36
1 .09 1 .91
1 .10 1 1.20
352 (0.12)21
1 545*
1 400*
1 2.41
1 419*
1 4.71*
1 1.29
1 1.52
1 4.79*

63

 

 

 

 

 

Table 8 1

Rue

HOT

217.11.1318 1

 

 

 

Table 8 cont.

Cognitive Ability

Race

Sex

Error

aEquals the Mean Square Error.

*p < .05. **p < .01.

Table 9

Means and Standard Deviations of SJI Responses for Various Study Conditions

Mot (0), Warn (0) Coach (0)8
Mot (0), Warn (0) Coach (1 )
Mot (0), Warn (1) Coach (0)
Mot (0), Warn (1) Coach (1)
Mot (1), Warn (0) Coach (0)
Mot (1), Warn (0) Coach (1)
Mot (1), Warn (1) Coach (0)

Mot (1), Warn (1) Coach (1)

 

Without Covariates

 

 

With Covariates

 

 

 

 

 

1 .32
1 3.96*
1 18.44**
310 (0.09)8
Mean SD N
.50 .32 46
.62 .37 48
.56 .32 46
.67 .37 44
.52 .33 44
.89 .32 45
.58 .35 44
.89 .33 43

aMot, Warn, Coach indicates Motivation, Warning, and Coaching. One indicates the

manipulation occurred, 0 indicates there was no manipulation.

64

 

 

111 f7:

 

111C131

..
37f
.
,,_
(J
-1-

' I
l
Rik).

S‘“x

 

 

Hypothesis 1, that motivation will increase scores, was supported for both biodata
and situational judgment. Hypothesis 2, that coaching would increase scores, was also
supported for biodata and situational judgment. Support for Hypothesis 3 was not found,
with warning being ineffective at reducing scores. Hypothesis 4 was partially supported
in the case of biodata, but not for situational judgment responses.

It is apparent from these ﬁndings that both biodata and situational judgment
scores are susceptible to coaching, as well as to motivation. It is important to consider
that in this study, very brief coaching was provided, and if such tests were to be used in
college admissions decision-making, it is reasonable to expect that more comprehensive
coaching would become available as a result of the high-stakes nature of college
admissions decisions. The results do indicate, though, that a warning statement can have
some impact in reducing inﬂation at least for the biodata. An examination of the pattern
of results for biodata (Figure 3) shows that the suppressor effect of a warning statement is
most powerful for those who are motivated, lending some support to the idea that those
who have something to lose are more likely to take a warning statement seriously,
although this would need to be veriﬁed with further research.

A limitation of the warning manipulation is the slight difference between the
wording for motivated and not motivated groups, where the motivated groups risked
losing the opportunity to earn extra cash if caught responding dishonestly, while the
group without a cash incentive did not suffer this risk. According to the recent work of
Dwight and Donovan (2003), potential consequences in faking are an important factor in

the effectiveness of warning statements.

65

 

 

3.2

3.1

2.7

2.6

2.4

 

 

 

Figure 3

Interactions between Coaching, Motivation, and Warning for Biodata Performance

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

3.2 -— m“ - _
3.1 —~ \ .___
3 __ - _
+Coached, Motivated
2.9 __
I ' = W a +Coached, Not
2 8 i ___________ ___ _ -,__ Motivated
' ~- Not Coached,
Motivated
2'7 it +Not Coached, Not
Motivated
2.6 ——— ——— 444444-442—
2.5
2.4 - 2

 

No Warning Warning

It is also possible that for someone who had not received a warning, their own

suspicions about a bogus item may have raised the possibility that was a lie scale, and

this may have effectively operated as a warning. It would be helpful in future research to

include manipulation checks so that it is clear whether participants registered that they

were being warned.

In this study, motivation was generated by an offer of a small amount of cash, and

this incentive was sufﬁcient to inﬂuence score inﬂation. If biodata and SJ I performance

were used to contribute to admissions ratings, the importance of admission to the college

of one’s preference would probably be a more powerful motivator, resulting in signiﬁcant

66

 

; *1
1‘.-L*

ire

SCL

4-r ." '-
bg-e\kl\Jr1

1.

I16

‘4 .
u.-..

. x

6‘1

:noritV

A,
‘d

V

‘AA'

m

 

IY
.A‘
\

.‘Wl

J

S

 

[he

a:

m

 

Cir

.

5
~\

.93
\

f!
Ir: 1*

.‘
’..

score inﬂation. The risk of not gaining admission would be an important risk. The

artiﬁcial context of the warnings in the study combined with the monetary motivation

may have limited the effectiveness of warnings, but the effect was in the expected

direction when the interaction of Motivation and Warning was examined for biodata.
Individual Differences

To address the second question of the study, test Hypotheses 5 and 6, and
examine the impact of individual difference correlates of faking — race (white vs.
minority), gender, ability (measured by ACT/SAT scores) and personality characteristics
(speciﬁcally conscientiousness, emotional stability, and social desirability) — correlational
analyses were conducted. Faking may be captured in three ways: by an individual
showing 1) a positive response to the bogus items, 2) a high score on an inﬂation index,
or 3) a high score on the BIDR impression management scale.

Table 10 shows the correlation matrix for the variables of interest. Of the faking
identiﬁcation methods, scoring on the bogus items was associated with scoring on the
inﬂation index (r=.3 8), but neither of these was strongly related to impression
management. The bogus items and inﬂation index were related to coaching (r=.23 and
r=.36, respectively), and the inﬂation index was also related to motivation (r=.24).
Personality was not highly associated with inﬂation as indexed by bogus items, with only
openness having a signiﬁcant relationship (r=.16). For the inﬂation index, extraversion
(r=.22), agreeableness (r=.21), openness (r=.21), and self-deception (r=.22), showed
signiﬁcant relationships. However, personality was more strongly associated with
impression management; conscientiousness (r=.36), agreeableness (r=.33), emotional

stability (r=.22), and self deception (r=.39) were correlated signiﬁcantly with impression

67

 

 

P“. I"I
“,ﬁob‘

. 1 .
‘n- "1‘
LI‘C'U.

v u

'e {1‘
4

s o“ n

[.\:"j.‘
,,,_.
Act-guy

.'_.

(x
‘k. oa>\

 

 

 

management. Methods of identifying inﬂation appeared to be unrelated to race, cognitive
ability, and age, apart from impression management, for which age was negatively
related (r=-.12). The only relationship between inﬂation identiﬁcation scales and
outcomes was for impression management, which was negatively related to absenteeism
(r=-.23).

Biodata scores were associated with situational judgment scores (r=.47), and both
biodata and SJ I scores were related to multiple personality traits, however, the patterns of
relationships were slightly different. Biodata scores were most highly related to openness
(r=.33), self deception (r=.27), extraversion (r=.26) and agreeableness (r=.25). Also
related to biodata were agreeableness (r=.25), impression management (r=.16),
conscientiousness (r=.15) and emotional stability (r=.13). Situational judgment, on the
other hand, was most closely related to impression management (r=.29),
conscientiousness (r=.24), and self-deception (r=.22). Also related to SJ I performance
were agreeableness (r=.21), extraversion (r=.16) and openness (r=. 12). Cognitive ability
was related to biodata performance (r=. l 8), but not SJ I performance. Race was unrelated
to either biodata or SJI performance.

Both race (r=.20) and ability (r=.3 7) were related to GPA. Biodata was unrelated
to the outcomes of absenteeism and GPA, and SJ I performance was negatively related to
absenteeism (r=-.20). Age was positively related to absenteeism (r=.17) and negatively
related to GPA (r=-.16). Conscientiousness was positively related to GPA (r=.l3) and
negatively related to absenteeism (r=.37). Emotional stability (r=-.l 1) and self-deception

(r=-.l6) were also negatively related to absenteeism.

68

 

 

I. I Zrllll

.»~ .w~.~....~.

moo-

00.0

no.0-

.430

2.030

1.2.0

*N_.0

v0.0

5.0-

No.0-

:30

zomo

gin—.0

moo-

:omd

*3 .0
00.0-
00.0-
#00.

as .0-
no.0-
no.0-

*m 0 .0-

.120.

ow<

‘ihf‘

v0.0
woo-
woo.
00.0-
1.00
No.0
ammo-
moo
moo

1.2.0.

05.?
02::on

00.0

00.0-

moo-

$3.0-

00.0

5.0

5.0.

moo

no.0

00.0

*imd

comm

1.70:0

moo

“SEO

33:0

“.20

no.0

*avNo

Elmo

sic—.0

no.0-

moo

v0.0-

:m

*ammo

no.0-

**o~.o

Inmo

**mm.o

*m_.o

aim—.0

:mmo

.Ibmd

no.0-

**w~.0

00.0

“info

3285

So. tome
8d :3-
3o tome

to? 3de
3o .12

.3de so

to? So

:26 .13
3o INS

Ed- So-
3o 26
3o- 85

zone $5.885

:26 30.233

3:

3.0

L805
soon—E;

32.0

00.0.

00.0

00.0

*©~.o

moo

3.0

no.0

#00

No.0-

no.0

«0.0

3.5.0

.2130

no.0-

00:33.0
002.53
003282
Om

0

mm

ow<
05.?
8mm
5
5265

3:

3.3.0 305 coomoﬁ

28m
mswom

page: 20.039209

S 053.

69

 

 

 

Ulllllllt'dl

I alth- IU 1‘.

 

 

5:?“

.00. V 9.3.. .mo. V a...

.833 Em 98 8805

Bob @3062 soon 32 x09: cosmoﬁ :50 25: 025.6 523 E20508 2865 385580 5 555 505 c2305 53, $8028.50 cm 05 8365..

v0.0
woo-

So-
28

**0Nr

“commie.

3.2.0-
3.2.0

ow<

00.0-

00.0

02380

:nmd

no.0

.95.?
93::on

no.0-

No.0

No.0-

BEES

tow-.0
5.0-

coax

no.0-

00.0-

_0.0-

00.0

080382

00.0

_.._..0N.0-

Em

no.0-

3.2.0..

No.0

moo-

v0.0

Om

00.0
moo-

282m

000

m0.0

00.0

00.0

_0.0

1.80

00.0
:mmo-

S:

no.0
“L _.o-
no.0
moo
#00-
1.3.0

3.0

mm

no.0
no.0-

$805
5033

*m o .0 <00
**nm.0- EmmooEomnxx
No.0- 02380
no.0- 00:33
00.0 35.502
.1. _ m .0 Om
3o- 0
1 _ .0 mm
D
00.0 <00
5.0 «568531..
23m
mawom

325:8 0. 035-

70

 

. \

n

A.-

CO‘

.r.
m.

hn‘. .

 

\.\ .

Hypothesis 5 was supported for biodata and partially supported for SJ 1. Both
conscientiousness and emotional stability were associated with performance on biodata,
while only conscientiousness was related to SJ I performance. Hypothesis 6 was also
partially supported. Conscientiousness and emotional stability were related to impression
management, yet performance on the bogus items and the inﬂation index were not related
to either of these personality traits. Self-deception was related to the inﬂation index and
impression management.

These results provide support for the idea that positive responses to bogus items
and inﬂating one’s score on other items is less personality-related and more situation-
driven, whereas the tendency toward impression management is more personality-related.

It is apparent from this study that individuals do claim experience with non-
existent things, and that this scoring on bogus items is weakly related to openness. It is
possible that people are being more “open” about how they interpret these particular
bogus items, and may be drawing parallels between an experience that they have had and
that which is captured by the bogus item, generously deciding to claim that they have
suitable enough experience to identify it as such. However, as bogus item scores were
more strongly associated with coaching, I expect that situational determinants have a
more powerful effect on marking bogus items, where individuals who have had coaching
on what dimensions are desirable may be overly enthusiastic about demonstrating
experience on items that they see as matching those dimensions, rather than responding in

an honest fashion. While it may be ethically questionable to include bogus items in a

71

 

 

 

 

5'3
n

TC”.

1\

'2 1‘

Li -

wt

of};

 

 

 

college admissions application, this does provide useful information about the ways that
people may fake.

As the inﬂation index used here is created empirically, based on the difference in
performance between the optimal performance group and the reference group, it is not
surprising that the index is very highly correlated with biodata and SJ I performance.
Nevertheless, once overlap items are removed from biodata and SJ I scales, the index
remains highly correlated with performance. It is interesting to note that
conscientiousness, although signiﬁcantly related to biodata and SJ 1 performance, is
unrelated to the index. As the inﬂation index is related to coaching, this reinforces the
notion that situational factors are important in explaining faking. It should be noted that
the inﬂation index created empirically for this study would require cross-validation for
use elsewhere. (The items that made up the index can be viewed in Appendix 0.)

That personality traits of conscientiousness and emotional stability were related to
self-deception is consistent with research on social desirability, and it is not surprising to
note that agreeableness is also related to self-deception — it is comforting to think of
oneself as agreeable. These three personality traits appear to be the most important in
explaining moderately inﬂated performance, however, they do not appear to be as useful
in cases of extreme faking. This suggests that presenting oneself in a positive light is
adaptive, and that items to which individuals respond probably include a level of social
desirability and/or job desirability.

That cognitive ability was related to biodata but not SJ I performance was
surprising, but may be explained by the breadth of dimensionality captured by the SJ 1,

which moves beyond typical college academic issues to include issues of social

 

CO

LU

 

 

consciousness, multicultural tolerance, artistic appreciation, etc. That cognitive ability
was weakly related (r = .18) to biodata could be a result of socioeconomic status, where
those with ability were also exposed to greater opportunity for suitable experiences.
Item Differences

To address the third question of the study, and examine the effects of certain item
characteristics as they affect inﬂation, the following analyses were conducted. The item
characteristic ratings made up of the mean rating provided by experts (described earlier)
were correlated with the biodata item responses for each participant. The r was then
transformed to a Fisher 2. ANOVA was used to test for differences in Fisher 2
correlations across the study manipulation conditions. Overall mean levels are shown in
Table 11, ANOVA results for the four item characteristics are shown in Table 12, and
levels across conditions are shown in Table 13.

Table 11

Overall Means and Standard Deviations of F our Item Characteristic Fisher 2 for

 

Biodata

Dimension N Mean SD
Objectivity 362 -O.22 0.21
Veriﬁability 362 -O.18 0.21
Controllability 362 -0.15 0.16
Relevance 362 0.10 0.16

‘5‘"

.5. v 9.... .3. v HH...

.Hohm Bazom :82 20 £959

 

 

 

 

 

 

 

 

 

 

 

.33 Ham .3: Hi 0.3 an .060 am... 85
moo H 8.0 H .26 H HH.o H 22.2 x o
8.0 H Ed H Rd H HS H B x 2
8H H 8.0 H Hg H one H B a. o
.52. H :3 H oH.o H a; H 2 x 0
23 H 8.0 H 23 H m S H 90 3.53
mg H .. How H 2: H A? H 06 @3282
3d H the H mom H 1.85 H 60 9.288
850m mHooEHHm
”H no u \Hc a to a \H. gnu.
8§>23H 5:526:80 3:585.» $280.30

8535 nonmotwtotexuﬁ ENS .Sok .6\N Sink age-03.3% 825qu \o .333on

N_ 035.

74

Hf“

.HHouHHSESHE o: m?» 805 8:00:05

0 600.5000 2002:0088 05 00:00:05 0:0 $550000 98 $52.83 528382 88205 00000 .503 .82..

 

 

 

 

 

 

 

 

 

 

a. 2.0 NHHH 2.0 2.? H3 2.0- H20 £5.
a. 2.0 8.0 2.0 2.0- mg 8.0- H2 25.
9. 20 NS 2.0 2.0- mg 2.0- 20 03-
Q 20 NS 2.0 :0- Ed :0- 20 8.0-
3. 3 mod 2.0 2.0- NS :0- 20 one-
2. 2.0 o; m; 2.0- 8.0 H20. NS ad-
Hz. 20 3o :0 2.0- H20 2.0- 2.5 NS-
9. 20 m3 «.3 2.0- H20 N20- 20 Rd-
2 00 :82 on :82 a. :82 9... :82
85.53 0:323:80 $585, $308.30

 

H0 €80 :0 H53 .00 022
80 £80 3 EB .8 H22
H H0 580 H8 855 .HHV 62
He €80 He 555 .3 82
H0 680 :0 Es; .80 82
H8 €80 H 0 as,» .80 82
:0 £80 He ES, .80 82

.80 580 80 EB .80 82

8300.5 xo\m=o.HH.HH0=oD .0050. 020.20; .5\N .00me 0520330be :0: 30.50 0:0.2030Q 330280. 020 0:00:

2 050.0-

75

 

 

0 \
3K

—

The correlation between item objectivity and item performance does vary
signiﬁcantly as a result of the manipulation conditions of Coaching and Motivation (see
Table 12). Mean levels of the Fisher 2 for objectivity across conditions are shown in
Table 13 below. It should be noted that all the correlations are negative indicating that
the more objective the item was judged to be, the lower were students’ scores on the
biodata items. As would be expected the strongest negative correlations were in the two
manipulation groups without motivation or warning, and the weakest were in the groups
with both motivation and coaching.

The correlation between item veriﬁability and item performance does not vary w:
signiﬁcantly across manipulation conditions, and mean levels across conditions are not
very different, as shown in Table 13. Again, all correlations are negative indicating
different items that are more veriﬁable produce lower biodata Scores.

For controllability, the correlation with item performance does vary signiﬁcantly
as a result of the manipulation conditions of coaching and motivation. Mean levels of the
Fisher 2 for controllability across conditions are shown in Table 13, where the largest gap
in means is between the groups without coaching or motivation, and the group that has
both coaching and motivation, but no warning. Correlations between controllability and
biodata scores are once again negative, but slightly lower than correlations between
objectivity and biodata performance. This is contrary to the direction expected for
controllability.

College relevance did not show significant differences in correlation with item

performance across manipulation conditions, however, there is a signiﬁcant Coaching by

Motivation interaction. This suggests that those who are both motivated and coached

76

have a greater likelihood of scoring highly on biodata items, when those items appear
relevant to academic performance, than those who only experience one or the other of
those conditions. This also suggests that the coaching on dimensionality of biodata items
is effective in helping individuals who are motivated to identify dimensions that are
relevant to college performance. All correlations between item relevance and biodata
item responses were positive, but lower than those with other item characteristics. These
correlations indicate that the higher the perceived relevance of the item, the higher the
response to biodata items.

Similar analyses were conducted for SJ 1 correct response options, where the
dimensions of objectivity, veriﬁability, controllability and college relevance of each
correct response option for the question, “What would you be most likely do?” was
correlated with SJ I performance, and then converted to Fisher 2.

Overall mean levels are shown in Table 14. ANOVA results for the four item
characteristics are shown in Table 15, and mean levels across manipulation conditions are
shown in Table 16.

Table 14

Overall Means and Standard Deviations of F our Item Characteristic Fisher 2 for

 

SJI

Dimension N Mean SD
Objectivity 362 0.02 0.22
Veriﬁability 362 -0.01 0.21
Controllability 362 0.13 0.18
Relevance 362 0.10 0.13

77

e [til

.5. V 92.... .mo. V a...

.2025 895m 5222 222 Enigma

 

 

 

 

 

 

 

 

 

 

 

.282 am .2223 2.3 .222: 2.2 .232 2.2 28.22
1222.22 2 8.22 2 22.2 2 .222 2 222 x 22 22 0
222.22 2 8.22 2 8.22 2 2.22 2 3 x 22
22.22 2 22.22 2 ~22 2 222.22 2 35
22mm 2 02.22 2 22.2 2 222.2 2 223
22.2 2 22.22 2 2.22.22 2 8.22 2 S: 22225223
2.2.22 2 2.2.2 2 3.2: 2 2 18.8 2 2222 2522222822
8.22 2 2 2.2 2 $22.3 2 £22.22. 2 62 2222222228
ouSom $8326
2 x22 .2 x222 2 x222 .2 222 gm
822g232 22222222282228 222222222252, 222222282220

Na. xoxdohwtmtexgb ENC 230K 20.? 22222.3 goKazzmmm 8:22.252 \0 2.222532%

2 @3222.

78

Y Q?"

dorm—222222222222 ca 33 8222 8228222222

0 .32-222822 28222223222288 22222 3228222222 0:0 92222222200 222222 $222535 2222223202 8228222222 22022220 .22-2222,? .2222

 

 

 

 

 

 

 

 

 

 

22. 2.2.22 2.2.22 222.22 :22 :22 222.22 222.22 22.22
22. 2.2.22 2222.22 22.22 22.22 22.22 222.22- 2.2.22 2,222.22-
22. 22.22 222.22 22.22 22.22 222.22 22.22 22.22 22.22
22. 22.22 222.22 22.22 22.22 22.22 222.22- 8.22 222.22-
2.2. 22.22 222.22 2 2 .22 22.22 22 2.22 222.22- a 2 .22 8.22
22. 2.22 2.2.2 2.2 222.22 222.22 222.22- 2222 2222.22-
222. 2.2.2 222.22 22.22 22.22 222.22 222.22 2.22 2.22.22
22. m 2 .22 222.22 22.2.22 2 2 .22 222.22 2 2.22- 22 2 .22 222.22-
2 22.2- 22822 Q2 :82 a. 22822 Q2- 2282
82.225232 22222222228228 22222222222232 22222822222

 

222 22222222 222 22.222222 .222 22.2
2222 222822 222 E22222 .222 22222
222 2288 2222 E22222 .222 22.2
2222 22222222 2222 E22222 .222 202
222 2222822 222 22.22.25 .2222 22222
2222 2222822 222 E22222, .222 22222
222 22222222 2222 8223 .2222 22222

.2222 22228 2222 2:223 .2222 2222

EM. Ska-22222222220 22.32% 3222.232 onN .222-2.2K .222-2223292222225 ESN 2220,58 22.222.223.2222Q «ravagw 2228 2.2233

02 22222-2.

79

The correlation between item objectivity and item performance does vary

 

signiﬁcantly as a result of the manipulation conditions of Coaching and Motivation (see
Table 15). Mean levels of the Fisher 2 for objectivity across conditions are shown in
Table 16 below. Where the correlations are negative, this indicates that the more

objective the item was judged to be, the lower were students’ scores on the SJ I items. As a

would be expected for the strongest negative correlations were in the two manipulation

 

groups without motivation or warning, and the positive correlations are the groups with
coaching. The largest mean correlations were for the coached and motivated groups.

The correlation between item veriﬁability and SJ I item performance varies ”W
signiﬁcantly across manipulation conditions of coaching and motivation, and mean levels
across conditions vary, with negative correlations for the groups without coaching, as
well as the group with coaching and a warning statement with no motivation. Again,
negative correlations indicate different items that are more veriﬁable produce lower SJI
scores under conditions that do not precipitate high performance. Coaching and
motivation result in high performance, as well as positive correlations, suggesting that
under certain conditions, veriﬁability' may no longer suppress scores on SJ Is.

For controllability, the correlation with item performance does not vary

signiﬁcantly as a result of the manipulation conditions of coaching, motivation or

warning. Mean levels of the Fisher 2 for controllability across conditions are fairly

stable. Correlations between controllability and SJ 1 scores are positive, as expected for

controllability.
College relevance did not show signiﬁcant differences in correlation with item

performance across manipulation conditions, however, there is a signiﬁcant Coaching by

80

 

Motivation by Warning interaction. All correlations between item relevance and SJ I item
responses were positive. These correlations indicate that the higher the perceived
relevance of the item, the higher the response to SJ I items. One limitation of this analysis
is the dichotomization of scores for SJ I performance, where I have considered a correct
response a score of 1, and an incorrect response a performance score of 0.

These results provide support for Hypothesis 7 on two dimensions; objectivity
and college relevance, for biodata, and signiﬁcant ﬁndings for controllability for biodata,
but in a direction Opposite to that hypothesized. This suggests that these item
characteristics are related to the likelihood that the biodata item score may be inﬂated. f
Future work on biodata should take into account the characteristics of the item where i ' i m...
they may be able to be formulated in a fashion that limits faking. A closer examination
of the issue of controllability shows mixed results of the effect of controllability on
biodata validity, depending on the type of controllability (see Graham, et al., 2002). This
suggests that hypotheses about controllability must consider speciﬁc issues of whether
the individual can choose to perform an action, has no control over the action, has shared
control, or the individual’s feelings or attitudes are the issue of interest. Each of these
categories may provide conﬂicting results, and such speciﬁcations were not taken into
account by the expert raters in this study.

The results also provide support for Hypothesis 7 on dimensions of objectivity,
veriﬁability, and college relevance, for SJ 1. It appears from the change in correlation
direction ﬁom negative to positive under high performance conditions that scores on SJ ls

are less likely to be suppressed as a result of item characteristics than are scores on

biodata items under similar conditions.

81

 

 

The issue of an elaboration requirement as a means of reducing inﬂation on
biodata items was tested, with results shown earlier in Table 6. While there was a
signiﬁcant main effect for elaboration, the impact of elaboration also appeared much
smaller when all covariates were included in the analyses. Hypothesis 8 was supported.
It is possible that there is an item type confound where the items that were chosen by

Oswald, et al. (in press) for elaboration may have also been those that were more ﬂ

 

veriﬁable. Since they were veriﬁable, respondents were less likely to inﬂate responses.

Validity and Selection Decisions

 

l
The fourth question of the study addresses validity and the effects of selection _ v
decisions. Table 17 shows correlations between the predictors and outcomes, as well as

their descriptive statistics.

 

Table 17

Correlations and Descriptive Statistics for Predictors and Criteria

 

Year 1 Absentee- SJ I total N Mean SD
ism
GPA
Year 1 GPA ‘ 353 2.88 0.73
Absenteeism -0.20** 362 3.1 l 1.1 l
SJI total 0.06 ~0.20** 360 0.65 0.37
Biodata total 0.09 -0.05 0.47** 362 3.09 0.48

**p < .01.
To test the validity of these predictors of student performance, and to test

Hypothesis 9, the difference between the zero-order validity coefﬁcients and the validity

82

 

coefﬁcients with the faking identiﬁcation scales partialed out were calculated. Results
are shown in Table 18.

Table 18

Zero Order and Partial Correlations between Situational Judgment and Biodata and Two

Criteria Controlling for Measures of Faking

 

 

 

Situational Judgment T
Zero order Bogus Inﬂation Impr.Mgt. All three I.

r Partial Partial Partial Partialled
GPA 0.06 0.05 0.02 0.03 0.01 £

Absenteeism -0.20* -0.20* —0.23* -0.l4* -0.l9*

 

Em
Zero order Bogus Inﬂation Impr.M gt. All three
r Partial Partial Partial Partialled
GPA 0.09 0.07 0.05 0.08 0.03
Absenteeism -0.05 -0.06 -0.05 -0.01 -0.05
*p < .05.

For the GPA outcome, neither biodata nor SJ I proved to be useful predictors of
academic performance, whether or not they were statistically adjusted using the faking
measures. SJ I did predict absenteeism, but the zero-order correlations were comparable
to all four partial correlations. The partial correlation when Impression Management was
controlled was somewhat lower, though not statistically different from the other

correlations. In any event, there is no evidence that inﬂation, as measured by our three

83

 

indices, is attenuating the validity of these measures to predict grade point average and
absenteeism.

As McFarland (2000) noted that faking may affect reliability, the alpha
coefﬁcients across manipulation conditions were examined for biodata and SJ 1, and are
shown in Table 19. While they do not differ dramatically across conditions, it is
interesting to note that the manipulation condition which would be expected to display ﬂ
the greatest degree of inﬂation (Motivated, Coached, not Warned) does have the highest

reliability for biodata. However, the same group has the lowest reliability for SJ 1, but

 

none of the reliabilities are very different across conditions, and all are relatively high. i I
Table 19

Coeﬂicient Alpha of Biodata and SJI Scales for Various Study Conditions

 

Biodata SJ I N
Mot (0), Warn (0) Coach (0)a .80 .70 46
Mot (0), Warn (0) Coach (1) .89 .74 48
Mot (0), Warn (1) Coach (0) .87 .71 46
Mot (0), Warn (1) Coach (1) .78 .77 44
Mot (1), Warn (0) Coach (0) .82 .70 44
Mot (1), Warn (0) Coach (1) .94 .69 45
Mot (1), Warn (1) Coach (0) .84 .77 44
Mot (1), Warn (1) Coach (1) .90 .66 43

 

 

 

aMot, Warn, Coach indicates Motivation, Warning, and Coaching. One indicates the

manipulation occurred, 0 indicates there was no manipulation.

84

 

 

 

 

 

 

‘tiir-

.222. v 22*... .222. v 22...

28222222222258 on £23 22.2222 822222222222 22

.222-22822 25222222222232: 222 822222222222 22220 .chomoU 22:22 .wEEaB 22222222322222 822222222222 2.2022220 .22-2223 .2222.2

 

 

 

 

 

 

9. 2222.22 2222.22 2222.22- mmd
2.2. 2. 2 .22- 2222.22 2222.22- 300
3. 222.22- 2222.22- 2222.22- 2222.22
2.2. mmd- 2.22.22 2222.22 2222.22
2.2. 3.2.22- 2222.22 5 2 .22- 2222.22-
9. 2222.22- 2222.22 2222.22- 222.22
2222 232.0022- 2 2.22 22 2 .22- mmd
o2. 2.2-N22- 222.22- 2222.22- 1. 22.22-
>2 252822282222. <25 2528222322222. 22.220

32.2.22

 

222 .2228 222 E23 .222 2222
2222 22228 222 E22222, .222 2.22
222 22222222 2222 E22222 .222 22222
2222 22222222 2222 E22222, .222 2222
222 2222.220 222 E22222 .2222 202
2222 222822 222 E22222, .2222 2222
222 22222222 2222 8.22222 .2222 22.2

.2222 22222222 2222 E223 .2222 222

22822222222220 223$. 2.282.232 2.82.222 222-2.2822232; 2228 32$ 3229223222 22.2 2.222226% 25% 2252 83.3.5 20.? 2222222222..

22m 022222.2-

 

The amount of variance in GPA and absenteeism, predicted by biodata and SJ I,
across conditions is shown in Table 20. There are no distinct patterns to validity across
conditions, and validity comparisons across groups can not be clearly interpreted as they
are limited by the small sample size and low overall criterion-related validity of the
biodata and SJ I measures.

To test Hypothesis 10, I examined the responses of two groups of respondents.
Respondents in a group that should be most apt to inﬂate their scores (the Coaching,
Motivation, no Warning group, which I refer to as the high performance group) were
compared with those in the group in which faking should be minimized as the individuals
would be least apt to inﬂate their scores (the no Coaching, no Motivation and a Warning
group, which I refer to as the reference group). Table 21 below presents descriptive data
for each of the two groups. The high performance group shows a higher mean score on
all three of the faking identiﬁcation scales, although the largest gap between the two is on
the scales comprising bogus items and the inﬂation scale on which the mean differences
are over .5 standard deviation units. Recall that the Inﬂation Index was generated using
these two conditions, so this large difference is artiﬁcially inﬂated, and this index would
need to be cross-validated. A much smaller mean difference was obtained on the
Impression Management scale, suggesting that this is more stable and personality-related,

rather than being prompted by situational factors, which seems to be the case for the

Bogus items and items making up the Inﬂation Index.

86

 

 

 

 

 

 

: . .1“ .II

 

 

 

 

 

22.22- 322 222.2 22. 2.22.2 222.2 22.
22.22 222.22 222.2 22. 2.22 2.2.2 2.2.
22.22 22.222 222-2.2 22. 222 2 22.222 22.
22. 2- 9.2 2.22222 22. 8.2 222.2 2.2.
22.22- 2.22 222.2. 22. 2.2 2.2.2 22.
2. Q2. 5222 2 a. 92...: >2
oocohumom 00222295922022
2222222

 

 

2m- 22 2222282582..

222.2.-222 <20 2 28>

$22 2 -0 22 2522-2832332 2828222222222

87

2 22.222 222222 25222222222

28-2.2 28m 22222222

8.2296 mono-2.28% .2252 2.252222222me 22$: 26223222222226me

2N o22222-2-

Given these data, a large number of positive responses to the Bogus items
deﬁnitely seems to indicate that the respondents’ answers are generally suspect. The
same might be true of the Inﬂation index, but, as I noted above, results for that scale
should be cross-validated.

To examine how the use of these scales to correct for faking might affect who is
selected or admitted to college, I examined possible admissions decisions based on a
selection of the top 10%, 25% and then 50% of the participants, based on their biodata
and S] I scores. To control for faking, I identiﬁed those who scored above a cut-point on
the faking scales. The Bogus scale is presented as an example; similar analyses were
conducted with the inﬂation index and impression management, with similar results. A
cut-point of 4 on the Bogus items scale was chosen as this reﬂected any positive response
on the Bogus scale items. As the Bogus items are made up of experiences that are
impossible, I can presume that anyone who claims experience on these items has not been
accurate about their experience, and it may be reasonable to exclude them. Having
excluded these individuals, I then proceeded down the list, selecting the next best
candidates in terms of biodata and SJ I performance, who do not meet the cut score on the
Bogus items scale. The actual college performance (GPA and absenteeism) of those who
were excluded was then compared to the actual college performance of those who were
selected as alternatives.

Looking ﬁrst at the .10 selection ratio, those chosen for best performance included
26 individuals with scores above 4 on the Bogus scale. These individuals were removed,

and replaced with the next-highest scoring individuals who did not have scores above the

88

 

 

 

 

cut on the Bogus scale. The differences in college performance for these two groups is

shown in Table 22 below.

Table 22

Descriptive Statistics for Selection Ratio of . I 0

Absenteeism

Year 1 GPA

Removed from top
10% selection

N Mean SD

Replacements for top
10% selection

N Mean SD d

 

26 3.08 1.06

26 2.77 0.90

26 3.08 0.84 0.00

26 2.85 0.82 ~0.09

The same process was followed for a selection ratio of .25 and the results are

shown below, demonstrating little difference between those excluded and those pulled in

as replacements.

Table 23

Descriptive Statistics for Selection Ratio of .25

Absenteeism

Year 1 GPA

Removed from top
25% selection

N Mean SD

Replacements for top
25% selection

N Mean SD d

 

58 3.02 1.03

58 2.89 0.78

89

58 2.95 0.96 0.07

55 2.87 0.73 0.03

 

 

 

 

The same process was followed again for a selection ratio of .50, with similar

ﬁndings.

Table 24

Descriptive Statistics for Selection Ratio of .5 0

 

Removed from top Replacements for top
50% selection 50% selection 7
N Mean SD N Mean SD d
Absenteeism 109 3 .02 1.02 109 3 .05 1.15 -0.03
Year 1 GPA 108 2.93 0.70 103 2.89 0.79 0.06

 

‘FVH2H

It appears that, regardless of the level of selectivity, using Bogus items as a means
of removals does not change the quality of the student population, when considering
actual college performance.

This study demonstrates those who are coached and motivated to achieve can
produce high scores on biodata and situational judgment, and those scores may be a result
of inﬂation due to faking. Inﬂation scales can be a useful method of identifying response
manipulation but in this case are less useful in correcting for it. Further research into
inﬂation scales and corrections would be necessary to develop protocols for their use. It
may be helpful to conduct further analyses with a broader criterion measure that is
conceptually more closely related to the biodata and situational judgment dimensions.
Results here are probably affected by the very low criterion-related validity of the biodata

and SJ I measures, and Graham, et al. (2002) have linked faking in biodata to reduced

criterion-related validity.

90

CONCLUSIONS
Discussion

This study has addressed four major questions. First, how do different situational
differences such as testing conditions inﬂuence how an individual performs on biodata
and SJ Is? Second, what are the individual difference characteristics of people who are
most effective in inﬂating their responses? Third, what are characteristics of the test
items that seem most susceptible to inﬂation? Fourth, does the susceptibility of biodata
and SJ Is to faking affect their predictive validity if they were to be used in college
admissions decision-making?

Situational Factors : '; 9w

Situational factors can be very important determinants of performance on biodata
and situational judgment items. Brief coaching was shown to improve scores on both of
these non-cognitive measures, an important issue in the context of test application.
According to Sackett, et al. (1989), “In the typical performance domain, the examinee
may adopt an explicit self-presentation strategy in responding to the selection device
based on a hypothesis about what the employer is looking for in an applicant.” (p. 148).
Coaching appears effective in aiding examinees in generating hypotheses and presenting
themselves appropriately. Were biodata and SJ Is to be used in a college admissions
selection process, those who have access to coaching would be better able to improve
their scores. Such susceptibility to coaching is of great practical signiﬁcance to the
College Board, the owners of these measures, as they would need to be able to defend the
reliability and validity of the measures, or ensure that examinees have received equal

access to coaching (see Standards for Educational and Psychological Tests and

91

 

 

 

 

Manuals). Considering the effectiveness of the written coaching provided in
Cunningham, et al. (1994), it may be appropriate for researchers to consider future
studies that examine the difference in effectiveness between face-to-face coaching
programs, and materials that can be provided in hard copy or via electronic media. Also,
it may be useful to conduct an examination of the relationship between warning
statements and coaching, as coaching may effectively negate the power of the warning
statement.

If these tests were to be used to contribute to college admissions decisions, it is
realistic to assume that college applicants would be highly motivated to perform well.
Motivation was shown to have a signiﬁcant effect on performance on these measures, and
the interaction of coaching and motivation for SJ Is in this study suggests that maximal
performance in these tests in an applied setting will be best facilitated by the combination
of personal motivation and coaching.

Warning statements, although not found in this study to be a signiﬁcant factor in
suppressing inﬂation overall, did operate in the direction expected when individuals are
motivated to perform well. Warning statements are not equally relevant for biodata and
SJ I items. While objective and veriﬁable biodata items, such as the number of leadership
positions that a person held in high school, could be veriﬁed by contacting the high
school, veriﬁcation would be more difﬁcult to conduct for SJ Is. While peers or teachers
could be contacted and asked whether an individual is actually likely to behave in a
certain way in a certain situation, the possibility of this veriﬁcation is clearly more

remote, suggesting that warning statements about veriﬁcation of responses are less useful

92

 

 

 

 

for SJ Is, and may even make matters worse by actually planting the idea that one could
answer dishonestly.

To be able to implement the biodata and SJ I in an admissions context, these
measures would need further examination, ideally with a sample of individuals who
really have a personal desire to score well on the tests and a great deal to lose if they do
not, to fully understand the power of a warning statement and the interaction effects of
motivation and warning. To ensure the most powerful warning effect it may be best to
include both the warning that faking could be identiﬁed, as well as the warning that there
would be potential negative consequences for faking (see Dwight & Donovan, 2003).
According to Dwight and Donovan, both of these characteristics of a warning statement
play a part in limiting faking. A study using a sample of real college applicants who
believe that these tests are being used in their admissions decision-making process would
provide an excellent framework for building our understanding of the effects of
motivation and warning on biodata and SJ I score inﬂation.

Identification of Inﬂation

Three inﬂation identiﬁcation methods were used, providing valuable information
about individual differences and inﬂation. Inﬂation captured by these scales was
unrelated to race, gender, or cognitive ability.

First, Bogus items were considered. Bogus items were successfully used to
identify those who had claimed experiences that were impossible, and scoring on Bogus
items was more situational than personality-related. Bogus item scoring was weakly
related to openness to experience, which might suggest that those who are more creative

in framing their own personal experiences are able to warrant marking experience on

93

' it!“

2.,__, _. g

 

 

 

 

Bogus items, perhaps viewing the experience being addressed as in some way related to
an experience they have had. Future research with bogus items should include a broader
selection of bogus items, with varying degrees of obviousness. There may be particular
bogus item characteristics that make some more effective as ﬂags of inﬂation than others.
Second, an Inﬂation Index was created. Inﬂation captured with an empirically-
derived Inﬂation Index was more personality-driven that the Bogus items, and was Fl
weakly related to extraversion, agreeableness, openness and self-deception. The Inﬂation

Index proved to be a useful technique in identifying those with unusually high scores.

 

r
Nevertheless, using such an index as a way to ﬂag problematic responders could cause 7 j a ‘
headaches for the College Board, who would need to be able to defend their practices : in.“ l
against those arguing that they did not cheat, but were responding honestly, but simply
have unusual experiences that result in their being ﬂagged by the index. It should be
noted that while the empirical Inﬂation Index proved useful, results should be cross-
validated.
Third, the BIDR Impression Management scale was used as an indicator of
inﬂation. Impression management appeared to be the most personality-related of the
identiﬁcation scales. Impression management was more strongly related to
conscientiousness, agreeableness, emotional stability, and self-deception. This suggests
that inﬂation related to impression management may be more adaptive than inﬂation
related to Bogus items, or the Inﬂation Index. That impression management appears to
be trait-related lends support to the argument that it should not be used for controlling for

socially desirable responding. It is apparent from this study that inﬂation is largely a

94

 

 

ﬁlnction of the situation in which the test is taken, rather than the characteristics of the
individual, and this study lends support to the view of impression management as a trait.
Item Characteristics

Item characteristics can play an important role in creating a test that is less
vulnerable to inﬂation. This study found that biodata items that were judged as being less
objective, less controllable, and more college relevant were more susceptible to inﬂation
under different manipulation conditions. SJ I items that were objective or veriﬁable
tended to have lower scores, unless the individual was coached and motivated to do well,
in which case, the suppressor effect faded. The suppressor effect of item characteristics
was not weakened for biodata to the extent it was for SJ I items, when coaching and :5 it
motivation to perform were provided.

To limit inﬂation, it may be important for biodata test generators to choose items
that are objective and veriﬁable where possible. However, one disadvantage of trying to
ensure that items are objective and veriﬁable is that biodata inventory builders may lose
the valuable multidimensionality captured from a broader pool of items. Also, there may
be resistance to removing items that are fakeable when those items are perceived as the
most valid. While college relevance was shown to increase inﬂation, there is an inherent
tradeoff in seeking to minimize faking by trying to disguise college relevance — tests that
are viewed as relevant are more likely to be perceived as fair (see Schmitt, Oswald, Kim,
Gillespie, & Ramsay, in press), and trying to reduce college relevance would not be
recommended.

That less controllable items were more susceptible to inﬂation suggests that

further research will be necessary to resolve the different categories of control, and the

95

 

consequences of having items that tap those categories. Graham, et al., (2002) indicate
four categories of controllable items, and these may all produce differing results in terms
of inﬂation and validity. Judges in this study were not given multiple categories for
rating controllability of items, and the work of Graham, et al., provides guidance for
improved evaluation of controllability in future research.

An elaboration requirement in biodata items appears to be effective in suppressing
scores, adding to existing literature in this area (e. g., Schmitt & Kunce, 2002). However,
not all items are equally suitable for an elaboration requirement. It may be that more
veriﬁable items are more easily written with an elaboration requirement. Also, there is an
issue of memory that may be affecting responses to elaborated biodata items.
Respondents may limit their responses when elaboration is required only because they
can’t remember the details and so would be unable to record them. Ramsay, Kim,
Gillespie, and Friede (2003) have considered memory as a factor in a biodata elaboration
study, but used a memory test that has questionable construct validity, and may be more a
test of general knowledge. Further research in this area may provide answers to the
questions surrounding elaboration requirements and the factors that contribute to their
effectiveness.

These are issues that test developers will need to balance as they attempt to create
inventories that are reliable and useful. Considering the relationship of item
characteristics to inﬂation, and that validity may also be affected by vulnerability to
inﬂation linked to item characteristics (Graham, et al., 2002) item characteristics as a

means of understanding inﬂation may be a fruitful area for future research.

96

Validity

The issue of validity is of critical importance if the tests used here are put into
practice for selection in an applied setting. It appears from this study that inﬂation on
biodata and SJ I does not attenuate the criterion-related validity, however, we are
examining a set of tests that show limited criterion-related validity without inﬂation. To
reach a deﬁnitive conclusion about the relationship of inﬂation and validity, it may be
useful to conduct further research with performance measures that are multidimensional
and more closely linked to the dimensions captured by the biodata and SJ I questions.

Identifying participants who appeared suspicious, based on the three inﬂation
identiﬁcation methods, and then replacing them with participants who were not suspected
of inﬂation did not prove effective in selecting a higher performing group in terms of
college performance. However, criterion-related validity was low, and such corrections
may have different effects if one were examining a broader criterion space.

Limitations

One limitation of this study is the slightly different warning manipulations for the
motivated and not motivated groups. While both groups received the same statement
about the possibility that responses would be veriﬁed, the consequences of being found
providing dishonest answers had a greater negative consequence for the motivated group,
who would lose a cash payout for high performers. These differences mean that the
manipulation was not equivalent across groups. It would have been helpful to have
included a manipulation check regarding the warning statements to establish that they had
been attended to. This sample was not completing these measures under the real

expectation that responses would contribute to college admission decisions, limiting the

97

 

power of the warning statement that there would be negative consequences for
responding dishonestly. In a real world scenario, the risk that one’s college application
may be rejected based on dishonest responding would be of great consequence.
Therefore, future research will need to examine warning statements in a sample of real
applicants. As the inﬂation index was generated in the study sample, it would need to be
cross-validated as similar results may not be found in a different sample.

Another possible limitation is that elaborated items in the study may have been
items that were more veriﬁable. This veriﬁability, rather than the elaboration
requirement itself, may have limited the respondents’ inclination to inﬂate responses. In
addition, it is difﬁcult to draw conclusions about any changes in validity, or the
effectiveness of correcting for inﬂation, when there is very limited criterion-related
validity for the biodata and SJ I measures in this sample.

Practical Implications

The use of biodata and situational judgment measures have shown promise as
predictors of student performance (Oswald, et al., in press). However, in this sample,
these measures did not prove effective as performance predictors. For these tests to be
implemented in a college admissions context, greater criterion-related validity would
need to be demonstrated. Considering the vulnerability of these measures to inﬂation
prompted by motivation and coaching, it may be most practical to reserve measures such
as these for informational or developmental purposes, rather than for admissions
decision-making.

Bearing in mind that respondents can improve their scores when coached very

brieﬂy on how to do so, it is problematic to promote this test as one that could be used in

98

 

admissions decision-making. If the tests were to be used in such a way, test owners
would need to provide access to coaching materials, so that all examinees have an equal
opportunity to improve their scores through knowledge about the test. While written or
web-based materials could easily be generated by the test owners, it is reasonable to
assume that the implementation of tests such as these would prompt market forces to
produce tests preparation training classes that could provide greater advantages to those
with the resources to attend such training. This could precipitate sub-group differences in
test performance, one of the issues that these tests are designed to avoid.

Test owners would be well-advised to implement a warning statement in the test
administration, once further research has been conducted that clariﬁes the most effective
type of warning statement in a college admissions context. It is expected that a warning
statement, claiming that dishonest responses may be veriﬁed and that there may be real
negative consequences for inﬂating responses, may be effective. However, the power of
the warning statement is expected to vary based on the purpose for which the testing tools
are used. High-stakes decisions based on test performance may result in greater
responsiveness to a warning statement that includes negative consequences for dishonest
responding.

In compiling an inventory of biodata items, test developers should consider items
that are less likely to be inﬂated. This may be achieved by focusing on items that are
written in a way that maximizes their content veriﬁability and objectivity. Controllability
of item content should also be considered, but further research is necessary regarding the

dimensions of controllability.

99

There may be items that have high criterion-related validity, yet are vulnerable to
inﬂation. Test developers will need to balance the issue of transparency and possible
inﬂation with prediction of performance. It would also be advisable to include an
elaboration requirement for biodata items, where possible, to limit the item susceptibility
to inﬂation.

While high scores on the BIDR’s impression management scale are personality-
related, and may be adaptive, scores on bogus items are not. The inclusion of bogus

items is one method of identifying individuals who may be responding in a dishonest

 

manner. However, some members of the public may ﬁnd it inappropriate to ask “trick I
questions”, and could generate publicity that makes the items themselves less useful. An : QM.
inﬂation index is another suitable means of identifying responses that are suspicious, and
the index used here will need to be validated with another sample. While use of one or
more of these scales to identify inﬂation is possible in an applied setting, their
implementation may be difﬁcult. While conceptually, bogus items identifying those who
have lied, and the empirical soundness of an inﬂation index, should be defensible reasons
to exclude the scores of a respondent, the public relations problems that could ensue with
implementation of such measures could make use of these scales infeasible.

One problem that may arise with the use of corrections using inﬂation
identiﬁcation scales is the partialing out of variance that may be an important predictor of
performance. Also, people do not necessarily inﬂate their responses in the same way,

and standard correction method may not account for this (see discussion in Christiansen,

et al., 1994). Also, some individuals may honestly have marked responses that indicate

100

their real but very unusual levels of experience, and such candidates may find themselves
discriminated against, if such methods were put into practice.

While we can use these techniques to identify inﬂation, what to do about
respondents suspected of faking in a college admissions context is debatable. Decisions
to dismiss dishonest applicants in such a high stakes setting will need to be based on far
more extensive research than has been provided in this study, to withstand public scrutiny
and possible litigation. If these tests were simply to be used for developmental purposes,
bogus items and the inﬂation index would be useful markers for a counselor reviewing
test results.

Further research on these biodata and situational judgment measures would allow : in“.
them to be reﬁned so that they effectively capture the dimensionality of the criterion
space that they are designed to predict. To be of practical value they should have
construct and criterion-related validity, and this will need to be examined in different
samples, while considering different item types. Research should also seek to establish

the feasibility of using these measures in an admissions context and should investigate

other possible uses for these tests.

101

 

 

APPENDICES

APPENDIX A

Mael’s Taxonomy

103

; ’lili"

 

Mael 's Taxonomy of Biodata Items

Historical
How old were you when you got your ﬁrst

paying job?

External
Have you ever been ﬁred from a job?

Objective
How many hours did you study for your
real-estate license test?

First-hand

How punctual are you about coming to
work?

Discrete

At what age did you get your driver’s
license?

Veriﬁable

What was your grade point average in
college?

Were you ever suspended from Little
League?

Controllable

How many tries did it take you to pass the
CPA exam?

Equal access

Were you ever class president?

Job relevant

How many units of cereal did you sell
during the last calendar year?
Noninvasive

Were you on the tennis team in college?

Future or hypothetical

What position to you think you will be
holding in 10 years?

What would you do if another person
screamed at you in public?

Internal

What is your attitude towards friends who
smoke marijuana?

Subjective

Would you describe yourself as shy?
How adventurous are you compared to
your coworkers?

Second-hand

How would your teachers describe your
punctuality?

Summative

How many hours do you study during an
average week?

Nonveriﬁable

How many servings of fresh vegetables do
you eat every day?

will“

Noncontrollable

How many brothers and sisters do you
have?

Nonequal access

Were you captain of the football team in
high school?

Not Eb relevant

Are you proﬁcient at crossword puzzles?

Invasive
How many young children do you have at
home?

 

Note. Adapted from “A Conceptual Rationale for the Domain and Attributes of Biodata
Items,” by F. A. Mael, 1991, Personnel Psychology, p. 773.

104

 

APPENDIX B

Sample Questionnaire

 

PID:A

FORMl

The ﬁrst booklet asks questions pertaining to how you approach other people or life in
general. You will also complete some demographic questions. You will use the ﬁrst
scantron for these questions.

The second booklet contains questions that ask you about your history and life
experiences. You will also be presented with descriptions of problem situations, and you
will indicate which action you would be most likely to take and which action you would
be least likely to take. These are situational judgment tasks. You will use the second
scantron for these questions.
As you are answering these questions, please record your answers on the scantron form.
For each question, please ﬁll in completely the circle you choose. Where you are asked
to elaborate on your answer, please write your response on the lines provided in your
exercise booklet.
First, please take a moment to complete the following areas of your ﬁrst scantron:

PID — Please write in your PID, and then fill in the

corresponding circles.

Form — Please indicate Form 1 A

Also, please indicate your PID on the cover of this booklet, at the top right hand comer.

You will have 90 minutes to complete this study.

106

 

The following pages contain phrases describing people's behaviors. Use the rating scale
below to describe how accurately each statement describes you and please provide
answers that describe yourself as you generally are now, not how you wish to be in the
future. Describe yourself as you honestly see yourself, in relation to other people you
know of the same sex as you are, and roughly your same age. So that you can describe
yourself in an honest manner, your responses will be kept in absolute conﬁdence. Please
read each statement carefully.

 

Please use the ﬁve-point scale below:

1 = Very Accurate

2 = Moderately Accurate

3 = Neither Accurate nor Inaccurate
4 = Moderately Inaccurate

5 = Very Inaccurate

 

 

 

 

90. Make people feel at ease. : «m»
91. Am not interested in abstract ideas.

92. Change my mood a lot.

93. Don't like to draw attention to myself.

94. Talk to a lot of different people at parties.
95. Have excellent ideas.

96. Insult people.

97. Follow a schedule.

98. Am exacting in my work.

99. Get stressed out easily.

100. Seldom feel blue.

101. Don't mind being the center of attention.

102. Worry about things.

107

 

 

Please use the ﬁve-point scale below:

1 = Very Accurate

2 = Moderately Accurate

3 = Neither Accurate nor Inaccurate
4 = Moderately Inaccurate

5 = Very Inaccurate

 

 

103.

104.

105.

106.

107.

108.

109.

110.

111.

112.

113.

114.

115.

116.

117.

118.

119.

120.

Have little to say.

Don't talk a lot.

Use difﬁcult words.

Keep in the background.

Have difficulty understanding abstract ideas.
Make a mess of things.

Pay attention to details.

Am always prepared.

Feel little concern for others.
Have a rich vocabulary.

Like order.

Often feel blue.

Am full of ideas.

Spend time reﬂecting on things.
Take time out for others.

Have frequent mood swings.
Have a soft heart.

Am quick to understand things.

108

 

 

 

Please use the ﬁve-point scale below:

1 = Very Accurate

2 = Moderately Accurate

3 = Neither Accurate nor Inaccurate
4 = Moderately Inaccurate

5 = Very Inaccurate

 

 

121.

122.

123.

124.

125.

126.

127.

128.

130.

131.

132.

133.

134.

135.

137.

138.

139.

Am interested in people.

Start conversations.

Am the life of the party.

Neglect my duties.

Am relaxed most of the time.

Am not interested in other people's problems.
Often forget to put things back in their proper place.
Feel others' emotions.

Sympathize with others' feelings.

Do not have a good imagination.

Get irritated easily.

Am easily disturbed.

Get chores done right away.

Am not really interested in others.

Am quiet around strangers.

Feel comfortable around people.

Leave my belongings around.

Have a vivid imagination.

Get upset easily.

109

' UH."

 

Please answer this next set of questions using the ﬁve-point scale below.
I = Very true
2 = Mostly true
3 = Somewhat true
4 = Mostly untrue
5 = Very untrue
140. My ﬁrst impressions of people usually turn out to be right.
141. It would be hard for me to break any of my bad habits. P1
142. I don't care to know what other people really think of me.

143. I have not always been honest with myself.

 

144. I always know why I like things. _. B
145. When my emotions are aroused, it biases my thinking.

146. Once I've made up my mind, other people can seldom change my opinion.
147. I am not a safe driver when I exceed the speed limit.

148. I am fully in control of my own fate.

149. It's hard for me to shut off a disturbing thought.

150. I never regret my decisions.

151. I sometimes lose out on things because I can't make up my mind soon enough.
152. The reason I vote is because my vote can make a difference.

153. My parents were not always fair when they punished me.

154. I am a completely rational person.

155. I rarely appreciate criticism.

Please stop and check that you have just completed the first side of the ﬁrst scantron.
156. I am very conﬁdent of my judgments.

157. It's all right with me if some people happen to dislike me.

110

Please continue using the five-point scale below.

158.

159.

160.

161.

162.

163.

164.

165.

166.

167.

168.

169.

170.

171.

172.

173.

174.

175.

176.

177.

1 = Very true
2 = Mostly true
3 = Somewhat true
4 = Mostly untrue
5 = Very untrue
I don't always know the reasons why I do the things I do.
I sometimes tell lies if I have to.
I never cover up my mistakes.
There have been occasions when I have taken advantage of someone.
I never swear.
I sometimes try to get even rather than forgive and forget. t
I always obey laws, even if I'm unlikely to get caught. 7 i '1‘
I have said something bad about a friend behind his or her back.
When I hear people talking privately, I avoid listening.
I have received too much change from a salesperson without telling him or her.
I always declare everything at customs.
When I was young I sometimes stole things.
I have never dropped litter on the street.
I sometimes drive faster than the speed limit.
I have done things that I don't tell other people about.
I never take things that don't belong to me.
I have taken sick-leave from work or school even though I wasn't really sick.
I have never damaged a library book or store merchandise without reporting it.

I have some pretty awful habits.

I don't gossip about other people's business.

111

 

178. Please enter 8 on your scantron for this item.

Please ﬁll in the appropriate answer on your form according to the responses provided.

For questions 90 and 91, use the following scale:

= less than 1.00 _
= 1.00 to 1.49 a
= 1.50 to 1.79
= 1.80 to 2.09
= 2.10 to 2.39
= 2.40 to 2.69
= 2.70 to 2.99 H
= 3.00 to 3.39 .
=3.40 to 3.59 .. 2,...

= 3.60 or greater -

 

renew: rm 9.0.6:»

179. Cumulative college GPA:

180. GPA for this past semester:

181. Indicate the extent to which you have missed regularly scheduled class(es) in the
past six months.

I have never missed class
Imissed 1-3 classes.

I missed 4-8 classes.

I missed 9-15 classes.

I missed more than 15 classes.

.09-99‘?»

112

If you have missed class in the past six months, indicate the reasons you missed class.

Please mark:

a = Yes

b=No

182. You were faced with an emergency.

183. You were sick.

184. You partied too much the night before.

185. You were tired or you failed to get up in time.

186. You were talking or socializing with friends.

187. You were involved with another university event and couldn’t go.
188. You found the class boring.

189. You did not believe the instructor would cover anything new or important.

Demographics — Please respond to the following questions to the best of your ability.

190.

191.

192.

“-"r'e‘qo wee-99‘s»

.09-99‘!»

What is your age?
18
19
20
21
22
23
24
25
26
27 +

What is your gender?
male
female

What is your year in school?
freshman

sophomore

junior

senior

5‘h year +

113

at!!!-

 

193.

9‘1”

1 94.

195.

r'P‘qo rm .o-P .ch

Which of the following best characterizes you?
US. Citizen

Non-citizen — Canadian

Non-citizen — other

Is English your primary language?
yes
no

What ethnicity do you consider yourself to be?
Mexican American

Puerto Rican

Other Hispanic

American Indian or Alaskan native

Asian

Black/African American
Caucasian/White/Not of Hispanic origin
Native Hawaiian or other Paciﬁc Islander
Other

 

 

You have now completed the ﬁrst section of your form.
Please raise your hand to let the proctor know that you have
finished this section.

Now please wait quietly until everyone has completed this section.

 

 

114

~ in.»

 

1‘3, - -

PIDzA

 

You are now beginning the second section.
Please take a moment now to prepare the second scantron:
PID — Please write in your PID, and then fill in the

corresponding circles.

Form — Please indicate Form 1 B

 

Also, please indicate your PID on the cover of this booklet, at the top right hand comer. - - * M"

115

Please read these instructions very carefully:

Imagine that you are applying for admission to Michigan State University,
and your responses to these questions could inﬂuence the decision on
whether or not to accept you for admission. In other words, imagine that
this questionnaire is part of the test requirements for college admissions, and
admission here is very important to you. Complete this questionnaire in a
way that presents yourself honestly but in the best light possible so that you
are most likely to get admitted to the university.

As an added incentive to do well, participants in this study who score above
the 50th percentile on this questionnaire will receive $10.

 

: " it!!-
Now please proceed, answering the remainder of the questions in this study.

116

Request for Payment Information

Please provide the following information to allow us to mail a check to you if you meet
the required score. Those participants who score above the 50th percentile on this
questionnaire will be mailed $10.

Name:

 

Address line 1:

 

 

Address line 2:

 

City: State: Zip:

117

Following are some biographical data questions:

(Biodata items are inserted here)

You will be presented with descriptions of problem situations. Each problem has
between four and seven alternative actions that might be taken to deal with the problem.
You are to make two judgments for each problem. First, decide which alternative you
would be MOST LIKELY to take in response to the problem. It might not be exactly
what you would do in the situation, but it should be the alternative that comes closest to
what you would actually do. Record your answer on the scantron form. Second, decide
which alternative you would be LEAST LIKELY to take in the situation, and record your
answer on the scantron. Please read all of the alternatives before deciding.

 

(Situational judgment items are inserted here)

 

You have now completed the study.
Please wait quietly until everyone has finished their work.

Thank you for your participation.

 

 

 

118

 

 

APPENDIX C

Sample Biodata Items

119

: Viki."

 

1. During the past year, how many times out of self-interest have you searched for
information about other regions, countries, or cultures (at the library or on the

Internet)?
f. 0
g. 1-3
h. 4-7
i. 8-12
j. more than 12

If you answered b, c, d, or e, brieﬂy describe up to 5 countries or cultures and the
topic that you investigated.

 

 

 

 

 

2. How many times in the past year have you tried to get someone to join an activity ‘ ”W

in which you were involved or leading?

a. never

b. once

c. twice

(1. three or four times
e. ﬁve times or more

3. In the past six months, how often did you read a book just to learn something?

a. never

b. once

c. twice

d. three or four times
e. ﬁve times or more

If you answered b, c, d, or e, brieﬂy describe up to 4 books you read and what you
wanted to learn.

 

 

 

 

 

NOte. For a further information on the complete set if items, please contact the College
Board, New York, New York.

APPENDIX D

Sample Situational Judgment Items

121

 

tfuj.‘ .

You will be presented with descriptions of problem situations. Each problem has
between four and seven alternative actions that might be taken to deal with the problem.
You are to make two judgments for each problem. First, decide which alternative you
would be MOST LIKELY to take in response to the problem. It might not be exactly
what you would do in the situation, but it should be the alternative that comes closest to
what you would actually do. Record your answer on the scantron form. Second, decide
which alternative you would be LEAST LIKELY to take in the situation, and record your
answer on the scantron. Please read all of the alternatives before deciding.

 

. (II. 2.97

Your grade for a particular class is based on three exams, with no class attendance
requirement. All of the homework requirements for the class are posted on the professor’s

web site. What would you do?

 

3. Attend class for as long as you feel that it is helping your grades.
b. Do all the homework but only go to some of the lectures. It’s the exams that count. i

c. Go to all the classes anyway. The professor may say something important. .

. o . . -: V i"?
d. Skrp classes, but if you did poorly on the ﬁrst exam, start gorng to classes.
C. There is no need to go to classes. Just get the homework done, and pass the exams.

4. What would you be most likely to do?

5. What would you be least likely to do?

You are ﬁnding your freshman year very difﬁcult. The courses are hard, and you feel
your grades are not satisfactory. Material in class seems to be covered very quickly. What

would you do?

3. Talk with the professors and TAs to get help on how to study.
b. Find a study partner and work on homework and class material together.
c. Talk to your parents and an advisor.

(1. Study hard, try your best, and don’t worry about it.
6. Talk to my advisor and teachers; see if there are study groups or review sessions I can

attend.
f. Hire a tutor for the difﬁcult classes.
6- What would you be most likely to do?

7- What would you be least likely to do?

 

Note. For a further information on the complete set ifitems, please contact the College
Board, New York, New York.

APPENDIX E

Wording of Instruct-ion Sets

-;. 7* ihlv"

I...“ I 'ﬂt”

 

Motivated Warned Group:

“Please read these instructions very carefully:

Imagine that you are applying for admission to Michigan State University, and
your responses to these questions could inﬂuence the decision on whether or not to accept
you for admission. In other words, imagine that this questionnaire is part of the test PS
requirements for college admissions, and admission here is very important to you.

Complete this questionnaire in a way that presents yourself honestly but in the best light

 

possible so that you are most likely to get admitted to the university. 3'

As an added incentive to do well, participants in this study who score above the

5 0(11 percentile on this questionnaire will receive $10.

Note that we may verify a subset of your responses, and if you respond
dishonestly, that may invalidate this test as well as your chance to receive $10 for high

Performance.

Now please proceed, answering the remainder of the questions in this study.”

Motivated Not Warned Group:

“Please read these instructions very carefully:

Imagine that you are applying for admission to Michigan State University, and
your responses to these questions could inﬂuence the decision on whether or not to accept
you for admission. In other words, imagine that this questionnaire is part of the test
requirements for college admissions, and admission here is very important to you.
Complete this questionnaire in a way that presents yourself honestly but in the best light

possible so that you are most likely to get admitted to the university.

« ,. m:-

As an added incentive to do well, participants in this study who score above the

50‘h percentile on this questionnaire will receive $10.

Now please proceed, answering the remainder of the questions in this study.”

a"'. 9““.‘I
'_ .
'D
.2
l
I

 

Not Motivated Warned Group:

“Please read these instructions very carefully:

The following questionnaire is being tested as a way to collect information about
high-school students who are applying to go to college. We would like your
straightforward, honest answers to these questions. Your responses are strictly
conﬁdential, and they will not be used to evaluate you in any way, so please provide

answers that are as honest and accurate as possible.

Note that we may verify a subset of your responses, and if you respond

dishonestly, that may invalidate this test.

Now please proceed, answering the remainder of the questions in this study.”

126

 

Not Motivated Not Warned Group:

“Please read these instructions very carefully:

The following questionnaire is being tested as a way to collect information about
high-school students who are applying to go to college. We would like your "1
straightforward, honest answers to these questions. Your responses are strictly I

conﬁdential, and they will not be used to evaluate you in any way, so please provide

 

answers that are as honest and accurate as possible. _ 3"!

.. v‘rbt“

Now please proceed, answering the remainder of the questions in this study.”

APPENDIX F

Informed Consent Form — Motivated Group

) ‘lP-‘i'

128

 

Predictors of Student Success - Informed Consent

Please read and sign below:

In the project in which you are participating, we will be asking you to respond to a series
of questions. The ﬁrst two sets of questions are measures of judgment and of background
experiences and preferences; they are experimental measures designed to be related to
outcomes of students attending a college or university. The major purpose of this project
is to investigate how well students do on these measures, given the instructions to take
them, and whether the measures of judgment and background are related to your MSU
grades. We are also asking you to respond to some commonly used personality measures
which will help us interpret the meaning of your responses to the judgment and
background measures. Because a major purpose of our study is to determine if your
responses to the judgment and background measures are related to your performance as a
student at MSU, we will also be asking your permission to allow the registrar to give us
access to your grades and to the Office of Admissions to allow access to your high school

grades and ACT/SAT scores.

In order to link your responses to the measures with your college and high school grades,
and ACT/SAT scores, we will be asking you to provide your PID. All information you
provide will be completely conﬁdential. Only the project team (two faculty members and
three graduate students) will have access to the password-protected data ﬁle with the
original PID attached, and all data will be reported at the group level so that no one will
be able to identify a particular person. As soon as we link your responses to the data from
Admissions and the Registrar’s Ofﬁce, your PID will be deleted from our data ﬁle. Your
privacy will be protected to the maximum extent allowable by law.

We expect that it will take 90 minutes for you to complete this study, for which you will
earn extra credit in Psychology 101. Participation in this study is voluntary. As an
alternative to participation in this study, you may do other work, such as a paper, that is
Coordinated with your instructor. As an incentive to do well on this questionnaire, those
Participants who score above the 50’h percentile on this questionnaire will be given $10.
By signing below you indicate that you are free to refuse to participate in this project or
any part of the project. You may refuse to answer some of the questions and may
discontinue your participation at any time without penalty.

If you have any questions or concerns about your participation in this project, you can
Call Neal Schmitt (517-355-8305) or send an email message to Schmitt@msu.edu. If you
have questions or concerns regarding your rights as a study participant, or are dissatisfied
at any time with any aspect of this study, you may contact - anonymously if you wish -
AShir Kumar, M.D., Chair of the University Committee on Research Involving Human
Subjects (UCRIHS) by phone: (517) 355-2180, fax: (517) 432-4503, e-mail:
uCrihs@msu.edu, or regular mail: 202 Olds Hall, East Lansing, MI 48824.

 

 

Your signature below indicates your voluntary agreement to participate in this study.

 

 

Signature Date

 

130

APPENDIX G

Informed Consent Form — Not Motivated Group

131

,llkii'

 

Predictors of Student Success - Informed Consent

Please read and sign below:

In the project in which you are participating, we will be asking you to respond to a series
of questions. These questions are measures of background experiences and preferences,
experimental measures designed to be related to outcomes of students attending a college
or university. The major purpose of this project is to investigate how well students do on
these measures, given the instructions to take them, and whether the measures of

judgment and background are related to your MSU grades Because a major purpose of M!
our study is to determine if your responses to the judgment and background measures are
related to your performance as a student at MSU, we will also be asking your permission
to allow the registrar to give us access to your grades and to the Ofﬁce of Admissions to
allow access to your high school grades and ACT/SAT scores.

 

In order to link your responses to the measures with your college and high school grades, _ J
and ACT/ SAT scores, we will be asking you to provide your PID. All information you
provide will be completely conﬁdential. Only the project team (two faculty members and
three graduate students) will have access to the password-protected data ﬁle with the
original PID attached, and all data will be reported at the group level so that no one will
be able to identify a particular person. As soon as we link your responses to the data from
Admissions and the Registrar’s Ofﬁce, your PID will be deleted from our data ﬁle. Your
privacy will be protected to the maximum extent allowable by law.

rd

We expect that it will take 90 minutes for you to complete this study, for which you will
earn extra credit in Psychology 101. Participation in this study is voluntary. As an
alternative to participation in this study, you may do other work, such as a paper, that is
coordinated with your instructor. By signing below you indicate that you are free to
refuse to participate in this project or any part of the project. You may refuse to answer
some of the questions and may discontinue your participation at any time without

penalty.

If you have any questions or concerns about your participation in this project, you can
call Neal Schmitt (517-355-8305) or send an email message to Schmitt@msu.edu. If you
have questions or concerns regarding your rights as a study participant, or are dissatisﬁed
at any time with any aspect of this study, you may contact — anonymously if you wish -
Ashir Kumar, M.D., Chair of the University Committee on Research Involving Human
Subjects (UCRIHS) by phone: (517) 355—2180, fax: (517) 432-4503, e-mail:
ucrihs@msu.edu, or regular mail: 202 Olds Hall, East Lansing, MI 48824.

 

Your signature below indicates your voluntary agreement to participate in this study.

 

 

Signature Date

 

 

NH.“

133

APPENDIX H

Faking Study Protocol

 

am.“

134

PROTOCOL — Faking Study
College Board — Fall 2002 data collection

Before test administration:

Subiects:
1.) Find out how many subjects will be participating and bring enough of all supplies for
each subject plus a few extras. (check subject pool for # of subjects signed up 30 minutes 1

prior to proctoring). 3_

How to check for number of subjects:

 

http://psychology.msu.edu/SubjectPool/Welcome.asp Sign in, view/modify experiment
sessions on leﬁ side, then choose “Predictors of Student Success”, and look at “subjects
signed up”.

 

2‘? IN"
2.) Print offthe list of names. 1

Materials:

There are four slightly different forms for this study. Materials have been prepared and
placed in envelopes numbered one through four. As forms have slightly different consent
and personal information collection forms, all of these are already in the envelopes.

You will need to take enough materials so that you have (the number of participants
divided by four) of each packet, plus a few extras in case of problems.

Please have these materials organized so that materials can be distributed fairly quickly.
1.7.1 Sign-in sheet (ﬁll in date, location, and proctor name). Take a new sign-in sheet

each time.

Sufﬁcient copies of questionnaire envelopes

Extra 10-option scantrons

Debrieﬁng forms

Pencils -- please sharpen them all beforehand.

Make sure you have a watch or some way to tell time with you.

Stamp and stamp pad

Green Sheets for giving credit

Elli—11311311311318

During the test administration:
Procedure:
0 Arrive at least 15 minutes before the start of the session.

0 Place the envelopes on the desks, every other seat, or spaced apart further than
that if there are not many pe0ple signed up. Distribute the four different forms
systematically, one through four, as you lay them out, so that the different forms
are evenly distributed around the room.

0 As people enter room, ask them to sign the sign-in sheet. Check ID to make sure Tl
that face, name, and what they write on the sign in sheet all match.

0 If an individual shows up who has not signed up through the website, thank them
for coming and ask them to sign up online for a different session. If subjects
forget their ID but they are signed up, let them participate. J

 

3 an-
0 As they come in, tell them that they may open their envelopes, and review the .

informed consent and date release forms while they wait for the session to begin.
It is okay for subjects to sign forms in pencil.

0 Start about 5 minutes after designated start time, or earlier if all participants have
arrived. Those who are late may stay late.

0 Read the script verbatim.
0 Collect Informed Consent and Data Release.
0 Time for test: 90 minutes.

0 Participants will stop working at the end of the ﬁrst section, and raise their hands
to let you know they are done. When everyone has completed section one, you
will tell everyone to begin section two at the same time. They should leave their
envelopes containing section 2 under their seats until it is time to move on to
section 2. (This is done because the room having a coaching session will also be
stopped at this point to begin coaching before moving on.)

0 As you administer the test, (a) at 7:00 PM post the time remaining as 30 minutes,
and at 7:15 PM as 15 minutes,

(b) announce the time aloud as you post the time.

o If they ﬁnish early, you may start to collect their forms ~10 minutes before the
end of the session.

0 As you collecting the materials, check for the following, with the student:

PID (on both scantron sheets)

Form # and A or B (on both scantron sheets)

Completed scantron sheets (should be completed through #106 on ﬁrst
scantron sheet and #97 on the second sheet)

Booklets with elaborations, with PID and name on the booklets

If a participant has incomplete information, ask them to stay to complete
the form.

131131 1311313

When you collect materials, give the student the debrieﬁng form.

Stamp the white cards that the students brought and ﬁll in 3 half-hour credits. If
they forgot their card, ﬁll out a green sheet for them. Name of experiment is
Predictors of Student Success. Experiment stamp # is MSU/PSYCH/ 101,
Experimenter is Neal Schmitt.

If a participant decides not to participate, just let them leave and thank them for
their time.

Do not answer any content related questions

nu.»-

 

Script
<Have enveIOpes on desks>
<As students enter>

“Hi. Please sign-in and may I check your ID? You may take a seat, open your packet
and look at the consent forms, but do not begin.”

<check ID when they ’re signing in>
<Begin speech 5 minutes after designated starting time>

“Hello,

Thanks for participating in our study about the characteristics and experiences of college
students. In order to ensure that the instructions are identical each time we run this study,
I’m required to read these instructions verbatim. My name is <say your name> and I
will be proctoring this data-collection session (“along with <other person’s name>” — add
this if you have two people proctoring). This session lasts about an hour and a half, and
so let me say in advance that we appreciate your time and help with this project.

Hw-

It is important that this is your ﬁrst year of college, and that you have not previously
participated in this study. If either of these criteria render you ineligible, please let me
know now.

What I(we) have provided is an envelope containing two booklets containing questions
for you to read over and complete. You’ll ﬁll out the scantrons, and record some
answers on your booklets. I(we) will post and announce the time when you have only 30
minutes left to complete it. The time is posted just to make sure everything runs
smoothly — you should have plenty of time, so don’t worry about having to rush.

Please answer everything thoughtfully. It is important that you answer these questions
seriously and follow the instructions closely. When you have completed the ﬁrst section,
please stop, and raise your hand. Do not move on to the second section until I ask you to
do so.

When you are through with the entire study, please wait patiently until everyone else is
ﬁnished with theirs. If you know now that you will not be able to stay for the entire hour
and a half, please sign-up for a different session of the study where you will be able to
stay for the entire duration.

Now, please read and sign the lnfonned Consent form, which provides more information

on the nature of the task. With the lnfonned Consent form are two Data Release forms
we also need you to sign. When you are ﬁnished we will collect those forms.”

138

 

<you collect the Informed Consent and Data Release forms >

“Please bubble-in your PID and form number and letter A on the ﬁrst scantron

immediately. Please write your PID at the top right corner of the question booklet. Keep
the questionnaire for the ﬁrst section on your desk, and place the envelope containing the
second questionnaire packet under your seat.”

<Wait for subjects to fill in scantron and place envelope under seat>

“Please raise your hand any time if you have any questions or problems, for example, if

there is a page missing in your packet or you don’t understand something you’re reading. 51
Do you have any questions?” 1'

You may now begin.”
<Note start time>

 

<When all students have raised their hands at the end of the ﬁrst section, collect their V E}
forms and check scantrons for proper completion:> ‘
“Please bubble-in your PID and form number and letter B on the second scantron. Please 5’ "i"

write your PID at the top right comer of the second section question booklet. It is very

important that you follow all instructions in this section carefully. You
may now begin the second section.”

<Time is up>

“Okay, time is now up. I will come by to collect your scantrons and packets. Please
remain seated until all packets have been picked up and checked over quickly and you
have received a debrieﬁng form.

<Forms and scantrons are collected>
<Check to make sure they are completed properly>
<Pass out debrieﬁngform>

”Thank you very much for participating! I can now stamp your white credit cards, if
your teacher requires them. If you forgot yours, I have a slip I can ﬁll out for you with
instructions on how to exchange the slip for credit. We will also be giving you credit
online through the subject pool. We hope you found this study rewarding or at least
worth the extra credit. If you know anyone who may be interested and eligible, please
have them sign up with the Subject Pool.”

 

 

——-—-—= THINGS TO DO AFTER FINISHED =——————
1. Enter info. from sign-in sheet into designated folder in Fred’s lab.
2. Return materials to Fred’s lab in Baker. Place items in the appropriate location.

139

APPENDIX I

Coaching Directions

140

mi.»

 

 

 

Coached Group:

“Shortly you will be given a questionnaire that is pretty different from the SAT or

ACT, but it is a type of test that a high school student might take as a college admissions
test. We would like to help you do your very best on this questionnaire. In order to do
that, we will take a few minutes to discuss the types of questions that you may see in the
questionnaire. We will explain the factors that will be important in selecting the best
response options and answering the questions effectively.

Colleges are interested in several key performance dimensions. They are shown
on this handout and we will read a deﬁnition for each dimension. We will look at an
example of the type of question that you might see related to each dimension.

Biodata questions are scored on a scale of more or less positive. On the handout,
the plus and minus sign show you which answers are more or less desirable on a given
dimension.

The situational judgment questions have answer options that have been reviewed
by a panel of judges. The judges have determined which are the best and worst answers
for the dimension. The best and worst answer options are indicated for you.

Let’s look at the handout.

(Read dimensions and question examples on the handout.)

Now that you have an idea of what colleges might be looking for, you are better

equipped to answer the questions that are coming up.”

141

MM."

 

 

 

APPENDIX J

Coaching Handout

142

wt

 

If."‘ .r u "v"-

 

Dimensions Explanation

 

 

Knowledge, learning, mastery of Gaining knowledge and attaining mastery of general
general principles (incl. GPA) principles to which one is exposed. This is not
necessarily equivalent to grades, though grades can be
an indicator.

 

 

An example of a biodata question is:

Think about those courses in high school you were most interested in. Generally how
determined were you to learn the facts and concepts from the class?

a. extremely determined +
b. very determined

c. rather determined

(1. sort of determined

e. not very determined -

An example of a situational judgment question is:

Your grade for a particular class is based on three exams, with no class attendance
requirement. All of the homework requirements for the class are posted on the
professor’s web site. What would you do?

a. Attend class for as long as you feel that it is helping your grades.

b. Do all the homework but only go to some of the lectures. It’s the exams that count.
c. Go to all the classes anyway. The professor may say something important.

(1. Skip classes, but if you did poorly on the ﬁrst exam, start going to classes.

e. There is no need to go to classes. Just get the homework done, and pass the exams.

Best Response? 0
Worst Response? e

 

 

 

143

UHF:

 

 

Continuous learning, intellectual Being intellectually curious and actively interested in
interest and curiosity continuous learning both in core areas of study as well
as in peripheral or novel areas.

 

 

An example of a biodata question is:

In the last two months, how many times have you watched or listened to a documentary
(e.g., The History Channel, The Learning Channel, PBS, NPR)?

a. never —
b. 1-3

c. 4-6

(1. 7-10

e. more than 10 +

An example of a situational judgment question is:

You are enjoying a particular class, and ﬁnd yourself frustrated that only limited
material can be covered in class time. What would you do?

a. Nothing. You would be frustrated but it’s not a big deal.

b. Ask the professor about obtaining some extra materials.

c. Take the initiative and study outside of class. Find related materials on your
own.

(1. Take a more advanced class on the topic. Either switch to one or take it in
the future.

e. Drop the class and just learn the material on your own, maybe by reading the
same textbook.

f. Ask the professor to cover more material in class.

Best Response? (1
Worst Response? e

 

 

 

144

trul-

 

 

 

 

Artistic cultural appreciation and Possessing an interest and appreciation for art and
curiosity culture without necessarily being an expert.

 

 

An example of a biodata question is:

Compared with others your age, how much do you know about art (e. g., types of
painting, sculpture, and music -- historically and across cultures)?

a. much more than others +
b. somewhat more than others

0. about the same as others

(1. somewhat less than others

e. much less than others -

An example of a situational judgment question is:

You see a painting that intrigues you. You know nothing about it other than the artist’s
name. What would you do?

a. Look up the artist on the Internet to see if you can see some of their other work.
b. Ask others if they know anything about the artist.

c. Do some research to ﬁnd out what you want to know.

(I. Look for help at the library, asking for books about this artist.

e. Enjoy the painting, but leave it at that.

Best Response? a or c
Worst Response? e

 

 

 

145

m:-

-‘—IR' .13: 1'2
.,. .

 

 

 

 

Multicultural tolerance and Being open and tolerant to a multicultural

appreciation environment. Positively impacting, contributing to, or
actively seeking the beneﬁts of a multicultural
environment.

 

 

An example of a biodata question is:

In the last six months, how many times have you tried to talk to someone from a different
country or culture just to learn about their background?

a. never -
b. once

c. twice

d. three or four times

e. ﬁve times or more +

An example of a situational judgment question is:

You grew up in a small farming community and moved into a dorm area in which all
students were from an urban background. They seem to have different concerns and
interests and often just stare when you talk about your background and experiences.
How would you react?

a. Ask them questions about their experiences in the hopes that they will
develop some interest in your background. ,

b. Find other places to make ﬁiends with people who also come from farming
communities.

c. Try to talk to just one person, on their own, about what life was like for you
growing up.

(I. Ask others about their experiences and ask if they have any questions of
you.

c. Voice your feelings about the staring, and limit your talking about your
background.

Best Response? a or (1
Worst Response? e

 

 

 

146

littl-

 

 

 

 

Leadership Demonstrating leadership skills such as motivating
others and coordinating groups, or otherwise
erforming a leadership role.

 

 

An example of a biodata question is:

In high school, how often were you seen as the person that organized weekend activities
with your ﬁiends?

a. very often +
b. quite often

0. often

d. not that often

e. not very often -

An example of a situational judgment question is:

You notice that people are leaving soda cans and trash lying around in the common
areas of the dorm. What would you do?

a. Tell them to stop, and then tell the resident assistant.

b. Collect the cans and throw away the trash, turning in the cans for money.

c. Don’t do anything; the maintenance crew will clean it up, as it is their job to
do that.

(1. Post a sign to remind people not to litter, then bring it up at a ﬂoor meeting.

e. Comment on the mess to your friends in the dorm, and not do anything
about it.

Best Response? b or (1
Worst Response? e

 

 

 

147

 

til"

 

 

Interpersonal skills Exhibiting proﬁcient oral communication skills,
working well with others and possessing an awareness
of and appropriate reaction to the social dynamics of a
situation.

 

 

An example of a biodata question is:

What is your attitude about working in a team or group (e. g., on work or classroom
projects)?

a. Very positive, I prefer to work in teams

+

b. Positive, I usually like to work in teams

c. I am indifferent, it makes .no difference whether or not I work in teams

d. I usually prefer to work alone on individually assigned tasks that contribute to the team
e. I always try to work alone, I dislike working on teams

An example of a situational judgment question is:

One of your roommates has an embarrassing problem. He/she has a strong body odor
and it is offensive to everyone who must interact with him/her. What would you do?

a. Provide lots of hints such as making a joke or refening to the unpleasantness
of your own body odor. Hopefully the person would realize the point.

b. Tactfully tell the roommate about the problem and suggest a solution. One
way to start could be to ask the roommate if he/ she would like you to be honest
and direct about any problems.

0. Take actions to deal with the odor without confronting the roommate
directly. You might open the windows or buy some air fresheners.

(I. Ask someone else to talk to your roommate about the problem or leave an
anonymous message.

6. Switch your room.

f. Leave them alone. They probably know about it and cannot help the odor.

Best Response? b
Worst Response? e

 

 

 

148

M"

 

 

 

 

Social responsibility, citizenship Being a socially responsible individual and
and involvement demonstrating good citizenship and active
participation in the community, whether that is
college, neighborhood, town, state, or country.

 

 

An example of a biodata question is:

How much time do you spend thinking about the likely effects of new policies on
people's lives (school, work or government policies)?

a. much more time than most people +
b. more time than most people

c. about as much time as most people

(1. less time than most people

e. much less time than most people -

An example of a situational judgment question is:

Several students in your dorm have spoken with you about the danger of walking
through the dorm parking lot because so many people speed in and out of the lot.
While walking down the hallway, you overhear a student joke about how fast he zips
through the parking lot on the way out. How would you handle this situation?

a. Tell the student driver that he is not funny and could seriously injure
someone.

b. Warn the student driver that he should stop because he could receive a ticket
for such actions.

c. You might disapprove but would not take any action.

d. Tell an authority ﬁgure about this individual and the threat he poses, such as
the police or a dorm director.

e. Jokingly make a statement that he could kill someone, hoping that he
understands the real message.

f. Confront the student and ask him to please drive more slowly and carefully.

Best Response? a or f
Worst Response? (1

 

 

 

149

w-

 

 

Physical and psychological health Engaging in physically healthy behaviors (e. g. eating
properly and exercising regularly) and avoiding
physically unhealthy behaviors (e.g. alcohol and drug
abuse and unprotected sex). Being mentally alert,
stable, and engaged in a scholastic environment.

 

 

An example of a biodata question is:
How much energy do you have compared with other people?

a. More than anyone I know +
b. More than most

0. About the same amount

(1. Less than most

e. Less than anyone I know -

An example of a situational judgment question is:

All of the people who live near you seem to party, drink and use drugs on weekend
nights. You like most of these people, but do not want to engage in some of the
behavior in which they engage. You have no one else to hang outwith. How would
you deal with this situation?

a. Move away from that group, so that you are not getting into trouble.

b. Continue to be ﬁiends and hang out with them, but do not engage in their
activities.

c. Join a club or ﬁnd other friends, and tell your current friends they should
understand.

d. Try to teach them the problems associated with partying and using drugs.
6. Start hanging out by yourself, and refer your friends to some professional
help.

Best Response? b
Worst Response? e

 

 

 

150

W"

 

 

 

l

 

 

Career orientation Having a clear sense of purpose and direction

regarding one’s career interests and plans. Planning,

establishing and following a set of reasonably ordered
riorities in light of one’s career interests and plans.

 

 

An example of a biodata question is:
When were you ﬁrst sure about what you generally wanted to do as a lifelong career?

a. by the time you were in elementary school +
b. by the time you were in high school

c. by your last year in high school

(I. still have not decided -

An example of a situational judgment question is:

You know what kind of job you want and that getting some experience in that ﬁeld will
be helpful in getting a job after graduation. You have no idea how to get this experience
or an internship. How would you proceed?

a. Go to the job placement center or talk with your advisor.

b. Talk with some professors.

0. Attend all the campus career fairs and use all the available campus resources.

(1. Contact some of your friends in the ﬁeld and seek out information in your
department. '

e. Research jobs on the internet, write a resume, and then contact a group of potential
employers or internship possibilities, see who contacts you and work from there.

f. Call places that hire students from your major and seek job/intemship opportunities.

Best Response? a or 0
Worst Response? b or d

 

 

 

151

M"

 

£2121.- .

1!:
:‘l

 

Adaptability and life skills Adapting successfully to a changing environment,
where changes are gradual or sudden and expected or
unexpected. Doing one’s daily tasks in a thoughtful
manner.

 

 

An example of a biodata question is:
How often have you failed to meet responsibilities because you had taken on too much?

a. very often -
b. often

c. sometimes

(I. seldom

e. never +

An example of a situational judgment question is:

You are going through an especially busy period at school. It is the end of the semester,
you have papers due, need to prepare for exams, and coworkers at your part-time job are
asking that you work more shifts to help them out. You ﬁnd yourself beginning to lose
track of details and are feeling overwhelmed. What do you do?

a. Decide what’s important and then prioritize your responsibilities.

b. Relax and take a step back, knowing that you can’t do everything at once.

c. Get organized, and start planning ahead and scheduling.

d. Apologize, decline the extra shifts, and tell your coworkers that school is your
ﬁrst priority.

e. Quit your job.

f. Sleep less and work harder to get things done.

Best Response? a or (1
Worst Response? e or f

 

 

 

152

 

“are

N"

 

 

Perseverance Following through on intended courses of action,
“sticking to it”, and being committed and dedicating
time and effort to established goals and priorities.
Taking Ersonal pride in one's commitments.

 

 

An example of a biodata question is:

When encountering problems that take a long time to solve, how impatient have you
become?

a. extremely impatient -
b. very impatient

c. somewhat impatient

d. slightly impatient

e. not at all impatient +

An example Of a situational judgment question is:

An accounting class is a required course for the degree you are pursuing, and you have
no interest in accounting. Your grades in the class are beginning to reﬂect your lack of
interest. What would you do?

a. Try your best to get through it, understanding that you really need the
course, and don’t want to have to take it a second time.

b. Meet with a study group to get your grades up.

0. Focus on asking questions to make it more interesting.

(1. Spend time with a tutor to boost your grades.

e. See if you can ﬁnd a way to relate accounting to your area of interest, to
keep your enthusiasm up.

Best Response? b or d
Worst Response? c

 

 

 

153

 

 

 

Ethics and integrity Exercising good judgment and adhering to a set of
ethics. Demonstrating integrity by being honest and
lawful, not cheating, and maintaining a sense of honor
in one’s personal and academic life.

 

 

An example of a biodata question is:
How many times have you cheated on an exam?

a. never +
b. once

c. twice

d. three or four times

e. ﬁve times or more -

An example of a situational judgment question is:

You have been having trouble in a course. On the day of the exam someone offers to
sell you a copy of the exam. You have heard rumors that a large number of students in
the class have purchased this exam and are afraid that if many students do well, your
grade will look even worse. What would you do?

a. Purchase the exam so that you do not look worse than everybody else.
b. Take the exam with the knowledge that you have from studying.

c. Approach the instructor and explain what you have been offered.

(1. Buy the exam, but try to miss some of the questions.

e. Inform the instructor anonymously in the hopes that he will change the
exam.

Best Response? 6
Worst Response? (1

 

 

 

154

 

 

:1!

 

APPENDIX K

College GPA Release Form

g—I
U1
U1

 

 

l

J
I’
s
s
r r

Authorization of Release of Cumulative GPA for Freshman and Sophomore Years

As part of a research project on the prediction of undergraduate student performance, I
authorize the Ofﬁce of the Registrar to release my GPA for each of the ﬁrst four
semesters of my academic work at MSU to a research team directed by Drs. Fred Oswald
and Neal Schmitt. As soon as these investigators link these data to other test information
I am providing them, any information identifying me personally will be removed from
the data ﬁle. This information will not be released to any other research teams or
individuals and all reports on this project will involve aggregated data that do not allow
personal identiﬁcation.

Signed

 

PID

 

Date

 

156

 

 

 

APPENDIX L

High School GPA and ACT/SAT Score Release Form

157

 

ﬂl

Authorization of Release of cumulative high school GPA and ACT/SAT Scores

As part of a research project on the prediction of undergraduate student performance, I
authorize the MSU Ofﬁce of Admissions to release my ACT and/or SAT scores to a
research team directed by Drs. Fred Oswald and Neal Schmitt. As soon as these
investigators link these scores to other test information I am providing them, any
information identifying me personally will be removed from the data ﬁle. This
information will not be released to any other research teams or individuals and all reports
on this project will involve aggregated data that do not allow personal identiﬁcation.

Signed

 

PID

 

Date

 

o.»-

 

-__"

APPENDIX M

Twelve Dimensions of College Performance

 

 

159

Twelve Dimensions of College Performance

 

Intellectual Behaviors

Knowledge, learning, and mastery of general principles (Knowledge)

Gaining knowledge and mastering facts, ideas and theories and how they interrelate, and
understanding the relevant contexts in which knowledge is developed and applied.
Grades or GPA can indicate, but not guarantee, success on this dimension.

Continuous learning, and intellectual interest and curiosity (Learning)

Being intellectually curious and interested in continuous learning. Actively seeking new
ideas and new skills, both in core areas of study as well as in peripheral or novel areas.
Artistic cultural appreciation and curiosity (Artistic)

Appreciating art and culture, either at an expert level or simply at the level of one who is
interested.

Interpersonal Behaviors

Multicultural tolerance and appreciation (Multicultural)

Showing openness, tolerance, and interest in a diversity of individuals (e. g., by culture,
ethnicity, or gender). Actively participating in, contributing to, and inﬂuencing a
multicultural environment.

Leadership (Leadership)

Demonstrating skills in a group, such as motivating others, coordinating groups and
tasks, serving as a representative for the group, or otherwise performing a managing role
in a group.

Interpersonal skills (Interpersonal)

160

 

Communicating and dealing well with others, whether in informal social situations or
more formal school-related situations. Being aware of the social dynamics of a situation
and responding appropriately.

Social responsibility, citizenship and involvement (Citizenship)

Being responsible to society and the community, and demonstrating good citizenship.
Being actively involved in the events in one's surrounding community, which can be at
the neighborhood, town/city, state, national, or college/university level. Activities may
include volunteer work for the community, attending city council meetings, and voting.

Intrapersonal Behaviors

 

Physical and psychological health (Health) "
Possessing the physical and psychological health required to engage actively in a
scholastic environment. This would include participating in healthy behaviors, such as
eating properly, exercising regularly, and maintaining healthy personal and academic
relations with others, as well as avoiding unhealthy behaviors, such as alcohol/drug
abuse, unprotected sex, and ineffective or counterproductive coping behaviors.

Career orientation (Career)

Having a clear sense of career one aspires to enter into, which may happen before entry
into college, or at any time while in college. Establishing, prioritizing, and following a set
of general and speciﬁc career-related goals.

Adaptability and life skills (Adaptability)

Adapting to a changing environment (at school or home), dealing well with gradual or
sudden and expected or unexpected changes. Being effective in planning one’s everyday

activities and dealing with novel problems and challenges in life.

161 n

Perseverance (Perseverance)

Committing oneself to goals and priorities set, regardless of the difﬁculties that stand in
the way. Goals range from long-term goals (e. g., graduating from college) to short-term
goals (e. g., showing up for class every day even when the class isn’t interesting).

Ethics and integrity (Ethics)

Having a well-developed set of values, and behaving in ways consistent with those
values. In everyday life, this probably means being honest, not cheating (on exams or in

committed relationships), and having respect for others.

 

 

Note. From “Developing a Biodata Measure and Situational Judgment Inventory as
Predictors of College Student Performance,” by F. L. Oswald, N. Schmitt, L. J. Ramsay,
B. H. Kim, and M. A. Gillespie, Journal of Applied Psychology, in press.

 

 

APPENDIX N

Bogus Biodata Items

163

 

 

 

1. In the past six months, how often have you resolved disputes by isometric

analysis?
a. never
b. once
0. twice
(1. three or four times
e. ﬁve times or more
2. How often in the past year have you programmed in AJ MR?
a. never
b. once
0. twice
(1. three or four times
e. ﬁve times or more
3. How often, in the past three years, have you operated a rhetaguard?
a. never
b. once
c. twice
(1. three or four times
e. ﬁve times or more

4. In the past six months, how often have you matrixed solvency ﬁles according
to publication standards?

never
once
twice
three or four times
ﬁve times or more

9999‘!»

 

Note. Items 1, 3 and 4 adapted from “Inflation Bias in Self-Assessment Examinations:
Implications for Valid Employee Selection,” by C. D. Anderson, J. L. Warner, and C. C.
Spencer, 1984, Journal of Applied Psychology, 69, 577. Item 2 was written for this
study.

164

 

 

 

APPENDIX 0

Inﬂation Index Items

165

 

 

In the past year, in how many fundraisers have you participated?

a. none
b. 1

c. 2

d. 3

e. 4 or more

In the past month, how many times did you go out and learn more about
something simply because it seemed interesting?

a never E
b once i:
c. twice

(1 three or four times

e ﬁve times or more

 

If a fellow student offered you a copy of upcoming exam questions that he
had retrieved ﬁ'om the teacher’s recycling bin, how likely would you be to
accept a copy?

a extremely likely
b. quite likely

c somewhat likely
(I not at all likely

How often in the past year have you volunteered to be the spokesperson for a
group project you did at school or work?

much more often than most people
somewhat more often than most people
about as oﬂen as most people
somewhat less oﬁen than most people
a good bit less often than most people

.09—99‘s»

Brieﬂy describe up to 4 examples and the nature of the projects.

 

 

166 r
l

 

 

You are ﬁnding your freshman year very difﬁcult. The courses are hard, and
you feel your grades are not satisfactory. Material in class seems to be covered
very quickly. What would you do?

a. Talk with the professors and TAs to get help on how to study.

b. Find a study partner and work on homework and class material together.

c. Talk to your parents and an advisor.

(1. Study hard, try your best, and don’t worry about it.

6. Talk to my advisor and teachers; see if there are study groups or review
sessions I can attend.

f. Hire a tutor for the difﬁcult classes.

1. What would you be most likely to do?

2. What would you be least likely to do?

You grew up in an environment that did not expose you to people from other
ethnic groups. When you ﬁrst get to college, you ﬁnd that your roommate is of

a different race, and you ﬁnd some of their habits strange. What would you do?

a. Seek out information about that person’s culture.
b. Try to accept the person as is and not judge her/him.
c. Talk about your cultural differences and learn a new perspective.

(1. Be patient. Try not to ask her/him questions since that might make her/him feel

uncomfortable.
e. Ask the person questions about her/his habits and understand the situation.
f. Politely ask her/him to stop the habits or just put up with it.

1. What would you be most likely to do?
2. What would you be least likely to do?
167

 

 

APPENDIX P

Calculation for Inﬂation Index Selection

 

 

168

 

Calculation Inﬂation Index Item Selection
Reference group (not

Optimal group (coached, coached, not motivated, with

motivated, not warned) warning)
Reference
group mean
minus optimal
N Mean SD N Mean 80 group mean
8101 45 2.31 1.35 46 1.78 _ 1.11 -0.53
8102 45 3.71 1.27 46 3.13 1.44 058
8103 44 2.64 1.48 46 1.93 1.39 -0.70
8104 45 3.47 1.36 46 2.72 1.28 -0.75
BIOS 45 2.78 0.90 46 2.39 0.61 -0.39
8106 45 3.84 1.36 46 3.37 1.48 -0.47
8107 44 3.50 1.53 46 3.07 1.62 -0.43
8108 44 3.18 1.32 46 2.15 1.25 -1.03
8109 45 2.24 1.17 46 1.87 0.91 -0.37
81010 45 3.24 1.32 46 2.04 1.11 -1.20
81011 45 2.96 1.49 46 2.59 1.60 —0.37
81012 45 2.44 1.49 46 1.80 1.17 -0.64
81013 45 3.22 1.59 46 2.76 1.55 -0.46
81014 45 3.22 1.43 46 2.52 1.28 -0.70
81015 45 2.31 1.35 46 1.74 1.14 -0.57
81016 45 3.02 1.48 46 2.41 1.27 -0.61
81017 45 1.62 0.98 46 1.50 0.89 -0.12
81018 45 4.07 1.12 46 3.89 1.20 -0.18
81019 45 3.33 1.55 46 2.65 1.42 -0.68
81020 45 3.24 1.49 46 2.72 1.66 -0.53
81021 45 4.29 1.14 46 4.26 1.24 -0.03
81022 45 3.60 1.07 46 3.02 1.06 -0.58
81023 45 3.60 0.99 46 3.20 0.83 -0.40
81024 45 3.82 1.13 46 3.15 1.17 -0.67
81025 45 4.24 0.71 46 4.09 0.69 -0.16
81026 45 3.02 0.89 46 2.41 0.78 -0.61
81027 45 3.91 1.02 46 3.65 0.99 —0.26
81028 45 3.56 0.99 46 3.30 0.79 —0.25
81029 45 4.38 0.72 46 4.30 0.84 -0.07
81030 45 3.73 1.23 46 3.04 1.11 -0.69
81031 45 4.33 1.02 46 4.30 0.96 -0.03
81032 45 2.93 1.39 46 2.39 1.16 -0.54
81033 45 4.00 0.98 46 3.85 0.82 -0.15
81034 45 2.58 0.81 46 3.04 0.76 0.47
81035 45 4.38 0.78 46 4.00 0.84 -0.38
81036 45 3.93 0.86 46 3.74 0.91 —0.19
81037 45 4.40 0.72 46 4.15 0.82 -0.25
81038 45 3.53 1 .14 46 3.13 1.36 -0.40
81039 45 3.53 0.79 46 2.39 1.11 -1.14
81040 45 3.78 1.22 46 3.46 1.03 -0.32

169

**

i"!

**

 

 

B1041
BIO42
SJ|1 C
SJ|2C
SJ|3C
SJI4C
SJ|5C
SJI6C
SJ|7C
SJI8C
SJ|9C
SJ|10C
SJ|11C
SJ|12C
SJ|13C
SJ|14C
SJ|15C
SJI16C
SJI17C
SJI18C
SJ|19C
SJ|20C
SJ|21 C
SJ|22C
SJ|23C
SJ|24C

45
45
45
45
45
45
45
45
45
45
45
45
45
45
45
45
45
45
45
45
45
44
43
43
44
44

3.47
3.47
1.76
0.91
1.00
0.09
0.71
0.53
1.47
1.24
1.29
1.20
0.47
0.73
0.89
0.22
0.44
1.27
1.13
1.42
0.96
0.86
1.05
1.07
0.91
1.20

"Inﬂation index items

0.81
1.39
0.57
0.95
0.71
0.73
1.12
1.06
0.76
0.86
1.22
1.16
0.79
0.72
0.93
0.74
0.97
0.81
1.14
0.84
0.82
0.82
0.65
1.33
0.74
1.00

46
46
46
46
46
46
46
46
46
46
46
46
46
46
46
45
46
45
46
46
46
46
46
46
46
46

170

3.22
2.41
1.26
-0.02
0.98
0.02
0.26
0.39
1.07
0.85
1.30
0.76
0.26
0.39
0.39
-0.07
0.15
1.18
—0.37
1.26
0.43
0.83
1.02
0.74
0.37
0.48

0.94
1.36
0.74
1.14
0.71
0.83
1.00
0.88
1.00
1.15
0.89
1.49
0.77
0.98
1.16
0.75
0.84
1.03
1.08
0.74
0.78
0.90
0.86
1.36
0.93
1.13

-0.25
-1.05
-0.49
-0.93
-0.02
-0.07
-0.45
-0.14
-0.40
-0.40

0.02
-0.44
-0.21
-0.34
-0.50
-0.29
-0.29
-0.09
-1.50
-0.16
-0.52
-0.04
~0.02
-0.33
—0.54
-0.73

*t

**

 

 

APPENDIX Q

Biodata Item Types Coding Instructions

 

E
1

 

l

171

Biodata Item Type Coding

1=Not at all

2=Very small extent
3=Moderate extent
4=Great extent
5=Completely

Provided below are dimensions by which a biodata item can described, along with their
deﬁnitions. Please choose one dimension and rate all items on that dimension. Then,
proceed to the next dimension.

 

bjective he factual parts of an event that are not tied to one’s
interpretation or evaluation.

To what extent does this item refer to objective events?

 

eriﬁable vents that could be corroborated by an independent
ource; someone unrelated could agree with every
tatement that was made.

To what extent does this item refer to veriﬁable events?

lControllable Events where one has a choice about whether to perform
r not.

To what extent does this item refer to controllable events?

 

 

 

Equal access xperiences or skills that everyone has an equal chance
f obtaining.
To what extent does this item refer to events with equal access?

 

 

College relevant vents that seem to be related to understanding why one
'11 do well in college.

To what extent does this item refer to events relevant to college life?

 

 

 

Ilnvasive [Events that are too personal and/or invade one’s privacy.

 

 

To what extent does this item refer to noninvasive events?

 

Fakeableness e extent to which respondents can produce false (but
ealistic) responses without being caught. This is an
verall rating and not a single aspect of the item.

To what extent does this item appear to be fakeable?

172

 

APPENDIX R

SJ I Item Types Coding Instructions

 

 

 

SJI Item Types Scoring

Provided below are dimensions by which an SJ I stem and response options can be
described, along with their deﬁnitions. Please choose one dimension and rate all items
(stem and response options) on that dimension. Then, proceed to the next dimension.

Dimension Deﬁnition

Rating Instruction

 

Objective: The factual parts of an event
that are not tied to one’s interpretation or
evaluation.

To what extent does this item refer to
objective events?

 

Veriﬁable: Events that could be
corroborated by an independent source;
someone unrelated could agree with
every statement that was made.

To what extent does this item refer to
veriﬁable events?

 

Controllable: Events where one has a
choice about whether to perform or not.

To what extent does this item refer to
controllable events?

 

Equal Access: Experiences or skills that
everyone has an equal chance of
obtaining.

To what extent does this item refer to events
with equal access?

 

College Relevant: Events that seem to
be related to understanding why one will
do well in college.

To what extent does this item refer to events
relevant to college life?

 

Invasive: Events that are too personal
and/or invade one’s privacy.

To what extent does this item refer to
noninvasive events?

 

 

Fakeableness: The extent to which
respondents can produce false (but
realistic) responses without being
caught. This is an overall rating and not
a single aspect of the item.

 

To what extent does this item appear to be
fakeable?

 

 

Recall that the questions are framed as:
What would you be most likely to do?
What would you be least likely to do?

Please use the following rating scale when determining the extent to which an item stem

or response ﬁts a dimension:
1=Not at all

2=Very small extent
3=Moderate extent

4=Great extent
5=Completely

174

 

1'71

REFERENCES

 

175

 

REFERENCES

Alliger, G. M., & Dwight, S. A. (2000). A meta-analytic investigation of the
susceptibility of integrity tests to faking and coaching. Educational and
Psychological Measurement, 60(1), 59-72.

Anderson, C. D., Warner, J. L., & Spencer, C. C. (1984). Inﬂation bias in self-
assessment examinations: Implications for valid employee selection Journal of
Applied Psychology, 69(4), 574-580.

Becker, T. E., & Colquitt, A. L. (1992). Potential versus actual faking of a biodata form:
An analysis along several dimensions of item type. Personnel Psychology, 45,
389-406.

Bunting, B. P., & Mooney, E. (2001). The effects of practice and coaching on test results
for educational selection at eleven years of age. Educational Psychology, 21(3),
243-253.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd Ed.).
Hillsdale, NJ: Lawrence Earlbaum Associates.

Corr, P. J ., & Gray, J. A. (1994). Relationship between attributional style and lie scores
in an occupational sample motivated to fake good. Personality and Individual
Differences, 18(3), 433-435.

 

Cunningham, M. R., Wong, D. T., & Barbee, A. P. (1994). Self-presentation dynamics
on overt integrity tests: Experimental studies of the Reid Report. Journal of
Applied Psychology, 79, 643-658.

Clevenger, J ., Pereira, G. M., Wiechmann, D., Schmitt, N., & Schmidt-Harvey, V.
(2001). Incremental validity of situational judgment tests. Journal of Applied
Psychology, 86(3), 410-417.

Dweck, C. S., & Leggett, E. L., (1988). A social-cognitive approach to motivation and
personality. Psychological Review, 95, 256-273.

Dwight, S. A., & Alliger, G. M. (1997). Reactions to overt integrity test items.
Educational and Psychological Measurement, 57(6), 937-948.

Dwight, S. A., & Donovan, J. J. (2003). Do warnings not to fake reduce faking? Human
Performance, 16(1), 1-23.

176

 

~9-

 

Ellingson, J. E., Sackett, P. R., & Hough, L. M. (1999). Social desirability corrections in
personality measurement: Issues of applicant comparison and construct validity.
Journal of Applied Psychology, 84(2), 155-166.

Elliot, S., Lawty-Jones, M. & Jackson, C. (1996). Effect of dissimulation on self-report

and objective measures of personality. Personality and Individual Differences,
2_l(3), 335-343.

Elliot, A. J ., & Harackiewicz, J. M. (1996). Approach and avoidance achievement goals
and intrinsic motivation: A mediational analysis. Journal of Personality and
Social Psychology, 7Q 461-475.

Frei, R. L. (1998). Fake this test! Do you have the ability to raise your score on a service
orientation inventory? Dissertation Abstracts International, 58, 5681.

Friede, A., Gillespie, M., Kim, B., Oswald, F ., Ramsay, L. & Schmitt, N. (2002).
Development and Validation of Alternative Measures of College Student Success
(Technical Report). East Lansing: Michigan State University.

 

Goldberg, L. R. (1993). The structure of phenotypic personality traits. American
Psychologist, 48, 26-34.

Gore, B. A. (2001). Reducing and detecting faking on a computer-administered biodata
questionnaire. Dissertation Abstracts International, 61, 4457.

Graham, K. E., McDaniel, M. A., Douglas, E. F ., & Snell, A. F. (2002). Biodata validity
decay and score inﬂation with faking: Do item attributes explain variance across
items? Journal of Business and Psychology, 16(4), 573-592.

Harachiewicz, J. M., Barron, K. E., Carter, S. M., Lehto, A. T., & Elliot, A. J. (1997).
Determinants and consequences of achievement goals in the college classroom:
Maintaining interest and making the grade. Journal of Personality and Social
Psychology. 73, 1284-1295.

Harackiewicz, J. M., Barron, K. E., Pintrich, P. R., Elliot, A. J ., & Thrash, T. M. (2002).
Revision of Achievement Goal Theory: Necessary and Illuminating. Journal of
Educational Psychology, 94, 638-645.

Harackiewicz, J. M., Barron, K. E., Tauer, J. M., & Elliot, A. J. (2002). Predicting
success in college: A longitudinal study of achievement goals and ability
measures as predictors of interest and performance from freshman year through
graduation. Journal of Educational Psychology, 94, 562-575.

Holden, L.M. (1996). The effectiveness of training in improving test performance.

Presented at 11th Annual Conference for the Society of Industrial and
Organizational Psychology, Inc., San Diego, CA.

177 r

Hough, L. M. (1998). Effects of intentional distortion in personality measurement and
evaluation of suggested palliatives. Human Performance, 11(2-3), 209-244.

Hough, L. M., Eaton, N. K., Dunnette, M.D., Kamp, J. D., & McCloy, R. A. (1990).
Criterion-related validities of personality constructs and the effect of response
distortion on those validities. Journal of Applied Psychology/L756), 581-595.

Jones, R. F. (1987). The effect of commercial coaching courses on performance on the
MCAT. Journal of Medical Education, 61(4), 273-284.

Jones, R. F ., & Vanyur, S. (1987). A comparison of the predictive validity of the MCAT
for coached and uncoached students. Journal of Medical Education, 61(4), 335-
338.

Klubeck, S., & Bass, B. M. (1954). Differential effects of training on persons of different
leadership status. Human Relations, 7, 59-72.

Kluger, A. N., Reilly, R. R., & Russell, C. J. (1991). Faking biodata tests: are option-
keyed instruments more resistant? Journal of Applied Psychology, 76(6), 889-
896.

Kluger, A. N., & Colella, A. (1993). Beyond the mean bias: The effect of warning
against faking on biodata item variances. Personnel Psychology, 46(4), 763-780.

Kulik, J. A., Bangert-Downs, R. L., & Kulik, C. C. (1984). Effectiveness of coaching for
aptitude tests. Psychological Bulletin, 95, 175-188.

Kulik, J. A., Kulik, C. C., & Bangert, R. L. (1984). Effects of practice on aptitude and
achievement test scores. American Educational Research Journal, 21, 43 5-447.

Kuncel, N. R., Hezlett, S. A., & Ones, D. S. (2001). A comprehensive meta-analysis of
the predictive validity of the graduate record examinations: Implications for

graduate student selection and performance. Psychological-Bulletin, 127(1), 162-
181.

Lautenschlager, G. J. (1994). Accuracy and faking of background data. In G. S. Stokes,
M. D. Mumford, & W. A. Owens (Eds), Biodata handbook (pp. 391-419). Palo
Alto, CA: CPP Books.

Mabe, P. A. III & West, S. G. (1982). Validity of self-evaluation of ability: A review and
meta analysis. Journal of AJplied Psychology, 67, 280-296.

Mael, F. A. (1991). A conceptual rationale for the domain and attributes of biodata
items. Personnel Psychology, 44, 763-792.

178

 

mf' I t

F

Irma-

 

 

 

McCrae, R. R., & Costa, P. T. (1999). A ﬁve-factor theory of personality. In L. A. Pervin
& O. P. John (Eds), Handbook of personality: Theory and research (pp. 139-153).
New York: Guilford.

McDaniel, M. A., Morgeson, F. P., Bruhn Finnegan, E. B., Campion, M. A., &
Braverrnan, E. P. (2001). Predicting job performance using situational judgment
tests. Journal of Applied Psychology, 86(4), 730-740.

McDaniel, M. A., & Nguyen, N. T. (2001). Situational judgment tests: A review of

practice and constructs assessed. International Journal of Selection and
Assessment, 9(1-2), 103-1 13.

McFarland, L. A. (2000). Toward an integrated model of applicant faking (Doctoral
dissertation, Michigan State University, 2000). Dissertation Abstracts
International, 61, 2805.

McFarland, L. A., & Ryan, A. M. (2000). Variance in faking across noncognitive
measures. Journal of Applied Psychology,85(5), 812-821.

McManus, M. A. & Kelly, M. L. (1999). Personality measures and biodata: Evidence
regarding their incremental predictive value in the life insurance industry.
Personnel Psychology, 52, 137-148.

Messick, S., & Jungeblut, A. (1981). Time and method in coaching for the SAT.
Psychological Bulletin, 89(2), 191-216.

Midgley, C., Kaplan, A., Middleton, M., Maehr, M., Urdan, T., Anderman, L. H.,
Anderman, E., & Roeser, R. (1998). The development and validation of scales

assessing students’ achievement goal orientations. Contemporary Educational
Psychology, 23, 113-131.

Miller, C. E. (2001). The susceptibility of personality selection tests to coaching and
faking. (Doctoral dissertation, University of Akron, 2000). Dissertation
Abstracts International, 61 , 3888.

 

Motowidlo, S. J ., Russell, T. L., Carter, G. W., & Dunnette, M. D. (1988). Revision of
the Management Selection Interview: Final report. Minneapolis, MN: Personnel
Decisions Research Institute.

Motowidlo, S. J ., Dunnette, M. D., & Carter, G. W. (1990). An alternative selection
procedure: The low-ﬁdelity simulation. Journal of Applied Psychology, 75(6),
640-647.

Mount, M., Witt, L. A. & Barrick, M. R., (2000). Incremental validity of empirically

keyed biodata scales over GMA and the ﬁve factor personality constructs.
Personnel Psychology, 299-323.

179

 

Mumford, M. D., & Stokes, G. S. (1992). Developmental determinants of individual
actions: Theory and practice in applying background measures. In M. D. Dunnette
& L. M. Hough (Eds), Handbook of industrial and organizational gychology
(Vol. 3, pp. 61-138). Palo Alto, CA: Consulting Psychologists Press.

Nguyen, N. T. (2002). Faking in situational judgment tests: An empirical investigation of
the work judgment survey. Dissertation Abstracts International, 63, 3109.

Ones, D. S., Viswesvaran, C., & Reiss, AD. (1996). Role of social desirability in
personality testing for personnel selection: The red herring. Journal of Applied
Psychology, 81(6), 660-679.

Oswald, F. L., Schmitt, N., Ramsay, L. J ., Kim, B. H., & Gillespie, M. A. (in press).
Developing a biodata measure and situational judgment inventory as predictors of
college student performance. J oumal of Applied Psychology.

Pannone, R. D. (1984). Predicting test performance: A content valid approachto
screening applicants. Personnel Psychology, 37, 507-514.

Paulhus, D. L. (1988). Assessing self-deception and impression manaiement in self-
reports: The Balanced Inventory of Desirable Responding Unpublished manual.
Vancouver, British Columbia: University of British Columbia.

Paulhus, D. L., & Reid, D. B. (1991). Enhancement and denial in socially desirable
responding. Journal of Personality and Social Psychology, 60, 307-317.

Petty, M. M. (1974). A multivariate analysis of the effects of experience and training
upon performance in a leaderless group discussion. Personnel Psychology, 27,
271-282.

 

Powers, D. E., & Rock, D. A. ( 1998). Effects of coaching on SAT I: Reasoning scores.
New York, NY: The College Board.

Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. A. (1998). The impact of response
distortion on preemployment personality testing and hiring decisions. Journal of
Applied Psychology. 83(4), 634-644.

Ryan, A. M., Ployhart, R. E., Gregarus, G. J ., & Schmit, M. J. (1998). Test preparation
programs in selection contexts: Self-selection and program effectiveness.
Personnel Psychologyy,51(3), 599-621.

Sackett, P. R., Burris, L. R., & Ryan, A. M. (1989). Coaching and practice effects in
personnel selection. In C. L. Cooper and I. Robertson (Eds.), International
Review of Industrial and Organizational Psychology, (pp. 145-183). Oxford,
England: John Wiley & Sons Ltd.

180

 

 

 

Schmit, M. J ., & Ryan, A. M. (1993). The Big Five in personnel selection: Factor
structure in applicant and nonapplicant populations. Journal of Applied
Psycholggj, 78(6), 966-974.

 

Schmitt, N., Gillespie, M. A., Kim, B. H, Ramsay, L. J ., Oswald, F. L., & Yoo, T. Y. (in
press). Impact of elaboration on social desirability and the validity of biodata
measures. Journal of Applied Psychology.

Schmitt, N. & Kunce, C. (2002). The effect of required elaboration of answers to biodata
questions. Personnel Psychology, 55(3), 569-587. T“!

Schmitt, N., Oswald, F. L., Kim, B. H., Gillespie, M. A., & Ramsay, L. J. (in press). The
impact of justice and self-serving bias explanation of the perceived fairness of
different types of selection tests. International Journal of Selection and
Assessment.

 

 

Sisco, H. C. (1999). Biodata predictors of the ﬁve factor model. Dissertation Abstracts L49
International, 59, 4520.

Smith, D. B., & Ellingson, J. E. (2002). Substance versus style: A new look at social

desirability in motivating contexts. Journal of Applied Psychology, 87(2), 211-
219.

Stanush, P. L. (1997). Factors that inﬂuence the susceptibility of self-report inventories
to distortion: A meta-analytic investigation. (Doctoral dissertation, Texas A and
M University, 1997). Dissertation Abstracts International, 58, 2167.

Stark, S., Chemyshenko, O. S., Chan, K., Lee, W. C., & Drasgow, F. (2001). Effects of
the testing situation on item responding: Cause for concern. Journal of Applied
Psychology, 86(5), 943-953.

Stokes, G. S. (1994). Introduction and history. In G. S. Stokes, M. D. Mumford, & W.
A. Owens (Eds.), Biodata handbook (pp. xv-xix ). Palo Alto, CA: CPP Books.

Stokes, G. S., & Searcy, C. A. (1999). Speciﬁcation of scales in biodata form
development: Rational vs. empirical and global vs. speciﬁc. International Journal
of Selection and Assessment, 7(2), 72-85.

Vasilopoulos, N. L. (1999). The impact of job familiarity and warning of response
veriﬁcation on the relationship between response latency and impression
management: A ﬁeld investigation. Dissertation Abstracts International, 59,
4521.

 

181 P

Viswesvaran, C. & Ones, D. S. (1999). Meta-analyses of fakability estimates:

Implications for personality measurement. Educational and Psychological
Measurement, 59(2), 197-210.

Wiggins, J. S., & Trapnell, P. D. (1997). Personality structure: The return of the Big Five.

In R. Hogan, J. Johnson, & S. Briggs (Eds.), Handbook of personality psycholggy
(pp. 737-765). San Diego: Academic Press.

White, L. A., Young, M. C., & Rumsey, M. G. (2001). ABLE implementation issues and
related research. In J. P. Campbell & D. J. Knapp (Eds.), Exploring the Limits of

Personnel Selection and Classiﬁcation (pp. 525-558). Mahwah, NJ: Lawrence
Erlbaum Associates, Inc., Publishers.

Zickar, M. J ., & Drasgow, F. (1996). Detecting faking on a personality instrument using
appropriateness measurement. Applied Psychological Measurement, 20, 71-87.

Zickar, M. J ., & Robie, C. (1999). Modeling faking good on personality items: an item-
level analysis. Journal of Applied Pimhologuﬁ4(4), 551-563.

182