This is to certify that the thesis entitled AN INVESTIGATION OF FAKING: ITS ANTECEDENTS AND IMPACTS IN APPLICANT SETTINGS presented by ANTHONY S. BOYCE has been accepted towards fulfillment of the requirements for the M.A. degree in the Department of Psychology.

Major Professor's Signature
7/18/2005
Date

MSU is an Affirmative Action/Equal Opportunity Institution

AN INVESTIGATION OF FAKING: ITS ANTECEDENTS AND IMPACTS IN APPLICANT SETTINGS

By

Anthony S. Boyce

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF ARTS

Department of Psychology

2005

ABSTRACT

AN INVESTIGATION OF FAKING: ITS ANTECEDENTS AND IMPACTS IN APPLICANT SETTINGS

By Anthony S. Boyce

Researchers have demonstrated that personality-based self-report tests are valid predictors of important organizational criteria including supervisory ratings of job performance, organizational citizenship behaviors, and training performance. However, there remains concern that the validity and utility of such tests may be compromised by intentional distortion, or faking, on the part of applicants. The present study examined both antecedents and consequences of applicant faking using a within-subjects design consisting of the completion of a personality-based selection test at two periods in time. The first administration of the test occurred when participants applied for employment and the second administration occurred under confidential conditions once applicants had been hired. The results indicate that faking is positively related to the extent to which individuals believe that others engage in faking in applicant contexts, but is unrelated to a number of other antecedents investigated. The results also suggest that applicant faking can result in changes in the rank-ordering of individuals. The results do not support a conclusion that faking erodes the criterion-related validity of personality-based tests, but the pattern of results suggests this may be a possibility. The results are discussed in terms of the limitations of the current study and future research directions.

ACKNOWLEDGEMENTS

Thank you to everyone in my life who has ever believed in me, encouraged me, criticized me, or challenged me. Without such, I would surely have found something easier to do than this, but never as satisfying! Thank you to my advisor, Ann Marie Ryan, for her never ending stream of insightful comments and patience. Thank you to my committee, Drs. Neal Schmitt, Rick DeShon, and John Arnold, for their extremely astute comments and recommendations. Thank you to my family, specifically my mother and father, Janis and Larry Boyce, and my brother, Jason Boyce, for all their support and unwavering belief in me, even when I had doubts.
Thank you, last but by no means least, to Sarah Conklin for inspiration and for allowing me to bore her with details of this thesis more than any person should ever be allowed to bore another.

"There's only one corner of the universe you can be certain of improving, and that's your own self." —Aldous Huxley

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
    The Nature of Faking
    Practical Issues in Faking Research
    Existing Models of Faking
    Past Research
    The Current Model and Study
    Summary
METHOD
    Sample
    Design
    Measures
    Procedure
RESULTS
    Construct Validity of the Selection Test
    Selection Test Difference Scores, Scale Reliability, and Effect Sizes
    Descriptive Statistics
    Hypothesis Tests
    Exploratory Analyses
DISCUSSION
    Does Faking Affect Selection Decisions?
    Are Social Desirability Scales a Valid Operationalization of Faking?
    Does Faking Affect Construct Validity?
    Does Faking Affect Criterion-related Validity?
    Do Individuals' Perceptions & Beliefs Relate to Faking?
    Limitations
    Conclusion
APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D
APPENDIX E
APPENDIX F
APPENDIX G
APPENDIX H
APPENDIX I
REFERENCES

LIST OF TABLES

Table 1. Paired Samples t Tests, Effect Sizes, and Reliability Information
Table 2. Descriptive Statistics and Intercorrelations
Table 3. Performance Composite Regressed onto Applicant Test Dimensions and Impression Management
Table 4. Multiple Groups Confirmatory Factor Analyses
Table 5. Performance Composite Regressed onto Applicant Test Dimensions Squared
Table 6. Rank-order Correlations
Table 7. D-scores Regressed onto Ethics, Beliefs, and Their Interaction
Table 8. D-scores Regressed onto Self-efficacy, Subjective Norms, and Their Interaction
Table 9. D-scores Regressed onto Self-efficacy, Ethics, Beliefs, and Their Interaction
Table 10. Split-group Correlations and Effect Sizes
Table 11. Performance Regressed onto Applicant Test Dimension Scores, D-scores, and Their Interaction
Table 12. Polynomial Regression onto Applicant and Incumbent Test Dimension Scores and Their Interaction
Table 13. Hypothesis Summary

LIST OF FIGURES

Figure 1. Model of Faking
Figure 2. Graph of the Interactive Effects of D-scores and Applicant Test Scores on Performance
Figure 3. Graph of Polynomial Regression Results
INTRODUCTION

Personality-based self-report tests are becoming increasingly prevalent in organizational selection processes. These types of measures have proven, across a range of occupations, to be valid predictors of important organizational criteria including supervisory ratings of job performance, organizational citizenship behaviors, and training performance (Barrick & Mount, 1991; Barrick, Mount, & Judge, 2001; Organ & Ryan, 1995). Personality tests have also been shown to exhibit substantially less adverse impact than more cognitively loaded measures while predicting similar criteria (Bobko, Roth, & Potosky, 1999; Schmitt, Clause, & Pulakos, 1996). Despite these advantages, there is concern that the validity and practical utility of such tests may be compromised by intentional distortion, or faking, on the part of applicants (e.g., Hogan & Nicholson, 1988; Zerbe & Paulhus, 1987).

The effect of applicant faking on the validity and utility of personality self-report tests has been the subject of a voluminous amount of research in the past decade. Some researchers report evidence that applicants do not fake such measures and that, even if faking does occur, it does not affect the validity of these instruments (Barrick & Mount, 1996; Hough, 1998; Hough, Eaton, Dunnette, Kamp, & McCloy, 1990; Ones & Viswesvaran, 1998; Ones, Viswesvaran, & Reiss, 1996; Smith & Ellingson, 2002). Conversely, other researchers have found that faking not only occurs and affects criterion-related validity, but affects construct validity and selection decisions as well (e.g., Christiansen, Goffin, Johnston, & Rothstein, 1994; Donovan, Dwight, & Hurtz, 2003; Douglas, McDaniel, & Snell, 1996; Ellingson, Sackett, & Hough, 1999; Rosse, Stecher, Miller, & Levin, 1998; Schmit & Ryan, 1993; Topping & O'Gorman, 1997).

There are many potential reasons for these conflicting results. For example, some researchers have relied on the use of social desirability or lie scales to identify fakers in selection settings (e.g., Ellingson, Smith, & Sackett, 2001; Hough, 1998; Rosse et al., 1998). The problem with many of these scales is that it is not clear whether they actually reflect faking or whether they reflect substantive personality traits that have real and meaningful relationships with other traits. Other researchers have used difference scores (d-scores) to operationalize faking behavior (e.g., Dunnette, McCartney, Carlson, & Kirchner, 1962; McFarland & Ryan, 2000). While d-scores are an objective measure of response distortion, research studies that have used these scores were often conducted in laboratory settings where participants are instructed to "fake-good." Such manufactured situations have been useful in showing that individuals can fake, and that faking can impact the validity of personality measures, but fail to show that applicants do fake or that applicant faking does affect validities. Additionally, there is a need for more theoretically-driven research addressing the conditions, both contextual and individual, that lead to faking behavior. Without such research it is difficult to determine why some studies have shown detrimental effects and others have not.
To summarize, despite the interest and efforts of both researchers and practitioners, there is still no consensus on whether faking substantially affects the usefulness of personality measures for personnel selection. The study proposed here attempts to inform this debate by addressing some key issues that have been overlooked or under-researched. First, in an effort to address one of the key limitations of past research, the current study will operationalize faking in multiple ways (i.e., a lie scale and d-scores). Second, the current study will attempt to elucidate unexamined, or poorly examined, proximal antecedents of faking in order to contribute to the theoretical understanding of the conditions that lead to such behavior. Finally, the study described here will utilize a within-subjects design to examine the faking of actual applicants in a field setting. Given that the goal of faking research is to generalize findings to real-world applicant settings, it is prudent to examine this phenomenon in such settings.

Before describing the current study in more detail, it is necessary to review the nature of faking and the various ways in which it has been operationalized in the literature. Next, applied research investigating the impact of faking on the construct validity, criterion-related validity, and selection decisions that result from personality-based selection tests will be reviewed. In this section, particular attention will be given to the limitations of much of this research for providing definitive conclusions on the effects of faking, as well as to how the current study will address these limitations. Finally, two models of faking behavior will be reviewed and the model tested in the current study will be presented. It should be noted that faking research has also focused on biodata and situational judgment tests, but given the focus of the current research on traditional personality tests, findings from these related literatures will be discussed only when directly applicable to the current research.

The Nature of Faking

Faking, variously termed response distortion, response inflation, and impression management, is a deliberate and conscious attempt to convey false information to create a positive impression on others (Paulhus, 1984; 1986; Zerbe & Paulhus, 1987). Paulhus and other faking researchers (e.g., McFarland & Ryan, 2000; Ellingson et al., 1999) consider faking to be distinct from self-deception, another form of socially desirable responding. Self-deception is an unconscious form of response inflation motivated by a desire to protect one's self from psychological threats and is correlated with healthy psychological traits like self-esteem, high need for achievement, and an internal locus of control (Paulhus, 1986). In accordance with Paulhus' conception, in the remainder of this paper faking, impression management, response distortion, and response inflation will refer to conscious dissembling and not to unconscious forms of socially desirable responding like self-deception.

Some authors have directly considered the motivational processes underlying individuals' desires to present themselves favorably, albeit from a rather macro perspective. Schlenker (1980) posits that, at the most general level, an individual's decision to engage in impression management stems from the same motivational sources as most other behavior: to maximize expected rewards and minimize expected punishments.
Similarly, Leary and Kowalski (1990) suggest that antecedents to engaging in impression management stem largely from three general sources: the goal relevance of impressions, the value of desired goals, and the discrepancy between the desired and currently perceived image. The goal relevance of impressions refers to the relationship between the desired image and the attainment of social or material outcomes. The value of desired goals encompasses both the importance of goal attainment to the individual as well as the scarcity of the goal. Finally, in Leary and Kowalski's model, discrepancy between the desired and current image refers to both real and imagined divergence in others' impressions of the identity an individual would like to convey. After reviewing research investigating the impact of faking on the use of personality tests in selection, past models of the behavior that build on the general social psychology theories presented above will be reviewed and the antecedents tested in the current study will be presented.

Practical Issues in Faking Research

While the motivating factors contributing to an individual's decision to fake personality-based selection measures are important, many researchers have been more concerned with the implications of such behavior for the use of these measures in applied settings. Specifically, the major concerns in applied settings surround the issues of response distortion scales used to identify fakers or "correct" trait scores for faking, and examining the effects of faking on construct validity, criterion-related validity, and selection decisions.

Response Distortion Scales. Response distortion scales, variously termed faking scales (Levin & Zickar, 2002), social desirability scales (Crowne & Marlowe, 1960), response validity scales (Hogan & Hogan, 1992), unlikely virtues scales (Hough, 1998), or impression management scales (Paulhus, 1984), have often been examined to determine their effectiveness in identifying fakers or correcting faked trait scores. A number of commercially available tests include such scales (cf. Hough, 1998). These scales typically include items referring to behaviors that are undesirable but extremely common (e.g., "I sometimes drive faster than the speed limit") or very desirable but extremely uncommon (e.g., "I have never dropped litter on the street"; Paulhus, 1984). Applicants who score above some pre-set cutoff are considered to have faked their responses.

At first glance, these types of scales seem to function as they should. A meta-analysis by Viswesvaran and Ones (1999) found that social desirability scores were inflated at roughly twice the rate, in difference score terms, as scores on measures of the Big Five when participants were instructed to "fake-good." In another article, the same authors refer to these results as suggesting "...response distortion scales are likely to be useful in flagging individuals who fake" (Ones & Viswesvaran, 1998, p. 249). However, note that only studies utilizing "fake-good" instruction sets were included in the meta-analysis. It is possible that different results would be obtained if studies using "incentive-motivation" instructions or applicant populations were meta-analyzed in a similar fashion. Unfortunately, such a meta-analysis including an effect size for social desirability scales does not yet exist.
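As a concrete illustration of what "inflated at roughly twice the rate, in difference score terms" means, the within-subjects effect size can be computed directly from paired administrations. The following sketch (Python, with simulated placeholder data rather than the data of any study cited here) scales the mean faked-minus-honest difference by the standard deviation of honest scores; note that other conventions, such as scaling by the standard deviation of the difference scores, also appear in this literature.

```python
import numpy as np

def repeated_measures_d(faked, honest):
    """Within-subjects standardized mean difference: mean score inflation
    from the honest to the faked administration, scaled by the SD of
    honest scores."""
    faked, honest = np.asarray(faked, float), np.asarray(honest, float)
    return (faked - honest).mean() / honest.std(ddof=1)

rng = np.random.default_rng(0)
honest_sd = rng.normal(3.0, 0.5, 200)             # social desirability scale
faked_sd = honest_sd + rng.normal(0.8, 0.4, 200)  # inflation under "fake-good"
honest_c = rng.normal(3.5, 0.5, 200)              # a Big Five scale
faked_c = honest_c + rng.normal(0.4, 0.4, 200)    # smaller inflation

print(repeated_measures_d(faked_sd, honest_sd))   # larger effect size
print(repeated_measures_d(faked_c, honest_c))     # roughly half as large
```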
Additionally, these authors failed to separate response distortion scales measuring self-deception from those measuring impression management, which Paulhus (1984; 1986) argued was a more accurate measure of the conscious dissembling that is faking. Despite the limitations just noted, Viswesvaran and Ones' (1999) meta-analysis does provide some evidence that social desirability scales appear to function in the proposed manner (i.e., groups expected to have higher scores do in fact score higher than groups expected to have lower scores). However, there exist a number of limitations in using these scales to "define" faking in either research or applied contexts.

Perhaps the most severe limitation comes from recent research demonstrating that responses to these scales reflect more "substance" than "style." Smith and Ellingson (2002) conducted a study in which both applicants and students were given the same personality measures along with three different social desirability scales. The student administration of the test was conducted under complete anonymity and students were instructed to answer honestly. To test for differences in method and trait loadings across the two groups, Smith and Ellingson utilized multiple-groups confirmatory factor analysis. Assuming that the social desirability scores captured a situation-specific response pattern (i.e., "style"), one would expect to see larger method factor loadings (i.e., loadings on the latent social desirability construct) and smaller trait factor loadings (i.e., loadings on the latent personality constructs) in the job applicant group than in the student group. However, the results showed similar method and trait loadings across both groups, indicating that the social desirability scales measured substantive trait variance (i.e., "substance") rather than a situation-specific response pattern. Similarly, Hurd and colleagues (2001), using meta-analytic techniques, found that social desirability and personality scale scores shared primarily trait variance in both incumbent and applicant settings.

There also exists evidence that response distortion scales function differently across samples of applicants and non-applicants. Stark and colleagues (2001) conducted a study in which they were able to compare the consequences of faking on construct validity across applicants and non-applicants and across subgroups dichotomized on the basis of impression management scores. Differential test functioning analyses suggested that the impression management items measured different underlying constructs across groups of applicants and non-applicants. The authors concluded, "This finding casts doubt on the generalizability of research from similar traited faking studies, which compare faking groups created from a single sample of respondents using IM scores" (p. 951).

A third limitation to the use of response distortion scales involves the demonstrated failure of these scales to allow for recovery of honest scores. Ellingson, Sackett, and Hough (1999), utilizing a counterbalanced repeated-measures design, obtained both honest and faked scores from a sample of military personnel on both an unlikely virtues scale and a personality measure. By combining the honest and faked condition scale scores into a single distribution and regressing scores of each personality scale on the unlikely virtues scale, the authors obtained multipliers for each personality scale. Conceptually, these multipliers allowed the estimation of scale scores that participants would have obtained if they had exhibited zero intentional distortion in their responses. While this correction effectively removed the standardized mean differences found between the honest and faked scores, the corrected scores, on average, did not correlate with honest scores significantly more strongly than did the faked scores.
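A minimal sketch of the correction procedure just described may help. It assumes the procedure reduces to an ordinary least-squares regression of each personality scale on the unlikely virtues (UV) scale in the pooled honest-plus-faked distribution; the data and the strength of the UV-distortion link are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
honest = rng.normal(3.2, 0.6, n)             # honest personality scores
inflation = np.abs(rng.normal(0.5, 0.3, n))  # individual response distortion
faked = honest + inflation                   # motivated-condition scores
uv_honest = rng.normal(2.0, 0.5, n)          # unlikely virtues, honest
uv_faked = uv_honest + 1.5 * inflation       # UV scale picks up distortion

# Pool both conditions and regress the personality scale on UV; the slope
# is the "multiplier" described above.
pooled_p = np.concatenate([honest, faked])
pooled_uv = np.concatenate([uv_honest, uv_faked])
b = np.cov(pooled_p, pooled_uv, ddof=1)[0, 1] / np.var(pooled_uv, ddof=1)
corrected = faked - b * (uv_faked - pooled_uv.mean())

# The correction shrinks the mean elevation of faked scores...
print(faked.mean() - honest.mean(), corrected.mean() - honest.mean())
# ...but corrected scores need not track honest scores any better than
# the faked scores already did, mirroring Ellingson et al.'s finding.
print(np.corrcoef(faked, honest)[0, 1], np.corrcoef(corrected, honest)[0, 1])
```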
In addition to the inability of these scales to allow for the recovery of honest scores, there also exists evidence that these scales can result in false positives, where some honest individuals are incorrectly classified as fakers (Zickar & Drasgow, 1996).

Another limitation involves the possibility that responses to these scales reflect a degree of positive mental health. Zerbe and Paulhus (1987) suggest that this is likely to be especially true for those scales that fail to separate self-deception from impression management. As mentioned previously, Paulhus (1986) demonstrated that self-deception is positively related to self-esteem, high need for achievement, and an internal locus of control.

A final limitation is that these scales themselves may be susceptible to faking. As advocated by Whyte (1957) in The Organization Man, test takers opting to respond in a moderately well-adjusted manner can effectively evade such instruments. Furthermore, Kroger and Turnbull (1975) demonstrated that participants were able to evade detection by the MMPI validity scales when coached on how to do so.

The above discussion suggests that social desirability scales are not an adequate operationalization of faking. As such, one would expect these types of scales to be only slightly to moderately correlated with difference scores obtained from a within-subjects administration of a personality test under both motivating and non-motivating circumstances.

Hypothesis 1: Scores on a social desirability scale will have a small to moderate correlation with difference scores.
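Once both administrations are available, Hypothesis 1 amounts to a single correlation. A minimal sketch follows, assuming one scale score per person from each administration and an impression management (IM) scale score from the applicant administration; all data here are simulated placeholders.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 150
honest = rng.normal(3.4, 0.6, n)             # research-condition scale scores
inflation = np.abs(rng.normal(0.3, 0.3, n))  # true response distortion
applicant = honest + inflation               # applicant-condition scores
# An IM scale that is mostly trait variance and only partly distortion:
im_scale = 0.4 * inflation + rng.normal(2.5, 0.5, n)

d_scores = applicant - honest                # objective faking measure
r, p = stats.pearsonr(im_scale, d_scores)
print(f"IM x d-score: r = {r:.2f}, p = {p:.3f}")  # small-to-moderate r expected
```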
Construct Validity. Concerns about the influence of faking on construct validity are based on the idea that if measurement equivalence cannot be established across applicant and non-applicant groups, then the associations among personality measures and other predictors and measures of job performance may be obscured (Stark, Chernyshenko, Chan, Lee, & Drasgow, 2001). This is a particularly salient concern given that it is common practice in many organizations to validate personality-based selection tests on volunteer incumbent samples and then assume that the relationships found will generalize to applicant samples.

Research investigating the effect of faking on the construct validity of personality measures indicates that faking can adversely impact construct validity (e.g., Ellingson et al., 1999; Schmit & Ryan, 1993). However, research also indicates that faking does not harm construct validity (e.g., Ellingson et al., 2001; Smith, Hanges, & Dickson, 2001). Some of these differences can be explained by examining the methodologies employed in studying this phenomenon.

Laboratory studies utilizing "honest" and "fake-good" instructional sets have found the largest and most consistent differences in construct validity (e.g., Douglas et al., 1996; Ellingson et al., 1999). However, many authors have suggested that "fake-good" instructional sets artificially inflate differences beyond what would be expected in applicant settings (e.g., Levin & Zickar, 2002; Ones et al., 1996). Hogan (1991) suggests that impression management requires both the motivation and the ability to do so. This suggests that "fake-good" instructional sets result in larger differences in construct validity than examinations conducted in applicant settings because participants are equalized with respect to their motivation under "fake-good" instructions, while applicants' motivation to fake is likely to vary. Additionally, Smith and Ellingson (2002) suggest that laboratory studies eliminate some of the "natural deterrents" to response distortion present in applicant settings, such as the fear of being caught. Furthermore, there is evidence that laboratory studies using "fake-good" instructions result in larger standardized mean differences than are commonly observed between applicant and incumbent samples (Birkeland, Manson, Kisamore, Brannick, & Liu, 2003). Thus, "fake-good" instruction sets are useful for showing that faking can impact construct validity, but do not inform whether actual applicant faking does affect construct validity.

Some studies have examined the effects of faking on construct validity by artificially dichotomizing groups on the basis of social desirability scale scores (e.g., Ellingson et al., 2001; Ones & Viswesvaran, 1998). While these studies generally found support for construct validity across groups, as discussed previously the operationalization of faking as scores on response distortion scales is suspect and is likely to limit the generalizability of conclusions (Smith & Ellingson, 2002; Stark et al., 2001). Unfortunately, even after removing from consideration those studies with limited validity, the impact of faking on construct validity is still unclear. Studies utilizing applicant samples without dichotomization on the basis of response distortion scale scores will be reviewed next.

Schmit and Ryan (1993) investigated the effects of response inflation on construct validity by comparing a sample of applicants to an employment assistance service with students who took the same test, the NEO-FFI, under non-motivating conditions. Multiple-groups confirmatory factor analysis (MCFA) revealed that the hypothesized five-factor solution fit the student sample but not the applicant sample. An exploratory factor analysis revealed a six-factor solution for the applicant sample. Schmit and Ryan suggested that this additional factor represented an "ideal employee" factor, as it contained significant factor loadings from composites made up of items referring to being a hard worker, likable, conscientious, courteous, and so on. Additionally, the applicant group factor scale intercorrelations were substantially higher than those of the student sample. One possible alternative explanation for the MCFA results is that the samples may have violated the assumption of multivariate normality, discussed below, resulting in the erroneous rejection of a true model for the applicant group.

Similarly to Schmit and Ryan (1993), Weekley, Ployhart, and Harold (2003) failed to find evidence of measurement invariance across an applicant and incumbent sample. While these researchers did find similar factor forms, evidence of configural invariance, they did not find evidence of similar factor loadings across the two groups, a minimum condition suggested to be necessary to conclude measurement invariance across groups (cf. Rock, Werts, & Flaugher, 1978).
Also similar to Schmit and Ryan (1993), these authors did not present a discussion of the satisfaction or violation of multivariate normality, so it is somewhat unclear whether the results reflect real differences or inflated Type I error.

Smith and Ellingson (2002), discussed previously, conducted MCFA on their samples of applicants for entry-level managerial positions and students. MCFA indicated that the factor form and loadings were not significantly different across the two groups. One reason they suggested for why their findings differed from some prior studies was that past studies relied on estimation procedures that required the assumption of multivariate normality. Violations of this assumption result in inflation of Type I error in proportion to the degree of non-normality in the data set. Given that applicant distributions often violate this assumption (Smith & Ellingson, 2002; Smith et al., 2001), this is a valid critique of prior studies. This study also found similar intercorrelations among the personality dimensions for both groups. One cause for concern in this study was the observation that the student group actually scored higher (i.e., in the more socially desirable direction) on some of the personality dimensions; thus, the results could be due to sample-specific irregularities.

Smith, Hanges, and Dickson (2001) conducted another study utilizing procedures robust to violations of multivariate normality. In this study, applicant, incumbent, and student samples were obtained from an archival database maintained by the publishers of the Hogan Personality Inventory. With the exception of the student sample, for which all cases in the database with complete data were used in the analyses, the samples were randomly selected from the total database, representing many different organizations, to achieve equal sample sizes. Using the student sample to specify the baseline model, MCFA revealed that the hypothesized factor structure actually fit the applicant sample slightly better than the incumbent sample. Furthermore, the model even indicated measurement error invariance across the samples, a result not obtained in any other investigation of the effects of faking on factorial invariance. Intercorrelations of the personality dimensions were also similar across the groups. One drawback of this study is that, given the archival nature of the data, there was very little information available to adequately describe the samples involved. However, it was noted that incumbents were largely administered the test as part of organization-sponsored self-development and career counseling programs. Whether the administration of the inventory in this context resulted in any motivation on the part of incumbents to appear desirable is unclear.

Two studies examining only applicant populations also speak to this issue of construct validity. One study examined two samples of applicants and found substantially higher intercorrelations among the personality dimensions than a prior study conducted by the same authors on a non-applicant sample using the same scales (Barrick & Mount, 1993; Barrick & Mount, 1996). Collins and Gleaves (1998) also conducted an investigation utilizing only an applicant sample. In this study, the personality dimensions were also more highly intercorrelated than expected on the basis of prior research conducted with non-applicant samples. However, confirmatory factor analysis suggested a good fit of the data to the theorized five-factor model.
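Because several of the conflicting MCFA findings above may hinge on whether multivariate normality holds, screening the data before fitting is worthwhile. One common check is Mardia's (1970) multivariate skewness and kurtosis tests; the sketch below is a textbook implementation run on simulated placeholder data, not on any of the samples discussed.

```python
import numpy as np
from scipy import stats

def mardia(X):
    """Mardia's multivariate skewness and kurtosis tests. Returns
    (p_skewness, p_kurtosis); small p-values flag the non-normality
    that inflates Type I error in normal-theory MCFA."""
    X = np.asarray(X, float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    A = Xc @ np.linalg.inv(np.cov(X, rowvar=False, bias=True)) @ Xc.T
    b1 = (A ** 3).sum() / n ** 2               # multivariate skewness
    b2 = (np.diag(A) ** 2).mean()              # multivariate kurtosis
    chi2 = n * b1 / 6.0
    df = p * (p + 1) * (p + 2) / 6.0
    z = (b2 - p * (p + 2)) * np.sqrt(n / (8.0 * p * (p + 2)))
    return stats.chi2.sf(chi2, df), 2 * stats.norm.sf(abs(z))

rng = np.random.default_rng(3)
print(mardia(rng.normal(size=(300, 5))))       # normal data: large p-values
print(mardia(rng.exponential(size=(300, 5))))  # skewed data: flagged
```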
Two recent studies utilizing item response theory methodologies for investigating measurement invariance also yield conflicting results. Stark and colleagues (2001), discussed previously, found the presence of differential item and test functioning (DIF, DTF) across samples of applicants and non-applicants, suggesting that faking adversely affects the construct validity of personality scales. Note, however, that the non-applicant sample included respondents who took the inventory for research, counseling, or developmental purposes, so it is difficult to establish that all non-applicants were equally unmotivated to inflate responses. Robie, Zickar, and Schmit (2001) conducted similar DIF and DTF analyses on a sample of incumbents and applicants within the same organization. These authors found the scales to be more highly intercorrelated on average in the applicant group than in the incumbent group. However, DIF and DTF analyses revealed that these elevated correlations among the scales were not associated with degradation in the psychometric properties of those scales.

To summarize, it appears that applicant and non-applicant responses tend to have similar factor forms (i.e., the same items load on the same factors), but as often as not dissimilar factor loadings, and almost always dissimilar measurement errors across groups. Additionally, applicant groups tend to have higher intercorrelations among scales than non-applicant groups, indicating an erosion of discriminant validity among the assessed constructs.

It is interesting to note that there have been no studies that have investigated construct validity using the same sample measured once in an applicant setting and a second time under non-motivating conditions. The combination of a within-subjects design and a field sample has two primary advantages over other types of designs and samples used for examining construct validity issues. First, assuming construct validity evidence is present for the sample under non-motivating conditions, it can be inferred that any deviations found under motivating conditions are due to the context and not due to substantive differences inherent in the samples. Second, a field sample allows assessment of construct validity under real-life applicant conditions, free of the artificial equalization and potential inflation of respondents' motivation to fake that is present in laboratory settings. Given the equivocal results of past studies investigating the similarity of factor loadings across applicant and non-applicant responses, this issue will be investigated on an exploratory basis in the present study. Unfortunately, the large sample sizes necessary for DIF and DTF analyses prevent the utilization of these methods in the current study's investigation of measurement equivalence across groups.

Hypothesis 2a: Factor forms will not be significantly different across the two measurement periods (i.e., applicant administration and research administration); that is, there will be configural invariance of the personality factors across both administration periods.

Hypothesis 2b: Measurement errors of the personality test will be significantly different across the two measurement periods.

Hypothesis 2c: Average intercorrelations among the scales will be higher when the inventory is administered for application purposes than when it is administered for research purposes.
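Hypothesis 2c reduces to a comparison of a single summary statistic, the mean off-diagonal correlation among the scales, across the two administrations. A minimal sketch is below; the simulated "ideal employee" inflation factor is only an illustration of the mechanism Schmit and Ryan (1993) described, and a formal test would need to account for the dependence between the two administrations, which is glossed over here.

```python
import numpy as np

def mean_intercorrelation(scores):
    """Average off-diagonal correlation among k scales (n x k matrix)."""
    R = np.corrcoef(scores, rowvar=False)
    return R[np.triu_indices(R.shape[0], k=1)].mean()

rng = np.random.default_rng(4)
n, k = 200, 5
research = rng.normal(size=(n, k))                # near-independent scales
ideal_employee = np.abs(rng.normal(0.6, 0.3, n))  # shared inflation factor
applicant = research + ideal_employee[:, None]    # inflates all scales alike

print(mean_intercorrelation(research))    # near zero
print(mean_intercorrelation(applicant))   # elevated, the Hypothesis 2c pattern
```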
Criterion-related Validity. Many researchers have argued that faking does not substantially affect the criterion-related validity of personality tests and have even gone so far as to call faking the "red herring" of personality testing for personnel selection (e.g., Hough, 1998; Ones et al., 1996). However, other researchers have critiqued these claims and presented evidence to the contrary (e.g., Haaland & Christiansen, 1998; Mueller-Hanson, Heggestad, & Thornton, 2003a). As with the debate over the effects of faking on construct validity, no clear answer emerges.

Faking, often operationalized as scores on a scale purported to measure socially desirable responding, has often been assumed to operate as either a suppressor variable or a moderator. As a suppressor variable, faking is assumed to be positively correlated with predictor scores (i.e., personality measures), but unrelated, or negatively related, to criterion measures such as job performance (e.g., Ones et al., 1996). The resulting effect of such a phenomenon is that, in the presence of faking, the observed relationship between the personality measure and the criterion is attenuated. Researchers have also suggested that faking may act as a moderator of the relationship between personality scales and various criteria (e.g., Hough et al., 1990). In this situation, the criterion-related validity of a personality measure is expected to change as a function of the degree of faking engaged in by respondents. It has also been hypothesized that faking may act as a mediator or a predictor in its own right (Ones et al., 1996).

Ones and her colleagues (1996) conducted a meta-analysis to examine the effects of faking on criterion-related validity. These authors investigated the predictor, mediator, and suppressor hypotheses discussed above. Unfortunately, the authors operationalized faking as responses to social desirability scales and thus the results should be interpreted with caution. Social desirability scores did not predict task performance or supervisory ratings of job performance, thus precluding the possibility of these scores mediating the relationship between personality and criteria. Additionally, by partialling social desirability scores from personality measures, the authors were able to investigate the impact of such responding on validity. It was found that social desirability scores did not attenuate the validity of the personality measures. The authors concluded that social desirability did not act as a suppressor variable of this relationship.

Hough et al. (1990) studied faking, again operationalized as responses to a social desirability scale, as a moderator of the relationship between personality and job performance. The study utilized a concurrent-validation sample to examine hypotheses. To test the moderation hypothesis, the authors used the mean social desirability score obtained with a separate sample instructed to "fake-good" to dichotomize the validation sample into "overly desirable" and "accurate" responders. While almost a third of the resulting correlations with performance dimensions were significantly different for the "overly desirable" and "accurate" groups, the mean difference between the group correlations was only .03. The authors concluded that socially desirable responding did not moderate the relationship between the personality measures used in this study and performance criteria.
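The suppressor analysis described above (partialling social desirability from the personality measures, as in Ones et al., 1996) amounts to comparing the raw validity coefficient against the validity obtained after social desirability is residualized out of the predictor (a semipartial correlation). A minimal sketch follows, with invented data in which the social desirability scale carries trait variance but no criterion variance of its own.

```python
import numpy as np

def semipartial_validity(predictor, criterion, sd_scale):
    """Predictor-criterion correlation after partialling the social
    desirability scale out of the predictor only."""
    b = np.cov(predictor, sd_scale, ddof=1)[0, 1] / np.var(sd_scale, ddof=1)
    residual = predictor - b * (sd_scale - sd_scale.mean())
    return np.corrcoef(residual, criterion)[0, 1]

rng = np.random.default_rng(5)
n = 400
trait = rng.normal(size=n)
sd_scale = 0.3 * trait + rng.normal(size=n)      # SD shares trait variance
personality = trait + 0.2 * sd_scale + rng.normal(scale=0.5, size=n)
performance = 0.4 * trait + rng.normal(size=n)   # SD adds nothing on its own

raw = np.corrcoef(personality, performance)[0, 1]
print(raw, semipartial_validity(personality, performance, sd_scale))
# Similar values indicate no suppression, the pattern Ones et al. reported.
```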
The results of the two studies described above are similar to those of a number of other studies that have operationalized faking as responses to social desirability scales (e.g., Barrick & Mount, 1996; Christiansen et al., 1994; Ones & Viswesvaran, 1998; Weiner & Gibson, 2000). Similarly to the discussion of construct validity, however, laboratory investigations utilizing "fake-good" instruction sets have yielded conflicting results (e.g., Dunnette et al., 1962; Douglas et al., 1996; Frei, Snell, McDaniel, & Griffith, 1998).

Dunnette and colleagues' (1962) study and results are representative of studies conducted using "fake-good" instruction sets. In their study, Dunnette et al. administered the Adjective Checklist to sales employees under both instructions to respond honestly and instructions to "beat the test." Supervisor performance ratings showed the test to be predictive (correlations for the different dimensions ranged from .22 to .38) for responses obtained in the honest condition. However, when respondents attempted to "beat the test," all test dimensions failed to significantly predict supervisor performance ratings, and indeed some of the dimensions even exhibited correlations in the opposite direction. While "fake-good" instructions have been shown to result in larger differences in observed scores than are observed in applicant conditions, this study, and other studies using similar methodologies (e.g., Christiansen et al., 1994; Frei et al., 1998), clearly show that faking can destroy criterion-related validity.

A more recent study by Mueller-Hanson and her colleagues (2003a) suggests that criterion-related validity is harmed by faking. On the basis of evidence provided by Drasgow and Kang (1984) that correlation coefficients are extremely robust to changes in rank order confined to particular ranges of a bivariate distribution, these authors hypothesized that while criterion-related validity may not be significantly different for a motivated group than for a non-motivated group, the validity would be significantly higher for the bottom portion of the motivated group than for the top portion of this group. To test this hypothesis, Mueller-Hanson and her colleagues conducted a study in which they told one group of participants (i.e., the motivated group) that the personality measure would be used to select people into the next part of the study and that those selected would be eligible for a $20 cash prize. The criterion measure was performance on a 50-item test that involved simple but time-consuming and tedious exercises. The participants were allowed to quit the performance test whenever they wished with no adverse consequences (i.e., they would still be eligible for the $20 prize). The relationship between the personality measure (i.e., an achievement motivation measure) and the criterion was larger for the control group (r = .17, p < .05) than for the motivated group (r = .05, ns), but the difference was not significant. When the groups were separated into thirds, it was found that the validity for the lower portion of the motivated group distribution (r = .45, p < .05) was significantly greater than the validity for the upper portion of the same group (r = .07, ns), while the corresponding difference in the control group (a difference in r of .06) failed to reach statistical significance. The generalizability of the results of this study is bolstered by the presence of a motivated condition that more closely approximated the conditions of applicant settings.
The authors even warned participants in the motivated group of the consequences associated with responding dishonestly (i.e., disqualification from the study and ineligibility for the cash prize). Note that due to the laboratory nature of this experiment and the very narrow criterion used, the external validity of the study is somewhat questionable and requires further verification.

Haaland and Christiansen (1998) tested a similar hypothesis, albeit without a control group. In this study, qualified recruits attending a police academy were administered the test prior to being formally offered a space at the academy. The test was not used to select recruits, but rather was forwarded on to local agencies for use in selection of applicants from the pool of graduating recruits. Performance ratings were obtained from police academy officers trained to provide these ratings. Haaland and Christiansen specifically hypothesized that due to faking, "...one would expect a departure from linearity in construct relationships across different ranges of personality test scores" (p. 3). Indeed, this is exactly what was found. The validity of the test was the same for the entire sample as it was for the lower half of the distribution. However, for the upper half of the distribution the validity was zero, and for the top 15% of scorers the validity was equal to that of the entire sample, although in the opposite direction! When scores were corrected for range restriction, these differences became even more pronounced. Assuming fakers were overrepresented in the top 15% of the distribution, it appears that faking does impact criterion-related validity, although in a manner more complex than previous researchers realized. Unfortunately, the design of this study prevents the conclusion that the results found were due to faking and not due to some unmeasured factor shared among those at the top of the distribution.
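The banded analysis used by Haaland and Christiansen can be reproduced in a few lines: compute the predictor-criterion correlation within quantile bands of the predictor. The sketch below uses simulated data in which a minority of respondents inflate their scores and therefore migrate toward the top of the distribution; all quantities are invented for illustration.

```python
import numpy as np

def banded_validity(predictor, criterion, lo_q, hi_q):
    """Predictor-criterion correlation within a quantile band of the predictor."""
    lo, hi = np.quantile(predictor, [lo_q, hi_q])
    mask = (predictor >= lo) & (predictor <= hi)
    return np.corrcoef(predictor[mask], criterion[mask])[0, 1]

rng = np.random.default_rng(6)
n = 500
true_trait = rng.normal(size=n)
inflation = np.where(rng.random(n) < 0.2,        # 20% of respondents fake...
                     np.abs(rng.normal(1.2, 0.4, n)), 0.0)
observed = true_trait + inflation                # ...and migrate upward
performance = 0.4 * true_trait + rng.normal(size=n)

print(np.corrcoef(observed, performance)[0, 1])          # full-sample validity
print(banded_validity(observed, performance, 0.0, 0.5))  # lower half: intact
print(banded_validity(observed, performance, 0.5, 1.0))  # upper half: attenuated
```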
Research discussed previously by Haaland and Christiansen (1998) suggests that the effects of faking on criterion-related validity are masked due to the insensitivity of the correlation coefficient to changes in the rank order at one end of the distribution. The current study uses only selected applicants, who are more likely to be in the upper portions of the distribution; thus, it is expected that significant differences in criterion- related validity, computed on the entire distribution of scores within the current sample, will be found between the scores obtained in the applicant setting and scores obtained in the research setting. Additionally, given the results found by Haaland and Christiansen (1998) with regards to criterion validity differences across different ranges of the distribution, it is also expected that the current study will find significant differences in validity between the upper and lower halves of the applicant sample score distribution. Hyp_othesis 4a: Criterion-related validity will be significantly greater for personality scores obtained in a non-motivating context (i.e., for research purposes) than scores obtained in a motivating context (i.e., for application purposes). Hypothesis 4b: Criterion-related validity of the upper half of the distribution of applicant sample scores will be less than the criterion-related validity of the bottom half of the distribution. In relation to the above hypotheses, note that there is evidence that the use of some types of impression management tactics by subordinates positively relates to 21 supervisory performance ratings, although the effect sizes are generally small (e.g., Gordon, 1996; Wayne & Liden, 1995). Assuming that faking is indicative of an applicant’s ability to utilize, and likelihood of actually utilizing, impression management tactics on the job, it could be the case that faking actually increases the validity of personality tests when the criterion is supervisor-rated performance. Two meta-analysis have addressed this possibility (Ones et al., 1996; Viswesvaran, Ones, Hough, 2001). Both of these studies operationalized faking as responses to social desirability scales and found overall correlations between these scales and job performance ratings of less than .05. It is possible that an alternative Operationalization of faking may yield different results. However, on the basis of these two meta-analyses and the lack of empirical investigations utilizing an alternative Operationalization hypotheses 4a and 4b are proposed. Selection decisions. Much of the research that has examined the effects of faking on selection decisions and the rank-ordering of applicants suffers from the same limitations discussed above under construct and criterion-related validity. It is interesting to note, however, that some of the same research interpreted as providing evidence that criterion-related validity is not affected by faking also provided evidence that changes in selection decisions and rank-orders of applicants was likely to occur due to faking (e. g., Hough, 1998; Christiansen et al., 1994). For example, Christiansen and his colleagues found that an impression management “correction” suggested that using top-down selection on the basis of raw scores would have resulted in up to 16% of those selected being discrepant hires (i.e., those hired on the basis of raw scores who would not have been hired on the basis of their corrected scores). 
Similarly, laboratory studies have also found evidence of changes in the rank-order of applicants (e.g., Dunnette et al., 1962; Frei et al., 1998). Mueller-Hanson et al. (2003a), described previously, found that when the motivated and non-motivated groups were combined into the same applicant pool, motivated responders rose to the top of the distribution, resulting in an increased likelihood of being selected. In fact, for selection ratios of 60% or less, significantly more motivated group members would have been selected than would be expected on the basis of their representation in the entire applicant pool. The authors concluded that personality tests should only be used to "select out" the lowest scorers instead of being used to "select in" the top scorers, as is done when applicants are selected on a top-down basis. Unfortunately, some authors suggest that "select-in" procedures are still commonplace in many organizations (Arthur, Woehr, & Graziano, 2001).

Even when these studies are viewed in light of the limitations previously discussed, it is clear that faking can give an advantage to those who choose to engage in such behavior. The current study will only examine those applicants who were hired, indicating that they exceeded minimum cutoffs on the personality test and performed adequately in a pre-hire interview. However, even given this constraint, it is likely that some applicants would not have been hired on the basis of their honest responses to the personality measure.

Hypothesis 5a: The rank-ordering of people on the basis of responses obtained in a non-motivating context (i.e., for research purposes) will be substantially different from the rank-ordering of responses obtained in a motivating context (i.e., for application purposes).

Hypothesis 5b: Some of the people hired on the basis of their responses in a motivating context (i.e., applicant setting) would not have been hired on the basis of their responses in a non-motivating context (i.e., research setting).

Existing Models of Faking

The practical concerns surrounding faking are important in their own right, but should be considered within a theoretical framework of what motivates individuals to engage in faking. Snell, Sydell, and Lueke (1999) present an "interactional model" of faking behavior. According to this model, in order for an applicant to successfully fake a noncognitive test, the applicant must have both the motivation and the ability to do so. Based on a review of the psychological literature related to dishonest behaviors (e.g., deception and theft), these authors identify three broad factors hypothesized to influence motivation to fake: demographic, dispositional, and perceptual factors. The authors go on to provide a laundry list of specific constructs hypothesized to influence each of these factors.

While the model presented by these authors is useful as a heuristic map of constructs that may be related to faking, it does not provide the theoretical nesting necessary to develop a full explanation of the mediating and moderating variables that influence an individual's motivation to fake. Without an explanation of these mediators and moderators, it is very difficult to use this model to investigate ways to detect and deter faking. Additionally, this model fails to account for the ways in which currently known methods of deterring faking operate.
For example, warnings not to fake have been shown to deter faking to some extent (Dwight & Donovan, 2002), but this model does not provide an explanation of the psychological processes through which this effect occurs.

McFarland and Ryan (2000) present a more thorough model of faking, hypothesizing both mediators and moderators of the process. They suggest that all individual differences related to faking behavior operate through the mediating mechanism of beliefs toward faking, defined as the extent to which an individual holds a belief that faking is an acceptable practice, and the more proximal mediating mechanism of intentions to fake. Situational influences, such as warnings, are hypothesized to moderate the relationship between beliefs toward faking and intentions to fake. While this model is more comprehensive than the Snell et al. (1999) model described above, it is unlikely that beliefs toward faking are the only mediating mechanism through which variables influencing one's motivation to fake operate.

Both Snell et al. and McFarland and Ryan provided models that helped researchers direct their efforts in a more organized and systematic fashion. However, current research findings warrant the investigation of additional antecedents that account for a wider variety of influences and a more complex conception of the ways in which these variables operate to influence one's motivation to fake. For example, research shows that individuals believe that faking on selection tests is not the same as outright lying (Lueke, Snell, Illingworth, & Paidas, 2001). This suggests that a construct capturing an individual's beliefs regarding the similarity of faking and lying should be included in models of faking behavior. Additionally, in order to more adequately explicate the motivational processes that relate to faking, individual-level moderators of such behavior need exploration. For instance, it may be the case that even if an individual is motivated to fake, his or her self-efficacy for successfully faking may limit the degree to which he or she actually engages in response distortion.

The discussion presented below, in order to be comprehensive, will begin by reviewing some of the past research on contextual and individual difference influences on faking behavior that will not be examined in the current study. Next, the model to be tested in the current study will be presented and specific hypotheses will be discussed.

Past Research

Contextual Influences. Contextual influences refer to situational conditions present in the applicant context. These types of influences on faking behavior include warnings not to fake and competition for the job. Dwight and Donovan (2002) conducted a meta-analysis of the literature on warning applicants not to fake and found that these warnings were effective in reducing the degree of response inflation that occurs in applicant contexts. These authors also conducted a follow-up study with college students to determine the types of warnings that were most effective in deterring such behavior. In order to increase the generalizability of this study, participants were informed that only the four top scorers would be "selected" to receive a monetary benefit; no other benefits were available (i.e., course credit was not offered for participation). Warnings including both a cautionary note that faking is identifiable and a discussion of the potential consequences of such behavior (e.g., removal from the selection process) exhibited the greatest deterrence effect on actual faking behavior.
(e.g., removal from the selection process) exhibited the greatest deterrence effect on actual faking behavior. Thus, it appears that warnings decrease individuals' motivation to engage in faking by decreasing one's belief in being able to fake without being caught and increasing the salience of the consequences that may follow from such behavior. This is consistent with research suggesting that people are likely to impression manage when either the expected benefits of doing so increase or the expected costs of not doing so increase (Schlenker, 1980). Note, however, that warnings only served to reduce, not eliminate, response inflation, indicating that warnings do not represent a panacea.

Perceived competition for a job has also been shown to influence an individual's motivation to engage in faking behavior. Leary and Kowalski (1990) suggest that one's motivation to engage in impression management increases when the desired resource is scarce. Pandey and Rastogi (1979) provide support for this notion in a study in which it was shown that applicants increased their use of the impression management tactic of ingratiation towards an interviewer when perceived competition for the job was high. Lueke and her colleagues (2001) also showed that individuals reported being more likely to fake a personality test when presented with a scenario describing intense competition for a desired job. Competition is likely to influence motivation to fake by increasing an individual's attitude towards the utility of faking, a topic considered more fully in the next section.

Individual Differences. Individual differences in personality represent another set of variables found to influence an individual's motivation to fake. While the list of personality variables that may influence faking is quite long, faking research has largely been concerned with only three: conscientiousness, neuroticism, and Machiavellianism. Costa and McCrae (1985) describe conscientious individuals as responsible and rule-abiding and describe neurotic individuals as being especially concerned with how others view them. Conscientiousness and neuroticism also relate to integrity, indicating that highly conscientious and less neurotic individuals are generally more honest (Ones, Viswesvaran, & Schmidt, 1993). Additionally, in a laboratory study McFarland and Ryan (2000) found that conscientiousness and neuroticism were related to faking. Leary and Kowalski (1990) suggest that people high on Machiavellianism are more likely to engage in impression management than those low on this trait. Machiavellianism, defined as a belief that others can be manipulated (Christie & Geis, 1970), relates to self-reported lying and cheating in pursuit of desired ends (Kashy & DePaulo, 1996). Furthermore, Mueller-Hanson and her colleagues (2003b) found that Machiavellianism correlated with difference scores for students taking a personality measure under both "honest" instructions and instructions to respond as if applying for one's "dream job."

The Current Model and Study

The review above highlights the influences of both personality and contextual factors on faking behavior. While these constructs are important, the current study will focus on less researched and more proximal influences on faking behavior.
Building on the theories of reasoned action and planned behavior (Ajzen & Fishbein, 1980; Ajzen, 1985), the model presented here attempts to clarify the constructs and psychological processes involved in an applicant's choice of whether or not to engage in faking. The model tested in the current study, presented in Figure 1, retains the attitudes, subjective norms, and perceived behavioral control beliefs, or self-efficacy (Ajzen & Madden, 1986), contained in the theory of planned behavior; however, an additional explanatory construct has been added. Specifically, ethical beliefs have been added to the model in order to capture an influence on motivation that is unique to the context of faking, or deviant behaviors more generally. Additionally, note that the substantive variables in the model are assumed to operate through the mediator of intentions. However, due to the nature of the current study (i.e., a field sample of actual applicants) it is not possible to obtain an uncontaminated measure of intentions to fake prior to the individuals' potential engagement in such behavior.

[Figure 1. Model of Faking. The diagram is not legible in the source scan; recoverable labels indicate that knowledge of constructs, self-efficacy regarding faking, attitudes, subjective norms, and ethics (an ethic against lying and beliefs about faking) lead to faking through intentions. Hypotheses are noted by their numbers; dashed lines indicate variables and relationships not tested in the current study.]

Attitudes. While not tested in the current study, it is important to highlight the role that attitudes play in the theories of reasoned action and planned behavior as well as the role that attitudes may play in one's motivation to engage in faking. Attitudes towards a given behavior, sometimes called subjective expected utility (Harrison, 1995), are based on both belief strength, the strength of perceived contingencies between performing a behavior and possible consequences of the behavior, and on the valence, or desirability, of those consequences (Ajzen & Fishbein, 1980). Belief strength is the degree to which an individual believes that the consequence will follow from the behavior. Valence is the degree to which one perceives the consequence as desirable or undesirable. Prior research has shown attitudes to predict a variety of volitional behaviors including volunteer attendance (Harrison, 1995), weight loss (Schifter & Ajzen, 1985), and class performance (Ajzen & Madden, 1986).

Although there may be other relevant consequences, the most important with respect to faking are increasing and decreasing one's chance of being hired. Thus an applicant's attitude toward faking is composed of the sum of the products of belief strength and valence for the consequences of: (a) faking and thereby increasing one's chances of selection, and (b) getting caught faking and thereby reducing, or eliminating, one's chances of selection.

Warnings represent a key construct investigated in prior faking research that is likely to influence faking behavior through its effects on attitudes. Dwight and Donovan (2002), discussed previously, found that warning applicants that faked scores are detectable and explicitly informing applicants of the consequences (e.g., removal from the selection process) resulted in the largest decrease in response inflation. It is possible that warnings deter faking by increasing both the belief strength and the negative valence associated with being caught faking.
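In expectancy-value terms, this composition can be written compactly. The notation below is illustrative shorthand for Ajzen and Fishbein's (1980) formulation; the symbols are not drawn from the thesis measures themselves:

```latex
% Expectancy-value form of the attitude toward faking (after Ajzen & Fishbein, 1980).
% Illustrative notation: b_i and v_i are not symbols used in the thesis itself.
A_{\mathrm{fake}} = \sum_{i=1}^{n} b_i \, v_i
% b_i : belief strength -- the perceived likelihood that consequence i
%       (e.g., being hired; being caught and removed) follows from faking
% v_i : valence -- the perceived desirability or undesirability of consequence i
```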
Some authors (e.g., McFarland & Ryan, 2000) argue that warnings moderate the relationship between attitudes towards faking and faking behavior. However, this could be an artifact related to the time of attitude measurement in relation to when warnings are given. For example, if attitudes are measured prior to the time warnings are given, it may appear empirically that warnings moderate the relationship between beliefs and faking. However, if attitudes were measured after warnings are given, it is likely that the warnings would show a main effect on attitudes by increasing the belief strength that faking will lead to being caught, rather than moderating the relationship between attitudes and faking behavior. In order to adequately assess attitudes, applicants' attitudes would need to be assessed prior to receiving a selection or rejection decision. The nature of the current study precludes such an investigation.

Subjective Norms. Ajzen and Fishbein (1980) define subjective norms as an individual's perceptions of salient others' beliefs about whether a behavior is, or is not, acceptable. Perceptions of others' beliefs toward a given behavior have been found to predict intentions to perform and subsequent performance of that behavior (e.g., Ajzen & Madden, 1986; Schifter & Ajzen, 1985). These effects have also been found with respect to cheating, lying, and shoplifting (Beck & Ajzen, 1991).

Lueke et al. (2001) assessed individuals' subjective norms in regard to faking and found that respondents who self-reported engaging in response distortion in the past also indicated that they believed that others thought that this behavior was acceptable and appropriate. In another study, Mueller-Hanson and her colleagues (2003b) assessed subjective norms as an indicator of the latent variable perceptions of the situation, which also included belief in the importance of faking and belief in ability to fake as separate indicators. While this measure is not a pure measure of subjective norms, structural equation modeling showed that the subjective norms component was the largest determinant of perceptions of the situation and, furthermore, that perceptions of the situation were the largest predictor of actual faking.

Although no research has examined the potential of situational influences to change individuals' subjective norms, it is possible that the situation can be leveraged to indicate to applicants that faking is not a common or acceptable behavior. For example, test administrators could stress that faking on these tests is the same as lying and that most people do in fact respond honestly to these types of tests. Given the norms against lying in society, such a statement may help to deter faking by causing people to assess salient others' beliefs regarding lying, instead of focusing solely on the perceptions of others' beliefs regarding dissembling on a personality test. While it is unlikely that such a statement would eliminate faking by itself, in conjunction with the traditional warnings of possible detection it may further decrease applicants' motivation to fake. Therefore, it is important to assess the relationship between subjective norms and faking behavior within an actual applicant sample.

Hypothesis 6a: Perceptions that others believe faking to be an acceptable practice will be related to faking.
While Ajzen and Fishbein (1980) limit their definition to perceptions of others' beliefs, the literature suggests that it is appropriate to expand this definition to include perceptions of others' behavior as well. For example, Graham, Monday, O'Brien, and Steffen (1994) found that individuals who believed that a large number of students cheated were more likely to report having cheated themselves. In the literature on faking, similar effects of the perceptions of others' behavior have also been found. For example, Lueke et al. (2001) found that individuals reporting a belief that others distort responses on selection tests were more likely to report having engaged in faking themselves. As with perceptions of others' beliefs, perceptions of others' behavior may be amenable to situational influences aimed at raising applicants' awareness that faking is not a common and acceptable behavior. Thus, it is important to verify prior findings within an actual applicant population.

Hypothesis 6b: Perceptions that others engage in faking will be related to faking.

Ethical beliefs. Neither the theory of reasoned action nor the theory of planned behavior hypothesizes an ethical influence on behavior. However, some theorists have argued for the inclusion of a moral-ethical component in models of behavioral decisions (e.g., Etzioni, 1988; Triandis, 1977). Ethics, defined here as an individual's personal beliefs about the inherent goodness or badness of performing a behavior, are separate from the instrumental concerns captured by attitudes and from the perceived expectations and behavior of others captured by subjective norms. Rather, ethics reflect an internalized pressure to be consistent with one's own value system, void of any social pressures or referents. Ethical concerns have been shown to predict behavior above and beyond attitudes, subjective norms, and self-efficacy (Harrison, 1995).

Leary and Kowalski (1990) suggest that most people have an internalized ethic against lying that prevents them from claiming images blatantly inconsistent with their self-concepts. This notion is supported by a study that utilized the randomized response technique to examine faking, in which it was found that only 15% of people admitted to giving responses during a selection process that were "completely false or made up" (Donovan et al., 2003). However, 32% reported having "exaggerated my personality characteristics or traits" and 62% reported having "down played what some might consider my negative attributes." Another study found that many individuals who admitted to faking in the past perceived response distortion as "different than lying" (Lueke et al., 2001). Taken together, these studies indicate that although an individual may have an ethical compunction against lying, this does not necessarily extend to faking.

Given this disconnect between lying and faking, any assessment of ethical beliefs regarding faking must contain two components. The first must assess the degree to which an individual holds an internalized ethic against lying generally. The second must assess the degree to which the individual perceives response inflation on a selection test as the same as lying. On the basis of evidence that some people perceive lying as different than faking, it is expected that the relationship between ethical beliefs concerning lying and faking will be moderated by beliefs that faking is the same as lying.
Hypothesis 7: The relationship between a self-reported ethic against lying and faking will be moderated by an individual's belief that faking on a selection test is the same as lying, such that in the presence of a belief that faking is not the same as lying the relationship between reporting an ethic against lying and faking will be reduced.

Self-efficacy. Self-efficacy, or perceived behavioral control in the terminology of the theory of planned behavior (Ajzen, 1985; Ajzen & Madden, 1986), has been shown to predict a variety of volitional behaviors including weight loss (Schifter & Ajzen, 1985), task performance (Stajkovic & Luthans, 1998), and volunteer attendance (Harrison, 1995). Self-efficacy refers to the degree to which an individual believes he or she can successfully perform a desired behavior. These beliefs may be based on past experience, perceived ability, second-hand information obtained from others, and other factors that increase or reduce the perceived difficulty of performing a behavior (Bandura, 1997). The only study to examine the role of self-efficacy with regard to faking was Lueke et al. (2001). These authors found that self-reported ability to distort one's responses to a personality test was related to self-reported faking behavior in the past. Research has shown that self-efficacy beliefs can be influenced by contextual factors (Bandura, 1997). Given these findings, it is possible that self-efficacy regarding faking may be amenable to efforts by test administrators aimed at reducing individuals' beliefs that faking is possible. Thus it is important to establish the influence of such beliefs on applicant faking.

Hypothesis 8: Individuals with high self-efficacy for enhancing their responses to a selection test in a desirable way will be more likely to engage in faking.

Possessing knowledge of the nature of the construct being assessed may also influence an individual's motivation to fake a noncognitive selection test. Reynolds, Sinar, and Haaland (2003) showed that a pre-testing orientation program describing the nature of the personality constructs measured in a selection test can influence test scores. These researchers compared test scores of applicants receiving a construct-focused orientation program to scores of applicants receiving either no orientation or a general orientation on the format of the test. The group of applicants participating in the construct-focused orientation scored significantly higher on the test than both the group receiving no orientation and the general orientation group.

Frei, Snell, McDaniel, and Griffith (1998) measured participants' knowledge of the constructs associated with successful performance in customer service jobs. In an "applicant" condition in which participants were told to respond as if they were applying for a customer service job, knowledge of the construct significantly related to response inflation as measured by within-subject difference scores. Additionally, prior research has shown that item transparency relates to faking (e.g., Alliger, Lilienfeld, & Mitchell, 1996), suggesting that if individuals are aware of the desirability of the construct being measured, they may be more motivated to fake. While it is possible that the relationship between knowledge of constructs and faking is fully mediated by self-efficacy, it is more likely that this variable is only partially mediated by self-efficacy.
Some minimal knowledge of whether the construct is desirable or not is necessary in order for an individual to enhance his or her responses at all; thus, it is expected that knowledge of constructs will also exert a direct influence on faking.

Hypothesis 9a: Knowledge of the constructs being assessed will be related to faking.

Hypothesis 9b: The relationship between knowledge of the constructs being measured and faking will be partially mediated by self-efficacy.

Bandura (1997) states, "Beliefs of personal efficacy constitute the key factor of human agency. If people believe they have no power to produce results, they will not attempt to make things happen" (p. 3). This suggests that in the absence of some minimum level of self-efficacy for a given course of action, an individual will not even attempt the action. While this is perhaps overstated, it is likely that a lack of self-efficacy greatly diminishes an individual's motivation to pursue an action. In the present context, it is likely that, regardless of other motivating factors, an individual who believes that he or she is not capable of effectively faking his or her responses to a noncognitive selection test will not attempt to do so.

Hypothesis 10a: The relationship between subjective norms and faking will be moderated by self-efficacy, such that in the presence of low faking-related self-efficacy the relationship between subjective norms and faking will be reduced.

For the variables assessing ethical beliefs, the hypothesis below represents a three-way interaction between ethical beliefs regarding lying, beliefs that faking is the same as lying, and self-efficacy.

Hypothesis 10b: The relationship between ethical beliefs and faking will be moderated by self-efficacy, such that in the presence of low faking-related self-efficacy the relationship between ethical beliefs and faking will be reduced.

Summary

As noted, some of the above relationships have been investigated in prior studies; however, many of these studies have used somewhat weak methodologies that limit generalizability, such as using faking scales to operationalize faking or using a "fake-good" instruction set. While these types of studies were useful in the initial stages of faking research, it is time to use more complex and generalizable methodologies utilizing applicant samples with more concrete measures of response distortion.

The greatest contribution of the present study is the utilization of a within-subjects design and a field sample to investigate applicant faking. Researchers suggest that this is the type of design and sample that is required to adequately address the debates in the current literature (e.g., Stark et al., 2001; Mueller-Hanson et al., 2003a; Weekley et al., 2003). The current study also contributes to the knowledge base of faking by investigating antecedents to faking behavior that have been suggested, but not definitively shown, to influence faking.

METHOD

Sample

The entire sample consisted of 169 part- and full-time employees of a large Midwestern theme park. All employees applied, and were selected, for entry-level positions within the organization between January and July, 2004. Of the entire sample, 9 people had substantial amounts of missing test data and were thus excluded from all analyses. An additional 9 people did not exceed the normative cutoffs, and thus should not have been hired based on their applicant test scores, and were not included in the analyses.
Due to organizational delays in inserting the social desirability scale into the applicant test, data for this scale were obtained from only 29 participants as applicants, 4 of whom should not have been hired on the basis of their applicant test scores, resulting in an analyzable sample size of 25. Due to a researcher error, the knowledge measure was administered to only 49 participants, 4 of whom should not have been hired on the basis of their applicant test scores, resulting in an analyzable sample size of 45. Due to small amounts of other missing data, the sample size for all analyses, excluding the social desirability and knowledge measures, is between 147 and 151.

The sample was predominately female (62%). Age data were available for 87% of the sample and indicated that individuals included in the sample ranged between 18 and 72 years of age, with a mean of 40. Job title information was available for 86% of the sample and indicated that the sample included food service workers (9%), presenters and tour guides (47%), visitor services and retail employees (21%), security personnel (4%), and custodial workers (3%). Race data were not available for the current sample, but analysis of a large sample of prior applicant data for this organization indicates that applicants are predominately white (53%) or African-American (41%), with smaller representations of Hispanics (3%), Asian or Pacific Islanders (<1%), and Native Americans (<1%).

Approximately 500 employees were eligible to participate in this study. Applicant test scores were available for 221 employees who were eligible to participate but did not. Independent-samples t-tests were conducted on applicant test scores to examine any differences between those individuals volunteering for this study and those individuals who did not volunteer. The results confirm that there were no significant mean differences between the current sample and the available sample of eligible employees (for all tests: t(370) < 1.974, n.s.).

Design

The experiment utilizes a within-subjects design consisting of the completion of a personality-based selection test at two periods in time. The first administration of the test occurred when the participants applied for employment to the organization. The second administration occurred 3 to 6 months later, between August and November, 2004. Supervisory performance ratings were collected at the end of November, 2004.

Measures

Selection test. The proprietary selection test was developed and validated specifically for this organization. A thorough job analysis was conducted in order to elucidate the personality dimensions important for performance at this organization. Scales were constructed to assess these dimensions, and a concurrent validation study was utilized to establish the validity of the instrument. The final instrument contains five dimensions: Adaptability, Confidence and Friendliness, Productivity and Quality Focus, Ease of Supervision, and Reasoning and Problem Solving.

Version 1 of the selection test contains 107 self-report items assessing personality constructs, 16 multiple-choice items assessing reasoning ability, and 13 self-report items assessing theft and substance abuse. With the exception of 7 items, all items assessing personality constructs are answered on a 5-point Likert-type scale ranging from strongly disagree to strongly agree. Approximately 82% of the full analyzable sample (130 people) took Version 1 of the selection test as applicants.
Version 2 of the selection test is completely nested within Version 1 (i.e., all of the items in Version 2 were included in Version 1) and contains 72 self-report items assessing personality constructs, 16 items assessing reasoning ability, and 13 self-report items assessing theft and substance abuse. Version 2 also contains an additional 17 self-report items assessing response distortion, described subsequently, that are not included in Version 1 of the test and are not scored or used for selection purposes. Approximately 18% of the analyzable sample (29 people) took Version 2 of the selection test as applicants. Only the items contained in both versions of the test will be examined and used to test hypotheses. In addition to differing in the total number of items, the two versions differed slightly in the percentage of people meeting the minimum criteria for interview eligibility (Version 1: 72.3%; Version 2: 68.2%) and in the average concurrent validity of the dimensions (Version 1: r = .23; Version 2: r = .29).

The test is scored, for selection purposes, on an empirically derived, rationally constructed 0 to 3 scale, with the 2 least desirable options receiving a score of "0", neutral responses receiving a score of "1", and desirable and highly desirable responses receiving scores of "2" or "3" depending on whether the options exhibited substantial validity beyond the other options for a given item.

The second administration of the test included only the 72 self-report items from Version 2 and the 17 response distortion items. The theft and substance abuse items were excluded due to both the very sensitive nature of the topics covered by the items and the limited amount of time for the research administration of the test. Additionally, in the interest of time, the reasoning ability multiple-choice items were excluded, as these items assess cognitive ability and are not related to the primary hypotheses.

Due to the proprietary nature of the selection test, it will not be reproduced here and only a few example items will be provided. The Adaptability dimension was designed to assess the extent to which individuals flexibly adapt to changes in demands and procedures in the workplace and maintain composure in stressful situations. Examples of items included in this dimension are, "I enjoy it when I get to do new and different things at work," and, "I'm at my best when I'm challenged and things are difficult." The Confidence and Friendliness dimension was designed to assess the degree to which an individual enjoys being with others, confidently approaches one-on-one and group interaction situations, and is comfortable interacting with both customers and coworkers. Examples of items included in this dimension are, "I often feel uncomfortable around others," and, "I am skilled in handling social situations." The Productivity and Quality Focus dimension was developed to measure the extent to which individuals are detail focused, reliable, responsible, and concerned with the quality of their work. Examples of items included in this dimension are, "I am very exact in what I do," and, "I almost always do more than is required in work or school activities." The personality-based items in the Reasoning and Problem Solving dimension were designed to assess the extent to which individuals are intellectually curious, creative, and seek out opportunities to learn.
Examples of items included in this dimension are, "I avoid reading difficult material," and, "I do not have a good imagination." The test score for this dimension is a composite of the personality and cognitive ability items. In order to replicate how this dimension is used in practice, the cognitive ability items from the applicant administration were used to form the scale score for this dimension in both the applicant and incumbent settings. That is, the cognitive ability items were only administered in the applicant setting and were not administered in the incumbent setting. The Ease of Supervision dimension was developed to measure the degree to which individuals trust supervisors, are willing to take direction, and are generally even-tempered. Examples of items included in this dimension are, "A lot of supervisors just enjoy controlling people," and, "I get irritated easily."

Performance Appraisal. The performance appraisal form was developed on the basis of a job analysis and discussions with supervisors (Appendix A). Supervisors completed similar forms for employees who took part in the initial validation of the selection test. Supervisors were informed that this performance appraisal was for research purposes only and would in no way affect employees. The eight performance dimension ratings were averaged to form a composite performance rating that is used for all analyses.

Experimental measures. Efforts were made to use established scales to measure the constructs described below. However, many of the scales used in prior research contained very few items, sometimes as few as a single item (e.g., "beliefs about faking" from Lueke et al., 2001; 2002). Thus, it was necessary to supplement existing scales with additional items in most of the measures below. All scales were assessed using a 5-point Likert-type scale ranging from strongly disagree to strongly agree, unless otherwise noted.

The order of the measures presented to participants was the same order in which the measures are discussed below, with the exception of the Unlikely Virtues scale, which was embedded in Version 2 of the selection test. The order of presentation of the measures was chosen in order to avoid explicitly priming participants to the possibility that faking can be considered lying. It is possible that once participants were presented with the scales concerning lying they might answer other items differently than if they had not previously thought about faking in terms of lying. Conversely, the ethic against lying scale was presented prior to the beliefs about faking scale, which measures beliefs that faking is the same as lying, in order to allow participants to think about their ethical beliefs regarding lying generally before asking them whether they believe that faking is the same as lying.

Subjective Norms: Others' Beliefs. Five items were used to assess the extent to which participants believe that significant others in their lives would approve or disapprove of responding desirably on personality-based selection tests (Appendix B). Four of these items were adapted from a scale used by McFarland (2000) and one item was developed specifically for this study. Higher scores on this scale indicate a belief that others think it is acceptable to fake on selection tests.

Subjective Norms: Others' Behavior. Five items were used to examine the extent to which participants believe that others engage in faking on personality-based selection tests.
Four of these items were adapted from scales used by Lueke and her colleagues (2001; 2002) and Mueller-Hanson and her colleagues (2003b), and one item was developed specifically for this study (Appendix B). Higher scores on this scale indicate a belief that others engage in faking on selection tests.

Self-efficacy regarding faking. Six items were used to examine participants' self-efficacy for faking (Appendix B). Three of the items were adapted from Wiechmann (2000), two items were adapted from McFarland (2000), and one item was adapted from Mueller-Hanson (2003b). Higher scores on this scale indicate high self-efficacy for successfully faking responses.

Ethic against lying. Seven items were used to measure participants' ethical stance on lying (Appendix B). Four of these items were adapted from a scale used by Christie and Geis (1970) to assess attitudes towards lying in relation to the personality construct of Machiavellianism. Three additional items were constructed specifically for this study. Higher scores on this scale indicate a stronger ethic against lying.

Beliefs about faking. Four items were used to assess participants' beliefs that distorting one's responses on a personality-based selection test is similar to lying (Appendix B). One of these items was adapted from a scale used by Lueke et al. (2001; 2002) and three items were developed specifically for this study. Higher scores on this scale indicate a belief that faking on selection tests is not the same as lying generally.

Knowledge of constructs. A 15-item multiple-choice test was developed specifically for this study in order to assess the degree to which participants are aware of which constructs are being assessed within the test. Each item includes, in its stem, an item from the personality-based selection test. For each item, participants were instructed to choose the description of the category that the item belonged to, choose which answer was the most desirable by organizational standards, and rate their confidence that they knew the most organizationally desirable response to the item (Appendix C). For each item there is an additional response option of "None of the above." Three items were chosen from each dimension of the selection test to represent a range of obviousness. That is, some items obviously come from a certain dimension (e.g., item number 14), while for other items it is less clear which dimension the item belongs to (e.g., item number 2).

Response distortion. The International Personality Item Pool (IPIP) 17-item unlikely virtues scale (α = .76) will be used as the measure of response distortion embedded in Version 2 of the selection test (IPIP, 2001; Appendix D). This scale was constructed to be parallel to the unlikely virtues scale contained in the Multidimensional Personality Questionnaire, where it is used as a validity scale (Tellegen, in press).

Procedure

First/Applicant Administration. All applicant testing was completed on-site and was supervised by the organization's hiring personnel. Applicants were instructed to answer honestly, but no warnings regarding faking were provided. There are two normative cutoffs that must be exceeded in order for an applicant to receive an interview. In Version 1 of the selection test, applicants must score at or above the 10th percentile on each dimension and must score at or above the 25th percentile on the overall score, a summation of standardized dimension scores. Applicants who take Version 2 must score at or above the 11th percentile on each dimension and must score at or above the 23rd percentile on the overall score. These cutoffs were designed to, and do in fact, eliminate approximately 30% of all applicants.
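As a minimal illustration, the two-cutoff eligibility rule can be sketched as follows. The function and variable names, and the use of norm-group score arrays to define the percentile cutoffs, are assumptions made for the sketch; the actual scoring software is proprietary and is not described in this thesis.

```python
import numpy as np

def interview_eligible(dim_scores, norm_scores, dim_pct=10, overall_pct=25):
    """Version 1 screening rule: every dimension must reach the 10th percentile
    of the norm group, and the overall score (a summation of standardized
    dimension scores) must reach the 25th percentile. Names are illustrative."""
    for dim, score in dim_scores.items():
        if score < np.percentile(norm_scores[dim], dim_pct):
            return False  # failed a per-dimension cutoff
    # Standardize each dimension against its norms, then sum for the overall score
    overall = sum((dim_scores[d] - norm_scores[d].mean()) / norm_scores[d].std(ddof=1)
                  for d in dim_scores)
    norm_overall = sum((norm_scores[d] - norm_scores[d].mean()) / norm_scores[d].std(ddof=1)
                       for d in norm_scores)
    return overall >= np.percentile(norm_overall, overall_pct)
```

For Version 2 the same rule would apply with the 11th and 23rd percentiles.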
Applicants exceeding the minimum cutoffs are interviewed by current managers, who make all final hiring decisions. The organization estimates its current selection ratio to be approximately 30%.

Second/Incumbent Administration. Between August and November, 2004, participants were re-administered only the personality-based portion of the selection instrument. This time period was chosen for two reasons. First, it was necessary to allow a sufficient amount of time between the applicant and research administrations of the test in order to prevent participants from simply recalling how they had responded to the items as an applicant. Second, many participants are seasonal workers and are laid off in mid-November. Participants were assured, in both the written consent form and the verbal protocol, that no one within the organization would ever have access to their individual responses to this administration of the test (Appendix E; F). Participants completed the test during their normal working hours and received their normal hourly wage for participation.

After completion of the personality-based portion of the selection test, all participants were reminded of the confidentiality of their responses and were again requested to respond honestly to the remaining experimental measures. After completion of all measures, participants were administered a second consent form (Appendix G) requesting their permission to obtain their applicant administration test scores and supervisory performance ratings. Participants had access to debriefing forms once all participants had completed the study (Appendix H). Supervisory performance ratings were obtained in late November from participants' current supervisors. After agreeing to participate in the current study by signing and dating the consent form (Appendix I), supervisors completed a performance appraisal for each participant.

RESULTS

Construct Validity of the Selection Test

Responses to the incumbent administration of the personality-based test were subjected to an exploratory factor analysis to assess the factor structure of the test. Incumbent responses were factor analyzed, as opposed to applicant responses, due to the findings of previous research indicating higher intercorrelations among scales and the emergence of different factor structures in applicant settings as opposed to research settings (e.g., Barrick & Mount, 1996; Schmit & Ryan, 1993; Weekley et al., 2003). However, the a priori dimensions of the test did not emerge as pure factors. Despite concerns about the low person-to-item ratio (Ford, MacCallum, & Tait, 1986), exploratory factor analyses were conducted to investigate the factor structure of the test. On the basis of a series of rational item groupings and exploratory factor analyses with varimax rotation, two correlated factors emerged, accounting for 37.25% of the variance in responses. Analysis of the items contained in each factor confirmed that one factor was composed of items similar to those generally used to assess emotional stability and the other factor was composed of items similar to those generally used to measure conscientiousness. All factor loadings for each factor were between .42 and .75.
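A minimal sketch of this kind of analysis appears below. The `factor_analyzer` package, the `items` matrix name, and the assignment of each item to its most salient factor are illustrative assumptions; the thesis does not report the software or the exact item-grouping procedure used.

```python
import numpy as np
from factor_analyzer import FactorAnalyzer  # illustrative package choice

def two_factor_efa(items):
    """Exploratory factor analysis of incumbent item responses with varimax
    rotation, retaining two factors as in the analyses reported above.
    `items` is an (n_respondents x n_items) array."""
    fa = FactorAnalyzer(n_factors=2, rotation="varimax")
    fa.fit(items)
    loadings = fa.loadings_                        # item-by-factor loading matrix
    prop_var = fa.get_factor_variance()[1]         # proportion of variance per factor
    assignment = np.abs(loadings).argmax(axis=1)   # each item's most salient factor
    return loadings, prop_var.sum(), assignment
```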
The emotional stability and conscientiousness scales will be used to test all relevant hypotheses. Additionally, the five a priori, heterogeneous scales and the overall test score will also be used to test all relevant hypotheses, as this represents how the test is used in practice. The emotional stability and conscientiousness scale scores were computed by averaging the item responses, such that higher scores indicate more of the construct. The a priori dimension scale scores were computed by summing the empirically coded test responses and standardizing according to previously established norms, in accordance with how these dimensions are used in practice. The overall test score was computed by summing the standardized dimension scores.

Selection Test Difference Scores, Scale Reliability, and Effect Sizes

Difference scores (d-scores) were computed for all applicants by subtracting incumbent scores from applicant scores. This was done for the emotional stability and conscientiousness scales as well as for the test dimensions and overall score. Researchers have argued that d-scores are appropriate when these scores represent a construct of substantive interest, as when one expects a Participant X Treatment interaction (Tisak & Smith, 1994). Other researchers have criticized the use of d-scores because these scores may exhibit low reliability (e.g., Edwards, 2002). However, Rogosa, Brandt, and Zimowski (1982) demonstrate that d-scores do not necessarily exhibit low reliability and can, in fact, be an accurate and valuable measure of individual change even in situations where the reliability is low. In the current study, d-scores represent a construct that is conceptually meaningful in that it reflects the amount of response inflation occurring as a function of the setting in which test scores were obtained (i.e., applicant and incumbent settings). Thus, a Participant X Treatment interaction was expected because of the assumption, and finding, that some individuals inflate their responses more in an applicant context relative to a research context.

Reliability information for the emotional stability and conscientiousness scales, the test dimension scales, and the social desirability scores, as well as the respective d-scores, is contained in Table 1. The d-score reliabilities were estimated with an equation provided by Rogosa and his colleagues (1982, Table 3, Assumption 0). Reliability information was not available for the overall test score.

Table 1: Paired-Samples t Tests, Effect Sizes, and Reliability Information

Variable                       Applicant Context Rel.¹  Research Context Rel.¹  D-score Rel.²  Effect Size³  t⁴
Emotional Stability            .86                      .83                     .67            0.83          10.24
Conscientiousness              .77                      .76                     .58            0.89          10.42
Adaptability                   .51                      .65                     .36            0.79           8.78
Confidence & Friendliness      .43                      .76                     .44            0.62           7.13
Productivity & Quality         .64                      .74                     .52            1.01          11.23
Reasoning & Problem Solving    .55                      .63                     .37            0.21           5.04
Ease of Supervision            .73                      .71                     .51            0.99          11.19
Overall Test Score             -                        -                       -              1.14          13.05
Social Desirability            .67                      .79                     .58            1.34           4.35

NOTE: n = 151. ¹Alpha reliability estimates. ²D-score reliability computed with an equation provided by Rogosa et al. (1982). ³Effect sizes were computed by subtracting the mean score for the incumbent setting from the mean score for the applicant setting and dividing by the pooled standard deviation. ⁴All t-values are significant (p < .01); df = 149.
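The quantities reported in the last two columns of Table 1 can be reproduced in outline as follows; `scipy`, the array names, and the exact form of the pooled standard deviation are illustrative assumptions consistent with the table note.

```python
import numpy as np
from scipy import stats

def faking_summary(applicant, incumbent):
    """d-scores, paired-samples t-test, and effect size as defined for Table 1.
    `applicant` and `incumbent` are matched score arrays for the same people."""
    d_scores = applicant - incumbent   # positive d-scores indicate response inflation
    t, p = stats.ttest_rel(applicant, incumbent)
    # Effect size: applicant mean minus incumbent mean, over the pooled SD
    pooled_sd = np.sqrt((applicant.var(ddof=1) + incumbent.var(ddof=1)) / 2)
    effect_size = (applicant.mean() - incumbent.mean()) / pooled_sd
    return d_scores, t, p, effect_size
```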
In the applicant context, the alpha reliability of the majority of the test dimension scales is quite low in the current sample, although acceptable levels of reliability were obtained for the emotional stability and conscientiousness scales. Previous analyses on a much larger applicant database indicate that the alpha reliability for all test dimensions is between .65 (Reasoning & Problem Solving) and .82 (Ease of Supervision). It is likely that the alpha reliability estimates underestimate the true reliability of the test dimensions because the items are empirically keyed, resulting in lower item variance generally, and because of the restricted range of the applicant score distribution. The alpha reliability estimates obtained for the incumbent context were generally at acceptable levels, but the reliabilities of the Adaptability and Reasoning & Problem Solving scales were slightly below traditionally acceptable levels (α = .65 and .63, respectively). The d-scores generally exhibited low reliability, which is not unexpected given the substantial positive correlations between the applicant and incumbent test scores and the low reliability estimates observed for the applicant scores.

Paired-sample t-tests and effect size estimates for the difference between the means obtained in the applicant and incumbent settings are also contained in Table 1. Effect sizes were computed by subtracting the mean score for the incumbent setting from the mean score for the applicant setting and dividing by the pooled standard deviation. The positive effect sizes indicate that higher mean scores were obtained in the applicant setting than in the incumbent setting. All mean differences were significant and all effect sizes were moderate to large.

Descriptive Statistics

Table 2 contains means, standard deviations, reliabilities, and intercorrelations for all variables.

[Table 2: Descriptive Statistics and Intercorrelations. The correlation matrix is not legible in the source scan; the surviving note indicates that correlations in bold are significant (p < .05) and that gender is coded 0 for female and 1 for male.]

There were high to moderate correlations between the selection test scores in both the applicant and the incumbent context, but smaller intercorrelations across the two contexts. Note the substantially lower variance obtained in the applicant as opposed to the incumbent context. In combination with the evidence of higher mean scores, this pattern suggests a ceiling effect in this particular applicant context; that is, faking served to depress the variance of scores in this context.

Hypothesis Tests

Social Desirability. Hypothesis 1 states that applicant social desirability scores will have a small to moderate correlation with d-scores. Examination of the correlations between social desirability scores and d-scores (Table 2) indicates that applicant social desirability has a small to moderate relationship with d-scores. However, due to the very small sample size available for these analyses, no firm conclusion can be made about the relationship between applicant social desirability scores and d-scores. Incumbent social desirability scores also have small to moderate correlations with d-scores, although all relationships are negative, indicating that higher incumbent social desirability scores are associated with lower d-scores.

Hypothesis 3 states that partialling social desirability from applicant scores will not result in a significant change in criterion-related validity. To test this hypothesis, performance was first regressed onto each test dimension to obtain an estimate of the validity of the test dimensions. Next, performance was regressed onto social desirability in the first step and onto the test dimension in the second step. Examination of Table 3 indicates that, consistent with the hypothesis, partialling social desirability scores from applicant scores did not result in significant changes in criterion-related validity. However, due to the very small sample size available for these analyses and the absence of applicant test score criterion-related validity, no firm conclusion can be made about the effect of partialling social desirability on criterion-related validity.

[Table 3: Performance Regressed Onto Applicant Test Dimensions and Impression Management. The table entries are not legible in the source scan.]

Construct Validity. Hypothesis 2a states that factor forms will not be significantly different across the two measurement contexts. Hypothesis 2b states that measurement errors of the selection test will be significantly different across the two measurement periods. The previously discussed finding of a lack of factorially pure test dimension scales prevents tests of these hypotheses with those scales. Thus these hypotheses were tested for the emotional stability and conscientiousness scales only. Both emotional stability and conscientiousness were distributed normally in both contexts (i.e., skewness and kurtosis statistics were less than .70). This indicates that it is unnecessary to use asymptotic distribution-free estimation procedures as suggested by previous researchers
6580. .28. 8:8 .Eouma .28. 85“. NS. 88.29 .28 8.. .88.:9 .28. 8.. 608.850 mo. 8. mm. I I I 3.. .3. For 8 .28. 8.3580. .28. 8.. .538 .28. 8x... .:2 1.21m (mmzm 3... «x4 .2 583800 .Eux ax .u ”.8585 .28”. b22550 3:20 29.52 “V 28.. 58 investigating the effect of faking on measurement invariance (e. g., Smith & Ellingson, 2002). Meredith’s (1993) suggested order of model building for examining the equivalence of two tests was used to examine differences in construct validity between the two measurement contexts. Multiple-groups confirmatory factor analyses (MCFA) with correlated errors were conducted in order to test Hypotheses 2a and 2b. Random parcels composed of two or three items each were used as indicators of the latent factors. Table 4 reports the results of the MCFA described below. Model 1 (M1) tests Hypothesis 2a by constraining only the factor pattern across contexts. This model yielded adequate fit of the model to the data as evidenced by fit indices within the intervals suggested by Hu and Bentler (1999). This indicates that the same item parcels are causing the latent factors across both measurement contexts. Model 2 (M2) imposed the additional constraint of equal factor loadings. This model also yielded adequate fit of the model to the data. This indicates that the item parcels are equivalent indicators of the latent factors across measurement contexts. Model 3 (M3) imposed the additional constraint of equal covariance. The adequate fit of this model to the data indicates additional support for the construct validity of the scales across measurement contexts. Model 4 (M4) imposed the constraint of equality of variances across measurement contexts. The significant chi-square difference test indicates that the variance of the latent factors differs across the two contexts. Examination of the variance of the latent factors in the unrestricted model reveals that the variance of the latent factors is greater in the incumbent (o2 = .237 and .169 for emotional stability and conscientiousness, 59 respectively) than in the applicant context (<52 = .182 and .122 for emotional stability and conscientiousness, respectively). This provides filrther evidence of a restriction of variance in the applicant as opposed to the incumbent context, although it should be noted that the fit indices for this model do indicate an adequate fit of the data to the model even with this additional constraint. Model 5 (MS) tests Hypothesis 2b by constraining the measurement errors of the indicators to be equal across contexts. The significant chi-square difference test and the deterioration of the fit indices indicates that measurement errors are not equivalent across the two contexts. Examination of the errors indicates that there is more measurement error associated with the incumbent scores. This is likely due to the greater score variance obtained in incumbent as opposed to the applicant context. Therefore, Hypothesis 2b is supported. Hypothesis 2c states that the average intercorrelations among the scales will be higher in the applicant than in the incmnbent context. Examination of emotional stability and conscientiousness reveals support for this hypothesis. The intercorrelation between these two constructs in the applicant context is .49, and the intercorrelation in the incumbent context is .20. This difference is significant (t (148) = 2.49, p < .05). Thus, Hypothesis 2c is supported. Criterion-related validity. 
Criterion-related validity. Hypothesis 4a states that significantly higher criterion-related validity will be observed for the scores obtained in the incumbent as opposed to the applicant context. Table 2 shows that the correlations between applicant test scores and performance ratings are all lower than the correlations between incumbent test scores and performance ratings. T-tests for dependent correlations were used to formally examine this hypothesis. The conscientiousness scale shows marginally significant differences between the contexts (t(146) = -1.613, p < .10). The difference between the criterion-related validities of emotional stability is not significant. Therefore, although the directions of the differences are consistent with expectations, Hypothesis 4a is not supported. Both the Productivity and Quality dimension and the overall test score exhibited significantly higher correlations in the incumbent as opposed to the applicant context (t(146) = -2.246 and -1.762, p < .05, respectively).

Hypothesis 4b states that the criterion-related validity of the upper portion of the applicant distribution will be lower than for the lower portion of the distribution. A series of hierarchical regressions, in which performance was regressed onto applicant test scores in the first step and squared applicant test scores in the second step, was used to test this hypothesis. Table 5 contains the results of the regression analyses.

[Table 5: Hierarchical Regressions of Performance on Applicant Test Scores and Squared Test Scores. The table entries are not legible in the source scan.]

None of the squared predictor terms added significantly to the prediction of performance ratings. Thus, Hypothesis 4b is not supported. However, the null results could alternatively be explained by the restricted range of applicant test scores available in the current study. For example, if a curvilinear relationship exists across the entire range of applicant test scores such that the relationship is linear at lower test scores (i.e., applicant test scores not meeting the minimal standards used for selection and not included in the current sample) and either curvilinear or flat at higher test scores (i.e., applicant test scores exceeding the minimal standards and included in the current study), then it is possible that a curvilinear relationship still exists even though it was not found with the current restricted data. Analysis of unrestricted applicant data from this organization supports the possibility of this alternative, as the standard deviations obtained for applicant test scores in the unrestricted sample are roughly twice those obtained in the current sample. This indicates that the data included in the current sample are substantially restricted.
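The two-step test summarized in Table 5 can be sketched as follows; `statsmodels` and the mean-centering of the predictor are illustrative assumptions rather than details reported in the thesis.

```python
import numpy as np
import statsmodels.api as sm

def quadratic_step(performance, applicant_scores):
    """Hierarchical regression for Hypothesis 4b: enter the applicant test
    score, then its square. A significant increment for the squared term
    would indicate curvilinearity, i.e., weaker validity at the top of the
    applicant distribution."""
    x = applicant_scores - applicant_scores.mean()   # centering is our addition
    step1 = sm.OLS(performance, sm.add_constant(x)).fit()
    step2 = sm.OLS(performance, sm.add_constant(np.column_stack([x, x ** 2]))).fit()
    delta_r2 = step2.rsquared - step1.rsquared       # increment from the squared term
    return step1.rsquared, delta_r2, step2.pvalues[-1]
```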
Selection decisions and rank-order. Hypothesis 5a states that the rank-ordering of test scores will be substantially different in the applicant as opposed to the incumbent context. In order to test this hypothesis, correlations between applicant and incumbent context test scores were computed for the entire distribution. Table 6 contains the results of these analyses for all test dimensions and the overall test scores. With the exception of the Reasoning and Problem Solving dimension, the rank-order correlations were much lower than would be expected when comparing the rank-order of test scores within individuals across time periods. That is, if the same test is given to the same people at two different times, the rank-order correlations between the two time periods would be expected to be quite high (e.g., .70 or greater), which was not found in the current study. Therefore, Hypothesis 5a is supported. However, test-retest unreliability of the test cannot be entirely ruled out as an alternative explanation. Recall that the Reasoning and Problem Solving dimension includes cognitive ability items answered in the applicant context and used to compute scores on this dimension in both contexts, so it is not surprising that this rank-order correlation is much higher than the others.

Examination of the rank-order correlations for the top 50 applicants, selected on the basis of overall applicant test scores and corrected for range restriction, suggests that there were approximately as many changes in rank-order at the top of the distribution as there were for the entire distribution. This is unexpected given previous research findings suggesting that individuals engaging in response distortion are more likely to appear at the top of the distribution (e.g., Christiansen & Haaland, 1998).

Table 6: Rank-order Correlations

                              Entire         Top 50        Top 50 Applicant
                              Distribution¹  Applicant     Scorers -
                                             Scorers       Corrected²
Emotional Stability           .53            .45           .65
Conscientiousness             .45            .35           .52
Adaptability                  .41            .01           .02
Confidence & Friendliness     .53            .43           .51
Productivity & Quality        .42            .32           .51
Reasoning & Problem Solving   .87            .77           .90
Ease of Supervision           .43            .28           .50
Overall Test Score            .46            .24           .56

NOTE: Correlations in bold are significant (p < .05). ¹n = 151. ²Values represent correlations corrected for range restriction on applicant test scores.

Hypothesis 5b states that many of the individuals hired on the basis of applicant test scores would not have been hired on the basis of incumbent context scores. Of the 151 individuals in the current sample, all of whom exceeded the minimally acceptable normative score standards on the test as applicants, only 97 exceeded these standards based on incumbent test scores. This indicates that over a third of the individuals in this sample would not have been selected into the organization on the basis of their incumbent test scores and the normative standards used in the applicant context, suggesting that these applicants successfully faked their applicant test scores. Therefore, Hypothesis 5b is supported. It is also possible that test unreliability may have contributed to the obtained results. However, as discussed in the exploratory analyses section below, the 54 applicants who did not meet minimal standards based on incumbent test scores had significantly greater d-scores than individuals meeting the minimal standards in both contexts.
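The corrected values in the third column of Table 6 adjust for the fact that only applicants above the cutoff appear in the sample. The thesis does not spell out its correction formula; one standard choice is Thorndike's Case II correction for direct range restriction, sketched below with illustrative numbers (the 2:1 ratio of unrestricted to restricted standard deviations echoes the unrestricted-applicant-data result reported above):

```python
from math import sqrt
import numpy as np
from scipy.stats import spearmanr

def thorndike_case2(r, sd_restricted, sd_unrestricted):
    """Correct a correlation for direct range restriction on one variable
    (Thorndike Case II); sd_unrestricted is the full applicant-pool SD."""
    u = sd_unrestricted / sd_restricted
    return (r * u) / sqrt(1 - r**2 + (r**2) * (u**2))

# Rank-order agreement between two administrations (illustrative data):
rng = np.random.default_rng(0)
applicant = rng.normal(size=50)
incumbent = 0.5 * applicant + rng.normal(scale=0.9, size=50)
rho, p = spearmanr(applicant, incumbent)

print(f"observed rho = {rho:.2f}, "
      f"corrected = {thorndike_case2(rho, 1.0, 2.0):.2f}")
```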
Antecedents. Hypothesis 6a states that perceptions of others' beliefs about the acceptability of faking on selection tests will be positively related to faking. The correlations between others' beliefs and d-scores contained in Table 2 show that perceptions of others' beliefs are significantly related only to Reasoning and Problem Solving d-scores. This correlation indicates that the more an individual perceives that significant others in his or her life believe it is acceptable to fake on selection tests, the less the individual actually faked his or her scores on the Reasoning and Problem Solving dimension. This relationship is the opposite of that hypothesized. Therefore, Hypothesis 6a is not supported.

Hypothesis 6b states that perceptions of others' behavior in applicant contexts will be related to faking, such that perceiving that other applicants fake on such tests will positively relate to the degree of faking engaged in by the individual. The correlations between others' behavior and d-scores contained in Table 2 show that perceptions of others' behavior are significantly positively related to d-scores for every test dimension except Confidence & Friendliness and Reasoning & Problem Solving. This indicates that the more strongly an individual perceives that other applicants fake on selection tests, the greater the extent of his or her faking in the applicant context. Therefore, Hypothesis 6b is supported.

Hypothesis 7 states that the relationship between ethics against lying and faking will be moderated by beliefs about faking, such that in the presence of a belief that faking is not the same as lying, the relationship between ethics and faking will be reduced. Moderated regression analyses, reported in Table 7, reveal no support for this hypothesis.

[Table 7: Moderated Regressions of D-scores onto Ethics Against Lying and Beliefs About Faking (values illegible in source)]

Furthermore, individually, ethics against lying and beliefs about faking are not consistently related to d-scores.

Hypothesis 8 states that self-efficacy for faking will be positively related to d-scores. The correlations between self-efficacy and d-scores (see Table 2) reveal no consistent relationships. Self-efficacy is negatively related to d-scores on the Reasoning and Problem Solving dimension, but this is in the opposite direction of that hypothesized. Therefore, Hypothesis 8 is not supported.

Hypothesis 9a states that individuals' knowledge of the constructs being assessed will be related to d-scores, and Hypothesis 9b states that self-efficacy will mediate these relationships. Table 2 shows that none of the three operationalizations of knowledge were consistently related to d-scores. Thus, Hypothesis 9a is not supported. The very low power, due to the small sample size (i.e., n = 45), may be partially responsible for the lack of significant relationships.
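To see how little power n = 45 affords, the power of a two-tailed test of a correlation can be approximated with Fisher's z transformation. The sketch below is my own illustration, not an analysis from the thesis; it shows that even a true correlation of .25 would be detected well under half the time at this sample size:

```python
from math import atanh, sqrt
from scipy.stats import norm

def correlation_power(r_true, n, alpha=0.05):
    """Approximate power of a two-tailed test of H0: rho = 0,
    using the Fisher z approximation."""
    nc = atanh(r_true) * sqrt(n - 3)   # noncentrality on the z scale
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.sf(z_crit - nc) + norm.cdf(-z_crit - nc)

print(f"power = {correlation_power(r_true=0.25, n=45):.2f}")  # about .38
```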
The lack of a significant relationship between self-efficacy and d-scores, as well as between the knowledge measures and d-scores, precludes the possibility of self-efficacy mediating the relationship between knowledge and faking. Thus, Hypothesis 9b is not supported.

Hypothesis 10a states that the relationship between subjective norms, a summative variable of perceptions of others' beliefs and behavior, and faking will be moderated by self-efficacy, such that in the presence of low faking-related self-efficacy the relationship between subjective norms and faking will be reduced. Moderated regression analyses, reported in Table 8, reveal only marginal support for this hypothesis for the emotional stability and conscientiousness dimensions. Thus, Hypothesis 10a is not supported.

[Table 8: Moderated Regressions of D-scores onto Subjective Norms and Self-efficacy (values illegible in source)]

[Table 9: Moderated Regressions of D-scores onto Ethical Beliefs, Beliefs About Faking, and Self-efficacy (values illegible in source)]

Hypothesis 10b states that the relationship between ethical beliefs, beliefs about faking, and faking will be moderated by self-efficacy, such that in the presence of low faking-related self-efficacy the relationship between ethical beliefs and faking will be reduced. Moderated regression analyses, reported in Table 9, reveal no support for this hypothesis.

Exploratory Analyses

Three sets of exploratory analyses were performed to gain a greater understanding of the patterns in the data. Specifically, split-group analyses, moderated regression analyses, and polynomial regression analyses were performed to further examine the relationship between faking and performance.
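The moderated regressions above and below follow the usual hierarchical pattern: enter the predictors, then their product, and test whether the interaction adds incremental R². A minimal sketch of that pattern (hypothetical variable names and data, not the thesis's actual analyses or software):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: subjective norms, self-efficacy, and d-scores.
rng = np.random.default_rng(1)
n = 150
norms = rng.normal(size=n)
efficacy = rng.normal(size=n)
d_scores = 0.2 * norms + 0.1 * norms * efficacy + rng.normal(size=n)

# Center predictors before forming the product term to reduce collinearity.
norms_c = norms - norms.mean()
efficacy_c = efficacy - efficacy.mean()

step1 = sm.OLS(d_scores, sm.add_constant(
    np.column_stack([norms_c, efficacy_c]))).fit()
step2 = sm.OLS(d_scores, sm.add_constant(
    np.column_stack([norms_c, efficacy_c, norms_c * efficacy_c]))).fit()

f_stat, p_value, df_diff = step2.compare_f_test(step1)  # test of ΔR²
print(f"ΔR² = {step2.rsquared - step1.rsquared:.3f}, "
      f"F = {f_stat:.2f}, p = {p_value:.3f}")
```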
Previous researchers have examined differences in criterion-related validity for different portions of the applicant distribution to assess the effect of faking on validity (e.g., Haaland & Christiansen, 1998; Mueller-Hanson et al., 2003). Table 10 presents the split-group correlations of applicant scores, incumbent scores, and d-scores with overall performance ratings, and also presents t-tests and effect size estimates for the differences between the means obtained by the two groups. The distribution was dichotomized on the basis of whether or not individuals' incumbent scores exceeded the minimally acceptable score standards established for the applicant setting. The group failing to meet the minimal score standards as incumbents had significantly lower applicant and incumbent mean scores, but significantly greater d-scores. This pattern suggests that these individuals engaged in greater amounts of faking than the individuals exceeding the minimal standards in both contexts.

Table 10: Split-group Correlations and Effect Sizes

                              Pass¹    Fail¹    Effect Size²   t³
Applicant Test Scores         n=95     n=54
Emotional Stability           .00      -.22     -.37           -2.19
Conscientiousness             .06      -.01     -.40           -2.35
Adaptability                  -.03     -.21     -.53           -3.10
Confidence & Friendliness     -.07     -.27     -.51           -3.02
Productivity & Quality        -.04     -.09     -.40           -2.38
Reasoning & Problem Solving   -.01     -.28     -.36           -2.12
Ease of Supervision           .03      -.20     -.35           -2.08
Overall Test Score            -.03     -.35     -.67           -3.95
Incumbent Test Scores
Emotional Stability           -.17     .14      -1.06          -6.25
Conscientiousness             .20      .12      -.94           -5.56
Adaptability                  -.21     .03      -1.31          -7.70
Confidence & Friendliness     -.08     -.17     -1.00          -5.91
Productivity & Quality        .09      .16      -1.29          -7.61
Reasoning & Problem Solving   .03      -.13     -.48           -2.85
Ease of Supervision           -.03     .14      -1.45          -8.56
Overall Test Score            -.06     .01      -2.07          -12.20
D-scores
Emotional Stability           .18      -.33     .77            4.52
Conscientiousness             -.11     -.15     .57            3.39
Adaptability                  .16      -.16     .88            5.16
Confidence & Friendliness     .02      .07      .79            4.68
Productivity & Quality        -.12     -.24     .96            5.68
Reasoning & Problem Solving   -.09     -.22     .28            1.95
Ease of Supervision           .07      -.27     .88            6.31
Overall Test Score            .03      -.24     1.12           7.95

NOTE: Values in bold are significant (p < .05). ¹"Pass" refers to those individuals meeting or exceeding the minimal applicant score standards based on incumbent scores, and "Fail" refers to those individuals failing to meet or exceed the minimal applicant score standards based on incumbent scores. ²Effect sizes were computed by subtracting the mean score for the incumbent setting from the mean score for the applicant setting and dividing by the pooled standard deviation; negative values indicate that higher scores were obtained for the group that met or exceeded the minimal applicant score standards based on incumbent scores. ³n = 149.

An examination of Table 10 reveals that the correlations between applicant test scores and performance were predominately negative for the group who failed to meet the minimal standards as incumbents, although not consistently significant due to the small sample size, while the correlations for the group exceeding the standards are essentially zero.
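The effect sizes in Table 10 are standardized mean differences: the difference between the two group means divided by the pooled standard deviation, as described in the table note. A minimal sketch of that computation (illustrative arrays, not the study's data):

```python
import numpy as np

def pooled_sd_effect_size(x, y):
    """Standardized mean difference (Cohen's d) using the pooled SD,
    matching the computation described in the note to Table 10."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                  (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

# Hypothetical d-scores for the "fail" (n=54) and "pass" (n=95) groups:
rng = np.random.default_rng(2)
fail_group = rng.normal(loc=0.9, scale=1.0, size=54)
pass_group = rng.normal(loc=0.1, scale=1.0, size=95)
print(f"d = {pooled_sd_effect_size(fail_group, pass_group):.2f}")
```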
A similar pattern emerges when examining the correlations between d-scores and performance for these two groups. Although not significant, the pattern suggests that for individuals who, on the basis of incumbent scores, would have failed the selection test, the extent of faking is negatively related to performance. Furthermore, this group of individuals had significantly greater d-scores than the group exceeding the minimal standards in both contexts. This provides further support for the notion that some of the people in the current sample were able to successfully fake their way to passing the minimal standards. However, it is important to note that individuals with a true high score on these test dimensions may not have been able to inflate their responses in the applicant setting because these individuals are already at, or very close to, the highest scores possible on the test. If this ceiling effect were not present, it is possible that similar results would have been observed for the "pass" group as were observed for the "fail" group.

Exploratory analyses were performed to investigate whether faking, operationalized as d-scores, moderates criterion-related validity. Hypotheses 4a and 4b examined related, but different, phenomena from the current analyses. Hypothesis 4a examined the differences in criterion-related validity for scores obtained in the two contexts. In contrast, the moderated regression analyses described here test whether the validity of applicant test scores changes over different levels of faking. Hypothesis 4b examined the differences in validity between the top portion of the distribution, where faked scores are most likely to reside, and the lower portion of the distribution, likely to contain fewer faked scores. The current moderated regression analyses are less influenced by the restricted range of applicant test scores that may have produced the null result for Hypothesis 4b.

Table 11 shows that the interaction of applicant scores and d-scores was significant for emotional stability, but not for conscientiousness. Figure 2 contains a graph of the significant interaction. For emotional stability, faking moderates the relationship between applicant scores and performance such that in the presence of a relatively high degree of response inflation, test scores relate negatively to performance. In the presence of relatively low amounts of faking, test scores relate positively to performance. A similar pattern was also observed for Ease of Supervision and overall test scores. It is important to note that the results of these analyses are purely exploratory and cannot be interpreted as strong support for the notion that faking moderates the criterion-related validity of applicant test scores. This is especially true when one considers the compounding of unreliability inherent in d-scores and the interaction term.

[Table 11: Moderated Regressions of Performance onto Applicant Test Scores and D-scores (values illegible in source)]

[Figure 2: Graph of the Interaction Effects of D-scores and Applicant Test Scores on Performance (figure not reproducible in source)]

As discussed previously, some researchers argue that polynomial regression using the d-score components is a more appropriate analytical technique than analyses involving d-scores (cf. Edwards, 2002). Thus, the interactive effects of applicant and incumbent test scores on performance ratings were examined using polynomial regression. As suggested by Edwards (2002), applicant and incumbent test scores were entered in the first step, squared applicant and incumbent scores were entered in the second step, and a multiplicative interaction term for applicant and incumbent scores was entered in the third step of a hierarchical regression predicting performance ratings.
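Edwards's (2002) recommendation replaces the d-score with its components: performance is regressed on applicant and incumbent scores, then their squares, then their product. A minimal sketch of those three hierarchical steps (hypothetical data, not the study's):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 150
app = rng.normal(size=n)                 # applicant-context test scores
inc = 0.5 * app + rng.normal(size=n)     # incumbent-context test scores
perf = 0.2 * inc - 0.1 * (app - inc) ** 2 + rng.normal(size=n)

def fit(*cols):
    return sm.OLS(perf, sm.add_constant(np.column_stack(cols))).fit()

step1 = fit(app, inc)                           # linear components
step2 = fit(app, inc, app**2, inc**2)           # add squared terms
step3 = fit(app, inc, app**2, inc**2, app*inc)  # add the product term

f_stat, p_value, _ = step3.compare_f_test(step2)
print(f"interaction step: F = {f_stat:.2f}, p = {p_value:.3f}")
```

Plotting the fitted surface over the applicant and incumbent score axes is what yields graphs like Figure 3 below.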
The results, reported in Table 12, show that the interaction of applicant and incumbent test scores was significant for emotional stability, Ease of Supervision, and overall test scores. Figure 3 shows the nature of this interaction for emotional stability. The graphs of the interactions for Ease of Supervision and overall test scores are very similar to the graph for emotional stability and thus are not reproduced here. Examination of Figure 3 suggests that regardless of the direction of the difference between applicant and incumbent scores, greater differences predict lower performance. This indicates that, for emotional stability, Ease of Supervision, and overall test scores, the absolute difference between applicant and incumbent scores relates negatively to performance ratings. Contrary to theory and expectations, this suggests that response inflation as well as response deflation influences the criterion-related validity of applicant test scores.

[Table 12: Polynomial Regressions of Performance onto Applicant and Incumbent Test Scores and Their Interaction (values illegible in source)]

[Figure 3: Graph of Polynomial Regression Results, plotting predicted performance over applicant and incumbent test scores (figure not reproducible in source)]

DISCUSSION

The results of this study indicate that individuals, on average, respond to personality-based self-report tests more desirably in applicant than in incumbent settings. Furthermore, the results suggest that different individuals may inflate their responses to different degrees in applicant settings and that this inflation may affect rank-ordering as well as selection decisions. The low-power confirmatory factor analyses suggest that applicant faking did not erode the construct validity of the personality-based measures used in the current study, as both factor patterns and loadings appear to be equal across applicant and incumbent contexts.
However, the significantly higher intercorrelation obtained in the applicant context for emotional stability and conscientiousness suggests the opposite conclusion with regard to construct validity. Some of the analyses, such as the d-score moderation analyses, suggest that the criterion-related validity of applicant test scores may be moderated by applicant faking. However, other analyses, such as the examination of differential validity for applicant and incumbent test scores and the analysis of a curvilinear relationship between applicant scores and performance, suggest that criterion-related validity is not attenuated or moderated by applicant faking. Finally, the degree of faking engaged in by an individual is positively related to the degree to which the individual endorses a perception that others engage in faking in applicant settings. Table 13 summarizes the hypotheses and whether each one was supported by the data. The remainder of this paper addresses five central questions in the faking literature and how the results of this study contribute to this knowledge base.

Table 13: Hypothesis Summary

Hypothesis 1: Scores on a social desirability scale will have a small to moderate correlation with difference scores. (Supported)

Hypothesis 2a: Factor forms will not be significantly different across the two measurement periods (i.e., applicant administration and research administration); that is, there will be configural invariance of the personality factors across both administration periods. (Supported)

Hypothesis 2b: Measurement errors of the personality test will be significantly different across the two measurement periods. (Supported)

Hypothesis 2c: Average intercorrelations among the scales will be higher when the inventory is administered for application purposes than when it is administered for research purposes. (Supported)

Hypothesis 3: Partialling social desirability scale scores from applicant personality scores will not result in a significant change in criterion-related validity. (Not supported)

Hypothesis 4a: Criterion-related validity will be significantly greater for personality scores obtained in a non-motivating context (i.e., for research purposes) than for scores obtained in a motivating context (i.e., for application purposes). (Not supported)

Hypothesis 4b: Criterion-related validity of the upper half of the distribution of applicant sample scores will be less than the criterion-related validity of the bottom half of the distribution. (Not supported)

Hypothesis 5a: The rank-ordering of people on the basis of responses obtained in a non-motivating context (i.e., for research purposes) will be substantially different from the rank-ordering of responses obtained in a motivating context (i.e., for application purposes). (Supported)

Hypothesis 5b: Some of the people hired on the basis of their responses in a motivating context (i.e., applicant setting) would not have been hired on the basis of their responses in a non-motivating context (i.e., research setting). (Supported)

Hypothesis 6a: Perceptions that others believe faking to be an acceptable practice will be related to faking. (Not supported)

Hypothesis 6b: Perceptions that others engage in faking will be related to faking. (Supported)

Hypothesis 7: The relationship between a self-reported ethic against lying and faking will be moderated by an individual's belief that faking on a selection test is the same as lying, such that in the presence of a belief that faking is not the same as lying the relationship between reporting an ethic against lying and faking will be reduced. (Not supported)
Hypothesis 8: Individuals with high self-efficacy for enhancing their responses to a selection test in a desirable way will be more likely to engage in faking. (Not supported)

Hypothesis 9a: Knowledge of the constructs being assessed will be related to faking. (Not supported)

Hypothesis 9b: The relationship between knowledge of the constructs being measured and faking will be partially mediated by self-efficacy. (Not supported)

Hypothesis 10a: The relationship between subjective norms and faking will be moderated by self-efficacy, such that in the presence of low faking-related self-efficacy the relationship between subjective norms and faking will be reduced. (Not supported)

Hypothesis 10b: The relationship between ethical beliefs and faking will be moderated by self-efficacy, such that in the presence of low faking-related self-efficacy the relationship between ethical beliefs and faking will be reduced. (Not supported)

Does Faking Affect Selection Decisions?

Past research suggests that individuals engage in response inflation by showing that applicants generally score higher than incumbents (e.g., Rosse et al., 1998), that applicants score higher than research participants (e.g., Birkeland et al., 2003), that individuals incentivized to perform well on personality tests obtain higher scores than when they are not incentivized (e.g., Mueller-Hanson et al., 2003a), and that, when instructed to do so, individuals can inflate their responses (e.g., McFarland & Ryan, 2000). All of these designs have unique limitations that open their results to criticism. For example, research showing that applicants tend to score higher than incumbents does not indicate that only some applicants are faking; it could instead indicate that all applicants are uniformly increasing their scores in response to the situation. If true, this would result in little or no change in rank-orders and, thus, no out-of-order decisions or erosion of criterion-related validity as a consequence of faking. However, the within-subjects design of the current study, as well as its use of actual applicants, confirms that many individuals do in fact inflate their responses in applicant settings and, furthermore, that individuals engage in different degrees of faking, resulting in changes in rank order and even changes in which individuals should be selected.

Echoing the concerns of Mueller-Hanson and her colleagues (2003a), these results suggest personality test scores should not be used in a top-down, or select-in, manner. However, the current results also cast some doubt on the use of these tests for select-out purposes. Specifically, the expected gains from implementing such a test may not be realized if a substantial number of people exceed the normative cutoff standards due to response inflation. The finding that many of the individuals in this study would not have been hired on the basis of their incumbent test scores suggests the possibility that individuals can fake their way past normative cutoff standards. However, as noted previously, the unreliability of the test may also have contributed to this finding. Future research could help to untangle the effects of unreliability vis-à-vis faking within the current sample by obtaining a test-retest reliability estimate that would allow for an assessment of how many people would likely not have passed on the second administration simply due to error.
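To make that idea concrete, one could simulate how often a classical true-score model with a given test-retest reliability flips people across a cutoff purely through measurement error. The sketch below is my own illustration with assumed values, not an analysis from the thesis:

```python
import numpy as np

def expected_flip_rate(reliability, cutoff_quantile, n=100_000, seed=4):
    """Fraction of examinees above a cutoff at time 1 who fall below it
    at time 2 when the two scores differ only by random error (classical
    true-score model with the given test-retest reliability)."""
    rng = np.random.default_rng(seed)
    true = rng.normal(size=n) * np.sqrt(reliability)
    err_sd = np.sqrt(1 - reliability)
    t1 = true + rng.normal(scale=err_sd, size=n)
    t2 = true + rng.normal(scale=err_sd, size=n)
    cutoff = np.quantile(t1, cutoff_quantile)
    passed_t1 = t1 >= cutoff
    return np.mean(t2[passed_t1] < cutoff)

# e.g., reliability .70 and a cutoff screening out the bottom 30 percent:
print(f"{expected_flip_rate(0.70, 0.30):.0%} fail on retest by chance alone")
```

Comparing such a chance-alone base rate to the 36% observed here would indicate how much of the pass-to-fail shift unreliability can plausibly explain.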
Such research could be performed with a convenience sample of students tested at two time periods. Future research with similar designs and samples using tests with high levels of test-retest reliability would also help to elucidate whether some applicants do in fact fake their way into being selected.

If applicant faking was largely responsible for the current finding that some people met the minimal test standards on the basis of applicant but not incumbent scores, then by not controlling faking on such tests, organizations may be unfairly rewarding those who are inclined to dishonestly raise their scores. This phenomenon takes on even more importance when organizations use these tests in a top-down selection manner. Perhaps one of the reasons individuals tend to view the use of personality and other non-verifiable self-report tests as less fair than more objective selection procedures, such as cognitive ability tests or interviews (Hausknecht, Day, & Thomas, 2004), is their awareness that people can effectively lie on such tests to increase their chances of selection. Supporting this notion is recent research showing that individuals with relatively lower fairness perceptions of such tests report engaging in greater levels of faking (McFarland, 2002). Future research using within-subjects designs and field samples should include measures of fairness perceptions to assess this possibility.

Are Social Desirability Scales a Valid Operationalization of Faking?

One of the key goals of the current study was to compare multiple operationalizations of faking in order to contribute to understanding whether social desirability scales are valid measures of faking in applicant contexts and could thus be used to identify faked applicant test scores. The relatively low intercorrelations between applicant social desirability scores and d-scores suggest the possibility that these two operationalizations of faking are unique constructs that do not share substantial variance. Partialling social desirability scores from applicant test scores did result in slight, but nonsignificant, increases in criterion-related validity for the Conscientiousness, Adaptability, and Reasoning and Problem Solving scales. Unfortunately, however, the very small sample size available for these analyses precludes drawing any firm conclusions from these results.

The prevalence with which social desirability scales are used in practice to identify potentially faked responses (Goffin & Christiansen, 2003) indicates a distinct need for researchers to determine whether these scales are actually measuring applicant faking. Research designs using actual applicants and a within-subjects design, such as that used by the current study, with larger sample sizes would be helpful in determining whether faking is adequately operationalized by these scales. However, it is important to note that recent simulation research by Schmitt and Oswald (in press) suggests that identifying fakers with social desirability scales and removing those identified from the selection process is likely to have very little positive impact on the mean performance of those hired.
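Partialling social desirability out of applicant scores amounts to correlating performance with the part of the test score that is independent of social desirability, that is, a semipartial correlation computed from residualized scores. A minimal sketch of that residualization (hypothetical data; the thesis does not describe its exact computation):

```python
import numpy as np

def residualize(y, x):
    """Residuals of y after removing its linear relationship with x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

rng = np.random.default_rng(5)
n = 45
social_des = rng.normal(size=n)                      # SD scale scores
applicant = 0.4 * social_des + rng.normal(size=n)    # applicant test scores
perf = 0.2 * applicant + rng.normal(size=n)          # performance ratings

raw_validity = np.corrcoef(applicant, perf)[0, 1]
partialled = np.corrcoef(residualize(applicant, social_des), perf)[0, 1]
print(f"raw r = {raw_validity:.2f}, partialled r = {partialled:.2f}")
```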
Does Faking Affect Construct Validity?

Consistent with the majority of prior research (e.g., Smith & Ellingson, 2002; Weekley et al., 2003), the current study found that applicant and non-applicant test responses result in similar factor forms but dissimilar measurement errors. The current study also found that factor loadings were similar across contexts, a minimum condition suggested to be necessary to conclude measurement invariance across groups (cf. Rock et al., 1978). The results of the current study are likely to be a more veridical reflection of reality than some of the prior studies for two main reasons. First, many previous studies assessed measurement invariance by comparing an applicant group to a separate research or student group, raising the possibility that the differences found between the two groups were a result of differences inherent in the particular samples and not of the differing contexts. The within-subjects design of the current study allows the inference that any differences found between the two measurement contexts must be due to the context and not to underlying differences between the samples. Second, the current study's use of a field sample allows assessment of measurement differences as a function of real-life applicant motivation instead of the artificial laboratory-induced motivation present in many previous studies. Thus, the results suggest that acceptable degrees of measurement invariance are present between applicant and incumbent settings despite significant score differences between the two contexts. Note that evidence was found that the interrelationships between constructs were greater in the applicant as opposed to incumbent settings, suggesting some erosion of discriminant validity, but this relationship did not appear to affect the factor structure of the constructs.

Despite the advantages of the current design over previous research designs examining construct validity, it is important to note that the test used in this study did not exhibit the expected factor structure in either context, and thus the results are based on a post hoc factor structure established with the incumbent test responses. Furthermore, the MCFA is generally a low-power test, which may have prevented the emergence of significant differences between the two contexts. Also, as noted previously, the results indicated that the discriminant validity of the test was somewhat compromised in the applicant context. Finally, the existence of individual differences in the degree of response inflation found between the two contexts suggests that there are differences in the constructs being measured between the two contexts, despite the lack of significant factor loading differences observed with the MCFA. Additionally, it is important to note that some researchers insist that similar error structures must also be observed to conclude factorial invariance (Meredith, 1993).

Given these limitations and conflicting results, it is necessary for future research to replicate these results before any application of the current results can be justified. For example, research using similar designs and tests with more robust and established factor structures would likely provide more generalizable evidence. Additional research comparing alternative measures of personality constructs assessed via self-report instruments would also further understanding of the effect of applicant faking on the construct validity of personality-based self-report tests.
Does Faking Affect Criterion-related Validity?

The current study was designed to address the effect of faking on criterion-related validity with a unique research design that has many advantages over previous investigations. However, despite prior evidence of the validity of the selection test used in this study, in the current sample the test demonstrated very low validity in both the applicant and the incumbent context. Therefore, the current study's assessment of the effect of faking on criterion-related validity is exploratory and should not be interpreted as proof that faking either does or does not impact the criterion-related validity of these types of tests.

The results of this study provide conflicting evidence of the effects of faking on criterion-related validity. The pattern of results for some analyses appears consistent with laboratory studies of faking, demonstrating that faking can impact criterion-related validity (e.g., Dunnette et al., 1962; Mueller-Hanson et al., 2003a). However, the pattern of results for other analyses suggests that faking does not affect criterion-related validity. Generally, the criterion-related validity of the incumbent scores was greater, or less negative, than that of the applicant scores. While these results did not reach traditional significance levels, the pattern suggests that faking may impact criterion-related validity. Additionally, examination of the performance correlations for the group of individuals whose incumbent scores did not exceed minimal selection standards reveals a similar but stronger pattern. Specifically, the applicant score correlations with performance for this group are all negative, while the incumbent score correlations are predominately positive. Despite this pattern, the majority of these analyses were not significant, and thus the results may be due to the generally low validity of the test, test unreliability, or range restriction.

Investigation of a curvilinear relationship between applicant test scores and performance revealed no such relationship, in contrast to the findings of previous researchers (e.g., Haaland & Christiansen, 1998). However, the restriction of applicant test score range inherent in the current sample may have masked such a relationship because the lower portion of the score distribution, expected to have a positive linear relationship with performance, is missing from the analyses. Future research should investigate this question by using a design similar to this study, but with an unrestricted sample. For example, a similarly designed study performed in the context of a predictive validation study, in which the selection test is not used for selection, would provide the necessary data to adequately investigate this question.

D-scores show a pattern of negative relationships with performance for the entire sample, although these relationships are largely non-significant. While the pattern suggests that the greater the amount of response inflation engaged in by an individual, the lower his or her performance ratings, the results were nevertheless nonsignificant and potentially due to test unreliability.
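The unreliability concern is acute for difference scores: even with reasonably reliable components, a d-score can be quite unreliable when the components are positively correlated. A standard classical test theory expression (not derived in the thesis; here ρ_XX' and ρ_YY' are the component reliabilities and ρ_XY their intercorrelation) for the reliability of the difference D = X - Y is:

$$\rho_{DD'} = \frac{\sigma_X^2 \rho_{XX'} + \sigma_Y^2 \rho_{YY'} - 2\sigma_X \sigma_Y \rho_{XY}}{\sigma_X^2 + \sigma_Y^2 - 2\sigma_X \sigma_Y \rho_{XY}}$$

For equally variable components with reliabilities of .80 that correlate .60, this gives (.80 + .80 - 1.20) / (2 - 1.20) = .50, illustrating how quickly the reliability of a difference score erodes relative to its components.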
One piece of evidence suggests the possibility that the results were not solely due to unreliability. If the negative relationships between d-scores and performance were due to unreliability, there would be no reason to expect a different pattern for individuals who did or did not meet minimal score standards on the basis of incumbent test scores. However, the data suggest that the relationship between difference scores and performance is stronger for individuals who did not exceed the minimal selection standards on the basis of their incumbent test scores. Recall that these individuals also had significantly greater levels of faking compared to those exceeding the minimal standards at both time periods. Therefore, focusing solely on traditional significance leads to the conclusion that d-scores are unrelated to faking, but focusing on the pattern of results suggests the opposite conclusion.

D-scores interacted with applicant test scores for the emotional stability and Ease of Supervision scales, as well as for overall test scores. Specifically, individuals engaging in greater levels of faking exhibited a negative relationship between applicant test scores and performance ratings, while individuals engaging in little or no faking exhibited a positive relationship. This suggests that faking may moderate the relationship between applicant test scores and performance. However, given the extremely low reliability of both applicant scores and d-scores, these results should be interpreted cautiously.

The polynomial regression analyses suggest that response inflation and response deflation equally negatively impact predicted performance. These results should also be interpreted with caution. First, they could be due to regression-to-the-mean effects, in which people with extremely high scores the first time received somewhat lower scores the second time due to error. Second, there is no theoretical justification for why response deflation in applicant contexts should relate negatively to performance. One could speculate that very unstable people may respond lower at one time than another and that this instability could manifest itself in behaviors on the job. However, such speculation based on the analyses of one sample commits the sin of "letting the empirical tail wag the theoretical dog" (Bedeian & Day, 1994). Other researchers also argue that theory should serve as a precondition for selecting the types of analyses used to test research hypotheses (cf. Schoorman, Bobko, & Rentsch, 1991), and Tisak and Smith (1994) specifically caution against accepting unconstrained polynomial regression models simply because they fit the data better in a particular sample when no theory exists to explain the results. Thus, before accepting these results as an accurate reflection of reality, additional theoretical work and replication are necessary.

Thus, while the results are conflicting, the pattern of results suggests the possibility that the criterion-related validity of personality-based selection tests may be affected by applicant faking. However, future research is necessary to examine whether the pattern of results found in the current study is replicable and generalizable. Future research should specifically seek to examine these relationships in the context of within-subjects designs using field samples of actual applicants, similar to the current study. However, future research should use different personality-based tests and should attempt to utilize samples that are not restricted due to selection on the basis of the tests being examined. There is also the possibility, as suggested by recent simulation research (Schmitt & Oswald, in press), that the validity of personality-based selection tests is so small that faking can have only a marginal impact upon it. In some ways, the results of the current study are more supportive of this notion than of the notion that faking does or does not impact criterion-related validity.
This suggests that research efforts aimed at increasing the validity of these types of tests may yield a larger payoff in terms of predicting job performance than research that continues to investigate faking.

Do Individuals' Perceptions & Beliefs Relate to Faking?

The current study investigated the influence of perceptions that others think it is acceptable to engage in faking, perceptions that others actually fake, ethics against lying, beliefs that faking is the same as lying, self-efficacy, and knowledge of the measured constructs. Of these, only perceptions that others actually engage in faking related to the extent to which an individual engaged in faking.

The finding of a relationship between an individual's perceptions that others engage in faking and d-scores is in need of replication, but is nevertheless interesting because it represents another possible route through which individuals may be persuaded not to fake in applicant contexts. For example, in addition to traditional warnings that focus on injunctive norms (i.e., what is approved or disapproved), test administrators could make appeals focusing on descriptive norms (i.e., what is commonly done) by stating that most people do not engage in faking on such tests. Prior research has demonstrated that both types of normative appeals contribute to intentions and behavior (e.g., Cialdini et al., in press). Future research would be useful in determining whether the addition of descriptive norms to the injunctive warnings against faking commonly provided to applicants would further reduce applicant faking.

There are a number of potential reasons why the other antecedent measures did not relate to faking. For example, as described previously, d-scores tend to have low reliability, and did in the current study, which may have attenuated the correlations between the antecedent measures and d-scores. Additionally, while the antecedent measures used in the current analysis included some items from prior research with established construct validity, some of the items were developed specifically for this study and thus may not adequately tap the intended constructs. It is also possible that the current sample, in addition to being restricted in terms of applicant test scores, is restricted in terms of the antecedent measures as well, which could explain why some of the antecedents consistently related to d-scores in the expected direction but did not reach statistical significance (e.g., self-efficacy). A useful follow-up study would be to administer these antecedent measures to an additional sample to determine whether the range of responses obtained in the current study is in fact restricted. Another possibility is that participants were engaging in socially desirable responding on the antecedent measures due to the presence of the experimenter and the sensitive nature of some of the questions. Utilization of a randomized response technique would help to control for this possibility in future research (Fox & Tracy, 1986).
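The randomized response technique cited there lets respondents answer sensitive questions with plausible deniability while still permitting population-level estimates. A minimal sketch of Warner's (1965) original design, which Fox and Tracy discuss (illustrative parameters, not a procedure from the thesis):

```python
import numpy as np

def warner_estimate(yes_rate, p, n):
    """Estimate prevalence pi of a sensitive attribute under Warner's
    randomized response design: each respondent answers the sensitive
    statement with probability p and its negation with probability 1 - p.
    P(yes) = p*pi + (1 - p)*(1 - pi), so pi = (yes_rate - (1 - p)) / (2p - 1).
    """
    pi_hat = (yes_rate - (1 - p)) / (2 * p - 1)
    var = yes_rate * (1 - yes_rate) / (n * (2 * p - 1) ** 2)
    return pi_hat, np.sqrt(var)

# Hypothetical survey: 62% "yes" responses with a 70/30 randomizer, n = 300.
pi_hat, se = warner_estimate(yes_rate=0.62, p=0.70, n=300)
print(f"estimated prevalence = {pi_hat:.2f} (SE = {se:.2f})")
```

Because no individual answer reveals which statement was answered, respondents have less reason to distort, at the cost of the larger standard errors visible in the variance term.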
In relation to the knowledge of constructs measure, it is difficult to say whether "correctly" choosing the construct represented by an item is a sufficient measure of individuals' knowledge of the constructs, given that the factor analyses failed to support the a priori dimensionality of the test. Additionally, the sample size was quite small for the knowledge measures, and thus the associated statistical tests were woefully underpowered. Future research should investigate the usefulness of similar knowledge measures using a test with better-established construct validity and a larger sample size.

Despite the lack of findings for many of the individual beliefs and perceptions included in this study, it is important to continue research to uncover what leads individuals to engage in faking. This is particularly important in light of recent simulation research showing that variability in applicant faking had a larger effect on validity coefficients than either the average magnitude of faking or the total proportion of faked scores in a sample (Komar, Theakston, Brown, & Robie, 2005). Thus, future research should continue to investigate the perceptions and beliefs examined here, as well as other individual differences that may influence faking. Verbal protocol analysis of individuals responding to personality-based tests under both motivating and non-motivating conditions may be particularly helpful in elucidating the process through which individuals determine how to respond to such questions in different contexts. Investigation of person-situation interactions would also further understanding of what conditions, both individual and contextual, motivate individuals to engage in faking and whether certain situations are more likely than others to motivate individuals with given beliefs or characteristics.

Limitations

Several study limitations exist. First, the sample was severely restricted, as only hired individuals were able to participate. Future research could avoid this limitation by examining faking in the context of a true predictive validation study where selection is not based on test scores. The restriction of range is likely the reason that the applicant test scores exhibited non-significant, and sometimes negative, validities with performance ratings for the entire distribution. However, the finding of non-significant predictive validity for the applicant test also represents a limitation of the current study in its own right. It should be noted that the validity obtained for this test in the initial concurrent validation study was substantial, with average correlations between test scores and performance ratings of .28.

It may also be argued that the use of d-scores in this study represents another limitation. However, as noted previously, d-scores have distinct advantages over other operationalizations of faking (i.e., they represent a direct measure of the amount of inflation occurring as a function of the setting), are conceptually meaningful in this context, and are not inherently flawed (Rogosa et al., 1982).

Conclusion

The results of this study contribute to the literature on faking in a number of ways. First, the large effect sizes found between the applicant and incumbent administrations of the test indicate that applicants do engage in response inflation. Second, the finding that 36% of the current sample did not exceed the normatively established cut scores on the basis of their incumbent test responses corroborates previous findings obtained in laboratory settings or using social desirability scores to identify faked responses, although the unreliability of the test used in the current study indicates that replication with a more reliable test is desirable. This study also presents some evidence that faking may impact criterion-related validity.
Finally, this study provides evidence that an individual's perceptions of others' behavior in applicant settings relate to individual response inflation. However, the current study failed to find relationships between the other antecedents measured and faking behavior. Future research with similar designs that addresses the limitations of the current study will be useful in determining whether the current pattern of results is replicable and generalizable.

APPENDICES

APPENDIX A

[Entry-Level Selection Research Study Performance Appraisal Form: a supervisor rating form asking the rater to evaluate the named employee on each performance dimension of the test (including Ease of Supervision, Reasoning and Problem Solving, and Overall Performance) using a five-point rating scale; the body of the form is illegible in the source.]

APPENDIX B

INSTRUCTIONS: Please show how strongly you agree or disagree with each of the following statements by putting a circle around the appropriate number to the right of each item. You may choose not to answer any question. PLEASE NOTE: "Answering in a desirable manner" includes slightly exaggerating your responses to make yourself look good, responding to a question in a desirable way even if you do not think that it is completely true, and completely making up answers without thinking about whether the answers are true or not. "Selection test" refers to any test that is similar to the one you took as an applicant.

Each item was rated on a five-point scale (1 = Strongly Disagree to 5 = Strongly Agree).
Subjective Norms: Others' Beliefs
1. If I answered in a desirable manner on a selection test, most of the people who are important to me would disapprove.*
2. No one who is important to me thinks it is OK to answer in a desirable manner on a selection test.*
3. Most people who are important to me will look down on me if I answer in a desirable manner on a selection test.*
4. My parents would approve of me answering in a desirable manner on a selection test.
5. My friends and family would disapprove of me enhancing my responses on a selection test in order to make a good impression.*

Subjective Norms: Others' Behavior
6. Other people probably answer in a desirable manner on selection tests in order to get a better score.
7. Most applicants would not hesitate to answer in a desirable manner on a selection test.
8. In most hiring situations, applicants do NOT distort or enhance their responses to selection tests.*
9. Everyone changes their answers on selection tests to appear more desirable.
10. Most applicants exaggerate their answers to selection tests in order to make a good impression.

Self-efficacy
11. I am confident that I could receive a higher score on selection tests by exaggerating my responses.
12. I'm confident I could figure out how to get a higher score on selection tests.
13. It would be easy for me to increase my score on a selection test by answering in a desirable way.
14. If I want to, I could increase my score on a selection test.
15. I could make myself look better on a selection test by responding dishonestly.
16. I could respond in a very desirable way to selection tests, if I chose to.

Ethic Against Lying
17. People should never lie.
18. I never tell lies to other people.
19. I think it is sometimes necessary and ethically acceptable to lie to other people.*
20. The best way to handle people is to tell them what they want to hear even if it is not true.*
21. Honesty is the best policy in all cases.
22. There is no excuse for lying to someone else.
23. All in all, it is better to be honest than to be dishonest.

Beliefs About Faking
24. I believe that answering in a desirable manner on a selection test is not the same as lying.
25. Lying is different than exaggerating responses on a selection test.
26. Making yourself look good on a selection test by answering in a desirable manner is different than lying.
27. Exaggerating responses on a selection test in order to make a good impression is the same as lying.*

* denotes reverse-keyed items.
NOTE: All scale labels will be omitted in the participant version of these measures.
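Scale scores from instruments like this are typically formed by reverse-keying the asterisked items on the 1-5 scale (6 minus the response) and averaging within each scale. A minimal sketch of that scoring step (my own illustration; the thesis does not describe its scoring procedure):

```python
import numpy as np

def score_scale(responses, reverse_keyed, scale_max=5):
    """Average 1..scale_max Likert responses after reverse-keying.

    responses: dict mapping item number -> response (1-5)
    reverse_keyed: set of item numbers marked with * in the appendix
    """
    keyed = [(scale_max + 1 - r) if item in reverse_keyed else r
             for item, r in responses.items()]
    return np.mean(keyed)

# Ethic Against Lying scale (items 17-23; items 19 and 20 reverse-keyed):
responses = {17: 4, 18: 3, 19: 2, 20: 1, 21: 5, 22: 4, 23: 5}
print(score_scale(responses, reverse_keyed={19, 20}))
```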
APPENDIX C

Knowledge of Constructs

INSTRUCTIONS: Each question below presents one of the items that you responded to in the previous phase of the experiment. For each item: A) fill in the circle for the category that best describes the item; B) fill in the circle that indicates how you think the organization would like you to respond to the item; C) fill in the circle that indicates how confident or sure you are that you know how the organization would like you to respond to this item.

Each of the fifteen items below was followed by the same three response sets:

A) Which category best describes this item? (1) How adaptable a person is to different situations; (2) How confident and friendly a person is; (3) How productive and focused on quality a person is; (4) How a person reacts to being supervised; (5) How good a person is at problem solving; (6) None of the above; (7) I don't know.

B) How would the organization like you to answer this item? (1) Strongly Disagree; (2) Disagree; (3) Neither; (4) Agree; (5) Strongly Agree.

C) How sure are you that you know how the organization would like you to answer this item? (1) Very sure; (2) Somewhat sure; (3) Neither sure nor unsure; (4) Somewhat unsure; (5) Very unsure.

The items were:
1. "I take time out for others."
2. "I usually arrive early for appointments."
3. "Sometimes I enjoy breaking the rules."
4. "I get annoyed when people change things that work perfectly well."
6. "I do not have a good imagination."
7. "I like working on several things at a time."
8. "Frequent interruptions and changes in priority bother me."
9. "I am always prepared."
10. "I tend to resist when people tell me what to do."
11. "I have little to say."
12. "I avoid reading difficult material."
13. "A lot of supervisors just enjoy controlling people."
14. "I typically am not very interested in joining group activities."
15. "I tend to ignore my duties."
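The scoring of this measure is defined in the thesis's Method section rather than reproduced in this appendix. As a hypothetical sketch only, the responses could be tabulated as below: part A answers checked against a construct answer key, and part C answers rescored so that higher values mean greater confidence. The answer key values, field names, and the decision to average are assumptions for illustration, not the thesis's actual scoring rule.

```python
# A hypothetical tabulation of the Knowledge of Constructs responses.
# CONSTRUCT_KEY is a placeholder answer key (item -> category 1-7); the
# thesis defines the real key and scoring in its Method section.

CONSTRUCT_KEY = {1: 2, 2: 3, 3: 4}  # placeholder values only

def knowledge_scores(part_a, part_c):
    """part_a: item -> chosen category (1-7);
    part_c: item -> sureness (1 = very sure ... 5 = very unsure)."""
    items = sorted(CONSTRUCT_KEY)
    # Share of items whose chosen category matches the (placeholder) key.
    accuracy = sum(part_a[i] == CONSTRUCT_KEY[i] for i in items) / len(items)
    # Rescore part C so that higher values mean greater confidence.
    confidence = sum(6 - part_c[i] for i in items) / len(items)
    return {"category_accuracy": accuracy, "mean_confidence": confidence}
```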
APPENDIX D

Unlikely Virtues

1. I always admit it when I make a mistake.
2. I never give up hope.
3. I know that anyone who tries can get a job.
4. I always know why I do things.
5. I never give up.
6. I know immediately what to do.
7. I believe there is never an excuse for lying.
8. I always know what I am doing.
9. I am always ready to start afresh.
10. I have never engaged in gossip.
11. I will do anything for others.
12. I am always prepared.
13. I don't always practice what I preach.*
14. I have some bad habits.*
15. I have sometimes had to tell a lie.*
16. I am not always honest with myself.*
17. I am not always what I appear to be.*

* denotes reverse-keyed items

APPENDIX E

Informed Consent

Project Title: Selection Test Responding
Investigators' Names: Anthony Boyce and Dr. Ann Marie Ryan

Description and Explanation of Procedures: This study is investigating how people respond to pre-employment tests like the one you took before being hired. You will be asked to respond to a test similar to the one you took as an applicant to [organization name]. You will also be asked to answer some additional survey questions.

Benefits: This study will last about one and a half hours (1.5 hours). You will receive your normal hourly wage for participation.

Risks and Discomforts: None

This study is investigating how people respond to pre-employment tests like the one you took before being hired. Thank you for participating in this study!

Participation is completely voluntary, and all of your answers will be completely CONFIDENTIAL. No one at [organization name] will ever see or have access to your individual responses to any of the questions in this study. Your privacy will be protected to the maximum extent allowable by law. All of the study materials will be taken off-site by Anthony Boyce at the end of each day so that no one at [organization name] will have access to them.

Participation is voluntary; you may choose not to participate at all, you may refuse to participate in certain procedures or answer certain questions, or you may discontinue your participation at any time without penalty or loss of benefits.

If you have any questions about this study, please contact Anthony Boyce (Michigan State University, Baker Hall Room 20, East Lansing, MI 48824; boyceant@msu.edu). If you have questions or concerns regarding your rights as a study participant, or if you are dissatisfied with your treatment during this study, you may contact, anonymously if you wish, Peter Vasilenko, Ph.D., Chair of the Michigan State University Committee on Research Involving Human Subjects (UCRIHS) by phone: (517) 355-2180, fax: (517) 432-4503, email: ucrihs@msu.edu, or regular mail: 202 Olds Hall, East Lansing, MI 48824.

A copy of this consent form will be available for you to take home. Please remember that your answers during this study are completely confidential and absolutely no one at [organization name] will know or see your individual responses. Only the investigators will have access to your individual responses.

Your signature below indicates your voluntary agreement to participate in this study.

First and Last Name (please print)        Signature        Today's Date

APPENDIX F

Protocol

Thank you for agreeing to participate in this study. My name is Anthony Boyce and I'm a student at Michigan State University. The research project that you are going to participate in today is part of the requirements for me to graduate and has no connection with [organization name] other than that they have allowed me to ask you to participate.
This study is investigating how people respond to pre-employment tests like the one you took before being hired. You will be asked to respond to a test similar to the one you took as an applicant to [organization name]. You will also be asked to answer some additional survey questions. Note that you are not in any way obligated to participate in this study. If you would prefer not to participate, you are welcome to leave now or at any time throughout the study.

[Begin handing out the first consent form]

The consent form I am handing out states that you are agreeing to participate in this study voluntarily. Additionally, it states that all your answers today will be completely confidential. No one at [organization name] will ever have access to your responses in today's study. In order to ensure this, I will take all of your surveys with me when I leave at the end of the day. The consent form also informs you that the study takes approximately one and a half hours and that you will be paid your normal hourly wage for participation today. If you voluntarily agree to participate in this study today, please print your name, sign the form, and write in today's date in the appropriate boxes. Are there any questions?

[Wait until everyone has finished reading and signing the forms and collect them]

[Begin handing out the first questionnaire]

I am now handing out a questionnaire that is similar to the one that you filled out as an applicant to [organization name]. In order to examine how people respond to these types of questionnaires, I need to be able to link your responses to this survey to the survey you will take in the next phase of the study, so please write your last and first name in the box provided. Please answer all of the questions as honestly and accurately as you can. Remember, your responses to this survey are completely confidential. No one but me will ever see or know your individual answers, and your answers will never be communicated to anyone at [organization name] for any reason. Your answers will be used for research purposes only and will have no effect on you or your employment at [organization name] in any way. Therefore, please answer the following questions as honestly as possible. When you are finished, please sit quietly until everyone is finished.

[When everyone is finished, collect the tests.]

[Begin handing out the second questionnaire]

I am handing out the next survey now. I need to be able to connect your responses to the last questionnaire to your responses on this one, so please put your first and last name in the box provided. Again, please answer all of the questions as honestly and accurately as you can. Remember, your responses to this survey are completely confidential. No one but me will ever see or know your individual answers, and your answers will never be communicated to anyone at [organization name] for any reason. Your answers will be used for research purposes only and will have no effect on you or your employment at [organization name] in any way. Therefore, please answer the following questions as honestly as possible.

In the questions on this survey, the term selection test refers to any questionnaire like the one you just took that is given to people when they apply for a job.
The phrase responding in a desirable manner refers to slightly exaggerating your responses to a selection test in order to make yourself look good, responding to a question in a desirable way even if you do not think that it is completely true, or outright lying in your answers. You may begin. When you are finished, please sit quietly until everyone is finished.

[When everyone is finished, collect the surveys]

[Begin handing out the second consent form]

We have one more thing to do before we're done today. I am currently handing out a consent form similar to the one that you filled out at the beginning of the session. In addition to investigating how people respond to selection tests, I am also investigating how people's responses to these tests change over time and how these changes relate to job performance. In order to investigate these issues, it is necessary for me to have access to your applicant test responses as well as to your job performance data. Once this information is linked to the surveys you took today, all names will be removed and any identifying information will be destroyed. If you agree to allow me to collect this information, please print your name, sign the form, and write in today's date. Also, please remember that no one at [organization name] will ever have access to any of the individual information that you provided today.

When I am done running these sessions, I will make sure that everyone who has participated is given a debriefing form that describes the purpose of the study in greater detail. Please do not tell other employees about the questionnaires that you filled out today or anything else about this study. It is very important that when people participate they all have exactly the same information at the same steps in the study, so again, please do not tell other employees about the details of this study. Please return the consent forms to me when you have completed filling them out. Thank you for participating in this study.

APPENDIX G

Informed Consent 2

Project Title: Selection Test Responding
Investigators' Names: Anthony Boyce and Dr. Ann Marie Ryan

In addition to understanding how people respond to selection tests, this study is also investigating how these responses change over time and how these changes relate to job performance. In order to investigate these issues, it is necessary for the investigators to have access to your applicant test responses as well as to your job performance data. I am asking for your authorization to release your applicant test scores and performance ratings to me for purposes of this investigation only. You may decline this request to have your scores released.

The data released will be treated as confidential, and no one but the investigators will have access to this information. Once this information is linked to the surveys you took today, names will be removed and any identifying information destroyed. No one at [organization name] will have access to the individual information you provided today. Your privacy will be protected to the maximum extent allowable by law.

If you have any questions about this study, please contact Anthony Boyce (Michigan State University, Baker Hall Room 20, East Lansing, MI 48824; boyceant@msu.edu).
If you have questions or concerns regarding your rights as a study participant, or if you are dissatisfied with your treatment during this study, you may contact, anonymously if you wish, Peter Vasilenko, Ph.D., Chair of the Michigan State University Committee on Research Involving Human Subjects (UCRIHS) by phone: (517) 355-2180, fax: (517) 432-4503, email: ucrihs@msu.edu, or regular mail: 202 Olds Hall, East Lansing, MI 48824.

A copy of this consent form will be available for you to take home.

Your signature below indicates your voluntary agreement to allow the investigators to obtain both your applicant test responses and job performance ratings from your supervisor.

First and Last Name (please print)        Signature        Today's Date

APPENDIX H

Debriefing Form

Selection tests like the one you took as an applicant are good at predicting job performance. Research has shown that people with certain characteristics are better at some types of jobs than others. By using tests like these, companies are attempting to make sure that the people they hire will like the job they are hired to do and will be good at it.

However, sometimes people respond differently when taking this type of test as an applicant than they do when they take this type of test for other purposes (for example, for research purposes), and some people respond differently in the same situation at different times (for example, six months later). When people respond differently in different situations or at different times, it is more difficult for these types of tests to accurately predict job performance. This study is attempting to figure out why some people respond differently in different situations and at different times. Additionally, this study is attempting to figure out whether these changes in answers relate to job performance. By addressing these issues, the study is attempting to improve the quality of these tests so that they will be more beneficial to both workers and companies.

If you would like to read more about how these types of questionnaires relate to job performance, you can go to the library for the articles listed below. If you have any questions regarding this study, please contact Anthony Boyce by email: boyceant@msu.edu or by mail: Michigan State University, Baker Hall Room 20, East Lansing, MI 48824.

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44(1), 1-26.

Organ, D. W., & Ryan, K. (1995). A meta-analytic review of attitudinal and dispositional predictors of organizational citizenship behavior. Personnel Psychology, 48(4), 775-802.

APPENDIX I

Informed Consent

Project Title: Selection Test Responding
Investigators' Names: Anthony Boyce and Dr. Ann Marie Ryan

Description and Explanation of Procedures: This study is investigating how people respond to pre-employment tests like the one administered to applicants at [organization name]. You will be asked to provide performance appraisal ratings for a number of employees you supervise.

Benefits: The amount of time this task will require depends on how many participating employees you supervise, but it should take no longer than 30 minutes.

Risks and Discomforts: None

This study is investigating how people respond to pre-employment tests like the one administered to applicants at [organization name] and how responses change over time. Thank you for participating in this study!
Participation is completely voluntary, and all of your performance ratings will be completely CONFIDENTIAL. No one at [organization name] will ever see or have access to the ratings you provide. Your privacy will be protected to the maximum extent allowable by law. All of the rating forms will be taken off-site by Anthony Boyce at the end of each day so that no one at [organization name] will have access to them.

Participation is voluntary; you may choose not to participate at all, you may refuse to participate in certain procedures or answer certain questions, or you may discontinue your participation at any time without penalty or loss of benefits.

If you have any questions about this study, please contact Anthony Boyce (Michigan State University, Baker Hall Room 20, East Lansing, MI 48824; boyceant@msu.edu). If you have questions or concerns regarding your rights as a study participant, or if you are dissatisfied with your treatment during this study, you may contact, anonymously if you wish, Peter Vasilenko, Ph.D., Chair of the Michigan State University Committee on Research Involving Human Subjects (UCRIHS) by phone: (517) 355-2180, fax: (517) 432-4503, email: ucrihs@msu.edu, or regular mail: 202 Olds Hall, East Lansing, MI 48824.

A copy of this consent form will be available for you to take home. Please remember that the performance ratings you provide during this study are completely confidential and absolutely no one at [organization name] will know or see the performance ratings you provide. Only the investigators will have access to the performance ratings you provide.

Your signature below indicates your voluntary agreement to participate in this study.

First and Last Name (please print)        Signature        Today's Date

REFERENCES

Ajzen, I. (1985). From intentions to actions: A theory of planned behavior. In J. Kuhl & J. Beckman (Eds.), Action control: From cognition to behavior. Heidelberg: Springer.

Ajzen, I., & Fishbein, M. (1980). Understanding attitudes and predicting social behavior. Englewood Cliffs, NJ: Prentice-Hall.

Ajzen, I., & Madden, T. J. (1986). Prediction of goal-directed behavior: Attitudes, intentions, and perceived behavioral control. Journal of Experimental Social Psychology, 22(5), 453-474.

Alliger, G. M., Lilienfeld, S. O., & Mitchell, K. E. (1996). The susceptibility of overt and covert integrity tests to coaching and faking. Psychological Science, 7, 32-39.

Arthur, W., Jr., Woehr, D. J., & Graziano, W. G. (2001). Personality testing in employment settings: Problems and issues in the application of typical selection practices. Personnel Review, 30(6), 657-676.

Bandura, A. (1997). Self-efficacy: The exercise of control. New York, NY: W. H. Freeman/Times Books/Henry Holt & Co.

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44(1), 1-26.

Barrick, M. R., & Mount, M. K. (1993). Autonomy as a moderator of the relationships between the Big Five personality dimensions and job performance. Journal of Applied Psychology, 78(1), 111-118.

Barrick, M. R., & Mount, M. K. (1996). Effects of impression management and self-deception on the predictive validity of personality constructs. Journal of Applied Psychology, 81(3), 261-272.

Barrick, M. R., Mount, M. K., & Judge, T. A. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next? International Journal of Selection & Assessment, 9(1-2), 9-30.

Baron, R. M., & Kenny, D. A. (1986).
The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality & Social Psychology, 51(6), 1173-1182.

Beck, L., & Ajzen, I. (1991). Predicting dishonest actions using the theory of planned behavior. Journal of Research in Personality, 25(3), 285-301.

Bedeian, A. G., & Day, D. V. (1994). Difference scores: Rationale, formulation, and interpretation. Journal of Management, 20, 695-698.

Birkeland, S., Manson, T., Kisamore, J., Brannick, M., & Liu, Y. (2003). A meta-analysis of the difference between job applicants and non-applicants on personality measures. Paper presented at the 18th Annual Conference of the Society for Industrial and Organizational Psychologists, Orlando, FL.

Bobko, P., Roth, P. L., & Potosky, D. (1999). Derivation and implications of a meta-analytic matrix incorporating cognitive ability, alternative predictors, and job performance. Personnel Psychology, 52(3), 561-589.

Christiansen, N. D., Goffin, R. D., Johnston, N. G., & Rothstein, M. G. (1994). Correcting the 16PF for faking: Effects on criterion-related validity and individual hiring decisions. Personnel Psychology, 47, 847-860.

Christie, R., & Geis, F. L. (1970). Studies in Machiavellianism. New York: Academic Press.

Cialdini, R. B. (in press). Managing social norms for persuasive impact. Social Influence.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New Jersey: Lawrence Erlbaum.

Collins, J. M., & Gleaves, D. H. (1998). Race, job applicants, and the Five-Factor Model of Personality: Implications for Black psychology, industrial/organizational psychology, and the Five-Factor Theory. Journal of Applied Psychology, 83(4), 531-544.

Costa, P. T., & McCrae, R. R. (1985). The NEO Personality Inventory manual. Odessa, FL: Psychological Assessment Resources.

Crowne, D. P., & Marlowe, D. (1960). A new scale of social desirability independent of psychopathology. Journal of Consulting Psychology, 24, 349-354.

Donovan, J. J., Dwight, S. A., & Hurtz, G. M. (2003). An assessment of the prevalence, severity, and verifiability of entry-level applicant faking using the randomized response technique. Human Performance, 16(1), 81-106.

Douglas, E. F., McDaniel, M. A., & Snell, A. F. (1996). The validity of non-cognitive measures decays when applicants fake. Proceedings of the Academy of Management (pp. 127-131). Cincinnati, OH.

Drasgow, F., & Kang, T. (1984). Statistical power of differential validity and differential prediction analyses for detecting measurement nonequivalence. Journal of Applied Psychology, 69(3), 498-508.

Dunnette, M. D., McCartney, J., Carlson, H. C., & Kirchner, W. K. (1962). A study of faking behavior on a forced-choice self-description checklist. Personnel Psychology, 15(2), 13-24.

Dwight, S. A., & Donovan, J. J. (2003). Do warnings not to fake reduce faking? Human Performance, 16(1), 1-23.

Edwards, J. R. (2002). Alternatives to difference scores: Polynomial regression analysis and response surface methodology. In F. Drasgow & N. Schmitt (Eds.), Measuring and Analyzing Behavior in Organizations: Advances in Measurement and Data Analysis. San Francisco: Jossey-Bass.

Ellingson, J. E., Sackett, P. R., & Hough, L. M. (1999). Social desirability corrections in personality measurement: Issues of applicant comparison and construct validity. Journal of Applied Psychology, 84(2), 155-166.

Ellingson, J. E., Smith, D. B., & Sackett, P. R. (2001).
Investigating the influence of social desirability on personality factor structure. Journal of Applied Psychology, 86(1), 122-133.

Etzioni, A. (1988). The moral dimension: Toward a new economics. New York, NY: Free Press.

Ford, J. K., MacCallum, R. C., & Tait, M. (1986). The application of exploratory factor analysis in applied psychology: A critical review and analysis. Personnel Psychology, 39(2), 291-314.

Fox, J. A., & Tracy, P. E. (1986). Randomized response: A method for sensitive surveys. Beverly Hills, CA: Sage.

Frei, R. L., Snell, A. F., McDaniel, M. A., & Griffith, R. L. (1998). Using a within-subjects design to identify the differences between social desirability and ability to fake. Paper presented at the 13th Annual Conference of the Society of Industrial and Organizational Psychologists, Dallas, TX.

Goffin, R. D., & Christiansen, N. D. (2003). Correcting personality tests for faking: A review of popular personality tests and an initial survey of researchers. International Journal of Selection & Assessment, 11, 340-344.

Gordon, R. A. (1996). Impact of ingratiation on judgments and evaluations: A meta-analytic investigation. Journal of Personality & Social Psychology, 71(1), 54-70.

Graham, M. A., Monday, J., O'Brien, K., & Steffen, S. (1994). Cheating at small colleges: An examination of student and faculty attitudes and behaviors. Journal of College Student Development, 35(4), 255-260.

Haaland, D., & Christiansen, N. D. (1998). Departures from linearity in the relationship between applicant personality test scores and performance as evidence of response distortion. Paper presented at the 22nd Annual International Personnel Management Association Assessment Council Conference, Chicago, IL.

Harrison, D. A. (1995). Volunteer motivation and attendance decisions: Competitive theory testing in multiple samples from a homeless shelter. Journal of Applied Psychology, 80(3), 371-385.

Hausknecht, J. P., Day, D. V., & Thomas, S. C. (2004). Applicant reactions to selection procedures: An updated model and meta-analysis. Personnel Psychology, 57, 639-683.

Hogan, R., & Hogan, J. (1992). Hogan Personality Inventory manual. Tulsa: Hogan Assessment Systems.

Hogan, R., & Nicholson, R. A. (1988). The meaning of personality test scores. American Psychologist, 43(8), 621-626.

Hogan, R. T. (1991). Personality and personality measurement. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 2, pp. 873-919). Palo Alto, CA: Consulting Psychologists Press.

Hough, L. M. (1998). Effects of intentional distortion in personality measurement and evaluation of suggested palliatives. Human Performance, 11(2), 209-244.

Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., et al. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75(5), 581-595.

Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1-55.

Hurd, J. M., Barrett, G. V., Miguel, R. F., Tan, J. A., & Lueke, S. B. (2001). When do response distortion scales reflect faking? A meta-analysis. Paper presented at the 16th Annual Conference of the Society for Industrial and Organizational Psychologists, San Diego, CA.

International Personality Item Pool. (2001).
A Scientific Collaboratory for the Development of Advanced Measures of Personality Traits and Other Individual Differences (http://ipip.ori.org/). Internet Web site.

Kanji, G. K. (1993). 100 statistical tests. London: SAGE Publications.

Kashy, D. A., & DePaulo, B. M. (1996). Who lies? Journal of Personality & Social Psychology, 70(5), 1037-1051.

Komar, S., Theakston, J., Brown, D. J., & Robie, C. (2005). Faking and the validity of personality: A Monte Carlo investigation. Manuscript submitted for publication.

Kroger, R. O., & Turnbull, W. (1975). Invalidity of validity scales: The case of the MMPI. Journal of Consulting & Clinical Psychology, 43(1), 48-55.

Leary, M. R., & Kowalski, R. M. (1990). Impression management: A literature review and two-component model. Psychological Bulletin, 107(1), 34-47.

Levin, R. A., & Zickar, M. J. (2002). Investigating self-presentation, lies, and bullshit: Understanding faking and its effects on selection decisions using theory, field research, and simulation. In J. M. Brett & F. Drasgow (Eds.), The Psychology of Work: Theoretically Based Empirical Research (pp. 253-276). Mahwah, NJ: Lawrence Erlbaum Associates.

Lueke, S. B., Snell, A. F., Illingworth, A. J., & Paidas, S. M. (2001). An empirical test of an interactional model of faking. Paper presented at the 16th Annual Conference of the Society for Industrial and Organizational Psychologists, San Diego, CA.

McFarland, L. A. (2000). Toward an integrated model of applicant faking. Unpublished dissertation, Michigan State University, East Lansing.

McFarland, L. A. (2002). Consequences of warning against faking on a personality test. Paper presented at the 17th Annual Conference of the Society for Industrial and Organizational Psychologists, Toronto, ON.

McFarland, L. A., & Ryan, A. M. (2000). Variance in faking across noncognitive measures. Journal of Applied Psychology, 85(5), 812-821.

Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525-543.

Mueller-Hanson, R., Heggestad, E. D., & Thornton, G. C., III. (2003a). Faking and selection: Considering the use of personality from select-in and select-out perspectives. Journal of Applied Psychology, 88(2), 348-355.

Mueller-Hanson, R. A., Heggestad, E. D., & Thornton, G. C. (2003b). Individual differences in impression management: An exploration of the psychological processes underlying faking. Paper presented at the 18th Annual Conference of the Society for Industrial and Organizational Psychologists, Orlando, FL.

Ones, D. S., & Viswesvaran, C. (1998). The effects of social desirability and faking on personality and integrity assessment for personnel selection. Human Performance, 11(2-3), 245-269.

Ones, D. S., Viswesvaran, C., & Reiss, A. D. (1996). Role of social desirability in personality testing for personnel selection: The red herring. Journal of Applied Psychology, 81(6), 660-679.

Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test validities: Findings and implications for personnel selection and theories of job performance. Journal of Applied Psychology, 78(4), 679-703.

Organ, D. W., & Ryan, K. (1995). A meta-analytic review of attitudinal and dispositional predictors of organizational citizenship behavior. Personnel Psychology, 48(4), 775-802.

Pandey, J., & Rastogi, R. (1979). Machiavellianism and ingratiation. Journal of Social Psychology, 108(2), 221-225.

Paulhus, D. L. (1984).
Two-component models of socially desirable responding. Journal of Personality & Social Psychology, 46(3), 598-609.

Paulhus, D. L. (1986). Self-deception and impression management in test responses. In A. Angleitner & J. S. Wiggins (Eds.), Personality assessment via questionnaire (pp. 142-165). New York: Springer.

Reynolds, D. H., Sinar, E. F., & Haaland, D. E. (2003). Non-cognitive testing in practice: Effects of preparation on score characteristics and subgroup differences. Paper presented at the 18th Annual Conference of the Society of Industrial and Organizational Psychologists, Orlando, FL.

Robie, C., Zickar, M. J., & Schmit, M. J. (2001). Measurement equivalence between applicant and incumbent groups: An IRT analysis of personality scales. Human Performance, 14(2), 187-207.

Rock, D. A., Werts, C. E., & Flaugher, R. L. (1978). The use of analysis of covariance structures for comparing the psychometric properties of multiple variables across populations. Multivariate Behavioral Research, 13(4), 403-418.

Rogosa, D., Brandt, D., & Zimowski, M. (1982). A growth curve approach to the measurement of change. Psychological Bulletin, 92, 726-748.

Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. A. (1998). The impact of response distortion on preemployment personality testing and hiring decisions. Journal of Applied Psychology, 83(4), 634-644.

Schifter, D. E., & Ajzen, I. (1985). Intention, perceived control, and weight loss: An application of the theory of planned behavior. Journal of Personality & Social Psychology, 49(3), 843-851.

Schlenker, B. R. (1980). Impression management. Monterey, CA: Brooks/Cole.

Schmit, M. J., & Ryan, A. M. (1993). The Big Five in personnel selection: Factor structure in applicant and nonapplicant populations. Journal of Applied Psychology, 78(6), 966-974.

Schmitt, N., Clause, C. S., & Pulakos, E. D. (1996). Subgroup differences associated with different measures of some common job-relevant constructs. In C. L. Cooper & I. T. Robertson (Eds.), International Review of Industrial and Organizational Psychology (pp. 115-140). New York: Wiley.

Schmitt, N., & Oswald, F. L. (in press). The impact of corrections for faking on the validity of noncognitive measures in selection settings. Journal of Applied Psychology.

Schoorman, F. D., Bobko, P., & Rentsch, J. (1991). The role of theory in testing hypothesized interactions: An example from the research on escalation of commitment. Journal of Applied Social Psychology, 21, 1338-1355.

Smith, D. B., & Ellingson, J. E. (2002). Substance versus style: A new look at social desirability in motivating contexts. Journal of Applied Psychology, 87(2), 211-219.

Smith, D. B., Hanges, P. J., & Dickson, M. W. (2001). Personnel selection and the five-factor model: Reexamining the effects of applicant's frame of reference. Journal of Applied Psychology, 86(2), 304-315.

Snell, A. F., Sydell, E. J., & Lueke, S. B. (1999). Towards a theory of applicant faking: Integrating studies of deception. Human Resource Management Review, 9(2), 219-242.

Stajkovic, A. D., & Luthans, F. (1998). Self-efficacy and work-related performance: A meta-analysis. Psychological Bulletin, 124(2), 240-261.

Stark, S., Chernyshenko, O. S., Chan, K.-Y., Lee, W. C., & Drasgow, F. (2001). Effects of the testing situation on item responding: Cause for concern. Journal of Applied Psychology, 86(5), 943-953.

Tellegen, A. (in press). MPQ (Multidimensional Personality Questionnaire): Manual for administration, scoring, and interpretation.
Minneapolis: University of Minnesota Press.

Tisak, J., & Smith, C. S. (1994). Defending and extending difference score methods. Journal of Management, 20(3), 675-682.

Topping, G. D., & O'Gorman, J. G. (1997). Effects of faking set on validity of the NEO-FFI. Personality & Individual Differences, 23(1), 117-124.

Triandis, H. C. (1977). Interpersonal behavior. Monterey, CA: Brooks/Cole.

Viswesvaran, C., & Ones, D. S. (1999). Meta-analyses of fakability estimates: Implications for personality measurement. Educational & Psychological Measurement, 59(2), 197-210.

Viswesvaran, C., Ones, D. S., & Hough, L. M. (2001). Do impression management scales in personality inventories predict managerial job performance ratings? International Journal of Selection & Assessment, 9(4), 277-289.

Wayne, S. J., & Liden, R. C. (1995). Effects of impression management on performance ratings: A longitudinal study. Academy of Management Journal, 38(1), 232-260.

Weekley, J. A., Ployhart, R. E., & Harold, C. (2003). Personality and situational judgment tests across applicant and incumbent settings: An examination of validity, measurement, and subgroup differences. Paper presented at the 18th Annual Conference of the Society for Industrial and Organizational Psychologists, Orlando, FL.

Weichmann, D. (2000). Applicant reactions to novel selection tools. Unpublished thesis, Michigan State University, East Lansing.

Weiner, J. A., & Gibson, W. M. (2000). Practical effects of faking on job applicant attitude test scores. Paper presented at the 15th Annual Conference of the Society of Industrial and Organizational Psychologists, New Orleans, LA.

Whyte, W. H. (1957). The organization man. Garden City, NY: Doubleday.

Zerbe, W. J., & Paulhus, D. L. (1987). Socially desirable responding in organizational behavior: A reconception. Academy of Management Review, 12(2), 250-264.

Zickar, M. J., & Drasgow, F. (1996). Detecting faking on a personality instrument using appropriateness measurement. Applied Psychological Measurement, 20(1), 71-87.