A QUANTITATIVE REVIEW OF PREDICTORS OF JOB TASK AND CITIZENSHIP PERFORMANCE

By

Brian Hahn Kim

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF ARTS

Department of Industrial/Organizational Psychology

2004

ABSTRACT

A QUANTITATIVE REVIEW OF PREDICTORS OF JOB TASK AND CITIZENSHIP PERFORMANCE

By

Brian Hahn Kim

A large body of research on job performance has examined citizenship performance behaviors in contrast with job task behaviors. However, findings in the literature have not always provided consistent and overwhelming support for contextual, or citizenship, performance theories, particularly with regard to the hypothesized determinants of citizenship performance. Meta-analytic results of this study partially support stipulations of Motowidlo, Borman, and Schmit's (1997) revised theory of contextual/citizenship job performance. Personality dimensions tended to predict citizenship behaviors better than task behaviors. However, cognitive ability remained the single best construct-level predictor across the performance dimensions, in a variety of settings. Biodata also predicted both task and citizenship performance very well. The implications of using such a two-dimension framework are discussed.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
    Job Performance
        Task Performance
        Citizenship / Contextual Performance
            Going beyond formal job tasks
            Jobs confounding citizenship and task performance
            Challenging behaviors
            Recipients of citizenship: people, tasks, or organizations
            Relative importance of task and citizenship
            Integrated frameworks of citizenship performance
        Empirical Evidence for the Task-Citizenship Distinction
    Job Performance Antecedents
        Cognitive Ability
        Personality
        Structured Interviews
        Biodata
    Summary of Research Hypotheses
    Conclusion
METHOD
    Literature Search
        Criteria for Study Inclusion
    Data Coding Procedure
        Interrater Agreement
    Meta-analytic Procedure
        Corrections for Artifactual Variance
            Measurement unreliability
            Range variation
    Multivariate Meta-analysis Procedure
    Database Description
    Outlier Analyses
RESULTS
    Overview of Meta-analytic Results
    Hypothesis 1: Moderation by Job Type
    Hypothesis 2: Moderation by Citizenship Dimension
    Hypotheses 3 and 4: Differential Prediction Patterns for Performance Dimensions
    Hypotheses 5 and 6: Moderation by Job Complexity
    Hypotheses 7 through 10: Specific Predictor-Criterion Relationships
    Hypothesis 12: Prediction of Biodata Linked to Constructs
    Supplemental Analyses
DISCUSSION
    Review of Research Goals
    Summary of Findings
    Future Directions
    Limitations
    Overall Conclusion
APPENDIX A: Pilot Coding Sheet
APPENDIX B: Coding Sheet
APPENDIX C: Code Book
APPENDIX D: Interrater Coding Agreement Results
APPENDIX E: SAS / IML Program for Multivariate Computations
APPENDIX F: Full List of Studies in Database
APPENDIX G: Scree Plots for Outlier Analyses
APPENDIX H: Job Complexity Codes
APPENDIX I: Biodata Studies For Which Raters Assigned Construct Codes
APPENDIX J: Results for Restricted Samples
REFERENCES

LIST OF TABLES

Table 1. Study Variable Labels and Definitions
Table 2. Reliability Information For Scales
Table 3. Database Descriptives
Table 4. Descriptive Statistics of Reliability Estimates
Table 5. Meta-analytic Correlation Matrix for Job Performance and Performance Predictors
Table 6. Meta-analytic Results For Pairs of Study Variables
Table 7. Estimated Population Correlation Matrix Based on Multivariate Meta-analysis
Table 8. Tests of the Moderating Effect of Job Type
Table 9. Percentage of Correlations From Each Personality Dimension
Table 10. Simple Comparisons of Job Dedication and Interpersonal Facilitation
Table 11. Tests of the Moderating Effect of Job Complexity
Table 12. Pairwise Meta-analytic Estimates for the Multivariate Sample
Table 13. Summary of Conclusions for Hypotheses

LIST OF FIGURES

Figure 1. Model of Hypothesized Relationships Between Predictor Variables and Job Performance Dimensions
Figure 2. Model of Hypothesized Relationships Between Personality Dimensions and Citizenship Dimensions
Figure 3. Multilevel Structure of Meta-analytic Database
Figure 4. Path Diagram with Multivariate Estimates of Study Variables

INTRODUCTION

After nearly a century of concerted research on the topic, I/O psychologists continue to disagree on what exactly constitutes job performance (Campbell, 1990; Coleman & Borman, 2000; Rotundo & Sackett, 2002; Van Dyne, Cummings, & Parks, 1995) despite the notion that "individual performance on a 'task,' virtually any task that the culture views as having value, is one of the most important dependent variables in psychology, basic or applied" (Campbell, McCloy, Oppler, & Sager, 1993, p. 35). At the most general level, job performance is the set of behaviors executed by an employee in the context of work that contribute to the overall effectiveness of an organization. Understanding which employees perform this set of behaviors well and how they do so is a central imperative for many areas of industrial-organizational research, most notably in areas of personnel selection and job training.

Despite a general recognition that people can and do perform many different actions at work, distinct aspects of job performance have been largely ignored in research (Austin & Villanova, 1992). Job performance is most often measured globally with a supervisor or peer rating (Bernardin & Beatty, 1984; Cascio, 1995; Cleveland, Murphy, & Williams, 1989; Scullen, Mount, & Goff, 2000). The use of broad performance measures is practical and efficient for making simple evaluations based on the rank order of individuals, as in selection. Furthermore, composite scores of performance based on more specific measures tend to distribute people according to a compensatory model of performance in which few workers excel on every dimension. Despite these advantages, however, broad measurements of overall performance result in a loss of information about specific causal relationships and typically leave a considerable portion of variance to be explained in the criterion (e.g., Schmidt, 2002; Schmidt et al., 1985; Viswesvaran & Ones, 2002; Schmidt & Hunter, 1998). Measures of overall performance can also create conceptual ambiguity when based on lower-order constructs that are quite different and caused by different factors.
That is, the antecedents (e.g., cognitive ability) having the greatest influence on a performance composite may differ from those having the greatest influence on a specific aspect of performance. There remain some large gaps in the research literature about the nature of specific relationships between various predictors and the possible facets of job performance (Hough & Oswald, 2000). This problem has long been evident in "the criterion problem" that results from reliance on a single, criterion-deficient outcome (Austin & Villanova, 1992; Nathan & Alexander, 1988). To the extent that different aspects of job performance are uniquely influenced by various determinants, theory requires critical examinations of the full range of performance antecedents and how they cause each performance behavior or process.

Fortunately, a few researchers have already attempted to lay the conceptual foundation needed to define and study job performance at a more detailed level. Around the late 1980s and early 1990s, a number of job performance taxonomies were put forth by various parties (e.g., Brief & Motowidlo, 1986; Campbell, 1990; McCloy, Campbell, & Cudeck, 1994; Organ, 1988; Schmidt & Hunter, 1992). Campbell et al. (1993; Campbell, Gasser, & Oswald, 1996) developed a broad taxonomy of performance behaviors that was intended to account for all jobs in the Dictionary of Occupational Titles. The taxonomy consists of eight components including aspects of task proficiency, demonstrating effort, maintaining discipline, facilitating peer performance, supervising, and managing. The Campbell model can be applied to many situations and has undoubtedly sparked interesting research questions that would be ignored by focusing on overall performance.

Although the general growth of large taxonomic systems like that proposed by Campbell and colleagues has improved our ability to create theories about work processes, a fair amount of attention has been focused on a particular set of behaviors that were not traditionally thought of as job performance. Although not always well defined, the general class of behaviors related to creating and maintaining a positive work environment, for the purpose of enhancing an individual's or group's capability of producing organizational output, seems important for making workers efficient and satisfied and for achieving organizational success. Such behaviors have, in fact, been included in a number of job performance models (e.g., Borman & Motowidlo, 1993; Campbell et al., 1993; Organ, 1988; Van Dyne et al., 1995) and typically are distinguished from other behaviors related more directly to the production of organizational output. While the 1990s saw a number of labels for these behaviors gain popularity, such as contextual, organizational citizenship, prosocial, personal initiative, and extra-role performance, a recent confluence of research has begun to support the various investigations of environment-supporting, or contextual, behaviors as one broad dimension of job performance that is distinguishable from task-related behaviors performed explicitly to deliver organizational output. Considering a general distinction between "task" and "contextual" behaviors allows for a more focused study of performance at a level of detail just one step removed from the use of overall performance measures, and is believed to improve on past performance models by increasing parsimony and generalization across jobs.
Borman and Motowidlo (1997; Coleman & Borman, 2000) offered their initial (1993) model of contextual performance as an overarching framework that subsumes many facets of performance generally related to enhancing the work environment, or context. Campbell and colleagues (1996) found their model to be compatible with the Borman and Motowidlo (1993) framework by directly linking a portion of their dimensions to contextual performance. Others (e.g., Organ, 1997; Podsakoff, MacKenzie, Paine, & Bachrach, 2000; Rotundo & Sackett, 2002) have linked the concept of contextual performance to a set of almost parallel concepts (Organ, 1988) broadly termed "organizational citizenship behaviors."

This study reviews the conceptual framework of contextual, or (as it is now commonly labeled) citizenship, performance as posited by previous theorists. As more than a decade has passed since the introduction of the task-citizenship performance taxonomy, it is appropriate to evaluate how this perspective has influenced research and whether the distinction has elucidated our understanding of performance behaviors. This review also helps to identify conceptual ambiguities and potential levers for closing the theoretical gaps with research. Based on the literature review, a series of hypotheses concerning the differential patterns of relationships between the task and citizenship performance dimensions and various performance predictors are then tested in a quantitative review of relevant published research. This study extends findings from previous work by providing meta-analytic estimates of cumulated data, by examining commonly used, practical measures of performance predictors as well as measures of theoretical constructs, and by simultaneously considering the relationships of various predictors of task and citizenship performance across a wide range of jobs.

Job Performance

Despite its obvious importance, research on the concept of performance had been largely absent from the literature before the late 1980s (Campbell et al., 1996; Ford, Kraiger, & Schechtman, 1986; Motowidlo, Borman, & Schmit, 1997). Initially, performance was assumed to be a general, uniform construct and a sufficient outcome against which other phenomena could be validated. However, the general definition that has become prevalent today considers performance to be those behaviors, under the control of the individual, that contribute to the goals of an organization employing the individual (Campbell et al., 1996; Murphy & Cleveland, 1995). Though not necessarily restricted to observable behaviors, performance is made up of actions, not intentions or consequences of those actions, as stated in the definition (Campbell et al., 1993; Murphy, 1989). Furthermore, performance is not equivalent to effectiveness. Job performance is simply action in the context of work; it can be executed well or poorly. Effectiveness and other outcomes have a valence attached to them, a valence beyond the individual's control.

Beyond the definition, it is generally agreed that job performance consists of too many distinct behaviors to be considered a single theoretical construct. The idea that everything a person does at work (that contributes to organizational effectiveness) is the same thing, job performance, with the same antecedents and consequences is hardly useful. An analogous, equally impotent perspective would be to label most human behaviors as simply life performance.
Therefore, there has always been a need to differentiate performance behaviors based on their relationships to other constructs in a nomological network, thereby improving the theoretical meaning and practical usefulness of job performance concepts.

Task Performance

Most would agree that any definition of job performance should at least include those tasks that provide essential functions for transforming an organization's raw input to output (Borman & Motowidlo, 1993; Campbell et al., 1993; Rotundo & Sackett, 2002), without which the organization would not survive. The foundation of modern I/O psychology was, in fact, spurred on by task-based work such as Taylor's initial studies of scientific management (Taylor, 1911, 1912; Locke, 1982). With a lack of theory about human resource potential, a readily available (and disposable) workforce, and burgeoning interest in assembly line-type work, there was little point in studying non-task behaviors during that era. Starting with those early studies, a plethora of research has supported the validity and use of formal job tasks in understanding the nature of work (Austin & Villanova, 1992).

Borman and Motowidlo (1993) conceptualized "task performance" as the activities that execute or indirectly service core technical functions that transform environmental resources into organizational products. Task behaviors cover a wide range of behaviors performed by workers at all levels and can range from assembling car parts to taking customer service calls to planning inventory shipments. Other theories of job performance also seem to endorse this basic premise (Rotundo & Sackett, 2002). A major focus on task measures as performance criteria makes intuitive sense: they represent the "core functions." For instance, the National Research Council's Committee on the Performance of Military Personnel views work samples as the only true measure of performance because they demonstrate actual behaviors required on the job (see Campbell et al., 1996).

However, the centrality of task performance in research becomes questionable when considering the many complex processes that occur in actual job settings. Early researchers recognized the importance of "other" individual characteristics that were important for accomplishing job tasks, including one's effort level, sense of loyalty, and willingness to be helpful and cooperate (Barnard, 1938; Katz, 1964). Jobs today typically consist of a number of varied tasks that are strung together and performed over time, often in different or changing environments (Cascio, 1995). Organizations are also increasingly using team-based structures where interdependent goals can only be achieved through multiple people working in concert with each other (Kozlowski, Gully, Nason, & Smith, 1999; Lawler, Mohrman, & Ledford, 1995). To successfully adapt and coordinate their tasks with others, workers may find it necessary to focus on non-task behaviors such as generally getting along with coworkers or deferring personal responsibilities to back up or "cover for" someone who is unable to perform at a particular time (Blickensderfer, Cannon-Bowers, & Salas, 1997; Dickinson & McIntyre, 1997). It is also likely that non-task behaviors become increasingly beneficial after one becomes familiar with core tasks.
As job tasks become easier and require less attention through practice and/or automatization, workers can focus on other aspects of work such as taking the initiative to perform additional work, finding ways to improve core functions, or helping coworkers with issues unrelated to the job. These non-task behaviors, in turn, may increase organizational effectiveness. Even the famed Hawthorne project produced results that are in line with this notion: worker productivity increased not only because workers were given more attention but because they were able to create a supportive social environment. Task performance may also play a smaller role in upper-level jobs where managers are concerned with the general state of the organization and production, organizational politics, managing others who perform core functions, defending the organization, and more, rather than performing core tasks themselves. Thus, task performance is very important but does not appear to capture the entire domain of behaviors that lead to organizational effectiveness.

Citizenship / Contextual Performance

Organizational research has increasingly focused on behaviors that improve the general social and psychological context in which job tasks are performed. These contextual, or citizenship, behaviors are believed (e.g., Borman & Motowidlo, 1993; Campbell et al., 1993) to be important dimensions of a worker's overall contribution to an organization and are believed to have a set of determinants that is unique from that for task performance. Unfortunately, there was initially very little consensus about which specific activities comprise this alternative dimension of job performance, and many ambiguities still remain. Nonetheless, researchers have suggested that behaviors generally supporting the work environment, or context, are important and worth studying. Among the specific behaviors investigated in recent years are following rules, volunteering to do extra or unrelated work, showing extra effort and perseverance, and defending organizational objectives from external criticism. Almost as many performance concepts as there are behaviors have been offered by researchers to classify types of contextual activities, including but not limited to contextual performance, citizenship performance, organizational citizenship behavior (OCB), prosocial behavior, personal initiative, loyalty, interpersonal facilitation, and whistle-blowing. Some efforts to integrate concepts and theories have been made, but the model of task and contextual performance by Borman and Motowidlo (1993) is perhaps the broadest and most flexible, enough so to serve as an overarching framework for the various types of performance listed above (see Borman & Motowidlo, 1993, 1997 for explanations of the various concepts; Coleman & Borman, 2000), subsuming or being synonymous with many terms.

Contextual performance (later renamed citizenship) was originally defined as the activities that support main task functions by shaping the organizational, social, and psychological context in which they are carried out (Borman & Motowidlo, 1993). Restated, contextual performance is the set of activities that are under the individual's discretion and contribute to organizational effectiveness but are not task performance. These behaviors differ from task performance in four basic ways (Borman & Motowidlo, 1993). First, they do not support the technical core itself as much as its environment.
Consequently, being proficient is less important than demonstrating initiative beyond a base level of requirements or expectations. Second, they are common to all jobs, unlike core tasks that vary by job and organizational goal. Third, their variance is largely determined by volition and predisposition rather than by the knowledge, skills, and abilities (KSAs) leading to proficiency. Fourth, they are "not likely" to be required or explicitly rewarded by a role, though sometimes formally recognized in certain jobs. In addition, this facet of performance appears to be more affective (Hattrup, O'Connell, & Wingate, 1998; Motowidlo & Van Scotter, 1994; Penner, Midili, & Kegelmeyer, 1997) or attitudinal in nature (Organ, 1997; Penner et al., 1997) than task performance.

Borman and Motowidlo (1997) revised their contextual performance conception by tying it to existing frameworks of OCB, soldier effectiveness, sportsmanship, whistle-blowing, courtesy, civic virtue, and employee reliability to develop a five-category taxonomy of behaviors: persisting with effort and enthusiasm, volunteering beyond one's job tasks, helping and sportsmanship, following rules and civic virtue, and endorsing organizational objectives. They admit that their distinction between contextual and task performance remains blurred but held to the following three assertions: 1) important task activities differ by job while important contextual behaviors generalize across jobs (e.g., being amicable is helpful for salespeople and machinists alike), 2) tasks are "more likely" to be role prescribed, and 3) individual differences in task performance are determined more by cognitive ability while differences in contextual performance are determined more by personality. Motowidlo and others (1997) later specified a more detailed model in which the links of personality and cognitive ability with task and contextual performance were mediated by relevant skills, habits, and knowledge, again reiterating their third assertion.

While the use of the Borman and Motowidlo model as an overarching framework seems plausible and useful, a number of conceptual debates over the nature of different aspects of citizenship performance must be addressed before this middle-range concept can be applied to integrate theory and test expectations against real-world phenomena. Four debates of particular importance have generated much discussion among researchers and are addressed here. First, the idea of extra-role behaviors is distinguished from citizenship performance. Second, the importance of citizenship in managerial jobs is addressed. The third debate compares and contrasts helping behaviors with "challenging" behaviors. The fourth separates interpersonal and task aspects of citizenship. The section concludes with a brief discussion of the interaction between task and citizenship performance before an attempt is made to form a unified concept of citizenship performance that is comparable to the meso-level variable of task performance.

Going beyond formal job tasks. Overlapping with citizenship performance, OCBs are defined as the behaviors across time and persons that jointly promote organizational effectiveness but are not formally required or directly rewarded (Organ, 1988); they are extra-role and discretionary (Borman & Motowidlo, 1997).
Organ (1997) later reexamined his definition and dropped the requirement that OCBs are extra-role, realizing that classification schemes should not label a particular behavior differently depending on the setting in which it is viewed. A salesperson may be required to smile when a customer enters the building but is performing no differently than a custodian who smiles despite not being required to do so. This led Organ to conclude that OCB is "synonymous" with contextual performance but to promote the continued use of his term, OCB, because he "find[s] that both academic and practitioner types readily and intuitively grasp what it is all about" (p. 91) and because the term contextual performance "simply strikes [him] as cold, gray, and bloodless" (p. 91). Despite this rather blithe justification for labeling, Borman and others (e.g., Borman, Penner, Allen, & Motowidlo, 2001; Coleman & Borman, 2000) appear to have converted, using the term "citizenship performance." However, it is duly noted that citizenship performance is not strictly equivalent to contextual performance because Borman and Motowidlo (1997) include whistle-blowing and similar behaviors as part of contextual performance while Organ does not.

Similarly, Van Dyne and colleagues (1995) defined extra-role behaviors (ERBs) as behaviors that are intended to or do benefit the organization, are discretionary, and go beyond existing role expectations. This definition has obvious relevance to the extent that it overlaps with contextual performance but is distinct in two ways. First, it relies on subjective perceptions about what is "required" by a role, either explicitly or implicitly. These perceptions can vary for superiors who rate performance and for workers who decide whether or not to perform beyond their formal or perceived role. Second, the term ERB includes behavioral intentions and implies the notion of altruism, requiring that a person's intention is to help the organization or another person and not merely oneself. While the latter stipulation can be useful in understanding and predicting certain actions (Hogan, Rybicki, Motowidlo, & Borman, 1998), it is irrelevant from the perspective of actually measuring performance. People who act in a certain way are said to be performing, regardless of their intentions. Conceptually, the inclusion of altruistic behavioral intentions is thus incompatible with most accepted definitions of performance, which are defined in terms of behaviors (e.g., Campbell et al., 1993) and require behaviors as the base unit of analysis. Practically, a focus on ERBs will undoubtedly lead to frequent measurement errors, as the same behavior can be labeled differently depending on how observers infer someone's intentions and make causal attributions about actions (Schnake, 1991). For example, the act of complimenting could be seen as supportive and cooperative for one person and as ingratiating and sly for another. Therefore, this study will be limited to understanding explicit performance behaviors, while recognizing that the concepts of intentions and of perceiving responsibilities beyond formal role requirements may be important in other work.

Jobs confounding citizenship and task performance.
Some believe that task performance in managerial and service jobs is confounded with citizenship behaviors because these workers spend a considerable portion of their time nurturing the social environment of coworkers and less time dealing with core production (Borman & Motowidlo, 1993) than other types of workers. The assumption is that the primary function of managers is to provide social support and endorse/protect the organization, and that the corresponding acts comprise their main job tasks, many of which are likely to be formally required. The theoretical implication is that managers' "task" performance is simultaneously citizenship performance. The practical implication is that organizational attempts to increase or enhance managerial performance will have the same effect on citizenship as on task performance (Conway, 1999).

Conway (1999) suggested that interpersonal activities related to guiding and developing subordinates who perform tasks were, in essence, the managerial version of task performance, in addition to technical-administrative duties that were more directly related to core production. As when discussing the extra-role distinction, this type of reasoning is questionable according to the behavioral definition of performance, since the same behavior should not be labeled differently just because it is performed by a manager rather than by a lower-level worker (Organ, 1997; Rotundo & Sackett, 2002). Instead, activities tied closely to the administration of core production processes, direction about how to plan and organize production, and backing up subordinates to enable production should be considered task performance. Activities that are further removed from core functions, such as showing loyalty to a group, helping subordinates with personal issues, and defending the organization, would fall in the domain of citizenship performance. Admittedly, Conway (p. 5) does state that the key distinction rests in whether or not managerial behaviors are "more explicitly oriented toward goal achievement." Although there may be some "gray area" in distinguishing such behaviors, reframing the debate in this way allows us to ask more meaningful and testable questions. Perhaps managerial jobs require more citizenship performance relative to task performance (cf. Ilgen & Hollenbeck, 1991). Conway (1999) found that both job dedication and interpersonal dimensions of citizenship contributed uniquely to overall managerial performance, beyond the contributions of task performance.

The arguments above could similarly be applied to sales employees, who must maintain a supportive environment to achieve sales (MacKenzie, Podsakoff, & Fetter, 1991). Vinchur, Schippman, Switzer, and Roth (1998) found that conscientiousness was a good predictor of sales criteria while cognitive ability only predicted ratings criteria well. If, as many hypothesize, conscientiousness enables workers to take initiative, put forth extra effort, and be dedicated to their job, then it should be related to performance in sales jobs where the usefulness of citizenship is more salient. Ultimately, the belief is that managers and sales agents are expected, or required, and rewarded to perform citizenship behaviors well, thereby increasing the correlation between citizenship predictors and job performance criteria.

Hypothesis 1 (H1): Citizenship performance will show higher positive correlations with noncognitive predictors in managerial and sales jobs than in other jobs.
However, the effect predicted by Hypothesis 1 might not occur for two specific reasons. First, there may be little observed variance in citizenship performance by managers if they were selected on their aptitude or willingness to perform citizenship behaviors or if citizenship behaviors are formally required. Second, studies with poorly defined criteria that label citizenship behaviors as task performance will most likely attenuate any correlations between citizenship performance and noncognitive predictors. The second effect is controlled for in this study by using a set of rules for categorizing performance criteria into task and citizenship dimensions based on definitions of performance derived from the literature (described in the Method section).

Challenging behaviors. This debate concerns the difference between behaviors that promote an organization and behaviors that challenge it (Van Dyne et al., 1995; Organ, 1997). Challenging refers to behaviors like whistle-blowing, principled organizational dissent, and general voice. While conflicting definitions have emerged in the literature, whistle-blowing generally refers to discretionary behaviors that disclose an illegal, immoral, or illegitimate act with the intention of ultimately improving the organization (Van Dyne et al., 1995). Persons can benefit personally, but only in addition to their contribution to the organization, and they are often penalized for their acts. For example, the whistleblowers in three recent scandals (i.e., Enron, WorldCom, and the FBI) initially attempted to rectify problems internally and privately (without public recognition) before deciding that a more drastic measure was necessary to invoke changes (Lacayo & Ripley, 2002). Principled organizational dissent is opposition to practices that are not illegal but are still objectionable on the basis of "conscientious principles." Voice behaviors promote change rather than prohibit current practices; they may include persuading others, counteracting groupthink, or providing constructive criticism. Challenging then refers to a broader group of behaviors "criticizing the inefficiency of the status quo" for the benefit of the organization (Van Dyne et al., 1995, p. 252).

Puffer (1987) stated, "noncompliant behaviors are distinct types of nontask behavior that have a common achievement-motivation base but are influenced by different perceived situational contingencies" (p. 619). Compared to citizenship, challenging behaviors appear to have a "different character altogether," sometimes incurring immediate costs before eventually benefiting the organization (Organ, 1997). At the same time, challenging appears to affect organizations through the psychological environment more than through job tasks (Borman & Motowidlo, 1997), and appears to be determined by personality and motivation more so than by cognitive ability (LePine & Van Dyne, 2001). For these reasons, it is included here under the broad category of citizenship, though future investigations can and should assess the extent to which challenging is different from other dimensions of citizenship. Finally, it is noted that challenging behaviors are arguably distinct from sheer negative or retaliatory acts such as sabotage or counterproductive behaviors that are not related to achieving organizational goals (Kelloway, Loughlin, Barling, & Nault, 2002; Miles, Borman, Spector, & Fox, 2002; Puffer, 1987; Rotundo & Sackett, 2002).

Recipients of citizenship: people, tasks, or organizations.
Citizenship behaviors are typically performed for certain targets or recipients. The recipient may be the organization or an individual coworker. Some argue for separating OCB-I (behaviors directed at other individuals) and OCB-O (behaviors directed at the organization). OCB-Is may be determined more by personality, trust, and emotional expression whereas OCB-Os may be determined by conscious, cognitive decisions to reciprocate in a social exchange (Lee & Allen, 2002; Settoon & Mossholder, 2002). LePine, Erez, and Johnson (2002), however, found little support for this distinction. Due to this dearth of empirical support in the current literature, these behaviors will be treated similarly in this paper.

Van Scotter and Motowidlo (1996) split the concept of citizenship in a similar fashion into two dimensions: interpersonal facilitation and job dedication. Interpersonal facilitation refers to social acts of helping and cooperating with others while job dedication refers to self-disciplined, motivated acts like working hard, taking initiative, and following rules. In their study, results suggested that job dedication is not clearly distinct from task performance. The constructs were moderately correlated (r = .48) and had similar patterns of relationships with experience, ability, job knowledge, and personality (average r = .15 with conscientiousness). This led those authors to believe that motivational elements related to task performance account for part of the citizenship domain. In contrast, results supported interpersonal facilitation as being unique from both job dedication and task performance (r = .36 and .35, respectively). Johnson (2001) similarly concluded that a (motivational) measure of job-task conscientiousness was related to aspects of both task and contextual performance.

Organ and Ryan (1995), on theoretical grounds, conducted separate meta-analyses for altruism and generalized compliance when estimating the correlations between predictors and OCBs. As the patterns of relationships were similar across the two analyses and no estimate of the correlation between the two dimensions of citizenship was provided, the usefulness of this distinction still needs to be investigated. Hurtz and Donovan (2000) found similar patterns of relationships when the Big Five personality dimensions were correlated with job dedication and with interpersonal facilitation. The only exception was for agreeableness, which was slightly more related to interpersonal facilitation (rc = .20) than to job dedication (rc = .10). Though this partitioning of citizenship is tentative based on the existing empirical evidence, there is enough speculation that interpersonal behaviors directed at other people may be different from behaviors directed at the organization or work to warrant further investigation. The primary analyses of this study are concerned with the difference between task and citizenship, but a secondary hypothesis concerning this moderator of correlations involving citizenship is that:

Hypothesis 2 (H2): Effect sizes for measures related to citizenship performance will be moderated by the degree to which interpersonal facilitation and job dedication aspects are measured.

Relative importance of task and citizenship. A corollary to the task-citizenship distinction is that all organizations require task performance by definition, or else there is no "work" to be done.
Some minimum level of task output (basic competency or "satisficing") is inevitably required for an organization to exist. Conversely, organizations do not necessarily require individuals to exhibit citizenship behaviors (e.g., fully automated systems or small organizations with little interaction between individuals). This reasoning might explain the finding by Rotundo and Sackett (2002) that citizenship is weighted more heavily for effective task performers. So it may be the case that a minimum level of task performance must be demonstrated before any worth is attributed to a worker.

However, citizenship behaviors may allow organizations to reach maximal levels of effectiveness or to ensure continuous development or survival. The clearest examples of citizenship that are likely to help individuals function together above typical levels of task performance are conscientious behaviors and following rules. By using the social and psychological environment to support core functions in this way, workers (particularly managers) presumably enable the organization as a whole to function in an integrated fashion that is better than the sum of individual performances. Citizenship concerns may also take precedence over core functioning for pragmatic reasons, becoming the source of individual differences when job applicants all perform at similar levels of task performance (e.g., simple jobs that anyone can do well). Alternatively, organizations requiring strictly routinized task performance with little room for discretion may find that certain aspects of citizenship contribute little to overall effectiveness or are detrimental because they distract workers from their task performance (Hunt, 2002). Similarly, task performance may also have greater practical utility in extremely complex jobs with varying assignments where the majority of an individual's attention and effort must be devoted to performing a particular task well (e.g., aeronautical engineers and physics professors). In conclusion, both concepts may have great practical and theoretical importance, but an absence of task performance logically precludes the need for citizenship performance.

Integrated frameworks of citizenship performance. There have been a few attempts at integrating different models of performance to form a unified theory of citizenship. Van Dyne and colleagues (1995) proposed a complex nomological network for classifying four extra-role constructs: OCB, prosocial behaviors, whistle-blowing, and organizational dissent. They were able to substantively clarify constructs by reducing conceptual overlap in previous definitions. Among their recommendations, they suggest concentrating on citizenship behaviors as a broader and more consistent term. (The reader is referred to the original article for specific conclusions, many of which are regarded as irrelevant here due to their focus on ERBs rather than citizenship/contextual aspects.)

Coleman and Borman (2000) derived an overall model of citizenship performance by using factor analysis, multidimensional scaling (MDS), and cluster analysis on a similarity correlation matrix composed of citizenship dimensions that were sorted by I/O psychologists. The factor analysis produced four factors that accounted for fifty-nine percent of the variance and were interpreted as: 1) helping and cooperating with others, 2) endorsing, supporting, and defending the organization, 3) following organizational rules, and 4) persisting with enthusiasm and extra effort to complete one's own tasks.
The MDS analysis resulted in five groups of behaviors: 1) interpersonal altruism, 2) interpersonal conscientiousness, 3) organizational allegiance/loyalty, 4) organizational compliance, and 5) job/task conscientiousness. Complementing this, the cluster analysis supported three groups of behaviors: 1) interpersonal citizenship performance, 2) organizational citizenship performance, and 3) job/task conscientiousness. The authors concluded that the analyses together support three broad categories of citizenship performance depending on who or what benefits from a behavior: the whole organization directly (e.g., endorsing, supporting, following rules), other workers (e.g., helping, cooperating), or the job/task (e.g., conscientiousness, extra effort). As addressed earlier in the section about recipients of citizenship actions, there is a lack of support for categorizing behaviors explicitly directed at the organization, precluding a meta-analysis. So, this study only tests the distinction between interpersonal behaviors and other (job dedication and task-directed) behaviors in H2, per the recommendations of Coleman and Borman (2000) and Van Scotter and Motowidlo (1996).

Rotundo and Sackett (2002) reviewed the array of related concepts in the literature and concluded that definitions of citizenship performance continue to overlap and rely on "muddied" features of behavior (e.g., extra-role, not explicitly rewarded, or a formal part of the job). They recommended defining performance behaviors independent of the context in which they are performed and of their consequences. They then treated citizenship performance as a single, broad concept. LePine and others (2002) similarly supported a broader level of analysis, stating that specific dimensions of citizenship can be treated as equivalent indicators of a common latent construct, "a general tendency to be cooperative and helpful in organizational settings" (p. 61). In their study, potential subdimensions of citizenship showed high intercorrelations, similar relationships with predictors, little incremental variance over a measure of general citizenship, and nonsignificant moderators.

Based on these qualitative and quantitative reviews, this study relies on a general theory of citizenship performance that reverts to Borman and Motowidlo's original definition of a general class of behaviors that support the social and psychological environment. Overall job performance then appears to consist mostly of task and citizenship dimensions. Facets of citizenship behaviors may be conceptually distinct but can be treated similarly because they are likely to be determined by the same constructs and to contribute to organizational effectiveness in a similar manner. However, the past literature does suggest that citizenship behaviors might be composed of two distinguishable facets at an intermediate level of detail: interpersonal facilitation and self-disciplinary acts. The theory also implies that citizenship and task performance will be determined primarily by different individual characteristics. Task performance should be more strongly related to cognitive ability while citizenship performance should be more strongly related to personality and motivation. Also, though the two dimensions of job performance are distinct and probably weakly related, task performance may be viewed as having more weight for the survival of an organization.
Finally, citizenship performance is separated from extra-role behaviors and is characterized by its support of the work environment rather than by behaviors that are tied to core functions.

Empirical Evidence for the Task-Citizenship Distinction

Since the early 1990s, empirical support for the distinction between task and citizenship behaviors has accumulated. If these concepts are to be useful in theory, they must show distinct patterns of relationships with other variables in a nomological network (Cronbach & Meehl, 1955). If they are to be useful in practice, they must also be only weakly related to each other, or else they will be functionally redundant. Overall, the literature seems to support the task-citizenship distinction by both of these standards.

Motowidlo and Van Scotter (1994) examined supervisor ratings of task, contextual, and overall performance in relation to experience, ability, training, and personality in a sample of Air Force mechanics. Both task and contextual performance predicted incremental variance in overall performance over each other (within an estimated range of reliabilities, .4 to .8). Task ratings explained between 17% and 44% of the variance in overall performance above contextual ratings; contextual ratings, between 12% and 34% above task ratings. Each criterion also produced a different pattern of correlations with individual characteristics, where personality correlated more strongly with contextual ratings than with task ratings. Unfortunately, conclusions in this study were questionable because the data failed to show an expected large correlation between task performance and cognitive ability. The authors also limited the generalizability of their conclusions based on the idea that military jobs might involve discretionary behavior infrequently as compared with civilian jobs.

With confirmatory factor analyses of fifteen multitrait-multirater matrices, Conway (1996) showed that a task/contextual model of performance fit better than a unidimensional one, particularly for nonmanagerial performance ratings. Correlations within a domain tended to be higher than between domains; mean correlations were .70 (SD = .11) for task-task, .70 (SD = .13) for contextual-contextual, and .55 (SD = .15) for task-contextual ratings across raters. Conway also examined whether contextual performance subdimensions were differentially related to task performance. He concluded that three subdimensions were distinct, finding that cooperating had a lower correlation with task performance (.51) than did following rules (.72) or persisting with extra effort (.59). Finally, there appeared to be no differences in reliability between task and contextual measures. Together, the findings based on these ratings supported a distinction between the two performance dimensions (Borman et al., 2001).

Hattrup and colleagues (1998) studied the relationships of cognitive ability and conscientiousness with sales performance, absenteeism, tardiness, and OCBs in a sample of sales representatives over a six-month period. Cognitive ability was significantly correlated only with sales performance (.31) while conscientiousness was related to absenteeism (-.24) and OCBs (.23) but not tardiness. The same pattern was found when examining the incremental validity of either predictor. Their results supported the general notion that task and contextual performance are different aspects of overall job performance.
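The incremental variance comparisons reported above follow the logic of nested regression models. As a minimal sketch, with notation introduced here for illustration rather than taken from the original studies, let $Y$ denote overall performance, $C$ contextual ratings, and $T$ task ratings; the increment attributable to task ratings is the gain in squared multiple correlation when $T$ is added to a model already containing $C$:

$$ \Delta R^2_{T} = R^2_{Y \cdot C,\,T} - R^2_{Y \cdot C} $$

The increment for contextual ratings is defined symmetrically by reversing the order of entry. A nonzero increment in each direction, as in the studies above, is the evidence that the two dimensions are nonredundant predictors of overall performance.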
For eight job families in a telecommunications firm, Johnson (2001) showed that interpersonal citizenship, organizational citizenship, job-task conscientiousness, and handling work stress all explained incremental variance in overall performance above dimensions of task performance. Other research has produced similar results when predicting other broad outcomes like systemic rewards and promotability (Allen & Rush, 1998; Van Scotter, Motowidlo, & Cross, 2000). Johnson also found that task criteria exhibited a pattern of relationships with cognitive ability and personality that was different from citizenship criteria, except for job-task conscientiousness, which appeared relevant to both performance dimensions.

There is also some evidence showing that ratings of overall performance in the literature have included aspects of citizenship performance. Lance and Bennett (2000) evaluated a structural equation model of supervisory performance ratings made by Air Force personnel. Their model fit the data, with the effects on overall performance ratings mediated by aspects of task and contextual performance. Rotundo and Sackett (2002) used a policy-capturing approach to assess the weights given to task, citizenship, and counterproductive behaviors in managerial ratings of overall performance. Raters sorted descriptions of hypothetical workers in five job types with respect to the three performance subdimensions. Managers appeared to rely primarily on three weighting strategies that varied across job types. One group of managers weighted task performance most highly. Another group weighted task and citizenship nearly equally. The third group gave counterproductive ratings the most weight, followed by task and then citizenship behaviors. There was also a significant interaction between task and citizenship ratings in predicting overall performance, suggesting that managers value citizenship more in workers who accomplish their job tasks.

Conclusions about counterproductive behaviors in this study are problematic for two reasons. First, the authors scaled the worker profiles to reflect each type of performance equally, creating an unrealistic worker in light of the low base rate of counterproductive behaviors in real work environments. Second, the operational definition of counterproductive behavior was not necessarily distinguishable from poor citizenship performance (e.g., low levels of compliance, a facet of contextual performance, were defined as counterproductive). Because such behaviors fall on a continuum of levels, extremely poor compliance is likely to hurt the organization whereas
Hattrup and other colleagues (1998) found task (sales) performance to be nonsignificantly correlated with absenteeism, tardiness, and OCB (- .18, -.16, .19, respectively). Higher estimates include those by Allen and Rush (1998) at .66, Johnson (2001) at .54, and Beaty, Cleveland, and Murphy (2001) at .75. As a result, one primary goal of this study is to estimate the relationship between task and citizenship performance with quantitative summary methods. Job Performance Antecedents Regardless of whether one is referring to overall job performance or specific dimensions, complex phenomena such as performance are likely to be multiply determined. Campbell and colleagues (1993, 1996) proposed that individual differences on job performance components are completely determined by declarative knowledge, procedural knowledge and skill, and motivation. That is, specific job performance 26 behaviors may be caused by various influences including factors intrinsic to a person or to environmental factors, but variance between individuals who perform the same task will be determined by the three components. Motowidlo and others (1997) applied this general theory to a more detailed model in which “habits, skill, and knowledge” mediate the influence of cognitive ability and personality on task and contextual performance dimensions, with personality being more strongly related to contextual performance and cognitive ability being more related to task performance. These conclusions were based on a brief qualitative review of empirical findings related to the prediction of task and contextual performance with personality and general cognitive ability. Yet, many different explanations of this hypothesized pattern can be found in various works, most of which seem to be based on intuition. People who are motivated to contribute to organizational effectiveness can do so through either task or contextual means, or through both. Because task performance can be complex, cognitive ability explains many differences between individual performances. Motivational and dispositional factors related to being conscientious are also believed to affect task performance to some degree, particularly when extra effort or care is needed to ensure successful production. In contrast, theories of citizenship have implied that social skills, personality, and motivation are strong determinants of differences between workers. These antecedents enable or motivate certain people to be of greater assistance or to make extra effort in all aspects of their work, under the assumption that citizenship behaviors are common across jobs and are not necessarily easier to perform for people of higher cognitive ability. 27 In addition, some have proposed the idea that individuals (Hogan et al., 1998; Penner et al., 1997) choose to increase citizenship behaviors when task productivity cannot be increased; there is more discretion for performing citizenship behaviors than for task behaviors, discretion that is determined by personality. This may occur either when a worker lacks the ability to improve task performance or when work processes are very structured and not amenable to improvements. It is also likely that the motivation to be generally helpful and cooperative is rooted in one’s personality, affective disposition, or current mood state (Beaty et al., 2001; Day & Silverman, 1989; Gellatly & Irving, 2001; McHenry et al., 1990; Motowidlo & Van Scotter, 1994; Murphy & Shiarella, 1997; Organ & Ryan, 1995). 
George and Brief (1992) theorized about the importance of having a positive mood at work as the direct precursor to acts of organizational spontaneity like helping coworkers, defending the organization, making constructive suggestions, developing oneself, and "spreading goodwill." Citizenship could also result from cognitive processes regarding the norm of reciprocity and social exchange theory (e.g., due to being satisfied with the job or expecting future rewards). Puffer (1987) found that a high need for achievement, high satisfaction with material rewards, and low perceived peer competition were related to more prosocial behaviors.

Organ and Ryan (1995) conducted the earliest comprehensive review of dispositional and attitudinal predictors of OCBs. Conscientiousness produced small to moderate validities with the altruism and generalized compliance dimensions of OCB (ρ = .22 and .30, respectively) after corrections for measurement unreliability. Agreeableness produced smaller validities with the same criteria (ρ = .13 and .11).

While citizenship performance has been linked to personality dimensions, and task performance has been linked to cognitive ability in research, I am not aware of any study that has examined the differential validity of personality and cognitive ability in predicting task and citizenship performance in a comprehensive review. Primary studies and meta-analyses of predictor relationships with one criterion or the other, however, suggest that the following predictions would be supported (Borman & Motowidlo, 1997; Motowidlo et al., 1997; Motowidlo & Van Scotter, 1994; Organ & Ryan, 1995):

Hypothesis 3 (H3): Task performance will show a higher positive correlation with cognitive ability than with personality.

Hypothesis 4 (H4): Citizenship performance will show a higher positive correlation with personality than with cognitive ability.

In addition to the hypothesized predictors of cognitive ability and personality, there could be an enormous number of specific performance determinants that have unique relationships with task and citizenship performance. Thus, I/O psychologists have sought a subset of predictors that are manageable and carry utility, giving the "most bang for the buck" in understanding work processes. As a result, two sets of predictors are analyzed here. The first set consists of construct-level measures: cognitive ability and personality. The second consists of two general methods commonly used in personnel selection (Muchinsky, 1997): structured interviews and biodata. Though it may seem odd to compare construct-level measures with amalgamated measures, these comparisons are meaningful in light of the way that selection tools are administered; they are often mixed together. It is also acknowledged that some potentially powerful determinants of performance are excluded because meta-analysis requires that a sufficiently large body of literature exists before it can provide accurate estimates. The next sections describe the predictor-performance relationships in more detail than Hypotheses 3 and 4 indicate.

Cognitive ability

For decades, empirical evidence has accumulated to uphold cognitive ability as the single most consistent and strongest predictor of overall job performance, across a wide range of job types and situations (Gottfredson, 1997; Hough & Oswald, 2000; Hunter & Hunter, 1984; Jensen, 1998; Lubinski, 2000; Neisser et al., 1996; Schmidt, 2002; Schmidt et al., 1985).
For many jobs, cognitive ability appears to have a mean validity somewhere between .40 and .50 with overall job performance, after correcting for range restriction and criterion unreliability (Hunter & Hunter, 1984; Mayberry & Carey, 1997; Outtz, 2002; Ree & Carretta, 2002). This evidence is so strong, in fact, that general cognitive ability (GCA) is said to have validity generalization (Murphy, 2002; Ree & Carretta, 2002; Schmidt, Hunter, Pearlman, & Shane, 1979; Schmidt et al., 1985; Viswesvaran & Ones, 2002), meaning that a large percentage "of all values in the distribution [across jobs on which generalization evidence is based] lie above the minimum useful level of validity" (Schmidt, Hunter, McKenzie, & Muldrow, 1979, p. 618).

Some have even concluded that task dimensions within the same job are unlikely to moderate GCA test validities after correcting for artifactual variance in the distribution of validity coefficients observed in research (Schmidt & Hunter, 1977; Schmidt, Hunter, McKenzie et al., 1979; Schmidt, Hunter, Pearlman et al., 1979; Schmidt, Law, Hunter, Rothstein, Pearlman, & McDaniel, 1993), where artifacts account for up to 87% of the variance on average (Schmidt et al., 1993). "Only a measure of overall job performance is needed in validity studies" when the corrected criterion reliability is high (Schmidt, Hunter, & Pearlman, 1981, p. 175).

While statements like this imply that there is no need to differentiate between performance criteria for making practical selection decisions (Schmidt et al., 1981; Schmidt et al., 1985; Viswesvaran & Ones, 2002), there exist other reasons for examining performance relationships at a finer level of detail, some of which have been mentioned above in a more general context. First, the use of global performance ratings obscures the meaning of one-to-one causal relationships between cognitive ability and different performance behaviors. Though it may be the case that some general ability, such as 'g,' allows some individuals to excel at virtually everything and others to be generally limited, furthering the theoretical understanding of job performance and its nomological network depends on examinations of why performance behaviors are linked to each other and caused by the same or different antecedents. Second, the homogeneity of effect sizes may simply result from various biases that are known to affect overall performance ratings, including halo and leniency (e.g., Murphy & DeShon, 2000; Solomonson & Lance, 1997), personal attraction and racial bias (e.g., Ford et al., 1986; Pulakos, White, Oppler, & Borman, 1989), assimilation and contrast effects (e.g., Kravitz & Balzer, 1992), and the setting in which performers are observed by others (Rothstein, 1990). The reliability estimate for an overall rating might then be artificially high even though the measure fails to capture "true scores." Third, the estimates of reliability (e.g., .60 in Schmidt & Hunter, 1977) that have been used to correct the validities of cognitive predictors (Schmidt et al., 1993; Schmidt et al., 1985) could be underestimates of true reliability if the observed variance in performance scores is actually due to its dimensional nature rather than to error. Reliability estimates that are too small will result in overcorrections and will eliminate true variance (Algera, Jansen, Roe, & Vijn, 1984; Guion, 1998; Murphy, 1997), leading to the dubious conclusion that relationships are not moderated.
Interestingly, Hunter and others (Schmidt, Hunter, McKenzie et al., 1979; Schmidt, Hunter, Pearlman et al., 1979) listed criterion contamination and deficiency as one of the seven likely sources of artifactual variance but have always considered it statistically uncorrectable. This study addresses that notion, in some sense, by manually separating different types of criteria based on their overlap with parts of the job performance construct domain. Fourth, cognitive predictors almost always leave additional variance in overall performance unexplained (Schmidt, 2002; Schmidt et al., 1985; Schmidt & Hunter, 1998; Viswesvaran & Ones, 2002). If this unexplained portion of performance is itself systematic, then it might be considered a separate dimension of performance that is not predictable by cognitive ability.

It is important to recognize that none of the above reasons has been thoroughly tested and that none disproves the validity generalization claim (i.e., that cognitive ability has a single true validity that does not vary across situations). The four reasons presented merely provide some compelling justifications for further investigating the relationship between cognitive ability and dimensions of performance. Any evidence found to support the distinction between task and citizenship performance would have meaningful implications. Cognitive test validities would be expected to show less variance and may be higher than previously expected if cognitive ability has a stronger link specifically with task performance, as theorized. Also, the adverse impact on racial minority groups in selection that tends to result from using cognitive tests can be reduced by focusing on a dimension of performance that is less related to cognitive ability, assuming that dimension is predicted well by other variables (e.g., Hattrup, Rock, & Scalia, 1997).

Some prior research distinguishing dimensions of performance in validation research provides expectations for this study. In a small-scale "meta-analysis" of three studies (including Project A, an army selection study) with widely discrepant findings, Hattrup and others (1997) estimated cognitive ability to be correlated .41 with task performance and .16 with contextual performance when corrected for unreliability and range restriction. Murphy and Shiarella (1997), however, used another set of studies (also including Project A) to estimate the same relationships as .50 and .30, respectively.

The wealth of validation research on cognitive ability has also provided convincing evidence that job complexity moderates relationships between cognitive ability and overall performance (Hunter & Hunter, 1984; Murphy, 2002; Ree & Carretta, 2002). Validities with overall job performance have been estimated to be .58 for professional-managerial jobs, .56 for high-level, complex technical jobs, .51 for medium complexity jobs, .40 for semi-skilled jobs, and .23 for completely unskilled jobs (Hunter & Hunter, 1984; Schmidt & Hunter, 1998). This may occur because cognition is partly defined as the ability to deal with complex situations (Gottfredson, 1997). Therefore, it is believed that:

Hypothesis 5 (H5): Job complexity will moderate the relationship between cognitive ability and task performance.

This hypothesis is essentially a replication of past meta-analyses, although this moderating effect is not of focal interest.
Job complexity may also be confounded with a greater need for interpersonal interactions and citizenship-like behaviors in higher level jobs, as discussed earlier and stated in H1 (Latham & Skarlicki, 1995; Gottfredson, 1997; Murphy & Cleveland, 1995). Thus, job complexity may moderate validities for at least two reasons: 1) high cognitive ability allows workers to deal with complexity in tasks or the job, or 2) citizenship performance is weighted more in complex jobs due to the social nature of the work. If H1 is supported (i.e., citizenship performance is more strongly related to noncognitive constructs in managerial and sales jobs) and H5 is not supported, it could be the case that job complexity moderates overall performance through aspects of citizenship rather than task performance; the job complexity and managerial distinctions may act similarly and be functionally equivalent. Finally, there appears to be no a priori reason for hypothesizing that the relationship between citizenship performance and cognitive ability will be moderated by job complexity, especially given Borman and Motowidlo's (1997) assertion that citizenship behaviors are common to all jobs.

Hypothesis 6 (H6): The relationships between cognitive ability and citizenship performance will be stable across job types based on complexity.

Personality

Though personality research has played an on-and-off role in I/O research, earlier studies lacked adequate theoretical frameworks (e.g., Guion & Gottier, 1965). It was not until after Barrick and Mount's (1991) seminal meta-analysis of the validities of the Big Five dimensions (Conscientiousness, Agreeableness, Extraversion, Emotional Stability, and Openness to Experience) with job proficiency that personality regained popularity as a job performance predictor (Hurtz & Donovan, 2000; Salgado, 1998). A considerable body of research has emerged since then to support a clear link between some of the Big Five dimensions and certain aspects of job performance (Barrick, Mount, & Judge, 2001; Borman et al., 2001). Findings from a few studies are described briefly to provide an approximate idea of the relationships relevant to this study.

Four of the Big Five dimensions appear to be weakly related to cognitive ability, producing correlations of less than .10 (Bobko, Roth, & Potosky, 1999; Boudreau, Boswell, Judge, & Bretz, 2001; Cortina, Goldstein, Payne, Davison, & Gilliland, 2000). As would be expected, openness to experience, or intellectance, has been shown to correlate moderately with cognitive ability (r = .21 in Boudreau et al., 2001). In Project A, conscientiousness correlated .11 with core technical proficiency (task performance), .09 with general soldiering proficiency, .22 with effort and leadership, and .30 with personal discipline (McHenry et al., 1990). Emotional stability was correlated .10, .12, .19, and .11 with those same variables, respectively. Conscientiousness and emotional stability correlated .32 with each other. One study (McManus & Kelly, 1999) found that task performance was significantly related to extraversion (r = .22), and citizenship performance was related to extraversion (r = .29), agreeableness (r = .20), emotional stability (r = .23), and openness to experience (r = .23). Another study (Beaty et al., 2001) found emotional stability to be significantly correlated with task and contextual performance (.36 and .31, respectively); emotional stability was also significantly correlated with task performance (.24).
Organ and Ryan (1995) provided perhaps the best, most comprehensive meta-analysis of the relationship between personality predictors and citizenship performance facets. Conscientiousness produced the largest correlations (.21 and .30, corrected) with different aspects of citizenship. Borman and others (2001) updated these findings with 20 additional studies and found citizenship facets to have uncorrected mean correlations of .24 with conscientiousness, .13 with agreeableness, and .08 with extraversion. Hurtz and Donovan (2000) also meta-analyzed correlations between the Big Five and job performance, including measures of task and citizenship behaviors. All of the dimensions were weakly correlated with task performance and job dedication, with conscientiousness having the largest corrected mean correlations (.16 and .20, respectively). Emotional stability displayed the next largest relationships with task performance (.14), job dedication (.14), and interpersonal facilitation (.17), while the correlations for other personality dimensions averaged below .10. However, the credibility intervals indicated stable validity estimates only for the criterion of interpersonal facilitation.

Overall, it is clear that conscientiousness has consistently produced the strongest validities with task performance (around .20) and citizenship performance (around .24). Emotional stability has also produced moderate correlations with citizenship performance on a somewhat inconsistent basis, and low but significant correlations with task performance. Extraversion, on the other hand, has typically shown weak correlations with both performance dimensions. Agreeableness seems to have produced the most inconsistent results. Finally, openness to experience appears to be under-researched, which may result partly from studies showing its low correlation with overall performance (Barrick & Mount, 1991). Based on theory and past findings, I pose the following hypotheses, where version A applies to citizenship performance and version B applies to the interpersonal facilitation and job dedication facets. The B versions apply only if H2 is supported:

Hypothesis 7 (H7A): Conscientiousness will be positively correlated with task performance and citizenship performance.

Hypothesis 7 (H7B): Conscientiousness will show a higher positive correlation with job dedication than with interpersonal facilitation.

Hypothesis 8 (H8A): Emotional stability will show a higher positive correlation with citizenship performance than with task performance.

Hypothesis 8 (H8B): Emotional stability will show a higher positive correlation with interpersonal facilitation than with job dedication.

Hypothesis 9 (H9): Agreeableness will be positively correlated with citizenship performance only.

Hypothesis 10 (H10): Openness to experience will be positively correlated with task performance only.

Structured interviews

Employment interviews, also proven predictors of job performance, generally contain ambiguity regarding the constructs being assessed (Bobko et al., 1999; Campion, Palmer, & Campion, 1997; McDaniel, Whetzel, Schmidt, & Maurer, 1994). Cognitive ability seems to account for less than 20% of the variance in interview ratings (Huffcutt, Roth, & McDaniel, 1996).
Huffcutt, Conway, Roth, and Stone (2001) provided a framework of the constructs that are typically assessed in interviews to "provide greater insight into why formats such as the situational interview predict performance and [to] allow interviews to be optimally designed to achieve specific outcomes such as high incremental validity and minimal impact on protected groups" (p. 897). They found that interviews in the research literature have primarily assessed basic personality constructs (35%), applied social skills (28%), mental capability (16%), and knowledge and skills (10%).

Unstructured interviews have no fixed format or set of questions and typically result in an overall rating for each applicant; structured interviews are the opposite (Schmidt & Hunter, 1998). Since this meta-analysis intends to distinguish predictor relationships with task and citizenship performance, unstructured interviews are unlikely to provide useful information and will be excluded. One implication of this restriction for study conclusions results from the fact that highly structured interviews tend to assess applied mental skills and knowledge more than low-structured ones (Huffcutt et al., 2001). Thus, findings will not generalize to interviews with dissimilar content and structure.

Structured interviews have produced correlations in the range of .16 to .32 with aspects of task performance (Borman, 1982; Campbell, Prien, & Brailey, 1960). For a large sample of Air Force personnel, interviews produced median correlations of around .23 (ranging from .02 to .38) with a hands-on work sample of task performance and with global technical performance. When designed to predict dimensions of OCBs, situational but not patterned interviews were significantly correlated with job performance (.50 for OCB-O and .30 for OCB-I) (Latham & Skarlicki, 1995). Correlations of structured interviews appear to be around .25 with cognitive ability and range from .12 to .26 with conscientiousness (Bobko et al., 1999; Cortina et al., 2000).

Based on these findings, structured interviews assess individual characteristics that are not limited to general cognitive ability. These "other" skills, abilities, and motivations are likely to be useful for predicting citizenship performance, while knowledge and cognitive components are likely to predict task performance. Therefore:

Hypothesis 11 (H11A): Interviews that primarily assess cognitive ability will be positively correlated with task performance.

Hypothesis 11 (H11B): Interviews that primarily assess personality will be positively correlated with citizenship performance.

Hypothesis 11 (H11C): Interviews that assess both cognitive and personality constructs in approximately the same proportion will correlate positively with both task and citizenship performance.

Biodata

Biographical data, or biodata, have been shown to predict job performance relatively well and to exhibit smaller differences by racial subgroup than cognitive ability (Schmidt & Hunter, 1998; Schmitt, Rogers, Chan, Sheppard, & Jennings, 1997). Biodata are questions about past experiences that are somehow related to a criterion, based on the premise that past behavior predicts future behavior (Mumford & Owens, 1987). They may be compound measures that comprise a range of constructs depending on what types of behavior are referred to in the questions (Mitchell, 1994; Nickels, 1994; Schmidt & Hunter, 1998).
More recent biodata forms have also begun to include questions referring to past attitudes and values, in the hope that they predict future behavior, presumably mediated through future attitudes and values. It is also interesting to note that biodata are most akin to the most commonly used selection tools, resumes and job applications. Correlations of biodata with cognitive ability have ranged widely from .05 to .50 (Bobko et al., 1999; Schmidt, 1988; Vinchur et al., 1998). In predicting overall job performance, empirical evidence has shown that biodata have minimal incremental validity over general cognitive ability (Schmidt & Hunter, 1998). This can result from the relationship between biodata and cognitive ability; it may also result if overall performance measures tend to exclude citizenship behaviors (again, a criterion problem rather than a predictor one).

In Project A, four biodata scales, including two dimensions of personality, had validities of .26 with core technical proficiency, .25 with general soldiering proficiency, .24 with effort and leadership, and .32 with personal discipline, where estimates are corrected for range restriction (Peterson et al., 1990; McHenry et al., 1990). McManus and Kelly (1999) showed that their biodata instrument for insurance sales representatives had similar relationships with contextual (r = .25) and sales task performance (r = .26). Biodata appear to correlate moderately with cognitive ability (correlations between .05 and .27) (Bobko et al., 1999) and weakly (correlations near zero) with personality (McManus & Kelly, 1999). The relationship between biodata and structured interviews has been estimated to be between .08 and .27 (Bobko et al., 1999).

In conclusion, biodata have shown moderate correlations with both task and citizenship aspects of job performance, depending on what the specific questions are designed to measure. They also appear to capture something other than personality which may still be a determinant of citizenship performance. Thus:

Hypothesis 12 (H12A): Biodata that primarily assess cognitive ability will be positively correlated with task performance.

Hypothesis 12 (H12B): Biodata that primarily assess personality will be positively correlated with citizenship performance.

Hypothesis 12 (H12C): Biodata that assess both cognitive and personality constructs in approximately the same proportion will correlate positively with both task and citizenship performance.

Summary of Research Hypotheses

H1: Citizenship performance will show higher positive correlations with noncognitive predictors in managerial and sales jobs than in other jobs.

H2: Effect sizes for measures related to citizenship performance will be moderated by the degree to which interpersonal facilitation and job dedication aspects are measured.

H3: Task performance will show a higher positive correlation with cognitive ability than with personality.

H4: Citizenship performance will show a higher positive correlation with personality than with cognitive ability.

H5: Job complexity will moderate the relationship between cognitive ability and task performance.

H6: The relationships between cognitive ability and citizenship performance will be stable across job types, based on complexity.

H7A: Conscientiousness will be positively correlated with task performance and citizenship performance.

H7B*: Conscientiousness will show a higher positive correlation with job dedication than with interpersonal facilitation.
H8A: Emotional stability will show a higher positive correlation with citizenship performance than with task performance.

H8B*: Emotional stability will show a higher positive correlation with interpersonal facilitation than with job dedication.

H9: Agreeableness will be positively correlated with citizenship performance only.

H10: Openness to experience will be positively correlated with task performance only.

H11A: Interviews that primarily assess cognitive ability will be positively correlated with task performance.

H11B: Interviews that primarily assess personality will be positively correlated with citizenship performance.

H11C: Interviews that assess both cognitive and personality constructs in approximately the same proportion will correlate positively with both task and citizenship performance.

H12A: Biodata that primarily assess cognitive ability will be positively correlated with task performance.

H12B: Biodata that primarily assess personality will be positively correlated with citizenship performance.

H12C: Biodata that assess both cognitive and personality constructs in approximately the same proportion will correlate positively with both task and citizenship performance.

*The hypotheses with asterisks are only relevant if support is found for H2.

Figures 1 and 2 provide an integrated visual representation of these hypotheses, many of which are based on findings in the literature reviewed.

[Figure 1. Model of hypothesized relationships between predictors and job performance dimensions.]

[Figure 2. Model of hypothesized relationships between personality dimensions and the interpersonal facilitation and job dedication facets of citizenship performance.]

Conclusion

There is a large body of research on the validity of various selection tools. Cognitive ability is known to produce large validities, and this finding is generalizable across different types of jobs. Other individual difference characteristics have also produced respectable validities, but in a less consistent manner. Because job performance is the outcome that many researchers try to predict, and because it is a complex construct, there is a need to investigate more direct relationships between individual differences and different types of performance behaviors. Recent advancements in theory and the development of explicit job performance taxonomies have stimulated research, particularly in the area of citizenship performance. As more studies have become available and theories have become more complex, it is appropriate to evaluate the past findings and determine what we know and what we do not know. Meta-analysis is one useful method of summarizing and evaluating the information provided by many studies. By quantitatively cumulating past results, one can remove the effects of the sampling error to which primary studies are bound and derive better estimates of conceptual relationships.
Meta-analysis also permits examinations of moderating influences on the distribution of observed validities that might not have been analyzed within any one study (Rothstein, McDaniel, & Borenstein, 2002). It is hoped that summarizing discrepant estimates of interesting relationships (e.g., task and citizenship performance correlations) and detecting or supporting the existence of plausible moderators will refine estimates of relationships by reducing unexplainable variance in results across studies, which may also allow us to expect higher validities for some measures in particular settings or for particular groups (Barrick et al., 2001; Salgado, 1998).

METHOD

Literature Search

I conducted a literature search using both computer-guided and manual approaches to find relevant published works between the years of 1988 (when "citizenship behavior" terms were defined) and 2004. The American Psychological Association's PsycINFO, PsycFIRST, ERIC, ABI/INFORM, and BusinessOrgs databases were used in the computer searches, including these keywords: citizenship, contextual performance, task performance, task proficiency, prosocial behaviors, structured interview, biodata, and biographical data. The manual search covered three prominent journals in the field of I/O psychology: Journal of Applied Psychology, Personnel Psychology, and Organizational Behavior and Human Decision Processes. The reference sections of seminal articles and meta-analyses were also used to locate additional studies. These methods together yielded 589 references to studies possibly containing codable information. The large yield was partly due to the large number of variables included in searches and partly due to erring on the side of inclusion when reading unclear abstracts.

Criteria for Study Inclusion

Only a portion of the studies identified by the literature search yielded usable information. Studies were coded and analyzed if they met the following criteria: 1) they were written or translated in English, 2) there was a measure of the relationship between at least two of the study variables (i.e., task performance, citizenship performance, cognitive ability, the Big Five, biodata, interviews, but not overall performance¹), 3) individuals were the unit of analysis, and 4) the statistical information necessary for computing a correlational effect size was presented.

¹ Information about overall performance was included as supplemental information only when an effect size for at least one of the performance dimensions was reported.

Bobko et al. (1999) explained the importance of choosing appropriate studies for inclusion in a meta-analysis and of consistently applying the same decision rules. It is ideal to combine primary studies and to exclude meta-analytic estimates that were derived with unique decision rules (cf. Wanous, Sullivan, & Malinak, 1989). Then again, information is lost when excluded meta-analyses contain studies that are not otherwise obtainable or eligible (e.g., studies before 1988 in this case). Thus, uncorrected cumulative correlations from meta-analyses were included in these analyses when they followed procedures similar to the one used here. As the performance taxonomy in this study has not been used in prior meta-analyses, certain studies (e.g., Barrick and Mount's 1991 meta-analysis) did not fall within the inclusion criteria because they classified measures differently, particularly with respect to citizenship criteria.
To avoid double-counting studies and giving certain findings too much weight, meta-analyses were excluded when most of their primary studies were already included in this database (e.g., Borman et al., 2001), and primary studies were excluded if they were contained in a meta-analysis that otherwise provided a large amount of unique data. Because laboratory research is often meant to generalize to work settings, studies including university students and people in non-work settings were included when the measure of performance mirrored actual job performance in some way. Thus, solely academic performance measures (e.g., GPA) were excluded.

Data Coding Procedure

A three-stage coding process was used to obtain information necessary for testing the hypotheses. In the first stage, the author and an advanced undergraduate in psychology coded sample characteristics based on information from studies identified through the literature search that met the criteria for inclusion. In the second stage, characteristics related to the hypothesized moderators were coded by multiple raters knowledgeable about I/O psychology. Concurrently, the author used the O*NET database (http://online.onetcenter.org) to assign job complexity codes based on the Dictionary of Occupational Titles to test Hypotheses 5 and 6. These stages are described in more detail below.

For the first stage, a pilot coding sheet was developed based on guidelines and examples from Lipsey and Wilson (2001). Items necessary for analyzing the specific hypotheses were added. The pilot coding sheet (Appendix A) was then sent to a college graduate outside of psychology to determine where the instructions were too vague, ambiguous, or confusing. As a result, the coding sheet was simplified and made shorter. The resulting coding sheet (Appendix B) was then used in subsequent phases of coding, allowing data to be recorded for each sample within a study in a new spreadsheet. Items on the coding sheet referred to the number of independent samples in the study, sample sizes, a qualitative description of the sample, whether a sample consisted of managerial employees, whether the study design was predictive or concurrent, reasons for missing data, whether a manipulation occurred between measures (as in training studies), a description of each measure, a judgment about measure subjectivity/objectivity, a judgment about measure broadness/narrowness, measure reliability, and correlational effect sizes between variables qualifying for this meta-analysis based on the definitions in Table 1.

The undergraduate student had taken courses in research methods and in statistics, and underwent training in how to code studies according to a code book of rules, including variable definitions. After the training over six weekly meetings, the undergraduate coded four studies for comparison with the author's coding of the same studies. The coders met and discussed disagreements and ambiguity about the coding rules and operational definitions of variables. The code book was revised based on these discussions, and the revision was used to code the remaining studies. The operational definitions eventually used are those in Table 1, and the coding rules are presented in the code book (Appendix C).

Interrater Agreement

Both coders examined a subset of studies and agreed on 56 of 58 (96.6%) about which studies provided codable data.
For these studies, the coding sheet and code book were used to collect study characteristics, measurement characteristics, and correlational effect sizes, as described earlier. Information about interrater agreement is provided in Table 2 for 29 studies that were deemed codable. For categorical variables, I computed kappa to index the level of agreement achieved beyond chance agreement (Table 2 and Appendix D). Generally, kappa values of .8 or higher are very satisfactory, values between .6 and .8 are good, values between .4 and .6 are moderate, and values below .4 are poor (Landis & Koch, 1977). The kappa values for coded information indicate that moderate agreement existed for judgments about whether samples were managerial (κ = .58) and whether measures were broad (κ = .52). Kappa values between raters regarding the study design and whether measures were subjective were lower.

Table 1
Study Variable Labels and Definitions

Task performance: Behaviors that directly or indirectly (through other workers) affect core production that transforms input to output or delivers a service. Sometimes these are referred to as "in-role" behaviors because they are tied to one's job roles. However, the two concepts may be very different if the role includes non-task behaviors.

Citizenship performance: Behaviors that 1) are not directly related to core tasks and 2) support the social and/or psychological environment. Examples include: loyalty, cooperative behaviors (not affecting core production), whistle-blowing, sportsmanship, prosocial behavior, personal initiative, showing extra effort and perseverance, and volunteering to do extra or unrelated work. Counterproductive or retaliatory behaviors are NOT included.

Job dedication (citizenship performance): Citizenship behaviors that do not require a direct interaction with another person. Instead, they are related to helping the organization overall. Examples: working hard, taking initiative, and following organizational rules.

Interpersonal facilitation (citizenship performance): Citizenship behaviors that require a direct (not necessarily face-to-face) interaction with another person. Examples: helping others and backing people up.

Table 1 (cont.)

Cognitive ability: Broadly, any computational, problem solving, or mental abilities.

Personality: Enduring characteristics of the individual that are tied to one of the Big Five dimensions: Conscientiousness, Agreeableness, Extraversion, Emotional Stability, and Openness to Experience. A measure may capture smaller aspects of any one dimension but not overlap with another dimension.

Conscientiousness: Dependability, achievement striving, and planfulness.

Extraversion: Sociability, dominance, ambition, positive emotionality, and excitement-seeking.

Agreeableness: Cooperation, trustfulness, compliance, and affability.

Openness to experience: Intellectance, creativity, unconventionality, and broad-mindedness.

Emotional stability: Lack of anxiety, hostility, depression, and personal insecurity.

Structured interview: A structured interview, at the very least, evaluates a response to each question posed to the interviewee (from Huffcutt & Roth, 1998).

Biodata: A measure of background life experiences that is intended to predict future behaviors of the same type.

Note. The structured interview and biodata are predictor measures, but the other variables refer to constructs (and appropriate measures). Also, simple demographics were not treated as biodata, per the definition given here.
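To make the agreement index concrete, the following is a minimal illustrative sketch in Python (not the software actually used in this study; the function name and example ratings are hypothetical) of Cohen's kappa for one categorical code, such as the three-choice managerial item:

    from collections import Counter

    def cohens_kappa(rater1, rater2):
        # Proportion of studies on which the two raters assigned the same code
        n = len(rater1)
        observed = sum(a == b for a, b in zip(rater1, rater2)) / n
        # Chance agreement: summed products of each rater's marginal proportions
        m1, m2 = Counter(rater1), Counter(rater2)
        expected = sum((m1[c] / n) * (m2[c] / n) for c in set(rater1) | set(rater2))
        # Kappa rescales observed agreement relative to chance agreement
        return (observed - expected) / (1 - expected)

    # Hypothetical codes for ten studies: M = managers, N = nonmanagers, B = both
    rater_a = ["M", "N", "B", "M", "N", "N", "B", "M", "N", "B"]
    rater_b = ["M", "N", "B", "N", "N", "N", "B", "M", "B", "B"]
    print(round(cohens_kappa(rater_a, rater_b), 2))  # about .70 for these hypothetical data

Values from such a computation are then interpreted against the Landis and Koch (1977) benchmarks cited above.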
For continuous variables such as the sample size of a study (N), percentage agreement is reported in Table 2. Most of these values are nearly 90%. For sample size, agreement on the exact N recorded was 83%, but recorded values rarely differed by more than 5 cases. The discrepancies often resulted from differences between the N stated in the sample description and the N reported for a correlation matrix after removing some unusable cases. The lowest agreement occurred for the categorization variable that was formed; the creation of this variable and its meaning are explained below.

The estimates above are imperfect because some of the codes are dependent on each other (i.e., a miscode in one place will cause a subsequent miscode). Some attempt was made to evaluate this effect by treating certain disagreements as categorization errors when a measure was not labeled the same way between raters but all other information pertaining to that measure was correct. Codes that were different only because of an earlier categorization error were not treated as a disagreement, but the categorization error itself was tallied. For example, one rater treated a set of supervisory ratings as task performance while the other rater treated them as overall performance, but both raters coded the reliability estimate for that measure accurately from the original study. This was counted as a categorization error but not an error in recording the reliability estimate. The percentage of categorization agreement is shown in Table 2 and represents the times that raters classified measures from primary studies into the same study variables used here. The percentage provides some indication of how generalizable the coding scheme in this study would be if applied in other meta-analyses.

The percentage is fairly low but is an underestimate of the coding scheme's reliability for two specific reasons, apart from the general inexperience of the raters in conducting meta-analyses. First, the classification of certain variables determined how later variables could be classified (e.g., a job dedication variable incorrectly coded as task performance could not be coded correctly as either citizenship or job dedication). This phenomenon could not be evaluated because one categorization error could lead to one or more other categorization errors, depending on the actual disagreement. Second, the percentage does not include agreement on decisions of omission, when both raters decided not to include certain variables from primary studies. The percentage agreement would rise for every variable that both raters decided was too different from the definitions in Table 1 to be included. Additionally, it should be stated that virtually no estimate of interrater reliability is free from bias; two raters could demonstrate excellent consistency but be "wrong" if they make the same errors. With regard to the percentage agreement for r specifically (i.e., how many times raters recorded numerical correlation values in the same way), the percentage obtained was good but not as high as one would expect, given that the numbers are listed in tables and errors are less related to differences in judgment between raters than to transcription problems.
However, a substantial percentage of the coding disagreements were due to the undergraduate recording the wrong sign in four studies for "neuroticism" correlations, which run opposite in direction to "emotional stability." That is, the correlations were coded as the same variable and were of the same magnitude but of opposite signs. If these trivial errors are excluded, since they can most likely be eliminated with additional training, agreement rises to 93%.

Table 2
Interrater Agreement of Coded Data

Kappa:
  Managerial   κ = .58
  Design       κ = .29
  Broad        κ = .52
  Subjective   κ = .43

Percentage of agreement:
  # of Samples   89.3%
  N              83.3%
  Categorize     73.6%
  Reliability    89.5%
  Type           89.0%
  r              86.9%

Note. Managerial = 3-choice item about whether the sample included managers, nonmanagers, or both. Design = study design (predictive or concurrent). Broad = subjective assessment about criterion relevance of each measure. Subjective = subjectivity of each measure. # of Samples = number of independent samples in the study for which data were coded. N = study sample size. Categorize = times correlations were associated with the same variable labels. Reliability = numerical reliability estimate. Type = reliability index used. r = numerical correlations between variables in the primary study.

For these 29 studies, disagreements were discussed between the coders and resolved. One study was excluded because it was not clear whether the performance variable fit cleanly into any of the categories. Due to resource constraints, and because the key variables showed relatively good agreement, I coded the remaining studies. The Design, Broad, and Subjective codes were not used in later analyses due to low agreement and because they are not related to the hypotheses. They were coded as a precautionary measure for using post hoc tests to explain anomalous cases.

To enhance the reliability and generalizability of the classification decisions, previous sources in the literature were used during the remainder of the coding process. I used John's (1990) "Big Five" taxonomy that maps the subfacets of popular measures such as the Jackson PRF, NEO-PI, and Hogan PI into the five categories. When reasonable, I used coding rules from previous meta-analyses that contained data at the level of the primary study (e.g., Cortina et al., 2000). However, previous work was not relied on when different conceptual definitions were used or when overall job performance was the sole criterion.

For the second stage of coding, the author used the O*NET online database to obtain a rating of job complexity for jobs included in primary studies. It provides the "specific vocational preparation (SVP) range" from the Dictionary of Occupational Titles (DOT) (U.S. Department of Labor, 1991). SVP is defined as "the amount of lapsed time required by a typical worker to learn the techniques, acquire the information, and develop the facility needed for average performance in a specific job-worker situation" (Appendix C of the DOT) and is measured with a 9-point scale² ranging from "short demonstration" to "over 10 years." Because O*NET only references actual jobs, this procedure was
In the end, only biodata was deemed usable as too few studies provided correlations related to structured interviews. The author and three graduate students in 1/0 psychology who were familiar with terms and definitions but blind to the study hypotheses provided ratings of the biodata measures used in 15 studies that reported correlations related to Hypothesis 12 (i.e., with task or citizenship). Raters decided what percentage of the measure assessed the hypothesized predictors: cognitive ability and personality (excluding extraversion because it was not hypothesized to have a significant relationship to either task or citizenship performance). Raters were told that the percentages did not need to sum to 100 and that all other things being assessed in the biodata should be attributed to the remaining percentage. Because Hypothesis 12 makes a higher level distinction between biodata that primarily assess cognitive ability, primarily personality, or both, the information provided by raters was recoded to fit the broader categories rather than specific personality dimensions. That is, the percentages for the four personality dimensions were aggregated into a composite percentage representing the amount of biodata assessing personality overall. This resulted in a percentage estimate of cognitive ability and of personality assessed by biodata provided by each rater. Across the four raters and two percentage values (cognitive and personality) for all studies, interrater reliability (i.e., the intraclass 2 Often, a range was given (e.g., “below 4”). The lowest possible number was used. 58 correlation) was .89, and all raters agreed that one study did not provide enough information to be coded. This level of reliability was determined to be very acceptable since agreement between coders of subjective variables like methodological quality tends to produce low agreement (Hattie & Hansford, 1984; Viswesvaran, Ones, & Schmidt 1996) In additibn to the planned coding processes described above, I coded all studies for range restriction based on the sample setting. The “study design” variable was originally coded to provide an index of range restriction for applied samples but it was too confusing because measures were often administered at different times even though the design was concurrent, it had a low interrater agreement, and it did not necessarily mean that a given measure was, in fact, range restricted. A new code was assigned to each study (1 = restricted range, 0 = unrestricted range) based on whether a sample’s participants were explicitly selected to meet some threshold level (directly or indirectly) on one of the study variables (i.e., either predictors or job perforrrrance). Although this decision was subjective, the end result essentially was that only samples drawn fiom the general public or job applicants were counted as unrestricted, making the code fairly clear. Job incumbents, successful performers, and college students were examples of samples considered to be restricted. Restriction here was viewed simply as an influence that would attenuate observed correlations representing the true relationship between variables. Meta—analytic Procedure I generally followed the strategy developed by Hunter and Schmidt (1990; with refinements detailed in Hunter & Schmidt, 2004) but supplemented analyses with the 59 multivariate method proposed by Becker and colleagues (Becker, 1992; 1996; 2000; Kalaian & Raudenbush, 1996; Raudenbush, Becker, & Kalaian, 1988). 
The multivariate framework, as well as traditional decision rules, was used to manage multiple levels of dependency in the data that would otherwise violate basic statistical assumptions. "Although all [primary study] design features lead to dependence among study outcomes, the nature of the dependence depends on exactly what comparisons are computed and the metric(s) in which they are expressed" (Becker, 2002, p. 501). At least three levels of dependence were evident in this database: multiple measures of the same study variable or its subfacets within a sample (e.g., altruism and courtesy measures forming the interpersonal facilitation variable, a facet of citizenship, or general cognitive ability measured with both verbal and numerical tests); multiple correlational effect sizes between different variables within the same study (e.g., the correlation between cognitive ability and task performance and the correlation between biodata and task performance); and correlations reported for more than one sample (e.g., employees in two separate organizations). Figure 3 shows these levels of the data structure that contribute to meta-analytically derived estimates.

I began by using the Hunter and Schmidt (2004, pp. 429-442) criteria for cumulating findings within studies and samples. When multiple measures of the same variable (as defined in this study) were used, I computed a composite correlation using Equation 5-8c in Nunnally and Bernstein (1994) and a corresponding estimate of the composite's reliability (with Equation 7-15) when sufficient information was available. Obviously, results that were believed to be distinct (i.e., task versus citizenship performance) were not combined.

Multiple groups sampled within the same study created a second level of dependence. Subgroups were treated as individual cases in the meta-analysis when their results could be thought of as fully replicated designs, since the results could have been published separately. (Whether this step achieves an acceptable level of independence is relative; unique studies are often treated as independent despite having been conducted by the same researchers with the same measures.) If the grouping variable was not related to a moderator in this study (e.g., race or gender), total group correlations were used, as recommended by Hunter and Schmidt (2004).

A third level of dependence existed because primary studies typically provided data for more than one relationship between variables relevant to this meta-analysis. Multivariate techniques for modeling dependencies between correlations within the same study were applied after correlations were corrected for artifactual variance. The specific procedure is described after the next section on statistical corrections.

Ultimately, 196 studies provided usable data. Initial statistics for the mean sample-size-weighted correlation of observed effect sizes were computed. Although many of the resulting statistics are important primarily for their use in subsequent computations, one should consider the mean weighted correlations and their confidence intervals. When the confidence interval does not include zero, the effect size is regarded as statistically significant (cf. Hedges & Pigott, 2001), as is the case with typical confidence intervals. According to Whitener (1990), a confidence interval "reflects the effect of sampling error" and is therefore applied to sample-size-weighted mean effect sizes that have not been corrected for research artifacts (p. 316).
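As an illustration of the bare-bones cumulation step just described, the sketch below (illustrative only; the correlations and sample sizes shown are hypothetical, and the function names are my own) computes the sample-size-weighted mean correlation and the kind of confidence interval Whitener (1990) describes:

    import math

    def weighted_mean_r(rs, ns):
        # Sample-size-weighted mean of observed correlations
        return sum(r * n for r, n in zip(rs, ns)) / sum(ns)

    def confidence_interval(rs, ns, z=1.96):
        rbar = weighted_mean_r(rs, ns)
        # Weighted observed variance of correlations around the mean
        var_obs = sum(n * (r - rbar) ** 2 for r, n in zip(rs, ns)) / sum(ns)
        # Standard error of the mean effect size across k studies
        se = math.sqrt(var_obs / len(rs))
        return rbar - z * se, rbar + z * se

    rs = [0.12, 0.25, 0.31, 0.18, 0.22]  # hypothetical observed correlations
    ns = [120, 85, 200, 150, 60]         # hypothetical sample sizes
    print(weighted_mean_r(rs, ns))       # weighted mean r
    print(confidence_interval(rs, ns))   # 95% confidence interval

If the resulting interval excludes zero, the mean effect would be regarded as statistically significant in the sense described above.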
[Figure 3. Multilevel structure of the meta-analytic database: study variables, samples within studies, measures within samples, and effect sizes between measures.]

Corrections for Artifactual Variance

Artifacts can systematically alter the magnitude of observed relationships and inhibit theory testing by introducing additional variance into estimates cumulated across studies (Paese & Switzer, 1988; Viswesvaran et al., 1996). Hunter and Schmidt (2004, p. 76) describe 10 "potentially correctable" study artifacts that alter observed correlations, but the traditional practice in validation research has been to correct only for sampling error, measurement unreliability (typically in the criterion), and sometimes for range restriction (Hunter & Schmidt, 2004; Raju, Burke, Normand, & Langlois, 1991). Nonetheless, the appropriateness of any correction depends on how well it models the error that is assumed to be distorting observed results; the correction of artifactual variance is more of an art than a set procedure. In this study, I simultaneously corrected for two types of artifacts, the dichotomization of variables and measurement unreliability, using formulas provided by Hunter and Schmidt (2004).

Measurement unreliability. While "a thorough investigation of the criterion domain ought to include an examination of the reliability of dimensions of job performance" (Viswesvaran et al., 1996, p. 557), there has been some debate about the accuracy and meaningfulness of different reliability indices (e.g., Murphy & DeShon, 2000; Schmidt, Viswesvaran, & Ones, 2000). Reliability estimates were most often provided by studies in the form of internal consistency (alpha) but sometimes as test-retest or interrater correlations. Each form has advantages and limitations (Schmidt & Hunter, 1996; Viswesvaran et al., 1996), but none completely captures the concept of reliability, and the use of any one inevitably leads to some amount of over- or under-correction (Cortina, 1993; Cronbach & Shavelson, 2004; Hunter & Schmidt, 1990; Murphy & DeShon, 2000). Thus, one must make "a best guess" about a measure's reliability based on reported information.

Alpha was most commonly reported in the studies included here but is an inappropriate reliability estimate when 1) distinct dimensions are rated (Nunnally & Bernstein, 1994) and/or 2) a measure contains error from its items and from a rater's judgment (Hunter & Schmidt, 2004), as in supervisor ratings of performance based on a dimensional form. In the first situation, I used other estimates of reliability (e.g., test-retest correlations) and not alpha, because alpha underestimates true reliability there. In the second situation, however, I used alpha if more appropriate estimates were not available and the rating was not clearly dimensional. Alpha is an overestimate of true reliability in those situations and will cause correlations to be undercorrected for the influence of artifactual variance. While the estimated ρ will contain more error, overall conclusions will be more conservative³ and more likely to produce Type I errors in moderator detection (because spurious error variance will be retained in the distribution of corrected correlations) than if some error variance were mistakenly attributed to measurement unreliability and removed.

³ The mean meta-analytic correlation will be biased downward but will retain meaningful variance due to the multidimensionality of a measure along with the additional error variance.
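The unreliability piece of that correction reduces to the classical disattenuation formula. The following is a minimal sketch, assuming reliability estimates for both measures are in hand (the values shown are hypothetical, and the dichotomization correction also applied in this study is omitted here):

    import math

    def disattenuate(r_obs, rxx, ryy):
        # Correct an observed correlation for predictor (rxx) and criterion (ryy) unreliability
        return r_obs / math.sqrt(rxx * ryy)

    # Hypothetical case: observed r = .20, predictor alpha = .80, criterion interrater reliability = .60
    print(round(disattenuate(0.20, 0.80, 0.60), 3))  # prints 0.289

Note how a reliability estimate that is too high (such as alpha in the rater-plus-items case above) shrinks the correction, which is the sense in which alpha leads to undercorrected, conservative estimates.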
Also, it is better to correct a sample correlation partially than to make no correction, because failing to apply a correction procedure consistently to all studies will contribute to variance between study estimates, confusing the results of moderator analyses. Although an adequate description of the measurement method was not always provided, I attempted to use the most appropriate reliability index reported. For example, if a measure consisted of multiple dimensions and had low internal consistency but high test-retest reliability, I used the latter. When a single construct was measured with a set of written questions and internal consistency was low but interrater reliability was high, I used the internal consistency, under the premise that raters cannot produce reliable true scores using an unreliable tool.

Next, I borrowed estimates of reliability from other published sources when a primary study did not report the observed reliability of a measure. For example, some estimates for different kinds of performance ratings were taken from the meta-analysis of reliability by Viswesvaran et al. (1996). Although full reliability information was not available in the end, there was enough to warrant individual corrections rather than the use of artifact distributions. Finally, I used mean substitution to estimate reliability for measures when no relevant estimate could be located in the literature. Imputation was conducted so that corrections would be applied to the data uniformly. As long as the mean value is somewhat accurate, the remaining correlations will increase by the same amount as the correlations with reported reliability estimates, preserving observed variance (i.e., not creating additional variance by correcting some correlations and not others). The resultant set of reliability estimates was used to correct correlations for predictor and criterion unreliability in order to estimate the theoretical relationship between each of the study variables (Hunter & Schmidt, 2004; Orwin & Cordray, 1985; Salgado, 1998).

Range variation. Range restriction, or variation, across studies is also important to consider because it alters the magnitude of correlations and can create artifactual variance in observed correlations. A variable measured within a limited range will attenuate the maximum possible correlation that can be obtained, and a variable measured with a range larger than is found in the real world can inflate correlations. When studies are differentially affected by the effects of range variation, with some correlations being attenuated and others inflated, range variation will cause artifactual variance in the distribution of observed correlations. Despite these problems, corrections for range variation should only be made when one can accurately model the error created by differences in ranges across studies. If one cannot make proper assumptions about how true correlations are being distorted by range variation, it is impossible to remove the distortion accurately. I did not consider range variation to be correctable in this study for a number of reasons. In their meta-analysis, Organ and Ryan (1995), too, felt that they could not accurately specify what a normal range of variation would be on OCB measures.
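For context only, the standard correction for direct range restriction that was judged inapplicable here looks like the sketch below; it requires the ratio u of the restricted to the unrestricted standard deviation, which is exactly the information the primary studies rarely reported (all values shown are hypothetical):

    import math

    def correct_direct_range_restriction(r_obs, u):
        # Thorndike Case II: u = SD(restricted) / SD(unrestricted), with 0 < u <= 1
        return (r_obs / u) / math.sqrt(1 + r_obs ** 2 * (1 / u ** 2 - 1))

    # Hypothetical case: observed r = .20 in an incumbent sample whose SD is 70% of the applicant SD
    print(round(correct_direct_range_restriction(0.20, 0.70), 3))  # about .28

Without a defensible value of u for each sample, applying such a formula would simply trade a known attenuation for an unknown distortion, which is the rationale for leaving range variation uncorrected in this study.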
Range variation is often removed through the use of artifact distributions, by identifying a common reference group to which all studies should be calibrated. In this database, studies came from diverse settings, including the general public, university students, job applicants, and job incumbents in different kinds of organizations. As a result, I could not assume a useful hypothetical range for a common reference group, precluding the use of artifact distributions as the basis for range restriction corrections.⁴ Corrections for range variation could still be made on a case-by-case basis, but only when primary studies provide sufficient information about the range of the variables measured (e.g., selection ratios or turnover rates). Few of the studies in this database included such information. Fewer than five studies included measures actually used in selection.

⁴ Sackett and Ostgaard (1994) found small differences between job-specific applicant pools and national samples. Yet, there is no immediate evidence that university applicants are similar to job applicants.

Still another method of correction is to use the variance associated with different variables as an indicator of range effects. Variance estimates for measures were provided fairly often in primary studies. However, this method of correction is only appropriate when studies use the same measures (cf. Raju, Pappas, & Williams, 1989), which was not the case here. Furthermore, no method allows a clear way to estimate the combined effect of (direct and indirect) range variation on multiple correlations reported for different variables in the same study, especially when some relationships might be attenuated and others enhanced. For example, a study predicting the performance of job incumbents or successful workers may be range restricted on task performance but range enhanced on citizenship behaviors, if citizenship is related to helping the organization by participating in the research study.

Although range restriction and other types of artifacts typically affect the magnitude and variation of effect sizes observed in research, sampling error appears to account for the bulk of artifactual variance, especially when the sample sizes in primary studies are small (Koslowsky & Sagie, 1994), accounting for more than 70% in some studies (Schmidt et al., 1993). In conclusion, this meta-analysis adopts the perspective that no correction is better than a poor one.

The correlations corrected for statistical artifacts were then cumulated across studies, weighted by sample size and the size of the artifact correction, to produce an estimate of the population correlation and its variance. The estimate of variance was used to assess the presence of moderators. As all of the analyses here were conducted on fewer than 60 studies, the power and accuracy of most methods for identifying moderators are relatively low but comparable (Sagie & Koslowsky, 1993). Because no test is definitive, I used three approaches, two of which are recommended by Hunter and Schmidt (2004). First, I applied Hunter and Schmidt's (1990) "75% rule" of thumb to compute the percentage of variance explained by artifacts (i.e., error). However, I lowered the threshold to 60%, as recommended by others (Colquitt, LePine, & Noe, 2000; Hom, Caranikas-Walker, Prussia, & Griffeth, 1992; Koslowsky & Sagie, 1994; Mathieu & Zajac, 1990), since correlations were not adjusted for range variation.
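A minimal sketch of how that percentage can be computed, covering only the sampling-error component of artifactual variance (the fuller procedure also credits variance to the unreliability and dichotomization corrections); the values below are hypothetical:

```python
def percent_variance_artifactual(rs, ns):
    """Percentage of the observed variance in correlations attributable
    to sampling error alone: the core quantity behind the 75%/60% rule
    (Hunter & Schmidt)."""
    rbar = sum(n * r for r, n in zip(rs, ns)) / sum(ns)
    # Sample-size-weighted observed variance of the correlations.
    var_obs = sum(n * (r - rbar) ** 2 for r, n in zip(rs, ns)) / sum(ns)
    # Expected sampling-error variance for a correlation of rbar.
    var_err = sum(n * (1 - rbar**2) ** 2 / (n - 1) for n in ns) / sum(ns)
    return 100 * var_err / var_obs

# If artifacts explain at least 60% of the variance, moderators are
# judged unlikely under the relaxed rule used here.
rs = [0.10, 0.25, 0.05, 0.30]
ns = [80, 150, 60, 200]
print(round(percent_variance_artifactual(rs, ns), 1))
```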
When 60% of the observed variance was due to sampling error, measurement error, and variable dichotomization, evidence for the presence of a true moderator was judged to be small. Second, I calculated an estimate of the true range of population correlations with credibility intervals. When these are large or overlap 0, they too suggest the presence of moderators (Hunter & Schmidt, 1990; Tett & Meyer, 1993). Koslowsky and Sagie (1993) provide guidelines about how big an interval should be before it indicates the presence of a moderator: roughly larger than .11. Third, I computed the Q statistic and its chi-square value to determine whether the amount of observed variance was larger than would be expected by chance. This method also allows significance tests to be computed for hypothesized moderators: the Q values for meta-analyses conducted on subgroups based on a moderator are compared to the total-group Q with a chi-square test (where the degrees of freedom equal the number of subgroups minus 1). Together, these qualitative (Hunter & Schmidt, 2004) and quantitative (Cooper, 1998) comparisons allowed me to determine whether it was likely that moderators were present and whether hypothesized moderators explained observed variance in correlations.

Multivariate Meta-analysis Procedure

A single meta-analysis provides information about the relationship between two variables. However, researchers are often interested in examining the larger pattern of relationships between predictors and multiple outcomes that reflects a realistic phenomenon. The same is true for this study, where the relationships of key interest involve multiple dimensions of performance. Unfortunately, when multiple outcomes are reported by primary studies (e.g., correlations between predictors and both task and citizenship performance), a level of dependence is created in the data that most likely "affects Type I error levels in complicated ways" (Becker, 2000, p. 503). At least a few analytic approaches have been employed to deal with this dependence, some more valid than others (Becker, 2000; Hunter & Schmidt, 2004). One practice is to combine multiple correlations reported for the same sample into a single effect size, but at the cost of losing information. This practice is particularly limiting when distinct constructs are combined into a less meaningful, broader unit (e.g., overall performance), and it was not a viable option here because the distinction between different criteria was the central research goal. Others have conducted a series of meta-analyses for each pair of variables under consideration. A single meta-analysis will provide the best estimate of a correlation based on the primary studies cumulated, and multiple separate meta-analyses will provide good estimates of the individual correlations. However, it is often the case that some primary studies report multiple correlational effect sizes. When these effect sizes are interrelated because they come from the same study, they are statistically dependent and provide less information than a set of unique, independent correlations.
Thus, a study that contributes correlations to a series of meta-analyses provides some redundant information, and this information common across meta-analyses can produce correlated errors in more complex analyses of the data (e.g., linear regression), violating traditional assumptions of independence and increasing the rate of Type I errors (Becker, 2000; Bliese, 2002; Gleser & Olkin, 1994; Kenny & Judd, 1996; Raudenbush et al., 1988). Therefore, one must account for dependence between the units of analysis in order to compare the magnitude of two correlations from two separate meta-analyses when some of the data overlap because some studies contributed correlations to both. As an aside, dependence between studies is less problematic if the variables studied in separate meta-analyses are unrelated (Becker, 2000). In this study, there was no prior expectation that the relationship between the relevant outcomes of task and citizenship performance would be small or large. If the two dimensions are substantially related, it is necessary to consider the dependence between primary studies/samples before comparing various correlations in linear models.

Because H3, H4, H11, and H12 specifically predict that certain variables will have higher validities with one performance dimension or the other, I attempted to model dependence in the database using a relatively new multivariate method of meta-analysis described by Becker and colleagues (Becker, 1992, 2000; Raudenbush et al., 1988). This multivariate method models the dependence between outcomes reported for the same sample by treating data structures as the meta-analytic cases (i.e., units of analysis) rather than individual correlations. "A fully multivariate approach should provide justifiable tests of significance for more complex questions than can be addressed using the ad hoc or univariate approaches described above, and more accurate probability statements for all tests conducted" (Becker, 2000, p. 505). The Becker method is quite versatile in that it allows pooled correlations to be calculated even when different studies contribute different effect sizes (Raudenbush et al., 1988). So, the data structure comprising each case may look very different, because any sample may contribute one correlation or an entire correlation matrix to the analysis. The method, however, requires as input the covariance between every pair of effect sizes reported in a primary study. In other words, a study reporting an ability-task performance correlation and a conscientiousness-task performance correlation would also need to provide the correlation between ability and conscientiousness. When studies did not provide this necessary information, they were either excluded from the analysis or retained by borrowing relevant estimates from other literature (Raudenbush et al., 1988) or by mean imputation (Becker, 2000). I set an arbitrary cutoff such that a sample was retained only if it required less than 20% of its correlations to be imputed; therefore, a sample must have contributed at least six correlations before it was even eligible for imputation. After correlations were corrected for artifactual variance using the Hunter and Schmidt (2004) procedure, they were used as input in the multivariate analysis.⁵
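The following Python sketch illustrates the two core pieces of machinery such an analysis needs: the large-sample covariance between two correlations computed on the same sample (the kind of quantity Becker's equations supply; the formula below is the Olkin and Siotani, 1976, result), and generalized least squares pooling across studies that report different subsets of correlations. This is a simplified stand-in for the SAS/IML program in Appendix E, not a reproduction of it, and all example values are hypothetical.

```python
import numpy as np

def cov_two_correlations(rho, s, t, u, v, n):
    """Large-sample covariance between sample correlations r_st and r_uv
    computed on the same n cases (Olkin & Siotani, 1976). rho is the
    assumed population correlation matrix. Setting (u, v) = (s, t)
    recovers the familiar sampling variance (1 - rho^2)^2 / n."""
    r = rho
    term1 = 0.5 * r[s, t] * r[u, v] * (r[s, u]**2 + r[s, v]**2
                                       + r[t, u]**2 + r[t, v]**2)
    term2 = r[s, u] * r[t, v] + r[s, v] * r[t, u]
    term3 = (r[s, t] * r[s, u] * r[s, v] + r[s, t] * r[t, u] * r[t, v]
             + r[u, v] * r[s, u] * r[t, u] + r[u, v] * r[s, v] * r[t, v])
    return (term1 + term2 - term3) / n

def gls_pool(r_vectors, cov_matrices, design_matrices):
    """Fixed-effects GLS pooling: rho_hat = (sum X'WX)^-1 (sum X'Wr),
    with W the inverse of each study's covariance matrix. The design
    matrix encodes which of the pooled correlations a study reports, so
    studies may contribute anything from one correlation to a full
    correlation matrix."""
    p = design_matrices[0].shape[1]
    xtwx, xtwr = np.zeros((p, p)), np.zeros(p)
    for r, psi, x in zip(r_vectors, cov_matrices, design_matrices):
        w = np.linalg.inv(psi)
        xtwx += x.T @ w @ x
        xtwr += x.T @ w @ r
    pooled_cov = np.linalg.inv(xtwx)
    return pooled_cov @ xtwr, pooled_cov

# Example: two pooled correlations; study 1 reports both, study 2 only
# the first.
r1, x1 = np.array([0.30, 0.20]), np.eye(2)
psi1 = np.array([[0.004, 0.001], [0.001, 0.005]])
r2, x2 = np.array([0.25]), np.array([[1.0, 0.0]])
psi2 = np.array([[0.009]])
est, _ = gls_pool([r1, r2], [psi1, psi2], [x1, x2])
print(np.round(est, 3))
```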
I used formulas provided by Becker (2000) to construct a vector of correlations for each study and a corresponding variance-covariance matrix modeling the interdependence between the sampling errors of those correlations. (It was necessary to add Hunter and Schmidt's (2004) correction factor to Becker's (2000) equations 4 through 6 for computing the variances and covariances between correlations reported in a study.) I then used the generalized least squares, fixed-effects approach (Becker, 1992; Raudenbush et al., 1988) to compute a vector of mean correlations cumulated across samples. These estimates are averages of corrected correlations across samples, weighted by sample size and each sample's variance-covariance matrix for effect sizes. The analysis also produces the pooled variance-covariance matrix used in weighting. To test Hypothesis 12, I used additional formulas provided by Raudenbush et al. (1988) to predict variation in effect sizes from study-level characteristics. These analyses were conducted in SAS/IML (SAS Institute, 2001), and sample syntax is included in Appendix E. (The program would not run in SAS with correlations of 0, so corrected mean correlations equal to 0 at two decimal places were set to 0.00001 for input into SAS.)

⁵ Becker (2000) suggests using Fisher Z-values instead of correlations, particularly for primary samples that are small (n < 100). I used correlations based on justifications (pp. 82-83) included in Hunter and Schmidt (2004), and because the sample size was usually fairly large.

The resulting estimates of population correlations were used to fill in an estimated "ρ-matrix" (Table 7). The ρ-matrix represents the best estimate of each correlation between study variables, as a total set. While researchers have tested the overall fit of models based on meta-analytically derived matrices (e.g., Carr, Schmidt, Ford, & DeShon, 2003; Colquitt et al., 2000; Shaw, Wild, & Colquitt, 2003; Tett & Meyer, 1993), there are a number of conceptual issues that one must address to justify conclusions derived from such analyses (Viswesvaran & Ones, 1995). The most obvious problem concerns estimating error variance in the total sample correctly, because there is no single value for sample size that applies to the entire matrix (even though some have argued for the use of the harmonic mean). Also, the information contained in a meta-analytic matrix is based on pairwise (rather than listwise) deleted correlations and can produce biased or even inestimable results (cf. Darlington, 1990; Kline, 1998; Wothke, 1993). Because the major purpose of this study is to compare relational patterns within the overall model rather than to test the notion that job performance is completely determined by cognitive ability, personality, biodata, and interviews, the statistical significance of an overall model is not tested.

RESULTS

Database Description

From 172 published studies, 195 conceptually "independent" samples and 984 unique correlations were obtained. The full listing of studies is in Appendix F. Table 3 provides a general breakdown of studies by type. Sample sizes ranged from N = 29 to N = 25,327 (for Hough's 1992 meta-analysis, which contributed a conscientiousness-task performance correlation), with an average of 812 subjects. Thus, the average sample size was fairly large.
Although there were about 5 unique correlations associated with each sample on average (after aggregating redundant measures and forming linear composites of measured subfacets), about half of the samples contributed a single correlation to the database. For those samples contributing more than one effect size, the average number of usable correlations reported per study was approximately 9. The majority of samples consisted of employees in nonmanagerial jobs and provided data about real job performance rather than simulated job tasks. Only 15% of the samples included participants who were not explicitly selected on one of the study variables, implying little to no range restriction for those samples.

The samples included in the database covered a wide range of jobs, and also included the general public (in some longitudinal studies) and university students. Because the criteria for study inclusion were broad, there are almost as many different job types as there are samples. Some of the job types included are manufacturing line workers, university administrative staff, hotel staff, agricultural co-op employees, working students enrolled in an MBA program, telemarketers, food service workers, stockpersons, account managers, pulp mill workers, expatriates in a technical company, computer programmers, summer camp workers, pharmaceutical workers, prison guards, and more. Some of the more commonly studied jobs were military soldiers, sales representatives, and insurance agents, often because the same researchers had access to the same organizations.

Table 3
Database Descriptives

Number of independent samples in database         195
Largest sample size                               25,327
Smallest sample size                              29
Average sample size                               812
Total number of unique correlations in database   984
Average number of correlations per sample         5
Number of samples providing only one correlation  97
Managerial                                        11%
Nonmanagerial                                     65%
Indeterminable or includes managers and lower     24%
Sampled in a work setting                         80%
Range restricted                                  85%

Note. The figures for "applied setting" and "range restricted" were based on the author's codes, as described in the Method section.

Regarding the data needed to apply corrections for statistical artifacts, only one study required a correction for variable dichotomization. An "unsuitable discharge" variable representing "a failure to meet minimum behavioral or performance criteria" (McDaniel, 1989, p. 965) was included as task performance; in that sample, 16.5% of the employees were discharged for this reason. Regarding corrections for measurement unreliability, 508 (73%) of 696 possible reliability estimates were obtained from primary studies, test manuals, or other literature reporting statistics on the same measures. The type of reliability estimate differed for each measure in each study, but the alpha coefficient was most commonly reported, followed by interrater reliability. Of the 185 missing values, 53 pertained to samples drawn from published meta-analyses that did not provide specific reliability information about each study. The actual numbers of reliability estimates obtained from the literature and of imputed estimates are listed in Table 4. The imputation process described earlier used the following reliability estimates for performance measures, based on work by Viswesvaran et al. (1996): task (.57), citizenship (.55), job dedication (.55), interpersonal facilitation (.47), overall (.81). These estimates are considerably lower than the mean values obtained in this study. Yet, a supplemental analysis using the mean values in the database (Table 4) instead of the above values produced nearly identical results. The specific number of estimates used in each analysis described below varied, but 23 of the imputed values, associated with 48 interview correlations (column 4 of Table 4), were never used because too few studies were available in the literature (see below for further explanation).

Table 4
Reliability Information for Scales

Scale                Mean   Min.  Max.  Reported    Imputed     W/o
                                        Estimates   Estimates   Interviews
Cognitive            .87    .46   .98   40          27          13
Extraversion         .81    .49   .94   37          6           5
Conscientious        .81    .62   .98   52          19          18
Agreeable            .76    .63   .97   41          8           7
Openness             .77    .55   .91   32          7           6
Emot. Stability      .83    .70   .92   35          8           7
Biodata              .79    .59   .91   15          7           7
Interview            .82    .59   .97   7           23          0
Task Perf            .77    .27   1*    59          23          21
Citizenship Perf     .74    .32   .97   50          17          17
J. Dedication Perf   .83    .29   .99   61          12          12
Interpersonal Perf   .80    .31   .97   67          12          12
Overall Perf         .84    .42   .96   12          16          14
(Total)                                 (508)       (185)       (162)

Note. Mean = mean reliability; Min. = minimum reliability estimate reported; Max. = maximum reliability estimate reported; Reported Estimates = number of reliability estimates obtained from the literature; Imputed Estimates = number of estimates imputed; W/o Interviews = number of estimates imputed, excluding interviews.
*Although the data are likely to contain some error, some objective criteria such as number of sales were assumed to have a reliability of 1, to ensure a conservative correction.

Outlier Analyses

Before proceeding with meta-analytic computations, I checked that the data did not contain any obvious transcription errors (i.e., correlations above 1) and calculated the SAMD values for the observed correlations (Huffcutt & Arthur, 1995) to identify possible outliers. A scree plot of the absolute SAMD values was created for each correlation with at least 20 data points (Appendix G). Outliers can reflect a number of issues, including coding errors, model misspecification, and genuine (if extreme) true scores rather than error. The plots shown here are simply meant to illustrate the distribution of recorded findings rather than to detect cases for removal, since 1) there are not many cases per cell, 2) the point of meta-analysis is to determine whether aberrant findings can be attributed to sampling error, and 3) the SAMD does not indicate the joint effect of multiple outliers within a study. That is, one outlier within a sample that otherwise provides correlations of "good quality" (i.e., non-outliers) is more likely to be a true score than an outlier among many other outliers provided by the same sample, unless the lone aberration is due to a transcription mistake by the original authors or the meta-analytic coders, or to a peculiar influence (e.g., an unreliable scale for that one measure). Most of the plots show a desirable pattern with a plateau at the tail. Although a few plots showed some drops in the middle (e.g., conscientiousness-citizenship and openness-emotional stability), there was never a single point by itself after the initial drop. Therefore, no explainable outliers were found and no cases were removed from the analyses.
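A sketch of the SAMD screening just described, under one common formulation of the statistic (the published formula may differ in detail); the data are hypothetical:

```python
import math

def samd_values(rs, ns):
    """Sample-adjusted meta-analytic deviancy (after Huffcutt & Arthur,
    1995): each correlation is compared with the weighted mean of the
    remaining correlations and scaled by its expected sampling error."""
    out = []
    total_n = sum(ns)
    for i, (r_i, n_i) in enumerate(zip(rs, ns)):
        # Sample-size-weighted mean with study i removed.
        rbar = sum(n * r for j, (r, n) in enumerate(zip(rs, ns))
                   if j != i) / (total_n - n_i)
        se = math.sqrt((1 - rbar ** 2) ** 2 / (n_i - 1))
        out.append((r_i - rbar) / se)
    return out

# Scree-style screening: sort |SAMD| in descending order and look for
# isolated points before the plateau.
rs = [0.12, 0.15, 0.55, 0.10, 0.18]
ns = [100, 250, 60, 120, 90]
for v in sorted((abs(s) for s in samd_values(rs, ns)), reverse=True):
    print(round(v, 2))
```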
Overview of Meta-analytic Results

The number of samples (k) and total sample sizes (N) for each relationship are presented above the diagonal in Table 5, where each cell represents an individual meta-analysis. The correlations above the diagonal are sample-size-weighted mean observed correlations, uncorrected for statistical artifacts. Again, 95% confidence intervals were applied to the uncorrected mean correlations; values bolded in the original table have an interval that does not include 0, meaning that the effect is statistically significant. The correlations⁶ below the diagonal are means of correlations that have been corrected for the two statistical artifacts mentioned earlier. Also presented below the diagonal are 80% credibility intervals indicating the estimated true range of the corrected correlations.

⁶ The mean corrected correlation (rc) is often labeled ρ. I stray from convention because the label rc is more informative, indicating how the estimate was derived, and because I attempt to derive "better" estimates of ρ with a multivariate meta-analysis. The multivariate method is conceptually superior but did not necessarily produce more accurate estimates, because of practical limitations in the data (see Discussion).

A single correlation was found for five cells: biodata-job dedication, biodata-interpersonal, interview-job dedication, citizenship-job dedication, and citizenship-interpersonal. With the exception of its relationships to cognitive ability and task performance, only 2 studies with small samples contributed to the meta-analytic estimates for interviews. As a result, interviews were excluded from the remainder of the analyses, since this meta-analysis would not be able to provide conclusions beyond those made in past primary studies and meta-analyses of employment interviews. There were few studies providing information about biodata as well, but enough to permit at least preliminary meta-analytic findings. Two of these studies (Hough et al., 1990; McHenry et al., 1990) used the Assessment of Background and Life Experiences (ABLE). The developers of the ABLE (Hough et al., 1990) intended to measure "temperaments," so the ABLE might be viewed as a personality measure. However, it was classified as biodata, per this study's definitions, because of the way it measures individual differences through past experience rather than through preferences or intentions (as other personality tests can do).

Cognitive ability was weakly related to personality overall, with the highest mean corrected correlations being .19 with openness and .16 with emotional stability. In contrast, intercorrelations between the personality variables were higher than expected (e.g., compared to Hough, 1992), with the average rc weighted by sample size equal to .38. In this dataset, emotional stability demonstrated the strongest links to the other personality variables.

There were few studies measuring overall job performance, due to the criteria for inclusion (requiring at least one dimensional measure of performance) and my research aims. The data that were collected failed to show the typically strong relationship between cognitive ability and overall performance, even though this estimate was based on 10 studies with a decent total sample size (N = 8,009). These results do not contradict the large body of literature on the validity of general cognitive ability. They simply suggest that studies measuring overall performance in addition to specific performance dimensions will, for some reason, find lower validities. On the other hand, the results resembled past findings of personality validities (e.g., Barrick et al., 2001; Salgado, 1998), but were generally of smaller magnitude. Conscientiousness produced the largest mean corrected correlation (.20).
In this set of studies, overall performance was related to all performance dimensions as expected, but slightly more strongly (p < .01) with citizenship (rc = .65) than with task performance (rc = .41).

More detailed statistics for the uncorrected and corrected correlations of particular interest in this study are included in Table 6. Most of the mean confidence intervals excluded zero (typically for mean values above .06). At the same time, most of the credibility intervals for the mean corrected correlations were quite large, justifying a search for moderators. Sampling error explained at least 60% of the observed variance in corrected correlations for just 7 relationships (Table 6). (The reason some of the estimated %Var values in Table 6 exceed 100 is most likely second-order sampling error, since most of those estimates involve a small number of studies.) Even so, complete homogeneity across the database correlations was not expected, given the different types of samples included, the broadly defined constructs, and the small numbers of studies (k) in some pairwise analyses, which lowered the statistical power of the moderator tests.

Based on the 60% rule, openness demonstrated the most stable estimates with the various measures of performance, but the effect sizes were essentially null: the mean corrected correlations were all below .10 except with overall performance. Emotional stability also produced small but homogeneous correlations with job dedication (rc = .06) and overall performance (rc = .07), replicating meta-analytic findings in Hurtz and Donovan (2000). The correlation between extraversion and task performance was also small (.03) but homogeneous. The Q values and their p values generally supported the same conclusions as those supported by the size of the credibility intervals.

Table 5
Meta-analytic Correlation Matrix for Job Performance and Performance Predictors

                   1                2                3                4                5
(1)  Cognitive     -                .00 (-.07, .07)  .05 (.01, .09)   .00 (-.05, .06)  .16 (.10, .23)
                                    14, 4571         28, 11208        14, 6686         14, 5029
(2)  Extraversion  .01 (-.16, .18)  -                .22 (.16, .28)   .24 (.17, .31)   .26 (.20, .33)
                                                     34, 9624         30, 8943         28, 8356
(3)  Conscientious .06 (-.09, .21)  .28 (0, .56)     -                .38 (.32, .44)   .10 (.03, .17)
                                                                      38, 17574        30, 9363
(4)  Agreeable     .00 (-.14, .15)  .30 (-.04, .64)  .49 (.17, .81)   -                .15 (.09, .22)
                                                                                       28, 8356
(5)  Openness      .18 (0, .36)     .33 (.07, .58)   .12 (-.21, .45)  .19 (-.11, .50)  -
(6)  E. Stability  .16 (-.06, .38)  .38 (.06, .69)   .62 (.27, .96)   .50 (.14, .87)   .17 (-.02, .37)
(7)  Biodata       .27 (.06, .48)   .31 (.18, .44)   .37 (.29, .45)   .37 (.30, .44)   .44 (.13, .75)
(8)  Interview     -                -                -                -                -
(9)  Task          .28 (0, .56)     .03 (-.05, .12)  .09 (.01, .18)   .04 (-.04, .13)  .02 (-.05, .08)
(10) Citizenship   .29 (.06, .51)   .06 (-.11, .22)  .20 (.09, .32)   .16 (.04, .28)   .06 (-.01, .13)
(11) Dedication    .09 (0, .18)     .00 (-.14, .14)  .17 (.03, .31)   .15 (.04, .26)   .06 (.06, .06)
(12) Interpersonal .04 (-.06, .14)  .03 (-.05, .12)  .13 (.02, .24)   .20 (.10, .31)   .01 (.01, .01)
(13) Overall       .11 (.04, .18)   .01 (-.15, .16)  .20 (.09, .31)   .06 (-.05, .16)  .11 (.11, .11)
Table 5 (continued)

                   6                7                8                9                10
(1)  Cognitive     .14 (.07, .22)   .23 (.14, .32)   .20 (.14, .27)   .19 (.13, .25)   .22 (.15, .28)
                   15, 8226         9, 16610         18, 6048         30, 42107        15, 17430
(2)  Extraversion  .31 (.24, .39)   .25 (.16, .34)   .18 (.17, .19)   .03 (-.02, .08)  .05 (0, .11)
                   29, 8774         4, 1010          2, 1148          14, 2651         20, 4425
(3)  Conscientious .50 (.43, .56)   .27 (.24, .31)   .18 (.15, .21)   .08 (.06, .10)   .15 (.12, .18)
                   34, 17821        8, 6429          2, 3625          21, 38787        25, 15425
(4)  Agreeable     .40 (.32, .47)   .28 (.24, .32)   .20 (0, .41)     .03 (-.01, .07)  .12 (.08, .16)
                   31, 16110        6, 5978          2, 437           13, 3919         20, 6143
(5)  Openness      .14 (.09, .19)   .34 (.13, .56)   .13 (.05, .22)   .02 (-.04, .08)  .05 (0, .09)
                   28, 9563         4, 859           2, 620           10, 1367         15, 2930
(6)  E. Stability  -                .35 (.30, .40)   .24 (.16, .32)   .08 (.05, .11)   .08 (.05, .11)
                                    8, 6697          2, 1010          11, 10323        15, 11782
(7)  Biodata       .45 (.34, .57)   -                .16 (.02, .30)   .12 (.09, .15)   .21 (.16, .26)
                                                     2, 1038          15, 44904        10, 14200
(8)  Interview     -                -                -                .16 (.11, .21)   .28 (.18, .39)
                                                                      6, 7493          4, 827
(9)  Task          .11 (.03, .19)   .17 (.05, .29)   -                -                .42 (.34, .50)
                                                                                       48, 22276
(10) Citizenship   .10 (.01, .20)   .30 (.16, .44)   -                .49 (.04, .95)   -
(11) Dedication    .06 (-.02, .15)  -                -                .39 (-.06, .84)  -
(12) Interpersonal .06 (-.04, .17)  -                -                .35 (-.07, .78)  -
(13) Overall       .07 (0, .15)     .26 (.13, .39)   -                .41 (.15, .67)   .65 (.35, .94)

Table 5 (continued)

                   11               12               13
(1)  Cognitive     .08 (.02, .13)   .04 (-.01, .09)  .09 (.06, .13)
                   5, 2501          6, 3118          10, 8009
(2)  Extraversion  .01 (-.07, .08)  .03 (-.03, .08)  .00 (-.08, .08)
                   9, 2026          10, 2534         9, 1941
(3)  Conscientious .12 (.07, .18)   .10 (.05, .14)   .16 (.10, .23)
                   12, 4272         12, 4713         9, 1941
(4)  Agreeable     .11 (.06, .16)   .15 (.10, .19)   .04 (-.02, .11)
                   11, 4205         12, 4713         8, 1584
(5)  Openness      .05 (-.01, .11)  .01 (-.03, .04)  .09 (.04, .14)
                   6, 967           8, 1742          7, 1076
(6)  E. Stability  .05 (-.02, .13)  .06 (-.01, .12)  .06 (0, .12)
                   5, 924           7, 1699          7, 1390
(7)  Biodata       .25 (n/a)        .30 (n/a)        .20 (.11, .28)
                   1, 116           1, 368           4, 6020
(8)  Interview     .36 (n/a)        .18 (.11, .25)   .22 (.17, .26)
                   1, 47            3, 366           3, 349
(9)  Task          .36 (.26, .46)   .32 (.24, .40)   .29 (.20, .39)
                   24, 8168         28, 9720         14, 9701
(10) Citizenship   -                -                .54 (.39, .69)
                                                     10, 3547
(11) Dedication    -                .60 (.56, .65)   .55 (.45, .65)
                                    57, 17360        11, 3432
(12) Interpersonal .72 (.47, .96)   -                .48 (.32, .63)
                                                     10, 3316
(13) Overall       .65 (.42, .87)   .59 (.28, .91)   -

Note. Italicized variables in the original are job performance dimensions. Information above the diagonal includes the mean weighted correlation (r), the 95% confidence interval in parentheses, the number of studies (k), and the total sample size (N) for that estimate across samples. Information below the diagonal includes the mean weighted correlation corrected for artifacts (rc) and the 80% credibility interval in parentheses. Confidence intervals bolded in the original exclude 0.

Finally, estimates of the true population correlation, ρ, were calculated from the corrected correlations in each study according to the multivariate procedure described in the Method section. I estimated the matrix represented by the model in Figure 1 using 661 correlations taken from 115 independent samples. As stated earlier, I needed the full set of intercorrelations between all variables (relevant to this meta-analysis) studied within each sample, though studies could contribute different numbers and types of correlations to the analysis. Based on the 20% cutoff rule I chose, it was necessary to impute one value using the mean value of that correlation in the total database. The main strength of this multivariate approach to meta-analysis, at least in theory, is that comparisons of relational patterns will be more accurate because they are weighted by the variance-covariance matrix representing the dependencies among the correlations within each sample. A caveat is that the studies contributing data to the final estimates must still be representative of the true population of relevant studies.
When certain types of studies are excluded from or are overrepresented in the analysis, there is always the potential for introducing bias and creating model misspecification (Raudenbush et al., 1988).

Figure 4 shows the estimates of ρ produced by weighting corrected correlations by the sample variance-covariance matrices. [Figure 4, a path diagram of the multivariate estimates of ρ between the predictors and the performance dimensions, appeared on page 91 of the original; its rotated text did not survive extraction.] The paths depicted were generally similar to the results produced by the separate meta-analyses in Table 6. However, biodata validities were considerably higher, and extraversion became a weak but noticeable predictor of citizenship. Also, the magnitude of the emotional stability validities rose slightly, while cognitive ability became a weaker predictor of citizenship performance. Finally, some of the intercorrelations between predictors were unpredictably high (Table 7), with eight larger than .95, which warrants some caution in interpreting these results; see the supplemental analyses below for further explanation. [Table 7, the estimated population correlation matrix based on the multivariate meta-analysis, appeared on page 92 of the original and is likewise not recoverable.]

Table 6
Meta-analytic Results for Pairs of Study Variables

Study                  k    r    95% CI       rc   SDrc  80% CV       %Var  Q

Performances
Task - Citizenship     48   .42  (.34, .50)   .49  .36   (.04, .95)   2     24332.01
Task - Dedication      24   .36  (.26, .46)   .39  .35   (-.06, .84)  3     9012.43
Task - Interpersonal   28   .32  (.24, .40)   .35  .33   (-.07, .78)  4     9765.78
Dedication - Interp.   57   .60  (.56, .65)   .72  .19   (.47, .96)   5     8965.43

Overall Performance
Task                   14   .29  (.20, .39)   .41  .20   (.15, .67)   5     1523.21
Citizenship            10   .54  (.39, .69)   .65  .23   (.35, .94)   4     788.41
Job Dedication         11   .55  (.45, .65)   .65  .18   (.42, .87)   6     878.89
Interpersonal          10   .48  (.32, .63)   .59  .25   (.28, .91)   4     503.19

Task Performance
Cognitive              30   .19  (.14, .25)   .28  .22   (0, .56)     3     2679.13
Extraversion           14   .03  (-.02, .08)  .03  .07   (-.05, .12)  62    31.83
Conscientiousness      21   .08  (.06, .10)   .09  .07   (.01, .18)   15    255.86
Agreeableness          13   .03  (-.01, .07)  .04  .07   (-.04, .13)  56    29.45
Openness               10   .02  (-.04, .08)  .02  .05   (-.05, .08)  79    18.20**
Emotional Stability    11   .08  (.05, .11)   .11  .06   (.03, .19)   34    41.57
Biodata                15   .12  (.09, .15)   .17  .10   (.05, .29)   7     462.19

Citizenship Performance
Cognitive              15   .22  (.15, .28)   .29  .18   (.06, .51)   4     604.64
Extraversion           20   .05  (0, .11)     .05  .12   (-.11, .22)  30    107.84
Conscientiousness      25   .15  (.12, .18)   .20  .08   (.09, .32)   3     128.02
Agreeableness          20   .12  (.08, .16)   .16  .10   (.04, .28)   36    82.02
Openness               15   .05  (0, .09)     .06  .06   (-.01, .13)  72    34.04
Emotional Stability    15   .08  (.05, .11)   .10  .07   (.01, .20)   4     74.33
Biodata                10   .21  (.16, .26)   .30  .11   (.16, .44)   10    223.36

Note. k = number of samples; r = uncorrected weighted average correlation; 95% CI = confidence interval around r; rc = corrected weighted average correlation; SDrc = standard deviation of rc; 80% CV = credibility interval around rc; %Var = percentage of rc variance explained by study artifacts; Q = homogeneity statistic. %Var values bolded in the original support homogeneity under the 60% rule. *p < .01; **p < .05. Dashes are values not estimated.

Table 6 (continued)

Study                  k    r    95% CI       rc   SDrc  80% CV       %Var  Q

Job Dedication Performance
Cognitive              5    .08  (.02, .13)   .09  .07   (0, .18)     40    17.06
Extraversion           9    .01  (-.07, .08)  .00  .11   (-.14, .14)  35    44.94
Conscientiousness      12   .12  (.07, .18)   .17  .11   (.03, .31)   28    84.94
Agreeableness          11   .11  (.06, .16)   .15  .09   (.04, .26)   39    45.33
Openness               6    .05  (-.01, .11)  .06  .00   (.06, .06)   117   9.50**
Emotional Stability    5    .05  (-.03, .13)  .06  .07   (-.02, .15)  67    12.90**
Biodata                1    .25  -            .31  .00   -            -     -

Interpersonal Performance
Cognitive              6    .04  (-.01, .09)  .04  .08   (-.06, .14)  37    21.98
Extraversion           10   .03  (-.03, .08)  .03  .07   (-.05, .12)  56    27.93
Conscientiousness      12   .10  (.05, .15)   .13  .09   (.02, .24)   38    46.28
Agreeableness          12   .15  (.10, .19)   .20  .08   (.10, .31)   40    47.64
Openness               8    .01  (-.03, .04)  .01  .00   (.01, .01)   144   9.10**
Emotional Stability    7    .06  (-.01, .12)  .06  .08   (-.04, .17)  48    23.42
Biodata                1    .30  -            .48  .00   -            -     -

Overall Performance
Cognitive              10   .09  (.06, .13)   .11  .05   (.04, .18)   36    41.52
Extraversion           9    .00  (-.08, .08)  .01  .12   (-.15, .16)  32    41.81
Conscientiousness      9    .16  (.10, .23)   .20  .09   (.09, .31)   47    3.74
Agreeableness          8    .04  (-.02, .11)  .06  .08   (-.05, .16)  53    25.59
Openness               7    .06  (.00, .12)   .07  .06   (0, .15)     70    15.12*
Emotional Stability    7    .09  (.04, .14)   .11  .00   (.11, .11)   153   7.78**
Biodata                4    .20  (.11, .28)   .26  .10   (.13, .39)   9     8.80

Cognitive Ability
Extraversion           14   .00  (-.06, .06)  .01  .14   (-.16, .18)  20    119.04
Conscientiousness      28   .05  (.01, .09)   .06  .12   (-.09, .21)  21    208.76
Agreeableness          14   .01  (-.05, .06)  .00  .11   (-.14, .15)  2     112.96
Openness               14   .15  (.09, .21)   .18  .14   (.00, .36)   17    134.47
Emotional Stability    15   .13  (.05, .21)   .16  .18   (-.06, .38)  -     278.68
Biodata                9    .23  (.14, .32)   .27  .16   (.06, .48)   -     507.52

Other Predictors
Extrav - Conscient     34   .22  (.16, .28)   .28  .22   (.00, .56)   10    2874.98
Extrav - Agreeable     30   .24  (.17, .31)   .30  .26   (-.04, .64)  7     3946.74
Extrav - Openness      28   .26  (.20, .33)   .33  .20   (.07, .58)   11    2916.71
Extrav - Emot Stab     29   .31  (.24, .39)   .38  .25   (.06, .69)   6     1686.15
Extrav - Biodata       4    .25  (.16, .34)   .31  .10   (.18, .44)   36    19.09
Consc - Agreeable      38   .38  (.32, .44)   .49  .25   (.17, .81)   -     3521.78
Consc - Openness       30   .10  (.03, .17)   .12  .26   (-.21, .45)  -     1585.42
Consc - Emot Stab      34   .50  (.43, .56)   .62  .27   (.27, .96)   -     9525.02
Consc - Biodata        8    .27  (.24, .31)   .37  .06   (.29, .45)   34    45.20
Agree - Openness       28   .15  (.09, .22)   .19  .24   (-.11, .50)  9     771.51
Agree - Emot Stab      31   .40  (.32, .47)   .50  .28   (.14, .87)   3     4356.39
Agree - Biodata        6    .28  (.24, .32)   .37  .05   (.30, .44)   34    28.24
Open - Emot Stab       28   .14  (.09, .19)   .17  .15   (-.02, .37)  16    408.05
Open - Biodata         4    .34  (.13, .56)   .44  .24   (.13, .75)   9     97.13
Emot St - Biodata      8    .35  (.30, .40)   .45  .09   (.34, .57)   16    91.14

In conclusion, cognitive ability, conscientiousness, and biodata were the best predictors of the two performance dimensions based on both the pairwise and the multivariate meta-analyses. All predictors were related to biodata. The specific study hypotheses are evaluated in the following sections based on these results and some additional analyses.

Hypothesis 1: Moderation by Job Type

Hypothesis 1 (H1) predicts that the correlation between citizenship performance and noncognitive predictors will vary depending on whether jobs are managerial and/or sales related versus other types of jobs. Sample-level codes for managerial (vs. lower) and sales (vs. other) jobs were assigned during the initial coding phase. Of the 152 samples to receive managerial codes, 26 consisted primarily or completely of managerial jobs, based on the information provided in the primary studies. Of the 140 samples to receive sales codes, 19 consisted primarily or completely of sales jobs.
H1 was based on the assumption that citizenship behaviors are a central part of both managerial and sales jobs, and that this common confound causes the moderation. Thus, H1 was tested using a single dichotomous category distinguishing the 45 managerial and/or sales job samples from the 99 other job samples. The first moderator analysis included the Big Five as the "noncognitive" predictors, while the second included the Big Five and biodata.

Fifty-six correlations involving the Big Five and citizenship were available in 16 samples; seven of these samples involved managerial / sales jobs. Because multiple correlations provided by a study for the personality dimensions are dependent, I aggregated results within studies, either by forming a linear composite correlation when the intercorrelations between personality dimensions were available or by averaging the correlations (the average is the lower bound of the linear composite). The results of this analysis are shown in Table 8.

Table 8
Tests of the Moderating Effect of Job Type

Grouping                          k     N       rc    SDrc   Q        % Var
Personality-Citizenship
  Managerial / Sales              7     2924    .16   .03    11.24    80.2
  Other                           9     10466   .31   .06    58.79    17.9
  Combined sample                 16    13390   .29   .08    127.91   15.6
  Qb                              57.88*
Personality & Biodata-Citizenship
  Managerial / Sales              8     3501    .14   .05    19.87    52.9
  Other                           14    23079   .26   .08    184.14   7.9
  Combined sample                 22    26580   .25   .09    248.24   9.7
  Qb                              44.26*

Note. k = number of samples used in analysis; N = total sample size for analysis; rc = mean corrected correlation; SDrc = standard deviation of rc; Q = homogeneity statistic; Qb = difference between the total Q for the combined sample and the sum of the Qs for the subgroups; % Var = percentage of variance due to statistical artifacts. *p < .01

The mean corrected correlation differed between the two groups but, contrary to H1, was weaker for managerial / sales jobs (.16 versus .31 for other jobs). Relative to the combined sample, SDrc was reduced in both subgroups, meaning that the credibility intervals became smaller. The between-groups Qb of 57.88 was statistically significant at p < .01. Together, the results of these moderator tests support managerial / sales job type as a moderator of the personality-citizenship relationship, but in the opposite direction of that hypothesized. Also, the percentage of variance explained by artifacts was 80% for managerial / sales jobs but did not change much for "other" job types. The results of the second analysis, after adding biodata as a noncognitive predictor, are similar (Table 8), except that a smaller percentage of the variance in mean corrected correlations was attributable to statistical artifacts in both groups as well as in the total group. This is not surprising, since biodata typically differ from personality measures and may assess some part of cognitive ability to a greater degree. In conclusion, H1 was not supported, but there was evidence to support managerial / sales job type as a moderator.
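Two computational details of this analysis can be sketched briefly: the unit-weighted linear composite used to aggregate dependent personality correlations within a study, and the between-groups Q used to test the moderator. The composite and intercorrelation values below are hypothetical, while the Q example reproduces the first panel of Table 8.

```python
import math

def composite_validity(r_xy, r_xx):
    """Correlation between a unit-weighted composite of m predictors and
    a criterion: the sum of the predictor-criterion correlations divided
    by the square root of the sum of all entries in the predictor
    intercorrelation matrix (including the 1s on the diagonal)."""
    m = len(r_xy)
    denom = math.sqrt(sum(r_xx[i][j] for i in range(m) for j in range(m)))
    return sum(r_xy) / denom

def between_groups_q(q_total, q_subgroups):
    """Qb: the total-group homogeneity statistic minus the sum of the
    subgroup Qs, compared with chi-square on (subgroups - 1) df."""
    return q_total - sum(q_subgroups)

# Three personality scales correlating .20, .15, .10 with citizenship,
# with intercorrelations of .30 among themselves.
r_xy = [0.20, 0.15, 0.10]
r_xx = [[1.0, 0.3, 0.3], [0.3, 1.0, 0.3], [0.3, 0.3, 1.0]]
print(round(composite_validity(r_xy, r_xx), 3))   # composite correlation
print(round(sum(r_xy) / len(r_xy), 3))            # lower-bound average
print(round(between_groups_q(127.91, [11.24, 58.79]), 2))  # 57.88, as in Table 8
```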
It should be noted that these results could be biased, because the set of correlations reported for one subgroup was not necessarily related to the same personality dimensions as the correlations reported for the other subgroup; aggregating correlations across personality dimensions could mask a confound between job type and personality dimension. (Ideally, all studies would have provided correlations between all relevant variables and this confound would be controlled.) If some personality dimensions tend to produce correlations of very different magnitude from other dimensions, and studies of one job type tend to report correlations for a particular set of personality dimensions, the results would incorrectly support a moderator. As a hypothetical example, personality studies of managers might measure extraversion more often than studies of automotive line workers, which themselves might measure conscientiousness more often. If conscientiousness is a better predictor of citizenship than extraversion regardless of job type, effect sizes would show variance supporting moderation, but not for the suspected reason of job type.

A closer examination, however, suggests that job type does moderate relationships here. Table 9 shows the percentage of correlations contributed to the subgroups by each personality dimension.

Table 9
Percentage of Correlations From Each Personality Dimension

                       Manager / Sales   Other
Extraversion           16                22
Conscientiousness      37                24
Agreeableness          26                22
Openness               11                16
Emotional Stability    11                16

The data for the managerial / sales group were composed of a greater number of conscientiousness and agreeableness correlations, and these correlations were higher than those for the other personality variables. The opposite compositional pattern holds for the non-managerial / sales group. Therefore, one would expect managerial / sales jobs to show stronger personality-performance correlations under the assumption that job type was confounded with the dimensions of personality measured. I found the opposite pattern of results (Table 8), supporting job type, rather than the type of personality dimension measured, as the moderator.

Hypothesis 2: Moderation by Citizenship Dimension

Hypothesis 2 (H2) predicts that distinguishing interpersonal facilitation from job dedication in measures of citizenship will produce two unique patterns of relationships with other variables. One approach for evaluating this would be to group correlations between citizenship and other variables based on some estimate of how much the citizenship measure captures either subdimension. In essence, that is what the correlations for job dedication and interpersonal facilitation in Tables 5 and 6 represent. During the coding process, raters attempted to categorize performance correlations into interpersonal facilitation or job dedication; correlations were categorized into the broader variable of citizenship when they did not clearly assess one dimension or the other. This coding process effectively created four categories to which a sample could be assigned based on the citizenship correlations reported: job dedication, interpersonal facilitation, both job dedication and interpersonal facilitation, and overall citizenship assumed to measure job dedication and interpersonal facilitation to some unknown degree.

A traditional moderator analysis comparing the four types of samples could not be conducted due to small subgroup sizes. If mutually independent categories were created, there would be just four studies for job dedication and nine for interpersonal facilitation. Twenty-six studies reported data for both citizenship dimensions separately, but these would have to be aggregated to preserve independence between effect sizes, resulting in the loss of crucial information. Consequently, I decided to evaluate H2 in a more qualitative fashion, by examining meta-analyses using citizenship performance (in Table 6) as compared with the results of meta-analyses using job dedication and interpersonal facilitation separately, as has been done in past research (e.g., Hurtz & Donovan, 2000).
Because the majority of the studies examined provided correlations for both subdimensions of citizenship, I treated the correlations as if they were obtained from a single sample (using the smaller N between the relevant cells in Table 5) and conducted two-tailed t-tests of the difference in correlations by dimension, for each predictor and for task performance. Table 10 provides the results of these tests.

Table 10
Simple Comparisons of Job Dedication and Interpersonal Facilitation Effect Sizes

Correlate             Job Ded. (rc)   Interpersonal (rc)   N      t      p
Cognitive             .09             .04                  2501   3.43   0.001
Extraversion          .00             .03                  2026   1.84   0.066
Conscientiousness     .17             .13                  4272   3.66   0.000
Agreeableness         .15             .20                  4205   4.60   0.000
Openness              .06             .01                  947    2.10   0.036
Emotional Stability   .06             .06                  924    0.00   1.000
Task Performance      .39             .35                  8168   5.87   0.000

Note. N is the smaller sample size of the two groups.
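The thesis does not spell out the exact t-test formula, so as a hedged illustration, here is the closely related z-test on Fisher-transformed correlations that is used elsewhere in this document (see footnote 8 below). Because it treats the two correlations as independent, it is a conservative approximation and will not reproduce the t values in Table 10 exactly; the example values are hypothetical.

```python
import math
from statistics import NormalDist

def fisher_z_difference(r1, r2, n):
    """Approximate two-tailed test of the difference between two
    correlations based on the same N, ignoring their dependence (which
    makes the test conservative when the correlations covary
    positively)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(2.0 / (n - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

z, p = fisher_z_difference(0.30, 0.20, 500)
print(round(z, 2), round(p, 3))
```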
There was a significant difference for five of the seven variables at p < .05. Looking at the actual correlations, however, the magnitude of the difference was quite small in most cases. Given that these estimates were corrected for measurement unreliability in both variables, there may not be a practically meaningful difference in validities across the two types of citizenship dimensions. In addition, I examined the change in credibility interval size between overall citizenship and the subdimensions. The interval shrank by more than 50% for cognitive ability when subdimensions were used. For openness, the intervals shrank to essentially 0 when analyzed by subdimension. The rest of the correlations had intervals that were consistently large regardless of the performance measure.

Another very important piece of information to consider is the correlation between the citizenship dimensions. The estimated population correlation for the two dimensions is very high (rc = .72) but not to the point of complete overlap, especially considering that this estimate was corrected for some artifactual variance. Also, this estimate is probably biased upward due to common method variance, since both measures were often subscales of the same instrument. Nonetheless, a strong relationship was expected, since the dimensions are both indicators of citizenship in theory. The important issue is whether the two constructs are so strongly related that they are functionally redundant. To conclude, H2 could not be tested directly, but the pattern of results from the individual tests suggests that even if there is some statistical support for H2, the distinction between job dedication and interpersonal facilitation is not practically meaningful. Examining effect sizes based on the proportion of interpersonal facilitation or job dedication measured does not appear to improve validity estimates substantially. These results are similar to those produced by earlier studies (i.e., Conway, 1999; Hurtz & Donovan, 2000).

Hypotheses 3 and 4: Differential Prediction Patterns for Performance Dimensions

Although no hypothesis was formed about it, due to a lack of consistent findings in earlier studies, the correlation between task and citizenship performance is one of the most interesting findings and useful contributions of this study. Finding evidence that the two dimensions are distinct is a prerequisite for testing H3 and H4, as well as H7 through H12 later. The two dimensions were moderately correlated, but this estimate was unstable (rc = .49; SDrc = .36). Accordingly, past researchers have varied substantially in their estimates of this relationship. Nevertheless, this finding strongly suggests that the relationship is substantial in many cases and should not be ignored; it appears to be nonzero based on the confidence and credibility intervals. From the approach of understanding a construct through its nomological network, evaluations of H3 and H4 should shed more light on this issue.

H3 specifically predicts that task performance will be more strongly related to cognitive ability than to personality. The data associated with cognitive ability and each personality predictor in Tables 5 and 6 are based on different subsets of primary studies in the database. Thus, the multivariate estimates shown in Figure 4 and Table 7 provide the best evaluation of this prediction (refer to the Method section about testing linear models), although the results from the separate meta-analyses (Table 6) are expected to show some general convergence. Cognitive ability is a stronger predictor of task performance (ρ = .27, rc = .28)⁷ than any single personality dimension. Conscientiousness and emotional stability produced statistically significant (p < .05) but weak relationships. The mean effect size across personality dimensions is less than .10, using either the ρ or the rc estimates. There is no direct way to test the significance of the difference in relationships given the various sample sizes used in the calculations, but the difference is clearly substantial and, presumably, not caused by sampling error, since the total N for the separate meta-analyses was typically very large.

⁷ Again, these are estimates of the same relationship, but rc is more commonly reported while ρ is more theoretically sound.

H4 predicts that citizenship will be more strongly related to personality than to cognitive ability. The results pertaining to individual personality dimensions varied somewhat depending on whether the pairwise or the multivariate estimates are considered. Extraversion and openness produced the lowest mean corrected correlations (both .06), but emotional stability and openness produced the lowest ρ's after controlling for within-study dependencies. Nonetheless, cognitive ability was the best predictor of citizenship performance in both types of analyses, although the ρ (but not rc) for conscientiousness is equally strong. The effect size of citizenship with overall personality (treated as a class of measures rather than as a unitary construct) is about .12, based on either the mean ρ or the mean rc across personality dimensions. Cognitive ability, on the other hand, produced a larger correlation with citizenship (ρ = .16, rc = .29), thereby failing to support H4.
Hypotheses 5 and 6: Moderation by Job Complexity

Hypotheses 5 and 6 concern the possible moderation of the correlations between cognitive ability and the dimensions of job performance by job complexity. I had originally intended to use codes indexing job complexity obtained from the O*NET database, but it did not provide the necessary data to a sufficient degree. O*NET lacked codes for many of the jobs studied, including military jobs and nonspecific jobs classified as "residual jobs." In all, just 61 codes were obtained. Moreover, low-level jobs all received a code of "Below 4.0" and were not distinguishable from each other. As a result, I created a dichotomous moderator indexing high vs. low job complexity by splitting the obtained O*NET SVP scores: scores below 6 were recoded as low complexity, while scores of 6 or higher became high complexity.

Of the 33 studies providing the relevant correlations for testing H5 and H6 (i.e., cognitive ability-task performance and cognitive ability-citizenship performance), 2 did not provide enough information about their samples and 7 did not have SVP codes to be recoded. For the samples still missing codes, I assigned job complexity codes through a rating process. Seven graduate students in psychology, including the author, read a brief description of the samples. The raters were asked to consider the total range of jobs encountered in the research literature (e.g., toll booth operator to medical physician) and coded the current samples as either high (1), low (0), or indeterminate. These data are included in Appendix H. Treating "indeterminate" ratings as missing data, the interrater reliability (i.e., the KR-20 coefficient) was .94, and the average kappa value across all rater pairs was .49. The average kappa is typically a good approximation of multi-rater agreement indices like Light's kappa (Conger, 1980). Three studies showed 100% agreement across the raters. Two studies (Hedge & Teachout, 1992, and Ree, Earles, & Teachout, 1994) each received two indeterminate ratings. For the former study, 4 of the 6 remaining ratings were low complexity; for the latter, all 6 remaining ratings were high complexity. Given the relatively good agreement reached, the mean rating across raters was used as the job complexity code.

Contrary to Hypothesis 5, none of the moderator tests (Table 11) supported the notion that job complexity moderates the correlation between cognitive ability and task performance. It would be wrong, however, to accept this null hypothesis and conclude that job complexity is definitely not a moderator.

Table 11
Tests of the Moderating Effect of Job Complexity

Grouping                k     N       rc    SDrc   Q         % Var
Cognitive-Task
  Complex jobs          8     7533    .24   .13    99.74     11.2
  Less-complex jobs     14    29431   .21   .27    2285.89   0.9
  Combined sample       22    36964   .21   .25    2385.64   1.4
  Qb                    .26
Cognitive-Citizenship
  Complex jobs          6     3069    .09   .07    20.19     40.4
  Less-complex jobs     6     13412   .37   .13    288.74    4.5
  Combined sample       12    16481   .31   .16    512.91    4.5
  Qb                    203.98*

Note. k = number of samples used in analysis; N = total sample size for analysis; rc = mean corrected correlation; SDrc = standard deviation of rc; Q = homogeneity statistic; Qb = difference between the total Q for the combined sample and the sum of the Qs for the subgroups; % Var = percentage of variance due to statistical artifacts. *p < .01

Contrary to H6, the results (Table 11) do support job complexity as a moderator of the cognitive ability-citizenship performance correlations. The credibility intervals were smaller for the subgroups, as indicated by the SDrc values, and Qb was statistically significant. The difference in mean corrected correlations between the subgroups was fairly large, with cognitive ability being a good predictor of citizenship in less-complex jobs and a very weak predictor in complex jobs.

Hypotheses 7 through 10: Specific Predictor-Criterion Relationships

The next set of hypotheses predicted various types of relationships between the predictors and the job performance dimensions. The relevant statistics for this section can be found in Tables 5, 6, and 7.
Tests of statistical significance are tied to whether 0 is included in 1) the 95% confidence interval for the uncorrected mean correlation and 2) the 80% credibility interval for the corrected correlation (or for ρ). For all tests of Hypotheses 7 through 10, the conclusions implied by the confidence intervals agreed with those implied by the credibility intervals, and the interval test results corresponded to tests of each ρ in the multivariate analysis. (An exact test of ρ and its pooled standard error could not be computed with the current multivariate method, because not every study provided a correlation for every study variable.)

Although significance testing can aid interpretation, specific tests do not seem as meaningful for meta-analytic studies, because the point estimates are derived (under the fixed-effects model) after accounting for first-order sampling error. It seems more relevant to consider the magnitude of a correlation and whether it is practically significant. In the absence of more specific standards, one can always refer to Cohen's (1977) criteria: a small (but still meaningful) effect size is at least .10, a medium one at least .30, and a large one .50 or higher.
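A sketch of the two interval tests described above, assuming the usual fixed-effects constructions (a confidence interval for the mean built from the standard error of the mean, and a credibility interval built from the residual standard deviation of the corrected correlations):

```python
import math

def confidence_interval_95(rbar, sd_r, k):
    """95% CI for the mean observed correlation: uncertainty about the
    mean itself (does the average effect differ from 0?)."""
    se = sd_r / math.sqrt(k)
    return rbar - 1.96 * se, rbar + 1.96 * se

def credibility_interval_80(rc, sd_rho):
    """80% credibility interval around the corrected mean: the estimated
    spread of population correlations after artifactual variance is
    removed (is there a distribution of true effects?)."""
    return rc - 1.28 * sd_rho, rc + 1.28 * sd_rho

# Example: the task-citizenship cell of Table 6 (rc = .49, SDrc = .36)
# gives roughly the (.04, .95) credibility interval reported there
# (the small difference reflects rounding).
lo, hi = credibility_interval_80(0.49, 0.36)
print(round(lo, 2), round(hi, 2))
```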
In contrast, strong support was found for Hypothesis 9, as the ρ for agreeableness was .17 with citizenship but just .05 with task performance. The difference between these correlations was also statistically significant (z = 3.98, p < .01). Hurtz and Donovan (2000) found similar results, in that agreeableness showed higher correlations with interpersonal facilitation than with task performance. To examine the possibility that agreeableness influences citizenship specifically through interpersonal facilitation, I ran a supplemental multivariate meta-analysis to make valid comparisons. The results did not support this notion; the ρ's for agreeableness with both citizenship facets were about .22.

Hypothesis 10 was not supported: openness was not significantly related to task or citizenship performance. In summary, conscientiousness, agreeableness, and extraversion were fair predictors of citizenship performance, while none of the personality dimensions were good predictors of task performance.

Hypothesis 12: Prediction of Biodata Linked to Constructs

Hypothesis 12 predicts that the degree to which biodata assess personality and cognitive ability will determine their validity with task and citizenship performance. To make comparisons between variables using different sets of studies, it was necessary to account for dependence between reported effect sizes, since some studies reported both biodata-task and biodata-citizenship correlations while others reported one or the other. Consequently, I estimated the mean corrected correlations of biodata with task and citizenship performance, weighting correlations by each sample's variance-covariance matrix according to the multivariate method described previously. These data were then analyzed in a generalized least squares regression model using the "proportion" (of constructs assessed) code to predict variation in the magnitude of correlations across studies (Raudenbush et al., 1988).

The biodata "proportion" codes assigned to relevant samples in the third stage of coding (described earlier and included in Appendix I) refer to the proportion of an entire biodata measure that assesses cognitive ability when testing H12A, and to the proportion of biodata that assesses personality (excluding extraversion) when testing H12B. (These proportion values, rather than some arbitrary cutoff for whether biodata "primarily" assessed one construct or another, were used to test the hypotheses in order to avoid losing information through dichotomization.) I then analyzed the 14 eligible studies (total N = 28,500) providing 29 correlations, after mean-imputing three values for the correlation between task and citizenship performance.

The analyses involved regressing the task performance-biodata and citizenship-biodata correlations on 1) the proportion of biodata assessing cognitive ability and 2) the proportion of biodata assessing personality. Significance (z-) tests using standard errors from the pooled variance-covariance matrix were conducted to examine whether the regression slopes in this model were nonzero. (Therefore, there is no single N associated with the significance tests, just a pooled estimate of the variance for each effect size.) The results are as follows.

In predicting task performance with biodata, the regression results support H12A. The more a biodata measure assesses cognitive ability relative to other factors, the stronger its validity becomes: the correlation increases by .014 for every unit increase in the proportion assessing cognitive ability.
This slope value of .014 is small but statistically significant (z = .014 / .00159 = 9.05, p < .01). Interestingly, the proportion ratings indicate that cognitive ability was never associated with more than 20% of a biodata measure.

In predicting citizenship performance, the results fail to support H12B. The proportion of biodata assessing personality had a statistically significant slope value of -.0003 (z = 5.04, p < .01), but the effect was too small to be of practical importance. Thus, H12B was not supported in any meaningful sense. Hypothesis 12C could not be tested because there were only a few samples, and none used biodata that even came close to measuring cognitive ability and personality in equal proportions. Despite the absence of a clear link between predictor constructs and job performance, the biodata validities estimated here are relatively high, particularly for citizenship (ρ = .41).

Supplemental Analyses

I ran some supplemental analyses to check the sensitivity of the various meta-analytic results. First, I investigated how the two largest studies affected overall estimates. These studies were not considered to be true outliers, but they were more influential in deriving a cumulated estimate. Hough (1992) had a total sample size of 25,327 for estimating the correlation between conscientiousness and task performance in her meta-analysis, and Brown, Stout, Dalessio, and Crosby (1988) estimated the correlation between biodata and task performance with a sample size of 16,230. (All other sample sizes were smaller than 10,000.) When the Hough study is removed from the pairwise meta-analysis for conscientiousness and task performance, the results are very similar to those in Table 6: rc = .18, with a corresponding standard deviation of .07. When the Brown et al. study is removed from its meta-analysis, the results are again very similar to the original findings: rc = .16 and SDrc = .12. In both cases, the significance tests still allow one to conclude that both effects are nonzero.

Next, I compared range-restricted samples to unrestricted (or, more likely, less restricted) samples using the code that I assigned, as described in the Methods section. Based on the justifications for not correcting for range variation explicitly, I expected to see some correlations attenuated more than others. After removing the 25 samples that were considered to be unrestricted, the expected pattern was observed (see Appendix J for the correlation matrix pertaining to restricted samples). Therefore, range restriction is likely to have attenuated some correlations, but in specific ways. Further work is needed to identify in which settings and for which variable relationships attenuation (or enhancement) occurs.

Finally, I investigated discrepancies between the results of the pairwise meta-analyses and the multivariate meta-analysis. The pairwise results are necessarily flawed for the theoretical reasons discussed in the Methods section regarding dependencies between correlations from the same sample. While I attempted to derive more accurate results with the multivariate method, the resulting estimates of population correlations revealed some strange patterns, with eight correlations nearly equal to 1. As the multivariate method does not appear to have been thoroughly tested or applied in the research literature [Footnote 9: The most widely available article from the group, Raudenbush et al. (1988), has been cited just 10 times in the Social Sciences Citation Index as of July 2004.], I explored some possible reasons for discrepancies with the pairwise results.
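Both the H12 slope tests above and the multivariate estimates examined below rest on the same generalized least squares estimator (Raudenbush et al., 1988) implemented in full in Appendix E. The stripped-down sketch below shows only that estimator; the stacked effect sizes, the design matrix, and the (here diagonal) variance-covariance matrix are all hypothetical placeholders, whereas the real S matrix also carries within-study covariances off its diagonal.

proc iml;
  /* GLS meta-regression sketch (cf. the rho and beta computations in
     Appendix E); all numeric values below are hypothetical */
  sigma = {.20, .35, .15, .41};             /* stacked corrected correlations */
  s     = diag({.004, .002, .006, .003});   /* their var-cov matrix (diagonal here) */
  x     = {1 0.0,                           /* column 1: intercept */
           1 0.2,                           /* column 2: a "proportion" code */
           1 0.1,
           1 0.6};
  beta  = inv(x`*inv(s)*x) * x`*inv(s)*sigma;   /* GLS intercept and slope */
  vbeta = inv(x`*inv(s)*x);                     /* var-cov matrix of the estimates */
  ztest = beta / sqrt(vecdiag(vbeta));          /* z-test for each coefficient */
  print beta ztest;
quit;

Replacing the diagonal s with the full pooled variance-covariance matrix is what distinguishes the multivariate method from an ordinary weighted regression.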
It was possible that the discrepancies in results were due to sampling differences between the total sample and the subset of studies that were eligible for input into the multivariate analysis. I computed pairwise results (Table 12) for just the multivariate-eligible subsample, for comparison with the results in Table 6. It appears that some of the larger correlations became inflated to near 1 by the multivariate method, but even these correlations are higher than expected (e.g., given the literature on scale intercorrelations between the Big Five personality dimensions). Given that two very different weighting schemes were used, the overall results do not vary too much, apart from those correlations approaching 1.00 in the multivariate analysis. In any case, this difference does not explain the major discrepancies.

Another cause of the discrepancies could be that I incorporated statistical corrections into the multivariate method, corrections that had not been addressed by the original authors. I used corrected correlations as input and adjusted the computation of variances and covariances by each sample's correction factor (i.e., for measurement unreliability), because the standard error of a corrected correlation is larger than that of its uncorrected counterpart (Hunter & Schmidt, 2004). To evaluate this effect, I reran the multivariate analysis on the corrected correlations without correcting the covariances for attenuation. The results produced essentially the same patterns as the original multivariate analysis, discounting this as a reasonable explanation for the discrepancies.

A third possibility is that certain samples received more weight and biased certain estimates upward. For example, three samples (i.e., one sample from Botwin & Buss, 1989, and two from Collins & Gleaves, 1998) contributed large correlations between extraversion and other personality variables (i.e., conscientiousness, agreeableness, and emotional stability), most of which were larger than .7 before artifact corrections. I ran a multivariate analysis without these three samples, and the corresponding ρ estimates were reduced to a "more reasonable" size (i.e., less than 1). However, further methodological work needs to be done to determine whether 1) the results of those three studies, in this one exploration, were true outliers, 2) the multivariate method of weighting samples is inaccurate, 3) the method cannot handle values based on univariate corrections for measurement unreliability, or 4) sampling error produces large discrepancies between the pairwise and multivariate methods when there are few studies.

Table 12
Pairwise Meta-analytic Estimates for the Multivariate Subsample
[The correlation matrix among the study variables is not legible in this copy.]

DISCUSSION

Review of Research Goals

The main goal of this research was to map the set of relationships between commonly studied and applied individual difference measures and mid-range concepts of job performance that are more detailed than overall performance alone.
This was accomplished through the creation of a meta-analytic matrix estimating the true correlations (or, with credibility intervals, ranges of correlations for multiple populations). The meta-analysis was intended to be comprehensive (within a specified time period) and generalizable, including all types of job samples that could be found in the literature. One drawback of such an approach is that the estimated relationships mostly showed considerable variance, so additional moderator hypotheses had to be evaluated. This contradicts, in part, the proposition that processes related to citizenship behaviors are similar across jobs (Borman & Motowidlo, 1997).

Another major goal was to provide valid comparisons of relationships across meta-analyses to test the central proposition put forth by Motowidlo et al. (1997). Overall, the notion that task performance is predicted best by cognitive ability while citizenship is predicted best by other variables received partial support. Cognitive ability was a generally good predictor of both dimensions. Biodata are also good predictors, but for reasons that are not entirely clear: perhaps because they capture cognitive, attitudinal, or other characteristics, but probably not because they capture personality.

A specific contribution of this meta-analysis that adds to past work is the evidence that task and citizenship performance are related (rc = .49; SDrc = .36). Although this estimate shows a lot of variability, the credibility interval does not include 0, suggesting that the range of population effect sizes, unmoderated, tends to be non-negligible. This has implications for theory but also for practical endeavors like the recent surge of work trying to identify how various predictor-criterion combinations can affect adverse impact on racial minority groups in personnel selection (e.g., Hattrup et al., 1998; Murphy & Shiarella, 1997; Schmitt et al., 1997).

At the same time, the estimated population correlation between task and citizenship performance is not necessarily as high as indicated in Figure 4. Halo error might be causing these strong observed relationships (Conway, 1996). Or, the relationship could be biased upward because of selective sampling in primary studies. The people most likely to be studied are job incumbents who are range restricted on task performance to some degree (i.e., they perform well enough to keep their jobs), but who may be range enhanced if study participation is related to performing more citizenship behaviors and helping out an organization; the "normal" population would be range restricted on citizenship performance compared to experimental samples.

Summary of Findings

Overall conclusions pertaining to the specific hypotheses are included in Table 13, with the specific results summarized below. Unfortunately, there were not many studies available to conduct thorough statistical tests of hypothesized moderators and differential prediction patterns. Small sample sizes are associated with low statistical power and susceptibility to second-order sampling error. Hypotheses 2, 3, and 4 could not be tested directly using traditional statistical methods as a consequence of the small sample sizes, while two others could not be tested at all (H11 and H12C). Additionally, the pattern of results is not always stable.
Therefore, it may be premature to conduct a meta-analysis such as this, despite the fact that smaller meta-analyses have been published on almost every portion of the correlation matrix created here (Table 5).

Table 13
Summary of Conclusions for Hypotheses

H1: Noncognitive predictors will have higher validities with citizenship for managerial/sales jobs. (Not supported)
H2: Citizenship dimensions of job dedication and interpersonal facilitation will produce differential patterns of validity. (Not supported)
H3: Task performance will be related more strongly to cognitive ability than to personality. (Supported)
H4: Citizenship performance will be related more strongly to personality than to cognitive ability. (Not supported)
H5: Job complexity will moderate the relationship between cognitive ability and task performance. (Not supported)
H6: Job complexity will not moderate the relationship between cognitive ability and citizenship performance. (Not supported)
H7: Conscientiousness will be related to task and citizenship performance. (Supported)
H8: Emotional stability will be related more strongly to citizenship than to task performance. (Not supported)
H9: Agreeableness will be related to citizenship performance only. (Supported)
H10: Openness to experience will be related to task performance only. (Not supported)
H11: Interviews will be related to task and citizenship based on what they measure. (Not tested)
H12: Biodata will be related to task and citizenship based on what they measure. (Partially supported)

In spite of everything, the results are useful because they represent the current state of research related to theories of citizenship performance and offer insight about where future research efforts can be focused. Furthermore, these results are not evident simply by surveying the literature or by using simple vote-counting methods based on p-values.

With respect to Hypothesis 1, managerial and sales jobs moderated the validity of noncognitive measures with citizenship performance, but in the opposite direction of that hypothesized. Managerial/sales jobs produced a stable validity (rc) of .16. Although the validity for other jobs was much higher at .31, it was unstable, with only 18% of the variance explained by artifacts. I hesitate to speculate at this point about why managerial/sales jobs produce "lower" validities than other jobs, because it may be the case that one specific grouping of jobs is causing the "other" group to have validities higher than .16. However, the results support the idea that focused work on either managers (e.g., Conway, 1999) or salespeople (e.g., MacKenzie et al., 1991) might not generalize to other settings.

Regarding Hypothesis 2, a direct test could not be conducted due to small sample sizes. Other tests of individual effects suggest that job dedication and interpersonal facilitation produce essentially the same pattern of relationships across cognitive ability, personality, and task performance, and that they are highly correlated with one another. Still, many studies were confounded with common method variance and similar biases because they measured these two citizenship facets with the same instrument and rater. Overall, these findings tentatively support job dedication and interpersonal facilitation as facets of a single citizenship performance construct, allowing for the use of more parsimonious theories.
This also leads to the practical conclusion that there is little need to validate predictors of citizenship separately for the two dimensions, as in Organ and Ryan's (1995) meta-analysis.

As for Hypothesis 3, cognitive ability was the dominant predictor of task performance. This is unsurprising given the evidence for validity generalization and theories about performance. What is surprising is the conclusion for Hypothesis 4: cognitive ability is also one of the best predictors of citizenship, not being outperformed significantly by any of the personality variables, as was hypothesized by Motowidlo et al. (1997). The implication here is that cognitive ability is always useful, and the advantages and disadvantages (e.g., adverse impact) associated with it cannot be avoided. Such conclusions have been made by others more generally (e.g., Sackett, Schmitt, Ellingson, & Kabin, 2001) or based on less compelling empirical evidence (e.g., Hattrup et al., 1998). It is still true that personality predictors can explain variance in job performance, but they are not likely to be good substitutes for cognitive ability, though conscientiousness comes close in predicting citizenship performance.

The results associated with Hypotheses 5 and 6 were puzzling, as they seem to contradict previous literature (e.g., Hunter & Hunter, 1984). Job complexity was not found to moderate cognitive validities specifically with task performance. Still, the moderation found in previous research has pertained mostly to broad measures like overall performance, leaving open the possibility that nontask components of performance produce extra variance that is moderated by job complexity. Again, I cannot offer more than speculation for this null finding.

What is interesting, on the other hand, is the strong finding that job complexity moderates cognitive validities with citizenship. This formally contradicts Hypothesis 6, but that hypothesis was posed in contrast to H5, without much theory to guide it. One possible explanation for this finding is that intelligent people can finish core tasks and immediate responsibilities faster than others when the tasks are low in complexity. This might, in turn, lead to spare time and resources that are used to perform citizenship. For example, a coworker who has already completed his main tasks is more likely to help another than someone who has a backlog of work to finish. Another possible explanation is that complex jobs tend to provide greater opportunities for citizenship, and all employees are expected to perform such behaviors rather than just the ones with high cognitive ability. At some level, everyone may be expected to endorse the organization or to show personal initiative. However, the same effect might be hypothesized for low-level jobs if situational influences were strong enough (e.g., in a Total Quality organization).

Hypotheses 7 through 10, involving direct estimates of correlations, were easier to test than the previous hypotheses. Conscientiousness was the only strong personality predictor of task performance, but it, as well as agreeableness, predicted citizenship relatively well. Neither emotional stability nor openness was a good predictor of either performance dimension, as hypothesized. At the same time, extraversion significantly predicted citizenship performance based on the multivariate results but not the pairwise results.
Together, the findings associated with conscientiousness and agreeableness match the results of previous studies and meta-analyses on OCBs (e.g., Borman et al., 2001). The findings for the other three personality dimensions, however, differed from past findings. One unique aspect of this meta-analysis that might explain some of the discrepancies is that a broader range of measures and settings was included here, whereas past meta-analyses studied specific groups (e.g., applied samples, managers, or salespeople). Generally, it seems plausible that a particular context can determine, at least in part, whether certain personality characteristics are helpful for performing citizenship (or task) behaviors. Employees who work individually are more likely to draw upon conscientiousness to improve their overall performance, whereas employees in a social or team-based atmosphere can improve their contribution to the organization either through being conscientious or through being more interpersonally helpful. Because observed personality validities have varied within and across meta-analyses, more controlled laboratory work may need to be done to isolate specific causal effects.

Hypothesis 11 could not be tested given the data collected, but the results for Hypothesis 12A suggest that the more biodata assess cognitive ability, the more valid they become in predicting task performance. Given the results and explanations for Hypothesis 3, this finding is self-evident. Hypothesis 12B was also supported statistically, but the size of the effect was nominal: the proportion of biodata assessing personality led to only small changes in the correlation between biodata and citizenship. This suggests that characteristics other than personality act through biodata to predict citizenship. [Footnote 10: A review committee member noted that, because there was little variance in the proportion of personality assessed by the biodata measures, range restriction may have produced the null result.] Organ and Ryan (1995) made the strong conclusion that attitudinal variables were more effective in predicting citizenship. A thorough examination of attitudinal variables was beyond the scope of this study and must be relegated to future studies. Additionally, Hunter and Hunter (1984) also found biodata to have good validities across multiple types of performance criteria in their review of various performance predictors, but they argued that the operational validity of biodata might be considerably lower.

Future Directions

There was the general limitation of small sample sizes for many of the analyses here. As a result, I must recommend that additional work be carried out on all aspects of citizenship predictors, since the findings here are not completely consistent with some other reviews, like the meta-analysis of personality and OCBs by Borman et al. (2001). It is important to understand moderators of the different relational patterns across variables under more controlled conditions, as meta-analytic moderator analyses cannot escape certain confounds when study characteristics or statistical artifacts covary with true moderators (Russell & Gilliland, 1995). And the results certainly suggest that there are moderators left to be identified. Specific recommendations related to the research hypotheses evaluated here are as follows.

The reasons why managerial and sales jobs produced more stable cognitive-citizenship validities than other jobs are unclear.
And it seems that either all other jobs produce higher validities on average or, more likely, another group of as yet unidentified jobs produces very strong validities. Clearly, detecting what kinds of moderators do produce stable validities in other jobs would help increase our understanding of citizenship performance processes. Similarly, researchers should attempt to replicate and explain the finding that cognitive ability is related more strongly to citizenship in less complex jobs.

I concluded here that the findings for the job dedication aspect of citizenship were not distinct from findings for the interpersonal facilitation aspect. There is a strong reason to believe that biases like halo or common method variance inflated the relationship between these dimensions to some degree. Future research should try to verify the extent to which this assumption holds. Although individual differences may not predict these two types of performance behaviors differentially, there is still a substantive distinction here from a content validity standpoint, since organizations may be interested in increasing one type of behavior or the other.

The biodata results suggest that similar types of analyses (hopefully on larger data sets) can increase our knowledge about the predictive power of nonconstruct measures, and that such measures may produce higher validities than component construct measures. Biodata are ambiguous and are applied inconsistently across settings, possibly measuring many different things. They are rarely said to be drawing upon distinct constructs, though some scales of the ABLE have been considered measures of personality constructs (Bobko et al., 1999). This seems to have caused some to shy away from theory-based biodata. This study and past work (e.g., Hunter & Hunter, 1984; Schmitt et al., 1997), however, suggest that there are significant practical benefits associated with biodata use in predicting performance. Clearly, there is a need to go beyond this examination of personality and cognitive constructs to determine what other aspects of biodata help to predict outcomes. New findings may actually help others to develop biodata that are more construct oriented. Also, measures of attitudes or situational influences like those studied by Organ and Ryan (1995) may be related to molar measures like biodata and may show more promise for predicting and understanding citizenship behaviors.

Future work can also address other issues that were not investigated here but that have been applied in the past or to other types of performance measures. Researchers have consistently found differences between various types of measurement methods. Meta-analyses conducted by Ford et al. (1986) and by Bommer, Johnson, Rich, Podsakoff, and MacKenzie (1995) show that there are meaningful statistical differences between subjective and objective measures. Podsakoff et al. (2000) suggested that examinations of multiple performance criteria like citizenship and task behaviors might be influenced by rater biases and common method variance. Such influences would artificially inflate correlations.

If the correlation between task and citizenship performance is as high as it is estimated to be here, however, there are implications for the trend of research on reducing adverse impact for racial minority groups by using various weighting schemes of predictors and criteria in selection (e.g., De Corte, 1999; Hattrup et al., 1998; Murphy & Shiarella, 1997).
These schemes are only meaningful inasmuch as the multiple variables entered into a model provide unique information. If measures of task and citizenship performance provide redundant information in practice, these complex methods of predicting performance will be less effective.

Another potential area of research would be the examination of the accuracy of measurement of citizenship behaviors, since they are, almost by definition, more abstract and difficult to notice, especially if considered to be "extra-role" (Turnipseed, 2002). For example, Chen and Francesco (2003) included the item: "Complies with company rules and procedures even when nobody watches and no evidence can be traced." This item begs the question of whether supervisors can or will notice certain acts of citizenship. Lovell et al. (1999) found that men and women received similar ratings for overall performance despite women having a higher likelihood of performing citizenship behaviors. Thus, researchers might obtain more accurate results by accounting for differences in the measurement of task and citizenship performance, apart from statistical considerations (e.g., of reliability).

Finally, I was concerned with the middle-range distinction between task and citizenship performance and, to this end, focused on the salient issues regarding how citizenship is conceptualized and studied. This endeavor was also motivated in part by the large body of more recent work on noncognitive predictors and the assumption that they are primarily beneficial for understanding citizenship. However, there are many different forms of task performance that might moderate the results found here. It is true that cognitive ability already tells us much about task performance, but this meta-analysis suggests that aspects of task performance may be strongly related to citizenship performance. (It also seems that conscientiousness does not predict task performance as well as has been implied by past validation work.)

Although unrelated to theory testing, future methodological and pedagogical efforts are needed to make multivariate meta-analytic methodologies useful. Results of the supplemental analyses offer some explanation for discrepancies between the traditional pairwise approach and the multivariate approach, but neither could be said to be perfectly accurate. From the perspective of testing "sensitivity," the two analyses show what kinds of results can be derived from the data given different sets of assumptions. Consequently, there is still a need to identify the advantages and disadvantages of using the multivariate method with real-world data if users are to understand whether its results are more or less accurate than those produced by more traditional methods.

Limitations

This meta-analysis was limited in several ways, which leaves some of the conclusions ambiguous until additional research is available. First, the number of studies available for estimating each bivariate relationship varied greatly and was quite small in some cases. Where the number of samples (k) did not preclude a meta-analysis entirely (as it did with the structured interview), the power to detect moderators was relatively low (Hedges & Pigott, 2001). Artificially small cells are unrepresentative of the population of studies and can contain wildly inaccurate estimates due to second-order sampling error (i.e., error due to a small number of primary studies); effect sizes can be very different or very similar solely due to chance.
The mean estimates of ρ, however, are relatively unaffected by the number of studies when the average sample size (N) within each study is large, as was the case here.

Second, the estimates from primary studies (Lipsey & Wilson, 2001) were believed to be of good quality (having been produced by many well-respected researchers), but the accuracy of meta-analytic estimates depends on the accessibility of accurate information. These results might be biased upward to some degree for reasons related to the sampling of studies, including but not limited to publication bias toward significant results or well-written research, the file drawer effect whereby studies with null findings are not published, and the exclusion of dissertations, which tend to be of weaker quality (Ashworth, Osburn, Callender, & Boyle, 1992; Campbell, Dunnette, Lawler, & Weick, 1970; Rotton, Foos, Van Meek, & Levitt, 1995). Hunter and Schmidt (2004) show that "missing a few studies randomly usually does not reduce the accuracy of a meta-analysis by nearly as much as might be supposed" (p. 85). Nonetheless, this meta-analysis is limited by how well the sampled studies represent the actual universe of true relationships.

Another limitation of this study relates to issues of measurement that come into play when conducting a meta-analysis or primary study. My aim was to examine the conceptual relationships between multiple individual difference variables and job performance. The type of method used to measure those variables can create bias or be susceptible to bias, particularly for subjective measures (e.g., Allen & Rush, 2001). The low agreement among coders' ratings of measure subjectivity prohibited a specific test of this moderator, but there was a tendency for citizenship to be measured more subjectively. If future research establishes such a link, there would be a number of obvious implications for the use of different performance dimensions in research and practice, based on the research already mentioned.

The study is also limited in that a statistical artifact known to be operating could not be corrected: range variation. Because many of the samples were determined to be range restricted, the results are probably attenuated. However, it is not clear whether all results are attenuated uniformly or some are attenuated while others are enhanced; I imagine that the latter is true. It is worth mentioning that range enhancement, which has been largely ignored when studying cognitive ability, may occur in the study of citizenship behaviors. Becker and Randall (1994) found that employees who returned an attitude survey also performed more citizenship behaviors than their nonrespondent counterparts. The implication of such a phenomenon for this study is that correlations will be biased upward. There is a need for additional work on this topic before one can make accurate statistical corrections to the data if some relationships are range restricted while others are range enhanced. [Footnote 11: What determines whether a sample is range restricted or enhanced is its range compared to the range of a reference group to which one wishes to generalize findings.]

Also, this study is relevant only insofar as it accurately categorized the findings in primary studies in a meaningful way. The definitions of variables used to classify studies, particularly for the newer performance concepts (i.e., citizenship, job dedication, and interpersonal facilitation), were based on a broad body of literature and should exhibit acceptable face validity. I also attempted to categorize studies according to these definitions accurately, using coding rules.
In this study, the number of categorization errors was noticeably high (i.e., the accuracy rate was low). The index was imperfect, and interrater agreement is not equivalent to capturing true scores, but the agreement results suggest that more refined definitions and measures of citizenship would be helpful in future work. Future syntheses should attempt to refine the definitions used here and to apply them more accurately in classifying the results of primary studies.

Overall Conclusion

The results of this study partially support the theory proposed by Motowidlo, Borman, and Schmit (1997). Task and citizenship appear to be moderately correlated but distinct aspects of job performance. Although different performance predictors seem to show differential validity between the two dimensions of performance as theorized, cognitive ability is the single most effective predictor across the two dimensions. It does appear to be the case that personality dimensions can predict aspects of citizenship behavior, but more research is needed to understand which aspects of personality are important, when, and how. Alternatively, biodata were good predictors of both task and citizenship performance, and biodata validities were not greatly affected by the degree to which various constructs were assessed. Although not all hypotheses could be fully tested quantitatively, the patterns of results have implications for past research on validity and adverse impact, as well as for future theoretical work on citizenship performance determinants.

Appendix A
Pilot Coding Sheet

For any N/A value, enter 9090.

Study Descriptives
1. Study ID (If a study reports multiple independent studies with distinct outcomes and samples, add a decimal to the Study ID and code each study separately.)
2. First Author (last name first):
3. Year (last two digits): (If multiple reports of the same study, then code the year of the more "formal" publication.)
4. Published: Yes/No
5. Reference source: 1 book; 2 journal article; 3 book chapter; 4 thesis or dissertation; 6 technical report; 7 conference paper; 8 other:
6. Citation (APA form):

Study Sample
7. Type of population sampled (as described in paper):
8. Study type: 1 Predictive; 2 Concurrent
9. 1 Applied setting; 2 Experimental
10. 1 US; 2 European; 3 Other:
11. Total sample size (usable cases)
12. Mean age of sample (at start of study)
13. Job incumbents or job applicants

Racial/Gender Breakdown - fill in whatever is reported by the study (either % or N)
14. White N
15. White %
16. Black N
17. Black %
18. Hispanic N
19. Hispanic %
20. Asian N
21. Asian %
22. Males N
23. Males %
24. Official jobs included and # of people in each group (list)

This section to be coded separately using information obtained in #24. (circle all that apply)
25. Managerial / Nonmanagerial / Other
26. Attrition: (N) 1 Refused to complete study; 2 Quit study; 3 Thrown out by researcher; 4 Other: Reason?:
27. Reason for missing data?
28. Study design: 1 Predictive; 2 Concurrent
29. Was there a manipulation in between IV and DV measures? If so, what?
30. Are performance measures considered DVs in the study? Yes/No

Item 31 to be coded separately using information obtained from O*NET.
31. Job complexity: The amount of information, knowledge, and concepts that must be dealt with for regular job tasks (Avolio & Waldman, 1990). 1 Low (e.g., line workers);
2 Medium (e.g., supervisors, skilled workers); 3 High (e.g., professionals, specialists, upper management)

Task Performance
32. Label used by authors:
33. Operationalization/Definition:
34. Constructs thought to be measured:
35. Breadth of measure (does it measure a narrow aspect or specific task?): 1 Broad; 2 Narrow
36. Measure used: Established reliability:
37. Who makes the ratings? 1 Self-report; 2 Peer; 3 Supervisor; 4 Other
38. Range restricted? Due to selection (check if yes). List other possible reasons and any estimates of the magnitude.
39. Training or job criterion: 1 Training; 2 On-the-job

Citizenship Performance
40. Label used by authors:
41. Operationalization/Definition: (for qualitative purposes)
42. Constructs thought to be measured: (for qualitative purposes)
43. Nature of acts (for moderator analysis): 1 Self-discipline (see definition); 2 Interpersonal (see definition)
44. Breadth of measure (does it measure a narrow aspect or specific task?): 1 Broad; 2 Narrow
45. Measure used: Established reliability:
46. What is measured? 1 behavioral frequency; 2 behavioral quality
47. Who makes the ratings? 5 Self-report; 6 Peer; 7 Supervisor; 8 Other
48. Range restricted? Due to selection (check if yes). List other possible reasons and any estimates of the magnitude.
49. Training or job criterion: 1 Training; 2 On-the-job

Structured Interview
50. Label used by authors:
51. Operationalization/Definition:
52. Constructs thought to be measured:
53. Breadth of measure (does it measure a narrow aspect or specific task?): 1 Broad; 2 Narrow
54. Measure used: Established reliability:
55. Structured or unstructured: majority of questions specified beforehand
56. How many raters (one vs. panel)? 1 Single person; 2 Multiple raters; 3 Panel:
57. Who makes the ratings? 9 Self-report; 10 Peer; 11 Supervisor; 12 Other
58. Range restricted? Due to selection (check if yes). List other possible reasons and any estimates of the magnitude.
59. Amount of structure (based on Huffcutt & Winfied, 1994; Huffcutt & Roth, 1998): 1 Low - standardization of topical areas to be covered; 2 Medium - at least half of the questions are specified beforehand; 3 High - majority of questions specified

Biodata
60. Label used by authors:
61. Operationalization/Definition:
62. Constructs thought to be measured:
63. Breadth of measure (does it measure a narrow aspect or specific task?): 1 Broad; 2 Narrow
64. Measure used: Established reliability:
65. Scoring key: 1 Empirical; 2 Rational; 3 Factor-analytic keying
66. Type of items included: 1 Strictly biodata (objective measures of past experiences); 2 Attitudinal; 3 Hypothetical or future events
67. Who makes the ratings? 13 Self-report; 14 Peer; 15 Supervisor; 16 Other
68. Range restricted? Due to selection (check if yes). List other possible reasons and any estimates of the magnitude.

Cognitive Ability
69. Label used by authors:
70. Operationalization/Definition:
71. Constructs thought to be measured:
72. Breadth of measure (does it measure a narrow aspect or specific task?): 1 Broad; 2 Narrow
73. Measure used: Established reliability:
74. Who makes the ratings? 17 Self-report; 18 Peer; 19 Supervisor; 20 Other
75. Range restricted? Due to selection (check if yes). List other possible reasons and any estimates of the magnitude.

Personality
76. Label used by authors:
77. Operationalization/Definition:
Of personality: Of each dimension:
78. Construct measured (circle all that apply): 1 Conscientiousness; 2 Extroversion; 3 Agreeableness; 4 Emotional Stability; 5 Openness to experience / Intellectance; 6 Positive Affectivity; 7 Negative Affectivity; 8 Other (specify label):
79. Breadth of measure (does it measure a narrow aspect or specific task?): 1 Broad; 2 Narrow
80. Measure used: Established reliability:
81. Who makes the ratings? 21 Self-report; 22 Peer; 23 Supervisor; 24 Other
82. Range restricted? Due to selection (check if yes). List other possible reasons and any estimates of the magnitude.

[The remaining pilot coding sheet pages, containing the grids for entering the correlation matrix among the study variables and the reliability coefficient codes, are not legible in this copy.]

Definition List

Task performance: Behaviors that directly or indirectly (through other workers) affect core production that transforms input to output or delivers a service.

Citizenship performance: Behaviors that 1) are not directly related to core tasks and 2) support the social and/or psychological environment. Examples include: loyalty, cooperative behaviors (not affecting core production), whistle-blowing, sportsmanship, prosocial behavior, personal initiative, showing extra effort and perseverance, and volunteering to do extra or unrelated work. Counterproductive or retaliatory behaviors are NOT included.

Self-discipline: Citizenship behaviors that do not require a direct interaction with another person. Examples: working hard, taking initiative, and following organizational rules.

Interpersonal: Citizenship behaviors that require a direct (not necessarily face-to-face) interaction with another person.

Structured interview: A structured interview, at the very least, evaluates a response to each question posed to the interviewee (from Huffcutt & Roth, 1998).

Biodata: A measure of background life experiences that is intended to predict future behaviors of the same type.

Cognitive ability: Broadly, any test of computational, problem solving, or mental abilities.

Personality: Enduring characteristics of the individual that are tied to one of the Big Five dimensions: Conscientiousness, Agreeableness, Extroversion, Emotional Stability, and Openness to Experience. A measure may capture smaller aspects of any one dimension but not overlap with another dimension. [From Barrick, Mount, & Judge (2001)]

Conscientiousness: dependability, achievement striving, and planfulness.

Extraversion: sociability, dominance, ambition, positive emotionality, and excitement-seeking.

Agreeableness: cooperation, trustfulness, compliance, and affability.

Openness to experience: intellectance, creativity, unconventionality, and broad-mindedness.
Emotional stability: lack of anxiety, hostility, depression, and personal insecurity.

Appendix B
Coding Sheet

[The scanned pages of the Excel-based coding sheet, including the grids for entering correlations and reliabilities, are not legible in this copy.]

Appendix C
Code Book

Preparation:
- Find the template coding file. Make a copy of it for each article.

Definition List

Task performance: Behaviors that directly or indirectly (through other workers) affect core production that transforms input to output or delivers a service. Sometimes these are referred to as "in-role" behaviors because they are tied to one's job roles (but if the role includes non-task behaviors, then the two concepts may be very different).

Citizenship performance: Behaviors that 1) are not directly related to core tasks and 2) support the social and/or psychological environment. Examples include: loyalty, cooperative behaviors (not affecting core production), whistle-blowing, sportsmanship, prosocial behavior, personal initiative, showing extra effort and perseverance, and volunteering to do extra or unrelated work. Counterproductive or retaliatory behaviors are NOT included.

Self-discipline: Citizenship behaviors that do not require a direct interaction with another person. Instead, they are related to helping the organization overall. Examples: working hard, taking initiative, and following organizational rules.

Interpersonal: Citizenship behaviors that require a direct (not necessarily face-to-face) interaction with another person. Examples: helping others and backing people up.

Structured interview: A structured interview, at the very least, evaluates a response to each question posed to the interviewee (from Huffcutt & Roth, 1998).

Biodata: A measure of background life experiences that is intended to predict future behaviors of the same type.

Cognitive ability: Broadly, any test of computational, problem solving, or mental abilities.

Personality: Enduring characteristics of the individual that are tied to one of the Big Five dimensions: Conscientiousness, Agreeableness, Extroversion, Emotional Stability, and Openness to Experience. A measure may capture smaller aspects of any one dimension but not overlap with another dimension.

Conscientiousness: dependability, achievement striving, and planfulness.

Extraversion: sociability, dominance, ambition, positive emotionality, and excitement-seeking.
Agreeableness: cooperation, trustfulness, compliance, and affability.

Openness to experience: intellectance, creativity, unconventionality, and broad-mindedness.

Emotional stability: lack of anxiety, hostility, depression, and personal insecurity.

Using the Excel file:
1) Enter your initials on line 3 next to "Coder."
2) For "Citation," write the article citation in APA format.
3) For "Source Type," enter the appropriate response.
4) Go to the methods section of the article and briefly summarize the type of people studied.
5) Use your judgment, based on the description, to decide whether the jobs are managerial or not, or include both types. Managerial jobs usually involve the supervision of other workers.
6) How many studies or samples are included in the article? The main thing to look at is how many different groups of people were analyzed. Answer on line 13.
7) Give the size of each sample/group studied. Answer on line 14; use additional columns if necessary.
8) Is this study about people applying for a job who are later evaluated on job performance? If so, choose 1 for predictive. If the study measures current workers, select concurrent.
9) For line 18, answer only if line 17 was predictive. How much time passed between when the predictors (interview, biodata, personality, cognitive ability) were measured and when job performance was measured? If the length of time differs across predictors, note that (e.g., for cognitive ability, 3 weeks).
10) When the sample size is given in the article, sometimes not everyone was used in the analyses. If many people did not provide usable data, there might be some description of this in the text. You should also check the tables/correlation matrix when you record correlations to see what the sample size is (N = ...), to make sure it matches the number you put in line 14.
11) Sometimes a study will measure the predictors and then manipulate the workers (e.g., give them a training session). If this is the case, note what happened in line 23. If the manipulation is not between one of the predictors we care about and job performance, then don't mention it.
12) There are two types of measures: predictors (cognitive ability, personality factors, biodata, and interview) and job performance (overall, task, or citizenship). Citizenship can be broken into two smaller categories (see definition list above). Find the measure used to assess the variables and write it in the corresponding box on line 28.
13) Use your judgment to choose whether the measure is broad or narrow, based on the definition list. For example, if cognitive ability is measured with a short math test, that would be narrow because it captures a specific aspect of cognitive ability. Or, if task performance is measured by the number of people called per week, that might be narrow if you think there is more to performance for a sales job.
14) Again, use your judgment here to decide whether the measure is objective or subjective. Objective measurements should be similar no matter who is providing the data (whether a manager is rating a worker's performance or a worker is taking a cognitive test). If the measure asks about self-reported feelings or perceptions and depends on the situation or time, then it would be subjective (e.g., most personality measures).
15) Find the reliability for the measure used.
If it is a single rating, like an overall job performance rating by a supervisor, then it probably won't be listed. For line 31, write the number. For line 32, write down what type of reliability estimate the number represents: alpha coefficient/internal consistency, interrater (between multiple raters who are measuring the same thing), or test-retest (at different times). If you cannot tell or it isn't mentioned, write that down. Sometimes these numbers will be on the "diagonal" of the correlation matrix (check the footnote to make sure).
16) Find the correlations. Usually these will be in a correlation matrix, and you won't need to look through the text. If they are presented in more than one table, find out why. If there are multiple tables because there are different samples, use the other Excel spreadsheets (in the same file). If there are multiple tables because the article authors are describing the same people in different ways, put all the numbers in the same table but label why they are different.

Appendix D
Interrater Coding Agreement Results

Data are provided for the four (italicized) categorical ratings made on the coding sheet by the author and an undergraduate assistant. N is the number of ratings made for the 29 published studies. Ratings that could not be made based on the information given were also treated as a category called "Undetermined." In each cross-tabulation below, the undergraduate's ratings are the rows and the graduate student's ratings are the columns (Undetermined abbreviated U); cells not shown are not legible in this copy.

Manager rating (N = 33): Manager row: .15, .06; Nonmanager row: .06, .70, .03 (U)
Predictive rating (N = 33): Predictive row: .03, .03; Concurrent row: .09, .67, .03 (U); U row: .09, .06
Broad rating (N = 105): Broad row: .70, .15; Narrow row: .02, .02; U row: .10
Subjective rating (N = 125): Subjective row: .58, .06; Objective row: .20, .06; U row: .10

Appendix E
SAS / IML program for multivariate computations

The following code for this paper was generated using SAS software, Version 8.02 of the SAS System for Windows. Copyright © 2001 SAS Institute Inc. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc., Cary, NC, USA. The syntax below for a level 2 regression only works with two moderators and three correlations.
*Input file of correlations, tab delimited; data thesis; infile 'c:\file1.txt' recfm=v dsd dlm='09'x lrecl=5000; input cogcon cogagr cogB CogTask CogDed CogInterp conagr conB ConTask ConDed Conlnterp agrB Angask AgrDed Angnterp BTask BDed BInterp TaskDed TaskInterp DedInterp; run; *Input file of correction factor associated with each correlation; data thesisc; infile 'c:\file2.txt' recfin=v dsd dlm='09'x lrecl=5000; input cl c2 c3 c4 c5 c6 c7 08 c9 c10 c11c12 013 CM c15 c16 cl7 c18 c19 020 c2]; run; *Input file with sample size and Level 2 predictors; data thesism; infile 'c:\file3.txt’ recfm=v dsd dlm='09'x lrecl=5000; input 35 m1 m2; run; proc iml; use thesis; read all into x; *column vector of all correlations; dat=x[,]; use thesisc; read all into x2; correct=x2[,]; *matrix of correction factors; use thesism (keep=m1); read all into x3; moderat=x3[,]; *matrix of moderators; use thesism (keep=ss); read all into x3; sample=x3[,]; *sample size vector; sigma=.; *Sigma column vector; 145 Appendix E (continued) n=nrow(sample); *number of cases; numr=ncol(dat); *number of unique correlations inputed; numrtot=n*numr; *all correlations in entire meta-analysis; nmod=ncol(moderat); *number of moderators/predictors; chart={2 1,3 3,4 6,5 10,6 15,7 21,8 28,9 36,10 45,11 55}; **** Variance module ****; resultv=0; start vari(r1,sam) global(resultv); resultv=(1-rl##2)##2/sam; *Becker 2000, Eq. 4; finish; **** Covariance module ****; resultc=0; . start covar(rl ,r2,r3,r4,r5,r6,sam) global(resultc); resultc=(.5*r1 *r2*(r3 ##2+r4##2+r5##2+r6##2)+r3 *r6+r4*r5- (r1*r3*r4+rl*r5*r6+r3*r5*r2+r4*r6*r2))/sam; *Becker 2000, Eq. 5; finish; *Create Xmatrix and joint matrix with Level 2 predictors for calculations, only good for 2 predictors; xmatrix=j(1,numr,0); xmod=j(1,nmod,0); current=j(l,numr,0); current2=j(1,nmod,0); do i=1 to n; do j=1 to numr; if dat[i,i] <>. then do; current[j]=1; xmatrix=xmatrix // current; ifj < nmod + 1 then do; current2[j]=moderat[i,j]; xmod=xmod // current2; end; else xmod= xmod // {0 0}; current=j(1,numr,0); current2=j(1,nmod,0); end; end; end; e1 =nrow(xmatrix); xmatrix=xmatrix[2:e1,]; 146 Appendix E (continued) e2=nrow(xmod); xmod=xmod[2:e2,]; xmod=xmat1ix || xmod; ”Make the var-cov matrix for whole data set**; do i=1 to n; ** Row vector of correlations; caser = .; do j=1 to numr; if dat[i,j] <>. then caser = caser || dat[i,j]; end; e1=ncol(caser); crow=caser[,2:e1]; *Final vector; ** Row vector of correction factors; caser = .; do j=l to numr; if correct[i,j] <>. then caser = caser || correct[i,j]; end; e1=ncol(caser); acrow=caser[,2:e1]; *Final vector; ****Add correlations to sigma vector ****; sigma = sigma // crow‘; *****************II!***********************. 
  *With the row of correlations for study(i), make a var-cov matrix;
  *Prepare data in matrices to calculate covariances;
  isize=ncol(crow);
  pmat=j(isize,isize,1);         *Initialize position matrix;
  pmat2=j(isize,isize,1);
  vcv=j(isize,isize,0);          *Var-cov matrix to be filled in;

  *Determine # of study variables based on # of input correlations;
  orig=0;
  if isize=1 then orig=1;
  else do q=1 to 10;
    if isize=chart[q,2] then orig=chart[q,1];
  end;
  if orig=0 then print "Error for study" i "Data:" crow acrow isize orig;

  *Place correlations for study(i) into a matrix;
  add=1;
  do j=1 to orig-1;
    do k=j+1 to orig;
      pmat[j,k]=crow[,add];
      pmat[k,j]=crow[,add];
      add=add+1;
    end;
  end;

  *Create another matrix for correction factors;
  add2=1;
  *Place correction factors for study(i) into a matrix;
  do j=1 to orig-1;
    do k=j+1 to orig;
      pmat2[j,k]=acrow[,add2];
      pmat2[k,j]=acrow[,add2];
      add2=add2+1;
    end;
  end;

  if isize > 1 then do;
    **Make a list of codes for ordering correlations in study(i)**;
    minicycle=j(isize,2,1);
    order=1;
    do bs=1 to orig-1;
      do bt=bs+1 to orig;
        minicycle[order,1]=bs;
        minicycle[order,2]=bt;
        order=order+1;
      end;
    end;

    **Compute all covariances between correlations in study(i)**;
    mcyclist=1;
    ps=1; pt=1; pu=1; pv=1;
    do j=mcyclist to orig-1;
      do k=j+1 to orig;
        ps=minicycle[j,1];
        pt=minicycle[j,2];
        pu=minicycle[k,1];
        pv=minicycle[k,2];
        *Correction factor for each correlation involved;
        run covar(pmat[ps,pt],pmat[pu,pv],pmat[ps,pu],pmat[ps,pv],
                  pmat[pt,pu],pmat[pt,pv],sample[i]);
        cf=pmat2[ps,pt]*pmat2[pu,pv];
        check=resultc/cf;
        print check;
        if check < 0 then check=0.0000001;  *Changes negative values to near 0... it won't estimate 0;
        vcv[j,k]=check;
        vcv[k,j]=check;
      end;
    end;
  end;

  **Create variances in vcv**;
  do j=1 to isize;
    run vari(crow[,j],sample[i]);
    check2=resultv/acrow[,j]##2;
    if check2 < 0 then check2=0.0000001;
    vcv[j,j]=check2;
  end;

  ****Concatenate matrix for study(i) with S-matrix****;
  if i=1 then smatrix=vcv;
  else do;
    addold=nrow(smatrix);          *smatrix and vcv are square matrices;
    addnew=nrow(vcv);
    if addnew=1 then do;
      new=j(addold,1,0);
      old=j(1,addold,0);
    end;
    if addnew > 1 then do;
      new=j(addold,addnew,0);
      old=j(addnew,addold,0);
    end;
    smatrix=smatrix || new;
    bottom=old || vcv;
    smatrix=smatrix // bottom;
  end;
end;

*Dump first missing value from sigma;
e1=nrow(sigma);
sigma=sigma[2:e1,];

*****Calculations*****;
rho=inv(xmatrix`*inv(smatrix)*xmatrix)*xmatrix`*inv(smatrix)*sigma;
rhov=inv(xmatrix`*inv(smatrix)*xmatrix);
q=sigma`*(inv(smatrix)-inv(smatrix)*xmatrix*inv(xmatrix`*inv(smatrix)*xmatrix)
  *xmatrix`*inv(smatrix))*sigma;
print rho;
print rhov;
print q;

*****Levels analysis*****;
*Level 2 regression parameter estimates;
beta=inv(xmod`*inv(smatrix)*xmod)*xmod`*inv(smatrix)*sigma;
*Var-cov matrix for parameters;
vbeta=inv(xmod`*inv(smatrix)*xmod);
*Test of overall model fit;
he=(sigma-xmod*beta)`*inv(smatrix)*(sigma-xmod*beta);
*Test of model significance;
hr=sigma`*inv(smatrix)*sigma - he;
print beta;                        *parameter estimates, intercepts & slopes;
print he;
print hr;
quit;
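For readers who do not use SAS, the quantities the program computes can be sketched as follows; the program follows Becker's (2000) generalized least squares (GLS) approach to multivariate meta-analysis (see also Raudenbush, Becker, & Kalaian, 1988, in the references). With \(r_{st}\) the correlation between variables \(s\) and \(t\) in a study of sample size \(n\), the variance and covariance modules implement

\[
\widehat{\operatorname{Var}}(r_{st}) = \frac{(1 - r_{st}^{2})^{2}}{n} \qquad \text{(Becker, 2000, Eq. 4)}
\]

\[
\widehat{\operatorname{Cov}}(r_{st}, r_{uv}) = \frac{1}{n}\Big[\tfrac{1}{2}\, r_{st} r_{uv}\big(r_{su}^{2} + r_{sv}^{2} + r_{tu}^{2} + r_{tv}^{2}\big) + r_{su} r_{tv} + r_{sv} r_{tu} - \big(r_{st} r_{su} r_{sv} + r_{st} r_{tu} r_{tv} + r_{su} r_{tu} r_{uv} + r_{sv} r_{tv} r_{uv}\big)\Big] \qquad \text{(Becker, 2000, Eq. 5)}
\]

In the code, each element is additionally divided by the product of the relevant correction factors (acrow, cf), so the block-diagonal matrix \(S\) is on the metric of the artifact-corrected correlations. Stacking all study correlations in the vector \(\sigma\) and their design indicators in \(X\), the final calculations are

\[
\hat{\rho} = (X^{\top} S^{-1} X)^{-1} X^{\top} S^{-1} \sigma,
\qquad
\operatorname{Var}(\hat{\rho}) = (X^{\top} S^{-1} X)^{-1},
\]

with the fit statistic

\[
Q = \sigma^{\top}\big(S^{-1} - S^{-1} X (X^{\top} S^{-1} X)^{-1} X^{\top} S^{-1}\big)\sigma,
\]

which can be referred to a chi-square distribution with degrees of freedom equal to the number of correlations minus the number of columns of \(X\). The levels analysis simply substitutes the moderator-augmented design matrix (xmod) for \(X\): beta holds the intercepts and slopes, he is the residual (unexplained) fit statistic, and hr the portion attributable to the moderators.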
Appendix F

Full List of Studies in Database

Indented citations are included within the parent citation. Citations with more than three authors have been abbreviated. Full citations can be found in the reference section.

Ackerman & Kanfer 1993
Allen & Rush 1998
Allworth & Hesketh 1999
Antonioni & Park 2001
Barbuto et al. 2003
Barksdale & Werner 2001
Barrick & Mount 1993
Barrick, Mount & Strauss 1993
Barrick, Stewart & Piotrowski 2002
Beaty, Cleveland & Murphy 2001
Becker & Vance 1993
Bell & Kozlowski 2002
Bell & Menguc 2002
Borman, White & Dorsey 1995
Borman et al. 1991
Bosshardt et al. 1992
Botwin & Buss 1989
Boudreau et al. 2001
Brown et al. 1988
Burroughs & White 1996
Caldwell & Burger 1998
Caligiuri 2000
Campion et al. 1988
Campion, Campion & Hudson 1994
Chad et al. 1999
Charbonneau & Nicol 2002
Chen & Francesco 2003
Collins & Gleaves 1998
Conway 1996
    Boruch et al. 1970
    Dickinson & Tice 1973
    Forsythe et al. 1986
    Gunderson & Nelson 1966
    Gunderson & Ryan 1971
    Holzbach 1978
    Kavanagh et al. 1971
    King et al. 1980
    Lance et al. 1992
    Lawler 1967
    Orpen 1973
    Tucker et al. 1967
Conway 1999
Cortina et al. 1992
Cortina et al. 2000
    Baehr & Froemel 1977
    Berkley 1984
    Gully et al. 1998
    Gully et al. 2000
    Delery et al. 1992
    Dicken 1969
    Dipboye et al. 1990
    Exxon 1974
    Exxon 1978
    Friedland 1976
    Friedland 1980
    Lopez 1966
    Motowidlo & Schmit 1996
    Phillips & Gully 1997
    Reeb 1969
    Roth & Campion 1992
    Schmit 1996
    Tubiana & Ben-Shakhar 1982
    Tziner & Dolan 1982
Costa & McCrae 1988
Crant 1995
Cropanzano, Rupp & Byrne 2003
Dalessio & Silverhart 1994
Day & Silverman 1989
De Fruyt & Mervielde 1999
Deadrick & Madigan 1990
Douthitt, Eby & Simon 1999
Farh, Podsakoff & Organ 1990
Farh, Werbel & Bedeian 1988
Ferris, Witt & Hochwarter 2001
Findley, Giles & Mossholder 2000
Gellatly & Irving 2001
Gellatly 1996
George 1991
Goffin, Rothstein & Johnston 1996
Hansen 1989
Hattrup, O'Connell & Wingate 1998
Haworth & Levy 2001
Hedge & Teachout 1992
Hochwarter, Witt & Kacmar 2000
Hogan, Hogan & Gregory 1992
Hogan et al. 1998
Hough 1992
Hough et al. 1990
Huffcutt et al. 1998
Huffcutt et al. 2001
Hui, Lam & Law 2000
Johnson 2001
Judge et al. 1999
Kaufman, Stamper & Tesluk 2001
Kidder 2002
Kinicki, Lockwood & Hom 1990
Koh, Steers & Terborg 1995
Konovsky & Organ 1996
Lam, Hui & Law 1999
Latham & Skarlicki 1995
Lee & Allen 2002
LePine & Van Dyne 2001
LePine, Colquitt & Erez 2000
Love et al. 1994
MacKenzie, Podsakoff & Fetter 1991
MacKenzie, Podsakoff & Paine 1999
MacKenzie, Podsakoff & Rich 2001
Mael & Ashforth 1995
McDaniel 1989
McHenry et al. 1990
McManus & Kelly 1999
McNeely & Meglino 1994
Menguc 2000
Miller, Griffin & Hart 1999
Moorman & Blakely 1995
Moorman 1991
Moorman 1993
Moorman, Niehoff & Organ 1993
Morrison 1994
Motowidlo & Van Scotter 1994
Motowidlo et al. 1992
Mount et al. 1998
Mount, Witt & Barrick 2000
Mumford et al. 1996
Nathan & Alexander 1988
Nathan & Tippins 1990
Neuman & Kickul 1998
Neuman & Wright 1999
Niehoff & Moorman 1993
Nikolaou & Robertson 2002
O'Connell et al. 2001
O'Connell et al. 2002
Organ & Konovsky 1989
Piedmont & Weinstein 1994
Ployhart, Lim & Chan 2001
Ployhart et al. 2003
Podsakoff & MacKenzie 1994
Podsakoff, MacKenzie & Bommer 1996
Podsakoff, MacKenzie & Fetter 1993
Podsakoff et al. 1990
Pulakos & Schmitt 1996
Pulakos, Borman & Hough 1988
Randall et al. 1999
Ree, Carretta & Teachout 1995
Ree, Earles & Teachout 1994
Rioux & Penner 2001
Russell et al. 1990
Ryan, Ployhart & Friedel 1998
Sackett, Gruys & Ellingson 1998
Schmidt & Rader 1999
Schmidt et al. 1988
Schmitt & Ryan 1993
Schnake, Dumler & Cochran 1993
Scullen, Mount & Judge 2003
Shore & Wayne 1993
Shore, Barksdale & Shore 1995
Shore et al. 2000
Stewart 1996
Stokes & Searcy 1999
Tansky 1993
Tepper, Lockhart & Hoobler 2001
Tompson & Werner 1997
Turnipseed 2002
Turnipseed 2003
Turnley et al. 2003
Van Dyne, Graham & Dienesch 1994
Van Scotter & Motowidlo 1996
Van Yperen, van den Berg & Willering 1999
Villanova et al. 1994
Williams & Anderson 1991
Witt, Burke, Barrick & Mount 2002
    Mount, Barrick & Stewart 1998
    Mount, Witt & Barrick 2000
    Barrick & Mount 1996
Yoon & Suh 2003

Appendix G

Scree Plots for Outlier Analyses

[This appendix contains scree plots of SAMD values (sample-adjusted meta-analytic deviancy; Huffcutt & Arthur, 1995) in descending rank order, one plot per pair of study variables; only the panel titles are recoverable from the figures: Cognitive Ability - Task Performance; Conscientiousness - Task Performance; Conscientiousness - Citizenship Performance; Task - Citizenship Performance; Task - Job Dedication Performance; Task - Interpersonal Performance; Job Dedication - Interpersonal; Cognitive Ability - Conscientiousness; Cognitive Ability - Interview; Extraversion - Conscientiousness; Extraversion - Agreeableness; one panel whose title was lost (apparently Extraversion - Openness); Extraversion - Emotional Stability; Conscientiousness - Agreeableness; Conscientiousness - Openness; Conscientiousness - Emotional Stability; Agreeableness - Openness; Agreeableness - Emotional Stability; Openness - Emotional Stability.]

Appendix H

Job Complexity Codes

Study                              Complexity
Allworth & Hesketh 1999                0
Hedge & Teachout 1992                  0
Johnson 2001                           1
Pulakos & Schmitt 1996                 1
Pulakos, Borman & Hough 1988           0
Ree, Carretta & Teachout 1995          1
Ree, Earles & Teachout 1994            1

Note. 0 = low complexity job, 1 = high complexity, 2 = indeterminate.

Appendix I

Biodata Studies For Which Raters Assigned Construct Codes

Study                            N      Cognitive (%)   Personality (%)
Allworth & Hesketh 1999         169           3               57
Borman et al. 1991            4,362           0               65
Bosshardt et al. 1992           357          18               53
Dalessio & Silverhart 1994      577           1               25
Hough et al. 1990             7,666           0               58
McDaniel 1989                 9,336           7               27
McHenry et al. 1990           4,039           0               56
McManus & Kelly 1999            116           2               37
Mumford et al. 1996             117           5               36
O'Connell et al. 2001            94           8               20
Pulakos & Schmitt 1996          461           8               49
Russell et al. 1990             273          13               36
Stokes & Searcy 1999            933           8               40

Appendix J

Results for Restricted Samples

[This appendix presented the meta-analytic correlation matrix for job performance and the performance predictors computed on restricted samples, printed as a rotated landscape table with eleven numbered study variables and an explanatory note on the reliability corrections applied. The rotated page did not survive text extraction, and the matrix is not recoverable.]

References

References marked with an asterisk indicate studies included in the meta-analysis.

*Ackerman, P. L., & Kanfer, R. (1993). Integrating laboratory and field study for improving selection: Development of a battery for predicting air traffic controller success. Journal of Applied Psychology, 78, 413-432.
Algera, J. A., Jansen, P. G., Roe, R. A., & Vijn, P. (1984). Validity generalization: Some critical remarks on the Schmidt-Hunter procedure. Journal of Occupational Psychology, 57, 197-210.
*Allen, T. D., & Rush, M. C. (1998). The effects of organizational citizenship behavior on performance judgments: A field study and a laboratory experiment. Journal of Applied Psychology, 83, 247-260.
*Allworth, E., & Hesketh, B. (1999). Construct-oriented biodata: Capturing change-related and contextually relevant future performance. International Journal of Selection & Assessment, 7, 97-111.
*Antonioni, D., & Park, H. (2001). The effects of personality similarity on peer ratings of contextual work behaviors. Personnel Psychology, 54, 331-360.
Ashworth, S. D., Osburn, H. G., Callender, J. C., & Boyle, K. A. (1992). The effects of unrepresented studies on the robustness of validity generalization results. Personnel Psychology, 45, 341-361.
Austin, J. T., & Villanova, P. (1992). The criterion problem: 1917-1992. Journal of Applied Psychology, 77, 836-874.
Avolio, B. J., & Waldman, D. A. (1990). An examination of age and cognitive test performance across job complexity and occupational types. Journal of Applied Psychology, 75, 43-50.
*Barbuto, J. E., Jr., Brown, L. L., Wheeler, D. W., & Wilhite, M. S. (2003). Motivation, altruism and generalized compliance: A field study of organizational citizenship behaviors. Psychological Reports, 92, 498-502.
*Barksdale, K., & Werner, J. M. (2001). Managerial ratings of in-role behaviors, organizational citizenship behaviors and overall performance: Testing different models of their relationship. Journal of Business Research, 51, 145-155.
Barnard, C. I. (1938). The functions of the executive. Cambridge, MA: Harvard University Press.
Barrick, M. R., & Mount, M. K. (1991). The big five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1-26.
*Barrick, M. R., & Mount, M. K. (1993). Autonomy as a moderator of the relationships between the big five personality dimensions and job performance. Journal of Applied Psychology, 78, 111-118.
Barrick, M. R., Mount, M. K., & Judge, T. A. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next? International Journal of Selection and Assessment, 9, 9-30.
*Barrick, M. R., Mount, M. K., & Strauss, J. P. (1993). Conscientiousness and performance of sales representatives: Test of the mediating effect of goal setting. Journal of Applied Psychology, 78, 715-722.
*Barrick, M. R., Stewart, G. L., & Piotrowski, M. (2002). Personality and job performance: Test of the mediating effects of motivation among sales representatives. Journal of Applied Psychology, 87, 43-51.
*Beaty, J. C., Jr., Cleveland, J. N., & Murphy, K. R. (2001). The relation between personality and contextual performance in "strong" versus "weak" situations. Human Performance, 14, 125-148.
Becker, B. J. (1992). Using results from replicated studies to estimate linear models. Journal of Educational Statistics, 17, 341-362.
Becker, B. J. (1996). The generalizability of empirical research results. In C. P. Benbow & D. Lubinski (Eds.), Psychometric and social issues: Intellectual talent (pp. 362-383). Baltimore, MD: The Johns Hopkins University Press.
Becker, B. J. (2000). Multivariate meta-analysis. In H. E. A. Tinsley & S. D. Brown (Eds.), Handbook of applied multivariate statistics and mathematical modeling (pp. 499-525). San Diego: Academic Press.
*Becker, T. E., & Vance, R. J. (1993). Construct validity of three types of organizational citizenship behavior: An illustration of the direct product model with refinements. Journal of Management, 19, 663-682.
*Bell, B. S., & Kozlowski, S. W. J. (2002). Goal orientation and ability: Interactive effects on self-efficacy, performance, and knowledge. Journal of Applied Psychology, 87, 497-505.
*Bell, S. J., & Menguc, B. (2002). The employee-organization relationship, organizational citizenship behaviors, and superior service quality. Journal of Retailing, 78, 131-146.
Bernardin, H. J., & Beatty, R. W. (1984). Performance appraisal: Assessing human behavior at work. Boston: Kent.
Blickensderfer, E., Cannon-Bowers, J. A., & Salas, E. (1997). Theoretical bases for team self-corrections: Fostering shared mental models. In M. M. Beyerlein & D. A. Johnson (Eds.), Advances in interdisciplinary studies of work teams (Vol. 4, pp. 249-279). US: Elsevier Science/JAI Press.
Bliese, P. D. (2002). Multilevel random coefficient modeling in organizational research: Examples using SAS and S-Plus. In F. Drasgow & N. Schmitt (Eds.), Measuring and analyzing behavior in organizations: Advances in measurement and data analysis (pp. 401-445). San Francisco, CA: Jossey-Bass.
Bobko, P., Roth, P., & Potosky, D. (1999). Derivation and implications of a meta-analytic matrix incorporating cognitive ability, alternative predictors, and job performance. Personnel Psychology, 52, 561-589.
Bommer, W. H., Johnson, J., Rich, G. A., Podsakoff, P. M., & MacKenzie, S. B. (1995). On the interchangeability of objective and subjective measures of employee performance: A meta-analysis. Personnel Psychology, 48, 587-606.
Borman, W. C. (1982). Validity of behavioral assessment for predicting military recruiter performance. Journal of Applied Psychology, 67, 3-9.
Borman, W. C., & Motowidlo, S. J. (1993). Expanding the criterion domain to include elements of contextual performance. In N. Schmitt & W. C. Borman (Eds.), Personnel selection in organizations (pp. 71-98). San Francisco: Jossey-Bass.
Borman, W. C., & Motowidlo, S. J. (1997). Task performance and contextual performance: The meaning for personnel selection research. Human Performance, 10, 99-109.
Borman, W. C., Penner, L. A., Allen, T. D., & Motowidlo, S. J. (2001). Personality predictors of citizenship performance. International Journal of Selection and Assessment, 9, 52-69.
*Borman, W. C., White, L. A., & Dorsey, D. W. (1995). Effects of ratee task performance and interpersonal factors on supervisor and peer performance ratings. Journal of Applied Psychology, 80, 168-177.
*Borman, W. C., White, L. A., Pulakos, E. D., & Oppler, S. H. (1991). Models of supervisory job performance ratings. Journal of Applied Psychology, 76, 863-872.
*Bosshardt, M. J., Carter, G. W., Gialluca, K. A., Dunnette, M. D., & Ashworth, S. D. (1992). Predictive validation of an insurance agent support person selection battery. Journal of Business & Psychology, 7, 213-224.
*Botwin, M. D., & Buss, D. M. (1989). Structure of act-report data: Is the five-factor model of personality recaptured? Journal of Personality and Social Psychology, 56, 988-1001.
*Boudreau, J. W., Boswell, W. R., Judge, T. A., & Bretz, R. D., Jr. (2001). Personality and cognitive ability as predictors of job search among employed managers. Personnel Psychology, 54, 25-50.
Brief, A. P., & Motowidlo, S. J. (1986). Prosocial organizational behaviors. Academy of Management Review, 11, 710-725.
*Brown, S. H., Stout, J. D., Dalessio, A. T., & Crosby, M. M. (1988). Stability of validity indices through test score ranges. Journal of Applied Psychology, 73, 736-742.
*Burroughs, W. A., & White, L. L. (1996). Predicting sales performance. Journal of Business and Psychology, 11, 73-84.
*Caldwell, D. F., & Burger, J. M. (1998). Personality characteristics of job applicants and success in screening interviews. Personnel Psychology, 51, 119-136.
*Caligiuri, P. M. (2000). The big five personality characteristics as predictors of expatriate's desire to terminate the assignment and supervisor-rated performance. Personnel Psychology, 53, 67-88.
Campbell, J. P. (1990). Modeling the performance prediction problem in industrial and organizational psychology. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 1, pp. 687-732). Palo Alto, CA: Consulting Psychologists Press.
Campbell, J. P., Dunnette, M. D., Lawler, E. E., III, & Weick, K. R., Jr. (1970). Managerial behavior, performance, and effectiveness. New York: McGraw-Hill.
Campbell, J. P., Gasser, M. B., & Oswald, F. L. (1996). The substantive nature of job performance variability. In K. R. Murphy (Ed.), Individual differences and behavior in organizations (pp. 258-299). San Francisco, CA: Jossey-Bass.
Campbell, J. P., McCloy, R. A., Oppler, S. H., & Sager, C. E. (1993). A theory of performance. In N. Schmitt & W. C. Borman (Eds.), Personnel selection in organizations (pp. 2-69). San Francisco: Jossey-Bass.
Campbell, J. T., Prien, E. P., & Brailey, L. G. (1960). Predicting performance evaluations. Personnel Psychology, 13, 435-440.
*Campion, M. A., Campion, J. E., & Hudson, J. P., Jr. (1994). Structured interviewing: A note on incremental validity and alternative question types. Journal of Applied Psychology, 79, 998-1002.
Campion, M. A., Palmer, D. K., & Campion, J. E. (1997). A review of structure in the selection interview. Personnel Psychology, 50, 655-702.
*Campion, M. A., Pursell, E. D., & Brown, B. K. (1988). Structured interviewing: Raising the psychometric properties of the employment interview. Personnel Psychology, 41, 25-42.
Carr, J. Z., Schmidt, A. M., Ford, J. K., & DeShon, R. P. (2003). Climate perceptions matter: A meta-analytic path analysis relating molar climate, cognitive and affective states, and individual level work outcomes. Journal of Applied Psychology, 88, 605-619.
Cascio, W. F. (1995). Whither industrial and organizational psychology in a changing world of work? American Psychologist, 50, 928-939.
*Charbonneau, D., & Nicol, A. A. M. (2002). Emotional intelligence and prosocial behaviors in adolescents. Psychological Reports, 90, 361-370.
*Chen, Z. X., & Francesco, A. M. (2003). The relationship between the three components of commitment and employee performance in China. Journal of Vocational Behavior, 62, 490-510.
Cleveland, J. N., Murphy, K. R., & Williams, R. E. (1989). Multiple uses of performance appraisal: Prevalence and correlates. Journal of Applied Psychology, 74, 130-135.
Cohen, J. (1977). Statistical power analysis for the behavioral sciences (rev. ed.). New York: Academic Press.
Coleman, V. I., & Borman, W. C. (2000). Investigating the underlying structure of the citizenship performance domain. Human Resource Management Review, 10, 25-44.
*Collins, J. M., & Gleaves, D. H. (1998). Race, job applicants, and the five-factor model of personality: Implications for black psychology, industrial/organizational psychology, and the five-factor theory. Journal of Applied Psychology, 83, 531-544.
Colquitt, J. A., LePine, J. A., & Noe, R. A. (2000). Toward an integrative theory of training motivation: A meta-analytic path analysis of 20 years of research. Journal of Applied Psychology, 85, 678-707.
Conger, A. J. (1980). Integration and generalization of kappas for multiple raters. Psychological Bulletin, 88, 322-328.
Conway, J. M. (1996). Additional construct validity evidence for the task/contextual performance distinction. Human Performance, 9, 309-329.
*Conway, J. M. (1999). Distinguishing contextual performance from task performance for managerial jobs. Journal of Applied Psychology, 84, 3-13.
Cooper, H. (1998). Synthesizing research. Thousand Oaks, CA: Sage Publications.
Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98-104.
*Cortina, J. M., Doherty, M. L., Schmitt, N., Kaufman, G., & Smith, R. G. (1992). The "big five" personality factors in the IPI and MMPI: Predictors of police performance. Personnel Psychology, 45, 119-140.
*Cortina, J. M., Goldstein, N. B., Payne, S. C., Davison, H. K., & Gilliland, S. W. (2000). The incremental validity of interview scores over and above cognitive ability and conscientiousness scores. Personnel Psychology, 53, 325-351.
*Costa, P. T., & McCrae, R. R. (1988). From catalog to classification: Murray's needs and the five-factor model. Journal of Personality and Social Psychology, 55, 258-265.
*Crant, J. M. (1995). The proactive personality scale and objective job performance among real estate agents. Journal of Applied Psychology, 80, 532-537.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.
Cronbach, L. J., & Shavelson, R. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and Psychological Measurement, 64, 391-418.
*Cropanzano, R., Rupp, D. E., & Byrne, Z. S. (2003). The relationship of emotional exhaustion to work attitudes, job performance, and organizational citizenship behaviors. Journal of Applied Psychology, 88, 160-169.
*Dalessio, A. T., & Silverhart, T. A. (1994). Combining biodata test and interview information: Predicting decisions and performance criteria. Personnel Psychology, 47, 303-315.
Darlington, R. B. (1990). Regression and linear models. New York: McGraw-Hill.
*Day, D. V., & Silverman, S. B. (1989). Personality and job performance: Evidence of incremental validity. Personnel Psychology, 42, 25-36.
De Corte, W. (1999). Weighing job performance predictors to both maximize quality of the selected workforce and control the level of adverse impact. Journal of Applied Psychology, 84, 695-702.
*De Fruyt, F., & Mervielde, I. (1999). RIASEC types and big five traits as predictors of employment status and nature of employment. Personnel Psychology, 52, 701-727.
*Deadrick, D. L., & Madigan, R. M. (1990). Dynamic criteria revisited: A longitudinal study of performance stability and predictive validity. Personnel Psychology, 43, 717-744.
Dickinson, T. L., & McIntyre, R. M. (1997). A conceptual framework for teamwork measurement. In M. T. Brannick & E. Salas (Eds.), Team performance assessment and measurement: Theory, methods, and applications (pp. 19-43). Mahwah, NJ: Lawrence Erlbaum Associates.
*Douthitt, S. S., Eby, L. T., & Simon, S. A. (1999). Diversity of life experiences: The development and validation of a biographical measure of receptiveness to dissimilar others. International Journal of Selection & Assessment, 7, 112-125.
*Farh, J.-L., Podsakoff, P. M., & Organ, D. W. (1990). Accounting for organizational citizenship behavior: Leader fairness and task scope versus satisfaction. Journal of Management, 16, 705-721.
*Farh, J.-L., Werbel, J. D., & Bedeian, A. G. (1988). An empirical investigation of self-appraisal-based performance evaluation. Personnel Psychology, 41, 141-156.
*Ferris, G. R., Witt, L. A., & Hochwarter, W. A. (2001). Interaction of social skill and general mental ability on job performance and salary. Journal of Applied Psychology, 86, 1075-1082.
*Findley, H. M., Giles, W. F., & Mossholder, K. W. (2000). Performance appraisal process and system facets: Relationships with contextual performance. Journal of Applied Psychology, 85, 634-640.
Ford, J. K., Kraiger, K., & Schechtman, S. L. (1986). Study of race effects in objective indices and subjective evaluations of performance: A meta-analysis of performance criteria. Psychological Bulletin, 99, 330-337.
*Gellatly, I. R. (1996). Conscientiousness and task performance: Test of a cognitive process model. Journal of Applied Psychology, 81, 474-482.
*Gellatly, I. R., & Irving, P. G. (2001). Personality, autonomy, and contextual performance. Human Performance, 14, 231-245.
George, J. M., & Brief, A. P. (1992). Feeling good-doing good: A conceptual analysis of the mood at work-organizational spontaneity relationship. Psychological Bulletin, 112, 310-329.
*George, J. M. (1991). State or trait: Effects of positive mood on prosocial behaviors at work. Journal of Applied Psychology, 76, 299-307.
Gleser, L. J., & Olkin, I. (1994). Stochastically dependent effect sizes. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 339-355). New York: Russell Sage Foundation.
*Goffin, R. D., Rothstein, M. G., & Johnston, N. G. (1996). Personality testing and the assessment center: Incremental validity for managerial selection. Journal of Applied Psychology, 81, 746-756.
Gottfredson, L. S. (1997). Why g matters: The complexity of everyday life. Intelligence, 24, 79-132.
Guion, R. M., & Gottier, R. F. (1965). Validity of personality measures in personnel selection. Personnel Psychology, 18, 135-164.
Guion, R. M. (1998). Assessment, measurement, and prediction for personnel decisions. Mahwah, NJ: Lawrence Erlbaum Associates.
*Hansen, C. P. (1989). A causal model of the relationship among accidents, biodata, personality, and cognitive factors. Journal of Applied Psychology, 74, 81-90.
Hattie, J. A., & Hansford, B. C. (1984). Meta-analysis: A reflection on problems. Australian Journal of Psychology, 36, 239-254.
*Hattrup, K., O'Connell, M. S., & Wingate, P. H. (1998). Prediction of multidimensional criteria: Distinguishing task and contextual performance. Human Performance, 11, 305-319.
Hattrup, K., Rock, J., & Scalia, C. (1997). The effects of varying conceptualizations of job performance on adverse impact, minority hiring, and predicted performance. Journal of Applied Psychology, 82, 656-664.
*Haworth, C. L., & Levy, P. E. (2001). The importance of instrumentality beliefs in the prediction of organizational citizenship behaviors. Journal of Vocational Behavior, 59, 64-75.
*Hedge, J. W., & Teachout, M. S. (1992). An interview approach to work sample criterion measurement. Journal of Applied Psychology, 77, 453-461.
Hedges, L. V., & Pigott, T. D. (2001). The power of statistical tests in meta-analysis. Psychological Methods, 6, 203-217.
*Hochwarter, W. A., Witt, L. A., & Kacmar, K. M. (2000). Perceptions of organizational politics as a moderator of the relationship between conscientiousness and job performance. Journal of Applied Psychology, 85, 472-478.
*Hogan, J., Hogan, R., & Gregory, S. (1992). Validation of a sales representative selection inventory. Journal of Business and Psychology, 7, 161-171.
*Hogan, J., Rybicki, S. L., Motowidlo, S. J., & Borman, W. C. (1998). Relations between contextual performance, personality, and occupational advancement. Human Performance, 11, 189-207.
Hom, P. W., Caranikas-Walker, F., Prussia, G. E., & Griffeth, R. W. (1992). A meta-analytical structural equations analysis of a model of employee turnover. Journal of Applied Psychology, 77, 890-909.
*Hough, L. M. (1992). The "big five" personality variables - construct confusion: Description versus prediction. Human Performance, 5, 139-155.
*Hough, L. M., Eaton, N. K., Dunnette, M. D., & Kamp, J. D. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581-595.
Hough, L. M., & Oswald, F. L. (2000). Personnel selection: Looking toward the future - remembering the past. Annual Review of Psychology, 51, 631-664.
Huffcutt, A. I., & Arthur, W. A., Jr. (1995). Development of a new outlier statistic for meta-analytic data. Journal of Applied Psychology, 80, 327-334.
*Huffcutt, A. I., Conway, J. M., Roth, P. L., & Stone, N. J. (2001). Identification and meta-analytic assessment of psychological constructs measured in employment interviews. Journal of Applied Psychology, 86, 897-913.
Huffcutt, A. I., & Roth, P. L. (1998). Racial group differences in employment interview evaluations. Journal of Applied Psychology, 83, 179-189.
Huffcutt, A., Roth, P., & McDaniel, M. (1996). A meta-analytic investigation of cognitive ability in employment interview evaluations: Moderating characteristics and implications of incremental validity. Journal of Applied Psychology, 81, 459-473.
*Huffcutt, A. I., Weekley, J. A., Wiesner, W. H., Degroot, T. G., & Jones, C. (2001). Comparison of situational and behavior description interview questions for higher-level positions. Personnel Psychology, 54, 619-644.
*Hui, C., Lam, S. S., & Law, K. K. S. (2000). Instrumental values of organizational citizenship behavior for promotion: A field quasi-experiment. Journal of Applied Psychology, 85, 822-828.
Hunt, S. T. (2002). On the virtues of staying 'inside the box': Does organizational citizenship behavior detract from performance in Taylorist jobs? International Journal of Selection and Assessment, 10, 152-159.
Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96, 72-98.
Hunter, J. E., & Schmidt, F. L. (1990). Methods of meta-analysis: Correcting error and bias in research findings (1st ed.). London: Sage Publications.
Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage Publications.
Hurtz, G. M., & Donovan, J. J. (2000). Personality and job performance: The Big Five revisited. Journal of Applied Psychology, 85, 869-879.
Ilgen, D. R., & Hollenbeck, J. R. (1991). Job design and roles. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 2). Palo Alto, CA: Consulting Psychologists Press.
Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.
John, O. P. (1990). The "Big Five" factor taxonomy: Dimensions of personality in the natural language and in questionnaires. In L. A. Pervin (Ed.), Handbook of personality: Theory and research. New York: Guilford.
*Johnson, J. W. (2001). The relative importance of task and contextual performance dimensions to supervisor judgments of overall performance. Journal of Applied Psychology, 86, 984-996.
*Judge, T. A., Higgins, C. A., Thoresen, C. J., & Barrick, M. R. (1999). The big five personality traits, general mental ability, and career success across the life span. Personnel Psychology, 52, 621-652.
Kalaian, H. A., & Raudenbush, S. W. (1996). A multivariate linear model for meta-analysis. Psychological Methods, 1, 227-235.
Katz, D. (1964). The motivational basis of organizational behavior. Behavioral Science, 9, 131-146.
*Kaufman, J. D., Stamper, C. L., & Tesluk, P. E. (2001). Do supportive organizations make for good corporate citizens? Journal of Managerial Issues, 13, 436-449.
Kelloway, E. K., Loughlin, C., Barling, J., & Nault, A. (2002). Self-reported counterproductive behaviors and organizational citizenship behaviors: Separate but related constructs. International Journal of Selection and Assessment, 10, 143-151.
Kenny, D. A., & Judd, C. M. (1996). A general procedure for the estimation of interdependence. Psychological Bulletin, 119, 138-148.
*Kidder, D. L. (2002). The influence of gender on the performance of organizational citizenship behaviors. Journal of Management, 28, 629-648.
*Kinicki, A. J., Lockwood, C. A., Hom, P. W., & Griffeth, R. W. (1990). Interviewer predictions of applicant qualifications and interviewer validity: Aggregate and individual analyses. Journal of Applied Psychology, 75, 477-486.
Kline, R. B. (1998). Principles and practice of structural equation modeling. New York: The Guilford Press.
*Koh, W. L., Steers, R. M., & Terborg, J. R. (1995). The effects of transformational leadership on teacher attitudes and student performance in Singapore. Journal of Organizational Behavior, 16, 319-333.
*Konovsky, M. A., & Organ, D. W. (1996). Dispositional and contextual determinants of organizational citizenship behavior. Journal of Organizational Behavior, 17, 253-266.
Koslowsky, M., & Sagie, A. (1993). On the efficacy of credibility intervals as indicators of moderator effects in meta-analytic research. Journal of Organizational Behavior, 14, 695-699.
Koslowsky, M., & Sagie, A. (1994). Components of artifactual variance in meta-analytic research. Personnel Psychology, 47, 561-574.
Kozlowski, S. W. J., Gully, S. M., et al. (1999). Developing adaptive teams: A theory of compilation and performance across levels and time. In D. R. Ilgen & E. D. Pulakos (Eds.), The changing nature of performance: Implications for staffing, motivation, and development (pp. 240-292). San Francisco: Jossey-Bass.
Kravitz, D. A., & Balzer, W. K. (1992). Context effects in performance appraisal: A methodological critique and empirical study. Journal of Applied Psychology, 77, 24-31.
Lacayo, R., & Ripley, A. (2002, December 30/2003, January 6). Persons of the year. TIME, 30-33.
*Lam, S. S. K., Hui, C., & Law, K. S. (1999). Organizational citizenship behavior: Comparing perspectives of supervisors and subordinates across four international samples. Journal of Applied Psychology, 84, 594-601.
Lance, C. E., & Bennett, W. (2000). Replication and extension models of supervisory job performance ratings. Human Performance, 13, 139-158.
Landis, J., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
*Latham, G. P., & Skarlicki, D. P. (1995). Criterion-related validity of the situational and patterned behavior description interviews with organizational citizenship behavior. Human Performance, 8, 67-80.
Lawler, E. E., Mohrman, S. A., & Ledford, G. E. (1995). Creating high performance organizations: Practices and results of employee involvement and total quality management in Fortune 1000 companies. San Francisco: Jossey-Bass.
*Lee, K., & Allen, N. J. (2002). Organizational citizenship behavior and workplace deviance: The role of affect and cognitions. Journal of Applied Psychology, 87, 131-142.
*LePine, J. A., & Van Dyne, L. (2001). Voice and cooperative behavior as contrasting forms of contextual performance: Evidence of differential relationships with big five personality characteristics and cognitive ability. Journal of Applied Psychology, 86, 326-336.
*LePine, J. A., Colquitt, J. A., & Erez, A. (2000). Adaptability to changing task contexts: Effects of general cognitive ability, conscientiousness, and openness to experience. Personnel Psychology, 53, 563-593.
LePine, J. A., Erez, A., & Johnson, D. E. (2002). The nature and dimensionality of organizational citizenship behavior: A critical review and meta-analysis. Journal of Applied Psychology, 87, 52-65.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage Publications.
Locke, E. A. (1982). The ideas of Frederick W. Taylor: An evaluation. Academy of Management Review, 7, 14-24.
*Love, K. G., Bishop, R. C., Heinisch, D. A., & Montei, M. S. (1994). Selection across two cultures: Adapting the selection of American assemblers to meet Japanese job performance demands. Personnel Psychology, 47, 837-846.
Lovell, S. E., Kahn, A. S., Anton, J., Davidson, A., Dowling, E., Post, D., et al. (1999). Does gender affect the link between organizational citizenship behavior and performance evaluation? Sex Roles, 41, 469-478.
Lubinski, D. (2000). Scientific and social significance of assessing individual differences: "Sinking shafts at a few critical points." Annual Review of Psychology, 51, 405-444.
*MacKenzie, S. B., Podsakoff, P. M., & Fetter, R. (1991). Organizational citizenship behavior and objective productivity as determinants of managerial evaluations of salespersons' performance. Organizational Behavior & Human Decision Processes, 50, 123-150.
*MacKenzie, S. B., Podsakoff, P. M., & Paine, J. B. (1999). Do citizenship behaviors matter more for managers than for salespeople? Journal of the Academy of Marketing Science, 27, 396-410.
*MacKenzie, S. B., Podsakoff, P. M., & Rich, G. A. (2001). Transformational and transactional leadership and salesperson performance. Journal of the Academy of Marketing Science, 29, 115-134.
*Mael, F. A., & Ashforth, B. E. (1995). Loyal from day one: Biodata, organizational identification, and turnover among newcomers. Personnel Psychology, 48, 309-333.
Mathieu, J. E., & Zajac, D. M. (1990). A review and meta-analysis of the antecedents, correlates, and consequences of organizational commitment. Psychological Bulletin, 108, 172-194.
Mayberry, P. W., & Carey, N. B. (1997). The effect of aptitude and experience on mechanical job performance. Educational and Psychological Measurement, 57, 131-149.
McCloy, R. A., Campbell, J. P., & Cudeck, R. (1994). A confirmatory test of a model of performance determinants. Journal of Applied Psychology, 79, 493-505.
*McDaniel, M. A. (1989). Biographical constructs for predicting employee suitability. Journal of Applied Psychology, 74, 964-970.
McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The validity of employment interviews: A comprehensive review and meta-analysis. Journal of Applied Psychology, 79, 599-616.
*McHenry, J. J., Hough, L. M., Toquam, J. L., Hanson, M. A., & Ashworth, S. (1990). Project A validity results: The relationship between predictor and criterion domains. Personnel Psychology, 43, 335-354.
*McManus, M. A., & Kelly, M. L. (1999). Personality measures and biodata: Evidence regarding their incremental predictive value in the life insurance industry. Personnel Psychology, 52, 137-148.
*McNeely, B. L., & Meglino, B. M. (1994). The role of dispositional and situational antecedents in prosocial organizational behavior: An examination of the intended beneficiaries of prosocial behavior. Journal of Applied Psychology, 79, 836-844.
*Menguc, B. (2000). An empirical investigation of a social exchange model of organizational citizenship behaviors across two sales situations: A Turkish case. Journal of Personal Selling & Sales Management, 20, 205-214.
Miles, D. E., Borman, W. E., Spector, P. E., & Fox, S. (2002). Building an integrative model of extra role work behaviors: A comparison of counterproductive work behavior with organizational citizenship behavior. International Journal of Selection and Assessment, 10, 51-57.
*Miller, R. L., Griffin, M. A., & Hart, P. M. (1999). Personality and organizational health: The role of conscientiousness. Work and Stress, 13, 7-19.
Mitchell, T. W. (1994). The utility of biodata. In G. S. Stokes, M. D. Mumford, & W. A. Owens (Eds.), Biodata handbook (pp. 485-516). Palo Alto, CA: CPP Books.
*Moorman, R. H. (1991). Relationship between organizational justice and organizational citizenship behaviors: Do fairness perceptions influence employee citizenship? Journal of Applied Psychology, 76, 845-855.
*Moorman, R. H. (1993). The influence of cognitive and affective based job satisfaction measures on the relationship between satisfaction and organizational citizenship behavior. Human Relations, 46, 759-776.
*Moorman, R. H., & Blakely, G. L. (1995). Individualism-collectivism as an individual difference predictor of organizational citizenship behavior. Journal of Organizational Behavior, 16, 127-142.
*Moorman, R. H., Niehoff, B. P., & Organ, D. W. (1993). Treating employees fairly and organizational citizenship behavior: Sorting the effects of job satisfaction, organizational commitment, and procedural justice. Employee Responsibilities and Rights Journal, 6, 209-225.
*Morrison, E. W. (1994). Role definitions and organizational citizenship behavior: The importance of the employee's perspective. Academy of Management Journal, 37, 1543-1567.
Motowidlo, S. J., Borman, W. C., & Schmit, M. J. (1997). A theory of individual differences in task and contextual performance. Human Performance, 10, 71-83.
*Motowidlo, S. J., Carter, G. W., Dunnette, M. D., Tippins, N., Werner, S., Burnett, J. R., & Vaughan, M. J. (1992). Studies of the structured behavioral interview. Journal of Applied Psychology, 77, 571-587.
*Motowidlo, S. J., & Van Scotter, J. R. (1994). Evidence that task performance should be distinguished from contextual performance. Journal of Applied Psychology, 79, 475-480.
*Mount, M. K., Judge, T. A., Scullen, S. E., Sytsma, M. R., & Hezlett, S. A. (1998). Trait, rater and level effects in 360-degree performance ratings. Personnel Psychology, 51, 557-576.
*Mount, M. K., Witt, L. A., & Barrick, M. R. (2000). Incremental validity of empirically keyed biodata scales over GMA and the five factor personality constructs. Personnel Psychology, 53, 299-323.
Muchinsky, P. (1997). Psychology applied to work (5th ed.). Pacific Grove, CA: Brooks/Cole Publishing.
Mumford, M. D., & Owens, W. A. (1987). Methodology review: Principles, procedures, and findings in the application of background data measures. Applied Psychological Measurement, 11, 1-31.
*Mumford, M. D., Costanza, D. P., Connelly, M. S., & Johnson, J. F. (1996). Item generation procedures and background data scales: Implications for construct and criterion-related validity. Personnel Psychology, 49, 361-398.
Murphy, K. R. (1989). Dimensions of job performance. In R. Dillon & J. Pellegrino (Eds.), Testing: Applied and theoretical perspectives (pp. 218-247). New York: Praeger.
Murphy, K. R. (1997). Meta-analysis and validity generalization. In N. Anderson & P. Herriot (Eds.), International handbook of selection and assessment (pp. 323-342). Chichester: John Wiley & Sons.
Murphy, K. R. (2002). Can conflicting perspectives on the role of g in personnel selection be resolved? Human Performance, 15, 173-186.
Murphy, K. R., & Cleveland, J. N. (1995). Understanding performance appraisal: Social, organizational, and goal-based perspectives. Thousand Oaks, CA: Sage.
Murphy, K. R., & DeShon, R. (2000). Interrater correlations do not estimate the reliability of job performance ratings. Personnel Psychology, 53, 873-900.
Murphy, K. R., & Shiarella, A. H. (1997). Implications of the multidimensional nature of job performance for the validity of selection tests: Multivariate frameworks for studying test validity. Personnel Psychology, 50, 823-854.
*Nathan, B. R., & Alexander, R. A. (1988). A comparison of criteria for test validation: A meta-analytic investigation. Personnel Psychology, 41, 517-535.
*Nathan, B. R., & Tippins, N. (1990). The consequences of halo "error" in performance ratings: A field study of the moderating effect of halo on test validation results. Journal of Applied Psychology, 75, 290-296.
Neisser, U., Boodoo, G., Bouchard, T. J., Boykin, A. W., Brody, N., Ceci, S. J., Halpern, D. F., Loehlin, J. C., Perloff, R., Sternberg, R. J., & Urbina, S. (1996). Intelligence: Knowns and unknowns. American Psychologist, 51, 77-101.
*Neuman, G. A., & Wright, J. (1999). Team effectiveness: Beyond skills and cognitive ability. Journal of Applied Psychology, 84, 376-389.
*Neuman, G. A., & Kickul, J. R. (1998). Organizational citizenship behaviors: Achievement orientation and personality. Journal of Business & Psychology, 13, 263-279.
Nickels, B. J. (1994). The nature of biodata. In G. S. Stokes, M. D. Mumford, & W. A. Owens (Eds.), Biodata handbook (pp. 1-16). Palo Alto, CA: CPP Books.
*Niehoff, B. P., & Moorman, R. H. (1993). Justice as a mediator of the relationship between methods of monitoring and organizational citizenship behavior. The Academy of Management Journal, 36, 527-556.
*Nikolaou, I., & Robertson, I. T. (2001). The Five-Factor model of personality and work behaviour in Greece. European Journal of Work & Organizational Psychology, 10, 161-186.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory. New York: McGraw-Hill.
*O'Connell, M. S., Doverspike, D., Norris-Watts, C., & Hattrup, K. (2001). Predictors of organizational citizenship behavior among Mexican retail salespeople. International Journal of Organizational Analysis, 9, 272-280.
*O'Connell, M. S., Hattrup, K., Doverspike, D., & Cober, A. (2002). The validity of "mini" simulations for Mexican retail salespeople. Journal of Business & Psychology, 16, 593-600.
Organ, D. W. (1988). Organizational citizenship behavior: The good soldier syndrome. Lexington, MA: Lexington Books.
Organ, D. W. (1997). Organizational citizenship behavior: It's construct clean-up time. Human Performance, 10, 85-97.
*Organ, D. W., & Konovsky, M. (1989). Cognitive versus affective determinants of organizational citizenship behavior. Journal of Applied Psychology, 74, 153-164.
Organ, D. W., & Ryan, K. (1995). A meta-analytic review of attitudinal and dispositional predictors of organizational citizenship behavior. Personnel Psychology, 48, 775-802.
Orwin, R. G., & Cordray, D. S. (1985). Effects of deficient reporting on meta-analysis: A conceptual framework and reanalysis. Psychological Bulletin, 97, 134-147.
Outtz, J. L. (2002). The role of cognitive ability tests in employment selection. Human Performance, 15, 161-172.
Paese, P. W., & Switzer, F. S., III (1988). Validity generalization and hypothetical reliability distributions: A test of the Schmidt-Hunter procedure. Journal of Applied Psychology, 73, 267-274.
Penner, L. A., Midili, A. R., & Kegelmeyer, J. (1997). Beyond job attitudes: A personality and social psychology perspective on the causes of organizational citizenship behavior. Human Performance, 10, 111-131.
Peterson, N. G., Hough, L. M., Dunnette, M. D., Rosse, R. L., Houston, J. S., Toquam, J. L., & Wing, H. (1990). Project A: Specification of the predictor domain and development of new selection/classification tests. Personnel Psychology, 43, 247-276.
*Phillips, J. M., & Gully, S. M. (1997). Role of goal orientation, ability, need for achievement, and locus of control in self-efficacy and goal-setting process. Journal of Applied Psychology, 82, 792-802.
*Piedmont, R. L., & Weinstein, H. P. (1994). Predicting supervisor ratings of job performance using the NEO personality inventory. The Journal of Psychology, 128, 255-265.
*Ployhart, R. E., Lim, B., & Chan, K. (2001). Exploring relations between typical and maximum performance ratings and the five factor model of personality. Personnel Psychology, 54, 809-843.
*Ployhart, R. E., Weekley, J. A., Holtz, B. C., & Kemp, C. (2003). Web-based and paper-and-pencil testing of applicants in a proctored setting: Are personality, biodata and situational judgment tests comparable? Personnel Psychology, 56, 733-752.
*Podsakoff, P. M., & MacKenzie, S. B. (1994). Organizational citizenship behaviors and sales unit effectiveness. Journal of Marketing Research, 31, 351-363.
*Podsakoff, P. M., MacKenzie, S. B., & Bommer, W. H. (1996). Transformational leader behaviors and substitutes for leadership as determinants of employee satisfaction, commitment, trust, and organizational citizenship behaviors. Journal of Management, 22, 259-298.
*Podsakoff, P. M., MacKenzie, S. B., & Fetter, R. (1993). Substitutes for leadership and the management of professionals. The Leadership Quarterly, 4, 1-44.
*Podsakoff, P. M., MacKenzie, S. B., Moorman, R. H., & Fetter, R. (1990). Transformational leader behaviors and their effects on followers' trust in leader. The Leadership Quarterly, 1, 107-142.
Podsakoff, P. M., MacKenzie, S. B., Paine, J. B., & Bachrach, D. G. (2000). Organizational citizenship behaviors: A critical review of the theoretical and empirical literature and suggestions for future research. Journal of Management, 26, 513-563.
Puffer, S. M. (1987). Prosocial behavior, noncompliant behavior, and work performance among commission salespeople. Journal of Applied Psychology, 72, 615-621.
*Pulakos, E. D., & Schmitt, N. (1996). An evaluation of two strategies for reducing adverse impact and their effects on criterion-related validity. Human Performance, 9, 241-258.
*Pulakos, E. D., Borman, W. C., & Hough, L. M. (1988). Test validation for scientific understanding: Two demonstrations of an approach to studying predictor-criterion linkages. Personnel Psychology, 41, 703-716.
Pulakos, E. D., White, L. A., Oppler, S. H., & Borman, W. C. (1989). Examination of race and sex effects on performance ratings. Journal of Applied Psychology, 74, 770-780.
Raju, N. S., Burke, M. J., Normand, J., & Langlois, G. M. (1991). A new meta-analytic approach. Journal of Applied Psychology, 76, 432-446.
Raju, N. S., Pappas, S., & Williams, C. P. (1989). An empirical Monte Carlo test of the accuracy of the correlation, covariance, and regression slope models for assessing validity generalization. Journal of Applied Psychology, 74, 901-911.
*Randall, M. L., Cropanzano, R., Bormann, C. A., & Birjulin, A. (1999). Organizational politics and organizational support as predictors of work attitudes, job performance, and organizational citizenship behavior. Journal of Organizational Behavior, 20, 159-174.
Raudenbush, S. W., Becker, B. J., & Kalaian, H. (1988). Modeling multivariate effect sizes. Psychological Bulletin, 103, 111-120.
Ree, M. J., & Carretta, T. R. (2002). g2K. Human Performance, 15, 3-23.
*Ree, M. J., Carretta, T. R., & Teachout, M. S. (1995). Role of ability and prior job knowledge in complex training performance. Journal of Applied Psychology, 80, 721-730.
*Ree, M. J., Earles, J. A., & Teachout, M. S. (1994). Predicting job performance: Not much more than g. Journal of Applied Psychology, 79, 518-524.
*Rioux, S. M., & Penner, L. A. (2001). The causes of organizational citizenship behavior: A motivational analysis. Journal of Applied Psychology, 86, 1306-1314.
Rothstein, H. R. (1990). Interrater reliability of job performance ratings: Growth to asymptote level with increasing opportunity to observe. Journal of Applied Psychology, 75, 322-327.
Rothstein, H. R., McDaniel, M. A., & Borenstein, M. (2002). Meta-analysis: A review of quantitative cumulation methods. In F. Drasgow & N. Schmitt (Eds.), Measuring and analyzing behavior in organizations (pp. 534-570). San Francisco: Jossey-Bass.
Rotton, J., Foos, P. W., Van Meek, L., & Levitt, M. (1995). Publication practices and the file drawer problem: A survey of published authors. Journal of Social Behavior and Personality, 10, 1-13.
Rotundo, M., & Sackett, P. R. (2002). The relative importance of task, citizenship, and counterproductive performance to global ratings of job performance: A policy capturing approach. Journal of Applied Psychology, 87, 66-80.
Russell, C. J., & Gilliland, S. W. (1995). Why meta-analysis doesn't tell us what the data really mean: Distinguishing between moderator effects and moderator processes. Journal of Management, 21, 813-831.
*Russell, C. J., Mattson, J., Devlin, S. E., & Atwater, D. (1990). Predictive validity of biodata items generated from retrospective life experience essays. Journal of Applied Psychology, 75, 569-580.
*Ryan, A. M., Ployhart, R. E., & Friedel, L. A. (1998). Using personality testing to reduce adverse impact: A cautionary note. Journal of Applied Psychology, 83, 298-302.
*Sackett, P. R., Gruys, M. L., & Ellingson, J. E. (1998). Ability-personality interactions when predicting job performance. Journal of Applied Psychology, 83, 545-556.
Sackett, P. R., Schmitt, N., Ellingson, J. E., & Kabin, M. B. (2001). High-stakes testing in employment, credentialing, and higher education: Prospects in a post-affirmative-action world. American Psychologist, 56, 302-318.
Sagie, A., & Koslowsky, M. (1993). Detecting moderators with meta-analysis: An evaluation and comparison of techniques. Personnel Psychology, 46, 629-640.
Salgado, J. F. (1998). Big five personality dimensions and job performance in Army and civil occupations: A European perspective. Human Performance, 11, 271-288.
SAS Institute Inc. (2001). The SAS system for Windows (Version 8). Cary, NC: SAS Institute Inc.
Schmidt, F. L. (1988). The problem of group differences in ability scores in employment selection. Journal of Vocational Behavior, 33, 272-292.
Schmidt, F. L. (2002). The role of general cognitive ability and job performance: Why there cannot be a debate. Human Performance, 15, 187-210.
Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529-540.
Schmidt, F. L., & Hunter, J. E. (1981). Employment testing: Old theories and new research findings. American Psychologist, 36, 1128-1137.
Schmidt, F. L., & Hunter, J. E. (1992). Causal modeling of processes determining job performance. Current Directions in Psychological Science, 1, 89-92.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262-274.
Schmidt, F. L., Hunter, J. E., McKenzie, R. C., & Muldrow, T. W. (1979). Impact of valid selection procedures on work-force productivity. Journal of Applied Psychology, 64, 609-626.
*Schmidt, F. L., Hunter, J. E., Outerbridge, A. N., & Goff, S. (1988). Joint relation of experience and ability with job performance: Test of three hypotheses. Journal of Applied Psychology, 73, 46-57.
E., Outerbridge, A. N., & Goff, S. (1988). Joint relation of experience and ability with job performance: Test of three hypotheses. Journal of Applied Psychology, 73, 46-57. Schmidt, F. L., Hunter, J. E., Pearlman, K., Rothstein Hirsh, H., Sackett, P. R., Schmitt, N., Tenopyr, M. L., Kehoe, J ., & Zedeck, S. (1985). Forty questions about validity generalization and meta-analysis. Personnel Psychology, 38, 697-798. 186 Schmidt, F. L., Hunter, J. E., & Pearlman, K. (1981). Task differences as moderators of aptitude test validity in selection: A red herring. Journal of Applied Psychology, 66, 166-185. Schmidt, F. L., Hunter, J. E., Pearlman, K., & Shane, G. S. (1979). Further tests of the Schmidt-Hunter bayseian validity generalization procedure. Personnel Psychology, 32, 257-281. Schmidt, F. L., Law, K., Hunter, J. E., Rothstein, R., Pearhnan, K. & McDaniel, M. (1993). Refinements in validity generalization methods: Implications for the situational specificity hypothesis. Journal of Applied Psychology, 78, 3-12. *Schmidt, F. L. & Rader, M. (1999). Exploring the boundary conditions for interview validity: Meta-analytic validity findings for a new interview type. Personnel Psychology, 52, 445-464. Schmidt, F. L., Viswesvaran, C. & Ones, D. S. (2000). Reliability is not validity and validity is not reliability. Personnel Psychology, 53, 901-912. *Schmitt, N. & Ryan, A. M. R. (1993). The big five in personnel selection: Factor structure in applicant and nonapplicant populations. Journal of Applied Psychology, 78, 966-974. Schmitt, N., Rogers, W., Chan, D., Sheppard, L., & Jennings, D. (1997). Adverse impact and predictive efficiency of various predictor combinations. Journal of Applied Psychology, 82, 719-730. Schnake. M. (1991). Organizational citizenship: A review, proposed model, and research agenda. Human Relations, 44, 735-759. *Schnake. M., Dumler, M. P. & Cochran, D. S. (1993). The relationship between “traditional” leadership, “super” leadership, and organizational citizenship behavior. Group and Organization Management, 18, 352-365. Scullen, S. E., Mount, M. K. & Goff, M. (2000). Understanding the latent structure of job performance ratings. Journal of Applied Psychology, 85 , 956-970. *Scullen, S. E., Mount, M. K. & Judge, T. A. (2003). Evidence of the construct validity of developmental ratings of managerial performance. Journal of Applied Psychology, 88, 50-66. Settoon, R. P. & Mossholder, K. W. (2002). Relationship quality and relationship context as antecedents of person- and task-focused interpersonal citizenship behavior. Journal of Applied Psychology, 87, 255-267. 187 Shaw, J. C., Wild, E. & Colquitt, J. A. (2003). To justify or excuse?: A meta-analytic review of the effects of explanations. Journal of Applied Psychology, 88, 444- 458. *Shore, L. M. & Wayne, S. J. (1993). Commitment and employee behavior: Comparison of affective commitment and continuance commitment with perceived organizational support. Journal of Applied Psycholog, 78, 774-480. *Shore, L. M., Barksdale, K., & Shore, T. H. (1995). Managerial perceptions of employee commitment to the organization. Academy of Management Journal, 38, 1593-1615. *Shore, L. M., Tetrick, L. E., Shore, T. H., & Barksdale, K. (2000). Construct validity of measures of Becker's side bet theory. Journal of Vocational Behavior, 5 7, 428- 444. Solomonson, A. L. & Lance, C. E. (1997). Examination of the relationship between true halo and halo error in performance ratings. Journal of Applied Psychology, 82, 665-674. *Stewart, G. L. (1996). 
*Stokes, G. S., & Searcy, C. A. (1999). Specification of scales in biodata form development: Rational vs. empirical and global vs. specific. International Journal of Selection & Assessment, 7, 72-85.
*Tansky, J. W. (1993). Justice and organizational citizenship behavior: What is the relationship? Employee Responsibilities and Rights Journal, 6, 195-207.
Taylor, F. W. (1911). Scientific management. The principles of scientific management (pp. 30-48; 57-60). New York: Harper & Row.
Taylor, F. W. (1912). What is scientific management? Excerpted testimony before the U.S. House of Representatives.
*Tepper, B. J., Lockhart, D., & Hoobler, J. (2001). Justice, citizenship, and role definition effects. Journal of Applied Psychology, 86, 789-796.
Tett, R. P., & Meyer, J. P. (1993). Job satisfaction, organizational commitment, turnover intention, and turnover: Path analyses based on meta-analytic findings. Personnel Psychology, 46, 259-293.
Thompson, J. D. (1967). Organizations in action. New York: McGraw-Hill.
*Tompson, H. B., & Werner, J. M. (1997). The impact of role conflict/facilitation on core and discretionary behaviors: Testing a mediated model. Journal of Management, 23, 583-601.
*Turnipseed, D. L. (2002). Are good soldiers good? Exploring the link between organization citizenship behavior and personal ethics. Journal of Business Research, 55, 1-15.
*Turnipseed, D. L. (2003). Hardy personality: A potential link with organizational citizenship behavior. Psychological Reports, 93, 529-543.
*Turnley, W. H., Bolino, M. C., Lester, S. W., & Bloodgood, J. M. (2003). The impact of psychological contract fulfillment on the performance of in-role and organizational citizenship behaviors. Journal of Management, 29, 187-206.
U.S. Department of Labor. (1991). Dictionary of Occupational Titles (Rev. 4th ed.). Washington, DC: U.S. Government Printing Office.
Van Dyne, L., Cummings, L. L., & Parks, J. M. (1995). Extra-role behaviors: In pursuit of construct and definitional clarity (a bridge over muddied waters). Research in Organizational Behavior, 17, 215-285.
*Van Dyne, L., Graham, J. W., & Dienesch, R. M. (1994). Organizational citizenship behavior: Construct redefinition, measurement, and validation. Academy of Management Journal, 37, 765-802.
*Van Scotter, J. R., & Motowidlo, S. J. (1996). Interpersonal facilitation and job dedication as separate facets of contextual performance. Journal of Applied Psychology, 81, 525-531.
Van Scotter, J. R., Motowidlo, S. J., & Cross, T. C. (2000). Effects of task performance and contextual performance on systemic rewards. Journal of Applied Psychology, 85, 526-535.
*Van Yperen, N. W., van den Berg, A. E., & Willering, M. C. (1999). Towards a better understanding of the link between participation in decision-making and organizational citizenship behaviour: A multilevel analysis. Journal of Occupational & Organizational Psychology, 72, 377-392.
*Villanova, P., Bernardin, H. J., Johnson, D. L., & Dahmus, S. A. (1994). The validity of a measure of job compatibility in the prediction of job performance and turnover of motion picture theater personnel. Personnel Psychology, 47, 73-90.
Vinchur, A. J., Schippmann, J. S., Switzer, F. S., III, & Roth, P. L. (1998). A meta-analytic review of predictors of job performance for salespeople. Journal of Applied Psychology, 83, 586-597.
Viswesvaran, C., & Ones, D. S. (1995). Theory testing: Combining psychometric meta-analysis and structural equations modeling. Personnel Psychology, 48, 865-885.
Viswesvaran, C., & Ones, D. S. (2002). Agreements and disagreements on the role of general mental ability (GMA) in industrial, work, and organizational psychology. Human Performance, 15, 212-231.
Viswesvaran, C., Ones, D. S., & Schmidt, F. L. (1996). Comparative analysis of the reliability of job performance ratings. Journal of Applied Psychology, 81, 557-574.
Wanous, J. P., Sullivan, S. E., & Malinak, J. (1989). The role of judgment calls in meta-analysis. Journal of Applied Psychology, 74, 259-264.
Whitener, E. M. (1990). Confusion of confidence intervals and credibility intervals in meta-analysis. Journal of Applied Psychology, 75, 315-321.
*Williams, L. J., & Anderson, S. E. (1991). Job satisfaction and organizational commitment as predictors of organizational citizenship and in-role behaviors. Journal of Management, 17, 601-617.
*Witt, L. A., Burke, L. A., Barrick, M. R., & Mount, M. K. (2002). The interactive effects of conscientiousness and agreeableness on job performance. Journal of Applied Psychology, 87, 164-169.
Wothke, W. (1993). Nonpositive definite matrices in structural modeling. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 256-293). Newbury Park, CA: Sage Publications.
*Yoon, M. H., & Suh, J. (2003). Organizational citizenship behaviors and service quality as external effectiveness of contact employees. Journal of Business Research, 56, 597-611.