This is to certify that the dissertation entitled "The Relationship of Sex of Candidate and Prestige of Institution to Faculty Performance Evaluation at the University Committee Level," presented by Elizabeth A. Hansen, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Education.

Major Professor

Date: February 1, 1990

THE RELATIONSHIP OF SEX OF CANDIDATE AND PRESTIGE OF INSTITUTION TO FACULTY PERFORMANCE EVALUATION AT THE UNIVERSITY COMMITTEE LEVEL

By

Elizabeth A. Hansen

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Educational Administration

1989

ABSTRACT

THE RELATIONSHIP OF SEX OF CANDIDATE AND PRESTIGE OF INSTITUTION TO FACULTY PERFORMANCE EVALUATION AT THE UNIVERSITY COMMITTEE LEVEL

By Elizabeth A. Hansen

This research provided a means for administrators to compare faculty decision-making against a mathematical model of the perceived rating system. Faculty subjects rated hypothetical applicants for tenure, and their ratings were compared to the computer model's decisions. The variables were the sex of the hypothetical applicant and the prestige of the candidate's institution. The subjects were tenure-track faculty members from Central Michigan University. An analysis of variance yielded the following results: male candidates were rated higher than female candidates when the candidates were strong in research; female candidates were rated higher than male candidates when the candidates were strong in teaching; overall, male candidates were rated higher than female candidates; male candidates from high-prestige schools were rated higher than male candidates from low-prestige schools; and female candidates from low-prestige schools were rated higher than female candidates from high-prestige schools.

Copyright by ELIZABETH ANN HANSEN 1989

ACKNOWLEDGMENTS

I wish to thank my advisor, Dr. Marvin E. Grandstaff, and the members of my committee, Dr. Samuel A. Moore, Dr. Kenneth L. Neff, and Dr. Eldon R. Nonnamaker, for their support. Special thanks to the faculty members of Central Michigan University who participated in the study. Dr. William Lewis was particularly helpful; I thank him. Also, I wish to thank my husband, my sons, and my parents on the home front.
TABLE OF CONTENTS

Acknowledgments
Abstract
List of Tables
Chapter 1: Introduction
  Statement of the Problem
  Purpose of the Study
  Significance of the Study
  Hypotheses
  Setting of the Study
  Scope of the Study
Chapter 2: Review of the Literature
  Problems with Evaluation
  Two Types of Bias
  Criteria for Evaluation: Teaching, Research and Service
  The University Committee
  Summary
Chapter 3: Design of the Study
  Introduction
  Model for Faculty Evaluation
  Rationale for Model
  Definitions
  Development of Model
  Selection of Subjects
  Procedure
  Analysis
Chapter 4: Presentation of Results
  Introductory Explanation
  Overall Analysis of Data
  Summary of Results
Chapter 5: Conclusions and Recommendations
  Conclusions
  Recommendations
  Speculations and Recommendations for Further Research
Appendices
  A. Biographical Material on Hypothetical Candidates for Tenure
  B. Criteria for Tenure
  C. Instructions
  D. The Computer Program
Bibliography

LIST OF TABLES

1. Latin Square Design of Study
2. Data Presented to Subjects
3. Sum of Scores for Each Hypothetical Candidate
4. ANOVA: Overall
5. Sum of Scores for Candidates in High Teaching Category
6. ANOVA: Sex With High Teaching Category
7. Sum of Scores for Candidates in High Research Category
8. ANOVA: Sex With High Research Category
9. Sum of Scores for Candidates in Low Research Category
10. ANOVA: Sex With Low Research Category
11. Sum of Scores for Candidates in Low Teaching Category
12. ANOVA: Sex With Low Teaching Category
13. Sum of Scores for Male and Female Hypothetical Candidates
14. ANOVA: Sex
15. Sum of Scores of High and Low Prestige Candidates
16. ANOVA: Prestige

CHAPTER 1
INTRODUCTION

Statement of the Problem

The problem investigated was whether bias, based either on the sex of a candidate or on the prestige of the candidate's institution, would affect a performance evaluation of that candidate by a university-wide personnel committee.

Purpose of the Study

Results from this research will help administrators determine whether unasked-for variables may be introduced into the decision-making process when members of a tenure and promotion committee are given distinct and measurable performance objectives by which to evaluate a group of faculty. Also, a computerized model for decision-making is offered as a means of comparing committee members' decisions with an unbiased rating.

Significance of the Study

The significance of this research is that it provides a means for administrators to compare faculty decision-making against a mathematical model of the perceived rating system. If the faculty decisions and the decisions produced by the model do not match, then there may be unnamed variables that were not added to the model. These variables may have been left out of the model by neglect, in which case the administrator could easily adjust the computer program to consider them. Or they may be variables that were not in the model because they should not be included in the decision process (such as race or sex), in which case the administrator would have to reconsider the committee decision.

Hypotheses

The following hypotheses were tested.
The general hypothesis was: when given information to evaluate faculty for a tenure decision, and the faculty are from a field outside one's field of expertise, the members of a university tenure committee will render decisions that are biased with respect to stated measurable criteria for tenure.

The general hypothesis was tested by ascertaining whether sex might influence the decisions. The null hypothesis was that the sex of an applicant does not significantly affect tenure decisions.

The general hypothesis was also tested by ascertaining whether the perceived prestige of the hypothetical candidate might influence the decisions (prestige was determined by where the hypothetical candidates earned their terminal degrees). The null hypothesis was that the prestige of the hypothetical candidate's background does not significantly affect tenure decisions.

Setting of the Study

The subjects were tenure-track faculty members from Central Michigan University, a large, predominantly undergraduate institution. There were eight faculty members from the computer science department, eight from the education department, eight from the mathematics department, and eight from the English department.

Scope of the Study

The subjects were limited to one institution of higher learning to provide a more homogeneous group and so limit possible effects of extraneous variables. The subjects were limited to tenure-track faculty for the same reason. The departments were chosen for two reasons: on the basis of size, they were large enough to contain at least eight subjects willing to participate; and on the basis of subject matter, mathematics and computer science are more quantitative than education and English. These four departments were also chosen because they were not at all related to the hypothetical textile department; this was to simulate a university committee making decisions about a faculty member whose department was not represented on the committee.

CHAPTER 2
REVIEW OF THE LITERATURE

Problems with Evaluation

The issue of faculty evaluation has been commented upon since the beginnings of the university system itself. The roots and rituals of the university go back to the Middle Ages. "Its hierarchical arrangements are simple and standardized, but the academic hierarchy includes a greater range of skills and a greater diversity of tasks than any business or military organization...It is not easy...to determine the fundamental purposes of a university or the relative importance of different activities in contributing to those purposes."1 Even though it may be hard to determine the factors, and their respective proportions, that should be included in the decision-making process, this should be attempted anyway, since the decision will be made in any case.

The problem of faculty evaluation is compounded by problems with communication. As in any organization, there are ways communication can be distorted or even prevented. This can be described as an information screen: "...an information screen may be defined as a set of social practices, beliefs, and behaviors within an organized group which inhibits the communication of certain kinds of information between certain positions or in certain directions..."2 There are many types of information screens.

1Theodore Caplow and Reece J. McGee, The Academic Marketplace (New York: Arno Press, 1977), p. 4.
Caplow and McGee identify one information screen as being "erected by the university's administrative officials to shield from the working members of their departments the criteria by which men are officially evaluated. Our data abound in complaints from professors that they were not told exactly either why a given colleague was hired or fired or what he had or did not have that someone else had or did not have. Every university has its legends about certain firings or about recommended promotions that were never made."3

This problem of communication creates a confusing situation for the faculty member. "In view of the vague and conflicting criteria by which his work is judged, he is uncertain in the allocation of his energies. He knows that he is a competitor, but often is not clear regarding the terms of the competition."4 Also, because of the lack of communication, the people doing the evaluation might use procedures that are not standardized or not appropriate. Bias can thus be introduced into the decision-making process.

2Ibid., p. 59.
3Ibid., pp. 60-61.
4Logan Wilson, The Academic Man: A Study in the Sociology of a Profession (London: Oxford University Press, 1942), p. 62.

Two Types of Bias

Two types of bias were investigated in this research. One type of bias is based on the perceived prestige of the faculty member.

The higher the rank of the department in the disciplinary prestige system, the more it serves its individual members by conferring a derivative reputation on them. This reputation tends to make them more desirable to other universities, more independent of their own, and more inclined to mobility....the higher the prestige of a department, the greater will be the tendency for its members to be oriented to the discipline rather than to the university....in the high-prestige institutions men are hired on an estimate of how much research they are likely to do. When their tenure is decided, direct utility to the university hardly enters as a factor in the decision to keep them. The measurement of their worth is haunted by quite another problem: their usefulness in future staff procurement.5

Prestige is an important factor in faculty career patterns and can influence decision-making about new faculty members. Where faculty members received their degrees can be more influential than other factors such as achievement, talent, or evaluation results.6 This factor of prestige is thought to be a dominant consideration in professorial concerns. "Professors wish to be number one - if not for themselves then for their department; if not in their own area of investigation, then as teachers; and certainly for their institution as a whole and its ranking with other colleges and universities."7

5Ibid., p. 107.
6David W. Breneman and Ted I. K. Youn (eds.), Academic Labor Markets and Careers (Philadelphia: Falmer Press, 1988).
7Robert Blackburn, "The Meaning of Work in Academia," New Directions for Institutional Research (San Francisco: Jossey-Bass, 1974), I, p. 80.

Research has shown that the norms and values critical
to high performance are cultivated in graduate programs; therefore, doctoral prestige can be used as a predictor of research performance.8

Another type of bias investigated is sex bias. Communications expert Patricia King comments, "Some people rate minorities and women lower than others; some expect so little of them that anything they accomplish seems like a miracle and gets very high marks; others bend over backward to give them a break and rate them higher than they deserve."9

There are many reports of court cases involving sex bias in tenure decisions. One of the problems in doing a study investigating bias is that confidential files must be opened; even the Supreme Court has had a difficult time getting access to confidential decision-making information.10 Research that can reveal sex bias by using dummy files and comparing the decision-making results of a faculty committee against a computer program is non-invasive to confidential files. Another problem in determining sex bias is that men produce more research than women, but comparisons between them on a simple count basis are inappropriate, because men are more likely to have begun their careers earlier and have them interrupted less frequently, to have lighter teaching loads, to be employed full-time, and to be paid more.11

8David Dill, "Research as a Scholarly Activity: Context and Culture," New Directions for Institutional Research (San Francisco: Jossey-Bass, 1986), XIII, p. 13.
9Patricia King, Performance Planning and Appraisal (New York: McGraw-Hill Book Company, 1984), p. 57.
10"In Weighing Sex-bias Case, High Court Will Skirt Issue of Confidentiality of Tenure-Review Records," The Chronicle of Higher Education, January 4, 1989, p. A13.
11Blackburn, loc. cit., p. 87.

Criteria for Evaluation: Teaching, Research and Service

The criteria for the evaluation of faculty must reflect the mission of the university or college. Prestige and recognition are as important as the security and financial rewards that are associated with earned promotion and tenure. Many institutions consider the triad of teaching, research, and service the focus for the evaluation of faculty. Therefore, rewards in higher education tend to be related to these three focal points, with the emphasis on one or another changing depending upon the particular institution or the particular period in history.

Teaching is a paramount responsibility of a college or university. During the expansion of higher education in the 1960s, most departments were concerned with recruiting and retaining faculty. Now, many departments are not expanding, so they are often forced to make fine distinctions between faculty members. In making these fine distinctions, many administrators are emphasizing measurable evidence for evaluating teaching, such as systematic student ratings and the content of course syllabi and examinations.12

Research has been increasingly emphasized in higher education. From the Morrill and Hatch Acts of the nineteenth century to the investments of private corporations in the development of new technology, it is evident that external influences are changing the face of institutional reward systems. "Campus after campus has been moving aggressively to upgrade the importance of scholarly productivity as a criterion for academic personnel decisions..."13

12John A. Centra, "Using Student Assessments to Improve Performance and Vitality," New Directions for Institutional Research (San Francisco: Jossey-Bass, 1978), p. 40.
13J. N. Schuster and H. R. Bowen, "The Faculty at Risk," Change, 1985, 17, pp. 15-16.

Measuring research performance has been problematic. Typically, research performance has been measured by counting the number of publications.
As one professor commented: "It's very simple. We look at all the publications. Then the committee gets together and we have a gut reaction."14 Quantitative measures are popular, such as the number of journal articles (refereed and nonrefereed), books, unpublished research, citations, and grants. The use of these quantitative, written products of research to assess faculty performance "fits well with the assertion of Etzioni and Lehman (1967)15 that assessments tend to be based on the attributes that are the easiest to measure."16

Service, the third tier of faculty performance, tends to be poorly conceptualized and inconsistently expressed in higher education. Service can be thought of as campus committee assignments, membership in a local church, consulting with local businesses, and serving in professional associations. The idea that the university should provide public and community service came from the commons concept in land use. "It seems to be the logical development from the land-grant idea of institutional commitment to services that reach beyond students who are enrolled in degree programs on campus. This service mission emerged out of a conviction that knowledge is useful beyond the classroom and that the university had a responsibility to the society to extend the benefits of learning to the larger public."17

14Peter Seldin, Successful Faculty Evaluation Programs (Crugers, N.Y.: Coventry, 1980), p. 34.
15A. Etzioni and E. W. Lehman, "Some Dangers in 'Valid' Social Measurements," The Annals, 1967, 373, pp. 1-15.
16J. M. Braxton and A. E. Bayer, "Assessing Faculty Scholarly Performance," New Directions for Institutional Research (San Francisco: Jossey-Bass, 1986), p. 28.
17Durward Long, "The University as Commons: A View from Administration," New Directions for Institutional Research (San Francisco: Jossey-Bass, 1977), V, p. 75.

The University Committee

In a study conducted by Peter Seldin, it was found that the policies and practices used to evaluate faculty performance are "becoming more structured and systemized. Chairman and dean evaluations, while still very important, are losing ground to formal faculty committees, self-evaluation, and colleagues' opinions...The trend to decentralization and the sharing of decision-making seems clear, as does the growing effort toward the reliability of the evaluative process."18 This trend would seem to indicate that the university promotion and tenure committee might exert more influence in the future.

The university-wide committee represents university perspectives and standards and ensures appropriate consideration of the long-term academic priorities of the university and its fiscal situation.19 This implies that the denial of tenure or promotion could be on the basis of fiscal priorities or academic planning rather than the qualities of the individual. Research that facilitates the work of the committee by increasing the reliability of its evaluations can be very useful. The standardization of some of the decision-making should help lessen the confusion in the minds of faculty who are being evaluated and should increase the reliability of the committee decision. All of this in turn can decrease the time and cost involved in the evaluation process.

18Peter Seldin, Successful Faculty Evaluation Programs (Crugers, New York: Coventry Press, 1980), p. 34.
19Donald K.
Smith, "Faculty Vitality and the Management of University Personnel Policies," New Directions for Institutional Research (San Francisco: Jossey-Bass, 1978), V, p. 9.

University committees tend to look at folders which contain written information about the faculty member under discussion. Biographical data have historically been a tool in the selection process. One of the best predictors of what individuals are capable of is the record of what they have done in the past.20

Summary

The evaluation of a faculty member is problematic. The evaluation can be biased and error-prone, but since a decision has to be made, it is important to understand where bias can appear. The literature supports the trend in higher education to quantify decision-making about faculty evaluation. This research points out where bias can appear and offers a computerized model which can highlight bias and help an administrator toward more informed decision-making.

20Michael Nash, Making People Productive (San Francisco: Jossey-Bass, 1985), p. 43.

CHAPTER 3
DESIGN OF THE STUDY

Introduction

In order to compare faculty responses to an unbiased response, a mechanism for obtaining an unbiased response was put into place. Since calculations based on the tenure criteria can be very tedious to perform, a computer program which allows users to implement the model was used in this study (see Appendix D). The computer program allows the user to do three things. The first is to devise evaluation criteria, that is, to choose high-level characteristics such as teaching, research, and service, and to choose corresponding primitive characteristics, such as teaching evaluation scores, number of journal articles, and how many times the candidate for tenure lectured in the local community. The second is to weight any evaluation criterion, such as giving 60% weight to research. The third is to evaluate an individual. Thus, if a faculty member evaluates a hypothetical candidate for tenure on specific criteria and produces a different ranking than the computer program does, then the faculty member allowed an unasked-for criterion to influence the decision-making. The study was designed so that the only unasked-for differences among the hypothetical candidates were the sex of the candidate and the candidate's Ph.D.-granting institution.

Model for Faculty Evaluation

The model allows you to choose which broad areas you use in evaluating faculty, and within these broad areas it allows you to specify sub-areas. Everything can be weighted, and a single number results, so that comparisons can be made across faculty. The weights can be modified so that administrators can predict the results of a policy change on the number of faculty attaining promotion or tenure.

Rationale for Model

One takes measurements of system components to see if the components are performing as the system requires. We may view college professors as components in a very complex system called a university. Often people argue that it is not fair to take measurements of a subset of a college professor's performance and then base a promotion or tenure decision on these measurements. They usually base their arguments on the fact that any model of performance will be too simple to capture the complexity of the job. What these people forget is that even if measurements are not taken, a decision will be made as to how well a college professor performs. By forcing a university to come up with a mathematical model for faculty evaluation, two things might happen.
Sometimes the very act of building the model and seeing how the model evaluates college professors will cause the institution to admit that it has been saying something it did not mean to faculty, such as "we value teaching above all," while results from the model consistently show that it is research that gives an instructor a higher rating. Having the evaluation model also allows faculty to evaluate themselves by the model at times other than formal university evaluation periods and to make any mid-career adjustments that are necessary.

Definitions

The word "system" is used in many contexts: there are educational systems, biological systems, and political systems. A system consists of interrelated subsystems that work together to convert inputs to outputs.

...we can define a system as a relation between inputs and outputs. No matter how simple or complex the system, its successful use depends on an understanding of its structure. This is the main task of system analysis. The job of the analyst is to study how an organized whole can process the inputs. The analyst wants to know the character of the system in order to forecast its future. Destiny means evolution over time, and this evolution is governed by the character of the system--that is, by its structure...we can model the structure by fixing our attention on the state of the system. The state is defined as the minimum information required to describe the system's condition in such a way that, if the inputs are known, then the condition at any time is completely determined.21

It will be difficult to achieve the goal of improvement of quality unless we can define and measure the components of quality. In a restricted sense, quality is often considered synonymous with reliability...If we can define and measure these characteristics of quality with some degree of precision, then managers and customers can use such quality metrics either to set goals to be achieved by a software product, or as the basis for rejection or acceptance of a completed project...each high-level characteristic is decomposed into primitive characteristics...If a metric can be defined for each primitive, then measurements of these primitives can be combined to produce a single metric for reliability...we would like to define metrics for each primitive and combine these metrics in some way to produce a single figure of merit for the overall product.22

21Constantin V. Negoita and Dan Ralescu, Simulation, Knowledge-Based Computing, and Fuzzy Statistics (New York: Van Nostrand Reinhold Co., 1987), p. 1.
22S. D. Conte, H. E. Dunsmore, and V. Y. Shen, Software Engineering Metrics and Models (Menlo Park, California: Benjamin/Cummings Publishing Co., 1986), pp. 7-8.

One way of quantifying the performance of a college professor is to specify it as high-level characteristics such as:

* teaching
* research
* service

The problem is to measure these with some degree of precision. It seems difficult to measure the above characteristics with a single metric. One way of dealing with this is to break the high-level characteristics into primitive characteristics. For example, service, a high-level characteristic, might be broken down into:

* departmental committee service
* college committee service
* university committee service
* service to professional groups
* service to student groups
* service to community

Development of Model

Mathematically, we might describe this as follows. Consider n high-level characteristics denoted H_i (i = 1, ..., n).
We will assume that the ith high-level characteristic is made up of M_i primitive characteristics. We will use s_ij to denote the jth primitive characteristic of the ith high-level characteristic.

The score SC_i for the ith high-level characteristic is

SC_i = \sum_{j=1}^{M_i} s_{ij}

The total score would be

TSC = \sum_{i=1}^{n} \sum_{j=1}^{M_i} s_{ij}

The above formula suffers from a number of drawbacks. The first and foremost drawback is that if one high-level characteristic has more primitive characteristics than another, the formula will weight it more heavily. One way to do away with this is to divide each SC_i by M_i (recall that M_i is the number of primitive characteristics associated with each H_i). This gives us an average score for each of the H_i:

TASC = \sum_{i=1}^{n} \frac{1}{M_i} \sum_{j=1}^{M_i} s_{ij}

Suppose we wanted to weight the various high-level characteristics. We could associate a weight W_i with each high-level characteristic such that \sum_{i=1}^{n} W_i = n. (Note there are n high-level characteristics.)

TWASC = \sum_{i=1}^{n} \frac{W_i}{M_i} \sum_{j=1}^{M_i} s_{ij}

For example, let us assume the high-level characteristics are teaching, research, and service. If we want to weight them we will choose three weights whose sum is 3. If all were to be weighted equally, then W_1 = 1 (teaching), W_2 = 1 (research), and W_3 = 1 (service). Suppose we wanted to weight teaching so that it was twice as important as research, and research so that it was twice as important as service. This gives the following equations:

W_1 = 2 * W_2
W_2 = 2 * W_3
W_1 + W_2 + W_3 = 3   (the initial constraint)

Now with a little algebra we can calculate the weights:

W_1 + W_1/2 + W_1/4 = 3
(1 + 1/2 + 1/4) W_1 = 3
W_1 = 3/1.75 = 12/7
W_2 = 6/7
W_3 = 3/7

Hence the weights 12/7, 6/7, and 3/7 give the appropriate weighting for the high-level characteristics.

We may wish to weight the low-level characteristics associated with a given high-level characteristic. For example, we might wish for departmental committee service to count less than service to professional groups. This is done in a manner similar to how we weighted the high-level characteristics, only now for each primitive characteristic of the high-level characteristic i we choose weights L_ij such that

\sum_{j=1}^{M_i} L_{ij} = M_i

Using these weights instead of weighting evenly, as the above formula implies, yields the new formula:

TWWSC = \sum_{i=1}^{n} \frac{W_i}{M_i} \sum_{j=1}^{M_i} s_{ij} L_{ij}

To make it easier to compare differences, we would want to normalize the score, so that a difference of .1 between schools with different characteristics would mean the same thing. We can compute a normalized score by taking TWASC and dividing it by the maximum possible score, TWASC_max:

NWASC = TWASC / TWASC_max

In like manner we could also normalize TWWSC:

NWWSC = TWWSC / TWWSC_max

The normalized score has the property 0 <= NWASC <= 1. The closer this score is to 1, the higher the desirability of the instructor under our weighting system. If the total weighted score is divided by the highest possible weighted score, the best someone could do is 1.

So far we have not discussed the actual values the score for each primitive attribute may take on. Although we could use the weights to compensate for different maximum scores for different primitive attributes, we will assume that they are all on the same scale. This makes the choice of weights more meaningful. We assume each range is from 0 to 5. If the instrument that measures a primitive attribute gives values on a different scale, we will use a simple linear transformation to scale it to 0 to 5.
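Appendix D declares a Scale subroutine for exactly this purpose, although its body falls outside the portion of the listing reproduced there. A minimal sketch of such a routine, in the same dialect as the Appendix D program (the body shown here is an assumption, not the study's actual code):

    SUB Scale (low, high, real, new)
        ' Map a raw score "real" from the instrument's range [low, high]
        ' onto the assumed 0-to-5 scale. Assumes high > low.
        new = 5 * (real - low) / (high - low)
    END SUB

For example, a student evaluation of 3.28 on the 0-to-4 instrument used in Appendix A would scale to 5 * 3.28 / 4 = 4.1.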
EXAMPLE

Suppose our model has three high-level characteristics with the indicated primitive characteristics.

H1: Teaching
* student evaluation
* peer evaluation
* chairman's evaluation

H2: Research
* grants and contracts
* refereed papers
* unrefereed papers and presentations

H3: Service
* departmental committee service
* college committee service
* university committee service
* service to professional groups
* service to student groups
* service to community

In teaching, suppose we choose to weight student evaluations 3 times as heavily as peer evaluations and to give peer evaluations equal weight with the chairman's evaluation:

L_11 = 3 * L_12
L_12 = L_13
L_11 + L_12 + L_13 = 3   (the initial constraint)

Now we can calculate the weights for the primitive characteristics of teaching:

L_11 + L_11/3 + L_11/3 = 3
(5/3) L_11 = 3
L_11 = 9/5
L_12 = L_11/3 = 3/5
L_13 = L_12 = 3/5

In research, suppose we choose to weight refereed papers twice as heavily as grants and contracts and three times as heavily as unrefereed papers and presentations:

L_22 = 2 * L_21
L_22 = 3 * L_23
L_21 + L_22 + L_23 = 3   (the initial constraint)

Now we can calculate the weights for the primitive characteristics of research:

L_21 + 2 L_21 + (2/3) L_21 = 3
(11/3) L_21 = 3
L_21 = 9/11
L_22 = 18/11
L_23 = 6/11

In service, suppose we choose to weight all six primitive characteristics evenly. This gives us:

L_31 = 1, L_32 = 1, L_33 = 1, L_34 = 1, L_35 = 1, L_36 = 1

Now suppose we want to weight research (H2) twice as much as teaching (H1), and to count research three times as much as service:

W_2 = 2 * W_1
W_2 = 3 * W_3
W_1 + W_2 + W_3 = 3   (the initial constraint)

Now we can calculate the weights for the high-level characteristics:

W_2/2 + W_2 + W_2/3 = 3
W_2 = 18/11
W_1 = 9/11
W_3 = 6/11

The model is now completely specified, and all that remains is to collect data for each faculty member on each of the primitive characteristics and scale it to the 0-to-5 scale which we have assumed.

Professor A
Teaching: student evaluation 4.3, peer evaluation 5, chairman's evaluation 4
Research: grants and contracts 1, refereed papers 4, unrefereed papers and presentations 2
Service: departmental 4, college 3, university 3, professional groups 0, student groups 4, community 0

Recall that

TWWSC = \sum_{i=1}^{n} \frac{W_i}{M_i} \sum_{j=1}^{M_i} s_{ij} L_{ij}

so

TWWSC = (1/3)(9/11) [4.3(9/5) + 5(3/5) + 4(3/5)]
      + (1/3)(18/11) [1(9/11) + 4(18/11) + 2(6/11)]
      + (1/6)(6/11) [4(1) + 3(1) + 3(1) + 0(1) + 4(1) + 0(1)]
      = 9.4679

To find NWWSC we need to calculate TWWSC_max. Here TWWSC_max = 15, so

NWWSC = 9.4679/15 = .6312

Professor A's strongest point is teaching. Now let us consider Professor B, whose strongest point is research and who is a poor teacher. Professor B's folder lists the same primitive characteristics (teaching: student, peer, and chairman's evaluations; research: grants and contracts, refereed papers, unrefereed papers and presentations; service: departmental, college, university, professional groups, student groups, community). Let us see how Professor B does:

TWWSC = 10.2357
NWWSC = .6824

It would appear that at a school which rated research twice as much as teaching, one could do a pretty poor job of teaching and still come out with a good overall score. This might argue for choosing weights that are relatively close to each other unless you really do not care about a particular component.

This model does not allow unasked-for variables, such as the sex or age of the candidate, to be considered. If any unasked-for variables are influencing a decision, then the model's rating score will differ from the score obtained from another rating method.
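The arithmetic of the example can be checked mechanically. The following short program recomputes Professor A's scores; it is written in the same dialect as the Appendix D listing but is offered only as an illustration, not as part of the study's software:

    ' Recompute Professor A's TWWSC and NWWSC from the worked example.
    DIM W(3), M(3)
    W(1) = 9 / 11: W(2) = 18 / 11: W(3) = 6 / 11   ' high-level weights
    M(1) = 3: M(2) = 3: M(3) = 6                   ' primitives per characteristic
    ' Teaching: student 4.3, peer 5, chair 4 with L-weights 9/5, 3/5, 3/5
    teach = (W(1) / M(1)) * (4.3 * 9 / 5 + 5 * 3 / 5 + 4 * 3 / 5)
    ' Research: grants 1, refereed 4, unrefereed 2 with L-weights 9/11, 18/11, 6/11
    resrch = (W(2) / M(2)) * (1 * 9 / 11 + 4 * 18 / 11 + 2 * 6 / 11)
    ' Service: six evenly weighted items scoring 4, 3, 3, 0, 4, 0
    serv = (W(3) / M(3)) * (4 + 3 + 3 + 0 + 4 + 0)
    TWWSC = teach + resrch + serv
    PRINT TWWSC          ' prints 9.4679...
    PRINT TWWSC / 15     ' TWWSCmax = 15, so NWWSC = .6312...
    END

Running it confirms the figures above: 3.5836 for teaching, 4.6116 for research, and 1.2727 for service, summing to 9.4679.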
University administrators can thus see empirically the consequences of a policy or criterion and, one hopes, modify the policy or criterion to correspond with reality, or defend the criterion and add it to the high-level or primitive characteristics.

Selection of Subjects

Subjects were tenure-track faculty from English, mathematics, education, and computer science at Central Michigan University. They were randomly selected from the campus telephone directory and asked if they would participate. If they were willing, they were given an instruction sheet (see Appendix C).

Procedure

Subjects (32 faculty members, eight from each of the four departments) were given a folder on each of four hypothetical candidates (see Appendix A). These folders contained enough detail to evaluate the candidates according to the given criteria for tenure. The criteria for tenure were described in a separate handout (see Appendix B). The subjects were then asked to evaluate the applicants, giving each a number between one and ten (see Appendix C). Overall, the folders were designed so that, according to the criteria, the men and women were equal. (Each female applicant was paired with a male to whom the evaluation program (see Appendix D) gave an equal score, but not every evaluator received equal pairs.) The statistical test used was an analysis of variance with a Latin square design. Because of the design of the folders, the average for the males (as calculated by the computer model; see Appendix D) was equal to the average for the females, and the average for the high-prestige pairs was equal to the average for the low-prestige pairs.

Analysis

The experiment tested whether two factors which should not enter into a tenure decision (sex of candidate and prestige of Ph.D.-granting institution) do in fact enter into such a decision. In order to do this, subjects were asked to look at folders for candidates and evaluate them. The four folders were designed so that the subjects would produce clearly dissimilar evaluations, provided that the subjects followed the stated criteria.

The subjects were tenure-track faculty at Central Michigan University, and the experiment did not take up an inordinate amount of time. It was assumed that if the subjects were asked to do a time-consuming task, the level of participation would perhaps drop to a point where statistical analysis would be impossible. Also, according to Caplow and McGee, it is debatable whether evaluators actually spend more than a few minutes in decision-making.23

The four folders, together with cover sheets indicating sex and prestige of institution, gave sixteen combinations. It was assumed that if a factorial experiment were done, two problems would arise. First, the experiment would take too long: at five minutes per folder, each subject would spend 80 minutes evaluating the folders. More importantly, there would be a need to generate four different versions for each of the rating levels. While this could be done, the chance of biasing the experiment by something as simple as the choice of titles of publications would be great.

The experimental design chosen was the Latin square design. It is a special design that permits the researcher to assess the relative effects of treatments when a double type of blocking restriction is imposed on the experimental units. This research involved testing the effect of sex and prestige of institution on evaluation score.
Thus the setup uses the achievement levels as columns and sex-prestige pairs as rows. Using such a design, four subjects can test all possibilities, with each person getting one folder of each level. The design follows the classic Latin square design from Snedecor and Cochran.24

23Theodore Caplow and Reece J. McGee, The Academic Marketplace (New York: Basic Books, Inc., 1958), p. 127.
24George W. Snedecor and William G. Cochran, Statistical Methods (Ames, Iowa: The Iowa State University Press, 1967), p. 312.

TABLE 1
LATIN SQUARE DESIGN OF STUDY

                          high        low         low         high
                          teaching    teaching    research    research
  male, high prestige
  male, low prestige
  female, high prestige
  female, low prestige

(The subject-group letters assigned to each cell were illegible in reproduction.)

For example, subject A was given the following four people to evaluate: 1) a male from a high prestige school with high research, 2) a male from a low prestige school with high teaching, 3) a female from a high prestige school with low research, and 4) a female from a low prestige school with low teaching. Thirty-two subjects participated in the study, eight from each of the four departments. In each department, two subjects were assigned to each letter. The total score for a cell was the sum of the eight scores. The category of service was held constant during this experiment since, according to the literature, service is not a major focus in this type of decision-making.

CHAPTER 4
PRESENTATION OF RESULTS

Introductory Explanation

There were four achievement levels:

1. high teaching scores, mid-level research score and good service score
2. high research score, mid-level teaching scores and good service
3. low research score, mid-level teaching scores and good service
4. low teaching scores, mid-level research score and good service.

These achievement levels were chosen so that, using the criteria given out in the experiment to the faculty subjects, the hypothetical candidates should be ranked in this order. Indeed, if you set the program to evaluate according to the criteria, it will rank them in this order. Hence, if the faculty subjects do not rank them in this order, either they are disregarding the criteria or they are showing bias for sex or prestige of institution, since these were the only other varied pieces of information in the hypothetical candidates' folders.

The following are the scores for the hypothetical candidates. The teaching scores represent student, peer, and chair evaluation scores; the research score was derived from the formula on the criteria sheet, with the calculations done for each of the subjects (so there would be no calculation error introduced on the part of the subjects); and service consisted of six items and was above the level stated by the criteria for full credit. (See Appendices A and B.) These four sets were used in combination with different names (male and female) and different prestige schools (high or low).

TABLE 2
DATA PRESENTED TO SUBJECTS

1. High Teaching:  Teaching (3.28, 3.2, 3.4); Research (3.0);  Service (six items)
2. High Research:  Teaching (2.98, 3.0, 3.2); Research (3.2);  Service (six items)
3. Low Research:   Teaching (2.98, 3.1, 3.1); Research (2.67); Service (six items)
4. Low Teaching:   Teaching (2.72, 3.0, 2.8); Research (3.0);  Service (six items)

Overall Analysis of the Data

RAW DATA (Each score is the sum of eight evaluations. Each evaluation is between 0 and 10.
)

TABLE 3
SUM OF SCORES FOR EACH OF THE CANDIDATES

high research, high prestige: male, female
high research, low prestige: male, female
high teaching, high prestige: male, female
high teaching, low prestige: male, female
low teaching, high prestige: male, female
low teaching, low prestige: male, female
low research, high prestige: male, female
low research, low prestige: male, female

(The numeric entries of this table were illegible in reproduction; the cell means are restated in Table 4.)

The following is the output from an ANOVA program called UNISTAT, run on an IBM AT compatible machine. In this and the following ANOVA tables, only the cell counts, means, standard deviations, and probabilities survived reproduction legibly; the sums of squares and F ratios are therefore omitted.

TABLE 4
ANOVA: OVERALL

Grand mean: N = 16, mean = 61.6391
Sex: male - N = 8, mean = 61.6528, SD = 4.9191; female - N = 8, mean = 61.6253, SD = 5.7619
Prestige: high - mean = 61.7188; low - mean = 61.5594
Sex by prestige: male/high - mean = 62.4500, SD = 4.7760; male/low - mean = 60.8556, SD = 7.2168; female/high - mean = 60.9875, SD = 5.6766; female/low - mean = 62.2631, SD = 4.9054
Effects: achievement level - p < .0001 (significant); sex and prestige - no significant main effect in this run

These results appear to show that sex and prestige of school do not make a difference. The only significant effect is level of achievement, and clearly, ranking on level of achievement is what ought to be done in a faculty evaluation. The overall average score was 61.6391. Males had an overall average score of 61.6528 and females had an overall average score of 61.6253. People from high prestige schools had an overall score of 61.7188 and those from low prestige schools an overall score of 61.5594.

There is one anomaly that is worth noting. Males from high prestige schools averaged 62.45, while males from low prestige schools averaged 60.8556 (see Table 4, source sex by prestige). The results were almost reversed for females: females from high prestige schools averaged 60.9875, while females from low prestige schools averaged 62.2631 (see Table 4, source sex by prestige). This anomaly led to running the tests within achievement level.

Sex Within Achievement Level: High Teaching

RAW DATA (Each score is the sum of eight evaluations.)

TABLE 5
SUM OF SCORES FOR CANDIDATES IN HIGH TEACHING CATEGORY

high prestige: male, female
low prestige: male, female

(The numeric entries were illegible in reproduction.)

The ANOVA was run on the computer again with the above data.

TABLE 6
ANOVA: SEX WITH HIGH TEACHING CATEGORY

Grand mean: N = 4, mean = 68.2425, SD = 0.9679
Prestige: high - N = 2, mean = 68.2350, SD = 0.7778; low - N = 2, mean = 68.2500, SD = 1.4849
Effects: sex - p = 0.007 (significant); prestige - p = 0.981 (not significant)

There is a significant difference in the way males and females are treated if they have high teaching scores. Prestige of institution does not make a significant difference. People in this level are treated significantly better if they are female than if they are male.

Sex Within Achievement Level: High Research

RAW DATA (Each score is the sum of eight evaluations.)

TABLE 7
SUM OF SCORES FOR CANDIDATES IN HIGH RESEARCH CATEGORY

high prestige: male, female
low prestige: male, female

(The numeric entries were illegible in reproduction.)

The ANOVA was run on the computer again using the above data.
TABLE 8
ANOVA: SEX WITH HIGH RESEARCH CATEGORY

Grand mean: N = 4, mean = 63.2150, SD = 3.1048
Prestige: high - N = 2, mean = 62.6000, SD = 3.5355; low - N = 2 (mean and SD illegible in reproduction)
Effects: sex - p = 0.026 (significant); prestige - not significant

There is a significant difference in the way males and females are treated if they have high research scores. Prestige of institution does not make a significant difference. People in this level are treated significantly better if they are male than if they are female.

Sex Within Achievement Level: Low Research

RAW DATA (Each score is the sum of eight evaluations.)

TABLE 9
SUM OF SCORES FOR CANDIDATES IN LOW RESEARCH CATEGORY

high prestige: male, female
low prestige: male, female

(The numeric entries were illegible in reproduction.)

The ANOVA was run on the computer again using the above data.

TABLE 10
ANOVA: SEX WITH LOW RESEARCH CATEGORY

Grand mean: N = 4, mean = 58.7787, SD = 1.8213
Prestige: high - N = 2, mean = 58.6000, SD = 1.8385; low - N = 2, mean = 58.9575, SD = 2.5385
Effects: sex - p = 0.017 (significant); prestige - p = 0.602 (not significant)

There is a significant difference in the way males and females are treated if they have low research scores. Prestige of institution does not make a significant difference. People in this level are treated significantly better if they are female than if they are male.

Sex Within Achievement Level: Low Teaching

RAW DATA (Each score is the sum of eight evaluations.)

TABLE 11
SUM OF SCORES FOR CANDIDATES IN LOW TEACHING CATEGORY

high prestige: male, female
low prestige: male, female

(The numeric entries were illegible in reproduction.)

The ANOVA was run on the computer again using the above data.

TABLE 12
ANOVA: SEX WITH LOW TEACHING CATEGORY

Grand mean: N = 4, mean = 56.3200, SD = 3.1596
Prestige: high - N = 2, SD = 3.2173; low - N = 2, SD = 3.8184 (the cell means were illegible in reproduction)
Effects: sex - p = 0.002 (significant); prestige - p = 0.731 (not significant)

There is a significant difference in the way males and females are treated if they have low teaching scores. Prestige of institution does not make a significant difference. People in this level are treated significantly better if they are female than if they are male.

Because of the sex-prestige anomaly noted in the first look at the data in Table 4, it is important to look at sex alone and prestige alone.

Sex Over the Whole Experiment

RAW DATA (Each score is the sum of 64 evaluations.)

TABLE 13
SUM OF SCORES FOR MALE AND FEMALE HYPOTHETICAL CANDIDATES

male: 495.3225
female: 491.4025

The ANOVA was run on the computer again using the above data.

TABLE 14
ANOVA: SEX

Grand mean: N = 2, mean = 493.3625, SD = 2.7719
Effect: sex - p = 0.003 (significant)

Men get a significantly higher rating than women overall. Note that even though women appear to be favored in three of the four achievement levels (see Tables 6, 10, and 12), overall men do significantly better. This suggests that the bias against women doing well in research carries considerable weight.

Prestige Over the Whole Experiment

RAW DATA (Each score is the sum of 64 evaluations.)

TABLE 15
SUM OF SCORES OF HIGH AND LOW PRESTIGE CANDIDATES

(The numeric entries were illegible in reproduction.)

The ANOVA was run on the computer again using the above data.
TABLE 16
ANOVA: PRESTIGE

Grand mean: N = 2, mean = 493.3625, SD = 1.2551
Effect: prestige - p = 0.001 (significant)

People from high prestige institutions get an overall higher rating than people from low prestige institutions. Therefore, prestige really was significant, even though it appeared earlier that it was not (see Table 4).

SUMMARY OF RESULTS

Faculty achievement was significant at the .01 level (p < .0001). This means that when faculty are evaluated for tenure by a university-wide committee, achievement plays a significant role. For example, people with high scores in teaching received higher ratings than people with low scores in teaching. This was to be expected, because the decision was supposed to be based on achievement.

Within the achievement level characterized by high teaching scores, sex was significant at the .01 level (p = .007). This means that people with high teaching scores are treated significantly better if they are female.

Within the achievement level characterized by high research scores, sex was significant at the .05 level (p = .026). This means that people with high research scores are treated significantly better if they are male.

Within the achievement level characterized by low research scores, sex was significant at the .05 level (p = .017). This means that people with low research scores are treated significantly better if they are female.

Within the achievement level characterized by low teaching scores, sex was significant at the .01 level (p = .002). This means that people with low teaching scores are treated significantly better if they are female.

In the study overall, sex was significant at the .01 level (p = .001). This means that men receive a significantly higher rating than women overall.

In the study overall, prestige was significant at the .01 level (p = .001). This means that people from high prestige institutions received a higher rating than people from low prestige institutions. When women were looked at separately, this reversed, and women from low prestige institutions were given higher scores than women from high prestige institutions.

CHAPTER 5
CONCLUSIONS AND RECOMMENDATIONS

Conclusions

The study shows that the sex of the candidate and the prestige of the candidate's Ph.D.-awarding institution do have a significant effect on the candidate's performance evaluation by a university committee. This was not apparent at first glance at the data, because men were given less credit than women for doing a good job teaching, and women were given less credit than men for doing a good job in research. This balance of bias can deceive the observer into thinking no bias is present. The prestige bias was not apparent because of another balance of bias: men from high prestige universities were ranked higher than men from low prestige universities, while women from low prestige universities were ranked higher than women from high prestige universities.

Recommendations

Tenure decisions in which a woman is turned down primarily for not having enough research should be reviewed very carefully. This study seems to indicate that, on the basis of equal research, women are judged lower than men by university committees. The same can be said about a man's being turned down because of teaching scores: on the basis of equal teaching scores, men are judged lower than women by university committees. In general, women are rated lower than men for the same scores. This might argue for careful review of all tenure decisions on female faculty.
A good way of quickly reviewing such decisions would be to use the software discussed in this study. In fact, the act of using the software forces a person to quantify the factors in a tenure decision.

Speculations and Recommendations for Further Research

Why do women from low prestige universities get higher ratings than women from high prestige universities? In any study, there is always the chance that the results are just random error. This same study can be repeated at other institutions and the results compared. Another explanation for this particular result is the possibility that some faculty consider it unseemly for a woman to get a degree from a high prestige university. Maybe faculty expect more from a woman from a high prestige university and are more generous to a woman from a low prestige university. Possibly the faculty feel threatened by a woman from a high prestige university.

Why do men get less credit than women for teaching well, and women less credit than men for good research? Perhaps faculty view teaching as a woman's work and research as a man's job. Because of the way the study was constructed, teaching received more weight than research, which makes these findings all the more dramatic. This implies that a faculty committee might feel that a man can make up for mediocre teaching with excellent research, but a woman cannot. Further research could be done by replicating the study and changing the instructions: there could be a case where research received more weight and a case where teaching and research received equal weight. Another implication is that faculty (mostly men in this sample) were more satisfied with the image of a woman getting her degree near home, so as not to disrupt family life, and becoming a teacher. Further research could also add the variables of sex of rater and background of rater (prestige of institution, views on research and teaching, and age).

Further research using the model and the computer program could involve a study looking back at past tenure decisions. Using stated criteria, the program could rank candidates, and this ranking could be compared to the actual committee ranking. The main drawback to this kind of study is that it would involve confidential personnel files.

Another use of the model and the computer program might be as a simulation tool for administrative decisions. It could be used to show the consequences of several different criteria before policies were set. For example, before announcing a possible new emphasis on research, administrators could estimate how many present faculty would earn tenure based on their current productivity rate. This model and software would be useful in determining whether bias is present in decision-making, provided the institution has objectively stated the major criteria for such decision-making. The university committee, when making decisions about candidates in a field other than its own, needs objective guidelines, and even with these guidelines there is the possibility of bias entering into the decision-making. There is also an implication that if the computer model could provide an unbiased decision, why have human committee members involved at all? This is an area for further investigation. If a department and its chairman provide input on a candidate's qualifications, why should a dean get input from a university committee made up of humans prone to biased decision-making who are not familiar with the candidate's subject area?
This is a philosophical question as well as one of efficiency, and it bears further investigation.

APPENDICES

APPENDIX A

Biographical Material on Hypothetical Candidates for Tenure

The sequence letters at the top represent the four achievement levels: W is high teaching, Y is high research, X is low teaching, and Z is low research. The sequence numbers at the top of the other pages represent prestige of institution and sex of candidate: Sequence 1 is female, high prestige; Sequence 2 is female, low prestige; Sequence 3 is male, low prestige; and Sequence 4 is male, high prestige. The four numbers were combined with the four letters to give 16 combinations. These combinations were used in the Latin square design.

SEQ: W

Teaching: Student Evaluations (0 to 4 scale)

Year     Average    Department Average
84/85    3.5        2.3
85/86    3.0        2.2
86/87    3.4        2.4
87/88    3.2        2.2
88/89    3.3        2.1
Avg.     3.28       2.24

Peer Teaching Evaluation (0 to 4 scale): 3.2
Chair Teaching Evaluation (0 to 4 scale): 3.4

Research:
"Use of Solar Powered Looms," Journal of Modern Weaving, Vol. 11, No. 8, 1985.
"Design of Efficient Injectors," Textile Engineering Journal, Vol. 2, No. 3, 1985.
"Weaving Large Plant Economics," Journal of Textile Economics, Vol. 23, No. 4, 1986.
"Injection Molds Have a Place in Cloth Production," Textile Engineering Journal, Vol. 3, No. 4, 1987.
"Looms in Everyday Use," Journal of Modern Weaving, Vol. 15, No. 3, 1987.
"Design of Super Injectors," Textile Engineering Journal, Vol. 9, No. 7, 1988.

Six publications. Since the departmental average over this period was two papers, the formula yields a score of 3.0.

SEQ: 1

Dr. Sandra Ralston
Education: Ph.D. Harvard 1983
Service:
Committees: Departmental Laboratory Supply Committee, Departmental Curriculum Committee, College Standards Committee, College Committee on Committees
Other: Introduced a new course, TXS 315 Dynamic Loom Design. Represents the department at freshman orientation.

SEQ: X

Teaching: Student Evaluations (0 to 4 scale)

Year     Average    Department Average
84/85    2.2        2.3
85/86    3.1        2.2
86/87    2.6        2.4
87/88    2.6        2.2
88/89    3.1        2.1
Avg.     2.72       2.24

Peer Evaluation (0 to 4 scale): 3.0
Chair Evaluation (0 to 4 scale): 2.8

Research:
"Cloth Regeneration Techniques," Journal of Modern Weaving, Vol. 11, No. 5, 1986.
"Plant Layout," Textile Engineering Journal, Vol. 3, No. 2, 1986.
"Non-Woven Cloth Production," Journal of Textile Production, Vol. 18, No. 3, 1987.
"Cloth Production: It Can Be Increased Without New Equipment," Textile Engineering Journal, Vol. 3, No. 4, 1987.
"Weaving a Major Factor in Cloth Production," Journal of Modern Weaving, Vol. 16, No. 5, 1988.
"Injection Molds and You," Textile Engineering Journal, Vol. 7, No. 5, 1989.

Six publications. Since the departmental average over this period was two papers, the formula yields a score of 3.0.

SEQ: 2

Dr. Susan Powers
Education: Ph.D. North Texas State University 1983
Service:
Committees: Departmental Undergraduate Committee, Departmental Honors Committee, College Planning Committee, College Parking Committee
Other: Introduced a new course, TXS 313 Textile Technology. Represents the department at parents day.

SEQ: Y

Teaching: Student Evaluations (0 to 4 scale)

Year     Average    Department Average
84/85    3.2        2.3
85/86    2.7        2.2
86/87    3.0        2.4
87/88    3.1        2.2
88/89    2.9        2.1
Avg.     2.98       2.24

Peer Evaluation (0 to 4 scale): 3.0
Chair Evaluation (0 to 4 scale): 3.2

Research:
"Water Powered Looms," Weaving History Journal, Vol. 10, No. 4, 1985.
"Design of Energy Efficient Looms," Textile Engineering Journal, Vol. 3, No. 3, 1986.
"Weaving the End of an Era," Journal of Textile Economics, Vol. 23, No. 3, 1986.
"Cloth Production in Underdeveloped Nations," Textile Economics Journal, Vol. 2, No. 5, 1986.
"Looms of the Future," Journal of Modern Weaving, Vol. 16, No. 2, 1987.
"Design of Cost Effective Textile Delivery Systems," Textile Engineering Journal, Vol. 8, No. 4, 1987.
"Weaving vs Spinning," Journal of Textile Economics, Vol. 26, No. 2, 1988.
"Injection Molds a Thing of the Past," Textile Engineering Journal, Vol. 7, No. 5, 1989.

Eight publications. Since the departmental average over this period was two papers, the formula yields 3.2.

SEQ: 3

Dr. William Watson

Education: Ph.D. Denver University 1983

Service:
Committees:
    Departmental Laboratory Supply Committee
    Academic Senate
    College Grievance Committee
    College Committee on Committees
Other:
    Introduced a new course: TXS 330 Textile Plant Design.
    Represents the department at freshman orientation.

SEQ: Z

Teaching:
Student Evaluations (0 to 4 scale)

Year     Average    Department Average
84/85     3.1            2.3
85/86     3.1            2.2
86/87     3.0            2.4
87/88     2.9            2.2
88/89     2.8            2.1
Avg.      2.98           2.24

Peer Evaluation (0 to 4 scale): 3.1
Chair Evaluation (0 to 4 scale): 3.1

Research:
"Cloth Production in Industrial Nations," Textile Economics Journal, Vol. 2, No. 5, 1986.
"A Survey of Textile Delivery Systems," Textile Engineering Journal, Vol. 7, No. 6, 1987.
"Cloth Production as a Measure of Wealth," Journal of Textile Economics, Vol. 26, No. 2, 1988.
"Cloth Production in the Underground Economy," Journal of Textile Economics, Vol. 27, No. 3, 1989.

Four publications. Since the departmental average over this period was two papers, the formula yields 2.67.

SEQ: 4

Dr. James Anderson

Education: Ph.D. University of California - Berkeley 1983

Service:
Committees:
    Departmental Laboratory Supply Committee
    Departmental Hiring Committee
    College Parking Committee
    College Honors Committee
Other:
    Introduced a new course: TXS 380 Textile Delivery Systems.
    Represents the department at graduate orientation.

APPENDIX B

Criteria for Tenure

You are a member of the tenure committee of the Textile Science Department of Mid-America University. You have been given the task of evaluating four tenure applicants. All of the applicants are assistant professors with five years of service.

Mid-America University rates applicants for tenure on three criteria: teaching, research, and service. The University has carefully studied the problem of weighting these three criteria and has agreed that the relative weights in per cent are: teaching 60%, research 30%, and service 10%. Within the criterion of teaching, student evaluations are given a weight of 50%, and chair and peer evaluations are given a weight of 25% each.

In the area of research, publications and grants are both taken into consideration. One funded grant is counted as half a paper. The department has devised a formula for scoring based on the number of papers:

    rating = (1 - (1 / (x / a + 1))) * 4

where x is the number of acceptable papers published over the time period in question and a is the average number of acceptable papers published in the department over the time period in question. Note that if the person publishes no papers the rating is 0, if the person publishes at the department average the rating is 2, and the rating approaches 4 as the number of papers published grows large. It is very hard to get much above three with this formula.

Service is generally rated by counting the number of service items listed in the folder. If the applicant has at least one item for each two years of service, then full credit is usually given for service.
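The formula's behavior can be checked directly against the scores quoted in Appendix A. The following fragment is a sketch in the same BASIC as the program in Appendix D; it is not part of the study's instrument. The composite score at the end assumes, purely for illustration, that full service credit corresponds to 4 on the 0 to 4 scale, a value the criteria sheet does not actually specify.

DECLARE FUNCTION Rating (x, a)

PRINT Rating(0, 2)    ' 0    -- no papers
PRINT Rating(2, 2)    ' 2    -- at the departmental average
PRINT Rating(6, 2)    ' 3.0  -- six papers (SEQ W and SEQ X)
PRINT Rating(8, 2)    ' 3.2  -- eight papers (SEQ Y)
PRINT Rating(4, 2)    ' 2.67 -- four papers (SEQ Z)

REM Composite for SEQ W under the stated weights. The service value of 4
REM (full credit) is an assumption; the criteria sheet gives no number.
teach = .5 * 3.28 + .25 * 3.2 + .25 * 3.4        ' teaching composite: 3.29
PRINT .6 * teach + .3 * Rating(6, 2) + .1 * 4    ' overall: about 3.27

FUNCTION Rating (x, a)
    REM rating = (1 - (1/(x/a + 1))) * 4, as defined above
    Rating = (1 - (1 / (x / a + 1))) * 4
END FUNCTION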
You are to study the following tenure folders and evaluate the applicants. Evaluate the four tenure applicants by assigning each a real number between 0 and 10 (e.g., 5 or 7.6). Ten indicates a perfect candidate, and zero indicates a candidate with no redeeming qualities. Neither ten nor zero is a common score for a candidate to receive. Tenure decisions are made by the dean, who relies heavily on the rankings of the tenure committee.

APPENDIX C

Instructions

Time Estimate: This will take you approximately 20 minutes. You indicate your voluntary willingness to participate by completing the attached survey.

This research is being done by Elizabeth A. Hansen as part of a doctoral dissertation in Educational Administration for Michigan State University. The purpose is to obtain survey data on decision-making about tenure from faculty members. All results will be treated with strict confidence, and the subjects involved will remain anonymous. On request, and within these restrictions, results will be made available to subjects.

Assume that you are on a university-wide tenure/promotion committee. You will read the "Criteria for Tenure" sheet and then look at the information provided for four hypothetical applicants. You will give each applicant a score from zero to ten and record it on a score sheet, which is supplied for each applicant.

From the above explanation of the research and your understanding of the ramifications of it, know that you are free to participate or discontinue participation at any time without recrimination.

SEQ:

Score Sheet

Name of candidate for tenure

Score for candidate (must be a number between 0 and 10)

Comments (optional)

APPENDIX D

The Computer Program

-MAIN PROGRAM-

DECLARE SUB FirstScreen ()
DECLARE SUB Menu (a$)
DECLARE SUB MakeSC ()
DECLARE SUB MakeCT ()
DECLARE SUB Eval ()
DECLARE SUB Scale (low, high, actual, new)

REM Create empty placeholder files so the FILES listings in the
REM subprograms always have something to show.
OPEN "DUMMY.DES" FOR OUTPUT AS #1
CLOSE #1
OPEN "DUMMY.WGT" FOR OUTPUT AS #1
CLOSE #1

CALL FirstScreen
response$ = "n"
WHILE response$ = "n"
    CALL Menu(response$)
    IF response$ = "1" THEN
        CALL MakeCT
        response$ = "n"
    ELSEIF response$ = "2" THEN
        CALL MakeSC
        response$ = "n"
    ELSEIF response$ = "3" THEN
        CALL Eval
        response$ = "n"
    ELSEIF response$ = "4" THEN
        response$ = "y"
    END IF
    IF response$ = "n" THEN INPUT "Hit enter to continue ", junk$
WEND
END

-EVALUATION SUBPROGRAM-

SUB Eval
REM
REM This subprogram is used to evaluate a person.
REM
CLS
PRINT "Which weighting system do you want to use?"
PRINT "DUMMY is not a valid weight file."
PRINT "Your choices are: "
FILES "*.WGT"
PRINT "ENTER NAME ONLY NOT THE EXTENSION."
INPUT wfile$
OPEN wfile$ + ".wgt" FOR INPUT AS #1
INPUT #1, des$
OPEN des$ FOR INPUT AS #2
INPUT #2, n
DIM m(n), h$(n), w(n)
FOR i = 1 TO n
    INPUT #2, m(i)
NEXT i
REM Find the largest number of primitive characteristics
max = m(1)
FOR i = 2 TO n
    IF max < m(i) THEN max = m(i)
NEXT i
DIM ll$(n, max)
DIM l(n, max)
FOR i = 1 TO n
    INPUT #2, h$(i)
    FOR j = 1 TO m(i)
        INPUT #2, ll$(i, j)
    NEXT j
NEXT i
FOR i = 1 TO n
    INPUT #1, w(i)
NEXT i
FOR i = 1 TO n
    FOR j = 1 TO m(i)
        INPUT #1, l(i, j)
    NEXT j
NEXT i
CLOSE #1
CLOSE #2
DIM sc(n, max)
res$ = ""
WHILE res$ <> "Y" AND res$ <> "N"
    PRINT "Remember the scores must be between 0 and 5. If you choose any other"
    PRINT "range you will be prompted for a range with each entry."
    PRINT "Enter Y if all your scores are between 0 and 5."
    PRINT "Enter N for individual range prompt at each entry"
    INPUT res$
    res$ = UCASE$(res$)
WEND
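REM The loop below collects a score for every primitive characteristic.
REM If the rater answered N above, each entry is rescaled to the 0 to 5
REM range by SUB Scale; entering the lowest possible score sets the
REM sc$ = "0" flag, which lets a scaled score of 0 exit the validation loop.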
PRINT "Enter N for individual range prompt at each entry" INPUT resS res$ = UCASE$(res$) WEND FOR i = 1 TO 11 FORj = 1 TO m(i) sc(i, j) = -1 scS ___ «q. WHILE ((sc(i, j) <= 0) OR (sc(i, j) > 5)) AND NOT $05 = "0" IF res$ = "Y“ THEN GOTO skip back: PRINT ”For "; h$(i); " : "; ll$(i, j) INPUT "Enter the lowest possible score ”, low INPUT ”Enter the highest possible score ", high IF high < low THEN GOTO back skip: PRINT "Enter score for "; h$(i); " : ”; ll$(i, j); INPUT sc$ sc(i, j) = VAL(sc$) IF res$ = "Y" THEN GOTO skip2 IF sc(i, j) = low THEN scS = ”0" CALL Scale(low, high, sc(i, j), new) sc(i, j) = new PRINT ”The score was scaled as "; sc(i, j); "on a 0 to 5 scale." skip2: WEND NEXT j NEXT i osum = 0 FOR i = 1 TO IT insum = 0 FORj = 1 TO m(i) insum = insum + sc(i, j) * l(i, j) NEXT j osum = osum + (w(i) / m(i)) * insum NEXT i PRINT "The total weighted score is : ”; osum PRINT "Normalized weighted weighted score is : "; osum / (5 * n) END SUB 67 -FIRST SCREEN SUBPROGRAM- SUB FirstScreen REM This subprogram displays a title and copyright notice. CLS LOCATE 5, 20 PRINT "FACULTY EVALUATOR" LOCATE 18, 20 PRINT ”(c) 1989 Elizabeth A. Hansen” FOR k = 1 TO 5000 NEXT k END SUB -MAKE CRITERION SUBPROGRAM- SUB MakeCT REM REM This subprogram is used to make a criterion file REM CLS n = 0 WHILE n <= 0 PRINT "How many high level characteristics does this criterion have"; INPUT n3 n = VAL(n$) WEND DIM h$(n), m(n) FOR i = 1 TO 11 PRINT ”Enter name for "; i; " high level characteristic: "; LINE INPUT h$(i) h$(i) = UCASE$(h$(i)) NEXT i FOR i = 1 TO IT m(i) = 0 WHILE m(i) <= 0 PRINT "For high level characteristic "; h$(i) PRINT " enter the number of primitive characteristics"; INPUT m3 .— 69 m(i) = VAL(m$) WEND NEXT i REM Find maximum max = m(l) FOR i = 2 TO 11 IF max < m(i) THEN max = m(i) NEXT i DIM l$(n, max) FOR i = 1 TO 11 PRINT "Enter primitive characteristics for" PRINT ”high level characteristic "; h$(i) FORj = 1 TO m(i) PRINT ”Enter primitive characteristic"; j; " ”; LINE INPUT l$(i, j) NEXT j NEXT i PRINT PRINT "It is now time to select a name for the description file." PRINT "The name must consist of only letters and numbers." PRINT "Case is ignored. Other descriptions if any are listed." PRINT "DUMMY is not a real file. All descriptions have the extension DES." FILES ".DES" PRINT ”Caution use of a listed file will destroy that description file." PRINT "ENTER ONLY THE FILE NAME NOT THE EXTENSION" 70 PRINT "Enter file name: ”; INPUT desS OPEN UCASE$(des$) + ”.DES" FOR OUTPUT AS #1 WRITE #1, 11 FOR i = 1 TO 11 WRITE #1, m(i) NEXT i FOR i = 1 TO 11 WRITE #1, h$(i) FORj = 1 TO m(i) WRITE #1, l$(i, j) NEXTj NEXT i CLOSE #1 END SUB 71 —MAKE SCORING FILE SUBPROGRAM- SUB MakeSC REM REM This subprogram weights the criterion in a description file. REM CLS PRINT "Which description file do you want to use?" PRINT "DUMMY is not a valid description file. ” PRINT "Your choices are: " FILES ".DES" PRINT "ENTER ONLY THE NAME NOT THE EXTENSION" PRINT "Enter file name: "; INPUT it$ OPEN it$ + ".DES" FOR INPUT AS #1 INPUT #1, n DIM m(n), h$(n) FOR i = 1 TO 11 INPUT #1, m(i) NEXT i REM Find maximum max = m(l) FOR i = 2 TO 11 IF max < m(i) THEN max = m(i) NEXT i 72 DIM 11$(n, max) DIM l(n, max) FOR i = 1 TO 11 INPUT #1, h$(i) FORj = 1 TO m(i) INPUT #1, 11$(i, j) NEXT j NEXT i CLOSE #1 PRINT "We need to establish the weights to give high level characteristics." PRINT "For each characteristic enter the per cent weight it should have." 
PRINT "Remember there are"; n; " characteristics and the weights must total 100%" DIM w(n) w(n) = -1 WHILE w(n) < 0 total = 0 FOR i = 1 TO 11 - 1 PRINT "Enter per cent for "; h$(i); INPUT w(i) total = total + w(i) NEXT i w(n) = 100 - total PRINT "To add up to 100 % "; h$(n); " must be weighted "; w(n); " %." IF w(n) < 0 THEN PRINT "Weights must total 100%. Too much weight given to the first items. Try again. " 73 WEND FOR i = 1 TO 11 w(i) = (w(i) / 100) "' 11 NEXT i PRINT "We also need to establish weights for the primitive characteristics." FOR i = 1 TO IT PRINT h$(i) PRINT "There are "; m(i); " primitive characteristics for "; h$(i) PRINT "Remember the weights must add up to 100%." l(i, m(i)) = -1 WHILE l(i, m(i)) < 0 total = 0 FORj =1TOm(i)-1 PRINT "Enter per cent for "; ll$(i, j); INPUT l(i, j) total = total + l(i, j) NEXT j l(i, m(i)) = 100 - total PRINT "To add up to 100% "; ll$(i, m(i)); " must be weighted "; l(i, m(i)); " %." IF l(i, m(i)) < 0 THEN PRINT "Weights must total 100%. TOO much weight given to first items. Try again." WEND FORj = 1 TO m(i) 10,1) = (l(i, i) / 100) "' m(i) NEXTj 74 NEXT i PRINT "It is now time to select a name for the weight file. The name" PRINT "must consist of only letters and numbers. Case is ignored. Other" PRINT "descriptions if any are listed. Dummy is not a real file. All " PRINT "weight files have the extension WGT." FILES ".WGT" PRINT "Caution use of a listed file will destroy that weight file." PRINT "ENTER ONLY THE NAME NOT THE EXTENSION." PRINT "Enter file name: "; INPUT wgtS OPEN UCASE$(wgt$) + ".WGT“ FOR OUTPUT AS #1 WRITE #1, it$ + ".DES" FOR i = 1 TO 11 WRITE #1, w(i) NEXT i FOR i = 1 TO n FOR j = 1 TO m(i) WRITE #1, l(i, j) NEXT j NEXT i CLOSE #1 END SUB 75 -MENU SUBPROGRAM- SUB Menu (a3) REM This subprogram displays a menu and waits for a response of REM 1, 2, 3, or 4. REM WHILE as <> "1" AND as <> "2" AND a$ <> "3" AND a$ <> "4" CLS PRINT "Make your selection: " PRINT PRINT "1 -- Devise an evaluation criterion" PRINT " Choose high level characteristics and corresponding " PRINT " primitive characteristics " PRINT PRINT "2 -- Weight an evaluation criterion" PRINT " Weight the characteristics in terms of per cent" PRINT PRINT "3 -- Evaluate an individual" PRINT " Calculate a score based on a particular weighting of characteristics" PRINT PRINT "4 —- END" PRINT INPUT a$ WEND END SUB -SCALE SUBPROGRAM- I—it 76 SUB Scale (low, high, actual, new) REM REM REM REM REM REM REM This subprogram scales scores so that they run from 0 to 5. low -- holds lowest possible score high -- holds highest possible score actual -- holds actual score value on old scale new -- holds score scaled as a number between 0 and 5 spread -- holds the range of scores on old scale spread = high - low new = ((actual - low) / spread) * 5 END SUB BIBLIOGRAPHY BIBLIOGRAPHY Blackburn, Robert. "The Meaning of Work in Academia," New Directions for Institntional Research, San Francisco: Jossey-Bass, 1974, I, p. 80. Braxton, J. M., and Brayer, A. E. "Assessing Faculty Scholarly Performance," New Directionn for Institutional Research, San Francisco: Jossey-Bass, 1986, p. 28. Breneman, David W., and Youn, Ted 1. K. (eds.), Academic Labor Markets and Careers Philadelphia: Falmer Press, 1988. Caplow, Theodore and McGee, Reece J. The Academic Marketplace, New York: Arno Press, 1977. Centra, John A. "Using Student Assessments to Improve Performance and Vitality," New Directions for Institutional Research, San Francisco Jossey-Bass, 1978, p. 40. Conte, S.D., Dunsmore, HE, and Shen, V.Y. 
Software Engineering Metrics and Models, Menlo Park, California: Benjamin/Cummings Publishing Co., 1986.

Dill, David. "Research as a Scholarly Activity: Context and Culture," New Directions for Institutional Research, San Francisco: Jossey-Bass, 1986, XIII, p. 13.

Etzioni, A., and Lehman, E. W. "Some Dangers in 'Valid' Social Measurements," The Annals, 1967, 373, pp. 1-15.

"In Weighing Sex-bias Case, High Court Will Skirt Issue of Confidentiality of Tenure-Review Records," The Chronicle of Higher Education, January 4, 1989, p. A13.

King, Patricia. Performance Planning and Appraisal, New York: McGraw-Hill Book Company, 1984.

Logan, Wilson. The Academic Man: Sociology of a Profession, London: Oxford University Press, 1942.

Long, Durward. "The University as Commons: A View from Administration," New Directions for Institutional Research, San Francisco: Jossey-Bass, 1977, V, p. 75.

Nash, Michael. Making People Productive, San Francisco: Jossey-Bass, 1985.

Negoita, Constantin V., and Ralescu, Dan. Simulation, Knowledge-Based Computing, and Fuzzy Statistics, New York: Van Nostrand Reinhold Co., 1987.

Schuster, J. H., and Bowen, H. R. "The Faculty at Risk," Change, 1985, 17, pp. 15-16.

Seldin, Peter. Successful Faculty Evaluation Programs, Crugers, New York: Coventry Press, 1981.

Smith, Donald K. "Faculty Vitality and the Management of University Personnel Policies," New Directions for Institutional Research, San Francisco: Jossey-Bass, 1978, V, p. 9.

Snedecor, George W., and Cochran, William G. Statistical Methods, Ames, Iowa: The Iowa State University Press, 1967.