This is to certify that the thesis entitled DEVELOPMENT OF THE SCIENCE PROCESSES TEST (TSPT), presented by Robert R. Ludeman, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Education.

ABSTRACT

DEVELOPMENT OF THE SCIENCE PROCESSES TEST (TSPT)

by Robert R. Ludeman

PROBLEM

There is considerable evidence that the technology of educational evaluation has not kept pace with developments in other areas of the educational enterprise. This study involved the development of a test of science processes using a method of item selection which replaced the customary panel of judges who pass on the items' validity with an objective method of item selection based on an external criterion. Some of the characteristics of the resulting test are examined.

LITERATURE

The literature is examined with reference to several issues relevant to the mechanics of test construction, including speeded vs power tests, the test blueprint, the optimum number of alternatives, item order, acceptable difficulty level, and item discrimination. A short survey of recently developed science process tests is presented with the method of validation used in each case. The concern expressed by some testing authorities over traditional methods of validation is examined, and the external criterion referenced method developed by Fyffe and Robison and used in this study is reviewed.

PROCEDURE

The item improvement phase of the study involved addition to and revision of the items developed by Fyffe and Robison using the item analysis data generated by their study. Two additional item tryout and revision cycles were required before item analysis indicated the item pool of 61 items to be of adequate quality. The result was known as The Science Processes Test (TSPT) form C.

The validation phase of the study consisted of the administration of three tests to the validation sample, which was composed of 52 sixth grade students. The three tests were a subset of the Individual Competency Measures taken from the Science - A Process Approach elementary science program, TSPT form C, and the Science Research Associates (SRA) Science test.

The correlation of students' scores on each of the form C items with their scores on the four subtests of the Individual Competency Measures was computed and hypothesis one was tested. Hypothesis one was that scores on each item of form C would exhibit a significantly higher correlation with scores on one of the subtests of the Individual Competency Measures than with any other subtest.

The scores on the Individual Competency Measures served as the external criterion measure for selecting the upper and lower 27 percent groups needed to calculate the item discrimination indices. Form C items were selected for inclusion in form D based on the requirement that this external criterion referenced discrimination index have a minimum value of 0.20. Thirty-six items from form C which met this requirement were included in TSPT form D.

The correlation of form D scores with the Individual Competency Measures scores was computed. Hypothesis two, that form D scores were more highly correlated with the Individual Competency Measures scores than with the SRA Science test scores, was tested.

Norming data for TSPT form D was obtained by administering it to a random sample of 1301 sixth grade students. The preparation of a test manual for TSPT form D completed this study.
RESULTS

The correlation of TSPT form D scores with the Individual Competency Measures scores is 0.83, which is significant well beyond the 0.001 level, and demonstrates that the external criterion referenced method of test development used is a fruitful approach to test construction.

The hypothesis that items could be objectively assigned to the Individual Competency Measures subtests was not supported, and the intercorrelations among the Individual Competency Measures subtests cast such doubt on their independence that no further reference was made to the supposed subscales.

The hypothesis that TSPT scores would be more highly correlated with the Individual Competency Measures scores than with the SRA Science test scores was also not supported. The high correlation between the Individual Competency Measures and the SRA Science test scores raises the question whether process tests, which the former is claimed to be, and factual knowledge tests, which the latter is claimed to be, do indeed lead to greatly different results.

CONCLUSIONS

Although the value of TSPT will only become apparent as it is used, its high correlation with the Individual Competency Measures and its quality as indicated by the test statistics suggest that it should be of value to those concerned with science process evaluation, and that the method of test construction used may be of value to those concerned with test construction.

RECOMMENDATIONS

In view of the results of this study it is recommended that:

1. TSPT be used and evaluated by researchers.
2. Further use be made of the objective method of test development used in this study.
3. Additional research be done to test the independence of process test subscales.
4. Additional research be done to distinguish between process ability and factual knowledge.

DEVELOPMENT OF THE SCIENCE PROCESSES TEST (TSPT)

by Robert R. Ludeman

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

College of Education

1974

© Copyright by ROBERT R. LUDEMAN, 1974

ACKNOWLEDGEMENTS

Many people have contributed toward the success of this study. Over 1500 school children and some 100 teachers and administrators have helped in the study. Of these, special mention must be made of the students in the Pierce Community School who made up the validation sample, and their teachers, who did not complain at my extensive disruptions.

Grateful appreciation is also due to Drs. Sherwood K. Haynes, Robert L. Ebel, and Glenn D. Berkheimer, members of my guidance committee. Each member contributed from his expertise, insights and advice, which aided my personal growth and added to the value of this study.

A special note of thanks goes to the chairmen of my guidance committee: Dr. Richard J. McLeod, who suggested the study and gave valuable guidance early in the work, and Dr. Edward L. Smith, whose interest and willingness to go beyond the call of duty in giving advice and guidance contributed significantly to the quality of the study.

Special thanks is also due to the administration of Andrews University, without whose encouragement and financial support the work would never have been undertaken.

Most of all, the greatest thanks goes to my dear wife, who endured it all for me and without whose love it would have been neither possible nor desirable.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF APPENDICES
CHAPTER I. THE PROBLEM
    Background
    The Need for the Study
    The Purpose of the Study
    Initial Considerations
    Hypotheses to be Tested
    Test Instruments Used
    Assumptions
    Limitations
    Overview of the Thesis
    Footnotes

CHAPTER II. REVIEW OF THE LITERATURE
    Background
    Process Evaluation
    Test Construction
    Other Process Tests
    The Need for External Criterion Referenced Validation
    The Work of Fyffe and Robison
    Summary
    Footnotes

CHAPTER III. PROCEDURE
    Item Improvement
    Validation Phase
    Multiple Regression Analysis
    Norming TSPT Form D
    Test Manual Preparation
    Summary
    Footnotes

CHAPTER IV. ANALYSIS OF RESULTS
    Item Improvement
    Validation
    Norming TSPT Form D
    Summary
    Footnotes

CHAPTER V. SUMMARY AND CONCLUSIONS
    Summary
    Conclusions
    Implications for Further Research
    Footnotes

BIBLIOGRAPHY

APPENDICES

LIST OF TABLES

1. Item Subtest Assignments
2. TSPT Test Statistics
3. TSPT Form A Correlation Table
4. TSPT Form C Subtest Correlations
5. TSPT Form C Item Assignments
6. TSPT Form D Item Selection Criteria
7. SRA Test Statistics
8. TSPT, ICM - TSPT, SRAS Correlation Comparison
9. Individual Competency Measures Subtest Intercorrelations
10. Multiple Regression Analysis
11. Norming Schools Characteristics
12. TSPT Form D Test Statistics
13. Norming Sample Frequency Distribution

LIST OF APPENDICES

I-A. One Individual Competency Measure from SAPA
I-B. Integrated Processes of SAPA
I-C. Listing of Individual Competency Measures
III-A. The Minimum Level of Discrimination - Conventional Item Analysis
III-B. Directions for Administering TSPT Form D
III-C. TSPT Form D Test Manual
IV-A. TSPT Form A
IV-B. Form A Subtest Assignments
IV-C. Item Analysis Form A
IV-D. TSPT Form C
IV-E. Item Analysis Form C
IV-F. Validation Sample Scores
IV-G. TSPT Form C Items - Individual Competency Measures Subtest Correlations
IV-H. TSPT Form D
IV-I. Norming Area Map
IV-J. Item Analysis Form C

CHAPTER I

THE PROBLEM

BACKGROUND

Educational theorists have long recognized the need to teach more than just factual knowledge in the schools, but it was during the post-Sputnik era that educational practice began to make significant progress in that direction.1 It was at that time that the "acronym curriculum" of innovative science programs began to emerge. All of them to a greater or lesser degree claim to teach the "higher mental processes."2 Science - A Process Approach (SAPA), the elementary school science program developed by the American Association for the Advancement of Science, has been one of the leaders, for the whole of SAPA is built around the processes which the developers have identified as basic to science.3

As the new courses came into common use it became apparent that there were few, if any, tests available for assessing knowledge of these "higher mental processes."4 It also became painfully apparent that good process test items are not easy to write.5 Some spokesmen were moved to call for new methods of evaluation6,7,8 and new research relative to assessment of achievement at the higher cognitive levels.9,10,11

Anticipating the need for evaluation, the designers of SAPA constructed the Individual Competency Measures. It was originally intended that the Individual Competency Measures would be administered to one student at a time. The teacher would verbally set the task, which occasionally involved the use of physical objects, and then he would observe the student's behavior and record his competence in using the process skills required to perform the task. Student performance on each required task is described in detail for the teacher so that he can judge the student's performance.12 Appendix I-A contains a sample of the Individual Competency Measures. Since the student is asked to demonstrate his knowledge of the processes in much the same setting as that in which they were learned, it is reasonable to ascribe to them "primary" or "direct" validity.13

The difficulty with the Individual Competency Measures is that, since they are administered individually, they are too time consuming to be used very widely. Although Group Competency Measures have been developed for administration to from three to six students at a time, this is a compromise which does not solve the problem of time efficiency.

Even though the Individual Competency Measures are too time consuming to be used extensively, because of their validity Fyffe14 and Robison15 recognized their potential as a standard against which to compare a more time efficient test of science processes. They took the initial steps in the development of such a test by generating a pool of test items, each of which was validated by correlation with the Individual Competency Measures relating to one or more of the following four Integrated Processes as defined by SAPA: Interpreting Data, Controlling Variables, Formulating Hypotheses, and Defining Operationally. These processes are defined in Appendix I-B of this study. Since their work was not intended to produce a usable test, there is no way of evaluating the merit of Fyffe and Robison's approach.
This study will develop such a test, and a tentative evaluation of the approach will be attempted.

THE NEED FOR THE STUDY

A number of individuals have recognized the need for time efficient tests of children's ability to use the science processes,16,17 and high quality, time efficient pencil and paper tests have been prepared which claim to assess ability to use these processes.18,19 However, as is true for so many other tests, they almost always base their claim of validity solely on "expert" opinion. This is considered to be a serious weakness.20,21,22

One reason for this practice is that if another measure is to be used as an external standard for validation, Ebel points out that "...it should always exemplify a measurement procedure clearly superior to (i.e., more relevant, more precise than) that embodied in the test in question."23 Obviously, in most cases, if such a measure exists, it will be used and there is no need to construct another. However, in this case, due to their time inefficiency, the Individual Competency Measures are not practical to use directly but, by the nature of their construction and format, they do meet Ebel's requirements for superior relevance and precision. Thus, in testing ability to use the science processes, the opportunity does exist for the construction of a test which does not depend solely on "expert opinion" for its validation. This opportunity is pursued in this study.

THE PURPOSE OF THE STUDY

The purpose of this study is to develop a test, The Science Processes Test (TSPT), using item selection based on item discrimination referenced to an external criterion, and to evaluate the test's performance. This method is described in detail later in this chapter. The external criterion that will be used for determining the item discrimination is the Individual Competency Measures of SAPA. TSPT is intended to be a research instrument of sufficient quality to be usable by researchers in science education for assessing students' ability to use the integrated processes of Interpreting Data, Controlling Variables, Formulating Hypotheses and Defining Operationally as defined by SAPA. The manual for TSPT form D has been prepared in accordance with the American Educational Research Association recommendations for such manuals.24

Since, as has been previously mentioned, others have found it difficult to construct items which assess ability to use the processes of science, as additional assurance of test validity, students' performance on TSPT will be compared with their performance on the Individual Competency Measures and on the Science Research Associates (SRA) Science test, a test which, it is claimed, measures mainly factual knowledge.25 If the TSPT scores are more closely correlated to the Individual Competency scores than to the SRA Science scores, this will be taken as evidence that TSPT is more a test of science processes than of factual knowledge.

INITIAL CONSIDERATIONS

Early in the development certain decisions were made with reference to the development and final form of TSPT.

1. The time span required to administer the test would be no more than approximately 45 minutes.

2. The test would be of pencil and paper multiple choice format.

The reason for decisions 1 and 2 is the requirement that the test be easily administered without the requirement of a special testing period and without special facilities, equipment, or training of the test administrator.

3. The test would be a non-paced power test not having any time limitation.
There is evidence that timing this type of test is not wise. Both decisions 2 and 3 above required that items built around projected pictures in the original item pool had to be rewritten. In some instances, in order to minimize the reading required, printed pictures were substituted.

4. The major portion of the study would not be attempted until the items in the item pool appeared to be of adequate technical quality to be useful as test items. The criteria for making this judgment are presented in Chapter III.

5. The subjects used would be limited to only one grade level. The reason for this decision is the elimination of as many variables as possible. The sixth grade was chosen because typically it is the last grade in which the SAPA materials are used, and the integrated processes are given increased emphasis in the later grades.

HYPOTHESES TO BE TESTED

Although the major portion of this study is concerned with the development of TSPT, the section on validation does have an experimental aspect, with the following hypotheses to be tested:

1. The Integrated Process which a given test item assesses will be indicated by the students' scores on the item having a significantly higher correlation with their scores on that Integrated Process subtest than on any other subtest of the Individual Competency Measures.

2. Student scores on TSPT will have a significantly higher correlation with their scores on the Individual Competency Measures than the correlation they have with the SRA Science test.

THE EXTERNAL CRITERION REFERENCED METHOD OF TEST DEVELOPMENT

As used in this study, this method of test development deviates from the typical method of test improvement through item analysis in two important respects:

1. The "upper 27 percent" and the "lower 27 percent" groups used in the item analysis are determined with reference to the external criterion scores, as opposed to the conventional procedure, which uses the scores on the test under development to determine these groups. This procedure, suggested by Fyffe26 and Robison,27 provides assurance that items will be selected on which students who know the material assessed by the criterion test do well and students who do not know the material assessed by the criterion test do poorly. In other words, it provides assurance that the item discriminates on the basis of the external criterion. If one has this assurance, it is expected that students' performance on a test composed of such items will correlate highly with their performance on the external criterion. The minimum value used for this discrimination in this study was 0.2.

2. In order to have further assurance of a high correlation with the external criterion, a further requirement is used in this study: that student scores on the item have a minimum correlation with their criterion test scores of 0.2. In most cases this latter requirement is not necessary; if the discrimination requirement is met, the correlation requirement will be met.

The reason the discrimination and correlation requirements are lower than those usually used is that in the usual method of item analysis the item under consideration has contributed to the total score, and so the value is artificially inflated. This is not true when the external criterion is used. Table 6 contains empirical evidence that, at least for this study, 0.2 is an appropriate value.
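To make these two selection rules concrete, the following sketch applies them to a matrix of scored responses. It is a minimal illustration in Python, not the computation actually performed in the study; the function name, the array representation of the scores, and the handling of group sizes are assumptions.

    import numpy as np

    def external_criterion_item_analysis(responses, criterion, d_min=0.2, r_min=0.2):
        # responses: (students x items) array of 0/1 item scores
        # criterion: (students,) array of external criterion scores
        responses = np.asarray(responses, dtype=float)
        criterion = np.asarray(criterion, dtype=float)

        # Kelly's extreme groups, formed from the external criterion
        # rather than from total scores on the test under development.
        k = max(1, int(round(0.27 * len(criterion))))
        order = np.argsort(criterion)
        lower, upper = order[:k], order[-k:]

        # Requirement 1, the D index: proportion correct in the upper
        # group minus the proportion correct in the lower group.
        d = responses[upper].mean(axis=0) - responses[lower].mean(axis=0)

        # Requirement 2: correlation of item scores with criterion scores
        # (a point-biserial correlation, since the items are scored 0/1).
        z_item = (responses - responses.mean(axis=0)) / responses.std(axis=0)
        z_crit = (criterion - criterion.mean()) / criterion.std()
        r = (z_item * z_crit[:, None]).mean(axis=0)

        return (d >= d_min) & (r >= r_min)   # items eligible for retention

Note that an item answered identically by every student has zero variance, so its correlation is undefined; the sketch assumes such items were already eliminated during tryout.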
TEST INSTRUMENTS USED

The Individual Competency Measures

A set of tests designed by SAPA to be administered to one student at a time.28 The tester verbally sets a task which frequently involves hands-on manipulation of a physical object, such as the use of a stopwatch, a balance, or a meter stick to make measurements. The tester observes the student's behavior and records his competence in using the process skills to perform the task he has set. Acceptable student performance on each task is described in detail for the tester so that he can quantitatively rate the student's performance. A typical sample of the Individual Competency Measures is included in Appendix I-A. A listing of the Individual Competency Measures considered for use in this study is included in Appendix I-C.

SRA Science Test

Science Research Associates Achievement Series: Science (blue version) form D.29 This test was chosen because it is of high quality and, most important, it is criticized as follows by one reviewer:30 "The test appears to measure primarily a mastery of science content. It focuses mainly upon knowledge and to a more limited extent upon understanding. It is not appreciably concerned with processes of science or with the problem centered approach." It is the lack of concern for the processes which makes the test ideal for this study, for this means it should be measuring something distinctly different from what the Individual Competency Measures measure. This "something" for the purposes of this study will be referred to as "factual knowledge."

SRA Reading Test

Science Research Associates Achievement Series: Reading (blue version) form D.31 This test was chosen because it was the companion test for the SRA Science test. In an effort to shorten the total test, there were some items which SRA scored as both science and reading items. These items were dropped from the reading test in order that the reading test would measure as much "non-science" as possible.

Fry Readability Formula

Because reading scales are typically intended for use with textual material, it was felt that in this case use of one of the more complex reading scales was not warranted. The Fry Readability Formula is an easy to use readability formula based on grammatical complexity and vocabulary.32 The rule followed was that only the correct alternative was considered in the calculation, and numbers were considered to contain one syllable for each digit. Fry places the uncertainty of grade level determination for his scale at approximately one grade level. The uncertainty is probably higher in this situation, but at least some indication is given of the probable reading level.
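As a rough picture of the calculation, the sketch below computes the two inputs Fry's graph requires, sentences per 100 words and syllables per 100 words, applying the rules just stated. It is an assumption-laden Python fragment, not part of the study: the vowel-group syllable count is only a crude heuristic, and the grade level itself is read from Fry's published graph rather than from any formula.

    import re

    def fry_inputs(text):
        # Returns (sentences per 100 words, syllables per 100 words),
        # the two coordinates located on Fry's readability graph.
        # Assumes a non-empty passage containing only the item stem
        # and the correct alternative, per the study's rule.
        words = re.findall(r"[A-Za-z0-9']+", text)
        sentences = max(1, len(re.findall(r"[.!?]+", text)))

        def syllables(word):
            digits = sum(ch.isdigit() for ch in word)
            if digits:
                return digits  # one syllable per digit, per the study's rule
            vowel_groups = re.findall(r"[aeiouy]+", word.lower())
            return max(1, len(vowel_groups))  # crude heuristic

        scale = 100.0 / len(words)
        return sentences * scale, sum(map(syllables, words)) * scale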
ASSUMPTIONS

Probably the most important assumption of the study is that the Individual Competency Measures are valid measures of students' ability to use the science processes. Since correlation with performance on the Individual Competency Measures is used as the criterion for judging whether or not an item should be included in TSPT, and also as the criterion for judging to what extent TSPT measures students' ability to use the processes of science, this assumption underlies the entire study.

Second, it is assumed that the inflation in correlation of TSPT with the Individual Competency Measures which is bound to result from the above procedure of using the same data for construction of the test and for analysis of the test will not be serious enough to alter the outcome of the study. This assumption will be discussed further in Chapter V.

Third, it is assumed that the SRA Science test does not measure ability to use the processes of science.

Fourth, it is assumed that the validation sample will contain both students who possess the ability to use the science processes and students who possess factual knowledge with respect to science, but that these abilities are possessed quite independently of one another.

Fifth, it is assumed that for the validation phase of the study individual students' ability to use the science processes did not change significantly between the time when it was measured using the Individual Competency Measures and the time of administration of TSPT form C. This time interval could not be reduced below approximately one and one-half months due to the time required to administer the Individual Competency Measures.

Sixth, it is assumed that the learning effect or carry-over from one test to another will not significantly affect students' performance on the tests.

Seventh, it is assumed that the vocabulary and context of the test is sufficiently general that its usefulness will not be limited to students who have studied the SAPA materials.

Eighth, it is assumed that standard statistical procedures and item analysis procedures are applicable to this situation.

LIMITATIONS

This study is limited to the extent that the preceding assumptions are invalid. Further, this study is limited in its interpretation of what really constitutes the science processes to the interpretation used in SAPA. But the vocabulary used in TSPT is not unique to SAPA, so its use will not necessarily be limited to students familiar with SAPA materials. Similarly, this study is limited in its interpretation of what really constitutes factual knowledge to the abilities assessed by the SRA Science test.

The correlation of TSPT scores with the Individual Competency Measures scores is expected to be high, and the correlation of TSPT scores with the SRA Science test scores is expected to be low. Since this difference will be taken as evidence that TSPT measures ability to use the science processes, to the extent that the Individual Competency Measures measure factual knowledge, and to the extent that the SRA Science test measures the science processes, the correlation of the Individual Competency Measures with the SRA Science test will be high and contamination will be introduced. Similarly, if the validation sample contains students who do not possess the ability to use the science processes, or if they do not possess factual knowledge, or if these abilities do not exist independently of each other, again the correlation of the Individual Competency Measures with the SRA Science test will be inflated. Either of these effects will tend to obscure the expected result, that TSPT measures science processes to a greater extent than it measures factual knowledge.

OVERVIEW OF THE THESIS

In this chapter the following topics have been presented: the background, including the work which has led up to this study; the need; the purpose intended to be accomplished; the hypotheses to be addressed; a brief description of the test instruments to be used; the assumptions on which the study is based; and finally, the limitations of the study.

In Chapter II a review of pertinent literature will be presented, including a brief review of the trend toward process education, the SAPA program, the effect the new programs have had on evaluation, and attempts to improve evaluative techniques.
Chapter III describes the procedure used to conduct this study. The first step is the item tryout and improvement. The validation part of the study introduces the unique method of test development used, which is given the descriptive title of the external criterion referenced validation method of test development. Finally, the norming procedure and the test manual preparation are described.

Chapter IV presents the analysis of the data obtained. The item analysis data from the tryouts is presented first. In connection with the validation study, the statistical hypotheses are tested and the data reduction involved in the development of TSPT form D is presented. Finally, the norming data for publication in the test manual is presented.

Chapter V contains a summary of the findings, the conclusions arrived at, and a discussion of the implications of the study.

FOOTNOTES

1. Henry P. Cole, "Process Curricula and Creativity Development," Journal of Creative Behavior, 3, 253 (Fall 1969).

2. Terry Borton, "What's Left When School's Forgotten?" Saturday Review, 53, 69-71, 79 (April 18, 1970).

3. American Association for the Advancement of Science, The Psychological Bases of Science - A Process Approach, AAAS Misc. Publication (1965).

4. L. Lisonbee, "Testing, What For?" Science Teacher, 33, 27-29 (May 1966).

5. Hulda Grobman, Evaluation Activities of Curriculum Projects, AERA Monograph Series on Curriculum Evaluation, No. 2 (Rand McNally, Chicago, Illinois, 1968).

6. Max D. Engelhart and John M. Beck, "The Improvement of Tests," The 62nd Yearbook of the National Society for the Study of Education (Chicago: University of Chicago Press, 1963).

7. Robert E. Stake and T. Denny, "Needed Concepts and Techniques for Utilizing More Fully the Potential of Evaluation," The 68th Yearbook of the National Society for the Study of Education, 2 (Chicago: University of Chicago Press, 1969).

8. Richard B. Smith, "Approach to Measurement in the New Science Curriculum," Science Education, 53, 411-415 (December 1969).

9. Bernard W. Benson and L. L. Young, "Development and Implementation of an Instrument to Assess Cognitive Performance in High School Biology; Assessment of Cognitive Transfer in Science Inventory," Journal of Research in Science Teaching, 8, 211-224 (1971).

10. Robert H. Ennis, "Needed: Research in Critical Thinking," Educational Leadership, 21, 17-20, 39 (October 1963).

11. Ralph W. Tyler, "Educational Evaluation: New Roles, New Means," The 68th Yearbook of the National Society for the Study of Education, 2 (Chicago: University of Chicago Press, 1969).

12. American Association for the Advancement of Science, loc. cit.

13. Robert L. Ebel, Essentials of Educational Measurement (Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1972), p. 438.

14. Darrell W. Fyffe, The Development of Test Items for the Integrated Science Processes: Formulating Hypotheses and Defining Operationally, Unpublished Doctoral Dissertation, Michigan State University (1971).

15. Richard Wayne Robison, The Development of Items which Assess the Processes of Controlling Variables and Interpreting Data, Unpublished Doctoral Dissertation, Michigan State University (1973).

16. H. Grobman, "Curriculum Development and Evaluation," Journal of Educational Research, 64, 436-442 (July 1971).

17. Ralph W. Tyler, "Resources, Models, and Theory in the Improvement of Research in Science Education," Journal of Research in Science Teaching, 5, 43 (1967).

18. Jacqueline V. Mallison, "Review - Stanford Achievement Test: Science," in Oscar K. Buros, Seventh Mental Measurements Yearbook, 2 (Gryphon Press, Highland Park, New Jersey, 1972).
19. Irvin J. Lehmann, "Review - Test of Academic Progress: Science," in Oscar K. Buros, Seventh Mental Measurements Yearbook, 2 (Gryphon Press, Highland Park, New Jersey, 1972), pp. 1243-45.

20. Warren G. Findley, "Purposes of School Testing Programs and Their Efficient Development," The 62nd Yearbook of the National Society for the Study of Education (Chicago: University of Chicago Press, 1963), p. 8.

21. James R. Barclay, Controversial Issues in Testing (Boston: Houghton Mifflin, 1968), p. 60.

22. Robert E. Stake, "The Countenance of Educational Evaluation," Teachers College Record, 68, 523-540 (April 1967).

23. Robert L. Ebel, "Must All Tests Be Valid?" American Psychologist, 16, 640-647 (October 1961).

24. American Educational Research Association Committee on Test Standards, Technical Recommendations for Achievement Tests (National Education Association, Washington, D.C., 1955).

25. Clarence H. Nelson, "Review - Science Research Associates Achievement Series: blue version," in Oscar K. Buros, Seventh Mental Measurements Yearbook, 2 (Gryphon Press, Highland Park, New Jersey, 1972), pp. 1231-33.

26. Fyffe, op. cit., p. 40.

27. Robison, op. cit., p. 51.

28. American Association for the Advancement of Science, loc. cit.

29. Science Research Associates Achievement Series: blue version, form D (Chicago: Science Research Associates, Inc.).

30. Nelson, loc. cit.

31. Science Research Associates Achievement Series, loc. cit.

32. Edward B. Fry, Reading Instruction for Classroom and Clinic (New York: McGraw Hill, 1972), p. 205.

CHAPTER II

REVIEW OF THE LITERATURE

The literature on recent developments in both teaching and testing is extensive, and no attempt will be made to report exhaustively on either. Rather, in this chapter, the recent emphasis on science processes and the implications this emphasis has for testing will be documented. Also, some other recent attempts to construct tests which assess students' ability to use the science processes will be presented.

BACKGROUND

Educators have long recognized the value of teaching students the procedures and strategies of inquiry used by scientists,1,2 and as a result science process teaching has become the focal point for several curriculum projects,3 with SAPA being one of the leaders.4 SAPA has identified eight basic processes and five integrated processes about which they have built their program. The basic processes are: observing, using space/time relationships, classifying, using numbers, measuring, communicating, predicting, and inferring. The integrated processes are: Interpreting Data, Controlling Variables, Formulating Hypotheses, Defining Operationally, and Experimenting.5 It should be emphasized that the integrated processes are claimed to include the basic processes, and that the integrated process of Experimenting is claimed to encompass all the other processes.6 Thus the first four integrated processes are of concern for this study.

PROCESS EVALUATION

It is generally conceded that "To teach without testing is unthinkable,"7 that objectives should be testable, and that evaluation should extend to all the outcomes to which the school addresses itself.8 And yet, many science educators assert that test development has not kept pace with the curriculum changes of the past decade.9-15 The result is that too often we are teaching what we are not testing, and testing what we are not teaching.
Lisonbee16 points out that one probable reason for this situation is that the objectives as listed by many of the curriculum designers are often not testable. An obvious reason suggested by Grobman17 is the difficulty and expense of developing a good test. It has also been suggested that new approaches to testing are needed if process abilities are to be assessed.18-21

TEST CONSTRUCTION

Although the multiple choice test has had its detractors,22 it has emerged as the testing format of choice for most testing situations and will be the only format considered here. A rather standard methodology has evolved for test development and use, and there are a number of good sources which describe the techniques in detail.23-26

Speeding

The question whether a test should be a "speeded test" or a "power test" has been examined by a number of investigators.27,28 The consensus seems to be, "In some situations speed tests may be appropriate and valuable, but these situations seem to be the exception not the rule."29 Those exceptions would be when time is a factor in the evaluation, such as a typist's speed test. Otherwise, especially in situations where careful thinking is involved, speeding has been found to reduce test reliability.30 It was decided that TSPT should be a power test.

The Blueprint

Travers31 has suggested that in order to aid in achieving the desired balance among item types and concepts used, a two dimensional matrix or "blueprint" should be employed for assigning items to the test under construction. Others have suggested that the multidimensional matrix may be too awkward and time consuming, and that perhaps a vector or one dimensional matrix composed of categories may be of more practical utility to the test constructor.32,33 For construction of TSPT, the four integrated processes of Interpreting Data, Controlling Variables, Formulating Hypotheses, and Defining Operationally were used as the test item categories.

Number of Alternatives

Tversky34 has developed a mathematical proof, based on certain assumptions relative to test characteristics and sample properties, which indicates that use of three alternatives will maximize the discrimination of a multiple choice test. Costin35 has submitted empirical evidence which indicates both discrimination and reliability show a slight increase when three alternatives are used as opposed to four alternatives. Opposing these findings, Ebel36 has developed a formula, based on different assumptions relative to the characteristics of the test and the sample, which indicates that the maximum possible reliability for a 100 item test could continue to increase as the number of alternatives is increased, though the rate of increase drops rapidly beyond about four alternatives.

In view of the above findings, it is probably safe to say that the practical difference between using three or four alternatives would be small. It was therefore concluded that, due to its wider acceptance, the four alternative format would be used.

Item Arrangement

Many investigators have examined the effect of item arrangement on test performance. Flaugher37 has reported a statistically significant improvement in scores on a verbal test favoring the easy to hard arrangement, although he questions the practical significance of this finding, and reports no effect on a mathematics test. The findings of Munz38 are quite similar. More important, Brenner39 reports item order did not significantly affect test reliability, difficulty, or discrimination.
Marso40 and Klosner41 support these findings. Thus item difficulty was not used to decide on item order for TSPT.

Item Difficulty

There has apparently been a rather noticeable shift in thinking over the years among testing experts in relation to item difficulty. Symonds42 reported that maximum reliability would be achieved if the difficulty was close to 0.50. Davis43 in a review of both theory and research agreed with this view. Adams44 showed that the highest test reliability was achieved with items of middle difficulty levels. Wofford45 reported that, contrary to theoretical prediction, wider difficulty ranges (0.25 to 0.75) do not decrease reliability. Recently Davis46 has indicated that a difficulty level near 0.5 is not as important as has been thought.

Item Discrimination

Kelly47 originally proposed the use of the "upper 27 percent" and the "lower 27 percent" as the extreme groups for item analysis. Both Feldt48 and Wofford49 later supported Kelly, although Wofford indicated that there was no difference in result when the total sample was used rather than the upper and lower 27 percent.

Engelhart50 compared a number of different indices which have been proposed for use as indicators of the ability of an item to discriminate. He concluded that the "D" index (the difference between the upper 27 percent and the lower 27 percent) was about as effective in identifying poor test items as any of the correlation type indices which have been proposed. It also has the advantage of being more indicative of the actual number of discriminations made. Thus Kelly's index seems to have stood the test of time, and even the onslaught of hard to compute indices made usable by modern computer technology. It is Kelly's discrimination index that is used in this study.

OTHER PROCESS TESTS

A number of tests have become available in the past decade which have been addressed specifically to the task of assessing process ability. Most of the developers used rather traditional methods of test development: a pool of test items is generated, a panel of qualified judges examines the validity of the items, and inappropriate items are dropped from the pool. The surviving validated items are then tried out on a sample of subjects similar to the target population. Item analysis data on the items is obtained and poor items are either revised or dropped from the pool. The resulting items make up the test, which is usually normed by administration to a fairly large sample of the target population.

Cooley and Klopfer,51 in their development of the Test on Understanding Science (TOUS), added additional evidence of validity by administering TOUS as a pre- and post-test to a group of talented students who spent a summer working with scientists. Their scores improved. Whether or not this can be interpreted as evidence that TOUS measures processes may be questioned. One might argue that TOUS is only measuring factual recall and that the students' factual knowledge increased as a result of their experiences.

Welch and Pella52 sought additional evidence of validity for their test, the Science Process Inventory (SPI), by administering it to students, teachers, and scientists. They suggested that, since scientists would be expected to know the most and students the least about science processes, the fact that the scientists obtained the highest and the students obtained the lowest mean score on SPI was evidence of validity.
A comparison of the above ranking with their ranking on a test that claimed to measure only factual recall might have been interesting.

Tannenbaum53 developed the Test of Science Processes (TOSP). He recognized the difficulty of validating a process test. In addition to validation by expert opinion, he asked the teacher of one group to rank his students in order according to their process ability. He then used the correlation between the teacher's ranking and the students' scores on his test as evidence of validity. In order to reduce the reading required, TOSP contains some black and white pictures printed in the test booklet and some color slides which must be projected.

Beard54 also has constructed a process test, with the claim to validity based on the opinion of a panel of judges. In this case an attempt was made to minimize the reading required by synchronizing a taped script with color slides.

Morgan55 developed the Science Test for Evaluation of Process Skills (STEPS). Again, expert opinion was the source of the claim of validity. In this case the reading problem was minimized through the use of film loops. One loop was used for each of the five sections of the test.

Ebel56 has supported the use of pictures as a means of reducing the reading requirement, especially in contexts where a great deal of explanation would otherwise be required to set the task. However, when the pictures are projected for the entire group this constitutes pacing, which would seem to have many of the adverse effects of a speeded test, as discussed earlier in this review. The preceding considerations prompted the use of printed pictures for TSPT.

Probably the most convincing claim of test validity is made by the Individual Competency Measures developed to accompany the SAPA program.57 They use the same materials and contexts in testing the ability to use the processes as are used to teach them. Thus, if the processes are taught by the SAPA program, ability to use them can reasonably be expected to be tested by the Individual Competency Measures.

In spite of their strong claim of validity, the Individual Competency Measures are not extensively used. The reason is their low time efficiency. The Individual Competency Measures, by their very nature, require an individualized testing situation. They also frequently require that equipment and materials be available for the student to manipulate as part of the evaluation.58 This tends to make them even less attractive to the tester.

Nelson59 has developed a test, the Inquiry Skills Measures (ISM), very similar to the Individual Competency Measures in that it is administered on an individualized basis and requires that materials be available for use as part of the test. Again, time efficiency detracts from the utility of the test.

THE NEED FOR EXTERNAL CRITERION REFERENCED VALIDATION

Over the years a number of authorities have expressed concern over the methods of test construction traditionally used. Buros60 warned that to develop a valid test instrument, items should not be selected based on their correlation with the total test score, since the entire test may prove to be invalid. Findley61 has warned of the danger inherent in the use of "expert opinion" as a means of validation. Barclay62 has written, "... the difficulty with testing usage centers very much on the determination of an adequate criterion which is independent of the testing instrument."
Since the purpose of process testing is to get at children's ability to think and reason, and since it is difficult to know how a child arrives at a given response, it seems reasonable that the matter of validity deserves special attention in the case of the process test. Horrocks63 has indicated that, due to the difficulty of writing good process test items, perhaps the greatest hazard in testing for the processes is that of not developing valid items. The concern for item validity is a primary concern of this study.

THE WORK OF FYFFE AND ROBISON

Fyffe64 and Robison65 approached the problem of validation by recognizing that perhaps the Individual Competency Measures, with their previously mentioned claim of validity, represented Ebel's "clearly superior" test. But in this case there is a need for another test because of the time efficiency problem which limits the usefulness of the Individual Competency Measures. Therefore their procedure was as follows.66,67

They began by first selecting a representative sample of the Individual Competency Measures to be used as their external criterion. The next step was to prepare multiple choice test items. Ideas for items were drawn from a review of textbooks and laboratory manuals. In order to assure face validity, a committee of judges composed of both faculty and graduate students reviewed the items using the following procedure:68,69

"The procedure followed in the review of items was to provide each reviewer with a list of the objectives for the four integrated processes at the same time that proposed test items were available. Each test item was then identified as measuring one objective or skill for a particular process. The reviewer then had two considerations to decide: (1) Does the item require the use of the specific process skill identified? and (2) Does the item present enough information that a skillful seventh grade student can respond correctly?"

External Criterion Referenced Validation

Fyffe and Robison administered the selected Individual Competency Measures and then their items to a group of subjects. They used the scores on the Individual Competency Measures to obtain the "upper 27 percent" and the "lower 27 percent" groups needed for the discrimination calculation in the item analysis. Thus decisions could be made about the value of the items based not on the test under construction and not on "expert opinion," but rather on the ability of the item to discriminate between students who did well (upper 27 percent) and those who did poorly (lower 27 percent) on the external criterion, the Individual Competency Measures. This procedure should satisfy the concern expressed in the preceding section, provided the Individual Competency Measures can be accepted as defining what is meant by the science processes.

Weakness

Fyffe reports, "Many of the items for the two processes of interest were pre-tested on two seventh grade students."70 This is the extent of their item tryout before entering into the major portion of their study, which involved administration of the Individual Competency Measures and their items to 56 students. An examination of the item analysis data they obtained from this administration of their items71,72 reveals that a number of their items (i.e., items 9, 10, 17, 18, 21) could probably have been improved by item tryout and revision.

SUMMARY

In this chapter, the trend toward process teaching and the difficulties this shift in emphasis has posed for testing were presented.
Some of the issues surrounding the mechanics of testing were briefly examined, including speeded vs power tests, the use of a blueprint, the optimum number of alternatives, whether item order should be a concern, what value of item difficulty should be used, and finally, a brief examination of the discrimination index. A short survey of some of the recent tests which have been developed for the purpose of assessing science processes was presented, and the method of validation for each test was examined. The concern which authorities in testing have expressed over the traditional methods used in test construction was examined, especially as these relate to the validation of tests which attempt to assess process abilities. Finally, the work of Fyffe and Robison with their use of external criterion referenced validation was reviewed.

FOOTNOTES

1. National Education Association, The Central Purpose of American Education (Washington, D.C., 1961), p. 19.

2. Henry P. Cole, "Process Curricula and Creativity Development," Journal of Creative Behavior, 3, 253 (Fall 1969).

3. Terry Borton, "What's Left When School's Forgotten?" Saturday Review, 53, 69-71, 79 (April 18, 1970).

4. American Association for the Advancement of Science, The Psychological Bases of Science - A Process Approach, AAAS Misc. Publication (1965), pp. 65-68.

5. American Association for the Advancement of Science, Science - A Process Approach Commentary for Teachers, AAAS Misc. Publication, 68-7 (1968).

6. Ibid., p. 163.

7. Joint Committee of the American Association of School Administrators, Testing, Testing, Testing (Washington, D.C.: American Association of School Administrators, 1962), p. 9.

8. Robert L. Ebel, "The Relation of Testing Programs to Educational Goals," The Sixty-second Yearbook of the National Society for the Study of Education (University of Chicago Press, Chicago, 1963).

9. Progress Report of the Panel on Educational Research and Development, Innovation and Experimentation in Education (U.S. Government Printing Office, Washington, D.C., 1964), p. 44.

10. Ralph W. Tyler, "Resources, Models, and Theory in the Improvement of Research in Science Education," Journal of Research in Science Teaching, 5, 43 (1967).

11. Eugene Lee, New Developments in Science Teaching (Wadsworth Press, Belmont, California, 1967), p. 69.

12. Louis Kuslan and A. H. Stone, Teaching Children Science: An Inquiry Approach (Wadsworth Press, Belmont, California, 1968), p. 228.

13. Richard B. Smith, "Approach to Measurement in the New Science Curriculum," Science Education, 53, 411-415 (December 1969).

14. John R. Bormuth, On the Theory of Achievement Test Items (Chicago: University of Chicago Press, 1970).

15. Richard C. Anderson, "How to Construct Achievement Tests to Assess Comprehension," Review of Educational Research, 42, 145-170 (1972).

16. L. Lisonbee, "Testing, What For?" Science Teacher, 33, 27-29 (May 1966).

17. Hulda Grobman, Evaluation Activities of Curriculum Projects, AERA Monograph Series on Curriculum Evaluation, No. 2 (Chicago: Rand McNally, 1968).

18. Robert H. Ennis, "Needed: Research in Critical Thinking," Educational Leadership, 21, 17-20, 39 (October 1963).

19. Robert E. Stake and T. Denny, "Needed Concepts and Techniques for Utilizing More Fully the Potential of Evaluation," The Sixty-eighth Yearbook of the National Society for the Study of Education, 2 (Chicago: University of Chicago Press, 1969).

20. Ralph W. Tyler, "Educational Evaluation: New Roles, New Means," The Sixty-eighth Yearbook of the National Society for the Study of Education, 2 (Chicago: University of Chicago Press, 1969).
Tyler, "Educational Evaluation: New Roles, New Means," The Sixty-eighth Yearbook of the National Society for the Study of Education, 2 (Chicago: University of Chicago Press, 1969). 21Hulda Grobman, "Curriculum DeveIOpment and Evaluation," Journal of Educational Research, 63, 436-442 (July 1971). 22Banesh Hoffman, The Tyranny of Testing_(Crowell—Collier Press, New York, 1962). 23Ralph W. Tyler, Basic Principles of Curriculum and Instruction (Chicago: University Of Chicago Press, 1950). 24R. Thorndike and Hagen, Measurement and Evaluation in Psychology and Education (New York: John Wiley, 1955). 25F. B. Davis, Educational Measurements and Their Interpretation (Wadsworth, Belmont, California, 1964). 26Robert L. Ebel, Essentials of Educational Measurement (Prentice- Hall, Inc., Engelwood Cliffs, New Jersey, 1972). 27C. Terranova, "Relationship Between Test Scores and Test Time," Journal of Experimental Education, 49, 81-83 (Spring 1972). 28Ross E. Traut and R. K. Hambleton, "The Effect of Scoring Instructions and Degree of Speededness on the Validity and Reliability of Multiple-Choice Tests," Educational and Psychological Measurement, .§2, 737-758 (1972). 29Ebel, op. cit., p. 108. 30Franklin R. Evans and R. R. Reilly, "A Study of Speededness as a Source of Test Bias," Journal of Educational Measurement, 9, 123-131 (Summer 1972). 28 31Robert M. W. Travers, How to Make Achievement Tests (New York: Odyssey Press, 1950), p. 25. 32E. F. Lindquist, ed., Educational Measurement (American Council on Education, Washington, D.C., 1951), pp. 119-495. 33 Ebel, Op. cit., p. 364. 34A. Tversky, "On the Optimal Number of Alternatives at a Choice Point," Journal of Mathematical Psychology, 1, 386-391 (1964). 35Frank Costin, "Optimal Number of Alternatives in Multiple- Choice Achievement Tests: Some Empirical Evidence for a Mathematical Proof," Educational and Psychological Measurements, 39, 353-358 (Summer 1970). 36Robert L. Ebel, "Expected Reliability as a Function of Choices Per Item," Educational and Psychological Measurement, 22, 565-570 (1969). 37Ronald L. Flaugher, R. S. Melton, and C. T. Myers, "Item Rearrangement Under Typical Test Conditions," Educational and Psychological Measurement, 28, 813-824 (Autumn 1968). 38C. Munz and A. D. Smouse, "Interaction Effects of Item- Difficulty Sequency and Achievement-Anxiety Reaction on Academic Performance," Journal of Educational Psychology, 52, 370-374(October 1968). 39Marshall H. Brenner, "Test Difficulty, Reliability, and Dis- crimination as Functions of Item Difficulty Order," Journal of Applied Psychology, 48, 98-100 (April 1964). 40Ronald N. Marso, "Test Item Arrangement, Testing Time, and Performance," Journal of Educational Measurement, 2, 113-118 (Summer 1970). 41Naomi C. Klosner and E. K. Gellman, "The Effect of Item Arrangement on Classroom Test Performance: Implications for Content Validity," Educational and Psychological Measurement, 33, 413-418 (1973). 42F. M. Symonds, "Factors Influencing Test Reliability," Journal of Educational Psychology, 12, 73-87 (1938). 43Fredrick B. Davis, "Item Analysis in Relation to Educational and Psychological Testing," Psychological Bulletin, 42, 97-121 (1952). 44J. F. Adams, "Test Item Difficulty and the Reliability Of Item Analysis Methods," Journal of Psycholggy, 323 255-262 (1960). 45J. C. Wofford and T. L. Willoughby, "The Effects of Test Construction Variables Upon Test Reliability and Validity," California Jpnrnal of Educational Research, 29, 96-106 (May 1969). 29 46Frederick B. 
46. Frederick B. Davis, 1971 AERA Conference Summaries: II. Criterion Referenced Measurement (ERIC Clearinghouse on Tests, Measurement and Evaluation, Princeton, New Jersey, 1972).

47. Truman L. Kelly, "The Selection of Upper and Lower Groups for the Validation of Test Items," Journal of Educational Psychology, 30, 17-24 (1939).

48. L. S. Feldt, "Note on Use of Extreme Criterion Groups in Item Discrimination Analysis," Psychometrika, 28, 97-104 (1963).

49. Wofford and Willoughby, loc. cit.

50. Max D. Engelhart, "A Comparison of Several Item Discrimination Indices," Journal of Educational Measurement, 2, 69-76 (June 1965).

51. William W. Cooley and L. E. Klopfer, "The Evaluation of Specific Educational Innovations," Journal of Research in Science Teaching, 1, 73-80 (1963).

52. Wayne W. Welch and M. O. Pella, "The Development of an Instrument for Inventorying Knowledge of the Processes of Science," Journal of Research in Science Teaching, 5, 64-68 (1967).

53. R. S. Tannenbaum, "Development of the Test of Science Processes," Journal of Research in Science Teaching, 8, 123-136 (1971).

54. Jean Beard, "The Development of Group Achievement Tests for Two Basic Processes of AAAS Science - A Process Approach," Journal of Research in Science Teaching, 8, 179-183 (1971).

55. D. A. Morgan, "STEPS: Science Test for Evaluation of Process Skills," The Science Teacher, 38, 77-79 (November 1971).

56. Robert L. Ebel, "Writing the Test Item," Educational Measurement, E. F. Lindquist, ed. (American Council on Education, Washington, D.C., 1951).

57. H. Walbesser, "Science Curriculum Evaluation: Observations on a Position," The Science Teacher, 33, 34-39 (February 1966).

58. American Association for the Advancement of Science, An Evaluation Model and Its Application, Second Report (AAAS, Washington, D.C., 1968), pp. 9, 10.

59. Miles A. Nelson and E. C. Abraham, "Inquiry Skill Measures," Journal of Research in Science Teaching, 10, 291-297 (1973).

60. Oscar K. Buros, "Criticisms of Commonly Used Methods of Validating Achievement Test Items," Proceedings of the 1948 Invitational Conference on Testing Problems (Educational Testing Service, 1949), p. 18.

61. Warren G. Findley, "Purposes of School Testing Programs and Their Efficient Development," The Sixty-second Yearbook of the National Society for the Study of Education (Chicago: University of Chicago Press, 1963), p. 8.

62. James R. Barclay, Controversial Issues in Testing (Boston: Houghton Mifflin, 1968), p. 60.

63. John E. Horrocks and T. I. Schoonover, Measurement for Teachers (Charles E. Merrill Publishing Company, Columbus, Ohio, 1968), p. 70.

64. Darrell W. Fyffe, The Development of Test Items for the Integrated Science Processes: Formulating Hypotheses and Defining Operationally, Unpublished Doctoral Dissertation, Michigan State University (1971).

65. Richard Wayne Robison, The Development of Items which Assess the Processes of Controlling Variables and Interpreting Data, Unpublished Doctoral Dissertation, Michigan State University (1973).

66. Fyffe, op. cit., pp. 30-42.

67. Robison, op. cit., pp. 39-55.

68. Ibid., p. 41.

69. Fyffe, op. cit., p. 35.

70. Ibid., p. 36.

71. Ibid., p. 107.

72. Robison, op. cit., p. 103.

CHAPTER III

PROCEDURE

This study consisted of several quite distinct phases, which will be described independently. They are:

The Item Improvement Phase, in which the items in the item pool were tried out and, on the basis of the resulting item analysis data, retained, edited or dropped from the pool.
The Validation Phase, in which the items in the item pool were validated using the external criterion referenced validation method, and based on this validation the final form of TSPT, form D, was constructed. This phase of the study has many of the characteristics of a typical experimental study, with hypotheses that are tested and either accepted or rejected based on statistical treatment of the data.

The Norming Phase, in which TSPT form D was administered to a random sample of students and, from the resulting data, norms were prepared for use with the test.

The final phase of the study consisted of the writing of the test manual, in which TSPT form D is described, norming data are presented, and instructions for administration of the test and interpretation of the results are given.

ITEM IMPROVEMENT

Construction of Form A

The items in the initial item pool developed by Fyffe and Robison were written with considerable attention to content so that there would be reasonable assurance that they would assess the ability to use the science processes. Their procedure is described in Chapter II of this study. Their item analysis data provide considerable evidence that the lack of adequate item tryout and revision seriously limited the usefulness of their items as written. Thus the initial step in this study was to examine their items in the light of the item analysis data presented in their study.1 Many of their items had to be rewritten and some of them were dropped from the pool as the result of the above item analysis. The criteria established for this and subsequent revisions during the item improvement phase of the study are:

1. The reading level of the item was kept within the ability of sixth grade students.

2. All alternatives were required to have been chosen.

3. Sufficient description preceded the item to set the task. In some cases, since more than one item was based on a given context, the group of items had to be included or excluded in toto.

4. The difficulty of the item (proportion of students missing the item) was required to be between 0.2 and 0.7.

5. The discrimination of the item as determined by conventional methods2 was required to be at least 0.3. An empirical justification for the use of the 0.3 value is given in Appendix III-A.
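To make criteria 4 and 5 concrete, the following sketch computes the two conventional indices from a matrix of scored responses. It is an illustration only, not a program used in this study; the function and variable names are invented, and the upper and lower 27 percent grouping follows the conventional procedure cited above.

    def conventional_item_statistics(responses):
        """responses[s][i] = 1 if student s answered item i correctly, else 0."""
        n_students = len(responses)
        n_items = len(responses[0])
        totals = [sum(row) for row in responses]
        # Rank students by their own total scores and take the extreme
        # 27 percent groups, as in conventional item analysis.
        order = sorted(range(n_students), key=lambda s: totals[s])
        k = max(1, round(0.27 * n_students))
        lower, upper = order[:k], order[-k:]
        stats = []
        for i in range(n_items):
            p_correct = sum(row[i] for row in responses) / n_students
            difficulty = 1.0 - p_correct  # criterion 4: proportion missing the item
            # criterion 5: upper-lower discrimination index
            discrimination = (sum(responses[s][i] for s in upper)
                              - sum(responses[s][i] for s in lower)) / k
            stats.append((difficulty, discrimination))
        return stats

Under these criteria an item survives the screen when its difficulty falls between 0.2 and 0.7 and its discrimination is at least 0.3.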
Additional items were written to bring the item pool up to 80 items. Special care was taken in writing the new items to make them compatible in format, style, and language with those written by Fyffe and Robison. The pictures required to clarify certain contexts were obtained, and the items were assembled and printed to make up TSPT form A, Parts I and II, which is included in Appendix IV-A. The test was separated into two parts as nearly equal in length as the contexts would permit, since it was expected that 80 items would be too long a test to be administered in a typical class period and probably too long for the children to handle in a single sitting as well.

For the tryout of TSPT form A two schools were contacted, one in Flint and one in Lansing, Michigan. Both schools were located in urban middle class neighborhoods. Each school contained two sixth grade classes with chance assignment of students to each class. Each school was recommended by its school district administrators as having a progressive science program. The important difference between them was that the Flint school used the SAPA program and the Lansing school used a traditional textbook oriented program. This method of sample selection was used in order to increase the likelihood that assumption 4 in Chapter I of this study was correct. In late May of 1973, TSPT form A was administered to one class in each school. In the traditional school two days elapsed between administration of Parts I and II. In the SAPA school, Parts I and II were given in the morning and afternoon, respectively, of the same day. Part I required about 30 minutes to complete and Part II required about 50 minutes. Part II was probably too long for this age group: although no time limit was imposed and no marked deterioration in performance was detected, many students became quite restless before they finished it. The students marked their responses on spirit duplicated answer sheets. Their responses were transferred to machine readable forms. The Michigan State University test scoring service scored them and generated the usual test statistics and item analysis data.

Construction of Form B

Using the above item analysis data and the same revision criteria as before, many of the items of form A were either rewritten or dropped from the item pool. The resulting 61 items made up TSPT form B, which was also in two parts: Part I contained 33 items and Part II contained 28 items. In October of 1973 the cooperation of another pair of similar schools was sought for item tryout, one using the SAPA program and the other using a traditional science program. The SAPA school was in Flint and the traditional school was in Berrien Springs, Michigan. The schools used were very similar in size and socioeconomic status to those used for the tryout of form A. In the SAPA school, Part I was administered the last period of the morning and Part II was administered the last period of the afternoon. In the traditional school, Parts I and II were administered on consecutive days. The students' responses were scored by the Andrews University computing center in Berrien Springs, Michigan. The usual item analysis and test statistics were produced.

Construction of Form C

Again applying the previously mentioned criteria, a number of the items were rewritten but no more items were dropped from the pool. The resulting 61 items composed TSPT form C, which was printed again in two parts of 33 and 28 items each. TSPT form C is included in Appendix IV-D. In view of the small amount of revision required to produce form C, it was felt that TSPT was of sufficient quality to begin the next phase of the study.

VALIDATION PHASE

Design of the Study

At this point the study took on more of the characteristics of a classic research study, describable using the notation of Campbell and Stanley3 as:

X  O1  O2  O3

X:  All the previous experiences of the validation sample.
O1: The first observation, consisting of the administration of the Individual Competency Measures.
O2: The second observation, consisting of the administration of TSPT form C.
O3: The third observation, consisting of the administration of the SRA test.

The sequence of this part of the study was:

1. Administration of the Individual Competency Measures.
2. Administration of TSPT form C.
3. Administration of the SRA test.
4. Scoring the above tests and analysis of the results.
5. Construction of TSPT form D, a revision of TSPT form C using the external criterion referenced method of test development as previously defined in this study.
6. Comparison of students' performance on the form D subtest of TSPT form C with their performance on the Individual Competency Measures and the SRA test.

Selection of Individual Competency Measures to be Used

An evaluation of the Individual Competency Measures in terms of their appropriateness to this study was conducted, and Competency Measures were selected for use based on the following considerations:

1. Enough Individual Competency Measures were to be used to include at least 10 tasks on each of the 4 Integrated Processes.

2. The Individual Competency Measures used were to be representative of the total pool of Individual Competency Measures available for each Integrated Process.

3. The Individual Competency Measures used were not to require factual recall of a given activity or terminology within the SAPA program.

4. The tasks involved were required to fit the testing situation used and a time span of approximately 5 minutes per Individual Competency Measure per child.

A listing of the Individual Competency Measures pool considered for use in this study, with an indication of which were actually used, is included in Appendix I-C of this study.

Sample Selection and Description

In October, 1973, the science consultant for the Flint, Michigan Community School System was requested to suggest a school in which the validation study could be conducted. The criteria for selection were:

1. The SAPA program was highly implemented in the school.

2. The school was "typical middle class" in all other respects.

The school recommended was the Pierce Community School in Flint, Michigan. It was located in a stable middle class suburban neighborhood that took pride in its school and was interested and involved in what it was doing. There were approximately 300 students enrolled in the first six grades, with grade six composed of two classes of 29 and 26 students, respectively. The SAPA program had been used throughout the school for several years, though none of the teachers had had any extensive training in the program. Due to absences at one time or another during the study, the final sample size was reduced to 52 students. Since the same teacher taught science to both classes, no distinction was made in the study between them. In a meeting with the principal and the sixth grade teachers, a description of the study was presented and their cooperation was sought and obtained. Throughout the study the school personnel were extremely cooperative even though considerable disruption of their routine was unavoidable.

Facilities for Administering the Individual Competency Measures

Two rooms were used for administration of the Individual Competency Measures. Both were very small, but since the tests were individualized, this was no disadvantage. The room used in the mornings contained a long table which proved to be ideal for setting up the equipment used in some of the tests. The room used in the afternoons contained a sink which helped greatly for other tests. The testing rooms were near enough to the sixth grade classrooms that little time was wasted in moving students from one room to another.

Administration of the Individual Competency Measures

The author administered all of the Individual Competency Measures to every child. Testing began on November 1 and continued through December 18, 1973. Every school day during this period was used. The total time required for testing was approximately two and one half hours per child.
The time was divided into approximately 15 minute sessions in which three Individual Competency Measures were administered. The materials for three of the Individual Competency Measures were set up, and all of the students were cycled through them before the next three Individual Competency Measures were set up. Each setup usually required slightly over two days to complete. The teachers were very cooperative and allowed students to be called from their classrooms almost whenever they were needed. The children were always called in alphabetical order, and it was not long before they could anticipate when they were to be called, so that very little time was wasted and disruption within the classroom was minimized. There was some concern that this procedure could produce some contamination due to children sharing with their friends information about test questions they knew they would be facing later. No ready method of avoiding this hazard was available. If such contamination did occur, it was not readily observable: each situation seemed to be as unique to the last students to see it as it had been to the first.

Scores were obtained from the rating sheets prepared for each student on each Individual Competency Measure. One point was awarded for each task correctly done as indicated on the rating sheet. In most cases several tasks were involved for one Individual Competency Measure. The specific number for each Individual Competency Measure is recorded in Appendix I-C. These scores were analyzed and stored in the computer at the Andrews University computing center.

Administration of TSPT form C

On December 19, 1973, the author administered TSPT form C to the validation sample. Part I was administered to both classes in the morning and Part II was administered in the afternoon. These responses were also stored in the computer at the Andrews University computing center.

Administration of the SRA Test

In January, 1974, the SRA tests were administered to all the students at Pierce Community School as part of their testing program. Rather than wait for the results to be returned by SRA, the sixth grade students' responses were recorded by hand and these data were also stored in the computer at the Andrews University computing center.

Test Scoring

Since the Individual Competency Measures student responses were not of the multiple choice format, the students' scores and standard deviations on the total test and on each subtest were the only data obtained. On both of the other tests, which were multiple choice and therefore amenable to conventional item analysis techniques, the students' responses together with the answer keys were stored in the computer. This allowed scoring and item analysis to be performed on any desired subtest at any time without the need to reenter any data. This method of data storage proved to be particularly advantageous in the construction of TSPT form D.

Construction of TSPT form D

With but minor exceptions, item improvement was complete with form C, all items having met readability and technical quality requirements. The important criteria imposed for this final revision involved questions of validity. The method of external criterion referenced validation as developed by Fyffe and Robison and described in Chapter I was applied as follows: First the students were placed in rank order according to their performance on the Individual Competency Measures.
The upper 27 percent and the lower 27 percent of this ranking formed the "upper 27 percent" and "lower 27 percent" groups, respectively, for the item analysis calculation performed on the TSPT form C items. Next, correlation coefficients were calculated between each item and the Individual Competency Measures scores. Based on the items' discrimination as calculated above and their correlation with the Individual Competency Measures scores, TSPT form D was constructed under the requirement that both of the above indices have minimum values of 0.2 and the requirement that the item's context allowed its use. The reason for the use of the 0.2 minimum value is mentioned in Chapter I, and empirical evidence that, at least for this study, 0.2 is appropriate is presented in Chapter IV, Table 6. Thirty-six items met the above requirements and were assembled to compose TSPT form D.

A machine scorable answer sheet was designed to be used with TSPT form D. Special care was exercised to make the answer sheet as easy to use as possible in order to minimize the confusion it would generate among children who had not used machine scorable answer sheets before. The printing and binding of these materials completed the construction of TSPT form D.
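The external criterion referenced selection just described can be summarized in a short sketch. Again the code is illustrative rather than the program actually used: the 27 percent groups are formed from the criterion (Individual Competency Measures) scores rather than from the test's own totals, and both the discrimination index and the item-criterion correlation must reach the 0.2 floor.

    from math import sqrt

    def pearson_r(x, y):
        """Pearson product moment correlation; assumes non-constant inputs."""
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = sqrt(sum((a - mx) ** 2 for a in x))
        sy = sqrt(sum((b - my) ** 2 for b in y))
        return sxy / (sx * sy)

    def select_items(responses, icm_scores, min_index=0.2):
        """responses[s][i] in {0, 1}; icm_scores[s] = external criterion score.
        Returns the indices of items meeting both 0.2 requirements."""
        n = len(icm_scores)
        # Rank students on the external criterion, not on the test itself.
        order = sorted(range(n), key=lambda s: icm_scores[s])
        k = max(1, round(0.27 * n))
        lower, upper = order[:k], order[-k:]
        kept = []
        for i in range(len(responses[0])):
            item = [row[i] for row in responses]
            d_ext = (sum(item[s] for s in upper)
                     - sum(item[s] for s in lower)) / k
            r_ext = pearson_r(item, list(icm_scores))
            if d_ext >= min_index and r_ext >= min_index:
                kept.append(i)
        return kept

In practice the study added one further screen, the requirement that the item's context allow its use, which has no counterpart in this sketch.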
The directional alternate hypothesis is: Ha: The scores on TSPT form D subtest of TSPT form C are 42 more highly correlated with the scores on the Individual Competency Measures than with the scores on the SRA Science test . Again the concern is that of validity. It is claimed that the Individual Competency Measures assess process ability.4 It is claimed, as has been mentioned, that the SRA Science test assesses "mainly factual knowledge."5 If these judgments are correct, then a comparison of the correlations of TSPT scores with scores on these tests should provide quantitative evidence for the validity of TSPT. Testipngypothesis l The correlation between the validating sample scores on TSPT form C items and Individual Competency Measures subtests were calculated and significant differences among those correlations were sought. Intercorrelations among the Individual Competency Measures subtests were also computed and significant differences among them were sought using Hostellings t test for significance of differences of correlated correlations. Testinngypothesis 2 After TSPT form D was constructed the correlation between the validating sample scores on the form D subtest of TSPT form C was calculated. The correlation between the TSPT form D scores and the SRA Science test scores was also calculated and a t test for the significance of the difference was performed. MULTIPLE REGRESSION ANALYSIS Finally, to elucidate the relations among the variables, the validating sample scores on the Individual Competency Measures, TSPT, 43 SRA Science, and SRA Reading were taken to the Michigan State University Computing Center where a multiple regression was performed. The Individual Competency Measures score was the dependent variable and the TSPT form C, SRA Science and SRA Reading scores were the independent variables. NORMING TSPT FORM D The next phase Of the development of TSPT consisted of generation of norming information in order that potential users of TSPT will have a frame of reference from which to judge how TSPT might perform in their situation. Sample Selection In order to restrict the travel required, the population from which the norming sample was drawn consisted Of the public schools within a 50 mile radius of Berrien Springs, Michigan that contained sixth grade classes as listed in thel973-74 Michigan and Indiana state school directories. There were 243 schools in this population. The schools were numbered consecutively from 1 through 243. The first 20 of a set of computer generated random integers from 1 through 243 were obtained. The schools assigned these numbers made up the norming sample. Of these schools, one refused to cooperate, claiming that accountability studies, federal funding, and other activities were imposing too heavy a testing program to allow any more. The resulting norming sample consisted of 19 schools. Data Collection and Reduction Testing the norming sample was begun in early April of 1974 and was completed in late May. Contact with the schools was first made 44 through the school superintendent. A brief description of the develop- ment Of TSPT, the way their school was selected to participate, and the extent of their involvement was given. If the school system was small, the superintendent frequently gave immediate permission to contact the principal. If it was large, referral was usually made to the science or testing consultant who frequently wished to confer with the principal before giving permission. 
MULTIPLE REGRESSION ANALYSIS

Finally, to elucidate the relations among the variables, the validating sample scores on the Individual Competency Measures, TSPT, SRA Science, and SRA Reading were taken to the Michigan State University Computing Center, where a multiple regression was performed. The Individual Competency Measures score was the dependent variable, and the TSPT form C, SRA Science, and SRA Reading scores were the independent variables.

NORMING TSPT FORM D

The next phase of the development of TSPT consisted of the generation of norming information so that potential users of TSPT would have a frame of reference from which to judge how TSPT might perform in their situation.

Sample Selection

In order to restrict the travel required, the population from which the norming sample was drawn consisted of the public schools within a 50 mile radius of Berrien Springs, Michigan that contained sixth grade classes as listed in the 1973-74 Michigan and Indiana state school directories. There were 243 schools in this population. The schools were numbered consecutively from 1 through 243. The first 20 of a set of computer generated random integers from 1 through 243 were obtained, and the schools assigned these numbers made up the norming sample. Of these schools, one refused to cooperate, claiming that accountability studies, federal funding, and other activities were imposing too heavy a testing program to allow any more. The resulting norming sample consisted of 19 schools.

Data Collection and Reduction

Testing of the norming sample was begun in early April of 1974 and was completed in late May. Contact with the schools was first made through the school superintendent. A brief description of the development of TSPT, the way the school was selected to participate, and the extent of its involvement was given. If the school system was small, the superintendent frequently gave immediate permission to contact the principal. If it was large, referral was usually made to the science or testing consultant, who frequently wished to confer with the principal before giving permission. The next step was to contact the school principal to set up an appointment to meet personally with him and his sixth grade science staff. At this meeting the need for process tests, the development of TSPT, the method of selecting the norming sample, and the part they were being asked to play were outlined. In most cases the school personnel were very willing to cooperate. A date was then agreed upon for administration of TSPT, and a form was completed indicating the size and number of classes, the names of the teachers, and the science program presently being used. A set of directions for administering TSPT (included in Appendix III-B) was given to the teacher and reviewed briefly with him.

On the morning of the date TSPT was to be given, the tests and answer sheets were delivered to the school office. The completed tests were picked up either the same afternoon or the next morning. The students' answer sheets were checked to see that the name block was correctly filled in and the responses were properly marked. They were then delivered to the Andrews University computing center for reading. A printout of student responses was checked against the answer sheets to correct any reading errors. The Opscan 100 reader used was remarkably forgiving of sixth grade students' ability to stay within the proper response field, mark plainly, and erase cleanly: of the over 46,800 responses read, fewer than 20 errors were detected. After any needed corrections were made, the tests were scored and the following information from each school was stored in the computer: the school identification, the students' responses and scores, and the school mean and standard deviation.

Feedback was sent to the schools in two parts. A computer printout of students' names and scores, together with the number of items on the test, the number of subjects who took the test, the mean score, standard deviation, mean difficulty, mean discrimination, KR20 reliability, and standard error of measurement, was returned to each school within a few days after it took the test. Following completion of testing, another letter was sent to the norming sample schools which contained a computer printout of the frequency distribution for the entire sample of 1301 students and the test statistics listed above for the total sample, together with a frequency distribution of the school means.
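The KR20 reliability and standard error of measurement reported back to the schools are standard quantities. A minimal sketch of their computation from a scored response matrix (the names are illustrative; this is not the computing center's program):

    def kr20_and_sem(responses):
        """responses[s][i] in {0, 1}.  Returns (KR20 reliability,
        standard error of measurement)."""
        n = len(responses)
        k = len(responses[0])
        totals = [sum(row) for row in responses]
        mean = sum(totals) / n
        # Population variance of total scores; a sample (n - 1) divisor
        # is an equally common convention.
        var = sum((t - mean) ** 2 for t in totals) / n
        pq = 0.0
        for i in range(k):
            p = sum(row[i] for row in responses) / n   # proportion correct
            pq += p * (1 - p)
        kr20 = (k / (k - 1)) * (1 - pq / var)
        sem = var ** 0.5 * (1 - kr20) ** 0.5           # SEM = sd * sqrt(1 - r)
        return kr20, sem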
TEST MANUAL PREPARATION

The final step in the development of TSPT was the preparation of the test manual. A brief description of the development of TSPT is presented first. Then the method of norming the test and a presentation of the norming statistics and frequency distribution are included. Finally, instructions for using TSPT and interpreting the results complete the test manual. A copy of the test manual is presented in Appendix III-C.

SUMMARY

This study may conveniently be broken down into the following phases:

Item Improvement

The items developed by Fyffe and Robison were revised using the data from their study. Additional items were added to the pool, and two more item tryout and revision cycles were carried out using conventional item analysis procedures. The result was TSPT form C.

Validation

The Individual Competency Measures of SAPA, TSPT form C, and the SRA test were all administered to the validation sample. The correlation of each TSPT form C item score with each Individual Competency Measures subtest score was calculated to test hypothesis one, that TSPT form C items could be objectively placed in the appropriate Integrated Process subscale. TSPT form D was constructed using the external criterion referenced validation method of test development, which uses student performance on the Individual Competency Measures as the criterion for selecting items from TSPT form C for inclusion in TSPT form D. The correlation of TSPT form D scores with the Individual Competency Measures scores was calculated. Hypothesis two, that TSPT form D scores were more highly correlated with the Individual Competency Measures scores than with the SRA Science test scores, was tested. Finally, in an effort to more fully elucidate the relationships among the various tests, a multiple regression analysis was performed using the Individual Competency Measures scores as the dependent variable and the TSPT, SRA Science, and SRA Reading scores as independent variables.

Norming

TSPT form D was administered to a random sample of 1301 sixth grade students for the purpose of obtaining norming data for TSPT form D.

Test Manual Preparation

Finally, a test manual was prepared for TSPT form D containing a brief sketch of the development of TSPT form D, the norming data, and instructions for use of TSPT form D.

FOOTNOTES

1 Darrell W. Fyffe, The Development of Test Items for the Integrated Science Processes: Formulating Hypotheses and Defining Operationally, Unpublished Doctoral Dissertation, Michigan State University (1971), pp. 98-116.

2 Robert L. Ebel, Essentials of Educational Measurement (Englewood Cliffs, New Jersey: Prentice-Hall, 1972), pp. 388-401.

3 Donald T. Campbell and J. C. Stanley, Experimental and Quasi-Experimental Designs for Research (Chicago: Rand McNally, 1963).

4 G. Billings, "Cognitive Levels of Elementary Science Tests," School Science and Mathematics, 71, 824-830 (December 1971).

5 Clarence H. Nelson, "Review - Science Research Associates Achievement Series: blue version," in Buros, Seventh Mental Measurements Yearbook, 2 (Highland Park, New Jersey: Gryphon Press, 1972), pp. 1231-33.

CHAPTER IV

ANALYSIS OF RESULTS

The purpose of this study was to begin with the test items developed by Fyffe and Robison and to develop a test, TSPT, designed to assess students' ability to use the science processes as identified by the SAPA program. The study can quite naturally be divided into several phases, and the results from each phase are analyzed in this chapter.

ITEM IMPROVEMENT

Construction of Form A

After applying the revision criteria presented in Chapter III to the items and data available in Fyffe and Robison's study, many of their items were rewritten, some were dropped, and additional items were added to the pool. The result was TSPT form A, which is included in Appendix IV-A. Logical analysis of the items based on the contexts identified by SAPA indicated the assignment of items to the Integrated Processes presented in Table 1 under the column heading marked "form A." Appendix IV-B contains the subject assignments for each item.
TABLE 1
ITEM SUBTEST ASSIGNMENTS

                                   Number of Items
Process                       form A    form B    form C
Interpreting Data               24        22        22
Controlling Variables           15        10        10
Formulating Hypotheses          18        11        11
Defining Operationally          23        18        18
Total                           80        61        61

Results of the Tryout of Form A

TSPT form A was tried out in a classroom where the SAPA program was used and in a classroom where a traditional science program was used. The item analysis data from the tryout of TSPT form A are recorded in Appendix IV-C. The rest of the test statistics for TSPT form A are presented under the "form A" heading of Table 2. The purpose of this tryout was to obtain item analysis data for use in revising the test items, but since data were obtained from both a SAPA school and a traditional school, comparisons of the two schools' performance are possible. A t test of the significance of the difference between the SAPA and Traditional (Trad.) mean scores reveals that the two means are not significantly different at the 0.01 level.

TABLE 2
TSPT TEST STATISTICS

                          form A          form B        form C   form D subtest   form D
                       SAPA   Trad.    SAPA   Trad.      SAPA      of form C      norming sample
Number of subjects      32     32       31     21         52          52             1301
Number of items         80     80       61     61         61          35               36
Mean score            32.45  26.99    27.65  28.95      27.12       17.12            17.9
Std. deviation         8.10  10.39     7.75   8.31       9.42        7.75             6.90
KR20 reliability       0.70   0.76     0.77   0.82       0.76        0.89             0.84
Std. error             4.44   5.09     3.73   3.49       3.55        2.56             2.69
Mean difficulty        0.59   0.66     0.56   0.53       0.56        0.51             0.50
Mean discrimination    0.31   0.33     0.34   0.32       0.38        0.56             0.50

SAPA: the science program used in the school was SAPA.
Trad.: the science program used in the school was traditional, textbook oriented.

Both schools had standardized test scores available, so comparisons among these tests and TSPT form A were possible. These are presented in Table 3.

TABLE 3
TSPT FORM A CORRELATION TABLE

                                                     Pearson Product
                                                    Moment Correlation
Traditional    Form A with SAT Science                   0.45 *
School         Form A with SAT Reading                   0.51 **
               SAT Science with SAT Reading              0.86 **

SAPA           Form A with SRA Science                   0.74 *
School         Form A with SRA Reading                   0.70 **
               SRA Science with SRA Reading              0.91 **

*  Significant at the 0.01 level.
** Significant at the 0.001 level.
SAT - Stanford Aptitude Test
SRA - Science Research Associates Test

Since form A is the preliminary form of TSPT, no very great importance should be attached to these results, but they do lend support to later work.
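For reference, the two significance tests invoked around Tables 2 and 3 are the ordinary pooled-variance t test for independent means and the t test of a Pearson correlation against zero. A hedged sketch of both (illustrative only, not the scoring service's code):

    from math import sqrt

    def two_sample_t(x, y):
        """Pooled-variance t for the difference between two independent
        means, as in the SAPA vs. traditional comparisons above.
        Degrees of freedom: len(x) + len(y) - 2."""
        nx, ny = len(x), len(y)
        mx, my = sum(x) / nx, sum(y) / ny
        ssx = sum((a - mx) ** 2 for a in x)
        ssy = sum((b - my) ** 2 for b in y)
        sp2 = (ssx + ssy) / (nx + ny - 2)        # pooled variance
        return (mx - my) / sqrt(sp2 * (1 / nx + 1 / ny))

    def r_significance_t(r, n):
        """t statistic for testing r = 0 with n - 2 degrees of freedom,
        the test behind the asterisks in Table 3."""
        return r * sqrt((n - 2) / (1 - r * r))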
Results of the Tryout of Form B

The revision criteria listed in Chapter III in the section titled ITEM IMPROVEMENT were applied to the data obtained from the tryout of TSPT form A. A number of items were dropped and others were rewritten. The form A items which survived and were included in form C are indicated in Appendix IV-E. Since TSPT forms B and C are quite similar, form B is not presented; however, by comparing forms A and C, presented in Appendices IV-A and IV-D respectively, a good idea of this phase of the revision process can be obtained. TSPT form B was tried out in the same manner as form A. The test statistics for TSPT form B are presented under the heading "form B" in Table 2. A t test for the significance of the difference between the SAPA and traditional mean scores again reveals no significant difference. Form B is also a preliminary form, so again no great importance should be attached to the data from it, but they do show that revision improved the test.

Results of the Tryout of Form C

Applying the revision criteria again produced mostly small revisions, with no items being dropped from the test. The result was TSPT form C, which is included in Appendix IV-D. Form C was administered to the validation sample, which contained only sixth grade children who were being taught science using the SAPA materials. This sample is more fully described in Chapter III. The item analysis data are included in Appendix IV-E, and the validation sample scores on TSPT form C are presented in the second column of Appendix IV-F. A plot of these data reveals a very slight positive skew. The other test data are presented under the heading labeled "form C" in Table 2.

The Integrated Processes subscales of form C are all highly correlated with one another and with the total test. These correlations are presented in Table 4, with decimal points suppressed.

TABLE 4
TSPT FORM C SUBTEST CORRELATIONS

                                  I      II     III     IV
I    Interpreting Data          (87)
II   Controlling Variables       59    (78)
III  Formulating Hypotheses      59     53    (77)
IV   Defining Operationally      63     57     52     (86)

The values in parentheses are the correlations of each subtest with the total test. All of these correlations are significant at the 0.001 level.

VALIDATION

The Individual Competency Measures

The Individual Competency Measures previously identified and enumerated in Appendix I-C were administered to the validation sample. The total scores are very slightly negatively skewed, with the Interpreting Data and Controlling Variables subtests accounting for most of the skew. These results are also displayed in Appendix IV-F.

Testing Hypothesis 1

The null form of hypothesis 1 can be stated as follows:

Ho: There are no differences at the 0.01 level of confidence among the correlations of the validating sample scores on each TSPT form C item with their scores on each of the Individual Competency Measures subtests of Interpreting Data, Controlling Variables, Formulating Hypotheses, and Defining Operationally.

Written symbolically: RI,ID = RI,CV = RI,FH = RI,DO for all items I, where

RI,ID: the correlation of the validation sample scores on TSPT item I with their scores on the Interpreting Data subtest of the Individual Competency Measures.
RI,CV: the corresponding correlation with the Controlling Variables subtest.
RI,FH: the corresponding correlation with the Formulating Hypotheses subtest.
RI,DO: the corresponding correlation with the Defining Operationally subtest.

In order to test the above hypothesis, the correlation between the validation sample scores on each TSPT form C item and their scores on each of the Individual Competency Measures subtests was computed. The resulting correlations were then tested for significance of differences using a t test for correlated correlations. The results are presented in Appendix IV-G. Of the 366 t tests performed, the null hypothesis was rejected only six times. In no case were the differences sufficient to unambiguously assign an item to one and only one of the subtests at the 0.01 level. At the 0.1 level this procedure unambiguously assigned four items to one and only one of the Individual Competency Measures subtests. These four items, their assignments based on the above correlations, and their logical assignments based on their agreement with the SAPA contexts are presented in Table 5.
TABLE 5
TSPT FORM C ITEM ASSIGNMENTS

                          Assignments
Item Number       Correlation       Logic
    40                ID              FH
    55                DO              ID
    56                CV              DO
    61                FH              ID

Based on the above data, the hypothesis that the Integrated Process which a given TSPT form C item assesses could be objectively determined from the correlation of the validation sample's scores on the item with their scores on the Individual Competency Measures subtests was not supported, and the items' assumed relation to the Integrated Processes was not used as a criterion for selecting form C items for inclusion in TSPT form D. The question of these subtests is considered further in the discussion section of this chapter.

TSPT form D

The minimum level of external criterion referenced discrimination that should be required for inclusion of an item in TSPT form D was empirically tested by constructing a number of subtests of TSPT form C and examining their statistics. The results are presented in Table 6.

TABLE 6
TSPT FORM D ITEM SELECTION CRITERIA

                                             Minimum Discrimination
                                  Form C     .1      .2      .3      .4     Form D
Number of items                     61       47      40      27      16       35
Mean score                        27.12    22.21   19.38   13.17    8.25    17.12
Standard deviation                 9.42     9.14    8.54    6.61    4.35     7.75
KR20 reliability                   0.86     0.89    0.90    0.89    0.86     0.89
Standard error                     3.55     3.04    2.76    2.23    1.65     2.56
Mean difficulty                    0.55     0.53    0.52    0.51    0.48     0.51
Mean discrimination                0.38     0.42    0.46    0.55    0.62     0.56
Correlation with the Individual
Competency Measures                0.78     0.82    0.83    0.85    0.86     0.83

As expected, since all the statistics use the same data, raising the minimum discrimination level raises the correlation with the Individual Competency Measures and the mean discrimination and lowers the mean difficulty and standard error, but the KR20 reliability appears to be greatest for a minimum discrimination of about 0.2. This may reflect the reduction in the size of the test as the discrimination requirement is increased, but at any rate, for this test the minimum external criterion referenced discrimination of 0.2 as used by Fyffe1 and Robison seems to be about right. The item analysis data for TSPT form C using the external criterion referenced item analysis procedure described in Chapter I are presented in Appendix IV-E under the "ICM" heading.
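Table 6 amounts to a sweep over trial discrimination floors: for each floor, the subtest of form C items clearing it is assembled and the statistics are recomputed. A sketch of that loop, reusing the illustrative pearson_r, select_items, and kr20_and_sem functions defined in the Chapter III sketches (it assumes those definitions are in scope):

    def sweep_thresholds(responses, icm_scores, floors=(0.1, 0.2, 0.3, 0.4)):
        """For each trial floor, build the surviving subtest and report
        the statistics that appear as columns of Table 6."""
        for floor in floors:
            kept = select_items(responses, icm_scores, min_index=floor)
            sub = [[row[i] for i in kept] for row in responses]
            kr20, sem = kr20_and_sem(sub)
            totals = [sum(row) for row in sub]
            r_icm = pearson_r(totals, list(icm_scores))
            print(floor, len(kept), round(kr20, 2),
                  round(sem, 2), round(r_icm, 2))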
Included in the last column of Table 6 are the test statistics for the form D subtest of TSPT form C using the validation sample data. It should be noted that one item which was actually used on TSPT form D (item 8) is not included because a slight revision of this item produced a marked improvement in its performance. Form D, the final form of TSPT, is presented in Appendix IV-H.

SRA Test

The results of the administration of the SRA Science and Reading tests to the validation sample are presented in the last two columns of Appendix IV-F. The test statistics are presented in Table 7. It should be noted that the items which are cross keyed by SRA as being on both the reading test and the science test are omitted.

TABLE 7
SRA TEST STATISTICS

                        Science    Reading
Number of items            40         60
Mean score               25.82      40.7
Standard deviation        8.45      12.86
KR20 reliability          0.91       0.94
Standard error            2.54       3.05
Mean difficulty           0.35       0.32
Mean discrimination       0.52       0.53

Testing Hypothesis 2

The null form of hypothesis 2 can be stated as:

Ho: There is no difference between the correlation of the validation sample scores on the TSPT form D subtest of TSPT form C with their scores on the Individual Competency Measures and the correlation of their scores on the TSPT form D subtest of TSPT form C with their scores on the SRA Science test, or symbolically:

Ho: RTSPT,ICM - RTSPT,SRAS = 0

where

RTSPT,ICM: the correlation of the validating sample scores on the TSPT form D subtest of TSPT form C with their scores on the Individual Competency Measures.
RTSPT,SRAS: the correlation of the validating sample scores on the TSPT form D subtest of TSPT form C with their scores on the SRA Science test.

To test the above hypothesis, a t test of the significance of the difference between correlated correlations was performed. The correlations were obtained from the data presented in Appendix IV-F. The results are presented in Table 8. The t value obtained is not significant even at the 0.2 level; thus the null hypothesis was not rejected.

TABLE 8
TSPT,ICM - TSPT,SRAS CORRELATION COMPARISON

N = 52
RTSPT,ICM = 0.83
RTSPT,SRAS = 0.79

Significance of the difference RTSPT,ICM - RTSPT,SRAS:
Calculated: t = 0.76
For significance (0.01, one tailed test): t = 2.4

At the 0.01 level, the validating sample scores on the form D subtest of TSPT form C correlate about as well with the SRA Science test scores as with the scores on the Individual Competency Measures. The lack of a significant difference is examined more fully in the discussion section below.

Discussion

Hypothesis 1: To elucidate the absence of significant differences among the TSPT item - Individual Competency Measures subscale correlations, the intercorrelations among the Integrated Processes subscales of the Individual Competency Measures were calculated. These are recorded in Table 9. A t test of significance of differences indicated no significant differences at the 0.01 level. Thus it can be argued that the subscales are all measuring similar abilities, so it would be very hard to find a test item that correlates significantly higher with one subtest than with another.

TABLE 9
INDIVIDUAL COMPETENCY MEASURES SUBTEST INTERCORRELATIONS

Subtests                                                 Correlations
Interpreting Data      - Controlling Variables               0.75
                       - Formulating Hypotheses              0.72
                       - Defining Operationally              0.65
Controlling Variables  - Formulating Hypotheses              0.74
                       - Defining Operationally              0.65
Formulating Hypotheses - Defining Operationally              0.62

Perhaps one reason for the lack of significant differences among the subtests can be found in the definitions of the Integrated Processes as presented in Appendix I-B. For example, if by Interpreting Data SAPA means the ability to "...CONSTRUCT one or more inferences or hypotheses from a comparison of the information in two or more related tables...", should one be surprised to find the Interpreting Data and Formulating Hypotheses subtest scores highly correlated?

Hypothesis 2: The result of testing hypothesis 2 is unequivocal. It would have been the same even if the level of significance were changed by an order of magnitude. TSPT scores are as closely related to the SRA Science scores as they are to the Individual Competency Measures scores.
To elucidate this result, the correlation of the validating sample scores on the Individual Competency Measures with their scores on the SRA Science test was reexamined. For the sample size used, a correlation greater than about 0.5 is significant at the 0.001 level of confidence. The value of 0.74 previously reported for this correlation indicates that the Individual Competency Measures and the SRA Science tests are to a considerable extent measuring either the same or very closely related abilities. It is therefore very hard for a third test to be more highly correlated with one than with the other.

To examine this result further, a multiple regression was performed using the Individual Competency Measures scores as the dependent variable and the TSPT form C, SRA Science, and SRA Reading test scores as independent variables. The results are presented in Table 10 and represented pictorially in Figure 1. An interesting feature of these data is that the SRA Reading test accounts for more of the Individual Competency Measures variance (65 percent) than any other single test used. Another interesting feature is that when the TSPT form C and SRA Reading scores are taken together, they account for 72 percent of the Individual Competency Measures variance and completely overlap the SRA Science test, so that adding the SRA Science test accounts for almost no additional variance. However, the important point in terms of hypothesis 2 is that the SRA Science test appears to be closely related to the Individual Competency Measures, accounting for nearly the same amount of variance as any other test used. This, as was previously mentioned, makes it very hard to construct a test that will be more closely correlated with the Individual Competency Measures than with the SRA Science test.

TABLE 10
MULTIPLE REGRESSION ANALYSIS

Dependent variable: the Individual Competency Measures scores.

Order of Inserting Independent      Total Variance
Variables in the Equation           Accounted For
SRAS                                    0.60
SRAR                                    0.66
TSPT                                    0.72

SRAS                                    0.60
TSPT                                    0.70
SRAR                                    0.72

SRAR                                    0.65
SRAS                                    0.66
TSPT                                    0.72

SRAR                                    0.65
TSPT                                    0.72
SRAS                                    0.72

TSPT                                    0.64
SRAS                                    0.70
SRAR                                    0.72

TSPT                                    0.64
SRAR                                    0.72
SRAS                                    0.72

FIGURE 1

(Figure: overlapping-circles diagram of the shared ICM variance; the original graphic is not reproducible here, but its legend follows.)

ICM - The Individual Competency Measures, the dependent variable. Twenty-eight percent of the ICM variance is unaccounted for by any other test, and 53 percent is accounted for by all the other tests.
SRAR - The SRA Reading test, which accounts for 65 percent of the ICM variance. Two percent of the ICM variance is accounted for exclusively by this test.
SRAS - The SRA Science test, which accounts for 60 percent of the ICM variance. None of the ICM variance is accounted for exclusively by this test.
TSPT - TSPT form C, which accounts for 64 percent of the ICM variance. Six percent of the ICM variance is accounted for exclusively by this test.
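Table 10 can be read as a series of cumulative R-squared values as the predictors enter the equation in different orders. A sketch of that computation, assuming numpy and hypothetical score arrays icm, sras, srar, and tspt (names invented for illustration):

    import numpy as np

    def cumulative_r2(y, predictors, order):
        """R-squared after each named predictor is added, mirroring one
        block of Table 10.  predictors: dict name -> 1-D array of scores;
        order: list of names giving the insertion order."""
        y = np.asarray(y, dtype=float)
        out = []
        cols = [np.ones_like(y)]                 # intercept column
        for name in order:
            cols.append(np.asarray(predictors[name], dtype=float))
            X = np.column_stack(cols)
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ beta
            out.append((name, 1 - resid.var() / y.var()))
        return out

    # e.g. cumulative_r2(icm, {"SRAS": sras, "SRAR": srar, "TSPT": tspt},
    #                    ["SRAR", "SRAS", "TSPT"])  # expect ~0.65, 0.66, 0.72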
NORMING TSPT FORM D

As was mentioned in Chapter III, the norming sample was drawn from the public schools within a 50 mile radius of Berrien Springs, Michigan that contained sixth grade classes as listed in the 1973-74 Michigan and Indiana school directories. A map indicating the area included is presented in Appendix IV-I. Some of the characteristics of the population and the sample are given in Table 11.

TABLE 11
NORMING SCHOOLS CHARACTERISTICS

                                                 Population    Sample
Total number of schools represented                 243           19
State               Michigan                        119            9
                    Indiana                         124           10
Size of community   25,000 or greater               118           10
                    less than 25,000                125            9
Type of community   Rural                                          3
                    Inner city                                     4
Science program     Innovative                                     3
                    Textbook                                       5

The type of community and type of science program categories in Table 11 should not be considered very precise classifications; they were derived strictly from intuitive impressions of the school, the classrooms, and the programs being conducted.

The statistics obtained from administration of TSPT form D to the norming sample are presented in Table 12. The complete item analysis is included in Appendix IV-J.

TABLE 12
TSPT FORM D TEST STATISTICS

Number of items              36
Number of subjects         1301
Median score                 17
Mean score               17.900
Standard deviation        6.899
KR20 reliability          0.842
Standard error            2.691
Mean difficulty           0.503
Mean discrimination       0.496

The frequency distribution of norming sample scores on TSPT form D is presented in Table 13. A plot of these data reveals a nearly normal distribution with a very slight positive skew.

TABLE 13
NORMING SAMPLE FREQUENCY DISTRIBUTION

Score   Frequency   Standard Score   Percentile
 36         0           +2.62
 35         0           +2.48
 34         3           +2.33           99.8
 33         6           +2.19           99.3
 32        13           +2.04           98.3
 31        21           +1.90           96.7
 30        24           +1.75           94.9
 29        27           +1.61           92.8
 28        31           +1.46           90.4
 27        47           +1.32           86.8
 26        47           +1.17           83.2
 25        50           +1.03           79.3
 24        45            +.88           75.9
 23        58            +.74           71.4
 22        61            +.59           66.7
 21        53            +.45           62.6
 20        48            +.30           59.0
 19        51            +.16           55.0
 18        63            +.01           50.2
 17        59            -.13           45.7
 16        55            -.28           41.4
 15        55            -.42           37.2
 14        72            -.57           31.7
 13        69            -.71           26.4
 12        65            -.86           21.4
 11        70           -1.00           16.0
 10        59           -1.15           11.5
  9        52           -1.29            7.5
  8        34           -1.43            4.8
  7        32           -1.58            2.4
  6        16           -1.72            1.2
  5         7           -1.87            0.6
  4         6           -2.01            0.2
  3         1           -2.16            0.1
  2         1           -2.30            0.0
  1         0           -2.45
  0         0           -2.59
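The standard score and percentile columns of Table 13 follow from the norming mean (17.900) and standard deviation (6.899); the percentile entries correspond to the percent of the sample scoring below each raw score. A sketch that reproduces the table from the frequency column (the dictionary literal in the example is abbreviated):

    def norm_table(freqs, mean, sd, n):
        """freqs: dict raw score -> frequency.  Returns rows of
        (score, frequency, standard score, percentile), where the
        standard score is (x - mean) / sd and the percentile is the
        percent of the n examinees scoring strictly below x -- the
        convention that reproduces Table 13 (e.g. 45.7 for a score
        of 17)."""
        rows = []
        below = 0
        for score in sorted(freqs):
            f = freqs[score]
            z = (score - mean) / sd
            pct = 100.0 * below / n
            rows.append((score, f, round(z, 2), round(pct, 1)))
            below += f
        return rows

    # e.g. norm_table({0: 0, 1: 0, 2: 1, ..., 18: 63, ...},
    #                 mean=17.900, sd=6.899, n=1301)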
SUMMARY

The significant correlation of TSPT scores with the Individual Competency Measures scores indicates that the external criterion referenced validation method of test construction is a fruitful approach to test construction. The hypothesis that items could be objectively assigned to the ID, CV, FH, and DO subscales was not supported. These subscales within the Individual Competency Measures are so highly intercorrelated that it was not possible to objectively assign a single test item unambiguously to any one subscale at the 0.01 level. Since the subscales were not objectively identifiable, no effort was made to identify them on TSPT. One possible reason for not being able to identify these subscales is the overlap in their definitions.

The hypothesis that the correlation of TSPT scores with the Individual Competency Measures scores would be significantly higher than the correlation of TSPT scores with the SRA Science test scores was not supported. One possible explanation for this outcome is the high correlation between the Individual Competency Measures scores and the SRA Science scores, which indicates that the two tests are very closely related in terms of what they measure. Thus it is very difficult for a third test to be more closely related to one than to the other. The 0.01 level of significance was used in testing the hypotheses in this study.

FOOTNOTES

1 Darrell W. Fyffe, The Development of Test Items for the Integrated Science Processes: Formulating Hypotheses and Defining Operationally, Unpublished Doctoral Dissertation, Michigan State University (1971), p. 43.

CHAPTER V

SUMMARY AND CONCLUSIONS

Test development continues to be one of the major areas of concern in the educational community. Recent efforts to teach process abilities have intensified this concern. This study was an attempt to apply a method of test development which differs from the method widely used and to examine some of the properties of the resulting test.

SUMMARY

The method of test development used has been called by the descriptive name of External Criterion Referenced Validation. Simply stated, the procedure departs from standard test development practice in that the criterion for item acceptance is that the item discriminate between students who do well on the external criterion and students who do poorly on it. For this study the external criterion is the set of Individual Competency Measures selected from the elementary school science program Science - A Process Approach (SAPA), which defines what the test developed during this study, The Science Processes Test (TSPT), is intended to measure. The final form of TSPT is a 36 item, four alternative multiple choice test.

The item pool developed by Fyffe and Robison was used as the starting point for this study, with additions, deletions, and revisions being made based on three item tryouts using conventional item analysis procedures. The result was TSPT form C. Three tests were then administered to a single group of students, the validation sample. The tests were TSPT form C; the Individual Competency Measures, which have been purported to measure ability to use the science processes; and the SRA test, which has been criticized as measuring "mainly factual knowledge."

It was hypothesized that the items of TSPT form C could be assigned to the four subtests of the Individual Competency Measures based on the correlation of the students' scores on each item with their scores on the subtests. The data did not support this hypothesis. The subtests proved to be so highly intercorrelated that an item which correlated highly with one subtest was likely to correlate highly with one or more of the other subtests. Thus unambiguous item-subtest assignment based on item-subtest correlation did not occur for any of the TSPT form C items at the 0.01 level, and examination of the four instances in which it did occur at the 0.1 level makes it hard to believe that they were more than chance occurrences. In view of their high intercorrelations, no further reference was made to the Individual Competency Measures subtests, and no further attempts were made to distinguish among the supposedly different Integrated Processes as identified by SAPA.

TSPT form D was constructed using the external criterion referenced validation method of test development, with the validation sample's performance on the Individual Competency Measures used as the external criterion. The resulting test was highly correlated with the external criterion.

It was hypothesized that TSPT form D measures more of science processes than of factual knowledge, and that students' scores on TSPT form D should therefore correlate more highly with their Individual Competency Measures scores than with their SRA Science test scores. The data did not support this hypothesis. The SRA Science test was so highly correlated with the Individual Competency Measures that it seems unlikely that any test would show a significantly higher correlation with one than with the other, and no significant differences were revealed by this study.
Finally, norming data were obtained for use with TSPT form D by administering it to a random sample of 1301 sixth grade students, and a test manual was prepared for the test.

CONCLUSIONS

The results of this study are quite clear cut. There is no temptation to talk about a test being "almost significant." The results would have been the same even if the levels of significance had been shifted either way by an order of magnitude, and it seems unreasonable that the practical results of the study would have come out appreciably different if a different approach or data treatment had been used. In this respect the study was very satisfying, and the following conclusions seem appropriate.

The Value of TSPT

Although the value of TSPT will only become apparent as it is used, its validity as indicated by the high correlation between scores on it and scores on the Individual Competency Measures, its quality as indicated by the test statistics, and its ease of administration and scoring all indicate that it should be of value to those concerned with testing.

The Value of External Criterion Referenced Validation

The external criterion referenced validation method of test development used to develop TSPT provides a quantitative method for selecting test items which can be used as an objective alternative or supplement to the subjective judgment methods commonly employed. It seems reasonable that this method could be used to construct more time efficient tests in many situations where the most valid methods of evaluation involve more cumbersome procedures such as direct observation, interviews, or other criterion performances. Further, a quantitative estimate of the degree of validity may be inferred from the correlation between scores on the test under development and scores on the criterion. Such correlations would give the potential user a more quantitative basis for judging the merits of a given test than the rather qualitative reviewer's opinion on which the test user must currently rely.

Existence of Science Process Subscales

The four Integrated Processes as identified by SAPA are highly intercorrelated, and as a result their existence as separately identifiable abilities is subject to question. The statements which define these processes and the Individual Competency Measures which assess students' ability to use them need to be made more distinct if they are to be objectively distinguishable.

Correlation Between Individual Competency Measures and SRA Scores

The high correlation between scores on the Individual Competency Measures and the SRA Science test can be interpreted as evidence that they measure much the same things. Although logical analysis by adult experts may indicate that one test assesses "process ability" while another assesses "mainly factual knowledge," this appears to be no guarantee that in fact students' performances will vary widely from one test to the other.

IMPLICATIONS FOR FURTHER RESEARCH

Both the high intercorrelation of the Individual Competency Measures and the high correlation of process ability with factual knowledge observed in this study raise many questions. Do the Individual Competency Measures really measure process ability? If not, what are the processes and how can they be measured? Does the SRA Science test really measure mainly factual knowledge? If not, can such a test be constructed, and can it then be objectively demonstrated that process ability and factual knowledge are indeed distinguishable one from the other?
The above mentioned high correlations could be explained if the subjects in fact largely lacked process ability. However, their scores on the Individual Competency Measures argue against such an interpretation. A more reasonable interpretation, and one which is not without support in the literature,1-4 is that process ability is rather ill defined and poorly understood. There is little real assurance, for example, that process ability and factual knowledge, which seem so different to test constructors and educational theorists, are indeed objectively and quantitatively distinguishable entities. Perhaps the most pressing need for further research is in this area. The processes need to be more precisely defined, and distinctions both among the science processes and between process ability and factual knowledge need to be demonstrated. Further, assuming the above distinctions can be demonstrated, the underlying psychological structures involved need to be elucidated. Until these problems are seriously addressed, the significance of the term "science processes" must be questioned. It is hoped that this test and the method of test development used here may be useful tools for attacking these problems.

Another way to interpret the observed high intercorrelations among the processes is to hypothesize that the children are taught the processes in a rather uniform fashion. This hypothesis could be tested in a number of ways. One way might be an experimental study in which children are intentionally taught the processes on a differential basis. A post test should reveal a lower intercorrelation among the processes for the experimental group, and it should also reveal a significantly greater score improvement on the processes which were taught. The same sort of experiment could be performed with respect to the high correlation observed between process ability and factual knowledge. This kind of study would go a long way toward clarifying our understanding of what actually is involved in what goes by the rubric "process ability."

Correlational studies need to be performed on other pairs of tests, one of which has been classified as a process test and the other a factual knowledge test, to see if they do indeed measure significantly different things. Studies of this nature should supplement, and perhaps eventually replace, the subjective pronouncements of the reviewers on whose judgment we must now rely when we search for a test which measures "mainly processes" or "mainly factual knowledge." Studies of this kind will be helpful in moving testing out of the backwaters of educated guesswork and into the mainstream of scientific objective demonstration.

Finally, other investigators, perhaps in other fields, should experiment with the method of test construction used in this study. To supplement, and perhaps ultimately replace, the subjective judgment of "experts" used to establish the validity of test items with a reliable objective method of item selection is a goal worthy of the best efforts of test constructors. The technique used in this study could be a first step in this direction and should be tested and improved upon by other investigators. For example, a study could be performed in which the external criterion used to determine the validity of the test items is, instead of the Individual Competency Measures as used in this study, direct observation of the children in the laboratory, the classroom, or even at play.
Various techniques of this type could and should be tried to promote a more objective, scientific approach in the field of testing.

FOOTNOTES

1Max D. Engelhart and John M. Beck, "The Improvement of Tests," The Sixty-second Yearbook of the National Society for the Study of Education (Chicago: University of Chicago Press, 1963).

2Julius M. Sassenrath, "The Factorial Composition of the Iowa Tests of Educational Development," California Journal of Educational Research, 16, 80-84 (March 1965).

3Stephen Klein, "Evaluating Tests in Terms of the Information They Provide," ERIC Document ED 045 699 (June 1970).

4Robert L. Ebel, Essentials of Educational Measurement (Prentice-Hall, Englewood Cliffs, New Jersey, 1972), p. 109.

BIBLIOGRAPHY

Books

American Association for the Advancement of Science. An Evaluation Model and Its Application. Second Report. AAAS, Washington, D.C. (1968), pp. 9, 10.

________. Science - A Process Approach Commentary for Teachers. AAAS Misc. Publication 68-7, 1968.

________. The Psychological Bases of Science - A Process Approach. AAAS Misc. Publication 65-68, 1965.

Barclay, James R. Controversial Issues in Testing. (Houghton Mifflin Co., Boston, 1968), p. 60.

Bormuth, John R. On the Theory of Achievement Test Items. (Chicago: University of Chicago Press, 1970).

Buros, Oscar K. "Criticisms of Commonly Used Methods of Validating Achievement Test Items," Proceedings of the 1948 Invitational Conference on Testing Problems (Educational Testing Service, 1949), p. 18.

Davis, Frederick B. Educational Measurements and Their Interpretation. (Belmont, California: Wadsworth, 1964).

________. 1971 AERA Conference Summaries: II Criterion Referenced Measurement. (Princeton, New Jersey: ERIC Clearinghouse on Tests, Measurement and Evaluation, 1972).

Ebel, Robert L. Essentials of Educational Measurement. (Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1972).

________. "The Relation of Testing Programs to Educational Goals." The Sixty-second Yearbook of the National Society for the Study of Education. (Chicago: University of Chicago Press, 1963).

________. "Writing the Test Item." E. F. Lindquist (ed.), Educational Measurement (American Council on Education, Washington, D.C., 1951), pp. 185-249.

Engelhart, Max D., and John M. Beck. "The Improvement of Tests," The Sixty-second Yearbook of the National Society for the Study of Education. (Chicago: University of Chicago Press, 1963).

Findley, Warren G. "Purposes of School Testing Programs and Their Efficient Development." The Sixty-second Yearbook of the National Society for the Study of Education. (Chicago: University of Chicago Press, 1963), p. 8.

Grobman, Hulda. Evaluation Activities of Curriculum Projects, AERA Monograph Series on Curriculum Evaluation, No. 2. (Chicago: Rand McNally, 1968).

Hoffmann, Banesh. The Tyranny of Testing. (New York: Crowell-Collier Press, 1962).

Horrocks, John E., and T. I. Schoonover. Measurement for Teachers. (Columbus, Ohio: Charles E. Merrill Publishing Co., 1968), p. 70.

Innovation and Experimentation in Education, Progress Report of the Panel on Educational Research and Development. (U.S. Government Printing Office, Washington, D.C., 1964), p. 44.

Joint Committee of the American Association of School Administrators. Testing, Testing, Testing. (Washington, D.C.: American Association of School Administrators, 1962), p. 9.

Klein, Stephen. "Evaluating Tests in Terms of the Information They Provide." ERIC Document ED 045 699 (June 1970).

Kuslan, Louis, and A. H. Stone. Teaching Children Science: An Inquiry Approach.
(Belmont, California: Wadsworth Press, 1968), p. 228.

Lee, Eugene. New Developments in Science Teaching. (Belmont, California: Wadsworth Press, 1967), p. 69.

Lindquist, E. F. (ed.) Educational Measurement. (American Council on Education, Washington, D.C., 1951), pp. 119-495.

National Education Association. The Central Purpose of American Education. (Washington, D.C., 1961), p. 19.

Nelson, Clarence H. "Review - Science Research Associates Achievement Series: blue version," in Buros, Seventh Mental Measurements Yearbook. (Highland Park, New Jersey: Gryphon Press, 1972).

Stake, Robert E., and T. Denny. "Needed Concepts and Techniques for Utilizing More Fully the Potential of Evaluation." The Sixty-eighth Yearbook of the National Society for the Study of Education, Part II. (Chicago: University of Chicago Press, 1969).

Thorndike, R. L., and E. Hagen. Measurement and Evaluation in Psychology and Education. (New York: John Wiley, 1955).

Travers, Robert M. W. How to Make Achievement Tests. (New York: Odyssey Press, 1950), p. 25.

Tyler, R. W. Basic Principles of Curriculum and Instruction. (Chicago: University of Chicago Press, 1950).

________. (ed.) Educational Evaluation: New Roles, New Means. The Sixty-eighth Yearbook of the National Society for the Study of Education, Part II. (Chicago: University of Chicago Press, 1969).

Periodicals

Adams, J. F. "Test Item Difficulty and the Reliability of Item Analysis Methods." Journal of Psychology, 42, 255-262 (1960).

Anderson, Richard G. "How to Construct Achievement Tests to Assess Comprehension." Review of Educational Research, 43, 145-170 (1972).

Beard, Jean. "The Development of Group Achievement Tests for Two Basic Processes of AAAS Science - A Process Approach." Journal of Research in Science Teaching, 8, 179-183 (1971).

Billings, G. "Cognitive Levels of Elementary Science Tests." School Science and Mathematics, 71, 824-830 (December 1971).

Borton, Terry. "What's Left When School's Forgotten?" Saturday Review, 53, 69-71, 79 (April 18, 1970).

Brenner, Marshall H. "Test Difficulty, Reliability, and Discrimination as Functions of Item Difficulty Order." Journal of Applied Psychology, 48, 98-100 (April 1964).

Cole, Henry P. "Process Curricula and Creativity Development." Journal of Creative Behavior, 3, 253 (Fall 1969).

Cooley, William W., and L. E. Klopfer. "The Evaluation of Specific Educational Innovations." Journal of Research in Science Teaching, 1, 73-80 (1963).

Costin, Frank. "Optimal Number of Alternatives in Multiple-Choice Achievement Tests: Some Empirical Evidence for a Mathematical Proof." Educational and Psychological Measurement, 30, 353-358 (Summer 1970).

Davis, Frederick B. "Item Analysis in Relation to Educational and Psychological Testing." Psychological Bulletin, 49, 97-121 (1952).

Ebel, Robert L. "Expected Reliability as a Function of Choices Per Item." Educational and Psychological Measurement, 29, 565-570 (1969).

________. "Must All Tests Be Valid?" American Psychologist, 16, 640-647 (October 1961).

Engelhart, Max D. "A Comparison of Several Item Discrimination Indices." Journal of Educational Measurement, 2, 69-76 (June 1965).

Ennis, Robert H. "Needed: Research in Critical Thinking." Educational Leadership, 21, 17-20, 39 (October 1963).

Evans, Franklin R., and R. R. Reilly. "A Study of Speededness as a Source of Test Bias." Journal of Educational Measurement, 9, 123-131 (Summer 1972).

Feldt, L. S. "Note on Use of Extreme Criterion Groups in Item Discrimination Analysis." Psychometrika, 28, 97-104 (1963).

Flaugher, Ronald L., R. S. Melton, and C.
T. Myers. "Item Rearrangement Under Typical Test Conditions." Educational and Psychological Measurement, 28, 813-824 (Autumn 1968).

Grobman, Hulda. "Curriculum Development and Evaluation." Journal of Educational Research, 64, 436-442 (July 1971).

Kelley, Truman. "The Selection of Upper and Lower Groups for the Validation of Test Items." Journal of Educational Psychology, 30, 17-24 (1939).

Klosner, Naomi C., and E. K. Gellman. "The Effect of Item Arrangement on Classroom Test Performance: Implications for Content Validity." Educational and Psychological Measurement, 33, 413-418 (1973).

Lisonbee, L. "Testing, What For?" Science Teacher, 33, 27-29 (May 1966).

Marso, Ronald N. "Test Item Arrangement, Testing Time, and Performance." Journal of Educational Measurement, 7, 113-118 (Summer 1970).

Morgan, D. A. "STEPS Science Test for Evaluation of Process Skills." The Science Teacher, 38, 77-79 (November 1971).

Munz, David C., and A. D. Smouse. "Interaction Effects of Item-Difficulty Sequence and Achievement-Anxiety Reaction on Academic Performance." Journal of Educational Psychology, 59, 370-374 (October 1968).

Nelson, Miles A., and E. C. Abraham. "Inquiry Skill Measures." Journal of Research in Science Teaching, 10, 291-297 (1973).

Sassenrath, Julius M. "The Factorial Composition of the Iowa Tests of Educational Development." California Journal of Educational Research, 16, 80-84 (March 1965).

Smith, Richard B. "Approach to Measurement in the New Science Curriculum." Science Education, 53, 411-415 (December 1969).

Symonds, P. M. "Factors Influencing Test Reliability." Journal of Educational Psychology, 29, 73-87 (1938).

Tannenbaum, R. S. "Development of the Test of Science Processes." Journal of Research in Science Teaching, 8, 123-136 (1971).

Terranova, C. "Relationship Between Test Scores and Test Time." Journal of Experimental Education, 40, 81-83 (Spring 1972).

Traub, Ross E., and R. K. Hambleton. "The Effect of Scoring Instructions and Degree of Speededness on the Validity and Reliability of Multiple-Choice Tests." Educational and Psychological Measurement, 32, 737-758 (1972).

Tversky, A. "On the Optimal Number of Alternatives at a Choice Point." Journal of Mathematical Psychology, 1, 386-391 (1964).

Tyler, Ralph W. "Resources, Models, and Theory in the Improvement of Research in Science Education." Journal of Research in Science Teaching, 5, 43 (1967).

Walbesser, H. "Science Curriculum Evaluation: Observations on a Position." The Science Teacher, 33, 34-39 (1966).

Welch, Wayne W., and M. O. Pella. "The Development of an Instrument for Inventorying Knowledge of the Processes of Science." Journal of Research in Science Teaching, 5, 64-68 (1967).

Wofford, J. C., and T. L. Willoughby. "The Effects of Test Construction Variables Upon Test Reliability and Validity." California Journal of Educational Research, 20, 96-106 (May 1969).

Unpublished Works

Fyffe, Darrel W. The Development of Test Items for the Integrated Science Processes: Formulating Hypotheses and Defining Operationally. Unpublished Doctoral Dissertation (Michigan State University, 1971).

Robison, Richard Wayne. The Development of Items Which Assess the Processes of Controlling Variables and Interpreting Data. Unpublished Doctoral Dissertation (Michigan State University, 1973).
APPENDICES

APPENDIX I - A

ONE INDIVIDUAL COMPETENCY MEASURE FROM SAPA*

Science - A Process Approach/Part G-a
Defining Operationally 7: Two Common Gases

TASK 1 (OBJECTIVE 1): Say, If a piece of wet blue litmus paper is put into a vial of carbon dioxide gas, the color of the paper changes from blue to red. Use this information to tell me an operational definition of carbon dioxide.

Acceptable Behavior: The child states that carbon dioxide is a gas that turns wet blue litmus paper red.

TASK 2 (OBJECTIVE 1): Say, If a piece of wet red litmus paper is put into a vial of ammonia gas, the color of the paper changes from red to blue. Use this information to tell me an operational definition of ammonia.

Acceptable Behavior: The child states that ammonia is a gas that turns wet red litmus paper blue.

TASK 3 (OBJECTIVE 2): Show the child two vials that you have prepared previously, labeled A and B. Vial A should contain carbon dioxide and vial B ammonia gas (put two drops of clear household ammonia into the vial and cap it immediately). Give the child two strips of red litmus paper and two strips of blue, and a vial of water. Say, One of these vials contains carbon dioxide and the other contains ammonia. Use the operational definitions of carbon dioxide and ammonia that you just stated and the objects I have given you to test the gases in the vials. Tell me which vial contains carbon dioxide and which contains ammonia.

Acceptable Behavior: The child moistens the strips of litmus paper, uncaps one vial, puts one strip of red and one strip of blue litmus paper into it, quickly caps the vial, and observes the papers to see which changes color. He does the same with the other vial. He states that he concludes from his observations that vial A contains carbon dioxide and vial B contains ammonia.

TASK 4 (OBJECTIVE 3): Prepare some ammonia gas by placing a piece of paper toweling in the bottom of a 50-milliliter vial and adding about 1 milliliter of clear household ammonia to it. Invert another 50-milliliter vial over the first, let it stand for about a minute, and then remove and quickly cap it. (See Figure C.) Show the child the capped vial. It will contain mostly air, but enough ammonia gas to give a satisfactory test for this task. Also show him some bromthymol blue solution. Say, Watch while I add some of this green test liquid to this vial of ammonia gas. Draw about 1 milliliter of the liquid into a medicine dropper. Uncap the vial of ammonia, squirt the green test liquid into the vial, and quickly replace the cap. The green test liquid will immediately turn bright blue. Say, One operational definition of ammonia is that it is a gas that turns moist red litmus paper blue. Tell me an alternate operational definition of ammonia based on your observations of what happened with the green test liquid.

Acceptable Behavior: The child states that ammonia is a gas that turns green test liquid blue.

*American Association for the Advancement of Science, Science - A Process Approach/Part G-a. (Xerox Corporation, 1970).

APPENDIX I - B

INTEGRATED PROCESSES OF SAPA

INTERPRETING DATA

Under the general heading of Interpreting Data the following skills are stressed.

"1. DESCRIBE in a few sentences the information shown in a table of data or graph.
2. CONSTRUCT one or more inferences or hypotheses from the information given in a table of data or graph.
3. CONSTRUCT one or more inferences or hypotheses from a comparison of the information in two or more related tables of data or graphs.
4. DESCRIBE certain kinds of data, using the mean, median, range, and frequency distribution; and CONSTRUCT predictions, inferences or hypotheses from this information.
5. CONSTRUCT inferences or hypotheses from pictorial data.
6. DISTINGUISH between linear and nonlinear relations, APPLY A RULE to find the slope of graphs of linear relations, and DESCRIBE the information provided by the slope."1

CONTROLLING VARIABLES

Under the general heading of Controlling Variables, Science - A Process Approach attempts to develop in the child the following skills in working with variables.

"1. IDENTIFY variables which may influence the behavior or the properties of a physical or biological system.
2. IDENTIFY variables which are held constant, manipulated, or responding in an investigation or an experiment.
3. DISTINGUISH between conditions which hold a given variable constant and conditions which do not hold a variable constant.
4. CONSTRUCT a test to determine the effects of one or more variables on a responding variable.
5. IDENTIFY AND NAME variables which were not held constant in the description of an investigation, although they varied in the same way in all treatments or were randomized."2

FORMULATING HYPOTHESES

Under the general heading of Formulating Hypotheses the following skills are stressed:

"1. CONSTRUCT a hypothesis that is a generalization of observations or that is a generalized explanation.
2. CONSTRUCT and DEMONSTRATE a test of a hypothesis.
3. DISTINGUISH between observations that support a hypothesis and those that do not.
4. CONSTRUCT a revision of a hypothesis based on observations that were made to test the hypothesis."3

DEFINING OPERATIONALLY

"In defining operationally physical scientists state 'what you do or what operation you perform' and 'what you observe.' For example, applying these criteria an operational definition of oxygen might be: Oxygen: A gas that causes a glowing splint to burst into flame (what you observe) when the splint is placed (what you do) into a container of the gas. If a child wishes to decide if a gas is oxygen using this definition he knows exactly what to do and what to observe. In contrast, a non-operational definition of oxygen, as far as a child is concerned, would be: Oxygen is an element composed of atoms having atomic number 8 and atomic weight 16. Given a container of gas this definition will be entirely useless to the child. He will know neither what to do nor what to observe."4

1American Association for the Advancement of Science, Science - A Process Approach Commentary for Teachers, AAAS Misc. Publication 68-7 (1968), p. 187.
2Ibid, p. 177.
3Ibid, p. 159.
4Ibid, p. 167.
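As a concrete gloss on skills 4 and 6 of the Interpreting Data list above, the fragment below computes the named descriptive statistics and the slope rule for a linear relation. It is a minimal sketch with illustrative data, not part of SAPA or of the competency measures.

```python
def mean(values):
    return sum(values) / len(values)

def median(values):
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

def value_range(values):
    return max(values) - min(values)

def slope(x1, y1, x2, y2):
    """Slope of a linear relation from two of its points: rise over run."""
    return (y2 - y1) / (x2 - x1)

data = [3, 7, 7, 2, 9, 4]                      # illustrative measurements
print(mean(data), median(data), value_range(data))
print(slope(0, 10, 4, 90))                     # 20.0 units of y per unit of x
```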
APPENDIX I - C

LISTING OF INDIVIDUAL COMPETENCY MEASURES

Individual Competency Measures used in this study are indicated by an asterisk.

Interpreting Data:                                    Number of Tasks
* E/d  ID 1   Guinea Pigs in a Maze                         9
* E/l  ID 2   Identifying Materials                         6
* E/o  ID 3   Precision in Measurement                      4
* E/u  ID 4   Field of Vision                               3
  F/c  ID 5   Magnetic Fields                               3
  F/g  ID 6   Quantitative Analysis                        10
  F/l  ID 7   A Measure of Chance                           8
  F/p  ID 8   Contour Maps                                  6
  F/q  ID 9   Measuring Small Things                        4
  G/e  ID 10  Moon Photos                                   5
              Total Tasks Used                             22

Controlling Variables:
* E/b  CV 1   Rolling Cylinders                            10
  E/c  CV 2   Upward Movement of Liquids                    4
  E/p  CV 3   Growth of Mold on Bread                       5
* E/q  CV 4   Loss of Moisture from Potatoes                7
* F/a  CV 5   Variables Affecting Chemical Reactions       10
* F/e  CV 6   The Effects of Practice on Memorization      15
* F/h  CV 7   Nutrition of a Small Animal                   8
* F/j  CV 8   Forgetting and Relearning                     4
  F/m  CV 9   Human Reaction Time                           8
  F/o  CV 10  Growth and Orientation of Plants              5
  F/r  CV 11  A Small Water Animal                          5
  G/e  CV 12  Precipitating Salts from Solution             4
              Total Tasks Used                             54

Formulating Hypotheses:
* E/h  FH 1   Observations and Hypotheses                   4
* E/i  FH 2   Conductors and Nonconductors                  4
* F/b  FH 3   Effects of Temperature on Dissolving Time     6
* F/i  FH 4   Levers                                        3
* F/k  FH 5   Tasters and Nontasters                        6
* G/d  FH 6   Variation in Perceptual Judgment              6
              Total Tasks Used                             29

Defining Operationally:
  E/i  DO 1   Electric Circuits and their Parts
  E/m  DO 2   Analysis of Mixtures
  E/v  DO 3   Living Things are Composed of Cells
  F/d  DO 4   Determining the Direction of True North
  F/f  DO 5   Inertia and Mass
  F/n  DO 6   Parts of Living Plants
  G/a  DO 7   Two Common Gases
  G/b  DO 8   Temperature and Heat
              Total Tasks Used                             16

APPENDIX III-A

THE MINIMUM LEVEL OF DISCRIMINATION - CONVENTIONAL ITEM ANALYSIS

For the purpose of observing what effect changing the minimum acceptable level of discrimination has on TSPT form C, the following experiment was conducted: using data obtained from the validation sample and conventional procedures for calculating the discrimination index, revisions of form C were performed by computer selection of items based solely on a minimum acceptable discrimination index. The results are displayed below:

N = 52
Item Selection Criterion   K Items   Percent Diff.   Percent Disc.   KR20   ICM r*
TSPT form C                  61          56              38          0.86    0.78
Min. Disc. 0.2               48          53              48          0.89    0.81
Min. Disc. 0.3               36          52              56          0.90    0.81
Min. Disc. 0.4               30          50              60          0.88    0.82
Min. Disc. 0.5               22          48              65          0.88    0.83

*Correlation with the Individual Competency Measures scores.

These data reveal that a fairly low minimum discrimination requirement produces a noticeable improvement in most of the statistics; but, probably as a result of the loss in number of items, the KR20 value begins to drop as the minimum discrimination requirement is further increased. Thus, for this study, the 0.3 minimum discrimination requirement used in the item improvement phase seems reasonable.
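For readers who wish to reproduce this kind of analysis, the fragment below is a minimal sketch, on an assumed layout of 0/1 response data, of two of the statistics tabulated above: item difficulty as this study defines it (the proportion of students missing the item) and the KR20 reliability of the whole test. The data layout and function names are assumptions for illustration.

```python
def difficulty(responses, item):
    """Proportion of students missing the item (this study's definition)."""
    return sum(1 - row[item] for row in responses) / len(responses)

def kr20(responses):
    """Kuder-Richardson formula 20 reliability for dichotomous items."""
    n_students, n_items = len(responses), len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    total_var = sum((t - mean_total) ** 2 for t in totals) / n_students
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students   # proportion right
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)
```

The eventual drop in KR20 seen in the table as more items are discarded is consistent with the general dependence of this formula on test length.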
APPENDIX III-B

DIRECTIONS FOR ADMINISTERING TSPT FORM D

TIME REQUIRED: TSPT is not intended to be a timed test. Thus you need not be concerned that all the students start and stop at exactly the same instant. Students should be encouraged to work efficiently but should not feel pressured by time limitations. Most students will complete TSPT in about 45 minutes.

MATERIALS: The only materials the student will need in addition to the TSPT test booklet and the TSPT answer sheet are a pencil (number 2 is recommended) and an eraser.

MARKING THE ANSWER SHEET: Since the answer sheets will be machine scored, the students' response marks need to be dense and black and should approximately fill the response box without extending beyond it. Mistakes should be erased cleanly. Although students should be encouraged to use reasonable care in marking, extreme concern on this point is not necessary.

USING THE NAME BLOCK: The name block need not be filled in if you anticipate that the students will have difficulty with it. The students need only turn the answer sheet sidewise and print their last name and first name in the boxes provided at the top of the name block, placing one letter in each box. If a student's name is too long to fit in the boxes, tell him to simply leave off the last few letters of his name.

DIRECTIONS TO THE STUDENTS: The directions to be read to the students are set off by vertical lines. These need not be read word for word, but may be paraphrased or amplified as desired.

Opening Statement:

Many experts feel that the standardized tests you take every year (SRA, ITED, etc.) are not quite fair to you because they ask you about a lot of facts, while the science you study in school tries to teach you more about how to think and how to act like a scientist. Today you are being given a chance to help in making a new test - TSPT - which will measure your knowledge of the processes of science. If you work hard and do your best on this test, we should be able to tell you how well you are learning the processes of science, and you will be helping us to make a fairer test. There are 36 questions on this test. If you work at a steady pace you should have plenty of time to finish. The answer sheet will be handed to you first. Do not write on it until you are told what to do. You will need a pencil with a good eraser for marking the answer sheet.

Distribute the answer sheets, one to each student. Check to see that each student has a pencil and an eraser.

Using the Answer Sheet:

Turn the answer sheet sidewise so the letters TSPT are at the bottom below the name block. (Hold up an answer sheet turned correctly.) Print your last name and your first name in the boxes at the top of the name block. Put one letter in each box. If your name is too long to fit in the boxes, leave off the last letters.

If your students are familiar with the use of the name block, you may instruct them to fill it in. Otherwise, tell them to ignore it.

You can see that there is one line on the answer sheet for each page of the test. For example, page 1 has only question 1 on it, while page 2 has questions 2, 3, 4, and 5 on it, etc. You are to black in the little box just below the letter which you feel is the best answer for each question. Be sure to mark only one answer for each question. If you make a mistake, erase the mistake. Since your answer sheet will be read by a machine, erase cleanly and make your marks dark. Fill, but do not go outside, the little box. Since others will be using the test booklets, do not make any marks on them. As soon as you are given the test booklet, you may open it and begin work.

Distribute the test booklets. Check to see that the students have entered their names correctly in the name block.

DURING THE TESTING PERIOD: Check to see that the students are marking the answer sheet properly.

FOLLOWING THE TEST: Collect the test booklets and answer sheets. Place them in the container provided and return it to the principal's office. Thank you for your help.
APPENDIX III-C

TSPT FORM D TEST MANUAL

TEST MANUAL
THE SCIENCE PROCESSES TEST
FORM D (1974)

by
ROBERT R. LUDEMAN
DARREL W. FYFFE
RICHARD W. ROBISON
RICHARD J. MCLEOD
GLENN D. BERKHEIMER

COPYRIGHT BY ROBERT R. LUDEMAN 1974

USE THE ANSWER SHEET PROVIDED. PLEASE DO NOT MAKE ANY MARKS ON THIS BOOKLET.

THE SCIENCE PROCESSES TEST (TSPT) TEST MANUAL

by

Robert R. Ludeman, Andrews University, Berrien Springs, Michigan
Darrel W. Fyffe, Bowling Green State University, Bowling Green, Ohio
Richard W. Robison, Manchester College, North Manchester, Indiana
Richard J. McLeod and Glenn D. Berkheimer, Michigan State University, East Lansing, Michigan

Rationale

A more complete description of the rationale and development of TSPT is published elsewhere.1 The expressions of need voiced by researchers in science education for efficient, valid tests of science processes, coupled with the difficulty usually encountered in developing such tests, prompted a group of individuals in the Science and Math Teaching Center at Michigan State University to employ a method of test development that is different from the traditional test development procedure. TSPT is the result of that effort.

1Robert R. Ludeman, Development of The Science Processes Test, unpublished dissertation (Michigan State University, 1974).

Most tests rely on "expert" opinion for their claim of content type validity. This procedure is especially subject to question for a test intended to evaluate children's ability to use the processes of science, since it has been found that writing process test items is much more difficult than writing simple factual recall items. Therefore, in the development of TSPT, although this procedure was used for the original generation of test items, in the later stages of test development it was replaced by a procedure known as "external criterion referenced validation." Using this procedure, items are included in the test on the basis of the requirement that children's performance on each item be highly correlated with their performance on the external criterion. In this case, the external criterion is the Individual Competency Measures of the elementary science program Science - A Process Approach (SAPA).2

2American Association for the Advancement of Science, Science - A Process Approach Commentary for Teachers, AAAS Misc. Publication 68-7, 1968.

The Individual Competency Measures consist of individualized testing situations using the same materials and contexts used by SAPA in defining the Science Processes. The test administrator evaluates the subject's ability to use the Processes as he works with materials in solving problems the administrator poses. The Individual Competency Measures are not widely used because of their low time-efficiency, but since they are so directly related to the context which defines the Processes, it seems reasonable to assume that they constitute an accurate assessment of students' ability and can be used as the criterion for validation of a more time-efficient test.

The science processes addressed by TSPT are the integrated processes referred to by SAPA as Interpreting Data, Controlling Variables, Formulating Hypotheses, and Defining Operationally. It is assumed that the Individual Competency Measures do indeed measure children's ability
to use these processes, and that a high correlation of children's performance on the Individual Competency Measures with their performance on TSPT may therefore be taken as evidence that TSPT is a valid measure of children's ability to use the processes of science.

Development of TSPT:

Based on the behavioral objectives of SAPA, 113 multiple choice items were originally written and examined by science educators with reference to their relevance to these objectives. These items were tried out and revised three times on the basis of conventional test development procedures (all alternatives chosen by some students, discrimination greater than .3, difficulty between .2 and .7), with many items being either discarded or rewritten to meet these requirements. In this phase of the development, which was completed in the spring of 1973, 367 sixth-grade students were involved. At this point it was felt that the resulting item pool of 61 items was of adequate technical quality to begin the criterion-validation phase of the development. Accordingly, beginning early in November and continuing through December of 1973, the Individual Competency Measures of SAPA were administered to 52 sixth-grade children. Immediately on completion of the administration of the Individual Competency Measures, the above 61 items were administered to the same 52 children. Their performance on the Individual Competency Measures was then used as the criterion for item selection for inclusion in TSPT, using the following requirements:

1. All alternatives have been chosen by some students.
2. The context of the item allows its use. In some cases, since more than one item was based on a given context, the group of items had to be included or excluded in toto.
3. The difficulty of each item (proportion of students missing the item) was required to be between .2 and .7.
4. Using the Individual Competency Measures scores to define the "upper 27 percent" and "lower 27 percent" groups, each item was required to have a minimum discrimination of .2.
5. The correlation of students' scores on each item with their scores on the Individual Competency Measures was required to be .2 or greater.

Out of the above items which met these requirements, 36 were used to make up TSPT. Although more items might have been included and would have been desirable from a strictly statistical viewpoint, experience gained during item try-outs indicated that if the number of items exceeded about 40, the students began to get restless and lose their concentration before they finished the test. TSPT was then printed and a machine scoreable answer sheet was designed and printed.

A summary of various correlations obtained from the above procedure is listed in Table 1.

Table 1 - Correlation Summary (N = 52)

TSPT - ICM*            .830
TSPT - SRA** Science   .788
TSPT - SRA** Reading   .798

*Individual Competency Measures
**Science Research Associates Achievement Series (blue version)

Norming TSPT:

A Norming Sample was selected from the public schools containing sixth-grade classes as listed in the Michigan and Indiana public school directories and located within a 50 mile radius of Andrews University in Berrien Springs, Michigan. From this population of 243 schools a random sample of 20 schools was drawn. One of these schools refused to participate in the study, so the actual norming sample consisted of 19 schools from 12 different school systems. The sample contained rural, suburban and city schools in about equal numbers.
The largest school contained 168 sixth grade students and the smallest contained 21 sixth grade students. There was a total of 1301 students in the norming sample, with a broad spectrum of science programs represented. Since no systematic relation was observed between students' scores and type of science program studied, no effort is made to distinguish among programs used by the norming sample. TSPT form D was administered to this norming sample by their own teachers in their own classrooms in the spring of 1974. The important test statistics obtained from the norming sample are displayed in Table 2. The distribution of student scores is given in Table 3.

Table 2 - NORMING SAMPLE DATA FOR TSPT

Grade level                            6
Number of Items                       36
Number of Subjects                  1301
Median Score                          17
Mean Score                          17.9
Standard Deviation                  6.90
Standard Error of the Measurement   2.69
Mean Point Biserial Correlation     .409
KR20 Reliability                    .842
Mean Difficulty                     .503
Mean Discrimination                 .496

Table 3 - Norming Sample Distribution (N = 1301)

Raw Score   Frequency   Std. Score   Percentile
   36           0         +2.62
   35           0         +2.48
   34           3         +2.33         99.8
   33           6         +2.19         99.3
   32          13         +2.04         98.3
   31          21         +1.90         96.7
   30          24         +1.75         94.9
   29          27         +1.61         92.8
   28          31         +1.46         90.4
   27          47         +1.32         86.8
   26          47         +1.17         83.2
   25          50         +1.03         79.3
   24          45         +0.88         75.9
   23          58         +0.74         71.4
   22          61         +0.59         66.7
   21          53         +0.45         62.6
   20          48         +0.30         59.0
   19          51         +0.16         55.0
   18          63         +0.01         50.2
   17          59         -0.13         45.7
   16          55         -0.28         41.4
   15          55         -0.42         37.2
   14          72         -0.57         31.7
   13          69         -0.71         26.4
   12          65         -0.86         21.4
   11          70         -1.00         16.0
   10          59         -1.15         11.5
    9          52         -1.29          7.5
    8          34         -1.43          4.8
    7          32         -1.58          2.4
    6          16         -1.72          1.2
    5           7         -1.87          0.6
    4           6         -2.01          0.2
    3           1         -2.16          0.1
    2           1         -2.30          0.0
    1           0         -2.45
    0           0         -2.59

TSPT is intended to be a "power test," so ample time should be given for essentially all students to complete the test. For the norming sample it was found that 45 minutes was adequate.

Reading Level:

Attention was given during item writing and editing to keeping the reading level as low as possible. The resulting reading level for the final test, using the reading scale developed by Fry,3 is approximately low sixth-grade. In instances where classes were segregated on the basis of "good readers" and "slow readers," the "good readers" typically scored about 5 points higher than the "slow readers."

3Edward B. Fry, Reading Instruction for Classroom and Clinic (McGraw Hill, New York, 1972).

DIRECTIONS FOR ADMINISTERING TSPT:

TIME REQUIRED: At least 45 minutes without interruption should be provided for the administration of TSPT. TSPT is not intended to be a timed test. Thus you need not be concerned that all the students start and stop at exactly the same instant. Students should be encouraged to work efficiently but to take time to think through their answers. TSPT is not a factual recall test. Thinking is required to achieve a high score on this test. Most students will complete TSPT in less than 45 minutes.

MATERIALS: The only materials the student will need in addition to the TSPT test booklet and the TSPT answer sheet are a pencil (number 2 is recommended) and an eraser.

MARKING THE ANSWER SHEET: Since the answer sheets are intended to be machine scored, the student's response marks should be distinct and should approximately fill the response box without extending beyond it. A single dark mark is preferred. Mistakes should be erased cleanly. Although students should be encouraged to use reasonable care in marking, extreme concern on this point is not necessary.
USING THE NAME BLOCK: First the student should turn the answer sheet sidewise and print the letters for his name in the boxes provided at the top of the name block, one letter in each box. Care must be taken that the first letter of the last name is entered in the first box. If a student's name is too long to fit in the boxes provided, the last few letters should be omitted. In order for the machine to read the name, the letter in each alphabet column of the name block corresponding to the letter the student has placed in the box at the top of that column must be marked in. A single clean mark which approximately fills but does not extend beyond the response box is required. Only one letter may be marked in each column of the name block.

DIRECTIONS TO THE STUDENTS: The directions to be given to the students are set off by vertical lines. These need not be read word for word, but may be paraphrased or amplified as desired.

Opening Statement:

This test will find out how well you can use the processes of science. That is, how well you can think and answer the way a scientist would. This means you will need to take time to think before you can answer the questions. You will have enough time, so do not rush. If you work at a steady pace, you will have plenty of time to finish. The answer sheet will be handed to you first. Do not write on it until you are told what to do. You will need a pencil with a good eraser for marking the answer sheet.

Distribute the answer sheets, one to each student. Check to see that each student has a pencil and an eraser.

Using the Answer Sheet:

Turn the answer sheet sidewise so the letters TSPT are at the bottom below the name block.

Hold up an answer sheet turned correctly.

Print your last name and your first name in the boxes at the top of the name block. Be sure you begin with the first box and put one letter in each box. If your name is too long to fit in the boxes, just leave off the last few letters.

Allow sufficient time for the names to be entered. Spot-check to see that it is done correctly.

Under the box where you printed the first letter of your name, go down the alphabet until you come to the first letter of your name. Draw a line through that letter. Be careful that your mark does not go outside the little box. Under the box where you printed the second letter of your name, go down that alphabet until you come to the second letter of your name. Draw a line through that letter. Do this for all the rest of the letters of your name.

Allow sufficient time for the name block to be filled. Spot-check to see that this is done correctly.

Turn the answer sheet right-side-up. You can see that there is one line on the answer sheet for each page of the test. For example, page one has only question one on it, while page 2 has questions 2, 3, 4, and 5 on it, etc. You are to draw a line through the little box just below the letter which you feel is the one best answer for each question. If you make a mistake, erase it cleanly. Make your marks go the whole length of the little boxes, but be sure they do not go outside the little boxes. Since others will be using the test booklets, do not make any marks on them. As soon as you are given the test booklet, you may open it and go to work.

Distribute the test booklets. At the same time, check to see that the students have filled out the name block correctly.

DURING THE TESTING PERIOD: Check to see that the students are marking the answer sheets properly.
FOLLOWING THE TEST: Separate the test booklets and the answer sheets. Arrange the answer sheets in the order in which you wish the results returned to you. Return all test booklets and answer sheets to R. Ludeman, Andrews University, Berrien Springs, Michigan 49104. The answer sheets will be machine scored and returned to you together with a computer printout of students' scores and test statistics similar to what appears in Table 2 of this manual. By special arrangement, other data may be obtained as well, such as item analysis information, breaking the test down into subtests, correlation of two sets of scores, etc.

INTERPRETING TSPT SCORES

Care should be exercised in using both Tables 2 and 3 for interpreting the results of any administration of TSPT. The norming sample used to obtain these data should not be assumed to be representative of any wider population than that previously described.

APPENDIX IV-A

TSPT FORM A

THE SCIENCE PROCESSES TEST
MICHIGAN STATE UNIVERSITY

Use the answer sheet provided. Please do not make any marks on this test booklet.

[Picture: a stick extending over the edge of a table]

1. A stick 100 centimeters long is slowly pushed over the edge of a table as shown. About how far over the edge do you think the end will reach before the stick will fall?
a. 10 cm.
b. 30 cm.
c. 50 cm.
d. 70 cm.
e. 90 cm.

2. The picture that would show the answer to the above question best would be a picture showing:
a. The stick balanced on my finger.
b. How thick the stick is.
c. The stick after it has fallen off the table.
d. Any of the above pictures would give the answer.
e. None of the above pictures would give the answer.

Questions 3, 4 and 5 use the following set up: Several identical sticks are stacked on top of each other and extended over the edge of a table in such a way as to give the greatest possible overhang (see below). The relation between maximum overhang and number of sticks is graphed below:

[Graph: maximum overhang in centimeters versus number of sticks]

3. The greatest overhang you could get using 5 sticks would be about:
a. 1 cm.
b. 49.9 cm.
c. 99 cm.
d. 112 cm.
e. None of the above is close.

4. The smallest number of sticks you would need to get an overhang of 100 cm. is:
a. 1 stick
b. 2 sticks
c. 3 sticks
d. 4 sticks
e. you could not ever get that big an overhang.

5. Using 10 sticks you could get a maximum overhang of:
a. more than 150 cm.
b. between 140 and 150 cm.
c. between 130 and 140 cm.
d. less than 130 cm.
e. there is not enough data to decide which is the best answer.

6. If you needed to tell someone what I mean above by "overhang" so that they could measure the overhang, you should say "overhang" means the distance from the:
a. end of the top stick to the center of gravity of the system.
b. end of the top stick to the edge of the table.
c. center of gravity to the edge of the table.
d. more than one of the above is correct.
e. none of the above is correct.

Questions 7 through 10 are about frames A and B shown in the picture on the right.
In this picture frame B has been turned upside down.

7. Based on what you have seen this far it is possible that:
a. string A will remain straight when frame A is turned upside down.
b. string A will bend when frame A is turned upside down.
c. string B is held straight by a fine thread fastened to the bottom of the frame.
d. all of the above may be true.
e. none of the above can possibly be true.

8. From the picture on the right you now know that string:
a. A is not as stiff as string B.
b. A is made of a stiff wire that is now bent.
c. B is made of a stiff wire.
d. B is held up by a strong magnet hidden behind the frame.
e. none of the above is correct.

9. What evidence do you now have that something about frame A is different from frame B?
a. the key in A fell.
b. the ring in B did not fall.
c. either of the above is evidence there is a difference.
d. both 1 and 2 are needed to have evidence for a difference.
e. none of the above is correct.

10. Based on the evidence you now have (including the picture on the right), the best conclusion is:
a. string A is not as stiff as string B.
b. string A is made of a stiff wire that is now bent.
c. string B is made of a stiff wire.
d. B is held up by a strong magnet hidden behind the frame.
e. none of the above is reasonable.

11. My name for the special string used in frame B above is "Wyrstring." Suppose a friend phoned you to find out if a piece of string he found was wyrstring. To tell him it would be most helpful to know:
a. how stiff his string is.
b. how long his string is.
c. how big around his string is.
d. what his string is made of.
e. where he found his string.

[Figure: a stick clamped to a table with weights hung on its free end, and a graph of the height of the stick's end above the floor (cm) versus weight (grams)]

The graph on the right was obtained by setting weights on the end of the stick clamped to the table as shown on the left.

12. The can of soup shown above bends the stick to 66 cm from the floor. The can weighs about:
a. 67.5 grams
b. 110 grams
c. 255 grams
d. 315 grams
e. none of the above.

13. A weight of 65 grams should bend the stick to about:
a. 57.5 cm from the floor
b. 64 cm from the floor
c. 67.5 cm from the floor
d. 135 cm from the floor
e. none of the above.

14. The weight that would bend the stick to 72 cm from the floor would weigh about:
a. 55 gm
b. 67.5 gm
c. 215 gm
d. 350 gm
e. none of the above

15. According to the above experiment, doubling the weight on the end of the stick should:
a. double its distance from the floor.
b. cut its distance from the floor in half.
c. bend it down about 5 cm closer to the floor.
d. bend it closer to the floor but not by a fixed distance.
e. none of the above is correct.

Questions 16 through 22 are about the following experiment: The two jars shown are filled full to the brim. The lids are screwed on tight. The jars are put in a pan and placed in a freezer. The temperature is set at 0 degrees F.

16. Two hours later it is found that neither jar is frozen, possibly because:
a. neither contains water.
b. more time is needed.
c. they are too full to freeze.
d. all the above are reasonable.
e. two of the above answers are reasonable.

17. One might expect both jars to be frozen solid one day later because:
a. water freezes at temperatures below 32 degrees F.
b. the liquids in the jars look like water.
c. one day in a freezer should be long enough to freeze water.
d. all of the above are true.
e. none of the above is true.

Next day when the freezer is opened jar "Y" is broken and its contents frozen solid.

18. You now know that:
a. the jars' contents are different.
b. at least one of the jars contains water.
c. the temperature of the jars is different.
d. more than one of the above is correct.
e. none of the above is correct.

19. Suppose someone told you that the contents of jar "Y" behaved like a Cronon while the contents of jar "X" behaved like a non-Cronon. To show the difference between them you can say a Cronon is:
a. a chemical, but a non-Cronon is not.
b. just another name for water.
c. easier to freeze than a non-Cronon.
d. more than one of the above is correct.
e. none of the above is correct.

20. A jar of water is left in the above freezer over night and is found to be frozen next morning. You know that:
a. water freezes easier than a Cronon.
b. water is a Cronon.
c. Cronons are made of water.
d. the water was cold before it was put in the freezer.
e. none of the above is correct.

21. A bottle of alcohol is left in the above freezer over night. It is not frozen. You now know that:
a. alcohol is a Cronon.
b. alcohol is a non-Cronon.
c. non-Cronons are made of alcohol.
d. alcohol cannot be frozen.
e. none of the above is correct.

22. From this experiment it is safe to say that a mixture of half water and half alcohol would:
a. be a Cronon.
b. be a non-Cronon.
c. freeze easier than pure alcohol.
d. freeze easier than pure water.
e. none of the above is correct.

23. Richard claims that any glass bottle will break when the water it contains freezes. To test this idea, he puts four bottles in the freezer: A is empty, B is one-third full of water, C is two-thirds full, D is brim-full. If Richard is correct, which bottles will break?
a. A only.
b. A, B, and C.
c. B and C.
d. B, C, and D.
e. all the bottles.

On the right is a picture of a recording thermometer. The thermometer is the dark object being placed in the beaker. It sends an electrical signal to the recorder on the left, whose pen automatically draws a graph of temperature and time.

[Graph: temperature (degrees F) versus time (minutes), with a "flat spot" between points A and B]

24. 100 ml of water was placed in the above beaker and left in the freezer. The recording thermometer drew the graph on the right. Thus you know that a 10 degree drop in temperature:
a. requires a time of 5 minutes.
b. requires a time of 15 minutes.
c. requires less time for colder temperatures.
d. requires more time for colder temperatures.
e. none of the above is correct.

25. When the recorder reached point A above I opened the freezer door for a quick look. Ice was just beginning to form on the surface of the water. I looked again at point B. The water was all frozen solid. From what you know now it is possible that the cause of the "flat spot" on the graph is:
a. the recorder sticks at about 32 degrees and so it does not read right.
b. the temperature does not change while the water is turning to ice.
c. opening the freezer door ruined the experiment.
d. more than one of the above are reasonable.
e. none of the above are reasonable.

26. Another 100 ml beaker of water has 10 grams of salt dissolved in it. It is placed in the freezer. The recording thermometer draws the graph on the right. The salt seems to have had the greatest effect on:
a. the temperature of the "flat spot."
b. the amount of time for the "flat spot."
c. the cooling rate before the "flat spot."
d. the cooling rate after the "flat spot."
e. none of the above were affected.
[Graph: temperature versus time (minutes) for the salt solution, with a flat spot at point A]

27. When the recorder reached point A I opened the freezer door for a quick look. About half of the salty water was frozen. From what you know now it is probable that the cause of the "flat spot" is:
a. the recorder sticks at about 32 degrees and so it does not read right.
b. the temperature does not change while the liquid is turning to a solid.
c. opening the freezer door ruined the experiment.
d. more than one of the above are reasonable.
e. none of the above are reasonable.

28. For future experiments it would not be necessary to open the freezer door to determine when freezing is taking place because we can say freezing occurs:
a. during the time when the cooling curve is temporarily flattened.
b. when a regular crystal structure develops.
c. at constant molecular energy.
d. when molecular motion ceases.
e. none of the above is correct.

29. Between the experiments of problems 24 and 26 I changed the following:
a. the temperature of the water.
b. the temperature of the freezer.
c. the time.
d. the amount of salt in the water.
e. all of the above.

30. The following variable/s was/were kept constant in the experiment:
a. the temperature of the water.
b. the temperature of the freezer.
c. the amount of salt in the water.
d. all of the above.
e. none of the above.

31. This experiment tested the idea that:
a. the time for water to freeze depends on the amount of water used.
b. the time for water to freeze depends on the temperature of the freezer.
c. the temperature at which water freezes depends on the amount of water used.
d. the temperature at which water freezes depends on the temperature of the freezer.
e. none of the above.

THE SCIENCE PROCESSES TEST
MICHIGAN STATE UNIVERSITY

Use the answer sheet provided. Please do not make any marks on this test booklet.

Questions 32 - 80

Questions 32 to 45 are about the cylinders shown below. Cylinders A to D are made of metal while cylinders E to H are made of clear plastic. Cylinders A, B, E and F are short while cylinders C, D, G, and H are long. Cylinders A, C, E, and G are solid, while cylinders B, D, F and H are hollow. It is expected that the time it takes these cylinders to roll the length of the sloping table on the right will depend on some or all of the above variables.

32. By comparing rolling times for cylinders A and E you could test the effect of the variable:
a. solid or hollow.
b. long or short.
c. metal or plastic.
d. amount of slope.
e. none of the above.

33. By comparing rolling times for cylinders F and H you could test the effect of the variable:
a. solid or hollow.
b. long or short.
c. metal or plastic.
d. amount of slope.
e. none of the above.

34. By comparing rolling times for cylinders A and G you could test the effect of the variable:
a. solid or hollow.
b. long or short.
c. metal or plastic.
d. amount of slope.
e. none of the above.

35. It was found that the rolling time was exactly the same for cylinders B and D. From this information alone, which of the following variables does not affect the rolling time:
a. solid or hollow.
b. long or short.
c. metal or plastic.
d. amount of slope.
e. none of the above.

36. The rolling times for which cylinders will tell you if a hollow cylinder rolls at a different rate than a solid cylinder?
a. A and D.
b. A and F.
c. C and D.
d. C and H.
e. none of the above.

37. The rolling times for which cylinders will tell you if a metal cylinder rolls at a different rate than a plastic cylinder?
a. A and F.
b. A and H.
c. C and G.
d. C and H.
e. none of the above.

Rolling times for the above cylinders are given in the following table. Use it to answer questions 38 to 45.

Rod   Material   Length   Type     Time
 A    metal      2 cm     solid     5 sec.
 B    metal      2        hollow   10
 C    metal      8        solid     5
 D    metal      8        hollow   10
 E    plastic    2        solid     5
 F    plastic    2        hollow   10
 G    plastic    8        solid     5
 H    plastic    8        hollow   10

38. The material from which the cylinder is made affects the rolling time.
a. true.
b. false.
c. cannot tell from the data.

39. The variable solid or hollow affects the rolling time.
a. true.
b. false.
c. cannot tell from the data.

40. The variable long or short affects the rolling time.
a. true.
b. false.
c. cannot tell from the data.

41. The slope of the table affects the rolling time.
a. true.
b. false.
c. cannot tell from the data.

42. The above table suggests that for this experiment the rolling time for hollow cylinders:
a. is always 10 sec.
b. does not depend on material.
c. does not depend on length.
d. all the above are correct.
e. none of the above is correct.

43. For this experiment a solid cylinder is one which:
a. is metal.
b. is 8 cm long.
c. has a rolling time of 5 seconds.
d. more than one of the above is correct.
e. none of the above.

44. For this experiment a metal cylinder is one which:
a. is solid.
b. is 8 cm long.
c. has a rolling time of 5 seconds.
d. more than one of the above is correct.
e. none of the above is correct.

45. To tell someone how to answer the above questions it would be best to tell them that by "rolling time" I mean the time:
a. for the cylinder to roll the length of the table.
b. during which gravity is acting on the cylinder.
c. as indicated by my stop watch.
d. more than one of the above is correct.
e. none of the above.

Questions 46 to 52 are about the following experiment: The TV ads claim a certain false teeth cleaner is colored green and the green color disappears when your teeth are clean. To check the time for this reaction the following experiment is performed: the time for the green to disappear from a glass of water is measured at several different temperatures. A graph of the data is shown below.

[Graph: temperature (degrees C) versus cleaning time (minutes)]

46. From this information we can say that reaction time:
a. increases with increases in temperature.
b. decreases with increases in temperature.
c. decreases with decreases in temperature.
d. is not affected by changes in temperature.
e. none of the above.

47. From the graph, how long should it take to clean your false teeth at a temperature of 20 degrees C?
a. less than 5 minutes.
b. between 5 and 10 minutes.
c. between 10 and 15 minutes.
d. between 15 and 20 minutes.
e. more than 20 minutes.

48. Where on the graph is the cleaning time most affected by changes in temperature?
a. less than 5 minutes.
b. between 5 and 10 minutes.
c. between 10 and 15 minutes.
d. between 15 and 20 minutes.
e. more than 20 minutes.

49. If you needed to clean your false teeth in less than 5 minutes you could use a temperature of:
a. zero degrees.
b. 25 degrees.
c. 50 degrees.
d. 75 degrees.
e. more than one of the above.

50. As the temperature increases by 25 degrees, the cleaning time:
a. increases by between 5 and 10 minutes.
b. decreases by between 5 and 10 minutes.
c. increases by between 10 and 15 minutes.
d. decreases by between 10 and 15 minutes.
e. none of the above.
51. To tell a friend how to measure the cleaning time it would be best to say the cleaning time is the time for:
a. the green color to disappear.
b. the chemical reaction to be completed.
c. all the bacteria on the teeth to be killed.
d. all the above are equally good answers.
e. none of the above.

52. Suppose a friend has gone to a lake to swim. He wants to know the water temperature but he has no thermometer. He borrows a tablet of the above brand of denture cleaner, goes out to the end of the dock, and drops the tablet in the water. He tries to use the above graph to tell the water temperature. His effort fails, probably because:
a. the water is too cold to swim in.
b. he did not wait long enough.
c. he used the wrong amount of water.
d. there are no false teeth in the water.
e. none of the above.

53. Jean watches a bull fight and decides that bulls charge red objects. To test this idea she should observe a bull in a ring in which:
a. there is no matador but there are several red objects.
b. there is no matador but there are objects of several different colors including some that are red.
c. there is a matador who waves a red cape.
d. there is a matador who waves capes of different colors including one that is red.
e. more than one of the above is correct.

54. When 100 ml of alcohol and 100 ml of water are mixed, somewhat less than 200 ml of solution results. A possible explanation for this observation is:
a. alcohol evaporates quickly.
b. liquids have space between their molecules.
c. some liquids cool and contract when mixed.
d. more than one of the above is correct.
e. none of the above is correct.

55. Suppose a friend dials a number, hands you the phone, and tells you to find out if the store he has called is a hardware store or a grocery store. You are allowed only one question and you cannot use the words "hardware" or "grocery." You should ask:
a. if they sell can openers.
b. what they sell the most of.
c. the name of the store.
d. the name of the manager.
e. none of the above would help.

56. A paper cup is filled with water and held over a lighted candle. Although the flame is very near the cup, the cup does not burn. The reason may be the:
a. cup may have become soaked with water.
b. cup may be made of fire proof paper.
c. water may be absorbing heat too fast.
d. all of the above.
e. none of the above.

57. Suppose a space traveler from some distant planet visits you. The people on his planet are just like us except they do not have eyes. He can talk with you, but of course he cannot see. It is your job to tell him what you mean by "sight." It would be best to begin by saying "sight" is:
a. what I do when I see.
b. how I know it is you I am talking to and not someone else.
c. how I recognize you and what you are wearing without hearing you speak or touching you.
d. the reaction of light on the nerves in the retina of my eye.
e. none of the above.

Questions 58 to 60 are about the following experiment: A scientist wanted to know if a special light bulb is as efficient as sun light. He selected two young bean plants. He placed one plant on his windowsill and the other in a closet. He put his special light bulb in the socket in the closet, turned it on, and closed the door. He returned in three days to see how his plants were doing. He found that the plants had grown exactly the same amount. Therefore he decided his special light bulb is as efficient as sun light.

58. The reason the scientist used 2 plants in the experiment is:
a. so he could compare the plants.
b. in case one plant died, the experiment would not be a failure. c. he really did not need 2 plants. d. his chances of getting a healthy plant were better by using two plants than if he had chosen only one. e. none of the above.

59. By "efficient" the scientist must mean: a. the type of chemical reaction that the light source causes. b. the amount of energy delivered to the plant by the light source. c. the ability to cause plant growth. d. all of the above. e. none of the above.

60. This would have been a better experiment if: a. more plants had been used. b. the light had been connected to an automatic switch that would turn it on only when the sun was shining. c. the scientist had given the distance from the light bulb to the plant in the closet. d. more than one of the above is correct. e. none of the above.

61. A scientist would say I am doing mechanical work when I pedal my bike but I am not doing mechanical work when I stop pedaling and coast. From this statement alone you might conclude that by "mechanical work" a scientist means that: a. motion must occur. b. force must be applied. c. either of the above is enough to mean "mechanical work" to a scientist. d. both force and motion are needed. e. none of the above.

62. A scientist would say I am doing mechanical work when I push a broom but I am not doing work when I stop and lean on the broom while I talk to one of my friends. From this statement alone you might conclude that by "mechanical work" a scientist means that: a. motion must occur. b. force must be applied. c. either of the above is enough to mean "mechanical work" to a scientist. d. both force and motion must occur. e. none of the above.

63. If questions 61 and 62 are taken together, you might conclude that by "mechanical work" a scientist means that: a. motion must occur. b. force must be applied. c. either of the above is enough to mean "mechanical work" to a scientist. d. both force and motion must occur. e. none of the above.

64. All objects can be bent by some small amount no matter how stiff they are. This idea must be accepted until: a. no one believes it any more. b. a scientist says it is no longer true. c. objects are found that bend easily. d. someone finds an object that does not bend. e. none of the above.

65. Mary has a thermometer in her room. Her thermometer is best described as: a. an indoor thermometer. b. a mercury-filled glass tube. c. a device for measuring temperature. d. a thermostat. e. none of the above.

Questions 66 to 72 are about the following experiment: A science class decided to check their reaction times thus: each student had to flip a switch as soon as he saw a light flash, heard a buzzer sound, or both. A timer recorded the time it took for each student to react. The data was recorded using the following form:

REACTION TIME DATA
SEX (B = Boy, G = Girl)   STIMULUS (L = light, S = sound, B = both)   TIME (seconds)

66. Using the information to be recorded in the above table it would be possible to find out: a. who the boy is that has the fastest reaction time. b. who the girl is that has the fastest reaction time. c. whether the student with the fastest reaction time is a boy or a girl. d. more than one of the above. e. none of the above.

67. Using the information to be recorded in the above table it would be possible to find out whether: a. staying up late the night before has any effect on reaction time.
b. the loudness of the buzzer has any effect on reaction time. c. smoking has any effect on reaction time. d. more than one of the above. e. none of the above.

68. Using the information to be recorded in the above table it would be possible to find out whether on the average: a. the light produced quicker reactions than the buzzer. b. boys required a brighter light to react than girls. c. time is important. d. more than one of the above. e. none of the above.

After the above data had been taken, the class averages were figured. The results were:

AVERAGE TIME
STIMULUS   BOYS      GIRLS
Light      .17 sec.  .15 sec.
Buzzer     .22       .19
Both       .14       .23

69. Who reacted quickest to the buzzer? a. boys by .02 sec. b. girls by .02 sec. c. boys by .03 sec. d. girls by .03 sec. e. boys by .09 sec.

70. Who reacted quickest to both the light and the buzzer together? a. girls by .03 sec. b. boys by .05 sec. c. girls by .05 sec. d. boys by .09 sec. e. girls by .09 sec.

71. Did boys react quicker to the light than the girls did to the buzzer? a. yes, by .02 sec. b. yes, by .09 sec. c. no, girls were quicker by .02 sec. d. no, girls were quicker by .05 sec. e. no, girls were quicker by .08 sec.

72. The term "reaction time" as used above means the time: a. required for nerve impulses to be transmitted. b. required for the buzzer to buzz or the light to flash. c. required for the buzzer to quit buzzing or the light to quit flashing. d. during which the student is deciding how to react. e. none of the above.

73. You are given a block of wood and a beaker of an unknown liquid. To find out whether the wood will float on the surface of the liquid you should: a. find the density of the wood. b. find the density of the liquid. c. put the block of wood in the liquid and watch it. d. put the block of wood in several different kinds of liquids and watch it. e. put several different kinds of wood in the unknown liquid and watch them.

74. Which of the following tells most clearly what to do and what to observe: a. add 5 ml of sodium hydroxide to 50 ml of grape juice. b. add sodium hydroxide to grape juice and the juice will change color. c. changing the hydroxide concentration of the proper indicator will cause a change in color. d. grape juice contains colored indicators. e. all of the above are quite clear.

75. Suppose it is your job to tell the world what a "mountain" is. Everyone will accept your definition if by using it they can always tell whether or not what they are looking at fits what you mean by a "mountain." It would be best to say, a mountain: a. is high. b. is higher than a hill. c. has an altitude of 5000 feet or more. d. requires much work to climb. e. none of the above.

76. A girl removed a lid from a jar by prying on it with the blade of a table knife. From that use of it, you might say a knife is a: a. sterling silver object with a sharp edge and a decorated handle. b. stainless steel object about 8 inches long with a thin blade. c. metal object that can be used as a lever to open jars. d. kind of inclined plane that reduces the force needed to cut. e. all of the above.

Use the following contour map to answer questions 77 to 80.

[Figure: contour map of a mountain with labeled points A, B, C, and D]

77. What is the elevation at point A? a. 9000 feet. b. 6500 feet. c. 6000 feet. d. 4000 feet. e. none of the above.

78. This mountain is steepest on its a. north side. b. south side. c. east side. d. west side. e. not enough information has been given to tell.

79. Which of the following is at the highest elevation: a. A. b. B. c. C. d. D.
e. not enough information has been given to tell.

80. The "fall line" can be said to be the direction water would flow if it were poured on the ground. From this definition it is safe to say the direction of the fall line at point D would be approximately: a. north. b. south. c. east. d. west. e. not enough information has been given to tell.

APPENDIX IV-B
FORM A SUBTEST ASSIGNMENTS

ID (Interpreting Data), 24 items: 3, 4, 5, 8, 12, 13, 14, 24, 26, 38, 39, 40, 41, 46, 47, 48, 49, 50, 69, 70, 71, 77, 78, 79
CV (Controlling Variables), 15 items: 29, 30, 31, 32, 33, 34, 35, 36, 37, 52, 58, 60, 66, 67, 68
FH (Formulating Hypothesis), 18 items: 1, 2, 7, 9, 10, 15, 16, 17, 18, 22, 23, 25, 27, 42, 53, 54, 56, 64
DO (Defining Operationally), 23 items: 6, 11, 19, 20, 21, 28, 43, 44, 45, 51, 55, 57, 59, 61, 62, 63, 65, 72, 73, 74, 75, 76, 80

TOTALS: ID 24, CV 15, FH 18, DO 23

APPENDIX IV-C
ITEM ANALYSIS FORM A

All values are in percent. For each item, the keyed answer is starred and the table gives the percent of the upper 27 percent (U27), middle 46 percent (M46), and lower 27 percent (L27) groups choosing each alternative, together with difficulty (Diff.) and discrimination (Disc.) figures computed twice: once with the groups formed on the SAPA external criterion scores (SAPA columns) and once with the groups formed on total test score (TRAD. columns).

[Table: per-alternative response percentages, difficulty, and discrimination for form A items 1 through 80 under the SAPA-criterion and traditional analyses]
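The recipe behind the Diff. and Disc. columns is uniform across both analyses: rank the students on a criterion, take the top and bottom 27 percent, and compare the proportions answering each item correctly. The sketch below is a minimal modern illustration of that recipe, assuming a 0/1 item-score matrix; the function and variable names are mine, not the study's.

    # Minimal sketch of the item statistics tabulated above, assuming a
    # 0/1 (wrong/right) item-score matrix. Names are illustrative only.
    def item_statistics(scores, criterion):
        """scores: one list of 0/1 item scores per student.
        criterion: one value per student (total test score for the TRAD.
        columns, external criterion score for the SAPA columns)."""
        n = len(scores)
        k = max(1, round(0.27 * n))          # size of the 27 percent groups
        order = sorted(range(n), key=lambda i: criterion[i])
        lower, upper = order[:k], order[-k:]
        stats = []
        for j in range(len(scores[0])):
            diff = 100 * sum(s[j] for s in scores) / n   # difficulty, in percent
            p_upper = sum(scores[i][j] for i in upper) / k
            p_lower = sum(scores[i][j] for i in lower) / k
            disc = 100 * (p_upper - p_lower)             # discrimination index
            stats.append((j + 1, round(diff), round(disc)))
        return stats

Passing the external criterion scores instead of total test scores is the only change needed to move from the traditional columns to the criterion-referenced columns, here and in the form C analysis of Appendix IV-E.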
APPENDIX IV-D
TSPT FORM C

THE SCIENCE PROCESSES TEST
FORM C, PART I
MICHIGAN STATE UNIVERSITY

Use the answer sheet provided. Please do not make any marks on this test booklet.

1. A stick 100 centimeters long is slowly pushed over the edge of a table as shown. About how far over the edge do you think the end will reach before the stick will fall? A. 25 cm. B. 50 cm. C. 75 cm. D. 100 cm.

[Figure 1: a stick extending over the edge of a table]

2. The picture that would show the answer to the above question best would be a picture showing A. The stick balanced on my finger. B. How thick the stick is. C. The stick after it has fallen off the table. D. Any of the above pictures would give the answer.

Questions 3, 4 and 5 use the following set up: Several identical sticks are stacked on top of each other and extended over the edge of a table in such a way as to give the greatest possible overhang (see below).

[Figures 2 and 3: stacked sticks overhanging the table edge]

The relation between greatest overhang and number of sticks is graphed below:

[Figure 4: graph of overhang (cm) against number of sticks]

3. The greatest overhang you could get using 5 sticks would be about: A. 1 cm. B. 49.9 cm. C. 99 cm. D. 112 cm.

4. The smallest number of sticks you would need to get an overhang of 100 cm is: A. 1 stick. B. 2 sticks. C. 3 sticks. D. 4 sticks.

5. Using 10 sticks you could get a maximum overhang of: A. More than 150 cm. B. Between 130 and 150 cm. C. Between 110 and 130 cm. D. Less than 110 cm.

6. If you needed to tell someone what I mean above by "overhang" so that they could measure the overhang, you could say "overhang" means the distance from the: A. Far end of the top stick to the center of gravity of the system. B. Far end of the top stick to the edge of the table. C. Far end of the top stick to the far end of the next stick. D. Center of gravity of the top stick to the edge of the table.

Questions 7 through 10 are about frames A and B shown in Figure 5. In Figure 6 Frame B has been turned upside down.

[Figures 5 and 6: two frames, each holding a string; Frame B shown turned upside down]

7. From Figures 5 and 6 only, it is possible that: A. The string in Frame A will remain straight when Frame A is turned upside down. B. The string in Frame A will bend when Frame A is turned upside down. C. The string in Frame B is held straight by a fine thread fastened to the bottom of the frame. D. All of the above may be true.

8. Using Figures 5, 6, and 7, the best evidence you have that Frame A is different from Frame B is that when the frames were turned upside down: A. The key in Frame A fell. B. The ring in Frame B did not fall. C. Either A or B above is evidence there is a difference. D. Both A and B above are needed to have evidence for a difference.
9. Using Figures 5, 6, 7 and 8, the best conclusion is: A. My name for the special string used in Frame B above is "Wyrstring." B. The string in Frame A is not as stiff as the string in Frame B. C. The string in Frame A is made of a stiff wire that is now bent. D. The string in Frame B is made of a stiff wire that is now bent. E. The string in Frame B is held up by a strong magnet hidden behind the frame.

[Figures 7 and 8: the frames after being turned upside down]

10. Suppose a friend phoned you to find out if a piece of string he found was Wyrstring. To tell him it would be most helpful to know: A. How big around his string is. B. How long his string is. C. How stiff his string is. D. What his string is made of.

The graph on the right was made by setting weights on the end of the stick which is clamped to the table as shown in Figure 9.

[Figure 9: a stick clamped to a table with weights on its free end; Figure 10: graph of the height of the stick end from the floor (cm) against weight (grams)]

11. The can of soup shown above bends the stick to 66 cm from the floor. The can weighs about: A. 67.5 grams. B. 110 grams. C. 255 grams. D. 315 grams.

12. A weight of 65 grams should bend the stick to about A. 57.5 cm from the floor. B. 64 cm from the floor. C. 67.5 cm from the floor. D. 135 cm from the floor.

13. The weight that would bend the stick to 72 cm from the floor would weigh about: A. 55 gm. B. 67.5 gm. C. 215 gm. D. None of the above is correct.

Questions 14 through 17 are about the following experiment.

[Figure 11: two jars, X and Y; Figure 12: the jars in a pan inside a freezer]

The two jars shown are filled full to the brim. The lids are screwed on tight. The jars are put in a pan and placed in a freezer. The temperature inside the freezer is 0 degrees F.

14. Two hours later it is found that neither of the liquids in the jars is frozen. John says the reason is that two hours is not long enough to freeze water. Tom says the reason is that the liquid in the jars is not water, but is some kind of antifreeze. A. Probably John is right and Tom is wrong. B. Probably John is wrong and Tom is right. C. Both John and Tom could be right. D. It is unreasonable that either John or Tom is right.

15. One might expect both jars to be frozen solid one day later because: A. Water freezes at temperatures below 32 degrees F. B. The liquids in the jars look like water. C. One day in a freezer should be long enough to freeze water. D. All of the above are true.

16. Next day when the freezer is opened, jar "Y" is found broken and its contents are frozen solid. The contents of jar "X" is still liquid. You now know that: A. Jar X and Jar Y do not contain the same kind of liquids. B. At least one of the jars contains water. C. The temperature of the jars is different. D. All of the above are correct.

17. Suppose someone told you that the contents of jar "Y" behaved like a "Cronon" while the contents of jar "X" behaved like a "non Cronon." To show the difference between them you can say a "Cronon" is: A. A chemical, but a non Cronon is not. B. Just another name for water. C. Easier to freeze than a non Cronon. D. A special kind of jar.

18. A jar of water was left in the above freezer over night. The next day the jar is broken and the water is frozen. You know that: A. Water freezes easier than a Cronon. B. Water behaves like a Cronon. C. Cronons are made of water. D. The water was cold before it was put in the freezer.

19. A bottle of alcohol was left in the above freezer over night. It was not frozen. You now know that: A. Alcohol is a Cronon. B. Alcohol is a non Cronon. C. Non Cronons are made of alcohol.
D. The alcohol was warm before it was put in the freezer.

20. Dick claims that any glass bottle will break when the water it contains freezes. To test his idea, he puts three bottles in the freezer. A is empty, B is half full of water and C is brim full of water. If Dick is correct, which bottles will break? A. A only. B. A and B. C. B and C. D. C only.

Questions 21 to 33 are about the cylinders shown below. The cylinders shown on the right are described as follows: Metal cylinders: A, B, C, D. Plastic cylinders: E, F, G, H. Short cylinders: A, B, E, F. Long cylinders: C, D, G, H. Solid cylinders: A, C, E, G. Hollow cylinders: B, D, F, H.

[Figure 13: the eight cylinders; Figure 14: a sloping table with books under two legs]

Some or all of the above variables will probably affect the time it takes for the cylinders to roll the length of the table shown. The slope of the table is kept constant by the books under the table legs.

21. By comparing rolling times for cylinders A and E, you could test the effect of the variable: A. Solid or hollow. B. Long or short. C. Metal or plastic. D. Amount of slope.

22. By comparing rolling times for cylinders F and H you could test the effect of the variable: A. Solid or hollow. B. Long or short. C. Metal or plastic. D. Amount of slope.

23. It was found that the rolling time was exactly the same for cylinders B and D. From this information alone, the variable that does not affect the rolling time is: A. Solid or hollow. B. Long or short. C. Metal or plastic. D. Amount of slope.

24. Which cylinders could you use to tell if a hollow cylinder rolls at a different rate than a solid cylinder? A. A and D. B. A and F. C. C and D. D. C and H.

25. Which cylinders could you use to tell if a metal cylinder rolls at a different rate than a plastic cylinder? A. A and F. B. A and H. C. C and G. D. C and H.

Rolling time for the above cylinders is given in the following table. Use it to answer questions 26 to 33.

Cylinder   Material   Length   Type     Time
A          metal      2 cm     solid     5 sec.
B          metal      2        hollow   10
C          metal      8        solid     5
D          metal      8        hollow   10
E          plastic    2        solid     5
F          plastic    2        hollow   10
G          plastic    8        solid     5
H          plastic    8        hollow   10

26. The material from which the cylinder is made affects the rolling time. A. True. B. False. C. Cannot tell from the data.

27. The variable solid or hollow affects the rolling time. A. True. B. False. C. Cannot tell from the data.

28. The variable long or short affects the rolling time. A. True. B. False. C. Cannot tell from the data.

29. The slope of the table affects the rolling time. A. True. B. False. C. Cannot tell from the data.

30. The above table shows that for this experiment the rolling time for hollow cylinders: A. Is always the same. B. Depends on how long the cylinders are. C. Is different for metal than for plastic. D. Depends on the slope of the table.

31. For this experiment a solid cylinder is one which: A. Is metal. B. Has a rolling time of 5 seconds. C. More than one of the above is correct. D. None of the above is correct.

32. For this experiment a metal cylinder is one which: A. Is solid. B. Has a rolling time of 5 seconds. C. More than one of the above is correct. D. None of the above is correct.

33. To tell someone how to answer the above questions it would be best to tell them that by "rolling time" I mean the time: A. For the cylinder to roll the length of the table. B. During which gravity is pulling on the cylinder. C. As shown by my stop watch. D. Needed for me to start and stop my stopwatch.

End Part I
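Items 21 through 25 turn on a single rule: a pair of cylinders is a fair test of one variable only if the two cylinders differ on that variable and agree on the other two. A minimal sketch of that rule, using the cylinder descriptions given above; the tuple encoding is an assumption of the sketch, not part of the test.

    # Control-of-variables logic behind items 21-25: a pair of cylinders
    # isolates a variable only if it differs on that variable alone.
    cylinders = {
        "A": ("metal", "short", "solid"),  "B": ("metal", "short", "hollow"),
        "C": ("metal", "long", "solid"),   "D": ("metal", "long", "hollow"),
        "E": ("plastic", "short", "solid"), "F": ("plastic", "short", "hollow"),
        "G": ("plastic", "long", "solid"),  "H": ("plastic", "long", "hollow"),
    }
    variables = ("material", "length", "type")

    def fair_tests(var):
        """Return the cylinder pairs that isolate the given variable."""
        i = variables.index(var)
        names = sorted(cylinders)
        return [(a, b) for a in names for b in names if a < b
                and [j for j in range(3)
                     if cylinders[a][j] != cylinders[b][j]] == [i]]

    print(fair_tests("material"))  # [('A','E'), ('B','F'), ('C','G'), ('D','H')]

The printed pairs include A and E, which is the keyed comparison for the metal-or-plastic variable in item 21.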
THE SCIENCE PROCESSES TEST
FORM C, PART II
MICHIGAN STATE UNIVERSITY

Use the answer sheet provided. Please do not make any marks on this test booklet.

Questions 34 to 39 are about the following experiment: The TV ads say that their falseteeth cleaner colors the water green. When the color goes away, your teeth are clean. To find the time needed for the color to go away I got several glasses of water, each at a different temperature. I put a tablet of falseteeth cleaner in each glass and measured the time needed for the green color to go away. A graph of the data is shown below.

[Figure 15: graph of temperature (degrees F) against the time, in minutes, for the green color to go away]

34. From this graph we can say that for a higher temperature, the time needed for the green color to go away is: A. Greater. B. Less. C. Not affected by changes in temperature. D. Not enough information is given to tell.

35. From the graph, how long should it take to clean your falseteeth at a temperature of 70 degrees F? A. 5 minutes or less. B. Close to 10 minutes. C. Close to 15 minutes. D. 20 minutes or more.

36. Where on the graph is the cleaning time most affected by changes in temperature? A. High temperatures. B. Low temperatures. C. Cleaning time is always affected the same amount by a change of temperature. D. Cleaning time will always be the same.

37. If you needed to clean your falseteeth in 5 minutes, you should use a temperature of: A. Less than 60 degrees. B. Between 60 and 75 degrees. C. Between 75 and 100 degrees. D. Greater than 100 degrees.

38. To tell a friend how to measure the cleaning time it would be best to say the cleaning time is the time for: A. The green color to go away. B. The chemical reaction to be completed. C. All the bacteria on the teeth to be killed. D. All the above are equally good answers.

39. Suppose a friend has gone to a lake to swim. He wants to know the water temperature but he has no thermometer. He borrows a tablet of the above falseteeth cleaner, and drops it into the lake. He tries to use the above graph to tell the water temperature. His effort fails, probably because: A. The water is too cold to swim in. B. He did not wait long enough. C. He used the wrong amount of water. D. There are no falseteeth in the water.

40. Jean watches a bull fight and decides that bulls charge red objects. To test this idea she should observe a bull in a ring in which: A. There is no matador but there are several red objects. B. There is no matador but there are objects of several different colors including some that are red. C. There is a matador who waves a red cape. D. There is a matador who waves capes of different colors including one that is red.

41. Suppose a friend dials a number, hands you the phone, and tells you to find out if the store he has called is a hardware store or a grocery store. You can ask only one of the following questions. You should ask: A. If they sell can openers. B. If they sell light bulbs. C. What they sell the most of. D. How old their store is.
42. Suppose a space traveler from some distant planet visits you. The people on his planet are just like us except they do not have eyes. He can talk with you, but of course he cannot see. It is your job to tell him what you mean by "sight." It would be best to begin by saying "sight" is: A. What I do when I see. B. How I know it is you I am talking to and not someone else. C. How I know you and what you are wearing without hearing you speak or touching you. D. The reaction of light on the nerves in the retina of my eye.

Questions 43 to 45 are about the following experiment: A scientist wanted to know if a special light bulb is as "efficient" as sunlight. He selected two young bean plants. He placed one plant on his windowsill and the other in a closet. He put his special light bulb in the socket in the closet, turned it on, and closed the door. He returned in three days to see how his plants were doing. He found that the plants had grown exactly the same amount. Therefore he decided his special light bulb is as "efficient" as sunlight.

43. The reason the scientist used 2 plants in the experiment is: A. So he could compare the plants. B. In case one plant died, the experiment would not be a failure. C. He really did not need 2 plants. D. His chances of getting a healthy plant were better by using two plants than if he had chosen only one.

44. By "efficient" the scientist must mean: A. The type of chemical reaction that the light source causes. B. The amount of energy delivered to the plant by the light source. C. The ability to cause plant growth. D. All of the above.

45. This would have been a better experiment if: A. More plants had been used. B. The light had been connected to an automatic switch that would turn it on only when the sun was shining. C. The scientist had given the distance from the light bulb to the plant in the closet. D. More than one of the above would help.

46. A scientist would say I am doing "work" when I pedal my bike but I am not doing "work" when I stop pedaling and coast. From this statement alone you might conclude that by "work" a scientist means that: A. Motion must occur. B. Force must be applied. C. Either of the above is enough to mean "work" to a scientist. D. Both force and motion are needed.

47. A scientist would say I am doing "work" when I push a broom but I am not doing work when I stop and lean on the broom while I talk to one of my friends. From this statement alone you might conclude that by "work" a scientist means that: A. Motion must occur. B. Force must be applied. C. Either of the above is enough to mean "work" to a scientist. D. Both force and motion must occur.

48. If questions 46 and 47 are taken together, you might conclude that by "work" a scientist means that: A. Motion must occur. B. Force must be applied. C. Either of the above is enough to mean "work" to a scientist. D. Both force and motion must occur.

49. All objects can be bent by some small amount no matter how stiff they are. This idea must be accepted until: A. No one believes it any more. B. A scientist says it is no longer true. C. Objects are found that bend easily. D. Someone finds an object that does not bend.

50. Mary has a thermometer in her room. Her thermometer is best described as: A. An indoor thermometer. B. A glass tube containing a colored liquid. C. A device for measuring temperature. D. A thermostat.

Questions 51 to 55 are about the following experiment: A science class decided to check their reaction times. Each student was asked to flip a switch as soon as he saw a light flash, heard a buzzer sound, or both. A timer recorded the time it took for each student to react. The data was recorded using this data table:

REACTION TIME DATA
SEX (B = Boy, G = Girl)   STIMULUS (L = light, S = sound, B = both)   TIME (seconds)
51. Using the information to be recorded in the above table it would be possible to find out: A. Who the boy is that has the fastest reaction time. B. Who the girl is that has the fastest reaction time. C. Whether the student with the fastest reaction time is a boy or a girl. D. More than one of the above.

52. Using the information to be recorded in the data table it would be possible to find out whether on the average: A. The light produced quicker reactions than the buzzer. B. Boys required a brighter light to react than girls. C. Time is important. D. More than one of the above.

After the above data had been taken, the class averages were figured. The results were:

AVERAGE TIME
STIMULUS   Boys      Girls
Light      .17 sec.  .15 sec.
Buzzer     .22       .19
Both       .14       .23

53. Who reacted more quickly to the buzzer? A. Boys by .02 sec. B. Girls by .02 sec. C. Boys by .03 sec. D. Girls by .03 sec. E. Boys by .09 sec.

54. Who reacted more quickly to both the light and the buzzer together? A. Girls by .03 sec. B. Boys by .05 sec. C. Girls by .05 sec. D. Boys by .09 sec. E. Girls by .09 sec.

55. Did boys react more quickly to the light than the girls did to the buzzer? A. Yes, by .02 sec. B. Yes, by .09 sec. C. No, girls were quicker by .02 sec. D. No, girls were quicker by .05 sec. E. No, girls were quicker by .08 sec.

56. You are given a block of wood and a glass full of an unknown liquid. To find out whether the wood will float on the surface of the liquid you should: A. Find the density of the wood. B. Find the density of the liquid. C. Put the block of wood in the liquid and watch it. D. Put the block of wood in several different kinds of liquids and watch it. E. Put several different kinds of wood in the unknown liquid and watch them.

57. A girl removed a lid from a jar by prying on it with the blade of a table knife. From that use of it, you might say a knife is a: A. Sterling silver object with a sharp edge and a decorated handle. B. Stainless steel object about 8 inches long with a thin blade. C. Metal object that can be used as a lever to open jars. D. Kind of inclined plane that reduces the force needed to cut.

58. Which of the following instructions for doing an experiment tells most clearly what to do and what to observe: A. Add 5 ml of sodium hydroxide to 50 ml of grape juice. B. Add sodium hydroxide to grape juice and the juice will change color. C. The hydroxide concentration of some substances is indicated by their color. D. Grape juice contains colored indicators.

Use the following contour map to answer questions 59 to 61.

[Figure 16: contour map of a mountain with labeled points A, B, C, and D]

59. What is the elevation at point A: A. 7000 feet. B. 6000 feet. C. 5000 feet. D. 4000 feet.

60. This mountain is steepest on its: A. North side. B. South side. C. East side. D. West side.

61. Which of the following is at the highest elevation: A. A. B. B. C. C. D. D.

APPENDIX IV-E
ITEM ANALYSIS FORM C

All values are in percent.
Normal: Traditional item analysis procedure is used.
ICM: External criterion referenced item analysis as described in Chapter I.
Item numbers in parentheses refer to form A items.

For each item, the keyed answer is starred and the table gives the percent of the upper 27 percent (U27), middle 46 percent (M46), and lower 27 percent (L27) groups choosing each alternative, with difficulty and discrimination figures under both the NORMAL and the ICM analyses, and the item-criterion correlation (Corr.).

[Table: per-alternative response percentages, difficulty, discrimination, and item-criterion correlations for form C items 1 through 61, each with its corresponding form A item number]
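The Corr. column reports, for each item, the correlation between the 0/1 item score and the external criterion total; computed as a Pearson coefficient over a dichotomous variable this is the familiar point-biserial. A minimal sketch under that assumption, with illustrative names not drawn from the study:

    # Sketch of the Corr. column: Pearson correlation of a 0/1 item-score
    # list with criterion totals (equivalent to the point-biserial here).
    from statistics import mean, pstdev

    def item_criterion_correlation(item_scores, criterion):
        """item_scores: 0/1 per student; criterion: one total per student."""
        mx, my = mean(item_scores), mean(criterion)
        sx, sy = pstdev(item_scores), pstdev(criterion)
        cov = mean((x - mx) * (y - my)
                   for x, y in zip(item_scores, criterion))
        return cov / (sx * sy) if sx and sy else 0.0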
APPENDIX IV-F
VALIDATION SAMPLE SCORES

           TSPT SCORES       INDIVIDUAL COMPETENCY MEASURES        SRA
Student   Form C   Form D    ID   CV   FH   DO   Total    Science   Reading
 1         42       30       22   41   23   13    95       38        59
 2         26       18       21   40   23   13    69       32        47
 3         38       27       20   40   22   13    89       36        56
 4         21       12       20   38   22   13    55        8        19
 5         26       20       20   38   22   12    57       34        46
 6         17        8       20   38   21   11    44       19        29
 7         23        8       19   37   21   11    50       26        40
 8         33       20       19   37   21   11    77       32        56
 9         17       10       19   36   21   11    33       10        14
10         37       27       19   36   21   10    94       34        56
11         33       22       18   36   21   10    86       28        47
12         22       14       18   35   20   10    43       28        32
13         30       22       18   35   20   10    75       35        54
14         17        6       18   34   20    9    55        7        18
15         20       14       18   34   19    9    61       21        44
16         20       17       18   33   19    9    61       26        32
17         47       28       17   33   19    9    94       33        58
18         32       20       17   33   19    9    79       34        54
19         16        8       17   33   19    8    42       11        21
20         50       34       17   33   19    8    86       37        58
21         29       18       17   32   18    8    62       30        36
22         19       12       17   32   18    8    49       23        29
23         26       16       17   31   18    8    64       31        42
24         29       19       16   30   18    8    68       34        49
25         31       22       16   30   18    7    75       28        52
26         23        8       16   30   18    7    42       26        35
27         33       23       16   30   17    7    75       29        43
28         25       16       15   29   17    7    83       30        52
29         39       23       15   28   17    7    84       35        57
30         13        8       15   28   17    7    71       28        36
31         32       22       15   27   17    7    75       --        47
32         38       29       14   25   17    7    93       --        50
33         25       15       14   25   16    6    50       22        35
34         15        5       14   25   16    6    66       18        20
35         27       15       14   25   15    6    68       30        54
36         23       16       13   24   15    6    59       26        51
37         28       17       13   23   15    6    57       26        50
38         11        5       13   22   14    6    47       16        29
39         46       31       12   22   13    6    88       33        56
40         22       12       12   21   13    6    53       16        21
41         12        7       11   21   13    5    66       22        36
42         31       18       10   21   13    5    75       25        45
43         23       17       10   20   12    5    79       21        35
44         16        9       10   19   12    5    47       23        35
45         18        4       10   18   12    4    33       14        22
46         12        4       10   18   12    4    57       16        23
47         38       24        9   18   12    3    86       39        54
48         29       17        9   18   12    3    58       18        25
49         39       28        9   17   11    2    83       37        53
50         36       26        8   16   10    2    80       32        44
51         24       10        7   16    9    2    67       31        42
52         28       21        6    9    8    2    78       29        52

APPENDIX IV-G
TSPT FORM C ITEMS - INDIVIDUAL COMPETENCY MEASURES SUBTEST CORRELATIONS

[Table: for each of the 61 form C items, its correlation with each of the four Individual Competency Measures subtests (ID, CV, FH, DO), with the correlation for the intended subtest underlined and with the pairs of correlations that differ significantly listed in the last column (alpha = 0.1, or 0.01 where starred)]

ID = Interpreting Data subtest of the Individual Competency Measures
CV = Controlling Variables subtest of the Individual Competency Measures
FH = Formulating Hypotheses subtest of the Individual Competency Measures
DO = Defining Operationally subtest of the Individual Competency Measures

All decimal points are suppressed. The underlined correlation indicates the subtest which the item is intended to assess. Correlations which are significantly different from each other are indicated in the last column.
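The significance tests summarized in the last column compare two correlations that share the item score as a common variable. The appendix does not print the formula used, so the sketch below shows one standard choice for this situation, Williams' t for dependent correlations; treating it as the study's exact computation would be an assumption.

    # Williams' t for comparing two dependent correlations that share one
    # variable (here, the item score). Values in the example are
    # illustrative, not taken from the table above.
    import math

    def williams_t(r12, r13, r23, n):
        """Test r12 (item vs subtest 1) against r13 (item vs subtest 2),
        where r23 is the correlation between the two subtests and n is
        the sample size. Returns (t, df) with df = n - 3."""
        det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
        rbar = (r12 + r13) / 2
        t = (r12 - r13) * math.sqrt(
            (n - 1) * (1 + r23)
            / (2 * det * (n - 1) / (n - 3) + rbar**2 * (1 - r23) ** 3))
        return t, n - 3

    # A validation sample of 52 students gives df = 49:
    print(williams_t(0.45, 0.15, 0.60, 52))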
The level of confidence is 0.1 except for the differences followed by an asterisk, which have a confidence level of 0.01. This table may be interpreted as follows:

Item 1: The highest correlation is with ID, but since the correlations with CV and FH are not significantly lower, it cannot be said that this item assesses any one of them uniquely. Since the difference between either the CV or FH correlation and the DO correlation is not significant either, it cannot be said that if the item measures CV or FH it does not measure DO. All that can be said with 90 percent confidence is that if it measures ID, it does not measure DO.

Item 22: The highest correlation is with ID, but the difference between the ID and the FH correlations is not significant, so it is tempting to say this item measures either ID or FH. However, the difference between the FH and CV correlations is not significant, so if the item measures FH, it could be measuring CV too. But the difference between CV and DO is not significant either, so if CV is allowed, DO cannot be excluded. Thus although this item is not as ambiguous as number one above, it is only safe to say that it probably does not measure DO.

Item 40: The highest correlation is with ID and the difference between the ID correlation and any of the other correlations is significant at the 0.1 level. Thus it can be said with 90 percent confidence that students' scores on this item indicate their ability to use the process of Interpreting Data more than any of the other Integrated Processes. Therefore, it is safe to label this item as an "ID item." How this interpretation can be harmonized with a logical analysis of the item is not of concern here.

APPENDIX IV-H
TSPT FORM D

THE SCIENCE PROCESSES TEST
FORM D (1974)
by ROBERT R. LUDEMAN, DARRELL W. FYFFE, RICHARD W. ROBISON, RICHARD J. MCLEOD, GLENN D. BERKHEIMER
COPYRIGHT BY ROBERT R. LUDEMAN 1974

USE THE ANSWER SHEET PROVIDED. PLEASE DO NOT MAKE ANY MARKS ON THIS BOOKLET.

The graph on the right was made by setting weights on the end of the stick which is clamped to the table as shown in Figure 1.

[Figure 1: a stick clamped to a table with weights on its free end; Figure 2: graph of the height of the stick end from the floor (cm) against weight (grams)]

1. The can of soup shown above bends the stick to 66 cm from the floor. The can weighs about: A. 67.5 grams. B. 110 grams. C. 255 grams. D. 315 grams.

Questions 2, 3 and 4 use this set up: Several identical sticks are stacked on top of each other and extended over the edge of a table in such a way as to give the greatest possible overhang (see below).

[Figures 3 and 4: stacked sticks overhanging the table edge]

The relation between greatest overhang and number of sticks is graphed below:

[Figure 5: graph of overhang (cm), 50 to 150, against number of sticks, 2 to 12]

2. The greatest overhang you could get using 5 sticks would be about: A. 1 cm. B. 49.9 cm. C. 99 cm. D. 112 cm.

3. The smallest number of sticks you would need to get an overhang of 100 cm is: A. 1 stick. B. 2 sticks. C. 3 sticks. D. 4 sticks.

4. Using 10 sticks you could get a maximum overhang of: A. More than 150 cm. B. Between 130 and 150 cm. C. Between 110 and 130 cm. D. Less than 110 cm.

5. If you needed to tell someone what I mean above by "overhang" so that they could measure the overhang, you could say "overhang" means the distance from the: A. Far end of the top stick to the center of gravity of the system. B. Far end of the top stick to the edge of the table. C. Far end of the top stick to the far end of the next stick. D. Center of gravity of the top stick to the edge of the table.

Questions 6, 7 and 8 are about this experiment:
[Figure 6: two jars, X and Y; Figure 7: the jars in a pan inside a freezer]

The two jars shown are filled full to the brim. The lids are screwed on tight. The jars are put in a pan and placed in a freezer. The temperature inside the freezer is 0 degrees F.

6. Two hours later it is found that neither of the liquids in the jars is frozen. John says the reason is that two hours is not long enough to freeze water. Tom says the reason is that the liquid in the jars is not water, but is some kind of antifreeze. A. Probably John is right and Tom is wrong. B. Probably John is wrong and Tom is right. C. Both John and Tom could be right. D. It is unreasonable that either John or Tom is right.

7. Next day when the freezer is opened, jar "Y" is found broken and its contents are frozen solid. The contents of jar "X" is still liquid. You now know that: A. Jar "X" and "Y" do not contain the same kind of liquid. B. At least one of the jars contains water. C. The temperature of the jars is different. D. All of the above are correct.

8. Suppose someone told you that the contents of jar "Y" behaved like a "Cronon" while the contents of jar "X" behaved like a "non Cronon." To show the difference between them you can say a "Cronon" is: A. A chemical, but a non Cronon is not. B. Definitely not water. C. Easier to freeze than a non Cronon. D. A special kind of jar.

9. A jar of water was left in the above freezer over night. The next day the jar is broken and the water is frozen. You know that: A. Water freezes easier than a Cronon. B. Water behaves like a Cronon. C. Cronons are made of water. D. The water was cold before it was put in the freezer.

10. A bottle of alcohol was left in the above freezer over night. It was not frozen. You now know that: A. Alcohol is a Cronon. B. Alcohol is a non Cronon. C. Non Cronons are made of alcohol. D. The alcohol was warm before it was put in the freezer.

Questions 11 through 19 are about the following experiment: Cylinders of various shapes are rolled down the sloping table shown below.

[Figure 8: a sloping table with books under two legs]

The slope is kept constant by the books under the table legs. The "rolling time" is measured as the time it takes for the cylinders to roll the length of the table. Whether the cylinders are metal or plastic, short or long, solid or hollow is shown on the chart on the next page. The cylinders shown on the right are described as follows: Metal cylinders: A, B, C, D. Plastic cylinders: E, F, G, H. Short cylinders: A, B, E, F. Long cylinders: C, D, G, H. Solid cylinders: A, C, E, G. Hollow cylinders: B, D, F, H.

[Figure 9: the eight cylinders]

11. By comparing rolling times for cylinders A and E, you could test the effect of the variable: A. Metal or plastic. B. Short or long. C. Solid or hollow. D. Amount of slope.

12. By comparing rolling times for cylinders F and H you could test the effect of the variable: A. Metal or plastic. B. Short or long. C. Solid or hollow. D. Amount of slope.

13. It was found that the rolling time was exactly the same for cylinders B and D. From this information alone, the variable that does not affect the rolling time is: A. Metal or plastic. B. Short or long. C. Solid or hollow. D. Amount of slope.

14. Which cylinders could you use to tell if a metal cylinder rolls at a different rate than a plastic cylinder? A. A and F. B. A and H. C. C and G. D. C and H.
B metal 2 hollow 10 C metal 8 solid 5 D metal 8 hollow 10 E plastic 2 solid 5 F plastic 2 hollow 10 G plastic 8 solid 5 H plastic 8 hollow 10 The variable solid or hollow affects the rolling time. A. True. B. False. C. Cannot tell from the data The above table shows that for this experiment the rolling time for hollow cylinders: A. Is always the same. B. Depends on how long the cylinders are. C. Is different for metal than for plastic. D. Depends on the slope of the table. For this experiment a solid cylinder is one which: A. Is metal. B. Has a rolling time of 5 seconds. C. More than one of the above is correct. D. None of the above is correct. For this experiment a metal cylinder is one which: A. Is solid. B. Has a rolling time of 5 seconds. C. More than one of the above is correct. D. None of the above is correct. ‘ To tell someone how to answer the above questions it would be best to that by "rolling time" I mean the time: coon» For the cylinder to roll the length of the table. During which gravity is pulling on the cylinder. As shown by my stop watch. Needed for me to start and stop my Stop watch. tell them Questions 20 through 24 are about the following experiment: The TV ads say that their falseteeth cleaner colors the water green. When the color goes away, your teeth are clean. To find the time needed for the color to go away I got several glasses of water. Each at a different temperature. I put a tablet of falseteeth cleaner in each glass and measured the time needed for the green color to go away. A graph of the data is shown below. TEMPERATURE (Deg. E) 20 21. 22. 50 Too 50 lo 0 TIME (Minute) Figure 10 From this graph we can say that for a higher temperature, the time needed for the green color to go away is: A. Greater. B. Less. C. Not affected by changes in temperature. D. Not enough information is given to tell. From the graph, how long should it take to clean your falseteeth at a temperature of 70 degrees F. A. 5 minutes or less. B. Close to 10 minutes. C. Close to 15 minutes. D. 20 minutes or more. If you needed to clean your falseteeth in 5 minutes, you should use a temperature of: A. Less than 60 degrees. B. Between 60 and 75 degrees. C. Between 75 and 100 degrees. D. Greater than 100 degrees. 23. 24. 25. 26. 27. 28. To tell a friend how to measure the cleaning time it would be best to say the cleaning time is the time for: A. The green color to go away. B. The chemical reaction to be completed. C. All the bacteria on the teeth to be killed. D. All the above are equally good answers. Suppose a friend has gone to a lake to swim. He wants to know the water temperature but he has no thermometer. He borrows a tablet of the above falseteeth cleaner, and drops it into the lake. He tries to use the above graph to tell the water temperature. His effort fails, probably because: . The water is too cold to swim in. . He did not wait long enough. . He used the wrong amount of water. . There are no falseteeth in the water. uncut» Suppose a friend dials a number, hands you the phone, and tells you to find out if the store he has called is a hardware store or a grocery store. You can ask only one of the following questions. You should ask: . If they sell can Openers. . If they sell light bulbs. . What they sell the most of. . How old their store is. wow» A scientist wanted to know if a special light bulb is as "efficient" as sunlight. He selected two young bean plants. He placed one plant in his window and the other in a closet. 
He put his special light bulb in the socket in the closet, turned it on, and closed the door. He returned in three days to see how his plants were doing. He found that both plants had grown exactly the same amount. Therefore he decided his special light bulb is as "efficient" as sunlight.

26. The reason the scientist used 2 plants in the experiment is:
A. So he could compare the plants.
B. In case one plant died, the experiment would not be a failure.
C. He really did not need 2 plants.
D. His chances of getting a healthy plant were better by using two plants than if he had chosen only one.

27. All objects can be bent by some small amount no matter how stiff they are. This idea must be accepted until:
A. No one believes it any more.
B. A scientist says it is no longer true.
C. Objects are found that bend easily.
D. Someone finds an object that does not bend.

28. Mary has a thermometer in her room. Her thermometer is best described as:
A. An indoor thermometer.
B. A glass tube containing a colored liquid.
C. A device for measuring temperature.
D. A thermostat.

Questions 29 and 30 are about this experiment: A science class decided to check their reaction times. Each student was asked to flip a switch as soon as he saw a light flash, heard a buzzer sound, or both. A timer recorded the time it took for each student to react. The class averages were figured. The results were:

               Average time
Stimulus      Boys        Girls
Light         .17 sec.    .15 sec.
Buzzer        .22         .19
Both          .14         .23

29. Who reacted more quickly to the buzzer?
A. Boys by .02 sec.
B. Girls by .02 sec.
C. Boys by .03 sec.
D. Girls by .03 sec.

30. Who reacted more quickly to both the light and the buzzer together?
A. Girls by .03 sec.
B. Boys by .05 sec.
C. Girls by .05 sec.
D. Boys by .09 sec.

31. You are given a block of wood and a glass full of an unknown liquid. To find out whether the wood will float on the surface of the liquid you should:
A. Find the density of the wood.
B. Find the density of the liquid.
C. Put the block of wood in the liquid and watch it.
D. Put the block of wood in several different kinds of liquids and watch it.

32. A girl removed a lid from a jar by prying on it with the blade of a table knife. From that use of it, you might say a knife is a:
A. Sterling silver object with a sharp edge and a decorated handle.
B. Stainless steel object about 8 inches long with a thin blade.
C. Metal object that can be used as a lever to open jars.
D. Kind of inclined plane that reduces the force needed to cut.

33. Which of the following instructions for doing an experiment tells most clearly what to do and what to observe:
A. Add 5 ml of sodium hydroxide to 50 ml of grape juice.
B. Add sodium hydroxide to grape juice and the juice will change color.
C. The hydroxide concentration of some substances is indicated by their color.
D. Grape juice contains colored indicators.

Use the following contour map to answer questions 34 through 36.

Figure 11. [Contour map of a mountain, with a labeled point and elevation lines.]

34. What is the elevation at point A?
A. 7000 feet.
B. 6000 feet.
C. 5000 feet.
D. 4000 feet.

35. This mountain is steepest on its:
A. North side.
B. South side.
C. East side.
D. West side.

36. Which of the following is at the highest elevation:
[The four answer choices for this item are not legible in the source.]
[TSPT machine-scored answer sheet. Above the instructions is a name grid: columns of lettered boxes, A through Z, in which the student blacks out the letters of his name.]

INSTRUCTIONS:
1. Print your name in the boxes above. If you have a long name, there may not be enough space for all of your name. That will not matter.
2. For each question in the test booklet, black in all of the box below the letter on this answer sheet which is the best answer.
3. Notice that the question numbers go across the answer sheet. There is one line on this answer sheet for each page in the test booklet.

Each question number on the sheet has a row of four boxes lettered A B C D below it. The question numbers run across the sheet, one line per page of the test booklet:

Page 1:   1
Page 2:   2  3  4  5
Page 3:   6  7  8
Page 4:   9  10
Page 5:   11  12  13  14
Page 6:   15  16  17  18  19
Page 7:   20  21  22
Page 8:   23  24  25  26  27  28
Page 9:   29  30  31  32  33
Page 10:  34  35  36
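A sheet of this kind is scored by matching the blackened boxes against the key: the raw score is simply the count of items on which the marked alternative is the keyed alternative. The following minimal Python sketch illustrates the idea only; the five-item key and responses shown are hypothetical, not the actual TSPT form D key.

    # Illustrative scoring sketch; key and marks below are hypothetical.
    def raw_score(responses, key):
        """Count the items whose marked alternative matches the key."""
        return sum(r == k for r, k in zip(responses, key))

    key = list("ADBCA")        # hypothetical 5-item key
    student = list("ADCCA")    # one student's marks; item 3 is wrong
    print(raw_score(student, key))   # -> 4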
APPENDIX IV-I

NORMING AREA MAP

[Road map of the norming area: the lower peninsula of Michigan and adjacent parts of Wisconsin, Illinois, and Indiana, with a scale of miles.]

APPENDIX IV-J

ITEM ANALYSIS FORM D

Item number in parentheses refers to form C item. An asterisk marks the keyed alternative. U27, M46, and L27 give the percent of the upper 27, middle 46, and lower 27 percent criterion groups choosing each alternative. Cells left blank are illegible in the source.

Item   Key  Alt.   U27   M46   L27   Diff.   Disc.

 1          A       15    32    50    60      47
(11)    *   B       69    34    22
            C        7    11    15
            D        9    23    13

 2          A        1     7    12    39      58
(3)         B        4    14    28
            C        6    18    29
        *   D       89    62    31

 3          A        2    11    15    37      50
(4)         B        2    10    18
            C        7    18    28
        *   D       89    52    39

 4          A       39    37    34    57      30
(5)     *   B       57    44    27
            C        3    12    20
            D        1     8    19

 5          A       19    22    20    63      23
(6)     *   B       51    34    28
            C       20    25    29
            D       10    20    23

 6          A        9    14    18    57      24
(14)        B        9    14    18
        *   C       55    43    31
            D       16    22    22

 7      *   A       67    66    50    38      17
(16)        B        7    13    19
            C        3    10    17
            D       24    11    14

 8          A       18    22    27    54      44
(17)        B       10    26    27
        *   C       71    42    27
            D        1    10    19

 9          A       32    39    45    68      38
(18)    *   B       54    28    16
            C       12     5    17
            D        2    28    22

10          A       16    25    33    50      49
(19)    *   B       77    47    28
            C        6    13    30
            D        1    16    19

11      *   A       89    52    23    46      66
(21)        B        1    17    25
            C        5    18    30
            D        6    14    21

12          A             10    18    40      54
(22)    *   B       89    58    35
            C        5    16    27
            D        5    16    19

13          A        5    23    20    48      56
(23)    *   B       86    45    30
            C        8    18    28
            D        7    15    22

14          A       14    21    28    59      50
(25)        B        7    17    26
        *   C       69    37    19
            D       10    26    37

15      *   A       92    66    37
(27)        B        6    19    31
            C        2    15    27
            D        0     1     5
16      *   A       90    50    17    48      73
(30)        B        3     5    29
            C        7    18    29
            D        1    27    26

17          A        4    18    32    51      49
(31)    *   B       73    50    24
            C        8    13    24
            D       15    19    20

18          A        7    42    33    65      56
(32)        B        8    18    27
            C       13    18    23
        *   D       73    23    17

19      *   A       91    57    30    41      61
(33)        B        6    18    29
            C        1    12    22
            D        2    13    19

20          A       11    21    30    49      57
(34)    *   B       86    43    29
            C        0    11    21
            D        3    26    20

21          A        0    11    25    44      64
(35)        B        8    18    27
        *   C       87    57    23
            D        5    15    25

22          A        1    14    23    43      73
(37)        B        1    16    31
            C        4    14    24
        *   D       94    56    21

23      *   A       39    38    19    67      20
(38)        B        6    15    23
            C        5     4    33
            D       50    44    24

24          A       10    23    25    57      47
(39)        B        6    15    24
        *   C       69    40    22
            D       15    22    29

25          A        4     5    18    34      51
(41)        B        6    16    26
        *   C       89    69    38
            D        1    10    19

26      *   A       93    62    23    40      70
(43)        B        2    12    21
            C        2    16    29
            D        3    11    27

27          A        1    17    18    42      58
(49)        B        6    18    29
            C        3    12    21
        *   D       90    54    32

28          A       30    35    39    73      28
(50)        B        3    16    19
        *   C       44    23    16
            D       23    27    25

29          A        1    10    18    53      54
(53)        B       12    20    28
            C        8    29    30
        *   D       78    42    24

30          A        2    24    19    48      60
(54)        B        1    13    24
            C        9    19    29
        *   D       88    45    28

31          A        5    12    18    44      55
(56)        B        4    18    28
        *   C       84    55    29
            D        7    16    25

32          A        3    17    20    51      49
(57)        B       10    28    27
        *   C       76    46    27
            D       11     9    26

33          A       21    33    21    61      41
(58)    *   B       64    44    23
            C       12    21    29
            D        3     3    27

34          A        7    13    18    52      59
(59)        B        6    35    41
        *   C       83    42    24
            D        4    11    17

35          A        4    12    19    51      51
(60)        B        2    10    17
            C       15    35    36
        *   D       79    44    28

36      *   A       81    53    32    45      49
(61)        B        3    10    17
            C       14    25    36
            D        1    12    14
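The Diff. and Disc. columns can be reproduced from the three group columns. Checked against the tabled values, Disc. is the percentage of the upper 27 percent group choosing the keyed alternative minus that of the lower 27 percent group, and Diff. is 100 minus the percentage of the whole sample choosing the keyed alternative, weighting the groups by their sizes (0.27, 0.46, 0.27). The sketch below assumes that reading of the columns; the function name is an illustration, not part of the study.

    # Sketch of the Appendix IV-J item statistics, assuming:
    #   Diff. = percent answering incorrectly (100 minus weighted percent correct)
    #   Disc. = upper-group percent correct minus lower-group percent correct
    def item_stats(u27, m46, l27):
        """Difficulty and discrimination from the percentages of the upper
        27%, middle 46%, and lower 27% groups choosing the keyed answer."""
        pct_correct = 0.27 * u27 + 0.46 * m46 + 0.27 * l27
        difficulty = round(100 - pct_correct)   # percent answering incorrectly
        discrimination = round(u27 - l27)       # upper minus lower group
        return difficulty, discrimination

    print(item_stats(69, 34, 22))   # item 1  -> (60, 47), matching the table
    print(item_stats(39, 38, 19))   # item 23 -> (67, 20); 0.20 was the minimum
                                    # discrimination for retaining form C items

On that reading, the Diff. and Disc. entries missing for item 15 (key A: 92, 66, 37) would work out to 35 and 55, though the cells above are left blank because they are illegible in the source.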