THE QUALITY OF EXPERIMENTAL METHODOLOGY IN COUNSELING AND COUNSELOR EDUCATION

This is to certify that the thesis entitled "The Quality of Experimental Methodology in Counseling and Counselor Education," presented by Constance C. Ripstra, has been accepted towards fulfillment of the requirements for the Ph.D. degree.

Major professor

Date: July 19, 1974

ABSTRACT

THE QUALITY OF EXPERIMENTAL METHODOLOGY IN COUNSELING AND COUNSELOR EDUCATION

By Constance C. Ripstra

The purpose of this study was to systematically evaluate the quality of experimental research which has been published in the fields of counseling and counselor education from 1962 through 1973. Attention was directed at the methodology and reporting of studies rather than at the subject matter or variables being examined. The specific independent variable was time, in order to determine whether there has been an improvement since 1962 in the quality of published research. Four three-year spans were chosen as levels of the independent variable: 1962-1964, 1965-1967, 1968-1970, and 1971-1973.

Following a survey of three journals, Journal of Counseling Psychology, Personnel and Guidance Journal, and Counselor Education and Supervision, to specify the population of pre-, true-, and quasi-experimental studies, a sample of 38 studies was randomly chosen for each year span. Each study was evaluated by a trained rater on the Evaluation Instrument for Experimental Methodology, which produced six measures of the quality of reporting and methodology. Three raters independently rated the studies. Fifteen randomly chosen studies were commonly rated to establish the average interrater reliability estimate of .78.

A 1 x 4 design with equal cell sizes was utilized to examine for differences between the four year spans. A multivariate analysis of variance using orthogonal polynomials was used to test the hypotheses of the trend of the quality over time. A slight linear trend was distinguished across the four year spans. Graphic illustration demonstrated a very slight positive increasing trend over time. Examination of the means derived from the EIEM for the last year span revealed that the quality of reporting and the introduction was "clearly adequate." However, quality of the method, results, and discussion sections was generally "barely adequate." In total the quality of experimental research in counseling and counselor education was characterized as "barely adequate."

THE QUALITY OF EXPERIMENTAL METHODOLOGY IN COUNSELING AND COUNSELOR EDUCATION

By Constance C. Ripstra

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services and Educational Psychology

1974

TABLE OF CONTENTS

Chapter                                                      Page

I.   THE PROBLEM, RATIONALE, AND RELATED RESEARCH               1
       Rationale                                                1
       Purpose                                                  5
       Review of the Literature                                 6
         Reporting                                              6
         Sampling and Generalization                            8
         Designs and Controls                                  10
         Measurement and Criteria                              12
         Analysis                                              13
         Replication                                           15
       Hypothesis                                              16
       Summary                                                 16
II.  EXPERIMENTAL DESIGN AND METHODOLOGY                       18
       Sample                                                  18
       Instrument                                              23
       Procedures                                              25
       Design and Statistical Analysis                         29

III. ANALYSIS OF RESULTS                                       31
       Preliminary Data                                        31
       Test of Hypotheses                                      33
       Observations                                            41
       Summary                                                 42

IV.  SUMMARY AND DISCUSSION                                    44
       Summary                                                 44
       Discussion                                              45
       Recommendations                                         51
       Conclusion                                              53

APPENDICES

Appendix                                                     Page

A. Frequency Count of the Number of Experimental, Correlational and Miscellaneous Studies for the Three Journals for the Four Year Spans    55
B. Evaluation Instrument for Experimental Methodology    56
C. Relevant Definitions    60
D. Notes on Rating    61
E. Means and Standard Deviations for Items of the Evaluation Instrument for Experimental Methodology    62
F. Univariate Tests of the Six Dependent Variables for a Linear Trend    64
G. Principal Components of the Correlation Matrix for the Six Dependent Variables of the EIEM    65
H. Ninety-five Percent Confidence Intervals for Estimated Means of the Six Dependent Measures of the EIEM    66

REFERENCES    67

LIST OF TABLES

Table                                                        Page
2.1. Frequency Count of the Number of Experimental Studies for the Three Journals and Four Year Spans in the Population    21
2.2. Frequency Count of the Number of Experimental Studies for the Three Journals and Four Year Spans in the Sample    22
2.3. Hoyt Reliability Estimates for the Fifteen Commonly Rated Studies and the Five Studies Rated for a Validity Estimate in the Order of Rating    28
3.1. Mean Scores and Corrected Standard Deviations for the Scales of the EIEM    32
3.2. Sample Intercorrelation Matrix for Scales of the EIEM    33
3.3. Multivariate Test for Orthogonal Polynomials for Six Scales of EIEM    34
3.4. Multivariate Test for Orthogonal Polynomials for Two Overall Items of the EIEM    35
3.5. Slopes of Estimated Means for Year Spans    35

LIST OF FIGURES

Figure                                                       Page
3.1. Observed Means on the Six Measures of the EIEM    36
3.2. Estimated Means on the Six Measures of the EIEM    37
3.3. Graphic Description of Univariate Confidence Intervals for Six Scales of EIEM    40

CHAPTER I

THE PROBLEM, RATIONALE, AND RELATED RESEARCH

Rationale

The aim of science above all else is to discover new and useful information in the form of verifiable data, that is, of data obtained under conditions such that other qualified people can make similar observations and obtain the same results. This calls for orderliness and precision in uncovering relationships and communicating them to others. (Hilgard, 1962, p. 9)

Counseling psychology is usually considered an applied science, and presumably the aim of science given in the above quotation is also a goal for this branch of psychology. Some counseling psychologists (Hansen & Warner, 1971; Thoresen, 1969; Whiteley, 1967) have questioned whether the profession is making significant progress toward this goal. The quality of research studies has been questioned, and calls for improvement have been made (Kelley et al., 1970; Pawlicki, 1970; Schmidt & Pepinsky, 1965; Thoresen, 1969).
There are several considerations which make it imperative to pay attention to these professional needs. One is that the profession may be building a research base on a foundation of sand. If an initial study in a particular area shows statistically significant results, the tendency is to take the results as truth and continue investigation of the problem in an attempt to further define the construct of interest. Because most professionals are reluctant to replicate studies (Smith, 1970), the finding is never retested. Consequently, further research or conclusions may build upon a faulty base.

The probability of a faulty base increases considerably when the methodology of the study is examined. "Research which is not well formulated is more than worthless since it becomes deceptive as well (Whiteley, 1967, p. 281)." It is probable that a majority of research has significant errors that confuse or invalidate the results entirely or restrict conclusions to the sample. Glass and Robbins (1967) expertly demonstrated this in an evaluation of studies by Delacato and his associates in the field of reading theory. All of the empirical studies cited by Delacato as supporting his theory of the role of neurological organization in reading were shown to contain major faults. Thus, Glass and Robbins illustrated how research can build on a faulty base of prior research and seemingly validate a theory without legitimate evidence.

A second reason for examining the quality of research in counseling is that most research does not have a likely chance of rejecting the null hypothesis unless the treatment effect is powerful (Cohen, 1962). The choices of design, sample size, and analysis are often not appropriate, and, therefore, the study does not have sufficient precision to correctly reject the null hypothesis of no treatment effects. For the profession this may mean much effort and time expended for little more than a researcher's personal experience in the experimental process. Consequently, the progress of counseling toward establishing a research base can be inhibited by design and methodological errors.

The research progress of the profession is also slighted when researchers do not closely examine their data for findings not directly associated with the stated hypotheses. As Tukey (1969) stated, "Data analysis needs to be both exploratory and confirmatory [p. 90]." Therefore, when an experimenter stops after his data analysis, the scientific endeavor is halted at the beginning of the process (Eastwood, 1967).

While the consensus of the literature is that there is a lack of well-planned and executed research in counseling, the authors of such conclusions base their comments on varying types of data. Some are communicating intuitive feelings about the state of counseling research (Coleman, 1957; Dressel, 1953; Fisher & Roth, 1961; Holland, 1974). Others reach the same conclusion following a systematic review of the literature on group counseling (Gazda & Larsen, 1968), practicum supervision (Hansen & Warner, 1971), behavior therapy with children (Pawlicki, 1970), and research published in 1963 (Schmidt & Pepinsky, 1965). An occasional astute reader has written a critical review of a published research study which has reporting or methodological flaws (Crittenden, 1973; Marks, Conry, & Foster, 1973; Mills & Mencke, 1967; Sieka, Taylor, Thomason, & Muthard, 1971).
While several have reviewed counseling journals to examine such variables as types of statistics used (Edgington, 1964), the institutional sources of published research (Goodstein, 1963), common errors in manuscripts submitted for publication (Smith, Smith, Scheffers, & Steinmann, 1971), and publication trends in empirical versus theoretical papers (Foreman, 1966), only one study has been published which systematically evaluated the methodological quality of counseling research. Kelley, Smits, Leventhal, and Rhodes (1970) critiqued the designs of all empirical studies published in the Journal of Counseling Psychology from 1964 through 1968. Using Campbell and Stanley's system (1963), they labeled the designs as pre-experimental, true-experimental, or quasi-experimental and rated the studies according to the internal and external validity criteria. Their evaluation, however, scrutinized only one aspect of research methodology, that of design.

Purpose

Several authors recommend that a qualitative analysis of published counseling research be pursued (Foreman, 1966; Samler, 1958; Stone & Shertzer, 1964). The purpose of the present study was to accomplish such an evaluation. It empirically determined those aspects of methodology which are consistently weak in published research related to counseling and counselor education. Specifically, the intent of the investigator was to systematically evaluate the quality of experimental research which has been published in the fields of counseling and counselor education from 1962 through 1973. Such information can be used in several ways: as a baseline of the status of published counseling research at a given point in time; as an attentional device directed at the need for more carefully executed research; as an educational tool for those who read and evaluate professional research; as an educational tool for those who teach research skills, for improvement and/or reemphasis; and as feedback to editors of journals for improvement in review and acceptance criteria. By examining the quality of research across years, one may conclude, as some have postulated (Carkhuff, 1965; Myers, 1966; Patterson, 1963), whether in fact the quality of research is improving.

This investigator recognizes that this study has examined only one of the two issues of quality of counseling and counselor education research. The research methodology has been evaluated, while relevance of the results and studied phenomena to the profession has not been examined. While neither is a sufficient condition, both are necessary conditions for quality research in a profession. The overall rationale and intended outcome of the study was to encourage what is implied by Lykken (1968):

The value of any research can be determined, not from the statistical results, but only by skilled, subjective evaluation of the coherence and reasonableness of the theory, the degree of experimental control employed, the sophistication of the measuring techniques, the scientific or practical importance of the phenomenon studied, and so on. [pp. 158-159]

Review of the Literature

Many articles are devoted to examining the recurring methodological problems encountered in counseling research reports. Among these are problems with reporting, sampling and the accompanying difficulties in generalization, design, controls, measurement and criteria, analysis, and the lack of replication. These problems will be discussed in the next sections.
Reporting

The relevance of good reporting lies mainly with the issue of replication, although its benefits also contribute to valid evaluation and reliable usage of results. Inadequate reporting is a common criticism of counseling research. A recommendation made to the Division of Counseling Psychology, American Psychological Association, concerning modifications in scientific inquiry and reporting was to encourage ". . . a practice of reporting in greater detail the research methodology employed, the characteristics of the clients, the precise nature of the professional interventions, and the outcome measures (Whiteley & Allen, 1969, p. 84)." Others note specific deficiencies which commonly occur in counseling publications: lack of clear and concise definition of the problem of interest (Harrison, 1971; Smith, Smith, Scheffers, & Steinmann, 1971); inadequate statements regarding treatment process and counselors' theoretical orientation or qualifications in therapy research (Gazda & Larsen, 1968; Kiesler, 1966b; Patterson, 1966; Pawlicki, 1970; Whiteley & Allen, 1969); inadequate description of dependent variables (Kiesler, 1966b); and poor usage of grammar and style (Smith, Smith, Scheffers, & Steinmann, 1971). Other authors make general comments about the importance of careful and complete reporting of disciplined inquiry (Fisher & Roth, 1961; Kelley, Smits, Leventhal, & Rhodes, 1970; Orne, 1962; Spithill, 1973; Thoresen, 1969). Kelley et al. (1970) suggest that authors not only specify all details of procedures but also include statements of inadequacies in their studies.

Sampling and Generalization

Sampling refers to the process of defining a population of interest and then, assuming it is too large to use in its entirety, choosing a sample from which inferences can be generalized to that population. Orne (1962) holds that "ecological validity," generalization, is one of the two requirements for meaningful experimentation. The ideal procedure of sampling from a population is random selection of a sample sufficiently large to satisfy statistical considerations.

The fields of counseling and counselor education must contend with the usual problems encountered by those professions interested in human beings. The population of interest is often spread across the nation, if not the world, and, therefore, too often the sampling procedure is dictated by proximity or convenience. The consequences of such sampling procedures are usually seen in inaccurate and illegitimate generalizations beyond the sample. In counseling research one must be aware of the many populations of interest possible even in a single study: counselors, counselees, counselor educators, methods and techniques, environmental-situational variables, and measuring variables (Meltzoff & Kornreich, 1970; Patterson, 1960). The use of volunteers poses a common problem (Orne, 1962; Patterson, 1956), as does the frequent use of counselor trainees when the population of interest is counselors (Herr, 1964; Patterson, 1966). In an evaluative survey of counseling process and outcome studies, Kelley et al. (1970) found that 61.6% of the studies reviewed had an interaction between subject selection and treatments, which is a source of external invalidity (Campbell & Stanley, 1963). Each of these problems results in limited generalization.

An additional problem often encountered in counseling research using group designs is small sample size. Individual differences of humans create a problem for sampling.
To assure a representative sample on all variables which contribute to the problem of interest, a large sample size is required (Cohen, 1962; Tukey, 1969). Reviews of group counseling research (Gazda & Larsen, 1968), abnormal-social psychology (Cohen, 1962), behavior therapy with children (Pawlicki, 1970), and psychotherapy research (Meltzoff & Kornreich, 1970) point out the consistent use of small sample sizes. Although there are a number of problems created with small n (Tversky & Kahneman, 1971), the post hoc solution is replication of the study (Patterson, 1956). Unfortunately, replication studies are not valued as professional activity (Barker & Gurman, 1972).

Though sampling procedures are recognized as important aspects of research (Coleman, 1957; Patterson, 1963), much counseling research cannot legitimately be generalized beyond the sample because of restrictions due to error (Dressel, 1954; Krause, 1972). However, the argument has been made that the data from a nonrandomly selected sample may be generalized to the type of population which the sample characterizes (Cornfield & Tukey, 1956). Implicit is the requirement that the sample be very carefully described so that the reader can infer beyond the sample. Unfortunately, as was noted in a previous section, the general quality of reporting in counseling research is inadequate. Thus, many studies cannot use the Cornfield-Tukey argument to allow generalization beyond the nonrandom sample.

Designs and Controls

Kelley et al. (1970) evaluated studies published in the Journal of Counseling Psychology from 1964 through 1968 by using Campbell and Stanley's (1963) criteria for design analysis. The majority of studies were found to have sources of invalidity that were not controlled in the design. They concluded that this group of studies "has little relevance beyond that of generating testable hypotheses [p. 340]." Dressel (1953) came to the same conclusion following an evaluation similar to Kelley's. Of twelve studies reviewed in detail, ten had errors in design.

In their survey of counseling research, Kelley et al. (1970) found that the designs of a majority of the published studies they reviewed were classified as pre-experimental. Such designs have a treatment group but no adequate comparison or control group. Results from such designs should be considered tentative, and no causal inferences can be made legitimately. However, Gazda and Larsen (1968) found that 70% of the group counseling studies reviewed had "true-experimental" designs. These designs have adequate controls for evaluating treatment effects and allow causal statements to be made.

In experimental studies control of all contributing variables is desirable in order to say with some degree of confidence that the change in the dependent variable is due to the manipulated variable. Problems of improper or absent consideration of control seem to be a major criticism of counseling research (Calvin, 1954; Coleman, 1957; Dressel, 1953; Harrison, 1971; Hobbs & Seeman, 1955; Kiesler, 1966b; Patterson, 1963, 1966). Pawlicki (1970) evaluated behavior therapy research with children and found that 85% did not provide a control group. This is misleading, however, as many of the reviewed studies were single-subject designs. Gazda and Larsen (1968) report that 15% of the group and multiple counseling research studies published prior to 1967 did not report use of control groups or statistical controls.
The use of statistical control through analysis of covariance does not seem widespread, though its use is recommended in the counseling literature (Feldman & Hass, 1970; Herr, 1964; Patterson, 1956, 1963). Matching seems to remain a favorite technique of counselor researchers (Patterson, 1956), despite warnings of loss of power and difficulties in obtaining truly matched groups (Campbell & Stanley, 1963; Feldman & Hass, 1970). Recommended changes in design include utilization of factorial designs to simultaneously investigate and control the many variables which are thought to contribute to human interaction and learning (Ford, 1959; Kiesler, 1966b; Whiteley & Allen, 1969).

Measurement and Criteria

While it is generally recognized that instrumentation is a major aspect of any scientific endeavor (Coleman, 1957; Thoresen, 1969), inadequate measuring devices continue to contribute to the problems in counseling research. The measurement of process and outcome variables seems to be a major stumbling block for counseling research (Herr, 1964). Jensen, Coles, and Nestor (1955) specify that the necessary characteristics of a criterion variable include definability, stability, and relevance. These are apparently difficult to attain. Many researchers choose as dependent variables standardized educational or psychological instruments or "home-made" rating instruments (Kiesler, 1966a). Independent raters are often employed (Bordin et al., 1954). These introduce measurement error, which contributes to a reduction of the power needed to correctly reject null hypotheses. Poor choice of dependent measures also contributes to the preponderance of irrelevant research.

An additional consideration arises because of the subject matter of counseling research. Many variables typically contribute to a concept, and, therefore, to prevent imposing unidimensionality on it, multivariate models must be encouraged (Bordin et al., 1954; Edwards & Cronbach, 1952; Fisher & Roth, 1961; Gazda & Larsen, 1968; Lachenmeyer, 1970; Thoresen, 1969). This means inclusion of those dependent variables thought to be affected by the independent variables.

Analysis

When compared to other aspects of research, analysis is infrequently pointed to as a source of methodological error in counseling research. Criticisms center not on the inappropriateness of the statistics used, but on insufficient use of available techniques or procedures which add to the analysis process. Thoresen (1969) supports Tukey's (1969) arguments for going beyond the statistical significance test; Nunnally (1960) concurs. "Such 'detective work' facilitates serendipity . . . (and) careful analysis and re-analysis may suggest new hypotheses and provide the basis for speculation . . . (Thoresen, 1969, p. 268)."

Likewise, Nunnally (1960) and Tversky and Kahneman (1971) advocate the use of confidence intervals in addition to the traditionally used hypothesis tests. Such reporting gives more information than a statement of significance. Kiesler (1966b) and Nunnally (1960) suggest that variances of groups be examined for differences in addition to the traditional analysis of means. Nunnally (1960) and Thoresen (1969) stress the use of statements of meaningful significance in preference to the use of the .05 statistical level of significance. Nunnally (1960) also questions the wide use of significance tests of limited meaning in correlational studies, where a significant finding usually specifies only that the correlation is not zero.
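To make the confidence-interval recommendation concrete, the following minimal sketch contrasts what a significance test and a confidence interval each report for the same two-group comparison. The data are hypothetical and the code illustrates the Nunnally and Tukey point in general terms; it is not an analysis from any study reviewed here.

```python
import numpy as np
from scipy import stats

# Hypothetical outcome scores for a treatment and a control group.
treatment = np.array([14.2, 15.1, 13.8, 16.0, 14.9, 15.5])
control = np.array([13.1, 14.0, 12.7, 13.5, 14.2, 13.0])

t_stat, p = stats.ttest_ind(treatment, control)  # pooled-variance t test
n1, n2 = len(treatment), len(control)
diff = treatment.mean() - control.mean()
# Pooled variance and standard error of the mean difference.
sp2 = ((n1 - 1) * treatment.var(ddof=1)
       + (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
se = (sp2 * (1 / n1 + 1 / n2)) ** 0.5
half = stats.t.ppf(0.975, n1 + n2 - 2) * se  # two-sided 95% half-width
print(f"p = {p:.3f}; difference = {diff:.2f}, "
      f"95% CI [{diff - half:.2f}, {diff + half:.2f}]")
```

The p-value says only whether a zero difference is plausible; the interval also conveys the magnitude of the effect and the precision with which it was estimated.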
Cohen (1962) and Tversky and Kahneman (1971) offer the criticism that most research does not consider or report a value for beta, the probability of a Type II error, which is a decision to not reject a false null hypothesis. Cohen (1962) reviewed all the articles published in 1960 in the Journal of Abnormal and Social Psychology. By an analysis of beta, he concluded that none of the articles had a chance of rejecting the null hypothesis unless the treatment effect were large. The implication is that under present conditions the probability of correctly rejecting the null hypothesis is small. Suggestions relating to sample size, control of error variance, alpha level, and size of treatment effects are made.

Replication

Two aspects of replication are important: the quality of reporting and choice of procedures which allows for replication (Cronbach & Suppes, 1969; Orne, 1962), and the frequency with which it occurs in professional literature. The first has been commented on in a previous section, while the second has been alluded to in a number of sections. That replication is a necessary component in any research plan is recognized often in professional articles (Herr, 1964; Kiesler, 1966b; Krause, 1972; Lykken, 1968; Nunnally, 1960; Smith, 1970; Stanley, 1967; Thoresen, 1969). "In studies where random sampling from a defined population is difficult or impossible, it is of crucial importance that a number of replications be planned as part of the original design or be carried out by other workers (Patterson, 1955, p. 255)." "Confirmation comes from repetition (Tukey, 1969, p. 84)." However, Smith (1970) concludes that replication is rarely done for either experimental or correlational studies; Gazda and Larsen (1968) found 22 replication studies in their comprehensive review of group and multiple counseling research.

Hypothesis

The following hypothesis was the primary focus of this investigation:

Differences exist between the scores on the Evaluation Instrument for Experimental Methodology for the four groups of years of published counseling research, such that there is an increasing linear trend, indicating an increase in the quality of the research across the year spans.

A statistical significance level of .05 was used. It was deemed a reasonable value when considering both Type I and Type II decision errors. Meaningful significance was especially relevant for examination of specific items of the Evaluation Instrument for Experimental Methodology. To establish a summary of research weaknesses found across the articles, the means of individual items were examined. An item whose mean was less than four would indicate an aspect of research that was rated less than adequate across the sample of experimental studies.

Summary

The purpose of this study was to systematically evaluate the quality of experimental research which has been published in the fields of counseling and counselor education from 1962 through 1973. Attention was directed at the methodology and reporting of studies rather than at the subject matter or variables being examined. The variable of specific interest was time: has there been an improvement from 1962 through 1973 in the quality of published research? The results will be most pertinent to counselor researchers and educators, for the journals from which research studies were selected are those journals regularly read by these members of the profession and which publish empirical studies.
The data consist of scores on the Evaluation Instrument for Experimental Methodology, which examines the quality of design, procedures, analysis, and reporting.

CHAPTER II

EXPERIMENTAL DESIGN AND METHODOLOGY

Sample

The population of interest was group experimental studies published from 1962 through 1973 in the three major counseling and counselor education journals: Journal of Counseling Psychology, Personnel and Guidance Journal, and Counselor Education and Supervision. The three journals were chosen as the major publication outlets for experimental studies for counselor researchers and educators. The choice of two of the journals is supported by the empirical evidence that the Journal of Counseling Psychology and Personnel and Guidance Journal were cited most often in a survey of the references of published articles (Cotton & Anderson, 1973). Counselor Education and Supervision, as the publication of the Association for Counselor Education and Supervision, is the official journal for professional counselor educators.

The term "experimental study" was operationally defined as a study in which at least one variable was manipulated and the effects on another variable were observed (Campbell & Stanley, 1963). In other words, the experimenter systematically introduced a treatment and recorded results of that treatment on some variable(s). Three types of experimental studies are described by Campbell and Stanley (1963): pre-experimental, true-experimental, and quasi-experimental designs. All were considered part of the population of interest.

The complete population of experimental studies was specified by the following process. Two individuals competent in research design and statistics labeled each empirical study published from 1962 through 1973 in the three journals as either experimental, correlational, or miscellaneous (see Appendix A). One of the experts had a Ph.D. in research and statistics and at the time was employed as a research associate. She had taught three statistics classes and during her degree program had worked as a research consultant for three years. The investigator of the present study, serving as the second consultant, had completed five of seven courses of a cognate in research methodology in a Ph.D. program. She had earned grades of 4.0 in the completed research and statistics classes and for four terms had been a graduate assistant for the research methodology series offered by the College of Education, Michigan State University.

An empirical study was considered to be any study which contained a report of a systematic collection of data. The definition of "experimental" was given in the paragraph above. A correlational study was defined as a study that compared existing groups of individuals on some dependent measure. Studies not classifiable as either experimental or correlational were labeled as miscellaneous; surveys and factor analytic studies comprised the majority of these. The studies designated as correlational or miscellaneous were not included in the population of interest.

A total sample size of 152 was decided upon because it was the largest possible sample size if equal cell sizes were to be maintained. The population size of the first year span, 1962-1964, was 38, thereby setting 38 as the largest possible cell size. The sample of 152 studies to be evaluated was randomly selected from the total population of 363 experimental studies (Table 2.1).
Specifically, the complete population for the year span 1962-1964, 38 studies, was included in the sample. The decision to use the entire population for that span resulted in a reduction of the error variance. The samples of 38 studies for each of the remaining three year spans were randomly selected from the respective populations. Table 2.2 describes the sample according to year span and journal. The sample size was 41.87% of the population size.

Table 2.1. Frequency Count of the Number of Experimental Studies for the Three Journals and Four Year Spans in the Population

[The body of this table is unrecoverable from the scan. It reported, for each of the three journals and each of the four year spans, the number of published experimental studies, totaling 363 studies.]

Table 2.2. Frequency Count of the Number of Experimental Studies for the Three Journals and Four Year Spans in the Sample

[The body of this table is unrecoverable from the scan. It reported the corresponding counts for the sample of 152 studies, 38 per year span.]

Of the 152 studies in the sample, 40 studies or 26% were pre-experimental, 79 or 52% were true-experimental, and 33 studies, 22% of the sample, were quasi-experimental designs. Sixty-one percent, 93 studies, were applied research studies with outcome measures, and 31%, 47 studies, were applied research with process measures. Twelve studies, 8%, were considered basic research. Six studies, 4% of the sample, were master's degree theses, and 30 studies or 20% were doctoral dissertations. Thirty studies or 20% were at least partially supported by a grant. As the sample was randomly selected, these can be considered estimates of the specified population's characteristics.

Instrument

Assessment of the reporting and methodology of the studies was done using the Evaluation Instrument for Experimental Methodology (EIEM) (Appendix B), a rating form developed by the investigator. It has 37 Likert-scaled items, each item having six response options. Thirty-five items are divided into five sections, four of which correspond to the traditional sections of an experimental report: reporting (9 items), introduction (5 items), methods (8 items), results (7 items), and discussion (6 items). Two additional items provide an overall rating of the reporting and methodology. The reporting section evaluates the clarity of writing and description throughout the study. The introduction section covers the literature review, purpose and hypothesis statements, and definition of the independent variables. The methods section includes items on the appropriateness of the dependent variables, sampling, subject assignment, and design. The results section evaluates the statistical analysis. The discussion section includes assessment of the conclusions, generalizations, and qualifications of the study. A mean score is reported for each section, and a mean rating for the entire instrument is given as a total score.
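Because the EIEM scores are section means, and Chapter III notes that raters omitted items that did not apply to a given study, the scoring can be summarized in a few lines. The sketch below assumes the 37 items are stored in instrument order with None marking an omitted item, and it treats the total as the mean over the 35 section items; both are illustrative assumptions rather than details taken from the instrument itself.

```python
# A sketch of EIEM scoring. Each of the 37 six-point Likert items is assumed
# stored in order; None marks an item the rater judged not applicable.
SECTIONS = {              # item counts from the instrument description above
    "reporting": 9,
    "introduction": 5,
    "method": 8,
    "results": 7,
    "discussion": 6,
}                         # items 36-37 are the two overall ratings

def score_eiem(item_ratings):
    """Return the five section means plus a total mean over items 1-35."""
    assert len(item_ratings) == 37
    scores, start = {}, 0
    for name, count in SECTIONS.items():
        chunk = [r for r in item_ratings[start:start + count] if r is not None]
        scores[name] = sum(chunk) / len(chunk)   # omitted items are excluded
        start += count
    rated = [r for r in item_ratings[:35] if r is not None]
    scores["total"] = sum(rated) / len(rated)    # assumed definition of total
    return scores
```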
The instrument was constructed by a compilation of the recurring problems in experimental counseling research cited previously in Chapter I. Special attention was also given to Smith, Smith, Scheffers, and Steinmann's (1971) survey of the common errors which occur in psychological studies. Other guides to the evaluation of research (Burck, Cottingham, & Reardon, 1973; Borg, 1963; Davitz & Davitz, 1967; Farquhar & Krumboltz, 1959; Isaac & Michael, 1971; Roberts, 1969), as well as experts in research methodology in the Department of Educational Psychology, Michigan State University, were consulted during the initial and trial stages of instrument development.

The interrater reliability of the instrument for three raters prior to the data collection was calculated as .79 using Hoyt's analysis of variance (1941). During the data collection an average reliability estimate of .78 was also calculated for the three independent raters on fifteen studies evenly distributed throughout the evaluation process. This estimate was considered high enough to substantiate having one rater evaluate the quality of a study.

An attempt was made to estimate the validity of the instrument. The two consultants described earlier as having detailed the population of research studies, both considered qualified in the field of research methodology, evaluated five of the same studies on which the interrater reliability was calculated. The average Hoyt's ANOVA value for this form of concurrent validity estimate was .85. This was considered high enough to conclude that the instrument was reasonably valid for the intended use of evaluating experimental methodology.

Procedures

Random selection of the sample was accomplished by use of a random numbers table. Fifteen of the total of 152 studies were randomly chosen to be independently rated by all of the raters in order to establish interrater reliability estimates. The remaining 137 were randomly ordered and then assigned to the three raters. The fifteen studies designated for reliability checks were evenly placed throughout the sequence of the other studies for each rater. The random sequencing of the studies was intended to avoid a time or fatigue bias, and the random assignment to raters was done to avoid a rater bias. Prior to the rating process each study was photocopied and blinded for journal name, author's name and affiliation, and dates.

Three individuals were paid to rate the studies. They were recommended as superior students in the research design and statistics classes at Michigan State University by the professor who taught those classes. Each had successfully completed the three basic research courses offered by the Department of Counseling, Personnel Services and Educational Psychology: Quantitative Methods in Education, Advanced Quantitative Methods in Education, and Experimental Design in Education. Two raters had also completed a nonparametric statistics course. Rater A was a doctoral student in counselor education and had completed the three-course statistics series immediately prior to the rating process with a 4.0 or "A" grade in each course. Rater B was a doctoral student in statistics and had worked as an assistant to a research consultant. She also had finished the three-course statistics series, as well as an advanced course in nonparametric statistics, with a 4.0 grade in each. Rater C was a doctoral student in rehabilitation counseling and had completed the same four courses as Rater B with 4.0 grades. He had taught experimental psychology, which included research design and statistics, for four years at the college level. The high reliability with the two consultants tends to support the above evidence of the raters' competence for the rating task.
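To make the reliability computation concrete, here is a minimal sketch of Hoyt's (1941) analysis-of-variance reliability for a studies-by-raters matrix of scores, assuming the coefficient is one minus the ratio of the residual mean square to the between-studies mean square from a two-way ANOVA without replication. The ratings in the example are invented for illustration and are not data from this study.

```python
import numpy as np

def hoyt_reliability(ratings):
    """Hoyt (1941) ANOVA reliability for an n_studies x k_raters matrix.

    Computed as 1 - MS_residual / MS_studies from a two-way
    (studies x raters) analysis of variance without replication.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_total = ((x - grand) ** 2).sum()
    ss_studies = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between studies
    ss_raters = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between raters
    ss_resid = ss_total - ss_studies - ss_raters
    ms_studies = ss_studies / (n - 1)
    ms_resid = ss_resid / ((n - 1) * (k - 1))
    return 1.0 - ms_resid / ms_studies

# Hypothetical total scores for five studies rated by three raters.
ratings = [[4.2, 4.5, 4.3],
           [3.1, 3.4, 3.0],
           [4.8, 4.6, 4.9],
           [2.9, 3.3, 3.2],
           [4.0, 4.1, 3.8]]
print(round(hoyt_reliability(ratings), 2))  # about 0.98 for these made-up data
```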
Training with the EIEM took place immediately prior to the rating process. It consisted of independent evaluations of randomly chosen studies from the population of interest remaining after the sampling process. Group discussion of the rating of each item was conducted in order for the three raters to agree on the meaning of a particular item. In several instances this discussion resulted in a revision of the instrument. In addition, each rater was provided with definitions of relevant terms (Appendix C) and an instruction sheet for the rating process (Appendix D).

During the two-week rating process each rater worked independently. Checks were made at fifteen points throughout the process to establish that a reliability of at least .70 was maintained. Table 2.3 contains the Hoyt reliability coefficient for each of the fifteen studies in the order they were rated. If the reliability had gone below .70 for two successive studies, retraining sessions would have been held to reestablish the interrater reliability beyond the criterion of .70.

In addition to rating each article, the rater was asked to identify the type of design (pre-, true-, or quasi-experimental), the type of experiment (applied-outcome, applied-process, or basic research), and the statistical tests used in each study (see Appendix B).

Table 2.3. Hoyt Reliability Estimates for the Fifteen Commonly Rated Studies and the Five Studies Rated for a Validity Estimate in the Order of Rating

Order   Study Number   Reliability*   Validity
  1          83            .89
  2         108            .84
  3          54            .82
  4          42            .79
  5         130            .83          .85
  6         143            .87          .89
  7          18            .82
  8         110            .79          .84
  9          76            .74          .83
 10          75            .51
 11         117            .79          .82
 12          90            .63
 13          45            .88
 14          63            .84
 15          61            .61

*Standard deviation of the reliability estimates equals .107.

Design and Statistical Analysis

The independent variable of interest was years of published research in counseling and counselor education. The total time span of 1962 through 1973 was considered. This was divided into four levels, each level containing three years: 1962-1964, 1965-1967, 1968-1970, and 1971-1973. Thus, the design of this correlational study is a 1 x 4 matrix with an equal number of observations per cell:

Y1 = 1962-1964, n1 = 38
Y2 = 1965-1967, n2 = 38
Y3 = 1968-1970, n3 = 38
Y4 = 1971-1973, n4 = 38

The statistical treatment was a multivariate analysis of variance using the six scores derived from the EIEM: reporting (REP), introduction (I), methods (M), results (R), discussion (D), and total (T). This analysis would specifically answer the research hypotheses. An analysis of orthogonal polynomials (linear, quadratic, and residual trends across the groups of years) was performed. It was done to establish whether there has been a trend in the quality of methodology for published research across the year spans. Since the population number of published experimental studies in each year span was known, the analyses included a correction of the variance-covariance matrix for having a known finite population. The hypotheses tested were:

Hypothesis 1: There is a significant linear trend for the dependent measures across the four year spans.

Hypothesis 2: There is a significant quadratic trend for the dependent measures across the four year spans.

Hypothesis 3: There is a significant residual trend for the dependent measures across the four year spans.
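For four equally spaced, equally sized groups, the orthogonal polynomial contrast coefficients are standard, and applying them to a set of group means yields the linear, quadratic, and cubic (here, residual) contrast estimates. The sketch below illustrates only the contrasts themselves; it does not reproduce the multivariate test or the finite-population correction of the variance-covariance matrix described above, and the printed values are unscaled contrast estimates, not significance tests.

```python
import numpy as np

# Standard orthogonal polynomial coefficients for four equally spaced groups.
CONTRASTS = {
    "linear":    np.array([-3.0, -1.0,  1.0, 3.0]),
    "quadratic": np.array([ 1.0, -1.0, -1.0, 1.0]),
    "cubic":     np.array([-1.0,  3.0, -3.0, 1.0]),  # the residual trend here
}

def trend_estimates(group_means):
    """Contrast estimates for the four year-span means of one EIEM scale."""
    m = np.asarray(group_means, dtype=float)
    return {name: float(c @ m) for name, c in CONTRASTS.items()}

# Observed Total-score means for the four year spans (Table 3.1, below).
print(trend_estimates([3.76, 4.09, 3.94, 4.22]))
# linear is about 1.23 (positive); quadratic about -0.05; cubic about 0.91
```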
CHAPTER III

ANALYSIS OF RESULTS

Statistical analyses were calculated at the Michigan State University Computer Center on a Control Data 6500 computer system. Use of the Michigan State University computer facilities was made possible through support, in part, from the National Science Foundation. Data analyses were generated by a multivariate analysis of variance program developed by Finn (1967) and a program for computing a corrected variance-covariance matrix by Scheifley (1973).

Preliminary Data

Mean scores and standard deviations for the four groups on the five subscales and one total score of the Evaluation Instrument for Experimental Methodology are shown in Table 3.1. The standard deviations reported are those used in the data analysis following a correction for having a finite population. The mean ratings and standard deviations for all items in the EIEM are reported by group in Appendix E. An item might not have been appropriate for a particular study and, therefore, was omitted by the rater. This is reflected in the differing number of studies included in the calculation of the mean for an item. Unless otherwise noted, the number of studies equals 38 for each group.

Table 3.1. Mean Scores and Corrected Standard Deviations for the Scales of the EIEM

                   Y1            Y2            Y3            Y4
                Mean  S.D.    Mean  S.D.    Mean  S.D.    Mean  S.D.
Reporting       4.53   .74    4.73   .65    4.63   .60    4.90   .57
Introduction    4.42   .84    4.73   .80    4.67   .75    4.90   .60
Method          3.19   .86    3.56  1.00    3.56   .85    3.77  1.06
Results         3.31   .95    3.71   .90    3.52  1.01    3.84   .91
Discussion      3.33  1.19    3.62  1.02    3.36  1.10    3.70   .92
Total           3.76   .70    4.09   .69    3.94   .65    4.22   .64

The sample within-cell intercorrelation matrix for the scales of the EIEM is reported in Table 3.2. Using Fisher's r to z transformation (Glass & Stanley, 1970) with an alpha level of .05, the minimum sample correlation that is statistically significant from zero is .16. Therefore, each reported correlation is statistically greater than zero.

Table 3.2. Sample Intercorrelation Matrix for Scales of the EIEM

               Rep    I     M     R     D     T
Reporting     1.00
Introduction   .72  1.00
Method         .57   .49  1.00
Results        .50   .48   .45  1.00
Discussion     .39   .37   .41   .46  1.00
Total          .82   .74   .79   .75   .66  1.00

Test of Hypotheses

An analysis of orthogonal polynomials was accomplished to determine the form of the relationship between the year spans for the six dependent variables. The purpose of this analysis, commonly called a trend analysis, was to determine whether the means of the dependent variables were influenced by changes in the independent variable. For this investigation the question was whether a trend over time existed for the quality of published experimental research. The results of the test for orthogonal polynomials are found in Table 3.3. The univariate F-tests for the test of a linear trend are shown in Appendix F.

Table 3.3. Multivariate Test for Orthogonal Polynomials for Six Scales of EIEM

Test        F-ratio    df       p
Linear       2.135     1,148   < .05
Quadratic     .320     1,148   < .93
Residual     1.043     1,148   < .40

A separate multivariate analysis of orthogonal polynomials was performed for the two overall items of the EIEM (Table 3.4). The results were consistent with the analysis of the six EIEM scales.

Table 3.4. Multivariate Test for Orthogonal Polynomials for Two Overall Items of the EIEM

Test        F-ratio    df       p
Linear       3.042     1,148   < .05
Quadratic    1.088     1,148   < .34
Residual     1.472     1,148   < .23

A significant linear relationship with a nonzero slope was found to exist across time. After graphing the observed and estimated means for each dependent measure (Figures 3.1 and 3.2), a slightly positive significant linear trend was evident. Therefore, the quality of methodology in counseling and counselor education has improved over the twelve years. However, as can be seen from the graphs of estimated means, the degree of increase is slight.
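As a side note, the .16 significance floor quoted for Table 3.2 can be reproduced with Fisher's r-to-z transformation; the sketch below assumes the total sample size of 152 is the N entering the standard error, which matches the reported value.

```python
import math

def min_significant_r(n, z_crit=1.96):
    """Smallest |r| significantly different from zero via Fisher's r-to-z.

    z = atanh(r) is approximately normal with standard error 1/sqrt(n - 3),
    so the boundary correlation is tanh(z_crit / sqrt(n - 3)).
    """
    return math.tanh(z_crit / math.sqrt(n - 3))

print(round(min_significant_r(152), 2))  # 0.16, matching the text
```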
The estimated slopes (Table 3.5), each defined as the increase in the mean of the dependent variable from one year span to the next, vary from .08 to .17 on the 1-6 scale used for the EIEM. For example, for the measure Reporting there is a predicted increase of .10 on the criterion scale every three years.

Table 3.5. Slopes of Estimated Means for Year Spans

Scale/Item                   Slope
Reporting                     .10
Introduction                  .14
Method                        .17
Results                       .14
Discussion                    .08
Total                         .12
Overall Reporting Item        .12
Overall Methodology Item      .15

[Fig. 3.1. Observed Means on the Six Measures of the EIEM.]

[Fig. 3.2. Estimated Means on the Six Measures of the EIEM.]

Although prediction into time should be made with caution, the trend based on experimental research for 1962 through 1973, if maintained at the same rate, predicts a mean rating for the total score for quality of methodology and reporting of 4.46 by 1979, 4.94 by 1991, and 5.30 by 2000. By 1994 the mean will indicate that, in an overall evaluation, experimental research in counseling and counselor education is clearly adequate.

The graphs of means (Figures 3.1 and 3.2) reveal two interesting points. The results for the reporting and introduction scales cluster together, and the methods, results, and discussion scales cluster below the first two scales. This seems reasonable, in that the elements in the latter cluster are more concrete and seem to be dependent on each other, in that they evaluate knowledge of research methodology and statistics. The reporting and introduction scales, however, evaluate the description of what was done in the study and are both based on writing skill. The second point of interest is a consistent slight decrease in the means for the third year span compared to the second. Although this was not a significant decrease, as tested by the quadratic trend, the consistency for each dependent measure, excepting the method measure, should be noted. An examination of the residuals, the observed mean minus the estimated mean for each dependent variable, agreed with this visual observation. The estimated means consistently overestimated the means for year span three, while consistently underestimating the means for year span two. This lack of fit, however, was not statistically significant.

A principal components analysis of the correlation matrix was computed (Appendix G). It indicated that there was an overall and general factor of quality which explained 65% of the variation of the measures.

Univariate and multivariate confidence intervals were generated around the estimated means of the dependent variables to consider the present state of experimental research in counseling and counselor education. For this evaluation only the most recent year span, 1971-1973, was considered, since this time span is contiguous with the year of this investigation, 1974. Appendix H details the upper and lower limits of a 95% confidence interval for each variable. The conclusion can be formulated that with 95% certainty the true value of the estimated mean for each variable lies within these bounds.
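For one scale, a univariate interval of this general kind can be sketched as mean plus or minus t times s over the square root of n, using the corrected standard deviations of Table 3.1 and n = 38 per span. The computational form is an assumption here; the dissertation's own multivariate and finite-population-corrected intervals in Appendix H are wider, so the figures below are illustrative only.

```python
from scipy import stats

def ci95(mean, sd, n):
    """Two-sided 95% interval for a mean: mean +/- t * sd / sqrt(n)."""
    half = stats.t.ppf(0.975, df=n - 1) * sd / n ** 0.5
    return mean - half, mean + half

# Fourth-span (1971-1973) means and corrected SDs from Table 3.1, n = 38.
for scale, mean, sd in [("Reporting", 4.90, 0.57),
                        ("Method", 3.77, 1.06),
                        ("Total", 4.22, 0.64)]:
    lo, hi = ci95(mean, sd, 38)
    print(f"{scale:10s} [{lo:.2f}, {hi:.2f}]")
```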
Figure 3.3 contains the graphic representation of the univariate intervals compared to the scale of the dependent measures derived from the EIEM.

[Fig. 3.3. Graphic Description of Univariate Confidence Intervals for Six Scales of EIEM.]

Two subscales, reporting and introduction, lie clearly on the adequate end of the scale, while the other three subscales, method, results, and discussion, span the middle area of the scale. The quality of the reporting and the introduction section was "clearly adequate" for the last year span. However, the measures which evaluated the essence of the experimental research were considerably lower. These aspects of the evaluated research studies were in the gray area, neither "clearly inadequate" nor "clearly adequate." The interval for the total score was predictably between the two groupings and could be characterized as "barely adequate."

Observations

The comments to follow have not been examined statistically, but have been deemed of worth in the attempt to delineate errors which occur in recently published experimental research. The means for individual items for year span four, 1971-1973 (see Appendix E), were compared to the scale used in the rating process. A criterion for meaningful significance of 3.51 was established. Any item whose mean was less than 3.51 would indicate an aspect of the research for 1971-1973 which was less than adequate.

The means for items 16, 19, 25, 31, 32, and 34 were below the criterion. The evaluation for item 16 seemed to suggest that reports of experimental studies do not include adequate information, such as reliability and validity estimates, for measurement instruments used as dependent measures. Rated as "clearly inadequate" was the degree of random selection from the population of interest. The conclusion is that few of the studies for this year span indicated random selection from even a limited population. This affects the generalizability of the results. The rating for item 25 indicated that authors do not give evidence that the assumptions necessary for legitimate hypothesis tests are satisfied. This could reflect that the authors do not mention the assumptions or that the assumptions have been violated. In the discussion section authors failed to generalize appropriately to populations, treatments, or settings allowable by the design and sampling procedure. They also tended not to indicate limitations or weaknesses of their studies when these were evident. The final item rated below the criterion referred to the author making suggestions for further investigation which follow from his results. Apparently, few authors made such comments.
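The screening rule used above reduces to a one-line filter. In the sketch below the item numbers are those named in the text, but the attached means are hypothetical stand-ins for the Appendix E values, which are not reproduced here.

```python
# Flag EIEM items whose fourth-span mean falls below the meaningful-
# significance criterion of 3.51 (i.e., rated less than adequate).
CRITERION = 3.51
item_means = {16: 3.2, 19: 2.4, 25: 3.4, 31: 3.3, 32: 3.1, 34: 3.5}  # hypothetical
flagged = sorted(item for item, m in item_means.items() if m < CRITERION)
print(flagged)  # [16, 19, 25, 31, 32, 34], the items discussed above
```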
Summary

A multivariate trend analysis revealed a linear relationship between the four year spans for all dependent measures, as well as for the two overall evaluation items of the EIEM. Graphic representation of observed and estimated means illustrated a slightly positive increasing slope for each measure. The largest slope would predict only a .17 increase in the mean rating of quality over one three-year span. Although prediction into time should be made with appropriate caution, the trend based on experimental research for 1962 through 1973, if maintained at the same rate, predicts a mean rating for the total score for quality of methodology and reporting of 4.46 by 1979, 4.94 by 1991, and 5.30 by 2000. By 1994 the mean would indicate that in an overall evaluation experimental research in counseling and counselor education has become clearly adequate.

Confidence intervals were generated for the six scales of the EIEM for the fourth year span, 1971-1973, to provide evidence of the level of quality of experimental research in the fields of counseling and counselor education. The measures for reporting and introduction indicated that the quality for these two related aspects of an experimental study was "clearly adequate," though the band extended from "barely adequate" to "excellently accomplished." The measures for method, results, and discussion indicated a lower quality estimate for these three aspects of research. While each had a confidence span from "clearly inadequate" to "clearly adequate," the conclusion was offered that the quality of methodology of the counseling research published from 1971 through 1973 was mediocre.

CHAPTER IV

SUMMARY AND DISCUSSION

Summary

The purpose of this study was to systematically evaluate the quality of experimental research which has been published in the fields of counseling and counselor education from 1962 through 1973. Attention was directed at the methodology and reporting of studies rather than at the subject matter or variables being examined. The specific independent variable was time, in order to determine whether there has been an improvement since 1962 in the quality of published research. Four three-year spans were chosen as levels of the independent variable: 1962-1964, 1965-1967, 1968-1970, and 1971-1973. Following a survey of three journals, Journal of Counseling Psychology, Personnel and Guidance Journal, and Counselor Education and Supervision, to specify the population of pre-, true-, and quasi-experimental studies, a sample of 38 studies was randomly chosen for each year span. Each study was evaluated by a trained rater on the Evaluation Instrument for Experimental Methodology, which produced six measures of the quality of reporting and methodology. Three raters independently rated the studies. Fifteen randomly chosen studies were commonly rated to establish the average interrater reliability estimate of .78.

A 1 x 4 design with equal cell sizes was utilized to examine for differences between the four year spans. A multivariate analysis of variance using orthogonal polynomials was used to test the hypotheses of the trend of the quality over time. A slight linear trend was distinguished across the four year spans. Graphic illustration demonstrated a very slight positive increasing trend over time. Examination of the means derived from the EIEM for the last year span revealed that the quality of reporting and the introduction was "clearly adequate." However, quality of the method, results, and discussion sections was generally "barely adequate." In total the quality of experimental research in counseling and counselor education was characterized as "barely adequate."
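The extrapolated ratings quoted in this summary and in Chapter III can be reproduced with simple span-counting arithmetic, assuming a fourth-span estimated total mean of 4.22 (the observed value; the model estimate may differ slightly) and the estimated slope of .12 per three-year span.

```python
import math

def predicted_total_mean(year, base_mean=4.22, slope=0.12, base_span_end=1973):
    """Project the Total-score mean for the three-year span containing `year`,
    counting whole spans after 1971-1973 and adding `slope` per span."""
    spans_ahead = math.ceil((year - base_span_end) / 3)
    return base_mean + slope * spans_ahead

for year in (1979, 1991, 2000):
    print(year, round(predicted_total_mean(year), 2))
# 1979 -> 4.46, 1991 -> 4.94, 2000 -> 5.3, matching the reported projections
```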
Discussion

The evaluation of experimental studies in counseling and counselor education resulted in both good and bad news for the profession. The results indicate that there are slight differences, in the form of a linear trend, which were discriminated by the trend analysis. The linear trend is the major finding of this investigation. Caution in interpretation is advisable, however, because the amount of increase in quality for succeeding year spans is minimal. Prediction over time is also risky. Contributing factors to the quality of published research are complex and probably do not act uniformly over time.

Speculation about factors contributing to the gradual increase of quality of research is relevant. Obviously the effect of the computer on the expansion of knowledge of statistics and research methodology has been great. The ability to analyze data from complex designs has been of direct benefit to the counseling profession. The problem of adequate controls for studies with human subjects has been somewhat relieved by the readily available alternatives provided by computer data analysis for statistical controls or complex designs with blocking variables. The improvement of the instruction of research methodology, or a change in the requirements for a professional certificate or degree to include research methodology, could be contributing to the gradually improving trend. With the increasing number of manuscripts submitted to professional publications, the criteria for acceptance could be changing to require better quality research now than in the past. Hopefully, investigations such as this will have impact on researchers, members of the profession, and editors toward improving the quality of research literature.

Possible factors which have inhibited the development of higher quality research should be considered. The fields of counseling and counselor education have not received as much financial support for development and research as some of the other applied sciences. This may mean that the motivation to accomplish sophisticated research is affected. The fields are also quite young, with research holding a lower priority than in more mature professions. The trend of training counselors as practitioners rather than researchers has surely affected the quality of published research. As the profession matures, research should be established as a respectable priority among its members.

The postulation by several counselor educators (Carkhuff, 1965; Myers, 1966; Patterson, 1963) that the quality of research in counseling and counselor education has been improving over time is supported by this empirical investigation of methodology, but with the cautions previously stated. The significant linear trend demonstrated that there is a slightly positive trend in the quality of research in counseling and counselor education. The results of this study, however, are applicable only to the population of experimental studies of counseling and counselor education research. The conclusions of others (Gazda & Larsen, 1968; Hansen & Warner, 1971; Pawlicki, 1970; Thoresen, 1969) that there is a lack of well-planned and executed research are also partially supported, as demonstrated by the examination of the confidence intervals for the dependent measures for the last year span. The quality of reporting for recent experimental publications is relatively high, as evidenced by a mean of 4.89 for the overall rating of reporting for the year span 1971-1973. However, the quality of methodology was rated less than "barely adequate"; the mean for the overall methodology rating was 3.87.
While the quality of the reporting of an experimental project is important for replication and communication within the profession, the impact of poor quality methodology is greater than that of poor quality reporting. Misleading or false results can be costly, especially in fields that deal with human beings.

In an effort to provide evaluation of specific aspects of experimental methodology, the means for the items for the last year span, 1971 - 1973, were examined (see Appendix E). Aspects of methodology that were rated below the meaningful significance criterion of 3.51 included descriptive statements about the reliability and validity of dependent measures, random selection of the sample, consideration of hypothesis-testing assumptions, and three items in the discussion section. Low ratings for the two items covering dependent measures and statistical assumptions have less impact on the overall quality of counseling research than the others. Of significant impact are the low ratings of the random sample selection item and the discussion items. It is probable that many readers, especially those with inadequate knowledge of research methodology, look to the discussion section for conclusions without careful consideration of the previous sections of the experimental report. Thus, considering the rated inadequacy of generalization statements (item 31), too many researchers are misrepresenting the applicability of their results, and too many consumers are possibly not perceiving the illegitimate generalizations. Such occurrences are potentially harmful to the profession and to clients. The continued growth of counseling research is also hampered by such practices.

Examination of items that had marginal ratings, means of less than 4.0, for the last year span might be useful. When judges evaluated the subjects on the dependent variable, interrater reliabilities were not consistently reported (item 17). The absence of such statements partially inhibits the reader from evaluating the precision of the analysis. The designs in this sample of experimental studies were rated as only "barely adequate" in providing the maximum precision possible, given the data the researcher had (item 21). This result concurs with Cohen's (1962) conclusion that a majority of published research provides only a minimum degree of precision. The evaluated research also only marginally controlled for unbiased treatment effects (item 22). Adequate controls continue to be a problem in counseling research, as has been studied by Kelley, et al. (1970) and commented on by Calvin (1954), Harrison (1971), and Patterson (1966).

In the results section the means of two items fall in the 3.51 to 4.00 span. Ratings for items 28 and 29 can be interpreted to mean that there is marginal consistency in reporting the descriptive statistics of dependent measures and the pertinent information of the hypothesis tests performed. These are essential components of a results section, especially for the professional who carefully examines the correctness of data analyses.

In the discussion section, items 33 and 35 had means between 3.51 and 4.00. Apparently researchers were marginally consistent in comparing their findings to theory or previous research. Of more importance was the marginal appropriateness of causal statements made in conclusions. This could refer to causal statements made when the design does not allow such conclusions or when the results do not warrant them. Such incorrect statements are misleading.
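The item-level screening just described can be restated compactly. The sketch below applies the two cutoffs used in this discussion -- the meaningful-significance criterion of 3.51 and the marginal band below 4.00 -- to the 1971 - 1973 item means excerpted from Appendix E; the parenthetical item descriptions are paraphrases, not the instrument's full wording.

    # 1971-1973 item means excerpted from Appendix E (item number: mean).
    item_means = {
        16: 2.82,  # reliability and validity of dependent measures
        17: 3.92,  # interrater reliabilities reported
        19: 1.53,  # random selection from the population
        21: 3.73,  # design provides maximum precision
        22: 3.84,  # unbiased treatment effects
        25: 1.87,  # hypothesis-testing assumptions
        28: 3.92,  # descriptive statistics by group
        29: 3.68,  # test statistic, df, and p-value
        31: 3.49,  # allowable generalizations
        32: 3.11,  # limitations acknowledged
        33: 3.66,  # findings compared to theory or research
        34: 3.50,  # suggestions for further investigation
        35: 3.62,  # appropriateness of causal inferences
    }

    for item, mean in sorted(item_means.items()):
        if mean < 3.51:
            verdict = "below the meaningful-significance criterion"
        elif mean < 4.00:
            verdict = "marginal"
        else:
            verdict = "adequate"
        print(f"item {item}: mean {mean:.2f} -- {verdict}")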
In summary, for this discussion of meaningful results, the reporting and introduction sections of the EIEM have no items with means of less than 4.00. This is consistent with previously reported results. However, the method section contains five of eight items with means less than 4.00, a criterion indicating at best a marginally adequate rating. The results section has three of seven items so rated, and the discussion section has five of six items with means below 4.00. The method and discussion sections of experimental studies should particularly be noted for inadequacies. These conclusions agree with the previous analyses of the dependent measures derived from the EIEM.

Recommendations

For subsequent investigations of the quality of experimental methodology, continuing refinement of the evaluation instrument is recommended. One revision could be the construction of a scale with greater detail or a larger span to prevent a ceiling or floor effect, the effect which results from the frequent use of maximum or minimum scale values. Such revision would add clarity to the results derived from the instrument and might contribute to increased interrater reliabilities. Longer training sessions for the raters would also probably increase reliability estimates.

Recommendations for subsequent investigations include evaluation of other types of research in counseling and counselor education, most notably correlational research. Such an investigation would round out the evaluation of the quality of research in these fields. Evaluation of future years of counseling research would also be beneficial and could build on the present investigation to establish more firmly the trend of improving research.

As has been emphasized in Chapter I, this investigation strenuously avoided evaluation of the content and relevance of counseling research. An evaluation of this essential aspect of the profession's research is strongly recommended. It would require noted professionals as evaluators and would be an extremely difficult task to operationalize. However, for a complete estimation of the state of research in the profession such an evaluation is essential.

Many recommendations to researchers have been covered in Chapter IV, namely those aspects of research reports to avoid which contribute to questionable and deceptive experimental studies. Those points of importance to experimental research were operationalized in the EIEM. An additional recommendation is for more researchers and editors to consider replication of previous research as a valuable professional effort, which is necessary to build a reliable research base for the profession. The state of counseling and counselor education research would benefit significantly. Currently few studies and results are challenged; the pace of improvement in quality could be speeded by such a tactic.

The recommendation for the research consumer, as well as the counselor educator, is to carefully consider all aspects of a research report. For the experimental studies of the years studied, the method, results, and discussion sections were shown to have the highest probability of error or misleading statements. These sections also have the biggest impact on the significance of the results of an experimental study. For those counselor educators who teach research skills, the examination of the ratings of individual items of the rating instrument points to those areas which should be stressed. The instrument itself could be used as a learning tool for the counselor.
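Because raising interrater reliability recurs in these recommendations, a brief sketch of an analysis-of-variance reliability estimate in the spirit of Hoyt (1941), which is cited in the references, may be useful to future raters and investigators. The two-rater data below are invented solely for illustration, and the computation is offered as one reasonable reading of that method rather than a transcription of the procedure used in this study.

    import numpy as np

    # ratings[i, j] = rating of study i by rater j (invented illustration data).
    ratings = np.array([
        [4, 5], [3, 3], [5, 6], [2, 3], [4, 4],
        [3, 4], [5, 5], [2, 2], [4, 5], [3, 3],
    ], dtype=float)

    n_studies, n_raters = ratings.shape
    grand_mean = ratings.mean()

    ss_total = ((ratings - grand_mean) ** 2).sum()
    ss_studies = n_raters * ((ratings.mean(axis=1) - grand_mean) ** 2).sum()
    ss_raters = n_studies * ((ratings.mean(axis=0) - grand_mean) ** 2).sum()
    ss_residual = ss_total - ss_studies - ss_raters

    ms_studies = ss_studies / (n_studies - 1)
    ms_residual = ss_residual / ((n_studies - 1) * (n_raters - 1))

    # Hoyt (1941): reliability from the two-way studies-by-raters ANOVA.
    reliability = (ms_studies - ms_residual) / ms_studies
    print(f"estimated reliability = {reliability:.2f}")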
Conclusion

The systematic evaluation of the methodology and reporting of research in counseling and counselor education revealed mixed results. The quality of reporting was quite good, while the quality of experimental methodology was barely mediocre. Despite the trend of increasing quality, the research in these fields must be viewed critically. Whiteley's (1967) comment that poorly formulated research is not only worthless but deceptive should be heeded by counseling researchers, editors, and research consumers in an effort to upgrade the profession's research, protect future clients and trainees, and promote better counseling service and training.

APPENDICES

APPENDIX A

FREQUENCY COUNT OF THE NUMBER OF EXPERIMENTAL, CORRELATIONAL, AND MISCELLANEOUS STUDIES FOR THE THREE JOURNALS FOR THE FOUR YEAR SPANS

                                     1962-   1965-   1968-   1971-
                                     1964    1967    1970    1973   Totals

Journal of Counseling     Exp          24      53      75      93     245
  Psychology              Corr         67      96     135     132     430
                          Misc         10      17      21      17      65

Personnel and Guidance    Exp          12      19      26       4      61
  Journal                 Corr        113     138      89       1     341
                          Misc         44      31      16       0      91

Counselor Education and   Exp           2       4      23      28      57
  Supervision             Corr          5      32      26      36      99
                          Misc          5      16      21      10      52

Totals                    Exp          38      76     124     125     363
                          Corr        185     266     250     169     870
                          Misc         59      64      58      27     208
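The sample evaluated in this study was drawn at random within each year span from the experimental studies counted above. The sketch below illustrates only that drawing step, using the experimental totals from the table; the seed and the study numbering are arbitrary, and the actual procedure is the one described in Chapter II.

    import random

    # Experimental studies in the population by year span (Appendix A totals).
    population_sizes = {"1962-1964": 38, "1965-1967": 76, "1968-1970": 124, "1971-1973": 125}

    random.seed(1974)  # arbitrary seed, for a reproducible illustration
    samples = {
        span: sorted(random.sample(range(1, size + 1), 38))  # 38 studies per span
        for span, size in population_sizes.items()
    }
    # Note: the 1962-1964 span contains exactly 38 experimental studies,
    # so the "sample" there is necessarily the entire population.
    for span, chosen in samples.items():
        print(span, chosen[:5], "...")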
APPENDIX B

EVALUATION INSTRUMENT FOR EXPERIMENTAL METHODOLOGY

ARTICLE NUMBER _____     RATER _____

TITLE OF ARTICLE _________________________________________________

TYPE OF DESIGN:    ___ Pre-experimental   ___ True-experimental   ___ Quasi-experimental

STATISTIC USED (to test main hypotheses):
    ___ ANOVA (type: _____)   ___ ANCOVA   ___ MANOVA   ___ t or z tests
    ___ Nonparametric (name: _____)   ___ Correlation   ___ Factor analysis
    ___ Other (name: _____)

TYPE OF RESEARCH:  ___ Applied: Process   ___ Applied: Outcome   ___ Basic research

COMMENTS:

Note: The items are grouped according to convention, but the content of an individual item may be found anywhere in the study. Rate each item using the following rating scale:

    1              2             3             4            5             6
    strongly       clearly       barely        barely       clearly       strongly
    disagree       disagree      disagree      agree        agree         agree
       OR             OR            OR            OR           OR            OR
    not at all     clearly       barely        barely       clearly       excellently
    accomplished   inadequate    inadequate    adequate     adequate      accomplished
    (absent)
       OR             OR            OR            OR           OR            OR
    90-100%        70-89%        51-69%        51-69%       70-89%        90-100%
    inappropriate  inappropriate inappropriate appropriate  appropriate   appropriate

REPORTING (attend to the quality of reporting, not to the content of the item)

1. The review of the literature is concise, understandable, and logical.
2. The research hypothesis is clearly stated.
3. The population of interest is clearly specified.
4. The procedure for selection of subjects is clearly specified.
5. The subjects are completely described on relevant variables.
6. The treatment procedures are clearly enough defined to allow for replication.
7. All statistics used in the analysis are named.
8. The results are clearly and concisely reported (no unnecessary data are included).
9. The discussion is understandable and concisely written.

Give an overall rating of the quality of reporting of this study.

INTRODUCTION

10. The purpose of the study is clearly stated.
11. The review of the literature is relevant to the problem and independent variables of interest.
12. Research hypotheses are stated for all variables (if exploratory, this is stated clearly).
13. Each independent variable and its levels are clearly described; the design is clearly enough described to allow you to diagram it.
14. An excellent rationale is given for the use of the particular dependent variables chosen.

METHODS

15. The dependent measures are the most appropriate for the purpose of the study.
16. The reliability and validity data are given for each instrument used as a dependent measure.
17. The interrater reliabilities are given if raters are used.
18. The stated population (not sample) is the relevant one in terms of the nature of the problem and hypotheses.
19. Subjects were randomly selected from the population.
20. Subjects were randomly assigned to treatment groups.
21. Given the data collected by the researcher, the design is such that it provides the maximum precision possible.
22. The design allows for unbiased treatment effects; there are no confounding or uncontrolled irrelevant variables which confuse the results; necessary controls for internal validity are either built into the design or statistically managed: regression, subject mortality, instrumentation, history, maturation, testing, selection bias, selection-maturation interaction.

RESULTS

23. The best statistical analysis for the design, data, and hypotheses was used.
24. The data analysis is consistent with the design.
25. The authors gave evidence that the assumptions necessary for the hypothesis test statistic(s) were met (normality, independence, equality of variance, additivity, etc.).
26. The unit of analysis is equal to the experimental unit.
27. Specific answers to the hypotheses are given.
28. Means and variances or standard deviations are given for each dependent variable according to groups.
29. The results section includes values of the test statistic, df, and p-value (for ANOVA the MSs are given).

DISCUSSION

30. The conclusions drawn are consistent with the data results and hypotheses.
31. The author generalizes to the population, treatments, or settings allowable by the design.
32. If there were limitations of design, sampling, data collection, or data analysis, the author indicates the qualifications to his study which limit inference.
33. The author compares his findings to previous research findings or to a theory.
34. The author makes suggestions for further investigation which logically follow from his study.
35. The causal inferences made were entirely appropriate according to the design, sampling, and analysis.

Make an overall rating of the quality of the methodology of this study (considering items 10 - 35 and not those items in the reporting section).

Comments or any errors which you found not covered in the preceding items:
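The instrument above yields six quality measures: one for each of the five sections and a total. As a minimal sketch of how a completed form might be scored, the code below takes each scale to be the mean of its section's item ratings and the total to be the mean over all 35 items; this grouping follows the section labels above, but the exact scoring rule is the one defined in Chapter II, so the sketch should be treated as an approximation.

    # EIEM section membership, following the item numbers above.
    SECTIONS = {
        "reporting": range(1, 10),       # items 1-9
        "introduction": range(10, 15),   # items 10-14
        "method": range(15, 23),         # items 15-22
        "results": range(23, 30),        # items 23-29
        "discussion": range(30, 36),     # items 30-35
    }

    def score_eiem(ratings):
        """Return the six scale scores for one study's item ratings (1-6 scale)."""
        scores = {
            name: sum(ratings[i] for i in items) / len(items)
            for name, items in SECTIONS.items()
        }
        scores["total"] = sum(ratings.values()) / len(ratings)
        return scores

    # A study rated 4 ("barely adequate") on every item scores 4.0 throughout.
    print(score_eiem({item: 4.0 for item in range(1, 36)}))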
APPENDIX C

RELEVANT DEFINITIONS

Pre-experimental Design: Any design which has a treatment group but no reasonable comparison group. Causal statements cannot be made. Examples:
  1. One-shot case study:                 X O
  2. One-group pretest-posttest design:   O X O
  3. Static-group comparison:             X O
                                          ----
                                            O

True-experimental Design: Random assignment occurs to at least one treatment and one control group, or to several treatment groups. Causal statements are appropriate. Examples:
  1. Pretest-posttest control group design:   R O X O
                                              R O   O
  2. Solomon four-group design:               R O X O
                                              R O   O
                                              R   X O
                                              R     O
  3. Posttest-only control group design:      R X O
                                              R   O

Quasi-experimental Design: For field settings where complete control of experimental stimuli is impossible; the "when" and "to whom" of measurement is controllable, while the "when" and "to whom" of stimulus exposure and the ability to randomize exposures are not controllable. Causal inferences cannot be made. Examples:
  1. Time series:                            O O O O X O O O O
  2. Equivalent time samples design:         X1O X0O X1O X0O
  3. Nonequivalent control group design:     O X O
                                             ------
                                             O   O
  4. Counterbalanced design:                 X1O X2O X3O X4O
                                             X2O X4O X1O X3O
                                             X3O X1O X4O X2O
                                             X4O X3O X2O X1O
  5. Separate-sample pretest-posttest design:  R O (X)
                                               R    X O

Applied Research - Process: A study whose purpose is the investigation of variables directly related to the practice and process of counseling (e.g., interaction variables, counselee variables, technique variables, counselor variables, etc.). None of the dependent variables are measures of the success of a counseling contact.

Applied Research - Outcome: A study whose purpose is the investigation of variables directly related to the end result of counseling -- successful treatment of a problem. At least one of the dependent variables is related to the end objective of counseling (successful information-seeking behavior, a decision made, higher grades, more self-actualized, etc.).

Basic Research: Laboratory research whose purpose is to define and refine the constructs of theories which, though ultimately applicable, are not directly applicable to counseling or counselor education.

APPENDIX D

NOTES ON RATING

BEFORE YOU BEGIN TO RATE THE FIRST ARTICLE, READ THE RATING FORM TO ACQUAINT YOURSELF WITH THE MINOR CHANGES THAT HAVE BEEN MADE.

1. Of utmost importance is the accuracy of your ratings. Therefore, I suggest that you rate only several articles at any one sitting. This is to avoid any interaction between articles, as well as to avoid a fatigue effect.
2. Frequently consult the notes that you took during the training session. The objective is to maintain the same set of criteria for all raters across all articles.
3. Freely consult any relevant sources, such as notes from statistics classes, statistics texts, experts, and especially Campbell and Stanley.
4. Rate the studies in the order given to you -- alphabetically, A to HHH.
5. Remember that the first section on "Reporting" is an evaluation of the clarity of the reporting and not an evaluation of the appropriateness or adequacy of the content of the particular item.
6. Leave out any question which clearly does not apply to a particular article. However, this should occur very infrequently.
7. Comment freely on a particular article, noting especially any weaknesses which were not picked up in the standard items.
8. The information to answer an item may be found anywhere in the study.
9. Keep an accurate accounting of the time you spend rating.
10. If you have questions or problems, call me at 517-337-0545 or leave a message at 517-353-9242 (Department of Psychiatry).

GOOD LUCK -- and I hope that this is as much a learning experience as a money-earning one for you. I appreciate the effort that you are contributing to my project.
APPENDIX E

MEANS AND STANDARD DEVIATIONS FOR ITEMS OF THE EVALUATION INSTRUMENT FOR EXPERIMENTAL METHODOLOGY

                1962-1964      1965-1967      1968-1970      1971-1973
Item           Mean   S.D.    Mean   S.D.    Mean   S.D.    Mean   S.D.

Reporting
  1            4.50   1.45    5.13    .88    4.76   1.13    5.37    .59
  2            4.68   1.14    4.79   1.09    4.66   1.21    4.84    .97
  3            4.82    .95    5.05   1.18    5.03    .91    4.95    .90
  4            4.76   1.36    4.87   1.28    4.89    .98    4.89   1.03
  5            3.92   1.17    3.82   1.35    3.82   1.01    4.11   1.29
  6            4.34   1.48    4.61   1.17    4.42   1.06    4.97    .75
  7            4.59   1.48    4.92   1.38    4.97   1.44    5.24   1.13
  8            4.26   1.20    4.53   1.06    4.42   1.20    4.76    .97
  9            4.84    .79    4.89    .80    4.71    .84    4.95    .96

Introduction
 10            5.21    .91    5.34    .63    5.34    .67    5.37    .63
 11            4.13   1.58    4.95    .90    4.50   1.13    5.00    .93
 12            4.32   1.19    4.53   1.29    4.26   1.43    4.61   1.24
 13            4.87   1.49    5.11   1.18    5.16    .97    5.32    .85
 14            3.63   1.36    4.13   1.32    4.11   1.23    4.24   1.19

Method
 15            4.71   1.04    4.76   1.08    4.66   1.17    4.92    .94
 16            1.94f  1.55    2.31c  1.66    1.53e   .98    2.82d  1.91
 17            3.45i  2.11    3.37h  2.41    4.53i  2.10    3.92g  2.34
 18            5.29    .84    4.97   1.42    5.34    .75    5.30    .74
 19            1.42   1.31    1.79   1.61    1.47   1.35    1.53   1.37
 20            2.63c  2.34    3.79   2.46    3.74   2.41    4.06c  2.33
 21            3.16   1.38    3.50   1.45    3.70   1.29    3.73   1.57
 22            2.79   1.73    3.55   1.70    3.79   1.71    3.84   1.87

Results
 23            3.21   1.49    3.92   1.26    3.68   1.45    4.03   1.40
 24            3.89   1.57    4.36b  1.22    4.24   1.38    4.42   1.37
 25            1.42    .95    1.87   1.49    1.81   1.33    1.87   1.51
 26            3.89   2.12    3.58   2.14    3.53   2.32    4.16   2.24
 27            4.61   1.03    4.95   1.11    4.62   1.14    4.76    .91
 28            2.79   1.88    3.43   2.08    3.13   2.09    3.92   2.06
 29            3.41   1.54    3.87   2.09    3.78   1.80    3.68   1.88

Discussion
 30            4.34   1.05    4.82    .83    4.32   1.16    4.55   1.18
 31            3.29   1.58    3.61   1.67    3.37   1.70    3.49   1.48
 32            2.79   1.70    2.97   1.70    3.03   1.70    3.11   1.71
 33            2.84   1.76    3.45   1.75    3.03   1.87    3.66   1.74
 34            3.00   1.96    3.32   1.88    3.34   1.65    3.50   1.69
 35            3.22b  1.85    3.53b  1.78    3.11   1.90    3.62   1.60

Overall
 Reporting     4.45    .95    4.68    .81    4.53    .76    4.89    .89
 Methodology   3.34   1.15    3.76    .94    3.68   1.12    3.87   1.12

Note: Superscripts mark cells computed on fewer than 38 studies:
a n = 37; b n = 36; c n = 35; d n = 34; e n = 32; f n = 31; g n = 25; h n = 19; i n = 15; j n = 11.

APPENDIX F

UNIVARIATE TESTS OF THE SIX DEPENDENT VARIABLES FOR A LINEAR TREND

               F-ratio     df       p
Reporting       6.224    1,148    .014
Introduction    8.808    1,148    .004
Method          8.867    1,148    .003
Results         5.486    1,148    .021
Discussion      1.575    1,148    .211
Total           8.746    1,148    .004

APPENDIX G

PRINCIPAL COMPONENTS OF THE CORRELATION MATRIX FOR THE SIX DEPENDENT VARIABLES OF THE EIEM

Variable        Component 1    Component 2
Reporting         -.8413         -.3509
Introduction      -.7950         -.3848
Method            -.7764         -.0574
Results           -.7518         +.2424
Discussion        -.6612         +.6421
Total             -.9898         +.0393

Percent of variation explained by Component 1 = 65.42
Percent of variation explained by Component 2 = 12.45

APPENDIX H

NINETY-FIVE PERCENT CONFIDENCE INTERVALS FOR ESTIMATED MEANS OF THE SIX DEPENDENT MEASURES OF THE EIEM

                    Univariate            Multivariate
Measure          Lower    Upper        Lower    Upper
                 limit    limit        limit    limit
Reporting         3.93     5.76         2.73    6.00+
Introduction      3.83     5.96         2.42    6.00+
Method            2.47     5.08          .75    6.00+
Results           2.47     5.13          .72    6.00+
Discussion        2.12     5.13          .14    6.00+
Total             3.24     5.13         2.00    6.00+
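For the reader who wishes to see the arithmetic behind a confidence interval of this general kind, a minimal univariate sketch follows, using the 1971 - 1973 overall methodology mean and standard deviation from Appendix E with n = 38. It is an elementary single-sample interval only: the intervals tabled above come from the fitted multivariate model and a simultaneous procedure, so they are considerably wider than what this sketch will produce.

    import numpy as np
    from scipy import stats

    mean, sd, n = 3.87, 1.12, 38  # 1971-1973 overall methodology (Appendix E)

    half_width = stats.t.ppf(0.975, df=n - 1) * sd / np.sqrt(n)
    print(f"95% CI for the span mean: ({mean - half_width:.2f}, {mean + half_width:.2f})")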
REFERENCES

Barker, H. R., & Gurman, E. B. Replication versus tests of equivalence. Perceptual and Motor Skills, 1972, 35, 807-815.

Bordin, E. S., Cutler, R. L., Dittman, A. T., Harway, N. I., Raush, H. L., & Rigler, D. Measurement problems in process research on psychotherapy. Journal of Consulting Psychology, 1954, 18, 79-82.

Borg, W. R. Educational research: An introduction. New York: David McKay Co., 1963.

Burck, H. D., Cottingham, H. F., & Reardon, R. C. Counseling and accountability: Methods and critique. New York: Pergamon Press, Inc., 1973.

Calvin, A. D. Some misuses of the experimental method in evaluating the effect of client-centered counseling. Journal of Counseling Psychology, 1954, 1, 249-251.

Campbell, D. T., & Stanley, J. C. Experimental and quasi-experimental designs for research. Chicago: Rand McNally, 1963.

Carkhuff, R. R. Counseling research, theory, and practice--1965. Journal of Counseling Psychology, 1966, 13, 467-480.

Cohen, J. The statistical power of abnormal-social psychological research: A review. Journal of Abnormal and Social Psychology, 1962, 65, 145-153.

Coleman, W. The role of evaluation in improving guidance and counseling services. Personnel and Guidance Journal.

Cornfield, J., & Tukey, J. W. Average values of mean squares in factorials. Annals of Mathematical Statistics, 1956, 27, 907-949.

Cotton, M. C., & Anderson, W. P. Citation changes in the Journal of Counseling Psychology. Journal of Counseling Psychology, 1973, 20, 272-274.

Crittenden, R. L. Comment on "Group reactive inhibition and reciprocal inhibition therapies with anxious college students." Journal of Counseling Psychology, 1973, 20, 353-355.

Cronbach, L. J., & Suppes, P. Research for tomorrow's schools: Disciplined inquiry for education. Toronto, Ontario: The Macmillan Co., 1969.

Davitz, J. R., & Davitz, L. J. A guide for evaluating research plans in psychology and education. New York: Teachers College Press, 1967.

Dressel, P. L. Implications of recent research for counseling. Journal of Counseling Psychology, 1954, 1, 100-105.

Dressel, P. L. Some approaches to evaluation. Personnel and Guidance Journal, 1953, 31, 284-287.

Eastwood, G. R. A note on hypothesis testing. Alberta Journal of Educational Research, 1967, 13, 265-273.

Edgington, E. S. A tabulation of inferential statistics used in psychology journals. American Psychologist, 1964, 19, 202-203.

Edwards, A. L., & Cronbach, L. J. Experimental design for research in psychotherapy. Journal of Clinical Psychology, 1952, 8, 51-59.

Farquhar, W. W., & Krumboltz, J. D. A checklist for evaluating experimental research in psychology and education. Journal of Educational Research.

Feldman, C. F., & Hass, W. A. Controls, conceptualization, and the interrelation between experimental and correlational research. American Psychologist, 1970, 25, 633-635.

Finn, J. Multivariance: Fortran program for univariate and multivariate analysis of variance and covariance. Buffalo: State University of New York at Buffalo, 1967.

Fisher, M. B., & Roth, R. M. Structure: An essential framework for research. Personnel and Guidance Journal, 1961, 39, 639-644.

Ford, D. H. Research approaches to psychotherapy. Journal of Counseling Psychology, 1959, 6, 55-60.

Foreman, M. E. Publication trends in counseling journals. Journal of Counseling Psychology, 1966, 13, 481-485.

Gazda, G. M., & Larsen, M. J. A comprehensive appraisal of group and multiple counseling research. Journal of Research and Development in Education, 1968, 1(2), 57-66.

Glass, G. V., & Robbins, M. P. A critique of experiments on the role of neurological organization in reading performance. Reading Research Quarterly, 1967, 3(1), 5-51.

Glass, G. V., & Stanley, J. C. Statistical methods in education and psychology. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1970.

Goodstein, L. D. The institutional sources of articles in the Journal of Counseling Psychology. Journal of Counseling Psychology, 1963, 10, 94-95.

Hansen, J. C., & Warner, R. W. Review of research on practicum supervision. Counselor Education and Supervision, 1971, 10, 261-272.

Harrison, R. Research on human relations training: Design and interpretation. Journal of Applied Behavioral Science, 1971, 7, 71-85.

Herr, E. L. Basic issues in research and evaluation of guidance services. Counselor Education and Supervision, 1964, 2, 9-16.

Hilgard, E. Introduction to psychology. New York: Harcourt, Brace & World, 1962.
Hobbs, N., & Seeman, J. Counseling. In C. P. Stone & Q. McNemar (Eds.), Annual review of psychology (Vol. 6). Stanford, Calif.: Annual Reviews, Inc., 1955, pp. 379-404.

Holland, J. L. Vocational guidance for everyone. Educational Researcher, 1974, 3, 9-15.

Hoyt, C. J. Test reliability estimated by analysis of variance. Psychometrika, 1941, 6, 153-160.

Isaac, S., & Michael, W. B. Handbook in research and evaluation. San Diego: Robert R. Knapp, 1971.

Jensen, B. T., Coles, G., & Nestor, B. The criterion problem in guidance research. Journal of Counseling Psychology, 1955, 2, 58-61.

Kelley, J., Smits, S. L., Leventhal, R., & Rhodes, R. Critique of the designs of process and outcome research. Journal of Counseling Psychology, 1970, 17, 337-341.

Kiesler, D. J. Basic methodological issues implicit in psychotherapy process research. American Journal of Psychotherapy, 1966a, 20, 135-155.

Kiesler, D. J. Some myths of psychotherapy research and the search for a paradigm. Psychological Bulletin, 1966b, 65, 110-136.

Krause, M. S. Experimental control as sampling problem in counseling and therapy research. Journal of Counseling Psychology, 1972, 19, 340-346.

Lachenmeyer, C. W. Experimentation--a misunderstood methodology in psychological and social-psychological research. American Psychologist, 1970, 25, 617-624.

Lykken, D. T. Statistical significance in psychological research. Psychological Bulletin, 1968, 70, 151-159.

Marks, S. E., Conry, R. F., & Foster, S. F. The marathon group hypothesis: An unanswered question. Journal of Counseling Psychology, 1973, 20.

Meltzoff, J., & Kornreich, M. Research in psychotherapy. New York: Atherton Press, Inc., 1970.

Mills, D. H., & Mencke, R. Characteristics of effective counselors: A re-evaluation. Counselor Education and Supervision, 1967, 6, 332-333.

Myers, R. A. Research in counseling psychology--1964. Journal of Counseling Psychology, 1966, 13, 371-379.

Nunnally, J. The place of statistics in psychology. Educational and Psychological Measurement, 1960, 20, 641-650.

Orne, M. T. On the social psychology of the psychological experiment. American Psychologist, 1962, 17, 776-783.

Patterson, C. H. Counseling. Annual Review of Psychology, 1966, 17.

Patterson, C. H. Program evaluation. Review of Educational Research, 1963, 33, 214-224.

Patterson, C. H. Methodological problems in evaluation. Personnel and Guidance Journal, 1960, 39, 270-274.

Patterson, C. H. Matching versus randomization in studies of counseling. Journal of Counseling Psychology.

Patterson, C. H. Comment. Journal of Counseling Psychology, 1955, 2, 154-155.

Pawlicki, R. Behavior-therapy research with children: A critical review. Canadian Journal of Behavioural Science, 1970, 2, 163-173.

Roberts, K. H. Understanding research: Some thoughts on evaluating completed educational projects. Stanford, California: ERIC, 1969, ED 032 759.

Samler, J. Comments. Personnel and Guidance Journal.

Scheifley, V. M. Program for correction factor for a finite population. Unpublished.

Schmidt, L. D., & Pepinsky, H. B. Counseling research in 1963. Journal of Counseling Psychology, 1965, 12.

Sieka, F., Taylor, D., Thomason, B., & Muthard, J. A critique of "Effectiveness of counselors and counselor aides." Journal of Counseling Psychology, 1971, 18, 362-364.

Smith, N. C. Replication studies: A neglected aspect of psychological research. American Psychologist, 1970, 25, 970-975.
Smith, O. W., Smith, P. C., Scheffers, J., & Steinmann, D. Common errors in reports of psychological studies. Perceptual and Motor Skills, 1971, 32, 3-7.

Spithill, A. C. To leave a scratch on the wall: Getting published. Personnel and Guidance Journal, 1973, 52, 35-38.

Stanley, J. C. Quasi-experimentation in educational settings. The School Review, 1967, 75, 343-352.

Stone, S. C., & Shertzer, B. Ten years of the Personnel and Guidance Journal. Personnel and Guidance Journal, 1964, 42, 958-969.

Thoresen, C. E. Relevance and research in counseling. Review of Educational Research, 1969, 39, 263-281.

Tukey, J. W. Analyzing data: Sanctification or detective work? American Psychologist, 1969, 24, 83-91.

Tversky, A., & Kahneman, D. Belief in the law of small numbers. Psychological Bulletin, 1971, 76, 105-110.

Whiteley, J. M. (Ed.) Research in counseling. Columbus, Ohio: Charles E. Merrill Publishing Co., 1967.

Whiteley, J. M., & Allen, T. W. Suggested modifications in scientific inquiry and reporting of counseling research. Counseling Psychologist, 1969, 1(2), 84-88.