Michigan State University

This is to certify that the thesis entitled "Client Satisfaction and Meaningful Change in Psychotherapy", presented by George Yunus Ankuta, has been accepted towards fulfillment of the requirements for the M.A. degree in Psychology.

CLIENT SATISFACTION AND MEANINGFUL CHANGE IN PSYCHOTHERAPY

By George Yunus Ankuta

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of MASTER OF ARTS

Department of Psychology
1988

ABSTRACT

The purpose of this study was to evaluate the use of "clinical significance" in psychotherapy data analysis. Clinical significance was defined, in part, by whether the client's scores on a symptom measure moved from the dysfunctional to the functional range by the termination of treatment. Seventy-five adult psychotherapy clients were divided into three groups based on degree of psychological disturbance. The hypotheses investigated were: 1) a group of psychotherapy clients showing clinically significant symptom changes will report greater satisfaction and benefit from psychotherapy than a group of clients whose changes fell solely within the statistically significant improvement range, and 2) psychotherapy clients showing statistically significant improvement will report greater satisfaction and benefit from psychotherapy than a group of clients who did not improve statistically or clinically. The first hypothesis was supported.
It is suggested that using clinical significance in psychotherapy data analysis is a way of bridging the researcher-practitioner gap by providing a measure of meaningful change which appears to have social validity.

To my parents

ACKNOWLEDGMENTS

I would like to express my deepest gratitude to my thesis committee, Dr. Norman Abeles, Dr. Bertram Karon, and Dr. Raymond Frankmann. Dr. Abeles' student-centered approach created an environment in which I enjoyed developing my ideas. His personal support, acceptance, and respect helped make this research possible. His enthusiasm and expertise in psychotherapy research have been inspiring. In addition I would like to express my special appreciation to Lisa Cowden, whose love and warmth have added so much to my life.

TABLE OF CONTENTS

List of Tables
List of Figures
Introduction
Statistical and Clinical Significance Defined
Issues with the Use of Statistical Significance
When has Meaningful Change Occurred in Psychotherapy
Research Reasoning Should Serve the Needs of the Clinical Field it Intends to Study
Research Reasoning Should Match the Reasoning of the Clinical Field it Intends to Study
Hypotheses
Method
Subjects
Materials
Symptom Check List 90 Revised
The Strupp Post Therapy Client Questionnaire
Procedure
The Groups
Operational Definitions of Group Criteria
Results
Discussion
References

LIST OF TABLES

Symptom Check List 90 Revised Data Used in Determining Clinical Significance Outcome Criteria
The Relationship of Statistical Significance and Clinical Significance of Level of Symptom Change to Client Satisfaction
Pairwise Contrasts of Groups by Client Satisfaction
Group Differences on Pre-therapy Symptom Level as Measured by SCL-90-R Scales

LIST OF FIGURES

Figure 1. Hypothetical Data From an Imaginary Measure Used to Assess Change in a Psychotherapy Outcome Study

Introduction

Statistical and Clinical Significance Defined

Statistical significance refers to the evaluation of parameters of distributions using statistical hypothesis testing (Hays, 1981, ch. 7). Clinical significance refers to the effect of a treatment procedure on a single subject (Hugdahl & Ost, 1981). "Clinically significant change has been defined as a large proportion of clients improving (Hugdahl & Ost, 1981), a change which is large in magnitude (Barlow, 1981), an improvement in the client's everyday functioning (Kazdin & Wilson, 1978), a change which is recognizable to peers and significant others (Kazdin, 1977; Wolf, 1978), an elimination of the presenting problem (Kazdin & Wilson, 1978), and the attainment of a level of functioning which is no longer distinguishable from the client's nondeviant peers (Kazdin & Wilson, 1978; Kendall & Norton-Ford, 1982)." (Jacobson et al., 1984, p. 338).

Jacobson, Follette, & Revenstorf (1984) propose a two-condition evaluation to judge the criterion of clinical significance. The first condition is a measure of meaningful change which checks whether the client has moved from the dysfunctional to the functional range. The second condition is whether or not the improvement is statistically reliable. The condition of meaningful change is evaluated by asking if the level of functioning posttreatment suggests that the subject is statistically more likely to be in the functional than in the dysfunctional population.
In other words, is the posttreatment score statistically more likely to be drawn from the functional than the dysfunctional distribution? There is a point we will call "c" where the probabilities of belonging to the functional and dysfunctional populations are equal. If the pretreatment and posttreatment distributions are symmetric and of the same shape, or are mirror images through the ordinate at "c", then "c" can be determined mathematically by the following equation (see Note 1 and Figure 1):

$$c = \frac{s_0 \bar{X}_1 + s_1 \bar{X}_0}{s_0 + s_1}$$

If $X_{post}$ is greater than "c" the client is more likely to be in the functional population. If $X_{post}$ is less than "c" the client is more likely to be in the dysfunctional population.

The condition of reliable change is evaluated by asking if the Reliable Change Index (RC) (Jacobson et al., 1984), defined by the following formula, is greater than 1.96 (see Note 1):

$$RC = \frac{X_{post} - X_{pre}}{S_E}, \qquad S_E = s_1 \sqrt{1 - r_{xx'}}$$

Figure 1. Hypothetical Data From an Imaginary Measure Used to Assess Change in a Psychotherapy Outcome Study (Jacobson et al., 1984). [Figure not reproduced: two distributions labeled Dysfunctional and Functional, with the point c and the scores $X_{pre}$ and $X_{post}$ marked.]

Note 1:
$\bar{X}_1$ = mean of both the pretreatment experimental group and the pretreatment control group
$\bar{X}_0$ = mean of the well functioning population
$X_{pre}$ = pretreatment score of a hypothetical subject
$X_{post}$ = posttreatment score of a hypothetical subject
$s_1$ = standard deviation of the pretreatment control group and pretreatment experimental group
$s_0$ = standard deviation of the well functioning population
$r_{xx'}$ = test-retest reliability of this measure for a dysfunctional sample
$S_E$ = standard error of measurement of this measure

Issues with the Use of Statistical Significance

Though statistical significance tests of parameters of distributions are the prominent method of treatment evaluation, the way they are sometimes used is subject to shortcomings.
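Before turning to these shortcomings, the two-condition criterion defined in the preceding section can be made concrete with a short sketch. It is only an illustration under stated assumptions: the function names, the three outcome labels, and the numeric norms are hypothetical and are not taken from this study, and higher scores are assumed to indicate better functioning, as in the hypothetical measure of Figure 1.

```python
import math


def cutoff_c(mean_dys, sd_dys, mean_fun, sd_fun):
    """Point c where membership in either population is equally likely."""
    return (sd_fun * mean_dys + sd_dys * mean_fun) / (sd_fun + sd_dys)


def reliable_change(x_pre, x_post, sd_dys, r_xx):
    """Reliable Change Index: change divided by the standard error of measurement."""
    s_e = sd_dys * math.sqrt(1.0 - r_xx)
    return (x_post - x_pre) / s_e


def classify(x_pre, x_post, mean_dys, sd_dys, mean_fun, sd_fun, r_xx):
    """Apply both conditions; the three labels are illustrative assumptions."""
    c = cutoff_c(mean_dys, sd_dys, mean_fun, sd_fun)
    rc = reliable_change(x_pre, x_post, sd_dys, r_xx)
    if rc > 1.96 and x_post > c:
        return "clinically significant"
    if rc > 1.96:
        return "reliable change only"
    return "no reliable change"


# Hypothetical norms for an imaginary measure (not data from this study).
NORMS = dict(mean_dys=40.0, sd_dys=10.0, mean_fun=60.0, sd_fun=10.0, r_xx=0.84)

print(cutoff_c(40.0, 10.0, 60.0, 10.0))
print(classify(38.0, 55.0, **NORMS))
```

With equal standard deviations the point c falls midway between the two means. For a measure on which lower scores indicate better functioning, such as a symptom checklist, the comparisons would have to be reversed.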
Several articles recommend supplementing statistical significance tests with tests of clinical significance and suggest measures of clinical significance (Hugdahl & Ost, 1981; Jacobson et al., 1984; Lick, 1973).

There are weaknesses in the way statistical significance tests are used. There is really no good reason to expect the null hypothesis to be true in any population. Examination of any set of statistics on a total population will quickly confirm the rarity of the null hypothesis in nature (Bakan, 1966). Rejecting something that is not likely to be true should not be the end goal of data analysis, although the null hypothesis remains a potential explanation of any finding and needs to be disposed of.

Misconceptions about statistical significance testing lead to its being used alone, inappropriately, for presenting psychotherapy outcome research data. The "odds against chance fantasy" is the misinterpretation of the p value as the probability that the research results were due to chance or caused by chance (Carver, 1978). The "p value" of a statistic is not a probability of that kind. After all, when a statistic is computed one has a number, not a random variable. The p value is calculated by assuming that some specific chance process did produce the mean difference, and the p value is used to decide whether to accept or reject that assumption (Carver, 1978).

Another misconception is the "replication or reliability fantasy", which is the misinterpretation of statistical significance as the probability of obtaining the same result whenever a given experiment is repeated. Nothing in the logic of statistics allows this inference (Carver, 1978). Significance obtained (p < .05) does not mean that if the experiment were repeated 100 times the same difference would occur 95 out of 100 times (Hugdahl & Ost, 1981). It does mean that if the null hypothesis is true, the probability of outcomes in the alpha level rejection region is alpha.
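The distinction drawn here can be illustrated with a small simulation. The sketch below is not part of the original analysis: it assumes normally distributed scores with a known standard deviation of 1, uses a simple two-sided z test, and the group size, true difference, and function names are hypothetical choices.

```python
import math
import random


def two_sided_p(sample_a, sample_b, sd=1.0):
    """Two-sided p value from a z test with known, equal standard deviations."""
    n_a, n_b = len(sample_a), len(sample_b)
    diff = sum(sample_a) / n_a - sum(sample_b) / n_b
    z = diff / (sd * math.sqrt(1.0 / n_a + 1.0 / n_b))
    return math.erfc(abs(z) / math.sqrt(2.0))


def rejection_rate(true_diff, n=20, trials=4000, alpha=0.05, seed=1):
    """Share of simulated experiments whose p value falls below alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p(a, b) < alpha:
            rejections += 1
    return rejections / trials


# When the null hypothesis is true, about alpha of the experiments reject it.
null_rate = rejection_rate(true_diff=0.0)
# With a real difference of half a standard deviation and 20 subjects per
# group, repeated experiments reach significance far less than 95% of the time.
repeat_rate = rejection_rate(true_diff=0.5)

print(f"null true: {null_rate:.3f}")
print(f"real difference of 0.5 SD: {repeat_rate:.3f}")
```

Under these assumptions the first rate should sit near .05, while the second corresponds to the power of the test, which for this configuration is analytically about .35. A significant result therefore says little about how often a repetition of the experiment would again be significant.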
Yet another misconception is the "valid research hypothesis fantasy", which involves concluding that the research hypothesis is true as a result of statistical significance tests of parameters of distributions (Carver, 1978). Scientific hypotheses are different from statistical hypotheses and require more than statistical significance tests in one experiment to support them (Bolles, 1962; Winch, 1969). The scientist uses statistical hypotheses and tests to investigate scientific hypotheses about nature. A statistical test in an experiment, significant or not, is merely one piece of evidence in the scientist's attempt to determine what is true about the natural world and establish support for his or her view. The fit of the statistical model, the plausibility of alternative hypotheses, and all available data must be considered before the scientific hypothesis can be evaluated. Statistics don't know where the numbers come from, but it is up to the scientist to know.

There are limitations on the way statistical significance tests should be used for evaluating psychotherapy outcome data:

1) Statistical significance testing can draw attention from the practical question of the applied importance of behavior change (Kazdin, 1980).

2) Significance testing of parameters of distributions does not usually convey information about single subjects within the sample tested. Significance obtained (p < .05) does not mean that if any randomly selected subject from group A receiving treatment A is compared with any randomly selected subject from group B receiving treatment B, the difference will exist 95% of the time in the predicted direction. What is the probability that treatment A is better than treatment B for a particular client? This cannot be determined by statistical significance testing (Hersen & Barlow, 1976; Hugdahl & Ost, 1981).
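The second limitation can be made concrete with simulated data. The following sketch is an illustration only, under assumptions that are not drawn from this study: scores are normally distributed with a standard deviation of 1, higher scores are better, the true group difference is half a standard deviation, and each group has 400 subjects. All names are hypothetical.

```python
import math
import random


def two_sided_p(sample_a, sample_b, sd=1.0):
    """Two-sided p value from a z test with known, equal standard deviations."""
    n_a, n_b = len(sample_a), len(sample_b)
    diff = sum(sample_a) / n_a - sum(sample_b) / n_b
    z = diff / (sd * math.sqrt(1.0 / n_a + 1.0 / n_b))
    return math.erfc(abs(z) / math.sqrt(2.0))


def share_a_better(sample_a, sample_b):
    """Proportion of all A-B pairs in which the A subject scores higher."""
    wins = sum(1 for a in sample_a for b in sample_b if a > b)
    return wins / (len(sample_a) * len(sample_b))


rng = random.Random(7)
group_a = [rng.gauss(0.5, 1.0) for _ in range(400)]  # hypothetical treatment A
group_b = [rng.gauss(0.0, 1.0) for _ in range(400)]  # hypothetical treatment B

p_value = two_sided_p(group_a, group_b)
share = share_a_better(group_a, group_b)

print(f"p value for the group difference: {p_value:.2g}")
print(f"share of pairs where the A subject did better: {share:.2f}")
```

In a configuration like this the group difference is highly significant, yet a randomly chosen subject from group A outperforms a randomly chosen subject from group B in only about two thirds of the pairs, well short of 95%.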
Another misuse of statistical significance observed by Kazdin (1978) is that the experimenter may use statistical significance in a way that obscures important information. Suppose two treatment groups are compared. In group A, 2 of 10 clients change a large amount, and in group B, 8 of 10 change a small amount. Statistical significance tests of parameters of distributions will not differentiate these data unless they are specifically designed to do so.

Many of the issues with the use of statistical significance testing mentioned above arise in psychotherapy outcome research because practitioners need to know about the variability in improvement data so that they can select the best treatments for their clients. When variability is not reported, readers may use inappropriate means of estimating it. Therefore some aspect of the data analysis must provide information on that variability. Clinical significance, which is evaluated on a case-by-case basis, gives information on the variability of psychotherapy outcome data and is suggested for that purpose. Clinical significance is an apt augmentation to statistical significance tests, providing additional information that practitioners need.

When has Meaningful Change Occurred in Psychotherapy?

Meaningful change has occurred when change enhances the client's everyday functioning. Reduction of hand washing behavior in a client with obsessive compulsive symptoms from 100 to 70 occurrences per hour is not overwhelmingly meaningful from a clinical perspective, since it does not significantly enhance everyday functioning. If after treatment the individual's behavior falls within the normal levels of nonproblem peers, meaningful change has occurred. The extent to which treatment restores adequate levels of functioning needs to be assessed directly (Kazdin & Wilson, 1978).
A unique concern of clinical research is in effecting changes in the client that are clinically significant, that is, changes that actually make a difference in the client's life. Clinically meaningful changes should be dramatic and obvious from the data so that there is no need to refer to statistical tests (Kazdin, 1977).

How do we judge if a change is clinically important? One way to evaluate the clinical importance of a change is to consider its social validity. Judgments about social validity are made in reference to the following: "1) The social significance of the goals. Are the specific behavioral goals really what society wants? 2) The social appropriateness of the procedures. Do the ends justify the means? That is, do the participants, care givers and other consumers consider the treatment procedures acceptable? 3) The social importance of the effects. Are consumers satisfied with the results? All the results, including any unpredicted ones?" (Wolf, 1978).

How would social validity be evaluated? Kazdin (1977, 1980) discusses two ways that have been used: the social comparison method and the subjective evaluation method. In the social comparison method the behavior of the client is compared to the behavior of "nondeviant" peers. The question asked is whether the client's behavior after treatment is distinguishable from the behavior of his or her peers. The social comparison method was used in the social validation and training of conversation skills (Minkin et al., 1976). Junior high school girls were trained in three aspects of effective conversation. When judged against their nontrained peers they were rated as superior in conversation skill.

In the subjective evaluation method the client's behavior is evaluated by individuals who are likely to have contact with that client, to determine whether the change made during treatment is significant. This method has been used by Patterson (1974) in a study of interventions for boys' conduct disorder problems.
Direct observations were made in the boys' homes and classrooms before, during, and after the intervention. Daily reports on the boys' problem behavior were obtained from their parents. Changes in the problem behavior were accompanied by consistent changes in the parents' perceptions.

The crucial factor in the social comparison method is to identify the client's peers. The peers are those individuals similar to the client in subject and demographic variables, but different in performance of the target behaviors. There are two ways the peer group can be used: 1) All individuals in a situation (e.g., a classroom) can be used to determine whose behavior is extreme. 2) The level of behavior of the peers who did not warrant treatment can serve as the criterion by which to assess the success of treatment. If treatment has been successful the client's performance should fall within the normative level of his or her peers (Kazdin, 1977). Normative judgments should be incorporated directly into treatment evaluation.

There are several considerations in using normative data. Occasionally normative standards are inadequate. Sometimes it is the norm that must be changed. For example, classroom performance in an entire school may be too low, or waste disposal in an entire industry may be unsatisfactory. Another consideration is that identifying the normative group can be difficult. In evaluating retarded patients, should the normal group be society normals or untreated retarded people? In a prison situation, should the norm group be nontreated prisoners or nonprisoners (Kazdin, 1977)?

In addition to determining whether society would be satisfied with the individual's change, meaningfulness of change in psychotherapy can be evaluated by considering whether the individual and the mental health practitioner are satisfied with the changes that the individual has made in therapy.
Taken together, these three perspectives on mental health, those of society, the client, and the practitioner, constitute Strupp's (1977) tripartite model of mental health and therapeutic outcome. The model highlights the values brought to bear by the three "interested parties." Mental health and favorable therapeutic outcome can only be achieved when all three "interested parties" are satisfied. What more could the client, the practitioner and society want than the client's return to normal functioning? Returning to the range of normal functioning is an intuitively appealing and nonambiguous measure of therapeutic outcome, especially if one is not overly obsessive about defining the meaning of normality.

Another way to evaluate the meaningfulness of change in therapy is to consider the magnitude of the change. Garfield (1981) suggests that we must look beyond standard significance tests to the extent of change in therapy. He suggests that studies in which the posttreatment outcome measure did not exceed the mid range of the scale are not meaningful, though statistical significance may be achieved, because there is still so much room for improvement. He claims that only large change is clinically meaningful.

Cronbach and Furby (1970) discuss the use of change scores to evaluate the outcome of treatment. Persons may differ on the posttreatment measure more than predicted from the pretreatment score using regression. These positive deviations are subject to many competing explanations other than that these individuals benefited particularly well from treatment. They may have started with some valuable attribute that the pretreatment measure did not encompass. The pretreatment score may have been underestimated. The posttreatment scores may have been overestimated. Their posttreatment success may be an accidental effect arising from some tactic casually adopted during treatment.
Cronbach suggests that most often it is best to use the pretreatment and posttreatment scores as two separate variables in the analysis to allow for more complex relationships. A very disturbed patient, for example, may improve because of the large distance needed to achieve adequate functioning. Level of pathology rather than magnitude of change conveys the information. A neurotic patient may not show as much improvement because as one becomes more functional change may be less noticeable. Much of the important information is in pretreatment scores. Cronbach suggests that investigators who ask questions about gain scores would ordinarily be better served by framing their questions in other ways.

Pretreatment level of pathology may be a factor in improvement in therapy, and alternative explanations for change are always possible; however, the effects of pretreatment and posttreatment symptom level can be evaluated independently of change. The effect of the level of change on client satisfaction is important in itself. Obviously a large change cannot be achieved in an individual with a low level of pathology. A criterion for the evaluation of treatment that identifies those most satisfied with therapy is an important and useful criterion to establish.

The measure of clinical significance suggested by Jacobson et al. (1984) satisfies all the notions of meaningful change discussed above. To meet Jacobson et al.'s criterion of clinical significance the participant must be statistically more likely to be in the normal than the abnormal distribution of scores, and the change must be large enough to be statistically reliable (an RC greater than 1.96, that is, a change of nearly two standard errors of measurement). Requiring that the participants be more likely to be in the normal than the abnormal group guarantees that they will satisfy Kazdin's condition of social validity. The social comparison method would demonstrate that the participant is like his or her nondeviant peers because now he or she is likely to be nondeviant.
The subjective evaluation method would yield favorable ratings from those who are likely to have contact with the participant because now he or she is likely to be more functional and "normal". If the client returns to the normal range of functioning, the client, society, and the mental health practitioner will all be satisfied with the outcome (Strupp, 1977). Jacobson et al.'s criterion of reliable change guarantees that Garfield's (1981) criterion, that meaningful change is change of great magnitude, will be satisfied.

Research Reasoning Should Serve the Needs of the Clinical Field it Intends to Study

Recent trends in clinical psychology training have tended to institutionalize the researcher-practitioner split. Professional programs that stress the development of clinical skill exclusively and leave the research to the Ph.D.s are becoming more prominent. Previously it was observed that clinicians are unlikely to engage in research of any kind. The modal number of publications of clinical psychologists is zero (Kelly, Goldberg, et al., 1978). More seriously, many clinicians are not influenced by clinical research findings (Barlow, 1981).

There are technical problems that complicate clinical research and interfere with relevance. It is hard to collect a large group of clients that is homogeneous for a particular behavior disorder. This makes it difficult to test hypotheses about groups of people with particular disorders. There are ethical objections to withholding treatment from clients for research purposes: 1) Control groups are unethical because some persons are deprived of the treatment that they need. 2) It is impossible to ensure that persons in the control group will not seek help from other professionals, friends, etc. (Smith et al., 1980). This makes establishing a control group difficult.

"Psychotherapy is complex and not standardized; no two clients are treated the same way by even the same psychotherapist; so psychotherapy cannot be labeled method A or method B and studied experimentally." (Smith et al., 1980, p. 28). It is difficult to answer questions about which treatment suits which individual with the group comparison research strategy that is currently popular (Barlow, 1981).

Procedural and philosophical differences exist between researchers and practitioners that make functioning in both modes difficult. Practitioners tailor the length, intensity and method of any investigation to the individual and his or her problem. Researchers tend to continue treatment guided primarily by the experimental condition. The lack of emphasis on the individual diminishes the importance and relevance of clinical research for the practitioner.

Further evidence for the researcher-practitioner split is the lack of clinical relevance of clinical research. However clinical research is defined, one ingredient of the definition has to be clinical relevance. Can the content of a research article be used in some way to help the patient? Maletzky (1981) did a study of all the articles published in ten journals selected for their high circulation rate and their expressed goal of providing useful information to practitioners. All issues of the journals from January 1978 to June 1980 were reviewed. It was found that only 25.1% of the psychology journals and 17.3% of the psychiatry journals contained any immediately useful information bits for the clinical situation. A "bit" was arbitrarily defined as a "practical" unit of information that could be used to treat patients directly. Only 4.3 "bits" of clinically useful information were contained in the average psychology journal, and only 1.3 bits per issue were contained in the average psychiatry journal.
It was estimated that approximately 48.8 minutes of reading time were consumed for each "bit" of clinical information gleaned from a psychology journal, and over three hours were consumed to get each "bit" of clinical information from a psychiatry journal. How well is clinical research serving the needs of the clinician?

Strupp (1981) writes about the crisis of confidence facing psychotherapy today. There is a public demand for better scientific evidence on the efficacy and safety of psychotherapy. Society is demanding accountability of mental health practitioners. The profession must be able to articulate to the public, in acceptable and understandable terms, that we have the means to help them.

Strupp (1981) comments that in our quest for knowledge about human interaction each therapeutic dyad constitutes an experiment. Young therapists can learn only from the study of individual cases. If we carry out group comparisons without sustained attention to the process in individual dyads we deprive ourselves of the most important opportunity that systematic research has to offer.

We can better learn about psychotherapy and persuade the public if we use clinical significance to evaluate therapy outcome. Clinical significance has an intuitive interpretation and can be explained to the public. In addition, it provides a means for each therapist to evaluate himself or herself and be held accountable for his or her therapy outcomes.

Cronbach (1975) comments that the historic separation of experimental psychology from the study of individual differences impeded psychology research. Some 30 years ago, research in psychology became dedicated to the quest for nomothetic theory. Model building and hypothesis testing became the central concern. Research problems were chosen to fit that mode. Cronbach (1975) suggested an alternate mode of inquiry: the mode of intensive local observation.
"...An observer collecting data in one particular situation is in a position to appraise a practice or proposition in that setting, observing effects in context. In trying to describe and account for what has happened, he will give attention to whatever variables were controlled, but will give equally careful attention to uncontrolled conditions, to personal characteristics, and to events that occurred during treatment and measurement. As he goes from situation to situation, his first task is to describe and interpret the effect anew in each locale, perhaps taking into account factors unique to that locale or series of events (cf. Geertz, 1973, chap. 1, on "thick description"). As results accumulate, a person who seeks understanding will do his best to trace how the uncontrolled factors could have caused local departures from the modal effect. That is, generalization comes late, and the exception is taken as seriously as the rule." (Cronbach, 1975, pp. 124-125).

Barlow (1981) notes that intense local observation could be a way of closing the researcher-practitioner gap. It would provide clinicians with more clinically relevant information, such as what type of treatment works for which type of individual. Clinicians could be more actively involved in research. Clinicians could collect data on hundreds of thousands of cases over several years. The information could be fed into large clinical research centers (Agras, Kazdin & Wilson, 1979). This would make clinicians and researchers more interdependent.

Clinicians prefer studies that tell about the clinical significance of the findings. In a survey (Sargent, 1983) of 530 members of the American Psychological Association Division 37 (Child, Youth, and Family Services), respondents were asked to rate versions of a psychotherapy research study. Experimental versions of the design received higher ratings than quasi-experimental and nonexperimental versions.
The versions that reported the findings' clinical significance received a higher methodology rating than versions that omitted this information. Practitioners would prefer research that meets their needs. Practitioners prefer research that considers clinical significance.

Research Reasoning Should Match the Reasoning of the Clinical Field it Intends to Study

When psychology and physiology became sciences, the initial experiments were performed on individual organisms, and the results of these initial investigations remain relevant to the scientific world today. Broca examined a man who was unable to speak. When performing an autopsy after the man's death, Broca discovered a lesion in the third frontal convolution of the cerebral cortex. He determined it was the speech center of the brain, and it is now named after him. Pavlov's basic findings were gleaned from single organisms and strengthened by replications in other organisms (Hersen & Barlow, 1976).

The study of individual differences and the statistical approach to psychology became prominent during the first half of the twentieth century. With a push from the functional school of psychology, and a developing interest in measurement and testing of intelligence, the foundation for comparing groups of individuals was laid. Galton and Pearson expanded the study of individual differences at the turn of the century and developed many of the descriptive and inferential statistics still in use today (Hersen & Barlow, 1976). "It may seem ironic at first glance that a concern with individual differences lead to an emphasis on groups and averages, but differences among individuals, or inter-subject variability, and the distribution of these differences necessitate a comparison among individuals and a concern for a description of a group or population as a whole. In this context observations from a single organism are irrelevant." (Hersen & Barlow, 1976, p. 6).
There are many advantages in clinical research to single case experimental designs. For example, "attempts to apply an ill-defined and global treatment such as psychotherapy to a heterogeneous group of clients classified under a single diagnostic category such as neurotics are incapable of assessing the more basic question on the effectiveness of a specific treatment for a specific individual." (Hersen & Barlow, 1976, p. 13). Single case designs would allow more experiments in which different types of treatments and therapists could be paired with many different types of clients with many different specific problems.

Single case experimental designs highlight the variability in the individual. If a client deteriorates, the reasons for deterioration cannot be speculated upon if only pre and post data are available. It would be much to the advantage of clinical researchers to have followed that patient's course during treatment so that the beginning of deterioration could be pinpointed.

Any N=1 study, whether empirical or not, experimental or correlational, has limited power in the confirmatory aspect of scientific inquiry. But the same can be said for one isolated nomothetic study (Kiesler, 1981). The external validity of a series of single case designs in similar clients, in which the original experiment is directly replicated three or four times, can far surpass that of the experimental group/no treatment control group design (Hersen & Barlow, 1976). "Sophisticated presentation of N=1 research strategy makes it evident that intensive study of the single case involves much more than a single, isolated, N=1 study. Key ingredients include both direct and systematic replications encompassing a series of N=1 studies that address systematically the crucial issues of internal and external validity." (Kiesler, 1981, p. 213). The threats to internal validity that can be controlled in nomothetic research can be controlled in single case research (Kazdin, 1981).
The single case strategy sequentially approximates nomothetic research. In a discussion of available research designs and methods of analysis applicable to the study of individual subjects, Nunnally (1983) suggests a seldom considered possibility: treat each subject as though he or she were a separate experiment and then "glue" subjects together in the context of the experimental design for groups of people. This would be a way of aggregating data on individuals the way Smith, Glass and Miller (1980) and Parloff (1986) aggregate studies to answer questions currently facing psychotherapy research.

The practice of psychotherapy is the application of the scientific method to the single case (Hayes, 1981; Kiesler, 1981). Clinical decision making closely parallels time series methodology. Clinicians "need only (a) take systematic repeated measurements, (b) specify their own treatments, (c) recognize the design strategies they are already using, and (d) at times use existing design elements deliberately to improve clinical decision making." (Hayes, 1981, p. 194). Barlow & Hersen (1973) state that single case experimental designs are particularly well suited for the study of complex behavior disorders. They review many single case experimental designs that have been employed in clinical research while providing examples of their use. They believe that the suitability of the designs for clinical research will lead to their increased use.

Single case designs usually begin by observing the client's behavior before treatment. This period is referred to as the Baseline phase. It serves two purposes: 1) to describe the existing level of performance, and 2) to predict the level of performance for the immediate future if treatment is not provided. The projection of baseline performance into the future is the implicit criterion against which the treatment is evaluated.
If treatment is effective, the actual level of behavior will deviate from the level projected from baseline performance. After performance stabilizes, the treatment can be withdrawn to reassess whether performance under these conditions deviates from the predicted level. "Essentially, data in separate phases of single-case designs provide information about present performance, provide the predicted level of future performance, and test the extent to which prediction of performance from previous phases were accurate." (Kazdin, 1978, p. 630).

Single case experimental designs and the evaluation of clinical significance parallel the reasoning of the practitioner. Therefore this method of data analysis can be easily adopted by the practitioner who wishes to conduct research. In addition, research presented in the form of clinical significance can be interpreted and used by the practitioner.

Successful clinical practice demands that we use good judgment in choosing the optimal treatment for the condition in question (Yeaton & Sechrest, 1981). Practicing clinicians can enhance the quality of their judgment by attending to the strength and integrity of a treatment and to specific standards of treatment efficacy. Strength is the a priori likelihood that the treatment could have its intended outcome. Integrity of a treatment is the degree to which treatment is delivered as intended. Standards of treatment efficacy refer to results aggregated in studies like Smith, Glass and Miller (1980), and the aggregate studies reviewed by Parloff (1986). Parloff (1986) did an exhaustive review of psychotherapy outcome research between 1980 and 1984.
The questions he thinks psychotherapy researchers must answer are: 1) are the positive effects reasonably attributable to psychotherapy or to nonspecific placebo effects associated with all therapies, 2) can unsafe and inefficient treatments be identified so that a rationale for restricting reimbursement can be provided to meet the insurance companies' demands, and 3) can the most effective treatments for specific conditions be identified to better serve the patient. Parloff also comments that special problems make implementation of "state of the art" research methodology such as "randomized clinical trials" difficult or impractical. A way to approach the questions and avoid the problem is to use single case experimental designs and clinical significance in data analysis.

The purpose of this study is to demonstrate the use of clinical significance in psychotherapy data analysis. In addition, the study is an attempt to support the notion that clinical significance captures an important aspect of improvement that needs to be reported along with statistical significance tests of parameters of distributions in psychotherapy outcome research.

Hypotheses

Hypothesis 1 (Experimental): Psychotherapy clients showing clinically significant symptom changes will report greater satisfaction and benefit from psychotherapy than (a) a group of clients whose changes fell solely within the statistically significant improvement range, and (b) a group of clients who did not improve statistically or clinically.

Hypothesis 1 (Operational): Psychotherapy clients showing clinically significant change on the SCL-90-R Global Severity Index will report greater satisfaction and benefit from psychotherapy on the Strupp Post Therapy Client Questionnaire than (a) a group of clients whose changes fall solely within the statistically significant improvement range on the SCL-90-R, and (b) a group of clients who did not improve statistically or clinically on the SCL-90-R.
Hypothesis 2 (Experimental): Psychotherapy clients showing statistically significant improvement will report greater satisfaction and benefit from psychotherapy than a group of clients who did not improve statistically or clinically.

Hypothesis 2 (Operational): Psychotherapy clients showing statistically significant improvement on the SCL-90-R Global Severity Index will report greater satisfaction and benefit from psychotherapy on the Strupp Post Therapy Client Questionnaire than a group of clients who do not improve statistically or clinically on the SCL-90-R.

Method

Subjects:

Clients: Seventy-five client-therapist dyads were selected for inclusion in the study from a database of 84 therapy cases at the Michigan State University Psychological Clinic. The clients were predominantly working and middle class. All clients agreed to participate in the Clinic's psychotherapy research program. They ranged in age from 16 to 91 years; 68 percent were women.

Therapists: The therapists were graduate students of Michigan State University working at the Psychological Clinic, recruited from the clinic practicum students and interns. They were selected from the database of 84 cases, along with the clients, for inclusion in the study. The therapists ranged in experience from students in first year practicum to advanced students with several years of post-master's degree experience. The predominant theoretical orientation of the therapists was psychodynamic, although other orientations to treatment were represented. Since the study was conducted after the therapy had been completed, the therapists and clients were blind to the hypotheses and purposes of the study.

Materials:

Symptom Check List 90 Revised

Derogatis' (1983) Symptom Checklist 90 Revised will be used to measure the client's symptoms before and after therapy.
The SCL-90-R is a 90 item self-administered questionnaire composed of nine subscales measuring nine symptom dimensions: somatization, obsessive-compulsive, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation, and psychoticism. Subjects are asked the extent to which they are distressed by: 1) headaches, 2) nervousness and shakiness inside, etc. The subject rates each of the 90 symptom items on a Likert type scale that goes from 0 (not at all) to 4 (extremely). Means are computed for each of the nine subscales. The Global Severity Index (GSI) is the sum of all item responses divided by 90. This represents the best single indicator of the current level or depth of the disorder.

The reliability and validity of the SCL-90-R are discussed in the Administration, Scoring and Procedures Manual (Derogatis, 1983). Reliability is evaluated in terms of internal consistency and test-retest reliability. Internal consistency is the consistency with which the items selected represent each symptom construct. Test-retest reliability is the stability of the measure across time.

Internal consistency for the nine subscales was measured by coefficient alpha for a sample of 219 symptomatic volunteers (Derogatis, Rickels & Rock, 1976). Coefficient alpha treats within-form correlations among the items as analogous to correlations between alternate forms, and assumes that the average correlation among existing items would be equivalent to the correlation among items in the hypothetical alternate form. The coefficients obtained for this sample were satisfactory and ranged from a low of .77 for psychoticism to a high of .90 for depression.

The test-retest reliability of the SCL-90 was checked on a sample of 94 psychiatric outpatients with one week elapsed between testings. The test-retest reliability coefficients range from .78 for hostility to .90 for phobic anxiety (Derogatis, Rickels & Rock, 1976).
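The scoring rule for the Global Severity Index and the idea behind coefficient alpha can be illustrated with a short sketch. It is only an illustration: the function names are hypothetical, the alpha computation uses the standard textbook formula (which is not reproduced in the manual excerpt summarized above), and the item-to-subscale assignments are not modeled because they are not listed here.

```python
from statistics import variance


def global_severity_index(ratings):
    """GSI as described in the text: sum of the 90 item ratings divided by 90."""
    if len(ratings) != 90:
        raise ValueError("the SCL-90-R has exactly 90 items")
    if any(r not in (0, 1, 2, 3, 4) for r in ratings):
        raise ValueError("each item is rated from 0 (not at all) to 4 (extremely)")
    return sum(ratings) / 90


def coefficient_alpha(responses):
    """Coefficient alpha for one scale (standard formula, assumed here).

    `responses` is a list of respondents, each a list of k item ratings.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(responses[0])
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical respondent who rates every item "1" (a little bit).
    print(global_severity_index([1] * 90))
    # Hypothetical three-item scale answered by four respondents.
    print(coefficient_alpha([[0, 1, 0], [1, 1, 2], [3, 2, 3], [4, 4, 3]]))
```

Alpha approaches 1 when respondents' ratings on the items of a scale rise and fall together, which is the sense in which the items are said to represent a single symptom construct.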
Psychopathological symptoms would be expected to be less stable than a characteristic such as intelligence but more stable than "mood". Though psychological symptoms can fluctuate over a period of one week, one would not expect much change.

Criterion related validity is supported by several studies. The SCL-90-R was used in a study evaluating the utility of Research Diagnostic Criteria (RDC) for predicting differential response to amitriptyline and/or short term interpersonal psychotherapy. The SCL-90-R was found to be sensitive to change and to differences in the RDC subtypes (Prusoff, Weissman, Klerman & Rounsaville, 1980). A comprehensive study of the relationship between sexual dysfunction and psychotherapy has utilized the SCL-90-R to demonstrate significant symptom differences between patients assigned to different DSM-III diagnostic categories (Derogatis, Meyer & King, 1981). These studies suggest that the type and severity of symptoms can be assessed using the SCL-90-R.

Construct validity, or more specifically concurrent or convergent validity, is supported by determining the correlation between the scales of the test and other measures of the constructs the scales are intended to measure. Derogatis, Rickels and Rock (1976) compared the dimension scores of the SCL-90 with the scale scores from the MMPI. In this study 119 symptomatic volunteers were given the SCL-90 and the MMPI. Each dimension of the SCL-90 had its highest correlation with a like construct on the MMPI, except for the obsessive-compulsive dimension, for which there is no directly comparable MMPI scale. This study supports the convergent validity of the SCL-90.

A similar study of concurrent validity of the SCL-90 was conducted by Boleoucky and Horvath (1972). The symptom dimensions of the SCL-90 were correlated with those of the Middlesex Hospital Questionnaire (MHQ). The two instruments shared 6 like symptom dimensions.
Correlations between like dimensions were computed for a sample of 130 subjects. Correlations ranged from .73 for depression down to .36 for phobic anxiety. For most scales convergent validity is suggested. The Global Severity Index (GSI) and the MHQ global score correlated .92. A confirmatory factor analysis (Derogatis & Cleary, 1977) performed on data from 1002 psychiatric outpatients confirmed the hypothesized structure of the SCL-90-R.

The means for the SCL-90-R are available for a sample of 1002 heterogeneous outpatients (Derogatis, 1983). The outpatients came from centers at Johns Hopkins University, the University of Maryland, the University of Pennsylvania and the University of Wisconsin. There were 425 males and 577 females, approximately two thirds white, skewed somewhat towards the lower end of the socioeconomic spectrum. The nonpatient norm group comprised 974 individuals, 493 males and 480 females, eight ninths white. Social class data are not available. It represents a stratified random sample from a diverse community in a large eastern state.

The SCL-90-R is a valid and reliable instrument constructed with subjects comparable to the type of subjects in this study. These facts, combined with its past use in a related fashion in research such as the Derogatis et al. (1981) study, in which symptom changes were evaluated with the SCL-90-R, make the SCL-90-R a reasonable choice as a symptom measure for this study.

The Strupp Post Therapy Client Questionnaire

The client's satisfaction with the therapy experience will be measured by the Strupp Post Therapy Client Questionnaire (Strupp, 1969). Fifty-six items such as "How much have you benefited from therapy?" are evaluated on a Likert type scale from 1 (a great deal) to 9 (not at all). The original questionnaire had 89 items. Strupp et al.
(1969) administered the questionnaire to clients at the Psychiatric Outpatient Clinic of North Carolina Memorial Hospital, and ended up with 122 completed cases, 59.9% females. The clients ranged in age from 18 to 50; 45.9% were married and 45.9% were single. They were predominantly middle and working class. Their pretreatment symptoms ranged from loss of interest in life and depression to interpersonal difficulties, generalized anxiety, and physical symptoms.

The questionnaires were subjected to a cluster analysis. The analysis included: 1) study of response frequencies for each item, 2) intercorrelations (Pearson's r) among all structured items, 3) systematic study of statistical relationships, 4) isolation of item clusters, 5) comparison of cluster scores based on items included, and 6) correlations among items and other measures. Step 4, the isolation of item clusters, was conducted through independent evaluation by members of the project staff. Highly correlated items were grouped up to the point at which the staff could no longer agree on the grouping and the correlation among the items dropped below .50.

The analysis produced 10 clusters: 1) therapist's warmth, 2) amount of change, 3) present adjustment - current status, 4) amount of change apparent to others, 5) therapist's interest, integrity, and respect, 6) (not used) uncertainty about therapist's feelings, 7) intensity of emotional experience, 8) (not used) use of technical terms, 9) degree of disturbance before therapy, 10) therapist's experience/activity level. The best established clusters were considered to be 1, 2, 3, 4, 5, and possibly 9.

The cluster used for this study to measure outcome from the client's subjective perspective was (2) amount of change. This cluster contained items pertaining to benefit from therapy, satisfaction with therapy, amount of change, and symptom relief. The inter-item correlations of these items ranged from .58 to .91.
This cluster was used by Lichtenstein (1984) to evaluate psychotherapy outcome from the client's subjective perspective in a study of the effects of client and therapist gender on the outcome and process of psychotherapy. Lichtenstein found intercorrelations among these items ranging from .41 to .73, significant at the .001 level. Eaton (1986) also used this cluster as an outcome measure in a study of therapeutic alliance and outcome.

Since the Strupp Post Therapy Client Questionnaire was developed with clients similar to those used in this study, in a context similar to this study, and has been used to measure therapeutic outcome from the client's subjective point of view, it is a reasonable choice for use in this study.

Procedure:

In the Michigan State University Psychotherapy Project database all of the 84 participants filled out the SCL-90-R prior to beginning therapy and after completing therapy. The participants also filled out the Strupp Post Therapy Client Questionnaire after completing therapy. Cases were selected from the database based on change in the level of symptomatic distress after therapy as measured by the pre and post therapy SCL-90-R Global Severity Index (GSI) scores. Three groups of 25 subjects each were created using GSI scores and the criteria for statistical and clinical significance operationally defined below.

The Groups:

Group I: Will meet the group criterion of statistical significance, and all participants in the group will meet the individual criteria of clinical significance.

Group II: Will meet the group criterion of statistical significance, but the participants in the group will not meet the individual criteria of clinical significance.

Group III: Will meet neither the group criterion of statistical significance nor will the participants in the group meet the individual criteria of clinical significance. This group will be comprised of participants who do not change in therapy.
Operational Definitions of Group Criteria:

Criterion of Statistical Significance: Groups I and II will be considered to meet the condition of statistical significance if traditional between-groups hypothesis tests, t-tests between the means of the post therapy SCL-90-R Global Severity Index scores of groups I and III and of groups II and III, are statistically significant.

Criterion of Clinical Significance: Clinical significance is evaluated on a participant by participant basis. Participants in the group will be considered to meet the criterion of clinical significance if each participant in the group meets the following two conditions: 1) Meaningful Change, and 2) Reliable Change.

(1) Meaningful Change: A participant will be considered to meet the condition of meaningful change if the post-treatment SCL-90-R GSI score is more likely to be drawn from the functional than the dysfunctional distribution. This condition is represented statistically as $X_{post} < c$, where c is defined according to the following formula (see Table 1):

$$c = \frac{s_0 \bar{X}_1 + s_1 \bar{X}_0}{s_0 + s_1} = \frac{.31(1.39) + .601(.31)}{.31 + .601} = .677$$

(2) Reliable Change: A participant will be considered to meet the condition of Reliable Change if the Reliable Change Index (RC), defined by the following formula, is greater than 1.96 (see Table 1):

$$RC = \frac{X_{pre} - X_{post}}{S_E}, \qquad S_E = s_1 \sqrt{1 - r_{xx}} = .601\sqrt{1 - r_{xx}} = .242$$

Table 1
Symptom Check List 90 Revised Data Used in Determining Clinical Significance Outcome Criteria

Symbol         Definition                                                         Value
$\bar{X}_0$    mean of the SCL-90-R Global Severity Index (GSI) for the           .31
               well functioning normal population (a)
$\bar{X}_1$    pretreatment mean of the SCL-90-R (GSI) for groups I, II,          1.39
               and III combined
$X_{pre}$      pretreatment (GSI) score of a participant
$X_{post}$     posttreatment (GSI) score of a participant
$s_1$          standard deviation of groups I, II, and III combined on the        .601
               SCL-90-R (GSI) pretreatment
$s_0$          standard deviation of the normal population on the                 .31
               SCL-90-R (GSI) (a)
$r_{xx}$       test-retest reliability of the SCL-90-R (GSI) (b)                  .933
$S_E$          standard error of measurement for the SCL-90-R (GSI)               .242

(a) Based on a nonpatient norm group of 974 individuals from a diverse community in a large eastern state (Derogatis, 1983).
(b) Based on a sample of 94 heterogeneous outpatients with one week elapsed between tests (Derogatis, 1983).

All of the 84 subjects who met both the criterion of clinical significance and the criterion of statistical significance (N = 25) were included in group I. Of the original 84 subjects, 7 were missing the SCL-90-R scores necessary to classify them into a group, so they were dropped from the analysis. Two subjects had actually deteriorated in therapy; their (pre-therapy - post-therapy) differences in GSI scores were -.85 and -.64. These subjects were dropped from the analysis because deterioration is not consistent with the criterion for membership in any of the groups. The remaining 50 subjects were divided into groups II and III. The 25 subjects with the smallest (pre-therapy - post-therapy) SCL-90-R GSI score differences were included in group III, and the other 25 subjects were put in group II.

In accord with the group selection criteria, a t-test between the means of the post therapy SCL-90-R GSI scores of group II (M = .82; SD = .337) and group III (M = 1.15; SD = .709) was statistically significant, t(48) = -2.11, p < .04. A t-test between the means of the post therapy SCL-90-R GSI scores of group I (M = .43; SD = .134) and group III (M = 1.15; SD = .709) was statistically significant, t(48) = -5, p < .01. All subjects in group I met the conditions of clinical significance described above.
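The two individual conditions can be sketched in a few lines of code using the Table 1 values. This is a minimal illustration, not the procedure actually run for the thesis: the function names and the example scores are hypothetical, and the reliable change index is assumed to be the pre-to-post difference divided by the standard error of measurement, so that improvement (a drop in GSI) gives a positive value.

```python
# Values taken from Table 1 (SCL-90-R Global Severity Index).
NORMAL_MEAN = 0.31        # mean of the well functioning normal population
NORMAL_SD = 0.31          # standard deviation of the normal population
PRETREATMENT_MEAN = 1.39  # pretreatment mean, groups I, II, and III combined
PRETREATMENT_SD = 0.601   # pretreatment standard deviation, groups combined
STANDARD_ERROR = 0.242    # standard error of measurement as reported
RC_CRITERION = 1.96       # reliable change requires RC above this value


def cutoff_point(functional_mean, functional_sd, dysfunctional_mean, dysfunctional_sd):
    """Cutoff c between the functional and dysfunctional distributions."""
    numerator = functional_sd * dysfunctional_mean + dysfunctional_sd * functional_mean
    return numerator / (functional_sd + dysfunctional_sd)


def reliable_change_index(pre, post, standard_error=STANDARD_ERROR):
    """ASSUMPTION: RC = (pre - post) / SE, positive when symptoms decrease."""
    return (pre - post) / standard_error


def classify(pre, post):
    """Apply the meaningful change and reliable change conditions to one participant."""
    c = cutoff_point(NORMAL_MEAN, NORMAL_SD, PRETREATMENT_MEAN, PRETREATMENT_SD)
    meaningful = post < c
    reliable = reliable_change_index(pre, post) > RC_CRITERION
    return {
        "meaningful": meaningful,
        "reliable": reliable,
        "clinically_significant": meaningful and reliable,
    }


if __name__ == "__main__":
    c = cutoff_point(NORMAL_MEAN, NORMAL_SD, PRETREATMENT_MEAN, PRETREATMENT_SD)
    print(f"cutoff c = {c:.4f}")
    # Hypothetical participants (pre, post); not cases from the database.
    for pre, post in [(1.40, 0.43), (1.00, 0.82), (0.80, 0.60)]:
        print(pre, post, classify(pre, post))
```

A participant is counted as clinically significant only when both flags are true, which mirrors the requirement that every member of group I meet both conditions.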
The mean of the reliable change index (RC) was 2.25 and the standard deviation of the reliable change index (S_RC) was 2.37.

A one-way ANOVA comparing these three groups on client satisfaction with therapy was performed, using the relevant items (3, 4, 11, and 15) from Strupp's Post Therapy Client Questionnaire as the dependent variable. These groups were compared further using pairwise contrasts.

Results

Hypothesis 1: The data clearly support hypothesis 1. A group of psychotherapy clients who met the criterion of clinically significant change on the SCL-90-R reported greater satisfaction and benefit from psychotherapy on selected items of the Strupp Post Therapy Client Questionnaire than (a) a group of clients whose changes met only the statistically significant improvement criterion on the SCL-90-R, and (b) a group of clients who did not meet either the criterion of statistical significance or the criterion of clinical significance on the SCL-90-R. A one-way analysis of variance comparing a group of clients who met the criteria of statistical and clinical significance (I), a group of clients who met the criterion of statistical significance (II), and a group of clients who did not meet the criterion of statistical significance or the criterion of clinical significance (III; see Note 2) was significant, F(2,69) = 4.36, p < .017 (see Table 2). The significance of the contrast between groups I and III (see Tables 2 and 3), considered with the lack of significance of the contrast between groups II and III, indicates that the significant difference in this analysis is between groups I and III. These contrasts further support hypothesis 1, that clients who have displayed clinically significant symptom change will report the greatest satisfaction and benefit from psychotherapy.
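As a rough check on the analysis just described, the F ratio and the contrast t values can be approximated from the group means, standard deviations, and sample sizes reported in Table 2. The sketch below is an illustration under stated assumptions, not the original computation: the function names are hypothetical, the contrasts are assumed to use the pooled within-group error term, and the inputs are the rounded summary values, so the results agree with the reported statistics only to within rounding.

```python
import math

# (mean, standard deviation, n) for client satisfaction, from Table 2.
GROUPS = {
    "I": (2.17, 1.20, 23),
    "II": (2.64, 1.13, 24),
    "III": (3.23, 1.38, 25),
}


def one_way_anova(groups):
    """One-way ANOVA from summary statistics.

    Returns (F, df_between, df_within, mean_square_within).
    """
    n_total = sum(n for _, _, n in groups.values())
    grand_mean = sum(m * n for m, _, n in groups.values()) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups.values())
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups.values())
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_within = ss_within / df_within
    f_ratio = (ss_between / df_between) / ms_within
    return f_ratio, df_between, df_within, ms_within


def contrast_t(groups, weights, ms_within):
    """t value for a linear contrast, using the pooled error term (assumed)."""
    estimate = sum(w * groups[name][0] for name, w in weights.items())
    squared_se = ms_within * sum(w ** 2 / groups[name][2] for name, w in weights.items())
    return estimate / math.sqrt(squared_se)


if __name__ == "__main__":
    f_ratio, df_b, df_w, ms_w = one_way_anova(GROUPS)
    print(f"F({df_b},{df_w}) = {f_ratio:.2f}")
    contrasts = {
        "I vs III": {"I": 1, "III": -1},
        "I vs II": {"I": 1, "II": -1},
        "II vs III": {"II": 1, "III": -1},
        "I & II vs III": {"I": 0.5, "II": 0.5, "III": -1},
    }
    for label, weights in contrasts.items():
        print(f"{label}: t = {contrast_t(GROUPS, weights, ms_w):.2f}")
```

Lower scores on the questionnaire items indicate greater satisfaction, so a negative t for "I vs III" means group I reported more satisfaction than group III.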
Hypothesis 2: The results do not support hypothesis 2, that a group of psychotherapy clients who meet the criterion of statistical significance will report greater satisfaction and benefit from therapy than a group of clients who do not meet the criterion of statistical significance or the criterion of clinical significance. The contrast between groups II and III shows a trend toward the support of hypothesis 2 (p = .099) when considered in conjunction with the nonsignificant contrast between groups I and II, and the significant contrast when groups I and II are combined and compared to group III.

Post Hoc Analysis: A post hoc analysis was done comparing pretreatment level of symptomatic distress, as measured by the SCL-90-R subscales, across groups I, II, and III, in a series of one-way analysis of variance designs (SCL-90-R symptom scale by group). Four of the 9 subscales were found to be significantly different across groups: interpersonal sensitivity, depression, paranoid ideation, and psychoticism (see Table 4). This indicates that these 4 scales had initial elevations that were high enough for variation in outcome to be possible. This suggests that the sample includes a pretreatment symptom constellation of depression or interpersonal anxiety. These results are also an indication of which symptoms are likely to be alleviated or changed by psychotherapy.

Note 2: One subject from group II and two subjects from group I dropped out of the analysis because they were missing the post therapy client questionnaire data.

Table 2
The Relationship of Statistical Significance and Clinical Significance of Level of Symptom Change to Client Satisfaction.

                            Group I                     Group II        Group III
                            Clinical Significance &     Statistical     No Change
                            Statistical Significance    Significance
Client Satisfaction   M     2.17                        2.64            3.23
                      SD    1.2                         1.13            1.38
                      N     23                          24              25

Table 3
Pairwise Contrasts of Groups by Client Satisfaction.

Contrast            T Value    P Value
I vs III            -2.94      .004
I vs II              1.27      .208
II vs III           -1.67      .099
I & II (A) vs III   -2.68      .009

Note: The degrees of freedom were 69 in all contrasts.
A: Groups I and II were combined and compared to group III.

Table 4
Group Differences on Pre-therapy Symptom Level as Measured by SCL-90-R Scales.

SCL-90-R Scale                    Group I    Group II    Group III
Somatization               M      .85        .90         .52
                           SD     .57        .74         .62
Obsessive-Compulsive       M      1.75       1.72        1.38
                           SD     .64        .87         .86
Interpersonal Sensitivity* M      1.73       1.91        1.30
                           SD     .67        .80         .92
Depression*                M      2.2        2.24        1.7
                           SD     .66        .70         1.02
Anxiety                    M      1.82       1.68        1.27
                           SD     .89        .84         1.02
Hostility                  M      1.13       1.15        1.2
                           SD     .65        .82         1.12
Phobic Anxiety             M      .83        .76         .43
                           SD     .79        .69         .69
Paranoid Ideation*         M      1.19       1.34        .83
                           SD     .64        .86         .74
Psychoticism*              M      1.10       1.04        .66
                           SD     .51        .70         .63

* p < .05; df (2,72)

Discussion

Psychotherapy clients showing clinically significant change on the SCL-90-R reported greater satisfaction and benefit from psychotherapy on selected items of the Strupp Post Therapy Client Questionnaire than (a) a group of clients whose changes fell solely within the statistically significant improvement range on the SCL-90-R, and (b) a group of clients who did not improve statistically or clinically on the SCL-90-R. The data clearly support hypothesis 1. Clinical significance is associated with greater client satisfaction than statistical significance.

Psychotherapy clients showing statistically significant improvement did not report greater satisfaction and benefit from therapy than a group of clients who did not improve statistically or clinically. The results do not support hypothesis 2, although a trend toward its support was suggested. The lack of confirmation of hypothesis 2 appears to underscore the result with regard to hypothesis 1. Those satisfied with therapy are those who meet the most rigorous criterion for improvement: 1) movement into the normal range of functioning, and 2) change that is reliable and not likely to be due to chance.
The post hoc analysis evaluating pretreatment symptom level by group indicates that there was not a large enough sample of patients describing themselves as compulsive, phobic, or having somatic problems to evaluate the effect of clinical significance on satisfaction for these groups. The results are most strongly supported for clients suffering from depression, interpersonal anxiety, paranoia or psychotic symptoms. An alternative explanation is that these four symptom groups represent those manifest symptoms which are most malleable and indicative of therapeutic change for a wide range of disorders.

The post hoc analysis also indicates that outcome was related to pretreatment symptom level. Those with a lower level of symptomatology tended to end up in the no change group and were less satisfied with therapy. Issues concerning the use of pretreatment and posttreatment level of symptomatology, as opposed to the use of change scores, to evaluate the outcome of treatment have been discussed by Cronbach and Furby (1970). Cronbach's point that much important information is provided by pretreatment scores is well taken. However, alternative explanations for change are always possible, and the effects of pre and post treatment symptom level can be evaluated independently of change.

The effect of the level of change on client satisfaction is important in itself. Obviously a large symptom change cannot be achieved in an individual with a low level of pathology. A criterion for the evaluation of treatment that identifies those most satisfied with therapy is an important and useful criterion to establish.

Future research should control for pretreatment level of symptomatology. Research comparing a clinically significant (reliable and clinically significant change) group to a group that demonstrated reliable change (more than 2 standard deviations), but not clinically significant change (more likely to be in the well than the dysfunctional range), would be interesting.
Research comparing a clinically significant group to a group that represented return to the normal range of functioning, but not reliable change, would also be of value.

Jacobson et al. (1984b) used clinical significance in a study of behavioral marital therapy outcome. The clinically relevant questions of what proportion of couples improve, and how often these improved couples truly remain in the ranks of the nondistressed, are addressed. It was found that about a third of the couples actually changed their status from distressed to nondistressed by the end of therapy. In a subsequent study of behavioral marital therapy using clinical significance to evaluate outcome, Jacobson et al. (1985) found similar improvement rates. Though these results may appear more modest than results using the traditional methods of reporting outcome, they provide a nonambiguous criterion for improvement which provides information on the variability of the outcome data.

From a methodological perspective there are several issues. For those concerned with sample size, the following should be considered: for the purposes of the overall analysis of the three conditions in this study, 25 subjects per group was deemed acceptable. Kraemer (1981) has indicated that 20 subjects per group creates sufficient power for most analyses. The cost of adding more subjects is only marginally worthwhile considering the relatively small increase in power more subjects would afford.

This study would be strengthened by multiple measures of the independent variable (symptom level) and the dependent variable (client satisfaction), if these measures were correlated and yielded converging results. The advantage of using one well established measure of each construct is that there are no ambiguous results to explain, as multiple measures could conceivably yield. It should also be noted that this study does not demonstrate a causal relationship between clinically
significant symptom change and client satisfaction, but it does demonstrate that clinical significance is associated with greater client satisfaction.

Another methodological measurement issue is that the definition of clinical significance makes use of the standard error of measurement of the instrument used to measure change. It will be much easier to get meaningful change when an instrument with a small standard error of measurement is used. The impact of instrument selection on clinical significance should be considered when developing a study using clinical significance.

Another methodological issue concerns the use of normative data. In the case of this study the norms of the well functioning normal population on the SCL-90-R GSI were chosen. The question can be raised whether it makes sense to compare those with symptoms in the psychotic range with well functioning normals, or whether it would be more appropriate to compare these individuals to nontreated psychotics. Clinically significant improvement of these individuals could be viewed as movement within the severely disturbed (psychotic) range. Alternatively, improvement could be viewed as change from the need for institutionalization to being able to live alone.

The issue of the appropriate norm group and the meaningfulness of clinical significance is also an issue for the "worried well" group who enter therapy relatively symptom free, perhaps to further personal growth or self knowledge. What standards should be used to evaluate these individuals with regard to therapy outcome? Should these individuals be compared against the "idealized fiction" version of normality common in the dynamic perspective of psychology? Several forms of this idealized fiction of normality have been discussed by various psychoanalytic writers such as Jones (1931), Eissler (1960) and Klein (1960). A prototypical definition is given by Levine (1942) as cited in Offer and Sabshin (1966, p. 19).
There, "normality" is defined in the following manner:

"1) Nonexistent in a complete form, but existing as relative and quantitative approximation.
2) In agreement with statistical averages of specific groups, if that is not contrary to standards of individual health and maturity.
3) Physical normality; Absence of physical disease; presence of good structure and function and maturity.
4) Intellectual normality.
5) Absence of neurotic and psychotic symptoms. [Levine elaborates later that the normal individual is only relatively free of neurotic and psychotic symptoms.]
6) Emotional maturity (especially in contrast with neurotic character formation).
   a) Ability to be guided by reality rather than fears.
   b) Use of long-term values.
   c) Grown up conscience.
   d) Independence.
   e) Capacity to "love" someone else but with an enlightened self-interest.
   f) A reasonable dependence.
   g) A reasonable aggressiveness.
   h) Healthy defence mechanisms.
   i) Good sexual adjustment with acceptance of own gender.
   j) Good work adjustment."

Does this definition have utility in psychotherapy outcome research or should we be more concerned with controlling symptomatology? According to the present study, symptom relief appears to be a prerequisite for client satisfaction with therapy outcome. Client satisfaction has a high level of face validity. It would be difficult to argue for an outcome criterion that could not stand the test of client satisfaction. In this study the no change group was less satisfied with their therapy. Individuals in this group tended to be people with low levels of symptomatology. Unless the goals put forth by Levine were not achieved by these clients, these results suggest that symptom relief is a preeminent factor in the client's evaluation of therapy and satisfaction with outcome.

Clinical significance is clearly a meaningful way of assessing change in therapy. It is demonstrated to be a reasonable way of defining a client group in the context of outcome research.
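As a rough illustration of how such a client group might be defined, and of why the standard error of measurement matters, the following Python sketch classifies a single client's change. It is a minimal sketch under stated assumptions, not the procedure used in this study: it assumes a reliable change index of the form (pretest minus posttest) divided by the standard error of measurement, compared against 1.96, in the spirit of Jacobson, Follette, and Revenstorf (1984), together with a cutoff separating the dysfunctional from the functional range. The function names, the category labels, the cutoff of 0.8, the standard deviation, and the reliabilities are all hypothetical values chosen for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Assumed formula: SEM = sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def classify_change(pre, post, sem, cutoff, z_crit=1.96):
    """Classify one client's change on a symptom score (lower = fewer symptoms).

    Assumption: change is 'reliable' when (pre - post) / sem exceeds z_crit.
    Assumption: scores at or below `cutoff` fall in the functional range.
    """
    reliable = (pre - post) / sem > z_crit
    crossed_into_functional_range = pre > cutoff and post <= cutoff
    if reliable and crossed_into_functional_range:
        return "clinically significant"
    if reliable:
        return "reliable improvement only"
    return "no reliable change"


if __name__ == "__main__":
    # Hypothetical instruments: same score spread, different reliability.
    precise = standard_error_of_measurement(sd=0.5, reliability=0.90)
    noisy = standard_error_of_measurement(sd=0.5, reliability=0.60)
    cutoff = 0.8  # hypothetical boundary of the functional range

    # The same raw improvement (1.2 -> 0.7) judged with each instrument.
    print(classify_change(1.2, 0.7, precise, cutoff))
    print(classify_change(1.2, 0.7, noisy, cutoff))
    # A reliable improvement that stays above the cutoff.
    print(classify_change(1.6, 1.0, precise, cutoff))
```

With the same raw improvement of 0.5 points, the client is classified as clinically significant under the more reliable hypothetical instrument but shows no reliable change under the less reliable one, which is the instrument selection effect noted in the methodological discussion.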
In addition clinical significance has social importance (Wolf, 1978) in the sense that it has a built in emphasis on behavior change stressed by Kazdin (1980). Clinical significance ensures that subjects classified as improved are symptomatically more like normals (Kazdin, 1977, 1978, 1980). Further, when clinical significance is used the variability in the data is made clearly visible rather than being camouflaged in group effects (Jacobson et al., 1984).

Clinical significance will allow statements about the success rate of a treatment. A nonambiguous statement such as "6 out of 20 people improved with treatment A while 12 out of 20 people improved with treatment B" can be made. Practitioners need to make choices about which treatment to use with a particular individual (Yeaton & Sechrest, 1981; Parloff, 1986). Clinical significance data can help answer that question. The practitioner can choose the treatment that is effective with most people, or make a determination about whether his/her client is more like the 6 people who improved in treatment A or the 12 who improved in treatment B. As Sargent and Cohen (1983) found, practitioners prefer studies which report results in terms of clinical significance because it helps them make the decisions that they need to make. Researchers will appreciate clinical significance because it allows a more complete representation of the data that includes variability. Ultimately psychotherapy research results cannot be meaningful unless they are usable by the practitioner.

References

Agras, W. S., Kazdin, A. E., & Wilson, G. T. (1979). Behavior therapy: Toward an applied clinical science. San Francisco: Freeman.

Bakan, D. (1966). The test of significance in psychological research. Psychological Bulletin, 66, 423-437.

Barlow, D. H. (1981). On the relation of clinical research to clinical practice: Current issues, new directions. Journal of Consulting and Clinical Psychology, 49, 147-155.

Barlow, D. H., & Hersen, M.
(1973). Single-case experimental designs: Uses in applied clinical research. Archives of General Psychiatry, 29, 319-325.

Boleoucky, Z., & Horvath, M. (1972). The SCL-90 rating scale: First experience with the Czech version in healthy male scientific workers. Activitas Nervosa Superior (Praha), 16, 115-116.

Bolles, R. C. (1962). The difference between statistical hypotheses and scientific hypotheses. Psychological Reports, 11, 639-645.

Carver, R. P. (1978). The case against statistical significance testing. Harvard Educational Review, 48(3), 378-399.

Cronbach, L. J., & Furby, L. (1970). How should we measure "change"--or should we? Psychological Bulletin, 74, 68-80.

Cronbach, L. J. (1975). Beyond the two disciplines of scientific psychology. American Psychologist, February, 116-127.

Derogatis, L. (1983). SCL-90-R administration, scoring and procedures manual-II. Towson: Clinical Psychometric Research.

Derogatis, L., Meyer, J., & King, K. (1981). Psychopathology in individuals with sexual dysfunction. American Journal of Psychiatry, 138, 757-763.

Derogatis, L., Rickels, R., & Rock, A. (1976). The SCL-90 and the MMPI: A step in the validation of a new self-report scale. British Journal of Psychiatry, 128, 280-289.

Derogatis, L. R., & Cleary, P. (1977). Confirmation of the dimensional structure of the SCL-90: A study in construct validation. Journal of Clinical Psychology, 33(4), 981-989.

Eaton, T., Abeles, N., & Gutfreund, M. J. (in press). Therapeutic alliance and outcome: Impact of treatment length and pretreatment symptomatology. Psychotherapy.

Eissler, K. R. (1960). The efficient soldier. In W. Muensterburger & S. Axelrod (Eds.), The psychoanalytic study of society. New York: International Universities Press.

Garfield, S. L. (1981). Evaluating the psychotherapies. Behavior Therapy, 12, 195-307.

Hays, W. L. (1981). Statistics. New York: Holt, Rinehart and Winston.

Hayes, S. C. (1981). Time series methodology and empirical clinical practice.
Journal of Consulting and Clinical Psychology, 49, 193-211.

Hersen, M., & Barlow, D. H. (1976). Single case experimental designs: Strategies for studying behavior change. New York: Pergamon.

Hugdahl, K., & Ost, L. (1981). On the difference between statistical and clinical significance. Behavioral Assessment, 3, 289-295.

Jacobson, N., Follette, W. C., & Revenstorf, D. (1984). Psychotherapy outcome research: Methods of reporting variability and evaluating clinical significance. Behavior Therapy, 15, 336-352.

Jacobson, N., Follette, W. C., Revenstorf, D., Baucom, D., Hahlweg, K., & Margolin, G. (1984b). Variability in outcome and clinical significance of behavioral marital therapy: A reanalysis of outcome data. Journal of Consulting and Clinical Psychology, 52, 497-504.

Jacobson, N., & Follette, W. (1985). Clinical significance of improvement resulting from two behavioral marital therapy components. Behavior Therapy, 16, 249-262.

Jones, E. (1931). The concept of the normal mind. In S. D. Schmalhausen (Ed.), Our neurotic age. New York: Farrar & Rinehart.

Kazdin, A. E. (1977). Assessing the clinical or applied importance of behavior change through social validation. Behavior Modification, 1, 427-452.

Kazdin, A. E., & Wilson, T. (1978). Criteria for evaluating psychotherapy. Archives of General Psychiatry, 35, 407-416.

Kazdin, A. E. (1978). Methodological and interpretive problems of single-case experimental designs. Journal of Consulting and Clinical Psychology, 46(4), 629-642.

Kazdin, A. E. (1980). Research design in clinical psychology. New York: Harper and Row.

Kazdin, A. (1981). Drawing valid inferences from case studies. Journal of Consulting and Clinical Psychology, 49, 183-192.

Kelly, E. L., Goldberg, L. R., Fisk, D. W., & Kilkowski, J. M. (1978). Twenty-five years later. American Psychologist, 33, 746-755.

Kendall, P. C., & Norton-Ford, J. D. (1982). Therapy outcome research methods. In P. C. Kendall & J. N. Butcher (Eds.), Research methods in clinical psychology.
New York: Wiley.

Kiesler, D. J. (1981). Empirical clinical psychology: Myth or reality? Journal of Consulting and Clinical Psychology, 49, 212-215.

Klein, M. (1960). On mental health. British Journal of Medical Psychology, 33, 237-241.

Kraemer, H. C. (1981). Coping strategies in psychiatric clinical research. Journal of Consulting and Clinical Psychology, 49(3), 309-319.

Levine, M. (1942). Psychotherapy in medical practice. New York: Macmillan.

Lichtenstein, A. B. (1984). The effect of client and therapist gender on the outcome and process of psychotherapy. Unpublished dissertation, Michigan State University.

Lick, J. (1973). Statistical vs. clinical significance in research on the outcome of psychotherapy. International Journal of Mental Health, 2, 26-37.

Maletzky, B. M. (1981). Clinical relevance and clinical research. Behavioral Assessment, 3, 283-288.

Minkin, N., Braukmann, L. J., Minkin, B. L., Timbers, G. D., Timbers, B. J., Fixsen, D. L., Phillips, E. L., & Wolf, M. M. (1976). The social validation and training of conversational skills. Journal of Applied Behavior Analysis, 9, 127-139.

Nunnally, J., & Kotsch, W. (1983). Studies of individual subjects: Logic and methods of analysis. British Journal of Clinical Psychology, 22, 83-93.

Offer, D., & Sabshin, M. (1974). Normality. New York: Basic Books.

Parloff, M., London, P., & Wolf, B. (1986). Individual psychotherapy and behavior change. Annual Review of Psychology, 37, 321-349.

Patterson, G. R. (1974). Intervention for boys with conduct problems: Multiple settings, treatments, and criteria. Journal of Consulting and Clinical Psychology, 42, 471-481.

Prusoff, B., Weissman, M., Klerman, G. L., & Rounsaville, B. J. (1980). Research diagnostic criteria subtypes of depression: Their role as predictors of differential response to psychotherapy and drug treatment. Archives of General Psychiatry, 37, 796-801.

Sargent, M., & Cohen, L. N. (1983).
Influence of psychotherapy research on clinical practice: An experimental survey. Journal of Consulting and Clinical Psychology, 51(5), 718-720.

Smith, M. S., Glass, G. V., & Miller, T. L. (1980). The benefits of psychotherapy. Baltimore: Johns Hopkins University Press.

Strupp, H. H., Fox, R., & Lessler, K. (1969). Patients view their psychotherapy. Baltimore: The Johns Hopkins Press.

Strupp, H. H., & Hadley, S. W. (1977). A tripartite model of mental health and therapeutic outcomes with special reference to negative effects in psychotherapy. American Psychologist, March, 187-195.

Strupp, H. H. (1981). Clinical research, practice, and the crisis of confidence. Journal of Consulting and Clinical Psychology, 49, 216-219.

Winch, R. F., & Campbell, D. T. (1969). Proof? No. Evidence? Yes. The significance of tests of significance. The American Sociologist, 4, 140-143.

Wolf, M. M. (1978). Social validity: The case for subjective measurement or how applied behavior analysis is finding its heart. Journal of Applied Behavior Analysis, 11, 203-214.

Yeaton, W. H., & Sechrest, L. (1981). Critical dimensions in the choice and maintenance of successful treatments: Strength, integrity, and effectiveness. Journal of Consulting and Clinical Psychology.